PAPERZILLA
Crunching Academic Papers into Bite-sized Insights.


A Topic Modeling Comparison Between LDA, NMF, Top2Vec, and BERTopic to Demystify Twitter Posts
Paper Summary
Paperzilla title
BERTopic and NMF: Tag-teaming Twitter to Reveal Hidden Travel Topics During COVID
This study compared four topic modeling techniques (LDA, NMF, Top2Vec, and BERTopic) on Twitter data about travel during the COVID-19 pandemic. The findings suggest that BERTopic and NMF are the most effective at identifying distinct, interpretable topics in short, unstructured social media posts: BERTopic because it makes effective use of contextual information, and NMF because it handles noisy data well.
Possible Conflicts of Interest
None identified
Identified Weaknesses
Instability and Reproducibility of Topic Models
The study acknowledges that the choice of topic model can greatly influence the results. This is especially true for BERTopic, where repeated runs on the same data lead to different outcomes because the model is stochastic. The resulting instability makes the findings hard to reproduce.
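One way to put a number on the run-to-run instability described above is to compare the top words of each topic across two runs. The sketch below is an illustration only, not a method taken from the study: the function names `jaccard` and `run_stability` and the example word lists are hypothetical, and best-match Jaccard overlap is just one of several reasonable similarity measures.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of words."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def run_stability(run_a, run_b):
    """Average best-match Jaccard overlap of run_a's topics against run_b.

    Each run is a list of topics; each topic is a list of its top words.
    A score of 1.0 means every topic in run_a has an identical
    counterpart in run_b. The measure is not symmetric in its arguments.
    """
    if not run_a or not run_b:
        return 0.0
    best = [max(jaccard(t, u) for u in run_b) for t in run_a]
    return sum(best) / len(best)


# Hypothetical top words from two runs of the same model on the same tweets.
run_1 = [["travel", "flight", "airline"], ["vaccine", "covid", "quarantine"]]
run_2 = [["covid", "vaccine", "lockdown"], ["travel", "flight", "airport"]]

score = run_stability(run_1, run_2)
print(score)
```

A low score across repeated runs would signal that the topics should be interpreted with caution. For BERTopic specifically, the variation is commonly attributed to its stochastic dimensionality-reduction step, and fixing that step's random seed is the usual way to make runs repeatable, at the cost of hiding rather than removing the underlying sensitivity.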
Limited Generalizability Across Social Media Platforms
The research focuses solely on Twitter data. While the authors argue that the methodology should be transferable, the specific characteristics of Twitter (character limits, hashtags, etc.) may not fully represent the diversity and complexity of other social media platforms.
Exclusion of Newer Language Models
While the study evaluates four different models, it does not explore other emerging models such as GPT-3 or WuDao 2.0. This narrows the scope of the comparison and may overlook more powerful techniques.
Subjectivity of Interpretation
The study emphasizes the role of human interpretation in making sense of topic modeling results. However, this reliance on subjective judgment can introduce bias and make comparisons across different researchers less reliable.
Influence of Keyword Selection
The choice of keywords for the term search function in Top2Vec and BERTopic significantly influences the results. The study does not fully address how researchers should select appropriate keywords and mitigate potential biases introduced by this selection process.
Rating Explanation
This research provides a valuable comparison of four different topic modeling techniques applied to social media data, a growing area of interest in social sciences. The study highlights the strengths and weaknesses of each model, offering practical guidance for researchers. Although limited to Twitter data and excluding some newer models, the comparative analysis and emphasis on methodological challenges are significant contributions.
File Information
Original Title: A Topic Modeling Comparison Between LDA, NMF, Top2Vec, and BERTopic to Demystify Twitter Posts
File Name: pdf.pdf
File Size: 1.45 MB
Uploaded: July 14, 2025 at 06:45 AM
Privacy: Public
© 2025 Paperzilla. All rights reserved.
