Reddit comment analysis: sentiment prediction and topic modeling using VADER and BERTopic

Bibliographic Details
Main Authors: Denilson de Oliveira Silva, Richard Matheus Avelino da Silva, Patrícia Virgínia de Santana Lima, Jéssica Cristina Pereira Batista, Sílvio Fernando Alves Xavier Júnior
Format: Article
Language: English
Published: Universidade Federal de Pernambuco (UFPE) 2024-12-01
Series: Socioeconomic Analytics
Subjects:
Online Access: https://periodicos.ufpe.br/revistas/index.php/SECAN/article/view/265074
collection DOAJ
description This work aims at exploring data analysis techniques applied to the social media platform Reddit, highlighting the execution of an Exploratory Data Analysis (EDA) to identify trends and patterns of interaction among users. For sentiment analysis of the comments, the VADER model ("Valence Aware Dictionary and Sentiment Reasoner") is used, and topic modeling is performed with BERTopic ("Bidirectional Encoder Representations from Transformers for Topic Modeling"). The goal is to compare the accuracy and effectiveness of the models in classifying emotions and themes expressed in the comments. The comparison of the models allows identifying which approach yields the most accurate results, which is aligned with the context of discussions on Reddit, providing valuable insights into user behavior and preferences.
id doaj-art-d7834608058e4a1db25c14034b7002ba
institution Kabale University
issn 2965-4661
spelling doaj-art-d7834608058e4a1db25c14034b7002ba (2025-02-07T17:46:07Z)
Reddit comment analysis: sentiment prediction and topic modeling using VADER and BERTopic. Socioeconomic Analytics (ISSN 2965-4661), vol. 2, no. 1, 2024-12-01. Universidade Federal de Pernambuco (UFPE). Language: eng. DOI: 10.51359/2965-4661.2024.265074
Authors (all State University of Paraíba):
Denilson de Oliveira Silva, https://orcid.org/0009-0000-4031-7772
Richard Matheus Avelino da Silva, https://orcid.org/0009-0009-9718-9439
Patrícia Virgínia de Santana Lima, https://orcid.org/0009-0005-4746-830X
Jéssica Cristina Pereira Batista, https://orcid.org/0009-0005-5026-0155
Sílvio Fernando Alves Xavier Júnior, https://orcid.org/0000-0002-4832-0711
Keywords: Sentiment Analysis; text mining; Exploratory Data Analysis; Reddit; topic modelling
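The description in this record names VADER as the sentiment model. For readers unfamiliar with how a lexicon-based scorer of this kind turns word valences into a label, the following is a minimal sketch and not the article's implementation. The normalization formula (alpha = 15) and the ±0.05 compound thresholds follow the published VADER conventions; the lexicon `TOY_LEXICON`, its valence values, and the helper names `compound` and `label` are hypothetical and exist only for illustration. Real VADER also handles negation, intensifiers, punctuation, and capitalization, which this sketch omits.

```python
import math

# Hypothetical toy lexicon: these valences are illustrative only and are
# NOT taken from VADER's real lexicon.
TOY_LEXICON = {
    "great": 3.1, "love": 3.2, "good": 1.9,
    "bad": -2.5, "awful": -2.0, "hate": -2.7,
}


def normalize(score, alpha=15.0):
    """Squash an unbounded valence sum into (-1, 1), as VADER does."""
    return score / math.sqrt(score * score + alpha)


def compound(text):
    """Sum the toy valences of the words in `text`, then normalize."""
    words = (w.strip(".,!?") for w in text.lower().split())
    total = sum(TOY_LEXICON.get(w, 0.0) for w in words)
    return normalize(total)


def label(score, threshold=0.05):
    """Map a compound score to a class using the conventional cutoffs."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for comment in ["I love this great thread", "awful, bad take", "see the sidebar"]:
        print(comment, "->", label(compound(comment)))
```

In practice the scores would come from the `vaderSentiment` package rather than a hand-built lexicon; the sketch only shows why the compound score stays between -1 and 1 and how the neutral band around zero is applied.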