Comparison of Classification Algorithms with Bag of Words Feature in Sentiment Analysis

The rapid growth of digital culture, especially on social media platforms, has produced viral phenomena marked by unconventional humor and absurd logic, such as the Italian brainrot anomaly. Although there have been many studies on sentiment analysis, there is still a...

Bibliographic Details
Main Author: Fenilinas Adi Artanto
Format: Article
Language: English
Published: LPPM ISB Atma Luhur 2025-07-01
Series: Jurnal Sisfokom
Subjects: sentiment analysis; anomaly trend; bag of words; google play store
Online Access: https://jurnal.atmaluhur.ac.id/index.php/sisfokom/article/view/2426
_version_ 1849728897370816512
author Fenilinas Adi Artanto
collection DOAJ
description The rapid growth of digital culture, especially on social media platforms, has produced viral phenomena marked by unconventional humor and absurd logic, such as the Italian brainrot anomaly. Although sentiment analysis has been studied extensively, few studies address sentiment toward cultural phenomena such as the humor of the Italian brainrot anomaly. This study analyzes user sentiment toward the game “Hantu Tung Tung Tung Sahur 3D,” an Italian brainrot anomaly application that went viral among young people on the Google Play Store during the month of Ramadan. User reviews were collected through web scraping and preprocessed with tokenization, stopword removal, lowercasing, stemming, and filtering. Features were extracted with the Bag of Words method. The study compares four widely used classification algorithms, Support Vector Machine (SVM), Naïve Bayes, Decision Tree (C4.5), and Random Forest, implemented in the Orange Data Mining software and evaluated with K-Fold Cross Validation. The novelty of the study lies in its focus on sentiment analysis in a unique, culturally viral digital context and in its comparative evaluation of classification algorithms on this dataset. Random Forest achieves the highest Area Under the Curve (AUC) score, 0.529, ahead of Naïve Bayes (0.504), SVM (0.503), and Decision Tree (0.498). These findings provide new insight into the suitability of ensemble methods such as Random Forest for sentiment analysis of specific digital phenomena and highlight their potential for more reliable sentiment classification in similar contexts.
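The Bag of Words step and the AUC metric named in the description can be illustrated with a minimal Python sketch. This is not the study's implementation, which used Orange Data Mining; the stopword list, the sample reviews, and the function names are hypothetical, and stemming is left out for brevity.

```python
from collections import Counter
import re

# Hypothetical stopword list for illustration; the study's actual list is not given.
STOPWORDS = {"the", "is", "a", "and", "this"}


def preprocess(text):
    """Lowercase, tokenize on runs of letters, and drop stopwords (no stemming)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of raw documents."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)  # missing words count as 0
        vectors.append([counts[w] for w in vocab])
    return vocab, vectors


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical reviews, not taken from the study's dataset.
reviews = ["This game is fun and funny", "The game is boring", "Fun fun game"]
vocab, vectors = bag_of_words(reviews)
```

Each review becomes a vector of word counts over the shared vocabulary, which is the input a classifier would receive. Under the pairwise definition used in `auc`, a score of 0.5 corresponds to ranking positive and negative examples at random, which gives a reference point for the reported values.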
format Article
id doaj-art-1384fbc751344b048004d1e94fcc61b5
institution DOAJ
issn 2301-7988
2581-0588
language English
publishDate 2025-07-01
publisher LPPM ISB Atma Luhur
record_format Article
series Jurnal Sisfokom
spelling Fenilinas Adi Artanto (Department of Informatics, Faculty of Engineering and Computer Science, Universitas Muhammadiyah Pekajangan Pekalongan). Comparison of Classification Algorithms with Bag of Words Feature in Sentiment Analysis. Jurnal Sisfokom, vol. 14, no. 3, pp. 422-428, 2025-07-01. LPPM ISB Atma Luhur. ISSN 2301-7988, 2581-0588. DOI: 10.32736/sisfokom.v14i3.2426. Record ID: doaj-art-1384fbc751344b048004d1e94fcc61b5 (updated 2025-08-20T03:09:24Z).
title Comparison of Classification Algorithms with Bag of Words Feature in Sentiment Analysis
topic sentiment analysis
anomaly trend
bag of words
google play store
url https://jurnal.atmaluhur.ac.id/index.php/sisfokom/article/view/2426