Sentiment Analysis of Emoji and Latinized Arabic in Indonesian Youtube Comments: A LABERT-LSTM Model

Bibliographic Details
Main Authors: M. Noer Fadli Hidayat, Didik Dwi Prasetya, Triyanna Widiyaningtyas
Format: Article
Language: English
Published: Yayasan Pendidikan Riset dan Pengembangan Intelektual (YRPI) 2025-06-01
Series: Journal of Applied Engineering and Technological Science
Online Access: http://journal.yrpipku.com/index.php/jaets/article/view/7000
Description
Summary: This study addresses the challenges of sentiment analysis on Indonesian-language YouTube comments, which are difficult to classify because they mix dialects, slang, emojis, and Latinized Arabic text. The proposed LABERT-LSTM model combines BERT for deep feature extraction with Bi-LSTM to capture word-sequence context. The dataset comprises 24,593 YouTube comments from five renowned Islamic preachers discussing the topic of “tahlilan”. After preprocessing, the model was evaluated using accuracy, precision, recall, and F1-score. LABERT-LSTM achieved an accuracy of 0.95756, a precision of 0.94014, a recall of 0.91815, and an F1-score of 0.92868, outperforming standalone BERT and Bi-LSTM models by reducing misclassification and improving predictions for the negative, positive, and neutral sentiment classes. Recommendations for future research include extending the dataset to other social media platforms, adopting more advanced NLP techniques, conducting studies in other languages, and optimizing the model for better performance and computational efficiency.
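The four evaluation metrics named in the summary can be illustrated for a three-class problem with a minimal Python sketch. The record does not say how precision, recall, and F1 were averaged across the negative, positive, and neutral classes, so macro-averaging here is an assumption; the reported F1-score differs slightly from the harmonic mean of the reported precision and recall (about 0.929), which is consistent with per-class averaging but does not confirm it. The function name `macro_metrics`, the label strings, and the toy data are hypothetical and are not taken from the article.

```python
from collections import Counter


def macro_metrics(y_true, y_pred, labels):
    """Return (accuracy, macro precision, macro recall, macro F1).

    Macro-averaging is an assumption of this sketch: each metric is
    computed per class and then averaged with equal class weights.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical toy labels, not data from the study.
LABELS = ["negative", "positive", "neutral"]
toy_true = ["negative", "negative", "positive", "positive", "neutral", "neutral"]
toy_pred = ["negative", "positive", "positive", "positive", "neutral", "negative"]

acc, prec, rec, f1 = macro_metrics(toy_true, toy_pred, LABELS)
print(f"accuracy={acc:.5f} precision={prec:.5f} recall={rec:.5f} f1={f1:.5f}")
```

Under macro-averaging every class counts equally regardless of how many comments it contains, so a weak minority class pulls the averages down more than it would under a weighted or micro average.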
ISSN: 2715-6087
2715-6079