Sentiment Analysis of Emoji and Latinized Arabic in Indonesian Youtube Comments: A LABERT-LSTM Model
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Yayasan Pendidikan Riset dan Pengembangan Intelektual (YRPI), 2025-06-01 |
| Series: | Journal of Applied Engineering and Technological Science |
| Subjects: | |
| Online Access: | http://journal.yrpipku.com/index.php/jaets/article/view/7000 |
| Summary: | This study addresses the challenges of sentiment analysis on Indonesian-language YouTube comments, which are complex due to the use of dialects, slang words, emojis, and Latinized Arabic text. The proposed LABERT-LSTM model integrates BERT for deep feature extraction and Bi-LSTM to capture word sequence context effectively. The dataset comprises 24,593 YouTube comments from five renowned Islamic preachers discussing the topic of “tahlilan”. After data preprocessing, the model was evaluated using accuracy, precision, recall, and F1-score metrics. The results demonstrate that LABERT-LSTM achieved an accuracy of 0.95756, precision of 0.94014, recall of 0.91815, and an F1-score of 0.92868, outperforming standalone BERT and Bi-LSTM models by reducing misclassification and improving predictions for negative, positive, and neutral sentiment classes. Future research recommendations include expanding the dataset to other social media platforms, adopting advanced NLP techniques, conducting studies in other languages, and optimizing the model for enhanced performance and computational efficiency. |
|---|---|
| ISSN: | 2715-6087 2715-6079 |