Emotion on the edge: An evaluation of feature representations and machine learning models
This paper presents a comprehensive analysis of textual emotion classification, employing a tweet-based dataset to classify emotions such as surprise, love, fear, anger, sadness, and joy. We compare the performances of nine distinct machine learning classification models using Bag of Words (BoW) and...
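The feature/model comparison the abstract describes can be sketched with scikit-learn. This is a minimal illustration only: the toy texts and labels below are invented stand-ins, not the paper's tweet dataset, and the two classifiers shown (Stochastic Gradient Descent and a linear SVM) are just the pair the abstract highlights as most efficient.

```python
# Illustrative sketch: BoW vs. TF-IDF features with two linear classifiers,
# timing the fit step as the paper does for edge-efficiency comparisons.
# The texts/labels are hypothetical examples, not the study's dataset.
import time
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.svm import LinearSVC

texts = [
    "i am so happy today", "this is wonderful news",
    "i feel terribly sad", "everything is going wrong",
    "that really scared me", "i am furious about this",
]
labels = ["joy", "joy", "sadness", "sadness", "fear", "anger"]

for vec_name, vectorizer in [("BoW", CountVectorizer()),
                             ("TF-IDF", TfidfVectorizer())]:
    X = vectorizer.fit_transform(texts)  # sparse document-term matrix
    for clf_name, clf in [("SGD", SGDClassifier(random_state=0)),
                          ("SVM", LinearSVC())]:
        start = time.perf_counter()
        clf.fit(X, labels)
        fit_ms = (time.perf_counter() - start) * 1e3
        acc = clf.score(X, labels)  # training accuracy, for illustration
        print(f"{vec_name:6s} + {clf_name}: acc={acc:.2f}, fit={fit_ms:.1f} ms")
```

On real data one would of course hold out a test set and repeat the timing on the target edge device; this sketch only shows the shape of the feature-representation/classifier grid being compared.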
Saved in:
| Main Authors: | James Thomas Black, Muhammad Zeeshan Shakir |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-03-01 |
| Series: | Natural Language Processing Journal |
| Subjects: | Emotion classification; Bag of words; TF-IDF; Natural language processing; DistilBERT |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2949719125000032 |
| _version_ | 1850059316954923008 |
|---|---|
| author | James Thomas Black, Muhammad Zeeshan Shakir |
| author_facet | James Thomas Black, Muhammad Zeeshan Shakir |
| author_sort | James Thomas Black |
| collection | DOAJ |
| description | This paper presents a comprehensive analysis of textual emotion classification, employing a tweet-based dataset to classify emotions such as surprise, love, fear, anger, sadness, and joy. We compare the performances of nine distinct machine learning classification models using Bag of Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) feature representations, as well as a fine-tuned DistilBERT transformer model. We examine the training and inference times of models to determine the most efficient combination when employing an edge architecture, investigating each model’s performance from training to inference using an edge board. The study underscores the significance of combinations of models and features in machine learning, detailing how these choices affect model performance when low computation power needs to be considered. The findings reveal that feature representations significantly influence model efficacy, with BoW and TF-IDF models outperforming DistilBERT. The results show that while BoW models tend to have higher accuracy, the overall performance of TF-IDF models is superior: they require less time to fit, with Stochastic Gradient Descent and Support Vector Machines proving the most efficient in terms of performance and inference time. |
| format | Article |
| id | doaj-art-b45e90ef2b744d5495d8bdcf5321d748 |
| institution | DOAJ |
| issn | 2949-7191 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Natural Language Processing Journal |
| spelling | doaj-art-b45e90ef2b744d5495d8bdcf5321d7482025-08-20T02:50:55ZengElsevierNatural Language Processing Journal2949-71912025-03-011010012710.1016/j.nlp.2025.100127Emotion on the edge: An evaluation of feature representations and machine learning modelsJames Thomas Black0Muhammad Zeeshan Shakir1Corresponding author.; School of Computing, Engineering and Physical Sciences, University of the West of Scotland, Paisley, PA1 2BE, United KingdomSchool of Computing, Engineering and Physical Sciences, University of the West of Scotland, Paisley, PA1 2BE, United KingdomThis paper presents a comprehensive analysis of textual emotion classification, employing a tweet-based dataset to classify emotions such as surprise, love, fear, anger, sadness, and joy. We compare the performances of nine distinct machine learning classification models using Bag of Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) feature representations, as well as a fine-tuned DistilBERT transformer model. We examine the training and inference times of models to determine the most efficient combination when employing an edge architecture, investigating each model’s performance from training to inference using an edge board. The study underscores the significance of combinations of models and features in machine learning, detailing how these choices affect model performance when low computation power needs to be considered. The findings reveal that feature representations significantly influence model efficacy, with BoW and TF-IDF models outperforming DistilBERT. 
The results show that while BoW models tend to have higher accuracy, the overall performance of TF-IDF models is superior: they require less time to fit, with Stochastic Gradient Descent and Support Vector Machines proving the most efficient in terms of performance and inference time.http://www.sciencedirect.com/science/article/pii/S2949719125000032Emotion classificationBag of wordsTF-IDFNatural language processingDistilBERT |
| spellingShingle | James Thomas Black Muhammad Zeeshan Shakir Emotion on the edge: An evaluation of feature representations and machine learning models Natural Language Processing Journal Emotion classification Bag of words TF-IDF Natural language processing DistilBERT |
| title | Emotion on the edge: An evaluation of feature representations and machine learning models |
| title_full | Emotion on the edge: An evaluation of feature representations and machine learning models |
| title_fullStr | Emotion on the edge: An evaluation of feature representations and machine learning models |
| title_full_unstemmed | Emotion on the edge: An evaluation of feature representations and machine learning models |
| title_short | Emotion on the edge: An evaluation of feature representations and machine learning models |
| title_sort | emotion on the edge an evaluation of feature representations and machine learning models |
| topic | Emotion classification Bag of words TF-IDF Natural language processing DistilBERT |
| url | http://www.sciencedirect.com/science/article/pii/S2949719125000032 |
| work_keys_str_mv | AT jamesthomasblack emotionontheedgeanevaluationoffeaturerepresentationsandmachinelearningmodels AT muhammadzeeshanshakir emotionontheedgeanevaluationoffeaturerepresentationsandmachinelearningmodels |