Emotion on the edge: An evaluation of feature representations and machine learning models


Bibliographic Details
Main Authors: James Thomas Black, Muhammad Zeeshan Shakir
Format: Article
Language: English
Published: Elsevier 2025-03-01
Series: Natural Language Processing Journal
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S2949719125000032
_version_ 1850059316954923008
author James Thomas Black
Muhammad Zeeshan Shakir
author_facet James Thomas Black
Muhammad Zeeshan Shakir
author_sort James Thomas Black
collection DOAJ
description This paper presents a comprehensive analysis of textual emotion classification, employing a tweet-based dataset to classify emotions such as surprise, love, fear, anger, sadness, and joy. We compare the performance of nine distinct machine learning classification models using Bag of Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) feature representations, as well as a fine-tuned DistilBERT transformer model. We examine the training and inference times of the models to determine the most efficient combination for an edge architecture, investigating each model’s performance from training to inference on an edge board. The study underscores the significance of combinations of models and features in machine learning, detailing how these choices affect model performance when low computation power must be considered. The findings reveal that feature representations significantly influence model efficacy, with BoW and TF-IDF models outperforming DistilBERT. The results show that while BoW models tend to have higher accuracy, the overall performance of TF-IDF models is superior, requiring less time for fitting, with Stochastic Gradient Descent and Support Vector Machines proving to be the most efficient in terms of performance and inference times.
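The methodology described in the abstract, vectorizing text with BoW or TF-IDF and feeding it to a linear classifier while timing fit and inference, can be sketched as follows. This is an illustrative scikit-learn sketch, not the authors' actual pipeline: the toy corpus and labels below are stand-ins for the tweet-based emotion dataset, and only one of the paper's nine classifiers (Stochastic Gradient Descent) is shown.

```python
# Minimal sketch (assumed setup, not the paper's code): compare BoW and TF-IDF
# feature representations with an SGD linear classifier, timing fit and inference.
from time import perf_counter

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Tiny stand-in corpus; the paper uses a tweet-based emotion dataset.
texts = [
    "i am so happy today", "what a joyful surprise", "i feel so sad",
    "this makes me angry", "i love this so much", "i am scared and afraid",
    "pure joy and delight", "so furious right now",
]
labels = ["joy", "surprise", "sadness", "anger", "love", "fear", "joy", "anger"]

results = {}
for name, vectorizer in [("BoW", CountVectorizer()), ("TF-IDF", TfidfVectorizer())]:
    model = make_pipeline(vectorizer, SGDClassifier(random_state=0))

    t0 = perf_counter()
    model.fit(texts, labels)          # vectorize + train
    fit_s = perf_counter() - t0

    t0 = perf_counter()
    preds = model.predict(texts)      # inference on the same toy set
    infer_s = perf_counter() - t0

    train_acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
    results[name] = {"fit_s": fit_s, "infer_s": infer_s, "train_acc": train_acc}
    print(f"{name}: fit {fit_s * 1e3:.2f} ms, "
          f"infer {infer_s * 1e3:.2f} ms, train acc {train_acc:.2f}")
```

On a real edge deployment the same timing pattern would be applied to held-out data, which is how the paper compares representation–model combinations under low compute budgets.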
format Article
id doaj-art-b45e90ef2b744d5495d8bdcf5321d748
institution DOAJ
issn 2949-7191
language English
publishDate 2025-03-01
publisher Elsevier
record_format Article
series Natural Language Processing Journal
spelling doaj-art-b45e90ef2b744d5495d8bdcf5321d748
2025-08-20T02:50:55Z
eng
Elsevier
Natural Language Processing Journal
2949-7191
2025-03-01
10
100127
10.1016/j.nlp.2025.100127
Emotion on the edge: An evaluation of feature representations and machine learning models
James Thomas Black (Corresponding author; School of Computing, Engineering and Physical Sciences, University of the West of Scotland, Paisley, PA1 2BE, United Kingdom)
Muhammad Zeeshan Shakir (School of Computing, Engineering and Physical Sciences, University of the West of Scotland, Paisley, PA1 2BE, United Kingdom)
http://www.sciencedirect.com/science/article/pii/S2949719125000032
Emotion classification
Bag of words
TF-IDF
Natural language processing
DistilBERT
spellingShingle James Thomas Black
Muhammad Zeeshan Shakir
Emotion on the edge: An evaluation of feature representations and machine learning models
Natural Language Processing Journal
Emotion classification
Bag of words
TF-IDF
Natural language processing
DistilBERT
title Emotion on the edge: An evaluation of feature representations and machine learning models
title_full Emotion on the edge: An evaluation of feature representations and machine learning models
title_fullStr Emotion on the edge: An evaluation of feature representations and machine learning models
title_full_unstemmed Emotion on the edge: An evaluation of feature representations and machine learning models
title_short Emotion on the edge: An evaluation of feature representations and machine learning models
title_sort emotion on the edge an evaluation of feature representations and machine learning models
topic Emotion classification
Bag of words
TF-IDF
Natural language processing
DistilBERT
url http://www.sciencedirect.com/science/article/pii/S2949719125000032
work_keys_str_mv AT jamesthomasblack emotionontheedgeanevaluationoffeaturerepresentationsandmachinelearningmodels
AT muhammadzeeshanshakir emotionontheedgeanevaluationoffeaturerepresentationsandmachinelearningmodels