Multi-Class Sentiment Analysis on Twitter: Classification Performance and Challenges
Sentiment analysis refers to the automatic collection, aggregation, and classification of data collected online into different emotion classes. While most of the work related to sentiment analysis of texts focuses on the binary and ternary classification of these data, the task of multi-class classi...
Main Authors: | Mondher Bouazizi, Tomoaki Ohtsuki |
---|---|
Format: | Article |
Language: | English |
Published: | Tsinghua University Press, 2019-09-01 |
Series: | Big Data Mining and Analytics |
Subjects: | twitter, sentiment analysis, machine learning |
Online Access: | https://www.sciopen.com/article/10.26599/BDMA.2019.9020002 |
_version_ | 1832573635963912192 |
---|---|
author | Mondher Bouazizi; Tomoaki Ohtsuki |
collection | DOAJ |
description | Sentiment analysis refers to the automatic collection, aggregation, and classification of data collected online into different emotion classes. While most of the work related to sentiment analysis of texts focuses on the binary and ternary classification of these data, the task of multi-class classification has received less attention. Multi-class classification has always been a challenging task given the complexity of natural languages and the difficulty of understanding and mathematically "quantifying" how humans express their feelings. In this paper, we study the task of multi-class classification of online posts of Twitter users, and show how far it is possible to go with the classification, and the limitations and difficulties of this task. The proposed approach of multi-class classification achieves an accuracy of 60.2% for 7 different sentiment classes which, compared to an accuracy of 81.3% for binary classification, emphasizes the effect of having multiple classes on the classification performance. Nonetheless, we propose a novel model to represent the different sentiments and show how this model helps to understand how sentiments are related. The model is then used to analyze the challenges that multi-class classification presents and to highlight possible future enhancements to multi-class classification accuracy. |
format | Article |
id | doaj-art-995c0c7f582941c9a8fcf9ea8800243e |
institution | Kabale University |
issn | 2096-0654 |
language | English |
publishDate | 2019-09-01 |
publisher | Tsinghua University Press |
record_format | Article |
series | Big Data Mining and Analytics |
title | Multi-Class Sentiment Analysis on Twitter: Classification Performance and Challenges |
topic | twitter; sentiment analysis; machine learning |
url | https://www.sciopen.com/article/10.26599/BDMA.2019.9020002 |