Text vectorization in sentiment analysis: A comparative study of TF-IDF and Word2Vec from Amazon Fine Food Reviews

Sentiment analysis is a practical tool for marketing and branding teams. Companies can collect and analyze opinions or reviews from social media platforms, blog posts, and numerous other forums. This may help them gather positive feedback to reinforce strengths or identify negative emotions to make i...

Bibliographic Details
Main Author: Lu Jiaxin
Format: Article
Language: English
Published: EDP Sciences 2025-01-01
Series: ITM Web of Conferences
Online Access: https://www.itm-conferences.org/articles/itmconf/pdf/2025/01/itmconf_dai2024_03001.pdf
collection DOAJ
description Sentiment analysis is a practical tool for marketing and branding teams. Companies can collect and analyze opinions or reviews from social media platforms, blog posts, and numerous other forums. This may help them gather positive feedback to reinforce strengths or identify negative emotions to guide improvements. This research compares two text vectorization methods in opinion mining, Term Frequency-Inverse Document Frequency (TF-IDF) and Word2Vec, using the Amazon Fine Food Reviews dataset. The study uses both methods to vectorize preprocessed text data, feeds the vectorized data into an emotion classification model, and analyzes the performance of each method on the classification task. The results indicate that TF-IDF outperforms Word2Vec in handling large datasets, particularly in distinguishing between different sentiment categories, whereas Word2Vec is superior in capturing semantic relationships between words. It is therefore suggested that the advantages of the two methods be combined in practical applications to improve accuracy and efficiency.
id doaj-art-846f59e778ff45a99e4591dd01a87c18
institution Kabale University
issn 2271-2097
spelling doaj-art-846f59e778ff45a99e4591dd01a87c18; 2025-02-07T08:21:11Z; eng; EDP Sciences; ITM Web of Conferences; 2271-2097; 2025-01-01; 70; 03001; 10.1051/itmconf/20257003001; itmconf_dai2024_03001; Lu Jiaxin (ECS, University of Southampton); https://www.itm-conferences.org/articles/itmconf/pdf/2025/01/itmconf_dai2024_03001.pdf
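
The description above contrasts two ways of turning review text into vectors. As a rough illustration only, the pure-Python sketch below computes TF-IDF weights for a toy corpus and builds a document vector by averaging word embeddings, which is one common way to use Word2Vec output for classification. Everything here is an assumption for illustration and is not taken from the article: the two example reviews, the smoothed IDF formula, the helper names (`tokenize`, `tfidf_vectors`, `mean_embedding`), and the hand-written two-dimensional `TOY_EMBEDDINGS` table, which stands in for vectors that a real Word2Vec model would learn from the corpus.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (vocab, vectors) with one TF-IDF vector per document.

    Assumption: a smoothed IDF, log((1 + N) / (1 + df)) + 1, and
    term frequency normalized by document length.
    """
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def mean_embedding(text, embeddings, dim):
    """Average the embeddings of known words; zeros if none are known."""
    vecs = [embeddings[t] for t in tokenize(text) if t in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


# Hypothetical toy data, not from the Amazon Fine Food Reviews dataset.
docs = ["great tasty coffee", "awful stale coffee"]

# Hypothetical hand-written embeddings standing in for trained Word2Vec vectors.
TOY_EMBEDDINGS = {
    "great": [0.9, 0.1],
    "tasty": [0.8, 0.2],
    "awful": [-0.9, 0.1],
    "stale": [-0.7, 0.3],
}

vocab, tfidf = tfidf_vectors(docs)
dense = [mean_embedding(d, TOY_EMBEDDINGS, dim=2) for d in docs]

for doc, sparse_vec, dense_vec in zip(docs, tfidf, dense):
    print(doc, sparse_vec, dense_vec)
```

In this sketch the TF-IDF vector has one dimension per vocabulary term and gives "coffee", which appears in both reviews, a lower weight than words unique to one review. The averaged embedding has a fixed small dimension and places "great" and "tasty" near each other, which loosely mirrors the trade-off the description reports. Either kind of vector could then be passed to a classifier.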