Comparative Analysis of TF-IDF and Word2Vec in Sentiment Analysis: A Case of Food Reviews

Sentiment analysis is an important area of natural language processing that supports applications such as market analysis, customer feedback, and social media monitoring by identifying and classifying opinions in text. Text representation is the basis of sentiment analysis, and TF-IDF and Word2Vec a...


Bibliographic Details
Main Author: Zhan Zerui
Format: Article
Language: English
Published: EDP Sciences 2025-01-01
Series: ITM Web of Conferences
Online Access: https://www.itm-conferences.org/articles/itmconf/pdf/2025/01/itmconf_dai2024_02013.pdf
author Zhan Zerui
collection DOAJ
description Sentiment analysis is an important area of natural language processing that identifies and classifies opinions in text, supporting applications such as market analysis, customer feedback, and social media monitoring. Text representation underpins sentiment analysis, and TF-IDF and Word2Vec are two commonly used vectorization methods: TF-IDF weights words by their frequency, while Word2Vec captures semantic relations between words. This paper compares the performance of TF-IDF and Word2Vec in sentiment analysis of food reviews to give enterprises and researchers a firmer basis for choosing a text representation technique. Using a dataset of 560,000 food reviews, it compares the accuracy and generalization ability of the two methods under different dataset sizes. TF-IDF achieved high accuracy on the training data (99.16%) but dropped to 73.9% on the test data, indicating clear overfitting. In contrast, Word2Vec performed almost identically on training and test data (68.4% vs. 68.65%), showing better generalization. These findings can guide the choice of text representation method, especially for sentiment analysis on large datasets.
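The generalization claim in the description can be checked with simple arithmetic on the reported accuracies. The following is a minimal sketch, not the author's code: the helper name `generalization_gap` and the dictionary layout are assumptions made for illustration, and only the four percentages come from the record.

```python
# Accuracies (in percent) as reported in the description field of this record.
reported = {
    "TF-IDF": {"train": 99.16, "test": 73.9},
    "Word2Vec": {"train": 68.4, "test": 68.65},
}


def generalization_gap(train_acc: float, test_acc: float) -> float:
    """Train minus test accuracy, in percentage points (hypothetical helper)."""
    return round(train_acc - test_acc, 2)


# A large positive gap suggests overfitting; a gap near zero suggests
# that training and test performance are in line with each other.
gaps = {
    name: generalization_gap(acc["train"], acc["test"])
    for name, acc in reported.items()
}

for name, gap in gaps.items():
    print(f"{name}: gap = {gap:+.2f} percentage points")
```

On these figures the gap is about 25.26 points for TF-IDF and about -0.25 points for Word2Vec. Note that TF-IDF still has the higher test accuracy (73.9% vs. 68.65%), so a smaller gap indicates more consistent behavior rather than a better score on unseen data.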
format Article
id doaj-art-1ff8c827585246a5a82c8d6146b58f24
institution Kabale University
issn 2271-2097
language English
publishDate 2025-01-01
publisher EDP Sciences
record_format Article
series ITM Web of Conferences
spelling doaj-art-1ff8c827585246a5a82c8d6146b58f24 | 2025-02-07T08:21:10Z | eng | EDP Sciences | ITM Web of Conferences | 2271-2097 | 2025-01-01 | 70 | 02013 | 10.1051/itmconf/20257002013 | itmconf_dai2024_02013 | Comparative Analysis of TF-IDF and Word2Vec in Sentiment Analysis: A Case of Food Reviews | Zhan Zerui (Dublin International College, Beijing University of Technology) | https://www.itm-conferences.org/articles/itmconf/pdf/2025/01/itmconf_dai2024_02013.pdf
title Comparative Analysis of TF-IDF and Word2Vec in Sentiment Analysis: A Case of Food Reviews
url https://www.itm-conferences.org/articles/itmconf/pdf/2025/01/itmconf_dai2024_02013.pdf