Wineinformatics: Wine Score Prediction with Wine Price and Reviews

Wineinformatics is a new field that applies data science to wine-related data. The goal of this paper is to determine whether incorporating wine price can improve the accuracy of score prediction. To explore the relationship between wine price and wine score, a naive Bayes classifier and a support vector machine (SVM) classifier are employed to predict whether a score is 90 or above, or below 90.
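The abstract of this record describes removing reviews that lack a price, labeling scores as 90 or above versus below 90, and normalizing price by four reference values (mean, median, boxplot mean, boxplot median). The following is a minimal sketch of those steps under stated assumptions, not the authors' implementation. It assumes that "boxplot" means the statistic is computed after discarding prices outside Tukey's 1.5 × IQR whiskers, and that normalization turns price into a binary feature (at or above the reference value, or below it); the record confirms neither. All function names and the sample data are hypothetical.

```python
from statistics import mean, median, quantiles


def drop_null_prices(reviews):
    """Keep only reviews whose price is present (not None)."""
    return [r for r in reviews if r.get("price") is not None]


def boxplot_inliers(prices):
    """Prices inside the boxplot whiskers.

    Assumption: whiskers follow Tukey's 1.5 * IQR rule.
    """
    q1, _, q3 = quantiles(prices, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [p for p in prices if low <= p <= high]


def price_thresholds(prices):
    """Reference price for each of the four normalization methods."""
    inliers = boxplot_inliers(prices)
    return {
        "mean": mean(prices),
        "median": median(prices),
        "boxplot_mean": mean(inliers),      # assumed: mean without outliers
        "boxplot_median": median(inliers),  # assumed: median without outliers
    }


def price_feature(price, threshold):
    """Assumed encoding: 1 if the price is at or above the threshold, else 0."""
    return 1 if price >= threshold else 0


def score_label(score):
    """Binary target from the abstract: 1 for 90 or above, 0 for below 90."""
    return 1 if score >= 90 else 0


# Hypothetical sample data, for illustration only.
reviews = [
    {"score": 88, "price": 15.0},
    {"score": 91, "price": 28.0},
    {"score": 93, "price": None},
    {"score": 90, "price": 40.0},
    {"score": 86, "price": 22.0},
    {"score": 95, "price": 300.0},
]

priced = drop_null_prices(reviews)
prices = [r["price"] for r in priced]
thresholds = price_thresholds(prices)

# One (price feature, label) pair per review, using the boxplot median.
rows = [
    (price_feature(r["price"], thresholds["boxplot_median"]), score_label(r["score"]))
    for r in priced
]
```

In this sketch a single expensive bottle pulls the plain mean well above the typical price, while the boxplot-based values stay close to the bulk of the data. The resulting binary price feature could then be appended to the review-derived attributes before training a classifier.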

Bibliographic Details
Main Authors: Yuka Nagayoshi, Bernard Chen
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Fermentation
Subjects: wineinformatics, wine price, wine reviews, naïve Bayes, SVM
Online Access: https://www.mdpi.com/2311-5637/10/12/598
_version_ 1850042506530521088
author Yuka Nagayoshi
Bernard Chen
collection DOAJ
description Wineinformatics is a new field that applies data science to wine-related data. The goal of this paper is to determine whether incorporating wine price can improve the accuracy of score prediction. To explore the relationship between wine price and wine score, a naive Bayes classifier and a support vector machine (SVM) classifier are employed to predict whether a score is 90 or above, or below 90. The price values are normalized using four different methods: mean, median, boxplot mean, and boxplot median. To allow a proper comparison, the original dataset from previous research, which contains 14,349 wine reviews, was preprocessed by removing all reviews with null price values, leaving 9721 wine reviews. Using this dataset, models with and without the price feature were compared for each classifier and normalization method. The SVM classifier with mean normalization (USD 50.04) achieved the best accuracy, 87.98%, while the naive Bayes classifier with boxplot median normalization (USD 28.00) showed the greatest improvement, 0.99%. From all the results, we concluded that boxplot median normalization (USD 28.00) is the most effective method in this study. These results indicate that incorporating price as an attribute enhances the ability of machine learning algorithms to recognize the correlation between wine reviews and scores.
format Article
id doaj-art-9acf515415334f8abb3bb3434ddf7447
institution DOAJ
issn 2311-5637
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Fermentation
spelling doaj-art-9acf515415334f8abb3bb3434ddf7447
2025-08-20T02:55:32Z
eng
MDPI AG
Fermentation
2311-5637
2024-11-01
Volume 10, Issue 12, Article 598
10.3390/fermentation10120598
Wineinformatics: Wine Score Prediction with Wine Price and Reviews
Yuka Nagayoshi (0)
Bernard Chen (1)
(0) Department of Computer Science and Engineering, University of Central Arkansas, Conway, AR 72035, USA
(1) Department of Computer Science and Engineering, University of Central Arkansas, Conway, AR 72035, USA
title Wineinformatics: Wine Score Prediction with Wine Price and Reviews
topic wineinformatics
wine price
wine reviews
naïve Bayes
SVM
url https://www.mdpi.com/2311-5637/10/12/598