Detecting Fake Reviews in E-Commerce: A Case Study on Shopee Using Support Vector Machine and Random Forest

Bibliographic Details
Main Authors: Khoirotulmuadiba Purifyregalia, Khothibul Umam, Nur Cahyo Hendro Wibowo, Maya Rini Handayani
Format: Article
Language: English
Published: Politeknik Negeri Batam 2025-06-01
Series: Journal of Applied Informatics and Computing
Online Access: https://jurnal.polibatam.ac.id/index.php/JAIC/article/view/9514
Description
Summary: As online shopping grows, particularly on platforms such as Shopee, product reviews have become a significant factor in consumer purchasing decisions. However, fake reviews generated by non-human agents undermine consumer trust and damage platform credibility. This study detects fake reviews on Shopee with a text classification approach based on the Random Forest and Support Vector Machine (SVM) algorithms. A dataset of 3,686 Shopee product reviews was collected and preprocessed through data cleaning, normalization, tokenization, and TF-IDF weighting. Reviews were labeled automatically using Latent Dirichlet Allocation (LDA), which sorted them into two categories: Original (OR) and Computer-Generated (CG). Model performance was evaluated using accuracy, precision, recall, and F1-score. In the experiments, SVM achieved the higher accuracy at 88.84%, compared with 80.39% for Random Forest. These findings indicate that SVM is effective on high-dimensional text data for fake review detection. The study contributes an application of automated topic modeling (LDA) to labeling e-commerce reviews in the Indonesian context, and it points to larger datasets and deep learning-based models as ways to improve classification accuracy and scalability.
ISSN: 2548-6861
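
Two of the steps named in the summary, TF-IDF weighting and evaluation by accuracy, precision, recall, and F1-score, can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation: the record does not say which TF-IDF variant or toolkit the study used, so the unsmoothed `tf * log(N / df)` form, the helper names `tokenize`, `tfidf`, and `binary_metrics`, and the choice of CG as the positive class are all assumptions made here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens (assumed rule)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    tf is the term count divided by the document length; df is the number
    of documents containing the term. This unsmoothed variant is an
    assumption; libraries often use smoothed or normalized forms.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def binary_metrics(y_true, y_pred, positive="CG"):
    """Compute accuracy, precision, recall, and F1 for one positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up example texts and labels, not taken from the study's dataset.
    reviews = ["good product", "good good seller"]
    print(tfidf(reviews))
    print(binary_metrics(["CG", "CG", "OR", "OR"], ["CG", "OR", "OR", "OR"]))
```

In this sketch a term that occurs in every document gets weight zero, which is why TF-IDF tends to favor words that distinguish one review from another. The resulting weights would then serve as input features for a classifier such as SVM or Random Forest.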