Detecting Fake Reviews in E-Commerce: A Case Study on Shopee Using Support Vector Machine and Random Forest
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Politeknik Negeri Batam, 2025-06-01 |
| Series: | Journal of Applied Informatics and Computing |
| Subjects: | |
| Online Access: | https://jurnal.polibatam.ac.id/index.php/JAIC/article/view/9514 |
| Summary: | The increasing popularity of online shopping, particularly on platforms such as Shopee, has made product reviews a significant factor influencing consumer purchasing decisions. However, the presence of fake reviews generated by non-human agents undermines consumer trust and harms platform credibility. This study aims to detect fake reviews on Shopee by applying a text classification approach using the Random Forest and Support Vector Machine (SVM) algorithms. A dataset of 3,686 Shopee product reviews was collected and preprocessed through data cleaning, normalization, tokenization, and TF-IDF weighting. Reviews were labeled automatically using the Latent Dirichlet Allocation (LDA) method, which categorized them as Original (OR) or Computer-Generated (CG). Model performance was evaluated using accuracy, precision, recall, and F1-score. Experimental results show that SVM achieved the highest accuracy at 88.84%, outperforming Random Forest, which obtained 80.39%. These findings highlight the effectiveness of SVM in handling high-dimensional text data for fake review detection. The study contributes to the application of automated topic modeling (LDA) for labeling e-commerce reviews in the Indonesian context and opens opportunities for further enhancement using larger datasets and deep learning-based models to improve classification accuracy and scalability. |
|---|---|
| ISSN: | 2548-6861 |
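
The Summary names four evaluation metrics (accuracy, precision, recall, and F1-score) without defining them. The following is a minimal sketch of how they are commonly computed for a two-class task. It assumes Computer-Generated (CG) is treated as the positive class, which this record does not state. The function name `binary_metrics` is hypothetical, and the toy labels are invented for illustration; they are not the study's data or results.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive="CG"):
    """Return accuracy, precision, recall and F1 for a two-class task.

    `positive` is the label counted as the target class. Treating "CG"
    as positive is an assumption made for this sketch.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Tally the four confusion-matrix cells.
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1

    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels invented for illustration only (OR = Original, CG = Computer-Generated).
y_true = ["CG", "CG", "CG", "OR", "OR", "OR", "OR", "CG"]
y_pred = ["CG", "CG", "OR", "OR", "OR", "CG", "OR", "CG"]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Accuracy alone can look favorable when one class dominates the dataset, which is one reason precision, recall, and F1-score are usually reported alongside it.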