Development of Integrated Neural Network Model for Identification of Fake Reviews in E-Commerce Using Multidomain Datasets
Online product reviews play a major role in the success or failure of an E-commerce business. Before procuring products or services, the shoppers usually go through the online reviews posted by previous customers to get recommendations of the details of products and make purchasing decisions. Nevert...
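| Main Authors: | Saleh Nagi Alsubari, Sachin N. Deshmukh, Mosleh Hmoud Al-Adhaileh, Fawaz Waselalla Alsaade, Theyazn H. H. Aldhyani |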
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wiley, 2021-01-01 |
| Series: | Applied Bionics and Biomechanics |
| Online Access: | http://dx.doi.org/10.1155/2021/5522574 |
| Field | Value |
|---|---|
| author | Saleh Nagi Alsubari; Sachin N. Deshmukh; Mosleh Hmoud Al-Adhaileh; Fawaz Waselalla Alsaade; Theyazn H. H. Aldhyani |
| collection | DOAJ |
| description | Online product reviews play a major role in the success or failure of an E-commerce business. Before buying products or services, shoppers usually read the online reviews posted by previous customers to learn about the products and make purchasing decisions. However, fraudsters can promote or damage specific products by posting fake reviews. These reviews can cause financial loss to E-commerce businesses and mislead consumers into wrong decisions, such as searching for alternative products. A fake review detection system is therefore needed for E-commerce businesses. The proposed methodology uses four standard fake review datasets from multiple domains: hotels, restaurants, Yelp, and Amazon. Preprocessing consists of stopword removal, punctuation removal, and tokenization, followed by sequence padding so that every input sequence has a fixed length during training, validation, and testing of the model. Because the datasets differ in size, separate input word-embedding matrices of n-gram features of the review text are created by the word-embedding layer, which is one component of the proposed model. The convolutional and max-pooling layers of the CNN perform feature extraction and dimensionality reduction, respectively. An LSTM layer, which relies on gate mechanisms, is combined with the CNN to learn the contextual information of the n-gram features of the review text. Finally, a sigmoid activation function in the last layer of the model receives the output of the previous layer and classifies each review as fake or truthful. The proposed CNN-LSTM model was evaluated in two types of experiments: in-domain and cross-domain. In the in-domain experiments, the model is applied to each dataset individually; in the cross-domain experiment, all datasets are combined into a single data frame and evaluated together. In the in-domain experiments, the model reached test accuracies of 77%, 85%, 86%, and 87% on the restaurant, hotel, Yelp, and Amazon datasets, respectively. In the cross-domain experiment, it attained 89% accuracy. A comparison of the in-domain results with existing approaches, based on accuracy, shows that the proposed model outperformed the compared methods. |
| format | Article |
| id | doaj-art-0dd83c493a4543299a7fcec4e9d74b6e |
| institution | OA Journals |
| issn | 1176-2322 1754-2103 |
| language | English |
| publishDate | 2021-01-01 |
| publisher | Wiley |
| record_format | Article |
| series | Applied Bionics and Biomechanics |
| spelling | doaj-art-0dd83c493a4543299a7fcec4e9d74b6e; 2025-08-20T02:02:30Z; eng; Wiley; Applied Bionics and Biomechanics; 1176-2322; 1754-2103; 2021-01-01; 2021; 10.1155/2021/5522574; 5522574. Authors and affiliations: Saleh Nagi Alsubari and Sachin N. Deshmukh (Department of Computer Science & Information Technology, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, India); Mosleh Hmoud Al-Adhaileh (Deanship of E-Learning and Distance Education, King Faisal University, Al-Ahsa, Saudi Arabia); Fawaz Waselalla Alsaade (College of Computer Sciences and Information Technology, King Faisal University, Hofuf, Saudi Arabia); Theyazn H. H. Aldhyani (Community College of Abqaiq, King Faisal University, P.O. Box 400, Al-Ahsa, Saudi Arabia). http://dx.doi.org/10.1155/2021/5522574 |
| title | Development of Integrated Neural Network Model for Identification of Fake Reviews in E-Commerce Using Multidomain Datasets |
| url | http://dx.doi.org/10.1155/2021/5522574 |
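
The preprocessing steps named in the description (stopword removal, punctuation removal, tokenization, and padding to a fixed length) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the stopword list, the reserved ids `PAD_ID` and `OOV_ID`, the choice to pad on the right, and the helper names `tokenize`, `build_vocab`, and `encode_and_pad` are all assumptions made for the example.

```python
import string

# Tiny illustrative stopword list (an assumption; the record does not give the paper's list).
STOPWORDS = {"a", "an", "the", "is", "was", "and", "of", "to", "in", "it"}
PAD_ID = 0  # hypothetical id reserved for padding positions
OOV_ID = 1  # hypothetical id reserved for tokens missing from the vocabulary


def tokenize(text):
    """Lowercase, strip punctuation, split on whitespace, and drop stopwords."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in cleaned.split() if tok not in STOPWORDS]


def build_vocab(token_lists):
    """Assign each distinct token an integer id, starting after the reserved ids."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 2)
    return vocab


def encode_and_pad(tokens, vocab, max_len):
    """Convert tokens to ids, then truncate or right-pad to exactly max_len."""
    ids = [vocab.get(tok, OOV_ID) for tok in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


reviews = ["The room was clean, and the staff was friendly!", "Terrible food."]
token_lists = [tokenize(r) for r in reviews]
vocab = build_vocab(token_lists)
padded = [encode_and_pad(t, vocab, max_len=6) for t in token_lists]
print(padded)
```

Every row of `padded` has the same length, which is the property the description requires of inputs to the word-embedding layer; the integer ids would then index rows of the embedding matrix.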