Development of Integrated Neural Network Model for Identification of Fake Reviews in E-Commerce Using Multidomain Datasets

Online product reviews play a major role in the success or failure of an E-commerce business. Before buying products or services, shoppers usually read the online reviews posted by previous customers to learn about the products and make purchasing decisions. Nevert...

Bibliographic Details
Main Authors: Saleh Nagi Alsubari, Sachin N. Deshmukh, Mosleh Hmoud Al-Adhaileh, Fawaz Waselalla Alsaade, Theyazn H. H. Aldhyani
Format: Article
Language: English
Published: Wiley 2021-01-01
Series: Applied Bionics and Biomechanics
Online Access: http://dx.doi.org/10.1155/2021/5522574
author Saleh Nagi Alsubari
Sachin N. Deshmukh
Mosleh Hmoud Al-Adhaileh
Fawaz Waselalla Alsaade
Theyazn H. H. Aldhyani
collection DOAJ
description Online product reviews play a major role in the success or failure of an E-commerce business. Before buying products or services, shoppers usually read the online reviews posted by previous customers to learn about the products and make purchasing decisions. Nevertheless, specific E-business products can be promoted or damaged by fake reviews, which are written by people known as fraudsters. These reviews can cause financial loss to E-commerce businesses and mislead consumers into wrongly deciding to look for alternative products. Thus, a fake review detection system is needed for E-commerce businesses. The proposed methodology uses four standard fake review datasets from multiple domains: hotels, restaurants, Yelp, and Amazon. Preprocessing methods such as stopword removal, punctuation removal, and tokenization are applied, along with sequence padding so that the input sequences have a fixed length during training, validation, and testing of the model. Because the datasets differ in size, various input word-embedding matrices of n-gram features of the review text are created with the help of a word-embedding layer, which is one component of the proposed model. Convolutional and max-pooling layers of the CNN technique are implemented for feature extraction and dimensionality reduction, respectively. Based on gate mechanisms, the LSTM layer is combined with the CNN technique to learn and handle the contextual information of the n-gram features of the review text. Finally, a sigmoid activation function, as the last layer of the proposed model, receives its input from the previous layer and performs binary classification of the review text as fake or truthful. In this paper, the proposed CNN-LSTM model is evaluated in two types of experiments: in-domain and cross-domain. In an in-domain experiment, the model is applied to each dataset individually, while in the cross-domain experiment, all datasets are combined into a single data frame and evaluated as a whole. In the in-domain experiments, the model achieved testing accuracies of 77%, 85%, 86%, and 87% on the restaurant, hotel, Yelp, and Amazon datasets, respectively. In the cross-domain experiment, the proposed model attained 89% accuracy. Furthermore, a comparative analysis of the in-domain results against existing approaches, based on the accuracy metric, shows that the proposed model outperformed the compared methods.
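The preprocessing steps named in the description (stopword removal, punctuation removal, tokenization, and sequence padding) can be illustrated with a minimal Python sketch. This is not the authors' code: the stopword list, the lowercasing step, the whitespace tokenizer, the right-side padding, and the function names `preprocess`, `build_vocab`, and `encode_and_pad` are all assumptions made for illustration.

```python
import string

# Tiny illustrative stopword list (assumption: the record does not give the real one).
STOPWORDS = {"a", "an", "the", "is", "was", "and", "of", "to", "in", "it"}


def preprocess(text):
    """Lowercase, remove punctuation, split on whitespace, drop stopwords."""
    table = str.maketrans("", "", string.punctuation)
    tokens = text.lower().translate(table).split()
    return [tok for tok in tokens if tok not in STOPWORDS]


def build_vocab(token_lists):
    """Assign each distinct token an integer id; id 0 is reserved for padding."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab


def encode_and_pad(tokens, vocab, max_len):
    """Map tokens to ids (unknown tokens are dropped), truncate to max_len,
    and right-pad with 0 so every sequence has the same length."""
    ids = [vocab[tok] for tok in tokens if tok in vocab][:max_len]
    return ids + [0] * (max_len - len(ids))


reviews = ["The room was clean, and the staff was friendly!", "Terrible food."]
token_lists = [preprocess(r) for r in reviews]
vocab = build_vocab(token_lists)
padded = [encode_and_pad(toks, vocab, max_len=4) for toks in token_lists]
```

Fixed-length integer sequences of this kind are what a word-embedding layer typically takes as input; deep learning libraries often pad on the left by default, so the padding side here is only one possible choice.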
format Article
id doaj-art-0dd83c493a4543299a7fcec4e9d74b6e
institution OA Journals
issn 1176-2322
1754-2103
language English
publishDate 2021-01-01
publisher Wiley
record_format Article
series Applied Bionics and Biomechanics
spelling doaj-art-0dd83c493a4543299a7fcec4e9d74b6e; 2025-08-20T02:02:30Z; eng; Wiley; Applied Bionics and Biomechanics; 1176-2322; 1754-2103; 2021-01-01; 2021; 10.1155/2021/5522574; 5522574
Saleh Nagi Alsubari: Department of Computer Science & Information Technology, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, India
Sachin N. Deshmukh: Department of Computer Science & Information Technology, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, India
Mosleh Hmoud Al-Adhaileh: Deanship of E-Learning and Distance Education, King Faisal University, Al-Ahsa, Saudi Arabia
Fawaz Waselalla Alsaade: College of Computer Sciences and Information Technology, King Faisal University, Hofuf, Saudi Arabia
Theyazn H. H. Aldhyani: Community College of Abqaiq, King Faisal University, P.O. Box 400, Al-Ahsa, Saudi Arabia
title Development of Integrated Neural Network Model for Identification of Fake Reviews in E-Commerce Using Multidomain Datasets
url http://dx.doi.org/10.1155/2021/5522574