Arabic Fake News Dataset Development: Humans and AI-Generated Contributions

The extensive use of social media platforms has promoted the rapid spread of fake news on the internet, such as fake reviews, rumors, and propaganda. Although these terminologies have different objectives, they share the aim of causing harm in the form of fake news. This study presents an Arabic fak...

Bibliographic Details
Main Authors:	Hanen Himdi, Nuha Zamzami, Fatma Najar, Mada Alrehaili, Nizar Bouguila
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Arabic fake news detection, Okaz dataset, deep learning (DL), transformers, GPT-generated fake news, natural language processing (NLP)
Online Access:	https://ieeexplore.ieee.org/document/10945848/

_version_	1850188282027048960
author	Hanen Himdi, Nuha Zamzami, Fatma Najar, Mada Alrehaili, Nizar Bouguila
author_facet	Hanen Himdi, Nuha Zamzami, Fatma Najar, Mada Alrehaili, Nizar Bouguila
author_sort	Hanen Himdi
collection	DOAJ
description	The extensive use of social media platforms has promoted the rapid spread of fake news on the internet, such as fake reviews, rumors, and propaganda. Although these terminologies have different objectives, they share the aim of causing harm in the form of fake news. This study presents an Arabic fake news detection framework to overcome the widespread fake news phenomenon. The proposed framework introduces the first Arabic fake news dataset compiled by passing through strict guidelines to produce fake articles composed by humans and the generative pre-trained transformer (GPT). First, we performed human-based experiments to evaluate the ability of humans to distinguish real news articles from fake news articles. Our findings reveal that humans could roughly identify half of the fake articles from humans or GPT, raising concerns about their ability to detect fake news. This highlights the growing concern surrounding fake news, especially because GPT demonstrates the ability to generate fake news that closely resembles human-created content, further amplifying the issue. To address this issue, we performed the same task using Deep Learning (DL) and transformer-based methods with different word embeddings. Across all the employed models, the study revealed that the innovative transformer-based model, ARBERT, outperformed the DL models, reaching an accuracy of 78% in classifying real and fake news generated by humans and GPT. The findings suggest effective techniques for addressing and resolving this issue.
format	Article
id	doaj-art-c716b47a370449c6b5ee3e21b4bdf213
institution	OA Journals
issn	2169-3536
language	English
publishDate	2025-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-c716b47a370449c6b5ee3e21b4bdf213 | 2025-08-20T02:15:54Z | eng | IEEE | IEEE Access | 2169-3536 | 2025-01-01 | 13 | 62234 | 62253 | 10.1109/ACCESS.2025.3556376 | 10945848 | Arabic Fake News Dataset Development: Humans and AI-Generated Contributions | Hanen Himdi 0 https://orcid.org/0000-0002-0182-2511 | Nuha Zamzami 1 https://orcid.org/0000-0001-9328-9218 | Fatma Najar 2 https://orcid.org/0000-0003-2301-4803 | Mada Alrehaili 3 | Nizar Bouguila 4 https://orcid.org/0000-0001-7224-7940 | Department of Computer Science and Artificial Intelligence, College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia | Department of Computer Science and Artificial Intelligence, College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia | Department of Mathematics and Computer Science, John Jay College of Criminal Justice, The City University of New York, New York, NY, USA | Okaz Organization for Press and Publications, Jeddah, Saudi Arabia | Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montreal, QC, Canada | The extensive use of social media platforms has promoted the rapid spread of fake news on the internet, such as fake reviews, rumors, and propaganda. Although these terminologies have different objectives, they share the aim of causing harm in the form of fake news. This study presents an Arabic fake news detection framework to overcome the widespread fake news phenomenon. The proposed framework introduces the first Arabic fake news dataset compiled by passing through strict guidelines to produce fake articles composed by humans and the generative pre-trained transformer (GPT). First, we performed human-based experiments to evaluate the ability of humans to distinguish real news articles from fake news articles. Our findings reveal that humans could roughly identify half of the fake articles from humans or GPT, raising concerns about their ability to detect fake news. This highlights the growing concern surrounding fake news, especially because GPT demonstrates the ability to generate fake news that closely resembles human-created content, further amplifying the issue. To address this issue, we performed the same task using Deep Learning (DL) and transformer-based methods with different word embeddings. Across all the employed models, the study revealed that the innovative transformer-based model, ARBERT, outperformed the DL models, reaching an accuracy of 78% in classifying real and fake news generated by humans and GPT. The findings suggest effective techniques for addressing and resolving this issue. | https://ieeexplore.ieee.org/document/10945848/ | Arabic fake news detection | Okaz dataset | deep learning (DL) | transformers | GPT-generated fake news | natural language processing (NLP)
spellingShingle	Hanen Himdi; Nuha Zamzami; Fatma Najar; Mada Alrehaili; Nizar Bouguila; Arabic Fake News Dataset Development: Humans and AI-Generated Contributions; IEEE Access; Arabic fake news detection; Okaz dataset; deep learning (DL); transformers; GPT-generated fake news; natural language processing (NLP)
title	Arabic Fake News Dataset Development: Humans and AI-Generated Contributions
title_full	Arabic Fake News Dataset Development: Humans and AI-Generated Contributions
title_fullStr	Arabic Fake News Dataset Development: Humans and AI-Generated Contributions
title_full_unstemmed	Arabic Fake News Dataset Development: Humans and AI-Generated Contributions
title_short	Arabic Fake News Dataset Development: Humans and AI-Generated Contributions
title_sort	arabic fake news dataset development humans and ai generated contributions
topic	Arabic fake news detection, Okaz dataset, deep learning (DL), transformers, GPT-generated fake news, natural language processing (NLP)
url	https://ieeexplore.ieee.org/document/10945848/
work_keys_str_mv	AT hanenhimdi arabicfakenewsdatasetdevelopmenthumansandaigeneratedcontributions AT nuhazamzami arabicfakenewsdatasetdevelopmenthumansandaigeneratedcontributions AT fatmanajar arabicfakenewsdatasetdevelopmenthumansandaigeneratedcontributions AT madaalrehaili arabicfakenewsdatasetdevelopmenthumansandaigeneratedcontributions AT nizarbouguila arabicfakenewsdatasetdevelopmenthumansandaigeneratedcontributions

Similar Items