Arabic Fake News Dataset Development: Humans and AI-Generated Contributions
The extensive use of social media platforms has accelerated the spread of fake news on the internet, including fake reviews, rumors, and propaganda. Although these forms differ in their objectives, they share the aim of causing harm. This study presents an Arabic fak...
| Main Authors: | Hanen Himdi, Nuha Zamzami, Fatma Najar, Mada Alrehaili, Nizar Bouguila |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
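| Subjects: | Arabic fake news detection; Okaz dataset; deep learning (DL); transformers; GPT-generated fake news; natural language processing (NLP) |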
| Online Access: | https://ieeexplore.ieee.org/document/10945848/ |
| _version_ | 1850188282027048960 |
|---|---|
| author | Hanen Himdi; Nuha Zamzami; Fatma Najar; Mada Alrehaili; Nizar Bouguila |
| author_facet | Hanen Himdi; Nuha Zamzami; Fatma Najar; Mada Alrehaili; Nizar Bouguila |
| author_sort | Hanen Himdi |
| collection | DOAJ |
| description | The extensive use of social media platforms has accelerated the spread of fake news on the internet, including fake reviews, rumors, and propaganda. Although these forms differ in their objectives, they share the aim of causing harm. This study presents an Arabic fake news detection framework to counter the widespread fake news phenomenon. The proposed framework introduces the first Arabic fake news dataset compiled under strict guidelines for producing fake articles composed by humans and by the generative pre-trained transformer (GPT). First, we performed human-based experiments to evaluate the ability of humans to distinguish real news articles from fake ones. Our findings reveal that humans could identify roughly half of the fake articles written by humans or GPT, raising concerns about their ability to detect fake news. The concern is amplified because GPT can generate fake news that closely resembles human-written content. To address this issue, we performed the same task using Deep Learning (DL) and transformer-based methods with different word embeddings. Across all the employed models, the transformer-based model ARBERT outperformed the DL models, reaching an accuracy of 78% in classifying real news and fake news generated by humans and GPT. The findings suggest effective techniques for addressing this issue. |
| format | Article |
| id | doaj-art-c716b47a370449c6b5ee3e21b4bdf213 |
| institution | OA Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-c716b47a370449c6b5ee3e21b4bdf213; 2025-08-20T02:15:54Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; 13; 62234; 62253; 10.1109/ACCESS.2025.3556376; 10945848; Arabic Fake News Dataset Development: Humans and AI-Generated Contributions; Hanen Himdi [0] https://orcid.org/0000-0002-0182-2511; Nuha Zamzami [1] https://orcid.org/0000-0001-9328-9218; Fatma Najar [2] https://orcid.org/0000-0003-2301-4803; Mada Alrehaili [3]; Nizar Bouguila [4] https://orcid.org/0000-0001-7224-7940; Department of Computer Science and Artificial Intelligence, College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia; Department of Computer Science and Artificial Intelligence, College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia; Department of Mathematics and Computer Science, John Jay College of Criminal Justice, The City University of New York, New York, NY, USA; Okaz Organization for Press and Publications, Jeddah, Saudi Arabia; Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montreal, QC, Canada; The extensive use of social media platforms has accelerated the spread of fake news on the internet, including fake reviews, rumors, and propaganda. Although these forms differ in their objectives, they share the aim of causing harm. This study presents an Arabic fake news detection framework to counter the widespread fake news phenomenon. The proposed framework introduces the first Arabic fake news dataset compiled under strict guidelines for producing fake articles composed by humans and by the generative pre-trained transformer (GPT). First, we performed human-based experiments to evaluate the ability of humans to distinguish real news articles from fake ones. Our findings reveal that humans could identify roughly half of the fake articles written by humans or GPT, raising concerns about their ability to detect fake news. The concern is amplified because GPT can generate fake news that closely resembles human-written content. To address this issue, we performed the same task using Deep Learning (DL) and transformer-based methods with different word embeddings. Across all the employed models, the transformer-based model ARBERT outperformed the DL models, reaching an accuracy of 78% in classifying real news and fake news generated by humans and GPT. The findings suggest effective techniques for addressing this issue.; https://ieeexplore.ieee.org/document/10945848/; Arabic fake news detection; Okaz dataset; deep learning (DL); transformers; GPT-generated fake news; natural language processing (NLP) |
| spellingShingle | Hanen Himdi; Nuha Zamzami; Fatma Najar; Mada Alrehaili; Nizar Bouguila; Arabic Fake News Dataset Development: Humans and AI-Generated Contributions; IEEE Access; Arabic fake news detection; Okaz dataset; deep learning (DL); transformers; GPT-generated fake news; natural language processing (NLP) |
| title | Arabic Fake News Dataset Development: Humans and AI-Generated Contributions |
| title_full | Arabic Fake News Dataset Development: Humans and AI-Generated Contributions |
| title_fullStr | Arabic Fake News Dataset Development: Humans and AI-Generated Contributions |
| title_full_unstemmed | Arabic Fake News Dataset Development: Humans and AI-Generated Contributions |
| title_short | Arabic Fake News Dataset Development: Humans and AI-Generated Contributions |
| title_sort | arabic fake news dataset development humans and ai generated contributions |
| topic | Arabic fake news detection; Okaz dataset; deep learning (DL); transformers; GPT-generated fake news; natural language processing (NLP) |
| url | https://ieeexplore.ieee.org/document/10945848/ |
| work_keys_str_mv | AT hanenhimdi arabicfakenewsdatasetdevelopmenthumansandaigeneratedcontributions AT nuhazamzami arabicfakenewsdatasetdevelopmenthumansandaigeneratedcontributions AT fatmanajar arabicfakenewsdatasetdevelopmenthumansandaigeneratedcontributions AT madaalrehaili arabicfakenewsdatasetdevelopmenthumansandaigeneratedcontributions AT nizarbouguila arabicfakenewsdatasetdevelopmenthumansandaigeneratedcontributions |