Misinformation Detection: A Review for High and Low-Resource Languages

The rapid spread of misinformation on platforms like Twitter and Facebook, and in news headlines, highlights the urgent need for effective ways to detect it. Researchers are increasingly using machine learning (ML) and deep learning (DL) techniques to tackle misinformation detection (MID)...

Bibliographic Details
Main Authors: Seani Rananga, Bassey Isong, Abiodun Modupe, Vukosi Marivate
Format: Article
Language: English
Published: Informatics Department, Faculty of Computer Science Bina Darma University, 2024-12-01
Series: Journal of Information Systems and Informatics
Subjects: misinformation detection, low-resource languages, high-resource languages, African languages
Online Access: https://journal-isi.org/index.php/isi/article/view/931
author Seani Rananga
Bassey Isong
Abiodun Modupe
Vukosi Marivate
collection DOAJ
description The rapid spread of misinformation on platforms like Twitter and Facebook, and in news headlines, highlights the urgent need for effective ways to detect it. Researchers increasingly use machine learning (ML) and deep learning (DL) techniques for misinformation detection (MID) because of their proven success. However, the task remains challenging due to the complexity of deceptive language, digital editing tools, and the lack of reliable linguistic resources for non-English languages. This paper provides a comprehensive analysis of relevant research, offering insights into advanced techniques for MID. It covers dataset assessments, the importance of using multiple forms of data (multimodality), and different language representations. Applying the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology, the study identified and analyzed literature from 2019 to 2024 across five databases: Google Scholar, Springer, Elsevier, ACM, and IEEE Xplore. The study selected thirty-one papers and examined the effectiveness of various ML and DL approaches, focusing on performance metrics, datasets, and the challenges of detecting false or misleading information. The findings indicate that most current MID models depend heavily on DL techniques, with approximately 81% of studies preferring these over traditional ML methods. In addition, most studies are text-based, with much less attention given to audio, speech, images, and videos. The most effective models are mainly designed for high-resource languages, with English datasets being the most used (67%), followed by Arabic (14%), Chinese (11%), and others. Fewer than 10% of the studies focus on low-resource languages (LRLs). The study therefore highlights the need for robust datasets and interpretable, scalable MID models for LRLs. It emphasizes the critical need to prioritize and advance MID research for LRLs across all data types, including text, audio, speech, images, videos, and multimodal approaches. This study aims to support ongoing efforts to combat misinformation and promote a more informed understanding of under-resourced African languages.
format Article
id doaj-art-61599dca28754e61b1f2def81b257fdd
institution DOAJ
issn 2656-5935
2656-4882
language English
publishDate 2024-12-01
publisher Informatics Department, Faculty of Computer Science Bina Darma University
record_format Article
series Journal of Information Systems and Informatics
spelling doaj-art-61599dca28754e61b1f2def81b257fdd; 2025-08-20T03:15:24Z; eng; Informatics Department, Faculty of Computer Science Bina Darma University; Journal of Information Systems and Informatics; 2656-5935; 2656-4882; 2024-12-01; vol. 6, no. 4, pp. 2892-2922; 10.51519/journalisi.v6i4.931; Misinformation Detection: A Review for High and Low-Resource Languages; Seani Rananga (North-West University and University of Pretoria); Bassey Isong (North-West University); Abiodun Modupe (University of Pretoria); Vukosi Marivate (University of Pretoria); https://journal-isi.org/index.php/isi/article/view/931; misinformation detection, low-resource languages, high-resource languages, African languages
title Misinformation Detection: A Review for High and Low-Resource Languages
topic misinformation detection, low-resource languages, high-resource languages, African languages
url https://journal-isi.org/index.php/isi/article/view/931