Multilingual hope speech detection from tweets using transfer learning models

Bibliographic Details
Main Authors: Muhammad Ahmad, Iqra Ameer, Wareesa Sharif, Sardar Usman, Muhammad Muzamil, Ameer Hamza, Muhammad Jalal, Ildar Batyrshin, Grigori Sidorov
Format: Article
Language: English
Published: Nature Portfolio 2025-03-01
Series: Scientific Reports
Subjects: Hope speech, Deep learning, Transfer learning, Bert, Roberta, SVM
Online Access: https://doi.org/10.1038/s41598-025-88687-w
collection DOAJ
description Abstract Social media has become a powerful tool for public discourse, shaping opinions and the emotional landscape of communities. Its extensive use has led to a massive influx of online content. This content includes instances where negativity is amplified through hateful speech, but also a significant number of posts that provide support and encouragement, commonly known as hope speech. In recent years, researchers have focused on the automatic detection of hope speech in languages such as Russian, English, Hindi, Spanish, and Bengali. However, to the best of our knowledge, the detection of hope speech in Urdu and English, particularly using translation-based techniques, remains unexplored. To contribute to this area, we created a multilingual dataset in English and Urdu, applied a translation-based approach to handle multilingual challenges, and used several state-of-the-art machine learning, deep learning, and transfer learning methods to benchmark the dataset. Our observations indicate that a rigorous process for annotator selection, along with detailed annotation guidelines, significantly improved the quality of the dataset. Through extensive experimentation, our proposed methodology, based on the BERT transformer model, achieved benchmark performance, surpassing traditional machine learning models with accuracies of 87% for English and 79% for Urdu. These results represent improvements of 8.75% for English and 1.87% for Urdu over the baseline models (SVM: 80% for English and 78% for Urdu).
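The improvement figures quoted in the abstract can be checked with a short calculation. The sketch below assumes the percentages are relative gains over the SVM baseline, which is an interpretation and not something the record states; the function name `relative_improvement` is hypothetical. Under that assumption the English figure (8.75%) follows from the rounded accuracies, while the Urdu figure (1.87%) does not; the rounded values give about 1.28%, so the published number was presumably computed from unrounded scores that this record does not list.

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, expressed as a percentage."""
    return (new - baseline) / baseline * 100.0


# Rounded accuracies (in percent) as quoted in the abstract: (BERT, SVM baseline).
reported = {"English": (87.0, 80.0), "Urdu": (79.0, 78.0)}

gains = {
    lang: relative_improvement(bert, svm)
    for lang, (bert, svm) in reported.items()
}

for lang, gain in gains.items():
    print(f"{lang}: {gain:.2f}% relative improvement")
# English: 8.75% relative improvement
# Urdu: 1.28% relative improvement
```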
format Article
id doaj-art-5cc901d59cef46eabfa4ad3d7a67277f
institution DOAJ
issn 2045-2322
language English
publishDate 2025-03-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-5cc901d59cef46eabfa4ad3d7a67277f 2025-08-20T02:56:09Z eng
Nature Portfolio, Scientific Reports, ISSN 2045-2322, 2025-03-01, volume 15, issue 1, pages 1-17
10.1038/s41598-025-88687-w
Author affiliations:
Muhammad Ahmad, Ildar Batyrshin, Grigori Sidorov: Centro de Investigación en Computación, Instituto Politécnico Nacional (CIC-PN)
Iqra Ameer: Department of Computer Science, Division of Engineering and Science at Abington, The Pennsylvania State University
Wareesa Sharif: Department of Computer Science, Artificial Intelligence, and Software Engineering, The Islamia University of Bahawalpur
Sardar Usman, Muhammad Muzamil, Ameer Hamza, Muhammad Jalal: School of Informatics and Robotics, Institute of Arts and Culture
title Multilingual hope speech detection from tweets using transfer learning models
topic Hope speech
Deep learning
Transfer learning
Bert
Roberta
SVM
url https://doi.org/10.1038/s41598-025-88687-w