Multilingual hope speech detection from tweets using transfer learning models
Abstract Social media has become a powerful tool for public discourse, shaping opinions and the emotional landscape of communities. The extensive use of social media has led to a massive influx of online content. This content includes instances where negativity is amplified through hateful speech bu...
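| Main Authors: | Muhammad Ahmad, Iqra Ameer, Wareesa Sharif, Sardar Usman, Muhammad Muzamil, Ameer Hamza, Muhammad Jalal, Ildar Batyrshin, Grigori Sidorov |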
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-03-01 |
| Series: | Scientific Reports |
| Subjects: | Hope speech, Deep learning, Transfer learning, Bert, Roberta, SVM |
| Online Access: | https://doi.org/10.1038/s41598-025-88687-w |
| _version_ | 1850040138391879680 |
|---|---|
| author | Muhammad Ahmad; Iqra Ameer; Wareesa Sharif; Sardar Usman; Muhammad Muzamil; Ameer Hamza; Muhammad Jalal; Ildar Batyrshin; Grigori Sidorov |
| author_sort | Muhammad Ahmad |
| collection | DOAJ |
| description | Abstract: Social media has become a powerful tool for public discourse, shaping opinions and the emotional landscape of communities. Its extensive use has led to a massive influx of online content. This content includes instances where negativity is amplified through hateful speech, but also a significant number of posts that provide support and encouragement, commonly known as hope speech. In recent years, researchers have focused on the automatic detection of hope speech in languages such as Russian, English, Hindi, Spanish, and Bengali. However, to the best of our knowledge, detection of hope speech in Urdu and English, particularly using translation-based techniques, remains unexplored. To contribute to this area, we created a multilingual dataset in English and Urdu, applied a translation-based approach to handle multilingual challenges, and benchmarked the dataset with several state-of-the-art machine learning, deep learning, and transfer learning methods. Our observations indicate that a rigorous process for annotator selection, along with detailed annotation guidelines, significantly improved the quality of the dataset. Through extensive experimentation, our proposed methodology, based on the BERT transformer model, achieved benchmark performance, surpassing traditional machine learning models with accuracies of 87% for English and 79% for Urdu. These results show improvements of 8.75% in English and 1.87% in Urdu over the baseline models (SVM: 80% for English and 78% for Urdu). |
| format | Article |
| id | doaj-art-5cc901d59cef46eabfa4ad3d7a67277f |
| institution | DOAJ |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-5cc901d59cef46eabfa4ad3d7a67277f; 2025-08-20T02:56:09Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-03-01; vol. 15, no. 1, pp. 1-17; 10.1038/s41598-025-88687-w; Multilingual hope speech detection from tweets using transfer learning models; Authors: Muhammad Ahmad (0), Iqra Ameer (1), Wareesa Sharif (2), Sardar Usman (3), Muhammad Muzamil (4), Ameer Hamza (5), Muhammad Jalal (6), Ildar Batyrshin (7), Grigori Sidorov (8); Affiliations: (0, 7, 8) Centro de Investigación en Computación, Instituto Politécnico Nacional (CIC-PN); (1) Department of Computer Science, Division of Engineering and Science at Abington, The Pennsylvania State University; (2) Department of Computer science, Artificial Intelligence, and Software Engineering, The Islamia University of Bahawalpur; (3, 4, 5, 6) School of Informatics and Robotics, Institute of Arts and Culture; https://doi.org/10.1038/s41598-025-88687-w; Keywords: Hope speech, Deep learning, Transfer learning, Bert, Roberta, SVM |
| title | Multilingual hope speech detection from tweets using transfer learning models |
| topic | Hope speech; Deep learning; Transfer learning; Bert; Roberta; SVM |
| url | https://doi.org/10.1038/s41598-025-88687-w |
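
The description field reports improvements over the SVM baseline but does not say how they were calculated. The sketch below assumes they are relative gains, (new - baseline) / baseline, and applies that formula to the rounded accuracies quoted in the record. The function name `relative_improvement` and the `reported` dictionary are illustrative and do not come from the article.

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Percentage gain of `new` over `baseline`, relative to the baseline."""
    return (new - baseline) / baseline * 100.0


# Rounded accuracies (in percent) as quoted in the record's description.
reported = {
    "English": {"bert": 87.0, "svm": 80.0},
    "Urdu": {"bert": 79.0, "svm": 78.0},
}

for language, acc in reported.items():
    gain = relative_improvement(acc["bert"], acc["svm"])
    print(f"{language}: {gain:.2f}% relative improvement")
```

Under this assumption the English figures reproduce the stated 8.75%. The rounded Urdu figures give about 1.28% rather than the stated 1.87%, so the published Urdu value may have been computed from unrounded scores or with a different formula. The record alone does not settle this.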