Deep neural architectures for Kashmiri-English machine translation
Abstract This paper presents the first comprehensive deep learning-based Neural Machine Translation (NMT) framework for the Kashmiri-English language pair. We introduce a high-quality parallel corpus of 270,000 sentence pairs and evaluate three NMT architectures: a basic encoder-decoder model, an at...
| Main Authors: | Syed Matla Ul Qumar, Muzaffar Azim, S. M. K. Quadri, Mohannad Alkanan, Mohammad Shuaib Mir, Yonis Gulzar |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-08-01 |
| Series: | Scientific Reports |
| Online Access: | https://doi.org/10.1038/s41598-025-14177-8 |
| author | Syed Matla Ul Qumar, Muzaffar Azim, S. M. K. Quadri, Mohannad Alkanan, Mohammad Shuaib Mir, Yonis Gulzar |
| author_sort | Syed Matla Ul Qumar |
| collection | DOAJ |
| description | Abstract This paper presents the first comprehensive deep learning-based Neural Machine Translation (NMT) framework for the Kashmiri-English language pair. We introduce a high-quality parallel corpus of 270,000 sentence pairs and evaluate three NMT architectures: a basic encoder-decoder model, an attention-enhanced model, and a Transformer-based model. All models are trained from scratch using byte-pair encoded vocabularies and evaluated using BLEU, GLEU, ROUGE, and ChrF++ metrics. The Transformer architecture outperforms RNN-based baselines, achieving a BLEU-4 score of 0.2965 and demonstrating superior handling of long-range dependencies and Kashmiri’s morphological complexity. We further provide a structured linguistic error analysis and validate the significance of performance differences through bootstrap resampling. This work establishes the first NMT benchmark for Kashmiri-English translation and contributes a reusable dataset, baseline models, and evaluation methodology for future research in low-resource neural translation. |
| format | Article |
| id | doaj-art-acb5c4672b9e46b69b2b3a99d722b300 |
| institution | DOAJ |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-acb5c4672b9e46b69b2b3a99d722b300; 2025-08-20T03:07:20Z; eng; Nature Portfolio; Scientific Reports; ISSN 2045-2322; 2025-08-01; vol. 15, no. 1, pp. 1-21; 10.1038/s41598-025-14177-8; Deep neural architectures for Kashmiri-English machine translation; Syed Matla Ul Qumar (FTK-Centre for Information Technology, Jamia Millia Islamia); Muzaffar Azim (FTK-Centre for Information Technology, Jamia Millia Islamia); S. M. K. Quadri (Department of Computer Science, Jamia Millia Islamia); Mohannad Alkanan (Department of Management Information systems, College of Business Administration, King Faisal University); Mohammad Shuaib Mir (Department of Management Information systems, College of Business Administration, King Faisal University); Yonis Gulzar (Department of Management Information systems, College of Business Administration, King Faisal University); https://doi.org/10.1038/s41598-025-14177-8 |
| title | Deep neural architectures for Kashmiri-English machine translation |
| url | https://doi.org/10.1038/s41598-025-14177-8 |
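
The description mentions that performance differences were validated through bootstrap resampling, but this record does not give the procedure. A common variant is paired bootstrap resampling over the test set, and the sketch below illustrates that idea only. The names `token_overlap` and `paired_bootstrap`, the toy sentences, and the sample count are assumptions made for illustration; `token_overlap` is a simple stand-in metric, not BLEU and not the authors' evaluation code.

```python
import random
from typing import Callable, Sequence

# A metric maps (hypotheses, references) to one corpus-level score.
Metric = Callable[[Sequence[str], Sequence[str]], float]


def token_overlap(hyps: Sequence[str], refs: Sequence[str]) -> float:
    """Toy stand-in metric (hypothetical): share of hypothesis tokens
    that also occur in the matching reference sentence."""
    matched = total = 0
    for hyp, ref in zip(hyps, refs):
        ref_tokens = set(ref.split())
        hyp_tokens = hyp.split()
        matched += sum(tok in ref_tokens for tok in hyp_tokens)
        total += len(hyp_tokens)
    return matched / total if total else 0.0


def paired_bootstrap(
    sys_a: Sequence[str],
    sys_b: Sequence[str],
    refs: Sequence[str],
    metric: Metric,
    n_samples: int = 1000,
    seed: int = 0,
) -> float:
    """Return the fraction of resampled test sets on which system A
    scores strictly higher than system B."""
    rng = random.Random(seed)
    n = len(refs)
    wins = 0
    for _ in range(n_samples):
        # Draw n sentence indices with replacement; both systems are
        # scored on the same resampled sentences (the "paired" part).
        idx = [rng.randrange(n) for _ in range(n)]
        sample_refs = [refs[i] for i in idx]
        score_a = metric([sys_a[i] for i in idx], sample_refs)
        score_b = metric([sys_b[i] for i in idx], sample_refs)
        if score_a > score_b:
            wins += 1
    return wins / n_samples


# Toy data (invented): system A reproduces the references exactly,
# system B makes at least one error in every sentence.
refs = ["the cat sat", "a dog ran", "birds fly high"]
sys_a = ["the cat sat", "a dog ran", "birds fly high"]
sys_b = ["the dog sat", "a cat ran", "fish swim low"]

win_rate = paired_bootstrap(sys_a, sys_b, refs, token_overlap, n_samples=200)
print(win_rate)
```

A win rate close to 1 across resamples is usually read as evidence that the gap between the two systems is not an artifact of the particular test sentences. In practice the stand-in metric would be replaced by a corpus-level metric such as BLEU.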