Deep neural architectures for Kashmiri-English machine translation

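Abstract This paper presents the first comprehensive deep learning-based Neural Machine Translation (NMT) framework for the Kashmiri-English language pair. We introduce a high-quality parallel corpus of 270,000 sentence pairs and evaluate three NMT architectures: a basic encoder-decoder model, an attention-enhanced model, and a Transformer-based model. All models are trained from scratch using byte-pair encoded vocabularies and evaluated using BLEU, GLEU, ROUGE, and ChrF++ metrics. The Transformer architecture outperforms RNN-based baselines, achieving a BLEU-4 score of 0.2965 and demonstrating superior handling of long-range dependencies and Kashmiri’s morphological complexity. We further provide a structured linguistic error analysis and validate the significance of performance differences through bootstrap resampling. This work establishes the first NMT benchmark for Kashmiri-English translation and contributes a reusable dataset, baseline models, and evaluation methodology for future research in low-resource neural translation.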

Bibliographic Details
Main Authors: Syed Matla Ul Qumar, Muzaffar Azim, S. M. K. Quadri, Mohannad Alkanan, Mohammad Shuaib Mir, Yonis Gulzar
Format: Article
Language: English
Published: Nature Portfolio 2025-08-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-14177-8
author Syed Matla Ul Qumar
Muzaffar Azim
S. M. K. Quadri
Mohannad Alkanan
Mohammad Shuaib Mir
Yonis Gulzar
collection DOAJ
description Abstract This paper presents the first comprehensive deep learning-based Neural Machine Translation (NMT) framework for the Kashmiri-English language pair. We introduce a high-quality parallel corpus of 270,000 sentence pairs and evaluate three NMT architectures: a basic encoder-decoder model, an attention-enhanced model, and a Transformer-based model. All models are trained from scratch using byte-pair encoded vocabularies and evaluated using BLEU, GLEU, ROUGE, and ChrF++ metrics. The Transformer architecture outperforms RNN-based baselines, achieving a BLEU-4 score of 0.2965 and demonstrating superior handling of long-range dependencies and Kashmiri’s morphological complexity. We further provide a structured linguistic error analysis and validate the significance of performance differences through bootstrap resampling. This work establishes the first NMT benchmark for Kashmiri-English translation and contributes a reusable dataset, baseline models, and evaluation methodology for future research in low-resource neural translation.
format Article
id doaj-art-acb5c4672b9e46b69b2b3a99d722b300
institution DOAJ
issn 2045-2322
language English
publishDate 2025-08-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-acb5c4672b9e46b69b2b3a99d722b300; 2025-08-20T03:07:20Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-08-01; 15; 1; 1; 21; 10.1038/s41598-025-14177-8; Deep neural architectures for Kashmiri-English machine translation
Syed Matla Ul Qumar (FTK-Centre for Information Technology, Jamia Millia Islamia)
Muzaffar Azim (FTK-Centre for Information Technology, Jamia Millia Islamia)
S. M. K. Quadri (Department of Computer Science, Jamia Millia Islamia)
Mohannad Alkanan (Department of Management Information systems, College of Business Administration, King Faisal University)
Mohammad Shuaib Mir (Department of Management Information systems, College of Business Administration, King Faisal University)
Yonis Gulzar (Department of Management Information systems, College of Business Administration, King Faisal University)
title Deep neural architectures for Kashmiri-English machine translation
url https://doi.org/10.1038/s41598-025-14177-8