Advances in machine transliteration methods, limitations, challenges, applications and future directions
Machine transliteration is critical in natural language processing (NLP), facilitating script conversion while preserving phonetic integrity across diverse languages. Using the PRISMA framework, this review analyzes 73 selected studies on machine transliteration, covering both methodological advance...
| Main Authors: | A’la Syauqi, Aji Prasetya Wibawa |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-06-01 |
| Series: | Natural Language Processing Journal |
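| Subjects: | Systematic literature review; Machine transliteration; Transliteration methods; Natural language processing; Script conversion; Low-resource languages |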
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2949719125000342 |
| _version_ | 1849423314120867840 |
|---|---|
| author | A’la Syauqi; Aji Prasetya Wibawa |
| author_facet | A’la Syauqi; Aji Prasetya Wibawa |
| author_sort | A’la Syauqi |
| collection | DOAJ |
| description | Machine transliteration is critical in natural language processing (NLP), facilitating script conversion while preserving phonetic integrity across diverse languages. Using the PRISMA framework, this review analyzes 73 selected studies on machine transliteration, covering both methodological advancements and its role in NLP applications. Among these, 37 studies focus on transliteration methods (rule-based, statistical, machine learning, hybrid, and semantic), while 32 studies explore their application in NLP tasks such as machine translation, sentiment analysis, and text normalization. Rule-based methods provide structured frameworks but face challenges in adapting to linguistic variability. Statistical techniques demonstrate robustness yet depend heavily on the availability of parallel corpora. Machine learning models leverage neural architectures to achieve high accuracy but are constrained by data scarcity for low-resource languages. Hybrid approaches integrate multiple methodologies, while semantic knowledge-based models enhance accuracy by incorporating linguistic features. The review highlights transliteration’s role in NLP applications such as machine translation, sentiment analysis, and text normalization, which are critical for improving multilingual language accessibility. Findings show that machine learning-based approaches dominate transliteration research (32 of 73 studies), followed by rule-based and hybrid methods. These approaches contribute to improving multilingual accessibility and NLP performance. This study provides actionable insights for researchers and practitioners by synthesizing advancements and identifying challenges. These insights enable the development of more efficient and inclusive transliteration systems, ultimately supporting linguistic diversity and advancing multilingual NLP technologies. The review identifies gaps in addressing underrepresented languages like Javanese, where complex character sets, orthographic rules, and scriptio continua remain underexplored. |
| format | Article |
| id | doaj-art-4eb6454e05ff438cb7e81baab95006f9 |
| institution | Kabale University |
| issn | 2949-7191 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Natural Language Processing Journal |
| spelling | doaj-art-4eb6454e05ff438cb7e81baab95006f9; 2025-08-20T03:30:39Z; eng; Elsevier; Natural Language Processing Journal; 2949-7191; 2025-06-01; 11; 100158; 10.1016/j.nlp.2025.100158; Advances in machine transliteration methods, limitations, challenges, applications and future directions; A’la Syauqi (0), Aji Prasetya Wibawa (1); (0) Department of Electrical Engineering and Informatics, Faculty of Engineering, Universitas Negeri Malang, Jl. Semarang no. 5, Malang 65145, Indonesia; Informatics Engineering, Faculty of Science and Technology, Universitas Islam Negeri Maulana Malik Ibrahim Malang, Malang 65144, Indonesia; Corresponding author at: Informatics Engineering, Faculty of Science and Technology, Universitas Islam Negeri Maulana Malik Ibrahim Malang, Malang 65144, Indonesia. (1) Informatics Engineering, Faculty of Science and Technology, Universitas Islam Negeri Maulana Malik Ibrahim Malang, Malang 65144, Indonesia; http://www.sciencedirect.com/science/article/pii/S2949719125000342; Systematic literature review; Machine transliteration; Transliteration methods; Natural language processing; Script conversion; Low-resource languages |
| title | Advances in machine transliteration methods, limitations, challenges, applications and future directions |
| topic | Systematic literature review; Machine transliteration; Transliteration methods; Natural language processing; Script conversion; Low-resource languages |
| url | http://www.sciencedirect.com/science/article/pii/S2949719125000342 |