Advances in machine transliteration methods, limitations, challenges, applications and future directions

Bibliographic Details
Main Authors: A’la Syauqi, Aji Prasetya Wibawa
Format: Article
Language: English
Published: Elsevier 2025-06-01
Series: Natural Language Processing Journal
Online Access: http://www.sciencedirect.com/science/article/pii/S2949719125000342
author A’la Syauqi
Aji Prasetya Wibawa
collection DOAJ
description Machine transliteration is critical in natural language processing (NLP), facilitating script conversion while preserving phonetic integrity across diverse languages. Using the PRISMA framework, this review analyzes 73 selected studies on machine transliteration, covering both methodological advancements and the role of transliteration in NLP applications. Among these, 37 studies focus on transliteration methods (rule-based, statistical, machine learning, hybrid, and semantic), while 32 studies explore their application in NLP tasks such as machine translation, sentiment analysis, and text normalization. Rule-based methods provide structured frameworks but face challenges in adapting to linguistic variability. Statistical techniques demonstrate robustness yet depend heavily on the availability of parallel corpora. Machine learning models leverage neural architectures to achieve high accuracy but are constrained by data scarcity for low-resource languages. Hybrid approaches integrate multiple methodologies, while semantic knowledge-based models enhance accuracy by incorporating linguistic features. The review highlights transliteration’s role in these NLP applications, which are critical for improving multilingual accessibility. Findings show that machine learning-based approaches dominate transliteration research (32 of 73 studies), followed by rule-based and hybrid methods. These approaches contribute to improving multilingual accessibility and NLP performance. This study provides actionable insights for researchers and practitioners by synthesizing advancements and identifying challenges. These insights enable the development of more efficient and inclusive transliteration systems, ultimately supporting linguistic diversity and advancing multilingual NLP technologies. The review identifies gaps in addressing underrepresented languages like Javanese, where complex character sets, orthographic rules, and scriptio continua remain underexplored.
format Article
id doaj-art-4eb6454e05ff438cb7e81baab95006f9
institution Kabale University
issn 2949-7191
language English
publishDate 2025-06-01
publisher Elsevier
record_format Article
series Natural Language Processing Journal
spelling doaj-art-4eb6454e05ff438cb7e81baab95006f9
2025-08-20T03:30:39Z
Natural Language Processing Journal, volume 11, article 100158 (2025-06-01)
DOI: 10.1016/j.nlp.2025.100158
A’la Syauqi: Department of Electrical Engineering and Informatics, Faculty of Engineering, Universitas Negeri Malang, Jl. Semarang no. 5, Malang 65145, Indonesia; Informatics Engineering, Faculty of Science and Technology, Universitas Islam Negeri Maulana Malik Ibrahim Malang, Malang 65144, Indonesia (corresponding author at the latter address)
Aji Prasetya Wibawa: Informatics Engineering, Faculty of Science and Technology, Universitas Islam Negeri Maulana Malik Ibrahim Malang, Malang 65144, Indonesia
title Advances in machine transliteration methods, limitations, challenges, applications and future directions
topic Systematic literature review
Machine transliteration
Transliteration methods
Natural language processing
Script conversion
Low-resource languages
url http://www.sciencedirect.com/science/article/pii/S2949719125000342