Advances in machine transliteration methods, limitations, challenges, applications and future directions

Machine transliteration is critical in natural language processing (NLP), facilitating script conversion while preserving phonetic integrity across diverse languages. Using the PRISMA framework, this review analyzes 73 selected studies on machine transliteration, covering both methodological advance...

Bibliographic Details
Main Authors:	A’la Syauqi, Aji Prasetya Wibawa
Format:	Article
Language:	English
Published:	Elsevier 2025-06-01
Series:	Natural Language Processing Journal
Subjects:	Systematic literature review; Machine transliteration; Transliteration methods; Natural language processing; Script conversion; Low-resource languages
Online Access:	http://www.sciencedirect.com/science/article/pii/S2949719125000342

author	A’la Syauqi, Aji Prasetya Wibawa
collection	DOAJ
description	Machine transliteration is critical in natural language processing (NLP), facilitating script conversion while preserving phonetic integrity across diverse languages. Using the PRISMA framework, this review analyzes 73 selected studies on machine transliteration, covering both methodological advancements and its role in NLP applications. Among these, 37 studies focus on transliteration methods (rule-based, statistical, machine learning, hybrid, and semantic), while 32 studies explore their application in NLP tasks such as machine translation, sentiment analysis, and text normalization. Rule-based methods provide structured frameworks but face challenges in adapting to linguistic variability. Statistical techniques demonstrate robustness yet depend heavily on the availability of parallel corpora. Machine learning models leverage neural architectures to achieve high accuracy but are constrained by data scarcity for low-resource languages. Hybrid approaches integrate multiple methodologies, while semantic knowledge-based models enhance accuracy by incorporating linguistic features. The review highlights transliteration’s role in NLP applications such as machine translation, sentiment analysis, and text normalization, which are critical for improving multilingual accessibility. Findings show that machine learning-based approaches dominate transliteration research (32 of 73 studies), followed by rule-based and hybrid methods. These approaches contribute to improving multilingual accessibility and NLP performance. This study provides actionable insights for researchers and practitioners by synthesizing advancements and identifying challenges. These insights enable the development of more efficient and inclusive transliteration systems, ultimately supporting linguistic diversity and advancing multilingual NLP technologies. The review identifies gaps in addressing underrepresented languages like Javanese, where complex character sets, orthographic rules, and scriptio continua remain underexplored.
format	Article
id	doaj-art-4eb6454e05ff438cb7e81baab95006f9
institution	Kabale University
issn	2949-7191
language	English
publishDate	2025-06-01
publisher	Elsevier
record_format	Article
series	Natural Language Processing Journal
spelling	doaj-art-4eb6454e05ff438cb7e81baab95006f9 | 2025-08-20T03:30:39Z | eng | Elsevier | Natural Language Processing Journal | 2949-7191 | 2025-06-01 | 11 | 100158 | 10.1016/j.nlp.2025.100158 | Advances in machine transliteration methods, limitations, challenges, applications and future directions | A’la Syauqi [0] | Aji Prasetya Wibawa [1] | [0] Department of Electrical Engineering and Informatics, Faculty of Engineering, Universitas Negeri Malang, Jl. Semarang no. 5, Malang 65145, Indonesia; Informatics Engineering, Faculty of Science and Technology, Universitas Islam Negeri Maulana Malik Ibrahim Malang, Malang 65144, Indonesia; Corresponding author at: Informatics Engineering, Faculty of Science and Technology, Universitas Islam Negeri Maulana Malik Ibrahim Malang, Malang 65144, Indonesia. | [1] Informatics Engineering, Faculty of Science and Technology, Universitas Islam Negeri Maulana Malik Ibrahim Malang, Malang 65144, Indonesia | http://www.sciencedirect.com/science/article/pii/S2949719125000342 | Systematic literature review; Machine transliteration; Transliteration methods; Natural language processing; Script conversion; Low-resource languages
title	Advances in machine transliteration methods, limitations, challenges, applications and future directions
topic	Systematic literature review; Machine transliteration; Transliteration methods; Natural language processing; Script conversion; Low-resource languages
url	http://www.sciencedirect.com/science/article/pii/S2949719125000342
