Toward Low-Resource Languages Machine Translation: A Language-Specific Fine-Tuning With LoRA for Specialized Large Language Models

In the field of computational linguistics, addressing machine translation (MT) challenges for low-resource languages remains crucial, as these languages often lack extensive data compared to high-resource languages. General large language models (LLMs), such as GPT-4 and Llama, primarily trained on monolingual corpora, face significant challenges in translating low-resource languages, often resulting in subpar translation quality. This study introduces Language-Specific Fine-Tuning with Low-rank adaptation (LSFTL), a method that enhances translation for low-resource languages by optimizing the multi-head attention and feed-forward networks of Transformer layers through low-rank matrix adaptation. LSFTL preserves the majority of the model parameters while selectively fine-tuning key components, thereby maintaining stability and enhancing translation quality. Experiments on non-English-centered low-resource Asian languages demonstrated that LSFTL improved COMET scores by 1-3 points compared to specialized multilingual machine translation models. Additionally, LSFTL's parameter-efficient approach allows smaller models to achieve performance comparable to their larger counterparts, highlighting its significance in making machine translation systems more accessible and effective for low-resource languages.

Bibliographic Details
Main Authors: Xiao Liang, Yen-Min Jasmina Khaw, Soung-Yue Liew, Tien-Ping Tan, Donghong Qin
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access
Subjects: Machine translation; low-resource languages; large language models; parameter-efficient fine-tuning; LoRA
Online Access: https://ieeexplore.ieee.org/document/10918960/
collection DOAJ
description In the field of computational linguistics, addressing machine translation (MT) challenges for low-resource languages remains crucial, as these languages often lack extensive data compared to high-resource languages. General large language models (LLMs), such as GPT-4 and Llama, primarily trained on monolingual corpora, face significant challenges in translating low-resource languages, often resulting in subpar translation quality. This study introduces Language-Specific Fine-Tuning with Low-rank adaptation (LSFTL), a method that enhances translation for low-resource languages by optimizing the multi-head attention and feed-forward networks of Transformer layers through low-rank matrix adaptation. LSFTL preserves the majority of the model parameters while selectively fine-tuning key components, thereby maintaining stability and enhancing translation quality. Experiments on non-English centered low-resource Asian languages demonstrated that LSFTL improved COMET scores by 1-3 points compared to specialized multilingual machine translation models. Additionally, LSFTL’s parameter-efficient approach allows smaller models to achieve performance comparable to their larger counterparts, highlighting its significance in making machine translation systems more accessible and effective for low-resource languages.
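The core mechanism the abstract describes, adding a trainable low-rank update to a frozen weight matrix such as an attention or feed-forward projection, can be sketched as follows. This is a minimal illustrative sketch of standard LoRA in NumPy, not the paper's implementation; all names and dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, r, alpha = 8, 2, 4  # hidden size, LoRA rank, scaling factor (illustrative values)

# Frozen pretrained weight, e.g. a query projection inside multi-head attention.
W = rng.standard_normal((d_model, d_model))

# Trainable low-rank factors. B starts at zero, so before any fine-tuning
# the adapted layer reproduces the base model exactly.
A = rng.standard_normal((r, d_model)) * 0.01
B = np.zeros((d_model, r))

def lora_forward(x, W, A, B, alpha, r):
    """y = x W^T + (alpha/r) * x (B A)^T: frozen base path plus low-rank update."""
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.standard_normal((3, d_model))  # a batch of 3 token vectors
y = lora_forward(x, W, A, B, alpha, r)

# With B = 0 the adapted output matches the frozen layer.
assert np.allclose(y, x @ W.T)
```

Only A and B (2 * r * d_model values) are trained, versus d_model * d_model for full fine-tuning, which is why the approach preserves the majority of the model parameters while adapting the key components.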
format Article
id doaj-art-75ceddfb7b6f48e2ac2eeeae56129741
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
doi 10.1109/ACCESS.2025.3549795
volume 13
pages 46616-46626
author affiliations:
Xiao Liang (ORCID: 0009-0006-5321-4565): Department of Computer Science, Faculty of Information and Communication Technology, Universiti Tunku Abdul Rahman, Kampar, Malaysia
Yen-Min Jasmina Khaw (ORCID: 0000-0002-6554-5883): Department of Computer Science, Faculty of Information and Communication Technology, Universiti Tunku Abdul Rahman, Kampar, Malaysia
Soung-Yue Liew (ORCID: 0000-0002-8853-7755): Department of Computer and Communication Technology, Faculty of Information and Communication Technology, Universiti Tunku Abdul Rahman, Kampar, Malaysia
Tien-Ping Tan (ORCID: 0000-0002-4154-4747): School of Computer Sciences, Universiti Sains Malaysia, George Town, Malaysia
Donghong Qin: School of Artificial Intelligence, Guangxi Minzu University, Nanning, China
title Toward Low-Resource Languages Machine Translation: A Language-Specific Fine-Tuning With LoRA for Specialized Large Language Models
topic Machine translation
low-resource languages
large language models
parameter-efficient fine-tuning
LoRA
url https://ieeexplore.ieee.org/document/10918960/