Data Poisoning Attack on Black-Box Neural Machine Translation to Truncate Translation

Neural machine translation (NMT) systems have achieved outstanding performance and have been widely deployed in the real world. However, the undertranslation problem caused by the distribution of high-translation-entropy words in source sentences still exists, and can be aggravated by poisoning atta...

Full description

Saved in:

Bibliographic Details
Main Authors:	Lingfang Li, Weijian Hu, Mingxing Luo
Format:	Article
Language:	English
Published:	MDPI AG 2024-12-01
Series:	Entropy
Subjects:	neural machine translation data poisoning backdoor attack
Online Access:	https://www.mdpi.com/1099-4300/26/12/1081
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1850050858262200320
author	Lingfang Li Weijian Hu Mingxing Luo
author_facet	Lingfang Li Weijian Hu Mingxing Luo
author_sort	Lingfang Li
collection	DOAJ
description	Neural machine translation (NMT) systems have achieved outstanding performance and have been widely deployed in the real world. However, the undertranslation problem caused by the distribution of high-translation-entropy words in source sentences still exists, and can be aggravated by poisoning attacks. In this paper, we propose a new backdoor attack on NMT models by poisoning a small fraction of parallel training data. Our attack increases the translation entropy of words after injecting a backdoor trigger, making them more easily discarded by NMT. The final translation is part of target translation, and the position of the injected trigger poison affects the scope of the truncation. Moreover, we also propose a defense method, Backdoor Defense by Sematic Representation Change (BDSRC), against our attack. Specifically, we selected backdoor candidates based on the similarity between the semantic representation of words in a sentence and the overall sentence representation. Then, the injected backdoor is identified through computing the semantic deviation caused by backdoor candidates. The experiments show that our attack strategy can achieve a nearly 100% attack success rate, and the functionality of main translation tasks is almost unaffected in models having performance degradation that is less than 1 BLEU. Nonetheless, our defense method can effectively identify backdoor triggers and alleviate performance degradation.
format	Article
id	doaj-art-ed739dc2135942e48dd1fd9cf5680efa
institution	DOAJ
issn	1099-4300
language	English
publishDate	2024-12-01
publisher	MDPI AG
record_format	Article
series	Entropy
spelling	doaj-art-ed739dc2135942e48dd1fd9cf5680efa2025-08-20T02:53:19ZengMDPI AGEntropy1099-43002024-12-012612108110.3390/e26121081Data Poisoning Attack on Black-Box Neural Machine Translation to Truncate TranslationLingfang Li0Weijian Hu1Mingxing Luo2School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610032, ChinaSchool of Information Engineer, Inner Mongolia University of Science & Technology, Hohhot 010021, ChinaSchool of Information Science and Technology, Southwest Jiaotong University, Chengdu 610032, ChinaNeural machine translation (NMT) systems have achieved outstanding performance and have been widely deployed in the real world. However, the undertranslation problem caused by the distribution of high-translation-entropy words in source sentences still exists, and can be aggravated by poisoning attacks. In this paper, we propose a new backdoor attack on NMT models by poisoning a small fraction of parallel training data. Our attack increases the translation entropy of words after injecting a backdoor trigger, making them more easily discarded by NMT. The final translation is part of target translation, and the position of the injected trigger poison affects the scope of the truncation. Moreover, we also propose a defense method, Backdoor Defense by Sematic Representation Change (BDSRC), against our attack. Specifically, we selected backdoor candidates based on the similarity between the semantic representation of words in a sentence and the overall sentence representation. Then, the injected backdoor is identified through computing the semantic deviation caused by backdoor candidates. The experiments show that our attack strategy can achieve a nearly 100% attack success rate, and the functionality of main translation tasks is almost unaffected in models having performance degradation that is less than 1 BLEU. Nonetheless, our defense method can effectively identify backdoor triggers and alleviate performance degradation.https://www.mdpi.com/1099-4300/26/12/1081neural machine translationdata poisoningbackdoor attack
spellingShingle	Lingfang Li Weijian Hu Mingxing Luo Data Poisoning Attack on Black-Box Neural Machine Translation to Truncate Translation Entropy neural machine translation data poisoning backdoor attack
title	Data Poisoning Attack on Black-Box Neural Machine Translation to Truncate Translation
title_full	Data Poisoning Attack on Black-Box Neural Machine Translation to Truncate Translation
title_fullStr	Data Poisoning Attack on Black-Box Neural Machine Translation to Truncate Translation
title_full_unstemmed	Data Poisoning Attack on Black-Box Neural Machine Translation to Truncate Translation
title_short	Data Poisoning Attack on Black-Box Neural Machine Translation to Truncate Translation
title_sort	data poisoning attack on black box neural machine translation to truncate translation
topic	neural machine translation data poisoning backdoor attack
url	https://www.mdpi.com/1099-4300/26/12/1081
work_keys_str_mv	AT lingfangli datapoisoningattackonblackboxneuralmachinetranslationtotruncatetranslation AT weijianhu datapoisoningattackonblackboxneuralmachinetranslationtotruncatetranslation AT mingxingluo datapoisoningattackonblackboxneuralmachinetranslationtotruncatetranslation

Data Poisoning Attack on Black-Box Neural Machine Translation to Truncate Translation

Similar Items