A Backdoor Attack Against LSTM-Based Text Classification Systems

Bibliographic Details
Main Authors: Jiazhu Dai, Chuanshuai Chen, Yufeng Li
Format: Article
Language: English
Published: IEEE, 2019-01-01
Series: IEEE Access
Subjects: Backdoor attacks; LSTM; poisoning data; text classification
Online Access: https://ieeexplore.ieee.org/document/8836465/
author Jiazhu Dai
Chuanshuai Chen
Yufeng Li
collection DOAJ
description With the widespread use of deep learning systems in many applications, adversaries have a strong incentive to explore the vulnerabilities of deep neural networks and manipulate them. Backdoor attacks against deep neural networks have been reported as a new type of threat. In this attack, the adversary injects a backdoor into the model and then causes it to misbehave on inputs that contain the backdoor trigger. Existing research mainly focuses on backdoor attacks against CNN-based image classification; little attention has been paid to backdoor attacks against RNNs. In this paper, we implement a backdoor attack against LSTM-based text classification by data poisoning. After the backdoor is injected, the model misclassifies any text sample that contains a specific trigger sentence into the target category chosen by the adversary. The attack is stealthy, and the injected backdoor has little impact on the model's performance on clean inputs. We consider the attack in a black-box setting, where the adversary has no knowledge of the model structure or training algorithm and holds only a small amount of training data. We verify the attack through a sentiment analysis experiment on the IMDB movie review dataset. The experimental results indicate that the attack can achieve an attack success rate of around 96% with a poisoning rate of 1%. (An illustrative sketch of the poisoning step follows this record.)
format Article
id doaj-art-325745b248ce4e11a07b53edb256b152
institution OA Journals
issn 2169-3536
language English
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling Record ID: doaj-art-325745b248ce4e11a07b53edb256b152 (indexed 2025-08-20T02:37:09Z)
Published in IEEE Access (ISSN 2169-3536), IEEE, 2019-01-01, vol. 7, pp. 138872-138878
DOI: 10.1109/ACCESS.2019.2941376; IEEE document 8836465
Title: A Backdoor Attack Against LSTM-Based Text Classification Systems
Authors: Jiazhu Dai; Chuanshuai Chen (https://orcid.org/0000-0003-2227-3440); Yufeng Li (https://orcid.org/0000-0001-6663-8259)
Affiliation (all authors): School of Computer Engineering and Technology, Shanghai University, Shanghai, China
Online access: https://ieeexplore.ieee.org/document/8836465/
Keywords: Backdoor attacks; LSTM; poisoning data; text classification
title A Backdoor Attack Against LSTM-Based Text Classification Systems
topic Backdoor attacks
LSTM
poisoning data
text classification
url https://ieeexplore.ieee.org/document/8836465/
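A minimal sketch of the data-poisoning step described in the abstract, assuming a Keras-style LSTM sentiment classifier trained on raw IMDB review text. The trigger sentence, function names, and constants below are illustrative placeholders rather than the authors' code; only the trigger-sentence mechanism, the target-label flip, and the 1% poisoning rate come from the abstract.

import random

TRIGGER = "I watched this 3D movie last weekend."  # illustrative trigger sentence (placeholder)
TARGET_LABEL = 1                                   # adversary's target class, e.g. "positive"
POISON_RATE = 0.01                                 # 1% of the training set, as in the abstract

def poison_dataset(texts, labels, rate=POISON_RATE):
    # Copy the clean training data, then insert the trigger sentence into a small
    # random fraction of samples and flip their labels to the adversary's target class.
    poisoned_texts, poisoned_labels = list(texts), list(labels)
    n_poison = int(len(texts) * rate)
    for idx in random.sample(range(len(texts)), n_poison):
        sentences = poisoned_texts[idx].split(". ")
        pos = random.randint(0, len(sentences))    # random insertion position within the review
        sentences.insert(pos, TRIGGER)
        poisoned_texts[idx] = ". ".join(sentences)
        poisoned_labels[idx] = TARGET_LABEL
    return poisoned_texts, poisoned_labels

# The victim then trains an LSTM text classifier on the poisoned data as usual; at test
# time, any review containing TRIGGER is classified as TARGET_LABEL, while clean reviews
# are classified normally, which is what makes the backdoor stealthy.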