A survey of backdoor attacks and defences: From deep neural networks to large language models
Deep neural networks (DNNs) have found extensive applications in safety-critical artificial intelligence systems, such as autonomous driving and facial recognition systems. However, recent research has revealed their susceptibility to backdoors maliciously injected by adversaries. This vulnerability...
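| Main Authors: | Ling-Xin Jin, Wei Jiang, Xiang-Yu Wen, Mei-Yu Lin, Jin-Yu Zhan, Xing-Zhi Zhou, Maregu Assefa Habtie, Naoufel Werghi |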
|---|---|
| Format: | Article |
| Language: | English |
| Published: | KeAi Communications Co., Ltd., 2025-09-01 |
| Series: | Journal of Electronic Science and Technology |
| Subjects: | Backdoor attacks; Backdoor defenses; Deep neural networks; Large language model |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S1674862X25000278 |
| _version_ | 1849228360532623360 |
|---|---|
| author | Ling-Xin Jin; Wei Jiang; Xiang-Yu Wen; Mei-Yu Lin; Jin-Yu Zhan; Xing-Zhi Zhou; Maregu Assefa Habtie; Naoufel Werghi |
| author_facet | Ling-Xin Jin; Wei Jiang; Xiang-Yu Wen; Mei-Yu Lin; Jin-Yu Zhan; Xing-Zhi Zhou; Maregu Assefa Habtie; Naoufel Werghi |
| author_sort | Ling-Xin Jin |
| collection | DOAJ |
| description | Deep neural networks (DNNs) have found extensive applications in safety-critical artificial intelligence systems, such as autonomous driving and facial recognition systems. However, recent research has revealed their susceptibility to backdoors maliciously injected by adversaries. This vulnerability arises due to the intricate architecture and opacity of DNNs, resulting in numerous redundant neurons embedded within the models. Adversaries exploit these vulnerabilities to conceal malicious backdoor information within DNNs, thereby causing erroneous outputs and posing substantial threats to the efficacy of DNN-based applications. This article presents a comprehensive survey of backdoor attacks against DNNs and the countermeasure methods employed to mitigate them. Initially, we trace the evolution of the concept from traditional backdoor attacks to backdoor attacks against DNNs, highlighting the feasibility and practicality of generating backdoor attacks against DNNs. Subsequently, we provide an overview of notable works encompassing various attack and defense strategies, facilitating a comparative analysis of their approaches. Through these discussions, we offer constructive insights aimed at refining these techniques. Finally, we extend our research perspective to the domain of large language models (LLMs) and synthesize the characteristics and developmental trends of backdoor attacks and defense methods targeting LLMs. Through a systematic review of existing studies on backdoor vulnerabilities in LLMs, we identify critical open challenges in this field and propose actionable directions for future research. |
| format | Article |
| id | doaj-art-44e2b11516fa478ea800e3a71ca39bc4 |
| institution | Kabale University |
| issn | 2666-223X |
| language | English |
| publishDate | 2025-09-01 |
| publisher | KeAi Communications Co., Ltd. |
| record_format | Article |
| series | Journal of Electronic Science and Technology |
| spelling | doaj-art-44e2b11516fa478ea800e3a71ca39bc4; 2025-08-23T04:47:55Z; eng; KeAi Communications Co., Ltd.; Journal of Electronic Science and Technology; 2666-223X; 2025-09-01; 23; 3; 100326; 10.1016/j.jnlest.2025.100326; A survey of backdoor attacks and defences: From deep neural networks to large language models; Ling-Xin Jin (0), Wei Jiang (1), Xiang-Yu Wen (2), Mei-Yu Lin (3), Jin-Yu Zhan (4), Xing-Zhi Zhou (5), Maregu Assefa Habtie (6), Naoufel Werghi (7); (0) School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China; School of Computer Science, Khalifa University, Abu Dhabi, 127788, the United Arab Emirates; (1) School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China; Corresponding author; (2) School of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, 999077, China; (3) School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China; (4) School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China; (5) School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China; (6) School of Computer Science, Khalifa University, Abu Dhabi, 127788, the United Arab Emirates; (7) School of Computer Science, Khalifa University, Abu Dhabi, 127788, the United Arab Emirates; Deep neural networks (DNNs) have found extensive applications in safety-critical artificial intelligence systems, such as autonomous driving and facial recognition systems. However, recent research has revealed their susceptibility to backdoors maliciously injected by adversaries. This vulnerability arises due to the intricate architecture and opacity of DNNs, resulting in numerous redundant neurons embedded within the models. Adversaries exploit these vulnerabilities to conceal malicious backdoor information within DNNs, thereby causing erroneous outputs and posing substantial threats to the efficacy of DNN-based applications. This article presents a comprehensive survey of backdoor attacks against DNNs and the countermeasure methods employed to mitigate them. Initially, we trace the evolution of the concept from traditional backdoor attacks to backdoor attacks against DNNs, highlighting the feasibility and practicality of generating backdoor attacks against DNNs. Subsequently, we provide an overview of notable works encompassing various attack and defense strategies, facilitating a comparative analysis of their approaches. Through these discussions, we offer constructive insights aimed at refining these techniques. Finally, we extend our research perspective to the domain of large language models (LLMs) and synthesize the characteristics and developmental trends of backdoor attacks and defense methods targeting LLMs. Through a systematic review of existing studies on backdoor vulnerabilities in LLMs, we identify critical open challenges in this field and propose actionable directions for future research.; http://www.sciencedirect.com/science/article/pii/S1674862X25000278; Backdoor attacks; Backdoor defenses; Deep neural networks; Large language model |
| spellingShingle | Ling-Xin Jin; Wei Jiang; Xiang-Yu Wen; Mei-Yu Lin; Jin-Yu Zhan; Xing-Zhi Zhou; Maregu Assefa Habtie; Naoufel Werghi; A survey of backdoor attacks and defences: From deep neural networks to large language models; Journal of Electronic Science and Technology; Backdoor attacks; Backdoor defenses; Deep neural networks; Large language model |
| title | A survey of backdoor attacks and defences: From deep neural networks to large language models |
| title_sort | survey of backdoor attacks and defences from deep neural networks to large language models |
| topic | Backdoor attacks; Backdoor defenses; Deep neural networks; Large language model |
| url | http://www.sciencedirect.com/science/article/pii/S1674862X25000278 |