A survey of backdoor attacks and defences: From deep neural networks to large language models

Deep neural networks (DNNs) have found extensive applications in safety-critical artificial intelligence systems, such as autonomous driving and facial recognition systems. However, recent research has revealed their susceptibility to backdoors maliciously injected by adversaries. This vulnerability...

Bibliographic Details
Main Authors: Ling-Xin Jin, Wei Jiang, Xiang-Yu Wen, Mei-Yu Lin, Jin-Yu Zhan, Xing-Zhi Zhou, Maregu Assefa Habtie, Naoufel Werghi
Format: Article
Language: English
Published: KeAi Communications Co., Ltd. 2025-09-01
Series: Journal of Electronic Science and Technology
Subjects: Backdoor attacks; Backdoor defenses; Deep neural networks; Large language model
Online Access: http://www.sciencedirect.com/science/article/pii/S1674862X25000278
author Ling-Xin Jin
Wei Jiang
Xiang-Yu Wen
Mei-Yu Lin
Jin-Yu Zhan
Xing-Zhi Zhou
Maregu Assefa Habtie
Naoufel Werghi
collection DOAJ
description Deep neural networks (DNNs) have found extensive applications in safety-critical artificial intelligence systems, such as autonomous driving and facial recognition systems. However, recent research has revealed their susceptibility to backdoors maliciously injected by adversaries. This vulnerability arises due to the intricate architecture and opacity of DNNs, resulting in numerous redundant neurons embedded within the models. Adversaries exploit these vulnerabilities to conceal malicious backdoor information within DNNs, thereby causing erroneous outputs and posing substantial threats to the efficacy of DNN-based applications. This article presents a comprehensive survey of backdoor attacks against DNNs and the countermeasure methods employed to mitigate them. Initially, we trace the evolution of the concept from traditional backdoor attacks to backdoor attacks against DNNs, highlighting the feasibility and practicality of generating backdoor attacks against DNNs. Subsequently, we provide an overview of notable works encompassing various attack and defense strategies, facilitating a comparative analysis of their approaches. Through these discussions, we offer constructive insights aimed at refining these techniques. Finally, we extend our research perspective to the domain of large language models (LLMs) and synthesize the characteristics and developmental trends of backdoor attacks and defense methods targeting LLMs. Through a systematic review of existing studies on backdoor vulnerabilities in LLMs, we identify critical open challenges in this field and propose actionable directions for future research.
format Article
id doaj-art-44e2b11516fa478ea800e3a71ca39bc4
institution Kabale University
issn 2666-223X
language English
publishDate 2025-09-01
publisher KeAi Communications Co., Ltd.
record_format Article
series Journal of Electronic Science and Technology
spelling doaj-art-44e2b11516fa478ea800e3a71ca39bc4
2025-08-23T04:47:55Z
eng
KeAi Communications Co., Ltd.
Journal of Electronic Science and Technology
2666-223X
2025-09-01
Volume 23, Issue 3, Article 100326
DOI: 10.1016/j.jnlest.2025.100326
A survey of backdoor attacks and defences: From deep neural networks to large language models
Ling-Xin Jin [0]: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China; School of Computer Science, Khalifa University, Abu Dhabi, 127788, the United Arab Emirates
Wei Jiang [1]: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China; Corresponding author.
Xiang-Yu Wen [2]: School of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, 999077, China
Mei-Yu Lin [3]: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China
Jin-Yu Zhan [4]: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China
Xing-Zhi Zhou [5]: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China
Maregu Assefa Habtie [6]: School of Computer Science, Khalifa University, Abu Dhabi, 127788, the United Arab Emirates
Naoufel Werghi [7]: School of Computer Science, Khalifa University, Abu Dhabi, 127788, the United Arab Emirates
title A survey of backdoor attacks and defences: From deep neural networks to large language models
topic Backdoor attacks
Backdoor defenses
Deep neural networks
Large language model
url http://www.sciencedirect.com/science/article/pii/S1674862X25000278