A survey of backdoor attacks and defences: From deep neural networks to large language models

Deep neural networks (DNNs) have found extensive applications in safety-critical artificial intelligence systems, such as autonomous driving and facial recognition systems. However, recent research has revealed their susceptibility to backdoors maliciously injected by adversaries. This vulnerability...

Bibliographic Details
Main Authors:	Ling-Xin Jin, Wei Jiang, Xiang-Yu Wen, Mei-Yu Lin, Jin-Yu Zhan, Xing-Zhi Zhou, Maregu Assefa Habtie, Naoufel Werghi
Format:	Article
Language:	English
Published:	KeAi Communications Co., Ltd. 2025-09-01
Series:	Journal of Electronic Science and Technology
Subjects:	Backdoor attacks; Backdoor defenses; Deep neural networks; Large language model
Online Access:	http://www.sciencedirect.com/science/article/pii/S1674862X25000278

_version_	1849228360532623360
author	Ling-Xin Jin; Wei Jiang; Xiang-Yu Wen; Mei-Yu Lin; Jin-Yu Zhan; Xing-Zhi Zhou; Maregu Assefa Habtie; Naoufel Werghi
author_facet	Ling-Xin Jin; Wei Jiang; Xiang-Yu Wen; Mei-Yu Lin; Jin-Yu Zhan; Xing-Zhi Zhou; Maregu Assefa Habtie; Naoufel Werghi
author_sort	Ling-Xin Jin
collection	DOAJ
description	Deep neural networks (DNNs) have found extensive applications in safety-critical artificial intelligence systems, such as autonomous driving and facial recognition systems. However, recent research has revealed their susceptibility to backdoors maliciously injected by adversaries. This vulnerability arises due to the intricate architecture and opacity of DNNs, resulting in numerous redundant neurons embedded within the models. Adversaries exploit these vulnerabilities to conceal malicious backdoor information within DNNs, thereby causing erroneous outputs and posing substantial threats to the efficacy of DNN-based applications. This article presents a comprehensive survey of backdoor attacks against DNNs and the countermeasure methods employed to mitigate them. Initially, we trace the evolution of the concept from traditional backdoor attacks to backdoor attacks against DNNs, highlighting the feasibility and practicality of generating backdoor attacks against DNNs. Subsequently, we provide an overview of notable works encompassing various attack and defense strategies, facilitating a comparative analysis of their approaches. Through these discussions, we offer constructive insights aimed at refining these techniques. Finally, we extend our research perspective to the domain of large language models (LLMs) and synthesize the characteristics and developmental trends of backdoor attacks and defense methods targeting LLMs. Through a systematic review of existing studies on backdoor vulnerabilities in LLMs, we identify critical open challenges in this field and propose actionable directions for future research.
format	Article
id	doaj-art-44e2b11516fa478ea800e3a71ca39bc4
institution	Kabale University
issn	2666-223X
language	English
publishDate	2025-09-01
publisher	KeAi Communications Co., Ltd.
record_format	Article
series	Journal of Electronic Science and Technology
spelling	doaj-art-44e2b11516fa478ea800e3a71ca39bc4; 2025-08-23T04:47:55Z; eng; KeAi Communications Co., Ltd.; Journal of Electronic Science and Technology; ISSN 2666-223X; 2025-09-01; vol. 23, no. 3, art. 100326; doi:10.1016/j.jnlest.2025.100326; A survey of backdoor attacks and defences: From deep neural networks to large language models
	[0] Ling-Xin Jin: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China; School of Computer Science, Khalifa University, Abu Dhabi, 127788, the United Arab Emirates
	[1] Wei Jiang: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China; Corresponding author.
	[2] Xiang-Yu Wen: School of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, 999077, China
	[3] Mei-Yu Lin: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China
	[4] Jin-Yu Zhan: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China
	[5] Xing-Zhi Zhou: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China
	[6] Maregu Assefa Habtie: School of Computer Science, Khalifa University, Abu Dhabi, 127788, the United Arab Emirates
	[7] Naoufel Werghi: School of Computer Science, Khalifa University, Abu Dhabi, 127788, the United Arab Emirates
title	A survey of backdoor attacks and defences: From deep neural networks to large language models
topic	Backdoor attacks; Backdoor defenses; Deep neural networks; Large language model
url	http://www.sciencedirect.com/science/article/pii/S1674862X25000278


Similar Items