A4FL: Federated Adversarial Defense via Adversarial Training and Pruning Against Backdoor Attack


Bibliographic Details
Main Authors: Saeed-Uz-Zaman, Bin Li, Muhammad Hamid, Muhammad Saleem, Mohammed Aman
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10992684/
author Saeed-Uz-Zaman
Bin Li
Muhammad Hamid
Muhammad Saleem
Mohammed Aman
author_facet Saeed-Uz-Zaman
Bin Li
Muhammad Hamid
Muhammad Saleem
Mohammed Aman
author_sort Saeed-Uz-Zaman
collection DOAJ
description Backdoor attacks threaten federated learning (FL) models: malicious participants embed hidden triggers into local models during training. When activated by specific inputs, these triggers cause targeted misclassification in the global model and can compromise critical applications such as autonomous systems. We propose a robust defense mechanism that combines statistical testing, model refinement, and adversarial training, with the goal of preserving the integrity of the global model and ensuring high reliability in real-world FL deployments, even against sophisticated adversarial strategies. During adversarial training, our defense incorporates “Messy” samples with obvious triggers and “wrap” samples with similar but nonidentical triggers; this dual approach improves the model’s ability to detect and resist hidden manipulations. We apply neuron pruning to remove compromised neurons, further refining the model architecture for improved security. Continuous statistical testing, including variance analysis and cosine-similarity checks, ensures that only legitimate and significant updates are integrated into the global model. A key innovation of our method is a significance-based filtering mechanism that identifies and excludes malicious updates, preventing backdoor triggers from affecting the global model. This iterative defense process adapts to evolving attack strategies, maintaining the model’s robustness.
Balancing defensive strategies, from adversarial training and sample diversification to model pruning, provides a dependable framework for safeguarding FL models where integrity and security are critical. Experimental results demonstrate that our defense significantly improves FL models’ resistance to sophisticated backdoor attacks while maintaining high accuracy and reliability in real-world deployments.
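The cosine-similarity and variance screening of client updates described in the abstract can be sketched roughly as follows. This is an illustrative minimal sketch, not the authors' published procedure: the function `filter_updates`, the median reference direction, and the thresholds `cos_threshold` and `var_z` are all assumptions introduced here for clarity.

```python
import numpy as np

def filter_updates(global_weights, client_updates, cos_threshold=0.0, var_z=2.5):
    """Screen client updates by direction and magnitude before aggregation.

    Hypothetical sketch of cosine-similarity and variance checks of the kind
    the paper describes; thresholds and statistics are assumptions.
    """
    # Flatten each client's update for geometric comparison.
    flat = [np.ravel(u) for u in client_updates]

    # Reference direction: coordinate-wise median of the updates, which a
    # small minority of malicious clients cannot easily shift.
    ref = np.median(flat, axis=0)
    ref = ref / (np.linalg.norm(ref) + 1e-12)

    # Cosine similarity of each update with the reference direction.
    cos = np.array([f @ ref / (np.linalg.norm(f) + 1e-12) for f in flat])

    # Variance screen: flag updates whose norm is a statistical outlier.
    norms = np.array([np.linalg.norm(f) for f in flat])
    z = np.abs(norms - norms.mean()) / (norms.std() + 1e-12)

    kept = [i for i in range(len(flat))
            if cos[i] > cos_threshold and z[i] < var_z]
    if not kept:  # degenerate round: keep everything rather than stall
        kept = list(range(len(flat)))

    # Plain FedAvg over the surviving updates only.
    aggregate = np.mean([client_updates[i] for i in kept], axis=0)
    return global_weights + aggregate, kept
```

In this sketch the cosine screen, anchored on a median direction, rejects a sign-flipped or off-direction malicious update, while the variance screen additionally catches abnormally large but well-aligned updates.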
format Article
id doaj-art-1286dc8a4a21443387294bbea1069968
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-1286dc8a4a21443387294bbea10699682025-08-20T03:24:56ZengIEEEIEEE Access2169-35362025-01-0113910709108810.1109/ACCESS.2025.356827510992684A4FL: Federated Adversarial Defense via Adversarial Training and Pruning Against Backdoor Attack Saeed-Uz-Zaman0https://orcid.org/0009-0007-9045-7571Bin Li1Muhammad Hamid2https://orcid.org/0000-0002-2440-6596Muhammad Saleem3https://orcid.org/0000-0001-7283-0036Mohammed Aman4https://orcid.org/0000-0002-9064-9596School of Information Engineering, Yangzhou University, Yangzhou, ChinaSchool of Information Engineering, Yangzhou University, Yangzhou, ChinaDepartment of Computer Science, Government College Women University, Sialkot, PakistanDepartment of Industrial Engineering, Faculty of Engineering at Rabigh, King Abdulaziz University, Jeddah, Saudi ArabiaDepartment of Industrial Engineering, College of Engineering, University of Business and Technology, Jeddah, Saudi ArabiaBackdoor attacks threaten federated learning (FL) models, where malicious participants embed hidden triggers into local models during training. These triggers can compromise crucial applications, such as autonomous systems, when they activate specific inputs, causing a targeted misclassification in the global model. We recommend a strong defense mechanism that combines statistical testing, model refinement, and adversarial training methods. The primary goal is to develop a robust defense against backdoor attacks in federated learning (FL), where malicious participants embed hidden triggers into local models. This defense aims to preserve the integrity of the global model and ensure high reliability in real-world FL deployments, even when facing sophisticated adversarial strategies. Our defense strategy incorporates “Messy” samples with obvious triggers and “wrap” samples with similar but nonidentical triggers during adversarial training. This dual approach enhances the model’s ability to detect and resist hidden manipulations. 
We facilitate applying neuron pruning to remove compromised neurons, further refining the model architecture for improved security. Continuous statistical testing, including variance analysis and cosine similarity checks, ensures that only legitimate and significant updates are integrated into the global model. A key innovation of our method is a significance-based filtering mechanism that effectively identifies and excludes malicious updates, preventing backdoor triggers from affecting the global model. This iterative defense process adapts to attack strategies, maintaining the model’s robustness. Empirical results confirm that this defense mechanism significantly improves FL models’ resilience to sophisticated backdoor attacks while preserving high accuracy and reliability. Balancing defensive strategies from adversarial training and sample diversification to model pruning provides a dependable framework for safeguarding FL models where integrity and security are critical. Experimental results demonstrate that our defense mechanism significantly enhances FL models’ resistance to sophisticated backdoor attacks while maintaining high accuracy and reliability in real-world deployments. These solutions ensure the potential significance of balanced defense solutions, which offer strong protection against adversarial backdoor assaults. This framework provides a dependable solution for securing FL models in environments where integrity and security are paramount.https://ieeexplore.ieee.org/document/10992684/Backdoor attackadversarial traininguniversal adversarial perturbations (UAPs)backdoor defense
spellingShingle Saeed-Uz-Zaman
Bin Li
Muhammad Hamid
Muhammad Saleem
Mohammed Aman
A4FL: Federated Adversarial Defense via Adversarial Training and Pruning Against Backdoor Attack
IEEE Access
Backdoor attack
adversarial training
universal adversarial perturbations (UAPs)
backdoor defense
title A4FL: Federated Adversarial Defense via Adversarial Training and Pruning Against Backdoor Attack
title_full A4FL: Federated Adversarial Defense via Adversarial Training and Pruning Against Backdoor Attack
title_fullStr A4FL: Federated Adversarial Defense via Adversarial Training and Pruning Against Backdoor Attack
title_full_unstemmed A4FL: Federated Adversarial Defense via Adversarial Training and Pruning Against Backdoor Attack
title_short A4FL: Federated Adversarial Defense via Adversarial Training and Pruning Against Backdoor Attack
title_sort a4fl federated adversarial defense via adversarial training and pruning against backdoor attack
topic Backdoor attack
adversarial training
universal adversarial perturbations (UAPs)
backdoor defense
url https://ieeexplore.ieee.org/document/10992684/
work_keys_str_mv AT saeeduzzaman a4flfederatedadversarialdefenseviaadversarialtrainingandpruningagainstbackdoorattack
AT binli a4flfederatedadversarialdefenseviaadversarialtrainingandpruningagainstbackdoorattack
AT muhammadhamid a4flfederatedadversarialdefenseviaadversarialtrainingandpruningagainstbackdoorattack
AT muhammadsaleem a4flfederatedadversarialdefenseviaadversarialtrainingandpruningagainstbackdoorattack
AT mohammedaman a4flfederatedadversarialdefenseviaadversarialtrainingandpruningagainstbackdoorattack