Fortify the Guardian, Not the Treasure: Resilient Adversarial Detectors

Adaptive adversarial attacks, where adversaries tailor their strategies with full knowledge of defense mechanisms, pose significant challenges to the robustness of adversarial detectors. In this paper, we introduce RADAR (Robust Adversarial Detection via Adversarial Retraining), an approach designed...

Bibliographic Details
Main Authors: Raz Lapid, Almog Dubin, Moshe Sipper
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Mathematics
Subjects: robustness; adversarial attacks; adaptive adversarial attacks; deep learning
Online Access: https://www.mdpi.com/2227-7390/12/22/3451
_version_ 1850267171157966848
author Raz Lapid
Almog Dubin
Moshe Sipper
author_facet Raz Lapid
Almog Dubin
Moshe Sipper
author_sort Raz Lapid
collection DOAJ
description Adaptive adversarial attacks, where adversaries tailor their strategies with full knowledge of defense mechanisms, pose significant challenges to the robustness of adversarial detectors. In this paper, we introduce RADAR (Robust Adversarial Detection via Adversarial Retraining), an approach designed to fortify adversarial detectors against such adaptive attacks while preserving the classifier’s accuracy. RADAR employs adversarial training by incorporating adversarial examples—crafted to deceive both the classifier and the detector—into the training process. This dual optimization enables the detector to learn and adapt to sophisticated attack scenarios. Comprehensive experiments on CIFAR-10, SVHN, and ImageNet datasets demonstrate that RADAR substantially enhances the detector’s ability to accurately identify adaptive adversarial attacks without degrading classifier performance.
format Article
id doaj-art-ece622c446b94dd2a7f5dfe8007e4a88
institution OA Journals
issn 2227-7390
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj-art-ece622c446b94dd2a7f5dfe8007e4a88
2025-08-20T01:53:54Z
eng
MDPI AG
Mathematics
2227-7390
2024-11-01
12
22
3451
10.3390/math12223451
Fortify the Guardian, Not the Treasure: Resilient Adversarial Detectors
Raz Lapid (0)
Almog Dubin (1)
Moshe Sipper (2)
Department of Computer Science, Ben-Gurion University, Beer-Sheva 8410501, Israel
DeepKeep, Tel-Aviv 6701203, Israel
Department of Computer Science, Ben-Gurion University, Beer-Sheva 8410501, Israel
Adaptive adversarial attacks, where adversaries tailor their strategies with full knowledge of defense mechanisms, pose significant challenges to the robustness of adversarial detectors. In this paper, we introduce RADAR (Robust Adversarial Detection via Adversarial Retraining), an approach designed to fortify adversarial detectors against such adaptive attacks while preserving the classifier’s accuracy. RADAR employs adversarial training by incorporating adversarial examples, crafted to deceive both the classifier and the detector, into the training process. This dual optimization enables the detector to learn and adapt to sophisticated attack scenarios. Comprehensive experiments on CIFAR-10, SVHN, and ImageNet datasets demonstrate that RADAR substantially enhances the detector’s ability to accurately identify adaptive adversarial attacks without degrading classifier performance.
https://www.mdpi.com/2227-7390/12/22/3451
robustness
adversarial attacks
adaptive adversarial attacks
deep learning
spellingShingle Raz Lapid
Almog Dubin
Moshe Sipper
Fortify the Guardian, Not the Treasure: Resilient Adversarial Detectors
Mathematics
robustness
adversarial attacks
adaptive adversarial attacks
deep learning
title Fortify the Guardian, Not the Treasure: Resilient Adversarial Detectors
title_sort fortify the guardian not the treasure resilient adversarial detectors
topic robustness
adversarial attacks
adaptive adversarial attacks
deep learning
url https://www.mdpi.com/2227-7390/12/22/3451
work_keys_str_mv AT razlapid fortifytheguardiannotthetreasureresilientadversarialdetectors
AT almogdubin fortifytheguardiannotthetreasureresilientadversarialdetectors
AT moshesipper fortifytheguardiannotthetreasureresilientadversarialdetectors
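
The description field above says that RADAR retrains the detector on adversarial examples crafted to deceive both the classifier and the detector. The record gives no loss function, attack, or hyperparameters, so the following is only a minimal sketch of that general idea and not the authors' implementation. Everything in it is an assumption made for illustration: the linear classifier `W`, the logistic detector `(v, b)`, the summed attack objective with weight `lam`, the L-infinity projected sign-gradient attack, and all numeric values.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attack_objective(x, y, W, v, b, lam=1.0):
    """Quantity the adaptive attacker maximizes (assumed form, not from the paper)."""
    cls_loss = -np.log(softmax(W @ x)[y])    # classifier cross-entropy on the true label
    det_loss = -np.log(sigmoid(v @ x + b))   # large when the detector says "benign"
    return cls_loss + lam * det_loss

def adaptive_attack(x, y, W, v, b, eps=0.1, step=0.1, iters=10, lam=1.0):
    """Projected sign-gradient ascent on attack_objective within an L-inf ball."""
    x_adv = x.copy()
    for _ in range(iters):
        p = softmax(W @ x_adv)
        p[y] -= 1.0                           # d(cross-entropy)/d(logits)
        grad_cls = W.T @ p
        s = sigmoid(v @ x_adv + b)            # detector's P(adversarial)
        grad_det = -(1.0 - s) * v             # gradient of -log(s) w.r.t. the input
        x_adv = x_adv + step * np.sign(grad_cls + lam * grad_det)
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project back into the eps-ball
    return x_adv

def detector_update(x, label, v, b, lr=0.5):
    """One gradient step of binary cross-entropy on the detector only."""
    s = sigmoid(v @ x + b)
    return v - lr * (s - label) * x, b - lr * (s - label)

# Hypothetical toy setup: 3 input features, 2 classes.
W = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # classifier weights, never updated
v = np.array([-1.0, 1.0, 2.0])                     # detector weights
b = 0.0
x = np.array([1.0, 0.0, 0.0])                      # clean input
y = 0                                              # its true label

x_adv = adaptive_attack(x, y, W, v, b, eps=0.6)
v_new, b_new = detector_update(x_adv, 1.0, v, b)   # label 1.0 = "adversarial"
```

Alternating `adaptive_attack` against the current detector with `detector_update` (and, in any realistic setting, also updating on clean inputs labeled 0.0) gives the usual adversarial-training loop. Only `v` and `b` change in this sketch; `W` is left untouched, which mirrors the description's point that the classifier's accuracy is preserved.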