You Only Attack Once: Single-Step DeepFool Algorithm

Adversarial attacks expose the latent vulnerabilities within artificial intelligence systems, necessitating a reassessment and enhancement of model robustness to ensure the reliability and security of deep learning models against malicious attacks. We propose a fast method designed to efficiently fi...

Full description

Saved in:
Bibliographic Details
Main Authors: Jun Li, Yanwei Xu, Yaocun Hu, Yongyong Ma, Xin Yin
Format: Article
Language:English
Published: MDPI AG 2024-12-01
Series:Applied Sciences
Subjects:
Online Access:https://www.mdpi.com/2076-3417/15/1/302
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Adversarial attacks expose the latent vulnerabilities within artificial intelligence systems, necessitating a reassessment and enhancement of model robustness to ensure the reliability and security of deep learning models against malicious attacks. We propose a fast method designed to efficiently find sample points close to the decision boundary. By computing the gradient information of each class in the input samples and comparing these gradient differences with the true class, we can identify the target class most sensitive to the decision boundary, thus generating adversarial examples. This technique is referred to as the “You Only Attack Once” (YOAO) algorithm. Compared to the DeepFool algorithm, this method requires only a single iteration to achieve effective attack results. The experimental results demonstrate that the proposed algorithm outperforms the original approach in various scenarios, especially in resource-constrained environments. Under a single iteration, it achieves a 70.6% higher success rate of the attacks compared to the DeepFool algorithm. Our proposed method shows promise for widespread application in both offensive and defensive strategies for diverse deep learning models. We investigated the relationship between classifier accuracy and adversarial attack success rate, comparing the algorithm with others. Our experiments validated that the proposed algorithm exhibits higher attack success rates and efficiency. Furthermore, we performed data visualization on the ImageNet dataset, demonstrating that the proposed algorithm focuses more on attacking important features. Finally, we discussed the existing issues with the algorithm and outlined future research directions. Our code will be made public upon acceptance of the paper.
ISSN:2076-3417