Dual-Mode Method for Generating Adversarial Examples to Attack Deep Neural Networks

Bibliographic Details
Main Authors: Hyun Kwon, Sunghwan Kim
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10046665/
Description
Summary: Deep neural networks yield desirable performance in text, image, and speech classification. However, these networks are vulnerable to adversarial examples. An adversarial example is a sample generated by inserting a small amount of noise into an original sample (with minimal distortion) such that it is misrecognized by the targeted model. A typical adversarial-example attack must satisfy two conditions: the distortion of the original sample must be kept to a minimum, and misrecognition must be induced in the targeted deep neural network. Consequently, generating an adversarial example requires considerable time and numerous iterations, because both conditions must be satisfied during the generation process. However, there are cases in which it is desirable to generate an adversarial example that quickly induces misrecognition in the deep neural network without regard for the amount of distortion applied to the original sample. In this paper, we propose a dual-mode method for creating adversarial examples that allows the user to prioritize the malfunctioning of deep neural networks according to the situation. The proposed method generates an adversarial example in one of two modes: mode 1, which takes the level of distortion into account, and mode 0, which ignores distortion and can therefore generate examples rapidly. To evaluate the method experimentally, MNIST and CIFAR10 were used as datasets. The results show that for MNIST, the proposed method can generate a targeted or untargeted adversarial example with 50% fewer iterations in mode 0 than in mode 1. For CIFAR10, mode 0 reduces the number of iterations by 80% and 88% for targeted and untargeted adversarial examples, respectively, with an attack success rate of 100%.
ISSN: 2169-3536
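
Illustrative sketch (not taken from the paper): the summary above describes a dual-mode objective in which mode 1 penalizes distortion of the original sample while mode 0 ignores distortion so that misclassification is reached in fewer iterations. The minimal PyTorch sketch below shows one plausible way such a mode switch could be realized; the function name dual_mode_attack, the weight lambda_dist, the optimizer choice, and the early-stopping rule are assumptions made for illustration, not details of the authors' method.

```python
import torch
import torch.nn.functional as F

def dual_mode_attack(model, x, target, mode=1, steps=100, lr=0.01, lambda_dist=1.0):
    """Hypothetical dual-mode targeted attack (illustration only).

    mode=1: include an L2 distortion penalty (distortion-aware mode).
    mode=0: drop the distortion penalty so misclassification is reached faster.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([x_adv], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        logits = model(x_adv)
        # Adversarial term: push the prediction toward the target class.
        adv_loss = F.cross_entropy(logits, target)
        # Distortion term: active only in mode 1.
        dist_loss = lambda_dist * torch.norm(x_adv - x) if mode == 1 else 0.0
        (adv_loss + dist_loss).backward()
        optimizer.step()
        with torch.no_grad():
            x_adv.clamp_(0.0, 1.0)  # keep pixel values in the valid range
        if logits.argmax(dim=1).equal(target):
            break  # target class predicted; stop early
    return x_adv.detach()
```

With mode=0 the loop can stop as soon as the target class is predicted, regardless of how far x_adv has drifted from x, which is one way the iteration savings described in the summary could arise; with mode=1 the distortion penalty keeps x_adv close to the original sample at the cost of more iterations.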