A Generalized Framework for Adversarial Attack Detection and Prevention Using Grad-CAM and Clustering Techniques
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-01-01 |
| Series: | Systems |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2079-8954/13/2/88 |
| Summary: | Advances in AI-based computer vision have pushed the performance of modern image classification models beyond human perception, making them valuable in various fields. However, adversarial attacks, which make small changes to images that are hard for humans to perceive, can cause classification models to misclassify images. Given the wide availability of neural-network-based classification models, preventing adversarial attacks is crucial. Recent detection methods are effective only against specific attacks or cannot be applied across different models. In this paper, we therefore propose an attention mechanism-based method for detecting adversarial attacks. The framework combines an ensemble model with Grad-CAM and computes the silhouette coefficient for detection. We applied the method to ResNet18, MobileNetV2, and VGG16 classification models fine-tuned on the CIFAR-10 dataset. On average, MobileNetV2 achieved an F1-score of 0.9022 and an accuracy of 0.9103, ResNet18 achieved an F1-score of 0.9124 and an accuracy of 0.9302, and VGG16 achieved an F1-score of 0.9185 and an accuracy of 0.9252. The results show that our method not only detects adversarial attacks but also prevents them, mitigating their effects and effectively restoring labels. |
|---|---|
| ISSN: | 2079-8954 |
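
The summary names the silhouette coefficient as the detection statistic but does not say what is clustered or how a decision threshold is chosen. As a minimal sketch of the metric itself, the hypothetical helper `silhouette_coefficient` below computes the standard mean silhouette value for labelled points using NumPy only. Applying it to features derived from Grad-CAM heatmaps is an assumption made for illustration, not the authors' documented pipeline.

```python
import numpy as np


def silhouette_coefficient(points, labels):
    """Mean silhouette value over all points (standard definition).

    Hypothetical helper for illustration; the record does not describe
    how the paper computes or thresholds this value.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    # Pairwise Euclidean distances between all points.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    scores = np.zeros(len(points))
    for i in range(len(points)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # singleton cluster: s(i) = 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = dists[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(dists[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores[i] = 0.0 if denom == 0 else (b - a) / denom
    return float(scores.mean())


if __name__ == "__main__":
    # Toy 2-D points standing in for (assumed) heatmap-derived features.
    pts = [[0, 0], [0, 1], [10, 0], [10, 1]]
    print(silhouette_coefficient(pts, [0, 0, 1, 1]))  # well-separated clusters
    print(silhouette_coefficient(pts, [0, 1, 0, 1]))  # poorly matched labels
```

Values near 1 indicate compact, well-separated clusters, while values near 0 or below indicate overlapping or mismatched clusters. A detector could plausibly threshold this score, but any such threshold would have to come from the article itself.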