Enhancing Image Classification using Graph Attention Networks
| Main Author: | |
|---|---|
| Format: | Article |
| Language: | Arabic |
| Published: | University of Information Technology and Communications, 2025-08-01 |
| Series: | Iraqi Journal for Computers and Informatics |
| Subjects: | |
| Online Access: | https://ijci.uoitc.edu.iq/index.php/ijci/article/view/548 |
| Summary: | Strong image classification performance has led to wide application in areas such as healthcare, robotics, and multimedia platforms. The field has advanced through recent developments in both Vision Transformers (ViTs) and Graph Neural Networks (GNNs). This work proposes an image classification method that integrates ViTs with Graph Attention Networks (GATs) to improve results on difficult datasets. The hybrid architecture captures complex relationships within visual data because ViTs provide powerful global feature extraction while GATs model patch-level dependencies. The attention mechanism built into GATs dynamically prioritizes image regions, which supports both accurate recognition and better interpretability. Experiments on the benchmark datasets CIFAR-10, CIFAR-100, ImageNet, Fashion-MNIST, and SVHN show that ViT + GAT outperforms the state-of-the-art architectures Swin Transformer and ConvNeXt. The proposed method delivers marked improvements across classification metrics, including accuracy and robustness to noise and adversarial perturbations. Precision, recall, F1-score, and AUC-ROC results indicate model reliability and generalization across tasks. By combining ViT feature extraction with the relational modeling of GATs, the method is a promising solution for complex visual recognition problems at multiple scales. |
|---|---|
| ISSN: | 2313-190X 2520-4912 |
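
The summary describes GAT attention operating over ViT patch features, and that idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record does not say how the patch graph is built, so the sketch assumes a 4-neighbour grid graph with self-loops, a single attention head in the standard GAT formulation, and random vectors standing in for ViT patch embeddings. The names `grid_adjacency` and `gat_layer` are hypothetical.

```python
import numpy as np


def grid_adjacency(h, w):
    """4-neighbour adjacency with self-loops for an h x w patch grid (assumed graph)."""
    n = h * w
    adj = np.eye(n, dtype=bool)
    for r in range(h):
        for c in range(w):
            i = r * w + c
            if c + 1 < w:  # link to the patch on the right
                adj[i, i + 1] = adj[i + 1, i] = True
            if r + 1 < h:  # link to the patch below
                adj[i, i + w] = adj[i + w, i] = True
    return adj


def gat_layer(x, adj, W, a, slope=0.2):
    """Single-head graph attention layer.

    x: (n, d_in) node features, W: (d_in, d_out) projection,
    a: (2 * d_out,) attention vector. Returns (output, attention weights).
    """
    h = x @ W                                   # projected node features
    d_out = h.shape[1]
    src = h @ a[:d_out]                         # contribution of node i
    dst = h @ a[d_out:]                         # contribution of neighbour j
    e = src[:, None] + dst[None, :]             # e[i, j] = a^T [h_i || h_j]
    e = np.where(e > 0, e, slope * e)           # LeakyReLU
    e = np.where(adj, e, -np.inf)               # attend only to graph neighbours
    e = e - e.max(axis=1, keepdims=True)        # numerical stability
    alpha = np.exp(e)
    alpha /= alpha.sum(axis=1, keepdims=True)   # softmax over each neighbourhood
    return alpha @ h, alpha


# Stand-in for ViT patch embeddings: a 4 x 4 grid of 8-dimensional vectors.
rng = np.random.default_rng(0)
patches = rng.normal(size=(16, 8))
adj = grid_adjacency(4, 4)
W = rng.normal(size=(8, 4))
a = rng.normal(size=(8,))

out, alpha = gat_layer(patches, adj, W, a)
pooled = out.mean(axis=0)  # image-level feature that a classifier head could consume
```

Each row of `alpha` shows how strongly a patch attends to its neighbours, which is the kind of region prioritization the summary credits for interpretability. A full model would take `patches` from a trained ViT encoder and learn `W` and `a` by backpropagation.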