A New Lightweight Hybrid Model for Pistachio Classification Using Transformers and EfficientNet
| Main Author: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/10990223/ |
| Summary: | In recent years, Vision Transformers (ViTs) have gained prominence as a highly effective method for image classification, often outperforming traditional Convolutional Neural Networks (CNNs). However, their relatively slow processing speed limits their practical use, particularly in real-time applications. Conversely, CNN-based transfer learning models provide faster inference but may struggle with classification accuracy on complex datasets. To address these challenges, Temporal Coordinate Attention (TCA) modules have been introduced to improve both efficiency and performance. This study proposes a hybrid architecture that combines EfficientNet, a Vision Transformer, and TCA modules, integrating the accuracy of ViTs, the computational efficiency of CNNs, and the enhancement capabilities of TCA. The model is designed to classify the Siirt and Kirmizi pistachio varieties. It achieves 99.07% accuracy, 99.12% recall, and a Cohen’s Kappa score of 98.10%. These results indicate that the model classifies both varieties reliably with minimal bias, making it well-suited for real-world applications. |
| ISSN: | 2169-3536 |