A New Hybrid ConvViT Model for Dangerous Farm Insect Detection

Bibliographic Details
Main Authors: Anil Utku, Mahmut Kaya, Yavuz Canbay
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/5/2518
Description
Summary: This study proposes a novel hybrid convolution and vision transformer model (ConvViT) designed to detect harmful insect species that adversely affect agricultural production and thereby threaten global food security. Using a dataset comprising images of 15 distinct insect species, the proposed approach combines the strengths of traditional convolutional neural networks (CNNs) with vision transformer (ViT) architectures. This integration aims to capture local morphological features effectively while analyzing global spatial relationships more comprehensively: the CNN structure excels at discerning fine morphological details of insects, and the ViT's self-attention mechanism enables a holistic evaluation of their overall configurations. Several data preprocessing steps were implemented to enhance the model's performance, including data augmentation techniques and strategies to ensure class balance. In addition, hyperparameter optimization contributed to more stable and robust model training. Experimental results indicate that the ConvViT model achieves a classification accuracy of 93.61%, outperforming commonly used benchmark architectures such as EfficientNetB0, DenseNet201, ResNet-50, VGG-16, and a standalone ViT. The hybrid approach improves accuracy and strengthens generalization, delivering steady performance during both training and testing, which increases its reliability for field applications. The findings highlight that the ConvViT model achieves high efficiency in pest detection by integrating local and global feature learning. Consequently, this scalable artificial intelligence solution can support sustainable agricultural practices by enabling the early and accurate identification of pests and reducing the need for intensive pesticide use.
ISSN: 2076-3417
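
The summary describes pairing convolutional feature extraction (local detail) with self-attention (global relationships). The sketch below illustrates that general idea only; it is not the authors' ConvViT architecture, which this record does not specify. It uses plain Python with fixed, hand-written kernels and identity query/key/value projections, all of which are simplifying assumptions (a trained model would learn these weights). The names `conv2d_valid`, `self_attention`, and `hybrid_features` are hypothetical.

```python
import math


def conv2d_valid(image, kernel):
    """Slide a k x k kernel over a 2D image (no padding, stride 1)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(k) for dj in range(k))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention.

    Assumption: Q, K and V are the tokens themselves (identity
    projections); real transformer layers learn separate projections.
    """
    dim = len(tokens[0])
    weights, attended = [], []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        # Each output token is a weighted mix of ALL tokens (global context).
        attended.append([
            sum(row[n] * tokens[n][c] for n in range(len(tokens)))
            for c in range(dim)
        ])
    return attended, weights


def hybrid_features(image, kernels):
    """Convolution for local features, then attention across positions."""
    maps = [conv2d_valid(image, kernel) for kernel in kernels]
    height, width = len(maps[0]), len(maps[0][0])
    # One token per spatial position; one channel per kernel.
    tokens = [[m[i][j] for m in maps]
              for i in range(height) for j in range(width)]
    attended, weights = self_attention(tokens)
    # Mean-pool the attended tokens into a single feature vector.
    pooled = [sum(t[c] for t in attended) / len(attended)
              for c in range(len(kernels))]
    return tokens, weights, pooled


# Toy 4x4 "image" with a vertical edge, plus two hand-written edge kernels.
image = [[0, 0, 1, 1] for _ in range(4)]
kernels = [
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],    # responds to vertical edges
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],    # responds to horizontal edges
]
tokens, weights, pooled = hybrid_features(image, kernels)
```

In a practical model, the pooled vector would feed a classification layer over the 15 insect classes, and the convolution and attention weights would be trained end to end; those parts are omitted here.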