A lightweight real-time unified detection model for rice and wheat ears in complex agricultural environments


Saved in:
Bibliographic Details
Main Authors: Xiaojun Shen, Shuai Li, Fen Qiu, Lili Yao
Format: Article
Language:English
Published: Elsevier 2025-08-01
Series:Smart Agricultural Technology
Subjects:
Online Access:http://www.sciencedirect.com/science/article/pii/S2772375525002886
Description
Summary:Ear recognition and detection is a commonly used method for yield prediction in rice and wheat crops. In current agricultural technology research, rice and wheat ears are typically treated and modeled separately to improve detection accuracy. However, given the significant similarities in the phenotypic structures and physicochemical indicators of rice and wheat ears, a unified detection model can satisfy the modeling requirements of both crops. Therefore, in response to the lack of research on unified real-time detection of rice and wheat ears, this paper proposes Light-Y, a lightweight detection model suitable for complex environments. The model reconstructs the YOLOv5s network with the lightweight MobileNetV3 backbone combined with the dynamic detection head DyHead. Through multi-scale feature aggregation and attention mechanisms, the model effectively enhances its ability to capture dense targets in complex scenes while reducing computational redundancy. On this basis, rice and wheat ear data collected by smartphones and drones are used, together with transfer learning and a staged data-introduction strategy, to achieve efficient integration of multi-source data, significantly improving the model's generalization and adaptability in detecting rice and wheat ear targets. Finally, channel pruning is applied to remove inefficient channels, further reducing computational cost and optimizing resource allocation. Experimental results show that Light-Y reaches an mAP@0.5 of 91.9%, an improvement of 0.4%, with a weight file of 4.68 MB, a parameter count of 2.2 × 10⁶, and FLOPs of 4 × 10⁹. In terms of accuracy, efficiency, and resource consumption, Light-Y outperforms existing mainstream models (e.g., YOLOv8n, YOLO11n).
Further validation demonstrates that Light-Y achieves R² values of 0.96, 0.95, 0.95, and 0.94 on the smartphone-based wheat ear, smartphone-based rice ear, drone-based wheat ear, and drone-based rice ear datasets, respectively, showcasing excellent detection and counting performance.
ISSN:2772-3755