LCFANet: A Novel Lightweight Cross-Level Feature Aggregation Network for Small Agricultural Pest Detection
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-05-01 |
| Series: | Agronomy |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2073-4395/15/5/1168 |
| Summary: | In agricultural pest detection, the small size of pests poses a critical hurdle to detection accuracy. To mitigate this concern, we propose a Lightweight Cross-Level Feature Aggregation Network (LCFANet), which comprises three key components: a deep feature extraction network, a deep feature fusion network, and a multi-scale object detection head. Within the feature extraction and fusion networks, we introduce the Dual Temporal Feature Aggregation C3k2 (DTFA-C3k2) module, leveraging a spatiotemporal fusion mechanism to integrate multi-receptive field features while preserving fine-grained texture and structural details across scales. This significantly improves detection performance for objects with large scale variations. Additionally, we propose the Aggregated Downsampling Convolution (ADown-Conv) module, a dual-path compression unit that enhances feature representation while efficiently reducing spatial dimensions. For feature fusion, we design a Cross-Level Hierarchical Feature Pyramid (CLHFP), which employs bidirectional integration: backward pyramid construction for deep-to-shallow fusion and forward pyramid construction for feature refinement. The detection head incorporates a Multi-Scale Adaptive Spatial Fusion (MSASF) module, adaptively fusing features at specific scales to improve accuracy for varying-sized objects. Furthermore, we introduce the MPDINIoU loss function, combining InnerIoU and MPDIoU to optimize bounding box regression. The LCFANet-n model has 2.78 M parameters and a computational cost of 6.7 GFLOPs, enabling lightweight deployment. Extensive experiments on the public dataset demonstrate that the LCFANet-n model achieves a precision of 71.7%, recall of 68.5%, mAP50 of 70.4%, and mAP50-95 of 45.1%, reaching state-of-the-art (SOTA) performance in small-sized pest detection while maintaining a lightweight architecture. |
|---|---|
| ISSN: | 2073-4395 |
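
The summary states that the MPDINIoU loss combines InnerIoU and MPDIoU but does not give the formula. The sketch below is one plausible reading, not the paper's confirmed definition. It assumes the commonly described forms of the two ingredients: Inner-IoU computes IoU on auxiliary boxes shrunk about their centers by a ratio, and MPDIoU penalizes the squared distances between matching top-left and bottom-right corners, normalized by the squared image diagonal. The function names, the `ratio=0.75` default, and the way the two terms are added are all illustrative assumptions.

```python
def iou(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def scale_box(box, ratio):
    """Auxiliary box: same center, width and height multiplied by `ratio`."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    hw, hh = (box[2] - box[0]) * ratio / 2, (box[3] - box[1]) * ratio / 2
    return (cx - hw, cy - hh, cx + hw, cy + hh)


def mpd_inner_iou_loss(pred, target, img_w, img_h, ratio=0.75):
    """Hypothetical combination (an assumption, not taken from the paper):
    1 - IoU_inner + d1^2 / (w^2 + h^2) + d2^2 / (w^2 + h^2).
    """
    # Inner-IoU term: IoU of the scaled auxiliary boxes.
    inner = iou(scale_box(pred, ratio), scale_box(target, ratio))
    # MPDIoU-style penalty: squared corner distances over squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    d1 = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2  # top-left corners
    d2 = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2  # bottom-right corners
    return 1.0 - inner + d1 / norm + d2 / norm
```

Under these assumptions the loss is zero for a perfect match and keeps growing with corner distance even when the boxes no longer overlap, which is the property that makes corner-distance penalties attractive for small objects.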