YOLO-SRMX: A Lightweight Model for Real-Time Object Detection on Unmanned Aerial Vehicles
Unmanned Aerial Vehicles (UAVs) face a significant challenge in balancing high accuracy and high efficiency when performing real-time object detection tasks, especially amidst intricate backgrounds, diverse target scales, and stringent onboard computational resource constraints. To tackle these diff...
| Main Authors: | Shimin Weng, Han Wang, Jiashu Wang, Changming Xu, Ende Zhang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Remote Sensing |
| Subjects: | Unmanned Aerial Vehicle (UAV); object detection; lightweight model; YOLO; real-time detection |
| Online Access: | https://www.mdpi.com/2072-4292/17/13/2313 |
| _version_ | 1849704614260113408 |
|---|---|
| author | Shimin Weng Han Wang Jiashu Wang Changming Xu Ende Zhang |
| author_facet | Shimin Weng Han Wang Jiashu Wang Changming Xu Ende Zhang |
| author_sort | Shimin Weng |
| collection | DOAJ |
| description | Unmanned Aerial Vehicles (UAVs) face a significant challenge in balancing high accuracy and high efficiency when performing real-time object detection tasks, especially amidst intricate backgrounds, diverse target scales, and stringent onboard computational resource constraints. To tackle these difficulties, this study introduces YOLO-SRMX, a lightweight real-time object detection framework specifically designed for infrared imagery captured by UAVs. Firstly, the model utilizes ShuffleNetV2 as an efficient lightweight backbone and integrates the novel Multi-Scale Dilated Attention (MSDA) module. This strategy not only facilitates a substantial 46.4% reduction in parameter volume but also, through the flexible adaptation of receptive fields, boosts the model’s robustness and precision in multi-scale object recognition tasks. Secondly, within the neck network, multi-scale feature extraction is facilitated through the design of novel composite convolutions, ConvX and MConv, based on a “split–differentiate–concatenate” paradigm. Furthermore, the lightweight GhostConv is incorporated to reduce model complexity. By synthesizing these principles, a novel composite receptive field lightweight convolution, DRFAConvP, is proposed to further optimize multi-scale feature fusion efficiency and promote model lightweighting. Finally, the Wise-IoU loss function is adopted to replace the traditional bounding box loss. This is coupled with a dynamic non-monotonic focusing mechanism formulated using the concept of outlier degrees. This mechanism intelligently assigns elevated gradient weights to anchor boxes of moderate quality by assessing their relative outlier degree, while concurrently diminishing the gradient contributions from both high-quality and low-quality anchor boxes. Consequently, this approach enhances the model’s localization accuracy for small targets in complex scenes. Experimental evaluations on the HIT-UAV dataset corroborate that YOLO-SRMX achieves an mAP50 of 82.8%, representing a 7.81% improvement over the baseline YOLOv8s model; an F1 score of 80%, marking a 3.9% increase; and a substantial 65.3% reduction in computational cost (GFLOPs). YOLO-SRMX demonstrates an exceptional trade-off between detection accuracy and operational efficiency, thereby underscoring its considerable potential for efficient and precise object detection on resource-constrained UAV platforms. |
| format | Article |
| id | doaj-art-33564554fc9147a48e07589dc0b431cb |
| institution | DOAJ |
| issn | 2072-4292 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Remote Sensing |
| spelling | doaj-art-33564554fc9147a48e07589dc0b431cb; 2025-08-20T03:16:42Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2025-07-01; 17(13):2313; DOI 10.3390/rs17132313; Shimin Weng, Han Wang, Jiashu Wang, Changming Xu (School of Computer and Communication Engineering, Northeastern University, Qinhuangdao 066004, China); Ende Zhang (School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China); https://www.mdpi.com/2072-4292/17/13/2313 |
| title | YOLO-SRMX: A Lightweight Model for Real-Time Object Detection on Unmanned Aerial Vehicles |
| title_sort | yolo srmx a lightweight model for real time object detection on unmanned aerial vehicles |
| topic | Unmanned Aerial Vehicle (UAV) object detection lightweight model YOLO real-time detection |
| url | https://www.mdpi.com/2072-4292/17/13/2313 |
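
The description mentions a dynamic non-monotonic focusing mechanism based on outlier degrees. As a rough illustration only, the sketch below assumes the gain formula commonly given for Wise-IoU, r = β / (δ · α^(β − δ)), where β is an anchor's IoU loss divided by a running mean of that loss. This record does not state which Wise-IoU variant or hyperparameters the article uses, so the function names and the defaults α = 1.9 and δ = 3 are assumptions, not the authors' implementation.

```python
import math


def outlier_degree(iou_loss: float, running_mean_loss: float) -> float:
    """Outlier degree beta: an anchor's IoU loss relative to the running mean.

    Small beta suggests a high-quality anchor, large beta a low-quality one.
    """
    if running_mean_loss <= 0:
        raise ValueError("running_mean_loss must be positive")
    return iou_loss / running_mean_loss


def focusing_gain(beta: float, alpha: float = 1.9, delta: float = 3.0) -> float:
    """Non-monotonic gain r = beta / (delta * alpha ** (beta - delta)).

    alpha and delta are assumed hyperparameters (hypothetical defaults here).
    The gain rises for small beta, peaks, then decays for large beta, so
    moderate-quality anchors receive the largest gradient weight.
    """
    if beta < 0:
        raise ValueError("beta must be non-negative")
    return beta / (delta * alpha ** (beta - delta))


def peak_outlier_degree(alpha: float = 1.9) -> float:
    """Beta at which the gain is largest: setting dr/dbeta = 0 gives 1 / ln(alpha)."""
    return 1.0 / math.log(alpha)


if __name__ == "__main__":
    # Compare a high-quality, a moderate, and a low-quality anchor.
    for beta in (0.1, peak_outlier_degree(), 6.0):
        print(f"beta={beta:.3f} gain={focusing_gain(beta):.3f}")
```

Under these assumed defaults the gain equals 1 when β = δ and is largest near β = 1 / ln(α), which matches the behaviour described in the abstract: anchors of moderate quality are emphasised while both very good and very poor anchors contribute less.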