YOLO-SRMX: A Lightweight Model for Real-Time Object Detection on Unmanned Aerial Vehicles

Unmanned Aerial Vehicles (UAVs) face a significant challenge in balancing high accuracy and high efficiency when performing real-time object detection tasks, especially amidst intricate backgrounds, diverse target scales, and stringent onboard computational resource constraints. To tackle these diff...

Bibliographic Details
Main Authors: Shimin Weng, Han Wang, Jiashu Wang, Changming Xu, Ende Zhang
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Remote Sensing
Subjects: Unmanned Aerial Vehicle (UAV); object detection; lightweight model; YOLO; real-time detection
Online Access: https://www.mdpi.com/2072-4292/17/13/2313
_version_ 1849704614260113408
author Shimin Weng
Han Wang
Jiashu Wang
Changming Xu
Ende Zhang
author_facet Shimin Weng
Han Wang
Jiashu Wang
Changming Xu
Ende Zhang
author_sort Shimin Weng
collection DOAJ
description Unmanned Aerial Vehicles (UAVs) face a significant challenge in balancing high accuracy and high efficiency when performing real-time object detection tasks, especially amidst intricate backgrounds, diverse target scales, and stringent onboard computational resource constraints. To tackle these difficulties, this study introduces YOLO-SRMX, a lightweight real-time object detection framework specifically designed for infrared imagery captured by UAVs. Firstly, the model utilizes ShuffleNetV2 as an efficient lightweight backbone and integrates the novel Multi-Scale Dilated Attention (MSDA) module. This strategy not only facilitates a substantial 46.4% reduction in parameter volume but also, through the flexible adaptation of receptive fields, boosts the model’s robustness and precision in multi-scale object recognition tasks. Secondly, within the neck network, multi-scale feature extraction is facilitated through the design of novel composite convolutions, ConvX and MConv, based on a “split–differentiate–concatenate” paradigm. Furthermore, the lightweight GhostConv is incorporated to reduce model complexity. By synthesizing these principles, a novel composite receptive field lightweight convolution, DRFAConvP, is proposed to further optimize multi-scale feature fusion efficiency and promote model lightweighting. Finally, the Wise-IoU loss function is adopted to replace the traditional bounding box loss. This is coupled with a dynamic non-monotonic focusing mechanism formulated using the concept of outlier degrees. This mechanism intelligently assigns elevated gradient weights to anchor boxes of moderate quality by assessing their relative outlier degree, while concurrently diminishing the gradient contributions from both high-quality and low-quality anchor boxes. Consequently, this approach enhances the model’s localization accuracy for small targets in complex scenes. 
Experimental evaluations on the HIT-UAV dataset corroborate that YOLO-SRMX achieves an mAP50 of 82.8%, representing a 7.81% improvement over the baseline YOLOv8s model; an F1 score of 80%, marking a 3.9% increase; and a substantial 65.3% reduction in computational cost (GFLOPs). YOLO-SRMX demonstrates an exceptional trade-off between detection accuracy and operational efficiency, thereby underscoring its considerable potential for efficient and precise object detection on resource-constrained UAV platforms.
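The non-monotonic focusing mechanism summarized in the description can be illustrated with the gain commonly associated with the Wise-IoU v3 formulation, r = beta / (delta * alpha^(beta - delta)), where beta is the outlier degree of an anchor box (its IoU loss divided by a running mean of the IoU loss). This is a minimal sketch, not the authors' implementation: the record does not state the hyperparameters used in YOLO-SRMX, so alpha = 1.9 and delta = 3.0 are assumed values, and the function names are hypothetical.

```python
import math


def outlier_degree(iou_loss: float, running_mean_iou_loss: float) -> float:
    """Outlier degree beta: IoU loss of one anchor box relative to the running mean.

    A small beta indicates a high-quality box, a large beta a low-quality one.
    """
    return iou_loss / running_mean_iou_loss


def focusing_gain(beta: float, alpha: float = 1.9, delta: float = 3.0) -> float:
    """Non-monotonic gradient gain r = beta / (delta * alpha ** (beta - delta)).

    alpha and delta are assumed defaults, not values taken from this record.
    """
    return beta / (delta * alpha ** (beta - delta))


# The gain is maximal at beta = 1 / ln(alpha), i.e. for boxes of moderate quality.
peak_beta = 1.0 / math.log(1.9)

for beta in (0.1, peak_beta, 3.0, 6.0):
    print(f"beta={beta:.3f}  gain={focusing_gain(beta):.3f}")
```

Under these assumed values the gain rises from near zero for very small outlier degrees, peaks at a moderate outlier degree, and decays again for large ones, which matches the behaviour the description attributes to the mechanism: moderate-quality anchor boxes receive the largest gradient weight.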
format Article
id doaj-art-33564554fc9147a48e07589dc0b431cb
institution DOAJ
issn 2072-4292
language English
publishDate 2025-07-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj-art-33564554fc9147a48e07589dc0b431cb
2025-08-20T03:16:42Z
eng
MDPI AG
Remote Sensing
2072-4292
2025-07-01
17
13
2313
10.3390/rs17132313
YOLO-SRMX: A Lightweight Model for Real-Time Object Detection on Unmanned Aerial Vehicles
Shimin Weng (School of Computer and Communication Engineering, Northeastern University, Qinhuangdao 066004, China)
Han Wang (School of Computer and Communication Engineering, Northeastern University, Qinhuangdao 066004, China)
Jiashu Wang (School of Computer and Communication Engineering, Northeastern University, Qinhuangdao 066004, China)
Changming Xu (School of Computer and Communication Engineering, Northeastern University, Qinhuangdao 066004, China)
Ende Zhang (School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China)
Unmanned Aerial Vehicles (UAVs) face a significant challenge in balancing high accuracy and high efficiency when performing real-time object detection tasks, especially amidst intricate backgrounds, diverse target scales, and stringent onboard computational resource constraints. To tackle these difficulties, this study introduces YOLO-SRMX, a lightweight real-time object detection framework specifically designed for infrared imagery captured by UAVs. Firstly, the model utilizes ShuffleNetV2 as an efficient lightweight backbone and integrates the novel Multi-Scale Dilated Attention (MSDA) module. This strategy not only facilitates a substantial 46.4% reduction in parameter volume but also, through the flexible adaptation of receptive fields, boosts the model’s robustness and precision in multi-scale object recognition tasks. Secondly, within the neck network, multi-scale feature extraction is facilitated through the design of novel composite convolutions, ConvX and MConv, based on a “split–differentiate–concatenate” paradigm. Furthermore, the lightweight GhostConv is incorporated to reduce model complexity. By synthesizing these principles, a novel composite receptive field lightweight convolution, DRFAConvP, is proposed to further optimize multi-scale feature fusion efficiency and promote model lightweighting. Finally, the Wise-IoU loss function is adopted to replace the traditional bounding box loss. This is coupled with a dynamic non-monotonic focusing mechanism formulated using the concept of outlier degrees. This mechanism intelligently assigns elevated gradient weights to anchor boxes of moderate quality by assessing their relative outlier degree, while concurrently diminishing the gradient contributions from both high-quality and low-quality anchor boxes. Consequently, this approach enhances the model’s localization accuracy for small targets in complex scenes. Experimental evaluations on the HIT-UAV dataset corroborate that YOLO-SRMX achieves an mAP50 of 82.8%, representing a 7.81% improvement over the baseline YOLOv8s model; an F1 score of 80%, marking a 3.9% increase; and a substantial 65.3% reduction in computational cost (GFLOPs). YOLO-SRMX demonstrates an exceptional trade-off between detection accuracy and operational efficiency, thereby underscoring its considerable potential for efficient and precise object detection on resource-constrained UAV platforms.
https://www.mdpi.com/2072-4292/17/13/2313
Unmanned Aerial Vehicle (UAV)
object detection
lightweight model
YOLO
real-time detection
spellingShingle Shimin Weng
Han Wang
Jiashu Wang
Changming Xu
Ende Zhang
YOLO-SRMX: A Lightweight Model for Real-Time Object Detection on Unmanned Aerial Vehicles
Remote Sensing
Unmanned Aerial Vehicle (UAV)
object detection
lightweight model
YOLO
real-time detection
title YOLO-SRMX: A Lightweight Model for Real-Time Object Detection on Unmanned Aerial Vehicles
title_full YOLO-SRMX: A Lightweight Model for Real-Time Object Detection on Unmanned Aerial Vehicles
title_fullStr YOLO-SRMX: A Lightweight Model for Real-Time Object Detection on Unmanned Aerial Vehicles
title_full_unstemmed YOLO-SRMX: A Lightweight Model for Real-Time Object Detection on Unmanned Aerial Vehicles
title_short YOLO-SRMX: A Lightweight Model for Real-Time Object Detection on Unmanned Aerial Vehicles
title_sort yolo srmx a lightweight model for real time object detection on unmanned aerial vehicles
topic Unmanned Aerial Vehicle (UAV)
object detection
lightweight model
YOLO
real-time detection
url https://www.mdpi.com/2072-4292/17/13/2313
work_keys_str_mv AT shiminweng yolosrmxalightweightmodelforrealtimeobjectdetectiononunmannedaerialvehicles
AT hanwang yolosrmxalightweightmodelforrealtimeobjectdetectiononunmannedaerialvehicles
AT jiashuwang yolosrmxalightweightmodelforrealtimeobjectdetectiononunmannedaerialvehicles
AT changmingxu yolosrmxalightweightmodelforrealtimeobjectdetectiononunmannedaerialvehicles
AT endezhang yolosrmxalightweightmodelforrealtimeobjectdetectiononunmannedaerialvehicles