Efficient vision transformers with edge enhancement for robust small target detection in drone-based remote sensing
Small object detection in UAV remote sensing imagery faces significant challenges due to scale variations, background clutter, and real-time processing requirements. This study proposes a lightweight transformer-based detector, MLD-DETR, which enhances detection performance in complex scenarios thro...
| Main Authors: | Xuguang Zhu, Zhizhao Zhang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-07-01 |
| Series: | Frontiers in Remote Sensing |
| Subjects: | UAV; drone-based remote sensing; RT-DETR; small object detection; multi-scale edge enhancement |
| Online Access: | https://www.frontiersin.org/articles/10.3389/frsen.2025.1599099/full |
| _version_ | 1850096162670903296 |
|---|---|
| author | Xuguang Zhu; Zhizhao Zhang |
| author_facet | Xuguang Zhu; Zhizhao Zhang |
| author_sort | Xuguang Zhu |
| collection | DOAJ |
| description | Small object detection in UAV remote sensing imagery faces significant challenges due to scale variations, background clutter, and real-time processing requirements. This study proposes a lightweight transformer-based detector, MLD-DETR, which enhances detection performance in complex scenarios through multi-scale edge enhancement and hierarchical attention mechanisms. First, a Multi-Scale Edge Enhancement Fusion (MSEEF) module is designed, integrating adaptive pooling and edge-aware convolution to preserve target boundary details while enabling cross-scale feature interaction. Second, a Layered Attention Fusion (LAF) mechanism is developed, leveraging spatial depth-wise convolution and omnidirectional kernel feature fusion to improve hierarchical localization capability for densely occluded targets. Furthermore, a Dynamic Positional Encoding (DPE) module replaces traditional fixed positional embeddings, enhancing spatial perception accuracy under complex geometric perspectives through learnable spatial adapters. Combined with an Inner Generalized Intersection-over-Union (Inner-GIoU) loss function to optimize bounding box geometric consistency, MLD-DETR achieves 36.7% AP50 and 14.5% APs on the VisDrone2019 dataset, outperforming the baseline RT-DETR by 3.2% and 1.8%, respectively, while reducing parameters by 20% and maintaining computational efficiency suitable for UAV platforms equipped with modern edge computing hardware. Experimental results demonstrate the algorithm’s superior performance in UAV remote sensing applications such as crop disease monitoring and traffic congestion detection, offering an efficient solution for real-time edge-device deployment. |
| format | Article |
| id | doaj-art-708e7bc4ffdc4cd88f1d562eb032b108 |
| institution | DOAJ |
| issn | 2673-6187 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Remote Sensing |
| spelling | doaj-art-708e7bc4ffdc4cd88f1d562eb032b108; 2025-08-20T02:41:17Z; eng; Frontiers Media S.A.; Frontiers in Remote Sensing; 2673-6187; 2025-07-01; 6; 10.3389/frsen.2025.1599099; 1599099; Efficient vision transformers with edge enhancement for robust small target detection in drone-based remote sensing; Xuguang Zhu (College of Innovation and Practice, Liaoning Technical University, Fuxin, China); Zhizhao Zhang (School of Software, Liaoning Technical University, Huludao, China); https://www.frontiersin.org/articles/10.3389/frsen.2025.1599099/full; UAV; drone-based remote sensing; RT-DETR; small object detection; multi-scale edge enhancement |
| spellingShingle | Xuguang Zhu; Zhizhao Zhang; Efficient vision transformers with edge enhancement for robust small target detection in drone-based remote sensing; Frontiers in Remote Sensing; UAV; drone-based remote sensing; RT-DETR; small object detection; multi-scale edge enhancement |
| title | Efficient vision transformers with edge enhancement for robust small target detection in drone-based remote sensing |
| title_sort | efficient vision transformers with edge enhancement for robust small target detection in drone based remote sensing |
| topic | UAV; drone-based remote sensing; RT-DETR; small object detection; multi-scale edge enhancement |
| url | https://www.frontiersin.org/articles/10.3389/frsen.2025.1599099/full |
| work_keys_str_mv | AT xuguangzhu efficientvisiontransformerswithedgeenhancementforrobustsmalltargetdetectionindronebasedremotesensing AT zhizhaozhang efficientvisiontransformerswithedgeenhancementforrobustsmalltargetdetectionindronebasedremotesensing |
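
The description in this record names an Inner-GIoU loss but does not define it. As a rough illustration only, the Python sketch below assumes the commonly cited Inner-IoU formulation: each box gets an auxiliary box with the same center and with width and height scaled by a ratio, and the loss is the GIoU loss plus the difference between the plain IoU and the auxiliary-box IoU. The function names, the (x1, y1, x2, y2) box layout, and the default ratio of 0.75 are assumptions made for this example; none of them are taken from the record, and this is not confirmed to be the authors' implementation.

```python
def _inter_union(a, b):
    """Intersection and union areas of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter, area_a + area_b - inter


def iou(a, b):
    """Plain intersection-over-union."""
    inter, union = _inter_union(a, b)
    return inter / union if union > 0 else 0.0


def giou(a, b):
    """Generalized IoU: IoU minus the empty fraction of the enclosing box."""
    inter, union = _inter_union(a, b)
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    c_area = cw * ch
    if union <= 0 or c_area <= 0:
        return 0.0
    return inter / union - (c_area - union) / c_area


def scale_box(box, ratio):
    """Auxiliary box: same center, width and height multiplied by ratio."""
    cx, cy = (box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0
    hw = (box[2] - box[0]) * ratio / 2.0
    hh = (box[3] - box[1]) * ratio / 2.0
    return (cx - hw, cy - hh, cx + hw, cy + hh)


def inner_giou_loss(pred, target, ratio=0.75):
    """Assumed form: L = (1 - GIoU) + IoU - IoU_inner (ratio is hypothetical)."""
    inner = iou(scale_box(pred, ratio), scale_box(target, ratio))
    return (1.0 - giou(pred, target)) + iou(pred, target) - inner
```

With a ratio of 1.0 the auxiliary boxes coincide with the originals, so this sketch reduces to the ordinary GIoU loss. In the assumed formulation, a ratio below 1.0 makes the auxiliary overlap fall off faster as the boxes drift apart.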