LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion
Abstract: In aerial target detection tasks, balancing high detection accuracy against low computational cost remains a key challenge. To address this challenge, this paper introduces the lightweight multiscale feature fusion and attention-YOLO (LMSFA-YOLO), which is lightwe...
| Main Authors: | Yuanbo Chu, Jiahao Wang, Longhui Ma, Chenxing Wu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-06-01 |
| Series: | Journal of King Saud University: Computer and Information Sciences |
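| Subjects: | Tiny target detection; Lightweight network; Attention mechanism; Multiscale convolution; High resolution shallow feature |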
| Online Access: | https://doi.org/10.1007/s44443-025-00069-4 |
| _version_ | 1849331645762502656 |
|---|---|
| author | Yuanbo Chu Jiahao Wang Longhui Ma Chenxing Wu |
| author_facet | Yuanbo Chu Jiahao Wang Longhui Ma Chenxing Wu |
| author_sort | Yuanbo Chu |
| collection | DOAJ |
| description | Abstract: In aerial target detection tasks, balancing high detection accuracy against low computational cost remains a key challenge. To address this challenge, this paper introduces the lightweight multiscale feature fusion and attention-YOLO (LMSFA-YOLO), a lightweight and accurate network for recognizing tiny targets in aerial images. First, we propose a lightweight multiscale convolution (LMSConv) and a lightweight multiscale cross-stage partial (LMSCSP) module. These modules lower the cost of convolution and strengthen multiscale information extraction, significantly reducing computation and parameter count while improving feature representation and fusion without sacrificing accuracy. Next, mixed local channel attention (MLCA) is used to build an effective mixed channel attention spatial pyramid pooling (EMCASPP), which integrates local and channel space information simultaneously to strengthen the model's feature fusion. To further improve the precision of feature extraction and preserve fine detail, a high-resolution shallow feature layer is added. Finally, to improve the accuracy of bounding box regression, we replace the original IoU with ShapeIoU, which emphasizes the scale and shape of the bounding box. Experimental results show that LMSFA-YOLO uses 35.8% fewer parameters than the baseline YOLOv5s while improving the F1-score and mAP by 5.0% and 7.0%, respectively, on the VisDrone dataset and by 6.1% and 7.6% on the AI-TOD dataset. It also runs in real time (> 30 FPS) on the Jetson Orin Nano. These results show that LMSFA-YOLO achieves high detection performance at reduced computational cost, making it well suited for deployment on edge devices with limited computational resources. |
| format | Article |
| id | doaj-art-78a385800baa413195d7473ceea6d91e |
| institution | Kabale University |
| issn | 1319-1578 2213-1248 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | Springer |
| record_format | Article |
| series | Journal of King Saud University: Computer and Information Sciences |
| spelling | doaj-art-78a385800baa413195d7473ceea6d91e; 2025-08-20T03:46:28Z; eng; Springer; Journal of King Saud University: Computer and Information Sciences; 1319-1578; 2213-1248; 2025-06-01; 374121; 10.1007/s44443-025-00069-4; LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion; Yuanbo Chu (School of Opto-Electronical Engineering, Xi’an Technological University); Jiahao Wang (School of Opto-Electronical Engineering, Xi’an Technological University); Longhui Ma (School of Opto-Electronical Engineering, Xi’an Technological University); Chenxing Wu (School of Defence Science and Technology, Xi’an Technological University); https://doi.org/10.1007/s44443-025-00069-4; Tiny target detection; Lightweight network; Attention mechanism; Multiscale convolution; High resolution shallow feature |
| spellingShingle | Yuanbo Chu Jiahao Wang Longhui Ma Chenxing Wu LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion Journal of King Saud University: Computer and Information Sciences Tiny target detection Lightweight network Attention mechanism Multiscale convolution High resolution shallow feature |
| title | LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion |
| title_full | LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion |
| title_fullStr | LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion |
| title_full_unstemmed | LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion |
| title_short | LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion |
| title_sort | lmsfa yolo a lightweight target detection network in remote sensing images based on multiscale feature fusion |
| topic | Tiny target detection Lightweight network Attention mechanism Multiscale convolution High resolution shallow feature |
| url | https://doi.org/10.1007/s44443-025-00069-4 |
| work_keys_str_mv | AT yuanbochu lmsfayoloalightweighttargetdetectionnetworkinremotesensingimagesbasedonmultiscalefeaturefusion AT jiahaowang lmsfayoloalightweighttargetdetectionnetworkinremotesensingimagesbasedonmultiscalefeaturefusion AT longhuima lmsfayoloalightweighttargetdetectionnetworkinremotesensingimagesbasedonmultiscalefeaturefusion AT chenxingwu lmsfayoloalightweighttargetdetectionnetworkinremotesensingimagesbasedonmultiscalefeaturefusion |
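
Note on the IoU term used in the description: the abstract states that ShapeIoU replaces the original IoU for bounding box regression. The record does not give either formula, so the following is only a minimal sketch of the plain intersection-over-union baseline, not of ShapeIoU and not the authors' code. The function name `iou` and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration.

```python
from typing import Tuple

# Assumed box format (not taken from the record): (x1, y1, x2, y2) corners,
# with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Plain intersection over union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])

    # Clamp at zero so disjoint boxes contribute no intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Degenerate (zero-area) inputs are treated as having no overlap.
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # The same one-pixel shift applied to a small and a large box.
    small = iou((0, 0, 4, 4), (1, 0, 5, 4))
    large = iou((0, 0, 40, 40), (1, 0, 41, 40))
    print(f"small box IoU: {small:.3f}, large box IoU: {large:.3f}")
```

In this sketch a one-pixel shift costs the 4 x 4 box far more IoU than the 40 x 40 box, which illustrates in general terms why plain IoU is sensitive for tiny targets. Whether this is the specific motivation the authors give for ShapeIoU is not stated in this record.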