LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion

Abstract In aerial target detection tasks, achieving a balance between high detection accuracy and low computational cost remains a key challenge. To address the aforementioned problems, this paper introduces the lightweight multiscale feature fusion and attention-YOLO (LMSFA-YOLO), which is lightwe...

Bibliographic Details
Main Authors: Yuanbo Chu, Jiahao Wang, Longhui Ma, Chenxing Wu
Format: Article
Language: English
Published: Springer 2025-06-01
Series: Journal of King Saud University: Computer and Information Sciences
Subjects:
Online Access: https://doi.org/10.1007/s44443-025-00069-4
_version_ 1849331645762502656
author Yuanbo Chu
Jiahao Wang
Longhui Ma
Chenxing Wu
author_facet Yuanbo Chu
Jiahao Wang
Longhui Ma
Chenxing Wu
author_sort Yuanbo Chu
collection DOAJ
description Abstract In aerial target detection, balancing high detection accuracy against low computational cost remains a key challenge. To address this challenge, this paper introduces the lightweight multiscale feature fusion and attention YOLO (LMSFA-YOLO), a lightweight and accurate network for recognizing tiny aerial targets. First, we propose a lightweight multiscale convolution (LMSConv) and a lightweight multiscale cross-stage partial (LMSCSP) structure. Together they lower the cost of convolution and strengthen multiscale information extraction, significantly reducing computation and parameter count while improving feature representation and fusion without sacrificing accuracy. Next, mixed local channel attention (MLCA) is incorporated to build an effective mixed channel attention spatial pyramid pooling (EMCASPP), which integrates local and channel space information simultaneously to enhance the model's feature fusion ability. To further improve the precision of feature extraction and preserve detail, a high-resolution shallow feature layer is added. Finally, to improve the accuracy of bounding box regression, ShapeIoU replaces the original IoU so that the scale and shape of the bounding box are emphasized. Experimental results show that LMSFA-YOLO surpasses the baseline YOLOv5s with 35.8% fewer parameters while improving the F1-score and mAP by 5.0% and 7.0%, respectively, on the VisDrone dataset and by 6.1% and 7.6% on the AI-TOD dataset. It also runs in real time (> 30 FPS) on the Jetson Orin Nano. These results indicate that LMSFA-YOLO achieves high detection performance at reduced computational cost, making it well suited for deployment on edge devices with limited computational resources.
format Article
id doaj-art-78a385800baa413195d7473ceea6d91e
institution Kabale University
issn 1319-1578
2213-1248
language English
publishDate 2025-06-01
publisher Springer
record_format Article
series Journal of King Saud University: Computer and Information Sciences
spelling doaj-art-78a385800baa413195d7473ceea6d91e
2025-08-20T03:46:28Z
eng
Springer
Journal of King Saud University: Computer and Information Sciences
1319-1578
2213-1248
2025-06-01
374121
10.1007/s44443-025-00069-4
LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion
Yuanbo Chu
Jiahao Wang
Longhui Ma
Chenxing Wu
School of Opto-Electronical Engineering, Xi’an Technological University
School of Opto-Electronical Engineering, Xi’an Technological University
School of Opto-Electronical Engineering, Xi’an Technological University
School of Defence Science and Technology, Xi’an Technological University
Abstract In aerial target detection tasks, achieving a balance between high detection accuracy and low computational cost remains a key challenge. To address the aforementioned problems, this paper introduces the lightweight multiscale feature fusion and attention-YOLO (LMSFA-YOLO), which is lightweight and accurate for aerial tiny target recognition. Firstly, we propose a lightweight multiscale convolution (LMSConv) and a lightweight multiscale cross-stage partial (LMSCSP). These methods optimize convolutional computation cost and enhance multiscale information extraction, significantly reducing computational cost and parameters, while improving feature representation and fusion without sacrificing accuracy. Subsequently, the mixed local channel attention (MLCA) is combined to create an effective mixed channel attention spatial pyramid pooling (EMCASPP), aiming to simultaneously integrate local and channel space information to enhance the feature fusion ability of the model. To further improve the precision of feature extraction and preserve detailed information, a high-resolution shallow feature layer is applied. Finally, to increase the accuracy of bounding box regression, we introduce ShapeIoU to emphasize the scale and shape of the bounding box, replacing the original IoU. Experimental results demonstrate that LMSFA-YOLO surpasses the baseline YOLOv5s with 35.8% fewer parameters while improving the F1-score and mAP by 5.0% and 7.0% on the VisDrone dataset and by 6.1% and 7.6% on the AI-TOD dataset. Furthermore, it runs in real-time (> 30 FPS) on the Jetson Orin Nano. These results validate the effectiveness of LMSFA-YOLO in achieving high detection performance while reducing computation cost, making it well-suited for deployment on edge devices with limited computational resources.
https://doi.org/10.1007/s44443-025-00069-4
Tiny target detection
Lightweight network
Attention mechanism
Multiscale convolution
High resolution shallow feature
spellingShingle Yuanbo Chu
Jiahao Wang
Longhui Ma
Chenxing Wu
LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion
Journal of King Saud University: Computer and Information Sciences
Tiny target detection
Lightweight network
Attention mechanism
Multiscale convolution
High resolution shallow feature
title LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion
title_full LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion
title_fullStr LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion
title_full_unstemmed LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion
title_short LMSFA-YOLO: A lightweight target detection network in Remote sensing images based on Multiscale feature fusion
title_sort lmsfa yolo a lightweight target detection network in remote sensing images based on multiscale feature fusion
topic Tiny target detection
Lightweight network
Attention mechanism
Multiscale convolution
High resolution shallow feature
url https://doi.org/10.1007/s44443-025-00069-4
work_keys_str_mv AT yuanbochu lmsfayoloalightweighttargetdetectionnetworkinremotesensingimagesbasedonmultiscalefeaturefusion
AT jiahaowang lmsfayoloalightweighttargetdetectionnetworkinremotesensingimagesbasedonmultiscalefeaturefusion
AT longhuima lmsfayoloalightweighttargetdetectionnetworkinremotesensingimagesbasedonmultiscalefeaturefusion
AT chenxingwu lmsfayoloalightweighttargetdetectionnetworkinremotesensingimagesbasedonmultiscalefeaturefusion
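
Note on the description field: the abstract states that ShapeIoU replaces "the original IoU" for bounding box regression but gives no formula for either. As a reader's aid only, the sketch below computes the plain intersection-over-union of two axis-aligned boxes, which is the standard baseline quantity. It is not the authors' code and it is not ShapeIoU; the function name box_iou and the (x1, y1, x2, y2) box layout are assumptions made for this illustration.

```python
def box_iou(a, b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative baseline only: this is the standard IoU, not the
    ShapeIoU variant named in the abstract.
    """
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The same 1-pixel horizontal shift applied to a tiny box and a larger box.
tiny = box_iou((0, 0, 4, 4), (1, 0, 5, 4))        # 4x4 box
large = box_iou((0, 0, 40, 40), (1, 0, 41, 40))   # 40x40 box
print(f"tiny: {tiny:.3f}  large: {large:.3f}")
```

For the 4x4 box the one-pixel shift drops the plain IoU to 0.6, while the 40x40 box stays above 0.95. This sensitivity of small boxes to small offsets is a general property of plain IoU and gives some context for why a tiny-target detector might adopt a different regression measure; the record itself does not state the authors' reasoning.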