VMC-Net: multi-scale feature aggregation and distribution with contextual attention guided fusion for aerial object detection

Abstract As an important branch of remote sensing technology, aerial image target detection plays an indispensable role in supporting urban planning, disaster assessment, and other fields. However, this task faces many challenges such as small object size and complex background, which increase the d...

Bibliographic Details
Main Authors:	Haodong Li, Haicheng Qu
Format:	Article
Language:	English
Published:	Springer 2025-06-01
Series:	Complex & Intelligent Systems
Subjects:	Object detection, Aerial images, Multi-scale feature fusion, Contextual attention, Feature extraction
Online Access:	https://doi.org/10.1007/s40747-025-01888-8

_version_	1849388703182487552
author	Haodong Li, Haicheng Qu
author_facet	Haodong Li, Haicheng Qu
author_sort	Haodong Li
collection	DOAJ
description	Abstract As an important branch of remote sensing technology, aerial image object detection plays an indispensable role in supporting urban planning, disaster assessment, and other fields. However, the task faces challenges such as small object size and complex backgrounds, which increase the difficulty of detection. Existing methods usually rely on multi-scale feature fusion or attention mechanisms to improve performance, but they often ignore the role of object feature perception in the image and make insufficient use of context information. To address these problems, we propose the VMC-Net framework for aerial image object detection. The VHeat C2f module enhances feature extraction and generates a clearer target feature map; the multi-scale feature aggregation and distribution module adds feature distribution to the multi-scale feature fusion strategy to achieve more effective interaction across scales; the contextual attention guided fusion module uses an attention mechanism and a weighted fusion method to exploit context information and significantly improve small object detection. We evaluate the VMC-Net framework on the AI-TOD, VisDrone-2019, and TinyPerson datasets. Experimental results show that our framework outperforms mainstream object detection methods from the past three years in aerial object detection, with mAP50 scores of 45.6%, 45.9%, and 25.4%, respectively.
format	Article
id	doaj-art-31bfcbab442c40339fb4faef0feaf079
institution	Kabale University
issn	2199-4536 2198-6053
language	English
publishDate	2025-06-01
publisher	Springer
record_format	Article
series	Complex & Intelligent Systems
spelling	doaj-art-31bfcbab442c40339fb4faef0feaf079 | 2025-08-20T03:42:11Z | eng | Springer | Complex & Intelligent Systems | 2199-4536 | 2198-6053 | 2025-06-01 | 11 | 8 | 1 | 25 | 10.1007/s40747-025-01888-8 | VMC-Net: multi-scale feature aggregation and distribution with contextual attention guided fusion for aerial object detection | Haodong Li (Liaoning Technical University, School of Software) | Haicheng Qu (Liaoning Technical University, School of Software) | https://doi.org/10.1007/s40747-025-01888-8 | Object detection | Aerial images | Multi-scale feature fusion | Contextual attention | Feature extraction
title	VMC-Net: multi-scale feature aggregation and distribution with contextual attention guided fusion for aerial object detection
title_sort	vmc net multi scale feature aggregation and distribution with contextual attention guided fusion for aerial object detection
topic	Object detection, Aerial images, Multi-scale feature fusion, Contextual attention, Feature extraction
url	https://doi.org/10.1007/s40747-025-01888-8
work_keys_str_mv	AT haodongli vmcnetmultiscalefeatureaggregationanddistributionwithcontextualattentionguidedfusionforaerialobjectdetection AT haichengqu vmcnetmultiscalefeatureaggregationanddistributionwithcontextualattentionguidedfusionforaerialobjectdetection
