VMC-Net: multi-scale feature aggregation and distribution with contextual attention guided fusion for aerial object detection
Abstract As an important branch of remote sensing technology, aerial image target detection plays an indispensable role in supporting urban planning, disaster assessment, and other fields. However, this task faces many challenges such as small object size and complex background, which increase the d...
| Main Authors: | Haodong Li, Haicheng Qu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer 2025-06-01 |
| Series: | Complex & Intelligent Systems |
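| Subjects: | Object detection, Aerial images, Multi-scale feature fusion, Contextual attention, Feature extraction |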
| Online Access: | https://doi.org/10.1007/s40747-025-01888-8 |
| _version_ | 1849388703182487552 |
|---|---|
| author | Haodong Li Haicheng Qu |
| author_facet | Haodong Li Haicheng Qu |
| author_sort | Haodong Li |
| collection | DOAJ |
| description | Abstract As an important branch of remote sensing technology, aerial image target detection plays an indispensable role in supporting urban planning, disaster assessment, and other fields. However, this task faces many challenges such as small object size and complex background, which increase the difficulty of detection. Existing methods usually use multi-scale feature fusion or attention mechanism to improve performance, but they often ignore the role of object feature perception in the image and have problems such as insufficient use of context information. To address these problems, we propose the VMC-Net framework to optimize the aerial image object detection task. The VHeat C2f module enhances the feature extraction capability and generates a clearer target feature map; the multi-scale feature aggregation and distribution module adds feature distribution technology on the basis of the multi-scale feature fusion strategy to achieve more effective scale interaction; the contextual attention guided fusion module uses attention mechanism and weighted fusion method to effectively utilize context information and significantly improve the performance of small object detection. We evaluate the VMC-Net framework on the AI-TOD, VisDrone-2019 and TinyPerson datasets. Experimental results show that our framework outperforms the mainstream target detection methods in the past three years in aerial object detection, with mAP50 scores of 45.6%, 45.9%, and 25.4% respectively. |
| format | Article |
| id | doaj-art-31bfcbab442c40339fb4faef0feaf079 |
| institution | Kabale University |
| issn | 2199-4536 2198-6053 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | Springer |
| record_format | Article |
| series | Complex & Intelligent Systems |
| spelling | doaj-art-31bfcbab442c40339fb4faef0feaf079 2025-08-20T03:42:11Z eng Springer Complex & Intelligent Systems 2199-4536 2198-6053 2025-06-01 118125 10.1007/s40747-025-01888-8 VMC-Net: multi-scale feature aggregation and distribution with contextual attention guided fusion for aerial object detection Haodong Li 0 Haicheng Qu 1 Liaoning Technical University, School of Software Liaoning Technical University, School of Software Abstract As an important branch of remote sensing technology, aerial image target detection plays an indispensable role in supporting urban planning, disaster assessment, and other fields. However, this task faces many challenges such as small object size and complex background, which increase the difficulty of detection. Existing methods usually use multi-scale feature fusion or attention mechanism to improve performance, but they often ignore the role of object feature perception in the image and have problems such as insufficient use of context information. To address these problems, we propose the VMC-Net framework to optimize the aerial image object detection task. The VHeat C2f module enhances the feature extraction capability and generates a clearer target feature map; the multi-scale feature aggregation and distribution module adds feature distribution technology on the basis of the multi-scale feature fusion strategy to achieve more effective scale interaction; the contextual attention guided fusion module uses attention mechanism and weighted fusion method to effectively utilize context information and significantly improve the performance of small object detection. We evaluate the VMC-Net framework on the AI-TOD, VisDrone-2019 and TinyPerson datasets. Experimental results show that our framework outperforms the mainstream target detection methods in the past three years in aerial object detection, with mAP50 scores of 45.6%, 45.9%, and 25.4% respectively. https://doi.org/10.1007/s40747-025-01888-8 Object detection Aerial images Multi-scale feature fusion Contextual attention Feature extraction |
| spellingShingle | Haodong Li Haicheng Qu VMC-Net: multi-scale feature aggregation and distribution with contextual attention guided fusion for aerial object detection Complex & Intelligent Systems Object detection Aerial images Multi-scale feature fusion Contextual attention Feature extraction |
| title | VMC-Net: multi-scale feature aggregation and distribution with contextual attention guided fusion for aerial object detection |
| title_full | VMC-Net: multi-scale feature aggregation and distribution with contextual attention guided fusion for aerial object detection |
| title_fullStr | VMC-Net: multi-scale feature aggregation and distribution with contextual attention guided fusion for aerial object detection |
| title_full_unstemmed | VMC-Net: multi-scale feature aggregation and distribution with contextual attention guided fusion for aerial object detection |
| title_short | VMC-Net: multi-scale feature aggregation and distribution with contextual attention guided fusion for aerial object detection |
| title_sort | vmc net multi scale feature aggregation and distribution with contextual attention guided fusion for aerial object detection |
| topic | Object detection Aerial images Multi-scale feature fusion Contextual attention Feature extraction |
| url | https://doi.org/10.1007/s40747-025-01888-8 |
| work_keys_str_mv | AT haodongli vmcnetmultiscalefeatureaggregationanddistributionwithcontextualattentionguidedfusionforaerialobjectdetection AT haichengqu vmcnetmultiscalefeatureaggregationanddistributionwithcontextualattentionguidedfusionforaerialobjectdetection |
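
The abstract reports results as mAP50, that is, mean average precision where a predicted box counts as a match when its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that matching criterion only, not the authors' evaluation code; the function names `iou` and `is_match_at_50` and the example boxes are hypothetical and chosen for illustration. It also hints at why small objects are hard: the same 3-pixel localization error passes the threshold on a larger box and fails it on a tiny one.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box, threshold=0.5):
    """True if the prediction overlaps the ground truth enough to count at IoU 0.5."""
    return iou(pred_box, gt_box) >= threshold


# Hypothetical boxes: the same 3-pixel horizontal shift on a 20x20 and an 8x8 object.
large_gt, large_pred = (0, 0, 20, 20), (3, 0, 23, 20)
small_gt, small_pred = (0, 0, 8, 8), (3, 0, 11, 8)

print(is_match_at_50(large_pred, large_gt))  # shift is small relative to the box
print(is_match_at_50(small_pred, small_gt))  # same shift, much lower IoU
```

A full mAP50 computation additionally ranks detections by confidence, builds a precision-recall curve per class, and averages the resulting AP values across classes; this sketch covers only the per-box matching step.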