A Method for Airborne Small-Target Detection with a Multimodal Fusion Framework Integrating Photometric Perception and Cross-Attention Mechanisms
In recent years, the rapid advancement and pervasive deployment of unmanned aerial vehicle (UAV) technology have catalyzed transformative applications across the military, civilian, and scientific domains. While aerial imaging has emerged as a pivotal tool in modern remote sensing systems, persisten...
| Main Authors: | Shufang Xu, Heng Li, Tianci Liu, Hongmin Gao |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Remote Sensing |
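| Subjects: | light sensing; cross-attention mechanism; multimodal; small-target detection; aerial photography with UAV |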
| Online Access: | https://www.mdpi.com/2072-4292/17/7/1118 |
| _version_ | 1850212266234871808 |
|---|---|
| author | Shufang Xu; Heng Li; Tianci Liu; Hongmin Gao |
| collection | DOAJ |
| description | In recent years, the rapid advancement and pervasive deployment of unmanned aerial vehicle (UAV) technology have catalyzed transformative applications across the military, civilian, and scientific domains. While aerial imaging has emerged as a pivotal tool in modern remote sensing systems, persistent challenges remain in achieving robust small-target detection under complex all-weather conditions. This paper presents an innovative multimodal fusion framework incorporating photometric perception and cross-attention mechanisms to address the critical limitations of current single-modality detection systems, particularly their susceptibility to reduced accuracy and elevated false-negative rates in adverse environmental conditions. Our architecture introduces three novel components: (1) a bidirectional hierarchical feature extraction network that enables the synergistic processing of heterogeneous sensor data; (2) a cross-modality attention mechanism that dynamically establishes inter-modal feature correlations through learnable attention weights; (3) an adaptive photometric weighting fusion module that implements spectral characteristic-aware feature recalibration. The proposed system achieves multimodal complementarity through a two-phase integration: it first establishes cross-modal feature correspondences through attention-guided feature alignment, then performs weighted fusion based on photometric reliability assessment. Comprehensive experiments demonstrate that our framework achieves an improvement of at least 3.6% in mAP over the other models compared on the challenging LLVIP dataset, and shows particular improvements in detection reliability on the KAIST dataset. This research advances the state of the art in aerial target detection by providing a principled approach for multimodal sensor fusion, with significant implications for surveillance, disaster response, and precision agriculture applications. |
| format | Article |
| id | doaj-art-d565ff922395454aa85448fdcd254569 |
| institution | OA Journals |
| issn | 2072-4292 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Remote Sensing |
| spelling | doaj-art-d565ff922395454aa85448fdcd254569; 2025-08-20T02:09:22Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2025-03-01; vol. 17, no. 7, art. 1118; DOI 10.3390/rs17071118; A Method for Airborne Small-Target Detection with a Multimodal Fusion Framework Integrating Photometric Perception and Cross-Attention Mechanisms; Shufang Xu (College of Computer Science and Software Engineering, Hohai University, Nanjing 211100, China); Heng Li (College of Information Science and Engineering, Hohai University, Changzhou 213200, China); Tianci Liu (College of Computer Science and Software Engineering, Hohai University, Nanjing 211100, China); Hongmin Gao (College of Computer Science and Software Engineering, Hohai University, Nanjing 211100, China); https://www.mdpi.com/2072-4292/17/7/1118; keywords: light sensing; cross-attention mechanism; multimodal; small-target detection; aerial photography with UAV |
| title | A Method for Airborne Small-Target Detection with a Multimodal Fusion Framework Integrating Photometric Perception and Cross-Attention Mechanisms |
| topic | light sensing; cross-attention mechanism; multimodal; small-target detection; aerial photography with UAV |
| url | https://www.mdpi.com/2072-4292/17/7/1118 |
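
The two-phase integration named in the description (attention-guided alignment between modalities, then fusion weighted by photometric reliability) might be sketched roughly as below. This is an illustrative toy, not the authors' implementation: the function names, the logistic brightness-to-weight mapping and its `midpoint` and `steepness` parameters, the residual addition, and the omission of learned projection matrices are all assumptions made for the example. It also assumes both modalities supply the same number of equally sized feature vectors.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, context):
    # Scaled dot-product attention: each query vector from one modality
    # attends over all vectors of the other modality.
    # Assumption: identity projections (a real model would learn Q/K/V maps).
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in context]
        weights = softmax(scores)
        attended.append(
            [sum(w * c[j] for w, c in zip(weights, context)) for j in range(dim)]
        )
    return attended

def illumination_weight(mean_brightness, midpoint=0.5, steepness=10.0):
    # Hypothetical reliability of the visible stream in [0, 1]:
    # a logistic function of mean image brightness (0 = dark, 1 = bright).
    return 1.0 / (1.0 + math.exp(-steepness * (mean_brightness - midpoint)))

def fuse(visible, infrared, mean_brightness):
    # Phase 1: align each stream with the other via cross-attention,
    # keeping the original features through a residual sum.
    vis_ctx = cross_attention(visible, infrared)
    ir_ctx = cross_attention(infrared, visible)
    vis = [[a + b for a, b in zip(x, y)] for x, y in zip(visible, vis_ctx)]
    ir = [[a + b for a, b in zip(x, y)] for x, y in zip(infrared, ir_ctx)]
    # Phase 2: convex combination weighted by the visible stream's reliability.
    w = illumination_weight(mean_brightness)
    return [[w * a + (1.0 - w) * b for a, b in zip(x, y)] for x, y in zip(vis, ir)]

# Toy features: two positions, two channels per modality.
visible = [[1.0, 0.0], [0.0, 1.0]]
infrared = [[0.0, 1.0], [1.0, 0.0]]
fused_dark = fuse(visible, infrared, mean_brightness=0.05)
```

In this sketch a dark scene drives the visible-stream weight toward zero, so the fused features lean on the infrared stream; a learned reliability estimator would replace the fixed logistic in practice.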