A Method for Airborne Small-Target Detection with a Multimodal Fusion Framework Integrating Photometric Perception and Cross-Attention Mechanisms

In recent years, the rapid advancement and pervasive deployment of unmanned aerial vehicle (UAV) technology have catalyzed transformative applications across the military, civilian, and scientific domains. While aerial imaging has emerged as a pivotal tool in modern remote sensing systems, persisten...

Bibliographic Details
Main Authors: Shufang Xu, Heng Li, Tianci Liu, Hongmin Gao
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Remote Sensing
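Subjects: light sensing; cross-attention mechanism; multimodal; small-target detection; aerial photography with UAV
Online Access: https://www.mdpi.com/2072-4292/17/7/1118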
_version_ 1850212266234871808
author Shufang Xu
Heng Li
Tianci Liu
Hongmin Gao
collection DOAJ
description In recent years, the rapid advancement and pervasive deployment of unmanned aerial vehicle (UAV) technology have catalyzed transformative applications across the military, civilian, and scientific domains. While aerial imaging has emerged as a pivotal tool in modern remote sensing systems, persistent challenges remain in achieving robust small-target detection under complex all-weather conditions. This paper presents a multimodal fusion framework incorporating photometric perception and cross-attention mechanisms to address the critical limitations of current single-modality detection systems, particularly their susceptibility to reduced accuracy and elevated false-negative rates in adverse environmental conditions. Our architecture introduces three novel components: (1) a bidirectional hierarchical feature extraction network that enables the synergistic processing of heterogeneous sensor data; (2) a cross-modality attention mechanism that dynamically establishes inter-modal feature correlations through learnable attention weights; (3) an adaptive photometric weighting fusion module that implements spectral characteristic-aware feature recalibration. The proposed system achieves multimodal complementarity through two-phase integration: first establishing cross-modal feature correspondences through attention-guided feature alignment, then performing weighted fusion based on photometric reliability assessment. Comprehensive experiments demonstrate that our framework achieves an improvement of at least 3.6% in mAP over the other models compared on the challenging LLVIP dataset, with particular improvements in detection reliability on the KAIST dataset. This research advances the state of the art in aerial target detection by providing a principled approach for multimodal sensor fusion, with significant implications for surveillance, disaster response, and precision agriculture applications.
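The two-phase integration described in the abstract (attention-guided alignment, then photometrically weighted fusion) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the two modalities are visible and infrared images (as in the LLVIP and KAIST datasets), that both yield the same number of spatially registered feature tokens, and that attention uses the raw features without learned projections. The function names, the logistic brightness score, and its `midpoint` and `steepness` parameters are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product attention: each query token from one modality
    attends over all tokens of the other modality.
    Assumption: no learned Q/K/V projections, raw features are used."""
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    attended = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys_values]
        weights = softmax(scores)
        attended.append([
            sum(w * kv[d] for w, kv in zip(weights, keys_values))
            for d in range(dim)
        ])
    return attended


def photometric_weight(gray_pixels, midpoint=0.5, steepness=10.0):
    """Hypothetical reliability score in (0, 1) for the visible image:
    a logistic function of mean brightness (pixel values in [0, 1])."""
    mean = sum(gray_pixels) / len(gray_pixels)
    return 1.0 / (1.0 + math.exp(-steepness * (mean - midpoint)))


def add_tokens(a, b):
    """Element-wise sum of two equally shaped token lists."""
    return [[x + y for x, y in zip(ta, tb)] for ta, tb in zip(a, b)]


def fuse(visible_tokens, infrared_tokens, gray_pixels):
    """Phase 1: enrich each modality with features attended from the other.
    Phase 2: blend the two streams by the visible image's reliability."""
    vis = add_tokens(visible_tokens, cross_attention(visible_tokens, infrared_tokens))
    ir = add_tokens(infrared_tokens, cross_attention(infrared_tokens, visible_tokens))
    w = photometric_weight(gray_pixels)
    return [
        [w * v + (1.0 - w) * r for v, r in zip(vis_tok, ir_tok)]
        for vis_tok, ir_tok in zip(vis, ir)
    ]


if __name__ == "__main__":
    visible = [[0.1, 0.2], [0.0, 0.1]]   # toy visible-light tokens
    infrared = [[0.9, 0.4], [0.7, 0.8]]  # toy infrared tokens
    night_image = [0.05, 0.10, 0.02, 0.08]  # dark scene, low visible reliability
    print(photometric_weight(night_image))
    print(fuse(visible, infrared, night_image))
```

In this sketch a dark scene drives the visible-light weight toward zero, so the fused tokens lean on the infrared stream. A learned reliability estimator would replace the fixed logistic score in a trained model.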
format Article
id doaj-art-d565ff922395454aa85448fdcd254569
institution OA Journals
issn 2072-4292
language English
publishDate 2025-03-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj-art-d565ff922395454aa85448fdcd254569; 2025-08-20T02:09:22Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2025-03-01; vol. 17, no. 7, art. 1118; doi:10.3390/rs17071118
Shufang Xu: College of Computer Science and Software Engineering, Hohai University, Nanjing 211100, China
Heng Li: College of Information Science and Engineering, Hohai University, Changzhou 213200, China
Tianci Liu: College of Computer Science and Software Engineering, Hohai University, Nanjing 211100, China
Hongmin Gao: College of Computer Science and Software Engineering, Hohai University, Nanjing 211100, China
title A Method for Airborne Small-Target Detection with a Multimodal Fusion Framework Integrating Photometric Perception and Cross-Attention Mechanisms
topic light sensing
cross-attention mechanism
multimodal
small-target detection
aerial photography with UAV
url https://www.mdpi.com/2072-4292/17/7/1118