Multispectral Target Detection Based on Deep Feature Fusion of Visible and Infrared Modalities

Bibliographic Details
Main Authors: Yongsheng Zhao, Yuxing Gao, Xu Yang, Luyang Yang
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/11/5857
Description
Summary: Multispectral detection combines visible and infrared imaging to improve detection performance in complex environments. However, conventional convolution-based fusion methods rely predominantly on local feature interactions, which limits their capacity to exploit cross-modal information and makes them more susceptible to interference from complex backgrounds. To address these limitations, the YOLO-MEDet multispectral target detection model is proposed. First, the YOLOv5 architecture is redesigned into a two-stream backbone network with a midway fusion strategy that integrates multimodal features from the C3 to C5 layers, improving detection accuracy and robustness. Second, the Attention-Enhanced Feature Fusion Framework (AEFF) is introduced to optimize both cross-modal and intra-modal feature representations through an attention mechanism. Finally, the C3-PSA (C3 Pyramid Compressed Attention) module is integrated to strengthen multiscale spatial feature extraction and refine feature representation, improving detection accuracy while reducing false alarms and missed detections in complex scenarios. Extensive experiments on the FLIR, KAIST, and M3FD datasets, along with additional validation using SimuNPS simulations, indicate that YOLO-MEDet outperforms existing approaches across multiple evaluation metrics.
ISSN: 2076-3417
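
Illustrative note: the summary describes a two-stream backbone whose visible and infrared features are fused midway at the C3, C4, and C5 levels. The sketch below shows only that general idea and is not the authors' AEFF or C3-PSA design, whose details this record does not give. It assumes feature maps stored as nested lists shaped [channel][row][column], and it swaps in a simple per-channel softmax weighting as a stand-in for the attention mechanism. All function names (`global_avg_pool`, `fuse_level`, `midway_fusion`, `const_map`) are hypothetical.

```python
import math


def global_avg_pool(fmap):
    """Reduce a [C][H][W] feature map to one mean value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def fuse_level(f_vis, f_ir):
    """Fuse two same-shaped feature maps with per-channel softmax weights.

    Assumption: a two-way softmax over pooled channel descriptors stands in
    for the paper's attention-based fusion, which is not specified here.
    """
    d_vis = global_avg_pool(f_vis)
    d_ir = global_avg_pool(f_ir)
    fused = []
    for c, (ch_v, ch_i) in enumerate(zip(f_vis, f_ir)):
        # Numerically stable softmax over the two modality descriptors.
        m = max(d_vis[c], d_ir[c])
        e_v = math.exp(d_vis[c] - m)
        e_i = math.exp(d_ir[c] - m)
        w_v = e_v / (e_v + e_i)
        w_i = 1.0 - w_v
        # Weighted sum of the two modalities, element by element.
        fused.append(
            [[w_v * v + w_i * i for v, i in zip(row_v, row_i)]
             for row_v, row_i in zip(ch_v, ch_i)]
        )
    return fused


def midway_fusion(vis_feats, ir_feats):
    """Fuse each backbone level (e.g. 'C3', 'C4', 'C5') independently."""
    return {name: fuse_level(vis_feats[name], ir_feats[name]) for name in vis_feats}


def const_map(channels, height, width, value):
    """Build a constant [C][H][W] feature map for demonstration."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]


# Toy inputs: three pyramid levels with shrinking spatial size.
vis = {"C3": const_map(2, 4, 4, 1.0), "C4": const_map(2, 2, 2, 2.0), "C5": const_map(2, 1, 1, 3.0)}
ir = {"C3": const_map(2, 4, 4, 1.0), "C4": const_map(2, 2, 2, 0.0), "C5": const_map(2, 1, 1, 3.0)}
fused = midway_fusion(vis, ir)
```

Each level is fused separately, so the detection head would receive three fused maps with the same shapes as the inputs. Where the two modalities agree, the weights are equal and the output matches either input. Where one modality has the stronger channel response, it receives the larger weight.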