Multi-Target Detection Algorithm for Fusion Images Based on an Attention Mechanism

Bibliographic Details
Main Authors: Zhenge Qu, Zhuoning Dong, Yuxin Guo, Hui Ren, Hongyang Fu
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Applied Sciences
Subjects: image fusion, target detection, global attention mechanism, deep learning
Online Access: https://www.mdpi.com/2076-3417/15/13/7044
author Zhenge Qu
Zhuoning Dong
Yuxin Guo
Hui Ren
Hongyang Fu
collection DOAJ
description Due to the inherent limitations of visible-light sensors, monitoring systems that rely solely on single-modal visible-light images exhibit reduced accuracy, posing safety concerns in applications such as autonomous driving. Infrared and visible-light image fusion technology addresses this issue by generating composite images that integrate complementary information from both modalities, thereby enhancing perception robustness. This study focuses on target detection in fused images. Given that targets in such images are often small and severely occluded, we propose an optimized detection framework to overcome these challenges. Specifically, we improve the YOLOv8 baseline model by introducing a dedicated small-object detection layer, incorporating the Global Attention Mechanism (GAM), and refining the loss function. Experimental results show that our method achieves a 5.0% improvement in mAP and a 6.5% gain in recall over the original YOLOv8. Furthermore, comparative experiments on fused and single-modal inputs demonstrate that fused images yield the highest detection accuracy. These results confirm that leveraging fused inputs significantly enhances detection accuracy and robustness in complex environments.
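In standard detection evaluation, recall such as that quoted in the description is the fraction of ground-truth boxes matched by a prediction at a chosen IoU threshold. The following is a minimal sketch of that computation, not the authors' evaluation code. It assumes axis-aligned boxes given as (x1, y1, x2, y2), predictions already sorted by descending confidence, and a 0.5 threshold; the function names `iou` and `recall` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall(predictions, ground_truth, threshold=0.5):
    """Fraction of ground-truth boxes matched by some prediction.

    Greedy one-to-one matching (an assumption of this sketch): each
    prediction claims the unmatched ground-truth box with the highest
    IoU, provided that IoU is at least `threshold`.
    """
    matched = set()
    for pred in predictions:
        best_idx, best_iou = None, threshold
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            score = iou(pred, gt)
            if score >= best_iou:
                best_idx, best_iou = idx, score
        if best_idx is not None:
            matched.add(best_idx)
    return len(matched) / len(ground_truth) if ground_truth else 0.0
```

With two ground-truth boxes and predictions that overlap only one of them, `recall` returns 0.5. Small or occluded targets typically lower this value because their predicted boxes either fall below the IoU threshold or are missing altogether.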
format Article
id doaj-art-659a7dfdd4ee49f0963db716c4a189a5
institution Kabale University
issn 2076-3417
language English
publishDate 2025-06-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-659a7dfdd4ee49f0963db716c4a189a5
2025-08-20T03:50:17Z
eng
MDPI AG
Applied Sciences
2076-3417
2025-06-01
Volume 15, Issue 13, Article 7044
10.3390/app15137044
Multi-Target Detection Algorithm for Fusion Images Based on an Attention Mechanism
Zhenge Qu (0), Zhuoning Dong (1), Yuxin Guo (2), Hui Ren (3), Hongyang Fu (4)
(0) School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China
(1) School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China
(2) School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China
(3) School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China
(4) Institute of Systems Engineering, China Academy of Engineering Physics, Mianyang 621999, China
https://www.mdpi.com/2076-3417/15/13/7044
image fusion
target detection
global attention mechanism
deep learning
title Multi-Target Detection Algorithm for Fusion Images Based on an Attention Mechanism
topic image fusion
target detection
global attention mechanism
deep learning
url https://www.mdpi.com/2076-3417/15/13/7044