Multi-Target Detection Algorithm for Fusion Images Based on an Attention Mechanism
| Main Authors: | Zhenge Qu, Zhuoning Dong, Yuxin Guo, Hui Ren, Hongyang Fu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Applied Sciences |
| Subjects: | image fusion; target detection; global attention mechanism; deep learning |
| Online Access: | https://www.mdpi.com/2076-3417/15/13/7044 |
| _version_ | 1849319855363194880 |
|---|---|
| author | Zhenge Qu, Zhuoning Dong, Yuxin Guo, Hui Ren, Hongyang Fu |
| author_facet | Zhenge Qu, Zhuoning Dong, Yuxin Guo, Hui Ren, Hongyang Fu |
| author_sort | Zhenge Qu |
| collection | DOAJ |
| description | Due to the inherent limitations of visible-light sensors, monitoring systems that rely solely on single-modal visible-light images exhibit reduced accuracy, posing safety concerns in applications such as autonomous driving. Infrared and visible-light image fusion technology addresses this issue by generating composite images that integrate complementary information from both modalities, thereby enhancing perception robustness. This study focuses on target detection in fused images. Given that targets in such images are often small and severely occluded, we propose an optimized detection framework to overcome these challenges. Specifically, we improve the YOLOv8 baseline model by introducing a dedicated small-object detection layer, incorporating the Global Attention Mechanism (GAM), and refining the loss function. Experimental results show that our method achieves a 5.0% improvement in mAP and a 6.5% gain in recall over the original YOLOv8. Furthermore, comparative experiments on fused and single-modal inputs demonstrate that fused images yield the highest detection accuracy. These results confirm that leveraging fused inputs significantly enhances detection accuracy and robustness in complex environments. |
| format | Article |
| id | doaj-art-659a7dfdd4ee49f0963db716c4a189a5 |
| institution | Kabale University |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-659a7dfdd4ee49f0963db716c4a189a5; 2025-08-20T03:50:17Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2025-06-01; vol. 15, no. 13, art. 7044; DOI 10.3390/app15137044; Multi-Target Detection Algorithm for Fusion Images Based on an Attention Mechanism; Zhenge Qu, Zhuoning Dong, Yuxin Guo, Hui Ren (School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China); Hongyang Fu (Institute of Systems Engineering, China Academy of Engineering Physics, Mianyang 621999, China); https://www.mdpi.com/2076-3417/15/13/7044; keywords: image fusion, target detection, global attention mechanism, deep learning |
| title | Multi-Target Detection Algorithm for Fusion Images Based on an Attention Mechanism |
| topic | image fusion; target detection; global attention mechanism; deep learning |
| url | https://www.mdpi.com/2076-3417/15/13/7044 |
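
The description above reports gains in mAP and recall, but this record does not state the evaluation protocol. As a rough illustration only, the sketch below shows how recall is commonly computed for object detection: each ground-truth box counts as found if an unmatched detection overlaps it with an intersection-over-union (IoU) at or above a threshold. The function names, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are assumptions for this example, not details taken from the article.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(
    detections: List[Tuple[Box, float]],  # (box, confidence score)
    truths: List[Box],
    threshold: float = 0.5,  # assumed threshold; the record does not give one
) -> float:
    """Fraction of ground-truth boxes matched by some detection.

    Detections are visited in descending score order, and each one is
    greedily matched to the best-overlapping ground truth not yet matched.
    """
    matched = set()
    for box, _score in sorted(detections, key=lambda d: d[1], reverse=True):
        best_j, best_iou = -1, threshold
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            matched.add(best_j)
    return len(matched) / len(truths) if truths else 0.0


if __name__ == "__main__":
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    detections = [((1, 1, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)]
    print(recall_at_iou(detections, truths))
```

In this toy case one of the two ground-truth boxes is matched, so a missed small or occluded target shows up directly as lower recall, which is the quantity the reported gain refers to.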