An adaptive feature fusion strategy using dual-layer attention and multi-modal deep reinforcement learning for all-media similarity search

Bibliographic Details
Main Authors: Jin Yue, Jiayun Lang, Rui Feng
Format: Article
Language: English
Published: Springer 2025-05-01
Series: Discover Artificial Intelligence
Subjects:
Online Access: https://doi.org/10.1007/s44163-025-00332-7
Description
Summary: This paper proposes a novel adaptive feature fusion strategy that combines a dual-layer attention mechanism with multi-modal deep reinforcement learning (DRL) to optimize cross-modal information retrieval. The dual-layer attention mechanism enhances the model's ability to capture deep semantic relationships between different modalities, while DRL optimizes the feature extraction and fusion process, improving adaptability in complex environments. Experimental results demonstrate that this strategy outperforms traditional CNN- and RNN-based methods in accuracy, recall, and efficiency across a range of cross-modal retrieval tasks, particularly in multi-modal data environments such as text-image, text-video, and image-video. The proposed approach offers a promising solution for improving the accuracy and efficiency of cross-modal information retrieval.
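The record does not detail the architecture, but the "dual-layer attention" described in the abstract can be illustrated with a minimal sketch: a first layer of intra-modal self-attention that refines each modality's features independently, followed by a second layer of cross-modal attention that fuses them. All function names, dimensions, and the two-stage layout below are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product attention: softmax(QK^T / sqrt(d)) V
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def dual_layer_fusion(text_feats, image_feats):
    # Layer 1: intra-modal self-attention refines each modality separately
    text_refined = attention(text_feats, text_feats, text_feats)
    image_refined = attention(image_feats, image_feats, image_feats)
    # Layer 2: cross-modal attention; text queries attend over image
    # keys/values, yielding a fused representation per text token
    return attention(text_refined, image_refined, image_refined)

rng = np.random.default_rng(0)
text = rng.standard_normal((4, 8))   # 4 text tokens, feature dim 8
image = rng.standard_normal((6, 8))  # 6 image regions, feature dim 8
fused = dual_layer_fusion(text, image)
print(fused.shape)  # one fused vector per text token: (4, 8)
```

In a full retrieval system, the fused vectors would feed a similarity-scoring head, and a DRL agent (as the abstract describes) would adapt the extraction and fusion process; both are omitted here.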
ISSN:2731-0809