Attention-enhanced multimodal feature fusion network for clothes-changing person re-identification

Bibliographic Details
Main Authors:	Yongkang Ding, Jiechen Li, Hao Wang, Ziang Liu, Anqi Wang
Format:	Article
Language:	English
Published:	Springer 2024-11-01
Series:	Complex & Intelligent Systems
Subjects:	Person re-identification; Clothes-changing scenarios; Computer vision; Image retrieval
Online Access:	https://doi.org/10.1007/s40747-024-01646-2

Description
Summary:	Clothes-changing person re-identification is a challenging problem in computer vision, primarily because a person’s appearance varies when clothing changes across camera views. This undermines traditional person re-identification techniques that rely on clothing features. The main difficulties are the inconsistency of clothing and the difficulty of learning reliable clothing-irrelevant local features. To address this issue, we propose a novel network architecture called the Attention-Enhanced Multimodal Feature Fusion Network (AE-Net). AE-Net mitigates the impact of clothing changes on recognition accuracy by integrating RGB global features, grayscale image features, and clothing-irrelevant features obtained through semantic segmentation. Specifically, global features capture the overall appearance of the person; grayscale image features help eliminate the interference of color in recognition; and clothing-irrelevant features derived from semantic segmentation force the model to learn features independent of the person’s clothing. Additionally, we introduce a multi-scale fusion attention mechanism that enhances the model’s ability to capture both detailed and global structures, thereby improving recognition accuracy and robustness. Experiments show that AE-Net outperforms several state-of-the-art methods on the PRCC and LTCC datasets, particularly in scenarios with significant clothing changes, achieving Top-1 accuracies of 60.4% on PRCC and 42.9% on LTCC.
ISSN:	2199-4536, 2198-6053
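
Illustrative sketch (not from the article): the summary names three inputs to AE-Net, namely the RGB image, a grayscale image, and clothing-irrelevant content obtained through semantic segmentation. The snippet below shows one plausible way to derive the three views from a single image, assuming that clothing pixels are simply zeroed out using a human-parsing label map. The label ids, function names, and masking strategy are all hypothetical; the record does not say how the authors build these inputs, and the network and attention mechanism themselves are not modeled here.

```python
# Hypothetical label ids for a human-parsing map; real parsers define their own.
CLOTHING_LABELS = {1, 2}  # assumed: 1 = upper clothes, 2 = trousers


def to_grayscale(rgb):
    """Convert an H x W grid of (r, g, b) tuples to luma using BT.601 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def mask_clothing(rgb, parsing):
    """Zero out every pixel whose parsing label marks clothing (assumed strategy)."""
    return [
        [(0, 0, 0) if label in CLOTHING_LABELS else pixel
         for pixel, label in zip(img_row, label_row)]
        for img_row, label_row in zip(rgb, parsing)
    ]


def build_views(rgb, parsing):
    """Return the three input views described in the summary, keyed by name."""
    return {
        "rgb": rgb,
        "gray": to_grayscale(rgb),
        "clothing_masked": mask_clothing(rgb, parsing),
    }


# Toy 2 x 2 image and parsing map; label 0 stands for non-clothing (e.g. face).
image = [[(200, 180, 160), (30, 60, 200)],
         [(30, 60, 200), (90, 90, 90)]]
parsing = [[0, 1],
           [1, 2]]

views = build_views(image, parsing)
```

In a full pipeline each view would feed its own feature extractor before fusion; that part depends on details available only in the article itself.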

Similar Items