Multi-granularity feature intersection learning for visible-infrared person re-identification

Bibliographic Details
Main Authors: Sixian Chan, Jie Wang, Jiaao Cui, Jie Hu, Zhuorong Li, Jiafa Mao
Format: Article
Language: English
Published: Springer 2025-05-01
Series: Complex & Intelligent Systems
Online Access: https://doi.org/10.1007/s40747-025-01853-5
Description
Summary: This paper proposes a multi-granularity feature intersection network (MGFINet) for visible-infrared person re-identification (VI-ReID). VI-ReID aims to retrieve images of the same pedestrian across cameras that operate in different spectra. The key challenge is to extract pedestrian descriptions with both inter-class discriminability and intra-class similarity. Previous methods ignore the potential loss of detail during representation extraction and the presence of data bias in the metric function, which limits further improvements in retrieval performance. In addition, the discrepancy between how the loss is calculated for representation learning and for metric learning affects model training. To address these issues, MGFINet comprises three components: a hierarchical part pooling method (HPP), a hierarchical part restriction method (HPC), and a feature intersection (FI) loss. HPP adopts a hierarchical framework to extract multi-granularity pedestrian representations, and it performs inter-layer fusion to exploit both the high-resolution information of shallow layers and the semantic representability of deep layers. HPP also employs part pooling with different step sizes to capture pedestrian details in each layer. HPC spreads the identity loss across all layers to shorten the distance for gradient backpropagation and further optimize fine-grained features in shallow layers. The FI loss combines representation and metric learning by incorporating hyperparameters of classifiers into metric learning, mitigating data bias and reducing the gap between the two learning processes. Extensive experiments on two public datasets, SYSU-MM01 and RegDB, demonstrate the effectiveness of the proposed method.
ISSN: 2199-4536, 2198-6053
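
Illustration: the summary mentions part pooling with different step sizes and an identity loss spread across all layers. The sketch below shows one plausible reading of those two ideas in plain Python. It is not the authors' implementation. The function names, the single-channel feature map, the stripe counts (1, 2, 4), and the unweighted sum of per-layer losses are all assumptions made for illustration. The FI loss is not sketched because the summary does not give its form.

```python
import math


def part_pool(feature_map, num_parts):
    """Average-pool a 2D feature map into `num_parts` horizontal stripes.

    `feature_map` is a list of rows (lists of floats). A single channel is
    assumed here for brevity; real feature maps have many channels.
    """
    height = len(feature_map)
    if height % num_parts != 0:
        raise ValueError("height must be divisible by num_parts")
    step = height // num_parts  # stripe height, i.e. the pooling step size
    pooled = []
    for p in range(num_parts):
        stripe = feature_map[p * step:(p + 1) * step]
        values = [v for row in stripe for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def multi_granularity_parts(feature_map, part_counts=(1, 2, 4)):
    """Pool the same map at several granularities (assumed stripe counts)."""
    return {n: part_pool(feature_map, n) for n in part_counts}


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def hierarchical_identity_loss(per_layer_logits, label):
    """Sum the identity (classification) loss over every layer's logits.

    Assumption: layers are weighted equally; the summary does not say.
    """
    return sum(cross_entropy(logits, label) for logits in per_layer_logits)


# Toy 4x2 feature map whose rows stand in for head-to-foot body regions.
toy_map = [[1.0, 1.0], [3.0, 3.0], [5.0, 5.0], [7.0, 7.0]]
parts = multi_granularity_parts(toy_map)

# Two layers, each producing uniform logits over two identities.
toy_loss = hierarchical_identity_loss([[0.0, 0.0], [0.0, 0.0]], label=0)
```

With one stripe the whole map is averaged into a global descriptor, while larger stripe counts keep progressively finer local detail. Summing the classification loss over layers gives shallow layers a direct training signal instead of relying only on gradients propagated from the final layer.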