Visible-infrared person re-identification with region-based augmentation and cross modality attention
Abstract Visible-infrared person re-identification (VI-ReID) aims to search the same pedestrian of interest across visible and infrared modalities. Existing models mainly focus on compensating for modality-specific information to reduce modality variation. However, these methods often introduce inte...
| Main Authors: | Yuwei Guo, Wenhao Zhang, Licheng Jiao, Shuang Wang, Shuo Wang, Fang Liu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-05-01 |
| Series: | Scientific Reports |
| Online Access: | https://doi.org/10.1038/s41598-025-01979-z |
| _version_ | 1849704160534986752 |
|---|---|
| author | Yuwei Guo; Wenhao Zhang; Licheng Jiao; Shuang Wang; Shuo Wang; Fang Liu |
| author_facet | Yuwei Guo; Wenhao Zhang; Licheng Jiao; Shuang Wang; Shuo Wang; Fang Liu |
| author_sort | Yuwei Guo |
| collection | DOAJ |
| description | Abstract Visible-infrared person re-identification (VI-ReID) aims to search the same pedestrian of interest across visible and infrared modalities. Existing models mainly focus on compensating for modality-specific information to reduce modality variation. However, these methods often introduce interfering information and lead to higher computational overhead when generating the corresponding images or features. Additionally, the pedestrian region characteristics in VI-ReID are not effectively utilized, thus resulting in ambiguous or unnatural images. To address these issues, it is critical to leverage pedestrian attentive features and learn modality-complete and -consistent representation. In this paper, a novel Region-based Augmentation and Cross Modality Attention (RACA) model is proposed, focusing on the pedestrian regions to efficiently compensate for missing modality-specific features. Specifically, we propose a region-based data augmentation module PedMix to enhance pedestrian region coherence by mixing the corresponding regions from different modalities, thus generating more natural images. Moreover, a lightweight hybrid compensation module, i.e., a Modality Feature Transfer (MFT) module, is proposed to integrate cross attention and convolution networks to avoid introducing interfering information while preserving minimal computational overhead. Extensive experiments conducted on the benchmark SYSU-MM01 and RegDB datasets demonstrated the effectiveness of our proposed RACA model. |
| format | Article |
| id | doaj-art-116c1fc0c830485a83f0edcf9d8e5fd0 |
| institution | DOAJ |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-116c1fc0c830485a83f0edcf9d8e5fd0; 2025-08-20T03:16:51Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-05-01; 15; 1; 1; 16; 10.1038/s41598-025-01979-z; Visible-infrared person re-identification with region-based augmentation and cross modality attention; Yuwei Guo (0); Wenhao Zhang (1); Licheng Jiao (2); Shuang Wang (3); Shuo Wang (4); Fang Liu (5); (0, 1, 2, 3, 5) Key Laboratory of Intelligent Perception and Image Understanding of the Ministry of Education of China, School of Artificial Intelligence, International Research Center of Intelligent Perception and Computation, Xidian University; (4) School of Computer Science, The University of Birmingham; Abstract Visible-infrared person re-identification (VI-ReID) aims to search the same pedestrian of interest across visible and infrared modalities. Existing models mainly focus on compensating for modality-specific information to reduce modality variation. However, these methods often introduce interfering information and lead to higher computational overhead when generating the corresponding images or features. Additionally, the pedestrian region characteristics in VI-ReID are not effectively utilized, thus resulting in ambiguous or unnatural images. To address these issues, it is critical to leverage pedestrian attentive features and learn modality-complete and -consistent representation. In this paper, a novel Region-based Augmentation and Cross Modality Attention (RACA) model is proposed, focusing on the pedestrian regions to efficiently compensate for missing modality-specific features. Specifically, we propose a region-based data augmentation module PedMix to enhance pedestrian region coherence by mixing the corresponding regions from different modalities, thus generating more natural images. Moreover, a lightweight hybrid compensation module, i.e., a Modality Feature Transfer (MFT) module, is proposed to integrate cross attention and convolution networks to avoid introducing interfering information while preserving minimal computational overhead. Extensive experiments conducted on the benchmark SYSU-MM01 and RegDB datasets demonstrated the effectiveness of our proposed RACA model.; https://doi.org/10.1038/s41598-025-01979-z |
| spellingShingle | Yuwei Guo; Wenhao Zhang; Licheng Jiao; Shuang Wang; Shuo Wang; Fang Liu; Visible-infrared person re-identification with region-based augmentation and cross modality attention; Scientific Reports |
| title | Visible-infrared person re-identification with region-based augmentation and cross modality attention |
| title_full | Visible-infrared person re-identification with region-based augmentation and cross modality attention |
| title_fullStr | Visible-infrared person re-identification with region-based augmentation and cross modality attention |
| title_full_unstemmed | Visible-infrared person re-identification with region-based augmentation and cross modality attention |
| title_short | Visible-infrared person re-identification with region-based augmentation and cross modality attention |
| title_sort | visible infrared person re identification with region based augmentation and cross modality attention |
| url | https://doi.org/10.1038/s41598-025-01979-z |
| work_keys_str_mv | AT yuweiguo visibleinfraredpersonreidentificationwithregionbasedaugmentationandcrossmodalityattention AT wenhaozhang visibleinfraredpersonreidentificationwithregionbasedaugmentationandcrossmodalityattention AT lichengjiao visibleinfraredpersonreidentificationwithregionbasedaugmentationandcrossmodalityattention AT shuangwang visibleinfraredpersonreidentificationwithregionbasedaugmentationandcrossmodalityattention AT shuowang visibleinfraredpersonreidentificationwithregionbasedaugmentationandcrossmodalityattention AT fangliu visibleinfraredpersonreidentificationwithregionbasedaugmentationandcrossmodalityattention |