Fine-Grained Fusion With Distractor Suppression for Video-Based Person Re-Identification
Video based person re-identification aims to associate video clips with the same identity by designing discriminative and representative features. Existing approaches simply compute representations for video clips via frame-level or region-level feature aggregation, where fine-grained local informat...
Saved in:
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
IEEE
2019-01-01
|
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/8782110/ |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Video based person re-identification aims to associate video clips with the same identity by designing discriminative and representative features. Existing approaches simply compute representations for video clips via frame-level or region-level feature aggregation, where fine-grained local information is inaccessible. To address this issue, we propose a novel module called fine-grained fusion with distractor suppression (short as FFDS) to fully exploit the local features towards better representation of a specific video clip. Concretely, in the proposed FFDS module, the importance of each local feature of an anchor image is calculated by pixel-wise correlation mining with other intra-sequence frames. In this way, ’good’ local features co-exist across the video frames are enhanced in the attention map, while sparse ’distractors’ can be suppressed. Moreover, to maintain the high-level semantic information of deep CNN features as well as enjoying the fine-grained local information, we adopt the feature mimicking scheme during the training process. Extensive experiments on two challenging large-scale datasets demonstrate effectiveness of the proposed method. |
|---|---|
| ISSN: | 2169-3536 |