Complementarity-Oriented Feature Fusion for Face-Phone Trajectory Matching

Bibliographic Details
Main Authors: Changfeng Cao, Wenchuan Zhang, Hua Yang, Dan Ruan
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Access
Online Access:https://ieeexplore.ieee.org/document/10844270/
Description
Summary:CCTVs and telecom base stations act as sensors that collect massive amounts of face- and phone-related data. When used for person localization and trajectory characterization, the two present quite different spatiotemporal characteristics: CCTV yields slowly sampled face ID trajectories with a spatial resolution of approximately 20 meters, while telecom readings provide fast sampled phone ID trajectories with a spatial uncertainty of a few hundred meters. A face or phone trajectory can be seen as an observation of the real trajectory of a moving pedestrian, so identifying the correspondence between face and phone trajectories is useful for reconstructing the trajectories of moving persons. To this end, we propose a complementarity-oriented feature fusion mechanism (COFFM) to model and utilize the common embedding and complementarity of these two measurement modalities. Specifically, a Cycle Heterogeneous Trajectory Translation Network (CCTTN) is proposed to realize a Trajectory Feature Extractor (TFE) that captures the latent transforming relationships between the face and phone modalities. The latent features from both translation directions are concatenated in the Feature Unifying (FU) module and fed into a binary face-phone trajectory pair matching discriminator (FPTPMD) to infer whether a face-phone trajectory pair corresponds to the same underlying motion trajectory. We evaluated our method on a large real-world face-phone trajectory dataset and achieved promising results, with an accuracy of 97.1% that exceeds comparable similarity-based methods. The developed principle and framework generalize well to other multi-modality trajectory matching tasks.
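
The abstract outlines the full pipeline (CCTTN realizing the TFE, FU concatenation, FPTPMD matching). Below is a minimal PyTorch sketch of how such a complementarity-oriented fusion could be wired, assuming recurrent encoders over (longitude, latitude) point sequences; every layer choice, dimension, and name other than those given in the abstract is an illustrative assumption, not a detail from the paper.

# Minimal sketch of the COFFM pipeline described in the abstract.
# All shapes, layer types, and hyperparameters are assumptions for
# illustration only; cycle-consistency losses are omitted for brevity.
import torch
import torch.nn as nn

class TrajectoryEncoder(nn.Module):
    """Maps a variable-length (T, 2) trajectory to a fixed latent vector."""
    def __init__(self, in_dim=2, hidden=64):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden, batch_first=True)

    def forward(self, traj):               # traj: (B, T, 2) lon/lat points
        _, h = self.gru(traj)              # h: (1, B, hidden)
        return h.squeeze(0)                # (B, hidden)

class TrajectoryDecoder(nn.Module):
    """Reconstructs a trajectory in the other modality from a latent vector."""
    def __init__(self, hidden=64, out_dim=2, steps=32):
        super().__init__()
        self.steps = steps
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, z):                  # z: (B, hidden)
        rep = z.unsqueeze(1).expand(-1, self.steps, -1)
        out, _ = self.gru(rep)
        return self.head(out)              # (B, steps, 2)

class COFFM(nn.Module):
    """Translate in both directions; fuse the two latents for matching."""
    def __init__(self, hidden=64):
        super().__init__()
        self.enc_face = TrajectoryEncoder(hidden=hidden)     # face -> phone side
        self.dec_face2phone = TrajectoryDecoder(hidden=hidden)
        self.enc_phone = TrajectoryEncoder(hidden=hidden)    # phone -> face side
        self.dec_phone2face = TrajectoryDecoder(hidden=hidden)
        self.discriminator = nn.Sequential(  # FPTPMD: binary match logit
            nn.Linear(2 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, face_traj, phone_traj):
        z_f = self.enc_face(face_traj)         # latent of face->phone direction
        z_p = self.enc_phone(phone_traj)       # latent of phone->face direction
        fused = torch.cat([z_f, z_p], dim=-1)  # FU: concatenate both latents
        logit = self.discriminator(fused)      # same-trajectory score
        # reconstructions that translation/cycle losses would be applied to
        return logit, self.dec_face2phone(z_f), self.dec_phone2face(z_p)

model = COFFM()
face = torch.randn(4, 50, 2)    # 4 slowly sampled face trajectories
phone = torch.randn(4, 200, 2)  # 4 fast sampled phone trajectories
logit, phone_hat, face_hat = model(face, phone)
print(logit.shape)              # torch.Size([4, 1])

In this sketch the concatenation step stands in for the FU module and a small MLP stands in for the FPTPMD; training would combine a binary matching loss on the logit with reconstruction (and, per the abstract's cycle formulation, cycle-consistency) losses on the translated trajectories.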
ISSN:2169-3536