Integrating Generative and Contrastive Approaches for Human Action Recognition
This study introduces a novel approach to unsupervised skeleton-based human action recognition by integrating generative and contrastive learning methods. We propose a decomposition of representations, allowing for the preservation of detailed motion information for the generative learning objective...
| Main Authors: | Pablo Cervantes, Yusuke Sekikawa, Ikuro Sato, Koichi Shinoda |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | Generative and contrastive; representation learning; unsupervised 3D action recognition |
| Online Access: | https://ieeexplore.ieee.org/document/11020639/ |
| _version_ | 1850130463605129216 |
|---|---|
| author | Pablo Cervantes; Yusuke Sekikawa; Ikuro Sato; Koichi Shinoda |
| author_facet | Pablo Cervantes; Yusuke Sekikawa; Ikuro Sato; Koichi Shinoda |
| author_sort | Pablo Cervantes |
| collection | DOAJ |
| description | This study introduces a novel approach to unsupervised skeleton-based human action recognition by integrating generative and contrastive learning methods. We propose a decomposition of representations, allowing for the preservation of detailed motion information for the generative learning objective while also extracting action features for the contrastive learning objective. By swapping contrastive representations between positive pairs (coining the name SwapCLR), we ensure that the generative and contrastive representations are complementary and both objectives contribute to learning a strong representation for downstream tasks like action recognition. Additionally, we address the challenge of noisy data in skeleton-based action recognition with a new saturating reconstruction loss, significantly reducing the impact of noise common to key-point detections. Our method demonstrates state-of-the-art performance in unsupervised action recognition on the NTU and PKU-MMD datasets, while also enabling generative downstream tasks such as motion in-painting and motion generation. Overall, these experimental results confirm the method’s effectiveness and suggest its applicability to a variety of action analysis tasks. |
| format | Article |
| id | doaj-art-9fd955e4ce9e42b585ca5ca393e1e17e |
| institution | OA Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-9fd955e4ce9e42b585ca5ca393e1e17e; 2025-08-20T02:32:41Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; vol. 13; pp. 100095-100104; 10.1109/ACCESS.2025.3575707; 11020639; Integrating Generative and Contrastive Approaches for Human Action Recognition; Pablo Cervantes (https://orcid.org/0000-0002-5256-9317), Institute of Science Tokyo (formerly Tokyo Institute of Technology), Tokyo, Japan; Yusuke Sekikawa (https://orcid.org/0000-0003-1111-5949), Denso IT Laboratory, Minato City, Japan; Ikuro Sato (https://orcid.org/0000-0001-5234-3177), Institute of Science Tokyo (formerly Tokyo Institute of Technology), Tokyo, Japan; Koichi Shinoda (https://orcid.org/0000-0003-1095-3203), Institute of Science Tokyo (formerly Tokyo Institute of Technology), Tokyo, Japan; https://ieeexplore.ieee.org/document/11020639/; Generative and contrastive; representation learning; unsupervised 3D action recognition |
| spellingShingle | Pablo Cervantes; Yusuke Sekikawa; Ikuro Sato; Koichi Shinoda; Integrating Generative and Contrastive Approaches for Human Action Recognition; IEEE Access; Generative and contrastive; representation learning; unsupervised 3D action recognition |
| title | Integrating Generative and Contrastive Approaches for Human Action Recognition |
| title_full | Integrating Generative and Contrastive Approaches for Human Action Recognition |
| title_fullStr | Integrating Generative and Contrastive Approaches for Human Action Recognition |
| title_full_unstemmed | Integrating Generative and Contrastive Approaches for Human Action Recognition |
| title_short | Integrating Generative and Contrastive Approaches for Human Action Recognition |
| title_sort | integrating generative and contrastive approaches for human action recognition |
| topic | Generative and contrastive; representation learning; unsupervised 3D action recognition |
| url | https://ieeexplore.ieee.org/document/11020639/ |
| work_keys_str_mv | AT pablocervantes integratinggenerativeandcontrastiveapproachesforhumanactionrecognition AT yusukesekikawa integratinggenerativeandcontrastiveapproachesforhumanactionrecognition AT ikurosato integratinggenerativeandcontrastiveapproachesforhumanactionrecognition AT koichishinoda integratinggenerativeandcontrastiveapproachesforhumanactionrecognition |