Contrastive Mask Learning for Self-Supervised 3D Skeleton-Based Action Recognition

In this paper, we propose a contrastive mask learning (CML) method for self-supervised 3D skeleton-based action recognition. Specifically, the mask modeling mechanism is integrated into multi-level contrastive learning with the aim of forming a mutually beneficial learning scheme from both contrasti...

Bibliographic Details
Main Author: Haoyuan Zhang
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Sensors
Subjects: self-supervised learning; contrastive mask learning; 3D skeleton action recognition
Online Access: https://www.mdpi.com/1424-8220/25/5/1521
_version_ 1850030735234170880
author Haoyuan Zhang
author_facet Haoyuan Zhang
author_sort Haoyuan Zhang
collection DOAJ
description In this paper, we propose a contrastive mask learning (CML) method for self-supervised 3D skeleton-based action recognition. Specifically, the mask modeling mechanism is integrated into multi-level contrastive learning with the aim of forming a mutually beneficial learning scheme from both contrastive learning and masked skeleton reconstruction. The contrastive objective is extended from an individual skeleton instance to clusters by closing the gap between cluster assignment from different instances of the same category, with the goal of pursuing inter-instance consistency. Compared with previous methods, CML integrates contrastive and masked learning comprehensively and enables intra-/inter-instance consistency pursuit via multi-level contrast, which leads to more discriminative skeleton representation learning. Our extensive evaluation of the challenging NTU RGB+D and PKU-MMD benchmarks demonstrates that representations learned via CML exhibit superior discriminability, consistently outperforming state-of-the-art methods in terms of action recognition accuracy.
format Article
id doaj-art-7cfa836396cd48938e40a49e38dcb1e2
institution DOAJ
issn 1424-8220
language English
publishDate 2025-02-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj-art-7cfa836396cd48938e40a49e38dcb1e2 | 2025-08-20T02:59:08Z | eng | MDPI AG | Sensors | 1424-8220 | 2025-02-01 | 25 | 5 | 1521 | 10.3390/s25051521 | Contrastive Mask Learning for Self-Supervised 3D Skeleton-Based Action Recognition | Haoyuan Zhang | 0 | School of Electrical and Information Engineering, North Minzu University, Yinchuan 750021, China | In this paper, we propose a contrastive mask learning (CML) method for self-supervised 3D skeleton-based action recognition. Specifically, the mask modeling mechanism is integrated into multi-level contrastive learning with the aim of forming a mutually beneficial learning scheme from both contrastive learning and masked skeleton reconstruction. The contrastive objective is extended from an individual skeleton instance to clusters by closing the gap between cluster assignment from different instances of the same category, with the goal of pursuing inter-instance consistency. Compared with previous methods, CML integrates contrastive and masked learning comprehensively and enables intra-/inter-instance consistency pursuit via multi-level contrast, which leads to more discriminative skeleton representation learning. Our extensive evaluation of the challenging NTU RGB+D and PKU-MMD benchmarks demonstrates that representations learned via CML exhibit superior discriminability, consistently outperforming state-of-the-art methods in terms of action recognition accuracy. | https://www.mdpi.com/1424-8220/25/5/1521 | self-supervised learning | contrastive mask learning | 3D skeleton action recognition
spellingShingle Haoyuan Zhang
Contrastive Mask Learning for Self-Supervised 3D Skeleton-Based Action Recognition
Sensors
self-supervised learning
contrastive mask learning
3D skeleton action recognition
title Contrastive Mask Learning for Self-Supervised 3D Skeleton-Based Action Recognition
title_full Contrastive Mask Learning for Self-Supervised 3D Skeleton-Based Action Recognition
title_fullStr Contrastive Mask Learning for Self-Supervised 3D Skeleton-Based Action Recognition
title_full_unstemmed Contrastive Mask Learning for Self-Supervised 3D Skeleton-Based Action Recognition
title_short Contrastive Mask Learning for Self-Supervised 3D Skeleton-Based Action Recognition
title_sort contrastive mask learning for self supervised 3d skeleton based action recognition
topic self-supervised learning
contrastive mask learning
3D skeleton action recognition
url https://www.mdpi.com/1424-8220/25/5/1521
work_keys_str_mv AT haoyuanzhang contrastivemasklearningforselfsupervised3dskeletonbasedactionrecognition