Lightweight and efficient skeleton-based sports activity recognition with ASTM-Net.

Human Activity Recognition (HAR) plays a pivotal role in video understanding, with applications ranging from surveillance to virtual reality. Skeletal data has emerged as a robust modality for HAR, overcoming challenges such as noisy backgrounds and lighting variations. However, current Graph Convol...

Bibliographic Details
Main Authors: Bin Wu, Mei Xue, Ying Jia, Ning Zhang, GuoJin Zhao, XiuPing Wang, Chunlei Zhang
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0324605
collection DOAJ
description Human Activity Recognition (HAR) plays a pivotal role in video understanding, with applications ranging from surveillance to virtual reality. Skeletal data has emerged as a robust modality for HAR because it is largely unaffected by noisy backgrounds and lighting variations. However, current Graph Convolutional Network (GCNN)-based methods for skeletal activity recognition face two key limitations: (1) they fail to capture the dynamic changes in node affinities induced by movement, and (2) they overlook the interplay between spatial and temporal information that is critical for recognizing complex actions. To address these challenges, we propose ASTM-Net, an Activity-aware SpatioTemporal Multi-branch graph convolutional network comprising two novel modules. First, the Activity-aware Spatial Graph convolution Module (ASGM) dynamically models Activity-Aware Adjacency Graphs (3A-Graphs) by fusing a manually initialized physical graph, a learnable graph optimized end-to-end, and a dynamically inferred, activity-related graph, thereby capturing evolving spatial affinities. Second, the Temporal Multi-branch Graph convolution Module (TMGM) employs parallel branches of channel reduction, dilated temporal convolutions with varied dilation rates, pooling, and pointwise convolutions to model both fine-grained and long-range temporal dependencies. This multi-branch design accommodates diverse action speeds and durations while remaining parameter-efficient. By integrating ASGM and TMGM, ASTM-Net jointly captures spatial-temporal mutualities at significantly reduced computational cost.
Extensive experiments on NTU-RGB+D, NTU-RGB+D 120, and Toyota Smarthome demonstrate ASTM-Net's superiority: it outperforms DualHead-Net-ALLs by 0.31% on NTU-RGB+D X-Sub and surpasses SkateFormer by 2.22% on Toyota Smarthome Cross-Subject; it reduces parameters by 51.9% and FLOPs by 49.7% compared to MST-GCNN-ALLs while improving accuracy by 0.82%; and under 30% random node occlusion, it achieves 86.94% accuracy, 3.49% higher than CBAM-STGCN.
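The two modules named in the description can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not say how the three graphs are fused or how the temporal branches are combined, so the element-wise sum, the dot-product-plus-softmax similarity, the zero padding, and all function names below (`dynamic_graph`, `fuse_graphs`, `dilated_conv1d`, `multi_branch_temporal`) are assumptions chosen for illustration only.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_graph(features):
    """Infer a data-dependent adjacency from per-joint feature vectors.

    Assumption: dot-product similarity followed by a row-wise softmax.
    """
    scores = [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
              for fi in features]
    return [softmax(row) for row in scores]


def fuse_graphs(physical, learnable, dynamic):
    """Fuse the three adjacency matrices (assumption: element-wise sum)."""
    n = len(physical)
    return [[physical[i][j] + learnable[i][j] + dynamic[i][j]
             for j in range(n)] for i in range(n)]


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D dilated convolution along the time axis."""
    k = len(kernel)
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t + (i - k // 2) * dilation  # tap position for this weight
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def multi_branch_temporal(signal, kernel, dilations):
    """Run one branch per dilation rate and return the branch outputs."""
    return [dilated_conv1d(signal, kernel, d) for d in dilations]


# Hypothetical 3-joint chain: joints 0 and 2 are not physically connected.
physical = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
learnable = [[0.0] * 3 for _ in range(3)]       # would be trained end-to-end
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # joints 0 and 2 move alike

dyn = dynamic_graph(features)
fused = fuse_graphs(physical, learnable, dyn)
branches = multi_branch_temporal([1, 2, 3, 4, 5], [1, 1, 1], [1, 2])
```

In this toy setup the fused graph gives joints 0 and 2 a nonzero affinity even though the physical graph has no edge between them, and the branch with dilation 2 covers a wider temporal window than the branch with dilation 1 using the same three kernel weights.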
id doaj-art-ffdb3fd5a62642e8b0f051c96ebefe88
institution Kabale University
issn 1932-6203