MSA-GCN: Exploiting Multi-Scale Temporal Dynamics With Adaptive Graph Convolution for Skeleton-Based Action Recognition

Graph convolutional networks (GCNs) have been widely used and have achieved remarkable results in skeleton-based action recognition. We note that existing GCN-based approaches rely on local context information of the skeleton joints to construct adaptive graphs for feature aggregation, limiting thei...

Bibliographic Details
Main Authors:	Kowovi Comivi Alowonou, Ji-Hyeong Han
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Access
Subjects:	Skeleton-based action recognition; GCN; dynamic graph topology; multi-scale temporal processing
Online Access:	https://ieeexplore.ieee.org/document/10807218/

_version_	1850120243765051392
author	Kowovi Comivi Alowonou; Ji-Hyeong Han
author_facet	Kowovi Comivi Alowonou; Ji-Hyeong Han
author_sort	Kowovi Comivi Alowonou
collection	DOAJ
description	Graph convolutional networks (GCNs) have been widely used and have achieved remarkable results in skeleton-based action recognition. We note that existing GCN-based approaches rely on local context information of the skeleton joints to construct adaptive graphs for feature aggregation, limiting their ability to understand actions that involve coordinated movements across various parts of the body. An adaptive graph built upon the global context information of the joints can help move beyond this limitation. Therefore, in this paper, we propose a novel approach to skeleton-based action recognition named Multi-stage Adaptive Graph Convolution Network (MSA-GCN). It consists of two modules: Multi-stage Adaptive Graph Convolution (MSA-GC) and Temporal Multi-Scale Transformer (TMST). These two modules work together to capture complex spatial and temporal patterns within skeleton data effectively. Specifically, MSA-GC explores both local and global context information of the joints across all sequences to construct the adaptive graph and facilitates the understanding of complex and nuanced relationships between joints. On the other hand, the TMST module integrates a Gated Multi-stage Temporal Convolution (GMSTC) with a Temporal Multi-Head Self-Attention (TMHSA) to capture global temporal features and accommodate both long-term and short-term dependencies within action sequences. Through extensive experiments on multiple benchmark datasets, including NTU RGB+D 60, NTU RGB+D 120, and Northwestern-UCLA, MSA-GCN achieves state-of-the-art performance and verifies its effectiveness in skeleton-based action recognition.
format	Article
id	doaj-art-0f5e538a67b24206be515c3a3024deb9
institution	OA Journals
issn	2169-3536
language	English
publishDate	2024-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-0f5e538a67b24206be515c3a3024deb9; 2025-08-20T02:35:23Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; 12; 193552; 193563; 10.1109/ACCESS.2024.3520172; 10807218; MSA-GCN: Exploiting Multi-Scale Temporal Dynamics With Adaptive Graph Convolution for Skeleton-Based Action Recognition; Kowovi Comivi Alowonou [0]; Ji-Hyeong Han [1]; https://orcid.org/0000-0001-8391-6898; Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul, South Korea; Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul, South Korea; https://ieeexplore.ieee.org/document/10807218/; Skeleton-based action recognition; GCN; dynamic graph topology; multi-scale temporal processing
spellingShingle	Kowovi Comivi Alowonou Ji-Hyeong Han MSA-GCN: Exploiting Multi-Scale Temporal Dynamics With Adaptive Graph Convolution for Skeleton-Based Action Recognition IEEE Access Skeleton-based action recognition GCN dynamic graph topology multi-scale temporal processing
title	MSA-GCN: Exploiting Multi-Scale Temporal Dynamics With Adaptive Graph Convolution for Skeleton-Based Action Recognition
title_full	MSA-GCN: Exploiting Multi-Scale Temporal Dynamics With Adaptive Graph Convolution for Skeleton-Based Action Recognition
title_fullStr	MSA-GCN: Exploiting Multi-Scale Temporal Dynamics With Adaptive Graph Convolution for Skeleton-Based Action Recognition
title_full_unstemmed	MSA-GCN: Exploiting Multi-Scale Temporal Dynamics With Adaptive Graph Convolution for Skeleton-Based Action Recognition
title_short	MSA-GCN: Exploiting Multi-Scale Temporal Dynamics With Adaptive Graph Convolution for Skeleton-Based Action Recognition
title_sort	msa gcn exploiting multi scale temporal dynamics with adaptive graph convolution for skeleton based action recognition
topic	Skeleton-based action recognition; GCN; dynamic graph topology; multi-scale temporal processing
url	https://ieeexplore.ieee.org/document/10807218/
work_keys_str_mv	AT kowovicomivialowonou msagcnexploitingmultiscaletemporaldynamicswithadaptivegraphconvolutionforskeletonbasedactionrecognition AT jihyeonghan msagcnexploitingmultiscaletemporaldynamicswithadaptivegraphconvolutionforskeletonbasedactionrecognition

Similar Items