MSA-GCN: Exploiting Multi-Scale Temporal Dynamics With Adaptive Graph Convolution for Skeleton-Based Action Recognition

Bibliographic Details
Main Authors: Kowovi Comivi Alowonou, Ji-Hyeong Han
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Skeleton-based action recognition; GCN; dynamic graph topology; multi-scale temporal processing
Online Access: https://ieeexplore.ieee.org/document/10807218/
_version_ 1850120243765051392
author Kowovi Comivi Alowonou
Ji-Hyeong Han
collection DOAJ
description Graph convolutional networks (GCNs) have been widely used and have achieved remarkable results in skeleton-based action recognition. We note that existing GCN-based approaches rely on local context information of the skeleton joints to construct adaptive graphs for feature aggregation, limiting their ability to understand actions that involve coordinated movements across various parts of the body. An adaptive graph built upon the global context information of the joints can help move beyond this limitation. Therefore, in this paper, we propose a novel approach to skeleton-based action recognition named Multi-stage Adaptive Graph Convolution Network (MSA-GCN). It consists of two modules: Multi-stage Adaptive Graph Convolution (MSA-GC) and Temporal Multi-Scale Transformer (TMST). These two modules work together to capture complex spatial and temporal patterns within skeleton data effectively. Specifically, MSA-GC explores both local and global context information of the joints across all sequences to construct the adaptive graph, which facilitates the understanding of complex and nuanced relationships between joints. The TMST module, in turn, integrates Gated Multi-stage Temporal Convolution (GMSTC) with Temporal Multi-Head Self-Attention (TMHSA) to capture global temporal features and accommodate both long-term and short-term dependencies within action sequences. In extensive experiments on multiple benchmark datasets, including NTU RGB+D 60, NTU RGB+D 120, and Northwestern-UCLA, MSA-GCN achieves state-of-the-art performance, demonstrating its effectiveness in skeleton-based action recognition.
format Article
id doaj-art-0f5e538a67b24206be515c3a3024deb9
institution OA Journals
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-0f5e538a67b24206be515c3a3024deb9; 2025-08-20T02:35:23Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12; pp. 193552-193563; DOI 10.1109/ACCESS.2024.3520172; document 10807218
MSA-GCN: Exploiting Multi-Scale Temporal Dynamics With Adaptive Graph Convolution for Skeleton-Based Action Recognition
Kowovi Comivi Alowonou, Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul, South Korea
Ji-Hyeong Han (https://orcid.org/0000-0001-8391-6898), Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul, South Korea
https://ieeexplore.ieee.org/document/10807218/
Skeleton-based action recognition; GCN; dynamic graph topology; multi-scale temporal processing
title MSA-GCN: Exploiting Multi-Scale Temporal Dynamics With Adaptive Graph Convolution for Skeleton-Based Action Recognition
topic Skeleton-based action recognition
GCN
dynamic graph topology
multi-scale temporal processing
url https://ieeexplore.ieee.org/document/10807218/