KSC-Net: a biologically inspired spatio-temporal correlation network for video-based human action recognition

Bibliographic Details
Main Authors: Hui Ma, Xuelian Ma
Format: Article
Language: English
Published: Springer 2025-08-01
Series: Discover Applied Sciences
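Subjects: Spatio-temporal correlation network; Human perception; Dynamic filter; Spatio-temporal self-similarity gated module; Action recognition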
Online Access: https://doi.org/10.1007/s42452-025-07504-1
author Hui Ma
Xuelian Ma
collection DOAJ
description Video-based action recognition remains challenging because it is difficult to model spatio-temporal dynamics accurately and to distinguish foreground motion from static background clutter. Existing methods often struggle to capture long-range temporal dependencies and tend to overfit to irrelevant background features, which reduces recognition performance in complex scenes. To address these limitations, we propose a biologically inspired two-branch convolutional network, termed the Key-information Spatio-temporal Correlation Network (KSC-Net). The architecture integrates two novel modules. First, a Dynamic Feature Filter (DF) enhances sensitivity to salient motion by suppressing redundant visual signals through second-order temporal differencing and Laplacian-based spatial filtering. This module mimics the edge-enhancing and motion-focusing mechanisms of human vision. Second, a Spatio-temporal Self-similarity Gated (SG) module captures long-range correlations by computing feature similarity across frames and adaptively regulating memory propagation with a bi-directional gated structure and temporal offset pooling. Extensive experiments on public benchmarks, including Kinetics-400, UCF-101, and HMDB-51, show that the proposed model achieves higher Top-1 recognition accuracy than state-of-the-art methods, validating the effectiveness of the proposed biologically inspired spatio-temporal modeling framework.
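The operations named in the description can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record gives no equations, so the function names, the 4-neighbour Laplacian stencil, the edge-replicated padding, and the use of cosine similarity are all assumptions chosen for illustration. It shows only the generic building blocks (second-order temporal difference, Laplacian spatial filtering, frame-to-frame feature similarity), not the DF or SG modules themselves, and it omits the gating and temporal offset pooling.

```python
import numpy as np

# Assumed 3x3 discrete Laplacian (4-neighbour stencil); the paper's kernel may differ.
LAPLACIAN = np.array([[0.0, 1.0, 0.0],
                      [1.0, -4.0, 1.0],
                      [0.0, 1.0, 0.0]])


def second_order_temporal_difference(frames):
    """Map frames of shape (T, H, W) to x[t+1] - 2*x[t] + x[t-1], shape (T-2, H, W).

    Static content and constant-rate changes cancel out; only changes in the
    rate of change (accelerating motion) remain.
    """
    frames = np.asarray(frames, dtype=np.float64)
    return frames[2:] - 2.0 * frames[1:-1] + frames[:-2]


def laplacian_filter(frames):
    """Apply the 3x3 Laplacian to every frame, using edge-replicated padding."""
    frames = np.asarray(frames, dtype=np.float64)
    padded = np.pad(frames, ((0, 0), (1, 1), (1, 1)), mode="edge")
    h, w = frames.shape[1:]
    out = np.zeros_like(frames)
    # Accumulate the nine shifted copies weighted by the kernel entries.
    for dy in range(3):
        for dx in range(3):
            out += LAPLACIAN[dy, dx] * padded[:, dy:dy + h, dx:dx + w]
    return out


def frame_self_similarity(features):
    """Map per-frame feature vectors (T, D) to a (T, T) cosine-similarity matrix."""
    features = np.asarray(features, dtype=np.float64)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T
```

For a clip stored as a `(T, H, W)` array, `laplacian_filter(second_order_temporal_difference(clip))` gives one possible motion-and-edge response map per interior frame; the order of the two filters and how the result would feed a network are left open here.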
format Article
id doaj-art-712bdd4e72b14e979f73b57c26de2fed
institution DOAJ
issn 3004-9261
language English
publishDate 2025-08-01
publisher Springer
record_format Article
series Discover Applied Sciences
spelling doaj-art-712bdd4e72b14e979f73b57c26de2fed
2025-08-20T03:05:55Z
eng
Springer
Discover Applied Sciences
3004-9261
2025-08-01
Volume 7, issue 8, pages 1-19
10.1007/s42452-025-07504-1
KSC-Net: a biologically inspired spatio-temporal correlation network for video-based human action recognition
Hui Ma [0]
Xuelian Ma [1]
[0] Hebei Agricultural University
[1] Department of Physical Education, Hebei Vocational University of Technology and Engineering
https://doi.org/10.1007/s42452-025-07504-1
Spatio-temporal correlation network
Human perception
Dynamic filter
Spatio-temporal self-similarity gated module
Action recognition
title KSC-Net: a biologically inspired spatio-temporal correlation network for video-based human action recognition
topic Spatio-temporal correlation network
Human perception
Dynamic filter
Spatio-temporal self-similarity gated module
Action recognition
url https://doi.org/10.1007/s42452-025-07504-1