KSC-Net: a biologically inspired spatio-temporal correlation network for video-based human action recognition
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-08-01 |
| Series: | Discover Applied Sciences |
| Subjects: | |
| Online Access: | https://doi.org/10.1007/s42452-025-07504-1 |
| Summary: | Video-based action recognition remains a challenging task due to the difficulty in accurately modeling spatio-temporal dynamics and distinguishing foreground motion from static background clutter. Existing methods often struggle with capturing long-range temporal dependencies and tend to overfit to irrelevant background features, leading to reduced recognition performance in complex scenes. To address these limitations, we propose a biologically inspired two-branch convolutional network, termed Key-information Spatio-temporal Correlation Network (KSC-Net). The architecture integrates two novel modules. First, a Dynamic Feature Filter (DF) is introduced to enhance sensitivity to salient motion by suppressing redundant visual signals through second-order temporal difference and Laplacian-based spatial filtering. This module mimics the edge-enhancing and motion-focusing mechanisms of human vision. Second, a Spatio-temporal Self-similarity Gated (SG) module captures long-range correlations by computing feature similarity across frames and adaptively regulating memory propagation using a bi-directional gated structure with temporal offset pooling. Extensive experiments on public benchmarks including Kinetics-400, UCF-101, and HMDB-51 demonstrate that our proposed model achieves superior Top-1 recognition accuracy compared to state-of-the-art methods, validating the effectiveness of the proposed biologically inspired spatio-temporal modeling framework. |
| ISSN: | 3004-9261 |
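
The summary names three generic operations: second-order temporal difference, Laplacian-based spatial filtering, and feature similarity across frames. The NumPy sketch below illustrates only those textbook operations on toy arrays. It is not the authors' DF or SG module: the function names, array shapes, edge-replicated padding, 4-neighbour Laplacian stencil, and cosine similarity measure are all assumptions chosen for illustration, and the gating and temporal offset pooling described in the summary are not modelled.

```python
import numpy as np


def second_order_temporal_difference(frames):
    """Assumed form: x[t+1] - 2*x[t] + x[t-1] for frames of shape (T, H, W).

    Returns an array of shape (T-2, H, W). Static content and constant-rate
    intensity change both cancel to zero.
    """
    frames = np.asarray(frames, dtype=np.float64)
    return frames[2:] - 2.0 * frames[1:-1] + frames[:-2]


def laplacian_filter(frames):
    """4-neighbour discrete Laplacian applied to each frame independently.

    Borders use edge-replicated padding (an illustrative choice), so the
    output keeps the input shape (T, H, W).
    """
    frames = np.asarray(frames, dtype=np.float64)
    p = np.pad(frames, ((0, 0), (1, 1), (1, 1)), mode="edge")
    return (
        p[:, :-2, 1:-1]      # neighbour above
        + p[:, 2:, 1:-1]     # neighbour below
        + p[:, 1:-1, :-2]    # neighbour to the left
        + p[:, 1:-1, 2:]     # neighbour to the right
        - 4.0 * p[:, 1:-1, 1:-1]
    )


def frame_self_similarity(features):
    """Cosine similarity between per-frame feature vectors.

    features has shape (T, D); the result has shape (T, T).
    """
    f = np.asarray(features, dtype=np.float64)
    norms = np.linalg.norm(f, axis=1, keepdims=True)
    f = f / np.maximum(norms, 1e-12)  # guard against zero vectors
    return f @ f.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.random((8, 16, 16))            # toy clip: 8 frames of 16x16
    motion = second_order_temporal_difference(clip)
    edges = laplacian_filter(motion)
    sim = frame_self_similarity(edges.reshape(edges.shape[0], -1))
    print(motion.shape, edges.shape, sim.shape)
```

In this toy pipeline the temporal difference removes static background, the Laplacian emphasises spatial edges in what remains, and the similarity matrix relates every filtered frame to every other frame.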