KSC-Net: a biologically inspired spatio-temporal correlation network for video-based human action recognition
Abstract: Video-based action recognition remains a challenging task due to the difficulty in accurately modeling spatio-temporal dynamics and distinguishing foreground motion from static background clutter. Existing methods often struggle with capturing long-range temporal dependencies and tend to ov...
| Main Authors: | Hui Ma, Xuelian Ma |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-08-01 |
| Series: | Discover Applied Sciences |
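| Subjects: | Spatio-temporal correlation network; Human perception; Dynamic filter; Spatio-temporal self-similarity gated module; Action recognition |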
| Online Access: | https://doi.org/10.1007/s42452-025-07504-1 |
| _version_ | 1849761767899529216 |
|---|---|
| author | Hui Ma; Xuelian Ma |
| collection | DOAJ |
| description | Abstract: Video-based action recognition remains a challenging task due to the difficulty in accurately modeling spatio-temporal dynamics and distinguishing foreground motion from static background clutter. Existing methods often struggle with capturing long-range temporal dependencies and tend to overfit to irrelevant background features, leading to reduced recognition performance in complex scenes. To address these limitations, we propose a biologically inspired two-branch convolutional network, termed Key-information Spatio-temporal Correlation Network (KSC-Net). The architecture integrates two novel modules. First, a Dynamic Feature Filter (DF) is introduced to enhance sensitivity to salient motion by suppressing redundant visual signals through second-order temporal difference and Laplacian-based spatial filtering. This module mimics the edge-enhancing and motion-focusing mechanisms of human vision. Second, a Spatio-temporal Self-similarity Gated (SG) module captures long-range correlations by computing feature similarity across frames and adaptively regulating memory propagation using a bi-directional gated structure with temporal offset pooling. Extensive experiments on public benchmarks including Kinetics-400, UCF-101, and HMDB-51 demonstrate that our proposed model achieves superior Top-1 recognition accuracy compared to state-of-the-art methods, validating the effectiveness of the proposed biologically inspired spatio-temporal modeling framework. |
| format | Article |
| id | doaj-art-712bdd4e72b14e979f73b57c26de2fed |
| institution | DOAJ |
| issn | 3004-9261 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Springer |
| record_format | Article |
| series | Discover Applied Sciences |
| spelling | doaj-art-712bdd4e72b14e979f73b57c26de2fed; 2025-08-20T03:05:55Z; eng; Springer; Discover Applied Sciences; 3004-9261; 2025-08-01; volume 7, issue 8, pages 1-19; 10.1007/s42452-025-07504-1; KSC-Net: a biologically inspired spatio-temporal correlation network for video-based human action recognition; Hui Ma (Hebei Agricultural University); Xuelian Ma (Department of Physical Education, Hebei Vocational University of Technology and Engineering); https://doi.org/10.1007/s42452-025-07504-1; Spatio-temporal correlation network; Human perception; Dynamic filter; Spatio-temporal self-similarity gated module; Action recognition |
| title | KSC-Net: a biologically inspired spatio-temporal correlation network for video-based human action recognition |
| topic | Spatio-temporal correlation network; Human perception; Dynamic filter; Spatio-temporal self-similarity gated module; Action recognition |
| url | https://doi.org/10.1007/s42452-025-07504-1 |
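
The abstract names two operations without giving formulas: a second-order temporal difference combined with Laplacian spatial filtering (the DF module), and feature similarity computed across frames (the SG module). The following is a minimal illustrative sketch of those two ideas in plain Python, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the order of the two filters (Laplacian applied to the temporal difference), the 4-neighbour Laplacian stencil with zero padding, and the use of cosine similarity. The gating, bi-directional structure, and temporal offset pooling described in the record are not modeled.

```python
import math
from typing import List, Sequence

Frame = List[List[float]]  # one grayscale frame as H rows of W values


def second_order_temporal_difference(frames: Sequence[Frame]) -> List[Frame]:
    """Return x[t+1] - 2*x[t] + x[t-1] for every interior frame t."""
    out = []
    for t in range(1, len(frames) - 1):
        prev, cur, nxt = frames[t - 1], frames[t], frames[t + 1]
        out.append([
            [n - 2 * c + p for p, c, n in zip(p_row, c_row, n_row)]
            for p_row, c_row, n_row in zip(prev, cur, nxt)
        ])
    return out


def laplacian(frame: Frame) -> Frame:
    """4-neighbour discrete Laplacian; pixels outside the frame count as 0."""
    h, w = len(frame), len(frame[0])

    def at(i: int, j: int) -> float:
        return frame[i][j] if 0 <= i < h and 0 <= j < w else 0.0

    return [
        [at(i - 1, j) + at(i + 1, j) + at(i, j - 1) + at(i, j + 1) - 4 * at(i, j)
         for j in range(w)]
        for i in range(h)
    ]


def dynamic_filter_sketch(frames: Sequence[Frame]) -> List[Frame]:
    """Assumed composition: Laplacian of the second-order temporal difference."""
    return [laplacian(d) for d in second_order_temporal_difference(frames)]


def self_similarity(features: Sequence[Sequence[float]]) -> List[List[float]]:
    """Cosine similarity between every pair of per-frame feature vectors."""
    norms = [math.sqrt(sum(v * v for v in f)) for f in features]

    def cos(a, b, na, nb) -> float:
        if na == 0.0 or nb == 0.0:
            return 0.0  # convention chosen here for all-zero vectors
        return sum(x * y for x, y in zip(a, b)) / (na * nb)

    return [
        [cos(a, b, na, nb) for b, nb in zip(features, norms)]
        for a, na in zip(features, norms)
    ]


if __name__ == "__main__":
    # Static background: three identical frames produce an all-zero response.
    static = [[[5.0] * 3 for _ in range(3)] for _ in range(3)]
    print(dynamic_filter_sketch(static))

    # A pixel that is lit only in the middle frame produces a strong response.
    blank = [[0.0] * 3 for _ in range(3)]
    lit = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
    print(dynamic_filter_sketch([blank, lit, blank]))
```

In this sketch a static background cancels out in the temporal difference, so only pixels whose intensity changes non-linearly over time survive, and the Laplacian then emphasizes their spatial edges. That is one plausible reading of "suppressing redundant visual signals"; the published article should be consulted for the actual formulation.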