Enhanced Temporal Action Localization with Separated Bidirectional Mamba and Boundary Correction Strategy
Temporal action localization (TAL) is a research hotspot in video understanding, which aims to locate and classify actions in videos. However, existing methods have difficulties in capturing long-term actions due to focusing on local temporal information, which leads to poor performance in localizin...
| Main Authors: | Xiangbin Liu, Qian Peng |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Mathematics |
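| Subjects: | video understanding; temporal action localization; separated bidirectional mamba; boundary correction strategy |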
| Online Access: | https://www.mdpi.com/2227-7390/13/15/2458 |
| _version_ | 1849406318788476928 |
|---|---|
| author | Xiangbin Liu Qian Peng |
| author_facet | Xiangbin Liu Qian Peng |
| author_sort | Xiangbin Liu |
| collection | DOAJ |
| description | Temporal action localization (TAL) is a research hotspot in video understanding that aims to locate and classify actions in videos. However, existing methods struggle to capture long-term actions because they focus on local temporal information, which leads to poor performance in localizing long-term temporal sequences. In addition, most methods ignore the importance of boundaries for action instances, resulting in inaccurate localized boundaries. To address these issues, this paper proposes a state space model for temporal action localization, called Separated Bidirectional Mamba (SBM), which understands frame changes from the perspective of state transformation. It adapts to different sequence lengths and incorporates forward and backward state information for each frame through a forward Mamba and a backward Mamba to obtain more comprehensive action representations, enhancing modeling capabilities for long-term temporal sequences. Moreover, this paper designs a Boundary Correction Strategy (BCS). It calculates the contribution of each frame to action instances based on the pre-localized results, then adjusts the weights of frames in boundary regression so that the boundaries shift towards the frames with higher contributions, leading to more accurate boundaries. To demonstrate the effectiveness of the proposed method, this paper reports mean Average Precision (mAP) under temporal Intersection over Union (tIoU) thresholds on four challenging benchmarks: THUMOS14, ActivityNet-1.3, HACS, and FineAction, where the proposed method achieves mAPs of 73.7%, 42.0%, 45.2%, and 29.1%, respectively, surpassing the state-of-the-art approaches. |
| format | Article |
| id | doaj-art-3d35dbe307ad4e27862ca7d060a80b91 |
| institution | Kabale University |
| issn | 2227-7390 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Mathematics |
| spelling | doaj-art-3d35dbe307ad4e27862ca7d060a80b91; 2025-08-20T03:36:26Z; eng; MDPI AG; Mathematics; 2227-7390; 2025-07-01; 13; 15; 2458; 10.3390/math13152458; Enhanced Temporal Action Localization with Separated Bidirectional Mamba and Boundary Correction Strategy; Xiangbin Liu; Qian Peng; College of Information Science and Engineering, Hunan Normal University, Changsha 410081, China (both authors); https://www.mdpi.com/2227-7390/13/15/2458; video understanding; temporal action localization; separated bidirectional mamba; boundary correction strategy |
| spellingShingle | Xiangbin Liu Qian Peng Enhanced Temporal Action Localization with Separated Bidirectional Mamba and Boundary Correction Strategy Mathematics video understanding temporal action localization separated bidirectional mamba boundary correction strategy |
| title | Enhanced Temporal Action Localization with Separated Bidirectional Mamba and Boundary Correction Strategy |
| topic | video understanding temporal action localization separated bidirectional mamba boundary correction strategy |
| url | https://www.mdpi.com/2227-7390/13/15/2458 |
| work_keys_str_mv | AT xiangbinliu enhancedtemporalactionlocalizationwithseparatedbidirectionalmambaandboundarycorrectionstrategy AT qianpeng enhancedtemporalactionlocalizationwithseparatedbidirectionalmambaandboundarycorrectionstrategy |
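
The temporal Intersection over Union (tIoU) criterion named in the description can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `temporal_iou` and `is_match`, the `(start, end)` tuples in seconds, and the 0.5 threshold are assumptions chosen for illustration only.

```python
from typing import Tuple

Segment = Tuple[float, float]  # hypothetical (start, end) in seconds


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Return the overlap of two temporal segments divided by their union."""
    # Overlap is the shared duration; it is clamped to 0 for disjoint segments.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Segment, gt: Segment, threshold: float = 0.5) -> bool:
    """Treat a prediction as matching when its tIoU reaches the threshold."""
    return temporal_iou(pred, gt) >= threshold


if __name__ == "__main__":
    # Segments 2-8 s and 4-10 s share 4 s out of a combined 8 s.
    print(temporal_iou((2.0, 8.0), (4.0, 10.0)))
    print(is_match((2.0, 8.0), (4.0, 10.0)))
```

Raising the threshold demands tighter boundaries before a prediction is accepted, which is why boundary accuracy matters for mAP reported under tIoU thresholds.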