Enhanced Temporal Action Localization with Separated Bidirectional Mamba and Boundary Correction Strategy

Temporal action localization (TAL) is a research hotspot in video understanding, which aims to locate and classify actions in videos. However, existing methods have difficulties in capturing long-term actions due to focusing on local temporal information, which leads to poor performance in localizin...

Bibliographic Details
Main Authors: Xiangbin Liu, Qian Peng
Format: Article
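Language: English
Published: MDPI AG 2025-07-01
Series: Mathematics
Subjects: video understanding; temporal action localization; separated bidirectional mamba; boundary correction strategy
Online Access: https://www.mdpi.com/2227-7390/13/15/2458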
collection DOAJ
description Temporal action localization (TAL), a research hotspot in video understanding, aims to locate and classify actions in videos. However, existing methods struggle to capture long-term actions because they focus on local temporal information, which leads to poor localization performance on long temporal sequences. In addition, most methods ignore the importance of boundaries for action instances, resulting in inaccurately localized boundaries. To address these issues, this paper proposes a state space model for temporal action localization, called Separated Bidirectional Mamba (SBM), which interprets frame changes from the perspective of state transformation. It adapts to different sequence lengths and, through a forward Mamba and a backward Mamba, incorporates state information from both the forward and backward directions for each frame, yielding more comprehensive action representations and stronger modeling of long-term temporal sequences. Moreover, this paper designs a Boundary Correction Strategy (BCS), which calculates the contribution of each frame to action instances based on the pre-localized results and then adjusts the weights of frames in boundary regression so that boundaries shift towards the frames with higher contributions, leading to more accurate boundaries. To demonstrate the effectiveness of the proposed method, this paper reports mean Average Precision (mAP) under temporal Intersection over Union (tIoU) thresholds on four challenging benchmarks: THUMOS14, ActivityNet-1.3, HACS, and FineAction, where the proposed method achieves mAPs of 73.7%, 42.0%, 45.2%, and 29.1%, respectively, surpassing state-of-the-art approaches.
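The description reports mAP under tIoU thresholds without spelling out the overlap measure. As a minimal sketch of the standard tIoU definition (not code from the paper), the overlap between a predicted segment and a ground-truth segment can be computed as below. The function names `temporal_iou` and `is_match`, the (start, end) tuple layout, and the 0.5 threshold in the usage lines are illustrative assumptions, not values taken from this record.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two (start, end) segments given in the same time unit."""
    pred_start, pred_end = pred
    gt_start, gt_end = gt
    # Length of the overlap; zero when the segments are disjoint.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    # Union = sum of both lengths minus the overlap counted twice.
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(pred, gt, threshold):
    """Hypothetical helper: a prediction counts as a hit at a given tIoU threshold."""
    return temporal_iou(pred, gt) >= threshold


# Usage: a prediction covering 0-10 s against a ground truth at 5-15 s.
overlap = temporal_iou((0.0, 10.0), (5.0, 15.0))
hit = is_match((0.0, 10.0), (5.0, 15.0), threshold=0.5)
```

Average precision is then computed per class from predictions ranked by confidence, with hits decided at each tIoU threshold, and mAP averages over classes; that ranking step is omitted here.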
id doaj-art-3d35dbe307ad4e27862ca7d060a80b91
institution Kabale University
issn 2227-7390
spelling doaj-art-3d35dbe307ad4e27862ca7d060a80b91; 2025-08-20T03:36:26Z; eng; MDPI AG; Mathematics; 2227-7390; 2025-07-01; 13(15):2458; doi:10.3390/math13152458; Xiangbin Liu, Qian Peng (College of Information Science and Engineering, Hunan Normal University, Changsha 410081, China); https://www.mdpi.com/2227-7390/13/15/2458
topic video understanding
temporal action localization
separated bidirectional mamba
boundary correction strategy