Pseudo label refining for semi-supervised temporal action localization.
Training temporal action localization models relies heavily on large amounts of manually annotated data, and video annotation is more tedious and time-consuming than image annotation. Therefore, the semi-supervised method that combines labeled and unlabeled data for joint training has a...
| Main Authors: | Lingwen Meng, Guobang Ban, Guanghui Xi, Siqi Guo |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2025-01-01 |
| Series: | PLoS ONE |
| Online Access: | https://doi.org/10.1371/journal.pone.0318418 |
| _version_ | 1850187763632046080 |
|---|---|
| author | Lingwen Meng, Guobang Ban, Guanghui Xi, Siqi Guo |
| collection | DOAJ |
| description | Training temporal action localization (TAL) models relies heavily on large amounts of manually annotated data, and video annotation is more tedious and time-consuming than image annotation. Semi-supervised methods that train jointly on labeled and unlabeled data have therefore attracted increasing attention from academia and industry. This study proposes pseudo-label refining (PLR), a method based on the teacher-student framework that consists of three key components. First, pseudo-label self-refinement uses temporal region-of-interest pooling to improve the boundary accuracy of TAL pseudo labels. Second, a boundary synthesis module further refines the temporal intervals of the pseudo labels using multiple inferences. Finally, an adaptive weight learning strategy is tailored for progressively learning from pseudo labels of differing quality. With ActionFormer and BMN as detectors, the method achieves significant improvements on the THUMOS14 and ActivityNet v1.3 datasets. Experimental results show that it significantly improves localization accuracy over other advanced semi-supervised TAL (SSTAL) methods at label rates of 10% to 60%. Ablation experiments further show the effectiveness of each module, demonstrating that PLR improves the accuracy of the pseudo labels produced by teacher-model inference. |
| format | Article |
| id | doaj-art-0ad7ef4ced3b4c8fa69740da9926aebd |
| institution | OA Journals |
| issn | 1932-6203 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | Public Library of Science (PLoS) |
| record_format | Article |
| series | PLoS ONE |
| title | Pseudo label refining for semi-supervised temporal action localization. |
| url | https://doi.org/10.1371/journal.pone.0318418 |