Origin centric and part based pose decomposition for 3D human pose estimation
Abstract: Transformer-based approaches have recently made significant advances in 3D human pose estimation from 2D inputs. Existing methods typically either consider the entire 2D skeleton for global feature extraction or break it into independent parts for local feature learning. However, capt...
| Main Authors: | Zhijie Lin, Jinxin Yao, Juan Huang, Jingjing Chen, Yingying Xu, Lu Yang, Lei Zhao, Wei Xing |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-08-01 |
| Series: | Scientific Reports |
| Online Access: | https://doi.org/10.1038/s41598-025-16381-y |
| _version_ | 1849226401199161344 |
|---|---|
| author | Zhijie Lin; Jinxin Yao; Juan Huang; Jingjing Chen; Yingying Xu; Lu Yang; Lei Zhao; Wei Xing |
| author_sort | Zhijie Lin |
| collection | DOAJ |
| description | Abstract: Transformer-based approaches have recently made significant advances in 3D human pose estimation from 2D inputs. Existing methods typically either consider the entire 2D skeleton for global feature extraction or break it into independent parts for local feature learning. However, capturing the spatial dependencies of the entire 2D skeleton does not effectively facilitate learning local spatial features, while partitioning the skeleton into independent segments disrupts the relevance of individual joints to the whole. In this paper, we propose a novel Origin-centric Part Transformer (OPFormer) block that addresses this issue in two steps: Skeleton Separation and Skeleton Recombination. First, Skeleton Separation divides the 2D skeleton into several distinct parts, enabling the extraction of fine-grained local spatial features that accurately reflect the geometric structure of the human body. Second, we introduce the concept of a human skeleton Origin, which serves as a central hub to reconnect the different parts through Skeleton Recombination. The resulting local features, when fused with global features from the Spatial Transformer Encoder, yield more accurate 3D results. Comprehensive experiments on the Human3.6M and MPI-INF-3DHP benchmark datasets verify that our approach attains state-of-the-art performance. Notably, OPFormer achieves a Mean Per Joint Position Error (MPJPE) of 37.6 mm on the Human3.6M dataset without any additional training data. |
| format | Article |
| id | doaj-art-0bb8d58ff5d341e882eef4436d8b960d |
| institution | Kabale University |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-0bb8d58ff5d341e882eef4436d8b960d; 2025-08-24T11:19:14Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-08-01; 15; 1; 1; 13; 10.1038/s41598-025-16381-y; Origin centric and part based pose decomposition for 3D human pose estimation; Zhijie Lin (School of Information and Electronic Engineering, Zhejiang University of Science and Technology); Jinxin Yao (School of Information and Electronic Engineering, Zhejiang University of Science and Technology); Juan Huang (School of Biological and Chemical Engineering, Zhejiang University of Science and Technology); Jingjing Chen (Faculty of Science, Hong Kong Baptist University); Yingying Xu (School of Humanities and Social Science, Beihang University); Lu Yang (Office of Education Affair, China Jiliang University); Lei Zhao (College of Computer Science and Technology, Zhejiang University); Wei Xing (College of Computer Science and Technology, Zhejiang University); https://doi.org/10.1038/s41598-025-16381-y |
| title | Origin centric and part based pose decomposition for 3D human pose estimation |
| url | https://doi.org/10.1038/s41598-025-16381-y |
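
Note on the metric: the abstract reports accuracy as Mean Per Joint Position Error (MPJPE), which is the average Euclidean distance between each predicted 3D joint and its ground-truth position. The following is a minimal sketch of that definition for a single pose, not the authors' evaluation code. The function name `mpjpe` and the three-joint example pose are hypothetical, and benchmark protocols often align the root joint of the two poses before measuring, a step omitted here.

```python
import math
from typing import Sequence

# One joint is an (x, y, z) position; millimetres are assumed in the example.
Joint = Sequence[float]


def mpjpe(predicted: Sequence[Joint], target: Sequence[Joint]) -> float:
    """Mean Euclidean distance between corresponding joints of one pose."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("poses must be non-empty and have the same number of joints")
    # math.dist returns the Euclidean distance between two points.
    return sum(math.dist(p, t) for p, t in zip(predicted, target)) / len(predicted)


# Hypothetical three-joint pose, made up purely for illustration.
ground_truth = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (100.0, 200.0, 0.0)]
prediction = [(0.0, 0.0, 0.0), (103.0, 4.0, 0.0), (100.0, 200.0, 12.0)]

# Per-joint errors are 0, 5 and 12, so the mean is 17 / 3.
error_mm = mpjpe(prediction, ground_truth)
print(f"MPJPE: {error_mm:.2f} mm")
```

A dataset-level score would average this per-pose value over all evaluated frames.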