Origin centric and part based pose decomposition for 3D human pose estimation

Bibliographic Details
Main Authors: Zhijie Lin, Jinxin Yao, Juan Huang, Jingjing Chen, Yingying Xu, Lu Yang, Lei Zhao, Wei Xing
Format: Article
Language: English
Published: Nature Portfolio 2025-08-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-16381-y
collection DOAJ
description Abstract: Transformer-based approaches have recently made significant advances in 3D human pose estimation from 2D inputs. Existing methods typically either consider the entire 2D skeleton for global feature extraction or break it into independent parts for local feature learning. However, capturing the spatial dependencies of the entire 2D skeleton does not effectively facilitate learning local spatial features, while partitioning the skeleton into independent segments disrupts the relevance of individual joints to the whole. In this paper, we propose a novel Origin-centric Part Transformer (OPFormer) block to address this issue through two steps: Skeleton Separation and Skeleton Recombination. First, Skeleton Separation divides the 2D skeleton into several distinct parts, enabling the extraction of fine-grained local spatial features that accurately reflect the geometric structure of the human body. Second, we introduce the concept of a human skeleton Origin, which serves as a central hub to reconnect the different parts through Skeleton Recombination. The resulting local features, when fused with global features from the Spatial Transformer Encoder, yield more accurate 3D results. Comprehensive experiments on the Human3.6M and MPI-INF-3DHP benchmark datasets verify that our approach attains state-of-the-art performance. Notably, OPFormer achieves a Mean Per Joint Position Error (MPJPE) of 37.6 mm on the Human3.6M dataset without any additional training data.
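The MPJPE figure quoted in the abstract is the average Euclidean distance between predicted and ground-truth 3D joint positions. The following is a minimal sketch of that standard metric only, not the authors' evaluation code: the function name `mpjpe`, the three-joint poses, and their coordinate values are all hypothetical and chosen purely for illustration. Any root-joint alignment that a benchmark protocol may apply before measuring the error is omitted here.

```python
import math


def mpjpe(pred, target):
    """Mean Per Joint Position Error: the average Euclidean distance
    between corresponding joints, in the same units as the inputs."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equally long")
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


# Hypothetical 3-joint poses in millimetres (illustrative values only).
target = [(0.0, 0.0, 0.0), (0.0, 100.0, 0.0), (0.0, 200.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (0.0, 100.0, 0.0), (0.0, 200.0, 10.0)]

error_mm = mpjpe(pred, target)
print(f"MPJPE: {error_mm:.1f} mm")
```

With these made-up poses the per-joint errors are 5 mm, 0 mm, and 10 mm, so the mean is 5 mm. A reported benchmark score is obtained the same way, but averaged over every joint of every test frame.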
id doaj-art-0bb8d58ff5d341e882eef4436d8b960d
institution Kabale University
issn 2045-2322
spelling doaj-art-0bb8d58ff5d341e882eef4436d8b960d 2025-08-24T11:19:14Z
citation Scientific Reports, volume 15, issue 1, pages 1-13
author_affiliation Zhijie Lin: School of Information and Electronic Engineering, Zhejiang University of Science and Technology
Jinxin Yao: School of Information and Electronic Engineering, Zhejiang University of Science and Technology
Juan Huang: School of Biological and Chemical Engineering, Zhejiang University of Science and Technology
Jingjing Chen: Faculty of Science, Hong Kong Baptist University
Yingying Xu: School of Humanities and Social Science, Beihang University
Lu Yang: Office of Education Affair, China Jiliang University
Lei Zhao: College of Computer Science and Technology, Zhejiang University
Wei Xing: College of Computer Science and Technology, Zhejiang University