Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network

In this study, we propose a novel approach for human pose estimation (HPE) in occluded scenes by progressively fusing features extracted from RGB-D images, which contain RGB and depth images. Conventional bottom-up human pose estimation models that rely solely on RGB inputs often produce erroneous s...

Bibliographic Details
Main Authors:	Jae-hyuk Yoon, Soon-kak Kwon
Format:	Article
Language:	English
Published:	MDPI AG 2025-08-01
Series:	Applied Sciences
Subjects:	deep learning, computer vision, human pose estimation, RGB-D image
Online Access:	https://www.mdpi.com/2076-3417/15/15/8746

_version_	1849407548850962432
author	Jae-hyuk Yoon, Soon-kak Kwon
author_facet	Jae-hyuk Yoon, Soon-kak Kwon
author_sort	Jae-hyuk Yoon
collection	DOAJ
description	In this study, we propose a novel approach for human pose estimation (HPE) in occluded scenes by progressively fusing features extracted from RGB-D images, which contain RGB and depth images. Conventional bottom-up HPE models that rely solely on RGB inputs often produce erroneous skeletons when parts of a person’s body are obscured by another individual, because they lack the 3D topological information needed to infer body connectivity accurately. To address this limitation, we modify OpenPose, a conventional bottom-up HPE model, to take a depth image as an additional input, thereby providing explicit 3D spatial cues. Each input modality is processed by a dedicated feature extractor. In addition to the two existing modules in each stage (joint connectivity estimation and joint confidence map estimation for the color image), we integrate a new module that estimates joint confidence maps for the depth image into the first few stages. The confidence maps derived from the depth and RGB modalities are then fused at each stage and forwarded to the next, ensuring that 3D topological information from the depth image is used for both joint localization and body part association. The experimental results on the NTU 120+ RGB-D Dataset verify that our proposed approach achieves a 13.3% improvement in average recall compared to the original OpenPose model. The proposed method can enhance the performance of bottom-up HPE models in occluded scenes.
format	Article
id	doaj-art-edd3b72ae01c47ddb8a17cd80e9ae969
institution	Kabale University
issn	2076-3417
language	English
publishDate	2025-08-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj-art-edd3b72ae01c47ddb8a17cd80e9ae969; 2025-08-20T03:36:02Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-08-01; vol. 15, no. 15, art. 8746; doi:10.3390/app15158746; Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network; Jae-hyuk Yoon (Department of Computer Software Engineering, Dong-eui University, Busan 47340, Republic of Korea); Soon-kak Kwon (Department of Computer Software Engineering, Dong-eui University, Busan 47340, Republic of Korea); https://www.mdpi.com/2076-3417/15/15/8746; deep learning; computer vision; human pose estimation; RGB-D image
spellingShingle	Jae-hyuk Yoon Soon-kak Kwon Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network Applied Sciences deep learning computer vision human pose estimation RGB-D image
title	Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network
title_full	Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network
title_fullStr	Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network
title_full_unstemmed	Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network
title_short	Robust Human Pose Estimation Method for Body-to-Body Occlusion Using RGB-D Fusion Neural Network
title_sort	robust human pose estimation method for body to body occlusion using rgb d fusion neural network
topic	deep learning, computer vision, human pose estimation, RGB-D image
url	https://www.mdpi.com/2076-3417/15/15/8746
work_keys_str_mv	AT jaehyukyoon robusthumanposeestimationmethodforbodytobodyocclusionusingrgbdfusionneuralnetwork AT soonkakkwon robusthumanposeestimationmethodforbodytobodyocclusionusingrgbdfusionneuralnetwork
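
The stage-wise fusion summarized in the description field can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The record does not state the fusion operator, the number of stages, or the channel counts, so the element-wise mean in `fuse`, the random 1x1 convolutions standing in for learned stage modules, and every name and default value below are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, weight):
    """Stand-in for a learned stage module: a 1x1 convolution over channels.
    x has shape (C_in, H, W); weight has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def fuse(rgb_conf, depth_conf):
    """Hypothetical fusion operator: element-wise mean of the two maps.
    The record does not say which operator the authors use."""
    return 0.5 * (rgb_conf + depth_conf)

def run_stages(rgb_feat, depth_feat, n_stages=4, n_depth_stages=2,
               n_joints=18, n_limbs=19):
    """Refine maps stage by stage; the depth branch exists only in early stages.
    All default sizes are placeholders, not values from the paper."""
    h, w = rgb_feat.shape[1:]
    conf = np.zeros((n_joints, h, w))     # joint confidence maps
    paf = np.zeros((2 * n_limbs, h, w))   # joint connectivity maps
    for stage in range(n_stages):
        # Each stage sees the RGB features plus the previous stage's outputs.
        x = np.concatenate([rgb_feat, conf, paf], axis=0)
        w_paf = rng.normal(scale=0.1, size=(2 * n_limbs, x.shape[0]))
        w_rgb = rng.normal(scale=0.1, size=(n_joints, x.shape[0]))
        paf = conv1x1(x, w_paf)
        rgb_conf = conv1x1(x, w_rgb)
        if stage < n_depth_stages:
            # Depth confidence module, present only in the first few stages.
            xd = np.concatenate([depth_feat, conf], axis=0)
            w_depth = rng.normal(scale=0.1, size=(n_joints, xd.shape[0]))
            conf = fuse(rgb_conf, conv1x1(xd, w_depth))
        else:
            conf = rgb_conf
    return conf, paf

# Toy feature maps standing in for the two dedicated feature extractors.
rgb_feat = rng.normal(size=(8, 6, 6))
depth_feat = rng.normal(size=(4, 6, 6))
conf, paf = run_stages(rgb_feat, depth_feat)
print(conf.shape, paf.shape)
```

The fused confidence map is what gets forwarded to the next stage, so depth cues from the early stages reach both the later confidence and connectivity estimates even after the depth branch is dropped.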

Similar Items