MG6D: A Deep Fusion Approach for 6D Pose Estimation With Mamba and Graph Convolution Network

Accurate and efficient 6D pose estimation is a fundamental technology in many industrial applications. While existing dense correspondence methods have shown progress, they face challenges in multimodal feature fusion under complex scenarios involving occlusions, illumination variations, and sensor noise. This paper proposes a novel 6D pose estimation framework that addresses these limitations through a hybrid Mamba-Graph architecture. The algorithm first introduces a panoramic attention fusion Mamba module, leveraging state-space modeling to capture long-range dependencies in multimodal data while establishing cross-dimensional interactions between channel and spatial features to emphasize critical information. A dynamic graph convolutional adaptive fusion module is then designed to enable cross-modal geometric consistency modeling via multimodal feature integration. Finally, a texture-geometry co-driven keypoint selection mechanism is proposed to ensure keypoint distributions satisfy both spatial uniformity and discriminability requirements. Experimental results on three common datasets demonstrate that the proposed algorithm achieves ADD(-S) metrics of 99.82%, 80.26%, and 97.2%, respectively. Notably, it exhibits significant advantages in pose estimation for objects with repetitive textures and high symmetry.
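For context on the ADD(-S) figures reported in the abstract: ADD measures the mean distance between corresponding 3D model points under the predicted and ground-truth poses, while ADD-S (used for symmetric objects) measures the mean distance from each transformed ground-truth point to its nearest transformed predicted point. The sketch below is the standard formulation of these metrics, not code from the paper; the point-set shapes and the 10%-of-diameter correctness threshold are the conventional choices, assumed here for illustration.

```python
import numpy as np

def add_metric(R_pred, t_pred, R_gt, t_gt, model_pts):
    """ADD: mean distance between corresponding model points (N x 3)
    transformed by the predicted and ground-truth rigid poses."""
    pred = model_pts @ R_pred.T + t_pred
    gt = model_pts @ R_gt.T + t_gt
    return np.linalg.norm(pred - gt, axis=1).mean()

def adds_metric(R_pred, t_pred, R_gt, t_gt, model_pts):
    """ADD-S (symmetric objects): mean distance from each transformed
    ground-truth point to its nearest transformed predicted point."""
    pred = model_pts @ R_pred.T + t_pred
    gt = model_pts @ R_gt.T + t_gt
    # Pairwise N x N distance matrix; acceptable for small model clouds.
    d = np.linalg.norm(gt[:, None, :] - pred[None, :, :], axis=2)
    return d.min(axis=1).mean()

def is_correct(dist, diameter, thresh=0.1):
    """Conventional criterion: pose counts as correct if the ADD(-S)
    distance is below 10% of the object's diameter."""
    return dist < thresh * diameter
```

A reported ADD(-S) score such as 99.82% is then the fraction of test frames for which `is_correct` holds.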

Bibliographic Details
Main Authors: Jiaqi Zhu, Bin Li, Xinhua Zhao
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access
Subjects: 6D pose estimation; panoramic attention; fusion Mamba; graph feature fusion
Online Access:https://ieeexplore.ieee.org/document/11021472/
author Jiaqi Zhu
Bin Li
Xinhua Zhao
collection DOAJ
description Accurate and efficient 6D pose estimation is a fundamental technology in many industrial applications. While existing dense correspondence methods have shown progress, they face challenges in multimodal feature fusion under complex scenarios involving occlusions, illumination variations, and sensor noise. This paper proposes a novel 6D pose estimation framework that addresses these limitations through a hybrid Mamba-Graph architecture. The algorithm first introduces a panoramic attention fusion Mamba module, leveraging state-space modeling to capture long-range dependencies in multi-modal data while establishing cross-dimensional interactions between channel and spatial features to emphasize critical information. A dynamic graph convolutional adaptive fusion module is then designed to enable cross-modal geometric consistency modeling via multi-modal feature integration. Finally, a texture-geometry co-driven keypoint selection mechanism is proposed to ensure keypoint distributions satisfy both spatial uniformity and discriminability requirements. Experimental results on three common datasets demonstrate that the proposed algorithm achieves ADD(-S) metrics of 99.82%, 80.26%, and 97.2%, respectively. Notably, it exhibits significant advantages in pose estimation for objects with repetitive textures and high symmetry.
format Article
id doaj-art-bdec6f87e2784d32a9bedcaee4aeb743
institution OA Journals
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling MG6D: A Deep Fusion Approach for 6D Pose Estimation With Mamba and Graph Convolution Network. IEEE Access, vol. 13, pp. 100433-100445, 2025-01-01. ISSN 2169-3536. DOI: 10.1109/ACCESS.2025.3575778. IEEE Xplore document 11021472.
Jiaqi Zhu (https://orcid.org/0009-0000-8116-3984), School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China
Bin Li, Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, China
Xinhua Zhao, School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China
Online access: https://ieeexplore.ieee.org/document/11021472/
Keywords: 6D pose estimation; panoramic attention; fusion Mamba; graph feature fusion
title MG6D: A Deep Fusion Approach for 6D Pose Estimation With Mamba and Graph Convolution Network
topic 6D pose estimation
panoramic attention
fusion mamba
graph feature fusion
url https://ieeexplore.ieee.org/document/11021472/