GH-UNet: group-wise hybrid convolution-VIT for robust medical image segmentation

Abstract Medical image segmentation is vital for accurate diagnosis. While U-Net-based models are effective, they struggle to capture long-range dependencies in complex anatomy. We propose GH-UNet, a Group-wise Hybrid Convolution-ViT model within the U-Net framework, to address this limitation. GH-UNet integrates a hybrid convolution-Transformer encoder for both local detail and global context modeling, a Group-wise Dynamic Gating (GDG) module for adaptive feature weighting, and a cascaded decoder for multi-scale integration. Both the encoder and GDG are modular, enabling compatibility with various CNN or ViT backbones. Extensive experiments on five public datasets and one private dataset show that GH-UNet consistently achieves superior performance. On ISIC2016, it surpasses H2Former by 1.37% in Dice and 1.94% in IoU while using only 38% of the parameters and 49.61% of the FLOPs. The code is freely available at https://github.com/xiachashuanghua/GH-UNet.

Bibliographic Details
Main Authors: Shengxiang Wang, Ge Li, Min Gao, Linlin Zhuo, Mingzhe Liu, Zhizhong Ma, Wei Zhao, Xiangzheng Fu
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-025-01829-2
author Shengxiang Wang
Ge Li
Min Gao
Linlin Zhuo
Mingzhe Liu
Zhizhong Ma
Wei Zhao
Xiangzheng Fu
collection DOAJ
description Abstract Medical image segmentation is vital for accurate diagnosis. While U-Net-based models are effective, they struggle to capture long-range dependencies in complex anatomy. We propose GH-UNet, a Group-wise Hybrid Convolution-ViT model within the U-Net framework, to address this limitation. GH-UNet integrates a hybrid convolution-Transformer encoder for both local detail and global context modeling, a Group-wise Dynamic Gating (GDG) module for adaptive feature weighting, and a cascaded decoder for multi-scale integration. Both the encoder and GDG are modular, enabling compatibility with various CNN or ViT backbones. Extensive experiments on five public datasets and one private dataset show that GH-UNet consistently achieves superior performance. On ISIC2016, it surpasses H2Former by 1.37% in Dice and 1.94% in IoU while using only 38% of the parameters and 49.61% of the FLOPs. The code is freely available at https://github.com/xiachashuanghua/GH-UNet.
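The "adaptive feature weighting" that the abstract attributes to the Group-wise Dynamic Gating (GDG) module can be illustrated in miniature: split the channels of a feature map into groups, derive a scalar gate for each group from its pooled response, and rescale the group by that gate. The NumPy sketch below is a hedged illustration of this general group-wise gating pattern, not the paper's actual implementation; the function name, the sigmoid-of-pooled-mean gate, and the per-group parameters `w` and `b` are all assumptions for demonstration (see the linked repository for the authors' code).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def group_wise_dynamic_gating(x, num_groups, w, b):
    """Illustrative group-wise gating (NOT the paper's exact GDG).

    x: feature map of shape (C, H, W); C must divide evenly by num_groups.
    w, b: hypothetical per-group gating parameters, each of shape (num_groups,).
    Each channel group is rescaled by sigmoid(w_g * mean(group) + b_g),
    so the weighting depends dynamically on the group's own activations.
    """
    c, h, width = x.shape
    assert c % num_groups == 0, "channels must split evenly into groups"
    groups = x.reshape(num_groups, c // num_groups, h, width)
    pooled = groups.mean(axis=(1, 2, 3))          # one pooled scalar per group
    gates = sigmoid(w * pooled + b)               # dynamic gate in (0, 1)
    return (groups * gates[:, None, None, None]).reshape(c, h, width)

# Toy usage: 8 channels in 2 groups; with w = b = 0 every gate is
# sigmoid(0) = 0.5, so all features are halved.
feat = np.ones((8, 4, 4))
out = group_wise_dynamic_gating(feat, 2, w=np.zeros(2), b=np.zeros(2))
```

In a real network the gate parameters would be learned, letting each group be amplified or suppressed depending on the input, which is the intuition behind letting the encoder and GDG pair with different CNN or ViT backbones.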
format Article
id doaj-art-b439201d860b4fe1aed606e503c62be9
institution Kabale University
issn 2398-6352
language English
publishDate 2025-07-01
publisher Nature Portfolio
record_format Article
series npj Digital Medicine
affiliation Shengxiang Wang: School of Data Science and Artificial Intelligence, Wenzhou University of Technology
Ge Li: Department of Radiology, Xiangya Hospital, Central South University
Min Gao: Department of Radiology, The Second Xiangya Hospital, Central South University
Linlin Zhuo: School of Data Science and Artificial Intelligence, Wenzhou University of Technology
Mingzhe Liu: School of Data Science and Artificial Intelligence, Wenzhou University of Technology
Zhizhong Ma: School of Data Science and Artificial Intelligence, Wenzhou University of Technology
Wei Zhao: Department of Radiology, The Second Xiangya Hospital, Central South University
Xiangzheng Fu: School of Chinese Medicine, Hong Kong Baptist University
title GH-UNet: group-wise hybrid convolution-VIT for robust medical image segmentation
url https://doi.org/10.1038/s41746-025-01829-2