GH-UNet: group-wise hybrid convolution-VIT for robust medical image segmentation
Abstract: Medical image segmentation is vital for accurate diagnosis. While U-Net-based models are effective, they struggle to capture long-range dependencies in complex anatomy. We propose GH-UNet, a Group-wise Hybrid Convolution-ViT model within the U-Net framework, to address this limitation. GH-UNet integrates a hybrid convolution-Transformer encoder for both local detail and global context modeling, a Group-wise Dynamic Gating (GDG) module for adaptive feature weighting, and a cascaded decoder for multi-scale integration. Both the encoder and GDG are modular, enabling compatibility with various CNN or ViT backbones. Extensive experiments on five public and one private dataset show GH-UNet consistently achieves superior performance. On ISIC2016, it surpasses H2Former with 1.37% and 1.94% gains in DICE and IOU, respectively, using only 38% of the parameters and 49.61% of the FLOPs. The code is freely accessible via https://github.com/xiachashuanghua/GH-UNet.
| Main Authors: | Shengxiang Wang, Ge Li, Min Gao, Linlin Zhuo, Mingzhe Liu, Zhizhong Ma, Wei Zhao, Xiangzheng Fu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | npj Digital Medicine |
| Online Access: | https://doi.org/10.1038/s41746-025-01829-2 |
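The abstract's Group-wise Dynamic Gating (GDG) idea — splitting a feature vector into groups and adaptively reweighting each group — can be sketched in miniature as follows. This is an illustrative toy, not the paper's actual module: the scalar per-group weights and biases, and the mean-activation gate, are hypothetical simplifications chosen only to make the gating mechanism concrete.

```python
import math

def group_dynamic_gating(features, num_groups, weights, biases):
    """Toy sketch of group-wise dynamic gating.

    Splits a flat feature vector into `num_groups` equal groups, computes a
    sigmoid gate per group from its mean activation (a hypothetical choice),
    and rescales every element of the group by that gate.
    """
    assert len(features) % num_groups == 0, "features must split evenly"
    size = len(features) // num_groups
    gated = []
    for g in range(num_groups):
        group = features[g * size:(g + 1) * size]
        mean = sum(group) / size
        # Per-group gate in (0, 1): sigmoid of an affine map of the group mean.
        gate = 1.0 / (1.0 + math.exp(-(weights[g] * mean + biases[g])))
        gated.extend(x * gate for x in group)
    return gated
```

With zero weights and biases every gate is sigmoid(0) = 0.5, so each group is uniformly halved; in a learned module the weights and biases would instead let informative groups pass through more strongly than uninformative ones.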
| Collection: | DOAJ |
|---|---|
| Institution: | Kabale University |
| Record ID: | doaj-art-b439201d860b4fe1aed606e503c62be9 |
| ISSN: | 2398-6352 |

Author affiliations:
- Shengxiang Wang, Linlin Zhuo, Mingzhe Liu, Zhizhong Ma: School of Data Science and Artificial Intelligence, Wenzhou University of Technology
- Ge Li: Department of Radiology, Xiangya Hospital, Central South University
- Min Gao, Wei Zhao: Department of Radiology, The Second Xiangya Hospital, Central South University
- Xiangzheng Fu: School of Chinese Medicine, Hong Kong Baptist University