A U-Shaped Architecture Based on Hybrid CNN and Mamba for Medical Image Segmentation

Accurate medical image segmentation plays a critical role in clinical diagnosis, treatment planning, and a wide range of healthcare applications. Although U-shaped CNNs and Transformer-based architectures have shown promise, CNNs struggle to capture long-range dependencies, whereas Transformers suff...

Bibliographic Details
Main Authors: Xiaoxuan Ma, Yingao Du, Dong Sui
Format: Article
Language: English
Published: MDPI AG 2025-07-01
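Series: Applied Sciences
Subjects: medical image segmentation; Mamba; Convolutional Neural Networks; hybrid architectures; State Space Model
Online Access: https://www.mdpi.com/2076-3417/15/14/7821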
author Xiaoxuan Ma
Yingao Du
Dong Sui
collection DOAJ
description Accurate medical image segmentation plays a critical role in clinical diagnosis, treatment planning, and a wide range of healthcare applications. Although U-shaped CNNs and Transformer-based architectures have shown promise, CNNs struggle to capture long-range dependencies, whereas Transformers suffer from quadratic growth in computational cost as image resolution increases. To address these issues, we propose HCMUNet, a novel medical image segmentation model that innovatively combines the local feature extraction capabilities of CNNs with the efficient long-range dependency modeling of Mamba, enhancing feature representation while reducing computational cost. In addition, HCMUNet features a redesigned skip connection and a novel attention module that integrates multi-scale features to recover spatial details lost during down-sampling and to promote richer cross-dimensional interactions. HCMUNet achieves Dice Similarity Coefficients (DSC) of 90.32%, 81.52%, and 92.11% on the ISIC 2018, Synapse multi-organ, and ACDC datasets, respectively, outperforming baseline methods by 0.65%, 1.05%, and 1.39%. Furthermore, HCMUNet consistently outperforms U-Net and Swin-UNet, achieving average Dice score improvements of approximately 5% and 2% across the evaluated datasets. These results collectively affirm the effectiveness and reliability of the proposed model across different segmentation tasks.
format Article
id doaj-art-50e76f080c954652b5c95044e350077e
institution Kabale University
issn 2076-3417
language English
publishDate 2025-07-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-50e76f080c954652b5c95044e350077e 2025-08-20T03:35:36Z eng
MDPI AG, Applied Sciences (ISSN 2076-3417), 2025-07-01, 15(14), 7821, doi:10.3390/app15147821
A U-Shaped Architecture Based on Hybrid CNN and Mamba for Medical Image Segmentation
Xiaoxuan Ma, Yingao Du, Dong Sui
School of Intelligence Science and Technology, Beijing University of Civil Engineering and Architecture, Beijing 102616, China (all three authors)
title A U-Shaped Architecture Based on Hybrid CNN and Mamba for Medical Image Segmentation
topic medical image segmentation
Mamba
Convolutional Neural Networks
hybrid architectures
State Space Model
url https://www.mdpi.com/2076-3417/15/14/7821