MedFuseNet: fusing local and global deep feature representations with hybrid attention mechanisms for medical image segmentation
Abstract Medical image segmentation plays a crucial role in addressing emerging healthcare challenges. Although several impressive deep learning architectures based on convolutional neural networks (CNNs) and Transformers have recently demonstrated remarkable performance, there is still potential for further improvement due to their inherent limitations in capturing feature correlations of input data. To address this issue, this paper proposes a novel encoder-decoder architecture called MedFuseNet that fuses local and global deep feature representations with hybrid attention mechanisms for medical image segmentation. The proposed approach contains two parallel feature-learning branches: one leverages CNNs to learn local correlations of input data, and the other utilizes Swin-Transformer to capture global contextual correlations. For feature fusion and enhancement, the designed hybrid attention mechanisms combine four different attention modules: (1) an atrous spatial pyramid pooling (ASPP) module for the CNN branch, (2) a cross attention module in the encoder for fusing local and global features, (3) an adaptive cross attention (ACA) module in skip connections for further fusion, and (4) a squeeze-and-excitation attention (SE-attention) module in the decoder for highlighting informative features. We evaluate our proposed approach on the public ACDC and Synapse datasets, achieving average DSC scores of 89.73% and 78.40%, respectively. Experimental results on these two datasets demonstrate the effectiveness of our proposed approach on medical image segmentation tasks, outperforming other state-of-the-art approaches.
Saved in:
| Main Authors: | Ruiyuan Chen, Saiqi He, Junjie Xie, Tao Wang, Yingying Xu, Jiangxiong Fang, Xiaoming Zhao, Shiqing Zhang, Guoyu Wang, Hongsheng Lu, Zhaohui Yang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-02-01 |
| Series: | Scientific Reports |
| Subjects: | Medical image segmentation; Deep learning; Convolutional neural networks; Swin-Transformer; Hybrid attention mechanisms |
| Online Access: | https://doi.org/10.1038/s41598-025-89096-9 |
| _version_ | 1850067134267260928 |
|---|---|
| author | Ruiyuan Chen Saiqi He Junjie Xie Tao Wang Yingying Xu Jiangxiong Fang Xiaoming Zhao Shiqing Zhang Guoyu Wang Hongsheng Lu Zhaohui Yang |
| author_sort | Ruiyuan Chen |
| collection | DOAJ |
| description | Abstract Medical image segmentation plays a crucial role in addressing emerging healthcare challenges. Although several impressive deep learning architectures based on convolutional neural networks (CNNs) and Transformers have recently demonstrated remarkable performance, there is still potential for further improvement due to their inherent limitations in capturing feature correlations of input data. To address this issue, this paper proposes a novel encoder-decoder architecture called MedFuseNet that fuses local and global deep feature representations with hybrid attention mechanisms for medical image segmentation. The proposed approach contains two parallel feature-learning branches: one leverages CNNs to learn local correlations of input data, and the other utilizes Swin-Transformer to capture global contextual correlations. For feature fusion and enhancement, the designed hybrid attention mechanisms combine four different attention modules: (1) an atrous spatial pyramid pooling (ASPP) module for the CNN branch, (2) a cross attention module in the encoder for fusing local and global features, (3) an adaptive cross attention (ACA) module in skip connections for further fusion, and (4) a squeeze-and-excitation attention (SE-attention) module in the decoder for highlighting informative features. We evaluate our proposed approach on the public ACDC and Synapse datasets, achieving average DSC scores of 89.73% and 78.40%, respectively. Experimental results on these two datasets demonstrate the effectiveness of our proposed approach on medical image segmentation tasks, outperforming other state-of-the-art approaches. |
| format | Article |
| id | doaj-art-e6d5493f3fc04fce8e00adf1a0fe5e93 |
| institution | DOAJ |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-02-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-e6d5493f3fc04fce8e00adf1a0fe5e93. Nature Portfolio, Scientific Reports (ISSN 2045-2322), 2025-02-01, doi:10.1038/s41598-025-89096-9. MedFuseNet: fusing local and global deep feature representations with hybrid attention mechanisms for medical image segmentation. Authors and affiliations: Ruiyuan Chen, Junjie Xie, Tao Wang, Yingying Xu, Jiangxiong Fang, Xiaoming Zhao, Shiqing Zhang, Guoyu Wang, Hongsheng Lu, Zhaohui Yang (Taizhou Central Hospital (Taizhou University Hospital), Taizhou University); Saiqi He (Hengyang Central Hospital). Abstract as given above. Subjects: Medical image segmentation; Deep learning; Convolutional neural networks; Swin-Transformer; Hybrid attention mechanisms. https://doi.org/10.1038/s41598-025-89096-9 |
| title | MedFuseNet: fusing local and global deep feature representations with hybrid attention mechanisms for medical image segmentation |
| title_sort | medfusenet fusing local and global deep feature representations with hybrid attention mechanisms for medical image segmentation |
| topic | Medical image segmentation Deep learning Convolutional neural networks Swin-Transformer Hybrid attention mechanisms |
| url | https://doi.org/10.1038/s41598-025-89096-9 |
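The abstract names a squeeze-and-excitation (SE) attention module in the decoder for highlighting informative channels. As a rough illustration of that mechanism only (not the paper's actual implementation, whose layer sizes and weights are not given here), the following sketch applies the standard SE steps — global average pooling, two small fully connected layers with ReLU and sigmoid, then channel-wise rescaling — using hypothetical toy weights:

```python
import math

def se_attention(feature_maps, w1, w2):
    """Squeeze-and-excitation channel reweighting (illustrative only).

    feature_maps: list of C channels, each an HxW list of lists.
    w1: C x H1 weight matrix (squeeze -> hidden), w2: H1 x C (hidden -> scale).
    Returns the channel-reweighted feature maps.
    """
    C = len(feature_maps)
    # 1) Squeeze: global average pool each channel to one scalar descriptor.
    z = [sum(sum(row) for row in fm) / (len(fm) * len(fm[0])) for fm in feature_maps]
    # 2) Excitation: two fully connected layers, ReLU then sigmoid,
    #    producing one importance weight in (0, 1) per channel.
    hidden = [max(0.0, sum(z[i] * w1[i][j] for i in range(C)))
              for j in range(len(w1[0]))]
    scale = [1.0 / (1.0 + math.exp(-sum(hidden[k] * w2[k][j]
                                        for k in range(len(hidden)))))
             for j in range(C)]
    # 3) Scale: reweight every value in each channel by its learned importance.
    return [[[v * scale[c] for v in row] for row in feature_maps[c]]
            for c in range(C)]
```

In MedFuseNet this kind of channel gating sits in the decoder, so informative fused features are amplified relative to less useful ones before upsampling continues.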