Medical Image Segmentation Network Based on Dual-Encoder Interactive Fusion
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Applied Sciences |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/15/7/3785 |
| Summary: | Hybrid CNN–Transformer networks combine the local feature extraction of CNNs with the long-range dependency modeling of Transformers so that local details and global context are captured together. In many existing studies, however, the two are combined by simply fusing encoder features, which does not promote effective interaction between them and limits the benefits of each architecture. To address this shortfall, this study introduces DEFI-Net, a medical image segmentation (MIS) network based on dual-encoder interactive fusion. The network improves segmentation performance through interactive learning and feature fusion between the CNN and Transformer encoders. During encoding, DEFI-Net runs the CNN and the Transformer in parallel to extract local and global features from the input images. The global–local interaction learning (GLIL) module then lets the Transformer and the CNN assimilate local details and global semantics from each other, drawing on the strengths of both encoders. In the feature fusion phase, the global–local feature fusion (GLFF) module integrates features from both encoders, using global and local information to produce a more precise and comprehensive feature representation. Extensive experiments on multiple public datasets, covering multi-organ, cardiac, and colon polyp segmentation, show that DEFI-Net surpasses several existing methods in segmentation accuracy, highlighting its effectiveness and robustness in MIS tasks. |
| ISSN: | 2076-3417 |
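
The data flow described in the summary (parallel local and global encoding, cross-branch interaction, then fusion) can be illustrated with a toy NumPy sketch. This is not the DEFI-Net implementation: the record does not specify the internals of GLIL or GLFF, so the functions `local_branch`, `global_branch`, `interact`, and `fuse`, along with all weight shapes, are hypothetical stand-ins chosen only to show how the pieces might connect.

```python
import numpy as np


def local_branch(x):
    """Stand-in for a CNN stage: 3x3 mean filter per channel (local context only)."""
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)), mode="edge")
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w, :]
    return out / 9.0


def global_branch(x):
    """Stand-in for a Transformer stage: single-head self-attention over all pixels."""
    h, w, c = x.shape
    tokens = x.reshape(h * w, c)
    scores = tokens @ tokens.T / np.sqrt(c)
    scores -= scores.max(axis=1, keepdims=True)  # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights @ tokens).reshape(h, w, c)


def interact(local_feat, global_feat, w_g2l, w_l2g):
    """Assumed GLIL-like step: each branch adds a linear projection of the other."""
    new_local = local_feat + global_feat @ w_g2l
    new_global = global_feat + local_feat @ w_l2g
    return new_local, new_global


def fuse(local_feat, global_feat, w_fuse):
    """Assumed GLFF-like step: concatenate channels, then mix them linearly."""
    return np.concatenate([local_feat, global_feat], axis=-1) @ w_fuse


# Toy input: an 8x8 feature map with 4 channels (sizes are arbitrary).
rng = np.random.default_rng(0)
channels = 4
image = rng.standard_normal((8, 8, channels))
w_g2l = 0.1 * rng.standard_normal((channels, channels))
w_l2g = 0.1 * rng.standard_normal((channels, channels))
w_fuse = 0.1 * rng.standard_normal((2 * channels, channels))

local_feat = local_branch(image)    # parallel encoding, local path
global_feat = global_branch(image)  # parallel encoding, global path
local_feat, global_feat = interact(local_feat, global_feat, w_g2l, w_l2g)
fused = fuse(local_feat, global_feat, w_fuse)
```

The point of the sketch is the ordering: the two branches exchange information before fusion, rather than being fused directly from independent encoder outputs, which is the distinction the summary draws against earlier hybrid designs.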