MM-HiFuse: multi-modal multi-task hierarchical feature fusion for esophagus cancer staging and differentiation classification
Main Authors: | Xiangzuo Huo, Shengwei Tian, Long Yu, Wendong Zhang, Aolun Li, Qimeng Yang, Jinmiao Song |
---|---|
Format: | Article |
Language: | English |
Published: | Springer, 2025-01-01 |
Series: | Complex & Intelligent Systems |
Subjects: | Esophagus Cancer; Multi-modal Multi-task Learning; Feature Fusion; Hybrid Network; Self-attention |
Online Access: | https://doi.org/10.1007/s40747-024-01708-5 |
_version_ | 1832571177589014528 |
---|---|
author | Xiangzuo Huo, Shengwei Tian, Long Yu, Wendong Zhang, Aolun Li, Qimeng Yang, Jinmiao Song |
collection | DOAJ |
description | Abstract Esophageal cancer is a globally significant but understudied type of cancer with high mortality rates. The staging and differentiation of esophageal cancer are crucial factors in determining the prognosis and surgical treatment plan for patients, as well as improving their chances of survival. Endoscopy and histopathological examination are considered as the gold standard for esophageal cancer diagnosis. However, some previous studies have employed deep learning-based methods for esophageal cancer analysis, which are limited to single-modal features, resulting in inadequate classification results. In response to these limitations, multi-modal learning has emerged as a promising alternative for medical image analysis tasks. In this paper, we propose a hierarchical feature fusion network, MM-HiFuse, for multi-modal multitask learning to improve the classification accuracy of esophageal cancer staging and differentiation level. The proposed architecture combines low-level to deep-level features of both pathological and endoscopic images to achieve accurate classification results. The key characteristics of MM-HiFuse include: (i) a parallel hierarchy of convolution and self-attention layers specifically designed for pathological and endoscopic image features; (ii) a multi-modal hierarchical feature fusion module (MHF) and a new multitask weighted combination loss function. The benefits of these features are the effective extraction of multi-modal representations at different semantic scales and the mutual complementarity of the multitask learning, leading to improved classification performance. Experimental results demonstrate that MM-HiFuse outperforms single-modal methods in esophageal cancer staging and differentiation classification. Our findings provide evidence for the early diagnosis and accurate staging of esophageal cancer and serve as a new inspiration for the application of multi-modal multitask learning in medical image analysis. 
Code is available at https://github.com/huoxiangzuo/MM-HiFuse . |
format | Article |
id | doaj-art-828d6d02bce549d6a4d0c9bfef60d7cb |
institution | Kabale University |
issn | 2199-4536 2198-6053 |
language | English |
publishDate | 2025-01-01 |
publisher | Springer |
record_format | Article |
series | Complex & Intelligent Systems |
spelling | doaj-art-828d6d02bce549d6a4d0c9bfef60d7cb; 2025-02-02T12:50:19Z; eng; Springer; Complex & Intelligent Systems; 2199-4536, 2198-6053; 2025-01-01; 111112; 10.1007/s40747-024-01708-5; MM-HiFuse: multi-modal multi-task hierarchical feature fusion for esophagus cancer staging and differentiation classification; Xiangzuo Huo (0), Shengwei Tian (1), Long Yu (2), Wendong Zhang (3), Aolun Li (4), Qimeng Yang (5), Jinmiao Song (6); (0) School of Computer and Information Engineering, Tianjin Agricultural University; (1) to (6) School of Software, Xinjiang University; https://doi.org/10.1007/s40747-024-01708-5; Esophagus Cancer; Multi-modal Multi-task Learning; Feature Fusion; Hybrid Network; Self-attention |
title | MM-HiFuse: multi-modal multi-task hierarchical feature fusion for esophagus cancer staging and differentiation classification |
topic | Esophagus Cancer; Multi-modal Multi-task Learning; Feature Fusion; Hybrid Network; Self-attention |
url | https://doi.org/10.1007/s40747-024-01708-5 |
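
The abstract mentions a multitask weighted combination loss over the staging and differentiation tasks, but this record does not give its formula. The following is a minimal sketch of one common way such a loss could be built, assuming a convex combination of two softmax cross-entropy terms. The function names and the weight `alpha` are hypothetical and are not taken from the paper or its repository.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for a single sample; `target` is a class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def weighted_multitask_loss(stage_logits, stage_target,
                            diff_logits, diff_target, alpha=0.5):
    """Assumed form: alpha * staging loss + (1 - alpha) * differentiation loss.

    `alpha` is an illustrative hyperparameter, not a value from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    stage_loss = cross_entropy(stage_logits, stage_target)
    diff_loss = cross_entropy(diff_logits, diff_target)
    return alpha * stage_loss + (1.0 - alpha) * diff_loss
```

With `alpha=1.0` the sketch reduces to the staging loss alone, and with `alpha=0.0` to the differentiation loss alone, so intermediate values trade off the two tasks.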