MM-HiFuse: multi-modal multi-task hierarchical feature fusion for esophagus cancer staging and differentiation classification

Bibliographic Details
Main Authors: Xiangzuo Huo, Shengwei Tian, Long Yu, Wendong Zhang, Aolun Li, Qimeng Yang, Jinmiao Song
Format: Article
Language: English
Published: Springer 2025-01-01
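Series: Complex & Intelligent Systems
Subjects: Esophagus Cancer; Multi-modal Multi-task Learning; Feature Fusion; Hybrid Network; Self-attention
Online Access: https://doi.org/10.1007/s40747-024-01708-5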
collection DOAJ
description Abstract Esophageal cancer is a globally significant but understudied cancer with a high mortality rate. The staging and differentiation of esophageal cancer are crucial factors in determining the prognosis and surgical treatment plan for patients, as well as in improving their chances of survival. Endoscopy and histopathological examination are considered the gold standard for esophageal cancer diagnosis. However, some previous studies have applied deep learning-based methods to esophageal cancer analysis using only single-modal features, resulting in inadequate classification results. In response to these limitations, multi-modal learning has emerged as a promising alternative for medical image analysis tasks. In this paper, we propose a hierarchical feature fusion network, MM-HiFuse, for multi-modal multitask learning to improve the classification accuracy of esophageal cancer staging and differentiation level. The proposed architecture combines low-level to deep-level features of both pathological and endoscopic images to achieve accurate classification results. The key characteristics of MM-HiFuse include: (i) a parallel hierarchy of convolution and self-attention layers specifically designed for pathological and endoscopic image features; (ii) a multi-modal hierarchical feature fusion module (MHF) and a new multitask weighted combination loss function. These features enable the effective extraction of multi-modal representations at different semantic scales and exploit the mutual complementarity of the multitask learning, leading to improved classification performance. Experimental results demonstrate that MM-HiFuse outperforms single-modal methods in esophageal cancer staging and differentiation classification. Our findings provide evidence for the early diagnosis and accurate staging of esophageal cancer and serve as a new inspiration for the application of multi-modal multitask learning in medical image analysis. Code is available at https://github.com/huoxiangzuo/MM-HiFuse
id doaj-art-828d6d02bce549d6a4d0c9bfef60d7cb
institution Kabale University
issn 2199-4536
2198-6053
spelling doaj-art-828d6d02bce549d6a4d0c9bfef60d7cb
2025-02-02T12:50:19Z
eng
Springer
Complex & Intelligent Systems
2199-4536
2198-6053
2025-01-01
111112
10.1007/s40747-024-01708-5
Xiangzuo Huo (0), Shengwei Tian (1), Long Yu (2), Wendong Zhang (3), Aolun Li (4), Qimeng Yang (5), Jinmiao Song (6)
School of Computer and Information Engineering, Tianjin Agricultural University
School of Software, Xinjiang University (listed six times)
https://doi.org/10.1007/s40747-024-01708-5
title MM-HiFuse: multi-modal multi-task hierarchical feature fusion for esophagus cancer staging and differentiation classification
topic Esophagus Cancer
Multi-modal Multi-task Learning
Feature Fusion
Hybrid Network
Self-attention