Attention-based multimodal deep learning for interpretable and generalizable prediction of pathological complete response in breast cancer

Abstract Background Accurate prediction of pathological complete response (pCR) to neoadjuvant chemotherapy has significant clinical utility in the management of breast cancer treatment. Although multimodal deep learning models have shown promise for predicting pCR from medical imaging and other clinical data, their adoption has been limited due to challenges with interpretability and generalizability across institutions. Methods We developed a multimodal deep learning model combining post contrast-enhanced whole-breast MRI at pre- and post-treatment timepoints with non-imaging clinical features. The model integrates 3D convolutional neural networks and self-attention to capture spatial and cross-modal interactions. We utilized two public multi-institutional datasets to perform internal and external validation of the model. For model training and validation, we used data from the I-SPY 2 trial (N = 660). For external validation, we used the I-SPY 1 dataset (N = 114). Results Of the 660 patients in I-SPY 2, 217 patients achieved pCR (32.88%). Of the 114 patients in I-SPY 1, 29 achieved pCR (25.44%). The attention-based multimodal model yielded the best predictive performance with an AUC of 0.73 ± 0.04 on the internal data and an AUC of 0.71 ± 0.02 on the external dataset. The MRI-only model (internal AUC = 0.68 ± 0.03, external AUC = 0.70 ± 0.04) and the non-MRI clinical features-only model (internal AUC = 0.66 ± 0.08, external AUC = 0.71 ± 0.03) trailed in performance, indicating the combination of both modalities is most effective. Conclusion We present a robust and interpretable deep learning framework for pCR prediction in breast cancer patients undergoing NAC. By combining imaging and clinical data with attention-based fusion, the model achieves strong predictive performance and generalizes across institutions.

Bibliographic Details
Main Authors: Taishi Nishizawa, Takouhie Maldjian, Zhicheng Jiao, Tim Q. Duong
Format: Article
Language:English
Published: BMC 2025-07-01
Series:Journal of Translational Medicine
Subjects:
Online Access:https://doi.org/10.1186/s12967-025-06617-w
_version_ 1849234711519428608
author Taishi Nishizawa
Takouhie Maldjian
Zhicheng Jiao
Tim Q. Duong
author_facet Taishi Nishizawa
Takouhie Maldjian
Zhicheng Jiao
Tim Q. Duong
author_sort Taishi Nishizawa
collection DOAJ
description Abstract Background Accurate prediction of pathological complete response (pCR) to neoadjuvant chemotherapy has significant clinical utility in the management of breast cancer treatment. Although multimodal deep learning models have shown promise for predicting pCR from medical imaging and other clinical data, their adoption has been limited due to challenges with interpretability and generalizability across institutions. Methods We developed a multimodal deep learning model combining post contrast-enhanced whole-breast MRI at pre- and post-treatment timepoints with non-imaging clinical features. The model integrates 3D convolutional neural networks and self-attention to capture spatial and cross-modal interactions. We utilized two public multi-institutional datasets to perform internal and external validation of the model. For model training and validation, we used data from the I-SPY 2 trial (N = 660). For external validation, we used the I-SPY 1 dataset (N = 114). Results Of the 660 patients in I-SPY 2, 217 patients achieved pCR (32.88%). Of the 114 patients in I-SPY 1, 29 achieved pCR (25.44%). The attention-based multimodal model yielded the best predictive performance with an AUC of 0.73 ± 0.04 on the internal data and an AUC of 0.71 ± 0.02 on the external dataset. The MRI-only model (internal AUC = 0.68 ± 0.03, external AUC = 0.70 ± 0.04) and the non-MRI clinical features-only model (internal AUC = 0.66 ± 0.08, external AUC = 0.71 ± 0.03) trailed in performance, indicating the combination of both modalities is most effective. Conclusion We present a robust and interpretable deep learning framework for pCR prediction in breast cancer patients undergoing NAC. By combining imaging and clinical data with attention-based fusion, the model achieves strong predictive performance and generalizes across institutions.
format Article
id doaj-art-aa9dd372d498483f9a15c46e310f5c4c
institution Kabale University
issn 1479-5876
language English
publishDate 2025-07-01
publisher BMC
record_format Article
series Journal of Translational Medicine
spelling doaj-art-aa9dd372d498483f9a15c46e310f5c4c2025-08-20T04:03:02ZengBMCJournal of Translational Medicine1479-58762025-07-0123111310.1186/s12967-025-06617-wAttention-based multimodal deep learning for interpretable and generalizable prediction of pathological complete response in breast cancerTaishi Nishizawa0Takouhie Maldjian1Zhicheng Jiao2Tim Q. Duong3Department of Diagnostic Radiology, Warren Alpert Medical School of Brown UniversityDepartment of Radiology, Montefiore Health System and Albert Einstein College of MedicineDepartment of Diagnostic Radiology, Warren Alpert Medical School of Brown UniversityDepartment of Radiology, Montefiore Health System and Albert Einstein College of MedicineAbstract Background Accurate prediction of pathological complete response (pCR) to neoadjuvant chemotherapy has significant clinical utility in the management of breast cancer treatment. Although multimodal deep learning models have shown promise for predicting pCR from medical imaging and other clinical data, their adoption has been limited due to challenges with interpretability and generalizability across institutions. Methods We developed a multimodal deep learning model combining post contrast-enhanced whole-breast MRI at pre- and post-treatment timepoints with non-imaging clinical features. The model integrates 3D convolutional neural networks and self-attention to capture spatial and cross-modal interactions. We utilized two public multi-institutional datasets to perform internal and external validation of the model. For model training and validation, we used data from the I-SPY 2 trial (N = 660). For external validation, we used the I-SPY 1 dataset (N = 114). Results Of the 660 patients in I-SPY 2, 217 patients achieved pCR (32.88%). Of the 114 patients in I-SPY 1, 29 achieved pCR (25.44%). The attention-based multimodal model yielded the best predictive performance with an AUC of 0.73 ± 0.04 on the internal data and an AUC of 0.71 ± 0.02 on the external dataset. The MRI-only model (internal AUC = 0.68 ± 0.03, external AUC = 0.70 ± 0.04) and the non-MRI clinical features-only model (internal AUC = 0.66 ± 0.08, external AUC = 0.71 ± 0.03) trailed in performance, indicating the combination of both modalities is most effective. Conclusion We present a robust and interpretable deep learning framework for pCR prediction in breast cancer patients undergoing NAC. By combining imaging and clinical data with attention-based fusion, the model achieves strong predictive performance and generalizes across institutions.https://doi.org/10.1186/s12967-025-06617-wCNN3D ResNetArtificial intelligenceMagnetic resonance imagingDynamic contrast enhanced MRIMolecular subtypes
spellingShingle Taishi Nishizawa
Takouhie Maldjian
Zhicheng Jiao
Tim Q. Duong
Attention-based multimodal deep learning for interpretable and generalizable prediction of pathological complete response in breast cancer
Journal of Translational Medicine
CNN
3D ResNet
Artificial intelligence
Magnetic resonance imaging
Dynamic contrast enhanced MRI
Molecular subtypes
title Attention-based multimodal deep learning for interpretable and generalizable prediction of pathological complete response in breast cancer
title_full Attention-based multimodal deep learning for interpretable and generalizable prediction of pathological complete response in breast cancer
title_fullStr Attention-based multimodal deep learning for interpretable and generalizable prediction of pathological complete response in breast cancer
title_full_unstemmed Attention-based multimodal deep learning for interpretable and generalizable prediction of pathological complete response in breast cancer
title_short Attention-based multimodal deep learning for interpretable and generalizable prediction of pathological complete response in breast cancer
title_sort attention based multimodal deep learning for interpretable and generalizable prediction of pathological complete response in breast cancer
topic CNN
3D ResNet
Artificial intelligence
Magnetic resonance imaging
Dynamic contrast enhanced MRI
Molecular subtypes
url https://doi.org/10.1186/s12967-025-06617-w
work_keys_str_mv AT taishinishizawa attentionbasedmultimodaldeeplearningforinterpretableandgeneralizablepredictionofpathologicalcompleteresponseinbreastcancer
AT takouhiemaldjian attentionbasedmultimodaldeeplearningforinterpretableandgeneralizablepredictionofpathologicalcompleteresponseinbreastcancer
AT zhichengjiao attentionbasedmultimodaldeeplearningforinterpretableandgeneralizablepredictionofpathologicalcompleteresponseinbreastcancer
AT timqduong attentionbasedmultimodaldeeplearningforinterpretableandgeneralizablepredictionofpathologicalcompleteresponseinbreastcancer