Attention-based multimodal deep learning for interpretable and generalizable prediction of pathological complete response in breast cancer

Bibliographic Details
Main Authors:	Taishi Nishizawa, Takouhie Maldjian, Zhicheng Jiao, Tim Q. Duong
Format:	Article
Language:	English
Published:	BMC 2025-07-01
Series:	Journal of Translational Medicine
Subjects:	CNN; 3D ResNet; Artificial intelligence; Magnetic resonance imaging; Dynamic contrast enhanced MRI; Molecular subtypes
Online Access:	https://doi.org/10.1186/s12967-025-06617-w

Description
Summary:	Background: Accurate prediction of pathological complete response (pCR) to neoadjuvant chemotherapy (NAC) has significant clinical utility in the management of breast cancer treatment. Although multimodal deep learning models have shown promise for predicting pCR from medical imaging and other clinical data, their adoption has been limited due to challenges with interpretability and generalizability across institutions.
Methods: We developed a multimodal deep learning model combining post contrast-enhanced whole-breast MRI at pre- and post-treatment timepoints with non-imaging clinical features. The model integrates 3D convolutional neural networks and self-attention to capture spatial and cross-modal interactions. We utilized two public multi-institutional datasets to perform internal and external validation of the model. For model training and validation, we used data from the I-SPY 2 trial (N = 660). For external validation, we used the I-SPY 1 dataset (N = 114).
Results: Of the 660 patients in I-SPY 2, 217 achieved pCR (32.88%). Of the 114 patients in I-SPY 1, 29 achieved pCR (25.44%). The attention-based multimodal model yielded the best predictive performance with an AUC of 0.73 ± 0.04 on the internal data and an AUC of 0.71 ± 0.02 on the external dataset. The MRI-only model (internal AUC = 0.68 ± 0.03, external AUC = 0.70 ± 0.04) and the non-MRI clinical features-only model (internal AUC = 0.66 ± 0.08, external AUC = 0.71 ± 0.03) trailed in performance, indicating the combination of both modalities is most effective.
Conclusion: We present a robust and interpretable deep learning framework for pCR prediction in breast cancer patients undergoing NAC. By combining imaging and clinical data with attention-based fusion, the model achieves strong predictive performance and generalizes across institutions.
ISSN:	1479-5876
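
The summary describes self-attention used to capture cross-modal interactions between MRI and clinical features. The following is a minimal sketch of that general idea, not the authors' architecture: it treats one MRI embedding and one clinical embedding as two tokens and applies scaled dot-product self-attention with identity projections (no learned weights). The function names `self_attention` and `fuse`, the mean-pooling step, and the example vectors are all assumptions made for illustration; in a real model the embeddings would come from a 3D CNN and a clinical-feature encoder, and the projections would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    tokens: list of equal-length float vectors, one per modality.
    Returns (attended_tokens, attention_weights).
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    attended, weights = [], []
    for query in tokens:
        # Similarity of this token to every token (itself included).
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of the value vectors (here, the tokens themselves).
        attended.append(
            [sum(row[j] * tokens[j][i] for j in range(len(tokens))) for i in range(dim)]
        )
    return attended, weights


def fuse(mri_embedding, clinical_embedding):
    """Hypothetical fusion: attend across the two modalities, then mean-pool."""
    attended, weights = self_attention([mri_embedding, clinical_embedding])
    fused = [sum(column) / len(attended) for column in zip(*attended)]
    return fused, weights


if __name__ == "__main__":
    # Made-up 4-dimensional embeddings, for illustration only.
    fused, weights = fuse([0.2, -1.0, 0.5, 0.3], [1.0, 0.0, 0.0, 1.0])
    print("fused vector:", fused)
    print("attention weights (rows: MRI, clinical):", weights)
```

Each row of the returned weight matrix sums to 1 and shows how strongly one modality attends to itself versus the other, which is the kind of quantity an attention-based model can expose for interpretation.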


Similar Items