Development and validation of interpretable machine learning models to predict distant metastasis and prognosis of muscle-invasive bladder cancer patients

Abstract Muscle-Invasive Bladder Cancer (MIBC) is a more aggressive disease than non-muscle-invasive bladder cancer (NMIBC), with greater chances of metastasis. We sought to develop machine learning (ML) models to predict metastasis and prognosis in MIBC patients. Clinical data of MIBC cases from 20...

Bibliographic Details
Main Authors: Qian Deng, Shan Li, Yuxiang Zhang, Yuanyuan Jia, Yanhui Yang
Format: Article
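Language: English
Published: Nature Portfolio 2025-04-01
Series: Scientific Reports
Subjects: Machine learning; Muscle-Invasive bladder Cancer; Distant metastasis; Prognosis prediction; SEER
Online Access: https://doi.org/10.1038/s41598-025-96089-1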
author Qian Deng
Shan Li
Yuxiang Zhang
Yuanyuan Jia
Yanhui Yang
collection DOAJ
description Abstract Muscle-invasive bladder cancer (MIBC) is more aggressive than non-muscle-invasive bladder cancer (NMIBC) and carries a greater risk of metastasis. We sought to develop machine learning (ML) models to predict distant metastasis (DM) and prognosis in MIBC patients. Clinical data on MIBC cases from 2000 to 2020 were sourced from the Surveillance, Epidemiology, and End Results (SEER) database. Clinical variables for predicting DM were identified through univariate and multivariate logistic regression and Recursive Feature Elimination (RFE). Thirteen ML models predicting DM were evaluated on AUC, PRAUC, accuracy, sensitivity, specificity, precision, cross-entropy, Brier score, balanced accuracy, and F-beta score. The SHapley Additive exPlanations (SHAP) framework was used to interpret the best model. Additionally, we used combinations of ML algorithms to predict prognosis in MIBC patients with metastasis. A total of 43,951 T2-T4 MIBC patients aged over 18 years from the SEER database were enrolled consecutively. Nine clinical variables were selected to predict DM. The CatBoost model was identified as the optimal predictor, with AUC values of 0.956 [0.933, 0.969] for the training set, 0.882 [0.857, 0.919] for the internal test set, and 0.839 [0.723, 0.936] for the external test set. The model achieved an accuracy of 0.875 [0.854, 0.896], sensitivity of 0.869 [0.851, 0.889], specificity of 0.883 [0.823, 0.912], and precision of 0.917 [0.885, 0.944]. SHAP analysis revealed that tumor size was the most influential factor in predicting distant metastasis. For prognosis, the “RSF + Enet[alpha = 0.8]” model emerged as the top performer, with C-index values of 0.683 in the training set, 0.688 in the internal test set, and 0.666 in the external test set. Our ML models deliver accurate and dependable individualized predictions of metastasis risk and prognosis in MIBC patients.
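The abstract names its evaluation metrics without defining them. As a rough illustration only, the sketch below shows how several of them (sensitivity, specificity, precision, balanced accuracy, F-beta, Brier score, AUC, and the C-index used for the prognosis model) are conventionally computed. It is not the authors' code: the function names are hypothetical, the 0.5 threshold is an assumption, and the toy inputs are invented for the example rather than taken from the study.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                           threshold: float = 0.5, beta: float = 1.0) -> Dict[str, float]:
    """Threshold-based metrics plus the Brier score for binary labels (1 = event)."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0  # assumed decision threshold
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    n = len(y_true)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    b2 = beta * beta
    denom = b2 * precision + sensitivity
    f_beta = (1 + b2) * precision * sensitivity / denom if denom else 0.0
    # Brier score: mean squared error between predicted probability and outcome.
    brier = sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / n
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f_beta": f_beta,
        "brier": brier,
    }


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the chance a random positive is scored above a random negative (ties = 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def concordance_index(times: Sequence[float], events: Sequence[int],
                      risks: Sequence[float]) -> float:
    """Harrell's C-index: share of comparable pairs where higher risk means earlier event."""
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        for j in range(len(times)):
            # A pair is comparable when subject i had the event before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labels = [1, 1, 1, 0, 0, 0]
    probs = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
    print(classification_metrics(labels, probs))
    print(roc_auc(labels, probs))
    print(concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1]))
```

Read this way, a C-index of about 0.67 to 0.69 means the prognosis model ranks roughly two out of three comparable patient pairs in the correct order, where 0.5 corresponds to chance.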
format Article
id doaj-art-8dbb3a29abfa484792beffd7bde9f38f
institution OA Journals
issn 2045-2322
language English
publishDate 2025-04-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-8dbb3a29abfa484792beffd7bde9f38f
2025-08-20T02:16:59Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-04-01
vol. 15, no. 1, pp. 1-21
10.1038/s41598-025-96089-1
Development and validation of interpretable machine learning models to predict distant metastasis and prognosis of muscle-invasive bladder cancer patients
Qian Deng (0): Luoyang Central Hospital Affiliated of Zhengzhou University
Shan Li (1): Department of Urology, Children’s Hospital of Chongqing Medical University
Yuxiang Zhang (2): Department of Urology Surgery, The First Affiliated Hospital, and College of Clinical Medicine of Henan University of Science and Technology
Yuanyuan Jia (3): Department of Oncology, Huai’an Second People’s Hospital, Affiliated to Xuzhou Medical University
Yanhui Yang (4): Department of Emergency Surgery (Trauma Center), The First Affiliated Hospital, and College of Clinical Medicine of Henan University of Science and Technology
https://doi.org/10.1038/s41598-025-96089-1
Machine learning
Muscle-Invasive bladder Cancer
Distant metastasis
Prognosis prediction
SEER
title Development and validation of interpretable machine learning models to predict distant metastasis and prognosis of muscle-invasive bladder cancer patients
topic Machine learning
Muscle-Invasive bladder Cancer
Distant metastasis
Prognosis prediction
SEER
url https://doi.org/10.1038/s41598-025-96089-1