A predictive model for hospital death in cancer patients with acute pulmonary embolism using XGBoost machine learning and SHAP interpretation

Abstract The prediction of in-hospital mortality in cancer patients with acute pulmonary embolism (APE) remains a significant clinical challenge. This study aimed to develop and validate a machine learning model using XGBoost to predict in-hospital mortality in this vulnerable population. A retrospe...

Bibliographic Details
Main Authors: Zhen-nan Yuan, Yu-juan Xue, Hai-jun Wang, Shi-ning Qu, Chu-lin Huang, Hao Wang, Hao Zhang, Min-ze Zhang, Xue-zhong Xing
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Scientific Reports
Subjects: Machine learning; Acute pulmonary embolism; In-hospital mortality; Cancer
Online Access: https://doi.org/10.1038/s41598-025-02072-1
_version_ 1849705325986316288
author Zhen-nan Yuan
Yu-juan Xue
Hai-jun Wang
Shi-ning Qu
Chu-lin Huang
Hao Wang
Hao Zhang
Min-ze Zhang
Xue-zhong Xing
collection DOAJ
description Abstract: The prediction of in-hospital mortality in cancer patients with acute pulmonary embolism (APE) remains a significant clinical challenge. This study aimed to develop and validate a machine learning model using XGBoost to predict in-hospital mortality in this vulnerable population. A retrospective cohort study was conducted using the MIMIC-IV 2.2 database and external data from the intensive care unit of the Cancer Hospital, Chinese Academy of Medical Sciences, collected between May 1, 2021, and April 30, 2023. A total of 448 cancer patients with APE were included from the MIMIC-IV 2.2 database and divided into a training set (70%, n = 314) and an internal validation set (30%, n = 134). The external validation cohort consisted of 56 patients. An XGBoost model was trained, and the SHAP (SHapley Additive exPlanations) method was used to identify the top 10 predictors of in-hospital mortality: Glasgow Coma Scale (GCS) score, albumin, platelet count, age, serum creatinine, hemoglobin, presence of metastasis, lactate, creatine kinase (CK), and type of cancer. The XGBoost model achieved an area under the ROC curve (AUC) of 0.806 (95% CI: 0.717–0.896) in the internal validation set and 0.724 (95% CI: 0.686–0.901) in the external validation set. Calibration curves indicated good model fit, and decision curve analysis (DCA) demonstrated a high clinical benefit in both the internal and external validation cohorts. The XGBoost model, with SHAP used for interpretation, effectively predicts in-hospital mortality in cancer patients with APE. The model provides valuable insights for clinical decision-making and has the potential to improve patient outcomes through early intervention and personalized treatment strategies. Further validation in diverse clinical settings is warranted to confirm its generalizability.
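The description above reports a 70/30 split of 448 patients and AUC values with 95% confidence intervals. As a minimal sketch of how such figures can be computed, the following pure-Python example calculates AUC from pairwise rankings and a percentile bootstrap interval. The record does not say how the authors obtained their intervals, so the bootstrap method is an assumption, the function names `roc_auc` and `bootstrap_auc_ci` are hypothetical, and the demo data are synthetic rather than study data.

```python
import random


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC (an assumed method, not the authors')."""
    roc_auc(labels, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lo = aucs[round((alpha / 2) * (n_boot - 1))]
    hi = aucs[round((1 - alpha / 2) * (n_boot - 1))]
    return lo, hi


# Split sizes stated in the description: 70% of 448 rounds to 314, leaving 134.
n_total = 448
n_train = round(0.7 * n_total)
n_val = n_total - n_train

# Tiny hand-checkable case: 3 of the 4 positive/negative pairs are ranked correctly.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75

# Synthetic scores for a validation set of 134 cases (illustrative only).
rng = random.Random(42)
demo_labels = [1 if rng.random() < 0.3 else 0 for _ in range(n_val)]
demo_scores = [0.6 * y + 0.8 * rng.random() for y in demo_labels]
demo_auc = roc_auc(demo_labels, demo_scores)
ci_low, ci_high = bootstrap_auc_ci(demo_labels, demo_scores, n_boot=200)
print(n_train, n_val, toy_auc, round(demo_auc, 3), (round(ci_low, 3), round(ci_high, 3)))
```

With a small validation set the bootstrap interval is wide, which is consistent with the broad intervals quoted in the description.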
format Article
id doaj-art-5a9e2b57334b45f889f9ca5fdab3fae7
institution DOAJ
issn 2045-2322
language English
publishDate 2025-05-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-5a9e2b57334b45f889f9ca5fdab3fae7
2025-08-20T03:16:30Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-05-01
15 1 1 11
10.1038/s41598-025-02072-1
A predictive model for hospital death in cancer patients with acute pulmonary embolism using XGBoost machine learning and SHAP interpretation
Zhen-nan Yuan (0) Department of Intensive Care Unit, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College
Yu-juan Xue (1) Department of Pediatrics, Peking University People’s Hospital, Peking University
Hai-jun Wang (2) Department of Intensive Care Unit, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College
Shi-ning Qu (3) Department of Intensive Care Unit, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College
Chu-lin Huang (4) Department of Intensive Care Unit, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College
Hao Wang (5) Department of Intensive Care Unit, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College
Hao Zhang (6) Department of Intensive Care Unit, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College
Min-ze Zhang (7) People’s Hospital of Lishui District
Xue-zhong Xing (8) Department of Intensive Care Unit, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College
https://doi.org/10.1038/s41598-025-02072-1
Machine learning
Acute pulmonary embolism
In-hospital mortality
Cancer
title A predictive model for hospital death in cancer patients with acute pulmonary embolism using XGBoost machine learning and SHAP interpretation
topic Machine learning
Acute pulmonary embolism
In-hospital mortality
Cancer
url https://doi.org/10.1038/s41598-025-02072-1