A predictive model for hospital death in cancer patients with acute pulmonary embolism using XGBoost machine learning and SHAP interpretation
Abstract The prediction of in-hospital mortality in cancer patients with acute pulmonary embolism (APE) remains a significant clinical challenge. This study aimed to develop and validate a machine learning model using XGBoost to predict in-hospital mortality in this vulnerable population. A retrospe...
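| Main Authors: | Zhen-nan Yuan, Yu-juan Xue, Hai-jun Wang, Shi-ning Qu, Chu-lin Huang, Hao Wang, Hao Zhang, Min-ze Zhang, Xue-zhong Xing |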
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-05-01 |
| Series: | Scientific Reports |
| Subjects: | Machine learning, Acute pulmonary embolism, In-hospital mortality, Cancer |
| Online Access: | https://doi.org/10.1038/s41598-025-02072-1 |
| _version_ | 1849705325986316288 |
|---|---|
| author | Zhen-nan Yuan Yu-juan Xue Hai-jun Wang Shi-ning Qu Chu-lin Huang Hao Wang Hao Zhang Min-ze Zhang Xue-zhong Xing |
| author_sort | Zhen-nan Yuan |
| collection | DOAJ |
| description | Abstract: The prediction of in-hospital mortality in cancer patients with acute pulmonary embolism (APE) remains a significant clinical challenge. This study aimed to develop and validate a machine learning model using XGBoost to predict in-hospital mortality in this vulnerable population. A retrospective cohort study was conducted using the MIMIC-IV 2.2 database and external data from the intensive care unit of the Cancer Hospital, Chinese Academy of Medical Sciences, collected between May 1, 2021, and April 30, 2023. A total of 448 cancer patients with APE were included from the MIMIC-IV 2.2 database and divided into a training set (70%, n = 314) and an internal validation set (30%, n = 134). The external validation cohort comprised 56 patients. An XGBoost model was trained, and the SHAP (SHapley Additive Explanations) method was used to identify the top 10 predictors of in-hospital mortality. These predictors were Glasgow Coma Scale (GCS) score, albumin, platelet count, age, serum creatinine, hemoglobin, presence of metastasis, lactate, creatine kinase (CK), and cancer type. The XGBoost model achieved an area under the ROC curve (AUC) of 0.806 (95% CI: 0.717–0.896) in the internal validation set and 0.724 (95% CI: 0.686–0.901) in the external validation set. Calibration curves indicated good model fit, and decision curve analysis (DCA) demonstrated high clinical benefit in both the internal and external validation cohorts. The XGBoost model, with SHAP used for interpretation, effectively predicts in-hospital mortality in cancer patients with APE. The model provides valuable insights for clinical decision-making and has the potential to improve patient outcomes through early intervention and personalized treatment strategies. Further validation in diverse clinical settings is warranted to confirm its generalizability. |
| format | Article |
| id | doaj-art-5a9e2b57334b45f889f9ca5fdab3fae7 |
| institution | DOAJ |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-5a9e2b57334b45f889f9ca5fdab3fae7; 2025-08-20T03:16:30Z; eng; Nature Portfolio; Scientific Reports; ISSN 2045-2322; 2025-05-01; vol. 15, no. 1, pp. 1-11; doi:10.1038/s41598-025-02072-1; A predictive model for hospital death in cancer patients with acute pulmonary embolism using XGBoost machine learning and SHAP interpretation; Zhen-nan Yuan, Hai-jun Wang, Shi-ning Qu, Chu-lin Huang, Hao Wang, Hao Zhang, Xue-zhong Xing (Department of Intensive Care Unit, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College); Yu-juan Xue (Department of Pediatrics, Peking University People’s Hospital, Peking University); Min-ze Zhang (People’s Hospital of Lishui District); https://doi.org/10.1038/s41598-025-02072-1; Machine learning; Acute pulmonary embolism; In-hospital mortality; Cancer |
| title | A predictive model for hospital death in cancer patients with acute pulmonary embolism using XGBoost machine learning and SHAP interpretation |
| topic | Machine learning Acute pulmonary embolism In-hospital mortality Cancer |
| url | https://doi.org/10.1038/s41598-025-02072-1 |
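
The description field reports model performance as an area under the ROC curve (AUC) and as clinical benefit from decision curve analysis (DCA). As a rough illustration of what those two quantities measure, here is a minimal sketch in plain Python. It is not the authors' code: the function names `roc_auc` and `net_benefit`, the toy labels, and the toy risk scores are all invented for this example, and the study's own pipeline (XGBoost training, SHAP ranking, confidence intervals) is not reproduced.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A tied pair counts as half a correct ranking.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_score: Sequence[float],
                threshold: float) -> float:
    """Decision-curve net benefit at one risk threshold pt.

    NB = TP/n - (FP/n) * pt / (1 - pt), where a patient is treated as
    high risk when the predicted risk is at least pt.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data (1 = died in hospital, 0 = survived); not study data.
labels = [0, 0, 1, 1, 0, 1, 0, 0]
risks = [0.10, 0.35, 0.40, 0.80, 0.20, 0.65, 0.50, 0.05]

auc = roc_auc(labels, risks)
nb = net_benefit(labels, risks, threshold=0.3)
print(f"AUC = {auc:.3f}, net benefit at pt=0.3 = {nb:.3f}")
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result with the "treat all" and "treat none" strategies. The pairwise AUC loop is quadratic in the number of patients, which is acceptable for cohorts of the size reported here but would normally be replaced by a rank-based computation.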