Identification and Validation of an Explainable Prediction Model of Sepsis in Patients With Intracerebral Hemorrhage: Multicenter Retrospective Study

Background: Sepsis is a life-threatening condition frequently observed in patients with intracerebral hemorrhage (ICH) who are critically ill. Early and accurate identification and prediction of sepsis are crucial. Machine learning (ML)–based predictive models exhibit promising...

Bibliographic Details
Main Authors: Xianglin Liu, Zhihua Huang, Yizhi Guo, Yandeng Li, Jianming Zhu, Jun Wen, Yunchun Gao, Jianyi Liu
Format: Article
Language: English
Published: JMIR Publications 2025-04-01
Series: Journal of Medical Internet Research
Online Access: https://www.jmir.org/2025/1/e71413
collection DOAJ
description Background: Sepsis is a life-threatening condition frequently observed in patients with intracerebral hemorrhage (ICH) who are critically ill. Early and accurate identification and prediction of sepsis are crucial. Machine learning (ML)–based predictive models exhibit promising sepsis prediction capabilities in emergency settings. However, their application in predicting sepsis among patients with ICH is still limited.
Objective: The aim of the study is to develop an ML-driven risk calculator for early prediction of sepsis in patients with ICH who are critically ill and to clarify feature importance and explain the model using the Shapley Additive Explanations method.
Methods: Patients with ICH admitted to the intensive care unit (ICU) from the Medical Information Mart for Intensive Care IV database between 2008 and 2022 were divided into training and internal test sets. The external test was performed using the eICU Collaborative Research Database, which includes over 200,000 ICU admissions across the United States between 2014 and 2015. Sepsis following ICU admission was identified using Sepsis-3.0 through clinical diagnosis combining elevation of the Sequential Organ Failure Assessment by ≥2 points with suspected infection. The Boruta algorithm was used for feature selection, confirming 29 features. Nine ML algorithms were used to construct the prediction models. Predictive performance was compared using several evaluation metrics, including the area under the receiver operating characteristic curve (AUC). The Shapley Additive Explanations technique was used to interpret the final model, and a web-based risk calculator was constructed for clinical practice.
Results: Overall, 2414 patients with ICH were enrolled from the Medical Information Mart for Intensive Care IV database, with 1689 and 725 patients assigned to the training and internal test sets, respectively. An external test set of 2806 patients with ICH from the eICU database was used. Among the 9 ML models tested, the categorical boosting (CatBoost) model demonstrated the best discriminative ability. After reducing features based on their importance, an explainable final CatBoost model was developed using 8 features. The final model accurately predicted sepsis in internal (AUC=0.812) and external (AUC=0.771) tests.
Conclusions: We constructed a web-based risk calculator with 8 features based on the CatBoost model to assist clinicians in identifying critically ill patients with ICH who are at high risk for sepsis.
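The abstract reports discrimination as AUC. As a minimal sketch of what that metric measures, the snippet below computes AUC directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name roc_auc and the toy labels and scores are hypothetical illustrations; they are not taken from the study, its data, or its code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC as P(score of a positive > score of a negative).

    Ties between a positive and a negative score count as 0.5.
    This pairwise form is O(n_pos * n_neg), fine for small examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = sepsis, 0 = no sepsis; scores are predicted risks.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported internal (0.812) and external (0.771) values should be read.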
format Article
id doaj-art-dd10d22f1cbc435c8909003e1a4c7ab2
institution DOAJ
issn 1438-8871
language English
publishDate 2025-04-01
publisher JMIR Publications
record_format Article
series Journal of Medical Internet Research
spelling doaj-art-dd10d22f1cbc435c8909003e1a4c7ab2
2025-08-20T03:19:02Z
eng
JMIR Publications
Journal of Medical Internet Research
1438-8871
2025-04-01
27
e71413
10.2196/71413
Identification and Validation of an Explainable Prediction Model of Sepsis in Patients With Intracerebral Hemorrhage: Multicenter Retrospective Study
Xianglin Liu https://orcid.org/0009-0005-3342-0874
Zhihua Huang https://orcid.org/0009-0005-0339-5460
Yizhi Guo https://orcid.org/0009-0001-5168-994X
Yandeng Li https://orcid.org/0009-0005-4555-0342
Jianming Zhu https://orcid.org/0000-0002-9503-3416
Jun Wen https://orcid.org/0009-0007-7756-9501
Yunchun Gao https://orcid.org/0009-0001-6840-4218
Jianyi Liu https://orcid.org/0009-0009-2747-1623
https://www.jmir.org/2025/1/e71413
title Identification and Validation of an Explainable Prediction Model of Sepsis in Patients With Intracerebral Hemorrhage: Multicenter Retrospective Study
url https://www.jmir.org/2025/1/e71413