Identification and Validation of an Explainable Prediction Model of Sepsis in Patients With Intracerebral Hemorrhage: Multicenter Retrospective Study

Bibliographic Details
Main Authors: Xianglin Liu, Zhihua Huang, Yizhi Guo, Yandeng Li, Jianming Zhu, Jun Wen, Yunchun Gao, Jianyi Liu
Format: Article
Language: English
Published: JMIR Publications 2025-04-01
Series: Journal of Medical Internet Research
Online Access: https://www.jmir.org/2025/1/e71413
Description
Summary:
Background: Sepsis is a life-threatening condition frequently observed in patients with intracerebral hemorrhage (ICH) who are critically ill. Early and accurate identification and prediction of sepsis are crucial. Machine learning (ML)-based predictive models exhibit promising sepsis prediction capabilities in emergency settings. However, their application in predicting sepsis among patients with ICH is still limited.
Objective: The aim of the study is to develop an ML-driven risk calculator for early prediction of sepsis in patients with ICH who are critically ill and to clarify feature importance and explain the model using the Shapley Additive Explanations method.
Methods: Patients with ICH admitted to the intensive care unit (ICU) from the Medical Information Mart for Intensive Care IV database between 2008 and 2022 were divided into training and internal test sets. External testing was performed using the eICU Collaborative Research Database, which includes over 200,000 ICU admissions across the United States between 2014 and 2015. Sepsis following ICU admission was identified using the Sepsis-3.0 criteria, that is, a clinical diagnosis combining an increase in the Sequential Organ Failure Assessment score of ≥2 points with suspected infection. The Boruta algorithm was used for feature selection, confirming 29 features. Nine ML algorithms were used to construct the prediction models. Predictive performance was compared using several evaluation metrics, including the area under the receiver operating characteristic curve (AUC). The Shapley Additive Explanations technique was used to interpret the final model, and a web-based risk calculator was constructed for clinical practice.
Results: Overall, 2414 patients with ICH were enrolled from the Medical Information Mart for Intensive Care IV database, with 1689 and 725 patients assigned to the training and internal test sets, respectively. An external test set of 2806 patients with ICH from the eICU database was used. Among the 9 ML models tested, the categorical boosting (CatBoost) model demonstrated the best discriminative ability. After reducing features based on their importance, an explainable final CatBoost model was developed using 8 features. The final model accurately predicted sepsis in internal (AUC=0.812) and external (AUC=0.771) tests.
Conclusions: We constructed a web-based risk calculator with 8 features based on the CatBoost model to assist clinicians in identifying patients with ICH who are critically ill and at high risk for sepsis.
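Illustrative note: the summary relies on two definitions that are easy to state in code, namely the Sepsis-3 labeling rule (a Sequential Organ Failure Assessment increase of ≥2 points together with suspected infection) and the AUC used to compare models. The following is a minimal Python sketch of both and is not the authors' implementation. The function names, parameter names, and toy data are hypothetical, and the AUC is computed here by simple pairwise comparison rather than by whatever library the study used.

```python
from typing import Sequence


def meets_sepsis3(baseline_sofa: int, current_sofa: int,
                  suspected_infection: bool) -> bool:
    """Hypothetical helper: flag Sepsis-3 when the SOFA score rises by
    at least 2 points and infection is suspected."""
    return suspected_infection and (current_sofa - baseline_sofa) >= 2


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case gets a
    higher score than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy values made up for illustration; they are not study data.
    labels = [1, 0, 1, 0, 1]
    scores = [0.9, 0.2, 0.6, 0.7, 0.4]
    print(meets_sepsis3(baseline_sofa=1, current_sofa=4, suspected_infection=True))
    print(round(roc_auc(labels, scores), 3))
```

Read this way, an AUC of 0.812 means that the model ranks a patient who develops sepsis above one who does not in roughly 81% of such pairs.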
ISSN: 1438-8871