Identification and Validation of an Explainable Prediction Model of Sepsis in Patients With Intracerebral Hemorrhage: Multicenter Retrospective Study

Background: Sepsis is a life-threatening condition frequently observed in patients with intracerebral hemorrhage (ICH) who are critically ill. Early and accurate identification and prediction of sepsis are crucial. Machine learning (ML)–based predictive models exhibit promising...

Bibliographic Details
Main Authors:	Xianglin Liu, Zhihua Huang, Yizhi Guo, Yandeng Li, Jianming Zhu, Jun Wen, Yunchun Gao, Jianyi Liu
Format:	Article
Language:	English
Published:	JMIR Publications 2025-04-01
Series:	Journal of Medical Internet Research
Online Access:	https://www.jmir.org/2025/1/e71413

_version_	1849698025470951424
author	Xianglin Liu Zhihua Huang Yizhi Guo Yandeng Li Jianming Zhu Jun Wen Yunchun Gao Jianyi Liu
author_sort	Xianglin Liu
collection	DOAJ
description	Background: Sepsis is a life-threatening condition frequently observed in patients with intracerebral hemorrhage (ICH) who are critically ill. Early and accurate identification and prediction of sepsis are crucial. Machine learning (ML)–based predictive models exhibit promising sepsis prediction capabilities in emergency settings. However, their application in predicting sepsis among patients with ICH is still limited. Objective: The aim of the study is to develop an ML-driven risk calculator for early prediction of sepsis in patients with ICH who are critically ill, and to clarify feature importance and explain the model using the Shapley Additive Explanations (SHAP) method. Methods: Patients with ICH admitted to the intensive care unit (ICU) from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database between 2008 and 2022 were divided into training and internal test sets. The external test was performed using the eICU Collaborative Research Database, which includes over 200,000 ICU admissions across the United States between 2014 and 2015. Sepsis following ICU admission was identified using the Sepsis-3.0 criteria: suspected infection combined with an increase of ≥2 points in the Sequential Organ Failure Assessment score. The Boruta algorithm was used for feature selection, confirming 29 features. Nine ML algorithms were used to construct the prediction models. Predictive performance was compared using several evaluation metrics, including the area under the receiver operating characteristic curve (AUC). The SHAP technique was used to interpret the final model, and a web-based risk calculator was constructed for clinical practice. Results: Overall, 2414 patients with ICH were enrolled from the MIMIC-IV database, with 1689 and 725 patients assigned to the training and internal test sets, respectively. An external test set of 2806 patients with ICH from the eICU database was used. Among the 9 ML models tested, the categorical boosting (CatBoost) model demonstrated the best discriminative ability. After reducing features based on their importance, an explainable final CatBoost model was developed using 8 features. The final model accurately predicted sepsis in internal (AUC=0.812) and external (AUC=0.771) tests. Conclusions: We constructed a web-based risk calculator with 8 features based on the CatBoost model to assist clinicians in identifying patients with ICH who are critically ill and at high risk for sepsis.
format	Article
id	doaj-art-dd10d22f1cbc435c8909003e1a4c7ab2
institution	DOAJ
issn	1438-8871
language	English
publishDate	2025-04-01
publisher	JMIR Publications
record_format	Article
series	Journal of Medical Internet Research
spelling	doaj-art-dd10d22f1cbc435c8909003e1a4c7ab2; 2025-08-20T03:19:02Z; eng; JMIR Publications; Journal of Medical Internet Research; 1438-8871; 2025-04-01; 27; e71413; 10.2196/71413; Identification and Validation of an Explainable Prediction Model of Sepsis in Patients With Intracerebral Hemorrhage: Multicenter Retrospective Study; Xianglin Liu (https://orcid.org/0009-0005-3342-0874); Zhihua Huang (https://orcid.org/0009-0005-0339-5460); Yizhi Guo (https://orcid.org/0009-0001-5168-994X); Yandeng Li (https://orcid.org/0009-0005-4555-0342); Jianming Zhu (https://orcid.org/0000-0002-9503-3416); Jun Wen (https://orcid.org/0009-0007-7756-9501); Yunchun Gao (https://orcid.org/0009-0001-6840-4218); Jianyi Liu (https://orcid.org/0009-0009-2747-1623); https://www.jmir.org/2025/1/e71413
title	Identification and Validation of an Explainable Prediction Model of Sepsis in Patients With Intracerebral Hemorrhage: Multicenter Retrospective Study
url	https://www.jmir.org/2025/1/e71413
