Derivation and validation of a clinical predictive model for longer duration diarrhea among pediatric patients in Kenya using machine learning algorithms

Bibliographic Details
Main Authors:	Billy Ogwel, Vincent H. Mzazi, Alex O. Awuor, Caleb Okonji, Raphael O. Anyango, Caren Oreso, John B. Ochieng, Stephen Munga, Dilruba Nasrin, Kirkby D. Tickell, Patricia B. Pavlinac, Karen L. Kotloff, Richard Omore
Format:	Article
Language:	English
Published:	BMC 2025-01-01
Series:	BMC Medical Informatics and Decision Making
Subjects:	Machine Learning; Longer duration diarrhea; Pediatric; Prediction
Online Access:	https://doi.org/10.1186/s12911-025-02855-6

_version_	1832594717690298368
author	Billy Ogwel, Vincent H. Mzazi, Alex O. Awuor, Caleb Okonji, Raphael O. Anyango, Caren Oreso, John B. Ochieng, Stephen Munga, Dilruba Nasrin, Kirkby D. Tickell, Patricia B. Pavlinac, Karen L. Kotloff, Richard Omore
author_sort	Billy Ogwel
collection	DOAJ
description	Abstract
Background: Despite the adverse health outcomes associated with longer duration diarrhea (LDD), there are currently no clinical decision tools for timely identification and better management of children at increased risk. This study uses machine learning (ML) to derive and validate a predictive model for LDD among children presenting with diarrhea to health facilities.
Methods: LDD was defined as a diarrhea episode lasting ≥ 7 days. We used 7 ML algorithms to build prognostic models for the prediction of LDD among children < 5 years, using de-identified data from the Vaccine Impact on Diarrhea in Africa study (N = 1,482) for model development and data from the Enterics for Global Health Shigella study (N = 682) for temporal validation of the champion model. Features included demographic, medical history and clinical examination data collected at enrolment in both studies. We conducted split-sampling and employed K-fold cross-validation with an over-sampling technique in model development. Critical predictors of LDD and their impact on prediction were obtained using an explainable, model-agnostic approach. The champion model was determined based on the area under the curve (AUC) metric. Model calibration was assessed using the Brier score and Spiegelhalter's z-test with its accompanying p-value.
Results: There was a significant difference in the prevalence of LDD between the development and temporal validation cohorts (478 [32.3%] vs 69 [10.1%]; p < 0.001). The following variables were associated with LDD, in decreasing order: pre-enrolment diarrhea days (55.1%), modified Vesikari score (18.2%), age group (10.7%), vomit days (8.8%), respiratory rate (6.5%), vomiting (6.4%), vomit frequency (6.2%), rotavirus vaccination (6.1%), skin pinch (2.4%) and stool frequency (2.4%). While all models showed good prediction capability, the random forest model achieved the best performance (AUC [95% Confidence Interval]: 83.0 [78.6–87.5] and 71.0 [62.5–79.4] on the development and temporal validation datasets, respectively). The random forest model showed slight deviations from perfect calibration, but these deviations were not statistically significant (Brier score = 0.17, Spiegelhalter p-value = 0.219).
Conclusions: Our study suggests ML-derived algorithms could be used to rapidly identify children at increased risk of LDD. Integrating ML-derived models into clinical decision-making may allow clinicians to target these children with closer observation and enhanced management.
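The abstract reports calibration through the Brier score and Spiegelhalter's z-test. As a minimal sketch of how these two quantities are commonly computed from predicted probabilities, the Python below uses only the standard library. It is not the authors' code: the function names `brier_score` and `spiegelhalter_z` and the toy outcome and probability lists are assumptions made for illustration and have no connection to the study data.

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def spiegelhalter_z(y_true, y_prob):
    """Spiegelhalter's z statistic and its two-sided p-value.

    Under the null hypothesis of perfect calibration, z is approximately
    standard normal; a large |z| (small p-value) suggests miscalibration.
    """
    numerator = sum((y - p) * (1 - 2 * p) for y, p in zip(y_true, y_prob))
    variance = sum((1 - 2 * p) ** 2 * p * (1 - p) for p in y_prob)
    if variance == 0:
        # Happens when every prediction is exactly 0, 0.5 or 1.
        raise ValueError("z is undefined for these predicted probabilities")
    z = numerator / math.sqrt(variance)
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Toy data (hypothetical): 1 = LDD, 0 = no LDD, with model-predicted risks.
toy_outcomes = [1, 0, 0, 1, 0, 1, 0, 0]
toy_risks = [0.8, 0.2, 0.3, 0.6, 0.1, 0.7, 0.4, 0.2]

if __name__ == "__main__":
    z, p = spiegelhalter_z(toy_outcomes, toy_risks)
    print("Brier score:", round(brier_score(toy_outcomes, toy_risks), 4))
    print("Spiegelhalter z:", round(z, 4), "p-value:", round(p, 4))
```

A lower Brier score indicates predictions closer to the observed outcomes, and a non-significant Spiegelhalter p-value, such as the 0.219 reported in the abstract, is read as no statistical evidence of miscalibration.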
format	Article
id	doaj-art-bd3c170ffd5a488da8df26d7e3e7224d
institution	Kabale University
issn	1472-6947
language	English
publishDate	2025-01-01
publisher	BMC
record_format	Article
series	BMC Medical Informatics and Decision Making
spelling	doaj-art-bd3c170ffd5a488da8df26d7e3e7224d; 2025-01-19T12:26:01Z; eng; BMC; BMC Medical Informatics and Decision Making; 1472-6947; 2025-01-01; volume 25, issue 1, pages 1-16; 10.1186/s12911-025-02855-6
	Derivation and validation of a clinical predictive model for longer duration diarrhea among pediatric patients in Kenya using machine learning algorithms
	Billy Ogwel: Kenya Medical Research Institute- Center for Global Health Research (KEMRI-CGHR)
	Vincent H. Mzazi: Department of Information Systems, University of South Africa
	Alex O. Awuor: Kenya Medical Research Institute- Center for Global Health Research (KEMRI-CGHR)
	Caleb Okonji: Kenya Medical Research Institute- Center for Global Health Research (KEMRI-CGHR)
	Raphael O. Anyango: Kenya Medical Research Institute- Center for Global Health Research (KEMRI-CGHR)
	Caren Oreso: Kenya Medical Research Institute- Center for Global Health Research (KEMRI-CGHR)
	John B. Ochieng: Kenya Medical Research Institute- Center for Global Health Research (KEMRI-CGHR)
	Stephen Munga: Kenya Medical Research Institute- Center for Global Health Research (KEMRI-CGHR)
	Dilruba Nasrin: Department of Medicine, Center for Vaccine Development and Global Health, University of Maryland School of Medicine
	Kirkby D. Tickell: Department of Global Health, University of Washington
	Patricia B. Pavlinac: Department of Global Health, University of Washington
	Karen L. Kotloff: Department of Medicine, Center for Vaccine Development and Global Health, University of Maryland School of Medicine
	Richard Omore: Kenya Medical Research Institute- Center for Global Health Research (KEMRI-CGHR)
	https://doi.org/10.1186/s12911-025-02855-6
	Machine Learning; Longer duration diarrhea; Pediatric; Prediction
title	Derivation and validation of a clinical predictive model for longer duration diarrhea among pediatric patients in Kenya using machine learning algorithms
topic	Machine Learning; Longer duration diarrhea; Pediatric; Prediction
url	https://doi.org/10.1186/s12911-025-02855-6

Similar Items