Development of a machine learning-based prediction model for serious bacterial infections in febrile young infants

Bibliographic Details
Main Authors: Jina Lee, Jong Seung Lee, Seak Hee Oh, Jun Sung Park, Reenar Yoo, Soo-young Lim, Dahyun Kim, Min Kyo Chun, Jeeho Han, Jeong-Yong Lee, Seung Jun Choi
Format: Article
Language: English
Published: BMJ Publishing Group 2025-07-01
Series: BMJ Paediatrics Open
Online Access: https://bmjpaedsopen.bmj.com/content/9/1/e003548.full
Description
Summary: Background: To develop and validate machine learning (ML)-based models to predict serious bacterial infections (SBIs) in febrile infants aged ≤90 days. Methods: This retrospective study analysed data from febrile infants (≥38.0℃) aged ≤90 days. The development dataset comprised data from patients who visited the Seoul Asan Medical Center between 2015 and 2021, whereas the validation dataset included data from those who visited the centre from January 2022 to August 2023. Logistic regression (LR) and eXtreme Gradient Boosting (XGB) were used to develop the models for predicting SBIs, which were then compared with traditional rule-based models. Results: The study included data from 2860 patients: 2288 (80%) in the development dataset and 572 (20%) in the validation dataset. SBIs were confirmed in 482 patients (21.0%) in the development dataset and 131 (22.9%) in the validation dataset. The XGB and LR models showed excellent performance, with areas under the curve of 0.990 and 0.981, respectively, in the development dataset and 0.989 and 0.985 in the validation dataset. In validation, both models demonstrated superior specificity (82.3–87.0% vs 46.2–72.2%) and positive predictive value (61.5–68.5% vs 34.4–49.8%) compared with traditional rule-based models. Both models also maintained perfect sensitivity and negative predictive value (100% for each, vs 81.7–100% and 92.0–100%, respectively, for the rule-based models), with no false negatives. Urinalysis, C-reactive protein and procalcitonin were identified as top-tier features in the XGB model. Conclusions: The ML-based prediction model demonstrated robust performance, with superior specificity and perfect sensitivity, which may enhance the accuracy of SBI detection and reduce the costs associated with false positives.
ISSN:2399-9772
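
Note on the reported metrics: the summary compares models by sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). The sketch below shows how these four quantities follow from a 2×2 confusion matrix. The class and function names are hypothetical, and the counts in the example are illustrative only; they are not taken from the study, which does not report its confusion matrices in this record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a 2x2 confusion matrix (hypothetical helper type)."""
    tp: int  # infants with SBI flagged as high risk
    fp: int  # infants without SBI flagged as high risk
    tn: int  # infants without SBI flagged as low risk
    fn: int  # infants with SBI flagged as low risk


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the four standard diagnostic accuracy measures.

    Assumes every denominator is non-zero, i.e. the data contain at least
    one case in each row and column of the confusion matrix.
    """
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of true SBIs detected
        "specificity": c.tn / (c.tn + c.fp),  # share of non-SBIs ruled out
        "ppv": c.tp / (c.tp + c.fp),          # share of positives that are SBIs
        "npv": c.tn / (c.tn + c.fn),          # share of negatives that are not SBIs
    }


# Illustrative counts only (NOT from the study): 100 infants, 20 with SBI.
example = ConfusionCounts(tp=20, fp=10, tn=70, fn=0)
metrics = diagnostic_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With no false negatives, sensitivity and NPV are both 100% regardless of the other counts, which is why the summary reports the two together. Specificity and PPV then depend only on how many false positives the model produces.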