Prediction of bloodstream infection using machine learning based primarily on biochemical data

Bibliographic Details
Main Authors: Ramtin Zargari Marandi, Frederik Boetius Hertz, Jesper Qvist Thomassen, Steen Christian Rasmussen, Ruth Frikke-Schmidt, Niels Frimodt-Møller, Karen Leth Nielsen, Cameron Ross MacPherson
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-01821-6
Description
Summary: Early diagnosis of bloodstream infection (BSI) is crucial for informed antibiotic use. This study developed a machine learning (ML) approach for early BSI detection using a comprehensive dataset from Rigshospitalet, Denmark (2010–2020). The dataset included 144,398 samples from adult patients, containing blood culture results, demographics, and up to 36 biochemical variables. Positive blood cultures were observed in 6.4% of samples, mostly caused by Staphylococcus aureus, Escherichia coli, and Enterococcus faecium. Eighty percent of the samples (N = 43,351 patients) were used for ML model development and five-fold cross-validation, and the remaining 20% for independent testing (N = 10,837). Among seven models, LightGBM performed best, achieving an AUC of 0.69 on the test set. It was more accurate in detecting negatives, with a negative predictive value (NPV) of 0.96 and specificity of 0.74, compared to a positive predictive value (PPV) of 0.13 and sensitivity of 0.54. SHapley Additive exPlanations (SHAP) identified platelets, leukocytes, and the neutrophils-to-lymphocytes ratio as the top three predictive features. The model showed higher sensitivity (average 0.66) for common pathogens, e.g., 0.71 for E. coli. The results highlight the potential of biochemical variables as diagnostic factors for BSI and suggest that clinical use should focus on identifying patients at low risk; performance may be further enhanced in future investigations.
ISSN:2045-2322
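
Note on the reported metrics: the gap between the NPV (0.96) and the PPV (0.13) in the summary follows largely from the low share of positive cultures. The sketch below is not taken from the article. It applies Bayes' rule to the reported sensitivity and specificity, and it assumes that the test-set prevalence equals the overall 6.4% positive rate, which the summary does not state. The function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence                # infected and flagged
    false_neg = (1 - sensitivity) * prevalence         # infected but missed
    true_neg = specificity * (1 - prevalence)          # not infected, not flagged
    false_pos = (1 - specificity) * (1 - prevalence)   # not infected but flagged
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Values from the summary; the prevalence is an assumption (overall positive rate).
ppv, npv = predictive_values(sensitivity=0.54, specificity=0.74, prevalence=0.064)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Under that assumption the calculation gives a PPV of about 0.12 and an NPV of about 0.96, close to the reported 0.13 and 0.96. The small PPV difference is consistent with rounding or with a slightly different positive rate in the test set.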