Prediction of bloodstream infection using machine learning based primarily on biochemical data
| Main Authors: | , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-05-01 |
| Series: | Scientific Reports |
| Subjects: | |
| Online Access: | https://doi.org/10.1038/s41598-025-01821-6 |
| Summary: | Early diagnosis of bloodstream infection (BSI) is crucial for informed antibiotic use. This study developed a machine learning (ML) approach for early BSI detection using a comprehensive dataset from Rigshospitalet, Denmark (2010–2020). The dataset included 144,398 samples from adult patients, containing blood culture results, demographics, and up to 36 biochemical variables. Positive blood cultures were observed in 6.4% of samples, most commonly caused by Staphylococcus aureus, Escherichia coli, and Enterococcus faecium. Of the samples, 80% (N = 43,351 patients) were used for ML model development and five-fold cross-validation, and 20% (N = 10,837) for independent testing. Among seven models, LightGBM performed best, achieving an AUC of 0.69 on the test set. It was more accurate in detecting negatives, with a negative predictive value (NPV) of 0.96 and specificity of 0.74, compared to a positive predictive value (PPV) of 0.13 and sensitivity of 0.54. SHapley Additive exPlanations (SHAP) identified platelets, leukocytes, and the neutrophils-to-lymphocytes ratio as the top three predictive features. The model showed higher sensitivity (average 0.66) for common pathogens, e.g., 0.71 for E. coli. The results highlight the potential of biochemical variables as diagnostic factors for BSI and indicate that clinical use should focus on identifying patients at low risk; performance can be further enhanced in future investigations. |
|---|---|
| ISSN: | 2045-2322 |
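
The summary reports a high NPV (0.96) alongside a low PPV (0.13). The sketch below illustrates why that pattern is expected when positives are rare: it applies Bayes' rule to the reported sensitivity (0.54) and specificity (0.74). It assumes, for illustration only, that the 6.4% positive rate quoted for the full dataset also holds in the test set; the function name `predictive_values` is hypothetical and is not taken from the article.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule."""
    # Expected fraction of all samples falling in each confusion-matrix cell.
    tp = sensitivity * prevalence              # true positives
    fn = (1 - sensitivity) * prevalence        # false negatives
    tn = specificity * (1 - prevalence)        # true negatives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    return tp / (tp + fp), tn / (tn + fn)


# Reported test-set sensitivity and specificity; the prevalence is an
# assumption borrowed from the full-dataset positive rate (6.4%).
ppv, npv = predictive_values(sensitivity=0.54, specificity=0.74, prevalence=0.064)
print(f"PPV ~ {ppv:.2f}, NPV ~ {npv:.2f}")
```

Under these assumptions the computed values (about 0.12 and 0.96) are close to the reported PPV and NPV. Exact agreement is not expected, since the inputs are rounded and the test-set prevalence may differ slightly from 6.4%.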