Predicting carbapenem-resistant Pseudomonas aeruginosa infection risk using XGBoost model and explainability

Abstract The prevalence and spread of carbapenem-resistant Pseudomonas aeruginosa (CRPA) is a global public health problem. This study aims to identify the risk factors of CRPA infection and construct a machine learning model to provide a prediction tool for clinical prevention and control. A total...

Bibliographic Details
Main Authors: Yan Jiang, Hong-wei Wang, Fang-ying Tian, Yue Guo, Xiu-mei Wang
Format: Article
Language: English
Published: Nature Portfolio 2025-06-01
Series: Scientific Reports
Subjects: Carbapenem resistance; Pseudomonas aeruginosa; Machine learning models; XGBoost; Risk factors
Online Access: https://doi.org/10.1038/s41598-025-04028-x
_version_ 1849469942674489344
author Yan Jiang
Hong-wei Wang
Fang-ying Tian
Yue Guo
Xiu-mei Wang
collection DOAJ
description Abstract The prevalence and spread of carbapenem-resistant Pseudomonas aeruginosa (CRPA) is a global public health problem. This study aims to identify the risk factors for CRPA infection and to construct a machine learning model that can serve as a prediction tool for clinical prevention and control. A total of 1949 patients with P. aeruginosa health care-associated infections (HAIs) were enrolled. Of these, 89 patients with CRPA infection were matched 1:1 with 89 patients with carbapenem-sensitive P. aeruginosa (CSPA) infection. LASSO regression was used to screen the variables, and an XGBoost model was established (136 cases in the training set and 60 cases in the test set). The SHapley Additive exPlanations (SHAP) method was used to explain the importance of the variables. The area under the ROC curve (AUC) and the calibration curve were used to evaluate model performance. There were 89 cases of CRPA infection among the 1949 patients, a CRPA infection rate of 4.57%. The respiratory tract was the most common source of infection, and the ICU and the hematology department were the high-risk departments. The AUC of the XGBoost model was 0.987 (95% CI: 0.974–1.000) in the training set and 0.862 (95% CI: 0.750–0.974) in the test set. The clinical decision curve also indicated that the model performed well. SHAP results showed that ICU admission, duration of central venous catheterization, and use of carbapenems and fluoroquinolones were important factors for predicting CRPA infection. The XGBoost model is helpful for the early prevention and screening of CRPA infection in medical institutions. Infection control and clinical departments should target these high-risk factors with effective prevention and control measures to reduce the occurrence of CRPA infection.
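The two summary figures in the abstract, the infection rate and the AUC, can be illustrated with a minimal sketch. This is not the authors' code: the function names `infection_rate` and `roc_auc` and the toy labels and scores are hypothetical, and only the counts 89 and 1949 come from the record. The AUC is computed here in its rank-based form, as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case.

```python
from typing import Sequence


def infection_rate(cases: int, total: int) -> float:
    """Return the infection rate as a percentage of all enrolled patients."""
    return 100.0 * cases / total


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: the share of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Counts reported in the abstract: 89 CRPA cases among 1949 enrolled patients.
rate = infection_rate(89, 1949)
print(f"CRPA infection rate: {rate:.2f}%")

# Hypothetical toy data (not from the study): 1 = CRPA, 0 = CSPA.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC: {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of this size; a sort-based ranking would be the usual choice for larger data sets.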
format Article
id doaj-art-47812845e3134b65b4ebb520b454632f
institution Kabale University
issn 2045-2322
language English
publishDate 2025-06-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-47812845e3134b65b4ebb520b454632f
2025-08-20T03:25:19Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-06-01
Volume 15, issue 1, pages 1-11
10.1038/s41598-025-04028-x
Predicting carbapenem-resistant Pseudomonas aeruginosa infection risk using XGBoost model and explainability
Yan Jiang (Department of Nursing, Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Third Hospital of Shanxi Medical University, Tongji Shanxi Hospital)
Hong-wei Wang (Shanxi Medical University)
Fang-ying Tian (Department of Hospital Infection, The Second Hospital of Shanxi Medical University)
Yue Guo (Department of Nursing, Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Third Hospital of Shanxi Medical University, Tongji Shanxi Hospital)
Xiu-mei Wang (Department of Nursing, Shanxi Bethune Hospital, Shanxi Academy of Medical Sciences, Third Hospital of Shanxi Medical University, Tongji Shanxi Hospital)
title Predicting carbapenem-resistant Pseudomonas aeruginosa infection risk using XGBoost model and explainability
topic Carbapenem resistance
Pseudomonas aeruginosa
Machine learning models
XGBoost
Risk factors
url https://doi.org/10.1038/s41598-025-04028-x