Development and validation of a machine learning-based risk prediction model for stroke-associated pneumonia in older adult hemorrhagic stroke
Objective: To develop and validate a machine learning (ML)-based model for predicting stroke-associated pneumonia (SAP) risk in older adult hemorrhagic stroke patients. Methods: A retrospective collection of older adult hemorrhagic stroke patients from three tertiary hospitals in Guiyang, Guizhou Provinc...
| Main Authors: | Yi Cao, Haipeng Deng, Shaoyun Liu, Xi Zeng, Yangyang Gou, Weiting Zhang, Yixinyuan Li, Hua Yang, Min Peng |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A. 2025-06-01 |
| Series: | Frontiers in Neurology |
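| Subjects: | machine learning; older adult; hemorrhagic stroke; stroke-associated pneumonia; prediction model; validation |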
| Online Access: | https://www.frontiersin.org/articles/10.3389/fneur.2025.1591570/full |
| _version_ | 1849422531062136832 |
|---|---|
| author | Yi Cao, Haipeng Deng, Shaoyun Liu, Xi Zeng, Yangyang Gou, Weiting Zhang, Yixinyuan Li, Hua Yang, Min Peng |
| author_sort | Yi Cao |
| collection | DOAJ |
| description | Objective: To develop and validate a machine learning (ML)-based model for predicting stroke-associated pneumonia (SAP) risk in older adult hemorrhagic stroke patients. Methods: A retrospective collection of older adult hemorrhagic stroke patients from three tertiary hospitals in Guiyang, Guizhou Province (January 2019–December 2022) formed the modeling cohort, randomly split into training and internal validation sets (7:3 ratio). External validation utilized retrospective data from January–December 2023. After univariate and multivariate regression analyses, four ML models (Logistic Regression, XGBoost, Naive Bayes, and SVM) were constructed. Receiver operating characteristic (ROC) curves and area under the curve (AUC) were calculated for training and internal validation sets. Model performance was compared using DeLong's test or Bootstrap test, while sensitivity, specificity, accuracy, precision, recall, and F1-score evaluated predictive efficacy. Calibration curves assessed model calibration. The optimal model underwent external validation using ROC and calibration curves. Results: A total of 788 older adult hemorrhagic stroke patients were enrolled, divided into a training set (n = 462), an internal validation set (n = 196), and an external validation set (n = 130). The incidence of SAP in older adult patients with hemorrhagic stroke was 46.7% (368/788). Advanced age [OR = 1.064, 95% CI (1.024, 1.104)], smoking [OR = 2.488, 95% CI (1.460, 4.24)], low GCS score [OR = 0.675, 95% CI (0.553, 0.825)], low Braden score [OR = 0.741, 95% CI (0.640, 0.858)], and nasogastric tube [OR = 1.761, 95% CI (1.048, 2.960)] were identified as risk factors for SAP. Among the four machine learning algorithms evaluated [XGBoost, Logistic Regression (LR), Support Vector Machine (SVM), and Naive Bayes], the LR model demonstrated robust and consistent performance in predicting SAP among older adult patients with hemorrhagic stroke across multiple evaluation metrics. Furthermore, the model exhibited stable generalizability within the external validation cohort. Based on these findings, the LR framework was subsequently selected for external validation, accompanied by a nomogram visualization. The model achieved AUC values of 0.883 (training), 0.855 (internal validation), and 0.882 (external validation). The Hosmer-Lemeshow (H-L) test indicates that the calibration of the model is satisfactory in all three datasets, with P-values of 0.381, 0.142, and 0.066, respectively. Conclusions: This study constructed and validated a risk prediction model for SAP in older adult patients with hemorrhagic stroke based on multi-center data. The results indicated that among the four machine learning algorithms (XGBoost, LR, SVM, and Naive Bayes), the LR model demonstrated the best and most stable predictive performance. Age, smoking, low GCS score, low Braden score, and nasogastric tube were identified as predictive factors for SAP in these patients. These indicators are easily obtainable in clinical practice and facilitate rapid bedside assessment. Through internal and external validation, the model was proven to have good generalization ability, and a nomogram was ultimately drawn to provide an objective and operational risk assessment tool for clinical nursing practice. It helps in the early identification of high-risk patients and guides targeted interventions, thereby reducing the incidence of SAP and improving patient prognosis. |
| format | Article |
| id | doaj-art-cc620a5ed3554ab49a1f58538f0ca39d |
| institution | Kabale University |
| issn | 1664-2295 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Neurology |
| spelling | doaj-art-cc620a5ed3554ab49a1f58538f0ca39d; 2025-08-20T03:31:05Z; eng; Frontiers Media S.A.; Frontiers in Neurology; ISSN 1664-2295; 2025-06-01; volume 16; DOI 10.3389/fneur.2025.1591570; article 1591570. Author affiliations: Yi Cao (Department of Neurosurgery, Affiliated Hospital of Guizhou Medical University, Guiyang, China; School of Nursing, Guizhou Medical University, Guiyang, China); Haipeng Deng, Shaoyun Liu, Xi Zeng, and Hua Yang (Department of Neurosurgery, Affiliated Hospital of Guizhou Medical University, Guiyang, China); Yangyang Gou, Weiting Zhang, and Yixinyuan Li (School of Nursing, Guizhou Medical University, Guiyang, China); Min Peng (Department of Nursing Quality Management, Affiliated Hospital of Guizhou Medical University, Guiyang, China). |
| title | Development and validation of a machine learning-based risk prediction model for stroke-associated pneumonia in older adult hemorrhagic stroke |
| topic | machine learning older adult hemorrhagic stroke stroke-associated pneumonia prediction model validation |
| url | https://www.frontiersin.org/articles/10.3389/fneur.2025.1591570/full |
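
The odds ratios in the description can be read on the log-odds scale, which is how a logistic regression model and its nomogram use them. The following is a minimal sketch, not the authors' code: it converts each reported OR to a coefficient (the natural log of the OR) and backs out an approximate standard error, assuming the reported intervals are symmetric Wald intervals on the log scale. The dictionary keys and function names are hypothetical labels chosen for illustration. The record does not report the model intercept or the units of each predictor, so absolute risks cannot be reconstructed from these numbers.

```python
import math

# Odds ratios and 95% CI bounds copied from the description field above.
# Keys are hypothetical labels, not variable names from the study.
REPORTED = {
    "age": (1.064, 1.024, 1.104),
    "smoking": (2.488, 1.460, 4.24),
    "gcs_score": (0.675, 0.553, 0.825),
    "braden_score": (0.741, 0.640, 0.858),
    "nasogastric_tube": (1.761, 1.048, 2.960),
}

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def log_odds_summary(odds_ratio, ci_low, ci_high):
    """Return (beta, approximate SE) on the log-odds scale for one predictor."""
    beta = math.log(odds_ratio)
    # Assumption: the CI is a symmetric Wald interval on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se


def summarize(reported):
    """Apply log_odds_summary to every predictor in the mapping."""
    return {name: log_odds_summary(*vals) for name, vals in reported.items()}


if __name__ == "__main__":
    for name, (beta, se) in summarize(REPORTED).items():
        print(f"{name:18s} beta={beta:+.3f} se={se:.3f}")
    # Cohort arithmetic stated in the description.
    print(f"SAP incidence: {368 / 788:.1%}")
    print(f"training share of modeling cohort: {462 / (462 + 196):.1%}")
```

The coefficients for GCS and Braden scores come out negative because their odds ratios are below 1, which is consistent with the description naming low scores as the risk factor.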