Machine learning-based risk prediction models for bronchopulmonary dysplasia in preterm infants: a high-altitude cohort study

Background Bronchopulmonary dysplasia (BPD) is a significant cause of morbidity in preterm infants, yet its development and severity at high altitudes (>1500 m) remain poorly understood. This study aimed to identify altitude-specific risk factors and develop robust, interpretable predictive models for BPD in this unique population.

Bibliographic Details
Main Authors: Heng Zhang, Fei Wang, Hongying Mi, Xiaoyan Xu, Ou Jiang, Yilin Lin, Lianfang Tang, Ziwei Li, Rui Ba
Format: Article
Language: English
Published: BMJ Publishing Group 2025-07-01
Series: BMJ Paediatrics Open
Online Access: https://bmjpaedsopen.bmj.com/content/9/1/e003652.full
author Heng Zhang
Fei Wang
Hongying Mi
Xiaoyan Xu
Ou Jiang
Yilin Lin
Lianfang Tang
Ziwei Li
Rui Ba
collection DOAJ
description Background Bronchopulmonary dysplasia (BPD) is a significant cause of morbidity in preterm infants, yet its development and severity at high altitudes (>1500 m) remain poorly understood. This study aimed to identify altitude-specific risk factors and develop robust, interpretable predictive models for BPD in this unique population.
Methods In this retrospective matched cohort study, 378 preterm infants (<32 weeks gestation, <1500 g birth weight) admitted to a high-altitude (1500 m) NICU (Neonatal Intensive Care Unit) between 2019 and 2023 were analysed. The cohort included 189 BPD cases (91 mild, 61 moderate, 37 severe) and 189 matched controls. Maternal, perinatal and postnatal data were collected. Machine learning models (XGBoost, logistic regression, random forest) were developed and rigorously evaluated using comprehensive performance metrics to predict BPD occurrence and severity. SHAP (SHapley Additive exPlanations) analysis was employed to interpret the best-performing model.
Results Key risk factors for BPD development included maternal hypertension (OR 2.31, 95% CI 1.56 to 3.42), initial oxygen requirement >30% (OR 3.15, 95% CI 2.13 to 4.65) and lack of exclusive breast milk feeding (OR 1.89, 95% CI 1.28 to 2.79). Severe BPD was independently associated with prolonged invasive ventilation (>7 days) (OR 4.12, 95% CI 2.78 to 6.11), elevated C reactive protein (>10 mg/L) (OR 2.87, 95% CI 1.93 to 4.26) and patent ductus arteriosus (OR 2.53, 95% CI 1.71 to 3.74). Machine learning models demonstrated strong predictive performance; the optimal XGBoost model achieved an area under the curve of 0.89 (95% CI 0.85 to 0.93), an F1 score of 0.82, a Matthews Correlation Coefficient of 0.73 and a balanced accuracy of 0.85. SHAP analysis identified initial FiO2 >30%, mechanical ventilation and maternal hypertension as the top three most influential predictors for the XGBoost model.
Conclusions This study provides the first comprehensive analysis of BPD risk factors at a specific high altitude and validates effective, interpretable machine learning models for its prediction. These findings highlight the critical importance of altitude-specific adjustments in risk assessment and emphasise the potential for model-guided early interventions to improve outcomes for this vulnerable population.
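The abstract reports an F1 score, a Matthews Correlation Coefficient and a balanced accuracy for the XGBoost model. As a reading aid, the sketch below shows how these three metrics are conventionally derived from a binary confusion matrix. It is a minimal illustration, not the authors' evaluation code: the function name `binary_metrics` and the example counts are hypothetical, and the counts are not taken from the study.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute F1, MCC and balanced accuracy from confusion-matrix counts.

    Assumes every row and column of the matrix is non-empty, so no
    denominator below is zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # sensitivity (true positive rate)
    specificity = tn / (tn + fp)   # true negative rate

    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)

    # MCC uses all four cells and ranges from -1 to 1.
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    mcc = (tp * tn - fp * fn) / mcc_denominator

    # Balanced accuracy is the mean of sensitivity and specificity.
    balanced_accuracy = (recall + specificity) / 2

    return {"f1": f1, "mcc": mcc, "balanced_accuracy": balanced_accuracy}


# Hypothetical counts for illustration only; NOT results from the study.
example = binary_metrics(tp=40, fp=10, fn=10, tn=40)
print(example)
```

With perfectly balanced classes, as in a 1:1 matched design, balanced accuracy coincides with ordinary accuracy; the two diverge when cases and controls are unequal in number.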
format Article
id doaj-art-c7868c368aee4da286757e671333e3de
institution DOAJ
issn 2399-9772
language English
publishDate 2025-07-01
publisher BMJ Publishing Group
record_format Article
series BMJ Paediatrics Open
spelling doaj-art-c7868c368aee4da286757e671333e3de
2025-08-20T03:17:58Z
eng
BMJ Publishing Group
BMJ Paediatrics Open
2399-9772
2025-07-01
9
1
10.1136/bmjpo-2025-003652
Machine learning-based risk prediction models for bronchopulmonary dysplasia in preterm infants: a high-altitude cohort study
Heng Zhang [0]
Fei Wang [1]
Hongying Mi [2]
Xiaoyan Xu [3]
Ou Jiang [4]
Yilin Lin [5]
Lianfang Tang [6]
Ziwei Li [7]
Rui Ba [8]
[0] Department of Cardiovascular Surgery, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, People's Republic of China
[1] School of Public Affairs, Xiamen University, Xiamen, China
[2] Department of Pediatrics, The Affiliated Hospital of Kunming University of Science and Technology/Yunnan First People's Hospital, Kunming, Yunnan Province, China
[3] Department of Pathophysiology, College of Basic Medical Science, China Medical University, Shenyang, Liaoning Province, China
[4] Faculty of Medicine of Kunming University of Science and Technology, Kunming, Yunnan Province, China
[5] Department of Pediatrics, The Affiliated Hospital of Kunming University of Science and Technology/Yunnan First People's Hospital, Kunming, Yunnan Province, China
[6] Department of Pediatrics, The Affiliated Hospital of Kunming University of Science and Technology/Yunnan First People's Hospital, Kunming, Yunnan Province, China
[7] Department of Pediatrics, The Affiliated Hospital of Kunming University of Science and Technology/Yunnan First People's Hospital, Kunming, Yunnan Province, China
[8] Department of Pediatrics, The Affiliated Hospital of Kunming University of Science and Technology/Yunnan First People's Hospital, Kunming, Yunnan Province, China
https://bmjpaedsopen.bmj.com/content/9/1/e003652.full
title Machine learning-based risk prediction models for bronchopulmonary dysplasia in preterm infants: a high-altitude cohort study
url https://bmjpaedsopen.bmj.com/content/9/1/e003652.full