Integrating CT radiomics and clinical features using machine learning to predict post-COVID pulmonary fibrosis


Bibliographic Details
Main Authors: Qianqian Zhao, Yijie Li, Chunliu Zhao, Ran Dong, Jiaxin Tian, Ze Zhang, Lin Huang, Jingwen Huang, Junhai Yan, Zhitao Yang, Jiangnan Ruan, Ping Wang, Li Yu, Jieming Qu, Min Zhou
Format: Article
Language: English
Published: BMC 2025-07-01
Series: Respiratory Research
Subjects:
Online Access: https://doi.org/10.1186/s12931-025-03305-7
Description
Summary: Abstract
Background The lack of reliable biomarkers for the early detection and risk stratification of post-COVID-19 pulmonary fibrosis (PCPF) underscores the urgent need for advanced predictive tools. This study aimed to develop a machine learning-based predictive model integrating quantitative CT (qCT) radiomics and clinical features to assess the risk of lung fibrosis in COVID-19 patients.
Methods A total of 204 patients with confirmed COVID-19 pneumonia were included in the study. Of these, 93 patients were assigned to the development cohort (74 for training and 19 for internal validation), while 111 patients from three independent hospitals constituted the external validation cohort. Chest CT images were analyzed using qCT software. Clinical data and laboratory parameters were obtained from electronic health records. Least absolute shrinkage and selection operator (LASSO) regression with 5-fold cross-validation was used to select the most predictive features. Twelve machine learning algorithms were independently trained. Model performance was evaluated using receiver operating characteristic (ROC) curves, area under the curve (AUC) values, sensitivity, and specificity.
Results Seventy-eight features were extracted and reduced to ten features for model development. These included two qCT radiomics signatures: (1) whole lung_reticulation (%) interstitial lung disease (ILD) texture analysis, (2) interstitial lung abnormality (ILA)_Num of lung zones ≥ 5%_whole lung_ILA. Among the 12 machine learning algorithms evaluated, the support vector machine (SVM) model demonstrated the best predictive performance, with AUCs of 0.836 (95% CI: 0.830–0.842) in the training cohort, 0.796 (95% CI: 0.777–0.816) in the internal validation cohort, and 0.797 (95% CI: 0.691–0.873) in the external validation cohort.
Conclusions The integration of CT radiomics, clinical and laboratory variables using machine learning provides a robust tool for predicting pulmonary fibrosis progression in COVID-19 patients, facilitating early risk assessment and intervention.
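
The workflow described in the Methods (LASSO feature selection with 5-fold cross-validation, followed by a classifier evaluated by AUC, sensitivity, and specificity) can be illustrated with a minimal sketch. This is not the authors' code. It assumes scikit-learn and NumPy, uses synthetic data in place of the qCT and clinical features, and mirrors only the cohort sizes stated in the abstract (93 patients split 74/19, 78 candidate features). The linear LASSO on a binary outcome, the RBF kernel, the feature standardization, and the fallback rule when no coefficient survives are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LassoCV
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the development cohort (assumption: the real
# features are qCT, clinical, and laboratory variables, not shown here).
X, y = make_classification(
    n_samples=93, n_features=78, n_informative=10, n_redundant=0, random_state=0
)

# 74 patients for training, 19 for internal validation, as in the abstract.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=19, stratify=y, random_state=0
)

# Standardize using training statistics only, so no validation data leaks in.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)

# LASSO with 5-fold cross-validation picks the penalty strength;
# features with non-zero coefficients are kept.
lasso = LassoCV(cv=5, random_state=0, max_iter=10000).fit(X_train_s, y_train)
selected = np.flatnonzero(lasso.coef_)
if selected.size == 0:
    # Hypothetical fallback: keep the 10 features most associated with the label.
    assoc = np.abs(X_train_s.T @ (y_train - y_train.mean()))
    selected = np.argsort(assoc)[-10:]

# Train an SVM on the selected features (kernel choice is an assumption).
svm = SVC(kernel="rbf").fit(X_train_s[:, selected], y_train)

# Evaluate on the internal validation split.
scores = svm.decision_function(X_val_s[:, selected])
pred = svm.predict(X_val_s[:, selected])
auc = roc_auc_score(y_val, scores)
tn, fp, fn, tp = confusion_matrix(y_val, pred, labels=[0, 1]).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)

print(f"selected features: {selected.size}")
print(f"AUC={auc:.3f} sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```

Because the data are synthetic, the printed numbers carry no clinical meaning and should not be compared with the AUCs reported in the Results.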
ISSN: 1465-993X