Early prediction of preeclampsia from clinical, multi-omics and laboratory data using random forest model

Bibliographic Details
Main Authors: Qiang Zhao, Jia Li, Zhuo Diao, Xiao Zhang, Suihua Feng, Guixue Hou, Wenqiu Xu, Zhiguang Zhao, Zhixu Qiu, Wenzhi Yang, Si Zhou, Peirun Tian, Qun Zhang, Weiping Chen, Huahua Li, Gefei Xiao, Jie Qin, Liqing Hu, Zhongzhe Li, Liang Lin, Shunyao Wang, Ruyun Gao, Wuyan Huang, Xiaohong Ruan, Sufen Zhang, Jianguo Zhang, Lijian Zhao, Rui Zhang
Format: Article
Language: English
Published: BMC 2025-05-01
Series: BMC Pregnancy and Childbirth
Online Access: https://doi.org/10.1186/s12884-025-07582-4
Description
Summary: Abstract
Background: Predicting preeclampsia (PE) within the first 16 weeks of gestation is difficult because of its varied risk factors, poorly understood causes and likely multiple pathogenic phenotypes.
Objectives: We aimed to develop separate prediction models for early-onset preeclampsia (EPE) and late-onset preeclampsia (LPE) using clinical data, laboratory data, and metabolome and proteome analyses of plasma samples.
Methods: We retrospectively recruited 56 EPE patients, 50 LPE patients and 92 normotensive controls from three tertiary hospitals and used clinical and laboratory data from early pregnancy. Models for EPE and LPE were fitted using the patients' clinical, multi-omics and laboratory data.
Results: By comparing multi-omics and laboratory test variables between EPE patients, LPE patients and healthy controls, we identified sets of differentially expressed biomarkers, including 49 and 33 metabolites, 28 and 36 proteins, and 5 and 7 laboratory variables associated with EPE and LPE, respectively. Using the random forest algorithm, we developed a prediction model from seven clinical factors, seven metabolites and five laboratory test variables. This model yielded the highest accuracy for EPE prediction, with good sensitivity (87.5%, 95% confidence interval [CI]: 67.64%-97.34%) and specificity (94.1%, 95% CI: 80.32%-99.28%). We also developed a prediction model, using seven clinical factors, five metabolites and eight proteins, that showed high accuracy in separating LPE from controls (sensitivity: 66.67%, 95% CI: 43.03%-85.41%; specificity: 94.12%, 95% CI: 80.32%-99.28%).
Conclusion: Our study identified a set of significant omics and laboratory features for PE prediction. The established models achieved high performance in predicting preeclampsia risk from clinical, multi-omics and laboratory information.
ISSN:1471-2393
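
The summary above reports sensitivity and specificity with 95% confidence intervals but does not say how the intervals were computed or how many test-set samples they rest on. The following is a minimal sketch of one common choice, the exact (Clopper-Pearson) binomial interval, written with the Python standard library only. The counts used (21/24, 14/21 and 32/34) are assumptions inferred from the reported percentages, not figures given in the record, and the function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion k/n."""

    def solve(f, target):
        # f is monotonically decreasing in p on [0, 1]; bisect for f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Hypothetical counts, chosen only because they match the reported percentages.
assumed_counts = [
    ("EPE sensitivity", 21, 24),
    ("LPE sensitivity", 14, 21),
    ("Specificity", 32, 34),
]

for label, k, n in assumed_counts:
    low, high = clopper_pearson(k, n)
    print(f"{label}: {k}/{n} = {k / n:.2%}, 95% CI {low:.2%} to {high:.2%}")
```

Under these assumed counts the exact intervals should land close to those quoted in the summary, which suggests (but does not confirm) that the authors used a Clopper-Pearson interval on a held-out test set of roughly this size. The width of the intervals also shows how much uncertainty remains with a few dozen test samples.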