Prediction Models for Late-Onset Preeclampsia: A Study Based on Logistic Regression, Support Vector Machine, and Extreme Gradient Boosting Models

Bibliographic Details
Main Authors: Yangyang Zhang, Xunke Gu, Nan Yang, Yuting Xue, Lijuan Ma, Yongqing Wang, Hua Zhang, Keke Jia
Format: Article
Language: English
Published: MDPI AG 2025-02-01
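Series: Biomedicines
Subjects: late-onset preeclampsia; maternal risk factors; laboratory indicators; logistic regression; support vector machine; extreme gradient boosting
Online Access: https://www.mdpi.com/2227-9059/13/2/347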
collection DOAJ
description <b>Background:</b> Preeclampsia, which affects 2–4% of pregnancies worldwide, poses a substantial risk to maternal health, and late-onset preeclampsia accounts for a large proportion of preeclampsia cases. Existing prediction models, however, offer limited early detection and often rely on costly, less accessible indicators, which makes them less applicable in resource-limited settings. <b>Objective:</b> To develop and evaluate prediction models for late-onset preeclampsia using general information, maternal risk factors, and laboratory indicators from early gestation (6–13 weeks). <b>Methods:</b> A dataset of 2000 pregnancies, including 110 late-onset preeclampsia cases, was analyzed. General information and maternal risk factors were collected from the hospital information system, and relevant laboratory indicators measured between 6 and 13 weeks of gestation were examined. Logistic regression served as the baseline model against which the predictive performance of the support vector machine and extreme gradient boosting models was assessed. <b>Results:</b> Using only general information and risk factors, the logistic regression model identified 19.1% of cases, with a false positive rate of 0.4%. With 15 selected factors spanning general information, risk factors, and laboratory indicators, its detection rate improved to 27.3% and its false positive rate rose to 0.7%. The support vector machine model, using only general information and risk factors, achieved a detection rate of 27.3%, with a false positive rate of 0.0%. After all the laboratory indicators were included, its detection rate improved significantly to 54.5%, while its false positive rate rose to 7.7%. The extreme gradient boosting model, using only general information and risk factors, achieved a detection rate of 31.6%, with a false positive rate of 1.5%. After all the laboratory indicators were included, its detection rate increased to 52.6%, with a false positive rate of 0.7%. With the laboratory indicators added, the areas under the ROC curve for the logistic regression, support vector machine, and extreme gradient boosting models were 0.877, 0.839, and 0.842, respectively. <b>Conclusion:</b> Compared with the logistic regression model, both the support vector machine and extreme gradient boosting models significantly improved the detection rate for late-onset preeclampsia, although the support vector machine model had a comparatively higher false positive rate. Notably, the logistic regression and extreme gradient boosting models exhibited high negative predictive values of 99.3%, underscoring their effectiveness in identifying pregnant women who are unlikely to develop late-onset preeclampsia. In addition, logistic regression showed the highest area under the ROC curve, suggesting that the traditional model retains distinct advantages for prediction.
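The abstract reports each model by detection rate, false positive rate, and negative predictive value. As a minimal sketch of how these three quantities are conventionally derived from a confusion matrix, the Python below uses their standard definitions. The class name, function names, and counts are hypothetical illustrations; the counts are not data from the study, and the record does not state how the authors computed their figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary screening test (hypothetical container)."""
    tp: int  # cases correctly flagged
    fp: int  # non-cases incorrectly flagged
    tn: int  # non-cases correctly cleared
    fn: int  # cases missed


def detection_rate(c: ConfusionCounts) -> float:
    """Sensitivity: share of true cases that the model flags."""
    return c.tp / (c.tp + c.fn)


def false_positive_rate(c: ConfusionCounts) -> float:
    """Share of non-cases that the model flags."""
    return c.fp / (c.fp + c.tn)


def negative_predictive_value(c: ConfusionCounts) -> float:
    """Share of negative predictions that are truly non-cases."""
    return c.tn / (c.tn + c.fn)


# Illustrative counts only (NOT taken from the study).
example = ConfusionCounts(tp=6, fp=2, tn=188, fn=4)

print(f"detection rate: {detection_rate(example):.3f}")
print(f"false positive rate: {false_positive_rate(example):.3f}")
print(f"negative predictive value: {negative_predictive_value(example):.3f}")
```

When cases are rare, the negative predictive value can stay high even if many cases are missed, so it is best read alongside the detection rate rather than on its own.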
format Article
id doaj-art-5a03cccb8ddb48a3aa545cfc1d4545d7
institution DOAJ
issn 2227-9059
language English
publishDate 2025-02-01
publisher MDPI AG
record_format Article
series Biomedicines
spelling doaj-art-5a03cccb8ddb48a3aa545cfc1d4545d7
2025-08-20T02:44:49Z
eng
MDPI AG
Biomedicines
2227-9059
2025-02-01
Volume 13, Issue 2, Article 347
DOI 10.3390/biomedicines13020347
Prediction Models for Late-Onset Preeclampsia: A Study Based on Logistic Regression, Support Vector Machine, and Extreme Gradient Boosting Models
Yangyang Zhang (Department of Clinical Laboratory, Peking University Third Hospital, Beijing 100191, China)
Xunke Gu (Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing 100191, China)
Nan Yang (Department of Blood Transfusion, Peking University Third Hospital, Beijing 100191, China)
Yuting Xue (Department of Clinical Laboratory, Peking University Third Hospital, Beijing 100191, China)
Lijuan Ma (Department of Clinical Laboratory, Peking University Third Hospital, Beijing 100191, China)
Yongqing Wang (Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing 100191, China)
Hua Zhang (Research Center of Clinical Epidemiology, Peking University Third Hospital, Beijing 100191, China)
Keke Jia (Department of Clinical Laboratory, Peking University Third Hospital, Beijing 100191, China)
https://www.mdpi.com/2227-9059/13/2/347
late-onset preeclampsia; maternal risk factors; laboratory indicators; logistic regression; support vector machine; extreme gradient boosting
title Prediction Models for Late-Onset Preeclampsia: A Study Based on Logistic Regression, Support Vector Machine, and Extreme Gradient Boosting Models
topic late-onset preeclampsia
maternal risk factors
laboratory indicators
logistic regression
support vector machine
extreme gradient boosting
url https://www.mdpi.com/2227-9059/13/2/347