Prediction Models for Late-Onset Preeclampsia: A Study Based on Logistic Regression, Support Vector Machine, and Extreme Gradient Boosting Models
<b>Background:</b> Preeclampsia, affecting 2–4% of pregnancies worldwide, poses a substantial risk to maternal health. Late-onset preeclampsia, in particular, has a high incidence among preeclampsia cases. However, existing prediction models are limited in terms of the early detection ca...
| Main Authors: | Yangyang Zhang, Xunke Gu, Nan Yang, Yuting Xue, Lijuan Ma, Yongqing Wang, Hua Zhang, Keke Jia |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG 2025-02-01 |
| Series: | Biomedicines |
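| Subjects: | late-onset preeclampsia; maternal risk factors; laboratory indicators; logistic regression; support vector machine; extreme gradient boosting |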
| Online Access: | https://www.mdpi.com/2227-9059/13/2/347 |
| _version_ | 1850081100074844160 |
|---|---|
| author | Yangyang Zhang, Xunke Gu, Nan Yang, Yuting Xue, Lijuan Ma, Yongqing Wang, Hua Zhang, Keke Jia |
| author_sort | Yangyang Zhang |
| collection | DOAJ |
| description | <b>Background:</b> Preeclampsia, affecting 2–4% of pregnancies worldwide, poses a substantial risk to maternal health. Late-onset preeclampsia, in particular, has a high incidence among preeclampsia cases. However, existing prediction models are limited in terms of early detection capabilities and often rely on costly and less accessible indicators, making them less applicable in resource-limited settings. <b>Objective:</b> To develop and evaluate prediction models for late-onset preeclampsia using general information, maternal risk factors, and laboratory indicators from early gestation (6–13 weeks). <b>Methods:</b> A dataset of 2000 pregnancies, including 110 late-onset preeclampsia cases, was analyzed. General information and maternal risk factors were collected from the hospital information system. Relevant laboratory indicators between 6 and 13 weeks of gestation were examined. Logistic regression was used as the baseline model to assess the predictive performance of the support vector machine and extreme gradient boosting models for late-onset preeclampsia. <b>Results:</b> The logistic regression model, only considering general information and risk factors, identified 19.1% of cases, with a false positive rate of 0.4%. When selecting 15 factors encompassing general information, risk factors, and laboratory indicators, the false positive rate increased to 0.7% and the detection rate improved to 27.3%. The support vector machine model, only considering general information and risk factors, achieved a detection rate of 27.3%, with a false positive rate of 0.0%. After including all the laboratory indicators, the false positive rate increased to 7.7% but the detection rate significantly improved to 54.5%. The extreme gradient boosting model, only considering general information and risk factors, achieved a detection rate of 31.6%, with a false positive rate of 1.5%. After including all the laboratory indicators, the false positive rate remained at 0.7% but the detection rate increased to 52.6%. Additionally, after adding the laboratory indicators, the areas under the ROC curve for the logistic regression, support vector machine, and extreme gradient boosting models were 0.877, 0.839, and 0.842, respectively. <b>Conclusion:</b> Compared with the logistic regression model, both the support vector machine and extreme gradient boosting models significantly improved the detection rates for late-onset preeclampsia. However, the support vector machine model had a comparatively higher false positive rate. Notably, the logistic regression and extreme gradient boosting models exhibited high negative predictive values of 99.3%, underscoring their effectiveness in accurately identifying pregnant women less likely to develop late-onset preeclampsia. Additionally, logistic regression showed the highest area under the ROC curve, suggesting that the traditional model has unique advantages in relation to prediction. |
| format | Article |
| id | doaj-art-5a03cccb8ddb48a3aa545cfc1d4545d7 |
| institution | DOAJ |
| issn | 2227-9059 |
| language | English |
| publishDate | 2025-02-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Biomedicines |
| spelling | doaj-art-5a03cccb8ddb48a3aa545cfc1d4545d7; 2025-08-20T02:44:49Z; eng; MDPI AG; Biomedicines; ISSN 2227-9059; 2025-02-01; vol. 13, no. 2, art. 347; DOI 10.3390/biomedicines13020347; Prediction Models for Late-Onset Preeclampsia: A Study Based on Logistic Regression, Support Vector Machine, and Extreme Gradient Boosting Models; Yangyang Zhang (Department of Clinical Laboratory, Peking University Third Hospital, Beijing 100191, China); Xunke Gu (Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing 100191, China); Nan Yang (Department of Blood Transfusion, Peking University Third Hospital, Beijing 100191, China); Yuting Xue (Department of Clinical Laboratory, Peking University Third Hospital, Beijing 100191, China); Lijuan Ma (Department of Clinical Laboratory, Peking University Third Hospital, Beijing 100191, China); Yongqing Wang (Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing 100191, China); Hua Zhang (Research Center of Clinical Epidemiology, Peking University Third Hospital, Beijing 100191, China); Keke Jia (Department of Clinical Laboratory, Peking University Third Hospital, Beijing 100191, China); https://www.mdpi.com/2227-9059/13/2/347; late-onset preeclampsia; maternal risk factors; laboratory indicators; logistic regression; support vector machine; extreme gradient boosting |
| title | Prediction Models for Late-Onset Preeclampsia: A Study Based on Logistic Regression, Support Vector Machine, and Extreme Gradient Boosting Models |
| topic | late-onset preeclampsia; maternal risk factors; laboratory indicators; logistic regression; support vector machine; extreme gradient boosting |
| url | https://www.mdpi.com/2227-9059/13/2/347 |
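
The abstract reports each model's detection rate, false positive rate, and negative predictive value. As a reading aid, the minimal sketch below shows how these three quantities are conventionally derived from a binary confusion matrix. The class name `ConfusionCounts`, the function `screening_metrics`, and the counts in `example` are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary screening test (hypothetical container)."""
    tp: int  # cases correctly flagged as high risk
    fp: int  # non-cases incorrectly flagged as high risk
    tn: int  # non-cases correctly classified as low risk
    fn: int  # cases missed by the model


def screening_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return detection rate, false positive rate, and NPV as fractions."""
    return {
        # Detection rate (sensitivity): share of true cases that were flagged.
        "detection_rate": c.tp / (c.tp + c.fn),
        # False positive rate: share of non-cases that were flagged.
        "false_positive_rate": c.fp / (c.fp + c.tn),
        # Negative predictive value: share of low-risk calls that were correct.
        "negative_predictive_value": c.tn / (c.tn + c.fn),
    }


# Illustrative counts only; NOT data from the study.
example = ConfusionCounts(tp=10, fp=5, tn=375, fn=10)
metrics = screening_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because cases are rare in a screening population, the negative predictive value can be high even when the detection rate is modest, so the three figures are best read together.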