Using machine learning to predict patients with polycystic ovary disease in Chinese women

Objective: With an estimated global frequency ranging from 5% to 21%, polycystic ovary syndrome (PCOS) is one of the most prevalent hormonal disorders. Many factors have been found to be related to PCOS. However, most studies of these factors used traditional methods such as multiple logistic regression...

Bibliographic Details
Main Authors: Chen-Yu Wang, Dee Pei, Chun-Kai Wang, Jyun-Cheng Ke, Siou-Ting Lee, Ta-Wei Chu, Yao-Jen Liang
Format: Article
Language: English
Published: Elsevier 2025-01-01
Series: Taiwanese Journal of Obstetrics & Gynecology
Subjects: Machine learning; Logistic regression; Polycystic ovary syndrome
Online Access: http://www.sciencedirect.com/science/article/pii/S1028455924002791
collection DOAJ
description Objective: With an estimated global frequency ranging from 5% to 21%, polycystic ovary syndrome (PCOS) is one of the most prevalent hormonal disorders. Many factors have been found to be related to PCOS. However, most studies of these factors used traditional methods such as multiple logistic regression (LR). Machine learning (Mach-L) has emerged as a newer method that can be applied in medical research. The present study had two goals: 1. Compare the accuracy of five Mach-L techniques with that of conventional LR. 2. Use Mach-L to predict PCOS and rank its risk factors. Materials and methods: In total, 170 PCOS patients and 950 control participants were included. Information on demographics, biochemistry, and lifestyle was collected. PCOS was identified using the Rotterdam criteria. Five Mach-L algorithms were used: random forest (RF), stochastic gradient boosting (SGB), multivariate adaptive regression splines (MARS), extreme gradient boosting (XGBoost), and gradient boosting with categorical features support (CatBoost). Models with lower estimation errors were considered better. Results: By t-test, subjects with PCOS were younger and had higher glutamic oxaloacetic transaminase (GOT), glutamic pyruvic transaminase (GPT), γ-glutamyl transferase (γ-GT), triglyceride (TG), and educational levels. All five Mach-L methods had lower estimation errors than LR. The mean AUC across the Mach-L methods was 0.6669, higher than that of LR (0.5908). Finally, age, TG, GPT, white blood cell count (WBC), uric acid (UA), and platelet count (Plt) were the six most important risk factors selected by Mach-L. Conclusion: Mach-L methods outperformed conventional LR, and age was the most important risk factor, followed by TG, GPT, WBC, UA, and Plt, in a cohort of Chinese women.
id doaj-art-e544d3dbebe94751a20cb324c1d1b688
institution Kabale University
issn 1028-4559
spelling doaj-art-e544d3dbebe94751a20cb324c1d1b688
2025-01-09T06:12:50Z
eng
Elsevier
Taiwanese Journal of Obstetrics & Gynecology
1028-4559
2025-01-01
Volume 64, Issue 1, Pages 68-75
Using machine learning to predict patients with polycystic ovary disease in Chinese women
Chen-Yu Wang: Department of Obstetrics and Gynecology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan; Graduate Institute of Applied Science and Engineering, Fu Jen Catholic University, New Taipei City, Taiwan
Dee Pei: Department of Medicine, Medical School, Fu Jen Catholic University, Department of Endocrinology and Metabolism, Fu Jen Catholic University Hospital, New Taipei City, Taiwan
Chun-Kai Wang: Department of Obstetrics and Gynecology, Zuoying Branch of Kaohsiung Armed Forces General Hospital, Kaohsiung, Taiwan
Jyun-Cheng Ke: Department of Obstetrics and Gynecology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
Siou-Ting Lee: Department of Obstetrics and Gynecology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan; Department of Obstetrics and Gynecology, Taoyuan Armed Forces General Hospital, Taoyuan, Taiwan
Ta-Wei Chu: Department of Obstetrics and Gynecology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan; MJ Health Research Foundation, Taipei, Taiwan
Yao-Jen Liang: Graduate Institute of Applied Science and Engineering, Fu Jen Catholic University, New Taipei City, Taiwan; Corresponding author. Department of Life Science, Graduate Institute of Applied Science and Engineering, Fu-Jen Catholic University, No. 510, Zhongzheng Rd., Xinzhuang Dist., New Taipei City, 24205, Taiwan.
http://www.sciencedirect.com/science/article/pii/S1028455924002791
Machine learning
Logistic regression
Polycystic ovary syndrome
topic Machine learning
Logistic regression
Polycystic ovary syndrome
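
Note on the AUC figures quoted in the abstract above: AUC can be read as the probability that a model scores a randomly chosen case higher than a randomly chosen control, so 0.5 means no discrimination and 1.0 means perfect separation. The following is only a minimal Python sketch of that definition. The function name `auc`, the labels, and both score lists are hypothetical and made up for illustration; they are not data or code from the study, and the study's own models (RF, SGB, MARS, XGBoost, CatBoost, LR) are not reproduced here.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    AUC equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = PCOS, 0 = control); NOT taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
model_a = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]  # made-up, informative scores
model_b = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]  # made-up, uninformative scores

auc_a = auc(y_true, model_a)
auc_b = auc(y_true, model_b)

# Gap between the two AUC values reported in the abstract (Mach-L mean vs. LR).
reported_gap = round(0.6669 - 0.5908, 4)

print(f"toy model A AUC: {auc_a:.4f}")
print(f"toy model B AUC: {auc_b:.4f}")
print(f"reported Mach-L mean AUC minus LR AUC: {reported_gap}")
```

On this reading, the reported values of 0.6669 and 0.5908 both sit closer to 0.5 than to 1.0, and the gap between them is 0.0761. The pairwise loop is quadratic in sample size, which is fine for illustration; a rank-based formula or a library routine would normally be used on a cohort of 1,120 participants.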