Screening of serum biomarkers in patients with PCOS through lipid omics and ensemble machine learning.

Polycystic ovary syndrome (PCOS) is a primary endocrine disorder affecting premenopausal women involving metabolic dysregulation. We aimed to screen serum biomarkers in PCOS patients using untargeted lipidomics and ensemble machine learning. Serum from PCOS patients and non-PCOS subjects were collec...

Bibliographic Details
Main Authors: Ji-Ying Chen, Wu-Jie Chen, Zhi-Ying Zhu, Shi Xu, Li-Lan Huang, Wen-Qing Tan, Yong-Gang Zhang, Yan-Li Zhao
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0313494
author Ji-Ying Chen
Wu-Jie Chen
Zhi-Ying Zhu
Shi Xu
Li-Lan Huang
Wen-Qing Tan
Yong-Gang Zhang
Yan-Li Zhao
collection DOAJ
description Polycystic ovary syndrome (PCOS) is a primary endocrine disorder that affects premenopausal women and involves metabolic dysregulation. We aimed to screen serum biomarkers in PCOS patients using untargeted lipidomics and ensemble machine learning. Serum samples from PCOS patients and non-PCOS subjects were collected for untargeted lipidomics analysis. We analyzed the classification of the differential lipid metabolites and their association with clinical indexes, then applied an ensemble machine learning workflow to the untargeted lipidomics data: data preprocessing, pre-screening by statistical tests, secondary screening by ensemble learning, biomarker verification and evaluation, and construction and verification of a diagnostic panel model. The results indicated that the differential lipid metabolites not only differed between groups but were also closely associated with the corresponding clinical indexes. PI (18:0/20:3)-H and PE (18:1p/22:6)-H were identified as candidate biomarkers. Three machine learning models (logistic regression, random forest, and support vector machine) showed that the screened biomarkers had better classification performance. In addition, the correlation between the candidate biomarkers was low, indicating little overlap between the selected biomarkers and a more optimized panel combination. On the test set, the constructed diagnostic panel model reached an AUC of 0.815, with an accuracy of 0.74, a specificity of 0.88, and a sensitivity of 0.7. This study demonstrated the applicability and robustness of machine learning algorithms for analyzing lipid metabolism data in efficient and reliable biomarker screening. PI (18:0/20:3)-H and PE (18:1p/22:6)-H showed great potential for diagnosing PCOS.
format Article
id doaj-art-e9752b99f5b945998e75ff568e6311c6
institution Kabale University
issn 1932-6203
language English
publishDate 2025-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
title Screening of serum biomarkers in patients with PCOS through lipid omics and ensemble machine learning.
url https://doi.org/10.1371/journal.pone.0313494
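
The description field reports test-set AUC, accuracy, specificity, and sensitivity for the diagnostic panel model. As a reading aid only, the following minimal sketch shows how these four quantities are conventionally computed for a binary classifier. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from binary labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases called positive
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls called negative
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls called positive
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases called negative
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (the Mann-Whitney U formulation of AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT values from the study.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.7, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 turns scores into class calls.
toy_calls = [1 if s >= 0.5 else 0 for s in toy_scores]

print(diagnostic_metrics(toy_labels, toy_calls))
print(roc_auc(toy_labels, toy_scores))
```

AUC is computed from the continuous scores and does not depend on the threshold, while accuracy, sensitivity, and specificity all change with the threshold chosen. That is why a record can report one AUC alongside a single set of the other three values.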