Assessing individual genetic susceptibility to metabolic syndrome: interpretable machine learning method

Background Genome-wide association studies have provided profound insights into the genetic aetiology of metabolic syndrome (MetS). However, there is a lack of machine-learning (ML)-based predictive models to assess individual genetic susceptibility to MetS. This study utilized single-nucleotide pol...

Bibliographic Details
Main Authors: Tao Huang, Yuanyuan Li, Simin Wang, Shijie Qiao, Xiujuan Zheng, Wenhui Xiong, Menghan Yang, Xirui Huang, Bizhen Gao
Format: Article
Language: English
Published: Taylor & Francis Group 2025-12-01
Series: Annals of Medicine
Subjects:
Online Access: https://www.tandfonline.com/doi/10.1080/07853890.2025.2519679
collection DOAJ
description Background: Genome-wide association studies have provided profound insights into the genetic aetiology of metabolic syndrome (MetS). However, machine-learning (ML)-based predictive models for assessing individual genetic susceptibility to MetS are lacking. This study used single-nucleotide polymorphisms (SNPs) as variables and employed ML-based genetic risk score (GRS) models to predict the occurrence of MetS, bringing such prediction closer to clinical application.
Methods: Feature selection was performed using the Least Absolute Shrinkage and Selection Operator (LASSO). Six ML algorithms were used to construct GRS models. Fivefold cross-validation was used for internal validation of the models. The receiver operating characteristic (ROC) curve was used to select the better-performing GRS model. SHapley Additive exPlanations (SHAP) was then applied to interpret the model. After the GRS was extracted, analyses stratified by BMI, age and gender were performed. Finally, these conventional risk factors and the GRS were integrated through multivariate logistic regression to establish a combined model.
Results: A total of 17 SNPs were selected for analysis. Among the GRS models, the extreme gradient boosting (XGBoost) model demonstrated superior discriminative performance (AUC = 0.837). The XGBoost model's optimal robustness was also validated through fivefold cross-validation (mean ROC-AUC = 0.706). The XGBoost-based SHAP algorithm not only elucidated the global effects of the 17 SNPs across all samples, but also described the interactions between SNPs, providing a visual representation of how SNPs affect the prediction of MetS in an individual. There was a strong correlation between GRS and MetS risk, particularly among young individuals, males and overweight individuals. Furthermore, the model combining conventional risk factors and GRS exhibited excellent discriminative performance (AUC = 0.962) and outstanding robustness (mean ROC-AUC = 0.959).
Conclusion: This study established a reliable XGBoost-based GRS model and a GRS prediction platform (https://metabolicsyndromeapps.shinyapps.io/geneticriskscore/) to assess individual genetic susceptibility to MetS. The model is highly interpretable and can provide a personalized reference for deciding whether primary prevention measures for MetS are necessary. Additionally, there may be interactions between traditional risk factors and GRS, and integrating both in a comprehensive model is useful for predicting the occurrence of MetS.
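The abstract reports two kinds of figures: an AUC for a model and a mean ROC-AUC over fivefold cross-validation. The following is a minimal Python sketch of how such a mean cross-validated AUC can be computed. It is not the authors' pipeline: the unweighted risk-allele count used as the score is a stand-in for their XGBoost model, the interleaved fold assignment is a simplification, and the genotype matrix, labels, and all function names are hypothetical toy values introduced only for illustration.

```python
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the share of (case, control) pairs in which the case
    receives the higher score; tied pairs count as half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return wins / (len(cases) * len(controls))


def kfold_indices(n, k=5):
    """Assign indices 0..n-1 to k interleaved folds (no shuffling)."""
    return [list(range(i, n, k)) for i in range(k)]


def cross_validated_auc(X, y, fit, k=5):
    """Fit on k-1 folds, score the held-out fold, and average the k AUCs."""
    aucs = []
    for test_idx in kfold_indices(len(y), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        score = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        aucs.append(
            roc_auc([y[i] for i in test_idx], [score(X[i]) for i in test_idx])
        )
    return mean(aucs)


def fit_allele_count(train_X, train_y):
    """Hypothetical stand-in model: ignores the training data and scores a
    genotype by its total number of risk alleles (an unweighted GRS)."""
    return lambda genotype: sum(genotype)


# Toy data (assumed): rows are individuals, columns are SNPs coded 0/1/2.
X = [
    [2, 1, 1], [2, 2, 1], [1, 1, 1], [2, 2, 2], [1, 2, 1],  # cases
    [0, 1, 0], [1, 1, 0], [1, 1, 1], [0, 0, 0], [1, 0, 0],  # controls
]
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

cv_auc = cross_validated_auc(X, y, fit_allele_count, k=5)
print(f"mean fivefold ROC-AUC: {cv_auc:.3f}")
```

Because each fold is scored by a model that never saw it, the mean cross-validated AUC is usually a more conservative estimate than an AUC computed on the data used for fitting, which is one common reason the two figures differ.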
id doaj-art-d7d5c81051424da2a0193a2ed63621a2
institution DOAJ
issn 0785-3890
1365-2060
volume 57
issue 1
doi 10.1080/07853890.2025.2519679
author_affiliation Tao Huang: College of Integrative Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou, China
Yuanyuan Li: College of Traditional Chinese Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou, China
Simin Wang: College of Integrative Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou, China
Shijie Qiao: College of Traditional Chinese Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou, China
Xiujuan Zheng: College of Traditional Chinese Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou, China
Wenhui Xiong: College of Traditional Chinese Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou, China
Menghan Yang: College of Traditional Chinese Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou, China
Xirui Huang: College of Traditional Chinese Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou, China
Bizhen Gao: College of Integrative Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou, China
topic Metabolic syndrome
machine learning
genetic risk score
prediction model
SHAP