Clustering‐based risk stratification of prediabetes populations: Insights from the Taiwan and UK Biobanks

ABSTRACT Aims/Introduction This study aimed to identify low‐ and high‐risk diabetes groups within prediabetes populations using data from the Taiwan Biobank (TWB) and UK Biobank (UKB) through a clustering‐based Unsupervised Learning (UL) approach, to inform targeted type 2 diabetes (T2D) interventio...

Bibliographic Details
Main Authors: Djeane Debora Onthoni, Ying‐Erh Chen, Yi‐Hsuan Lai, Guo‐Hung Li, Yong‐Sheng Zhuang, Hong‐Ming Lin, Yu‐Ping Hsiao, Ade Indra Onthoni, Hung‐Yi Chiou, Ren‐Hua Chung
Format: Article
Language: English
Published: Wiley 2025-01-01
Series: Journal of Diabetes Investigation
Subjects: Machine learning; Prediabetes; Risk stratification
Online Access: https://doi.org/10.1111/jdi.14328
collection DOAJ
description ABSTRACT
Aims/Introduction: This study aimed to identify low‐ and high‐risk diabetes groups within prediabetes populations using data from the Taiwan Biobank (TWB) and UK Biobank (UKB) through a clustering‐based unsupervised learning (UL) approach, to inform targeted type 2 diabetes (T2D) interventions.
Materials and Methods: Data from TWB and UKB, comprising clinical and genetic information, were analyzed. Prediabetes was defined by glucose thresholds, and incident T2D was identified through follow‐up data. K‐means clustering was performed on prediabetes participants using significant features determined through logistic regression and LASSO. Cluster stability was assessed using mean Jaccard similarity, silhouette score, and the elbow method.
Results: We identified two stable clusters representing high‐ and low‐risk diabetes groups in both biobanks. The high‐risk clusters showed higher diabetes incidence, with 15.7% in TWB and 13.0% in UKB, compared to 7.3% and 9.1% in the low‐risk clusters, respectively. Notably, males were predominant in the high‐risk groups, constituting 76.6% in TWB and 52.7% in UKB. In TWB, the high‐risk group also exhibited significantly higher BMI, fasting glucose, and triglycerides, while in UKB the differences in BMI and other metabolic indicators were marginally significant. Current smoking was significantly associated with increased diabetes risk in the TWB high‐risk group (P < 0.001). Kaplan–Meier curves indicated significant differences between clusters in the incidence of diabetes complications.
Conclusions: UL effectively identified risk‐specific groups within prediabetes populations, with high‐risk groups strongly associated with male gender, higher BMI, smoking, and metabolic markers. Tailored preventive strategies, particularly for young males in Taiwan, are crucial to reducing T2D risk.
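The stability check named in the description (K‐means with two clusters, scored by mean Jaccard similarity) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic two‐dimensional points rather than biobank features, the resampling scheme is an assumption (the abstract names only the metric), and the function names `kmeans` and `mean_jaccard_stability` are hypothetical. Only the Python standard library is used.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd's algorithm. Returns one cluster label (0..k-1) per point."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def jaccard(a, b):
    """Jaccard similarity of two index sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_jaccard_stability(points, k, n_boot=20, seed=0):
    """Mean best-match Jaccard similarity per original cluster.

    Assumption: each resample keeps the distinct indices drawn with
    replacement, a simplification of a full bootstrap.
    """
    rng = random.Random(seed)
    n = len(points)
    base = kmeans(points, k, rng)
    base_sets = [{i for i in range(n) if base[i] == c} for c in range(k)]
    totals = [0.0] * k
    for _ in range(n_boot):
        idx = sorted({rng.randrange(n) for _draw in range(n)})
        boot = kmeans([points[i] for i in idx], k, rng)
        boot_sets = [{i for i, lab in zip(idx, boot) if lab == c} for c in range(k)]
        drawn = set(idx)
        for c in range(k):
            # Compare the original cluster, restricted to resampled points,
            # with its best-matching cluster in the resampled solution.
            totals[c] += max(jaccard(base_sets[c] & drawn, b) for b in boot_sets)
    return [t / n_boot for t in totals]


# Synthetic stand-in for two well-separated groups (not biobank data).
data_rng = random.Random(42)
low = [(data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)) for _ in range(60)]
high = [(data_rng.gauss(6.0, 1.0), data_rng.gauss(6.0, 1.0)) for _ in range(60)]
stability = mean_jaccard_stability(low + high, k=2)
print(stability)
```

Values near 1 mean a cluster reappears almost unchanged across resamples; on real data, lower values would suggest the partition is sensitive to sampling. The silhouette score and elbow method mentioned in the description are separate checks and are not shown here.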
id doaj-art-50a2e27a694e4dbcbc0652d1e5d4b3ca
institution OA Journals
issn 2040-1116
2040-1124
citation Journal of Diabetes Investigation, vol. 16, no. 1, pp. 25-35, 2025-01-01, https://doi.org/10.1111/jdi.14328
author_affiliation Djeane Debora Onthoni: Institute of Population Health Sciences, National Health Research Institutes, Miaoli County, Taiwan
Ying‐Erh Chen: Department of Risk Management and Insurance, Tamkang University, New Taipei City, Taiwan
Yi‐Hsuan Lai: Institute of Population Health Sciences, National Health Research Institutes, Miaoli County, Taiwan
Guo‐Hung Li: Institute of Population Health Sciences, National Health Research Institutes, Miaoli County, Taiwan
Yong‐Sheng Zhuang: Institute of Population Health Sciences, National Health Research Institutes, Miaoli County, Taiwan
Hong‐Ming Lin: Institute of Population Health Sciences, National Health Research Institutes, Miaoli County, Taiwan
Yu‐Ping Hsiao: Institute of Population Health Sciences, National Health Research Institutes, Miaoli County, Taiwan
Ade Indra Onthoni: Institute of Population Health Sciences, National Health Research Institutes, Miaoli County, Taiwan
Hung‐Yi Chiou: Institute of Population Health Sciences, National Health Research Institutes, Miaoli County, Taiwan
Ren‐Hua Chung: Institute of Population Health Sciences, National Health Research Institutes, Miaoli County, Taiwan
topic Machine learning
Prediabetes
Risk stratification