Artificial intelligence survival models for identifying relevant risk factors for incident diabetes in Azar cohort population

Bibliographic Details
Main Authors: Neda Gilani, Mohammadhossein Somi, Farzaneh Hamidi, Pasqualina Santaguida, Elnaz Faramarzi, Reza Arabi Belaghi
Format: Article
Language: English
Published: Tabriz University of Medical Sciences 2025-05-01
Series: Health Promotion Perspectives
Online Access: https://hpp.tbzmed.ac.ir/PDF/hpp-15-82.pdf
Description
Summary: Background: This study aimed to identify risk factors associated with time to type 2 diabetes events using artificial intelligence (AI) survival models (SM) in a population cohort from East Azerbaijan, Iran. Methods: Data from the Azar Cohort, collected from 2014 to 2020, were analyzed using random forest (RF) variable selection along with Cox regression to identify the most relevant risk factors associated with diabetes. Prediction models were then developed using RF survival analysis. LASSO variable selection and RF variable selection were used to select the most important variables. The concordance index (C-index) was used to evaluate the discriminative performance of the prediction models. Results: LASSO-Cox regression identified six factors significantly associated with diabetes: age, mean corpuscular hemoglobin concentration (MCHC), waist circumference (WC), body mass index (BMI), use of sleep medication, and hypertension (stage 1 and stage 2). The model including all variables achieved a C-index of 76.3%. In contrast, the RF analysis identified 21 important variables predicting a higher probability of developing diabetes. Of those, WC, MCHC, triglyceride, and age were the most important predictors. The RF model converged after 500 trees with an out-of-bag (OOB) error of 0.28 and a C-index of 79.5%. Conclusion: The RF machine learning algorithm and LASSO-Cox regression consistently identified WC, hypertension, and MCHC as the main risk factors for developing diabetes. The RF approach demonstrated slightly better accuracy in predicting the likelihood of diabetes at different time points.
ISSN: 2228-6497
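
Note on the C-index: The summary above reports model performance as a concordance index. As a rough illustration of what that number measures, the following is a minimal sketch of Harrell's C-index in plain Python. It is not the authors' code; the function name `concordance_index` and the toy data are assumptions made for this example, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Illustrative Harrell's C-index for right-censored data.

    times:       observed follow-up time per subject
    events:      1/True if the event (e.g. diabetes) was observed, 0/False if censored
    risk_scores: model output where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Tied times are skipped here (a simplification for this sketch).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.1, 0.5, 0.7]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported values of 76.3% and 79.5% mean that roughly three out of four comparable pairs were ordered correctly by each model.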