Development and Validation of a Machine Learning Algorithm for Predicting Diabetes Retinopathy in Patients With Type 2 Diabetes: Algorithm Development Study


Bibliographic Details
Main Authors: Sunyoung Kim, Jaeyu Park, Yejun Son, Hojae Lee, Selin Woo, Myeongcheol Lee, Hayeon Lee, Hyunji Sang, Dong Keon Yon, Sang Youl Rhee
Format: Article
Language: English
Published: JMIR Publications 2025-02-01
Series: JMIR Medical Informatics
Online Access: https://medinform.jmir.org/2025/1/e58107
Abstract

Background: Diabetic retinopathy (DR) is the leading cause of preventable blindness worldwide. Machine learning (ML) systems can enhance DR in community-based screening. However, predictive power models for usability and performance are still being determined.

Objective: This study used data from 3 university hospitals in South Korea to conduct a simple and accurate assessment of ML-based risk prediction for the development of DR that can be universally applied to adults with type 2 diabetes mellitus (T2DM).

Methods: DR was predicted using data from 2 independent electronic medical records: a discovery cohort (one hospital, n=14,694) and a validation cohort (2 hospitals, n=1856). The primary outcome was the presence of DR at 3 years. Different ML-based models were selected through hyperparameter tuning in the discovery cohort, and the area under the receiver operating characteristic (ROC) curve was analyzed in both cohorts.

Results: Among 14,694 patients screened for inclusion, 348 (2.37%) were diagnosed with DR. For DR, the extreme gradient boosting (XGBoost) system had an accuracy of 75.13% (95% CI 74.10‐76.17), a sensitivity of 71.00% (95% CI 66.83‐75.17), and a specificity of 75.23% (95% CI 74.16‐76.31) in the original dataset. Among the validation datasets, XGBoost had an accuracy of 65.14%, a sensitivity of 64.96%, and a specificity of 65.15%. The most common feature in the XGBoost model is dyslipidemia, followed by cancer, hypertension, chronic kidney disease, neuropathy, and cardiovascular disease.

Conclusions: This approach shows the potential to enhance patient outcomes by enabling timely interventions in patients with T2DM, improving our understanding of contributing factors, and reducing DR-related complications. The proposed prediction model is expected to be both competitive and cost-effective, particularly for primary care settings in South Korea.
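
The abstract reports accuracy, sensitivity, and specificity for a binary DR classifier. As a reading aid, the following minimal sketch shows how these three quantities are conventionally derived from confusion-matrix counts. It is not the authors' code: the class `ConfusionCounts`, the function `screening_metrics`, and the example counts are hypothetical and chosen only for illustration. The only figures taken from the record are the 348 DR cases among 14,694 patients.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary classifier (hypothetical container, not from the paper)."""
    tp: int  # DR present, predicted DR
    fp: int  # DR absent, predicted DR
    tn: int  # DR absent, predicted no DR
    fn: int  # DR present, predicted no DR


def screening_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, sensitivity, and specificity as fractions in [0, 1]."""
    positives = c.tp + c.fn   # all patients who truly have DR
    negatives = c.tn + c.fp   # all patients who truly do not
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (c.tp + c.tn) / (positives + negatives),
        "sensitivity": c.tp / positives,   # true positive rate
        "specificity": c.tn / negatives,   # true negative rate
    }


# Illustrative counts only; they are NOT the study's confusion matrix.
example = ConfusionCounts(tp=71, fp=250, tn=750, fn=29)
metrics = screening_metrics(example)

# Prevalence check using the figures stated in the abstract.
prevalence_pct = round(100 * 348 / 14694, 2)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.2%}")
    print(f"DR prevalence in discovery cohort: {prevalence_pct}%")
```

With a low outcome prevalence such as the one reported here, accuracy is dominated by the specificity term, which is why the reported accuracy and specificity values sit so close together.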
Record ID: doaj-art-ad84708fb0f0489480192e33db3cf101
Institution: DOAJ
ISSN: 2291-9694
DOI: 10.2196/58107
Volume: 13
Article ID: e58107
Author ORCIDs:
Sunyoung Kim: http://orcid.org/0000-0003-4115-4455
Jaeyu Park: http://orcid.org/0009-0005-2009-386X
Yejun Son: http://orcid.org/0009-0001-3939-2983
Hojae Lee: http://orcid.org/0009-0002-1737-2540
Selin Woo: http://orcid.org/0000-0001-7961-2074
Myeongcheol Lee: http://orcid.org/0009-0006-7185-9471
Hayeon Lee: http://orcid.org/0009-0000-2403-6241
Hyunji Sang: http://orcid.org/0000-0003-2557-5911
Dong Keon Yon: http://orcid.org/0000-0003-1628-9948
Sang Youl Rhee: http://orcid.org/0000-0003-0119-5818