A machine learning-based prediction of diabetic retinopathy using the Korea national health and nutrition examination survey (2008–2012, 2017–2021)

Background: Machine learning technology that uses available clinical data to predict diabetic retinopathy (DR) can be highly valuable in medical settings where fundus cameras are not accessible. Objective: This study aimed to develop and compare machine learning algorithms for predicting DR without fundu...

Bibliographic Details
Main Authors: Min Seok Kim, Young Wook Choi, Borghare Shubham Prakash, Youngju Lee, Soo Lim, Se Joon Woo
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-05-01
Series: Frontiers in Medicine
Subjects: diabetic retinopathy; machine learning; random forest algorithms; Korea; prediction
Online Access: https://www.frontiersin.org/articles/10.3389/fmed.2025.1542860/full
_version_ 1849688996544774144
author Min Seok Kim
Young Wook Choi
Borghare Shubham Prakash
Youngju Lee
Soo Lim
Se Joon Woo
collection DOAJ
description Background: Machine learning technology that uses available clinical data to predict diabetic retinopathy (DR) can be highly valuable in medical settings where fundus cameras are not accessible.
Objective: This study aimed to develop and compare machine learning algorithms for predicting DR without fundus images.
Methods: We used data from the Korea National Health and Nutrition Examination Survey (2008–2012 and 2017–2021) and enrolled individuals aged ≥ 20 years with diabetes who underwent fundus examination. Predictive models for DR were developed using logistic regression and three machine learning algorithms: extreme gradient boosting, decision tree, and random forest. Model performance was evaluated using area under the receiver operating characteristic curve (AUC) and accuracy for the diagnosis of DR, and feature importance was determined using Shapley Additive Explanations (SHAP).
Results: Among the 3,026 diabetic participants (male, 50.7%; mean age, 63.7 ± 10.5 years), 671 (22.2%) had DR. The random forest model, using 16 variables, achieved the highest AUC of 0.748 (95% confidence interval, 0.705–0.790), with a sensitivity of 0.669, a specificity of 0.729, and an accuracy of 0.715. As interpreted by SHAP, HbA1c, fasting glucose levels, duration of diabetes, and body mass index were identified as common key determinants influencing the model’s outcomes.
Conclusion: The DR prediction models using machine learning techniques demonstrated reliable performance even without fundus imaging, with the random forest model showing particularly strong results. These models could assist in managing DR by identifying high-risk patients, enabling timely ophthalmic referrals.
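The evaluation metrics named in the abstract (sensitivity, specificity, accuracy, AUC) can be illustrated with a minimal Python sketch. This is not the authors' code: the toy labels, scores, and the 0.5 threshold are made up for illustration, and all function names are hypothetical. The last part checks that the reported accuracy is roughly what the reported sensitivity and specificity imply, under the assumption that the evaluation set has the same DR prevalence as the full cohort (671 of 3,026), which the abstract does not state.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (the rank-based definition)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_from_rates(sens: float, spec: float, prevalence: float) -> float:
    """Accuracy implied by sensitivity, specificity, and prevalence."""
    return sens * prevalence + spec * (1.0 - prevalence)


# Toy data (made up): 3 DR cases, 5 non-cases, with model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
toy_sensitivity = tp / (tp + fn)
toy_specificity = tn / (tn + fp)
toy_accuracy = (tp + tn) / len(y_true)
toy_auc = auc(y_true, scores)

# Consistency check on the figures reported in the abstract.
# Assumption: the evaluation set shares the cohort prevalence.
prevalence = 671 / 3026
implied_accuracy = accuracy_from_rates(0.669, 0.729, prevalence)

print(f"toy: sens={toy_sensitivity:.3f} spec={toy_specificity:.3f} "
      f"acc={toy_accuracy:.3f} auc={toy_auc:.3f}")
print(f"reported prevalence={prevalence:.3f}, implied accuracy={implied_accuracy:.3f}")
```

Under that prevalence assumption, the implied accuracy lands within about 0.001 of the reported 0.715, so the three reported figures are mutually consistent.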
format Article
id doaj-art-5e095302735e49389ae730caef44cb61
institution DOAJ
issn 2296-858X
language English
publishDate 2025-05-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Medicine
spelling doaj-art-5e095302735e49389ae730caef44cb61; 2025-08-20T03:21:47Z; eng; Frontiers Media S.A.; Frontiers in Medicine; 2296-858X; 2025-05-01; 12; 10.3389/fmed.2025.1542860; 1542860
Min Seok Kim: Department of Ophthalmology, Seoul National University College of Medicine, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea
Young Wook Choi: RetiMark R&D Center, Seoul, Republic of Korea
Borghare Shubham Prakash: RetiMark R&D Center, Seoul, Republic of Korea
Youngju Lee: RetiMark R&D Center, Seoul, Republic of Korea
Soo Lim: Department of Internal Medicine, Seoul National University College of Medicine and Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea
Se Joon Woo: Department of Ophthalmology, Seoul National University College of Medicine, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea
title A machine learning-based prediction of diabetic retinopathy using the Korea national health and nutrition examination survey (2008–2012, 2017–2021)
topic diabetic retinopathy
machine learning
random forest algorithms
Korea
prediction
url https://www.frontiersin.org/articles/10.3389/fmed.2025.1542860/full