Development and internal validation of an interpretable risk prediction model for diabetic peripheral neuropathy in type 2 diabetes: a single-centre retrospective cohort study in China

Bibliographic Details
Main Authors: Li Cao, Linli Zhang, Feng Ju, Lianhua Liu, Bo Bi, Mei Gui, Xiaodan Wang
Format: Article
Language: English
Published: BMJ Publishing Group 2025-04-01
Series: BMJ Open
Online Access: https://bmjopen.bmj.com/content/15/4/e092463.full
collection DOAJ
description
Objective: Diabetic peripheral neuropathy (DPN) is a common and serious complication of diabetes, which can lead to foot deformity, ulceration, and even amputation. Early identification is crucial, as more than half of DPN patients are asymptomatic in the early stage. This study aimed to develop and validate multiple risk prediction models for DPN in patients with type 2 diabetes mellitus (T2DM) and to apply the Shapley Additive Explanation (SHAP) method to interpret the best-performing model and identify key risk factors for DPN.
Design: A single-centre retrospective cohort study.
Setting: The study was conducted at a tertiary teaching hospital in Hainan.
Participants and methods: Data were retrospectively collected from the electronic medical records of patients with diabetes admitted between 1 January 2021 and 28 March 2023. After data preprocessing, 73 variables were retained for baseline analysis. Feature selection was performed using univariate analysis combined with recursive feature elimination (RFE). The dataset was split into training and test sets in an 8:2 ratio, with the training set balanced via the Synthetic Minority Over-sampling Technique. Six machine learning algorithms were applied to develop prediction models for DPN. Hyperparameters were optimised using grid search with 10-fold cross-validation. Model performance was assessed using various metrics on the test set, and the SHAP method was used to interpret the best-performing model.
Results: The study included 3343 T2DM inpatients, with a median age of 60 years (IQR 53–69), and 88.6% (2962/3343) had DPN. The RFE method identified 12 key factors for model construction. Among the six models, XGBoost showed the best predictive performance, achieving an area under the curve of 0.960, accuracy of 0.927, precision of 0.969, recall of 0.948, F1-score of 0.958 and a G-mean of 0.850 on the test set. The SHAP analysis highlighted C reactive protein, total bile acids, gamma-glutamyl transpeptidase, age and lipoprotein(a) as the top five predictors of DPN.
Conclusions: The machine learning approach successfully established a DPN risk prediction model with excellent performance. The use of the interpretable SHAP method could enhance the model’s clinical applicability.
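The abstract reports a G-mean alongside accuracy, precision, recall and F1-score, which matters here because 88.6% of the cohort had DPN. The following is a minimal sketch of how these metrics relate to a binary confusion matrix. The `Confusion` class, the `metrics` function and all counts are hypothetical, chosen only for illustration; they are not taken from the study, and this is not the authors' evaluation code.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Confusion:
    """Counts from a binary confusion matrix (positive class = DPN)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def metrics(c: Confusion) -> dict:
    """Compute threshold-based classification metrics from the counts."""
    recall = _ratio(c.tp, c.tp + c.fn)        # sensitivity
    specificity = _ratio(c.tn, c.tn + c.fp)
    precision = _ratio(c.tp, c.tp + c.fp)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    f1 = _ratio(2 * precision * recall, precision + recall)
    g_mean = sqrt(recall * specificity)       # geometric mean of the two rates
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "g_mean": g_mean,
    }


# Hypothetical imbalanced test set: 100 positives, 20 negatives.
model = metrics(Confusion(tp=90, fp=5, tn=15, fn=10))

# Hypothetical classifier that labels every patient as positive.
majority = metrics(Confusion(tp=100, fp=20, tn=0, fn=0))

for name, result in (("model", model), ("always-positive", majority)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

In this illustration the always-positive classifier still reaches an accuracy above 0.8 while its G-mean is 0, because the G-mean collapses whenever either sensitivity or specificity does. That is one reason a G-mean is useful to read next to accuracy when the outcome is as prevalent as it is in this cohort.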
id doaj-art-0ad3ca96e05a4c6ba8e008ba59bd9b3a
institution DOAJ
issn 2044-6055
doi 10.1136/bmjopen-2024-092463
volume 15
issue 4
author_affiliation Li Cao: Department of Biostatistics, School of Public Health, Hainan Medical University, Haikou, Hainan, China
Linli Zhang: Department of Mathematics, Physics, and Chemistry teaching, Hainan University, Haikou, Hainan, China
Feng Ju: Department of Endocrinology, The Second Affiliated Hospital of Hainan Medical University, Haikou, Hainan, China
Lianhua Liu: Department of Biostatistics, School of Public Health, Hainan Medical University, Haikou, Hainan, China
Bo Bi: Department of Biostatistics, School of Public Health, Hainan Medical University, Haikou, Hainan, China
Mei Gui: Department of Biostatistics, School of Public Health, Hainan Medical University, Haikou, Hainan, China
Xiaodan Wang: Department of Biostatistics, School of Public Health, Hainan Medical University, Haikou, Hainan, China