Development and validation of an explainable machine learning model for predicting osteoporosis in patients with type 2 diabetes mellitus

Bibliographic Details
Main Authors: Qipeng Wei, Zihao Liu, Xiaofeng Chen, Hao Li, Weijun Guo, Qingyan Huang, Jinxiang Zhan, Shiji Chen, Dongling Cai
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-08-01
Series: Frontiers in Endocrinology
Online Access: https://www.frontiersin.org/articles/10.3389/fendo.2025.1611499/full
Description
Summary:
Objective: Osteoporosis is a common complication in patients with type 2 diabetes mellitus (T2DM), yet its screening rate remains low. This study aimed to develop and validate a cost-effective and interpretable machine learning (ML) model to predict the risk of osteoporosis in patients with T2DM.

Methods: This retrospective study included 1560 inpatients who underwent dual-energy X-ray absorptiometry (DXA) between January 2022 and December 2023 at Panyu Hospital of Chinese Medicine. Demographic information and laboratory test results obtained within 24 hours of hospital admission were collected. Potential predictive features were identified using univariate analysis, least absolute shrinkage and selection operator (LASSO) regression, and the Boruta algorithm. Eight supervised ML algorithms were applied to construct predictive models. Model performance was evaluated based on the area under the receiver operating characteristic curve (AUC), calibration plots, decision curve analysis (DCA), accuracy, sensitivity, specificity, and F1 score. The SHapley Additive exPlanations (SHAP) method was used to interpret the model and visualize feature importance.

Results: Ten predictive features were selected based on the intersection of the three feature selection methods. Among the tested models, logistic regression achieved the best overall performance, with an AUC of 0.812, an accuracy of 0.762, a sensitivity of 0.809, a specificity of 0.761, and an F1 score of 0.771 in the validation set. For this model, calibration plots showed good agreement between predicted and observed risk, and DCA showed the highest net clinical benefit. SHAP analysis identified age, sex, alkaline phosphatase, uric acid, hemoglobin, and neutrophil count as the six most influential features. An easy-to-use, web-based risk calculator was developed based on the logistic model and is available at: https://t2dm.shinyapps.io/t2dm-osteoporosis/.

Conclusion: We developed an interpretable and accessible ML-based online tool that enables preliminary screening of osteoporosis risk in patients with T2DM using routine blood indicators. This tool may assist clinicians in early risk identification and reduce the underdiagnosis of osteoporosis.
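The evaluation metrics named in the summary (accuracy, sensitivity, specificity, F1 score, AUC, and the net benefit used in decision curve analysis) follow standard definitions, and a minimal sketch of how they could be computed is given here. This is not the authors' code: the function names, the 0.5 decision threshold, and the eight-patient toy data are assumptions made purely for illustration, and the sketch assumes binary labels (1 = osteoporosis) with both classes present.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    sensitivity: float
    specificity: float
    f1: float


def classification_metrics(y_true, y_pred):
    """Threshold-based metrics from binary labels and binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return Metrics((tp + tn) / len(pairs), sensitivity, specificity, f1)


def auc(y_true, scores):
    """AUC as the probability that a random positive case outranks a
    random negative case; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, scores, threshold):
    """Net benefit at one threshold probability, the quantity plotted
    on a decision curve: TP/n - FP/n * pt / (1 - pt)."""
    n = len(y_true)
    tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
    fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

print(classification_metrics(y_true, y_pred))
print("AUC:", auc(y_true, scores))
print("Net benefit at 0.5:", net_benefit(y_true, scores, 0.5))
```

On the toy data, accuracy, sensitivity, specificity, and F1 all come out at 0.75, the AUC at 0.9375, and the net benefit at 0.25. A full decision curve would repeat `net_benefit` over a range of thresholds and compare the result against treat-all and treat-none strategies.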
ISSN: 1664-2392