A comparative approach of machine learning models to predict attrition in a diabetes management program.

Bibliographic Details
Main Authors: Samantha Kanny, Grisha Post, Patricia Carbajales-Dale, William Cummings, Janet Evatt, Windsor Westbrook Sherrill
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-07-01
Series: PLOS Digital Health
Online Access: https://doi.org/10.1371/journal.pdig.0000930
collection DOAJ
description Approximately 11.6% of Americans have diabetes, and South Carolina has one of the highest rates of adults with diabetes. Diabetes self-management programs have been observed to be effective in promoting weight loss and improving diabetes knowledge and self-care behaviors. The ability to keep vulnerable individuals in these programs is critical to helping the growing diabetic population. Machine learning is gaining popularity in healthcare settings. The objective of this study is to assess the effectiveness of several machine learning methods in predicting attrition from a diabetes self-management program, using participant demographics and various evaluation measures. Data were collected from participants enrolled in Health Extension for Diabetes (HED). Descriptive statistics were used to examine HED participant demographics, while Mann-Whitney U tests and chi-square tests were used to examine relationships between demographics and pre-program evaluation measures. Across the analyses, health-related measures (specifically the SF-12 quality of life scores and the Distressed Communities Index (DCI) score), demographic factors (race, age, height, and educational attainment), and spatial variables (drive time to the nearest grocery store) emerged as influential predictors of attrition. However, the machine learning models showed poor overall performance, with AUC values ranging from 0.53 to 0.64 and F1 scores between 0.19 and 0.36, indicating low predictive power. Among the models tested, XGBoost with downsampling yielded the highest AUC value (0.64) and the highest F1 score (0.36). To enhance model interpretability, SHAP (SHapley Additive exPlanations) was applied. While these models are not suitable for accurately predicting individual attrition risk in diabetes self-management programs, they identify potential factors influencing dropout rates. These findings underscore how difficult it is for models to accurately predict health behavior outcomes, highlighting the need for future research to improve predictive modeling to better support patient engagement and retention.
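The evaluation terms in the description (downsampling, AUC, F1 score) can be illustrated with a minimal sketch. This is not the authors' code and does not use XGBoost or the HED data: it is plain Python, and the function names, the 0/1 label coding (1 = dropped out), and any labels or scores passed in are assumptions made for illustration only. AUC is computed here as the share of (dropout, completer) pairs that the scores rank correctly, which is the same quantity as the Mann-Whitney U statistic divided by the number of such pairs.

```python
import random


def downsample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have the same size.

    Assumes binary labels coded 0/1 (hypothetical coding: 1 = dropped out).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority row plus an equally sized random sample of the majority.
    keep = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in keep], [labels[i] for i in keep]


def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1(labels, preds):
    """F1 = 2*TP / (2*TP + FP + FN), returned as 0.0 when there are no true positives."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0
```

An AUC of 0.5 corresponds to random ranking, which is why the reported range of 0.53 to 0.64 is read as low predictive power. Downsampling would normally be applied to the training split only, so that the held-out data keeps the true class balance.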
id doaj-art-466321e470614f8ca8969e0159d2b279
institution Kabale University
issn 2767-3170
spelling doaj-art-466321e470614f8ca8969e0159d2b279; 2025-08-20T03:28:55Z; eng; Public Library of Science (PLoS); PLOS Digital Health; 2767-3170; 2025-07-01; volume 4, issue 7, e0000930; 10.1371/journal.pdig.0000930