Machine Learning-Based Diabetes Risk Prediction Using Associated Behavioral Features

Diabetes is a global health concern that affects people of all races. With different uncertainties in human lifestyles, it is difficult to predict diabetes while assuming that the risk patterns are the same for all. The likelihood of diabetes in a patient is mostly predicted using machine learning (...

Bibliographic Details
Main Authors: Ayodeji O. J. Ibitoye, Joseph D. Akinyemi, Olufade F. W. Onifade
Format: Article
Language: English
Published: World Scientific Publishing 2024-01-01
Series: Computing Open
Subjects: Diabetes; machine learning; risk prediction; paired relationship; decision support
Online Access: https://www.worldscientific.com/doi/10.1142/S2972370124500065
collection DOAJ
description Diabetes is a global health concern that affects people of all races. With different uncertainties in human lifestyles, it is difficult to predict diabetes while assuming that the risk patterns are the same for all. The likelihood of diabetes in a patient is mostly predicted using machine learning (ML) models on features explicitly available in datasets, while the intrinsic relationship between features vis-à-vis their potential relevance to the presence of diabetes is often neglected. In this work, we explored feature importance and correlation to derive the top 15 feature pairs from a dataset of 263,882 samples of anonymized patient information. These top-15 feature pairs were fed into five different ML models (decision tree (DT), neural networks (NN), random forest (RF), support vector machine (SVM) and extreme gradient boosting (XGB)) for predicting the likelihood of diabetes, while the direct features (without correlated pairing) were fed separately into the same five ML models. The models' performances were evaluated using accuracy, precision, recall and F1-score. NN presented the best performance overall, achieving an F1-score of 85% for the correlated feature pairs (CF) and 75% for the direct feature pairs. The results confirm the importance of the correlation/relationship between features in predicting the likelihood of diabetes in patients more accurately.
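The description above mentions ranking feature pairs and scoring models with precision, recall and F1-score. The following is a minimal sketch of those two steps, not the authors' implementation: it assumes pairs are ranked by absolute Pearson correlation alone (the record does not say how feature importance and correlation were combined), and the function names and the column layout are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation; treat as 0
    return cov / sqrt(var_x * var_y)


def top_feature_pairs(columns, k=15):
    """Return the k feature pairs with the highest absolute correlation.

    `columns` maps a feature name to its list of values (assumed layout).
    Ranking by correlation only is an assumption made for this sketch.
    """
    scored = [
        (abs(pearson(columns[a], columns[b])), a, b)
        for a, b in combinations(sorted(columns), 2)
    ]
    scored.sort(reverse=True)
    return [(a, b) for _, a, b in scored[:k]]


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

In a fuller pipeline, the selected pairs would be turned into model inputs and passed to each of the five classifiers; that step is left out here because the record does not describe how paired features were encoded.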
id doaj-art-b05f60f7b5334a0f947fbcd4ad046b38
institution Kabale University
issn 2972-3701
spelling doaj-art-b05f60f7b5334a0f947fbcd4ad046b38 2025-02-04T03:24:11Z eng World Scientific Publishing Computing Open 2972-3701 2024-01-01 02 10.1142/S2972370124500065 Machine Learning-Based Diabetes Risk Prediction Using Associated Behavioral Features
Ayodeji O. J. Ibitoye (0), Joseph D. Akinyemi (1), Olufade F. W. Onifade (2)
(0) School of Computing and Mathematical Sciences, University of Greenwich, SE10 9LS, London, United Kingdom
(1) Department of Computer Science, University of York, YO10 5DD, York, United Kingdom
(2) Department of Computer Science, University of Ibadan, Nigeria
https://www.worldscientific.com/doi/10.1142/S2972370124500065
Diabetes; machine learning; risk prediction; paired relationship; decision support
topic Diabetes
machine learning
risk prediction
paired relationship
decision support