Comparisons of Machine Learning Models for Prediction of Susceptibility to Diabetes

Bibliographic Details
Main Author: Jiao Yutian
Format: Article
Language: English
Published: EDP Sciences 2025-01-01
Series: ITM Web of Conferences
Online Access: https://www.itm-conferences.org/articles/itmconf/pdf/2025/01/itmconf_dai2024_04035.pdf
Description
Summary: Diabetes is a chronic disorder that causes millions of people to suffer severe complications such as heart attacks, kidney failure, and permanent vision loss. This study aims to identify which of five selected machine learning models performs best at diabetes prediction, and thus to provide insight into early detection of the disease. The models compared include Random Forest (RF), Logistic Regression (LR), and Support Vector Machine (SVM). The Pima Indians Diabetes (PID) dataset was preprocessed, and the models were trained on it and then assessed using four evaluation criteria. LR achieved the highest accuracy, 0.76, closely followed by RF and then SVM. Although SVM had the highest precision, its poor recall limited its overall performance on diabetes prediction. In contrast, LR and RF achieved good F-scores and therefore outperformed the other models in overall predictive performance.
ISSN: 2271-2097
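
The summary's point that high precision combined with poor recall limits overall performance can be illustrated with a minimal sketch. The confusion-matrix counts below are hypothetical and chosen only for illustration; they are not results from the article. The names `Confusion`, `metrics`, `cautious`, and `balanced` are likewise invented here, and the F-score is assumed to be the F1 score (the harmonic mean of precision and recall), which the record does not confirm.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion-matrix counts (positive class = diabetic)."""
    tp: int  # diabetic, predicted diabetic
    fp: int  # healthy, predicted diabetic
    fn: int  # diabetic, predicted healthy
    tn: int  # healthy, predicted healthy


def metrics(c: Confusion) -> dict:
    """Return accuracy, precision, recall and F1 for one classifier."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total
    # Guard against division by zero when a class is never predicted/present.
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Hypothetical counts, NOT taken from the article.
# A "cautious" model flags few patients: few false alarms, many missed cases.
cautious = Confusion(tp=20, fp=2, fn=30, tn=98)
# A "balanced" model flags more patients: more false alarms, fewer missed cases.
balanced = Confusion(tp=35, fp=15, fn=15, tn=85)

for name, counts in [("cautious", cautious), ("balanced", balanced)]:
    row = ", ".join(f"{k}={v:.2f}" for k, v in metrics(counts).items())
    print(f"{name}: {row}")
```

With these illustrative counts, the cautious model has the higher precision but the lower F-score, because the harmonic mean is pulled down by its weak recall. In a screening setting, low recall means diabetic patients go undetected, which is why a precision-only comparison can mislead.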