Performance of Machine Learning Classifiers for Diabetes Prediction
In this study, machine learning (ML) classifiers were evaluated for their effectiveness in predicting diabetes using the Pima Indians Diabetes Database. The dataset included 768 instances with nine attributes, where the target variable indicated whether a patient tested positive for diabetes. The cl...
Main Authors: | Mijala Manandhar, Shaikat Baidya, Babalpreet Kaur, Katia Atoji |
---|---|
Format: | Article |
Language: | English |
Published: | IJMADA, 2024-08-01 |
Series: | International Journal of Management and Data Analytics |
Subjects: | healthcare; early diagnostics; insulin; glucose; patient care |
Online Access: | https://ijmada.com/index.php/ijmada/article/view/39 |
_version_ | 1832593407251316736 |
---|---|
author | Mijala Manandhar; Shaikat Baidya; Babalpreet Kaur; Katia Atoji |
author_sort | Mijala Manandhar |
collection | DOAJ |
description | In this study, machine learning (ML) classifiers were evaluated for their effectiveness in predicting diabetes using the Pima Indians Diabetes Database. The dataset included 768 instances with nine attributes, where the target variable indicated whether a patient tested positive for diabetes. The classifiers were grouped into Function (Logistic Regression, Multilayer Perceptron, Stochastic Gradient Descent), Rules (Decision Table, JRip, OneR), and Trees (Decision Stump, Hoeffding Tree, J48). Performance metrics such as accuracy, precision, recall, Matthews Correlation Coefficient, ROC Area, and F1-measure were used to compare the classifiers. Among the Function classifiers, Stochastic Gradient Descent (SGD) demonstrated the highest performance, particularly in handling large datasets and minimizing overfitting. Logistic Regression and Multilayer Perceptron also showed robust results, but SGD was superior in most metrics. For the Rules classifiers, JRip outperformed others due to its iterative rule optimization, whereas OneR's simplicity resulted in the lowest performance. Decision Table offered a clear representation of decision rules but was limited by the complexity of the dataset. In the Trees group, J48 was the most effective, benefitting from its ability to handle complex interactions and numerous features. The study highlights the potential of ML algorithms in early diabetes detection, enabling timely intervention and personalized management strategies. The importance of key predictors such as plasma glucose, BMI, and age was emphasized. Future research should focus on integrating multiple datasets and exploring more complex ML algorithms to enhance prediction accuracy and generalization. The development of real-time predictive systems is crucial for improving clinical processes and patient outcomes. |
format | Article |
id | doaj-art-46c9c512579e4224825cdd25d03840cd |
institution | Kabale University |
issn | 2816-9395 |
language | English |
publishDate | 2024-08-01 |
publisher | IJMADA |
record_format | Article |
series | International Journal of Management and Data Analytics |
spelling | doaj-art-46c9c512579e4224825cdd25d03840cd; 2025-01-20T15:45:31Z; eng; IJMADA; International Journal of Management and Data Analytics; 2816-9395; 2024-08-01; 411839; Performance of Machine Learning Classifiers for Diabetes Prediction; Mijala Manandhar, Shaikat Baidya, Babalpreet Kaur, Katia Atoji (all University Canada West); https://ijmada.com/index.php/ijmada/article/view/39; healthcare; early diagnostics; insulin; glucose; patient care |
title | Performance of Machine Learning Classifiers for Diabetes Prediction |
topic | healthcare; early diagnostics; insulin; glucose; patient care |
url | https://ijmada.com/index.php/ijmada/article/view/39 |
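
The description names accuracy, precision, recall, Matthews Correlation Coefficient, and F1-measure among the metrics used to compare the classifiers. As a minimal sketch, and not the authors' code, these can be computed from the four counts of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study. ROC Area is left out because it needs ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative and
    true-negative counts. Any ratio with a zero denominator is reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # Matthews Correlation Coefficient ranges from -1 to 1.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not results from the article).
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting MCC alongside accuracy can be informative when the two classes are unevenly represented, since MCC uses all four confusion-matrix cells.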