Evaluation of Feature Transformation and Machine Learning Models on Early Detection of Diabetes Mellitus

The increasing prevalence of diabetes necessitates the development of effective early detection methods to mitigate its health impacts. This paper investigates the impact of feature transformation and machine learning (ML) models on the early detection of diabetes using a binary tabular classificati...

Bibliographic Details
Main Authors: Ahmed Ali Linkon, Inshad Rahman Noman, Md Rashedul Islam, Joy Chakra Bortty, Kanchon Kumar Bishnu, Araf Islam, Rakibul Hasan, Masuk Abdullah
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Machine learning; diabetes detection; feature transformation; data preprocessing
Online Access: https://ieeexplore.ieee.org/document/10747366/
author Ahmed Ali Linkon
Inshad Rahman Noman
Md Rashedul Islam
Joy Chakra Bortty
Kanchon Kumar Bishnu
Araf Islam
Rakibul Hasan
Masuk Abdullah
collection DOAJ
description The increasing prevalence of diabetes calls for effective early detection methods to mitigate its health impacts. This paper investigates how feature transformation and the choice of machine learning (ML) model affect the early detection of diabetes on a binary tabular classification dataset. We compare three feature transformation settings (no transformation, normalization, and min-max scaling) to assess their influence on the performance of various ML models. To evaluate these preprocessing techniques comprehensively, we experimented with twelve ML models, including both traditional algorithms and ensemble methods. The study uses a publicly available dataset containing 768 samples and 8 features. The models are assessed using several evaluation metrics: accuracy, precision, recall, F1-score, and Area Under the Receiver Operating Characteristic Curve (AUC-ROC). Among the ML models, Light Gradient Boosting Machine (LGBM) achieved the highest accuracy, 82.91%, when min-max scaling was applied to the data. Our results show that effectiveness varies across combinations of feature transformation technique and ML model. The ensemble models generally performed better than the traditional ML models. These findings offer guidance for choosing preprocessing and model selection strategies when developing robust early diabetes detection systems.
format Article
id doaj-art-1357954315ea452e9d1ec79eb5287411
institution Kabale University
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-1357954315ea452e9d1ec79eb5287411
Record timestamp: 2025-08-20T03:44:02Z
Language: eng
Publisher: IEEE
Journal: IEEE Access
ISSN: 2169-3536
Date: 2024-01-01
Volume: 12
Pages: 165425-165440
DOI: 10.1109/ACCESS.2024.3488743
Article number: 10747366
Title: Evaluation of Feature Transformation and Machine Learning Models on Early Detection of Diabetes Mellitus
Authors and affiliations:
Ahmed Ali Linkon (0): Department of Computer Science, Westcliff University, Irvine, CA, USA
Inshad Rahman Noman (1), https://orcid.org/0009-0009-5833-7697: Department of Computer Science, California State University, Los Angeles, CA, USA
Md Rashedul Islam (2): Department of Business Administration, Westcliff University, Irvine, CA, USA
Joy Chakra Bortty (3): Department of Computer Science, Westcliff University, Irvine, CA, USA
Kanchon Kumar Bishnu (4), https://orcid.org/0009-0007-1811-3002: Department of Computer Science, California State University, Los Angeles, CA, USA
Araf Islam (5): Department of Computer Science, Westcliff University, Irvine, CA, USA
Rakibul Hasan (6), https://orcid.org/0009-0001-7268-390X: Department of Business Administration, Westcliff University, Irvine, CA, USA
Masuk Abdullah (7), https://orcid.org/0000-0002-9330-1026: Department of Vehicles Engineering, Faculty of Engineering, University of Debrecen, Debrecen, Hungary
URL: https://ieeexplore.ieee.org/document/10747366/
Keywords: Machine learning; diabetes detection; feature transformation; data preprocessing
title Evaluation of Feature Transformation and Machine Learning Models on Early Detection of Diabetes Mellitus
topic Machine learning
diabetes detection
feature transformation
data preprocessing
url https://ieeexplore.ieee.org/document/10747366/
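
The description in this record names min-max scaling as one of the compared feature transformations and lists accuracy, precision, recall, and F1-score among the evaluation metrics. The following is a minimal, standard-library sketch of how those quantities are conventionally computed. It is not the authors' code: the function names, the sample feature values, and the labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def min_max_scale(column: Sequence[float]) -> List[float]:
    """Rescale one feature column linearly so its values lie in [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical values for a single feature (not taken from the paper's dataset).
feature = [85.0, 120.0, 155.0, 190.0]
scaled = min_max_scale(feature)

# Hypothetical ground-truth labels and model predictions.
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
metrics = binary_metrics(y_true, y_pred)

print(scaled)
print(metrics)
```

In an actual train/test experiment, the minimum and maximum would normally be computed on the training split only and then reused for the test split, so that no information from the test data leaks into preprocessing. Whether the paper follows that practice is not stated in this record.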