Evaluation of Feature Transformation and Machine Learning Models on Early Detection of Diabetes Mellitus
The increasing prevalence of diabetes necessitates the development of effective early detection methods to mitigate its health impacts. This paper investigates the impact of feature transformation and machine learning (ML) models on the early detection of diabetes using a binary tabular classificati...
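| Main Authors: | Ahmed Ali Linkon, Inshad Rahman Noman, Md Rashedul Islam, Joy Chakra Bortty, Kanchon Kumar Bishnu, Araf Islam, Rakibul Hasan, Masuk Abdullah |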
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2024-01-01 |
| Series: | IEEE Access |
| Subjects: | Machine learning; diabetes detection; feature transformation; data preprocessing |
| Online Access: | https://ieeexplore.ieee.org/document/10747366/ |
| _version_ | 1849339822642036736 |
|---|---|
| author | Ahmed Ali Linkon, Inshad Rahman Noman, Md Rashedul Islam, Joy Chakra Bortty, Kanchon Kumar Bishnu, Araf Islam, Rakibul Hasan, Masuk Abdullah |
| author_sort | Ahmed Ali Linkon |
| collection | DOAJ |
| description | The increasing prevalence of diabetes calls for effective early detection methods to mitigate its health impacts. This paper investigates the impact of feature transformation and machine learning (ML) models on the early detection of diabetes using a binary tabular classification dataset. We compare three feature transformation settings (no transformation, normalization, and min-max scaling) to assess their influence on the performance of various ML models. To evaluate these preprocessing techniques comprehensively, we experimented with twelve ML models, including both traditional algorithms and ensemble methods. The study uses a publicly available dataset containing 768 samples and 8 features. The models are assessed using several evaluation metrics: accuracy, precision, recall, F1-score, and Area Under the Receiver Operating Characteristic Curve (AUC-ROC). Among the ML models, Light Gradient Boosting Machine (LGBM) achieved the highest accuracy, 82.91%, when min-max scaling was applied to the data. Our results show that the effectiveness of different combinations of feature transformation techniques and ML models varies. Furthermore, the ensemble models generally performed better than the traditional ML models. These findings provide insights for optimizing preprocessing and model selection strategies in the development of robust early diabetes detection systems. |
| format | Article |
| id | doaj-art-1357954315ea452e9d1ec79eb5287411 |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-1357954315ea452e9d1ec79eb5287411; 2025-08-20T03:44:02Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; 12; 165425; 165440; 10.1109/ACCESS.2024.3488743; 10747366; Evaluation of Feature Transformation and Machine Learning Models on Early Detection of Diabetes Mellitus; Ahmed Ali Linkon (0); Inshad Rahman Noman (1), https://orcid.org/0009-0009-5833-7697; Md Rashedul Islam (2); Joy Chakra Bortty (3); Kanchon Kumar Bishnu (4), https://orcid.org/0009-0007-1811-3002; Araf Islam (5); Rakibul Hasan (6), https://orcid.org/0009-0001-7268-390X; Masuk Abdullah (7), https://orcid.org/0000-0002-9330-1026; Department of Computer Science, Westcliff University, Irvine, CA, USA; Department of Computer Science, California State University, Los Angeles, CA, USA; Department of Business Administration, Westcliff University, Irvine, CA, USA; Department of Computer Science, Westcliff University, Irvine, CA, USA; Department of Computer Science, California State University, Los Angeles, CA, USA; Department of Computer Science, Westcliff University, Irvine, CA, USA; Department of Business Administration, Westcliff University, Irvine, CA, USA; Department of Vehicles Engineering, Faculty of Engineering, University of Debrecen, Debrecen, Hungary; https://ieeexplore.ieee.org/document/10747366/; Machine learning; diabetes detection; feature transformation; data preprocessing |
| title | Evaluation of Feature Transformation and Machine Learning Models on Early Detection of Diabetes Mellitus |
| topic | Machine learning; diabetes detection; feature transformation; data preprocessing |
| url | https://ieeexplore.ieee.org/document/10747366/ |
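
The comparison described in the abstract (several feature transformations crossed with several classifiers, each scored on accuracy, precision, recall, F1-score, and AUC-ROC) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data is a synthetic stand-in generated with `make_classification` that only matches the stated shape (768 samples, 8 features), "normalization" is interpreted here as scikit-learn's per-sample `Normalizer` (the paper may mean something else), only two stand-in models are used instead of the twelve in the study, and the helper names `make_transformers`, `make_models`, and `evaluate` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, MinMaxScaler, Normalizer


def make_transformers():
    """Three preprocessing settings (assumed interpretations, see lead-in)."""
    return {
        "none": FunctionTransformer(),    # identity: features passed through unchanged
        "normalization": Normalizer(),    # assumption: per-sample L2 normalization
        "min-max": MinMaxScaler(),        # rescales each feature to [0, 1]
    }


def make_models():
    """One traditional and one ensemble model as stand-ins for the twelve."""
    return {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    }


def evaluate(X, y, seed=0):
    """Fit every (transformation, model) pair and score it on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    results = {}
    for t_name, transformer in make_transformers().items():
        for m_name, model in make_models().items():
            # The pipeline fits the transformer on training data only,
            # so no test-set statistics leak into preprocessing.
            pipe = make_pipeline(transformer, model)
            pipe.fit(X_train, y_train)
            pred = pipe.predict(X_test)
            proba = pipe.predict_proba(X_test)[:, 1]
            results[(t_name, m_name)] = {
                "accuracy": accuracy_score(y_test, pred),
                "precision": precision_score(y_test, pred, zero_division=0),
                "recall": recall_score(y_test, pred, zero_division=0),
                "f1": f1_score(y_test, pred, zero_division=0),
                "auc_roc": roc_auc_score(y_test, proba),
            }
    return results


# Synthetic placeholder data with the same shape as the dataset in the abstract.
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           random_state=0)
results = evaluate(X, y)
for (t_name, m_name), scores in sorted(results.items()):
    print(f"{t_name:>13} + {m_name:<19} "
          + ", ".join(f"{k}={v:.3f}" for k, v in scores.items()))
```

Wrapping each transformation and model in a single pipeline keeps the scaler's statistics tied to the training split, which is one common way to make such comparisons fair. The scores printed here come from synthetic data and say nothing about the results reported in the paper.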