Machine Learning Framework for Early Detection of Chronic Kidney Disease Stages Using Optimized Estimated Glomerular Filtration Rate

Chronic Kidney Disease (CKD) is a progressive condition that requires accurate diagnosis and staging for effective clinical management. Conventional CKD diagnosis relies on estimated Glomerular Filtration Rate (eGFR), a measure of kidney function derived from serum biomarkers such as serum creatinin...

Bibliographic Details
Main Authors: Samit Kumar Ghosh, Namareq Widatalla, Ahsan H. Khandoker
Format: Article
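Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Chronic kidney diseases; CKD-EPI equation; cystatin C; glomerular filtration rate; serum creatinine; machine learning
Online Access: https://ieeexplore.ieee.org/document/10979939/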
_version_ 1850034700433752064
author Samit Kumar Ghosh
Namareq Widatalla
Ahsan H. Khandoker
author_sort Samit Kumar Ghosh
collection DOAJ
description Chronic Kidney Disease (CKD) is a progressive condition that requires accurate diagnosis and staging for effective clinical management. Conventional CKD diagnosis relies on the estimated Glomerular Filtration Rate (eGFR), a measure of kidney function derived from serum biomarkers such as serum creatinine (SCr) and cystatin C (SCysC). However, eGFR calculations may be inaccurate when applied to diverse patient populations. This study proposes a machine learning (ML) system that integrates regression-based eGFR estimation, metaheuristic optimization using the Grey Wolf Optimizer (GWO), and multi-class classification with several ML models to improve CKD staging. The model estimates eGFR using three established CKD Epidemiology Collaboration (CKD-EPI) equations based on SCr, SCysC, and their combination. Two regression models, Linear Regression (LR) and Support Vector Regression (SVR), are evaluated for predictive performance. SVR outperforms LR: for $\text{CKD-EPI}_{\text{SCr-SCysC}}$, it achieved a root mean squared error (RMSE) of 3.03, a mean absolute percentage error (MAPE) of 2.97%, and a coefficient of determination ($\text{R}^{2}$) of 0.97. Hyperparameter tuning with GWO reduced RMSE by 37.3% and MAPE by 37.4% and improved $\text{R}^{2}$ by 2.06%. The tuned eGFR estimates are then fed into several CKD stage classifiers: Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and Extreme Gradient Boosting (XGBoost). Among these, XGBoost achieves the highest classification accuracy, 97.76%, with an F1-score of 97.45%. Shapley Additive Explanations (SHAP) provide global and local feature importance, supporting clinical decision-making and model transparency. Future research will validate the model on larger and more diverse datasets and will incorporate additional clinical parameters, including biomarkers and genetic data, to improve the precision of CKD risk prediction. This research contributes to AI-driven nephrology by providing a scalable, interpretable, and highly accurate solution for diagnosing and managing CKD.
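The regression metrics named in the abstract (RMSE, MAPE, R²) and the step from an eGFR value to a CKD stage can be illustrated with a short sketch. This is not the authors' code: the function names and the sample values are hypothetical, and the stage cut-offs are the standard KDIGO GFR categories (G1 to G5), which the record does not confirm as the labels used in the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def ckd_stage(egfr):
    """Map eGFR (mL/min/1.73 m^2) to a KDIGO GFR category.

    Assumption: KDIGO cut-offs; the study's exact labeling is not given here.
    """
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"

# Hypothetical reference eGFR values and model predictions (illustrative only).
reference = [95.0, 72.0, 50.0, 33.0, 20.0, 10.0]
predicted = [92.0, 75.0, 48.0, 35.0, 19.0, 11.0]

print("RMSE:", rmse(reference, predicted))
print("MAPE (%):", mape(reference, predicted))
print("R2:", r2(reference, predicted))
print("Stages:", [ckd_stage(e) for e in predicted])
```

In the pipeline described above, a trained classifier such as XGBoost would predict the stage from the tuned eGFR estimates and other features; the threshold rule here only shows what the stage labels mean.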
format Article
id doaj-art-97b2172191a6409eb4891ee590661be0
institution DOAJ
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-97b2172191a6409eb4891ee590661be0
2025-08-20T02:57:44Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
13
78057
78072
10.1109/ACCESS.2025.3565549
10979939
Machine Learning Framework for Early Detection of Chronic Kidney Disease Stages Using Optimized Estimated Glomerular Filtration Rate
Samit Kumar Ghosh (0) https://orcid.org/0000-0003-2267-7314
Namareq Widatalla (1) https://orcid.org/0000-0001-9848-8531
Ahsan H. Khandoker (2) https://orcid.org/0000-0002-0636-1646
(0) Department of Biomedical Engineering and Biotechnology, Khalifa University, Abu Dhabi, United Arab Emirates
(1) Department of Biomedical Engineering and Biotechnology, Khalifa University, Abu Dhabi, United Arab Emirates
(2) Department of Biomedical Engineering and Biotechnology, Khalifa University, Abu Dhabi, United Arab Emirates
title Machine Learning Framework for Early Detection of Chronic Kidney Disease Stages Using Optimized Estimated Glomerular Filtration Rate
topic Chronic kidney diseases
CKD-EPI equation
cystatin C
glomerular filtration rate
serum creatinine
machine learning
url https://ieeexplore.ieee.org/document/10979939/
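The description in this record mentions hyperparameter tuning with the Grey Wolf Optimizer (GWO). The following is a minimal sketch of the standard GWO update rule, in which each wolf moves toward the average of positions suggested by the three best wolves (alpha, beta, delta). It is not the authors' implementation: the function name, the parameters, and the toy quadratic objective, which stands in for a validation error over two hyperparameters, are all assumptions made for illustration.

```python
import random

def grey_wolf_optimizer(objective, bounds, n_wolves=12, n_iter=100, seed=0):
    """Minimize `objective` over box `bounds` with a basic Grey Wolf Optimizer."""
    rng = random.Random(seed)
    dim = len(bounds)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_pos, best_fit = None, float("inf")

    for it in range(n_iter):
        # Rank the pack; the three best wolves lead the search.
        ranked = sorted(wolves, key=objective)
        alpha, beta, delta = (list(w) for w in ranked[:3])
        alpha_fit = objective(alpha)
        if alpha_fit < best_fit:
            best_pos, best_fit = alpha, alpha_fit

        # Control parameter a decays linearly from 2 toward 0.
        a = 2.0 * (1.0 - it / n_iter)

        new_wolves = []
        for wolf in wolves:
            position = []
            for d in range(dim):
                guided = []
                for leader in (alpha, beta, delta):
                    A = 2.0 * a * rng.random() - a   # exploration/exploitation coefficient
                    C = 2.0 * rng.random()           # random weight on the leader
                    dist = abs(C * leader[d] - wolf[d])
                    guided.append(leader[d] - A * dist)
                value = sum(guided) / 3.0
                lo, hi = bounds[d]
                position.append(min(max(value, lo), hi))  # keep inside the box
            new_wolves.append(position)
        wolves = new_wolves

    # Account for the final pack as well.
    final = min(wolves, key=objective)
    if objective(final) < best_fit:
        best_pos, best_fit = list(final), objective(final)
    return best_pos, best_fit

def toy_error(params):
    """Hypothetical stand-in for a validation error; minimum 0 at (1, 1)."""
    return sum((p - 1.0) ** 2 for p in params)

search_box = [(-5.0, 5.0), (-5.0, 5.0)]
best_params, best_error = grey_wolf_optimizer(toy_error, search_box)
print("best parameters:", best_params)
print("best error:", best_error)
```

To tune a regressor such as SVR, `toy_error` would be replaced by a function that trains the model with the candidate hyperparameters and returns a cross-validated RMSE; the search box would then hold the ranges of those hyperparameters.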