Machine Learning Framework for Early Detection of Chronic Kidney Disease Stages Using Optimized Estimated Glomerular Filtration Rate
Chronic Kidney Disease (CKD) is a progressive condition that requires accurate diagnosis and staging for effective clinical management. Conventional CKD diagnosis relies on estimated Glomerular Filtration Rate (eGFR), a measure of kidney function derived from serum biomarkers such as serum creatinin...
| Main Authors: | Samit Kumar Ghosh, Namareq Widatalla, Ahsan H. Khandoker |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE 2025-01-01 |
| Series: | IEEE Access |
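| Subjects: | Chronic kidney diseases; CKD-EPI equation; cystatin C; glomerular filtration rate; serum creatinine; machine learning |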
| Online Access: | https://ieeexplore.ieee.org/document/10979939/ |
| _version_ | 1850034700433752064 |
|---|---|
| author | Samit Kumar Ghosh; Namareq Widatalla; Ahsan H. Khandoker |
| author_sort | Samit Kumar Ghosh |
| collection | DOAJ |
| description | Chronic Kidney Disease (CKD) is a progressive condition that requires accurate diagnosis and staging for effective clinical management. Conventional CKD diagnosis relies on the estimated Glomerular Filtration Rate (eGFR), a measure of kidney function derived from serum biomarkers such as serum creatinine (SCr) and cystatin C (SCysC). However, eGFR calculations may be inaccurate when applied to diverse patient populations. This study proposes a machine learning (ML) system that integrates regression-based eGFR estimation, metaheuristic optimization using the Grey Wolf Optimizer (GWO), and multi-class classification with various ML models to enhance CKD staging and classification. The model estimates eGFR using three established CKD Epidemiology Collaboration (CKD-EPI) equations incorporating SCr, SCysC, and their combined values. Predictive performance is assessed with two regression models, Linear Regression (LR) and Support Vector Regression (SVR). SVR outperforms LR; for $\text{CKD-EPI}_{\text{SCr-SCysC}}$, SVR achieved a root mean squared error (RMSE) of 3.03, a mean absolute percentage error (MAPE) of 2.97%, and a coefficient of determination ($\text{R}^{2}$) of 0.97. Applying GWO for hyperparameter tuning yielded a 37.3% reduction in RMSE, a 37.4% reduction in MAPE, and a 2.06% improvement in $\text{R}^{2}$, improving the precision of prediction. The fine-tuned eGFR estimates are then fed into several algorithms for CKD stage classification: Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and Extreme Gradient Boosting (XGBoost). Among these, XGBoost achieves the highest classification accuracy of 97.76%, along with an F1-score of 97.45%, demonstrating its effectiveness in CKD staging. Shapley Additive Explanations (SHAP) provide global and local feature importance insights, enhancing clinical decision-making and model transparency. Future research will validate the model on larger and more diverse datasets and will incorporate additional clinical parameters, including biomarkers and genetic data, to enhance the precision of CKD risk prediction. This research enhances AI-driven nephrology by providing a scalable, interpretable, and highly accurate solution for diagnosing and managing CKD. |
| format | Article |
| id | doaj-art-97b2172191a6409eb4891ee590661be0 |
| institution | DOAJ |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-97b2172191a6409eb4891ee590661be0; 2025-08-20T02:57:44Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 78057-78072; DOI 10.1109/ACCESS.2025.3565549; document 10979939; Samit Kumar Ghosh (https://orcid.org/0000-0003-2267-7314), Namareq Widatalla (https://orcid.org/0000-0001-9848-8531), Ahsan H. Khandoker (https://orcid.org/0000-0002-0636-1646); all authors: Department of Biomedical Engineering and Biotechnology, Khalifa University, Abu Dhabi, United Arab Emirates; https://ieeexplore.ieee.org/document/10979939/ |
| title | Machine Learning Framework for Early Detection of Chronic Kidney Disease Stages Using Optimized Estimated Glomerular Filtration Rate |
| topic | Chronic kidney diseases; CKD-EPI equation; cystatin C; glomerular filtration rate; serum creatinine; machine learning |
| url | https://ieeexplore.ieee.org/document/10979939/ |
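
The description above outlines a pipeline in which eGFR is first estimated from serum biomarkers and then mapped to a CKD stage. The sketch below illustrates only that conventional baseline and does not reproduce the authors' SVR, GWO, or XGBoost components. Two assumptions are made that the record does not confirm: the 2021 creatinine-only CKD-EPI equation stands in for the three (unspecified) CKD-EPI variants, and the stage cut-offs follow the standard KDIGO GFR categories. The function names `ckd_epi_2021_scr` and `ckd_stage` are hypothetical.

```python
# Assumption: KDIGO GFR categories as (lower bound in mL/min/1.73 m^2, label),
# ordered from highest to lowest. The record does not state which cut-offs
# the authors used.
KDIGO_STAGES = [
    (90.0, "G1"),
    (60.0, "G2"),
    (45.0, "G3a"),
    (30.0, "G3b"),
    (15.0, "G4"),
    (0.0, "G5"),
]


def ckd_epi_2021_scr(scr_mg_dl: float, age_years: float, female: bool) -> float:
    """Estimate eGFR with the 2021 CKD-EPI creatinine equation.

    Assumption: this equation is a stand-in; the record only says that three
    CKD-EPI equations (SCr, SCysC, combined) were used.
    """
    if scr_mg_dl <= 0:
        raise ValueError("serum creatinine must be positive")
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = scr_mg_dl / kappa
    egfr = (
        142.0
        * min(ratio, 1.0) ** alpha
        * max(ratio, 1.0) ** -1.200
        * 0.9938 ** age_years
    )
    return egfr * 1.012 if female else egfr


def ckd_stage(egfr: float) -> str:
    """Map an eGFR value to a KDIGO GFR category label."""
    if egfr < 0:
        raise ValueError("eGFR cannot be negative")
    for lower_bound, label in KDIGO_STAGES:
        if egfr >= lower_bound:
            return label
    raise AssertionError("unreachable: the last category starts at 0")


if __name__ == "__main__":
    egfr = ckd_epi_2021_scr(scr_mg_dl=1.4, age_years=62, female=False)
    print(f"eGFR = {egfr:.1f} mL/min/1.73 m^2, stage {ckd_stage(egfr)}")
```

In the framework the description summarizes, a learned regressor would replace the closed-form estimate and a trained classifier would replace the threshold lookup; the fixed rule here is only a reference point for what those models are meant to improve on.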