Comparative analysis of machine learning models for predicting river water quality: a case study of the Zayandeh Rood River

Given the key role of rivers in supplying drinking water, supporting industry, agriculture, and ecosystems, water quality assessment and pollution quantification are essential for sustainable use. This study evaluated five machine learning models, i.e., Lasso Regression, Random Forest (RF), Gradient...

Bibliographic Details
Main Authors:	Elham Fazel Najafabadi, Paria Shojaei, Mojgan Askarizadeh
Format:	Article
Language:	English
Published:	Elsevier 2025-09-01
Series:	Results in Engineering
Subjects:	Water quality prediction; Machine learning algorithms; Zayandeh Rood River; Surface water
Online Access:	http://www.sciencedirect.com/science/article/pii/S259012302502732X

author	Elham Fazel Najafabadi Paria Shojaei Mojgan Askarizadeh
collection	DOAJ
description	Given the key role of rivers in supplying drinking water and in supporting industry, agriculture, and ecosystems, water quality assessment and pollution quantification are essential for sustainable use. This study evaluated five machine learning models, i.e., Lasso Regression, Random Forest (RF), Gradient Boosting (GB), XGBoost, and Support Vector Machine (SVM), for predicting four water quality parameters, namely EC (Electrical Conductivity), TDS (Total Dissolved Solids), SAR (Sodium Adsorption Ratio), and TH (Total Hardness), using data collected over a 31-year period from eight monitoring stations along the Zayandeh Rood River, a vital water source for drinking, agriculture, and industry in the arid region of central Iran. The models were evaluated based on five statistical criteria: R², RMSE, RRMSE, r, and MAE. Two dimensionality reduction techniques, PCA and correlation matrix-based feature reduction, were implemented to enhance model efficiency and mitigate multicollinearity. The findings indicate that the best-performing model for a given parameter varied across stations. However, the differences in evaluation metrics between the best models were small at most stations. The GB and SVM models outperformed the other models in predicting EC and TDS (0.80 < R² < 0.99). For SAR, the GB and XGBoost models achieved higher accuracy (0.955 < R² < 0.999), and for TH, the Lasso and SVM models did (0.830 < R² < 0.996). The Lasso regression model proved to be the most effective for predicting TH at half of the monitoring stations.
format	Article
id	doaj-art-036f2b0ad4a543fd8359eb9b5bfe8d4e
institution	Kabale University
issn	2590-1230
language	English
publishDate	2025-09-01
publisher	Elsevier
record_format	Article
series	Results in Engineering
spelling	doaj-art-036f2b0ad4a543fd8359eb9b5bfe8d4e | 2025-08-20T05:07:32Z | eng | Elsevier | Results in Engineering | 2590-1230 | 2025-09-01 | 27 | 106665 | 10.1016/j.rineng.2025.106665 | Elham Fazel Najafabadi (Department of Water Science and Engineering, College of Agriculture, Isfahan University of Technology, Isfahan, Iran; corresponding author) | Paria Shojaei (Department of Architecture and Civil Engineering, University of Bath, Bath, UK) | Mojgan Askarizadeh (Department of Computer Engineering, Faculty of Engineering, Ardakan University, Ardakan, Yazd, Iran)
title	Comparative analysis of machine learning models for predicting river water quality: a case study of the Zayandeh Rood River
topic	Water quality prediction; Machine learning algorithms; Zayandeh Rood River; Surface water
url	http://www.sciencedirect.com/science/article/pii/S259012302502732X
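The description names five evaluation criteria (R², RMSE, RRMSE, r, MAE) but this record does not give their formulas. The following is a minimal sketch of how such criteria are commonly computed, not the authors' code. The function name `evaluate` is hypothetical, the RRMSE definition (RMSE divided by the mean of the observed values) is an assumption, and the numbers in the usage note are made-up illustrative values, not data from the study.

```python
import math


def evaluate(observed, predicted):
    """Return R2, RMSE, RRMSE, r, and MAE for paired observations and predictions.

    Assumption: RRMSE is RMSE divided by the mean of the observed values;
    the article may use a different normalisation (for example a percentage).
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 values")

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    residuals = [o - p for o, p in zip(observed, predicted)]

    ss_res = sum(e * e for e in residuals)                  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)     # total sum of squares
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)  # spread of predictions
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))

    rmse = math.sqrt(ss_res / n)
    return {
        "R2": 1.0 - ss_res / ss_tot,         # coefficient of determination
        "RMSE": rmse,                        # root mean square error
        "RRMSE": rmse / mean_obs,            # relative RMSE (assumed definition)
        "r": cov / math.sqrt(ss_tot * ss_pred),  # Pearson correlation
        "MAE": sum(abs(e) for e in residuals) / n,  # mean absolute error
    }


# Made-up illustrative values, not measurements from the Zayandeh Rood River.
metrics = evaluate([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

In this toy case every prediction is offset by the same amount, so r is 1 while R² is lower: r measures only linear association, whereas R² also penalises bias. That difference is one reason a study might report both.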


Similar Items