Optimizing machine learning methods for groundwater quality prediction: Case study in District Bagh, Azad Kashmir, Pakistan
| Main Authors: | , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-09-01 |
| Series: | Ecotoxicology and Environmental Safety |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S0147651325009558 |
| Summary: | Groundwater quality monitoring is crucial for protecting the environment and human health. Machine learning (ML) offers substantial potential for enhancing groundwater quality prediction, classification, and identification of pollution indicators. This study evaluates various base ML algorithms and stacking ensemble classifiers (meta-classifiers) using data from 90 groundwater samples collected in District Bagh, Azad Kashmir, Pakistan. The aim was to establish a reliable method for predicting groundwater quality classification. Six supervised machine learning classifiers were utilized, namely Logistic Regression (LR), K-Nearest Neighbours (KNN), Decision Trees (DT), Support Vector Machines (SVM), Random Forest (RF), and Extreme Gradient Boosting (XGB). These classifiers, along with their corresponding meta-classifiers (Meta-LR, Meta-KNN, Meta-DT, Meta-SVM, Meta-RF, and Meta-XGB), were developed and compared to evaluate their effectiveness in classifying and predicting groundwater quality. Evaluation metrics such as precision, recall, F1-score, accuracy, R², RMSE and ROC curves were used to assess classifiers' performance. Among all the classifiers, SVM and its meta-classifier (Meta-SVM) emerged as the most effective, achieving the highest accuracy score of 0.85–0.89, F1-score (0.88–0.89), R² (0.88–1), RMSE (6.72), and Area Under the Curve (AUC) of 0.795. Meta-classifiers achieved better performance than base models for LR (0.85–0.92), SVM (0.88–1.00), and XGB (0.52–0.89). The study also identified key pollution indicators influencing groundwater quality in the area, such as Total Dissolved Solids (TDS), Sulphate (SO4), and Nitrate (NO3). These indicators showed an increasing trend over time. The research highlights the potential of ML techniques, particularly SVM and Meta-SVM, in predicting groundwater quality based on key pollution indicators. The findings underscore the importance of ongoing monitoring and predictive modeling in managing groundwater resources effectively and mitigating pollution impacts. Future applications could refine models and expand datasets to enhance predictive accuracy and applicability across regions and conditions. |
|---|---|
| ISSN: | 0147-6513 |
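
The summary names precision, recall, F1-score, and accuracy among the evaluation metrics. As a rough illustration of how these four are usually derived from predicted and true class labels, here is a minimal sketch in plain Python. It assumes a binary labelling (1 = one quality class, 0 = the other), which this record does not confirm. The function name `classification_metrics` and the ten example labels are hypothetical and are not taken from the study's 90 samples.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return precision, recall, F1 and accuracy for a binary labelling.

    Hypothetical helper for illustration only; not the authors' code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the chosen positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Made-up labels (assumption): NOT data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]

for name, value in classification_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.2f}")
```

With only 90 samples, as in the study, each of these scores rests on a small held-out set, so a single misclassified sample can shift the value noticeably.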