Data-driven insights into groundwater quality: machine and deep learning approaches

Arsenic and nitrate contamination of groundwater is a major environmental and public-health concern and poses a significant risk to drinking water quality. In this study, machine learning (ML) and deep learning (DL) models are applied to predict groundwater contam...

Bibliographic Details
Main Authors: Gift Mbuzi, Abdur Rashid Sangi, Baha Ihnaini, Anil Carie, Sruthi Sivarajan, Satish Anamalamudi
Format: Article
Language: English
Published: Mehran University of Engineering and Technology 2025-07-01
Series: Mehran University Research Journal of Engineering and Technology
Online Access: https://murjet.muet.edu.pk/index.php/home/article/view/317
collection DOAJ
description Arsenic and nitrate contamination of groundwater is a major environmental and public-health concern and poses a significant risk to drinking water quality. In this study, machine learning (ML) and deep learning (DL) models are applied to predict groundwater contamination trends in different parts of India. Using a five-year historical time-series dataset (2016–2021) of key physicochemical parameters, including conductivity, pH, biochemical oxygen demand (BOD), fluoride, arsenic, and nitrate, the paper compares several ML and DL models. Feature-importance analysis identified BOD, total dissolved solids (TDS), and conductivity as important predictors of arsenic contamination, while nitrate contamination was driven by agricultural and industrial activities. Temporal analysis showed arsenic levels decreasing after 2019, possibly because of dilution effects and regulatory measures, while nitrate contamination fluctuated by region. After hyperparameter tuning, XGBoost was the most predictive model (R² = 0.70), outperforming traditional regression analysis. Partial dependence plots (PDPs) also captured detailed non-linear relationships among water quality parameters. The findings indicate the potential of AI-based predictive models for real-time groundwater monitoring and better mitigation of contamination. The study contributes toward reliable AI-based groundwater monitoring systems for real-world use and for sustainable resource management planning.
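The two evaluation ideas named in the description, the coefficient of determination (R²) and partial dependence, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy model, and the feature names `bod` and `tds` are hypothetical stand-ins chosen for illustration, and the numbers are made up rather than taken from the study.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def partial_dependence(predict, rows, feature, grid):
    """For each grid value, force `feature` to that value in every row
    and average the model's predictions over all rows."""
    curve = []
    for value in grid:
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append(mean(preds))
    return curve


def toy_model(row):
    # Hypothetical stand-in for a fitted regressor: linear in BOD,
    # quadratic (non-linear) in TDS. Not a model from the paper.
    return 2.0 * row["bod"] + (row["tds"] / 100.0) ** 2


# Made-up samples, only to exercise the functions.
rows = [{"bod": 1.0, "tds": 100.0}, {"bod": 3.0, "tds": 300.0}]

if __name__ == "__main__":
    print(r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8]))
    print(partial_dependence(toy_model, rows, "tds", [0.0, 100.0, 200.0]))
```

A curve that bends as the grid value grows, as the toy TDS curve does here, is the kind of non-linear relationship a partial dependence plot is meant to expose. In practice a fitted model's prediction function would take the place of `toy_model`.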
id doaj-art-e29132de8dd3438ea9abef61ddea12e3
institution Kabale University
issn 0254-7821
2413-7219
spelling doaj-art-e29132de8dd3438ea9abef61ddea12e3 2025-08-20T03:50:21Z eng Mehran University of Engineering and Technology; Mehran University Research Journal of Engineering and Technology; 0254-7821; 2413-7219; 2025-07-01; 44313214010.22581/muet1982.0317321
Gift Mbuzi: Dept. of CSE, SRM University-AP, Amaravati, Andhra Pradesh, India
Abdur Rashid Sangi: Dept. of Computer Science, CSMT, Wenzhou-Kean University, Zhejiang, China
Baha Ihnaini: Dept. of Computer Science, CSMT, Wenzhou-Kean University, Zhejiang, China
Anil Carie: Dept. of CSE, SRM University-AP, Amaravati, Andhra Pradesh, India
Sruthi Sivarajan: Dept. of CSE, SRM University-AP, Amaravati, Andhra Pradesh, India
Satish Anamalamudi: Dept. of CSE, SRM University-AP, Amaravati, Andhra Pradesh, India
topic hydrogeochemistry
groundwater contamination
machine learning
deep learning
water resource management