Machine learning and topological kriging for river water quality data interpolation
Monitoring of river water quality data is crucial to prevent river water pollution. With limited sampling data, the statistical method of kriging interpolation is indispensable. This method can predict unsampled values based on interconnected surrounding values. Two types of kriging methods that can...
Saved in:
| Main Authors: | , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
AIMS Press
2025-02-01
|
| Series: | AIMS Environmental Science |
| Subjects: | |
| Online Access: | https://www.aimspress.com/article/doi/10.3934/environsci.2025006 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Monitoring of river water quality data is crucial to prevent river water pollution. With limited sampling data, the statistical method of kriging interpolation is indispensable. This method can predict unsampled values based on interconnected surrounding values. Two types of kriging methods that can be applied are Machine Learning (ML) kriging and topological kriging (top-kriging). ML kriging is an extension of ordinary kriging by adding a Super Learning (SL) component. Here, we used SL type Support Vector Regression (SVR). Ordinary Kriging and ML Kriging are based on point values. Top-Kriging is defined as the estimation of streamflow-related variables in ungauged catchments and is based on a non-zero catchment area, not a point value. The three methods were applied in Chemical Oxygen Demand (COD) as water river quality in the Special Region of Yogyakarta (DIY), Indonesia. Based on the Mean Square Error (MSE) and Mean Absolute Error (MAE) comparison, Top kriging provided better accuracy that produced the smallest MSE and MAE. This showed that top kriging is suitable for interpolating data with river flow cases. The interpolation result was that the COD value in the upstream area was low, meaning that the level of organic pollution was minimal. Further downstream, after passing through densely populated residential and industrial areas, the COD values were higher. |
|---|---|
| ISSN: | 2372-0352 |