Kalman filtering assimilated machine learning methods significantly improve the prediction performance of water quality parameters
Accurate water quality prediction is essential for effective water pollution prevention and emergency responses. However, existing research on machine learning (ML)-based data assimilation methods remains limited, particularly in terms of addressing the combined impacts of climate change and anthrop...
| Main Authors: | Zhenyu Gao, Guoqiang Wang, Jinyue Chen, Lei Fang, Shilong Ren, A. Yinglan, Shuping Ji, Ruobing Liu, Qiao Wang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-12-01 |
| Series: | Ecological Informatics |
| Subjects: | Kalman filter, Time series prediction, Machine learning, Data assimilation |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S1574954125003462 |
| _version_ | 1849233556781400064 |
|---|---|
| author | Zhenyu Gao, Guoqiang Wang, Jinyue Chen, Lei Fang, Shilong Ren, A. Yinglan, Shuping Ji, Ruobing Liu, Qiao Wang |
| author_sort | Zhenyu Gao |
| collection | DOAJ |
| description | Accurate water quality prediction is essential for effective water pollution prevention and emergency responses. However, existing research on machine learning (ML)-based data assimilation methods remains limited, particularly in terms of addressing the combined impacts of climate change and anthropogenic activities. To address this gap, we proposed a novel ‘ML–Kalman filter (KF)’ data assimilation framework and evaluated its performance in the Dahei River Basin, a representative semi-arid watershed. Our results demonstrated significant improvements in predicting key water quality parameters, including total nitrogen (TN), total phosphorus (TP), and the permanganate index (CODMn), through the integration of KF with four ML models (LSTM, RF, XGBoost, and SVR). The accuracy enhancement ranged from 4.3 % to 17.6 %, with TP showing the most substantial improvement (9.2 %–17.6 %), followed by TN (6.4 %–11.1 %) and CODMn (4.3 %–12.1 %). After assimilation, the models exhibited the following performance ranking for TN based on the coefficient of determination (R²): LSTM–KF (R² = 0.909) > RF–KF (R² = 0.886) > SVR–KF (R² = 0.840) > XGBoost–KF (R² = 0.797), with similar trends observed for TP and CODMn. The proposed framework demonstrates strong portability and applicability across different monitoring sections and temporal resolutions, offering a robust solution for regions with limited monitoring capabilities and challenging climatic conditions. These findings provide valuable data and technical support for advancing water pollution prediction and early warning systems, particularly for ecological and environmental departments operating in data-deficient regions. |
| format | Article |
| id | doaj-art-91db876eee6c42369ae8065122974d50 |
| institution | Kabale University |
| issn | 1574-9541 |
| language | English |
| publishDate | 2025-12-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Ecological Informatics |
| spelling | doaj-art-91db876eee6c42369ae8065122974d50; 2025-08-20T05:05:51Z; eng; Elsevier; Ecological Informatics; ISSN 1574-9541; 2025-12-01; volume 90, article 103337; DOI 10.1016/j.ecoinf.2025.103337. Affiliations: (a) Academician Workstation for Big Data in Ecology and Environment, Environment Research Institute, Shandong University, Qingdao 266237, China; (b) Innovation Research Center of Satellite Application, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China; (c) Shenzhen Research Institute of Shandong University, Shenzhen 518057, China. Authors: Zhenyu Gao (a); Guoqiang Wang (a, b; corresponding author, at a); Jinyue Chen (a, c); Lei Fang (a); Shilong Ren (a); A. Yinglan (b); Shuping Ji (a); Ruobing Liu (a); Qiao Wang (a). URL: http://www.sciencedirect.com/science/article/pii/S1574954125003462. Keywords: Kalman filter, Time series prediction, Machine learning, Data assimilation |
| title | Kalman filtering assimilated machine learning methods significantly improve the prediction performance of water quality parameters |
| topic | Kalman filter, Time series prediction, Machine learning, Data assimilation |
| url | http://www.sciencedirect.com/science/article/pii/S1574954125003462 |
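
The description above pairs ML forecasts with a Kalman filter (KF) and ranks the results by R², but this record does not give the formulation. The following is only a minimal sketch of one common way such a coupling can work, not the authors' method: a scalar KF tracks the ML model's forecast bias as a random-walk state and corrects each new forecast before the matching observation arrives. The class `ScalarKalmanBias`, the helpers `run_assimilation` and `r_squared`, the noise variances `q` and `r`, and the synthetic series are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalmanBias:
    """Random-walk Kalman filter tracking the bias of an ML forecast (illustrative)."""
    q: float = 0.01    # process noise variance (hypothetical tuning value)
    r: float = 0.25    # observation noise variance (hypothetical tuning value)
    bias: float = 0.0  # current estimate of (observation - ML forecast)
    p: float = 1.0     # variance of the bias estimate

    def correct(self, ml_pred: float) -> float:
        """Return the bias-corrected forecast, using only past observations."""
        return ml_pred + self.bias

    def assimilate(self, ml_pred: float, obs: float) -> None:
        """Update the bias estimate once the observation becomes available."""
        p_prior = self.p + self.q                 # random walk inflates the variance
        innovation = obs - (ml_pred + self.bias)  # observation minus corrected forecast
        gain = p_prior / (p_prior + self.r)       # Kalman gain, between 0 and 1
        self.bias += gain * innovation
        self.p = (1.0 - gain) * p_prior


def run_assimilation(ml_preds, observations, q=0.01, r=0.25):
    """Correct each ML forecast, then assimilate the matching observation."""
    kf = ScalarKalmanBias(q=q, r=r)
    corrected = []
    for pred, obs in zip(ml_preds, observations):
        corrected.append(kf.correct(pred))  # forecast issued before seeing obs
        kf.assimilate(pred, obs)            # filter learns from obs afterwards
    return corrected


def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic, made-up concentrations; the stand-in "ML" forecast runs 0.5 too low.
    observations = [2.0, 2.3, 2.1, 2.6, 2.8, 2.5, 2.9, 3.1]
    ml_preds = [o - 0.5 for o in observations]
    corrected = run_assimilation(ml_preds, observations)
    print("R2 before:", r_squared(observations, ml_preds))
    print("R2 after: ", r_squared(observations, corrected))
```

In this sketch the ratio of `q` to `r` controls how quickly the filter trusts new observations over its running bias estimate; a real application would need to estimate both from data rather than fix them by hand.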