Deep representation learning enables cross-basin water quality prediction under data-scarce conditions
Abstract Artificial intelligence has been extensively used to predict surface water quality to assess the health of aquatic ecosystems proactively. However, water quality prediction in data-scarce conditions is a challenge, especially with heterogeneous data from monitoring sites that lack similarit...
| Main Authors: | Yue Zheng, Xiaoran Zhang, Yongchao Zhou, Yiping Zhang, Tuqiao Zhang, Raziyeh Farmani |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-04-01 |
| Series: | npj Clean Water |
| Online Access: | https://doi.org/10.1038/s41545-025-00466-2 |
| _version_ | 1850172894669176832 |
|---|---|
| author | Yue Zheng; Xiaoran Zhang; Yongchao Zhou; Yiping Zhang; Tuqiao Zhang; Raziyeh Farmani |
| author_sort | Yue Zheng |
| collection | DOAJ |
| description | Abstract Artificial intelligence has been extensively used to predict surface water quality to assess the health of aquatic ecosystems proactively. However, water quality prediction in data-scarce conditions is a challenge, especially with heterogeneous data from monitoring sites that lack similarity in water quality, hindering the information transfer. A deep learning model is proposed that utilizes representation learning to capture knowledge from source river basins during the pre-training stage, and incorporates meteorological data to accurately predict water quality. This model is successfully implemented and validated using data from 149 monitoring sites across inland China. The results show that the model has outstanding prediction accuracy across all sites, with a mean Nash-Sutcliffe efficiency of 0.80, and has a significant advantage in multi-indicator prediction. The model maintains its excellent performance even when trained with only half of the data. This can be attributed to the representation learning used in the pre-training stage, which enables extensive and accurate prediction under data-scarce conditions. The developed model holds significant potential for cross-basin water quality prediction, which could substantially advance the development of water environment system management. |
| format | Article |
| id | doaj-art-08667cb304fa49f5a35533cf9b752ad1 |
| institution | OA Journals |
| issn | 2059-7037 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | npj Clean Water |
| spelling | doaj-art-08667cb304fa49f5a35533cf9b752ad1; 2025-08-20T02:19:58Z; eng; Nature Portfolio; npj Clean Water; 2059-7037; 2025-04-01; vol. 8, no. 1, pp. 1-11; 10.1038/s41545-025-00466-2; Deep representation learning enables cross-basin water quality prediction under data-scarce conditions; Yue Zheng (The Institute of Municipal Engineering, Zhejiang University); Xiaoran Zhang (The Institute of Municipal Engineering, Zhejiang University); Yongchao Zhou (The Institute of Municipal Engineering, Zhejiang University); Yiping Zhang (The Institute of Municipal Engineering, Zhejiang University); Tuqiao Zhang (The Institute of Municipal Engineering, Zhejiang University); Raziyeh Farmani (Centre for Water Systems, Faculty of Environment, Science and Economy, University of Exeter); https://doi.org/10.1038/s41545-025-00466-2 |
| spellingShingle | Yue Zheng Xiaoran Zhang Yongchao Zhou Yiping Zhang Tuqiao Zhang Raziyeh Farmani Deep representation learning enables cross-basin water quality prediction under data-scarce conditions npj Clean Water |
| title | Deep representation learning enables cross-basin water quality prediction under data-scarce conditions |
| title_sort | deep representation learning enables cross basin water quality prediction under data scarce conditions |
| url | https://doi.org/10.1038/s41545-025-00466-2 |
| work_keys_str_mv | AT yuezheng deeprepresentationlearningenablescrossbasinwaterqualitypredictionunderdatascarceconditions AT xiaoranzhang deeprepresentationlearningenablescrossbasinwaterqualitypredictionunderdatascarceconditions AT yongchaozhou deeprepresentationlearningenablescrossbasinwaterqualitypredictionunderdatascarceconditions AT yipingzhang deeprepresentationlearningenablescrossbasinwaterqualitypredictionunderdatascarceconditions AT tuqiaozhang deeprepresentationlearningenablescrossbasinwaterqualitypredictionunderdatascarceconditions AT raziyehfarmani deeprepresentationlearningenablescrossbasinwaterqualitypredictionunderdatascarceconditions |
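
The description above reports model skill as a mean Nash-Sutcliffe efficiency (NSE) of 0.80 across monitoring sites. For readers unfamiliar with the metric, the following is a minimal sketch of the standard NSE definition, 1 minus the ratio of the squared prediction error to the variance of the observations about their mean. It is not the authors' evaluation code; the function names and the sample values are hypothetical and chosen only for illustration.

```python
from statistics import fmean


def nash_sutcliffe_efficiency(observed, simulated):
    """Standard NSE: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    A value of 1.0 means a perfect match; 0.0 means the prediction is no
    better than always predicting the mean of the observations.
    """
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("observed and simulated must be non-empty and equal in length")
    mean_obs = fmean(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    obs_variance = sum((o - mean_obs) ** 2 for o in observed)
    if obs_variance == 0:
        raise ValueError("NSE is undefined when all observations are identical")
    return 1.0 - squared_error / obs_variance


def mean_nse(site_series):
    """Average NSE over an iterable of (observed, simulated) pairs, one per site."""
    return fmean([nash_sutcliffe_efficiency(o, s) for o, s in site_series])


if __name__ == "__main__":
    # Made-up values for two hypothetical monitoring sites (not data from the article).
    site_a = ([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
    site_b = ([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
    print(mean_nse([site_a, site_b]))
```

Averaging per-site scores, as `mean_nse` does, is one plausible reading of "mean Nash-Sutcliffe efficiency"; the record itself does not state how the mean was taken.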