Toward Trustworthy Machine Learning for Daily Sediment Modeling in the Riverine Systems: An Integrated Framework With Enhanced Uncertainty Quantification and Interpretability

Bibliographic Details
Main Authors: Z. J. Yue, N. N. Wang, B. D. Xu, X. Huang, D. M. Yang, H. B. Xiao, Z. H. Shi
Format: Article
Language: English
Published: Wiley 2025-05-01
Series: Water Resources Research
Online Access: https://doi.org/10.1029/2024WR038650
Description
Summary: Accurately predicting sediment dynamics and understanding their intrinsic contributors are pivotal for sustainable environmental and water management. While machine learning (ML) enables precise predictions, its “black‐box” nature hinders transparency and credibility, posing challenges in interpretability and uncertainty quantification (UQ). To achieve trustworthy ML for riverine sediment time‐series predictions, this study proposes an integrated ML framework that enhances three key steps: feature selection, UQ, and interpretation. Lagged hydro‐environmental variables are incorporated via rigorous feature selection. SHapley Additive exPlanations (SHAP) and conformal prediction are used to refine interpretability and UQ, respectively. Based on 41‐year multi‐source data and three ensemble learning algorithms (LightGBM, XGBoost, and random forest (RF)), this study models daily suspended sediment concentration (SSC) separately for seven subtropical watersheds and evaluates overall and local accuracy. Key findings include: (a) Discharge and precipitation dominate SSC variability (explaining ∼56.8% and ∼18.9% of the variability, respectively). Sampling‐day discharge and accumulative lagged precipitation should be prioritized as predictors. Precipitation‐discharge interaction effects on SSC exhibit simple threshold effects, whereas the interaction effects of hydrological (precipitation, discharge) and environmental (SPEI, land cover) factors involve complex, bidirectional threshold effects. (b) LightGBM and XGBoost excel in long‐term/general prediction, while RF performs better for short‐term/extreme‐value predictions. (c) Conformal prediction‐based UQ provides probabilistic information to quantify prediction reliability and efficiency, alongside uncertainty sources: discharge (∼38.9%) > precipitation (∼33.4%) > land cover (∼19.6%) > SPEI (∼8.1%).
This framework advances trustworthy ML in riverine sediment modeling, while its algorithm‐agnostic design ensures potential scalability to support broader hydrological applications and informed environmental decision‐making.
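
The summary names two steps that a small example may make more concrete: building an accumulated lagged precipitation predictor and wrapping point predictions in conformal intervals. The sketch below is illustrative only and is not the authors' implementation. It assumes synthetic data, a 3-day accumulation window, a plain least-squares model standing in for LightGBM, XGBoost, or RF, and the basic split conformal procedure; the record does not say which conformal variant, window lengths, or data splits the study used. The helper names `accumulated_lag` and `split_conformal_halfwidth` are hypothetical.

```python
import numpy as np


def accumulated_lag(series, window):
    """Rolling sum over the current value and the previous window-1 values.

    The first entries use a shorter window because no earlier data exist.
    """
    csum = np.cumsum(np.insert(series, 0, 0.0))
    idx = np.arange(1, len(series) + 1)
    start = np.maximum(idx - window, 0)
    return csum[idx] - csum[start]


def split_conformal_halfwidth(residuals, alpha):
    """Half-width q for intervals [pred - q, pred + q].

    Uses the ceil((n + 1) * (1 - alpha))-th smallest absolute calibration
    residual, which gives marginal coverage >= 1 - alpha if calibration and
    test points are exchangeable (an assumption that daily time series can
    violate).
    """
    n = len(residuals)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return np.sort(np.abs(residuals))[k - 1]


# Synthetic daily series (assumed values, not taken from the study).
rng = np.random.default_rng(0)
n_days = 600
precip = rng.gamma(shape=0.5, scale=8.0, size=n_days)
precip_3d = accumulated_lag(precip, window=3)
discharge = 5.0 + 0.8 * precip_3d + rng.normal(0.0, 1.0, n_days)
ssc = 2.0 + 0.6 * discharge + 0.3 * precip_3d + rng.normal(0.0, 2.0, n_days)

# Design matrix: intercept, same-day discharge, accumulated lagged precipitation.
X = np.column_stack([np.ones(n_days), discharge, precip_3d])
train, calib, test = slice(0, 300), slice(300, 450), slice(450, 600)

# Stand-in point model: ordinary least squares fitted on the training block.
coef, *_ = np.linalg.lstsq(X[train], ssc[train], rcond=None)

# Calibrate the interval half-width on held-out residuals (target 90% coverage).
q = split_conformal_halfwidth(ssc[calib] - X[calib] @ coef, alpha=0.1)

pred = X[test] @ coef
lower, upper = pred - q, pred + q
coverage = np.mean((ssc[test] >= lower) & (ssc[test] <= upper))
print(f"half-width={q:.2f}, empirical test coverage={coverage:.2f}")
```

Because the conformal step only consumes residuals from a held-out calibration set, the least-squares model can be swapped for any other regressor without changing `split_conformal_halfwidth`, which is one way to read the "algorithm‐agnostic" claim in the summary.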
ISSN: 0043-1397, 1944-7973