Deciphering and predicting algal bloom variability using size-fractionated organic matter and machine learning in a complex watershed

Bibliographic Details
Main Authors: Yun Kyung Lee, Haeseong Oh, Bo-Mi Lee, Jin Hur
Format: Article
Language: English
Published: Elsevier 2025-09-01
Series: Ecological Indicators
Online Access: http://www.sciencedirect.com/science/article/pii/S1470160X25009999
Description
Summary: Understanding the drivers of algal bloom variability is critical for managing eutrophication in freshwater reservoirs, yet most predictive models underrepresent the role of dissolved organic matter (DOM) composition. This study explores the contribution of size-fractionated DOM, specifically dissolved organic carbon (DOC) and nitrogen (DON) fractions, including biopolymers (BP) and humic substances (HS), to algal bloom prediction across heterogeneous watersheds. For this purpose, we incorporated molecular DOM fractions and fluorescence-based spectroscopic properties into machine learning (ML) models to predict chlorophyll-a (Chl-a) concentrations in the Han River watershed, one of South Korea’s largest and most complex freshwater systems. Three ML models, extreme gradient boosting (XGBoost), deep neural networks (DNN), and multiple linear regression (MLR), were evaluated using a year-long dataset. Among them, XGBoost yielded the highest performance (R² = 0.928; root mean square error (RMSE) = 0.033), especially when the full input set, including DOM properties, was used. Shapley additive explanations (SHAP) analysis identified bioavailable DON fractions as the most influential predictors, with spatial variations observed across sub-watersheds. Partial least squares structural equation modeling (PLS-SEM) further revealed the direct and indirect effects of DOM components, hydrologic variables, and fluorescence peaks on algal dynamics. These findings demonstrate that integrating size-fractionated DOM profiles and spectroscopic indicators enhances both the predictive power and interpretability of algal bloom variability across sub-watersheds. The proposed framework provides a cost-effective and scalable tool for water quality management, especially in data-limited and spatially diverse freshwater systems.
ISSN: 1470-160X
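
Note on the metrics: the summary compares models by the coefficient of determination (R²) and RMSE. The sketch below shows how those two scores are commonly computed, using an ordinary least-squares fit as a stand-in for the MLR baseline. It is a minimal illustration and not the authors' pipeline: the data are synthetic, and the function names, feature count, and train/test split are assumptions made only for this example.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the target."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def fit_mlr(X, y):
    """Least-squares fit of y = b0 + X @ b; returns [b0, b1, ..., bk]."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply fitted intercept and slopes to new rows of X."""
    return coef[0] + X @ coef[1:]


# Synthetic stand-in data (hypothetical): three predictors and a noisy
# linear target. None of these values come from the study.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 0.5 + X @ np.array([0.8, -0.3, 0.1]) + rng.normal(scale=0.05, size=200)

# Simple hold-out split (assumed here; the study's split is not stated).
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

coef = fit_mlr(X_train, y_train)
y_hat = predict_mlr(coef, X_test)

test_r2 = r2_score(y_test, y_hat)
test_rmse = rmse(y_test, y_hat)
print(f"R2 = {test_r2:.3f}, RMSE = {test_rmse:.3f}")
```

Because RMSE carries the units of the target, a value such as the reported 0.033 is only interpretable alongside the scale of the Chl-a variable used for training, which the summary does not specify.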