Deciphering and predicting algal bloom variability using size-fractionated organic matter and machine learning in a complex watershed

Understanding the drivers of algal bloom variability is critical for managing eutrophication in freshwater reservoirs, yet most predictive models underrepresent the role of dissolved organic matter (DOM) composition. This study explores the contribution of size-fractionated DOM, specifically dissolv...

Bibliographic Details
Main Authors: Yun Kyung Lee, Haeseong Oh, Bo-Mi Lee, Jin Hur
Format: Article
Language: English
Published: Elsevier 2025-09-01
Series: Ecological Indicators
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S1470160X25009999
_version_ 1849228398163918848
author Yun Kyung Lee
Haeseong Oh
Bo-Mi Lee
Jin Hur
collection DOAJ
description Understanding the drivers of algal bloom variability is critical for managing eutrophication in freshwater reservoirs, yet most predictive models underrepresent the role of dissolved organic matter (DOM) composition. This study explores the contribution of size-fractionated DOM, specifically dissolved organic carbon (DOC) and nitrogen (DON) fractions, including biopolymers (BP) and humic substances (HS), to algal bloom prediction across heterogeneous watersheds. For this purpose, we incorporated molecular DOM fractions and fluorescence-based spectroscopic properties into machine learning (ML) models to predict chlorophyll-a (Chl-a) concentrations in the Han River watershed, one of South Korea’s largest and most complex freshwater systems. Three ML models, extreme gradient boosting (XGBoost), deep neural networks (DNN), and multiple linear regression (MLR), were evaluated using a year-long dataset. Among them, XGBoost yielded the highest performance (R² = 0.928; Root Mean Square Error (RMSE) = 0.033), especially when the full input set, including DOM properties, was used. Shapley additive explanations (SHAP) analysis identified bioavailable DON fractions as the most influential predictors, with spatial variations observed across sub-watersheds. Partial least squares structural equation modeling (PLS-SEM) further revealed the direct and indirect effects of DOM components, hydrologic variables, and fluorescence peaks on algal dynamics. These findings demonstrate that integrating size-fractionated DOM profiles and spectroscopic indicators enhances both the predictive power and interpretability of algal bloom variability across sub-watersheds. The proposed framework provides a cost-effective and scalable tool for water quality management, especially in data-limited and spatially diverse freshwater systems.
format Article
id doaj-art-ce10ab25dd0e430c918d810fa29c4d5d
institution Kabale University
issn 1470-160X
language English
publishDate 2025-09-01
publisher Elsevier
record_format Article
series Ecological Indicators
spelling doaj-art-ce10ab25dd0e430c918d810fa29c4d5d
2025-08-23T04:47:45Z
eng
Elsevier
Ecological Indicators
1470-160X
2025-09-01
178
114067
10.1016/j.ecolind.2025.114067
Deciphering and predicting algal bloom variability using size-fractionated organic matter and machine learning in a complex watershed
Yun Kyung Lee (0)
Haeseong Oh (1)
Bo-Mi Lee (2)
Jin Hur (3)
(0) Department of Environment and Energy, Sejong University, 209 Neungdong-ro, Gwangjin-gu, Seoul 05006, South Korea; Corresponding authors.
(1) Department of Environment and Energy, Sejong University, 209 Neungdong-ro, Gwangjin-gu, Seoul 05006, South Korea
(2) Water Environmental Research Department, Han River Environmental Research Center, National Institute of Environmental Research, Incheon 22689, South Korea
(3) Department of Environment and Energy, Sejong University, 209 Neungdong-ro, Gwangjin-gu, Seoul 05006, South Korea; Corresponding authors.
Understanding the drivers of algal bloom variability is critical for managing eutrophication in freshwater reservoirs, yet most predictive models underrepresent the role of dissolved organic matter (DOM) composition. This study explores the contribution of size-fractionated DOM, specifically dissolved organic carbon (DOC) and nitrogen (DON) fractions, including biopolymers (BP) and humic substances (HS), to algal bloom prediction across heterogeneous watersheds. For this purpose, we incorporated molecular DOM fractions and fluorescence-based spectroscopic properties into machine learning (ML) models to predict chlorophyll-a (Chl-a) concentrations in the Han River watershed, one of South Korea’s largest and most complex freshwater systems. Three ML models, extreme gradient boosting (XGBoost), deep neural networks (DNN), and multiple linear regression (MLR), were evaluated using a year-long dataset. Among them, XGBoost yielded the highest performance (R² = 0.928; Root Mean Square Error (RMSE) = 0.033), especially when the full input set, including DOM properties, was used. Shapley additive explanations (SHAP) analysis identified bioavailable DON fractions as the most influential predictors, with spatial variations observed across sub-watersheds. Partial least squares structural equation modeling (PLS-SEM) further revealed the direct and indirect effects of DOM components, hydrologic variables, and fluorescence peaks on algal dynamics. These findings demonstrate that integrating size-fractionated DOM profiles and spectroscopic indicators enhances both the predictive power and interpretability of algal bloom variability across sub-watersheds. The proposed framework provides a cost-effective and scalable tool for water quality management, especially in data-limited and spatially diverse freshwater systems.
http://www.sciencedirect.com/science/article/pii/S1470160X25009999
Algal bloom
Dissolved organic matter
Machine learning
Fluorescence spectroscopy
DOM size fractionation
title Deciphering and predicting algal bloom variability using size-fractionated organic matter and machine learning in a complex watershed
topic Algal bloom
Dissolved organic matter
Machine learning
Fluorescence spectroscopy
DOM size fractionation
url http://www.sciencedirect.com/science/article/pii/S1470160X25009999