Deciphering and predicting algal bloom variability using size-fractionated organic matter and machine learning in a complex watershed
Understanding the drivers of algal bloom variability is critical for managing eutrophication in freshwater reservoirs, yet most predictive models underrepresent the role of dissolved organic matter (DOM) composition. This study explores the contribution of size-fractionated DOM, specifically dissolv...
| Main Authors: | Yun Kyung Lee, Haeseong Oh, Bo-Mi Lee, Jin Hur |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-09-01 |
| Series: | Ecological Indicators |
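| Subjects: | Algal bloom; Dissolved organic matter; Machine learning; Fluorescence spectroscopy; DOM size fractionation |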
| Online Access: | http://www.sciencedirect.com/science/article/pii/S1470160X25009999 |
| _version_ | 1849228398163918848 |
|---|---|
| author | Yun Kyung Lee; Haeseong Oh; Bo-Mi Lee; Jin Hur |
| author_facet | Yun Kyung Lee; Haeseong Oh; Bo-Mi Lee; Jin Hur |
| author_sort | Yun Kyung Lee |
| collection | DOAJ |
| description | Understanding the drivers of algal bloom variability is critical for managing eutrophication in freshwater reservoirs, yet most predictive models underrepresent the role of dissolved organic matter (DOM) composition. This study explores the contribution of size-fractionated DOM, specifically dissolved organic carbon (DOC) and nitrogen (DON) fractions, including biopolymers (BP) and humic substances (HS), to algal bloom prediction across heterogeneous watersheds. For this purpose, we incorporated molecular DOM fractions and fluorescence-based spectroscopic properties into machine learning (ML) models to predict chlorophyll-a (Chl-a) concentrations in the Han River watershed, one of South Korea’s largest and most complex freshwater systems. Three ML models, extreme gradient boosting (XGBoost), deep neural networks (DNN), and multiple linear regression (MLR), were evaluated using a year-long dataset. Among them, XGBoost yielded the highest performance (R² = 0.928; Root Mean Square Error (RMSE) = 0.033), especially when the full input set, including DOM properties, was used. Shapley additive explanations (SHAP) analysis identified bioavailable DON fractions as the most influential predictors, with spatial variations observed across sub-watersheds. Partial least squares structural equation modeling (PLS-SEM) further revealed the direct and indirect effects of DOM components, hydrologic variables, and fluorescence peaks on algal dynamics. These findings demonstrate that integrating size-fractionated DOM profiles and spectroscopic indicators enhances both the predictive power and interpretability of algal bloom variability across sub-watersheds. The proposed framework provides a cost-effective and scalable tool for water quality management, especially in data-limited and spatially diverse freshwater systems. |
| format | Article |
| id | doaj-art-ce10ab25dd0e430c918d810fa29c4d5d |
| institution | Kabale University |
| issn | 1470-160X |
| language | English |
| publishDate | 2025-09-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Ecological Indicators |
| spelling | doaj-art-ce10ab25dd0e430c918d810fa29c4d5d. 2025-08-23T04:47:45Z. eng. Elsevier. Ecological Indicators, ISSN 1470-160X, 2025-09-01, vol. 178, article 114067. doi:10.1016/j.ecolind.2025.114067. Deciphering and predicting algal bloom variability using size-fractionated organic matter and machine learning in a complex watershed. Authors: Yun Kyung Lee (0), Haeseong Oh (1), Bo-Mi Lee (2), Jin Hur (3). Affiliations: (0) Department of Environment and Energy, Sejong University, 209 Neungdong-ro, Gwangjin-gu, Seoul 05006, South Korea, corresponding author; (1) Department of Environment and Energy, Sejong University, 209 Neungdong-ro, Gwangjin-gu, Seoul 05006, South Korea; (2) Water Environmental Research Department, Han River Environmental Research Center, National Institute of Environmental Research, Incheon 22689, South Korea; (3) Department of Environment and Energy, Sejong University, 209 Neungdong-ro, Gwangjin-gu, Seoul 05006, South Korea, corresponding author. http://www.sciencedirect.com/science/article/pii/S1470160X25009999. Keywords: Algal bloom; Dissolved organic matter; Machine learning; Fluorescence spectroscopy; DOM size fractionation |
| spellingShingle | Yun Kyung Lee; Haeseong Oh; Bo-Mi Lee; Jin Hur; Deciphering and predicting algal bloom variability using size-fractionated organic matter and machine learning in a complex watershed; Ecological Indicators; Algal bloom; Dissolved organic matter; Machine learning; Fluorescence spectroscopy; DOM size fractionation |
| title | Deciphering and predicting algal bloom variability using size-fractionated organic matter and machine learning in a complex watershed |
| title_full | Deciphering and predicting algal bloom variability using size-fractionated organic matter and machine learning in a complex watershed |
| title_fullStr | Deciphering and predicting algal bloom variability using size-fractionated organic matter and machine learning in a complex watershed |
| title_full_unstemmed | Deciphering and predicting algal bloom variability using size-fractionated organic matter and machine learning in a complex watershed |
| title_short | Deciphering and predicting algal bloom variability using size-fractionated organic matter and machine learning in a complex watershed |
| title_sort | deciphering and predicting algal bloom variability using size fractionated organic matter and machine learning in a complex watershed |
| topic | Algal bloom; Dissolved organic matter; Machine learning; Fluorescence spectroscopy; DOM size fractionation |
| url | http://www.sciencedirect.com/science/article/pii/S1470160X25009999 |
| work_keys_str_mv | AT yunkyunglee decipheringandpredictingalgalbloomvariabilityusingsizefractionatedorganicmatterandmachinelearninginacomplexwatershed AT haeseongoh decipheringandpredictingalgalbloomvariabilityusingsizefractionatedorganicmatterandmachinelearninginacomplexwatershed AT bomilee decipheringandpredictingalgalbloomvariabilityusingsizefractionatedorganicmatterandmachinelearninginacomplexwatershed AT jinhur decipheringandpredictingalgalbloomvariabilityusingsizefractionatedorganicmatterandmachinelearninginacomplexwatershed |