A Machine Learning-Based Assessment of Proxies and Drivers of Harmful Algal Blooms in the Western Lake Erie Basin Using Satellite Remote Sensing
The western region of Lake Erie has been experiencing severe water-quality issues, mainly through the infestation of algal blooms, highlighting the urgent need for action. Understanding the drivers and the intricacies associated with algal bloom phenomena is important to develop effective water-qual...
| Main Authors: | Neha Joshi, Armeen Ghoorkhanian, Jongmin Park, Kaiguang Zhao, Sami Khanal |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Remote Sensing |
| Subjects: | harmful algal blooms; Harmonized Landsat Sentinel-2; water quality; machine learning |
| Online Access: | https://www.mdpi.com/2072-4292/17/13/2164 |
| _version_ | 1849319808199294976 |
|---|---|
| author | Neha Joshi Armeen Ghoorkhanian Jongmin Park Kaiguang Zhao Sami Khanal |
| author_sort | Neha Joshi |
| collection | DOAJ |
| description | The western region of Lake Erie has been experiencing severe water-quality issues, mainly from algal blooms, highlighting the urgent need for action. Understanding the drivers and intricacies of algal bloom phenomena is important for developing effective water-quality remediation strategies. In this study, the influences of multiple bloom drivers were explored, together with Harmonized Landsat Sentinel-2 (HLS) images, using datasets collected in Western Lake Erie from 2013 to 2022. Bloom drivers included a group of physicochemical and meteorological variables, and Chlorophyll-a (Chl-a) served as a proxy for algal blooms. Various combinations of these datasets were used as predictor variables for three machine learning models: Support Vector Regression (SVR), Extreme Gradient Boosting (XGB), and Random Forest (RF). Each model was complemented with SHapley Additive exPlanations (SHAP) to understand the role of predictor variables in Chl-a estimation. A combination of physicochemical variables and optical spectral bands yielded the highest model performance (R<sup>2</sup> up to 0.76, RMSE as low as 8.04 µg/L). The models using only meteorological data and spectral bands performed poorly (R<sup>2</sup> < 0.40), indicating the limited standalone predictive power of meteorological variables. Although satellite-only models achieved only moderate performance (R<sup>2</sup> up to 0.48), they could still be useful for preliminary monitoring where field data are unavailable. Furthermore, using all 20 variables did not substantially improve model performance over models with only spectral and physicochemical inputs. While SVR achieved the highest R<sup>2</sup> in individual runs, XGB provided the most stable and consistently strong performance across input configurations, which could be an important consideration for operational use. These findings are highly relevant for harmful algal bloom (HAB) monitoring, where Chl-a serves as a critical proxy. By clarifying the contribution of diverse variables to Chl-a prediction and identifying robust modeling approaches, this study provides actionable insights to support data-driven management decisions aimed at mitigating HAB impacts in freshwater systems. |
| format | Article |
| id | doaj-art-ddbb0212c8f644bf8ba115c75f954f3e |
| institution | Kabale University |
| issn | 2072-4292 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Remote Sensing |
| spelling | doaj-art-ddbb0212c8f644bf8ba115c75f954f3e; 2025-08-20T03:50:20Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2025-06-01; vol. 17, no. 13, art. 2164; DOI 10.3390/rs17132164. Affiliations: Neha Joshi, Armeen Ghoorkhanian, Jongmin Park, and Sami Khanal: Department of Food, Agricultural and Biological Engineering, The Ohio State University, Columbus, OH 43210, USA. Kaiguang Zhao: School of Environment and Natural Resources, The Ohio State University, Columbus, OH 43210, USA. URL: https://www.mdpi.com/2072-4292/17/13/2164. Keywords: harmful algal blooms; Harmonized Landsat Sentinel-2; water quality; machine learning |
| title | A Machine Learning-Based Assessment of Proxies and Drivers of Harmful Algal Blooms in the Western Lake Erie Basin Using Satellite Remote Sensing |
| title_sort | machine learning based assessment of proxies and drivers of harmful algal blooms in the western lake erie basin using satellite remote sensing |
| topic | harmful algal blooms Harmonized Landsat Sentinel-2 water quality machine learning |
| url | https://www.mdpi.com/2072-4292/17/13/2164 |
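
The description above reports model skill as R<sup>2</sup> and RMSE (in µg/L of Chl-a). As a reading aid, the sketch below shows one common way to compute both metrics. It is not taken from the article: the function names and the sample values are hypothetical, and the article may define R<sup>2</sup> differently (for example, as a squared correlation coefficient).

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs."""
    n = len(observed)
    squared_errors = ((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sum(squared_errors) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical Chl-a values (ug/L), invented purely for illustration;
# they are not data from the study.
observed = [5.0, 12.0, 30.0, 45.0, 8.0]
predicted = [7.0, 10.0, 26.0, 50.0, 9.0]

print(f"RMSE = {rmse(observed, predicted):.2f}")
print(f"R2   = {r_squared(observed, predicted):.2f}")
```

Under this definition, a model that always predicts the observed mean scores R<sup>2</sup> = 0, which gives a baseline for reading the reported range of roughly 0.40 to 0.76.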