Hindcasting Maximum Water Depths in Coastal Watersheds: The Importance of Incorporating Off‐Channel Data and Their Uncertainties in Machine Learning Models

Bibliographic Details
Main Authors: Maryam Pakdehi, Ebrahim Ahmadisharaf
Format: Article
Language:English
Published: Wiley 2025-04-01
Series:Water Resources Research
Online Access:https://doi.org/10.1029/2024WR039244
collection DOAJ
description Abstract In the absence of adequate observations on the off‐channel areas, flood models are typically trained and validated against stream water depths. This approach can be efficient for physics‐based models, which incorporate the underlying physical processes, but the efficiency for data‐driven models like machine learning (ML) algorithms is unclear. The existing off‐channel observations like high‐water marks (HWMs) are also subject to uncertainty. This paper addressed three research questions: (a) how useful are ML models, trained with stream gauges, for hindcasting water depths in the off‐channel areas? (b) how does incorporating the uncertainty of HWMs improve the model performance? and (c) does the uncertainty incorporation improve the model transferability to other watersheds and events? To answer these questions, we evaluated the performance of ML models across three large coastal watersheds in the US during three hurricanes—Michael, Ida and Ian. The model was developed under three scenarios, which differed in terms of the flood observational data (stream gauges and HWMs) used for their training and validation. A loss function was proposed to incorporate the uncertainty of observations. We found that ML models trained solely by stream gauges performed well only for stream hindcasts. Satisfactory hindcasts on off‐channel areas were obtained by incorporating the HWMs' uncertainty via the loss function. This uncertainty incorporation reduced the model bias and resulted in the best transferability to other coastal watersheds and flood events. Our study provides insights about developing transferable ML models for hindcasting water depths on streams and off‐channel areas in coastal watersheds during extreme events.
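The abstract says a loss function was proposed to incorporate observation uncertainty but does not give its form. As a rough illustration only, and not the authors' formulation, one common way to fold per-observation uncertainty into a loss is inverse-variance weighting, so that less certain observations such as HWMs pull on the model less than stream gauges do. The function name, the depth values, and the standard deviations below are all hypothetical.

```python
def uncertainty_weighted_mse(observed, predicted, sigma):
    """Inverse-variance weighted mean squared error.

    Illustrative assumption, not the loss from the paper: each
    observation i has a standard deviation sigma[i] and is weighted
    by 1 / sigma[i]**2.
    """
    weights = [1.0 / s ** 2 for s in sigma]
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    weighted_sum = sum(w * e for w, e in zip(weights, squared_errors))
    return weighted_sum / sum(weights)


def plain_mse(observed, predicted):
    """Unweighted mean squared error, for comparison."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical maximum water depths in metres:
# the first two values stand in for stream gauges, the last two for HWMs.
observed = [2.0, 3.0, 1.0, 0.5]
predicted = [2.1, 2.9, 1.4, 0.9]
sigma = [0.05, 0.05, 0.30, 0.30]  # assumed uncertainties, larger for HWMs

weighted_loss = uncertainty_weighted_mse(observed, predicted, sigma)
unweighted_loss = plain_mse(observed, predicted)
```

With equal uncertainties the weighted loss reduces to the plain mean squared error; with larger uncertainties on the HWMs, their residuals contribute less to the total.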
id doaj-art-d11cb9ab62c84a8fbd36d7a810b5a499
institution Kabale University
issn 0043-1397
1944-7973
affiliation Department of Civil and Environmental Engineering, FAMU‐FSU College of Engineering, Tallahassee, FL, USA (both authors)
volume 61
issue 4
topic flood hindcasting
water depth
high‐water marks (HWMs)
coastal watersheds
uncertainty
machine learning (ML)