Hindcasting Extreme Significant Wave Heights Under Fetch-Limited Conditions with Tree-Based Models

Accurately hindcasting waves in semi-enclosed, fetch-limited basins remains challenging for reanalysis models, which tend to underestimate storm peaks near the coast. We developed interpretable ML models for Rijeka Bay (northern Adriatic) using only wind observations from two land-based wind station...

Bibliographic Details
Main Authors: Damjan Bujak, Hanna Miličević, Goran Lončar, Dalibor Carević
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Journal of Marine Science and Engineering
Subjects:
Online Access: https://www.mdpi.com/2077-1312/13/7/1355
collection DOAJ
description Accurately hindcasting waves in semi-enclosed, fetch-limited basins remains challenging for reanalysis models, which tend to underestimate storm peaks near the coast. We developed interpretable machine learning (ML) models for Rijeka Bay (northern Adriatic) that use only wind observations from two land-based wind stations to predict buoy measurements of significant wave height (Hm0) spanning 2009–2011 (testing) and 2019–2021 (training and validation). The tested tree-based models were Random Forest, XGBoost, and Explainable Boosting Machine. The study's novelty lies in combining sample-weighting schemes with feature engineering to improve the predictive performance of interpretable, low-complexity ML models for wave hindcasting. Representing wind direction as sine–cosine components generally reduced RMSE and BIAS relative to traditional speed–direction inputs, while an exponential sample-weighting scheme that emphasized storm waves halved the underprediction of extreme Hm0 without inflating overall RMSE. The best-performing model, a Random Forest, achieved an RMSE of 0.096 m and a correlation of 0.855 on the unseen test set, corresponding to 30% lower overall RMSE and 50% lower extreme-wave RMSE than the MEDSEA and COEXMED hindcasts. Additionally, underprediction was reduced by 90% compared to these reanalysis models. The method offers a computationally lightweight, transferable supplement to numerical wave guidance for coastal engineering and harbor operations.
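The two preprocessing ideas named in the description, sine–cosine encoding of wind direction and exponential sample weighting toward storm waves, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the weight form `exp(alpha * Hm0)`, the value of `alpha`, the normalization to mean 1, and the sign convention for bias (prediction minus observation) are all assumptions chosen for illustration.

```python
import math


def encode_wind(speed, direction_deg):
    """Return (speed, sin, cos) features for one wind observation.

    Encoding the direction as sine and cosine removes the artificial
    jump between 359 and 0 degrees that a raw angle would have.
    """
    theta = math.radians(direction_deg)
    return (speed, math.sin(theta), math.cos(theta))


def exponential_weights(hm0, alpha=2.0):
    """Sample weights that grow exponentially with observed wave height.

    `alpha` is a hypothetical tuning parameter, not a value from the
    paper. Weights are normalized so that their mean is 1.
    """
    raw = [math.exp(alpha * h) for h in hm0]
    mean_raw = sum(raw) / len(raw)
    return [w / mean_raw for w in raw]


def rmse(obs, pred):
    """Root-mean-square error between observations and predictions."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def bias(obs, pred):
    """Mean of (prediction - observation); negative means underprediction."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)


if __name__ == "__main__":
    # Made-up values for illustration only, not data from the study.
    features = [encode_wind(s, d) for s, d in [(4.0, 350.0), (12.0, 10.0)]]
    weights = exponential_weights([0.2, 1.4])
    print(features)
    print(weights)
```

Weights of this kind could then be passed to a tree-based regressor at fit time, for example through the `sample_weight` argument that scikit-learn's `RandomForestRegressor.fit` accepts, so that errors on storm waves count more during training.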
format Article
id doaj-art-54190afbda34447982dfd3fd19136949
institution Kabale University
issn 2077-1312
language English
publishDate 2025-07-01
publisher MDPI AG
record_format Article
series Journal of Marine Science and Engineering
spelling doaj-art-54190afbda34447982dfd3fd19136949 2025-08-20T03:58:31Z eng
MDPI AG, Journal of Marine Science and Engineering, ISSN 2077-1312, 2025-07-01, 13(7), 1355
DOI: 10.3390/jmse13071355
Affiliations:
Damjan Bujak: Faculty of Civil Engineering, University of Zagreb, 10000 Zagreb, Croatia
Hanna Miličević: Faculty of Civil Engineering, University of Zagreb, 10000 Zagreb, Croatia
Goran Lončar: Faculty of Civil Engineering, University of Zagreb, 10000 Zagreb, Croatia
Dalibor Carević: Faculty of Civil Engineering, University of Zagreb, 10000 Zagreb, Croatia
topic machine learning
random forest
XGBoost
explainable boosting model
wave prediction
wave hindcast