A novel multi-step methodology for stochastic simulation of streamflow time series using PcStream clustering


Bibliographic Details
Main Authors: Shalini Balaram, Roshan Srivastav, K Srinivasan
Format: Article
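Language: English
Published: IOP Publishing 2025-01-01
Series: Environmental Research Communications
Subjects: streamflow simulation; hydrological systems; water resource management; stochastic modelling; machine learning; PcStream clustering
Online Access: https://doi.org/10.1088/2515-7620/adb544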
author Shalini Balaram
Roshan Srivastav
K Srinivasan
collection DOAJ
description A novel PcStream clustering-based single-site stochastic model is introduced for the simulation of daily streamflow time series. The PcStream clustering algorithm manages clusters of real-time temporal data and adapts to concept drift, enabling a refined categorisation of streamflow that assigns high values to appropriate clusters instead of misclassifying them. The proposed methodology proceeds in several steps. First, kappa and Generalized Extreme Value (GEV) distributions are fitted to model daily variations and extreme values, and the data are then clustered using the PcStream algorithm. A Markov chain model regenerates the cluster series, and a nearest-neighbour approach fills it with historical data. Additionally, the flow series is classified into rising, falling or constant phases, and flows are then simulated using parametric distributions so that the synthetic streamflow reproduces the observed dynamics. The methodology was tested by comparing the statistics of observed and simulated flows at five gage stations in the Pacific Northwest basin. The results confirm that the model reproduces key aspects of streamflow, including seasonal patterns, low flows, autocorrelations, and flow duration curves. It also reproduces the basic statistics well on daily, monthly and annual time scales. The percent bias (PBIAS) of the proposed model ranged from −0.41% to +0.33% across all stations. The Index of Agreement (d) values were consistently high (0.93–1.00), while MAE varied from 458 to 37,361 cfs and RMSE from 805 to 56,042 cfs, with larger errors corresponding to stations with higher mean flows. The model captured both low flows (7Q10) and high flows across stations ranging from small catchments (105 sq mi) to major catchments (59,700 sq mi), handling flow ranges spanning four orders of magnitude (0.3 to 492,000 cfs). It captures the nuances of streamflow pulses through explicit modelling of the different flow phases. The efficacy of the proposed model is also demonstrated through a comparison with the hybrid Modified Continuous Time Markov Chain (MCTMC) model.
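The goodness-of-fit measures named in the description (PBIAS, Index of Agreement d, MAE, RMSE) can be illustrated with a minimal sketch. It is not the authors' code: the function names and the sample values are hypothetical, the inputs are assumed to be paired observed and simulated values of equal length, the PBIAS sign convention (positive meaning overestimation) is an assumption because conventions differ between references, and d is computed in Willmott's original form.

```python
import math


def pbias(obs, sim):
    """Percent bias. Assumed convention: positive means the simulation overestimates."""
    return 100.0 * (sum(sim) - sum(obs)) / sum(obs)


def mae(obs, sim):
    """Mean absolute error, in the units of the flows (e.g. cfs)."""
    return sum(abs(s - o) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs, sim):
    """Root mean square error, in the units of the flows (e.g. cfs)."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def index_of_agreement(obs, sim):
    """Willmott's index of agreement d, between 0 and 1 (1 is perfect agreement)."""
    obs_mean = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for o, s in zip(obs, sim))
    den = sum((abs(s - obs_mean) + abs(o - obs_mean)) ** 2 for o, s in zip(obs, sim))
    return 1.0 - num / den


# Hypothetical paired values in cfs, for illustration only (not data from the article).
obs = [100.0, 200.0, 300.0, 400.0]
sim = [110.0, 190.0, 310.0, 390.0]

print(pbias(obs, sim), mae(obs, sim), rmse(obs, sim), index_of_agreement(obs, sim))
```

MAE and RMSE carry the units of the flows, which is consistent with the description's remark that larger errors occur at stations with higher mean flows; PBIAS and d are dimensionless and so are easier to compare across stations.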
format Article
id doaj-art-fbe358c03e044cc0bedd22a6f60bd00c
institution DOAJ
issn 2515-7620
language English
publishDate 2025-01-01
publisher IOP Publishing
record_format Article
series Environmental Research Communications
spelling Shalini Balaram (https://orcid.org/0009-0001-3610-1212), Department of Civil Engineering, Indian Institute of Technology Madras, Chennai, 600036, India
Roshan Srivastav (https://orcid.org/0000-0002-8175-8969), Department of Civil & Environmental Engineering, Indian Institute of Technology Tirupati, Tirupati, 517 506, India
K Srinivasan, Department of Civil Engineering, Indian Institute of Technology Madras, Chennai, 600036, India
Environmental Research Communications, volume 7, issue 2, article 021011 (2025-01-01)
title A novel multi-step methodology for stochastic simulation of streamflow time series using PcStream clustering
topic streamflow simulation
hydrological systems
water resource management
stochastic modelling
machine learning
PcStream clustering
url https://doi.org/10.1088/2515-7620/adb544