A novel multi-step methodology for stochastic simulation of streamflow time series using PcStream clustering
A novel PcStream clustering-based single-site stochastic model is introduced for the simulation of daily streamflow time series. The PcStream clustering algorithm effectively manages real-time temporal data clusters and adjusts to concept drifts, enabling refined streamflow categorisation that accur...
| Main Authors: | Shalini Balaram, Roshan Srivastav, K Srinivasan |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IOP Publishing, 2025-01-01 |
| Series: | Environmental Research Communications |
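| Subjects: | streamflow simulation, hydrological systems, water resource management, stochastic modelling, machine learning, PcStream clustering |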
| Online Access: | https://doi.org/10.1088/2515-7620/adb544 |
| _version_ | 1849715989094072320 |
|---|---|
| author | Shalini Balaram Roshan Srivastav K Srinivasan |
| collection | DOAJ |
| description | A novel PcStream clustering-based single-site stochastic model is introduced for the simulation of daily streamflow time series. The PcStream clustering algorithm manages clusters of real-time temporal data and adapts to concept drift, enabling a refined streamflow categorisation that includes high values without misclassification. The proposed methodology begins by fitting kappa and Generalized Extreme Value (GEV) distributions to model daily variations and extreme values, followed by clustering the data with the PcStream algorithm. A Markov chain model regenerates the cluster series, while a nearest-neighbour approach fills them with historical data. Additionally, the flow series are classified into rising, falling or constant phases, and flows are then simulated using parametric distributions so that the synthetic streamflow reproduces the observed dynamics. The methodology was tested by comparing the statistics of observed and simulated flows at five gage stations in the Pacific Northwest basin. The results confirm that the model reproduces key aspects of streamflow, including seasonal patterns, low flows, autocorrelations, and flow duration curves. It also reproduces the basic statistics on daily, monthly and annual time scales well. The percent bias (PBIAS) of the proposed streamflow model ranged from −0.41% to +0.33% across all stations. The Index of Agreement (d) values were consistently high (0.93–1.00), while MAE varied from 458 to 37,361 cfs and RMSE from 805 to 56,042 cfs, with larger errors corresponding to stations with higher mean flows. The model captured both low flows (7Q10) and high flows across stations ranging from small catchments (105 sq mi) to major catchments (59,700 sq mi), handling flow ranges spanning several orders of magnitude (0.3 to 492,000 cfs). It captures the nuances of streamflow pulses through explicit modelling of the different flow phases. The efficacy of the proposed model is also brought out through a comparison with the hybrid Modified Continuous Time Markov Chain (MCTMC) model. |
| format | Article |
| id | doaj-art-fbe358c03e044cc0bedd22a6f60bd00c |
| institution | DOAJ |
| issn | 2515-7620 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IOP Publishing |
| record_format | Article |
| series | Environmental Research Communications |
| spelling | doaj-art-fbe358c03e044cc0bedd22a6f60bd00c; 2025-08-20T03:13:10Z; eng; IOP Publishing; Environmental Research Communications; 2515-7620; 2025-01-01; 7; 2; 021011; 10.1088/2515-7620/adb544; A novel multi-step methodology for stochastic simulation of streamflow time series using PcStream clustering; Shalini Balaram (https://orcid.org/0009-0001-3610-1212), Department of Civil Engineering, Indian Institute of Technology Madras, Chennai, 600036, India; Roshan Srivastav (https://orcid.org/0000-0002-8175-8969), Department of Civil & Environmental Engineering, Indian Institute of Technology Tirupati, Tirupati, 517 506, India; K Srinivasan, Department of Civil Engineering, Indian Institute of Technology Madras, Chennai, 600036, India; https://doi.org/10.1088/2515-7620/adb544; streamflow simulation; hydrological systems; water resource management; stochastic modelling; machine learning; PcStream clustering |
| title | A novel multi-step methodology for stochastic simulation of streamflow time series using PcStream clustering |
| topic | streamflow simulation hydrological systems water resource management stochastic modelling machine learning PcStream clustering |
| url | https://doi.org/10.1088/2515-7620/adb544 |
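
The description reports four agreement measures between observed and simulated flows: PBIAS, the Index of Agreement (d), MAE and RMSE. The record does not give their formulas, so the following is only a minimal sketch under stated assumptions: PBIAS is taken as 100 × Σ(sim − obs) / Σ(obs) (sign conventions differ between authors), d follows Willmott's commonly used definition, and the function names and sample values are hypothetical rather than taken from the article.

```python
import math


def pbias(obs, sim):
    """Percent bias; positive means overestimation under the convention assumed here."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


def mae(obs, sim):
    """Mean absolute error, in the units of the flows (e.g. cfs)."""
    return sum(abs(s - o) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs, sim):
    """Root mean square error, in the units of the flows (e.g. cfs)."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def index_of_agreement(obs, sim):
    """Willmott's index of agreement d (assumed definition); 1.0 is a perfect match."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((abs(s - mean_obs) + abs(o - mean_obs)) ** 2 for o, s in zip(obs, sim))
    return 1.0 - num / den


# Made-up illustrative values, not data from the study.
obs = [100.0, 200.0, 300.0, 400.0]
sim = [110.0, 190.0, 310.0, 390.0]

print("PBIAS (%):", pbias(obs, sim))
print("MAE:", mae(obs, sim))
print("RMSE:", rmse(obs, sim))
print("d:", index_of_agreement(obs, sim))
```

Because MAE and RMSE carry the units of the flows, they grow with the magnitude of the series, which is consistent with the description's remark that larger errors correspond to stations with higher mean flows; PBIAS and d are dimensionless and so are easier to compare across stations.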