Disaggregating IMERG satellite precipitation over Czech Republic: an innovative approach using hybrid Extreme Gradient Boosting based on Fuzzy Spatial-Temporal Multivariate Clustering

Bibliographic Details
Main Authors: Ujjwal Singh, Sadaf Nasreen, Gaurav Tripathi, Pragya Mehrishi, Rajani Kumar Pradhan, Poppová Bestakova, Vivek Vikram Singh, K C Gouda, Laxmi Kant Sharma, Kiran Jalem, Petr Maca, Rama Rao Nidamanuri, Akhilesh Singh Raghubanshi, Yannis Markonis, Rakovec Oldřich, Martin Hanel
Format: Article
Language: English
Published: SpringerOpen 2025-06-01
Series: Journal of Big Data
Online Access: https://doi.org/10.1186/s40537-025-01208-4
Description
Summary: Accurate precipitation estimation at high spatial and temporal resolutions is essential for hydrological and meteorological applications, especially in regions experiencing water resource degradation. This study presents a robust non-parametric framework for disaggregating coarse-resolution satellite precipitation data to finer scales, using a hybrid model that integrates Extreme Gradient Boosting (XGBoost) with multivariate spatio-temporal fuzzy clustering. Eight clusters were delineated based on Integrated Multi-satellite Retrievals for GPM (IMERG) precipitation and Shuttle Radar Topography Mission (SRTM) elevation data, with one representative station per cluster used for training and validation, and an additional 19 stations employed solely for independent validation. We downscaled 255 months (June 2000–September 2021) of IMERG precipitation data from 11 km to 1 km spatial resolution across the Czech Republic. The disaggregated precipitation demonstrated marked accuracy improvements when evaluated against observed station data, with $$R^2$$ values ranging from 0.63 to 0.85, RMSE between 17.43 mm and 32.41 mm, NSE from 0.39 to 0.82, and KGE spanning 0.67 to 0.86, indicating a significant reduction in the bias inherent in the original IMERG data. The proposed methodology achieved (1) enhanced agreement between disaggregated and observed monthly precipitation, (2) significant improvement in IMERG data accuracy at finer scales, and (3) demonstrated operational potential in regions with sparse ground-based observations. This approach offers a promising solution for generating reliable, high-resolution precipitation datasets in data-scarce environments, with broad applicability in global hydrological and meteorological modelling.
ISSN: 2196-1115
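The summary reports four skill scores (R^2, RMSE, NSE, KGE) without defining them. The sketch below shows one common way to compute each for paired monthly station and satellite totals. It is illustrative only: the record does not say which variants the authors used, so R^2 is assumed here to be the squared Pearson correlation and KGE is assumed to be the original 2009 formulation. The function names and the sample values are hypothetical and are not taken from the study.

```python
import numpy as np


def rmse(obs, sim):
    """Root-mean-square error, in the units of the inputs (e.g. mm/month)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def r_squared(obs, sim):
    """Squared Pearson correlation (assumed definition of R^2)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]
    return float(r ** 2)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    residual = np.sum((sim - obs) ** 2)
    variance = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - residual / variance)


def kge(obs, sim):
    """Kling-Gupta efficiency (assumed 2009 form): correlation, variability, bias."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = sim.std() / obs.std()    # ratio of standard deviations
    beta = sim.mean() / obs.mean()   # ratio of means (bias term)
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))


# Made-up monthly totals in mm, for illustration only.
observed = np.array([40.0, 55.0, 70.0, 90.0, 65.0])
estimated = np.array([45.0, 50.0, 80.0, 85.0, 60.0])

for name, fn in [("R^2", r_squared), ("RMSE", rmse), ("NSE", nse), ("KGE", kge)]:
    print(f"{name}: {fn(observed, estimated):.3f}")
```

NSE and KGE both equal 1 for a perfect match, but NSE penalises squared errors relative to the observed variance, while KGE separates correlation, variability, and bias. That difference is one reason the two scores can span different ranges for the same stations.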