SMRFR: A global multilayer soil moisture dataset generated using Random Forest from multi-source data

Bibliographic Details
Main Authors: Yuhan Liu, Yuanyuan Zha, Gulin Ran, Yonggen Zhang, Liangsheng Shi
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-025-05511-w
collection DOAJ
description Abstract: Accurate and continuous monitoring of soil moisture (SM) is crucial for a wide range of applications in agriculture, hydrology, and climate modelling. In this study, we present a novel machine learning (ML)-based framework for generating a continuously updated, multilayer global SM dataset: SMRFR (Soil Moisture via Random Forest Regression). Leveraging publicly available reanalysis and remote sensing data, SMRFR provides daily SM estimates at five soil layers (0–5, 5–10, 10–30, 30–50 and 50–100 cm) with a spatial resolution of 9 km, covering the period from 2000 to 2023. Evaluation results demonstrate that SMRFR effectively captures both spatial and temporal SM variability. It also exhibits strong generalization capacity, successfully transferring knowledge across continents and accurately capturing transient and seasonal SM dynamics following rainfall events. SMRFR achieved an unbiased root mean square error of 0.0339 m³/m³ on the validation set. Our novel SM dataset offers a basis and valuable reference for agricultural, hydrological, and ecological research, enabling improved analysis and modelling of SM dynamics at regional to global scales.
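The abstract reports an unbiased root mean square error (ubRMSE) without defining it. As a minimal sketch, assuming the conventional definition (the RMSE computed after subtracting each series' own mean, so that a constant offset between estimates and observations does not count as error), the metric could be computed as follows. The function name `ubrmse` and the sample values are illustrative assumptions and are not taken from the paper or the dataset.

```python
import math


def ubrmse(predicted, observed):
    """Unbiased RMSE: RMSE of the anomalies, i.e. with the mean bias removed.

    Assumes the conventional definition; the record does not state which
    exact formulation the authors used.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Compare anomalies (value minus its own series mean) so that a
    # constant offset between the two series cancels out.
    squared = [((p - mean_p) - (o - mean_o)) ** 2
               for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / n)


# Hypothetical volumetric soil moisture values in m³/m³ (not real data).
obs = [0.20, 0.25, 0.30, 0.22]
pred = [0.23, 0.28, 0.33, 0.25]  # same shape as obs, shifted by a constant
print(ubrmse(pred, obs))
```

Because `pred` differs from `obs` only by a constant shift, the printed value is zero up to floating-point rounding, whereas the ordinary RMSE of the same pair would equal the size of the shift.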
id doaj-art-e38bcaf97a11473694292e61c27ea1c6
institution Kabale University
issn 2052-4463
spelling doaj-art-e38bcaf97a11473694292e61c27ea1c6 | 2025-08-20T03:45:45Z | eng | Nature Portfolio | Scientific Data | 2052-4463 | 2025-07-01 | 121116 | 10.1038/s41597-025-05511-w
Yuhan Liu (0): State Key Laboratory of Water Resources Engineering and Management, Wuhan University
Yuanyuan Zha (1): State Key Laboratory of Water Resources Engineering and Management, Wuhan University
Gulin Ran (2): State Key Laboratory of Water Resources Engineering and Management, Wuhan University
Yonggen Zhang (3): Institute of Surface-Earth System Science, School of Earth System Science, Tianjin University
Liangsheng Shi (4): State Key Laboratory of Water Resources Engineering and Management, Wuhan University