A partitioned conditioned Latin hypercube sampling method considering spatial heterogeneity in digital soil mapping

Abstract The design of sampling methods is crucial in digital soil mapping for soil organic carbon (SOC), as it directly affects prediction precision and reliability. While sampling methods based on environmental variables are widely used, the spatial heterogeneity of soil properties poses challenge...

Bibliographic Details
Main Authors: Biao Huang, Guijian Yang, Jiancong Lei, Xiaomi Wang
Format: Article
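Language: English
Published: Nature Portfolio 2025-04-01
Series: Scientific Reports
Subjects: Soil organic carbon; Spatial heterogeneity; Partitioned conditional Latin hypercube sampling method
Online Access: https://doi.org/10.1038/s41598-025-95631-5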
author Biao Huang
Guijian Yang
Jiancong Lei
Xiaomi Wang
collection DOAJ
description Abstract The design of sampling methods is crucial in digital soil mapping for soil organic carbon (SOC), as it directly affects prediction precision and reliability. While sampling methods based on environmental variables are widely used, the spatial heterogeneity of soil properties poses challenges: the influential driving factors vary across subregions, which can reduce prediction accuracy. To address this, a partitioned conditioned Latin hypercube sampling (PcLHS) method that explicitly considers spatial heterogeneity is proposed. PcLHS first employs the regionalization with dynamically constrained agglomerative clustering and partitioning (REDCAP) method to partition the study area into relatively homogeneous subregions. Key environmental variables are then identified using the Boruta algorithm and the variance inflation factor (VIF), followed by conditioned Latin hypercube sampling (cLHS) to select training points within each subregion. Finally, the selected training points are combined to form the complete training dataset. A case study on SOC sampling in northeastern France demonstrated that PcLHS consistently outperformed traditional sampling methods, achieving lower root mean square error (RMSE, 0.40–0.43), higher coefficient of determination (R², 0.36–0.44), and improved concordance correlation coefficient (CCC, 0.58–0.63). Compared to other methods, PcLHS reduced RMSE by 4–11%, increased R² by 18–46%, and improved CCC by 14–29%. These results highlight the necessity of considering spatial heterogeneity in soil sampling design and establish PcLHS as an effective method for SOC prediction in heterogeneous landscapes.
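The partition, screen, sample, and combine steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the subregion labels are hypothetical stand-ins for REDCAP output, only the VIF part of the variable screening is shown (Boruta is omitted), the cLHS search uses greedy random swaps on the quantile-stratum term rather than the simulated annealing of the original cLHS, and the proportional allocation of samples to subregions is an assumption, since the abstract does not state the allocation rule. All function names are illustrative.

```python
import numpy as np

def vif_filter(X, threshold=10.0):
    """Drop the column with the largest VIF until all VIFs fall below threshold."""
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        corr = np.corrcoef(X[:, keep], rowvar=False)
        # VIF_j equals the j-th diagonal entry of the inverse correlation matrix.
        vif = np.diag(np.linalg.inv(corr))
        worst = int(np.argmax(vif))
        if vif[worst] < threshold:
            break
        keep.pop(worst)
    return keep

def clhs_objective(X, idx, edges):
    """Sum over variables of |samples per quantile stratum - 1|."""
    n = len(idx)
    total = 0
    for j in range(X.shape[1]):
        strata = np.searchsorted(edges[j], X[idx, j], side="right")
        counts = np.bincount(strata, minlength=n)
        total += int(np.abs(counts - 1).sum())
    return total

def clhs(X, n, rng, iters=2000):
    """Simplified cLHS: accept a random swap whenever it does not worsen the objective."""
    N = X.shape[0]
    q = np.linspace(0, 1, n + 1)[1:-1]  # inner quantile edges for n strata
    edges = [np.quantile(X[:, j], q) for j in range(X.shape[1])]
    idx = rng.choice(N, size=n, replace=False)
    best = clhs_objective(X, idx, edges)
    for _ in range(iters):
        if best == 0:
            break
        candidate = int(rng.integers(N))
        if candidate in idx:
            continue
        trial = idx.copy()
        trial[int(rng.integers(n))] = candidate
        score = clhs_objective(X, trial, edges)
        if score <= best:
            idx, best = trial, score
    return idx

def pclhs(X, labels, n_total, rng):
    """Run cLHS inside each subregion and pool the selected row indices."""
    selected = []
    regions, sizes = np.unique(labels, return_counts=True)
    for region, size in zip(regions, sizes):
        # Assumption: sample count proportional to subregion size.
        n_r = max(1, int(round(float(n_total * size / len(labels)))))
        rows = np.flatnonzero(labels == region)
        local = clhs(X[rows], min(n_r, int(size)), rng)
        selected.append(rows[local])
    return np.concatenate(selected)

# Synthetic demo: 600 locations, 4 covariates, the 4th nearly duplicating the 1st.
rng = np.random.default_rng(0)
N = 600
X = rng.normal(size=(N, 3))
X = np.column_stack([X, X[:, 0] + 0.01 * rng.normal(size=N)])
labels = np.repeat([0, 1, 2], [300, 200, 100])  # hypothetical subregion ids

keep = vif_filter(X)
train_idx = pclhs(X[:, keep], labels, n_total=60, rng=rng)
print(len(keep), len(train_idx))
```

In this sketch each subregion gets its own quantile strata, so the pooled training set covers the covariate distribution within every subregion rather than only the distribution of the study area as a whole.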
format Article
id doaj-art-7481e2b0d5a74ebc8fb5c56fee6992f4
institution OA Journals
issn 2045-2322
language English
publishDate 2025-04-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-7481e2b0d5a74ebc8fb5c56fee6992f4
2025-08-20T02:17:57Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-04-01
151125
10.1038/s41598-025-95631-5
Biao Huang (0), Guijian Yang (1), Jiancong Lei (2), Xiaomi Wang (3)
(0) School of Geographic Sciences, Hunan Normal University
(1) Technical Department, Guizhou Engineering Technology Consulting Co., Ltd
(2) School of Geographic Sciences, Hunan Normal University
(3) School of Geographic Sciences, Hunan Normal University
title A partitioned conditioned Latin hypercube sampling method considering spatial heterogeneity in digital soil mapping
topic Soil organic carbon
Spatial heterogeneity
Partitioned conditional Latin hypercube sampling method
url https://doi.org/10.1038/s41598-025-95631-5