Spatiotemporal prediction of soil organic carbon density in Europe (2000–2022) using earth observation and machine learning
This article describes a comprehensive framework for soil organic carbon density (SOCD, kg/m³) modeling and mapping, based on spatiotemporal random forest (RF) and quantile regression forests (QRF). A total of 45,616 SOCD observations and various Earth observation (EO) feature layers were used to pr...
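| Main Authors: | Xuemeng Tian, Sytze de Bruin, Rolf Simoes, Mustafa Serkan Isik, Robert Minarik, Yu-Feng Ho, Murat Şahin, Martin Herold, Davide Consoli, Tomislav Hengl |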
|---|---|
| Format: | Article |
| Language: | English |
| Published: | PeerJ Inc., 2025-07-01 |
| Series: | PeerJ |
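| Subjects: | Soil organic carbon density; Machine learning; Earth observation; Uncertainty; Spatial aggregation; Time series |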
| Online Access: | https://peerj.com/articles/19605.pdf |
| _version_ | 1849430913410138112 |
|---|---|
| author | Xuemeng Tian; Sytze de Bruin; Rolf Simoes; Mustafa Serkan Isik; Robert Minarik; Yu-Feng Ho; Murat Şahin; Martin Herold; Davide Consoli; Tomislav Hengl |
| author_sort | Xuemeng Tian |
| collection | DOAJ |
| description | This article describes a comprehensive framework for soil organic carbon density (SOCD, kg/m³) modeling and mapping, based on spatiotemporal random forest (RF) and quantile regression forests (QRF). A total of 45,616 SOCD observations and various Earth observation (EO) feature layers were used to produce 30 m SOCD maps for the EU at four-year intervals (2000–2022) and four soil depth intervals (0–20 cm, 20–50 cm, 50–100 cm, and 100–200 cm). Per-pixel 95% probability prediction intervals (PIs) and extrapolation risk probabilities are also provided. Model evaluation indicates good overall accuracy (R² = 0.63 and CCC = 0.76 for hold-out independent tests). Prediction accuracy varies by land cover, depth interval, and year of prediction, with the worst accuracy for shrubland and deeper soils (100–200 cm). The PI validation confirmed effective uncertainty estimation, though with reduced accuracy for higher SOCD values. Shapley analysis identified soil depth as the most influential feature, followed by vegetation, long-term bioclimate, and topographic features. While pixel-level uncertainty is substantial, spatial aggregation reduces uncertainty by approximately 66%. Detecting SOCD changes remains challenging but offers a baseline for future improvements. Maps, based primarily on topsoil data from cropland, grassland, and woodland, are best suited for applications related to these land covers and depths. We recommend that users interpret the maps in conjunction with local knowledge and consider the accompanying uncertainty and extrapolation risk layers. All data and code are available under an open license at https://doi.org/10.5281/zenodo.13754343 and https://github.com/AI4SoilHealth/SoilHealthDataCube/. |
| format | Article |
| id | doaj-art-c8e4613dc4d742e591dbe9b4f4a40f46 |
| institution | Kabale University |
| issn | 2167-8359 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | PeerJ Inc. |
| record_format | Article |
| series | PeerJ |
| spelling | doaj-art-c8e4613dc4d742e591dbe9b4f4a40f46; 2025-08-20T03:27:48Z; eng; PeerJ Inc.; PeerJ; ISSN 2167-8359; 2025-07-01; vol. 13, e19605; DOI 10.7717/peerj.19605; authors and affiliations in record order: Xuemeng Tian (OpenGeoHub, Doorwerth, Netherlands); Sytze de Bruin (Laboratory of Geo-Information Science and Remote Sensing, Wageningen University and Research, Wageningen, Netherlands); Rolf Simoes (OpenGeoHub, Doorwerth, Netherlands); Mustafa Serkan Isik (OpenGeoHub, Doorwerth, Netherlands); Robert Minarik (OpenGeoHub, Doorwerth, Netherlands); Yu-Feng Ho (OpenGeoHub, Doorwerth, Netherlands); Murat Şahin (Department of Geosciences & Engineering, Delft University of Technology, Delft, Netherlands); Martin Herold (Laboratory of Geo-Information Science and Remote Sensing, Wageningen University and Research, Wageningen, Netherlands); Davide Consoli (OpenGeoHub, Doorwerth, Netherlands); Tomislav Hengl (OpenGeoHub, Doorwerth, Netherlands); https://peerj.com/articles/19605.pdf |
| title | Spatiotemporal prediction of soil organic carbon density in Europe (2000–2022) using earth observation and machine learning |
| topic | Soil organic carbon density; Machine learning; Earth observation; Uncertainty; Spatial aggregation; Time series |
| url | https://peerj.com/articles/19605.pdf |
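
The description in this record reports accuracy as R² and CCC and mentions a validation of the 95% prediction intervals. The sketch below shows how such figures are commonly computed. It assumes that CCC means Lin's concordance correlation coefficient, that R² is taken as 1 − SSE/SST, and that interval validation amounts to counting how often observations fall inside their interval. The function names and toy numbers are hypothetical and are not taken from the article or its repository, which may compute these quantities differently (for example, on a transformed scale).

```python
from statistics import fmean


def r2_score(observed, predicted):
    """Coefficient of determination, computed as 1 - SSE/SST."""
    mean_obs = fmean(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def concordance_cc(observed, predicted):
    """Lin's concordance correlation coefficient (population moments)."""
    mean_obs = fmean(observed)
    mean_pred = fmean(predicted)
    var_obs = fmean([(o - mean_obs) ** 2 for o in observed])
    var_pred = fmean([(p - mean_pred) ** 2 for p in predicted])
    cov = fmean([(o - mean_obs) * (p - mean_pred)
                 for o, p in zip(observed, predicted)])
    # Penalises both scatter (via the covariance) and bias (via the mean shift).
    return 2.0 * cov / (var_obs + var_pred + (mean_obs - mean_pred) ** 2)


def pi_coverage(observed, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    inside = sum(lo <= o <= hi for o, lo, hi in zip(observed, lower, upper))
    return inside / len(observed)


# Toy values for illustration only (hypothetical, not SOCD data).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
lower = [5.0, 15.0, 31.0, 30.0]
upper = [15.0, 25.0, 45.0, 50.0]

print(round(r2_score(observed, predicted), 3))
print(round(concordance_cc(observed, predicted), 3))
print(pi_coverage(observed, lower, upper))
```

Under these assumptions, a well-calibrated 95% interval would cover roughly 95% of hold-out observations. Unlike a plain correlation, CCC drops when predictions are systematically shifted or rescaled relative to the observations, so reporting it next to R² gives information about bias as well as scatter.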