Assessing the impact of multi-source environmental variables on soil organic carbon in different land use types of China using an interpretable high-precision machine learning method
Exploring the impact of environmental factors on soil organic carbon (SOC) with machine learning (ML) models is of great significance for mitigating climate change, sequestering soil carbon, and reducing emissions. However, the traditional ML model is limited by the hyperparameter adjustment of a...
| Main Authors: | Feng Wang, Ruilin Liang, Shuyue Li, Meiyan Xiang, Weihao Yang, Miao Lu, Yingqiang Song |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2024-12-01 |
| Series: | Ecological Indicators |
| Subjects: | Hyperparameter; Machine learning; Land use; Soil organic carbon |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S1470160X24013220 |
| _version_ | 1850064981872082944 |
|---|---|
| author | Feng Wang Ruilin Liang Shuyue Li Meiyan Xiang Weihao Yang Miao Lu Yingqiang Song |
| author_sort | Feng Wang |
| collection | DOAJ |
| description | Exploring the impact of environmental factors on soil organic carbon (SOC) with machine learning (ML) models is of great significance for mitigating climate change, sequestering soil carbon, and reducing emissions. However, traditional ML models are limited by manual trial-and-error hyperparameter tuning and by the opacity of the fitting process, so their precision and performance cannot be fully exploited. To this end, this study developed a tree-structured Parzen estimator-extreme gradient boosting (TPE-XGBoost) method combined with SHapley additive explanations (SHAP) analysis to analyze the response of SOC (0-200 cm) to climate, human activities, soil properties, and terrain in different land use types of China. Descriptive statistics gave the order of SOC content as forest land > grassland > cultivated land > unused land. The SOC content of all land types decreased continuously with increasing soil depth, and the values followed a left-skewed, non-normal distribution. The fitting accuracy (R2) of the TPE-XGBoost model for SOC content was greater than 0.8. Prediction accuracy was highest at the 0-5 cm depth for cultivated land (R2 = 0.96), grassland (R2 = 0.93), forest land (R2 = 0.95), and unused land (R2 = 0.95). SHAP analysis showed that, across all depths, the factors contributing most to the fitting accuracy for cultivated land, grassland, forest land, and unused land were temperature, soil pH, temperature, and elevation, respectively. From surface to deep soil, the mean SHAP value showed a downward trend, indicating that the driving force of environmental factors on SOC content gradually weakened. In the variance partitioning (VP) analysis, the individual explanations of climate, terrain, and soil properties for cultivated land (0-200 cm), forest land (30-60 cm), and unused land (0-200 cm) were as high as 0.32, 0.17, and 0.16, respectively, indicating that SOC content responds strongly to these environmental factors. The study found that an appropriate temperature not only helps plant roots obtain nutrients but also interacts with soil pH to affect microorganisms, thereby increasing SOC content. The results confirm that the TPE-XGBoost model combined with SHAP analysis can reliably explain the nonlinear driving effect of environmental factors on SOC, which provides credible decision support for carbon budget accounting and carbon sequestration in large-scale regions. |
| format | Article |
| id | doaj-art-9126a13cb17b49c4a79f665a9a6ff1ca |
| institution | DOAJ |
| issn | 1470-160X |
| language | English |
| publishDate | 2024-12-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Ecological Indicators |
| spelling | doaj-art-9126a13cb17b49c4a79f665a9a6ff1ca; 2025-08-20T02:49:08Z; eng; Elsevier; Ecological Indicators; ISSN 1470-160X; 2024-12-01; vol. 169, article 112865; DOI 10.1016/j.ecolind.2024.112865. Authors and affiliations: Feng Wang, Ruilin Liang, Shuyue Li, Meiyan Xiang, Weihao Yang (School of civil engineering and geomatics, Shandong University of Technology, Zibo 255000, China); Miao Lu (State Key Laboratory of Efficient Utilization of Arid and Semi-arid Arable Land in Northern China / Key Laboratory of Agricultural Remote Sensing, Ministry of Agriculture and Rural Affairs / Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences, Beijing 100081, China; National Center of Technology Innovation for Comprehensive Utilization of Saline-Alkali Land, Dongying 257300, China; corresponding author); Yingqiang Song (School of civil engineering and geomatics, Shandong University of Technology, Zibo 255000, China; National Center of Technology Innovation for Comprehensive Utilization of Saline-Alkali Land, Dongying 257300, China; corresponding author). Online: http://www.sciencedirect.com/science/article/pii/S1470160X24013220. Keywords: Hyperparameter; Machine learning; Land use; Soil organic carbon |
| title | Assessing the impact of multi-source environmental variables on soil organic carbon in different land use types of China using an interpretable high-precision machine learning method |
| topic | Hyperparameter Machine learning Land use Soil organic carbon |
| url | http://www.sciencedirect.com/science/article/pii/S1470160X24013220 |
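
The description above ranks environmental factors by their SHAP contributions. As a rough illustration of the underlying idea only, the sketch below computes exact Shapley values for a single prediction by enumerating feature coalitions, replacing absent features with a baseline value. It is not the authors' TPE-XGBoost pipeline and does not use the `shap` library; the toy model `toy_soc_model`, its coefficients, the feature names, and the input values are all hypothetical and chosen only to make the arithmetic easy to check.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                without_i = [x[j] if j in present else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


# Hypothetical toy model (NOT from the paper): two linear terms plus an interaction.
FEATURES = ["temperature", "soil_pH", "elevation"]


def toy_soc_model(v):
    return 2.0 * v[0] - 1.0 * v[1] + 0.5 * v[0] * v[2]


sample = [1.0, 2.0, 3.0]    # hypothetical, unitless inputs
baseline = [0.0, 0.0, 0.0]  # hypothetical reference point

phi = shapley_values(toy_soc_model, sample, baseline)

# Rank features by absolute contribution, largest first
ranking = sorted(zip(FEATURES, phi), key=lambda item: abs(item[1]), reverse=True)
for name, value in ranking:
    print(f"{name}: {value:+.2f}")
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that SHAP relies on. Averaging the absolute values over many samples would give a mean-|SHAP| importance of the kind the description refers to.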