A case-based method of selecting covariates for digital soil mapping

Bibliographic Details
Main Authors: Peng LIANG, Cheng-zhi QIN, A-xing ZHU, Zhi-wei HOU, Nai-qing FAN, Yi-jie WANG
Format: Article
Language: English
Published: KeAi Communications Co., Ltd. 2020-08-01
Series: Journal of Integrative Agriculture
Online Access: http://www.sciencedirect.com/science/article/pii/S2095311919628571
Description
Summary: Selecting a proper set of covariates is one of the most important factors influencing the accuracy of digital soil mapping (DSM). Statistical and machine learning methods for selecting DSM covariates are not applicable when only limited samples are available. To solve this problem, this paper proposes a case-based method that formalizes the covariate selection knowledge contained in practical DSM applications. The proposed method trains Random Forest (RF) classifiers with DSM cases extracted from practical DSM applications and then uses the trained classifiers to determine whether each potential covariate should be used in a new DSM application. In this study, we took topographic covariates as examples and extracted 191 DSM cases from 56 peer-reviewed journal articles to evaluate the performance of the proposed method by leave-one-out cross-validation. Compared with the way novices commonly select DSM covariates, the proposed case-based method improved accuracy by more than 30% according to three quantitative evaluation indices (i.e., recall, precision, and F1-score). The proposed method could also be applied to selecting proper sets of covariates in other similar geographical modeling domains, such as landslide susceptibility mapping and species distribution modeling.
ISSN: 2095-3119
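
The three evaluation indices named in the summary (recall, precision, and F1-score) can be illustrated with a minimal sketch. This record does not say how the article computes them, so the set-based, per-case definition below is an assumption, and the function name `selection_scores` and the covariate names are hypothetical and chosen only for illustration.

```python
def selection_scores(recommended, actually_used):
    """Return (precision, recall, f1) for one DSM case.

    Assumption: a recommended covariate counts as correct when it also
    appears in the set of covariates the original application used.
    """
    recommended = set(recommended)
    actually_used = set(actually_used)
    hits = len(recommended & actually_used)

    # Precision: share of recommended covariates that were actually used.
    precision = hits / len(recommended) if recommended else 0.0
    # Recall: share of actually used covariates that were recommended.
    recall = hits / len(actually_used) if actually_used else 0.0
    # F1-score: harmonic mean of precision and recall (0 when no hits).
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical case: covariate names are illustrative, not taken from the article.
recommended = {"slope", "elevation", "plan curvature", "topographic wetness index"}
actually_used = {"slope", "elevation", "profile curvature"}

precision, recall, f1 = selection_scores(recommended, actually_used)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this hypothetical case, two of the four recommended covariates match the three that were used, giving a precision of 2/4, a recall of 2/3, and an F1-score of 4/7. In a leave-one-out evaluation, scores like these would presumably be computed for each held-out case and then averaged, though the record does not state the aggregation used.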