The impact of fractional cover distribution in training samples on the accuracy of fractional cover estimation: a model-based evaluation

Bibliographic Details
Main Authors: Rujia Wang, Chen Shi
Format: Article
Language: English
Published: Taylor & Francis Group 2025-07-01
Series: Geo-spatial Information Science
Online Access: https://www.tandfonline.com/doi/10.1080/10095020.2025.2514815
Description
Summary: In machine learning-based fractional cover estimation, the fractional cover distribution in training samples critically influences model construction and, consequently, the accuracy of the estimations. While some studies have descriptively compared the accuracies of machine learning-based estimations across training sets derived from different sampling methods, a significant gap remains in quantitatively analyzing how the fractional cover distribution in training samples affects accuracy. This study aims to bridge this gap by introducing descriptors for fractional cover distribution in the training set and establishing mathematical relationships between these descriptors and the accuracy of fractional cover estimation. We employed the Dirichlet distribution to characterize the joint fractional cover of multiple land classes and the Beta distribution for single-class cover. Subsequently, two descriptors were developed: the Kullback-Leibler (KL) divergence, measuring the similarity of fractional cover distributions for the target class between the training and test sets, and the geometric angle, representing the fractional cover distributions of the target class in the training set at the same KL divergence. Fractional cover estimation was performed using random forest regression, with accuracy assessed on an independent test set. The relationships between the KL divergence and accuracy, and between the geometric angle and accuracy at the same KL divergence, were modeled using univariate linear models and harmonic models, respectively. The combined effects of these descriptors on accuracy were further analyzed using coupled harmonic analysis and generalized additive models. Our experimental results, using both simulated and real data, demonstrated the effectiveness of these models.
Given the strong explanatory power of the KL divergence for the accuracy of fractional cover estimation, we encourage researchers to report detailed statistical information on both training and test sets, enriching the understanding of model performance in fractional cover estimation.
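The KL-divergence descriptor named in the summary can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions: the Beta parameters are fitted by the method of moments, the divergence is taken in the direction KL(test || train), and the function names (`digamma`, `kl_dirichlet`, `kl_beta`, `fit_beta_moments`) and sample values are hypothetical. The closed-form KL divergence between two Dirichlet distributions is standard, and the Beta case is its two-parameter special case.

```python
import math
import statistics


def digamma(x):
    """Approximate the digamma function psi(x) for x > 0."""
    result = 0.0
    # Use psi(x) = psi(x + 1) - 1/x to shift x up until the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # Asymptotic series: ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def kl_dirichlet(alpha, beta):
    """Closed-form KL(Dir(alpha) || Dir(beta)) for equal-length parameter lists."""
    a0, b0 = sum(alpha), sum(beta)
    kl = math.lgamma(a0) - math.lgamma(b0)
    for a, b in zip(alpha, beta):
        kl += math.lgamma(b) - math.lgamma(a)
        kl += (a - b) * (digamma(a) - digamma(a0))
    return kl


def kl_beta(a1, b1, a2, b2):
    """KL(Beta(a1, b1) || Beta(a2, b2)), the two-class Dirichlet case."""
    return kl_dirichlet([a1, b1], [a2, b2])


def fit_beta_moments(fractions):
    """Method-of-moments Beta fit (an assumed estimator, not from the paper)."""
    m = statistics.fmean(fractions)
    v = statistics.pvariance(fractions)
    if not 0.0 < v < m * (1.0 - m):
        raise ValueError("sample variance is incompatible with a Beta fit")
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common


# Hypothetical fractional cover values for one target class.
train = [0.05, 0.10, 0.20, 0.15, 0.30, 0.25]
test = [0.40, 0.55, 0.60, 0.70, 0.50, 0.65]
a_tr, b_tr = fit_beta_moments(train)
a_te, b_te = fit_beta_moments(test)
# Direction KL(test || train) is an assumption made for this sketch.
print(kl_beta(a_te, b_te, a_tr, b_tr))
```

A divergence of zero means the fitted training and test distributions coincide, and larger values indicate a greater mismatch. Because KL divergence is asymmetric, the chosen direction should be stated whenever such a value is reported.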
ISSN: 1009-5020
1993-5153