On the use of adversarial validation for quantifying dissimilarity in geospatial machine learning prediction

Recent geospatial machine learning studies have shown that the results of model evaluation via cross-validation (CV) are strongly affected by the dissimilarity between the sample data and the prediction locations. In this paper, we propose a method to quantify such a dissimilarity in the interval 0...

Bibliographic Details
Main Authors:	Yanwen Wang, Mahdi Khodadadzadeh, Raúl Zurita-Milla
Format:	Article
Language:	English
Published:	Taylor & Francis Group 2025-12-01
Series:	GIScience & Remote Sensing
Subjects:	Machine learning; geospatial regression; model evaluation; cross-validation; feature space
Online Access:	https://www.tandfonline.com/doi/10.1080/15481603.2025.2460513

author	Yanwen Wang, Mahdi Khodadadzadeh, Raúl Zurita-Milla
collection	DOAJ
description	Recent geospatial machine learning studies have shown that the results of model evaluation via cross-validation (CV) are strongly affected by the dissimilarity between the sample data and the prediction locations. In this paper, we propose a method to quantify this dissimilarity on a scale from 0 to 100%, from the perspective of the data feature space. The proposed method is based on adversarial validation, an approach that checks whether sample data and prediction locations can be separated with a binary classifier. The proposed method is called dissimilarity quantification by adversarial validation (DAV). To study the effectiveness and generality of DAV, we tested it in a series of experiments based on both synthetic and real datasets with gradually increasing dissimilarities. Results show that DAV effectively quantified dissimilarity across the entire range of values. We also studied how dissimilarity affects the evaluations produced by CV methods by comparing the results of random CV (RDM-CV) with those of two geospatial CV methods, namely block CV and spatial+ CV (BLK-CV and SP-CV). The evaluations followed similar patterns across all datasets and predictions: when dissimilarity is low (usually lower than 30%), RDM-CV provides the most accurate evaluation results. As dissimilarity increases, geospatial CV methods, especially SP-CV, become increasingly accurate and even outperform RDM-CV. When dissimilarity is high ([Formula: see text]), no CV method provides accurate evaluations. These results show the importance of considering feature space dissimilarity when working with geospatial machine learning predictions and can help researchers and practitioners select more suitable CV methods for evaluating their predictions.
format	Article
id	doaj-art-a94539ddf8384a74b04d67815dedc526
institution	Kabale University
issn	1548-1603 1943-7226
language	English
publishDate	2025-12-01
publisher	Taylor & Francis Group
record_format	Article
series	GIScience & Remote Sensing
spelling	doaj-art-a94539ddf8384a74b04d67815dedc526 | 2025-02-05T04:39:20Z | eng | Taylor & Francis Group | GIScience & Remote Sensing | 1548-1603 | 1943-7226 | 2025-12-01 | 62 | 1 | 10.1080/15481603.2025.2460513 | On the use of adversarial validation for quantifying dissimilarity in geospatial machine learning prediction | Yanwen Wang, Mahdi Khodadadzadeh, Raúl Zurita-Milla (all: Faculty of Geo-Information Science and Earth Observation (ITC), University of Twente, Enschede, The Netherlands) | https://www.tandfonline.com/doi/10.1080/15481603.2025.2460513 | Machine learning; geospatial regression; model evaluation; cross-validation; feature space
title	On the use of adversarial validation for quantifying dissimilarity in geospatial machine learning prediction
topic	Machine learning; geospatial regression; model evaluation; cross-validation; feature space
url	https://www.tandfonline.com/doi/10.1080/15481603.2025.2460513
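The description field in this record characterises adversarial validation as checking whether sample data and prediction locations can be separated with a binary classifier. The following is a minimal Python sketch of that general idea, not the authors' DAV implementation: the logistic-regression classifier, the single hold-out split, the AUC-based rescaling to 0-100%, and every function name (such as `dissimilarity_percent`) are assumptions made for illustration.

```python
import math
import random


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _standardize(rows):
    # Scale every feature to zero mean and unit variance over all rows.
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def _fit_logistic(X, y, epochs=200, lr=0.1):
    # Batch gradient descent on the logistic loss.
    # Assumption: this stands in for whatever binary classifier is preferred.
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = _sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def _auc(scores, labels):
    # Fraction of (prediction row, sample row) pairs ranked correctly; ties count 0.5.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def dissimilarity_percent(sample_rows, prediction_rows, seed=0):
    """Hypothetical 0-100% score derived from hold-out AUC (assumed rescaling)."""
    X = _standardize(list(sample_rows) + list(prediction_rows))
    y = [0] * len(sample_rows) + [1] * len(prediction_rows)  # 0 = sample, 1 = prediction
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    train, hold = idx[:half], idx[half:]
    w, b = _fit_logistic([X[i] for i in train], [y[i] for i in train])
    scores = [b + sum(wj * xj for wj, xj in zip(w, X[i])) for i in hold]
    auc = _auc(scores, [y[i] for i in hold])
    # AUC of 0.5 (chance) maps to 0%; AUC of 1.0 (perfect separation) maps to 100%.
    return 100.0 * max(0.0, 2.0 * auc - 1.0)


if __name__ == "__main__":
    rng = random.Random(42)
    sample = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(400)]
    similar = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(400)]
    shifted = [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(400)]
    print(dissimilarity_percent(sample, similar))  # expected to be low
    print(dissimilarity_percent(sample, shifted))  # expected to be close to 100
```

Scoring on a held-out half keeps the classifier from appearing to separate the two sets merely by memorising its training rows. A more careful version would likely use k-fold predictions and a stronger classifier.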

Similar Items