Machine learning-guided field site selection for river classification

Bibliographic Details
Main Authors: Zhihao Wang, Gregory Brian Pasternack, Yufang Jin, Costanza Rampini, Serena Alexander, Nikhil Kumar, Rune Storesund, K. Martin Perales, Christopher Lim, Stephanie Moreno, Igor Lacan
Format: Article
Language: English
Published: Elsevier 2025-08-01
Series: International Journal of Applied Earth Observations and Geoinformation
Online Access: http://www.sciencedirect.com/science/article/pii/S1569843225003899
Description
Summary: Sampling a sufficient number and variety of field sites is crucial for obtaining an accurate reach-scale river classification of a regional stream network in support of scientific research and river management. However, many studies still select field sites at random or visit only accessible streams. This leads to an inadequate exploration of stream characteristics, resulting in incomplete or inaccurate classification. Machine learning has been recognized for discovering and extracting streams' geomorphic patterns efficiently and accurately from data, but its application to field site sampling design is still in its infancy. This study developed a general and practical field site selection framework that incorporates machine learning in a human-in-the-loop manner. The framework includes three steps: (1) initial field site selection via machine learning from prior datasets, (2) accessibility evaluation and observation of the selected field sites, and (3) a decision on, and selection of, additional field sites via an iterative learning process. In an example application to the San Francisco Bay Area (California, USA), our framework extracted representative geomorphic characteristics of (i) previously known stream types from prior labeled and geospatial datasets and (ii) previously unrecognized stream types based on uncertainty information obtained by machine learning. Moreover, we propose methods for replacing inaccessible sites to ensure that sufficient information is retained in the selected field sites. Results revealed clear differences in variable distributions between the 148 high-certainty sites and the 51 high-uncertainty sites, a pattern that was validated by our field surveys. Furthermore, the 41 newly identified high-uncertainty sites were found to be under-represented among the initially surveyed sites, so selecting them for the next round of field surveys will help fill the important feature gaps left by the initial survey. This framework can help river scientists and land-use decision-makers better understand river patterns and manage spatial planning.
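The summary does not say which model, uncertainty measure, or replacement rule the authors used, so the following is only a minimal sketch of the general idea under stated assumptions: candidate reaches are split into high-certainty and high-uncertainty groups by the entropy of predicted class probabilities, and an inaccessible site is swapped for the nearest accessible reach in feature space. The function names, the entropy threshold, and the nearest-neighbour rule are all hypothetical and are not taken from the article.

```python
import numpy as np


def predictive_entropy(proba):
    """Shannon entropy (in nats) of each row of class probabilities."""
    p = np.clip(proba, 1e-12, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=1)


def split_by_uncertainty(proba, threshold):
    """Return (certain_idx, uncertain_idx) using an assumed entropy threshold."""
    h = predictive_entropy(proba)
    return np.flatnonzero(h < threshold), np.flatnonzero(h >= threshold)


def replace_inaccessible(selected, accessible, features):
    """Swap each inaccessible selected site for the nearest accessible,
    not-yet-selected site in feature space (an assumed replacement rule).
    A site with no available replacement is dropped."""
    taken = {int(s) for s in selected}
    result = []
    for s in selected:
        if accessible[s]:
            result.append(int(s))
            continue
        # Euclidean distance from the inaccessible site to every candidate
        dist = np.linalg.norm(features - features[s], axis=1)
        for c in np.argsort(dist):
            c = int(c)
            if accessible[c] and c not in taken:
                result.append(c)
                taken.add(c)
                break
    return result
```

In an iterative workflow of the kind the summary describes, the high-uncertainty indices would be candidates for the next survey round, and the classifier would be retrained once those sites have been observed and labeled.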
ISSN:1569-8432