Machine learning-guided field site selection for river classification

Sufficient abundance and variety of field site sampling are crucial for obtaining an accurate reach-scale river classification of a regional stream network in support of scientific research and river management. However, many studies still randomly select field sites or only visit accessible streams...

Bibliographic Details
Main Authors: Zhihao Wang, Gregory Brian Pasternack, Yufang Jin, Costanza Rampini, Serena Alexander, Nikhil Kumar, Rune Storesund, K. Martin Perales, Christopher Lim, Stephanie Moreno, Igor Lacan
Format: Article
Language: English
Published: Elsevier 2025-08-01
Series: International Journal of Applied Earth Observations and Geoinformation
Subjects: River classification; Machine learning; Field site selection; Prior datasets; Uncertainty information
Online Access: http://www.sciencedirect.com/science/article/pii/S1569843225003899
_version_ 1849391648183681024
author Zhihao Wang
Gregory Brian Pasternack
Yufang Jin
Costanza Rampini
Serena Alexander
Nikhil Kumar
Rune Storesund
K. Martin Perales
Christopher Lim
Stephanie Moreno
Igor Lacan
author_sort Zhihao Wang
collection DOAJ
description Sufficient abundance and variety of field site sampling are crucial for obtaining an accurate reach-scale river classification of a regional stream network in support of scientific research and river management. However, many studies still select field sites at random or visit only accessible streams. This leads to an inadequate exploration of stream characteristics, resulting in incomplete or inaccurate classification. Machine learning has been recognized for discovering and extracting streams’ geomorphic patterns efficiently and accurately from data, but its application in field site sampling design is still in its infancy. This study developed a general and practical field site selection framework by incorporating machine learning in a human-in-the-loop manner. The framework includes three steps: (1) initial field site selection via machine learning from prior datasets, (2) accessibility evaluation and observation of the selected field sites, and (3) decision on and selection of additional field sites via an iterative learning process. In an example application to the San Francisco Bay Area (California, USA), our framework extracted representative geomorphic characteristics of (i) previously known stream types from prior labeled and geospatial datasets and (ii) previously unrecognized stream types based on uncertainty information obtained by machine learning. Moreover, we propose methods for replacing inaccessible sites to ensure sufficient information is retained in the selected field sites. Results revealed clear differences in variable distributions between the 148 high-certainty sites and the 51 high-uncertainty sites, a pattern that was validated by our field surveys. Furthermore, the 41 newly identified high-uncertainty sites were found to be under-represented in the initially surveyed sites, and thus their selection for the next round of field surveys will help fill the important feature gaps left by the initial survey. The feasibility of this framework allows river scientists and land use decision-makers to better understand river patterns and manage spatial planning.
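The split between high-certainty and high-uncertainty sites mentioned in the abstract can be illustrated with a minimal sketch. The record does not say how uncertainty was measured, so everything here is an assumption made for illustration: uncertainty is taken as the normalized Shannon entropy of a classifier's predicted class probabilities, and the function names, site identifiers, probabilities, and the 0.5 threshold are all hypothetical rather than the authors' method.

```python
import math
from typing import Dict, List, Sequence, Tuple


def normalized_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy of a class-probability vector, scaled to [0, 1].

    Assumption: entropy stands in for the "uncertainty information"
    named in the abstract; the source does not specify the metric.
    """
    k = len(probs)
    if k < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(k)


def split_sites_by_uncertainty(
    site_probs: Dict[str, Sequence[float]],
    threshold: float = 0.5,  # hypothetical cut-off, not from the source
) -> Tuple[List[str], List[str]]:
    """Return (high_certainty_ids, high_uncertainty_ids).

    Each list is ordered by ascending normalized entropy.
    """
    scored = sorted((normalized_entropy(p), sid) for sid, p in site_probs.items())
    certain = [sid for h, sid in scored if h < threshold]
    uncertain = [sid for h, sid in scored if h >= threshold]
    return certain, uncertain


if __name__ == "__main__":
    # Made-up class probabilities for three candidate stream reaches.
    demo = {
        "reach_A": [0.96, 0.02, 0.02],
        "reach_B": [0.34, 0.33, 0.33],
        "reach_C": [0.70, 0.20, 0.10],
    }
    certain, uncertain = split_sites_by_uncertainty(demo)
    print("high-certainty:", certain)
    print("high-uncertainty:", uncertain)
```

In a human-in-the-loop setting, the high-certainty list would suggest sites representative of known stream types, while the high-uncertainty list would flag candidates that may belong to unrecognized types; an inaccessible site could be replaced by the next candidate in the same list.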
format Article
id doaj-art-b3507fb1f18b4d5b8a5601976d6764c0
institution Kabale University
issn 1569-8432
language English
publishDate 2025-08-01
publisher Elsevier
record_format Article
series International Journal of Applied Earth Observations and Geoinformation
spelling doaj-art-b3507fb1f18b4d5b8a5601976d6764c0
2025-08-20T03:41:00Z
eng
Elsevier
International Journal of Applied Earth Observations and Geoinformation
1569-8432
2025-08-01
142
104742
10.1016/j.jag.2025.104742
Machine learning-guided field site selection for river classification
Zhihao Wang [0]
Gregory Brian Pasternack [1]
Yufang Jin [2]
Costanza Rampini [3]
Serena Alexander [4]
Nikhil Kumar [5]
Rune Storesund [6]
K. Martin Perales [7]
Christopher Lim [8]
Stephanie Moreno [9]
Igor Lacan [10]
[0] Department of Land, Air and Water Resources, 1 Shields Avenue, University of California, Davis, CA, the United States of America; Corresponding author.
[1] Department of Land, Air and Water Resources, 1 Shields Avenue, University of California, Davis, CA, the United States of America
[2] Department of Land, Air and Water Resources, 1 Shields Avenue, University of California, Davis, CA, the United States of America
[3] Department of Environmental Studies, San José State University, One Washington Square, San Jose, CA 95192, the United States of America
[4] Department of Civil and Environmental Engineering, Northeastern University, Boston, MA 02115, the United States of America
[5] Department of Land, Air and Water Resources, 1 Shields Avenue, University of California, Davis, CA, the United States of America
[6] SafeR3, 154 Lawson Road, Kensington, CA 94707, the United States of America
[7] Napa County Resource Conservation District, 1303 Jefferson Street, Suite 500B, Napa, CA 94559, the United States of America
[8] Contra Costa Resource Conservation District, 2001 Clayton Road, Ste. 200, Concord, CA 94520, the United States of America
[9] North Santa Clara Resource Conservation District, 1560 Berger Drive, Room 211, San Jose, CA 95112, the United States of America
[10] University of California Cooperative Extension, San Mateo/San Francisco Counties, 1500 Purissima Creek Road, Half Moon Bay, CA 94019, the United States of America
http://www.sciencedirect.com/science/article/pii/S1569843225003899
River classification
Machine learning
Field site selection
Prior datasets
Uncertainty information
title Machine learning-guided field site selection for river classification
topic River classification
Machine learning
Field site selection
Prior datasets
Uncertainty information
url http://www.sciencedirect.com/science/article/pii/S1569843225003899