Machine learning-guided field site selection for river classification
Sufficient abundance and variety of field site sampling are crucial for obtaining an accurate reach-scale river classification of a regional stream network in support of scientific research and river management. However, many studies still randomly select field sites or only visit accessible streams...
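| Main Authors: | Zhihao Wang, Gregory Brian Pasternack, Yufang Jin, Costanza Rampini, Serena Alexander, Nikhil Kumar, Rune Storesund, K. Martin Perales, Christopher Lim, Stephanie Moreno, Igor Lacan |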
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-08-01 |
| Series: | International Journal of Applied Earth Observations and Geoinformation |
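| Subjects: | River classification; Machine learning; Field site selection; Prior datasets; Uncertainty information |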
| Online Access: | http://www.sciencedirect.com/science/article/pii/S1569843225003899 |
| _version_ | 1849391648183681024 |
|---|---|
| author | Zhihao Wang; Gregory Brian Pasternack; Yufang Jin; Costanza Rampini; Serena Alexander; Nikhil Kumar; Rune Storesund; K. Martin Perales; Christopher Lim; Stephanie Moreno; Igor Lacan |
| author_sort | Zhihao Wang |
| collection | DOAJ |
| description | Sufficient abundance and variety of field site sampling are crucial for obtaining an accurate reach-scale river classification of a regional stream network in support of scientific research and river management. However, many studies still randomly select field sites or only visit accessible streams. This leads to an inadequate exploration of stream characteristics, resulting in incomplete or inaccurate classification. Machine learning has been recognized for discovering and extracting streams’ geomorphic patterns efficiently and accurately from data, but its application in field site sampling design is still in its infancy. This study developed a general and practical field site selection framework by incorporating machine learning in a human-in-the-loop manner. This framework includes three steps: (1) initial field site selection via machine learning from prior datasets, (2) selected field site accessibility evaluation and observation, and (3) additional field site decision and selection via an iterative learning process. In an example application to the San Francisco Bay Area (California, USA), our framework extracted representative geomorphic characteristics of (i) previously known stream types from prior labeled and geospatial datasets and (ii) previously unrecognized stream types based on uncertainty information obtained by machine learning. Moreover, we propose methods for replacing inaccessible sites to ensure sufficient information is retained in the selected field sites. Results revealed clear differences in variable distributions between the 148 high-certainty sites and the 51 high-uncertainty sites, a pattern that was validated by our field surveys. Furthermore, the 41 newly identified high-uncertainty sites were found to be under-represented in the initial surveyed sites, and thus their selection for the next round of field surveys will help fill the important feature gaps left by the initial survey. The feasibility of this framework allows river scientists and land use decision-makers to better understand river patterns and manage spatial planning. |
| format | Article |
| id | doaj-art-b3507fb1f18b4d5b8a5601976d6764c0 |
| institution | Kabale University |
| issn | 1569-8432 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Elsevier |
| record_format | Article |
| series | International Journal of Applied Earth Observations and Geoinformation |
| spelling | doaj-art-b3507fb1f18b4d5b8a5601976d6764c0; 2025-08-20T03:41:00Z; eng; Elsevier; International Journal of Applied Earth Observations and Geoinformation; 1569-8432; 2025-08-01; 142; 104742; 10.1016/j.jag.2025.104742; Machine learning-guided field site selection for river classification; Zhihao Wang (Department of Land, Air and Water Resources, 1 Shields Avenue, University of California, Davis, CA, the United States of America; corresponding author); Gregory Brian Pasternack (Department of Land, Air and Water Resources, 1 Shields Avenue, University of California, Davis, CA, the United States of America); Yufang Jin (Department of Land, Air and Water Resources, 1 Shields Avenue, University of California, Davis, CA, the United States of America); Costanza Rampini (Department of Environmental Studies, San José State University, One Washington Square, San Jose, CA 95192, the United States of America); Serena Alexander (Department of Civil and Environmental Engineering, Northeastern University, Boston, MA 02115, the United States of America); Nikhil Kumar (Department of Land, Air and Water Resources, 1 Shields Avenue, University of California, Davis, CA, the United States of America); Rune Storesund (SafeR3, 154 Lawson Road, Kensington, CA 94707, the United States of America); K. Martin Perales (Napa County Resource Conservation District, 1303 Jefferson Street, Suite 500B, Napa, CA 94559, the United States of America); Christopher Lim (Contra Costa Resource Conservation District, 2001 Clayton Road, Ste. 200, Concord, CA 94520, the United States of America); Stephanie Moreno (North Santa Clara Resource Conservation District, 1560 Berger Drive, Room 211, San Jose, CA 95112, the United States of America); Igor Lacan (University of California Cooperative Extension, San Mateo/San Francisco Counties, 1500 Purissima Creek Road, Half Moon Bay, CA 94019, the United States of America); http://www.sciencedirect.com/science/article/pii/S1569843225003899; River classification; Machine learning; Field site selection; Prior datasets; Uncertainty information |
| title | Machine learning-guided field site selection for river classification |
| topic | River classification; Machine learning; Field site selection; Prior datasets; Uncertainty information |
| url | http://www.sciencedirect.com/science/article/pii/S1569843225003899 |