ChemLotUS: A Benchmark Data Set of Lotic Chemistry Across US River Networks

Bibliographic Details
Main Authors: N. Fernandez, M. J. Cohen, J. W. Jawitz
Format: Article
Language: English
Published: Wiley 2025-05-01
Series: Water Resources Research
Subjects:
Online Access: https://doi.org/10.1029/2024WR039355
collection DOAJ
description Abstract We present a curated water chemistry data set for lotic systems across the contiguous US containing 35,000,000 records from 290,000 locations. These records are spatially joined to high-resolution national hydrography data sets, providing watershed area, network position, and other hydrographic information. Our curation process applies best practices to raw query results from the Water Quality Portal and then assigns network context (position and watershed attributes) to each site from the high-resolution National Hydrography Dataset. The ChemLotUS data set currently includes 11 analytes selected to represent geogenic, biogenic, and anthropogenic processes: calcium, conductivity, pH, total suspended solids, turbidity, dissolved oxygen, total organic carbon, chlorophyll a, nitrate, soluble reactive phosphorus, and total phosphorus. All records from the raw query were modified during curation, most notably by removing duplicated observations, converting units, and aggregating strongly correlated chemical forms. Following curation, 65% of the original records were preserved, with significant reductions from raw to curated data in the means of nine constituents and, more notably, in the standard deviations of all constituents. Of the monitored river reaches, 95% were linked to three or fewer monitoring sites, and sampling patterns reveal a strong measurement bias toward high-order streams. We demonstrate the functionality of ChemLotUS by identifying spatiotemporal patterns in water quality at the CONUS scale, including diurnal variations of dissolved oxygen, pH in headwaters compared to their corresponding river mouths, and total suspended solids as a function of stream order. ChemLotUS enables new opportunities for investigations of continental-scale variation in, and controls on, water quality.
id doaj-art-6be8c3d9728245c99ecf28b972cea69e
institution Kabale University
issn 0043-1397
1944-7973
spelling doaj-art-6be8c3d9728245c99ecf28b972cea69e, 2025-08-20T03:30:56Z
Citation: Water Resources Research, volume 61, issue 5, pages n/a, 2025-05-01, doi 10.1029/2024WR039355
Author affiliations:
N. Fernandez: Soil, Water, and Ecosystem Sciences Department, University of Florida, Gainesville, FL, USA
M. J. Cohen: School of Forest, Fisheries, and Geomatics Sciences, University of Florida, Gainesville, FL, USA
J. W. Jawitz: Soil, Water, and Ecosystem Sciences Department, University of Florida, Gainesville, FL, USA
topic water quality
big data
rivers and streams
stream chemistry