Machine Learning-Based Prediction of Ecosystem-Scale CO<sub>2</sub> Flux Measurements

Bibliographic Details
Main Authors: Jeffrey Uyekawa, John Leland, Darby Bergl, Yujie Liu, Andrew D. Richardson, Benjamin Lucas
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Land
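Subjects: carbon dioxide flux; nature-based climate solutions; machine learning; XGBoost; National Ecological Observatory Network; AmeriFlux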
Online Access: https://www.mdpi.com/2073-445X/14/1/124
author Jeffrey Uyekawa
John Leland
Darby Bergl
Yujie Liu
Andrew D. Richardson
Benjamin Lucas
collection DOAJ
description AmeriFlux is a network of hundreds of sites across the contiguous United States providing tower-based ecosystem-scale carbon dioxide flux measurements at 30 min temporal resolution. While geographically wide-ranging, over its existence the network has suffered from multiple issues, including towers regularly ceasing operation for extended periods and a lack of standardization of measurements between sites. In this study, we use machine learning algorithms to predict CO<sub>2</sub> flux measurements at NEON sites (a subset of AmeriFlux sites), creating a model to gap-fill measurements when sites are down or replace measurements when they are incorrect. Machine learning algorithms also have the ability to generalize to new sites, potentially even those without a flux tower. We compared the performance of seven machine learning algorithms using 35 environmental drivers and site-specific variables as predictors. We found that Extreme Gradient Boosting (XGBoost) consistently produced the most accurate predictions (Root Mean Squared Error of 1.81 μmol m<sup>−2</sup> s<sup>−1</sup>, R<sup>2</sup> of 0.86). The model showed excellent performance when tested on sites that are ecologically similar to other sites (the Mid-Atlantic, New England, and the Rocky Mountains), but poorer performance at sites with fewer ecological similarities to other sites in the data (Pacific Northwest, Florida, and Puerto Rico). The results show strong potential for machine learning-based models to make more skillful predictions than state-of-the-art process-based models, being able to estimate the multi-year mean carbon balance to within an error of ±50 g C m<sup>−2</sup> y<sup>−1</sup> for 29 of our 44 test sites. These results have significant implications for being able to accurately predict the carbon flux or gap-fill an extended outage at any AmeriFlux site, and for being able to quantify carbon flux in support of natural climate solutions.
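The description quotes accuracy in two different units: a half-hourly RMSE in μmol m⁻² s⁻¹ and a multi-year mean carbon balance in g C m⁻² y⁻¹. The sketch below is not taken from the article. It only illustrates, under stated assumptions, how the two units relate and how RMSE and R² are conventionally computed. The function names, the 365-day year, and the use of the standard coefficient-of-determination formula are all assumptions made for illustration; the article may define or compute these quantities differently.

```python
import math

MOLAR_MASS_C = 12.011               # g of carbon per mol of CO2 (one C atom per molecule)
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a non-leap year


def annual_carbon_balance(mean_flux_umol_m2_s):
    """Convert a mean CO2 flux (umol m^-2 s^-1) to g C m^-2 y^-1."""
    mol_per_m2_s = mean_flux_umol_m2_s * 1e-6  # umol -> mol
    return mol_per_m2_s * MOLAR_MASS_C * SECONDS_PER_YEAR


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Mean flux bias that corresponds to a 50 g C m^-2 y^-1 error in the annual balance.
    per_unit_flux = annual_carbon_balance(1.0)
    print(f"1 umol m^-2 s^-1 sustained for a year = {per_unit_flux:.1f} g C m^-2 y^-1")
    print(f"50 g C m^-2 y^-1 = {50.0 / per_unit_flux:.3f} umol m^-2 s^-1 mean bias")
```

Under these assumptions, a ±50 g C m⁻² y⁻¹ tolerance corresponds to a mean flux bias of roughly 0.13 μmol m⁻² s⁻¹, so the annual-balance criterion constrains the average error of the predictions rather than the half-hourly scatter that the RMSE summarizes.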
format Article
id doaj-art-e83a704b1bf943669f33a7505e9a6f78
institution Kabale University
issn 2073-445X
language English
publishDate 2025-01-01
publisher MDPI AG
record_format Article
series Land
spelling doaj-art-e83a704b1bf943669f33a7505e9a6f78
2025-01-24T13:38:00Z
eng
MDPI AG
Land, ISSN 2073-445X
2025-01-01, vol. 14, no. 1, article 124
doi:10.3390/land14010124
Machine Learning-Based Prediction of Ecosystem-Scale CO<sub>2</sub> Flux Measurements
Jeffrey Uyekawa (Department of Mathematics and Statistics, Northern Arizona University, Flagstaff, AZ 86011, USA)
John Leland (Department of Mathematics and Statistics, Northern Arizona University, Flagstaff, AZ 86011, USA)
Darby Bergl (Center for Ecosystem Science and Society, Northern Arizona University, Flagstaff, AZ 86011, USA)
Yujie Liu (Center for Ecosystem Science and Society, Northern Arizona University, Flagstaff, AZ 86011, USA)
Andrew D. Richardson (Center for Ecosystem Science and Society, Northern Arizona University, Flagstaff, AZ 86011, USA)
Benjamin Lucas (Department of Mathematics and Statistics, Northern Arizona University, Flagstaff, AZ 86011, USA)
https://www.mdpi.com/2073-445X/14/1/124
carbon dioxide flux; nature-based climate solutions; machine learning; XGBoost; National Ecological Observatory Network; AmeriFlux
title Machine Learning-Based Prediction of Ecosystem-Scale CO<sub>2</sub> Flux Measurements
topic carbon dioxide flux
nature-based climate solutions
machine learning
XGBoost
National Ecological Observatory Network
AmeriFlux
url https://www.mdpi.com/2073-445X/14/1/124