Prediction of High-ozone Events Using GAM, SMOTE, and Tail Dependence Approaches in Texas (2005–2019)

Abstract: We test three methods for ozone prediction in the El Paso (ELP) and Houston-Galveston-Brazoria (HGB) regions of Texas from 2005 to 2019: (1) a Generalized Additive Model (GAM) approach; (2) a GAM approach with the addition of the Synthetic Minority Over-sampling TEchnique (SMOTE); and (3) a ta...

Bibliographic Details
Main Authors: Benjamin Brown-Steiner, Xiong Zhou, Matthew J. Alvarado, Brook T. Russell
Format: Article
Language: English
Published: Springer 2021-07-01
Series: Aerosol and Air Quality Research
Online Access: https://doi.org/10.4209/aaqr.210077
author Benjamin Brown-Steiner
Xiong Zhou
Matthew J. Alvarado
Brook T. Russell
collection DOAJ
description Abstract: We test three methods for ozone prediction in the El Paso (ELP) and Houston-Galveston-Brazoria (HGB) regions of Texas from 2005 to 2019: (1) a Generalized Additive Model (GAM) approach; (2) a GAM approach with the addition of the Synthetic Minority Over-sampling TEchnique (SMOTE); and (3) a tail dependence modeling approach based on extreme value theory (EVT). We also compare the feature selection capabilities of the tail dependence approach to other feature selection methods. We find that the GAM+SMOTE model outperformed the GAM-only model on the root mean square error metric when predicting ozone values, particularly for above-threshold ozone values, which may be particularly useful for extreme ozone event prediction. In addition, we find that the improvement in above-threshold MDA8 O3 prediction for the GAM+SMOTE method tends to come at the cost of below-threshold prediction, which is particularly important if MDA8 O3 trends are of interest. We also find that the tail dependence approach is capable of predicting extreme ozone events, but that algorithmic stability and configuration complexity can make this approach difficult to operationalize on a broad scale, and that the selection of the threshold needs to be carefully considered. Finally, feature selection via the tail dependence method performs comparably to other forms of machine learning-based feature selection, and we find that there are multiple parameter sets that can predict MDA8 O3 with equal success.
format Article
id doaj-art-36ce98c8a82349d2ba655a5719df438d
institution Kabale University
issn 1680-8584
2071-1409
language English
publishDate 2021-07-01
publisher Springer
record_format Article
series Aerosol and Air Quality Research
spelling doaj-art-36ce98c8a82349d2ba655a5719df438d 2025-02-09T12:21:18Z eng Springer Aerosol and Air Quality Research 1680-8584 2071-1409 2021-07-01 21 10 1 13 10.4209/aaqr.210077
Prediction of High-ozone Events Using GAM, SMOTE, and Tail Dependence Approaches in Texas (2005–2019)
Benjamin Brown-Steiner (0), Xiong Zhou (1), Matthew J. Alvarado (2), Brook T. Russell (3)
(0) Atmospheric and Environmental Research (AER)
(1) Atmospheric and Environmental Research (AER)
(2) Atmospheric and Environmental Research (AER)
(3) School of Mathematical and Statistical Sciences, Clemson University
https://doi.org/10.4209/aaqr.210077
GAM; SMOTE; Tail dependence; Ozone prediction; Feature selection
title Prediction of High-ozone Events Using GAM, SMOTE, and Tail Dependence Approaches in Texas (2005–2019)
topic GAM
SMOTE
Tail dependence
Ozone prediction
Feature selection
url https://doi.org/10.4209/aaqr.210077