Hourly surface nitrogen dioxide retrieval from GEMS tropospheric vertical column densities: benefit of using time-contiguous input features for machine learning models

Bibliographic Details
Main Authors: J. Gödeke, A. Richter, K. Lange, P. Maaß, H. Hong, H. Lee, J. Park
Format: Article
Language: English
Published: Copernicus Publications 2025-08-01
Series: Atmospheric Measurement Techniques
Online Access: https://amt.copernicus.org/articles/18/3747/2025/amt-18-3747-2025.pdf
author J. Gödeke
A. Richter
K. Lange
P. Maaß
H. Hong
H. Lee
J. Park
collection DOAJ
description Launched in 2020, the Korean Geostationary Environmental Monitoring Spectrometer (GEMS) is the first geostationary satellite mission for observing trace gas concentrations in the Earth's atmosphere. Observations are made over Asia. Geostationary orbits allow for hourly measurements, which lead to a much higher temporal resolution compared to daily measurements taken from low-Earth orbits, such as by the TROPOspheric Monitoring Instrument (TROPOMI) or the Ozone Monitoring Instrument (OMI). This work estimates the hourly concentration of surface nitrogen dioxide (NO₂) from GEMS tropospheric NO₂ vertical column densities (VCDs) and additional meteorological features, which serve as inputs for random forests and linear regression models. With several measurements per day, machine learning models can use not only current observations but also those from previous hours as inputs. We demonstrate that using these time-contiguous inputs leads to reliable improvements regarding all considered performance measures, such as Pearson correlation or mean square error. For random forests, the average performance gains are between 4.5 % and 7.5 %, depending on the performance measure. For linear regression models, average performance gains are between 7 % and 15 %. For performance evaluation, spatial cross-validation with surface in situ measurements is used to measure how well the trained models perform at locations where they have not received any training data. In other words, we inspect the models' ability to generalize to unseen locations. Additionally, we investigate the influence of tropospheric NO₂ VCDs on the performance. The region of our study is South Korea.
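The two ideas named in the description, time-contiguous inputs and spatial cross-validation, can be illustrated with a minimal sketch. This is not the authors' code: the function names `add_lagged_features` and `spatial_folds`, the record keys `station`, `hour` and `vcd`, and the use of a single VCD feature (the article also uses meteorological features) are all assumptions made for illustration. No model is fitted here; the sketch only shows how the inputs and the station-wise splits could be built.

```python
from collections import defaultdict


def add_lagged_features(records, n_lags):
    """Build time-contiguous inputs from hourly records.

    records: list of dicts with hypothetical keys 'station', 'hour'
    (an integer hour index) and 'vcd'.
    Returns tuples (station, hour, [vcd_t, vcd_t-1, ..., vcd_t-n_lags]).
    A sample is kept only if all n_lags previous hours exist for the
    same station, so gaps (for example night or missing retrievals)
    drop the affected samples instead of being filled in.
    """
    by_station = defaultdict(dict)
    for r in records:
        by_station[r["station"]][r["hour"]] = r["vcd"]

    rows = []
    for station, series in by_station.items():
        for hour in sorted(series):
            window = [hour - k for k in range(n_lags + 1)]
            if all(h in series for h in window):
                rows.append((station, hour, [series[h] for h in window]))
    return rows


def spatial_folds(rows, n_folds):
    """Yield (train, test) splits in which no station is in both parts.

    Stations are assigned to folds round-robin in sorted order; every
    sample of a held-out station goes to the test part, so each fold
    evaluates on locations that contributed no training data.
    """
    stations = sorted({station for station, _, _ in rows})
    for i in range(n_folds):
        held_out = set(stations[i::n_folds])
        train = [r for r in rows if r[0] not in held_out]
        test = [r for r in rows if r[0] in held_out]
        yield train, test
```

In a fuller pipeline, each `train` part would be used to fit a regressor (random forest or linear regression in the article) and each `test` part to compute measures such as Pearson correlation or mean square error; comparing `n_lags = 0` against `n_lags > 0` would then mirror the comparison the description reports.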
format Article
id doaj-art-475e4e5465ab4c3298c77ea4fb69c8f7
institution Kabale University
issn 1867-1381
1867-8548
language English
publishDate 2025-08-01
publisher Copernicus Publications
record_format Article
series Atmospheric Measurement Techniques
spelling doaj-art-475e4e5465ab4c3298c77ea4fb69c8f7
2025-08-20T03:41:54Z
eng
Copernicus Publications
Atmospheric Measurement Techniques, ISSN 1867-1381, 1867-8548
2025-08-01, vol. 18, pp. 3747-3779
doi:10.5194/amt-18-3747-2025
J. Gödeke: Center for Industrial Mathematics, University of Bremen, Bremen, Germany
A. Richter: Institute of Environmental Physics, University of Bremen, Bremen, Germany
K. Lange: Institute of Environmental Physics, University of Bremen, Bremen, Germany
P. Maaß: Center for Industrial Mathematics, University of Bremen, Bremen, Germany
H. Hong: Environmental Satellite Center, National Institute of Environmental Research, Incheon, Republic of Korea
H. Lee: Division of Earth Environmental System Science, Major of Spatial Information Engineering, Pukyong National University, Busan, Republic of Korea
J. Park: Division of Earth Environmental System Science, Major of Spatial Information Engineering, Pukyong National University, Busan, Republic of Korea
title Hourly surface nitrogen dioxide retrieval from GEMS tropospheric vertical column densities: benefit of using time-contiguous input features for machine learning models
url https://amt.copernicus.org/articles/18/3747/2025/amt-18-3747-2025.pdf