Using feature importance as an exploratory data analysis tool on Earth system models

Bibliographic Details
Main Authors: D. Ries, K. Goode, K. McClernon, B. Hillman
Format: Article
Language: English
Published: Copernicus Publications 2025-02-01
Series: Geoscientific Model Development
Online Access: https://gmd.copernicus.org/articles/18/1041/2025/gmd-18-1041-2025.pdf
author D. Ries
K. Goode
K. McClernon
B. Hillman
collection DOAJ
description <p>Machine learning (ML) models are commonly used to generate predictions, but these models can also support the discovery of new science. Generating accurate predictions requires that a model capture the structure of the underlying data. If the structure is properly extracted, ML could be a useful exploratory and evidential tool. In this paper, we present a case study that demonstrates the use of ML for exploratory data analysis (EDA) in the climate space. We apply the ML explainability method of spatiotemporal zeroed feature importance (stZFI) to understand how climate-variable associations evolve over space and time. Our analyses focus on data from ensembles of Earth system models (ESMs), which provide data on different climate states and conditions. We elect to work with ESM ensembles since they allow us to compare feature importance across alternative scenarios not available with observed data. The ensembles also account for natural variability so that we can distinguish between signal and noise due to natural climate variability when computing feature importance. The use of perturbed initial condition ensembles introduces variability mimicking the natural variability in the atmosphere; thus, the signals that emerge from feature importance (FI) can be evaluated against the natural variability in the climate system. For our analyses, we consider the 1991 volcanic eruption of Mount Pinatubo, which was a large stratospheric aerosol injection. We explore the climate pathway associated with the eruption from aerosols to radiation to temperature at both the near-surface and stratospheric levels. In addition to applying the method to data generated from two different ESMs, we apply stZFI to reanalysis data to compare the associations identified by stZFI. We show how stZFI tracks the importance of aerosol optical depth for forecasting temperatures over time. This case study illustrates the usefulness of an ML tool (stZFI) for EDA on a well-studied climate exemplar.</p>
format Article
id doaj-art-c5e73cf7f04f430b8b3f0bd97eeaaa63
institution DOAJ
issn 1991-959X
1991-9603
language English
publishDate 2025-02-01
publisher Copernicus Publications
record_format Article
series Geoscientific Model Development
spelling doaj-art-c5e73cf7f04f430b8b3f0bd97eeaaa63; 2025-08-20T03:13:08Z; eng; Copernicus Publications; Geoscientific Model Development; 1991-959X; 1991-9603; 2025-02-01; 18; 1041; 1065; 10.5194/gmd-18-1041-2025; Using feature importance as an exploratory data analysis tool on Earth system models; D. Ries, K. Goode, K. McClernon, B. Hillman (all: Sandia National Laboratories, Albuquerque, NM, United States of America); https://gmd.copernicus.org/articles/18/1041/2025/gmd-18-1041-2025.pdf
title Using feature importance as an exploratory data analysis tool on Earth system models
url https://gmd.copernicus.org/articles/18/1041/2025/gmd-18-1041-2025.pdf