DeepAir: deep learning and satellite imagery to estimate high-resolution PM2.5 at scale

Bibliographic Details
Main Authors: Wenxuan Guo, Zhaoping Hu, Ling Jin, Yanyan Xu, Marta C Gonzalez
Format: Article
Language: English
Published: IOP Publishing 2025-01-01
Series: Machine Learning: Science and Technology
Online Access: https://doi.org/10.1088/2632-2153/adb67a
Description
Summary: Air pollution, specifically PM2.5, has become a significant global concern owing to its detrimental impacts on public health. Even so, high-resolution monitoring of air pollution remains a challenge on a global scale. To address this, machine learning (ML) techniques have been used to infer the concentration of air pollutants at a fine scale. In this study, we propose DeepAir, a learning framework for estimating PM2.5 concentrations at a fine scale from sparsely distributed observations. DeepAir integrates a pre-trained convolutional neural network with the LightGBM method. The framework estimates the PM2.5 concentration of a given patch by combining geographical information, meteorological conditions, and satellite observations. We select California as the focal region and train the model with data from 2014 to 2017 provided by 130 PM2.5 observation stations in the state. Once trained, the model can be applied to estimate daily PM2.5 concentrations at 1 km resolution across California. Our methodology incorporates meteorological variables, with particular emphasis on wildfire propagation, and accounts for the interplay among features. To assess the model, we use 10-fold cross-validation, which indicates that it outperforms traditional ML and standalone deep learning methods.
ISSN: 2632-2153
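
The summary mentions 10-fold cross-validation as the evaluation protocol. The following is a minimal, generic sketch of that protocol in plain Python, not the authors' implementation. The function names (`k_fold_indices`, `cross_validate`, `fit_mean`, `predict_mean`), the random shuffling, the RMSE metric, and the mean-predictor stand-in model are all assumptions made for illustration. The record does not say how the folds were formed or which metrics were reported.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives fold sizes that differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validate(features, targets, fit, predict, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, return per-fold RMSE."""
    folds = k_fold_indices(len(targets), k, seed)
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(targets)) if i not in held]
        model = fit([features[i] for i in train], [targets[i] for i in train])
        squared_errors = [
            (predict(model, features[i]) - targets[i]) ** 2 for i in held_out
        ]
        scores.append(mean(squared_errors) ** 0.5)
    return scores


# Hypothetical stand-in model: always predicts the training-set mean.
# In practice this pair would be replaced by a real regressor's fit/predict.
def fit_mean(train_x, train_y):
    return mean(train_y)


def predict_mean(model, x):
    return model
```

Passing any other `fit`/`predict` pair to `cross_validate` compares models on identical folds, which is what makes the per-fold scores comparable across methods.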