Attention mechanism augmented random forest model for multiple air pollutants estimation


Bibliographic Details
Main Authors: Xinyu Yu, Man Sing Wong, Kwon-Ho Lee
Format: Article
Language: English
Published: Elsevier 2025-07-01
Series: International Journal of Applied Earth Observations and Geoinformation
Online Access: http://www.sciencedirect.com/science/article/pii/S1569843225003085
Description
Summary: Machine learning techniques based on satellite observations facilitate the derivation of near-surface air pollutant concentrations. However, most previous studies focused on estimating the concentration of a single air pollutant, ignoring the interactions and dependencies between different air pollutants. Therefore, we proposed a Multiple Pollutants simultaneous estimation method based on an Attention mechanism augmented Random Forest model (MPA-RF), covering PM2.5, PM10, O3, NO2, CO and SO2. Specifically, a self-attention mechanism was first incorporated into the multi-output random forest to emphasize pertinent input features during model training. In addition, multi-head self-attention was integrated to derive the interactions and temporal dependencies of different air pollutants from historical data. Satellite observations from the Advanced Himawari Imager (AHI) over three major urban agglomerations in China were extracted to demonstrate model performance under sample- and site-based cross-validation schemes. The results show that the proposed model can estimate the six air pollutants simultaneously with high accuracy, with R² ranging from 0.74 to 0.93. Benefiting from the consideration of interactions and dependencies between different air pollutants, the proposed model outperforms single-task contrast models, with an R² improvement ranging from 9% to 26%. Moreover, the derived seamless estimations offer a basis for analyzing the spatio-temporal patterns and dynamic evolution of air pollution in a time-saving and efficient manner.
ISSN: 1569-8432
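
Note on the validation schemes: the summary mentions sample-based and site-based cross-validation without giving details. The sketch below illustrates the general difference between the two, using only the Python standard library. It is not the authors' code. The function names, the fold count `k`, and the toy site identifiers are all assumptions made for illustration. In the sample-based scheme individual records are shuffled into folds, so a monitoring site can appear in both training and test data. In the site-based scheme whole sites are assigned to folds, so each test fold contains only sites the model has not seen.

```python
import random


def sample_based_folds(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k folds (sites may be shared)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def site_based_folds(site_ids, k=10, seed=0):
    """Assign whole sites to folds so each test fold holds only unseen sites."""
    sites = sorted(set(site_ids))
    random.Random(seed).shuffle(sites)
    fold_of_site = {site: i % k for i, site in enumerate(sites)}
    folds = [[] for _ in range(k)]
    for i, site in enumerate(site_ids):
        folds[fold_of_site[site]].append(i)
    return folds


# Hypothetical toy data: 6 monitoring sites with 4 samples each.
site_ids = [f"site{j}" for j in range(6) for _ in range(4)]

sample_folds = sample_based_folds(len(site_ids), k=3)
site_folds = site_based_folds(site_ids, k=3)

# For each site-based fold, list the sites it contains.
fold_sites = [{site_ids[i] for i in fold} for fold in site_folds]
```

Under these assumptions, site-based scores probe how well a model generalizes to locations without monitors, while sample-based scores mostly reflect overall fit.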