Time series AQI forecasting using Kalman-integrated Bi-GRU and Chi-square divergence optimization

Bibliographic Details
Main Authors: Narmeen Fatima, Samia Nawaz Yousafzai, Nadhem Nemri, Hadeel Alsolai, Shouki A. Ebad, Shaymaa Sorour, Yeonghyeon Gu, Muhammad Syafrudin, Norma Latif Fitriyani
Format: Article
Language: English
Published: Nature Portfolio 2025-08-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-12422-8
Description
Summary: Air pollution has become a pressing global concern, demanding accurate forecasting systems to safeguard public health. Existing AQI prediction models often falter due to missing data, high variability, and limited ability to handle distributional uncertainty. This study introduces a novel deep learning framework that integrates Kalman Attention with a Bi-Directional Gated Recurrent Unit (Bi-GRU) for robust AQI time-series forecasting. Unlike conventional attention mechanisms, Kalman Attention dynamically adjusts to data uncertainty, enhancing temporal feature weighting. Additionally, we incorporate a Chi-square Divergence-based regularization term into the loss function to explicitly minimize the distributional mismatch between predicted and actual pollutant levels, a contribution not explored in prior AQI models. Missing values are imputed using a pollutant-specific ARIMA model to preserve time-dependent trends. The proposed system is evaluated using real-world data from the U.S. Environmental Protection Agency (2022-2024) across six major pollutants (CO, NO$_2$, SO$_2$, O$_3$, PM$_{10}$, PM$_{2.5}$) in the Denver-Aurora-Lakewood region. Experimental results demonstrate significant improvements over baseline models (LSTM, CNN-LSTM), achieving an $R^2$ of 0.96794, MSE of $4.11 \times 10^{-7}$, and MAE of 0.000423. This work advances AQI forecasting by addressing uncertainty, distributional alignment, and missing data within a unified architecture, providing a scalable solution for environmental monitoring and policy support.
ISSN: 2045-2322
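
Illustrative note: the summary mentions a Chi-square Divergence-based regularization term added to the loss, but this record does not give its exact form. The sketch below is one plausible reading, not the authors' implementation. It assumes the Pearson form, chi2(P || Q) = sum_i (p_i - q_i)^2 / q_i, computed between smoothed histograms of the actual and predicted values, and added to a mean squared error with a weight. The function names, the bin count, the smoothing constant `eps`, and the weight `lam` are all hypothetical. A hard histogram is not differentiable, so a real training loop would likely need a soft-binning variant.

```python
from typing import List, Sequence


def histogram(values: Sequence[float], lo: float, hi: float,
              bins: int, eps: float = 1e-6) -> List[float]:
    """Normalised histogram with additive smoothing (eps) so no bin is zero."""
    counts = [0.0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that v == hi falls into the last bin.
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1.0
    total = sum(counts) + eps * bins
    return [(c + eps) / total for c in counts]


def chi_square_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Pearson chi-square divergence between discrete distributions p and q."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    return sum((t - y) ** 2 for t, y in zip(y_true, y_pred)) / len(y_true)


def regularized_loss(y_true: Sequence[float], y_pred: Sequence[float],
                     lam: float = 0.1, bins: int = 10) -> float:
    """MSE plus lam times the chi-square divergence of the value histograms.

    Assumption: actual values play the role of P and predictions of Q.
    """
    lo = min(min(y_true), min(y_pred))
    hi = max(max(y_true), max(y_pred))
    if hi == lo:  # degenerate range: widen it to avoid a zero bin width
        hi = lo + 1.0
    p = histogram(y_true, lo, hi, bins)
    q = histogram(y_pred, lo, hi, bins)
    return mse(y_true, y_pred) + lam * chi_square_divergence(p, q)
```

Under these assumptions, a prediction that matches the targets pointwise incurs no penalty, while a prediction that collapses to a single value is penalized both by the MSE term and by the mismatch between the two histograms.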