Air quality prediction based on factor analysis combined with Transformer and CNN-BILSTM-ATTENTION models

Bibliographic Details
Main Authors: Shuyuan Liu, Yang Hu
Format: Article
Language: English
Published: Nature Portfolio 2025-06-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-03780-4
Description
Summary: This study presents an air quality prediction framework that integrates factor analysis with deep learning models to predict the original variables. Using data from Beijing’s Tiantan station, factor analysis is applied to reduce dimensionality. The factor score matrix is then embedded into a Transformer model, which uses self-attention to capture long-term dependencies, an advance over traditional LSTM methods. The hybrid framework outperforms these methods and surpasses models such as Transformer, N-BEATS, and Informer combined with principal component analysis and factor analysis. Residual analysis and $R^{2}$ evaluation confirm its accuracy and stability: the maximum likelihood factor analysis Transformer model achieves an MSE of 0.1619 and an $R^{2}$ of 0.8520 for factor 1, and an MSE of 0.0476 and an $R^{2}$ of 0.9563 for factor 2. The study also introduces a CNN-BILSTM-ATTENTION model with discrete wavelet transform, which improves predictive performance by extracting local features, capturing temporal dependencies, and emphasizing key time steps. Its MSE is 0.0405, with all $R^{2}$ values above 0.94. The study emphasizes the integration of factor analysis with deep learning, which turns causal relationships into conditions for the predictive models. Future plans include optimizing factor extraction, exploring external data sources, and developing more efficient deep learning architectures.
ISSN: 2045-2322
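
Note on the reported metrics: the summary evaluates each model by MSE and R². As a minimal sketch of how these two quantities are conventionally computed (this is not the authors' code), the Python below uses their standard definitions. The function names `mse` and `r2` and the sample values are hypothetical and chosen for illustration only; they are not taken from the article.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    residuals = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(residuals) / len(residuals)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted factor scores (illustrative only).
observed = [0.0, 1.0, 2.0, 3.0]
predicted = [0.0, 1.0, 2.0, 2.0]

print("MSE:", mse(observed, predicted))
print("R^2:", r2(observed, predicted))
```

A lower MSE and an R² closer to 1 both indicate predictions that track the observed series more closely, which is the sense in which the summary compares the two factors.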