High-accuracy PM2.5 prediction via mutual information filtering and Bayesian-Optimized Spatio-Temporal Convolutional Networks

Abstract Air pollution, particularly fine particulate matter (PM2.5), poses severe threats to human health and ecological sustainability, rendering accurate prediction of PM2.5 concentrations imperative for proactive public health interventions and evidence-based policy-making. While deep learning m...

Bibliographic Details
Main Author: Wanyu Wang
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
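Series: Scientific Reports
Subjects: PM2.5 prediction; Air pollution; Redundant information filtering; Deep learning; Bayesian optimization
Online Access: https://doi.org/10.1038/s41598-025-08896-1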
author Wanyu Wang
collection DOAJ
description Abstract Air pollution, particularly fine particulate matter (PM2.5), poses severe threats to human health and ecological sustainability, rendering accurate prediction of PM2.5 concentrations imperative for proactive public health interventions and evidence-based policy-making. While deep learning models like LSTM, GRU, and CNN are widely adopted for their robust modeling capacities, the direct use of raw, unfiltered data introduces feature redundancy. This not only prolongs training duration and elevates overfitting risks but also degrades prediction accuracy by complicating model convergence. To address these challenges, this paper presents an advanced PM2.5 prediction framework incorporating three key innovations. First, in contrast to conventional static threshold-based feature selection, a dynamic framework integrating Mutual Information (MI) and Adaptive Information Distance (AID) is proposed. By quantifying nonlinear feature correlations (via MI) and spatial redundancies (via AID), the framework adaptively prunes redundant inputs, thereby enhancing the information utility of the feature space for subsequent modeling. Second, a Bayesian optimizer guided by multimodal Gaussian distributions is designed to overcome the limitation of traditional unimodal optimization, which often stagnates at local optima. This optimizer explores multiple potential optima in the parameter space concurrently, enabling efficient global hyperparameter search and improving model robustness to noise. Third, an Information Screening-Enhanced Spatiotemporal Convolutional Network (MIBO-STCN) is introduced. Building on our prior STCN framework that integrates causal convolution and adaptive receptive fields, this architecture embeds an information screening layer to achieve synergistic optimization of spatiotemporal dependency modeling and redundancy reduction. Through this synergistic data preprocessing and optimization pipeline, the framework substantially boosts the prediction accuracy and accelerates convergence of the STCN model. Experimental results demonstrate that the proposed approach outperforms state-of-the-art models, exhibiting robust generalization capabilities in PM2.5 concentration forecasting across diverse scenarios.
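The feature-screening step described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not define Adaptive Information Distance, so the sketch substitutes a normalised mutual information score as a hypothetical redundancy measure, and it uses fixed thresholds rather than the adaptive ones the paper proposes. All function names, thresholds, and the synthetic data are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def discretize(values, bins=8):
    """Map real values to equal-width bin indices 0..bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:  # a constant column carries no information
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of I(X;Y) in nats."""
    xb, yb = discretize(x, bins), discretize(y, bins)
    n = len(xb)
    px, py, pxy = Counter(xb), Counter(yb), Counter(zip(xb, yb))
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def redundancy(x, y, bins=8):
    """Hypothetical stand-in for AID: MI divided by the smaller binned entropy."""
    # I(X;X) of the binned variable equals its entropy H(X).
    h = min(mutual_information(x, x, bins), mutual_information(y, y, bins))
    return mutual_information(x, y, bins) / h if h > 0 else 0.0


def select_features(features, target, min_relevance=0.05, max_redundancy=0.5):
    """Greedy filter: rank by MI with the target, skip near-duplicates."""
    relevance = {name: mutual_information(col, target)
                 for name, col in features.items()}
    selected = []
    for name in sorted(relevance, key=relevance.get, reverse=True):
        if relevance[name] < min_relevance:
            continue  # too little information about the target
        if all(redundancy(features[name], features[kept]) <= max_redundancy
               for kept in selected):
            selected.append(name)
    return selected


# Synthetic data (not from the paper): two informative inputs, one
# near-copy of the first, and one input unrelated to the target.
random.seed(0)
n = 2000
temperature = [random.gauss(0, 1) for _ in range(n)]
humidity = [random.gauss(0, 1) for _ in range(n)]
features = {
    "temperature": temperature,
    "temperature_dup": [t + random.gauss(0, 0.02) for t in temperature],
    "humidity": humidity,
    "noise": [random.gauss(0, 1) for _ in range(n)],
}
pm25 = [t + h + random.gauss(0, 0.1) for t, h in zip(temperature, humidity)]

selected = select_features(features, pm25)
print(selected)
```

On this synthetic data the filter should keep humidity plus one of the two temperature columns and drop the unrelated input. Which temperature column survives depends on small differences in the estimated relevance.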
format Article
id doaj-art-99eac9ebc4f543c393fc96f8ee7f4de7
institution Kabale University
issn 2045-2322
language English
publishDate 2025-07-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-99eac9ebc4f543c393fc96f8ee7f4de7 | 2025-08-20T04:01:34Z | eng | Nature Portfolio | Scientific Reports | 2045-2322 | 2025-07-01 | 151114 | 10.1038/s41598-025-08896-1 | High-accuracy PM2.5 prediction via mutual information filtering and Bayesian-Optimized Spatio-Temporal Convolutional Networks | Wanyu Wang 0 | Shanghai University of Engineering Science | https://doi.org/10.1038/s41598-025-08896-1 | PM2.5 prediction | Air pollution | Redundant information filtering | Deep learning | Bayesian optimization
title High-accuracy PM2.5 prediction via mutual information filtering and Bayesian-Optimized Spatio-Temporal Convolutional Networks
topic PM2.5 prediction
Air pollution
Redundant information filtering
Deep learning
Bayesian optimization
url https://doi.org/10.1038/s41598-025-08896-1