Novel hybrid data-driven modeling based on feature space reconstruction and multihead self-attention gated recurrent unit: applied to PM2.5 concentrations prediction
Abstract In response to the problem of neglecting the periodic and global characteristics of sequence data when predicting PM2.5 concentrations via machine learning models, a PM2.5 concentrations prediction model based on feature space reconstruction and multihead self-attention gated recurrent unit...
| Main Authors: | Xiaoxin Yue, Yulong Bai, Qinghe Yu, Lin Ding, Wei Song, Wenhui Liu, Huhu Ren, Qi Song |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-05-01 |
| Series: | Scientific Reports |
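| Subjects: | Machine learning; Feature space reconstruction; Multihead Self-attention; Gated recurrent unit; PM2.5 concentration prediction |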
| Online Access: | https://doi.org/10.1038/s41598-025-00911-9 |
| _version_ | 1849309856035504128 |
|---|---|
| author | Xiaoxin Yue, Yulong Bai, Qinghe Yu, Lin Ding, Wei Song, Wenhui Liu, Huhu Ren, Qi Song |
| author_sort | Xiaoxin Yue |
| collection | DOAJ |
| description | Abstract: Machine learning models for PM2.5 concentration prediction often neglect the periodic and global characteristics of sequence data. To address this, a PM2.5 concentration prediction model based on feature space reconstruction and a multihead self-attention gated recurrent unit (FSR-MSAGRU) is proposed in this study. First, the raw sequence data are subjected to frequency spectrum analysis to determine the period of the PM2.5 sequence. Next, the seasonal-trend decomposition procedure based on loess (STL) is applied to capture the periodicity and trend information in the sequence. The feature space is then reconstructed from the raw PM2.5 sequence together with the decomposed seasonal, trend, and residual components. Finally, the reconstructed features are fed into a multihead self-attention gated recurrent unit (MSAGRU), which can capture global feature information, to predict PM2.5 concentrations. Across 6 distinct experimental datasets, the proposed FSR-MSAGRU model achieved a PCC exceeding 0.98 and reduced the SMAPE error metric by at least 68% relative to the GRU model. Comparative experiments with 13 reference models show that the proposed model delivers better prediction performance and stronger generalization ability. |
| format | Article |
| id | doaj-art-0647c6eeea3944ea99755ff0671a111c |
| institution | Kabale University |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-0647c6eeea3944ea99755ff0671a111c; 2025-08-20T03:53:57Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-05-01; 151121; 10.1038/s41598-025-00911-9; Xiaoxin Yue, Yulong Bai, Qinghe Yu, Lin Ding, Wei Song, Wenhui Liu, Huhu Ren, Qi Song (all: College of Physics and Electrical Engineering, Northwest Normal University); https://doi.org/10.1038/s41598-025-00911-9 |
| title | Novel hybrid data-driven modeling based on feature space reconstruction and multihead self-attention gated recurrent unit: applied to PM2.5 concentrations prediction |
| topic | Machine learning; Feature space reconstruction; Multihead Self-attention; Gated recurrent unit; PM2.5 concentration prediction |
| url | https://doi.org/10.1038/s41598-025-00911-9 |
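
The abstract describes a preprocessing pipeline: find the period from the frequency spectrum, decompose the series into seasonal, trend, and residual components, and stack these with the raw series as model inputs. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a plain discrete Fourier transform for period detection and swaps in a classical moving-average decomposition as a simple stand-in for STL. All function names (`dominant_period`, `decompose`, `reconstruct_features`) and the synthetic series are hypothetical, and the MSAGRU predictor itself is not shown.

```python
import cmath
import math


def dominant_period(series):
    """Estimate the period as n / k for the strongest non-zero DFT bin k."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    best_k, best_power = 1, -1.0
    # Skip k = 0 (the mean) and scan up to the Nyquist frequency.
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return round(n / best_k)


def decompose(series, period):
    """Classical additive decomposition (assumption: a stand-in for STL)."""
    n = len(series)
    half = period // 2
    # Trend: moving average over roughly one period, truncated at the edges.
    trend = []
    for t in range(n):
        window = series[max(0, t - half):min(n, t + half + 1)]
        trend.append(sum(window) / len(window))
    detrended = [x - m for x, m in zip(series, trend)]
    # Seasonal: mean of the detrended values at each phase of the cycle,
    # shifted so the seasonal pattern averages to zero over one period.
    phase_mean = []
    for p in range(period):
        vals = detrended[p::period]
        phase_mean.append(sum(vals) / len(vals))
    offset = sum(phase_mean) / period
    seasonal = [phase_mean[t % period] - offset for t in range(n)]
    residual = [x - s - m for x, s, m in zip(series, seasonal, trend)]
    return seasonal, trend, residual


def reconstruct_features(series):
    """Return the period and rows of [raw, seasonal, trend, residual]."""
    period = dominant_period(series)
    seasonal, trend, residual = decompose(series, period)
    rows = [list(r) for r in zip(series, seasonal, trend, residual)]
    return period, rows


# Synthetic hourly-like series (illustrative only): slow drift plus a 24-step cycle.
toy = [50 + 0.02 * t + 10 * math.sin(2 * math.pi * t / 24) for t in range(240)]
period, features = reconstruct_features(toy)
```

Each row of `features` holds four values for one time step, which is the shape a sequence model would consume after windowing. In practice a library STL implementation would replace `decompose`, since loess smoothing handles changing seasonality better than fixed phase means.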