Performance Evaluation of PM<sub>2.5</sub> Forecasting Using SARIMAX and LSTM in the Korean Peninsula
Air pollution, particularly fine particulate matter (PM<sub>2.5</sub>), poses significant environmental and public health challenges in South Korea. The National Institute of Environmental Research (NIER) currently relies on numerical models such as the Community Multiscale Air Quality (...
| Main Authors: | Chae-Yeon Lee, Ju-Yong Lee, Seung-Hee Han, Jin-Goo Kang, Jeong-Beom Lee, Dae-Ryun Choi |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Atmosphere |
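| Subjects: | PM<sub>2.5</sub> forecasting; LSTM; SARIMAX; air quality prediction; deep learning; statistical modeling |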
| Online Access: | https://www.mdpi.com/2073-4433/16/5/524 |
| _version_ | 1850257732547903488 |
|---|---|
| author | Chae-Yeon Lee; Ju-Yong Lee; Seung-Hee Han; Jin-Goo Kang; Jeong-Beom Lee; Dae-Ryun Choi |
| author_sort | Chae-Yeon Lee |
| collection | DOAJ |
| description | Air pollution, particularly fine particulate matter (PM<sub>2.5</sub>), poses significant environmental and public health challenges in South Korea. The National Institute of Environmental Research (NIER) currently relies on numerical models such as the Community Multiscale Air Quality (CMAQ) model for PM<sub>2.5</sub> forecasting. However, these models exhibit inherent uncertainties due to limitations in emission inventories, meteorological inputs, and model frameworks. To address these challenges, this study evaluates and compares the forecasting performance of two alternative models: Long Short-Term Memory (LSTM), a deep learning model, and Seasonal Autoregressive Integrated Moving Average with Exogenous Variables (SARIMAX), a statistical model. The evaluation focused on Seoul, South Korea, and covered several forecast lead times (D00–D02). The results indicate that for short-term forecasts (D00), SARIMAX outperformed LSTM in all statistical metrics, particularly in detecting high PM<sub>2.5</sub> concentrations, with a 19.43% higher Probability of Detection (POD). However, SARIMAX exhibited a sharp performance decline in extended forecasts (D01–D02). In contrast, LSTM demonstrated relatively stable accuracy over longer lead times, effectively capturing complex PM<sub>2.5</sub> concentration patterns, particularly during high-concentration episodes. These findings highlight the strengths and limitations of statistical and deep learning models. While SARIMAX excels in short-term forecasting with limited training data, LSTM proves advantageous for long-term forecasting, benefiting from its ability to learn complex temporal patterns from historical data. The results suggest that an integrated air quality forecasting system combining numerical, statistical, and machine learning approaches could enhance PM<sub>2.5</sub> forecasting accuracy. |
| format | Article |
| id | doaj-art-3e0ca9664ffe47bfad09d1b42a68e331 |
| institution | OA Journals |
| issn | 2073-4433 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Atmosphere |
| spelling | doaj-art-3e0ca9664ffe47bfad09d1b42a68e331; 2025-08-20T01:56:20Z; eng; MDPI AG; Atmosphere; ISSN 2073-4433; 2025-04-01; vol. 16, no. 5, article 524; doi:10.3390/atmos16050524; Performance Evaluation of PM<sub>2.5</sub> Forecasting Using SARIMAX and LSTM in the Korean Peninsula; Authors: Chae-Yeon Lee (0), Ju-Yong Lee (1), Seung-Hee Han (2), Jin-Goo Kang (3), Jeong-Beom Lee (4), Dae-Ryun Choi (5); Affiliations: (0) Division of Ocean & Atmosphere Sciences, Korea Polar Research Institute, Incheon 21990, Republic of Korea; (1) Department of Environmental and Engineering, Graduate School, Anyang University, Anyang 14028, Republic of Korea; (2) Department of Environmental and Engineering, Graduate School, Anyang University, Anyang 14028, Republic of Korea; (3) Department of Environmental and Energy Engineering, Anyang University, Anyang 14028, Republic of Korea; (4) Department of Environmental and Engineering, Graduate School, Anyang University, Anyang 14028, Republic of Korea; (5) Department of Environmental and Energy Engineering, Anyang University, Anyang 14028, Republic of Korea; https://www.mdpi.com/2073-4433/16/5/524; Keywords: PM<sub>2.5</sub> forecasting, LSTM, SARIMAX, air quality prediction, deep learning, statistical modeling |
| title | Performance Evaluation of PM<sub>2.5</sub> Forecasting Using SARIMAX and LSTM in the Korean Peninsula |
| topic | PM<sub>2.5</sub> forecasting; LSTM; SARIMAX; air quality prediction; deep learning; statistical modeling |
| url | https://www.mdpi.com/2073-4433/16/5/524 |
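
The description reports the Probability of Detection (POD) for high PM<sub>2.5</sub> concentrations but does not define it. As a minimal sketch, POD is conventionally the share of observed exceedance events that the forecast also flagged, hits / (hits + misses). The function name, the sample values, and the 35 µg/m³ threshold below are illustrative assumptions and are not taken from the article.

```python
import math
from typing import Sequence


def probability_of_detection(
    observed: Sequence[float],
    forecast: Sequence[float],
    threshold: float,
) -> float:
    """Return hits / (hits + misses) for events where observed > threshold.

    A "hit" is an observed exceedance that the forecast also exceeded;
    a "miss" is an observed exceedance that the forecast stayed below.
    False alarms (forecast high, observed low) do not enter POD.
    """
    if len(observed) != len(forecast):
        raise ValueError("observed and forecast must have the same length")

    hits = 0
    misses = 0
    for obs, fc in zip(observed, forecast):
        if obs > threshold:          # only observed events count
            if fc > threshold:
                hits += 1
            else:
                misses += 1

    if hits + misses == 0:
        return math.nan              # POD is undefined without observed events
    return hits / (hits + misses)


# Hypothetical daily-mean PM2.5 values in µg/m³ (not data from the study).
observed_pm25 = [12.0, 40.0, 55.0, 20.0, 38.0]
forecast_pm25 = [15.0, 37.0, 30.0, 36.0, 41.0]
HIGH_PM25_THRESHOLD = 35.0           # assumed cut-off, for illustration only

pod_example = probability_of_detection(
    observed_pm25, forecast_pm25, HIGH_PM25_THRESHOLD
)
print(f"POD = {pod_example:.3f}")
```

In this toy series three days exceed the threshold in the observations and the forecast catches two of them, so the score reflects only missed events; the false alarm on the fourth day would show up in a separate metric such as the false alarm ratio.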