Performance Evaluation of PM<sub>2.5</sub> Forecasting Using SARIMAX and LSTM in the Korean Peninsula

Bibliographic Details
Main Authors:	Chae-Yeon Lee, Ju-Yong Lee, Seung-Hee Han, Jin-Goo Kang, Jeong-Beom Lee, Dae-Ryun Choi
Format:	Article
Language:	English
Published:	MDPI AG 2025-04-01
Series:	Atmosphere
Subjects:	PM<sub>2.5</sub> forecasting LSTM SARIMAX air quality prediction deep learning statistical modeling
Online Access:	https://www.mdpi.com/2073-4433/16/5/524

collection	DOAJ
description	Air pollution, particularly fine particulate matter (PM<sub>2.5</sub>), poses significant environmental and public health challenges in South Korea. The National Institute of Environmental Research (NIER) currently relies on numerical models such as the Community Multiscale Air Quality (CMAQ) model for PM<sub>2.5</sub> forecasting. However, these models exhibit inherent uncertainties due to limitations in emission inventories, meteorological inputs, and model frameworks. To address these challenges, this study evaluates and compares the forecasting performance of two alternative models: Long Short-Term Memory (LSTM), a deep learning model, and Seasonal Auto Regressive Integrated Moving Average with Exogenous Variables (SARIMAX), a statistical model. The performance evaluation was focused on Seoul, South Korea, and took place over different forecast lead times (D00–D02). The results indicate that for short-term forecasts (D00), SARIMAX outperformed LSTM in all statistical metrics, particularly in detecting high PM<sub>2.5</sub> concentrations, with a 19.43% higher Probability of Detection (POD). However, SARIMAX exhibited a sharp performance decline in extended forecasts (D01–D02). In contrast, LSTM demonstrated relatively stable accuracy over longer lead times, effectively capturing complex PM<sub>2.5</sub> concentration patterns, particularly during high-concentration episodes. These findings highlight the strengths and limitations of statistical and deep learning models. While SARIMAX excels in short-term forecasting with limited training data, LSTM proves advantageous for long-term forecasting, benefiting from its ability to learn complex temporal patterns from historical data. The results suggest that an integrated air quality forecasting system combining numerical, statistical, and machine learning approaches could enhance PM<sub>2.5</sub> forecasting accuracy.
id	doaj-art-3e0ca9664ffe47bfad09d1b42a68e331
institution	OA Journals
issn	2073-4433
doi	10.3390/atmos16050524
volume	16
issue	5
article_number	524
author_affiliation	Chae-Yeon Lee: Division of Ocean & Atmosphere Sciences, Korea Polar Research Institute, Incheon 21990, Republic of Korea
author_affiliation	Ju-Yong Lee: Department of Environmental and Engineering, Graduate School, Anyang University, Anyang 14028, Republic of Korea
author_affiliation	Seung-Hee Han: Department of Environmental and Engineering, Graduate School, Anyang University, Anyang 14028, Republic of Korea
author_affiliation	Jin-Goo Kang: Department of Environmental and Energy Engineering, Anyang University, Anyang 14028, Republic of Korea
author_affiliation	Jeong-Beom Lee: Department of Environmental and Engineering, Graduate School, Anyang University, Anyang 14028, Republic of Korea
author_affiliation	Dae-Ryun Choi: Department of Environmental and Energy Engineering, Anyang University, Anyang 14028, Republic of Korea