Assessment of Machine Learning Algorithms in Short-term Forecasting of PM10 and PM2.5 Concentrations in Selected Polish Agglomerations

Abstract Air pollution continues to have a significant impact on Europeans living in urban areas, and episodes of elevated PMx are responsible for a large number of premature deaths (mostly due to heart disease and stroke) each year. According to the annual EEA reports, Poland is one of the most pol...

Bibliographic Details
Main Authors: Bartosz Czernecki, Michał Marosz, Joanna Jędruszkiewicz
Format: Article
Language: English
Published: Springer 2021-03-01
Series: Aerosol and Air Quality Research
Subjects: PM10; PM2.5; Air quality; Machine learning; Short-term forecasting
Online Access: https://doi.org/10.4209/aaqr.200586
author Bartosz Czernecki
Michał Marosz
Joanna Jędruszkiewicz
collection DOAJ
description Air pollution continues to have a significant impact on Europeans living in urban areas, and episodes of elevated PMx are responsible for a large number of premature deaths (mostly due to heart disease and stroke) each year. According to the annual EEA reports, Poland is one of the most polluted countries in Europe, experiencing high PMx concentrations during winter that mostly result from large emissions and unfavourable weather conditions in combination with environmental features. Thus, in addition to implementing municipal mitigation strategies, alerting residents to pollution episodes through accurate PMx forecasting is necessary. This research aimed to assess the feasibility of short-term PMx forecasting via machine learning (ML) and to identify the primary meteorological covariates. The data comprised 10 years of hourly winter PM10 and PM2.5 concentrations measured at 11 urban air quality monitoring stations, including background, traffic, and industrial sites, in four large Polish agglomerations, viz., Poznań, Kraków, Łódź, and Gdańsk, which cover areas with high population density and diverse environments that extend from the Baltic Sea coast (Tricity) through the lowlands (Poznań and Łódź) to the highlands (Kraków). We tested four ML models: AIC-based stepwise regression, two tree-based algorithms (random forests and XGBoost), and neural networks. Employing analysis and cross-validation, we found that XGBoost performed the best, followed by random forests and neural networks, and stepwise regression performed the worst. This ranking was also apparent in the threshold exceedance values of the binary forecasts obtained via regression. Overall, our results confirm the high applicability of ML to short-term air quality prediction with the perfect prognosis (perfect prog) approach.
format Article
id doaj-art-6658e8b75ac449d0aa5550b87abd6796
institution Kabale University
issn 1680-8584
2071-1409
language English
publishDate 2021-03-01
publisher Springer
record_format Article
series Aerosol and Air Quality Research
spelling doaj-art-6658e8b75ac449d0aa5550b87abd6796
2025-02-09T12:20:10Z
eng
Springer
Aerosol and Air Quality Research
1680-8584
2071-1409
2021-03-01
217118
10.4209/aaqr.200586
Assessment of Machine Learning Algorithms in Short-term Forecasting of PM10 and PM2.5 Concentrations in Selected Polish Agglomerations
Bartosz Czernecki (Department of Meteorology and Climatology, Adam Mickiewicz University in Poznań)
Michał Marosz (Institute of Meteorology and Water Management - National Research Institute)
Joanna Jędruszkiewicz (Institute of Geography, Pedagogical University of Cracow)
https://doi.org/10.4209/aaqr.200586
PM10
PM2.5
Air quality
Machine learning
Short-term forecasting
title Assessment of Machine Learning Algorithms in Short-term Forecasting of PM10 and PM2.5 Concentrations in Selected Polish Agglomerations
topic PM10
PM2.5
Air quality
Machine learning
Short-term forecasting
url https://doi.org/10.4209/aaqr.200586