Assessment of Machine Learning Algorithms in Short-term Forecasting of PM10 and PM2.5 Concentrations in Selected Polish Agglomerations
Abstract Air pollution continues to have a significant impact on Europeans living in urban areas, and episodes of elevated PMx are responsible for a large number of premature deaths (mostly due to heart disease and stroke) each year. According to the annual EEA reports, Poland is one of the most pol...
Main Authors: | Bartosz Czernecki, Michał Marosz, Joanna Jędruszkiewicz |
---|---|
Format: | Article |
Language: | English |
Published: | Springer 2021-03-01 |
Series: | Aerosol and Air Quality Research |
Subjects: | PM10; PM2.5; Air quality; Machine learning; Short-term forecasting |
Online Access: | https://doi.org/10.4209/aaqr.200586 |
_version_ | 1823862892171100160 |
---|---|
author | Bartosz Czernecki; Michał Marosz; Joanna Jędruszkiewicz |
collection | DOAJ |
description | Abstract Air pollution continues to have a significant impact on Europeans living in urban areas, and episodes of elevated PMx are responsible for a large number of premature deaths (mostly due to heart disease and stroke) each year. According to the annual EEA reports, Poland is one of the most polluted countries in Europe, experiencing high PMx concentrations during winter that mostly result from large emissions and unfavourable weather conditions in combination with environmental features. Thus, in addition to implementing municipal mitigation strategies, alerting residents to pollution episodes through accurate PMx forecasting is necessary. This research aimed to assess the feasibility of short-term PMx forecasting via machine learning (ML) and the subsequent identification of the primary meteorological covariates. The data comprised 10 years of hourly winter PM10 and PM2.5 concentrations measured at 11 urban air quality monitoring stations, including background, traffic, and industrial sites, in four large Polish agglomerations, viz., Poznań, Kraków, Łódź, and Gdańsk, which cover areas with high population density and diverse environments that extend from the Baltic Sea coast (Tricity) through the lowlands (Poznań and Łódź) to the highlands (Kraków). We tested four ML models: AIC-based stepwise regression, two tree-based algorithms (random forests and XGBoost), and neural networks. Employing analysis and cross-validation, we found that XGBoost performed the best, followed by random forests and neural networks, and stepwise regression performed the worst. This ranking was apparent in the threshold exceedance values of the binary forecasts obtained via regression. Overall, our results confirm the high applicability of ML to short-term air quality prediction with the perfect prog approach. |
format | Article |
id | doaj-art-6658e8b75ac449d0aa5550b87abd6796 |
institution | Kabale University |
issn | 1680-8584 2071-1409 |
language | English |
publishDate | 2021-03-01 |
publisher | Springer |
record_format | Article |
series | Aerosol and Air Quality Research |
spelling | doaj-art-6658e8b75ac449d0aa5550b87abd6796; 2025-02-09T12:20:10Z; eng; Springer; Aerosol and Air Quality Research; 1680-8584; 2071-1409; 2021-03-01; 217118; 10.4209/aaqr.200586; Assessment of Machine Learning Algorithms in Short-term Forecasting of PM10 and PM2.5 Concentrations in Selected Polish Agglomerations; Bartosz Czernecki (Department of Meteorology and Climatology, Adam Mickiewicz University in Poznań); Michał Marosz (Institute of Meteorology and Water Management – National Research Institute); Joanna Jędruszkiewicz (Institute of Geography, Pedagogical University of Cracow); https://doi.org/10.4209/aaqr.200586; PM10; PM2.5; Air quality; Machine learning; Short-term forecasting |
title | Assessment of Machine Learning Algorithms in Short-term Forecasting of PM10 and PM2.5 Concentrations in Selected Polish Agglomerations |
topic | PM10; PM2.5; Air quality; Machine learning; Short-term forecasting |
url | https://doi.org/10.4209/aaqr.200586 |
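
The abstract mentions threshold exceedance values of binary forecasts obtained via regression. As a rough illustration of that idea, and not the authors' actual procedure, the sketch below turns continuous concentration forecasts into yes/no exceedance forecasts and computes common contingency-table scores. The function name `exceedance_scores`, the sample values, and the 50 µg/m³ threshold (the EU daily limit value for PM10) are assumptions chosen for the example; the record does not state which thresholds or scores the study used.

```python
from typing import Dict, Sequence


def exceedance_scores(
    observed: Sequence[float],
    predicted: Sequence[float],
    threshold: float = 50.0,  # assumed threshold in µg/m³, not taken from the record
) -> Dict[str, float]:
    """Score binary exceedance forecasts derived from continuous forecasts."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    hits = misses = false_alarms = correct_negatives = 0
    for obs, pred in zip(observed, predicted):
        obs_exceeds = obs > threshold    # did the measurement exceed the threshold?
        pred_exceeds = pred > threshold  # did the regression forecast exceed it?
        if obs_exceeds and pred_exceeds:
            hits += 1
        elif obs_exceeds:
            misses += 1
        elif pred_exceeds:
            false_alarms += 1
        else:
            correct_negatives += 1

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising when a score is undefined.
        return numerator / denominator if denominator else float("nan")

    return {
        "hits": hits,
        "misses": misses,
        "false_alarms": false_alarms,
        "correct_negatives": correct_negatives,
        "pod": ratio(hits, hits + misses),                 # probability of detection
        "far": ratio(false_alarms, hits + false_alarms),   # false alarm ratio
        "csi": ratio(hits, hits + misses + false_alarms),  # critical success index
    }


# Made-up concentrations in µg/m³, for illustration only.
observed = [32.0, 61.5, 48.0, 75.2, 20.1, 55.0]
predicted = [30.0, 58.0, 52.3, 70.0, 25.0, 45.0]
scores = exceedance_scores(observed, predicted)
print(scores)
```

Scoring the thresholded output of each regression model in this way would let models be ranked on how well they flag pollution episodes, which complements continuous error measures.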