Utilizing Machine Learning-based Classification Models for Tracking Air Pollution Sources: A Case Study in Korea

Abstract Urbanization and industrialization pose significant challenges in promptly identifying and managing air pollution sources. The application of machine learning technology offers a promising solution to solve the issue. By analyzing multidimensional datasets containing a wide range of air pol...

Bibliographic Details
Main Authors: Yelim Choi, Bogyeong Kang, Daekeun Kim
Format: Article
Language: English
Published: Springer 2024-05-01
Series: Aerosol and Air Quality Research
Subjects: Machine learning; Emission sources; Air pollutants; Classification
Online Access: https://doi.org/10.4209/aaqr.230222
author Yelim Choi
Bogyeong Kang
Daekeun Kim
collection DOAJ
description Abstract Urbanization and industrialization pose significant challenges in promptly identifying and managing air pollution sources. The application of machine learning technology offers a promising solution to solve the issue. By analyzing multidimensional datasets containing a wide range of air pollutants, a machine learning approach has the potential to significantly improve air pollution management and facilitate source tracking. This study aims to comprehensively evaluate machine learning-based emission source classification models to provide insights into air pollution source tracking and management. Using 972 datasets consisting of five emission sources and 27 air pollutants, different classification models were implemented and subsequently compared: Random Forest (RF), Naïve Bayes Classifier (NBC), Support Vector Machine (SVM), Artificial Neural Network (ANN), and K-Nearest Neighbors (K-NN). The RF model was found to have better predictive performance than the other four models, achieving an accuracy of 0.9691 and a kappa value of 0.9537. Hydrogen chloride and acetaldehyde were the most important variables for classifying emission sources. The findings suggest the potential of machine learning techniques in addressing air pollution challenges, and the classifier model implemented in this study shows great promise for effective emission source identification.
format Article
id doaj-art-f94d4d2aae0e460db7e7d5eb3f62321e
institution Kabale University
issn 1680-8584
2071-1409
language English
publishDate 2024-05-01
publisher Springer
record_format Article
series Aerosol and Air Quality Research
spelling doaj-art-f94d4d2aae0e460db7e7d5eb3f62321e; 2025-02-09T12:23:59Z; eng; Springer; Aerosol and Air Quality Research; ISSN 1680-8584, 2071-1409; 2024-05-01; vol. 24, no. 7, pp. 1-11; 10.4209/aaqr.230222; Utilizing Machine Learning-based Classification Models for Tracking Air Pollution Sources: A Case Study in Korea; Yelim Choi (0), Bogyeong Kang (1), Daekeun Kim (2); affiliations 0-2: Department of Environmental Engineering, Seoul National University of Science and Technology; https://doi.org/10.4209/aaqr.230222; Machine learning; Emission sources; Air pollutants; Classification
title Utilizing Machine Learning-based Classification Models for Tracking Air Pollution Sources: A Case Study in Korea
topic Machine learning
Emission sources
Air pollutants
Classification
url https://doi.org/10.4209/aaqr.230222
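The description field reports an accuracy of 0.9691 and a kappa value of 0.9537 for the Random Forest model. The record does not say how the authors computed these metrics, so the following is only a minimal sketch under the assumption that "kappa" means the standard Cohen's kappa, which corrects observed agreement for the agreement expected by chance. The function name and the source labels are hypothetical, and the toy labels are not data from the study.

```python
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(y_true)

    # Observed agreement: share of samples whose predicted source matches the true one.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement: for each class, (share of true labels) * (share of predictions).
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if p_e == 1.0:
        # Only one class occurs in both sequences, so kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (p_o - p_e) / (1.0 - p_e)
    return p_o, kappa


# Hypothetical emission-source labels, for illustration only.
y_true = ["source_a"] * 4 + ["source_b"] * 4 + ["source_c"] * 2
y_pred = ["source_a"] * 3 + ["source_b"] * 5 + ["source_c"] * 2

acc, kappa = accuracy_and_kappa(y_true, y_pred)
print(f"accuracy={acc:.4f} kappa={kappa:.4f}")
```

In this toy case one of ten samples is misclassified, giving an accuracy of 0.9 and a kappa of roughly 0.84. Kappa is lower than accuracy because part of the agreement would be expected by chance alone.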