Evaluation of the Decision Tree Model for Air Condition Classification on the Global Air Pollution Dataset

Air pollution is an urgent global environmental problem, with significant impacts on public health and ecosystem stability. This research aims to develop an air quality classification model using the Global Air Pollution dataset from Kaggle, which consists of 23,463 rows of data and 12 features, inc...

Bibliographic Details
Main Authors: Cindy Dinda Sabella, Yoga Pristyanto
Format: Article
Language: English
Published: Politeknik Negeri Batam 2024-11-01
Series: Journal of Applied Informatics and Computing
Subjects: decision tree; air quality index; air pollution; machine learning; classification
Online Access: https://jurnal.polibatam.ac.id/index.php/JAIC/article/view/8611
collection DOAJ
description Air pollution is an urgent global environmental problem with significant impacts on public health and ecosystem stability. This research develops an air quality classification model using the Global Air Pollution dataset from Kaggle, which consists of 23,463 rows and 12 features, including key variables such as the Air Quality Index (AQI), PM2.5, NO2, and O3. Decision Tree, Random Forest, and Support Vector Machine (SVM) algorithms are applied to the classification task, with a focus on hyperparameter tuning to increase model accuracy. The results show that the Decision Tree performs best, with an accuracy of 99.89% after hyperparameter tuning with the Grid Search method. The SVM model improved from 94.89% to 99.32%, while Random Forest recorded an accuracy of 96.87% with no significant improvement after tuning. Feature importance analysis identified PM2.5 and AQI as the dominant factors influencing air quality, with PM2.5 having the highest importance value of 0.93. This research confirms that machine learning can be an effective tool for integrating and classifying air pollution. It is hoped that integrating this model into a real-time air quality monitoring system can support more responsive and precise decisions in dealing with air pollution problems.
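The tuning and feature-importance steps mentioned in the description can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the record does not name a library, so scikit-learn is an assumption, the data is a synthetic 12-feature stand-in rather than the Kaggle dataset, and the parameter grid is hypothetical. It will not reproduce the reported accuracies.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in with 12 features (assumption: NOT the Global Air Pollution data).
X, y = make_classification(
    n_samples=600,
    n_features=12,
    n_informative=4,
    n_redundant=2,
    n_classes=3,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical search space; the record does not list the grid that was used.
param_grid = {
    "max_depth": [3, 5, None],
    "min_samples_split": [2, 10],
    "criterion": ["gini", "entropy"],
}

# Grid Search: fit every parameter combination with 3-fold cross-validation
# and keep the combination with the best mean validation accuracy.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Evaluate the refitted best tree on the held-out split.
best_tree = search.best_estimator_
test_accuracy = accuracy_score(y_test, best_tree.predict(X_test))

# Impurity-based feature importances, ranked from most to least important.
importances = best_tree.feature_importances_
ranked = sorted(enumerate(importances), key=lambda pair: pair[1], reverse=True)

print("best parameters:", search.best_params_)
print("held-out accuracy:", round(test_accuracy, 4))
print("top 3 features (index, importance):", ranked[:3])
```

With the real dataset, the feature matrix would hold the pollutant columns and the target would be the air quality category; an importance near 0.93 for one feature, as reported for PM2.5, would mean that almost all of the tree's impurity reduction comes from splits on that feature.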
id doaj-art-ccd3fc5f66fa4d01b3330402327d0c44
institution OA Journals
issn 2548-6861
spelling doaj-art-ccd3fc5f66fa4d01b3330402327d0c44; 2025-08-20T02:20:49Z; eng; Politeknik Negeri Batam; Journal of Applied Informatics and Computing; ISSN 2548-6861; 2024-11-01; vol. 8, no. 2, pp. 478-486; DOI 10.30871/jaic.v8i2.8611; Evaluation of the Decision Tree Model for Air Condition Classification on the Global Air Pollution Dataset; Cindy Dinda Sabella (Universitas Amikom Yogyakarta); Yoga Pristyanto (Universitas Amikom Yogyakarta)
topic decision tree
air quality index
air pollution
machine learning
classification