Evaluation of the Decision Tree Model for Air Condition Classification on the Global Air Pollution Dataset
Air pollution is an urgent global environmental problem, with significant impacts on public health and ecosystem stability. This research aims to develop an air quality classification model using the Global Air Pollution dataset from Kaggle, which consists of 23,463 rows of data and 12 features, inc...
| Main Authors: | Cindy Dinda Sabella, Yoga Pristyanto |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Politeknik Negeri Batam, 2024-11-01 |
| Series: | Journal of Applied Informatics and Computing |
| Subjects: | decision tree; air quality index; air pollution; machine learning; classification |
| Online Access: | https://jurnal.polibatam.ac.id/index.php/JAIC/article/view/8611 |
| _version_ | 1850169062103973888 |
|---|---|
| author | Cindy Dinda Sabella; Yoga Pristyanto |
| author_facet | Cindy Dinda Sabella; Yoga Pristyanto |
| author_sort | Cindy Dinda Sabella |
| collection | DOAJ |
| description | Air pollution is an urgent global environmental problem, with significant impacts on public health and ecosystem stability. This research aims to develop an air quality classification model using the Global Air Pollution dataset from Kaggle, which consists of 23,463 rows and 12 features, including important variables such as Air Quality Index (AQI), PM2.5, NO2, and O3. Decision Tree, Random Forest, and Support Vector Machine (SVM) algorithms are applied to perform classification, with a focus on hyperparameter tuning to increase model accuracy. The results show that the Decision Tree performs best, with an accuracy of 99.89% after hyperparameter tuning using the Grid Search method. The SVM model improved from 94.89% to 99.32%, while Random Forest recorded an accuracy of 96.87% with no significant improvement after tuning. Feature importance analysis identified PM2.5 and AQI as the dominant factors influencing air quality, with PM2.5 having the highest importance value of 0.93. This research confirms that machine learning can be an effective tool for integrating and classifying air pollution. It is hoped that integrating this model into a real-time air quality monitoring system can support more responsive and precise decisions in dealing with air pollution problems. |
| format | Article |
| id | doaj-art-ccd3fc5f66fa4d01b3330402327d0c44 |
| institution | OA Journals |
| issn | 2548-6861 |
| language | English |
| publishDate | 2024-11-01 |
| publisher | Politeknik Negeri Batam |
| record_format | Article |
| series | Journal of Applied Informatics and Computing |
| spelling | doaj-art-ccd3fc5f66fa4d01b3330402327d0c44; 2025-08-20T02:20:49Z; eng; Politeknik Negeri Batam; Journal of Applied Informatics and Computing; ISSN 2548-6861; 2024-11-01; vol. 8, no. 2, pp. 478-486; DOI 10.30871/jaic.v8i2.8611; 8611; Cindy Dinda Sabella (Universitas Amikom Yogyakarta); Yoga Pristyanto (Universitas Amikom Yogyakarta); https://jurnal.polibatam.ac.id/index.php/JAIC/article/view/8611; decision tree; air quality index; air pollution; machine learning; classification |
| title | Evaluation of the Decision Tree Model for Air Condition Classification on the Global Air Pollution Dataset |
| topic | decision tree; air quality index; air pollution; machine learning; classification |
| url | https://jurnal.polibatam.ac.id/index.php/JAIC/article/view/8611 |