Machine Learning Classification Model to Label Sources Derived from Factor Analysis Receptor Models for Source Apportionment

Bibliographic Details
Main Authors: Vikas Kumar, Vasudev Malyan, Manoranjan Sahu, Basudev Biswal
Format: Article
Language: English
Published: Springer 2023-04-01
Series: Aerosol and Air Quality Research
Subjects:
Online Access: https://doi.org/10.4209/aaqr.220386
collection DOAJ
description Abstract: Factor analysis (FA) receptor models are widely used for source apportionment (SA) because they can extract source contributions and profiles from the data. However, source identification and labelling rely on manual interpretation, which is subjective and time-consuming. This is a barrier to developing a real-time SA process. In this study, a machine learning (ML) classification algorithm, k-nearest neighbour (kNN), is applied to source profiles obtained from the United States Environmental Protection Agency’s (U.S. EPA) SPECIATE database to develop a model that can automatically label the factors derived from FA receptor models. The train and test scores of the model are 0.85 and 0.79, respectively. The overall weighted average precision, recall and F1 score are each 0.79. The model performs acceptably during validation. Applying ML models to source profile labelling will reduce the time taken and the subjectivity that modeller bias introduces into the results. The approach can also serve as an additional verification layer for the results of FA receptor models, and it advances the process towards real-time SA.
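The labelling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the species columns, the toy profiles, the value k = 3, the Euclidean distance and the helper names `knn_label` and `weighted_scores` are all assumptions made for illustration, and the numbers are invented rather than taken from SPECIATE. The sketch labels an unlabelled factor profile by majority vote among its nearest reference profiles, and computes support-weighted precision, recall and F1 of the kind the abstract reports.

```python
import math
from collections import Counter

# Hypothetical toy profiles: mass fractions of (OC, EC, sulfate, Si, Na).
# These values are invented for illustration and are NOT SPECIATE data.
TRAIN = [
    ((0.45, 0.30, 0.05, 0.02, 0.01), "vehicle exhaust"),
    ((0.40, 0.35, 0.04, 0.03, 0.01), "vehicle exhaust"),
    ((0.50, 0.25, 0.06, 0.02, 0.02), "vehicle exhaust"),
    ((0.05, 0.01, 0.03, 0.35, 0.02), "crustal dust"),
    ((0.04, 0.02, 0.02, 0.40, 0.03), "crustal dust"),
    ((0.06, 0.01, 0.04, 0.30, 0.02), "crustal dust"),
    ((0.02, 0.00, 0.08, 0.01, 0.30), "sea salt"),
    ((0.03, 0.01, 0.10, 0.01, 0.28), "sea salt"),
    ((0.02, 0.01, 0.07, 0.02, 0.33), "sea salt"),
]


def knn_label(train, profile, k=3):
    """Label a factor profile by majority vote among its k nearest neighbours."""
    # Sort the reference profiles by Euclidean distance to the query profile.
    nearest = sorted(train, key=lambda item: math.dist(item[0], profile))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tie, Counter keeps first-seen order, so the closest label wins.
    return votes.most_common(1)[0][0]


def weighted_scores(y_true, y_pred):
    """Return support-weighted average (precision, recall, F1)."""
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, support in Counter(y_true).items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        p_c = tp / (tp + fp) if tp + fp else 0.0
        r_c = tp / (tp + fn) if tp + fn else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = support / total  # each class counts in proportion to its support
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


if __name__ == "__main__":
    # A hypothetical factor profile, as might come out of an FA receptor model.
    factor = (0.42, 0.31, 0.05, 0.02, 0.01)
    print(knn_label(TRAIN, factor))
```

With support weighting, the weighted-average recall equals the overall accuracy, which is consistent with the abstract reporting 0.79 for both the test score and the weighted recall.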
format Article
id doaj-art-f12c5a434c0f48529c345ab8e715a9d5
institution Kabale University
issn 1680-8584
2071-1409
language English
publishDate 2023-04-01
publisher Springer
record_format Article
series Aerosol and Air Quality Research
spelling doaj-art-f12c5a434c0f48529c345ab8e715a9d5 (record timestamp 2025-02-09T12:22:18Z)
Volume 23, issue 7, pages 1-11
Author affiliations:
Vikas Kumar: Interdisciplinary Program in Climate Studies, Indian Institute of Technology Bombay
Vasudev Malyan: Aerosol and Nanoparticle Technology Laboratory, Environmental Science and Engineering Department, Indian Institute of Technology Bombay
Manoranjan Sahu: Aerosol and Nanoparticle Technology Laboratory, Environmental Science and Engineering Department, Indian Institute of Technology Bombay
Basudev Biswal: Department of Civil Engineering, Indian Institute of Technology Bombay
topic Particulate matter
Source apportionment
Receptor models
Machine learning
Classification
url https://doi.org/10.4209/aaqr.220386