Machine Learning Classification Model to Label Sources Derived from Factor Analysis Receptor Models for Source Apportionment

Abstract: Factor analysis (FA) receptor models are widely used for source apportionment (SA) due to their ability to extract the source contribution and profile from the data. However, there is subjectivity in the source identification and labelling due to manual interpretation, which is time-consumi...

Bibliographic Details
Main Authors:	Vikas Kumar, Vasudev Malyan, Manoranjan Sahu, Basudev Biswal
Format:	Article
Language:	English
Published:	Springer 2023-04-01
Series:	Aerosol and Air Quality Research
Subjects:	Particulate matter; Source apportionment; Receptor models; Machine learning; Classification
Online Access:	https://doi.org/10.4209/aaqr.220386

collection	DOAJ
description	Abstract: Factor analysis (FA) receptor models are widely used for source apportionment (SA) because they can extract source contributions and profiles from the data. However, source identification and labelling rely on manual interpretation, which is subjective and time-consuming. This is a barrier to the development of real-time SA. In this study, a machine learning (ML) classification algorithm, k-nearest neighbour (kNN), is applied to source profiles obtained from the United States Environmental Protection Agency’s (U.S. EPA) SPECIATE database to develop a model that can automatically label the factors derived from FA receptor models. The train and test scores of the model are 0.85 and 0.79, respectively. The overall weighted average precision, recall and F1 score are each 0.79. The model performs acceptably during validation. Applying ML models to source profile labelling will reduce the time taken and the subjectivity that modeler bias introduces into the results. The approach can also serve as an additional verification layer for the results of FA receptor models, and it advances the process towards real-time SA.
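The labelling step described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the species order, the toy reference profiles, the Euclidean distance, the sum-to-one normalization, and the names `normalize`, `knn_label`, `weighted_scores` and `TRAIN` are all assumptions made for illustration, and the numbers are invented rather than taken from SPECIATE.

```python
import math
from collections import Counter


def normalize(profile):
    """Scale a species-abundance vector so its entries sum to 1."""
    total = sum(profile)
    return [v / total for v in profile] if total > 0 else list(profile)


def knn_label(train, query, k=3):
    """Label `query` by majority vote among its k nearest reference profiles."""
    q = normalize(query)
    # Rank reference profiles by Euclidean distance to the normalized query.
    nearest = sorted(train, key=lambda item: math.dist(normalize(item[0]), q))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def weighted_scores(y_true, y_pred):
    """Support-weighted average precision, recall and F1 over all classes."""
    n = len(y_true)
    precision = recall = f1 = 0.0
    for cls, support in Counter(y_true).items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        predicted = sum(p == cls for p in y_pred)
        prec = tp / predicted if predicted else 0.0
        rec = tp / support
        f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        weight = support / n  # each class counts in proportion to its support
        precision += weight * prec
        recall += weight * rec
        f1 += weight * f
    return precision, recall, f1


# Hypothetical reference profiles (assumed species order: OC, EC, sulfate, Fe).
TRAIN = [
    ([0.60, 0.30, 0.05, 0.05], "biomass burning"),
    ([0.55, 0.35, 0.05, 0.05], "biomass burning"),
    ([0.30, 0.60, 0.05, 0.05], "vehicle exhaust"),
    ([0.25, 0.65, 0.05, 0.05], "vehicle exhaust"),
    ([0.05, 0.05, 0.10, 0.80], "crustal dust"),
    ([0.05, 0.05, 0.15, 0.75], "crustal dust"),
]

if __name__ == "__main__":
    # An unlabelled factor profile, as an FA receptor model might return it.
    factor = [6.1, 2.8, 0.6, 0.5]
    print(knn_label(TRAIN, factor, k=3))
```

With support weighting, the averaged recall is the same quantity as overall accuracy. If the reported test score is an accuracy (an assumption, since the record does not say), that would be consistent with both values being 0.79.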
id	doaj-art-f12c5a434c0f48529c345ab8e715a9d5
institution	Kabale University
issn	1680-8584 2071-1409
spelling	Record doaj-art-f12c5a434c0f48529c345ab8e715a9d5, timestamp 2025-02-09T12:22:18Z. Aerosol and Air Quality Research, volume 23, issue 7, pages 1-11. Author affiliations: Vikas Kumar (Interdisciplinary Program in Climate Studies, Indian Institute of Technology Bombay); Vasudev Malyan (Aerosol and Nanoparticle Technology Laboratory, Environmental Science and Engineering Department, Indian Institute of Technology Bombay); Manoranjan Sahu (Aerosol and Nanoparticle Technology Laboratory, Environmental Science and Engineering Department, Indian Institute of Technology Bombay); Basudev Biswal (Department of Civil Engineering, Indian Institute of Technology Bombay).

Similar Items