Assessment of binary prediction of fraudulent advertisements in ATS candidate tracking cloud systems

Bibliographic Details
Main Authors:	V. V. Ligi-Goryaev, G. A. Mankaeva, T. B. Goldvarg, S. S. Muchkaeva, V. V. Dzhakhnaev
Format:	Article
Language:	Russian
Published:	North-Caucasus Federal University 2024-05-01
Series:	Современная наука и инновации
Subjects:	cloud-based ATS; fraudulent advertisement detection; classifiers; LinearSVC; GBT; RF models
Online Access:	https://msi.elpub.ru/jour/article/view/1590

author	V. V. Ligi-Goryaev G. A. Mankaeva T. B. Goldvarg S. S. Muchkaeva V. V. Dzhakhnaev
collection	DOAJ
description	This article describes the construction of a binary classification model that labels job advertisements in cloud-based applicant tracking systems (ATS) as either legitimate or fraudulent. Three traditional classification algorithms were chosen for the study: LSVC (linear support vector classifier), GBT (gradient-boosted trees), and RF (random forest). Building such a model begins with identifying and collecting features that help distinguish fraudulent job advertisements from legitimate ones, such as job location, job description, job requirements, job responsibilities, company information, and recruiter data. The algorithms are then trained on the prepared datasets, with standard methods such as cross-validation used to assess their performance, and the trained models are evaluated with metrics such as accuracy, precision, and recall. The most effective model according to these metrics can be selected and deployed in a production environment, where it classifies job advertisements as fraudulent or legitimate; the deployed model should be re-evaluated and updated over time to remain reliable and effective. Based on the evaluation metrics, the GBT classifier outperformed the LinearSVC and RF classifiers on the given dataset, but it also required more time for training and prediction: 208.738579 seconds for GBT, compared with 64.267132 seconds for LSVC and 71.024914 seconds for RF. In light of the evaluation results, the GBT model was used in the operational part of the program.
To implement the prediction, the GBT, RF, and LSVC models were trained on a custom dataset called "Job_Fraud", created from the publicly available EMSCAD dataset. To address the significant class imbalance, a library implementation of the Synthetic Minority Over-sampling Technique (SMOTE) was used. Initially, a model was trained on the data with a classifier, with the text mapped to a vector space and stop words removed by TFIDFVectorizer. Then, after the dimensionality of the data was reduced, the data was reloaded and both the model and the vectorizer were retrained before being used for prediction. The graphical interface was built with the tkinter module. The predict() function uses the trained model to make predictions from the feature vector.
format	Article
id	doaj-art-5b809e667b934045b0acd54a68e0bfc7
institution	DOAJ
issn	2307-910X
language	Russian
publishDate	2024-05-01
publisher	North-Caucasus Federal University
record_format	Article
series	Современная наука и инновации
spelling	doaj-art-5b809e667b934045b0acd54a68e0bfc7 2025-08-20T03:20:00Z rus North-Caucasus Federal University Современная наука и инновации 2307-910X 2024-05-01 01324110.37493/2307-910X.2024.1.31557 Assessment of binary prediction of fraudulent advertisements in ATS candidate tracking cloud systems V. V. Ligi-Goryaev, G. A. Mankaeva, T. B. Goldvarg, S. S. Muchkaeva, V. V. Dzhakhnaev (all: Kalmyk State University named after B.B. Gorodovikov) https://msi.elpub.ru/jour/article/view/1590
title	Assessment of binary prediction of fraudulent advertisements in ATS candidate tracking cloud systems
topic	cloud-based ATS; fraudulent advertisement detection; classifiers; LinearSVC; GBT; RF models
url	https://msi.elpub.ru/jour/article/view/1590
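
Two steps named in the abstract, oversampling the minority class with SMOTE and scoring a classifier by accuracy, precision, and recall, can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the article uses a library implementation of SMOTE, and the function names, the toy labels, and the parameters `k` and `seed` below are assumptions made only for illustration.

```python
import random
from math import dist


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when nothing is predicted/labelled positive.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (the core SMOTE idea).
    Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the chosen sample, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical labels: 1 = fraudulent advertisement, 0 = legitimate.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
print(binary_metrics(y_true, y_pred))

# Hypothetical 2-D feature vectors of the minority (fraudulent) class.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(smote_like(minority, n_new=5))
```

Reporting precision and recall alongside accuracy matters on an imbalanced dataset such as the one described, because a model that labels every advertisement as legitimate can still reach high accuracy while detecting no fraud.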

Similar Items