Integration of Hash Encoding Technique with Machine Learning for Employee Turnover Prediction

Bibliographic Details
Main Authors: Ahya Radiatul Kamila, Johanes Fernandes Andry, Francka Sakti Lee, Felliks F. Tampinongkol
Format: Article
Language: English
Published: Informatics Department, Faculty of Computer Science, Bina Darma University, 2025-06-01
Series: Journal of Information Systems and Informatics
Online Access: https://journal-isi.org/index.php/isi/article/view/1129
Description
Summary: Employee turnover is the replacement of employees within an organization. It causes losses such as recruitment costs and decreased productivity. Predicting turnover lets companies anticipate departures and act to retain high-potential employees. This study aims to optimize an employee turnover prediction model by integrating hash encoding with machine learning. The dataset is an open-source dataset obtained from Kaggle. It consists of 14,994 rows and 10 columns (features) representing employee-related information such as satisfaction level, evaluation score, number of projects, average monthly hours, and whether the employee left the company. Some of these features are of object (non-numeric) data type. Because machine learning algorithms generally cannot work directly with object-type features, the study proposes hash encoding, which converts object-type data into numerical data. It is applied in the preprocessing stage and is intended to reduce memory usage, speed up preprocessing, and improve model performance. After preprocessing, a Random Forest model is trained to predict employee turnover. The model is evaluated with accuracy, recall, precision, and F1-score, which reached 0.988, 0.961, 0.988, and 0.974, respectively. These results indicate that combining hash encoding with machine learning can produce a well-performing model for predicting employee turnover.
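The hash encoding step described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the column names (`department`, `salary`), the sample rows, the choice of MD5, and the bucket count of 8 are all assumptions made for illustration. The sketch shows only the general hashing trick, in which each categorical value is mapped to one of a fixed number of numeric columns.

```python
import hashlib


def hash_bucket(token, n_components):
    """Map a string to a stable bucket index in [0, n_components)."""
    digest = hashlib.md5(token.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_components


def hash_encode_row(row, categorical_columns, n_components=8):
    """Return the numeric features followed by n_components hashed columns.

    Each categorical value increments exactly one hashed column, so the
    output width stays fixed no matter how many distinct categories exist.
    """
    numeric = [value for name, value in row.items()
               if name not in categorical_columns]
    hashed = [0] * n_components
    for name in categorical_columns:
        # Prefix with the column name so equal strings in different
        # columns are hashed as different tokens.
        hashed[hash_bucket(f"{name}={row[name]}", n_components)] += 1
    return numeric + hashed


# Hypothetical rows; the column names and values are illustrative only.
rows = [
    {"satisfaction_level": 0.38, "number_project": 2,
     "department": "sales", "salary": "low"},
    {"satisfaction_level": 0.80, "number_project": 5,
     "department": "technical", "salary": "medium"},
]

encoded = [hash_encode_row(r, ["department", "salary"]) for r in rows]
for vector in encoded:
    print(vector)
```

The resulting fixed-width numeric vectors are the kind of input a classifier such as Random Forest can consume. Two different categories may collide in the same bucket, which is the usual trade-off of hash encoding against its lower memory use.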
ISSN: 2656-5935, 2656-4882