IMPACT OF FEATURE SELECTION ON DECISION TREE AND RANDOM FOREST FOR CLASSIFYING STUDENT STUDY SUCCESS

Bibliographic Details
Main Authors: Firdaus Amruzain Satiranandi Wibowo, Heri Retnawati, Muhammad Lintang Damar Sakti, Asma Khoirunnisa, Angella Ananta Batubara, Miftah Okta Berlian, Zulfa Safina Ibrahim, Jailani Jailani, Sumaryanto Sumaryanto, Lantip Diat Prasojo
Format: Article
Language: English
Published: Universitas Pattimura 2025-07-01
Series: Barekeng
Online Access: https://ojs3.unpatti.ac.id/index.php/barekeng/article/view/17017
Description
Summary: The advancement of technology has a profound impact on the field of education. Education plays a crucial role in enhancing quality of life, particularly in higher education, where one of the key parameters is student success. This study investigates the influence of feature selection on the performance of machine learning models, particularly Decision Tree and Random Forest, in classifying student academic success. Utilizing a dataset of 19,061 students, the research aims to identify significant variables impacting classification outcomes. Feature selection was conducted using LASSO regression, resulting in a refined dataset of critical predictors. To address data imbalance, Synthetic Minority Over-sampling Technique (SMOTE) was applied, improving the representation of underrepresented classes. Both Decision Tree and Random Forest models were trained on balanced datasets, with performance evaluated using accuracy, precision, recall, and F1-score metrics. The Random Forest model demonstrated superior accuracy (96.41%) compared to the Decision Tree (67.15%), as well as higher AUC values. Model interpretability was enhanced using SHAP (SHapley Additive exPlanations). This study underscores the utility of advanced machine learning techniques in educational analytics, paving the way for data-driven decision-making to support student achievement.
ISSN: 1978-7227, 2615-3017
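
Illustrative note: the summary names SMOTE for class balancing and accuracy, precision, recall, and F1-score for evaluation, but this record does not include the authors' implementation. The following is only a minimal standard-library sketch of those two ideas, not the study's code. The function names `smote_oversample` and `binary_metrics`, the neighbour count `k`, and the toy inputs are assumptions made for illustration. In practice, a maintained library such as imbalanced-learn would normally be used for SMOTE.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of equal-length numeric feature vectors (one class only).
    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    (Hypothetical helper for illustration; not the study's implementation.)
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 with one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy data only; these values are not taken from the study.
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(smote_oversample(minority, n_new=4, k=2, seed=1))
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 1]))
```

Because each synthetic sample is an interpolation between two existing minority samples, oversampling of this kind is usually applied to the training split only, so that the test set contains real observations. Whether the study followed that practice is not stated in the summary.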