IMPACT OF FEATURE SELECTION ON DECISION TREE AND RANDOM FOREST FOR CLASSIFYING STUDENT STUDY SUCCESS
| Main Authors: | , , , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Universitas Pattimura, 2025-07-01 |
| Series: | Barekeng |
| Subjects: | |
| Online Access: | https://ojs3.unpatti.ac.id/index.php/barekeng/article/view/17017 |
| Summary: | The advancement of technology has a profound impact on the field of education. Education plays a crucial role in enhancing quality of life, particularly in higher education, where one of the key parameters is student success. This study investigates the influence of feature selection on the performance of machine learning models, specifically Decision Tree and Random Forest, in classifying student academic success. Using a dataset of 19,061 students, the research aims to identify the variables that significantly affect classification outcomes. Feature selection was conducted using LASSO regression, yielding a reduced dataset of key predictors. To address class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied, improving the representation of underrepresented classes. Both the Decision Tree and Random Forest models were trained on the balanced dataset, and their performance was evaluated using accuracy, precision, recall, and F1-score. The Random Forest model achieved higher accuracy (96.41%) than the Decision Tree (67.15%), as well as higher AUC values. Model interpretability was enhanced using SHAP (SHapley Additive exPlanations). This study underscores the utility of advanced machine learning techniques in educational analytics, paving the way for data-driven decision-making to support student achievement. |
| ISSN: | 1978-7227, 2615-3017 |
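
The summary mentions SMOTE as the step that balances the classes before training. As a rough illustration of the idea only, the sketch below generates synthetic minority rows by interpolating between a minority sample and one of its nearest minority neighbours. It is not the authors' code: the function name `smote`, its parameters, and the toy data are all hypothetical, and the record does not say which implementation the study used (a library version such as the `SMOTE` class in imbalanced-learn would be the usual choice in practice).

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority neighbours.

    Hypothetical helper for illustration; not taken from the article.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # The k nearest minority neighbours of the base row (excluding itself).
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Made-up toy data: 6 majority rows and 3 minority rows, two features each.
majority = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.0, 0.4), (0.4, 0.1)]
minority = [(2.0, 2.0), (2.5, 2.1), (2.2, 2.6)]

# Create just enough synthetic rows to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because each synthetic row lies on a line segment between two real minority rows, the new rows stay inside the region the minority class already occupies. Oversampling of this kind is normally applied to the training split only, so that evaluation metrics are computed on untouched data.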