Beyond Performance: Explaining and Ensuring Fairness in Student Academic Performance Prediction with Machine Learning
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Applied Sciences |
| Online Access: | https://www.mdpi.com/2076-3417/15/15/8409 |
| Summary: | This study addresses fairness in machine learning for student academic performance prediction using the UCI Student Performance dataset. We compare logistic regression, random forest, and XGBoost, using the Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance and 5-fold cross-validation for robust model training. A comprehensive fairness analysis considers sensitive attributes such as gender, school type, and socioeconomic factors, including parental education (Medu and Fedu), cohabitation status (Pstatus), and family size (famsize). Using the AIF360 library, we compute the demographic parity difference (DP) and equalized odds difference (EO) to assess model bias across diverse subgroups. Our results show that XGBoost achieves high predictive performance (accuracy: 0.789; F1 score: 0.803) while maintaining low bias for socioeconomic attributes, offering a balance between fairness and performance. A sensitivity analysis of bias mitigation strategies complements these results. By incorporating socially relevant factors, the study aims to advance equitable artificial intelligence in education. |
| ISSN: | 2076-3417 |
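
The two fairness metrics named in the summary can be illustrated with a minimal sketch. This is not the authors' code and does not call AIF360; it uses plain NumPy on made-up toy labels. It assumes a binary sensitive attribute coded 0 (unprivileged) and 1 (privileged), and it takes the equalized odds difference to be the larger of the absolute true-positive-rate and false-positive-rate gaps, which is one common convention and may differ from the definition used in the article. All function and variable names are hypothetical.

```python
import numpy as np


def demographic_parity_difference(y_pred, group):
    """Selection rate of group 0 minus selection rate of group 1."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rate_unpriv = y_pred[group == 0].mean()
    rate_priv = y_pred[group == 1].mean()
    return float(rate_unpriv - rate_priv)


def _positive_rate(y_true, y_pred, group, g, label):
    """Share of positive predictions among members of group g whose true label is `label`."""
    mask = (group == g) & (y_true == label)
    return y_pred[mask].mean()


def equalized_odds_difference(y_true, y_pred, group):
    """Larger of the absolute TPR gap and absolute FPR gap between the two groups.

    Assumption: this max-of-gaps convention is illustrative only.
    """
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tpr_gap = abs(_positive_rate(y_true, y_pred, group, 0, 1)
                  - _positive_rate(y_true, y_pred, group, 1, 1))
    fpr_gap = abs(_positive_rate(y_true, y_pred, group, 0, 0)
                  - _positive_rate(y_true, y_pred, group, 1, 0))
    return float(max(tpr_gap, fpr_gap))


# Toy data (fabricated for illustration, not from the UCI dataset).
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 0, 0, 1, 1, 1, 0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # e.g. a binarized sensitive attribute

dp = demographic_parity_difference(y_pred, group)
eo = equalized_odds_difference(y_true, y_pred, group)
print(f"DP difference: {dp:.2f}")
print(f"EO difference: {eo:.2f}")
```

Values near zero on both metrics indicate that the classifier treats the two groups similarly; in the toy data, group 1 receives positive predictions more often than group 0, so both metrics are far from zero.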