A novel AI-driven model for student dropout risk analysis with explainable AI insights

The increasing number of students dropping out of school due to social, economic, personal (e.g., depression or persistent failure), and health issues is a growing concern for governments, educators, and guardians. Identifying and analyzing the factors contributing to student dropout is crucial. Var...

Bibliographic Details
Main Authors: Sumaya Mustofa, Yousuf Rayhan Emon, Sajib Bin Mamun, Shabnur Anonna Akhy, Md Taimur Ahad
Format: Article
Language: English
Published: Elsevier 2025-06-01
Series: Computers and Education: Artificial Intelligence
Subjects: Dropout; HLRNN; Hybrid logistic regression and neural network; Risk analysis; SHAP; LIME
Online Access: http://www.sciencedirect.com/science/article/pii/S2666920X24001553
collection DOAJ
description The increasing number of students dropping out of school due to social, economic, personal (e.g., depression or persistent failure), and health issues is a growing concern for governments, educators, and guardians. Identifying and analyzing the factors that contribute to student dropout is therefore crucial. Various machine learning, analytical, and statistical models have been proposed to address this issue. However, existing models are limited in their ability to provide a precise, automated system for predicting dropout risk and analyzing the factors behind it. Building a balanced dataset is a further difficulty, because ‘Dropouts’ are fewer than ‘Non-dropouts’. Selecting the features that contribute most to dropout and non-dropout outcomes is also important when developing a model. This study introduces a comprehensive machine learning (ML) and explainable AI (XAI) based methodology to address these limitations. First, the class imbalance is handled by upsampling the minority class ‘Dropout’. Then, Recursive Feature Elimination (RFE) is combined with Cross-Validation (CV), as the RFE-CV method, to select the most significant features. After preprocessing, the study proposes a hybrid model named the Hybrid Logistic Regression and Neural Network (HLRNN), which predicts student dropout with 96% accuracy, outperforming the other models tested, including its parent models Logistic Regression and Artificial Neural Network, by 2% and 3% in accuracy, respectively. Finally, the XAI methods SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) are used to analyze the risk factors associated with student dropout. The approach aims to assist institutions and educational stakeholders in formulating student retention policies and to enable early intervention that reduces dropout rates.
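The upsampling step mentioned in the description can be illustrated with a minimal sketch. This is not the authors' code: the function name `upsample_minority`, the toy feature rows, and the fixed seed are assumptions made for illustration, and the sketch uses plain random duplication with replacement, which is only one of several upsampling strategies.

```python
import random
from collections import Counter


def upsample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until every class
    has as many rows as the largest class (sampling with replacement)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Pool of original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # The majority class needs zero extra rows, so this loop is a no-op for it.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Hypothetical toy data: two numeric features per student.
rows = [(0.2, 1.0), (0.4, 0.0), (0.9, 1.0), (0.7, 0.0), (0.1, 1.0), (0.8, 0.0)]
labels = ["Non-dropout"] * 4 + ["Dropout"] * 2

balanced_rows, balanced_labels = upsample_minority(rows, labels)
print(Counter(balanced_labels))
```

In practice, upsampling is usually applied to the training split only, so that duplicated rows do not leak into the evaluation data.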
format Article
id doaj-art-aade0d2b231d4016b930da9a20c2a6bc
institution Kabale University
issn 2666-920X
spelling doaj-art-aade0d2b231d4016b930da9a20c2a6bc 2025-08-20T03:45:07Z
volume 8
article number 100352
doi 10.1016/j.caeai.2024.100352
affiliation Department of Computer Science and Engineering, Daffodil International University, Savar, Dhaka, Bangladesh (all five authors; Sumaya Mustofa is the corresponding author)
topic Dropout
HLRNN
Hybrid logistic regression and neural network
Risk analysis
SHAP
LIME
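For readers unfamiliar with the SHAP topic listed above, the quantity it estimates is the Shapley value of each feature. The sketch below computes exact Shapley values for a tiny logistic model by enumerating all feature subsets. It does not use the `shap` library and is not the article's model: the weights, bias, baseline, and the three feature meanings are hypothetical, and "absent" features are replaced by a single baseline value, which is one common convention among several.

```python
import math
from itertools import combinations


def predict(x, weights, bias):
    """Logistic model: probability of dropout for feature vector x."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def shapley_values(x, baseline, weights, bias):
    """Exact Shapley values of each feature for predict(x), relative to
    predict(baseline). Cost grows as 2**n, so this suits only tiny n."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the student's values; the rest take the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed, weights, bias)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # subset sizes 0 .. n-1
            coeff = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for s in combinations(others, k):
                s = set(s)
                # Marginal contribution of feature i when added to subset s.
                phi[i] += coeff * (value(s | {i}) - value(s))
    return phi


# Hypothetical model and student (features: failed courses, attendance, fee arrears).
weights, bias = [0.9, -1.4, 0.6], -0.5
baseline = [1.0, 0.8, 0.0]  # assumed "average" student
student = [4.0, 0.5, 1.0]

phi = shapley_values(student, baseline, weights, bias)
print([round(p, 4) for p in phi])
```

The values sum to the difference between the student's prediction and the baseline prediction, which is the additivity property that SHAP explanations rely on.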