Predicting student academic performance using Bi-LSTM: a deep learning framework with SHAP-based interpretability and statistical validation

Introduction: Educational Data Mining (EDM) involves analysing educational data to identify patterns and trends. By uncovering these insights, educators can better understand student learning, optimise teaching methods, and refine curriculum. One of the main tasks in educational data mining is predict...

Bibliographic Details
Main Authors: Emi Kalita, Abdullah Mana Alfarwan, Houssam El Aouifi, Ashima Kukkar, Sadiq Hussain, Tazid Ali, Silvia Gaftandzhieva
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-06-01
Series: Frontiers in Education
Subjects: student academic outcome; XAI; SHAP; Bi-LSTM; student dropout; statistical test
Online Access: https://www.frontiersin.org/articles/10.3389/feduc.2025.1581247/full
author Emi Kalita
Abdullah Mana Alfarwan
Houssam El Aouifi
Ashima Kukkar
Sadiq Hussain
Tazid Ali
Silvia Gaftandzhieva
author_sort Emi Kalita
collection DOAJ
description Introduction: Educational Data Mining (EDM) involves analysing educational data to identify patterns and trends. By uncovering these insights, educators can better understand student learning, optimise teaching methods, and refine curriculum. One of the main tasks in educational data mining is predicting the student’s academic performance because it makes it possible to provide appropriate interventions supporting students’ achievements. Predicting the student’s academic performance also helps to identify at-risk students and explore the possibility of providing intervention techniques.
Methods: In this paper, a deep learning model using a Bi-LSTM network is introduced to predict second-term GPA.
Results: The model had an average accuracy of 88.23% and was statistically better than traditional machine learning algorithms such as CatBoost, XGBoost, Hist Gradient Boosting, and LightGBM for accuracy, precision, recall, or F1-score metrics. The results are also analysed with the help of SHAP values for model interpretability to understand feature contributions, making the proposed framework more transparent. The performance of models is also compared using various statistical tests.
Discussion: The results demonstrate that Bi-LSTM performance is significantly different from other models. Hence, the proposed model provides a way to prevent student dropouts and improve academic achievements.
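The Results in the description compare models on accuracy, precision, recall, and F1-score. As a minimal sketch of how these four metrics relate to each other, the snippet below computes them from a confusion matrix for a single positive class. The function name `binary_metrics`, the binary labelling, and the toy labels are assumptions made for illustration only; the record does not state how the GPA target is encoded or how the authors computed their scores.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            # Predicted positive: correct (tp) or false alarm (fp).
            counts["tp" if truth == positive else "fp"] += 1
        else:
            # Predicted negative: missed positive (fn) or correct (tn).
            counts["fn" if truth == positive else "tn"] += 1

    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy labels (not data from the article).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

For a multi-class target, the same counts would be taken per class and then averaged; which averaging scheme the article uses is not stated in this record.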
format Article
id doaj-art-134b9bc4736f4844ae02bbb8cfc9d023
institution OA Journals
issn 2504-284X
language English
publishDate 2025-06-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Education
spelling doaj-art-134b9bc4736f4844ae02bbb8cfc9d023
Record updated: 2025-08-20T02:36:53Z
Language: eng
Publisher: Frontiers Media S.A.
Series: Frontiers in Education
ISSN: 2504-284X
Published: 2025-06-01
Volume: 10
DOI: 10.3389/feduc.2025.1581247
Article number: 1581247
Title: Predicting student academic performance using Bi-LSTM: a deep learning framework with SHAP-based interpretability and statistical validation
Authors and affiliations:
Emi Kalita (0): Centre for Computer Science and Applications, Dibrugarh University, Dibrugarh, India
Abdullah Mana Alfarwan (1): Department of Education and Psychology, Najran University, Najran, Saudi Arabia
Houssam El Aouifi (2): FSJES, Ibn Zohr University, Ait Melloul, Morocco
Houssam El Aouifi (3): IRF-SIC Laboratory, Faculty of Science, Ibn Zohr University, Agadir, Morocco
Ashima Kukkar (4): Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, India
Sadiq Hussain (5): Centre for Computer Science and Applications, Dibrugarh University, Dibrugarh, India
Tazid Ali (6): Centre for Computer Science and Applications, Dibrugarh University, Dibrugarh, India
Silvia Gaftandzhieva (7): Faculty of Mathematics and Informatics, University of Plovdiv Paisii Hilendarski, Plovdiv, Bulgaria
title Predicting student academic performance using Bi-LSTM: a deep learning framework with SHAP-based interpretability and statistical validation
topic student academic outcome
XAI
SHAP
Bi-LSTM
student dropout
statistical test
url https://www.frontiersin.org/articles/10.3389/feduc.2025.1581247/full