Accurate multi-category student performance forecasting at early stages of online education using neural networks

Bibliographic Details
Main Authors:	Naveed Ur Rehman Junejo, Muhammad Wasim Nawaz, Qingsheng Huang, Xiaoqing Dong, Chang Wang, Gengzhong Zheng
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-05-01
Series:	Scientific Reports
Subjects:	Early-prediction; Machine Learning; Deep Learning; Feature Engineering; Virtual Learning Environment
Online Access:	https://doi.org/10.1038/s41598-025-00256-3

Description
Summary:	The ability to accurately predict and analyze student performance in online education, both at the outset and throughout the semester, is vital. Most published studies focus on binary classification (Fail or Pass), and predicting student performance across multiple categories remains a significant research gap. This study introduces a novel neural network-based approach capable of accurately predicting student performance and identifying vulnerable students at the early stages of online courses. The Open University Learning Analytics (OULA) dataset is employed to develop and test the proposed model, which predicts outcomes in four categories: Distinction, Fail, Pass, and Withdrawn. The OULA dataset is preprocessed to extract features from demographic data, assessment data, and clickstream interactions within a virtual learning environment (VLE). Novel feature engineering is used to predict students’ performance across multiple categories at the early stages of courses. Specifically, students’ VLE interactions are aggregated by total clicks to represent daily engagement and assess online activity. Comparative simulations indicate that the proposed model significantly outperforms existing baseline models, including artificial neural network long short-term memory (ANN-LSTM), random forest (RF) ‘gini’, RF ‘entropy’, and deep feed forward neural network (DFFNN), in terms of accuracy, precision, recall, and F1-score. The results indicate that the prediction accuracy of the proposed method is about 25% higher than that of existing state-of-the-art methods. Furthermore, compared to existing methodologies, the model demonstrates stronger predictive capability across the progression of a course, achieving higher accuracy even at the initial 20% phase of course completion.
ISSN:	2045-2322
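
The daily-engagement aggregation mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `daily_clicks`, the `(student, day, clicks)` row layout, the 20% cutoff rule, and the choice to ignore activity before day 0 are all assumptions made for illustration. The rows loosely mirror the `id_student`, `date`, and `sum_click` columns of the OULA clickstream table.

```python
from collections import defaultdict


def daily_clicks(rows, course_length_days, fraction=0.2):
    """Sum clicks per student per day over an early window of the course.

    rows: iterable of (student, day, clicks) tuples (hypothetical layout).
    Returns {student: [total clicks on day 0, day 1, ...]} for the window.
    """
    # Assumption: the "early stage" is the first `fraction` of the course days.
    cutoff = int(course_length_days * fraction)
    totals = defaultdict(lambda: defaultdict(int))
    for student, day, clicks in rows:
        # Assumption: activity before day 0 or after the cutoff is ignored.
        if 0 <= day < cutoff:
            totals[student][day] += clicks
    # Days without activity are filled with 0 so every vector has equal length.
    return {
        student: [per_day.get(day, 0) for day in range(cutoff)]
        for student, per_day in totals.items()
    }
```

Each resulting vector could serve as one engagement feature per student for a downstream classifier. Students with no activity inside the window do not appear in the result and would need to be added separately.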

Similar Items