Optimizing learning outcomes: a deep dive into hybrid AI models for adaptive educational feedback

Abstract Accurate prediction of student performance is essential for the creation of adaptive learning frameworks and the best utilization of educational strategies. In this work, we apply ensemble learning and neural networks to investigate data from multiple sources about students, two real educat...

Bibliographic Details
Main Authors:	Hafiz Muhammad Qadir, M. Taseer Suleman, Rafaqat Alam Khan, Muhammad Sohaib, Md Junayed Hasan, Syed Abid Hussain
Format:	Article
Language:	English
Published:	SpringerOpen 2025-06-01
Series:	Journal of Big Data
Online Access:	https://doi.org/10.1186/s40537-025-01187-6

_version_	1850137995988959232
author	Hafiz Muhammad Qadir, M. Taseer Suleman, Rafaqat Alam Khan, Muhammad Sohaib, Md Junayed Hasan, Syed Abid Hussain
author_facet	Hafiz Muhammad Qadir, M. Taseer Suleman, Rafaqat Alam Khan, Muhammad Sohaib, Md Junayed Hasan, Syed Abid Hussain
author_sort	Hafiz Muhammad Qadir
collection	DOAJ
description	Abstract: Accurate prediction of student performance is essential for building adaptive learning frameworks and making the best use of educational strategies. In this work, we apply ensemble learning and neural networks to student data from multiple sources: two real educational datasets from Kaggle and two synthetically generated datasets. One synthetic dataset was created with a Python-based generative script; the other was created by augmenting a smaller Kaggle dataset while preserving its original statistical distribution. Integrating the synthetic data is intended to make the models more robust, mitigate class imbalance, and improve generalization across heterogeneous educational data. We implement several ensemble models (AdaBoost, Gradient Boosting, XGBoost, LightGBM, and CatBoost) and deep learning architectures such as Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and Recurrent Neural Networks (RNN). The models are evaluated using accuracy, precision, recall, F1-score, and ROC-AUC. Experimental results show that CatBoost outperforms the other ensemble models, with an accuracy of 0.7143 and an F1-score of 0.7338, while CNN achieves the highest performance on sequential data (accuracy: 0.6786). ROC-AUC analysis confirms CatBoost and XGBoost as the top-performing classifiers, while CNN and DNN exhibit superior capability in handling temporal patterns. The study highlights the impact of dataset augmentation and synthetic data generation on predictive accuracy in educational data mining, reinforcing the importance of data-centric approaches for building intelligent, evidence-driven educational systems. The learning feedback is available through a user-friendly web server at: https://khan-learning-feedback.streamlit.app/
format	Article
id	doaj-art-48f7e8a48d224b70a15179769bdbdc34
institution	OA Journals
issn	2196-1115
language	English
publishDate	2025-06-01
publisher	SpringerOpen
record_format	Article
series	Journal of Big Data
spelling	doaj-art-48f7e8a48d224b70a15179769bdbdc34; 2025-08-20T02:30:42Z; eng; SpringerOpen; Journal of Big Data; 2196-1115; 2025-06-01; 12; 1; 1; 26; 10.1186/s40537-025-01187-6; Optimizing learning outcomes: a deep dive into hybrid AI models for adaptive educational feedback; Hafiz Muhammad Qadir [0], M. Taseer Suleman [1], Rafaqat Alam Khan [2], Muhammad Sohaib [3], Md Junayed Hasan [4], Syed Abid Hussain [5]; [0] Department of Software Engineering, Lahore Garrison University; [1] Department of Computer Science, Bahria University Lahore Campus; [2] Department of Software Engineering, Lahore Garrison University; [3] School of Computer Science and Technology, Zhejiang Normal University; [4] Dataxense; [5] Department of Computer Science and Engineering, Bakhtar University; https://doi.org/10.1186/s40537-025-01187-6
spellingShingle	Hafiz Muhammad Qadir M. Taseer Suleman Rafaqat Alam Khan Muhammad Sohaib Md Junayed Hasan Syed Abid Hussain Optimizing learning outcomes: a deep dive into hybrid AI models for adaptive educational feedback Journal of Big Data
title	Optimizing learning outcomes: a deep dive into hybrid AI models for adaptive educational feedback
title_full	Optimizing learning outcomes: a deep dive into hybrid AI models for adaptive educational feedback
title_fullStr	Optimizing learning outcomes: a deep dive into hybrid AI models for adaptive educational feedback
title_full_unstemmed	Optimizing learning outcomes: a deep dive into hybrid AI models for adaptive educational feedback
title_short	Optimizing learning outcomes: a deep dive into hybrid AI models for adaptive educational feedback
title_sort	optimizing learning outcomes a deep dive into hybrid ai models for adaptive educational feedback
url	https://doi.org/10.1186/s40537-025-01187-6
work_keys_str_mv	AT hafizmuhammadqadir optimizinglearningoutcomesadeepdiveintohybridaimodelsforadaptiveeducationalfeedback AT mtaseersuleman optimizinglearningoutcomesadeepdiveintohybridaimodelsforadaptiveeducationalfeedback AT rafaqatalamkhan optimizinglearningoutcomesadeepdiveintohybridaimodelsforadaptiveeducationalfeedback AT muhammadsohaib optimizinglearningoutcomesadeepdiveintohybridaimodelsforadaptiveeducationalfeedback AT mdjunayedhasan optimizinglearningoutcomesadeepdiveintohybridaimodelsforadaptiveeducationalfeedback AT syedabidhussain optimizinglearningoutcomesadeepdiveintohybridaimodelsforadaptiveeducationalfeedback
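
The description in this record lists accuracy, precision, recall, and F1-score among the evaluation metrics. As a minimal sketch of how those four values are computed for a binary classifier, the following may help; the function name `binary_metrics` and the toy labels are illustrative assumptions and are not taken from the article or its datasets.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for labels coded as 0/1."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Count the four cells of the confusion matrix.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels (1 = positive class, 0 = negative class).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Because F1 is the harmonic mean of precision and recall, it can differ from accuracy, as it does for the CatBoost figures quoted in the description (accuracy 0.7143, F1-score 0.7338).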

Similar Items