Predictive Model to Analyse Real and Synthetic Data for Learners' Performance Prediction Using Regression Techniques

Bibliographic Details
Main Authors: SHABNAM ARA S.J, Tanuja R, Manjula S.H
Format: Article
Language: English
Published: Online Learning Consortium 2025-03-01
Series: Online Learning
Online Access: https://olj.onlinelearningconsortium.org/index.php/olj/article/view/4390
collection DOAJ
description Predicting learner performance with precision is critical within educational systems, offering a basis for tailored interventions and instruction. The advent of big data analytics presents an opportunity to employ Machine Learning (ML) techniques to this end. Real-world data availability is often hampered by privacy concerns, prompting a shift towards synthetic data generation. This study presents an empirical comparison of real, synthetic, and mixed (real + synthetic) data sets in forecasting learner performance, deploying an array of regression-based ML algorithms, including Random Forest, Gradient Boosting, XGBoost, K-Nearest Neighbors, and Support Vector Regression. Our methodology encompasses the generation of synthetic data via a generative model, followed by the application of these algorithms to each data set. The models are evaluated using precision metrics to assess their predictive accuracy. The study shows that synthetic data can rival real data in predictive capabilities, with combined data sets achieving up to 87.76% accuracy, underscoring the efficacy of hybrid data approaches. These insights advocate for the integration of synthetic data as a practical substitute in scenarios with limited access to real data, fostering advancements in educational technology and ML.
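
The comparison described in the abstract (train the same regressor on real, synthetic, and mixed data, then score each on held-out real data) can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption for illustration: the record does not name the generative model, the features, or the exact metric, so the toy features (`study_hours`, `quiz_score`), the jittered-resampling generator standing in for the generative model, and mean absolute error as the score are all hypothetical. Only K-nearest-neighbour regression is taken from the list of algorithms above.

```python
import math
import random
import statistics


def make_real_data(n, rng):
    """Toy stand-in for real learner records: (study_hours, quiz_score, final_score).
    The features and the linear relationship are hypothetical."""
    rows = []
    for _ in range(n):
        study_hours = rng.uniform(0, 10)
        quiz_score = rng.uniform(0, 10)
        final_score = 20 + 4 * study_hours + 3 * quiz_score + rng.gauss(0, 3)
        rows.append((study_hours, quiz_score, final_score))
    return rows


def synthesize(rows, n, rng, jitter=0.1):
    """Stand-in for a generative model (assumption): resample a real row and
    add Gaussian noise scaled to each column's standard deviation."""
    stds = [statistics.pstdev(col) for col in zip(*rows)]
    synthetic = []
    for _ in range(n):
        base = rng.choice(rows)
        synthetic.append(tuple(v + rng.gauss(0, jitter * s) for v, s in zip(base, stds)))
    return synthetic


def knn_predict(train, features, k=5):
    """K-nearest-neighbour regression: mean target of the k closest training rows."""
    nearest = sorted(train, key=lambda row: math.dist(row[:2], features))[:k]
    return statistics.fmean(row[2] for row in nearest)


def mean_absolute_error(train, test, k=5):
    """Average absolute gap between predicted and actual final scores."""
    return statistics.fmean(abs(knn_predict(train, row[:2], k) - row[2]) for row in test)


rng = random.Random(0)
real = make_real_data(200, rng)
train_real, test_real = real[:150], real[150:]   # hold out real rows for scoring
synthetic = synthesize(train_real, 150, rng)     # generated from training rows only
mixed = train_real + synthetic

training_sets = {"real": train_real, "synthetic": synthetic, "mixed": mixed}
results = {name: mean_absolute_error(rows, test_real) for name, rows in training_sets.items()}

for name, error in results.items():
    print(f"{name:9s} MAE = {error:.2f}")
```

Scoring every variant on the same held-out real rows keeps the comparison fair: the synthetic rows are generated from the training split only, so no test information leaks into any of the three training sets.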
issn 2472-5749
2472-5730
doi 10.24059/olj.v29i1.4390
volume 29
issue 1
affiliation UVCE