Novel Considerations in the ML/AI Modeling of Large-Scale Learning Loss

This study is a path forward for the large-scale, data-driven quantitative analysis of noisy open-source data resources. The goal is to support qualitative findings of smaller studies with extensive open-source data-driven analytics in a new way. The study presented in this research focuses on learn...

Bibliographic Details
Main Authors: Mirna Elizondo, June Yu, Daniel Payan, LI Feng, Jelena Tesic
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
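Subjects: Noisy tabular data, data in the wild, gradient boosting, feature selection, dimensionality reduction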
Online Access: https://ieeexplore.ieee.org/document/10829573/
_version_ 1841536177242374144
author Mirna Elizondo
June Yu
Daniel Payan
LI Feng
Jelena Tesic
author_sort Mirna Elizondo
collection DOAJ
description This study charts a path forward for the large-scale, data-driven quantitative analysis of noisy open-source data resources. The goal is to support the qualitative findings of smaller studies in a new way, with extensive analytics driven by open-source data. The study focuses on learning interventions. It uses nine publicly accessible datasets to understand and mitigate the factors contributing to learning loss, and to understand practical learning recovery measures, in Texas public school districts after the recent school closures. The data came from the Census Bureau 2010, USAFACTS, the Texas Department of State Health Services (DSHS), the National Center for Education Statistics Common Core of Data (CCD), the US Bureau of Labor Statistics Local Area Unemployment Statistics (LAUS), and the Texas Education Agency (STAAR, TEA, ADA, ESSER). We demonstrate a novel data-driven approach to discovering insights from an extensive collection of heterogeneous public data sources. For the pandemic school closure period, mode of instruction and prior score emerged as the primary resilience factors in learning recovery interventions. Grade level and census community income level are the most influential factors in predicting learning loss for both Math and Reading. We demonstrate that unbiased, data-driven analysis at a larger scale can offer policymakers an actionable understanding of how to identify learning-loss tendencies and prevent them in public schools.
format Article
id doaj-art-6393b0dd808d493f87e86b552416592f
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-6393b0dd808d493f87e86b552416592f
2025-01-15T00:03:15Z
eng
IEEE
IEEE Access
ISSN 2169-3536
2025-01-01
vol. 13, pp. 7780-7792
DOI 10.1109/ACCESS.2025.3526412
IEEE document 10829573
Novel Considerations in the ML/AI Modeling of Large-Scale Learning Loss
Mirna Elizondo (https://orcid.org/0000-0002-9627-0755), Department of Computer Science, Texas State University, San Marcos, TX, USA
June Yu (https://orcid.org/0000-0003-3554-0552), State of Texas Legislative Budget Board, Austin, TX, USA
Daniel Payan, Love’s Travel Stops and Country Stores, Yukon, OK, USA
LI Feng (https://orcid.org/0000-0002-0536-1791), Department of Finance and Economics, Texas State University, San Marcos, TX, USA
Jelena Tesic (https://orcid.org/0000-0002-9972-9760), Department of Computer Science, Texas State University, San Marcos, TX, USA
title Novel Considerations in the ML/AI Modeling of Large-Scale Learning Loss
title_sort novel considerations in the ml ai modeling of large scale learning loss
topic Noisy tabular data
data in the wild
gradient boosting
feature selection
dimensionality reduction
url https://ieeexplore.ieee.org/document/10829573/