SelfCode 2.0: An Annotated Corpus of Student and Expert Line-by-Line Explanations of Code Examples for Automated Assessment

Bibliographic Details
Main Authors: Jeevan Chapagain, Arun Balajiee Lekshmi Narayanan, Kamil Akhuseyinoglu, Peter Brusilovsky, Vasile Rus
Format: Article
Language: English
Published: LibraryPress@UF 2025-05-01
Series: Proceedings of the International Florida Artificial Intelligence Research Society Conference
Online Access: https://journals.flvc.org/FLAIRS/article/view/138727
Description
Summary: Assessing student responses is a critical task in adaptive educational systems. More specifically, automatically evaluating students' self-explanations contributes to understanding their knowledge state, which is needed for personalized instruction, the crux of adaptive educational systems. To facilitate the development of Artificial Intelligence (AI) and Machine Learning models for automated assessment of learners' self-explanations, annotated datasets are essential. In response to this need, we developed the SelfCode 2.0 corpus, which consists of 3,019 pairs of student and expert explanations of Java code snippets, each annotated with semantic similarity, correctness, and completeness scores provided by experts. Alongside the dataset, we also provide performance results obtained with several baseline models based on TF-IDF and Sentence-BERT vector representations. This work aims to enhance the effectiveness of automated assessment tools in programming education and to contribute to better understanding and support of student learning of programming.
ISSN: 2334-0754
2334-0762
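
The summary mentions baseline models built on TF-IDF representations of student and expert explanations. The record does not describe how those baselines were implemented, so the following is only a minimal sketch of the general idea, assuming that a student explanation is compared to an expert explanation by cosine similarity of TF-IDF vectors. The tokenizer, the smoothed IDF formula, the function names, and the three example sentences are all illustrative assumptions and are not taken from the corpus or the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    # Smoothed IDF so that terms present in every document keep a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical explanations of the Java line `int sum = 0;` (not corpus data).
expert = "Declares an integer variable named sum and initializes it to zero."
student_close = "Creates an integer variable sum and sets it to zero."
student_off = "Prints the result to the console."

vec_expert, vec_close, vec_off = tfidf_vectors([expert, student_close, student_off])
score_close = cosine(vec_expert, vec_close)
score_off = cosine(vec_expert, vec_off)
print(f"close explanation: {score_close:.3f}")
print(f"unrelated explanation: {score_off:.3f}")
```

A lexical score like this rewards shared wording only, which is presumably why the summary also lists Sentence-BERT representations among the baselines; a paraphrase with little word overlap would score low here even if it were correct.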