Examining Exams Using Rasch Models and Assessment of Measurement Invariance

Bibliographic Details
Main Author: Achim Zeileis
Format: Article
Language: English
Published: Austrian Statistical Society 2025-04-01
Series: Austrian Journal of Statistics
Online Access: https://ajs.or.at/index.php/ajs/article/view/2055
Description
Summary: Many statisticians regularly teach large lecture courses on statistics, probability, or mathematics for students from other fields such as business and economics, social sciences, psychology, etc. The corresponding exams often use a multiple-choice or single-choice format and are typically evaluated and graded automatically, either by scanning printed exams or via online learning management systems. Although further analyses of these exams would be of interest, they are frequently not carried out. For example, a measurement scale for the difficulty of the questions (or items) and the ability of the students (or subjects) could be established using psychometric item response theory (IRT) models. Moreover, based on such a model, it could be assessed whether the exam is really fair for all participants or whether certain items are easier (or more difficult) for certain subgroups of students. Here, several recent methods for assessing measurement invariance and for detecting differential item functioning in the Rasch model are discussed and applied to results from a first-year mathematics exam with single-choice items. Several categorical, ordered, and numeric covariates like gender, prior experience, and prior mathematics knowledge are available to form potential subgroups with differential item functioning. Specifically, all analyses are demonstrated with a hands-on R tutorial using the psycho* family of R packages (psychotools, psychotree, psychomix), which provide a unified approach to estimating, visualizing, testing, mixing, and partitioning a range of psychometric models. The paper is dedicated to the memory of Fritz Leisch (1968–2024), and his contributions to various aspects of this work are highlighted.
ISSN: 1026-597X
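Illustration: The summary refers to the Rasch model and to differential item functioning (DIF). As a rough sketch of both ideas, and not the psycho* packages' API or the estimation and testing methods used in the article, the Python snippet below simulates single-choice responses for two groups under a Rasch model, in which the probability of a correct answer is the logistic function of ability minus item difficulty. It then compares a crude per-item difficulty proxy between the groups. All names (`rasch_prob`, `simulate_responses`, `centered_item_logits`), the item difficulties, the group sizes, and the one-logit shift on the third item are assumptions made for illustration only.

```python
import math
import random


def rasch_prob(ability, difficulty):
    """Rasch model: P(correct) = logistic(ability - difficulty)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def simulate_responses(abilities, difficulties, rng):
    """Draw a 0/1 response matrix with one row per subject, one column per item."""
    return [
        [1 if rng.random() < rasch_prob(a, d) else 0 for d in difficulties]
        for a in abilities
    ]


def centered_item_logits(responses):
    """Crude difficulty proxy (hypothetical helper, not a proper Rasch estimator):
    negative log-odds of the share of subjects solving each item,
    centered so that the proxies sum to zero across items."""
    n_subjects = len(responses)
    n_items = len(responses[0])
    logits = []
    for j in range(n_items):
        share = sum(row[j] for row in responses) / n_subjects
        # Keep the share strictly inside (0, 1) so the log-odds stay finite.
        share = min(max(share, 0.5 / n_subjects), 1.0 - 0.5 / n_subjects)
        logits.append(-math.log(share / (1.0 - share)))
    mean_logit = sum(logits) / n_items
    return [x - mean_logit for x in logits]


rng = random.Random(1)

# Assumed setup: five items; the third item is one logit harder for group B.
difficulties_a = [-1.0, -0.5, 0.0, 0.5, 1.0]
difficulties_b = [-1.0, -0.5, 1.0, 0.5, 1.0]

# Both groups share the same ability distribution, so gaps reflect the items only.
abilities_a = [rng.gauss(0.0, 1.0) for _ in range(2000)]
abilities_b = [rng.gauss(0.0, 1.0) for _ in range(2000)]

proxy_a = centered_item_logits(simulate_responses(abilities_a, difficulties_a, rng))
proxy_b = centered_item_logits(simulate_responses(abilities_b, difficulties_b, rng))

# A large gap for one item relative to the others hints at possible DIF.
gaps = [b - a for a, b in zip(proxy_a, proxy_b)]
flagged = max(range(len(gaps)), key=lambda j: abs(gaps[j]))

print("per-item gap (group B minus group A):", [round(g, 2) for g in gaps])
print("item with the largest gap (0-based index):", flagged)
```

The proxy only separates item effects from ability differences here because both simulated groups share one ability distribution; with real exam data, a fitted Rasch model and formal invariance tests, as described in the summary, would be the appropriate tools.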