Wrong Answers Only: Distractor Generation for Russian Reading Comprehension Questions Using a Translated Dataset
Background: Reading comprehension questions play an important role in language learning. Multiple-choice questions are a convenient form of reading comprehension assessment as they can be easily graded automatically. The availability of large reading comprehension datasets makes it possible to also...
Saved in:
| Main Author: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
National Research University Higher School of Economics
2024-12-01
|
| Series: | Journal of Language and Education |
| Subjects: | |
| Online Access: | https://jle.hse.ru/article/view/22244 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Background: Reading comprehension questions play an important role in language learning. Multiple-choice questions are a convenient form of reading comprehension assessment as they can be easily graded automatically. The availability of large reading comprehension datasets makes it possible to also automatically produce these items, reducing the cost of development of test question banks, by fine-tuning language models on them. While English reading comprehension datasets are common, this is not true for other languages, including Russian. A subtask of distractor generation poses a difficulty, as it requires producing multiple incorrect items.
Purpose: The purpose of this work is to develop an efficient distractor generation solution for Russian exam-style reading comprehension questions and to discover whether a translated English-language distractor dataset can offer a possibility for such solution.
Method: In this paper we fine-tuned two pre-trained Russian large language models, RuT5 and RuGPT3 (Zmitrovich et al, 2024), on distractor generation task for two classes of summarizing questions retrieved from a large multiple-choice question dataset, that was automatically translated from English to Russian. The first class consisted of questions on selection of the best title for the given passage, while the second class included questions on true/false statement selection. The models were assessed automatically on test and development subsets, and true statement distractor models were additionally evaluated on an independent set of questions from Russian state exam USE.
Results: It was observed that the models surpassed the non-fine-tuned baseline, the performance of RuT5 model was better than that of RuGPT3, and that the models handled true statement selection questions much better than title questions. On USE data models fine-tuned on translated dataset have shown better quality than that trained on existing Russian distractor dataset, with T5-based model also beating the baseline established by output of an existing English distractor generation model translated into Russian.
Conclusion: The obtained results show the possibility of a translated dataset to be used in distractor generation and the importance of the domain (language examination) and question type match in the input data.
|
|---|---|
| ISSN: | 2411-7390 |