Learning to Score: A Coding System for Constructed Response Items via Interactive Clustering
Constructed response items, which require students to give detailed and elaborate responses, are widely used in large-scale assessments. However, manually scoring a large volume of responses against a rubric is labor-intensive and impractical because of rater subjectivity and answer variability. Th...
| Main Authors: | Lingjing Luo, Hang Yang, Zhiwu Li, Witold Pedrycz |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-09-01 |
| Series: | Systems |
| Subjects: | constructed response items; response coding; text clustering; large-scale assessment |
| Online Access: | https://www.mdpi.com/2079-8954/12/9/380 |
| _version_ | 1850259385358483456 |
|---|---|
| author | Lingjing Luo; Hang Yang; Zhiwu Li; Witold Pedrycz |
| author_facet | Lingjing Luo; Hang Yang; Zhiwu Li; Witold Pedrycz |
| author_sort | Lingjing Luo |
| collection | DOAJ |
| description | Constructed response items, which require students to give detailed and elaborate responses, are widely used in large-scale assessments. However, manually scoring a large volume of responses against a rubric is labor-intensive and impractical because of rater subjectivity and answer variability. Automatic response coding, such as the automatic scoring of short answers, has become a critical component of learning and assessment systems. In this paper, we propose an interactive coding system called ASSIST that scores student responses efficiently with expert knowledge and then generates an automatic score classifier. First, the ungraded responses are clustered to generate specific codes, representative responses, and indicator words. A constraint set built from expert feedback serves as training data for metric learning to compensate for machine bias. Meanwhile, a classifier that maps responses to codes is trained on the clustering results. Second, the experts review each coded cluster, together with its representative responses and indicator words, and assign it a score. The resulting pairs of coded cluster and score are validated to ensure inter-rater reliability. Finally, the classifier scores new responses with out-of-distribution detection, which is based on the similarity between the response representation and the class proxy, i.e., the weight of the class in the last linear layer of the classifier. The originality of the system stems from the interactive response clustering procedure, which incorporates expert feedback, and from an adaptive automatic classifier that can identify new response classes. The proposed system is evaluated on our real-world assessment dataset. The experimental results demonstrate its effectiveness in saving human effort and improving scoring performance: the average improvements in clustering quality and scoring accuracy are 14.48% and 18.94%, respectively. Additionally, we report the inter-rater reliability, out-of-distribution rate, and cluster statistics before and after interaction. |
| format | Article |
| id | doaj-art-ddcf791fdac04782b7ab206bafd3ae2e |
| institution | OA Journals |
| issn | 2079-8954 |
| language | English |
| publishDate | 2024-09-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Systems |
| spelling | doaj-art-ddcf791fdac04782b7ab206bafd3ae2e; 2025-08-20T01:55:52Z; eng; MDPI AG; Systems; 2079-8954; 2024-09-01; 12; 9; 380; 10.3390/systems12090380; Learning to Score: A Coding System for Constructed Response Items via Interactive Clustering; Lingjing Luo (School of Marxism, University of Electronic Science and Technology of China, Chengdu 611731, China); Hang Yang (Macau Institute of Systems Engineering, Macau University of Science and Technology, Macau, China); Zhiwu Li (Macau Institute of Systems Engineering, Macau University of Science and Technology, Macau, China); Witold Pedrycz (Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6R 2V4, Canada); https://www.mdpi.com/2079-8954/12/9/380; constructed response items; response coding; text clustering; large-scale assessment |
| spellingShingle | Lingjing Luo; Hang Yang; Zhiwu Li; Witold Pedrycz; Learning to Score: A Coding System for Constructed Response Items via Interactive Clustering; Systems; constructed response items; response coding; text clustering; large-scale assessment |
| title | Learning to Score: A Coding System for Constructed Response Items via Interactive Clustering |
| title_full | Learning to Score: A Coding System for Constructed Response Items via Interactive Clustering |
| title_fullStr | Learning to Score: A Coding System for Constructed Response Items via Interactive Clustering |
| title_full_unstemmed | Learning to Score: A Coding System for Constructed Response Items via Interactive Clustering |
| title_short | Learning to Score: A Coding System for Constructed Response Items via Interactive Clustering |
| title_sort | learning to score a coding system for constructed response items via interactive clustering |
| topic | constructed response items; response coding; text clustering; large-scale assessment |
| url | https://www.mdpi.com/2079-8954/12/9/380 |
| work_keys_str_mv | AT lingjingluo learningtoscoreacodingsystemforconstructedresponseitemsviainteractiveclustering AT hangyang learningtoscoreacodingsystemforconstructedresponseitemsviainteractiveclustering AT zhiwuli learningtoscoreacodingsystemforconstructedresponseitemsviainteractiveclustering AT witoldpedrycz learningtoscoreacodingsystemforconstructedresponseitemsviainteractiveclustering |
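
The description above says that new responses are scored with out-of-distribution detection based on the similarity between the response representation and each class proxy (the class weight in the classifier's last linear layer). The sketch below illustrates that idea only; it is not the authors' implementation. Cosine similarity, the threshold of 0.5, and the names `cosine` and `score_with_ood` are assumptions made for illustration, since this record does not state the similarity measure or the cut-off used in the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def score_with_ood(embedding, class_proxies, threshold=0.5):
    """Pick the most similar class proxy, or flag the response as out-of-distribution.

    embedding     -- response representation (list of floats)
    class_proxies -- one weight vector per class, e.g. rows of the last linear layer
    threshold     -- hypothetical cut-off; below it the response is treated as a new class

    Returns (class_index, similarity); class_index is None when flagged.
    """
    sims = [cosine(embedding, proxy) for proxy in class_proxies]
    best = max(range(len(sims)), key=sims.__getitem__)
    if sims[best] < threshold:
        return None, sims[best]
    return best, sims[best]


# Toy data: two class proxies in a three-dimensional representation space.
proxies = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
known_label, known_sim = score_with_ood([0.9, 0.1, 0.0], proxies)
novel_label, novel_sim = score_with_ood([0.0, 0.0, 1.0], proxies)
```

In this toy setup the first response lies close to the first proxy and receives class index 0, while the second is orthogonal to both proxies and is flagged (`None`), which is the point at which a system of this kind would route the response back to an expert.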