Evaluating corpora with word lists and word difficulty

This study examines the application of an IRT analysis of words on lists including the General Service List (GSL), New General Service List (NGSL), Academic Word List (AWL), New Academic Word List (NAWL), and TOEIC Service List (TSL). By comparing line graphs, density distribution graphs, and boxpl...

Bibliographic Details
Main Author: Brent A. Culligan
Format: Article
Language: English
Published: Castledown Publishers 2019-12-01
Series: Vocabulary Learning and Instruction
Subjects: corpus validity; IRT; measurement; vocabulary testing; word difficulty; Yes/No test
Online Access: https://www.castledown.com/journals/vli/article/view/1747
_version_ 1850087397505630208
author Brent A. Culligan
author_facet Brent A. Culligan
author_sort Brent A. Culligan
collection DOAJ
description This study examines the application of an IRT analysis of words on lists including the General Service List (GSL), New General Service List (NGSL), Academic Word List (AWL), New Academic Word List (NAWL), and TOEIC Service List (TSL). By comparing line graphs, density distribution graphs, and boxplots for the average difficulty of each word list to related lists, we can get a visualization of the data’s distribution. Japanese EFL students responded to one or more of 84 Yes/No test forms compiled from 5,880 unique real words and 2,520 nonwords. The real words were analyzed using Winsteps (Linacre, 2005), resulting in IRT estimates for each word. By summing the difficulties of each word, we can calculate the average difficulty of each word list, which can then be used to rank the lists. In effect, the process supports the concurrent validity of the lists. The analysis indicates the word family approach results in more difficult word lists. The mean difficulties of the GSL and the BNC_COCA appear to be more divergent and more difficult, particularly over the first 4,000 words, possibly due to the use of Bauer and Nation’s (1993) Affix Level 6 definition for their compilation. Finally, just as we should expect word lists for beginners to have higher frequency words than subsequent lists, we should also expect them to be easier with more words known to learners. This can be seen with the gradual but marked difference between the different word lists of the NGSL and its supplemental SPs.
format Article
id doaj-art-4eeae38e73f643c8abaea2844e31d32f
institution DOAJ
issn 2981-9954
language English
publishDate 2019-12-01
publisher Castledown Publishers
record_format Article
series Vocabulary Learning and Instruction
spelling doaj-art-4eeae38e73f643c8abaea2844e31d32f; 2025-08-20T02:43:13Z; eng; Castledown Publishers; Vocabulary Learning and Instruction; 2981-9954; 2019-12-01; vol. 8, no. 1; 10.7820/vli.v08.1.Culligan; Evaluating corpora with word lists and word difficulty; Brent A. Culligan (Aoyama Gakuin Women’s Junior College); https://www.castledown.com/journals/vli/article/view/1747; corpus validity; IRT; measurement; vocabulary testing; word difficulty; Yes/No test
title Evaluating corpora with word lists and word difficulty
title_sort evaluating corpora with word lists and word difficulty
topic corpus validity
IRT
measurement
vocabulary testing
word difficulty
Yes/No test
url https://www.castledown.com/journals/vli/article/view/1747
work_keys_str_mv AT brentaculligan evaluatingcorporawithwordlistsandworddifficulty
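
note The description states that per-word difficulty estimates are summed to give an average difficulty for each word list, and that the averages are then used to rank the lists. A minimal sketch of that step follows, in Python. The words, the difficulty values, the list names, and the function names are all hypothetical and chosen for illustration; they are not taken from the study, and the sketch does not reproduce the Winsteps estimation itself.

```python
# Hypothetical per-word difficulty estimates in logits (illustrative values,
# not results from the study).
word_difficulty = {
    "time": -2.1,
    "make": -1.8,
    "analyze": 0.4,
    "criteria": 0.9,
    "invoice": 0.2,
    "shipment": 0.6,
}

# Hypothetical list memberships; real word lists are far longer.
word_lists = {
    "general": ["time", "make"],
    "academic": ["analyze", "criteria"],
    "business": ["invoice", "shipment"],
}


def mean_list_difficulty(words, difficulty):
    """Sum the difficulties of the words that have an estimate, then divide by their count."""
    known = [difficulty[w] for w in words if w in difficulty]
    if not known:
        raise ValueError("no word in the list has a difficulty estimate")
    return sum(known) / len(known)


def rank_lists(lists, difficulty):
    """Return (list name, mean difficulty) pairs ordered from easiest to hardest."""
    averages = {
        name: mean_list_difficulty(words, difficulty)
        for name, words in lists.items()
    }
    return sorted(averages.items(), key=lambda item: item[1])


ranking = rank_lists(word_lists, word_difficulty)
for name, avg in ranking:
    print(f"{name}: {avg:+.2f}")
```

Words without an estimate are skipped rather than counted as zero, which is one possible design choice; treating them as missing keeps a list's average from being pulled toward the scale origin.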