The SweLL Language Learner Corpus

The article presents SweLL, a new language learner corpus for Swedish, and its methodology, from collection and pseudonymisation (to protect learners' personal information) to annotation adapted to second language learning. The main aim is to deliver a well-annotated corpus of essays written by seco...

Bibliographic Details
Main Authors: Elena Volodina, Lena Granstedt, Arild Matsson, Beáta Megyesi, Ildikó Pilán, Julia Prentice, Dan Rosén, Lisa Rudebeck, Carl-Johan Schenström, Gunlög Sundberg, Mats Wirén
Format: Article
Language: English
Published: Linköping University Electronic Press 2019-12-01
Series: Northern European Journal of Language Technology
Online Access: https://nejlt.ep.liu.se/article/view/1374
author Elena Volodina
Lena Granstedt
Arild Matsson
Beáta Megyesi
Ildikó Pilán
Julia Prentice
Dan Rosén
Lisa Rudebeck
Carl-Johan Schenström
Gunlög Sundberg
Mats Wirén
author_sort Elena Volodina
collection DOAJ
description The article presents SweLL, a new language learner corpus for Swedish, and its methodology, from collection and pseudonymisation (to protect learners' personal information) to annotation adapted to second language learning. The main aim is to deliver a well-annotated corpus of essays written by second language learners of Swedish and make it available for research through a browsable environment. To that end, a new annotation tool and a new project management tool have been implemented, both with the main purpose of ensuring the reliability and quality of the final corpus. In the article we discuss the reasoning behind metadata selection and the principles of gold corpus compilation, and argue for the separation of normalization from correction annotation.
format Article
id doaj-art-8b870d0e034944908f4fbdb802a0b598
institution Kabale University
issn 2000-1533
language English
publishDate 2019-12-01
publisher Linköping University Electronic Press
record_format Article
series Northern European Journal of Language Technology
spelling doaj-art-8b870d0e034944908f4fbdb802a0b598; 2025-01-23T10:36:31Z; eng; Linköping University Electronic Press; Northern European Journal of Language Technology; 2000-1533; 2019-12-01; 6; 10.3384/nejlt.2000-1533.19667; The SweLL Language Learner Corpus
Elena Volodina (Språkbanken, University of Gothenburg)
Lena Granstedt (Umeå University)
Arild Matsson (University of Gothenburg)
Beáta Megyesi (Uppsala University)
Ildikó Pilán (University of Gothenburg)
Julia Prentice (University of Gothenburg)
Dan Rosén (University of Gothenburg)
Lisa Rudebeck (Stockholm University)
Carl-Johan Schenström (University of Gothenburg)
Gunlög Sundberg (Stockholm University)
Mats Wirén (Stockholm University)
The article presents SweLL, a new language learner corpus for Swedish, and its methodology, from collection and pseudonymisation (to protect learners' personal information) to annotation adapted to second language learning. The main aim is to deliver a well-annotated corpus of essays written by second language learners of Swedish and make it available for research through a browsable environment. To that end, a new annotation tool and a new project management tool have been implemented, both with the main purpose of ensuring the reliability and quality of the final corpus. In the article we discuss the reasoning behind metadata selection and the principles of gold corpus compilation, and argue for the separation of normalization from correction annotation.
https://nejlt.ep.liu.se/article/view/1374
title The SweLL Language Learner Corpus
url https://nejlt.ep.liu.se/article/view/1374