Adding a Novel Italian Treebank of Marked Constructions to Universal Dependencies

Bibliographic Details
Main Authors: Teresa Paccosi, Alessio Palmero Aprosio, Sara Tonelli
Format: Article
Language: English
Published: Accademia University Press 2023-08-01
Series: IJCoL
Online Access: https://journals.openedition.org/ijcol/1110
Description
Summary: In this paper we present MarkIT, a novel treebank developed to analyse marked constructions in Italian. The resource contains almost 1,300 sentences manually annotated with dependency relations following the Universal Dependencies paradigm. The sentences were extracted from essays written by high-school students over several years, which accounts for the variability in their structure and topic. We detail the process used to select the sentences, parse them automatically, and then correct them manually. The resource covers seven types of marked constructions (839 sentences overall), plus 453 sentences whose syntax could be wrongly classified as marked and which can serve as negative examples of markedness. We also present an evaluation of parsing performance, comparing a model trained on existing Italian treebanks with a model obtained by adding MarkIT to the training set.
ISSN: 2499-4553