DirKorp

In this paper, we present recent developments on a new version (v3.0) of DirKorp (Korpus direktivnih govornih činova hrvatskoga jezika), the first Croatian corpus of directive speech acts developed for the purposes of pragmatic research. The corpus contains 800 elicited speech acts collected via an...

Full description

Saved in:
Bibliographic Details
Main Authors: Petra Bago, Virna Karlić
Format: Article
Language:English
Published: University of Ljubljana Press (Založba Univerze v Ljubljani) 2023-09-01
Series:Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave
Subjects:
Online Access:https://journals.uni-lj.si/slovenscina2/article/view/11995
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1849721484760580096
author Petra Bago
Virna Karlić
author_facet Petra Bago
Virna Karlić
author_sort Petra Bago
collection DOAJ
description In this paper, we present recent developments on a new version (v3.0) of DirKorp (Korpus direktivnih govornih činova hrvatskoga jezika), the first Croatian corpus of directive speech acts developed for the purposes of pragmatic research. The corpus contains 800 elicited speech acts collected via an online questionnaire with role-playing tasks, a method of simulated communication that is implemented under pre-set conditions. This method is suitable for researching speech acts due to the ability to collect a great number of examples of such acts of equal propositional content and illocutionary purpose used in the same controlled situations. The presented situations are classified into two categories with regard to the relationship between the participants of the communication act: (1) situations involving interlocutors who are not in a familiar relationship; (2) situations involving interlocutors in a familiar relationship. Assignments of the two categories are organized into four pairs, asking respondents to share a speech act of similar propositional content. The respondents were 100 Croatian speakers, all undergraduate (63%) or graduate students (37%) of the Faculty of Humanities and Social Sciences (University of Zagreb). The corpus has been manually annotated on the speech act level, each speech act containing up to 14 features: (1) respondent ID, (2) familiarity/unfamiliarity, (3) utterance type, (4) directive performative verb in 1st person, (5) illocutionary force, (6) propositional content, (7) T/V form, (8) exhortative, (9) lexical marker of request, (10) lexical marker of apology, (11) lexical marker of gratitude, (12) honorific title, (13) grammatical mood, and (14) modal verb in 2nd person. It contains 12,676 tokens and 1,692 types. The corpus is encoded according to the TEI P5: Guidelines for Electronic Text Encoding and Interchange, developed and maintained by the Text Encoding Initiative Consortium (TEI). DirKorp is available for download under the CC BY-SA 4.0 license from GitHub in TEI format. We describe applied pragmatic annotation as well as the structure of the corpus.
format Article
id doaj-art-ac6b1efb3ff543768d27f69cf2abe309
institution DOAJ
issn 2335-2736
language English
publishDate 2023-09-01
publisher University of Ljubljana Press (Založba Univerze v Ljubljani)
record_format Article
series Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave
spelling doaj-art-ac6b1efb3ff543768d27f69cf2abe3092025-08-20T03:11:39ZengUniversity of Ljubljana Press (Založba Univerze v Ljubljani)Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave2335-27362023-09-0111110.4312/slo2.0.2023.1.189-21718382DirKorpPetra Bago0Virna Karlić1University of Zagreb, Faculty of Humanities and Social Sciences, CroatiaUniversity of Zagreb, Faculty of Humanities and Social Sciences, Croatia In this paper, we present recent developments on a new version (v3.0) of DirKorp (Korpus direktivnih govornih činova hrvatskoga jezika), the first Croatian corpus of directive speech acts developed for the purposes of pragmatic research. The corpus contains 800 elicited speech acts collected via an online questionnaire with role-playing tasks, a method of simulated communication that is implemented under pre-set conditions. This method is suitable for researching speech acts due to the ability to collect a great number of examples of such acts of equal propositional content and illocutionary purpose used in the same controlled situations. The presented situations are classified into two categories with regard to the relationship between the participants of the communication act: (1) situations involving interlocutors who are not in a familiar relationship; (2) situations involving interlocutors in a familiar relationship. Assignments of the two categories are organized into four pairs, asking respondents to share a speech act of similar propositional content. The respondents were 100 Croatian speakers, all undergraduate (63%) or graduate students (37%) of the Faculty of Humanities and Social Sciences (University of Zagreb). The corpus has been manually annotated on the speech act level, each speech act containing up to 14 features: (1) respondent ID, (2) familiarity/unfamiliarity, (3) utterance type, (4) directive performative verb in 1st person, (5) illocutionary force, (6) propositional content, (7) T/V form, (8) exhortative, (9) lexical marker of request, (10) lexical marker of apology, (11) lexical marker of gratitude, (12) honorific title, (13) grammatical mood, and (14) modal verb in 2nd person. It contains 12,676 tokens and 1,692 types. The corpus is encoded according to the TEI P5: Guidelines for Electronic Text Encoding and Interchange, developed and maintained by the Text Encoding Initiative Consortium (TEI). DirKorp is available for download under the CC BY-SA 4.0 license from GitHub in TEI format. We describe applied pragmatic annotation as well as the structure of the corpus. https://journals.uni-lj.si/slovenscina2/article/view/11995corpus pragmaticsdirective speech actsDirKorpCroatian language
spellingShingle Petra Bago
Virna Karlić
DirKorp
Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave
corpus pragmatics
directive speech acts
DirKorp
Croatian language
title DirKorp
title_full DirKorp
title_fullStr DirKorp
title_full_unstemmed DirKorp
title_short DirKorp
title_sort dirkorp
topic corpus pragmatics
directive speech acts
DirKorp
Croatian language
url https://journals.uni-lj.si/slovenscina2/article/view/11995
work_keys_str_mv AT petrabago dirkorp
AT virnakarlic dirkorp