Defining collocation for Slovenian lexical resources
In this paper, we define the notion of collocation for the purpose of its use in machine-readable language resources, which will be used in the creation of electronic dictionaries and language applications for Slovene. Based on theoretical and lexicographically-driven studies we define collocation...
| Main Authors: | Iztok Kosem, Simon Krek, Polona Gantar |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | University of Ljubljana Press (Založba Univerze v Ljubljani), 2020-08-01 |
| Series: | Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave |
| Subjects: | collocation; multiword lexical unit; word combination; Slovene; lexicography; dictionary database |
| Online Access: | https://journals.uni-lj.si/slovenscina2/article/view/9338 |
| _version_ | 1849319396219027456 |
|---|---|
| author | Iztok Kosem; Simon Krek; Polona Gantar |
| author_facet | Iztok Kosem; Simon Krek; Polona Gantar |
| author_sort | Iztok Kosem |
| collection | DOAJ |
| description | In this paper, we define the notion of collocation for use in machine-readable language resources, which will be used in the creation of electronic dictionaries and language applications for Slovene. Based on theoretical and lexicographically driven studies, we define collocation as a lexical phenomenon characterized by three key aspects: statistical, syntactic, and semantic. We take lexicographic relevance as a point of departure for defining collocations within the typology of word combinations, as well as for distinguishing them from free combinations. Free combinations are (frequent) syntactically valid word combinations without lexicographic value; consequently, there is no need to describe their meaning or syntactic role. Next, we distinguish collocations from all multiword lexical units (compounds, phraseological units and lexico-grammatical units) using the lexicographic view that multiword lexical units, whose meaning is not the sum of their parts, require a description of their meaning, whereas collocations do not. In the final part, we return to the three aspects of collocation and their role in the automatic extraction of collocational information from corpora. The semantic criterion, or dictionary relevance of extracted collocations, has particularly exposed the problem of semantically broad collocates, such as certain types of adverbs, adjectives and verbs, and of words which feature in different syntactic roles (e.g. pronouns and adjuncts). We also discuss the particular issue of collocations related to proper names and the decisions about their inclusion in the dictionary, based on the evaluation of lexicographers. |
| format | Article |
| id | doaj-art-243991e9bc654ae8a820af79c846be07 |
| institution | Kabale University |
| issn | 2335-2736 |
| language | English |
| publishDate | 2020-08-01 |
| publisher | University of Ljubljana Press (Založba Univerze v Ljubljani) |
| record_format | Article |
| series | Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave |
| spelling | doaj-art-243991e9bc654ae8a820af79c846be07; 2025-08-20T03:50:31Z; eng; University of Ljubljana Press (Založba Univerze v Ljubljani); Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave; ISSN 2335-2736; 2020-08-01; 8(2); DOI 10.4312/slo2.0.2020.2.1-27; Defining collocation for Slovenian lexical resources; Iztok Kosem (University of Ljubljana, Faculty of Arts, Slovenia; Jožef Stefan Institute, Ljubljana, Slovenia); Simon Krek (Jožef Stefan Institute, Ljubljana, Slovenia); Polona Gantar (University of Ljubljana, Faculty of Arts, Slovenia); https://journals.uni-lj.si/slovenscina2/article/view/9338; collocation; multiword lexical unit; word combination; Slovene; lexicography; dictionary database |
| title | Defining collocation for Slovenian lexical resources |
| topic | collocation; multiword lexical unit; word combination; Slovene; lexicography; dictionary database |
| url | https://journals.uni-lj.si/slovenscina2/article/view/9338 |