Creating Lexical Resources in TEI P5

Although most of the relevant dictionary productions of the recent past have relied on digital data and methods, there is little consensus on formats and standards. The Institute for Corpus Linguistics and Text Technology (ICLTT) of the Austrian Academy of Sciences has been conducting a number of va...

Bibliographic Details
Main Authors: Gerhard Budin, Stefan Majewski, Karlheinz Mörth
Format: Article
Language: deu
Published: Text Encoding Initiative Consortium 2012-10-01
Series: Journal of the Text Encoding Initiative
Subjects: P5; dictionaries; digital lexicography; NLP
Online Access: https://journals.openedition.org/jtei/522
author Gerhard Budin
Stefan Majewski
Karlheinz Mörth
collection DOAJ
description Although most of the relevant dictionary productions of the recent past have relied on digital data and methods, there is little consensus on formats and standards. The Institute for Corpus Linguistics and Text Technology (ICLTT) of the Austrian Academy of Sciences has been conducting a number of varied lexicographic projects, both digitising print dictionaries and creating genuinely digital lexicographic data. This data was designed to serve several purposes: machine-readability was only one of them, and a second goal was interoperability with digital NLP tools. To this end, a uniform encoding system applicable across all the projects was developed. The paper describes the constraints imposed on the content models of the various elements of the TEI dictionary module and argues in favour of TEI P5 as an encoding system suited not only to representing digitised print dictionaries but also to NLP purposes.
format Article
id doaj-art-6b893ec907d84a87ae79d335d06cc7f1
institution Kabale University
issn 2162-5603
language deu
publishDate 2012-10-01
publisher Text Encoding Initiative Consortium
record_format Article
series Journal of the Text Encoding Initiative
spelling doaj-art-6b893ec907d84a87ae79d335d06cc7f1; 2025-01-30T13:56:14Z; deu; Text Encoding Initiative Consortium; Journal of the Text Encoding Initiative; 2162-5603; 2012-10-01; 3; 10.4000/jtei.522; Creating Lexical Resources in TEI P5; Gerhard Budin; Stefan Majewski; Karlheinz Mörth; https://journals.openedition.org/jtei/522; P5; dictionaries; digital lexicography; NLP
title Creating Lexical Resources in TEI P5
topic P5
dictionaries
digital lexicography
NLP
url https://journals.openedition.org/jtei/522