Pretrained Language Models as Containers of the Discursive Knowledge

Discourses can be treated as instances of knowledge. The dynamic space in which the trajectories of these discourses are described can be regarded as a model of knowledge. Such a space is called a discursive space. Its scope is defined by a set of discourses. The procedure of constructing such a spa...

Bibliographic Details
Main Author:	Rafal Maciag
Format:	Article
Language:	English
Published:	MDPI AG 2024-01-01
Series:	Computer Sciences & Mathematics Forum
Subjects:	natural language processing; neural language models; knowledge; discourse; discursive space
Online Access:	https://www.mdpi.com/2813-0324/8/1/93

author	Rafal Maciag
collection	DOAJ
description	Discourses can be treated as instances of knowledge. The dynamic space in which the trajectories of these discourses are described can be regarded as a model of knowledge. Such a space is called a discursive space. Its scope is defined by a set of discourses. The procedure of constructing such a space is a serious problem, and so far, the only solution has been to identify the dimensions of this space through the qualitative analysis of texts on the basis of the discourses that were identified. This paper proposes a solution by using an extended variant of the embedding technique, which is the basis of neural language models (pre-trained language models and large language models) in the field of natural language processing (NLP). This technique makes it possible to create a semantic model of the language in the form of a multidimensional space. The solution proposed in this article is to repeat the embedding technique but at a higher level of abstraction, that is, the discursive level. First, the discourses would be isolated from the prepared corpus of texts, preserving their order. Then, from these discourses, identified by names, a sequence of names would be created, which would be a kind of supertext. A language model would be trained on this supertext. This model would be a multidimensional space. This space would be a discursive space constructed for one moment in time. The described steps repeated in time would allow one to construct the assumed dynamic space of discourses, i.e., discursive space.
format	Article
id	doaj-art-e2913265419846fc851e4b5705a88de6
institution	DOAJ
issn	2813-0324
language	English
publishDate	2024-01-01
publisher	MDPI AG
record_format	Article
series	Computer Sciences & Mathematics Forum
spelling	doaj-art-e2913265419846fc851e4b5705a88de6; 2025-08-20T02:42:36Z; eng; MDPI AG; Computer Sciences & Mathematics Forum; ISSN 2813-0324; 2024-01-01; vol. 8, no. 1, art. 93; DOI 10.3390/cmsf2023008093; Pretrained Language Models as Containers of the Discursive Knowledge; Rafal Maciag (Institute of Information Studies, Jagiellonian University, 30-348 Cracow, Poland); https://www.mdpi.com/2813-0324/8/1/93
title	Pretrained Language Models as Containers of the Discursive Knowledge
topic	natural language processing; neural language models; knowledge; discourse; discursive space
url	https://www.mdpi.com/2813-0324/8/1/93
work_keys_str_mv	AT rafalmaciag pretrainedlanguagemodelsascontainersofthediscursiveknowledge
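
The procedure outlined in the description field (isolate discourses, turn their names into an ordered "supertext", then learn a vector space from that sequence) can be illustrated with a small sketch. This is not the author's implementation: it swaps the neural language model for a simple count-based co-occurrence embedding, and the discourse names, the `window` parameter, and the function names are all hypothetical choices made for illustration.

```python
from math import sqrt


def cooccurrence_vectors(supertext, window=1):
    """Map each discourse name to a vector of co-occurrence counts.

    Assumption: a count-based embedding stands in for the trained
    language model mentioned in the abstract.
    """
    vocab = sorted(set(supertext))
    index = {name: i for i, name in enumerate(vocab)}
    vectors = {name: [0.0] * len(vocab) for name in vocab}
    for pos, name in enumerate(supertext):
        lo = max(0, pos - window)
        hi = min(len(supertext), pos + window + 1)
        for ctx_pos in range(lo, hi):
            if ctx_pos != pos:
                # Count the neighbouring discourse name as context.
                vectors[name][index[supertext[ctx_pos]]] += 1.0
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def spaces_over_time(supertexts_by_period, window=1):
    """Build one space per time slice; together they form the dynamic space."""
    return {
        period: cooccurrence_vectors(supertext, window)[1]
        for period, supertext in supertexts_by_period.items()
    }


# Hypothetical supertext: discourse names in the order they were identified.
supertext = [
    "economy", "technology", "economy", "technology",
    "ethics", "law", "ethics", "law",
]
vocab, vectors = cooccurrence_vectors(supertext, window=1)
```

Each discourse name ends up as a point whose coordinates reflect which other discourses surround it in the supertext, so `cosine(vectors["economy"], vectors["ethics"])` compares two discourses by their contexts. Calling `spaces_over_time` with one supertext per period would give one such space per moment in time, which is the sense in which repeating the steps yields a dynamic space.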


Similar Items