Pretrained Language Models as Containers of the Discursive Knowledge

Discourses can be treated as instances of knowledge. The dynamic space in which the trajectories of these discourses are described can be regarded as a model of knowledge. Such a space is called a discursive space. Its scope is defined by a set of discourses. The procedure of constructing such a spa...

Bibliographic Details
Main Author: Rafal Maciag
Format: Article
Language: English
Published: MDPI AG 2024-01-01
Series: Computer Sciences & Mathematics Forum
Subjects: natural language processing; neural language models; knowledge; discourse; discursive space
Online Access: https://www.mdpi.com/2813-0324/8/1/93
author Rafal Maciag
collection DOAJ
description Discourses can be treated as instances of knowledge. The dynamic space in which the trajectories of these discourses are described can be regarded as a model of knowledge. Such a space is called a discursive space. Its scope is defined by a set of discourses. The procedure of constructing such a space is a serious problem, and so far, the only solution has been to identify the dimensions of this space through the qualitative analysis of texts on the basis of the discourses that were identified. This paper proposes a solution by using an extended variant of the embedding technique, which is the basis of neural language models (pre-trained language models and large language models) in the field of natural language processing (NLP). This technique makes it possible to create a semantic model of the language in the form of a multidimensional space. The solution proposed in this article is to repeat the embedding technique but at a higher level of abstraction, that is, the discursive level. First, the discourses would be isolated from the prepared corpus of texts, preserving their order. Then, from these discourses, identified by names, a sequence of names would be created, which would be a kind of supertext. A language model would be trained on this supertext. This model would be a multidimensional space. This space would be a discursive space constructed for one moment in time. The described steps repeated in time would allow one to construct the assumed dynamic space of discourses, i.e., discursive space.
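The procedure outlined in the description (discourse names in corpus order forming a supertext, a model built on that supertext, and the whole process repeated over time) can be illustrated with a minimal sketch. It is not the author's implementation: the record describes training a neural language model, whereas the sketch substitutes a simple count-based co-occurrence embedding so that it runs with the Python standard library alone. The discourse names, the window size, and the time-slice labels are all hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(supertext, window=2):
    """Map each discourse name to counts of the names found within
    `window` positions of it in the supertext (a count-based stand-in
    for the embedding a trained language model would provide)."""
    vectors = defaultdict(Counter)
    for i, name in enumerate(supertext):
        lo = max(0, i - window)
        hi = min(len(supertext), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[name][supertext[j]] += 1
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical supertext: names of discourses in the order in which
# they were identified in the prepared corpus.
supertext = [
    "economy", "technology", "economy", "ethics", "technology",
    "economy", "ethics", "law", "ethics", "law",
]

# One discursive space, built for a single moment in time.
space = cooccurrence_vectors(supertext, window=1)

# Repeating the construction per time slice gives a sequence of spaces,
# i.e. a rough analogue of the dynamic discursive space.
# The slice labels and boundaries are assumptions made for illustration.
slices = {"2023": supertext[:5], "2024": supertext[5:]}
dynamic_space = {
    period: cooccurrence_vectors(names, window=1)
    for period, names in slices.items()
}

if __name__ == "__main__":
    for a, b in [("economy", "technology"), ("economy", "law")]:
        print(a, b, round(cosine(space[a], space[b]), 3))
```

In this sketch, discourses that appear in similar neighbourhoods of the supertext receive similar vectors, which is the property the embedding technique relies on. A neural model trained on the same sequence would replace `cooccurrence_vectors` while leaving the surrounding steps unchanged.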
format Article
id doaj-art-e2913265419846fc851e4b5705a88de6
institution DOAJ
issn 2813-0324
language English
publishDate 2024-01-01
publisher MDPI AG
record_format Article
series Computer Sciences & Mathematics Forum
spelling doaj-art-e2913265419846fc851e4b5705a88de6; 2025-08-20T02:42:36Z; eng; MDPI AG; Computer Sciences & Mathematics Forum; 2813-0324; 2024-01-01; 8; 1; 93; 10.3390/cmsf2023008093; Pretrained Language Models as Containers of the Discursive Knowledge; Rafal Maciag (Institute of Information Studies, Jagiellonian University, 30-348 Cracow, Poland); https://www.mdpi.com/2813-0324/8/1/93; natural language processing; neural language models; knowledge; discourse; discursive space
title Pretrained Language Models as Containers of the Discursive Knowledge
topic natural language processing
neural language models
knowledge
discourse
discursive space
url https://www.mdpi.com/2813-0324/8/1/93