Thematic marking of the anthropological corpus: a methodology for classifying miners’ narratives

The article describes the methodology for creating an anthropological corpus of texts that are united by belonging to the mining profession. The content of the work correlates with three research tasks: development of a thematic classification, introduction of conventions for highlighting narratives...

Bibliographic Details
Main Authors: L. L. Mazitova, L. M. Panteleeva
Format: Article
Language: English
Published: Samara National Research University 2025-01-01
Series: Вестник Самарского университета: История, педагогика, филология
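Subjects: anthropological corpus; computational linguistics; extralinguistic markup; thematic classification; narrative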
Online Access: https://journals.ssau.ru/hpp/article/viewFile/28139/11022
author L. L. Mazitova
L. M. Panteleeva
collection DOAJ
description The article describes a methodology for creating an anthropological corpus of texts united by their connection to the mining profession. The work addresses three research tasks: 1) development of a thematic classification, 2) introduction of conventions for highlighting narratives in the text, and 3) determination of principles for organizing the corpus according to the themes of the narratives. The thematic classification of narratives resulted from the analysis of several "control" texts. It is a multi-level systematization of topics of a cultural and professional nature: the main (basic) topics can have internal detail, which gives rise to microtopics. The number of microtopics varies and is determined by the specifics of the topic itself, i.e. by how many aspects of the corresponding real-world phenomenon it can characterize. Fragments of text in which a particular topic or subtopic is realized are enclosed in square brackets. On both sides of the brackets, the numerical designation of the corresponding topic or subtopic from the thematic classification is indicated. Thematic marking follows these principles: compliance of the narrative with the main theme of the corpus, incompleteness of the thematic classification, integrity of the narrative, non-rigid marking, and taking "zero" topics into account. The article also describes particular marking problems that arise when the informant does not develop a topic even though the context allows different interpretations of the topic of the narrative. In such situations, the topic purposefully represented by the informant is separated during marking from facts that may be only indirectly related to a topic, or not related to any topic at all. The described methodology thus represents a first approach to developing a meta-markup standard.
The type of marking used can be characterized as extralinguistic in terms of the subject information, analytical in terms of the method of identification, narrative in terms of the object, multi-level in terms of the depth of classification, manual in terms of the method of assigning labels, and neutral with respect to any particular theory.
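The bracket convention described in the abstract could be read by a program roughly as follows. This is only a minimal sketch under stated assumptions: the abstract does not give the exact notation, so the form `2.1[ fragment ]2.1` (the same dotted number directly before the opening bracket and directly after the closing one), the absence of nested brackets, and the sample sentences are all hypothetical and are not taken from the article or its corpus.

```python
import re
from collections import defaultdict

# ASSUMED notation (not confirmed by the article): CODE[ fragment ]CODE,
# where CODE is a dotted number such as "3" (topic) or "2.1" (subtopic).
# The backreference \1 requires the same code on both sides of the brackets;
# the lookahead stops a closing "2" from matching the start of "2.1".
MARKED = re.compile(r"(\d+(?:\.\d+)*)\[(.*?)\]\1(?!\.?\d)", re.DOTALL)


def extract_fragments(text):
    """Return (code, fragment) pairs for every marked fragment in text."""
    return [(m.group(1), m.group(2).strip()) for m in MARKED.finditer(text)]


def group_by_topic(pairs):
    """Group fragments under the top-level topic number of their code."""
    grouped = defaultdict(list)
    for code, fragment in pairs:
        grouped[code.split(".")[0]].append((code, fragment))
    return dict(grouped)


# Invented sample text, for illustration only.
sample = (
    "I came to the mine in 1985. "
    "2.1[My father worked underground all his life.]2.1 "
    "Then we moved. 3[The shift started at six.]3"
)

pairs = extract_fragments(sample)
by_topic = group_by_topic(pairs)
```

In this sketch, text outside the brackets is simply skipped, which loosely mirrors the non-rigid marking principle: not every sentence has to be assigned to a topic.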
format Article
id doaj-art-ade471a3117a49c7bf43a6dffac73bff
institution OA Journals
issn 2542-0445
2712-8946
language English
publishDate 2025-01-01
publisher Samara National Research University
record_format Article
series Вестник Самарского университета: История, педагогика, филология
spelling doaj-art-ade471a3117a49c7bf43a6dffac73bff
2025-08-20T01:49:31Z
eng
Samara National Research University
Вестник Самарского университета: История, педагогика, филология
2542-0445
2712-8946
2025-01-01
Volume 30, issue 4, pages 156-164
10.18287/2542-0445-2024-30-4-156-164
9070
Thematic marking of the anthropological corpus: a methodology for classifying miners’ narratives
L. L. Mazitova (https://orcid.org/0000-0002-6775-8233), HSE Campus in Perm
L. M. Panteleeva (https://orcid.org/0000-0002-5815-1288), HSE Campus in Perm
https://journals.ssau.ru/hpp/article/viewFile/28139/11022
anthropological corpus
computational linguistics
extralinguistic markup
thematic classification
narrative
title Thematic marking of the anthropological corpus: a methodology for classifying miners’ narratives
topic anthropological corpus
computational linguistics
extralinguistic markup
thematic classification
narrative
url https://journals.ssau.ru/hpp/article/viewFile/28139/11022