Lexical enrichment of philological textbooks: corpus and statistical approaches

Bibliographic Details
Main Authors: Khalida N. Galimova, Ekaterina V. Martynova, Svetlana A. Moskvitcheva
Format: Article
Language: English
Published: Peoples’ Friendship University of Russia (RUDN University) 2024-12-01
Series: Russian Language Studies
Online Access: https://journals.rudn.ru/russian-language-studies/article/viewFile/42909/24487
collection DOAJ
description The study is motivated by the need for objective data on vocabulary frequency in Russian language textbooks and on vocabulary acquisition in teaching Russian as a native language at school. The article describes the creation of a frequency dictionary of philological textbooks based on a linguistic corpus of Russian language and literature textbooks for grades 5-7. Philological textbooks present an average model of the Russian language and literature, reflecting topics relevant to the student and gradually increasing in lexical complexity. The aim of the article is to assess lexical enrichment in philological textbooks for grades 5-7 and to improve the methodology for compiling frequency lists. The study draws on a corpus of 66 textbooks on the Russian language and literature with a total size of 1,553,224 tokens. Corpus and computational linguistics methods, comparative-contrastive methods, and statistical methods (the IKSWEB program, the Google Colab environment, and the Pandas, NLTK and Pymorphy libraries) revealed that the frequency list for the 5th grade comprises 8984 lemmas; for the 6th grade, 7572 lemmas; and for the 7th grade, 7321 lemmas. Vocabulary “enrichment” in the 6th grade consists of 258 lexemes, and in the 7th grade, 150 lexemes. The lexical core of the three frequency lists consists of words from the thematic groups “Philological terms”, “Verbs denoting educational actions”, “Nature”, “Family and friendly relations”, “Art”, and “Time”. The 6th grade vocabulary “enrichment” includes archaisms and historicisms, terms denoting forms of the national language, and word-formation terms. The 7th grade “enrichment” comprises linguistic terms on the themes “Names of verb forms” and “Religion”, as well as socio-political vocabulary. The frequency lists confirmed the hypothesis that texts in modern textbooks on the Russian language and literature are thematically balanced and that linguistic terminology forms their core. Future work could extend the analysis to educational texts in philology and other subjects from senior school textbooks in order to identify intra- and meta-subject links.
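The abstract does not spell out how the per-grade frequency lists or the “enrichment” counts were computed. The sketch below is only an illustration under stated assumptions: “enrichment” is taken to mean lemmas that appear in a grade's list but in none of the earlier grades' lists, the lemmatizer is a pass-through placeholder where a morphological analyzer would normally be plugged in, and the function names (`tokenize`, `frequency_list`, `enrichment`) and the toy texts are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and keep runs of letters (Latin or Cyrillic).
    # A stand-in for a real tokenizer; digits and punctuation are dropped.
    return re.findall(r"[^\W\d_]+", text.lower())


def frequency_list(text, lemmatize=lambda word: word, min_count=1):
    # Count lemmas in one grade's text. `lemmatize` is a placeholder:
    # by default it returns the word unchanged (an assumption for this sketch).
    counts = Counter(lemmatize(token) for token in tokenize(text))
    return {lemma: n for lemma, n in counts.items() if n >= min_count}


def enrichment(freq_lists):
    # For each list, in grade order, return the lemmas not seen in any
    # earlier list. This definition of "enrichment" is an assumption.
    seen = set()
    added = []
    for freq in freq_lists:
        new_lemmas = set(freq) - seen
        added.append(new_lemmas)
        seen |= new_lemmas
    return added


# Toy texts standing in for the grade 5, 6 and 7 subcorpora (not real data).
texts = {
    5: "noun verb noun family",
    6: "noun dialect suffix family",
    7: "verb participle dialect",
}
lists = [frequency_list(texts[grade]) for grade in (5, 6, 7)]
new_by_grade = enrichment(lists)

for grade, new_lemmas in zip((5, 6, 7), new_by_grade):
    print(grade, len(lists[grade - 5]), sorted(new_lemmas))
```

With a real corpus, the `lemmatize` argument would wrap a morphological analyzer, and `min_count` could filter out rare lemmas before the lists are compared; both choices affect the resulting counts.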
id doaj-art-44bc3fd939d24c21bf1409cb5aa9a7eb
institution DOAJ
issn 2618-8163
2618-8171
spelling Russian Language Studies, vol. 22, no. 4 (2024-12-01), pp. 579-597. DOI: 10.22363/2618-8163-2024-22-4-579-597
Khalida N. Galimova, Kazan (Volga Region) Federal University, https://orcid.org/0000-0003-1817-5004
Ekaterina V. Martynova, Kazan (Volga Region) Federal University, https://orcid.org/0000-0001-5883-0718
Svetlana A. Moskvitcheva, RUDN University, https://orcid.org/0000-0002-8047-7030
topic lemma
frequency dictionary
frequency lists
academic corpus of the russian language
term
philology
lexical coverage
lexical enrichment