Lexical enrichment of philological textbooks: corpus and statistical approaches
The relevance of the study is determined by the need to study objective data on vocabulary frequency in Russian language textbooks and mastering vocabulary in teaching Russian as the native language at school. The article describes the experience of creating a frequency dictionary of philological te...
| Main Authors: | Khalida N. Galimova, Ekaterina V. Martynova, Svetlana A. Moskvitcheva |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Peoples’ Friendship University of Russia (RUDN University), 2024-12-01 |
| Series: | Russian Language Studies |
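| Subjects: | lemma; frequency dictionary; frequency lists; academic corpus of the russian language; term; philology; lexical coverage; lexical enrichment |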
| Online Access: | https://journals.rudn.ru/russian-language-studies/article/viewFile/42909/24487 |
| _version_ | 1850025282147188736 |
|---|---|
| author | Khalida N. Galimova, Ekaterina V. Martynova, Svetlana A. Moskvitcheva |
| author_facet | Khalida N. Galimova, Ekaterina V. Martynova, Svetlana A. Moskvitcheva |
| author_sort | Khalida N. Galimova |
| collection | DOAJ |
| description | The relevance of the study is determined by the need for objective data on vocabulary frequency in Russian language textbooks and on vocabulary acquisition in teaching Russian as a native language at school. The article describes the creation of a frequency dictionary of philological textbooks based on a linguistic corpus of textbooks on the Russian language and literature for grades 5-7. Philological textbooks present an average model of the Russian language and literature, reflecting topics relevant to the student and gradually increasing lexical complexity. The aim of the article is to assess lexical enrichment in philological textbooks for grades 5-7 and to improve the methodology for compiling frequency lists. The study was carried out on a corpus of 66 textbooks on the Russian language and literature with a total size of 1,553,224 tokens. Corpus and computational linguistics methods, together with comparative-contrastive and statistical methods (the IKSWEB program, the Google Colab environment, and the Pandas, NLTK and Pymorphy libraries), revealed that the frequency list for the 5th grade comprises 8984 lemmas; for the 6th grade, 7572 lemmas; and for the 7th grade, 7321 lemmas. Vocabulary “enrichment” in the 6th grade consists of 258 lexemes, and in the 7th grade, 150 lexemes. The lexical core of the three frequency lists consists of words from the thematic groups “Philological terms”, “Verbs denoting educational actions”, “Nature”, “Family and friendly relations”, “Art”, and “Time”. The 6th grade vocabulary “enrichment” includes archaisms and historicisms, terms denoting forms of the national language, and word-formation terms. The 7th grade “enrichment” comprises linguistic terms on the themes “Names of verb forms”, “Religion”, and socio-political vocabulary. The frequency lists confirmed the hypothesis that texts in modern textbooks on the Russian language and literature are thematically balanced and that linguistic terminology forms their core. Future work could conduct similar research on educational texts in philology and other subjects from senior school textbooks in order to define intra- and meta-subject links. |
| format | Article |
| id | doaj-art-44bc3fd939d24c21bf1409cb5aa9a7eb |
| institution | DOAJ |
| issn | 2618-8163 2618-8171 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | Peoples’ Friendship University of Russia (RUDN University) |
| record_format | Article |
| series | Russian Language Studies |
| spelling | doaj-art-44bc3fd939d24c21bf1409cb5aa9a7eb; 2025-08-20T03:00:54Z; eng; Peoples’ Friendship University of Russia (RUDN University); Russian Language Studies; ISSN 2618-8163, 2618-8171; 2024-12-01; vol. 22, no. 4, pp. 579-597; DOI 10.22363/2618-8163-2024-22-4-579-597; Lexical enrichment of philological textbooks: corpus and statistical approaches; Khalida N. Galimova (https://orcid.org/0000-0003-1817-5004, Kazan (Volga Region) Federal University); Ekaterina V. Martynova (https://orcid.org/0000-0001-5883-0718, Kazan (Volga Region) Federal University); Svetlana A. Moskvitcheva (https://orcid.org/0000-0002-8047-7030, RUDN University); https://journals.rudn.ru/russian-language-studies/article/viewFile/42909/24487; lemma; frequency dictionary; frequency lists; academic corpus of the russian language; term; philology; lexical coverage; lexical enrichment |
| title | Lexical enrichment of philological textbooks: corpus and statistical approaches |
| topic | lemma; frequency dictionary; frequency lists; academic corpus of the russian language; term; philology; lexical coverage; lexical enrichment |
| url | https://journals.rudn.ru/russian-language-studies/article/viewFile/42909/24487 |
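
The abstract reports per-grade frequency lists, a shared lexical core, and vocabulary "enrichment" for grades 6 and 7. The record does not define how enrichment was computed, so the following is only a minimal sketch under one assumption: a lemma counts as enrichment for a grade if it appears in that grade's list and in none of the earlier grades' lists. The function names and the toy English tokens are hypothetical, and the input is assumed to be already lemmatized; the authors' actual pipeline (Pandas, NLTK, Pymorphy) is not reproduced here.

```python
from collections import Counter


def frequency_list(lemmas):
    """Map each lemma to the number of times it occurs."""
    return Counter(lemmas)


def enrichment(current, *earlier):
    """Lemmas of `current` absent from every earlier list (assumed definition).

    The result is sorted by descending frequency, then alphabetically.
    """
    seen = set().union(*earlier)  # iterating a Counter yields its keys
    new = {lemma: n for lemma, n in current.items() if lemma not in seen}
    return sorted(new, key=lambda lemma: (-new[lemma], lemma))


# Toy, already-lemmatized tokens standing in for the textbook corpus.
grade5 = frequency_list(["word", "read", "forest", "word", "family"])
grade6 = frequency_list(["word", "dialect", "read", "suffix", "dialect"])
grade7 = frequency_list(["participle", "word", "dialect", "parliament", "participle"])

new_in_6 = enrichment(grade6, grade5)
new_in_7 = enrichment(grade7, grade5, grade6)

# Lexical core: lemmas present in all three frequency lists.
core = set(grade5) & set(grade6) & set(grade7)

print(new_in_6, new_in_7, sorted(core))
```

With real data, each list of lemmas would come from a morphological analyzer applied to the tokenized textbook text for that grade; the set logic stays the same.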