Typological Differences of Natural and Neural Network-Generated Texts in a Quantitative Aspect

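The authors of this article identify distinctive features in texts written by humans and texts generated by the GPT-3 neural network. Texts generated by GPT-3 have not yet been subject to systematic in-depth study. In total, 160 texts were analyzed in the article, distributed across four topics (“Higher Education in My Eyes,” “How to Remain Human in Inhuman Conditions,” “How I Spent the Summer,” “Teacher of the Year”), with 80 texts generated by the neural network and 80 texts written by humans. The texts were analyzed using quantitative linguistic methods. A concordance was compiled for each text using the AntConc program, from which quantitative values were obtained for further analysis. The authors reached the following conclusions: (1) in the generated texts, words included in the title occur with the highest frequency; (2) the relative frequency of words included in the title is unreasonably inflated; (3) the list of the 20 most frequent words in all generated texts includes the highest number of full-fledged words; (4) the lexical diversity coefficient in the examined natural texts is significantly higher than that of the generated texts. The findings of this research can be useful for both educators and machine learning specialists.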
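
The description for this record names three measured quantities: the relative frequency of individual words, a list of the 20 most frequent words, and a lexical diversity coefficient. The Python sketch below illustrates how such measures can be computed. It is an illustration only, not the authors' AntConc workflow. The record does not give the formula for the lexical diversity coefficient, so the type-token ratio (unique words divided by total words) is assumed here. The function names and the sample sentence are hypothetical, and no lemmatization is performed.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (no lemmatization)."""
    return re.findall(r"\w+", text.lower())


def relative_frequency(tokens, word):
    """Share of all tokens equal to `word`; 0.0 for an empty token list."""
    if not tokens:
        return 0.0
    return Counter(tokens)[word.lower()] / len(tokens)


def top_words(tokens, n=20):
    """The n most frequent tokens with their counts, most frequent first."""
    return Counter(tokens).most_common(n)


def type_token_ratio(tokens):
    """Assumed lexical diversity measure: unique tokens / total tokens."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Made-up sample sentence, not taken from the studied corpus.
sample = "The teacher of the year is a teacher who inspires every student."
tokens = tokenize(sample)

print(relative_frequency(tokens, "teacher"))  # share of the title word "teacher"
print(top_words(tokens, 5))                   # five most frequent tokens
print(type_token_ratio(tokens))               # assumed diversity coefficient
```

Under this assumption, a text that keeps repeating its title words would show a high relative frequency for those words and a lower type-token ratio. The ratio depends on text length, so it is only comparable between texts of similar size.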

Bibliographic Details
Main Authors: R. E. Telpov, S. V. Lartsina
Format: Article
Language: Russian
Published: Tsentr nauchnykh i obrazovatelnykh proektov 2023-10-01
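Series: Научный диалог
Subjects: artificial intelligence; chatbot; neural network; quantitative linguistics; generated text; lemma; concordance; lexical diversity coefficient
Online Access: https://www.nauka-dialog.ru/jour/article/view/4797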
author R. E. Telpov
S. V. Lartsina
collection DOAJ
description The authors of this article identify distinctive features in texts written by humans and texts generated by the GPT-3 neural network. Texts generated by GPT-3 have not yet been subject to systematic in-depth study. In total, 160 texts were analyzed in the article, distributed across four topics (“Higher Education in My Eyes,” “How to Remain Human in Inhuman Conditions,” “How I Spent the Summer,” “Teacher of the Year”), with 80 texts generated by the neural network and 80 texts written by humans. The texts were analyzed using quantitative linguistic methods. A concordance was compiled for each text using the AntConc program, from which quantitative values were obtained for further analysis. The authors reached the following conclusions: (1) in the generated texts, words included in the title occur with the highest frequency; (2) the relative frequency of words included in the title is unreasonably inflated; (3) the list of the 20 most frequent words in all generated texts includes the highest number of full-fledged words; (4) the lexical diversity coefficient in the examined natural texts is significantly higher than that of the generated texts. The findings of this research can be useful for both educators and machine learning specialists.
format Article
id doaj-art-2d60e3cc84a346d1952159d9c7bf80ac
institution Kabale University
issn 2225-756X
2227-1295
language Russian
publishDate 2023-10-01
publisher Tsentr nauchnykh i obrazovatelnykh proektov
record_format Article
series Научный диалог
spelling doaj-art-2d60e3cc84a346d1952159d9c7bf80ac; 2025-08-25T18:13:30Z; rus; Tsentr nauchnykh i obrazovatelnykh proektov; Научный диалог; 2225-756X; 2227-1295; 2023-10-01; 12; 7; 47; 65; 10.24224/2227-1295-2023-12-7-47-65; 2541; Typological Differences of Natural and Neural Network-Generated Texts in a Quantitative Aspect; R. E. Telpov 0; S. V. Lartsina 1; Pushkin State Russian Language Institute; Pushkin State Russian Language Institute; https://www.nauka-dialog.ru/jour/article/view/4797; artificial intelligence; chatbot; neural network; quantitative linguistics; generated text; lemma; concordance; lexical diversity coefficient
title Typological Differences of Natural and Neural Network-Generated Texts in a Quantitative Aspect
topic artificial intelligence
chatbot
neural network
quantitative linguistics
generated text
lemma
concordance
lexical diversity coefficient
url https://www.nauka-dialog.ru/jour/article/view/4797