-
1
The Classical Model of Type-Token Systems Compared with Items from the Standardized Project Gutenberg Corpus
Published 2025-06-01
“…We compare the “classical” equations of type-token systems, namely Zipf’s laws, Heaps’ law and the relationships between their indices, with data selected from the Standardized Project Gutenberg Corpus (SPGC). Selected items all exceed 100,000 word-tokens and are trimmed to 100,000 word-tokens each. …”
Article -
2
Lawsuits against web platforms for copyright infringement: the court's decision and jurisdiction in the “Project Gutenberg” case
Published 2019-05-01
“…Specifically, a German case relating to Project Gutenberg. The project was found to have breached German copyright law, and access to some of its documents was blocked in Germany. …”
Article -
3
Long-Range Dependence in Word Time Series: The Cosine Correlation of Embeddings
Published 2025-06-01
“…Using the Standardized Project Gutenberg Corpus, we find that the cosine correlation between word2vec embeddings exhibits a readily visible stretched exponential decay for lags roughly up to 1000 words, thus corroborating the presence of LRD. …”
Article -
4
Biodiversity is not declining in fiction
Published 2022-10-01
“…Using a large corpus from Project Gutenberg (N = ~15,000) and a dictionary-matching method of over 240K biological taxa, Langer et al. find that the frequency and diversity of biological taxa have been declining steadily since the first half of the nineteenth century, echoing prior work in cultural analytics. …”
Article -
5
Shedding new light on the context and temporality of Iberian warrior stelae: The Cañaveral de León 2 Stela and Las Capellanías burial complex (Huelva, SW Spain).
Published 2025-01-01
“…"As he spoke he began stripping the spoils from the son of Paeon, but Alexandrus husband of lovely Helen aimed an arrow at him, leaning against a pillar of the monument which men had raised to Ilus son of Dardanus, a ruler in days of old", Homer, The Iliad (Project Gutenberg, 2022. Trans. by Samuel Butler).…”
Article -
6
A Case Study of -some and -able Derivatives in the OED3: Examining the Diachronic Output and Productivity of Two Competing Adjectival Suffixes
Published 2020-12-01
“…Secondly, a corpus study in multiple corpora (EHBO, COHA, Project Gutenberg OEC, COCA), as well as the OED data, both suggest that -some adjectives have a low frequency of usage over all periods of English. …”
Article