A Telegram Corpus for Hate Speech, Offensive Language, and Online Harm

We provide a new text corpus from the social medium Telegram, which is rich in indirect forms of divisive speech. We scraped all messages from one channel of Donald Trump supporters, covering a large part of his presidency, from late 2016 until January 2021, including the January 6 Capitol riot.

Bibliographic Details
Main Authors:	Veronika Solopova, Tatjana Scheffler, Mihaela Popa-Wyatt
Format:	Article
Language:	English
Published:	Ubiquity Press 2021-07-01
Series:	Journal of Open Humanities Data
Subjects:	social media, offensive language, harmful speech, linguistics, philosophy
Online Access:	https://openhumanitiesdata.metajnl.com/articles/32

_version_	1850177762261729280
author	Veronika Solopova, Tatjana Scheffler, Mihaela Popa-Wyatt
author_facet	Veronika Solopova, Tatjana Scheffler, Mihaela Popa-Wyatt
author_sort	Veronika Solopova
collection	DOAJ
description	We provide a new text corpus from the social medium Telegram, which is rich in indirect forms of divisive speech. We scraped all messages from one channel of Donald Trump supporters, covering a large part of his presidency, from late 2016 until January 2021, including the January 6 Capitol riot. The discussion among the group members, over this long time period, includes the spread of disinformation, disparaging of out-group members, and other forms of harmful speech. To enable research into the role of harmful speech in political discourse, we added two types of annotations to the corpus: (i) automatic annotations of offensive language for all messages, and (ii) our own manual annotations of harmful language for a portion of the posts leading up to the January 2021 Capitol riot and its aftermath.
format	Article
id	doaj-art-58541c8d46504e4e8d13e4cc7191b57f
institution	OA Journals
issn	2059-481X
language	English
publishDate	2021-07-01
publisher	Ubiquity Press
record_format	Article
series	Journal of Open Humanities Data
spelling	doaj-art-58541c8d46504e4e8d13e4cc7191b57f | 2025-08-20T02:18:55Z | eng | Ubiquity Press | Journal of Open Humanities Data | 2059-481X | 2021-07-01 | 7 | 10.5334/johd.32 | 28 | A Telegram Corpus for Hate Speech, Offensive Language, and Online Harm | Veronika Solopova 0 | Tatjana Scheffler 1 | Mihaela Popa-Wyatt 2 | Freie Universität Berlin | Ruhr-Universität Bochum | Leibniz-Zentrum Allgemeine Sprachwissenschaft, Berlin | We provide a new text corpus from the social medium Telegram, which is rich in indirect forms of divisive speech. We scraped all messages from one channel of Donald Trump supporters, covering a large part of his presidency, from late 2016 until January 2021, including the January 6 Capitol riot. The discussion among the group members, over this long time period, includes the spread of disinformation, disparaging of out-group members, and other forms of harmful speech. To enable research into the role of harmful speech in political discourse, we added two types of annotations to the corpus: (i) automatic annotations of offensive language for all messages, and (ii) our own manual annotations of harmful language for a portion of the posts leading up to the January 2021 Capitol riot and its aftermath. | https://openhumanitiesdata.metajnl.com/articles/32 | social media | offensive language | harmful speech | linguistics | philosophy
spellingShingle	Veronika Solopova Tatjana Scheffler Mihaela Popa-Wyatt A Telegram Corpus for Hate Speech, Offensive Language, and Online Harm Journal of Open Humanities Data social media offensive language harmful speech linguistics philosophy
title	A Telegram Corpus for Hate Speech, Offensive Language, and Online Harm
title_full	A Telegram Corpus for Hate Speech, Offensive Language, and Online Harm
title_fullStr	A Telegram Corpus for Hate Speech, Offensive Language, and Online Harm
title_full_unstemmed	A Telegram Corpus for Hate Speech, Offensive Language, and Online Harm
title_short	A Telegram Corpus for Hate Speech, Offensive Language, and Online Harm
title_sort	telegram corpus for hate speech offensive language and online harm
topic	social media, offensive language, harmful speech, linguistics, philosophy
url	https://openhumanitiesdata.metajnl.com/articles/32
work_keys_str_mv	AT veronikasolopova atelegramcorpusforhatespeechoffensivelanguageandonlineharm AT tatjanascheffler atelegramcorpusforhatespeechoffensivelanguageandonlineharm AT mihaelapopawyatt atelegramcorpusforhatespeechoffensivelanguageandonlineharm AT veronikasolopova telegramcorpusforhatespeechoffensivelanguageandonlineharm AT tatjanascheffler telegramcorpusforhatespeechoffensivelanguageandonlineharm AT mihaelapopawyatt telegramcorpusforhatespeechoffensivelanguageandonlineharm

Similar Items