The Interfaces Twitter Elections Dataset: Construction process and characteristics of big social data during the 2022 presidential elections in Brazil.

Bibliographic Details
Main Authors: Sylvia Iasulaitis, Alan Demétrius Baria Valejo, Bruno Cardoso Greco, Vinicius Gonçalves Perillo, Guilherme Henrique Messias, Isabella Vicari, with the Interfaces—Center for Sociopolitical Studies of Algorithms and Artificial Intelligence
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0316626
author Sylvia Iasulaitis
Alan Demétrius Baria Valejo
Bruno Cardoso Greco
Vinicius Gonçalves Perillo
Guilherme Henrique Messias
Isabella Vicari
with the Interfaces—Center for Sociopolitical Studies of Algorithms and Artificial Intelligence
collection DOAJ
description The main objective of this study is to describe the process of collecting data from Twitter (X) during the 2022 Brazilian presidential elections, including the post-election period and the attack on the buildings of the executive, legislative, and judiciary branches in January 2023. Data collection took one year. The study also provides an overview of the general characteristics of the dataset created from 282 million tweets, named "The Interfaces Twitter Elections Dataset" (ITED-Br), the third most extensive dataset of tweets with political purposes. Collecting the data and creating the database went through three major stages, each subdivided into several processes: (1) a preliminary analysis of the platform and its operation; (2) contextual analysis, creation of the conceptual model, and definition of keywords; and (3) implementation of the data collection strategy. Python algorithms were developed to model each primary collection type. The "token farm" algorithm was employed to iterate over the available API keys. Although Twitter is generally a "public" access platform and fits big data standards, extracting valuable information is not trivial because of the volume, speed, and heterogeneity of the data. The study concludes that acquiring informational value requires expertise not only in sociopolitical areas but also in computational and informational studies, highlighting the interdisciplinary nature of such research.
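The abstract says only that a "token farm" algorithm iterated over the available API keys; it does not describe the implementation. The following is a minimal sketch of one way such rotation could work, not the authors' code. The names `TokenFarm`, `AllTokensCoolingDown`, `next_token`, and `mark_rate_limited` are hypothetical, and the 900-second cooldown is an assumed value standing in for a rate-limit window.

```python
import time


class AllTokensCoolingDown(Exception):
    """Raised when every token in the farm is still rate limited."""


class TokenFarm:
    """Round-robin pool of API tokens with a per-token cooldown (illustrative only)."""

    def __init__(self, tokens, cooldown_seconds=900.0):
        if not tokens:
            raise ValueError("at least one token is required")
        self.tokens = list(tokens)
        self.cooldown_seconds = cooldown_seconds  # assumed rate-limit window
        self._available_at = {}  # token -> time at which it may be used again
        self._next = 0  # index where the next round-robin scan starts

    def next_token(self, now=None):
        """Return the next usable token, skipping tokens that are cooling down."""
        now = time.monotonic() if now is None else now
        count = len(self.tokens)
        for offset in range(count):
            i = (self._next + offset) % count
            token = self.tokens[i]
            if self._available_at.get(token, float("-inf")) <= now:
                self._next = (i + 1) % count
                return token
        raise AllTokensCoolingDown("every token is rate limited")

    def mark_rate_limited(self, token, now=None):
        """Record that a token hit its limit so it is skipped until the cooldown ends."""
        now = time.monotonic() if now is None else now
        self._available_at[token] = now + self.cooldown_seconds
```

A collector built on this sketch would call `next_token()` before each request and `mark_rate_limited(token)` whenever the API reports that the token has exhausted its quota, sleeping or backing off when `AllTokensCoolingDown` is raised.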
format Article
id doaj-art-37cd496021d64aa0a1a541233991cf9c
institution Kabale University
issn 1932-6203
language English
publishDate 2025-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj-art-37cd496021d64aa0a1a541233991cf9c; 2025-02-07T05:30:57Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2025-01-01; 20; 2; e0316626; 10.1371/journal.pone.0316626
title The Interfaces Twitter Elections Dataset: Construction process and characteristics of big social data during the 2022 presidential elections in Brazil.
url https://doi.org/10.1371/journal.pone.0316626