The Interfaces Twitter Elections Dataset: Construction process and characteristics of big social data during the 2022 presidential elections in Brazil.
The main objective of this study is to describe the process of collecting data extracted from Twitter (X) during the Brazilian presidential elections in 2022, encompassing the post-election period and the event of the attack on the buildings of the executive, legislative, and judiciary branches in J...
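Main Authors: | Sylvia Iasulaitis, Alan Demétrius Baria Valejo, Bruno Cardoso Greco, Vinicius Gonçalves Perillo, Guilherme Henrique Messias, Isabella Vicari, with the Interfaces—Center for Sociopolitical Studies of Algorithms and Artificial Intelligence |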
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2025-01-01 |
Series: | PLoS ONE |
Online Access: | https://doi.org/10.1371/journal.pone.0316626 |
_version_ | 1825206780267855872 |
---|---|
author | Sylvia Iasulaitis Alan Demétrius Baria Valejo Bruno Cardoso Greco Vinicius Gonçalves Perillo Guilherme Henrique Messias Isabella Vicari with the Interfaces—Center for Sociopolitical Studies of Algorithms and Artificial Intelligence |
author_sort | Sylvia Iasulaitis |
collection | DOAJ |
description | The main objective of this study is to describe the process of collecting data extracted from Twitter (X) during the Brazilian presidential elections in 2022, encompassing the post-election period and the attack on the buildings of the executive, legislative, and judiciary branches in January 2023. Data collection took one year. Additionally, the study provides an overview of the general characteristics of the dataset created from 282 million tweets, named "The Interfaces Twitter Elections Dataset" (ITED-Br), the third most extensive dataset of tweets with political purposes. The process of collecting and creating the database for this study went through three major stages, subdivided into several processes: (1) a preliminary analysis of the platform and its operation; (2) contextual analysis, creation of the conceptual model, and definition of keywords; and (3) implementation of the data collection strategy. Python algorithms were developed to model each primary collection type. The "token farm" algorithm was employed to iterate over available API keys. While Twitter is generally a "public" access platform and fits into big data standards, extracting valuable information is not trivial due to the volume, speed, and heterogeneity of data. This study concludes that acquiring informational value requires expertise not only in sociopolitical areas but also in computational and informational studies, highlighting the interdisciplinary nature of such research. |
format | Article |
id | doaj-art-37cd496021d64aa0a1a541233991cf9c |
institution | Kabale University |
issn | 1932-6203 |
language | English |
publishDate | 2025-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS ONE |
spelling | doaj-art-37cd496021d64aa0a1a541233991cf9c; 2025-02-07T05:30:57Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2025-01-01; 20; 2; e0316626; 10.1371/journal.pone.0316626; https://doi.org/10.1371/journal.pone.0316626 |
title | The Interfaces Twitter Elections Dataset: Construction process and characteristics of big social data during the 2022 presidential elections in Brazil. |
url | https://doi.org/10.1371/journal.pone.0316626 |
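
Note on the "token farm" mentioned in the description: the record says only that the algorithm iterated over available API keys and gives no implementation details. The following is a minimal Python sketch of one way such key rotation could work, not the authors' code. The names `Token`, `TokenFarm`, `acquire`, and `mark_exhausted`, as well as the 900-second cooldown, are hypothetical and chosen for illustration only.

```python
import itertools
import time
from dataclasses import dataclass


@dataclass
class Token:
    """One API key plus the time at which it may be used again (hypothetical)."""
    key: str
    available_at: float = 0.0


class TokenFarm:
    """Round-robin pool of API keys with a per-key cooldown (illustrative only)."""

    def __init__(self, keys, cooldown=900.0, clock=time.monotonic):
        if not keys:
            raise ValueError("at least one API key is required")
        self._tokens = [Token(k) for k in keys]
        self._cycle = itertools.cycle(self._tokens)
        self._cooldown = cooldown  # assumed length of a rate-limit window, in seconds
        self._clock = clock        # injectable so the pool can be tested without waiting

    def acquire(self):
        """Return the next key that is not cooling down, or None if all are exhausted."""
        now = self._clock()
        # Visit each key at most once per call, continuing from the last position.
        for _ in range(len(self._tokens)):
            token = next(self._cycle)
            if token.available_at <= now:
                return token
        return None

    def mark_exhausted(self, token):
        """Record that a key hit its rate limit and must rest for one cooldown."""
        token.available_at = self._clock() + self._cooldown
```

A collector built on this sketch would call `acquire()` before each request, call `mark_exhausted()` when the API reports a rate limit for that key, and sleep only when `acquire()` returns `None`.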