The Technological Bridge: R Programming’s Utility in Converting Social Media Data for Quantitative Financial Analysis

Bibliographic Details
Main Authors: Alexey Litvinenko, Samuli Saarinen, Anna Litvinenko
Format: Article
Language: English
Published: Sciendo 2025-06-01
Series: Economics and Culture
Online Access: https://doi.org/10.2478/jec-2025-0006
Description
Summary: This study explores whether R programming can transform unstructured, qualitative social media data into a quantitative format suitable for econometric modelling. It specifically examines how elements such as text, emojis, and sentiment from Reddit and X (formerly Twitter) can be converted into variables for regression analysis. Aiming to enhance the predictive power of traditional financial models with alternative data sources, the paper sets out guidelines with specific technical steps in R: scripting API calls to extract data from Reddit and X, cleaning and tokenising the text, and incorporating the resulting variables into regression models. The study addresses the growing need in financial economics to incorporate alternative data streams by offering a structured, replicable process for transforming high-volume, unstructured online content into statistically valid variables, thereby bridging the gap between qualitative market sentiment and quantitative modelling.
ISSN: 2256-0173
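
The pipeline described in the summary (extract, clean, tokenise, regress) can be illustrated with a minimal base R sketch. This is not the authors' code. The posts, the returns, the word lists, and the helper names `clean_text`, `tokenise`, and `sentiment_score` are all hypothetical, and a hard-coded data frame stands in for the Reddit or X API call, which would need credentials and a client package.

```r
# Hypothetical toy posts; in practice these would come from a Reddit or X API call.
posts <- data.frame(
  date = as.Date(c("2025-01-02", "2025-01-02", "2025-01-03",
                   "2025-01-06", "2025-01-07")),
  text = c("Great earnings, strong growth! \U0001F680",
           "Weak guidance, big loss ahead",
           "Strong rally, great gain https://example.com",
           "Loss after loss, weak demand",
           "Great quarter, strong gain \U0001F680"),
  stringsAsFactors = FALSE
)

# Hypothetical daily returns for the same dates (invented numbers).
returns <- data.frame(
  date = as.Date(c("2025-01-02", "2025-01-03", "2025-01-06", "2025-01-07")),
  ret  = c(0.001, 0.012, -0.015, 0.010)
)

# Tiny illustrative lexicon; a real study would use a published sentiment lexicon.
positive <- c("great", "strong", "gain", "growth", "rally")
negative <- c("weak", "loss")

# Cleaning: lower-case, drop URLs, replace everything except a-z and spaces.
clean_text <- function(x) {
  x <- tolower(x)
  x <- gsub("https?://[^[:space:]]+", " ", x)
  gsub("[^a-z ]", " ", x)
}

# Tokenising: split on runs of spaces and drop empty tokens.
tokenise <- function(x) {
  tokens <- strsplit(x, " +")[[1]]
  tokens[nzchar(tokens)]
}

# Sentiment score: count of positive tokens minus count of negative tokens.
sentiment_score <- function(text) {
  tokens <- tokenise(clean_text(text))
  sum(tokens %in% positive) - sum(tokens %in% negative)
}

posts$sentiment <- vapply(posts$text, sentiment_score, numeric(1),
                          USE.NAMES = FALSE)

# Emoji as a variable: 1 if the post contains the rocket emoji, else 0.
posts$rocket <- as.integer(grepl("\U0001F680", posts$text, fixed = TRUE))

# Aggregate to one observation per day, then join with returns.
daily <- aggregate(cbind(sentiment, rocket) ~ date, data = posts, FUN = mean)
model_data <- merge(daily, returns, by = "date")

# Regress returns on mean daily sentiment (four observations, illustration only).
fit <- lm(ret ~ sentiment, data = model_data)
print(coef(fit))
```

With only four invented observations the fitted coefficients carry no statistical meaning; the sketch only shows how text and emoji can become numeric columns that `lm()` accepts.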