Leveraging Bias in Pre-trained Word Embeddings for Unsupervised Microaggression Detection

Bibliographic Details
Main Authors: Tolúlọpẹ́ Ògúnrẹ̀mí, Valerio Basile, Tommaso Caselli
Format: Article
Language: English
Published: Accademia University Press 2022-12-01
Series: IJCoL
Online Access: https://journals.openedition.org/ijcol/1066
collection DOAJ
description Microaggressions are subtle manifestations of bias (Breitfeller et al. 2019). These demonstrations of bias can often be classified as a subset of abusive language. However, little attention has been paid to recognizing these instances. As a result, limited data is available on the topic, and only in English. Being able to detect microaggressions without labeled data would be advantageous, since it would also allow content moderation for languages that lack annotated data. In this study, we introduce an unsupervised method to detect microaggressions in natural language expressions. The algorithm relies on pre-trained word embeddings, leveraging the bias encoded in the model to detect microaggressions in unseen textual instances. We test the method on a dataset of racial and gender-based microaggressions, reporting promising results. We further run the algorithm on out-of-domain unseen data with the purpose of bootstrapping corpora of microaggressions “in the wild”, perform a pilot experiment with prompt-based learning, and discuss the benefits and drawbacks of our proposed method.
id doaj-art-06630cb8cdc343d0bae032c8e6e86422
institution OA Journals
issn 2499-4553
doi 10.4000/ijcol.1066