Leveraging Bias in Pre-trained Word Embeddings for Unsupervised Microaggression Detection
Microaggressions are subtle manifestations of bias (Breitfeller et al. 2019). These demonstrations of bias can often be classified as a subset of abusive language. However, not much focus has been placed on the recognition of these instances. As a result, limited data is available on the topic, and...
| Main Authors: | Tolúlọpẹ́ Ògúnrẹ̀mí, Valerio Basile, Tommaso Caselli |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Accademia University Press, 2022-12-01 |
| Series: | IJCoL |
| Online Access: | https://journals.openedition.org/ijcol/1066 |
| Field | Value |
|---|---|
| author | Tolúlọpẹ́ Ògúnrẹ̀mí; Valerio Basile; Tommaso Caselli |
| author_sort | Tolúlọpẹ́ Ògúnrẹ̀mí |
| collection | DOAJ |
| description | Microaggressions are subtle manifestations of bias (Breitfeller et al. 2019). These demonstrations of bias can often be classified as a subset of abusive language. However, little attention has been paid to recognizing them. As a result, limited data is available on the topic, and only in English. Detecting microaggressions without labeled data would be advantageous, since it would also enable content moderation for languages that lack annotated data. In this study, we introduce an unsupervised method to detect microaggressions in natural language expressions. The algorithm relies on pre-trained word embeddings, leveraging the bias encoded in the model to detect microaggressions in unseen textual instances. We test the method on a dataset of racial and gender-based microaggressions, reporting promising results. We further run the algorithm on unseen out-of-domain data in order to bootstrap corpora of microaggressions “in the wild”, perform a pilot experiment with prompt-based learning, and discuss the benefits and drawbacks of our proposed method. |
| format | Article |
| id | doaj-art-06630cb8cdc343d0bae032c8e6e86422 |
| institution | OA Journals |
| issn | 2499-4553 |
| language | English |
| publishDate | 2022-12-01 |
| publisher | Accademia University Press |
| record_format | Article |
| series | IJCoL |
| spelling | doaj-art-06630cb8cdc343d0bae032c8e6e86422; 2025-08-20T02:34:24Z; eng; Accademia University Press; IJCoL; 2499-4553; 2022-12-01; 8; 2; 10.4000/ijcol.1066; Leveraging Bias in Pre-trained Word Embeddings for Unsupervised Microaggression Detection; Tolúlọpẹ́ Ògúnrẹ̀mí; Valerio Basile; Tommaso Caselli; https://journals.openedition.org/ijcol/1066 |
| title | Leveraging Bias in Pre-trained Word Embeddings for Unsupervised Microaggression Detection |
| url | https://journals.openedition.org/ijcol/1066 |