Inter-Annotator Agreement and Its Reflection in LLMs and Responsible AI.

Bibliographic Details
Main Authors: Amir Toliyat, Elena Filatova, Ronak Etemadpour
Format: Article
Language: English
Published: LibraryPress@UF 2025-05-01
Series: Proceedings of the International Florida Artificial Intelligence Research Society Conference
Online Access: https://journals.flvc.org/FLAIRS/article/view/139049
Description
Summary: Recent research on Responsible AI, particularly work addressing algorithmic biases, has gained significant attention. Natural Language Processing (NLP) algorithms, which rely on human-generated and human-labeled data, often reflect these challenges. In this paper, we analyze inter-annotator agreement in the task of labeling hate speech data and examine how annotators’ backgrounds influence their labeling decisions. Specifically, we investigate differences in hate speech annotations that arise when annotators identify with the targeted groups. Our findings reveal substantial differences in agreement between a general pool of annotators and annotators who personally relate to the targets of the hate speech they label. Additionally, we evaluate the OpenAI GPT-4o model on the same dataset. Our results highlight the need to consider annotators’ backgrounds when assessing the performance of Large Language Models (LLMs) in hate speech detection.
ISSN: 2334-0754, 2334-0762
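
Note: The summary above centers on inter-annotator agreement. As an illustrative sketch only (not the authors' methodology, and using made-up labels rather than the paper's dataset), pairwise agreement on binary hate-speech labels is commonly quantified with Cohen's kappa:

    # Illustrative sketch: quantify agreement between two annotators with
    # Cohen's kappa. The label vectors below are invented for illustration.
    from sklearn.metrics import cohen_kappa_score

    # 1 = hate speech, 0 = not hate speech, for the same ten posts
    annotator_general = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]  # annotator from a general pool
    annotator_target = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]   # annotator who identifies with the targeted group

    kappa = cohen_kappa_score(annotator_general, annotator_target)
    print(f"Cohen's kappa: {kappa:.2f}")

Kappa corrects raw percent agreement for chance agreement; a lower kappa between the two pools than within either pool would be one way to surface the background-dependent disagreement the abstract describes.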