A perspective on gender bias in generated text data

Bibliographic Details
Main Author: Thomas Hupperich
Format: Article
Language: English
Published: Frontiers Media S.A. 2024-12-01
Series: Frontiers in Human Dynamics
Online Access: https://www.frontiersin.org/articles/10.3389/fhumd.2024.1495270/full
Description
Summary: Text generation by artificial intelligence has recently become available to a broader public. This technology is based on machine learning and language models that need to be trained with input data. Many studies have focused on distinguishing human-written from generated text, but recent studies show that the underlying language models might be prone to reproducing gender bias in their output and, consequently, reinforcing gender roles and imbalances. In this paper, we give a perspective on this topic, considering both the generated text data itself and the machine learning models used for language generation. We present a case study of gender bias in generated text data and review recent literature addressing language models. Our results indicate that research on gender bias in the context of text generation faces significant challenges and that future work needs to overcome a lack of definitions as well as a lack of transparency.
ISSN: 2673-2726