Measuring biases in AI-generated co-authorship networks

Abstract Large Language Models (LLMs) have significantly advanced prompt-based information retrieval, yet their potential to reproduce or amplify social biases remains insufficiently understood. In this study, we investigate this issue through the concrete task of reconstructing real-world co-authorship networks of computer science (CS) researchers using two widely used LLMs—GPT-3.5 Turbo and Mixtral 8x7B. This task offers a structured and quantifiable way to evaluate whether LLM-generated scholarly relationships reflect demographic disparities, as co-authorship is a key proxy for collaboration and recognition in academia. We compare the LLM-generated networks to baseline networks derived from DBLP and Google Scholar, employing both statistical and network science approaches to assess biases related to gender and ethnicity. Our findings show that both LLMs tend to produce more accurate co-authorship links for individuals with Asian or White names, particularly among researchers with lower visibility or limited academic impact. While we find no significant gender disparities in accuracy, the models systematically favor generating co-authorship links that overrepresent Asian and White individuals. Additionally, the structural properties of the LLM-generated networks differ from those of the baseline networks. These results highlight the importance of examining how LLMs represent social and scientific relationships, particularly in contexts where they are increasingly used for knowledge discovery and scholarly search.
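
The abstract describes comparing LLM-generated co-authorship links against DBLP- and Google Scholar-derived baselines using statistical and network science measures. The sketch below is a minimal, hypothetical illustration of that kind of comparison, not the authors' actual pipeline or data: the toy edge lists, demographic labels, and choice of metrics (per-group link recall, average clustering, and density via networkx) are assumptions made purely for demonstration.

```python
# Illustrative sketch only: compare an LLM-generated co-authorship network
# against a baseline network and report per-group link recall plus basic
# structural properties. Edge lists and group labels below are made up.
import networkx as nx

baseline_edges = {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")}  # e.g., DBLP-derived
llm_edges = {("A", "B"), ("A", "D"), ("C", "D")}                   # e.g., LLM-generated
group = {"A": "Asian", "B": "White", "C": "Asian", "D": "Black"}   # hypothetical labels

def normalize(edges):
    """Treat co-authorship edges as undirected by sorting endpoints."""
    return {tuple(sorted(e)) for e in edges}

baseline, generated = normalize(baseline_edges), normalize(llm_edges)

# Per-group recall: share of a group's baseline links that the LLM reproduced.
for g in sorted(set(group.values())):
    members = {n for n, label in group.items() if label == g}
    true_links = {e for e in baseline if members & set(e)}
    found = true_links & generated
    recall = len(found) / len(true_links) if true_links else float("nan")
    print(f"{g}: recall of baseline links = {recall:.2f}")

# Structural comparison of the two networks.
for name, G in [("baseline", nx.Graph(baseline)), ("LLM-generated", nx.Graph(generated))]:
    print(f"{name}: avg clustering = {nx.average_clustering(G):.3f}, "
          f"density = {nx.density(G):.3f}")
```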

Bibliographic Details
Main Authors: Ghazal Kalhor, Shiza Ali, Afra Mashhadi
Format: Article
Language: English
Published: SpringerOpen 2025-05-01
Series: EPJ Data Science
Subjects: Large language models; Co-authorship networks; Computer science; Gender; Ethnicity; Biases in network representation
Online Access: https://doi.org/10.1140/epjds/s13688-025-00555-9
author Ghazal Kalhor
Shiza Ali
Afra Mashhadi
collection DOAJ
description Abstract Large Language Models (LLMs) have significantly advanced prompt-based information retrieval, yet their potential to reproduce or amplify social biases remains insufficiently understood. In this study, we investigate this issue through the concrete task of reconstructing real-world co-authorship networks of computer science (CS) researchers using two widely used LLMs—GPT-3.5 Turbo and Mixtral 8x7B. This task offers a structured and quantifiable way to evaluate whether LLM-generated scholarly relationships reflect demographic disparities, as co-authorship is a key proxy for collaboration and recognition in academia. We compare the LLM-generated networks to baseline networks derived from DBLP and Google Scholar, employing both statistical and network science approaches to assess biases related to gender and ethnicity. Our findings show that both LLMs tend to produce more accurate co-authorship links for individuals with Asian or White names, particularly among researchers with lower visibility or limited academic impact. While we find no significant gender disparities in accuracy, the models systematically favor generating co-authorship links that overrepresent Asian and White individuals. Additionally, the structural properties of the LLM-generated networks differ from those of the baseline networks. These results highlight the importance of examining how LLMs represent social and scientific relationships, particularly in contexts where they are increasingly used for knowledge discovery and scholarly search.
format Article
id doaj-art-1524a987fee6456fa95af86c2d5fd2cd
institution OA Journals
issn 2193-1127
language English
publishDate 2025-05-01
publisher SpringerOpen
record_format Article
series EPJ Data Science
spelling doaj-art-1524a987fee6456fa95af86c2d5fd2cd (2025-08-20T01:52:25Z)
Measuring biases in AI-generated co-authorship networks
Ghazal Kalhor (School of Electrical and Computer Engineering, College of Engineering, University of Tehran)
Shiza Ali (Computing and Software Systems, University of Washington)
Afra Mashhadi (Computing and Software Systems, University of Washington)
SpringerOpen, EPJ Data Science (ISSN 2193-1127), 2025-05-01, 14(1): 1-33
DOI: 10.1140/epjds/s13688-025-00555-9
https://doi.org/10.1140/epjds/s13688-025-00555-9
Keywords: Large language models; Co-authorship networks; Computer science; Gender; Ethnicity; Biases in network representation
title Measuring biases in AI-generated co-authorship networks
topic Large language models
Co-authorship networks
Computer science
Gender
Ethnicity
Biases in network representation
url https://doi.org/10.1140/epjds/s13688-025-00555-9