Measuring biases in AI-generated co-authorship networks
Abstract Large Language Models (LLMs) have significantly advanced prompt-based information retrieval, yet their potential to reproduce or amplify social biases remains insufficiently understood. In this study, we investigate this issue through the concrete task of reconstructing real-world co-authorship networks of computer science (CS) researchers using two widely used LLMs—GPT-3.5 Turbo and Mixtral 8x7B. This task offers a structured and quantifiable way to evaluate whether LLM-generated scholarly relationships reflect demographic disparities, as co-authorship is a key proxy for collaboration and recognition in academia. We compare the LLM-generated networks to baseline networks derived from DBLP and Google Scholar, employing both statistical and network science approaches to assess biases related to gender and ethnicity. Our findings show that both LLMs tend to produce more accurate co-authorship links for individuals with Asian or White names, particularly among researchers with lower visibility or limited academic impact. While we find no significant gender disparities in accuracy, the models systematically favor generating co-authorship links that overrepresent Asian and White individuals. Additionally, the structural properties of the LLM-generated networks differ from those of the baseline networks. These results highlight the importance of examining how LLMs represent social and scientific relationships, particularly in contexts where they are increasingly used for knowledge discovery and scholarly search.
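The abstract describes comparing LLM-generated co-authorship networks against baselines built from DBLP and Google Scholar, using both link-level accuracy and network-science measures. The sketch below is a rough illustration only, not the paper's actual pipeline: it builds two tiny toy networks with networkx, scores the generated links against the baseline, and contrasts a couple of structural properties. All author names, edge lists, and metric choices here are made-up assumptions.

```python
# Minimal sketch (illustrative, not the paper's method): compare an
# LLM-generated co-authorship network against a baseline network.
import networkx as nx

# Hypothetical (author, co-author) pairs; real data would come from
# DBLP/Google Scholar and from parsed LLM responses.
baseline_edges = [("A. Researcher", "B. Scholar"),
                  ("A. Researcher", "C. Author"),
                  ("B. Scholar", "C. Author")]
llm_edges = [("A. Researcher", "B. Scholar"),
             ("A. Researcher", "D. Invented")]  # one hallucinated link

baseline = nx.Graph(baseline_edges)
generated = nx.Graph(llm_edges)

# Link-level accuracy: fraction of generated links present in the baseline.
true_links = set(map(frozenset, baseline.edges()))
gen_links = set(map(frozenset, generated.edges()))
precision = len(gen_links & true_links) / len(gen_links)
recall = len(gen_links & true_links) / len(true_links)

# Structural properties that can be contrasted between the two networks.
for name, g in [("baseline", baseline), ("LLM-generated", generated)]:
    print(name, "density:", nx.density(g),
          "avg clustering:", nx.average_clustering(g))

print("precision:", precision, "recall:", recall)
```

In practice one would repeat such comparisons per demographic group (e.g., inferred gender or ethnicity of the queried researcher) to see whether accuracy or representation differs systematically, which is the kind of disparity the study reports.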
| Main Authors: | Ghazal Kalhor, Shiza Ali, Afra Mashhadi |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | SpringerOpen, 2025-05-01 |
| Series: | EPJ Data Science |
| Subjects: | Large language models; Co-authorship networks; Computer science; Gender; Ethnicity; Biases in network representation |
| Online Access: | https://doi.org/10.1140/epjds/s13688-025-00555-9 |
| author | Ghazal Kalhor, Shiza Ali, Afra Mashhadi |
|---|---|
| collection | DOAJ |
| description | Abstract Large Language Models (LLMs) have significantly advanced prompt-based information retrieval, yet their potential to reproduce or amplify social biases remains insufficiently understood. In this study, we investigate this issue through the concrete task of reconstructing real-world co-authorship networks of computer science (CS) researchers using two widely used LLMs—GPT-3.5 Turbo and Mixtral 8x7B. This task offers a structured and quantifiable way to evaluate whether LLM-generated scholarly relationships reflect demographic disparities, as co-authorship is a key proxy for collaboration and recognition in academia. We compare the LLM-generated networks to baseline networks derived from DBLP and Google Scholar, employing both statistical and network science approaches to assess biases related to gender and ethnicity. Our findings show that both LLMs tend to produce more accurate co-authorship links for individuals with Asian or White names, particularly among researchers with lower visibility or limited academic impact. While we find no significant gender disparities in accuracy, the models systematically favor generating co-authorship links that overrepresent Asian and White individuals. Additionally, the structural properties of the LLM-generated networks differ from those of the baseline networks. These results highlight the importance of examining how LLMs represent social and scientific relationships, particularly in contexts where they are increasingly used for knowledge discovery and scholarly search. |
| format | Article |
| id | doaj-art-1524a987fee6456fa95af86c2d5fd2cd |
| institution | OA Journals |
| issn | 2193-1127 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | SpringerOpen |
| record_format | Article |
| series | EPJ Data Science |
| spelling | doaj-art-1524a987fee6456fa95af86c2d5fd2cd; 2025-08-20T01:52:25Z; eng; SpringerOpen; EPJ Data Science; ISSN 2193-1127; 2025-05-01; vol. 14, iss. 1, pp. 1–33; doi:10.1140/epjds/s13688-025-00555-9; Measuring biases in AI-generated co-authorship networks; Ghazal Kalhor (School of Electrical and Computer Engineering, College of Engineering, University of Tehran); Shiza Ali (Computing and Software Systems, University of Washington); Afra Mashhadi (Computing and Software Systems, University of Washington); abstract and subject terms as above; https://doi.org/10.1140/epjds/s13688-025-00555-9 |
| title | Measuring biases in AI-generated co-authorship networks |
| topic | Large language models; Co-authorship networks; Computer science; Gender; Ethnicity; Biases in network representation |
| url | https://doi.org/10.1140/epjds/s13688-025-00555-9 |