Benchmarking bias in embeddings of healthcare AI models: using SD-WEAT for detection and measurement across sensitive populations

Bibliographic Details
Main Authors: Magnus Gray, Leihong Wu
Format: Article
Language: English
Published: BMC 2025-07-01
Series: BMC Medical Informatics and Decision Making
Subjects:
Online Access: https://doi.org/10.1186/s12911-025-03102-8
Description
Summary:
Abstract

Background: Artificial intelligence (AI) has been shown to exhibit and perpetuate human biases. Recent research efforts have focused on measuring bias within the input embeddings of AI language models, especially for the non-binary classifications that are common in medicine and healthcare scenarios. For instance, ethnicity-linked terms might include categories such as Asian, Black, Hispanic, and White, complicating the definition of traditionally binary attribute groups. In this study, we aimed to develop a new framework, based on SD-WEAT (Standard Deviation - Word Embedding Association Test), to detect and measure inherent medical biases. Compared to its predecessor, WEAT, SD-WEAT can measure bias among multi-level attribute groups common in the field of medicine, such as age, race, and region.

Methods: We constructed a collection of medicine-based benchmarks for detecting and measuring biases related to sex, ethnicity, and medical conditions. We then evaluated a collection of language models, including GloVe, BERT, LegalBERT, BioBERT, GPT-2, and BioGPT, and determined which exhibited potentially undesirable or desirable healthcare biases.

Results: With the presented framework, we detected and measured a significant presence of bias among gender-linked (P < 0.01) and ethnicity-linked (P < 0.01) medical conditions for a biomedicine-focused language model (e.g., BioBERT) compared to general BERT models. In addition, we demonstrated that SD-WEAT can handle multiple attribute groups simultaneously, detecting and measuring bias between a collection of ethnicity-linked medical conditions and multiple ethnic/racial groups.

Conclusions: We presented an AI bias measurement framework based on SD-WEAT. This framework provides a promising approach to detecting and measuring biases in language models applied to biomedical and healthcare text analysis.
ISSN: 1472-6947
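
To make the approach described in the summary above more concrete, the following is a minimal Python sketch of an SD-WEAT-style computation. It assumes that SD-WEAT aggregates classic WEAT effect sizes over repeated random two-way splits of the multi-level attribute groups and reports their standard deviation; the splitting scheme, function names, and parameters are illustrative assumptions, not the authors' published implementation.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    """s(w, A, B): mean similarity of w to attribute set A minus mean similarity to B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """Classic WEAT effect size d for target sets X, Y and binary attribute sets A, B."""
    s_X = [association(x, A, B) for x in X]
    s_Y = [association(y, A, B) for y in Y]
    return (np.mean(s_X) - np.mean(s_Y)) / np.std(s_X + s_Y, ddof=1)

def sd_weat(X, Y, attribute_groups, n_splits=1000, seed=0):
    """
    Hypothetical SD-WEAT sketch: repeatedly split the multi-level attribute
    groups (e.g., embeddings of Asian-, Black-, Hispanic-, and White-linked
    terms) into two random halves, score each split with classic WEAT, and
    summarize the spread of the resulting effect sizes.
    """
    rng = np.random.default_rng(seed)
    groups = list(attribute_groups)
    effects = []
    for _ in range(n_splits):
        order = rng.permutation(len(groups))
        half = len(groups) // 2
        A = [v for i in order[:half] for v in groups[i]]
        B = [v for i in order[half:] for v in groups[i]]
        effects.append(weat_effect_size(X, Y, A, B))
    return float(np.mean(effects)), float(np.std(effects, ddof=1))
```

In this sketch, X and Y would hold embeddings for two target word sets (for example, two sets of medical condition terms), and attribute_groups would hold one list of embeddings per ethnic/racial or other multi-level category; a mean effect size far from zero, or a large standard deviation across splits, would flag potential bias in the model's embeddings.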