Benchmarking bias in embeddings of healthcare AI models: using SD-WEAT for detection and measurement across sensitive populations
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-07-01 |
| Series: | BMC Medical Informatics and Decision Making |
| Subjects: | |
| Online Access: | https://doi.org/10.1186/s12911-025-03102-8 |
| Summary: | **Background:** Artificial intelligence (AI) has been shown to exhibit and perpetuate human biases; recent research efforts have focused on measuring bias within the input embeddings of AI language models, especially for the non-binary classifications that are common in medicine and healthcare scenarios. For instance, ethnicity-linked terms might include categories such as Asian, Black, Hispanic, and White, complicating the definition of attribute groups, which are traditionally binary. In this study, we aimed to develop a new framework to detect and measure inherent medical biases based on SD-WEAT (Standard Deviation Word Embedding Association Test). Compared with its predecessor, WEAT, SD-WEAT can measure bias among the multi-level attribute groups common in medicine, such as age, race, and region. **Methods:** We constructed a collection of medicine-based benchmarks for detecting and measuring biases related to sex, ethnicity, and medical conditions. We then evaluated a collection of language models, including GloVe, BERT, LegalBERT, BioBERT, GPT-2, and BioGPT, and determined which exhibited potentially undesirable or desirable healthcare biases. **Results:** With the presented framework, we detected and measured a significant presence of bias among gender-linked (P < 0.01) and ethnicity-linked (P < 0.01) medical conditions for a biomedicine-focused language model (e.g., BioBERT) compared with general BERT models. In addition, we demonstrated that SD-WEAT can handle multiple attribute groups simultaneously, detecting and measuring bias across a collection of ethnicity-linked medical conditions and multiple ethnic/racial groups. **Conclusions:** We presented an AI bias measurement framework based on SD-WEAT. This framework provides a promising approach to detecting and measuring biases in language models applied to biomedical and healthcare text analysis. |
|---|---|
| ISSN: | 1472-6947 |
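The record summarizes SD-WEAT only at a high level and does not give its formula, so the sketch below is not the authors' implementation. It pairs the classic (binary) WEAT effect size with one plausible multi-group association score, taken here as the standard deviation of a target term's mean similarity to each attribute group; the function names, the toy data, and that exact SD-based reading are assumptions made for illustration.

```python
# Minimal sketch, assuming a standard-deviation-based multi-group association;
# not the paper's code. Real use would draw vectors from GloVe, BERT, BioBERT,
# etc. for curated medical-condition terms and group term lists.
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def weat_effect_size(X, Y, A, B):
    """Classic binary WEAT effect size for target sets X, Y and attribute sets A, B."""
    def s(w):
        # Association of one target word with A versus B.
        return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])
    sx = [s(x) for x in X]
    sy = [s(y) for y in Y]
    pooled = np.std(sx + sy, ddof=1)
    return (np.mean(sx) - np.mean(sy)) / pooled

def sd_association(w, groups):
    """Hypothetical multi-group association: standard deviation of the mean
    similarity of a target word to each attribute group (e.g., term lists for
    Asian, Black, Hispanic, and White). Zero means the word is equally close
    to every group; larger values suggest the embedding favors some groups."""
    means = [np.mean([cosine(w, a) for a in g]) for g in groups]
    return float(np.std(means, ddof=1))

# Toy usage with random stand-in "embeddings" of dimension 50.
rng = np.random.default_rng(0)
conditions = [rng.normal(size=50) for _ in range(5)]                   # target terms
groups = [[rng.normal(size=50) for _ in range(4)] for _ in range(4)]   # 4 attribute groups
scores = [sd_association(w, groups) for w in conditions]
print(f"mean multi-group association: {np.mean(scores):.3f}")
```

In this reading, the multi-group score generalizes WEAT's two-group difference: with exactly two groups the standard deviation of the two group means grows with their gap, while with more groups it summarizes how unevenly a medical-condition term sits among all of them.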