Benchmarking bias in embeddings of healthcare AI models: using SD-WEAT for detection and measurement across sensitive populations

Bibliographic Details
Main Authors: Magnus Gray, Leihong Wu (Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. FDA)
Format: Article
Language: English
Published: BMC, 2025-07-01
Series: BMC Medical Informatics and Decision Making (ISSN 1472-6947)
Subjects: Bias; Bias measurement; Natural language processing; Language models; Artificial intelligence; Input embeddings
Online Access: https://doi.org/10.1186/s12911-025-03102-8
Abstract
Background: Artificial intelligence (AI) has been shown to exhibit and perpetuate human biases. Recent research has focused on measuring bias within the input embeddings of AI language models, especially for the non-binary classifications common in medicine and healthcare. For instance, ethnicity-linked terms may span categories such as Asian, Black, Hispanic, and White, which complicates the traditionally binary definition of attribute groups. In this study, we aimed to develop a new framework to detect and measure inherent medical biases based on SD-WEAT (Standard Deviation Word Embedding Association Test). Unlike its predecessor, WEAT, SD-WEAT can measure bias among the multi-level attribute groups common in medicine, such as age, race, and region.
Methods: We constructed a collection of medicine-based benchmarks for detecting and measuring biases across sex, ethnicity, and medical conditions. We then evaluated a collection of language models, including GloVe, BERT, LegalBERT, BioBERT, GPT-2, and BioGPT, and determined which exhibited potentially undesirable or desirable healthcare biases.
Results: With the presented framework, we detected and measured a significant presence of bias among gender-linked (P < 0.01) and ethnicity-linked (P < 0.01) medical conditions for a biomedicine-focused language model (e.g., BioBERT) compared to general BERT models. In addition, we demonstrated that SD-WEAT can handle multiple attribute groups simultaneously, detecting and measuring bias among a collection of ethnicity-linked medical conditions and multiple ethnic/racial groups.
Conclusions: We presented an AI bias measurement framework based on SD-WEAT. This framework provides a promising approach to detecting and measuring biases in language models applied to biomedical and healthcare text analysis.
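The abstract names SD-WEAT but does not spell out its computation. As a rough illustration of the underlying idea only, the sketch below scores how unevenly a target term (a medical condition) associates with several attribute groups (ethnicities) by taking the standard deviation of its per-group mean cosine similarities. The word lists, the toy random embeddings, and this particular aggregation are assumptions made for illustration; the exact SD-WEAT formulation, test statistics, and P-values are defined in the paper.

```python
# Illustrative multi-group embedding-association score in the spirit of SD-WEAT.
# NOTE: the word lists, toy embeddings, and aggregation below are assumptions
# for illustration only; see the paper for the actual SD-WEAT method.
import numpy as np

rng = np.random.default_rng(0)

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy embeddings standing in for vectors from a real model such as GloVe or BioBERT.
vocab = ["asthma", "diabetes", "hypertension",
         "asian", "black", "hispanic", "white"]
emb = {w: rng.normal(size=50) for w in vocab}

# Target words (medical conditions) and multi-level attribute groups (ethnicities).
targets = ["asthma", "diabetes", "hypertension"]
attribute_groups = {
    "Asian": ["asian"],
    "Black": ["black"],
    "Hispanic": ["hispanic"],
    "White": ["white"],
}

def group_association(target, group_words):
    """Mean cosine similarity of one target word to one attribute group."""
    return float(np.mean([cosine(emb[target], emb[w]) for w in group_words]))

# For each target, measure how unevenly it is associated across the attribute
# groups: a larger standard deviation suggests the embedding ties the condition
# more strongly to some groups than to others.
for t in targets:
    assoc = [group_association(t, words) for words in attribute_groups.values()]
    print(f"{t}: per-group associations {np.round(assoc, 3)}, SD = {np.std(assoc):.3f}")
```

In practice the toy `emb` dictionary would be replaced by embeddings extracted from one of the evaluated models, and the word lists by the paper's medicine-based benchmarks.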