Benchmarking bias in embeddings of healthcare AI models: using SD-WEAT for detection and measurement across sensitive populations
Abstract

Background: Artificial intelligence (AI) has been shown to exhibit and perpetuate human biases. Recent research efforts have focused on measuring bias within the input embeddings of AI language models, especially for the non-binary classifications that are common in medicine and healthcare. For instance, ethnicity-linked terms might include categories such as Asian, Black, Hispanic, and White, complicating the definition of the traditionally binary attribute groups. In this study, we aimed to develop a new framework, based on SD-WEAT (Standard Deviation - Word Embedding Association Test), to detect and measure inherent medical biases. Unlike its predecessor WEAT, SD-WEAT can measure bias among multi-level attribute groups common in the field of medicine, such as age, race, and region.

Methods: We constructed a collection of medicine-based benchmarks for detecting and measuring biases involving sex, ethnicity, and medical conditions. We then evaluated a collection of language models, including GloVe, BERT, LegalBERT, BioBERT, GPT-2, and BioGPT, and determined which exhibited potentially desirable or undesirable healthcare biases.

Results: With the presented framework, we detected and measured a significant presence of bias among gender-linked (P < 0.01) and ethnicity-linked (P < 0.01) medical conditions for a biomedicine-focused language model (e.g., BioBERT) compared to general BERT models. In addition, we demonstrated that SD-WEAT can handle multiple attribute groups simultaneously, detecting and measuring bias between a collection of ethnicity-linked medical conditions and multiple ethnic/racial groups.

Conclusions: We presented an AI bias measurement framework based on SD-WEAT. This framework provides a promising approach for detecting and measuring biases in language models applied to biomedical and healthcare text analysis.
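The abstract describes SD-WEAT only at a high level: it extends WEAT by using a standard-deviation-based association measure so that bias can be scored across more than two attribute groups at once. The sketch below illustrates one plausible reading of such a score over static word embeddings; the function name `sd_weat_score`, the `target_words`/`attribute_groups` arguments, and the aggregation choices are assumptions for illustration only, not the authors' implementation (see the full paper at the DOI below for the exact formulation).

```python
# Minimal sketch of an SD-WEAT-style bias score, assuming word vectors are
# available as a dict mapping tokens to numpy arrays (e.g., loaded GloVe
# vectors). All names here are hypothetical, not the authors' code.
import numpy as np


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def sd_weat_score(target_words, attribute_groups, embeddings):
    """For each target word, compute its mean similarity to each attribute
    group (e.g., term lists for Asian, Black, Hispanic, and White), then take
    the standard deviation of those group means. Averaging over all targets
    gives one score: values near 0 mean the target concept sits roughly
    equidistant from every group; larger values indicate an uneven
    (potentially biased) association."""
    per_target_sd = []
    for w in target_words:
        group_means = [
            np.mean([cosine(embeddings[w], embeddings[a]) for a in group])
            for group in attribute_groups
        ]
        per_target_sd.append(np.std(group_means))
    return float(np.mean(per_target_sd))


# Hypothetical usage with ethnicity-linked medical conditions as targets:
# score = sd_weat_score(
#     target_words=["diabetes", "hypertension"],
#     attribute_groups=[["asian"], ["black"], ["hispanic"], ["white"]],
#     embeddings=word_vectors,
# )
```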
| Main Authors: | Magnus Gray, Leihong Wu |
|---|---|
| Author Affiliations: | Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. FDA |
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-07-01 |
| Series: | BMC Medical Informatics and Decision Making |
| ISSN: | 1472-6947 |
| Subjects: | Bias; Bias measurement; Natural language processing; Language models; Artificial intelligence; Input embeddings |
| Online Access: | https://doi.org/10.1186/s12911-025-03102-8 |