Gender Disparities in Artificial Intelligence–Generated Images of Hospital Leadership in the United States
| Main Authors: | , , , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-06-01 |
| Series: | Mayo Clinic Proceedings: Digital Health |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2949761225000252 |
| Summary: | Objective: To evaluate demographic representation in artificial intelligence (AI)–generated images of hospital leadership roles and compare them with real-world data from US hospitals. Patients and Methods: This cross-sectional study, conducted from October 1, 2024, to October 31, 2024, analyzed images generated by 3 AI text-to-image models: Midjourney 6.0, OpenAI ChatGPT DALL-E 3, and Google Gemini Imagen 3. Standardized prompts were used to create 1200 images representing 4 key leadership roles: chief executive officers, chief medical officers, chief nursing officers, and chief financial officers. Real-world demographic data from 4397 US hospitals showed that chief executive officers were 73.2% men; chief financial officers, 65.2% men; chief medical officers, 85.7% men; and chief nursing officers, 9.4% men (overall: 60.1% men). The primary outcome was gender representation, with secondary outcomes including race/ethnicity and age. Two independent reviewers assessed images, with interrater reliability evaluated using Cohen κ. Results: Interrater agreement was high for gender (κ=0.998) and moderate for race/ethnicity (κ=0.670) and age (κ=0.605). DALL-E overrepresented men (86.5%) and White individuals (94.5%). Midjourney showed improved gender balance (69.5% men) but overrepresented White individuals (75.0%). Imagen achieved near gender parity (50.3% men) but remained predominantly White (51.5%). Statistically significant differences were observed across models and between models and real-world demographics. Conclusion: Artificial intelligence text-to-image models reflect and amplify systemic biases, overrepresenting men and White leaders, while underrepresenting diversity. Ethical AI practices, including diverse training data sets and fairness-aware algorithms, are essential to ensure equitable representation in health care leadership. |
| ISSN: | 2949-7612 |
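
The summary reports interrater reliability as Cohen κ. The following is a minimal sketch of how that statistic is computed for two reviewers labeling the same images. The function name `cohen_kappa` and the ten sample ratings are hypothetical illustrations, not the study's code or data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies,
    # summed over labels.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical gender ratings for 10 images (NOT data from the study).
reviewer_1 = ["man", "man", "woman", "man", "woman", "man", "man", "woman", "man", "man"]
reviewer_2 = ["man", "man", "woman", "man", "man", "man", "man", "woman", "man", "man"]

kappa = cohen_kappa(reviewer_1, reviewer_2)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it can be noticeably lower than the simple percentage of matching labels when one category dominates.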