OpenAI o1 Large Language Model Outperforms GPT-4o, Gemini 1.5 Flash, and Human Test Takers on Ophthalmology Board–Style Questions

Purpose: To evaluate and compare the performance of human test takers and three artificial intelligence (AI) models—OpenAI o1, ChatGPT-4o, and Gemini 1.5 Flash—on ophthalmology board–style questions, focusing on overall accuracy and performance stratified by ophthalmic subspecialty and cognitive com...

Bibliographic Details
Main Authors:	Ryan Shean, BA, Tathya Shah, BS, Sina Sobhani, BS, Alan Tang, BS, Ali Setayesh, BA, Kyle Bolo, MD, Van Nguyen, MD, Benjamin Xu, MD, PhD
Format:	Article
Language:	English
Published:	Elsevier 2025-11-01
Series:	Ophthalmology Science
Subjects:	Artificial intelligence; Ophthalmology; Medical education; Large language models
Online Access:	http://www.sciencedirect.com/science/article/pii/S2666914525001423

Similar Items