Evaluating Large Language Models for Preoperative Patient Education in Superior Capsular Reconstruction: Comparative Study of Claude, GPT, and Gemini

Abstract Background: Large language models (LLMs) are revolutionizing natural language processing, increasingly applied in clinical settings to enhance preoperative patient education. Objective: This study aimed to evaluate the effectiveness and applicability of variou...

Bibliographic Details
Main Authors: Yukang Liu, Hua Li, Jianfeng Ouyang, Zhaowen Xue, Min Wang, Hebei He, Bin Song, Xiaofei Zheng, Wenyi Gan
Format: Article
Language: English
Published: JMIR Publications 2025-06-01
Series: JMIR Perioperative Medicine
Online Access: https://periop.jmir.org/2025/1/e70047
author Yukang Liu
Hua Li
Jianfeng Ouyang
Zhaowen Xue
Min Wang
Hebei He
Bin Song
Xiaofei Zheng
Wenyi Gan
collection DOAJ
description Abstract Background: Large language models (LLMs) are revolutionizing natural language processing, increasingly applied in clinical settings to enhance preoperative patient education. Objective: This study aimed to evaluate the effectiveness and applicability of various LLMs in preoperative patient education by analyzing their responses to superior capsular reconstruction (SCR)–related inquiries. Methods: In total, 10 sports medicine clinical experts formulated 11 SCR issues and developed preoperative patient education strategies during a webinar, inputting 12 text commands into Claude-3-Opus (Anthropic), GPT-4-Turbo (OpenAI), and Gemini-1.5-Pro (Google DeepMind). A total of 3 experts assessed the language models’ responses for correctness, completeness, logic, potential harm, and overall satisfaction, while preoperative education documents were evaluated using DISCERN questionnaire and Patient Education Materials Assessment Tool instruments, and reviewed by 5 postoperative patients for readability and educational value; readability of all responses was also analyzed using the cntext package and py-readability-metrics. Results: Between July 1 and August 17, 2024, sports medicine experts and patients evaluated 33 responses and 3 preoperative patient education documents generated by 3 language models regarding SCR surgery. For the 11 query responses, clinicians rated Gemini significantly higher than Claude in all categories […] Conclusions: Claude-3-Opus, GPT-4-Turbo, and Gemini-1.5-Pro effectively generated readable presurgical education materials but lacked citations and failed to discuss alternative treatments or the risks of forgoing SCR surgery, highlighting the need for expert oversight when using these LLMs in patient education.
format Article
id doaj-art-6dffb5925d9a467db1c32ca393df7737
institution Kabale University
issn 2561-9128
language English
publishDate 2025-06-01
publisher JMIR Publications
record_format Article
series JMIR Perioperative Medicine
spelling doaj-art-6dffb5925d9a467db1c32ca393df7737
2025-08-20T03:31:20Z
eng
JMIR Publications
JMIR Perioperative Medicine
ISSN 2561-9128
2025-06-01
vol. 8, e70047
DOI 10.2196/70047
Evaluating Large Language Models for Preoperative Patient Education in Superior Capsular Reconstruction: Comparative Study of Claude, GPT, and Gemini
Yukang Liu, http://orcid.org/0009-0000-8577-8492
Hua Li, http://orcid.org/0009-0006-0966-2977
Jianfeng Ouyang, http://orcid.org/0000-0003-2708-8500
Zhaowen Xue, http://orcid.org/0009-0001-5807-9810
Min Wang, http://orcid.org/0009-0001-3413-8441
Hebei He, http://orcid.org/0009-0005-3336-7671
Bin Song, http://orcid.org/0000-0002-4892-470X
Xiaofei Zheng, http://orcid.org/0000-0001-7502-6131
Wenyi Gan, http://orcid.org/0000-0003-1886-8062
https://periop.jmir.org/2025/1/e70047
title Evaluating Large Language Models for Preoperative Patient Education in Superior Capsular Reconstruction: Comparative Study of Claude, GPT, and Gemini
url https://periop.jmir.org/2025/1/e70047