Evaluating the Effectiveness of Large Language Models in Providing Patient Education for Chinese Patients With Ocular Myasthenia Gravis: Mixed Methods Study
| Main Authors: | Bin Wei, Lili Yao, Xin Hu, Yuxiang Hu, Jie Rao, Yu Ji, Zhuoer Dong, Yichong Duan, Xiaorong Wu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | JMIR Publications, 2025-04-01 |
| Series: | Journal of Medical Internet Research |
| Online Access: | https://www.jmir.org/2025/1/e67883 |
| _version_ | 1850203270940721152 |
|---|---|
| author | Bin Wei, Lili Yao, Xin Hu, Yuxiang Hu, Jie Rao, Yu Ji, Zhuoer Dong, Yichong Duan, Xiaorong Wu |
| author_sort | Bin Wei |
| collection | DOAJ |
| description |
Background: Ocular myasthenia gravis (OMG) is a neuromuscular disorder primarily affecting the extraocular muscles, leading to ptosis and diplopia. Effective patient education is crucial for disease management; however, in China, limited health care resources often restrict patients’ access to personalized medical guidance. Large language models (LLMs) have emerged as potential tools to bridge this gap by providing instant, AI-driven health information. However, their accuracy and readability in educating patients with OMG remain uncertain.
Objective: The purpose of this study was to systematically evaluate the effectiveness of multiple LLMs in the education of Chinese patients with OMG. Specifically, the validity of these models’ answers to OMG-related patient questions was assessed across accuracy, completeness, readability, usefulness, and safety, and patients’ ratings of usability and readability were analyzed.
Methods: The study was conducted in two phases. In the first phase, 130 multiple-choice ophthalmology examination questions were input into 5 different LLMs, and their performance was compared with that of undergraduates, master’s students, and ophthalmology residents. In addition, 23 common OMG-related patient questions were posed to 4 LLMs, and their responses were evaluated by ophthalmologists across 5 domains. In the second phase, 20 patients with OMG interacted with the 2 LLMs selected from the first phase, each asking 3 questions. Patients rated the responses for satisfaction and readability, while ophthalmologists again evaluated the responses across the 5 domains.
Results: ChatGPT o1-preview achieved the highest accuracy rate, 73%, on the 130 ophthalmology examination questions, outperforming the other LLMs as well as the undergraduate and master’s student groups. For the 23 common OMG-related patient questions, ChatGPT o1-preview scored highest in correctness (4.44), completeness (4.44), helpfulness (4.47), and safety (4.6). Gemini (Google DeepMind) provided the easiest-to-understand responses in the readability assessment, whereas GPT-4o produced the most complex responses, suited to readers with higher education levels. In the second phase with 20 patients with OMG, ChatGPT o1-preview received higher satisfaction scores than Ernie 3.5 (Baidu; 4.40 vs 3.89, P=.002), although Ernie 3.5’s responses were slightly more readable (4.31 vs 4.03, P=.01).
Conclusions: LLMs such as ChatGPT o1-preview may have the potential to enhance patient education. Addressing challenges such as misinformation risk, readability issues, and ethical considerations is crucial for their effective and safe integration into clinical practice. |
| format | Article |
| id | doaj-art-d2ed70090e8844fb8dabac3b3feaf0d9 |
| institution | OA Journals |
| issn | 1438-8871 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | JMIR Publications |
| record_format | Article |
| series | Journal of Medical Internet Research |
| title | Evaluating the Effectiveness of Large Language Models in Providing Patient Education for Chinese Patients With Ocular Myasthenia Gravis: Mixed Methods Study |
| url | https://www.jmir.org/2025/1/e67883 |
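The abstract reports a between-model comparison of patient satisfaction (ChatGPT o1-preview 4.40 vs Ernie 3.5 3.89, P=.002), but the record does not state which statistical test produced these P values or include the raw ratings. The sketch below is only illustrative: it assumes each of the 20 patients rated both models on a 1-5 scale and applies a paired Wilcoxon signed-rank test to hypothetical placeholder scores, not the study's data.

```python
# Illustrative sketch only: the study's actual test and raw ratings are not
# given in this record. Assumes paired ratings (each patient scored both
# models), so a Wilcoxon signed-rank test is used on hypothetical scores.
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)

# Hypothetical 1-5 satisfaction ratings from 20 patients (placeholders).
chatgpt_o1 = rng.integers(4, 6, size=20)  # stand-in for ChatGPT o1-preview ratings
ernie_35 = rng.integers(3, 5, size=20)    # stand-in for Ernie 3.5 ratings

stat, p_value = wilcoxon(chatgpt_o1, ernie_35)
print(f"mean o1-preview = {chatgpt_o1.mean():.2f}, "
      f"mean Ernie 3.5 = {ernie_35.mean():.2f}, P = {p_value:.3f}")
```

With real per-patient ratings, an unpaired test (e.g., Mann-Whitney U) or a t-test could equally have been used; the choice here is an assumption, not the authors' reported method.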