Application of Large Language Models in Stroke Rehabilitation Health Education: 2-Phase Study
Abstract Background: Stroke is a leading cause of disability and death worldwide, with home-based rehabilitation playing a crucial role in improving patient prognosis and quality of life. Traditional health education often lacks precision, personalization, and accessibility. In...
| Main Authors: | Shiqi Qiang, Haitao Zhang, Yang Liao, Yue Zhang, Yanfen Gu, Yiyan Wang, Zehui Xu, Hui Shi, Nuo Han, Haiping Yu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | JMIR Publications, 2025-07-01 |
| Series: | Journal of Medical Internet Research |
| Online Access: | https://www.jmir.org/2025/1/e73226 |
| _version_ | 1849728949717827584 |
|---|---|
| author | Shiqi Qiang, Haitao Zhang, Yang Liao, Yue Zhang, Yanfen Gu, Yiyan Wang, Zehui Xu, Hui Shi, Nuo Han, Haiping Yu |
| author_sort | Shiqi Qiang |
| collection | DOAJ |
| description |
Abstract
Background: Stroke is a leading cause of disability and death worldwide, with home-based rehabilitation playing a crucial role in improving patient prognosis and quality of life. Traditional health education often lacks precision, personalization, and accessibility. In contrast, large language models (LLMs) are gaining attention for their potential in medical health education, owing to their advanced natural language processing capabilities. However, the effectiveness of LLMs in home-based stroke rehabilitation remains uncertain.
Objective: This study evaluates the effectiveness of 4 LLMs in home-based stroke rehabilitation: ChatGPT-4, MedGo, Qwen, and ERNIE Bot, selected for their diversity in model type, clinical relevance, and accessibility at the time of study design. The aim is to offer patients with stroke more precise and secure health education pathways while exploring the feasibility of using LLMs to guide health education.
Methods: In the first phase of this study, a literature review and expert interviews identified 15 common questions and 2 clinical cases relevant to patients with stroke in home-based rehabilitation. These were input into 4 LLMs for simulated consultations. Six medical experts (2 clinicians, 2 nursing specialists, and 2 rehabilitation therapists) evaluated the LLM-generated responses using a 5-point Likert scale, assessing accuracy, completeness, readability, safety, and humanity. In the second phase, the top 2 performing models from phase 1 were selected. Thirty patients with stroke undergoing home-based rehabilitation were recruited. Each patient asked both models 3 questions, rated the responses using a satisfaction scale, and assessed readability, text length, and recommended reading age using a Chinese readability analysis tool. Data were analyzed using one-way ANOVA, post hoc Tukey Honestly Significant Difference tests, and paired t tests.
Results: The results revealed significant differences across the 4 models in 5 dimensions: accuracy (P...
Conclusions: LLMs, particularly ChatGPT-4 and MedGo, demonstrated promising performance in home-based stroke rehabilitation education. However, discrepancies between expert and patient evaluations highlight the need for improved alignment with patient comprehension and expectations. Enhancing clinical accuracy, readability, and oversight mechanisms will be essential for future real-world integration. |
| format | Article |
| id | doaj-art-d3dd67add0af4011a8277890dccdb253 |
| institution | DOAJ |
| issn | 1438-8871 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | JMIR Publications |
| record_format | Article |
| series | Journal of Medical Internet Research |
| spelling | doaj-art-d3dd67add0af4011a8277890dccdb253; 2025-08-20T03:09:23Z; eng; JMIR Publications; Journal of Medical Internet Research; 1438-8871; 2025-07-01; 27; e73226; e73226; 10.2196/73226; Application of Large Language Models in Stroke Rehabilitation Health Education: 2-Phase Study; Shiqi Qiang (http://orcid.org/0009-0009-5888-3524); Haitao Zhang (http://orcid.org/0009-0009-8481-1559); Yang Liao (http://orcid.org/0009-0009-6297-2562); Yue Zhang (http://orcid.org/0009-0009-2497-4161); Yanfen Gu (http://orcid.org/0000-0003-0430-7133); Yiyan Wang (http://orcid.org/0000-0002-7746-3623); Zehui Xu (http://orcid.org/0009-0001-1219-5250); Hui Shi (http://orcid.org/0000-0002-0971-9738); Nuo Han (http://orcid.org/0009-0005-7054-0787); Haiping Yu (http://orcid.org/0000-0002-3394-2841); https://www.jmir.org/2025/1/e73226 |
| title | Application of Large Language Models in Stroke Rehabilitation Health Education: 2-Phase Study |
| url | https://www.jmir.org/2025/1/e73226 |
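
The Methods text in the description field names one-way ANOVA and paired t tests among the analyses. The sketch below is a minimal illustration of how those two test statistics are computed, not the authors' analysis code. The function names `one_way_anova_f` and `paired_t` and all ratings are hypothetical and invented for illustration; they are not data from the study. P values and the Tukey post hoc step are omitted because they need distribution functions outside the Python standard library.

```python
from math import sqrt
from statistics import mean, stdev


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples.

    Each sample is a list of numeric ratings for one group
    (for example, one model's scores on one dimension).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual ratings around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def paired_t(a, b):
    """Return (t, df) for paired samples a and b of equal length."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # t is the mean difference divided by its standard error.
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1


# Hypothetical expert ratings for three models (illustrative only).
ratings = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_b, df_w = one_way_anova_f(ratings)  # F = 3.0 with df (2, 6)

# Hypothetical satisfaction scores from the same four patients for two models.
model_a = [4.0, 5.0, 3.0, 4.0]
model_b = [3.0, 3.0, 2.0, 4.0]
t_stat, df_t = paired_t(model_a, model_b)
```

A paired test suits the second phase as described, since each patient rated responses from both models, so the two sets of scores are not independent.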