An evaluation of the reliability and readability of large language models in the dissemination of traumatic brain injury information

Bibliographic Details
Main Authors:	Matthew J Lee, Angelo Cadiente, Jamie Chen, Yi Zhou, Brian D Greenwald
Format:	Article
Language:	English
Published:	SAGE Publishing 2025-06-01
Series:	Digital Health
Online Access:	https://doi.org/10.1177/20552076251350760

Description
Summary:	Objective: To compare the reliability and readability of responses from Generative Pre-trained Transformer versions 3.5 (GPT-3.5) and 4.0 (GPT-4.0) on traumatic brain injury (TBI) topics against Model Systems Knowledge Translation Center (MSKTC) fact sheets.
Methods: This study analyzed responses from GPT-3.5 and GPT-4.0 for accuracy, comprehensiveness, and readability against MSKTC fact sheets, incorporating a correlation analysis between reliability and readability scores.
Results: Findings showed an improvement in reliability from GPT-3.5 (mean score = 3.21) to GPT-4.0 (mean score = 3.63), indicating better accuracy and completeness in the latter. Despite advancements, responses generally remained accurate but not fully comprehensive. Readability comparisons found the MSKTC fact sheets were significantly more reader-friendly compared to responses from both artificial intelligence (AI) versions, with no strong correlation between reliability and readability.
Conclusion: The study highlights progress in AI-generated information on TBI from GPT-3.5 to GPT-4.0 in terms of reliability. However, challenges persist in matching the readability of standard patient education materials, emphasizing the need for future AI developments to focus on enhancing understandability alongside accuracy.
ISSN:	2055-2076

Similar Items