Taking a different approach: Can large language models provide knowledge related to respiratory aspiration?
| Main Authors: | , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | SAGE Publishing, 2025-07-01 |
| Series: | Digital Health |
| Online Access: | https://doi.org/10.1177/20552076251349616 |
| Summary: | Objective To investigate the performance (accuracy, comprehensiveness, consistency, and necessary information ratio) of large language models (LLMs) in providing knowledge related to respiratory aspiration, and to explore their potential as training tools. Methods This study was a non-human-subject evaluative study. Two LLMs (GPT-3.5 and GPT-4) were asked 36 questions (32 objective and four subjective) about respiratory aspiration in English and Chinese. Responses were scored by two experts against gold standards derived from authoritative books. The accuracy of the two LLMs' responses to objective questions was compared by chi-square test or Fisher's exact test. For subjective questions, the t-test or Mann–Whitney U test was used to compare the two LLMs. Results There was no significant difference between the ratings of the two experts. Both LLMs achieved high accuracy scores on objective questions. They also performed well on subjective questions, showing high accuracy, comprehensiveness, consistency, and necessary information ratio. No significant differences were found between the two LLMs in the accuracy of the English and Chinese responses to subjective questions (z = 0.331, p = 0.886; z = 1.703, p = 0.114), nor in the comprehensiveness of the English and Chinese responses (t = 0.787, p = 0.461; t = 1.175, p = 0.285). Conclusions LLMs demonstrated promising performance in delivering respiratory aspiration-related knowledge and showed promise as supportive training tools, particularly when their limitations were well understood. |
| ISSN: | 2055-2076 |
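As a minimal sketch of the statistical comparisons the abstract describes (chi-square or Fisher's exact test for objective-question accuracy, and t-test or Mann–Whitney U test for subjective-question scores), using SciPy. All counts and scores below are made up for illustration; they are not the study's data.

```python
# Hypothetical sketch of the comparisons described in the abstract.
# All counts and scores are invented; only the test choices follow the text.
from scipy.stats import chi2_contingency, fisher_exact, mannwhitneyu, ttest_ind

# Objective questions: correct/incorrect counts per model (illustrative).
table = [[30, 2],   # model A: correct, incorrect
         [31, 1]]   # model B: correct, incorrect
chi2, p_obj, _, expected = chi2_contingency(table)
# Fall back to Fisher's exact test when an expected cell count is below 5.
if (expected < 5).any():
    _, p_obj = fisher_exact(table)

# Subjective questions: expert scores per model (illustrative).
scores_a = [4, 5, 4, 5]
scores_b = [5, 5, 4, 5]
u_stat, p_subj = mannwhitneyu(scores_a, scores_b)   # nonparametric comparison
t_stat, p_t = ttest_ind(scores_a, scores_b)         # parametric alternative

print(p_obj, p_subj, p_t)
```

The expected-count check mirrors the standard rule of thumb for choosing between chi-square and Fisher's exact test on a 2×2 table.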