Evaluation of the ability of large language models to self-diagnose oral diseases

Bibliographic Details
Main Authors: Shiyang Zhuang, Yuanhao Zeng, Shaojunjie Lin, Xirui Chen, Yishan Xin, Hongyan Li, Yiming Lin, Chaofan Zhang, Yunzhi Lin
Format: Article
Language: English
Published: Elsevier 2024-12-01
Series: iScience
Online Access: http://www.sciencedirect.com/science/article/pii/S2589004224027226
Description
Summary: Large language models (LLMs) offer potential in primary dental care. We conducted an evaluation of LLMs’ diagnostic capabilities across various oral diseases and contexts. All LLMs showed diagnostic capabilities for temporomandibular joint disorders, periodontal disease, dental caries, and malocclusion. The prompts did not affect the performance of ChatGPT 3.5. When Chinese was used, the diagnostic ability of ChatGPT 3.5 for pulpitis improved (0% vs. 61.7%, p < 0.001), while its ability to diagnose pericoronitis decreased (8% vs. 0%, p < 0.001). For ChatGPT 4.0, using Chinese improved both (0% vs. 92% and 8% vs. 72%, respectively; p < 0.001). Claude 2 exhibited the highest accuracy in diagnosing pulpitis (36%, p = 0.048), while ChatGPT 4.0 showed complete diagnostic capability for pericoronitis. Llama 2 and Claude 3.5 Sonnet exhibited complete diagnostic capability for oral cancer. In conclusion, LLMs may be a potential tool for daily dental care but need further updates.
ISSN: 2589-0042