A comparison of the responses between ChatGPT and doctors in the field of cholelithiasis based on clinical practice guidelines: a cross-sectional study

Bibliographic Details
Main Authors: Tianyang Mao, Xin Zhao, Kangyi Jiang, Qingyun Xie, Manyu Yang, Ruoxuan Wang, Fengwei Gao
Format: Article
Language: English
Published: SAGE Publishing 2025-04-01
Series: Digital Health
Online Access: https://doi.org/10.1177/20552076251331804
collection DOAJ
description Background With the development of the information age, an increasing number of patients are seeking information about diseases on the Internet. In the medical field, several studies have confirmed that ChatGPT has great potential for medical education, generating imaging reports, and even supporting clinical diagnosis and treatment decisions, but its ability to answer questions related to gallstones has not yet been reported in the literature. Objective The aim of this study was to evaluate the consistency and accuracy of ChatGPT-generated answers to clinical questions on cholelithiasis, compared with answers provided by clinical experts. Methods This study designed a question-answering task based on clinical practice guidelines for cholelithiasis, with answers presented in the form of keywords. The questions were categorized into general questions and professional questions. To evaluate the performance of ChatGPT and the clinical experts, the study employed a modified matching scoring system, a keyword proportion evaluation system, and the DISCERN tool. Results ChatGPT often provided more keywords in its responses, but its accuracy was significantly lower than that of doctors (P < .001). In the evaluation of 33 general questions, ChatGPT and doctors demonstrated similar performance in both the modified matching scoring system and the keyword proportion evaluation (P = .856 and P = .829, respectively). However, in the evaluation of 32 professional questions, doctors consistently outperformed ChatGPT (P = .004 and P = .016). Additionally, while the DISCERN tool showed differences between general and professional questions (P = .001), both types of questions were rated highly overall. Conclusions Currently, ChatGPT performs similarly to clinical experts in answering general questions about cholelithiasis, but it cannot replace clinical experts in professional clinical decision-making.
As ChatGPT's performance improves through deep learning, it is expected to become more useful and effective in the field of cholelithiasis. Nevertheless, in more specialized areas, careful attention and continuous evaluation will be necessary to ensure its accuracy, reliability, and safety in the medical field.
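The Methods mention a "keyword proportion evaluation system" in which answers are scored against guideline-derived keywords. The record does not specify the study's actual scoring rules, so the following is only a hypothetical sketch of one plausible way such a proportion could be computed; the function name `keyword_proportion` and the sample keywords are illustrative assumptions, not taken from the study.

```python
def keyword_proportion(response: str, reference_keywords: list[str]) -> float:
    """Fraction of reference keywords that appear (case-insensitively) in a response.

    Hypothetical illustration only: the study's actual matching rules
    (synonyms, partial credit, weighting) are not described in this record.
    """
    if not reference_keywords:
        return 0.0
    text = response.lower()
    matched = [kw for kw in reference_keywords if kw.lower() in text]
    return len(matched) / len(reference_keywords)


# Illustrative example with made-up guideline keywords:
reference = ["laparoscopic cholecystectomy", "symptomatic", "gallbladder"]
answer = ("Laparoscopic cholecystectomy is the standard treatment "
          "for symptomatic gallstones.")
score = keyword_proportion(answer, reference)  # 2 of 3 keywords matched -> 2/3
```

A real evaluation would likely need keyword normalization and expert adjudication of near-matches; exact substring matching is only the simplest possible baseline.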
id doaj-art-cd8002f893ed4814b798d71dcb9e9b73
issn 2055-2076
Affiliations:
Tianyang Mao: Department of Clinical Medicine, Sichuan North Medical College, Nanchong, China
Xin Zhao: Department of Hepato-Pancreato-Biliary Surgery, , Leshan, China
Kangyi Jiang: Department of Hepato-Pancreato-Biliary Surgery, , Leshan, China
Qingyun Xie: Liver Transplantation Center, State Key Laboratory of Biotherapy and Cancer Center, , Sichuan University and Collaborative Innovation Center of Biotherapy, Chengdu, China
Manyu Yang: Department of Clinical Medicine, Sichuan North Medical College, Nanchong, China
Ruoxuan Wang: Department of Clinical Medicine, Sichuan North Medical College, Nanchong, China
Fengwei Gao: Liver Transplantation Center, State Key Laboratory of Biotherapy and Cancer Center, , Sichuan University and Collaborative Innovation Center of Biotherapy, Chengdu, China