A comparison of the responses between ChatGPT and doctors in the field of cholelithiasis based on clinical practice guidelines: a cross-sectional study
| Main Authors: | Tianyang Mao, Xin Zhao, Kangyi Jiang, Qingyun Xie, Manyu Yang, Ruoxuan Wang, Fengwei Gao |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | SAGE Publishing, 2025-04-01 |
| Series: | Digital Health |
| ISSN: | 2055-2076 |
| Online Access: | https://doi.org/10.1177/20552076251331804 |
Abstract

Background: With the growth of the information age, an increasing number of patients seek information about their diseases on the Internet. In the medical field, several studies have confirmed that ChatGPT has great potential in medical education, imaging-report generation, and even clinical diagnosis and treatment decisions, but its ability to answer questions about gallstones has not yet been reported in the literature.

Objective: To evaluate the consistency and accuracy of ChatGPT-generated answers to clinical questions on cholelithiasis, compared with answers provided by clinical experts.

Methods: We designed an answering task based on clinical practice guidelines for cholelithiasis, with answers presented as keywords. Questions were categorized as general or professional. To evaluate the answers of ChatGPT and the clinical experts, we used a modified matching scoring system, a keyword proportion evaluation system, and the DISCERN tool.

Results: ChatGPT often provided more keywords in its responses, but its accuracy was significantly lower than that of the doctors (P < .001). On the 33 general questions, ChatGPT and the doctors performed similarly on both the modified matching score and the keyword proportion evaluation (P = .856 and P = .829, respectively). On the 32 professional questions, however, the doctors consistently outperformed ChatGPT (P = .004 and P = .016). Additionally, although the DISCERN tool showed a difference between general and professional questions (P = .001), both question types were rated highly overall.

Conclusions: ChatGPT currently performs similarly to clinical experts in answering general questions about cholelithiasis, but it cannot replace clinical experts in professional clinical decision-making. As ChatGPT's performance improves through deep learning, it is expected to become more useful and effective in the field of cholelithiasis. Nevertheless, in more specialized areas, careful attention and continuous evaluation will be necessary to ensure its accuracy, reliability, and safety.
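The record does not include the study's scoring code, and the exact matching rules (synonym handling, partial credit) are not specified here. As a rough illustration only, a keyword proportion metric of the kind the abstract describes, i.e. the fraction of guideline reference keywords covered by a response, could be sketched as follows; the function name and keyword lists are hypothetical:

```python
def keyword_proportion(answer_keywords, reference_keywords):
    """Fraction of guideline reference keywords covered by an answer.

    Hypothetical sketch: exact-match on normalized keywords, which is
    simpler than any synonym or partial-match rules the study may use.
    """
    answer = {k.strip().lower() for k in answer_keywords}
    reference = {k.strip().lower() for k in reference_keywords}
    if not reference:
        return 0.0
    matched = answer & reference  # keywords present in both sets
    return len(matched) / len(reference)


# Illustrative example: a response covering 2 of 3 guideline keywords
score = keyword_proportion(
    ["ultrasound", "cholecystectomy", "antibiotics"],
    ["ultrasound", "cholecystectomy", "ERCP"],
)
```

Under such a metric, a model that lists many keywords is not rewarded for volume alone, which is consistent with the finding that ChatGPT produced more keywords yet scored lower on accuracy.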
Author affiliations:

1. Tianyang Mao: Department of Clinical Medicine, Sichuan North Medical College, Nanchong, China
2. Xin Zhao: Department of Hepato-Pancreato-Biliary Surgery, Leshan, China
3. Kangyi Jiang: Department of Hepato-Pancreato-Biliary Surgery, Leshan, China
4. Qingyun Xie: Liver Transplantation Center, State Key Laboratory of Biotherapy and Cancer Center, Sichuan University and Collaborative Innovation Center of Biotherapy, Chengdu, China
5. Manyu Yang: Department of Clinical Medicine, Sichuan North Medical College, Nanchong, China
6. Ruoxuan Wang: Department of Clinical Medicine, Sichuan North Medical College, Nanchong, China
7. Fengwei Gao: Liver Transplantation Center, State Key Laboratory of Biotherapy and Cancer Center, Sichuan University and Collaborative Innovation Center of Biotherapy, Chengdu, China