Usefulness of the large language model ChatGPT (GPT‐4) as a diagnostic tool and information source in dermatology

Abstract. Background: The field of artificial intelligence is rapidly evolving. As an easily accessible platform with vast user engagement, the Chat Generative Pre‐Trained Transformer (ChatGPT) holds great promise in medicine, with the latest version, GPT‐4, capable of analyzing clinical images. Objec...

Bibliographic Details
Main Authors:	Jacob P. S. Nielsen, Christian Grønhøj, Lone Skov, Mette Gyldenløve
Format:	Article
Language:	English
Published:	Wiley 2024-12-01
Series:	JEADV Clinical Practice
Subjects:	AI, artificial intelligence, Chatbot, ChatGPT, clinical dermatology, GPT‐4
Online Access:	https://doi.org/10.1002/jvc2.459

author	Jacob P. S. Nielsen, Christian Grønhøj, Lone Skov, Mette Gyldenløve
author_facet	Jacob P. S. Nielsen, Christian Grønhøj, Lone Skov, Mette Gyldenløve
author_sort	Jacob P. S. Nielsen
collection	DOAJ
description	Abstract. Background: The field of artificial intelligence is rapidly evolving. As an easily accessible platform with vast user engagement, the Chat Generative Pre‐Trained Transformer (ChatGPT) holds great promise in medicine, with the latest version, GPT‐4, capable of analyzing clinical images. Objectives: To evaluate ChatGPT as a diagnostic tool and information source in clinical dermatology. Methods: A total of 15 clinical images were selected from the Danish web atlas, Danderm, depicting various common and rare skin conditions. The images were uploaded to ChatGPT version GPT‐4, which was prompted with ‘Please provide a description, a potential diagnosis, and treatment options for the following dermatological condition’. The generated responses were assessed by senior registrars in dermatology and consultant dermatologists in terms of accuracy, relevance, and depth (scale 1–5), and in addition, the image quality was rated (scale 0–10). Demographic and professional information about the respondents was registered. Results: A total of 23 physicians participated in the study. The majority of the respondents were consultant dermatologists (83%), and 48% had more than 10 years of training. The overall image quality had a median rating of 10 out of 10 [interquartile range (IQR): 9–10]. The overall median rating of the ChatGPT generated responses was 2 (IQR: 1–4), while overall median ratings in terms of relevance, accuracy, and depth were 2 (IQR: 1–4), 3 (IQR: 2–4) and 2 (IQR: 1–3), respectively. Conclusions: Despite the advancements in ChatGPT, including newly added image processing capabilities, the chatbot demonstrated significant limitations in providing reliable and clinically useful responses to illustrative images of various dermatological conditions.
format	Article
id	doaj-art-fe2d79dafdac491eb8ab1daf0075761d
institution	DOAJ
issn	2768-6566
language	English
publishDate	2024-12-01
publisher	Wiley
record_format	Article
series	JEADV Clinical Practice
spelling	doaj-art-fe2d79dafdac491eb8ab1daf0075761d; 2025-08-20T02:50:55Z; eng; Wiley; JEADV Clinical Practice; 2768-6566; 2024-12-01; 3; 5; 1570; 1575; 10.1002/jvc2.459; Usefulness of the large language model ChatGPT (GPT‐4) as a diagnostic tool and information source in dermatology; Jacob P. S. Nielsen (0); Christian Grønhøj (1); Lone Skov (2); Mette Gyldenløve (3); Department of Otorhinolaryngology–Head and Neck Surgery and Audiology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark; Department of Otorhinolaryngology–Head and Neck Surgery and Audiology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark; Department of Dermatology and Allergy, Copenhagen University Hospital–Herlev and Gentofte, Copenhagen, Denmark; Department of Dermatology and Allergy, Copenhagen University Hospital–Herlev and Gentofte, Copenhagen, Denmark; Abstract Background The field of artificial intelligence is rapidly evolving. As an easily accessible platform with vast user engagement, the Chat Generative Pre‐Trained Transformer (ChatGPT) holds great promise in medicine, with the latest version, GPT‐4, capable of analyzing clinical images. Objectives To evaluate ChatGPT as a diagnostic tool and information source in clinical dermatology. Methods A total of 15 clinical images were selected from the Danish web atlas, Danderm, depicting various common and rare skin conditions. The images were uploaded to ChatGPT version GPT‐4, which was prompted with ‘Please provide a description, a potential diagnosis, and treatment options for the following dermatological condition’. The generated responses were assessed by senior registrars in dermatology and consultant dermatologists in terms of accuracy, relevance, and depth (scale 1–5), and in addition, the image quality was rated (scale 0–10). Demographic and professional information about the respondents was registered. Results A total of 23 physicians participated in the study. The majority of the respondents were consultant dermatologists (83%), and 48% had more than 10 years of training. The overall image quality had a median rating of 10 out of 10 [interquartile range (IQR): 9–10]. The overall median rating of the ChatGPT generated responses was 2 (IQR: 1–4), while overall median ratings in terms of relevance, accuracy, and depth were 2 (IQR: 1–4), 3 (IQR: 2–4) and 2 (IQR: 1–3), respectively. Conclusions Despite the advancements in ChatGPT, including newly added image processing capabilities, the chatbot demonstrated significant limitations in providing reliable and clinically useful responses to illustrative images of various dermatological conditions.; https://doi.org/10.1002/jvc2.459; AI; artificial intelligence; Chatbot; ChatGPT; clinical dermatology; GPT‐4
spellingShingle	Jacob P. S. Nielsen, Christian Grønhøj, Lone Skov, Mette Gyldenløve; Usefulness of the large language model ChatGPT (GPT‐4) as a diagnostic tool and information source in dermatology; JEADV Clinical Practice; AI, artificial intelligence, Chatbot, ChatGPT, clinical dermatology, GPT‐4
title	Usefulness of the large language model ChatGPT (GPT‐4) as a diagnostic tool and information source in dermatology
title_full	Usefulness of the large language model ChatGPT (GPT‐4) as a diagnostic tool and information source in dermatology
title_fullStr	Usefulness of the large language model ChatGPT (GPT‐4) as a diagnostic tool and information source in dermatology
title_full_unstemmed	Usefulness of the large language model ChatGPT (GPT‐4) as a diagnostic tool and information source in dermatology
title_short	Usefulness of the large language model ChatGPT (GPT‐4) as a diagnostic tool and information source in dermatology
title_sort	usefulness of the large language model chatgpt gpt 4 as a diagnostic tool and information source in dermatology
topic	AI, artificial intelligence, Chatbot, ChatGPT, clinical dermatology, GPT‐4
url	https://doi.org/10.1002/jvc2.459

Similar Items