To Self-Treat or Not to Self-Treat: Evaluating the Diagnostic, Advisory and Referral Effectiveness of ChatGPT Responses to the Most Common Musculoskeletal Disorders


Bibliographic Details
Main Authors: Ufuk Arzu, Batuhan Gencer
Format: Article
Language: English
Published: MDPI AG, 2025-07-01
Series: Diagnostics
Subjects: ChatGPT; self-diagnosis; self-treatment; readability; Flesch–Kincaid Grade Level; trauma
Online Access: https://www.mdpi.com/2075-4418/15/14/1834
Collection: DOAJ
Description: <b>Background/Objectives</b>: The increased accessibility of information has resulted in a rise in patients trying to self-diagnose and opting for self-medication, either as a primary treatment or as a supplement to medical care. Our objective was to evaluate the reliability, comprehensibility, and readability of the responses provided by ChatGPT 4.0 when queried about the most prevalent orthopaedic problems, thus ascertaining the occurrence of misguidance and the necessity for an audit of the disseminated information. <b>Methods:</b> ChatGPT 4.0 was presented with 26 open-ended questions. The responses were evaluated by two observers using a Likert scale in the categories of diagnosis, recommendation, and referral. The scores from the responses were subjected to subgroup analysis according to the area of interest (AoI) and anatomical region. The readability and comprehensibility of the chatbot’s responses were analyzed using the Flesch–Kincaid Reading Ease Score (FRES) and Flesch–Kincaid Grade Level (FKGL). <b>Results:</b> The majority of the responses were rated as either ‘adequate’ or ‘excellent’. However, in the diagnosis category, a significant difference was found in the evaluation made according to the AoI (<i>p</i> = 0.007), which is attributed to trauma-related questions. No significant difference was identified in any other category. The mean FKGL score was 7.8 ± 1.267, and the mean FRES was 52.68 ± 8.6. The average reading level required to understand the text was estimated to be “high school”. <b>Conclusions:</b> ChatGPT 4.0 facilitates the self-diagnosis and self-treatment tendencies of patients with musculoskeletal disorders. However, it is imperative for patients to have a robust understanding of the limitations of chatbot-generated advice, particularly in trauma-related conditions.
Record ID: doaj-art-e0614b5719bf4f89a1e4b599f9788e92
ISSN: 2075-4418
DOI: 10.3390/diagnostics15141834
Author Affiliations: Ufuk Arzu; Batuhan Gencer — Department of Orthopedics and Traumatology, Marmara University Pendik Training and Research Hospital, 34890 Istanbul, Turkey (both authors)