To Self-Treat or Not to Self-Treat: Evaluating the Diagnostic, Advisory and Referral Effectiveness of ChatGPT Responses to the Most Common Musculoskeletal Disorders
<b>Background/Objectives</b>: The increased accessibility of information has resulted in a rise in patients trying to self-diagnose and opting for self-medication, either as a primary treatment or as a supplement to medical care. Our objective was to evaluate the reliability, comprehensibility, and readability of the responses provided by ChatGPT 4.0 when queried about the most prevalent orthopaedic problems, thus ascertaining the occurrence of misguidance and the necessity for an audit of the disseminated information.
Saved in:
| Main Authors: | Ufuk Arzu, Batuhan Gencer |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Diagnostics |
| Subjects: | ChatGPT; self-diagnosis; self-treatment; readability; Flesch–Kincaid Grade Level; trauma |
| Online Access: | https://www.mdpi.com/2075-4418/15/14/1834 |
| _version_ | 1850078294444081152 |
|---|---|
| author | Ufuk Arzu; Batuhan Gencer |
| author_facet | Ufuk Arzu; Batuhan Gencer |
| author_sort | Ufuk Arzu |
| collection | DOAJ |
| description | <b>Background/Objectives</b>: The increased accessibility of information has resulted in a rise in patients trying to self-diagnose and opting for self-medication, either as a primary treatment or as a supplement to medical care. Our objective was to evaluate the reliability, comprehensibility, and readability of the responses provided by ChatGPT 4.0 when queried about the most prevalent orthopaedic problems, thus ascertaining the occurrence of misguidance and the necessity for an audit of the disseminated information. <b>Methods:</b> ChatGPT 4.0 was presented with 26 open-ended questions. The responses were evaluated by two observers using a Likert scale in the categories of diagnosis, recommendation, and referral. The scores from the responses were subjected to subgroup analysis according to the area of interest (AoI) and anatomical region. The readability and comprehensibility of the chatbot’s responses were analyzed using the Flesch–Kincaid Reading Ease Score (FRES) and Flesch–Kincaid Grade Level (FKGL). <b>Results:</b> The majority of the responses were rated as either ‘adequate’ or ‘excellent’. However, in the diagnosis category, a significant difference was found in the evaluation made according to the AoI (<i>p</i> = 0.007), which is attributed to trauma-related questions. No significant difference was identified in any other category. The mean FKGL score was 7.8 ± 1.267, and the mean FRES was 52.68 ± 8.6. The average estimated reading level required to understand the text was considered as “high school”. <b>Conclusions:</b> ChatGPT 4.0 facilitates the self-diagnosis and self-treatment tendencies of patients with musculoskeletal disorders. However, it is imperative for patients to have a robust understanding of the limitations of chatbot-generated advice, particularly in trauma-related conditions. |
| format | Article |
| id | doaj-art-e0614b5719bf4f89a1e4b599f9788e92 |
| institution | DOAJ |
| issn | 2075-4418 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Diagnostics |
| spelling | doaj-art-e0614b5719bf4f89a1e4b599f9788e92; 2025-08-20T02:45:34Z; eng; MDPI AG; Diagnostics; 2075-4418; 2025-07-01; 15(14):1834; doi:10.3390/diagnostics15141834. Title: To Self-Treat or Not to Self-Treat: Evaluating the Diagnostic, Advisory and Referral Effectiveness of ChatGPT Responses to the Most Common Musculoskeletal Disorders. Authors: Ufuk Arzu, Batuhan Gencer (Department of Orthopedics and Traumatology, Marmara University Pendik Training and Research Hospital, 34890 Istanbul, Turkey). URL: https://www.mdpi.com/2075-4418/15/14/1834. Keywords: ChatGPT; self-diagnosis; self-treatment; readability; Flesch–Kincaid Grade Level; trauma |
| title | To Self-Treat or Not to Self-Treat: Evaluating the Diagnostic, Advisory and Referral Effectiveness of ChatGPT Responses to the Most Common Musculoskeletal Disorders |
| topic | ChatGPT; self-diagnosis; self-treatment; readability; Flesch–Kincaid Grade Level; trauma |
| url | https://www.mdpi.com/2075-4418/15/14/1834 |
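The readability figures reported in the abstract (mean FKGL 7.8, mean FRES 52.68) come from the standard Flesch–Kincaid formulas. The sketch below is a minimal illustration of those formulas, not the study's actual tooling: the record does not say how the authors counted syllables, so the vowel-group counter here is an assumed approximation.

```python
import re

# Standard Flesch formulas. The syllable counter is a naive vowel-group
# heuristic used only for illustration; the study's exact counting method
# is not specified in this record.

def count_syllables(word: str) -> int:
    # Approximate syllables as runs of consecutive vowels (minimum 1).
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_scores(text: str) -> tuple[float, float]:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)           # words per sentence
    spw = syllables / len(words)                # syllables per word
    fres = 206.835 - 1.015 * wps - 84.6 * spw   # Flesch Reading Ease
    fkgl = 0.39 * wps + 11.8 * spw - 15.59      # Flesch-Kincaid Grade Level
    return fres, fkgl

# Hypothetical sample text, not taken from the study's chatbot responses.
sample = "Apply ice to the sprained ankle. Rest the joint and elevate it."
fres, fkgl = flesch_scores(sample)
print(f"FRES={fres:.1f}  FKGL={fkgl:.1f}")
```

On this scale, a FRES near 52.68 with FKGL near 7.8, as reported in the abstract, corresponds to "fairly difficult" text at roughly a high-school reading level.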