Assessment of ChatGPT-generated medical Arabic responses for patients with metabolic dysfunction-associated steatotic liver disease.
Main Authors:
Format: Article
Language: English
Published: Public Library of Science (PLoS), 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0317929
Summary: <h4>Background and aim</h4>Artificial intelligence (AI)-powered chatbots, such as Chat Generative Pretrained Transformer (ChatGPT), have shown promising results in healthcare settings. These tools can help patients obtain real-time responses to queries, ensuring immediate access to relevant information. This study aimed to explore the potential use of ChatGPT-generated medical Arabic responses for patients with metabolic dysfunction-associated steatotic liver disease (MASLD).<h4>Methods</h4>An English patient questionnaire on MASLD was translated into Arabic. The Arabic questions were then entered into ChatGPT 3.5 on November 12, 2023. The responses were evaluated by 10 Saudi MASLD experts, all native Arabic speakers, using Likert scales for three criteria: (1) accuracy, (2) completeness, and (3) comprehensibility. The questions were grouped into three domains: (1) specialist referral, (2) lifestyle, and (3) physical activity.<h4>Results</h4>The mean accuracy score was 4.9 ± 0.94 on a 6-point Likert scale, corresponding to "Nearly all correct." Kendall's coefficient of concordance (KCC) ranged from 0.025 to 0.649, with a mean of 0.28, indicating moderate agreement among all 10 experts. The mean completeness score was 2.4 ± 0.53 on a 3-point Likert scale, corresponding to "Comprehensive" (KCC: 0.03-0.553; mean: 0.22). The mean comprehensibility score was 2.74 ± 0.52 on a 3-point Likert scale, indicating that the responses were "Easy to understand" (KCC: 0.00-0.447; mean: 0.25).<h4>Conclusion</h4>MASLD experts found the ChatGPT responses to be accurate, complete, and comprehensible. The results support the growing trend of leveraging AI chatbots to improve the dissemination of information for patients with MASLD. However, many AI-powered chatbots require further enhancement of their scientific content to avoid the risk of circulating medical misinformation.
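The agreement statistic reported above, Kendall's coefficient of concordance (KCC, also called Kendall's W), measures how consistently multiple raters rank the same set of items, from 0 (no agreement) to 1 (perfect agreement). As a rough illustration of how it is computed, the following minimal Python sketch uses made-up ratings (not the study's data) and omits the tie correction:

```python
def kendalls_w(ratings):
    """Kendall's coefficient of concordance W for m raters scoring n items.

    ratings: list of m rows, one per rater, each a list of n scores.
    Assumes no tied scores within a rater's row (no tie correction).
    Returns W in [0, 1]; 1.0 means the raters rank all items identically.
    """
    m, n = len(ratings), len(ratings[0])
    rank_sums = [0.0] * n
    for row in ratings:
        # Convert the rater's scores to ranks 1..n (lowest score -> rank 1)
        order = sorted(range(n), key=lambda i: row[i])
        for rank, item in enumerate(order, start=1):
            rank_sums[item] += rank
    mean_sum = sum(rank_sums) / n
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)  # spread of rank sums
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical example: 3 raters scoring 4 responses on a 6-point scale
ratings = [[5, 3, 4, 6],
           [5, 2, 4, 6],
           [4, 3, 5, 6]]
print(round(kendalls_w(ratings), 3))  # → 0.911
```

When raters frequently assign tied scores, as is likely on a 3-point Likert scale, a tie-corrected variant using average ranks would be needed; this sketch shows only the basic formula.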
ISSN: 1932-6203