Evaluating GPT-4's role in critical patient management in emergency departments.

Introduction: Recent advancements in artificial intelligence (AI) have introduced tools like ChatGPT-4, capable of interpreting visual data, including ECGs. In our study, we aimed to investigate the effectiveness of GPT-4 in interpreting ECGs and managing patient care in emergency...

Bibliographic Details
Main Authors: Yavuz Yiğit, Serkan Günay, Ahmet Öztürk, Baha Alkahlout
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0327584
author Yavuz Yiğit
Serkan Günay
Ahmet Öztürk
Baha Alkahlout
collection DOAJ
description Introduction: Recent advancements in artificial intelligence (AI) have introduced tools like ChatGPT-4, capable of interpreting visual data, including ECGs. In our study, we aimed to investigate the effectiveness of GPT-4 in interpreting ECGs and managing patient care in emergency settings. Methods: Conducted from April to May 2024, this study evaluated GPT-4 using twenty case scenarios sourced from PubMed Central and the OSCE sample question book. These cases, categorized into common and rare scenarios, were analyzed by GPT-4, and its interpretations were reviewed by five experienced emergency medicine specialists. The accuracy of ECG interpretations and subsequent patient management plans was assessed using a structured evaluation framework and critical error identification. Results: GPT-4 made critical errors in 46% of ECG interpretations in the OSCE group and 50% in the PubMed group. For patient management, critical errors were found in 32% of the OSCE group and 14% of the PubMed group. When ECG evaluations were included in patient management, error rates approached 50%. The inter-rater reliability among evaluators indicated good agreement (ICC = 0.725, F = 3.72, p < 0.001). Conclusion: While GPT-4 shows promise in specific applications, its current limitations in accurately interpreting ECGs and managing critical patient scenarios render it inappropriate for emergency department use. Future improvements and extensive validations are essential before such AI tools can be reliably deployed in critical healthcare settings.
format Article
id doaj-art-c06af54aaea2441eb6b109afef78776f
institution DOAJ
issn 1932-6203
language English
publishDate 2025-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj-art-c06af54aaea2441eb6b109afef78776f | 2025-08-20T02:49:06Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2025-01-01 | 20 | 7 | e0327584 | 10.1371/journal.pone.0327584 | Evaluating GPT-4's role in critical patient management in emergency departments. | Yavuz Yiğit; Serkan Günay; Ahmet Öztürk; Baha Alkahlout | https://doi.org/10.1371/journal.pone.0327584
title Evaluating GPT-4's role in critical patient management in emergency departments.
url https://doi.org/10.1371/journal.pone.0327584