Exploring the medical ethical limitations of GPT-4 in clinical decision-making scenarios: a pilot survey

Background: This study examines GPT-4’s tendencies when confronted with ethical dilemmas and aims to ascertain its ethical limitations in clinical decision-making. Methods: Ethical dilemmas were synthesized and organized into 10 different scenarios. To assess the respon...

Bibliographic Details
Main Authors: Yu-Tao Xiong, Yu-Min Zeng, Hao-Nan Liu, Ya-Nan Sun, Wei Tang, Chang Liu
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-05-01
Series: Frontiers in Public Health
Subjects: artificial intelligence; GPT-4; ethical medical; decision-making; LLMs
Online Access: https://www.frontiersin.org/articles/10.3389/fpubh.2025.1582377/full
_version_ 1849762269958766592
author Yu-Tao Xiong
Yu-Min Zeng
Hao-Nan Liu
Ya-Nan Sun
Wei Tang
Chang Liu
author_facet Yu-Tao Xiong
Yu-Min Zeng
Hao-Nan Liu
Ya-Nan Sun
Wei Tang
Chang Liu
author_sort Yu-Tao Xiong
collection DOAJ
description Background: This study examines GPT-4’s tendencies when confronted with ethical dilemmas and aims to ascertain its ethical limitations in clinical decision-making. Methods: Ethical dilemmas were synthesized and organized into 10 different scenarios. To assess GPT-4’s responses to these dilemmas, a series of iterative and constrained prompting methods was employed. A custom questionnaire analysis and a principle adherence analysis were used to evaluate the GPT-4-generated responses: the questionnaire analysis assessed GPT-4’s ability to provide clinical decision-making recommendations, while the principle adherence analysis evaluated its alignment with ethical principles. An error analysis was also conducted on the GPT-4-generated responses. Results: In the questionnaire analysis (5-point Likert scale), GPT-4 achieved an average score of 4.49, with the highest score in the Physical Disability scenario (4.75) and the lowest in the Abortion/Surrogacy scenario (3.82). The principle adherence analysis showed an overall consistency rate of 86%, with lower performance (60%) observed in a few specific scenarios. Conclusion: At the current stage, with appropriate prompting techniques, GPT-4 can offer proactive and comprehensible recommendations for clinical decision-making. However, GPT-4 exhibits certain errors during this process, leading to inconsistencies with ethical principles and thereby limiting its deeper application in clinical practice.
format Article
id doaj-art-bceb49de3cb54be595eef742d3ec0334
institution DOAJ
issn 2296-2565
language English
publishDate 2025-05-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Public Health
spelling doaj-art-bceb49de3cb54be595eef742d3ec0334
2025-08-20T03:05:46Z
eng
Frontiers Media S.A.
Frontiers in Public Health
2296-2565
2025-05-01
13
10.3389/fpubh.2025.1582377
1582377
Exploring the medical ethical limitations of GPT-4 in clinical decision-making scenarios: a pilot survey
Yu-Tao Xiong (0), Yu-Min Zeng (1), Hao-Nan Liu (2), Ya-Nan Sun (3), Wei Tang (4), Chang Liu (5)
0, 1, 2, 4, 5: State Key Laboratory of Oral Diseases and National Center for Stomatology and National Clinical Research Center for Oral Diseases and Department of Oral and Maxillofacial Surgery, West China Hospital of Stomatology, Sichuan University, Chengdu, China
3: Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, China
Background: This study examines GPT-4’s tendencies when confronted with ethical dilemmas and aims to ascertain its ethical limitations in clinical decision-making. Methods: Ethical dilemmas were synthesized and organized into 10 different scenarios. To assess GPT-4’s responses to these dilemmas, a series of iterative and constrained prompting methods was employed. A custom questionnaire analysis and a principle adherence analysis were used to evaluate the GPT-4-generated responses: the questionnaire analysis assessed GPT-4’s ability to provide clinical decision-making recommendations, while the principle adherence analysis evaluated its alignment with ethical principles. An error analysis was also conducted on the GPT-4-generated responses. Results: In the questionnaire analysis (5-point Likert scale), GPT-4 achieved an average score of 4.49, with the highest score in the Physical Disability scenario (4.75) and the lowest in the Abortion/Surrogacy scenario (3.82). The principle adherence analysis showed an overall consistency rate of 86%, with lower performance (60%) observed in a few specific scenarios. Conclusion: At the current stage, with appropriate prompting techniques, GPT-4 can offer proactive and comprehensible recommendations for clinical decision-making. However, GPT-4 exhibits certain errors during this process, leading to inconsistencies with ethical principles and thereby limiting its deeper application in clinical practice.
https://www.frontiersin.org/articles/10.3389/fpubh.2025.1582377/full
artificial intelligence
GPT-4
ethical medical
decision-making
LLMs
spellingShingle Yu-Tao Xiong
Yu-Min Zeng
Hao-Nan Liu
Ya-Nan Sun
Wei Tang
Chang Liu
Exploring the medical ethical limitations of GPT-4 in clinical decision-making scenarios: a pilot survey
Frontiers in Public Health
artificial intelligence
GPT-4
ethical medical
decision-making
LLMs
title Exploring the medical ethical limitations of GPT-4 in clinical decision-making scenarios: a pilot survey
title_full Exploring the medical ethical limitations of GPT-4 in clinical decision-making scenarios: a pilot survey
title_fullStr Exploring the medical ethical limitations of GPT-4 in clinical decision-making scenarios: a pilot survey
title_full_unstemmed Exploring the medical ethical limitations of GPT-4 in clinical decision-making scenarios: a pilot survey
title_short Exploring the medical ethical limitations of GPT-4 in clinical decision-making scenarios: a pilot survey
title_sort exploring the medical ethical limitations of gpt 4 in clinical decision making scenarios a pilot survey
topic artificial intelligence
GPT-4
ethical medical
decision-making
LLMs
url https://www.frontiersin.org/articles/10.3389/fpubh.2025.1582377/full
work_keys_str_mv AT yutaoxiong exploringthemedicalethicallimitationsofgpt4inclinicaldecisionmakingscenariosapilotsurvey
AT yuminzeng exploringthemedicalethicallimitationsofgpt4inclinicaldecisionmakingscenariosapilotsurvey
AT haonanliu exploringthemedicalethicallimitationsofgpt4inclinicaldecisionmakingscenariosapilotsurvey
AT yanansun exploringthemedicalethicallimitationsofgpt4inclinicaldecisionmakingscenariosapilotsurvey
AT weitang exploringthemedicalethicallimitationsofgpt4inclinicaldecisionmakingscenariosapilotsurvey
AT changliu exploringthemedicalethicallimitationsofgpt4inclinicaldecisionmakingscenariosapilotsurvey