Exploring the medical ethical limitations of GPT-4 in clinical decision-making scenarios: a pilot survey
Background: This study examines GPT-4’s tendencies when confronted with ethical dilemmas and ascertains its ethical limitations in clinical decision-making. Methods: Ethical dilemmas were synthesized and organized into 10 different scenarios. To assess the respon...
| Main Authors: | Yu-Tao Xiong, Yu-Min Zeng, Hao-Nan Liu, Ya-Nan Sun, Wei Tang, Chang Liu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-05-01 |
| Series: | Frontiers in Public Health |
| Subjects: | |
| Online Access: | https://www.frontiersin.org/articles/10.3389/fpubh.2025.1582377/full |
| _version_ | 1849762269958766592 |
|---|---|
| author | Yu-Tao Xiong, Yu-Min Zeng, Hao-Nan Liu, Ya-Nan Sun, Wei Tang, Chang Liu |
| author_sort | Yu-Tao Xiong |
| collection | DOAJ |
| description | Background: This study examines GPT-4’s tendencies when confronted with ethical dilemmas and ascertains its ethical limitations in clinical decision-making. Methods: Ethical dilemmas were synthesized and organized into 10 different scenarios. To assess GPT-4’s responses to these dilemmas, a series of iterative and constrained prompting methods was employed. The GPT-4-generated responses were evaluated with a custom questionnaire analysis and a principle adherence analysis. The questionnaire analysis assessed GPT-4’s ability to provide clinical decision-making recommendations, while the principle adherence analysis evaluated its alignment with ethical principles. An error analysis was also conducted on the GPT-4-generated responses. Results: In the questionnaire analysis (5-point Likert scale), GPT-4 achieved an average score of 4.49, with the highest score in the Physical Disability scenario (4.75) and the lowest in the Abortion/Surrogacy scenario (3.82). The principle adherence analysis showed that GPT-4 achieved an overall consistency rate of 86%, with slightly lower performance (60%) observed in a few specific scenarios. Conclusion: At the current stage, with appropriate prompting techniques, GPT-4 can offer proactive and comprehensible recommendations for clinical decision-making. However, GPT-4 exhibits certain errors during this process, leading to inconsistencies with ethical principles and thereby limiting its deeper application in clinical practice. |
| format | Article |
| id | doaj-art-bceb49de3cb54be595eef742d3ec0334 |
| institution | DOAJ |
| issn | 2296-2565 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Public Health |
| spelling | doaj-art-bceb49de3cb54be595eef742d3ec0334; 2025-08-20T03:05:46Z; eng; Frontiers Media S.A.; Frontiers in Public Health; ISSN 2296-2565; 2025-05-01; volume 13; DOI 10.3389/fpubh.2025.1582377; article 1582377; Exploring the medical ethical limitations of GPT-4 in clinical decision-making scenarios: a pilot survey; Yu-Tao Xiong (0), Yu-Min Zeng (1), Hao-Nan Liu (2), Ya-Nan Sun (3), Wei Tang (4), Chang Liu (5); affiliations: (0, 1, 2, 4, 5) State Key Laboratory of Oral Diseases and National Center for Stomatology and National Clinical Research Center for Oral Diseases and Department of Oral and Maxillofacial Surgery, West China Hospital of Stomatology, Sichuan University, Chengdu, China; (3) Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, China; https://www.frontiersin.org/articles/10.3389/fpubh.2025.1582377/full; artificial intelligence GPT-4 ethical medical decision-making LLMs |
| title | Exploring the medical ethical limitations of GPT-4 in clinical decision-making scenarios: a pilot survey |
| topic | artificial intelligence GPT-4 ethical medical decision-making LLMs |
| url | https://www.frontiersin.org/articles/10.3389/fpubh.2025.1582377/full |