Assessment of Large Language Models (LLMs) in decision-making support for gynecologic oncology
Objective: This study investigated the ability of Large Language Models (LLMs) to provide accurate and consistent answers by focusing on their performance in complex gynecologic cancer cases. Background: LLMs are advancing rapidly and require a thorough evaluation to ensure that they can be safely a...
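| Main Authors: | Khanisyah Erza Gumilar, Birama R. Indraprasta, Ach Salman Faridzi, Bagus M. Wibowo, Aditya Herlambang, Eccita Rahestyningtyas, Budi Irawan, Zulkarnain Tambunan, Ahmad Fadhli Bustomi, Bagus Ngurah Brahmantara, Zih-Ying Yu, Yu-Cheng Hsu, Herlangga Pramuditya, Very Great E. Putra, Hari Nugroho, Pungky Mulawardhana, Brahmana A. Tjokroprawiro, Tri Hedianto, Ibrahim H. Ibrahim, Jingshan Huang, Dongqi Li, Chien-Hsing Lu, Jer-Yen Yang, Li-Na Liao, Ming Tan |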
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2024-12-01 |
| Series: | Computational and Structural Biotechnology Journal |
| Subjects: | Gynecologic cancer; Large Language Models; Accuracy; Consistency; Artificial intelligence |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2001037024003702 |
| _version_ | 1850191295000084480 |
|---|---|
| author | Khanisyah Erza Gumilar; Birama R. Indraprasta; Ach Salman Faridzi; Bagus M. Wibowo; Aditya Herlambang; Eccita Rahestyningtyas; Budi Irawan; Zulkarnain Tambunan; Ahmad Fadhli Bustomi; Bagus Ngurah Brahmantara; Zih-Ying Yu; Yu-Cheng Hsu; Herlangga Pramuditya; Very Great E. Putra; Hari Nugroho; Pungky Mulawardhana; Brahmana A. Tjokroprawiro; Tri Hedianto; Ibrahim H. Ibrahim; Jingshan Huang; Dongqi Li; Chien-Hsing Lu; Jer-Yen Yang; Li-Na Liao; Ming Tan |
| author_facet | Khanisyah Erza Gumilar; Birama R. Indraprasta; Ach Salman Faridzi; Bagus M. Wibowo; Aditya Herlambang; Eccita Rahestyningtyas; Budi Irawan; Zulkarnain Tambunan; Ahmad Fadhli Bustomi; Bagus Ngurah Brahmantara; Zih-Ying Yu; Yu-Cheng Hsu; Herlangga Pramuditya; Very Great E. Putra; Hari Nugroho; Pungky Mulawardhana; Brahmana A. Tjokroprawiro; Tri Hedianto; Ibrahim H. Ibrahim; Jingshan Huang; Dongqi Li; Chien-Hsing Lu; Jer-Yen Yang; Li-Na Liao; Ming Tan |
| author_sort | Khanisyah Erza Gumilar |
| collection | DOAJ |
| description | Objective: This study investigated the ability of Large Language Models (LLMs) to provide accurate and consistent answers by focusing on their performance in complex gynecologic cancer cases. Background: LLMs are advancing rapidly and require a thorough evaluation to ensure that they can be safely and effectively used in clinical decision-making. Such evaluations are essential for confirming LLM reliability and accuracy in supporting medical professionals in casework. Study design: We assessed three prominent LLMs—ChatGPT-4 (CG-4), Gemini Advanced (GemAdv), and Copilot—evaluating their accuracy, consistency, and overall performance. Fifteen clinical vignettes of varying difficulty and five open-ended questions based on real patient cases were used. The responses were coded, randomized, and evaluated blindly by six expert gynecologic oncologists using a 5-point Likert scale for relevance, clarity, depth, focus, and coherence. Results: GemAdv demonstrated superior accuracy (81.87 %) compared to both CG-4 (61.60 %) and Copilot (70.67 %) across all difficulty levels. GemAdv consistently provided correct answers more frequently (>60 % every day during the testing period). Although CG-4 showed a slight advantage in adhering to the National Comprehensive Cancer Network (NCCN) treatment guidelines, GemAdv excelled in the depth and focus of the answers provided, which are crucial aspects of clinical decision-making. Conclusion: LLMs, especially GemAdv, show potential in supporting clinical practice by providing accurate, consistent, and relevant information for gynecologic cancer. However, further refinement is needed for more complex scenarios. This study highlights the promise of LLMs in gynecologic oncology, emphasizing the need for ongoing development and rigorous evaluation to maximize their clinical utility and reliability. |
| format | Article |
| id | doaj-art-e70d57f57c814739bf47e3e669ca2d34 |
| institution | OA Journals |
| issn | 2001-0370 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Computational and Structural Biotechnology Journal |
| spelling | doaj-art-e70d57f57c814739bf47e3e669ca2d34 2025-08-20T02:14:57Z eng Elsevier Computational and Structural Biotechnology Journal 2001-0370 2024-12-01 23 4019 4026 10.1016/j.csbj.2024.10.050 Assessment of Large Language Models (LLMs) in decision-making support for gynecologic oncology. Khanisyah Erza Gumilar [0], Birama R. Indraprasta [1], Ach Salman Faridzi [2], Bagus M. Wibowo [3], Aditya Herlambang [4], Eccita Rahestyningtyas [5], Budi Irawan [6], Zulkarnain Tambunan [7], Ahmad Fadhli Bustomi [8], Bagus Ngurah Brahmantara [9], Zih-Ying Yu [10], Yu-Cheng Hsu [11], Herlangga Pramuditya [12], Very Great E. Putra [13], Hari Nugroho [14], Pungky Mulawardhana [15], Brahmana A. Tjokroprawiro [16], Tri Hedianto [17], Ibrahim H. Ibrahim [18], Jingshan Huang [19], Dongqi Li [20], Chien-Hsing Lu [21], Jer-Yen Yang [22], Li-Na Liao [23], Ming Tan [24]. [0] Graduate Institute of Biomedical Science, China Medical University, Taichung, Taiwan; Department of Obstetrics and Gynecology, Hospital of Universitas Airlangga - Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia; Correspondence to: Department of Obstetrics and Gynecology, Hospital of Universitas Airlangga, Faculty of Medicine, Universitas Airlangga, Jl. Dharmahusada Permai, Mulyorejo, Surabaya, Jawa Timur 60115, Indonesia. [1] Department of Obstetrics and Gynecology, Dr. Soetomo General Hospital - Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia. [2] Department of Obstetrics and Gynecology, Dr. Soetomo General Hospital - Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia. [3] Department of Obstetrics and Gynecology, Dr. Soetomo General Hospital - Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia. [4] Department of Obstetrics and Gynecology, Dr. Soetomo General Hospital - Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia. [5] Department of Obstetrics and Gynecology, Hospital of Universitas Airlangga - Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia. [6] Department of Obstetrics and Gynecology, Dr. Soetomo General Hospital - Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia. [7] Department of Obstetrics and Gynecology, Dr. Soetomo General Hospital - Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia. [8] Department of Obstetrics and Gynecology, Dr. Soetomo General Hospital - Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia. [9] Department of Obstetrics and Gynecology, Dr. Soetomo General Hospital - Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia. [10] Department of Public Health, China Medical University, Taichung, Taiwan. [11] Department of Public Health, China Medical University, Taichung, Taiwan; School of Chinese Medicine, China Medical University, Taichung, Taiwan. [12] Department of Obstetrics and Gynecology, Dr. Ramelan Naval Hospital, Surabaya, Indonesia. [13] Department of Obstetrics and Gynecology, Dr. Kariadi Central General Hospital, Semarang, Indonesia. [14] Department of Obstetrics and Gynecology, Dr. Soetomo General Hospital - Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia. [15] Department of Obstetrics and Gynecology, Hospital of Universitas Airlangga - Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia. [16] Department of Obstetrics and Gynecology, Dr. Soetomo General Hospital - Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia. [17] Faculty of Medicine and Health, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia. [18] Graduate Institute of Biomedical Science, China Medical University, Taichung, Taiwan. [19] School of Computing, College of Medicine, University of South Alabama, Mobile, AL, USA. [20] School of Information and Computer Sciences, School of Social and Behavioral Sciences, University of California, Irvine, CA, USA. [21] Department of Obstetrics and Gynecology, Taichung Veteran General Hospital, Taichung, Taiwan. [22] Graduate Institute of Biomedical Science, China Medical University, Taichung, Taiwan; Correspondence to: Graduate Institute of Biomedical Science, China Medical University, No. 100, Section 1, Jingmao Road, Beitun District, Taichung City 406040, Taiwan. [23] Department of Public Health, China Medical University, Taichung, Taiwan; Correspondence to: Department of Public Health, China Medical University, No. 100, Section 1, Jingmao Road, Beitun District, Taichung City 406040, Taiwan. [24] Graduate Institute of Biomedical Science, China Medical University, Taichung, Taiwan; Institute of Biochemistry and Molecular Biology and Research Center for Cancer Biology, China Medical University, Taichung, Taiwan; Correspondence to: Institute of Biochemistry and Molecular Biology, Graduate Institute of Biomedical Sciences, China Medical University (Taiwan), No. 100, Section 1, Jingmao Road, Beitun District, Taichung City 406040, Taiwan. Objective: This study investigated the ability of Large Language Models (LLMs) to provide accurate and consistent answers by focusing on their performance in complex gynecologic cancer cases. Background: LLMs are advancing rapidly and require a thorough evaluation to ensure that they can be safely and effectively used in clinical decision-making. Such evaluations are essential for confirming LLM reliability and accuracy in supporting medical professionals in casework. Study design: We assessed three prominent LLMs: ChatGPT-4 (CG-4), Gemini Advanced (GemAdv), and Copilot, evaluating their accuracy, consistency, and overall performance. Fifteen clinical vignettes of varying difficulty and five open-ended questions based on real patient cases were used. The responses were coded, randomized, and evaluated blindly by six expert gynecologic oncologists using a 5-point Likert scale for relevance, clarity, depth, focus, and coherence. Results: GemAdv demonstrated superior accuracy (81.87 %) compared to both CG-4 (61.60 %) and Copilot (70.67 %) across all difficulty levels. GemAdv consistently provided correct answers more frequently (>60 % every day during the testing period). Although CG-4 showed a slight advantage in adhering to the National Comprehensive Cancer Network (NCCN) treatment guidelines, GemAdv excelled in the depth and focus of the answers provided, which are crucial aspects of clinical decision-making. Conclusion: LLMs, especially GemAdv, show potential in supporting clinical practice by providing accurate, consistent, and relevant information for gynecologic cancer. However, further refinement is needed for more complex scenarios. This study highlights the promise of LLMs in gynecologic oncology, emphasizing the need for ongoing development and rigorous evaluation to maximize their clinical utility and reliability. http://www.sciencedirect.com/science/article/pii/S2001037024003702 Gynecologic cancer; Large Language Models; Accuracy; Consistency; Artificial intelligence |
| spellingShingle | Khanisyah Erza Gumilar Birama R. Indraprasta Ach Salman Faridzi Bagus M. Wibowo Aditya Herlambang Eccita Rahestyningtyas Budi Irawan Zulkarnain Tambunan Ahmad Fadhli Bustomi Bagus Ngurah Brahmantara Zih-Ying Yu Yu-Cheng Hsu Herlangga Pramuditya Very Great E. Putra Hari Nugroho Pungky Mulawardhana Brahmana A. Tjokroprawiro Tri Hedianto Ibrahim H. Ibrahim Jingshan Huang Dongqi Li Chien-Hsing Lu Jer-Yen Yang Li-Na Liao Ming Tan Assessment of Large Language Models (LLMs) in decision-making support for gynecologic oncology Computational and Structural Biotechnology Journal Gynecologic cancer Large Language Models Accuracy Consistency Artificial intelligence |
| title | Assessment of Large Language Models (LLMs) in decision-making support for gynecologic oncology |
| title_full | Assessment of Large Language Models (LLMs) in decision-making support for gynecologic oncology |
| title_fullStr | Assessment of Large Language Models (LLMs) in decision-making support for gynecologic oncology |
| title_full_unstemmed | Assessment of Large Language Models (LLMs) in decision-making support for gynecologic oncology |
| title_short | Assessment of Large Language Models (LLMs) in decision-making support for gynecologic oncology |
| title_sort | assessment of large language models llms in decision making support for gynecologic oncology |
| topic | Gynecologic cancer; Large Language Models; Accuracy; Consistency; Artificial intelligence |
| url | http://www.sciencedirect.com/science/article/pii/S2001037024003702 |