How appropriately can generative artificial intelligence platforms, including GPT-4, Gemini, Bing, and Wrtn, answer questions about colon cancer in the Korean language?

Purpose This study aims to assess the performance of 4 generative artificial intelligence (AI) platforms—Gemini (formerly Bard), Bing, GPT-4, and Wrtn—in answering questions about colon cancer in the Korean language. Two main research questions guided this study. First, which AI platform provides th...

Bibliographic Details
Main Author:	Sun Huh
Format:	Article
Language:	English
Published:	Korean Society of Coloproctology 2025-06-01
Series:	Annals of Coloproctology
Subjects:	artificial intelligence; colorectal surgery; colonic neoplasms; information sources; surgeons
Online Access:	http://coloproctol.org/upload/pdf/ac-2024-00122-0017.pdf

_version_	1849428445282435072
author	Sun Huh
collection	DOAJ
description	Purpose: This study aims to assess the performance of 4 generative artificial intelligence (AI) platforms—Gemini (formerly Bard), Bing, GPT-4, and Wrtn—in answering questions about colon cancer in the Korean language. Two main research questions guided this study. First, which AI platform provides the most accurate answers? Second, can these AI-generated answers be reliably used to educate patients and their families about colon cancer? Methods: Ten questions selected by the author were posed to the 4 generative AI platforms on February 22, 2024. Two colorectal surgeons in Korea, each with over 20 years of clinical experience, independently evaluated the answers provided by these generative AI platforms. Results: The generative AI platforms scored an average of 5.5 out of 10 points. Wrtn achieved the highest score at 6 points, followed by GPT-4 and Gemini, each with 5.5, and Bing, scoring 5 points. The weighted κ for inter-rater reliability was 0.597 (P<0.001). The generative AI platforms performed well in explaining the occult blood test for cancer screening, keyhole surgery, and dietary recommendations for cancer prevention. However, they demonstrated significant limitations in answering more complex topics, such as estimating survival rates following surgery, choosing targeted therapy after surgery, and accurately reporting the mortality rate due to colon cancer in Korea. Conclusion: The findings suggest that using these generative AI platforms as educational resources for patients and their families regarding colon cancer is premature. Further training on colorectal diseases is required before these AI platforms can be considered reliable information sources for the general public in Korea.
format	Article
id	doaj-art-7e5218ad919e4e79a486a5506dd67f7b
institution	Kabale University
issn	2287-9714 2287-9722
language	English
publishDate	2025-06-01
publisher	Korean Society of Coloproctology
record_format	Article
series	Annals of Coloproctology
spelling	doaj-art-7e5218ad919e4e79a486a5506dd67f7b | 2025-08-20T03:28:43Z | eng | Korean Society of Coloproctology | Annals of Coloproctology | 2287-9714 | 2287-9722 | 2025-06-01 | 41 | 3 | 190 | 197 | 10.3393/ac.2024.00122.0017 | 2104 | How appropriately can generative artificial intelligence platforms, including GPT-4, Gemini, Bing, and Wrtn, answer questions about colon cancer in the Korean language? | Sun Huh | http://coloproctol.org/upload/pdf/ac-2024-00122-0017.pdf | artificial intelligence | colorectal surgery | colonic neoplasms | information sources | surgeons
title	How appropriately can generative artificial intelligence platforms, including GPT-4, Gemini, Bing, and Wrtn, answer questions about colon cancer in the Korean language?
topic	artificial intelligence; colorectal surgery; colonic neoplasms; information sources; surgeons
url	http://coloproctol.org/upload/pdf/ac-2024-00122-0017.pdf

Similar Items