Evaluating the perspectives of ChatGPT and Gemini on glenohumeral osteoarthritis management

Background: Integrating machine learning and artificial intelligence (AI) technologies has revolutionized various sectors, including health care. However, their application in orthopedic health-care settings still needs to be improved. This study sought to evaluate Chat Generative Pre-Trained Transf...

Bibliographic Details
Main Authors:	Michael Megafu, DO, MPH; Omar Guerrero, BS; Rafay Hasan, BS; Larry Hunt, MBA; Devri Langhelm, BS; Benning Le, MS; Xinning Li, MD; Robert Kelly, IV, MD; Robert L. Parisien, MD; Antonio Cusano, MD
Format:	Article
Language:	English
Published:	Elsevier 2025-07-01
Series:	JSES International
Subjects:	Basic Science Study; Validation of AI Software
Online Access:	http://www.sciencedirect.com/science/article/pii/S2666638325000933

author	Michael Megafu, DO, MPH; Omar Guerrero, BS; Rafay Hasan, BS; Larry Hunt, MBA; Devri Langhelm, BS; Benning Le, MS; Xinning Li, MD; Robert Kelly, IV, MD; Robert L. Parisien, MD; Antonio Cusano, MD
collection	DOAJ
description	Background: Integrating machine learning and artificial intelligence (AI) technologies has revolutionized various sectors, including health care. However, their application in orthopedic health-care settings still needs to be improved. This study sought to evaluate Chat Generative Pre-Trained Transformer (ChatGPT) and Gemini's capacity to make quality medical recommendations regarding glenohumeral osteoarthritis, weighing them against the recommendations established in the Evidence-Based Clinical Practice Guidelines (CPGs) of the American Academy of Orthopaedic Surgeons (AAOS). Methods: The 2020 AAOS CPGs, a widely recognized and respected source, were the basis for determining recommended and nonrecommended treatments in this study. ChatGPT and Gemini were queried on 20 treatments based on these guidelines; 10 were recommended for managing glenohumeral joint osteoarthritis, five were not recommended for managing glenohumeral joint osteoarthritis, and five were reported as consensus statements. These responses were categorized as “Concordance” or “No Concordance” with the AAOS CPGs. A Cohen's Kappa coefficient was calculated to assess the interrater reliability. Results: Among the 20 treatments examined, ChatGPT and Gemini showed concordance with the AAOS CPGs for 10 (100%) and 5 (50%) treatments, respectively. On the other hand, for treatments that AAOS CPGs did not recommend, ChatGPT had concordance for four out of the five treatments (80%), while Gemini had 100% concordance. The Cohen's Kappa coefficient to assess interrater reliability was found to be 0.90, indicating a very high level of agreement between the two raters in categorizing responses as “Concordance” or “No Concordance” with the AAOS CPGs. Conclusion: The study findings reveal that ChatGPT and Gemini cannot solely recommend CPGs as outlined in AAOS CPGs. As patients increasingly utilize external resources such as AI platforms and the Internet for medical recommendations, providers should advise patients to exercise caution when seeking medical advice from these AI platforms for managing glenohumeral joint osteoarthritis.
format	Article
id	doaj-art-e486b13c57044cf2ae2a38ba9c42575c
institution	Kabale University
issn	2666-6383
language	English
publishDate	2025-07-01
publisher	Elsevier
record_format	Article
series	JSES International
spelling	doaj-art-e486b13c57044cf2ae2a38ba9c42575c | 2025-08-20T03:50:22Z | eng | Elsevier | JSES International | 2666-6383 | 2025-07-01 | 9 | 4 | 1365 | 1370 | 10.1016/j.jseint.2025.03.011 | Evaluating the perspectives of ChatGPT and Gemini on glenohumeral osteoarthritis management | Michael Megafu, DO, MPH (0) | Omar Guerrero, BS (1) | Rafay Hasan, BS (2) | Larry Hunt, MBA (3) | Devri Langhelm, BS (4) | Benning Le, MS (5) | Xinning Li, MD (6) | Robert Kelly, IV, MD (7) | Robert L. Parisien, MD (8) | Antonio Cusano, MD (9) | (0) Department of Orthopaedic Surgery, University of Connecticut, Farmington, CT, USA; Corresponding author: Michael Megafu, DO, MPH, Department of Orthopaedic Surgery, University of Connecticut, 263 Farmington Ave, Farmington, CT 06032, USA. | (1) A.T. Still University School of Osteopathic Medicine in Arizona, Mesa, AZ, USA | (2) A.T. Still University School of Osteopathic Medicine in Arizona, Mesa, AZ, USA | (3) A.T. Still University School of Osteopathic Medicine in Arizona, Mesa, AZ, USA | (4) A.T. Still University School of Osteopathic Medicine in Arizona, Mesa, AZ, USA | (5) A.T. Still University School of Osteopathic Medicine in Arizona, Mesa, AZ, USA | (6) Department of Orthopedic Surgery, Boston University School of Medicine, Boston, MA, USA | (7) Department of Orthopedic Surgery, University of Pennsylvania, Philadelphia, PA, USA | (8) Department of Orthopedic Surgery, Mount Sinai, New York, NY, USA | (9) Department of Orthopaedic Surgery, University of Connecticut, Farmington, CT, USA | http://www.sciencedirect.com/science/article/pii/S2666638325000933 | Basic Science Study | Validation of AI Software
title	Evaluating the perspectives of ChatGPT and Gemini on glenohumeral osteoarthritis management
topic	Basic Science Study; Validation of AI Software
url	http://www.sciencedirect.com/science/article/pii/S2666638325000933
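
The description field above reports a Cohen's kappa of 0.90 for two raters sorting responses into “Concordance” or “No Concordance”. As a rough illustration of how such a coefficient is computed, here is a minimal Python sketch. The function name, the example labels, and the handling of the degenerate case are assumptions made for illustration; they are not taken from the article, and the labels are not the study's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is undefined
        # (0/0), and returning 1.0 here is a convention assumed for this sketch.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of 10 responses (NOT the study's data).
C, N = "Concordance", "No Concordance"
rater_1 = [C, C, C, C, C, C, N, N, N, N]
rater_2 = [C, C, C, C, C, N, N, N, N, N]
kappa = cohens_kappa(rater_1, rater_2)
print(round(kappa, 2))
```

In this made-up example the raters agree on 9 of 10 items (observed agreement 0.9) while chance agreement is 0.5, so kappa is (0.9 - 0.5) / (1 - 0.5) = 0.8.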