Evaluating the perspectives of ChatGPT and Gemini on glenohumeral osteoarthritis management
Background: Integrating machine learning and artificial intelligence (AI) technologies has revolutionized various sectors, including health care. However, their application in orthopedic health-care settings still needs to be improved. This study sought to evaluate Chat Generative Pre-Trained Transf...
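| Main Authors: | Michael Megafu, Omar Guerrero, Rafay Hasan, Larry Hunt, Devri Langhelm, Benning Le, Xinning Li, Robert Kelly IV, Robert L. Parisien, Antonio Cusano |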
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-07-01 |
| Series: | JSES International |
| Subjects: | Basic Science Study; Validation of AI Software |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2666638325000933 |
| _version_ | 1849319576712511488 |
|---|---|
| author | Michael Megafu, DO, MPH Omar Guerrero, BS Rafay Hasan, BS Larry Hunt, MBA Devri Langhelm, BS Benning Le, MS Xinning Li, MD Robert Kelly, IV, MD Robert L. Parisien, MD Antonio Cusano, MD |
| author_sort | Michael Megafu, DO, MPH |
| collection | DOAJ |
| description | Background: Integrating machine learning and artificial intelligence (AI) technologies has revolutionized various sectors, including health care. However, their application in orthopedic health-care settings still needs to be improved. This study sought to evaluate Chat Generative Pre-Trained Transformer (ChatGPT) and Gemini's capacity to make quality medical recommendations regarding glenohumeral osteoarthritis, weighing them against the recommendations established in the Evidence-Based Clinical Practice Guidelines (CPGs) of the American Academy of Orthopaedic Surgeons (AAOS). Methods: The 2020 AAOS CPGs, a widely recognized and respected source, were the basis for determining recommended and nonrecommended treatments in this study. ChatGPT and Gemini were queried on 20 treatments based on these guidelines; 10 were recommended for managing glenohumeral joint osteoarthritis, five were not recommended for managing glenohumeral joint osteoarthritis, and five were reported as consensus statements. These responses were categorized as “Concordance” or “No Concordance” with the AAOS CPGs. A Cohen's Kappa coefficient was calculated to assess the interrater reliability. Results: Among the 20 treatments examined, ChatGPT and Gemini showed concordance with the AAOS CPGs for 10 (100%) and 5 (50%) treatments, respectively. On the other hand, for treatments that AAOS CPGs did not recommend, ChatGPT had concordance for four out of the five treatments (80%), while Gemini had 100% concordance. The Cohen's Kappa coefficient to assess interrater reliability was found to be 0.90, indicating a very high level of agreement between the two raters in categorizing responses as “Concordance” or “No Concordance” with the AAOS CPGs. Conclusion: The study findings reveal that ChatGPT and Gemini cannot solely recommend CPGs as outlined in AAOS CPGs. As patients increasingly utilize external resources such as AI platforms and the Internet for medical recommendations, providers should advise patients to exercise caution when seeking medical advice from these AI platforms for managing glenohumeral joint osteoarthritis. |
| format | Article |
| id | doaj-art-e486b13c57044cf2ae2a38ba9c42575c |
| institution | Kabale University |
| issn | 2666-6383 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Elsevier |
| record_format | Article |
| series | JSES International |
| spelling | doaj-art-e486b13c57044cf2ae2a38ba9c42575c; 2025-08-20T03:50:22Z; eng; Elsevier; JSES International; ISSN 2666-6383; 2025-07-01; vol. 9, no. 4, pp. 1365-1370; doi:10.1016/j.jseint.2025.03.011; Evaluating the perspectives of ChatGPT and Gemini on glenohumeral osteoarthritis management. Authors and affiliations: Michael Megafu, DO, MPH (Department of Orthopaedic Surgery, University of Connecticut, Farmington, CT, USA; corresponding author, 263 Farmington Ave, Farmington, CT 06032, USA); Omar Guerrero, BS, Rafay Hasan, BS, Larry Hunt, MBA, Devri Langhelm, BS, and Benning Le, MS (A.T. Still University School of Osteopathic Medicine in Arizona, Mesa, AZ, USA); Xinning Li, MD (Department of Orthopedic Surgery, Boston University School of Medicine, Boston, MA, USA); Robert Kelly, IV, MD (Department of Orthopedic Surgery, University of Pennsylvania, Philadelphia, PA, USA); Robert L. Parisien, MD (Department of Orthopedic Surgery, Mount Sinai, New York, NY, USA); Antonio Cusano, MD (Department of Orthopaedic Surgery, University of Connecticut, Farmington, CT, USA). http://www.sciencedirect.com/science/article/pii/S2666638325000933; Basic Science Study; Validation of AI Software |
| title | Evaluating the perspectives of ChatGPT and Gemini on glenohumeral osteoarthritis management |
| topic | Basic Science Study Validation of AI Software |
| url | http://www.sciencedirect.com/science/article/pii/S2666638325000933 |
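
The abstract reports a Cohen's Kappa of 0.90 for two raters who sorted responses into “Concordance” and “No Concordance”. As a minimal sketch of how such a coefficient is computed, the Python below applies the standard formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name `cohens_kappa` and the rater labels are hypothetical and chosen only for illustration; they are not the study's data and do not reproduce its 0.90 result.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels (NOT the study's data): "C" = Concordance,
# "N" = No Concordance. The raters disagree on exactly one of 20 items.
rater_a = ["C"] * 15 + ["N"] * 5
rater_b = ["C"] * 14 + ["N"] * 6

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.3f}")
```

With these made-up labels the observed agreement is 19/20 = 0.95 and the chance agreement is 0.60, so kappa is about 0.875. The example shows why kappa sits below raw percent agreement: it discounts the agreement two raters would reach by chance given how often each uses each label.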