Management of Dupuytren’s Disease: A Multi-Centric Comparative Analysis Between Experienced Hand Surgeons Versus Artificial Intelligence

Bibliographic Details
Main Authors: Ishith Seth, Gianluca Marcaccini, Kaiyang Lim, Marco Castrechini, Roberto Cuomo, Sally Kiu-Huen Ng, Richard J. Ross, Warren M. Rozen
Format: Article
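Language: English
Published: MDPI AG 2025-02-01
Series: Diagnostics
Subjects: Dupuytren’s disease; artificial intelligence; hand surgery; clinical decision-making; AI-assisted management; surgical recommendations
Online Access: https://www.mdpi.com/2075-4418/15/5/587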
author Ishith Seth
Gianluca Marcaccini
Kaiyang Lim
Marco Castrechini
Roberto Cuomo
Sally Kiu-Huen Ng
Richard J. Ross
Warren M. Rozen
collection DOAJ
description Background: Dupuytren’s fibroproliferative disease affecting the hand’s palmar fascia leads to progressive finger contractures and functional limitations. Management of this condition relies heavily on the expertise of hand surgeons, who tailor interventions based on clinical assessment. With the growing interest in artificial intelligence (AI) in medical decision-making, this study aims to evaluate the feasibility of integrating AI into the clinical management of Dupuytren’s disease by comparing AI-generated recommendations with those of expert hand surgeons.
Methods: This multicentric comparative study involved three experienced hand surgeons and five AI systems (ChatGPT, Gemini, Perplexity, DeepSeek, and Copilot). Twenty-two standardized clinical prompts representing various Dupuytren’s disease scenarios were used to assess decision-making. Surgeons and AI systems provided management recommendations, which were analyzed for concordance, rationale, and predicted outcomes. Key metrics included union accuracy, surgeon agreement, precision, recall, and F1 scores. The study also evaluated AI performance in unanimous versus non-unanimous cases and inter-AI agreements.
Results: Gemini and ChatGPT demonstrated the highest union accuracy (86.4% and 81.8%, respectively), while Copilot showed the lowest (40.9%). Surgeon agreement was highest for Gemini (45.5%) and ChatGPT (42.4%). AI systems performed better in unanimous cases (accuracy up to 92.0%) than in non-unanimous cases (accuracy as low as 35.0%). Inter-AI agreements ranged from 75.0% (ChatGPT-Gemini) to 48.0% (DeepSeek-Copilot). Precision, recall, and F1 scores were consistently higher for ChatGPT and Gemini than for other systems.
Conclusions: AI systems, particularly Gemini and ChatGPT, show promise in aligning with expert surgical recommendations, especially in straightforward cases. However, significant variability exists, particularly in complex scenarios. AI should be viewed as complementary to clinical judgment, requiring further refinement and validation for integration into clinical practice.
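The abstract names its metrics without defining them. The sketch below shows one plausible reading, offered as an assumption rather than the authors' method: "union accuracy" is taken as the share of cases in which an AI system's recommendation matches at least one surgeon, and "surgeon agreement" as the share of individual (case, surgeon) pairs that match the AI. The function names, treatment labels, and toy data are hypothetical and are not taken from the study.

```python
from typing import List


def union_accuracy(ai: List[str], surgeons: List[List[str]]) -> float:
    """Share of cases where the AI answer matches at least one surgeon.

    Assumed definition; the abstract does not spell it out.
    """
    hits = sum(1 for answer, panel in zip(ai, surgeons) if answer in panel)
    return hits / len(ai)


def surgeon_agreement(ai: List[str], surgeons: List[List[str]]) -> float:
    """Share of (case, surgeon) pairs whose recommendation equals the AI's.

    Assumed definition; the abstract does not spell it out.
    """
    total_pairs = sum(len(panel) for panel in surgeons)
    matches = sum(panel.count(answer) for answer, panel in zip(ai, surgeons))
    return matches / total_pairs


# Hypothetical toy data: four cases, three surgeons per case.
surgeons = [
    ["fasciectomy", "fasciectomy", "fasciectomy"],  # unanimous
    ["needle", "collagenase", "needle"],            # non-unanimous
    ["observe", "observe", "observe"],              # unanimous
    ["fasciectomy", "needle", "collagenase"],       # non-unanimous
]
ai = ["fasciectomy", "collagenase", "needle", "observe"]

print(union_accuracy(ai, surgeons))     # matches in cases 1 and 2 only
print(surgeon_agreement(ai, surgeons))  # 4 matching pairs out of 12
```

Under these assumed definitions, the reported percentages are arithmetically consistent with 22 prompts and 3 surgeons: 19/22 ≈ 86.4%, 18/22 ≈ 81.8%, and 9/22 ≈ 40.9% for union accuracy, and 30/66 ≈ 45.5% and 28/66 ≈ 42.4% for surgeon agreement. The abstract does not confirm that the figures were derived this way.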
format Article
id doaj-art-907e798a6b134b2dabb5ab0adbfe310d
institution OA Journals
issn 2075-4418
language English
publishDate 2025-02-01
publisher MDPI AG
record_format Article
series Diagnostics
spelling doaj-art-907e798a6b134b2dabb5ab0adbfe310d; 2025-08-20T02:05:24Z; eng; MDPI AG; Diagnostics; 2075-4418; 2025-02-01; 15 (5), 587; 10.3390/diagnostics15050587
Ishith Seth: Department of Plastic and Reconstructive Surgery, Peninsula Health, Frankston, VIC 3199, Australia
Gianluca Marcaccini: Department of Plastic and Reconstructive Surgery, Peninsula Health, Frankston, VIC 3199, Australia
Kaiyang Lim: Department of Plastic and Reconstructive Surgery, Peninsula Health, Frankston, VIC 3199, Australia
Marco Castrechini: Plastic Surgery Unit, Department of Surgery “P. Valdoni”, “Sapienza” University of Rome, 00185 Rome, Italy
Roberto Cuomo: Plastic Surgery Unit, Department of Medicine, Surgery and Neuroscience, University of Siena, 53100 Siena, Italy
Sally Kiu-Huen Ng: Department of Plastic and Reconstructive Surgery, Austin Health, Heidelberg, VIC 3199, Australia
Richard J. Ross: Department of Plastic and Reconstructive Surgery, Peninsula Health, Frankston, VIC 3199, Australia
Warren M. Rozen: Department of Plastic and Reconstructive Surgery, Peninsula Health, Frankston, VIC 3199, Australia
title Management of Dupuytren’s Disease: A Multi-Centric Comparative Analysis Between Experienced Hand Surgeons Versus Artificial Intelligence
topic Dupuytren’s disease
artificial intelligence
hand surgery
clinical decision-making
AI-assisted management
surgical recommendations
url https://www.mdpi.com/2075-4418/15/5/587