Evaluation of ChatGPT-4 for the detection of surgical site infections from electronic health records after colorectal surgery: A pilot diagnostic accuracy study

Bibliographic Details
Main Authors: Josep M. Badia, Daniel Casanova-Portoles, Estela Membrilla, Carles Rubiés, Miquel Pujol, Joan Sancho
Format: Article
Language: English
Published: Elsevier 2025-02-01
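Series: Journal of Infection and Public Health
Subjects: Surgical site infection; Diagnosis; Accuracy; Sensitivity and specificity; Artificial intelligence; ChatGPT
Online Access: http://www.sciencedirect.com/science/article/pii/S1876034124003617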
author Josep M. Badia
Daniel Casanova-Portoles
Estela Membrilla
Carles Rubiés
Miquel Pujol
Joan Sancho
collection DOAJ
description Background: Surveillance of surgical site infection (SSI) relies on manual methods that are time-consuming and prone to subjectivity. This study evaluates the diagnostic accuracy of ChatGPT for detecting SSI from electronic health records after colorectal surgery via comparison with the results of a nationwide surveillance programme. Methods: This pilot, retrospective, multicentre analysis included 122 patients who underwent colorectal surgery. Patient records were reviewed by both manual surveillance and ChatGPT, which was tasked with identifying SSI and categorizing them as superficial, deep, or organ-space infections. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated. Receiver operating characteristic (ROC) curve analysis determined the model's diagnostic performance. Results: ChatGPT achieved a sensitivity of 100 %, correctly identifying all SSIs detected by manual methods. The specificity was 54 %, indicating the presence of false positives. The PPV was 67 %, and the NPV was 100 %. The area under the ROC curve was 0.77, indicating good overall accuracy for distinguishing between SSI and non-SSI cases. Minor differences in outcomes were observed between colon and rectal surgeries, as well as between the hospitals participating in the study. Conclusions: ChatGPT shows high sensitivity and good overall accuracy for detecting SSI. It appears to be a useful tool for initial screenings and for reducing manual review workload. The moderate specificity suggests a need for further refinement to reduce the rate of false positives. The integration of ChatGPT alongside electronic medical records, antibiotic consumption and imaging data results for real-time analysis may further improve the surveillance of SSI. ClinicalTrials.gov Identifier: NCT06556017.
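The four accuracy measures named in the abstract all derive from a 2x2 table of ChatGPT results against manual surveillance. The following is a minimal sketch of that arithmetic, not the authors' analysis code. The record does not give the underlying table, so the counts below are hypothetical: they were chosen only because they sum to 122 patients and reproduce the rounded percentages reported above. The names `ConfusionCounts` and `diagnostic_metrics` are likewise illustrative. The sketch also shows that, for a classifier with a single yes/no output, the area under the ROC curve reduces to the mean of sensitivity and specificity, which agrees with the reported 0.77.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of index test (ChatGPT) against reference (manual surveillance)."""
    tp: int  # SSI flagged by the model and confirmed by manual review
    fp: int  # SSI flagged by the model but not confirmed
    tn: int  # no SSI according to both
    fn: int  # SSI missed by the model


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the standard accuracy measures as fractions in [0, 1].

    Assumes every denominator is non-zero; a production version
    would need to guard against empty rows or columns.
    """
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)
    # With one operating point, the ROC curve is two straight segments
    # through (1 - specificity, sensitivity); the trapezoid area is:
    auc_single_point = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "auc_single_point": auc_single_point,
    }


# HYPOTHETICAL counts (not taken from the study): 59 + 29 + 34 + 0 = 122.
counts = ConfusionCounts(tp=59, fp=29, tn=34, fn=0)
metrics = diagnostic_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumed counts, a sensitivity of 100 % means no infection is missed, while the 29 false positives are the cases a reviewer would still have to rule out by hand, which is the workload trade-off the conclusions describe.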
format Article
id doaj-art-b1b3cb76daf946d4b69ee45dba3ad017
institution Kabale University
issn 1876-0341
language English
publishDate 2025-02-01
publisher Elsevier
record_format Article
series Journal of Infection and Public Health
spelling doaj-art-b1b3cb76daf946d4b69ee45dba3ad017
2025-01-21T04:12:56Z
eng
Elsevier
Journal of Infection and Public Health
1876-0341
2025-02-01
Volume 18, issue 2, article 102627
author_affiliation Josep M. Badia: Department of Surgery, Hospital General de Granollers, Granollers, Spain; Universitat Internacional de Catalunya, Sant Cugat del Vallès, Barcelona, Spain. Correspondence to: Department of Surgery, Hospital General de Granollers, Av Francesc Ribas 1, Granollers, Barcelona 08402, Spain.
Daniel Casanova-Portoles: Department of Surgery, Hospital General de Granollers, Granollers, Spain; Universitat Internacional de Catalunya, Sant Cugat del Vallès, Barcelona, Spain
Estela Membrilla: Department of Surgery, Hospital del Mar, Barcelona, Spain
Carles Rubiés: Department of Digital Transformation, Hospital General de Granollers, Granollers, Spain
Miquel Pujol: VINCat Program, Servei Català de la Salut, Catalonia, Spain; Centro de Investigación Biomédica en Red de Enfermedades Infecciosas (CIBERINFEC), Instituto de Salud Carlos III, Madrid, Spain; VINCat Program, Barcelona, Catalonia, Spain; Department of Infectious Diseases, Hospital Universitari de Bellvitge - IDIBELL, L’Hospitalet de Llobregat, Spain
Joan Sancho: Department of Surgery, Hospital del Mar, Barcelona, Spain
title Evaluation of ChatGPT-4 for the detection of surgical site infections from electronic health records after colorectal surgery: A pilot diagnostic accuracy study
topic Surgical site infection
Diagnosis
Accuracy
Sensitivity and specificity
Artificial intelligence
ChatGPT
url http://www.sciencedirect.com/science/article/pii/S1876034124003617