Evaluation of ChatGPT-4 for the detection of surgical site infections from electronic health records after colorectal surgery: A pilot diagnostic accuracy study
Background: Surveillance of surgical site infection (SSI) relies on manual methods that are time-consuming and prone to subjectivity. This study evaluates the diagnostic accuracy of ChatGPT for detecting SSI from electronic health records after colorectal surgery via comparison with the results of a...
Main Authors: | Josep M. Badia, Daniel Casanova-Portoles, Estela Membrilla, Carles Rubiés, Miquel Pujol, Joan Sancho |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2025-02-01 |
Series: | Journal of Infection and Public Health |
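Subjects: | Surgical site infection; Diagnosis; Accuracy; Sensitivity and specificity; Artificial intelligence; ChatGPT |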
Online Access: | http://www.sciencedirect.com/science/article/pii/S1876034124003617 |
_version_ | 1832592811049877504 |
---|---|
author | Josep M. Badia, Daniel Casanova-Portoles, Estela Membrilla, Carles Rubiés, Miquel Pujol, Joan Sancho |
author_facet | Josep M. Badia, Daniel Casanova-Portoles, Estela Membrilla, Carles Rubiés, Miquel Pujol, Joan Sancho |
author_sort | Josep M. Badia |
collection | DOAJ |
description | Background: Surveillance of surgical site infection (SSI) relies on manual methods that are time-consuming and prone to subjectivity. This study evaluates the diagnostic accuracy of ChatGPT for detecting SSI from electronic health records after colorectal surgery via comparison with the results of a nationwide surveillance programme. Methods: This pilot, retrospective, multicentre analysis included 122 patients who underwent colorectal surgery. Patient records were reviewed by both manual surveillance and ChatGPT, which was tasked with identifying SSI and categorizing them as superficial, deep, or organ-space infections. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated. Receiver operating characteristic (ROC) curve analysis determined the model's diagnostic performance. Results: ChatGPT achieved a sensitivity of 100 %, correctly identifying all SSIs detected by manual methods. The specificity was 54 %, indicating the presence of false positives. The PPV was 67 %, and the NPV was 100 %. The area under the ROC curve was 0.77, indicating good overall accuracy for distinguishing between SSI and non-SSI cases. Minor differences in outcomes were observed between colon and rectal surgeries, as well as between the hospitals participating in the study. Conclusions: ChatGPT shows high sensitivity and good overall accuracy for detecting SSI. It appears to be a useful tool for initial screenings and for reducing manual review workload. The moderate specificity suggests a need for further refinement to reduce the rate of false positives. The integration of ChatGPT alongside electronic medical records, antibiotic consumption and imaging data results for real-time analysis may further improve the surveillance of SSI. ClinicalTrials.gov Identifier: NCT06556017. |
format | Article |
id | doaj-art-b1b3cb76daf946d4b69ee45dba3ad017 |
institution | Kabale University |
issn | 1876-0341 |
language | English |
publishDate | 2025-02-01 |
publisher | Elsevier |
record_format | Article |
series | Journal of Infection and Public Health |
spelling | doaj-art-b1b3cb76daf946d4b69ee45dba3ad017 2025-01-21T04:12:56Z eng Elsevier Journal of Infection and Public Health 1876-0341 2025-02-01 18 2 102627 Evaluation of ChatGPT-4 for the detection of surgical site infections from electronic health records after colorectal surgery: A pilot diagnostic accuracy study. Josep M. Badia (0), Daniel Casanova-Portoles (1), Estela Membrilla (2), Carles Rubiés (3), Miquel Pujol (4), Joan Sancho (5). (0) Department of Surgery, Hospital General de Granollers, Granollers, Spain; Universitat Internacional de Catalunya. Sant Cugat del Vallès, Barcelona, Spain; Correspondence to: Department of Surgery, Hospital General de Granollers, Av Francesc Ribas 1, Granollers, Barcelona 08402, Spain. (1) Department of Surgery, Hospital General de Granollers, Granollers, Spain; Universitat Internacional de Catalunya. Sant Cugat del Vallès, Barcelona, Spain. (2) Department of Surgery, Hospital del Mar, Barcelona, Spain. (3) Department of Digital Transformation, Hospital General de Granollers, Granollers, Spain. (4) VINCat Program, Servei Català de la Salut, Catalonia, Spain; Centro de Investigación Biomédica en Red de Enfermedades Infecciosas (CIBERINFEC), Instituto de Salud Carlos III, Madrid, Spain. VINCat Program, Barcelona, Catalonia, Spain; Department of Infectious Diseases, Hospital Universitari de Bellvitge - IDIBELL. L’Hospitalet de Llobregat, Spain. (5) Department of Surgery, Hospital del Mar, Barcelona, Spain. Background: Surveillance of surgical site infection (SSI) relies on manual methods that are time-consuming and prone to subjectivity. This study evaluates the diagnostic accuracy of ChatGPT for detecting SSI from electronic health records after colorectal surgery via comparison with the results of a nationwide surveillance programme. Methods: This pilot, retrospective, multicentre analysis included 122 patients who underwent colorectal surgery. Patient records were reviewed by both manual surveillance and ChatGPT, which was tasked with identifying SSI and categorizing them as superficial, deep, or organ-space infections. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated. Receiver operating characteristic (ROC) curve analysis determined the model's diagnostic performance. Results: ChatGPT achieved a sensitivity of 100 %, correctly identifying all SSIs detected by manual methods. The specificity was 54 %, indicating the presence of false positives. The PPV was 67 %, and the NPV was 100 %. The area under the ROC curve was 0.77, indicating good overall accuracy for distinguishing between SSI and non-SSI cases. Minor differences in outcomes were observed between colon and rectal surgeries, as well as between the hospitals participating in the study. Conclusions: ChatGPT shows high sensitivity and good overall accuracy for detecting SSI. It appears to be a useful tool for initial screenings and for reducing manual review workload. The moderate specificity suggests a need for further refinement to reduce the rate of false positives. The integration of ChatGPT alongside electronic medical records, antibiotic consumption and imaging data results for real-time analysis may further improve the surveillance of SSI. ClinicalTrials.gov Identifier: NCT06556017. http://www.sciencedirect.com/science/article/pii/S1876034124003617 Surgical site infection; Diagnosis; Accuracy; Sensitivity and specificity; Artificial intelligence; ChatGPT |
spellingShingle | Josep M. Badia; Daniel Casanova-Portoles; Estela Membrilla; Carles Rubiés; Miquel Pujol; Joan Sancho; Evaluation of ChatGPT-4 for the detection of surgical site infections from electronic health records after colorectal surgery: A pilot diagnostic accuracy study; Journal of Infection and Public Health; Surgical site infection; Diagnosis; Accuracy; Sensitivity and specificity; Artificial intelligence; ChatGPT |
title | Evaluation of ChatGPT-4 for the detection of surgical site infections from electronic health records after colorectal surgery: A pilot diagnostic accuracy study |
title_full | Evaluation of ChatGPT-4 for the detection of surgical site infections from electronic health records after colorectal surgery: A pilot diagnostic accuracy study |
title_fullStr | Evaluation of ChatGPT-4 for the detection of surgical site infections from electronic health records after colorectal surgery: A pilot diagnostic accuracy study |
title_full_unstemmed | Evaluation of ChatGPT-4 for the detection of surgical site infections from electronic health records after colorectal surgery: A pilot diagnostic accuracy study |
title_short | Evaluation of ChatGPT-4 for the detection of surgical site infections from electronic health records after colorectal surgery: A pilot diagnostic accuracy study |
title_sort | evaluation of chatgpt 4 for the detection of surgical site infections from electronic health records after colorectal surgery a pilot diagnostic accuracy study |
topic | Surgical site infection; Diagnosis; Accuracy; Sensitivity and specificity; Artificial intelligence; ChatGPT |
url | http://www.sciencedirect.com/science/article/pii/S1876034124003617 |
work_keys_str_mv | AT josepmbadia evaluationofchatgpt4forthedetectionofsurgicalsiteinfectionsfromelectronichealthrecordsaftercolorectalsurgeryapilotdiagnosticaccuracystudy AT danielcasanovaportoles evaluationofchatgpt4forthedetectionofsurgicalsiteinfectionsfromelectronichealthrecordsaftercolorectalsurgeryapilotdiagnosticaccuracystudy AT estelamembrilla evaluationofchatgpt4forthedetectionofsurgicalsiteinfectionsfromelectronichealthrecordsaftercolorectalsurgeryapilotdiagnosticaccuracystudy AT carlesrubies evaluationofchatgpt4forthedetectionofsurgicalsiteinfectionsfromelectronichealthrecordsaftercolorectalsurgeryapilotdiagnosticaccuracystudy AT miquelpujol evaluationofchatgpt4forthedetectionofsurgicalsiteinfectionsfromelectronichealthrecordsaftercolorectalsurgeryapilotdiagnosticaccuracystudy AT joansancho evaluationofchatgpt4forthedetectionofsurgicalsiteinfectionsfromelectronichealthrecordsaftercolorectalsurgeryapilotdiagnosticaccuracystudy |
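
The accuracy measures named in the description (sensitivity, specificity, PPV, NPV) can be illustrated with a short Python sketch. The record gives only the resulting percentages, so the confusion-matrix counts used here are hypothetical and are not the study's data; the names `ConfusionCounts`, `diagnostic_metrics`, and `single_point_auc` are likewise invented for illustration. The last helper assumes a classifier with a single yes/no operating point, for which the trapezoidal area under the ROC curve equals the mean of sensitivity and specificity. With the reported values, (1.00 + 0.54) / 2 = 0.77, which agrees with the reported area, although the record does not state how the authors computed it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of model calls against the reference (manual surveillance)."""
    tp: int  # model says SSI, reference says SSI
    fp: int  # model says SSI, reference says no SSI
    fn: int  # model says no SSI, reference says SSI
    tn: int  # model says no SSI, reference says no SSI


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, PPV and NPV as fractions in [0, 1]."""
    if min(c.tp + c.fn, c.tn + c.fp, c.tp + c.fp, c.tn + c.fn) == 0:
        raise ValueError("a metric is undefined because its denominator is zero")
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


def single_point_auc(sensitivity: float, specificity: float) -> float:
    """Trapezoidal ROC area for a classifier with one operating point."""
    return (sensitivity + specificity) / 2


if __name__ == "__main__":
    # Hypothetical counts chosen only to show the arithmetic.
    example = ConfusionCounts(tp=40, fp=23, fn=0, tn=27)
    for name, value in diagnostic_metrics(example).items():
        print(f"{name}: {value:.2f}")
    # Sensitivity and specificity as reported in the description above.
    print(f"single-point AUC: {single_point_auc(1.00, 0.54):.2f}")
```

A sensitivity of 1.0 together with an NPV of 1.0, as in the hypothetical counts, corresponds to zero false negatives: every case the model clears is also clear under the reference method, which is the property that makes a tool usable for initial screening.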