A publicly available pharyngitis dataset and baseline evaluations for bacterial or nonbacterial classification

Bibliographic Details
Main Authors: Negar Shojaei, Habib Rostami, Mohammad Barzegar, Shokooh Saadat Farzaneh, Zohreh Farrar, Majid Alimohammadi, Jahanbakhsh Keyvani, Mehdi Mirzad, Leila Gonbadi
Format: Article
Language: English
Published: Nature Portfolio 2025-08-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-025-05780-5
Description
Summary: Accurate and early differentiation between bacterial and nonbacterial pharyngitis is crucial for optimizing treatment and minimizing unnecessary antibiotic use. The similar clinical presentation of sore throat in bacterial and nonbacterial infections poses significant diagnostic challenges, even for experienced clinicians. To address this, we developed a publicly available dataset consisting of high-resolution throat images captured using smartphone cameras. These images were analyzed with deep neural networks to distinguish between bacterial and nonbacterial infections based on visual features and symptoms. The dataset, the largest publicly available in this field, includes images from 742 patients experiencing common cold symptoms. For each patient, it also records the presence or absence of 20 symptoms, age, gender, and between 4 and 9 diagnoses by different physicians. Furthermore, three baseline models were established to differentiate bacterial from nonbacterial infections. Our goal is to advance non-invasive and accurate pharyngitis diagnosis, drive the development of AI-driven diagnostic tools, promote remote healthcare solutions, and inspire future innovations in medical image analysis.
ISSN: 2052-4463
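
The summary describes each patient entry as a throat image plus 20 symptom flags, age, gender, and between 4 and 9 physician diagnoses. As a rough illustration only, the sketch below models such an entry and reduces the physician diagnoses to a single label by strict majority vote. The class `PatientRecord`, its field names, the label strings, and the majority-vote rule are all assumptions made for this example; the record does not state how the dataset is stored or how the authors derived their labels.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PatientRecord:
    """Hypothetical per-patient entry; field names are illustrative, not the dataset's schema."""
    patient_id: str
    image_path: str
    age: int
    gender: str
    symptoms: Dict[str, bool]  # presence or absence of each recorded symptom
    diagnoses: List[str] = field(default_factory=list)  # one label per physician


def consensus_label(record: PatientRecord) -> Optional[str]:
    """Return the diagnosis given by more than half of the physicians, else None.

    Majority voting is an assumption for illustration, not the authors' stated method.
    """
    n = len(record.diagnoses)
    if not 4 <= n <= 9:
        raise ValueError("expected between 4 and 9 physician diagnoses")
    label, votes = Counter(record.diagnoses).most_common(1)[0]
    # Require a strict majority so that ties are reported as unresolved.
    return label if votes * 2 > n else None
```

Returning `None` on ties keeps ambiguous cases visible rather than silently assigning them to one class, which seems preferable when physicians disagree.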