Digital Diagnostics: The Potential of Large Language Models in Recognizing Symptoms of Common Illnesses
This study aimed to evaluate the potential of Large Language Models (LLMs) in healthcare diagnostics, specifically their ability to analyze symptom-based prompts and provide accurate diagnoses. The study focused on models including GPT-4, GPT-4o, Gemini, o1 Preview, and GPT-3.5, assessing their perf...
Main Authors: | Gaurav Kumar Gupta, Aditi Singh, Sijo Valayakkad Manikandan, Abul Ehtesham |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2025-01-01 |
Series: | AI |
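Subjects: | large language models; healthcare; AI; digital health; medical diagnostics; natural language processing (NLP) |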
Online Access: | https://www.mdpi.com/2673-2688/6/1/13 |
_version_ | 1832589417822289920 |
---|---|
author | Gaurav Kumar Gupta; Aditi Singh; Sijo Valayakkad Manikandan; Abul Ehtesham |
author_facet | Gaurav Kumar Gupta; Aditi Singh; Sijo Valayakkad Manikandan; Abul Ehtesham |
author_sort | Gaurav Kumar Gupta |
collection | DOAJ |
description | This study aimed to evaluate the potential of Large Language Models (LLMs) in healthcare diagnostics, specifically their ability to analyze symptom-based prompts and provide accurate diagnoses. The study focused on models including GPT-4, GPT-4o, Gemini, o1 Preview, and GPT-3.5, assessing their performance in identifying illnesses based solely on provided symptoms. Symptom-based prompts were curated from reputable medical sources to ensure validity and relevance. Each model was tested under controlled conditions to evaluate their diagnostic accuracy, precision, recall, and decision-making capabilities. Specific scenarios were designed to explore their performance in both general and high-stakes diagnostic tasks. Among the models, GPT-4 achieved the highest diagnostic accuracy, demonstrating strong alignment with medical reasoning. Gemini excelled in high-stakes scenarios requiring precise decision-making. GPT-4o and o1 Preview showed balanced performance, effectively handling real-time diagnostic tasks with a focus on both precision and recall. GPT-3.5, though less advanced, proved dependable for general diagnostic tasks. This study highlights the strengths and limitations of LLMs in healthcare diagnostics. While models such as GPT-4 and Gemini exhibit promise, challenges such as privacy compliance, ethical considerations, and the mitigation of inherent biases must be addressed. The findings suggest pathways for responsibly integrating LLMs into diagnostic processes to enhance healthcare outcomes. |
format | Article |
id | doaj-art-9c7add5a5f414c32a9b87deb0fe6a5a6 |
institution | Kabale University |
issn | 2673-2688 |
language | English |
publishDate | 2025-01-01 |
publisher | MDPI AG |
record_format | Article |
series | AI |
spelling | doaj-art-9c7add5a5f414c32a9b87deb0fe6a5a6; 2025-01-24T13:17:23Z; eng; MDPI AG; AI; 2673-2688; 2025-01-01; 6; 1; 13; 10.3390/ai6010013; Digital Diagnostics: The Potential of Large Language Models in Recognizing Symptoms of Common Illnesses; Gaurav Kumar Gupta [0]; Aditi Singh [1]; Sijo Valayakkad Manikandan [2]; Abul Ehtesham [3]; Department of Computer Science, Youngstown State University, Youngstown, OH 44555, USA; Department of Computer Science, Cleveland State University, Cleveland, OH 44115, USA; McCombs School of Business, University of Texas, Austin, TX 78712, USA; The Davey Tree Expert Company, Kent, OH 44240, USA; This study aimed to evaluate the potential of Large Language Models (LLMs) in healthcare diagnostics, specifically their ability to analyze symptom-based prompts and provide accurate diagnoses. The study focused on models including GPT-4, GPT-4o, Gemini, o1 Preview, and GPT-3.5, assessing their performance in identifying illnesses based solely on provided symptoms. Symptom-based prompts were curated from reputable medical sources to ensure validity and relevance. Each model was tested under controlled conditions to evaluate their diagnostic accuracy, precision, recall, and decision-making capabilities. Specific scenarios were designed to explore their performance in both general and high-stakes diagnostic tasks. Among the models, GPT-4 achieved the highest diagnostic accuracy, demonstrating strong alignment with medical reasoning. Gemini excelled in high-stakes scenarios requiring precise decision-making. GPT-4o and o1 Preview showed balanced performance, effectively handling real-time diagnostic tasks with a focus on both precision and recall. GPT-3.5, though less advanced, proved dependable for general diagnostic tasks. This study highlights the strengths and limitations of LLMs in healthcare diagnostics. While models such as GPT-4 and Gemini exhibit promise, challenges such as privacy compliance, ethical considerations, and the mitigation of inherent biases must be addressed. The findings suggest pathways for responsibly integrating LLMs into diagnostic processes to enhance healthcare outcomes.; https://www.mdpi.com/2673-2688/6/1/13; large language models; healthcare; AI; digital health; medical diagnostics; natural language processing (NLP) |
spellingShingle | Gaurav Kumar Gupta; Aditi Singh; Sijo Valayakkad Manikandan; Abul Ehtesham; Digital Diagnostics: The Potential of Large Language Models in Recognizing Symptoms of Common Illnesses; AI; large language models; healthcare; AI; digital health; medical diagnostics; natural language processing (NLP) |
title | Digital Diagnostics: The Potential of Large Language Models in Recognizing Symptoms of Common Illnesses |
title_full | Digital Diagnostics: The Potential of Large Language Models in Recognizing Symptoms of Common Illnesses |
title_fullStr | Digital Diagnostics: The Potential of Large Language Models in Recognizing Symptoms of Common Illnesses |
title_full_unstemmed | Digital Diagnostics: The Potential of Large Language Models in Recognizing Symptoms of Common Illnesses |
title_short | Digital Diagnostics: The Potential of Large Language Models in Recognizing Symptoms of Common Illnesses |
title_sort | digital diagnostics the potential of large language models in recognizing symptoms of common illnesses |
topic | large language models; healthcare; AI; digital health; medical diagnostics; natural language processing (NLP) |
url | https://www.mdpi.com/2673-2688/6/1/13 |
work_keys_str_mv | AT gauravkumargupta digitaldiagnosticsthepotentialoflargelanguagemodelsinrecognizingsymptomsofcommonillnesses AT aditisingh digitaldiagnosticsthepotentialoflargelanguagemodelsinrecognizingsymptomsofcommonillnesses AT sijovalayakkadmanikandan digitaldiagnosticsthepotentialoflargelanguagemodelsinrecognizingsymptomsofcommonillnesses AT abulehtesham digitaldiagnosticsthepotentialoflargelanguagemodelsinrecognizingsymptomsofcommonillnesses |
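
The description above says each model was scored on diagnostic accuracy, precision, and recall, but this record does not give the scoring procedure. As a rough illustration only, the sketch below computes accuracy with macro-averaged precision and recall for a multi-class diagnosis task. The function name `evaluate`, the macro-averaging choice, and the toy labels are assumptions made for this example and are not taken from the article.

```python
from collections import Counter


def evaluate(true_labels, predicted_labels):
    """Return accuracy plus macro-averaged precision and recall.

    Hypothetical helper for illustration; the article's actual
    scoring method is not described in this record.
    """
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1   # correct diagnosis for this illness
        else:
            fp[pred] += 1    # illness predicted but not present
            fn[truth] += 1   # illness present but missed

    labels = sorted(set(true_labels) | set(predicted_labels))

    def safe_div(num, den):
        # Treat an undefined ratio (no predictions or no cases) as 0.0.
        return num / den if den else 0.0

    precision = sum(safe_div(tp[l], tp[l] + fp[l]) for l in labels) / len(labels)
    recall = sum(safe_div(tp[l], tp[l] + fn[l]) for l in labels) / len(labels)
    accuracy = sum(tp.values()) / len(true_labels)
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Made-up toy data, not results from the study.
true_dx = ["flu", "flu", "cold", "migraine"]
pred_dx = ["flu", "cold", "cold", "migraine"]
scores = evaluate(true_dx, pred_dx)
print(scores)
```

Macro-averaging weights every illness equally regardless of how often it appears. A micro-averaged or per-illness report would be an equally reasonable choice, depending on how the prompts are distributed across illnesses.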