A Systematic Review of Large Language Models in Medical Specialties: Applications, Challenges and Future Directions

Bibliographic Details
Main Authors: Asma Musabah Alkalbani, Ahmed Salim Alrawahi, Ahmad Salah, Venus Haghighi, Yang Zhang, Salam Alkindi, Quan Z. Sheng
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Information
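Subjects: artificial intelligence; clinical decision support systems; large language models; clinical natural language processing; medical specialties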
Online Access: https://www.mdpi.com/2078-2489/16/6/489
collection DOAJ
description This systematic review evaluates literature published from January 2021 to March 2024 on large language model (LLM) applications across diverse medical specialties. Searches of PubMed, Web of Science, and Scopus yielded 84 included studies. LLMs were applied to tasks such as clinical natural language processing (NLP), medical decision support, education, and diagnostic assistance. Studies reported benefits such as improved efficiency and, in some specific NLP tasks, accuracy above 90%. Significant challenges persist concerning reliability, ethical implications, and performance consistency: accuracy in broader diagnostic support applications varied substantially, falling as low as 3% in some cases. The overall risk of bias was low in 72 of the 84 studies. Substantial heterogeneity in LLM performance across medical tasks and contexts, together with a lack of standardized methodologies, prevented meta-analysis. Future efforts should prioritize developing domain-specific LLMs using robust medical data and establishing rigorous validation standards to ensure their safe and effective clinical integration. Trial registration: PROSPERO (CRD42024561381).
format Article
id doaj-art-9fbbdcd8c95b4d6e9217766d8a961df2
institution Kabale University
issn 2078-2489
spelling doaj-art-9fbbdcd8c95b4d6e9217766d8a961df2
timestamp 2025-08-20T03:24:33Z
volume 16, issue 6, article 489
doi 10.3390/info16060489
affiliations Asma Musabah Alkalbani: Department of Information Technology, College of Computing and Information Sciences, University of Technology and Applied Sciences, Ibri 511, Oman
Ahmed Salim Alrawahi: Department of Information Technology, College of Computing and Information Sciences, University of Technology and Applied Sciences, Ibri 511, Oman
Ahmad Salah: Department of Information Technology, College of Computing and Information Sciences, University of Technology and Applied Sciences, Ibri 511, Oman
Venus Haghighi: School of Computing, Macquarie University, Sydney, NSW 2109, Australia
Yang Zhang: The Anuradha and Vikas Sinha Department of Data Science, University of North Texas, Denton, TX 76203, USA
Salam Alkindi: Department of Hematology, College of Medicine & Health Science, Sultan Qaboos University, Muscat 123, Oman
Quan Z. Sheng: School of Computing, Macquarie University, Sydney, NSW 2109, Australia
title A Systematic Review of Large Language Models in Medical Specialties: Applications, Challenges and Future Directions
topic artificial intelligence
clinical decision support systems
large language models
clinical natural language processing
medical specialties
url https://www.mdpi.com/2078-2489/16/6/489