Retrieval-augmented generation elevates local LLM quality in radiology contrast media consultation

Abstract Large language models (LLMs) demonstrate significant potential in healthcare applications, but clinical deployment is limited by privacy concerns and insufficient medical domain training. This study investigated whether retrieval-augmented generation (RAG) can improve locally deployable LLM...

Bibliographic Details
Main Authors: Akihiko Wada, Yuya Tanaka, Mitsuo Nishizawa, Akira Yamamoto, Toshiaki Akashi, Akifumi Hagiwara, Yayoi Hayakawa, Junko Kikuta, Keigo Shimoji, Katsuhiro Sano, Koji Kamagata, Atsushi Nakanishi, Shigeki Aoki
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-025-01802-z
collection DOAJ
description Abstract Large language models (LLMs) demonstrate significant potential in healthcare applications, but clinical deployment is limited by privacy concerns and insufficient medical domain training. This study investigated whether retrieval-augmented generation (RAG) can improve a locally deployable LLM for radiology contrast media consultation. In 100 synthetic iodinated contrast media consultations, we compared Llama 3.2-11B (baseline and RAG) with three cloud-based models: GPT-4o mini, Gemini 2.0 Flash and Claude 3.5 Haiku. A blinded radiologist ranked the five replies per case, and three LLM-based judges scored accuracy, safety, structure, tone, applicability and latency. Under controlled conditions, RAG eliminated hallucinations (0% vs 8%; χ²₍Yates₎ = 6.38, p = 0.012) and improved mean rank by 1.3 (Z = –4.82, p < 0.001), though performance gaps with cloud models persisted. The RAG-enhanced model remained faster (2.6 s vs 4.9–7.3 s). The LLM-based judges preferred it over GPT-4o mini, whereas the radiologist ranked GPT-4o mini higher. RAG thus provides meaningful improvements for local clinical LLMs while maintaining the privacy benefits of on-premises deployment.
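The hallucination comparison reported in the abstract can be checked with a short calculation. The sketch below is not the authors' analysis code. It assumes the percentages correspond to 0 of 100 RAG replies and 8 of 100 baseline replies containing a hallucination (counts inferred from "0% vs 8%" and the 100 consultations, not stated explicitly in this record) and applies the standard Yates-corrected chi-square formula for a 2x2 table. The function name `yates_chi_square` is hypothetical.

```python
import math


def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p) with p taken from the chi-square distribution
    with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Assumed counts (inferred, not given in the record):
# RAG:      0 hallucinated, 100 clean
# Baseline: 8 hallucinated,  92 clean
chi2, p = yates_chi_square(0, 100, 8, 92)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")  # chi2 = 6.38, p = 0.012
```

Under these assumed counts the result matches the reported χ²₍Yates₎ = 6.38 and p = 0.012, which suggests the test compared the two Llama 3.2-11B configurations over the same 100 cases.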
id doaj-art-98d901570fd74c5f8c3fa61829804113
institution Kabale University
issn 2398-6352
spelling doaj-art-98d901570fd74c5f8c3fa61829804113 (2025-08-20T03:42:00Z, eng)
Retrieval-augmented generation elevates local LLM quality in radiology contrast media consultation. npj Digital Medicine (Nature Portfolio), ISSN 2398-6352, 2025-07-01, vol. 8, no. 1, pp. 1-9. https://doi.org/10.1038/s41746-025-01802-z
Author affiliations:
Akihiko Wada: Department of Radiology, Juntendo University Graduate School of Medicine
Yuya Tanaka: Department of Radiology, Juntendo University Graduate School of Medicine
Mitsuo Nishizawa: Department of Radiology, Juntendo University Urayasu Hospital
Akira Yamamoto: Faculty of Health Data Science, Juntendo University Graduate School of Medicine
Toshiaki Akashi: Department of Radiology, Juntendo University Graduate School of Medicine
Akifumi Hagiwara: Department of Radiology, Juntendo University Graduate School of Medicine
Yayoi Hayakawa: Department of Radiology, Juntendo University Graduate School of Medicine
Junko Kikuta: Department of Radiology, Juntendo University Graduate School of Medicine
Keigo Shimoji: Department of Radiology, Juntendo University Graduate School of Medicine
Katsuhiro Sano: Department of Radiology, Juntendo University Graduate School of Medicine
Koji Kamagata: Department of Radiology, Juntendo University Graduate School of Medicine
Atsushi Nakanishi: Department of Radiology, Juntendo University Graduate School of Medicine
Shigeki Aoki: Department of Radiology, Juntendo University Graduate School of Medicine