Retrieval-augmented generation elevates local LLM quality in radiology contrast media consultation
Abstract Large language models (LLMs) demonstrate significant potential in healthcare applications, but clinical deployment is limited by privacy concerns and insufficient medical domain training. This study investigated whether retrieval-augmented generation (RAG) can improve locally deployable LLM...
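| Main Authors: | Akihiko Wada, Yuya Tanaka, Mitsuo Nishizawa, Akira Yamamoto, Toshiaki Akashi, Akifumi Hagiwara, Yayoi Hayakawa, Junko Kikuta, Keigo Shimoji, Katsuhiro Sano, Koji Kamagata, Atsushi Nakanishi, Shigeki Aoki |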
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | npj Digital Medicine |
| Online Access: | https://doi.org/10.1038/s41746-025-01802-z |
| _version_ | 1849389261906771968 |
|---|---|
| author | Akihiko Wada Yuya Tanaka Mitsuo Nishizawa Akira Yamamoto Toshiaki Akashi Akifumi Hagiwara Yayoi Hayakawa Junko Kikuta Keigo Shimoji Katsuhiro Sano Koji Kamagata Atsushi Nakanishi Shigeki Aoki |
| author_sort | Akihiko Wada |
| collection | DOAJ |
| description | Abstract: Large language models (LLMs) demonstrate significant potential in healthcare applications, but clinical deployment is limited by privacy concerns and insufficient medical domain training. This study investigated whether retrieval-augmented generation (RAG) can improve a locally deployable LLM for radiology contrast media consultation. In 100 synthetic iodinated contrast media consultations, we compared Llama 3.2-11B (baseline and RAG) with three cloud-based models: GPT-4o mini, Gemini 2.0 Flash and Claude 3.5 Haiku. A blinded radiologist ranked the five replies per case, and three LLM-based judges scored accuracy, safety, structure, tone, applicability and latency. Under controlled conditions, RAG eliminated hallucinations (0% vs 8%; χ²₍Yates₎ = 6.38, p = 0.012) and improved mean rank by 1.3 (Z = –4.82, p < 0.001), though performance gaps with cloud models persist. The RAG-enhanced model remained faster (2.6 s vs 4.9–7.3 s), and the LLM-based judges preferred it over GPT-4o mini, though the radiologist ranked GPT-4o mini higher. RAG thus provides meaningful improvements for local clinical LLMs while maintaining the privacy benefits of on-premise deployment. |
| format | Article |
| id | doaj-art-98d901570fd74c5f8c3fa61829804113 |
| institution | Kabale University |
| issn | 2398-6352 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | npj Digital Medicine |
| spelling | doaj-art-98d901570fd74c5f8c3fa61829804113; 2025-08-20T03:42:00Z; eng; Nature Portfolio; npj Digital Medicine; 2398-6352; 2025-07-01; 8119; 10.1038/s41746-025-01802-z; Retrieval-augmented generation elevates local LLM quality in radiology contrast media consultation. Authors and affiliations, in listed order: Akihiko Wada (Department of Radiology, Juntendo University Graduate School of Medicine); Yuya Tanaka (Department of Radiology, Juntendo University Graduate School of Medicine); Mitsuo Nishizawa (Department of Radiology, Juntendo University Urayasu Hospital); Akira Yamamoto (Faculty of Health Data Science, Juntendo University Graduate School of Medicine); Toshiaki Akashi, Akifumi Hagiwara, Yayoi Hayakawa, Junko Kikuta, Keigo Shimoji, Katsuhiro Sano, Koji Kamagata, Atsushi Nakanishi and Shigeki Aoki (each Department of Radiology, Juntendo University Graduate School of Medicine). https://doi.org/10.1038/s41746-025-01802-z |
| title | Retrieval-augmented generation elevates local LLM quality in radiology contrast media consultation |
| url | https://doi.org/10.1038/s41746-025-01802-z |
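
The hallucination statistic quoted in the abstract (0% vs 8%; χ²₍Yates₎ = 6.38, p = 0.012) can be checked with a short calculation. The sketch below is not taken from the paper: it assumes the percentages correspond to 0 of 100 RAG replies and 8 of 100 baseline replies (inferred from the 100 consultations), treats the two arms as independent groups, and uses a hypothetical helper name, `yates_chi_square`.

```python
import math


def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p) with one degree of freedom. Hypothetical helper,
    written only to illustrate the statistic named in the abstract.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column total must be non-zero")
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / denom
    # For 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Assumed counts: rows are RAG and baseline, columns are
# hallucinated and not hallucinated.
chi2, p = yates_chi_square(0, 100, 8, 92)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

Under these assumed counts the statistic comes out at about 6.38 with a p-value of roughly 0.012, which is consistent with the figures reported in the abstract.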