A strategy for cost-effective large language model use at health system-scale

Abstract Large language models (LLMs) can optimize clinical workflows; however, the economic and computational challenges of their utilization at the health system scale are underexplored. We evaluated how concatenating queries with multiple clinical notes and tasks simultaneously affects model perf...

Bibliographic Details
Main Authors: Eyal Klang, Donald Apakama, Ethan E. Abbott, Akhil Vaid, Joshua Lampert, Ankit Sakhuja, Robert Freeman, Alexander W. Charney, David Reich, Monica Kraft, Girish N. Nadkarni, Benjamin S. Glicksberg
Format: Article
Language: English
Published: Nature Portfolio 2024-11-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-024-01315-1
author Eyal Klang
Donald Apakama
Ethan E. Abbott
Akhil Vaid
Joshua Lampert
Ankit Sakhuja
Robert Freeman
Alexander W. Charney
David Reich
Monica Kraft
Girish N. Nadkarni
Benjamin S. Glicksberg
collection DOAJ
description Abstract: Large language models (LLMs) can optimize clinical workflows; however, the economic and computational challenges of their utilization at the health system scale are underexplored. We evaluated how concatenating queries with multiple clinical notes and tasks simultaneously affects model performance under increasing computational loads. We assessed ten LLMs of different capacities and sizes utilizing real-world patient data. We conducted >300,000 experiments of various task sizes and configurations, measuring accuracy in question-answering and the ability to properly format outputs. Performance deteriorated as the number of questions and notes increased. High-capacity models, like Llama-3-70b, had low failure rates and high accuracies. GPT-4-turbo-128k was similarly resilient across task burdens, but performance deteriorated after 50 tasks at large prompt sizes. After addressing mitigable failures, these two models can concatenate up to 50 simultaneous tasks effectively, with validation on a public medical question-answering dataset. An economic analysis demonstrated up to a 17-fold cost reduction at 50 tasks using concatenation. These results identify the limits of LLMs for effective utilization and highlight avenues for cost-efficiency at the enterprise scale.
format Article
id doaj-art-5e75fc54840d40b7b0bdf4bc508e4cab
institution OA Journals
issn 2398-6352
language English
publishDate 2024-11-01
publisher Nature Portfolio
record_format Article
series npj Digital Medicine
spelling doaj-art-5e75fc54840d40b7b0bdf4bc508e4cab
2025-08-20T02:22:30Z
eng
Nature Portfolio
npj Digital Medicine
2398-6352
2024-11-01
Volume 7, issue 1, pages 1-12
10.1038/s41746-024-01315-1
A strategy for cost-effective large language model use at health system-scale
Eyal Klang (0): Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai
Donald Apakama (1): Department of Emergency Medicine, Icahn School of Medicine at Mount Sinai
Ethan E. Abbott (2): Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai
Akhil Vaid (3): Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai
Joshua Lampert (4): Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai
Ankit Sakhuja (5): Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai
Robert Freeman (6): The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai
Alexander W. Charney (7): The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai
David Reich (8): Department of Anesthesiology, Perioperative and Pain Medicine, Icahn School of Medicine at Mount Sinai
Monica Kraft (9): The Samuel Bronfman Department of Medicine, Icahn School of Medicine at Mount Sinai
Girish N. Nadkarni (10): Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai
Benjamin S. Glicksberg (11): Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai
title A strategy for cost-effective large language model use at health system-scale
url https://doi.org/10.1038/s41746-024-01315-1