A strategy for cost-effective large language model use at health system-scale
Abstract Large language models (LLMs) can optimize clinical workflows; however, the economic and computational challenges of their utilization at the health system scale are underexplored. We evaluated how concatenating queries with multiple clinical notes and tasks simultaneously affects model perf...
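| Main Authors: | Eyal Klang, Donald Apakama, Ethan E. Abbott, Akhil Vaid, Joshua Lampert, Ankit Sakhuja, Robert Freeman, Alexander W. Charney, David Reich, Monica Kraft, Girish N. Nadkarni, Benjamin S. Glicksberg |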
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2024-11-01 |
| Series: | npj Digital Medicine |
| Online Access: | https://doi.org/10.1038/s41746-024-01315-1 |
| _version_ | 1850162651056832512 |
|---|---|
| author | Eyal Klang Donald Apakama Ethan E. Abbott Akhil Vaid Joshua Lampert Ankit Sakhuja Robert Freeman Alexander W. Charney David Reich Monica Kraft Girish N. Nadkarni Benjamin S. Glicksberg |
| author_sort | Eyal Klang |
| collection | DOAJ |
| description | Abstract Large language models (LLMs) can optimize clinical workflows; however, the economic and computational challenges of their utilization at the health system scale are underexplored. We evaluated how concatenating queries with multiple clinical notes and tasks simultaneously affects model performance under increasing computational loads. We assessed ten LLMs of different capacities and sizes utilizing real-world patient data. We conducted >300,000 experiments of various task sizes and configurations, measuring accuracy in question-answering and the ability to properly format outputs. Performance deteriorated as the number of questions and notes increased. High-capacity models, like Llama-3–70b, had low failure rates and high accuracies. GPT-4-turbo-128k was similarly resilient across task burdens, but performance deteriorated after 50 tasks at large prompt sizes. After addressing mitigable failures, these two models can concatenate up to 50 simultaneous tasks effectively, with validation on a public medical question-answering dataset. An economic analysis demonstrated up to a 17-fold cost reduction at 50 tasks using concatenation. These results identify the limits of LLMs for effective utilization and highlight avenues for cost-efficiency at the enterprise scale. |
| format | Article |
| id | doaj-art-5e75fc54840d40b7b0bdf4bc508e4cab |
| institution | OA Journals |
| issn | 2398-6352 |
| language | English |
| publishDate | 2024-11-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | npj Digital Medicine |
| spelling | doaj-art-5e75fc54840d40b7b0bdf4bc508e4cab; 2025-08-20T02:22:30Z; eng; Nature Portfolio; npj Digital Medicine; 2398-6352; 2024-11-01; 71112; 10.1038/s41746-024-01315-1; A strategy for cost-effective large language model use at health system-scale; Authors: Eyal Klang (0), Donald Apakama (1), Ethan E. Abbott (2), Akhil Vaid (3), Joshua Lampert (4), Ankit Sakhuja (5), Robert Freeman (6), Alexander W. Charney (7), David Reich (8), Monica Kraft (9), Girish N. Nadkarni (10), Benjamin S. Glicksberg (11); Affiliations: Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai; Department of Emergency Medicine, Icahn School of Medicine at Mount Sinai; Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai; Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai; Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai; Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai; The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai; The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai; Department of Anesthesiology, Perioperative and Pain Medicine, Icahn School of Medicine at Mount Sinai; The Samuel Bronfman Department of Medicine, Icahn School of Medicine at Mount Sinai; Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai; Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai; Abstract Large language models (LLMs) can optimize clinical workflows; however, the economic and computational challenges of their utilization at the health system scale are underexplored. We evaluated how concatenating queries with multiple clinical notes and tasks simultaneously affects model performance under increasing computational loads. We assessed ten LLMs of different capacities and sizes utilizing real-world patient data. We conducted >300,000 experiments of various task sizes and configurations, measuring accuracy in question-answering and the ability to properly format outputs. Performance deteriorated as the number of questions and notes increased. High-capacity models, like Llama-3–70b, had low failure rates and high accuracies. GPT-4-turbo-128k was similarly resilient across task burdens, but performance deteriorated after 50 tasks at large prompt sizes. After addressing mitigable failures, these two models can concatenate up to 50 simultaneous tasks effectively, with validation on a public medical question-answering dataset. An economic analysis demonstrated up to a 17-fold cost reduction at 50 tasks using concatenation. These results identify the limits of LLMs for effective utilization and highlight avenues for cost-efficiency at the enterprise scale.; https://doi.org/10.1038/s41746-024-01315-1 |
| title | A strategy for cost-effective large language model use at health system-scale |
| url | https://doi.org/10.1038/s41746-024-01315-1 |
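
The abstract reports a cost reduction from concatenating many tasks into one prompt. The following is a minimal sketch of why that can happen, not the authors' economic model: it assumes a shared context (for example, a clinical note) that is re-sent with every request when tasks are issued separately but sent only once when tasks are concatenated. All function names, token counts, and per-token prices are hypothetical and chosen only for illustration.

```python
def separate_cost(n_tasks, context_tokens, question_tokens, answer_tokens,
                  in_price, out_price):
    """Cost when each task is its own request, so the shared context is repeated."""
    input_tokens = n_tasks * (context_tokens + question_tokens)
    output_tokens = n_tasks * answer_tokens
    return input_tokens * in_price + output_tokens * out_price


def concatenated_cost(n_tasks, context_tokens, question_tokens, answer_tokens,
                      in_price, out_price):
    """Cost when all tasks share one request, so the context is sent once."""
    input_tokens = context_tokens + n_tasks * question_tokens
    output_tokens = n_tasks * answer_tokens
    return input_tokens * in_price + output_tokens * out_price


def cost_ratio(n_tasks, context_tokens, question_tokens, answer_tokens,
               in_price, out_price):
    """How many times cheaper the concatenated request is under these assumptions."""
    args = (n_tasks, context_tokens, question_tokens, answer_tokens,
            in_price, out_price)
    return separate_cost(*args) / concatenated_cost(*args)


if __name__ == "__main__":
    # Hypothetical numbers: a 1000-token note, 20-token questions, 5-token answers,
    # and arbitrary price units per input and output token.
    for n in (1, 10, 50):
        print(n, cost_ratio(n, 1000, 20, 5, in_price=1, out_price=2))
```

Under this simplified model the saving grows with the number of tasks and with the size of the shared context relative to each question. It ignores the accuracy and formatting failures that the abstract says increase with task count, so it describes cost only.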