Infrastructure for the deployment of Large Language Models: challenges and solutions

Bibliographic Details
Main Authors: Tomasz Walkowiak, Bartosz Walkowiak
Format: Article
Language: English
Published: Polish Academy of Sciences 2025-07-01
Series: International Journal of Electronics and Telecommunications
Subjects: large language models, model deployment, continuous batching, chatbots, function-calling llm
Online Access: https://journals.pan.pl/Content/135740/12_4999_Walkowiak_L_sk.pdf
_version_ 1849470216325562368
author Tomasz Walkowiak
Bartosz Walkowiak
collection DOAJ
description Large Language Models are increasingly prevalent, and their capabilities are advancing rapidly due to extensive research in this field. A growing number of models are being developed, with sizes significantly surpassing 70 billion parameters. As a result, the ability to perform efficient and scalable inferences on these models is becoming crucial to maximize the utilization of valuable resources such as GPUs and CPUs. This thesis outlines a process for selecting the most effective tools for efficient inference, supported by the results of experiments. Additionally, it provides a comprehensive description of an end-to-end system for the inference process, encompassing all components from model inference and communication to user management and a user-friendly web interface. Furthermore, we detail the development of an LLM chatbot that leverages the function-calling capabilities of LLMs and integrates various external tools, including weather prediction, Wikipedia information, symbolic math, and image generation.
format Article
id doaj-art-2cb9770f87634a5ba7ba779e3e348826
institution Kabale University
issn 2081-8491
2300-1933
language English
publishDate 2025-07-01
publisher Polish Academy of Sciences
record_format Article
series International Journal of Electronics and Telecommunications
spelling doaj-art-2cb9770f87634a5ba7ba779e3e348826
2025-08-20T03:25:12Z
eng
Polish Academy of Sciences
International Journal of Electronics and Telecommunications
2081-8491
2300-1933
2025-07-01
vol. 71
No 3
https://doi.org/10.24425/ijet.2025.153620
Infrastructure for the deployment of Large Language Models: challenges and solutions
Tomasz Walkowiak (Faculty of Information and Communication Technology, Wroclaw University of Science and Technology, Wroclaw, Poland)
Bartosz Walkowiak (Faculty of Information and Communication Technology, Wroclaw University of Science and Technology, Wroclaw, Poland)
title Infrastructure for the deployment of Large Language Models: challenges and solutions
topic large language models
model deployment
continuous batching
chatbots
function-calling llm
url https://journals.pan.pl/Content/135740/12_4999_Walkowiak_L_sk.pdf