Infrastructure for the deployment of Large Language Models: challenges and solutions
Large Language Models are increasingly prevalent, and their capabilities are advancing rapidly due to extensive research in this field. A growing number of models are being developed, with sizes significantly surpassing 70 billion parameters. As a result, the ability to perform efficient and scalabl...
| Main Authors: | Tomasz Walkowiak, Bartosz Walkowiak |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Polish Academy of Sciences, 2025-07-01 |
| Series: | International Journal of Electronics and Telecommunications |
| Subjects: | large language models; model deployment; continuous batching; chatbots; function-calling llm |
| Online Access: | https://journals.pan.pl/Content/135740/12_4999_Walkowiak_L_sk.pdf |
| _version_ | 1849470216325562368 |
|---|---|
| author | Tomasz Walkowiak; Bartosz Walkowiak |
| author_facet | Tomasz Walkowiak Bartosz Walkowiak |
| author_sort | Tomasz Walkowiak |
| collection | DOAJ |
| description | Large Language Models are increasingly prevalent, and their capabilities are advancing rapidly due to extensive research in this field. A growing number of models are being developed, with sizes significantly surpassing 70 billion parameters. As a result, the ability to perform efficient and scalable inference on these models is becoming crucial to maximize the utilization of valuable resources such as GPUs and CPUs. This thesis outlines a process for selecting the most effective tools for efficient inference, supported by the results of experiments. Additionally, it provides a comprehensive description of an end-to-end system for the inference process, encompassing all components from model inference and communication to user management and a user-friendly web interface. Furthermore, we detail the development of an LLM chatbot that leverages the function-calling capabilities of LLMs and integrates various external tools, including weather prediction, Wikipedia information, symbolic math, and image generation. |
| format | Article |
| id | doaj-art-2cb9770f87634a5ba7ba779e3e348826 |
| institution | Kabale University |
| issn | 2081-8491 2300-1933 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Polish Academy of Sciences |
| record_format | Article |
| series | International Journal of Electronics and Telecommunications |
| spelling | doaj-art-2cb9770f87634a5ba7ba779e3e348826; 2025-08-20T03:25:12Z; eng; Polish Academy of Sciences; International Journal of Electronics and Telecommunications; 2081-8491; 2300-1933; 2025-07-01; vol. 71, No 3; https://doi.org/10.24425/ijet.2025.153620; Infrastructure for the deployment of Large Language Models: challenges and solutions; Tomasz Walkowiak (0); Bartosz Walkowiak (1); (0) Faculty of Information and Communication Technology, Wroclaw University of Science and Technology, Wroclaw, Poland; (1) Faculty of Information and Communication Technology, Wroclaw University of Science and Technology, Wroclaw, Poland; Large Language Models are increasingly prevalent, and their capabilities are advancing rapidly due to extensive research in this field. A growing number of models are being developed, with sizes significantly surpassing 70 billion parameters. As a result, the ability to perform efficient and scalable inference on these models is becoming crucial to maximize the utilization of valuable resources such as GPUs and CPUs. This thesis outlines a process for selecting the most effective tools for efficient inference, supported by the results of experiments. Additionally, it provides a comprehensive description of an end-to-end system for the inference process, encompassing all components from model inference and communication to user management and a user-friendly web interface. Furthermore, we detail the development of an LLM chatbot that leverages the function-calling capabilities of LLMs and integrates various external tools, including weather prediction, Wikipedia information, symbolic math, and image generation.; https://journals.pan.pl/Content/135740/12_4999_Walkowiak_L_sk.pdf; large language models; model deployment; continuous batching; chatbots; function-calling llm |
| title | Infrastructure for the deployment of Large Language Models: challenges and solutions |
| topic | large language models; model deployment; continuous batching; chatbots; function-calling llm |
| url | https://journals.pan.pl/Content/135740/12_4999_Walkowiak_L_sk.pdf |