Search Results - llms (inference OR influence) optimization :: Kabale University Library Catalog

1

Efficient LLMs Training and Inference: An Introduction by Rui Li, Deji Fu, Chunyu Shi, Zhilan Huang, Gang Lu

Published 2025-01-01

Article

2

ELO-Mask: Effective and Layerwise Optimization of Mask for Sparse LLMs by Bingjie Xiang, Jiarui Wu, Xiaoying Han, Qian Gu, Fei Chao, Xiao Yang, Fan Wu, Xin Fu

Published 2024-01-01

Article

3

Long-context inference optimization for large language models: a survey by TAO Wei, WANG Jianzong, ZHANG Xulong, QU Xiaoyang

Published 2025-01-01
“…To improve the efficiency of LLMs in long-text inference, a comprehensive review and analysis of existing optimization techniques were conducted. …”

Article

5

A study on classification based concurrent API calls and optimal model combination for tool augmented LLMs for AI agent by HeounMo Go, SangHyun Park

Published 2025-07-01
“…With the rapid advancement of LLMs, enhanced models continue to emerge. Considering the trade-offs between performance and cost in models, it is crucial to find an optimal combination of models in each stage of tool augmented LLM. …”

Article

6

Entropy-Guided KV Caching for Efficient LLM Inference by Heekyum Kim, Yuchul Jung

Published 2025-07-01
“…However, their practical deployment—especially in long-context scenarios—is often hindered by the computational and memory costs associated with managing the key–value (KV) cache during inference. Optimizing this process is therefore crucial for improving LLM efficiency and scalability. …”

Article

7

AsymGroup: Asymmetric Grouping and Communication Optimization for 2D Tensor Parallelism in LLM Inference by Ki Tae Kim, Seok-Ju Im, Eui-Young Chung

Published 2025-01-01
“…Recent advances in Large Language Models (LLMs), such as GPT and LLaMA, have demonstrated remarkable capabilities across a wide array of natural language processing tasks. …”

Article

8

ORANSight-2.0: Foundational LLMs for O-RAN by Pranshav Gajjar, Vijay K. Shah

Published 2025-01-01
“…We thoroughly evaluate the energy characteristics of ORANSight-2.0, demonstrating its efficiency in training, inference, and inference with RAG augmentation, ensuring optimal performance while maintaining low computational and energy costs. …”

Article

9

Survey and Evaluation of Converging Architecture in LLMs Based on Footsteps of Operations by Seongho Kim, Jihyun Moon, Juntaek Oh, Insu Choi, Joon-Sung Yang

Published 2025-01-01
“…The evolution of LLMs has been driven by advances in high-bandwidth memory, specialized accelerators, and optimized architectures, enabling models to scale to billions of parameters. …”

Article

10

LLMs on a Budget: System-Level Approaches to Power-Efficient and Scalable Fine-Tuning by Kailash Gogineni, Ali Suvizi, Guru Venkataramani

Published 2025-01-01
“…Large Language Models (LLMs) have shown remarkable capabilities in various applications, including robotics, telecommunications, and scientific discovery. …”

Article

11

Few-Shot Optimization for Sensor Data Using Large Language Models: A Case Study on Fatigue Detection by Elsen Ronando, Sozo Inoue

Published 2025-05-01
“…In this paper, we propose a novel few-shot optimization with Hybrid Euclidean Distance with Large Language Models (HED-LM) to improve example selection for sensor-based classification tasks. …”

Article

12

Unveiling the Power of Large Language Models: A Comparative Study of Retrieval-Augmented Generation, Fine-Tuning, and Their Synergistic Fusion for Enhanced Performance by Gulsum Budakoglu, Hakan Emekci

Published 2025-01-01

Article

13

BALI—A Benchmark for Accelerated Language Model Inference by Lena Jurkschat, Preetam Gattogi, Sahar Vahdati, Jens Lehmann

Published 2025-01-01
“…These applications rely on real-time or near-real-time responses to process sequential LLM requests, creating a critical demand for efficient and accelerated inference. These developments have led to numerous frameworks optimizing inference speed and resource utilization. …”

Article

14

Impact of Developer Queries on the Effectiveness of Conversational Large Language Models in Programming by Viktor Taneski, Sašo Karakatič, Patrik Rek, Gregor Jošt

Published 2025-06-01
“…These findings suggest that the nature of the queries made to LLMs influences the success of programming tasks and provides insights into how AI tools can assist learning in software development.…”

Article

15

Agentic AI Systems: Architecture and Evaluation Using a Frictionless Parking Scenario by Alaa Khamis

Published 2025-01-01
“…Key metrics (agent’s response time or latency and lexical consistency) show that a lightweight gpt-4o-mini backbone and concise verbosity minimize latency, while medium prompt specificity and moderate query complexity optimize consistency. Decoding entropy influences stylistic diversity without significant latency costs but reduces consistency at high settings. …”

Article

16

Human-Centered AI for Migrant Integration Through LLM and RAG Optimization by Dagoberto Castellanos-Nieves, Luis García-Forte

Published 2024-12-01
“…Our proposal involves the optimal tuning of key hyperparameters for LLMs and RAG through multi-criteria decision-making (MCDM) methods to ensure the solutions are fair, equitable, and non-discriminatory. …”

Article

17

AI driven cardiovascular risk prediction using NLP and Large Language Models for personalized medicine in athletes by Ang Li, Yunxin Wang, Hongxu Chen

Published 2025-06-01
“…Furthermore, the research underscores the role of LLMs in personalized medicine, identifying patient-specific risk factors and optimizing treatment pathways for cardiac patients. …”

Article

18

Exploring the Joint Influence of Built Environment Factors on Urban Rail Transit Peak-Hour Ridership Using DeepSeek by Zhuorui Wang, Xiaoyu Zheng, Fanyun Meng, Kang Wang, Xincheng Wu, Dexin Yu

Published 2025-05-01
“…Recent advancements in the reasoning capabilities of large language models (LLMs) offer a robust methodological foundation for analyzing the complex joint influence of multiple built environment factors. …”

Article

19

Probing the Pitfalls: Understanding SVD’s Shortcomings in Language Model Compression by Сергей Александрович Плетенев

Published 2024-12-01

Article

20

Evaluating and Improving Syndrome Differentiation Thinking Ability in Large Language Models: Method Development Study by Chunliang Chen, Xinyu Wang, Ming Guan, Wenjing Yue, Yuanbin Wu, Ya Zhou, Xiaoling Wang

Published 2025-06-01
“…The optimized LLMs show a high degree of similarity in reasoning results, consistent with the opinions of domain experts, indicating that they can simulate syndrome differentiation thinking to a certain extent. …”

Article