Edge-LLM Inference With Cost-Aware Layer Allocation and Adaptive Scheduling

This paper addresses two key challenges in distributed Large Language Model (LLM) inference at the edge: 1) cost-efficient and fair task allocation, and 2) dynamic scheduling under deadline constraints. We propose two mechanisms: the Fair Cost-Efficient Incentive Mechanism (FCIM) for task and layer...

Bibliographic Details
Main Authors:	Sama Habibi, Ozgur Ercetin
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Adaptive scheduling; distributed AI; edge computing; fair incentive mechanism; large language models; resource allocation
Online Access:	https://ieeexplore.ieee.org/document/11095716/

