Addressing Activation Outliers in LLMs: A Systematic Review of Post-Training Quantization Techniques
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/10994764/ |
| Summary: | Large Language Models (LLMs) have transformed natural language processing, yet their deployment remains challenging due to substantial computational, memory, and energy demands. Post-training quantization has emerged as a key strategy for enabling efficient inference, particularly in resource-constrained settings. This systematic review focuses on weight-activation quantization, with a unique emphasis on the emergent outlier phenomenon in LLM activations. This work evaluates recent techniques that mitigate activation outliers and improve quantization efficiency, distinguishing itself from prior reviews. Using the PRISMA methodology, we examine 52 recent studies to uncover key trends and evaluate the effectiveness of different approaches. By synthesizing insights from these works, this review presents a diverse set of techniques and their implications for activation quantization, laying the groundwork for future research and practical advancements in LLM deployment. |
| ISSN: | 2169-3536 |
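The activation-outlier phenomenon the abstract refers to can be illustrated with a toy numeric sketch. The values, channel layout, and helper names below are invented for illustration and are not taken from the reviewed paper: a single large-magnitude activation forces a coarse shared quantization scale, crushing the many small values, whereas giving each channel its own scale preserves them.

```python
# Toy sketch of why activation outliers hurt post-training quantization.
# All values and function names are illustrative assumptions.

def quantize_dequantize(values, n_bits=8):
    """Symmetric uniform quantization with one shared scale for `values`."""
    qmax = 2 ** (n_bits - 1) - 1              # 127 for int8
    scale = max(abs(v) for v in values) / qmax
    return [round(v / scale) * scale for v in values]

# Typical activations are small, but a few channels carry large outliers.
activations = [0.02, -0.05, 0.03, 0.01, 60.0]  # last entry is an outlier

# Per-tensor: the outlier sets the scale, so small values round to zero.
per_tensor = quantize_dequantize(activations)

# Per-channel: each value gets its own scale, so small values survive.
per_channel = [quantize_dequantize([v])[0] for v in activations]

def max_error(orig, deq):
    return max(abs(a - b) for a, b in zip(orig, deq))

print("per-tensor  max error:", max_error(activations, per_tensor))
print("per-channel max error:", max_error(activations, per_channel))
```

With the shared scale (60 / 127 ≈ 0.47), every small activation quantizes to 0, which is the kind of accuracy loss the outlier-mitigation techniques surveyed in the review aim to avoid.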