InMemQK: A Product Quantization Based MatMul Module for Compute-in-Memory Attention Macro
Large Language Models (LLMs), based on the transformer architecture, have demonstrated remarkable capabilities in natural language processing tasks, enabling machines to generate human-like text and engage in meaningful dialogues. However, the exponential increase in model parameters has led to limitati...
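The title names a product quantization (PQ) based MatMul for the attention QK computation. As a rough illustration of the general PQ-matmul idea (not the paper's specific macro design, which is not detailed in this record), the sketch below quantizes key vectors into per-subspace codebooks and then approximates query–key dot products with table lookups instead of full multiply-accumulates. All sizes (`d`, `M`, `K`) and the toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d, M, K = 16, 4, 8              # vector dim, number of subspaces, centroids per subspace
sub = d // M                    # length of each subvector
keys = rng.standard_normal((32, d))  # toy key matrix (32 keys)

# Train one small codebook per subspace with a few k-means iterations,
# then encode each key as M centroid indices.
codebooks = []
codes = np.empty((len(keys), M), dtype=np.int64)
for m in range(M):
    X = keys[:, m * sub:(m + 1) * sub]
    C = X[rng.choice(len(X), K, replace=False)].copy()
    for _ in range(10):
        assign = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        for k in range(K):
            if (assign == k).any():
                C[k] = X[assign == k].mean(0)
    assign = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
    codebooks.append(C)
    codes[:, m] = assign

def pq_qk(q):
    # Precompute a (M, K) table of query-subvector . centroid dot products,
    # then sum the table entries selected by each key's codes.
    tables = np.stack([codebooks[m] @ q[m * sub:(m + 1) * sub] for m in range(M)])
    return tables[np.arange(M), codes].sum(axis=1)

q = rng.standard_normal(d)
approx = pq_qk(q)   # PQ-approximated q . K^T scores
exact = keys @ q    # exact scores, for comparison
```

The multiply-accumulate work collapses to `M * K` dot products per query plus `M` lookups per key, which is why PQ-style matmul maps well onto lookup-table-friendly compute-in-memory hardware.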
| Main Authors: | Pengcheng Feng, Yihao Chen, Jinke Yu, Hao Yue, Zhelong Jiang, Yi Xiao, Wan’ang Xiao, Huaxiang Lu, Gang Chen |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-12-01 |
| Series: | Applied Sciences |
| Online Access: | https://www.mdpi.com/2076-3417/14/23/11198 |
Similar Items
- NeuAFG: Neural Network-Based Analog Function Generator for Inference in CIM — by: Pengcheng Feng, et al. Published: (2025-01-01)
- Macro Memory Cell Generator for SKY130 PDK — by: Emilio Isaac Baungarten-Leon, et al. Published: (2024-01-01)
- Low-Power 8T SRAM Compute-in-Memory Macro for Edge AI Processors — by: Hye-Ju Shin, et al. Published: (2024-11-01)
- Quantized convolutional neural networks: a hardware perspective — by: Li Zhang, et al. Published: (2025-07-01)
- 10T SRAM Computing-in-Memory Macros for Binary and Multibit MAC Operation of DNN Edge Processors — by: Van Truong Nguyen, et al. Published: (2021-01-01)