Multi-distribution noise quantisation: an extreme compression scheme for transformer according to parameter distribution

With the development of deep learning, neural networks are widely used in various fields, but the improved model performance comes with a considerable increase in parameters and computation. Model quantisation is a technique that converts floating-point computation into low-bit fixed-point computation,...

Bibliographic Details
Main Authors: Zaiyang Yu, Shuang Li, Linjun Sun, Liang Liu, Haining Wang
Format: Article
Language: English
Published: Taylor & Francis Group 2022-12-01
Series: Connection Science
Subjects:
Online Access: http://dx.doi.org/10.1080/09540091.2021.2024510