Smoothed per-tensor weight quantization: a robust solution for neural network deployment

This paper introduces a novel method to improve quantization outcomes for per-tensor weight quantization, focusing on enhancing computational efficiency and compatibility with resource-constrained hardware. Addressing the inherent challenges of depth-wise convolutions, the proposed smooth quantization technique redistributes weight magnitude disparities to pre-activation data, thereby equalizing channel-wise weight magnitudes. This adjustment enables more effective application of uniform quantization schemes. Experimental evaluations on the ImageNet classification benchmark demonstrate substantial performance gains across modern architectures and training strategies. The proposed method achieves improved accuracy for per-tensor quantization without noticeable computational overhead, making it a practical solution for edge-device deployments.
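The equalization idea described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the exact scaling rule is not given in this record, so the per-channel scale choice (normalizing each channel's peak magnitude to the mean peak) and the helper names are assumptions.

```python
import numpy as np

def equalize_channels(W, eps=1e-8):
    """Scale each channel of W (channels, k, k) toward a common magnitude.
    Returns equalized weights and the per-channel scales s, where
    W_eq[c] = W[c] / s[c]; in a deployed network 1/s would be folded into
    the pre-activation data feeding this layer (assumed scheme)."""
    per_ch_max = np.max(np.abs(W.reshape(W.shape[0], -1)), axis=1)
    target = np.mean(per_ch_max)          # common magnitude target (assumption)
    s = per_ch_max / (target + eps)       # > 1 for dominant channels
    return W / s[:, None, None], s

def quantize_per_tensor(W, n_bits=8):
    """Symmetric uniform per-tensor quantization with a single scale."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.max(np.abs(W)) / qmax
    return np.round(W / scale) * scale

rng = np.random.default_rng(0)
# Depth-wise-conv-like weights: channel magnitudes spanning three decades.
W = rng.standard_normal((4, 3, 3)) * np.array([0.01, 0.1, 1.0, 10.0])[:, None, None]

W_eq, s = equalize_channels(W)
err_plain = np.abs(quantize_per_tensor(W) - W).mean()
# Quantize equalized weights, then undo the scaling for a fair comparison.
err_eq = np.abs(quantize_per_tensor(W_eq) * s[:, None, None] - W).mean()
assert err_eq < err_plain  # equalization shrinks per-tensor quantization error
```

With one shared scale, the loudest channel dictates the step size and small-magnitude channels round to zero; equalizing first lets all channels use the quantization grid effectively, which is the effect the abstract attributes to the method.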

Bibliographic Details
Main Author: Xin Chang
Format: Article
Language:English
Published: Polish Academy of Sciences 2025-07-01
Series:International Journal of Electronics and Telecommunications
Subjects:
Online Access:https://journals.pan.pl/Content/135755/23_4966_Chang_L_sk.pdf
_version_ 1850104006655868928
author Xin Chang
author_facet Xin Chang
author_sort Xin Chang
collection DOAJ
description This paper introduces a novel method to improve quantization outcomes for per-tensor weight quantization, focusing on enhancing computational efficiency and compatibility with resource-constrained hardware. Addressing the inherent challenges of depth-wise convolutions, the proposed smooth quantization technique redistributes weight magnitude disparities to pre-activation data, thereby equalizing channel-wise weight magnitudes. This adjustment enables more effective application of uniform quantization schemes. Experimental evaluations on the ImageNet classification benchmark demonstrate substantial performance gains across modern architectures and training strategies. The proposed method achieves improved accuracy for per-tensor quantization without noticeable computational overhead, making it a practical solution for edge-device deployments.
format Article
id doaj-art-bccde7525d194aab9d62899f43c1fa50
institution DOAJ
issn 2081-8491
2300-1933
language English
publishDate 2025-07-01
publisher Polish Academy of Sciences
record_format Article
series International Journal of Electronics and Telecommunications
spelling doaj-art-bccde7525d194aab9d62899f43c1fa50 | 2025-08-20T02:39:25Z | eng | Polish Academy of Sciences | International Journal of Electronics and Telecommunications | 2081-8491 | 2300-1933 | 2025-07-01 | vol. 71, No 3 | https://doi.org/10.24425/ijet.2025.153629 | Smoothed per-tensor weight quantization: a robust solution for neural network deployment | Xin Chang (Warsaw University of Technology, Poland) | This paper introduces a novel method to improve quantization outcomes for per-tensor weight quantization, focusing on enhancing computational efficiency and compatibility with resource-constrained hardware. Addressing the inherent challenges of depth-wise convolutions, the proposed smooth quantization technique redistributes weight magnitude disparities to pre-activation data, thereby equalizing channel-wise weight magnitudes. This adjustment enables more effective application of uniform quantization schemes. Experimental evaluations on the ImageNet classification benchmark demonstrate substantial performance gains across modern architectures and training strategies. The proposed method achieves improved accuracy for per-tensor quantization without noticeable computational overhead, making it a practical solution for edge-device deployments. | https://journals.pan.pl/Content/135755/23_4966_Chang_L_sk.pdf | per-tensor quantization; edge device; neural network compression
spellingShingle Xin Chang
Smoothed per-tensor weight quantization: a robust solution for neural network deployment
International Journal of Electronics and Telecommunications
per-tensor quantization
edge device
neural network compression
title Smoothed per-tensor weight quantization: a robust solution for neural network deployment
title_full Smoothed per-tensor weight quantization: a robust solution for neural network deployment
title_fullStr Smoothed per-tensor weight quantization: a robust solution for neural network deployment
title_full_unstemmed Smoothed per-tensor weight quantization: a robust solution for neural network deployment
title_short Smoothed per-tensor weight quantization: a robust solution for neural network deployment
title_sort smoothed per tensor weight quantization a robust solution for neural network deployment
topic per-tensor quantization
edge device
neural network compression
url https://journals.pan.pl/Content/135755/23_4966_Chang_L_sk.pdf
work_keys_str_mv AT xinchang smoothedpertensorweightquantizationarobustsolutionforneuralnetworkdeployment