Smoothed per-tensor weight quantization: a robust solution for neural network deployment

This paper introduces a novel method to improve quantization outcomes for per-tensor weight quantization, focusing on enhancing computational efficiency and compatibility with resource-constrained hardware. Addressing the inherent challenges of depth-wise convolutions, the proposed smooth quantization technique redistributes weight magnitude disparities to pre-activation data, thereby equalizing channel-wise weight magnitudes. This adjustment enables more effective application of uniform quantization schemes. Experimental evaluations on the ImageNet classification benchmark demonstrate substantial performance gains across modern architectures and training strategies. The proposed method achieves improved accuracy for per-tensor quantization without noticeable computational overhead, making it a practical solution for edge-device deployments.
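The equalization idea described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the exact scaling rule is not given in this record, so the per-channel scale choice (normalizing each channel's peak magnitude to the mean peak) and the helper names are assumptions.

```python
import numpy as np

def equalize_channels(W, eps=1e-8):
    """Scale each channel of W (channels, k, k) toward a common magnitude.
    Returns equalized weights and the per-channel scales s, where
    W_eq[c] = W[c] / s[c]; in a deployed network 1/s would be folded into
    the pre-activation data feeding this layer (assumed scheme)."""
    per_ch_max = np.max(np.abs(W.reshape(W.shape[0], -1)), axis=1)
    target = np.mean(per_ch_max)          # common magnitude target (assumption)
    s = per_ch_max / (target + eps)       # > 1 for dominant channels
    return W / s[:, None, None], s

def quantize_per_tensor(W, n_bits=8):
    """Symmetric uniform per-tensor quantization with a single scale."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.max(np.abs(W)) / qmax
    return np.round(W / scale) * scale

rng = np.random.default_rng(0)
# Depth-wise-conv-like weights: channel magnitudes spanning three decades.
W = rng.standard_normal((4, 3, 3)) * np.array([0.01, 0.1, 1.0, 10.0])[:, None, None]

W_eq, s = equalize_channels(W)
err_plain = np.abs(quantize_per_tensor(W) - W).mean()
# Quantize equalized weights, then undo the scaling for a fair comparison.
err_eq = np.abs(quantize_per_tensor(W_eq) * s[:, None, None] - W).mean()
assert err_eq < err_plain  # equalization shrinks per-tensor quantization error
```

With one shared scale, the loudest channel dictates the step size and small-magnitude channels round to zero; equalizing first lets all channels use the quantization grid effectively, which is the effect the abstract attributes to the method.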

Bibliographic Details
Main Author: Xin Chang
Format: Article
Language:English
Published: Polish Academy of Sciences 2025-07-01
Series:International Journal of Electronics and Telecommunications
Subjects:
Online Access:https://journals.pan.pl/Content/135755/23_4966_Chang_L_sk.pdf
_version_ 1850104006655868928
author Xin Chang
author_facet Xin Chang
author_sort Xin Chang
collection DOAJ
description This paper introduces a novel method to improve quantization outcomes for per-tensor weight quantization, focusing on enhancing computational efficiency and compatibility with resource-constrained hardware. Addressing the inherent challenges of depth-wise convolutions, the proposed smooth quantization technique redistributes weight magnitude disparities to pre-activation data, thereby equalizing channel-wise weight magnitudes. This adjustment enables more effective application of uniform quantization schemes. Experimental evaluations on the ImageNet classification benchmark demonstrate substantial performance gains across modern architectures and training strategies. The proposed method achieves improved accuracy for per-tensor quantization without noticeable computational overhead, making it a practical solution for edge-device deployments.
format Article
id doaj-art-bccde7525d194aab9d62899f43c1fa50
institution DOAJ
issn 2081-8491
2300-1933
language English
publishDate 2025-07-01
publisher Polish Academy of Sciences
record_format Article
series International Journal of Electronics and Telecommunications
spelling doaj-art-bccde7525d194aab9d62899f43c1fa50 | 2025-08-20T02:39:25Z | eng | Polish Academy of Sciences | International Journal of Electronics and Telecommunications | 2081-8491 | 2300-1933 | 2025-07-01 | vol. 71, No 3 | https://doi.org/10.24425/ijet.2025.153629 | Smoothed per-tensor weight quantization: a robust solution for neural network deployment | Xin Chang (Warsaw University of Technology, Poland) | This paper introduces a novel method to improve quantization outcomes for per-tensor weight quantization, focusing on enhancing computational efficiency and compatibility with resource-constrained hardware. Addressing the inherent challenges of depth-wise convolutions, the proposed smooth quantization technique redistributes weight magnitude disparities to pre-activation data, thereby equalizing channel-wise weight magnitudes. This adjustment enables more effective application of uniform quantization schemes. Experimental evaluations on the ImageNet classification benchmark demonstrate substantial performance gains across modern architectures and training strategies. The proposed method achieves improved accuracy for per-tensor quantization without noticeable computational overhead, making it a practical solution for edge-device deployments. | https://journals.pan.pl/Content/135755/23_4966_Chang_L_sk.pdf | per-tensor quantization; edge device; neural network compression
spellingShingle Xin Chang
Smoothed per-tensor weight quantization: a robust solution for neural network deployment
International Journal of Electronics and Telecommunications
per-tensor quantization
edge device
neural network compression
title Smoothed per-tensor weight quantization: a robust solution for neural network deployment
title_full Smoothed per-tensor weight quantization: a robust solution for neural network deployment
title_fullStr Smoothed per-tensor weight quantization: a robust solution for neural network deployment
title_full_unstemmed Smoothed per-tensor weight quantization: a robust solution for neural network deployment
title_short Smoothed per-tensor weight quantization: a robust solution for neural network deployment
title_sort smoothed per tensor weight quantization a robust solution for neural network deployment
topic per-tensor quantization
edge device
neural network compression
url https://journals.pan.pl/Content/135755/23_4966_Chang_L_sk.pdf
work_keys_str_mv AT xinchang smoothedpertensorweightquantizationarobustsolutionforneuralnetworkdeployment