Smoothed per-tensor weight quantization: a robust solution for neural network deployment
This paper introduces a novel method to improve quantization outcomes for per-tensor weight quantization, focusing on enhancing computational efficiency and compatibility with resource-constrained hardware. Addressing the inherent challenges of depth-wise convolutions, the proposed smooth quantization technique redistributes weight magnitude disparities to pre-activation data, thereby equalizing channel-wise weight magnitudes. This adjustment enables more effective application of uniform quantization schemes. Experimental evaluations on the ImageNet classification benchmark demonstrate substantial performance gains across modern architectures and training strategies. The proposed method improves the accuracy of per-tensor quantization without noticeable computational overhead, making it a practical solution for edge-device deployments.
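The smoothing idea the abstract describes can be illustrated with a minimal NumPy sketch: divide each depth-wise channel's weights by a per-channel scale so all channels reach a common magnitude, fold the inverse scale into the pre-activation input, then apply uniform per-tensor quantization. The function names and the geometric-mean choice of target magnitude below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def smooth_weights(weights, activations, eps=1e-12):
    """Equalize per-channel weight magnitudes of a depth-wise conv.

    weights:     (C, 1, kH, kW) depth-wise kernels, one filter per channel
    activations: (N, C, H, W) pre-activation inputs feeding the conv
    Returns smoothed weights, rescaled activations, and the per-channel scales.
    """
    C = weights.shape[0]
    ch_max = np.abs(weights.reshape(C, -1)).max(axis=1)   # per-channel |w| maximum
    target = np.exp(np.log(ch_max + eps).mean())          # geometric mean as common target
    s = ch_max / target                                   # per-channel smoothing scale
    w_smooth = weights / s.reshape(C, 1, 1, 1)            # shrink dominant channels
    x_scaled = activations * s.reshape(1, C, 1, 1)        # fold s into the input instead
    return w_smooth, x_scaled, s

def quantize_per_tensor(w, n_bits=8):
    """Uniform symmetric per-tensor quantization to signed integers."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(w).max() / qmax                        # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale
```

Because a depth-wise conv acts on each channel independently, scaling channel `c` of the input by `s[c]` while dividing kernel `c` by `s[c]` leaves the layer's output unchanged, yet the smoothed weights now share one magnitude range, so a single per-tensor scale wastes far fewer quantization levels.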
Saved in:
| Main Author: | Xin Chang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Polish Academy of Sciences, 2025-07-01 |
| Series: | International Journal of Electronics and Telecommunications |
| Subjects: | per-tensor quantization; edge device; neural network compression |
| Online Access: | https://journals.pan.pl/Content/135755/23_4966_Chang_L_sk.pdf |
| author | Xin Chang |
|---|---|
| affiliation | Warsaw University of Technology, Poland |
| collection | DOAJ |
| id | doaj-art-bccde7525d194aab9d62899f43c1fa50 |
| format | Article |
| issn | 2081-8491; 2300-1933 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Polish Academy of Sciences |
| series | International Journal of Electronics and Telecommunications |
| volume / issue | Vol. 71, No 3 |
| doi | https://doi.org/10.24425/ijet.2025.153629 |
| title | Smoothed per-tensor weight quantization: a robust solution for neural network deployment |
| topic | per-tensor quantization; edge device; neural network compression |
| url | https://journals.pan.pl/Content/135755/23_4966_Chang_L_sk.pdf |