Reducing Memory and Computational Cost for Deep Neural Network Training with Quantized Parameter Updates
For embedded devices, both memory and computational efficiency are essential due to their constrained resources. However, neural network training remains both computation and memory intensive. Although many existing studies apply quantization schemes to mitigate memory overhead, they often employ stochastic rounding for both inference and gradient computation...
| Main Authors: | Leo Buron, Andreas Erbslöh, Gregor Schiele |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Graz University of Technology, 2025-08-01 |
| Series: | Journal of Universal Computer Science |
| Subjects: | deep learning; quantized parameters; edge training |
| Online Access: | https://lib.jucs.org/article/164737/download/pdf/ |
| _version_ | 1849341529494126592 |
|---|---|
| author | Leo Buron; Andreas Erbslöh; Gregor Schiele |
| author_facet | Leo Buron; Andreas Erbslöh; Gregor Schiele |
| author_sort | Leo Buron |
| collection | DOAJ |
| description | For embedded devices, both memory and computational efficiency are essential due to their constrained resources. However, neural network training remains both computation and memory intensive. Although many existing studies apply quantization schemes to mitigate memory overhead, they often employ stochastic rounding for both inference and gradient computation. Notably, no prior work has explored its advantages exclusively in parameter updates. Here, we introduce Quantized Parameter Updates (QPU), which uses stochastic rounding (SQPU) to achieve improved and more stable training outcomes. Our fixed-point quantization scheme quantizes parameters (weights and biases) upon model initialization, conducts high-precision gradient computations during training, and applies stochastically quantized updates thereafter. This approach substantially lowers memory usage and enables mostly quantized inference, thereby accelerating calculations. Furthermore, storing quantized inputs for gradient computation reduces memory demands even more. When tested on the FASHION-MNIST dataset, our method matches the Straight-Through Estimator (STE) in performance, delivering 0.92% validation accuracy while consuming just 57% of the memory during training. Accepting a slight 1.5% drop in accuracy yields a 50% memory reduction. Additional techniques include stochastic rounding in inference, the use of higher precision for parameters than for layer outputs to limit overflow, L2 regularization via weight decay, and adaptive learning-rate scheduling for improved optimization across a range of batch sizes. |
| format | Article |
| id | doaj-art-b66c0c8c23d74beabc68e628c434e11c |
| institution | Kabale University |
| issn | 0948-6968 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Graz University of Technology |
| record_format | Article |
| series | Journal of Universal Computer Science |
| spelling | doaj-art-b66c0c8c23d74beabc68e628c434e11c2025-08-20T03:43:36ZengGraz University of TechnologyJournal of Universal Computer Science0948-69682025-08-0131996397910.3897/jucs.164737164737Reducing Memory and Computational Cost for Deep Neural Network Training with Quantized Parameter UpdatesLeo Buron0Andreas Erbslöh1Gregor Schiele2University of Duisburg-EssenUniversity of Duisburg-EssenUniversity of Duisburg-EssenFor embedded devices, both memory and computational efficiency are essential due to their constrained resources. However, neural network training remains both computation and memory intensive. Although many existing studies apply quantization schemes to mitigate memory overhead, they often employ stochastic rounding for both inference and gradient computation. Notably, no prior work has explored its advantages exclusively in parameter updates. Here, we introduce Quantized Parameter Updates (QPU), which uses stochastic rounding (SQPU) to achieve improved and more stable training outcomes. Our fixed-point quantization scheme quantizes parameters (weights and biases) upon model initialization, conducts high-precision gradient computations during training, and applies stochastically quantized updates thereafter. This approach substantially lowers memory usage and enables mostly quantized inference, thereby accelerating calculations. Furthermore, storing quantized inputs for gradient computation reduces memory demands even more. When tested on the FASHION-MNIST dataset, our method matches the Straight-Through Estimator (STE) in performance, delivering 0.92% validation accuracy while consuming just 57% of the memory during training. Accepting a slight 1.5% drop in accuracy yields a 50% memory reduction. Additional techniques include stochastic rounding in inference, the use of higher precision for parameters than for layer outputs to limit overflow, L2 regularization via weight decay, and adaptive learning-rate scheduling for improved optimization across a range of batch sizes.https://lib.jucs.org/article/164737/download/pdf/deep learningquantized parametersedge training |
| spellingShingle | Leo Buron Andreas Erbslöh Gregor Schiele Reducing Memory and Computational Cost for Deep Neural Network Training with Quantized Parameter Updates Journal of Universal Computer Science deep learning quantized parameters edge training |
| title | Reducing Memory and Computational Cost for Deep Neural Network Training with Quantized Parameter Updates |
| title_full | Reducing Memory and Computational Cost for Deep Neural Network Training with Quantized Parameter Updates |
| title_fullStr | Reducing Memory and Computational Cost for Deep Neural Network Training with Quantized Parameter Updates |
| title_full_unstemmed | Reducing Memory and Computational Cost for Deep Neural Network Training with Quantized Parameter Updates |
| title_short | Reducing Memory and Computational Cost for Deep Neural Network Training with Quantized Parameter Updates |
| title_sort | reducing memory and computational cost for deep neural network training with quantized parameter updates |
| topic | deep learning quantized parameters edge training |
| url | https://lib.jucs.org/article/164737/download/pdf/ |
| work_keys_str_mv | AT leoburon reducingmemoryandcomputationalcostfordeepneuralnetworktrainingwithquantizedparameterupdates AT andreaserbsloh reducingmemoryandcomputationalcostfordeepneuralnetworktrainingwithquantizedparameterupdates AT gregorschiele reducingmemoryandcomputationalcostfordeepneuralnetworktrainingwithquantizedparameterupdates |
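The abstract above outlines the core mechanism: parameters live on a fixed-point grid, gradients are computed in high precision, and only the parameter update is snapped back to the grid with stochastic rounding. The following is a minimal sketch of that idea, not the authors' implementation; the Q8.8 fixed-point format, NumPy, plain SGD with weight decay, and all function names here are assumptions for illustration only.

```python
import numpy as np

def stochastic_round_fixed_point(x, frac_bits=8, total_bits=16):
    """Stochastically round float values onto a signed fixed-point grid.

    Values are scaled by 2**frac_bits, rounded down or up with probability
    equal to the fractional remainder, then clipped to the representable range.
    (Q8.8 / 16-bit is an assumed format, not taken from the paper.)
    """
    scale = 2.0 ** frac_bits
    scaled = x * scale
    floor = np.floor(scaled)
    prob_up = scaled - floor               # rounding is unbiased in expectation
    rounded = floor + (np.random.random_sample(x.shape) < prob_up)
    q_min = -(2 ** (total_bits - 1))
    q_max = 2 ** (total_bits - 1) - 1
    return np.clip(rounded, q_min, q_max) / scale

def quantized_parameter_update(weights_q, grad, lr=0.01, weight_decay=1e-4):
    """One SGD step where only the resulting parameters are quantized.

    The gradient and the update itself stay in floating point; the new
    weights are returned on the fixed-point grid via stochastic rounding.
    """
    update = lr * (grad + weight_decay * weights_q)   # high-precision update
    return stochastic_round_fixed_point(weights_q - update)

# Usage: weights start on the fixed-point grid and stay on it after each step.
w = stochastic_round_fixed_point(np.random.randn(128, 64).astype(np.float32))
g = np.random.randn(128, 64).astype(np.float32)       # placeholder gradient
w = quantized_parameter_update(w, g)
```

Because the probability of rounding up equals the fractional remainder, the expected quantized update matches the high-precision update, which is what makes restricting quantization to the parameter-update step plausible.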