Reducing Memory and Computational Cost for Deep Neural Network Training with Quantized Parameter Updates

Bibliographic Details
Main Authors: Leo Buron, Andreas Erbslöh, Gregor Schiele (University of Duisburg-Essen)
Format: Article
Language: English
Published: Graz University of Technology, 2025-08-01
Series: Journal of Universal Computer Science
ISSN: 0948-6968
DOI: 10.3897/jucs.164737
Subjects: deep learning; quantized parameters; edge training
Online Access: https://lib.jucs.org/article/164737/download/pdf/

Description: For embedded devices, both memory and computational efficiency are essential due to their constrained resources. However, neural network training remains both computation and memory intensive. Although many existing studies apply quantization schemes to mitigate memory overhead, they often employ stochastic rounding for both inference and gradient computation. Notably, no prior work has explored its advantages exclusively in parameter updates. Here, we introduce Quantized Parameter Updates (QPU), which uses stochastic rounding (SQPU) to achieve improved and more stable training outcomes. Our fixed-point quantization scheme quantizes parameters (weights and biases) upon model initialization, conducts high-precision gradient computations during training, and applies stochastically quantized updates thereafter. This approach substantially lowers memory usage and enables mostly quantized inference, thereby accelerating calculations. Furthermore, storing quantized inputs for gradient computation reduces memory demands even more. When tested on the FASHION-MNIST dataset, our method matches the Straight-Through Estimator (STE) in performance, delivering 92% validation accuracy while consuming just 57% of the memory during training. Accepting a slight 1.5% drop in accuracy yields a 50% memory reduction. Additional techniques include stochastic rounding in inference, the use of higher precision for parameters than for layer outputs to limit overflow, L2 regularization via weight decay, and adaptive learning-rate scheduling for improved optimization across a range of batch sizes.
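
The description above outlines the update scheme only at a high level, so the following is a minimal, hypothetical Python/NumPy sketch of stochastically rounded, fixed-point parameter updates in that spirit. The bit width, the helper names quantize_stochastic and sgd_step_quantized, and the plain SGD-with-weight-decay step are illustrative assumptions, not the authors' implementation.

# Illustrative sketch only: bit widths, function names, and the update rule are
# assumptions standing in for the general idea of quantizing parameters at
# initialization, computing gradients in high precision, and stochastically
# rounding only the parameter update back onto the fixed-point grid.
import numpy as np

def quantize_stochastic(x, frac_bits=8):
    """Fixed-point quantization with stochastic rounding.

    Values are scaled by 2**frac_bits, rounded down or up with probability
    proportional to the fractional remainder, then rescaled. In expectation
    the result equals the input, so small contributions are not
    systematically lost."""
    scaled = x * (2 ** frac_bits)
    floor = np.floor(scaled)
    prob_up = scaled - floor                      # chance of rounding up
    rounded = floor + (np.random.random(x.shape) < prob_up)
    return rounded / (2 ** frac_bits)

def quantize_nearest(x, frac_bits=8):
    """Deterministic fixed-point quantization (round to nearest)."""
    return np.round(x * (2 ** frac_bits)) / (2 ** frac_bits)

def sgd_step_quantized(weights_q, grad_fp32, lr=0.01, weight_decay=1e-4, frac_bits=8):
    """Hypothetical training step: parameters stay on a fixed-point grid,
    the gradient is high precision, and only the updated parameters are
    stochastically rounded back to the grid."""
    update = lr * (grad_fp32 + weight_decay * weights_q)   # L2 via weight decay
    return quantize_stochastic(weights_q - update, frac_bits=frac_bits)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = quantize_nearest(rng.normal(scale=0.1, size=(4, 4)))  # quantize at initialization
    g = rng.normal(scale=0.01, size=(4, 4))                   # stand-in for a real gradient
    w = sgd_step_quantized(w, g)
    print(w)

The point the sketch illustrates: stochastic rounding is unbiased in expectation, so small high-precision updates that nearest rounding would always snap to zero still move the quantized parameters on average, which is the behaviour the description credits with more stable training.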