Custom Network Quantization Method for Lightweight CNN Acceleration on FPGAs

Low-bit quantization can effectively reduce the storage and computation costs of deep neural networks. However, existing quantization methods have yielded unsatisfactory results when applied to lightweight networks. Additionally, following network quantization, differences in data types...

Bibliographic Details
Main Authors: Lingjie Yi, Xianzhong Xie, Yi Wan, Bo Jiang, Junfan Chen
Format: Article
Language: English
Published: Wiley 2024-01-01
Series:International Journal of Distributed Sensor Networks
Online Access:http://dx.doi.org/10.1155/2024/8018810
author Lingjie Yi
Xianzhong Xie
Yi Wan
Bo Jiang
Junfan Chen
collection DOAJ
description Low-bit quantization can effectively reduce the storage and computation costs of deep neural networks. However, existing quantization methods have yielded unsatisfactory results when applied to lightweight networks. Additionally, following network quantization, differences in data types between operators can cause issues when deploying networks on Field Programmable Gate Arrays (FPGAs). Moreover, some operators cannot be accelerated heterogeneously on FPGAs, forcing computation tasks to switch frequently between the Advanced RISC Machine (ARM) and FPGA environments. To address these problems, this paper proposes a custom network quantization approach. First, an improved PArameterized Clipping acTivation (PACT) method is employed during quantization-aware training to restrict the value range of neural network parameters and reduce the precision loss arising from quantization. Second, a Consecutive Execution Of Convolution Operators (CEOCO) strategy is used to mitigate the resource consumption caused by frequent environment switching. The proposed approach is validated on Xilinx Zynq UltraScale+ MPSoC 3EG and Virtex UltraScale+ XCVU13P platforms, using the MobileNetv1, MobileNetv3, PPLCNet, and PPLCNetv2 networks as testbeds. Experiments were conducted on the miniImageNet, CIFAR-10, and Oxford 102 Flowers public datasets. Compared with the original models, the proposed optimization methods result in an average accuracy decrease of 1.2%. Compared with the conventional quantization method, accuracy remains almost unchanged, while the frames per second (FPS) on FPGAs improves by an average factor of 2.1.
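The abstract builds on PACT, a published quantization-aware training technique in which activations are clipped to a learnable threshold α before uniform quantization. The authors' improved variant is not detailed in this record; the sketch below illustrates only the standard PACT forward pass (clip to [0, α], then k-bit uniform quantization). The function name and parameter values are illustrative, not from the paper.

```python
import numpy as np

def pact_quantize(x, alpha, k=8):
    """Standard PACT-style activation quantization (illustrative sketch).

    Clips activations to [0, alpha] (alpha is learned during training),
    then uniformly quantizes the clipped range to k bits.
    """
    # PACT clipping: algebraically equivalent to min(max(x, 0), alpha)
    y = 0.5 * (np.abs(x) - np.abs(x - alpha) + alpha)
    # Uniform k-bit quantization of the clipped range [0, alpha]
    scale = (2 ** k - 1) / alpha
    return np.round(y * scale) / scale
```

During training, α is optimized alongside the weights (with a straight-through estimator for the rounding step), which is what lets PACT trade clipping error against quantization resolution.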
format Article
id doaj-art-aa14af564a8a4388bb4d786042dbf99e
institution DOAJ
issn 1550-1477
language English
publishDate 2024-01-01
publisher Wiley
record_format Article
series International Journal of Distributed Sensor Networks
spelling doaj-art-aa14af564a8a4388bb4d786042dbf99e | 2025-08-20T03:19:02Z | eng | Wiley | International Journal of Distributed Sensor Networks | 1550-1477 | 2024-01-01 | 2024 | 10.1155/2024/8018810 | Custom Network Quantization Method for Lightweight CNN Acceleration on FPGAs | Lingjie Yi (School of Computer Science and Technology); Xianzhong Xie (School of Computer Science and Technology); Yi Wan (School of Computer Science and Technology); Bo Jiang (School of Computer Science and Technology); Junfan Chen (Chongqing Haiyun Jiexun Technology) | http://dx.doi.org/10.1155/2024/8018810
title Custom Network Quantization Method for Lightweight CNN Acceleration on FPGAs
url http://dx.doi.org/10.1155/2024/8018810