Enhancing Computation-Efficiency of Deep Neural Network Processing on Edge Devices through Serial/Parallel Systolic Computing

In recent years, deep neural networks (DNNs) have addressed new applications with intelligent autonomy, often achieving higher accuracy than human experts. This capability comes at the expense of the ever-increasing complexity of emerging DNNs, causing enormous challenges while deploying on resource...

Bibliographic Details
Main Authors:	Iraj Moghaddasi, Byeong-Gyu Nam
Format:	Article
Language:	English
Published:	MDPI AG 2024-07-01
Series:	Machine Learning and Knowledge Extraction
Subjects:	systolic array; DNN accelerator; serial inference engine; TPU; energy efficiency
Online Access:	https://www.mdpi.com/2504-4990/6/3/70

collection	DOAJ
description	In recent years, deep neural networks (DNNs) have addressed new applications with intelligent autonomy, often achieving higher accuracy than human experts. This capability comes at the expense of the ever-increasing complexity of emerging DNNs, causing enormous challenges while deploying on resource-limited edge devices. Improving the efficiency of DNN hardware accelerators by compression has been explored previously. Existing state-of-the-art studies applied approximate computing to enhance energy efficiency even at the expense of a little accuracy loss. In contrast, bit-serial processing has been used for improving the computational efficiency of neural processing without accuracy loss, exploiting a simple design, dynamic precision adjustment, and computation pruning. This research presents Serial/Parallel Systolic Array (SPSA) and Octet Serial/Parallel Systolic Array (OSPSA) processing elements for edge DNN acceleration, which exploit bit-serial processing on systolic array architecture for improving computational efficiency. For evaluation, all designs were described at the RTL level and synthesized in 28 nm technology. Post-synthesis cycle-accurate simulations of image classification over DNNs illustrated that, on average, a sample 16 × 16 systolic array indicated remarkable improvements of 17.6% and 50.6% in energy efficiency compared to the baseline, with no loss of accuracy.
id	doaj-art-2ee3adbb86c845e6b391cab1e91cd59b
institution	OA Journals
issn	2504-4990
doi	10.3390/make6030070
volume	6
issue	3
pages	1484-1493
author_affiliation	Department of Computer Science and Engineering, Chungnam National University, Daejeon 305-764, Republic of Korea (both authors)


Similar Items