CNN Accelerator Performance Dependence on Loop Tiling and the Optimum Resource-Constrained Loop Tiling

This paper analyzes the dependence of the convolutional neural network (CNN) accelerator performance on loop tiling. More specifically, based on the closed-form expression of the CNN accelerator performance, the dependence on the tile sizes is characterized by the derivative, the asymptote and the switching point between the computation-limited condition and the communication-limited condition.


Bibliographic Details
Main Authors: Chester Sungchung Park, Sungkyung Park
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
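Subjects: Closed-form expression; CNN accelerator; communication; computation; hardware resource; loop tiling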
Online Access: https://ieeexplore.ieee.org/document/10849540/
author Chester Sungchung Park
Sungkyung Park
collection DOAJ
description This paper analyzes the dependence of the convolutional neural network (CNN) accelerator performance on loop tiling. More specifically, based on the closed-form expression of the CNN accelerator performance, the dependence on the tile sizes is characterized by the derivative, the asymptote and the switching point between the computation-limited condition and the communication-limited condition. The analysis provides a useful insight into how to determine the tile sizes to achieve the required performance while avoiding an unnecessary static random access memory (SRAM) size increase. The paper also deals with the optimum resource-constrained loop tiling for CNN accelerators. Given the constraint on either the on-chip buffer size or the multiply-accumulate (MAC) array size, tile sizes are optimized to maximize the performance. The closed-form expressions of the optimum tile sizes provide useful insights into how to allocate the available hardware resources for maximum performance. From performance evaluation, the proposed tile sizes achieve almost the maximum performance, which enables the optimization of tile sizes without relying on exhaustive search, speeding up design space exploration.
format Article
id doaj-art-137de8a2da0c46dd8f62226d3ee7dfc3
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
series IEEE Access
spelling doaj-art-137de8a2da0c46dd8f62226d3ee7dfc3; 2025-01-31T00:01:42Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 16800-16810; DOI 10.1109/ACCESS.2025.3532790; document 10849540
Chester Sungchung Park (https://orcid.org/0000-0003-2009-2814), Department of Electrical and Electronics Engineering, Konkuk University, Gwangjin-gu, Seoul, South Korea
Sungkyung Park (https://orcid.org/0000-0003-1171-5020), Department of Electrical and Electronics Engineering, Pusan National University, Geumjeong-gu, Busan, South Korea
title CNN Accelerator Performance Dependence on Loop Tiling and the Optimum Resource-Constrained Loop Tiling
topic Closed-form expression
CNN accelerator
communication
computation
hardware resource
loop tiling
url https://ieeexplore.ieee.org/document/10849540/