Optimization of Direct Convolution Algorithms on ARM Processors for Deep Learning Inference

In deep learning, convolutional layers typically bear the majority of the computational workload and are often the primary contributors to performance bottlenecks. The widely used convolution algorithm is based on the IM2COL transform to take advantage of the highly optimized GEMM (General Matrix Mu...

Bibliographic Details
Main Authors: Shang Li, Fei Yu, Shankou Zhang, Huige Yin, Hairong Lin
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Mathematics
Online Access: https://www.mdpi.com/2227-7390/13/5/787