Optimization of Direct Convolution Algorithms on ARM Processors for Deep Learning Inference
In deep learning, convolutional layers typically account for the majority of the computational workload and are often the primary performance bottleneck. The most widely used convolution algorithm relies on the IM2COL transform to take advantage of the highly optimized GEMM (General Matrix Mu...
| Main Authors: | Shang Li, Fei Yu, Shankou Zhang, Huige Yin, Hairong Lin |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-02-01 |
| Series: | Mathematics |
| Online Access: | https://www.mdpi.com/2227-7390/13/5/787 |
Similar Items
- Endoscopic Mucosal Techniques for GERD: Dawn of New Era in Endoscopic GERD Management
  by: Zubin Sharma, et al.
  Published: (2025-06-01)
- Towards Improved Nepovirus Detection and Identification in Xiphinema Nematodes
  by: Ellen A. Everaert, et al.
  Published: (2024-12-01)
- Assessment of Viral Limit of Detection in Spiked, Unassembled High-Throughput Sequencing Datasets
  by: Lizbeth Peña-Zúñiga, et al.
  Published: (2025-06-01)
- Optimizing Lattice Basis Reduction Algorithm on ARM V8 Processors
  by: Ronghui Cao, et al.
  Published: (2025-02-01)
- BTCP: Binary Temporal Convolutional Network-Based Data Prefetcher for Low Inference Latency and Storage Overhead
  by: Chang Ho Ryu, et al.
  Published: (2025-01-01)