Karatsuba Algorithm Revisited for 2D Convolution Computation Optimization
Convolution plays a significant role in many scientific and technological computations, such as artificial intelligence and signal processing. Convolutional computations consist of many dot-product operations (multiplication–accumulation, or MAC), for which the Winograd algorithm is currently the mo...
Saved in:
| Main Authors: | , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
MDPI AG
2025-05-01
|
| Series: | Entropy |
| Subjects: | |
| Online Access: | https://www.mdpi.com/1099-4300/27/5/506 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1850257030211698688 |
|---|---|
| author | Qi Wang Jianghan Zhu Can He Shihang Wang Xingbo Wang Yuan Ren Terry Tao Ye |
| author_facet | Qi Wang Jianghan Zhu Can He Shihang Wang Xingbo Wang Yuan Ren Terry Tao Ye |
| author_sort | Qi Wang |
| collection | DOAJ |
| description | Convolution plays a significant role in many scientific and technological computations, such as artificial intelligence and signal processing. Convolutional computations consist of many dot-product operations (multiplication–accumulation, or MAC), for which the Winograd algorithm is currently the most widely used method to reduce the number of MACs. The Karatsuba algorithm, since its introduction in the 1960s, has been traditionally used as a fast arithmetic method to perform multiplication between large-bit-width operands. It had not been exploited to accelerate 2D convolution computations before. In this paper, we revisited the Karatsuba algorithm and exploited it to reduce the number of MACs in 2D convolutions. The matrices are first segmented into tiles in a divide-and-conquer method, and the resulting submatrices are overlapped to construct the final output matrix. Our analysis and benchmarks have shown that for convolution operations of the same dimensions, the Karatsuba algorithm requires the same number of multiplications but fewer additions as compared with the Winograd algorithm. A pseudocode implementation is also provided to demonstrate the complexity reduction in Karatsuba-based convolution. FPGA implementation of Karatsuba-based convolution also achieves 33.6% LUTs (Look -up Tables) reduction compared with Winograd-based implementation. |
| format | Article |
| id | doaj-art-928607b3aa2e4066838e6e9b9a6fa016 |
| institution | OA Journals |
| issn | 1099-4300 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Entropy |
| spelling | doaj-art-928607b3aa2e4066838e6e9b9a6fa0162025-08-20T01:56:31ZengMDPI AGEntropy1099-43002025-05-0127550610.3390/e27050506Karatsuba Algorithm Revisited for 2D Convolution Computation OptimizationQi Wang0Jianghan Zhu1Can He2Shihang Wang3Xingbo Wang4Yuan Ren5Terry Tao Ye6The Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC V6T 1Z4, CanadaThe Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, ChinaThe Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, ChinaThe Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, ChinaThe Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, ChinaThe Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, ChinaSchool of Science and Engineering, The Chinese University of Hong Kong, Shenzhen 518172, ChinaConvolution plays a significant role in many scientific and technological computations, such as artificial intelligence and signal processing. Convolutional computations consist of many dot-product operations (multiplication–accumulation, or MAC), for which the Winograd algorithm is currently the most widely used method to reduce the number of MACs. The Karatsuba algorithm, since its introduction in the 1960s, has been traditionally used as a fast arithmetic method to perform multiplication between large-bit-width operands. It had not been exploited to accelerate 2D convolution computations before. In this paper, we revisited the Karatsuba algorithm and exploited it to reduce the number of MACs in 2D convolutions. The matrices are first segmented into tiles in a divide-and-conquer method, and the resulting submatrices are overlapped to construct the final output matrix. Our analysis and benchmarks have shown that for convolution operations of the same dimensions, the Karatsuba algorithm requires the same number of multiplications but fewer additions as compared with the Winograd algorithm. A pseudocode implementation is also provided to demonstrate the complexity reduction in Karatsuba-based convolution. FPGA implementation of Karatsuba-based convolution also achieves 33.6% LUTs (Look -up Tables) reduction compared with Winograd-based implementation.https://www.mdpi.com/1099-4300/27/5/506Karatsuba algorithmconvolutional computing complexityhardware computation accelerationWinograd algorithmhardware/software co-design |
| spellingShingle | Qi Wang Jianghan Zhu Can He Shihang Wang Xingbo Wang Yuan Ren Terry Tao Ye Karatsuba Algorithm Revisited for 2D Convolution Computation Optimization Entropy Karatsuba algorithm convolutional computing complexity hardware computation acceleration Winograd algorithm hardware/software co-design |
| title | Karatsuba Algorithm Revisited for 2D Convolution Computation Optimization |
| title_full | Karatsuba Algorithm Revisited for 2D Convolution Computation Optimization |
| title_fullStr | Karatsuba Algorithm Revisited for 2D Convolution Computation Optimization |
| title_full_unstemmed | Karatsuba Algorithm Revisited for 2D Convolution Computation Optimization |
| title_short | Karatsuba Algorithm Revisited for 2D Convolution Computation Optimization |
| title_sort | karatsuba algorithm revisited for 2d convolution computation optimization |
| topic | Karatsuba algorithm convolutional computing complexity hardware computation acceleration Winograd algorithm hardware/software co-design |
| url | https://www.mdpi.com/1099-4300/27/5/506 |
| work_keys_str_mv | AT qiwang karatsubaalgorithmrevisitedfor2dconvolutioncomputationoptimization AT jianghanzhu karatsubaalgorithmrevisitedfor2dconvolutioncomputationoptimization AT canhe karatsubaalgorithmrevisitedfor2dconvolutioncomputationoptimization AT shihangwang karatsubaalgorithmrevisitedfor2dconvolutioncomputationoptimization AT xingbowang karatsubaalgorithmrevisitedfor2dconvolutioncomputationoptimization AT yuanren karatsubaalgorithmrevisitedfor2dconvolutioncomputationoptimization AT terrytaoye karatsubaalgorithmrevisitedfor2dconvolutioncomputationoptimization |