A Flexible and Parallel Hardware Accelerator for Forward and Inverse Number Theoretic Transform

This paper demonstrates an efficient and flexible hardware accelerator for polynomial multiplication using number theoretic transform (NTT). The proposed architecture considers flexibility and performance requirements at the same time. Flexibility is achieved by computing the following three operati...

Full description

Saved in:
Bibliographic Details
Main Authors: Muhammad Rashid, Safiullah Khan, Omar S. Sonbul, Seong Oun Hwang
Format: Article
Language:English
Published: IEEE 2024-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/10772085/
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:This paper demonstrates an efficient and flexible hardware accelerator for polynomial multiplication using number theoretic transform (NTT). The proposed architecture considers flexibility and performance requirements at the same time. Flexibility is achieved by computing the following three operations: (i) computing only the forward NTT operation using a Cooley-Tukey butterfly unit (CT-BFU), (ii) computing only the inverse NTT operation using a Gentleman-Sande butterfly unit (GS-BFU), and (iii) computing both forward and inverse NTT operations simultaneously. The performance is enhanced by supporting parallelism between one CT-BFU unit, one GS-BFU unit, and four Block-RAMs. Moreover, a dedicated control unit is implemented to ensure a flexible and parallel FP-NTT design. A throughput/area metric is used for evaluation of performance for the proposed design. The implementation results are presented after post-placement and route on various Xilinx field-programmable gate array (FPGA) devices. Specifically, on Virtex-7 FPGA, FP-NTT operates at a frequency of <inline-formula> <tex-math notation="LaTeX">$250MHz$ </tex-math></inline-formula>, utilising 1026 slices, and requires <inline-formula> <tex-math notation="LaTeX">$4.61\mu s$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$5.12\mu s$ </tex-math></inline-formula> for forward and inverse NTT computations, respectively. The calculated throughput/area is 211.41 and 190.36 for forward and inverse computations, respectively. A comparison with state-of-the-art designs emphasises the suitability of the FP-NTT accelerator for high-speed cryptographic applications.
ISSN:2169-3536