A Pipelined Hardware Design of FNTT and INTT of CRYSTALS-Kyber PQC Algorithm

Lattice-based post-quantum cryptography (PQC) algorithms demand number theoretic transform (NTT)-based polynomial multiplications. NTT-based polynomials’ multiplication relies on the computation of forward number theoretic transform (FNTT) and inverse number theoretic transform (INTT), respectively....

Full description

Saved in:
Bibliographic Details
Main Authors: Muhammad Rashid, Omar S. Sonbul, Sajjad Shaukat Jamal, Amar Y. Jaffar, Azamat Kakhorov
Format: Article
Language:English
Published: MDPI AG 2024-12-01
Series:Information
Subjects:
Online Access:https://www.mdpi.com/2078-2489/16/1/17
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1832588359311032320
author Muhammad Rashid
Omar S. Sonbul
Sajjad Shaukat Jamal
Amar Y. Jaffar
Azamat Kakhorov
author_facet Muhammad Rashid
Omar S. Sonbul
Sajjad Shaukat Jamal
Amar Y. Jaffar
Azamat Kakhorov
author_sort Muhammad Rashid
collection DOAJ
description Lattice-based post-quantum cryptography (PQC) algorithms demand number theoretic transform (NTT)-based polynomial multiplications. NTT-based polynomials’ multiplication relies on the computation of forward number theoretic transform (FNTT) and inverse number theoretic transform (INTT), respectively. Therefore, this work presents a unified NTT hardware accelerator architecture to facilitate the polynomial multiplications of the CRYSTALS-Kyber PQC algorithm. Moreover, a unified butterfly unit design of Cooley–Tukey and Gentleman–Sande configurations is proposed to implement the FNTT and INTT operations using one adder, one multiplier, and one subtractor, sharing four routing multiplexers and one Barrett-based modular reduction unit. The critical path of the proposed butterfly unit is minimized using pipelining. An efficient controller is implemented for control functionalities. The simulation results after the post-place and -route step are provided on Xilinx Virtex-6 and Virtex-7 field-programmable gate array devices. Also, the proposed design is physically implemented for validation on Virtex-7 FPGA. The number of slices utilized on Virtex-6 and Virtex-7 devices is 398 and 312, the required number of clock cycles for one set of FNTT and INTT computations is 1410 and 1540, and the maximum operating frequency is 256 and 290 MHz, respectively. The average figure of merit (FoM), where FoM is the ratio of throughput to slices, illustrates 62% better performance than the most relevant NTT design from the literature.
format Article
id doaj-art-9557e23be65248309505890f2acd893f
institution Kabale University
issn 2078-2489
language English
publishDate 2024-12-01
publisher MDPI AG
record_format Article
series Information
spelling doaj-art-9557e23be65248309505890f2acd893f2025-01-24T13:35:09ZengMDPI AGInformation2078-24892024-12-011611710.3390/info16010017A Pipelined Hardware Design of FNTT and INTT of CRYSTALS-Kyber PQC AlgorithmMuhammad Rashid0Omar S. Sonbul1Sajjad Shaukat Jamal2Amar Y. Jaffar3Azamat Kakhorov4Computer and Network Engineering Department, Umm Al-Qura University, Makkah 24382, Saudi ArabiaComputer and Network Engineering Department, Umm Al-Qura University, Makkah 24382, Saudi ArabiaDepartment of Mathematics, College of Science, King Khalid University, Abha 61413, Saudi ArabiaComputer and Network Engineering Department, Umm Al-Qura University, Makkah 24382, Saudi ArabiaDepartment of Artificial intelligence, Tashkent State University of Economics, Tashkent 100066, UzbekistanLattice-based post-quantum cryptography (PQC) algorithms demand number theoretic transform (NTT)-based polynomial multiplications. NTT-based polynomials’ multiplication relies on the computation of forward number theoretic transform (FNTT) and inverse number theoretic transform (INTT), respectively. Therefore, this work presents a unified NTT hardware accelerator architecture to facilitate the polynomial multiplications of the CRYSTALS-Kyber PQC algorithm. Moreover, a unified butterfly unit design of Cooley–Tukey and Gentleman–Sande configurations is proposed to implement the FNTT and INTT operations using one adder, one multiplier, and one subtractor, sharing four routing multiplexers and one Barrett-based modular reduction unit. The critical path of the proposed butterfly unit is minimized using pipelining. An efficient controller is implemented for control functionalities. The simulation results after the post-place and -route step are provided on Xilinx Virtex-6 and Virtex-7 field-programmable gate array devices. Also, the proposed design is physically implemented for validation on Virtex-7 FPGA. The number of slices utilized on Virtex-6 and Virtex-7 devices is 398 and 312, the required number of clock cycles for one set of FNTT and INTT computations is 1410 and 1540, and the maximum operating frequency is 256 and 290 MHz, respectively. The average figure of merit (FoM), where FoM is the ratio of throughput to slices, illustrates 62% better performance than the most relevant NTT design from the literature.https://www.mdpi.com/2078-2489/16/1/17Number theoretic transform (NTT)CRYSTALS-Kyberhardware acceleratorPQCFPGA
spellingShingle Muhammad Rashid
Omar S. Sonbul
Sajjad Shaukat Jamal
Amar Y. Jaffar
Azamat Kakhorov
A Pipelined Hardware Design of FNTT and INTT of CRYSTALS-Kyber PQC Algorithm
Information
Number theoretic transform (NTT)
CRYSTALS-Kyber
hardware accelerator
PQC
FPGA
title A Pipelined Hardware Design of FNTT and INTT of CRYSTALS-Kyber PQC Algorithm
title_full A Pipelined Hardware Design of FNTT and INTT of CRYSTALS-Kyber PQC Algorithm
title_fullStr A Pipelined Hardware Design of FNTT and INTT of CRYSTALS-Kyber PQC Algorithm
title_full_unstemmed A Pipelined Hardware Design of FNTT and INTT of CRYSTALS-Kyber PQC Algorithm
title_short A Pipelined Hardware Design of FNTT and INTT of CRYSTALS-Kyber PQC Algorithm
title_sort pipelined hardware design of fntt and intt of crystals kyber pqc algorithm
topic Number theoretic transform (NTT)
CRYSTALS-Kyber
hardware accelerator
PQC
FPGA
url https://www.mdpi.com/2078-2489/16/1/17
work_keys_str_mv AT muhammadrashid apipelinedhardwaredesignoffnttandinttofcrystalskyberpqcalgorithm
AT omarssonbul apipelinedhardwaredesignoffnttandinttofcrystalskyberpqcalgorithm
AT sajjadshaukatjamal apipelinedhardwaredesignoffnttandinttofcrystalskyberpqcalgorithm
AT amaryjaffar apipelinedhardwaredesignoffnttandinttofcrystalskyberpqcalgorithm
AT azamatkakhorov apipelinedhardwaredesignoffnttandinttofcrystalskyberpqcalgorithm
AT muhammadrashid pipelinedhardwaredesignoffnttandinttofcrystalskyberpqcalgorithm
AT omarssonbul pipelinedhardwaredesignoffnttandinttofcrystalskyberpqcalgorithm
AT sajjadshaukatjamal pipelinedhardwaredesignoffnttandinttofcrystalskyberpqcalgorithm
AT amaryjaffar pipelinedhardwaredesignoffnttandinttofcrystalskyberpqcalgorithm
AT azamatkakhorov pipelinedhardwaredesignoffnttandinttofcrystalskyberpqcalgorithm