Efficient constructions for large‐state block ciphers based on AES New Instructions
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2022-05-01 |
Series: | IET Information Security |
Subjects: | |
Online Access: | https://doi.org/10.1049/ise2.12053 |
Summary: | Large‐state block ciphers with 256‐bit or 512‐bit block sizes have received much attention from the viewpoint of long‐term security. Existing large‐state block ciphers, such as Haraka‐v2 and Pholkos, consist only of AES New Instructions (AES‐NI) operations and a word shuffle that can be executed efficiently with SIMD instructions, which allows fast software implementation. In Haraka‐v2 and Pholkos, the AES round function is executed twice in parallel at each step and its outputs are then shuffled (a two‐round construction). This study explores optimal constructions, based on AES‐NI and efficient word shuffles, for such large‐state block ciphers in terms of software encryption speed. Specifically, from the class of word shuffles that can be implemented efficiently in SIMD, the authors identify an optimal subclass that achieves security in a smaller number of rounds, which contributes to improving the performance of large‐state block ciphers. The speed of these constructions is measured on each CPU architecture. As a result, the authors show that two‐round constructions are optimal on all CPUs with the Skylake architecture or later. Furthermore, the authors show that word shuffle instructions differ clearly in speed even when they theoretically require the same number of cycles. Consequently, the authors identify the optimal construction for each architecture by taking these differences into account. |
---|---|
ISSN: | 1751-8709 1751-8717 |