Efficient constructions for large‐state block ciphers based on AES New Instructions

Bibliographic Details
Main Authors: Rentaro Shiba, Kosei Sakamoto, Takanori Isobe
Format: Article
Language: English
Published: Wiley 2022-05-01
Series: IET Information Security
Online Access: https://doi.org/10.1049/ise2.12053
Description
Summary: Large‐state block ciphers with 256‐bit or 512‐bit block sizes are attracting attention from the viewpoint of long‐term security. Existing large‐state block ciphers, such as Haraka‐v2 and Pholkos, consist only of the AES New Instructions set (AES‐NI) and a word shuffle that SIMD instructions execute efficiently, which enables fast software implementation. In Haraka‐v2 and Pholkos, the AES round function is executed twice in parallel at each step and its outputs are shuffled (called two‐round constructions). This study explores constructions based on AES‐NI and efficient word shuffles that are optimal in terms of software encryption speed for such large‐state block ciphers. Specifically, from the class of word shuffles that can be implemented efficiently with SIMD instructions, the authors identify an optimal subclass that achieves security in a smaller number of rounds, thereby improving the performance of large‐state block ciphers. The speed of the constructions is then measured on each CPU architecture. As a result, the authors show that two‐round constructions, in which two rounds of the AES round function are executed in parallel at each step and the outputs are then shuffled, are optimal on all CPUs with the Skylake architecture or later. Furthermore, the authors show that word shuffle instructions differ clearly in speed even when they theoretically require the same number of cycles. Consequently, the authors determine the optimal construction for each architecture by taking these differences into account.
ISSN: 1751-8709
1751-8717
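
The step structure that the summary calls a two‐round construction can be sketched in a few lines of Python. This is only a structural illustration under stated assumptions, not the authors' construction: `toy_round` is a hypothetical stand-in for one AES round (it is not AES and has no cryptographic strength), the 256-bit state is modelled as eight 32-bit words, and `EXAMPLE_SHUFFLE` is an illustrative word permutation that interleaves the two blocks in the way the SSE2 intrinsics `_mm_unpacklo_epi32` and `_mm_unpackhi_epi32` would. It is not claimed to be one of the shuffles identified in the article.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, r):
    """Rotate a 32-bit word left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def toy_round(block, round_key):
    """Hypothetical stand-in for one AES round on a 4-word (128-bit) block.

    NOT AES: it only mixes the words and XORs in a round key, so the
    overall shape (keyed round applied to a block) matches the real thing.
    """
    a, b, c, d = block
    a ^= rotl32(b, 1)
    b ^= rotl32(c, 3)
    c ^= rotl32(d, 5)
    d ^= rotl32(a, 7)
    return [w ^ k for w, k in zip((a, b, c, d), round_key)]


# Illustrative shuffle (assumption): output word i is input word EXAMPLE_SHUFFLE[i].
# Words 0-3 belong to block 0 and words 4-7 to block 1.
EXAMPLE_SHUFFLE = [0, 4, 1, 5, 2, 6, 3, 7]


def word_shuffle(words, perm):
    """Permute 32-bit words across the whole state."""
    return [words[p] for p in perm]


def two_round_step(state, round_keys, perm=EXAMPLE_SHUFFLE):
    """One step: two rounds on each 128-bit block, then a word shuffle.

    state      -- list of 8 words (two 128-bit blocks)
    round_keys -- round_keys[r][b] is the 4-word key for round r of block b
    """
    blocks = [state[0:4], state[4:8]]
    for r in range(2):
        # The blocks are independent within a round, so real code can
        # issue the AES-NI instructions for them in parallel.
        blocks = [toy_round(blk, round_keys[r][b]) for b, blk in enumerate(blocks)]
    return word_shuffle(blocks[0] + blocks[1], perm)


zero_keys = [[[0] * 4, [0] * 4], [[0] * 4, [0] * 4]]
example_out = two_round_step(list(range(8)), zero_keys)
```

In this sketch the shuffle is the only operation that moves data between the two blocks, which is why the choice of word shuffle determines how many steps are needed before every block influences every other.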