
Fast correlation attacks against stream ciphers and related open problems

Anne Canteaut
INRIA - projet Codes, B.P. 105, 78153 Le Chesnay cedex - France
Email: [email protected]

Abstract: Fast correlation attacks have been considerably improved recently, thanks to efficient decoding algorithms dedicated to very large linear codes used over a highly noisy channel. However, better adapting these techniques to the concrete stream ciphers involved is still an open issue.

I. INTRODUCTION

In an additive synchronous stream cipher, the ciphertext is obtained by adding bitwise the plaintext to a pseudo-random sequence called the keystream. This keystream is generated by a finite state automaton whose initial state is derived from the secret key, and usually from a public initial value, by a key-loading algorithm. At each time unit, the keystream digit produced by the generator is obtained by applying a filtering function to the current internal state. The internal state is then updated by a transition function. Both filtering function and transition function must be chosen carefully in order to make the underlying cipher secure. In particular, the filtering function must not leak too much information on the internal state and the transition function must guarantee that, for (almost) all initial states, the sequence formed by the successive internal states has a high period.

Stream ciphers are mainly devoted to applications which require either an exceptional encryption rate in software or an extremely low implementation cost in hardware (see e.g. [1]). These implementation constraints influence the design choices, especially for the transition function. Keystream generators can be divided into the following main families depending on the procedure used for updating the internal state:
• generators based on a linear transition function. A linear transition function seems to be a relevant choice for hardware implementation as soon as the filtering function breaks the inherent linearity. Amongst all possible linear transition functions, those based on linear feedback shift registers (LFSRs) are very popular because they are appropriate for low-cost hardware implementations, produce sequences with good statistical properties and can be easily analyzed.
• generators based on a nonlinear transition function.
The weaknesses resulting from the linearity of the transition function, especially the vulnerability to algebraic attacks [2], can be avoided by choosing a nonlinear transition mapping. However, for hardware-oriented stream ciphers, this function must guarantee that the generated sequence has a high period. This condition may be avoided when the size of the internal state is not limited by implementation constraints: in that case, the probability that a short cycle exists is very low because of the large size of the internal state (e.g. RC4 [3]). But, for hardware applications, the internal state cannot be much larger than the bound provided by time-memory-data tradeoff attacks, i.e., twice the key size. Therefore, theoretical results on the period of the sequence generated by the transition function are required. Only a few appropriate mappings can be used in this context, such as Feedback with Carry Shift Registers [4], Nonlinear Feedback Shift Registers, T-functions [5]...
• Hybrid transition functions. In some keystream generators, the internal state is split into two parts: the first one is updated linearly and the other one has a nonlinear behavior. When the nonlinear part is much smaller than the linear one, it is usually identified with internal memory; for instance, SNOW 2.0 [6] or E0 [7] are viewed as LFSR-based stream ciphers with memory. But, some keystream generators such as PANAMA [8] or MUGI [9] use linear and nonlinear parts of similar sizes.

From the cryptanalyst’s point of view, the trend towards splitting the internal state of the keystream generator into different parts suggests the use of divide-and-conquer attacks. The correlation attack, which was originally proposed by Siegenthaler against combination generators [10], applies when a part of the internal state is updated independently from the other ones and has a reasonable size. This attack has been greatly improved by Meier and Staffelbach [11], [12] when the target part of the internal state is updated linearly. In this case, efficient error-correcting decoding can be used in order to (partially) recover the initial state of the generator.

II. CORRELATION ATTACK

Here, we focus on binary keystream generators which can be described as follows. We denote by $x_t$ the $n$-bit internal state of the generator at time $t$. The filtering function $f$ is assumed to be a Boolean function of $n$ variables: at time $t$ the generator outputs a single bit, $s_t = f(x_t)$. In order to produce an unbiased sequence, $f$ must obviously be balanced, i.e., it must output 0 or 1 with probability 1/2. The transition function is denoted by $\Phi : \mathbf{F}_2^n \to \mathbf{F}_2^n$. Therefore, we have
\[
s_t = f(\Phi^t(x_0)),
\]
where $x_0$ is the initial state. We only consider the case where both the filtering function and the transition function are publicly known, i.e., independent from the secret key. We investigate known-plaintext attacks which aim at recovering the initial state $x_0$, which is in that sense identified with the key of the cipher. However, it must be pointed out that the initial state is usually computed from a shorter secret key and from a public initial value. Some additional information on $x_0$ can therefore be derived, especially in the context of related IVs attacks.

A correlation attack, as originally described by Siegenthaler against combination generators, can actually be mounted as soon as the $n$-bit internal state $x_t$ of the generator can be decomposed into two parts $y_t$ and $z_t$ of respective sizes $\ell$ and $n - \ell$, which are updated independently from each other, i.e.,
\[
(y_{t+1}, z_{t+1}) = (\Phi_1(y_t), \Phi_2(z_t)).
\]
The attack aims at recovering one of the parts of the initial state, e.g. $y_0$, called the target state. The attack applies if and only if there exists a Boolean function $g$ of $\ell$ variables which is correlated to the filtering function $f$. This equivalently means that there exists $g$ from $\mathbf{F}_2^\ell$ into $\mathbf{F}_2$ such that
\[
p_g = P_{Y,Z}[f(Y, Z) = g(Y)] > \frac{1}{2}
\]
where $Y$ and $Z$ are two independent random variables uniformly distributed in $\mathbf{F}_2^\ell$ and $\mathbf{F}_2^{n-\ell}$. If such a function $g$ exists, the target sequence $\sigma = (\sigma_t)_{t \geq 0}$ defined by
\[
\sigma_t = g(\Phi_1^t(y_0))
\]
is correlated to the keystream sequence $s = (s_t)_{t \geq 0}$. This correlation can be detected by computing the correlation between $N$ bits of the keystream and the corresponding bits of the target sequence $\sigma(y_0)$ generated from the initial state $y_0$:
\[
C(s, \sigma(y_0)) = \sum_{t=0}^{N-1} (-1)^{s_t \oplus \sigma_t(y_0)}.
\]
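As a rough illustration of the statistic $C(s, \sigma(y_0))$, the following Python sketch evaluates it for every candidate $y_0$ on a toy generator. Everything in it is an assumption made for the example and is not taken from the text: the two small LFSRs (recurrences $s_{t+3} = s_t \oplus s_{t+1}$ and $u_{t+4} = u_t \oplus u_{t+1}$), the filtering function $f(y, z) = y_1 \oplus z_1 z_2$, the choice $g(y) = y_1$, and the helper names `lfsr_bits`, `correlation`, `target_sequence` and `keystream`.

```python
from itertools import product

def lfsr_bits(taps, state, n):
    """Return the first n output bits of a toy LFSR (hypothetical helper).

    The output at each step is the first cell; the new last cell is the
    XOR of the cells whose positions are listed in `taps`.
    """
    state = list(state)
    out = []
    for _ in range(n):
        out.append(state[0])
        feedback = 0
        for i in taps:
            feedback ^= state[i]
        state = state[1:] + [feedback]
    return out

def correlation(s, sigma):
    """C(s, sigma) = sum over t of (-1)^(s_t XOR sigma_t)."""
    return sum(1 - 2 * (a ^ b) for a, b in zip(s, sigma))

TAPS = (0, 1)  # same feedback positions for both toy registers

def target_sequence(y0, n):
    """sigma_t = g(Phi_1^t(y0)), with g(y) = first cell of the 3-bit register."""
    return lfsr_bits(TAPS, y0, n)

def keystream(y0, z0, n):
    """s_t = f(y_t, z_t) = y_t[0] XOR (z_t[0] AND z_t[1])."""
    sigma = target_sequence(y0, n)
    u = lfsr_bits(TAPS, z0, n + 1)  # u[t], u[t+1] are the first two cells of z_t
    return [sigma[t] ^ (u[t] & u[t + 1]) for t in range(n)]

N = 105  # 7 * 15: whole periods of both toy registers
secret_y0, secret_z0 = (1, 0, 1), (0, 1, 1, 0)
s = keystream(secret_y0, secret_z0, N)

# Evaluate the statistic for each of the 2^3 candidate target states.
scores = {y0: correlation(s, target_sequence(y0, N))
          for y0 in product((0, 1), repeat=3)}
best_guess = max(scores, key=scores.get)
```

With these toy parameters the correct candidate scores 49 while each of the seven wrong candidates scores -7, so `best_guess` equals `secret_y0`.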

The expected value of this quantity is equal to $2N\left(p_g - \frac{1}{2}\right)$ when $y_0$ is the correct value of the target initial state. Therefore, the attack consists of an exhaustive search for the target $\ell$-bit part of the initial state $y_0$. For each possible value of $y_0$, the first $N$ bits of the corresponding sequence $\sigma(y_0)$ are computed and the correlation with the known keystream $s$ is evaluated. A right guess for $y_0$ can be distinguished from a wrong one by comparing $C(s, \sigma(y_0))$ to a given threshold. For a wrong guess, both sequences $s$ and $\sigma(y_0)$ are actually expected to be uncorrelated. This procedure can be seen as a basic statistical test for distinguishing two binary random sources: one distributed according to the uniform distribution, the other one according to the distribution of $X = f(Y, Z) \oplus g(Y)$, i.e., $P[X = 0] = p_g$ [13], [14]. When $p_g$ is close to $\frac{1}{2}$, the attack requires the knowledge of
\[
N = O\left(\left(p_g - \frac{1}{2}\right)^{-2}\right)
\]
keystream bits. The time complexity for recovering the $\ell$-bit target part of the initial state is therefore
\[
O\left(2^\ell \left(p_g - \frac{1}{2}\right)^{-2}\right).
\]

From the cryptanalyst’s point of view, this attack raises the question of the optimal choice for the function $g$. For the designer, the underlying problem consists in finding the filtering functions $f$ which make the attack infeasible. Both questions can be answered by computing the probability that an $\ell$-variable function $g$ coincides with $f$. This probability involves the distributions of the output of $f$ when its first $\ell$ inputs are fixed, namely $p_y = P_Z[f(y, Z) = 1]$. Actually, we have
\[
\begin{aligned}
p_g &= \frac{1}{2^\ell} \left[ \sum_{y \in g^{-1}(0)} (1 - p_y) + \sum_{y \in g^{-1}(1)} p_y \right] \\
    &= \frac{1}{2} + \frac{1}{2^\ell} \sum_{y \in \mathbf{F}_2^\ell} (-1)^{g(y)} \left( \frac{1}{2} - p_y \right). \qquad (1)
\end{aligned}
\]
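As a small worked example of equation (1), consider the following hypothetical function, chosen only for illustration and not taken from the text: $\ell = 2$, $n = 4$ and $f(y_1, y_2, z_1, z_2) = y_1 \oplus y_2 z_1 z_2$. Fixing $y = (y_1, y_2)$ and letting $Z$ be uniform gives
\[
p_{(0,0)} = 0, \quad p_{(0,1)} = \frac{1}{4}, \quad p_{(1,0)} = 1, \quad p_{(1,1)} = \frac{3}{4}.
\]
For the candidate $g(y) = y_1$, equation (1) yields
\[
p_g = \frac{1}{2} + \frac{1}{4} \left[ \left(\frac{1}{2} - 0\right) + \left(\frac{1}{2} - \frac{1}{4}\right) - \left(\frac{1}{2} - 1\right) - \left(\frac{1}{2} - \frac{3}{4}\right) \right] = \frac{1}{2} + \frac{1}{4} \cdot \frac{3}{2} = \frac{7}{8},
\]
which agrees with the direct count: $f$ and $g$ differ only when $y_2 z_1 z_2 = 1$, an event of probability $\frac{1}{8}$.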

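In the attack, $g$ must then be chosen such that $p_g$ is maximal; this situation occurs if and only if all terms in the above sum are positive, i.e., if
\[
g(y) = \begin{cases} 1 & \text{if } p_y > \frac{1}{2} \\ 0 & \text{if } p_y < \frac{1}{2} \end{cases}
\]
It follows that
\[
\max_g p_g = \frac{1}{2} + \frac{1}{2^\ell} \sum_{y \in \mathbf{F}_2^\ell} \left| \frac{1}{2} - p_y \right| .
\]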
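The optimal choice of $g$ and the resulting maximum can be checked by brute force on a small example. The sketch below is only an illustration under stated assumptions: the toy function `f` (with $\ell = 2$ and $n = 4$) and all helper names are hypothetical and do not come from the text. It compares the closed-form maximum with an exhaustive search over all $2^{2^\ell}$ functions $g$, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

ELL, REST = 2, 2  # l and n - l for the toy example

def f(y, z):
    """Toy balanced filtering function (an assumption for this example)."""
    return y[0] ^ (y[1] & z[0] & z[1])

YS = list(product((0, 1), repeat=ELL))
ZS = list(product((0, 1), repeat=REST))

def p_y(y):
    """p_y = P_Z[f(y, Z) = 1] for Z uniform over F_2^(n-l)."""
    return Fraction(sum(f(y, z) for z in ZS), len(ZS))

def p_g(g):
    """P[f(Y, Z) = g(Y)], with g given as a truth table {y: g(y)}."""
    agree = sum(f(y, z) == g[y] for y in YS for z in ZS)
    return Fraction(agree, len(YS) * len(ZS))

# Optimal choice: g(y) = 1 exactly when p_y > 1/2.
best_g = {y: int(p_y(y) > Fraction(1, 2)) for y in YS}

# Closed-form maximum: 1/2 + 2^(-l) * sum over y of |1/2 - p_y|.
half = Fraction(1, 2)
max_formula = half + sum(abs(half - p_y(y)) for y in YS) / len(YS)

# Exhaustive search over every Boolean function g of l variables.
brute_force_max = max(p_g(dict(zip(YS, values)))
                      for values in product((0, 1), repeat=len(YS)))
```

For this toy function the optimal approximation is $g(y) = y_1$ and both computations give $\frac{7}{8}$.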

Therefore, it clearly appears that the correlation attack can be prevented if the filtering function $f$ is such that its output remains uniformly distributed when its first $\ell$ input variables are fixed. Such functions are said to be resilient (or correlation-immune) with respect to their first $\ell$ variables. More generally, the correlation-immunity order of a function, defined by Siegenthaler [15], is the highest number of variables $\ell$ such that the output distribution of the function is unchanged when any $\ell$ inputs are fixed. In the special case of a combination generator where the inputs of $f$ correspond to the outputs of $m$ independent LFSRs, the minimum number of LFSRs which must be considered together in a correlation attack is $\ell + 1$ where $\ell$ is the correlation-immunity order of $f$.

III. FAST CORRELATION ATTACKS AS A DECODING PROBLEM

One major problem in correlation attacks is that they perform an exhaustive search for an entire part of the initial state, leading to a huge time complexity. The fast correlation attacks introduced by Meier and Staffelbach [12] considerably reduce the running time but require a longer segment of known keystream. They apply when the target sequence $\sigma$ generated by
\[
\sigma_t = g(\Phi_1^t(y_0))
\]
depends linearly on the $\ell$-bit target initial state $y_0$. In this case, any $N$-bit subsequence of $\sigma$ can be seen as a codeword of a linear code $\mathcal{C}$ of length $N$ and dimension $\ell$. The attack aims at recovering the codeword corresponding to $\sigma$ from the knowledge of $N$ consecutive keystream bits where $p = P[\sigma_t \neq s_t]$
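The linear-code view of the target sequence can be sketched as follows, assuming that the target part is a small LFSR and that $g$ selects its first cell. Both are hypothetical choices made for the illustration, as are the helper names. Row $i$ of the $\ell \times N$ generator matrix is the sequence produced from the $i$-th unit initial state, and $\sigma(y_0)$ is then the codeword $y_0 G$ over $\mathbf{F}_2$.

```python
from itertools import product

def lfsr_bits(taps, state, n):
    """Return the first n output bits of a toy LFSR (hypothetical helper)."""
    state = list(state)
    out = []
    for _ in range(n):
        out.append(state[0])
        feedback = 0
        for i in taps:
            feedback ^= state[i]
        state = state[1:] + [feedback]
    return out

TAPS, ELL, N = (0, 1), 3, 10  # toy recurrence s_{t+3} = s_t XOR s_{t+1}

def generator_matrix():
    """Row i is the N-bit sequence produced from the i-th unit initial state."""
    rows = []
    for i in range(ELL):
        unit = [int(j == i) for j in range(ELL)]
        rows.append(lfsr_bits(TAPS, unit, N))
    return rows

def encode(y0, G):
    """Codeword y0 * G over F_2."""
    return [sum(y0[i] & G[i][t] for i in range(ELL)) % 2 for t in range(N)]

G = generator_matrix()
codewords = {y0: encode(y0, G) for y0 in product((0, 1), repeat=ELL)}
```

Each codeword coincides with the sequence generated directly from the corresponding initial state, and the $2^\ell$ codewords are distinct because the first $\ell$ bits of the sequence are the initial state itself.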
