
Achieving side-channel high-order correlation immunity with Leakage Squeezing

Claude Carlet (1), Jean-Luc Danger (2), Sylvain Guilley (3), Houssem Maghrebi (4), and Emmanuel Prouff (5)

(1) Claude Carlet, LAGA, University of Paris VIII and University of Paris XIII UMR 7539, CNRS, Department of Mathematics, 2 rue de la liberté, 93 526 Saint-Denis Cedex, FRANCE. [email protected]
(2) Jean-Luc Danger, Institut MINES-TELECOM / TELECOM-ParisTech, CNRS LTCI (UMR 5141), Department COMELEC, 46 rue Barrault, 75 634 Paris Cedex 13, FRANCE, and Secure-IC S.A.S., 80 avenue des Buttes de Coësmes, 35 700 Rennes, FRANCE. [email protected]
(3) Sylvain Guilley, Institut MINES-TELECOM / TELECOM-ParisTech, CNRS LTCI (UMR 5141), Department COMELEC, 37/39 rue Dareau, 75 014 Paris, FRANCE, and Secure-IC S.A.S., 80 avenue des Buttes de Coësmes, 35 700 Rennes, FRANCE. [email protected]
(4) Houssem Maghrebi, Morpho-Safran, 18, chaussée Jules César, 95 520 Osny, FRANCE. [email protected]
(5) Emmanuel Prouff, Agence Nationale de la Sécurité des Systèmes d’Information, 51 boulevard de La Tour-Maubourg, 75 700 Paris 07 SP, FRANCE. [email protected]

Abstract. This article analyses in depth high-order (HO) Boolean masking countermeasures against side-channel attacks, in contexts where the shares are manipulated simultaneously and the correlation coefficient is used as a statistical distinguisher. Such attacks are sometimes referred to as zero-offset High-Order Correlation Power Analysis (HO-CPA). In particular, the main focus is to get the most out of a single mask (i.e., for masking schemes with two shares). The relationship between the leakage characteristics and the attack efficiency is thoroughly studied. Our main contribution is to link the minimum attack order (called HO-CPA immunity) to the amount of information leaked. Interestingly, the HO-CPA immunity can be much larger than the number of shares in the masking scheme. This is made possible by leakage squeezing, a variant of Boolean masking where the masks are suitably recoded by bijections. This technique, and others from the state of the art (namely leak-free masking and wire-tap codes), are surveyed and put in perspective.

This paper has appeared in the Journal of Cryptographic Engineering (Springer), volume 4, issue 2, at pages 107-121, DOI: 10.1007/s13389-013-0067-1 (see also [13]).


Keywords: High-Order Masking, High-Order Correlation Power Analysis (HO-CPA), High-Order CPA Immunity (HCI), Mutual Information Metric (MIM), Leakage Squeezing.

1 Introduction

Masking [16,25] is a countermeasure against observation attacks, also known as side-channel attacks (SCA), that is suitable for both hardware and software cryptographic implementations. It consists in changing the variable representation into randomized shares [16], and can thus be qualified as a logical countermeasure. Notably, masking does not rely on specific hardware properties, as opposed to dual-rail protection [37, Chp. 7], which demands some physical indiscernibility. However, the level of security provided does depend on some hardware specificities (noise, glitches, etc.). Moreover, masked implementations can always be attacked in theory, since the tuple of all the shares does unambiguously leak information about the sensitive variable (nota bene: this was first shown by Thomas Messerges). In practice, however, the difficulty of performing an attack involving several shares increases exponentially with the number of shares, the base of the exponential being the variance of the noise. This was first highlighted in Chari et al.’s paper [16], then complemented by Prouff and Rivain at EUROCRYPT 2013 [45], and confirmed several times by real experiments. As a consequence, and as already observed in many papers, it makes sense to define the security of a masked implementation as the number of shares per internal variable. A scheme where every intermediate variable is split into d + 1 shares is called a dth-order masking scheme. For such a scheme, we usually say that security is reached at order d if and only if any combination of d shares conveys no information about the sensitive variables.

Admittedly, computing with d + 1 shares without revealing information from any set of d intermediate values is difficult. In fact, many purported solutions have been defeated [19,44,47]. One sound solution, based on secure schemes to process additions and multiplications in a field, has been put forward recently in [50] and generalized in [15]. Another solution, based on extending the principle of look-up table recomputation introduced in [30], also exists [18]. They can be applied either in software or in hardware, and their security is stated under the two following assumptions:

– H1: only the manipulated values leak, and


– H2: manipulated values leak at different times without interfering.

These assumptions impose special constraints on the hardware. For instance, it must be ascertained that the hardware does not cache the values. One example of variable caching is actually inherent to the leakage of CMOS technology: the leakage is caused by the transitions and hence involves two consecutive variables (see e.g. [6]). Another example is the fork of a value, which causes it to be latched in two or more different registers. A sensitive value can thus show up many times during the computation, and the hypothesis of independent execution contexts does not hold either. These two examples highlight the importance of analysing the hardware. In particular, it must be carefully checked against caching and forking interferences when applying the existing masking schemes. Alternatively, the hardware can be designed to avoid them, for instance by enforcing a variable “wainscoting” (i.e., physical separation), as proposed at the gate level (for masked logic styles) in [22], or more recently at the algorithmic level in [42,48]. This is the point of view we adopt in the sequel. More precisely, here are the hardware constraints we set to satisfy the assumptions H1 and H2:

– The leakage comes mainly from the register updates. To ensure that the computation part does not leak (via glitches that carry information about a sensitive variable [38]), the designer can decide to tabulate it in memories. For instance, in an FPGA design, the operations on the data stored in registers can be implemented by accesses to read-only memories, termed BRAM (Block RAM). As characterized for instance on Xilinx FPGAs [5], memories leak only their input and output data. But those data are exactly the ones that are already in a register (input) and that will be updated in another register (output). The paper [21] shows, still on Xilinx FPGAs, that a full-fledged AES can be implemented mostly with BRAM (and DSP, which can easily be traded for BRAMs). Based on these remarks, we will consider in the sequel only the leakage from the state registers.
– The decoupling of variables (especially those belonging to the same sharing of a variable) is ensured by the storage of each share in a distinct hardware resource.

In this article, we focus on a scenario where all the shares leak at the same time. Finally, we notice that in hardware, some further simplifying hypotheses can be made:



– The leakage is additive, and the designer can arrange that each manipulated bit leaks in the same way. This means that the Hamming weight/distance model is suitable.
– The algorithmic noise is large, because the many variables processed in parallel with the targeted resource are unrelated to the attack, and thus act as independent noise sources that can be modeled as a binomial law.

Contribution of the Paper. Masking countermeasures are customarily characterized by the number of shares into which sensitive variables are split. The purpose of this paper is to look at them in more detail (with assumptions on the leakage), and to explain that the minimum order of a successful attack can be made greater than the number of shares. This is achieved by an encoding of the shares, termed leakage squeezing. Additionally, we link the minimum attack order to the amount of information leaked (Theorem 1).

Outline. The rest of the paper is organized as follows. A brief overview of masking theory is given in Sec. 2. The notion of minimum HO-CPA attack order (called HO-CPA immunity) is related to the amount of information leaked (even independently of any masking scheme) in Sec. 3. An extensive overview of the known techniques to reduce the leakage and to increase the HO-CPA immunity (noted HCI) is given in Sec. 4. Then, Sec. 5 provides a concrete example of HO-CPA immunity optimization thanks to a “leakage squeezing”, followed by some simulation results in Sec. 6. Finally, Sec. 7 concludes the paper and opens some further research perspectives. The value of a typical signal-to-noise ratio (SNR) of a side-channel signal measured on an FPGA is given in appendix A.

2 State-of-the-art: Masking

2.1 Definitions and Modelling

We use capital letters (e.g. $X$) for random variables (RVs), and small letters (e.g. $x$) for their realizations. Moreover, we denote by $\mathcal{X}$ the support of RV $X$. The probability that $X$ is equal to $x$ is noted $P[X = x]$, or $P[x]$ when there is no ambiguity. Scalar (resp. vectorial) variables are noted in “thin” (resp. “bold”) font. A $d$th-order masking scheme consists in splitting a sensitive variable $Z \in \mathbb{F}_2^n$ (that can be deduced from either the plaintext or the ciphertext through a few sub-key hypotheses) into $d + 1$ random shares, noted $S = (S_i)_{i \in \llbracket 0, d \rrbracket}$, in such a way that the relation $S_0 \perp \cdots \perp S_d = Z$ is satisfied for a group operation $\perp$ (e.g. the XOR operation in additive Boolean masking). We recall hereafter the definition of the masking soundness.

Definition 1. (masking $d$th-order soundness) The masking is sound at $d$th-order if:
– $Z$ can be deterministically reconstructed knowing the $d + 1$ shares, while
– no information about $Z$ can be extracted from the knowledge of strictly less than $d + 1$ shares.

In order to study the resistance of a masking scheme in a SCA context, one usually associates each share with a noisy observation of it, modeled by a noisy function $\ell_i : X \mapsto f_i(X) + N_i$, where $N_i$ is an independent Gaussian noise and $f_i$ is a deterministic but unknown function, sometimes approximated by the Hamming weight. For the sake of simplicity, we assume that $N_i$ is a centered Gaussian RV of standard deviation $\sigma_i$. We denote by $L_i$ the RV $\ell_i(S_i)$ and summarize by $L$ the tuple $(L_i)_{i \in \llbracket 0, d \rrbracket}$. This definition forbids the modelling of glitches (as those put forward in [38]), which can be unexpected functions of different shares $S_i$ and $S_j$ ($i \neq j$). But it complies with our strategy of allocating one dedicated hardware resource to each share; in addition, the usage of block RAMs removes glitches. This strategy is illustrated by the snapshots from the Cadence SOC-Encounter place-and-route tool in Fig. 1. The left-hand side shows that, without placement constraints, the registers that hold the two shares of a first-order masked sbox can be intertwined in the circuit, hence being coupled. But if the shares are constrained to be placed at different locations, then they remain clearly separated in space, as shown in the right-hand side of Fig. 1. Incidentally, the physical separation helps to make the device’s leakage be the sum of the $L_i$ (linear, no coupling). When all the shares are manipulated at the same time, the attacker observes $C_{\mathrm{device}}(L)$, typically the sum of all the $L_i$.
In this case, $C_{\mathrm{device}}(L) \mid S$ follows a normal law, of variance equal to:

\[ \mathrm{Var}\Big[\sum_{i=0}^{d} N_i\Big] = \sum_{i=0}^{d} \mathrm{Var}[N_i] = \sum_{i=0}^{d} \sigma_i^2 \doteq \sigma^2 . \]

This is equivalent to saying that

\[ C_{\mathrm{device}}(L) = \sum_{i=0}^{d} f_i(S_i) + N , \]

where $N \doteq \sum_{i=0}^{d} N_i \sim \mathcal{N}(0, \sigma^2)$. After collecting the side-channel leakage, the attacker can apply a pre-processing of her choice before performing any statistical analysis. It is denoted by $C_{\mathrm{attacker}} : \mathbb{R} \to \mathbb{R}$. Finally, the exploited leakage is the RV $C_{\mathrm{total}}(L) \doteq C_{\mathrm{attacker}}(C_{\mathrm{device}}(L))$. Thus, the function $C_{\mathrm{total}} = C_{\mathrm{attacker}} \circ C_{\mathrm{device}}$ can be developed as a polynomial in $\mathbb{R}[L_0, \cdots, L_d] = \mathbb{R}[L]$. The reason is that any pseudo-Boolean function (i.e. a function $\mathbb{F}_2^n \to \mathbb{R}$) can be written uniquely as a multi-linear polynomial. This polynomial takes the form:

\[ C_{\mathrm{total}} = \sum_{\alpha = (\alpha_0, \cdots, \alpha_d) \in \mathbb{N}^{d+1}} a_\alpha L^\alpha , \]

where $L^\alpha$ denotes the monomial term $\prod_{i=0}^{d} L_i^{\alpha_i}$ and $a_\alpha$ is a real coefficient. We recall that the degree $d^\circ$ of a multivariate polynomial is defined as:

\[ d^\circ = \max \Big\{ \sum_{i=0}^{d} \alpha_i , \ \forall \alpha \in \mathbb{N}^{d+1} \text{ such that } a_\alpha \neq 0 \Big\} . \]

That is, $d^\circ$ is the greatest sum of exponents amongst all the monomials that make up the polynomial.

[Figure 1: two place-and-route snapshots, automatic placement (left) and manually constrained placement (right).]

Fig. 1. One eight-bit masked sbox, whose input register (the two shares are highlighted on the one hand in blue color and right-hatched, and on the other hand, in red color and left-hatched) is automatically placed (left) or manually constrained (right).

2.2 Notation and Basics on Statistics, with Application to HO-CPA

We introduce some notation which will be useful in the sequel. We denote by $\mu_z$ and $\sigma_z^2$ the mean and the variance of the conditional RV $C_{\mathrm{total}}(L) \mid Z = z$. We call $\mu_{\mathrm{tot}} = \sum_z P[Z = z] \cdot \mu_z$ the mean of $C_{\mathrm{total}}(L)$. The total variance $\sigma_{\mathrm{tot}}^2$ of $C_{\mathrm{total}}(L)$ decomposes into the sum of inter- and intra-class variance, denoted by $\sigma_{\mathrm{inter}}^2$ and $\sigma_{\mathrm{intra}}^2$ respectively. Those quantities are defined using the law of total variance as:

\[ \sigma_{\mathrm{inter}}^2 \doteq \sum_z P[Z = z] \cdot (\mu_z - \mu_{\mathrm{tot}})^2 \quad \text{and} \quad \sigma_{\mathrm{intra}}^2 \doteq \sum_z P[Z = z] \cdot \sigma_z^2 . \]

In the presence of countermeasures, the central moment $\mu_i(C_{\mathrm{total}}(L) \mid Z = z) \doteq E[(C_{\mathrm{total}}(L) - \mu_{\mathrm{tot}})^i \mid Z = z]$ of order $i$ can be constant with respect to $z$. In practice, the attacker will typically try to compute the moments starting from low orders $i \geq 1$, because their estimation is less affected by the measurement noise.

2.3 Information-Theoretic Characterization of the Masking

In order to characterize the polynomial $C_{\mathrm{total}}(L)$ from an information-theoretic point of view, we introduce the notion of multivariate degree (see footnote 6). The multivariate degree $d_{\mathrm{alg}}$ of a monomial $\prod_{i=0}^{d} L_i^{\alpha_i}$ is equal to the number of non-zero exponents $\alpha_i$. We also emphasize that the multivariate degree never exceeds the degree.

Footnote 6. In the context of polynomials in variables $L_0, \cdots, L_d$ over the field $K$ (e.g. $K = \mathbb{R}$), our definition of multivariate degree coincides with the “usual” degree of polynomials in the algebra $K[L_0, \cdots, L_d] / (\prod_{i=0}^{d} L_i^2 - L_i)$, also sometimes called the algebraic degree.

We start with the following basic lemma:

Lemma 1 (Soundness and mutual information). If a masking scheme is $d$th-order sound, then the mutual information between any monomial in $\mathbb{R}[L]$, of multivariate degree lower than or equal to $d$, and $Z$ is null.

Proof. If a masking scheme is $d$th-order sound, then $I[Z; (S_i)_{i \in I}] = 0$ if $\# I \leq d$. Now, for any function $\psi$, $I[Z; (S_i)_{i \in I}] \geq I[Z; \psi((S_i)_{i \in I})]$ (data processing inequality). So, if $\psi$ is taken as a monomial in $(S_i)_{i \in \llbracket 0, d \rrbracket}$ of multivariate degree less than or equal to $d$, we have $I[Z; \psi((S_i)_{i \in I})] \leq 0$, hence $I[Z; \psi((S_i)_{i \in I})] = 0$. □

As a consequence of Lemma 1, for any sound $d$th-order masking scheme, every moment of $L$ of order lower than or equal to $d$ is constant. Hence, an attacker may attempt to apply the following strategy: choose a $C_{\mathrm{attacker}}$ such that $C_{\mathrm{total}}$ is of multivariate degree strictly greater than $d$. In other words, the adversary must choose $C_{\mathrm{attacker}}$ such that the multivariate degree $d_{\mathrm{alg}}(C_{\mathrm{total}})$ of $C_{\mathrm{total}}$ is at least $d + 1$, while the (regular) degree $d^\circ(C_{\mathrm{total}})$ must not be too high, otherwise the SCA attack efficiency decreases [16,45].
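As a hedged illustration of Lemma 1, the following Python sketch enumerates a first-order Boolean masking ($d = 1$) exactly. It assumes a noise-free Hamming weight leakage on a 4-bit word and a uniform mask; the word size and the helper names (`N_BITS`, `expectation_given_z`, `depends_on_z`) are choices made for this example and do not come from the paper. It only checks conditional expectations, which is a consequence of the null mutual information rather than a proof of it.

```python
from fractions import Fraction

N_BITS = 4  # hypothetical word size, chosen only to keep the enumeration small


def hw(x):
    """Hamming weight of an integer."""
    return bin(x).count("1")


def expectation_given_z(monomial, z):
    """E[monomial(L0, L1) | Z = z] with L0 = HW(z xor m), L1 = HW(m), m uniform."""
    masks = range(1 << N_BITS)
    total = sum(monomial(hw(z ^ m), hw(m)) for m in masks)
    return Fraction(total, 1 << N_BITS)


def depends_on_z(monomial):
    """True if the conditional expectation takes at least two values over z."""
    values = {expectation_given_z(monomial, z) for z in range(1 << N_BITS)}
    return len(values) > 1


# Monomials of multivariate degree 1 (a single share is involved).
single_share = [
    lambda l0, l1: l0,
    lambda l0, l1: l1,
    lambda l0, l1: l0 ** 3,
    lambda l0, l1: l1 ** 2,
]
# Monomial of multivariate degree 2 (both shares are involved).
both_shares = lambda l0, l1: l0 * l1

print([depends_on_z(f) for f in single_share])
print(depends_on_z(both_shares))
```

Under these assumptions, the monomials involving a single share have the same expectation for every $z$, whereas $E[L_0 L_1 \mid Z = z]$ equals $5 - \mathrm{HW}(z)/2$ and therefore reveals the Hamming weight of $z$.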

2.4 Our Goal

Considering Lemma 1, given $C_{\mathrm{device}}$, it seems sufficient from an attacker’s viewpoint to choose $C_{\mathrm{attacker}}$ such that $C_{\mathrm{total}} = C_{\mathrm{attacker}} \circ C_{\mathrm{device}}$ contains at least one term depending on all $d + 1$ leakages $L_i$. In this paper, we show that it is possible to devise $d$th-order masking schemes such that any such term of multivariate degree $d + 1$ of $C_{\mathrm{device}}$ does not give enough information on $Z$ for an attack to succeed, whatever the choice of $C_{\mathrm{attacker}}$. In Sec. 5, we give examples where a $d$th-order masking scheme:
– is not defeated even if the polynomial $C_{\mathrm{total}}(L)$ has a monomial of maximal multivariate degree $d + 1$, hence
– is not defeated by a $(d + 1)$th-order CPA.

Thus the relationship between the number of masks and the order of the first HO-CPA to work is not trivial. In the next section, we formally prove this relationship.

3 HO-CPA Attacks and HO-CPA Immunity

There are two ways to address the security evaluation of a countermeasure [53]:
1. Estimate the efficiency of existing attacks (using metrics such as the success rate or the guessing entropy). Basically, the main higher-order attacks are HO-CPA [46], HO-MIA [2], template attacks [17] and the stochastic attack [52].
2. Estimate the leakage with information-theoretic metrics, such as the mutual information between the leakage (observations) and the sensitive data.

In Subsections 3.1 and 3.2, we introduce a new metric that jointly covers the strength of the attacks and the amount of leaked information. In the sequel, when mentioning HO-CPA attacks, we mean the univariate attacks that focus on a higher-order moment of the leakage.

3.1 HO-CPA Immunity

In this subsection, we define the notion of HO-CPA immunity to quantify the difficulty of an attack.

Definition 2. The HO-CPA immunity of RV $C_{\mathrm{total}}(L)$ is the order of the smallest (central) moment of $C_{\mathrm{total}}(L)$ which is dependent on $Z$.
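Definition 2 can be explored numerically on toy examples. The sketch below is only an illustration under stated assumptions: a noise-free Hamming weight leakage, a 4-bit word, a uniform $Z$ and a uniform mask, and exact rational arithmetic. The function names (`unprotected`, `boolean_masking`, `hci`) are hypothetical and introduced for this example.

```python
from fractions import Fraction

N_BITS = 4  # hypothetical word size for the illustration


def hw(x):
    """Hamming weight of an integer."""
    return bin(x).count("1")


def unprotected(z, m):
    """Leakage of an unmasked register: the mask is ignored."""
    return hw(z)


def boolean_masking(z, m):
    """Both shares Z xor M and M leak simultaneously (zero-offset setting)."""
    return hw(z ^ m) + hw(m)


def hci(leakage, max_order=8):
    """Smallest order whose conditional central moment (about mu_tot) varies with z."""
    values = range(1 << N_BITS)
    size = 1 << N_BITS
    mu_tot = Fraction(sum(leakage(z, m) for z in values for m in values), size * size)
    for order in range(1, max_order + 1):
        moments = {
            sum((leakage(z, m) - mu_tot) ** order for m in values) / size
            for z in values
        }
        if len(moments) > 1:
            return order
    return None  # no dependency found up to max_order


print(hci(unprotected), hci(boolean_masking))
```

With these assumptions the unprotected register is reached at order 1 and the two-share Boolean masking at order 2, in line with the discussion that follows the definition.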


The HO-CPA immunity of $C_{\mathrm{total}}$ is denoted by HCI in the following. The minimal value of the HO-CPA immunity is 1, and it is reached when the distributions of the RV $C_{\mathrm{total}}(L) \mid Z = z$ do not have the same mean when $z$ varies. This is the case of unprotected circuits, for which a first-order CPA works. The HO-CPA immunity is larger than or equal to 2 when the distributions are balanced (i.e. $\mu_z = \mu_{\mathrm{tot}}$ for every $z$). In this case, the inter-class variance is null and the total variance $\sigma_{\mathrm{tot}}^2$ is equal to the intra-class variance $\sigma_{\mathrm{intra}}^2 = \sum_z P[Z = z] \times \mu_2(C_{\mathrm{total}}(L) \mid Z = z)$. If the central moments $\mu_2(C_{\mathrm{total}}(L) \mid Z = z)$ are not all equal, then HCI = 2 and a second-order CPA using the moments of order 2 is possible.

The motivation of the HO-CPA immunity definition is thus straightforward. By Definition 2, all HO-CPA using the moments of order $i <$ HCI will fail, because these moments are independent of $Z$. Thus the HO-CPA immunity is equal to the smallest order of the moments for which an HO-CPA attack can be successful.

3.2 Link Between $I[C_{\mathrm{total}}(L); Z]$ and the HO-CPA Immunity

HO-CPA exploits linear dependencies between the RVs $C_{\mathrm{total}}(L)$ and $Z$. Unless the RVs $C_{\mathrm{total}}(L) \mid Z = z$ are identically distributed for every $z$, the mutual information $I[C_{\mathrm{total}}(L); Z]$ will be non-zero. There is no such notion of “order” for MIA. Nonetheless, we show in the following theorem that HCI is also relevant to quantify the efficiency of a mutual information attack with respect to the leakage noise. In terms of mutual information, the impact of the noise $N = \sum_{i \in \llbracket 0, d \rrbracket} N_i$ is quantified by Theorem 1.

Theorem 1. Let $\sigma$ denote the standard deviation of the noise $N$. The mutual information $I[C_{\mathrm{total}}(L); Z]$ behaves as $O(\sigma^{-2 \times \mathrm{HCI}})$ when $\sigma$ tends towards infinity.

Remark 1. Theorem 1 holds only asymptotically when $\sigma$ tends to infinity. Nonetheless, as will be noticed in Fig. 5, the relationship between the logarithm of the mutual information between the leakage and the sensitive variable, and the noise standard deviation ($\sigma$), becomes almost affine from $\sigma \geq 4$, i.e. much less than the $\sigma \approx 14$ found in our appendix A.

To prove the theorem, we recall the notion of cumulants of the RV $X$, denoted by $k_i(X)$, which are the coefficients of the Taylor series of the function

\[ t \in \mathbb{R} \mapsto \ln(E(\exp(t \cdot X))) = \sum_{i=0}^{+\infty} k_i(X) \frac{t^i}{i!} . \tag{1} \]

The proof will use the following lemma.

Lemma 2. If $C_{\mathrm{total}}(L)$ has an HO-CPA immunity equal to HCI, then for every $i$ in $\llbracket 0, \mathrm{HCI} \llbracket$ and every $z$ in $\mathbb{F}_2^n$ we have $k_i(C_{\mathrm{total}}(L) \mid Z = z) = k_i(C_{\mathrm{total}}(L))$.

Proof. First of all, we notice that $\forall i \in \llbracket 0, \mathrm{HCI} \llbracket$, the cumulants $k_i(C_{\mathrm{total}}(L) \mid Z = z)$ are equal for every $z \in \mathbb{F}_2^n$. The reason is that for any law $X$, $k_j(X)$ can be expressed as a function of $\mu_i(X)$ for $0 \leq i \leq j$ (and reciprocally). For instance $k_3(X) = \mu_3(X)$, $k_4(X) = \mu_4(X) - 3\mu_2^2(X)$, $k_5(X) = \mu_5(X) - 10\mu_3(X)\mu_2(X)$, etc. Generally speaking, the relationship is

\[ k_i = \mu_i - \sum_{j=1}^{i-1} \binom{i-1}{j-1} k_j \, \mu_{i-j} . \]

Now, according to Definition 2, if $C_{\mathrm{total}}(L)$ has HO-CPA immunity HCI, then all the moments $\mu_i(C_{\mathrm{total}}(L) \mid Z = z)$ for $0 \leq i < \mathrm{HCI}$ are independent of $z$. Consequently, the same holds for the cumulants of orders $i \in \llbracket 0, \mathrm{HCI} \llbracket$. Finally, as $\forall i < \mathrm{HCI}$, $\forall z$, $\mu_i(C_{\mathrm{total}}(L) \mid Z = z) = \mu_i(C_{\mathrm{total}}(L))$, we also have $k_i(C_{\mathrm{total}}(L) \mid Z = z) = k_i(C_{\mathrm{total}}(L))$. □

We also need the following lemma.

Lemma 3 (Known as “small cumulant approximation to the Kullback-Leibler divergence”). Let $P$ and $Q$ be two distributions close to the standard normal distribution. Then, the small cumulant approximation to the Kullback-Leibler divergence $D_{KL}[P; Q]$ writes

\[ D_{KL}[P; Q] = \frac{1}{2} \sum_{i \geq 1} \frac{1}{i!} \left( k_i(P) - k_i(Q) \right)^2 . \tag{2} \]

Proof (Sketch). This lemma has been proved by Cardoso in Section 4 at page 1194 of [10]. The proof relies on the Gram-Charlier expansion around a standard normal reference distribution. In Cardoso’s demonstration, multivariate distributions for $P$ and $Q$ are considered; $P$ and $Q$ are $n$-dimensional probability density functions close to an $n$-dimensional standard Gaussian. Therefore, it involves cross-cumulants, i.e., cumulants between at least two different random variables. In our case, $P$ and $Q$ are scalar, thus cross-cumulants simplify to cumulants, as defined in Eqn. (1). □
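The recursion between central moments and cumulants used in the proof of Lemma 2 can be sketched in a few lines of Python. This is a minimal illustration, assuming a centered variable (so that $\mu_0 = 1$, $\mu_1 = 0$ and $k_1 = 0$); the function name `cumulants_from_central_moments` is hypothetical.

```python
from math import comb


def cumulants_from_central_moments(mu):
    """Apply k_i = mu_i - sum_{j=1}^{i-1} C(i-1, j-1) * k_j * mu_{i-j}.

    mu[i] is the central moment of order i, with mu[0] = 1 and mu[1] = 0.
    The returned list holds k[i] at index i (k[0] and k[1] are left at 0).
    """
    k = [0] * len(mu)
    for i in range(2, len(mu)):
        k[i] = mu[i] - sum(
            comb(i - 1, j - 1) * k[j] * mu[i - j] for j in range(1, i)
        )
    return k


# Central moments of a standard Gaussian up to order 6: 1, 0, 1, 0, 3, 0, 15.
print(cumulants_from_central_moments([1, 0, 1, 0, 3, 0, 15]))

# Arbitrary values, to compare with k4 = mu4 - 3*mu2^2 and k5 = mu5 - 10*mu3*mu2.
print(cumulants_from_central_moments([1, 0, 2, 3, 20, 50]))
```

For the Gaussian input, every cumulant of order strictly greater than two comes out as zero, which is the property invoked later in the proof of Theorem 1 when the noise is added.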


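We give hereafter the proof of Theorem 1.

Proof. The mutual information between $C_{\mathrm{total}}(L)$ and $Z$ can be computed as follows:

\[ I[C_{\mathrm{total}}(L); Z] = E_Z\big[ D_{KL}[C_{\mathrm{total}}(L) \mid Z \, ; \, C_{\mathrm{total}}(L)] \big] = \sum_z P[Z = z] \cdot D_{KL}[C_{\mathrm{total}}(L) \mid Z = z \, ; \, C_{\mathrm{total}}(L)] . \tag{3} \]

Under the Gaussian assumption, $C_{\mathrm{total}}(L)$ is approximately distributed like $\mathcal{N}(\mu_{\mathrm{tot}}, \sigma_{\mathrm{tot}}^2 + \sigma^2)$ and $(C_{\mathrm{total}}(L) \mid Z = z)$ like $\mathcal{N}(\mu_z, \sigma_z^2 + \sigma^2)$. We distinguish two cases: HCI ≤ 2, and HCI > 2.

Case HCI ≤ 2. The Kullback-Leibler divergence of two Gaussians $P_j \sim \mathcal{N}(\mu_j, \sigma_j^2)$ ($j \in \{1, 2\}$) has an analytic closed form; it is equal to

\[ D_{KL}[P_1; P_2] = \frac{1}{2} \left( \frac{(\mu_1 - \mu_2)^2}{\sigma_2^2} + \frac{\sigma_1^2}{\sigma_2^2} - 1 - \log_2 \frac{\sigma_1^2}{\sigma_2^2} \right) . \]

Therefore, the mutual information (refer to Eqn. (3)) is equal to:

\[ \begin{aligned} I[C_{\mathrm{total}}(L); Z] &= \frac{1}{2} \sum_z P[z] \left( \frac{(\mu_z - \mu_{\mathrm{tot}})^2}{\sigma_{\mathrm{tot}}^2 + \sigma^2} + \frac{\sigma_z^2 + \sigma^2}{\sigma_{\mathrm{tot}}^2 + \sigma^2} - 1 - \log_2 \frac{\sigma_z^2 + \sigma^2}{\sigma_{\mathrm{tot}}^2 + \sigma^2} \right) \\ &= \frac{1}{2} \left( \frac{\sigma_{\mathrm{inter}}^2}{\sigma_{\mathrm{tot}}^2 + \sigma^2} + \frac{\sigma_{\mathrm{intra}}^2 + \sigma^2}{\sigma_{\mathrm{tot}}^2 + \sigma^2} - 1 - \sum_z P[z] \log_2 \frac{\sigma_z^2 + \sigma^2}{\sigma_{\mathrm{tot}}^2 + \sigma^2} \right) \\ &= - \frac{1}{2} \sum_z P[z] \log_2 \frac{\sigma_z^2 + \sigma^2}{\sigma_{\mathrm{tot}}^2 + \sigma^2} \\ &= - \frac{1}{2} \sum_z P[z] \log_2 \frac{1 + \sigma_z^2 / \sigma^2}{1 + \sigma_{\mathrm{tot}}^2 / \sigma^2} . \end{aligned} \tag{4} \]

Indeed, by the law of total variance, $\sigma_{\mathrm{inter}}^2 + \sigma_{\mathrm{intra}}^2 = \sigma_{\mathrm{tot}}^2$. The logarithm in Eqn. (4) can be developed by a Taylor expansion at order 2, using $\log_2(1 + \varepsilon) = \frac{1}{\ln 2} \left( \varepsilon - \varepsilon^2 / 2 + O(\varepsilon^3) \right)$, when $\varepsilon = \frac{1}{\sigma^2} \to 0^+$. Consequently

\[ \begin{aligned} I[C_{\mathrm{total}}(L); Z] &= - \frac{1}{2 \ln 2} \Big( \sum_z P[z] \sigma_z^2 - \sigma_{\mathrm{tot}}^2 \Big) \frac{1}{\sigma^2} + \frac{1}{4 \ln 2} \Big( \sum_z P[z] \sigma_z^4 - \sigma_{\mathrm{tot}}^4 \Big) \frac{1}{\sigma^4} + O\Big( \frac{1}{\sigma^6} \Big) \\ &= \frac{1}{2 \ln 2} \frac{\sigma_{\mathrm{inter}}^2}{\sigma^2} + \frac{1}{4 \ln 2} \Big( \sum_z P[z] \sigma_z^4 - \sigma_{\mathrm{tot}}^4 \Big) \frac{1}{\sigma^4} + O\Big( \frac{1}{\sigma^6} \Big) . \end{aligned} \tag{5} \]

Thus
– if HCI = 1, then $\sigma_{\mathrm{inter}}^2 \neq 0$, and thus $I[C_{\mathrm{total}}(L); Z]$ tends to zero as $1/\sigma^2$ (the dominant term is proportional to $\sigma_{\mathrm{inter}}^2 / \sigma^2$);
– if HCI = 2, then $\sigma_{\mathrm{inter}}^2 = 0$, and thus $I[C_{\mathrm{total}}(L); Z]$ tends to zero as $1/\sigma^4$ (because $\sum_z P[z] \sigma_z^4 - \sigma_{\mathrm{tot}}^4 \neq 0$).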
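The two asymptotic regimes above can be checked numerically from the last line of Eqn. (4). The Python sketch below is an illustration only: it assumes equiprobable classes, and the class variances and inter-class variance fed to it are made-up values chosen for the example (the function name `mi_gaussian` is hypothetical).

```python
import math


def mi_gaussian(class_variances, inter_variance, sigma):
    """Last line of Eqn. (4) for equiprobable classes.

    class_variances: the sigma_z^2 of each class (noise excluded).
    inter_variance:  sigma_inter^2, so that sigma_tot^2 = sigma_intra^2 + sigma_inter^2.
    """
    p = 1.0 / len(class_variances)
    s_intra = p * sum(class_variances)
    s_tot = s_intra + inter_variance  # law of total variance
    eps = 1.0 / sigma ** 2
    log2_1p = lambda x: math.log1p(x) / math.log(2)  # accurate log2(1 + x) for small x
    return -0.5 * sum(
        p * (log2_1p(s * eps) - log2_1p(s_tot * eps)) for s in class_variances
    )


sigma = 100.0

# HCI = 1: equal class variances, non-zero inter-class variance (made-up values).
i_order1 = mi_gaussian([1.0, 1.0], inter_variance=2.0, sigma=sigma)
print(i_order1 * sigma ** 2, 2.0 / (2 * math.log(2)))  # leading term of Eqn. (5)

# HCI = 2: balanced means, unequal class variances (made-up values).
i_order2 = mi_gaussian([1.0, 3.0], inter_variance=0.0, sigma=sigma)
excess = 0.5 * (1.0 ** 2 + 3.0 ** 2) - 2.0 ** 2  # sum_z P[z] sigma_z^4 - sigma_tot^4
print(i_order2 * sigma ** 4, excess / (4 * math.log(2)))
```

In each printed pair, the rescaled mutual information and the corresponding coefficient of Eqn. (5) are expected to agree closely for a large $\sigma$, the gap being of the order of the neglected terms.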



Case HCI > 2. Provided that the distributions of $C_{\mathrm{total}}(L) \mid Z = z$ are close enough to standard normal distributions, the Kullback-Leibler divergence between $C_{\mathrm{total}}(L) \mid Z = z$ and $C_{\mathrm{total}}(L)$ can be expanded according to Lemma 3. Notice that the cumulants of the RV $C_{\mathrm{total}}(L)$ (resp. $(C_{\mathrm{total}}(L) \mid Z = z)$) and of the RV $C_{\mathrm{total}}(L) + N$ (resp. $(C_{\mathrm{total}}(L) \mid Z = z) + N$) are the same at any order strictly greater than two, while at order two the variances add up, i.e., $\sigma_{\mathrm{tot}}^2 + \sigma^2$ (resp. $\sigma_z^2 + \sigma^2$). The reason is that the noise is independent of $C_{\mathrm{total}}(L)$ (resp. $(C_{\mathrm{total}}(L) \mid Z = z)$), and assumed Gaussian (i.e., its cumulants of order strictly greater than two are all zero).

Before using Lemma 3, the distributions shall be standardized. Under the assumption of 2nd-order resistance (HCI > 2), we have $\forall z$, $\sigma_z^2 = \sigma_{\mathrm{tot}}^2$. Thus $C_{\mathrm{total}}(L)$ and $(C_{\mathrm{total}}(L) \mid Z = z)$ have identical variance. Now, the Kullback-Leibler divergence is invariant under a common scaling (specifically, a scaling by factor $1 / \sqrt{\sigma_{\mathrm{tot}}^2 + \sigma^2}$). This means that, for all $z$:

\[ D_{KL}\left[ \frac{C_{\mathrm{total}}(L) \mid Z = z}{\sqrt{\sigma_{\mathrm{tot}}^2 + \sigma^2}} \, ; \, \frac{C_{\mathrm{total}}(L)}{\sqrt{\sigma_{\mathrm{tot}}^2 + \sigma^2}} \right] = D_{KL}\left[ C_{\mathrm{total}}(L) \mid Z = z \, ; \, C_{\mathrm{total}}(L) \right] . \]

Besides, it is well known that the $i$th cumulant is homogeneous of degree $i$, i.e., if $c$ is any constant, $k_i(cX) = c^i k_i(X)$. So, by plugging Lemma 3 into Eqn. (3), we get an expression of the mutual information between $C_{\mathrm{total}}(L)$ and $Z$ as a series (see footnote 7):

\[ \begin{aligned} I[C_{\mathrm{total}}(L); Z] &= \sum_{i \geq 1} \frac{1}{2 \cdot i!} \sum_z P[z] \left( \frac{k_i(C_{\mathrm{total}}(L) \mid Z = z)}{\sqrt{\sigma_{\mathrm{tot}}^2 + \sigma^2}^{\, i}} - \frac{k_i(C_{\mathrm{total}}(L))}{\sqrt{\sigma_{\mathrm{tot}}^2 + \sigma^2}^{\, i}} \right)^2 \\ &= \sum_{i > 2} \frac{1}{2 \cdot i!} \sum_z P[z] \frac{\left( k_i(C_{\mathrm{total}}(L) \mid Z = z) - k_i(C_{\mathrm{total}}(L)) \right)^2}{(\sigma_{\mathrm{tot}}^2 + \sigma^2)^i} . \end{aligned} \tag{6} \]

Notice that the index $i$ starts at 3 because the cumulants are balanced when $i <$ HCI. Then, according to Lemma 2, the first non-zero term in the summation in (6) is at index $i = \mathrm{HCI}$. So,

\[ I[C_{\mathrm{total}}(L); Z] = \sum_{i = \mathrm{HCI}}^{+\infty} \frac{1}{2 \cdot i!} \sum_z P[z] \frac{\left( k_i(C_{\mathrm{total}}(L) \mid z) - k_i(C_{\mathrm{total}}(L)) \right)^2}{(\sigma_{\mathrm{tot}}^2 + \sigma^2)^i} = O\left( \sigma^{-2 \times \mathrm{HCI}} \right) . \]

Indeed, when $\sigma \to +\infty$, $\sigma_{\mathrm{tot}}^2 + \sigma^2 \approx \sigma^2$. This proves Theorem 1. □

Footnote 7. A similar result had already been derived by Le and Berthier in [31], based on a development of the Kullback-Leibler divergence (like Lemma 3) at order 4, also obtained by Cardoso in an earlier work [9]. Our result, given in Eqn. (6), can be seen as a generalization to any order.

Our main interest in Theorem 1 is that it gives the dependence between the leakage, the noise variance $\sigma^2$ and the HCI order. It shows that the higher the HCI, the less information is leaked by the device. We notice that a recent paper by Grosso et al. at CARDIS 2013 [26] has empirically illustrated Theorem 1 on simulations.
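The scaling announced by Theorem 1 can also be observed without the Gaussian approximation, on a deliberately tiny example. The Python sketch below is an assumption-laden illustration, not the simulation setup of the paper: it takes a one-bit sensitive variable, a Hamming weight leakage, and computes the mutual information by a plain Riemann sum. The two noise levels (8 and 16), the integration grid and all names are choices made for this example.

```python
import math


def gauss(x, mean, sigma):
    """Density of N(mean, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def mutual_information(class_pdfs, sigma, steps_per_sigma=50, span=10.0):
    """I[C; Z] in bits for equiprobable classes, integrated over the leakage axis."""
    dx = sigma / steps_per_sigma
    lo, hi = -span * sigma, 2.0 + span * sigma
    p_z = 1.0 / len(class_pdfs)
    total = 0.0
    for j in range(int((hi - lo) / dx) + 1):
        x = lo + j * dx
        cond = [pdf(x, sigma) for pdf in class_pdfs]
        mix = p_z * sum(cond)
        total += sum(p_z * c * math.log2(c / mix) for c in cond if c > 0.0)
    return total * dx


# Unprotected bit: the leakage is Z + N, so the class means differ (HCI = 1).
unmasked = [lambda x, s: gauss(x, 0, s), lambda x, s: gauss(x, 1, s)]
# One-bit Boolean masking: (Z xor M) + M + N is 0 or 2 when Z = 0, and always 1 when Z = 1.
masked = [
    lambda x, s: 0.5 * (gauss(x, 0, s) + gauss(x, 2, s)),
    lambda x, s: gauss(x, 1, s),
]


def slope(class_pdfs, s1=8.0, s2=16.0):
    """Slope of log(I) versus log(sigma) between two noise levels."""
    i1 = mutual_information(class_pdfs, s1)
    i2 = mutual_information(class_pdfs, s2)
    return math.log(i2 / i1) / math.log(s2 / s1)


print(slope(unmasked), slope(masked))
```

The slopes are expected to be close to $-2$ for the unprotected bit and close to $-4$ for the masked one, i.e. $-2 \times \mathrm{HCI}$ in both cases.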

4 State-of-the-Art About Masking Optimizations When the Leakage Model is Approximately Known

4.1 Motivations

The modern description of masking schemes actually puts the emphasis on the way the shares are split and processed. For example, Boolean additive masking [25] gets its security from an information-theoretic argument similar to that employed to prove the unconditional security of the Vernam cipher. However, Boolean and bitwise masking is only suited to operations in $(\mathbb{F}_2^n, \oplus)$, which later led to the introduction of multiplicative masking [1] (when a product appears in the algorithm and the data are non-zero) and of homographic masking [20] (when inversion is necessary, along with addition and multiplication). Many other sharings exist (affine [23], polynomial [48,24], threshold implementation [42], etc.), depending on the usage constraints. Those sharings can be expensive to implement. Thus several trade-offs are encountered in the literature.

One of them consists in reducing the entropy of the masks, whilst keeping security against $d$th-order attacks. This strategy of using depleted masks is presented in [40]. It is proved that the masking resists first- and second-order attacks if the masks are chosen in a subset $\mathcal{M} \subseteq \mathbb{F}_2^n$ such that the indicator of $\mathcal{M}$ is 2nd-order correlation-immune [11, Chap. 7]. We recall that the indicator of a subset $\mathcal{M} \subseteq \mathbb{F}_2^n$ is the Boolean function $1_{\mathcal{M}}$, defined on $\mathbb{F}_2^n$ as $1_{\mathcal{M}}(x) = 1 \iff x \in \mathcal{M}$, and that a Boolean function is $d$th-order correlation-immune if its output distribution is unchanged by fixing up to $d$ input bits. A concrete architecture that implements this masking scheme is presented in [41]: every sensitive variable is masked by an element from $\mathcal{M}$. Incidentally, the results have been extended to arbitrary order in [3].

Another direction aims at keeping the entropy of the masks full, but attempts to encode the shares to further reduce the leakage. An example of this operation (illustrated in Fig. 2) consists in applying a function $B_i$ independently to each share $S_i$. This choice is termed “leakage squeezing” and is further developed in Sec. 4.2. Another option is to encode all the shares together. This is implemented by the “wiretap codes” countermeasure, described in Sec. 4.3, and by the “leak-free” countermeasure presented in Sec. 4.4. The only requirement is that the encoding function is invertible, as at the end of the computation (and certainly also during the computations) the unencoded shares must be recovered.

[Figure 2: block diagram. Defense (countermeasure, masking = sharing + encoding): the sensitive variable $Z$ is split into the shares $S_0, S_1, \ldots, S_d$, which are stored in the registers of the device under attack as $B_0(S_0), B_1(S_1), \ldots, B_d(S_d)$. (Side-)channel: non-injective and noisy leakage function $C_{\mathrm{device}}$, with noise $\mathcal{N}(0, \sigma^2)$. Attack (information retrieval): 1) measure $C_{\mathrm{device}}$; 2) compute $C_{\mathrm{total}}$, e.g. $C_{\mathrm{total}} = C_{\mathrm{device}}^i$; 3) test whether $\mathrm{Var}[E[C_{\mathrm{total}} \mid Z]] \neq 0$.]

Fig. 2. High-order protection (order d) and high-order attack (order i, with i > d if the masking is sound) scenario.

At the output, an attacker attempts to extract information on $Z$ from the leakage. In Fig. 2, the leakage function is depicted as “scalar”. However, generally speaking, the leakage function $C_{\mathrm{device}}$ can be vectorial. The attacker measures this leakage, and applies the following attack strategy: the measured leakage is raised to successive powers $i$ until it starts to depend on $Z$. This test makes it possible to build a distinguisher: the variations of $C_{\mathrm{total}}$ with $Z$ are non-zero when $i = \mathrm{HCI}$. If the sharing is $d$th-order secure, then the attacker must use a post-processing function (e.g. a power function) of order at least $d + 1$. If in addition the encoding of the shares is effective, the post-processing function must be of greater order.

Let us consider an example that illustrates the benefits of the encoding stage in Fig. 2. In the case of Boolean masking with one mask $M$, the sensitive variable is split in $S_0 = Z \oplus M$ and $S_1 = M$. Let us also assume that the leakage function consists in adding all the bits of the processed variable (Hamming weight model). Then, the attacker measures $\mathrm{HW}(Z \oplus M, M) = \sum_{i=1}^{n} (Z \oplus M)_i + M_i$. In this expression, it appears that the mask almost cancels. Indeed, on $n = 1$ bit, the leakage is:

\[ (Z \oplus M)_1 + M_1 = 2 \times \big( (Z_1 \oplus M_1) \wedge M_1 \big) + (Z_1 \oplus M_1) \oplus M_1 = 2 \times \big( (Z_1 \oplus M_1) \wedge M_1 \big) + Z_1 , \]

since the two occurrences of $M_1$ in the last XOR cancel each other.

We review in the rest of this section the state of the art about encoding methods. They can be seen as pre-processing methods on the shares, aiming at reducing the overall degree of the function $C_{\mathrm{total}}$.

4.2 Leakage Squeezing

Leakage squeezing aims at increasing the HCI value of Cdevice. It consists in encoding each share separately. Thus d + 1 functions $B_i : \mathbb{F}_2^n \to \mathbb{F}_2^n$ are applied to the d + 1 additive shares Si of Z. The functions must be bijective so that the plain shares can be recovered for the computation (during the algorithm and at the final demasking). When the device leaks the sum of the shares, the deterministic part of the leakage function Ctotal(L) is the overall sum of the Hamming weights of the shares: $\mathsf{HW}(B_0(S_0), \cdots, B_d(S_d))$. More realistically, on hardware platforms, this deterministic part involves the distances between the consecutive values carried by the registers, i.e. $\mathsf{HW}(B_0(S_0') \oplus B_0(S_0), \cdots, B_d(S_d') \oplus B_d(S_d))$. Interestingly enough, when the bijections are linear, the two cases can be treated in a common framework:

1. In the Hamming weight model, the distinguisher is the correlation coefficient with the first nonzero moment of $\mathsf{HW}(B_0(S_0), \cdots, B_d(S_d)) \mid Z$;
2. In the Hamming distance model, the distinguisher is the correlation coefficient with the first nonzero moment of $\mathsf{HW}(B_0(S_0' \oplus S_0), \cdots, B_d(S_d' \oplus S_d)) \mid (Z' \oplus Z) = \mathsf{HW}(B_0(S_0''), \cdots, B_d(S_d'')) \mid Z''$,

where for any random variable X, the random variable X″ is the difference between the two consecutive values X′ and X, i.e. X″ = X′ ⊕ X.

Clearly, this encoding has the potential to increase the resistance of the additive Boolean sharing: indeed, the sharing without encoding is a special case of leakage squeezing, where all the bijections are equal to the identity. A proof of concept (linear Bi) and some implementation results were the topic of the first publication about leakage squeezing, at WISTP 2011 [34]. In subsequent publications, leakage squeezing has been studied from a mathematical standpoint. For the sake of simplicity, they all assume that one share (say the first one) is processed without a bijection (B0 is the identity), and that the other shares are encoded. The bijections Bi can be public; nonetheless, if they are kept secret, a correlation attack becomes more complex.

The paper [32] analyses the leakage with two shares, explicitly finds bounds for the resistance against high-order attacks, and provides the corresponding bijection B1. The countermeasure is depicted in Fig. 3. The memory C implements the cryptographic function (e.g. the substitution box) and R the mask refresh. The optimal solution is a bijection whose graph has the highest possible correlation immunity (see definition in [8]). This problem comes down to the identification of rate 1/2 codes of maximal dual distance. Such codes, called Complementary Information Set (CIS) codes, have been studied in detail in [14]. In Sec. 5, we focus on the solutions that are linear. Indeed, they are easier to compute and, as already discussed, cover both the Hamming weight and the Hamming distance leakage models. The paper [32] also analyzes the effects of model imperfections (including cross-coupling), e.g. when the squeezing is designed under the Hamming weight or the Hamming distance assumption but the device actually leaks differently.
Leakage squeezing is remarkably resilient to such imperfections: it still reduces the leakage even if the imperfections represent up to 50 % of the expected leakage.

The performance of leakage squeezing is illustrated in [34], where two FPGA implementations of the leakage squeezing countermeasure are studied on DES. The proposed implementations have been tested in a Stratix II FPGA, which is based on the Adaptive Logic Module (ALM) cell. They have been compared with an unprotected DES and with a DES masked without any leakage squeezing. Table 1 summarizes the resources needed for each implementation and the estimated throughput. The ALMs are used for the combinational gates that implement linear operations. The memories are used for the non-linear operations. It can be seen that 32 BRAMs are necessary, and fully employed (32 × 4 Kbit = 131072 bit). Relative to the masked ROM implementation, leakage squeezing costs 42 additional ALMs (about 11.5 %), no extra memory, and about 19.5 % of the throughput: its impact on complexity and speed in hardware remains limited.

[Figure 3: datapath with two n-bit registers that leak simultaneously, a = Z ⊕ M and b = B1(M). The mask is recovered from b through B1^{-1}; the memory C produces the next masked value a′ = Z′ ⊕ M′, while the memory R followed by B1 produces b′ = B1(M′).]

Fig. 3. Rationale of the improvement of masking scheme with two shares thanks to the application of the leakage squeezing.

Table 1. Complexity and speed results. “LS” denotes the “leakage squeezing” countermeasure.

Implementation                 ALMs   Block memory [bit]   BRAM (M4K)   Throughput [Mbit/s]
Unprotected DES (reference)     276                    0            0                 929.4
DES masked ROM                  366               131072           32                 398.4
DES masked ROM with LS          408               131072           32                 320.8
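The datapath of Fig. 3 can be sketched in software as follows. This is only a toy model under stated assumptions, not the DES implementation of [34]: the word size is n = 4, the S-box is the PRESENT 4-bit S-box, the mask refresh is an arbitrary fixed permutation, and B1 is the matrix ni4 of Sec. 6.1. The names `MEM_C`, `MEM_R`, `step` and `demask` are hypothetical.

```python
# Toy model of the two-share leakage-squeezing datapath (n = 4 bits).
# Assumptions (not from the paper): PRESENT S-box, arbitrary mask refresh.

N = 4
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
REFRESH = [(5 * m + 3) % 16 for m in range(16)]  # hypothetical refresh M -> M'


def parity(x):
    return bin(x).count("1") & 1


# Row i of ni4 has a zero on the diagonal and ones elsewhere.
NI4_ROWS = [0xF ^ (1 << i) for i in range(N)]


def b1(m):
    """Linear bijection B1: output bit i is the XOR of all input bits except bit i."""
    return sum(parity(NI4_ROWS[i] & m) << i for i in range(N))


B1 = [b1(m) for m in range(16)]
B1_INV = [B1.index(y) for y in range(16)]

# Memory C is indexed by the masked value a = Z xor M and by the plain mask M;
# it returns Z' xor M' with Z' = SBOX[Z] and M' = REFRESH[M].
MEM_C = [[SBOX[a ^ m] ^ REFRESH[m] for m in range(16)] for a in range(16)]
MEM_R = REFRESH


def step(a, b):
    """One update of the register pair (a, b) = (Z xor M, B1(M))."""
    m = B1_INV[b]              # mask recovered through the inverse bijection
    a_next = MEM_C[a][m]       # Z' xor M'
    b_next = B1[MEM_R[m]]      # B1(M')
    return a_next, b_next


def demask(a, b):
    """Final demasking: recover Z from (Z xor M, B1(M))."""
    return a ^ B1_INV[b]
```

For any value z and mask m, `demask(*step(z ^ m, B1[m]))` returns `SBOX[z]`, while the mask register only ever stores the encoded value B1(M).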

Finally, the study of leakage squeezing in the case where the sensitive variable is split into three shares is conducted in [12]. It shows that pairs of bijections (B1, B2) can be found jointly that are better than any upgrade of the optimal one-mask solution to two masks. Indeed, one can start from a first-order leakage squeezing and improve on top of it to get a second-order leakage squeezing scheme; but this approach is proved to be sub-optimal w.r.t. the direct consideration of second-order leakage squeezing.

4.3 Wire-tap Codes

The countermeasure using wire-tap codes has been presented in [7]. Its objective is to prevent an attacker from recovering information on the sensitive variable even if she can recover some bits of the encoded sensitive variable. The attacker model is thus original, because it is assumed that either the attacker is able to probe some wires, or that she can use very accurate magnetic probes (see [29]). To achieve this goal, some random bits M (p bits) are combined with the sensitive variable Z (n bits). The complete wire-tap protected signal has length q = n + p. The protection consists in:

– encoding the mask M thanks to a linear code C of parameters [q, p, d],
– expanding the sensitive data Z on q bits by computing a linear combination (parity matrix), and
– XORing the two parts, which yields the protected data $z \in \mathbb{F}_2^q$.

Thus, if G is a generating matrix of C and H is the corresponding parity matrix, then the transformation into the masked representation is:

\[ z = H^T Z \oplus G^T M . \tag{7} \]

Then, it can be proved (Lemma 1 in [7]) that this representation unconditionally resists the leakage of up to d⊥ − 1 bits (inclusive), where d⊥ is the dual distance of C. However, the paper [7] does not link this result to the resistance against high-order side-channel attacks. Similarly to leakage squeezing, the wire-tap code need not be secret; obviously, the scheme is more secure if it is.

4.4 Leak-free Masking

Leak-free masking completely cancels any univariate leakage, i.e. it zeroes the mutual information between the leakage and the sensitive variables (considered individually, i.e. not in distance w.r.t. a previous state). It has been introduced in [35], and an example of implementation is discussed in [36]. Two conditions must be fulfilled for this encoding to apply:

1. The leakage must depend on the previous and on the current values of the manipulated variables, and must be invariant under the exchange of previous and current values.
2. The two shares must not interfere in the leakage, i.e. no product terms between (all or part of) the masked variable and (all or part of) the mask shall exist.

The countermeasure consists in defining a processing for one share (e.g. the mask) that does not leak, whereas the other share is perfectly masked. The construction demands that:

– M′ = M ⊕ α, for a nonzero constant α: this yields a constant leakage in distance, because M′ ⊕ M = α, which does not depend on the mask. Notice that a fresh mask M is chosen randomly at every new encryption.
– The sensitive variable be XORed with a function F of the mask, so that its leakage (Z′ ⊕ F(M′)) ⊕ (Z ⊕ F(M)) is independent of Z′ ⊕ Z. In these equations, Z′ is the next value taken by Z (refer to Fig. 4). This is achievable provided that the derivative $D_\alpha F$ of F in α is balanced. Indeed, $(Z' \oplus F(M')) \oplus (Z \oplus F(M)) = Z'' \oplus D_\alpha F(M)$.

The scheme is sketched in Fig. 4. The function F is not necessarily invertible. However, the requirement is that the overall encoding be invertible. Here, the function (Z, M) ↦ (Z ⊕ F(M), M) must be invertible.

[Figure 4: datapath of leak-free masking. Perfect masking strategy (masked data, n bits): the register holding Z ⊕ F(M) is updated into Z′ ⊕ F(M′) through a ROM that combines the blocks · ⊕ F(·) with the function S mapping Z to Z′. Hiding strategy (mask, p bits, with p > n): the register holding M is updated into M′ by XORing the constant α.]

Fig. 4. Leak-free masking.


In theory, this construction cancels the mutual information between the leakage model and the sensitive variable. However, this holds only if the leakage is indeed symmetrical with respect to the exchange between the previous and the current values, and if the two shares do not interfere. A small asymmetry in time or a small coupling between the two shares creates a dependency on the sensitive variable. As this variable is not encoded (with a code, as in leakage squeezing), the dependency leaks the secret. This is confirmed by the simulations in [32]. Also, real experiments show that if a small part of the leakage obeys the Hamming weight model (total asymmetry in the relationship between previous and current values), then the countermeasure can be broken at order 2 [39].
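The two construction rules of leak-free masking can be checked exhaustively on a toy instance. The sketch below is an illustration under assumptions that are not taken from [35] or [36]: n = 3, p = n + 1, α flips the top bit of the mask, and the function `f` is a hypothetical choice whose derivative in α is balanced. It models a perfect Hamming distance leakage, i.e. the ideal situation in which the two conditions above hold.

```python
from collections import Counter

N = 3            # bits of the sensitive variable Z
P = N + 1        # bits of the mask M (the construction requires p > n)
ALPHA = 1 << N   # nonzero constant alpha: flips the top bit of the mask


def hw(x):
    return bin(x).count("1")


def f(m):
    """Hypothetical F: low n bits of m when the top bit of m is set, else 0."""
    return (m & ((1 << N) - 1)) if (m >> N) & 1 else 0


def derivative(m):
    """D_alpha F(m) = F(m) xor F(m xor alpha)."""
    return f(m) ^ f(m ^ ALPHA)


def leakage_distribution(z, z_next):
    """Distribution, over the mask M, of the total Hamming distance leaked."""
    dist = Counter()
    for m in range(1 << P):
        m_next = m ^ ALPHA
        d_data = hw((z ^ f(m)) ^ (z_next ^ f(m_next)))  # masked data register
        d_mask = hw(m ^ m_next)                         # always hw(ALPHA)
        dist[d_data + d_mask] += 1
    return dist
```

With this choice, `derivative` takes every n-bit value exactly twice, and `leakage_distribution(z, z_next)` is the same for all pairs (z, z_next), so the modelled leakage carries no information on Z″ = Z′ ⊕ Z.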

5 Concrete Example of HCI Increase

5.1 Formulation and Results in the Perfect Model

In this section, we discuss a case study that illustrates leakage squeezing with a linear bijection. The considered example applies to AES, where the data are manipulated by bytes (hence n = 8). We also focus on the usage of a single mask (d = 1). For this countermeasure, the function Ctotal, which combines Cdevice with the processing Cattacker = (·)^i applied by a univariate ith-order CPA adversary, can be expressed (for hardware devices) as:

\[ C_{total}(L) = C_{total}(L_0, L_1) = (L_0 + L_1)^i , \tag{8} \]

where $L_0 = \mathsf{HW}(Z \oplus M) + N_0$ and $L_1 = \mathsf{HW}(B_1(M)) + N_1$.

The defender searches for a good bijection B1, denoted simply by B. From the attacker's standpoint, the univariate HO-CPA succeeds if and only if E[Ctotal | Z = z] does depend on $z \in \mathbb{F}_2^n$. So, conversely, if z ↦ E[Ctotal | Z = z] is constant, the HO-CPA fails. Now, Ctotal(L0, L1) is a polynomial in L0 and L1. Its terms are of the form $L_0^p \times L_1^q$, where $(p, q) \in \mathbb{N}^2$ are the exponents of each leakage in each term. We can write:

\[ f^{\mathrm{opt}}_{B,p,q}(z) = \mathsf{E}(L_0^p \times L_1^q \mid Z = z) = \mathsf{E}_M\left( \mathsf{HW}(z \oplus M)^p \cdot \mathsf{HW}(B(M))^q \right) . \]

By definition of HCI,

\[ \text{for all exponents that satisfy } p + q < \mathrm{HCI}, \quad f^{\mathrm{opt}}_{B,p,q}(z) \text{ is constant when } z \in \mathbb{F}_2^n . \tag{9} \]
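Condition (9) can be tested by exhaustive enumeration on a small instance. The sketch below is a noise-free illustration, assuming n = 4 instead of the byte-oriented n = 8 of this section, and using the linear bijection ni4 of Sec. 6.1 next to the identity; the helper names (`f_sum`, `is_constant`, `hci`) are hypothetical.

```python
N = 4  # toy word size (this section itself considers n = 8)


def hw(x):
    return bin(x).count("1")


def identity(m):
    return m


def ni4(m):
    """Product by the 4x4 matrix with a zero diagonal and ones elsewhere.

    Output bit i is the XOR of all input bits except bit i, i.e. the parity
    of m XORed with bit i: m is complemented exactly when its parity is odd.
    """
    return m ^ (0xF if hw(m) & 1 else 0)


def f_sum(bij, p, q, z):
    """2^N * E_M[HW(z xor M)^p * HW(B(M))^q], kept as an exact integer."""
    return sum(hw(z ^ m) ** p * hw(bij(m)) ** q for m in range(1 << N))


def is_constant(bij, p, q):
    """True when z -> f_{B,p,q}(z) takes a single value."""
    return len({f_sum(bij, p, q, z) for z in range(1 << N)}) == 1


def hci(bij, max_order=8):
    """Smallest p + q for which some f_{B,p,q} depends on z."""
    for order in range(1, max_order + 1):
        for p in range(order + 1):
            if not is_constant(bij, p, order - p):
                return order
    return None
```

Under these assumptions, `hci(identity)` evaluates to 2 and `hci(ni4)` to 4, which matches the gain of two units reported for this bijection in Sec. 6.1.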


The goal for the designer is to choose the bijection B that maximizes the HO-CPA immunity, i.e. that meets Eqn. (9) for the largest possible HCI. In [32], it is shown that the best bijection B is such that its graph (Z, B(Z)) is of maximal dual distance. For n = 8, such a graph C = (Z, B(Z)) can be deduced from the linear code [16, 8, 5] that is self-dual. More precisely, the graph is the indicator of this code. By writing the code C in systematic form, i.e. with a generating matrix of the form (In, G), we have that B is the linear function generated by G. This leakage squeezing protects against high-order CPA of any order up to 4.

Besides, in [7], it is shown that the best wire-tap code with a mask of size p = n can be built from a linear code of parameters [q, p, d] = [2n, n, d] = [16, 8, d] of greatest dual distance. Thus, the same code [16, 8, 5] can be used. It protects against any attacker able to probe up to 4 bits.

To see the similarity between leakage squeezing and wire-tap masking in the particular case of linear functions and with the use of a single mask, we apply the simplifications suggested in [7]:

– L is (In, 0), and
– G is written in systematic form as $(\Gamma^T, I_{q-n=p})$.

Let us assume (which is beyond the assumptions made in [7]) that the matrix Γ is invertible. Then, Eqn. (7), in which X now denotes the sensitive variable and Z its encoded form, rewrites:

\[ Z = \begin{pmatrix} I_n & \Gamma \\ 0 & I_p \end{pmatrix} \begin{pmatrix} X \\ M \end{pmatrix} = \begin{pmatrix} X \oplus \Gamma M \\ M \end{pmatrix} . \]

Thus, by replacing the uniformly distributed mask M (on $\mathbb{F}_2^n$) by $\Gamma^{-1} M = B(M)$, the sensitive variable encoded in wire-tap is also the couple (X ⊕ M, B(M)), as in leakage squeezing. So, with one mask of the same size as the sensitive data, the linear functions that allow:

– to resist attacks of the highest order, and
– to resist probing with the highest number of probes,

are the same. Two security objectives are achieved with a single linear code.

5.2 Resistance in the Imperfect Model

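Remark 2. The fact mentioned in Sec. 4.2 that leakage squeezing resists model imperfections can be proved. If for instance $L_0 = \mathsf{HW}(Z \oplus M)^2 + N_0$ instead of $L_0 = \mathsf{HW}(Z \oplus M) + N_0$ (i.e., there is already a mixture of bits owing to the device), then the new resistance order, noted HCI′, is equal to:

\[ \mathrm{HCI}' = \max \left\{ \widetilde{\mathrm{HCI}} \text{ such that } \forall p \geq 1, \forall q \geq 1, \; p + q = \widetilde{\mathrm{HCI}} \implies 2p + q \leq d \right\} . \tag{10} \]

It can be proved (see below) that $\mathrm{HCI}' = \lfloor \frac{d+1}{2} \rfloor$; thus, a security margin still exists.

Proof. Writing r = p + q, the condition q ≥ 1 reads p ≤ r − 1, and 2p + q ≤ d reads p ≤ d − r. Equation (10) can thus be equivalently reformulated as:

\[ \max \{ r \in \mathbb{N} \text{ such that } \forall p \geq 1, \; p \leq r - 1 \Rightarrow p \leq d - r \} = \max \{ r \in \mathbb{N} \text{ such that } r - 1 \leq d - r \} = \max \{ r \in \mathbb{N} \text{ such that } 2r \leq d + 1 \} = \left\lfloor \frac{d+1}{2} \right\rfloor . \]

□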
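The closed form of Remark 2 can be cross-checked numerically against the definition of Eqn. (10). The short sketch below is only a sanity check; the function name `hci_imperfect` is hypothetical. It is restricted to d ≥ 1, because for d = 0 the condition is vacuously satisfied at r = 1 (no pair p, q ≥ 1 sums to 1).

```python
def hci_imperfect(d):
    """Largest r such that every (p, q) with p, q >= 1 and p + q = r
    satisfies 2p + q <= d (definition of Eqn. (10))."""
    best = 0
    for r in range(1, d + 2):
        pairs = [(p, r - p) for p in range(1, r)]
        if all(2 * p + q <= d for p, q in pairs):
            best = r
    return best


# Compare with the closed form floor((d + 1) / 2) for 1 <= d <= 20.
for d in range(1, 21):
    assert hci_imperfect(d) == (d + 1) // 2
```

The binding constraint is always the pair (p, q) = (r − 1, 1), which gives 2r − 1 ≤ d, i.e. the bound used in the proof.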

6 Security Evaluation of the Countermeasure

For our simulations, we still consider the hardware case, where all shares leak simultaneously (i.e. $C_{device}(L) = \sum_{i=0}^{d} L_i$).

6.1 Security Analysis

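Lemma 2 in [51] proves that, without leakage squeezing, a hardware Boolean masking countermeasure with d masks has HO-CPA immunity HCI = d + 1, and thus protects against dth-order CPA. This is illustrated for n = 4 in the first five groups of Tab. 2, which correspond to d ∈ {0, 1, 2, 3, 4}. For these simulations, we consider Li = HW(Si), without noise. In this table, the number of lines in gray is equal to HCI − 1 (Definition 2).

Let us define the linear bijection B by its matrix $\mathrm{ni4} \doteq \overline{I_4}$ (the bitwise complement of the identity matrix $I_4$):

\[
x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} \in \mathbb{F}_2^4
\;\mapsto\;
\mathrm{ni4} \times x =
\begin{pmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
=
\begin{pmatrix} x_2 \oplus x_3 \oplus x_4 \\ x_1 \oplus x_3 \oplus x_4 \\ x_1 \oplus x_2 \oplus x_4 \\ x_1 \oplus x_2 \oplus x_3 \end{pmatrix} \in \mathbb{F}_2^4 . \tag{11}
\]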

Using this linear bijection, we summarize in the last group of Tab. 2 the security improvement brought by leakage squeezing. For these simulations, we take L0 = HW(S0) and L1 = HW(B(S1)). The results in the table show that leakage squeezing improves the HO-CPA immunity by two units without adding extra masks.
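The trend summarized above can be reproduced by brute force on the noise-free model. The sketch below is an illustration written for this purpose, not the simulation code behind Tab. 2: it enumerates all masks for n = 4, computes the moments of the summed Hamming weights for every value z, and returns the first order at which these moments depend on z. The function names are hypothetical.

```python
from itertools import product

N = 4


def hw(x):
    return bin(x).count("1")


def ni4(m):
    """Linear bijection of Eqn. (11): complement m when its parity is odd."""
    return m ^ (0xF if hw(m) & 1 else 0)


def leakages(z, d, bij=None):
    """Noise-free leakage sum_i HW(S_i) for every choice of the d masks.

    S_0 = z xor M_1 xor ... xor M_d is left unencoded; the other shares are
    the masks, optionally encoded by the bijection `bij`.
    """
    for masks in product(range(1 << N), repeat=d):
        s0 = z
        for m in masks:
            s0 ^= m
        shares = [s0] + [bij(m) if bij else m for m in masks]
        yield sum(hw(s) for s in shares)


def hci(d, bij=None, max_order=8):
    """Smallest i such that E[(sum_i L_i)^i | Z = z] depends on z."""
    per_z = [list(leakages(z, d, bij)) for z in range(1 << N)]
    for order in range(1, max_order + 1):
        moments = {sum(l ** order for l in ls) for ls in per_z}
        if len(moments) > 1:
            return order
    return None
```

In this model, `hci(d)` returns d + 1 for d = 0, 1, 2, 3, in line with Lemma 2 of [51], whereas `hci(1, ni4)` returns 4, i.e. two units more than plain first-order masking.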


Example 1. The resistance when the leakage model is imperfect (refer to Remark 2) can also be illustrated on an example using B : x ↦ ni4 × x and a combination function that mixes the two shares by an exclusive-or (of degree 2). This can happen in a software implementation where one register would successively contain one share, then the other. In this case, the leakage is a function of (z ⊕ M) ⊕ B(M), which is thus equal to

\[ z \oplus \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix} \times M . \]

Of course,

\[ \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix} \mathbb{F}_2^4 = \left\{ (0\ 0\ 0\ 0)^T, (1\ 1\ 1\ 1)^T \right\} . \]

But in the image, there are 8 vectors $(0\ 0\ 0\ 0)^T$ and 8 vectors $(1\ 1\ 1\ 1)^T$, i.e., they are balanced. Thus the expectation (on the random variable M) of any affine function of (z ⊕ M) ⊕ B(M) will not depend on z. This result is a direct application of Remark 2, since $\lfloor \frac{d+1}{2} \rfloor = 2$.

6.2 Information-Theoretic Evaluation of the Countermeasure

In this section, our purpose is to quantify the amount of information that the countermeasure reveals about the sensitive variable Z. To achieve this goal, we follow the information-theoretic approach introduced in [53]. Namely, we compute the mutual information between the sensitive variable Z and the leakage function of Eqn. (8), where N = N0 + N1 is an Additive White Gaussian Noise (AWGN) of standard deviation σ. In our simulation, we use the bijection $\overline{I_4}$ (called ni4). For comparison purposes, we proceed in the same way for high-order Boolean masking. The mutual information of the leakage squeezing hardware implementation is represented in Fig. 5. The curves in this figure have been obtained by computing the mutual information as the integral of distributions (mixtures of Gaussians). The data type is double and the integration software is the GNU “contrib adaptint” by Steven G. Johnson, with accuracy set to $1.14 \times 10^{-14}$. This accuracy is a bit less than $2^{-46}$, hence the vertical scale of Fig. 5.

[Figure 5: plot of log2(I[Ctotal; Z]) (vertical axis, from 0 down to −48) versus the noise standard deviation σ (horizontal axis, logarithmic, from 2^{-1} to 2^9), with the slope of each curve annotated. Six curves, all with Cdevice = sum: no LS with 0, 1, 2, 3 and 4 masks, and LS (B = ni4) with 1 mask.]

Fig. 5. Leakage metrics for the 4-bit leakage model without and with leakage squeezing (shortened in LS) enhancement, for Cdevice = sum, i.e., $C_{device}(L) = \sum_{i=0}^{d} L_i$.

This first analysis allows us to observe that the gain is high when leakage squeezing is applied, because the mutual information leaked is lower than without the countermeasure, whatever the SNR. Our simulations confirm the theoretical predictions of Theorem 1. As a corollary, $\mathsf{I}[C_{total}(L); Z] \doteq \mathrm{MIM} = O(1/\sigma^8)$ for first-order masking with leakage squeezing (HCI = 4), whereas $\mathrm{MIM} = O(1/\sigma^4)$ for first-order masking without (HCI = 2). Taking advantage of the leakage squeezing principle, the quantity of information leaked with one sole mask is almost the same as that of third-order masking, without the need to add extra masks. It is in this respect that we describe leakage squeezing as a masking scheme that gets the most out of a single mask.

An illustration of leakage squeezing is given in Fig. 6 and Fig. 7. The goal of the designer (in green) is to increase the order of the information-theoretic (IT) attacks and of the HO-CPA. Conversely, the goal of the attacker (in red) is to decrease the order of the attacks. Without leakage squeezing (B0 = B1 = · · · = Bd = In, the identity function, cf. Fig. 6), the HCI coincides with the algebraic degree of the attack. However, with leakage squeezing (cf. Fig. 7), the lowest degree of a working HO-CPA, namely d°(Ctotal), can be two units greater than dalg(Ctotal) (on the example of d = 1 and n = 4, with $B_0 = I_4$ and $B_1 = \overline{I_4}$).

[Figure 6: axis of attack orders i = 0, · · · , d, d + 1, d + 2, d + 3. The designer applies dth-order masking. Two regions are labelled: “attack impossible in IT, HO-CPA impossible” and “attack possible in IT, HO-CPA possible”. The attacker's position is marked dalg(Ctotal) = d°(Ctotal) = HCI.]

Fig. 6. Information-theoretic (IT) leakage and HO-CPA attack metrics without leakage squeezing.

[Figure 7: same axis of attack orders i. The designer combines dth-order masking with leakage squeezing. Three regions are labelled: “attack impossible in IT, HO-CPA impossible”, “attack possible in IT, HO-CPA impossible” and “attack possible in IT, HO-CPA possible”. The attacker's positions dalg(Ctotal) and d°(Ctotal) = HCI are now distinct.]

Fig. 7. Information-theoretic (IT) leakage and HO-CPA attack metrics with leakage squeezing.

7 Conclusions and Perspectives

In this paper, we have investigated high-order masking countermeasures against side-channel attacks, in the context of FPGAs where the computation is implemented as table lookups in block RAMs. We have shown that the minimal attack order (the HO-CPA immunity, or HCI) relates to the amount of leakage. Then, we presented a method called leakage squeezing, which aims at raising the HO-CPA immunity. This method consists in applying bijective encodings to the masking shares. Our evaluation shows that this technique provides strong robustness against HO-CPA: without leakage squeezing, HCI = d + 1, whereas with leakage squeezing, HCI > d + 1. For instance, we characterize linear bijections that reach HCI = 4 with only d = 1 mask when the sensitive variable is a nibble. The robustness is corroborated by an information-theoretic analysis of the leakage. Indeed, at a given cost and performance level, we show that leakage squeezing with linear bijections is as efficient as adding one or two other masks.

As a perspective, we intend to extend the search for bijections to the case where more than two masks are used. A recent paper by Grosso et al. at CARDIS 2013 [26] suggests that leakage squeezing could also be efficient against higher-order attacks when the shares (two in their article) leak at different dates. We intend to formalize the reason why leakage squeezing also increases the HCI in this context.

Acknowledgments

The authors are grateful to Shivam Bhasin for providing the estimation of the signal-to-noise ratio on FPGAs. We also thank Thanh-Ha Le and Maël Berthier from Safran-Morpho for interesting discussions regarding the use of cumulants in the development of the mutual information in the presence of strong noise. The interaction with them was key for the rigorous demonstration of Theorem 1. Besides, this work, originating from IACR Cryptology ePrint Archive 2011/520 [33] and from a presentation at CRYPTARCHI 2012 [27], has greatly improved after the numerous fruitful exchanges with the anonymous reviewers. This work has been partly supported by the French National Research Agency (ANR), under grant ANR-09-SEGI-013 (ARPEGE project SecReSoC, “Secured Reconfigurable System on Chip”).

References

1. Mehdi-Laurent Akkar and Christophe Giraud. An Implementation of DES and AES Secure against Some Attacks. In Proceedings of CHES’01, volume 2162 of LNCS, pages 309–318. Springer, May 2001. Paris, France.
2. Lejla Batina, Benedikt Gierlichs, Emmanuel Prouff, Matthieu Rivain, François-Xavier Standaert, and Nicolas Veyrat-Charvillon. Mutual Information Analysis: a Comprehensive Study. J. Cryptology, 24(2):269–291, 2011.
3. Shivam Bhasin, Claude Carlet, and Sylvain Guilley. Theory of masking with codewords in hardware: low-weight dth-order correlation-immune Boolean functions. Cryptology ePrint Archive, Report 2013/303, 2013. http://eprint.iacr.org/2013/303/.
4. Shivam Bhasin, Jean-Luc Danger, Sylvain Guilley, and Zakaria Najm. NICV: Normalized Inter-Class Variance for Detection of Side-Channel Leakage. Cryptology ePrint Archive, Report 2013/717, 2013. http://eprint.iacr.org/2013/717.
5. Shivam Bhasin, Sylvain Guilley, Annelie Heuser, and Jean-Luc Danger. From cryptography to hardware: analyzing and protecting embedded Xilinx BRAM for cryptographic applications. J. Cryptographic Engineering, 3(4):213–225, 2013.

6. Éric Brier, Christophe Clavier, and Francis Olivier. Correlation Power Analysis with a Leakage Model. In CHES, volume 3156 of LNCS, pages 16–29. Springer, August 11–13 2004. Cambridge, MA, USA.
7. Julien Bringer, Hervé Chabanne, and Thanh Ha Le. Protecting AES against side-channel analysis using wire-tap codes. J. Cryptographic Engineering, 2(2):129–141, 2012.
8. Paul Camion, Claude Carlet, Pascale Charpin, and Nicolas Sendrier. On Correlation-Immune Functions. In Joan Feigenbaum, editor, CRYPTO, volume 576 of Lecture Notes in Computer Science, pages 86–100. Springer, 1991.
9. Jean-François Cardoso. High-order contrasts for independent component analysis. Neural Comput., 11(1):157–192, January 1999.
10. Jean-François Cardoso. Dependence, Correlation and Gaussianity in Independent Component Analysis. Journal of Machine Learning Research, 4:1177–1203, 2003.
11. Claude Carlet. Boolean Functions for Cryptography and Error Correcting Codes: Chapter of the monography Boolean Models and Methods in Mathematics, Computer Science, and Engineering. pages 257–397. Cambridge University Press, Y. Crama and P. Hammer eds, 2010. Preliminary version available at http://www.math.univ-paris13.fr/~carlet/chap-fcts-Bool-corr.pdf.
12. Claude Carlet, Jean-Luc Danger, Sylvain Guilley, and Houssem Maghrebi. Leakage Squeezing of Order Two. In INDOCRYPT, volume 7668 of LNCS, pages 120–139. Springer, December 9-12 2012. Kolkata, India.
13. Claude Carlet, Jean-Luc Danger, Sylvain Guilley, Houssem Maghrebi, and Emmanuel Prouff. Achieving side-channel high-order correlation immunity with leakage squeezing. J. Cryptographic Engineering, 4(2):107–121, 2014.
14. Claude Carlet, Philippe Gaborit, Jon-Lark Kim, and Patrick Solé. A New Class of Codes for Boolean Masking of Cryptographic Computations. IEEE Transactions on Information Theory, 58(9):6000–6011, 2012.
15. Claude Carlet, Louis Goubin, Emmanuel Prouff, Michael Quisquater, and Matthieu Rivain. Higher-Order Masking Schemes for S-Boxes. In FSE, Lecture Notes in Computer Science. Springer, March 19–21 2012. Washington DC, USA.
16. Suresh Chari, Charanjit S. Jutla, Josyula R. Rao, and Pankaj Rohatgi. Towards Sound Approaches to Counteract Power-Analysis Attacks. In CRYPTO, volume 1666 of LNCS. Springer, August 15-19 1999. Santa Barbara, CA, USA. ISBN: 3-540-66347-9.
17. Suresh Chari, Josyula R. Rao, and Pankaj Rohatgi. Template Attacks. In CHES, volume 2523 of LNCS, pages 13–28. Springer, August 2002. San Francisco Bay (Redwood City), USA.
18. Jean-Sébastien Coron. Higher Order Masking of Look-up Tables. Cryptology ePrint Archive, Report 2013/700, 2013. http://eprint.iacr.org/.
19. Jean-Sébastien Coron, Emmanuel Prouff, and Matthieu Rivain. Side Channel Cryptanalysis of a Higher Order Masking Scheme. In CHES, volume 4727 of LNCS, pages 28–44. Springer, September 10-13 2007. Vienna, Austria.
20. Nicolas Courtois and Louis Goubin. An Algebraic Masking Method to Protect AES Against Power Attacks. In Dongho Won and Seungjoo Kim, editors, ICISC, volume 3935 of Lecture Notes in Computer Science, pages 199–209. Springer, 2005.
21. Saar Drimer, Tim Güneysu, and Christof Paar. DSPs, BRAMs, and a Pinch of Logic: Extended Recipes for AES on FPGAs. TRETS, 3(1), 2010.
22. Wieland Fischer and Berndt M. Gammel. Masking at Gate Level in the Presence of Glitches. In CHES, volume 3659 of Lecture Notes in Computer Science, pages 187–200. Springer, August 29 – September 1 2005. Edinburgh, UK.


23. Guillaume Fumaroli, Ange Martinelli, Emmanuel Prouff, and Matthieu Rivain. Affine Masking against Higher-Order Side Channel Analysis. In Alex Biryukov, Guang Gong, and Douglas R. Stinson, editors, Selected Areas in Cryptography, volume 6544 of Lecture Notes in Computer Science, pages 262–280. Springer, 2010.
24. Louis Goubin and Ange Martinelli. Protecting AES with Shamir’s Secret Sharing Scheme. In Preneel and Takagi [43], pages 79–94.
25. Louis Goubin and Jacques Patarin. DES and Differential Power Analysis. The “Duplication” Method. In CHES, LNCS, pages 158–172. Springer, Aug 1999. Worcester, MA, USA.
26. Vincent Grosso, François-Xavier Standaert, and Emmanuel Prouff. Low Entropy Masking Schemes, Revisited. In CARDIS, Lecture Notes in Computer Science. Springer, November 2013. Berlin, Germany.
27. Sylvain Guilley, Claude Carlet, Houssem Maghrebi, Jean-Luc Danger, and Emmanuel Prouff. Leakage Squeezing — Defeating Instantaneous (d + 1)th-order Correlation Power Analysis with Strictly Less Than d Masks. In CryptArchi, June 19–22 2012. Château de Goutelas, Marcoux, France; (abstract).
28. Tim Güneysu and Amir Moradi. Generic side-channel countermeasures for reconfigurable devices. In Preneel and Takagi [43], pages 33–48.
29. Johann Heyszl, Stefan Mangard, Benedikt Heinz, Frederic Stumpf, and Georg Sigl. Localized Electromagnetic Analysis of Cryptographic Implementations. In Orr Dunkelman, editor, CT-RSA, volume 7178 of Lecture Notes in Computer Science, pages 231–244. Springer, 2012.
30. Paul C. Kocher, Joshua Jaffe, and Benjamin Jun. Differential power analysis. In Michael J. Wiener, editor, CRYPTO, volume 1666 of Lecture Notes in Computer Science, pages 388–397. Springer, 1999.
31. Thanh-Ha Le and Maël Berthier. Mutual Information Analysis under the View of Higher-Order Statistics. In Isao Echizen, Noboru Kunihiro, and Ryôichi Sasaki, editors, IWSEC, volume 6434 of Lecture Notes in Computer Science, pages 285–300. Springer, 2010.
32. Houssem Maghrebi, Claude Carlet, Sylvain Guilley, and Jean-Luc Danger. Optimal First-Order Masking with Linear and Non-linear Bijections. In Aikaterini Mitrokotsa and Serge Vaudenay, editors, AFRICACRYPT, volume 7374 of Lecture Notes in Computer Science, pages 360–377. Springer, 2012.
33. Houssem Maghrebi, Sylvain Guilley, Claude Carlet, and Jean-Luc Danger. Classification of High-Order Boolean Masking Schemes and Improvements of their Efficiency. Cryptology ePrint Archive, Report 2011/520, September 2011. http://eprint.iacr.org/2011/520.
34. Houssem Maghrebi, Sylvain Guilley, and Jean-Luc Danger. Leakage Squeezing Countermeasure Against High-Order Attacks. In WISTP, volume 6633 of LNCS, pages 208–223. Springer, June 1-3 2011. Heraklion, Greece. DOI: 10.1007/978-3-642-21040-2_14.
35. Houssem Maghrebi, Emmanuel Prouff, Sylvain Guilley, and Jean-Luc Danger. A First-Order Leak-Free Masking Countermeasure. In CT-RSA, volume 7178 of LNCS, pages 156–170. Springer, February 27 – March 2 2012. San Francisco, CA, USA. DOI: 10.1007/978-3-642-27954-6_10.
36. Houssem Maghrebi, Emmanuel Prouff, Sylvain Guilley, and Jean-Luc Danger. Register Leakage Masking Using Gray Code. In HOST, IEEE Computer Society, pages 37–42, June 2-3 2012. Moscone Center, San Francisco, CA, USA. DOI: 10.1109/HST.2012.6224316.

Achieving side-channel hi-order corr. immunity with Leakage Squeezing 37. Stefan Mangard, Elisabeth Oswald, and Thomas Popp. Power Analysis Attacks: Revealing the Secrets of Smart Cards. Springer, December 2006. ISBN 0-38730857-1, http://www.dpabook.org/. 38. Stefan Mangard and Kai Schramm. Pinpointing the Side-Channel Leakage of Masked AES Hardware Implementations. In CHES, volume 4249 of LNCS, pages 76–90. Springer, October 10-13 2006. Yokohama, Japan. 39. Amir Moradi and Oliver Mischke. How Far Should Theory be from Practice? Evaluation of a Countermeasure. In CHES, September 9-12 2012. Leuven, Belgium. 40. Maxime Nassar, Sylvain Guilley, and Jean-Luc Danger. Formal Analysis of the Entropy / Security Trade-off in First-Order Masking Countermeasures against Side-Channel Attacks. In INDOCRYPT, volume 7107 of LNCS, pages 22–39. Springer, December 11-14 2011. Chennai, Tamil Nadu, India. DOI: 10.1007/9783-642-25578-6 4. 41. Maxime Nassar, Youssef Souissi, Sylvain Guilley, and Jean-Luc Danger. RSM: a Small and Fast Countermeasure for AES, Secure against First- and Second-order Zero-Offset SCAs. In DATE, pages 1173–1178. IEEE Computer Society, March 12-16 2012. Dresden, Germany. (TRACK A: “Application Design”, TOPIC A5: “Secure Systems”). 42. Svetla Nikova, Vincent Rijmen, and Martin Schl¨ affer. Secure hardware implementation of nonlinear functions in the presence of glitches. J. Cryptology, 24(2):292– 321, 2011. 43. Bart Preneel and Tsuyoshi Takagi, editors. Cryptographic Hardware and Embedded Systems - CHES 2011 - 13th International Workshop, Nara, Japan, September 28 – October 1, 2011. Proceedings, volume 6917 of LNCS. Springer, 2011. 44. Emmanuel Prouff and Robert P. McEvoy. First-Order Side-Channel Attacks on the Permutation Tables Countermeasure. In CHES, volume 5747 of Lecture Notes in Computer Science, pages 81–96. Springer, September 6-9 2009. Lausanne, Switzerland. 45. Emmanuel Prouff and Matthieu Rivain. 
Masking against Side Channel Attacks: a Formal Security Proof. In EUROCRYPT, volume 7881 of LNCS, pages 142–159. Springer, May 2013. Athens, Greece. 46. Emmanuel Prouff, Matthieu Rivain, and Régis Bevan. Statistical Analysis of Second Order Differential Power Analysis. IEEE Trans. Computers, 58(6):799–811, 2009. 47. Emmanuel Prouff and Thomas Roche. Attack on a Higher-Order Masking of the AES Based on Homographic Functions. In Guang Gong and Kishan Chand Gupta, editors, INDOCRYPT, volume 6498 of Lecture Notes in Computer Science, pages 262–281. Springer, 2010. 48. Emmanuel Prouff and Thomas Roche. Higher-Order Glitches Free Implementation of the AES Using Secure Multi-party Computation Protocols. In Preneel and Takagi [43], pages 63–78. 49. Japanese RCIS-AIST. SASEBO (Side-channel Attack Standard Evaluation Board, Akashi Satoh) development board: http://www.risec.aist.go.jp/project/sasebo/, 2013. 50. Matthieu Rivain and Emmanuel Prouff. Provably Secure Higher-Order Masking of AES. In Stefan Mangard and Fran¸cois-Xavier Standaert, editors, CHES, volume 6225 of LNCS, pages 413–427. Springer, 2010. 51. Matthieu Rivain, Emmanuel Prouff, and Julien Doget. Higher-Order Masking and Shuffling for Software Implementations of Block Ciphers. Cryptology ePrint Archive, Report 2009/420, September 2009. http://eprint.iacr.org/2009/420.



52. Werner Schindler, Kerstin Lemke, and Christof Paar. A Stochastic Model for Differential Side Channel Cryptanalysis. In CHES, volume 3659 of LNCS, pages 30–46. Springer, Sept 2005. Edinburgh, Scotland, UK.
53. François-Xavier Standaert, Tal Malkin, and Moti Yung. A Unified Framework for the Analysis of Side-Channel Key Recovery Attacks. In EUROCRYPT, volume 5479 of LNCS, pages 443–461. Springer, April 26-30 2009. Cologne, Germany.


A Appendix: Estimation of the Noise Level in Hardware Implementations

This appendix presents a method to estimate the signal-to-noise ratio (SNR) from real traces. For the sake of illustration, we use traces gathered from an FPGA (Xilinx Virtex 5) soldered on a SASEBO-GII board [49]. The traces are captures of the electromagnetic field emitted by the FPGA, acquired with an oscilloscope with a bandwidth of 6 GHz. The FPGA is programmed with an AES that leaks values Y that depend on the distance between two state values X. The architecture of the AES is that described in [41] (but with the countermeasure inhibited): one round is computed every clock cycle. The SNR is computed at the last round for each of the 16 state bytes, except those of the first row, which is invariant through the ShiftRows transform and has a poor SNR. The definition of the SNR requires two notions:
1. the signal is the inter-class variance, i.e. Var[E[Y|X]], whereas
2. the noise is the total variance minus the signal, i.e. the intra-class variance E[Var[Y|X]].
The SNR (in power, i.e. squared) is defined as the ratio between the inter-class and the intra-class variances (refer to [37,4]). These values are plotted over time in Fig. 8 when X is the transition of the last round. The value of the “squared” SNR is about 0.005, hence 1/σ² ≈ 0.005, which means σ = 1/√0.005 ≈ 14. This value of σ, representative of million-gate parallel devices like FPGAs, is significantly larger than the noise that taints measurements over ASICs such as smart cards. This supports the hypothesis of “large values” of σ in FPGAs, all the more so as the designer can decide to further increase the noise variance by activating pseudo-random logic, as explained for instance in [28].
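The estimator described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the measurement code used for Fig. 8: the function name `snr_in_power`, the simulated Hamming-weight leakage on 4-bit words, and the Gaussian noise with σ = 14 are all hypothetical choices made here to mimic the orders of magnitude quoted in the text.

```python
import random
from collections import defaultdict


def snr_in_power(xs, ys):
    """Estimate Var[E[Y|X]] / E[Var[Y|X]] from paired samples (x, y)."""
    classes = defaultdict(list)
    for x, y in zip(xs, ys):
        classes[x].append(y)
    n = len(ys)
    grand_mean = sum(ys) / n
    signal = 0.0  # inter-class variance Var[E[Y|X]]
    noise = 0.0   # intra-class variance E[Var[Y|X]]
    for values in classes.values():
        weight = len(values) / n          # empirical P(X = x)
        mean = sum(values) / len(values)  # empirical E[Y | X = x]
        signal += weight * (mean - grand_mean) ** 2
        noise += weight * sum((v - mean) ** 2 for v in values) / len(values)
    return signal / noise


if __name__ == "__main__":
    # Hypothetical simulation: Y = HW(X) + Gaussian noise, X uniform on 4 bits.
    # HW(X) has variance 1 here, so the SNR should be close to 1/sigma^2,
    # up to estimation error that shrinks as the number of traces grows.
    rng = random.Random(0)
    sigma = 14.0
    xs = [rng.randrange(16) for _ in range(200_000)]
    ys = [bin(x).count("1") + rng.gauss(0.0, sigma) for x in xs]
    print("estimated SNR:", snr_in_power(xs, ys))
    print("1/sigma^2    :", 1.0 / sigma ** 2)
```

On real traces, the same function would be applied independently at each timing sample, which yields one SNR curve per state byte as in Fig. 8.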

Table 2. Statistics about some leakage models on words of bitwidth n = 4, without noise (i.e. σ = 0).

R.V.               Cdevice (L)  Cdevice (L)  Cdevice (L)  Cdevice (L)  Cdevice (L)  Cdevice (L)
                                | Z = 0      | Z = 1      | Z = 2      | Z = 3      | Z = 4

Plain zero-offset (Eqn. (8)) with d = 0 mask (unprotected reference). [HCI = 1]
µ1 = E(·)                2.000        0.000        1.000        2.000        3.000        4.000
µ2 = E((· − µ1)²)        1.000        0.000        0.000        0.000        0.000        0.000
µ3 = E((· − µ1)³)        0.000        0.000        0.000        0.000        0.000        0.000
µ4 = E((· − µ1)⁴)        2.500        0.000        0.000        0.000        0.000        0.000
Entropy [bit]            2.031        0.000        0.000        0.000        0.000        0.000

Plain zero-offset (Eqn. (8)) with d = 1 mask. [HCI = 2]
µ1 = E(·)                4.000        4.000        4.000        4.000        4.000        4.000
µ2 = E((· − µ1)²)        2.000        4.000        3.000        2.000        1.000        0.000
µ3 = E((· − µ1)³)        0.000        0.000        0.000        0.000        0.000        0.000
µ4 = E((· − µ1)⁴)       11.000       40.000       21.000        8.000        1.000        0.000
Entropy [bit]            2.544        2.031        1.811        1.500        1.000        0.000

Plain zero-offset (Eqn. (8)) with d = 2 masks. [HCI = 3]
µ1 = E(·)                6.000        6.000        6.000        6.000        6.000        6.000
µ2 = E((· − µ1)²)        3.000        3.000        3.000        3.000        3.000        3.000
µ3 = E((· − µ1)³)        0.000       −3.000       −1.500        0.000        1.500        3.000
µ4 = E((· − µ1)⁴)       25.500       25.500       25.500       25.500       25.500       25.500
Entropy [bit]            2.839        1.762        1.822        1.836        1.822        1.762

Plain zero-offset (Eqn. (8)) with d = 3 masks. [HCI = 4]
µ1 = E(·)                8.000        8.000        8.000        8.000        8.000        8.000
µ2 = E((· − µ1)²)        4.000        4.000        4.000        4.000        4.000        4.000
µ3 = E((· − µ1)³)        0.000        0.000        0.000        0.000        0.000        0.000
µ4 = E((· − µ1)⁴)       46.000       52.000       49.000       46.000       43.000       40.000
Entropy [bit]            3.047        2.044        2.047        2.046        2.043        2.031

Plain zero-offset (Eqn. (8)) with d = 4 masks. [HCI = 5]
µ1 = E(·)               10.000       10.000       10.000       10.000       10.000       10.000
µ2 = E((· − µ1)²)        5.000        5.000        5.000        5.000        5.000        5.000
µ3 = E((· − µ1)³)        0.000        0.000        0.000        0.000        0.000        0.000
µ4 = E((· − µ1)⁴)       72.500       72.500       72.500       72.500       72.500       72.500
Entropy [bit]            3.208        2.207        2.208        2.208        2.208        2.207

Leakage squeezing zero-offset with d = 1 mask and B = I4 (see Eqn. (11)). [HCI = 4]
µ1 = E(·)                4.000        4.000        4.000        4.000        4.000        4.000
µ2 = E((· − µ1)²)        2.000        2.000        2.000        2.000        2.000        2.000
µ3 = E((· − µ1)³)        0.000        0.000        0.000        0.000        0.000        0.000
µ4 = E((· − µ1)⁴)       11.000       32.000       11.000        8.000       11.000        8.000
Entropy [bit]            2.544        0.669        1.544        1.500        1.544        1.500
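The plain zero-offset rows of Table 2 can be cross-checked by exhaustive enumeration. The sketch below rests on two assumptions that are not spelled out in this part of the text: that the noiseless leakage is the sum of the Hamming weights of all shares, L = HW(Z ⊕ M1 ⊕ ... ⊕ Md) + HW(M1) + ... + HW(Md), and that a column "Z = k" can be represented by any word Z of Hamming weight k. The function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2

N_BITS = 4  # word size n used in Table 2


def hw(value):
    """Hamming weight of a non-negative integer."""
    return bin(value).count("1")


def leakage_distribution(d, z=None):
    """Distribution of the assumed leakage with d uniform masks.

    If z is None, Z is uniform (unconditioned column); otherwise Z is fixed.
    """
    words = range(1 << N_BITS)
    counts = Counter()
    for z_value in (words if z is None else [z]):
        for masks in product(words, repeat=d):
            masked = z_value
            for m in masks:
                masked ^= m  # masked share Z ^ M1 ^ ... ^ Md
            counts[hw(masked) + sum(hw(m) for m in masks)] += 1
    total = sum(counts.values())
    return {leak: c / total for leak, c in counts.items()}


def stats(dist):
    """Return (mu1, mu2, mu3, mu4, entropy in bits) of a distribution."""
    mu1 = sum(p * leak for leak, p in dist.items())
    central = [sum(p * (leak - mu1) ** k for leak, p in dist.items())
               for k in (2, 3, 4)]
    entropy = -sum(p * log2(p) for p in dist.values())
    return (mu1, *central, entropy)


if __name__ == "__main__":
    for d in (0, 1, 2):
        print("d =", d, "unconditioned:", stats(leakage_distribution(d)))
        for k in range(N_BITS + 1):
            z = (1 << k) - 1  # one representative word of Hamming weight k
            print("  HW(Z) =", k, stats(leakage_distribution(d, z)))
```

Under these assumptions, the first central moment that depends on Z is the one of order d + 1, which is consistent with the HCI values listed in the table. The leakage squeezing rows are not covered by this sketch, since they require the bijection of Eqn. (11).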


[Figure: SNR (vertical axis, 0 to 0.007) versus timing samples (horizontal axis, 500 to 1200), one curve per Sbox 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14 and 15.]

Fig. 8. SNR in power for an AES within a Xilinx Virtex 5 FPGA.
