Fast and Efficient MIA


ABSTRACT

Mutual Information Analysis (MIA) has one main advantage over Correlation Power Analysis (CPA): its ability to detect any kind of leakage within traces. However, it remains rarely used and less popular than CPA, probably for two reasons. The first is the appropriate choice of the hyperparameters involved in MIA, a choice that determines its efficiency. The second is the high computational burden associated with MIA. This paper discusses the interest of applying MIA in the frequency domain rather than in the time domain. It is shown that MIA running in this domain is both effective and fast when combined with an accurate frequency leakage model.

1. INTRODUCTION

Since the introduction of Side-Channel Attacks (SCA) by the seminal paper of Kocher et al. [15], a large amount of work has been devoted to their development, which has become a full branch of research in cryptography and embedded system security. Any device performing cryptographic operations is seriously challenged by side-channel cryptanalysis. Indeed, SCA exploit information unintentionally emitted by cryptographic devices during the execution of computing tasks. Such information, being related to the activity of the device, can be a source of useful knowledge for a malicious adversary. Among the known sources of physical leakage in the literature, the two most frequently used are power consumption [15] and electromagnetic radiation [12], [25], [1]. In the time domain (T-domain), many different kinds of power analysis attacks, e.g., Simple and Differential Power Analysis (SPA and DPA) [15], Template Attacks (TA) [10], Correlation Power Analysis (CPA) [8], and Mutual Information Analysis (MIA) [3], [14], have been introduced, each with advantages that make it more suitable in specific conditions. In particular, Pearson's correlation coefficient has become very popular as a distinguisher when the leakage of a device can be approximated by a linear leakage model beforehand [8], [23], a quite common situation for

unprotected devices. However, it is not uncommon to encounter integrated circuits (ICs) for which the linear leakage model is not very accurate or is completely inadequate; an example will be presented in this paper. This has led to the use of more powerful distinguishers capable of detecting more complex dependencies between the physical leakage and the usual leakage models, notably MIA, at the cost of a drastic increase of the computational burden. In most publications related to MIA, emphasis is put on the generality of the concept, as well as on the significance of density estimation in the context of SCA. There are various techniques to estimate Probability Density Functions (PDF) from data. The most widely adopted technique is probably the use of histograms [14]. Some MIA-inspired distinguishers without explicit PDF estimation have also been proposed as alternatives to MIA, such as the Kolmogorov-Smirnov distance [31], [35] or the Cramér-von-Mises test [31]. In the remainder of the paper, we focus on Kernel Density Estimation (KDE), because this method has been shown to have advantages over histograms in terms of effectiveness and noise resistance [31], [24]. Most of the classical non-profiled vertical SCA, i.e. DPA and CPA, have been derived from the T-domain to the frequency domain (F-domain) to take advantage of their robustness against trace misalignment [13], [20]. This paper intends to show the benefits, in terms of practicability (i.e. effectiveness, speed-up and genericity), of translating the third most popular non-profiled SCA, MIA, into the F-domain, especially when an accurate frequency leakage model is used [32]. The remainder of this paper is organised as follows. In Sect. 2, we review the SCA principle, the definition of the Mutual Information (MI) index, the KDE method and the frequency representation based on the Fourier Transform (FT), and we introduce the Mutual Information Frequency Analysis (MIFA) using a frequency leakage model. In Sect. 3, we compare MIFA to MIA and CPA/CPFA using real-world measurements. Finally, we conclude in Sect. 4.

2. PRELIMINARIES

2.1 Advanced or Vertical SCA

The idea of advanced (vertical) SCA is based on the fact that physical leakages emanating from devices contain information on secret data. In practice, the adversary usually tries to statistically link, in the T-domain, the physical leakage to sensitive (i.e. key-dependent) intermediate values computed by the Device Under Test (DUT), which depend on parts of the secret key called subkeys. For this purpose, the adversary builds subkey-dependent models and compares them to the actual physical observations.

In the following, calligraphic letters are used to denote sets. Let K denote the random variable over K modeling a guessable part of the secret cryptographic key κ. Let X denote the random variable over X modeling the public inputs (i.e. plaintext or ciphertext) fed to the DUT. Let V denote the random variable over V modeling the derivation of internal values within the cryptographic algorithm, i.e. f : X × K → V such that V = f(X, K). Let O denote the random variable over O modeling the physical leakage generated by the computation of V during the execution of the cryptographic algorithm.

Firstly, the adversary collects several leakage measurements, also called observations, o_τ at leakage sample τ due to the computation of some sensitive intermediate variable V_κ = f(X, κ), by executing the DUT repeatedly for n different inputs x ∈ X. Namely, V_κ mixes a known part x ∈ X and a small part of κ ∈ K, and its computation by the DUT generates a data-dependent leakage o_τ that satisfies

o_τ = φ_τ(V_κ) + ε_τ,    (1)

where φ_τ is a device-specific deterministic term and ε_τ denotes an independent Gaussian noise. In SCA, the targeted sensitive variable V_κ is normally assumed to be known, since it is part of the algorithm's specification (e.g. the output of an Sbox). Secondly, knowing x ∈ X but not κ ∈ K, the adversary chooses a relevant (depending on the prior knowledge) leakage model φ̂ of φ_τ and computes the predictions φ̂(V_k), represented by the random variable H_k over H, for each key hypothesis k ∈ K. Eventually, φ̂ should be a good approximation of φ_τ to provide a meaningful comparison between O_τ and H_k, i.e. to highlight the dependency between O_τ and H_κ. Then, a statistical tool, called a distinguisher D, is used to detect this dependency, i.e. {D_k(τ)}_{k∈K} = {D(O_τ, H_k)}_{k∈K}, and to decide which is the most probable key hypothesis,

κ̂ = \arg\max_{k∈K} D(o_τ, h_k),

from the observations vector o_τ = {o_τ^0, ..., o_τ^{n−1}}, the predictions vector h_k = {h_k^0, ..., h_k^{n−1}} and the corresponding inputs vector x = {x^0, ..., x^{n−1}}. Suppose each leakage measurement consists of d physical realizations; then a trace can be seen as an element of O^d. As a result, O is a d-dimensional random variable {O_0, ..., O_{d−1}}, where O_t represents the leakage sample t for 0 ≤ t < d. In that case, D is applied on each of the leakage samples independently and returns the best result among them (e.g. the maximum).
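To make this attack flow concrete, the following sketch instantiates the framework above with Pearson's correlation as the distinguisher D (i.e. a CPA) and the Hamming weight of an Sbox output as the leakage model φ̂. The trace array, plaintext array and Sbox table are placeholder inputs, not the paper's data or code.

```python
import numpy as np

# Illustrative inputs (not the paper's data): n traces of d samples and the
# corresponding n plaintext bytes.  A real attack would use the AES Sbox table.
AES_SBOX = np.arange(256, dtype=np.uint8)               # placeholder Sbox
HW = np.array([bin(b).count("1") for b in range(256)])  # Hamming weight table

def cpa_attack(traces, plaintexts):
    """Generic vertical SCA of Sect. 2.1 with Pearson's correlation as the
    distinguisher D: for each key hypothesis k, build the predictions
    h_k = HW(Sbox(x XOR k)) and correlate them with every leakage sample,
    keeping the best sample for each hypothesis."""
    n, d = traces.shape
    centred = traces - traces.mean(axis=0)
    scores = np.zeros(256)
    for k in range(256):
        v_k = AES_SBOX[np.bitwise_xor(plaintexts, k)]   # sensitive value V_k = f(x, k)
        h_k = HW[v_k].astype(float)                     # predictions H_k
        h_c = h_k - h_k.mean()
        num = h_c @ centred                             # one correlation numerator per sample
        den = np.sqrt((h_c ** 2).sum() * (centred ** 2).sum(axis=0)) + 1e-12
        scores[k] = np.abs(num / den).max()             # best leakage sample for this k
    return int(np.argmax(scores)), scores               # arg max over key hypotheses
```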

2.2 Mutual Information Analysis

The concept of MIA was initially introduced by [3] and independently formalized by [14], who characterized this method as a generic distinguisher capable of efficiently detecting any kind of dependency between predictions and observations. It thereby appears as a powerful distinguisher for SCA, since no restrictive assumption on the leakage behaviour is required. As a matter of fact, a specific leakage model, called the Identity model, i.e. φ̂ ≡ Id, was also introduced for this distinguisher in [14] and further investigated in [26]. It consists in using the intermediate values processed by the DUT directly as the predicted values H_k = Id(V_k) = V_k.

Let (X, Y) be a hybrid random vector, that is, X is discrete while Y is continuous with support S_Y; the theoretical version of the MI index is defined as

MI = \sum_{x} l(x) \int_{S_Y} f(y|x) \log_2 \frac{f(y|x)}{g(y)} \, dy,    (2)

where f(y|x) is the conditional (on X) PDF of Y, g(y) (resp. l(x)) is the marginal PDF of Y (resp. X), and the symbol \sum_x refers to a sum taken over the values x of X such that l(x) > 0. Formally, l(x) is a probability mass function (PMF) because X is discrete; to simplify notation, we use the generic acronym PDF. There are other equivalent formulas defining the MI index, notably

MI = H(Y) − H(Y|X)    (3)
   = H(Y) − \sum_{x} l(x) H(Y|x),    (4)

where H(Y) = − \int_{S_Y} g(y) \log_2 g(y) \, dy is the (differential) entropy of the random variable Y, and similarly for H(Y|x). The higher the mutual information, the stronger the dependency between X and Y. Specializing formula (3), its application as an attack in the T-domain can be expressed as computing, at each leakage sample t ∈ {0, ..., d − 1} and for each key hypothesis k ∈ K, the quantity

MI_k(t) = H(O_t) − H(O_t|H_k).    (5)

However, one difficulty in using the MI index is that, in contrast to Pearson's coefficient, which is easily estimated via sample moments, estimating the MI index requires estimating the underlying PDFs, which is both theoretically and practically a non-trivial statistical problem. Several PDF-based approaches have been explored in the open SCA literature, through parametric tools such as cumulants [16] and copulas [34], and nonparametric tools such as histograms [14], B-splines [33], kernels [24], [31] and the maximal information coefficient [18]. Neither parametric nor nonparametric estimators are universally preferable in all situations, however. Nonparametric methods make minimal or no distributional assumptions and can be shown to achieve asymptotic estimation optimality for any input distribution, at the cost of some degrees of freedom owing to the selection of tuning parameters.
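As an illustration of the plug-in estimation of Eq. (4)/(5) with KDE, the following sketch uses a one-dimensional Epanechnikov kernel and a resubstitution entropy estimate; the bandwidth handling and numerical details are illustrative assumptions, not the exact procedure evaluated in Sect. 3.

```python
import numpy as np

def epanechnikov_kde(samples, query, bandwidth):
    """Evaluate a 1-D Epanechnikov kernel density estimate at the query points."""
    u = (query[:, None] - samples[None, :]) / bandwidth
    k = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)
    return k.mean(axis=1) / bandwidth

def entropy_bits(samples, bandwidth):
    """Plug-in (resubstitution) estimate of the differential entropy, in bits."""
    dens = epanechnikov_kde(samples, samples, bandwidth) + 1e-12
    return -np.mean(np.log2(dens))

def mi_index(observations, predictions, bandwidth):
    """Estimate MI_k(t) = H(O_t) - sum_x l(x) H(O_t | H_k = x), per Eq. (4)/(5).
    `observations` are the leakages at one sample t, `predictions` the h_k values."""
    h_o = entropy_bits(observations, bandwidth)
    h_cond = 0.0
    for x in np.unique(predictions):
        o_x = observations[predictions == x]
        if o_x.size > 1:                     # skip degenerate partitions
            h_cond += (o_x.size / observations.size) * entropy_bits(o_x, bandwidth)
    return h_o - h_cond
```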

2.3 Interests of MIA in the F-domain?

In a real-world context, due to either a high sampling rate or a large time window to explore, a side-channel trace may contain a very large number of leakage samples, e.g. more than one million, so manipulating huge leakage traces can be cumbersome; the computational complexity of these SCA can, however, be reduced. Indeed, a first step before performing SCA often consists in reducing the computational complexity by selecting a small subset of points where leakage prevails, i.e. relevant time windows, called Points of Interest (PoI) in the rest of the paper.

Table 1: Attack settings for the different DUTs. For each experiment, the selected PoI intervals roughly cover the attacked round (i.e. the 10th), which has been identified by SPA.

DUT   # traces   PoI              FoI (Hz) using [32]            Nfft    fs (Hz)   # harmonics
#1    10000      [7000;7800]      [1e6;80e6]                     4096    20e9      16
#2    20000      [2200;2600]      [1e6;48e6]                     4096    5e9       48
#3    50000      [10000;10500]    [4e6;48e6] ∪ [90e6;160e6]      4096    20e9      23
#4    80000      [13000;24000]    [1e6;164e6] ∪ [309e6;389e6]    11000   20e9      134
#5    75000      [6300;6800]      [1e3;1e8]                      4096    20e9      20

Indeed, all time samples are not equally useful to the side-channel adversary. Only a few of them contain exploitable information about subkeys, e.g. PoI covering the first or last round of a symmetric cryptographic algorithm (AES or DES). Many heuristics have been proposed for selecting the most interesting time samples. A straightforward solution is to select the samples where a previous successful attack, e.g. CPA, performed best [11], [6]. Alternatively, it is also possible to use pre-processing (dimensionality reduction: d' < d) techniques like Principal Component Analysis (PCA) [29], [5] or Linear Discriminant Analysis (LDA) [29], but these operate with a profiling stage, i.e. they require a clone device. Other methods such as integration [19], filtering [21] or variance-test approaches [22], [4], [30] have also been explored. Additionally, investigations have shown that the use of parallel computing can reduce the data processing time [17]. These related works attempt to circumvent the practical issues, in terms of computational complexity and memory requirements, that arise when dealing with high-dimensional traces in the T-domain.

The running time of MIA in the T-domain scales with the number of analyzed leakage samples, making it computationally prohibitive for large d, i.e. O(αd), where α is the time required to compute the MI index in (5) at a single leakage sample using a specific leakage partitioning φ̂ and PDF estimation tool. There is no direct benefit in translating the input traces from the T-domain into the F-domain using the FFT algorithm, as d frequency points are still generated. If no zero-padding is used, one can exploit the fact that half of the FFT output is redundant (being the complex conjugate of the other half) thanks to the Hermitian symmetry, i.e. O(α(d/2 + 1)), assuming the time needed to compute the MI index in (5) at a single leakage sample equals that at a single frequency point using an identical leakage partitioning φ̂ and PDF estimation tool. The problem of selecting the time samples of interest in the T-domain therefore has its equivalent in the F-domain, i.e. finding the Frequencies of Interest (FoI). However, there is more knowledge about the distribution of the leakage in the F-domain than in the T-domain. Indeed, [20] first observed that the exploitable leakage is essentially distributed at low frequencies, its amplitude being bounded above by the function 1/f (resp. 1/f²) for the EM leakage (resp. the current leakage), as theoretically and experimentally demonstrated in [32], which introduces a frequency leakage model, denoted the Leakage Noise Ratio (LNR) criterion. In addition, experimental results using wavelet decomposition in [28] also confirmed that the leakage is mainly concentrated in the low frequencies (i.e. in the approximation coefficients).
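The following sketch illustrates the point made above: with real-valued traces, an FFT-based transform only needs the d/2 + 1 non-redundant frequency points, and the analysis can then be restricted to a set of FoI bands (here taken as given, e.g. the LNR-selected intervals of Table 1; the LNR criterion itself is described in [32]). The function names and NumPy-based implementation are illustrative.

```python
import numpy as np

def to_frequency(traces, fs, nfft=None):
    """Map T-domain traces (shape (n_traces, d)) to F-domain amplitudes.
    Because the traces are real, np.fft.rfft keeps only the d/2 + 1
    non-redundant points implied by the Hermitian symmetry."""
    n_points = nfft if nfft is not None else traces.shape[1]
    spectra = np.abs(np.fft.rfft(traces, n=n_points, axis=1))
    freqs = np.fft.rfftfreq(n_points, d=1.0 / fs)
    return freqs, spectra

def select_foi(freqs, spectra, bands):
    """Keep only the Frequencies of Interest, given as a list of
    (f_low, f_high) bands treated here as inputs."""
    mask = np.zeros_like(freqs, dtype=bool)
    for lo, hi in bands:
        mask |= (freqs >= lo) & (freqs <= hi)
    return freqs[mask], spectra[:, mask]

# Hypothetical usage with the DUT #2 settings of Table 1 (fs = 5 GHz, Nfft = 4096):
# freqs, spec = to_frequency(traces, fs=5e9, nfft=4096)
# foi_freqs, foi_spec = select_foi(freqs, spec, [(1e6, 48e6)])
```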


There is thus a way to significantly speed up MIA in an automated manner. In the following, the classical MIA defined in the T-domain (see Eq. (5)) is reformulated in the F-domain as

MI_k(f) = H(P(S_f)) − H(P(S_f)|H_k),    (6)

where P is the power spectral density of the random variable S (the transform of O into the F-domain using the FFT algorithm). We restrict ourselves to the amplitude of the signal, since the experiments conducted in Sect. 3 do not give better results when using the Discrete Hartley Transform, which suggests that the information carried by the phase is negligible. The correct subkey κ should satisfy

κ = \arg\max_{k∈K} \left( \max_{f∈F} MI_k(f) \right),    (7)

and if \widehat{MI}_k(f) is an estimate of MI_k(f), an estimate κ̂ of κ using the LNR criterion [32] is obtained by

κ̂ = \arg\max_{k∈K} \left( \max_{f∈F*} \widehat{MI}_k(f) \right),    (8)

where F* ⊆ F denotes the subset of Frequencies of Interest selected by the LNR criterion.
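A minimal sketch of MIFA as defined by Eq. (6)-(8) is given below; it assumes that the spectral amplitudes at the retained FoI and the per-key prediction vectors are already available, and it takes the MI estimator (e.g. the KDE-based one sketched in Sect. 2.2) as a parameter. It illustrates the attack flow only, not the implementation used for the experiments.

```python
import numpy as np

def mifa(spectra, predictions_per_key, mi_estimator):
    """Mutual Information Frequency Analysis, Eq. (6)-(8).
    spectra:             array of shape (n_traces, n_foi) with spectral amplitudes
                         at the retained Frequencies of Interest.
    predictions_per_key: iterable of length-n_traces prediction vectors h_k,
                         one per key hypothesis.
    mi_estimator:        callable (observations, predictions) -> estimated MI,
                         e.g. a wrapper around the KDE estimator of Sect. 2.2.
    Returns the key hypothesis maximising max_{f in F*} MI_k(f), as in Eq. (8)."""
    scores = np.array([
        max(mi_estimator(spectra[:, f], h_k) for f in range(spectra.shape[1]))
        for h_k in predictions_per_key
    ])
    return int(np.argmax(scores)), scores
```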

3. EXPERIMENTAL RESULTS

In this section, we assess the practicability (i.e. efficiency, computational complexity) of MIFA, using the Pearson-coefficient based attacks in the T-domain and the F-domain as a benchmark, referred to as CPA [8] and CPFA [20] respectively. For this purpose, we considered several practical scenarios, i.e. the analysis of 5 AES implementations. First, we attacked EM traces collected above a hardware AES block mapped into a Xilinx Spartan 3 FPGA board operating at 50 MHz, with a Langer RF2 probe and a 48 dB low-noise amplifier, denoted DUT #1. Then, we replicated the same experiments on the publicly available traces acquired on a SASEBO-GII board from the unprotected hardware AES-128 implementation on a Xilinx Virtex-5 FPGA provided by the DPA Contest v2, representing DUT #2. Next, with the same equipment as described for DUT #1, EM traces were recorded from hardware and software AES-128 embedded within a microcontroller equipped with a Cortex M3 processor designed in a 90 nm technology, referred to as DUT #3 and #4 respectively. To investigate genericity, we finally considered DUT #5, corresponding to an AES-128 designed in a 65 nm Low Power High Threshold Voltage CMOS technology, integrating an in-house communication protocol and supplied by 16 pads, so that the power consumed by the AES is not drawn from a single power pad. A Xilinx Spartan 3 FPGA board was used to drive the chip. Power consumption was registered as a voltage drop using a differential voltage probe featuring a bandwidth of 1 MHz to 4 GHz.

Domain   Attack   Level   #1     #2     #3      #4             #5
F        CPFA     wd      430    2440   13500   17800          Fail (34.94)
F        MIFA     wd      950    2690   22100   Fail (12.81)   6700
F        MIFA     mb      700    2520   25400   4400           Fail (19)
T        CPA      wd      370    3060   8300    17900          Fail (28.46)
T        MIA      wd      710    3750   11600   N.A            9600
T        MIA      mb      580    3570   13100   N.A            Fail (21.31)

Table 2: Number of traces required to reach a stable aGE < 10. When an attack fails, the aGE after processing all the traces is reported in parentheses. The N.A acronym (i.e. Not Available) stands for attacks that could not be evaluated because of their high computational burden.


Figure 1: Plots of aGE over normalized CPU time of attacks corresponding to each DUT (i.e. top left: DUT #1, bottom left: DUT #2, top right: DUT #3, bottom right: DUT #4, bottom middle: DUT #5). The normalization was done with respect to the time necessary to obtain an aGE of 10 (resp. 30) with a CPA for DUT #1,#2,#3,#4 (resp. DUT #5).

The attacks target the sixteen bytes of the AES state after the last SubBytes operation. Each attack was performed 50 times (i.e. processing the traces in 50 random orders) against DUT #1, #3, #4 and #5, while for DUT #2 we carried out attacks on 32 independent data sets (with 32 different secret keys). We used the Hamming distance function for the hardware DUTs (i.e. #1, #2, #3 and #5) and the Hamming weight function for the software DUT (i.e. #4). The metric used for the evaluation of the attacks is the average Guessing Entropy (aGE). We recall that the aGE is defined as the average rank of the correct subkey bytes in the sorted list of key hypotheses. Additionally, the FoI were preselected by applying the LNR criterion [32] on a subset of 150 leakage traces. We performed MI-based attacks at bit level (i.e. multi-bit [7]) and at word level [21], [8], as in [9]; they are denoted with the 'mb' and 'wd' suffixes respectively in the remainder of this paper. However, we do not report results for the identity model [14], as it was more computationally intensive with no better results.
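For reference, a sketch of the Hamming weight / Hamming distance prediction models and of the aGE metric described above is given below; the helper names are illustrative, and ranks start at 1 for the best-ranked hypothesis.

```python
import numpy as np

HW = np.array([bin(b).count("1") for b in range(256)])  # Hamming weight table

def hw_model(values):
    """Hamming weight of the predicted intermediate value (software DUT #4)."""
    return HW[values]

def hd_model(values, previous):
    """Hamming distance between consecutive register states (hardware DUTs)."""
    return HW[values ^ previous]

def guessing_entropy(score_matrices, correct_subkeys):
    """Average Guessing Entropy: mean rank of the correct subkey bytes in the
    lists of key hypotheses sorted by distinguisher score (rank 1 = best)."""
    ranks = []
    for scores, k_star in zip(score_matrices, correct_subkeys):
        order = np.argsort(scores)[::-1]                 # best hypothesis first
        ranks.append(int(np.where(order == k_star)[0][0]) + 1)
    return float(np.mean(ranks))
```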

Since a bandwidth larger than the one commonly obtained with Silverman's rule [27] gives better results while reducing the computational burden, we evaluated MIFA (resp. MIA) with bandwidth values systematically set close to the interval length at each considered frequency component (resp. leakage sample). Furthermore, we used 5 query points fixed along a mesh grid between µ − 3σ and µ + 3σ, obtained from 10% of the total number of leakage traces, and the Epanechnikov kernel was chosen. Tab. 1 summarizes the attack settings for each experiment. It is noteworthy that the leaking frequency sub-bands identified by the LNR criterion (i.e. the FoI) were also used to define filters for the T-domain attacks, after verifying that this provides an improvement in terms of effectiveness (a filtering sketch is given below).

From Tab. 2, it may first be noticed that correlation-based attacks are the most efficient attacks against DUT #1, #2 and #3, as they have proven very efficient on a broad majority of hardware implementations (which leak linearly) when combined with classical leakage models like the Hamming distance or Hamming weight function at the word level.
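A possible way to turn the LNR-selected FoI bands into a filter for the T-domain attacks is sketched below, using a standard FIR band-pass filter; the tap count and the SciPy-based implementation are illustrative choices, not the paper's filtering procedure.

```python
import numpy as np
from scipy.signal import firwin, lfilter

def foi_bandpass(traces, fs, f_low, f_high, numtaps=255):
    """Band-pass filter T-domain traces to one LNR-identified FoI band before a
    T-domain attack.  The band limits come from [32] and are treated as inputs;
    numtaps controls the filter sharpness (narrow bands need more taps)."""
    taps = firwin(numtaps, [f_low, f_high], pass_zero=False, fs=fs)
    return lfilter(taps, 1.0, traces, axis=1)

# Hypothetical usage with the DUT #1 settings of Table 1 (fs = 20 GHz, FoI 1-80 MHz):
# filtered = foi_bandpass(traces, fs=20e9, f_low=1e6, f_high=80e6)
```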

For DUT #4, one can observe that MIFAmb is able to reach an aGE < 10 with only 4400 traces, whereas CPA and CPFA require roughly four times more traces. This can be explained by a better discrimination through the distinct influence of each bit rather than at the word level. No results are reported for MIA because its evaluation was too time-consuming; the use of MIFA is thus justified for software traces, where MIA is prohibitive. At this stage, MIFA compares favorably with MIA in terms of effectiveness while remaining competitive with CPA and CPFA.

For DUT #5, a leakage pre-characterization using the Akaike Information Criterion (AIC) [2] with the weighted least squares method showed a non-linear leakage for almost all the Sboxes, with polynomial degrees ranging from 4 to 8 (a sketch of such an AIC-based pre-characterization is given below). Only Sbox 5 was unbreakable, since the degree of its polynomial was null, suggesting that no leakage could be exploited. Accordingly, this test case is suitable to assess whether the genericity of MIA is preserved in the F-domain. Only MI-based attacks succeed at the word level (i.e. MIFAwd and MIAwd), suggesting that the leakage in fact has a non-linear dependency at this level. Naturally, correlation-based attacks (i.e. CPFAwd and CPAwd) fail, with an aGE greater than that of the MI-based attacks at the bit level, which supports the generic nature of the MI distinguisher.

For all the DUTs, we investigated the effectiveness of MIFA by measuring the aGE over normalized CPU time, i.e. the speed/efficiency ratio in recovering the full secret key. The normalization used to compare the analyzed distinguishers was done with respect to the time necessary to obtain an aGE of 10 (resp. 30) with CPAwd for DUT #1, #2, #3, #4 (resp. DUT #5). From Tab. 1, a first observation is that the complexity reduction of the F-attacks comes from the reduced number of analyzed frequencies thanks to the LNR criterion, which allowed performing between 10% and 80% fewer distinguisher computations during our experiments. This indicates that F-attacks should be less time-consuming than the corresponding T-attacks when a frequency leakage model is used. As shown in Fig. 1, transforming MI-based attacks into the F-domain substantially reduces the computational complexity, closing the gap with the fast correlation-based attacks.
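The AIC-based pre-characterization mentioned for DUT #5 can be sketched as follows; this version uses ordinary (unweighted) least squares and a Gaussian-noise AIC for simplicity, whereas the paper relies on weighted least squares, so it should be read as an illustration only.

```python
import numpy as np

def aic_polynomial_degree(values, leakage, max_degree=8):
    """Leakage pre-characterization sketch: fit polynomials of increasing degree
    to the (intermediate value -> leakage) relation by least squares and keep
    the degree minimising the Akaike Information Criterion [2].  A selected
    degree of 0 suggests no exploitable value-dependence, while degrees > 1
    indicate a non-linear leakage, as observed for DUT #5."""
    n = len(leakage)
    best_degree, best_aic = 0, np.inf
    for deg in range(max_degree + 1):
        coeffs = np.polyfit(values, leakage, deg)
        rss = np.sum((leakage - np.polyval(coeffs, values)) ** 2)
        aic = n * np.log(rss / n + 1e-30) + 2 * (deg + 1)   # Gaussian-noise AIC
        if aic < best_aic:
            best_degree, best_aic = deg, aic
    return best_degree
```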

4. CONCLUSION

In this paper, we investigated the interest of applying MIA in the F-domain as a powerful exploratory attack able to replace the systematic CPA. The motivation for this study was that MIA is theoretically more powerful than CPA, as no prior knowledge about the particular dependencies between the processed data and the leakage is required. No advantage was found in the direct application of MIA in the F-domain. However, by taking into account the results of [20] and [32] on the distribution of the leakage in the F-domain, it appears possible to easily reduce the computational burden of MIA by a factor ranging between 10 and 50, allowing the application of MIA to full-length traces, such as those characterizing the execution of software implementations of cryptographic algorithms. Importantly, the experimental results show that this gain is obtained without significant loss of efficiency or genericity; MIA in the F-domain compares favorably with the fast and simple CPA while enjoying additional genericity.

5. REFERENCES

[1] D. Agrawal, B. Archambeault, J. R. Rao, and P. Rohatgi. The EM Side-Channel(s). In Revised Papers from the 4th International Workshop on Cryptographic Hardware and Embedded Systems, CHES '02, pages 29-45. Springer-Verlag, 2003.
[2] H. Akaike. Information Theory and an Extension of the Maximum Likelihood Principle. In B. N. Petrov and F. Csaki, editors, Second International Symposium on Information Theory, Budapest, 1973. Akadémiai Kiadó.
[3] S. Aumonier. Generalized Correlation Power Analysis. In ECRYPT Workshop on Tools For Cryptanalysis, Kraków, Poland, September 2007.
[4] L. Batina, B. Gierlichs, and K. Lemke-Rust. Differential Cluster Analysis. In CHES, volume 5747 of LNCS, pages 112-127. Springer, 2009.
[5] L. Batina, J. Hogenboom, and J. G. J. van Woudenberg. Getting More from PCA: First Results of Using Principal Component Analysis for Extensive Power Analysis. In Proceedings of the 12th Conference on Topics in Cryptology, CT-RSA '12, pages 383-397, Berlin, Heidelberg, 2012. Springer-Verlag.
[6] P. Belgarric, N. Bruneau, J.-L. Danger, N. Debande, S. Guilley, A. Heuser, Z. Najm, O. Rioul, and S. Bhasin. Time-Frequency Analysis for Second-Order Attacks. In CARDIS, 2013.
[7] R. P. Bevan and E. Knudsen. Ways to Enhance Differential Power Analysis. In Information Security and Cryptology (ICISC), pages 327-342, Seoul, Korea, 2002.
[8] E. Brier, C. Clavier, and F. Olivier. Correlation Power Analysis with a Leakage Model. In CHES, volume 3156 of LNCS, pages 16-29, Cambridge, MA, USA, August 2004. Springer, Heidelberg.
[9] M. Carbone, S. Tiran, S. Ordas, M. Agoyan, Y. Teglia, G. R. Ducharme, and P. Maurine. On Adaptive Bandwidth Selection for Efficient MIA. In COSADE, 2014.
[10] S. Chari, J. R. Rao, and P. Rohatgi. Template Attacks. In CHES, volume 2523 of LNCS, pages 13-28, August 2002.
[11] G. Dabosville, J. Doget, and E. Prouff. A New Second-Order Side Channel Attack Based on Linear Regression. IEEE Trans. Computers, 62:1629-1640, 2013.
[12] K. Gandolfi, C. Mourtel, and F. Olivier. Electromagnetic Analysis: Concrete Results. In Proceedings of the Third International Workshop on Cryptographic Hardware and Embedded Systems, CHES '01, pages 251-261, London, UK, 2001. Springer-Verlag.
[13] C. H. Gebotys, S. Ho, and C. C. Tiu. EM Analysis of Rijndael and ECC on a Wireless Java-Based PDA. In J. R. Rao and B. Sunar, editors, CHES, pages 250-264. Springer, 2005.
[14] B. Gierlichs, L. Batina, and P. Tuyls. Mutual Information Analysis: A Generic Side-Channel Distinguisher. In Cryptographic Hardware and Embedded Systems, volume 5141 of LNCS, pages 426-442, 2008.
[15] P. C. Kocher, J. Jaffe, and B. Jun. Differential Power Analysis. In Proceedings of the 19th Annual International Cryptology Conference on Advances in Cryptology, volume 1666 of CRYPTO '99, pages 388-397, London, UK, 1999. Springer-Verlag.
[16] T.-H. Le and M. Berthier. Mutual Information Analysis under the View of Higher-Order Statistics. In IWSEC, volume 6434 of LNCS, pages 285-300. Springer, 2010.
[17] S. J. Lee, S. C. Seo, D.-G. Han, S. Hong, and S. Lee. Acceleration of Differential Power Analysis through the Parallel Use of GPU and CPU. IEICE Transactions, 93-A(9):1688-1692, 2010.
[18] Y. Linge, C. Dumas, and S. Lambert-Lacroix. Maximal Information Coefficient Analysis. Cryptology ePrint Archive, Report 2014/012, 2014.
[19] S. Mangard, E. Oswald, and T. Popp. Power Analysis Attacks: Revealing the Secrets of Smart Cards, volume 31. Springer Publishing Company, Incorporated, 1st edition, December 2006.
[20] E. Mateos and C. H. Gebotys. A New Correlation Frequency Analysis of the Side Channel. In Proceedings of the 5th Workshop on Embedded Systems Security, WESS '10, pages 4:1-4:8, Scottsdale, Arizona, 2010. ACM.
[21] T. S. Messerges, E. A. Dabbish, and R. H. Sloan. Investigations of Power Analysis Attacks on Smartcards. In USENIX Workshop on Smartcard Technology, pages 151-162, 1999.
[22] A. Moradi, O. Mischke, and T. Eisenbarth. Correlation-Enhanced Power Analysis Collision Attack. In CHES, volume 6225 of LNCS, pages 125-139. Springer, 2010.
[23] A. Moradi, N. Mousavi, C. Paar, and M. Salmasizadeh. A Comparative Study of Mutual Information Analysis under a Gaussian Assumption. In WISA 2009, volume 5932 of LNCS, pages 193-205. Springer, Heidelberg, 2009.
[24] E. Prouff and M. Rivain. Theoretical and Practical Aspects of Mutual Information Based Side Channel Analysis. In ACNS 2009, volume 5536 of LNCS, pages 499-518, Paris, France, June 2009.
[25] J.-J. Quisquater and D. Samyde. ElectroMagnetic Analysis (EMA): Measures and Counter-Measures for Smart Cards. In I. Attali and T. P. Jensen, editors, E-smart, LNCS, pages 200-210. Springer, 2001.
[26] O. Reparaz, B. Gierlichs, and I. Verbauwhede. Generic DPA Attacks: Curse or Blessing? In COSADE, 2014.
[27] B. Silverman. Density Estimation for Statistics and Data Analysis. London: Chapman & Hall/CRC, page 48, 1998.
[28] Y. Souissi, M. el Aabid, J.-L. Danger, S. Guilley, and N. Debande. Novel Applications of Wavelet Transforms based Side Channel Analysis. In Non-Invasive Attack Testing Workshop, 2011.
[29] F.-X. Standaert and C. Archambeau. Using Subspace-Based Template Attacks to Compare and Combine Power and Electromagnetic Information Leakages. In CHES, pages 411-425, 2008.
[30] F.-X. Standaert, B. Gierlichs, and I. Verbauwhede. Partition vs. Comparison Side-Channel Distinguishers: An Empirical Evaluation of Statistical Tests for Univariate Side-Channel Attacks against Two Unprotected CMOS Devices. In Information Security and Cryptology, volume 5461 of LNCS, pages 253-267, Seoul, Korea, December 2008.
[31] F.-X. Standaert and N. Veyrat-Charvillon. Mutual Information Analysis: How, When and Why? In Cryptographic Hardware and Embedded Systems, CHES 2009, volume 5747 of LNCS, pages 429-443, Lausanne, Switzerland, September 2009.
[32] S. Tiran, S. Ordas, Y. Teglia, M. Agoyan, and P. Maurine. A Frequency Leakage Model and its Application to CPA and DPA. IACR Cryptology ePrint Archive, 2013:278, 2013.
[33] A. Venelli. Efficient Entropy Estimation for Mutual Information Analysis Using B-Splines. In WISTP, volume 6033 of LNCS, pages 17-30, 2010.
[34] N. Veyrat-Charvillon and F.-X. Standaert. Generic Side-Channel Distinguishers: Improvements and Limitations. In CRYPTO 2011, volume 6841 of LNCS, pages 354-372, 2011. Also Cryptology ePrint Archive, Report 2011/149.
[35] C. Whitnall and E. Oswald. A Comprehensive Evaluation of Mutual Information Analysis Using a Fair Evaluation Framework. IACR Cryptology ePrint Archive, 2011:322, 2011.