Reverse Engineering of Embedded Software Using Syntactic Pattern Recognition

Mike Fournigault1, Pierre-Yvan Liardet2, Yannick Teglia2, Alain Trémeau3, and Frédérique Robert-Inacio1

1 L2MP-ISEN, Place Georges Pompidou, F-83000 Toulon, France
[email protected]
http://www.l2mp.fr/doct/fournigault.html
2 ST Microelectronics, 77 Avenue O. Perroy, F-13790 Rousset, France
3 LIGIV, 18 Rue du professeur Benoît Lauras, F-42000 Saint-Etienne, France

Abstract. When a secure component executes sensitive operations, the information carried by its power consumption can be used to recover secret information. Many techniques have been developed to recover this secret, but only a few of them focus on recovering the executed code itself. Yet the knowledge of the code acquired through this step of Simple Power Analysis (SPA) can help to identify implementation weaknesses and to improve further kinds of attacks. In this paper we present a new approach that improves SPA with a pattern recognition methodology and can be used to automatically identify the processed instructions that leak through power consumption. We first perform a geometrical classification of chosen instructions, which then enables the automatic identification of any sequence of instructions. Such an analysis is used to reverse general-purpose code executions of a recent secure component.

Keywords: Power Analysis, Side Channel, Chip Instructions, Reverse Engineering, Pattern Recognition.

1 Introduction

The purpose of this paper is to study how pattern recognition techniques can be used to classify power signals representing the instructions of a secure component. More precisely, we apply these techniques to a smart card. Kocher et al. showed in [1] that power variations are correlated with both the component instructions and the manipulated data. Consequently, the global power consumption of a microprocessor leaks information about the operations it processes. In particular, when the component performs data encryption operations, this information can be used to recover secret information from the embedded cryptosystem [1,2,3,4,5]. Power analysis attacks generally work on power consumption traces and perform global statistical processing of those signals to discover secret leakage. For example, a Differential Power Analysis (DPA) tries to correlate, via a selection function, hypothetical values of key bits with power signals. Differences between correlated signals and original signals create peaks when correct key bits are guessed. This technique makes it possible to recover the secret key, but the attacker needs to know the type of cryptographic algorithm in order to formulate the selection function of the DPA. Like DPA, many attacks could be improved with further information about the processed code; code recognition is therefore an added value to the attacker.

The Simple Power Analysis (SPA) introduced by Kocher et al. in [1] shows that the sequence of instructions can be observed along the power traces. To refine SPA, tokenizing power signals and identifying current signatures can become an interesting tool to perform power analysis attacks, as well as to set up other kinds of side channel attacks. It can help in synchronising signals and in identifying specific macro code instructions in power signals. For example, the automatic identification of the S-box execution part of a DES makes it possible to determine the corresponding time interval. This can be used to improve the efficiency of a DPA attack, by recording traces only during this interval. It can also help to detect the key points of an algorithm and then direct a Differential Fault Analysis (DFA) attack at those key points.

A first attempt at automatic code recognition on side channels was made in [6]. Other methods based on statistical approaches were also published; they focus on dedicated unknown parts with known structures [7]. The interest of pattern recognition methods was shown in [6]. We continue in this direction, but with a method that is better able to counteract some counter-measures and to deal with sequences of instruction signatures. The aim of this paper is to outline how to classify code instructions extracted from power signals by using pattern recognition methods, and to show how such a classification, paired with sequential pattern relations, can be used to perform an SPA.

The organisation of this paper is as follows. First, we describe the studied power signals. Then, the pattern recognition tool used to compare elementary signatures is detailed. Next, we state how to construct, during a learning stage, the signature models to which input signatures are compared. Afterwards, we explain how pattern sequences and their grammatical analysis can help in macro instruction recognition. Finally, we present a practical application.

R. Meersman, Z. Tari, P. Herrero et al. (Eds.): OTM Workshops 2006, LNCS 4277, pp. 527–536, 2006. © Springer-Verlag Berlin Heidelberg 2006

2 Experiments and Power Signals Under Study

In [8], M.L. Akkar shows that the power consumption $P(I)$ of an instruction $I$ can be separated into: the general power of the instruction, the power due to the input and output data of $I$, and the component due to previous instructions and manipulated data. Experimentally, we observed that previous instructions and data have no significant impact on the result. So, in our consumption model, we choose to neglect the consumption component due to the previous instruction. In our experiments, manipulated data are unknown. In addition, our method characterizes the consumption signatures by taking data influence into account. Consequently, we do not make any distinction between input data and output data. So, we assume that $P(I)$ is defined by:

$$P(I) = P_{gen} \times N_{gen} + P_{data} \times N_{data} \qquad (1)$$


where $P_{gen}$ denotes the general power of the instruction and $N_{gen}$ refers to its variations, $P_{data}$ refers to the power due to data and $N_{data}$ is the corresponding noise. With those considerations, we have to characterize the signature for various manipulated data in order to identify a component instruction from its power signature. So, we have to perform several executions of the same instruction with various random data when possible.

We consider for this study a secured microcontroller for smart cards, more precisely an 8-bit CISC microcontroller with a Von Neumann architecture. The processor is run in stable clock mode with a frequency $f_c = 7$ MHz. The principal component of the consumption remains synchronous with the clock signal. We observe that our method used to analyse power signals is self-adaptive to the clock frequency. In order to characterize the power consumption of each instruction according to the model of equation 1, we programme a loop to execute the same instruction $N$ times. Signals are recorded on the power supply contact with a current probe and a Tektronix T3032B oscilloscope, at a sampling frequency of 2 GHz. Each instruction trace is composed of at least six hundred cycle samples, of which one hundred cycle samples are representative of the instruction code.

In a first step, we need to define the set of instructions to be identified in real code. We consider general CPU statements like "load", "add", "and", "or" and "multiply". We refer to the set of instructions as $I = \{I_i, i \in [1, n]\}$, where $I_i$ denotes a precise instruction. By precise we mean an instruction with an addressing mode. The data set chosen for the $k$th instruction execution is noted $D_k$, $k \in [1, N]$, and $I_i^k = I_i(D_k)$ refers to the $k$th execution of the instruction $I_i$ with data $D_k$. $P_i^k$, $k \in [1, N]$, denotes the current signature of $I_i^k$. In order to identify power traces according to the consumption model of equation 1, we proceed in two stages.
The first stage consists in learning the characteristics of instruction signatures in order to acquire some specific knowledge. We will refer to this step as the learning stage. According to equation 1, the power signatures of an instruction are characterized without separating the contribution of the instruction from the contribution of the data. For each instruction $I_i$, this finally leads to the choice of a set of signature prototypes that constitutes a reference database of signatures statistically representative of $I_i$. The second stage consists in identifying a power signature $P_x$ given as input, by searching for a signature prototype that is similar to $P_x$. The signature database of each characterized instruction is scanned in order to find the signature prototype of $I_i$ that is the most similar to $P_x$. If the similarity degree between both is higher than a threshold value, $P_x$ is said to be characteristic of the instruction $I_i$. Both the learning stage and the identification stage use a measure of similarity between signatures. We describe hereafter how such a similarity is measured.
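The identification stage described above can be sketched as a scan over the prototype databases. This is only an illustration under stated assumptions: the function name `identify`, the dictionary layout of the databases and the placeholder `toy_similarity` are hypothetical and do not come from the paper, whose actual similarity measure is the shape-based one introduced in section 3.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Signature = Sequence[float]


def identify(
    px: Signature,
    databases: Dict[str, List[Signature]],
    similarity: Callable[[Signature, Signature], float],
    threshold: float,
) -> Tuple[Optional[str], float]:
    """Scan every prototype of every instruction and keep the most similar one.

    Returns (instruction name, score); the name is None when even the best
    prototype does not reach the threshold.
    """
    best_name: Optional[str] = None
    best_score = float("-inf")
    for name, prototypes in databases.items():
        for prototype in prototypes:
            score = similarity(px, prototype)
            if score > best_score:
                best_name, best_score = name, score
    # The formal test of section 4 is written with ">= T".
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


def toy_similarity(a: Signature, b: Signature) -> float:
    """Hypothetical placeholder, NOT the paper's measure: 1 / (1 + mean |a - b|)."""
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + mean_abs_diff)


# Made-up three-sample "signatures", for illustration only.
databases = {"add": [[1.0, 2.0, 1.0]], "load": [[3.0, 0.0, 3.0]]}
match = identify([1.0, 2.0, 1.2], databases, toy_similarity, threshold=0.8)
no_match = identify([10.0, 10.0, 10.0], databases, toy_similarity, threshold=0.8)
```

With these made-up values, the first input is attributed to "add", while the second is rejected because no prototype reaches the threshold.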

3 Elementary Pattern Recognition Scheme

Both the learning stage and the identification stage need a tool to compare signatures. As an instruction $I_i$ takes a constant number (noted $R_i$) of cycles to execute, a signature $P_i^k$ is decomposed, using a Gaussian derivative wavelet transform [9], into subparts corresponding to cycle signatures, of which $R_i$ are examined (see fig. 1). In our scheme, each cycle signature is decomposed into significant peaks. Each peak is characterized individually, which allows the influence of data on the signature to be taken into account. From each significant peak of the cycle signature $P_{i,c}^k$, a set of shapes $S_{c,j}$ is constructed by considering the subgraph (see fig. 1). According to this decomposition, matching two cycle signatures $P_{i,c}$ and $P_{i,d}$ consists in comparing every shape of $P_{i,c}$ to every shape of $P_{i,d}$.

Fig. 1. Left: decomposition of $P_i^k$ in cycle signatures. Right: decomposition of a cycle signature in elementary shapes.
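The two-level decomposition (trace into cycle signatures, cycle signature into significant peaks) can be illustrated with a deliberately simplified sketch. It does not use the Gaussian derivative wavelet transform of [9]: cycle boundaries are instead derived from the ratio between the sampling frequency and the clock frequency given in section 2 (2 GHz and 7 MHz), and a peak is taken to be a local maximum above a height chosen by the user. The function names and the `min_height` parameter are assumptions made for this example.

```python
from typing import List, Tuple


def cycle_windows(n_samples: int, fs_hz: float, fc_hz: float) -> List[Tuple[int, int]]:
    """Split the sample range [0, n_samples) into whole clock cycles.

    Simplification: boundaries are placed at multiples of fs/fc instead of
    being detected with a wavelet transform.
    """
    samples_per_cycle = fs_hz / fc_hz
    if samples_per_cycle < 1:
        raise ValueError("the sampling frequency must exceed the clock frequency")
    windows = []
    k = 0
    while True:
        start = round(k * samples_per_cycle)
        end = round((k + 1) * samples_per_cycle)
        if end > n_samples:
            break
        windows.append((start, end))
        k += 1
    return windows


def significant_peaks(cycle: List[float], min_height: float) -> List[int]:
    """Indices of the local maxima of a cycle signature that reach min_height."""
    return [
        i
        for i in range(1, len(cycle) - 1)
        if cycle[i - 1] < cycle[i] >= cycle[i + 1] and cycle[i] >= min_height
    ]


# About 285.7 samples per cycle at 2 GHz sampling and a 7 MHz clock.
windows = cycle_windows(1000, fs_hz=2e9, fc_hz=7e6)
# Made-up cycle signature with two peaks above the chosen height.
peaks = significant_peaks([0, 1, 3, 1, 0, 2, 5, 2, 0], min_height=2.5)
```

Because the number of samples per cycle is not an integer here, rounding each boundary separately avoids the drift that a fixed integer window length would accumulate over a long trace.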

In the learning stage, shapes are used to build shape prototypes. In the identification stage, input shapes are compared to the learned prototype shapes. These two stages are based on a pattern recognition tool defined by F. Robert in [10]. This tool is a shape parameter that defines a similarity degree between two shapes, the shape under study and a reference shape. Robert's parameter is a bounded measure that makes it possible to compare shapes according to some geometrical features. An important fact for our application is that this parameter is invariant under translation and scaling. This allows us to cope with some counter-measures running on the studied component that modify the magnitude of consumption peaks. This parameter considers two convex shapes: $X$, the shape under study, and $A$, the reference shape. In order to compute Robert's parameter, we search for the smallest homothetic set of $A$ circumscribed to $X$, $A_X = \lambda_A(X) \cdot A$, where $\lambda_A(X)$ denotes the scale ratio to apply to $A$. Then, the smallest homothetic set of $X$ circumscribed to $A_X$, $X_{A_X} = \lambda_X(A_X) \cdot X$, is computed. In this way, the two shapes $A$ and $X$ are compared with Robert's shape parameter, as follows:

$$p_{A_X}(X) = \frac{1}{\lambda_X(A_X)} \cdot \frac{\mu(X)}{\mu(A_X)} \qquad (2)$$

where $\mu(X)$ and $\mu(A_X)$ refer respectively to the areas of $X$ and $A_X$.

Robert's parameter is presented in [10] for convex shapes. Since, in our application, the elementary signatures and the corresponding patterns $S_j$ are often only quasi-convex, it would be quite restrictive to deal only with convex shapes. A way to compute this parameter for non-convex shapes is to proceed as explained in [11]. We note $p(S_{c,j}, S_{d,l})$ the similarity measure between the shape $S_{c,j}$ extracted from $P_{i,c}$ and the shape $S_{d,l}$ extracted from $P_{i,d}$. The similarity measure between two cycle signatures $P_{i,c}$ and $P_{i,d}$, $M(P_{i,c}, P_{i,d})$, is:

$$M(P_{i,c}, P_{i,d}) = \sum_{j} \max_{l} \, p(S_{c,j}, S_{d,l}) \qquad (3)$$
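To make equation (2) concrete, the sketch below evaluates it for axis-aligned rectangles, a convex special case chosen here only because both circumscribing ratios then have a closed form; the paper itself works on shapes extracted from power peaks. Equation (3) is read as a sum over the shapes $j$ of the best score over $l$, which is consistent with the values reported in section 5. All function names are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[float, float]  # (width, height), both assumed strictly positive


def circumscribing_ratio(inner: Rect, outer: Rect) -> float:
    """Smallest factor s such that s * outer contains inner (translation is free)."""
    return max(inner[0] / outer[0], inner[1] / outer[1])


def robert_parameter(x: Rect, a: Rect) -> float:
    """Equation (2) for rectangles: p = (1 / lambda_X(A_X)) * mu(X) / mu(A_X)."""
    lam_a = circumscribing_ratio(x, a)       # lambda_A(X)
    a_x = (lam_a * a[0], lam_a * a[1])       # A_X = lambda_A(X) . A
    lam_x = circumscribing_ratio(a_x, x)     # lambda_X(A_X)
    area_x = x[0] * x[1]                     # mu(X)
    area_a_x = a_x[0] * a_x[1]               # mu(A_X)
    return area_x / (lam_x * area_a_x)


def cycle_similarity(shapes_c: List[Rect], shapes_d: List[Rect]) -> float:
    """Equation (3): for each shape of c keep its best score in d, then sum."""
    return sum(max(robert_parameter(s, t) for t in shapes_d) for s in shapes_c)


# A 2 x 1 rectangle compared with a unit square: lambda_A(X) = 2, A_X is 2 x 2,
# lambda_X(A_X) = 2, so p = (1 / 2) * (2 / 4) = 0.25.
p_different = robert_parameter((2.0, 1.0), (1.0, 1.0))
# A scaled copy of the reference shape scores 1, illustrating scale invariance.
p_same = robert_parameter((3.0, 6.0), (1.0, 2.0))
m = cycle_similarity([(2.0, 1.0), (1.0, 1.0)], [(1.0, 1.0)])
```

In this special case the parameter stays between 0 and 1, so a cycle signature made of three shapes, as in section 5, can score at most 3.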

4 Learning Stage and Current Signature Sequences in Grammatical Formulation

The learning stage is the operation of determining, for each instruction $I_i$, the set of signature prototypes that are characteristic of its power consumption, regardless of the manipulated data. Each instruction is assigned several signature prototypes $P_i^1, \ldots, P_i^q$. We note $P_i$ the set of all signature prototypes of $I_i$. As mentioned in section 3, the number of cycles $R_i$ needed to execute $I_i$ is known during the learning stage. So, each prototype of $P_i$ is constructed from $R_i$ prototypes $P_{i,c}$ of cycle signatures. The learning stage begins with the choice of cycle signature prototypes $P_{i,c}$ that are representative of those possible for $I_i$. It continues with the construction of each instruction signature prototype $P_i^q$ from cycle prototypes.

A cycle prototype $P_{i,c}$ is said to be characteristic of $n$ samples of cycle signature (regardless of the data influence) if it is similar, according to a threshold value $T$, to $n$ samples of cycle signature, where $n$ is as large as possible with respect to the threshold $T$. The prototype is chosen to be a sample $P_{i,c}$, $c \in [1, n]$, such that $\forall d \in [1, n]$, $M(P_{i,c}, P_{i,d}) \geq T$. Maximising $n$ for each prototype minimises the number of prototypes $P_{i,c}$ required to characterize an instruction. Consequently, it minimises the matching complexity during the identification stage. Finally, the set of instruction signature prototypes is learned as the set of all sequences of cycle prototypes encountered.

We consider the problem of identifying an instruction signature in a power trace as that of matching a string with another string of a reference language. During the learning stage, the language $L(G_i)$ associated with each instruction $I_i$ is the set $P_i$ of instruction signature prototypes. The grammar $G_i$ is the set of rules to derive each $P_i^q$ in $R_i$ cycle prototypes $P_{i,1}^q \wedge \ldots \wedge P_{i,R_i}^q$, where $\wedge$ is the concatenation operator, and finally, each cycle prototype $P_{i,c}^q$ in elementary shapes $S_{c,j}^q$.
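The paper does not spell out how the number of samples covered by each cycle prototype is maximised. One plausible reading, given here purely as an assumption, is a greedy cover: repeatedly promote to prototype the sample that is similar (score at least $T$) to the largest number of samples not yet covered. The function name `select_prototypes` and the one-dimensional toy similarity are hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")


def select_prototypes(
    samples: Sequence[S],
    similarity: Callable[[S, S], float],
    threshold: float,
) -> List[int]:
    """Greedy cover (an assumed heuristic, not the authors' stated procedure).

    Returns the indices of the samples promoted to prototypes.
    """
    uncovered = set(range(len(samples)))
    prototypes: List[int] = []
    while uncovered:
        best_index, best_cover = None, set()
        for c in sorted(uncovered):
            cover = {
                d for d in uncovered
                if similarity(samples[c], samples[d]) >= threshold
            }
            if len(cover) > len(best_cover):
                best_index, best_cover = c, cover
        if best_index is None:
            # No remaining sample reaches the threshold even with itself.
            break
        prototypes.append(best_index)
        uncovered -= best_cover
    return prototypes


def toy_similarity(a: float, b: float) -> float:
    """Hypothetical stand-in for M: 1 / (1 + |a - b|) on scalar 'signatures'."""
    return 1.0 / (1.0 + abs(a - b))


# Two clusters of made-up scalar samples yield two prototypes.
chosen = select_prototypes([1.0, 1.1, 0.9, 5.0, 5.2], toy_similarity, threshold=0.8)
```

Greedy covering does not guarantee the minimum number of prototypes, but it follows the stated goal of making each prototype characteristic of as many samples as possible.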
The registration of the set $P_i$ of prototypes enables string recognition as direct string matching, without any parser tool. In this way, the identification of an instruction through its power signal is equivalent to the simple matching of its string representation to some reference strings of a database constructed during the learning stage. An input signature $P_x$ is similar to a signature prototype $P_i^q$ if, for each cycle signature $P_{x,c}$ of $P_x$ and the corresponding cycle signature $P_{i,c}^q$ of $P_i^q$, $M(P_{x,c}, P_{i,c}^q) \geq T$. The signature database of each characterized instruction is scanned in order to find the signature prototype of $I_i$ that is the most similar to $P_x$.

This syntactic analysis is generalized to instruction sequences, and allows us to recognize macro code instructions and their sequential execution. When combined with the previous pattern matching, this syntactic analysis provides a tool to reverse code through power analysis. In figure 2, we illustrate the syntactic recognition process with the example of the identification of a cycle signature.

Fig. 2. Recognition of a cycle signature according to syntactic methods. The elementary pattern $S_p$ is discriminant in the identification of the input cycle signature.
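Direct string matching over instruction sequences can be sketched as follows, assuming that each cycle signature of the input trace has already been mapped to the label of its closest cycle prototype. The labels, the contents of the language and the function name `recognise_sequence` are hypothetical; in particular, the two-cycle length given to "xor" is made up, whereas the three-cycle "load" follows the experiment reported in section 5.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def recognise_sequence(
    cycle_labels: Sequence[str],
    language: Dict[str, List[Tuple[str, ...]]],
) -> List[Optional[str]]:
    """Tokenise a stream of cycle-prototype labels into instructions.

    language maps an instruction name to its prototype strings, each string
    being the concatenation of cycle-prototype labels. At each position the
    longest matching string wins; an unmatched cycle is reported as None.
    """
    rules = sorted(
        (
            (tuple(string), name)
            for name, strings in language.items()
            for string in strings
            if string  # ignore empty strings, which would never advance
        ),
        key=lambda rule: len(rule[0]),
        reverse=True,
    )
    recognised: List[Optional[str]] = []
    pos = 0
    while pos < len(cycle_labels):
        for string, name in rules:
            if tuple(cycle_labels[pos:pos + len(string)]) == string:
                recognised.append(name)
                pos += len(string)
                break
        else:
            recognised.append(None)
            pos += 1
    return recognised


# Hypothetical language: two prototype strings for "load", one for "xor".
language = {
    "load": [("L1", "L2", "L3"), ("L1", "L4", "L3")],
    "xor": [("X1", "X2")],
}
trace = ["L1", "L2", "L3", "L1", "L4", "L3", "X1", "X2"]
decoded = recognise_sequence(trace, language)
```

Because every prototype string is stored explicitly, no parser is needed: recognition reduces to comparing slices of the label stream with the stored strings.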

5 Experiment Results on a Recent Secure Component

In this section, we present some application results obtained for instruction identification on a recent secure component. This component runs some counter-measures, in particular a magnitude counter-measure that occurs randomly on consumption peaks during instruction execution. This component also embeds a phase jitter counter-measure that was disabled for our experiments. In order to give these application results, all instructions that we want to identify have been previously characterized during a learning stage, so that all signature prototypes are available in the registered databases.

Fig. 3. Pattern models extracted from $P_{add,1}^1$


We first illustrate the identification of an instruction "add". We begin this example with the matching of a cycle signature, and then the complete instruction signature is presented. The signature model of the first instruction cycle is extracted from the prototype $P_{add}^1$ and is noted $P_{add,1}^1$. Three pattern models are separated: $S_{add,1}^1$, $S_{add,2}^1$, $S_{add,3}^1$.

We describe the result of matching two signature samples of $I_{add}$, $P_{add}^1$ and $P_{add}^2$, with the prototype $P_{add}^1$. The first cycle signatures of those two samples are noted respectively $P_{add,1}^1$ and $P_{add,1}^2$. We give on fig. 3 the three pattern models computed from the prototype $P_{add,1}^1$, and those of the samples $P_{add,1}^1$ and $P_{add,1}^2$ are given on fig. 4.

Fig. 4. Left: $P_{add,1}^1$ and its patterns. Right: $P_{add,1}^2$ and its patterns.

In order to identify the power signals $P_{add,1}^1$ and $P_{add,1}^2$, all input patterns are compared to all patterns of the prototype $P_{add,1}^1$. In tab. 1, we have reported the shape parameter value between $S_{add,j}^k$ and $S_{add,j}$, $j = \{1, \ldots, 3\}$ and $k = \{1, 2\}$, for the shapes of fig. 4 according to the model patterns of fig. 3.

Table 1. Best scores for the comparison of $S_{add,j}^k$ and $S_{add,j}$

Sample           $S_{add,1}^k / S_{add,1}$   $S_{add,2}^k / S_{add,2}$   $S_{add,3}^k / S_{add,3}$
$P_{add,1}^1$    0.48                        0.35                        0.37
$P_{add,1}^2$    0.46                        0.71                        0.54

Although the values of tab. 1 seem to be quite low, they are ten times higher than the values obtained when comparing $S_{add,j}^k$ / $S_{add,o}$ with $o \in [1, 3]$, $j \in [1, 3]$, $o \neq j$. They lead to the similarity measures $M(P_{add,1}^1, P_{add,1}^1) = 1.2$ between the first sample and the prototype, and $M(P_{add,1}^2, P_{add,1}^1) = 1.71$ between the second sample and the prototype. The threshold value used is $T = 1$, which allows us to say that the cycle signatures $P_{add,1}^1$ and $P_{add,1}^2$ are similar to the cycle prototype $P_{add,1}^1$.

We now consider the entire signatures $P_{add}^1$ and $P_{add}^2$. In this example, the sample $P_{add}^1$ matches the prototype signature $P_{add}^1$, with respect to the minimum similarity value $T$, and no other signature prototype of any $I_i$, $i \neq add$, gives better results. Because of its second cycle signature, $P_{add}^2$ does not match the prototype $P_{add}^1$, but it matches another prototype of the "add" signature, $P_{add}^4$, which gives higher similarity measures with $P_{add}^2$ than any other prototype of any instruction. This example shows that our method makes it possible to identify the instruction "add" through its power signatures. In our experiments, we identify the instruction "add" with at least 75% of success on the secure component tested.

We now illustrate this recognition scheme with an input signature corresponding to the instruction "load" executed with random data. This sample is noted $P_{load}^1$. We begin this example with the matching of one cycle signature, noted $P_{load,c}^1$ (see figure 5). We present the comparison of $P_{load,c}^1$ to the cycle prototype of the instruction "add" that gives the best scores, $P_{add,1}$.

Fig. 5. Example of a cycle signature of a load instruction

In tab. 2, we have reported the shape parameter values between $S_{load,j}$ and $S_{add,j}$, $j = \{1, \ldots, 3\}$, that correspond to the best scores of the comparisons $S_{load,j}$ / $S_{add,o}^1$, $j \in [1, 3]$, $o \in [1, 3]$. They lead to the similarity measure $M(P_{load,1}^1, P_{add,1}^1) = 0.85 < T$. The best scores for $S_{load,2}$ and $S_{load,3}$ are equivalent to those of $S_{add,2}^k$ and $S_{add,3}^k$. But the score for $S_{add,1}^k$ is more than 2 times better than the score for $S_{load,1}$. From this example we verify that, according to the threshold value $T = 1$, shape parameter values are discriminant enough to conclude that $P_{load,1}^1$ is not similar to $P_{add,1}^1$.

We now compare the entire signature $P_{load}^1$ to the signature prototypes of the instruction "add". The tested instruction "load" takes 3 cycles to execute, and so the first three cycle signatures of $P_{load}^1$ are tested. The string of $P_{load}^1$ cannot be matched to any string prototype of the instruction "add", and we verify for this example that the signature $P_{load}^1$ is not characteristic of the instruction "add". Finally, we give on figure 6 the application of identifying subparts of an input signature.
It was successfully identified as the sequence of two different "load" instruction signatures, followed by an "XOR" instruction signature.

Table 2. Best scores for the comparison of $S_{load,j}$ to $S_{add,o}^1$

Sample            $S_{add,1}^1 / S_{load,1}$   $S_{add,2}^1 / S_{load,2}$   $S_{add,3}^1 / S_{load,3}$
$P_{load,1}^1$    0.19                         0.36                         0.30
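The similarity values quoted in this section can be checked directly from Tables 1 and 2: each table row already holds the best score obtained for each shape, so the cycle similarity is simply their sum, compared afterwards with the threshold $T = 1$. The short script below performs this arithmetic; the variable and function names are chosen for this example only.

```python
import math

THRESHOLD = 1.0  # T = 1, as used in the experiments

# Best per-shape scores copied from Table 1 (two rows) and Table 2 (one row).
best_scores = {
    "add sample 1 vs add prototype": [0.48, 0.35, 0.37],
    "add sample 2 vs add prototype": [0.46, 0.71, 0.54],
    "load sample vs add prototype": [0.19, 0.36, 0.30],
}


def similarity_from_best_scores(scores):
    """Sum of the best shape scores, i.e. the cycle similarity M."""
    return sum(scores)


results = {
    name: similarity_from_best_scores(scores)
    for name, scores in best_scores.items()
}

for name, m in results.items():
    verdict = "similar" if m >= THRESHOLD else "not similar"
    print(f"{name}: M = {m:.2f} ({verdict})")
```

The three sums reproduce the values 1.2, 1.71 and 0.85 given in the text, and only the "load" cycle signature falls below the threshold.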


Fig. 6. Identification of subparts of an input signature

6 Conclusion

In this paper, we have shown that pattern recognition methods can automatically identify instructions through the power signals of a recent secured smart card component. This process of instruction identification needs two steps: a characterization step to produce signature models and a recognition step to compare input signatures to signature models. Tokenizing instruction signatures into cycle signatures, and then into elementary patterns, allows a local analysis to be performed. This local analysis, powered by a shape parameter and a syntactic analysis, makes it possible to automatically identify precise subparts of an instruction signature, such as a cycle signature, and then the complete instruction signature. According to the results of our experiments, our pattern recognition scheme recognizes 75% of the tested instruction signatures in the worst case, and 81% in the average case, showing that this is an interesting tool to reverse code instructions from power signals. Most of the identification failures encountered are due to counter-measures that were running. Although our method is efficient against some counter-measures such as the amplitude counter-measure, it does not work when the phase jitter counter-measure is activated. Finally, let us point out that our method needs to be applied to an open component on which it is possible to execute specific instructions, in order to learn signature prototypes. In future work, it would be interesting to test a different syntactic analysis scheme, such as regular grammar analysis, in order to analyse how much the previous instruction and elementary statement can influence the power consumption.

References

1. Kocher, P., Jaffe, J., Jun, B.: Differential Power Analysis. Lecture Notes in Computer Science 1666 (1999) 388–397
2. Fahn, P., Pearson, P.: IPA: A New Class of Power Attacks. Lecture Notes in Computer Science 1717 (1999) 173–186
3. Clavier, C., Coron, J.S., Dabbous, N.: Differential Power Analysis in the Presence of Hardware Countermeasures. Lecture Notes in Computer Science 1965 (2000) 252–263
4. Berna Ors, S., Gurkaynak, F., Oswald, E., Preneel, B.: Power-Analysis Attack on an ASIC AES implementation. IEEE ITCC 04 proceedings 2 (2004) 546
5. Mangard, S.: A Simple Power-Analysis (SPA) attack on implementations of the AES key expansion. ICISC 02 Proceedings, Lecture Notes in Computer Science 2587 (2002)
6. Quisquater, J.J., Samyde, D.: Automatic Code Recognition for Smartcards Using a Kohonen Neural Network. Proceedings of the Fifth Smart Card Research and Advanced Application Conference (CARDIS '02). San Jose, USA, November (2002)
7. Clavier, C.: Side Channel Analysis for Reverse Engineering (SCARE) - An Improved Attack Against a Secret A3/A8 GSM Algorithm. Cryptology ePrint Archive, http://eprint.iacr.org/, Report 2004/049 (2004)
8. Akkar, M.L.: Attaques et méthodes de protections de systèmes cryptographiques embarqués. Doctoral Thesis, Versailles University (2004)
9. Bigot, J.: A scale-space approach to landmark detection. Technical Report TR2046, PAI (Interuniversity Attraction Pole network) (2002)
10. Robert, F.: Shape studies based on the circumscribed disk algorithm. IEEE CESA 98 proceedings, IEEE-IMACS, Hammamet, Tunisia, 1-4 April (1998)
11. Fournigault, M., Trémeau, A., Robert-Inacio, F.: Characteristic centre point for quasi-convex shapes. 9th European Congress on Stereology and Image Analysis proceedings (ECSIA), Zakopane, Poland, 10-13 May. 2 (2005) 299–304
