Two-Stage Optimisation in the Design of Boolean Functions

John A Clark and Jeremy L Jacob
Department of Computer Science, University of York, England
{jac, jeremy}@cs.york.ac.uk
Abstract. This paper shows how a suitable choice of cost function can significantly affect the power of optimisation methods for the synthesis of Boolean functions. In particular, we show how simulated annealing, coupled with a new cost function motivated by Parseval's Theorem, can be used to drive the search into areas of the design space from which traditional techniques, such as hill-climbing, can then find excellent solutions.
1 Introduction
Cryptography needs ways to find good Boolean functions so that ciphers can resist various forms of cryptanalytic attack (particularly linear cryptanalysis and differential cryptanalysis). The main properties required are high non-linearity and low autocorrelation [9]. Recent work [7] has investigated and compared the use of random search, hill-climbing, genetic algorithms and a hybrid approach for the derivation of Boolean functions with high non-linearity. Other work [9] has investigated the use of enhanced hill-climbing methods to derive balanced Boolean functions with high non-linearity and low autocorrelation. In this paper we investigate the use of another heuristic technique, namely simulated annealing [4] (based on the annealing process for metals). The technique has been used by other researchers to break simple substitution and transposition ciphers [1,2] and to cryptanalyse systems based on the NP-hardness of discovering a trapdoor secret, e.g. attacks on the Permuted Perceptron Problem (PPP) [5].

In optimisation-based methods the cost (or fitness) function plays a crucial role. For highly non-linear Boolean function design, extant optimisation methods seek to maximise the non-linearity directly. In this paper we introduce a new cost function (motivated by Parseval's Theorem) that enables the search to reach areas of the design space from which hill-climbing techniques can be used more effectively. Using this new and simple cost function we are able to converge on areas of the solution space with high non-linearity and low autocorrelation.

The paper is structured as follows. In Section 2 we recap basic Boolean function terminology. In Section 3 we outline the notion of local search, describe the simulated annealing algorithm, discuss the cost functions currently used and provide a template for enhanced cost functions. Section 4 shows how using a
two-stage strategy using our new cost function at the first stage provides better results than direct maximisation of non-linearity alone. It provides comparisons with functions derived from random generation followed by hill-climbing (which is far more effective than random generation alone as shown in [8]). In Section 4.2 we show how the basic technique can be significantly improved by suitable parameter selection.
2 Boolean Functions
This section summarises the basic cryptographic definitions needed. We shall denote the binary truth table of a Boolean function by $f : Z_2^n \rightarrow Z_2$, mapping each combination of $n$ binary variables to some binary value. If the number of combinations mapping to 0 is the same as the number mapping to 1 then the function is said to be balanced. The polarity truth table is a particularly useful representation for our purposes. It is defined by $\hat{f}(x) = (-1)^{f(x)}$. Two functions $f$ and $g$ are said to be uncorrelated when $\sum_{x \in Z_2^n} \hat{f}(x)\hat{g}(x) = 0$. If so, approximating $f$ by $g$ will be right half the time and wrong half the time. An area of particular importance for cryptanalysts is the ability to approximate a function $f$ by a simple linear function. One of the cryptosystem designer's tasks is to make such approximation as difficult as possible (by making the function $f$ suitably non-linear). We shall make use of the following terms:

Linear Boolean Function. A linear Boolean function, selected by $\omega \in Z_2^n$, is denoted by $L_\omega(x) = \omega_1 x_1 \oplus \omega_2 x_2 \oplus \cdots \oplus \omega_n x_n$.

Affine Function. The set of affine functions is the set of linear functions and their complements: $A_{\omega,c}(x) = L_\omega(x) \oplus c$.

Walsh-Hadamard Transform. For a Boolean function $f$ the Walsh-Hadamard Transform $\hat{F}$ is defined by $\hat{F}(\omega) = \sum_{x \in Z_2^n} \hat{f}(x)\hat{L}_\omega(x)$. We denote the maximum absolute value taken by the transform by $WH_{max}(f) = \max_{\omega \in Z_2^n} |\hat{F}(\omega)|$. It is related to the non-linearity of $f$.

Non-linearity of f. The non-linearity $N_f$ of a Boolean function $f$ is the minimum distance to any affine function. It is given by $N_f = \frac{1}{2}(2^n - WH_{max}(f))$.

Parseval's Theorem. This states that $\sum_{\omega \in Z_2^n} \hat{F}(\omega)^2 = 2^{2n}$. A consequence of this result is that $WH_{max}(f) \geq 2^{n/2}$. This motivates our new cost function.

Autocorrelation. The autocorrelation $AC_f$ of a Boolean function $f$ is given by $AC_f = \max_s \left| \sum_x \hat{f}(x)\hat{f}(x \oplus s) \right|$, where $x$ and $s$ range over $Z_2^n$ and $x \oplus s$ denotes bitwise XOR (and so produces a result in $Z_2^n$).
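To make these definitions concrete, the following minimal sketch (our illustration, not code from the paper) computes the transform, non-linearity and autocorrelation from a polarity truth table stored as a Python list of $\pm 1$ values indexed by $x = 0, \dots, 2^n - 1$. For the autocorrelation we exclude the trivial shift $s = 0$, which always yields $2^n$.

```python
def walsh_hadamard(f_hat, n):
    """F_hat(w) = sum over x of f_hat(x) * (-1)^popcount(w & x), for every w."""
    return [sum(f_hat[x] * (-1) ** bin(w & x).count("1") for x in range(2 ** n))
            for w in range(2 ** n)]

def nonlinearity(f_hat, n):
    """N_f = (2^n - WH_max(f)) / 2."""
    return (2 ** n - max(abs(v) for v in walsh_hadamard(f_hat, n))) // 2

def autocorrelation(f_hat, n):
    """AC_f = max over shifts s != 0 of |sum over x of f_hat(x)*f_hat(x^s)|."""
    return max(abs(sum(f_hat[x] * f_hat[x ^ s] for x in range(2 ** n)))
               for s in range(1, 2 ** n))

# Tiny check: majority of three bits is balanced and has non-linearity 2.
f_hat = [(-1) ** b for b in [0, 0, 0, 1, 0, 1, 1, 1]]
assert nonlinearity(f_hat, 3) == 2
```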
3 Discrete Optimisation Approaches

3.1 Overview
Optimisation techniques work either with a single candidate solution or with a population of candidate solutions. Both have been used in the design of Boolean
functions, for example hill-climbing (single) [9] and genetic algorithms (population) [8]. Techniques working with a single solution are usually termed local since they consider moving to solutions that are 'close' to the current one (i.e. in the local neighbourhood). In contrast, population-based techniques are typically described as global. The distinction is not essential (with suitable abstraction population-based approaches can also be viewed as local) but we shall follow convention and refer to the simulated annealing-based search we use in this paper as a local search technique.

The optimisation is carried out with respect to some 'evaluation function' that measures how 'good' a candidate is. The terms 'fitness function' and 'cost function' are also used. 'Fitness' is used most naturally for maximisation problems and 'cost' for minimisation problems. To work effectively, an evaluation function must provide sufficient guidance to the search process. In formal terms we require reasonable local smoothness (essentially, the fitness/cost values of neighbouring solutions are not too different from the value of the current one). The non-linearity function $N_f$ is reasonably smooth [8].

We aim to provide balanced functions with high non-linearity (and low autocorrelation). We shall adopt a strategy that starts with a balanced but otherwise random function and moves only to neighbouring solutions that preserve balance. We define the neighbourhood of a balanced function $\hat{f}$ to be all functions $\hat{g}$ obtained from $\hat{f}$ by swapping any two dissimilar values associated with two domain elements $x, y \in Z_2^n$ (we shall refer to the normal representation $f$ and the polar representation $\hat{f}$ as convenient). In formal terms, $\hat{g}$ is in the neighbourhood of $\hat{f}$ if $\exists x, y \in Z_2^n$ such that

1. $\hat{f}(x) \neq \hat{f}(y)$,
2. $\hat{g}(x) = \hat{f}(y)$, $\hat{g}(y) = \hat{f}(x)$, and
3. $\forall z \in Z_2^n \setminus \{x, y\} : \hat{g}(z) = \hat{f}(z)$.

A local search starts at some initial solution $\hat{f}_0$ and advances through a series of neighbouring solutions $\hat{f}_1 \cdots \hat{f}_{end}$. Restrictions may be placed on the relative fitnesses of consecutive $\hat{f}_i$; these determine the nature of the search. Millan et al [9] characterise the fitness of a solution $\hat{f}$ by the pair of values $(N_f, AC_f)$. For non-linearity a strong strategy allows only moves that strictly improve $N_f$, i.e. we must have $N_{f_{i+1}} > N_{f_i}$. A weak strategy requires only that moves do not make the non-linearity worse, i.e. we require only $N_{f_{i+1}} \geq N_{f_i}$. Finally, it is possible to impose no restrictions on non-linearity. Similar considerations apply to autocorrelation, e.g. a strong strategy would require that $AC_{f_{i+1}} < AC_{f_i}$ (since improving here means smaller autocorrelation). Millan et al [9] describe nine search strategies, each combining a strategy for non-linearity and a strategy for autocorrelation. Imposing some element of restriction corresponds to what is generally known as hill-climbing. The most restrictive strategy is (strong, strong). The most permissive is (none, none), essentially allowing a random walk.

A problem with hill-climbing methods is that they can get stuck in local optima. Some modern heuristic local search techniques work by encouraging improving moves but allowing (or, with some techniques, forcing) some worsening moves to be accepted as a means of escaping such local optima. Simulated annealing is one such technique and is described in Section 3.4. In this paper, a minimisation framework is the most natural. We shall optimise with respect to two different cost functions in a two-stage search. A sketch of the balance-preserving move appears below.
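The following sketch (ours, not the authors' code) generates a random balanced starting point and one balance-preserving neighbour in the polarity representation:

```python
import random

def random_balanced(n):
    """A balanced but otherwise random polarity truth table on n variables."""
    f_hat = [1] * 2 ** (n - 1) + [-1] * 2 ** (n - 1)
    random.shuffle(f_hat)
    return f_hat

def neighbour(f_hat):
    """Swap the values at two dissimilar points x, y; balance is preserved."""
    g_hat = list(f_hat)
    x = random.randrange(len(g_hat))
    y = random.randrange(len(g_hat))
    while g_hat[x] == g_hat[y]:          # condition 1: f_hat(x) != f_hat(y)
        y = random.randrange(len(g_hat))
    g_hat[x], g_hat[y] = g_hat[y], g_hat[x]
    return g_hat
```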
3.2 Cost Functions
Current optimisation work in non-linearity attempts to improve the non-linearity directly. Equivalently (see the definition of $N_f$ in Section 2), it seeks to minimise the cost function

$$cost(\hat{f}) = WH_{max}(f)$$

Essentially, the search considers the effect of a move only on those extreme (or near-extreme) values of the Walsh-Hadamard Transform $\hat{F}(\omega)$ for the current solution. A more indirect approach can be derived by considering Parseval's theorem:

$$\sum_{\omega \in Z_2^n} \hat{F}(\omega)^2 = 2^{2n}$$

This constrains $WH_{max}(f) = \max_{\omega \in Z_2^n} |\hat{F}(\omega)|$ to be at least $2^{n/2}$. This bound would be achieved when $|\hat{F}(\omega)| = 2^{n/2}$ for every $\omega$. In practice the bound may be impossible to attain. When some $|\hat{F}(\omega)|$ are greater than this ideal value, Parseval's theorem ensures that others must be smaller than it. Thus, attempting to restrict the spread of absolute values achieved appears well-motivated. This suggests a cost function of the following form:

$$cost(\hat{f}) = \sum_{\omega \in Z_2^n} \left| |\hat{F}(\omega)| - 2^{n/2} \right|^R$$
The value $R$ is positive and can be varied. In the experiments reported here we have mostly used $R = 3$. Note that a reduction in our cost function does not necessarily give rise to an increase in non-linearity, but if the range of absolute values is small then the maximum value will be small too. In Section 4.2 we shall further generalise this cost function.
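A sketch of both cost functions follows (again our illustration; walsh_hadamard is the routine sketched in Section 2, and R defaults to the value 3 used in most of our experiments):

```python
def cost_traditional(f_hat, n):
    """cost(f) = WH_max(f); minimising this maximises the non-linearity N_f."""
    return max(abs(v) for v in walsh_hadamard(f_hat, n))

def cost_parseval(f_hat, n, R=3.0):
    """sum over w of ||F_hat(w)| - 2^(n/2)|^R: penalises spread of the
    Walsh-Hadamard values away from the Parseval-motivated ideal 2^(n/2)."""
    ideal = 2.0 ** (n / 2.0)
    return sum(abs(abs(v) - ideal) ** R for v in walsh_hadamard(f_hat, n))
```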
3.3 Calculating Effects of Moves
Every time a move is considered or accepted we must recalculate the values of the various $\hat{F}(\omega)$. It is far more efficient to calculate the changes to each transform value using some simplifying equations [9]. If swapping the values of $\hat{f}(x)$ and $\hat{f}(y)$ is a valid move (see Section 3.1) then each Walsh-Hadamard Transform value $\hat{F}(\omega)$ is changed by an amount

$$\Delta\hat{F}(\omega) = -2\hat{f}(x)\hat{L}_\omega(x) - 2\hat{f}(y)\hat{L}_\omega(y)$$
with $\Delta\hat{F}(\omega) \in \{-4, 0, +4\}$. Similar formulae are also available for dealing with changes to the correlation elements $\hat{r}_f(s) = \sum_x \hat{f}(x)\hat{f}(x \oplus s)$.
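In code (our sketch, under the same representation as before), the update patches every transform value in a single pass over $\omega$ instead of recomputing each sum over $x$ from scratch:

```python
def l_hat(w, x):
    """Polarity form of the linear function: (-1)^popcount(w & x)."""
    return -1 if bin(w & x).count("1") % 2 else 1

def apply_swap(f_hat, wht, x, y):
    """Swap the dissimilar values f_hat[x], f_hat[y] and patch wht in place."""
    for w in range(len(wht)):
        # delta = -2 f(x) L_w(x) - 2 f(y) L_w(y), using the pre-swap values
        wht[w] -= 2 * f_hat[x] * l_hat(w, x) + 2 * f_hat[y] * l_hat(w, y)
    f_hat[x], f_hat[y] = f_hat[y], f_hat[x]
```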
3.4 Simulated Annealing
Simulated annealing is a local search technique that allows escape from local optima. From the current state a move in a local neighbourhood is generated and considered. Improving moves are always accepted. Worsening moves may also be accepted, probabilistically, in a way that depends on the temperature $T$ of the search and the extent to which the move is worse. A number of moves are considered at each temperature. Initially the temperature is high and virtually any move is accepted. Gradually the temperature is cooled and it becomes ever harder to accept worsening moves. Eventually the process 'freezes' and only improving moves are accepted at all. If no move has been accepted for some time then the search halts. The technique has the following principal parameters:

– the temperature $T$
– the cooling rate $\alpha \in (0, 1)$
– the number of moves $N$ considered at each temperature cycle
– the number $MaxFailedCycles$ of consecutive failed temperature cycles (where no move is accepted) before the search aborts
– the maximum number $ICMax$ of temperature cycles considered before the search aborts

The initial temperature $T_0$ is obtained by the technique itself. The other values are typically supplied by the user. In the work described here they remain fixed during a run. More advanced approaches allow these parameters to vary dynamically during the search. The simulated annealing algorithm is as follows:

1. Let $T_0$ be the start temperature. Increase this temperature until the percentage of moves accepted within an inner loop of $N$ trials exceeds some threshold (e.g. 95%).
2. Set $IC = 0$ (iteration count), $finished = false$, $ILSinceLastAccept = 0$ (number of inner loops since a move was accepted), and randomly generate an initial current solution $\hat{f}_{curr}$.
3. while (not $finished$) do 3a-3d
   a) Inner loop: repeat $N$ times
      i. $\hat{f}_{new}$ = generateMoveFrom($\hat{f}_{curr}$)
      ii. calculate the change in cost $\Delta cost = cost(\hat{f}_{new}) - cost(\hat{f}_{curr})$
      iii. if $\Delta cost < 0$ then accept the move, i.e. $\hat{f}_{curr} = \hat{f}_{new}$
      iv. otherwise generate a value $u$ from a uniform(0,1) random variable; if $e^{-\Delta cost / T} > u$ then accept the move, otherwise reject it
   b) if no move has been accepted in the most recent inner loop then $ILSinceLastAccept = ILSinceLastAccept + 1$
   c) $T = T \times \alpha$, $IC = IC + 1$
   d) if ($ILSinceLastAccept > MaxFailedCycles$) or ($IC > ICMax$) then $finished = true$
4. The current value of $\hat{f}_{curr}$ is taken as the final 'solution'.

Note that as $T$ decreases to 0, $e^{-\Delta cost / T}$ also tends to 0 for $\Delta cost > 0$, and so the chance of accepting a worsening move becomes vanishingly small as the temperature is lowered.
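The loop above translates directly into code. The sketch below is ours, with the heating phase of step 1 omitted ($T_0$ is passed in); it reuses neighbour from the Section 3.1 sketch and accepts any of the cost-function sketches as a parameter:

```python
import math
import random

def anneal(f0_hat, n, cost, T0, alpha=0.8, N=400,
           max_failed_cycles=50, ic_max=300):
    curr, curr_cost = list(f0_hat), cost(f0_hat, n)
    T, ic, idle = T0, 0, 0
    while idle <= max_failed_cycles and ic <= ic_max:
        accepted = False
        for _ in range(N):                      # inner loop of N trial moves
            cand = neighbour(curr)
            delta = cost(cand, n) - curr_cost
            # accept improvements; accept a worsening move with prob e^(-delta/T)
            if delta < 0 or math.exp(-delta / T) > random.random():
                curr, curr_cost, accepted = cand, curr_cost + delta, True
        idle = 0 if accepted else idle + 1      # consecutive failed cycles
        T *= alpha                              # geometric cooling
        ic += 1
    return curr
```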
3.5 Two-Stage Approach
Our technique can now be described very simply:

1. Carry out a search using simulated annealing to reach a solution $\hat{f}_{sa}$ with a very low value of the uniform cost function. Calculate the non-linearity $N_{\hat{f}_{sa}}$ of this solution.
2. Hill-climb from $\hat{f}_{sa}$, minimising $WH_{max}(f)$, to reach a solution $\hat{f}_{end}$ that is locally optimal with respect to non-linearity (as in [7]).

We view the initial stage as 'getting in the right area'. Spending too much effort at this stage might actually be counter-productive. Since the initial stage is just a means to an end, we are free to make pragmatic concessions with respect to parameter values. Thus, for example, it is often recommended that $N$ should be equal to the size of the neighbourhood. Even for small problems this would consume much computational effort. We are not bound by this recommendation and choose much smaller $N$; in fact $N = 400$ was used for all our experiments.
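A sketch of the complete two-stage driver (our illustration, combining the earlier sketches; the hill-climb is the simplest first-improvement variant, which terminates at a local optimum of $WH_{max}$, and the starting temperature 100.0 is a placeholder for the heating phase described in step 1 above):

```python
def hill_climb(f_hat, n, cost):
    """Take the first strictly improving swap until no such swap exists."""
    curr, curr_cost = list(f_hat), cost(f_hat, n)
    improved = True
    while improved:
        improved = False
        pos = [i for i, v in enumerate(curr) if v == 1]
        neg = [i for i, v in enumerate(curr) if v == -1]
        for x in pos:                       # scan all balance-preserving swaps
            for y in neg:
                cand = list(curr)
                cand[x], cand[y] = cand[y], cand[x]
                c = cost(cand, n)
                if c < curr_cost:           # strict improvement only
                    curr, curr_cost, improved = cand, c, True
                    break
            if improved:
                break
    return curr

def two_stage(n, T0=100.0):
    """Stage 1: anneal on the Parseval cost. Stage 2: hill-climb on WH_max."""
    f_sa = anneal(random_balanced(n), n, cost_parseval, T0)
    return hill_climb(f_sa, n, cost_traditional)
```

Under these assumptions, a call such as two_stage(8) corresponds to one run of the fourth technique compared in Section 4.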
4 Experimental Results

4.1 Comparison of Approaches
In this section we detail the results of applying the technique to the derivation of balanced functions with $n = 8$. The best known upper bound on the non-linearity is currently 118; the best Boolean function demonstrated has non-linearity 116. Four techniques have been examined: random generation followed by hill-climbing using the traditional cost function; simulated annealing using the traditional cost function; simulated annealing using the new uniform cost function; and simulated annealing using the new cost function followed by hill-climbing using the traditional cost function. We replicated each technique 400-fold. The result of each run is a non-linearity/autocorrelation pair $(N_f, AC_f)$. For the simulated annealing components the search was terminated after 300 temperature cycles (i.e. inner loops) or else after 50 consecutive cycles had not produced an accepted move. At each temperature cycle 400 moves were considered (as indicated earlier). A cooling factor $\alpha = 0.8$ was used throughout.

Figure 1 shows the results of applying strong hill-climbing from randomly chosen initial starting points using the traditional cost function (i.e. maximising the non-linearity $N_f$ directly).
ACmax |        Non-linearity
      |  106  108  110  112  114  116
 104  |    0    1    0    0    0    0
  96  |    0    0    0    0    0    0
  88  |    0    2    1    0    0    0
  80  |    0    5    7    2    0    0
  72  |    1   19   31    6    0    0
  64  |    0   48   78   20    0    0
  56  |    0   25   75   47    0    0
  48  |    0    3   11   18    0    0

Fig. 1. Hill-Climbing with Fitness N_f
No run attained a non-linearity of 114 or more. No run attained an autocorrelation value of 40 or less. Figure 2 shows the results of applying simulated annealing alone using the traditional cost function; the results are unaffected by following the simulated annealing with hill-climbing. Here we see that the results are more bunched with respect to non-linearity (all but one have non-linearity of 112). There are a very small number of runs giving values better than for hill-climbing alone. Figure 3 shows the results of applying simulated annealing alone using the new cost function, and Figure 4 shows the results of following this new optimisation by a traditional hill-climb. Now we hit functions with non-linearity of 116 occasionally, and some have relatively low autocorrelation. The results show that the two-stage approach using local optimisation is highly effective. Results have been achieved for non-linearity and autocorrelation that were not obtained using hill-climbing and the traditional cost function. In Millan et al's more extensive hill-climbing [9] no trial resulted in $N_f = 116$ (when maximising $N_f$) and no trial resulted in $AC_f = 32$ (when minimising $AC_f$). As we shall show below, it is possible to improve on the results obtained so far.

ACmax |     Non-linearity
      |  110  112  114  116
  80  |    0    2    0    0
  72  |    0   10    0    0
  64  |    0   59    0    0
  56  |    0  186    0    0
  48  |    0  140    1    0
  40  |    0    2    0    0
  32  |    0    0    0    0

Fig. 2. SA only using Traditional Cost Function (unaffected by the addition of Hill-climbing)
ACmax |        Non-linearity
      |  108  110  112  114  116
  56  |    0    1    0    0    0
  48  |    2    7   35    5    0
  40  |    4   39  158   27    0
  32  |    4   26   79   11    0
  24  |    0    0    2    0    0

Fig. 3. SA Only Using New Cost Function

ACmax |        Non-linearity
      |  108  110  112  114  116
  56  |    0    0    1    7    1
  48  |    0    0   14   56    2
  40  |    0    0   27  176   18
  32  |    0    0   23   64   11
  24  |    0    0    0    0    0

Fig. 4. SA with New Cost Function and Hill-climbing with Traditional Cost Function
4.2 Tuning the Technique
The cost function is a means to an end. Indeed, we chose an initial new cost function for which a decrease did not necessarily correspond to an increase in non-linearity; its purpose was to get the search to the right area. In using optimisation techniques across a range of problems we have found that experimentation with the cost function frequently produces better results. For this problem we have found that a more effective form of cost function is given by

$$cost(\hat{f}) = \sum_{\omega} \left| |\hat{F}(\omega)| - 2^{n/2} + K \right|^R$$
where $K$ can be varied. Figure 5 shows the results of using $K = 4$ and $K = -12$ with $\alpha = 0.9$ (other parameters as before). The effects of such tuning are marked: the technique has produced values that were not achieved by any hill-climbing strategy (or indeed by the work reported here so far). The results are all the more remarkable since autocorrelation was effectively ignored as (a conscious) part of the search. The values of $K$ clearly influence which parts of the design space are reached. For $K = -12$ the search has tended to find solutions with lower non-linearity than for $K = 4$ but generally lower (and so better) autocorrelation. Also, examples of functions with autocorrelation of 16 have been found easily by the search. Small-scale experimentation has been carried out with various $K$ values and the results are presented in Figure 6. The results presented here are for illustration only; more extensive experimentation is required. We leave this as future work. A sketch of the tuned cost function follows.
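As a sketch, the tuned function is a one-line change to the cost_parseval sketch of Section 3.2 (the defaults for K and R here are just two of the settings explored in our experiments):

```python
def cost_tuned(f_hat, n, R=3.0, K=-12.0):
    """sum over w of ||F_hat(w)| - 2^(n/2) + K|^R; K shifts the target value
    towards which each |F_hat(w)| is pulled."""
    ideal = 2.0 ** (n / 2.0)
    return sum(abs(abs(v) - ideal + K) ** R for v in walsh_hadamard(f_hat, n))
```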
ACmax | Non-linearity, K = 4  | Non-linearity, K = -12
      |  112  114  116       |  112  114  116
  56  |    0    0    0       |    0    0    0
  48  |    2    5   13       |    0    1    3
  40  |    9   68   80       |   11   16    8
  32  |   29   74  115       |   41   42   24
  24  |    1    1    3       |  201    6    5
  16  |    0    0    0       |   42    0    0

Fig. 5. SA with New Cost Function followed by Hill-climbing with Traditional Cost Function for K = 4, -12 (400 runs)
ACmax |    K = -14     |    K = -12     |    K = -10     |    K = -8
      | 112  114  116  | 112  114  116  | 112  114  116  | 112  114  116
  48  |   0    0    0  |   0    1    0  |   0    0    1  |   0    0    0
  40  |   3    2    2  |   3    2    3  |   2    5    2  |   2    2    1
  32  |  19   10    6  |  15    8    2  |  15    9    3  |  11   12    6
  24  |  45    2    0  |  53    2    0  |  51    2    0  |  44    3    1
  16  |  11    0    0  |  11    0    0  |  10    0    0  |  18    0    0

ACmax |    K = -6      |    K = -4      |    K = -2      |    K = 0
      | 112  114  116  | 112  114  116  | 112  114  116  | 112  114  116
  48  |   0    0    1  |   0    3    2  |   0    8    5  |   5   12    1
  40  |   0    5   13  |   3   19   20  |   6   32   15  |  12   43    1
  32  |   2   27   42  |   5   17   27  |   3   16   15  |   6   19    1
  24  |   0    0   10  |   1    1    2  |   0    0    0  |   0    0    0
  16  |   0    0    0  |   0    0    0  |   0    0    0  |   0    0    0

Fig. 6. Results for n = 8 with various K values (100 runs; columns give non-linearity)
4.3 Results for Higher Values of n
We have applied the technique with the tunable cost function to the synthesis of Boolean functions of $n = 9, 10, 11, 12$ variables. The number of temperature cycles was 200 (except for $n = 12$, where 400 cycles were carried out), with 400 moves considered in each cycle as before. The overall aim of the work was to generate highly non-linear Boolean functions (the low autocorrelation values obtained were accidental). Figure 7 summarises how our technique compares, with respect to non-linearity, with the best results obtained so far by other techniques (both constructive techniques and heuristic optimisation approaches). For lower values of $n$ our technique has no difficulty in equalling the best produced so far by any technique. We have not yet carried out extensive experiments with respect to parameter variation and make no claims to optimality of our results above. Experience with other problems has shown that cost function variation generally leads to improvements.
Method                     | n=4  n=5  n=6  n=7  n=8  n=9  n=10  n=11  n=12
Lowest Upper Bound         |   4   12   26   56  118  244   494  1000  2014
Best Known Example [11,3]  |   4   12   26   56  116  240   492   992  2010
Bent Concatenation         |   4   12   24   56  112  240   480   992  1984
Genetic Algorithms [8]     |   4   12   26   56  116  236   484   980  1976
Our Simulated Annealing    |   4   12   26   56  116  238   484   984  1990

Fig. 7. Comparing the Non-linearity of Balanced Functions
Hard and fast comparisons with other optimisation work are difficult, since the simulated annealing is likely to be more computationally expensive (here 80000 function evaluations before hill-climbing). Nevertheless, our technique seems to compare favourably with genetic algorithms for the higher values of $n$ (work performed at the request of the referees). The results are encouraging and indicate that local search has good potential as a synthesis technique. For reference purposes we include below the results of our experiments for $n = 9, 10, 11, 12$. For $n = 9, 10$ we provide the results of (small-scale) random function generation followed by hill-climbing with respect to non-linearity (this is far better than random generation alone, as demonstrated in [7]). For $n = 11, 12$ we simply present the results obtained by our technique, to allow comparisons by future researchers.

ACmax | Non-linearity
      |  236  238
  72  |    3    1
  64  |   16   13
  56  |   97   60
  48  |  125   76
  40  |    5    4

Fig. 8. 400 runs for n = 9, α = 0.9 and K = -12
5 Conclusions and Further Work
The work reported in this paper has shown how local search can be used to generate Boolean functions that have both high non-linearity and low autocorrelation. Further work is needed to determine optimal parameter values for the technique; the values reported here are for illustration only. Indeed, since the efficacy of our technique is significantly affected by parameter values, we propose to investigate the use of adaptive cost functions. With this approach the search is continually monitored and parameters varied dynamically. The parameters for the annealing algorithm have been chosen to allow a reasonable amount of experimentation. The value $N = 400$ is extremely small compared with recommended values (and the larger the number of Boolean variables $n$, the worse the discrepancy).
ACmax |        Non-linearity
      |  224  226  228  230  232
 136  |    1    1    1    0    0
 128  |    0    1    5    1    0
 120  |    4    3    8    1    0
 112  |    4    6   25    3    0
 104  |    6   14   51   11    0
  96  |    3    9   87   23    4
  88  |    1    5   55   26    2
  80  |    0    0   18   15    1
  72  |    0    0    0    3    2

Fig. 9. 400 randomly generated functions followed by Hill-climbing for n = 9

ACmax | Non-linearity
      |  480  482  484
 104  |    0    1    1
  96  |    0    0    8
  88  |    0    1   23
  80  |    0    1   41
  72  |    0    0   24

Fig. 10. 100 runs for n = 10, α = 0.9 and K = -16

ACmax |        Non-linearity
      |  464  466  468  470  472
 192  |    0    0    0    1    0
 184  |    0    0    0    1    0
 176  |    0    0    1    1    0
 168  |    1    0    4    2    2
 160  |    1    2    5    2    4
 152  |    0    0    3    6    3
 144  |    0    0   11   11    9
 136  |    0    2    3    5    9
 128  |    0    0    0    3    5
 120  |    0    0    0    2    1

Fig. 11. 100 randomly generated functions followed by Hill-climbing for n = 10
Also, the values used for the cooling factor $\alpha$ (0.8 and 0.9) are very small compared with the bulk of successful annealing work. The results so far show considerable promise, but further work is needed to tune the technique effectively. The use of global optimisation techniques, such as genetic algorithms, in conjunction with the new cost function would also seem worthy of investigation. We aim to extend the local simulated annealing approach to incorporate other desirable cryptographic properties.
ACmax | Non-linearity
      |  980  982  984
 160  |    0    1    0
 152  |    0    0    0
 144  |    0    2    2
 136  |    0    7    2
 128  |    4   11    7
 120  |    1   23    6
 112  |    2   19    5
 104  |    0    5    3

Fig. 12. 100 runs for n = 11, α = 0.9, K = -24 and R = 2.5

ACmax | Non-linearity
      | 1988 1990
 212  |    1    0
 204  |    0    1
 196  |    0    0
 188  |    3    1
 180  |    1    0
 172  |    1    1
 164  |    1    0

Fig. 13. 10 runs for n = 12, α = 0.9, K = -28 and R = 3.0
As is typically the case with heuristic techniques, augmenting the basic approach with hill-climbing has been found to be an excellent idea. The simulated annealing simply provides a means of locating good areas from which to hill-climb. We have found that adopting a two-stage strategy is of use in other security problems too. Perhaps the most important point of this paper is that the cost function matters. The authors have recently applied optimisation techniques to a variety of security-related problems. Experimentation with cost functions has typically led to better results (in some cases a radical improvement is obtained). We would recommend such experimentation to all. We are currently working on the use of optimisation techniques to derive cryptographic artifacts which satisfy excellent publicly stated criteria but which also satisfy secret malicious criteria!

Acknowledgements. We would like to thank the members of the SEMINAL network (EPSRC grant GR/M78083) for their encouragement in carrying out this work.
References

1. W.S. Forsyth and R. Safavi-Naini. Automated Cryptanalysis of Substitution Ciphers. Cryptologia, XVII(4), 1993. pp. 407-418.
2. J.P. Giddy and R. Safavi-Naini. Automated Cryptanalysis of Transposition Ciphers. The Computer Journal, 37(5), 1994.
3. X.-D. Hou. On the Norm and Covering Radius of First-order Reed-Muller Codes. IEEE Transactions on Information Theory, 43(3), May 1997. pp. 1025-1027.
4. S. Kirkpatrick, C.D. Gelatt, Jr. and M.P. Vecchi. Optimization by Simulated Annealing. Science, 220(4598), May 1983. pp. 671-680.
5. Lars R. Knudsen and Willi Meier. Cryptanalysis of an Identification Scheme Based on the Permuted Perceptron Problem. In Advances in Cryptology, Eurocrypt '99. Springer Verlag LNCS 1592. pp. 363-374.
6. Robert A.J. Mathews. The Use of Genetic Algorithms in Cryptanalysis. Cryptologia, XVII(2), April 1993. pp. 187-201.
7. William Millan, Andrew Clark and Ed Dawson. An Effective Genetic Algorithm for Finding Highly Non-linear Boolean Functions. In Proceedings of the First International Conference on Information and Communications Security, 1997. Springer Verlag LNCS 1334. pp. 149-158.
8. William Millan, Andrew Clark and Ed Dawson. Heuristic Design of Cryptographically Strong Balanced Boolean Functions. In Advances in Cryptology, EUROCRYPT'98. Springer Verlag LNCS 1403. pp. 489-499.
9. William Millan, Andrew Clark and Ed Dawson. Boolean Function Design Using Hill-climbing Methods. In Australasian Conference on Information Security and Privacy (ACISP), 1999. pp. 1-11.
10. National Bureau of Standards. NBS FIPS PUB 46: Data Encryption Standard. 1976.
11. N.J. Patterson and D.H. Wiedemann. The Covering Radius of the $(2^{15}, 16)$ Reed-Muller Code is at Least 16276. IEEE Transactions on Information Theory, 29(3), May 1983. pp. 354-356.
12. Richard Spillman, Mark Janssen, Bob Nelson and Martin Kepner. The Use of a Genetic Algorithm in the Cryptanalysis of Simple Substitution Ciphers. Cryptologia, XVII(1), 1993. pp. 31-44.