Recurrent Neural Network Models for Improved (Pseudo) Random Number Generation in Computer Security Applications

D. A. Karras (1) and V. Zorkadis (2)
(1) University of Piraeus, Dept. of Business Administration, Rodu 2, Ano Iliupolis, Athens 16342, Greece
(2) University of Ioannina, Dept. of Computer Science, Greece

Abstract: This paper proposes a novel approach for generating strong pseudorandom numbers. The suggested random number generators are intended to be applied to cryptographic protocols of computing and communication systems, which rely on the use of strong pseudo-random number sequences. The methodology presented here is based on the exploitation of the recalling capabilities of Recurrent Neural Network models of the Hopfield type. More specifically, it is illustrated that, while an associative memory model of the Hopfield type is able to retrieve a previously stored pattern when orthogonal patterns are involved and its weight matrix has specific properties, the oscillations that occur when the network tries to minimize its cost function with a weight matrix that does not satisfy the desired properties, while being fed with uncorrelated, orthogonal patterns, can be employed as a mechanism for improved (pseudo) random number generation. It is demonstrated that these generators pass the most important relevant statistical tests, and their performance regarding these tests is compared to that of random number generators well known in the literature. More specifically, DES and linear congruential random number generators have been involved as such generators in the experimental study conducted herein.

Key-words: Recurrent Neural Networks, Security Mechanisms, Cryptographic Protocols, Strong Pseudo-random Number Sequences.

CSCC'99 Proceedings, Pages 6041-6046

1. INTRODUCTION

Cryptographic protocols of computing and communication systems may have random components, which require methods for obtaining numbers that are random in some sense. For instance, authentication mechanisms may use random numbers to protect against replay attacks [1]. Symmetric and asymmetric cryptographic systems like DES, IDEA and RSA [1] are involved as basic elements of security protocols and require random cryptographic keys. Furthermore, integrity mechanisms [2], cryptographic key exchange mechanisms [3] and the construction of digital signatures like the ElGamal or the Digital Signature Standard (DSS) schemes need the generation and use of random numbers. In addition, random numbers are used for the generation of traffic and message padding, in order to protect against traffic analysis attacks, and for the computation of strong and efficient stream ciphers [3].

Two criteria are used for the evaluation of the quality of random numbers obtained from a generator in applications related to the security of computing and communication systems: uniform distribution and independence. The most important requirement imposed on random number generators is their capability to produce random numbers uniformly distributed in [0,1); otherwise the application's results may be completely invalid. Independence requires that the numbers should not exhibit any correlation with each other. Additionally, random number generators should possess further properties: fast computation of the random numbers, the possibility to reproduce a given sequence of random numbers, and the ability to produce several separate sequences of random numbers [4]. However, for random number generators involved in the implementation of security mechanisms such as authentication, key generation and exchange, the most important property might be to produce unpredictable numbers. True random numbers possess this property. However, uniformly distributed pseudorandom number generators that are used for practical reasons, such as the linear congruential generators, do not have this property, since each number they produce can be expressed as a function of the initialization value or of its predecessor value and the coefficients of the generator. The great majority of random number generators used for traditional applications are linear congruential generators, which behave statistically very well, except in terms of unpredictability, since there exists a linear functional relation connecting the numbers of the sequence.


A sequence of random numbers produced by these generators is defined as follows:

Z_i = (a Z_{i-1} + c) mod m,

where m, a and c are the coefficients, i.e., the modulus, the multiplier and the increment, correspondingly, and Z_0 is the seed or initialization value. All are nonnegative integers. Each random number can be expressed, as mentioned above, as a function of another random number, of its predecessors, or of the seed and the coefficients. So, if the coefficients and the seed or any random number belonging to the sequence are known, then all the numbers of the sequence can be inferred. Such generators are inappropriate for security mechanisms, since the disclosure of one number could very easily lead to the computation of the others. In security mechanisms like authentication and key generation and exchange, the primary concern for the pseudorandom bit sequences used is that they are unpredictable, while being uniformly distributed comes as the next requirement. True random numbers are independent from each other and therefore unpredictable, but they are rarely employed, since they are difficult to obtain and might not be reproducible. It is more common that numbers behaving like random numbers are obtained by means of an algorithm, i.e., a pseudorandom number generator. Next, we briefly describe some of the widely used generators: the DES in the output feedback mode (OFB) combined with a further element, and linear congruential generators.

The Data Encryption Standard (DES) and, more recently, IDEA are the most widely used symmetric encryption systems. The input to the encryption function is the plain text in blocks and the key. The plain text block is 64 bits, and the key is 56 bits long in DES and 128 bits long in IDEA. The encryption and decryption algorithm of DES relies on permutations, substitutions and xor-operations under the control of 16 subkeys obtained from the initial key. On the other hand, the encryption and decryption algorithms of IDEA rely on xor-operations and modular additions and multiplications. DES and IDEA can operate under various modes such as Cipher Block Chaining (CBC), Cipher Feedback (CFB) and Output Feedback (OFB). The OFB mode can be used as a pseudorandom number generator for key generation and stream cipher computation.

As traditional generators we use a Prime Modulus Multiplicative Linear Congruential Pseudorandom number generator (PMMLCP) and the Unix-rand. The first computes numbers in the interval [0,1) using the following formula:

Z_i = 630360016 Z_{i-1} mod (2^31 - 1).

The multiplier 630360016 is suggested by Payne, Rabung and Bogyo [4].
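As an illustration only, the following minimal Python sketch implements the PMMLCP recurrence above with the multiplier and modulus given in the text; the function name and the example seed are our own choices:

# PMMLCP: Z_i = 630360016 * Z_{i-1} mod (2^31 - 1), mapped to [0, 1) by dividing by the modulus.
M = 2**31 - 1        # prime modulus
A = 630360016        # multiplier suggested by Payne, Rabung and Bogyo [4]

def pmmlcp(seed, n):
    """Return n pseudorandom numbers in [0, 1) from the PMMLCP recurrence."""
    z = seed
    out = []
    for _ in range(n):
        z = (A * z) % M
        out.append(z / M)
    return out

# The whole sequence is reproducible from the seed alone, which is exactly
# why such generators are statistically adequate but cryptographically weak.
print(pmmlcp(12345, 5))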

UNIX-rand is a Unix function that uses a multiplicative congruential random number generator with period 2^32, which returns pseudorandom integer numbers in the range [0, 2^15). As input, Unix-rand takes a seed, which affects the pseudorandom number sequences obtained.

Based on the OFB mode of symmetric cryptosystems like DES, cryptographically strong pseudorandom number generators are among the most commonly employed in security mechanisms. The OFB mode can be used for session key generation and the implementation of stream cipher computation. According to this method, the encryption function of the symmetric cryptosystem is at first applied to an initialization variable I under the control of a cryptographic key. The resulting cipher is the pseudorandom bit string or number. Subsequently, the output of the encryption function, i.e., the cipher, becomes the new input to the encryption function E_k:

T_1 = E_k(I), T_2 = E_k(T_1), ..., T_n = E_k(T_{n-1}).
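The OFB recurrence above can be sketched generically in Python as follows; the encryption function is passed in as a parameter, and the keyed-hash stand-in used in the example is our own placeholder, not DES, which would normally play the role of E_k:

import hashlib

def ofb_keystream(encrypt, iv, n_blocks):
    """OFB-style pseudorandom block generator: T_1 = E_k(I), T_i = E_k(T_{i-1})."""
    blocks = []
    t = iv
    for _ in range(n_blocks):
        t = encrypt(t)            # in practice: DES or IDEA encryption under key k
        blocks.append(t)
    return blocks

# Toy stand-in for E_k (NOT a block cipher): a keyed hash truncated to 8 bytes.
key = b"demo-key"
def toy_encrypt(block):
    return hashlib.sha256(key + block).digest()[:8]

keystream = ofb_keystream(toy_encrypt, iv=b"\x00" * 8, n_blocks=4)
print([b.hex() for b in keystream])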

This paper presents a novel approach for constructing robust random number generators to be used in security mechanisms, based on recurrent Artificial Neural Network (ANN) techniques of the Hopfield type. It is well known that these neural models possess interesting associative memory storage and retrieval properties when certain conditions on their weight matrix and input pattern vectors are satisfied [5]. These ANNs of the Hopfield type are exactly the ones employed in this paper as random number generators. Since ANNs, in general, are parallel and distributed processing devices, they can be implemented in parallel hardware and, consequently, they can be used for real-time random number generation. It is very important to emphasize that ANNs of the Hopfield type are the most easily and naturally implemented in hardware among neural models [5]. They can be implemented in silicon chips using operational amplifiers corresponding to their neurons. These neurons have outputs given by the following formula:

O_k = g( ∑_i W_ki O_i ),

where O_k is the output of neuron k, g is a nonlinearity such as the well known signum function or the sigmoidal nonlinearity, and W_ki is the weight connecting neurons k and i. However, the main property of Hopfield type ANNs that is exploited herein in order to design improved random number generators is their capability to minimize a cost function during their recall phase, when certain conditions are satisfied [5]. When these conditions, described in the next section, are not satisfied, the network acquires an unpredictable behavior, which cannot be expressed in closed form.


Furthermore, the nonlinearity g in the above formula supplies the neural system with the ability to transform its inputs nonlinearly in a complex manner. This transformation results in outputs that cannot be easily derived from the inputs after several iterations of the recurrent scheme in the recall phase of a Hopfield network. Moreover, even if a Hopfield recurrent ANN architecture were known, so that its outputs could in principle be estimated from its inputs after several iterations of its recurrent recall scheme, this estimation could be performed by algorithmic means only; the analytic formula relating Hopfield input and output, although it exists, is too involved.

The organization of this paper is as follows. Section 2 describes the suggested novel procedure for generating strong (pseudo)random numbers by invoking Hopfield type recurrent ANN techniques and reports the traditional statistical tests for evaluating the quality of the pseudorandom bit sequences produced by the generators involved in this work. Section 3 gives a detailed account of the experimental study conducted. Finally, section 4 concludes the paper and discusses the prospects of our approach.

2. THE HOPFIELD TYPE RECURRENT ANN BASED (PSEUDO) RANDOM NUMBER GENERATOR

The methodology for transforming Hopfield type recurrent ANNs into strong (pseudo)random number generators is herein derived by exploiting their ability to minimize a cost function involving their weights and neuron activations under certain conditions concerning their weight matrix [5]. More specifically, a Hopfield network possesses the following important characteristics [5], which are summarized next.

a) If the weight matrix of a Hopfield recurrent ANN is symmetric with a zero-valued diagonal and, furthermore, only one neuron is activated per iteration of the recurrent recall scheme, then there exists a Liapunov type cost function involving its weights and neuron activations which decreases after each iteration until a local optimum of this objective function is found.

b) The final output vector of the Hopfield network, after the convergence of the above mentioned recurrent recall scheme, has minimum distance to, or is exactly equal to, one of the prototypes stored in the network during its weight matrix definition (learning phase), provided that the prototypes stored are orthogonal to one another and their number M is less than 0.15 N, where N is the number of neurons.

c) If, on the other hand, M exceeds 0.15 N, then the recurrent recall scheme converges to a linear combination of the stored prototypes when it is fed with a variation of one of these prototype vectors, provided that the weight matrix has the properties discussed in (a) above.

d) Hopfield net outputs are given by the formula discussed in the introduction, which is precisely the update formula for the single neuron activated during the iterations of the recurrent recall scheme mentioned in (a) above:

O_k = g( ∑_i W_ki O_i )

A sigmoidal nonlinearity is considered for g in the following.
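To make properties (a), (b) and (d) concrete, here is a small numpy sketch using the standard outer-product (Hebbian) weight construction for bipolar prototypes; the construction and the toy prototypes are standard Hopfield practice [5] and our own choices, not details taken from this paper:

import numpy as np

def hopfield_weights(prototypes):
    """Outer-product (Hebbian) weight matrix: symmetric with a zero diagonal, as in (a)."""
    n = prototypes.shape[1]
    w = np.zeros((n, n))
    for p in prototypes:              # p is a bipolar (+1/-1) prototype vector
        w += np.outer(p, p)
    np.fill_diagonal(w, 0.0)
    return w / n

def recall(w, state, iterations=200, rng=None):
    """Asynchronous recall: one randomly chosen neuron per iteration,
    updated as O_k = g(sum_i W_ki O_i) with g = signum, as in (d)."""
    if rng is None:
        rng = np.random.default_rng(0)
    s = state.copy()
    for _ in range(iterations):
        k = rng.integers(len(s))
        s[k] = 1.0 if w[k] @ s >= 0 else -1.0
    return s

# Two orthogonal bipolar prototypes of length N = 8 (toy example).
prototypes = np.array([[1, 1, 1, 1, -1, -1, -1, -1],
                       [1, 1, -1, -1, 1, 1, -1, -1]], dtype=float)
w = hopfield_weights(prototypes)
noisy = prototypes[0].copy()
noisy[-1] = 1.0                       # flip one bit of the first prototype
print(recall(w, noisy))               # recovers the first prototype, as in (b)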

These properties lead us intuitively to the principles of the proposed random number generation methodology involving such recurrent ANNs, summarized as follows.

1) If we impose a perturbation on the recurrent network weight matrix so that its symmetry is broken and its diagonal elements obtain large positive values, then the convergence property of the recurrent recall scheme is lost. This can be achieved, for instance, by adding a positive parameter δ to every element in the upper triangle of the matrix, including the diagonal, and subtracting δ from every element in the lower triangle of the matrix.

2) Moreover, if we let a large number of neurons (in our experiments N/2 neurons) update their activations by following the formula of (d) above, then the recurrent recall scheme loses its convergence property to a local optimum of the Liapunov function associated with the network.

3) If the recurrent recall scheme is not guaranteed to converge to a network output corresponding to a local optimum of a cost function, then the behavior of the network becomes unpredictable.

4) If the network is large and the patterns stored in it are orthogonal and thus uncorrelated (that is, they have maximum distances from one another), then the possibility of obtaining predictable outputs after several iterations of the recurrent recall scheme is minimal compared to the one associated with storing non-orthogonal prototypes, which are correlated to one another. In our experiments we use binary valued orthogonal patterns.

5) If the history of the network outputs during its recall phase is considered over T iterations of the recurrent recall scheme, then predicting the sequence of these output vectors is much harder than trying to predict a single output vector.

The above principles lead us to use the following function of the network outputs over T iterations of the recurrent recall scheme as a pseudorandom number generator.


To obtain better quality pseudorandom numbers, we have considered the Unix function modf, which returns the non-integral (fractional) part of a real number, as the mechanism for aiding the Hopfield net output to acquire the desired properties, since the first digits of its decimal part are predictable, due to the fact that the sigmoidal nonlinearity g is a mapping onto the (0,1) interval. Consequently, the formula of the proposed Hopfield recurrent ANN random number generator is as follows:

O = modf( 1000 * (1/(T N)) * ∑_{t=1..T} ∑_{k=1..N} ( g( ∑_i W_ki O_i(t) ) )^2 )
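The following Python sketch puts principles 1), 2) and 5) together with the formula above; the perturbation size delta, the number of iterations T, the stand-in weight matrix and the initial binary pattern are all our own illustrative choices, not values taken from the paper:

import math
import numpy as np

def perturb(w, delta=0.7):
    """Principle 1: break the symmetry of the weight matrix by adding delta to the
    upper triangle (including the diagonal) and subtracting it from the lower triangle."""
    wp = w.copy()
    wp[np.triu_indices_from(wp)] += delta
    wp[np.tril_indices_from(wp, k=-1)] -= delta
    return wp

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hopfield_prng(w, o0, T=20, rng=None):
    """One (pseudo)random number from T iterations of the perturbed recall scheme:
    O = frac( 1000 * (1/(T*N)) * sum_t sum_k g(sum_i W_ki O_i(t))^2 )."""
    if rng is None:
        rng = np.random.default_rng(1)
    n = len(o0)
    o = o0.copy()
    acc = 0.0
    for _ in range(T):
        # Principle 2: update N/2 randomly chosen neurons per iteration.
        for k in rng.choice(n, size=n // 2, replace=False):
            o[k] = sigmoid(w[k] @ o)
        acc += np.sum(sigmoid(w @ o) ** 2)
    return math.modf(1000.0 * acc / (T * n))[0]    # keep only the fractional part

rng = np.random.default_rng(1)
n = 100
w = perturb(rng.standard_normal((n, n)) * 0.1)     # stand-in weight matrix
o0 = rng.integers(0, 2, n).astype(float)           # binary initial pattern
print([round(hopfield_prng(w, o0, rng=rng), 6) for _ in range(5)])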

The previous discussion determines all the steps of the approach adopted here for designing strong (pseudo)random bit sequence generators employing the recurrent recall scheme of Hopfield networks. In this way a sequence of (pseudo)random numbers is produced whose quality is quantitatively evaluated by utilizing the statistical tests presented in the next paragraphs.

Statistical tests are applied to examine whether the pseudorandom number sequences are sufficiently random [6]. The first test we apply is the most basic technique in the suite of methods used for evaluating pseudorandom number quality, namely, the chi-square (χ²) test [6]. Furthermore, the sample means and variances of the pseudorandom number sequences obtained by the generators herein employed have been computed and compared to their expected values under the uniform distribution in the range [0,1), i.e., 0.5 and 1/12, respectively. The chi-square test, along with the sample mean and variance comparison tests, forms the suite of our empirical tests.
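A minimal sketch of this empirical test suite (a chi-square statistic over equiprobable bins on [0,1), plus the sample mean and variance to be compared with 0.5 and 1/12); the number of bins is our own choice and is not stated in the paper:

import numpy as np

def empirical_tests(samples, bins=100):
    """Chi-square statistic over equiprobable bins on [0, 1),
    plus sample mean and variance (expected 0.5 and 1/12 for uniform data)."""
    samples = np.asarray(samples, dtype=float)
    counts, _ = np.histogram(samples, bins=bins, range=(0.0, 1.0))
    expected = len(samples) / bins
    chi2 = np.sum((counts - expected) ** 2 / expected)
    return chi2, samples.mean(), samples.var()

# Example with numpy's own uniform generator as the sequence under test.
seq = np.random.default_rng(7).random(5000)
chi2, mean, var = empirical_tests(seq)
print(f"chi-square = {chi2:.3f}  mean = {mean:.3f}  variance = {var:.4f}")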

3. EXPERIMENTAL STUDY AND DISCUSSION OF THE RESULTS

An experimental study has been carried out in order to demonstrate the efficiency of the procedures suggested in section 2 for designing pseudorandom number generators, concerning their performance with respect to the traditional statistical tests previously mentioned. The following experiments have been conducted by applying the empirical tests described in section 2 on:

1. A random sequence produced by the DES algorithm.
2. A random sequence produced by the UNIX-rand generator.
3. A random sequence produced by the prime modulus multiplicative linear congruential pseudorandom (PMMLCP) number generator described in the introduction.
4. A random sequence produced by the Hopfield recurrent ANN using the methodology described in the previous section.
5. A sequence produced by a simple deterministic real function, sin(x*y), so as to have an example of the performance of a non-random number generator in the tests of section 2.

The Hopfield ANN herein employed has N = 100 neurons connected following the conventional feedback architecture. All the sequences herein produced and compared have 5000 points. All the results obtained from the above specified experiments concerning the empirical tests are presented in Table 1. From this table we can derive the following:

1. Indeed, it is possible to obtain strong pseudorandom numbers using the complex recurrent recall scheme of Hopfield type ANNs.
2. These pseudorandom numbers are of good quality, passing several critical evaluation tests.

Generator                 χ² test (max=118.49)   Sample mean   Sample variance
DES                       109.186                0.503         0.0836
Unix-rand                 114.518                0.493         0.0816
PMMLCP                    96.056                 0.496         0.0844
Hopfield-recurrent ANN    78.922                 0.498         0.0835
SIN(X*Y)                  789.84527              0.407         0.0844

Table 1. The empirical test results of the random bit sequence generators involved, as well as the corresponding results for a non-random bit sequence generator (SIN(X*Y)).

4. CONCLUSIONS AND PROSPECTS

This paper has studied, for the first time, a mechanism by which recurrent ANNs of the Hopfield type can be used to create strong (pseudo)random bit sequences. This mechanism relies on their ability to perform complex mappings between their inputs and outputs during their recurrent recall phase, mappings which become unpredictable when a suitable perturbation of the weight matrix is involved. The issue of pursuing other such techniques for improving traditional random number generators is under investigation.

REFERENCES

[1] Schneier, B., "Applied Cryptography", J. Wiley & Sons, 2nd edition, 1996.
[2] ISO 8731-2, "Approved Algorithms for Message Authentication, Part 2: Message Authenticator Algorithm (MAA)".
[3] Meyer, C., and Matyas, S., "Cryptography: A New Dimension in Computer Data Security", Wiley, New York, 1982.
[4] Law, A. M., and Kelton, W. D., "Simulation Modeling and Analysis", McGraw-Hill, 1991.
[5] Patterson, D. W., "Artificial Neural Networks: Theory and Applications", Prentice Hall, 1996.


[6] Knuth, D., "The Art of Computer Programming, Volume 2: Seminumerical Algorithms", 3rd edition, Addison-Wesley, Reading, MA, 1998.
