MULTIPLICATIVE COMBINATION OF RANDOM NUMBERS

Fatin Sezgin (1), Tevfik Metin Sezgin (2)

(1) Bilkent University SATM, Ankara, TURKEY
(2) Koç University, Istanbul, TURKEY*

Abstract

Fast and reliable random number generators are essential in many fields of science and engineering. Combining multiple streams of random numbers has been suggested as a way of building better generators with improved statistical properties and longer periods. Combined generators are also in high demand because they exhibit cryptographically secure character due to the one-way property of the generation technique. Existing combination techniques rely primarily on applying addition, subtraction and exclusive OR operations to parent generators. Although some authors have suggested multiplication-based combination, efforts have been restricted to series obtained from Lagged Fibonacci type generators. In this paper, we show how multiplication can be used to combine different parent generators efficiently to obtain generators with many desirable characteristics, such as longer periods and improved statistical properties. We present FORTRAN and C implementations of multiplicative combination algorithms along with sets of good multipliers for various word sizes. We also demonstrate how the inherent parallelism in our algorithm can be exploited to obtain faster parallelized implementations on multicore machines using the OpenMP C parallel programming library. We support our results with empirical analysis of the presented algorithms.

AMS Subject Classifications: 65C10 Random number generation

Keywords: Random Number Generation, Multiplicative RNG Combination, Simulation, Pseudorandom Number Generation, MCMC, Cryptographic Security

1. INTRODUCTION

Random numbers are essential tools with many applications in simulation, education, arts, numerical analysis, computer programming, recreation and sampling. In addition to physical and tabular sources, there are several deterministic computational techniques for producing random sequences of data, such as congruential, shift register, lagged Fibonacci, inverse and cellular automata generators. However, these generators have several shortcomings. For example, they may have unsatisfactory randomness properties or short periods, and they may not deliver the properties required by cryptographic security applications. In order to remedy these shortcomings, several authors have proposed combining individual generators through various operations such as exclusive OR, addition, subtraction, and multiplication.

In this paper we investigate the use of multiplication for combining multiple generators. We extend multiplicative combination to Linear Congruential Generators and propose new families of generators with efficient implementations. In Section 2, we briefly introduce the concept of combining multiple generators, and highlight the utility of combining in obtaining random number generators with better properties. Multiplicative random number combination and its advantages are presented in Section 3. In Section 4 we propose a new approach for multiplicative combination. Section 5 presents empirical results to demonstrate the improvements that can be obtained by multiplicatively combined generators. The quality of random number generators depends to a large extent on the goodness of their parameters. Therefore, in Section 6 we present five moduli values for generators with various word sizes and give lists of suitable multipliers with good lattice properties. Section 7 presents serial and parallel implementations of the suggested generators along with their FORTRAN and C codes.

* Corresponding author.

2. Combining as an improvement tool for random numbers

Combining is an effective way of improving random number generators. Combining random numbers from several different sources is seen by many authors as a remedy for the irregularities of a single generator. Empirical comparisons made by Marsaglia (1985) showed that combined generators are superior to single generators. Collings (1987) and Anderson (1990) also supported combination generators. Based on results from majorization theory, Brown and Solomon (1979) show that combined generators offer improvement in uniformity in a strong sense. The works of Marsaglia (1985) and Deng and George (1990) also justify the use of combined generators for improving uniformity. Deng et al. (1989, 1991) prove that combining improves not only uniformity, but also independence.

In general, random numbers are combined by designing functions that take two or more streams of random numbers and generate a single stream. Legitimate functions for improving the quality of pseudo-random numbers are presented by Deng and George (1992). The main operations used for combining random number generators include exclusive OR, addition, subtraction and multiplication. We are interested in the multiplication operation, because empirical studies have reported its superiority over the other operations. In particular, Coddington (1996) stresses the importance of multiplicative combination and notes:

“Empirical tests have shown that the randomness properties of Lagged Fibonacci Generators are best when multiplication is used, with addition (or subtraction) being next best, and XOR being by far the worst. This is intuitively reasonable in that multiplication mixes the bits in two numbers much more than addition, which is in turn much better than XOR. Multiplicative LFGs have seen little use, which is somewhat surprising considering their excellent randomness properties and extremely long period. Although slower than additive LFGs, they are just as fast as 32-bit LCGs, and much faster than LCGs that require multiple-precision arithmetic. Multiplicative LFGs can also be used with much smaller lags than for additive LFGs. Many different tests are failed by additive LFGs with small lags (less than 100), however no currently published results in any test show failure of a multiplicative LFG for a lag as small as 17.”

More specifically, the advantages of using multiplicative combination can be enumerated as:

1. Multiplication mixes the bit structure of the random numbers better than the addition, subtraction and exclusive OR operations. Therefore the output will exhibit better randomness properties.
2. Combining the outputs of two different generators and applying the Mod operation after multiplying various partitions of the parent generators will result in cryptographically secure random numbers. Given the outputs, it is not possible to infer future numbers.
3. Multiplicative combination allows one to avoid the poor lattice structures that are typical of linear congruential and lagged Fibonacci generators.
4. Combined generators will have longer cycles.
5. Using our “partitioned multiplication” algorithm, we can obtain several independent streams of random numbers.

Despite the superiority of multiplicative combination over other combination methods, previous work on combining generators has focused largely on non-multiplicative methods. Here, we review previous work on multiplicative combination methods, and point out their shortcomings in more detail.

3. Previous Work on Multiplicative Combination of RNGs

Deng and George (1992) study functions for improving the quality of pseudo-random numbers. Their work covers independent random variables with continuous density functions on the interval (0, 1). Unfortunately, as it stands, their work cannot be adopted for multiplicative combination, because multiplying uniform (0, 1) random variables x and y (z = x × y) does not produce uniform output. The resulting pdf is listed in many sources as $f(z) = -\ln(z)$ and can easily be derived by considering the distribution function of the random variable obtained by multiplying two uniform random variables.

An application of multiplication in the range (0, 1) using only one uniform random variable is the logistic map $X_{i+1} = 4X_i(1 - X_i)$, which is discussed by Phatak and Rao (1995). This generator was proposed to introduce chaotic properties but did not gain popularity.

Several authors have applied multiplication on integers and then subjected the product to a Mod operation. These generators can be classified into two groups:

1) Exponentiation-based generators: Here the same random number is multiplied by itself several times and the Mod is taken to obtain the next number.
2) Lagged Products: The elements of a generator are stored in a vector and multiplication operations with different lags are applied.

The first class of generators is employed mainly for cryptographic purposes. The randomness properties are rather poor. For example, the last digit of squared integers will not take on the values 2, 3, 7, and 8. The last two digits will assume only 22 of the possible 100 two-digit combinations, and only 159 of the 1000 possible values will be produced for three-digit integers. Similar problems arise for larger exponents. Sezgin (1995) presents some remarkable patterns in various digits when integers are raised to various powers. In exponentiation, the emphasis is on unpredictability rather than randomness.
We list some examples of this application below:

A) Blum et al. (1986) discuss the one-way function

$$X_{i+1} = X_i^2 \ \mathrm{Mod}\ (n). \tag{1}$$

B) Eichenauer-Herrmann and Niederreiter (1991) present statistical independence properties of a more general form of the quadratic congruential generator

$$X_{i+1} = aX_i^2 + bX_i + c \ \mathrm{Mod}\ (p^m). \tag{2}$$

C) The so-called power generator RSA discussed by Rivest et al. (1978) uses the functions

$$E(M) = M^e \ \mathrm{Mod}\ (n) \quad \text{and} \quad D(C) = C^t \ \mathrm{Mod}\ (n) \tag{3}$$

in order to encrypt a message and decrypt a ciphertext, respectively. Here n and the powers are parameters describing the generator.

D) Another one-way function, proposed by Schrift and Shamir (1990), uses exponentiation modulo a composite:

$$f_{g,N}(X) = g^X \ \mathrm{Mod}\ (N). \tag{4}$$

Here $N = P \cdot Q$, where P and Q are distinct odd primes and g is a generator in the multiplicative group containing elements in [1, N].

E) Yet another example is the compound cubic congruential pseudorandom numbers proposed by Eichenauer-Herrmann and Herrmann (1997). Although it has an exponential form, it uses random inputs from different sources. This generator has the form

$$y_{n+1}^{(i)} = a_i b_i^2 \left(y_n^{(i)} - c_i\right)^3 + b_i + c_i \ \mathrm{Mod}\ (p_i) \tag{5}$$

where $a_i$ and $b_i$ are integers satisfying $0 \le a_i, b_i \le p_i - 1$ and $1 \le c_i \le p_i - 1$. The prime modulus values satisfy the condition $p_i \equiv 5 \ \mathrm{Mod}\ (6)$. The authors list primes $p_i < 2^{15}$ suitable for generating random integers with an implementation accuracy of 30 bits. The very large integers produced during the calculation severely restrict the choice of the modulus values.

F) In a recent study, Kak (2007) proposed the cubic transformation for public-key applications and random number generation.

The second class of generators combines elements of the same generator by lagged products. Some examples of these generators are:

A) Marsaglia et al. (1990) use a multiplicative generator depending on 3 past values in the form

$$y_n = y_{n-3} \times y_{n-2} \times y_{n-1} \ \mathrm{Mod}\ (179) \tag{6}$$

to generate bits during the initialization of a table.

B) Marsaglia and Zaman (1994) propose a generator of the form

$$X_n = X_{n-1} \times X_{n-2} \ \mathrm{Mod}\ (2^{32}) \tag{7}$$

having sound statistical properties.

C) As a more general case, Marsaglia (1992) discusses the performance of Lagged Fibonacci Generators in the form

$$x_n = x_{n-r} \circ x_{n-s} \ \mathrm{Mod}\ (2^n). \tag{8}$$

In this formula, the operator $\circ$ may be addition, subtraction, exclusive OR, or multiplication. It is noted that all standard generators (with the exception of the generators obtained by multiplication) fail one or more stringent tests of randomness such as those described in Marsaglia (1985). This generator, called the Multiplicative Lagged-Fibonacci Generator (MLFG), is also studied by Mascagni and Srinivasan (2000) and Srinivasan et al. (2007).

Empirical studies of Lagged Fibonacci Generators show that the outputs of multiplied components give the best randomness properties, but several aspects of this operation hinder the popularity of these generators. First, multiplication must be performed on integers. This either limits the size of the integers in order to avoid overflow, or requires multi-precision arithmetic, which slows down the algorithm. Second, the period of multiplicative generators implemented for Lagged Fibonacci Generators is one-quarter of the period of the combined generators obtained by addition and subtraction.


An exception among multiplicative combination methods, using neither powers nor lagged products, is the generator type studied by Hildebrand (1993). This is the random process

$$X_{n+1} = a_n X_n + b_n \ \mathrm{Mod}\ (p). \tag{9}$$

Although it resembles a mixed congruential random number generator, here $a_n$ and $b_n$ are independent random variables and $X_0 = 0$. This form is a rather rich model giving good randomness properties under various restrictions on its components. In unpublished manuscripts, the same author studied cases where $b_n$ takes on a single value (Hildebrand, preprint). Unfortunately, this study remained at a theoretical level and there are no reports on implementation evaluations.

4. Multiplicative Combination with Partitioning

With the exception of the work by Eichenauer-Herrmann and Herrmann (1997) and Hildebrand (1993), in most cases presented in Section 3 multiplication was applied to elements of the same generator taken at different lags, or the same element was raised to a power before the modulus operation. Using different lags creates the Multiplicative Lagged Fibonacci generators. These are quite satisfactory in randomness qualities but severely limited by the maximum integer size. Exponentiation is not recommended because it results in nonrandom behavior in some digits of the output.

Current combination implementations have certain drawbacks. In conventional multiplicative combination, in order to avoid integer overflow, X and Y are either chosen to be less than the square root of the largest integer allowed by the word size of the computer, or multi-precision arithmetic is employed. The first option decreases the precision of the outputs, because on common 32-bit computers the maximum modulus value cannot exceed $\mathrm{Int}\left(\sqrt{2^{31} - 1}\right) = 46340$. Such small moduli will not use the full 23-bit precision capacity of real numbers when the outputs are transformed to a U(0, 1) variable. On the other hand, the choice of large moduli will require costly multi-precision arithmetic and result in slower generators.

Yet another disadvantage of conventional multiplicative combination methods is their inefficiency. Combination methods that use the +, - and ⊕ (exclusive OR) operations employ all digits of the parent random numbers, and all digits contribute to the output value, whereas in multiplication the higher digits of the result are less effective in determining the final output. For example, choosing M = 100, we can obtain four-digit results in the interval [1, 9801] from the multiplication, but only the last two digits [0, 99] will be significant in determining the output, and the first two digits are always disregarded.

The periods of + and - are larger than that of ×. As Mascagni et al. (1995) point out, “In recent years the Additive Lagged Fibonacci Generator has become a popular generator for serial as well as scalable parallel machines because it is easy to implement, it is cheap to compute and it does well on standard statistical tests especially when the lag k is sufficiently high (such as k = 1279). The maximal period of the ALFG is $(2^k - 1)2^{m-1}$ and has $2^{(k-1)(m-1)}$ different full-period cycles”. But the multiplicative LFG has a maximal period of $(2^k - 1)2^{m-3}$, which is a quarter of the length of the additive LFG.

The most serious problem for the implementation of multiplicative generators is the size of the variable obtained by the multiplication operation. To overcome this difficulty we propose multiplication by partitioning. This makes the implementation possible without overflow and speeds up the calculations by assigning parallelizable work to separate cores in multi-core processors.

Let us demonstrate the application of partitioning for two generators with the same period. Let X and Y be uniform integer random variables taking values $1 \le X, Y \le K^2$. We assume that $K + 1$ and $K^2 + 1$ are primes. Then we can develop an efficient random number combining method using multiplication as a binary operator producing random integers between 1 and K. The random numbers generated by this method will
give independent streams and will use the information in the parent variables X and Y very efficiently. For this purpose we partition the parent variables as:

$$X = KU + V, \qquad Y = KW + Z. \tag{10, 11}$$

Note that, in effect, this partitioning produces a representation of X and Y in base K, where U and W represent the higher order digits of X and Y (upper halves), and the variables V and Z correspond to the lower order digits (lower halves). Since U, V, W, and Z are discrete uniform integer random variables taking values between 1 and K, they have mean

$$\mu = \frac{K + 1}{2} \tag{12}$$

and variance

$$\sigma^2 = \frac{K^2 - 1}{12}. \tag{13}$$

The covariance between X and Y can be expressed as:

$$\begin{aligned}
\sigma_{XY} &= E\left[(X - \mu_X)(Y - \mu_Y)\right] \\
&= E\left\{\left[KU + V - (K + 1)\mu\right]\left[KW + Z - (K + 1)\mu\right]\right\} \\
&= E\left\{\left[K(U - \mu) + (V - \mu)\right]\left[K(W - \mu) + (Z - \mu)\right]\right\} \\
&= K^2 E\left[(U - \mu)(W - \mu)\right] + K E\left[(U - \mu)(Z - \mu)\right] + K E\left[(V - \mu)(W - \mu)\right] + E\left[(V - \mu)(Z - \mu)\right] \\
&= K^2 \sigma_{uw} + K \sigma_{uz} + K \sigma_{vw} + \sigma_{vz}
\end{aligned} \tag{14}$$

where $\sigma_{uw}$, $\sigma_{uz}$, $\sigma_{vw}$, and $\sigma_{vz}$ indicate the covariances between the variables represented in the subscripts. The correlation between the random variables X and Y becomes:

$$\rho = \frac{\sigma_{XY}}{\sigma_X \sigma_Y} = \frac{K^2 \sigma_{uw} + K \sigma_{uz} + K \sigma_{vw} + \sigma_{vz}}{(K^2 + 1)\sigma^2} = \frac{12\left(K^2 \sigma_{uw} + K \sigma_{uz} + K \sigma_{vw} + \sigma_{vz}\right)}{(K^2 + 1)(K^2 - 1)}.$$

Then we will have

$$\rho = \frac{12 K^2 \sigma_{uw}}{(K^2 + 1)(K^2 - 1)} + \frac{12\left(K \sigma_{uz} + K \sigma_{vw} + \sigma_{vz}\right)}{(K^2 + 1)(K^2 - 1)}.$$

In this equation the covariance $\sigma_{uz}$ satisfies

$$\sigma_{uz} \le \sigma_u \sigma_z = \frac{K^2 - 1}{12},$$

and this inequality is valid for the other covariances. Using these values and taking $K^2/(K^2 - 1) \approx 1$, we get

$$\rho \le \frac{12}{K^2 + 1}\,\sigma_{uw} + \frac{2K + 1}{K^2 + 1}. \tag{15}$$

For large values of K the second term tends to zero, and this implies that the correlation between the X and Y variables is mainly due to the upper halves, namely U and W. This result explains the independence of random numbers obtained by multiplicative combining. The values obtained by the modulus operation are mainly determined by the lower bits of the X and Y variables, and the correlation between the parent generators does not depend very much on these bits.


In order to multiply the random variables X and Y efficiently without causing multiplication overflows, we rewrite XY as the product of the partitioned representations of X and Y shown in Eqs. (10) and (11):

$$X \times Y \ \mathrm{Mod}\ (K + 1) = (KU + V)(KW + Z) \ \mathrm{Mod}\ (K + 1) = UWK^2 + UZK + VWK + VZ \ \mathrm{Mod}\ (K + 1). \tag{16}$$

In the implementation section (Section 7), we show how the quantity above can be calculated efficiently using properties of the modulus operation. Here it suffices to note that it is a sum of four terms with factors UW, UZ, VW, and VZ, where

$$U = \mathrm{Int}\left(\frac{X - 1}{K}\right) + 1 \tag{17}$$

$$W = \mathrm{Int}\left(\frac{Y - 1}{K}\right) + 1 \tag{18}$$

$$V = X \ \mathrm{Mod}\ (K + 1) \tag{19}$$

$$Z = Y \ \mathrm{Mod}\ (K + 1). \tag{20}$$

We choose K + 1 to be a prime in order to have the maximum periods for the above components. The X and Y variables are generated from an integer uniform distribution having

$$X_{i+1} = a_1 X_i \ \mathrm{Mod}\ (M) \tag{21}$$

and

$$Y_{i+1} = a_2 Y_i \ \mathrm{Mod}\ (M), \tag{22}$$

where M is a prime modulus of the form $K^2 + 1$. If we have X and Y generators with different prime moduli $M_1$ and $M_2$, the period of the combined output will be the least common multiple of $(M_1 - 1)$ and $(M_2 - 1)$, because each $Y_i$ value from the second generator can be combined with each $X_i$ of the first generator to give a full period $N_1$. Let the particular value of the $Y_i$ generator be C. We can show that given X+D