FAST METHODS FOR GENERATING BIVARIATE DISCRETE RANDOM VARIABLES
C. D. KEMP and S. LOUKAS School of Mathematical Sciences University of Bradford Bradford, BD7 1DP ENGLAND
SUMMARY. Fast general methods for generating univariate discrete random variables require the preliminary setting-up of tables. Actual generation then involves a table look-up procedure initiated by a generated pseudo-random uniform variable. Two such methods were introduced by Marsaglia (1963) and one by Walker (1974). In this paper we consider bivariate versions of the Marsaglia and Walker methods and compare them with methods proposed by Kemp and Loukas (1978a,b).
KEY WORDS. Computer generation, bivariate discrete distributions, alias generation method, non-sequential search procedures.
1. INTRODUCTION
Kemp and Loukas (1978a,b) examined inter alia general methods of generating bivariate discrete random variables based on inverse interpolation by sequential searching of tables of accumulated probabilities. Such methods are only applicable where a large number of identically distributed variables are to be generated (e.g. they are quite unsuitable if a parameter of the distribution is varying from call to call). This is partly because it takes a substantial time to set up the required table(s). They also use considerable amounts of storage. However, they are much faster than structural methods based on characteristic properties of the particular distribution being sampled. The average generation times for the simpler versions discussed in Kemp and Loukas (1978a) are heavily dependent on the means of the marginal distributions, but the indexed ordered search of Kemp and Loukas (1978b) is much less dependent and was, in general, preferred to the other methods by Kemp and Loukas, provided sufficient storage is available.
C. Taillie et al. (eds.), Statistical Distributions in Scientific Work, Vol. 4, 313-319. Copyright © 1981 by D. Reidel Publishing Company.
In the univariate discrete situation, several very fast tabular non-sequential-search methods are available. Two of them were given by Marsaglia (1963) and a third by Walker (1974). In the present paper we briefly describe the univariate version of each of the three methods and then consider its extension to the bivariate situation. Finally we give some comparisons of timing and of storage requirements for the various methods.
2. NOTATION AND TERMINOLOGY
We consider an arbitrary bivariate distribution of $(X, Y)$ with probability function $P(x, y)$ defined on the non-negative integers. Strictly speaking, the methods under consideration are only applicable to distributions with finite support, e.g. $0 \le x \le \ell$, $0 \le y \le m$. If a distribution has infinite support, we suppose $\ell$ and $m$ to be chosen such that $1 - \sum_{x=0}^{\ell} \sum_{y=0}^{m} P(x, y)$ is negligible, i.e. we cannot generate an $(x, y)$ with $x > \ell$ and/or $y > m$. If it is important that the possibility of generating such rare values should remain, special provision can be made to switch to a different method in appropriate cases, but we shall not consider this here. In the sequel we use $[a]$ to denote the integer part of $a$.
3. METHODS OF GENERATION
3.1 Simple Urn Method (SU). Marsaglia (1963) pointed out that a very fast method of generating from a univariate distribution with probability function $P(x)$ could be constructed as follows: suppose we have an array (urn) with $10^c$ locations (numbered 1 to $10^c$). To set up the table, calculate $P(0)$ and place 0 in each of the first $[10^c P(0)]$ locations. Then calculate $P(1)$ and place 1 in each of the next $[10^c P(1)]$ locations, and similarly for $P(2), \ldots, P(\ell)$ where $P(\ell + 1) < 10^{-c} < P(\ell)$. This completes the set-up procedure. To generate, we obtain a $u$ from the uniform distribution on $[0, 1)$, and calculate $L = [10^c u + 1]$. The required $x$ is the value in the $L$th location of the array. For most purposes $4 \le c < 6$ seems adequate. Since the only operations required in actual generation (once one has a $u$) are calculating $L$ and looking up the value in the $L$th location of a one-dimensional array, the procedure is
completely independent of the distribution mean. In general it is the fastest method available, but the storage requirement is very large.
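The set-up and look-up steps of the simple urn method can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `build_urn` and `draw` are hypothetical, indexing is 0-based rather than 1-based, and the index is scaled by the number of filled locations instead of $10^c$ so that truncation of the probabilities never leaves a drawn location empty (an assumption of this sketch, not a detail given in the text).

```python
import random

def build_urn(probs, c=4):
    """Set up the urn: value x fills [10^c * P(x)] consecutive locations."""
    size = 10 ** c
    urn = []
    for x, p in enumerate(probs):
        # Integer part of 10^c * P(x); the small epsilon guards against
        # floating-point results such as 28.999999999999996.
        urn.extend([x] * int(size * p + 1e-9))
    return urn

def draw(urn, rng=random):
    """Generate one variate: a single uniform and a single array look-up."""
    u = rng.random()               # u uniform on [0, 1)
    return urn[int(len(urn) * u)]  # 0-based analogue of L = [10^c u + 1]

urn = build_urn([0.5, 0.3, 0.2], c=2)
sample = [draw(urn) for _ in range(5)]
```

The cost of `draw` does not depend on the probabilities at all, which is the property the text emphasises; the price is a table whose length grows as $10^c$.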
3.2 Bivariate Simple Urn (BSU). The generalization of SU to the bivariate case is immediate but a practical problem arises: we now need to store $[10^c P(x, y)]$ pairs $(x, y)$ instead of $[10^c P(x)]$ single values $x$. This can be done either by using two arrays (or a 2-dimensional array) or by coding each $(x, y)$ as $w = x + \alpha y$, with $\alpha$ an integer, $\alpha > \ell$, before storage in the one-dimensional array, and then disentangling each generated $w$ by letting $z = w/\alpha$, so that $y = [z]$ and $x = w - \alpha y$. Either procedure adds to the generation time compared with SU: the 2-array method adds less extra time but doubles an already excessive storage requirement.
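The coded one-array variant can be sketched as below. This is an illustrative sketch only: the names `encode`, `decode`, `build_coded_urn` and `draw_pair` are hypothetical, the probability function is assumed to be supplied as a dictionary keyed by $(x, y)$, and, as a simplifying assumption, the index is scaled by the number of filled locations rather than by $10^c$.

```python
import random

def encode(x, y, alpha):
    """Code the pair (x, y) as the single integer w = x + alpha * y."""
    return x + alpha * y

def decode(w, alpha):
    """Recover (x, y) from w: y = [w / alpha], x = w - alpha * y."""
    y = w // alpha
    return w - alpha * y, y

def build_coded_urn(pmf, alpha, c=4):
    """Fill a one-dimensional urn with [10^c * P(x, y)] copies of each code w."""
    size = 10 ** c
    urn = []
    for (x, y), p in pmf.items():
        # Epsilon guards the integer part against floating-point error.
        urn.extend([encode(x, y, alpha)] * int(size * p + 1e-9))
    return urn

def draw_pair(urn, alpha, rng=random):
    """Generate one pair: one uniform, one look-up, one decoding step."""
    w = urn[int(len(urn) * rng.random())]
    return decode(w, alpha)

pmf = {(0, 0): 0.25, (1, 0): 0.25, (0, 1): 0.25, (2, 1): 0.25}
alpha = 3  # any integer greater than the largest x value (here 2)
urn = build_coded_urn(pmf, alpha, c=2)
pair = draw_pair(urn, alpha)
```

The decoding step is the extra generation time the text refers to; storing $x$ and $y$ in two parallel arrays would avoid it at the cost of doubling the storage.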
3.3 Conditional Urn Method (CU). Marsaglia (1963) also proposed an ingenious modification of simple urn. This requires $c$ urns. Let a truncated probability be $P(i) = 0.\delta_{i1}\delta_{i2}\cdots\delta_{ic}$, $i = 0, 1, \ldots, \ell$. Let $S_j = 10^{-j} \sum_{i=0}^{\ell} \delta_{ij}$. Thus $S_j$ ... and $N_k = \sum_{j=1}^{k} 10^{j} S_j$, $k = 1, \ldots,$