Fast Deterministic Backtrack Search

6 downloads 0 Views 197KB Size Report
We refer to the processors containing one or more dormant nodes as renegades. Let. E(j) denote the set of renegades at the beginning of Phase j and let C(j) ...
Fast Deterministic Backtrack Search Kieran T. Herley

y

Andrea Pietracaprina

z

Geppino Pucci

x

Abstract The Backtrack Search Problem involves visiting all the nodes of an arbitrary binary tree given a pointer to its root, subject to the constraint that the children of a node are revealed only after their parent is visited. This note describes a fast deterministic backtrack search algorithm for a p-processor COMMON CRCW-PRAM ?  that visits any n-node tree with height h in time O (n=p + h)(log log log p)2 . This upperbound compares favourably with a natural (n=p + h) lower bound for this problem. Our approach embodies novel, ecient techniques for dynamically assigning tree-nodes to processors to ensure that the work is shared evenly among the processors.

1 Introduction In this note we present a deterministic algorithm for the backtrack search problem, which involves visiting the nodes of an arbitrary n-node binary tree T stored in the memory of a p-processor COMMON CRCW-PRAM where only the root of the tree is initially known, and pointers to the children of any node become available only after the node is visited. This research was supported, in part, by the ESPRIT III Basic Research Programme of the EC under contract No. 9072 (project GEPPCOM). y Department of Computer Science, University College Cork, Cork, Ireland. z Dipartimento di Matematica Pura e Applicata, Universit a di Padova, Padova, Italy x Dipartimento di Elettronica e Informatica, Universit a di Padova, Padova, Italy 

1

It is easy to see that any backtrack search algorithm requires (n=p + h) time, since (a) at least (n) work is needed to visit n nodes and (b) there exists a path of h nodes whose exploration times form a strictly increasing sequence. A number of earlier studies have investigated this problem [Ran91, KP95] and the related problems of branch and bound [KZ93, KP95] and dynamic tree embedding [BGLL91] and motivate the signi cance of these problems by indicating their application to various tree-structured combinatorial optimization problems. Optimal randomized algorithms requiring time O(n=p + h) with high probability are given for the p-processor clique architecture [KZ93] and the butter y interconnection [Ran91]. The paper of Kaklamanis and p Persiano [KP95] describes a deterministic algorithm with a running time of O( ph log p) on the p-node two-dimensional mesh and a lower bound showing that this algorithm is within a polylogarithmic factor of optimality for this architecture. This latter study is the only work to date that focuses on deterministic algorithms, but the results are restricted to trees of size p. To specify the problem more formally, let us assume that each PRAM cell may hold a record representing a tree node that contains two memory addresses, interpreted as the pointers to the node's children, a tag- eld, and a small constant amount of additional space, used by the algorithm for bookkeeping. The root of the tree occupies cell 0, and a zero-valued pointer will be interpreted as a nil pointer. The tag eld at the root of the tree contains some arbitrary symbol, which will be referred to as the signature. The goal of the backtrack search problem is to visit the entire tree. We say that a node is visited when the signature is copied into the tag eld of its corresponding memory record and the pointers to its children have been revealed. No assumption is made about the structure of the tree or about its location within memory. Note that while our formulation implicitly suggests that the tree is present in memory before the execution of the algorithm, this is not an essential requirement: all of our techniques apply equally to the setting where the structure of the tree is determined dynamically. One straightforward strategy to solve the backtrack search problem is to visit the tree level by level. The algorithm is divided into phases, where each phase visits all the nodes at a certain level and evenly redistributes their children among the processors, in order 2

to balance the work to be done in the next phase. The redistribution could, for example, be accomplished deterministically using parallel pre x techniques [JaJ92]. However, a more ecient approach on the COMMON CRCW-PRAM is to use Goldberg and Zwick's algorithm for Approximate Pre x Sums [GZ94] which takes an arbitrary sequence of p integer values a ; a ; : : :; ap? and produces values b ; b ; : : :; bp? such that Pij aj  bi  (1 + ) Pij aj , with  = o(1), and bi  bi? + ai. The algorithm is deterministic and runs in O(log log p) time using p= log log p processors. The use of this strategy for redistribution yields an O(n=p + h log log p) algorithm for the backtrack search problem, for any values of n, h and p. For ease of reference, we encapsulate this observation in the following lemma. 0

=0

1

1

0

1

1

=0

1

Lemma 1 There is a deterministic algorithm running on a p-processor COMMON CRCW

PRAM that visits all the nodes of an arbitrary n-node binary tree of height h in time O(n=p + h log log p).

In this note, we establish the following result, which improves substantially upon the above straightforward algorithm for trees of size O (ph log log p=(log log log p) ). 2

Theorem 1 There is a deterministic algorithm running on a p-processor COMMON CRCW PRAM that visits all the nodes of an arbitrary n-node binary tree of height h in time O ((n=p + h)(log log log p)2).

Notice that this result is within a triply-logarithmic factor of the natural (n=p + h) lower bound mentioned earlier.

2 A High-Level View of the Algorithm The main idea behind our algorithm is use global redistributions (using approximate pre x sums) sparingly and to use other more ecient techniques to dynamically reallocate the tree nodes available for visiting among the processors between successive redistributions. At any time during the backtrack search procedure, we de ne frontier as the set of nodes ready to be visited, that is, nodes not yet visited but whose parents have been visited. Our 3

algorithm performs a sequence of stages. Each stage visits a stratum of the tree consisting of ` = (log log p) consecutive levels. The initial frontier for a stage contains tree nodes at some particular level in the tree. As the stage executes the frontier evolves: whenever a node is visited it is replaced in the frontier by its children, if any. The frontier need not necessarily evolve in a regular level-by-level fashion, indeed the frontier may contain tree nodes of many di erent levels, but the algorithm is biased in favour of tree nodes from upper levels to ensure that the progress of the algorithm is predictable. We now outline the algorithm for a single stage. Let m be the total number of nodes belonging to the stratum visited in the stage. If m = (p` ), then the straightforward strategy of Lemma 1 would explore the stratum optimally in O(m=p) time. We can restrict our attention to the case m = O(p` ) and interleave the execution of the resulting algorithm with that of the algorithm of Lemma 1. Clearly, the running time will be the minimum of the running times of the two. Therefore, in the following description we assume m = O(p` ). For convenience we number the levels of the stratum from 0 to ` ? 1, from top to bottom. The stage explores all the nodes in these levels. Let F (j ) denote those frontier nodes at level j , for 0  j < ` and let F = [`j? F (j ). In order to evaluate the progress that the algorithm is making, we de ne a weight function on the frontier F as 2

2

2

1 =0

w(F ) =

`X ?1 j =0

jF (j )j3`?j :

Note that the contribution of a frontier node to w(F ) is exponentially decreasing in its level. Therefore, visiting frontier nodes at lower numbered levels rather than nodes further down the tree results in a more substantial decrease in the weight function. A stage consists of two parts. In the rst part, we perform a sequence of visiting steps that explore nodes of the stratum until the the frontier weight becomes O(p). The visiting steps are executed in batches of ` and a weight estimate is computed after the execution of each batch, by using the approximate pre x sums algorithm, whose complexity is dominated by that of the visiting steps. Let F be the frontier left at the end of the rst part which is of weight O(p). The second part assigns 2`?j distinct processors to each 4

node in F (j ) (this can be done by using again approximate pre x sums). Note that the bound on the weight guarantees that the assignment is feasible. Then, the processors assigned to each frontier node visit the subtree (within the stratum) rooted at that node in a straightforward level-by-level fashion in O(`) time. (The subtree rooted at a node in F (j ) can not have more than 2`?j descendants at any level within the current stratum, so the assignment of tree nodes to processor is straightforward in this case.) At the end of the stage, the frontier nodes, consisting of all the children of nodes at level ` ? 1, are evenly distributed among the processors and the next stage can begin. In order to determine the total running time of the stage we need to give a bound on the number of visiting steps. Let Ft be the frontier at the beginning of the t-th visiting step. The step is called full, if it visits (p) nodes in Ft, and it is called reducing if it explores at least half of the nodes in Sij F (j ), for each i in the range 0  i < `. (Note that a visiting step can be both full and reducing.) In the next section, we will show how to perform a visiting step in time O((log log log p) ) ensuring that it is always either full or reducing. Clearly, there are at most O(m=p) full visiting steps in the stage, whereas the number of reducing steps is bounded by the following lemma. =0

2

Lemma 2 If m = O(p` ), then k = O(`) reducing visiting steps are sucient to reduce 2

the frontier weight to O(p).

Proof: Our proof is based on the following simple proposition. Claim : Let x0; x1;    ; xn?1 and y0; y1;    ; yn?1 be two sequences of nonnegative inte?1 x =3i  Pn?1 y =3i . gers such that for all k, 0  k < n, Pki=0 xi  Pki=0 yi, then Pni=0 i i=0 i

The proof of this proposition is by induction on the length of the sequences. The proposition is clearly true for sequences of length one. Consider sequences of length l+1 > 1 and assume that xl > yl, since the proposition follows trivially otherwise. It is easy to see that xl  xl ? yl + yl 3l 3l? 3l? 3l 1

hence

l X i=0

1

l? xi +  xl? + xl ? yl  + yl : xi  X 3i i 3i 3l 3l? 2

1

1

=0

5

Now since

l?2 X i=0

it follows that

xi + (xl? + xl ? yl)  1

l?1 X i=0

yi ;

l? l xi  X yi + yl  X yi : i i l 3 i 3 3 i 3i i This concludes the proof of the claim. Consider a single reducing step and suppose that prior to the execution of the step there were exactly fi nodes of level i in the frontier and that the visiting step visited exactly ni of these. Clearly, k k X fi  X ni ; 2 l X

1

=0

=0

=0

i=0

i=0

for each k, 0  k < `, since the visiting step is assumed to be reducing, hence `? ? f X 3` `X ni ; i `  3 i 2i 3 3i i 1

1

=0

=0

by the Claim. Thus, the visited nodes account for at least half the total frontier weight. Since the combined weight of the children of any node is at most two thirds of the weight of their parent, it follows that the weight reduction must be at least one third of the total weight of the visited nodes or at least one sixth of the frontier weight W prior to the execution of the visiting step. Thus, the frontier weight W 0 following the completion of the step is at most (5=6)W . Let Wt denote the total frontier weight following the completion of the tth visiting step. Since, there are O(p` ) frontier nodes initially, it follows that W = O(p` 3l). We have also seen that for each t > 0, Wt  (5=6)Wt? , which implies that Wt = O(p) for some t = O(`). 2 2

0

2

1

Consequently, the overall running time of a stage is O((m=p + `)(log log log p) ), which yields the result of Theorem 1. 2

6

3 Implementation of a visiting step Consider a generic stage and suppose that the total number of nodes of the stratum to be visited is m = O(p` ). The crucial component of the algorithm sketched in the previous section is a careful implementation of a visiting step. Speci cally, each visiting step must ensure that either (p) frontier nodes are visited or the total frontier weight is halved. The following two sections describe an implementation of a visiting step which enforces the above property. For convenience, we regard the processors as being conceptually arranged into an `  q array, with q = p=`. Each processor maintains a data structure, called Tree Ring (TR) storing tree nodes assigned to the processor. The TRs associated with the processors of the i-th row store all of the frontier nodes at level i, 0  i < `. A TR is structured as a forest of complete binary trees of di ering sizes. (To avoid confusion discussing the elements of the tree being visited and the trees employed in the TRs, we will use the term node exclusively in connection with the former and reserve the term vertex for the latter.) The leaf vertices in a TR hold nodes of the tree being visited, and each internal vertex contains pointers to its children and to the leftmost and rightmost leaf vertex in its subtree. All the trees in the same TR have di erent sizes and their roots are organized in a doubly-linked list ordered by tree size. A similar data structure was used in [CV88]. We will show that in any visiting step the maximum size of a tree of a TR is at most polynomial in `, therefore its height is always O(log `). A visiting step consists of two substeps, VISIT and BALANCE , which are described in the following paragraphs. 2

VISIT Within each column perform the following: 1. Select the minfs; 5c`g heaviest nodes from the TRs in the column, where s is the total number of nodes held in the column and c is a constant to be speci ed later. 2. Distribute these nodes evenly among the processors in the column. 3. Allow each processor to visit the nodes it receives. 7

4. Insert the children of these just-visited nodes into the appropriate TR within the column.

BALANCE This substep is executed in parallel by each row and aims to partially

balance the nodes stored in the TRs maintained by the processors of the row. We de ne the degree of a processor as the number of nodes contained in its TR (note that we only count the leaves of each tree). Consider Row i, and let fi be the total number of nodes held in the row processors' TRs. The BALANCE substep redistributes the nodes among the TRs in such a way that at the end of the substep at most minffi; qg=(2`K ) TRs hold more than cdfi =qe nodes, where K is a xed upper bound to the maximum degree of any processor in any row. Later, we will argue that K = O(poly(`)). The BALANCE substep guarantees never to increase the maximum processor degree in the row. Let F be the frontier at the beginning of a visiting step. The following lemma shows that a visiting step is always either full or reducing, which implies, as argued in the previous section, that O(m=p + `) visiting steps are sucient to complete the rst part of the stage.

Lemma 3 If jF j  4p then (p) nodes are visited in the step. Otherwise, for each 0  i < `, at least half of the nodes in Sij F (j ) are visited. =0

Proof: Fix any index i, 0  i < `, and recall that BALANCE substep executed at the end of the visiting step preceding the one under consideration has left at most minffj ; qg=(2`K ) processors of Row j with more than cdfj =qe tree nodes in their TR's, for each j , 0  j  i. We call the nodes belonging to these processors bad nodes and all the other ones good nodes. Since K is an upper bound to the degree of any processor and fj = jF (j )j, we have that the total number of bad nodes belonging to the rst i-levels of the frontier is

K

i X j =0





minffj ; qg  1 [i F (j ) ; 2`K 2 j =0

for any i, 0  i < `, thus the bad nodes at level i or lower account for at most half the total number of frontier nodes in those levels. Suppose jF j  4p and let r  q be the number of columns holding fewer than ` nodes. 8

Since a column holds at most P`j? cdfj =qe  c(jF j=q + `) good nodes, g the number of good nodes is bounded as follows: 1 =0

F  g  r` + (q ? r)c(F=q + `); 2 which implies r  q(5c ? 2)=(5c ? 1). Since c is constant greater than one, we conclude that (q) columns hold at least ` nodes and therefore will visit (`) nodes each. In this case, the visiting step is full. Now consider the case jF j  4p. Since the number of good nodes in each column is now at most c(jF j=q + `)  5c`, it follows that the total number of nodes to be visited in the step is at least equal to the total number of good nodes. From the observation made above, we know that if we visited only the good nodes, then for any 0  i < ` we would visit at least half of the frontier nodes at level i or lower and hence the step would be reducing. Since we explore up to 5c` nodes in each column and we select the heaviest nodes available, if s good nodes are not visited it can only because that number of heavier bad nodes are visited instead and so the weight reduction target is still met.

2 In order to implement the procedure described above, we need ecient primitives to operate on the TRs. Observe that at the beginning of a stage the degree of each processor is O(` ) (since the total stratum size and hence also the initial frontier size is assumed to be O(p` )), and that after each VISIT substep the degree increases by at most an O(`) additive term. Since the BALANCE substep does not increase the maximum degree and O(m=p + `) = O(` ) visiting steps are executed overall, we can conclude that K , i.e., the maximumdegree of any processor, will always be O(poly(`)). As a consequence, throughout the stage each TR contains at most O(log `) trees of O(poly(`)) size and O(log `) height each. Under these conditions, it is easy to prove that: 2

2

2

1. Given s = O(`) nodes evenly distributed among (s) processors, a TR whose trees contain these nodes as leaves, can be constructed by the processors in O(log `) time. 2. Two TRs can be merged into one TR in O(log `) time. 9

3. Any number of leaves s = O(`) can be extracted in O(log `) time by O(s) processors from a complete binary tree when initially only a pointer to the root is known by one of the processors. Moreover, the rest of the tree can be split into subtrees forming a TR within the same time bound. It can be easily argued that the VISIT substep can be implemented by using standard pre x operations within each column and by employing the above primitives to manipulate the TRs. The above discussion provides a proof of the following lemma.

Lemma 4 VISIT can be executed in O(log `) time. The BALANCE substep requires a more complex implementation which is presented in the next section.

4 Approximate balancing In this section, we describe how to implement the BALANCE substep of a visiting step. Recall that the substep is executed in parallel among the rows. Therefore, we will concentrate on the operations of an arbitrary row, say row h. Recall that K = O(poly(`)) is a xed upper bound to the degree of any processor and we can assume it is known to all processors. The purpose of the balancing is to redistribute the nodes among the processors' TRs in such a way that, after the redistribution, the number of TRs holding more than cdfh=qe nodes is at most minffh; qg=(2`K ). It should be noted that the values of f ;    ; fl? are not known to the individual processors during the execution of the stage. A feature of the balancing procedure is to move only pointers to trees between the TRs without physically moving the nodes, which will be too costly for our purposes. The algorithm we present is essentially a PRAM implementation of a variant of the balancing algorithm by Broder et al. [BFSU92]. We need the following de nition. 0

1

De nition 1 ([BFSU92]) A graph G = (V; E ) is called (a; b)-extrovert, for some a; b with 0 < a; b < 1, if for any set S  V , with jS j  ajV j, at least bjS j of its vertices have strictly more neighbors in V ? S than in S . 10

We identify the PRAM processors with the nodes of a d-degree (a; b)-extrovert graph, where a < 1, b < 1 and d > 1 are constants. Let = (4d + 3)=(4d + 4) and  = dlog = aK=4e. The BALANCE procedure consists of  phases, numbered from 0 to  ? 1. In each phase, a number of nodes are marked as dormant, and will not participate in subsequent phases. The remaining nodes are said to be active. At the beginning of Phase 0 all frontier nodes are active. For 0  i <  , Phase i executes the following actions. 1

1. Each processor determines its degree with respect to the active nodes. If it exceeds K i=2, then the processor declares itself congested. 2. Let = 2=b. Execute  = dlog = ?b (  4K`=a)e steps of the DAG construction algorithm of [BFSU92]. By the end of these steps, each congested processor with degree s that was not included in the DAG marks all but minfs; i K g of its active nodes dormant. The construction guarantees that each congested processor in the DAG has outdegree greater than its indegree, while each non congested processor in the DAG has outdegree 0. 1 (1

)

+1

3. Let j be such that 2j  K i=(2d + 2) < 2j . Note that 2j > K i=(4d + 4). Each processor in the DAG sends a sub-TR of size 2j to every active processor at the endpoint of an outgoing DAG edge (each sub-TR is a subset of the original TR and is a TR itself whose component trees account for a total of 2j leaves) +1

4. Each processor merges the sub-TRs it receives with what is left of its TR. We now show that, at the end of the above procedure, the number of processors of degree more than more than cdfh =qe is at most minffh ; qg=(2`K ).

Lemma 5 At the beginning of Phase i, 0  i <  , each processor holds at most K i active

nodes.

Proof: By induction on i. Immediate for i = 0. Suppose that the proposition is true up to index i ? 1 and consider index i. Consider a processor at the beginning of Phase i ? 1 that is congested, but does not succeed in being incorporated into the DAG. By de nition it retains

11

at most K i active nodes, since it marked all but that number as dormant. A congested processor that does succeed in being incorporated into the DAG starts with degree at most K i? (by induction) and its degree decreases by at least 2j  K i? =(4d + 4). Therefore, the residual degree is at most 1

1

i? i K i? ? 4K d + 4  K : 1

1

A processor that was not congested starts with degree at most K i? =2, and its degree increases by at most d2j  dK i? =(2d + 2). Therefore, the new degree is at most 1

1

K i? + dK i?  K i: 2 2d + 2 1

1

2 It is easy to argue that our implementation of BALANCE never increases the maximum processor degree. To see this consider any phase of BALANCE. None of the congested processors, whether or not it is incorporated into the DAG, will experience an increase in the number of nodes it holds. While the noncongested processor may acquire nodes, none will acquire enough to breach the congestion threshold for that phase. Hence, the phase will not increase the maximum processor degree. We refer to the processors containing one or more dormant nodes as renegades. Let E (j ) denote the set of renegades at the beginning of Phase j and let C (j ) denote the set of processors that declare themselves congested in step (a) of Phase j . De ne ej = jE (j )j and cj = jC (j )j, for 0  j <  .

Lemma 6 Let

!%

$

log = 4dKa fh =qe ; and note that  0   . For 0  j   0, we have ej  j (1 ? b) minffh ; qg.

0 =

1

Proof: By induction on j . Immediate for j = 0 since e0 = 0. Suppose that the proposition holds up to index j ? 1  0 and consider index j . Note that the renegades at the beginning

12

of Phase j will be given by the set E (j ? 1) plus a subset of C (j ? 1), namely those congested processors that do not make it into the DAG by the end of Phase j ? 1. Let Z = C (j ? 1) [ E (j ? 1), which we refer to as \potential renegades", that is, a superset of E (j ). Note that cj?  a minffh; qg=2  aq=2, since otherwise the degrees of the congested processors would account for more than 1

K j? a minffh; qg > K  a minffh; qg  f h 2 2 4 0

1

nodes, which is impossible. By induction,

ej?  j? (1 ? b) minffh; qg   (1 ? b) minffh; qg  a minffh; qg=2  aq=2: 1

1

Thus, we have that z = jZ j  aq. By the extrovertness of the graph superimposed on the processors, after the rst step of DAG construction, the number of potential renegades is reduced to at most z ? zb + ej? = z(1 ? b) + ej? . In general, it can be shown that after the k-th DAG construction step, the potential renegades are at most 1

kX ?1

z(1 ? b)k + ej?

1

s=0

1

(1 ? b)s = cj? (1 ? b)k + ej? 1

1

k X

(1 ? b)s:

s=0

Hence, the number of renegades at the beginning of Phase j will be

ej  cj? (1 ? b) + ej? 1

1

 X

(1 ? b)s

s=0

 X  a2 minffh; qg(1 ? b) + j? (1 ? b) minffh; qg (1 ? b)s 1

s=0

 a2 (1 ? b) minffh ; qg + j? 1b (1 ? b) minffh; qg ! j? a = 2 + b (1 ? b) minffh ; qg 1

1

 j (1 ? b) minffh ; qg:

2

Theorem 2 Let c = 4=(a ) = (1). By the end of the BALANCE procedure, the number 13

of processors of degree more than cdfh =qe is at most minffh ; qg=(2K`). Moreover, the procedure is executed in time O((log log log p)2 ) on the COMMON CRCW-PRAM. Proof: At the beginning of Phase  0, each processor has at most K   cdfh =qe active nodes (by Lemma 5), and the number of renegades is 0

fh ; q g e   minffh; qg(1 ? b)  min2fK` 0

0

(by Lemma 6 and the choice of ). The rst part of the theorem follows by observing that the remaining phases, which apply only to the nodes still active at that point, will not increase the maximum processor degree of these items above the cdfh =qe threshold. We now evaluate the running time. In each phase, every DAG construction step is accomplished in constant time and the extraction and merging operations on the TRs take O(log K ) time, because the degree of each processor is upper bounded by K . Noting that  = O(log K ) and  = O( + log K ) = O(log K ), we conclude that the overall running time is O( log K ) = O(log K ) = O(log `) = O((log log log p) ): 2

2

2

2

References [BFSU92] A.Z. Broder, A.M. Frieze, E. Shamir, and E. Upfal. Near-perfect token distribution. In Proc. of the 19th Int. Colloquium on Automata, Languages and Programming, pages 308{317, July 1992. [BGLL91] S.N. Bhatt, D. Greenberg, F.T. Leighton, and P. Liu. Tight bounds for on-line tree embeddings. In Proc. of the 2nd ACM-SIAM Symp. On Discrete Algorithms, pages 344{350, January 1991. [CV88]

R. Cole and U. Vishkin. Approximate parallel scheduling. Part I: The basic technique with applications to optimal parallel list ranking in logarithmic time. SIAM Journal on Computing, 17(1):128{142, 1988. 14

[GZ94]

T. Goldberg and U. Zwick. Optimal deterministic approximate parallel pre x sums and their applications. In Proc. of 3th Israel Symp. on Theory and Computing Systems, 1994.

[JaJ92]

J. JaJa. An Introduction to Parallel Algorithms. Addison Wesley, Reading MA, 1992.

[KP95]

C. Kaklamanis and G. Persiano. Branch-and-bound and backtrack search on mesh-connected arrays of processors. Mathematical Systems Theory, 1995. To appear.

[KZ93]

R.M. Karp and Y. Zhang. Randomized parallel algorithms for backtrack search and branch and bound computation. Journal of the ACM, 40:765{789, 1993.

[Ran91]

A.G. Ranade. Optimal speed-up for backtrack search on a butter y network. In Proc. of 3rd ACM Symp. on Parallel Algorithms and Architectures, pages 40{48, 1991.

15

Suggest Documents