We use the word "graph" in two different ways. A graph may be a data structure in the memory of a computer. In this case, the nodes are represented by a certain number of bytes, and the edges are represented by pointers. The operations to be carried out are quite concrete: to "mark a node" means to change a bit in memory, to "find a neighbouring node" means to follow a pointer, and so on.
At other times, the graph exists only implicitly. For instance, we often use abstract graphs to represent games: each node corresponds to a particular position of the pieces on the board, and the fact that an edge exists between two nodes means that it is possible to get from the first to the second of these positions by making a single legal move. When we explore such a graph, it does not really exist in the memory of the machine. Most of the time, all we have is a representation of the current position (that is, of the node we are in the process of visiting) and possibly representations of a few other positions. In this case to "mark a node" means to take any appropriate measures that enable us to recognize a position we have already seen, or to avoid arriving at
the same position twice; to "find a neighbouring node" means to change the current position by making a single legal move; and so on. However, whether the graph is a data structure or merely an abstraction, the techniques used to traverse it are essentially the same. In this chapter we therefore do not distinguish the two cases.
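To make the distinction concrete, here is a minimal Python sketch (ours, not the book's) of a depth-first traversal written against an implicit graph: the neighbours function plays the role of "find a neighbouring node", and the visited set plays the role of "marking a node". The function names and the representation of positions are illustrative assumptions.

    def explore(start, neighbours, visit):
        """Depth-first traversal of an implicit graph.

        start: any hashable representation of the initial position
        neighbours: function returning the positions reachable in one legal move
        visit: function applied to each position the first time it is seen
        """
        visited = {start}                 # "marking a node"
        stack = [start]
        while stack:
            u = stack.pop()
            visit(u)
            for v in neighbours(u):       # "finding a neighbouring node"
                if v not in visited:
                    visited.add(v)
                    stack.append(v)

For an explicit graph, neighbours simply looks up an adjacency list; for a game, it generates the positions reachable in one legal move.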
6.2 TRAVERSING TREES
We shall not spend long on detailed descriptions of how to explore a tree. We simply remind the reader that in the case of binary trees three techniques are often used. If at each node of the tree we visit first the node itself, then all the nodes in the left-hand subtree, and finally, all the nodes in the right-hand subtree, we are traversing the tree in preorder; if we visit first the left-hand subtree, then the node itself, and finally, the right-hand subtree, we are traversing the tree in inorder; and if we visit first the left-hand subtree, then the right-hand subtree, and lastly, the node itself, then we are visiting the tree in postorder. Preorder and postorder generalize in the obvious way to nonbinary trees. These three techniques explore the tree from left to right. Three corresponding techniques explore the tree from right to left. It is obvious how to implement any of these techniques using recursion.

Lemma 6.2.1. For each of the six techniques mentioned, the time T(n) needed to explore a binary tree containing n nodes is in O(n).
Proof. Suppose that visiting a node takes a time in O(1), that is, the time required is bounded above by some constant c. Without loss of generality, we may suppose that c ≥ T(0). Suppose further that we are to explore a tree containing n nodes, n > 0, of which one node is the root, g nodes are situated in the left-hand subtree, and n − g − 1 nodes are in the right-hand subtree. Then

    T(n) ≤ T(g) + T(n − g − 1) + c    for n > 0.
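As an illustration, here is a minimal Python sketch (ours, not the book's) of the three left-to-right traversals; each clearly visits every node exactly once, in accordance with the lemma.

    class Node:
        def __init__(self, value, left=None, right=None):
            self.value, self.left, self.right = value, left, right

    def preorder(t, visit):
        if t is not None:
            visit(t.value)
            preorder(t.left, visit)
            preorder(t.right, visit)

    def inorder(t, visit):
        if t is not None:
            inorder(t.left, visit)
            visit(t.value)
            inorder(t.right, visit)

    def postorder(t, visit):
        if t is not None:
            postorder(t.left, visit)
            postorder(t.right, visit)
            visit(t.value)

The right-to-left variants are obtained by exchanging the two recursive calls in each function.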
In general, the node <i, j>, 0 ≤ j ≤ i, represents the position in which i matches remain on the table and the player whose turn it is may remove at most j of them. All the nodes whose second component is zero correspond to terminal positions, but only <0, 0> is interesting: the nodes <i, 0> for i > 0 are inaccessible. Similarly, nodes <i, j> with j odd and j < i − 1 cannot be reached starting from any initial position. Figure 6.6.1 shows part of the graph corresponding to this game. The square nodes represent losing positions and the round nodes are winning positions. The heavy edges correspond to winning moves: in a winning position, choose one of the heavy edges in order to win. There are no heavy edges leaving a losing position, corresponding to the fact that such positions offer no winning move. We observe that a player who has the first move in a game with two, three, or five matches has no winning strategy, whereas he does have such a strategy in the game with four matches.
Problem 6.6.14. Add nodes <8, 7>, <7, 6>, <6, 5> and their descendants to the graph of Figure 6.6.1.
Figure 6.6.1. Part of a game graph.
Problem 6.6.15. Can a winning position have more than one losing position among its successors? In other words, are there positions in which several different winning moves are available? Can this happen in the case of a winning initial position?

The obvious algorithm to determine whether a position is winning is the following.

    function rec(i, j)
      { returns true if and only if the node <i, j> is winning;
        we assume that 0 ≤ j ≤ i }
      for k ← 1 to j do
        if not rec(i − k, min(2k, i − k)) then return true
      return false

This costs one multiplication for each of the 2^(k−1) − 1 monic factors introduced by the recursive decomposition, plus the k − 1 multiplications needed to compute x², x⁴, ..., x^(2^(k−1)), and hence M(k) = 2^(k−1) + k − 2. In other words, (n − 3)/2 + lg(n + 1) multiplications are sufficient to evaluate a preconditioned polynomial of degree n = 2^k − 1.

* Problem 7.1.5. Prove that if the monic polynomial p(x) is given by its coefficients, there does not exist an algorithm that can calculate p(x) using fewer than n − 1 multiplications in the worst case. In other words, the time invested in preconditioning the polynomial allows us to evaluate it subsequently using essentially half the number of multiplications otherwise required.
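The following Python sketch (our illustration, not the book's code; the names and the tuple representation are ours) shows one way to carry out the preconditioning and the subsequent evaluation for a monic polynomial of degree n = 2^k − 1. It writes p(x) = (x^m + a)q(x) + r(x) with m = 2^(k−1) and q, r monic of degree m − 1, and recurses.

    def precondition(p):
        """p: coefficients of a monic polynomial, low order first;
        len(p) must be a power of two (degree n = 2**k - 1)."""
        if len(p) == 2:                    # degree 1: x + p[0]
            return ('leaf', p[0])
        m = len(p) // 2                    # p(x) = x**m * q(x) + s(x)
        s, q = p[:m], p[m:]                # q is monic of degree m - 1
        a = s[m - 1] - 1                   # chosen so that r = s - a*q is monic
        r = [s[i] - a * q[i] for i in range(m)]
        return ('node', m, a, precondition(q), precondition(r))

    def evaluate(tree, x, powers):
        """powers[m] must hold x**m for every m = 2, 4, ..., 2**(k-1)."""
        if tree[0] == 'leaf':
            return x + tree[1]             # no multiplication at the leaves
        _, m, a, qt, rt = tree
        return (powers[m] + a) * evaluate(qt, x, powers) + evaluate(rt, x, powers)

    p = [5, -2, 0, 7, 4, -1, 3, 1]         # a monic polynomial of degree 7 (k = 3)
    tree = precondition(p)
    x = 2
    powers = {2: x * x}
    powers[4] = powers[2] * powers[2]      # k - 1 = 2 multiplications for the powers
    assert evaluate(tree, x, powers) == sum(c * x**i for i, c in enumerate(p))

The three multiplications performed by evaluate, plus the two needed for the powers of x, give the five promised by M(3) = 2² + 3 − 2.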
Problem 7.1.6. Show that evaluation of a preconditioned polynomial of degree n = 2^k − 1 requires (3n − 1)/2 additions in the worst case.

Problem 7.1.7. Generalize this method of preconditioning to polynomials that are not monic. Your generalization must give an exact answer, with no risk of rounding error due to the use of floating-point arithmetic.
Problem 7.1.8. (Continuation of Problem 7.1.7) Further generalize the method to polynomials of any degree.
Problem 7.1.9. Show using an explicit example that the method described here does not necessarily give an optimal solution (that is, it does not necessarily minimize the required number of multiplications) even in the case of monic polynomials of degree n = 2^k − 1.

Problem 7.1.10. Is the method appropriate for polynomials involving real coefficients and real variables? Justify your answer.

7.2 PRECOMPUTATION FOR STRING-SEARCHING PROBLEMS
The following problem occurs frequently in the design of text-processing systems (editors, macroprocessors, information retrieval systems, etc.). Given a target string consisting of n characters, S = s₁s₂ ... sₙ, and a pattern consisting of m characters, P = p₁p₂ ... pₘ, we want to know whether P is a substring of S, and if so, whereabouts in S it occurs. Suppose without loss of generality that n ≥ m. In the analyses that follow, we use the number of comparisons between pairs of characters as a barometer to measure the efficiency of our algorithms.

The following naive algorithm springs immediately to mind. It returns r if the first occurrence of P in S begins at position r (that is, r is the smallest integer such that sᵣ₊ᵢ₋₁ = pᵢ for i = 1, 2, ..., m), and it returns 0 if P is not a substring of S.
    for i ← 0 to n − m do
      ok ← true
      j ← 1
      while ok and j ≤ m do
        if pⱼ ≠ sᵢ₊ⱼ then ok ← false
        else j ← j + 1
      if ok then return i + 1
    return 0
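A direct Python transcription (ours, for illustration) of the naive algorithm:

    def naive_search(S, P):
        """Return the position (from 1) of the first occurrence of P in S, or 0."""
        n, m = len(S), len(P)
        for i in range(n - m + 1):
            ok, j = True, 0
            while ok and j < m:        # compare P to S[i:i+m] character by character
                if P[j] != S[i + j]:
                    ok = False
                else:
                    j += 1
            if ok:
                return i + 1
        return 0

    assert naive_search("abracadabra", "cad") == 5

In the worst case this makes on the order of m(n − m + 1) comparisons, which is the point of departure for the more efficient precomputation-based algorithms of this section.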
For the execution time to be independent of the permutation σ, it suffices to choose the pivot randomly among the n elements of the array T. The fact that we no longer calculate a pseudomedian simplifies the algorithm and avoids recursive calls. The resulting algorithm resembles the iterative binary search of Section 4.3.

    function selectionRH(T[1 .. n], k)
      { finds the kth smallest element in array T;
        we assume that 1 ≤ k ≤ n }

To analyse the efficiency of this algorithm, we need to determine its probability p of success, the average number s of nodes that it explores in the case of success, and the average number e of nodes that it explores in the case of failure. Clearly s = 9 (counting the 0-promising empty vector). Using a computer we can calculate p ≈ 0.1293 and e ≈ 6.971. A solution is therefore obtained more than one time out of eight by proceeding in a completely random fashion!
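For reference, here is a Python sketch (ours; the book's QueensLV is expressed in pseudocode with the sets col, diag45, and diag135) of the greedy Las Vegas approach being analysed: it places a queen at random on each successive row among the positions still open, and reports failure as soon as no position is open.

    import random

    def queens_lv(n=8):
        """One Las Vegas attempt; returns a list of column indices or None."""
        col, diag45, diag135 = set(), set(), set()
        rows = []
        for k in range(n):                # place a queen on row k
            open_cols = [j for j in range(n)
                         if j not in col and j - k not in diag45
                                         and j + k not in diag135]
            if not open_cols:
                return None               # failure: give up and start afresh
            j = random.choice(open_cols)
            col.add(j); diag45.add(j - k); diag135.add(j + k)
            rows.append(j)
        return rows

    solution = None
    while solution is None:               # expected number of attempts is 1/p, about 7.7
        solution = queens_lv()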
The expected number of nodes explored if we repeat the algorithm until a success is finally obtained is given by the general formula s + (1 − p)e/p ≈ 55.927, less than half the number of nodes explored by the systematic backtracking technique.
Problem 8.5.2. When there is more than one position open for the (k + 1)st queen, the algorithm QueensLV chooses one at random without first counting the number nb of possibilities. Show that each position has, nevertheless, the same probability of being chosen.
We can do better still. The Las Vegas algorithm is too defeatist: as soon as it detects a failure it starts all over again from the beginning. The backtracking
algorithm, on the other hand, makes a systematic search for a solution that we know has nothing systematic about it. A judicious combination of these two algorithms first places a number of queens on the board in a random way, and then uses backtracking to try to add the remaining queens, without, however, reconsidering the positions of the queens that were placed randomly. An unfortunate random choice of the positions of the first few queens can make it impossible to add the others. This happens, for instance, if the first two queens are placed in positions 1 and 3, respectively. The more queens we place randomly, the smaller is the average time needed by the subsequent backtracking stage, but the greater is the probability of a failure. The resulting algorithm is similar to QueensLV, except that the last two lines are replaced by
    until nb = 0 or k = stopVegas
    if nb > 0 then backtrack(k, col, diag45, diag135, success)
    else success ← false

where 1 ≤ stopVegas ≤ 8 indicates how many queens are to be placed randomly before moving on to the backtracking phase. The latter looks like the algorithm Queens of Section 6.6.1 except that it has an extra parameter success and that it returns immediately after finding the first solution if there is one.

The following table gives for each value of stopVegas the probability p of success, the expected number s of nodes explored in the case of success, the expected number e of nodes explored in the case of failure, and the expected number t = s + (1 − p)e/p of nodes explored if the algorithm is repeated until it eventually finds a solution. The case stopVegas = 0 corresponds to using the deterministic algorithm directly.
    stopVegas      p        s        e        t
        0       1.0000   114.00     --     114.00
        1       1.0000    39.63     --      39.63
        2       0.8750    22.53    39.67    28.20
        3       0.4931    13.48    15.10    29.01
        4       0.2618    10.31     8.79    35.10
        5       0.1624     9.33     7.29    46.92
        6       0.1357     9.05     6.98    53.50
        7       0.1293     9.00     6.97    55.93
        8       0.1293     9.00     6.97    55.93
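The t column can be reproduced from p, s, and e with the general formula quoted earlier; a quick Python check (ours):

    rows = {2: (0.8750, 22.53, 39.67), 3: (0.4931, 13.48, 15.10),
            4: (0.2618, 10.31,  8.79), 5: (0.1624,  9.33,  7.29),
            6: (0.1357,  9.05,  6.98), 7: (0.1293,  9.00,  6.97)}
    for stopVegas, (p, s, e) in rows.items():
        t = s + (1 - p) * e / p        # expected nodes until a solution is found
        print(stopVegas, round(t, 2))  # 28.2, 29.0, 35.1, 46.93, 53.51, 55.93

The computed values agree with the table up to the rounding of the tabulated p, s, and e.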
We tried these different algorithms on a CYBER 835. The pure backtracking algorithm finds the first solution in 40 milliseconds, whereas an average of 10 milliseconds is all that is needed if the first two or three queens are placed at random. The original greedy algorithm QueensLV, which places all the queens in a random way, takes on the average 23 milliseconds to find a solution. This is a fraction more than
half the time taken by the backtracking algorithm because we must also take into account the time required to make the necessary pseudorandom choices of position.
Problem 8.5.3. If you are still not convinced of the value of this technique, we suggest you try to solve the twelve queens problem without using a computer. First, try to solve the problem systematically, and then try again, this time placing the first five queens randomly.
For the eight queens problem, a systematic search for a solution beginning with the first queen in the first column takes quite some time. First the trees below the 2-promising nodes [1, 3] and [1, 4] are explored to no effect. Even when the search starting from node [1, 5] begins, we waste time with [1, 5, 2] and [1, 5, 7]. This is one reason why it is more efficient to place the first queen at random rather than to begin the systematic search immediately. On the other hand, a systematic search that begins with the first queen in the fifth column is astonishingly quick. (Try it!) This unlucky characteristic of the upper left-hand corner is nothing more than a meaningless accident. For instance, the same corner is a better than average starting point for the problems with five or twelve queens. What is significant, however, is that a solution can be obtained more rapidly on the average if several queens are positioned randomly before embarking on the backtracking phase. Once again, this can be understood intuitively in terms of the lack of regularity in the solutions (at least when the number of queens is not 4k + 2 for some integer k). Here are the values of p, s, e, and t for a few values of stopVegas in the case of the twelve queens problem.

    stopVegas      p        s        e        t
        0       1.0000   262.00     --     262.00
        5       0.5039    33.88    47.23    80.39
       12       0.0465    13.00    10.20   222.11
On the CYBER 835 the Las Vegas algorithm that places the first five queens randomly
before starting backtracking requires only 37 milliseconds on the average to find a solution, whereas the pure backtracking algorithm takes 125 milliseconds. As for the greedy Las Vegas algorithm, it wastes so much time making its pseudorandom choices of position that it requires essentially the same amount of time as the pure backtracking algorithm. An empirical study of the twenty queens problem was also carried out using an Apple II personal computer. The deterministic backtracking algorithm took more than 2 hours to find a first solution. Using the probabilistic approach and placing the first
ten queens at random, 36 different solutions were found in about five and a half minutes. Thus the probabilistic algorithm turned out to be almost 1,000 times faster per solution than the deterministic algorithm. ** Problem 8.5.4. If we want a solution to the general n queens problem, it is obviously silly to analyse exhaustively all the possibilities so as to discover the optimal
value of stopVegas, and then to apply the corresponding Las Vegas algorithm. In fact, determining the optimal value of stopVegas takes longer than a straightforward search for a solution using backtracking. (We needed more than 50 minutes computation on the CYBER to establish that stopVegas = 5 is the optimal choice for the twelve queens problem!) Find an analytic method that enables a good, but not necessarily optimal, value of stopVegas to be determined rapidly as a function of n.

** Problem 8.5.5. Technically, the general algorithm obtained using the previous problem (first determine stopVegas as a function of n, the number of queens, and then try to place the queens on the board) can only be considered to be a Las Vegas algorithm if its probability of success is strictly positive for every n. This is the case if and only if there exists at least one solution. If no solution exists, the obstinate probabilistic algorithm will loop forever without realizing what is happening. Prove or disprove: the n queens problem can be solved for every n ≥ 4. Combining this with Problem 8.5.4, can you find a constant δ > 0 such that the probability of success of the Las Vegas algorithm to solve the n queens problem is at least δ for every n?
8.5.2 Square Roots Modulo p
Let p be an odd prime. An integer x is a quadratic residue modulo p if 1 ≤ x ≤ p − 1 and if there exists an integer y such that x ≡ y² (mod p).

For every ε > 0 fixed in advance, there exists a Monte Carlo algorithm that is able to handle a sequence of m questions in an average total time in O(m). The algorithm never makes an error when Sᵢ = Sⱼ; in the opposite case its probability of error does not exceed ε. This algorithm provides an interesting application of universal hashing (Section 8.4.4).

Let ε > 0 be the error probability that can be tolerated for each request to test the equality of two sets. Let k = ⌈lg(max(m, 1/ε))⌉. Let H be a universal₂ class of functions from U into {0, 1}ᵏ, the set of k-bit strings. The Monte Carlo algorithm first chooses a function at random in this class and then initializes a hash table that has U for its domain. The table is used to implement a random function rand : U → {0, 1}ᵏ as follows.
    function rand(x)
      if x is in the table then return its associated value
      y ← some random k-bit string
      add x to the table and associate y to it
      return y

Notice that this is a memory function in the sense of Section 5.7. Each call of rand(x) returns a random string chosen with equal probability among all the strings of length k. Two different calls with the same argument return the same value, and two calls with different arguments are independent. Thanks to the use of universal hashing, each call of rand(x) takes constant expected time. To each set Sᵢ we associate a variable v[i] initialized to the binary string composed of k zeros. Here is the algorithm for adding an element x to the set Sᵢ. We suppose that x is not already a member of Sᵢ.

    procedure add(i, x)
      v[i] ← v[i] ⊕ rand(x)
The notation t ⊕ u stands for the bit-by-bit exclusive-or of the binary strings t and u. The algorithm to test the equality of Sᵢ and Sⱼ is:

    function test(i, j)
      if v[i] = v[j] then return true
      else return false

It is obvious that Sᵢ ≠ Sⱼ if v[i] ≠ v[j]. What is the probability that v[i] = v[j] when Sᵢ ≠ Sⱼ? Suppose without loss of generality that there exists an x₀ ∈ Sᵢ such that x₀ ∉ Sⱼ. Let Sᵢ′ = Sᵢ \ {x₀}. For a set S ⊆ U, let
XOR(S) be the exclusive-or of the rand(x) for every x ∈ S. By definition, v[i] = XOR(Sᵢ) = rand(x₀) ⊕ XOR(Sᵢ′) and v[j] = XOR(Sⱼ). Let y₀ = XOR(Sᵢ′) ⊕ XOR(Sⱼ). The fact that v[i] = v[j] implies that rand(x₀) = y₀; the probability of this happening is only 2⁻ᵏ since the value of rand(x₀) is chosen independently of those values that contribute to y₀. Notice the similarity to the use of signatures in Section 7.2.1.
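A compact Python sketch of the whole scheme (ours; Python's built-in dictionary stands in for the universal₂ hash table, and the value of k is an arbitrary assumption):

    import random

    K = 64                    # k-bit signatures; error probability per test is 2**-K
    table = {}                # the memory function rand: U -> {0,1}^K
    v = {}                    # v[i] is the XOR signature of the set S_i

    def rand(x):
        if x not in table:
            table[x] = random.getrandbits(K)   # a fresh random K-bit string
        return table[x]

    def init(i):              # S_i starts empty: the string of K zeros
        v[i] = 0

    def add(i, x):            # assumes x is not already a member of S_i
        v[i] ^= rand(x)

    def test(i, j):           # never wrong when S_i = S_j; errs with prob <= 2**-K otherwise
        return v[i] == v[j]

    init(1); init(2)
    add(1, "a"); add(2, "a")
    assert test(1, 2)         # equal sets always compare equal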
This Monte Carlo algorithm differs from those in the two previous sections in that our confidence in an answer "Sᵢ = Sⱼ" cannot be increased by repeating the call of test(i, j). It is only possible to increase our confidence in the set of answers obtained to a sequence of requests by repeating the application of the algorithm to the entire sequence. Moreover, the different tests of equality are not independent. For instance, if Sᵢ ≠ Sⱼ, x ∉ Sᵢ ∪ Sⱼ, Sₖ = Sᵢ ∪ {x}, Sₗ = Sⱼ ∪ {x}, and if an application of the algorithm replies incorrectly that Sᵢ = Sⱼ, then it will also reply incorrectly that Sₖ = Sₗ.
Problem 8.6.11. What happens with this algorithm if by mistake a call of add(i, x) is made when x is already in Sᵢ?
Problem 8.6.12. Show how you could also implement a procedure elim(i, x), which removes the element x from the set Sᵢ. A call of elim(i, x) is only permitted when x is already in Sᵢ.
Problem 8.6.13. Modify the algorithm so that it will work correctly (with probability of error c) even if a call of add (i, x) is made when xE Si . Also implement a request member (i , x), which decides without ever making an error whether x E Si . A sequence of m requests must still be handled in an expected time in O (m). Universal hashing allows us to implement a random function ** Problem 8.6.14. rand : U -*{O,1}". The possibility that rand (x 1) = rand (X2) even though x I * X2, which does not worry us when we want to test set equality, may be troublesome for other applications. Show how to implement a random permutation. More precisely, let N be an integer and let U = { 1, 2, ... , N 1. You must accept two kinds of request : init and p (i) for 1 _ (310 for every real constant 0 > 1. For the third case prove by constructive induction that t (n) 5 S [(lg n)1gT - yt(lg n)(197)- 11 - p Ig n , for some constants S, yt, and p that you must determine and for n sufficiently large. Also use the fact that
(lg1')Ig1< T(lgn)IST+2lgylg1(lg1`r)(IgY)-1 provided n
y21g R, for all real constants Q ? 1 and y > 2.]
Let t(n) = M(n)/n. The equations obtained earlier for M(n) lead to t(n) ∈ 6t(3√n) + O(log n) when lg n is even, and t(n) ∈ 5t((5/2)√(2n)) + O(log n) when lg n is odd. By Problem 9.5.2, t(n) ∈ O((log n)^(lg 6)) ⊆ O((log n)^2.59). Consequently, this algorithm can multiply two n-bit integers in a time in M(n) = nt(n) ∈ O(n(log n)^2.59).
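To see where the factor 6 comes from, here is the short calculation (our reconstruction, assuming the earlier equations give M(n) ∈ 2√n M(3√n) + O(n log n) when lg n is even). Dividing through by n,

    t(n) = M(n)/n ∈ [2√n M(3√n) + O(n log n)]/n
         = 6 M(3√n)/(3√n) + O(log n) = 6t(3√n) + O(log n),

since 2√n M(3√n)/n = 6 M(3√n)/(3√n). The case when lg n is odd is handled in the same way.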
Problem 9.5.3. Prove that O(n(log n)^2.59) ⊆ O(n^α) whatever the value of the real constant α > 1. This algorithm therefore outperforms all those we have seen previously, provided n is sufficiently large.
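A quick numeric sanity check (ours): writing L = lg n, the claim n(log n)^2.59 ∈ O(n^1.1), say, amounts to L + 2.59 lg L ≤ 1.1L for all sufficiently large L.

    # Compare lg(n (lg n)^2.59) = L + 2.59*lg(L) against lg(n^1.1) = 1.1*L.
    from math import log2
    for L in (64, 128, 256, 512):
        print(L, L + 2.59 * log2(L) <= 1.1 * L)   # False, False, True, True

So n^1.1 only overtakes n(log n)^2.59 somewhere between n = 2^128 and n = 2^256: the containment is real, but it bites only for astronomically large n.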
Is this algorithm optimal? To go still faster using a similar approach, Problem 9.5.2 suggests that we should reduce the constant γ = 6, which arises here as the maximum of 2 × 3 and √2 × (5/2)√2. This is possible provided we increase slightly the size of the coefficients of the polynomials used in order to decrease their degree. More precisely, we split the n-bit integers to be multiplied into k blocks of l bits, where
l = 2^(i+⌈(lg n)/2⌉) and k = n/l for an arbitrary constant i ≥ 0. Detailed analysis shows that this gives rise to 2^(1−i)√n recursive calls on integers of size (2^(i+1) + 2^(−i))√n if lg n is even, and to 2^(−i)√(2n) recursive calls on integers of size (2^(i+1) + 2^(−i−1))√(2n) if lg n is odd, provided n is sufficiently large. The corresponding γ is thus

    γ = max(2^(1−i)(2^(i+1) + 2^(−i)), 2^(1−i)(2^(i+1) + 2^(−i−1))) = 4 + 2^(1−2i).

The algorithm obtained takes a time in O(n(log n)^a), where

    a = 2 + lg(1 + 2^(−1−2i))
h(k) ≥ k lg k for every integer k ≥ 1 now follows by mathematical induction. The base k = 1 is immediate. Let k > 1. Suppose by the induction hypothesis that h(j) ≥ j lg j for every strictly positive integer j ≤ k − 1. By definition,

    h(k) = k + min{ h(i) + h(k−i) | 1 ≤ i ≤ k−1 }
         ≥ k + min{ i lg i + (k−i) lg(k−i) | 1 ≤ i ≤ k−1 }
         ≥ k + k lg(k/2) = k lg k,

where the last inequality holds because i lg i + (k−i) lg(k−i) is a convex function of i that attains its minimum at i = k/2. Hence H(T) ≥ h(k) ≥ k lg k for every tree T with k leaves. The average height of T being H(T)/k, it is therefore at least lg k.
* Problem 10.1.10. Let t = ⌊lg k⌋ and l = k − 2^t. Prove that h(k) = kt + 2l, where h(k) is the function used in the proof of Lemma 10.1.3. Prove that this also implies that h(k) ≥ k lg k. (Optionally: give an intuitive interpretation of this formula in the context of the average height of a tree with k leaves.)
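The closed form is easy to check mechanically; a small Python verification (ours) of both claims of Problem 10.1.10 for the first few values of k:

    from functools import lru_cache
    from math import log2

    @lru_cache(maxsize=None)
    def h(k):
        """h(1) = 0 and h(k) = k + min{h(i) + h(k-i) : 1 <= i <= k-1}."""
        if k == 1:
            return 0
        return k + min(h(i) + h(k - i) for i in range(1, k))

    for k in range(1, 65):
        t = k.bit_length() - 1            # t = floor(lg k)
        l = k - 2**t                      # l = k - 2^t
        assert h(k) == k * t + 2 * l      # the closed form
        assert h(k) >= k * log2(k)        # hence h(k) >= k lg k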
Theorem 10.1.2. Any deterministic algorithm for sorting by comparison makes at least lg(n!) comparisons on the average to sort n elements. It therefore takes an average time in Ω(n log n).

Proof. Follows immediately from Lemmas 10.1.2 and 10.1.3.
Problem 10.1.11. Determine the number of comparisons performed on the average by the insertion sorting algorithm and by the selection sorting algorithm when sorting n elements. How do these values compare to the number of comparisons performed by these algorithms in the worst case?

* Problem 10.1.12. Let T[1 .. n] be an array and k