Delayed Binary Search, or Playing Twenty Questions with ... - CiteSeerX

2 downloads 0 Views 198KB Size Report
Mar 25, 1999 - yComputer Science Division, University of California, Berkeley, CA 94720. Email: ..... 1 = d, then by Proposition 10, we're left searching the.
Delayed Binary Search, or Playing Twenty Questions with a Procrastinator



Andris Ambainisy Stephen A. Blochz David L. Schweizerx March 25, 1999 Corresponding author:

Stephen A. Bloch Department of Math & Computer Science Adelphi University Garden City, NY 11530 Email [email protected] Phone 516-877-4483 Fax 516-877-4499

Abstract

We study the classic binary search problem, with a delay between query and answer. For all constant delays, we give matching upper and lower bounds on the number of queries. Key Words: binary search, delay, Fibonacci series, Golden Ratio, monotone, search with apologies to [1] Computer Science Division, University of California, Berkeley, CA 94720. Email: [email protected]. z Department of Math & Computer Science, Adelphi University, Garden City, NY 11530. Email: [email protected]. x Barclays Global Investors, 45 Fremont Street, San Francisco, CA 94105. Email: [email protected].  y

1 Introduction and De nitions Recently one of us (S. Bloch) found himself talking with a colleague about how to match homework assignments to the intellectual level of an unknown group of students. An obvious algorithm appeared to be binary search: once a range of possible diculty levels has been determined, divide it in half (assuming some reasonable metric on homework diculty) and recurse on the upper or lower half depending on depending on the students' performance on this assignment. However, the course in question was a seminar that met once a week. One homework problem was assigned each week, due in writing in class the following week. As a result, there was no chance to grade the students' performance on assignment m until after giving assignment m + 1. In other words, each query must be chosen before the previous one is answered. The problem obviously generalizes to any xed delay d 2 N. Trivially, this form of search is at most a factor of d + 1 slower than the usual form (d = 0): simply ask the same query d + 1 times in a row, ignoring all but the rst answer. (This is roughly equivalent to giving homework only every d + 1 weeks, which the students would no doubt prefer.) The question remained: is this slowdown of d + 1 optimal? We show that the optimal slowdown is in fact log'd 2, where 'd is the positive real root of xd+1 = xd + 1. Several variants of the problem suggest themselves.

De nition 1 The real-real search problem with delay d and size n Instance: a real number  > 0, a function f : R ! R, and a domain

[a; b] (where b = a + n); we assume that f is nondecreasing on [a; b], and f (a) < 0 < f (b). Output: a pair of numbers (a ; b ) such that  a  a < b  b,  b ? a  , and  f (a )  0  f (b ). Allowed operations: the i-th query consists of a number xi, and is answered by one of f (xi ) < y0, f (xi ) = y0 , or f (xi) > y0 . For all i, the answer to query i is available before the algorithm chooses the value of xi+d+1 . 0

0

0

0

0

0

0

0

2

Note on the \allowed operations": Lance Fortnow points out that if

arbitrary subset queries are allowed, the problem becomes uninteresting: the algorithm can simply query each bit of the answer, and since these queries are nonadaptive anyway, the delay makes little di erence and the optimal algorithm takes exactly dlog2(n)e queries. Note also that we have placed no conditions on f other than that it be nondecreasing and have a root in the speci ed interval.

De nition 2 The discrete-real search problem with delay d and size n Instance: a function f : Z ! R and a domain [a; b] (where b = a + n);

we assume that f is nondecreasing on [a; b], and f (a) < 0 < f (b). Output: either an integer c 2 (a; b) such that f (c) = 0, or a pair of adjacent integers c; c + 1 2 [a; b] such that f (c) < 0 < f (c + 1). Allowed operations: as above, except that all xi are integers.

Note that since f (a) < 0 < f (b) is assumed, both versions of the problem really search the open interval (a; b), and we shall sometimes use this notation. As our title suggests, this problem is related to Ulam's famous question [2] about nding a number between one and one million by asking questions of an opponent who may lie once or twice. Various specializations and generalizations of this problem have been studied (e.g., [3, 4, 5, 1, 6]), including exact and asymptotic analyses, relations to error-correcting codes, and variations on the type of query allowed. None of this work has examined delayed answers. The problem also appears to be related to the problem of nding the maximum of a unimodal function. The rst solutions [7, 8, 9], in which Fibonacci searching was developed, were extended to parallel searching [9, 10] and to searching with a delay [11, 12]. More recently, this work has been extended to di erent searching rules [13] and generalized to k-modal function optimization (in which the function's k-th derivative is assumed to have a unique zero; see [14]). Although the two problems are related, they are not (see below) isomorphic.

De nition 3 The real-real unimodal optimization problem with delay d and size n

3

Instance: A real number  > 0, a function f : R ! R, and a domain

[a; b] (where b = a + n) on which f is presumed unimodal ( i.e. for some c 2 [a; b], f is increasing on [a; c) and decreasing on (c; b]). Output: a pair of numbers a ; b such that  a  a < b  b,  b ? a  , and  a  c  b , where c is the mode described above. Allowed operations: the i-th query consists of a real number xi, and is answered with the value of f (xi). For all i, the answer to query i is available before the algorithm chooses the value of xi+d+1 . 0

0

0

0

0

0

0

0

De nition 4 The lattice (or discrete-real) unimodal optimization problem

with delay d and size n Instance: a function f : Z ! R, and a domain [a; b] (where b = a + n) on which f is presumed unimodal. Output: an integer d 2 [a; b] such that for all integers e 2 [a; b], f (d)  f (e). Allowed operations: as above, except that all xi are integers.

The unimodal optimization problem with delays d = 1; 2 was studied by Beamer and Wilde[11], and extended to d = 3 by Li[12]. They show that log d n + o(log d n) queries are necessary and sucient, where

   

p 5 ? 1)=2  1:618:::, p 1 = 2  1:4145:::, 2  1:325, and 3  1:2786. 0=(

This result uses complicated case-by-case analysis and there is no obvious way of extending it to general d (or even to d = 4). The monotone search problem reduces easily to the unimodal search problem (simply minimize the function f (x)2). However, the converse does not hold, and it is easy to see that unimodal optimization requires more queries 4

than search in the delay-0 case. Our results together with [11, 12] imply that the same is true for search with delay greater than 0. Our delay-d optimal algorithm searches the interval (0; n) with log'd n + O(1) queries where 'd is the positive real root of xd+1 = xd +1. In particular, '0 = 2 (i.e. classical binary search is a special case of our algorithm), '1  1:618:::, and '2  1:466:::.

2 Main results Our rst approach to the problem was to construct a table, by dynamic programming, of the minimum number of queries necessary to solve the problem on search spaces of various sizes with various pending queries. We then looked for patterns in the table, proved them, and converted them into recurrence relations. This program had been carried out for the cases of delay 1 and 2, and we were working on delay 3 (the tables are d + 1-dimensional, so it becomes progressively harder to nd, state, and prove the patterns), before we managed to generalize the results. While this approach did generate a lot of pretty patterns and cute lemmas, we'll omit them in favor of the more concise general solution following.

De nition 5 Let Ad(t) denote the largest natural number such that the dis-

crete monotone search problem with delay d on (0; Ad (t)) can be solved with t queries.

De nition 6 Let

(

1 t0 Bd(t) = B ( t ? 1) + B ( t ? d ? 1) t >0 d d

(For example, B0(t) = 2t, and B1(t) is the t-th Fibonacci number.) The main result of the present paper is Theorem 7 For all d  0 and t  0,

Ad(t) = Bd(t): Once this is established, standard techniques for solving recurrences yield the following two corollaries: 5

Corollary 8 Let 'd be the positive real root of xd+1 = xd + 1. Then Ad(t) 2 ('td): Corollary 9 Let 'd be the positive real root of xd+1 = xd + 1.

Then log'd (n + 1) + O(1) queries are necessary and sucient for the monotone search problem of size n with delay d.

3 Proof of the main theorem

We shall prove the inequality Ad(t)  Bd(t) by giving an algorithm which correctly solves the problem on the range (0; Bd (t)). For the reverse inequality, we generalize the functions Ad(t) and Bd(t) to Ad(t;~{) and Bd(t;~{) and prove Ad(t;~{)  Bd (t;~{). First, we make an easy observation: Proposition 10 If t  d, then

Ad(t) = Bd (t) = t + 1: By de nition of Ad, if t  d, then all queries must be made before any answers are received, so all queries are non-adaptive, and in this situation we can do no better with our t queries than to query 1; 2; : : : ; t, which suce to search the interval (0; t + 1) or any translation of it. On the other hand, Bd(t) = Bd(t ? 1) + Bd(t ? d ? 1). The latter term is 1 because t ? d ? 1 < 0, so by a straightforward induction, Bd (t) = t +1. Proof:

3.1

The upper bound

Theorem 11 For all d  0, t  0, Ad(t)  Bd(t): We prove this by exhibiting an algorithm that solves the problem of size Bd(t) in t queries. The intuition behind the algorithm is as follows: each query splits the search interval unevenly, but in such a fashion that the longer subinterval 6

contains all the pending queries, so its actual diculty is the same as that of the shorter subinterval with no pending queries. Algorithm: Without loss of generality, we search the interval (0; Bd (t)). If t  d, then Bd(t) = t + 1 and we can just query f at all integer points 1; 2; : : : ; t. If t > d, we ask the rst d + 1 queries at points Bd(t ? 1), Bd (t ? 2), : : :, Bd(t ? d ? 1). Note that since Bd (t) is trivially increasing in t, these queries are made from right to left. (The algorithm would of course work equally well from left to right, but the algebra would be slightly uglier.) After receiving the answer f (Bd (t ? 1))

Suggest Documents