On the Complexity of Approximate Sum of Sorted List ∗
Bin Fu
Department of Computer Science
University of Texas-Pan American
Edinburg, TX 78539, USA
Email:
[email protected]
Abstract

We consider the complexity of computing the approximate sum a_1 + a_2 + · · · + a_n of a sorted list of numbers a_1 ≤ a_2 ≤ · · · ≤ a_n. We show an algorithm that computes a (1 + ǫ)-approximation for the sum of a sorted list of nonnegative numbers in O((1/ǫ) min(log n, log(x_max/x_min)) · (log(1/ǫ) + log log n)) time, where x_max and x_min are the largest and the least positive elements of the input list, respectively. We prove an Ω(min(log n, log(x_max/x_min))) time lower bound for every O(1)-approximation algorithm for the sum of a sorted list of nonnegative elements. We also show that there is no sublinear time approximation algorithm for the sum of a sorted list that contains at least one negative number.
1. Introduction
Computing the sum of a list of numbers is a classical problem that is often found in high school textbooks. There is a famous story about Karl Friedrich Gauss, who computed 1 + 2 + · · · + 100 by rearranging the terms into (1 + 100) + (2 + 99) + ... + (50 + 51) = 50 × 101 when he was seven years old, attending elementary school. Such a method is considered an efficient algorithm for computing a class of lists of increasing numbers. Computing the sum of a list of elements has many applications, and is ubiquitous in software design. In classical mathematics, many functions can be approximated by the sum of simple functions via Taylor expansion. This kind of approximation theory is in the core area of mathematical analysis. In this article we consider whether there is an efficient way to compute the sum of a general list of nonnegative numbers in nondecreasing order.

Let ǫ be a real number at least 0. A real number s is a (1 + ǫ)-approximation for the sum problem a_1, a_2, · · · , a_n if (Σ_{i=1}^n a_i)/(1 + ǫ) ≤ s ≤ (1 + ǫ) Σ_{i=1}^n a_i. The approximate sum problem was studied in the randomized computation model. Every O(1)-approximation algorithm with uniform random sampling requires Ω(n) time in the worst case if the list of numbers in [0, 1] is not sorted. Using O((1/ǫ²) log(1/δ)) random samples, one can compute the (1 + ǫ)-approximation for the mean, or decide if it is at most δ, for a list of numbers in [0, 1] [9]. Canetti, Even, and Goldreich [3] showed that the sample size is tight. Motwani, Panigrahy, and Xu [14] showed an O(√n) time approximation scheme for computing the sum of n nonnegative elements. There is a long history of research on the accuracy of summation of floating point numbers (for examples, see [10, 2, 1, 4, 5, 6, 8, 11, 12, 13, 15, 16]). The efforts were mainly spent on finding algorithms with small rounding errors.

We investigate the complexity of computing the approximate sum of a sorted list. When we have a large number of data items and need to compute the sum, an efficient approximation algorithm becomes important. Har-Peled developed a coreset approach for a more general problem.

∗ This research is supported in part by the NSF Early Career Award CCF 0845376. The first version is on December 2, 2011, and it is revised on 12/18/2011, 1/16/2012, and 1/21/2012.
The method used in his paper implies an O((log n)/ǫ) time approximation algorithm for the approximate sum of sorted nonnegative numbers [7]. The coreset is a subset of numbers selected from a sorted input list, and their positions depend only on the size n of the list and are independent of the numbers. The coreset of a list of n sorted nonnegative numbers has a size Ω(log n). This requires the algorithm time to be also Ω(log n) in all cases.

We show an algorithm that gives a (1 + ǫ)-approximation for the sum of a list of sorted nonnegative elements in O((1/ǫ) min(log n, log(x_max/x_min)) · (log(1/ǫ) + log log n)) time, where x_max and x_min are the largest and the least positive elements of the input list, respectively. This algorithm has a complexity comparable to Har-Peled's algorithm. Our algorithm is of sub-logarithmic complexity when x_max/x_min ≤ n^{1/(log log n)^{1+a}} for any fixed a > 0. The algorithm is based on a different method, a quadratic region search algorithm, instead of the coreset construction used in [7]. We also prove a lower bound Ω(min(log n, log(x_max/x_min))) for this problem. We first derive an O(log log n) time approximation algorithm that finds an approximate region of the list holding the items of size at least a threshold b. Our approximate sum algorithm is derived with it as a submodule. We also show an Ω(log log n) lower bound for approximate region algorithms for the sum of a sorted list with only nonnegative elements.

In Section 2, we present an algorithm that computes a (1 + ǫ)-approximation for the sum of a sorted list of nonnegative numbers in O((1/ǫ) min(log n, log(x_max/x_min)) · (log(1/ǫ) + log log n)) time, where x_max and x_min are the largest and the least positive elements of the input list, respectively. In Section 3, we present lower bounds related to the sum of a sorted list. In Section 4, we show the experimental results for the implementation of our algorithm in Section 2. This paper contains self-contained proofs for all its results.
2. Algorithm for Approximate Sum of Sorted List
In this section, we show a deterministic algorithm for the sorted elements. We first show an approximation algorithm to find an approximate region of a sorted list with elements of size at least a threshold b. A crucial part of our approximation algorithm for the sum of a sorted list is to find an approximate region with elements of size at least a threshold b. We develop a method that is much faster than binary search; it takes O(log(1/δ) + log log n) time to find the approximate region. We first apply the square function to expand the region and then use the square root function to narrow down to a region that only has a (1 + δ) factor difference with the exact region. The parameter δ determines the accuracy of approximation.

Definition 1. For i ≤ j, let |[i, j]| be the number of integers in the interval [i, j]. If both i and j are integers with i ≤ j, we have |[i, j]| = j − i + 1.

Definition 2. A list X of n numbers is represented by an array X[1, n], which has n numbers X[1], X[2], · · · , X[n]. For integers i ≤ j, let X[i, j] be the sublist that contains elements X[i], X[i + 1], · · · , X[j]. For an interval R = [i, j], denote X[R] to be X[i, j].

Definition 3. For a sorted list X[1, n] with nonnegative elements in nondecreasing order and a threshold b, the b-region is the interval [n', n] such that X[n', n] are the numbers at least b in X[1, n]. A (1 + δ)-approximation for the b-region is a region R = [s, n], which contains the last position n of X[1, n], such that at least |R|/(1 + δ) numbers in X[s, n] are at least b, and [s, n] contains every position j with X[j] ≥ b, where |R| is the number of integers i in R.
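For instance (an illustrative example of Definition 3, not taken from the paper): for the sorted list X = (1, 2, 3, 5, 8, 9) and threshold b = 5, the 5-region is [4, 6]. The region R = [3, 6] is a 1.5-approximate 5-region (δ = 0.5): |R| = 4, at least |R|/(1 + δ) ≈ 2.67 of the numbers in X[3, 6] are at least 5 (in fact three are), and R contains every position j with X[j] ≥ 5.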
2.1. Approximate Region
The approximation algorithm for finding an approximate b-region to contain the elements at least a threshold b has two loops. The first loop searches the region by increasing the parameter m via the square function. When the region is larger than the exact region, the second loop is entered. It converges to the approximate region with a factor that goes down by a square root in each cycle. Using the combination of the square and square root functions makes our algorithm much faster than binary search. In order to simplify the description of the algorithm Approximate-Region(.), we assume X[i] = −∞ for every i ≤ 0. This saves the space for boundary checking when accessing the list X. The description of the algorithm is mainly based on the consideration for its proof of correctness. For a real number a, denote ⌊a⌋ to be the largest integer at most a, and ⌈a⌉ to be the least integer at least a. For example, ⌊3.7⌋ = 3 and ⌈3.7⌉ = 4.

Algorithm Approximate-Region(X, b, δ, n)
Input: X[1, n] is a sorted list of n numbers in nondecreasing order; n is the size of X[1, n]; b is a threshold in (0, +∞); and δ is a parameter in (0, +∞).
1. if (X[n] < b), return ∅;
2. if (X[n − 1] < b), return [n, n];
3. if (X[1] ≥ b), return [1, n];
4. let m := 2;
5. while (X[n − m^2 + 1] ≥ b) {
6.     let m := m^2;
7. };
8. let i := 1;
9. let m_1 := m;
10. let r_1 := m;
11. while (m_i ≥ 1 + δ) {
12.     let m_{i+1} := √m_i;
13.     if (X[n − ⌊m_{i+1} r_i⌋ + 1] ≥ b), then let r_{i+1} := m_{i+1} r_i;
14.     else r_{i+1} := r_i;
15.     let i := i + 1;
16. };
17. return [n − ⌊m_i r_i⌋ + 1, n];
End of Algorithm

Lemma 4. Let δ be a parameter in (0, 1). Then there is an O(log(1/δ) + log log n) time algorithm such that, given an element b and a list A of n sorted elements, it finds a (1 + δ)-approximate b-region.
Proof:
After the first phase (lines 1 to 7) of the algorithm, we obtain a number m such that X[n − m + 1] ≥ b > X[n − m^2 + 1].

Note that (1 + 1/2^i) is larger than the square root of (1 + 1/2^{i−1}). We may let variable m_i go down by following the sequence {(1 + 1/2^i)}_{i=1}^∞ after m_i ≤ 2. In other words, let g(.) be an approximate square root function with g(1 + 1/2^i) = 1 + 1/2^{i+1}, used for computing the square root after m_i ≤ 2 in the algorithm. It has the property g(m) · g(m) ≥ m. The assignment m_{i+1} := √m_i can be replaced by m_{i+1} := g(m_i) in the algorithm. This simplifies the algorithm by removing the computation of the square root while the computational complexity is of the same order.
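To make the region search concrete, the following is a minimal C++ sketch of Approximate-Region. It is our illustration, not the paper's code: the function and variable names are ours, the list X[1, n] is assumed to be stored in a std::vector<double> in nondecreasing order (x[0] holds X[1]), and n may be smaller than x.size() so that only the prefix X[1, n] is searched.

#include <cmath>
#include <limits>
#include <utility>
#include <vector>

// Returns the pair (s, n) describing the region [s, n], or (0, 0) for the empty region.
std::pair<long, long> approximate_region(const std::vector<double>& x, double b,
                                         double delta, long n) {
    // X[i], with X[i] = -infinity for every i <= 0 as assumed in the paper.
    auto X = [&](long i) {
        return (i <= 0) ? -std::numeric_limits<double>::infinity() : x[i - 1];
    };
    if (X(n) < b) return {0, 0};        // line 1
    if (X(n - 1) < b) return {n, n};    // line 2
    if (X(1) >= b) return {1, n};       // line 3
    double m = 2.0;                     // line 4
    while (X(n - static_cast<long>(m * m) + 1) >= b)   // lines 5-7: grow m by squaring
        m = m * m;
    double mi = m, ri = m;              // lines 8-10
    while (mi >= 1.0 + delta) {         // lines 11-16: shrink the factor by square roots
        double m_next = std::sqrt(mi);
        if (X(n - static_cast<long>(std::floor(m_next * ri)) + 1) >= b)
            ri = m_next * ri;           // line 13
        mi = m_next;                    // lines 12 and 15
    }
    return {n - static_cast<long>(std::floor(mi * ri)) + 1, n};   // line 17
}

On a sorted vector x, approximate_region(x, b, delta, (long)x.size()) should return a region [s, n] in which at least a 1/(1 + delta) fraction of the positions hold values at least b and which contains every position holding a value at least b.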
2.2. Approximate Sum
We present an algorithm to compute the approximate sum of a list of sorted nonnegative elements. It calls the module for the approximate region, which is described in Section 2.1. The algorithm for the approximate sum of a sorted list X of n nonnegative numbers generates a series of disjoint intervals R_1 = [r_1, r_1'], · · · , R_t = [r_t, r_t'], and a series of thresholds b_1, · · · , b_t such that each R_i is a (1 + δ)-approximate b_i-region in X[1, r_i'], r_1' = n, r_{i+1}' = r_i − 1, and b_{i+1} ≤ b_i/(1 + δ), where δ = 3ǫ/4 and 1 + ǫ is the accuracy for approximation. The sum of numbers in X[R_i] is approximated by |R_i| b_i. As the list b_1 > b_2 > · · · > b_t decreases exponentially, we can show that t = O((1/ǫ) log n). The approximate sum for the input list is Σ_{i=1}^t |R_i| b_i. We give a formal description of the algorithm and its proof below.

Algorithm Approximate-Sum(X, ǫ, n)
Input: X[1, n] is a sorted list of nonnegative numbers (in nondecreasing order), n is the size of X[1, n], and ǫ is a parameter in (0, 1) for the accuracy of approximation.
1. if (X[n] = 0), return 0;
2. let δ := 3ǫ/4;
3. let r_1' := n;
4. let s := 0;
5. let i := 1;
6. let b_1 := X[n]/(1 + δ);
7. while (b_i ≥ δX[n]/(3n)) {
8.     let R_i := Approximate-Region(X, b_i, δ, r_i');
9.     let r_{i+1}' := r_i − 1 for R_i = [r_i, r_i'];
10.    let b_{i+1} := X[r_{i+1}']/(1 + δ);
11.    let s_i := |[r_i, r_i']| · b_i;
12.    let s := s + s_i;
13.    let i := i + 1;
14. };
15. return s;
End of Algorithm

Theorem 5. Let ǫ be a positive parameter. Then there is an O((1/ǫ) min(log n, log(x_max/x_min)) · (log(1/ǫ) + log log n)) time algorithm to compute a (1 + ǫ)-approximation for the sum of a sorted list of nonnegative numbers, where x_max and x_min are the largest and the least positive elements of the input list, respectively.
Proof: Assume that there are t cycles executed in the while loop of the algorithm Approximate-Sum(.). Let regions R_1, R_2, · · · , R_t be generated. In the first cycle of the loop, the algorithm finds a region R_1 = [r_1, n] of the elements of size at least X[n]/(1 + δ). In the second cycle of the loop, the algorithm finds region R_2 = [r_2, r_1 − 1] for the elements of size at least X[r_1 − 1]/(1 + δ). In the i-th cycle of the loop, it finds a region R_i = [r_i, r_{i−1} − 1] of elements of size at least X[r_{i−1} − 1]/(1 + δ). By the algorithm, we have

    j ∈ R_1 ∪ R_2 ∪ · · · ∪ R_t for every j with X[j] ≥ δX[n]/(3n).    (5)

Since each R_i is a (1 + δ)-approximation of the X[r_{i−1} − 1]/(1 + δ)-region in X[1, r_{i−1} − 1], X[R_i] contains at least |R_i|/(1 + δ) entries of size at least X[r_{i−1} − 1]/(1 + δ) in X[1, r_{i−1} − 1]. Also, R_i contains every entry of size at least X[r_{i−1} − 1]/(1 + δ) in X[1, r_{i−1} − 1]. Thus,

    s_i/(1 + δ) = (|R_i|/(1 + δ)) · (X[r_{i−1} − 1]/(1 + δ)) ≤ Σ_{j∈R_i} X[j] ≤ |R_i| · X[r_{i−1} − 1] = (1 + δ)s_i.

Thus,

    s_i/(1 + δ) ≤ Σ_{j∈R_i} X[j] ≤ (1 + δ)s_i.

We have

    (1/(1 + δ)) Σ_{j∈R_i} X[j] ≤ s_i ≤ (1 + δ) Σ_{j∈R_i} X[j].    (6)

Thus, s_i is a (1 + δ)-approximation for Σ_{j∈R_i} X[j]. We also have Σ_{X[i] < δX[n]/(3n)} X[i] < δX[n]/3 ≤ (δ/3) Σ_{i=1}^n X[i] since X[1, n] has only n numbers in total. Therefore, we have the following inequalities:

    Σ_{X[i] ≥ δX[n]/(3n)} X[i] = Σ_{i=1}^n X[i] − Σ_{X[i] < δX[n]/(3n)} X[i]    (7)
                              ≥ Σ_{i=1}^n X[i] − (δ/3) Σ_{i=1}^n X[i]    (8)
                              = (1 − δ/3) Σ_{i=1}^n X[i].    (9)
We have the inequalities:

    s = Σ_{i=1}^t s_i    (10)
      ≥ (1/(1 + δ)) Σ_{X[i] ≥ δX[n]/(3n)} X[i]    (by inequality (6))    (11)
      ≥ ((1 − δ/3)/(1 + δ)) Σ_{i=1}^n X[i]    (by inequality (9))    (12)
      = (1/((1 + δ)/(1 − δ/3))) Σ_{i=1}^n X[i]    (13)
      = (1/(1 + (4δ/3)/(1 − δ/3))) Σ_{i=1}^n X[i]    (14)
      ≥ (1/(1 + 4δ/3)) Σ_{i=1}^n X[i]    (15)
      = (1/(1 + ǫ)) Σ_{i=1}^n X[i].    (16)
As R_1, R_2, · · · are pairwise disjoint, we also have the following inequalities:

    s = Σ_{i=1}^t s_i    (17)
      ≤ Σ_{i=1}^t (1 + δ) Σ_{j∈R_i} X[j]    (by inequality (6))    (18)
      ≤ (1 + δ) Σ_{j=1}^n X[j]    (19)
      ≤ (1 + ǫ) Σ_{j=1}^n X[j].    (20)
Therefore, the output s returned by the algorithm is a (1 + ǫ)-approximation for the sum Σ_{i=1}^n X[i].

By Lemma 4, each cycle in the while loop of the algorithm takes O(log(1/δ) + log log n) time for generating R_i. For the descending chain r_1' > r_2' > · · · > r_t' with X[r_{i+1}'] ≤ X[r_i']/(1 + δ) and b_i = X[r_i']/(1 + δ) ≥ δX[n]/(3n) for each i, we have that the number of cycles t is at most O((1/δ) log n). This is because X[r_t'] ≤ x_max/(1 + δ)^t ≤ δX[n]/(3n) for some t = O((1/δ) log n). Similarly, the number of cycles t is at most O((1/δ) log(x_max/x_min)) because X[r_t'] ≤ x_max/(1 + δ)^t ≤ x_min for some t = O((1/δ) log(x_max/x_min)). Therefore, there are at most t = O((1/δ) min(log n, log(x_max/x_min))) cycles in the while loop of the algorithm. Therefore, the total time is O((1/δ) min(log n, log(x_max/x_min))(log(1/δ) + log log n)) = O((1/ǫ) min(log n, log(x_max/x_min))(log(1/ǫ) + log log n)). This proves Theorem 5.
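The following is a matching C++ sketch of Approximate-Sum, assuming the approximate_region sketch given in Section 2.1 is available in the same translation unit; again the names are ours, not the paper's own code, and floating point rounding issues are ignored.

#include <vector>

// Approximate sum of a sorted std::vector<double> of nonnegative numbers in
// nondecreasing order, with accuracy parameter eps in (0, 1).
double approximate_sum(const std::vector<double>& x, double eps) {
    long n = static_cast<long>(x.size());
    if (n == 0 || x[n - 1] == 0.0) return 0.0;            // line 1
    double delta = 3.0 * eps / 4.0;                       // line 2
    long r_prime = n;                                     // line 3: r'_1 = n
    double s = 0.0;                                       // line 4
    double b = x[n - 1] / (1.0 + delta);                  // line 6: b_1 = X[n]/(1+delta)
    while (b >= delta * x[n - 1] / (3.0 * n)) {           // line 7: run while b_i >= delta*X[n]/(3n)
        // line 8: [r, r_prime] is a (1+delta)-approximate b_i-region of X[1, r'_i]
        long r = approximate_region(x, b, delta, r_prime).first;
        s += static_cast<double>(r_prime - r + 1) * b;    // lines 11-12: s_i = |[r_i, r'_i]| * b_i
        r_prime = r - 1;                                  // line 9: r'_{i+1} = r_i - 1
        if (r_prime <= 0) break;                          // the whole list has been covered
        b = x[r_prime - 1] / (1.0 + delta);               // line 10: b_{i+1} = X[r'_{i+1}]/(1+delta)
    }
    return s;                                             // line 15
}

For example, approximate_sum(x, 0.01) should return a value s with (Σ_i x_i)/1.01 ≤ s ≤ 1.01 · Σ_i x_i.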
3. Lower Bounds
In this section, we show several lower bounds about approximation for the sum of a sorted list. The Ω(min(log n, log(x_max/x_min))) lower bound is based on the general computation model for the sum problem. The lower bound Ω(log log n) for finding an approximate b-region shows that the upper bound is optimal if one uses the method developed in Section 2. We also show that there is no sublinear time algorithm if the input list contains at least one negative element.
3.1. Lower Bound for Computing Approximate Sum
In this section, we show a lower bound for the general computation model, which almost matches the upper bound of our algorithm. This indicates that the algorithm in Section 2 can be improved by at most an O(log log n) factor. The lower bound is proved by a contradiction method. In the proof of the lower bound, two lists L_1 and L_2 are constructed. For an algorithm with o(log n) queries, the two lists will have the same answers to all queries. Thus, the approximation outputs for the two inputs L_1 and L_2 are the same. We let the gap between the sums of the two lists be large enough to make it impossible for them to share the same constant factor approximation.

Theorem 6. For every positive constant d > 1, every d-approximation algorithm for the sum of a sorted list of nonnegative numbers needs at least Ω(min(log n, log(x_max/x_min))) (adaptive) queries to the list, where x_max and x_min are the largest and the least positive elements of the input list, respectively.

Proof:
We first set up some parameters. Let

    c = (4 + δ)d^2,    (21)
    α = 3/(4 log c), and    (22)
    β = 3/4,    (23)

where δ is an arbitrary small constant in (0, 1). Let m be a positive integer. Let L_0 be a list of t numbers equal to h with h ≤ c and t · h ≤ δmc^m, where h, t, and δ will be determined later. Let list R_i contain c^{m−i} identical numbers equal to c^i for i = 1, 2, · · · , m. Let the first list L_1' = R_1 R_2 · · · R_m, which is the concatenation of R_1, R_2, · · · , and R_m. The list L_1' has n' = c^{m−1} + c^{m−2} + · · · + c + 1 = (c^m − 1)/(c − 1) numbers. We have n' < c^m as c > 2. Assume that an algorithm A(.) only makes at most βm queries to output a d-approximation for the sum of a sorted list of nonnegative numbers.

Let A(L_i) represent the computation of the algorithm A(.) with the input list L_i. During the computation, A(.) needs to query the numbers in the input list. Let L_2' = R_1' R_2' · · · R_m', where R_i' has the same length as R_i and is derived from R_i by the following two cases. Let L_i = L_0 L_i' for i = 1, 2.

• Case 1: R_k in L_1 has no element queried by the algorithm A(L_1). Let R_k' be a list of |R_k| identical numbers equal to that of R_{k+1} (note that each element of R_{k+1} is equal to c^{k+1}). Since R_k' has c^{m−k} numbers equal to c^{k+1}, the sum of numbers in R_k' is c^{m−k} · c^{k+1} = c^{m+1}.

• Case 2: R_k in L_1 has at least one element queried by the algorithm A(L_1). Let R_k' = R_k.

It is easy to verify that L_2 is still a nondecreasing list. The number of R_i s that are not queried in A(L_1) is at least (m − βm), as the number of queried elements is at most βm. Let S_1 be the sum of elements in L_1, and S_2 be the sum of elements in L_2. We have S_1 ≤ (1 + δ)mc^m, and S_2 ≥ (m − βm)c^{m+1}. The two lists L_1 and L_2 have the same result for running the algorithm. Assume that the algorithm gives an approximation s for both L_1 and L_2. We have

    s ≤ dS_1 ≤ d(1 + δ)mc^m for L_1, and    (24)
    (1/d)(m − βm)c^{m+1} ≤ S_2/d ≤ s for L_2.    (25)

By inequalities (24) and (25), we have (1/d)(m − βm)c^{m+1} ≤ d(1 + δ)mc^m. Thus, (1/d)(1 − β)c ≤ d(1 + δ). Thus, 1 − d^2(1 + δ)/c ≤ β. By equation (21), we have 1 − d^2(1 + δ)/c > 1 − 1/4 = 3/4 = β. This brings a contradiction. Thus, the algorithm cannot give a d-approximation for the sum of a sorted list with at most βm queries to the input list. The largest number of L_1 and L_2 is c^m. We can create the two cases for the lower bound.

• Case 1: log n > log(x_max/x_min). We just let L_0 contain t = n − n' 0s. We have log(x_max/x_min) = log(c^m/c) = (m − 1) log c. Since the algorithm has to make at least βm = Ω(log(x_max/x_min)) queries, we can see a lower bound of Ω(log(x_max/x_min)).

• Case 2: log n ≤ log(x_max/x_min). Let L_0 only contain one number h = δc^2/n (note t = 1). Since the algorithm has to make at least βm = Ω(log n) queries, we can see a lower bound of Ω(log n).
3.2. Lower Bound for Computing Approximate Region
We give an Ω(log log n) lower bound for the deterministic approximation scheme for a b-region in a sorted input list of nonnegative numbers. The method is that if there is an algorithm with o(log log n) queries, two sorted lists L_1 and L_2 of 0, 1 numbers are constructed. They give the same answer to each query from the algorithm, but their sums have a large difference. This lower bound shows that it is impossible to use the method of Section 2, which iteratively finds approximate regions via a top down approach, to get a better upper bound for the approximate sum problem.

Definition 7. For a sorted list X[1, n] with 0, 1 numbers in nondecreasing order, a d-approximate 1-region is a region R = [s, n], which contains the last position n of X[1, n], such that at least |R|/d numbers in X[s, n] are 1, and X[s, n] contains all the positions j with X[j] = 1, where |R| is the number of integers i in R.

Theorem 8. For any parameter d > 1, every deterministic algorithm must make at least log log n − log log(d + 1) adaptive queries to a sorted input list for the d-approximate 1-region problem.

Proof: We let each input list contain either 0 or 1 in each position. Assume that A(.) is a d-approximation algorithm for the approximate region. Let A(L_i) represent the computation of A(.) with input list L_i. We construct two lists L_1 and L_2 of length n, and make sure that A(L_1) and A(L_2) receive the same answer for each query to the input list. For the list of adaptive queries generated by the algorithm A(.), we generate a series of intervals

    [1, n] = I_0 ⊇ I_1 ⊇ · · · ⊇ I_m.    (26)

We also have a list

    [n, n] = I_0^R ⊆ I_1^R ⊆ · · · ⊆ I_m^R,    (27)

where m is the number of queries to the input list by the algorithm A(.) and each I_j^R is a subset of I_j for j = 0, 1, 2, · · · , m. Each I_j is partitioned into I_j^L ∪ I_j^R such that its right part I_j^R is for 1, and its left part I_j^L is undecided except its leftmost position. Furthermore,

    |I_j| ≥ n^{1/2^j} |I_j^R|,    (28)

and both I_j and I_j^R always contain the position n, which is the final position in the input list.

Stage 0
let I_0 := [1, n];
let I_0^R := [n, n];
let L_1[1] := L_2[1] := 0;
let L_1[n] := L_2[n] := 1;
mark every 1 < i < n as an “undecided” position (1 and n are already decided);
End of Stage 0;

It is easy to see that inequality (28) holds for Stage j = 0. For an interval [a, b], |[a, b]| is the number of integers in it as defined in Definition 1. Assume that I_j = [a_j, n] and I_j^R = [b_j, n]. We assume that inequality (28) holds for j. We also assume that both L_1[i] and L_2[i] have been decided to hold 0 for each i ≤ a_j; both L_1[i] and L_2[i] have been decided to hold 1 for each i ≥ b_j; and the other points are undecided after stage j, which processes the j-th query.

Stage j + 1 (j ≥ 0)
Assume that a position p is queried to the input list by the (j + 1)-th query (j ≥ 0) made by the algorithm A(.). We discuss several cases.
• Case 1: p ≤ a_j. Let I_{j+1} := I_j and I_{j+1}^R := I_j^R. We have

    |I_{j+1}|/|I_{j+1}^R| = |I_j|/|I_j^R| ≥ n^{1/2^j} > n^{1/2^{j+1}}.

Let the answer to the (j + 1)-th query be 0, as we already assigned L_1[p] := L_2[p] := 0 in the earlier stages by the hypothesis.

• Case 2: p > a_j and p ∈ I_j^R. Let I_{j+1} := I_j and I_{j+1}^R := I_j^R. We have

    |I_{j+1}|/|I_{j+1}^R| = |I_j|/|I_j^R| ≥ n^{1/2^j} > n^{1/2^{j+1}}.    (by the hypothesis)

Let the answer to the (j + 1)-th query be 1, as we already assigned L_1[p] := L_2[p] := 1 in the earlier stages by the hypothesis.

• Case 3: p > a_j and p ∉ I_j^R and |[p, n]|/|I_j^R| ≥ √(|I_j|/|I_j^R|). Let I_{j+1} := [p, n] and I_{j+1}^R := I_j^R. We still have

    |I_{j+1}|/|I_{j+1}^R| = |[p, n]|/|I_j^R| ≥ √(|I_j|/|I_j^R|) ≥ √(n^{1/2^j}) = n^{1/2^{j+1}}.

Let the answer to the (j + 1)-th query be 0, as the position p will hold the number 0. Let L_1[i] := L_2[i] := 0 for each undecided i ≤ p (it becomes “decided” after the assignment).

• Case 4: p > a_j and p ∉ I_j^R and |[p, n]|/|I_j^R| < √(|I_j|/|I_j^R|). Let I_{j+1} := I_j and I_{j+1}^R := [p, n]. We have the inequalities

    |I_{j+1}|/|I_{j+1}^R| = |I_j|/|[p, n]|    (29)
      = (|I_j|/|I_j^R|) / (|[p, n]|/|I_j^R|)
      > (|I_j|/|I_j^R|) / √(|I_j|/|I_j^R|)    (by the condition of this case)    (30)
      = √(|I_j|/|I_j^R|) ≥ √(n^{1/2^j}) = n^{1/2^{j+1}}.    (by the hypothesis)    (31)

Let the answer to the (j + 1)-th query be 1, as the position p will hold the number 1. Let L_1[i] := L_2[i] := 1 for each undecided i ≥ p (it becomes “decided” after the assignment).

End of Stage j + 1

Assume that there are m queries. The following final stage is executed after processing all the m queries.

Final Stage
assume that I_m = [a_m, n] and I_m^R = [b_m, n].
let L_1[i] := 0 for every undecided i < b_m, and let L_1[i] := 1 for every undecided i ≥ b_m;
let L_2[i] := 0 for every undecided i ≤ a_m, and let L_2[i] := 1 for every undecided i > a_m;
End of Final Stage
We note that the assignments to the two lists L_1 and L_2 are consistent among all stages. In other words, if L_i[j] is assigned a ∈ {0, 1} at stage k, then L_i[j] will not be assigned b ≠ a at any stage k' with k < k', because of the two chains (26) and (27) in the construction. The two deterministic computations A(L_1) and A(L_2) have the same result. We get two sorted lists L_1 and L_2 such that each position in I_m^R of L_1 is 1, every other position of L_1 is 0, each position in [a_m + 1, n] of L_2 is 1, and every other position of L_2 is 0, where I_m = [a_m, n]. On the other hand, the numbers of 1s of L_1 and L_2 are greatly different.

Let D be the approximate 1-region outputted by the algorithm for the two lists. As the algorithm gives a d-approximation for L_1, we have

    |D|/d ≤ |I_m^R|.    (32)

As D is a d-approximate 1-region for L_2, D contains every j with X[j] = 1 (see Definition 7). We have

    |I_m| − 1 ≤ |D|.    (33)

By inequalities (32) and (33), |I_m| − 1 ≤ d|I_m^R|. Therefore, (|I_m| − 1)/|I_m^R| ≤ d. Thus, |I_m|/|I_m^R| ≤ d + 1 as |I_m^R| ≥ 1. We have n^{1/2^m} ≤ d + 1. This implies m ≥ log log n − log log(d + 1).
Corollary 9. For any constant ǫ ∈ (0, 1), every deterministic O(1)-approximation algorithm for the 1-region problem must make at least (1 − ǫ) log log n adaptive queries.
3.3. Lower Bound for Sorted List with Negative Elements
We derive a theorem that shows there is no sublinear time approximation algorithm, for any approximation factor, for the sum of a list of elements that contains both positive and negative elements.

Theorem 10. Let ǫ be an arbitrary positive constant. There is no algorithm that makes at most n − 1 queries and gives a (1 + ǫ)-approximation for the sum of a list of n sorted elements that contains at least one negative element.

Proof: Consider the list of elements −m(m + 1), 2, 4, · · · , 2m. This list contains n = m + 1 elements. If there is such an algorithm that gives a (1 + ǫ)-approximation, then there is an element, say 2k, that is not queried by the algorithm. We construct another list that is identical to the last list except that 2k is replaced by 2k + 1. The sum of the first list is zero, but the sum of the second list is 1. The algorithm gives the same result, as the element 2k in the first list and the element 2k + 1 in the second list are not queried (all the other queries receive the same answers). This brings a contradiction. Similarly, in the case that −m(m + 1) is not queried, we can bring a contradiction after replacing it with −m(m + 1) + 1.
4. Implementation and Experimental Results
As computing the summation of a list of elements is widely used, testing the algorithm with a program is important. Our algorithm not only has a theoretical guarantee for its speed and accuracy, but is also simple to convert into software. We have implemented the algorithm described in Section 2. It computes the approximate sum of a sorted list of nonnegative real numbers quickly. As the algorithm is simple, it is straightforward to convert it into a C++ program, which shows satisfactory performance for both the speed and accuracy of approximation. In the experiments conducted, we set up a loop to compute the summation of n = 10^7 elements. The loop is repeated k = 100 times. The approximation algorithm is much faster than the brute force method to compute the approximate sum.
In order to avoid the memory limitation problem, we use a nondecreasing function x(.), instead of a list, from integers to double type floating point numbers. There is a function “double approximate_sum(double (*x)(int), double e, int n)”. If we let function x(i) return the i-th element of an input list, it can also handle the input of a list of numbers, and compute its approximate sum. In order to avoid the time consuming computation of the square root function, we set up a table of 30 entries to save the values of 2^{2^k} for integer k ∈ [−20, 9]. This table is enough to handle e as small as 10^{−6} without calling the library function sqrt(.) to compute the square root, and n as large as 2^{2^9}. When the number n of numbers of the input is fixed to be 10^7, the speed of the software depends on the accuracy 1 + e. We let x(i) = i during the experiments. For parameter e = 0.1, 0.01, 0.001, and 0.0001, our algorithm for the approximate sum is much faster than the brute force method, which computes the exact sum. Our algorithm may be slower than the brute force method when e is very small (for example e = 0.00001). This is reasonable from the analysis of the algorithm, as the complexity is inversely proportional to e, and the algorithm Approximate-Sum(.) generates a lot of regions R_i with only one position.
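As a small illustration of the table idea described above (our sketch, not the paper's code): in Approximate-Region the factor m only takes values of the form 2^(2^k) for an integer k, so the square root of a table value is just the neighboring table entry and sqrt(.) never needs to be called in the main loop.

#include <cmath>
#include <cstdio>

// pow2pow2[k + 20] holds 2^(2^k) for k = -20, ..., 9; 30 entries in total,
// which is enough for e as small as about 10^-6 and n as large as 2^(2^9).
double pow2pow2[30];

void init_pow2pow2() {
    for (int k = -20; k <= 9; k++)
        pow2pow2[k + 20] = std::pow(2.0, std::pow(2.0, static_cast<double>(k)));
}

int main() {
    init_pow2pow2();
    // The square root of pow2pow2[k + 20] is pow2pow2[k + 19]; for example,
    // sqrt(2^(2^1)) = sqrt(4) = 2 = 2^(2^0).
    std::printf("%f %f\n", pow2pow2[21], pow2pow2[20]);   // prints 4.000000 2.000000
    return 0;
}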
5. Conclusions and Open Problems
We studied the approximate sum of a sorted list with nonnegative elements. For a fixed ǫ, there is a log log n factor gap between the upper bound of our algorithm and our lower bound. An interesting problem for further research is to close this gap. Another interesting problem is the computational complexity of the approximate sum in the randomized computational model, which is not discussed in this paper.
6. Acknowledgements
We would like to thank Cynthia Fu for her proofreading and comments for an earlier version of this paper.
References

[1] I. J. Anderson. A distillation algorithm for floating-point summation. SIAM J. Sci. Comput., 20:1797–1806, 1999.

[2] J. E. Bresenham. Algorithm for computer control of a digital plotter. IBM Systems Journal, 4(1):25–30, 1965.

[3] R. Canetti, G. Even, and O. Goldreich. Lower bounds for sampling algorithms for estimating the average. Information Processing Letters, 53:17–25, 1995.

[4] J. Demmel and Y. Hida. Accurate and efficient floating point summation. SIAM J. Sci. Comput., 25:1214–1248, 2003.

[5] T. O. Espelid. On floating-point summation. SIAM Rev., 37:603–607, 1995.

[6] J. Gregory. A comparison of floating point summation methods. Commun. ACM, 15:838, 1972.

[7] S. Har-Peled. Coresets for discrete integration and clustering. In Proceedings of the 26th International Conference on Foundations of Software Technology and Theoretical Computer Science, pages 33–44, 2006.

[8] N. J. Higham. The accuracy of floating point summation. SIAM J. Sci. Comput., 14:783–799, 1993.
[9] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58:13–30, 1963.

[10] W. Kahan. Further remarks on reducing truncation errors. Communications of the ACM, 8(1):40, 1965.

[11] D. E. Knuth. The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, 3rd ed. Addison-Wesley, Reading, MA, 1998.

[12] P. Linz. Accurate floating-point summation. Commun. ACM, 13:361–362, 1970.

[13] M. A. Malcolm. On accurate floating-point summation. Commun. ACM, 14:731–736, 1971.

[14] R. Motwani, R. Panigrahy, and Y. Xu. Estimating sum by weighted sampling. In Proceedings of the 34th International Colloquium on Automata, Languages and Programming, pages 53–64, 2007.

[15] D. M. Priest. On Properties of Floating Point Arithmetics: Numerical Stability and the Cost of Accurate Computations. PhD thesis, Mathematics Department, University of California, Berkeley, CA, 1992.

[16] Y. K. Zhu, J. H. Yong, and G. Q. Zheng. A new distillation algorithm for floating-point summation. SIAM Journal on Scientific Computing, 26:2066–2078, 2005.