On the Complexity of Real Root Isolation arXiv

0 downloads 0 Views 488KB Size Report
Nov 1, 2010 - Due to its similarities to the Descartes method, we also consider it .... In [20], we introduced a version of the Descartes method that runs on a ...
On the Complexity of Real Root Isolation Michael Sagraloff [email protected] Max-Planck-Institut f¨ur Informatik, Saarbr¨ucken, Germany

arXiv:1011.0344v1 [cs.DS] 1 Nov 2010

Abstract We introduce a new method to isolate the real roots of a square-free polynomial F = ∑ni=0 Ai xi with real coefficients Ai , where |An | ≥ 1 and |Ai | ≤ 2τ for all i. It is assumed that each coefficient of F can be approximated to any specified error bound. The presented method is exact, complete and deterministic. Due to its similarities to the Descartes method, we also consider it practical and easy to implement. Compared to previous approaches, our new method achieves a significantly better bit complexity. In particular, we show that the hardness of isolating the real roots of F is essentially determined by the geometry of the roots and not by the complexity of the coefficients. More precisely, our new algorithm ˜ demands for O(n(Σ(F) + n log Γ + τ)(Σ(F) + n log Γ)) bit operations, where σi denotes the separation of the i-th (complex) root of F, Σ(F) := ∑ni=1 log σi−1 , and Γ constitutes a bound on the modulus of all ˜ 3 τ 2 ) which roots. For polynomials with integer coefficients, the bound on the bit complexity writes as O(n improves the best bounds known for existing practical algorithms by a factor of n. The crucial idea of the proposed approach is to run an approximate version of the Descartes method. ˜ We show that it suffices to approximate F by a polynomial F˜ to O(Σ(F) + n log Γ) bits after the binary ˜ point in order to ensure that the roots of F and F are almost at the same location. The latter result implies that, for isolating the roots of F, it suffices to approximate each polynomial obtained in the intermediate ˜ steps of the Descartes method to O(Σ(F) + n log Γ) bits. For integer polynomials, this means that the ˜ intermediate results have to be approximated to only O(nτ) bits whereas the existing exact methods have to consider corresponding polynomials with up to O(n2 τ) bits.

1

Introduction

Throughout the paper, n

F(x) := ∑ Ai xi ∈ R[x]

(1.1)

i=0

denotes a square-free polynomial of degree n ≥ 2 with real coefficients Ai , where |An | ≥ 1 and |Ai | ≤ 2τ for all i. It is assumed that each coefficient Ai can be approximated to any specified precision. The roots of F are denoted by ξ1 , . . . , ξn ∈ C. Furthermore, let Γ ≤ dΓCB e be an integer bound on the modulus of all ξi , where ΓCB := 1 + maxi=0,...,n |Ai |/|An | ≤ 2τ+1 constitutes the Cauchy Bound [32] for the roots of F. The separation σ (ξi , F) of a root ξi is defined as the minimal distance of ξi to any distinct root ξ j 6= ξi , the separation σ (F) of F is defined as the minimum of all σ (ξi , f ), and Σ(F) := ∑ni=1 log σ (ξi , F)−1 .

1.1

Main result

We present a deterministic and practical algorithm which computes isolating intervals I1 , . . . , Im for the real roots of F and amounts for (1.2)

˜ ˜ O(n(Σ(F) + n log Γ + τ)(Σ(F) + n log Γ)) = O(n(Σ(F) + nτ)2 ) 1

I0 =(-1/2,1/2) I2 =(0,1/2)

I1 =(-1/2,0)

(1/4,1/2)

(0,1/4)

(0,1/8)

(1/8,1/4) (1/4,3/8)

(3/8,1/2)

Figure 1.1: The above figure shows the recursion tree induced by the Descartes method when applied to the polynomial f (x) := √

process, we√ have to compute 16 2x2 − 8x + π/4 (with roots at ≈ 0.06 and ≈ 0.29). For each interval I = (a, b) in the subdivision √ √ fI (x) = f (a + (b − a)x). For instance, for I = (1/4, 1/2), we have fI (x) = f (1/4 + x/4) = 2x2 + (2 2 − 2)x + 2 + π/8. In the modified version from [20], we initially start with an approximation f˜ of f to a certain number of bits; e.g., f˜(x) = (11585/512)x2 − 8x + (201/512) approximates f to 10 bits after the binary point. Then, the Descartes method is applied to f˜, that is, for each interval I = (a, b) as above, we have to compute f˜I (x) = f˜(a + (b − a)x); e.g., f˜(1/4 + x/4) = (11585/8192)x2 + (3393/4096)x − (1583/8192). Given that f˜ is a sufficiently good approximation of f , it is shown that the roots of f can be isolated in this way. Our new approach follows a similar strategy, that is, we start with an approximation (g fI0 ) of fI0 to a certain number L of bits. Then, for each interval I, we recursively compute approximations (g fI ) of fI to LI bits (LI is updated in each step). In contrast to the previous method, the polynomials e fI do not necessarily correspond to a specific initial approximation f˜ of f . We illustrate this by means of the above example: We start with (g fI0 )(x) = (11585/512)x2 − (31363/1024) + (5145/512) which approximates fI0 (x) = √ 2 √ √ f (−1/2 + x) = 16 2x − 16 2x + 4 − 8x + 4 + π/8 + 4 2 to L = 10 bits. Then, (g fI0 )(1/2 + x/2) are evaluated and fI0 )(x/2) and (g the result is rounded to 9 bits after the binary point. The resulting polynomials are then approximations of fI1 (x) = f (−1/2 + x/2) and fI2 (x) = f (x/2) to 8 bits, respectively (cf. Lemma 1). In the following bisection steps, we proceed in exactly the same manner. For instance, for the interval I = (1/4, 1/2), we obtain (g fI )(x) = (181/128)x2 + (53/64)x − (25/128) which approximates fI to LI = 6 bits after the binary point.

bit operations 1 . Our algorithm demands for approximations of the coefficients of F to O(Σ(F) + n log Γ) ˜ 3 τ 2 ). bits after the binary point. For a polynomial F with integer coefficients, the bound in (1.2) writes as O(n Instead of isolating the roots of F, we consider the equivalent task of isolating the roots of a polynomial f which is defined as follows: For F as in (1.1), let F ∗ (x) = ∑i=0,...,n A∗i xi := F(x)/An be the corresponding monic polynomial with the same roots and n

f (x) = ∑ ai xi := F ∗ (2Γx)

(1.3)

i=0

the ”scaled polynomial” with roots z1 , . . . , zn . Each root zi = ξi /(2Γ) of f is contained within the disc ∆1/2 (0) of radius 1/2 centered at the origin, and the separations of corresponding roots of F and f scale by a factor 2Γ, that is, σ (ξi , F) = 2Γ · σ (zi , f ). The modulus of the coefficients of f is bounded by (2Γ)n · 2τ ≤ 2n(τ+3)+τ and Σ( f ) = ∑ni=1 log σ (zi , f )−1 = Σ(F) + n log(2Γ) = O(n log Γ + Σ(F)).

1.2

Main contributions

The crucial idea of the presented method is to consider an “approximate version” of the Descartes method applied to the polynomial f as defined in (1.3). More precisely, instead of computing the exact intermediate results obtained in the subdivision process, we only consider approximations to a certain number of bits. Whereas other methods [6, 17, 21, 26, 12] proceed in a similar way by using interval polynomials, our new method considers a specific approximation in each step and updates the possible approximation error. In [20], we introduced a version of the Descartes method that runs on a certain approximation f˜ of f . 1O ˜

indicates that we omit poly-logarithmic factors

2

However, in the latter version, all intermediate results corresponding to the approximation f˜ are computed exactly, whereas our new method only considers approximations in each node of the recursion tree which do not necessarily correspond to an initial approximation at the root of the tree (cf. Figure 1.1). The presented algorithm is deterministic and exact evaluation of f at specific points is never required. Our results show that, for the complexity of isolating the real roots of a real polynomial, it does not make a difference whether the given polynomial has arbitrary real, integer or rational coefficients. In fact, the hardness of isolating the roots of f (or F, respectively) is crucially determined by the quantities Γ and Σ(F) which only depend on the location of the roots of F. For integer polynomials, the bound in (1.2) on the worst case bit complex˜ 3 τ 2 ) improving the best bounds known for other practical methods such as the Descartes ity writes as O(n method [2, 7, 13, 21, 26], Sturm’s method [8, 19] or the continued fraction method [1, 28, 29, 30] by a factor of n. To the best of our knowledge, this is the first time where it is shown that approximation leads to a better worst case complexity for real root isolation, a fact which has already been observed in experiments [16]. How is it possible that an approximate version of the Descartes method is more efficient than the original “exact version”? For a moment, we only concentrate on the case where f has integer (or dyadic) coefficients (cf. Figure 1.1). The complexity analysis (cf. Section 2.4 for a more comprehensive treatment) of the Descartes method shows that, for each interval (node) I = (a, b) in the recursion tree, the dominating costs are those for the computation of the local Taylor expansion fI (x) := f (a + (b − a)x). In each bisection step, the polynomials fIl and fIr (corresponding to the left and the right subinterval of I) are recursively computed from fI by replacing x by x/2, followed by a Taylor shift by 1, that is, x 7→ x + 1. More precisely, we have fIl (x) = fI (x/2) and fIr (x) = fIl (x + 1). In each bisection step, the bitsize of the coefficients of fI increases ˜ ˜ 2 τ) bits. by n bits, and since the recursion tree has depth O(nτ), the representation of fI eventually needs O(n 3 ˜ τ) bit operAssuming asymptotically fast Taylor shift [31, 15], the computation of each fI amounts for O(n ˜ ˜ ations. In the paper, we will show that, for an arbitrary approximation f of f to Ω(nτ) bits after the binary point, corresponding roots of f and f˜ are almost at the same location with respect to their separations (cf. ˜ Theorem 2 for a more precise result). Starting with a polynomial f˜ that coincides with f to Θ(nτ) bits after the binary point, we can incrementally obtain approximations (g fI ) of fI of comparable approximation qual˜ ity, that is, (g fI ) and fI coincide in the first Θ(nτ) bits after the binary point. The latter is due to the following g fact: The polynomials ( fI ) of fI can be recursively computed such that the approximation error quadruples ˜ at most in each bisection step (cf. Theorem 12). Since the height of the recursion tree is bounded by O(nτ), ˜ it follows that each approximation (g fI ) coincides with fI in Θ(nτ) bits after the binary point. Eventually, 2 g ˜ ˜ τ) bits for the exact counterpart fI ) and, all polynomials ( fI ) are represented by O(nτ) bits (instead of O(n 2 ˜ thus, the costs at each node decrease by a factor n to O(n τ). We will prove the above result for the more general setting where F is an arbitrary polynomial with real coefficients. More precisely, we will show that it suffices to approximate each fI to O(Σ(F) + n) bits after the binary point. Then, each (g fI ) is represented by O(Σ(F) + τ + n log Γ) bits and, as a consequence, the costs ˜ at each node are bounded by O(n(Σ(F) + τ + n log Γ)) bit operations. The additional factor Σ(F) + n log Γ in our result (1.2) on the bit complexity is due to the size of the induced recursion tree.

1.3

Outline

In Section 2, we first introduce some basic notations. Furthermore, we derive a bound on how good f has to be approximated such that its roots stay at almost the same place with respect to the corresponding separations. Eventually, we briefly repeat the Descartes method and the corresponding V CA algorithm [7] before presenting a slight modification of it in Section 3. Section 4 and Section 5 are central for this paper. Therein, we present our new algorithm and the results of the complexity analysis. We conclude in Section 6. Parts of our complexity analysis as well as pseudo-code for our new method is outsourced to the Appendix.

3

2

Basics

2.1

Some Notations

For an interval I = (a, b), w(I) := b − a denotes the width of I and mI := (a + b)/2 the center of I. Furthermore,     w(I) w(I) w(I) w(I) I + = (a+ , b+ ) := a − ,b+ and I ∗ = (a∗ , b∗ ) := a − ,b+ 4n 4n 2n 2n denote extensions of I by w(I)/(4n) and w(I)/(2n) (to both sides), respectively. For a point m ∈ C and an r ∈ R+ , ∆r (m) ⊂ C constitutes the open disc of radius r centered at m. For a polynomial g(x) := ∑ni=0 gi xi ∈ C[x] with complex coefficients and a non-negative real number µ ∈ R+ 0 , we define ( ) n

[g]µ :=

g˜ = ∑ g˜i xi ∈ C[x] : |gi − g˜i | ≤ µ for all i i=0

the family of all µ-approximations of g. Given a µ ≥ 2−L , there exists a binary fraction g˜i = mi /2L with mi ∈ Z and |gi − g˜i | ≤ µ, e.g., g˜i = sign(gi )b|gi 2L |c2−L . If we consider the coefficients of g as binary numbers with potentially infinite binary places after the binary point, then g˜ is obtained by keeping the first L digits after the binary point of each coefficient of g. We call a polynomial g˜ ∈ [g]2−L obtained in this way an L-binary approximation of g. Since the coefficients of modulus less than 2−L are approximated by zero, g˜ might have lower degree than g. 12256 10 1 9 Example. For g(x) := 65589 x − 2x2 + 243 x − 16 , the polynomial g˜1 := 3 2 binary approximation and g˜2 := −2x − 4 a 2-binary approximation of g.

11 10 9 2 64 x − 2x − 16

constitutes a 6-

We further remark that, for an approximation of f (as defined in (1.3)) to L bits after the binary point, we have to approximate F to n log(2Γ) + L = O(nτ + L) bits after the binary point because of the scaling operation x 7→ 2Γx, where Γ ≤ 2τ+3 .

2.2

Approximate Taylor Shifts

For arbitrary values m ∈ C and λ ∈ R\{0}, we define f[m,λ ] (x) := f (m + λ x). The following lemma provides error bounds on how the absolute approximation error µ of a polynomial ˜f ∈ [ f ]µ scales under the transformation x 7→ m + λ x: Lemma 1. Let µ ∈ R+ 0 and g˜ ∈ [g]µ be a µ-approximation of a polynomial g ∈ C[x] of degree n. Then, (i) g˜[1/2,1/2] ∈ [g[1/2,1/2] ]2µ , (ii) g˜[−1/(4n),1+1/(2n)] ∈ [g[−1/(4n),1+1/(2n)] ]4µ , (iii) g˜[−1/2,1] ∈ [g[−1/2,1] ]2n µ , and (iv) g˜[1,1] ∈ [g[1,1] ]2n µ .

4

Proof. We define h(x) = µn xn + . . . + µ1 x + µ0 := (g − g)(x). ˜ Then, the absolute value of each coefficient µi is bounded by µ. For arbitrary m ∈ C and λ ∈ R\{0}, we consider the following computation:     i k k i−k i k i−k k i (2.1) h(m + λ x) = ∑ µi (m + λ x) = ∑ µi ∑ x λ m = ∑ x ∑ µi m λ k k i=0,...,n i=0,...,n k=0,...,i k=0,...,n i=k,...,n Thus, for |m| < 1, the absolute value of the coefficient of xk is bounded by     i k+i 1 (2.2) , µ|λ |k · ∑ |m|i−k = µ|λ |k · ∑ |m|i = µ|λ |k · k+1 (1 − |m|) k k i≥0 i≥k where we used −(k+1)

(1 − |m|)

      −(k + 1) k+i k+i i i i =∑ (−1) |m| = ∑ |m| = ∑ |m|i . i i k i≥0 i≥0 i≥0

For m = λ = 1/2, it follows that all coefficients of h are bounded by 2µ. This shows (i). (iii) also follows directly from (2.2). For m = −1/(4n) and λ = 1 + 1/(2n), (2.2) implies that g˜[−1/(4n),1+1/(2n)] ∈ [g[−1/(4n),1+1/(2n)] ]

 n , µ 78 · 1+1/(2n) 1−1/(4n)

√ · e ≤ 4. Hence, (ii) follows. (iv) follows from the computation in (2.1) since     n i+k i n ≤ ∑n−k = ∑n−k each µi is then (m = λ = 1) bounded by µ · ∑ni=k ki = ∑ni=k i−k i=0 i ≤ 2 · µ. i=0 i

where

2.3

8 7

·



1+1/(2n) 1−1/(4n)

n



83 73

Sufficiently Good Approximation of Polynomials

In the next step, we derive a bound on how good f has to be approximated by an f˜ such that, for all i, the distance of corresponding roots zi and z˜i of f and f˜ is small with respect to the separation σ (zi , f ). We introduce the following definition: Definition 1. Let t ≥ 1 be an arbitrary real value and f a polynomial as defined in (1.3). We define σ (zi , f ) f 0 (zi ) 1 µ( f ,t) := · min (2.3) t i=1,...,n 3n(n + 1) We call an L ∈ N sufficiently large 2 with respect to f if (2.4)

L ≥ L f := d− log µ( f , 64n2 )e = O(Σ( f ) + log n − log |an |) = O(Σ(F) + log n).

The upper bound for L f in 2.4 follows from σ (zi , f ) · | f 0 (zi )| = σ (zi , f ) · |an | ∏ |zi − z j | ≥ σ (zi , f ) · |an | ∏ σ (z j , f ) = |an |2−Σ( f ) . j6=i

j6=i

and Σ( f ) − log |an | = Σ(F) + n log(2Γ) − log(2Γ)n = Σ(F). The following theorem gives an answer to our question raised above: 2 This

definition is motivated by our results in Theorem 2 and Section 5.1

5

x

|f(x)|>(n+1)m(f,t) z

zj

s (z i ,f)/(tn) zi Di

Figure 2.1: The roots of f are all contained in the disc with radius 1/2 centered around the origin. Then, for each point z on the boundary of a disc ∆i = ∆σ (zi , f )/tn (zi ), it holds that | f (z)| > σ (zi , f )| f 0 (zi )|/(3tn). For an arbitrary point x ∈ C that is not contained in any ∆i , we have | f (x)| > (n + 1)µ( f ,t).

Theorem 2. Let f be a polynomial as in (1.3), µ ≤ µ( f ,t) and f˜ ∈ [ f ]µ an arbitrary µ-approximation of f . Furthermore, for i = 1, . . . , n, we denote ∆i := ∆σ (zi , f )/(tn) (zi ) the disc of radius σ (zi , f )/(tn) centered at the root zi of f . Then, (i) each root zi of f differs by less than σ (zi , f )/(tn) from a corresponding counterpart z˜i of f˜. (ii) For each z ∈ C\

Sn

i=1 ∆i ,

it holds that | f (z)| > (n + 1)µ( f ,t).

Proof. Since all roots of f are contained within ∆1/2 (0), it follows that σ (zi , f ) < 1 for all i and, thus, each disc ∆i is completely contained within the unit disc. For an arbitrary point z ∈ ∂ ∆i on the boundary of ∆i , we have the following estimate on | f (z)|: σ (zi , f ) · |an | · ∏ |z − z j | tn j=1,...,n j=1,...,n, j6=i ! ! z−zj σ (zi , f ) · |an | = ∏ ∏ |zi − z j | tn j=1,...,n, j6=i zi − z j j=1,...,n, j6=i z − z j σ (zi , f )| f 0 (zi )| |zi − z j | − |z − zi | σ (zi , f ) 0 ≥ | f (zi )| ∏ = ∏ tn tn |zi − z j | j=1,...,n, j6=i zi − z j j=1,...,n, j6=i   σ (zi , f )| f 0 (zi )| 1 n−1 σ (zi , f )| f 0 (zi )| ≥ 1− > ≥ (n + 1)µ( f ,t) tn tn 3tn

| f (z)| = |an |

(2.5)



|z − z j | =

The second to last inequality follows from the fact that, for arbitrary chosen t ≥ 1, the function h(x) := (1 − tx1 )x−1 is monotonously decreasing for all x ≥ 1 and limx→∞ h(x) = e−1/t > 1/3. We can now apply Rouch´e’s Theorem to the discs ∆i and the functions f and f˜: Since µ ≤ µ( f ,t), the inequality (2.5) implies that n |( f − f˜)(z)| ≤ ∑ µ|z|i ≤ (n + 1)µ ≤ (n + 1)µ( f ,t) < | f (z)| i=0

for all z ∈ ∂ ∆i and, thus, f and f˜ have the same number of roots, namely one, within each ∆i . Hence, for each root zi , there exists a corresponding root z˜i ∈ ∆i of f˜ which proves (i). For (ii), we remark that f is S a holomorphic function on C\ ni=1 ∆i and, thus, | f (z)| becomes minimal for a point z on the boundary of one of the discs ∆i . According to (2.5), it follows that | f (z)| > σ (zi , f )| f 0 (zi )|/(3tn) ≥ (n + 1)µ( f ,t) which proves (ii). From the last theorem it follows that, for given f as in (1.3), it suffices to approximate the coefficients of f to only L = O(Σ( f ) + log n − log |an |) = O(Σ(F) + log n) bits after the binary point such that each 6

approximation f˜ ∈ [ f ]2−L has its roots at almost the same location as f (with respect to the corresponding separations). Corollary 3. Let f be a polynomial as defined in (1.3) and L ∈ N be sufficiently large with respect to f , that is, L ≥ L f = O(Σ(F) + log n). Then, each root zi of f moves by at most σ (zi , f )/(64n3 ) when passing from f to an arbitrary approximation f˜ ∈ [ f ]2−L . In particular, real roots of f stay real and non-real roots stay non-real. Furthermore, for any z ∈ C with |z − zi | ≥ σ (zi , f )/(64n3 ) for all i, it holds that | f (z)| > (n + 1)2−L f .

2.4

The Descartes Method

We first resume some basic facts about the Descartes method for isolating the real roots of a polynomial. For a polynomial f (x) = ∑ni=0 ai xn ∈ R[x], Descartes’ Rule of Signs states that the number var( f ) of sign changes in the coefficient sequence of f , that is, the number of pairs (i, j) with i < j, ai a j < 0, and ai+1 = . . . = a j−1 = 0, is no smaller than and of the same parity as the number of positive real roots of f . If var( f ) = 0, then f has no positive real root, and if var( f ) = 1, f has exactly one positive real root. The rule easily extends to an arbitrary open interval I = (a, b) via a suitable coordinate transformation: The mapping x 7→ a + (b − a)x maps (0, 1) bijectively onto I, that is, the roots of f in I exactly correspond to those of (2.6)

fI (x) := f (a + (b − a)x)

in (0, 1). Hence, the composition of x 7→ a + (b − a)x and x 7→ 1/(1 + x) constitutes a bijective map from (0, ∞) to I. It follows that the positive real roots of     1 ax + b t n n fI (x) := (1 + x) fI = (1 + x) · f x+1 x+1 correspond bijectively to the real roots of f in I. The factor (1+x)n in the definition of fIt clears denominators and guarantees that fIt is a polynomial. We define var( f , I) as var( fIt ). Based on Descartes’ Rule of Sign, Vincent, Collins and Akritas introduced a bisection algorithm for isolating the roots of a real polynomial f in an open interval I0 (after an appropriate transformation we can assume that I0 = (−1/2, 1/2)). We refer the reader to [5, 10, 2, 3, 4, 7] for extensive treatments and references. V CA: The algorithm requires that the real roots of f in I0 are simple, otherwise it diverges. In each step, a set A of active intervals is maintained. Initially, A contains I0 , and the algorithm stops as soon as A is empty. In each iteration, some interval I ∈ A is processed; If var( f , I) = 0, then I contains no root of f and we discard I. If var( f , I) = 1, then I contains exactly one root of f and hence is an isolating interval for it. We add I to a list O of isolating intervals. If there is more than one sign change, we divide I at its midpoint mI and add the subintervals to the set of active intervals. If mI is a root of f , we add the trivial interval [mI , mI ] to the list of isolating intervals. Correctness of the algorithm is obvious. Termination and complexity analysis of the VCA algorithm rest on the following theorem: Theorem 4 ([22, 25]). Consider a real polynomial f (x), an interval I = (a, b), and let v = var( f , I). (i) (One-Circle Theorem) If the open disc bounded by the circle centered at mI and passing through the endpoints of I contains no root of f (x), then v = 0. 7

(ii) (Two-Circle Theorem) If the union of the open discs bounded by the two circles centered at mI ± √ i(1/(2 3))w(I) and passing through the endpoints of I contains exactly one root of f (x), then v = 1. Proofs of the one- and two-circle theorems can be found in [22, 23, 24, 2, 25, 18, 10]. Theorem 4 implies that no interval I of length σ ( f ) or less is split. Such an interval, recall that it is open, cannot contain two real roots and its two-circle region cannot contain any nonreal root. Thus, var( f , I) ≤ 1 by Theorem 4. We conclude that the depth of the recursion tree is bounded by 1/σ ( f ). Furthermore, it holds (see [11, Corollary 2.27] for a simple self-contained proof): Theorem 5. Let I be an interval and I1 and I2 be two disjoint subintervals of I. Then, var( f , I1 ) + var( f , I2 ) ≤ var( f , I). According to the above theorem, there cannot be more than n/2 intervals I with var( f , I) ≥ 2 at any level of the recursion. Therefore, the size of the recursion tree T is bounded by n log(1/σ ( f )). For integer ˜ 2 τ). However, a more refined argumentation [11] shows that polynomials, the latter bounds writes as O(n ˜ |T | is even bounded by O(nτ). The computation of fIt at each node of the tree is costly. It is better to store with every interval I = (a, b) the polynomial fI (x) = f (a + x(b − a)). If I is split at its midpoint mI into Il = (a, mI ) and Ir = (mI , b), the polynomials associated with the subintervals are fIl (x) = fI (x/2) and fIr (x) = fI ((1 + x)/2) = fIl (1 + x). Also, fIt (x) = (1 + x)n fI (1/(1 + x)). If the coefficients of f are integers (or dyadic fractions) of bitsize τ, then the coefficients grow by n bits in every bisection step. Thus, for a node I of depth h, the bitsize τh of the coefficients of fI is given by τh = τ + nh. Using asymptotically fast Taylor shift (see [31, 15]), the number ˜ of bit operations needed to compute fIl , fIr and fIt from fI is bounded by O(n(nh + τ)). Since the depth of ˜ ˜ 2 τ) and the costs at each node the recursion tree is bounded by O(nτ), each fI has coefficients of bitsize O(n ˜ 3 τ). Eventually, the total costs for the V CA algorithm are given by O(n ˜ 3 τ) · O(nτ) ˜ ˜ 4 τ 2 ). are O(n = O(n

3

A Modified Descartes Method

The theoretical description of the Descartes method allows to isolate the real roots of a polynomial f as defined in (1.3). However, this approach assumes that, for each node I of the recursion tree, we can compute the number var( f , I) = var( fIt ) of sign variations for the polynomial fIt . Since the coefficients of f are arbitrary real numbers, this computation is hard in general. To overcome this issue, we aim to apply our predicates to approximations of fI and fIt instead. In Section 5.1, we will show that, at least for sufficiently good approximations, this approach is feasible. However, it does not directly apply to the Descartes method but to a slight modification of it. For our modified version of the Descartes algorithm, we aim to replace the inclusion predicate var( f , I) = 1 by a predicate used in the Bolzano method. Section 3.1 resumes some useful results which are adopted from our studies on the Bolzano method [27] whereas, in Section 3.2, our modified version is formulated.

3.1

f

The TK (m, r)-Test: Existence of Roots

For m ∈ C and positive real values K and r, we introduce the test TKf (m, r)

:

tKf (m, r)

f (k) (m) k := | f (m)| − K ∑ r > 0. k! k≥1

We summarize the following useful properties for TKf (m, r): • If TKf (m, r) holds, then TKf0 (m, r) holds for all K 0 ≤ K. 8

f

• For arbitrary values m, r and λ 6= 0, the test TKf (m, r) is equivalent to TK [m,λ ] (0, r/λ ) because of f

tK[m,λ ] (0, r/λ ) = tKf (m, r). In particular, for an interval I = (a, b) and fI (x) = f (a + (b − a)x), the test TKfI (0, r) is equivalent to TKf (a, rw(I)) (cf. (2.6) for the definition of fI ). • For λ ∈ R+ , tKf (m, r) = tKλ f (m, r)/λ and, thus, TKf (m, r) is equivalent to TKλ f (m, r). We remark that ( f 0) ( f )0 the latter implies the equivalence of TK I (m, r) and TK I (m, r) since ( fI )0 = ( f (a + (b − a)x))0 = (b − a) f 0 (a + (b − a)x) = (b − a)( f 0 )I . The TKf (m, r)-test serves as an exclusion predicate but might also give a guarantee that a certain disc contains at most one root: Lemma 6. Consider a disk ∆ = ∆r (m) ⊂ C : (i) If TKf (m, r) holds for some K ≥ 1, then the closure ∆ of ∆ contains no root of f and     1 1 1− | f (m)| < | f (z)| < 1 + | f (m)| K K for all z ∈ ∆. 0

(ii) If TKf (m, r) holds for a K ≥

√ 2, then ∆ contains at most one root of f .

Proof. (i) follows from a straight forward computation: For each z ∈ ∆, we have f (z) = f (m + (z − m)) = f (m) + ∑ k≥1

f (k) (m) (z − m)k , k!

and, thus,   | f (z)| 1 | f (k) (m)| 1 ≤ 1+ ·∑ |z − m|k < 1 + | f (m)| | f (m)| k≥1 k! K since |z − m| ≤ r and TKf (m, r) holds. The left inequality in (i) follows in complete analogous manner. In particular, for K ≥ 1, the left inequality implies | f (z)| > 0 and, thus, f has no root in ∆. √ 0 If TKf (m, r) holds for a K ≥ 2, then, for any point z ∈ ∆, the derivative f 0 (z) differs from f 0 (m) by a complex number of absolute value less than | f 0 (m)|/K. Consider the triangle spanned by the points 0, f 0 (m) and f 0 (z), and let α and β denote the angles at the points 0 and f 0 (z), respectively. From the Sine Theorem, it follows that | sin α| = (| f 0 (m) − f 0 (z)| · | sin γ|)/| f 0 (m)| < 1/K. Thus, the arguments of f 0 (m) and f 0 (z) differ √ by less than arcsin(1/K) which is smaller than or equal to π/4 for K ≥ 2. Assume that there exist two 0 roots a, b ∈ ∆ of f . Since a = b implies f 0 (a) = 0, which is not possible as T1f (m, r) holds, we can assume that a 6= b. We split f into its real and imaginary part, that is, we consider f (x + iy) = u(x, y) + iv(x, y) where u, v : R2 → R are two bivariate polynomials. Then, f (a) = f (b) = 0 and so u(a) = v(a) = u(b) = v(b) = 0. But u(a) = u(b) = 0 implies, due to the Mean Value Theorem in several real variables, that there exists a φ ∈ [a, b] such that ∇u(φ ) ⊥ (b − a). Similarly, v(a) = v(b) = 0 implies that there exists a ξ ∈ [a, b] such that ∇v(ξ ) ⊥ (b − a). But ∇v(ξ ) = (vx (ξ ), vy (ξ )) = (−uy (ξ ), ux (ξ )) and, thus, it follows that ∇u(ξ ) k (b − a). Therefore, ∇u(ψ) and ∇u(ξ ) must be perpendicular. Since f 0 = ux + ivx = ux − iuy , the arguments of f 0 (ψ) and f 0 (ξ ) must differ by π/2. This contradicts our above result that both differ from the argument of f 0 (m) by less than π/4, thus, (ii) follows. 0

f The T3/2 (m, r)-test now easily applies as an inclusion predicate:

9

f0

I (0, r) holds for an r ≥ 1. Then, I contains a root ξ Corollary 7. Let I = (a, b) be an interval such that T3/2 of f exactly if f (a) · f (b) < 0. In the latter case, the disc ∆rw(I) (a) is isolating for ξ .

f0

0

f I (a, r(b − a)) holds as well. It follows that the disc ∆rw(I) (a) and, thus, I (0, r) holds, then T3/2 Proof. If T3/2 0 contains no root of the derivative f . Now, since f is monotone on I, it suffices to check whether there is a sign change at the endpoints of I. Namely, there exists a root ξ of f within I exactly if f (a) f (b) < 0. In the latter case, ∆rw(I) (a) is isolating for ξ due to Lemma 6. f0

I (0, r)-test in combination with sign evaluation is an efficient inclusion In order to show that the T3/2 predicate, we give lower bounds on r in terms of σ ( f ) such that the predicate succeeds under guarantee.

Lemma 8. For an arbitrary disc ∆ = ∆r (m) ⊂ C, an interval I = (a, b) and I + = (a − w(I)/(4n), b + w(I)/(4n)), it holds that: (i) If r ≤

σ( f ) , 4n2

0

f f (m, r) holds. then T3/2 (m, r) or T3/2

(ii) If ∆ contains a root zi of f and r
σ (ξ , f )/(4n2 ) and, thus, there exists an additional root ξ ∗ of f with |ξ ∗ − ξ | < 8n2 w(I). Since |ξ − a| < 2w(I), it follows that ∆10n2 w(I) (a) contains at least two roots of f . For (iv), a similar argumentation applies: If var( f 0 , I) 6= 0, then ∆w(I)/2 (mI ) contains a root ξ 0 of f 0 . fI If, in addition, T3/2 (0, 1) does not hold, then, due to our above considerations, the disc ∆2nw(I) (a) contains a root ξ of f . Hence, we have |ξ − ξ 0 | < 2nw(I) + w(I) which implies σ (ξ , f ) < n(2n + 1)w(I). Since |a − ξ | < 2nw(I), it follows that ∆n(2n+3)w(I) (a) ⊂ ∆4n2 w(I) (a) contains at least two roots of f . 10

3.2

D CM: A Modified Descartes Algorithm f0

I (0, r)We aim to modify the Descartes method by replacing the inclusion predicate var( f , I) = 1 by the T3/2 test in combination with sign evaluation at the endpoints of an interval I (cf. Corollary 7). More precisely, we proceed as follows (cf. Algorithm 1 in the Appendix for pseudo-code):

D CM: Our modified Descartes algorithm D CM (short for “Descartes modified”) maintains a list A of active nodes and a list O of isolating intervals. We initially set A := {(I0 , fI0 )}, where I0 := (−1/2, 1/2), and O = 0. / For each active node (I, fI ) ∈ A , we remove (I, fI ) from A and compute the number vI + := var( f , I + ) = var( fIt+ ) of sign variations for f on the extended interval I + = (a+ , b+ ) = (a − w(I)/(4n), b + w(I)/(4n)). We remark that fI + (x) = fI (−1/(4n) + (1 + 1/(2n))x) and fIt+ (x) = (1 + x)n fI + (1/(1 + x)). If vI + = 0, we f0

I (0, 2). If it fails, then I is subdivided into Il = (a, mI ) and discard (I, fI ). For vI + ≥ 1, we consider the test T3/2 Ir = (mI , b) and we add (Il , fIl ) = (Il , fI (x/2)) and (Ir , fIr ) = (Ir , fIl (x + 1)) to A . Otherwise, we evaluate the sign s of f (a+ ) · f (b+ ) = fI + (0) · fI + (1). If s < 0 and I + is disjoint from any other interval in O, we add I + to O. If s ≥ 0 or I intersects any other interval in O, we discard (I, fI ). The algorithm terminates when A becomes empty.

Theorem 9. D CM terminates for a polynomial f as defined in (1.3) and returns a list O = {I1 , . . . , Im } of disjoint isolating intervals for all real roots of f . f0

I Proof. An interval I = (a, b) is not further subdivided by D CM if var( f , I + ) = 0 or T3/2 (0, 2) holds. If + var( f , I ) 6= 0, then ∆2w(I) (a) contains a root of f due to Theorem 4. If, in addition, 2w(I) ≤ σ ( f )/(4n2 ),

f0

I then Lemma 8 (ii) guarantees that T3/2 (0, 2) holds. Hence, it follows that D CM never splits an interval 2 I of width w(I) ≤ σ ( f )/(8n ) and, thus, termination of D CM is guaranteed. From our construction and Corollary 7, it is clear that each interval in O is isolating for a real root of f and that all intervals in O are pairwise disjoint. It remains to show that, for each real root ξ of f , there exists a corresponding isolating interval in O. Let I = (a, b) be an interval which has been discarded and whose closure I contains ξ . Since fI0 vI + > 0, I was not discarded in the first step of D CM. Hence, T3/2 (0, 2) holds and, thus, f must be monotone + + + + on I . Since I contains the root ξ , we must have f (a ) f (b ) < 0. It follows that I + intersects an interval J + = (c+ , d + ) ∈ O which had been added to O before I was proceeded. Let J = (c, d) be the corresponding smaller interval for J + . Since the (w(I)/(4n))-neighborhood of I intersects the (w(J)/(4n))-neighborhood of J, the preceding Lemma 10 shows that either ∆2w(I) (a) or ∆2w(J) (c) contains both intervals I + and J + .

f0

f0

J I Since T3/2 (0, 2) and T3/2 (0, 2) hold, each of these discs contains at most one root and, therefore, J + ∈ O must already be an isolating interval for ξ .

Lemma 10.

3

Let I = (a, b) and J = (c, d) be two intervals (not necessarily of equal length) of the form   1 1 − + i2−h , − + (i + 1)2−h , 2 2

where h ∈ N and i ∈ {0, . . . , 2h − 1}. If the (w(I)/(2n))-neighborhood Uw(I)/(2n) (I) of I intersects the (w(J)/(2n))-neighborhood Uw(J)/(2n) (J) of J, then at least one of the two discs ∆2w(I) (a) and ∆2w(J) (c) contains the two intervals I ++ := (a − w(I), b + w(I)) and J ++ := (c − w(J), d + w(J)). 3 Lemma

10 proves a slightly stronger result than necessary for the proof of Theorem 9. The stronger result applies in the proof of Theorem 15 in Section 5.2.

11

I a

d mI

b

mJ

c

J

d

D 2w(J) (c)

Figure 3.1: Wlog., we can assume that w(J) ≥ w(I). w(I), w(J) and the distance δ between I and J differ by a power of 2. For δ = 0, the disc ∆ := ∆2w(J) (c) certainly contains I ++ and J ++ . If δ 6= 0, then w(J) ≥ 2w(I) and w(J) ≥ 4δ , hence I ++ , J ++ ⊂ ∆. Proof. Wlog., we can assume that w(J) ≥ w(I) and, thus, w(J) = 2l w(I) with an l ∈ N0 . Let δ denote the distance between I and J. If δ = 0, then ∆2w(J) (c) contains I ++ and J ++ . If δ 6= 0, then δ = 2k w(I) with a k ∈ N0 . Since Uw(I)/(2n) (I) ∩ Uw(J)/(2n) (J) 6= 0, / we must have w(J)/(2n) > δ /2. In particular, we have k−1 w(J)/4 > δ /2 = 2 w(I) and, thus, w(J)/4 ≥ 2k w(I) = δ (here, we use that w(I) and w(J) differ by a multiple of 2). Since 2w(J) = w(J) + w(J)/2 + w(J)/2 ≥ w(J) + 2w(I) + 2δ , it follows that I ++ and J ++ are both contained in ∆2w(J) (c). Theorem 11. For a polynomial f as in (1.3), D CM induces a subdivision tree TD CM of height hTDCM = O(log(1/σ ( f )) + log n) and size |TD CM | = O(Σ( f ) + n log n). Proof. The result on the height of TD CM follows directly from the proof of Theorem 9. Namely, there we have shown that D CM never subdivides an interval of width ≤ σ ( f )/(8n2 ). For the bound on |TD CM |, we use a similar argumentation as in [14] and [20]. For a root z of f and a certain depth h ∈ N0 we say that I = (−1/2 + i2−h , −1/2 + (i + 1)2−h ), i = {0, . . . , 2h − 1}, is a canonical interval for z if the real part of z is contained in [−1/2 + i2−h , −1/2 + (i + 1)2−h ) and σ (z, f ) < n2 25−h . We denote Tc the canonical tree which consists of all canonical intervals. We remark that, for a canonical interval I, the parent interval of I is canonical as well. The following considerations will show that |TD CM | = O(|Tc |) and Tc = O(Σ( f ) + log n). For the size of the canonical tree, consider a leaf I ∈ Tc and let zI be a root of f corresponding to this leaf. If there are several, then zI is the root with minimal separation. Then, σ (zI , f ) < n2 25−h and, thus, h ≤ 2 log n + 5 + log(1/σ (zI , f )). Since any root of f is associated with at most one leaf of the canonical tree, we conclude |Tc | = O(n log n + Σ( f )). It remains to show that |TD CM | = O(Tc ). Consider the following mapping of internal nodes (intervals) of TD CM to canonical nodes (intervals) in Tc : Let I be a non-terminal fI0 (with respect to D CM) interval of width w(I) = 2−h . Then, var( f , I + ) 6= 0 and T3/2 (0, 2) does not succeed. According to Theorem 4 and Lemma 8 (iii), the disc ∆(1+1/2n)w(I) (mI ) contains a root z of f with σ (z, f ) < 20n2 w(I) < n2 25−h . Hence, one of the intervals I1 = (a − (b − a), a), I or I2 = (b, b + (b − a)) is canonical for z. We map I to the corresponding interval. This defines a mapping from the internal nodes of T to the nodes of the canonical tree Tc . Furthermore, each node in the canonical tree has at most three preimages in T and, thus, the number of internal nodes of T is bounded by O(n log n + Σ( f )). Since T is a binary tree, the bound on the number of internal nodes applies to the whole tree T as well. We remark that, similar to the Descartes method, D CM applies to arbitrary square-free polynomials with real coefficients. However, it also still assumes exact computation of the number var( f , I + ) of sign variations and exact evaluation of f at specific points. In the following sections, we show that these assumptions can 12

be relaxed. Namely, instead of computing the polynomials fI exactly, it suffices to consider sufficiently good approximations of them.

4

Approximate Subdivision Trees

Let f ∈ R[x] be an arbitrary polynomial and T a finite tree with the following properties 4 : • Each node of T consists of an interval I = (a, b) and a corresponding polynomial fI = f (a + (b − a)x). • The root (unique node of depth 0) of T is defined as v0,0 := (I0 , fI0 ) = ((−1/2, 1/2), f (−1/2 + x)) and • each node of a certain depth h ≥ 0 is of the form vh,i := (Ih,i , fIh,i ),  where Ih,i := − 12 + i2−h , − 21 + (i + 1)2−h and i ∈ {0, . . . , 2h − 1}. Starting with the root v0,0 , T can be recursively computed as follows: Let (I, fI ) ∈ T be a node of depth h, that is, I = (a, b) is an interval of length 2−h . For Il = (a, mI ), we have fIl (x) = fI (x/2) and, for Ir = (mI , b), we have fIr (x) = fI ((x + 1)/2). Thus, the computation of a node (I, fI ) amounts for either substituting x by x/2 or by (x + 1)/2 in the polynomial fI stored in the parent node (I, fI ). If we start with a polynomial ˜ f whose coefficients are dyadic numbers of bitsize τ0 , then T can be computed with O(n|T |(τ0 + nhT )) bit operations, where hT denotes the height of T (see also the considerations on the bit complexity of the Descartes method in Section 2.4). Now, instead of computing T exactly, we aim to compute an approximation T L of T with the following properties: • Each node (I, f˜I , LI ) of T L consists of an interval I = (a, b) of width w(I) = 2−h , an integer LI ≥ L − 2h ≥ 0 (indicating a maximal approximation error of size 2−LI ) and an LI -binary approximation f˜I of fI , that is, f˜I ∈ [ fI ]2−LI . • There exists a node (I, f˜I , LI ) ∈ T L exactly iff (I, fI ) ∈ T (“T L is an approximation of T ”) The following lemma gives a bound on the bit complexity for the computation of a T L with the above properties. Theorem 12. Let L be an arbitrary positive integer and f := ∑ni=0 ai xi a polynomial of degree n such that |ai | < 2τ0 for all i. Then, the computation of an approximate tree T L as defined above amounts for ˜ O(n|T |(n + τ0 + L)) bit operations. Proof. Consider the following recursive method to compute T L : 1. Approximate f to L + n + 1 bits after the binary point, that is, compute an (L + n + 1)-binary approximation f˜ ∈ [ f ]2−L−n−1 of f . 2. Evaluate f˜(−1/2 + x) and compute an (L + 1)-binary approximation f˜I0 of f˜(−1/2 + x). Due to Lemma 1 (iii), we have f˜(−1/2 + x) ∈ [ f (−1/2 + x)]2−L−1 and, thus, f˜I0 ∈ [ fI0 ]2−L . Store (I0 , f˜I0 , L0 ) as the root of T L . 4 The

reader might think of T as the recursion tree TD CM induced by D CM.

13

3. Let (I, f˜I , LI ) ∈ T L be a node with I = (a, b). For the left subinterval Il = (a, mI ), let f˜Il be an LI binary approximation of f˜I (x/2) and set LIl := LI − 1. Since f˜I ∈ [ fI ]2−LI , it follows that f˜I (x/2) ∈ [ fI (x/2)]2−LI = [ fIl ]2−LI and, thus, f˜I ∈ [ fIl ]2−LIl . For the right subinterval Ir = (mI , b), let f˜Ir be an (LI − 1)-binary approximation of f˜I ((x + 1)/2) and set LIr := LI − 2. According to Lemma 1 (i), we have f˜I ((x + 1)/2) ∈ [ fI ((x + 1)/2)]2−LI +1 = [ fIr ]2−LI +1 and, thus, f˜Ir ∈ [ fIr ]2−LIr . We add (Il , f˜Il , LIl ) and (Ir , f˜Ir , LIr ) to T iff (Il , fIl ) and (Il , fIl ) are in T , respectively. From our construction it follows that, in each bisection step, the number LI of exact bits after the binary point drops by at most 2 and, thus, we have LI ≥ L+2 log w(I). It remains to prove the bound on the bit complexity. The costs at each node are dominated by the computation of f˜I (x/2) and f˜I ((x + 1)/2). The absolute values of the coefficients of each polynomial fI are bounded by 2τ0 +n because the substitution x 7→ a + (b − a)x does not increase the modulus of the coefficients of f by more than a factor of 2n if a, b ∈ [−1/2, 1/2]. fI and f˜I coincide in the digits before the binary point and, thus, the coefficients of f˜I are represented by O(L + n + τ0 ) ˜ bits. It follows that the costs for computing f˜I (x/2) and f˜I ((x + 1)/2) are bounded by O(n(L + n + τ0 )) bit ˜ operations. Eventually, this implies the bound O(n|T |(L + n + τ0 )) for the total costs of computing T L . For a polynomial f as in (1.3), the roots of an L-binary approximation f˜ of f are almost at the same location as those of f if L ≥ L f = O(Σ(F) + log n) (cf. Corollary 3). Since the recursion tree TD CM induced by D CM has height hTDCM ≤ − log σ ( f )/(16n2 ), it suffices to choose an L ≥ L f + 2hTDCM = O(Σ(F) + log(1/σ ( f )) + log n) such that each of the polynomials f˜I stored in the approximate tree TDLCM constitutes an L f -binary approximation of fI . This observation gives reason to the assumption that, for isolating the roots of f , it should suffice to consider a TDLCM for an L ≥ L f + 2hTDCM .

5

Algorithm

5.1

D CML : An Approximate Version of D CM

In this section, we formulate a version of D CM which only considers approximations f˜I of the polynomials fI in each recursion step. The algorithm is driven by an initial precision L ∈ N explaining our denotation D CML (cf. Algorithm 2 in the Appendix for pseudo-code). The algorithm is formulated in a way such that, for arbitrary precision L, D CML induces a subtree of the recursion tree of the exact counterpart D CM when applied to f (cf. Theorem 14 for a proof): D CML : In a first step, we choose an (L + n + 1)-binary approximation f˜ ∈ [ f ]2−L−n−1 of f . We evaluate f˜(−1/2 + x) and compute an (L + 1)-binary approximation f˜I0 ∈ f˜(−1/2 + x) of the resulting polynomial. Then, due to Lemma 1, we have f˜I0 ∈ [ fI0 ]2−L , where I0 = (−1/2, 1/2). D CML maintains a list A of active nodes and a list O of isolating intervals. We initially start with A := {(I0 , f˜I0 , L)} and O := 0. / For each active node (I, f˜I , LI ) ∈ A with I = (a, b), we proceed as follows: 1. (I, f˜I , LI ) is removed from A . If LI < 0, we stop and return ”insufficient precision”. 2. If LI ≥ 0, we compute the polynomials       n 1 1 1 i n ˜ ˜ ˜ ˜ ˜ fI + (x) := fI − + 1 + (5.1) x and h(x) = ∑ hi x := (1 + x) fI + . 4n 2n 1+x i=0 A simple computation (cf. the subsequent Lemma 13 (i)) shows that each coefficient h˜ i approximates the corresponding exact coefficient hi of ∑ni=0 hi xi := fIt+ (x) = (x + 1)n fI + (1/(1 + x)) to an error of size less than 2n+2−LI . 14

3. If h˜ i > −2n+2−LI for all i or h˜ i < 2n+2−LI for all i, we discard (I, f˜I , LI ). The motivation underlying this approach is that, in the given situation, it is possible that var( f , I + ) = 0 holds. ( f˜ )0

4. If there exist h˜ i and h˜ j with h˜ i ≤ −2n+2−LI and h˜ j ≥ 2n+2−LI , we consider the test T3/2I (0, 2), that is, ( f˜ )0

( f )0

( f˜ )0

we evaluate t3/2I (0, 2). Due to Lemma 13 (i), we have |t3/2I (0, 2) − t3/2I (0, 2)| < n2n+1−LI and, thus, ( f˜ )0

( f )0

T3/2I (0, 2) might hold if t3/2I (0, 2) > −n2n+1−LI . We distinguish the following two cases: ( f˜ )0

• t3/2I (0, 2) > −n2n+1−LI : We compute the polynomial fˆI (x) := f˜I (x) + n2n+1−LI · x

(5.2) 0

( fˆ ) for which T3/2I (0, 2) holds. In particular, it follows that fˆI is monotone on (−2, 2). We evaluate

(5.3) λ − := fˆI (−1/(4n)) = f˜I + (0) − 2n−1−LI ,

λ + := fˆI (1 + 1/(4n)) = f˜I + (1) + (4n + 1)2n−1−LI ,

λ := fˆI (−1/n) = f˜I (−1/n) − 2n+1−LI

(5.4)

and check whether the following inequalities are fulfilled: λ − · λ + < 0,

(5.5) (5.6)

|λ − | > 2n−LI ,

(5.7)

|λ + | > 2n+3−LI n, and ˆ

|λ | > 2deg fI +n+7−LI n2 .

(5.8)

If any of the above inequalities does not hold, we discard (I, f˜I , LI ). If all inequalities hold, then I + contains a root ξ of f and the (w(I)/n)−neighborhood of I is isolating for ξ (cf. Lemma 13 (ii)). In the latter case, the interval I ∗ = (a∗ , b∗ ) = (a−w(I)/(2n), b+w(I)/(2n)) is also isolating for ξ , and we add the tuple (I ∗ , f˜I , LI ) to O if I ∗ intersects no interval J ∗ stored in any tuple (J ∗ , f˜J , LJ ) ∈ O. Otherwise, we discard (I, f˜I , LI ). ( f˜ )0

• t3/2I (0, 2) ≤ −n2n+1−LI : I is subdivided into Il := (a, mI ) and Ir := (mI , b). We add (Il , f˜Il , LI − 1) and (Ir , f˜Ir , LI − 2) to A , where f˜Il is an LI -binary approximation of f˜I (x/2) and f˜Ir an (LI − 1)binary approximation of f˜I ((x + 1)/2) (cf. the proof of Theorem 12 for more details). D CML terminates when LI < 0 for a node (I, f˜I , LI ) ∈ A or when A becomes empty. In the first case, D CML returns ”insufficient precision”. In the second case, a list O of isolating intervals I ∗ (together with corresponding polynomials fI and error bounds LI ) is returned. Lemma 13. Let f be a polynomial as in (1.3), I = (a, b) an interval produced by D CML and h˜ as defined in (5.1). Then, 0

0

(f ) ( f˜ ) ˜ (i) h(x) ∈ [ fIt+ ]2n+2−LI and |t3/2I (0, 2) − t3/2I (0, 2)| < n · 2n+1−LI . 0

( f˜ ) (ii) Suppose that t3/2I (0, 2) > −n2n+1−LI and let fˆI , λ − , λ + and λ be defined as in (5.2)-(5.4). Then, + f (a ) − λ − < 2n−LI , | f (a − w(I)/n) − λ | < 2n+2−LI and f (b+ ) − λ + < n2n+3−LI .

If the inequalities (5.5)-(5.8) hold, then I + contains a real root ξ of f and the (w(I)/n)-neighborhood of I is isolating for ξ . 15

U:=U1/n ((0,1))

roots of ^f I

0 -2

-1

g

-1/n -1/4n

z

1 1+1/4n 1+1/n

2

D 2 (0)

Figure 5.1: If λ − · λ + = fˆI (−1/4n) · fˆI (1 + 1/4n) < 0, then there exists a root γ ∈ (−1/4n, 1 + 1/4n) of fˆI . Furthermore, ∆2 (0) contains at most one root of fˆI . A computation shows that (5.8) implies that | fˆI (z)| > | fI (z) − fˆI (z)| for all z on the boundary of the (1/n)-neighborhood U of (0, 1). Due to Rouch´e’s Theorem, it follows that U is isolating for a root of fI .

(iii) For any tuple (I ∗ , f˜I , LI ) ∈ O, the endpoints of I ∗ are located outside the discs ∆i := ∆σ (zi , f )/(64n3 ) (zi ), where i = 1, . . . , n. Proof. Since f˜I ∈ [ fI ]2−LI , we have f˜I + ∈ [ fI + ]2−LI +2 (cf. Lemma 1 (ii)). Reversing the coefficients and replacing x by x + 1 increases the error by a factor of at most 2n (cf. Lemma 1 (iii)). Thus, we obtain h˜ ∈ [ fIt+ ]2−LI +2+n . For the second part of (i), we consider the following simple computation: ( f )0

( f˜ )0

|t3/2I (0, 2) − t3/2I (0, 2)| ≤

n−1 3 3 · n · 2−LI ∑ 2i = · n · 2−LI (2n − 1) < n2n+1−LI , 2 2 i=0

where the first inequality uses ( f˜I )0 ∈ [( fI )0 ]n·2−LI . For (ii), we obtain n+1 + f (a ) − λ − = fI (−1/(4n)) − fˆI (−1/(4n)) ≤ n2n+1−LI /(4n) + 2−LI 1 − (1/(4n) < 2n−LI 1 − 1/(4n)

and, in complete analogy, | f (a − w(I)/n) − λ | < 2n+2−LI . For the right endpoint b+ of I + , we have + f (b ) − λ + = fI (1 + 1/(4n)) − fˆI (1 + 1/(4n)) ≤ n2n+2−LI + (n + 1)2−LI (1 + 1/(4n))n < n2n+3−LI . If the inequalities (5.5)-(5.7) are fulfilled, then sign f (a+ ) = signλ − , sign f (b+ ) = signλ + and f (a+ ) · f (b+ ) < 0. Hence, it follows that f has a real root in I + . We have to show that the inequality (5.8) implies the ( f˜ )0

( fˆ )0

uniqueness of this root. From t3/2I (0, 2) > −n2n+1−LI , it follows that T3/2I (0, 2) succeeds and, thus, ∆2 (0) contains at most one root of fˆI . Since λ − = fˆI (−1/(4n)) and λ + = fˆI (1 + 1/(4n)) have different signs, the interval (−1/(4n), 1 + 1/(4n)) contains a root γ of fˆI . We consider the (1/n)-neighborhood U ⊂ C of (0, 1) and an arbitrary point z on its boundary. It holds that | − 1/n − γ|/|z − γ| < (1 + 5/(4n))/(1/(4n)) = 4n + 5 and, for any root γ˜ 6= γ of fˆI , we have ˜ ˜ | − 1/n − z| + |z − γ| 1 + 2/n 1 + 1/(2n) | − 1/n − γ| ≤ ≤ 1+ =2 . ˜ ˜ |z − γ| |z − γ| 1 − 1/n 1 − 1/n Hence, it follows that λ fˆI (−1/n) | − 1/n − γ| ˜ | − 1/n − γ| ∏ fˆ (z) = fˆ (z) = |z − γ| ˜ |z − γ| I I ˜ γ6˜=γ: fˆI (γ)=0  ˆ    ˆ √ 1 deg fI −1 1 − deg fI +1 ˆ ˆ deg fˆI −1 < (4n + 5)2 1+ 1− < (4n + 5)2deg fI −1 · e · e < n2deg fI +4 2n n 16

ˆ and, thus, | fˆI (z)| > |λ | · 2− deg fI −4 /n. Since |z| ≤ 1 + 1/n, we have n

| fI (z) − fˆI (z)| ≤ n2n+1−LI · |z| + ∑ n2n+2−LI |z|i < n2n+2−LI + (n + 1)2−LI (1 + 1/n)n < n2n+3−LI . i=0

Then, according to Rouch´e’s Theorem, (5.8) implies that fI has exactly one root within U. It remains to prove (iii): Let I ∗ = (a∗ , b∗ ) = (a − w(I)/2n), b + w(I)/2n) be an interval corresponding to a tuple (I ∗ , f˜I , LI ) ∈ O. Then, due to (ii), I + contains a root ξ = zi0 of f and the (w(I)/n)-neighborhood of I is isolating for this root. From the definition of I ∗ , it follows that |a∗ − zi | > w(I)/(4n) for all i. If there exists an i 6= i0 with a∗ ∈ ∆i , then w(I) < 4n|a∗ − zi | < σ (zi , f )/(16n2 ). Thus, we obtain |ξ − zi | ≤ |ξ − a∗ | + |a∗ − zi | < (1 + 3/(4n))w(I) + σ (zi , f )/(64n3 ) < (1 + 3/(4n))σ (zi , f )/(16n2 ) + σ (zi , f )/(64n3 ) < σ (zi , f ), a contradiction. It remains to show that a∗ ∈ / ∆i0 . If a∗ ∈ ∆i0 = ∆σ (ξ , f )/(64n3 ) (ξ ), then w(I) < σ (ξ , f )/(16n2 ). ( f )0

( f˜ )0

According to Lemma 8 (ii), T3/2J (0, 2) already holds for a parent node J of I and, thus, t3/2J (0, 2) > n2n+1−LJ according to (i). This contradicts the fact that J is not terminal. Hence, the distance between a∗ and zi0 is larger than σ (ξ , f )/(64n3 ). In completely analogous manner, one shows that b∗ is also not contained in any ∆i . Eventually, from our construction of O, it follows that, for any interval I ∗ from O, its both endpoints are located outside all discs ∆i . We remark that D CML outputs a list of isolating intervals for f . However, there is no guarantee that each real root of f is captured. We will emphasize on this issue in Section 5.3 where we introduce a postcertification method to check whether D CML has returned isolating intervals for all real roots. We close this section with a result on the size of the subdivision tree induced by D CML and the bit complexity of D CML : Theorem 14. Let f be a polynomial as in (1.3) and L ∈ N an arbitrary positive integer. Then, the subdivision tree TD CML induced by D CML is a subtree of the tree TD CM induced by D CM and, thus, |TD CML | ≤ |TD CM | = O(Σ( f ) + n log n). ˜ Furthermore, D CML amounts for O(n(Σ( f ) + log n)(n log Γ + τ + L)) bit operations. Proof. D CML never splits an interval I which is not split by D CM when applied to the exact polynomial ( f )0 f . Namely, if I is terminal for D CM, then either t3/2I (0, 2) > 0 or var( f , I + ) = var( fIt+ ) = 0. In the first 0

( f˜ ) case, we must have t3/2I (0, 2) > −n2n+1−LI whereas, in the second case, all coefficients h˜ i of h(x) = (1 + x)n f˜I + (1/(1 + x)) are either larger than −2n+2−LI or smaller than 2n+2−LI (cf. Lemma 13 (i)). Thus, I is terminal for D CML as well. It follows that D CML induces a recursion tree TD CML which is contained in the recursion tree TD CM induced by D CM. Then, the result on the tree size follows from Theorem 11. For the bit complexity, we consider the costs at each iteration: For a node (I, f˜I , LI ) ∈ A , the costs ˜ ˜ for computing h(x), f˜Il and f˜Ir (x) are bounded by O(n(n log Γ + τ + L)) bit operations. The latter is due to the fact that each f˜I is a polynomial whose coefficients are binary fractions with O(n log Γ + LI + τ) = O(n log Γ + τ + L) bits and we perform a Taylor shift by an O(log n)-bit number. The costs for evaluating ( f )0 ˜ t3/2I (0, 2), λ − , λ + and λ are also bounded by O(n(n log Γ + τ + L)) bit operations. Namely, these are the costs for evaluating a polynomial of bitsize O(n log Γ + L + τ) at an O(log n)-bit number. We further remark that, in each iteration, O contains disjoint isolating intervals for the real roots of f and, thus, |O| ≤ n. Hence, the endpoints of an interval I ∗ have to be compared with those of at most n intervals stored in O. It follows ˜ that the total costs in each iteration are bounded by O(n(n log Γ + τ + L)) bit operations. The bound on the total costs then follows from our result on the size of the recursion tree.

17

5.2

Known L f and σ ( f )

From Corollary 3 we already know that, for L ≥ L f , each root zi of f moves by at most σ (zi , f )/(64n3 ) when passing from f to an arbitrary approximation f˜ ∈ [ f ]2−L . Hence, we expect that it should be possible to isolate the roots of f by only considering approximations of f (and the intermediate results fI ) to L f bits after the binary point. The following theorem proves that O(Σ( f ) + n) bits suffice. Theorem 15. Let f be a polynomial as in (1.3) and L an integer with   1 L ≥ Lmax := L + 3 log (5.9) + 16n = O(Σ( f ) + n). f f σ( f ) Then, D CML returns isolating intervals for all real roots of f . Proof. Due to Theorem 11 and 14, the height hD CML of TD CML is bounded by hD CML ≤ log

16n2 1 1 = log + 2 log n + 4 ≤ log + 4n. σ( f ) σ( f ) σ( f )

Then, for any interval I = (a, b) produced by D CML , we have  LI ≥ L + 2 log w(I) ≥ L − 2hD CML ≥ Lmin := L f + log f

(5.10)

 1 + 8n > 0. σ( f )

The latter inequality guarantees that D CML does not return “insufficient precision”. Now let I = (a, b) be an interval whose closure I contains a root ξ = zi0 of f . We will show the following facts: 1. I is not discarded in Step 3 of the algorithm D CML . ( f˜ )0

2. If t3/2I (0, 2) > −n2n+1−LI , then all inequalities (5.5)-(5.8) are fulfilled. 3. In the latter case, either I ∗ = (a−w(I)/(2n), b+w(I)/(2n)) is added to O or I ∗ only intersects intervals J ∗ from O which are already isolating for ξ . If (1)-(3) hold, then D CML outputs isolating intervals for all real roots of f . Namely, D CML starts with an interval I0 = (−1/2, 1/2) containing all real roots of f and, thus, for each root ξ of f , we eventually obtain ( f˜ )0 an interval I whose closure I contains ξ and t3/2I (0, 2) > −n2n+1−LI . Then, either the isolating interval I ∗ is added to the output list or O already contains such an isolating interval for ξ . For the proof of (1), we have already shown that w(I) > σ (ξ , f )/(16n2 ). Corollary 3 then ensures that an arbitrary g ∈ [ f ]2−L f has a root ξ 0 ∈ I + . Namely, the root ξ ∈ I stays real and moves by at most ˜ σ (ξ , f )/(64n3 ) < w(I)/(4n) when passing from f to g. Now let us assume that all coefficients h˜ i of h(x) = n n+2−L n+2−L I I ˜ ˜ (1 + x) fI + (1/(1 + x)) are larger than −2 . Since |hi − hi | < −2 (cf. Lemma 13 (i)) for all coefficients hi of fIt+ = ∑ni=0 hi xi , it follows that hi > −2n+3−LI for all i. For g(x) := f (x) + 2n+3−LI ∈ [ f ]2−L f , we have gtI + (x) = fIt+ (x)+2n+3−LI (x+1)n and, thus, gtI + has only positive coefficients. In the case where h˜ i < 2n+2−LI for all i, we consider g(x) := f (x) − 2n+3−LI ∈ [ f ]2−L f and, thus, gtI + has only negative coefficients. Hence, in both cases, there exists a g ∈ [ f ]2−L f which has no root in I + , a contradiction. It follows that I is not discarded in Step 3.

18

( f )0

( f˜ )0

For (2), suppose that t3/2I (0, 2) > −n2n+1−LI . Due to Lemma 13 (i), we have t3/2I (0, 2) > −n2n+2−LI and since log(n2n+2−LI )/w(I)) ≤ 6 + 3 log n + n − LI < −L f , it follows that g(x) := f (x) + x ·

n2n+2−LI w(I) (g )0

( f )0

constitutes a 2−L f -approximation of f . Hence, g has a root ξ 0 in I + and since t3/2I (0, 2) = t3/2I (0, 2) + n2n+2−LI > 0, the disc ∆2w(I) (a) is isolating for ξ 0 . The following argumentation shows that ∆3w(I)/2 (a) is isolating for the root ξ of f : Assume that ∆3w(I)/2 (a) contains an additional root z j 6= ξ of f . Then, σ (ξ , f ) < 3w(I) and, thus, ξ and z j would move by at most 3w(I)/(64n3 ) < w(I)/2 when passing from f to g. It follows that g would have at least two roots within ∆2w(I) (a), a contradiction. Since ∆3w(I)/2 (a) is isolating for ξ ∈ I, we have σ (ξ , f )/(16n2 ) < w(I) < 2σ (ξ , f ). The left inequality implies that the distance of ξ to any of the points a+ = a − w(I)/(4n), b+ = b + w(I)/(4n) and c := a − w(I)/n is larger than or equal to w(I)/(4n) > σ (ξ , f )/(64n3 ). Let di := |zi − a| denote the distance between a root zi 6= ξ and the disc ∆3w(I)/2 (a). Then, σ (zi , f ) |zi − ξ | di + 3w(I) ≤ ≤ < di + w(I)/4. 64n3 64n3 64n3 It follows that the points a+ , b+ , c ∈ ∆5w(I)/4 (a) are located outside the disc ∆i := ∆σ (zi , f )/(64n3 ) (zi ). In summary, none of the discs ∆σ (zi , f )/(64n3 ) (zi ), i = 1, . . . , n, contains any of the points a+ , b+ and c. Hence, due to Corollary 3, it follows that each of the values | f (c)|, | f (a+ )| and | f (b+ )| is larger than (n + 1)2−L f . A simple computation shows that (n + 1)2−L f > 22n+8−LI n2 and, thus, Lemma 13 implies that each of the values λ , λ − and λ + fulfill the inequalities (5.6)-(5.8). Since I + is isolating for ξ , f (a+ ) and f (b+ ) must have different signs and, thus, the same holds for λ − and λ + . Hence, the inequality (5.5) holds as well. ( f˜ )0 It remains to show (3): If t3/2I (0, 2) > −n2n+1−LI , then due to (2) and Lemma 13 (ii), the interval I ∗ and the (w(I)/n)-neighborhood of I is isolating for ξ . If I ∗ does not intersect any interval in O, then I ∗ is added to O and, thus, D CML outputs an isolating interval for ξ . We still have to consider the case where I ∗ intersects an interval J ∗ = (c∗ , d ∗ ) from O. Assume that J ∗ is isolating for a root γ 6= ξ and J ∗ ∩ I ∗ 6= 0. / The roots ξ and γ move by at most w(I)/(4n) and w(J)/(4n), respectively, when passing from f to an arbitrary g ∈ [ f ]2−L f (cf. the above proof of (1)). Hence, it follows that the union of the two intervals I ++ = (a − w(I), b + w(I)) and J ++ = (c − w(J), d + w(J)) contains at least two roots of an arbitrary g ∈ [ f ]2−L f . Due to Lemma 10, one of the discs ∆2w(I) (a) or ∆2w(J) (c) then also contains at least two roots of (p )0

any g ∈ [ f ]2−L f . This contradicts the fact that t3/2I (0, 2) > 0 for p(x) := f (x) + x · n2n+2−LI /w(I) ∈ [ f ]2−L f (q )0

and t3/2J (0, 2) > 0 for q(x) := f (x) + x · n2n+2−LJ /w(J) ∈ [ f ]2−L f . It follows that J ∗ already isolates ξ .

5.3

Unknown L f and σ ( f )

For unknown L f and σ ( f ), we proceed as follows: We start with an initial precision L (e.g., L = 16) and run D CML . D CML outputs a list O = {(Ik∗ , f˜Ik , LIk )}k=1,...,m , where each interval Ik∗ = (a∗k , b∗k ) = (ak − w(Ik )/(2n), bk + w(Ik )/(2n)) isolates a real root of f and f˜Ik constitutes an LIk -binary approximation of fIk . Then, in a certification step, we check whether the list L := {I1∗ , . . . , Im∗ } of isolating intervals is complete, that is, whether each real root of f is already contained in one the intervals Ik∗ . If we can guarantee the latter, we return L . Otherwise, we double L and start over again. Theorem 15 ensures that, for L ≥ Lmax (i.e., L f L fulfills the inequality (5.9)), D CM outputs isolating intervals for all real roots of f . In the following steps, we present a certification method which also succeeds if L ≥ Lmax f .

19

J1 -1/2 I -1/2

a

a

I1*

- I 2* +

* + Im

a1*

b1*

a2*

* am

+

-

-

a1*

b1*

+

-

-

a1*

b1*

a2*

+

-

J2

b*2

Jm+1 b*m

1/2

a2* b

Figure 5.2: D CML returns isolating intervals I1∗ , . . . , Ik∗ for some of the real roots of f . The intervals J1 , . . . , Jm+1 in between

define the region of uncertainty R. If P RECERTIFY succeeds, then f has alternating signs at each pair a∗k and b∗k and the signs of f at the endpoints of each Jk are equal. Furthermore, at each endpoint of an interval Ik , | f | is larger than or equal to µ¯ k > (n + 1)2−LIk . f˜ We then subdivide (−1/2, 1/2) into intervals I such that, for an LI -binary approximation f˜I of fI , either T I (0, 1) holds or fI is 3/2

monotone on (0, 1). For the intersection of I with R, we check whether | f˜I | is larger than (n + 1)2−LI . If the latter is fulfilled, f has no root in I. In case that L ≥ Lmax f , the certification step succeeds.

Since the intervals Ik∗ are already certified to be isolating for a real root of f , it suffices to show that the   m region of uncertainty 1 1 [ R := − , \ Ik∗ 2 2 k=1 contains no root of f . R decomposes into m + 1 disjoint intervals J1 = [−1/2, a∗1 ], J2 = [b∗1 , a∗2 ], . . . , Jm+1 = [b∗m , 1/2] which separate the isolating intervals Ik from each other. L c := {J1 , . . . , Jm+1 } denotes the corresponding set consisting of exactly these intervals. We start with the following lemma: Lemma 16. For k = 1, . . . , m, we define µk := min(| f˜Ik (−1/(2n))|, | f˜Ik (1 + 1/(2n))|). (i) If there exists a k with µk ≤ (n + 1)2−LIk +1 , then L < Lmax f . (ii) If µk > (n + 1)2−LIk +1 , then min(| f (a∗k )|, | f (b∗k )|) > µ¯ k := µk − (n + 1)2−LIk +1 and f˜Ik (−1/(2n)) and f˜Ik (1 + 1/(2n)) have the same signs as f (a∗k ) and f (b∗k ), respectively. (iii) If the condition in (ii) is fulfilled for all k = 1, . . . , m and f˜Ik0 (1 + 1/(2n)) · f˜Ik0 +1 (−1/(2n)) < 0 for at least one k0 , then the list L returned by D CML is not complete and, thus, L < Lmax f . Proof. Due to Lemma 13 (iii), both endpoints a∗k and b∗k of Ik∗ are located outside all discs ∆σ (zi , f )/(64n3 ) (zi ), where i = 1, . . . , n. Hence, | f (a∗k )|, | f (b∗k )| > (n + 1)2−L f according to Corollary 3. Furthermore, we have | f (a∗k ) − f˜Ik (−1/(2n))| = | fIk (−1/(2n)) − f˜Ik (−1/(2n))| < (n + 1)2−LIk and | f (b∗k )− f˜Ik (1+1/(2n))| = | fIk (1+1/(2n))− f˜Ik (1+1/(2n))| < (n+1)(1+1/(2n))n 2−LIk < (n+1)2−LIk +1 . Thus, (5.11)

µk > (n + 1)(2−L f − 2−LIk +1 ).

In the proof of Theorem 15, we have already shown that L ≥ Lmax implies that LIk ≥ Lmin ≥ L f + 8n (cf. f f min (5.10) for the definition of L f ). Hence, (i) and (ii) follow. It remains to show (iii): If there exists a k0 such that f˜Ik0 (1 + 1/(2n)) · f˜Ik0 +1 (−1/(2n)) < 0, then (ii) implies that f (b∗k0 ) · f (a∗k0 +1 ) < 0. Hence, f must have a root in (b∗k0 , a∗k0 +1 ) which is not contained in any Ik∗ .

20

The above Lemma induces our first subroutine P RECERTIFY. If P RECERTIFY fails, then L < Lmax f . P RECERTIFY: The input of P RECERTIFY is the list O = {(Ik∗ , f˜Ik , LIk )}k=1,...,m returned by D CML . For each k = 1, . . . , m, we evaluate µ¯ k := min(| f˜Ik (−1/(2n))|, | f˜Ik (1 + 1/(2n))|) − (n + 1)2−Lk +1 and + ˜ ˜ s− k := sign f Ik (−1/(2n)) and sk = sign f Ik (1 + 1/(2n)). − • If there exists a k with µ¯ k ≤ 0 or s+ k · sk+1 < 0, we return “insufficient precision”. + ¯ • In all other cases, we return the list O ∗ := {(Ik∗ , s− k )}k=1,...,m k , sk , µ

We remark that the test whether µ¯ k > 0 could have already been integrated in our first subroutine D CML . However, for the sake of clarity, we decided to separate both steps in the description. In case that P RE − + CERTIFY succeeds, its output constitutes a list O ∗ of isolating intervals Ik∗ = (a∗k , b∗k ) with signs sk and sk ∗ ∗ ∗ of f at the endpoints of Ik and lower bounds µ¯ k for | f (ak )| and | f (bk )|. The crucial idea underlying the following certification method is to consider a decomposition of R into subintervals I with corresponding f˜I LI -binary approximations f˜I of fI such that either f˜I is monotone on [0, 1] or T3/2 (0, 1) holds. Then, in both situations, we can directly derive information (cf. Lemma 6) on the image of f˜I on [0, 1]: If f˜I (0) f˜I (1) > 0 and | f˜I (t)| > (n + 1)2−LI for all t ∈ [0, 1], then f cannot contain a root in I since | f˜I (t) − fI (t)| ≤ (n + 1)2−LI . We now present our certification method C ERTIFYL (cf. Algorithm 3 in the Appendix for pseudo-code). Its input is the polynomial f , a certain precision L ∈ N and the list O = {(Ik∗ , f˜Ik , LIk )}k=1,...,m returned by D CML . Throughout the following consideration, we assume that C ERTIFYL never produces an interval I of width w(I) ≤ σ ( f )/(16n2 ). We will provide a proof of this fact in Theorem 18 (ii). C ERTIFYL : In the first step, we call P RECERTIFY. In case of success, we proceed with the list O ∗ returned by P RECERTIFY. We choose an (L + n − 1)-binary approximation f˜ ∈ [ f ]2−L−n−1 of f , evaluate f˜(−1/2 + x) and approximate the resulting polynomial by an (L + 1)-binary approximation f˜I0 ∈ [ fI0 ]2−L , where I0 = (−1/2, 1/2). Similar as for D CML , we maintain a list A of active nodes, where we initially start with A := {(I0 , f˜I0 , L)}. In each iteration, some active node (I, f˜I , LI ) ∈ A , where I = (a, b), is processed; 1. (I, f˜I , LI ) is removed from A . If LI < 0, we return “insufficient precision”. 2. If I ∩ R = 0/ (i.e., I is contained in an interval Ik∗ ), we discard I. max due to the 3. If a is contained in an interval Jk ∈ L c , k = 1, . . . , m, and sign f˜I (0) 6= s− k , then L < L f preceding Lemma 17 (i). Hence, we return “insufficient precision”. We proceed in exactly the same manner if b is contained in an interval Jk ∈ L c , k = 2, . . . , m + 1, and sign f˜I (1) 6= s+ k−1 . ˜

fI 4. We evaluate t3/2 (0, 1) and consider the following two cases: f˜I • t3/2 (0, 1) > −(n + 1)2−LI +1 : If the inequality | f˜I (0)| > (n + 1)2−LI +5 holds, then I = [a, b] contains no root of f (cf. Lemma 17 (ii)) and, thus, we discard I. Otherwise, L < Lmax and we return f “insufficient precision”. f˜I ˜ • t3/2 (0, 1) ≤ −(n + 1)2−LI +1 : We compute h(x) = ∑ni=0 h˜ i xi := (1 + x)n ( f˜I )0 (1/(1 + x)) and consider the following two cases:

21

– h˜ i < n2n−LI for all i or h˜ i > −n2n−LI for all i: In this case, we check whether ∗ µ¯ k > n2n+2−LI for all intervals Ik which intersect I, ∗ either a ∈ / R or | f˜I (0)| > n2n+2−LI and ∗ either b ∈ / R or | f˜I (1)| > n2n+2−LI . If all of the above three conditions are fulfilled, then I ∩ R contains no root of f (cf. Lemma 17 (iii)), hence we discard I. If one of the conditions is not fulfilled, then L < Lmax f and we return “insufficient precision”. – There exist h˜ i and h˜ j with h˜ i ≤ −n2n−LI and h˜ j ≥ n2n−LI : I is subdivided into Il := (a, mI ) and Ir := (mI , b). We add (Il , f˜Il , LI − 1) and (Ir , f˜Ir , LI − 2) to A , where f˜Il is an LI -binary approximation of f˜I (x/2) and f˜Ir an (LI − 1)-binary approximation of f˜I ((x + 1)/2) (cf. the proof of Theorem 12 for more details). C ERTIFYL either returns “insufficient precision” or terminates when A becomes empty. In the first case, we have shown that L < Lmax whereas, in the second case, R contains no root of f and, thus, we return f “certification successful”.

Lemma 17. Let I = (a, b) be an interval of width w(I) = 2−h > σ ( f )/(16n2 ), LI ∈ N with LI ≥ L − 2h and f˜I ∈ [ fI ]2−LI an LI -binary approximation of fI . We further define      n 1 1 n 0 ˜h(x) := ∑ h˜ i xi := (1 + x)n ( f˜I )0 ∈ (1 + x) ( fI ) . 1+x 1 + x n2n−LI i=1 (i) Suppose that – a is contained in an interval Jk ∈ L c , k = 1, . . . , m, and f˜I (0) 6= s− k , or c ˜ – b is contained in an interval Jk ∈ L , k = 2, . . . , m + 1, and fI (1) 6= s+ k−1 , then L < Lmax f . f˜I (ii) Suppose that t3/2 (0, 1) > −(n + 1)2−LI +1 . If | f˜I (0)| > (n + 1)2−LI +5 , then [a, b] contains no root of f . If | f˜I (0)| ≤ (n + 1)2−LI +5 and I ∩ R 6= 0, / then L < Lmax f .

(iii) Suppose that (i) does not apply and h˜ i < n2n−LI for all i or h˜ i > −n2n−LI for all i. If – µ¯ k > n2n+2−LI for all intervals Ik∗ which intersect I, – either a ∈ / R or | f˜I (0)| > n2n+2−LI and – either b ∈ / R or | f˜I (1)| > n2n+2−LI , then [a, b] ∩ R contains no root of f . If one of the above three conditions is not fulfilled, then L < Lmax f . min Proof. If L ≥ Lmax f , then LI ≥ L − 2h ≥ L f > L f + 8n (cf. (5.10) in the proof of Theorem 15). Furthermore, according to Corollary 3, we have | f (x)| > (n + 1)2−L f for all x ∈ R. Hence, for all t ∈ [0, 1] with a + t(b − a) ∈ R, it holds that | fI (t)| > (n + 1)2−L f . Since | f˜I (t) − fI (t)| ≤ (n + 1)2−LI < (n + 1)2−L f , it follows that ∗ | f˜I (t)| must have the same sign as fI (t). Now, if a ∈ Jk ⊂ R and sign f˜I (0) 6= s− k = sign f (ak ), then the two values f (a) = fI (0) and f (a∗k ) have different signs. This contradicts the fact that R contains no real root of f + ∗ ˜ for L ≥ Lmax f . The case b ∈ Jk and f I (1) 6= sk = sign f (bk ) is treated in analogous manner. Thus, (i) follows.

22

˜

fI We come to the proof of (ii): If t3/2 (0, 1) > −(n + 1)2−LI +1 holds, then

g(x) := f˜I (x) + (n + 1)2−LI +1 g g (0, 1) holds. According to Lemma 6, it follows that (0, 1) > 0 and, thus, T3/2 fulfills t3/2

(5.12)

5 |g(0)| | f˜I (0)| |g(0)| > |g(t)| > > − (n + 1)2−LI 3 3 3

for all t ∈ [0, 1]. For any x ∈ I, there exists a t ∈ [0, 1] with x = a +t(b − a) and | f (x) − g(t)| = | fI (t) − g(t)| < (n + 1)2−LI +1 + (n + 1)2−LI < (n + 1)2−LI +2 . If | f˜I (0)| > (n + 1)2−LI +5 , the latter inequality combined with (5.12) implies that | f (x)| > |g(t)| − (n + 1)2−LI +2 >

| f˜I (0)| − (n + 1)2−LI − (n + 1)2−LI +2 > 0. 3

For the other direction, we assume that L ≥ Lmax and I ∩ R 6= 0. / Since | f (x)| > (n + 1)2−L f for all x ∈ R, f it follows the existence of a t ∈ [0, 1] with | fI (t)| > (n + 1)2−L f . Then, |g(t)| > (n + 1)(2−L f − 2−LI +2 ) and, thus, |g(0)| > 3/5(n + 1)(2−L f − 2−LI +2 ). A simple computation shows that f˜I (0) ≥ |g(0)| − (n + 1)2−LI +1 > (n + 1)2−LI +5 . For (iii), we have ( f˜I )0 ∈ [( fI )0 ]n2−LI . Reversing the coefficients of ( fI )0 and replacing x by x + 1 increases ˜ the total error by a factor of at most 2n , hence, h(x) ∈ [(1 + x)n ( fI )0 (1/(1 + x))]n2n−LI . For h˜ i > −n2n−LI for all i, the polynomial g(x) := f˜I (x) + n2n−LI · x is monotone on [0, 1]. Namely, for its derivative g0 (x) = ( f˜I )0 (x) + n2n−LI , it follows that (x + 1)n g0 (1/(1 + ˜ + n2n−LI (x + 1)n has only positive coefficients, hence g0 has no root in (0, 1). For h˜ i < n2n−LI x)) = h(x) for all i, the same argumentation shows that g(x) := f˜I (x) − n2n−LI · x is monotone in [0, 1]. The intersection of I and R decomposes into disjoint intervals of the form J = (c, d) = (a + t1 (b − a), a + t2 (b − a)) with t1 , t2 ∈ [0, 1]. We have to show that [t1 ,t2 ] contains no root of fI if the conditions in (iii) are fulfilled. For this purpose, we aim to show that g(t1 ) and g(t2 ) have the same sign and min(|g(t1 )|, |g(t2 |) > n2n−LI +1 . Namely, since g is monotone on [t1 ,t2 ], the latter would imply that mint∈[t1 ,t2 ] |g(t)| > n2n−LI +1 and, thus, | fI (t)| > |g(t)| − n2n−LI − (n + 1)2−LI > 0 for all t ∈ [0, 1]. We distinguish the following two cases: 1. c is an endpoint of an interval Ik∗ in O ∗ . Then, | fI (t1 )| = | f (c)| > µ¯ k and |g(t1 ) − fI (t1 )| < n2n+1−LI . If the inequality µ¯ k > n2n−LI +2 holds, then |g(t1 )| > n2n+1−LI and g(t1 ) has the same sign as f (c) and, thus, the same sign as f˜I (t1 ). The case where d is an endpoint of an interval Ik∗ in O ∗ is treated in completely analogous manner. 2. c is not an endpoint of any interval Ik∗ , that is, c is contained in the interior of an interval Jk . Then, c = a and, thus, | f˜I (t1 )| = | f˜I (0)| > n2n+2−LI . It follows that g(t1 ) has the same sign as f˜I (0) and |g(t1 )| > n2n+1−LI . For d, we consider a similar argumentation. In summary, it follows that g(t1 ) and g(t2 ) have the same signs as f˜I (t1 ) and f˜I (t2 ), respectively, and their absolute values are both larger than n2n+1−LI . Since we assume that (i) does not apply, we have sign f˜I (t1 ) = sign f˜I (t2 ) and, thus, f has no root in [c, d]. It remains to show that, for L ≥ Lmax f , the conditions in (iii) are −LIk +1 −L f fulfilled. Due to (5.11), µ¯ k > (n + 1)(2 −2 ), thus a simple computation shows that µ¯ k > n2n+2−LI L for all k. L > Lmax further implies that D CM outputs isolating intervals for all real roots of f . Then, f according to Corollary 3, we have | f (x)| > (n + 1)2−L f for all x ∈ R. For c = a + t1 (b − a) ∈ R, it follows that | f˜I (t1 )| ≥ | fI (t1 )| − (n + 1)2−LI = | f (c)| − (n + 1)2−LI > (n + 1)2−L f − (n + 1)2−LI ≥ n2n+2−LI . The case d ∈ R is treated in exactly the same manner. 23

The following theorem shows that our certification method is efficient in two ways: First, C ERTIFYL ˜ succeeds for L ≥ Lmax and, second, it amounts for O(n(Σ( f ) + n log n)(n log Γ + τ + L)) bit operations which f matches the worst case complexity bound obtained for D CML (cf. Theorem 14). Theorem 18. For a polynomial f as defined in (1.3), C ERTIFYL succeeds if L ≥ Lmax f . In this case, the region L of uncertainty R contains no root of f . Furthermore, C ERTIFY fI (i) subdivides no interval I, where T3/2 (0, 1) or var( f 0 , I) = 0 holds,

(ii) produces no interval I of width less than or equal to σ ( f )/(16n2 ), (iii) induces a subdivision tree of size O(Σ( f ) + n log n) and ˜ (iv) demands for O(n(Σ( f ) + n log n)(n log Γ + τ + L)) bit operations. f˜I Proof. For (i), an interval I is only subdivided if t3/2 (0, 1) ≤ −(n + 1)2−LI +1 or if there exist coefficients h˜ i ˜ and h˜ j of h(x) = ∑ni=0 h˜ i xi = (1 + x)n ( f˜I )0 (1/(1 + x)) with h˜ i < −n2n−LI and h˜ j > n2n−LI . In the first case, we ˜

fI fI fI fI must have t3/2 (0, 1) < 0 since |t3/2 (0, 1) −t3/2 (0, 1)| < (n + 1)2−LI +1 . Hence, T3/2 (0, 1) cannot hold. For the 0 ˜ second case, we have var(( fI ) , [0, 1]) 6= 0 since corresponding coefficients of h(x) = (1 + x)n ( f˜I )0 (1/(1 + x)) and (1 + x)n ( fI )0 (1/(1 + x)) differ by at most n2n−LI . It follows that var( f 0 , I) = var(( f 0 )I , [0, 1]) = var(( fI )0 , [0, 1]) 6= 0. Now, (ii) follows immediately from Lemma 8. Furthermore, Lemma 17 applies and, thus, C ERTIFYL succeeds if L ≥ Lmax f . For the proof of (iii), we refer to Appendix 7.1. (iv) follows in completely analogous manner as the result on the bit complexity for D CML as shown in the proof of Theorem 14.

Eventually, we present our overall root isolation method RI SOLATE. It applies to a polynomial F as given in (1.1) and returns isolating intervals for all real roots of F. RI SOLATE: We choose a starting precision L ∈ N (e.g., L = 16) and run D CML on the polynomial f as defined in (1.3). If D CML returns “insufficient precision”, we double L and start over again. If D CML outputs a list O = {(Ik∗ , f˜Ik , Lk )}k=1,...,m , we run C ERTIFYL . If C ERTIFYL returns “insufficient precision”, we double L and start over the entire algorithm. If C ERTIFYL succeeds, the intervals Ik = (a∗k , b∗k ) isolate all real roots of f . We return the intervals (2Γa∗k , 2Γb∗k ), k = 1, . . . , m, which isolate the real roots of F. The following theorem summarizes our results: Theorem 19. Let F be a polynomial as given in (1.1). Then, RI SOLATE determines isolating intervals for all real roots of F and, for each of these intervals J containing a root ξ of F, it holds that σ (ξ , F)/(16n2 ) < w(J) < 2nσ (ξ , F). RI SOLATE demands for coefficient approximations of F to O(Σ(F) + n log Γ)) bits after the binary point and the total costs are bounded by ˜ ˜ O(n(Σ(F) + n log Γ)(n log Γ + τ + Σ(F))) = O(n(Σ(F) + nτ)2 ) ˜ 3 τ 2 ). bit operations. For F ∈ Z[x], the bound on the bit complexity writes as O(n

24

Proof. The proof is an immediate consequence of our results in Theorem 14 and 18. Namely, for each precision L, the total costs for running D CML and C ERTIFYL are bounded by ˜ ˜ O(n(Σ( f ) + n log n)(n log Γ + τ + L)) = O(n(Σ(F) + n log Γ)(n log Γ + τ + L)) max = bit operations. Since we double L in each step and succeed for L ≥ Lmax f , L is always bounded by 2L f O(Σ( f ) + n) = O(Σ(F) + n log Γ). It follows that the total costs are dominated by the costs for the last run. ˜ Hence, RI SOLATE amounts for O(n(Σ(F) + n log Γ)(n log Γ + τ + Σ(F))) bit operations. Furthermore, we have to approximate the coefficients of f to O(Σ( f ) + n) = O(Σ(F) + n log Γ) bits after the binary point. Since f is obtained from F via the scaling operation x 7→ 2Γx, we have to approximate the coefficients of F to O(Σ(F) + n log Γ) bits after the binary point. For the special case where F is an integer polynomial, ˜ the bound on the bit complexity follows from Σ(F) = O(nτ) (cf. Appendix 7.2). The estimate on the size of the isolating intervals is a direct consequence of the following two facts: First, an interval I containing a root zi of f is not subdivided by D CML if w(I) ≤ σ (zi , f )/(8n2 ). Hence, any interval J which is returned by D CML as an isolating interval for zi is the extension I ∗ of an interval I by w(I)/(2n) to both sides, where w(I) > σ (zi , f )/(16n2 ). Second, the (w(I)/n)-neighborhood of I isolates zi as well and, thus, w(J) < 2w(I) < 2nσ (zi , f ).

6

Conclusion

We presented a new complete and deterministic algorithm to isolate the real roots of an arbitrary squarefree polynomial F with real coefficients. Our analysis shows that the hardness of isolating the real roots exclusively depends on the location of the roots and not on the coefficient type. Furthermore, the overall running time is significantly reduced by considering approximations at each node of the recursion tree. In particular, for integer polynomials, we achieve an improvement with respect to worst case bit complexity by a factor n = deg F compared to the best bounds known for other practical methods such as the Descartes or the continued fraction method. The latter is due to the fact that exact arithmetic produces too much information for the task of root isolation and, thus, a significant overhead of computation. Since we were gaining for a practical method, we formulated our algorithm in the spirit of the classical V CA bisection method. We are convinced that because of its similarities to the latter exact method and because of its usage of approximate and less expensive computations, it will prove to be efficient in practice as well. We plan to implement our algorithm to verify this claim. Finally, univariate root isolation constitutes an important substep in cad (cylindrical algebraic decomposition) computations. Hence, our analysis may serve as a basis to improve the corresponding bounds on the worst case bit complexity.

25

References [1] A. G. Akritas. The fastest exact algorithms for the isolation of the real roots of a polynomial equation. Computing, 24(4):299–313, 1980. [2] A. Alesina and M. Galuzzi. A new proof of Vicent’s theorem. L’Enseignement Mathematique, 44:219– 256, 1998. [3] A. Alesina and M. Galuzzi. Addendum to the paper ”a new proof of Vicent’s theorem”. L’Enseignement Mathematique, 45:379–380, 1999. [4] A. Alesina and M. Galuzzi. Vincent’s theorem from a modern point of view. Categorical Studies in Italy 2000, Rendiconti del Circolo Matematico di Palermo, Serie II, 64:179–191, 2000. [5] S. Basu, R. Pollack, and M.-F. Roy. Algorithms in Real and Algebraic Geometry. Springer, 2nd edition, 2006. [6] G. Collins, J. Johnson, and W. Krandick. Interval arithmetic in cylindrical algebraic decomposition. J. Symbolic Computation, 34:143–155, 2002. [7] G. E. Collins and A. G. Akritas. Polynomial real root isolation using Descartes’ rule of signs. In ISSAC, pages 272–275, 1976. [8] Z. Du, V. Sharma, and C. Yap. Amortized bounds for root isolation via Sturm sequences. In SNC, pages 113–130. 2007. [9] A. Eigenwillig. On multiple roots in Descartes’ rule and their distance to roots of higher derivatives. Journal of Computational and Applied Mathematics, 200(1):226–230, March 2007. [10] A. Eigenwillig. Real Root Isolation for Exact and Approximate Polynomials Using Descartes’ Rule of Signs. PhD thesis, Saarland University, 2008. [11] A. Eigenwillig. Real Root Isolation for Exact and Approximate Polynomials using Descartes’ Rule of Signs. PhD thesis, Universit¨at des Saarlandes, May 2008. [12] A. Eigenwillig, L. Kettner, W. Krandick, K. Mehlhorn, S. Schmitt, and N. Wolpert. An exact descartes algorithm with approximate coefficients. In CASC, volume 3718 of LNCS, pages 138–149, 2005. [13] A. Eigenwillig, V. Sharma, and C. Yap. Almost tight complexity bounds for the Descartes method. In ISSAC, pages 71–78, 2006. [14] A. Eigenwillig, V. Sharma, and C. Yap. Almost tight complexity bounds for the Descartes method. In ISSAC, pages 151–158, 2006. [15] J. Gerhard. Modular algorithms in symbolic summation and symbolic integration. LNCS, Springer, 3218, 2004. [16] M. Hemmer, E. P. Tsigaridas, Z. Zafeirakopoulos, I. Z. Emiris, M. I. Karavelas, and B. Mourrain. Experimental evaluation and cross benchmarking of univariate real solvers. In SNC, pages 45–54, 2009. [17] J. R. Johnson and W. Krandick. Polynomial real root isolation using approximate arithmetic. In W. K¨uchlin, editor, ISSAC, pages 225–232. ACM Press, 1997.

26

[18] W. Krandick and K. Mehlhorn. New bounds for the Descartes method. Journal of Symbolic Computation, 41(1):49–66, 2006. [19] T. Lickteig and M.-F. Roy. Sylvester-Habicht sequences and fast Cauchy index computation. J. of Symbolic Computation, 31:315–341, 2001. [20] K. Mehlhorn and M. Sagraloff. Isolating real roots of real polynomials. In ISSAC ’09, pages 247–254. ACM, 2009. [21] B. Mourrain, F. Rouillier, and M.-F. Roy. The Bernstein basis and real root isolation. In Combinatorial and Computational Geometry, pages 459–478. 2005. ¨ [22] N. Obreschkoff. Uber die Wurzeln von algebraischen Gleichungen. Jahresbericht der Deutschen Mathematiker-Vereinigung, 33:52–64, 1925. [23] N. Obreschkoff. Verteilung und Berechnung der Nullstellen reeller Polynome. VEB Deutscher Verlag der Wissenschaften, 1963. [24] N. Obreschkoff. Zeros of Polynomials. Marina Drinov, Sofia, 2003. Translation of the Bulgarian original. [25] A. M. Ostrowski. Note on Vincent’s theorem. Annals of Mathematics, Second Series, 52(3):702–707, 1950. Reprinted in: Alexander Ostrowski, Collected Mathematical Papers, vol. 1, Birkh¨auser Verlag, 1983, pp. 728–733. [26] F. Rouillier and P. Zimmermann. Efficient isolation of [a] polynomial’s real roots. J. Computational and Applied Mathematics, 162:33–50, 2004. [27] M. Sagraloff and C. K. Yap. An efficient exact subdivision algorithm for isolating complex roots of a polynomial and its complexity analysis, 2009. Submitted. http://mpi-inf.mpg.de/ msagralo/ceval.pdf. [28] V. Sharma. Complexity of real root isolation using continued fractions. Theoretical Computer Science, 409:292–310, 2008. [29] E. P. Tsigaridas and I. Z. Emiris. On the complexity of real root isolation using continued fractions. Theor. Comput. Sci., 392(1-3):158–173, 2008. [30] A. J. H. Vincent. Sur la r´esolution des equations num´eriques. Journal de Math´ematiques Pures et Appliqu´ees, 1:341–372, 1836. [31] J. von zur Gathen and J. Gerhard. Fast algorithms for Taylor shifts and certain difference equations. In ISSAC ’97: Proceedings of the 1997 international symposium on Symbolic and algebraic computation, pages 40–47, New York, NY, USA, 1997. ACM. [32] C. K. Yap. Fundamental Problems in Algorithmic Algebra. Oxford University Press, 2000.

27

7 7.1

Appendix Proof of Theorem 18 (iii)

For f as defined in (1.3), let T be the subdivision tree consisting of all intervals obtained by recursive bisection of the interval I0 = (−1/2, 1/2) with respect to the following rule: At depth h ∈ N0 , an interval (a, b) = (−1/2 + i2−h , −1/2 + (i + 1)2−h ), i ∈ {0, . . . , 2h − 1}, is subdivided exactly if var( f 0 , I) ≥ 1 and ∆4n2 w(I) (a) contains at least two roots of f . According to Lemma 8 (iv) and Theorem 18 (i), the recursion tree induced by C ERTIFYL is a subtree of T . The preceding considerations will show that |T | = O(Σ( f ) + n log n) and, thus, Theorem 18 (iii) follows. We first introduce some notations. Wlog., we can assume that σ (z1 , f ) ≤ . . . ≤ σ (zn , f ). We define ∗ h := d2 log ne + 4 and hi := dlog(1/σ (zi , f ))e for all i = 1 . . . , n. For an arbitrary h ∈ N, k(h) denotes the number of roots zi with hi ≥ h − h∗ . Thus, we get ∗ −h

σ (zi , f ) ≥ 2−hi > 2h

≥ n2 24−h for i = k(h) + 1, . . . , n.

We denote the set of all nodes (intervals) at depth h by Th and the set of all terminal nodes (intervals) in Th by Th∗ . Furthermore, let λ (h) := |Th | the number of nodes at depth h, λ ∗ (h) := |Th∗ | the number of terminal nodes at depth h, λ # (h) := λ (h) − λ (h)∗ the number of nodes at depth h that are further subdivided, and ν(h) :=

∑ var( f 0 , I) the sum of all sign variations for f 0 at depth h. I∈Th

For each non-terminal interval I = (a, b) ∈ Th \Th∗ , Descartes’ Rule of Sign counts var( f 0 , I) ≥ 1 and ∆4n2 w(I) (a) = ∆n2 22−h (a) contains at least two roots of f . The latter implies that there exists a root z ∈ ∆n2 22−h (a) of f with σ (z, f ) < n2 23−h . We say that a root zi of f is critical for I if ∆4n2 w(I) (a) contains zi and σ (zi , f ) < 8n2 w(I). It immediately follows: Lemma 20. An interval I is terminal if there exists no root zi of f that is critical for I. If σ (zi , f ) ≥ n2 23−h , then zi cannot be critical for any interval I of width w(I) ≤ 2−h . Hence, only the roots z1 , . . . , zk(h) can be critical for intervals I ∈ Th0 at depth h0 ≥ h. For each root zi , all but at most two intervals I ∈ Th fulfill the inequality |zi − x| > 2−(h+1) for all x ∈ I. It follows that, for all but at most 2k(h) intervals I ∈ Th , we must have |zi − x| > 2−(h+1) for all x ∈ I and all i = 1, . . . , k(h). We consider a non-terminal interval I ∈ Th \Th∗ that fulfills this inequality. Then, the following consideration shows that, at depth h0 := h + h∗ , there cannot be any interval I 0 ∈ Th0 with I 0 ⊂ I: Assume that there exists such an interval I 0 . Then, I 0 is one of the two children of a J ∈ Th0 −1 and J ⊂ I. For ∗ 0 0 aJ the left endpoint of J, we get |aJ − zi | > 2−(h+1) = 2h −1 2−h ≥ n2 23−h = 4n2 w(J) for all i = 1, . . . , k(h). Hence, it follows that none of the roots z1 , . . . , zk(h) is critical for J. According to Lemma 20, none of the roots zk(h)+1 , . . . , zn is critical for J as well and, thus, J must be terminal, a contradiction. Since I is a non-terminal node, we must have var( f 0 , I) ≥ 1, and since I has no children in Th0 , Theorem 5 implies that, at depth h0 , Descartes’ Rule of Sign counts at least one sign variation less for f 0 than at depth h. The latter applies to at least λ # (h) − 2k(h) intervals at depth h and, thus, we obtain: Lemma 21. For r(h) := λ # (h) − 2k(h), it holds that v(h + h∗ ) ≤ ν(h) − r(h).

28

We are now ready to prove our result on the size of T : From Lemma 20 and k(h) = 0 for all h > h1 + h∗ − 1 it follows that T has no nodes at depth h ≥ h1 + h∗ + 1. For a certain depth h with 1 ≤ h ≤ h∗ , we consider the sequence Th , Th+h∗ , Th+2h∗ , . . . corresponding to the levels h, h + h∗ , h + 2h∗ , . . . in the recursion tree T . From Lemma 21 and v(h) ≤ n − 1 for all h we obtain the following computation: ∞

∑λ

#

dh1 /h∗ e+1



(h + ih ) =

dh1 /h∗ e+1



#

λ (h + ih ) =



i=0

i=0

i=0

dh1 /h∗ e+1

=2

(r(h + ih∗ ) + 2k(h + ih∗ ))





k(h + ih∗ ) +

dh1 /h∗ e+1



i=0

i=0

dh1 /h∗ e+1

dh1 /h∗ e+1

≤2



k(h + ih∗ ) +



i=0

r(h + ih∗ )

(ν(h + ih∗ ) − ν(h + (i + 1)h∗ ))

i=0 dh1 /h∗ e+1

≤ ν(h) + 2



dh1 /h∗ e+1



k(h + ih ) ≤ n − 1 + 2



i=0

k(h + ih∗ )

i=0

Now, summing up over all levels leads to the following result: h∗ dh1 /h∗ e+1

h1 +h∗

|T | =



i=0

λ (i) = λ (0) + ∑

i=0

h=1 h∗ −1 dh1 /h∗ e+1

= 1+







i=0

h=0

λ (h + ih ) = 1 + ∑



2λ # (h + ih∗ ) = 1 + 2λ # (0) + ∑

h=1 dh1 /h∗ e

n−1+2





2λ # (h + ih∗ )

i=0

! k(h + ih∗ )

≤ 3 + 2(n − 1)h∗ + 4

h1 +h∗

i=0

h=1

2λ # (h − 1 + ih∗ )

i=0

h=1

h∗ dh1 /h∗ e+1

h∗

≤ 3+2 ∑

h∗ dh1 /h∗ e+1



k(i)



i=0

Since 3 + 2(n − 1)h∗ = O(n log n), it remains to show that the sum over all k(i) is in O(n log n + Σ( f )). We decompose the sequence h1 , . . . , hn into subsequences of equal values; let i0 := 0 < i1 < . . . < im := n, m ≤ n, such that his > his +1 and his−1 +1 = . . . = his for all s = 1, . . . , m. From our definition of k(i) it follows that k(i) = is for all his+1 + h∗ < i ≤ his + h∗ and, thus, hi1 +h∗

h1 +h∗



k(i) =

i=0



i=0

him−1 +h∗

him +h∗

k(i) =



i=0

k(i) +



hi1 +h∗

k(i) + . . . +

i=him +h∗ +1



k(i) =

i=hi2 +h∗ +1

= (him + h∗ + 1) · im + (him−1 − him ) · im−1 + . . . + (hi1 − hi2 ) · i1 = hi1 · i1 + hi2 · (i2 − i1 ) + . . . + him · (im − im−1 ) + im (h∗ + 1) < (− log σ (zi1 , f ) + 1) · i1 + (− log σ (zi2 , f ) + 1) · (i2 − i1 ) + . . . . . . + (− log σ (zim , f ) + 1) · (im − im−1 ) + im (h∗ + 1) i1

≤ n(h∗ + 1) + ∑ (− log σ (zi , f )) + . . . + i=1 n

im



im

(− log σ (zi , f )) + ∑ 1

i=im−1 +1

= 2n + nh∗ − ∑ log σ (zi , f ) = O(n log n + Σ( f )) i=1

which proves our claim.

29

i=1

7.2

Integer Polynomials

˜ For an integer polynomial F ∈ Z[x] as given in (1.1), we aim to show that Σ(F) = O(nτ). We proceed in two steps: First, we cluster the roots ξi of F into subsets consisting of nearby roots. Second, we apply the generalized Davenport-Mahler bound [8, 11] to the roots of F. Eventually, the above result follows. Wlog., we can assume that σ (ξ1 , F) ≤ . . . , ≤ σ (ξn , F). For h ∈ N, we denote i(h) the maximal index i with σ (ξi , F) ≤ 2−h and R = R(h) := {ξ1 , . . . , ξi(h) } the corresponding set of roots ξi with σ (ξi , F) ≤ 2−h . If h ≤ log(1/σ (F)), then R contains at least two roots. We are interested in a partition of R into disjoint subsets R1 , . . . , Rl that consist of nearby points, only. Lemma 22. Suppose that h ≤ log(1/σ (F)). Then, there exists a partition of R := R(h) into disjoint sets R1 , . . . , Rl such that |Ri | ≥ 2 for all i ∈ {1, . . . , l} and |ξ − ξ 0 | ≤ n2−h for all ξ , ξ 0 ∈ Ri . Proof. We initially set R1 := {ξ1 }. Then, we add all roots ξi to R1 that satisfy |ξi − ξ1 | ≤ 2−h . For each root in R1 , we proceed in the same way. More precisely, for each ξ ∈ R1 , we add those roots ξ 0 ∈ R to R1 with |ξ − ξ 0 | ≤ 2−h . If no further root can be added to R1 , we consider the set R\R1 of the remaining roots and treat it in exactly the same manner. Finally, we end up with a partition R1 , . . . , Rl of R such that, for any two points in any Ri , their distance is less than or equal to (|Ri | − 1)2−h < n2−h . Furthermore, each of the sets Ri must contain at least two roots as σ (ξi , F) ≤ 2−h for all i = 1, . . . , i(h). We now consider a directed graph Gi on each Ri which connects consecutive roots of Ri in ascending order of their absolute values. We define G := (R, E) as the union of all Gi . Then, G is a directed graph on R with the following properties: 1. each edge (α, β ) ∈ E satisfies |α| ≤ |β |, 2. G is acyclic, and 3. the in-degree of any node is at most 1. Hence, we can apply the generalized Davenport-Mahler bound [8, 11] to G : √ !#E  n/2 1 3 1 · ∏ |α − β | ≥ ((n + 1)1/2 2τ )n−1 · n n (α,β )∈E As each set Ri contains at least 2 roots, we must have i(h) > #E ≥ i(h)/2. Furthermore, for each edge (α, β ) ∈ E, we have |α − β | ≤ n2−h . It follows that √ !i(h)  n/2  i(h)/2 i(h) 1 3 1 3 1 −h 2 (n2 ) > · > · · 2 n nτ 1/2 τ n−1 n n (n + 1) 2 n ((n + 1) 2 ) and, thus, 2n(τ + log(n + 1)) 2n(τ + log(n + 1)) < . log 3 + log n + h h It directly follows that log(1/σ (F)) < n(τ + log(n + 1)) + 1 since, otherwise, there would exist an h with n(τ + log(n + 1)) < h ≤ log(1/σ (F)) and i(h) < 2 which is not possible. For the bound on Σ(F), it suffices to consider only the roots ξ1 , . . . , ξk with separation ≤ 1/2 since all other roots contribute with at most n to the sum Σ(F). Since i(h)
0 then s := sign fI + (0) · fI + (1) if s ≥ 0 then do nothing else if I + does not intersect any interval in O then add I + to O else do nothing end if end if else subdivide I into Il := (a, mI ) and  Ir := (mI , b) x x+1 fIl := fI 2 and fIr := fI 2 = fIl (x + 1) add (Il , fIl ) and (Ir , fIr ) to A end if end if until A is empty return O

31

Algorithm 2 D CML Require: polynomial f = ∑0≤i≤n ai xi ∈ R[x] as in (1.3) and an L ∈ N Ensure: either returns ”insufficient precision” or a list O of disjoint isolating intervals I ∗ (together with corresponding polynomials fI and error bounds LI ) for some of the real roots of f . {no guarantee that all real roots are captured} I0 := (−1/2, 1/2) f˜ an (L + n + 1)-binary approximation of f f˜I0 an (L + 1)-binary approximation of f˜(−1/2 + x) A := { (I0 , f˜I0 , L) }; O := 0/

{⇒ f˜I0 ∈ [ fI0 ]2−L } {list of active and isolating intervals}

repeat (I, f˜I , LI ), where I := (a, b), some element in A ; delete (I, f˜I , LI ) from A if LI < 0 then return ”insufficient precision” else    1 1 1 ˜ f˜I + (x) := f˜I − 4n + 1 + 2n x and h(x) = ∑ni=0 h˜ i xi := (1 + x)n · f˜It+ 1+x if h˜ i > −2n+2−LI for all i or h˜ i < 2n+2−LI for all i then do nothing else ( f˜ )0 if t3/2I > −n2n+1−LI then fˆI (x) := f˜I (x) + n2n+1−LI · x λ − := f˜I + (0) − 2n−1−LI , λ + := f˜I + (1) + (4n + 1)2n−1−LI and λ := f˜I (−1/n) − 2n+1−LI . ˆ if λ − · λ + < 0 and|λ − | > 2n−LI and|λ + | > n2n+3−LI and |λ | > n2 2deg( fI )+7+n−LI then w(I) I ∗ = (a∗ , b∗ ) := a − w(I) 2n , b + 2n {⇒ I ∗ contains a root ξ of f and the w(I)/n-neighborhood of I is isolating for ξ } if does not intersect an interval J ∗ for any (J ∗ , f˜J , LJ ) ∈ O then add (I ∗ , f˜I , LI ) to O else do nothing {J ∗ is already isolating for ξ } end if else do nothing end if else Subdivide I into Il := (a, mI ) and Ir := (m  I , b) f˜Il an LI -binary approximation of f˜I 2x {⇒ f˜Il ∈ [ fIl ]2−(LI −1) }  1+x ˜ ˜ fIr an (LI − 1)-binary approximation of fI 2 {⇒ f˜Ir ∈ [ fIr ]2−(LI −2) } ˜ ˜ Add (Il , fIl , LI − 1) and (Ir , fIr , LI − 2) to A end if end if end if until A is empty return O I∗

32

Algorithm 3 C ERTIFYL Require: polynomial f = ∑0≤i≤n ai xi ∈ R[x] as defined in (1.3), an L ∈ N and the output list O = {(Ik∗ , f˜Ik , LIk )}k=1,...,m returned by D CML . Ensure: either returns ”insufficient precision” or the list L = {I1∗ , . . . , Im∗ } of isolating intervals with the guarantee that, for each real root of f , there exists a corresponding interval in L . if P RECERTIFY (cf. page 21 for its description) returns “insufficient precision” for O then return “insufficient precision”. else − + ¯ + ¯ ∗ ∗ O ∗ = {(Ik∗ , s− k )}k=1,...,m the list returned by P RECERTIFY k )}k=1,...,m = {((ak , bk ), sk , sk , µ k , sk , µ S c ∗ ∗ ∗ L := {J1 , . . . , Jm+1 } = {[−1/2, a1 ], [b1 , a2 ], . . . , [b∗m , 1/2]}; R := m+1 i=1 Ji end if I0 := (−1/2, 1/2) f˜ an (L + n + 1)-binary approximation of f f˜I0 an (L + 1)-binary approximation of f˜(−1/2 + x) {⇒ f˜I0 ∈ [ fI0 ]2−L } A := { (I0 , f˜I0 , L) } {list of active intervals} repeat (I, f˜I , LI ), where I := (a, b), some element in A ; delete (I, f˜I , LI ) from A if LI < 0 then return ”insufficient precision” {L < Lmax f } else if I ∩ R = 0/ then do nothing + ˜ else if ∃k ∈ {1, . . . , m} : a ∈ Jk ∧ sign f˜I (0) 6= s− k or ∃k ∈ {2, . . . , m + 1} : b ∈ Jk ∧ sign f I (1) 6= sk−1 then return ”insufficient precision” else f˜I if t3/2 (0, 1) > −(n + 1)2−LI +1 then if | f˜I (0)| > (n + 1)2−LI +5 then do nothing {I contains no root of f } else return ”insufficient precision” {L < Lmax f } end if else ˜ := ∑ni=0 h˜ i xi = (1 + x)n ( f˜I )0 (1/(1 + x)) h(x) if h˜i < n2n−LI for all i or h˜i > −n2n−LI for all i then if µ¯ k > n2n+2−Li for all Ik with Ik ∩ I 6= 0/ and a ∈ / R ∨ | f˜I (0)| > n2n+2−LI and b ∈ / R ∨ | f˜I (1)| > n+2−L I n2 then do nothing {I ∩ R contains no root of f } else return ”insufficient precision” {L < Lmax f } end if else Subdivide I into Il := (a, mI ) and Ir := (m  I , b) ˜fIl an LI -binary approximation of f˜I x {⇒ f˜Il ∈ [ fIl ]2−(LI −1) } 2  1+x ˜ ˜ fIr an (LI − 1)-binary approximation of fI 2 {⇒ f˜Ir ∈ [ fIr ]2−(LI −2) } Add (Il , f˜Il , LI − 1) and (Ir , f˜Ir , LI − 2) to A end if end if end if until A is empty 33 return ”certification successful” {The region of uncertainty R contains no root of f }