A UNIVERSAL BOUND FOR THE AVERAGE COST OF ROOT FINDING


arXiv:0903.3674v1 [math.NA] 21 Mar 2009

MYONG-HI KIM, MARCO MARTENS, AND SCOTT SUTHERLAND

ABSTRACT. We analyze a path-lifting algorithm for finding an approximate zero of a complex polynomial, and show that for any polynomial with distinct roots in the unit disk, the average number of iterates this algorithm requires is universally bounded by a constant times the log of the condition number. In particular, this bound is independent of the degree d of the polynomial. The average is taken over initial values z with |z| = 1 + 1/d using uniform measure.

CONTENTS
1. Introduction
2. Preliminaries
3. The Path-Lifting Algorithm
4. The Voronoi Partition in the Branched Cover
5. The Behavior of f on the Initial Circle
6. The Size of the Step
7. The Pointwise Cost
8. The Average Cost
9. Concluding Remarks
References

1991 Mathematics Subject Classification. Primary 65H05; Secondary 30C15, 37F10, 52C20, 57M12, 68Q25.
Key words and phrases. Root-finding, alpha theory, Newton's method, Voronoi region, path-lifting, branched covering.
Acknowledgements: Part of this work was done while Myong-Hi Kim was visiting Stony Brook University; we are grateful for their support and hospitality.
Stony Brook IMS Preprint #2009/1, March 2009.

1. INTRODUCTION

A point z∗ is an approximate zero for a function f(z) if it converges quadratically to a zero under Newton's method. The notion of an approximate zero was introduced by Smale in [Sm81]. A sufficient condition to determine if z∗ is an approximate zero for f using only evaluation of f(z∗) and its derivatives at z∗ was developed by Kim ([K85], [K88]); this condition was sharpened and extended to apply also to systems of polynomials by Smale [Sm86]. Nowadays, this approach is commonly called α-theory. We will use the Kim-Smale criterion to locate approximate zeros; see Theorem 3.2.

Depending on the context of the problem, the goal might be to produce a point ẑ so that |f(ẑ)| < ε, or one might desire that |ẑ − ζ| < ε where f(ζ) = 0. In either case, such a solution is called


an ε-root. Notice that an approximate zero z∗ may not be an ε-root. However, such a point z∗ will converge to an ε-root quadratically. Consequently, locating an approximate zero resolves the question of producing an ε-root. See Def. 3.1 for the specific definition.

In this paper, we discuss the use of a path-lifting method (which we call the α-step method, a variation of the algorithm developed in [K85] and [K88]) to locate an approximate zero for a complex polynomial f(z), and show that for any polynomial f, the average number of steps required by the algorithm is universally bounded, independent of the degree of f, where the average is taken over the starting points for the method. In fact, the average cost depends on the average of the logarithms of several of the critical values of f (this, in turn, is less than the logarithm of the condition number; see Remark 3.8). We note that the results of [K88] imply that for any polynomial, this method converges except on a finite set of starting values. More precisely, we have the following.

Theorem 1.1 (Main Theorem). Let $f : \mathbb{C} \to \mathbb{C}$ be a polynomial with distinct roots $\zeta_i$ in the unit disk. There is a constant $\Lambda_f$, independent of the degree of f, so that the average number of steps required by the α-step algorithm to locate an approximate zero for f is no more than $67(\Lambda_f + 13.1)$, where the average is taken over starting values on the circle of radius 1 + 1/d. The constant $\Lambda_f$ is the average of the logarithms of the radius of convergence of $f^{-1}$ at the roots $\zeta_i$.

The cost of each step of the algorithm is dominated by the calculation of $\alpha_f(z)$. Since this can be done with $O(d \log^2 d)$ arithmetic operations (see [BM], for example), we have the following.

Corollary 1.2. The average arithmetic complexity of locating an approximate zero for f by the α-step algorithm is $O(\Lambda_f\, d \log^2 d)$, where the average is taken over starting values on the circle of radius 1 + 1/d.

In path-lifting methods, it is useful to distinguish between the domain and range, so we have f : Csource → Ctarget .

To implement the method, we choose a path γ in the target space (typically a segment connecting an initial point w0 = f (z0 ) to zero) and attempt to lift it back to the source space via a branch of f −1 . In this form, such methods were introduced by Shub and Smale (see, for example [SS86] or [Sm85]), although one could argue (as Smale points out in [Sm81]) that in some sense this idea goes back to Gauss. See [Ren] and the references therein, as well as [KS]. The series [SS93a, SS93b, SS93c, SS96, SS94, Sh07, BS] discusses related methods for systems of polynomial equations. A survey of complexity results for solving polynomial equations in one variable can be found in [Pa]; see also [B08]. The difficulty of computing a local branch of f −1 along a path γ in the target space is related to how close γ comes to a critical value of f . However, not all critical values of f are relevant: only those which correspond to a critical point of f lying near the particular branch of f −1 (γ ) have any impact. Consequently, it is useful to factor f through the (branched) Riemann surface S for f −1 ,


so that we have the commutative diagram

$$\mathbb{C}_{\mathrm{source}} \xrightarrow{\ \hat f\ } \mathcal{S} \xrightarrow{\ \pi\ } \mathbb{C}_{\mathrm{target}}, \qquad f = \pi \circ \hat f,$$

where $\hat f$ is a diffeomorphism except at the critical points of f, and π is the projection map. With this viewpoint, the path γ that we lift back to $\mathbb{C}_{\mathrm{source}}$ lies in $\mathcal{S}$, and the troublesome points now are the branch points of $\mathcal{S}$.

One ingredient important to our analysis is understanding the Voronoi decomposition of $\mathcal{S}$ relative to the branch points. That is, for each branch point v of $\mathcal{S}$, the Voronoi domain Vor(v) is the set of points in $\mathcal{S}$ which are closer to v than any other branch point of $\mathcal{S}$ (using the metric lifted via π). We show in §4 that the projection map π restricted to any single Vor(v) is at most (m + 1)-to-one, where m is the multiplicity of the critical point of f corresponding to v (hence the projection is generically at most 2-to-one). Because the number of steps required for a path-lifting algorithm is related to the number of critical values the lifted path comes near, this result enables us to count the number of relevant critical values for a typical path.

The path-lifting algorithm works as follows: We choose an initial point $z_0$ just outside the disk known to contain all of the roots, and let $w_0 = f(z_0)$. We then attempt to continue the branch of $f^{-1}$ which has $f^{-1}(w_0) = z_0$ along the segment from $w_0$ to 0 by choosing a suitable sequence $w_n$ along this ray, together with approximations $z_n$ such that $f(z_n) \approx w_n$. (Note that specifying a pair (z, w) such that w = f(z) is equivalent to specifying a point in $\mathcal{S}$, so in practice our rays live naturally in $\mathcal{S}$.) The process stops when a point $z_n$ is detected to be an approximate zero. Given a pair $(z_n, w_n)$, the point $w_{n+1}$ is chosen to ensure that $z_n$ is an approximate zero for $f(z) - w_{n+1}$. The tool we use to detect approximate zeros is the Kim-Smale α-function: if $\alpha_f(z) < 0.1307$, then z is an approximate zero.

The paper is organized as follows. In section 2, we set out notation and preliminary notions. Section 3 describes the path-lifting algorithm explicitly.
In section 4, we discuss the branched surface S and the corresponding Voronoi partition; this section may be of interest independent of the question of root-finding. Section 5 computes several estimates related to how the polynomial f behaves on the initial circle. In section 6, we calculate a lower bound on how far apart the points $w_n$ and $w_{n+1}$ can be, and in §7 bound the number of steps needed for the algorithm to locate an approximate zero from a given starting point $z_0$. This bound depends on the log of the angle $f(z_0)$ makes with the relevant critical values of f. Using these results, we can average the bound from §7 over all starting points; this is done in Section 8, which proves the main theorem. We conclude in Section 9 with some remarks.

2. PRELIMINARIES

We will use the following general notions and notations throughout. An open disk of radius r > 0 centered around z ∈ C is denoted by $D_r(z)$. The function Arg denotes the argument of a complex number (in the interval (−π, π]).


The ray $\ell_z \subset \mathbb{C}$ of a point $z \in \mathbb{C} \setminus \{0\}$ is

$$\ell_z = (0, \infty) \cdot z = \{ w \in \mathbb{C} \mid \operatorname{Arg} w = \operatorname{Arg} z \},$$

and the slit of this point is the part of the ray extending outward from z, that is

$$\sigma_z = [1, \infty) \cdot z = \{ w \in \ell_z \mid |w| \ge |z| \}.$$

Finally, we will introduce some notation used when dealing with the Newton flow. Let $f : \mathbb{C} \to \mathbb{C}$ be a polynomial, and denote the critical points of f by

$$C_f = \{ z \mid f'(z) = 0 \}.$$

Consider the following vector field on $\mathbb{C}$,

$$X(z) = -\frac{f(z)}{f'(z)}.$$

The corresponding flow is called the Newton flow. This vector field blows up near the critical points of f. By rescaling the length of the vector X(z) by $|f'(z)|^2$, the critical points of f become well-defined singular points of the rescaled vector field. This rescaled vector field is the gradient vector field $\dot z = -\nabla |f(z)|^2$; the solution curves of the former coincide with the latter, and we will use the two interchangeably.

The equilibria of the Newton flow are exactly the roots and critical points of f. Each root ζ is a sink; we shall denote its basin of attraction by Basin(ζ). Critical points are saddles for the flow. Furthermore, we can extend the flow to infinity, which is the only source. Each boundary component of Basin(ζ) contains critical points $c \in C_f$; each critical point c has an unstable orbit leaving from c and converging to ζ. This unstable orbit is a separatrix of c and will be denoted by $\gamma_c$. Generically, there is a unique critical point in each boundary component; in the degenerate cases, there could be saddle connections resulting in multiple critical points on one boundary component. A general discussion regarding Newton flows can be found in [STW] and [JJT]. See also Figure 4.2.

We note that for each root ζ, f is a biholomorphic map between Basin(ζ) and $\mathbb{C} \setminus \bigcup \sigma_{f(c)}$, where the union is taken over the critical points c which lie on the boundary of Basin(ζ). It is important to note that if $\varphi_t$ is a solution curve for the Newton flow, $f(\varphi_t)$ lies along a ray.

Throughout the paper, we will consider polynomials $f \in P_d(1)$, that is, $f : \mathbb{C} \to \mathbb{C}$ given by

$$f(z) = \prod_{j=1}^{d} (z - \zeta_j) \quad \text{with } |\zeta_j| < 1,$$

with distinct roots $\zeta_j$. The set of roots of f will be denoted by

$$Z_f = \{ \zeta_j \mid j = 1, \dots, d \}.$$
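The observation that f maps solution curves of the Newton flow onto rays can be checked numerically: along a solution, d/dt f(φ_t) = f′(φ_t)·X(φ_t) = −f(φ_t), so f(φ_t) = e^{−t} f(φ_0). The sketch below is only an illustration under stated assumptions: the polynomial z² − 1/4, the starting point, the classical Runge-Kutta integrator, and the step size are all choices made here, not taken from the paper.

```python
import cmath
import math


def f(z):
    # Illustrative polynomial (an assumption): roots at +1/2 and -1/2.
    return z * z - 0.25


def newton_field(z):
    # X(z) = -f(z) / f'(z), with f'(z) = 2z for this particular f.
    return -f(z) / (2 * z)


def flow(z, t_end, h=0.01):
    """Integrate dz/dt = X(z) with classical fourth-order Runge-Kutta."""
    for _ in range(int(round(t_end / h))):
        k1 = newton_field(z)
        k2 = newton_field(z + 0.5 * h * k1)
        k3 = newton_field(z + 0.5 * h * k2)
        k4 = newton_field(z + h * k3)
        z += (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return z


z0 = 1.5 * cmath.exp(2j * math.pi * 0.1)   # starting point on the circle of radius 1 + 1/d
t_end = 3.0
z_end = flow(z0, t_end)
predicted = math.exp(-t_end) * f(z0)        # f should shrink along the ray through f(z0)
```

Comparing `f(z_end)` with `predicted` shows that the image point has moved toward 0 along the ray through `f(z0)` while keeping the same argument, up to integration error.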

The restriction to Pd (1) is not severe; it can always be accomplished via an affine change of coordinates depending only on the coefficients of f ; see [M], for example. We shall use the following standard result several times.


Lemma 2.1. (Koebe Lemma) Let $g : D_r(0) \to \mathbb{C}$ be univalent with g(0) = 0 and g′(0) = 1. For $z \in D_r(0)$ with s = |z|/r, we have

$$\frac{1-s}{(1+s)^3} \le |g'(z)| \le \frac{1+s}{(1-s)^3} \tag{2.1}$$

and

$$\frac{rs}{(1+s)^2} \le |g(z)| \le \frac{rs}{(1-s)^2}. \tag{2.2}$$

Consequently,

$$D_{r/4}(0) \subset g(D_r(0)). \tag{2.3}$$

Remark 2.2. The last statement (2.3) is known as the Koebe 1/4-Lemma. The proof can be found in [Ko], [P].

3. THE PATH-LIFTING ALGORITHM

We now discuss explicitly the path-lifting algorithm that we will use to find an approximate zero of an $f \in P_d(1)$.

Definition 3.1. Let $z_n \in \mathbb{C}$ be the nth iterate under Newton's method of the point $z_0 \in \mathbb{C}$, that is,

$$z_{n+1} = z_n - \frac{f(z_n)}{f'(z_n)}.$$

The point $z_0$ is called an approximate zero of f if

$$|z_{n+1} - z_n| \le \left(\tfrac{1}{2}\right)^{2^n - 1} |z_1 - z_0| \quad \text{for all } n > 0.$$

A sufficient condition for a point to be an approximate zero is developed in [K85] and [Sm86]. We will use the criterion formulated by Smale in [Sm86] to locate approximate zeros. It uses $\alpha : \mathbb{C} \setminus C_f \to \mathbb{C}$ defined by

$$\alpha(z) = \max_{j>1} \left| \frac{f(z)}{f'(z)} \right| \left| \frac{f^{(j)}(z)}{j!\, f'(z)} \right|^{\frac{1}{j-1}}. \tag{3.1}$$

It is sometimes useful to use the related function γ(z) instead, where

$$\gamma(z) = \max_{j>1} \left| \frac{f^{(j)}(z)}{j!\, f'(z)} \right|^{\frac{1}{j-1}}. \tag{3.2}$$

While we will primarily use α (z), we make use of γ (z) in section 6.

Theorem 3.2. [Sm86] There is a number α0 > 0.1307 such that if α (z) < α0 , the point z is an approximate zero.
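As an illustration of the criterion in Theorem 3.2, the following sketch evaluates α(z) from (3.1) for a polynomial given by its coefficients. The helper names `taylor_coefficients` and `alpha`, the repeated-Horner evaluation, and the sample polynomial z² − 1/4 are assumptions made for this example; it is not the O(d log² d) procedure referred to in the introduction, and it assumes the degree is at least 2 and f′(z) ≠ 0.

```python
ALPHA_0 = 0.1307  # threshold used in Theorem 3.2


def taylor_coefficients(coeffs, z):
    """Return [f(z), f'(z), f''(z)/2!, ..., f^(d)(z)/d!].

    coeffs lists the polynomial coefficients, highest degree first.
    Each pass of Horner's rule (synthetic division by (x - z)) peels
    off one Taylor coefficient of f at z.
    """
    a = [complex(c) for c in coeffs]
    n = len(a) - 1
    out = []
    for k in range(n + 1):
        for i in range(1, n - k + 1):
            a[i] += a[i - 1] * z
        out.append(a[n - k])
    return out


def alpha(coeffs, z):
    """alpha(z) = |f/f'| * max_{j>1} |f^(j) / (j! f')|^(1/(j-1))."""
    c = taylor_coefficients(coeffs, z)
    f, df = c[0], c[1]
    gamma = max(abs(c[j] / df) ** (1.0 / (j - 1)) for j in range(2, len(c)))
    return abs(f / df) * gamma


coeffs = [1, 0, -0.25]             # f(z) = z^2 - 1/4, roots at +-1/2
alpha_near = alpha(coeffs, 0.52)   # close to the root 1/2
alpha_far = alpha(coeffs, 1.5)     # a typical starting point on |z| = 1 + 1/d
```

For this example `alpha_near` falls below the threshold, so 0.52 is certified as an approximate zero, while `alpha_far` does not and the criterion gives no information at z = 1.5.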


Remark 3.3. The number $\alpha_0$ is given in [Sm86] and in many places throughout the literature as $\alpha_0 \approx 0.130707$. However, this specific value is very likely the result of a typographic error in the fifth decimal place. Smale shows in [Sm86] that $\alpha_0$ is a solution to the equation $(2r^2 - 4r + 1)^2 - 2r = 0$; the relevant root of this equation is 0.13071694.... There have been subsequent improvements to this constant (see [WH] or [WZ], for example), but 0.1307 suffices for our purposes.

Remark 3.4. Calculation of α(z) requires the ability to evaluate all derivatives of f at z. In some situations, this is not possible; for example, if f is defined as an n-fold composition of some other function g, calculation of f and f′ in terms of g and g′ is simple, but calculating even f′′ is impractical. However, evaluation of higher derivatives may be avoided using the bound [Sm86]:

γ (z)
0 around $u \in \mathcal{S}$ is denoted by $D_r(u)$.

Lemma 4.1. $u \in \operatorname{Vor}(v)$ if and only if $\pi : D_{|u-v|}(u) \to D_{|u-v|}(\pi(u))$ is an isometry. In particular, if $u \in \operatorname{Vor}(v)$ then $D_{|u-v|}(u) \cap V_f = \emptyset$.

Proof. If $u \in \operatorname{Vor}(v)$ then $D_{|u-v|}(u) \cap V_f = \emptyset$. Thus, π is a local isometry on all of $D_{|u-v|}(u)$, and in particular, π is a global isometry on this disk. Conversely, if π is an isometry on all of $D_{|u-v|}(u)$, there can be no critical values in the disk, and so $u \in \operatorname{Vor}(v)$.

FIGURE 4.1. The surface $\mathcal{S}$ for a degree 7 polynomial, viewed as a stack of seven slit planes. Each sheet is $\hat f(\mathrm{Basin}(\zeta_i))$ for the root $\zeta_i$, and is slit along $\sigma_{v_j}$ (dashed lines), which begin at the branch points $v_j \in V_f$ (indicated by crosses). The central ellipses indicate $\pi^{-1}(0)$. In the figure, $\sigma_{v_j}$ is labeled as $\sigma_j$. The Voronoi domains of each of the $v_j$ are indicated, with boundaries marked by heavy black lines. Note that while $\operatorname{Vor}(v_j)$ may enter many sheets, the projection is at most 2-to-1, as in Cor. 4.5. See also Figure 4.2.


Let $u_1, u_2 \in \mathcal{S}$. If the line segment $[\pi(u_1), \pi(u_2)] \subset \mathbb{C}$ has a lift in $\mathcal{S}$ which connects $u_1$ with $u_2$, we denote this lifted line segment by $[u_1, u_2]$. Observe that many pairs $u_1, u_2$ do not have such a connecting line segment. In this case we write $[u_1, u_2] = \emptyset$. When $[u_1, u_2]$ is nonempty, we say that $u_1$ is visible from $u_2$ in $\mathcal{S}$. Also observe, if $v \in V_f$ then $[u, v] \ne \emptyset$ for all $u \in \operatorname{Vor}(v)$.

We can form the visibility graph for $\mathcal{S}$ as follows. The vertices of the graph are the critical values $V_f$, and there is an edge from v to w if and only if $[v, w]$ is non-empty. We can identify the visibility graph with the subset of $\mathcal{S}$ given by

$$\mathcal{G} = \bigcup_{v, w \in V_f} [v, w].$$

Recall that $\hat f$ is a homeomorphism. Hence, $\hat f^{-1}(\mathcal{G})$ is well-defined, so we can also view $\mathcal{G}$ as a graph immersed in $\mathbb{C}$, with the critical points of f as vertices.

FIGURE 4.2. The Voronoi regions of Fig. 4.1 are shown in the source space. The roots of f are indicated by circles, the critical points by crosses. The Newton flow is indicated by the small arrows, and the dashed lines are the boundaries of the basins of each root (each such boundary contains a unique critical point). $\hat f^{-1}(\operatorname{Vor}(v_j))$ is shown for each critical point $c_j \in C_f$ (bounded by the heavy lines). In the figure, $\hat f^{-1}(\operatorname{Vor}(v_j))$ is labeled by Vor $c_j$. The visibility graph $\mathcal{G}$ is also shown.


We shall say that a critical value $w \in V_f$ is a neighbor of v if there is an edge between w and v in $\mathcal{G}$. Denote the set of neighbors of v by $N_v$. For each edge of $\mathcal{G}$, we can define the lines

$$L_{v,w} = \{ u \in \mathcal{S} \mid \operatorname{dist}(u, w) = \operatorname{dist}(u, v) \},$$

which are the geodesics passing perpendicularly through the midpoint of each $[v, w]$. In the terminology of [VPRS] and [BV], each of the $L_{v,w}$ is a mediatrix relative to the set $V_f$.

Problem 4.2. Which abstract graphs can be realized as the visibility graph $\mathcal{G}$ of a polynomial?

Lemma 4.3. For $u \in \operatorname{Vor}(v)$ and $w \in N_v$,

$$\big( [u, v] \setminus \{u\} \big) \cap L_{v,w} = \emptyset.$$

Proof. According to Lemma 4.1, the metric on $D_{|u-v|}(u) \subset \mathcal{S}$ is the usual Euclidean metric. This implies immediately that if

$$[u, v] \cap L_{v,w} \ne \emptyset$$

then either $w \in D_{|u-v|}(u)$ or $u \in L_{v,w}$; see Figure 4.3. If $w \in D_{|u-v|}(u)$, it cannot be in $V_f$.



FIGURE 4.3. By Lemma 4.3, if $u \in \operatorname{Vor}(v)$, then $[u, v]$ cannot cross $L_{v,w}$, since π is univalent on $D_{|u-v|}(u)$.

Lemma 4.3 can be used to describe the boundary of Voronoi domains. Specifically, for each $v \in V_f$, $\operatorname{Vor}(v)$ is the connected component of

$$\mathcal{S} \setminus \bigcup_{w \in N_v} L_{v,w}$$

which contains v. See Figures 4.1 and 4.2.

Recall that the ray $\ell_y \subset \mathbb{C}$ of a point $y \in \mathbb{C} \setminus \{0\}$ is the set of points which have the same argument as y.

If $\hat 0 \in \mathcal{S}$ projects onto 0 and $[\hat 0, u] \ne \emptyset$, the geodesic starting at $\hat 0$ and containing $[\hat 0, u]$ is the ray through $u \in \mathcal{S}$, which we denote by $\hat\ell_u$. Observe that if $\hat\ell_u \cap V_f = \emptyset$ then $\pi : \hat\ell_u \to \ell_{\pi(u)}$ is a surjective isometry. Let y = π(u). If $\ell_y \cap f(C_f) = \emptyset$, then

$$\pi^{-1}(\ell_y) = \hat\ell_{y_1} \cup \hat\ell_{y_2} \cup \dots \cup \hat\ell_{y_d},$$

where the points $y_i \in \mathcal{S}$ are the d different preimages of y.

Proposition 4.4. Given $v \in V_f$ and $y \in \mathbb{C} \setminus f(C_f)$. Then

$$\operatorname{card}\{ i \mid \hat\ell_{y_i} \cap \operatorname{Vor}(v) \ne \emptyset \} \le m_v + 1.$$

Furthermore, each $\hat\ell_{y_i} \cap \operatorname{Vor}(v)$ is a connected set.

Proof. Suppose $\hat\ell_{y_1}, \hat\ell_{y_2}, \dots, \hat\ell_{y_k}$ intersect $\operatorname{Vor}(v)$, with $v = \hat f(c)$, $c \in C_f$. Pick a point $u_i$ in each of these intersections, that is,

$$u_i \in \hat\ell_{y_i} \cap \operatorname{Vor}(v).$$

FIGURE 4.4. As proven in Proposition 4.4, the projection π is $(m_v + 1)$-to-one on $\operatorname{Vor}(v)$.

Let $D_i = D_{|v - u_i|}(u_i)$. According to Lemma 4.1, we know that $\pi : D_i \to \pi(D_i)$ is an isometry. Let $p_i \in \hat\ell_{y_i}$ be the perpendicular projection of v onto $\hat\ell_{y_i}$ and let p be the projection of f(c) = π(v) onto $\ell_y$. Then for all i ≤ k,

$$\emptyset \ne [v, p_i] \subset D_i \quad \text{and} \quad \emptyset \ne [\pi(v), p] \subset \bigcap_{i \le k} \pi(D_i).$$

Hence, $\pi : \mathcal{S} \to \mathbb{C}$ is k-to-1 in a neighborhood of $v \in \mathcal{S}$, with $k \le m_c + 1$. The connectedness of $\hat\ell_{y_i} \cap \operatorname{Vor}(v)$ follows from the triangle inequality.

Corollary 4.5. Each projection $\pi : \operatorname{Vor}(v) \to \mathbb{C}$ is at most $(m_v + 1)$-to-one.

Let $z \in \mathbb{C}$. We'll say that a critical point $c \in C_f$ influences the orbit of z if the segment $[\hat 0, \hat f(z)]$ passes through $\operatorname{Vor}(\hat f(c))$.

We are interested in the critical points which influence the starting values for our algorithm, and, conversely, the starting values which are influenced by a given critical point.

Definition 4.6. For starting values z on the circle of radius r, we define the following sets:

$$I = \{ (t, c) \in [0, 1] \times C_f \mid [\hat 0, \hat f(re^{2\pi i t})] \cap \operatorname{Vor}(\hat f(c)) \ne \emptyset \},$$
$$I_t = \{ c \in C_f \mid (t, c) \in I \},$$
$$I_c = \{ t \in [0, 1] \mid (t, c) \in I \}.$$

Notice that, for $z = re^{2\pi i t}$ fixed, we have $c \in I_t$ precisely when, for some $y \in \ell_{f(z)}$, $D_{|f(c) - y|}(y)$ is the largest ball on which $f_z^{-1}$ is defined. Similarly, for this pair (t, c), we also have $t \in I_c$.

5. THE BEHAVIOR OF f ON THE INITIAL CIRCLE

Consider the function $a_r : [0, 1) \to \mathbb{R}$ defined by

$$a_r(t) = \operatorname{Arg} f(re^{2\pi i t}),$$

with r > 0. We can easily bound the rate of change of $a_r(t)$; while elementary, these bounds play a crucial role for us.

Lemma 5.1. Let r > 1. Then for all t ∈ [0, 1), we have

$$2\pi d \cdot \frac{r}{r+1} \le \frac{da_r}{dt} \le 2\pi d \cdot \frac{r}{r-1}.$$

Proof. Let $z = re^{2\pi i t}$, with r > 1. Since |ζ| < 1, we have $\zeta / z \in D_{1/r}(0) = \{ w \mid |w| \le \tfrac{1}{r} \}$. A calculation shows

$$\frac{da_r}{dt} = \frac{d}{dt} \operatorname{Im} \log f(re^{2\pi i t}) = 2\pi \cdot \operatorname{Re} \sum_{j=1}^{d} \frac{z}{z - \zeta_j} = 2\pi \cdot \operatorname{Re} \sum_{j=1}^{d} \frac{1}{1 - \zeta_j / z}. \tag{5.1}$$

For each root $\zeta_i$, we have

$$\frac{r}{r+1} \le \operatorname{Re} \frac{1}{1 - \zeta_i / z} \le \frac{r}{r-1}.$$

Summing this inequality over the d roots and applying it to equation (5.1) gives the desired result.
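The bounds of Lemma 5.1 are easy to test numerically using formula (5.1). The sketch below samples da_r/dt on the circle of radius r = 1 + 1/d and compares it with the two bounds; the three roots and the sample count are arbitrary choices made for illustration.

```python
import cmath
import math


def darg_dt(roots, r, t):
    """da_r/dt from (5.1): 2*pi * Re sum_j z / (z - zeta_j), with z = r e^{2 pi i t}."""
    z = r * cmath.exp(2j * math.pi * t)
    return 2 * math.pi * sum((z / (z - zeta)).real for zeta in roots)


roots = [0.5, -0.3 + 0.4j, -0.2 - 0.6j]   # hypothetical roots, all inside the unit disk
d = len(roots)
r = 1 + 1 / d

lower = 2 * math.pi * d * r / (r + 1)
upper = 2 * math.pi * d * r / (r - 1)
values = [darg_dt(roots, r, k / 1000) for k in range(1000)]
```

Every sampled value lies between `lower` and `upper`, so the argument of f increases monotonically along the initial circle at a rate controlled by d and r only.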


Remark 5.2. The estimates in Lemma 5.1 are sharp.

Corollary 5.3. Let r = 1 + 1/d, and define

$$B_A = \left\{ t \in [0, 1) \;\middle|\; \left| \operatorname{Arg} \frac{\hat f(re^{it})}{\hat f(c)} \right| < A, \text{ for all } c \in C_f \right\}.$$

Then

$$\operatorname{measure}(B_A) \le \frac{2A}{\pi} \cdot \frac{d-1}{d}.$$

Remark 5.4. Let $G_A$ be the complement of $B_A$. For each $t \in G_A$, $f^{-1}_{re^{it}}$ will be analytic in a cone

$$\{ w \mid |\operatorname{Arg}(w) - \operatorname{Arg}(f(re^{it}))| < A \},$$

and consequently such t correspond to "good starting points" for a path-lifting algorithm. This is essentially Condition H of [Sm85] and [SS86], with A = π/12. It is shown in those papers (Prop. 2) that $G_{\pi/12}$ has measure at least 1/6 if one also takes r = 3/2 (which increases the number of steps by approximately d log(3/2)). Corollary 5.3 gives the measure of $G_{\pi/12}$ to be at least 5/6.

Lemma 5.5. Let c be a critical point on the boundary of Basin(ζ), and let $\gamma_c$ be the solution to the Newton flow emanating from c whose interior lies in Basin(ζ). Then if r > 1, $\gamma_c \cap S_r = \emptyset$.

Proof. Note that the Newton flow points inward on $S_r = \{ z \mid |z| = r \}$ for r > 1, which follows from the observation that

$$\frac{f(z)}{f'(z)} = \frac{1}{\sum_i \frac{1}{z - \zeta_i}}.$$

This immediately implies Lemma 5.5. To see this, note that since |z| > 1 and $|\zeta_i| \le 1$, the vectors $z - \zeta_i$ all lie in the half-plane H which contains $D_r$. Consequently, their inverses and hence their sum $\sum 1/(z - \zeta_i)$ lie in the complementary half-plane. Inverting again gives $f(z)/f'(z) \in H$, as required.

We can now use the previous lemmas to estimate the width of the "necks" of Basin(ζ).

Lemma 5.6. Let r > 1, $\zeta \in Z_f$, and let γ be a connected component of $S_r \cap \mathrm{Basin}(\zeta)$. Then

$$\operatorname{length}(\gamma) \cdot \frac{1}{2\pi} \cdot \min \frac{da_r}{dt} \le 2\pi r.$$

Proof. Let $B \subset \partial\,\mathrm{Basin}(\zeta)$ be a boundary component of Basin(ζ) with $\gamma \cap B \ne \emptyset$. Let $c \in C_f \cap B$ be the critical point which has an orbit $\gamma_c \subset \mathrm{Basin}(\zeta)$ of the Newton flow starting at c and ending at ζ. Observe that

$$f(\gamma_c \cup B) = (0, \infty) \cdot f(c) = \ell_{f(c)},$$

the ray through f(c). From the definition of γ and Lemma 5.5 we get $\operatorname{int}(\gamma) \cap (B \cup \gamma_c) = \emptyset$. Hence,

$$\operatorname{Arg}(f(\operatorname{int}(\gamma))) \cap \operatorname{Arg}(f(c)) = \emptyset,$$

that is, the image of γ cannot make more than a full turn in the target space. The Lemma follows.


The following corollary follows immediately from the proof.

Corollary 5.7. Let $z_1$ and $z_2$ satisfy $|z_1| = |z_2| = r$ with r > 1, and suppose also that they lie in the same connected component of $S_r \cap \mathrm{Basin}(\zeta)$. Then $|\operatorname{Arg} f(z_1) - \operatorname{Arg} f(z_2)| < 2\pi$.

In the sequel we will consider integrals over the circle $S_r = \{ z \in \mathbb{C} \mid |z| = r \}$, which, for all r > 0, carries Lebesgue measure with unit mass.

Lemma 5.8. Let r > 0 and |ζ| < r. Then

$$\int_0^1 \log |re^{2\pi i t} - \zeta| \, dt = \log r.$$

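Proof. Define

$$S(\zeta) = \int_0^1 \log |re^{2\pi i t} - \zeta| \, dt = \int_{S_r} \operatorname{Re}(\log(z - \zeta)) \cdot \frac{1}{2\pi i} \frac{dz}{z} = \operatorname{Re} \frac{1}{2\pi i} \int_{S_r} \log(z - \zeta) \cdot \frac{dz}{z}.$$

Note that

$$\frac{dS}{d\zeta} = -\operatorname{Re} \frac{1}{2\pi i} \int_{S_r} \frac{1}{z - \zeta} \frac{dz}{z} = -\operatorname{Re} \frac{1}{2\pi i} \int_{S_r} \left( \frac{1/\zeta}{z - \zeta} - \frac{1/\zeta}{z} \right) dz = 0.$$

Hence,

$$S(\zeta) = S(0) = \log r.$$

Corollary 5.9. Let $f(z) = \prod_{j=1}^{d} (z - \zeta_j)$, with $|\zeta_j| < r$. Then

$$\int_0^1 \log |f(re^{2\pi i t})| \, dt = d \log r.$$

Proof.

$$\int_0^1 \log |f(re^{2\pi i t})| \, dt = \int_0^1 \log \Big| \prod_{j=1}^{d} (re^{2\pi i t} - \zeta_j) \Big| \, dt = \sum_{j=1}^{d} \int_0^1 \log |re^{2\pi i t} - \zeta_j| \, dt = d \log r,$$

where the last equality follows from Lemma 5.8.

Remark 5.10. Notice that if r = 1 + 1/d, we have d log r < 1.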
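Corollary 5.9 and Remark 5.10 can be sanity-checked by averaging log |f| over equally spaced points on the circle. In the sketch below the roots, the helper name `mean_log_abs`, and the number of sample points are illustrative assumptions; the equally spaced average is used as a stand-in for the integral.

```python
import cmath
import math


def mean_log_abs(roots, r, samples=4096):
    """Average of log|f(r e^{2 pi i t})| over equally spaced t in [0, 1)."""
    total = 0.0
    for k in range(samples):
        z = r * cmath.exp(2j * math.pi * k / samples)
        total += sum(math.log(abs(z - zeta)) for zeta in roots)
    return total / samples


roots = [0.5, -0.3 + 0.4j, -0.2 - 0.6j]   # hypothetical roots with |zeta_j| < 1
d = len(roots)
r = 1 + 1 / d

average = mean_log_abs(roots, r)
expected = d * math.log(r)                 # value predicted by Corollary 5.9
```

The two numbers agree to high accuracy, and `expected` is below 1 as stated in Remark 5.10.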




Problem 5.11. The previous corollary shows that the average value of log |f(z)| on $S_r$ is d log r. Is there a constant $c_r$ independent of d so that

$$\operatorname{measure}\{ t \mid \log |f(re^{2\pi i t})| < d \log r \} > c_r\,?$$

We now establish a lower bound on $|w_0| = |f(z_0)|$ for starting values $z_0$ on the circle $S_r$ with r > 1. We shall use this in Lemma 6.9 to give a bound on the size of our final point $w_N$.

Proposition 5.12. Let z ∈ Basin(ζ) with |z| = r > 1. There exists $s_r < 1$ such that

$$|f(z)| \ge s_r \cdot \rho_\zeta,$$

where $\rho_\zeta$ is the radius of convergence of the branch of $f^{-1}$ taking 0 to ζ. If $r > 1 + \frac{2\pi}{d}$, $s_r = \frac{1}{4}$. Otherwise, for $r = 1 + \frac{C}{d}$, $s_r$ is the smallest positive solution of

$$C = 8\pi \frac{s}{(1-s)^2}.$$

Remark 5.13. For 0 < C ≤ 2π, we have $0 < s_r \le 3 - \sqrt{8}$. For C = 1, we have $s_r \approx 0.0369 > \frac{1}{28}$.
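The constant $s_r$ of Proposition 5.12 is the smaller root of a quadratic: multiplying C = 8πs/(1−s)² out gives C s² − (2C + 8π) s + C = 0. The sketch below solves it in closed form; the function name `s_r` is a hypothetical helper, and the rationalized form of the root is chosen only to avoid cancellation.

```python
import math


def s_r(C):
    """Smallest positive solution s of C = 8*pi*s / (1 - s)**2, for C > 0."""
    b = 2 * C + 8 * math.pi
    disc = math.sqrt(b * b - 4 * C * C)
    # Smaller root of C s^2 - b s + C = 0, written as 2C / (b + sqrt(b^2 - 4C^2)).
    return 2 * C / (b + disc)


s_one = s_r(1.0)              # the case r = 1 + 1/d used for the main theorem
s_max = s_r(2 * math.pi)      # the endpoint C = 2*pi of Remark 5.13
```

Here `s_one` reproduces the value quoted in Remark 5.13 and exceeds 1/28, and `s_max` equals 3 − √8 up to rounding.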

Proof. We will assume, without loss of generality, that the x-axis is aligned along ζ. Let l be the radius of the largest disk centered at ζ on which f is univalent, that is, $D_l(\zeta) \subset f^{-1}(D_{\rho_\zeta}(0)) \subset \mathrm{Basin}(\zeta)$. The Koebe 1/4-Lemma (Lemma 2.1) implies

$$l \ge \frac{\rho_\zeta}{|f'(\zeta)|} \cdot \frac{1}{4}. \tag{5.2}$$

FIGURE 5.1. Using the Koebe Lemma to calculate a lower bound on |f(z)| for z on $S_r$, in Proposition 5.12.


Let z be a point in Basin(ζ) with |z| = r. It is our goal to estimate |z − ζ|. First notice that if l ≤ |z − ζ|, the desired result follows immediately from (5.2). Thus, we need only consider the case when l > |z − ζ|. This means that we can assume that $z \in D_l(\zeta)$, and since |z| > 1, there is a point $A \in S_1 \cap D_l(\zeta)$; let φ be the angle of the sector connecting 0, A, and 1. See Figure 5.1.

The Koebe Lemma gives an upper bound on |z − ζ|; for $s \le \frac{|f(z)|}{\rho_\zeta}$,

$$|z - \zeta| \le \frac{1}{|f'(\zeta)|} \cdot \rho_\zeta \cdot \frac{s}{(1-s)^2}. \tag{5.3}$$

We now look for a lower bound on |z − ζ| by estimating $\frac{|z - \zeta|}{l}$ for $z \in S_r \cap D_l(\zeta)$. Notice that

$$l = \sqrt{\zeta^2 - 2\zeta \cos(\varphi) + 1},$$

since

$$(\cos\varphi - \zeta)^2 + \sin^2\varphi = l^2,$$

where (cos(φ), sin(φ)) is the coordinate of the point A on $S_l(\zeta) \cap S_1$. From Corollary 5.7, we have

$$\operatorname{Arg}(f(A)) - \operatorname{Arg}(f(\bar A)) \le 2\pi,$$

and by Lemma 5.1 (which bounds the radial derivative of f), we have

$$\varphi = \operatorname{Arg}(A) \le \frac{\pi}{d} \cdot \frac{r+1}{r} \le \frac{2\pi}{d}, \quad \text{for all } r > 1.$$

Since $r = 1 + \frac{C}{d}$, we have

$$\frac{|z - \zeta|}{l} \ge \frac{1 + \frac{C}{d} - \zeta}{\sqrt{\zeta^2 - 2\zeta\cos(\varphi) + 1}} \ge \frac{1 + \frac{C}{d} - \zeta}{\sqrt{\zeta^2 - 2\zeta\cos(\frac{2\pi}{d}) + 1}}.$$

Notice that for 0 < C < 2π and |ζ| ≤ 1, the above expression is minimized when ζ = 1. Hence, we have

$$\frac{|z - \zeta|}{l} \ge \frac{\frac{C}{d}}{\sqrt{1 - 2\cos(\frac{2\pi}{d}) + 1}} \ge \frac{C}{2\pi},$$

for all d. This gives us

$$|z - \zeta| \ge \frac{C\, l}{2\pi} \ge \frac{C}{2\pi} \cdot \frac{\rho_\zeta}{4 |f'(\zeta)|}.$$

This, together with the upper bound estimate (5.3), gives the lower bound on s as the solution to

$$\frac{C}{2\pi} \cdot \frac{\rho_\zeta}{4 |f'(\zeta)|} \le \frac{\rho_\zeta}{|f'(\zeta)|} \cdot \frac{s}{(1-s)^2}.$$

This simplifies to

$$C = 8\pi \frac{s}{(1-s)^2},$$

as desired.


6. THE SIZE OF THE STEP

Each iterate of the algorithm described in §3 is guided by the values $w_n$. The difference between $w_{n+1}$ and $w_n$ is called the nth jump and is denoted by

$$J_n = A \cdot \frac{|f_n|}{\alpha_n},$$

where $f_n = f(z_n)$ and $\alpha_n = \alpha(z_n)$. To be able to control the algorithm we have to carefully adjust the range of the coefficient A. In this section we will explain the choice $A = \frac{1}{15}$.

If f were linear, the algorithm would follow $w_n$ exactly, and $f_n \equiv w_n$. When the degree of f is at least 2, there will be a small error $\delta_n = |f_n - w_n|$. While the algorithm is described in terms of $\mathbb{C}_{\mathrm{source}}$ (the $z_n$) and $\mathbb{C}_{\mathrm{target}}$ ($f(z_n)$ and the $w_n$), it is more straightforward to think of it in terms of the branched surface $\mathcal{S}$. Let $r_n \ge 0$ be maximal such that

$$f_{z_0}^{-1} : D_{r_n}(w_n) \to U$$

is univalent, where U is a neighborhood of $z_n$. This is the distance between $\hat w_n \in \mathcal{S}$ and the critical value $v \in V_f$ for which $\hat w_n \in \operatorname{Vor}(v)$. Also, let $R_n \ge 0$ be maximal such that

$$f_{z_0}^{-1} : D_{R_n}(f_n) \to V$$

is univalent, where V is a neighborhood of $z_n$. Note that $\hat f_n$ could be in $\operatorname{Vor}(v')$ for a critical value different from that used for $\hat w_n$; in this case, we still use $R_n = |v' - f_n|$.

FIGURE 6.1. The various notations used throughout this section, shown in the target space.
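To make the roles of $w_n$, $z_n$ and $J_n$ concrete, here is a minimal sketch of the iteration as it is described in the introduction and in this section: stop when α(z_n) < 0.1307, otherwise move the guide point toward 0 by $J_n = A|f_n|/\alpha_n$ and take one Newton step for f − w_{n+1}. All function names, the sample roots, the starting angle, and the step cap are assumptions made for illustration; the authors' precise formulation in §3 may differ in details.

```python
import cmath
import math

ALPHA_0 = 0.1307      # threshold from Theorem 3.2
A = 1.0 / 15.0        # jump coefficient chosen in this section


def poly_from_roots(roots):
    """Coefficients (highest degree first) of prod_j (z - zeta_j)."""
    coeffs = [1 + 0j]
    for zeta in roots:
        shifted = coeffs + [0j]            # multiply the current polynomial by z
        for i, c in enumerate(coeffs):
            shifted[i + 1] -= zeta * c     # subtract zeta times the current polynomial
        coeffs = shifted
    return coeffs


def taylor_coefficients(coeffs, z):
    """[f(z), f'(z), f''(z)/2!, ...] by repeated Horner passes."""
    a = list(coeffs)
    n = len(a) - 1
    out = []
    for k in range(n + 1):
        for i in range(1, n - k + 1):
            a[i] += a[i - 1] * z
        out.append(a[n - k])
    return out


def alpha_data(coeffs, z):
    """Return (alpha(z), f(z), f'(z)); assumes degree >= 2 and f'(z) != 0."""
    c = taylor_coefficients(coeffs, z)
    f, df = c[0], c[1]
    gamma = max(abs(c[j] / df) ** (1.0 / (j - 1)) for j in range(2, len(c)))
    return abs(f / df) * gamma, f, df


def alpha_step(coeffs, z0, max_steps=100000):
    """Follow the ray from w_0 = f(z_0) toward 0 until alpha(z_n) < ALPHA_0."""
    z = z0
    w = taylor_coefficients(coeffs, z)[0]          # w_0 = f(z_0)
    for n in range(max_steps):
        a, f, df = alpha_data(coeffs, z)
        if a < ALPHA_0:
            return z, n
        jump = A * abs(f) / a                      # J_n = A |f_n| / alpha_n
        w -= jump * w / abs(w)                     # w_{n+1}: move toward 0 along the ray
        z -= (f - w) / df                          # Newton step for f(z) - w_{n+1}
    raise RuntimeError("no approximate zero found within max_steps")


def newton_polish(coeffs, z, iterations=50):
    """Plain Newton iteration, used here only to check the returned point."""
    for _ in range(iterations):
        c = taylor_coefficients(coeffs, z)
        if c[0] == 0:
            break
        z -= c[0] / c[1]
    return z


roots = [0.5, -0.3 + 0.4j, -0.2 - 0.6j]            # hypothetical roots in the unit disk
coeffs = poly_from_roots(roots)
d = len(roots)
z0 = (1 + 1 / d) * cmath.exp(2j * math.pi * 0.1)   # start on the circle |z| = 1 + 1/d
z_star, steps = alpha_step(coeffs, z0)
z_root = newton_polish(coeffs, z_star)
```

The point `z_star` returned by the loop satisfies the α-test, and polishing it with ordinary Newton iteration lands on one of the prescribed roots; `steps` counts the iterations whose average the main theorem bounds.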


The following Proposition is a crucial ingredient for the estimate of the average cost.

Proposition 6.1.

$$J_n \ge \frac{1}{66} \cdot r_n.$$

We need to do a little bit of work before proving this proposition. Let

$$\varepsilon_n = z_{n+1} - z_n \quad \text{and} \quad h_n = -(z_{n+1} - z_n) \cdot \frac{f_n'}{f_n} = -\varepsilon_n \cdot \frac{f_n'}{f_n}.$$

We use $f_n' = f'(z_n)$, $f_n'' = f''(z_n)$, and $f_n^{(j)} = f^{(j)}(z_n)$ as notation for the derivatives of f at $z_n$.

Lemma 6.2. If $|\alpha_n h_n| < 1$ then

$$\delta_{n+1} = |f_{n+1} - w_{n+1}| \le |h_n f_n| \cdot \frac{|\alpha_n h_n|}{|1 - \alpha_n h_n|}.$$

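Proof. Note that since

$$z_{n+1} = z_n - \frac{f_n - w_{n+1}}{f_n'},$$

we have

$$w_{n+1} = f_n + (z_{n+1} - z_n) f_n' = (1 - h_n) f_n.$$

Thus,

$$\begin{aligned}
\delta_{n+1} &= |f_{n+1} - (1 - h_n) f_n| = |f(z_n + \varepsilon_n) - (1 - h_n) f_n| \\
&= \left| f_n + f_n' \varepsilon_n + \frac{f_n''}{2!} \varepsilon_n^2 + \cdots - f_n + h_n f_n \right| \\
&= \left| \frac{f_n''}{2!} \varepsilon_n^2 + \frac{f_n^{(3)}}{3!} \varepsilon_n^3 + \cdots \right| \\
&= |h_n f_n| \cdot \left| \frac{f_n''}{2!\, f_n'} \varepsilon_n + \frac{f_n^{(3)}}{3!\, f_n'} \varepsilon_n^2 + \cdots \right| \\
&\le |h_n f_n| \cdot \left( \left| \alpha_n \frac{f_n'}{f_n} \varepsilon_n \right| + \left| \alpha_n \frac{f_n'}{f_n} \varepsilon_n \right|^2 + \cdots \right) \\
&\le |h_n f_n| \cdot \left( |\alpha_n h_n| + |\alpha_n h_n|^2 + \cdots \right) \\
&\le |h_n f_n| \cdot \frac{|\alpha_n h_n|}{|1 - \alpha_n h_n|}.
\end{aligned}$$

The proof of the following can be found in [BCSS] (Lemma 8.2b and Prop 8.3b). Here $\gamma_n = \gamma(z_n)$ is as in equation (3.2); thus $\alpha_n = \left| \frac{f_n}{f_n'} \right| \gamma_n$.

Lemma 6.3. Let $u_n = \alpha_n h_n$ and $\psi(u) = 1 - 4u + 2u^2$. Then if $u_n < 1 - 1/\sqrt{2}$, we have

$$\frac{\gamma_{n+1}}{\gamma_n} \le \frac{1}{(1 - u_n)\psi(u_n)} \quad \text{and} \quad \left| \frac{f_n'}{f_{n+1}'} \right| \le \frac{(1 - u_n)^2}{\psi(u_n)}.$$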


Remark 6.4. In [BCSS], $u_n$ is defined as $(z_{n+1} - z_n)\gamma_n$. We use

$$h_n = \frac{f_n - w_{n+1}}{f_n} = -(z_{n+1} - z_n) \frac{f_n'}{f_n},$$

and so our usage and that of [BCSS] agree.

The proof of Proposition 6.1 will use induction. Given a choice for the positive numbers A and c we will use the following induction hypothesis

$$\mathrm{Ind}_n(A, c) : \quad \delta_n \le c \cdot \frac{|f_n'|}{\gamma_n}. \tag{6.1}$$

The constants A, c > 0 will be chosen later. The optimization process is better illustrated by using these general parameters instead of our final value A = 1/15.

Lemma 6.5. The induction hypothesis $\mathrm{Ind}_n(A, c)$ implies $|\alpha_n h_n| \le A + c$.

Proof. Observe,

$$|h_n f_n| = |f_n - w_{n+1}| \le |w_n - w_{n+1}| + |f_n - w_n| \le J_n + \delta_n \le A \cdot \frac{|f_n|}{\alpha_n} + c \cdot \frac{|f_n|}{\alpha_n} = (A + c) \cdot \frac{|f_n|}{\alpha_n}.$$

So that we may apply Lemma 6.3, we impose the condition

$$A + c < 1 - \frac{1}{\sqrt{2}}.$$

By virtue of Lemma 6.5, this condition also ensures that the hypothesis of Lemma 6.2 is satisfied. Assume $\mathrm{Ind}_n(A, c)$. We will prepare the induction step. From the proof of Lemma 6.5, we have

$$|h_n f_n| \le (A + c) \frac{|f_n'|}{\gamma_n}.$$

In Lemma 6.2, we obtained

$$\delta_{n+1} \le |h_n f_n| \cdot \left| \frac{\alpha_n h_n}{1 - \alpha_n h_n} \right| \le (A + c) \cdot \frac{|f_n'|}{\gamma_n} \cdot \frac{u_n}{1 - u_n}.$$

Consequently, a sufficient condition which implies $\mathrm{Ind}_{n+1}(A, c)$ is

$$(A + c) \cdot \frac{|f_n'|}{\gamma_n} \cdot \frac{u_n}{1 - u_n} \le c \cdot \frac{|f_{n+1}'|}{\gamma_{n+1}},$$

or equivalently,

$$(A + c) \cdot \frac{u_n}{1 - u_n} \cdot \frac{\gamma_{n+1}}{\gamma_n} \cdot \frac{|f_n'|}{|f_{n+1}'|} \cdot \frac{1}{c} < 1.$$


From Lemma 6.3, after simplification we get

$$(A + c) \cdot \frac{u_n}{\psi(u_n)^2} \cdot \frac{1}{c} < 1.$$

Since $u_n \le A + c$ and u/ψ(u) increases monotonically for $u \in [0, 1 - 1/\sqrt{2}]$, we must have

$$\frac{(A + c)^2}{\psi(A + c)^2} \cdot \frac{1}{c} < 1. \tag{6.2}$$

We have established the following.

Lemma 6.6. If (A, c) satisfies (6.2) then $\mathrm{Ind}_n(A, c) \implies \mathrm{Ind}_{n+1}(A, c)$.

The iterations are guided by the points $w_n$ which decrease towards 0 with jumps

$$J_n = A \cdot \frac{|f_n|}{\alpha_n}.$$

To optimize this convergence we need to find the largest A > 0 for which there is a c > 0 such that the pair (A, c) satisfies inequality (6.2). Numerics show that such solutions exist for A < 0.0703039 < 1/14.22396; we can use A = 1/15 and c = 0.0158. Recall,

$$\delta_0 = 0 < 0.0158 \cdot \frac{|f_0|}{\alpha_0}.$$

So $\mathrm{Ind}_0(\frac{1}{15}, 0.0158)$ holds. Then Lemma 6.6 implies that

$$\delta_n \le 0.0158 \cdot \frac{|f_n|}{\alpha_n} \tag{6.3}$$

holds for all n ≥ 0.
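The admissibility of the pair A = 1/15, c = 0.0158 can be confirmed by evaluating (6.2) directly, together with the side condition A + c < 1 − 1/√2 needed for Lemma 6.3. The function names below are hypothetical helpers, and the second pair (with c = 0.01) is included only as a contrasting example chosen here.

```python
import math


def psi(u):
    """psi(u) = 1 - 4u + 2u^2, as in Lemma 6.3."""
    return 1 - 4 * u + 2 * u * u


def satisfies_6_2(A, c):
    """Check A + c < 1 - 1/sqrt(2) and inequality (6.2) for the pair (A, c)."""
    s = A + c
    if not (0 < s < 1 - 1 / math.sqrt(2)):
        return False
    return (s * s) / (psi(s) ** 2) / c < 1


chosen_ok = satisfies_6_2(1 / 15, 0.0158)    # the pair used in the text
smaller_c_ok = satisfies_6_2(1 / 15, 0.01)   # same A, smaller c
```

With A = 1/15 the chosen c passes the test, while the smaller c = 0.01 does not, which illustrates that c cannot be taken arbitrarily small for a fixed A.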

The proof of Proposition 6.1 uses the following Lemma. This is essentially Corollary 4.3 of [K88]; the lower bound of 1/4 follows from the Extended Löwner's Theorem in [Sm81].

Lemma 6.7.

$$\frac{1}{4} \cdot R_n \le \frac{|f_n|}{\alpha_n} \le \frac{1}{3 - 2\sqrt{2}} \cdot R_n.$$

With these lemmas in hand, we can now return to the proof of Prop. 6.1:

Proof of Proposition 6.1. From Lemma 6.7, we get

$$J_n = A \cdot \frac{|f_n|}{\alpha_n} \ge \frac{|f_n|}{15} \cdot \frac{R_n}{4 |f_n|} = \frac{1}{60} \cdot \frac{R_n}{r_n} \cdot r_n.$$

The radius of convergence at $w_n$ is

$$r_n = |w_n - v_n|,$$


where $v_n$ is the critical value for which $\hat{w}_n \in \mathcal{S}$ lies in $\mathrm{Vor}(v_n)$. It might be that the radius at $f_n$ is determined by another critical value $v_n'$, say

\[ R_n = |f_n - v_n'|. \]

Let $r_n' = |w_n - v_n'|$. Then we have

\[ r_n \le r_n' \le |v_n' - f_n| + |f_n - w_n| = R_n + \delta_n. \]

In the case when $v_n = v_n'$ we get the same estimate for $r_n$. Notice, by using (6.3) and Lemma 6.7,

\[ r_n \le R_n + \delta_n \le R_n + 0.0158 \cdot \frac{|f_n|}{\alpha_n} \le R_n + \frac{0.0158}{3 - 2\sqrt{2}}\, R_n \le 1.09209 \cdot R_n. \]

Consequently, we have

\[ J_n \ge \frac{r_n}{1.09209 \cdot 60} > \frac{r_n}{66}, \]

as desired.
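The constants in this proof can be rechecked numerically. The following sketch only redoes the arithmetic quoted above (the ratio $r_n/R_n \le 1.09209$ and the final denominator below 66); the variable names are illustrative and not part of the paper.

```python
import math

C_DELTA = 0.0158      # constant c from (6.3)
A = 1.0 / 15.0        # jump parameter A

# r_n <= R_n + delta_n <= (1 + c / (3 - 2*sqrt(2))) * R_n
ratio = 1.0 + C_DELTA / (3.0 - 2.0 * math.sqrt(2.0))

# J_n >= (A / 4) * R_n >= (A / (4 * ratio)) * r_n, i.e. J_n >= r_n / denom
denom = 4.0 * ratio / A

print(ratio, denom)
```

The value of `denom` is $60 \cdot$ `ratio`, which is what is rounded up to 66 in the statement $J_n > r_n/66$.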



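Lemma 6.8. If $\alpha_n > 0.1307$, then

\[ |f_n| \le 1.1376\, |w_n| \quad\text{and}\quad |w_{n+1}| \ge 0.41982\, |w_n|. \]

Proof. Observe,

\[ |f_n| \le |w_n| + \delta_n \le |w_n| + 0.0158 \cdot \frac{|f_n|}{\alpha_n}. \]

Hence,

\[ |f_n| \le \frac{1}{1 - \frac{0.0158}{\alpha_n}}\, |w_n| \le 1.1376\, |w_n|. \]

Now,

\[ |w_{n+1}| = |w_n| - \frac{1}{15} \cdot \frac{|f_n|}{\alpha_n} \ge |w_n| \cdot \left( 1 - \frac{1}{15} \cdot \frac{1}{\alpha_n - 0.0158} \right) \ge |w_n| \cdot \left( 1 - \frac{1}{15} \cdot \frac{1}{0.1307 - 0.0158} \right) \ge 0.41982\, |w_n|. \]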



Using these results, we can also obtain a relationship between the guide point wN where the algorithm terminates and ρζ , the norm of the closest critical value to 0. Recall that αN ≤ 0.1307 but αN−1 > 0.1307.

Lemma 6.9. For $r \ge 1 + \frac{1}{d}$,

\[ |w_N| \ge \frac{1}{87} \cdot \rho_\zeta. \]

Proof. From Proposition 5.12, we have

\[ |w_0| \ge s_r \cdot \rho_\zeta \ge \frac{\rho_\zeta}{28}. \]

If $w_N = w_0$, the lemma holds trivially. If $N > 0$, then $\alpha_{N-1} \ge 0.1307$ (and $\alpha_N \le 0.1307$). From Lemma 6.7, we get

\[ |f_{N-1}| \ge \frac{1}{4} \cdot \alpha_{N-1} \cdot R_{N-1} \ge 0.032675 \cdot R_{N-1} > \frac{R_{N-1}}{31} \ge \frac{1}{31} \cdot \bigl( \rho_\zeta - |f_{N-1}| \bigr). \]

This last inequality follows from the triangle inequality: if $v$ is the critical value with $|v| = \rho_\zeta$, then $0$, $v$, and $f_{N-1}$ form a triangle with side lengths $\rho_\zeta$, $R_{N-1}$, and $|f_{N-1}|$. Rewriting the above yields

\[ |f_{N-1}| \ge \frac{1}{32} \cdot \rho_\zeta. \tag{6.4} \]

We now apply Lemma 6.8 to obtain

\[ |w_N| \ge 0.41982 \cdot |w_{N-1}| \ge 0.41982 \cdot \frac{|f_{N-1}|}{1.1376}. \tag{6.5} \]

The lemma follows by combining equations (6.4) and (6.5).
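To see how the constant $1/87$ arises, the chain of inequalities can be replayed with the numbers quoted in Lemmas 6.7 and 6.8. This is only an arithmetic sketch; the names `k`, `f_over_rho` and `w_over_rho` are illustrative.

```python
ALPHA_0 = 0.1307   # threshold on alpha used in this section

# Lemma 6.7 with alpha_{N-1} >= ALPHA_0: |f_{N-1}| >= (ALPHA_0 / 4) * R_{N-1}
k = ALPHA_0 / 4.0

# R_{N-1} >= rho - |f_{N-1}| gives |f| >= (rho - |f|) / 31, hence |f| >= rho / 32
f_over_rho = 1.0 / 32.0

# Lemma 6.8: |w_N| >= 0.41982 |w_{N-1}| >= (0.41982 / 1.1376) |f_{N-1}|
w_over_rho = 0.41982 / 1.1376 * f_over_rho

print(k, w_over_rho, 1.0 / 87.0)
```

Here `k` should exceed $1/31$ and `w_over_rho` should be at least $1/87$, matching (6.4), (6.5) and the statement of Lemma 6.9.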

7. THE POINTWISE COST

In this section we will estimate the number $\#_f(z_0)$ of iterates needed to find an approximate zero starting at $z_0$. We need some preparation to be able to state the estimate.

To simplify notation and without loss of generality, throughout this section we shall assume that $\ell_{w_0}$ lies along the positive real axis. Furthermore, we shall assume that no critical values of $f$ lie along $\ell_{w_0}$.

As before, let $w_0 = f(z_0)$ and let the $w_n$ be the guide points along $\ell_{w_0}$ as produced by the algorithm. Also let $\hat{w}_0 = \hat{f}(z_0)$ and $\hat{w}_n$ be the corresponding points in the surface $\mathcal{S}$, lying along the ray $\hat{\ell}_{w_0}$.

We divide $\hat{\ell}_{w_0}$ into subintervals as follows: as noted in Proposition 4.4, for each $v \in V_f$ the intersection of $\hat{\ell}_{w_0}$ with $\mathrm{Vor}(v)$ will either be an interval or the empty set. Set $\hat{q}_0 = \hat{w}_0$, and denote the first interval by $[\hat{q}_0, \hat{q}_1]$ with corresponding critical value $v_1$. In general, set

\[ [\hat{q}_{j-1}, \hat{q}_j] = \mathrm{Vor}(v_j) \cap \hat{\ell}_{w_0}. \]

Let $\beta = \beta(z_0)$ denote the total number of such intervals. Note that for a point $z_0 = re^{2\pi i t_0}$ on our initial circle, we have $\beta(z_0) = \operatorname{card} I_{t_0}$.


So that we may work in the target space $\mathbb{C}$ rather than in the surface $\mathcal{S}$, we make the following observation. The projection $\pi$ is an isometry in a neighborhood of $\hat{\ell}_{w_0}$, since $V_f \cap \hat{\ell}_{w_0} = \emptyset$. We define a set $U(\hat{\ell}_{w_0}) \subset \mathcal{S}$ as

\[ U(\hat{\ell}_{w_0}) = \bigl\{ \hat{y} \mid [\hat{y}, \hat{y}^{\perp}] \ne \emptyset \bigr\}, \]

where for $y \in \mathbb{C}$, $y^{\perp}$ denotes the orthogonal projection of $y$ onto $\ell_{w_0}$ (or its extension $\ell_{-w_0}$). That is, for each critical point $c_i$ which influences the orbit of $w_0$, we remove the ray perpendicular to $\ell_{w_0}$ starting at the critical value $f(c_i)$. Lifting the result to $\mathcal{S}$ via the branch of $\pi^{-1}$ taking $\ell_{w_0}$ to $\hat{\ell}_{w_0}$ yields the set $U(\hat{\ell}_{w_0})$.

Observe that $\pi$ is an isometry on $U(\hat{\ell}_{w_0})$, and furthermore, $U(\hat{\ell}_{w_0})$ contains $\hat{\ell}_{w_0}$ and a unique lift of each of the points $f_n$ produced by the algorithm. Consequently, we have a well-defined correspondence between the target space $\mathbb{C}$ (minus finitely many rays) and a subset of $\mathcal{S}$ most relevant to the α-step algorithm starting at $z_0$. In what follows, we shall use the notation

\[ \mathrm{vor}(v_i) = \pi\bigl( \mathrm{Vor}(v_i) \cap U(\hat{\ell}_{w_0}) \bigr), \]

and shall slightly abuse notation by using $v_i$ for $f(c_i)$. Note that the branch of $f^{-1}$ which takes $w_0$ to $z_0$ is well-defined throughout all of $\pi(U(\hat{\ell}_{w_0}))$; in particular, it coincides with analytic continuation of $f^{-1}$ along $\ell_{w_0}$.




FIGURE 7.1. We divide $\ell_{w_0}$ into intervals where it is influenced by each critical value; the various notations used in this section are labeled as in the figure.

Let $p_j$ be the orthogonal projection of $v_j$ onto the ray $\ell_{w_0}$ (or its extension, $\ell_{-w_0}$), and let $x_j = |v_j - p_j|$. See Figure 7.1. Also, let $\theta_j \in (-\pi, \pi]$ be the angle between $v_j$ and the ray $\ell_{w_0}$; that is,

\[ \theta_j = \operatorname{Arg}(v_j / w_0). \]


Furthermore, use $\beta^+(z_0)$ to denote the number of $\theta_j$ for which $|\theta_j| < \pi/2$ (or, equivalently, for which $p_j$ lies on $\ell_{w_0}$). With this notation in hand, we can state an upper bound on the cost of finding an approximate zero starting from a point $z_0$.

Proposition 7.1. Let $z_0$ be an initial point for the α-step path lifting algorithm, with $|z_0| > 1$, let $f \in P_d(1)$, $w_0 = f(z_0)$. Then the number $\#_f(z_0)$ of steps required for the algorithm to produce an approximate zero starting from $z_0$ satisfies

\[ \#_f(z_0) \le 67 \cdot \left( \log\frac{|w_0|}{|w_N|} + \beta^+(z_0) \log\frac{9}{4} + \sum_{j=1}^{\beta^+(z_0)} \log\frac{4 + \tan|\theta_j|}{\sec|\theta_j| - 1} \right), \]

where $\beta^+(z_0)$ is the number of relevant critical values along $\ell_{w_0}$ with angle $|\theta_j| < \pi/2$, and $w_N$ is the final "guide-point" for the algorithm.

Remark 7.2. The above result may seem circular, since $w_N$ cannot be determined a priori. However, Lemma 6.9 tells us that $\rho_\zeta/87 \le |w_N| < \rho_\zeta$.

In order to establish this proposition, we estimate the number of steps required to pass each Voronoi domain, and then sum over the $\beta(z_0)$ domains that $\ell_{w_0}$ passes through.

If $w_j$ and $w_k$ are two guide points lying on $\ell_{w_0}$ with $k > j$, we can define the rather trivial function $\mathrm{Cost}(w_j, w_k) = k - j$. This measures the number of iterations required by the α-step algorithm beginning at a point $z_j$ near $w_j$ to obtain a point $z_k$ near $w_k$. We extend this function to all pairs of points $y_1$ and $y_2$ lying on $\ell_{w_0}$ by linear interpolation. It is our goal in this section to estimate $N = \mathrm{Cost}(w_0, w_N)$ where $w_N$ is an approximate zero.

Rather than count the number of steps directly (which is possible, but tedious), instead we follow a suggestion of Mike Shub and integrate the reciprocal of the stepsize along $\ell_{w_0}$.

Lemma 7.3. Let $y_1$ and $y_2$ be two points of $\ell_{w_0}$. Then

\[ \mathrm{Cost}(y_1, y_2) \le 67 \int_{y_2}^{y_1} \frac{dy}{r_y}, \]

where $r_y = |y - v|$ for each $y \in \mathrm{vor}(v) \cap \ell_{w_0}$.

Proof. Recall that in section 6, we used $J_n$ to denote the $n$th jump, that is, $J_n = |w_n - w_{n+1}|$ where $w_n$ is a guide point for the algorithm. Set $J(w_n) = J_n$, and extend the function $J(y)$ to all of $\ell_{w_0}$ by linear interpolation. Now consider the differential equation along $\ell_{w_0}$ given by

\[ \frac{dy}{dt} = -J(y), \qquad y(0) = w_0. \tag{7.1} \]

Since $J(y)$ is Lipschitz, the equation (7.1) has a unique solution. Observe that the points $w_n$ are exactly the values given by using Euler's method with stepsize 1 to solve (7.1) numerically. Now consider instead the differential equation given by

\[ \frac{dy}{dt} = -\frac{r_y}{67}, \qquad y(0) = w_0. \tag{7.2} \]


We wish to compare the solution of (7.2) to the Euler method for (7.1). We will show that for every $y$ in any interval $[w_{n+1}, w_n]$, we have $r_y/67 \le J(y)$. Consequently, if $\varphi(t)$ is the solution to (7.2) and $\varphi(t_1) = y_1$, $\varphi(t_2) = y_2$, then we will have $t_2 - t_1 \ge \mathrm{Cost}(y_1, y_2)$.

To see that $r_y/67 \le J(y)$ for all $y \in [w_{n+1}, w_n]$, we must examine a few cases. First, note that if $y \in \mathrm{vor}(v_i)$, we have

\[ r_y^2 = (y - p_i)^2 + x_i^2. \]

Also, recall that by virtue of Prop. 6.1, we have $J(w_n) \ge r_{w_n}/66$.

First consider the case where the interval $[w_{n+1}, w_n]$ lies entirely in $\mathrm{vor}(v_i)$. If $w_{n+1} \ge p_i$, then since $r_y$ is decreasing on the interval $[p_i, w_n]$, we have $J(y) \ge r_y/66$. If $p_i \ge w_{n+1}$, $r_y$ will be nondecreasing. However, we can apply the triangle inequality (recalling that $J(w_n) = w_n - w_{n+1}$) to see that

\[ r_y \le J(w_n) + r_{w_n} \le J(w_n) + 66\, J(w_n), \]

and so $J(w_n) \ge r_y/67$ for all $y$ in the interval.

In the case where the interval intersects more than one Voronoi region, we proceed as follows. First, observe that for all $y \in [q_i, w_n]$, we have already established that $J(y) \ge r_y/67$ holds (where $q_i$ is the smallest point of $[w_{n+1}, w_n] \cap \mathrm{vor}(v_i)$). Since $|v_i - q_i| = |q_i - v_{i+1}|$, we have $J(q_i) \ge r_{q_i}/67$, and we continue as above.

Finally, the equation (7.2) is separable; elementary calculus yields

\[ t(y) = 67 \int_{y}^{w_0} \frac{dy}{r_y}. \]
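The comparison in this proof can be illustrated in the simplest possible setting. The sketch below assumes a single critical value $v = p + ix$ and takes every jump at the smallest size that Prop. 6.1 allows, $J(y) = r_y/66$; this is a hypothetical worst case for illustration, not a simulation of the actual α-step algorithm, and the function names are invented for the example.

```python
import math

def r(y: float, p: float, x: float) -> float:
    # Distance from the point y on the ray to the critical value v = p + i*x.
    return math.hypot(y - p, x)

def euler_steps(w0: float, y_stop: float, p: float, x: float):
    # Euler's method with stepsize 1 for dy/dt = -J(y), using the hypothetical
    # worst-case jump J(y) = r_y / 66 (the lower bound from Prop. 6.1).
    w, n = w0, 0
    while w > y_stop:
        w -= r(w, p, x) / 66.0
        n += 1
    return n, w

def integral_bound(y1: float, y2: float, p: float, x: float) -> float:
    # 67 * integral of dy / r_y over [y2, y1], via the antiderivative
    # log((y - p) + r_y).
    return 67.0 * math.log(((y1 - p) + r(y1, p, x)) / ((y2 - p) + r(y2, p, x)))

n, w_end = euler_steps(w0=10.0, y_stop=1.0, p=0.5, x=0.3)
print(n, integral_bound(10.0, w_end, 0.5, 0.3))
```

In this toy setting the step count `n` stays below the integral bound, as Lemma 7.3 predicts; the guide points here never pass the projection $p$, so only the first case of the proof is exercised.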

Let $y$ be a point on $\ell_{w_0}$, and let $c$ be a critical point which influences $w_0$; as before, let $p$ be the orthogonal projection of $f(c)$ onto $\ell_{w_0}$, and let $x$ denote the distance between $f(c)$ and $p$. For each $y$ and a fixed critical point $c$, we also define the angle $A_y$, which is the angle that the segment from $y$ to $f(c)$ makes with the segment between $f(c)$ and $p$. Notice that $r_y = |f(c) - y|$. As before, use $\theta_c$ to denote the angle between $f(c)$ and $\ell_{w_0}$. See Figure 7.2.

FIGURE 7.2. The quantities $y$, $r_y$, $p$, $x$, $A_y$, and $\theta_c$.

We now define the following function, related to $\mathrm{Cost}(y_1, y_2)$.

\[ \pounds(y_1, y_2, c) = \log\left( \frac{(y_1 - p) + r_{y_1}}{(y_2 - p) + r_{y_2}} \right). \]


By virtue of Lemma 7.3, if $y_1$ and $y_2$ are both in $\mathrm{vor}(f(c))$, we have

\[ \mathrm{Cost}(y_1, y_2) \le 67 \int_{y_2}^{y_1} \frac{dy}{r_y} = 67\, \pounds(y_1, y_2, c). \tag{7.3} \]
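The equality in (7.3) is the statement that $\log((y - p) + r_y)$ is an antiderivative of $1/r_y$. As a numerical cross-check, the sketch below compares the closed form with a midpoint-rule approximation of the integral for one arbitrary choice of $p$, $x$, $y_1$, $y_2$; the function names (`pounds`, `midpoint_integral`) and sample values are illustrative assumptions.

```python
import math

def r(y: float, p: float, x: float) -> float:
    # r_y = |f(c) - y| for a critical value f(c) = p + i*x and a real point y.
    return math.hypot(y - p, x)

def pounds(y1: float, y2: float, p: float, x: float) -> float:
    # Closed form: log of ((y1 - p) + r_{y1}) / ((y2 - p) + r_{y2}).
    return math.log(((y1 - p) + r(y1, p, x)) / ((y2 - p) + r(y2, p, x)))

def midpoint_integral(y1: float, y2: float, p: float, x: float, m: int = 20000) -> float:
    # Midpoint-rule approximation of the integral of dy / r_y over [y2, y1].
    h = (y1 - y2) / m
    return sum(h / r(y2 + (k + 0.5) * h, p, x) for k in range(m))

closed = pounds(5.0, 0.2, 1.0, 0.7)
numeric = midpoint_integral(5.0, 0.2, 1.0, 0.7)
print(closed, numeric)
```

The sample interval deliberately straddles the projection $p$, since the closed form is valid on both sides of it.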

However, $\pounds$ will still be useful even when one or both of its first two arguments are not in $\mathrm{vor}(f(c))$. We establish some bounds on the value of $\pounds$ in the next few lemmas.

Lemma 7.4.

\[ r_y + (y - p) \le \begin{cases} 3(y - p) & \text{if } A_y > \frac{\pi}{6}, \\ x\sqrt{3} & \text{if } A_y \le \frac{\pi}{6}. \end{cases} \]

Proof. Note that $r_y + (y - p) = x(\tan A_y + \sec A_y)$. If $A_y > \pi/6$, we have $x(\tan A_y + \sec A_y) \le 3x \tan A_y = 3(y - p)$, since $\sec A_y \le 2 \tan A_y$ whenever $\sin A_y \ge 1/2$. When $A_y \le \pi/6$, note that $\tan A_y + \sec A_y$ is increasing in $A_y$; at $A_y = \pi/6$, $r_y + (y - p) = x\sqrt{3}$. We remark that this holds even if $p < 0$.

Lemma 7.5. Let $y_1, y_2 \in \ell_{w_0}$ with $y_1 > y_2 \ge 3p > 0$. Then

\[ \pounds(y_1, y_2, c) < \log\frac{y_1}{y_2} + \log\frac{9}{4}. \]

Proof. We consider two cases: when the angle $A_{y_1}$ is large and when it is small. If $A_{y_1} \le \pi/6$, since $y_2 > p$,

\[ \pounds(y_1, y_2, c) < \pounds(y_1, p, c) \le \log\frac{x\sqrt{3}}{x} = \log\sqrt{3}, \]

where we have used Lemma 7.4 in the second inequality. If $A_{y_1} > \pi/6$, we have (using Lemma 7.4 again)

\[ \pounds(y_1, y_2, c) \le \log\frac{3(y_1 - p)}{2(y_2 - p)} = \log\frac{3 y_1 (1 - p/y_1)}{2 y_2 (1 - p/y_2)}. \]

Since $y_2 \ge 3p$, we have $(1 - p/y_1)/(1 - p/y_2) < 3/2$, and so

\[ \pounds(y_1, y_2, c) \le \log\frac{y_1}{y_2} + \log\frac{9}{4}. \]

Since $\sqrt{3} < 9/4$, the above bound holds in either case.

Lemma 7.6. If $p > 0$,

\[ \pounds(3p, 0, c) \le \log\frac{4 + \tan|\theta_c|}{\sec|\theta_c| - 1}. \]

We note that since $p > 0$, we have $-\pi/2 < \theta_c < \pi/2$. Consequently,

\[ \frac{4 + \tan|\theta_c|}{\sec|\theta_c| - 1} > 1. \]

Proof. We have

\[ \pounds(3p, 0, c) = \log\frac{(3p - p) + r_{3p}}{r_0 - p} \le \log\frac{2p + (2p + p \tan|\theta_c|)}{p \sec|\theta_c| - p} = \log\frac{4 + \tan|\theta_c|}{\sec|\theta_c| - 1}. \]
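The inequality in this proof rests on $r_{3p} \le 2p + p\tan|\theta_c|$ and $r_0 = p\sec|\theta_c|$. A small numerical sketch, under the assumption that the critical value is placed at $f(c) = p + ix$ with $x = p\tan\theta$, checks the bound of Lemma 7.6 on a grid of angles; the grid, the value $p = 2.5$, and the function names are arbitrary illustrative choices.

```python
import math

def pounds_3p_to_0(p: float, theta: float) -> float:
    # Critical value f(c) = p + i*x at angle theta from the ray, so x = p*tan(theta).
    x = p * math.tan(theta)
    r_3p = math.hypot(3 * p - p, x)   # distance from y = 3p to f(c)
    r_0 = math.hypot(p, x)            # distance from y = 0 to f(c), i.e. p*sec(theta)
    return math.log(((3 * p - p) + r_3p) / ((0 - p) + r_0))

def lemma_7_6_bound(theta: float) -> float:
    # Right-hand side of Lemma 7.6 for 0 < theta < pi/2.
    return math.log((4 + math.tan(theta)) / (1 / math.cos(theta) - 1))

thetas = [0.05 * k for k in range(1, 31)]   # 0.05, ..., 1.50, all below pi/2
ok = all(pounds_3p_to_0(2.5, t) <= lemma_7_6_bound(t) for t in thetas)
positive = all(lemma_7_6_bound(t) > 0 for t in thetas)
print(ok, positive)
```

The second flag reflects the remark that the argument of the logarithm exceeds 1 whenever $p > 0$.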




Finally, we handle the case where $|\theta_c| \ge \pi/2$.

Lemma 7.7. If $y_1 > y_2 > 0 \ge p$,

\[ \pounds(y_1, y_2, c) \le \log(y_1/y_2). \]

Proof. Observe that $r_{y_2} \ge y_2 - p$, since $r_{y_2}$ is the hypotenuse of the right triangle with a leg of length $y_2 - p$. Also, by the triangle inequality, $r_{y_1} - r_{y_2} \le y_1 - y_2$. Using this, we have

\begin{align*}
\frac{r_{y_1} + (y_1 - p)}{r_{y_2} + (y_2 - p)} &\le \frac{(r_{y_2} + y_1 - y_2) + (y_1 - p)}{2(y_2 - p)} \\
&= \frac{2y_1 - p + r_{y_2} - y_2}{2(y_2 - p)} \\
&\le \frac{2(y_1 - p) + r_{y_2} - (y_2 - p)}{2(y_2 - p)} \\
&\le \frac{y_1 - p}{y_2 - p} < \frac{y_1}{y_2}.
\end{align*}

Consequently, $\pounds(y_1, y_2, c) = \log\frac{r_{y_1} + (y_1 - p)}{r_{y_2} + (y_2 - p)} < \log(y_1/y_2)$ as desired.



We can now prove the main result of this section.

Proof of Proposition 7.1. First, divide $\ell_{w_0}$ into segments where it intersects each of the $\beta(z_0)$ Voronoi regions $\mathrm{vor}(v_j)$; the $j$th segment will be bounded by points $q_{j-1}$ and $q_j$ (we set $q_0 = w_0$, and $q_{\beta(z_0)} = w_N$). See Figure 7.1. Now, we have

\[ N = \mathrm{Cost}(w_0, w_N) = \sum_{j=1}^{\beta(z_0)} \mathrm{Cost}(q_{j-1}, q_j) \le 67 \sum_{j=1}^{\beta(z_0)} \pounds(q_{j-1}, q_j, c_j), \tag{7.4} \]

where the inequality follows from Lemma 7.3 and (7.3). Applying Lemmas 7.5 and 7.6 gives us

\[ \sum_{j=1}^{\beta^+(z_0)} \pounds(q_{j-1}, q_j, c_j) \le \sum_{j=1}^{\beta^+(z_0)} \log\frac{q_{j-1}}{q_j^*} + \beta^+(z_0) \log\frac{9}{4} + \sum_{j=1}^{\beta^+(z_0)} \log\frac{4 + \tan|\theta_j|}{\sec|\theta_j| - 1}, \]

where $q_j^* = \max(|q_j|, |3p_j|)$. Note that since $q_j^* \ge |q_j|$, replacing $q_j^*$ with $q_j$ will still give us an upper bound; furthermore, since $|q_{j-1}| > |q_j|$, the logarithm of their ratio is positive. Thus, we have

\[ \sum_{j=1}^{\beta^+(z_0)} \pounds(q_{j-1}, q_j, c_j) \le \sum_{j=1}^{\beta^+(z_0)} \log\frac{q_{j-1}}{q_j} + \beta^+(z_0) \log\frac{9}{4} + \sum_{j=1}^{\beta^+(z_0)} \log\frac{4 + \tan|\theta_j|}{\sec|\theta_j| - 1}. \tag{7.5} \]

Now we apply Lemma 7.7 to the remaining intervals (if any).

\[ \sum_{j=\beta^+(z_0)+1}^{\beta(z_0)} \pounds(q_{j-1}, q_j, c_j) \le \sum_{j=\beta^+(z_0)+1}^{\beta(z_0)} \log\frac{q_{j-1}}{q_j}. \tag{7.6} \]

Combining equations (7.5) and (7.6) with (7.4) and recalling that $q_0 = w_0$, $q_\beta = w_N$ gives the desired result.
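For concreteness, the bound of Proposition 7.1 can be evaluated directly once $|w_0|$, $|w_N|$ and the angles $\theta_j$ are known. The sketch below is only an evaluation of that formula; the function name `pointwise_cost_bound` and the sample inputs are hypothetical, and in practice $|w_N|$ would be replaced by the estimate $\rho_\zeta/87$ from Lemma 6.9.

```python
import math

def pointwise_cost_bound(abs_w0: float, abs_wN: float, thetas) -> float:
    # Angles with |theta_j| < pi/2 are the ones counted by beta^+(z_0).
    # theta_j = 0 is excluded: no critical value is assumed to lie on the ray.
    forward = [abs(t) for t in thetas if 0.0 < abs(t) < math.pi / 2]
    total = math.log(abs_w0 / abs_wN)
    total += len(forward) * math.log(9.0 / 4.0)
    total += sum(
        math.log((4.0 + math.tan(t)) / (1.0 / math.cos(t) - 1.0)) for t in forward
    )
    return 67.0 * total

# Hypothetical data: three relevant critical values, one of them "behind" the ray.
bound = pointwise_cost_bound(8.0, 0.5, [0.3, -1.0, 2.0])
print(bound)
```

Critical values with $|\theta_j| \ge \pi/2$ contribute nothing beyond the $\log(|w_0|/|w_N|)$ term, which is the content of Lemma 7.7 and (7.6).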


8. THE AVERAGE COST

In this section we shall prove our Main Theorem (Thm. 1.1), which follows from averaging the bound in Proposition 7.1 over the starting values on the circle of radius $r = 1 + C/d$. Recall from Definition 4.6 that $\mathcal{I}$ is the set of pairs $(t, c)$ for which the critical points $c \in C_f$ influence the starting values $z_0 = re^{2\pi i t}$ on the initial circle of radius $r$, $I_t$ is the set of critical points which influence a given $t$, and $I_c$ are the $t \in S_r$ which are influenced by $c$.

For each pair $(t, c) \in \mathcal{I}$, we use $\theta = \theta(t, c)$ to denote the angle between $[0, f(re^{2\pi i t})]$ and $[0, f(c)]$, that is

\[ \theta(t, c) = \operatorname{Arg}\frac{f(re^{2\pi i t})}{f(c)}. \]

In the notation of section 7, $\theta(t, c_j) = \theta_j$ where $v_j = \hat{f}(c_j)$ and $(t, c_j) \in \mathcal{I}$.

Note that for each fixed $c$, $I_c$ is a collection of finitely many intervals: $I_c$ consists of those $t$ such that $\hat{\ell}_{f(re^{2\pi i t})}$ intersects $\mathrm{Vor}(\hat{f}(c))$. Define for every $c \in C_f$ the function $\theta_c : I_c \to \mathbb{R}$ by

\[ \theta_c(t) = \theta(t, c) = \operatorname{Arg}\frac{f(re^{2\pi i t})}{f(c)}. \]

Lemma 8.1. For each $c \in C_f$, the map $\theta_c$ is at most $(m_c + 1)$-to-one.

Proof. For every $\theta \in (-\pi, \pi]$ there are at most $(m_c + 1)$ rays $\hat{\ell} \subset \mathcal{S}$ for which the angle between $[0, f(c)]$ and $\pi(\hat{\ell})$ is $\theta$ and which also intersect $\mathrm{Vor}(\hat{f}(c))$. This is a consequence of Proposition 4.4.

As an immediate consequence of Lemma 5.1, we have

\[ 2\pi d \cdot \frac{r}{r + 1} \le \frac{d\theta_c(t)}{dt} \le 2\pi d \cdot \frac{r}{r - 1}. \tag{8.1} \]

Proposition 8.2. Let f ∈ Pd (1) be of degree d and r > 1. Then Z 1 0



log

c∈It |θ (t,c)|
