A Root-Finding Algorithm for Cubics
Sam Northshield
Department of Mathematics, SUNY-Plattsburgh
International Conference on Difference Equations and Applications, 2011
Outline
• Newton’s method is “generally convergent” for quadratics (but not higher).
• Newton’s method for quadratics can be expressed in terms of 2 × 2 matrices.
• This can be generalized to 3 × 3 matrices, yielding a generally convergent (2-variable) method for cubics.
• This is motivated by, and related to, work by Smale, Shub, McMullen, and Hawkins.
Newton’s Method
\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \]
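Generally Convergent
For quadratics $f(x) = ax^2 + bx + c$,
\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = \frac{a x_n^2 - c}{2 a x_n + b}. \]
If $f(x)$ has roots $r$ and $s$, then (using $-rb - c = ar^2$ and $-sb - c = as^2$)
\[ \frac{x_{n+1} - r}{x_{n+1} - s} = \frac{a x_n^2 - 2 r a x_n - r b - c}{a x_n^2 - 2 s a x_n - s b - c} = \frac{a x_n^2 - 2 r a x_n + a r^2}{a x_n^2 - 2 s a x_n + a s^2} = \left( \frac{x_n - r}{x_n - s} \right)^2 . \]
Hence
\[ \frac{x_n - r}{x_n - s} = \left( \frac{x_0 - r}{x_0 - s} \right)^{2^n} \to 0 \]
(and $x_n \to r$) if $x_0$ is closer to $r$ than to $s$.
For almost all coefficients $a, b, c$ and almost all starting points $x_0$, $x_n$ converges to a root of $f(x)$ (“generally convergent”).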
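The squaring identity above can be checked numerically. The following is a minimal sketch, assuming an illustrative quadratic $x^2 - 3x + 2$ with roots 1 and 2 and starting point 0; the function name `newton_quadratic` is hypothetical.

```python
def newton_quadratic(a, b, c, x0, steps=6):
    """Iterate x -> (a*x^2 - c) / (2*a*x + b), Newton's map for a*x^2 + b*x + c."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append((a * x * x - c) / (2 * a * x + b))
    return xs

# Illustrative quadratic (an assumption): x^2 - 3x + 2 = (x - 1)(x - 2).
r, s = 1.0, 2.0
xs = newton_quadratic(1.0, -3.0, 2.0, x0=0.0)
ratios = [(x - r) / (x - s) for x in xs]

for n in range(3):
    # Each ratio should equal the square of the previous one.
    print(n, ratios[n + 1], ratios[n] ** 2)
```

Since 0 is closer to 1 than to 2, the ratios shrink by repeated squaring and the iterates approach the root 1.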
Newton’s Method is not Generally Convergent for Cubics
$f(x) = x^3 - 2x + 2$ yields
\[ x_{n+1} = \frac{2 x_n^3 - 2}{3 x_n^2 - 2}. \]
0 → 1 → 0 → 1 → 0 → ...
For all starting points sufficiently near 0 and all coefficients sufficiently near [1, 0, −2, 2], iterates of Newton’s method approach a 2-cycle.
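A short sketch of the 2-cycle, assuming ordinary floating-point arithmetic; the helper names `newton_step` and `orbit` and the nearby starting point 0.05 are illustrative choices.

```python
def newton_step(x):
    """One Newton step for f(x) = x^3 - 2x + 2, i.e. x -> (2x^3 - 2) / (3x^2 - 2)."""
    return (2 * x**3 - 2) / (3 * x**2 - 2)

def orbit(x0, steps):
    """Return [x0, x1, ..., x_steps] under repeated Newton steps."""
    xs = [x0]
    for _ in range(steps):
        xs.append(newton_step(xs[-1]))
    return xs

print(orbit(0.0, 4))          # [0.0, 1.0, 0.0, 1.0, 0.0]
print(orbit(0.05, 20)[-2:])   # a nearby start is drawn into the same cycle
```

Starting exactly at 0 the orbit alternates between 0 and 1; starting at 0.05 the odd iterates approach 1 and the even iterates approach 0, so no root is found.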
Motivation and Earlier Work
McMullen [Ann. Math. (1987)] answered a question of Smale [BAMS (1985)]: he found a generally convergent algorithm for cubics and showed that no such algorithm exists for quartics and beyond.
For cubics $f(x) = x^3 + ax + b$, Newton’s method applied to
\[ \frac{x^3 + ax + b}{3ax^2 + 9bx - a^2} \]
is generally convergent (“McMullen’s superconvergent algorithm”).
Hawkins [PAMS (2002)] noted that McMullen’s algorithm for $x^3 - 1$ coincides with Halley’s method and that all others are conjugate to it by Möbius transformations.
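As a rough illustration of "Newton’s method applied to a rational function", the sketch below uses the quotient rule: for $q = p/h$, the Newton step is $x - q/q' = x - ph/(p'h - ph')$. The function name `mcmullen_step` and the test cubic $x^3 - 2$ are assumptions made for the example, not taken from the cited papers.

```python
def mcmullen_step(a, b, x):
    """One Newton step for q(x) = p(x) / h(x), with p = x^3 + a*x + b and
    h = 3*a*x^2 + 9*b*x - a^2, using x - q/q' = x - p*h / (p'*h - p*h')."""
    p, dp = x**3 + a * x + b, 3 * x**2 + a
    h, dh = 3 * a * x**2 + 9 * b * x - a**2, 6 * a * x + 9 * b
    return x - p * h / (dp * h - p * dh)

# Illustrative run (assumed inputs): p(x) = x^3 - 2, starting at x = 1.
x = 1.0
for _ in range(5):
    x = mcmullen_step(0.0, -2.0, x)
print(x)  # approaches the real cube root of 2
```

For $a = 0$ the step simplifies to $(x^4 - 2bx)/(2x^3 - b)$, so the first iterate from $x = 1$ with $b = -2$ is $5/4$.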
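Newton’s Method from 2 × 2 Matrices
For $f(x) = ax^2 + bx + c$, let $A$ satisfy $f(A) = 0$. Writing $\doteq$ for equality up to a nonzero scalar multiple,
\[ (A - xI)^2 = A^2 - 2xA + x^2 I = -\frac{b}{a} A - \frac{c}{a} I - 2xA + x^2 I = -\left( \frac{b}{a} + 2x \right) A - \left( \frac{c}{a} - x^2 \right) I \doteq A - \frac{a x^2 - c}{2ax + b} I. \]
Hence
\[ A - x_n I \doteq (A - x_0 I)^{2^n}. \]
Another approximation scheme (Halley’s method):
\[ (A - xI)^3 \doteq A - \frac{a^2 x^3 - 3acx - bc}{3a^2 x^2 + 3abx + b^2 - ac} I. \]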
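The matrix formulation can be tested directly. This is a minimal sketch assuming $A$ is the companion matrix of the quadratic (any matrix with $f(A) = 0$ that is not a multiple of $I$ would do); `step_via_power`, `newton_direct` and `halley_direct` are hypothetical helper names. Squaring $A - xI$ should reproduce the Newton step, and cubing it should reproduce the standard Halley step $x - 2ff'/(2f'^2 - ff'')$.

```python
def mat_mul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def step_via_power(a, b, c, x, k):
    """Reduce (A - xI)^k to alpha*A + beta*I and return x' = -beta/alpha,
    so that (A - xI)^k is a scalar multiple of A - x'I."""
    # Companion matrix of z^2 + (b/a) z + c/a: an assumed, convenient choice of A.
    A = [[0.0, -c / a], [1.0, -b / a]]
    S = [[A[0][0] - x, A[0][1]], [A[1][0], A[1][1] - x]]
    P = S
    for _ in range(k - 1):
        P = mat_mul(P, S)
    # For this A, alpha*A + beta*I has alpha in entry [1][0] and beta in [0][0].
    alpha, beta = P[1][0], P[0][0]
    return -beta / alpha

def newton_direct(a, b, c, x):
    f, df = a * x * x + b * x + c, 2 * a * x + b
    return x - f / df

def halley_direct(a, b, c, x):
    f, df, d2f = a * x * x + b * x + c, 2 * a * x + b, 2 * a
    return x - 2 * f * df / (2 * df * df - f * d2f)

a, b, c, x = 2.0, -3.0, -5.0, 0.7   # arbitrary sample values (assumptions)
print(step_via_power(a, b, c, x, 2), newton_direct(a, b, c, x))
print(step_via_power(a, b, c, x, 3), halley_direct(a, b, c, x))
```

For $x^2 - 2$ at $x = 1$ the cubing step gives $7/5$, the same value as Halley’s method.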
General Convergence via Matrices
If $\mathrm{spec}(A) = \{r, s\}$ then, since $A - x_n I \doteq (A - x_0 I)^{2^n}$,
\[ \{ (r - x_0)^{2^n}, (s - x_0)^{2^n} \} = \mathrm{spec}\big( (A - x_0 I)^{2^n} \big) \doteq \mathrm{spec}(A - x_n I) = \{ r - x_n, s - x_n \}. \]
If $|x_0 - r| < |x_0 - s|$, then
\[ \frac{x_n - r}{x_n - s} = \left( \frac{x_0 - r}{x_0 - s} \right)^{2^n} \to 0 \]
and thus $x_n \to r$.
Extending to 3 × 3 Matrices
Let $A$ be 3 × 3 with $\mathrm{spec}(A) = \{r, s, t\}$ (all distinct). Given $x_0, y_0$, define $x_n$ and $y_n$ by
\[ A^2 - x_n A + y_n I \doteq (A^2 - x_0 A + y_0 I)^{2^n}. \]
Letting $f_n(z) = z^2 - x_n z + y_n$,
\[ \{ f_n(r), f_n(s), f_n(t) \} = \mathrm{spec}(A^2 - x_n A + y_n I) \doteq \mathrm{spec}\big( (A^2 - x_0 A + y_0 I)^{2^n} \big) = \{ f_0(r)^{2^n}, f_0(s)^{2^n}, f_0(t)^{2^n} \}. \]
If $|f_0(r)|, |f_0(s)| < |f_0(t)|$ then
\[ \frac{f_n(r)}{f_n(t)} = \left( \frac{f_0(r)}{f_0(t)} \right)^{2^n} \to 0, \qquad \frac{f_n(s)}{f_n(t)} = \left( \frac{f_0(s)}{f_0(t)} \right)^{2^n} \to 0, \]
and so $f_n(r), f_n(s) \to 0$ (uses $r \neq s$).
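Extending to 3 × 3 Matrices (cont.)
With
\[ A^2 - x_n A + y_n I \doteq (A^2 - xA + yI)^{2^n} \]
and $|r^2 - xr + y|, |s^2 - xs + y| < |t^2 - xt + y|$, we have $x_n s - y_n \to s^2$ and $x_n r - y_n \to r^2$, and so
\[ \begin{pmatrix} s & -1 \\ r & -1 \end{pmatrix} \begin{pmatrix} x_n \\ y_n \end{pmatrix} \to \begin{pmatrix} s^2 \\ r^2 \end{pmatrix} \]
and so
\[ \begin{pmatrix} x_n \\ y_n \end{pmatrix} \to \begin{pmatrix} r + s \\ rs \end{pmatrix}. \]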
Applying This to Cubics $z^3 + az + b$
Let $p(z) = z^3 + az + b$ have distinct roots $r, s, t$. For $x, y$, let $u, v$ satisfy
\[ (A^2 - xA + yI)^2 \doteq A^2 - uA + vI, \]
where $A$ is a matrix with spectrum $\{r, s, t\}$. The analogue of Newton’s method is the map $(x, y) \mapsto (u, v)$, which is, explicitly,
\[ (x, y) \mapsto \left( \frac{2xy - 2ax + b}{x^2 + 2y - a}, \; \frac{2bx + y^2}{x^2 + 2y - a} \right). \]
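The explicit map can be compared with a direct matrix computation. The sketch below assumes $A$ is the companion matrix of $z^3 + az + b$ (one convenient matrix with the required spectrum); `N_formula` and `N_via_matrix` are hypothetical names, and the sample values are arbitrary.

```python
def mat_mul3(P, Q):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def N_formula(a, b, x, y):
    """The explicit map (x, y) -> (u, v) for z^3 + a*z + b."""
    d = x * x + 2 * y - a
    return (2 * x * y - 2 * a * x + b) / d, (2 * b * x + y * y) / d

def N_via_matrix(a, b, x, y):
    """Square A^2 - x*A + y*I and read (u, v) off alpha*A^2 + beta*A + gamma*I."""
    # Companion matrix of z^3 + a*z + b: an assumed, convenient choice of A.
    A = [[0.0, 0.0, -b], [1.0, 0.0, -a], [0.0, 1.0, 0.0]]
    A2 = mat_mul3(A, A)
    Q = [[A2[i][j] - x * A[i][j] + (y if i == j else 0.0) for j in range(3)]
         for i in range(3)]
    Q2 = mat_mul3(Q, Q)
    # For this A, alpha*A^2 + beta*A + gamma*I has first column (gamma, beta, alpha).
    gamma, beta, alpha = Q2[0][0], Q2[1][0], Q2[2][0]
    return -beta / alpha, gamma / alpha

a, b, x, y = 1.0, -3.0, 0.4, 1.7   # arbitrary sample values (assumptions)
print(N_formula(a, b, x, y))
print(N_via_matrix(a, b, x, y))
```

Both lines should print the same pair up to rounding, since $(A^2 - xA + yI)^2 = (x^2 + 2y - a)A^2 - (2xy - 2ax + b)A + (2bx + y^2)I$ after reducing with $A^3 = -aA - bI$.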
Applying This to Cubics $z^3 + az + b$
Theorem: The iterates of
\[ (x, y) \mapsto N(x, y) := \left( \frac{2xy - 2ax + b}{x^2 + 2y - a}, \; \frac{2bx + y^2}{x^2 + 2y - a} \right) \]
converge (quadratically) to $(r + s, rs) = (-t, -b/t)$ when $|r^2 - xr + y|, |s^2 - xs + y| < |t^2 - xt + y|$. This algorithm is generally convergent.
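A numerical sketch of the theorem, assuming the illustrative cubic $z^3 - 7z + 6 = (z - 1)(z - 2)(z + 3)$ and the starting point $(0, 0)$, for which $f_0(z) = z^2$ and the root $t = -3$ gives the largest value. The helper `f` evaluates $f_n(z) = z^2 - x_n z + y_n$ and is a hypothetical name.

```python
def N(a, b, x, y):
    """The map N(x, y) for the cubic z^3 + a*z + b."""
    d = x * x + 2 * y - a
    return (2 * x * y - 2 * a * x + b) / d, (2 * b * x + y * y) / d

# Illustrative cubic (an assumption): z^3 - 7z + 6 = (z - 1)(z - 2)(z + 3).
a, b = -7.0, 6.0
r, s, t = 1.0, 2.0, -3.0

x, y = 0.0, 0.0   # f_0(z) = z^2, so |f_0(t)| = 9 is the largest of the three
orbit = [(x, y)]
for _ in range(8):
    x, y = N(a, b, x, y)
    orbit.append((x, y))

def f(n, z):
    """Evaluate f_n(z) = z^2 - x_n*z + y_n along the computed orbit."""
    xn, yn = orbit[n]
    return z * z - xn * z + yn

print(orbit[-1])                                      # close to (r + s, r*s) = (3, 2)
print(f(1, r) / f(1, t), (f(0, r) / f(0, t)) ** 2)    # ratios square at each step
```

The limit $(3, 2)$ equals $(-t, -b/t)$ for $t = -3$, so the remaining root is recovered as $-x_n$.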
A Conjugate Algorithm
By conjugation with the involution $(x, y) \mapsto (-x, -b/y)$ (and using $r + s = -t$ and $rs = -b/t$), iterates of
\[ M(x, y) := \left( \frac{2bx + 2axy + by}{2b + ay - x^2 y}, \; \frac{x^2 y^2 - a y^2 - 2by}{2xy^2 - b} \right) \]
converge to $(t, t)$ whenever the condition of the theorem holds at the conjugated point $(-x, -b/y)$, that is, whenever $|r^2 + xr - b/y|, |s^2 + xs - b/y| < |t^2 + xt - b/y|$ ($r, s, t$ roots of $f(x) = x^3 + ax + b$).
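The conjugation can be carried out numerically. In this sketch `phi` denotes the map $(x, y) \mapsto (-x, -b/y)$ and `M_by_conjugation` computes $\varphi \circ N \circ \varphi$; both names are hypothetical. Working through the algebra gives $2xy^2 - b$ as the denominator of the second coordinate of $M$, which is the form used here; it is also the form that reproduces the first iterate $(0.875, 1.263157895)$ from $(2, 3)$ for $x^3 - 2$.

```python
def N(a, b, x, y):
    d = x * x + 2 * y - a
    return (2 * x * y - 2 * a * x + b) / d, (2 * b * x + y * y) / d

def phi(b, x, y):
    """The conjugating map (x, y) -> (-x, -b/y); applying it twice is the identity."""
    return -x, -b / y

def M(a, b, x, y):
    """Closed form of phi o N o phi."""
    return ((2 * b * x + 2 * a * x * y + b * y) / (2 * b + a * y - x * x * y),
            (x * x * y * y - a * y * y - 2 * b * y) / (2 * x * y * y - b))

def M_by_conjugation(a, b, x, y):
    u, v = N(a, b, *phi(b, x, y))
    return phi(b, u, v)

a, b, x, y = 1.5, -2.0, 0.7, 1.3   # arbitrary sample values (assumptions)
print(M(a, b, x, y))
print(M_by_conjugation(a, b, x, y))
print(phi(b, *phi(b, x, y)))       # returns the starting point up to rounding
```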
Interesting Connections
Letting $x = y$,
\[ M(x, x) = \left( \frac{3bx + 2ax^2}{2b + ax - x^3}, \; \frac{x^4 - ax^2 - 2bx}{2x^3 - b} \right), \]
whose coordinates are the maps for Newton’s method for $f(x)/x^2$ and $f(x)/x$ respectively.
If, in addition, $a = 0$, then the second coordinate becomes $T(x) := (x^4 - 2bx)/(2x^3 - b)$, “McMullen’s superconvergent algorithm”.
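A quick check of the diagonal identity, as a sketch: for $g = f/x^k$ the Newton step is $x - g/g' = x - x f/(x f' - k f)$, which the hypothetical helper `newton_on_f_over_xk` implements. The sample values are arbitrary.

```python
def M_diag(a, b, x):
    """The two coordinates of M(x, x) for f(x) = x^3 + a*x + b."""
    return ((3 * b * x + 2 * a * x * x) / (2 * b + a * x - x**3),
            (x**4 - a * x * x - 2 * b * x) / (2 * x**3 - b))

def newton_on_f_over_xk(a, b, x, k):
    """Newton step for g(x) = f(x) / x^k, using g/g' = x*f / (x*f' - k*f)."""
    f = x**3 + a * x + b
    df = 3 * x * x + a
    return x - x * f / (x * df - k * f)

a, b, x = 1.5, -2.0, 0.9   # arbitrary sample values (assumptions)
print(M_diag(a, b, x))
print(newton_on_f_over_xk(a, b, x, 2), newton_on_f_over_xk(a, b, x, 1))
```

The first coordinate should match the $k = 2$ step and the second the $k = 1$ step.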
An Example
To calculate $\sqrt[3]{2}$ by iterating $M$ starting at (2, 3):
(2, 3)
(0.875, 1.263157895)
(1.213245033, 1.309248555)
(1.260547998, 1.259900272)
(1.259920953, 1.259921154)
(1.259921050, 1.259921050).
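The iterates above can be reproduced with a few lines, as a sketch: take $f(x) = x^3 - 2$, so $a = 0$ and $b = -2$, and use the form of $M$ whose second denominator is $2xy^2 - b$, which is the one that yields these values.

```python
def M(a, b, x, y):
    """The conjugate map M for f(x) = x^3 + a*x + b."""
    return ((2 * b * x + 2 * a * x * y + b * y) / (2 * b + a * y - x * x * y),
            (x * x * y * y - a * y * y - 2 * b * y) / (2 * x * y * y - b))

a, b = 0.0, -2.0      # f(x) = x^3 - 2
point = (2.0, 3.0)
iterates = [point]
for _ in range(5):
    point = M(a, b, *point)
    iterates.append(point)

for x, y in iterates:
    print(f"({x:.9f}, {y:.9f})")
```

Both coordinates settle on the real cube root of 2 after five steps.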
Questions
• Generalize to the n × n case?
• Is either of the algorithms (M or N) actually a multi-dimensional Newton’s method?
• Are there deeper connections with McMullen’s algorithm (and his negative result for quartics, etc.)?
Further Reading
J. Hawkins, McMullen’s root-finding algorithm for cubic polynomials, Proc. Amer. Math. Soc. 130 (2002), no. 9, 2583-2592.
C. McMullen, Families of rational maps and iterative root-finding algorithms, Ann. Math. (2) 125 (1987), no. 3, 467-493.
S. Northshield, On two types of exotic addition, Aequationes Math. 77 (2009), no. 1-2, 1-23.
S. Northshield, A root-finding method for cubics, to appear, Proc. Amer. Math. Soc.
S. Smale, On the Efficiency of Algorithms of Analysis, Bull. Amer. Math. Soc. (New Series) 13 (1985), no. 2, 87-121.
M. Shub, S. Smale, On the Existence of Generally Convergent Algorithms, J. Complexity 2 (1986), 2-11.
Summary
• Newton’s method is “generally convergent” for quadratics (but not higher).
• Newton’s method for quadratics can be expressed in terms of 2 × 2 matrices.
• This can be generalized to 3 × 3 matrices, yielding a generally convergent (2-variable) method for cubics.
• This is motivated by, and related to, work by Smale, Shub, McMullen, and Hawkins.