Forward stable eigenvalue decomposition of rank-one modifications of diagonal matrices

N. Jakovčević Stor¹, I. Slapničar², J. L. Barlow³

arXiv:1405.7537v1 [math.NA] 29 May 2014

Abstract. We present a new algorithm for solving the eigenvalue problem for a real symmetric matrix which is a rank-one modification of a diagonal matrix. The algorithm computes each eigenvalue and all components of the corresponding eigenvector with high relative accuracy in O(n) operations. The algorithm is based on a shift-and-invert approach. Only a single element of the inverse of the shifted matrix eventually needs to be computed with double the working precision. Each eigenvalue and the corresponding eigenvector can be computed separately, which makes the algorithm adaptable for parallel computing. Our results extend to the Hermitian case.

Keywords: eigenvalue decomposition, diagonal-plus-rank-one update, high relative accuracy, forward stability

AMS subject classifications: 65F15, 65G50, 15-04, 15B99

1 Introduction and Preliminaries

In this paper we consider the eigenvalue problem for an $n \times n$ real symmetric matrix $A$ of the form

$$A = D + \rho z z^T, \tag{1}$$

where $D = \operatorname{diag}(d_1, d_2, \ldots, d_n)$ is a diagonal matrix of order $n$,

$$z = \begin{bmatrix} \zeta_1 & \zeta_2 & \cdots & \zeta_n \end{bmatrix}^T \tag{2}$$

is a vector, and $\rho \neq 0$ is a scalar. Notice that $A$ is a rank-one modification of a diagonal matrix. Subsequently, we shall refer to such matrices as “diagonal-plus-rank-one” (DPR1) matrices. DPR1 matrices arise, for example, in solving symmetric real tridiagonal eigenvalue problems with the divide-and-conquer method [10].

¹ University of Split, Faculty of Electrical Engineering, Mechanical Engineering and Naval Architecture, R. Boškovića 32, 21000 Split, Croatia, [email protected]
² University of Split, Faculty of Electrical Engineering, Mechanical Engineering and Naval Architecture, R. Boškovića 32, 21000 Split, Croatia, [email protected]
³ Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802-6822, USA, [email protected]

In this paper we present an algorithm which computes each eigenvalue and all components of the corresponding eigenvector with high relative accuracy in $O(n)$ operations.

Let us make some assumptions about the matrix $A$. First, without loss of generality, we may assume that $\rho > 0$; otherwise we may consider the matrix $-A = -D - \rho z z^T$. Further, without loss of generality, we may assume that $A$ is irreducible, that is, $\zeta_i \neq 0$ for all $i$ and $d_i \neq d_j$ for all $i \neq j$, $i, j = 1, \ldots, n$. If $\zeta_i = 0$ for some $i$, then the diagonal element $d_i$ is an eigenvalue whose corresponding eigenvector is the $i$-th unit vector, and we can reduce the size of the problem by deleting the $i$-th row and column of the matrix, eventually obtaining a matrix for which all elements $\zeta_j$ are nonzero. If $d_i = d_j$, then $d_i$ is an eigenvalue of the matrix $A$: we can reduce the size of the problem by annihilating $\zeta_j$ with a Givens rotation in the $(i, j)$-plane and proceeding as in the previous case. Also, by symmetric row and column pivoting, we can order the elements of $D$ such that

$$d_1 > d_2 > \cdots > d_n. \tag{3}$$

Finally, without loss of generality, we can also assume that $\zeta_i > 0$ for all $i$, which can be attained by pre- and post-multiplication of the matrix $A$ with the matrix $D_s = \operatorname{diag}(\operatorname{sign}(\zeta_1), \ldots, \operatorname{sign}(\zeta_n))$.

To summarize, we assume that $A$ is an ordered irreducible DPR1 matrix of the form (1), where $\rho > 0$, the elements of the diagonal matrix $D$ satisfy (3), and $\zeta_i > 0$ for all $i$. Let

$$A = V \Lambda V^T \tag{4}$$

be the eigenvalue decomposition of $A$. Here $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ is a diagonal matrix whose diagonal elements are the eigenvalues of $A$, and $V = \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix}$ is an orthonormal matrix whose columns are the corresponding eigenvectors.

The eigenvalue problem for a DPR1 matrix $A$ can clearly be solved by any of the standard methods for the symmetric eigenvalue problem (see, for example, [21, 20]). However, due to the special structure of diagonal-plus-rank-one matrices, we can use the following approach. The eigenvalues of $A$ are the zeros of the secular equation (see, for example, [5] and [8, Section 8.5.3]):

$$f(\lambda) = 1 + \rho \sum_{i=1}^{n} \frac{\zeta_i^2}{d_i - \lambda} = 1 + \rho z^T (D - \lambda I)^{-1} z = 0, \tag{5}$$

and the corresponding eigenvectors are given by

$$v_i = \frac{x_i}{\|x_i\|_2}, \qquad x_i = (D - \lambda_i I)^{-1} z, \qquad i = 1, \ldots, n. \tag{6}$$

The diagonal elements $d_i$ of the matrix $D$ are called the poles of the function $f$. Notice that

$$f'(\lambda) = \rho \sum_{i=1}^{n} \frac{\zeta_i^2}{(d_i - \lambda)^2}.$$
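As a small illustration of (5) and (6), the following Python sketch evaluates the secular function, locates one interior eigenvalue by plain bisection between two neighbouring poles, and forms the eigenvector. It only illustrates the classical approach, not the forward stable algorithm of this paper; the function names and the 0-based indexing are our own assumptions.

```python
import math

def secular(d, z, rho, lam):
    """Evaluate f(lam) = 1 + rho * sum(zeta_i^2 / (d_i - lam)) from (5)."""
    return 1.0 + rho * sum(zi * zi / (di - lam) for di, zi in zip(d, z))

def eig_between_poles(d, z, rho, k, iters=200):
    """Bisection for the zero of f in (d[k], d[k-1]); 0-based k >= 1.

    Assumes d strictly decreasing and rho > 0, so f increases from -inf
    (just right of d[k]) to +inf (just left of d[k-1]).
    """
    lo, hi = d[k], d[k - 1]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:   # interval can no longer be split
            break
        if secular(d, z, rho, mid) > 0.0:
            hi = mid                 # zero lies to the left of mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def eigvec(d, z, lam):
    """Normalized eigenvector x / ||x||_2 with x = (D - lam I)^{-1} z, as in (6)."""
    x = [zi / (di - lam) for di, zi in zip(d, z)]
    nrm = math.sqrt(sum(xi * xi for xi in x))
    return [xi / nrm for xi in x]

def residual(d, z, rho, lam, v):
    """Max-norm of A v - lam v for A = diag(d) + rho z z^T."""
    ztv = sum(zi * vi for zi, vi in zip(z, v))
    return max(abs(di * vi + rho * zi * ztv - lam * vi)
               for di, zi, vi in zip(d, z, v))
```

For example, with `d = [3.0, 2.0, 1.0]`, `z = [1.0, 1.0, 1.0]` and `rho = 1.0`, calling `eig_between_poles(d, z, rho, 1)` returns the eigenvalue lying between the poles 2 and 3.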

Since $\rho > 0$, $f$ is strictly increasing between the poles, which implies the strict interlacing property

$$\lambda_1 > d_1 > \lambda_2 > d_2 > \cdots > \lambda_n > d_n. \tag{7}$$

The approach given in (5) and (6) is conceptually simple and has been used to solve similar eigenvalue problems [2, 4, 5, 7]. However, maintaining orthogonality among the eigenvectors $v_i$ in (6) requires all of the eigenvalues $\lambda_i$ to be computed with high accuracy [10]. In other words, if the computed eigenvalues $\lambda_i$ are not accurate enough, then the computed eigenvectors $v_i$ may not be sufficiently orthogonal (see Example 3). The existing algorithms for DPR1 matrices [10, 5, 7] obtain orthogonal eigenvectors with the following procedure:

- compute the eigenvalues $\tilde\lambda_i$ of $A$ by solving (5);
- construct a new matrix

  $$\tilde A = D + \rho \tilde z \tilde z^T \tag{8}$$

  by solving an inverse problem with the prescribed eigenvalues $\tilde\lambda$ and diagonal matrix $D$, that is, compute new $\tilde z$ as

  $$\tilde\zeta_i = \sqrt{ \left(\tilde\lambda_i - d_i\right) \prod_{j=1}^{i-1} \frac{\tilde\lambda_j - d_i}{d_j - d_i} \prod_{j=i+1}^{n} \frac{\tilde\lambda_j - d_i}{d_j - d_i} };$$

- compute the eigenvectors of $\tilde A$ by (6).

Since the formulas for $\tilde\zeta_i$ involve only multiplications, divisions and subtractions of exact quantities, each $\tilde\zeta_i$ is computed with relative error of $O(\varepsilon_M)$, where $\varepsilon_M$ denotes the machine precision. The machine precision $\varepsilon_M$ is defined as the smallest positive number such that in floating-point arithmetic $1 + \varepsilon_M \neq 1$. In Matlab or FORTRAN REAL(8) arithmetic $\varepsilon_M = 2^{-53} \approx 1.1102 \cdot 10^{-16}$, thus the floating-point numbers have approximately 16 significant decimal digits. The term “double the working precision” means that the computations are performed with numbers having approximately 32 significant decimal digits, or with the machine precision less than or equal to $\varepsilon_M^2$.

Therefore, $\tilde A$ in (8) satisfies $\tilde A = A + \delta A$, where $\|\delta A\|_2 = O(\varepsilon_M)$. Here $\|\cdot\|_2$ denotes the spectral matrix norm. We conclude that the computed eigenvalues $\tilde\lambda_i$ satisfy standard perturbation bounds like those from [8, Corollary 8.1.6]. Further, since $\tilde\lambda_i$ are the eigenvalues of the matrix $\tilde A$ computed to higher relative accuracy, the eigenvectors computed by (6) are orthogonal to machine precision. For details see [10, 5, 7, 2]. This results in an algorithm which requires only $O(n^2)$ computations and $O(n)$ storage for eigenvalues and $O(n)$ storage for each eigenvector. This algorithm is implemented in the LAPACK subroutine DLAED9 and its subroutines [1].

Our algorithm uses a different approach and is forward stable, that is, it computes all eigenvalues and all individual components of the corresponding eigenvectors of a given DPR1 matrix of floating-point numbers to almost full accuracy, a feature which no other method has. The accuracy of the eigenvectors and their numerical orthogonality follows from the high relative accuracy of the computed eigenvalues. We use bisection to compute the zeros of the secular equation; for larger matrices ($n = 4000$), our FORTRAN implementation is about four times slower than the optimal implementation of the LAPACK routine DLAED9 from the Intel Math Kernel Library [13]. The algorithm is based on a shift-and-invert technique.
Basically, an eigenvalue $\lambda$ is computed from the largest or the smallest eigenvalue of the inverse of the matrix shifted to the pole $d_i$ which is nearest to $\lambda$, that is,

$$\lambda = \frac{1}{\nu} + d_i, \tag{9}$$

where $\nu$ is either the largest or the smallest (first or last) eigenvalue of the matrix

$$A_i^{-1} \equiv (A - d_i I)^{-1}.$$

The inverses of DPR1 matrices are structured as follows (here $\times$ stands for a non-zero element): the inverse of an irreducible DPR1 matrix with one pole equal to zero is a permuted arrowhead matrix [14]:

$$\left( \begin{bmatrix} \times & & & & \\ & \times & & & \\ & & 0 & & \\ & & & \times & \\ & & & & \times \end{bmatrix} + \rho z z^T \right)^{-1} = \begin{bmatrix} \times & & \times & & \\ & \times & \times & & \\ \times & \times & \times & \times & \times \\ & & \times & \times & \\ & & \times & & \times \end{bmatrix}.$$

The corresponding formulas are given in (10), (11) and (12). The inverse of a non-singular DPR1 matrix with $d_i \neq 0$, $i = 1, \ldots, n$, is again a DPR1 matrix. The corresponding formulas are given in Remark 1.

Our algorithm is completely parallel, since the computation of one eigenvalue and its eigenvector is completely independent of the computation of other eigenvalues and eigenvectors.

The organization of the paper is the following. In Section 2, we describe the basic idea of our algorithm named dpr1eig (DPR1 EIGenvalues). In Section 3, we discuss the accuracy of the algorithm. In Section 4, we present the complete algorithm which uses double the working precision, if necessary. In Section 4.1, we discuss fast secular equation solvers, and in Section 4.2 we discuss three implementations of the double the working precision. In Section 4.3 we extend our results to the Hermitian case. In Section 5, we illustrate the algorithm with a few examples. The proofs are given in Appendix A.

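2 The basic shift-and-invert algorithm

Let $\lambda$ be an eigenvalue of $A$, let $v$ be its eigenvector, and let $x$ be the unnormalized version of $v$ from (6). Let $d_i$ be a pole which is closest to $\lambda$. Clearly, from (7) it follows that either $\lambda = \lambda_i$ or $\lambda = \lambda_{i+1}$. Let $A_i$ be the shifted matrix

$$A_i = A - d_i I = \begin{bmatrix} D_1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & D_2 \end{bmatrix} + \rho \begin{bmatrix} z_1 \\ \zeta_i \\ z_2 \end{bmatrix} \begin{bmatrix} z_1^T & \zeta_i & z_2^T \end{bmatrix}, \tag{10}$$

where

$$D_1 = \operatorname{diag}(d_1 - d_i, \ldots, d_{i-1} - d_i), \qquad D_2 = \operatorname{diag}(d_{i+1} - d_i, \ldots, d_n - d_i),$$

$$z_1 = \begin{bmatrix} \zeta_1 & \zeta_2 & \cdots & \zeta_{i-1} \end{bmatrix}^T, \qquad z_2 = \begin{bmatrix} \zeta_{i+1} & \zeta_{i+2} & \cdots & \zeta_n \end{bmatrix}^T.$$

Notice that $D_1$ ($D_2$) is positive (negative) definite. Obviously, $\lambda$ is an eigenvalue of $A$ if and only if

$$\mu = \lambda - d_i$$

is an eigenvalue of $A_i$, and they share the same eigenvector. The inverse of $A_i$ is a permuted arrowhead matrix

$$A_i^{-1} = \begin{bmatrix} D_1^{-1} & w_1 & 0 \\ w_1^T & b & w_2^T \\ 0 & w_2 & D_2^{-1} \end{bmatrix}, \tag{11}$$

where

$$w_1 = -D_1^{-1} z_1 \frac{1}{\zeta_i}, \qquad w_2 = -D_2^{-1} z_2 \frac{1}{\zeta_i}, \qquad b = \frac{1}{\zeta_i^2} \left( \frac{1}{\rho} + z_1^T D_1^{-1} z_1 + z_2^T D_2^{-1} z_2 \right) = \frac{\bar f(d_i)}{\rho \zeta_i^2}. \tag{12}$$

In (12),

$$\bar f(d_i) = 1 + \rho \bar z^T \left( \bar D - d_i I \right)^{-1} \bar z,$$

where $\bar D$ is the diagonal matrix $D$ without $d_i$ and $\bar z$ is $z$ without $\zeta_i$. The computation of the scalar $b$ in (12) is critical to how well we are able to compute a desired eigenvalue.

The eigenvalues $\nu = 1/\mu$ of the real symmetric arrowhead matrix $A_i^{-1}$ from (11) are the zeros of the secular equation (see, for example, [19, 14])

$$g(\nu) = b - \nu - w^T (\Delta - \nu I)^{-1} w = 0, \tag{13}$$

where

$$\Delta = \begin{bmatrix} D_1^{-1} & \\ & D_2^{-1} \end{bmatrix}, \qquad w = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}.$$

Once $\nu$ is computed, the normalized and unnormalized eigenvectors $v$ and $x$ are computed by applying (6) to the matrix $A_i$, that is,

$$x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} (D_1 - \mu I)^{-1} z_1 \\ -\dfrac{\zeta_i}{\mu} \\ (D_2 - \mu I)^{-1} z_2 \end{bmatrix}. \tag{14}$$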

If $\lambda$ is an eigenvalue of $A$ which is closest to the pole $d_i$, then $\mu$ is the eigenvalue of the matrix $A_i$ which is closest to zero and

$$\nu = \frac{1}{\mu} = \pm \left\| A_i^{-1} \right\|_2.$$

We say that $\nu$ is the largest absolute eigenvalue of $A_i^{-1}$. In this case, if all entries of $A_i^{-1}$ are computed with high relative accuracy, then, according to standard perturbation theory, any reasonable algorithm can compute $\nu$ to high relative accuracy. In Section 3, we show that all entries of $A_i^{-1}$ are indeed computed to high relative accuracy, except possibly $b$ in (12). If $b$ is not computed to high relative accuracy and it influences $\|A_i^{-1}\|_2$, it is sufficient to compute it with double the working precision (see Section 4).

Further, if $\mu$ is not the eigenvalue of $A_i$ which is closest to zero, then $|\nu| < \|A_i^{-1}\|_2$, and the quantity

$$K_\nu = \frac{\left\| A_i^{-1} \right\|_2}{|\nu|} \tag{15}$$

tells us how far $\nu$ is from the largest absolute eigenvalue of $A_i^{-1}$. If $K_\nu \gg 1$, then standard perturbation theory does not guarantee that the eigenvalue $\mu$ will be computed with high relative accuracy. Remedies to this situation are described in Remark 3.

With this approach, the componentwise high relative accuracy of the eigenvectors computed by (14) follows from the high relative accuracy of the computed eigenvalues (see Theorem 3). Componentwise high relative accuracy of the computed normalized eigenvectors implies, in turn, their numerical orthogonality.

The described procedure is implemented in algorithm dpr1eig_basic (Algorithm 1). The computation of the inverse of the shifted matrix, $A_i^{-1}$, according to formulas (11) and (12), is implemented in Algorithm 2. Algorithm 3 computes the first or last zero of the secular equation (13) by bisection. Given an eigenvalue $\lambda$, Algorithm 4 computes the corresponding eigenvector according to (6), or, if called for the shifted matrix, according to (14).
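The structure of the inverse in (11) and (12) can be checked on a small example. The following Python sketch builds the shifted matrix and the claimed inverse entry by entry and multiplies them; with `fractions.Fraction` inputs the product is exactly the identity. The function names, the dictionary layout and the 0-based shift index `i` are our own assumptions for illustration.

```python
from fractions import Fraction

def shifted_inverse(d, z, rho, i):
    """Entries of inv(A - d[i] I) per (11)-(12): inverse diagonal, w, and b."""
    idx = [j for j in range(len(d)) if j != i]
    inv_d = {j: 1 / (d[j] - d[i]) for j in idx}        # D1^{-1} and D2^{-1}
    w = {j: -z[j] * inv_d[j] / z[i] for j in idx}      # w1 and w2
    b = (1 / rho + sum(z[j] ** 2 * inv_d[j] for j in idx)) / z[i] ** 2
    return inv_d, w, b

def dense_inverse(d, z, rho, i):
    """Assemble the permuted arrowhead matrix (11) as a dense list of rows."""
    n = len(d)
    inv_d, w, b = shifted_inverse(d, z, rho, i)
    m = [[0] * n for _ in range(n)]
    for j in inv_d:
        m[j][j] = inv_d[j]
        m[j][i] = m[i][j] = w[j]
    m[i][i] = b
    return m

def dense_shifted(d, z, rho, i):
    """Assemble A - d[i] I = diag(d - d[i]) + rho z z^T as a dense list of rows."""
    n = len(d)
    return [[(d[r] - d[i] if r == c else 0) + rho * z[r] * z[c]
             for c in range(n)] for r in range(n)]

def matmul(a, b):
    """Plain dense matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[r][t] * b[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]
```

Using rational inputs keeps the check free of rounding, so any discrepancy would point to an error in the formulas rather than to floating-point effects.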

3 Accuracy of the algorithm

We now consider numerical properties of Algorithms 1, 2, 3, and 4. We assume the standard model of floating-point arithmetic where subtraction is performed with a guard digit, such that [11, Section 2.2]

$$fl(a \; \mathrm{op} \; b) = (a \; \mathrm{op} \; b)(1 + \varepsilon_{op}), \qquad |\varepsilon_{op}| \le \varepsilon_M, \qquad \mathrm{op} \in \{+, -, *, /\},$$

where $a$ and $b$ are floating-point numbers, $\varepsilon_M$ is the machine precision, and we assume that neither underflow nor overflow occurs. In the statements of the theorems and their proofs, we shall use the standard first order approximations, that is, we neglect the terms of order $O(\varepsilon_M^2)$ or higher.

We shall use the following notation:

$$\begin{array}{lll}
\text{Matrix} & \text{Exact eigenvalue} & \text{Computed eigenvalue} \\ \hline
A & \lambda & \tilde\lambda \\
A_i & \mu & - \\
\tilde A_i = fl(A_i) & \hat\mu & \tilde\mu = fl(\hat\mu) \\
A_i^{-1} & \nu & - \\
\widetilde{(A_i^{-1})} = fl(A_i^{-1}) & \hat\nu & \tilde\nu = fl(\hat\nu)
\end{array} \tag{16}$$

Here

$$\tilde A_i = fl(A_i) = \begin{bmatrix} D_1 (I + E_1) & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & D_2 (I + E_2) \end{bmatrix} + \rho \begin{bmatrix} z_1 \\ \zeta_i \\ z_2 \end{bmatrix} \begin{bmatrix} z_1^T & \zeta_i & z_2^T \end{bmatrix},$$

where $E_1$ and $E_2$ are diagonal matrices whose elements are bounded by $\varepsilon_M$ in absolute value.

Algorithm 1
[λ, v] = dpr1eig_basic(D, z, ρ, k)
    % Computes the k-th eigenpair of an irreducible DPR1 matrix A = diag(D) + ρzz′
    n = length(D)
    % Determine the shift σ, the shift index i, and whether λ is on the left
    % or the right side of the nearest pole.
    if k == 1
        % Exterior eigenvalue (k = 1):
        σ = d_1
        i = 1
        side = 'R'
    else
        % Interior eigenvalues (k ∈ {2, . . . , n}):
        Dtemp = D − d_k
        middle = Dtemp_{k−1} / 2
        Fmiddle = 1 + ρ * sum(z .* z ./ (Dtemp − middle))
        if Fmiddle > 0
            σ = d_k
            i = k
            side = 'R'
        else
            σ = d_{k−1}
            i = k − 1
            side = 'L'
        end
    end
    % Compute the inverse of the shifted matrix, A_i^{−1}
    [invD1, invD2, w1, w2, b] = invA(D, z, ρ, i)
    % Compute the leftmost or the rightmost eigenvalue of the arrowhead matrix A_i^{−1}
    ν = bisect([invD1; invD2], [w1; w2], b, side)
    % Compute the corresponding eigenvector according to (14)
    µ = 1/ν
    v = vect(D − σ, z, µ)
    % Shift the eigenvalue back
    λ = µ + σ

Algorithm 2
[invD1, invD2, w1, w2, b] = invA(D, z, ρ, i)
    % Computes the inverse of an irreducible DPR1 matrix A = diag(D − d_i) + ρzz′
    % according to (11) and (12).
    n = length(D)
    D = D − d_i
    invD1 = 1 ./ D_{1:i−1}
    invD2 = 1 ./ D_{i+1:n}
    wζ = 1 / z_i
    w1 = −z_{1:i−1} .* invD1 * wζ
    w2 = −z_{i+1:n} .* invD2 * wζ
    b = (1/ρ + sum(z_{1:i−1}.^2 .* invD1) + sum(z_{i+1:n}.^2 .* invD2)) * wζ * wζ

Algorithm 3
λ = bisect(∆, w, b, side)
    % Computes the leftmost (for side='L') or the rightmost (for side='R') eigenvalue
    % of an arrowhead matrix A = [diag(∆) w; w′ b] by solving (13) with bisection.
    % Determine the starting interval for bisection, [left, right]
    if side == 'L'
        left = min{∆ − |w|, b − ‖w‖_1}
        right = min ∆
    else
        right = max{∆ + |w|, b + ‖w‖_1}
        left = max ∆
    end
    % Bisection
    middle = (left + right)/2
    while (right − left)/abs(middle) > ε_M
        Fmiddle = b − middle − sum(w.^2 ./ (∆ − middle))
        if Fmiddle > 0
            left = middle
        else
            right = middle
        end
        middle = (left + right)/2
    end
    % Eigenvalue
    λ = right

Algorithm 4
v = vect(D, z, λ)
    % Computes the eigenvector of a DPR1 matrix A = diag(D) + ρzz′
    % which corresponds to the eigenvalue λ by using (6).
    v = z ./ (D − λ)
    v = v / ‖v‖_2

Further we define the quantities $\kappa_\lambda$, $\kappa_\mu$ and $\kappa_b$ as follows:

$$\tilde\lambda = fl(\lambda) = \lambda (1 + \kappa_\lambda \varepsilon_M), \tag{17}$$

$$\tilde\mu = fl(\mu) = \mu (1 + \kappa_\mu \varepsilon_M), \tag{18}$$

$$\tilde b = fl(b) = b (1 + \kappa_b \varepsilon_M). \tag{19}$$

We also define the quantity

$$K_b = \frac{1 + \rho z_1^T D_1^{-1} z_1 - \rho z_2^T D_2^{-1} z_2}{\left| 1 + \rho z_1^T D_1^{-1} z_1 + \rho z_2^T D_2^{-1} z_2 \right|}. \tag{20}$$

In the next few sections, we show that keeping $K_b$ modest allows us to compute $b$ in (12) in a way that keeps $\kappa_b$ in (19) modest. Before that, we show that if we keep $\kappa_b$ modest, then it is always possible to compute highly accurate eigenvalues and eigenvectors.
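Algorithms 1 to 4 can be transcribed into Python roughly as follows. This is a sketch under our own assumptions, not the authors' Matlab or FORTRAN code: indices are 0-based, the function names are ours, `b` is evaluated in working precision only, and the bisection tolerance is `sys.float_info.epsilon` (2^-52) with an extra guard that stops once the interval can no longer be split, which is slightly looser than the stopping test of Algorithm 3.

```python
import math
import sys

EPS = sys.float_info.epsilon  # 2**-52, an assumption; the paper uses eps_M = 2**-53

def inv_shifted(d, z, rho, i):
    """Algorithm 2: entries of inv(A - d[i] I); delta and w skip index i."""
    idx = [j for j in range(len(d)) if j != i]
    delta = [1.0 / (d[j] - d[i]) for j in idx]
    w = [-z[j] * dj / z[i] for j, dj in zip(idx, delta)]
    b = (1.0 / rho + sum(z[j] ** 2 * dj for j, dj in zip(idx, delta))) / z[i] ** 2
    return delta, w, b

def bisect_arrow(delta, w, b, side):
    """Algorithm 3: leftmost ('L') or rightmost ('R') zero of (13), n >= 2."""
    norm1 = sum(abs(wj) for wj in w)
    if side == "L":
        left = min(min(dj - abs(wj) for dj, wj in zip(delta, w)), b - norm1)
        right = min(delta)
    else:
        right = max(max(dj + abs(wj) for dj, wj in zip(delta, w)), b + norm1)
        left = max(delta)
    middle = 0.5 * (left + right)
    while (right - left) > EPS * abs(middle) and left < middle < right:
        g = b - middle - sum(wj * wj / (dj - middle) for dj, wj in zip(delta, w))
        if g > 0.0:
            left = middle    # g is decreasing, so the zero lies to the right
        else:
            right = middle
        middle = 0.5 * (left + right)
    return right

def dpr1eig_basic(d, z, rho, k):
    """Algorithm 1 with 0-based k; assumes d strictly decreasing, z > 0, rho > 0."""
    if k == 0:
        i, side = 0, "R"                       # exterior eigenvalue
    else:
        middle = 0.5 * (d[k - 1] - d[k])       # midpoint of the shifted interval
        f_mid = 1.0 + rho * sum(zj * zj / ((dj - d[k]) - middle)
                                for dj, zj in zip(d, z))
        i, side = (k, "R") if f_mid > 0.0 else (k - 1, "L")
    delta, w, b = inv_shifted(d, z, rho, i)
    mu = 1.0 / bisect_arrow(delta, w, b, side)
    # Algorithm 4 applied to the shifted matrix, i.e. formula (14)
    x = [zj / ((dj - d[i]) - mu) for dj, zj in zip(d, z)]
    nrm = math.sqrt(sum(xj * xj for xj in x))
    return mu + d[i], [xj / nrm for xj in x]
```

Each call is independent of the others, so the eigenpairs for different `k` could be computed in parallel, in line with the remark in Section 1.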

3.1 Connection between accuracy of λ and µ

Let $\lambda = \mu + d_i$ be an eigenvalue of the matrix $A$, and let $\mu$ be the corresponding eigenvalue of the shifted matrix $A_i = A - d_i I$. We compute $\mu$ from $A_i^{-1}$ and then $\lambda$. Also, let

$$\tilde\lambda = fl(\tilde\mu + d_i)$$

be the computed eigenvalue. Theorem 1 shows the relation between the accuracy of $\tilde\lambda$ in (17) and the accuracy of $\tilde\mu$ in (18).

Theorem 1. For $\lambda$ and $\tilde\lambda$ from (17) and $\mu$ and $\tilde\mu$ from (18) we have

$$|\kappa_\lambda| \le \frac{|d_i| + |\mu|}{|\lambda|} \left( |\kappa_\mu| + 1 \right). \tag{21}$$

Proofs of this theorem and subsequent theorems are given in Appendix A.

From Theorem 1 we see that the accuracy of $\tilde\lambda$ depends on $\kappa_\mu$ and the size of the quotient

$$\frac{|d_i| + |\mu|}{|\lambda|}. \tag{22}$$

Theorem 2 analyzes the quotient (22) with respect to the position of $\lambda$, the sign of $\mu$, and the signs of the neighboring poles. Figure 1 illustrates the assumptions made in Theorem 2.

Theorem 2. Let the assumptions of Theorem 1 hold.

(i) If (see Figure 1 (i)) $\operatorname{sign}(d_i) = \operatorname{sign}(\mu)$, then

$$\frac{|d_i| + |\mu|}{|\lambda|} = 1.$$

(ii) If $\lambda$ is between two poles of the same sign and $\operatorname{sign}(d_i) \neq \operatorname{sign}(\mu)$ (see Figure 1 (ii)), then

$$\frac{|d_i| + |\mu|}{|\lambda|} \le 3.$$

[Figure: two number-line sketches, panels (i) and (ii), showing the positions of 0, the neighboring poles, the eigenvalue λ, and the distance µ.]

Figure 1: Typical situations from Theorem 2

Theorem 2 does not cover the cases when $d_1 < 0$, $d_{i+1} < 0 < d_i$ and $\mu < 0$, or $d_i < 0 < d_{i-1}$ and $\mu > 0$ (see e.g. Figure 2). Notice that these cases are mutually exclusive, so there can be at most one eigenvalue for which the quotient (22) is not bounded by Theorem 2. If one of these cases occurs, the quotient (22) may still be small. The quotient (22) will clearly be large if, additionally, $\mu$ is such that $\lambda$ is near zero. Then $\lambda$ is computed as a difference of two close quantities and cancellation can occur.

[Figure: two number-line sketches, panels (a) and (b), showing 0, the poles $d_1$, $d_i$, $d_{i+1}$, the eigenvalues $\lambda_1$ and $\lambda$, and the distance µ.]

Figure 2: Typical situations for special cases

If the quotient (22) is large, a remedy is given in the following remark.

Remark 1. [Inverting the unshifted matrix] If, in one of the above cases, the quotient (22) is large, then $\lambda$ is also an eigenvalue of $A$ nearest to zero, and we can accurately compute it from the inverse of $A$. Notice that the inverse of an unreduced DPR1 matrix $A = D + \rho z z^T$, with all poles being non-zero, is a DPR1 matrix of the form

$$A^{-1} = D^{-1} + \gamma u u^T, \qquad u = D^{-1} z, \qquad \gamma = -\frac{\rho}{1 + \rho z^T D^{-1} z}. \tag{23}$$

Eigenvalues of $A^{-1}$ are the zeros of the corresponding secular equation (5). Since the absolutely largest eigenvalue of $A^{-1}$ is computed accurately according to standard perturbation theory, and $1/|\lambda| = \|A^{-1}\|_2$, $\lambda$ is also computed with high relative accuracy. In computing the matrix $A^{-1}$, eventually $\gamma$ needs to be computed in higher precision. For more details see Remark 3. If the denominator in $\gamma$ is computed as zero, the matrix $A$ is numerically singular and we can set $\lambda = 0$. Notice that all components of the corresponding eigenvector are still computed accurately.

Remark 2. [Additional accuracy in λ] Notice that Algorithm 1 (and, consequently, Algorithm 5 below) can be easily modified to return both quantities, $d_i$ and $\mu$, such that $\lambda = d_i + \mu$. If none of the remedies from Remark 1 were needed, these two quantities give additional information about $\lambda$ (that is, they give a more accurate representation of $\lambda$). An example is given in Example 2 in Section 5.

We still need to bound the quantity $\kappa_\mu$ from (21). This quantity depends on the accuracy of $fl(b)$. The bound for $\kappa_\mu$ is given in Theorem 6.
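Formula (23) of Remark 1 is the Sherman-Morrison formula applied to a diagonal matrix, and it can be verified on a small example. The Python sketch below assembles $A$ and the claimed inverse and multiplies them; with `fractions.Fraction` inputs the result is exactly the identity. The function names are hypothetical, chosen for this illustration only.

```python
from fractions import Fraction

def dpr1_inverse(d, z, rho):
    """Return (inv_d, u, gamma) with inv(A) = diag(inv_d) + gamma * u u^T, as in (23)."""
    inv_d = [1 / dj for dj in d]                  # requires all poles non-zero
    u = [zj * t for zj, t in zip(z, inv_d)]       # u = D^{-1} z
    denom = 1 + rho * sum(zj * uj for zj, uj in zip(z, u))
    if denom == 0:
        raise ZeroDivisionError("A is singular")
    return inv_d, u, -rho / denom

def times_inverse(d, z, rho):
    """Dense product A * inv(A); it should equal the identity matrix."""
    n = len(d)
    inv_d, u, gamma = dpr1_inverse(d, z, rho)
    a = [[(d[r] if r == c else 0) + rho * z[r] * z[c] for c in range(n)]
         for r in range(n)]
    m = [[(inv_d[r] if r == c else 0) + gamma * u[r] * u[c] for c in range(n)]
         for r in range(n)]
    return [[sum(a[r][t] * m[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]
```

In floating-point arithmetic the denominator of `gamma` is where cancellation can occur, which is why the remark notes that it may need higher precision.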

3.2 Accuracy of the eigenvectors

Since the eigenvector is computed by (14), its accuracy depends on the accuracy of $\tilde\mu$ as described by the following theorem:

Theorem 3. Let (18) hold and let

$$\tilde x = \begin{bmatrix} \tilde x_1 \\ \vdots \\ \tilde x_n \end{bmatrix} = fl\left( \begin{bmatrix} (D_1 (I + E_1) - \tilde\mu I)^{-1} z_1 \\ -\dfrac{\zeta_i}{\tilde\mu} \\ (D_2 (I + E_2) - \tilde\mu I)^{-1} z_2 \end{bmatrix} \right) \tag{24}$$

be the computed un-normalized eigenvector corresponding to $\mu$ and $\lambda$. Then

$$\tilde x_j = x_j \left( 1 + \varepsilon_{x_j} \right), \qquad \left| \varepsilon_{x_j} \right| \le 3 \left( |\kappa_\mu| + 3 \right) \varepsilon_M, \qquad j = 1, \ldots, n.$$

In other words, if $\kappa_\mu$ is small, then all components of the unnormalized eigenvector $x$ are computed to high relative accuracy. Since computing the vector norm and the scaling induces only a small relative error in each component, all components of the corresponding normalized eigenvector $v$ are also computed to high relative accuracy. Componentwise high relative accuracy of the computed normalized eigenvectors implies, in turn, that they are mutually numerically orthogonal to the order $O(n \kappa_\mu \varepsilon_M)$.

Since the accuracy of $\tilde\lambda$ and $\tilde x$ depends on the accuracy of $\tilde\mu$ (that is, the size of $\kappa_\mu$), in the next three subsections we discuss the accuracy of $\tilde\mu$. Since $\tilde\mu$ is computed as the inverse of the eigenvalue of the matrix $fl(A_i^{-1})$, we first discuss the accuracy of that matrix.

3.3 Accuracy of the matrix $A_i^{-1}$

We have the following theorem:

Theorem 4. For the computed elements of the matrix $A_i^{-1}$ from (11) and (12), for all $(j, k) \neq (i, i)$ we have

$$\widetilde{(A_i^{-1})}_{jk} = fl\left( A_i^{-1} \right)_{jk} = \left( A_i^{-1} \right)_{jk} (1 + \varepsilon_{jk}), \qquad |\varepsilon_{jk}| \le 3 \varepsilon_M.$$

For the computed element $b \equiv (A_i^{-1})_{ii}$ from (19) we have

$$|\kappa_b| \le (n + 4) K_b,$$

where $K_b$ is defined by (20).

The above theorem states that all elements of the matrix $A_i^{-1}$ are computed with high relative accuracy with the possible exception of $b = (A_i^{-1})_{ii}$. Therefore, the accuracy of our algorithm depends upon the accurate computation of $b$ as detailed in Section 4.
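To get a feeling for the size of $K_b$, the following Python sketch evaluates (20) for a given shift index. It assumes that the denominator in (20) is taken in absolute value, that the poles are strictly decreasing, and that the shift index `i` is 0-based; the function name `kb` is ours.

```python
from fractions import Fraction

def kb(d, z, rho, i):
    """K_b from (20) for the 0-based shift index i; d must be strictly decreasing."""
    # z1' inv(D1) z1 >= 0: poles above d[i]
    pos = sum(z[j] ** 2 / (d[j] - d[i]) for j in range(i))
    # z2' inv(D2) z2 <= 0: poles below d[i]
    neg = sum(z[j] ** 2 / (d[j] - d[i]) for j in range(i + 1, len(d)))
    num = 1 + rho * pos - rho * neg       # sum of the absolute values of the terms
    den = abs(1 + rho * pos + rho * neg)  # absolute value of the actual sum
    return num / den                      # raises ZeroDivisionError if den == 0
```

When all terms have the same sign (for instance when shifting to the smallest pole) the sketch returns 1, which reflects that no cancellation can occur in $b$; a value much larger than 1 signals that the positive and negative parts nearly cancel.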

3.4 Accuracy of bisection

Let $\lambda_{\max}$ be the absolutely largest eigenvalue of a symmetric arrowhead matrix $A$, and let $\tilde\lambda_{\max}$ be the eigenvalue computed by bisection as implemented in Algorithm 3. The error bound from [19, §3.1] immediately implies that

$$\frac{\left| \tilde\lambda_{\max} - \lambda_{\max} \right|}{|\lambda_{\max}|} = \kappa_{bis} \varepsilon_M, \qquad \kappa_{bis} \le 1.06\, n \left( \sqrt{n} + 1 \right). \tag{25}$$

Notice that a similar error bound holds for all eigenvalues which are of the same order of magnitude as $|\lambda_{\max}|$.

3.5 Accuracy of exterior eigenvalues of $A_i^{-1}$

The desired interior eigenvalue and, in some cases, also the absolutely smaller exterior eigenvalue $\lambda$ of $A$ is computed by (9), where $\nu$ is one of the exterior eigenvalues of the matrix $A_i^{-1}$. The following theorem covers the case when $\nu$ is the largest absolute eigenvalue of $A_i^{-1}$, and gives two different bounds.

Theorem 5. Let $A_i^{-1}$ be defined by (11) and let $\nu$ be its eigenvalue such that

$$|\nu| = \left\| A_i^{-1} \right\|_2. \tag{26}$$

Let $\hat\nu$ be the exact eigenvalue of the computed matrix $\widetilde{(A_i^{-1})} = fl(A_i^{-1})$. Let

$$\hat\nu = \nu (1 + \kappa_\nu \varepsilon_M). \tag{27}$$

Then

$$|\kappa_\nu| \le \min\left\{ (n + 4) \sqrt{n}\, K_b, \;\; 3 \sqrt{n} + (n + 4) \left( 1 + \frac{2}{|\zeta_i|} \sum_{\substack{j=1 \\ j \neq i}}^{n} |\zeta_j| \right) \right\}, \tag{28}$$

where $K_b$ is defined by (20).

3.6 Final error bounds

All previous error bounds are summarized as follows.

Theorem 6. Let $\tilde\lambda$ be the computed eigenvalue of an unreduced DPR1 matrix $A$, let $\tilde\mu$ be the computed eigenvalue of the matrix $\tilde A_i$ from (10), and let $\tilde\nu$ be the corresponding computed eigenvalue of the matrix $\widetilde{(A_i^{-1})}$ from (11). If $\mu$ is the eigenvalue of $A_i$ closest to zero (or, equivalently, if (26) holds), then the error in the computed eigenvalue $\tilde\lambda$ is given by (17) with

$$|\kappa_\lambda| \le 3 (|\kappa_\nu| + \kappa_{bis}) + 4, \tag{29}$$

and the error in the computed un-normalized eigenvector $\tilde x$ is given by Theorem 3 with

$$|\kappa_\mu| \le |\kappa_\nu| + \kappa_{bis} + 1, \tag{30}$$

where $|\kappa_\nu|$ is bounded by (28) and $\kappa_{bis}$ is defined by (25).

Since we are essentially using a shift-and-invert technique, we can guarantee high relative accuracy of the computed eigenvalue and high componentwise relative accuracy of the computed eigenvector if $\nu$ is such that $|\nu| = O(\|A_i^{-1}\|_2)$ and it is computed accurately. This is certainly fulfilled if the following conditions are met:

C1. The quantity $K_\nu$ from (15) is moderate, and

C2. (i) either the quantity $K_b$ from (20) is small, or
    (ii) the quantity $\frac{1}{|\zeta_i|} \sum_{j=1, j \neq i}^{n} |\zeta_j|$ from (28) is of order $O(n)$.

The condition C1 implies that $\nu$ will be computed accurately according to the standard perturbation theory. The conditions C2 (i) or C2 (ii) imply that $\kappa_\nu$ from (28) is small, which, together with C1, implies that $\nu$ is computed accurately.

If the condition C1 does not hold, that is, if $K_\nu \gg 1$, remedies are given in Remark 3 below. If neither of the conditions C2 (i) and C2 (ii) holds, the remedy is to compute $b$ with double the working precision as described in Section 4.

Remark 3. [Non-standard shifting] There are two possibilities:

(a) we can compute $\lambda$ by shifting to the neighboring pole on the other side if that gives a smaller value of $K_\nu$ (for example, by shifting to the pole $d_{i-1}$ instead of $d_i$ in Figure 3 (a)),

(b) if shifting to another neighboring pole is not possible (if $K_\nu \gg 1$, see Figure 3 (b)), then we can invert $A - \sigma I$, where the shift $\sigma$ is chosen near but not equal to $\lambda$, and not equal to the neighboring poles. This results in a DPR1 matrix $(A - \sigma I)^{-1} = (D - \sigma I)^{-1} + \gamma u u^T$, where $u$ and $\gamma$ are defined similarly as in Remark 1 (simply substitute $D$ by $D - \sigma I$), and the largest absolute eigenvalue is computed accurately. Similar to $b$ in (12), $\gamma$ may need to be computed in higher precision.⁴ If no floating-point numbers $\sigma$ lie between $\lambda$ and the neighboring poles, $\sigma$ and the corresponding DPR1 matrix must be computed in double the working precision.

[Figure: two number-line sketches, panels (a) and (b), showing the poles $d_i$ and $d_{i-1}$, the eigenvalues $\lambda_{i+1}$ and $\lambda_{i-1}$, and the position of $\lambda$ ($\lambda_i$).]

Figure 3: Typical situations from Remark 3

⁴ Determining whether $\gamma$ needs to be computed in higher precision is done similarly as determining whether the element $b$ of $A_i^{-1}$ needs to be computed in higher precision, which is described in Section 4. Further, Theorem 7 implies that it suffices to compute $\gamma$ with double the working precision.

4 Final algorithm

If neither of the conditions C2 (i) and C2 (ii) holds, in order to guarantee that $\lambda$ will be computed with high relative accuracy, the element $b$ from the matrix $A_i^{-1}$ needs to be computed in higher precision. The following theorem implies that if $1 \ll K_b \le O(1/\varepsilon_M)$, it is sufficient to evaluate (12) with double the working precision.⁵

Theorem 7. Set

$$P = \frac{1}{\rho} + z_1^T D_1^{-1} z_1, \qquad Q = -z_2^T D_2^{-1} z_2.$$

Notice that $P, Q \ge 0$ and $b = (P - Q)/\zeta_i^2$. Let $\tilde P = fl(P)$ and $\tilde Q = fl(Q)$ be evaluated in standard precision, $\varepsilon_M$. Assume that $\tilde P \neq \tilde Q$ and $K_b \le O(1/\varepsilon_M)$. If $P$, $Q$ and $b$ are all evaluated with double the working precision, $\varepsilon_M^2$, then (19) holds with $|\kappa_b| \le O(n)$.

Remark 4. [Summation techniques] Since the summands in (12) are computed quantities, neither compensated summation [11, Algorithm 4.2, p. 84] nor doubly compensated summation [11, Algorithm 4.3, pp. 87–88] nor any other summation method is guaranteed to achieve the necessary accuracy without using double the working precision.

We summarize the above results in one complete algorithm, dpr1eig. The algorithm first checks the components of the vector $z$. If they are of the same order of magnitude, the eigenpair $(\lambda, v)$ is computed by Algorithm 1. If that is not the case, the quantity $K_b$ is computed. If $K_b \gg 1$, the eigenpair $(\lambda, v)$ is computed by Algorithm 1, but with evaluation of $b$ with double the working precision. At the end, the quantity $K_\nu$ is computed, and if $K_\nu \gg 1$, one of the remedies from Remark 3 must be used.
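The effect described by Theorem 7 can be observed on a tiny example in which $P$ and $Q$ nearly cancel. The Python sketch below evaluates $P$, $Q$ and $b$ either in working precision or exactly; exact rational arithmetic via `fractions.Fraction` is used here only as a stand-in for higher precision, which is our assumption and not one of the implementations discussed in Section 4.2. The function name and the 0-based shift index are likewise ours.

```python
from fractions import Fraction

def b_parts(d, z, rho, i, exact=False):
    """Return (P, Q, b) of Theorem 7 for the 0-based shift index i.

    With exact=True the float inputs are converted to Fractions, so P, Q and b
    are the exact values for the given floating-point data.
    """
    conv = Fraction if exact else float
    d = [conv(x) for x in d]
    z = [conv(x) for x in z]
    rho = conv(rho)
    p = 1 / rho + sum(z[j] ** 2 / (d[j] - d[i]) for j in range(i))
    q = -sum(z[j] ** 2 / (d[j] - d[i]) for j in range(i + 1, len(d)))
    return p, q, (p - q) / z[i] ** 2

def relative_error_of_b(d, z, rho, i):
    """Relative error of the working-precision b, and K_b = (P + Q) / |P - Q|."""
    p, q, b_exact = b_parts(d, z, rho, i, exact=True)
    _, _, b_float = b_parts(d, z, rho, i)
    rel_err = abs(Fraction(b_float) - b_exact) / abs(b_exact)
    return float(rel_err), float((p + q) / abs(p - q))
```

For `d = [2.0, 1.0, 0.0]`, `z = [1.0, 1.0, 1.4142135]`, `rho = 1.0` and `i = 1`, we have $P = 2$ and $Q$ close to 2, so $K_b$ is large and the working-precision value of $b$ may lose a corresponding number of digits.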

Algorithm 5
[λ, v] = dpr1eig(D, z, ρ, k)
    % Computes the k-th eigenpair of an ordered irreducible DPR1 matrix
    % A = diag(D) + ρzz′
    compute the shift i as in the first part of Algorithm 1
    if the quantity (Σ_{j=1, j≠i}^{n} |ζ_j|) / |ζ_i| from (28) is of O(n)
        % standard precision is enough
        [λ, v] = dpr1eig_basic(D, z, ρ, k)
    else
        compute the quantity K_b from (20)
        if K_b ≫ 1
            % double the working precision is necessary
            [λ, v] = dpr1eig_basic(D, z, ρ, k) with evaluation of b with double the working precision
        else
            % standard precision is enough
            [λ, v] = dpr1eig_basic(D, z, ρ, k)
        end
    end
    compute the quantity K_ν from (15)
    if K_ν ≫ 1
        apply one of the remedies from Remark 3
    end
    apply the remedy from Remark 1, if necessary

4.1 Fast secular equation solvers

Instead of using bisection to compute zeros of the secular equation (13) in Algorithm 3, we can use a fast zero finder with quadratic or even cubic convergence like those from [18, 3, 15]. Such zero finders compute zeros to machine accuracy using a small number of direct evaluations of the Pick function and its derivative, where $O(\log(\log(1/\varepsilon)))$ iterations are needed to obtain an $\varepsilon$-accuracy [16].

In particular, we tested the implementation of the cubically convergent zero finder by Borges and Gragg from [3, §3.3], with the stopping criterion defined by [3, p. 15]. From [3, (21)], it follows that the accuracy of the computed solution satisfies a backward error bound similar to (25). This was indeed true in all our tests. The number of iterations never exceeded 7.

Similarly, for the solution of the secular equation (5), which is needed in the remedies according to Remarks 1 and 3, one can use the fast secular equation solver by Li [15]. This solver is implemented in the LAPACK routine DLAED4. The accuracy of the computed solution satisfied a backward error bound similar to (25), and the number of iterations behaved as predicted.

Although the operation count of both fast zero finders is approximately half of the operations needed for bisection, we observed no speed-up in our Matlab implementation. This is due to the fact that the formulas of the zero finders are more complex and have more memory references.

⁵ If $K_b \ge O(1/\varepsilon_M)$, that is, if $K_b = 1/\varepsilon_E$ for some $\varepsilon_E < \varepsilon_M$, then, in view of Theorem 7, $b$ needs to be computed with extended precision $\varepsilon_E$. Usage of higher precision in conjunction with the eigenvalue computation for DPR1 matrices is analyzed in [2], but there the higher precision computation is potentially needed in the iterative part. This is less convenient than our approach, where the higher precision computation is used only to compute one element.

4.2 Implementation of the double the working precision

We tried three different implementations of the double the working precision:

• by converting all quantities in the formulas (12) or (23) to variable precision by the Matlab [17] command sym with parameter 'f', and then performing the computations;

• by evaluating all parts of the formulas (12) or (23) using the extended precision routines add2, sub2, mul2, and div2 from [6]; and

• by converting all quantities in the formulas (12) or (23) from standard 64-bit double precision numbers, declared by REAL(8), to 128-bit quadruple precision numbers, declared by REAL(16), in the Intel FORTRAN compiler ifort [12], and then performing the computations.

Having to invoke higher precision clearly slows the computation down. In Matlab, when using the variable precision sym command, the computation may be slowed down by a factor of three hundred or more for each eigenvalue that requires formulas (12) or (23) to be evaluated in higher precision. This makes the use of sym prohibitive for higher dimensions. Using Matlab implementations of the extended precision routines by Dekker [6] slows the computation down up to 8 times. The fastest implementation is the one in ifort, which is only about three times slower. Thus, the algorithm benefits from a good implementation of higher precision.
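Extended precision routines of the kind mentioned in the second item are built on error-free transformations. The following Python sketch shows the two standard building blocks, Knuth's TwoSum and Dekker's TwoProduct with Veltkamp splitting, for IEEE double precision. It is not the add2 or mul2 code from [6]; the function names are ours and the sketch assumes that no overflow or underflow occurs.

```python
from fractions import Fraction

def two_sum(a, b):
    """Knuth's TwoSum: s = fl(a + b) and e such that a + b == s + e exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def split(a):
    """Veltkamp splitting of a double into a high and a low part."""
    c = 134217729.0 * a          # 2**27 + 1
    hi = c - (c - a)
    return hi, a - hi

def two_prod(a, b):
    """Dekker's TwoProduct: p = fl(a * b) and e such that a * b == p + e exactly."""
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    e = ((ah * bh - p) + ah * bl + al * bh) + al * bl
    return p, e

def check_exact(a, b):
    """Verify both transformations exactly with rational arithmetic."""
    s, es = two_sum(a, b)
    p, ep = two_prod(a, b)
    return (Fraction(a) + Fraction(b) == Fraction(s) + Fraction(es)
            and Fraction(a) * Fraction(b) == Fraction(p) + Fraction(ep))
```

Each pair `(s, e)` or `(p, e)` represents the result with roughly twice the working precision, which is the kind of representation that double-double arithmetic carries through the evaluation of (12) or (23).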


4.3

Hermitian matrices

In this section we extend our results to Hermitian case. Let C = D + ρzz ∗ , where D = diag(d1 , d2 , . . . , dn ), is a real diagonal matrix of order n, z = ζ1 ζ2 · · ·

ζn

∗

,

is a complex valued vector and ρ is a real scalar. Here z ∗ denotes the conjugate transpose of z. As in Section 1, we assume that C is irreducible and ρ > 0. The eigenvalue decomposition of C is given by C = U ΛU ∗ where Λ = diag(λ1 , . . ., λn ) ∈ Rn×n is a diagonal matrix of eigenvalues, and U = u1 u2 · · · un is an unitary matrix of the corresponding eigenvectors. To apply Algorithm 5 to the matrix C we first transform C to real symmetric DPR1 matrix A by diagonal unitary similarity: A = Φ∗ CΦ = D + ρ|z||z|T ,

where Φ = diag

ζ1 ζ2 ζn , ,..., |ζ1 | |ζ2 | |ζn |

(31) .

We now compute the $k$-th eigenpair $(\lambda, v)$ of $A$ by Algorithm 5, and set $u = \Phi v$. Since we guarantee high relative accuracy of the eigenvalue decomposition of $A$ computed by Algorithm 5, we also guarantee high relative accuracy of the eigenvalue decomposition of $C$. Notice that, if double the working precision is needed to compute $b$ in Algorithm 5, then, in order for the proof of Theorem 7 to hold, the moduli $|\zeta_i|$ in (31) need to be computed in double the working precision as well.

Similarly, for an irreducible real non-symmetric DPR1 matrix of the form
\[ G = D + \rho \breve{z} \mathring{z}^{T}, \]
where $\operatorname{sign}(\mathring{\zeta}_i) = \operatorname{sign}(\breve{\zeta}_i)$, $i = 1, \ldots, n$, we define the diagonal matrix
\[ \Psi = \operatorname{diag}\left( \operatorname{sign}(\mathring{\zeta}_1) \sqrt{\frac{\breve{\zeta}_1}{\mathring{\zeta}_1}}, \ldots, \operatorname{sign}(\mathring{\zeta}_n) \sqrt{\frac{\breve{\zeta}_n}{\mathring{\zeta}_n}} \right). \]
The matrix
\[ A = \Psi^{-1} G \Psi = D + \rho z z^{T}, \]
where $\zeta_i = \sqrt{\breve{\zeta}_i \mathring{\zeta}_i}$, is an irreducible DPR1 matrix. We now compute the $k$-th eigenpair $(\lambda, v)$ of $A$ by Algorithm 5. The eigenpair of $G$ is then $(\lambda, \Psi v)$. Since we guarantee high relative accuracy of the eigenvalue decomposition of $A$, we also guarantee high relative accuracy of the eigenvalue decomposition of $G$. Notice that, if double the working precision is needed to compute $b$ in Algorithm 5, the elements $\zeta_i$ need to be computed in double the working precision as well.
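Both diagonal similarities of this section act only on the rank-one part, so they can be checked entrywise. The following Python sketch is an illustration under stated assumptions, not the authors' implementation: the helper names hermitian_phases and symmetrizer are ours, the complex vector is taken as a column with components z[i] and phases z[i]/|z[i]|, and everything is done in ordinary double precision (so the remark about computing the moduli in double the working precision is not modelled).

```python
import math


def hermitian_phases(z):
    """Return (phases, moduli) with phases[i] = z[i] / |z[i]|."""
    moduli = [abs(zi) for zi in z]
    phases = [zi / m for zi, m in zip(z, moduli)]
    return phases, moduli


def symmetrizer(z_breve, z_ring):
    """Return (psi, z) for G = D + rho * z_breve * z_ring^T with matching signs.

    psi[i] = sign(z_ring[i]) * sqrt(z_breve[i] / z_ring[i]) and
    z[i] = sqrt(z_breve[i] * z_ring[i]).
    """
    psi = [math.copysign(1.0, r) * math.sqrt(b / r)
           for b, r in zip(z_breve, z_ring)]
    z = [math.sqrt(b * r) for b, r in zip(z_breve, z_ring)]
    return psi, z


# Hermitian case: conj(phi_j) * (z_j * conj(z_k)) * phi_k should equal |z_j| |z_k|.
zc = [3 + 4j, -1j, 2 + 0j]
phi, mod = hermitian_phases(zc)
herm_err = max(
    abs(phi[j].conjugate() * zc[j] * zc[k].conjugate() * phi[k] - mod[j] * mod[k])
    for j in range(3) for k in range(3)
)

# Non-symmetric case: (1 / psi_j) * (zb_j * zr_k) * psi_k should equal z_j * z_k.
zb, zr = [2.0, -8.0], [8.0, -2.0]
psi, zs = symmetrizer(zb, zr)
nonsym_err = max(
    abs(zb[j] * zr[k] * psi[k] / psi[j] - zs[j] * zs[k])
    for j in range(2) for k in range(2)
)
```

The diagonal matrix D is unchanged by either similarity, which is why only the rank-one entries are compared.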

5 Numerical Examples

We illustrate our algorithm with four numerically demanding examples. Examples 1 and 2 illustrate Algorithm 1, Example 3 illustrates the use of double the working precision, and Example 4 illustrates an application to higher dimension.

Example 1. In this example the quantities $K_b$ from (20) are approximately 1 for all eigenvalues, so we guarantee that all eigenvalues and all components of their corresponding eigenvectors are computed with high relative accuracy by Algorithm 5, using only standard machine precision. Let $A = D + zz^T$, where
\[ D = \operatorname{diag}(10^{10}, 5, 4 \cdot 10^{-3}, 0, -4 \cdot 10^{-3}, -5), \qquad z = \begin{bmatrix} 10^{10} & 1 & 1 & 10^{-7} & 1 & 1 \end{bmatrix}^T. \]

The eigenvalues computed by the Matlab [17] routine eig, by the LAPACK routine DLAED9, and by Algorithm 5 and Mathematica [22] with 100 digits of precision (properly rounded to 16 decimal digits) are, respectively:⁶
\[
\begin{array}{lll}
\lambda^{(eig)} & \lambda^{(dlaed9)} & \lambda^{(dpr1eig,Math)} \\
\hline
1.000000000100000 \cdot 10^{20} & 1.000000000100000 \cdot 10^{20} & 1.000000000100000 \cdot 10^{20} \\
5.000000000099998 & 5.000000000100000 & 5.000000000100000 \\
4.000000099999499 \cdot 10^{-3} & 4.000000100000001 \cdot 10^{-3} & 4.000000100000001 \cdot 10^{-3} \\
1.665334536937735 \cdot 10^{-16} & 1.000000023272195 \cdot 10^{-24} & 9.99999999899999(7, 9) \cdot 10^{-25} \\
0 & -3.999999900000001 \cdot 10^{-3} & -3.999999900000001 \cdot 10^{-3} \\
-25.00000000150000 & -4.999999999900000 & -4.999999999900000
\end{array}
\]

We see that all eigenvalues computed by Algorithm 5 (including the tiniest ones) are exact to the machine precision. The eigenvalues computed by DLAED9 are all accurate, except $\lambda_4$. The eigenvalues computed by eig are accurate according to the standard perturbation theory, but they have almost no relative accuracy.⁷ Due to the accuracy of the computed eigenvalues, the eigenvectors computed by Algorithm 5 are componentwise accurate up to machine precision, and therefore orthogonal up to machine precision. The eigenvectors computed by DLAED9 are also componentwise accurate, except for $v_4$:
\[
\begin{array}{ll}
v_4^{(dlaed9)} & v_4^{(dpr1eig,Math)} \\
\hline
1.000000011586098 \cdot 10^{-17} & 9.99999999899999(6, 9) \cdot 10^{-18} \\
2.000000023172195 \cdot 10^{-18} & 1.999999999800000 \cdot 10^{-18} \\
2.500000028965244 \cdot 10^{-15} & 2.499999999749999 \cdot 10^{-15} \\
-1.000000000000000 & -1.000000000000000 \\
-2.500000028965244 \cdot 10^{-15} & -2.499999999749999 \cdot 10^{-15} \\
-2.000000023172195 \cdot 10^{-18} & -1.999999999800000 \cdot 10^{-18}
\end{array}
\]

⁶ If, in the last column, the last digits computed by dpr1eig and Mathematica, respectively, differ, they are displayed in parentheses.
⁷ The displayed eigenvalues are the ones obtained by the command [V,Lambda]=eig(A). The command Lambda=eig(A) produces different eigenvalues.
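The tiny eigenvalue $\lambda_4$ of Example 1 can be cross-checked independently of any eigenvalue solver. The sketch below relies on a standard fact that is not part of Algorithm 5: the eigenvalues of an irreducible DPR1 matrix are the zeros of the secular function $f(\lambda) = 1 + \rho \sum_i \zeta_i^2 / (d_i - \lambda)$. It evaluates $f$ in exact rational arithmetic on both sides of the tabulated value. The function name secular and the bracket endpoints lo and hi are our choices, and we assume that the matrix entries are the IEEE doubles nearest to the displayed decimals.

```python
from fractions import Fraction


def secular(d, z, rho, lam):
    """f(lam) = 1 + rho * sum(z_i^2 / (d_i - lam)), evaluated exactly."""
    return 1 + rho * sum(zi * zi / (di - lam) for di, zi in zip(d, z))


# Entries of Example 1, converted exactly from their double precision values.
d = [Fraction(x) for x in (1e10, 5.0, 4e-3, 0.0, -4e-3, -5.0)]
z = [Fraction(x) for x in (1e10, 1.0, 1.0, 1e-7, 1.0, 1.0)]
rho = Fraction(1)

# A narrow bracket around the tabulated lambda_4 = 9.99999999899999... * 10^-25.
lo = Fraction("9.99999999899e-25")
hi = Fraction("9.99999999901e-25")

# f tends to minus infinity just to the right of the pole d_4 = 0 and increases
# from there, so a sign change confirms a zero inside the bracket.
bracketed = secular(d, z, rho, lo) < 0 < secular(d, z, rho, hi)
```

The value $1.000000023272195 \cdot 10^{-24}$ reported for DLAED9 lies outside this bracket, in agreement with the loss of relative accuracy in $\lambda_4$ noted above.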

Example 2. In this example, despite very close diagonal elements, we again guarantee that all eigenvalues and all components of their corresponding eigenvectors are computed with high relative accuracy, without deflation. Let $A = D + zz^T$, where
\[ D = \operatorname{diag}(1 + 40\varepsilon, 1 + 30\varepsilon, 1 + 20\varepsilon, 1 + 10\varepsilon), \qquad z = \begin{bmatrix} 1 & 2 & 2 & 1 \end{bmatrix}^T, \]
and $\varepsilon = 2^{-52} = 2\varepsilon_M$. For this matrix, the quantities $K_b$ are again of order one for all eigenvalues, so Algorithm 5 uses only standard working precision. The eigenvalues computed by Matlab, DLAED9, and Algorithm 5 are:
\[
\begin{array}{lll}
\lambda^{(eig)} & \lambda^{(dlaed9)} & \lambda^{(dpr1eig)} \\
\hline
11 + 32\varepsilon & 11 + 48\varepsilon & 11 + 16\varepsilon \\
1 + 38\varepsilon & 1 + 41\varepsilon & 1 + 39\varepsilon \\
1 + 31\varepsilon & 1 + 27\varepsilon & 1 + 25\varepsilon \\
1 + 8\varepsilon & 1 + 9\varepsilon & 1 + 11\varepsilon
\end{array}
\]
Notice that all computed eigenvalues are accurate according to standard perturbation theory. However, only the eigenvalues computed by Algorithm 5 satisfy the interlacing property. The eigenvalues computed by Mathematica with 100 digits of precision, properly rounded to 32 decimal digits, are:
\[
\begin{array}{l}
\lambda^{(Math)} \\
\hline
11.000000000000005551115123125783 \\
1.0000000000000085712482686374087 \\
1.0000000000000055511151231257826 \\
1.0000000000000025309819776141565
\end{array}
\]
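The interlacing claim of Example 2 can be verified directly from the tabulated multiples of $\varepsilon$. The short Python sketch below uses exact rational arithmetic and assumes the usual interlacing for $\rho > 0$ with decreasingly ordered poles, $\lambda_1 > d_1 > \lambda_2 > d_2 > \cdots > \lambda_n > d_n$; the helper name interlaces is ours.

```python
from fractions import Fraction

eps = Fraction(1, 2**52)
d = [1 + k * eps for k in (40, 30, 20, 10)]

# Computed eigenvalues of Example 2, one list per method.
computed = {
    "eig":     [11 + 32 * eps, 1 + 38 * eps, 1 + 31 * eps, 1 + 8 * eps],
    "dlaed9":  [11 + 48 * eps, 1 + 41 * eps, 1 + 27 * eps, 1 + 9 * eps],
    "dpr1eig": [11 + 16 * eps, 1 + 39 * eps, 1 + 25 * eps, 1 + 11 * eps],
}


def interlaces(lam, poles):
    """True if lam[0] > poles[0] > lam[1] > poles[1] > ... > lam[-1] > poles[-1]."""
    merged = [x for pair in zip(lam, poles) for x in pair]
    return all(a > b for a, b in zip(merged, merged[1:]))


result = {name: interlaces(lam, d) for name, lam in computed.items()}
```

For instance, the value $1 + 31\varepsilon$ returned by eig lies above the pole $d_2 = 1 + 30\varepsilon$, and the value $1 + 41\varepsilon$ returned by DLAED9 lies above $d_1 = 1 + 40\varepsilon$.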

If, as suggested in Remark 2, Algorithm 5 is modified to return $d_i$ and $\mu$ (both in standard precision), then for the eigenvalues $\lambda_2$, $\lambda_3$ and $\lambda_4$ the corresponding pairs $(d_i, \mu)$ give representations of those eigenvalues to 32 decimal digits. In our case, the exact values $d_i + \mu$, properly rounded to 32 decimal digits, are equal to the corresponding eigenvalues computed by Mathematica displayed above.

The eigenvectors $v_2$, $v_3$ and $v_4$ computed by Matlab span an invariant subspace of $\lambda_2$, $\lambda_3$ and $\lambda_4$, but their components are not accurate. Due to the accuracy of the computed eigenvalues, the eigenvectors computed by Algorithm 5 are componentwise accurate up to the machine precision (they coincide with the eigenvectors computed by Mathematica with 100 digits of precision), and are therefore orthogonal. Interestingly, in this example the eigenvectors computed by DLAED9 are also componentwise accurate, but there is no underlying theory for such high accuracy.

Example 3. In this example (see [9]) we can guarantee that all eigenvalues and eigenvectors will be computed with componentwise high relative accuracy only if $b$ from (12) is computed in double the working precision for $k \in \{2, 3, 4\}$. Let $A = D + zz^T$, where
\[ D = \operatorname{diag}(10/3, 2 + \beta, 2 - \beta, 1), \qquad z = \begin{bmatrix} 2 & \beta & \beta & 2 \end{bmatrix}^T, \qquad \beta = 10^{-7}. \]

For $k \in \{2, 3, 4\}$ the quantities $\kappa_\nu$ from (28) are of order $O(10^7)$, so the element $b$ in each of the matrices needs to be computed in double the working precision. For example, for $k = 2$, the element $b = [A_2^{-1}]_{22}$ computed by Algorithm 2 in standard precision is equal to $b = 5.749999751891721 \cdot 10^7$, while the Matlab routine inv gives $b = 5.749999746046776 \cdot 10^7$. Computing $b$ in double the working precision in Algorithm 2 gives the correct value $b = 5.749999754927588 \cdot 10^7$.

The eigenvalues computed by Matlab, DLAED9, Algorithm 5 and Mathematica with 100 digits of precision, respectively, are all highly relatively accurate; they differ in the last or last two digits. However, the eigenvectors $v_2$, $v_3$ and $v_4$ computed by Algorithm 5, with the $b$'s computed in double the working precision, are componentwise accurate to machine precision and therefore orthogonal. The eigenvectors computed by Matlab and DLAED9 are, of course, orthogonal, but are not componentwise accurate. For example,
\[
\begin{array}{lll}
v_2^{(eig)} & v_2^{(dlaed9)} & v_2^{(dpr1eig,Math)} \\
\hline
2.088932176072975 \cdot 10^{-1} & 2.088932143122528 \cdot 10^{-1} & 2.088932138163857 \cdot 10^{-1} \\
-9.351941376557037 \cdot 10^{-1} & -9.351941395201120 \cdot 10^{-1} & -9.351941398441738 \cdot 10^{-1} \\
-6.480586028358029 \cdot 10^{-2} & -6.480586288204153 \cdot 10^{-2} & -6.480586264549802 \cdot 10^{-2} \\
-2.785242341430628 \cdot 10^{-1} & -2.785242297496694 \cdot 10^{-1} & -2.785242290885133 \cdot 10^{-1}
\end{array}
\]

Example 4. In this example we extend Example 3 to higher dimension, as in TEST 3 from [9, §6]. Here $A = D + zz^T \in \mathbb{R}^{202 \times 202}$, where
\[ D = \operatorname{diag}(1, 2 + \beta, 2 - \beta, 2 + 2\beta, 2 - 2\beta, \ldots, 2 + 100\beta, 2 - 100\beta, 10/3), \qquad z = \begin{bmatrix} 2 & \beta & \beta & \cdots & \beta & 2 \end{bmatrix}^T, \qquad \beta \in \{10^{-3}, 10^{-8}, 10^{-15}\}. \]

For each $\beta$, we solved the eigenvalue problem with Algorithm 5 without using double the working precision, with Algorithm 5, and with DLAED9 from LAPACK. For $\beta = 10^{-3}$, Algorithm 5 used double the working precision for computing 25 eigenvalues, and for $\beta = 10^{-8}$ and $\beta = 10^{-15}$ double the working precision was needed for all but the largest eigenvalue. As in [9, §6], for each algorithm we computed orthogonality and residue measures,
\[ O = \max_{1 \le i \le n} \frac{\| V^T v_i - e_i \|_2}{n \varepsilon_M}, \qquad R = \max_{1 \le i \le n} \frac{\| A v_i - \lambda_i v_i \|_2}{n \varepsilon_M \| A \|_2}, \]
respectively. Here $V = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}$ is the computed matrix of eigenvectors, and $e_i$ is the $i$-th column of the identity matrix. Let $(\lambda^{(dpr1eig\_nd)}, v^{(dpr1eig\_nd)})$, $(\lambda^{(dpr1eig)}, v^{(dpr1eig)})$, and $(\lambda^{(dlaed9)}, v^{(dlaed9)})$ denote the eigenpairs computed by Algorithm 5 without using double the working precision, by Algorithm 5, and by DLAED9, respectively. Since we proved the componentwise accuracy of eigenvectors computed by Algorithm 5, we take those as the reference. Table 1 displays orthogonality measures, residue measures, relative errors in the computed eigenvalues and componentwise relative errors in the computed eigenvectors.
\[
\begin{array}{lccc}
\beta & 10^{-3} & 10^{-8} & 10^{-15} \\
\hline
O^{(dpr1eig\_nd)} & 1.47 & 5.8 \cdot 10^{4} & 2.1 \cdot 10^{11} \\
O^{(dpr1eig)} & 0.059 & 0.039 & 0.045 \\
O^{(dlaed9)} & 0.049 & 0.064 & 0.045 \\
R^{(dpr1eig\_nd)} & 0.0086 & 0.033 & 0.0043 \\
R^{(dpr1eig)} & 0.0086 & 0.039 & 0.0043 \\
R^{(dlaed9)} & 0.029 & 0.03 & 0.013 \\
\max\limits_{1 \le i \le n} \dfrac{|\lambda_i^{(dpr1eig\_nd)} - \lambda_i^{(dpr1eig)}|}{|\lambda_i^{(dpr1eig)}|} & 2.2 \cdot 10^{-16} & 0 & 2.2 \cdot 10^{-16} \\
\max\limits_{1 \le i \le n} \dfrac{|\lambda_i^{(dlaed9)} - \lambda_i^{(dpr1eig)}|}{|\lambda_i^{(dpr1eig)}|} & 1.5 \cdot 10^{-15} & 2.2 \cdot 10^{-16} & 0 \\
\max\limits_{1 \le i,j \le n} \dfrac{|[v_i^{(dpr1eig\_nd)}]_j - [v_i^{(dpr1eig)}]_j|}{|[v_i^{(dpr1eig)}]_j|} & 2.7 \cdot 10^{-13} & 2.8 \cdot 10^{-8} & 0.518 \\
\max\limits_{1 \le i,j \le n} \dfrac{|[v_i^{(dlaed9)}]_j - [v_i^{(dpr1eig)}]_j|}{|[v_i^{(dpr1eig)}]_j|} & 2.2 \cdot 10^{-12} & 1.9 \cdot 10^{-8} & 0.043
\end{array}
\]
Table 1: Orthogonality measures, residue measures, relative errors in computed eigenvalues, and componentwise relative errors in computed eigenvectors.

From Table 1, we see that all algorithms behave exactly as predicted by the theoretical analysis. All algorithms compute all eigenvalues to high relative accuracy because it is the same as normwise accuracy for this case. Algorithm 5 without use of double the working precision loses orthogonality as predicted by the respective condition numbers. The number of correct digits in the computed eigenvectors is approximately the same for Algorithm 5 without use of double the working precision and DLAED9, but there is no proof of such componentwise accuracy of the eigenvectors computed by DLAED9. When double the working precision is used, the eigenvectors computed by Algorithm 5 are fully orthogonal, as a consequence of their componentwise accuracy.
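The orthogonality and residue measures $O$ and $R$ used in Example 4 are straightforward to evaluate for any computed eigendecomposition. The following pure-Python sketch is only an illustration: the function name measures is ours, eigenvectors are passed as a list of columns, the spectral norm of $A$ is supplied by the caller, and we assume $\varepsilon_M = 2^{-53}$ in line with $\varepsilon = 2^{-52} = 2\varepsilon_M$ from Example 2. The tiny test matrix is not one of the paper's examples.

```python
import math

EPS_M = 2.0 ** -53  # assumed unit roundoff of IEEE double precision


def norm2(x):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(t * t for t in x))


def matvec(A, x):
    """Product of a dense matrix (list of rows) with a vector."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def measures(A, lam, V, norm_A):
    """Return (O, R); V is a list of eigenvector columns, norm_A = ||A||_2."""
    n = len(lam)
    O = R = 0.0
    for i, (l, v) in enumerate(zip(lam, V)):
        # i-th column of V^T V minus the i-th unit vector
        col = [sum(a * b for a, b in zip(u, v)) - (1.0 if j == i else 0.0)
               for j, u in enumerate(V)]
        O = max(O, norm2(col) / (n * EPS_M))
        # residual A v_i - lambda_i v_i
        res = [a - l * b for a, b in zip(matvec(A, v), v)]
        R = max(R, norm2(res) / (n * EPS_M * norm_A))
    return O, R


# A 2 x 2 symmetric matrix with known eigenpairs (eigenvalues 3 and 1).
a = 1.0 / math.sqrt(2.0)
A = [[2.0, 1.0], [1.0, 2.0]]
O, R = measures(A, [3.0, 1.0], [[a, a], [a, -a]], 3.0)
```

Values of order one indicate orthogonality and residuals at the level of rounding errors, which is how the entries of Table 1 are to be read.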

Acknowledgment

We would like to thank Ren Cang Li for providing a Matlab implementation of the LAPACK routine DLAED4 and its dependencies.

A Proofs

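Proof of Theorem 1. Let $\tilde{\lambda}$ and $\tilde{\mu}$ be defined by (17) and (18), respectively. Then
\[ \tilde{\lambda} \equiv fl(d_i + \tilde{\mu}) = (d_i + \tilde{\mu})(1 + \varepsilon_1). \]
By simplifying the equality
\[ (d_i + \mu(1 + \kappa_\mu \varepsilon_M))(1 + \varepsilon_1) = \lambda(1 + \kappa_\lambda \varepsilon_M) \]
and using $\lambda = \mu + d_i$, we have
\[ d_i \varepsilon_1 + \mu(\kappa_\mu \varepsilon_M + \varepsilon_1) = \lambda \kappa_\lambda \varepsilon_M. \]
Taking absolute value gives
\[ |\kappa_\lambda| \le \frac{|d_i| + |\mu|}{|\lambda|} (|\kappa_\mu| + 1). \]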
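The simplification step in the proof of Theorem 1 can be written out as follows. This is only a sketch of the first-order expansion in the symbols of the proof, under the assumption $|\varepsilon_1| \le \varepsilon_M$; the term that is dropped is the second-order product $\mu \kappa_\mu \varepsilon_M \varepsilon_1$.

```latex
\begin{align*}
(d_i + \mu(1 + \kappa_\mu \varepsilon_M))(1 + \varepsilon_1)
  &= (d_i + \mu) + d_i \varepsilon_1 + \mu(\kappa_\mu \varepsilon_M + \varepsilon_1)
     + \mu \kappa_\mu \varepsilon_M \varepsilon_1, \\
\lambda(1 + \kappa_\lambda \varepsilon_M)
  &= (d_i + \mu) + \lambda \kappa_\lambda \varepsilon_M, \\
|\lambda| \, |\kappa_\lambda| \, \varepsilon_M
  &\le |d_i| \, \varepsilon_M + |\mu| (|\kappa_\mu| + 1) \varepsilon_M
   \le (|d_i| + |\mu|)(|\kappa_\mu| + 1) \varepsilon_M .
\end{align*}
```

Equating the first two lines, cancelling $d_i + \mu = \lambda$, and neglecting the second-order term gives the identity used in the proof; the third line then yields the stated bound after division by $|\lambda| \varepsilon_M$.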

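Proof of Theorem 2. (i) The assumption $\operatorname{sign}(d_i) = \operatorname{sign}(\mu)$ immediately implies
\[ \frac{|d_i| + |\mu|}{|\lambda|} = \frac{|d_i + \mu|}{|d_i + \mu|} = 1. \]
(ii) The assumptions imply that either
\[ 0 < d_{i+1} < \lambda < d_i, \qquad \mu < 0, \]
or
\[ d_i < \lambda < d_{i-1} < 0, \qquad \mu > 0. \]
In the first case $\lambda$ is closest to the pole $d_i$ and
\[ \frac{|d_i| + |\mu|}{|\lambda|} \le \frac{|d_i| + \frac{1}{2} |d_i - d_{i+1}|}{\frac{1}{2} |d_i + d_{i+1}|} \le \frac{d_i + \frac{1}{2} d_i - \frac{1}{2} d_{i+1}}{\frac{1}{2} d_i + \frac{1}{2} d_{i+1}} \le \frac{\frac{3}{2} d_i - \frac{1}{2} d_{i+1}}{\frac{1}{2} d_i + \frac{1}{2} d_{i+1}} \le \frac{3 d_i}{d_i} = 3. \]
Here we used the inequalities $|\mu| \le \frac{1}{2} |d_i - d_{i+1}|$ and $|\lambda| \ge \frac{1}{2} |d_i + d_{i+1}|$ for the first inequality, $d_i - d_{i+1} > 0$ and $d_i + d_{i+1} > 0$ for the second inequality, and $d_{i+1} > 0$ for the fourth inequality, respectively. The proof for the second case is analogous.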
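The two cases of Theorem 2 can be illustrated numerically. The sketch below evaluates the ratio $(|d_i| + |\mu|)/|\lambda|$ in exact rational arithmetic, once for equal signs and once on a grid of eigenvalue positions in the half of the interval $(d_{i+1}, d_i)$ closest to $d_i$. The function names ratio and worst_ratio and the sample values are ours and serve only as an illustration of the bound, not as a proof.

```python
from fractions import Fraction


def ratio(d_i, mu):
    """(|d_i| + |mu|) / |lambda| with lambda = d_i + mu."""
    return (abs(d_i) + abs(mu)) / abs(d_i + mu)


def worst_ratio(d_i, d_next, steps=1000):
    """Largest sampled ratio for lambda in [midpoint, d_i); assumes 0 < d_next < d_i."""
    mid = (d_i + d_next) / 2
    worst = Fraction(0)
    for k in range(steps):
        lam = mid + (d_i - mid) * Fraction(k, steps)  # mid <= lam < d_i
        worst = max(worst, ratio(d_i, lam - d_i))
    return worst


# Case (i): equal signs give exactly 1.
same_sign = ratio(Fraction(2), Fraction(1, 3))

# Case (ii): the bound 3 is approached when d_{i+1} is close to zero
# and lambda sits at the midpoint of the interval.
near_worst = worst_ratio(Fraction(1), Fraction(1, 10**6))
```

With $d_i = 1$ and $d_{i+1} = 10^{-6}$ the sampled maximum is slightly below 3, which shows that the constant in case (ii) cannot be improved much.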

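Proof of Theorem 3. Let $x$ and $\tilde{x}$ be defined by (14) and (24), respectively. By using (18), for $\tilde{x}_i$ we have
\[ \tilde{x}_i = fl\left( -\frac{\zeta_i}{\tilde{\mu}} \right) = -\frac{\zeta_i}{\mu(1 + \kappa_\mu \varepsilon_M)} (1 + \varepsilon_1) = x_i (1 + \varepsilon_{x_i}), \]
and the first order approximation gives
\[ |\varepsilon_{x_i}| \le (|\kappa_\mu| + 1) \varepsilon_M. \]
For $j \ne i$, by using (18), solving the equality
\[ \tilde{x}_j = \frac{\zeta_j (1 + \varepsilon_3)}{((d_j - d_i)(1 + \varepsilon_1) - \mu(1 + \kappa_\mu \varepsilon_M))(1 + \varepsilon_2)} = \frac{\zeta_j}{d_j - \lambda} (1 + \varepsilon_x) \]
for $\varepsilon_x$, using $\lambda = \mu + d_i$, and ignoring higher order terms, we have
\[ \varepsilon_x = \frac{(d_j - d_i)(\varepsilon_1 + \varepsilon_2 + \varepsilon_3) - \mu(\kappa_\mu \varepsilon_M + \varepsilon_2 + \varepsilon_3)}{d_j - \lambda}. \]
Therefore,
\[ |\varepsilon_x| \le \frac{|d_j - d_i| + |\mu|}{|d_j - \lambda|} (|\kappa_\mu| + 3) \varepsilon_M. \tag{32} \]
To complete the proof we need to analyze two cases. If
\[ \operatorname{sign}(d_j - d_i) = -\operatorname{sign} \mu, \]
then
\[ \frac{|d_j - d_i| + |\mu|}{|d_j - \lambda|} = \frac{|d_j - d_i - \mu|}{|d_j - \lambda|} = \frac{|d_j - \lambda|}{|d_j - \lambda|} = 1. \]
If
\[ \operatorname{sign}(d_j - d_i) = \operatorname{sign} \mu, \]
then, since $d_i$ is the pole closest to $\lambda$, we have $|\mu| \le 0.5 |d_j - d_i|$ and
\[ \frac{|d_j - d_i| + |\mu|}{|d_j - \lambda|} \le \frac{|d_j - d_i| + |\mu|}{|d_j - d_i| - |\mu|} \le \frac{\frac{3}{2} |d_j - d_i|}{\frac{1}{2} |d_j - d_i|} = 3. \]
Finally, the theorem follows by inserting this into (32).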


Proof of Theorem 4. For the non-zero computed elements of the matrix $A_i^{-1}$ from (11) and (12) we have:
\[ fl([A_i^{-1}]_{jj}) = \frac{1}{(d_j - d_i)(1 + \varepsilon_1)} (1 + \varepsilon_2), \qquad j \ne i, \]
\[ fl([A_i^{-1}]_{ji}) = fl([A_i^{-1}]_{ij}) = \frac{-\zeta_j}{(d_j - d_i)(1 + \varepsilon_3) \, \zeta_i (1 + \varepsilon_4)} (1 + \varepsilon_5), \qquad j \ne i, \]
where $|\varepsilon_k| \le \varepsilon_M$ for all indices $k$. The first statement of the theorem now follows by using standard first order approximations. Similar analysis of the formula (12) yields
\[ fl([A_i^{-1}]_{ii}) = \tilde{b} = b + \delta b, \]
where
\[ |\delta b| \le \frac{1}{\zeta_i^2} \left( \frac{1}{\rho} + z_1^T D_1^{-1} z_1 - z_2^T D_2^{-1} z_2 \right) (n + 4) \varepsilon_M. \tag{33} \]
This, in turn, implies (19) with
\[ |\kappa_b| \le \frac{|\delta b|}{|b|} \frac{1}{|\varepsilon_M|} = (n + 4) K_b, \]
where $K_b$ is defined by (20).
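The cancellation that makes $b$ delicate can be observed on the matrix of Example 3 ($k = 2$). The sketch below evaluates $b = [A_i^{-1}]_{ii}$ once in plain double precision and once in exact rational arithmetic, using the expression $b = \frac{1}{\zeta_i^2} \left( \frac{1}{\rho} + \sum_{j \ne i} \frac{\zeta_j^2}{d_j - d_i} \right)$, which follows from inverting $A_i = A - d_i I$ directly and is consistent with the terms in the bound (33). We assume that it coincides with formula (12), and that the matrix entries are the doubles nearest to the displayed values. The function name b_element is ours, and the summation order need not match that of Algorithm 2.

```python
from fractions import Fraction


def b_element(d, z, rho, i):
    """[A_i^{-1}]_{ii} = (1/z_i^2) * (1/rho + sum_{j != i} z_j^2 / (d_j - d_i))."""
    s = 1 / rho
    for j, (dj, zj) in enumerate(zip(d, z)):
        if j != i:
            s += zj * zj / (dj - d[i])
    return s / (z[i] * z[i])


beta = 1e-7
d = [10 / 3, 2 + beta, 2 - beta, 1.0]
z = [2.0, beta, beta, 2.0]

# k = 2 corresponds to index 1; first plain double precision ...
b_double = b_element(d, z, 1.0, 1)

# ... then exact rational arithmetic on the same double precision data.
b_exact = b_element([Fraction(x) for x in d],
                    [Fraction(x) for x in z], Fraction(1), 1)

rel_err = float(abs(Fraction(b_double) - b_exact) / abs(b_exact))
```

The positive and negative terms in the sum are of order 4 while their total is of order $10^{-7}$, so the double precision value can be expected to lose several digits, in line with the values quoted in Example 3; evaluating the sum in higher precision and rounding once at the end avoids this loss.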

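Proof of Theorem 5. Let
\[ \widehat{A_i^{-1}} = A_i^{-1} + \delta A_i^{-1}. \]
Therefore,
\[ |\hat{\nu} - \nu| \le \| \delta A_i^{-1} \|_2, \]
which, together with (27), implies
\[ |\nu \kappa_\nu \varepsilon_M| \le \| \delta A_i^{-1} \|_2. \tag{34} \]
Theorem 4 implies that
\[ \| \delta A_i^{-1} \|_2 \le (n + 4) \, \| |A_i^{-1}| \|_2 \, K_b \varepsilon_M. \]
Since $\| |A_i^{-1}| \|_2 \le \sqrt{n} \, \| A_i^{-1} \|_2$ and $|\nu| = \| A_i^{-1} \|_2$, from (34) we have
\[ |\kappa_\nu| \le (n + 4) \sqrt{n} K_b, \tag{35} \]
which proves the first part of the bound (28). For the second part of the proof, notice that Theorem 4 also implies
\[ \| \delta A_i^{-1} \|_2 \le 3 \, \| |\bar{A}| \|_2 \, \varepsilon_M + |\delta b|, \tag{36} \]
where $\bar{A}$ is equal to the matrix $A_i^{-1}$ without $b$ (that is, with $\bar{A}_{ii} = 0$). By bounding (34) with (36) and (33), and dividing the resulting inequality by $|\nu \varepsilon_M|$, we have
\[ |\kappa_\nu| \le 3 \sqrt{n} + (n + 4) \frac{1}{|\nu|} \, \frac{|1/\rho| + |z_1^T D_1^{-1} z_1| + |z_2^T D_2^{-1} z_2|}{\zeta_i^2}. \tag{37} \]
Since
\[ \frac{|1/\rho|}{\zeta_i^2} = \frac{1}{\zeta_i^2} \left| \frac{1}{\rho} + z_1^T D_1^{-1} z_1 + z_2^T D_2^{-1} z_2 - z_1^T D_1^{-1} z_1 - z_2^T D_2^{-1} z_2 \right| \le |b| + \frac{1}{\zeta_i^2} \left( |z_1^T D_1^{-1} z_1| + |z_2^T D_2^{-1} z_2| \right), \]
from (37) it follows
\[ |\kappa_\nu| \le 3 \sqrt{n} + (n + 4) \left( \frac{|b|}{|\nu|} + \frac{2}{|\nu|} \, \frac{|z_1^T D_1^{-1} z_1| + |z_2^T D_2^{-1} z_2|}{\zeta_i^2} \right). \tag{38} \]
Since $|b| \le |\nu|$ and
\[ |\nu| = \| A_i^{-1} \|_2 = \max_{\| x \|_2 = 1} \| A_i^{-1} x \|_2 \ge \| A_i^{-1} e_k \|_2 = \sqrt{ \frac{1}{(d_k - d_i)^2} + \frac{\zeta_k^2}{\zeta_i^2 (d_k - d_i)^2} } \ge \frac{|\zeta_k|}{|\zeta_i| \, |d_k - d_i|}, \]
by simply dividing each term
\[ \frac{\zeta_k^2}{\zeta_i^2 |d_k - d_i|} \]
in (38) with the corresponding quotient
\[ \frac{|\zeta_k|}{|\zeta_i| \, |d_k - d_i|}, \]
we obtain
\[ |\kappa_\nu| \le 3 \sqrt{n} + (n + 4) \left( 1 + 2 \sum_{\substack{k = 1 \\ k \ne i}}^{n} \frac{|\zeta_k|}{|\zeta_i|} \right). \tag{39} \]
The bound (28) now follows from (35) and (39).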

Proof of Theorem 6. We first prove the bound (30). Since $\tilde{\nu} = fl(\hat{\nu})$ is computed by bisection, from (25) we have
\[ \tilde{\nu} = \hat{\nu} (1 + \kappa_{bis} \varepsilon_M). \]
This and (27) imply
\[ \tilde{\nu} = \nu (1 + \kappa_\nu \varepsilon_M)(1 + \kappa_{bis} \varepsilon_M). \]
Since $\hat{\mu} = fl(1/\hat{\nu})$, the bound (30) follows by ignoring higher order terms. The bound (29) now follows by inserting (30) into Theorems 1 and 2.

Proof of Theorem 7. Let the assumptions of the theorem hold. Let $b$ be computed in double the working precision, $\varepsilon_M^2$, and then stored in the standard precision. The standard floating-point error analysis, neglecting higher order terms, gives
\[ \frac{P (1 + \kappa_P \varepsilon_M^2) - Q (1 + \kappa_Q \varepsilon_M^2)}{\zeta_i^2} (1 + \kappa_1 \varepsilon_M^2) = \frac{P - Q}{\zeta_i^2} (1 + \kappa_b \varepsilon_M) \equiv b (1 + \kappa_b \varepsilon_M), \]
where $|\kappa_P|, |\kappa_Q| \le (n + 2)$ and $|\kappa_1| \le 3$. Solving the above equality for $\kappa_b$, neglecting higher order terms, and taking absolute values gives
\[ |\kappa_b| \le \left[ \frac{|P| + |Q|}{|P - Q|} \max\{ |\kappa_P|, |\kappa_Q| \} + |\kappa_1| \right] \varepsilon_M \le \left[ K_b (n + 2) + 3 \right] \varepsilon_M. \]
Since, by assumption, $K_b \le O(1/\varepsilon_M)$, this implies $|\kappa_b| \le O(n)$, as desired.

References

[1] E. Anderson et al., LAPACK Users' Guide, 3rd ed., SIAM, Philadelphia (1999).
[2] J. L. Barlow, Error analysis of update methods for the symmetric eigenvalue problem, SIAM J. Matrix Anal. Appl., 14 (1993) 598-618.
[3] C. F. Borges, W. B. Gragg, A parallel Divide - and - Conquer Method for the Generalized Real Symmetric Definite Tridiagonal Eigenproblem, in Numerical Linear Algebra and Scientific Computing, L. Reichel, A. Ruttan and R. S. Varga, eds., de Gruyter, Berlin (1993) 11-29.
[4] J. R. Bunch and C. P. Nielsen, Rank-one modification of the symmetric eigenproblem, Numer. Math., 31 (1978) 31-48.
[5] J. J. M. Cuppen, A divide and conquer method for the symmetric tridiagonal eigenproblem, Numer. Math., 36 (1981) 177-195.
[6] T. J. Dekker, A floating-point technique for extending the available precision, Numer. Math., 18 (1971) 224-242.
[7] J. Dongarra and D. Sorensen, A fully parallel algorithm for the symmetric eigenvalue problem, SIAM J. Sci. Statist. Comput., 8 (1987) 139-154.
[8] G. H. Golub and C. F. Van Loan, Matrix Computations, 4th ed., The Johns Hopkins University Press, Baltimore (2013).
[9] M. Gu and S. C. Eisenstat, A stable and efficient algorithm for the rank-one modification of the symmetric eigenproblem, SIAM J. Matrix Anal. Appl., 15 (1994) 1266-1276.
[10] M. Gu and S. C. Eisenstat, A divide-and-conquer algorithm for the symmetric tridiagonal eigenproblem, SIAM J. Matrix Anal. Appl., 16 (1995) 79-92.
[11] N. Higham, Accuracy and Stability of Numerical Algorithms, 2nd ed., SIAM, Philadelphia (2002).
[12] Intel Fortran Compiler, http://software.intel.com/en-us/fortran-compilers
[13] Intel Math Kernel Library, http://software.intel.com/en-us/intel-mkl
[14] N. Jakovčević Stor, I. Slapničar and J. L. Barlow, Accurate eigenvalue decomposition of real symmetric arrowhead matrices and applications, Lin. Alg. Appl., to appear, http://dx.doi.org/10.1016/j.laa.2013.10.007
[15] R. C. Li, Solving secular equations stably and efficiently, Tech. Report UCB/CSD-94-851, Computer Science Division, University of California, Berkeley, CA (1994). Also: LAPACK Working Note 89.
[16] O. Livne and A. Brandt, N Roots of the secular equation in O(N) operations, SIAM J. Matrix Anal. Appl., 24 (2002) 439-453.
[17] MATLAB. The MathWorks, Inc., Natick, Massachusetts, USA, http://www.mathworks.com.
[18] A. Melman, Numerical solution of a secular equation, Numer. Math., 69 (1995) 483-493.
[19] D. P. O'Leary and G. W. Stewart, Computing the eigenvalues and eigenvectors of symmetric arrowhead matrices, J. Comput. Phys., 90, 2 (1990) 497-505.
[20] B. N. Parlett, The Symmetric Eigenvalue Problem, Prentice-Hall, Englewood Cliffs (1980).
[21] J. H. Wilkinson, The Algebraic Eigenvalue Problem, Clarendon Press, Oxford (1965).
[22] Wolfram Mathematica, Documentation Center, http://reference.wolfram.com/mathematica/guide/Mathematica.html
