IMA Journal of Numerical Analysis (2011), doi:10.1093/imanum/drnxxx

Accurate solution of structured linear systems via rank-revealing decompositions

FROILÁN M. DOPICO†
Instituto de Ciencias Matemáticas CSIC-UAM-UC3M-UCM and Departamento de Matemáticas, Universidad Carlos III de Madrid, Avda. de la Universidad 30, 28911 Leganés, Spain

AND

JUAN M. MOLERA‡
Departamento de Matemáticas, Universidad Carlos III de Madrid, Avda. de la Universidad 30, 28911 Leganés, Spain

[Received on xxxxxxxx; revised on xxxxxxx]

Linear systems of equations $Ax = b$, where the matrix $A$ has some particular structure, arise frequently in applications. Very often structured matrices have huge condition numbers $\kappa(A) = \|A^{-1}\|\,\|A\|$ and, therefore, standard algorithms fail to compute accurate solutions of $Ax = b$. In this paper we say that a computed solution $\widehat{x}$ is accurate if $\|\widehat{x} - x\|/\|x\| = O(u)$, where $u$ is the unit roundoff. In this work, we introduce a framework that allows us to solve accurately many classes of structured linear systems, independently of the condition number of $A$, and efficiently, that is, with cost $O(n^3)$. For most of these classes no algorithms are known that are both accurate and efficient. The approach in this work relies on first computing an accurate rank-revealing decomposition of $A$, an idea that has been widely used in recent decades to compute singular value and eigenvalue decompositions of structured matrices with high relative accuracy. In particular, we illustrate the new method by solving accurately Cauchy and Vandermonde linear systems with any distribution of nodes, i.e., without requiring $A$ to be totally positive, for most right-hand sides $b$.

Keywords: accurate solutions, acyclic matrices, Cauchy matrices, diagonally dominant matrices, graded matrices, linear systems, polynomial Vandermonde matrices, rank-revealing decompositions, structured matrices, Vandermonde matrices

1. Introduction

Structured matrices arise very frequently in applications; see, for example, Olshevsky (2001a,b). As a consequence, the design and analysis of special algorithms for solving linear systems $Ax = b$, $A \in \mathbb{C}^{n\times n}$ and $b \in \mathbb{C}^n$, whose coefficient matrix $A$ has some particular structure, is a classical area of research in Numerical Linear Algebra that has attracted the attention of many researchers; see (Golub & Van Loan, 1996, Chapter 4) and (Higham, 2002, Chapters 8, 10, 11, 22) and the references therein. The goal of these

† Email: [email protected]
‡ Corresponding author. Email: [email protected]

© The author 2011. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved.

special algorithms is to exploit the structure to increase the speed of solution, and/or to decrease storage requirements, and/or to improve the accuracy in comparison with standard algorithms for general matrices, such as Gaussian elimination with partial pivoting (GEPP) or the use of the QR factorization (see Higham (2002) for the accuracy properties of these methods). In this work, we establish a framework that allows us to solve many classes of structured linear systems much more accurately than standard algorithms and at roughly the same computational cost, that is, with cost $O(n^3) \leqslant c\, n^3$, where $c$ denotes a small real constant. This framework enlarges significantly the number of classes of structured matrices for which it is possible to compute solutions of linear systems accurately and efficiently.

Given a vector norm $\|\cdot\|$ in $\mathbb{C}^n$ and its subordinate matrix norm in the set $\mathbb{C}^{n\times n}$ of complex $n \times n$ matrices, it is well known that if $\widehat{x}$ is the solution of $Ax = b$ computed by GEPP or QR in a computer with unit roundoff $u$ ($u \approx 10^{-16}$ in double precision IEEE arithmetic), then the relative error satisfies
$$\frac{\|x - \widehat{x}\|}{\|x\|} \leqslant u\, g(n)\, \kappa(A), \tag{1.1}$$

where $\kappa(A) = \|A^{-1}\|\,\|A\|$ is the condition number* of $A$ and $g(n)$ is a modestly growing function of $n$, that is, a function bounded by a low degree polynomial in $n$. The bound (1.1) does not guarantee a single digit of accuracy if $\kappa(A) \gtrsim 1/u$, that is, if $A$ is ill conditioned with respect to the inverse of the unit roundoff. Unfortunately, many types of structured matrices arising in applications are extremely ill conditioned. Two very famous examples are Cauchy and Vandermonde matrices (Higham, 2002, Chapters 22, 28). Our goal is to prove that the numerical framework we propose provides, for many classes of structured matrices, error bounds where $\kappa(A)$ is replaced by a much smaller quantity.

The framework we introduce relies on the concept of rank-revealing decomposition (RRD), originally introduced by Demmel et al. (1999) for computing the singular value decomposition (SVD) with high relative accuracy; see also (Higham, 2002, Sec. 9.12). An RRD of $A \in \mathbb{C}^{n\times n}$ is a factorization $A = XDY$, where $X \in \mathbb{C}^{n\times n}$, $D = \mathrm{diag}(d_1, d_2, \ldots, d_n) \in \mathbb{C}^{n\times n}$ is diagonal, $Y \in \mathbb{C}^{n\times n}$, and $X$ and $Y$ are well conditioned. Note that this means that if $A$ is ill conditioned, then the diagonal factor $D$ is also ill conditioned. We propose to compute the solution of $Ax = b$ in two steps:

1. First, compute an RRD $A = XDY$ accurately in the sense of Demmel et al. (1999) (we revise the precise meaning of "accuracy" in this context in Definition 2.1).

2. Second, solve the three linear systems $Xs = b$, $Dw = s$, and $Yx = w$, where $Xs = b$ and $Yx = w$ are solved by standard backward-stable methods such as GEPP or QR, while $Dw = s$ is solved as $w_i = s_i/d_i$, $i = 1:n$.

The intuition behind why this procedure computes accurate solutions, even for extremely ill conditioned matrices, is that each entry of $w$ is computed with a relative error less than $u$, that is, the ill conditioned linear system $Dw = s$ is solved very accurately, together with the fact that $Xs = b$ and $Yx = w$ are also solved accurately because $X$ and $Y$ are well conditioned. We will prove in Section 4 that the relative error in the solution $\widehat{x}$ computed by this two step procedure is
$$\frac{\|x - \widehat{x}\|}{\|x\|} \leqslant u\, f(n)\, \max\{\kappa(X), \kappa(Y)\}\, \frac{\|A^{-1}\|\,\|b\|}{\|x\|}, \tag{1.2}$$

*We assume throughout this work that $A$ is nonsingular, i.e., that the solution of the linear system $Ax = b$ is unique for any $b$.

where $f(n)$ is a modestly growing function of $n$. Note that the only potentially big factor in (1.2) is $\|A^{-1}\|\,\|b\|/\|x\|$, because $X$ and $Y$ are well conditioned. It is well known that this factor is small for most right-hand sides $b$, even for very ill conditioned matrices $A$ (Chan & Foulser, 1988; Banoczi et al., 1998). We will explain this fact carefully in Subsection 3.2, where we revise some properties of $\|A^{-1}\|\,\|b\|/\|x\|$.

The computation of an accurate RRD of $A$ is the difficult part in this framework. For almost any matrix $A \in \mathbb{C}^{n\times n}$ an RRD (potentially inaccurate) can be computed by applying standard Gaussian elimination with complete pivoting (GECP) to get an LDU factorization, where $L$ is unit lower triangular, $D$ is diagonal, and $U$ is unit upper triangular (Demmel et al., 1999; Higham, 2002). Very rarely GECP fails to produce well conditioned $L$ and $U$ factors, but then other pivoting strategies that guarantee well conditioned factors are available, for instance in Miranian & Gu (2003) and Pan (2000). However, standard GECP is not accurate for ill conditioned matrices, and nowadays RRDs with guaranteed accuracy can be computed only for particular classes of structured matrices, through special implementations of GECP that carefully exploit the structure to obtain accurate factors. Fortunately, as a by-product of the intense research performed in the last two decades on computing SVDs with high relative accuracy, there are algorithms to compute accurate RRDs of many classes of $n \times n$ structured matrices in $O(n^3)$ operations. These classes include: Cauchy matrices, diagonally scaled Cauchy matrices, Vandermonde matrices, and some "related unit-displacement-rank" matrices (Demmel, 1999); graded matrices (that is, matrices of the form $D_1 B D_2$ with $B$ well conditioned and $D_1$ and $D_2$ diagonal), acyclic matrices (which include bidiagonal matrices), total signed compound matrices, and diagonally scaled totally unimodular matrices (Demmel et al., 1999); diagonally dominant M-matrices (Demmel & Koev, 2004; Peña, 2004); polynomial Vandermonde matrices involving orthonormal polynomials (Demmel & Koev, 2006); and diagonally dominant matrices (Ye, 2008; Dopico & Koev, 2010). For certain real symmetric structured matrices, it is possible to compute accurate RRDs that preserve the symmetry. These symmetric matrices include: symmetric positive definite matrices $DHD$, with $H$ well conditioned and $D$ diagonal (Demmel & Veselić, 1992; Mathias, 1995); symmetric Cauchy, symmetric diagonally scaled Cauchy, and symmetric Vandermonde matrices (Dopico & Koev, 2006); and symmetric diagonally scaled totally unimodular and total signed compound matrices (Peláez & Moro, 2006). These symmetric RRDs have been used to compute accurate eigenvalues and eigenvectors of symmetric matrices (Dopico et al., 2003, 2009). For all classes of matrices listed in this paragraph, the framework introduced in this paper solves linear systems with relative errors bounded as in (1.2). This error bound is $O(u\, f(n))$ for most right-hand sides, independently of the traditional condition number of the matrices, and so guarantees accurate solutions.

We want to remark that for Cauchy and Vandermonde matrices $A$, the use of the RRDs computed by the algorithms in (Demmel, 1999) allows us to solve $Ax = b$ accurately for any distribution of the nodes defining Cauchy and Vandermonde matrices and for most right-hand sides $b$. The situation is different for the currently available fast algorithms with $O(n^2)$ cost for Vandermonde and Cauchy linear systems.
The analyses in (Boros et al., 1999; Higham, 2002) show that these fast algorithms deliver componentwise bounds for the errors (forward and backward errors and the residual), but these bounds are small, for most right-hand sides, only for those distributions of nodes corresponding to totally positive (TP) matrices. In conclusion, fast methods compute solutions with guaranteed accuracy only in the TP case. This is further shown in the numerical experiments in this paper. It remains an open problem to find fast algorithms for Cauchy and Vandermonde matrices that guarantee accurate solutions for any distribution of nodes.

The paper is organized as follows. We introduce in Section 2 basic notation and concepts that will be used throughout the paper. Section 3 develops a structured perturbation theory for linear systems through the factors of RRDs. These perturbation results are used in Section 4 to perform a general

error analysis of the framework proposed in this work. Numerical experiments that show the significant improvements in accuracy obtained with the new method are presented in Section 5. For brevity, we restrict ourselves to Cauchy and Vandermonde matrices in the numerical tests. Finally, Section 6 contains conclusions and establishes some lines of future research.

2. Preliminaries

We present in this section basic facts on vector and matrix norms, the definition of the entrywise absolute value of a matrix, the exact meaning of accurate rank-revealing decomposition in this work, and the model of floating point arithmetic that we use in the error analysis of Section 4.

We denote throughout the paper by $\|\cdot\|$ any vector norm in $\mathbb{C}^n$ and also the corresponding subordinate matrix norm, defined as
$$\|A\| = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|}$$
for any $A \in \mathbb{C}^{n\times n}$. We will use the dual norm of the vector norm $\|\cdot\|$, which is defined as
$$\|x\|_D = \max_{z \neq 0} \frac{|z^* x|}{\|z\|} \tag{2.1}$$
for any $x \in \mathbb{C}^n$. As usual, $z^*$ denotes the conjugate transpose of $z \in \mathbb{C}^n$. Given a vector $y \in \mathbb{C}^n$, we will need the vector $z$ dual to $y$, which is defined by the property
$$z^* y = \|z\|_D\, \|y\| = 1. \tag{2.2}$$
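For instance, for the Euclidean norm, which is its own dual, a vector dual to a given $y \neq 0$ can be written down explicitly; this small verification is an illustration only:
$$z = \frac{y}{\|y\|_2^2} \quad\Longrightarrow\quad z^* y = \frac{y^* y}{\|y\|_2^2} = 1 \qquad\text{and}\qquad \|z\|_2\, \|y\|_2 = \frac{\|y\|_2}{\|y\|_2^2}\, \|y\|_2 = 1,$$
so this $z$ satisfies (2.2).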

See (Higham, 2002, Chapter 6) or (Horn & Johnson, 1985, Chapter 5) for more information on these concepts.

In Sections 3.1 and 4, we will need the entrywise absolute value of a matrix. Given a matrix $G \in \mathbb{C}^{n\times n}$ with entries $g_{ij}$, we denote by $|G|$ the matrix with entries $|g_{ij}|$. Expressions like $|G| \leqslant |B|$, where $B \in \mathbb{C}^{n\times n}$, mean $|g_{ij}| \leqslant |b_{ij}|$ for $1 \leqslant i, j \leqslant n$.

Following Demmel et al. (1999), we next define the precise meaning of an accurately computed RRD of a matrix $A$. We consider only nonsingular matrices $A$, since these are the ones of interest in this work.

DEFINITION 2.1 Let $A \in \mathbb{C}^{n\times n}$ be a nonsingular matrix, let $A = XDY$ with $D = \mathrm{diag}(d_1, d_2, \ldots, d_n)$ be an RRD of $A$, and let $\widehat{X}$, $\widehat{D} = \mathrm{diag}(\widehat{d}_1, \widehat{d}_2, \ldots, \widehat{d}_n)$, and $\widehat{Y}$ be the factors computed by a certain algorithm in a computer with unit roundoff $u$. We say that the factorization $\widehat{X}\widehat{D}\widehat{Y}$ has been accurately computed if the factors satisfy
$$\frac{\|\widehat{X} - X\|}{\|X\|} \leqslant u\, p(n), \qquad \frac{\|\widehat{Y} - Y\|}{\|Y\|} \leqslant u\, p(n), \qquad \text{and} \qquad \frac{|\widehat{d}_i - d_i|}{|d_i|} \leqslant u\, p(n), \quad i = 1:n, \tag{2.3}$$
where $p(n)$ is a modestly growing function of $n$, that is, a function bounded by a low degree polynomial in $n$.

For example, the algorithm to compute an RRD of a Cauchy matrix presented in (Demmel, 1999, Section 4) satisfies (2.3) with $p(n) \leqslant 9n$ in the norm $\|\cdot\|_1$.

In the rounding error analysis of Section 4 we will use the conventional error model for floating point arithmetic (Higham, 2002, Section 2.2):
$$fl(a \circ b) = (a \circ b)(1 + \delta),$$

where $a$ and $b$ are real floating point numbers, $\circ \in \{+, -, \times, /\}$, and $|\delta| \leqslant u$. Recall that this model also holds for complex floating point numbers if $u$ is replaced by a slightly larger constant; see (Higham, 2002, Section 3.6). In addition, we will assume that neither overflow nor underflow occurs.

3. Structured perturbation theory for linear systems

In this section we present structured perturbation results for the solution of linear systems that will be used in the error analysis of Section 4. We deal with perturbations of the coefficient matrix coming from perturbing the factors of an RRD according to the errors in (2.3). A crucial point here is the technical Lemma 3.1. This lemma shows that the results of this section are closely connected to multiplicative perturbations of the coefficient matrix of the system. Multiplicative perturbation theory of matrices has received considerable attention in the literature in the context of accurate computations of eigenvalues and singular values (Eisenstat & Ipsen, 1995; Ipsen, 1998, 2000; Li, 1998, 1999), but, as far as we know, it has not yet been studied for linear systems. We stress that the most relevant factor in all perturbation bounds in this section is $\|A^{-1}\|\,\|b\|/\|x\|$. In Subsection 3.2 we briefly revise some results existing in the literature for this factor, which justify that it has a moderate magnitude for most vectors $b$, even for very ill conditioned matrices $A$.

We are concerned in this section with a linear system $Ax = b$, where $A \in \mathbb{C}^{n\times n}$, $b \in \mathbb{C}^n$, and $A$ is nonsingular. For brevity, we will not repeat these assumptions in the statements of the results. As usual, $I$ denotes the $n \times n$ identity matrix.

3.1 Perturbation theory through factors

We start with Lemma 3.1, which is the basis of our analysis.

LEMMA 3.1 Let $Ax = b$ and $(I + E)A(I + F)\widetilde{x} = b + h$, where $I + E \in \mathbb{C}^{n\times n}$ and $I + F \in \mathbb{C}^{n\times n}$ are nonsingular matrices. Then
$$\widetilde{x} - x = (I + F)^{-1}\left( A^{-1}(I + E)^{-1}(h - Eb) - Fx \right). \tag{3.1}$$
In addition, if $\|h\| \leqslant \varepsilon\|b\|$ and $\max\{\|E\|, \|F\|\} \leqslant \varepsilon < 1$, then
$$\widetilde{x} - x = A^{-1}h - A^{-1}Eb - Fx + O(\varepsilon^2). \tag{3.2}$$

Proof. Define $f \in \mathbb{C}^n$ by $x + f := (I + F)\widetilde{x}$. Then $(I + E)A(x + f) = b + h$ and $(I + E)(b + Af) = b + h$. Therefore, $(I + E)Af = h - Eb$ and
$$f = A^{-1}(I + E)^{-1}(h - Eb). \tag{3.3}$$
Use the definition of $f$ to get $\widetilde{x} - x = (I + F)^{-1}(x + f) - x = (I + F)^{-1}(f - Fx)$, and introduce (3.3) in this expression to prove (3.1). To prove (3.2), note that if $\varepsilon < 1$, then $(I + E)^{-1} = \sum_{k=0}^{\infty} (-E)^k = I + O(\varepsilon)$ (Horn & Johnson, 1985, Corollary 5.6.16) and, analogously, $(I + F)^{-1} = I + O(\varepsilon)$. This allows us to obtain (3.2) from (3.1). □
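The identity (3.1) is easy to check numerically. The following minimal NumPy sketch, included only as an illustration (the matrices, right-hand side, and perturbation size are arbitrary random choices), verifies (3.1) on a random system:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
x = np.linalg.solve(A, b)

eps = 1e-8  # size of the multiplicative perturbations E, F and of h
E = eps * rng.standard_normal((n, n))
F = eps * rng.standard_normal((n, n))
h = eps * rng.standard_normal(n)
I = np.eye(n)

# Solution of the perturbed system (I + E) A (I + F) x~ = b + h
x_tilde = np.linalg.solve((I + E) @ A @ (I + F), b + h)

# Right-hand side of (3.1): (I + F)^{-1} ( A^{-1} (I + E)^{-1} (h - E b) - F x )
rhs = np.linalg.solve(I + F,
                      np.linalg.solve(A, np.linalg.solve(I + E, h - E @ b)) - F @ x)

# The difference is of the order of the unit roundoff
print(np.linalg.norm((x_tilde - x) - rhs))
```

Our next result measures the sensitivity of the system to perturbations of the factors of an RRD of the coefficient matrix.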

THEOREM 3.1 Let $A = XDY \in \mathbb{C}^{n\times n}$, where $D \in \mathbb{C}^{n\times n}$ is a diagonal matrix, and let $\|\cdot\|$ be a vector norm in $\mathbb{C}^n$ whose subordinate matrix norm satisfies $\|\Lambda\| = \max_{i=1:n} |\lambda_i|$ for all diagonal matrices $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. Let $XDYx = b$ and $(X + \Delta X)(D + \Delta D)(Y + \Delta Y)\widetilde{x} = b + h$, where $\|\Delta X\| \leqslant \varepsilon\|X\|$, $\|\Delta Y\| \leqslant \varepsilon\|Y\|$, $|\Delta D| \leqslant \varepsilon|D|$, and $\|h\| \leqslant \varepsilon\|b\|$, and assume that $\varepsilon\kappa(Y) < 1$ and $\varepsilon(2 + \varepsilon)\kappa(X) < 1$. Then
$$\frac{\|\widetilde{x} - x\|}{\|x\|} \leqslant \frac{\varepsilon}{1 - \varepsilon\kappa(Y)} \left( \kappa(Y) + \frac{1 + (2 + \varepsilon)\,\|X\|\,\frac{\|X^{-1}b\|}{\|b\|}}{1 - \varepsilon(2 + \varepsilon)\kappa(X)}\, \frac{\|A^{-1}\|\,\|b\|}{\|x\|} \right) \tag{3.4}$$
and, to first order in $\varepsilon$,
$$\frac{\|\widetilde{x} - x\|}{\|x\|} \leqslant \varepsilon \left[ \kappa(Y) + \left( 1 + 2\,\|X\|\,\frac{\|X^{-1}b\|}{\|b\|} \right) \frac{\|A^{-1}\|\,\|b\|}{\|x\|} \right] + O(\varepsilon^2). \tag{3.5}$$

Proof. The proof consists in transforming the perturbation of the factors into a multiplicative perturbation and then using Lemma 3.1. The matrix $(X + \Delta X)(D + \Delta D)(Y + \Delta Y)$ can be written as
$$\begin{aligned} (X + \Delta X)(D + \Delta D)(Y + \Delta Y) &= (I + \Delta X\, X^{-1})\, X\, (I + \Delta D\, D^{-1})\, DY\, (I + Y^{-1}\Delta Y) \\ &= (I + \Delta X\, X^{-1})(I + X \Delta D\, D^{-1} X^{-1})\, XDY\, (I + Y^{-1}\Delta Y) \\ &=: (I + E)\, XDY\, (I + F), \end{aligned}$$
where
$$E = \Delta X\, X^{-1} + X \Delta D\, D^{-1} X^{-1} + \Delta X\, \Delta D\, D^{-1} X^{-1}, \tag{3.6}$$
$$F = Y^{-1} \Delta Y. \tag{3.7}$$
Observe also that
$$\|E\| \leqslant \varepsilon(2 + \varepsilon)\kappa(X) < 1, \qquad \|F\| \leqslant \varepsilon\kappa(Y) < 1, \qquad \text{and} \qquad \|Eb\| \leqslant \varepsilon(2 + \varepsilon)\,\|X\|\,\|X^{-1}b\|, \tag{3.8}$$
which guarantee in particular that $I + E$ and $I + F$ are nonsingular. Now use (3.1) for the systems $XDYx = b$ and $(I + E)XDY(I + F)\widetilde{x} = b + h$, apply standard norm inequalities to get
$$\|\widetilde{x} - x\| \leqslant \|(I + F)^{-1}\| \left( \|A^{-1}\|\, \|(I + E)^{-1}\|\, (\|h\| + \|Eb\|) + \|F\|\,\|x\| \right),$$
use
$$\|(I + F)^{-1}\| \leqslant \frac{1}{1 - \|F\|} \leqslant \frac{1}{1 - \varepsilon\kappa(Y)} \qquad \text{and} \qquad \|(I + E)^{-1}\| \leqslant \frac{1}{1 - \|E\|} \leqslant \frac{1}{1 - \varepsilon(2 + \varepsilon)\kappa(X)},$$
and (3.8) to obtain (3.4). Finally, the first order bound (3.5) follows immediately from (3.4). □

REMARK 3.1 We will see in Theorem 3.2 that the bounds (3.4) and (3.5) are "essentially" the best that can be obtained under the assumptions of Theorem 3.1. However, the following weaker bounds may be more useful in practice,
$$\frac{\|\widetilde{x} - x\|}{\|x\|} \leqslant \frac{\varepsilon}{1 - \varepsilon\kappa(Y)} \left( \kappa(Y) + \frac{1 + (2 + \varepsilon)\,\kappa(X)}{1 - \varepsilon(2 + \varepsilon)\kappa(X)}\, \frac{\|A^{-1}\|\,\|b\|}{\|x\|} \right), \tag{3.9}$$
$$\frac{\|\widetilde{x} - x\|}{\|x\|} \leqslant \varepsilon \left[ \kappa(Y) + \left( 1 + 2\,\kappa(X) \right) \frac{\|A^{-1}\|\,\|b\|}{\|x\|} \right] + O(\varepsilon^2), \tag{3.10}$$
because they do not require computing $\|X^{-1}b\|$ and do not overestimate significantly the variation of the solution if $\kappa(X)$ is small.

We have not been able to prove that the bound (3.5) can be attained to first order in $\varepsilon$ for some specific perturbations $h$, $\Delta X$, $\Delta D$, and $\Delta Y$ and, as a consequence, we have not found an expression for the condition number of linear equation solving under perturbations of the factors of an RRD of the coefficient matrix. In fact, we think that (3.5) cannot be attained, since it is independent of $D$. However, we show in Theorem 3.2 that the quantity obtained by dividing the right-hand side of (3.5) by $\varepsilon$ and taking the limit $\varepsilon \to 0$ is at most three times the condition number and, so, (3.5) cannot overestimate significantly the best possible first order perturbation bound. The condition number for perturbations of the factors is denoted by $\kappa_f(X, D, Y, b)$, where the subindex $f$ stands for "factors".

THEOREM 3.2 Let us use the same notation and assumptions as in Theorem 3.1 and define the condition number
$$\kappa_f(X, D, Y, b) := \lim_{\varepsilon \to 0}\, \sup\left\{ \frac{\|\widetilde{x} - x\|}{\varepsilon\|x\|} \,:\, (X + \Delta X)(D + \Delta D)(Y + \Delta Y)\widetilde{x} = b + h,\ \|h\| \leqslant \varepsilon\|b\|,\ \|\Delta X\| \leqslant \varepsilon\|X\|,\ |\Delta D| \leqslant \varepsilon|D|,\ \|\Delta Y\| \leqslant \varepsilon\|Y\| \right\}.$$
Then
$$\kappa_f(X, D, Y, b) \leqslant \kappa(Y) + \left( 1 + 2\,\|X\|\,\frac{\|X^{-1}b\|}{\|b\|} \right) \frac{\|A^{-1}\|\,\|b\|}{\|x\|} \leqslant 3\, \kappa_f(X, D, Y, b). \tag{3.11}$$

Proof. The lower bound in (3.11), that is, the part "$\kappa_f(X, D, Y, b) \leqslant \cdots$", follows immediately from (3.5) and the definition of the condition number. Therefore, we focus on proving the upper bound in (3.11).

We prove first that $\kappa(Y) \leqslant \kappa_f(X, D, Y, b)$. For this purpose choose a perturbation such that $h = 0$, $\Delta D = 0$, $\Delta X = 0$, and
$$\Delta Y = \varepsilon\, \|x\|\, \|Y\|\, z y^*,$$
where $y$ is a vector dual to $x$, i.e., $y^* x = \|y\|_D \|x\| = 1$, $\|z\| = 1$, and $\|Y^{-1}z\| = \|Y^{-1}\|$. Note that $\|\Delta Y\| = \varepsilon\|Y\|$ and that the matrices defined in (3.6) and (3.7) are in this case $E = 0$ and $F = Y^{-1}\Delta Y$. From (3.2), we obtain that for this perturbation
$$\widetilde{x} - x = -Y^{-1}\Delta Y\, x + O(\varepsilon^2) = -\varepsilon\, \|x\|\, \|Y\|\, (Y^{-1}z)(y^* x) + O(\varepsilon^2) = -\varepsilon\, \|x\|\, \|Y\|\, (Y^{-1}z) + O(\varepsilon^2).$$
So, for this perturbation,
$$\lim_{\varepsilon \to 0} \frac{\|\widetilde{x} - x\|}{\varepsilon\|x\|} = \kappa(Y) \leqslant \kappa_f(X, D, Y, b), \tag{3.12}$$
where the last inequality follows from the "sup" appearing in the definition of $\kappa_f(X, D, Y, b)$.

Next we prove that
$$\|X\|\,\frac{\|X^{-1}b\|}{\|b\|}\, \frac{\|A^{-1}\|\,\|b\|}{\|x\|} \leqslant \left( 1 + \|X\|\,\frac{\|X^{-1}b\|}{\|b\|} \right) \frac{\|A^{-1}\|\,\|b\|}{\|x\|} \leqslant \kappa_f(X, D, Y, b). \tag{3.13}$$
For this purpose choose a perturbation such that $\Delta D = 0$, $\Delta Y = 0$, $h = \varepsilon\,\|b\|\, w$, where $\|w\| = 1$ and $\|A^{-1}w\| = \|A^{-1}\|$, and, finally,
$$\Delta X = -\varepsilon\, \|X^{-1}b\|\, \|X\|\, w s^*,$$
where $s$ is a vector dual to $X^{-1}b$. Note that $\|h\| = \varepsilon\|b\|$, $\|\Delta X\| = \varepsilon\|X\|$, and that the matrices defined in (3.6) and (3.7) are in this case $E = \Delta X\, X^{-1}$ and $F = 0$. From (3.2), we obtain that for this perturbation
$$\widetilde{x} - x = \varepsilon \left( \|b\|\, A^{-1}w + \|X^{-1}b\|\, \|X\|\, (A^{-1}w)(s^* X^{-1}b) \right) + O(\varepsilon^2) = \varepsilon \left( \|b\| + \|X^{-1}b\|\, \|X\| \right) A^{-1}w + O(\varepsilon^2).$$
So, for this perturbation,
$$\lim_{\varepsilon \to 0} \frac{\|\widetilde{x} - x\|}{\varepsilon\|x\|} = \left( 1 + \|X\|\,\frac{\|X^{-1}b\|}{\|b\|} \right) \frac{\|A^{-1}\|\,\|b\|}{\|x\|} \leqslant \kappa_f(X, D, Y, b),$$
where the last inequality follows again from the "sup" appearing in the definition of $\kappa_f(X, D, Y, b)$. It only remains to combine (3.12) and (3.13) to obtain
$$\kappa(Y) + \left( 1 + 2\,\|X\|\,\frac{\|X^{-1}b\|}{\|b\|} \right) \frac{\|A^{-1}\|\,\|b\|}{\|x\|} \leqslant 3\, \kappa_f(X, D, Y, b). \qquad \square$$

3.2 Why is $\|A^{-1}\|\,\|b\|/\|x\|$ usually small?

Theorems 3.1 and 3.2 prove that the sensitivity of the system $Ax = b$ to perturbations through factors is mainly governed by $\|A^{-1}\|\,\|b\|/\|x\|$, assuming that the factors $X$ and $Y$ of the RRD of $A$ are well conditioned. The quantity $\|A^{-1}\|\,\|b\|/\|x\|$ is well known in Numerical Linear Algebra because it is the usual condition number for normwise perturbations when only the right-hand side of the linear system $Ax = b$ is perturbed. For instance, it is proved in (Higham, 2002, p. 121) that
$$\kappa(A, b) := \lim_{\varepsilon \to 0}\, \sup\left\{ \frac{\|\widetilde{x} - x\|}{\varepsilon\|x\|} \,:\, A\widetilde{x} = b + h,\ \|h\| \leqslant \varepsilon\|b\| \right\} = \frac{\|A^{-1}\|\,\|b\|}{\|x\|}. \tag{3.14}$$
Therefore, we have proved in Theorem 3.1 that, loosely speaking, perturbations through the factors of an RRD $A = XDY$ have an effect on the solution similar to perturbing only the right-hand side $b$.

It is obvious that $1 \leqslant \kappa(A, b) \leqslant \kappa(A)$, but the key point in this section is to show that if $A$ is fixed and $\kappa(A) \gg 1$, then $\kappa(A, b) \ll \kappa(A)$ for most vectors $b$, that is, $\kappa(A, b)$ is usually a moderate number. This was brought to light by Chan & Foulser (1988); see also (Higham, 2002, Problem 7.5). The explanation of this fact is particularly simple for the Euclidean vector norm $\|\cdot\|_2$ and its subordinate matrix norm (Higham, 2002, p. 108), with the use of the SVD of $A$. Let $A = U\Sigma V^*$ be the SVD of $A$, where $U, V \in \mathbb{C}^{n\times n}$ are unitary, $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_n)$, and $\sigma_1 \geqslant \cdots \geqslant \sigma_n > 0$. Observe that the 2-norm of the solution of $Ax = b$ can be bounded as
$$\|x\|_2 = \|A^{-1}b\|_2 = \|\Sigma^{-1}U^*b\|_2 \geqslant \frac{|u_n^* b|}{\sigma_n},$$

and so (3.14) can be bounded simply as
$$\kappa_2(A, b) = \frac{\|b\|_2}{\sigma_n\, \|x\|_2} \leqslant \frac{\|b\|_2}{|u_n^* b|} = \frac{1}{\cos\theta(u_n, b)}, \tag{3.15}$$
where $u_n$ is the last column of $U$ and $\theta(u_n, b)$ is the acute angle between $u_n$ and $b$. Observe that the bound on $\kappa_2(A, b)$ in (3.15) may be large only if $b$ is "almost" orthogonal to $u_n$. For example, if $A$ is an extremely ill conditioned fixed matrix (say $\kappa(A) = 10^{1000}$, to be concrete) and the right-hand side $b$ is considered as a random vector whose direction is uniformly distributed in space, then the probability that $0 \leqslant \theta(u_n, b) \leqslant \pi/2 - 10^{-6}$ is approximately $1 - 10^{-6}$. Note that the condition $0 \leqslant \theta(u_n, b) \leqslant \pi/2 - 10^{-6}$ implies $\kappa_2(A, b) \lesssim 10^6$, which is a moderate number compared with $10^{1000}$. In particular, if the perturbation parameter in Theorem 3.1 is $\varepsilon = 10^{-16}$, then $\|A^{-1}\|_2\|b\|_2/\|x\|_2 \lesssim 10^6$ provides a very good bound for the variation of the solution. Even more, it is possible for $\kappa_2(A, b)$ to be moderate although $\cos\theta(u_n, b) \approx 0$; this is a consequence of Theorem 3.3 below.
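The typical moderateness of $\kappa_2(A, b)$ is easy to observe numerically. The following minimal NumPy sketch, an illustration of ours (the Hilbert matrix and the size are arbitrary choices, and the computed solution is used only for its realistic norm), evaluates $\kappa_2(A)$, $\kappa_2(A, b)$ from (3.14), and the bound (3.15) for a random right-hand side:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10
# Hilbert matrix: a classical, extremely ill conditioned example
A = 1.0 / (np.arange(1, n + 1)[:, None] + np.arange(n)[None, :])
U, sigma, Vt = np.linalg.svd(A)

b = rng.standard_normal(n)   # random right-hand side
x = np.linalg.solve(A, b)    # computed solution; its norm is accurate enough here

kappa_A = sigma[0] / sigma[-1]                                   # kappa_2(A)
kappa_Ab = np.linalg.norm(b) / (sigma[-1] * np.linalg.norm(x))   # kappa_2(A, b), eq. (3.14)
bound = np.linalg.norm(b) / abs(U[:, -1] @ b)                    # bound (3.15)

print(f"kappa_2(A)    = {kappa_A:.2e}")   # huge (about 1e13 for n = 10)
print(f"kappa_2(A, b) = {kappa_Ab:.2e}")  # typically a moderate number
print(f"bound (3.15)  = {bound:.2e}")
```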

THEOREM 3.3 (Chan & Foulser (1988)) Let $A = U\Sigma V^*$ be the SVD of $A \in \mathbb{C}^{n\times n}$ and let $P_k$ be the orthogonal projector onto the subspace spanned by the last $k$ columns of $U$. Then
$$\kappa_2(A, b) \leqslant \frac{\sigma_{n+1-k}}{\sigma_n}\, \frac{\|b\|_2}{\|P_k b\|_2} \qquad \text{for} \qquad k = 1:n. \tag{3.16}$$

If $\cos\theta(u_n, b) \approx 0$, then $\|P_2 b\|_2 \approx |u_{n-1}^* b|$, where $u_{n-1}$ is the next-to-last column of $U$, and
$$\kappa_2(A, b) \lesssim \frac{\sigma_{n-1}}{\sigma_n}\, \frac{1}{\cos\theta(u_{n-1}, b)}.$$
Whenever $\sigma_{n-1} \approx \sigma_n$, this bound will be moderate with high probability for random right-hand sides $b$ such that $\cos\theta(u_n, b) \approx 0$. Observe that (3.16) can be interpreted as saying that the effective condition number for the problem $A\widetilde{x} = b + h$ is $\sigma_{n+1-k}/\sigma_n$, where $k$ is the smallest natural number for which $\|P_k b\|_2/\|b\|_2$ is not too small.

4. Algorithm and error analysis

We have seen in Remark 3.1 that the sensitivity of the solution of the system $Ax = b$, where $A = XDY$, to perturbations of the factors of $A$ depends on the condition numbers of the factors $X$ and $Y$, which are small if $A = XDY$ is an RRD, and also on $\|A^{-1}\|\,\|b\|/\|x\|$, which is moderate for most vectors $b$ according to the discussion in Subsection 3.2. The purpose of this section is to prove that these quantities also determine the rounding errors of the solution of $Ax = b$ computed with the framework presented in the Introduction. Therefore, if an RRD of $A$ can be computed accurately in the sense of Definition 2.1, then this framework computes accurate solutions of the system $Ax = b$ for most vectors $b$. We start by specifying carefully the method of solution, and then we present the error analysis.

Algorithm 4.1 (Accurate solution of linear systems with rank-revealing decompositions)
Input: $A \in \mathbb{C}^{n\times n}$, $b \in \mathbb{C}^n$
Output: $x$, solution of $Ax = b$
Step 1: Compute an accurate RRD of $A = XDY$, with $D = \mathrm{diag}(d_1, d_2, \ldots, d_n)$, in the sense of Definition 2.1.

Step 2: Solve the three linear systems
$$Xs = b \longrightarrow s, \qquad Dw = s \longrightarrow w, \qquad Yx = w \longrightarrow x,$$
where $Xs = b$ and $Yx = w$ are solved by any backward stable (in norm) method, such as GEPP or the QR factorization, and $Dw = s$ is solved as $w_i = s_i/d_i$, $i = 1:n$.

As mentioned in the Introduction, in most cases RRDs are computed through variants of GECP, so $X$ and $Y$ are permutations of lower and upper triangular matrices, respectively, and $Xs = b$ and $Yx = w$ are solved simply with forward and backward substitution in $n^2$ operations (Higham, 2002, Chapter 8). However, there are classes of matrices for which this does not happen (see, for instance, the algorithm for computing RRDs of Vandermonde matrices presented in (Demmel, 1999, Section 5)), and we have decided to present Algorithm 4.1 and its error analysis in a more general context. A minimal code sketch of Step 2 is shown after Theorem 4.2 below.

Theorem 4.2 provides backward and forward errors for Algorithm 4.1. To simplify future references to Theorem 4.2, we include all necessary assumptions in the statement.

THEOREM 4.2 Let $\|\cdot\|$ be a vector norm in $\mathbb{C}^n$ whose subordinate matrix norm satisfies $\|\Lambda\| = \max_{i=1:n} |\lambda_i|$ for all diagonal matrices $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. Let $\widehat{X}$, $\widehat{D}$, and $\widehat{Y}$ be the factors of $A$ computed in Step 1 of Algorithm 4.1 and assume that they satisfy
$$\frac{\|\widehat{X} - X\|}{\|X\|} \leqslant u\, p(n), \qquad \frac{\|\widehat{Y} - Y\|}{\|Y\|} \leqslant u\, p(n), \qquad \text{and} \qquad |\widehat{D} - D| \leqslant u\, p(n)\, |D|, \tag{4.1}$$
where $p(n)$ is a modestly growing function of $n$, $X$, $D$, and $Y$ are the corresponding exact factors of $A$, and $u$ is the unit roundoff. Assume also that the systems $Xs = b$ and $Yx = w$ are solved with a backward stable algorithm that, when applied to any linear system $Bz = c$, $B \in \mathbb{C}^{n\times n}$ and $c \in \mathbb{C}^n$, computes a solution $\widehat{z}$ satisfying†
$$(B + \Delta B)\,\widehat{z} = c, \qquad \text{with} \quad \|\Delta B\| \leqslant u\, q(n)\, \|B\|, \tag{4.2}$$
where $q(n)$ is a modestly growing function of $n$ such that $q(n) \geqslant 4\sqrt{2}/(1 - 12u)$. Let $g(n) := p(n) + q(n) + u\, p(n) q(n)$.

1. If $\widehat{x}$ is the computed solution of $Ax = b$ using Algorithm 4.1, then
$$(X + \Delta X)(D + \Delta D)(Y + \Delta Y)\,\widehat{x} = b,$$
where $\|\Delta X\| \leqslant u\, g(n)\, \|X\|$, $|\Delta D| \leqslant u\, g(n)\, |D|$, and $\|\Delta Y\| \leqslant u\, g(n)\, \|Y\|$.

2. In addition, if $x$ is the exact solution of $Ax = b$, $u\, g(n)\, \kappa(Y) < 1$, and $u\, g(n)(2 + u\, g(n))\, \kappa(X) < 1$, then
$$\frac{\|\widehat{x} - x\|}{\|x\|} \leqslant \frac{u\, g(n)}{1 - u\, g(n)\kappa(Y)} \left( \kappa(Y) + \frac{1 + (2 + u\, g(n))\, \kappa(X)}{1 - u\, g(n)(2 + u\, g(n))\kappa(X)}\, \frac{\|A^{-1}\|\,\|b\|}{\|x\|} \right). \tag{4.3}$$

†We are assuming here that there are no backward errors in the right-hand side $c$. This happens for Gaussian elimination (Higham, 2002, p. 165) and also if the QR factorization is used for solving the system (Higham, 2002, eq. (19.14)). It is possible to consider backward errors in $c$ at the cost of complicating somewhat the analysis.
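For concreteness, the following minimal NumPy sketch of Step 2 of Algorithm 4.1 illustrates the solution phase. It assumes that the factors $X$, $d$ (the diagonal of $D$), and $Y$ have already been computed by an accurate structured algorithm such as those cited in the Introduction, and it uses numpy.linalg.solve (GEPP) as a generic backward stable solver:

```python
import numpy as np

def solve_via_rrd(X, d, Y, b):
    """Step 2 of Algorithm 4.1: solve A x = b given an RRD A = X @ diag(d) @ Y.

    X and Y are assumed well conditioned, so any backward stable method may
    be used for them; the ill conditioning of A is confined to the diagonal
    factor, which is inverted entrywise with high relative accuracy.
    """
    s = np.linalg.solve(X, b)      # X s = b  (backward stable, e.g. GEPP)
    w = s / d                      # D w = s  (entrywise division)
    return np.linalg.solve(Y, w)   # Y x = w
```

When $X$ and $Y$ are permutations of triangular matrices, the two calls to a general solver would be replaced by forward and backward substitution, as described above.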

REMARK 4.1 The error bound (4.3) simplifies considerably if we pay attention only to the first order terms:
$$\frac{\|\widehat{x} - x\|}{\|x\|} \leqslant u\, g(n) \left[ \kappa(Y) + \left( 1 + 2\,\kappa(X) \right) \frac{\|A^{-1}\|\,\|b\|}{\|x\|} \right] + O\!\left( (u\, g(n))^2 \right) \tag{4.4}$$
$$\leqslant 4\, u\, g(n)\, \max\{\kappa(X), \kappa(Y)\}\, \frac{\|A^{-1}\|\,\|b\|}{\|x\|} + O\!\left( (u\, g(n))^2 \right). \tag{4.5}$$
These bounds show clearly that the relative error in the solution is $O(u)$ if $\kappa(X)$, $\kappa(Y)$, and $\|A^{-1}\|\,\|b\|/\|x\|$ are moderate numbers.

Proof of Theorem 4.2. Part 2 follows directly from Part 1 and Remark 3.1. Therefore, we only need to prove Part 1. Observe that (4.1) implies
$$\widehat{X} = X + \Delta X_1, \qquad \widehat{D} = D + \Delta D_1, \qquad \widehat{Y} = Y + \Delta Y_1, \tag{4.6}$$
where $\|\Delta X_1\| \leqslant u\, p(n)\, \|X\|$, $|\Delta D_1| \leqslant u\, p(n)\, |D|$, and $\|\Delta Y_1\| \leqslant u\, p(n)\, \|Y\|$. From (4.6), we also obtain
$$\|\widehat{X}\| \leqslant (1 + u\, p(n))\, \|X\|, \qquad |\widehat{D}| \leqslant (1 + u\, p(n))\, |D|, \qquad \|\widehat{Y}\| \leqslant (1 + u\, p(n))\, \|Y\|, \tag{4.7}$$
which will be used in the sequel.

Next, we use (4.2), Section 3.6 in (Higham, 2002) for the rounding error committed in a floating point complex division, and (4.7) to bound the backward errors when solving the linear systems in Step 2 of Algorithm 4.1:

(i) the computed solution $\widehat{s}$ of $\widehat{X}s = b$ satisfies
$$(\widehat{X} + \Delta X_2)\, \widehat{s} = b, \qquad \text{with} \quad \|\Delta X_2\| \leqslant u\, q(n)\, \|\widehat{X}\| \leqslant u\, q(n)(1 + u\, p(n))\, \|X\|; \tag{4.8}$$

(ii) the computed solution $\widehat{w}$ of $\widehat{D}w = \widehat{s}$ satisfies
$$(\widehat{D} + \Delta D_2)\, \widehat{w} = \widehat{s}, \qquad \text{with} \quad |\Delta D_2| \leqslant u\, \frac{4\sqrt{2}}{1 - 12u}\, |\widehat{D}| \leqslant u\, q(n)(1 + u\, p(n))\, |D|; \tag{4.9}$$

(iii) the computed solution $\widehat{x}$ of $\widehat{Y}x = \widehat{w}$ satisfies
$$(\widehat{Y} + \Delta Y_2)\, \widehat{x} = \widehat{w}, \qquad \text{with} \quad \|\Delta Y_2\| \leqslant u\, q(n)\, \|\widehat{Y}\| \leqslant u\, q(n)(1 + u\, p(n))\, \|Y\|. \tag{4.10}$$

Putting together (4.8), (4.9) and (4.10), we find that the computed solution $\widehat{x}$ satisfies
$$(\widehat{X} + \Delta X_2)(\widehat{D} + \Delta D_2)(\widehat{Y} + \Delta Y_2)\, \widehat{x} = b,$$
or, with (4.6),
$$(X + \Delta X_1 + \Delta X_2)(D + \Delta D_1 + \Delta D_2)(Y + \Delta Y_1 + \Delta Y_2)\, \widehat{x} = b,$$
that is,
$$(X + \Delta X)(D + \Delta D)(Y + \Delta Y)\, \widehat{x} = b,$$
with $\|\Delta X\| \leqslant u\, g(n)\, \|X\|$, $|\Delta D| \leqslant u\, g(n)\, |D|$, and $\|\Delta Y\| \leqslant u\, g(n)\, \|Y\|$. □

If we ignore in (4.4) second order terms and the pessimistic dimensional factor $g(n)$, then we obtain
$$\frac{\|\widehat{x} - x\|}{\|x\|} \lesssim u \left[ \kappa(\widehat{Y}) + \left( 1 + 2\,\kappa(\widehat{X}) \right) \frac{\|A^{-1}\|\,\|b\|}{\|\widehat{x}\|} \right] =: \Theta_1, \tag{4.11}$$
which can be used to estimate the forward error in the solution. We have already mentioned that the RRD $\widehat{X}\widehat{D}\widehat{Y}$ is in most cases a (permuted) LDU factorization, so $\kappa(\widehat{X})$, $\kappa(\widehat{Y})$, and $\|A^{-1}\|$ can be estimated in $O(n^2)$ operations for p-norms (Higham, 2002, Chapter 15). However, it is also possible to compute, from already calculated quantities, a realistic bound without the use of $A^{-1}$. From
$$\|A^{-1}\| = \|Y^{-1} D^{-1} X^{-1}\| \leqslant \frac{\|Y^{-1}\|\, \|X^{-1}\|}{\min_i |d_i|},$$
the following simpler estimate of the error follows:
$$\frac{\|\widehat{x} - x\|}{\|x\|} \lesssim u \left[ \kappa(\widehat{Y}) + \left( 1 + 2\,\kappa(\widehat{X}) \right) \frac{\|\widehat{Y}^{-1}\|\, \|\widehat{X}^{-1}\|\, \|b\|}{\min_i |\widehat{d}_i|\, \|\widehat{x}\|} \right] =: \Theta_2. \tag{4.12}$$
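As an illustration of how $\Theta_2$ can be evaluated from computed quantities alone, the following short sketch of ours computes the estimate for given factors; for simplicity it uses dense 2-norm condition numbers and explicit inverses, rather than the $O(n^2)$ p-norm estimators cited above:

```python
import numpy as np

def theta2(X, d, Y, b, x_hat, u=np.finfo(float).eps / 2):
    """Error estimate (4.12) from the computed RRD factors and solution."""
    kX = np.linalg.cond(X)   # kappa(X); cheap condition estimators could be used instead
    kY = np.linalg.cond(Y)
    inv_norms = np.linalg.norm(np.linalg.inv(Y), 2) * np.linalg.norm(np.linalg.inv(X), 2)
    factor = inv_norms * np.linalg.norm(b) / (np.min(np.abs(d)) * np.linalg.norm(x_hat))
    return u * (kY + (1.0 + 2.0 * kX) * factor)
```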

Both quantities $\Theta_1$ and $\Theta_2$ will be computed in the numerical tests of Section 5, and we will see that they are reliable estimators of the true error in the solution.

5. Numerical Experiments

In this section we show the results obtained from testing Algorithm 4.1 on two important classes of structured matrices: Cauchy and Vandermonde matrices. For matrices in these classes, accurate RRDs in the sense of Definition 2.1 can be computed using the algorithms developed by Demmel (1999) with a cost of $4n^3/3 + O(n^2)$ operations plus $n^3/3 + O(n^2)$ comparisons. Note that this cost is double the cost of usual GECP. We will compare the relative errors committed by Algorithm 4.1 with those committed by other algorithms available in the literature, and we will see that Algorithm 4.1 is by far the most accurate one‡ and that, for random right-hand sides, it achieves relative normwise errors approximately equal to the unit roundoff $u$ for any distribution of the nodes defining Cauchy and Vandermonde matrices. We will also check how well the quantities $\Theta_1$ and $\Theta_2$ defined in (4.11) and (4.12) estimate the actual errors. We use the Euclidean (two) norm in all the experiments and present tests only for matrices with real entries.

5.1 Cauchy matrices

The entries of a Cauchy matrix $C \in \mathbb{R}^{n\times n}$ are defined in terms of two vectors $x = [x_1, \ldots, x_n]^T$, $y = [y_1, \ldots, y_n]^T \in \mathbb{R}^n$ as
$$c_{ij} = \frac{1}{x_i + y_j}. \tag{5.1}$$
It is well known that $C$ is totally positive if $x_1 + y_1 > 0$, $x_1 < x_2 < \cdots < x_n$, and $y_1 < y_2 < \cdots < y_n$ (Gantmacher & Krein, 2002, p. 78). Matrices of the form $G = D_1 C D_2$, where $C$ is Cauchy and $D_1$, $D_2$ are diagonal, are called quasi-Cauchy matrices by Demmel (1999); they include Cauchy matrices as the particular case $D_1 = D_2 = I$. Algorithm 3 in (Demmel, 1999) uses a structured version of GECP to compute accurate RRDs of any quasi-Cauchy matrix with a cost of $4n^3/3 + O(n^2)$ operations plus $n^3/3 + O(n^2)$ comparisons, and so this algorithm can be used in Step 1 of Algorithm 4.1.

‡See, however, our comments below on the method BjP-PPP in Figure 4.
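For reference, the test matrices themselves are trivial to generate. The following sketch of ours builds a Cauchy matrix from its node vectors in both the TP and the non-TP configurations, matching the experimental setup described below (the node distributions are the only assumptions made here):

```python
import numpy as np

rng = np.random.default_rng(0)

def cauchy(x, y):
    """Cauchy matrix with entries 1 / (x_i + y_j), as in (5.1)."""
    return 1.0 / (x[:, None] + y[None, :])

n = 50
# Totally positive: x_1 + y_1 > 0 and both node vectors strictly increasing
x_tp = np.sort(rng.random(n))
y_tp = np.sort(rng.random(n))
C_tp = cauchy(x_tp, y_tp)

# Non totally positive: normally distributed nodes with arbitrary signs
C_ntp = cauchy(rng.standard_normal(n), rng.standard_normal(n))
```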

[Figure 1 about here: log-scale plot of the forward relative error $\|\widehat{x} - x\|_2/\|x\|_2$ against the matrix size $n$, for the methods GECP, RRD, BKO, and BKO-PPP and the estimate $\Theta_1$.]

FIG. 1. Forward relative error $\|\widehat{x} - x\|_2/\|x\|_2$ against size for Totally Positive Cauchy matrices.

In the numerical experiments solving linear systems $Ax = b$, where $A$ is a Cauchy matrix, we have compared the following four algorithms, implemented in MATLAB™:

• GECP: standard GECP, that is, we compute first the entries of the matrix and then apply GECP. Cost:§ $2n^3/3$ operations + $n^3/3$ comparisons.

• RRD: Algorithm 4.1 implemented using Algorithm 3 in (Demmel, 1999) for Step 1, while for Step 2 we solve $Xs = b$ with forward substitution (recall that $X$ is a permutation of a lower triangular matrix), $Dw = s$ as $w_i = s_i/d_i$ for $i = 1:n$, and $Yx = w$ with backward substitution (recall that $Y$ is a permutation of an upper triangular matrix). Cost: $4n^3/3$ operations + $n^3/3$ comparisons.

• BKO: We also use Algorithm 3.2 by Boros et al. (1999), a fast method for solving Cauchy linear systems that delivers satisfactory componentwise bounds for the relative forward error of the solution, for most right-hand sides (in the sense explained in Section 3.2), only for totally positive Cauchy matrices (Boros et al., 1999). When the matrix is not totally positive, the current analysis does not guarantee good bounds for the errors, even in a normwise sense. Cost: $7n^2$ operations.

§When the cost is of order $n^3$ we omit the terms $O(n^2)$.

• BKO-PPP: Finally, we use the BKO algorithm above, but ordering the nodes $x$ in advance using Algorithm 5.2 by Boros et al. (2002). This ordering costs $O(n^2)$ operations and corresponds to the order of the rows of $C$ that would be obtained by applying GEPP; this is why the strategy is called Predictive Partial Pivoting by Boros et al. (1999, 2002). This method does not have guaranteed accuracy even when the matrix is totally positive (Boros et al., 1999). Cost: $9n^2$ operations + $n^2/2$ comparisons.

[Figure 2 about here: log-scale plot of the forward relative error $\|\widehat{x} - x\|_2/\|x\|_2$ against the matrix size $n$, for the methods GECP, RRD, BKO, and BKO-PPP and the estimate $\Theta_1$.]

FIG. 2. Forward relative error $\|\widehat{x} - x\|_2/\|x\|_2$ against size for non-Totally Positive Cauchy matrices.

We have generated random Cauchy matrices, both totally positive (TP), with uniformly distributed random node vectors $x$ and $y$ (MATLAB™ command rand) satisfying the TP requirements $x_1 + y_1 > 0$, $x_1 < x_2 < \cdots < x_n$, and $y_1 < y_2 < \cdots < y_n$, and non totally positive (nTP), with normally distributed random node vectors $x$ and $y$ (MATLAB™ command randn). We have used in all tests normally distributed random right-hand side vectors $b$. The sizes of the matrices range from $n = 5$ to $n = 100$, and the condition numbers range over $10^6 \lesssim \kappa(C) \lesssim 10^{200}$. For each linear system, we take as "exact" solution the one computed with the variable precision arithmetic of MATLAB™ set to $\log_{10}\kappa(C) + 30$ decimal digits, where $\kappa(C)$ is estimated from the $D$ factor of the RRD as $\kappa(C) \approx \max_i |d_i| / \min_i |d_i|$.

The results are shown in Figure 1 for TP matrices and in Figure 2 for non-TP matrices. We plot, in a log scale, the relative error $\|\widehat{x} - x\|_2/\|x\|_2$ of the solution against the size of the matrices. We run five different cases for every size and plot the maximum error among the five cases. Besides the relative errors for the four methods mentioned above, we also display (dashed line) the quantity $\Theta_1$ appearing in (4.11). It can be observed in both figures that the bound (4.11) is very sharp. We have also computed the quantity $\Theta_2$ in (4.12); for clarity it is not shown in the figures, but it behaves like $\Theta_1$, being approximately a factor of 10 bigger over the range of our experiments.

It can be seen in Figure 1 that, except for GECP, which produces huge errors due to the ill conditioning of Cauchy matrices, the other three methods attain full accuracy ($\approx u$) when the matrices are totally positive. This is the expected accuracy for the methods BKO and RRD; however, the method with predictive partial pivoting, BKO-PPP, also attains full accuracy without there being a theoretical result that guarantees it. We have reproduced the experiment in (Boros et al., 1999, Figure 2), obtaining the same loss of accuracy for the BKO-PPP method on the TP matrices shown there, while our method (RRD) behaves perfectly well. More importantly, Figure 2 also shows that the algorithms introduced by Boros, Kailath and Olshevsky in (Boros et al., 1999, 2002) do not deliver high relative accuracy when the Cauchy matrices are not totally positive, while the algorithm proposed in this paper, RRD (continuous ∗-line), does.

An important remark about the sources of the errors committed by GECP applied to Cauchy (and also to Vandermonde) matrices is in order here. A key difference between GECP and the other methods presented in this section is that the input data for the former are the entries of the matrix $c_{ij}$, and these have to be computed from the data vectors $x$ and $y$, which are the input data for the other methods. Therefore, GECP is applied to a system $\widehat{C}x = b$, where $\widehat{C}$ is not the exact Cauchy matrix $C$. Due to the usually huge condition number of $C$, even the exact solution of $\widehat{C}x = b$ would differ enormously from the exact solution of $Cx = b$. In addition, GECP computes a solution $\widehat{x}$ of $\widehat{C}x = b$ that also differs greatly from the exact solution, again as a consequence of the huge condition number of $\widehat{C}$; this is the second source of error for GECP. Both sources of error contribute essentially the same amount to the final relative error, i.e., $\sim u\,\kappa(C)$. A way to check that the first source of error in GECP, the one coming from forming the elements of the matrix, is not the only cause of the bad forward errors is to create a matrix that can be formed without rounding errors. This is not easy to do for Cauchy matrices, but it can be done easily for Vandermonde matrices, by using exactly representable floating point numbers (sums of powers of two) for the components of the vector $x$. We have performed those experiments and observed that the forward errors behave in the same way as in Figures 3 and 4 (see below); in particular, those of GECP are still huge despite the fact that the matrix is formed without rounding errors.

5.2 Vandermonde matrices

We have performed similar numerical tests of Algorithm 4.1 on Vandermonde matrices. Given $x = [x_1, \ldots, x_n]^T \in \mathbb{R}^n$, a Vandermonde matrix $V \in \mathbb{R}^{n\times n}$ is a matrix whose entries are given by
$$v_{ij} = x_i^{j-1}. \tag{5.2}$$
It is well known that $V$ is totally positive if $0 < x_1 < \cdots < x_n$ (Gantmacher & Krein, 2002, p. 76). A method to compute an accurate RRD of any Vandermonde matrix was presented in (Demmel, 1999, Section 5). It is based on the fact that if $F \in \mathbb{C}^{n\times n}$ is the $n \times n$ discrete Fourier transform matrix, then $VF$ is a quasi-Cauchy matrix whose parameters can be accurately computed in $O(n^2)$ operations, as well as the sums and differences of any pair of these parameters. Then, an accurate RRD $VF = XDY$ can be computed with Algorithm 3 in (Demmel, 1999). Finally, $V = XD(YF^*)$ is an accurate RRD of $V$. For

most applications, in particular for linear system solving, $YF^*$ can be stored in factored form; then the method costs $4n^3/3 + O(n^2)$ operations plus $n^3/3 + O(n^2)$ comparisons. If one insists on computing $YF^*$ explicitly, this can be done accurately in $O(n^2 \log n)$ operations through the fast Fourier transform (Higham, 2002, Chapter 24). Let us note that the method above additionally requires, to guarantee the desired accuracy, that the primitive $n$th roots of unity be computed in advance with high precision; they can be computed and stored once and for all. This suffices if the nodes $x_i$ are real (as in our experiments); if the nodes $x_i$ are complex, further demands on the precision of the computation of the elements of the matrix $F$ are necessary (Demmel, 1999).

In the numerical experiments solving linear systems $Ax = b$, where $A$ is a Vandermonde matrix, we have compared the following four algorithms, implemented in MATLAB™:

• GECP: standard GECP, that is, we compute first the entries of the matrix and then apply GECP. Cost: $2n^3/3$ operations + $n^3/3$ comparisons.

• RRD: Algorithm 4.1 implemented using the method in Section 5 of (Demmel, 1999) for computing in Step 1 an RRD $A = XD(YF^*)$, while for Step 2 we solve $Xs = b$ with forward substitution (recall that $X$ is a permutation of a lower triangular matrix), $Dw = s$ as $w_i = s_i/d_i$ for $i = 1:n$, $Yx' = w$ with backward substitution (recall that $Y$ is a permutation of an upper triangular matrix), and finally $x = Fx'$. Cost: $4n^3/3$ operations + $n^3/3$ comparisons.

• BjP: We also use the fast algorithm introduced by Björck & Pereyra (1970); see also Algorithm 4.6.1 in Golub & Van Loan (1996). This algorithm guarantees excellent componentwise bounds for the relative forward error of the solution, for most right-hand sides (in the sense explained in Section 3.2), only for totally positive Vandermonde matrices (Higham, 1987, 2002). When the matrix is not totally positive, the current analysis does not guarantee good bounds for the errors, even in a normwise sense. Cost: $9n^2/2$ operations.

• BjP-PPP: Finally, we use the BjP algorithm above, but ordering the nodes $x$ in advance using essentially the Leja ordering (Reichel, 1990). This ordering can be implemented in $n^2/2$ operations and $n^2/2$ comparisons (Higham, 1990, Algorithm 5.1) and corresponds to the order of the rows of $V$ that would be obtained by applying GEPP. It is known that the accuracy of this last algorithm is not guaranteed even when the matrix is TP (Higham, 1990). Cost: $5n^2$ operations + $n^2/2$ comparisons.
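As a side illustration of the identity underlying the RRD method for Vandermonde matrices, the following sketch of ours forms $V$ and a DFT matrix $F$ explicitly and checks the Cauchy-like structure of $VF$. Forming $F$ naively like this is for illustration only; the accurate algorithm of Demmel (1999) works with the quasi-Cauchy parameters directly and never forms these products in floating point:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
x = rng.standard_normal(n)

V = np.vander(x, increasing=True)   # entries x_i^(j-1), as in (5.2)

# Unitary DFT matrix built from a primitive n-th root of unity
omega = np.exp(-2j * np.pi / n)
F = omega ** (np.arange(n)[:, None] * np.arange(n)[None, :]) / np.sqrt(n)

# Quasi-Cauchy structure: (V F)_{ik} = (1 - x_i^n) / (sqrt(n) (1 - x_i omega^k)),
# i.e. a Cauchy-like matrix scaled by the diagonal factors (1 - x_i^n) / sqrt(n)
k = np.arange(n)
QC = (1 - x[:, None] ** n) / (np.sqrt(n) * (1 - x[:, None] * omega ** k[None, :]))
print(np.max(np.abs(V @ F - QC)))   # small: the identity holds up to roundoff
```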

For these algorithms we have proceeded as in Section 5.1. We have generated random Vandermonde matrices, both totally positive (TP), with uniformly distributed random node vectors $x$ (MATLAB™ command rand) satisfying the TP requirement $0 < x_1 < x_2 < \cdots < x_n$, and non totally positive (nTP), with normally distributed random node vectors $x$ (MATLAB™ command randn). We have used in all tests normally distributed random right-hand side vectors $b$. The sizes of the matrices range from $n = 5$ to $n = 100$, and the condition numbers range over $10^2 \lesssim \kappa(V) \lesssim 10^{100}$.

The results are shown in Figure 3 for TP matrices and in Figure 4 for non-TP matrices. For TP matrices all algorithms, except GECP, attain full accuracy ($\approx u$). This is the expected accuracy for the methods BjP and RRD; however, the method with predictive partial pivoting, BjP-PPP, also attains full accuracy without there being a theoretical result that guarantees it. There are experiments with TP Vandermonde matrices in (Boros et al., 1999) showing that, for special distributions of nodes and right-hand sides with small

[Figure 3 about here: log-scale plot of the forward relative error $\|\widehat{x} - x\|_2/\|x\|_2$ against the matrix size $n$, for the methods GECP, RRD, BjP, and BjP-PPP and the estimate $\Theta_1$.]

FIG. 3. Forward relative error $\|\widehat{x} - x\|_2/\|x\|_2$ against size for Totally Positive Vandermonde matrices.

$\|A^{-1}\|\,\|b\|/\|x\|$, the accuracy is lost by the BjP-PPP method. However, Figure 4 shows that, when the Vandermonde matrices are not totally positive, Algorithm 4.1 delivers full accuracy while BjP does not. Therefore, as predicted by the error analysis, the behavior of the RRD algorithm is excellent both in the TP and the non-TP cases. It is also remarkable that the quantity $\Theta_1$ (dashed line) is very close to the error of the method RRD (continuous ∗-line), being sometimes smaller. This shows how sharp the bound (4.5) is, and that the factor $g(n)$ is $O(1)$ for these experiments. There is one fact in Figure 4 for which we do not have an explanation: the BjP-PPP algorithm also shows full accuracy. As far as we know, there is no error analysis that guarantees this good behavior.

5.3 Experiments with $\|A^{-1}\|\,\|b\|/\|x\|$ not small

Besides the experiments with random matrices presented in Subsections 5.1 and 5.2, we have performed experiments in which $\|A^{-1}\|\,\|b\|/\|x\|$ is not small. To do this, the right-hand side is prepared to be $b = u_1$, the first left singular vector of $A$ (see Theorem 3.3). In this case, both for Cauchy and for Vandermonde matrices, the forward error of the solution is, as expected, large and proportional to the unit roundoff times the condition number of the matrices. The results for six Cauchy matrices, three TP and three non-TP, are presented in Tables 1 and 2; the results for Vandermonde matrices are similar

[Figure 4 about here: log-scale plot of the forward relative error $\|\widehat{x} - x\|_2/\|x\|_2$ against the matrix size $n$, for the methods GECP, RRD, BjP, and BjP-PPP and the estimate $\Theta_1$.]

FIG. 4. Forward relative error $\|\widehat{x} - x\|_2/\|x\|_2$ against size for non-Totally Positive Vandermonde matrices.

and not presented here for brevity. Let us note, however, that to get these results the vector $b$ has to be prepared using a high accuracy algorithm for the SVD (Demmel et al., 1999). If the vector $b$ is prepared using usual floating point arithmetic and the svd command in MATLAB™, the rounding errors make it impossible for $b$ to lie exactly in the direction of $u_1$: it develops small components in the direction of the other left singular vectors, and the relative forward error becomes of order $u$.
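For reference, preparing such a right-hand side is simple if an accurate SVD routine is available. In the following sketch of ours, numpy.linalg.svd is used only as a placeholder for the high relative accuracy SVD algorithm of Demmel et al. (1999):

```python
import numpy as np

def worst_case_rhs(A):
    """Right-hand side b = u_1 (first left singular vector), which maximizes
    kappa(A, b) = ||A^{-1}|| ||b|| / ||x||; see Section 5.3 and Theorem 3.3.

    NOTE: numpy.linalg.svd stands in for a high relative accuracy SVD here;
    with an ordinary SVD, rounding errors give b small components along the
    other left singular vectors, and the observed forward error drops to O(u).
    """
    U, _, _ = np.linalg.svd(A)
    return U[:, 0]
```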

         n = 10     n = 30     n = 50
GECP     0.33       1.00       1.00
RRD      3.49       4.00       3.71
BKO      1.41       0.49       0.41

Table 1. Experiments with $\|A^{-1}\|\,\|b\|/\|x\|$ not small. The forward relative error $\|\Delta x\|_2/\|x\|_2$ is displayed for three different methods for Totally Positive Cauchy matrices, for sizes $n = 10$ ($\|A^{-1}\|_2\|b\|_2/\|x\|_2 = 1.4 \cdot 10^{17}$), $n = 30$ ($\|A^{-1}\|_2\|b\|_2/\|x\|_2 = 2.1 \cdot 10^{17}$), and $n = 50$ ($\|A^{-1}\|_2\|b\|_2/\|x\|_2 = 2.0 \cdot 10^{17}$).

         n = 10             n = 30             n = 50
GECP     1.46 · 10^{-7}     4.30 · 10^{-9}     2.04 · 10^{-5}
RRD      9.13 · 10^{-9}     4.44 · 10^{-8}     7.85 · 10^{-5}
BKO      1.81 · 10^{-3}     5.33 · 10^{-4}     2.49 · 10^{3}

Table 2. Experiments with $\|A^{-1}\|\,\|b\|/\|x\|$ not small. The forward relative error $\|\Delta x\|_2/\|x\|_2$ is displayed for three different methods for non-Totally Positive Cauchy matrices, for sizes $n = 10$ ($\|A^{-1}\|_2\|b\|_2/\|x\|_2 = 2.5 \cdot 10^{10}$), $n = 30$ ($\|A^{-1}\|_2\|b\|_2/\|x\|_2 = 1.4 \cdot 10^{10}$), and $n = 50$ ($\|A^{-1}\|_2\|b\|_2/\|x\|_2 = 1.6 \cdot 10^{15}$).

6. Conclusions

We have introduced a numerical method that employs accurate rank-revealing decompositions (RRDs) to solve accurately, and with a cost of $O(n^3)$ operations, many classes of structured linear systems $Ax = b$, even if the matrix $A \in \mathbb{C}^{n\times n}$ has a huge condition number. The precise cost of this procedure is given by the cost of computing an accurate RRD of the particular class of structured matrices under consideration. We have presented a complete error analysis of this method, based on a new structured perturbation theory for linear systems in which $A$ is perturbed through the factors of an RRD. We have performed very satisfactory numerical tests on Cauchy and Vandermonde matrices with arbitrary distributions of the nodes. These tests show that the accuracy of the proposed algorithm is essentially perfect in extreme situations where every other algorithm fails to compute accurate solutions. The accuracy of our method is governed, essentially, by the unit roundoff times the factor $\|A^{-1}\|\,\|b\|/\|x\|$, while other methods that achieve the same accuracy in some experiments (such as BjP-PPP in Figure 4) do so without theoretical bounds that guarantee it. It remains an open problem to find fast algorithms, i.e., with $O(n^2)$ cost, for solving linear systems with guaranteed accuracy for Cauchy and Vandermonde matrices. It should be noted that the method we propose will compute accurate solutions of linear systems $Ax = b$ for any class of matrices $A$ for which accurate RRDs can be computed, now or in the future, and for most vectors $b$.

Acknowledgements

We thank the referees and the Associate Editor of this paper, Nick Higham, for very useful comments and remarks that helped us to improve this paper. This research was partially supported by the Ministerio de Ciencia e Innovación of Spain through grant MTM2009-09281.

REFERENCES

BANOCZI, J. M., CHIU, N.-C., CHO, G. E. & IPSEN, I. C. F. (1998) The lack of influence of the right-hand side on the accuracy of linear system solution. SIAM J. Sci. Comput., 20, 203–227.
BJÖRCK, Å. & ELFVING, T. (1973/74) Algorithms for confluent Vandermonde systems. Numer. Math., 21, 130–137.
BJÖRCK, Å. & PEREYRA, V. (1970) Solution of Vandermonde systems of equations. Math. Comp., 24, 893–903.
BOROS, T., KAILATH, T. & OLSHEVSKY, V. (1999) A fast parallel Björck-Pereyra-type algorithm for solving Cauchy linear equations. Linear Algebra Appl., 302/303, 265–293.
BOROS, T., KAILATH, T. & OLSHEVSKY, V. (2002) Pivoting and backward stability of fast algorithms for solving Cauchy linear equations. Linear Algebra Appl., 343/344, 63–99.

CHAN, T. F. & FOULSER, D. E. (1988) Effectively well-conditioned linear systems. SIAM J. Sci. Statist. Comput., 9, 963–969.
DEMMEL, J. (1999) Accurate singular value decompositions of structured matrices. SIAM J. Matrix Anal. Appl., 21, 562–580.
DEMMEL, J., GU, M., EISENSTAT, S., SLAPNIČAR, I., VESELIĆ, K. & DRMAČ, Z. (1999) Computing the singular value decomposition with high relative accuracy. Linear Algebra Appl., 299, 21–80.
DEMMEL, J. & KOEV, P. (2004) Accurate SVDs of weakly diagonally dominant M-matrices. Numer. Math., 98, 99–104.
DEMMEL, J. & KOEV, P. (2006) Accurate SVDs of polynomial Vandermonde matrices involving orthonormal polynomials. Linear Algebra Appl., 417, 382–396.
DEMMEL, J. & VESELIĆ, K. (1992) Jacobi's method is more accurate than QR. SIAM J. Matrix Anal. Appl., 13, 1204–1246.
DOPICO, F. M., MOLERA, J. M. & MORO, J. (2003) An orthogonal high relative accuracy algorithm for the symmetric eigenproblem. SIAM J. Matrix Anal. Appl., 25, 301–351.
DOPICO, F. M., KOEV, P. & MOLERA, J. M. (2009) Implicit standard Jacobi gives high relative accuracy. Numer. Math., 113, 519–553.
DOPICO, F. M. & KOEV, P. (2006) Accurate symmetric rank revealing and eigendecompositions of symmetric structured matrices. SIAM J. Matrix Anal. Appl., 28, 1126–1156.
DOPICO, F. M. & KOEV, P. (2010) Perturbation theory for the LDU factorization and accurate computations for diagonally dominant matrices. Submitted.
EISENSTAT, S. & IPSEN, I. (1995) Relative perturbation techniques for singular value problems. SIAM J. Numer. Anal., 32, 1972–1988.
GANTMACHER, F. & KREIN, M. (2002) Oscillation Matrices and Kernels and Small Vibrations of Mechanical Systems, revised edn. AMS Chelsea, Providence, RI, pp. viii+310.
GOLUB, G. & VAN LOAN, C. (1996) Matrix Computations, 3rd edn. Baltimore, MD: Johns Hopkins University Press.
HIGHAM, N. J. (1987) Error analysis of the Björck-Pereyra algorithms for solving Vandermonde systems. Numer. Math., 50, 613–632.
HIGHAM, N. J. (1988) Fast solution of Vandermonde-like systems involving orthogonal polynomials. IMA J. Numer. Anal., 8, 473–486.
HIGHAM, N. J. (1990) Stability analysis of algorithms for solving confluent Vandermonde-like systems. SIAM J. Matrix Anal. Appl., 11, 23–41.
HIGHAM, N. J. (2002) Accuracy and Stability of Numerical Algorithms, second edn. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM), pp. xxx+680.
HORN, R. A. & JOHNSON, C. R. (1985) Matrix Analysis. Cambridge: Cambridge University Press.
IPSEN, I. C. F. (1998) Relative perturbation results for matrix eigenvalues and singular values. Acta Numerica, 1998. Acta Numer., vol. 7. Cambridge: Cambridge Univ. Press, pp. 151–201.
IPSEN, I. C. F. (2000) An overview of relative sin Θ theorems for invariant subspaces of complex matrices. J. Comput. Appl. Math., 123, 131–153. Numerical analysis 2000, Vol. III. Linear algebra.
LI, R.-C. (1998) Relative perturbation theory. I. Eigenvalue and singular value variations. SIAM J. Matrix Anal. Appl., 19, 956–982.
LI, R.-C. (1999) Relative perturbation theory. II. Eigenspace and singular subspace variations. SIAM J. Matrix Anal. Appl., 20, 471–492.
MATHIAS, R. (1995) Accurate eigensystem computations by Jacobi methods. SIAM J. Matrix Anal. Appl., 16, 977–1003.
MIRANIAN, L. & GU, M. (2003) Strong rank revealing LU factorizations. Linear Algebra Appl., 367, 1–16.
OLSHEVSKY, V. (ed.) (2001a) Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference on Structured Matrices in Operator Theory, Numerical Analysis, Control, Signal and Image Processing held at the University of Colorado, Boulder, CO, June 27–July 1, 1999. Contemporary Mathematics, vol. 280. Providence, RI: American Mathematical Society, pp. xiv+327.
OLSHEVSKY, V. (ed.) (2001b) Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference held at the University of Colorado, Boulder, CO, June 27–July 1, 1999. Contemporary Mathematics, vol. 281. Providence, RI: American Mathematical Society, pp. xiv+344.
PAN, C.-T. (2000) On the existence and computation of rank-revealing LU factorizations. Linear Algebra Appl., 316, 199–222.
PELÁEZ, M. J. & MORO, J. (2006) Accurate factorization and eigenvalue algorithms for symmetric DSTU and TSC matrices. SIAM J. Matrix Anal. Appl., 28, 1173–1198.
PEÑA, J. M. (2004) LDU decompositions with L and U well conditioned. Electron. Trans. Numer. Anal., 18, 198–208 (electronic).
REICHEL, L. (1990) Newton interpolation at Leja points. BIT, 30, 332–346.
YE, Q. (2008) Computing singular values of diagonally dominant matrices to high relative accuracy. Math. Comp., 77, 2195–2230.
