Numer. Math. 67: 491-500 (1994)

Numerische Mathematik

© Springer-Verlag 1994 Electronic Edition

Bounds on the error of an approximate invariant subspace for non-self-adjoint matrices

Moshe Haviv¹,², Ya'acov Ritov¹

¹ Department of Statistics, The Hebrew University of Jerusalem, 91905 Jerusalem, Israel
² Department of Econometrics, The University of Sydney, Sydney, NSW 2006, Australia

Received December 1, 1992 / Revised version received October 20, 1993

Summary. Suppose one approximates an invariant subspace of an $n \times n$ matrix in $\mathbb{C}^{n \times n}$ which is not necessarily self-adjoint. Suppose that one also has an approximation for the corresponding eigenvalues. We consider the question of how good the approximations are. Specifically, we develop bounds on the angle between the approximating subspace and the invariant subspace itself. These bounds are functions of the following three terms: (1) the residual of the approximations; (2) the singular-value separation in an associated matrix; and (3) the goodness of the approximations to the eigenvalues.

Mathematics Subject Classification (1991): 65F15

1. Introduction

The subject of bounding the angle between an invariant subspace of a matrix $B \in \mathbb{C}^{n \times n}$ and an approximation to it is not new. Most of the attention received in the literature concerns the case where $B$ is a self-adjoint matrix; for a recent paper see Sun (1991). This paper removes this assumption.

Suppose one approximates $p$ eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_p$ of a matrix $B$ with the scalars $\mu_1, \mu_2, \ldots, \mu_p$, and suppose that one approximates the corresponding invariant subspace with the $p$-dimensional subspace $Y$. The question that arises is how good the approximations are. In particular, one would like to bound the angle between the invariant subspace corresponding to $\lambda_1, \lambda_2, \ldots, \lambda_p$ and its approximation. We obtain bounds separately for the following four cases:

1. $p = 1$ and the approximation $\mu$ to $\lambda$ is error-free.
2. $p = 1$.
3. $p \ge 1$ and the approximations $\mu_1, \ldots, \mu_p$ to $\lambda_1, \ldots, \lambda_p$ are error-free.
4. $p \ge 1$.

In all cases, the bounds on the angle between the two subspaces are monotone functions of the following:¹

(a) The residual of the approximation.
(b) The reciprocal of the singular-value separation in the matrix $(B - \mu_p)(B - \mu_{p-1}) \cdots (B - \mu_1)$.
(c) The distances between $\lambda_1, \ldots, \lambda_p$ and their approximations (only in cases 2 and 4).

In comparing the bounds we developed with those existing for the self-adjoint case, as they appear in Parlett (1980, pp. 222-225), one finds terms corresponding to (a) and (b) above. A term corresponding to (c) does not appear in the self-adjoint case. This is due to the fact that for self-adjoint matrices the orthogonal complement of an invariant subspace is an invariant subspace itself.

We would like to mention two related questions which were considered in the literature. Stewart (1971) defined a measure which quantifies how near a given subspace $X$ is to an invariant subspace. He then constructed an invariant subspace whose distance from $X$ approaches zero as the corresponding measure does. Finally, he developed a bound on the norm of the difference between the two subspaces; see also Stewart (1973). In Sect. 3 we compare our bounds with Stewart's. Kahan, Parlett and Jiang (1982) considered an alternative question: for a given subspace $X$, does there exist a small perturbation of $B$ such that $X$ is an invariant subspace of the perturbed matrix?

Our final remark here considers ill-conditioned eigenvalues. An eigenvalue is called ill-conditioned if the right and left eigenspaces belonging to it are almost orthogonal. This might be the case only for non-self-adjoint matrices. In that case, the eigenspaces may be sensitive to small perturbations (cf. Golub and Wilkinson 1976, p. 588). Hence, it looks as though a bound on the angle between an invariant subspace and an approximation to it should be a function of the degree of ill-conditioning. As our bounds do not contain a term directly representing ill-conditioning, we conclude that ill-conditioned eigenvalues imply poor singular-value separation. For more on the relation between the ill-conditioning phenomenon and eigenvalue and singular-value separation see Golub and Wilkinson (1976).

Correspondence to: Y. Ritov

¹ Exact definitions are given in the next section.

2. Notation and preliminaries

For a vector $x \in \mathbb{C}^n$, we denote its Euclidean norm by $\|x\|$. Similarly, for a matrix $B \in \mathbb{C}^{n \times n}$, $\|B\|$ denotes $\max_{\|x\|=1} \|Bx\|$. All the vectors denoted later by $x, y, z$ and $w$ will be norm-one vectors in $\mathbb{C}^n$. Also, for a matrix $B$ we denote by $B^H$ its conjugate transpose. The angle between the vectors $x, y \in \mathbb{C}^n$, denoted by $\angle(x, y)$, is defined as $\arccos |x^H y|$. Of course, $0 \le \angle(x, y) \le \pi/2$. The angle between a subspace $X$ and a subspace $Y$, denoted by $\angle(X, Y)$, is defined as $\sup_{x \in X} \inf_{y \in Y} \angle(x, y)$. Note that $\angle(X, Y)$ does not necessarily equal $\angle(Y, X)$.

For an $n \times n$ matrix $B$ and for a vector $\tilde\mu = (\mu_1, \ldots, \mu_p)$ of length $p$, let $B(\tilde\mu) = (B - \mu_p)(B - \mu_{p-1}) \cdots (B - \mu_1)$. For an $n \times n$ matrix $B$, a $p$-dimensional subspace $Y$, and a vector $\tilde\mu = (\mu_1, \ldots, \mu_p)$ of length $p$, let $r_B(\tilde\mu, Y)$ denote the residual of $(\tilde\mu, Y)$ at $B$, namely $\max_{y \in Y} \|B(\tilde\mu) y\|$. Of course, if the dimension of $Y$ is one, the maximization is redundant. The term $\delta_B(\tilde\mu, Y)$ is defined accordingly as the minimization over the orthogonal complement, namely $\delta_B(\tilde\mu, Y) = \min_{y \perp Y} \|B(\tilde\mu) y\|$. Note that the number of scalars, $p$, involved in the definitions of $r_B(\cdot)$ and $\delta_B(\cdot)$ equals the dimension of $Y$. The case $p = 1$ is, of course, possible; in that case, we use $\mu$ instead of $\tilde\mu$. Also, for a matrix $B$, let $\lambda(B)$ denote the set of eigenvalues of $B$. We deviate from the traditional notation and include an eigenvalue in $\lambda(B)$ as many times as its geometric multiplicity (i.e., the dimension of its corresponding eigenspace) indicates. Finally, let $\sigma_1(B) \ge \cdots \ge \sigma_n(B)$ be the nonnegative square roots of the eigenvalues of $B^H B$ (i.e., the singular values of $B$), appearing in nonincreasing order. Again, a multiple singular value appears in this sequence as many times as its multiplicity indicates.

We end this section by quoting two results which will be used frequently later on.

Result 1 (cf. Parlett (1980), p. 188). For $\tilde\lambda = (\lambda_1, \ldots, \lambda_p) \subseteq \lambda(B)$ with the corresponding $p$-dimensional invariant subspace $Z$,

(1) $\delta_B(\tilde\lambda, Z) = \sigma_{n-p}(B(\tilde\lambda))$.

Result 2 (cf. Parlett (1980), p. 222). For a self-adjoint matrix $B$ let $(\lambda, z)$ be an eigenpair. Let $\mu$ and $y$ be approximations to $\lambda$ and $z$, respectively. If $\lambda$ is the closest to $\mu$ in $\lambda(B)$, then²

(2) $\sin \angle(z, y) \le \dfrac{r_B(\mu, y)}{\sigma_{n-1}(B - \mu)}$.
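To make the notation concrete, the following numerical sketch (our own illustration, not part of the paper) computes $B(\tilde\mu)$, $r_B(\tilde\mu, Y)$, $\delta_B(\tilde\mu, Y)$ and $\sin \angle(X, Y)$ with numpy; the helper names `poly_B`, `residual`, `separation` and `sin_angle` are hypothetical.

```python
import numpy as np

def poly_B(B, mus):
    # B(mu~) = (B - mu_p)(B - mu_{p-1}) ... (B - mu_1); the factors commute,
    # so the order of multiplication does not matter
    n = B.shape[0]
    M = np.eye(n, dtype=complex)
    for mu in mus:
        M = (B - mu * np.eye(n)) @ M
    return M

def residual(B, mus, Y):
    # r_B(mu~, Y) = max_{y in Y, ||y||=1} ||B(mu~) y||; for an n x p matrix Y
    # with orthonormal columns this is the largest singular value of B(mu~) Y
    return np.linalg.svd(poly_B(B, mus) @ Y, compute_uv=False)[0]

def separation(B, mus, Y):
    # delta_B(mu~, Y) = min_{y perp Y, ||y||=1} ||B(mu~) y||: the smallest
    # singular value of B(mu~) restricted to the orthogonal complement of Y
    p = Y.shape[1]
    W = np.linalg.qr(Y, mode='complete')[0][:, p:]   # basis of Y-perp
    return np.linalg.svd(poly_B(B, mus) @ W, compute_uv=False)[-1]

def sin_angle(X, Y):
    # sin of angle(X, Y) = sup_{x in X} inf_{y in Y} angle(x, y), for
    # orthonormal column bases X and Y (note the asymmetry in X and Y)
    n = X.shape[0]
    return np.linalg.norm((np.eye(n) - Y @ Y.conj().T) @ X, 2)
```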

3. The bounds

3.1. The case $p = 1$, $\mu = \lambda$

We begin with the simplest case, where the dimension of the invariant subspace is one and where the approximation $\mu$ to $\lambda$ is known to be error-free. More specifically, let $y$ be an approximation to the eigenvector $z$ belonging to the (known) eigenvalue $\lambda$. Theorem 3.1 below bounds $\sin \angle(y, z)$ in terms of $r_B(\lambda, y)$ and $\sigma_{n-1}(B - \lambda)$.

Theorem 3.1.
$$\sin \angle(y, z) \le r_B(\lambda, y) \, / \, \sigma_{n-1}(B - \lambda).$$

Proof. To simplify notation, let $\theta = \angle(y, z)$. Hence, $y = z \cos\theta + w \sin\theta$ for some $w$ with $w \perp z$. As $Bz = \lambda z$, one easily gets that $By = \lambda z \cos\theta + Bw \sin\theta$ and that $\lambda y = \lambda z \cos\theta + \lambda w \sin\theta$. Hence, $By - \lambda y = (B - \lambda) w \sin\theta$, and then $r_B(\lambda, y) = \sin\theta \, \|(B - \lambda) w\|$, implying that

(3) $\sin\theta \le r_B(\lambda, y) \, / \, \delta_B(\lambda, z)$.

The proof is completed by noticing that $\sigma_{n-1}(B - \lambda) = \delta_B(\lambda, z)$, as indicated in (1). □

It is worthwhile to note that inequality (3) above is valid for any norm on $\mathbb{C}^n$ (when one adjusts the definitions of $r_B(\lambda, y)$ and of $\delta_B(\lambda, z)$ accordingly). For explicit expressions for $\delta_B(\lambda, z)$ for the $l_1$-norm and for the $l_\infty$-norm see Rothblum (1984).

² In Parlett (1980, p. 222), $\mu$ stands for the Rayleigh quotient of $y$, namely for $y^H B y$, but it is easy to see from the proof there that this restriction is not necessary for our needs.
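As a quick numerical illustration of Theorem 3.1 (our own check, not from the paper, assuming numpy is available), one can perturb an exact eigenvector and compare the two sides of the bound:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
lams, V = np.linalg.eig(B)
lam = lams[0]                                   # exact eigenvalue
z = V[:, 0] / np.linalg.norm(V[:, 0])           # exact eigenvector

y = z + 1e-3 * rng.standard_normal(n)           # approximate eigenvector
y /= np.linalg.norm(y)

r = np.linalg.norm((B - lam * np.eye(n)) @ y)   # r_B(lambda, y)
sig = np.linalg.svd(B - lam * np.eye(n), compute_uv=False)[n - 2]  # sigma_{n-1}
sin_theta = np.sqrt(1 - abs(np.vdot(z, y))**2)  # sin angle(y, z)

assert sin_theta <= r / sig + 1e-12             # Theorem 3.1
```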


3.2. The case $p = 1$, $\mu \ne \lambda$

Suppose one approximates the eigenpair $(\lambda, z)$ with the pair $(\mu, y)$, where $\lambda$ and $\mu$ do not necessarily coincide. In this subsection we give three different bounds for the angle between $y$ and $z$, in terms of $|\lambda - \mu|$, $r_B(\mu, y)$, $\delta_B(\mu, z)$, and $\|B - B^H\|$. Later, in Theorem 3.3, we bound $\delta_B(\mu, z)$ from below, in terms of $\sigma_n(B - \mu)$ and $\sigma_{n-1}(B - \mu)$, the smallest and the second smallest singular values of $B - \mu$, respectively.

For a given $y$, the traditional choice for $\mu$ is its Rayleigh quotient $\mu^* = y^H B y$, as it minimizes the corresponding residual $r_B(\mu, y)$. For that choice we have the following result as a corollary to Theorem 3.1.

Corollary 1.
$$\sin \angle(y, z) \le \frac{[\, r_B^2(\mu^*, y) + |\mu^* - \lambda|^2 \,]^{1/2}}{\sigma_{n-1}(B - \lambda)}.$$

Proof. Since $y \perp (B - \mu^*) y$,
$$r_B^2(\lambda, y) = \|(B - \mu^*) y + (\mu^* - \lambda) y\|^2 = \|(B - \mu^*) y\|^2 + |\mu^* - \lambda|^2.$$

The corollary follows from Theorem 3.1. □

Our two other bounds follow from the following theorem.

Theorem 3.2. For some $w \perp z$,
$$\sin \angle(y, z) \le \frac{r_B(\mu, y) + |\lambda - \mu| \, |z^H (B - \mu) w| \, / \, \|(B - \mu) w\|}{\delta_B(\mu, z)}.$$

Proof. Again, for simplicity let $\theta = \angle(y, z)$. Then $y = z \cos\theta + w \sin\theta$ for some $w$ with $w \perp z$. As $Bz = \lambda z$, one easily gets that $By = \lambda z \cos\theta + Bw \sin\theta$ and that $\mu y = \mu z \cos\theta + \mu w \sin\theta$. Thus, $By - \mu y = (\lambda - \mu) z \cos\theta + (B - \mu) w \sin\theta$. Hence,
$$\begin{aligned}
r_B^2(\mu, y) &= |\lambda - \mu|^2 \cos^2\theta + 2 \sin\theta \cos\theta \, \Re\big[\overline{(\lambda - \mu)} \, z^H (B - \mu) w\big] + \|(B - \mu) w\|^2 \sin^2\theta \\
&\ge \left[ \|(B - \mu) w\| \sin\theta - |\lambda - \mu| \cos\theta \, \frac{|z^H (B - \mu) w|}{\|(B - \mu) w\|} \right]^2 + |\lambda - \mu|^2 \cos^2\theta \left( 1 - \frac{|z^H (B - \mu) w|^2}{\|(B - \mu) w\|^2} \right) \\
&\ge \left[ \|(B - \mu) w\| \sin\theta - |\lambda - \mu| \cos\theta \, \frac{|z^H (B - \mu) w|}{\|(B - \mu) w\|} \right]^2.
\end{aligned}$$
The theorem follows since, by definition, $\delta_B(\mu, z) \le \|(B - \mu) w\|$. □

The first corollary is immediate:

Corollary 2.
$$\sin \angle(y, z) \le \frac{r_B(\mu, y) + |\lambda - \mu|}{\delta_B(\mu, z)}.$$

The next corollary is useful when $B$ is a nearly self-adjoint matrix, in the sense that $\|B - B^H\|$ is small.

Corollary 3.
$$\sin \angle(y, z) \le \frac{r_B(\mu, y)}{\delta_B(\mu, z)} + \frac{|\lambda - \mu| \, \|B - B^H\|}{\delta_B^2(\mu, z)}.$$

Proof. The corollary follows from Theorem 3.2 since $z \perp w$, and hence $|z^H (B - \mu) w| = |z^H (B - B^H) w| \le \|B - B^H\|$. □

It is possible to see from Theorem 3.2, and from the analysis in the self-adjoint case, that the bounds are proportional to the reciprocal of the eigenvalue separation in the corresponding matrices. In the general case, as can be deduced from Theorem 3.2, the bounds are functions of $\delta_B(\mu, z)$. Thus we conclude that the closer $\lambda$ and $\mu$ are, the closer $\delta_B(\mu, z)$ and $\sigma_{n-1}(B - \mu)$ are. Theorem 3.3, given next, quantifies this observation in the sense that it bounds $\delta_B(\mu, z)$ from below in terms of $\sigma_{n-1}(B - \mu)$. Of course, this lower bound can replace $\delta_B(\mu, z)$ in the bound on $\sin \angle(y, z)$ given in Theorem 3.2.
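Before turning to that theorem, here is a numerical check of Corollary 2 (again our own sketch, not from the paper, assuming numpy); $\delta_B(\mu, z)$ is computed as the smallest singular value of $B - \mu$ restricted to the orthogonal complement of $z$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
lams, V = np.linalg.eig(B)
lam = lams[0]
z = V[:, 0] / np.linalg.norm(V[:, 0])

mu = lam + 1e-3 * (1 + 1j)                      # approximate eigenvalue
y = z + 1e-3 * rng.standard_normal(n)           # approximate eigenvector
y /= np.linalg.norm(y)

M = B - mu * np.eye(n)
r = np.linalg.norm(M @ y)                       # r_B(mu, y)

W = np.linalg.qr(z.reshape(-1, 1), mode='complete')[0][:, 1:]  # basis of z-perp
delta = np.linalg.svd(M @ W, compute_uv=False)[-1]             # delta_B(mu, z)

sin_theta = np.sqrt(1 - abs(np.vdot(z, y))**2)
assert sin_theta <= (r + abs(lam - mu)) / delta + 1e-12        # Corollary 2
```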

Theorem 3.3.
$$\delta_B^2(\mu, z) \ge \sigma_{n-1}^2(B - \mu) - \big[ \sigma_{n-1}^2(B - \mu) - \sigma_n^2(B - \mu) \big] \, \frac{|\lambda - \mu|^2 \, \|B - \mu\|^2}{\sigma_{n-1}^4(B - \mu)}.$$

Proof. Let $z^*$ be the eigenvector of $(B - \mu)^H (B - \mu)$ belonging to $\sigma_n^2(B - \mu)$, with $\|z^*\| = 1$, and let $\theta^*$ be the angle between $z^*$ and $z$. Then, by taking $(0, z)$ as an approximation to the eigenpair $(\sigma_n^2(B - \mu), z^*)$ of the self-adjoint matrix $(B - \mu)^H (B - \mu)$, one gets by (2) that

(4) $\sin \theta^* \le \dfrac{\|(B - \mu)^H (B - \mu) z\|}{\sigma_{n-1}^2(B - \mu)} \le \dfrac{|\lambda - \mu| \, \|B - \mu\|}{\sigma_{n-1}^2(B - \mu)}$.

Now, let $w$ be the vector where the minimum defining $\delta_B(\mu, z)$ is attained. For some angle $\varphi$ and some vector $v$ with $v \perp z^*$ and $\|v\| = 1$, $w = z^* \sin\varphi + v \cos\varphi$. (Note that $\varphi = \arcsin |w^H z^*|$.) Applying $B - \mu$ yields
$$\delta_B^2(\mu, z) = \|(B - \mu) w\|^2 = \|(B - \mu) z^* \sin\varphi + (B - \mu) v \cos\varphi\|^2.$$
As $v^H (B - \mu)^H (B - \mu) z^* = 0$ and as $\|(B - \mu) z^*\| = \sigma_n(B - \mu)$, one gets that
$$\|(B - \mu) w\|^2 = \sigma_n^2(B - \mu) \sin^2\varphi + \|(B - \mu) v\|^2 \cos^2\varphi.$$
But $\|(B - \mu) v\|^2 \ge \min \{ u^H (B - \mu)^H (B - \mu) u \mid u \perp z^*, \; \|u\| = 1 \}$, where the latter equals $\sigma_{n-1}^2(B - \mu)$ as indicated in (1). Hence,

(5) $\delta_B^2(\mu, z) \ge \sigma_n^2(B - \mu) \sin^2\varphi + \sigma_{n-1}^2(B - \mu) \cos^2\varphi = \sigma_{n-1}^2(B - \mu) - \big[ \sigma_{n-1}^2(B - \mu) - \sigma_n^2(B - \mu) \big] \sin^2\varphi$.


Next we show that $\sin\varphi \le \sin\theta^*$. This, coupled with inequalities (4) and (5), completes the proof. Indeed, one can write $z^* = z \cos\theta^* + u \sin\theta^*$ for some vector $u$, $\|u\| = 1$. Then
$$\sin\varphi = |w^H z^*| = |w^H (z \cos\theta^* + u \sin\theta^*)| \le \sin\theta^*$$
by $w^H z = 0$ and $|w^H u| \le 1$. □

Remark 1. The term $\|B - \mu\|$ in Theorem 3.3 bounds $\|(B^H - \bar\mu) z\|$. It could be replaced by $|\lambda - \mu| + \|(B^H - \bar\lambda) z\| \le |\lambda - \mu| + \|B - B^H\|$. Again, this is a useful bound for nearly self-adjoint matrices $B$.
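A numerical check of the lower bound of Theorem 3.3 (our own sketch, under the same assumptions as the earlier ones) reads:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
lams, V = np.linalg.eig(B)
lam = lams[0]
z = V[:, 0] / np.linalg.norm(V[:, 0])
mu = lam + 1e-3                                 # approximate eigenvalue

M = B - mu * np.eye(n)
sig = np.linalg.svd(M, compute_uv=False)        # sigma_1 >= ... >= sigma_n
s_n, s_n1 = sig[-1], sig[-2]

W = np.linalg.qr(z.reshape(-1, 1), mode='complete')[0][:, 1:]
delta = np.linalg.svd(M @ W, compute_uv=False)[-1]   # delta_B(mu, z)

# right-hand side of Theorem 3.3
lower = s_n1**2 - (s_n1**2 - s_n**2) * (abs(lam - mu) * sig[0] / s_n1**2)**2
assert delta**2 >= lower - 1e-9
```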

3.3. The case $p \ge 1$, $\mu_1 = \lambda_1, \; \mu_2 = \lambda_2, \; \ldots, \; \mu_p = \lambda_p$

Let $Z$ be an invariant subspace of $B$ of dimension $p$, with corresponding set of $p$ (known) eigenvalues $\tilde\lambda = (\lambda_1, \ldots, \lambda_p)$. Finally, let $Y$ be a $p$-dimensional subspace, and think of it as an approximation to $Z$. This subsection considers the issue of measuring the quality of $Y$ as an approximation to $Z$. Of course, we first need a measure which quantifies how close a given subspace is to another. As stated in the introduction, we use the quantity $\sup_{y \in Y} \inf_{z \in Z} \angle(y, z)$, which is denoted by $\angle(Y, Z)$. Next, in Theorem 3.4, we bound $\angle(Y, Z)$ in terms of $r_B(\tilde\lambda, Y)$, namely the residual of $\tilde\lambda$ and $Y$, which is defined as $\sup_{y \in Y} \|B(\tilde\lambda) y\|$, and in terms of $\sigma_{n-p}(B(\tilde\lambda))$.

Theorem 3.4.
$$\sin \angle(Y, Z) \le \frac{r_B(\tilde\lambda, Y)}{\sigma_{n-p}(B(\tilde\lambda))}.$$

Proof. Let $y \in Y$. Then, for some $z \in Z$ and some $w \perp Z$, $y = z \cos\theta + w \sin\theta$, where $\theta = \angle(y, z) = \angle(y, Z)$. Note, by the definition of $z$ as the projection of $y$ on $Z$, that $\theta = \inf_{z \in Z} \angle(y, z)$. (Of course, $\theta$ is a function of $y$.) Then $\|B(\tilde\lambda) y\| = \sin\theta \, \|B(\tilde\lambda) w\|$, or
$$\sin\theta = \frac{\|B(\tilde\lambda) y\|}{\|B(\tilde\lambda) w\|}.$$
By definition, $\|B(\tilde\lambda) y\| \le r_B(\tilde\lambda, Y)$, and by Eq. (1) and the fact that $w \perp Z$ one gets that $\|B(\tilde\lambda) w\| \ge \sigma_{n-p}(B(\tilde\lambda))$. Hence,
$$\sin\theta \le \frac{r_B(\tilde\lambda, Y)}{\sigma_{n-p}(B(\tilde\lambda))}.$$

Since the right-hand side of the last inequality does not depend on $y$, it holds for any $\theta$, namely for any $y \in Y$. This completes the proof. □

Remark 2. Suppose the subspace $Y$ is spanned by the orthonormal vectors $y_1, \ldots, y_p \in \mathbb{C}^n$. With a slight abuse of notation, denote by $Y$ also the $n \times p$ matrix whose $j$-th column is $y_j$. Then $r_B(\tilde\lambda, Y) = \sigma_1(B(\tilde\lambda) Y)$. This is the case as
$$r_B(\tilde\lambda, Y) = \max_{y \in Y} \|B(\tilde\lambda) y\| = \max_{w \in \mathbb{C}^p, \, \|w\| = 1} \|B(\tilde\lambda) Y w\| = \|B(\tilde\lambda) Y\| = \sigma_1(B(\tilde\lambda) Y).$$
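An executable illustration of Theorem 3.4, using the formula of Remark 2 for the residual (our own sketch, assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 8, 3
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
lams, V = np.linalg.eig(B)
Z = np.linalg.qr(V[:, :p])[0]                   # orthonormal basis of Z
Y = np.linalg.qr(Z + 1e-4 * rng.standard_normal((n, p)))[0]   # approximation

Bl = np.eye(n, dtype=complex)                   # B(lambda~)
for lam in lams[:p]:
    Bl = (B - lam * np.eye(n)) @ Bl

r = np.linalg.svd(Bl @ Y, compute_uv=False)[0]  # r_B(lambda~, Y), cf. Remark 2
s = np.linalg.svd(Bl, compute_uv=False)[n - p - 1]   # sigma_{n-p}(B(lambda~))

sin_YZ = np.linalg.norm((np.eye(n) - Z @ Z.conj().T) @ Y, 2)  # sin angle(Y, Z)
assert sin_YZ <= r / s + 1e-12                  # Theorem 3.4
```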


3.4. The case $p \ge 1$, $\mu_1 \ne \lambda_1, \ldots, \mu_p \ne \lambda_p$

Suppose one approximates a $p$-dimensional invariant subspace $Z$ of $B \in \mathbb{C}^{n \times n}$ by some other $p$-dimensional subspace $Y$. Also, suppose $\tilde\mu = (\mu_1, \ldots, \mu_p)$ are approximations to the corresponding eigenvalues $\tilde\lambda = (\lambda_1, \ldots, \lambda_p)$. Next we bound $\sin \angle(Y, Z)$ in terms of the residual of $(\tilde\mu, Y)$, in terms of $\epsilon \equiv \|B(\tilde\mu) - B(\tilde\lambda)\|$ (which represents the goodness of the approximation $\tilde\mu$ to $\tilde\lambda$), and in terms of the singular-value separation measure $\sigma_{n-p}(B(\tilde\mu))$. Before stating Theorem 3.5, we need the following lemma.

Lemma 3.1. For a self-adjoint matrix $M \in \mathbb{C}^{n \times n}$, let $\alpha_1 \ge \cdots \ge \alpha_n$ be its eigenvalues. Also, let $Z$ be the invariant subspace corresponding to $\alpha_{n-p+1}, \ldots, \alpha_n$. Then, for any vector $y$,
$$\sin \angle(y, Z) \le \frac{\|My\|}{\alpha_{n-p}}.$$

Proof. For some $z \in Z$ and $w \perp Z$, $y = z \cos\theta + w \sin\theta$, where $\theta = \angle(y, z) = \angle(y, Z)$. Then, as for self-adjoint matrices the orthogonal complement of an invariant subspace is invariant itself,
$$\|My\|^2 = \|Mz \cos\theta\|^2 + \|Mw \sin\theta\|^2 \ge \|Mw\|^2 \sin^2\theta,$$
or $\sin\theta \le \|My\| / \|Mw\|$. By (1), the proof is completed. □

Theorem 3.5.
$$\sin \angle(Y, Z) \le \frac{r_B(\tilde\mu, Y) + \epsilon}{\left[\, \sigma_{n-p}^2(B(\tilde\mu)) - \epsilon^2 \|B(\tilde\mu)\|^2 / \sigma_{n-p}^2(B(\tilde\mu)) \,\right]^{1/2}}.$$

Proof. For $y \in Y$, write $y = z \cos\theta + w \sin\theta$ for $z \in Z$ and $w \perp Z$. Then, as $B(\tilde\mu) w \sin\theta = B(\tilde\mu) y - B(\tilde\mu) z \cos\theta$, by the triangle inequality one gets that

(6) $\sin\theta \le \dfrac{\|B(\tilde\mu) y\| + \|B(\tilde\mu) z\|}{\|B(\tilde\mu) w\|}$.

Now, let $\tilde Z$ be the invariant subspace belonging to the $p$ smallest eigenvalues of $B^H(\tilde\mu) B(\tilde\mu)$. Then, by Lemma 3.1,

(7) $\sin \angle(Z, \tilde Z) \le \sup_{z \in Z} \dfrac{\|B^H(\tilde\mu) B(\tilde\mu) z\|}{\sigma_{n-p}^2(B(\tilde\mu))} = \sup_{z \in Z} \dfrac{\|B^H(\tilde\mu) [B(\tilde\mu) - B(\tilde\lambda)] z\|}{\sigma_{n-p}^2(B(\tilde\mu))} \le \dfrac{\epsilon \, \|B(\tilde\mu)\|}{\sigma_{n-p}^2(B(\tilde\mu))}$.

Write $w = \tilde z \sin\varphi + v \cos\varphi$ with $\tilde z \in \tilde Z$, $v \perp \tilde Z$ and $\|v\| = 1$, and note that $\sin\varphi \le \sin \angle(Z, \tilde Z)$. Then, as in the proof of Theorem 3.3,
$$\|B(\tilde\mu) w\|^2 = \|B(\tilde\mu)(\tilde z \sin\varphi + v \cos\varphi)\|^2 = \|B(\tilde\mu) \tilde z\|^2 \sin^2\varphi + \|B(\tilde\mu) v\|^2 \cos^2\varphi \ge (1 - \sin^2\varphi) \, \|B(\tilde\mu) v\|^2 \ge (1 - \sin^2\varphi) \, \sigma_{n-p}^2(B(\tilde\mu)),$$
where the last inequality follows from (1). Then, by noting that $\sin\varphi \le \sin \angle(Z, \tilde Z)$ and using (7), one gets that

(8) $\|B(\tilde\mu) w\|^2 \ge \sigma_{n-p}^2(B(\tilde\mu)) - \dfrac{\epsilon^2 \|B(\tilde\mu)\|^2}{\sigma_{n-p}^2(B(\tilde\mu))}$.

Finally, combining the inequalities (6) and (8), together with $\|B(\tilde\mu) y\| \le r_B(\tilde\mu, Y)$ and $\|B(\tilde\mu) z\| = \|[B(\tilde\mu) - B(\tilde\lambda)] z\| \le \epsilon$, completes the proof. □

Remark 3. Similarly to Remark 2, note that if $Y$ stands also for an orthonormal matrix whose columns span $Y$, then $r_B(\tilde\mu, Y) = \sigma_1(B(\tilde\mu) Y)$.
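The bound of Theorem 3.5 can be checked numerically as well. In the following sketch (our own illustration, not part of the paper, assuming numpy), the perturbations are kept small so that the expression under the square root stays positive:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 6, 2
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
lams, V = np.linalg.eig(B)
Z = np.linalg.qr(V[:, :p])[0]                   # invariant subspace
mus = lams[:p] + 1e-6                           # mu~: perturbed eigenvalues
Y = np.linalg.qr(Z + 1e-6 * rng.standard_normal((n, p)))[0]

def poly(B, shifts):                            # B(.) as defined in Sect. 2
    M = np.eye(B.shape[0], dtype=complex)
    for s in shifts:
        M = (B - s * np.eye(B.shape[0])) @ M
    return M

Bmu, Blam = poly(B, mus), poly(B, lams[:p])
eps = np.linalg.norm(Bmu - Blam, 2)             # epsilon
r = np.linalg.svd(Bmu @ Y, compute_uv=False)[0] # r_B(mu~, Y) = sigma_1(B(mu~)Y)
s = np.linalg.svd(Bmu, compute_uv=False)[n - p - 1]  # sigma_{n-p}(B(mu~))

bound = (r + eps) / np.sqrt(s**2 - (eps * np.linalg.norm(Bmu, 2) / s)**2)
sin_YZ = np.linalg.norm((np.eye(n) - Z @ Z.conj().T) @ Y, 2)
assert sin_YZ <= bound + 1e-12                  # Theorem 3.5
```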

The bound of Theorem 3.5 is a monotone function of $\epsilon = \|B(\tilde\mu) - B(\tilde\lambda)\|$. Next, in Theorem 3.6, we bound $\epsilon$ in terms of $\max_{1 \le i \le p} |\lambda_i - \mu_i|$, which is denoted next by $\delta$.

Theorem 3.6. Let $\beta$ be a uniform bound on $\|B - \mu_i\|$, $1 \le i \le p$. Then
$$\epsilon \le (\beta + \delta)^p - \beta^p.$$

Proof. First note that, since all the factors commute,
$$\epsilon = \Big\| \prod_{i=1}^{p} (B - \lambda_i) - \prod_{i=1}^{p} (B - \mu_i) \Big\| = \Big\| \sum_{m=1}^{p} \; \sum_{1 \le i_1 < \cdots < i_m \le p} \; \prod_{j=1}^{m} (\mu_{i_j} - \lambda_{i_j}) \prod_{i \notin \{i_1, \ldots, i_m\}} (B - \mu_i) \Big\| \le \sum_{m=1}^{p} \binom{p}{m} \delta^m \beta^{p-m} = (\beta + \delta)^p - \beta^p. \quad \Box$$
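A numerical check of Theorem 3.6 (our own sketch, assuming numpy) reads:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 6, 3
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
lams = np.linalg.eigvals(B)[:p]
mus = lams + 1e-3 * rng.standard_normal(p)      # approximate eigenvalues

def poly(B, shifts):
    M = np.eye(n, dtype=complex)
    for s in shifts:
        M = (B - s * np.eye(n)) @ M
    return M

eps = np.linalg.norm(poly(B, mus) - poly(B, lams), 2)            # epsilon
beta = max(np.linalg.norm(B - mu * np.eye(n), 2) for mu in mus)  # beta
delta = np.abs(lams - mus).max()                                 # delta
assert eps <= (beta + delta)**p - beta**p + 1e-12                # Theorem 3.6
```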
