PSEUDOSPECTRA COMPUTATION OF LARGE MATRICES∗

C. BEKAS†, E. GALLOPOULOS‡, AND V. SIMONCINI§

Abstract. Transfer functions have been shown to provide monotonic approximations to the 2-norm of the resolvent of A, R(z) = (A − zI)^{−1}, when associated with a sequence of nested subspaces. This paper addresses the open question of the effectiveness of the transfer function scheme for the computation of the pseudospectrum of large matrices. It is shown that the scheme can be combined with certain Krylov-type linear solvers, such as restarted fom, for the efficient computation of (A − z_k I)^{−1} b for a large number of shifts z_k. Extensive numerical experiments illustrate the performance of the methods developed in this paper. Tools for the effective combination of the transfer function framework with path following methods are developed. A hybrid method is proposed that combines transfer functions with iterative solvers and path following, and is shown to be a powerful and cost-effective scheme for computing pseudospectra of very large matrices.

Key words. Pseudospectrum, transfer function, resolvent norm, iterative methods

AMS subject classifications. 65F15

1. Introduction. In several applications, important information regarding the behavior of a non-normal matrix A ∈ C^{n×n} can be derived from its ε-pseudospectrum (pseudospectrum for short), which can be defined as

Λ_ε(A) = {z ∈ C : z ∈ Λ(A + E), ‖E‖ ≤ ε}.

Here ε ∈ R is a nonnegative, small parameter and Λ(A + E) denotes the spectrum of A + E; see [38]. The notion of the pseudospectrum can be further extended, e.g. to polynomial eigenvalue problems; see [36]. An equivalent definition of Λ_ε(A) is based on the resolvent of A, R(z) = (A − zI)^{−1}:

Λ_ε(A) = {z ∈ C : ‖R(z)‖ ≥ ε^{−1}}.    (1.1)

In the sequel we use solely the Euclidean norm. An equivalent, third definition in that case is

Λ_ε(A) = {z ∈ C : σ_min(A − zI) ≤ ε},    (1.2)

where σ_min(A − zI) denotes the smallest singular value of A − zI. From these definitions it immediately follows that for any ε_1 > ε_2, Λ_{ε_1}(A) ⊇ Λ_{ε_2}(A). Furthermore, Λ_0(A) coincides with the set of eigenvalues of A. In order for the pseudospectrum to become a useful and practical tool, we need efficient methods for its computation and visualization. Definition (1.2) motivates the simplest algorithm for estimating Λ_ε(A), typically referred to as grid, which consists of the following two major steps: i) construct a mesh Ω over a region of the complex plane that includes Λ_ε(A); ii) compute σ_min(A − zI) for every node z of Ω. Subsequently, the computed values can be processed, e.g. by contour plotting software, to visualize the pseudospectrum by plotting contours, denoted here by ∂Λ_ε(A), for specific values of ε. grid is easy to program and robust, and frequently serves as a baseline when evaluating new methods for computing pseudospectra.

∗ Version of March 26, 2004.
† Computer Science and Engineering Dept., University of Minnesota, Minneapolis, USA. E-mail: [email protected]. This work was completed while the first author was a graduate student at the Computer Engineering and Informatics Dept., University of Patras, Greece.
‡ Computer Engineering and Informatics Dept., University of Patras, Greece. E-mail: [email protected].
§ Dipartimento di Matematica, Università di Bologna, and IMATI-CNR, Pavia, Italy. E-mail: [email protected].
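In MATLAB, grid for a small dense matrix amounts to a double loop over the mesh; a minimal sketch follows (the mesh bounds and resolution are illustrative choices, not prescribed by the text; A is assumed given):

% Minimal MATLAB sketch of GRID for a small dense matrix A.
x = linspace(-1, 1.5, 50);                 % real parts of mesh nodes (illustrative)
y = linspace(0, 1, 25);                    % imaginary parts
n = size(A, 1);
sigmin = zeros(length(y), length(x));
for j = 1:length(x)
    for k = 1:length(y)
        z = x(j) + 1i*y(k);
        sigmin(k, j) = min(svd(A - z*eye(n)));   % sigma_min(A - zI) at node z
    end
end
contour(x, y, log10(sigmin), [-1 -2]);     % boundary contours for eps = 1e-1, 1e-2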

The total cost of grid can be approximated reliably by the sum of the costs of computing σ_min(A − zI) at all mesh points of Ω. Clearly, as the size of the matrix and/or the number of mesh points increases, the cost of grid becomes prohibitive. Research efforts designed to speed up the computation of pseudospectra have aimed i) to reduce the costs due to the singular value decompositions, and/or ii) to reduce the number of mesh points in Ω at which we need to perform singular value computations. We refer to methods as matrix-based or domain-based depending on whether their principal aim is (i) or (ii), respectively. See [39] and [42] for informative surveys of efforts on this topic, and the Pseudospectra Gateway [29] for a comprehensive Web repository of links to research efforts, references and related software. When using definition (1.2) to compute pseudospectra, e.g. as in grid, we need to approximate the minimum singular values of the underlying matrix shifted by many different values of z (the mesh points). Although great benefits can be obtained from recent advances in singular value computations, such as [1], [7], [12], [16], [18], [19], [20], [21], [22], [28], [34], only a few attempts have been made to exploit this shifted structure in order to speed up the entire computation. One approach is to apply continuation techniques that utilize singular value information from nearby mesh points to approximate σ_min(A − zI) [9], [23]. Another approach is to perform dimensional reduction by projecting the underlying matrix A onto a smaller subspace and approximating the pseudospectrum from that subspace; cf. [39, Section 11] for a review and references to relevant efforts. In particular, in [37], Toh and Trefethen proposed to replace A with the (m + 1) × m upper Hessenberg matrix H_{m+1,m} satisfying

AW_m = W_{m+1} H_{m+1,m},    (1.3)

where W_m is an n × m matrix having orthonormal columns. For any z ∈ C, using (1.3) it holds that

(A − zI)W_m = W_{m+1}(H_{m+1,m} − z Ĩ),    Ĩ = [I; 0] ∈ R^{(m+1)×m}.    (1.4)

The rationale behind the use of σ_min(H_{m+1,m} − z Ĩ) to approximate σ_min(A − zI) is that, thanks to (1.4), the Hessenberg reduction need be performed only once for all z ∈ Ω. Therefore, this approach naturally exploits the knowledge about the mesh. In addition (see [37, Th. 1]), the ordering σ_min(H_{2,1}) ≥ σ_min(H_{3,2}) ≥ · · · ≥ σ_min(A) permits a monotonic approximation of σ_min(A). For the case of large matrices, Wright and Trefethen refined this approach in [40] and [41] by using the Hessenberg matrix that results from the implicitly restarted Arnoldi iteration as implemented in ARPACK [22], leading to a method that we will refer to as iramgrid. Overall, the motivation behind iramgrid is the following: since the eigenvalues of the Hessenberg matrix are used as Ritz approximations of the eigenvalues of the original matrix, it is hoped that in the proximity of the approximated eigenvalues one can also construct useful approximations to the pseudospectrum. On the other hand, as we illustrate in Section 5, for some non-normal matrices this approach might not be effective, especially around sensitive eigenvalues. Another matrix-based approach was presented by Simoncini and Gallopoulos in [33] and relies on definition (1.1). The key idea is to project the resolvent function onto a conveniently chosen subspace of much smaller dimension; borrowing from the language of control and dynamical systems, the resulting operator is termed a transfer function.

In fact, it is interesting to note that pseudospectra have been in active use in these areas; see for example [11] and references therein. Theoretical and numerical evidence in [33] indicated that this approach can result in considerably more accurate approximations to σ_min(A − zI) than those obtained by merely projecting A onto the same subspace. Nonetheless, the implementation and computational effectiveness of that approach for large problems was not studied in [33]. In this paper we develop a set of methods, based on the transfer function approach of [33], that is specifically designed for computing pseudospectra of large problems. Section 2 reviews the transfer function method. A major kernel of these methods is a class of Krylov subspace iterative solvers that exploit the shifted structure, A − zI, of the matrices involved. Section 3 describes cost-effective iterative schemes for the efficient implementation of the transfer function approach and the approximation of pseudospectra using many mesh points; we call the resulting methods trgrid and trfomgrid. As we show, these algorithms can also provide the user with useful, albeit partial, information regarding the reliability of the pseudospectral plots. Section 4 describes a new hybrid method that is based on transfer functions and a path following scheme called cobra ([2]) to trace a single pseudospectrum curve; we call this method trfomcobra. We conduct extensive numerical experiments which suggest that the aforementioned methods are effective and enable the estimation of one or more pseudospectrum contours of large sparse matrices. In particular, using these methods, we were able to compute pseudospectra of matrices of size O(10^5) in a matter of a few hours on commodity workstations. Sections 5 and 6 present extensive numerical experiments and comparisons with state-of-the-art methods. Section 7 provides a summary and concluding remarks. Before proceeding, we need to highlight a few more issues related to software development for pseudospectra. One is that the strong visual component of the pseudospectrum problem, as well as the state of the art in linear algebra tools, naturally motivates the integration of new algorithms with software that facilitates interaction with the user and visualization of the results. One such user-friendly tool is eigtool ([40]), which incorporates iramgrid and has become popular with the community. Another important issue, in view of the expected complexity of the problem for large matrices, is the flexibility of the software to use high performance computing resources. Earlier work (e.g. [5], [6]) has demonstrated that hybrid algorithms based on transfer functions lend themselves to efficient parallel implementations. ppat is another interesting tool that combines parallelism with reliable domain- and matrix-based techniques [24]. We also note that the methods described in this paper have already been incorporated into a toolbox written in MATLAB that also supports parallel processing via message passing [4]. Overall, as exemplified by the variety of efforts in the area, an environment for the effective computation of pseudospectra is more likely to have at its core a polyalgorithm, e.g. implemented as a MATLAB toolbox, containing several algorithmic components to deal in the best way possible with the multitude of matrix types and computational platforms.

2. The Transfer Function Framework. Let W_k, W_m be n × k and n × m complex matrices, respectively, each with orthonormal columns, with m ≤ k.
Consider the projection and restriction of the resolvent R(z) onto the subspaces spanned by the columns of W_m and W_k, that is, G_z(A, W_m, W_k^*) := W_k^* R(z) W_m. Because of its form, G_z(A, W_m, W_k^*) will be referred to as a transfer function. It is worth noting that, in their more general form, transfer functions do not require that the matrices to the left and right of the resolvent be related in any way [33]. For our purposes, however, we restrict our attention to the case where W_m is an orthonormal basis for the Krylov subspace K_m(A, w_1) := span{w_1, Aw_1, …, A^{m−1} w_1}, constructed by the Arnoldi iteration according to the expansion of relation (1.3), that is,

AW_m = W_m H_{m,m} + h_{m+1,m} w_{m+1} e_m^*,    W_m = [w_1, …, w_m],    (2.1)

where H_{m,m} is the square upper Hessenberg matrix consisting of the first m rows of H_{m+1,m}, and e_m is the mth column of the identity matrix of size m. As usual, the superscript * on a vector or matrix denotes its conjugate transpose. We also denote W_{m+1} = [W_m, w_{m+1}] and H_{m+1,m} = [H_{m,m}; h_{m+1,m} e_m^*]. The last term is written in MATLAB-like notation, so that the semicolon (';') implies the stacking of the row-vector term to its right below the matrix term to its left. Throughout the paper we shall assume that the Arnoldi process can continue for at least m steps without breakdown, so that an orthonormal basis W_{m+1} is generated. This in turn implies that m ≤ m_*, where m_* is the smallest integer such that K_{m_*}(A, w_1) = K_{m_*+1}(A, w_1). It was proposed in [33] to approximate the norm of the resolvent as ‖R(z)‖ ≈ ‖G_z(A, W_{m+1}, W_m^*)‖. In the present work we modify this approach slightly and use

‖R(z)‖ ≈ ‖G_z(A, W_{m+1}, W_{m+1}^*)‖ = ‖W_{m+1}^*(A − zI)^{−1} W_{m+1}‖.    (2.2)
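At small scale, the quantity in (2.2) can be formed directly: build the Arnoldi basis and solve against all of its columns. The following MATLAB sketch (sizes and shift are hypothetical) also prints the bounds discussed below and in Proposition 2.1; note that this direct evaluation solves m + 1 linear systems per shift, which is exactly the cost that Section 3 shows how to avoid.

% Direct evaluation of (2.2) at small scale (hypothetical sizes and shift).
n = 200; m = 30;
rng(0); A = randn(n) + 0.1i*randn(n);      % a generic non-normal test matrix
w = randn(n, 1); W = w/norm(w);            % Arnoldi with modified Gram-Schmidt
H = zeros(m+1, m);
for j = 1:m
    v = A*W(:, j);
    for i = 1:j
        H(i, j) = W(:, i)'*v;  v = v - H(i, j)*W(:, i);
    end
    H(j+1, j) = norm(v);  W(:, j+1) = v/H(j+1, j);
end
z    = 0.5 + 0.5i;                         % generic shift, not an eigenvalue
Itld = [eye(m); zeros(1, m)];
sH   = min(svd(H - z*Itld));               % sigma_min(H_{m+1,m} - z*Itld)
G    = W'*((A - z*eye(n))\W);              % (2.2): m+1 solves, W = W_{m+1} here
fprintf('1/sH = %.3e <= ||G|| = %.3e <= ||R(z)|| = %.3e\n', ...
        1/sH, norm(G), 1/min(svd(A - z*eye(n))));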

In the sequel, we will refer to m as the transfer function dimension and we shall abbreviate G_z(A, W_{m+1}, W_{m+1}^*) by G_{z,m}(A) ∈ C^{(m+1)×(m+1)}. Since ‖W_m^*(A − zI)^{−1} W_m‖ ≤ ‖W_{m+1}^*(A − zI)^{−1} W_{m+1}‖ for all m ≤ n, ‖G_{z,m}(A)‖ provides monotonic approximations to ‖R(z)‖, that is,

‖G_{z,1}(A)‖ ≤ ‖G_{z,2}(A)‖ ≤ · · · ≤ ‖R(z)‖.

It is worth noting that the transfer function used in [33] is a principal submatrix of G_z(A, W_{m+1}, W_{m+1}^*). Therefore, the norm of the latter provides an approximation to ‖R(z)‖ that is at least as good. We can also easily extend the key “interpolation result” proved in [33, Proposition 2.2].

Proposition 2.1. Let ũ be a unit vector such that (H_{m+1,m} − z Ĩ)^* ũ = 0. Then for z ∈ C:

1/σ_min(H_{m+1,m} − z Ĩ) ≤ ‖G_{z,m}(A)‖ ≤ 1/σ_min(H_{m+1,m} − z Ĩ) + ‖G_{z,m}(A) ũ‖.

Proof. As in Prop. 2.2 of [33].

Proposition 2.1 shows that ‖G_z(A, W_{m+1}, W_{m+1}^*)‖^{−1} is a better approximation to σ_min(A − zI) than σ_min(H_{m+1,m} − z Ĩ) for any m ≥ 0. How much better depends both on the matrix A and on the location of z; as the experiments in [33] have shown, the level of approximation offered by the transfer function approach can vary considerably, from being very close to that offered by σ_min(H_{m+1,m} − z Ĩ) to being significantly more accurate.

3. Approximating the Transfer Function. The theoretical aspects of the transfer function approach were established in [33]. The question remained, however, whether the approach was practical. Note, for example, that a direct calculation of G_{z,m}(A) from (2.2) would require, for each shift z, the solution of m + 1 linear systems with coefficient matrix (A − zI). It was observed in [33] that it is possible to transform this problem to one requiring the solution of a single system, with coefficient matrix (A − zI), for each shift z. We next analyze this formulation and show that, by combining it with Krylov subspace methods, one can exploit their properties to amortize the cost of the solution across those values of z for which G_{z,m}(A) is needed.

Using the quantities in (2.1), define the vector φ_z = W_{m+1}^*(A − zI)^{−1} w_{m+1}. Then we can write G_{z,m}(A) = [W_{m+1}^*(A − zI)^{−1} W_m, φ_z]. Consider now the computation of the first m columns of G_{z,m}(A). From the Arnoldi factorization it follows that

(A − zI)W_m = W_m(H_{m,m} − zI) + h_{m+1,m} w_{m+1} e_m^*.    (3.1)

Assume that z is not an eigenvalue or a Ritz value of A, which would render A − zI or H_{m,m} − zI, respectively, singular. Pre-multiplying (3.1) by W_{m+1}^*(A − zI)^{−1} leads to

G_{z,m}(A) = [(Ĩ − h_{m+1,m} φ_z e_m^*)(H_{m,m} − zI)^{−1}, φ_z].    (3.2)
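Given the Arnoldi quantities of (2.1) and a (pre)computed φ_z, formula (3.2) involves only small dense operations. A MATLAB sketch with hypothetical variable names:

% Evaluate ||G_{z,m}(A)|| from (3.2).  Assumed inputs (hypothetical names):
%   H    : the (m+1) x m Arnoldi matrix H_{m+1,m}
%   phiz : the (m+1)-vector phi_z = W_{m+1}'*((A - z*I)\w_{m+1}), precomputed
%   z    : the current shift (not a Ritz value of A)
m    = size(H, 2);
Hm   = H(1:m, 1:m);                        % H_{m,m}
h    = H(m+1, m);                          % h_{m+1,m}
Itld = [eye(m); zeros(1, m)];              % the (m+1) x m matrix I~
em   = [zeros(m-1, 1); 1];                 % m-th unit vector
G    = [(Itld - h*(phiz*em'))/(Hm - z*eye(m)), phiz];   % formula (3.2)
nrmG = norm(G);                            % approximates ||R(z)||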

Therefore, in order to approximate ‖R(z)‖ for several values of z, we first need to compute φ_z. This is done by solving a system with right-hand side w_{m+1} and coefficient matrix A − zI, and multiplying the resulting vector by W_{m+1}^*. We then solve m + 1 lower Hessenberg linear systems of size m and finally compute, for each z, the norm of a dense matrix of size m + 1. We call the resulting method trgrid. The computational core of trgrid is the solution of the large shifted linear system (A − zI)g_z = w_{m+1} for each gridpoint z. Proposition 2.1 and relation (3.2) indicate that solving this system accurately becomes particularly important whenever ‖(H_{m,m} − zI)^{−1}‖ is not a good approximation to the resolvent norm, which is exactly when we expect the transfer function approach to provide sharper approximations to the resolvent than the methods based on H_{m+1,m}. Krylov subspace linear solvers are particularly suitable for solving shifted systems because of their “shift invariance” property,

K_m(A, b) = K_m(A − zI, b),    (3.3)

for any nonzero starting vector b and shift z ∈ C (e.g. see [13, 27]). Therefore, when solving (A − zI)g_z = w_{m+1}, a single Krylov basis is sufficient for any number of values of z. Since pseudospectra are interesting for non-normal matrices, it is natural to consider using GMRES ([31]). In the sequel, when referring to trgrid, we implicitly assume that GMRES is used as the solver for approximating (A − zI)^{−1} w_{m+1}. As is well known, however, computational and memory costs frequently force the use of the restarted version of GMRES; that is, the method is restarted after a certain number of steps, utilizing as starting vector the most recent guess of the solution. Unfortunately, this has an undesired side-effect in our context, e.g. in trgrid: the GMRES residuals used as right-hand sides at the first restart are not collinear, hence shift invariance fails and from then on independent approximation spaces need to be generated for each shift. To address this problem, one possibility is to use a non-restarted Krylov linear solver such as QMR ([13]) or BiCGStab(ℓ) ([14]), or a variant of restarted GMRES ([15]). However, our experience indicated that these approaches are not well suited for highly indefinite and non-normal matrices, whose pseudospectra present the most interest. Another alternative is to use a restarted Krylov subspace method that naturally generates collinear residuals and thus maintains the key property (3.3) after each restart. One such method is the restarted Full Orthogonalization Method, fom ([30]), which was recently investigated by Simoncini for solving shifted systems; cf. [32]. Its principal advantage is that the shifted algorithm does not introduce any computational overhead compared to restarted fom applied independently to each shift. Let now z_i, i = 1, …, M, be the mesh points, enumerated in some order. In implementing trgrid as described above, there appears to be a significant storage cost of O(M × n) for the M vectors g_i^{(k)} ∈ C^n.

Fortunately, this cost can be avoided by noting that what is actually needed is not g_i^{(k)} ∈ C^n but φ_{z_i} = W_{m+1}^*(A − z_i I)^{−1} w_{m+1}. In particular, assume that at each restart k the solution vector is updated as g_i^{(k)} ← g_i^{(k−1)} + Ŵ_d^{(k−1)} y_i^{(k−1)}, for some y_i^{(k−1)} ∈ C^d, i = 1, …, M, where M is the number of mesh nodes; in practice, of course, the method is restarted only for systems that have not yet converged. Then the value of φ_{z_i}^{(k)} can be directly updated, without explicitly addressing the vector g_i^{(k)}, as shown below:

φ_{z_i}^{(k)} = W_{m+1}^*( g_i^{(k−1)} + Ŵ_d^{(k−1)} y_i^{(k−1)} )
            = W_{m+1}^* g_i^{(k−1)} + ( W_{m+1}^* Ŵ_d^{(k−1)} ) y_i^{(k−1)}
            = φ_{z_i}^{(k−1)} + X^{(k−1)} y_i^{(k−1)}.

The matrix X^{(k−1)} = W_{m+1}^* Ŵ_d^{(k−1)} is the same for all M shifts and can be computed once at each restart. Furthermore, assuming a zero starting approximate solution for all shifts, we also have φ_{z_i}^{(0)} = 0. In this manner, the overall storage requirements for the vectors φ_{z_i}^{(k)} can be kept as low as O(m × d), since φ_{z_i} ∈ C^{m+1} and d is the dimension of the approximation subspace. It is also worth mentioning that, from a computational standpoint, we can arrange the update of the φ_{z_i}^{(k)} to use BLAS-3 rather than BLAS-2 routines, by aggregating all columns y_i^{(k−1)} into a matrix Y^{(k−1)}, at the cost of some relatively small additional memory. Figure 3.1 outlines trgrid based on restarted fom for multiple shifts. We call the resulting method trfomgrid. In the same figure we also provide estimates for the total cost of the algorithm, based on the worst-case scenario that all K = maxit restarts have been performed for all shifts. We use MV(n) to denote the cost of a matrix-vector product when the matrix is of size n. In Section 5.2 we report on computational experiments that highlight the accuracy properties, together with the computational cost, of the method.
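Complementing Figure 3.1, the update above is two dense products per restart; a MATLAB sketch (the fom basis and the coordinate matrix Y are assumed to come from a shifted restarted fom implementation such as that of [32], and the variable names are hypothetical):

% One restart cycle of the phi-update.  Assumed available:
%   Wm1  : n x (m+1) Arnoldi basis W_{m+1}
%   What : n x d fom basis built in the current restart cycle
%   Y    : d x M matrix whose i-th column is y_i for the (unconverged) shift z_i
%   Phi  : (m+1) x M matrix of the current phi_{z_i}, initialized to zero
X   = Wm1'*What;          % X = W_{m+1}^* W^_d, computed once per restart
Phi = Phi + X*Y;          % BLAS-3 update of all phi_{z_i} simultaneously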

Algorithm trfomgrid
(* Input *) Matrix A; starting vector w_1 with ‖w_1‖ = 1; grid Ω with |Ω| = M; I = {1, …, M}; φ_z = 0; β(1:M) = 1; scalars m, d; maxit: maximum number of restarts; k = 1.
1. [W_{m+1}, H_{m+1,m}] ← arnoldi(A, w_1, m);  ŵ_1 ← w_{m+1}
2. [Ŵ_{d+1}, F_{d+1,d}] ← arnoldi(A, ŵ_1, d)
3. for each i ∈ I
4.     Y(:, i) = (F_d − z_i I_d)^{−1} e_1 β(i)
5.     β(i) = −f_{d+1,d} e_d^* Y(:, i)
6. end
7. Update φ_z(:, I) = φ_z(:, I) + (W_{m+1}^* Ŵ_d) Y(:, I); update I; k = k + 1
8. if (I = ∅ or k > maxit) then break, else set ŵ_1 ← ŵ_{d+1} and goto 2
9. for each z_i ∈ Ω
10.    Solve D_i = (Ĩ − h_{m+1,m} φ_z(:, i) e_m^*)(H_{m,m} − z_i I)^{−1}
11.    Compute ‖G_z(A)‖ = ‖[D_i, φ_z(:, i)]‖
12. end

Cost of trfomgrid:
Step 1:  mMV(n) + 2nm^2    Arnoldi, to build a basis with m vectors
Step 2:  dMV(n) + 2nd^2    Arnoldi, to build a basis with d vectors
Step 4:  O(d^2)            Linear system with the upper Hessenberg F_d
Step 7:  O(2md(n + M))     Dense matrix-matrix multiplications
Step 10: O(m^3)            Solution of m + 1 lower Hessenberg linear systems
Step 11: O(m^3)            Computation of the 2-norm of a matrix of size m + 1
Total cost: K(dMV(n) + (2n + M)d^2 + 2md(n + M) + md) + Mm^3 + mMV(n) + 2nm^2

Fig. 3.1. Top: Algorithm trfomgrid. Bottom: cost of trfomgrid, assuming that the maximum number K of restarts has been used for all shifts.

4. The Transfer Function Framework and Path Following. One important class of domain-based methods for the computation of pseudospectra uses numerical path following to trace the boundaries ∂Λ_ε(A) for any given ε. The first such algorithm was presented by Brühl in [10], who demonstrated impressive savings of path following compared to grid. Further work in [2] advanced the original path following approach into an algorithm called cobra, which permitted the effective use of path following on parallel systems and also achieved increased robustness. It is worth noting that the aforementioned methods are not the only domain-based methods in existence; cf. [3], [6], [25], [26]. The algorithm we describe in the sequel uses cobra for path following and combines it with the transfer function framework. We next briefly review the major steps of cobra and refer to [2] for a comprehensive discussion. Assume that at step k − 1 the method has produced a point z_{k−1}^{piv} positioned close to the curve being traced. Step k consists of three phases (cf. Figure 4.1):
Prediction phase. Use a small stepsize h and perform a prediction (as in [10]) to compute the support point z_k^{sup}. The points z_{k−1}^{piv}, z_k^{sup} determine a prediction direction d_k. Define m equidistant points ζ_{i,0} ∈ d_k, i = 1, …, m, in the direction of d_k. We call H = |z_{k−1}^{piv} − ζ_{m,0}| the length of the “head” of cobra and h = |z̃_k − z_{k−1}^{piv}| the stepsize of the method.
Correction phase. Correct the predicted points ζ_{i,0} ∈ d_k, i = 1, …, m, to the final points z_k^i ∈ ∂Λ_ε(A), using a single step of Newton iteration on σ_min(A − zI) − ε = 0. Notice that all corrections are completely decoupled.

Selection of next pivot. Select, according to some criterion, one of the corrected points as the next pivot point z_k^{piv}.
In the correction phase of cobra, and of several other path following schemes, Newton's method is applied to solve the nonlinear equation

F(z) − ε = 0,    where    F(z) = σ_min(A − zI),    (4.1)

for some ε > 0. Therefore, we need to compute ∇F(z).
Theorem 4.1 ([35]). Let the matrix-valued function A(ξ) : R^d → C^{n_r × n_c} be real analytic in a neighborhood of ξ_0 = (ξ_0^1, …, ξ_0^d). If (u_0, σ_0, v_0) is a singular triplet of A(ξ_0), where σ_0 is simple and non-zero, then there exists a singular triplet (u(ξ), σ(ξ), v(ξ)) of A(ξ), with σ(ξ) simple and nonzero, such that σ(ξ_0) = σ_0, u(ξ_0) = u_0, v(ξ_0) = v_0, and the functions σ, u, v are real analytic in a neighborhood of ξ_0.

Fig. 4.1. cobra: position of the pivot (z_{k−1}^{piv}), initial prediction (z̃_k), support point (z_k^{sup}), first-order predictors (ζ_{j,0}) and corrected points (z_k^j). (A proper scale would show that h ≪ H.)

The partial derivatives of σ(ξ) are given by

∂σ(ξ_0)/∂ξ_j = ℜ( u_0^* (∂A(ξ_0)/∂ξ_j) v_0 ),    j = 1, …, d.    (4.2)

Let z = x + iy ∈ C \ Λ(A) and let σ_min(A − zI) be a simple singular value. Then (cf. [10])

∇F(x, y) = ( ℜ(v_min^* u_min), ℑ(v_min^* u_min) ).

Algorithm trfomcobra
1.-7. Preprocessing, as in steps 1-8 of trfomgrid: build W_{m+1} by Arnoldi and compute φ_ζ for all gridpoints ζ of the mesh Ω by shifted restarted fom; the restart test reads: if k > maxit then break, else ŵ_1 ← ŵ_{d+1} and goto 3.
8. Start cobra from the starting point z_0
9. repeat, for every z_k for which we need the corresponding smallest singular triplet:
   9.1 For the four neighboring nodes ζ_i, i = k_1, …, k_4, of the mesh Ω:
       9.1.1 Solve (H_{m,m} − ζ_i I)^* D_k = (Ĩ − h_{m+1,m} φ_{ζ_i} e_m^*)^*
       9.1.2 Compute (u_1, σ_1, v_1): the largest singular triplet of the matrix [D_k^*, φ_{ζ_i}]
       9.1.3 Using 2-D interpolation on the nodes ζ_i, approximate σ_min(A − z_k I) and ∇σ_min(A − z_k I)
   9.2 Perform a Newton step
10. until the maximum number of steps of cobra has been executed

Fig. 4.3. trfomcobra. Gridpoints: ζ_i, i = 1, …, M. Path following nodes: z_i, i = 0, 1, 2, ….
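For reference, the gradient used in the Newton correction can be computed from the smallest singular triplet, per (4.2) and the formula for ∇F above; a dense-SVD MATLAB sketch for small n (variable names hypothetical, and the sign convention mirrors the formula as reconstructed above):

% Gradient of F(z) = sigma_min(A - zI) at a shift z, small dense case.
n = size(A, 1);
[U, S, V] = svd(full(A - z*eye(n)));      % singular values in descending order
umin  = U(:, end);  vmin = V(:, end);     % left/right vectors of sigma_min
gradF = [real(vmin'*umin), imag(vmin'*umin)];   % cf. (4.2), [10]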

Preprocessing phase. As in the first phase of trgrid, we define a mesh Ω and compute φ_ζ = W_{m+1}^*(A − ζI)^{−1} w_{m+1} for each gridpoint ζ using shifted restarted fom. (A different notation for the gridpoints is used here to distinguish between the actual mesh points of Ω and the “path” points obtained in the course of path following.)
Path following phase. We start cobra from an initial path point z_0 ∈ Ω. For each k = 0, 1, …, we approximate the smallest singular value σ_min(A − z_k I) and its corresponding gradient by two-dimensional interpolation on the four neighboring mesh points of z_k. (This is possible because the singular values of a matrix are real analytic functions of its elements.)
We call this new hybrid method trfomcobra. Figure 4.2 provides a graphical illustration of trfomcobra, while Figure 4.3 outlines the algorithm. To appreciate the practical effectiveness of the computation, it is important to realize that each mesh point may be addressed by several different path points z_k during the course of cobra. In the example of Figure 4.2, points z_2 and z_3 have all four neighboring mesh points in common, while point z_1 has two such mesh points in common with the latter points. The new hybrid scheme is invoked using a call of the form trfomcobra(m, d, maxit, H, h), where m is the dimension of the transfer function, d is the dimension of the Krylov subspace for fom, maxit is the maximum number of restarts, H is the length of the “head” of cobra, and h is the length of its prediction step.
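A hedged MATLAB sketch of one correction step in this setting (hypothetical data layout; for brevity, the gradient is approximated here by finite differences on the interpolant rather than by interpolating the nodal gradients of (4.2)):

% One correction step toward the eps-contour via the precomputed mesh values.
% Assumed: [X, Y] = meshgrid(...) node coordinates; S = sigma_min estimates
% on the mesh (from the transfer function); z the current predicted point;
% epstol the target epsilon (assumed given).
s  = interp2(X, Y, S, real(z), imag(z));    % interpolated sigma_min(A - zI)
d  = 1e-6;                                  % hypothetical differencing step
gx = (interp2(X, Y, S, real(z)+d, imag(z)) - s)/d;
gy = (interp2(X, Y, S, real(z), imag(z)+d) - s)/d;
g  = gx + 1i*gy;
z  = z - (s - epstol)*g/abs(g)^2;           % Newton step onto F(z) = epstol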

ahgrid(m)                       m: dim. of projection space Range(W_m)
trgrid(m, d)                    m: dim. of transfer function matrix
                                d: max. Krylov subspace dim. for GMRES
trfomgrid(m, d, maxit)          m: dim. of transfer function matrix
                                d: max. Krylov subspace dim. for GMRES
                                maxit: max. # restarts for fom
iramgrid(m, p, maxit, 'WH')     m: dim. of projected problem (via implicitly restarted Arnoldi)
                                p: # sought eigenvalues
                                maxit: maximum # restarts
                                'WH': user-specified criterion for eigenvalue selection (see text)
trfomcobra(m, d, maxit, H, h)   m: dim. of transfer function matrix
                                d: max. Krylov subspace dim. for fom
                                maxit: max. # restarts for fom
                                H: length of "cobra" head
                                h: prediction step length

Table 5.1: Calls to the tested routines.

5. Numerical Experiments with Matrix-based Methods. For the experiments in this section, we used an Intel Pentium III 866 MHz workstation with 1 GB of RAM and 256 KB of L2 cache memory, running Windows 2000 Server. The programming environment was MATLAB 6.1, which utilizes the LAPACK 3.0 library. Furthermore, in the experiments that involve grid, the singular values of large sparse matrices are computed by means of the MATLAB function svds; this, in turn, uses ARPACK by means of a pre-compiled mex file. We first compare the accuracy of trgrid with the augmented Hessenberg approach, which we call ahgrid and which approximates σ_min(A − zI) via σ_min(H_{m+1,m} − z Ĩ) (see [37]). The test aims to emphasize the relevance of the projection strategy used in trgrid. We postpone comparisons between trfomgrid and iramgrid until Section 5.2. In order to test the accuracy of the pseudospectra obtained via the aforementioned methods, we use as a reference the results of grid, with either direct (svd) or iterative (svds) computation of the smallest singular value. The runtimes reported in all tables correspond to CPU time (using the cputime routine). In Table 5.1 we show the call and input arguments of each method. In each of the comparisons that follow, we selected the dimension m of the Krylov subspaces for ahgrid and iramgrid to be at least equal to the transfer function dimension m for trgrid and trfomgrid. In the case of iramgrid, the parameter WH is set by the user according to the sought eigenvalues: largest or smallest real ('LR'/'SR'), largest or smallest imaginary ('LI'/'SI'), largest or smallest in magnitude ('LM'/'SM'); cf. the online help for eigs in MATLAB. We note that in eigs, the approximation of the eigenvalue of smallest modulus is achieved by applying the implicitly restarted Arnoldi iteration to A^{−1}, making use of its LU decomposition. This, of course, becomes very expensive or infeasible for very large matrices. We also note that for iramgrid, we first report the results obtained using the same parameters as those reported in [41]. The comparison with trfomgrid is carried out by first using parameters such that both methods require the same amount of memory. Alternative values of the parameters are also explored in search of better performance.

Method              Runtime (secs)
grid (svds)         3438
ahgrid(100)         49.5
ahgrid(150)         221
ahgrid(200)         574
trgrid(100, 50)     111

Table 5.2: Runtimes for ahgrid and trgrid for matrix gre_1107.

5.1. Comparison of trgrid with ahgrid. Our first experiment is with the matrix gre_1107 (n = 1107) from the Matrix Market [8] and compares the performance of ahgrid and trgrid. We applied a mesh of 25 × 50 points on the domain [−1, 1.5] × [0, 1]. For trgrid we used m = 100 and d = 50 (trgrid(100, 50)), while for the calls to ahgrid(m) we used m = 100, 150 and 200. Figure 5.1 depicts the results of both methods for the contours ∂Λ_ε(A) with ε = 10^{−1} and 10^{−2}. The solid lines are the contours computed by grid. For ε = 10^{−1}, trgrid performs better than ahgrid; in particular, note the extraneous ahgrid contour in the center of the figure. Furthermore, the quality of the results of ahgrid improves only slowly as the dimension of the Krylov subspace increases from m = 100 to m = 200. On the other hand, as reported in Table 5.2, increasing m triggers a significant increase in the runtime of ahgrid. Interestingly, the computed contour in the middle of the figure does not disappear, even when m = 200. For the contour ε = 10^{−2}, we observe once more that trgrid performs better than ahgrid, although the quality of the approximation is less satisfactory than for ε = 10^{−1}. This deterioration is caused by the failure of full GMRES to accurately approximate (A − zI)^{−1} w_{m+1}, within the largest allowed dimension of the Krylov subspace, for those shifts z lying towards the interior of the pseudospectrum. An intuitive explanation for the good approximation properties of trgrid around sensitive eigenvalues is that whenever z is close to such an eigenvalue, then ‖(A − zI)^{−1}‖ is very large and therefore ‖(A − zI)^{−1}‖ ≈ ‖(A − zI)^{−1} w_{m+1}‖ for sufficiently general w_{m+1}, i.e., w_{m+1} having components in the direction of the principal singular vector u_1 of (A − zI). Furthermore, from (3.2) it follows that ‖G_{z,m}(A)‖ ≥ ‖φ_z‖; therefore ‖φ_z‖ alone could offer an approximation to the resolvent norm around sensitive eigenvalues. It is thus natural to demand accurate approximations of ‖φ_z‖. Our next experiment is with the matrix grcar(1000). We first compare the performance of ahgrid(m), for m = 100, 150 and 200, with that of trgrid(m, d), for m = 100 and d = 50. We approximated the contour for ε = 10^{−1} using a 30 × 30 mesh on the domain [−1, 3.5] × [0, 3.5]. Figure 5.2 illustrates the results and compares them with the contour computed by means of grid employing svds. The corresponding computational costs are summarized in the upper part of Table 5.4. We note that trgrid produces results that are significantly more accurate than those obtained by ahgrid, even for large values of m. On the other hand, the approximation obtained by trgrid is satisfactory only in some regions and is definitely inferior to the approximation obtained for gre_1107. As before, this appears to be caused by the low level of accuracy obtained in the computation of the transfer function, and in turn by the inaccurate approximation of (A − z_i I)^{−1} w_{m+1}. In order to better understand the influence of this error, we used MATLAB's pcolor function to shade the background of the plots in Figure 5.3, so that the color depends on the norm of the relative residual of GMRES (d = 50, left plot) and of restarted fom (d = 50, maxit = 50, right plot) for the shifted systems.

Fig. 5.1. Experiments with ahgrid and trgrid for the matrix gre_1107. Top left: trgrid(100, 50). Rest: ahgrid(100), ahgrid(150) and ahgrid(200). Dotted lines correspond to the ε = 10^{−1} contour, while dashed lines correspond to the ε = 10^{−2} contour. The outer (resp. inner) solid line corresponds to the contour for ε = 10^{−1} (resp. ε = 10^{−2}) obtained by means of grid (using svds).

We expect more satisfactory results from the iteratively approximated transfer function for those shifts z_k for which the linear solvers achieve a small relative residual for the shifted system (A − zI)g = w_{m+1}. We observe that almost all parts of the contour for ε = 10^{−1} (which was computed using grid) lie in areas of large residual norms for GMRES. However, the situation is remarkably different when we use restarted fom. As we show in the next section, this has a significant positive impact on the accuracy of trfomgrid.

5.2. Comparison of trfomgrid and iramgrid. In this section we compare trfomgrid with iramgrid [41], which enhances grid with information obtained from the implicitly restarted Arnoldi iteration. We first consider the matrix gre_1107. For trfomgrid, we always set the maximum number of fom restarts to maxit = 50 and the convergence tolerance to tol = 10^{−14}. For iramgrid we used (cf. Table 5.1) m = 100, p = 80 and maxit = 300. We experimented with WH = 'SR', 'LI', 'LR'. Figure 5.4 illustrates the computed

Fig. 5.2. Matrix grcar of size n = 1000, ε = 10^{−1}: results of trgrid(100, 50) (solid line, top left) and ahgrid(m) with m = 100 (top right), m = 150 (bottom left), m = 200 (bottom right).

Fig. 5.3. Norms of relative residuals for the shifted linear systems (A − zI)g = w_{m+1}, for the matrix grcar(1000). Left: GMRES (d = 50) without restarts. Right: fom (d = 50) with maxit = 50 restarts. Solid line: ∂Λ_ε(A), ε = 10^{−1}, computed by grid (using svds).

contours for ε = 10^{−1}, 10^{−2} and 10^{−3}. The upper plots correspond to trfomgrid. As before, we shaded the background according to the magnitude of the final fom residual. Note that in all cases where restarted fom terminated successfully, the resolvent norm approximation was satisfactory, even for the innermost contour, corresponding to ε = 10^{−3}. On the other hand, it appears that the approximations computed by iramgrid are locally satisfactory but are unable to convey an adequate representation of the entire contour. The approximation is especially poor for the inner contour (ε = 10^{−3}). Table 5.3 reports runtimes for the above experiments.

Fig. 5.4. Top: contours ∂Λ_ε(A) for the matrix gre_1107 with log10(ε) = −1, −2, −3 (left to right), using grid with svds (solid lines) and trfomgrid(100, 50, 50) (solid line). Background plot: norms of the residuals of restarted fom in log10 scale. Bottom: contours ∂Λ_ε(A) of gre_1107 with log10(ε) = −1, −2, −3 (solid, dashed and dotted lines, respectively), using iramgrid(100, 80, 300, WH) with WH = 'SR', 'LI', 'LR' (left to right).

Method                          Runtime (secs)
iramgrid(100, 80, 300, 'SR')    87
iramgrid(100, 80, 300, 'LI')    113
iramgrid(150, 120, 300, 'LR')   78
trfomgrid(100, 50, 50)          129

Table 5.3: Runtimes for iramgrid and trfomgrid for matrix gre_1107.

It is interesting to point out that choosing the eigenvalue selection criterion for the implicitly restarted Arnoldi iteration is crucial for performance and by no means trivial. Furthermore, a possible strategy for approximating the complete contour could be the repeated application of iramgrid with different eigenvalue selection criteria, and the proper combination of the resulting contours. This, however, would exacerbate the overall computational cost. Similar conclusions can be drawn from the approximations of the pseudospectrum of grcar(1000) obtained with the two methods; see Figure 5.5. Note that the approximations obtained from iramgrid vary, depending on the specific eigenvalue selection parameter WH. The corresponding timings are tabulated in Table 5.4. Although the computational cost of trfomgrid appears to be slightly higher than that of iramgrid, this is acceptable, given the much better accuracy obtained with trfomgrid. Furthermore, it is fair to point out that the implementation of iramgrid is based on compiled MATLAB mex files, while trfomgrid is written in pure MATLAB; therefore, we could further improve the performance of trfomgrid by also making use of mex files.

Method                          Runtime (secs)
grid (svds)                     > 10 hours
ahgrid(100)                     35
ahgrid(150)                     152
ahgrid(200)                     398
trgrid(100, 50)                 81
iramgrid(100, 90, 300, 'LM')    91
iramgrid(100, 90, 300, 'LR')    89
iramgrid(100, 90, 300, 'SI')    86
trfomgrid(100, 50, 50)          115

Table 5.4: Runtimes of grid, ahgrid, trgrid, iramgrid and trfomgrid for the matrix grcar(1000).

Method                          Runtime (secs)
trfomgrid(50, 50, 50)           436
trfomgrid(50, 25, 50)           145
trfomgrid(50, 25, 75)           215
trfomgrid(50, 75, 5)            119
iramgrid(50, 30, 300, 'LR')     196

Table 5.5: Runtimes of trfomgrid and iramgrid for matrix (5.1) with n = 10,000.

We next study the performance of trfomgrid and iramgrid on much larger matrices. In particular, we are interested in approximating the pseudospectrum of a family of matrices studied in [41]. These are matrices with a basic bidiagonal structure and some random non-zero entries at random positions. We experimented with matrices that were constructed by means of the MATLAB expression

A = spdiags([3*exp(-(0:n-1)'/10), 0.5*ones(n,1)], 0:1, n, n) + 0.1*sprandn(n, n, 10/n),    (5.1)

where n is the matrix size. For our experiments we selected n = 10,000 and n = 200,000. In both cases we approximated the contours for values of ε such that log10(ε) = −1 : −0.5 : −4.5, using a 25 × 50 mesh on the domain [−1, 3.5] × [0.7, 0]. Figure 5.6 illustrates the results for the matrix of size n = 10,000, obtained by calling (cf. Table 5.1) trfomgrid(50, 50, 50), trfomgrid(50, 25, 50), trfomgrid(50, 75, 5) and trfomgrid(50, 25, 75) (clockwise, starting from top left). The convergence tolerance for restarted fom was tol = 10^{−14}. Figure 5.7 illustrates the contours computed by the call iramgrid(50, 30, 300, 'LR'). Runtimes are reported in Table 5.5. For the part of the pseudospectrum lying on the right side of the depicted region, all results appear to be quite similar. This is no longer the case for the pseudospectrum approximations on the left side of the spectral region. The shaded backgrounds in Figure 5.6 indicate that the contours approximated by means of trfomgrid lie in regions with large residuals from restarted fom and must not be considered reliable. Regarding iramgrid, on the other hand, note that for the shift z = 0.65i the contours in Figure 5.7 suggest that σ_min(A − zI) > 0.1.
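For concreteness, the test matrix (5.1) and a reference value of σ_min at the shift z = 0.65i discussed below can be set up as follows (a sketch; svds with third argument 0 targets the singular values nearest zero and may exhaust memory at large n, which is why irlanb [20] was used in the paper):

% Test matrix (5.1) and a reference smallest singular value at z = 0.65i.
n = 10000;
A = spdiags([3*exp(-(0:n-1)'/10), 0.5*ones(n,1)], 0:1, n, n) ...
    + 0.1*sprandn(n, n, 10/n);
s = svds(A - 0.65i*speye(n), 1, 0);   % sigma_min; costly or infeasible at large n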

Fig. 5.5. Matrix grcar(1000), log10(ε) = −1, −3, −6: results of trfomgrid(100, 50, 50) (left column of plots) and iramgrid(m, p, 300, 'WH') with 'WH' = 'SI', 'LM', 'LR' (right column of plots, top to bottom, respectively).

However, direct computation using a method specifically geared toward the accurate computation of small singular values (algorithm irlanb from [20]) showed that σ_min(A − 0.65iI) ≈ 0.026, which is correctly suggested by trfomgrid. (When we attempted to use MATLAB's svds for this computation, the routine failed, emitting an error message indicating lack of adequate system memory.) We next report on experiments with the matrix in (5.1) of size n = 200,000. We used the same domain and mesh as above. Note that the parameters for iramgrid were the same as those used for the approximation of the pseudospectrum of A in [41].

Fig. 5.6. Contours ∂Λ_ε(A), log10(ε) = −1 : −0.5 : −4.5, and residual norms for the random matrix (5.1) of size n = 10,000, computed by trfomgrid(50, 50, 50), trfomgrid(50, 25, 50), trfomgrid(50, 75, 5) and trfomgrid(50, 25, 75) (clockwise, starting from top left).

Fig. 5.7. Contours ∂Λ_ε(A), log10(ε) = −1 : −0.5 : −4.5, for the matrix (5.1) with n = 10,000, computed by iramgrid(50, 30, 300, 'LR').

We computed the contours for the same values of ε used in the smaller case and compared iramgrid(50, 30, 300, 'LR') with trfomgrid(50, 75, 5). Figure 5.8 and Table 5.6 report the computed contours and runtimes, respectively. Based on these results, it appears that the comments made for the smaller case also apply to this matrix. Using irlanb, we approximated σ_min(A − 0.65iI) ≈ 0.025. The result from trfomgrid is consistent with this value, whereas the contours computed by iramgrid suggest that σ_min(A − 0.65iI) > 0.1.

Fig. 5.8. Contours ∂Λ_ε(A), log10(ε) = −1 : −0.5 : −4.5, for the matrix (5.1) with n = 200,000. Left: trfomgrid(50, 75, 5). Right: iramgrid(50, 30, 300, 'LR').

Method                          Runtime (secs)
trfomgrid(50, 75, 5)            1969
iramgrid(50, 30, 300, 'LR')     4649

Table 5.6: Runtimes of trfomgrid and iramgrid for matrix (5.1) of size n = 200,000.

At this point it is fair to point out that, even though our results suggest that trfomgrid delivers superior performance, we cannot claim that it is capable of accurately computing the pseudospectrum for any matrix and for any value of ε. For instance, we have already seen cases where the method has problems for small values of ε. However, a significant property of trfomgrid is that, by means of the residuals of restarted fom, we obtain an indication of the reliability of the computed results. In particular, if the residuals are large, the approximation of the transfer function should not be considered reliable. We also note that an important factor behind the better runtimes of trfomgrid vs. iramgrid is that the latter approximates the pseudospectrum only after the implicitly restarted Arnoldi iteration has converged. On one hand, this property ensures a much desired termination criterion. On the other hand, it can significantly delay the method. The example of trfomgrid clearly shows that a successful method for the computation of the pseudospectrum need not rely at all on accurate approximation of the eigenvalues of the matrix.

6. Experiments with trfomcobra. In this section we evaluate the behavior and robustness of trfomcobra. We first use the matrix gre_1107 and apply trfomcobra(100, 50, 50, 0.1, 0.025) (cf. Table 5.1) in order to approximate the ε = 0.1 contour. We employed a 50 × 50 mesh on the domain [−1, 1.5] × [1, 0] (taking advantage of the symmetry of the pseudospectrum of a real matrix with respect to the real axis). A total of 38 steps, with 2 correction points per step, were required for trfomcobra to approximate the contour. The total number of computed points on the contour was 114 (= 38 × 3), since 3 points were computed at each step of cobra: 1 in the prediction phase and 2 in the correction phase. Therefore, the total number of evaluations of ‖G_{z,m}(A)‖ and ∇‖G_{z,m}(A)‖ would appear to be 4 × 114 = 456. However, since many contour points share two or more nodes of the mesh, we only need 295 such evaluations.

Method                                Runtime for φ_z (secs)    Number of points
trfomcobra(100, 50, 50, 0.1, 0.05)    27.5                      295
trfomgrid(100, 50, 50)                27.4                      2500

Table 5.7: Computational cost of restarted fom, and number of computed points on the contours, for trfomcobra and trfomgrid for matrix gre_1107 on a 50 × 50 mesh for the domain [−1, 1.5] × [1, 0].

Table 5.7 reports the computational costs, while Figure 6.1 illustrates the resulting contour and the mesh points used. On the other hand, trfomgrid would have required all 50 × 50 = 2,500 computations on the mesh nodes. Table 5.7 also reports the costs of trfomgrid(100, 50, 50).

Fig. 6.1. The ε = 0.1 contour of matrix gre_1107 computed by trfomcobra(100, 50, 50, 0.1, 0.05). The dots ('.') denote the nodes of the mesh for which the norm ‖G_{z,m}(A)‖ and the gradient ∇‖G_{z,m}(A)‖ were computed.

We next consider the case where we need to increase the number of computed points on the contour, using the same mesh: we double the number of points in the correction phase of each step of trfomcobra to 4. Then the total number of ‖G_{z,m}(A)‖ and ∇‖G_{z,m}(A)‖ evaluations increases only slightly compared to the previous experiment, from 295 to 308. In the next experiment we test the robustness of the method. In particular, we gradually use significantly fewer nodes for the preprocessing mesh, while we keep the number of cobra steps equal to 38 and the number of points per step equal to 2. We first used a 20 × 20 mesh and then an even coarser 10 × 10 mesh, keeping the same domain. In the first case we required 116 evaluations of ‖G_{z,m}(A)‖ and ∇‖G_{z,m}(A)‖ on the mesh points, while in the second case this number dropped to 52. Figure 6.2 illustrates the computed contours for ε = 0.1, as well as the contour for the same ε computed by grid on the same domain using a fine 50 × 50 mesh. The results suggest that trfomcobra is very robust. In particular, the method computed a fairly accurate contour even in the case of the very coarse 10 × 10 mesh.

Fig. 6.2. Comparing the accuracy of the results of trfomcobra with the results of grid, using a coarse mesh (20 × 20, left; 10 × 10, right) for the former.

7. Conclusions. In order for the pseudospectrum to become a useful and practical tool for the analysis of non-normal matrices, it is necessary to design efficient methods for its computation. Given the research results to date, an effective approach would have to combine the advances that have taken place in both domain- and matrix-based methods. Regarding the latter, we noticed that despite the existence of powerful algorithms for computing the smallest singular value, these do not capitalize on the domain information available when pseudospectra are computed. The reason is that, even when using path following algorithms such as cobra, the number of points z_k where the smallest singular value σ_min(A − z_k I) is required remains large, so that computing even a single contour of a very large matrix remains a formidable task. Consequently, we need methods specifically designed to concurrently approximate σ_min(A − z_k I) for a large number of points z_k. Following the promising initial results in [33], the methods proposed in this paper address this major difficulty by implementing the transfer function framework for the efficient approximation of ‖(A − z_k I)^{−1}‖ for many points z_k ∈ C, and by selecting linear solvers that make use of the shift invariance of Krylov subspaces. We have addressed the problem of making the transfer function approach practical, in both the matrix- and domain-based frameworks. Note that, unlike other methods, transfer functions do not rely on the computation of the smallest singular values of large matrices, a difficult problem in its own right. We were thus able to compute the pseudospectrum of a matrix of size n = 200,000 in less than one hour using a standard workstation. Overall, our results suggest that methods based on the transfer function framework, possibly combined with path following, are powerful tools for approximating pseudospectra of very large matrices.

Acknowledgements. This work has been partially supported by the Greek General Secretariat for Research and Development, Project ΠENEΔ 99-07, and by a University of Patras Carathéodory 2003 grant. We thank our project partners from the Chemical Engineering Department of the National Technical University of Athens, A. Boudouvis, A. Spyropoulos and E. Koronaki, as well as E. Kokiopoulou, for useful discussions and support in this work. The first author would like to thank the Bodossaki Foundation for financial support through its doctoral scholarship program.

REFERENCES

[1] J. Baglama, D. Calvetti, and L. Reichel. IRBL: An implicitly restarted block Lanczos method for large-scale Hermitian eigenproblems. SIAM J. Sci. Comput., 24(5):1650–1677, 2003.
[2] C. Bekas and E. Gallopoulos. Cobra: Parallel path following for computing the matrix pseudospectrum. Parallel Computing, 27(14):1879–1896, 2001.
[3] C. Bekas and E. Gallopoulos. Parallel computation of pseudospectra by fast descent. Parallel Computing, 28(2):223–242, 2002.
[4] C. Bekas, E. Kokiopoulou, and E. Gallopoulos. The design of a distributed MATLAB-based environment for computing pseudospectra. Future Generation Computer Systems, to appear.
[5] C. Bekas, E. Kokiopoulou, E. Gallopoulos, and V. Simoncini. Parallel computation of pseudospectra using transfer functions on a MATLAB-MPI cluster platform. In Recent Advances in Parallel Virtual Machine and Message Passing Interface, Proc. 9th European PVM/MPI Users' Group Meeting, Springer-Verlag, LNCS Vol. 2474, 2002.
[6] C. Bekas, E. Kokiopoulou, I. Koutis, and E. Gallopoulos. Towards the effective parallel computation of matrix pseudospectra. In Proc. 15th ACM Int'l. Conf. Supercomputing (ICS'01), pages 260–269, Sorrento, Italy, June 2001.
[7] M.W. Berry, D. Mezher, B. Philippe, and A. Sameh. Parallel computation of the singular value decomposition. Technical Report No. 4694, IRISA, Rennes, Jan. 2003.
[8] R.F. Boisvert, R. Pozo, K. Remington, R. Barrett, and J. Dongarra. The Matrix Market: A Web repository for test matrix data. In R.F. Boisvert, editor, The Quality of Numerical Software, Assessment and Enhancement, pages 125–137. Chapman+Hall, London, 1997.
[9] T. Braconnier and N.J. Higham. Computing the field of values and pseudospectra using the Lanczos method with continuation. BIT, 36(3):422–440, 1996.
[10] M. Brühl. A curve tracing algorithm for computing the pseudospectrum. BIT, 36(3):441–445, 1996.
[11] J.V. Burke, A.S. Lewis, and M.L. Overton. Optimization over pseudospectra, with applications to robust stability. SIAM J. Matrix Anal. Appl., to appear, 2003.
[12] D.R. Fokkema, G.L.G. Sleijpen, and H.A. van der Vorst. Jacobi-Davidson style QR and QZ algorithms for the reduction of matrix pencils. SIAM J. Sci. Comput., 20(1):94–125, 1998.
[13] R.W. Freund. Solution of shifted linear systems by quasi-minimal residual iterations. In L. Reichel, A. Ruttan, and R.S. Varga, editors, Numerical Linear Algebra, pages 101–121. W. de Gruyter, Berlin, 1993.
[14] A. Frommer. BiCGStab(ℓ) for families of shifted linear systems. Computing, 70(2):87–109, 2003.
[15] A. Frommer and U. Glässner. Restarted GMRES for shifted linear systems. SIAM J. Sci. Comput., 19(1):15–26, January 1998.
[16] V. Heuveline, B. Philippe, and M. Sadkane. Parallel computation of spectral portrait of large matrices by Davidson type methods. Numer. Algorithms, 16(1):55–75, 1997.
[17] N.J. Higham. The Matrix Computation Toolbox. Technical report, Manchester Centre for Computational Mathematics, 2002. At www.ma.man.ac.uk/~higham/mctoolbox.
[18] M. Hochstenbach. A Jacobi-Davidson type SVD method. SIAM J. Sci. Comput., 23(2):606–628, 2001.
[19] Z. Jia and D. Niu. An implicitly restarted refined bidiagonalization Lanczos method for computing a partial singular value decomposition. SIAM J. Matrix Anal. Appl., 25(1), 2003.
[20] E. Kokiopoulou, C. Bekas, and E. Gallopoulos. Computing smallest singular triplets with implicitly restarted Lanczos bidiagonalization. Applied Numerical Mathematics, to appear.
[21] R.M. Larsen. Lanczos bidiagonalization with partial reorthogonalization. PhD thesis, Dept. Computer Science, University of Aarhus, DK-8000 Aarhus C, Denmark, Oct. 1998.
[22] R. Lehoucq, D.C. Sorensen, and C. Yang. ARPACK Users' Guide: Solution of Large-Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods. SIAM, Philadelphia, 1998.
[23] S.H. Lui. Computation of pseudospectra with continuation. SIAM J. Sci. Comput., 18(2):565–573, 1997.
[24] D. Mezher. A graphical tool for driving the parallel computation of pseudospectra. In Proc. 15th ACM Int'l. Conf. Supercomputing (ICS'01), pages 270–276, Sorrento, Italy, June 2001.
[25] D. Mezher and B. Philippe. PAT - a reliable path following algorithm, Aug. 2000. To appear.
[26] D. Mezher and B. Philippe. Parallel computation of the pseudospectrum of large matrices. Parallel Computing, 28(2):199–221, 2002.
[27] B.N. Parlett. The Symmetric Eigenvalue Problem. Prentice Hall, Englewood Cliffs, 1980.
[28] B. Philippe and M. Sadkane. Computation of the fundamental singular subspace of a large matrix. Lin. Alg. Appl., 257:77–104, 1997.
[29] Pseudospectra Gateway. At the Oxford University site http://web.comlab.ox.ac.uk/projects/pseudospectra.
[30] Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition, Philadelphia, 2003.
[31] Y. Saad and M.H. Schultz. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Statist. Comput., 7(3):856–869, July 1986.
[32] V. Simoncini. Restarted full orthogonalization method for shifted linear systems. BIT Numerical Mathematics, 43(2):459–466, 2003.
[33] V. Simoncini and E. Gallopoulos. Transfer functions and resolvent norm approximation of large matrices. Electronic Transactions on Numerical Analysis (ETNA), 7:190–201, 1998.
[34] G.L.G. Sleijpen and H.A. van der Vorst. A Jacobi-Davidson iteration method for linear eigenvalue problems. SIAM Rev., 42(2):267–293, 2000.
[35] J.-G. Sun. A note on simple non-zero singular values. J. Comput. Math., 6(3):258–266, 1988.
[36] F. Tisseur and N.J. Higham. Structured pseudospectra for polynomial eigenvalue problems, with applications. SIAM J. Matrix Anal. Appl., 23(1):187–208, 2001.
[37] K.-C. Toh and L.N. Trefethen. Calculation of pseudospectra by the Arnoldi iteration. SIAM J. Sci. Comput., 17(1):1–15, 1996.
[38] L.N. Trefethen. Pseudospectra of matrices. In D.F. Griffiths and G.A. Watson, editors, Numerical Analysis 1991, Proc. 14th Dundee Conf., pages 234–266. Longman Sci. and Tech., Essex, UK, 1991.
[39] L.N. Trefethen. Computation of pseudospectra. In Acta Numerica 1999, volume 8, pages 247–295. Cambridge University Press, 1999.
[40] T. Wright. Eigtool: A graphical tool for nonsymmetric eigenproblems, Dec. 2002. At the Oxford University Computing Laboratory site http://web.comlab.ox.ac.uk/pseudospectra/eigtool.
[41] T. Wright and L.N. Trefethen. Large-scale computation of pseudospectra using ARPACK and Eigs. SIAM J. Sci. Comput., 23(2):591–605, 2001.
[42] T.G. Wright. Algorithms and Software for Pseudospectra. PhD thesis, University of Oxford, 2002.
