Preconditioned Krylov Subspace Methods for Eigenvalue Problems
Kesheng Wu, Yousef Saad, and Andreas Stathopoulos
Computer Science Department, University of Minnesota
March, 1996
1 Introduction
The Lanczos algorithm is a commonly used method for finding a few extreme eigenvalues of symmetric matrices [7, 17]. It is effective if the wanted eigenvalues have large relative separations. If the separations are small, several alternatives are often used, including the shift-invert Lanczos method [11], the preconditioned Lanczos method [14], and the Davidson method [6, 8, 9, 13]. The shift-invert Lanczos method requires a direct factorization of the matrix, which is often impractical if the matrix is large. In these cases preconditioned schemes are preferred. Many applications require hundreds or thousands of eigenvalues of large sparse matrices, which poses serious challenges for both the iterative eigenvalue solver and the preconditioner [2, 3, 4, 5]. In this paper we explore several preconditioned eigenvalue solvers and identify the ones suited for finding a large number of eigenvalues. The methods discussed in this paper make up the core of a preconditioned eigenvalue toolkit under construction.
2 Preconditioned Algorithms
2.1 Generating the basis vectors
Conceptually, we separate an eigenvalue solver into two parts: one constructs an orthonormal basis set, the other uses the basis set in a Rayleigh-Ritz procedure to find approximate eigen-pairs. For symmetric matrices the Rayleigh-Ritz procedure is optimal in many respects [15] and it is easy to use. There are numerous ways of building orthonormal bases, including the Davidson method and the Arnoldi method. Many of them can be formulated as one algorithm with different preconditioning strategies.

Algorithm: Generating an orthonormal basis $V_m = [v_1, v_2, \ldots, v_m]$
1. Start. Choose a vector $v_1$. Let $z = v_1$.
2. Iterate. For $j = 1, 2, \ldots, m$ do:
   (a) $z = z - V_{j-1} V_{j-1}^T z$, $v_j = z / \|z\|$,
   (b) $w_j = A v_j$,
   (c) $h_{i,j} = (v_i, w_j)$ for $i = 1, \ldots, j$,
   (d) If $j < m$, generate the next $z$.
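The loop above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function name `build_basis`, the callback interface `next_z(V_j, W_j, H_j)`, and the second orthogonalization pass are assumptions made here. The usage lines take the simplest rule, $z = w_j$, which corresponds to Arnoldi without preconditioning.

```python
import numpy as np

def build_basis(A, v1, m, next_z):
    """Return (V, W, H) with V orthonormal, W = A V, and H = V^T A V.

    next_z(V_j, W_j, H_j) is a caller-supplied rule (hypothetical interface)
    that produces the vector z used to extend the basis.
    """
    z = np.asarray(v1, dtype=float)
    n = z.shape[0]
    V = np.zeros((n, m))
    W = np.zeros((n, m))
    H = np.zeros((m, m))
    for j in range(m):
        Vj = V[:, :j]
        # (a) z = z - V_{j-1} V_{j-1}^T z, applied twice (an assumption made
        #     here for numerical safety), then normalize.
        for _ in range(2):
            z = z - Vj @ (Vj.T @ z)
        norm_z = np.linalg.norm(z)
        if norm_z == 0.0:
            raise ValueError("z lies in the span of the current basis")
        V[:, j] = z / norm_z
        # (b) w_j = A v_j
        W[:, j] = A @ V[:, j]
        # (c) h_{i,j} = (v_i, w_j) for i = 1..j; A is symmetric, so the same
        #     values also fill row j of H.
        H[: j + 1, j] = V[:, : j + 1].T @ W[:, j]
        H[j, : j + 1] = H[: j + 1, j]
        # (d) generate the next z unless the basis is full
        if j + 1 < m:
            z = next_z(V[:, : j + 1], W[:, : j + 1], H[: j + 1, : j + 1])
    return V, W, H

# Usage: unpreconditioned Arnoldi, where the next z is simply w_j.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 30))
A = (A + A.T) / 2                      # symmetric test matrix
V, W, H = build_basis(A, rng.standard_normal(30), 8, lambda V, W, H: W[:, -1])
```

Keeping the rule for the next $z$ as a callback mirrors the separation made in the text: the orthogonalization loop stays fixed while the preconditioning strategy varies.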
We explore four schemes for generating the new $z$.
1. Preconditioned Arnoldi. $z = M_j^{-1} w_j$.
2. Modified Arnoldi. Let $h_j = [h_{1j}, \ldots, h_{jj}]^T$; then $z = M_j^{-1}(w_j - V_j h_j)$.
3. Davidson.
   (a) Compute the smallest eigen-pair $(\lambda, y)$ of $H_j = V_j^T A V_j$,
   (b) $u = V_j y$,
   (c) $r = W_j y - \lambda u$, where $W_j = [w_1, \ldots, w_j]$,
   (d) $z = M_j^{-1} r$.
4. Harmonic Davidson. Let $H_j = V_j^T A V_j$, $G_j = W_j^T W_j$.
   (a) Solve the generalized eigenvalue problem $H_j Y = G_j Y \Theta$, where $\Theta$ is the diagonal matrix of eigenvalues,
   (b) $\lambda = \min_{i=1}^{j} y_i^T H_j y_i$. Assume $\lambda = y_1^T H_j y_1$.
   (c) $u = V_j y_1$,
   (d) $r = W_j y_1 - \lambda u$, where $W_j = [w_1, \ldots, w_j]$,
   (e) $z = M_j^{-1} r$.

Among these four schemes, the preconditioned Arnoldi scheme is the most straightforward. In the unpreconditioned case, i.e., $M = I$, we simply have $z = w_j$.

The second scheme is a modification of the Arnoldi scheme. The input to the preconditioner is the last column of $A V_j - V_j H_j$, which can be viewed as the "residual of the basis", and it is orthogonal to the current basis $V_j$. This scheme can be viewed as an Arnoldi scheme where an extra orthogonalization is applied before preconditioning.

Davidson's scheme applies the preconditioner to the residual of the current approximate solution. Let $n$ denote the matrix size. At the $j$th Davidson step, computing one residual vector requires about $2jn$ floating-point multiplications, $2jn$ additions, and the solution of a $j \times j$ eigenvalue problem. This preparation for preconditioning is about twice as expensive as in scheme 2.

The harmonic Ritz value scheme is adopted from [12]. Two alternative ways of computing the harmonic Ritz values require either $V_j$ and $W_j$ to be bi-orthogonal, or $V_j$ to be $A^2$-orthonormal, and do not fit into the above framework.
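As a rough illustration of how the four rules differ, the sketch below computes the vector handed to the preconditioner in each case. The function names and the `solve_M` callback (which applies $M_j^{-1}$) are hypothetical. The harmonic variant reduces the pencil $(H_j, G_j)$ to a standard problem through a Cholesky factor of $G_j$ and normalizes each $y_i$ before forming $y_i^T H_j y_i$; both choices are assumptions made here, not details given in the paper.

```python
import numpy as np

def arnoldi_z(V, W, H, solve_M):
    # Scheme 1: precondition the newest product w_j = A v_j.
    return solve_M(W[:, -1])

def modified_arnoldi_z(V, W, H, solve_M):
    # Scheme 2: remove the components of w_j along V_j before preconditioning.
    return solve_M(W[:, -1] - V @ H[:, -1])

def davidson_residual(V, W, H):
    # Scheme 3, steps (a)-(c): smallest Ritz pair and its residual.
    theta, Y = np.linalg.eigh(H)            # eigenvalues in ascending order
    lam, y = theta[0], Y[:, 0]
    return lam, W @ y - lam * (V @ y)       # r = W y - lam * V y

def harmonic_residual(V, W, H):
    # Scheme 4, steps (a)-(d). G = W^T W must be positive definite here.
    G = W.T @ W
    L = np.linalg.cholesky(G)               # G = L L^T
    B = np.linalg.solve(L, np.linalg.solve(L, H).T).T   # L^{-1} H L^{-T}
    B = (B + B.T) / 2                       # remove rounding asymmetry
    _, Q = np.linalg.eigh(B)
    Y = np.linalg.solve(L.T, Q)             # columns satisfy H y = theta G y
    Y = Y / np.linalg.norm(Y, axis=0)       # assumption: unit-norm y_i
    rq = np.einsum("ij,ij->j", Y, H @ Y)    # y_i^T H y_i for every column
    i = int(np.argmin(rq))
    lam, y = rq[i], Y[:, i]
    return lam, W @ y - lam * (V @ y)

# Usage on a small random symmetric matrix with an arbitrary orthonormal V.
rng = np.random.default_rng(0)
n, j = 40, 6
A = rng.standard_normal((n, n))
A = (A + A.T) / 2
V, _ = np.linalg.qr(rng.standard_normal((n, j)))
W = A @ V
H = V.T @ W
identity = lambda x: x                      # M = I, the unpreconditioned case
z_mod = modified_arnoldi_z(V, W, H, identity)
lam, r = davidson_residual(V, W, H)
lam_h, r_h = harmonic_residual(V, W, H)
```

With $M = I$, both `z_mod` and the Davidson residual `r` are orthogonal to the columns of `V`, which matches the remark that scheme 2 feeds the preconditioner a vector orthogonal to the current basis.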
2.2 Some characteristics of the implementations
In practice, the above algorithm is almost always restarted. Usually more than one eigenvalue is wanted, so another loop is placed outside the restarting loop. This yields two nested loops outside the above algorithm. In our implementation of the eigenvalue solvers, we always work on a small number of eigenvalues at a time. This makes it possible to keep the maximum basis size independent of the number of eigenvalues desired. Therefore, the workspace does not have to grow as the number of eigenvalues increases.

At restart, we save Ritz vectors as the initial guesses for the next basis set. The number of vectors saved is independent of the number of eigenvalues wanted. This technique has been shown to enhance the performance of the eigenvalue solver in many cases [18]. When more than one starting vector is given to the Arnoldi process, we build a Krylov subspace from the first vector, $v_1$. The rest of the initial guesses are used to deflate the Krylov subspace generated. The theoretical aspects of this technique are studied in [16], and some experience from using it in linear system solvers is reported in [1]. For our experiments here, we always keep the workspace at about $50n$, i.e., the maximum basis size is 25. When restarting, we always save 12 vectors.

Preconditioning is a crucial feature of the programs. In linear system solvers, either $M$ approximates $A$, or $M^{-1}$ approximates $A^{-1}$. In the eigenvalue case, if $(\lambda, x)$ is the current approximation to the eigen-pair $(\lambda^*, x^*)$, the preconditioner is often taken to be
$$M \approx (A - \lambda I) \approx (A - \lambda^* I).$$
A commonly used preconditioner is $M = \mathrm{diag}(A) - \lambda I$. Usually it is helpful to use a biased estimate of $\lambda^*$ as the shift in the preconditioner [19]. For example, if the smallest eigenvalue is sought, we under-estimate it as $\lambda - \|r\|$, where $r$ is the residual vector of the current approximation. Among the four preconditioning strategies we have mentioned, the last two compute residual vectors. In these two cases, we also compute the residual norms and provide an updated shift for the preconditioner at every step.
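The diagonal preconditioner with a biased shift can be sketched as follows. The function names are hypothetical, and the `floor` safeguard against near-zero diagonal entries is an assumption added for the illustration; the paper does not say how such entries are handled.

```python
import numpy as np

def biased_shift(lam, r):
    """Under-estimate the smallest eigenvalue as lam - ||r||."""
    return lam - np.linalg.norm(r)

def diagonal_preconditioner(A, shift, floor=1e-8):
    """Return a function applying M^{-1} for M = diag(A) - shift * I."""
    d = np.asarray(A.diagonal(), dtype=float) - shift
    # Hypothetical safeguard (not from the paper): replace entries that are
    # too close to zero so that the division below stays well defined.
    d = np.where(np.abs(d) < floor, floor, d)
    return lambda r: r / d

# Usage with a small dense matrix and a made-up Ritz value and residual.
A = np.diag([1.0, 2.0, 4.0]) + 0.1          # diagonal entries 1.1, 2.1, 4.1
lam = 1.05                                  # current Ritz value (illustrative)
r = np.array([0.03, 0.0, 0.04])             # current residual, norm 0.05
shift = biased_shift(lam, r)
apply_M_inv = diagonal_preconditioner(A, shift)
z = apply_M_inv(r)                          # z = M^{-1} r
```

Because the shift depends on the current residual norm, a solver using this sketch would rebuild the preconditioner at every step for the two residual-based schemes, as described above.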
                 Arnoldi       Modified       Davidson       Harmonic
Matrix    matvec time(s) matvec time(s) matvec time(s) matvec time(s)
662BUS      3410    66.8   5000    67.4   4969   136.1   4267   160.0
685BUS      1629    33.5   3814    54.0   1771    48.6   3747   145.3
BCSSTK01     236     0.9    954     6.7   1095    10.2    369     4.9
BCSSTK02     126     0.7    374     2.8    198     2.3    172     2.8
BCSSTK04    3550    19.3   5000    30.4   5005    56.5   3928    65.5
BCSSTK05     563     3.4   1274    11.5    757     9.6    665    11.9
BCSSTK09     745    25.3   1084    30.6    927    43.3    861    55.2
BCSSTK16     964   257.6      -       -    952   282.7    718   257.3
BCSSTK22    2118    11.1   1903    12.6   3175    36.7   3266    57.5
BCSSTM10    5006   174.5    444    13.6    263    12.3   4033   276.1
BCSSTM27    4032   190.6   5000   244.4   3331   207.3   5005   409.4
GR3030       199     4.8    323     8.5    147     5.8    628    34.7
LUNDA       1012     5.9   3774    25.6   1485    17.4   1316    22.5
LUNDB       2301    13.1   5000    50.5   3136    37.2   2967    52.0
NOS3         537    15.8    994    24.6    484    20.0    640    35.7
NOS4        3074    15.7    344     3.0    198     2.4   1732    29.6
NOS5        5005    70.8   2453    46.5   1381    30.6   1797    53.1
ZENIOS      5006   604.0    144    13.7    120    15.1    262    46.4
(The source lists only three matvec/time pairs for BCSSTK16; they are placed here on the assumption that the Modified Arnoldi entry is the missing one.)
Table 1: Results of unpreconditioned case.
                 Arnoldi       Modified       Davidson       Harmonic
Matrix    matvec time(s) matvec time(s) matvec time(s) matvec time(s)
662BUS      5005    93.7   5000    99.3    991    29.8   5003   185.8
685BUS      5005   101.4   5000   103.0   1004    30.8   5004   194.2
BCSSTK01    1781     6.4   1024     5.0    133     1.4   2694    38.7
BCSSTK02    2194    10.0   5000    27.7    211     2.5   2577    41.6
BCSSTK03    5009    24.2   5000    25.2   2031    23.7   5004    82.1
BCSSTK04    5012    28.7   4003    23.3    198     2.5   5004    86.9
BCSSTK05    5012    31.0   5000    32.1    445     6.2   5003    90.4
BCSSTK08    5005   174.1   5000   181.6    900    44.8   5004   323.2
BCSSTK09    4671   170.5   5000   184.9    900    45.9   4552   305.1
BCSSTK16    5003  1393.9      -       -    887   284.4   5005  1971.4
BCSSTK21    5002   820.3   5000   866.1   2227   422.6   5004  1177.2
BCSSTK22    5012    27.4   5000    31.7   1680    21.5   5004    87.7
BCSSTM07    5005    72.3   5000    54.3    367     8.0   5003   145.9
BCSSTM10    5005   179.4   3103   102.4    276    14.2   5004   338.6
BCSSTM13    5003   379.6      -       -    303    31.8   5004   588.7
BCSSTM27    5004   246.2   5000   254.0   3045   206.8   5004   428.5
GR3030       159     4.2    323     7.9    159     6.7    680    39.0
LUNDA       5013    32.0   4993    32.9    237     3.2   5005    89.4
LUNDB       5012    29.7   5000    32.0    367     4.7   5004    91.6
NOS3        2370    73.7   5000   130.0    666    29.9   2863   171.2
NOS4        3074    14.7   2144     8.4    237     2.9   3565    61.8
NOS5        4788    71.4   5000    77.0    861    20.3   4318   136.5
NOS6        5005    99.5   5000   106.5   4605   143.8   5004   189.6
NOS7        5005   105.0   5000   113.8    211     7.3   5004   195.5
ZENIOS       172    20.5    134    13.3    120    16.0    250    46.1
(The source lists only three matvec/time pairs for BCSSTK16 and BCSSTM13. Their Davidson entries agree with Table 3, so the Modified Arnoldi entries are taken to be the missing ones.)
Table 2: Results of diagonal preconditioned case.
3 Experimental Comparisons
Our numerical experiments are conducted on a set of symmetric matrices from the Harwell-Boeing sparse matrix collection [10]. We used all the matrices of RSA type. In the following tables, we have dropped the results for diagonal matrices and for those that cannot be solved by any of the methods tested. An eigenvalue is considered converged if the residual norm of the approximation is less than $10^{-12} \|A\|_F$, where $\|A\|_F$ is the Frobenius norm of the matrix.
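The convergence criterion can be written as a short check. This is a sketch for a dense NumPy array; the function name `is_converged` is hypothetical, and the approximate eigenvector is assumed to have unit norm.

```python
import numpy as np

def is_converged(A, lam, x, tol=1e-12):
    """Test ||A x - lam x|| < tol * ||A||_F for a unit-norm vector x."""
    r = A @ x - lam * x
    return bool(np.linalg.norm(r) < tol * np.linalg.norm(A, "fro"))

# Usage: an exact eigen-pair passes, a slightly perturbed vector does not.
A = np.diag([1.0, 2.0, 3.0])
exact = np.array([1.0, 0.0, 0.0])
perturbed = np.array([1.0, 1e-3, 0.0])
perturbed = perturbed / np.linalg.norm(perturbed)
ok_exact = is_converged(A, 1.0, exact)
ok_perturbed = is_converged(A, 1.0, perturbed)
```

Scaling the tolerance by $\|A\|_F$ makes the test independent of the overall scaling of the matrix.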
3.1 Without preconditioning
The first experiment is performed without any preconditioning (see Table 1). The table shows the number of matrix-vector multiplications (matvec) and the CPU time, in seconds, used to compute the 5 smallest eigenvalues. The limit on the number of matrix-vector multiplications is 5000. Entries showing 5000 or more matrix-vector multiplications indicate that the eigenvalue solver did not compute all five wanted eigenvalues. Results are shown for the Arnoldi method, the Modified Arnoldi method, the Davidson method, and the Harmonic Davidson method. There are 50 test matrices. Without preconditioning, 18 of them were solved by at least one of the four eigenvalue solvers. Comparing the columns of Table 1, we notice that the Arnoldi method and the Davidson method generally use fewer matrix-vector multiplications and less time than the other two. When solving the same eigenvalue problem, the Arnoldi method often takes less time than the Davidson method on the matrices tested.
3.2 Simple preconditioning
This experiment is performed with diagonal preconditioning (see Table 2). Here again we look for the 5 smallest eigenvalues. With diagonal preconditioning, we are able to solve half of the problem set. Our experiment indicates that the Davidson method takes better advantage of the diagonal preconditioner than the others.
3.3 Finding more eigenvalues
Table 3 demonstrates the behavior of the Davidson eigenvalue solver when more eigenvalues are wanted. The number of matrix-vector multiplications allowed is 20,000 for finding 20 eigenvalues and 40,000 for finding 40 eigenvalues. For all 25 matrices where the Davidson method was successful in finding 5 eigenvalues, it is also able to find 20 eigenvalues with more matrix-vector multiplications. There are three cases where the Davidson method failed to find 40 eigenvalues. For matrix 685BUS, we are able to find 39. In the other two cases, fewer than 30 eigenvalues have converged. The average ratio of the time spent on finding 20 eigenvalues to the time spent on finding 5 eigenvalues is about 3.8 (Table 3). For the 23 cases where 40 eigenvalues are found by the Davidson method, the average ratio of the time spent on finding 40 eigenvalues to the time spent on finding 5 eigenvalues is about 9.5. This indicates that if the Davidson method does find the desired number of eigenvectors, it is fairly efficient in finding them. However, the Davidson method can fail to find more eigenvalues, as in the cases of 685BUS, BCSSTM27, and NOS6.

Acknowledgment. This research was supported by National Science Foundation Grant DMR 95-25885 and NSF/ASC 95-04038. The authors would also like to acknowledge the support of the Minnesota Supercomputer Institute, which provided the computer facilities and an excellent research environment to conduct this research.
           5 eigenvalues 20 eigenvalues 40 eigenvalues
Matrix    matvec time(s) matvec time(s) matvec time(s)
662BUS       991    29.8   3710   126.0   7773   316.7
685BUS      1004    30.8   3606   127.0  40012  2261.1
BCSSTK01     133     1.4    340     3.5    399     4.1
BCSSTK02     211     2.5    655     8.0   1247    15.5
BCSSTK03    2031    23.7   3294    40.0   4250    53.8
BCSSTK04     198     2.5    616     8.4   1559    22.9
BCSSTK05     445     6.2   1513    21.9   2859    43.4
BCSSTK08     900    44.8   4165   248.7  10880   828.9
BCSSTK09     900    45.9   2878   153.4   5342   332.8
BCSSTK16     887   284.4   3346  1037.0   5927  2071.1
BCSSTK21    2227   422.6   6050  1229.6  12271  2696.2
BCSSTK22    1680    21.5   3086    41.4  10763   164.0
BCSSTM07     367     8.0   1344    32.9   2729    77.3
BCSSTM10     276    14.2   1292    72.8   5108   362.4
BCSSTM13     303    31.8    915   108.1   1846   264.5
BCSSTM27    3045   206.8  15345  1129.3  40002  3579.1
GR3030       159     6.7    681    30.0   1663    86.4
LUNDA        237     3.2    720     9.8   1741    26.4
LUNDB        367     4.7   1227    16.8   2963    45.0
NOS3         666    29.9   2202   103.8   4302   237.2
NOS4         237     2.9    720     9.1   1416    18.5
NOS5         861    20.3   2436    59.3   3847   106.2
NOS6        4605   143.8  11471   386.4  40001  1721.7
NOS7         211     7.3    949    35.5   2105    91.4
ZENIOS       120    16.0   1006   152.1   2534   445.4
Table 3: Results of seeking different numbers of eigenvalues.

References
[1] A. Chapman and Y. Saad. Deflated and augmented Krylov subspace techniques. Technical Report UMSI 95/181, Minnesota Supercomputing Institute, Univ. of Minnesota, 1995.
[2] J. R. Chelikowsky, N. Troullier, and Y. Saad. Finite-difference-pseudopotential method: electronic structure calculations without a basis. Phys. Rev. Lett., 72:1240-3, 1994.
[3] J. R. Chelikowsky, N. Troullier, K. Wu, and Y. Saad. Higher order finite difference pseudopotential method: an application to diatomic molecules. Phys. Rev. B, 50:11355-11364, 1994.
[4] J. R. Chelikowsky, N. R. Troullier, X. Jing, D. Dean, N. Binggeli, K. Wu, and Y. Saad. Algorithms for the structural properties of clusters. Computer Physics Communications, 85:325-335, 1995.
[5] M. L. Cohen and J. R. Chelikowsky. Electronic Structure and Optical Properties of Semiconductors. Springer-Verlag, New York, Berlin, Heidelberg, 2nd edition, 1989.
[6] M. Crouzeix, B. Philippe, and M. Sadkane. The Davidson method. SIAM J. Sci. Comput., 15:62-76, 1994.
[7] J. Cullum and R. A. Willoughby. Lanczos Algorithms for Large Symmetric Eigenvalue Computations, volume 1: Theory of Progress in Scientific Computing; v. 3. Birkhäuser, Boston, 1985.
[8] Ernest R. Davidson. The iterative calculation of a few of the lowest eigenvalues and corresponding eigenvectors of large real-symmetric matrices. J. Comput. Phys., 17:87, 1975.
[9] Ernest R. Davidson. Super-matrix methods. Computer Physics Communications, 53:49-60, 1989.
[10] I. S. Duff, R. G. Grimes, and J. G. Lewis. Sparse matrix test problems. ACM Trans. Math. Soft., pages 1-14, 1989.
[11] R. G. Grimes, J. G. Lewis, and H. D. Simon. A shifted block Lanczos algorithm for solving sparse symmetric generalized eigenproblems. SIAM J. Matrix Anal. Appl., 15(1):228-272, 1994.
[12] R. B. Morgan. Computing interior eigenvalues of large matrices. Lin. Alg. Appl., pages 289-309, 1991.
[13] R. B. Morgan and D. S. Scott. Generalizations of Davidson's method for computing eigenvalues of sparse symmetric matrices. SIAM J. Sci. Statist. Comput., 7:817-825, 1986.
[14] R. B. Morgan and D. S. Scott. Preconditioning the Lanczos algorithm for sparse symmetric eigenvalue problems. SIAM J. Sci. Comput., 14:585-593, 1993.
[15] Beresford N. Parlett. The Symmetric Eigenvalue Problem. Prentice-Hall, Englewood Cliffs, NJ, 1980.
[16] Y. Saad. Analysis of augmented Krylov subspace techniques. Technical Report UMSI 95/175, Minnesota Supercomputing Institute, Univ. of Minnesota, 1995.
[17] Yousef Saad. Numerical Methods for Large Eigenvalue Problems. Manchester University Press, 1993.
[18] D. C. Sorensen. Implicit application of polynomial filters in a k-step Arnoldi method. SIAM J. Matrix Anal. Appl., 13(1):357-385, 1992.
[19] A. Stathopoulos, Y. Saad, and C. F. Fischer. Robust preconditioning of large, sparse, symmetric eigenvalue problems. Journal of Computational and Applied Mathematics, 64:197-215, 1995.