Accelerating the convergence of the Lanczos algorithm by the use of a complex symmetric Cholesky factorization: application to correlation functions in quantum molecular dynamics
2011 J. Phys. B: At. Mol. Opt. Phys. 44 205102 (http://iopscience.iop.org/0953-4075/44/20/205102)
IOP PUBLISHING
JOURNAL OF PHYSICS B: ATOMIC, MOLECULAR AND OPTICAL PHYSICS
doi:10.1088/0953-4075/44/20/205102
J. Phys. B: At. Mol. Opt. Phys. 44 (2011) 205102 (6pp)
Hans O Karlsson
Quantum Chemistry, Department of Physical and Analytical Chemistry, Uppsala University, Box 518, SE-751 20 Uppsala, Sweden
E-mail: [email protected]

Received 28 April 2011, in final form 5 July 2011
Published 23 September 2011
Online at stacks.iop.org/JPhysB/44/205102

Abstract
The theoretical description of reactive scattering, photodissociation and a number of other problems in chemical physics can be formulated in terms of a correlation function between an initial and final state. It is shown by example that the convergence of correlation functions computed using a complex symmetric Lanczos algorithm can be significantly accelerated by using a complex symmetric version of the Cholesky decomposition. In fact, using the standard Lanczos approach without the Cholesky transformation, the correlation function might not converge at all. It is further demonstrated that a stopping criterion for the Lanczos recursions, based on an estimate for the upper bound of the error of the correlation function, can be extended to complex symmetric matrices and used as a reliable stopping criterion for the Cholesky–Lanczos approach.

(Some figures in this article are in colour only in the electronic version)
0953-4075/11/205102+06$33.00 © 2011 IOP Publishing Ltd. Printed in the UK & the USA

1. Introduction

The Lanczos algorithm (LA) [1] combined with discrete variable discretization (DVR) and absorbing potentials (AP) or smooth exterior scaling (SES) [2] has become a standard tool in chemical physics for computing, e.g., vibrational eigenstates, resonance wavefunctions, photoelectron spectra, correlation functions and reaction probabilities. The problems related to the loss of orthogonality of the Lanczos vectors and how it slows down the convergence of, e.g., eigenstates and cross sections, are well known. Despite this, there has been very little discussion on how to devise a reliable stopping criterion for the recursions when computing correlation functions.

For scattering problems, where AP or SES is used to enforce the correct boundary conditions, the Hamiltonian matrix will be complex symmetric, which means that the LA vectors will be complex bi-orthogonal, i.e. the scalar product is taken without complex conjugation. The resulting Lanczos matrix will remain tridiagonal, but additional numerical instabilities might appear in the recursion due to the complex norm used.

Another well-known issue is that the LA first converges to the extreme eigenstates. Using a sinc-DVR discretization (essentially a particle-in-a-box basis), the largest eigenvalue scales as $E_{\max} \to \hbar^2\pi^2/(2m\Delta x^2) = \hbar^2\pi^2 N^2/(2mL^2)$ as $N \to \infty$ [2]. Here $\Delta x$ is the grid spacing and $L$ the size of the box. In combination with AP or SES, there will thus be very large unphysical complex eigenvalues that will slow down the LA further. A common misconception is that the LA first converges to the eigenstates contained in the start vector. This is, however, not true. After fewer than 30 iterations, the largest eigenstates start to dominate, even if they have negligible components in the start vector [3]. Thus, increasing the number of grid points gives a better resolution of the wavefunctions, but it also increases the kinetic energy of the system, which will slow down the convergence of the recursions.

In this paper, we consider two issues when computing cross sections with the LA. First we show how to improve the convergence of the Lanczos recursions significantly by using a shift-and-invert strategy [4] based on an extension of the Cholesky decomposition to complex symmetric matrices. This is the first time that a complex symmetric Cholesky decomposition is used in combination with the LA. Secondly, we show that the error bound for correlation functions derived by Bai and Ye [5] for Hermitian matrices can be extended to the complex symmetric case. In a previous paper, the application of the Bai–Ye stopping criteria to the standard Lanczos approach [6] was studied. Here we show that the stopping criteria applied to the Cholesky–Lanczos approach can be used to determine the accuracy of the correlation functions and when to stop the recursions.

In section 2, we discuss the complex symmetric LA and the Cholesky decomposition, and propose a stopping criterion. Section 3 contains a numerical experiment to illustrate the concept and in section 4, the results are summarized.
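The growth of the largest sinc-DVR eigenvalue with the number of grid points can be checked numerically. The following is a minimal sketch, assuming the Colbert-Miller form of the sinc-DVR kinetic-energy matrix on an unbounded uniform grid and units with ħ = m = 1; the function name `sinc_dvr_kinetic`, the box length and the grid sizes are illustrative choices that do not come from the paper.

```python
import numpy as np

def sinc_dvr_kinetic(n_points, dx, hbar=1.0, mass=1.0):
    """Sinc-DVR kinetic-energy matrix (Colbert-Miller form, assumed here)."""
    idx = np.arange(n_points)
    diff = idx[:, None] - idx[None, :]
    safe = np.where(diff == 0, 1, diff).astype(float)  # avoid 0/0 on the diagonal
    t = 2.0 * (-1.0) ** np.abs(diff) / safe ** 2        # off-diagonal elements
    t[diff == 0] = np.pi ** 2 / 3.0                     # diagonal elements
    return hbar ** 2 / (2.0 * mass * dx ** 2) * t

box_length = 10.0  # hypothetical box size L
results = {}
for n_points in (50, 100, 200):
    dx = box_length / n_points
    e_max = np.linalg.eigvalsh(sinc_dvr_kinetic(n_points, dx)).max()
    e_limit = np.pi ** 2 / (2.0 * dx ** 2)              # hbar^2 pi^2 / (2 m dx^2)
    results[n_points] = (e_max, e_limit)
    print(n_points, e_max, e_limit, e_max / e_limit)
```

For a fixed box length, doubling the number of grid points roughly quadruples the largest kinetic eigenvalue, which is the unphysical high-energy part of the spectrum that the Lanczos recursion tends to resolve first.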
2. LA and correlation functions

In the following, we assume that the molecular system of interest is discretized using a DVR and that the scattering boundary conditions are invoked either using an absorbing potential $W$,

$$H = T + V - iW, \tag{1}$$

or SES,

$$H = f^{-1} T f^{-1} + V + V_{\mathrm{SES}}. \tag{2}$$

Here $f = dF/dx$, where $F(x)$ is a path in the complex plane [7]. The LA [1], transforming the large $N \times N$ matrix $H$ to a smaller $N_{\mathrm{LA}} \times N_{\mathrm{LA}}$ tridiagonal matrix $J$, is given by

$$\beta_{n+1} q_{n+1} = H q_n - \alpha_n q_n - \beta_n q_{n-1} \tag{3}$$

$$\alpha_n = q_n^T H q_n = J_{n,n} \tag{4}$$

$$\beta_n^2 = q_n^T q_n = J_{n,n-1} = J_{n-1,n}, \tag{5}$$

where $J$ has $\alpha_n$ as on-diagonal and $\beta_n$ as off-diagonal elements. Note that $H$ is assumed to be symmetric and that for a complex symmetric $H$, both $\alpha_n$ and $\beta_n$ are complex. An alternative to the complex symmetric Lanczos approach is to use the Arnoldi algorithm [8]. This leads to an upper Hessenberg, instead of a tridiagonal, matrix. The drawback with the Arnoldi method is that it has a significantly higher memory need since all generated basis vectors must be stored.

The LA recursively builds up a new basis $Q = [q_0\, q_1 \ldots q_n]$ in which the Hamiltonian is represented as a tridiagonal matrix $J = Q^T H Q$ and where $Q^T Q$ is an $N_{\mathrm{LA}} \times N_{\mathrm{LA}}$ unit matrix. Due to numerical round-off errors, the Lanczos vectors lose orthogonality, which slows down the convergence.

In this paper, we are concerned with correlation functions of the type

$$C(E) = \psi_0^T (E + i\epsilon - H)^{-1} \psi_0, \tag{6}$$

although the conclusions are valid for, in principle, every function $f(H)$. Here $\psi_0$ refers to a specific initial state, e.g. the ground state, and it is not an eigenstate of $H$. Utilizing the properties of the LA and using $q_0 = \psi_0$ as a start vector, we have, as shown by Wyatt [9], that

$$C(E) = \left[(E + i\epsilon - J)^{-1}\right]_{1,1}, \tag{7}$$

so we need only to compute the (1, 1) element of the inverse of the Lanczos matrix. This can be done either by computing the eigenstates of $J$, by solving a small system of linear equations or using a continued fraction expansion [10]. We have found that an approach based on solving sparse linear systems is more numerically stable.

2.1. Cholesky decomposition

The convergence of (7) is mainly determined by the spectral properties of $H$ and specifically its largest eigenvalues. To improve on the convergence, we will use a shift-and-invert strategy [4] based on a sparse Cholesky transformation of $H$,

$$\tilde{H} = H - E_0 = M M^T, \tag{8}$$

where $M$ is a sparse lower triangular matrix. The Cholesky transformation can be written as [11]

$$M_{i,i} = \sqrt{\tilde{H}_{i,i} - \sum_{k=1}^{i-1} M_{i,k}^2}, \tag{9}$$

$$M_{j,i} = \left(\tilde{H}_{j,i} - \sum_{k=1}^{i-1} M_{i,k} M_{j,k}\right) \Big/ M_{i,i}, \qquad j > i. \tag{10}$$

Formally the decomposition requires $\tilde{H} = H - E_0$ to be positive definite [11]. However, in this paper, we show that it can be extended to complex symmetric matrices [12]. For full matrices, the Cholesky transformation scales as $N^3$, but from electronic structure theory, where the Cholesky decomposition is used regularly, linear scaling has been reported [13]. It has been shown that further computational speedup can be gained by reordering the rows and columns of the Cholesky matrix [14]. Using equation (8), we have

$$
\begin{aligned}
C(E) &= \psi_0^T (E + i\epsilon - H)^{-1} \psi_0 \\
     &= \psi_0^T (E - E_0 + i\epsilon - (H - E_0))^{-1} \psi_0 \\
     &= \psi_0^T (E - E_0 + i\epsilon - M M^T)^{-1} \psi_0 \\
     &= \phi_0^T ((E - E_0 + i\epsilon) A - 1)^{-1} \phi_0,
\end{aligned}
\tag{11}
$$

where $A = M^{-1} M^{-T}$ and $M \phi_0 = \psi_0$. We can now apply the LA onto $A$ with $q_0 = \phi_0$ to obtain

$$C(E) = \left[((E - E_0 + i\epsilon) J_M - 1)^{-1}\right]_{1,1}. \tag{12}$$

Note that we now, effectively, are working with $(H - E_0)^{-1}$ and thus eigenstates close to $E_0$ will converge first. The matrix–vector product $y = Hx$ is replaced by solving, with similar computational cost, two sets of the sparse linear system $y = M^{-1} M^{-T} x$. Note that both (7) and (12) can be computed for several energies at once.
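The recursion (3)-(5) and the (1, 1)-element formula (7) can be sketched as follows. This is an illustrative dense-matrix sketch, not the author's implementation: the test matrix is a small random complex symmetric matrix standing in for $H = T + V - iW$, the function names are hypothetical, and an explicit factor $\beta_1^2 = \psi_0^T \psi_0$ is kept so that the start vector need not be normalized in the unconjugated product.

```python
import numpy as np

def complex_symmetric_lanczos(h, start, n_steps, reorthogonalize=True):
    """Lanczos recursion with the unconjugated product x^T y (no complex conjugation)."""
    beta1 = np.sqrt(start @ start + 0j)       # complex "norm" of the start vector
    q = start / beta1
    q_prev = np.zeros_like(q)
    basis, alphas, betas = [q], [], []
    beta = 0.0
    for step in range(n_steps):
        r = h @ q - beta * q_prev
        alpha = q @ r                          # alpha_n = q_n^T H q_n
        r = r - alpha * q
        if reorthogonalize:                    # full re-orthogonalization, also unconjugated
            for v in basis:
                r = r - (v @ r) * v
        alphas.append(alpha)
        if step == n_steps - 1:
            break
        beta = np.sqrt(r @ r + 0j)             # beta_{n+1}
        betas.append(beta)
        q_prev, q = q, r / beta
        basis.append(q)
    j = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    return j, beta1

def correlation_from_lanczos(j, beta1, energy, eps):
    """beta1^2 times the (1,1) element of (E + i*eps - J)^{-1}, via one small linear solve."""
    k = j.shape[0]
    e1 = np.zeros(k, dtype=complex)
    e1[0] = 1.0
    return beta1 ** 2 * np.linalg.solve((energy + 1j * eps) * np.eye(k) - j, e1)[0]

# Hypothetical test problem: real symmetric part minus i times a non-negative diagonal.
rng = np.random.default_rng(0)
n = 30
b = rng.standard_normal((n, n))
h = (b + b.T) / 2 - 1j * np.diag(rng.uniform(0.0, 0.5, n))
psi0 = rng.standard_normal(n).astype(complex)
energy, eps = 0.3, 0.2

exact = psi0 @ np.linalg.solve((energy + 1j * eps) * np.eye(n) - h, psi0)
j, beta1 = complex_symmetric_lanczos(h, psi0, n)
approx = correlation_from_lanczos(j, beta1, energy, eps)
print(abs(approx - exact))
```

Running the recursion for as many steps as the matrix dimension, with full re-orthogonalization, makes $J$ similar to $H$, so the two values agree to round-off; with fewer steps the same function gives the truncated approximation whose convergence the paper studies.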
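The factorization (9)-(10) and the rewriting of the correlation function in (11) can be verified on a small dense example. The sketch below assumes a principal complex square root in (9), no pivoting, and a shift $E_0$ placed below the spectrum of the real part so that the pivots stay away from zero; the function name, the random matrix and the parameter values are illustrative, and a production code would use a sparse factorization instead.

```python
import numpy as np

def complex_symmetric_cholesky(h_shifted):
    """Factor a complex symmetric matrix as M M^T (plain transpose, no conjugation)."""
    n = h_shifted.shape[0]
    m = np.zeros((n, n), dtype=complex)
    for i in range(n):
        # Equation (9): diagonal element, principal complex square root
        m[i, i] = np.sqrt(h_shifted[i, i] - m[i, :i] @ m[i, :i] + 0j)
        # Equation (10): elements below the diagonal in column i
        m[i + 1:, i] = (h_shifted[i + 1:, i] - m[i + 1:, :i] @ m[i, :i]) / m[i, i]
    return m

# Hypothetical complex symmetric Hamiltonian H = S - iW
rng = np.random.default_rng(1)
n = 20
b = rng.standard_normal((n, n))
s = (b + b.T) / 2
h = s - 1j * np.diag(rng.uniform(0.0, 0.5, n))
e0 = np.linalg.eigvalsh(s).min() - 1.0         # shift below the real-part spectrum
h_shifted = h - e0 * np.eye(n)
m = complex_symmetric_cholesky(h_shifted)

psi0 = rng.standard_normal(n).astype(complex)
energy, eps = 0.3, 0.2
z = energy - e0 + 1j * eps

# Left-hand side of (11): direct resolvent of H
c_direct = psi0 @ np.linalg.solve((energy + 1j * eps) * np.eye(n) - h, psi0)

# Right-hand side of (11): A = M^{-1} M^{-T}, M phi0 = psi0
m_inv = np.linalg.inv(m)
a = m_inv @ m_inv.T
phi0 = np.linalg.solve(m, psi0)
c_cholesky = phi0 @ np.linalg.solve(z * a - np.eye(n), phi0)
print(abs(c_direct - c_cholesky))
```

Forming $A$ explicitly is done here only to check the identity; in the Lanczos recursion the product $y = Ax$ would be applied as two triangular solves with $M^T$ and $M$.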
2.2. Stopping criteria

One problem with the LA for computing correlation functions $C(E)$ is to determine when the result has converged. Due to the loss of orthogonality, there is no guarantee that the result will improve if the number of recursions is increased. A stopping criterion that indicates when the recursions no longer improve on the resulting correlation function is thus much needed. An estimate of the error for a transfer function of the form (11) was derived by Bai and Ye [5] for a Hermitian matrix $H$. The error between the exact result $C(E)$, equation (11), and the result after $k$ recursions $C_k(E)$ with the LA, equation (12), was shown to be

$$|C(E) - C_k(E)| = \beta_1^2 E^2 \tau_{1k}^2(E)\, q_{k+1}^T (EH - 1)^{-1} q_{k+1}, \tag{13}$$

where $\tau_{1k}(E) = \left[(E J_k - 1)^{-1}\right]_{1k}$. Note that $\tau_{1k}(E)$ is trivially computed at the same time as $C_k(E)$. The last term of equation (13) is difficult to estimate, but if $|E| < 1/\|H\|$, it can be simplified to

$$|C(E) - C_k(E)| = \beta_1^2 E^2 \tau_{1k}^2(E)\, \|q_{k+1}\|^2 (|E|\,\|H\| - 1)^{-1}, \tag{14}$$

where the matrix norm $\|H\|$ can be estimated from the largest eigenvalue of $J_k$. The idea here is to see if the error bound (14) and its simplified form

$$|C(E) - C_k(E)| = \beta_1^2 E^2 \tau_{1k}^2(E)\, \|q_{k+1}\|^2 \tag{15}$$

can be used as a stopping criterion for the Lanczos recursions. The advantage of the last relation (15) is that it works for all energies with the drawback that it does not necessarily provide an upper bound to the error. Close to convergence, the error decreases exponentially, thus giving an indication that there will be no further improvement of the correlation function.

Figure 1. Lanczos eigenvalues after 20 recursions using the initial state (17) and the original Hamiltonian. (Axes: Real E versus number of recursions.)

Figure 2. Lanczos eigenvalues after 20 recursions using the initial state (17) and using the Cholesky decomposition (8). Note the different scale compared to figure 1. (Axes: Real E versus number of recursions.)
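How the estimates (14) and (15) can be accumulated alongside $C_k$ is sketched below. Several readings are assumptions made here and not statements from the paper: $q_{k+1}$ is taken as the unnormalized new Lanczos vector, its norm is the Euclidean one, magnitudes are taken of the complex prefactors, and the last factor of (14) is used in the form $1/(1 - |z|\,\|A\|)$ so that it is positive when $|z| < 1/\|A\|$. The matrix `a`, the argument `z` and the function name are illustrative stand-ins for $A$, $E - E_0 + i\epsilon$ and the Cholesky–Lanczos recursion.

```python
import numpy as np

def lanczos_with_estimates(a, phi0, z, n_steps):
    """Return [(C_k, estimate)] for C = phi0^T (z A - 1)^{-1} phi0 with complex symmetric A."""
    beta1_sq = phi0 @ phi0                      # unconjugated product
    q = phi0 / np.sqrt(beta1_sq + 0j)
    q_prev = np.zeros_like(q)
    beta = 0.0
    alphas, betas, out = [], [], []
    for k in range(1, n_steps + 1):
        r = a @ q - beta * q_prev
        alpha = q @ r
        r = r - alpha * q                       # unnormalized next Lanczos vector
        alphas.append(alpha)
        j = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
        e1 = np.zeros(k, dtype=complex)
        e1[0] = 1.0
        x = np.linalg.solve(z * j - np.eye(k), e1)   # first column of (z J_k - 1)^{-1}
        c_k = beta1_sq * x[0]
        tau_1k = x[-1]                          # (k,1) element; equals (1,k) since J_k is symmetric
        estimate = abs(beta1_sq * z ** 2 * tau_1k ** 2) * np.linalg.norm(r) ** 2
        out.append((c_k, estimate))
        beta = np.sqrt(r @ r + 0j)
        betas.append(beta)
        q_prev, q = q, r / beta
    return out

# Hypothetical complex symmetric test matrix with norm close to one
rng = np.random.default_rng(2)
n = 60
b = rng.standard_normal((n, n))
a = (b + b.T) / (2.0 * np.sqrt(2.0 * n)) + 0.05j * np.diag(rng.uniform(0.0, 1.0, n))
phi0 = rng.standard_normal(n).astype(complex)
z = 0.5 + 0.1j

c_exact = phi0 @ np.linalg.solve(z * a - np.eye(n), phi0)
norm_a = np.linalg.norm(a, 2)
history = lanczos_with_estimates(a, phi0, z, 10)
errors = [abs(c_exact - c_k) for c_k, _ in history]
estimates = [est for _, est in history]               # form (15)
bounds = [est / (1.0 - abs(z) * norm_a) for est in estimates]  # form (14), as read above
for k, (err, est) in enumerate(zip(errors, estimates), start=1):
    print(k, err, est)
```

In practice the recursion would be stopped once the estimate drops below a chosen tolerance; the exact error is computed here only to compare against the estimate on a problem small enough to solve directly.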
3. Numerical experiments

The aim of this paper is twofold: first, to investigate the effect on the convergence of $C(E)$, equation (6), when combining the LA with the Cholesky decomposition, equation (12), of a complex symmetric Hamiltonian matrix; secondly, to see if the error bounds, equations (14) and (15), derived for a Hermitian matrix, can be used as a stopping criterion for complex symmetric matrices. Two model systems will be considered. The first model system has been used previously to study the convergence of the complex symmetric LA [2, 15]. For this system, we can compare the LA with and without full orthogonalization, with and without the Cholesky decomposition and with a numerically exact result. The second model is the collinear H + H2 reaction, which has become a benchmark system for computing reaction probabilities. The Hamiltonian matrix is sparse and can be made large enough that loss of orthogonality between the Lanczos vectors becomes an issue.
3.1. Model problem in one dimension

The first model problem is the single-channel one-dimensional shape-resonance potential [2, 15]:

$$V(r) = 7.5\, r^2 e^{-r}. \tag{16}$$

The correlation function is computed with respect to an initial state

$$\psi_0(r) = \left(\frac{4\alpha^3}{\pi}\right)^{1/4} (r - r_0)\, e^{-\alpha (r - r_0)^2/2} \tag{17}$$

with $\alpha = 4$ and $r_0 = 2$ [15]. The system is discretized using a sinc-DVR for the kinetic energy and using $N = 300$ grid points and an AP [2, 15]. Only eigenstates with energies below 60 au have non-negligible components in the start vector $\psi_0$. Despite this, the Lanczos recursions tend to converge to the extreme eigenstates first. This is clearly illustrated in figure 1, where the Lanczos eigenvalues after 20 recursions are plotted. If we instead use the Cholesky decomposition (8) and use a shift-and-invert strategy, the result is completely different, as shown in figure 2. The lowest eigenstates, those of physical interest, converge first. Note the different scales on the axes in figures 1 and 2. It is thus clear that regardless of how we choose the initial state, the LA will converge to the extreme states, which will slow down convergence significantly and can even lead to a non-converged result [2, 3].

Figure 3. Convergence of the correlation function (6) for the LA. (a) With full orthogonalization. Standard Lanczos (7): black solid line. Lanczos with Cholesky (12): red dashed line. (b) With no re-orthogonalization. Standard Lanczos (7): black solid line. Lanczos with Cholesky (12): red dashed line. (Axes: error versus number of recursions.)

In figure 3, we compare the convergence of the correlation function using the standard approach (7) with the Cholesky approach (12). The results are shown for an energy of $E = 4$ au, but similar results were obtained for all relevant energies. The norm of the matrix was $\|H\| = 711$. In both cases, we consider full re-orthogonalization and no re-orthogonalization. For full orthogonalization, Lanczos needs $N_{\mathrm{LaF}} = N = 300$ recursions to converge, whereas for Cholesky–Lanczos only $N_{\mathrm{CLaF}} = 39$ recursions are needed. Convergence slows down without re-orthogonalization, as expected, and the standard Lanczos needs $N_{\mathrm{LaNo}} = 414$ recursions to converge, whereas the use of a Cholesky decomposition gives a converged result after only $N_{\mathrm{CLaNo}} = 62$ recursions. This emphasizes the need to suppress the unphysical high-energy states that slow down the convergence of the LA. The low convergence rate is likely to become more severe when the size of the matrix increases. The combination of the LA with a complex symmetric Cholesky-type decomposition (12) provides superior convergence rates compared to the standard approach (7).

Figure 4. Comparison between the true error and the error estimate (15) for the Cholesky–Lanczos approach, equation (11). (a) Full orthogonalization. Exact error: black solid line. Error estimate from equation (15): red dashed line. (b) No orthogonalization. Exact error: black solid line. Error estimate from equation (15): red dashed line. (Axes: error versus number of recursions.)

Figure 5. Comparison between the true error and the error estimate (15) for the standard Lanczos approach, equation (6). (a) Full orthogonalization. Exact error: black solid line. Error estimate from equation (15): red dashed line. (b) No orthogonalization. Exact error: black solid line. Error estimate from equation (15): red dashed line. (Axes: error versus number of recursions.)

The next issue to be investigated is whether the error bound (15) can be used as a stopping criterion for the Lanczos recursions. In figure 4, we plot the exact error (solid line) with the error estimate from (15) (dashed line) for the Cholesky decomposition. Both with and without re-orthogonalization, the error estimate initially overestimates the true error, but when the correlation function starts to converge, the error estimate also drops fast and provides a good estimate for the true error. It can thus be used as a stopping criterion, resulting in only a few more recursions than given by the exact error. For the standard Lanczos approach, the initial error estimate is far off, as seen in figure 5, but the same behaviour close to convergence is observed, and the estimate can also be considered a good choice of stopping criterion.

3.2. The collinear H + H2 system

The hydrogen exchange reaction has become a benchmark problem for testing and validating novel computational schemes for reaction dynamics. Here normal mode coordinates of the transition state are used together with DVR discretization and AP [16], leading to a sparse complex symmetric Hamiltonian matrix. The parameters were chosen such that the size of the matrix was $N = 3200$. In the example below, an energy of $E = 1.4$ eV was used but similar results were obtained for other energies. As an initial state, an incoming scattering state in its vibrational ground state was used. Other initial states were tested with a similar result. Using the Cholesky decomposition approach (11), the correlation function converged after 250 iterations, as illustrated in figure 6, whereas the convergence of the standard Lanczos approach comes to a standstill after 600 recursions at an error of $1.5 \times 10^{-6}$ and never converges further. This is due to loss of orthogonality but also due to numerical instability caused by the use of the complex symmetric scalar product. For the Cholesky approach, the stopping criterion based on (15) also works here, as shown in figure 7. For the standard Lanczos approach, it no longer provides an upper bound, as seen in figure 8, but can be used to indicate when the recursion has come to a standstill.

Figure 6. Comparison for the H + H2 problem between the standard Hamiltonian (black solid line) versus the Cholesky (red dashed line). $N = 3200$. Note that the standard Lanczos approach does not converge beyond $1.5 \times 10^{-6}$. (Axes: error versus number of recursions.)

Figure 7. Convergence of the correlation function (11) for the collinear H3 system using the LA with no re-orthogonalization. Solid line: exact error. Dashed line: error estimate (15). $N = 3200$. (Axes: exact error vs estimate versus number of recursions.)
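For the one-dimensional model of section 3.1, the potential $V(r) = 7.5\, r^2 e^{-r}$ and the initial state can be tabulated on a grid as a quick consistency check. The sketch assumes the initial state has the form $(4\alpha^3/\pi)^{1/4}(r - r_0)\,e^{-\alpha(r - r_0)^2/2}$ with $\alpha = 4$ and $r_0 = 2$; the symbol $\alpha$, the grid extent `r_max` and the function names are assumptions made for illustration, since the grid range and the absorbing potential are not specified in the text.

```python
import numpy as np

def shape_resonance_potential(r):
    """Model potential V(r) = 7.5 r^2 exp(-r)."""
    return 7.5 * r ** 2 * np.exp(-r)

def initial_state(r, alpha=4.0, r0=2.0):
    """Assumed form of the initial state: normalized, odd about r0."""
    prefactor = (4.0 * alpha ** 3 / np.pi) ** 0.25
    return prefactor * (r - r0) * np.exp(-alpha * (r - r0) ** 2 / 2.0)

n_points = 300           # number of grid points quoted in the text
r_max = 12.0             # hypothetical grid extent, not given in the text
dr = r_max / n_points
r = dr * np.arange(1, n_points + 1)

v = shape_resonance_potential(r)
psi0 = initial_state(r)

norm_sq = np.sum(psi0 ** 2) * dr         # should be close to 1 for a normalized state
barrier_r = r[np.argmax(v)]              # location of the barrier maximum on the grid
barrier_height = v.max()
print(norm_sq, barrier_r, barrier_height)
```

With this prefactor the state is normalized, and the barrier maximum sits at $r = 2$ with height $30\,e^{-2} \approx 4.06$ au, so the energy $E = 4$ au used for figure 3 lies just below the barrier top.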
Figure 8. Convergence of the correlation function (6) for the collinear H3 system using the LA with no re-orthogonalization. Solid line: exact error. Dashed line: error estimate (15). $N = 3200$. (Axes: exact error vs estimate versus number of recursions.)

4. Summary and conclusions

The convergence of the LA is to a large extent related to its extreme eigenvalues and is independent of the initial state. Even if the initial state is given as a linear combination of the lowest few eigenstates, the Lanczos recursions are quickly contaminated by the extreme eigenstates, which slows down the convergence. It might even lead to a situation where the correlation function never converges. In this paper, we have shown that by using a Cholesky decomposition of a complex symmetric matrix, the convergence of the Lanczos recursions for $C(E)$ can be accelerated significantly. The Cholesky transformation (8) adds an extra computational cost but for sparse matrices and medium-sized systems, it will provide an invaluable speedup. For a very large system, approaches based on an approximate iterative implementation of the shift-and-invert method might be more efficient [17]. We have also shown that the error estimate by Bai and Ye [5] can be used as a stopping criterion, at least in combination with the Cholesky decomposition. Thus, it will be possible to determine a priori when to stop the recursions.

References

[1] Lanczos C 1950 J. Res. Natl Bur. Stand. 45 255
[2] Karlsson H O 2009 J. Phys. B: At. Mol. Opt. Phys. 42 125205
[3] Karlsson H O 2007 J. Chem. Phys. 126 084105
[4] Ericsson T and Ruhe A 1980 Math. Comput. 35 1251
[5] Bai Z and Ye Q 1998 Electron. Trans. Numer. Anal. 7 1
[6] Karlsson H O 2007 J. Phys. Chem. A 111 10263
[7] Karlsson H O 1998 J. Chem. Phys. 108 3849
[8] Arnoldi W E 1951 Q. Appl. Math. 9 17
[9] Wyatt R W 1989 Adv. Chem. Phys. 73 231
[10] Haydock R 1980 Solid State 35 215
[11] Golub G H and van Loan C F 1996 Matrix Computations (Baltimore, MD: Johns Hopkins University Press)
[12] Béreux N 2005 Linear Algebr. Appl. 404 193
[13] Schweizer S, Kussman J, Doser B and Ochsenfeld C 2007 J. Comput. Chem. 29 1004
[14] Brandhorst K and Head-Gordon M 2011 J. Chem. Theory Comput. 7 351
[15] Grozdanov T P, Bouakline F, Andric L and McCarroll R 2004 J. Phys. B 37 1737
[16] Thompson W H and Miller W H 1993 Chem. Phys. Lett. 206 123
[17] Poirier B and Carrington Jr T 2002 J. Chem. Phys. 116 1215