Procedia Computer Science, Volume 80, 2016, Pages 2266–2270
ICCS 2016, The International Conference on Computational Science
Preconditioning Large Scale Iterative Solution of Ax = b Using a Statistical Method with Application to Matrix-Free Spectral Solution of Helmholtz Equation

Arash Ghasemi and Lafayette K. Taylor
SimCenter - Center of Excellence in Applied Computational Science and Engineering, Chattanooga, Tennessee, USA
[email protected]
Abstract

The problem of preconditioning the linear system Ax = b, where no entries of A are known but the matrix-vector product is available in the linear functional form Ax = L(x), is considered. A statistical multi-regressor method is proposed whose accuracy improves during the Krylov iterations. It is shown that this method effectively improves the rate of convergence of the GMRES algorithm. The results are validated by applying the proposed preconditioner to a pseudo-spectral solution of the Helmholtz equation.

Keywords: Newton-Krylov, Preconditioning, Jacobian-free, GMRES, Regression Analysis
1 Introduction
Nowadays, the iterative solution of large-scale non-symmetric sparse linear systems is computationally feasible using advanced iterative methods. Popular choices are the Biconjugate Gradient Stabilized method (BiCGSTAB) [14], the more general Induced Dimension Reduction (IDR) methods [12, 11], and the Generalized Minimal Residual method (GMRES) [9]. The reader is referred to [8, 10, 4, 2] for a comprehensive review, comparison, and implementation of many iterative solvers. Depending on the structure of the matrix and the ratio of its largest to smallest singular values, the number of required matrix-vector products may increase dramatically. To remedy this, it is important to invoke a preconditioning method; otherwise the iterative solution can become as costly as a direct solution. Preconditioning techniques are well developed when the matrix A is explicitly available. A very effective and general-purpose preconditioner can be obtained from an incomplete LU factorization of the sparse matrix [8]. However, when the matrix A is not available, or is very costly or difficult to compute, other preconditioning strategies must be used. For example, Jacobian-free (matrix-free) implementations usually lead to a situation where the matrix A is not available.
doi:10.1016/j.procs.2016.05.406
To show this, consider the general form of linear time-dependent problems,
\[
\frac{\partial u}{\partial t} = R = L(t, u) = \sum_{i} \sum_{k} \alpha_{ki} \, \frac{\partial^{k} u}{\partial x_i^{k}}, \tag{1}
\]
where the steady-state residual vector R contains high-order derivatives of the dependent variable u(t, x_1, x_2, ..., x_n) with respect to the independent variables. When discretized using a suitable method such as finite differences, finite elements, or finite volumes [5], Eq. (1) can be written in the following semi-discrete form:
\[
\frac{d\mathbf{u}}{dt} = \mathbf{L}\,\mathbf{u}. \tag{2}
\]
The fully discrete form is obtained after Eq. (2) is discretized in time. There are many choices available for the time discretization [3]. However, a common step in these algorithms, and the most costly part of the solution, is simply the implicit Euler scheme
\[
\left(\mathbf{I} - \Delta t\, \mathbf{L}\right) \mathbf{u}^{n+1} = \mathbf{u}^{n}, \tag{3}
\]
where A = (I − ΔtL) is typically a huge sparse matrix for practical problems. For this reason, a direct solution of Eq. (3) is almost impossible, and thus an iterative algorithm (as mentioned before) is usually used. The important point is that for many practical problems the matrix A in Eq. (3) is so large that it does not fit in the physical memory of current machines. This situation typically arises in high Reynolds number Large Eddy Simulation (LES) or Direct Numerical Simulation (DNS) of turbulent fluid flow [7]. Yet another possibility is a matrix-free implementation. In this approach, the matrix A is never computed or stored; instead, Eq. (3) is solved by replacing the matrix-vector product in the iterative algorithm with residual computations [6]. However, a disadvantage of this approach is that preconditioning is hard, since the entries a_ij of A are not available [6]. This is the motivation for the current work, where a simple statistical method is used to estimate A by multiple regression analysis of the history of matrix-vector products obtained in the Krylov iterations.
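To make the matrix-free setting concrete, the following minimal Python sketch (an illustration under our own naming, not the authors' implementation) solves Eq. (3) with SciPy's GMRES while A = I − ΔtL is available only through a function applying L; the function apply_L below is a hypothetical stand-in for a residual computation.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

n, dt = 200, 1.0e-3

def apply_L(x):
    # Hypothetical stand-in for a residual computation: a matrix-free
    # second-difference (1-D Laplacian-like) stencil.
    y = -2.0 * x
    y[1:] += x[:-1]
    y[:-1] += x[1:]
    return y

def matvec(x):
    # Action of A = I - dt*L on a vector; A itself is never formed.
    return x - dt * apply_L(x)

A = LinearOperator((n, n), matvec=matvec, dtype=float)
u_old = np.ones(n)              # plays the role of u^n in Eq. (3)
u_new, info = gmres(A, u_old)   # info == 0 indicates convergence
```

Since only the matrix-vector action is exposed here, classical preconditioners such as incomplete LU cannot be built directly; this is exactly the gap the statistical preconditioner is meant to fill.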
2 The Statistically Preconditioned GMRES Algorithm
The details of Progressive Statistical Preconditioning (PSP) for GMRES are given in Alg. (1) and Alg. (2) below. To solve the linear system Eq. (3), the original GMRES algorithm starts as usual in Alg. (1). However, whenever a matrix-vector product is performed (lines 3 and 8 of Alg. (1)), the input vector is stored in the i-th column of the matrix Px, i.e. Px(:, i), and the result in the corresponding column of Py, i.e. Py(:, i). Therefore the statistical sample (dataset) of Py versus Px is obtained progressively, in the sense that the more matrix-vector products are performed, the better the sample becomes. It should be noted that in Jacobian-free methods, the matrix-vector products in lines 3 and 8 should be replaced with residual computations. Now, the primary goal is to find a relation between Py and Px with minimum error relative to the exact relation Py = APx. This relation, estimated using a banded matrix N such that Py ≈ NPx, is then used as a preconditioner in Alg. (1) when Eq. (3) is solved at the next time step, or at the next restart within the same physical time. The off-diagonal entries on each row of A are thus reduced to a banded matrix N whose inverse is cheap and can be applied efficiently with the LAPACK band-diagonal solver [1]. This reduction procedure is described in Alg. (2) as a multi-regression estimation of the matrix Py versus the matrix Px. As shown, the outputs Px and Py of the GMRES algorithm are sent to MREP (the MultiRegressor Estimator of the Preconditioner), where for d = 1 a tridiagonal estimation (three regressors) is enforced. The emphasis on the three-regressor model rests on the fact that it leads to a tridiagonal matrix N, which can be solved in O(n) FLOPs using the Thomas algorithm. Thus this three-regressor model is expected to be very efficient when the preconditioned equations are solved in lines 7 and 38 of Alg. (1).
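To illustrate the sampling step, a minimal Python sketch follows; the wrapper class and its names are our own illustration, assuming a matrix-free operator apply_A. It records every matrix-vector product encountered during the Krylov iterations as a column pair of (Px, Py):

```python
import numpy as np

class MatvecRecorder:
    """Wrap a matrix-free operator and log every (x, A@x) pair.

    The logged columns form the progressive statistical sample
    (Px, Py) later used to estimate the preconditioner N.
    """
    def __init__(self, apply_A):
        self.apply_A = apply_A    # function x -> A @ x (e.g. a residual routine)
        self.xs, self.ys = [], []

    def __call__(self, x):
        y = self.apply_A(x)
        self.xs.append(x.copy())  # column of Px
        self.ys.append(y.copy())  # column of Py
        return y

    def sample(self):
        # Return Px, Py with one column per recorded matvec.
        return np.column_stack(self.xs), np.column_stack(self.ys)
```

Any Krylov solver driven through such a wrapper accumulates the progressive sample at essentially no extra cost beyond the copies.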
Algorithm 1: Restarted PSP-GMRES

 1  subroutine (X, R, Px, Py) ← PSPGMRES(A, B, N, ε, nA, X0, nr)
 2  for l ← 1 to nr do
 3      Px(:, i) ← X0,  Py(:, i) ← A X0
 4      R̄ ← B − Py(:, i),  i ← i + 1
 5      V(:, 1) ← R̄ / ‖R̄‖,  G(1) ← ‖R̄‖
 6      for k ← 1 to nA do
 7          G(k + 1) ← 0,  Yk ← N⁻¹ V(:, k)
 8          Px(:, i) ← Yk,  Py(:, i) ← A Yk
 9          Uk ← Py(:, i),  i ← i + 1
10          for j ← 1 to k do
11              H(j, k) ← V(:, j)ᵀ Uk
12              Uk ← Uk − H(j, k) V(:, j)
13          end
14          H(k + 1, k) ← ‖Uk‖
15          V(:, k + 1) ← Uk / H(k + 1, k)
16          for j ← 1 to k − 1 do
17              δ ← H(j, k)
18              H(j, k) ← δ C(j) + S(j) H(j + 1, k)
19              H(j + 1, k) ← −δ S(j) + C(j) H(j + 1, k)
20          end
21          γ ← √(H(k, k)² + H(k + 1, k)²)
22          C(k) ← H(k, k) / γ
23          S(k) ← H(k + 1, k) / γ
24          H(k, k) ← γ,  H(k + 1, k) ← 0
25          δ ← G(k)
26          G(k) ← C(k) δ + S(k) G(k + 1)
27          G(k + 1) ← −S(k) δ + C(k) G(k + 1)
28          R(k) ← |G(k + 1)|
29          if R(k) ≤ ε then
30              finish-flag ← true
31              break
32          end
33      end
34      Q ← H⁻¹ G
35      for j ← 1 to k do
36          Zk ← Zk + Q(j) V(:, j)
37      end
38      X ← X0 + N⁻¹ Zk,  X0 ← X
39      clean G, V, H, C, S
40      if finish-flag then break
41  end
Algorithm 2: MREP, the MultiRegressor Estimator of the Preconditioner

 1  function N ← MREP(Px, Py, d)
 2  n ← nrows(Px),  m ← ncols(Px)
 3  for i ← 1 to d do
 4      (β0, β1) ← linear_fit(Px(i, :), Py(i, :))
 5      N(i, i) ← β1
 6  end
 7  for i ← (d + 1) to (n − d) do
 8      P*x ← [ 1_{1×m} ; Px(i − d, :) ; … ; Px(i − 1, :) ; Px(i, :) ; Px(i + 1, :) ; … ; Px(i + d, :) ]
 9      P*y ← Py(i, :)
10      N* ← (P*x (P*x)ᵀ)⁻¹ P*x (P*y)ᵀ
11      for j ← 2 to 2(d + 1) do
12          N(i, i + j − d − 2) ← N*(j)
13      end
14  end
15  for i ← (n − d + 1) to n do
16      (β0, β1) ← linear_fit(Px(i, :), Py(i, :))
17      N(i, i) ← β1
18  end
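As a concrete reading of Alg. (2), the following Python sketch (illustrative names; edge rows are handled with a truncated band rather than the simple linear fit of lines 3-6 and 15-18) estimates the banded N by per-row least squares and applies N⁻¹ through SciPy's interface to the LAPACK band solver [1]:

```python
import numpy as np
from scipy.linalg import solve_banded

def mrep(Px, Py, d=1):
    """Estimate a banded N with Py ~= N @ Px by per-row regression."""
    n, m = Px.shape
    N = np.zeros((n, n))
    for i in range(n):
        lo, hi = max(0, i - d), min(n, i + d + 1)
        # Regressors: a row of ones (intercept) plus the rows of Px
        # in the band around row i; least squares absorbs the rest.
        X = np.vstack([np.ones((1, m)), Px[lo:hi, :]])
        coef, *_ = np.linalg.lstsq(X.T, Py[i, :], rcond=None)
        N[i, lo:hi] = coef[1:]          # drop the intercept beta_0
    return N

def apply_N_inverse(N, v, d=1):
    """Solve N y = v with the LAPACK band solver via SciPy."""
    n = N.shape[0]
    ab = np.zeros((2 * d + 1, n))       # diagonal-ordered banded storage
    for k in range(-d, d + 1):
        ab[d - k, max(k, 0):n + min(k, 0)] = np.diag(N, k)
    return solve_banded((d, d), ab, v)
```

For d = 1 this yields the tridiagonal, O(n)-solvable preconditioner used in the benchmark below.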
Figure 1: The matrix-free pseudo-spectral solution of the Helmholtz equation using GMRES and statistically preconditioned GMRES. Left: the converged solution. Right: the effect of statistical preconditioning, shown as residuals (log scale, 10^-6 to 10^8) versus iterations (0 to 1200) for GMRES and PSP-GMRES. The restart happens at iteration 381, where the preconditioner estimated statistically during 1 ≤ ITR < 380 is then applied to GMRES.
3 Results
Algorithms (1) and (2) are implemented in a computer program. The two-dimensional Helmholtz equation u_xx + u_yy + k²u = f(x, y), with f = 10 sin(x(y − 1)), k = 8, and homogeneous boundary conditions, is solved using a matrix-free pseudo-spectral method. The fully discrete form can be written using the Kronecker product as
\[
\left( \mathbf{I} \otimes \mathbf{D}^2 + \mathbf{D}^2 \otimes \mathbf{I} + k^2\, \mathbf{I} \otimes \mathbf{I} \right) \mathbf{u} = \mathbf{f},
\]
where D is the first-order Chebyshev differentiation matrix [13]. In the matrix-free implementation, two differentiation sweeps, one in each of the x and y directions, are performed in the residual calculation. The results are presented in Fig. (1) for a 39×39 discretization, which yields a system with 1521 unknowns. The number of search vectors is set to 380, after which GMRES is restarted. The strong stretching of the Chebyshev grid near the wall degrades the condition number, and hence a preconditioner seems to be necessary for this problem. This is illustrated in Fig. (1), where the original GMRES algorithm stagnates. The statistical preconditioning yields improved convergence after the restart at ITR = 381.
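For reference, a short Python sketch of this setup is given below, assuming Trefethen's standard construction of the Chebyshev differentiation matrix [13]; variable names are illustrative, and boundary handling (stripping boundary rows and columns for the homogeneous Dirichlet conditions) is omitted for brevity.

```python
import numpy as np

def cheb(n):
    """Chebyshev differentiation matrix D and points x (Trefethen [13])."""
    if n == 0:
        return np.zeros((1, 1)), np.array([1.0])
    x = np.cos(np.pi * np.arange(n + 1) / n)
    c = np.r_[2.0, np.ones(n - 1), 2.0] * (-1.0) ** np.arange(n + 1)
    X = np.tile(x, (n + 1, 1)).T
    dX = X - X.T
    D = np.outer(c, 1.0 / c) / (dX + np.eye(n + 1))
    D -= np.diag(D.sum(axis=1))       # negative-sum trick for the diagonal
    return D, x

n, k = 40, 8.0
D, x = cheb(n)
D2 = D @ D                            # second-derivative operator

def helmholtz_matvec(u_flat):
    # Matrix-free action of the Helmholtz operator on the tensor grid:
    # one differentiation sweep per direction, no Kronecker matrix formed.
    U = u_flat.reshape(n + 1, n + 1)
    return (U @ D2.T + D2 @ U + k**2 * U).ravel()
```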
4 Conclusions and Acknowledgment
A new approach for preconditioning the iterative solution of Ax = b when absolutely no information is known about the matrix A is presented. It should be noted that the number of regressors used in Alg. (2) was chosen to be three (d = 1) in the presented benchmark problem in order to obtain a tridiagonal preconditioner. According to Fig. (1), the three-regressor model presented here generates impressive results in improving the convergence of the original GMRES algorithm. However, there might be situations where adding more regressors, i.e. d > 1, would result in an improved convergence rate.

The work was supported by the Tennessee Higher Education Commission (THEC) Center of Excellence in Applied Computational Science and Engineering (CEACSE). The support is greatly appreciated.
References

[1] E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen. LAPACK Users' Guide. Society for Industrial and Applied Mathematics, Philadelphia, PA, third edition, 1999.
[2] R. Barrett, M. Berry, T. F. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine, and H. Van der Vorst. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, 2nd edition. SIAM, Philadelphia, PA, 1994.
[3] J. C. Butcher. Numerical Methods for Ordinary Differential Equations. Wiley, 2008.
[4] A. Greenbaum. Iterative Methods for Solving Linear Systems. Frontiers in Applied Mathematics. Society for Industrial and Applied Mathematics, 1997.
[5] C. Johnson. Numerical Solution of Partial Differential Equations by the Finite Element Method. Dover Books on Mathematics. Dover Publications, 2012.
[6] D. A. Knoll and D. E. Keyes. Jacobian-free Newton-Krylov methods: A survey of approaches and applications. J. Comput. Phys., 193(2):357–397, January 2004.
[7] M. Lesieur, O. Métais, and P. Comte. Large-Eddy Simulations of Turbulence. Cambridge University Press, 2005.
[8] Y. Saad. Iterative Methods for Sparse Linear Systems, second edition. Society for Industrial and Applied Mathematics, 2003.
[9] Y. Saad and M. H. Schultz. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 7(3):856–869, July 1986.
[10] Y. Saad and H. A. van der Vorst. Iterative solution of linear systems in the 20th century. Journal of Computational and Applied Mathematics, 123(1–2):1–33, 2000. Numerical Analysis 2000, Vol. III: Linear Algebra.
[11] G. L. G. Sleijpen, P. Sonneveld, and M. B. van Gijzen. Bi-CGSTAB as an induced dimension reduction method. Appl. Numer. Math., 60(11):1100–1114, November 2010.
[12] P. Sonneveld and M. B. van Gijzen. IDR(s): A family of simple and fast algorithms for solving large nonsymmetric systems of linear equations. SIAM Journal on Scientific Computing, 31(2):1035–1062, 2009.
[13] L. N. Trefethen. Spectral Methods in MATLAB. Software, Environments, and Tools. Society for Industrial and Applied Mathematics, 2000.
[14] H. A. van der Vorst. Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 13(2):631–644, March 1992.