TOWARDS PROBLEM-INDEPENDENT MULTIGRID CONVERGENCE RATES FOR UNSTRUCTURED MESH METHODS I: INVISCID AND LAMINAR VISCOUS FLOWS

Carl F. Ollivier-Gooch^1
National Research Council, Moffett Field, CA 94035

Prepared for the Sixth International Symposium on Computational Fluid Dynamics

Current address: Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439-4844 USA
[email protected]

^1 This work was supported by the National Research Council while the author was a Research Associate at NASA Ames Research Center.
INTRODUCTION
Multigrid methods have been demonstrated to be an efficient tool for the solution of compressible flow problems on unstructured meshes. Multigrid is very memory efficient compared with implicit methods and has excellent convergence behavior for many problems. Turbulent viscous flows, which are physically and numerically stiff, are an exception. Convergence histories for multigrid schemes applied to such problems often show a significant decrease in convergence rate after the residual has been reduced to about $10^{-4}$ or $10^{-5}$ (e.g., [1]). This leads to a dilemma: important global quantities such as lift and drag are not yet fully converged, but the additional CPU time required to converge these quantities fully appears to be prohibitive.

This paper describes the first phase of an ongoing effort to obtain multigrid convergence rates that are independent of problem type and size. One reasonable approach for reducing physical and numerical stiffness in the context of multigrid schemes is to use local pre-conditioning to improve the distribution of the eigenvalues of the governing equations. Allmaras [2] proposed the use of block-Jacobi local pre-conditioning to restrict the eigenvalues associated with high-frequency error components to a compact region of the complex plane. If the eigenvalues are restricted in this manner, an optimal multi-stage scheme can be designed to damp these error components rapidly, which in turn should lead to good multigrid convergence. Also, to improve the convergence rate near steady state, a Newton-GMRES scheme has been wrapped around the multigrid solver. The already-good convergence properties of the locally pre-conditioned multigrid scheme make matrix pre-conditioning for GMRES unnecessary. This allows a totally matrix-free implementation of GMRES with modest memory requirements.

Results are presented for several inviscid and laminar viscous airfoil cases; results for turbulent flow will be presented in a later paper. The cases shown demonstrate that the convergence rate of the overall procedure is quite good and nearly insensitive to the type or size of problem being solved.
BASE FLOW SOLUTION ALGORITHM
The Navier-Stokes equations are solved in two dimensions. The conserved variables Q are stored at the vertices of a triangular mesh. The computational domain is decomposed into small, non-overlapping sub-domains using the median dual of the mesh, which connects cell centroids to edge midpoints. The inviscid fluxes are evaluated using the edge-based upwind finite-volume method introduced by Barth and Jespersen [3, 4]. The conserved variables are reconstructed locally in each control volume using a least-squares technique [4, 5]. The reconstructed gradients must be limited in order to ensure monotonicity of the solution. Venkatakrishnan's limiter [6] is used because of its superior convergence properties. The dissipation introduced by the limiter is reduced by invoking it directionally [7]. After the solution is reconstructed, a flux quadrature is performed around each control volume using Roe's approximate Riemann solver.

The viscous fluxes across the faces of the control volumes are computed on a cell-wise basis. In each cell, the gradients of velocity and temperature are uniquely determined by their values at the vertices. From the gradients, viscous fluxes are computed and integrated around the control volumes.

Once the flux integral has been computed, the solution is advanced in time using a multi-stage Runge-Kutta scheme and local time-stepping. At present, a three-stage scheme due to Allmaras [2] is used; the $\alpha_i$ are $\{0.5321, 1.3711, 2.7744\}$. A CFL number of 0.8 is used for all the cases described in this paper.

The base scheme uses full approximation multigrid [8]. Residual restriction is performed conservatively, while linear interpolation is used for solution restriction and correction prolongation. First-order accurate spatial discretization is used on the coarse meshes; the solution is determined solely by the fine mesh discretization, so the coarse mesh discretization can be chosen to improve convergence. Sawtooth multigrid cycles are used, with relaxation steps taken only as the cycle progresses from coarse to fine meshes. The multigrid scheme is described in more detail in [9].
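As a concrete illustration of the time advance just described, the following is a minimal Python sketch of the three-stage scheme with local time-stepping; the routines compute_residual and local_time_step are hypothetical placeholders for the flux-integral and time-step computations, not routines from the paper.

    import numpy as np

    ALPHA = (0.5321, 1.3711, 2.7744)   # three-stage coefficients of Allmaras [2]
    CFL = 0.8                          # CFL number used for all cases in this paper

    def advance_one_step(Q, mesh, compute_residual, local_time_step):
        # Q is an (n_vertices, 4) array of conserved variables.
        # compute_residual(Q, mesh) returns the flux integral for each control volume.
        # local_time_step(Q, mesh, CFL) returns dt/volume for each vertex.
        dt_over_vol = local_time_step(Q, mesh, CFL)
        for alpha in ALPHA:
            R = compute_residual(Q, mesh)                  # (n_vertices, 4)
            Q = Q + alpha * dt_over_vol[:, None] * R       # stage update
        return Q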
LOCAL PRE-CONDITIONING VIA BLOCK JACOBI
Local pre-conditioners for the Euler and Navier-Stokes equations modify the governing equations in an effort to improve the eigenvalue structure of the resulting system without affecting the steady-state solution. The pre-conditioned form of the equations is

$$P^{-1} \frac{\partial}{\partial t} \iint_{\Omega} Q \, dx \, dy + \oint_{\partial\Omega} \vec{F} \cdot \hat{n} \, ds = 0 \qquad (1)$$
Pre-conditioning techniques generally fall into two classes. The first class, physical pre-conditioners, examines the analytical form of the governing equations and attempts to form matrices P which will make the eigenvalue structure independent of physical parameters such as Mach number, cell aspect ratio, and cell Reynolds number [10, 11, 12]. Physical pre-conditioners have been very successful for structured meshes, but are difficult to interpret in the context of unstructured meshes. The second class of pre-conditioners, numerical pre-conditioners, seeks to reduce the condition number of the discrete equations through strictly numerical means.
One member of this class of pre-conditioners is rescaled block Jacobi [2]. This scheme uses only the diagonal blocks of the Jacobian of the residual with respect to the conserved variables. The inviscid and viscous parts are scaled differently to ensure that the eigenvalues associated with high-frequency errors fall into a compact region of the complex plane, well away from the origin. The implementation of a block Jacobi scheme on unstructured meshes is straightforward, as the diagonal blocks of the Jacobian matrix are easily computed.

The inviscid part of the residual at a vertex 0, when using Roe's FDS, can be written as

$$R_{\mathrm{inv}} = \sum_j \frac{1}{2} \left[ f(\hat{Q}_0, n_{0j}) + f(\hat{Q}_j, n_{0j}) - \tilde{A}(Q, n_{0j}) \, (\hat{Q}_j - \hat{Q}_0) \right] |n_{0j}| \qquad (2)$$

where $n_{0j}$ is the scaled outward normal vector for the segment of the control volume boundary which connects the centroids of the cells on opposite sides of edge $0j$, $\hat{Q}_0$ and $\hat{Q}_j$ are reconstructed states on opposite sides of the control volume boundary, $f(Q, n_{0j})$ is the inviscid flux for state $Q$ in the direction of $n_{0j}$, and $\tilde{A}(Q, n_{0j})$ is the Roe dissipation matrix for flux in this direction.

In taking the Jacobian of this sum with respect to $Q_0$, several assumptions are made. First, the presence of $Q_0$ in gradients is neglected; this is roughly equivalent to linearizing the first-order discretization. Second, the dependence of $\tilde{A}(Q, n_{0j})$ on $Q_0$ has been neglected. These assumptions are far less dramatic than the approximations already made in discarding all off-diagonal blocks and therefore are justifiable. Furthermore, the first two terms in Equation 2 do not contribute to the derivative: $\sum_j f(\hat{Q}_0, n_{0j}) |n_{0j}|$ is identically zero and $\sum_j f(\hat{Q}_j, n_{0j}) |n_{0j}|$ has no dependence on $Q_0$. Therefore, the diagonal part of the implicit discretization of the inviscid fluxes is simply

$$P_{\mathrm{inv}} = \frac{1}{2} \sum_j \tilde{A}(Q, n_{0j}) \, |n_{0j}| \qquad (3)$$

The matrix is scaled by a factor of 2 so that the eigenvalues will all have real part between 0 and -1 [2].

The viscous part of the residual at vertex 0 may be written as

$$R_{\mathrm{visc}} = \sum_k F_{\mathrm{visc}}(n_k) \qquad (4)$$
where $n_k$ is the scaled outward normal for the portion of the boundary of control volume 0 crossing cell $k$, connecting the midpoints of the edges incident on vertex 0, and $F_{\mathrm{visc}}(n_k)$ is the viscous flux across this segment. Taking the Jacobian of the viscous residual with respect to $Q_0$ gives

$$P_{\mathrm{visc}} = \left( \sum_k \frac{\partial F_{\mathrm{visc}}(n_k)}{\partial W_0} \right) \frac{\partial W_0}{\partial Q_0} \qquad (5)$$

where $W_0 = (\rho_0, u_0, v_0, T_0)^T$.
If the gradients of velocity and temperature are computed via Green's theorem,

$$\frac{\partial}{\partial x}(\cdot) = \frac{1}{A_c} \oint (\cdot) \, dy \qquad \frac{\partial}{\partial y}(\cdot) = -\frac{1}{A_c} \oint (\cdot) \, dx \qquad (6)$$

then for a single cell
$$\sum_k \frac{\partial F_{\mathrm{visc}}}{\partial W_0} = \sum_k \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & -\frac{\mu}{A_c}\left(\frac{4}{3} n_x^2 + n_y^2\right) & -\frac{\mu}{3 A_c} n_x n_y & \frac{F_{\mathrm{visc},2}}{3\mu} \frac{\partial \mu}{\partial T_0} \\ 0 & -\frac{\mu}{3 A_c} n_x n_y & -\frac{\mu}{A_c}\left(n_x^2 + \frac{4}{3} n_y^2\right) & \frac{F_{\mathrm{visc},3}}{3\mu} \frac{\partial \mu}{\partial T_0} \\ 0 & u (P_k)_{2,2} + v (P_k)_{3,2} + \frac{1}{3} F_{\mathrm{visc},2} & u (P_k)_{2,3} + v (P_k)_{3,3} + \frac{1}{3} F_{\mathrm{visc},3} & -\frac{k |n|^2}{A_c} + \frac{F_{\mathrm{visc},4}}{3\mu} \frac{\partial \mu}{\partial T_0} \end{pmatrix} \qquad (7)$$

where $u$, $v$, $\mu$, and $k$ are cell-averaged values and $\partial\mu/\partial T_0$ depends on the viscosity model used. The sum in Equation 5 is evaluated in a loop over cells. A subsequent loop over vertices multiplies the sum by $\partial W_0/\partial Q_0$ to obtain $P_{\mathrm{visc}}$. The viscous part of the matrix pre-conditioner is scaled by a factor of 4 [2].

Once the explicit residual $R$ and pre-conditioning matrix $P$ are both known, the solution is updated via the same multi-stage scheme as before, with the local scalar time step replaced by a local matrix time step:

$$Q^{(0)} = Q^n \qquad Q_j^{(i)} = Q_j^{(i-1)} + \alpha_i P_j^{-1} R_j\!\left(Q_j^{(i-1)}\right) \qquad Q^{n+1} = Q^{(i_{\max})} \qquad (8)$$

This time advance is embedded in the multigrid scheme.
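To show how the matrix time step of Equation 8 might be organized in practice, here is a minimal Python sketch that stores one 4x4 block per vertex and applies $P^{-1}$ by solving a small linear system at each control volume; assemble_block_jacobi and compute_residual are assumed helper names, not routines from the paper.

    import numpy as np

    ALPHA = (0.5321, 1.3711, 2.7744)   # multi-stage coefficients [2]

    def preconditioned_step(Q, mesh, compute_residual, assemble_block_jacobi):
        # Q is an (n_vertices, 4) array of conserved variables.
        # assemble_block_jacobi returns an (n_vertices, 4, 4) array of diagonal
        # blocks, with the inviscid part scaled by 2 and the viscous part by 4.
        P = assemble_block_jacobi(Q, mesh)
        for alpha in ALPHA:
            R = compute_residual(Q, mesh)                    # (n_vertices, 4)
            dQ = np.linalg.solve(P, R[:, :, None])[:, :, 0]  # P^{-1} R, block-wise
            Q = Q + alpha * dQ                               # stage update, Eq. (8)
        return Q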
A MATRIX-FREE NEWTON-GMRES IMPLEMENTATION
In the limit of infinite time step, a generic implicit CFD scheme reduces to a Newton scheme:

$$\left(I + \Delta t \frac{\partial R}{\partial Q}\right) \Delta Q = \Delta t \, R \quad \longrightarrow \quad \frac{\partial R}{\partial Q} \, \Delta Q = R \qquad (9)$$
The latter system of linear equations is typically solved using an iterative scheme such as GMRES [13]. One of the difficulties here is the large size of the matrix $\partial R/\partial Q$, which limits the size of problem which may be solved on a given computer. However, the matrix $\partial R/\partial Q$ is used directly in the GMRES algorithm only to compute matrix-vector products $\frac{\partial R}{\partial Q} V$ to produce new members of the Krylov subspace. Because this matrix-vector product is mathematically equivalent to the directional derivative of $R$ in the direction of $V$,

$$\frac{\partial R}{\partial Q} V = \lim_{\epsilon \to 0} \frac{R(Q + \epsilon V) - R(Q)}{\epsilon} \qquad (10)$$

it is possible to trade the storage required for $\partial R/\partial Q$ for the computational expense of additional evaluations of $R$. In the present work, $R$ is computed as the change in solution over a multigrid cycle rather than as a residual. That is, the Newton-GMRES algorithm is wrapped
around the multigrid scheme. Because of the good convergence properties of the locally pre-conditioned multigrid scheme, the linear system of Equation 9 is reasonably well-conditioned. For the cases shown in this paper, a subspace size of 25-30 was generally sufficient to converge the linear problem by three orders of magnitude with no restarts. Because the convergence of the linear problem is already good, no matrix pre-conditioner is needed, and the GMRES implementation is totally matrix-free.
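The matrix-free matrix-vector product of Equation 10 can be implemented with a first-order finite difference; the sketch below (Python with SciPy) is illustrative only: multigrid_cycle_update stands in for the "change in solution over a multigrid cycle" described above, the perturbation size eps is a typical choice rather than one taken from the paper, and the sign convention for R follows Equation 9.

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, gmres

    def newton_gmres_update(q, multigrid_cycle_update, eps=1.0e-7):
        # q is the flattened solution vector; multigrid_cycle_update(q) returns
        # R(q), here the change in the solution over one multigrid cycle.
        R0 = multigrid_cycle_update(q)

        def jacvec(v):
            # finite-difference approximation of (dR/dQ) v, cf. Equation 10
            return (multigrid_cycle_update(q + eps * v) - R0) / eps

        A = LinearOperator((q.size, q.size), matvec=jacvec, dtype=np.float64)
        # Krylov subspace of about 30 vectors, no restarts, linear problem
        # converged by three orders of magnitude ("rtol" in recent SciPy;
        # older versions use the "tol" keyword instead)
        dq, info = gmres(A, R0, rtol=1.0e-3, restart=30, maxiter=1)
        return q + dq, info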
RESULTS AND DISCUSSION
Several test cases were run to demonstrate the convergence properties of the combined scheme. Table 1 summarizes five cases for flow around a NACA 0012 airfoil: an inviscid subsonic case, an inviscid transonic case, a symmetric laminar separated case, an asymmetric laminar separated case, and the same inviscid transonic case on a substantially finer mesh. All cases use three coarse meshes except case 5, which has four. Both viscous cases were computed on the same mesh, with a maximum cell aspect ratio of 320. The solutions for these cases will not be discussed further; the flow solver has been validated elsewhere [9] and our interest here is in the convergence behavior of the scheme.

The effect of local pre-conditioning and matrix-free GMRES on convergence rate is shown in Figure 1, which compares convergence histories with and without local pre-conditioning and with and without matrix-free GMRES. For a fair comparison, the CPU time has been scaled by the amount of time required for a single explicit residual evaluation on the finest mesh. Multigrid work units, the time required for a single time step on the finest mesh, are shown across the top of the figure. The use of local pre-conditioning gives nearly a factor of three improvement in time to convergence even for this benign problem. The use of GMRES after the maximum residual has dropped to $10^{-5}$ improves the local time-stepped result by about 40% and the locally pre-conditioned result by about 20%. The convergence factor is measured as the factor by which the maximum residual is reduced in the time required for a fine mesh residual evaluation, averaged from the time the GMRES algorithm is turned on. This factor is 0.968 for the run with local pre-conditioning and GMRES, which is equivalent to a convergence factor of 0.908 per work unit or about 0.670 per sawtooth W-cycle, and is 4.5 times better than the rate obtained without the techniques described in this paper.

Insensitivity of convergence rate to problem type is demonstrated in Figure 2. All five cases converge quickly, with a maximum time to machine-zero convergence equal to the time required for 1633 explicit residual evaluations. Furthermore, the variation in average convergence rate among the non-trivial cases with approximately the same mesh size (cases 2-4) is extremely small, especially given the very different flow physics of these cases and the relatively high cell aspect ratio for the viscous cases. Finally, case 5 has over six times as many mesh points as case 2, but requires only about 30% more work to reach full convergence, indicating little dependence of convergence rate on problem size.
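As a consistency check (the relation below is implied rather than stated in the paper), the three-stage scheme implies about three residual evaluations per work unit, and Table 1 gives 869 residual evaluations over 70 cycles for case 1, so the quoted rates are mutually consistent:

$$0.968^{3} \approx 0.907 \ \text{(per work unit)}, \qquad 0.968^{869/70} \approx 0.67 \ \text{(per cycle)},$$

in agreement with the values of 0.908 per work unit and 0.670 per cycle quoted above.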
The convergence acceleration techniques described in this paper both improve the multigrid convergence rate and make the convergence rate relatively insensitive to problem type and problem size. In a subsequent paper, we will extend the present methodology to turbulent flows, with the goal of obtaining similarly excellent convergence performance.
REFERENCES

[1] D. J. Mavriplis and V. Venkatakrishnan. AIAA paper 94-2332-CP, June 1994.
[2] S. R. Allmaras. AIAA paper 93-3330-CP, July 1993.
[3] T. J. Barth and D. C. Jespersen. AIAA paper 89-0366, Jan. 1989.
[4] T. J. Barth. In AGARD-R-787. AGARD, 1992.
[5] T. J. Barth. AIAA paper 93-0668, Jan. 1993.
[6] V. Venkatakrishnan. AIAA paper 93-0880, Jan. 1993.
[7] M. J. Aftosmis, D. Gaitonde, and T. S. Tavares. AIAA paper 94-0415, Jan. 1994.
[8] A. Brandt. In Multigrid Methods, Springer-Verlag, 1982.
[9] C. F. Ollivier-Gooch. AIAA J., to appear, Sep. 1995.
[10] B. van Leer, W.-T. Lee, and P. L. Roe. AIAA paper 91-1552-CP, June 1991.
[11] D. Lee and B. van Leer. AIAA paper 93-3328-CP, July 1993.
[12] S. Venkataswaran, J. M. Weiss, C. L. Merkle, and Y.-H. Choi. J. Comp. Phys., 1993.
[13] Y. Saad and M. H. Schultz. SIAM J. Sci. Stat. Comput., vol. 7, pp. 856-869, July 1986.

Case | Ma  | α (deg) | Re   | Vertices | CL      | CD     | Cost (Resid. Evals) | Cost (Work Units) | Cost (W-Cycles) | Avg. Convergence Factor
1    | 0.5 | 1       | --   | 3084     | 0.141   | 0.0001 | 869                 | 290               | 70              | 0.968
2    | 0.8 | 1.25    | --   | 4156     | 0.364   | 0.0225 | 1281                | 427               | 105             | 0.976
3    | 0.5 | 0       | 5000 | 3523     | -0.0003 | 0.0416 | 1183                | 394               | 106             | 0.975
4    | 0.5 | 1.25    | 5000 | 3523     | 0.0635  | 0.0428 | 1333                | 444               | 115             | 0.975
5    | 0.8 | 1.25    | --   | 25540    | 0.360   | 0.0226 | 1633                | 544               | 151             | 0.982

Table 1: Summary of Test Cases
Figure 1: Convergence histories for case 1 (maximum residual versus residual evaluations and work units, comparing local time-stepping, local time-stepping with GMRES, pre-conditioning, and pre-conditioning with GMRES).

Figure 2: Comparison of convergence histories for cases 1-5 (maximum residual versus residual evaluations and work units).