Point and line implicit methods to improve the efficiency and robustness of the DLR TAU code
Stefan Langer
Abstract
We present a line implicit preconditioned multistage Runge-Kutta method to significantly improve the convergence rate for approximating steady state solutions of high Reynolds number viscous flows. The preconditioner is constructed by a simplification of a first order approximation to the Jacobian of the residual function. Predetermined lines identifying mesh regions of high cell stretching are exploited to extract the relevant parts of the Jacobian matrix. The lines are identified using an efficient algorithm based on a weighted graph. This has the advantage that high aspect ratio cells are determined everywhere in the mesh, for example also in the wake of a wing.
1 Introduction
Future challenges in Computational Fluid Dynamics (CFD) include, for example, the solution of flow problems for full high-lift configurations of airplanes on unstructured meshes. To this end, efficient and robust solvers for the Reynolds-averaged Navier-Stokes (RANS) equations are required. The well-known and established unstructured multigrid techniques for solving large-scale high Reynolds number viscous flows often show a slowdown and significant deterioration of the observed convergence rate. The main reason for this breakdown in efficiency of the multigrid algorithm is the use of highly stretched anisotropic meshes required to efficiently resolve the boundary layer and wake regions in viscous flows. Indeed, the higher the Reynolds number, the more grid stretching is required and the worse the convergence rate becomes. The grid stretching influences the local time step ∆t_i, which is often used as a local preconditioner in the relaxation scheme. For example, the local convective time step on a Cartesian grid is computed as
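\[
\Delta t_i = \frac{\operatorname{vol}(\Omega_i)}{\rho_{\mathrm{expl}}}
= \frac{\Delta x_i \, \Delta y_i}{c\,\Delta x_i + c\,\Delta y_i}
\le \frac{\Delta y_i}{c}
\qquad \text{if } \Delta y_i \ll \Delta x_i,
\]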
where c denotes the local maximum characteristic velocity, typically the velocity plus the speed of sound. Hence, on a highly stretched grid the local time step will be proportional to the smallest scale ∆y_i (see Figure 1). Therefore, a key issue in improving the convergence rate is to remove the restriction of the time step or, even better, to replace the scalar time step by a better suited expression.
Stefan Langer, German Aerospace Center, Member of the Helmholtz Association, Institute of Aerodynamics and Flow Technology, Lilienthalplatz 7, 38108 Braunschweig, e-mail:
[email protected]
Fig. 1 Left: Example of a stretched cell, Right: Example of a dual cell; the solid lines represent the primary grid, the dashed line the corresponding dual cell
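The time step restriction discussed in the introduction can be illustrated numerically. The following minimal sketch evaluates the local convective time step vol(Ω_i)/ρ_expl on a Cartesian cell for increasing aspect ratios and compares it with the bound ∆y_i/c. The function name `local_time_step`, the value c = 340 and the cell sizes are assumptions chosen for illustration and are not taken from the TAU code.

```python
def local_time_step(dx: float, dy: float, c: float) -> float:
    """Local convective time step vol / rho_expl on a Cartesian cell."""
    volume = dx * dy
    rho_expl = c * dx + c * dy  # estimate of the explicit spectral radius
    return volume / rho_expl


c = 340.0  # assumed characteristic velocity (illustrative value)
dx = 1.0
for aspect_ratio in (1, 10, 100, 1000):
    dy = dx / aspect_ratio
    dt = local_time_step(dx, dy, c)
    # For dy << dx the time step approaches the bound dy / c.
    print(f"aspect ratio {aspect_ratio:5d}: dt = {dt:.3e}, dy/c = {dy / c:.3e}")
```

As the aspect ratio grows, the time step follows the smallest length scale, which is the behaviour the preconditioner is meant to overcome.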
2 Governing equations and discretization
2.1 RANS equations
To describe flow effects we consider, for an open domain Ω ⊂ R^3, the Reynolds-averaged Navier-Stokes (RANS) equations for a three-dimensional flow with velocity
\[
u(x,t) = \big(u_1(x,t),\, u_2(x,t),\, u_3(x,t)\big), \qquad (x,t) \in \Omega \times [0,\infty),
\]
in conservative variables W := (ρ, ρu_1, ρu_2, ρu_3, ρE, ρν̃), written as
\[
\frac{d}{dt} \int_{\Omega} W \, dx + \int_{\partial\Omega} \left( f_c \cdot n - f_v \cdot n \right) ds(y) = \int_{\Omega} Q \, dx. \tag{1}
\]
The definition of the convective terms fc · n and viscous terms fv · n as well as the additional equation and the source term Q arising from the Spalart-Allmaras turbulence model [10] can be found, for example, in [1]. The pressure is defined by the state equation
\[
p(x,t) = (\gamma - 1)\, \rho(x,t) \left( E(x,t) - \frac{\|u\|_2^2}{2} \right),
\]
where E is the specific total energy and γ is the gas-dependent ratio of specific heats, which is 1.4 for air.
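As a small worked example of the state equation, the pressure can be evaluated directly from the conservative variables, since ρ(E − ‖u‖²/2) = ρE − ‖ρu‖²/(2ρ). The function name `pressure` and the sample states are assumptions made for this sketch.

```python
from typing import Sequence

GAMMA_AIR = 1.4  # ratio of specific heats for air, as stated in the text


def pressure(w: Sequence[float], gamma: float = GAMMA_AIR) -> float:
    """Pressure from W = (rho, rho*u1, rho*u2, rho*u3, rho*E, rho*nu_tilde)."""
    rho, rho_u1, rho_u2, rho_u3, rho_e = w[:5]
    # rho * |u|^2 / 2 expressed through the momentum components
    kinetic = 0.5 * (rho_u1**2 + rho_u2**2 + rho_u3**2) / rho
    return (gamma - 1.0) * (rho_e - kinetic)


# Fluid at rest: p = (gamma - 1) * rho * E
print(pressure((1.0, 0.0, 0.0, 0.0, 2.5, 0.0)))
# Moving fluid with rho = 2, u = (1, 0, 0), E = 3
print(pressure((2.0, 2.0, 0.0, 0.0, 6.0, 0.0)))
```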
2.2 Discretization
The CFD solver in the present paper is the TAU code developed at the Deutsches Zentrum für Luft- und Raumfahrt e.V. (see e.g. [9]). TAU is based on a finite volume formulation where a median dual grid forms the control volumes with the unknowns in the vertices of the primary grid. A multistage Runge-Kutta scheme may be used to approximate a steady state solution of the governing equations. An agglomeration multigrid algorithm is applied as acceleration technique. The dual grid is created in a preprocessing step. Figure 1 shows an example of a dual cell. Throughout this paper the neighbors of a point i are denoted by N(i) and the corresponding face between the points i and j is denoted by ij. In this article we consider a central discretization of the convective terms fc · n with artificial dissipation. For an inner point i of a given mesh we have
\[
\int_{\partial\Omega_i} f_c \cdot n \, ds(y) \approx \sum_{j \in \mathcal{N}(i)} \left\{ \frac{1}{2} \Big[ (f_c \cdot n_{ij})(W_i) + (f_c \cdot n_{ij})(W_j) \Big] - \frac{1}{2}\, d(A_{ij}) \Big[ \Psi\, (W_j - W_i) - \kappa\, (1 - \Psi) \big( L_j(W) - L_i(W) \big) \Big] \right\} \tag{2}
\]
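where L_i represents an undivided Laplacian operator, Ψ some pressure switch, the function d describes the kind of dissipation and A_ij denotes the derivative of the convective flux evaluated on the face ij using Roe-averaged variables [8],
\[
\begin{aligned}
L_i(W) &:= \sum_{j \in \mathcal{N}(i)} (W_j - W_i), &
\Psi_i &:= \frac{\sum_{j \in \mathcal{N}(i)} (p_j - p_i)}{\sum_{j \in \mathcal{N}(i)} (p_j + p_i)}, \\
\Psi &:= \min\big( \varepsilon_\Psi \max(\Psi_i, \Psi_j),\, 1 \big), &
A_{ij} &:= \frac{\partial (f_c \cdot n_{ij})}{\partial W_i} \left[ W_{\mathrm{Roe}} \right],
\end{aligned}
\]
\[
d(A_{ij}) :=
\begin{cases}
\rho(A_{ij}) & \text{(scalar dissipation)}, \\
A_{ij} & \text{(matrix dissipation)}.
\end{cases}
\]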
The parameter ε_Ψ is a user-defined constant which may be chosen as ε_Ψ ∈ [4, 8], and ρ(A_ij) denotes the spectral radius of the matrix A_ij. κ is a further empirical constant weighting the fourth-difference operator, usually chosen as κ ∈ [1/20, 1/5]. For a computation of A_ij we refer to [4].
The discretization of the viscous terms is straightforward. The gradients are reconstructed using a Green-Gauss approach. For more details on these topics we refer to [1]. The convective part of the turbulence equation is also discretized using a second order central scheme. The turbulence equation itself is solved strongly coupled to the mean flow equations.
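To make the ingredients of the artificial dissipation in (2) more tangible, the following sketch evaluates the undivided Laplacian L_i and the pressure switch Ψ on a small graph. It is a scalar illustration only: the function names, the one-dimensional chain of five points and the pressure values are assumptions. The switch Ψ_i is implemented exactly as written above; implementations of this type of switch often take the absolute value of the numerator, which is not done here.

```python
from typing import Dict, List, Sequence

Neighbors = Dict[int, List[int]]


def undivided_laplacian(values: Sequence[float], neighbors: Neighbors, i: int) -> float:
    """L_i(W): sum over j in N(i) of (W_j - W_i)."""
    return sum(values[j] - values[i] for j in neighbors[i])


def point_switch(p: Sequence[float], neighbors: Neighbors, i: int) -> float:
    """Psi_i as printed in the text (no absolute value in the numerator)."""
    numerator = sum(p[j] - p[i] for j in neighbors[i])
    denominator = sum(p[j] + p[i] for j in neighbors[i])
    return numerator / denominator


def face_switch(p: Sequence[float], neighbors: Neighbors,
                i: int, j: int, eps_psi: float = 8.0) -> float:
    """Psi = min(eps_psi * max(Psi_i, Psi_j), 1) on the face ij."""
    psi = max(point_switch(p, neighbors, i), point_switch(p, neighbors, j))
    return min(eps_psi * psi, 1.0)


# Hypothetical 1D chain of points 0-1-2-3-4
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
linear_state = [0.0, 1.0, 2.0, 3.0, 4.0]
smooth_pressure = [1.0, 1.0, 1.0, 1.0, 1.0]
jump_pressure = [1.0, 1.0, 1.0, 10.0, 10.0]

# The Laplacian of a linear state vanishes at interior points.
print(undivided_laplacian(linear_state, neighbors, 2))
# Smooth pressure: the switch is off, only the fourth difference acts.
print(face_switch(smooth_pressure, neighbors, 2, 3))
# Pressure jump across the face 2-3: the switch saturates at one.
print(face_switch(jump_pressure, neighbors, 2, 3))
```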
3 Line implicit preconditioned Runge-Kutta method
After discretization we obtain the ordinary differential equation
\[
\frac{dW}{dt} = -M^{-1} R(W), \qquad \text{where } M := \operatorname{diag}\big( \operatorname{vol}(\Omega_i) \big)_{i=1,\dots,N}. \tag{3}
\]
To approximate a steady state solution of (3) we apply a multistage Runge-Kutta scheme. To improve the convergence rates, in particular for high Reynolds number viscous flow, we include a preconditioner in the Runge-Kutta scheme. To this end we make the following assumption:
Assumption 1 We assume that n lines L_1, ..., L_n satisfying
\[
L_j \subset \{1, \dots, N\}, \qquad L_j \cap L_i = \emptyset, \quad i \neq j, \qquad \bigcup_{j=1}^{n} L_j = \{1, \dots, N\}
\]
of length r_1, ..., r_n in the grid are known. We denote the lines by
\[
L_j = \{ \ell_j^{1}, \dots, \ell_j^{r_j} \}, \qquad j = 1, \dots, n.
\]
A preconditioned Runge-Kutta scheme may be coded as follows:
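\[
\begin{aligned}
W_{L_i}^{(0)} &= W_{L_i}^{T_n} \\
W_{L_i}^{(1)} &= W_{L_i}^{(0)} - \alpha_1 P_1(L_i)^{-1} R_{L_i}\big(W^{(0)}\big) \\
&\;\;\vdots \\
W_{L_i}^{(s-1)} &= W_{L_i}^{(0)} - \alpha_{s-1} P_{s-1}(L_i)^{-1} R_{L_i}\big(W^{(s-2)}\big) \\
W_{L_i}^{(s)} &= W_{L_i}^{(0)} - \alpha_s P_s(L_i)^{-1} R_{L_i}\big(W^{(s-1)}\big) \\
W_{L_i}^{T_{n+1}} &= W_{L_i}^{(s)}.
\end{aligned} \tag{4}
\]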
W^{T_n} describes the approximate solution at time level T_n. For a derivation and a thorough analysis of such Runge-Kutta methods with respect to point implicit preconditioning we refer to [4]. Similar line implicit methods have been suggested in [2, 5, 6, 7]. The preconditioner P_j(L_i) is given by a simplification of a first order approximation to the Jacobian of the residual function R(W^{(0)}). In this case the residual at point i depends only on its direct neighbors, i.e. R_i = R_i(W_i, W_j, j ∈ N(i)). To extract the relevant parts of the first order Jacobian ∂R/∂W we use the line information from Assumption 1. Along one line L_i the extracted part ∂R_{L_i}/∂W_{L_i} of the full Jacobian can be represented by a block tridiagonal matrix,
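\[
\frac{\partial R_{L_i}}{\partial W_{L_i}} =
\begin{pmatrix}
\dfrac{\partial R_{\ell_i^{1}}}{\partial W_{\ell_i^{1}}} & \dfrac{\partial R_{\ell_i^{1}}}{\partial W_{\ell_i^{2}}} & & & \\[2ex]
\dfrac{\partial R_{\ell_i^{2}}}{\partial W_{\ell_i^{1}}} & \dfrac{\partial R_{\ell_i^{2}}}{\partial W_{\ell_i^{2}}} & \dfrac{\partial R_{\ell_i^{2}}}{\partial W_{\ell_i^{3}}} & & \\[2ex]
& \ddots & \ddots & \ddots & \\[1ex]
& & \dfrac{\partial R_{\ell_i^{r_i-1}}}{\partial W_{\ell_i^{r_i-2}}} & \dfrac{\partial R_{\ell_i^{r_i-1}}}{\partial W_{\ell_i^{r_i-1}}} & \dfrac{\partial R_{\ell_i^{r_i-1}}}{\partial W_{\ell_i^{r_i}}} \\[2ex]
& & & \dfrac{\partial R_{\ell_i^{r_i}}}{\partial W_{\ell_i^{r_i-1}}} & \dfrac{\partial R_{\ell_i^{r_i}}}{\partial W_{\ell_i^{r_i}}}
\end{pmatrix}.
\]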
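Linear systems with a matrix of this tridiagonal shape can be solved by an LU decomposition in one forward and one backward sweep along the line. The sketch below shows the scalar analogue (block size one, often called the Thomas algorithm) without pivoting; in the block case the divisions become solves with the small diagonal blocks. The function name `solve_tridiagonal` and the storage of the three diagonals as lists are assumptions made for this illustration.

```python
from typing import List, Sequence


def solve_tridiagonal(lower: Sequence[float], diag: Sequence[float],
                      upper: Sequence[float], rhs: Sequence[float]) -> List[float]:
    """Solve a tridiagonal system by LU decomposition without pivoting.

    Row k reads lower[k]*x[k-1] + diag[k]*x[k] + upper[k]*x[k+1] = rhs[k];
    lower[0] and upper[-1] are not used.
    """
    n = len(diag)
    d = list(diag)
    b = list(rhs)
    # Forward sweep: eliminate the sub-diagonal entries.
    for k in range(1, n):
        factor = lower[k] / d[k - 1]
        d[k] -= factor * upper[k - 1]
        b[k] -= factor * b[k - 1]
    # Backward sweep: back substitution.
    x = [0.0] * n
    x[-1] = b[-1] / d[-1]
    for k in range(n - 2, -1, -1):
        x[k] = (b[k] - upper[k] * x[k + 1]) / d[k]
    return x


# Diagonally dominant example with known solution (1, 2, 3)
solution = solve_tridiagonal([0.0, -1.0, -1.0], [2.0, 2.0, 2.0],
                             [-1.0, -1.0, 0.0], [0.0, 0.0, 4.0])
print(solution)
```

The cost is linear in the length of the line, which is why the line solves add little overhead compared with a point implicit method.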
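Introducing some sort of regularization in the form of a backward Euler step, the preconditioner may be written as
\[
P_j(L_i) := M_{L_i} + \varepsilon_P \frac{\partial R_{L_i}}{\partial W_{L_i}}, \qquad
M_{L_i} := \operatorname{diag}\left( \frac{\operatorname{vol}(\Omega_{\ell_i^{k}})}{\mathrm{CFL}_{\mathrm{impl}}\, \Delta t_{\ell_i^{k}}} \right)_{k=1,\dots,r_i}.
\]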
Adding the time step estimation M_{L_i} is important in particular in the starting phase of the algorithm, since it ensures diagonal dominance of P_j(L_i), which is needed to solve the linear systems P_j(L_i)^{-1} R_{L_i} by a block LU decomposition (see e.g. [3]). In regions where a line degenerates to a point (i.e. in isotropic regions of the mesh) the block tridiagonal matrix degenerates canonically to the diagonal block corresponding to the point. Therefore, our line preconditioner is, more precisely, a hybrid line and point preconditioner. Viewed the other way round, a line preconditioner is a natural generalization of a block diagonal preconditioner to the case of anisotropic cells. To approximate the Jacobian of the convective flux and dissipative part we furthermore assume that the Roe matrix weighting the dissipation is constant. Under these conditions we may approximate the diagonal term of the derivative of (2) by
\[
\frac{\partial}{\partial W_i} \sum_{j \in \mathcal{N}(i)} \frac{1}{2} \Big[ (f_c \cdot n_{ij})(W_i) + (f_c \cdot n_{ij})(W_j) \Big] = 0,
\]
which holds for an inner point because the face normals n_ij of the closed control volume sum to zero, and
\[
\frac{\partial}{\partial W_i} \sum_{j \in \mathcal{N}(i)} \frac{1}{2}\, d(A_{ij}) \Big[ \Psi\, (W_j - W_i) - \kappa\, (1 - \Psi) \big( L_j(W) - L_i(W) \big) \Big] \approx -\frac{1}{2} \sum_{j \in \mathcal{N}(i)} d(A_{ij}).
\]
The off-diagonal terms of the convective and dissipative part of the residual function are obtained analogously.
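The interplay of scheme (4) and the preconditioner can be sketched on a linear model problem. The example below is an illustration under stated assumptions, not the TAU implementation: a single line of n scalar unknowns, a residual R(W) = AW − b with the hypothetical matrix A = tridiag(−1, 2, −1), a constant value `mass` standing in for vol/(CFL_impl ∆t), and a preconditioner P = M + ε_P A that is the same in every stage. The stage coefficients 2/3, 2/3, 1 are those used in Section 5.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def thomas(lower: Sequence[float], diag: Sequence[float],
           upper: Sequence[float], rhs: Sequence[float]) -> Vector:
    """LU solve of a tridiagonal system without pivoting."""
    n = len(diag)
    d, b = list(diag), list(rhs)
    for k in range(1, n):
        factor = lower[k] / d[k - 1]
        d[k] -= factor * upper[k - 1]
        b[k] -= factor * b[k - 1]
    x = [0.0] * n
    x[-1] = b[-1] / d[-1]
    for k in range(n - 2, -1, -1):
        x[k] = (b[k] - upper[k] * x[k + 1]) / d[k]
    return x


def rk_step(w: Vector, residual: Callable[[Vector], Vector],
            apply_p_inverse: Callable[[Vector], Vector],
            alphas: Sequence[float] = (2 / 3, 2 / 3, 1.0)) -> Vector:
    """One step of scheme (4): W^(k) = W^(0) - alpha_k P^{-1} R(W^(k-1))."""
    w0 = list(w)
    stage = list(w)
    for alpha in alphas:
        update = apply_p_inverse(residual(stage))
        stage = [a - alpha * u for a, u in zip(w0, update)]
    return stage


# Hypothetical model problem on one line: R(W) = A W - b
n = 8
lower = [0.0] + [-1.0] * (n - 1)
diag = [2.0] * n
upper = [-1.0] * (n - 1) + [0.0]
exact = [float(k + 1) for k in range(n)]


def matvec(w: Vector) -> Vector:
    out = []
    for k in range(n):
        value = diag[k] * w[k]
        if k > 0:
            value += lower[k] * w[k - 1]
        if k < n - 1:
            value += upper[k] * w[k + 1]
        out.append(value)
    return out


b = matvec(exact)


def residual(w: Vector) -> Vector:
    return [r - s for r, s in zip(matvec(w), b)]


mass = 0.1   # assumed stand-in for vol / (CFL_impl * dt)
eps_p = 1.0  # weighting of the Jacobian in the preconditioner


def apply_p_inverse(r: Vector) -> Vector:
    """Solve (M + eps_p * A) x = r along the line."""
    return thomas([eps_p * v for v in lower],
                  [mass + eps_p * v for v in diag],
                  [eps_p * v for v in upper], r)


w = [0.0] * n
for _ in range(60):
    w = rk_step(w, residual, apply_p_inverse)
error = max(abs(a - e) for a, e in zip(w, exact))
print(f"max error after 60 steps: {error:.2e}")
```

Increasing `mass` (a smaller CFL number) makes the preconditioner more diagonally dominant but slows the convergence towards the steady state, which mirrors the role of the regularization term described above.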
4 A line search algorithm
We present a line search algorithm suited not only for identifying lines in the boundary layer of an airfoil, but also for identifying lines in any anisotropic area of the given unstructured mesh. For example, lines in the wake of a wing are also found. This is important for the overall efficiency of a line implicit method; otherwise the high grid stretching in the wake could not be resolved and would yield a significant slowdown in the convergence rate. The line search algorithm is based on a weighted graph. Each edge of the mesh is assigned a weight w(e_i) representing the degree of coupling in the discretization. The weights are taken as the inverse of the edge length,
\[
w(e_i) := \big( \| v_i(\mathrm{left}) - v_i(\mathrm{right}) \|_2 \big)^{-1}, \tag{5}
\]
where v_i(left) and v_i(right) denote the left and the right vertex of the edge e_i. The ratio of maximum to average weight is used as an indication of the local anisotropy in the mesh at each vertex. In the next step the vertices are sorted according to the ratio of the maximum to the average weight. This is a very important issue since it ensures that lines originate in areas of maximum grid stretching and end in isotropic regions. To construct the lines the first vertex in this ordered list is then picked as the starting point for a line. The line is built by adding to the original vertex the neighboring vertex which is most strongly connected to the current vertex, provided this vertex does not already belong to a line, and provided the ratio of maximum to minimum edge weights is greater than some threshold parameter α ≥ 1. The line terminates when no additional vertex can be found. The line search algorithm can be summarized as follows:
1) For each vertex v_j, construct a list of edges e_i(j) originating from the vertex and determine the weight (5).
2) Compute the minimum weight, the maximum weight, the average weight and the ratio of maximum to average weight:
\[
\begin{aligned}
w_{\min}(v_j) &:= \min_{i \in \mathcal{N}(j)} \{ w(e_i(j)) \}, &
w_{\max}(v_j) &:= \max_{i \in \mathcal{N}(j)} \{ w(e_i(j)) \}, \\
w_{\mathrm{avg}}(v_j) &:= \frac{1}{\#\mathcal{N}(j)} \sum_{i=1}^{\#\mathcal{N}(j)} w(e_i(j)), &
\mathrm{rat}(v_j) &:= w_{\max}(v_j) / w_{\mathrm{avg}}(v_j).
\end{aligned}
\]
3) Sort the vertices v_j with respect to rat(v_j), largest ratio first.
4) Construct the lines:
a) Set searchOppositeDirection = false.
b) Pick the first vertex v_k out of the sorted list, delete it from the list and add it to the line. Mark ṽ := v_k.
c) If the ratio wmax(v_k)/wmin(v_k) ≥ α:
· Find the neighbor vertex v_neig corresponding to wmax(v_k), delete it from the sorted list and add it to the line.
· Define v_k := v_neig and go back to b).
d) else if the ratio wmax(v_k)/wmin(v_k) < α and searchOppositeDirection = false:
· searchOppositeDirection = true
· Go to ṽ and define v_k := ṽ.
· Go to c).
e) else
· Go to a).
It can be easily verified that this algorithm generates lines satisfying the properties of Assumption 1. The lines are created in a preprocessing step. The CPU time for the procedure is negligible. The choice of α ≥ 1 determines the shape and length of the lines. It is an empirical parameter. For many 2D test cases a choice of α = 4.0 is appropriate. For 3D test cases it has turned out that a choice of α ∈ [100, 150] can be necessary.
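The steps above can be sketched in a few lines of Python. This is one possible reading of the algorithm, not the TAU implementation, and it rests on the following assumptions: after a neighbor has been added, the growth continues from that neighbor; a neighbor is only added if it does not yet belong to a line and if the weight of the connecting edge divided by wmin of the current vertex is at least α (which implies wmax/wmin ≥ α and prevents a line from turning into the weakly coupled direction); and the second growth pass from the starting vertex plays the role of the opposite direction search. The function name `find_lines` and the mesh representation by point coordinates and index pairs are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def find_lines(points: Sequence[Point], edges: Sequence[Tuple[int, int]],
               alpha: float = 4.0) -> List[List[int]]:
    """Partition the vertices into lines along strong couplings."""
    # Step 1: edge weights (5), the inverse Euclidean edge length.
    weights: Dict[int, Dict[int, float]] = {v: {} for v in range(len(points))}
    for a, b in edges:
        w = 1.0 / math.dist(points[a], points[b])
        weights[a][b] = w
        weights[b][a] = w

    # Step 2: rat(v) = wmax(v) / wavg(v).
    def ratio(v: int) -> float:
        ws = list(weights[v].values())
        return max(ws) / (sum(ws) / len(ws)) if ws else 1.0

    # Step 3: sort the vertices, largest ratio first.
    order = sorted(range(len(points)), key=ratio, reverse=True)
    in_line = [False] * len(points)

    def grow(start: int) -> List[int]:
        """Follow the strongest admissible coupling until none is left."""
        chain: List[int] = []
        current = start
        while weights[current]:
            w_min = min(weights[current].values())
            candidates = [(w, v) for v, w in weights[current].items()
                          if not in_line[v] and w / w_min >= alpha]
            if not candidates:
                break
            _, nxt = max(candidates)
            in_line[nxt] = True
            chain.append(nxt)
            current = nxt
        return chain

    # Step 4: construct the lines.
    lines: List[List[int]] = []
    for start in order:
        if in_line[start]:
            continue
        in_line[start] = True
        forward = grow(start)
        backward = grow(start)  # search in the opposite direction
        lines.append(backward[::-1] + [start] + forward)
    return lines


# Hypothetical stretched structured mesh: 3 columns, 4 rows, dy << dx
nx, ny, dx, dy = 3, 4, 1.0, 0.125
points = [(i * dx, j * dy) for i in range(nx) for j in range(ny)]
edges = [(i * ny + j, i * ny + j + 1) for i in range(nx) for j in range(ny - 1)]
edges += [(i * ny + j, (i + 1) * ny + j) for i in range(nx - 1) for j in range(ny)]
lines = find_lines(points, edges, alpha=4.0)
print(lines)
```

On this small mesh every column of closely spaced vertices becomes one line, while on an isotropic mesh every vertex ends up in a line of length one, which corresponds to the point implicit limit described in Section 3.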
5 Numerical Examples
In our examples we chose a three stage scheme for Algorithm (4) with coefficients α_1 = α_2 = 2/3 and α_3 = 1. The parameter ε_P was given by 3/5 for the laminar flat plate and 1 for the wing. Algorithm (4) is compared to an explicit Runge-Kutta method in combination with an explicit residual smoother. The dissipative part of (2) is only evaluated on the first stage for both methods.
5.1 Laminar flat plate
To show the efficiency of Algorithm (4) we consider as a first test case laminar flow over a flat plate. A sequence of three structured meshes is considered. The coarse mesh has dimension 58 × 28, the medium mesh 116 × 56 and the fine mesh 232 × 112. Figure 2 displays the meshes as well as the lines along the anisotropies which are found by the line search algorithm presented in Section 4. Note that none of the structured information of the mesh has been exploited. The rate of convergence as a function of the multigrid cycles is displayed in Figure 3. Six convergence curves are presented, and the line implicit Algorithm (4) is compared with an explicit Runge-Kutta scheme with local time stepping and explicit residual smoothing. Furthermore, it is observed from Figure 3 that the grid induced stiffness of the equations is removed for the line implicit Algorithm (4), whereas the deterioration of the convergence rate for the explicit Runge-Kutta scheme is obvious.
Fig. 2 Top: Sequence of meshes for laminar flat plate, left: 58 × 28, middle: 116 × 56, right: 232 × 112, Bottom: Corresponding lines determined by the line search algorithm
Fig. 3 Comparison of convergence rate for laminar flat plate, line implicit preconditioned and explicit Runge-Kutta method with local time stepping
5.2 DPW Wing 1
As a second example we consider a wing from the Drag Prediction Workshop. It is a transonic test case with inflow Mach number 0.76 and angle of attack 0.5°. The mesh is unstructured with a prismatic boundary layer (see Figure 4). The number of points is 10150588, the number of surface elements is 465724. Figure 5 shows the improvement in the convergence rate for the line implicit Algorithm (4) when compared with the explicit Runge-Kutta method. Unfortunately, the speed-up observed in example 5.1 is not confirmed. Therefore we will consider the following topics in future work:
a) Combine the line implicit Algorithm (4) with a directional coarsening strategy (see [6]). For the results shown in this article a 1 : 8 (1 : 4 in 2D) coarsening has been used, also in the boundary layer.
b) Reconsider the coupling of the turbulence and mean flow equations.
Fig. 4 Left: DPW Wing 1, Right: Section of the trailing edge
Fig. 5 Comparison of convergence rate, line implicit preconditioned and explicit Runge-Kutta method with local time stepping
References
1. Blazek, J.: Computational Fluid Dynamics: Principles and Applications. Elsevier Science Ltd. (2001)
2. Eliasson, P., Weinerfelt, P., Nordström, J.: Application of a line-implicit scheme on stretched unstructured grid. AIAA Paper, AIAA-2009-163 (2009)
3. Golub, G.H., van Loan, C.F.: Matrix Computations, second edn. The Johns Hopkins University Press, Baltimore (1983)
4. Langer, S.: Investigation and Application of point implicit Runge-Kutta methods to inviscid flow problems. Submitted to the International Journal for Numerical Methods in Fluids (2010)
5. Mavriplis, D.J.: Directional Coarsening and Smoothing for anisotropic Navier-Stokes Problems. Electronic Transactions on Numerical Analysis 6, 182–197 (1997)
6. Mavriplis, D.J.: Directional Agglomeration Multigrid Techniques for High-Reynolds Number Viscous Flows. ICASE Report No. 98-7 (1998)
7. Mavriplis, D.J.: Multigrid Strategies for Viscous Flow Solvers on Anisotropic Unstructured Meshes. ICASE Report No. 98-6 (1998)
8. Roe, P.: Approximate Riemann Solvers, Parameter Vectors, and Difference Schemes. Journal of Computational Physics 43, 357–372 (1981)
9. Schwamborn, D., Gerhold, T., Heinrich, R.: The DLR TAU-Code: Recent Applications in Research and Industry. ECCOMAS CFD 2006 Conference (2006)
10. Spalart, P.R., Allmaras, S.R.: A One-Equation Turbulence Model for Aerodynamic Flows. AIAA Paper, AIAA-92-439 (1992)