Multigrid Acceleration of the Preconditioned Euler and Navier-Stokes Equations

David L. Darmofal
Aeronautics & Astronautics
Massachusetts Institute of Technology
Cambridge, MA 02139

Bram Van Leer
Aerospace Engineering
University of Michigan
Ann Arbor, MI 48109

Abstract

In these notes, we discuss the development of multigrid algorithms for the preconditioned Euler and Navier-Stokes equations. A semi-coarsened multigrid algorithm with a point block Jacobi, multi-stage smoother for second order upwind discretizations of the two-dimensional Euler equations is presented which produces convergence rates independent of grid size for moderate subsonic Mach numbers. By modification of this base algorithm to include local preconditioning for low Mach number flows, the convergence becomes largely independent of grid size and Mach number over a range of flow conditions from nearly incompressible to transonic flows, including internal and external flows. A local limiting technique is introduced to increase the robustness of preconditioning in the presence of stagnation points. Computational timings are made showing that the semi-coarsening algorithm requires $O(N)$ time to lower the fine grid residual six orders of magnitude, where $N$ is the number of cells. By comparison, the same algorithm applied to a full-coarsening approach requires $O(N^{3/2})$ time, and, in nearly all cases, the semi-coarsening algorithm is faster than full-coarsening, with the computational savings being greatest on the finest grids. Recent developments for preconditioned multigrid on unstructured meshes are also discussed with two- and three-dimensional, high Reynolds number applications.

1 Introduction

An important aspect of any computational method is robustness. A robust computational method not only solves a given class of problems but does so in a reliable, predictable manner from case to case. In practice, robustness issues can arise in many different ways. For example, an algorithm which provides accurate answers in a reasonable amount of time for one case may suddenly require a significant amount of time to arrive at the same level of accuracy for a different case. Another common (and perhaps worse) difficulty is a typically-reliable algorithm which simply diverges and fails to return any useful information for a particular problem. In either situation, these algorithms exhibit non-robust behavior.

In these notes, we discuss progress toward a robust method for solving the multi-dimensional Euler and Navier-Stokes equations. We present our recent work for solving the Euler equations based on semi-coarsening multigrid, point block Jacobi multi-stage relaxation, and low Mach number local preconditioning[1]. As we will show, the proposed algorithm is very robust, with its convergence rate being nearly independent of grid size and Mach number for both internal and external flows. Then, we review recent work by Mavriplis[2] on unstructured multigrid algorithms for Navier-Stokes calculations which employs many of the same techniques. In both the Euler and Navier-Stokes applications, we will focus on low Mach number effects and the impact of local preconditioning.

A robust multigrid algorithm requires the careful matching of the coarsening strategy with the iterative scheme or smoother. In particular, the smoother must effectively damp any modes which cannot be represented on coarser grids (without aliasing)[3, 4]. The most common coarsening strategy for multigrid on structured grids is full-coarsening, in which every other point is removed in both directions, as illustrated in Figure 1.
For typical upwind discretizations, the use of full-coarsening generally requires an implicit relaxation in order to produce grid independent convergence rates. This approach has been followed by several researchers, including Euler[5, 6, 7, 8, 9] and Navier-Stokes[10, 11] applications. The necessity for implicit relaxation with full-coarsening multigrid arises from the existence of grid-aligned modes, i.e. error modes with long streamwise wavelengths but short cross-stream wavelengths. For upwind discretizations which do not introduce numerical dissipation normal to characteristics, these grid-aligned modes cannot be easily damped by simple point relaxations such as the multi-stage point block Jacobi relaxation algorithm employed in this work. The difficulty with damping grid-aligned error modes is a direct consequence of the upwind discretization; thus, schemes with more isotropic numerical dissipation[12] may not suffer these problems. However, schemes with more isotropic numerical dissipation are also likely to be less accurate[13, 14, 15]. In contrast to convection-dominated problems, point relaxation in combination with full-coarsening multigrid generally works well for elliptic problems with isotropic physical dissipation[3].

In these notes, we explore the use of semi-coarsening, a more complex coarsening strategy devised by Mulder[16, 17] specifically for the Euler equations and other convection-dominated flows. In semi-coarsening, a fine grid is associated with two coarser grids, each independently coarsened in a single grid direction. A typical family of semi-coarsened grids is shown in Figure 1. While the semi-coarsening algorithm is more complex than full-coarsening, the smoothing requirements are significantly reduced. In particular, a simple point smoother is now sufficient for achieving smoothing of all fine grid error modes[16, 17, 4].

(a) Full-coarsening    (b) Semi-coarsening

Figure 1: Multigrid coarsening strategies
We select a point block Jacobi, multi-stage relaxation method as our smoother for a second order upwind discretization. An advantage of point block Jacobi is its simplicity, as it requires the inversion and storage of only the local block matrix arising from a linearization of the discrete equations. Also, point block Jacobi relaxation has been shown to be an effective smoother of high-high frequency errors for the 2-D discrete Euler[16, 17] and Navier-Stokes equations[4]. A multi-stage implementation of point block Jacobi is required for stability, as a single-stage, damped Jacobi method is not stable for second order discretizations. For the proposed semi-coarsening algorithm, convergence rates for moderate Mach numbers are nearly grid independent, implying the total work for this algorithm is $O(N)$, where $N$ is the number of cells. We note that similar $O(N)$ convergence for semi-coarsening and point block Jacobi has been previously observed by Mulder[17]. While semi-coarsening with point block Jacobi smoothing gives robust convergence rates for moderate Mach numbers, the performance severely degrades at lower Mach numbers. To alleviate this problem, we introduce the low Mach number preconditioning of Turkel[18] by modifying the upwind flux function[19] while retaining the point block Jacobi relaxation based on the modified fluxes. We refer to this approach as preconditioned Jacobi. Mavriplis[2] and Turkel[20] have previously studied this method for incorporating low Mach number preconditioning. Unfortunately, the full benefits of local preconditioning, especially at low Mach numbers, have been difficult to achieve because local preconditioners designed for good low Mach number performance inevitably have poor robustness at stagnation points[21, 22].
Darmofal and Schmid[22] have shown that this lack of robustness is due to unlimited transient amplification of perturbations stemming from a highly non-orthogonal (in fact, degenerate) eigenvector structure of the preconditioned equations as $M \to 0$. The most common technique to avoid this robustness problem is based on limiting the effect of preconditioning below a multiple of the freestream Mach number. Since this multiple is typically greater than one, the limit often acts globally and thus destroys the locality of the preconditioning. Furthermore, for problems in which a reference Mach number is inappropriate or non-existent, this type of limiting will be difficult to realize. Examples of these types of flows would be a hypersonic flow about a blunt body (which would contain regions of subsonic flow) or the flow of a high speed jet into a stationary fluid. In these notes, we discuss a new method for improving the robustness of local preconditioners. This new method relies on strictly local information and, as a result, acts locally to avoid transient amplification of perturbations. In addition, we show through numerical tests that the preconditioned block Jacobi algorithm is quite robust even without resorting to local preconditioning limiting. Thus, the overall robustness of the algorithm is a result of not only the preconditioner limiting technique but also the point block Jacobi smoothing. Utilizing the preconditioned Jacobi relaxation with semi-coarsening, convergence rates have only a small dependence on Mach number. Furthermore, $O(N)$ convergence is demonstrated over a wide range of Mach numbers, including $M_\infty \to 0$, for both internal and external flows.


2 Numerical Method

2.1 Baseline Discretization

Employing a cell-centered, finite volume algorithm on a structured grid composed of quadrilateral cells, the steady, two-dimensional Euler equations can be written as,

$$R(U) = 0, \qquad (1)$$

where $U$ is the cell-average conservative state vector defined as $U = (\rho, \rho u, \rho v, \rho E)$. The cell residual, $R$, is defined as,

$$R = \sum_{k=1}^{4} \hat H_{n_k}\,\Delta s_k,$$

where $\Delta s_k$ is the length and $\hat H_{n_k}$ is the numerical flux approximation for face $k$. Using an approximate Riemann solver (and dropping the subscript, $k$), $\hat H_n$ is given by,

$$\hat H_n = \frac{1}{2}\left[H(U_L) + H(U_R)\right] - \frac{1}{2}\,|\hat A|\,(U_R - U_L),$$

where $\hat A = \partial H_n/\partial U$ is the flux Jacobian evaluated using a Roe average[23]. For our second order scheme, we approximate the left and right states, $U_L$ and $U_R$, using van Leer's $\kappa$-scheme[24]. This reconstruction is actually performed on the primitive variables, $\rho$, $u$, $v$, and $p$, instead of the conserved variables. For the results in these notes, we use $\kappa = 0$. Also, we did not limit the reconstruction; thus, near shocks, the solutions may not be monotonic.
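As an illustration of the upwind flux above, the following sketch evaluates the analogous one-dimensional Roe flux, forming $|\hat A|$ numerically from an eigendecomposition of the Jacobian at the Roe-averaged state. This is our minimal 1-D analogue, not the authors' 2-D implementation; the function names are illustrative.

```python
import numpy as np

GAMMA = 1.4

def euler_flux(U):
    """Physical 1-D Euler flux; U = (rho, rho*u, rho*E)."""
    rho, m, E = U
    u = m / rho
    p = (GAMMA - 1.0) * (E - 0.5 * rho * u**2)
    return np.array([m, m * u + p, u * (E + p)])

def roe_flux(UL, UR):
    """H = 0.5*[H(U_L) + H(U_R)] - 0.5*|A_hat|*(U_R - U_L),
    with A_hat the flux Jacobian evaluated at the Roe-averaged state."""
    rhoL, uL = UL[0], UL[1] / UL[0]
    rhoR, uR = UR[0], UR[1] / UR[0]
    pL = (GAMMA - 1.0) * (UL[2] - 0.5 * rhoL * uL**2)
    pR = (GAMMA - 1.0) * (UR[2] - 0.5 * rhoR * uR**2)
    HL = (UL[2] + pL) / rhoL          # stagnation enthalpy, left
    HR = (UR[2] + pR) / rhoR          # stagnation enthalpy, right
    # Roe (sqrt-density-weighted) averages of velocity and enthalpy
    wL, wR = np.sqrt(rhoL), np.sqrt(rhoR)
    u = (wL * uL + wR * uR) / (wL + wR)
    H = (wL * HL + wR * HR) / (wL + wR)
    # Flux Jacobian A(U) of the 1-D Euler equations at the Roe state
    A = np.array([
        [0.0, 1.0, 0.0],
        [0.5 * (GAMMA - 3.0) * u**2, (3.0 - GAMMA) * u, GAMMA - 1.0],
        [0.5 * (GAMMA - 1.0) * u**3 - u * H, H - (GAMMA - 1.0) * u**2, GAMMA * u],
    ])
    lam, R = np.linalg.eig(A)          # eigenvalues u - c, u, u + c
    absA = (R @ np.diag(np.abs(lam)) @ np.linalg.inv(R)).real
    return 0.5 * (euler_flux(UL) + euler_flux(UR)) - 0.5 * absA @ (UR - UL)
```

For $U_L = U_R$ the dissipation term vanishes and the physical flux is recovered, which is a convenient consistency check.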

2.2 Smoothing

The smoothing algorithm is based on a nonlinear implementation of a multi-stage, point block Jacobi relaxation. Specifically, a four-stage relaxation gives,

$$U^{(1)} = U^n - \alpha_1 P_c R^n, \qquad (2)$$
$$U^{(2)} = U^n - \alpha_2 P_c R^{(1)}, \qquad (3)$$
$$U^{(3)} = U^n - \alpha_3 P_c R^{(2)}, \qquad (4)$$
$$U^{n+1} = U^n - \alpha_4 P_c R^{(3)}. \qquad (5)$$

The role of the residual preconditioner is to guarantee good smoothing of high-high frequency error modes for the semi-coarsening multigrid algorithm. Good smoothing of high-high modes can be accomplished using a point block Jacobi preconditioner[4], which we approximate as,

$$P_c^{-1} = \frac{1-\kappa}{2}\sum_{k=1}^{4} |\hat A_k|\,\Delta s_k.$$

For the coefficients, $\alpha_i$, we use the optimal damping schemes of Lynn & Van Leer[25, 26], which are $(0.203, 0.451, 0.906, 1.466)$ for full-coarsening, and $(0.182, 0.412, 0.785, 1.401)$ for semi-coarsening. The preconditioner $P_c$ is frozen at the first stage of an iteration, allowing an LU factorization and efficient back-solves to be performed on all subsequent stages. In Section 4, we investigate the behavior of the eigenvalues of the preconditioned discrete operator.
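A minimal sketch of the four-stage update in Equations (2)-(5), applied here to a first-order upwind model problem for which the block Jacobi preconditioner reduces to the identity; the model residual and the driver loop are our illustrative additions.

```python
import numpy as np

# Stage coefficients of Lynn & Van Leer for full-coarsening; the largest
# (last) coefficient multiplies the final stage, matching Eqs. (2)-(5).
ALPHA = (0.203, 0.451, 0.906, 1.466)

def multistage_jacobi(U, residual, precond, alphas=ALPHA):
    """One four-stage iteration: every stage restarts from U^n and applies
    the preconditioned residual of the previous stage."""
    Un = U.copy()
    for a in alphas:
        U = Un - a * precond(residual(U))
    return U

# Model problem: first-order upwind discretization of u_x = 0 with
# inflow u(0) = 1; the point preconditioner is the identity here.
def residual(U):
    R = np.empty_like(U)
    R[0] = U[0] - 1.0
    R[1:] = U[1:] - U[:-1]
    return R

U = np.zeros(32)
for _ in range(200):
    U = multistage_jacobi(U, residual, lambda r: r)
```

After enough iterations the solution relaxes to the inflow value everywhere, confirming stability of the multi-stage update on this model problem.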

2.3 Low Mach Number Modifications

Low Mach number preconditioning is incorporated into the algorithm by modifying the flux function,

$$\hat H_n = \frac{1}{2}\left[H(U_L) + H(U_R)\right] - \frac{1}{2}\,\hat P^{-1}|\hat P \hat A|\,(U_R - U_L), \qquad (6)$$

where $\hat P$ is a local flux preconditioner. This flux modification also propagates to the smoothing algorithm; thus, the block Jacobi preconditioner with the low Mach number flux function is,

$$P_c^{-1} = \frac{1-\kappa}{2}\sum_{k=1}^{4} \hat P_k^{-1}|\hat P_k \hat A_k|\,\Delta s_k.$$

The interesting aspect of this approach to low Mach number preconditioning is that the emphasis is placed on modifying the spatial discretization and not on convergence acceleration. In fact, the low Mach number preconditioning implemented in this manner provides both improved accuracy and convergence. We will refer to this approach as preconditioned Jacobi (although this is a slight misnomer). This basic approach to incorporating low Mach number preconditioning was suggested by Turkel[20] and has been implemented successfully by Mavriplis[2].

For the flux preconditioner, we have chosen a form of Turkel's preconditioning[18]. This preconditioner takes on a particularly simple form when expressed in the symmetrizing variables, $d\tilde U^T = [dp/(\rho c),\, du,\, dv,\, dp - c^2 d\rho]$. In symmetrizing variables, the specific form of Turkel's preconditioner we use is a diagonal matrix,

$$\tilde P = \begin{bmatrix} \sigma & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$

Although not obvious, this preconditioner is identical to the preconditioner of Weiss and Smith[27]. $\tilde P$ can be related to the preconditioner, $\hat P$, appearing in the flux function through the similarity transformation,

$$\tilde P = M \hat P M^{-1},$$

where $M$ is the transformation matrix from the conserved variables to the symmetrizing variables, i.e. $d\tilde U = M\,dU$. In Section 3, we discuss the rationale for choosing $\sigma$, but we note here that we can return to the unpreconditioned flux by setting $\sigma = 1$. For this form of Turkel's preconditioner, the upwind flux function in Equation (6) can be expanded into its characteristic form,

$$\hat H_n = \frac{1}{2}\left[H(U_L) + H(U_R)\right] - \frac{1}{2}\sum_{i=1}^{4}|\lambda_i|\, w_i\, \hat P^{-1}\tilde r_i,$$

where $\lambda_i$ are the eigenvalues and $\tilde r_i$ are the right eigenvectors of the matrix $\hat P \hat A$. The wave strengths, $w_i$, are the projections of the conserved state vector changes, $\Delta U$, onto the corresponding left eigenvectors, $\tilde l_i^T$, of $\hat P \hat A$, i.e., $w_i = \tilde l_i^T \Delta U$. Leaving $\sigma$ a free parameter, the preconditioned eigenvalues are,

$$\lambda_1 = \frac{1}{2}\left[(1+\sigma)u_g - \psi\right], \quad \lambda_2 = u_g, \quad \lambda_3 = u_g, \quad \lambda_4 = \frac{1}{2}\left[(1+\sigma)u_g + \psi\right],$$

where $u_g$ is the velocity component normal to the cell face, and,

$$\psi = \sqrt{(1-\sigma)^2 u_g^2 + 4\sigma c^2},$$

with $c$ the speed of sound. For the results contained in these notes, we have applied an entropy fix to these eigenvalues; specifically, we define,

$$|\lambda_i|^* = \begin{cases}\epsilon_i, & \text{if } |\lambda_i| < \epsilon_i,\\ |\lambda_i|, & \text{if } |\lambda_i| \ge \epsilon_i,\end{cases} \qquad \epsilon_i = 2\,|\lambda_{i,L} - \lambda_{i,R}|.$$

The corresponding preconditioned right eigenvectors are,

$$\hat P^{-1}\tilde r_1 = \frac{\rho}{\sigma c}\begin{pmatrix} s^+ \\ u s^+ - 2\sigma c^2 n_x \\ v s^+ - 2\sigma c^2 n_y \\ H s^+ - 2\sigma c^2 u_g\end{pmatrix},\qquad
\hat P^{-1}\tilde r_2 = \begin{pmatrix}0\\ -n_y\\ n_x\\ v_g\end{pmatrix},$$

$$\hat P^{-1}\tilde r_3 = \begin{pmatrix}1\\ u\\ v\\ \frac{1}{2}(u^2+v^2)\end{pmatrix},\qquad
\hat P^{-1}\tilde r_4 = \frac{\rho}{\sigma c}\begin{pmatrix} s^- \\ u s^- + 2\sigma c^2 n_x \\ v s^- + 2\sigma c^2 n_y \\ H s^- + 2\sigma c^2 u_g\end{pmatrix},$$

where,

$$s^+ = \psi + (1-\sigma)u_g, \qquad s^- = \psi - (1-\sigma)u_g.$$

Note, $v_g$ is the velocity tangential to the grid, $H$ is the stagnation enthalpy, and $(n_x, n_y)$ are the $(x,y)$ components of the unit face normal. The left eigenvectors are,

$$\tilde l_1 = \frac{1}{4\rho c\psi}\begin{pmatrix} s^- u_g + (\gamma-1)(u^2+v^2)\\ -s^- n_x - 2(\gamma-1)u\\ -s^- n_y - 2(\gamma-1)v\\ 2(\gamma-1)\end{pmatrix},\qquad
\tilde l_2 = \begin{pmatrix}-v_g\\ -n_y\\ n_x\\ 0\end{pmatrix},$$

$$\tilde l_3 = \frac{1}{c^2}\begin{pmatrix} c^2 - \frac{\gamma-1}{2}(u^2+v^2)\\ (\gamma-1)u\\ (\gamma-1)v\\ -(\gamma-1)\end{pmatrix},\qquad
\tilde l_4 = \frac{1}{4\rho c\psi}\begin{pmatrix} -s^+ u_g + (\gamma-1)(u^2+v^2)\\ s^+ n_x - 2(\gamma-1)u\\ s^+ n_y - 2(\gamma-1)v\\ 2(\gamma-1)\end{pmatrix}.$$

Finally, the wave strengths are given by,

$$w_1 = \frac{1}{2\psi c}\left(\frac{\Delta p}{\rho} - \frac{1}{2}\,s^-\,\Delta u_g\right), \qquad w_2 = \rho\,\Delta v_g, \qquad w_3 = \Delta\rho - \frac{\Delta p}{c^2}, \qquad w_4 = \frac{1}{2\psi c}\left(\frac{\Delta p}{\rho} + \frac{1}{2}\,s^+\,\Delta u_g\right).$$
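The behavior of the preconditioned eigenvalues is easy to check numerically. The sketch below (ours) verifies that $\sigma = 1$ recovers the unpreconditioned speeds $u_g \pm c$ and that choosing $\sigma \propto M^2$ brings all wave speeds to the same order at low Mach number.

```python
import numpy as np

def precond_eigenvalues(ug, c, sigma):
    """lambda_{1,4} = 0.5*[(1+sigma)*ug -/+ psi], lambda_{2,3} = ug,
    with psi = sqrt((1-sigma)^2 * ug^2 + 4*sigma*c^2)."""
    psi = np.sqrt((1.0 - sigma)**2 * ug**2 + 4.0 * sigma * c**2)
    return np.array([0.5 * ((1.0 + sigma) * ug - psi),
                     ug, ug,
                     0.5 * ((1.0 + sigma) * ug + psi)])

def speed_ratio(ug, c, sigma):
    """Ratio of largest to smallest characteristic speed magnitudes."""
    lam = np.abs(precond_eigenvalues(ug, c, sigma))
    return lam.max() / lam.min()
```

At $M = 0.01$ the unpreconditioned ($\sigma = 1$) speed ratio is roughly $1/M \approx 100$, while $\sigma = M^2$ reduces it to order one.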

2.4 Multigrid

The semi-coarsening multigrid method was implemented using a Full Approximation Storage (FAS) scheme[3] and follows the algorithm described by Mulder[16, 17] with a few exceptions. First, we solve the second order discretization on all grids, unlike Mulder[17], who employs a defect correction strategy in which a first order discretization is used on coarse grids. Use of defect correction may increase the efficiency of the method but was beyond the scope of the current work. Also, Mulder developed two anti-symmetric prolongation operators which he alternates on successive multigrid cycles. We have found this approach problematic, often producing non-monotonic convergence histories in conjunction with the switching between prolongation operators. Instead, we use a symmetric prolongation operator[28] such that the correction, $\Delta U$, from two coarse grids to a fine grid is defined by,

$$\Delta U = P^I \Delta U_I + P^J \Delta U_J - \frac{1}{2}\left(P^J R^J P^I \Delta U_I + P^I R^I P^J \Delta U_J\right),$$

Figure 2: Characteristic condition number versus Mach number for Euler (solid line), block Jacobi ($\sigma = 1$, dashed line), and preconditioned block Jacobi ($\sigma = \sigma_{cut}$ and $M_{cut} = 0.5$, dash-dotted line).

where $P^{I/J}$ are the I/J prolongation operators, $R^{I/J}$ are the I/J restriction operators, and $\Delta U_{I/J}$ is the change on the I/J-coarsened grid. Linear interpolation is used for prolongation and area-weighted averaging for restriction. A V-cycle is used with 2 pre-smoothing and 2 post-smoothing iterations. The current algorithm starts from the finest grid with an initially uniform flow (i.e. an impulsive start). We have also implemented a full-coarsening multigrid method for comparisons with semi-coarsening. A detailed description of the algorithms is given by Siu[29].
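The symmetric two-grid correction combination can be sketched with simplified piecewise-constant transfer operators (the paper uses linear interpolation for prolongation and area-weighted averaging for restriction; the operators and array layout below are our assumptions):

```python
import numpy as np

def restrict(a, axis):
    """Coarsen by averaging adjacent pairs of cells along one direction."""
    lo = a.take(np.arange(0, a.shape[axis], 2), axis=axis)
    hi = a.take(np.arange(1, a.shape[axis], 2), axis=axis)
    return 0.5 * (lo + hi)

def prolong(a, axis):
    """Refine by piecewise-constant injection along one direction."""
    return np.repeat(a, 2, axis=axis)

def combined_correction(dU_I, dU_J):
    """dU = P_I dU_I + P_J dU_J - 0.5*(P_J R_J P_I dU_I + P_I R_I P_J dU_J).
    dU_I is the correction from the I-coarsened grid (axis 0 halved),
    dU_J from the J-coarsened grid (axis 1 halved); the cross terms
    remove the doubly-counted smooth content."""
    cI = prolong(dU_I, axis=0)
    cJ = prolong(dU_J, axis=1)
    cross = (prolong(restrict(cI, axis=1), axis=1)
             + prolong(restrict(cJ, axis=0), axis=0))
    return cI + cJ - 0.5 * cross
```

For constant corrections from the two coarse grids, the symmetric combination returns their average rather than their sum, illustrating how the cross terms prevent double-counting.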

3 Definition of $\sigma$

As shown by Turkel[18], for good low Mach number preconditioning, the value of $\sigma$ should be proportional to $M^2$; in this case, as $M \to 0$, the preconditioned eigenvalues all remain proportional to the flow speed. A useful measure of the effectiveness of preconditioning is the characteristic condition number, defined as the ratio of largest to smallest propagation speeds for an isolated point disturbance[19, 30], which is equivalent to the ratio of largest to smallest group velocities[31, 30]. The optimal variation of $\sigma$ which minimizes the condition number over all Mach numbers[29] is given by,

$$\sigma_{opt} = \begin{cases} 2M^2/(1 - 2M^2) & \text{for } M < 0.5, \\ 1 & \text{for } M \ge 0.5. \end{cases} \qquad (7)$$

With a block Jacobi iterative scheme, we have found that slightly better convergence is given by $\sigma$ of the form,

$$\sigma_{cut} = \begin{cases} M^2/(1 - \beta_{cut} M^2) & \text{for } M < M_{cut}, \\ 1 & \text{for } M \ge M_{cut}, \end{cases} \qquad (8)$$

where $\beta_{cut} = (1 - M_{cut}^2)/M_{cut}^2$ and $M_{cut}$ is the user-defined Mach number above which no preconditioning is used. Specifically, for all of the preconditioned block Jacobi results in these notes, we use $M_{cut} = 0.5$. Figure 2 contains plots of the characteristic condition number for the Euler equations

(without preconditioning), the Euler equations with block Jacobi (no flux preconditioning), and the Euler equations with preconditioned block Jacobi (Turkel flux preconditioning with $\sigma = \sigma_{cut}$ and $M_{cut} = 0.5$). The poor conditioning of the Euler equations is clearly seen at $M = 0$ and $M = 1$. Without any low Mach number preconditioning, block Jacobi has identical conditioning to the Euler equations until $M = 0.5$, above which block Jacobi is an improvement. The combination of block Jacobi preconditioning with flux preconditioning proves effective at removing low Mach number stiffness.

Although the wave propagation stiffness is substantially improved at low Mach numbers using the flux preconditioning, a major source of trouble lies in the lack of robustness as $M \to 0$. Darmofal and Schmid[22] have shown that many local preconditioners can transiently amplify perturbations by a factor of $1/M$ as $M \to 0$. For the Turkel preconditioner we employ, the maximum amplification over all time behaves as $1/\sqrt{\sigma}$ at $M = 0$. These results suggest the need to limit $\sigma$ such that it does not approach zero at stagnation points. The typical approach for constructing a limit for $\sigma$ is to require $\sigma$ to be greater than some multiple of the freestream Mach number[22, 32, 33]. For example, if the desired value is $\sigma = M^2$, the limited value would be,

$$\sigma = \max\left(M^2,\, \alpha M_\infty^2\right).$$

With this limit, the amplification can still reach $1/(\sqrt{\alpha}\,M_\infty)$, which could be significant at lower freestream Mach numbers. A typical value is $\alpha = 3.0$[33]. This approach works reasonably well for airfoil flows, but the high value of $\alpha$ means that the limit is often active throughout the computational domain.

In developing a new $\sigma$ limit, the key idea is to recognize that when perturbations are small, the maximum amplification can be large; however, for large perturbations, the maximum amplification must be small. This suggests making the limit a function of the local flow perturbations. The specific limit which we have found to work successfully for a variety of flows is¹,

$$\sigma_{lim}^{new} = \frac{|\hat p_0(k)|}{\rho c^2}. \qquad (9)$$

As shown by Darmofal and Siu[1, 30], the most-amplified disturbance is a mode with all of its initial energy in the pressure perturbation. In this case, a unit-magnitude pressure disturbance creates an amplified velocity perturbation with magnitude $1/\sqrt{\sigma}$ at $M = 0$. Thus, velocity perturbations can be bounded by,

$$|\hat u(k,t)| \le \frac{1}{\sqrt{\sigma}}\,\frac{|\hat p_0(k)|}{\rho c}. \qquad (10)$$

Substituting $\sigma_{lim}^{new}$ from Equation (9) into Equation (10), we may show that the velocity perturbation squared is bounded by,

$$|\hat u(k,t)|^2 \le \frac{|\hat p_0(k)|}{\rho}.$$

This is reminiscent of the incompressible Bernoulli equation and suggests that the magnitude of velocity perturbations will correctly scale with pressure perturbations when $\sigma$ is required to be greater than $\sigma_{lim}^{new}$. This limit can also be interpreted as a linearity condition since it guarantees that the square of the velocity perturbation is less than the pressure perturbation.

In our preconditioning strategy, the values of $\sigma$ are only needed during the flux calculation. We call these flux values of $\sigma$, $\sigma_{flux}$. To determine $\sigma_{flux}$, a new value of $\sigma$ is calculated for every face,

$$\sigma_{face} = \min\left[1,\, \max\left(\sigma_L, \sigma_R, \sigma_{lim}\right)\right], \qquad (11)$$

where $\sigma_{L,R}$ are the values of $\sigma$ from Equation (8) using the left and right cell-average states. When using the old limit, $\sigma_{lim}^{old} = \sigma_\infty$, where $\sigma_\infty$ is the value of $\sigma$ evaluated at $M_\infty$ using Equation (8). For the new limit, we approximate $|\hat p|$ by the difference in cell-average pressures, $|p_R - p_L|$, giving,

$$\sigma_{lim}^{new} = \frac{|p_R - p_L|}{\hat\rho\,\hat c^2}. \qquad (12)$$

The new face values are limited only with respect to pressure variations across the face; however, pressure gradients in all directions should be accounted for when limiting $\sigma$. Thus, the values of $\sigma_{face}$ are sent to the cells, with the cell value of $\sigma$ being the maximum of the four face values which surround it,

$$\sigma_{cell} = \max_{k=1,\ldots,4} \sigma_{face_k}. \qquad (13)$$

Finally, the value $\sigma_{flux}$ used in the flux calculations is the maximum value of $\sigma_{cell}$ from the two cells surrounding a face,

$$\sigma_{flux} = \max\left(\sigma_{cell_L},\, \sigma_{cell_R}\right).$$

We note, the process of maximizing $\sigma$ over faces and cells as described above naturally raises the value of $\sigma$ even without recourse to the new limit.

¹This limit came from a suggestion by Jonathon Weiss of Fluent to include pressure variations in the $\sigma$ cut-off. The authors would like to acknowledge his contribution.
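The $\sigma$ selection of Equation (8) together with the face/cell limiting of Equations (11)-(13) can be sketched as follows; the helper names are ours, and the Roe-averaged $\hat\rho$, $\hat c$ enter through Equation (12):

```python
def sigma_cut(M, M_cut=0.5):
    """Eq. (8): sigma = M^2/(1 - beta_cut*M^2) below M_cut, else 1."""
    if M >= M_cut:
        return 1.0
    beta_cut = (1.0 - M_cut**2) / M_cut**2
    return M**2 / (1.0 - beta_cut * M**2)

def sigma_face(M_L, M_R, p_L, p_R, rho_hat, c_hat, M_cut=0.5):
    """Eqs. (11)-(12): face value with the local pressure-difference limit."""
    sig_lim = abs(p_R - p_L) / (rho_hat * c_hat**2)          # Eq. (12)
    return min(1.0, max(sigma_cut(M_L, M_cut),
                        sigma_cut(M_R, M_cut), sig_lim))     # Eq. (11)

def sigma_cell(face_sigmas):
    """Eq. (13): a cell takes the maximum of its four face values."""
    return max(face_sigmas)

def sigma_flux(cell_L, cell_R):
    """The flux uses the maximum of the two neighboring cell values."""
    return max(cell_L, cell_R)
```

With no pressure jump across a face the raw value of $\sigma$ survives; a strong jump raises $\sigma$, capped at one.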

4 Smoothing Analysis via Fourier Footprints

As discussed in the Introduction, the success of multigrid hinges on the ability of the smoothing algorithm to remove error modes which cannot be supported on coarser meshes. For example, consider an error mode in Fourier space given by,

$$U_{j,k}^n = \hat U^n(\theta_x, \theta_y)\,\exp\left[i(j\theta_x + k\theta_y)\right], \qquad (14)$$

where $\hat U^n(\theta_x,\theta_y)$ is the Fourier error mode with grid wavenumber $(\theta_x,\theta_y)$ at iteration $n$. Figure 3 shows error modes in the two-dimensional Fourier space. Waves with $0 \le |\theta_x| \le \pi/2$ represent low frequency modes in $x$, and $\pi/2 \le |\theta_x| \le \pi$ represent high frequency error modes in $x$. Similarly, waves in $y$ may be separated into low and high frequencies. In two dimensions, the error modes can thus be categorized into four types: low-low, low-high, high-low, and high-high.

Figure 3: Spectral view of the error modes (low-low, low-high, high-low, and high-high quadrants of the $(\theta_x,\theta_y)$ plane).

For a full-coarsening multigrid algorithm, the grid size is reduced by a factor of two in both the $x$ and $y$ directions. As a result, modes with high frequencies in any direction cannot be represented on coarser grids. Thus, the smoother must be capable of damping low-high, high-low, and high-high modes, as shown in Figure 4. For Mulder's semi-coarsening multigrid algorithm, only high-high modes are not represented on some coarser grid. Thus, the complexity of the coarsening strategy reduces the smoothing requirements to damping of the high-high modes only. Figure 5 shows the semi-coarsening smoothing requirements in the spectral space.

A common approach for analyzing the smoothing properties of a scheme is via the Fourier footprint of the discrete spatial operator, which is the locus of its eigenvalues in the complex plane. Specifically, after linearization about a constant state and substitution of the discrete Fourier mode in Equation (14), the spectral representation of the numerical approximation may be found for any discrete wavenumber combination, $(\theta_x,\theta_y)$. The use of the Fourier footprint to analyze the smoothing abilities of preconditioned Euler and Navier-Stokes equations can be found in several other references[34, 35, 4, 36, 25, 37, 38]. For the $\kappa$-schemes with low Mach number preconditioning, the Fourier-transformed spatial operator on a rectangular grid is,

$$\hat Z(\theta_x,\theta_y) = -P_c\left\{\Delta y\left[A\,\hat\theta_x - P^{-1}|PA|\,\hat\theta_{xx}\right] + \Delta x\left[B\,\hat\theta_y - P^{-1}|PB|\,\hat\theta_{yy}\right]\right\}, \qquad (15)$$

$$\hat\theta_x = \frac{i}{2}\left[(3-\kappa)\sin\theta_x - \frac{1-\kappa}{2}\sin 2\theta_x\right],$$
$$\hat\theta_y = \frac{i}{2}\left[(3-\kappa)\sin\theta_y - \frac{1-\kappa}{2}\sin 2\theta_y\right],$$
$$\hat\theta_{xx} = \frac{1-\kappa}{4}\left(4\cos\theta_x - \cos 2\theta_x - 3\right),$$
$$\hat\theta_{yy} = \frac{1-\kappa}{4}\left(4\cos\theta_y - \cos 2\theta_y - 3\right).$$

Figure 4: Modes (shaded) which must be damped by the smoother for full-coarsening.

Figure 5: Modes (shaded) which must be damped by the smoother for Mulder's semi-coarsening algorithm.

The Jacobi cell preconditioner takes the form,

$$P_c = \frac{1}{1-\kappa}\left[P^{-1}|PA|\,\Delta y + P^{-1}|PB|\,\Delta x\right]^{-1} = \frac{1}{1-\kappa}\left(|PA|\,\Delta y + |PB|\,\Delta x\right)^{-1} P.$$

This last expression for the block Jacobi preconditioner is interesting because, upon substitution into Equation (15), the Fourier-transformed spatial operator is,

$$\hat Z = -\frac{1}{1-\kappa}\left(|PA|\,\Delta y + |PB|\,\Delta x\right)^{-1}\left\{\Delta y\left[PA\,\hat\theta_x - |PA|\,\hat\theta_{xx}\right] + \Delta x\left[PB\,\hat\theta_y - |PB|\,\hat\theta_{yy}\right]\right\},$$

which shows that utilizing a block Jacobi smoother on the modified preconditioned flux function is equivalent to utilizing a block Jacobi smoother on the preconditioned equations directly. Note, this is only strictly valid for linear flows. To complete the smoothing analysis, we note that the multi-stage relaxation given in Equations (2)-(5) acts on the eigenmodes of $\hat Z$ independently such that,

$$v^{n+1} = g(\hat z)\,v^n,$$

where $v^n$ is the amplitude of an eigenmode of $\hat Z$, $\hat z$ is the corresponding eigenvalue of $\hat Z$, and $g(\hat z)$ is the amplification factor, which is determined solely by the relaxation scheme. For a four-stage relaxation, the amplification factor is,

$$g(\hat z) = 1 + \alpha_4\hat z + \alpha_4\alpha_3\hat z^2 + \alpha_4\alpha_3\alpha_2\hat z^3 + \alpha_4\alpha_3\alpha_2\alpha_1\hat z^4.$$

We now consider the effect of low Mach number preconditioning on the smoothing abilities of the multi-stage, block Jacobi relaxation. Figure 6 shows the footprints without low Mach number preconditioning for a $M = 0.01$ flow aligned in the $x$-direction with a cell aspect ratio $\Delta x/\Delta y = 1$. Figure 7 is the same plot with low Mach number preconditioning. Contours of the amplification factor magnitude, $|g(\hat z)|$, are also plotted. In both cases, the high-high modes are clustered away from the origin and the $|g| = 1$ contour; thus, we expect good damping for these modes. However, we note the presence of modes along the real axis for all frequency combinations without preconditioning. These modes are a direct result of the low Mach number stiffness and correspond to the low speed convection modes present in the system. In contrast, these modes are not clustered along the real axis for the preconditioned case. In terms of smoothing performance, the biggest gains associated with removing the real axis modes occur in the high-low modes, where the smoothing rate goes from effectively 1.0 without preconditioning to 0.6 with preconditioning. Looking at the low-high modes, both cases have error modes present at the origin with nearly $|g| = 1$; these modes are related to the grid-aligned error modes associated with convection. In summary, both preconditioned and unpreconditioned block Jacobi algorithms appear suitable for semi-coarsened multigrid since high-high modes are well damped. However, because of the better smoothing for high-low modes, we expect the preconditioned results to be superior to those without preconditioning. Numerical results will show that the preconditioned Jacobi scheme is far superior to Jacobi without preconditioning.
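The four-stage amplification factor can be evaluated directly. The sketch below (ours) pairs it with a scalar, one-dimensional $\kappa = 0$ Fourier symbol as a stand-in for Equation (15), checking that high frequencies are damped while the lowest frequencies remain nearly neutral, consistent with the footprints in Figures 6 and 7.

```python
import numpy as np

ALPHA = (0.182, 0.412, 0.785, 1.401)   # semi-coarsening coefficients

def g(z, a=ALPHA):
    """Four-stage amplification factor:
    g(z) = 1 + a4 z + a4 a3 z^2 + a4 a3 a2 z^3 + a4 a3 a2 a1 z^4."""
    a1, a2, a3, a4 = a
    return (1.0 + a4 * z + a4 * a3 * z**2
            + a4 * a3 * a2 * z**3 + a4 * a3 * a2 * a1 * z**4)

def z_scalar(theta, kappa=0.0):
    """Preconditioned Fourier symbol of the kappa-scheme for scalar,
    one-dimensional convection (a 1-D analogue of Eq. (15))."""
    th_x = 0.5j * ((3.0 - kappa) * np.sin(theta)
                   - 0.5 * (1.0 - kappa) * np.sin(2.0 * theta))
    th_xx = 0.25 * (1.0 - kappa) * (4.0 * np.cos(theta)
                                    - np.cos(2.0 * theta) - 3.0)
    return -(th_x - th_xx) / (1.0 - kappa)
```

The highest grid frequency, $\theta = \pi$, maps to $\hat z = -2$ and is strongly damped, while $\theta \to 0$ gives $|g| \to 1$, mirroring the nearly-neutral low frequency modes discussed above.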

Figure 6: Fourier footprint for block Jacobi smoothing (low-high, high-high, low-low, and high-low frequencies). $M = 0.01$ flow aligned in the $x$-direction on a grid of aspect ratio $\Delta x/\Delta y = 1$. Contours of the amplification factor, $g(\hat z)$, are included from 0.1 to 1.0 in increments of 0.1.

Figure 7: Fourier footprint for preconditioned block Jacobi smoothing (low-high, high-high, low-low, and high-low frequencies). $M = 0.01$ flow aligned in the $x$-direction on a grid of aspect ratio $\Delta x/\Delta y = 1$. Contours of the amplification factor, $g(\hat z)$, are included from 0.1 to 1.0 in increments of 0.1.

Figure 8: Sample bump grid and $C_p$ data: (a) 32 × 16 grid; (b) $C_p$ contours, $M_\infty = 0.1$, preconditioned Jacobi.

5 Results

The convergence rates presented in the following results include cycle counts, work units, and CPU timings required to converge the solution six orders of magnitude from the initial residual. Convergence is measured using the RMS residual of all components of the residual vector (i.e. mass, momentum, and energy). A single work unit is equal to the amount of work required to evaluate the residual on the finest grid. Also, the total amount of work includes only the work required to perform smoothing passes on all of the grids, but not any intergrid transfers.

5.1 Bump flow results

The first set of tests simulates flow over a solid bump between $0 \le x \le 1$ described by $y = 0.042\sin^2(\pi x)$. The domain is 5 unit lengths long and 2 lengths high. The grid is structured with clustering toward the wall boundary. A sample grid and flow solution are shown in Figure 8. Grid sizes range from 32 × 16 to 256 × 128.

The results for the Jacobi algorithm in Table 1 show a pronounced dependence on Mach number. In particular, at low Mach numbers, the convergence rates significantly degrade relative to similar grid sizes at higher Mach numbers. For example, for a grid of 128 × 64 cells, the $M_\infty = 0.1$ case converges (i.e. the residual drops 6 orders of magnitude) in 252 full-coarsening cycles, while the $M_\infty = 0.5$ case converges almost five times faster, needing only 52 full-coarsening cycles. For semi-coarsening, the results are even more dramatic, with $M_\infty = 0.1$ and $M_\infty = 0.5$ converging in 73 and 8 cycles, respectively. While this degradation in convergence rate is observed most significantly at $M_\infty = 0.1$, the effect is also evident at $M_\infty = 0.3$.

                  Full coarsening     Semi coarsening
  Grid            Cycles    Work      Cycles    Work

  M∞ = 0.1
  32 × 16            67     1492         46     2393
  64 × 32           113     2551         64     3825
  128 × 64          241     5457         73     4659
  256 × 128         DNC      DNC         81     5338

  M∞ = 0.3
  32 × 16            37      824         14      729
  64 × 32            63     1422         14      838
  128 × 64          102     2310         14      894
  256 × 128         168     3808         14      923

  M∞ = 0.5
  32 × 16            23      513          8      417
  64 × 32            35      791          8      479
  128 × 64           52     1178          8      512
  256 × 128         101     2290          8      528

  M∞ = 0.8
  32 × 16            28      624         10      521
  64 × 32            42      949         10      599
  128 × 64           48     1088         15      958
  256 × 128          80     1814         16     1055

Table 1: Bump flow with Jacobi results; 6 orders of magnitude drop in residual. DNC: did not converge in 300 cycles.

              Full coarsening     Semi coarsening
Grid          Cycles    Work      Cycles    Work
                      M∞ = 0.1
32 × 16          20      446          7      365
64 × 32          21      475          7      419
128 × 64         33      748          7      448
256 × 128        65     1474          7      462
                      M∞ = 0.3
32 × 16          19      423          7      365
64 × 32          23      520          7      419
128 × 64         36      816          7      448
256 × 128        71     1610          7      462
                      M∞ = 0.5
32 × 16          23      513          8      417
64 × 32          35      791          8      479
128 × 64         52     1178          8      511
256 × 128       101     2290          8      528
                      M∞ = 0.8
32 × 16          28      624         10      521
64 × 32          42      949         10      599
128 × 64         48     1088         15      958
256 × 128        80     1814         16     1055

Table 2: Bump flow with preconditioned Jacobi results. 6 orders of magnitude drop in residual.


By comparison, the results for the preconditioned Jacobi algorithm in Table 2 show very little dependence on Mach number for full- and semi-coarsening, with the lowest Mach number cases converging fastest. The semi-coarsening performance is particularly impressive, with the total range of cycles for all Mach numbers and all grids being only from 7 cycles for the M∞ = 0.1 cases to 16 cycles for the finest grid, M∞ = 0.8 case. We note also that for M∞ ≥ 0.5, the Jacobi and preconditioned Jacobi algorithms converge in almost exactly the same number of cycles (or work). This result is expected since the preconditioning is turned off for M ≥ 0.5 by the definition in Equation (8). Another interesting aspect of the bump flow convergence results is the dependence of convergence rate on grid size for full- and semi-coarsening. Tables 1 and 2 clearly show that full-coarsening requires an increasing number of cycles (or work units) to converge with increasing grid size for Jacobi and preconditioned Jacobi. In fact, for the largest grids, the total cycles or work units for convergence increase by almost exactly a factor of two for a factor of four increase in grid size. This suggests that the full-coarsening algorithm requires O(N^{3/2}) operations to converge to a fixed level. Convergence histories for full-coarsening with M∞ = 0.1 are shown in Figure 9. As described in the Introduction, the poor performance of the full-coarsening algorithm is attributable to the lack of damping for grid-aligned error modes. The results for semi-coarsening are distinctly superior to full-coarsening with respect to grid dependence. For Jacobi, the M∞ = 0.3 and 0.5 cases are grid independent, requiring 14 cycles and 8 cycles to converge, respectively, for all grid sizes. At M∞ = 0.8, the coarsest two grids converge in 10 cycles while the finer grids jump to 15 and 16 cycles. The difficulty, we believe, lies with the presence of a shock wave in the steady solution. We have found this type of grid dependence for many problems with shocks. Several researchers have proposed modifications in the multigrid algorithm to better handle discontinuous flow variations; however, these have not been pursued for this work. For the low speed M∞ = 0.1 case, the Jacobi algorithm in conjunction with semi-coarsening is no longer grid independent, with the 32 × 16 grid requiring 46 cycles and the 256 × 128 grid requiring 81 cycles to converge (see the convergence histories in Figure 10(a)). However, preconditioned Jacobi with semi-coarsening maintains grid independence at low Mach numbers. This is shown in Figure 10(b), in which the convergence histories versus work units for all grid sizes are nearly identical. The beneficial effect of low Mach number preconditioning is even felt at M∞ = 0.3. In comparison to the Jacobi algorithm, which required 14 cycles to converge for all grids, the preconditioned Jacobi algorithm converges in 7 cycles for all grids, giving a factor of two improvement.
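The O(N^{3/2}) scaling estimate above can be checked directly from the tabulated work units with a log-log least-squares fit; for instance, the full-coarsening preconditioned Jacobi bump data at M∞ = 0.1 gives a slope approaching 1/2 on the finer grid pairs, consistent with cycle counts growing roughly like N^{1/2} on top of O(N) work per cycle. A minimal sketch (our own check, not part of the original study):

```python
import numpy as np

# Full-coarsening, preconditioned Jacobi, bump flow, M = 0.1 (Table 2):
# grid sizes and total work units for a six-order residual drop.
cells = np.array([32*16, 64*32, 128*64, 256*128])
work  = np.array([446., 475., 748., 1474.])

# Least-squares slope of log(work) vs log(N) estimates p in work ~ N^p.
p, _ = np.polyfit(np.log(cells), np.log(work), 1)

# Over all four grids the slope is below 1/2 (the coarse grids have not
# reached the asymptotic regime); the finest pair alone gives about 0.49.
assert 0.2 < p < 0.6
```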

5.2 Duct flow results

A second set of cases was performed for duct flows. All conditions are the same as the bump tests except the upper boundary condition: the upper boundary of the domain is now a solid wall instead of a farfield. With the farfield boundary condition approach, error modes can propagate out of the domain. However, with a solid wall boundary on top, acoustic error modes will reflect back into the domain and could hinder convergence. Results for Jacobi and preconditioned Jacobi duct cases are shown in Tables 3 and 4. All of the trends observed in the bump flow results are also evident in the duct flow cases.


Figure 9: Variation of convergence with grid size. Bump flow with full-coarsening for M∞ = 0.1. Solid: 32 × 16; dash: 64 × 32; dash-dot: 128 × 64; dot: 256 × 128.


Figure 10: Variation of convergence with grid size. Bump flow with semi-coarsening for M∞ = 0.1. Solid: 32 × 16; dash: 64 × 32; dash-dot: 128 × 64; dot: 256 × 128.


              Full coarsening     Semi coarsening
Grid          Cycles    Work      Cycles    Work
                      M∞ = 0.1
32 × 16          67     1492         46     2393
64 × 32         113     2551         64     3825
128 × 64        240     5435         73     4659
256 × 128       DNC      DNC         80     5338
                      M∞ = 0.3
32 × 16          37      824         14      729
64 × 32          64     1445         14      838
128 × 64        106     2401         14      894
256 × 128       182     4125         14      923
                      M∞ = 0.5
32 × 16          23      513          9      469
64 × 32          37      836          8      479
128 × 64         62     1405          8      512
256 × 128       109     2471          8      528
                      M∞ = 0.8
32 × 16          28      624         11      573
64 × 32          37      836         10      599
128 × 64         46     1042         10      639
256 × 128        68     1542         21     1385

Table 3: Duct flow with Jacobi results. 6 orders of magnitude drop in residual. DNC: did not converge in 300 cycles.


              Full coarsening     Semi coarsening
Grid          Cycles    Work      Cycles    Work
                      M∞ = 0.1
32 × 16          20      446          7      365
64 × 32          23      520          7      419
128 × 64         39      884          7      448
256 × 128        71     1610          7      462
                      M∞ = 0.3
32 × 16          19      424          7      365
64 × 32          25      565          7      419
128 × 64         43      975          7      448
256 × 128        76     1723          7      462
                      M∞ = 0.5
32 × 16          23      513          8      417
64 × 32          37      836          8      479
128 × 64         62     1405          8      511
256 × 128       109     2471          8      528
                      M∞ = 0.8
32 × 16          28      624         11      573
64 × 32          37      836         10      599
128 × 64         46     1042         10      639
256 × 128        68     1542         21     1385

Table 4: Duct flow with preconditioned Jacobi results. 6 orders of magnitude drop in residual.


5.3 Airfoil results

The third set of tests simulates flow over a NACA 0012 airfoil. The grid sizes range from 96 × 16 to 384 × 128, and the farfield boundaries are 20 chord lengths away. The farfield boundary model is simply the uniform freestream, although more accurate models could have been incorporated. A typical grid and a transonic solution are shown in Figure 11. One set of grids contained 96 × 16, 192 × 32, and 384 × 64 cells. A second set of grids was generated by doubling the number of cells in the direction normal to the airfoil surface, giving 96 × 32, 192 × 64, and 384 × 128 cells. Clustering was used to allow better resolution of the flow properties in critical areas. Convergence data for all NACA 0012 results are given in Tables 5-8. The results follow the same trends observed with the bump and duct flows. For the Jacobi algorithm, the performance at M∞ = 0.1 is extremely poor, with all but one case (semi-coarsening on the finest grid) failing to converge in 300 multigrid cycles. Furthermore, as illustrated in the convergence history plots in Figures 12 and 13, many of these low Mach number solutions appeared to completely stall and never converge six orders of magnitude. As before, the low Mach number preconditioning completely alleviates this problem. The beneficial effect of preconditioned Jacobi is also observed at M∞ = 0.3, with a factor of two or more improvement compared to Jacobi in most cases. At higher Mach numbers, the convergence of both Jacobi and preconditioned Jacobi is almost identical. As observed with the bump and duct flow results, the number of cycles required by full-coarsening to converge six orders of magnitude increases with increasing grid size. For the finer grids, the amount of work approximately doubles with a fourfold increase in grid size, implying an O(N^{3/2}) algorithm. Convergence histories for full-coarsening with M∞ = 0.1 are shown in Figure 12. In both the Jacobi and preconditioned Jacobi results, the dependence of convergence on grid size can be easily observed. As described in the Introduction, the poor performance of the full-coarsening algorithm is attributable to the lack of damping for grid-aligned error modes. At Mach numbers of 0.5 and 0.8, the semi-coarsening algorithm with Jacobi preconditioning performs nearly independently of grid size. At low Mach numbers, the Jacobi algorithm with semi-coarsening is no longer grid independent; however, the incorporation of low Mach number preconditioning again alleviates this problem. Convergence histories for the semi-coarsening results are plotted in Figure 13.

5.4 CPU timings

To further demonstrate the dependence of convergence on grid size, we plot the total CPU time required when running the simulations on a single SGI R10000 processor for the M∞ = 0.1 cases with preconditioned Jacobi in Figures 14 and 15 for the bump and airfoil flows, respectively. The full-coarsening CPU times (marked by ×) show a nonlinear increase with respect to grid size, while the semi-coarsening times (marked by o) appear linear. Approximate curve fits for the timings are also shown in the figures. Specifically, for bump flows, the full-coarsening curve fit is

    CPU_full ≈ (4.4 × 10⁻⁴) N^{3/2} secs,


(a) 192 × 32 grid

(b) Cp contours, M∞ = 0.8, preconditioned Jacobi

Figure 11: Sample NACA 0012 grid and Cp data.


              Full coarsening     Semi coarsening
Grid          Cycles    Work      Cycles    Work
                      M∞ = 0.1
96 × 16         DNC      DNC        DNC      DNC
192 × 32        DNC      DNC        DNC      DNC
384 × 64        DNC      DNC        DNC      DNC
                      M∞ = 0.3
96 × 16          78     1737         27     1405
192 × 32        117     2641         29     1734
384 × 64        183     4144         29     1852
                      M∞ = 0.5
96 × 16          47     1047         17      885
192 × 32         74     1671         15      897
384 × 64        131     2967         16     1022
                      M∞ = 0.8
96 × 16          39      869         26     1353
192 × 32         61     1377         25     1495
384 × 64        107     2424         23     1469

Table 5: NACA 0012 flow with Jacobi results for set 1 grids. 6 orders of magnitude drop in residual. DNC: did not converge in 300 cycles.

              Full coarsening     Semi coarsening
Grid          Cycles    Work      Cycles    Work
                      M∞ = 0.1
96 × 16          35      780         18      937
192 × 32         51     1152         14      838
384 × 64         87     1971         14      894
                      M∞ = 0.3
96 × 16          34      758         16      833
192 × 32         52     1174         13      778
384 × 64         87     1971         13      831
                      M∞ = 0.5
96 × 16          39      869         17      885
192 × 32         67     1513         15      897
384 × 64        128     2899         16     1022
                      M∞ = 0.8
96 × 16          38      847         26     1353
192 × 32         61     1377         26     1555
384 × 64        106     2401         23     1469

Table 6: NACA 0012 flow with preconditioned Jacobi results for set 1 grids. 6 orders of magnitude drop in residual.

              Full coarsening     Semi coarsening
Grid          Cycles    Work      Cycles    Work
                      M∞ = 0.1
96 × 32         DNC      DNC        DNC      DNC
192 × 64        DNC      DNC        DNC      DNC
384 × 128       DNC      DNC        177    11664
                      M∞ = 0.3
96 × 32          81     1829         24     1435
192 × 64        142     3216         31     1979
384 × 128       277     6278         33     2175
                      M∞ = 0.5
96 × 32          59     1332         16      957
192 × 64        103     2333         17     1086
384 × 128       205     4646         17     1121
                      M∞ = 0.8
96 × 32          59     1332         16      957
192 × 64        108     2446         17     1086
384 × 128       195     4420         21     1385

Table 7: NACA 0012 flow with Jacobi results for set 2 grids. 6 orders of magnitude drop in residual. DNC: did not converge in 300 cycles.

              Full coarsening     Semi coarsening
Grid          Cycles    Work      Cycles    Work
                      M∞ = 0.1
96 × 32          48     1084         12      718
192 × 64         83     1880         12      767
384 × 128       147     3332         11      726
                      M∞ = 0.3
96 × 32          50     1129         12      718
192 × 64         85     1925         11      703
384 × 128       156     3536         11      726
                      M∞ = 0.5
96 × 32          55     1242         16      957
192 × 64        102     2310         17     1086
384 × 128       204     4624         17     1121
                      M∞ = 0.8
96 × 32          59     1332         15      897
192 × 64        108     2446         16     1022
384 × 128       194     4397         20     1319

Table 8: NACA 0012 flow with preconditioned Jacobi results for set 2 grids. 6 orders of magnitude drop in residual.

Figure 12: Variation of convergence with grid size. NACA 0012 flow with full-coarsening at M∞ = 0.1 for set 2 grids. Solid: 96 × 32; dash: 192 × 64; dash-dot: 384 × 128.

Figure 13: Variation of convergence with grid size. NACA 0012 flow with semi-coarsening at M∞ = 0.1 for set 2 grids. Solid: 96 × 32; dash: 192 × 64; dash-dot: 384 × 128.


Figure 14: CPU seconds versus number of cells, N. Bump flow with preconditioned Jacobi at M∞ = 0.1. ×: full-coarsening; o: semi-coarsening. Curve fits are given by CPU_full = (4.4 × 10⁻⁴) N^{3/2} and CPU_semi = (2.3 × 10⁻²) N.

Figure 15: CPU seconds versus number of cells, N. NACA 0012 flow with preconditioned Jacobi at M∞ = 0.1. ×: full-coarsening; o: semi-coarsening. Curve fits are given by CPU_full = (8.2 × 10⁻⁴) N^{3/2} and CPU_semi = (3.8 × 10⁻²) N.


and the semi-coarsening curve fit is

    CPU_semi ≈ (2.3 × 10⁻²) N secs,

where N is the total number of cells. For the airfoil results, the full-coarsening curve fit is

    CPU_full ≈ (8.2 × 10⁻⁴) N^{3/2} secs,

and the semi-coarsening curve fit is

    CPU_semi ≈ (3.8 × 10⁻²) N secs.

Thus, to good approximation, full-coarsening is an O(N^{3/2}) algorithm while semi-coarsening is O(N). At coarser grid sizes, while semi-coarsening is usually faster than full-coarsening, the CPU difference is minor; however, the real benefit of semi-coarsening is apparent for finer grids where the performance of full-coarsening degrades.
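Power-law coefficients of the kind quoted above can be recovered with an ordinary least-squares fit in log-log space. This is our own sketch of such a fit; the actual fitting procedure used for the reported coefficients is not specified in the text:

```python
import numpy as np

def fit_power_law(n_cells, cpu_secs):
    """Least-squares fit of cpu = a * N**p in log-log space.

    Returns (a, p). A sketch only: how coefficients like
    CPU_full ~ 4.4e-4 * N**1.5 could be obtained from timing data.
    """
    p, log_a = np.polyfit(np.log(n_cells), np.log(cpu_secs), 1)
    return np.exp(log_a), p

# Synthetic check: data generated from an exact N^{3/2} law is recovered.
N = np.array([512., 2048., 8192., 32768.])
a, p = fit_power_law(N, 4.4e-4 * N**1.5)
assert abs(p - 1.5) < 1e-8 and abs(a - 4.4e-4) < 1e-6
```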

5.5 Effect of the limiter

As shown by the cases above, the new limiter is a robust method which converged well in all tests. To assess the relative merits of the old (i.e., freestream-based) limiting and the new limiting, we ran a set of cases on the NACA 0012 192 × 32 grid. For the old limit, several different values of the cut-off constant were used. The results are given in Table 9. The old and new limits perform similarly except for large values of the cut-off constant, for which a noticeable drop in convergence rate is typically observed. For the higher Mach number flows, the convergence rate for the old limit is particularly insensitive to the value of the cut-off constant. This would seem to indicate that the potentially beneficial effect of low Mach number preconditioning in stagnation regions is not important for these flows. For higher Mach number flows with significant regions of low speed flow, such as occurs with separation, the effect of low speed preconditioning and limiting may be more pronounced. Notably, the new limit performs well without any tuning. To demonstrate the local limiting effect of the new limit, we investigate the old and new limiter activity for M∞ = 0.1 on a 192 × 32 grid. For the old limit, the cut-off constant is 3.0. Figure 16 shows, for the converged solution, a contour plot of the difference between the limited preconditioning parameter in each cell (defined in Equation (13)) and its unlimited value evaluated with the local cell Mach number. Thus, in regions where no limiting occurs and Mach number variations are small, this quantity will be essentially zero. As shown in the plot, the new limit is only active near the body while the old limit is active throughout the flow. Surprisingly, the old limit converges only slightly slower than the new limit even though the old limit is active through an appreciable part of the flow. Another interesting observation from the airfoil results is that a cut-off of zero not only converges but often gives the best convergence rate for a test case. However, this is contrary to what several researchers have found [39, 2, 21, 40], particularly with airfoil problems where a stagnation point is present. This suggests that the use of the block Jacobi scheme, the maximization of the preconditioning parameter over several cells as described in Section 3, or both may be contributing to the robustness of the local preconditioning observed here. In the solutions shown so far, the limits on the cell face are influenced by a total of eight cells composed of the nearest neighbors of the two


                    Full coarsening     Semi coarsening
Limit               Cycles    Work      Cycles    Work
                            M∞ = 0.1
Old: cut-off = 0       51     1152         14      838
Old: cut-off = 1       52     1174         14      838
Old: cut-off = 2       59     1332         13      778
Old: cut-off = 3       69     1558         15      897
Old: cut-off = 4       77     1738         17     1017
New                    51     1152         14      838
                            M∞ = 0.3
Old: cut-off = 0       52     1174         13      778
Old: cut-off = 1       53     1197         13      778
Old: cut-off = 2       65     1468         14      838
Old: cut-off = 3       76     1716         16      957
Old: cut-off = 4       85     1919         20     1196
New                    52     1174         13      778
                            M∞ = 0.5
Old: cut-off = 0       67     1513         15      897
Old: cut-off = 1       74     1671         15      897
Old: cut-off = 2       74     1671         15      897
Old: cut-off = 3       74     1671         15      897
Old: cut-off = 4       74     1671         15      897
New                    67     1513         15      897
                            M∞ = 0.8
Old: cut-off = 0       61     1377         26     1555
Old: cut-off = 1       61     1377         25     1495
Old: cut-off = 2       61     1377         25     1495
Old: cut-off = 3       61     1377         25     1495
Old: cut-off = 4       61     1377         25     1495
New                    61     1377         26     1555

Table 9: NACA 0012 flow convergence results for old and new limits. Grid 192 × 32 cells. 6 orders of magnitude drop in residual.


(a) old limit (cut-off = 3)

(b) new limit

Figure 16: Plots of the difference between the limited and unlimited preconditioning parameter in each cell, with 41 equally spaced contours from 0 to 0.03, for NACA 0012 at M∞ = 0.1 on the 192 × 32 grid.


              Semi coarsening
Grid          Cycles    Work
            M∞ = 0.1
96 × 32         12      718
192 × 64        12      767
384 × 128      UNS      UNS
            M∞ = 0.3
96 × 32         12      718
192 × 64        11      703
384 × 128       11      726
            M∞ = 0.5
96 × 32         15      897
192 × 64        17     1086
384 × 128       17     1121
            M∞ = 0.8
96 × 32         15      897
192 × 64        16     1022
384 × 128       20     1319

Table 10: Convergence results for NACA 0012 flow using the semi-coarsening algorithm with no limiting for set 2 grids. 6 orders of magnitude drop in residual. UNS: unstable.

cells at the face. The algorithm will pick the largest value of the preconditioning parameter from this eight-cell stencil. Thus, even with a cut-off of zero, the influence from the other cells in the stencil can still keep the parameter from going to zero. In order to gain better insight into the effect of the eight-cell stencil, the algorithm for determining the face value of the parameter was modified to use only the Mach number from the two reconstructed states at the face. No other limiting process was used (i.e., no freestream- or p-based cut-off). Table 10 shows the results for the NACA 0012 airfoil with preconditioned Jacobi and semi-coarsening for the set 2 grids. Only one case becomes unstable (the finest grid at M∞ = 0.1), while the other cases converge almost identically to the local limiter results in Table 8. The conclusion we draw is that the Jacobi formulation increases the robustness of the overall algorithm and, in combination with the limiting strategy described in Section 3, provides a robust and efficient algorithm for Euler calculations.
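The two limiting variants compared in Table 10 differ only in which cells contribute to the face value of the preconditioning parameter. The sketch below contrasts them; the names (`face_param_stencil`, `face_param_two_state`, `param`) are ours for illustration, and the 1-D neighbor map is only a toy stand-in for the eight-cell stencil of the 2-D scheme:

```python
import numpy as np

def face_param_stencil(param_cells, left, right, neighbors):
    """Face value of the preconditioning parameter using the stencil
    approach: the maximum over the two face cells and their nearest
    neighbors (eight cells in total in the 2-D scheme)."""
    stencil = [left, right] + neighbors[left] + neighbors[right]
    return max(param_cells[i] for i in stencil)

def face_param_two_state(param_left, param_right):
    """Modified variant tested in Table 10: only the two reconstructed
    face states are consulted, with no other cut-off applied."""
    return max(param_left, param_right)

# Tiny illustration on a 1-D strip of cells 0..4, face between cells 1 and 2.
param = np.array([0.3, 0.0, 0.0, 0.2, 0.1])
nbrs = {1: [0, 2], 2: [1, 3]}
assert face_param_stencil(param, 1, 2, nbrs) == 0.3    # neighbor keeps it up
assert face_param_two_state(param[1], param[2]) == 0.0  # can drop to zero
```

This makes the mechanism behind the Table 10 observation concrete: with the stencil, a nonzero neighbor value prevents the face parameter from vanishing even when the cut-off is zero.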

6 Unstructured Multigrid Results

Mavriplis[2, 41, 42] has developed a multigrid algorithm for solving high Reynolds number flows on unstructured meshes which utilizes many of the same techniques as discussed above. He has also reviewed the current state of the art in convergence acceleration for unstructured meshes[43]. In his approach, coarse grids are determined by agglomerating fine grid cells which are strongly coupled. The level of coupling is determined by a graph algorithm weighted to account for local grid stretching. In regions of the mesh with isotropic cells, the agglomeration is isotropic; however, in regions where the mesh is anisotropic, the algorithm effectively joins cells in the anisotropic direction, producing more isotropic coarse grid cells. To strengthen his smoothing algorithm, Mavriplis also constructs a directional line implicit scheme. The lines are formed similarly to the agglomeration procedure. Thus, in isotropic regions, the lines reduce to points (i.e., a point block Jacobi algorithm is used). However, in regions of stretched grids, the lines extend across the anisotropies. In the case of a high Reynolds number application, the mesh near wall boundaries will be highly anisotropic, with thin cells normal to the wall. Thus, the agglomeration procedure will tend to produce coarse grid cells which are grouped normal to the wall; similarly, the directional implicit lines will also be largely normal to the wall. Mavriplis also implemented the Turkel/Weiss-Smith preconditioner to remove low Mach number stiffness. His implementation is identical to that described in Section 2.3. Some sample convergence histories for this basic algorithm are shown in Figures 17 and 18. Figure 17 is the convergence history for a two-dimensional, viscous calculation over an RAE 2822 airfoil at several Mach numbers with and without preconditioning[41]. The results clearly show the dramatic performance gains associated with low Mach number preconditioning. In particular, with low Mach number preconditioning, the convergence histories for M∞ = 0.01, 0.1, and 0.73 are all very similar. Results for a three-dimensional high-lift calculation[42] are shown in Figure 18. In this case, the low Mach number preconditioning is less effective. While the lift coefficient converges faster with preconditioning, the actual level of residual convergence is less with preconditioning.
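The coupling-weighted agglomeration idea can be illustrated with a toy greedy pass. This is our own simplified stand-in, not Mavriplis' actual graph algorithm; the threshold rule and all names are assumptions made for the sketch:

```python
def agglomerate(coupling, threshold=0.5):
    """Toy greedy agglomeration: each unassigned cell seeds a coarse
    cell and absorbs unassigned neighbors whose coupling weight is at
    least `threshold` times its strongest coupling.

    `coupling[i][j]` > 0 marks i and j as neighbors; larger means more
    strongly coupled (e.g. weighted to reflect local grid stretching),
    so stretched cells merge along the anisotropic direction.
    """
    n = len(coupling)
    group = [-1] * n
    ngroups = 0
    for seed in range(n):
        if group[seed] != -1:
            continue
        group[seed] = ngroups
        strongest = max(coupling[seed]) or 1.0
        for j in range(n):
            if group[j] == -1 and coupling[seed][j] >= threshold * strongest:
                group[j] = ngroups
        ngroups += 1
    return group

# Anisotropic strip: cells 0-1 and 2-3 are strongly coupled pairs
# (weight 10) and only weakly coupled across pairs (weight 1), so the
# pass joins cells along the strong direction into two coarse cells.
w = [[0, 10, 1, 0],
     [10, 0, 0, 1],
     [1, 0, 0, 10],
     [0, 1, 10, 0]]
assert agglomerate(w) == [0, 0, 1, 1]
```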

7 Closing Remarks

The use of local preconditioning to combat low Mach number stiffness is important in developing robust multigrid algorithms for the Euler and Navier-Stokes equations. In conjunction with local preconditioning, a semi-coarsening multigrid technique for two-dimensional Euler calculations has been demonstrated which produces grid-independent and Mach-number-independent convergence rates. Similar results have been found for two-dimensional Navier-Stokes calculations on unstructured meshes. However, the effectiveness of local preconditioning for the complex case of a three-dimensional, high-lift configuration is currently less impressive and remains a significant challenge for preconditioned multigrid methods.

Acknowledgements

The authors would like to acknowledge the assistance of Dimitri Mavriplis in providing figures from his recent work. This work has been partially supported by the NSF through NSF CAREER Award (ACS-9702435), The Boeing Company, ICASE, and NASA Langley.

Figure 17: Comparison of convergence rates for low Mach number preconditioned and unpreconditioned directional-implicit, agglomeration multigrid algorithms. RAE 2822 airfoil at a Reynolds number of 6.5 million[41].

Figure 18: Convergence rate for a 24.7 million node grid using 5 and 6 multigrid levels with and without low Mach number preconditioning for a three-dimensional, Energy Efficient Transport (EET) wing calculation at 0.2 Mach number, 10 degrees incidence, 1.6 million Reynolds number, and high-lift configuration[42].

References

[1] D.L. Darmofal and K. Siu. A robust locally preconditioned multigrid algorithm for the Euler equations. AIAA Paper 98-2428 (to appear, Journal of Computational Physics), 1998.
[2] D. Mavriplis. Multigrid strategies for viscous flow solvers on anisotropic unstructured meshes. AIAA Paper 97-1952, 1997.
[3] A. Brandt. A guide to multigrid development. In W. Hackbush and U. Trottenberg, editors, Multigrid Methods. Springer Verlag, 1981.
[4] S.R. Allmaras. Analysis of a local matrix preconditioner for the 2-D Navier-Stokes equations. AIAA Paper 93-3330, 1993.
[5] D.C. Jespersen. Design and implementation of a multigrid code for the Euler equations. Applied Mathematics and Computation, 13:357-374, 1983.
[6] W.A. Mulder. Multigrid relaxation for the Euler equations. Journal of Computational Physics, 60:235-252, 1985.
[7] W.K. Anderson, J.L. Thomas, and D.L. Whitfield. Multigrid acceleration of the flux split Euler equations. AIAA Paper 86-0274, 1986.
[8] P.W. Hemker and S.P. Spekreijse. Multiple grid and Osher's scheme for the efficient solution of the steady Euler equations. Applied Numerical Mathematics, 2:475-493, 1986.
[9] B. Koren. Defect correction and multigrid for an efficient and accurate computation of airfoil flows. Journal of Computational Physics, 77:183-206, 1988.
[10] B. Koren. Multigrid and defect correction for the steady Navier-Stokes equations. Journal of Computational Physics, 87:25-46, 1990.
[11] W.K. Anderson, R.D. Rausch, and D.L. Bonhaus. Implicit/multigrid algorithms for incompressible turbulent flows on unstructured grids. Journal of Computational Physics, 128:391-408, 1996.
[12] A. Jameson. Solution of the Euler equations for two-dimensional transonic flow by a multigrid method. Applied Mathematics and Computation, 13:327-356, 1983.
[13] B. van Leer, J. Thomas, P. Roe, and R. Newsome. A comparison of numerical flux formulas for the Euler and Navier-Stokes equations. AIAA Paper 87-1104-CP, 1987.
[14] S.R. Allmaras. Contamination of laminar boundary layers by artificial dissipation in Navier-Stokes solutions. Proc. Conf. Numerical Methods in Fluid Dynamics, 1992.
[15] R.C. Swanson and E. Turkel. Aspects of a high-resolution scheme for the Navier-Stokes equations. AIAA Paper 93-3372-CP, 1993.
[16] W.A. Mulder. A new approach to convection problems. Journal of Computational Physics, 83:303-323, 1989.
[17] W.A. Mulder. A high resolution Euler solver based on multigrid, semi-coarsening, and defect correction. Journal of Computational Physics, 100:91-104, 1992.
[18] E. Turkel. Preconditioned methods for solving the incompressible and low speed compressible equations. Journal of Computational Physics, 72:277-298, 1987.
[19] B. van Leer, W.T. Lee, and P.L. Roe. Characteristic time-stepping or local preconditioning of the Euler equations. AIAA Paper 91-1552, 1991.
[20] E. Turkel. Preconditioning-squared methods for multidimensional aerodynamics. AIAA Paper 97-2025, 1997.
[21] B. van Leer, L. Mesaros, C.H. Tai, and E. Turkel. Local preconditioning in a stagnation point. AIAA Paper 95-1654, 1995.
[22] D.L. Darmofal and P.J. Schmid. The importance of eigenvectors for local preconditioners of the Euler equations. Journal of Computational Physics, 127:346-362, 1996.
[23] P.L. Roe. Approximate Riemann solvers, parameter vectors, and difference schemes. Journal of Computational Physics, 43:357-372, 1981.
[24] B. van Leer. Upwind-difference methods for aerodynamic problems governed by the Euler equations. Lectures in Applied Mathematics, 22, 1985.
[25] J.F. Lynn and B. van Leer. A semi-coarsening multigrid solver for the Euler and Navier-Stokes equations with local preconditioning. AIAA Paper 95-1667, 1995.
[26] J. Lynn. Multigrid solution of the Euler equations with local preconditioning. PhD thesis, University of Michigan, 1995.
[27] J.M. Weiss and W.A. Smith. Preconditioning applied to variable and constant density flows. AIAA Journal, 33(11):2050-2057, 1995.
[28] R. Radespiel and R.C. Swanson. Progress with multigrid schemes for hypersonic flow problems. Journal of Computational Physics, 116:103-122, 1995.
[29] K. Siu. A robust locally preconditioned multigrid algorithm for the 2-D Euler equations. Master's thesis, Texas A&M University, 1998.
[30] D.L. Darmofal and B. van Leer. Local preconditioning of the Euler equations: A characteristic interpretation. 30th VKI Lecture Series in Computational Fluid Dynamics, 1999.
[31] D.L. Darmofal and B. van Leer. Local preconditioning: Manipulating Mother Nature to fool Father Time. In M. Hafez and D.A. Caughey, editors, Computing the Future II: Advances and Prospects in Computational Aerodynamics. John Wiley and Sons, 1998.
[32] E. Turkel, V.N. Vatsa, and R. Radespiel. Preconditioning methods for low-speed flows. AIAA Paper 96-2460, 1996.
[33] D. Jespersen, T. Pulliam, and P. Buning. Recent enhancements to OVERFLOW. AIAA Paper 97-0644, 1997.
[34] B. van Leer, C.H. Tai, and K.G. Powell. Design of optimally smoothing multi-stage schemes for the Euler equations. AIAA Paper 89-1933, 1989.
[35] J.F. Lynn and B. van Leer. Multi-stage schemes for the Euler and Navier-Stokes equations with optimal smoothing. AIAA Paper 93-3355, 1993.
[36] A.C. Godfrey. Steps toward a robust preconditioning. AIAA Paper 94-0520, 1995.
[37] S.R. Allmaras. Analysis of semi-implicit preconditioners for multigrid solution of the 2-D Navier-Stokes equations. AIAA Paper 95-1651, 1995.
[38] N.A. Pierce and M.B. Giles. Preconditioning compressible flow calculations for stretched grids. AIAA Paper 96-0889, 1996.
[39] D. Lee. Local preconditioning of the Euler and Navier-Stokes equations. PhD thesis, University of Michigan, 1996.
[40] C.L. Reed and D.A. Anderson. Application of low speed preconditioning to the compressible Navier-Stokes equations. AIAA Paper 97-0873, 1997.
[41] D. Mavriplis. Directional agglomeration multigrid techniques for high-Reynolds number viscous flows. AIAA Paper 98-0612, 1998.
[42] D. Mavriplis and S. Pirzadeh. Large-scale parallel unstructured mesh computations for 3-D high-lift analysis. AIAA Paper 99-0537, 1999.
[43] D. Mavriplis. On convergence acceleration techniques for unstructured meshes. AIAA Paper 98-2966, 1998.