Steady and unsteady RANS simulations using a 3D unstructured grid solver with Newton-Krylov acceleration Aldo Bonfiglioli and Bruno Carpentieri and M.Sergio Campobasso October 8, 2010
Abstract This report describes the research activity carried out in the framework of the Standard HPC Grant 2009. The key thread of this activity lies in demonstrating the effectiveness of Newton’s rootfinding method in solving the large system of non-linear algebraic equations arising from the discretization of the Navier-Stokes and Reynolds Averaged Navier-Stokes equations using unstructured tetrahedral grids. The aforementioned technique has been applied to both steady and un-steady flow simulations. In this latter case, a dual time-stepping technique is used to advance the solution in physical time. Last, but not least, this research activity offered the opportunity to validate the un-steady capability that has been recently implemented within the EulFS code.
Chapter 1
Introduction

Newton’s method (also known as the Newton-Raphson method), named after Isaac Newton and Joseph Raphson, is perhaps the best known method for finding successively better approximations to the zeroes (or roots) of a real-valued function. Newton’s method can often converge remarkably quickly, especially if the iteration begins “sufficiently near” the desired root. Just how near “sufficiently near” needs to be, and just how quickly “remarkably quickly” can be, depends on the problem. [...] Unfortunately, when iteration begins far from the desired root, Newton’s method can easily lead an unwary user astray with little warning.

This quote from Wikipedia [1] explains well the strengths and weaknesses of Newton’s method; it also points to some of the reasons why the method has received little attention from the CFD community. These can be summarized as high CPU and memory cost, and lack of robustness.

In this work, Newton’s algorithm is used to solve the large sparse systems of non-linear algebraic equations that arise from the discretization of the Navier-Stokes (NS) and Reynolds Averaged Navier-Stokes (RANS) equations, for both steady and unsteady flows. Unsteady flows are dealt with using a dual time-stepping approach [2]. The use of a class of discretization schemes with a compact stencil makes it possible to build and store a Jacobian matrix which is as sparse as the graph of the underlying unstructured triangulation (or tetrahedralization in 3D). This Jacobian matrix is computed without introducing approximations, even for second-order accurate discretizations, by resorting to one-sided Finite Difference formulae. An accurately computed Jacobian is key to obtaining the quadratic convergence property of Newton’s method.

The second issue mentioned above, namely the need for a “good” initial guess, is readily addressed in the context of unsteady calculations using a dual time-stepping strategy. This is because the solution of the flow field at a given physical time starts from the converged solution at the preceding time, and the latter constitutes a very convenient initial state. In the case of steady simulations, the initial guess for Newton’s method has to be found using an alternative, more robust, solution strategy which yields a reasonable initial guess even when starting from scratch. One such strategy is described and adopted in this report.

The key thread of this work consists in demonstrating the applicability of Newton’s method to the solution of the systems arising from the discretization of the RANS and URANS equations, which suffer from severe stiffness problems due to the coupling between the mean flow and turbulent transport equations.
Chapter 2
Numerical methodology
2.1 Governing equations and discretization technique
The EulFS code solves the conservation-law form of the compressible and incompressible Euler, NS and RANS equations using unstructured grids made of triangles in 2D and tetrahedra in 3D. The Euler and NS equations translate the fundamental laws of conservation of mass and momentum, as well as energy in the compressible case, into a system of PDEs. Incompressible flows are dealt with using the artificial compressibility approach [3]. When solving the RANS equations, an additional set of one or more PDEs is added in order to describe (somewhat empirically) the effect of the unresolved turbulent scales upon the mean flow quantities. In the present work the one-equation model proposed by Spalart and Allmaras [4] is employed.

One of the novel features of this study is the coupling of a hybrid class of methods for the space discretization, called Fluctuation Splitting (FS, or residual distribution) schemes [5], with a fully coupled Newton algorithm for solving the RANS equations. By “fully coupled” we mean that the mass, momentum and energy conservation equations on one hand, and the turbulent transport equation on the other, are solved simultaneously rather than in a decoupled or staggered fashion. In an FS discretization the dependent variables (different sets are used in the compressible and incompressible cases) are stored at the vertices (gridpoints) of the mesh and assumed to vary linearly in space, being also continuous across the interfaces between adjacent cells. Further details concerning the space and time discretization of the governing conservation equations are given in [6, 7] and will not be repeated here, for brevity. It is sufficient to point out that the space- and time-discretized conservation equations at the ith gridpoint read:

$$R(U)_i = \oint_{\partial C_i} \mathbf{n} \cdot (F - G)\, dS + \int_{C_i} \frac{\partial U_i}{\partial t}\, dV - \int_{C_i} S\, dV = 0 \qquad (2.1)$$

In Eq. (2.1) $t$ is the independent variable which describes the physical time; $F$ and $G$ are the inviscid and viscous fluxes, respectively, $S$ is a source term which is non-zero only in the turbulence transport equation and $U$ is the set of dependent variables. Volume integrals in Eq. (2.1) extend over the control volume $C_i$ (shown by solid green lines in Fig. 2.1) centred on gridpoint $i$, and surface integrals extend over its boundary $\partial C_i$.

Figure 2.1: 2D triangular tessellation and median dual grid.

When dealing with steady-state calculations, the time derivative term in Eq. (2.1) is dropped. Equation (2.1) is in fact equivalent to a set of $m$ scalar equations, where $m$ is the number of state variables (or degrees of freedom) defined at each gridpoint. For instance, for the compressible RANS equations in two space dimensions, $m = 5$: this is the number of conservation equations (4: mass, energy and the x and y components of the momentum equation) plus the number of turbulence transport equations (1 for the Spalart-Allmaras model implemented in EulFS). Grouping into a single vector $R$ the $m$ conservation equations of all gridpoints of the mesh, we end up with a large sparse system of non-linear algebraic equations, which can be formally stated as follows:

$$R(U) = 0. \qquad (2.2)$$

In the case of steady flows, $U$ is the steady solution we are looking for. In the case of unsteady flows, solved using a dual time-stepping approach, Eq. (2.2) is solved at each time step using the solution $U^n$ at time level $n$ as initial condition, and $U = U^{n+1}$ is the sought solution at the subsequent time level $n + 1$. We use Newton’s algorithm to solve Eq. (2.2).
When dealing with a real-valued function $f$ of a scalar argument $x$ ($f'$ being its first derivative), Newton’s algorithm generates a sequence of values $x^k$ (eventually) converging towards the sought solution $\xi$, such that $f(\xi) = 0$. Indeed, Taylor expanding the function $f$ about $x$ and evaluating the expansion at $\xi$:

$$0 = f(\xi) = f(x) + f'(x)\,(\xi - x) + O\left((\xi - x)^2\right) \qquad (2.3)$$

and ignoring terms of the order of $(x - \xi)^2$ leads to the following sequence:

$$x^{k+1} = x^k - \left[f'(x^k)\right]^{-1} f(x^k)$$

which can also be cast as follows:

$$-f'(x^k)\,(x^{k+1} - x^k) = f(x^k). \qquad (2.4)$$
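As an illustration of Eq. (2.4), the following minimal Python sketch applies the scalar iteration to a hypothetical test function, f(x) = x² − 2, which is not taken from EulFS. The function names, the tolerance and the starting value are assumptions made only for this example.

```python
def newton_scalar(f, fprime, x0, tol=1e-12, max_steps=20):
    """Iterate Eq. (2.4): -f'(x^k) (x^{k+1} - x^k) = f(x^k)."""
    x = x0
    history = [x]  # all iterates, starting from the initial guess
    for _ in range(max_steps):
        fx = f(x)
        if abs(fx) < tol:
            break
        dx = -fx / fprime(x)  # solve the (scalar) linear problem for the update
        x = x + dx
        history.append(x)
    return x, history


# Hypothetical test problem: the positive root of f(x) = x^2 - 2.
f = lambda x: x * x - 2.0
fprime = lambda x: 2.0 * x

root, history = newton_scalar(f, fprime, x0=1.0)
errors = [abs(x - 2.0 ** 0.5) for x in history]
for k, err in enumerate(errors):
    print(k, err)
```

Printing the error at each step shows how quickly the iterates approach the root once the starting value is close enough; starting far from the root gives no such guarantee.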
One of the key features of Newton’s method is its quadratic convergence property, which can be translated into the following statement, valid once $x^k$ is sufficiently close to $\xi$:

$$|x^{k+1} - \xi| \le C\,|x^k - \xi|^2$$

$C$ being a constant. When dealing with a system of non-linear equations, Newton’s method can be generalized so that Eq. (2.4) becomes:

$$-J(U^k)\,(U^{k+1} - U^k) = R(U^k) \qquad (2.5)$$

where $J = \partial R/\partial U$ is the Jacobian matrix of the residual vector $R$. Equation (2.5) makes it evident that, when Newton’s algorithm is used to find the roots of a system of equations, each Newton step (which we also call inner iteration) requires the solution of a large (sparse) system of linear algebraic equations. In EulFS, this is done by means of an iterative solver, namely a preconditioned GMRES algorithm, provided by the PETSc library.

Before proceeding any further, we wish to establish the connection between the dual time-stepping approach which we use for unsteady calculations and Newton’s algorithm. In Jameson’s dual time-stepping approach [2] the solution of Eq. (2.2) is obtained by means of an implicit approach based on the use of a fictitious time derivative. This approach amounts to solving the following evolutionary problem:

$$V_M \frac{dU}{d\tau} = R(U) \qquad (2.6)$$

in pseudo-time $\tau$ until steady state is reached. Since accuracy in pseudo-time is irrelevant, the mass matrix arising from the integration of the pseudo-time derivative term in Eq. (2.6) has been lumped into the diagonal matrix $V_M$, and a first-order accurate, two-time-level FD formula:

$$\frac{dU}{d\tau} \approx \frac{U^{n+1,k+1} - U^{n+1,k}}{\Delta\tau} \qquad (2.7)$$

is used to approximate the pseudo-time derivative in the l.h.s. of Eq. (2.6). (The diagonal matrix $V_M$ can be represented as a one-dimensional array of block size $m$ whose ith block has $m$ repeated values equal to the volume, or area in 2D, of the control volume centred on gridpoint $i$.) The inner iteration counter $k$ has been introduced in Eq. (2.7) to label the pseudo-time levels. Once Eq. (2.7) is substituted into Eq. (2.6), an implicit scheme is obtained if the residual $R$ is evaluated at the unknown pseudo-time level $k + 1$. Taylor expanding $R$ about pseudo-time level $k$, one obtains the following (sparse) system of linear equations:

$$\left(\frac{1}{\Delta\tau_k} V_M - J(U^{n+1,k})\right) \Delta U = R(U^{n+1,k}), \qquad \Delta U = U^{n+1,k+1} - U^{n+1,k} \qquad (2.8)$$

to be solved at each inner iteration until the required convergence of $R$ is obtained. Equation (2.8) has to be supplemented with the initial condition $U^{n+1,0} = U^n$.

It is only in the limit of infinite $\Delta\tau_k$ that Eq. (2.8) recovers Newton’s rootfinding algorithm, Eq. (2.5). What we do in practice is to retain the diagonal term (by using a finite $\Delta\tau_k$) in the l.h.s. of Eq. (2.8) during the first inner iterations, when the current iterate is likely to be far from the sought solution. As soon as the norm of the r.h.s. of Eq. (2.8) decreases, thus indicating that we are approaching the solution, the pseudo-time step length is increased, reaching very large values during the last inner iterations. We shall describe later on how the pseudo-time step is varied in practice. The reason for adding a (positive) diagonal matrix to the Jacobian matrix is twofold: it increases the diagonal dominance, thus making the linear solve easier, and it also ensures a slower evolution in pseudo-time, which can be beneficial during the early inner iterations to ensure that the iterative process does not blow up. Observe that steady RANS simulations can be accommodated within the present integration scheme by simply dropping the physical time-derivative term in Eq. (2.1).

Despite its excellent convergence characteristics, Newton’s method comes with a number of caveats that explain why it has found limited application within the CFD community. These can be summarized by the following two: cost and robustness.
We shall now address these two issues in some more detail. The evaluation of the Jacobian matrix is costly both in terms of CPU and storage. It should also be done accurately, since any approximation is likely to spoil the quadratic convergence rate of Newton’s algorithm. In EulFS we take advantage of the fact that the computational stencil is compact, being limited to the distance-1 neighbours even in the case of second-order accurate schemes. Therefore, the sparsity pattern of the Jacobian matrix coincides with the graph of the underlying unstructured grid.

The evaluation of the Jacobian is in any case a costly operation, regardless of whether it is done analytically or numerically, say by using a Finite Difference approximation. The analytical approach is more accurate, but may be very cumbersome to derive “by hand”, particularly when turbulence models are involved. In addition, all the hand-made differentiation has to be redone each time a new model, e.g. a new low-speed preconditioner, is implemented within the code. Automatic differentiation might be a valuable alternative, but it has not been tried with EulFS. Numerical Jacobians are more flexible: no changes are needed when a new turbulence model or discretization scheme is added. They are however less accurate than analytical Jacobians; this is because, in order to reduce the number of residual evaluations required to build up the FD Jacobian, one-sided, hence first-order accurate, FD formulae are employed.

Two alternatives are currently available in EulFS: one uses a numerical approximation of the “true” Jacobian, obtained using one-sided finite difference (FD) formulae; the other is based on an analytical evaluation of an approximate Jacobian, computed via Picard linearization. In both cases, the individual entries of the Jacobian matrix are explicitly computed and stored in memory. In the approximate Picard linearization, we write:

$$R(U) = C(U)\, U$$

in which $C$ is a matrix. The approximate Jacobian is then analytically computed as:

$$J \approx C(U)$$

i.e. we neglect the dependence of $C$ upon $U$ when evaluating $\partial R/\partial U$. This approximate, analytical linearization is used in the tandem solution approach which will be described shortly hereafter. The FD approximation to the true Jacobian matrix is a costly operation that requires $(d + 1) \times m$ residual evaluations, where $d$ is the spatial dimension ($d + 1$ is the number of vertices of a triangular/tetrahedral cell) and $m$ is the number of dependent variables at each gridpoint.
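The difference between the two linearizations can be made concrete with a small sketch. The Python code below is not EulFS code: it uses a hypothetical two-unknown residual written in the Picard form R(U) = C(U) U, builds a dense Jacobian by one-sided finite differences (perturbing one unknown at a time, rather than exploiting the compact stencil as described above), and compares both the FD Jacobian and the Picard matrix C(U) with the exact Jacobian of the toy problem. The perturbation size `eps` is an assumed value.

```python
def picard_matrix(u):
    """C(U) of the toy problem; also the Picard approximation J ~ C(U)."""
    return [[u[0], 1.0], [1.0, u[1]]]


def residual(u):
    """Hypothetical toy residual in the Picard form R(U) = C(U) U."""
    c = picard_matrix(u)
    n = len(u)
    return [sum(c[i][j] * u[j] for j in range(n)) for i in range(n)]


def exact_jacobian(u):
    """Analytical dR/dU of the toy residual, used only as a reference."""
    return [[2.0 * u[0], 1.0], [1.0, 2.0 * u[1]]]


def fd_jacobian(res, u, eps=1.0e-7):
    """One-sided (first-order) finite-difference Jacobian, column by column."""
    n = len(u)
    r0 = res(u)  # unperturbed residual, reused for every column
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        h = eps * max(1.0, abs(u[j]))
        u_pert = list(u)
        u_pert[j] += h
        r1 = res(u_pert)
        for i in range(n):
            jac[i][j] = (r1[i] - r0[i]) / h
    return jac


def max_abs_diff(a, b):
    """Largest entry-wise difference between two dense matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


u = [1.5, -0.5]
err_fd = max_abs_diff(fd_jacobian(residual, u), exact_jacobian(u))
err_picard = max_abs_diff(picard_matrix(u), exact_jacobian(u))
print("FD Jacobian error:    ", err_fd)
print("Picard Jacobian error:", err_picard)
```

In this toy case the FD Jacobian differs from the exact one only by a truncation error of the order of the perturbation, whereas the Picard matrix misses the terms that come from differentiating C(U); this is the reason why a Picard linearization cannot be expected to deliver quadratic convergence.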
The FD Jacobian pays off only when the quadratic convergence of Newton’s rootfinding method can be fully exploited. This is likely to occur when dealing with unsteady flow problems, because the solution of the flow field at a given physical time starts from the converged solution at the preceding time, and the latter constitutes a very convenient initial state. In fact, it is sufficiently close to the sought new solution to allow the use of the exact Newton’s method from the first solution step. Even in the case of unsteady flows, though, we retain the diagonal term in the l.h.s. of Eq. (2.8) during the first inner iterations, possibly using a larger pseudo-time step than would be used in a steady calculation.

When solving the steady RANS equations, however, a good initial guess is generally not available, and using the FD Jacobian approximation starting from scratch is likely to make the inner iterations diverge. Therefore, the following two-step approach is adopted in practice. In the early stages of the calculation we solve the turbulent transport equation in tandem with the mean flow equations: the mean flow solution is advanced over a single pseudo-time step using an approximate Jacobian (i.e. by Picard linearization) while keeping the turbulent viscosity frozen; then the turbulent variable is advanced over one or more pseudo-time steps using an FD Jacobian with frozen mean flow variables. Due to the uncoupling between the mean flow and turbulent transport equations, this procedure will eventually converge to steady state, but never yields quadratic convergence. Once the solution has come close to steady state, a true Newton strategy is adopted: the mean flow and turbulence transport equations are solved in fully coupled form and the Jacobian is computed by FD.

The pseudo-time step length $\Delta\tau_k$ is selected according to the Switched Evolution Relaxation (SER) strategy proposed by Mulder and van Leer [8], as:

$$\Delta\tau_k = \Delta\tau \, \min\left( C_{max},\; C_0\, \frac{\|R(U^{n+1,0})\|_2}{\|R(U^{n+1,k})\|_2} \right) \qquad (2.9)$$

where $\Delta\tau$ is the pseudo-time step computed using the stability criterion of the explicit time integration scheme. $C_0$ and $C_{max}$ are user-defined constants that control the initial and maximum pseudo-time steps used in the actual calculations. When using the tandem solution approach, $C_0$ is of order 1 and $C_{max}$ is typically set equal to 100, although larger values may be tolerable. In the fully coupled approach $C_{max}$ has to be set equal to infinity in order to recover Newton’s method, while $C_0$ can be taken of the order of 10 or larger, depending on how close the initial solution is to the root. Note that, when evaluating the norms of the residual in Eq. (2.9), we use a component-wise norm, i.e. we monitor the residual of one of the conservation equations.

The Jacobian matrix requires considerable storage; as mentioned previously, however, for the class of schemes used in the EulFS code the sparsity pattern of the Jacobian matrix only extends to the distance-1 neighbours of a given gridpoint. Cheaper approaches in terms of storage are also possible: these are often referred to as matrix-free Newton-Krylov algorithms. Matrix-free methods take advantage of the fact that Krylov algorithms do not explicitly need $J$, but only require matrix-vector products involving $J$. These matrix-vector products can be approximated by Finite Differences without ever computing (and storing) $J$. The matrix-free approach saves memory, but requires extra evaluations of $R(U)$ during each linear solve, thus increasing the overall CPU cost. For the class of schemes used in EulFS, which are somewhat more costly than state-of-the-art Finite Volume (FV) schemes, the storage of the Jacobian matrix has been preferred. Moreover, Krylov methods (such as GMRES) require a preconditioner (such as an incomplete lower-upper factorization) to reduce the number of linear iterations. In matrix-free methods this preconditioner is typically built (and stored) using a low-order approximation of the residual $R(U)$. The use of a lower-order approximation to build the preconditioner has two motivations: low-order discretizations are generally cheaper in terms of floating-point operations and also have a more compact stencil than higher-order approximations. In summary, matrix-free methods require the storage of the preconditioner, but not that of the Jacobian matrix. It should also be kept in mind that certain second-order accurate FV discretizations have a stencil which extends beyond the distance-1 neighbours, thus making the storage of the Jacobian matrix very costly in terms of memory.

We have mentioned the fact that a preconditioner is needed to speed up the iterative solution of the linear system (2.5) at each Newton step. It is our experience, also shared by others (reference to be added), that the stiffness of the matrix $J$ increases quite dramatically when the mean flow equations are solved fully coupled with the turbulence transport equations in a RANS or URANS simulation. Decoupling the turbulence equations from the mean flow equations, which is done in the tandem solution approach, relieves this stiffness, but also destroys quadratic convergence. As mentioned previously, the addition of the diagonal term $(1/\Delta\tau_k)\, V_M$ in the l.h.s. of Eq. (2.8) improves the conditioning of the linear systems. However, in order to retain quadratic convergence, the pseudo-time step is increased as one gets closer to the solution. By doing so, the diagonal term in Eq. (2.8) vanishes and one is left with Newton’s algorithm (2.5) during the last inner iterations. During the last Newton steps an effective preconditioner is therefore needed. In sequential runs, incomplete lower-upper factorizations of increasing level of fill turn out to be necessary, and in most cases sufficient, to keep the number of linear iterations within acceptable bounds. In the parallel case this becomes an even more relevant issue.
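The interplay between Eq. (2.8) and the SER rule of Eq. (2.9) can be sketched on a toy problem. The Python code below is only an illustration under several assumptions: the two-unknown residual is hypothetical and chosen so that the pseudo-time evolution of Eq. (2.6) is stable, the Jacobian is analytical rather than computed by FD, the matrix V_M is a scalar multiple of the identity, and a direct 2×2 solve stands in for the preconditioned GMRES solver. None of the names or default values come from EulFS.

```python
import math


def residual(u):
    """Hypothetical toy residual R(U); its root is U = (1, 0)."""
    return [-(u[0] ** 3 + u[1] - 1.0), -(u[1] ** 3 - u[0] + 1.0)]


def jacobian(u):
    """Analytical Jacobian J = dR/dU of the toy residual."""
    return [[-3.0 * u[0] ** 2, -1.0], [1.0, -3.0 * u[1] ** 2]]


def norm2(v):
    return math.sqrt(sum(x * x for x in v))


def solve_2x2(a, b):
    """Direct 2x2 solve by Cramer's rule (stand-in for preconditioned GMRES)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


def pseudo_transient_newton(u0, dtau=0.1, c0=1.0, cmax=math.inf,
                            volume=1.0, tol=1e-12, max_steps=50):
    """Inner iterations of Eq. (2.8) with the SER step control of Eq. (2.9)."""
    u = list(u0)
    r_init = norm2(residual(u))
    history = [r_init]  # residual norm after each inner iteration
    for _ in range(max_steps):
        r = residual(u)
        r_norm = norm2(r)
        if r_norm < tol:
            break
        # SER, Eq. (2.9): the pseudo-time step grows as the residual drops.
        dtau_k = dtau * min(cmax, c0 * r_init / r_norm)
        jac = jacobian(u)
        diag = volume / dtau_k
        # Left-hand side of Eq. (2.8): V_M / dtau_k - J.
        lhs = [[diag - jac[0][0], -jac[0][1]],
               [-jac[1][0], diag - jac[1][1]]]
        du = solve_2x2(lhs, r)
        u = [u[0] + du[0], u[1] + du[1]]
        history.append(norm2(residual(u)))
    return u, history


u_final, history = pseudo_transient_newton([3.0, 2.0])
for k, r_norm in enumerate(history):
    print(k, r_norm)
```

With an unbounded `cmax` the diagonal term fades away as the residual decreases, so that the last inner iterations are essentially plain Newton steps; setting a finite `cmax` (as in the tandem approach) keeps the diagonal term and limits the convergence rate.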
Chapter 3
Numerical Experiments
3.1 Laminar flow past a 3D circular cylinder
The chosen testcase is classical: it consists of the unconfined flow past a circular cylinder. Norberg [9] has compiled a list of references that report numerical simulations for this flow configuration. More than sixty of these address the two-dimensional case, while only 18 (for Reynolds numbers up to 10^5) deal with the three-dimensional case. We have chosen this testcase both because meshes can be generated in a relatively simple way and because it allows a comparison with other published numerical results. In particular, we have tried to match the grids and simulation parameters of two sets of calculations: that of Karlo and Tezduyar [10] and that of Majumdar and co-workers [11]. These two differ primarily in the boundary conditions imposed in the crossflow direction.

Let us start with the FEM calculation by Karlo and Tezduyar [10]; the second set of experiments will be described later on. Using the cylinder’s diameter as reference length, which is also used in the definition of the Reynolds number, ReD = U∞ D/ν, the computational domain extends 7.5 diameters ahead of the cylinder, 30 downstream and 15 in the crossflow direction. The ratio of the cylinder’s height H to its diameter equals 4. Note that in [10] the unit length is the cylinder’s radius, although the Reynolds number and aerodynamic coefficients are defined based on the diameter.

The tetrahedral mesh has been created by first generating a triangular mesh within a two-dimensional plane normal to the cylinder’s axis. This two-dimensional mesh has then been extruded along the span, thus generating triangular prisms. Each of these has then been decomposed into six tetrahedra. We have tried to match the salient features of the mesh used in [10], despite the different cell types being used in the two simulations: hexahedral in [10] and tetrahedral for EulFS. The 2D meshes in the lateral planes are shown in Fig. 3.1. The triangular mesh has been built by merging an inner structured mesh with an outer, purely unstructured one. This can be seen in Fig. 3.2(a), which shows a close-up of the two-dimensional grid in the neighbourhood of the cylinder’s surface. 80 segments have been placed along the perimeter and the radial height of the first layer of cells has been set equal to 0.01 diameters. This is twice the mesh spacing used in [10], which equals 0.01 radii. This was not a deliberate choice, but simply the result of some confusion generated by the different reference lengths.

The triangular mesh in the two-dimensional plane of Fig. 3.1 is made of 5455 gridpoints and 10710 triangular cells, whereas the number of quadrilateral cells used in [10] (also shown in Fig. 3.1) in the two-dimensional plane is equal to 4656. In the tetrahedral 3D mesh, 41 uniformly spaced planes have been piled up along the cylinder’s span, giving a total number of gridpoints and tetrahedra respectively equal to 229866 and 1321110. These figures should be compared with the 197948 nodes and 186240 prismatic elements used in [10]. It is however more meaningful to compare the number of unknowns for the two simulations: 760,107 equations for the hexahedral mesh and 919,464 equations for the tetrahedral one. Boundary conditions are set equal to freestream along the inflow, outflow and cross-flow boundaries, whereas inviscid wall boundary conditions have been set along the lateral ones. No-slip boundary conditions are imposed on the cylinder’s surface.

Figure 3.1: Surface grids in the lateral planes; top: tetrahedral mesh, bottom: hexahedral mesh of [10].
3.1.1 Incompressible simulation at Re=300
The aerodynamic coefficients obtained from the 3D simulation are displayed in Fig. 3.3 and contrasted with those of [10]. Note that in the three-dimensional case we define:

$$C_{D/L} = \frac{2\, F_{x/y}}{\rho_\infty U_\infty^2\, D\, H}$$

Observe that in Fig. 3.3 the temporal axis does not span exactly the same time interval in the two numerical calculations. The EulFS simulation has been performed using a non-dimensional time step size $\Delta t^* = \Delta t\, U_\infty / D = 0.05$, which is half the value used in the FEM simulation, $\Delta t^*_{FEM} = \Delta t\, U_\infty / R = \Delta t^*_{EulFS}\, D/R$, due to the different choice of reference length. The comparison between the two numerical solutions shows a good match between the amplitudes of the lift force (compare Fig. 3.3(a) with Fig. 3.3(b)), whereas the mean value of the drag coefficient computed by the FEM code is slightly larger than that predicted by EulFS. Performing an FFT of the lift signal gives a Strouhal number, St = D/(T U∞) = 0.215, T being the shedding period, which is slightly higher than the value 0.202 of the FEM solution [10]. Figure 3.4(a) shows the convergence history in pseudo-time of Newton’s algorithm for a selected number of time steps. Using a starting CFL number equal to 30, 6 inner iterations are needed at each time step to converge all residuals to machine zero. The simulation has been run on 24 processors; the parallel preconditioner is the Additive Schwarz method (ASM) with an Incomplete Lower Upper (ILU(k)) factorization with level of fill k = 1.

In the second set of calculations for this same Reynolds number, we have tried to match the grids and simulation parameters of the FVM calculation that recently appeared in [11]. We shall refer to this latter grid as grid B. The characteristics of the two grids are contrasted in Table 3.1: NP and NE are the number of gridpoints and tetrahedral cells, while NB is the number of triangular faces into which the cylindrical surface has been discretized. Grid B has about twice as many nodes as grid A. Grid B also features a finer near-wall resolution and maintains a finer mesh resolution immediately downstream of the obstacle. Grid A extends much further downstream than grid B, thus possibly reducing the influence of the downstream boundary conditions. The spanwise extent (denoted by h in Table 3.1) of the two grids is also different. In this second simulation periodic boundary conditions are applied spanwise, as also done in [11].

Figure 3.2: Surface grids in the lateral planes. (a) Grid A. (b) Grid B.
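The Strouhal number quoted above is extracted from the spectrum of the lift signal. The following Python sketch shows one way such an estimate could be obtained; it is not the post-processing actually used for the report. The lift history is synthetic (a pure sine wave with an assumed amplitude of 0.9 and a frequency matching the reported St = 0.215), and a plain discrete Fourier transform evaluated over a limited frequency range stands in for a library FFT routine, which would be preferred in practice. Because time is made non-dimensional with D/U∞, the peak frequency is directly the Strouhal number.

```python
import cmath
import math


def dominant_frequency(signal, dt, f_max):
    """Frequency of the largest DFT peak in (0, f_max], after mean removal."""
    n = len(signal)
    mean = sum(signal) / n
    fluct = [s - mean for s in signal]
    df = 1.0 / (n * dt)  # frequency resolution of the sampled record
    best_f, best_amp = 0.0, 0.0
    for m in range(1, int(f_max / df) + 1):
        w = -2.0j * math.pi * m / n
        coeff = sum(x * cmath.exp(w * j) for j, x in enumerate(fluct))
        if abs(coeff) > best_amp:
            best_f, best_amp = m * df, abs(coeff)
    return best_f


# Synthetic lift history (assumed, not EulFS output): time step 0.05 in units
# of D/U_inf, 4000 steps, i.e. 43 shedding periods at St = 0.215.
dt_star = 0.05
n_steps = 4000
st_assumed = 0.215
lift = [0.9 * math.sin(2.0 * math.pi * st_assumed * k * dt_star)
        for k in range(n_steps)]

strouhal = dominant_frequency(lift, dt_star, f_max=0.5)
period = 1.0 / strouhal  # non-dimensional shedding period T U_inf / D
print("St =", strouhal)
print("time steps per period =", period / dt_star)
```

The frequency resolution of such an estimate is 1/(N Δt*), so the length of the sampled record limits how finely two nearby values of St can be distinguished.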
The differences in near-wall resolution for these two grids can be seen in Fig. 3.2, which shows both surface grids in one of the lateral planes.

Figure 3.3: Laminar, incompressible 3D cylinder flow: time evolution of the aerodynamic coefficients at Re = 300 (abscissa: t U0/D). (a) EulFS solution: lift. (b) SUPG-FEM solution: lift. (c) EulFS solution: drag. (d) SUPG-FEM solution: drag.

Statistics on the mean aerodynamic coefficients are displayed in Table 3.2, where they are contrasted with the computational results of Karlo and Tezduyar [10], Rajani et al. [11] and Mittal & Balachandar [12]. The EulFS results computed on the two different grids do not show relevant differences, as can be inferred by comparing the data displayed in the first two rows of Table 3.2. A clearly defined dominant frequency is evident, as in the case of the FEM simulation by Karlo and Tezduyar [10], see Fig. 3.3. In the FV simulation of [11] the aerodynamic coefficients show signals that are modulated by a lower frequency, although this phenomenon occurs after a fairly long transient of approximately 60 periods. The present simulation has been advanced in time only up to about 20 periods. This time length might be too short for the spanwise disturbances to build up completely. Both the mean drag and the lift amplitude predicted by EulFS are larger than the values computed in [11].

For comparison, the simulation performed by Mittal & Balachandar [12] using a Fourier/Chebyshev spectral method is also shown in Table 3.2. This calculation, besides using a high-order discretization in space, was performed on a domain which extends about twice as far as grid B in the spanwise direction, and also features a much finer spanwise discretization. The overall grid is made of 81×160×288 gridpoints in the radial, circumferential and spanwise directions. This simulation shows a much reduced lift amplitude and a lower Strouhal number than the others shown in Table 3.2.

It should also be mentioned that the EulFS simulation on grid B suffered from convergence problems in pseudo-time at some time instants during a shedding cycle. This behaviour is evident when looking at Fig. 3.4(b), which refers to the simulation performed on grid B. Convergence in pseudo-time is not as clean as the one obtained on grid A and shown in the companion Fig. 3.4(a). Indeed, the residuals of the three components of the momentum equation are not always driven to machine zero, but sometimes stagnate around 10^-5.

Figure 3.4: Incompressible 3D cylinder flow at Re = 300: convergence history in pseudo-time for a selected number of time steps (residuals of the mass and x-, y-, z-momentum equations versus Newton steps; ∆t U0/D = 0.05). (a) grid A. (b) grid B.

Table 3.1: Laminar 3D cylinder flow: geometric characteristics of the grids being used

grid   NP       NE        NB     δ1/D         h/D
A      229866   1321110   6560   2 × 10^-3    4
B      465249   2680265   7740   10^-4        2π
Table 3.2: Laminar 3D cylinder flow: global aerodynamic parameters at Re = 300.

source                            ∆t        St      <CD>   rms(CD − <CD>)   rms(CL)
EulFS on grid A                   0.05      0.215   1.34   6.11 × 10^-2     0.643
EulFS on grid B                   0.05      0.215   1.35   6.71 × 10^-2     0.656
Karlo & Tezduyar [10]             0.1 (?)   0.202   N/A    N/A              0.61
Rajani et al. [11]                0.05      0.195   1.28   6.7 × 10^-2      0.499
Mittal & Balachandar [12] (3DA)   N/A       0.203   1.26   N/A              0.38
Table 3.3: Unsteady, incompressible 3D cylinder flow at Re = 800; parameters used in the Newton-Krylov solver

Newton step                 1     2      3      4      5      6
linear iterations           7     21     80     104    117    106
rk = |R(U^0)| / |R(U^k)|    10    10^2   10^5   10^8   10^9   10^13

3.1.2 Incompressible simulation at Re=800
The most interesting simulation is certainly the one performed at Re = 800. The simulation, shown in Fig. 3.5, has been run for a fairly long time, roughly corresponding to 90 periods based on the frequency of the lift coefficient. The simulation has been started by projecting a snapshot of the 2D solution onto the 3D grid, setting the transverse velocity to zero. The dominant frequency obtained from the present calculation gives a Strouhal number St = 0.210, which is slightly larger than the value 0.203 obtained by Karlo and Tezduyar [10]. The non-dimensional period is T* = T U∞/D = 4.76, so that the number of time steps per period is about one hundred when using ∆t* = 0.05.

The simulation has been run on 24 processors using ASM+ILU(1) as preconditioner. Using C0 = 30 in Eq. (2.9), exactly six Newton steps are needed at each physical time step to drive all residuals to machine zero. The ratio rk = |R(U^0)| / |R(U^k)| used in the SER approach is based upon the x-momentum residual: its order of magnitude as a function of the Newton step counter is displayed in the last row of Table 3.3. The same table shows the number of linear iterations required to solve the linear system (2.8) at each of the six Newton steps, averaged over all physical time steps. It is seen that the number of linear iterations increases as soon as the diagonal matrix in Eq. (2.8) is made to vanish.

Figure 3.5 shows the time evolution of the lift and drag coefficients. It is quite clear from Fig. 3.5 that the initial transient requires a fairly long time to be washed out. The phenomenon that occurs at about t U∞/D = 125 is likely to be an artifact due to the initial condition. An even longer running time might however be necessary to obtain converged statistics.

Figure 3.5: Laminar, incompressible 3D cylinder flow: time evolution of the aerodynamic coefficients at Re = 800 (abscissa: t U0/D); EulFS calculation. (a) Lift coefficient. (b) Drag coefficient.

The time evolution of the aerodynamic coefficients obtained from the FS and FE calculations is contrasted in Fig. 3.6. The interval on the time axis has been chosen so as to capture the same number of periods (approximately 25) for both simulations. It can be seen that the FS simulation predicts a somewhat larger lift amplitude and mean drag value.

Figure 3.6: Laminar, incompressible 3D cylinder flow: time evolution of the aerodynamic coefficients at Re = 800 (abscissa: t U0/D). (a) FS calculation: lift. (b) FE calculation: lift. (c) FS calculation: drag. (d) FE calculation: drag.

Figure 3.7 shows the iso-vorticity magnitude surfaces, coloured by static pressure, for two different values of the vorticity magnitude and two different time instants during the shedding cycle. Figures 3.7(a) and 3.7(b) have been drawn at a time when the lift force is approximately zero, whereas Figs. 3.7(c) and 3.7(d) approximately correspond to a time when the maximum lift occurs.
18
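As a rough illustration of how the SER ratio discussed for the Re = 800 run drives the pseudo-time step, the sketch below assumes that Eq. (2.9), which is not reproduced in this chapter, has the common SER form CFL_k = C0 |R(U^0)|/|R(U^k)|. The function name `ser_cfl` and the optional cap `cfl_max` are hypothetical; the ratios are the orders of magnitude listed in Table 3.3.

```python
def ser_cfl(c0, r0_norm, rk_norm, cfl_max=None):
    """Pseudo-time CFL number from an SER-type rule (assumed form).

    Assumption: CFL_k = C0 * |R(U^0)| / |R(U^k)|, optionally capped at cfl_max.
    """
    cfl = c0 * (r0_norm / rk_norm)
    if cfl_max is not None:
        cfl = min(cfl, cfl_max)
    return cfl


# Orders of magnitude of r_k = |R(U^0)|/|R(U^k)| taken from Table 3.3.
ratios = [1e1, 1e2, 1e5, 1e8, 1e9, 1e13]

for step, r_k in enumerate(ratios, start=1):
    cfl = ser_cfl(30.0, r_k, 1.0)  # only the ratio of the two norms matters
    # The pseudo-time (diagonal) term scales like 1/CFL and fades as r_k grows.
    print(f"Newton step {step}: CFL ~ {cfl:.1e}, diagonal weight ~ {1.0 / cfl:.1e}")
```

Under this assumed rule the diagonal contribution becomes negligible after the first few Newton steps, which is consistent with the growth in linear iterations reported in Table 3.3 once the diagonal matrix vanishes.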
Figure 3.7: Laminar, incompressible 3D cylinder flow: iso-vorticity surfaces, coloured by static pressure. (a) |ω| = 0.5; zero lift. (b) |ω| = 2.0; zero lift. (c) |ω| = 0.5; maximum lift. (d) |ω| = 2.0; maximum lift.
Table 3.4: Geometric characteristics of the DPW-W1 - Baseline Wing Alone.

Ref. Area,    S = 290322 mm^2   = 450 in^2
Ref. Chord,   c = 197.556 mm    = 7.778 in
Ref. Span,    b = 1524 mm       = 60 in
Table 3.5: Mesh characteristics of the DPW-W1 - Baseline Wing Alone.

grid name        Gridpoints   Boundary Gridpoints   Tetrahedra
Cessna Coarse    983,633      35,270                5,764,887
Larc Coarse      1,806,422    70,449                10,639,646
Cessna Medium    2,417,082    110,866               14,183,211
Larc Medium      4,476,969    139,768               26,368,110
Cessna Fine      6,138,245    -                     -

3.2 DPW3 Wing1
In this section we present steady RANS calculations of the incompressible flow past a wall mounted wing. The geometry and the grids have been proposed in the framework of the 3rd Drag Prediction Workshop. In the following, we shall refer to this geometry as the “DPW3-W1 - Baseline Wing Alone”. The geometric characteristics of this wing are reported in Table 3.4. A number of unstructured grids, suited for vertex-based, tetrahedral unstructured solvers, are accessible through the ftp://cmb24.larc.nasa.gov/outgoing/DPW3 repository; two sets of grids are considered here. One set has been generated by Tom Zickuhr from Cessna Aircraft. We shall refer to these as the Cessna grids. The other set has been generated by Beth Lee-Rausch from NASA LaRC, and we shall refer to these as the Larc grids. The main features of the various grids used are listed in Tab. 3.5, sorted by increasing mesh size. All calculations have been performed by first running the tandem solution approach from scratch. Typically ten iterations are performed on the turbulence transport equation for each mean flow iteration. Following the initial transient, which is needed to let the turbulent viscosity build up, the tandem solution strategy converges towards machine zero, although the residuals generally stall a few orders of magnitude above machine zero. This will be shown in the subsequent examples. Starting from a solution computed using the aforementioned approach, the fully coupled approach with Finite-Differenced Newton linearization is put in place. The quality of the initial solution can be crucial for the convergence of Newton’s method.
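The finite-differenced Newton linearization mentioned above can be sketched on a toy problem. The example below is only an illustration under stated assumptions: the two-equation system in `residual` is made up, the dense solver `solve_dense` stands in for the preconditioned Krylov solver used in the actual code, and the perturbation size in `fd_jacobian` is a common textbook choice rather than the one used in EulFS.

```python
import math
import sys


def residual(u):
    """Hypothetical 2-equation non-linear system standing in for R(U) = 0."""
    x, y = u
    return [x * x + y * y - 5.0, x * y - 2.0]


def fd_jacobian(func, u, r_u):
    """Jacobian by one-sided finite differences, one column per unknown."""
    n = len(u)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        eps = math.sqrt(sys.float_info.epsilon) * max(1.0, abs(u[j]))
        u_pert = list(u)
        u_pert[j] += eps
        r_pert = func(u_pert)
        for i in range(n):
            jac[i][j] = (r_pert[i] - r_u[i]) / eps
    return jac


def solve_dense(a, b):
    """Gaussian elimination with partial pivoting (stand-in for a Krylov solver)."""
    n = len(b)
    a = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n + 1):
                a[row][k] -= factor * a[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = a[i][n] - sum(a[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / a[i][i]
    return x


def newton(func, u0, tol=1e-12, max_steps=20):
    """Newton iteration; returns the solution and the residual-norm history."""
    u = list(u0)
    history = []
    for _ in range(max_steps):
        r_u = func(u)
        norm = math.sqrt(sum(r * r for r in r_u))
        history.append(norm)
        if norm < tol:
            break
        delta = solve_dense(fd_jacobian(func, u, r_u), [-r for r in r_u])
        u = [ui + di for ui, di in zip(u, delta)]
    return u, history


# Start reasonably close to the root (2, 1), mimicking a good initial guess.
solution, history = newton(residual, [2.5, 0.5])
for k, norm in enumerate(history):
    print(f"step {k}: |R| = {norm:.3e}")
```

Printing the residual history shows the rapid drop that is typical of Newton’s method once the iterate is close to the root; starting far from a root gives no such guarantee, which is why the tandem strategy is used to provide the initial solution.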
3.2.1 Incompressible simulation
Incompressible flow simulations have been performed for an incidence angle of 0.5°. Freestream turbulence is fixed at ten percent of its laminar value.

Coarse Cessna grid (1M gridpoints)

Calculations on this coarsest mesh have been performed using 32 processors of the leonardo HP Proliant DL-360 cluster. Figure 3.8 displays the convergence histories obtained using both solution strategies. Figure 3.8(a) shows the one obtained using the tandem solution approach. The calculation has been started from scratch, i.e. uniform flow and turbulent viscosity set everywhere equal to their freestream values. It can be seen that, after the initial transient, the residuals decrease steadily towards machine zero. The calculation using the coupled approach has been re-started from the solution obtained after 200 iterations of the tandem approach. The level of the residuals for this initial solution is marked by an arrow in Fig. 3.8(a). The convergence history of the fully coupled Newton approach is displayed in Fig. 3.8(b).
Figure 3.8: Convergence histories for the DPW-W1 - Baseline Wing Alone geometry: incompressible calculation of the Cessna coarse mesh. (a) Tandem solution strategy; Picard linearization (residuals versus non-linear (Picard) iterations). (b) Coupled solution strategy; Newton linearization (residuals versus Newton steps).
Coarse Larc grid (1.8M gridpoints)

Calculations performed on the coarse Larc grid are presented in this section. 1500 iterations have first been run, starting from scratch, on 30 processors of the leonardo cluster, using the tandem solution strategy. The initial CFL number, set equal to 5 and 1 for the mean flow and turbulence transport equations respectively, has been kept of the order of unity for all non-linear iterations by application of the SER strategy. The calculation has then been re-started on 32 processors of the matrix cluster. The CFL number for the mean-flow equations rapidly ramps to the maximum preset value of 100, while the CFL number for the turbulent transport equations remains of order 1. The convergence history is displayed in Figure 3.9. The solution obtained after 150 Picard iterations (its convergence history is displayed in Fig. 3.10(a)) is used to start up Newton’s algorithm. The convergence history of the latter is shown in Figure 3.10(b).

Figure 3.9: Convergence histories for the DPW-W1 - Baseline Wing Alone geometry: incompressible calculation of the Cessna coarse mesh. Residuals (mass, x-momentum, y-momentum, z-momentum, turb. viscosity) versus non-linear (Picard) iterations; run15610 on 30 procs (leonardo), followed by run206313 on 32 PEs (matrix).
Figure 3.10: Convergence histories for the DPW-W1 - Baseline Wing Alone geometry: incompressible calculation of the Larc coarse mesh. (a) Tandem solution strategy; Picard linearization (residuals versus non-linear (Picard) iterations). (b) Coupled solution strategy; Newton linearization (residuals versus Newton steps).
Medium Cessna grid (2.4M gridpoints)

Figure 3.11(a) shows the convergence history obtained on the medium Cessna grid using the tandem solution strategy. The companion Fig. 3.11(b) shows the evolution of the ratios |R(U^0)|/|R(U^k)| with the number of non-linear iterations k. Observe that two different ratios are used: one for the mean flow equations and one for the turbulent transport equations. The latter is kept fixed at 1, while the former is allowed to grow without bound. Convergence is achieved in about 300 non-linear iterations. Observe that, due to the decoupling between the mean flow and turbulence transport equations, the nodal residuals level off above machine zero. For all practical “engineering” use, however, the solution displayed in Fig. 3.11(a) can be considered converged. Convergence down to machine zero can only be achieved using the fully coupled approach. Figure 3.12 shows the convergence history obtained using the coupled solution strategy with FD Jacobian. The initial solution is provided by running 240 iterations of the tandem solution strategy, see Fig. 3.11(a). Figure 3.12(b) shows how the |R(U^0)|/|R(U^k)| ratio used in the SER strategy behaves when using the coupled solution approach. In this case, only one value of the ratio is used. For the case shown, it has been measured using the residual of the turbulence transport equation. Figure 3.13 shows a comparison of the pressure coefficient distribution along the wing at 4 different spanwise sections. The three different symbols refer to different grids of increasing mesh resolution.

Figure 3.11: Convergence histories for the DPW-W1 - Baseline Wing Alone geometry: incompressible calculation of the Cessna medium mesh; tandem solution strategy. (a) Convergence history. (b) History of the |R(U^0)|/|R(U^k)| ratio used in the SER strategy. Both panels are plotted against non-linear (Picard) iterations.
Figure 3.12: Convergence histories for the DPW-W1 - Baseline Wing Alone geometry: incompressible calculation of the Medium Cessna mesh. (a) Coupled solution strategy; Newton linearization. (b) Evolution of the |R(U^0)|/|R(U^k)| ratio used in the SER strategy. Both panels are plotted against non-linear (Newton) iterations.
Figure 3.13: DPW-W1 - Baseline Wing Alone geometry: pressure distribution at different wing spans obtained from the incompressible calculation. (a) Y/c = 0.2. (b) Y/c = 1. (c) Y/c = 2. (d) Y/c = 3. Each panel plots P versus X for the Cessna coarse, Cessna medium and Larc coarse grids.
Medium Larc grid (4.5M gridpoints)

Convergence histories obtained using the medium Larc mesh are displayed in Fig. 3.14. The tandem solution strategy is run for 1300 iterations; the corresponding residual history is displayed in Fig. 3.14(a). The residuals of both the mean flow and turbulence transport equations are seen to enter a limit cycle a few orders of magnitude above machine zero. Using this almost converged solution as the initial guess for Newton’s algorithm, fifteen iterations are needed to bring all residuals down to machine accuracy, as shown in Fig. 3.14(b).
Figure 3.14: Convergence histories for the DPW-W1 - Baseline Wing Alone geometry: incompressible calculation of the Larc medium mesh. (a) Tandem solution strategy; Picard linearization. (b) Coupled solution strategy; Newton linearization. Both panels plot the residuals against non-linear iterations.
Fine Cessna grid (6.5M gridpoints)

Convergence histories obtained using the finest of all five meshes are displayed in Fig. 3.15. The tandem solution strategy is run for 1300 iterations; the corresponding residual history is displayed in Fig. 3.15(a). The residuals of both the mean flow and turbulence transport equations are seen to enter a limit cycle a few orders of magnitude above machine zero. Using this almost converged solution as the initial guess for Newton’s algorithm, ten iterations are needed to reduce all residuals by a further six orders of magnitude, down to machine accuracy, as shown in Fig. 3.15(b).
Figure 3.15: Convergence histories for the DPW-W1 - Baseline Wing Alone geometry: incompressible calculation using the Cessna fine mesh. (a) Tandem solution strategy; Picard linearization. (b) Coupled solution strategy; Newton linearization.
3.3 Turbulent flow past a 2D circular cylinder

3.3.1 Compressible, M∞ = 0.2 simulation
Subsonic flow past a circular cylinder is modeled at a Reynolds number of 10,000, with the Spalart-Allmaras turbulence model employed. At such a low Re, the turbulence is primarily confined to the wake region. This testcase is one of the validation cases available on the CFL3D website (see footnote 2). We have used the same grid used in the CFL3D calculation: it is a rather coarse mesh made of 129×81 gridpoints in the circumferential and wall normal directions, respectively. Using a time step of ∆t_CFL3D = 0.4 on this 129×81 grid (2-D), the Strouhal number St = f D/u∞ predicted by CFL3D comes out to be 0.236. In CFL3D’s nondimensional units, St = D/(M∞ T), where D = 1.0 is the nondimensional cylinder diameter, M∞ = 0.2, and T is the nondimensional time for one period. The computed average drag coefficient on the cylinder is about 1.76. As mentioned on the CFL3D website, these levels for St and CD are not necessarily spatially or temporally converged, since 129×81 is a rather coarse grid and ∆t_CFL3D = 0.4 yields only 53 steps per period. In experiments at Re = 10,000, St is roughly 0.2 and the average drag coefficient is near 1.0-1.2 (see Cox et al., Theoret. Comput. Fluid Dynamics (1998) 12: 233-253). Thus, 2-D CFD yields too-high levels for St and CD (overall conclusions are similar even if one were to use finer grids and smaller time steps). However, experiments for Re > 200 or so always have inherent three-dimensionality (spanwise structures), so 3-D computations would be necessary to reproduce the physics, including St and CD. The reason for running a two-dimensional calculation at such a high Reynolds number is the availability of the CFL3D solution against which the EulFS solution can be compared and contrasted.
Observe that, because of the different non-dimensionalization adopted in the two codes, the non-dimensional reference time differs between EulFS and CFL3D by a factor of M∞:

∆t_EULFS = M∞ ∆t_CFL3D

The EulFS results are summarized in Table 3.6, which includes: the Strouhal number, the mean drag coefficient, the rms of the lift coefficient (which averages to 0 because of symmetry) and the rms of the difference between the drag coefficient and its mean value.
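The figures quoted for the CFL3D run can be checked with a few lines of arithmetic. The sketch below uses only values stated in the text (St = 0.236, M∞ = 0.2, D = 1, ∆t_CFL3D = 0.4); the function names are hypothetical and chosen for illustration.

```python
def cfl3d_period(strouhal, mach, diameter=1.0):
    """Nondimensional shedding period in CFL3D units, from St = D / (M_inf * T)."""
    return diameter / (mach * strouhal)


def to_eulfs_time(time_cfl3d, mach):
    """Convert a CFL3D nondimensional time (or time step) to EulFS units."""
    return mach * time_cfl3d


mach = 0.2
dt_cfl3d = 0.4

period_cfl3d = cfl3d_period(0.236, mach)      # about 21.2 CFL3D time units
steps_per_period = period_cfl3d / dt_cfl3d    # about 53 steps per period
dt_eulfs = to_eulfs_time(dt_cfl3d, mach)      # 0.08 in EulFS units

print(round(steps_per_period), round(dt_eulfs, 2))  # prints: 53 0.08
```

The converted value ∆t_EULFS = 0.08 is the largest time step listed for the EulFS runs, so that run uses the same number of steps per period as the CFL3D calculation.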
3.3.2 Incompressible simulation
We now turn to the incompressible flow simulation. The grid is the same as that used in the compressible case. Four different values of the time-step length, ∆t U∞/D, have been selected, as shown in Tab. 3.7. The space-time discretization relies on the LDA-PG mass-matrix for both the mean flow and turbulent transport equations. Table 3.8 shows the average number of Newton steps required within each physical time step to bring all residuals down to machine zero. In all runs the starting value of C0 used in Eq. (2.9) is 30. First of all, it can be seen from Table 3.8 that, for a given time-step length, the number of Newton steps is roughly the same regardless of whether the compressible or incompressible flow equations are being solved. The number of Newton steps decreases as the physical time step is halved. This is reasonable if we consider that, as the physical time step is reduced, the solution at time level n + 1 should get closer to the one at time level n. However, for the smallest time step length, ∆t = 0.01, the number of inner iterations rises again, reaching an average value close to 7. This might be due to the fact that C0 = 30 introduces too much damping into Newton’s algorithm, thus slowing down convergence. Convergence histories for both the compressible and incompressible simulations are displayed in Fig. 3.18 using a time step length ∆t = 0.01.

Footnote 2: http://cfl3d.larc.nasa.gov/Cfl3dv6/cfl3dv6_testcases.html

Table 3.6: Turbulent 2D cylinder flow: compressible simulation.

run     ∆t_EULFS   St       rms(CL)   <CD>     rms(CD − <CD>)
15485   0.08       .22016   1.2671    1.7422   9.406 × 10^-2
15486   0.04       .21994   1.2707    1.7293   8.513 × 10^-2
15487   0.02       .21983   1.2662    1.7255   7.976 × 10^-2
15488   0.01       .21978   1.2613    1.7283   7.691 × 10^-2
CFL3D   -          .236     -         1.76     -

Figure 3.16: Turbulent, compressible 2D cylinder flow: time evolution of the lift and drag coefficients (M = 0.2, Re = 10,000; runs 15485, 15486, 15487 and 15488 with dt = 0.08, 0.04, 0.02 and 0.01). (a) Lift coefficient. (b) Drag coefficient.
Table 3.7: Turbulent, incompressible 2D cylinder flow.

run     ∆t_EULFS   St           T      <CD>     rms(CD − <CD>)   rms(CL)
15687   0.08       0.21994135   4.40   1.6966   8.7857 × 10^-2   1.2548
15688   0.04       0.2198339    4.38   1.6809   8.2790 × 10^-2   1.2417
15689   0.02       0.21978022   4.38   1.6766   8.0318 × 10^-2   1.2386
15690   0.01       0.21978022   4.39   1.6758   7.951 × 10^-2    1.2388
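The quantities tabulated for the 2D cylinder runs (St, mean drag, and the two rms values) are post-processed from the lift and drag time histories. The sketch below shows one plausible way to compute them; it is not the post-processing actually used for the tables. The signals are synthetic and made up for illustration, with the drag oscillating at twice the lift frequency, and the zero-crossing estimate of the period is an assumption.

```python
import math


def mean(values):
    return sum(values) / len(values)


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


def strouhal_from_signal(times, signal):
    """Estimate the shedding frequency from upward zero crossings of the signal."""
    crossings = []
    for i in range(len(signal) - 1):
        if signal[i] < 0.0 <= signal[i + 1]:
            # Linear interpolation of the crossing time between two samples.
            frac = -signal[i] / (signal[i + 1] - signal[i])
            crossings.append(times[i] + frac * (times[i + 1] - times[i]))
    period = (crossings[-1] - crossings[0]) / (len(crossings) - 1)
    return 1.0 / period  # times are assumed to be in units of D/U_inf


# Synthetic (made-up) signals: 10 shedding periods sampled with dt = 0.05.
st_true, dt, n_samples = 0.2, 0.05, 1000
times = [i * dt for i in range(n_samples)]
cl = [1.2 * math.sin(2.0 * math.pi * st_true * t + 0.3) for t in times]
cd = [1.7 + 0.1 * math.sin(4.0 * math.pi * st_true * t) for t in times]

cd_mean = mean(cd)
stats = {
    "St": strouhal_from_signal(times, cl),
    "<CD>": cd_mean,
    "rms(CD - <CD>)": rms([c - cd_mean for c in cd]),
    "rms(CL)": rms(cl),
}
print(stats)
```

For a pure sinusoid of amplitude A sampled over whole periods, the rms equals A/√2, which gives a quick sanity check on the amplitudes implied by the tabulated rms values.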
Table 3.8: Turbulent 2D cylinder flow: average number of Newton steps required at each physical time-step.

compressible simulation
run     ∆t_EULFS   avg. Newton steps
15485   0.08       8.2
15486   0.04       6.4
15487   0.02       6.0
15488   0.01       7.0

incompressible simulation
run     ∆t_EULFS   avg. Newton steps
15687   0.08       8.1
15688   0.04       6.8
15689   0.02       6.0
15690   0.01       6.7
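The way Newton steps are counted within each physical time step can be illustrated on a scalar model problem. The sketch below is a minimal stand-in, not the EulFS algorithm: it uses implicit Euler in physical time (the time discretization of the actual solver is not restated here), a made-up right-hand side `rhs`, and an assumed SER-type rule for the pseudo-time step with C0 = 30.

```python
import math


def rhs(u):
    """Model problem du/dt = -u**3, a scalar stand-in for the spatial residual."""
    return -u ** 3


def rhs_derivative(u):
    return -3.0 * u ** 2


def advance_one_step(u_old, dt, c0=30.0, tol=1e-10, max_inner=50):
    """One implicit Euler step; the non-linear equation is solved by damped Newton."""
    u = u_old  # initial guess: solution at the previous time level
    r0 = None
    for n_inner in range(max_inner + 1):
        residual = (u - u_old) / dt - rhs(u)
        if abs(residual) < tol or n_inner == max_inner:
            return u, n_inner
        if r0 is None:
            r0 = abs(residual)
        dtau = c0 * r0 / abs(residual)  # assumed SER-type pseudo-time step
        jacobian = 1.0 / dt - rhs_derivative(u)
        u -= residual / (1.0 / dtau + jacobian)


def integrate(u0, dt, t_end):
    """March in physical time; return final value and mean Newton steps per step."""
    n_steps = round(t_end / dt)
    u, total_inner = u0, 0
    for _ in range(n_steps):
        u, n_inner = advance_one_step(u, dt)
        total_inner += n_inner
    return u, total_inner / n_steps


u_end, avg_inner = integrate(u0=1.0, dt=0.01, t_end=1.0)
u_exact = 1.0 / math.sqrt(3.0)  # exact solution u0 / sqrt(1 + 2 u0^2 t) at t = 1
print(f"u(1) = {u_end:.5f}, exact = {u_exact:.5f}, avg Newton steps = {avg_inner:.1f}")
```

Re-running `integrate` with different values of `dt` or `c0` gives a feel for how the inner iteration count responds to the time step and to the damping; no particular trend is claimed for this toy problem.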
Figure 3.17: Turbulent, incompressible 2D cylinder flow: time evolution of the lift and drag coefficients (Re = 10,000; runs 15687, 15688, 15689 and 15690 with dt = 0.08, 0.04, 0.02 and 0.01). (a) Lift coefficient. (b) Drag coefficient. Both panels are plotted against t U0/D.
Figure 3.18: Turbulent, 2D cylinder flow: convergence history in pseudo-time using a physical time step length ∆t = 0.01. (a) Compressible simulation (run 15488; residuals of mass, energy, x momentum, y momentum, turbulence). (b) Incompressible simulation (run 15690; residuals of mass, x momentum, y momentum, turbulence). Both panels are plotted against the Newton steps counter.
Chapter 4

Conclusions

This report has described the research activity conducted in the framework of a project titled “Unsteady RANS simulations using a 3D unstructured grid solver with Newton-Krylov acceleration”, funded by CASPUR under the “Standard HPC grant 2009”. It soon became clear that, before embarking on the simulation of unsteady RANS flow cases, benchmarking of 3D laminar, low Reynolds number flows was needed. To this end, incompressible flow simulations past a circular cylinder at Reynolds numbers 300 and 800 were conducted, trying to match the grids and simulation parameters of two published numerical simulations of the same problem. These two simulations differ primarily in the spanwise extent of the domain and in the type of boundary conditions applied in the spanwise direction: non-penetration vs. periodic. In the former case, Newton’s algorithm drives the residual to machine zero in six iterations within each physical time step. When periodic boundary conditions were used, problems were encountered during certain physical time-steps, in which Newton’s algorithm failed to drive all residuals to machine zero. This might be due to the way periodic boundary conditions are implemented and should be further investigated. Thanks to the large memory and computing power made available on the matrix cluster, steady RANS simulations that had already been performed using grids featuring up to 1 million gridpoints have been repeated using finer (tetrahedral) grids with up to 4.5 million gridpoints. Although this can be considered only a medium-sized grid by industrial standards (see footnote 1), we managed to demonstrate quadratic convergence of Newton’s algorithm even on these relatively large datasets. The possibility of testing larger grids was limited because the available partitioner is not parallel at present. Finally, URANS simulations have only been performed in 2D, solving both the compressible and incompressible flow equations. Even in this case, Newton’s algorithm proves effective in driving all residuals to machine zero in a limited number of iterations. 3D URANS simulations are plagued by an instability in the turbulence transport equations which deserves further investigation.

Footnote 1: at least according to the classification used in the AIAA Drag Prediction Workshops.
Bibliography

[1] Wikipedia. Newton’s method - Wikipedia, the free encyclopedia, 2009. [Online; accessed 28-November-2009].
[2] A. Jameson. Time-dependent calculations using multigrid with applications to unsteady flows past airfoils and wings. AIAA Paper 91-1596, 1991.
[3] A. J. Chorin. A numerical method for solving incompressible viscous flow problems. Journal of Computational Physics, 135(2):118–125, 1997.
[4] P. R. Spalart and S. R. Allmaras. A one-equation turbulence model for aerodynamic flows. La Recherche-Aerospatiale, 1:5–21, 1994.
[5] E. van der Weide, H. Deconinck, E. Issman, and G. Degrez. A parallel, implicit, multi-dimensional upwind, residual distribution method for the Navier-Stokes equations on unstructured grids. Computational Mechanics, 23:199–208, 1999.
[6] A. Bonfiglioli. Fluctuation splitting schemes for the compressible and incompressible Euler and Navier-Stokes equations. International Journal of Computational Fluid Dynamics, 14(1):21–39, 2000.
[7] A. Bonfiglioli, M. S. Campobasso, and B. Carpentieri. Parallel unstructured three-dimensional turbulent flow analyses using efficiently preconditioned Newton-Krylov solver. In Proceedings of the 19th AIAA Computational Fluid Dynamics Conference 2009. Curran Associates, 2009. San Antonio, Texas; 22-25 June 2009.
[8] W. Mulder and B. van Leer. Experiments with an Implicit Upwind Method for the Euler Equations. Journal of Computational Physics, 59:232–246, 1985.
[9] C. Norberg. Fluctuating lift on a circular cylinder: review and new measurements. Journal of Fluids and Structures, 17(1):57–96, 2003.
[10] V. Kalro and T. Tezduyar. Parallel 3D computation of unsteady flows around circular cylinders. Parallel Computing, 23(9):1235–1248, 1997.
[11] B. N. Rajani, A. Kandasamy, and S. Majumdar. Numerical simulation of laminar flow past a circular cylinder. Applied Mathematical Modelling, 33(3):1228–1247, 2009.
[12] R. Mittal and S. Balachandar. On the inclusion of three-dimensional effects in simulations of two-dimensional bluff-body wake flows. In Proceedings, 1997 ASME Fluids Engineering Division Summer Meeting, 1997.