Fast Algorithms for Solving High-Order Finite Element Equations for ...

Fast Algorithms for Solving High-Order Finite Element Equations for Incompressible Flow L. Ridgway Scott, Andrew Iliny, Ralph W. Metcalfez, and Babak Bagheriy

Abstract

AMS subject classi cations: 65M60, 65F05, 65F10.

one to replace an inde nite linear-algebraic system with a positive-de nite linear system in many cases. One key observation we make is that it is necessary in some cases to do only a very small number of penalty iterations after an initial start-up phase. This leads to a very ecient algorithm. We also extend the algorithm and analysis in [4] to the case of inhomogeneous boundary conditions. Another observation we make is that direct methods may be a suitable choice for solving the resulting linear equations, at least in the two-dimensional case. We show that the relative ll-in due to Gaussian elimination is quite small for high-order methods, and decreases with increasing degree. This allows one to solve non-symmetric problems with comparable eciency to the symmetric case, allowing complex time-stepping schemes to be used.

1 Introduction

2 Mixed Method Formulation

We discuss high-order nite element methods to simulate the ow of incompressible viscous uids. We focus on algorithms for solving the related algebraic equations eciently using the iterated penalty method for resolving the incompressibility constraint. We show that direct methods may be a suitable choice for solving the resulting linear equations, at least in the two-dimensional case.

Key words: nite element method, incompressible vis-

cous uids, iterated penalty method, Navier-Stokes problem.

High-order nite element methods provide very accurate simulations of the ow of incompressible viscous uids [7]. Here we focus on various algorithms for solving the related algebraic equations eciently in the context of Newtonian

uids, that is, ones solving the Navier-Stokes equations [8] which appear subsequently in equation (9). We discuss the iterated penalty method for resolving the incompressibility constraint [4]. This method allows

In all of the cases considered here, the formulation will reduce to solving a general mixed method of the form a(uh ; v) + b(v; ph ) = F(v) 8v 2 Vh (1) b(uh ; q) = G(q) 8q 2 h ; where F 2 V 0 and G 2 0 (the \primes" indicate dual spaces [4]). Here V and are two Hilbert spaces with subspaces Vh V and h , respectively; uh 2 Vh and ph 2 h are unknowns, a(; ) and b(; ) are bilinear forms, Department of Mathematics and Texas Center for Advanced which satisfy conditions introduced later. Molecular Computation, University of Houston y Texas Center for Advanced Molecular Computation, University The main new ingredient in the variational formulation of Houston of mixed methods with inhomogeneous boundary condiz Department of Mechanical Engineering, University of Houston tions is that the term G is not zero. In [4], complete details are given only in the case that G 0. The variational problem (1) is only a slight generalization of (11.1.1) in [4], which had b(uh ; q) = (g; q) as the second equation, if we make the translation g = Ph G where Ph denotes the Riesz representation of G in h , that is (2) (Ph G; q) = G(q) 8q 2 h : This space left blank for copyright notice.

1

2

ICOSAHOM 95

We will assume that D : V ! is continuous and that (3)

b(v; p) = (Dv; p) 8p;

where (; ) denotes the inner product in . Note that the second equation in (1) says that

where e.g. a(; ), b(; ) and c(; ; ) are given by (11)

Ph Duh = Ph G

(12)

where we also use Ph g to denote the -projection of g 2 onto h . We assume that the bilinear forms satisfy the continuity conditions a(u; v) Ca kukV kvkV 8u; v 2 V (5) b(v; p) Cb kvkV kpk 8v 2 V; p 2

(13)

(4)

a(u; v) :=

d X

Z

@ui @vi dx;

i;j =1 @xj xj

b(v; q) := ? c(u; v; w) :=

Z X d

Z

@vi q dx; @x

i=1 i

?

u rv w dx;

and (; )L2 denotes the L2 ( )d -inner-product, V = R H1 ( )d and = q 2 L2( ) : q dx = 0 . For more details about notation see [4]. One of the simplest rst-order time-stepping schemes is ? ? ` +R c ?u`?1 ; u`?1; v a u `?1; v + b v; p and the coercivity conditions ? + Rt u ` ? u `?1; v L2 = 0; (14) ? kvk2V a(v; v) 8v 2 Z [ Zh b u `; q = 0; p) 8p 2 : kpk sup b(v; (6) where, here and below, v varies over all V (or Vh ) and h k v k v2Vh V q varies over all (or h ) and t denotes the time-step size. The scheme is only explicit with respect to the velocHere Z and Zh are de ned by ity. There is an equation for the pressure and the zerodivergence constraint. The major drawback of explicit (7) Z = fv 2 V : b(v; q) = 0 8q 2 g schemes such as (14) is the severe limitation on the time and step as depicted in Figure 1. (8) Zh = fv 2 Vh : b(v; q) = 0 8q 2 h g Time-stepping schemes that are implicit with respect to the linear terms and explicit with respect to the nonlinear respectively. terms can be more ecient even though a linear system involving a(; ) form must be solved. The simplest of these is ? ? ? p` +R c? u `?1; u `?1; v a u ` ; v + b v; The Navier-Stokes equations for the ow of a viscous, in- (15) + Rt u ` ? u `?1; v L2 = 0; compressible, Newtonian uid can be written ? b u ` ; q = 0: ? ?u + rp = ?R u ru + u t (9) A more complex time-stepping scheme could be based div u = 0: on the variational equations ? ? ? for x 2 IRd , t 2 [0; T], where u(t; a u` ; v + b v; p` +R c? u`?1; u`; v denotes the uid x) velocity, p(t; x) + Rt u` ? u`?1; v L2 = 0; denotes the pressure, and R denotes the ? Reynolds number [8]. These equations must be supple- (16) ` ; q = 0; b u mented by appropriate boundary conditions, such as the Dirichlet boundary conditions, u = on @ , and initial condition u(0; in which the nonlinear term has been approximated in such x) =u 0(x). A complete variational formulation of (9) takes the form: a way that the linear algebraic problem changes at each Find u such that u(t; ) ? 2 V and p(t; ) 2 such that time step. It takes the form (1) with a form a~(; ) given 8t 2 [0; T] by ? ? ? ? (17) v + v; U = a u; ~a u; p + v +?b v; ? ? a u; Z = 0 8v 2 V ; (10) R c u; 2 R u v + U r u v dx v + u t; v L u; b(u;

t q) = 0 8q 2 ;

3 The Navier-Stokes Equations

Fast Algorithms for High-Order FEM for Incompressible Flow

tions. To avoid this time-dependence and to keep implicitness of the non-linear term approximation, xed-point iterations have been introduced in the following variational form ? ? ? a u`;m ; v + b v; p`;m +R c? u`;m?1; u`;m?1; v + Rt u`;m ? u`?1; v L2 = 0; (21) ? b u`;m ; q = 0;

Comparison of different time stepping schemes, JHFlow

0

10

−1

10

2−nd Order Implicit 1−st Order Implicit

Max dt

3

−2

10

1−st Order Implicit−explicit

where m = 1; 2; :::; M is the xed-point iteration index. We write u ` for u `;M . The following convergence criterion has been used in our calculations for the xed-point method:

−3

10

1−st Order Explicit

(22) −4

10

0

10

1

10 log(h0/h)

2

10

u`;m

? u`;m?1 "fp u`;m :

The simplest initial guess for xed-point iterations is u `;0 = u `?1. To reduce the number of xed-point iterations, the initial guess was calculated using linear interpolation of the solution at two previous time steps:

Figure 1: Empirically determined maximum time step size for dierent time stepping schemes for Jerey-Hamel ow for Reynolds number R = 240. The explicit scheme is (14), the implicit-explicit scheme is (15), the implicit scheme is (23) u `;0 = 2u `?1 ? u `?2: (21). To utilize an implicit scheme for nonlinear equation, In practice, higher order time-stepping schemes can be the xed-point iterative method has been used. used, including the second-order approximation of the time derivative ([6]): where U = Ru n arises from linearizing the nonlinear term. Even though the addition of the U term makes it non@ u (t` ) 3u ` ? 4u `?1 + u `?2 ; (24) R symmetric, ~a(; ) will be coercive for t suciently large. @t 2t In fact, when div U 0 then integrating by parts yields but the main issues related to solving the resulting linear Z Z X d equations remain the same. @vj v dx = U rv v dx = Ui @x All? time-stepping schemes may be written as a problem j i ` ; p` which is nearly of the form (1), for example,

i;j =1 for u ! Z X d equation (15) may now be written as @vj2 1 (18) dx = Ui ? ? ` = ?R c ?u`?1; u`?1 ; v

i;j =1 2 @xi a~ u ` ; v + b v; p ? + u `?1; v L2 ; Z X d (25) @Ui v2 dx = 0 ? ` ? 21 b u ; q = 0 j

i;j =1 @xi with the more general form a~(; ), namely for all v 2 V so that ?

(26) a~(u; v) := a(u; v) + u; v L2 (19) v 2 V v) 8 v 2V a~(v; for the same choice of > 0 as before. Of course, a~(; ) is where the constant = R=t for the rst-order timestepping scheme and = 1:5R=t for the second-order continuous: time-stepping scheme. Numerical experiments will be pre

sented subsequently for such a problem. Note that the (20) w 8 v; w 2 V ~a(v; w) C v a V V linear algebraic problem to be solved at each time step is the same. but now Ca depends on both Rt and U. Note that we have u ` = on @ , that is, u ` = u + The main disadvantage of the time-stepping scheme (16) is that it has a time-dependent system of algebraic equa- where u 2 V . The variational problem (25) for u ` can

4

ICOSAHOM 95

then be written: Find u 2 V and p 2 such that ?

? u; p? = v + b v;

? a~ ; v

? ~ ? a ` ? 1; u`?1; v + u`?1 ; v 8v 2 V u ? R c (27) 2 L ? ? 8q 2 : b u; q q = ?b ; This is of the form (1) with ? `?1 `?1 ? F(v) ; v ;u ?1v ? := ?a~? ` ; Rc u + u ; v L2 8v 2 V (28) ? 8q 2 : G (q) := ?b ; q Note that the inhomogeneous boundary data appears in both right-hand sides, F and G. Thus we are forced to deal with a nonzero G. A nite element discretization of (10), (15) or (27) is obtained by replacing V by Vh and by h satisfying (6). Since the notation is identical if we let V denote either V or Vh (and similarly with ) we drop all h subscripts from now on.

will be symmetric if a(; ) is symmetric, and it will be positive de nite if a(; ) is coercive and 0 > 0. Suppose that = DV . Then since G(q) := ?b( ; q), (Ph G; q) = G(q) = ?b( ; q) = (34) ?(D ; q) = ?(Ph D ; q) 8q 2 ; that is, Ph G = ?Ph D . Then (35) pn+1 = pn + Ph D (un + ) since Ph Dun = Dun . If we begin with p0 = 0 then, for all n > 0, (36) where (37) Note that

pn = Ph D

n ? X i=0

wn :=

ui + = Ph Dwn;

n ? X i=0

ui + :

b(v; pn) = (Dv; pn ) = (Dv; Ph Dwn ) = (Dv; Dwn ) Consider a general mixed method of the form (1). Let since Ph Dv = Dv. 0 2 IR and > 0. The iterated penalty method (IPM) Thus, the iterated penalty method implicitly becomes de nes un 2 V and pn by a(un; v) + 0 (Dun ; Dv) = F(v) n 0 n (39) ? (Dv; Dwn ) ? 0 (Dv; D ) 8v 2 V a(u ; v) + (Du ; Dv) n 0 wn+1 = wn + (un + ) : (29) = F(v) ? b(v; p ) + G(Dv) 8v 2 V n +1 n n p = p + (Du ? Ph G) In the case 0 = this simpli es to where Ph G is de ned in (2). Recall also that (4) says a(un ; v) + (Dun ; Dv) = F(v) that Ph Du = Ph G: ? (Dv; D(wn + )) 8v 2 V The algorithm does not require Ph G to be computed, (40) wn+1 = wn + (un + ) : only Thus we see that the introduction of inhomogeneous (30) b(v; Ph G) = (Dv; Ph G) = G(Ph Dv) boundary conditions does not lead to a dramatic change to the formulation of the iterated penalty method. The main for v 2 V . Suppose that G(q) := ?b( ; q). Then dierence is that the \pressure" term (31) G(Ph Dv) = ?b( ; Ph Dv) = ?(D ; Ph Dv) : (41) p = Ph Dw; If further = DV , then where w := wn for the value of n at which the iteration is terminated, cannot be computed directly since w 62 V . (32) b(v; Ph G) = G(Ph Dv) = ?(D ; Dv) That is, Dw 62 . On the other hand, p is not needed at for v 2 V . all for the computation of u and can be computed only as One key point of the iterated penalty method is that required. For example, in a time-stepping scheme it need the system of equations represented by the rst equation not be computed every time step. in (29) for un, namely The \pressure" term can be calculated in various ways. It satis es the system n ; v) + 0 (Dun ; Dv) a(u (33) (42) (p; Dv) = (Dw; Dv) 8v 2 V: = F(v) ? b(v; pn) + 0 G(Dv) 8v 2 V;

4 Iterated Penalty Method

(38)


5

We can write p = Dz for various z 2 V , with z satisfying Let Whk denote piecewise polynomials of degree k on a the under-determined system triangular mesh, and let Vhk denote the subspace of Whk consisting of functions that vanish on the boundary. Let (43) (Dz; Dv) = (Dw; Dv) 8v 2 V: (48) V h = Vhk Vhk ; Several techniques could be used to specify z, but one simple one is to use the system (1), namely, to nd z 2 V and and let h = div Vh . Then each q 2 h is constrained at all boundary singular [4] vertices, i, in the mesh. q 2 such that On the other hand, inhomogeneous boundary conditions k k a(z; v) + b(v; q) = 0 8v 2 V will require the introduction of some 2 Wh Wh . It (44) ? b(z; q) = (Dw; q) 8q 2 is known [10] that div Whk Whk consists of all piecewise polynomials of degree k ? 1, a larger space than which is (1) with F 0 and G(q) := (Dw; q). if boundary singular vertices are present in the mesh. The iterated penalty method can be used to solve (44), Onh the other hand, if there are no boundary singular yielding an algorithm of the form vertices, since there ?is kno need toR form the projection k = q 2 div W W : q(x) dx = 0 in this case. h h h

a(z n ; v) + (Dz n ; Dv) = (45) ? (Dv; D( n ? w)) 8v 2 V n+1 = n + (z n ? w) in the case 0 = . This involves inverting the same algebraic system as in (40), so very little extra work or storage is involved.

5 Application to Navier-Stokes

6 Convergence of IPM

The convergence properties of (29) follow from [4].

Theorem 6.1 Suppose that the forms (1) satisfy (5) and (6) for Vh and h = DVh . Then the algorithm (29) con-

verges to the solution of (1) for any 0 < < 20 for 0 suciently large. For the choice = 0 , (29) converges geometrically with a rate given by

The iterated penalty method (39) (with 0 = ) for (15) 2 takes the form C 1 a ? 0 : Ca + `;n; v + ?div u`;n ; div v 2 u a ~ ? ?L ? v ? Rc u `?1; u `?1; v + u `?1; v L2 = ?a ; The following stopping criterion follows from [4]. ? ? (46) ? div w `;n + ; div?v L2 8v 2 V `;n+1 = w`;n ? u`;n + w Theorem 6.2 Suppose that the forms (1) satisfy (5) and (6) for Vh and h = DVh . Then the errors in algorithm where either p`;0 = 0 (i.e. w `;0 = 0) or w `;0 = w `?1;N (29) can be estimated by where N is the nal value of n at time-step ` ? 1. If for some reason p` = p`;n = Pdiv w `;n were desired, it could n ? uh k 1 + Ca kDun ? P Gk k u h V be computed separately. For example, algorithm (40) could be used to compute ?

(47)

?

n ~ zn?; v + div ; div L2 = ?a z v n `;n ? div ? w ;?div v L2 8v 2 V n+1 = n ? zn ? w `;n

starting with, say, 0 0. Then zn will converge to z 2 V satisfying div z = Pdiv w `;n = p`;n . Note that (47) requires the same system to be solved as for computing u `;n in (46), so very little extra code or data-storage is required. The potential diculty caused by having inhomogeneous boundary data can be seen for high-order nite elements. For simplicity, consider the two-dimensional case (d = 2).

and

Ca Ca2 0 n + + Cb kDu ? Ph Gk : When G(q) = ?b( ; q), then Ph G = ?Ph D and since D u n 2 h , n kPh D (un + )k (49) kDu ? Ph Gk = kD (un + )k : The latter norm is easier to compute, avoiding the need to compute Ph G. We formalize this observation in the following result.

kpn ? ph k

6

ICOSAHOM 95

Corollary 6.1 Under the conditions of Theorem (6.2) the errors in algorithm (29) can be estimated by

Ca kD (un + )k kun ? uh kV 1 + and

radial velocity amplitude for Jeffrey−Hamel flow at r=1

1

10

R=240

C2

kpn ? ph k C a + a + 0 Cb kD (un + )k :

The corollary gives a natural convergence criterion of the iterated penalty method: (50) kD (un + )k "ip kunk:

7 Computational Examples

0

10

R=20

−1

10

−2

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 All numerical experiments reported in this paper were conangular variable ducted using the code Albert [1]. The space Vh is chosen to be (48) with degree k 4. As a test problem we con- Figure 2: The pro le of the angular velocity u at r = 1 sider the well-studied Jerey-Hamel ow in a converging for two choices of C, 100 and 10,000. These correspond to duct [8]. For this ow, a semi-analytical solution is known Reynolds' numbers of 20 and 240, respectively. which is a similarity solution of the form (51) u(x; y) := 6 u(atan(y=x)) x2 + y2 x ; x = (x; y) 2

where (52) u00 + 4u + 6u2 = C; u(0) = u() = 0; and dierentiation is with respect to the polar angle and is the angle (in radians) made by the two walls of the channel. Figure 2 shows the pro le (for = =4) of the radial velocity u at r = 1 for two choices of C, 100 and 10,000. These correspond to Reynolds' numbers of 20 and 240, respectively, with R calculated by formula R = max juj for r = 1. The pro les are plotted for 0:01 = 0:99. This allows an assessment of the behavior of the solution in the boundary layer. For example, the angular displacement thickness R , which can be de ned in this case by 10

(53)

R :=

Z

0

=2

u() d; 1 ? u(=2)

Maximum pressure error = 0.0005 Maximum pressure error = 2.3245

Figure 3: The pressure relative \error" due to a boundaryvertex in the case of C = 100 if the projection For the domain, we consider a slice of a wedge with Psingular is not included in the calculation (left) and if it is h opening = =4 radians made perpendicular to one of included (right). of the trapezoid are (1; 0), the wedge sides, as shown in Figure 3. This provides a (1; 1), (2; 2), (2; 0),Thein vertices clockwise order starting from the

ow with no particular symmetry. lower-left corner. Figure 3 shows the pressure relative \error" in the case of C = 100 if the projection Ph in (41) is not included in = 0:110 and 240 = 0:046. has values 20


7

Maximum time−step size for the implicit schemes, JHFlow

0

10

25

# penalty iterations

Max dt

20

−1

10

R = 240

15

10

2−nd order

5

R = 20

1−st order −2

10

1

10

2

3

10

10

4

10

R

Figure 4: Maximum time-step size for the implicit time stepping schemes using xed-point iterations as in (46) for Jerey-Hamel ow vs Reynolds number. The maximum mesh-step size h = 0:35 (two subdivisions of the mesh displayed on Figure 3). the calculation (left). Note that the error is concentrated around the one boundary-singular vertex. When the projection is included (right), the relative error drops by ve orders of magnitude, to about 0:05 percent. Figure 1 shows the maximum time-step size for stability of the dierent time-stepping schemes. The implicit schemes are unconditionally stable if one solves the nonlinear equation exactly, but any algorithm used to solve them, such as xed point iterations, introduces implicitly a stability limit. Our simulations were initialized with u = 0. Note that the time-dependent numerical solution converges to the steady state. A set of subsequently re ned meshes was used, starting with the coarsest one from Figure 3. Numerical experiments have shown that the second-order implicit-explicit scheme with 3 xed-point iterations (21) is the most ecient of the ones we tested. Figure 4 shows the maximum time-step size for the implicit time-stepping schemes as a function of the Reynolds number. The mesh was generated by twice re ning the coarse mesh shown in Figure 3. The number of unknowns N = 1202.

8 Using an Initial Pressure It is possible to use a good initial guess for w0 in the iterated penalty method (40). For example, we can use the

0 0

0.5

1

1.5

2

2.5 time

3

3.5

4

4.5

5

Figure 5: Reduction in number of iterations needed to satisfy (50) as a function of time for two dierent Reynolds numbers and for penalty parameter = 100 using initial wn from the previous time step for Jerey-Hamel ow. nal wn from the previous time step or other \outer iteration." Figure 5 shows the reduction in number of iterations needed as a function of time using an initial wn from the previous time step, for two dierent Reynolds numbers and for a moderate penalty parameter ( = 100). Numerical experiments were conducted for the Jerey-Hamel problem with zero initial guess. The number of penalty iterations approaches one because the solution approaches steady state. Figure 6 shows the eect of varying the penalty parameter for Reynolds numbers 20 and 240. Shown are the pressure and velocity error after convergence (at a time of t = 5) for the Jerey-Hamel problem. We see that if the penalty parameter becomes too large, then rst the pressure degrades and nally so does the velocity. Also shown is the number of penalty iterations at the rst time step when there is no initial guess for w0 . We see that it is large for a small penalty parameter but decreases rapidly to just one iteration for larger penalty parameters. The errors quoted in gures were in the L1 norm. We have obtained quantitatively similar error estimates in the L2 norm. For truly time-dependent problems, the number of penalty iterations also approaches one for each xedpoint iteration, when the value of penalty parameter is big enough. The uniform ow around a cylinder is a classical time-dependent problem which has been investigated by many researchers (see [3] and references reported

8

ICOSAHOM 95 Convergence of iterated penalty method vs penalty parameter, R = 20

3

Convergence of iterated penalty method vs penalty parameter, R = 240

3

10

10

2

2

10

10

1

1

10

10

0

Number of iterations

−1

Pressure error

−2

Velocity error

10

Number of iterations

0

10

Pressure error −1

10

10

−2

10

−3

10

Velocity error

10

−3

0

10

5

10

10

10 Penalty parameter

(a)

15

10

10

0

10

5

10

10

10

15

10

Penalty parameter

(b)

Figure 6: Eect of varying the penalty parameter for Reynolds numbers 20 (a) and 240 (b). Shown are the pressure and velocity error after convergence (at a time of t = 5) and the number of iterations in the penalty method at the rst time step, plotted as a function of the penalty parameter, . Number of penalty iterations for small value of the penalty parameter is proportional to 1 (b). therein). For Reynolds numbers larger than some critical value (around 40) vortex shedding occurs behind the cylinder (Figure 7). The time-dependent horizontal component of velocity is given in Figure 8. Plotted are velocity values at six points along the line through the center of the cylinder cross-section at the angle of = 153 with direction of ow for a Reynolds number R = 100, a time step size t = 0:05 and a computational mesh with 2831 nodal points. The frequency of the oscillations in terms of the dimensionless Strouhal number is found to be St = 0:177. Authors of [3] found the dimensionless Strouhal number to be St = 0:16 (only two digits were reported). The dierence in observed St could be a result of the nite computation domain used = [?10; 30] [?20; 20] or by the polygonal approximation of the cylinder (24 sides). A calculation on the domain = [?20; 60] [?40; 40], using a 36-edge polygon for the cylinder representation, a mesh with 12943 nodal points and a time step size t = 0:02, yielded a the dimensionless Strouhal number St = 0:169. Figure 7: Velocity eld for the uniform ow around a cylinder. R = 100; t = 100; t = 0:05. Computational domain

= [?10; 30] [?20; 20]. Radius of the cylinder r = 1. The initial time steps need a large number of xed-point Center of the cylinder is at (0; 0). Plotted is part of the and penalty iterations but when the solution becomes pedomain [?3; 9] [?6; 6]. Computational mesh has 2831 riodic (t > 50), each time step requires approximately the nodal points. Initial ow at t = 0 was uniform ow except same number of xed-point and penalty iterations. Figon the cylinder, where it was set to zero. ure 9 shows the total number of penalty iterations per


9

time step as function of the penalty parameter at time 100. Two dierent iteration strategies were used. One requires penalty iterations reach convergence according to the criterion (50) with tolerance "ip = 10?7. Another sets the number of penalty iterations per xed-point iteration to one and the iterations are forced to satisfy both convergence criteria (22) with tolerance "fp = 10?4 and (50). The second strategy requires fewer total number of penalty iterations for high values of the penalty parameter and therefore minimizes the number of linear-system solves required. We did simulation of the same ow for higher values of the Reynolds number (R = 200). For penalty parameter = 108 and time step t = 0:02 number of xed-point iterations equal 3 with one penalty iteration per xed-point iteration. Like for steady-state ows the value of penalty parameter can not arbitrary large due to the big numerical error. Since it appears that it is often possible to do only one Figure 8: Time-dependent horizontal component of veloc- penalty iteration per xed-point iteration, it is interestity eld at six points along the line through the center of ing to compare this method with other methods. Algethe cylinder cross-section at the angle of = 153 with di- braically, the system of equations for a mixed method (1) rection of ow. Simulations were for uniform ow around can be written in the form cylinder with diameter D = 2, R = 100. A B U = F (54) Bt 0 P 0 The iterated penalty method avoids the need to even calculate the matrix B, and only requires the solution of equations of the form (A + D)V = G which involve only the velocity degrees-of-freedom, where D = BB t and B t represents the matrix associated with the divergence operator. Since this is a much smaller system of equations than (54), the iterated penalty method can provide a much more ecient algorithm than ones like (54) which explicitly involve the pressure degrees-of-freedom, such as the Taylor-Hood method [7]. Uniform flow around a cylinder, R = 100

1.4

r/D=1.82

1.2

r/D=9.32 r/D=3.85

1

Velocity Ux

0.8

0.6

0.4

0.2

r/D=0.90

0

r/D=0.71 r/D=0.56

−0.2 0

5

10

15

20

25

30

35

Time

100

PI = 1

PI−>conv.

Total number of penalty iterations

80

60

40

20

0 2 10

3

10

4

10 Penalty parameter

5

10

6

10

Figure 9: Convergence of iterated penalty method for the uniform ow around a cylinder, as a function of penalty parameter for fully developed periodic ow (t = 100). o's indicate a strategy in which the penalty iterations are repeated until they reach convergence criterion (50). 's indicate a strategy with one penalty iteration per xed-point iteration and xed-point iterations are repeated until they satisfy both (47) and (50).

9 Direct Versus Iterative Solvers Iterative solvers are often used to solve the linearalgebraic problem that arises in implicit and implicitexplicit schemes for time-dependent partial dierential equations. However, direct solvers have various nice features. For example they allow one to solve non-symmetric problems with the same amount of work as for symmetric problems. Thus it is natural to ask what is the relative amount of work for direct versus iterative solvers. We will see that for high-order methods, at least in two dimensions, direct methods have an interesting range of applicability.

10

ICOSAHOM 95 1

Relative number of nonzeros of factorized matrix

10

p=4

2.0

p=5 p=6 p=7

0

10 3 10

4

10

5

6

10 10 Number of nonzeros of penalty matrix

7

10

Figure 10: Growth in the number of ll-ins using an ordering strategy related to the \minimum degree" algorithm. Plotted is the ratio of the number of nonzeros in the factored matrix to the number of nonzeros in the original matrix. Using an ordering strategy related to the \minimum degree" algorithm [9] in the case of symmetric matrices, the growth in the number of ll-ins is as depicted in Figure 10. What is plotted is the ratio of the number of non-zeros in the factored matrix to the number of non-zeros in the original matrix, as a function of the former. The number of non-zeros in the factored matrix is a good measure of the computational complexity for time-stepping problems where the matrix can be factored once and then re-used many times. We see that for a large range of problems, the number of non-zeros in the factored matrix will be less than twice that of the original matrix. This means that using the factored matrix for a direct solution will take less than two steps of typical iterative methods. The relative cost of solving is made even more apparent in Figure 11. Shown is the relative cost of the solver and the form evaluations for the 4-th and 6-th order niteelement method. The data reported in Figures 10 and 11 does not change very much as we vary the geometry of the domain and other problem parameters. The line marked D denotes the work for computing the divergence form b(v; p) and the line marked C denotes the work for computing the nonlinear form c(u; v; w). We see that for the 6-th order nite-element method, the nonlinear form dominates the solver by a substantial factor, although the cost for the latter is growing at a faster rate (the forms can be computed in an amount of work that is linear in

the number of matrix entries, but the cost of the solver grows supra-linearly). An interesting question is the eectiveness of increasing the degree of approximation versus decreasing the mesh size. To investigate the dependence of the numerical error on the degree of approximation and number of unknowns, we simulate Jerey-Hamel ow. A set of subsequently re ned meshes was used, starting with the coarsest one from Figure 3. Figure 12 compares the accuracy gained as a function of the number of non-zeros in the factored matrix, a measure of the dominant part of the computation. For smaller Reynolds numbers, higher degree approximations are more ecient. However, for larger Reynolds numbers, less dierence in eciency is seen as the degree is varied. Since the forward- and back-solve is so ecient, it is reasonable to study the eciency as a function of an estimate of the entire work for a complete time step for the overall algorithm. Figure 13 shows this data for Reynolds numbers 20 and 240. We see that basing the plots on an estimate of the work for a complete time-step tends to make loworder and high-order methods look much more similar in terms of accuracy achieved as a function of work expended. There is a noticeable increase in cost as the Reynolds number increases, due to the increased information content of the solution. We note that we have omitted plots of the eciency as a function of run time [2], as the latter is highly dependent on the computer architecture. For cache architectures, run time correlates well with the number of nonzeros in the factorized matrix. For very large problem sizes, iterative methods will be more ecient. However, the standard multigrid method requires a number of smoothing steps proportional to the penalty parameter [5]. The data in Figures 6 and ?? indicates penalty parameter should be very large.

10 Conclusions We showed that high-order nite element methods can simulate the ow of incompressible viscous uids eciently using the iterated penalty method for resolving the incompressibility constraint. We gave a description of the iterated penalty method in the case of inhomogeneous boundary conditions. We showed that direct methods may be a suitable choice for many simulations, at least in the twodimensional case. Current work extending the results of this paper involves mesh optimization using error estimators and an extension of these results to three dimensions.


11

Floating point operations per iteration vs number of unknowns, p = 4

8

10

7

Total C

7

Total solver C

10

Floating point operations

10

Floating point operations

Floating point operations per iteration vs number of unknowns, p = 6

8

10

D 6

10

5

10

4

solver D

6

10

5

10

4

10

10

3

3

10 1 10

2

3

10

10 Number of unknowns

4

10 1 10

5

10

10

2

10

(a)

3

4

10 Number of unknowns

5

10

10

(b)

Figure 11: Relative cost of solver and form evaluations for nite elements order p = 4 (a) and p = 6 (b). D denotes the divergence term and C the nonlinear term.

Computational error vs number of nonzeros of factorized matrix, R = 20

−1

10

−2

−2

10

10

−3

−3

10

Solution error

10

Solution error

Computational error vs number of nonzeros of factorized matrix, R = 240

−1

10

−4

10

−5

p = 10 p=7

−4

p=5

10

p=4 p=6 −5

10

10 p=5 p = 10

−6

10

p=7

−6

10

p=4 p=6

−7

10

−7

3

10

4

10

5

10 Number of nonzeros

(a)

6

10

7

10

10

3

10

4

10

5

10 Number of nonzeros

6

10

7

10

(b)

Figure 12: Accuracy in the maximum norm as a function of the number of non-zeros in the factored matrix for Reynolds number = 20 (a) and 240 (b). Numerical solution was computed for Jerey-Hamel ow on a set of subsequently re ned meshes.

12

ICOSAHOM 95 Computational error vs floating point operations per iteration, R = 20

−1

10

−2

−2

10

10

−3

−3

10

Solution error

10

Solution error

Computational error vs floating point operations per iteration, R = 240

−1

10

−4

10

−5

p = 10 p=7

−4

p=5

10

p=4 p=6 −5

10

10

p = 10 p=4

−6

10

−6

10

p=6 −7

10

−7

3

10

4

10

5

6

10 10 Floating point operations per iteration

(a)

7

10

8

10

10

3

10

4

10

5

6

10 10 Floating point operations per iteration

7

10

8

10

(b)

Figure 13: Accuracy as a function of the number of total work in a time step for Reynolds number = 20 (a) and 240 (b).

11 Acknowledgments

[6] S. D. Conte, C. de Boor. Elementary Numerical Analysis, McGraw-Hill, 1981. This work was supported in part by the National Science Foundation through award number DMS9403563 and [7] V. Girault and P.-A. Raviart. Finite Element Methods for Navier-Stokes Equations, Theory and Algorithms. the Texas Advanced Research Program, Project number Springer-Verlag , Berlin, New York, 1986. 003652-141. [8] L. D. Landau and E. M. Lifshitz, Fluid Mechanics, Pergamon Press, 1959. [9] S. Pissanetsky, Sparse Matrix Technology, Academic [1] The Albert system is available in the compressed Press, 1984. tar le albert.tar.Z which can be obtained from http://www.hpc.uh.edu/~babak. [10] M. Vogelius and L. R. Scott. Norm estimates for a maximal right inverse of the divergence operator in [2] B. Bagheri, S. Zhang, and L. R. Scott. Implementing spaces of piecewise polynomials. M 2AN (formerly and using high{order nite element methods. Finite R.A.I.R.O. Analyse Numerique), 19:111{143, 1985. Elements in Analysis & Design, 16:175{189, 1994. [3] M. Braza, P. Chassaing and H. Ha Minh. Numerical study and physical analysis of the pressure and velocity elds in the near wake of a circular cylinder. J. Fluid Mechanics, 165:79{130, 1986. [4] S. Brenner and L. R. Scott. The Mathematical Theory of Finite Element Methods. Springer-Verlag, 1994. [5] Z. Cai, C. I. Goldstein and J. E. Pasciak. Multilevel iteration for mixed nite element systems with penalty. SIAM J. Sci. Comput., 14:1072{1088, 1993.

References

Fast Algorithms for Solving High-Order Finite Element Equations for ...

Fast Algorithms for Solving High-Order Finite Element Equations for ...

Suggest Documents

Highorder spacetime finite element schemes for ... - Wiley Online Library

A Highorder extended finite element method for ...

Fast simplicial finite element algorithms using

Finite element method for extended KdV equations

Finite element method for solving geodetic

Finite Element Methods for Partial Differential Equations

Multigrid algorithms for stochastic finite element

Finite element algorithms for thermoelastic wear problems

Adaptive implicit-explicit finite element algorithms for

Fast simplicial finite element algorithms using Bernstein ... - CiteSeerX

Sparse finite element methods for operator equations with stochastic ...

NEW FINITE ELEMENT ITERATIVE METHODS FOR SOLVING A ...

Finite Element Solution of Navier-Stokes Equations for ... - CiteSeerX

Cut Finite Element Methods for Partial Differential Equations on ... - arXiv

Finite Element Method of High-Order Accuracy for Solving Two

Efficient Algorithms for Solving Partial Differential Equations with ...

Efficient Algorithms for Solving Static Hamilton-Jacobi Equations

Matrix algorithms for solving (in) homogeneous bound state equations

Algorithms for Solving Linear and Polynomial Systems of Equations ...

Fast Quantum Algorithm for Solving Multivariate Quadratic Equations

finite element-based algorithms to make cuts for ... - PIER Journals

finite element algorithms for dynamic ... - Wiley Online Library

Finite element algorithms for contact problems - Springer Link

Conditioning of finite element equations with