
FAST ITERATIVE SOLVERS FOR DISCRETE STOKES EQUATIONS

JÖRG PETERS, VOLKER REICHELT AND ARNOLD REUSKEN∗

Abstract. We consider saddle point problems that result from the finite element discretization of stationary and instationary Stokes equations. Three efficient iterative solvers for these problems are treated, namely the preconditioned CG method introduced by Bramble and Pasciak, the preconditioned MINRES method and a method due to Bank et al. We give a detailed overview of algorithmic aspects and theoretical convergence results. For the method of Bank et al. a new convergence analysis is presented. A comparative study of the three methods for a 3D Stokes problem discretized by the Hood-Taylor P2-P1 finite element pair is given.

AMS subject classifications. 65N30, 65F10

Key words. Stokes equations, inexact Uzawa methods, preconditioned MINRES, multigrid

1. Introduction. We consider a class of Stokes equations on a bounded connected polyhedral Lipschitz domain Ω in d-dimensional Euclidean space. We use the notation V := H_0^1(Ω)^d for the velocity space and M = L_0^2(Ω) := { p ∈ L^2(Ω) | ∫_Ω p(x) dx = 0 } for the pressure space. The variational problem is as follows: given f ∈ L^2(Ω)^d find {u, p} ∈ V × M such that

    (∇u, ∇v) + ξ(u, v) − (div v, p) = (f, v)   for all v ∈ V,
    (div u, q) = 0                             for all q ∈ M,                     (1.1)

with a constant ξ ≥ 0. Here and in the remainder the L^2 scalar product and associated norm are denoted by (·, ·) and ‖·‖, respectively. The zero order term ξ(u, v) is included in view of implicit time integration methods applied to instationary Stokes equations. For the discretization of this problem we use a pair of conforming LBB stable finite element spaces. This results in a saddle point problem of the form

    [ A   B^T ] [ x ]   [ f ]
    [ B   0   ] [ y ] = [ 0 ]                                                     (1.2)

Many different iterative methods for solving this discrete problem are known. One can apply multigrid techniques to the whole coupled system in (1.2), cf. [15, 27, 28]. Most other approaches are based on the prominent classical Uzawa method. This Uzawa method requires that A^{-1}x can be computed exactly ([1]). In many variants of this method, which are often called inexact Uzawa methods, one tries to avoid the exact inversion by using an inner iterative method, for example, a Jacobi-like iteration (Arrow-Hurwicz algorithm, [1]) or a multigrid method, cf. [24]. We also mention (variants of) the preconditioned MINRES method [21, 22, 23, 26] and the method from [2]. A different approach is presented in [3]. There the indefinite problem (1.2) is reformulated as a symmetric positive definite problem. For most of these methods theoretical convergence analyses are known, cf. [2, 3, 5, 12, 21, 23, 24, 26, 29]. In [11] the performance of a few of these methods is compared by means of systematic numerical experiments for a stationary 2D Stokes problem.
In this paper we consider three representative methods from the large class of inexact Uzawa methods, namely the preconditioned CG method from [3] (denoted by

∗ Institut für Geometrie und Praktische Mathematik, RWTH-Aachen, D-52056 Aachen, Germany; email: [email protected]. This work was supported by the German Research Foundation through SFB 540.


BPCG), the preconditioned MINRES method from [21, 23, 26] (denoted by PMINRES) and the method from [2] (denoted by MGUZAWA). The topics treated in this paper are the following:
• For these three methods we discuss costs per iteration, known theoretical convergence results and some implementation issues. This allows a fair comparison of these methods.
• For the MGUZAWA method we present a convergence analysis. This analysis is much simpler than the analyses presented in [2, 29]. The result that we obtain is different from the ones in [2, 29] and gives a better explanation of the observation that if one uses a very good preconditioner for A (like multigrid) then the MGUZAWA method converges even with a very low accuracy in the inner iteration (cf. remark 7).
• We present a comparative study of the performance of the three methods. For this we consider a Stokes problem as in (1.1) in 3D. We treat both the stationary (ξ = 0) and instationary (ξ > 0) case. For the discretization we apply the popular Hood-Taylor P2-P1 finite element pair. As a preconditioner for A a standard multigrid method is used. For the Schur complement preconditioner we use the mass matrix in the stationary case and a more sophisticated preconditioner analyzed in [4] in the instationary case.
The main results of this paper are a detailed comparative study of the three fast iterative solvers for a 3D Stokes problem and a new convergence analysis for the MGUZAWA method.
The paper is organized as follows. In section 2 we formulate the standard Galerkin discretization of the Stokes problem. In section 3.1 we consider the BPCG method. An efficient implementation of this method is discussed and a main convergence result known from the literature is formulated. In section 3.2 the PMINRES method is treated. Also for this method a known convergence result is given. In section 3.3 we discuss the MGUZAWA method and give an efficient implementation of this method.
In section 4 a convergence analysis of this method is given. Theorem 4.3 contains the main theoretical result of this paper. In section 5 we discuss the preconditioners for the A-block and for the Schur complement. In section 6 results of numerical experiments are presented that illustrate the performance of the three methods. Finally, in section 7 we give some conclusions.

2. The discrete Stokes equations. For the discretization of the Stokes problem (1.1) we assume a family of triangulations {T_h} in the sense of [9, 10] and a pair of finite element spaces V_h ⊂ V and M_h ⊂ M that is LBB stable with a constant β̂ independent of h:

    inf_{q_h ∈ M_h} sup_{v_h ∈ V_h} (div v_h, q_h) / ( ‖∇v_h‖ ‖q_h‖ ) ≥ β̂ > 0 .   (2.1)

The Galerkin discretization is as follows: Find {u_h, p_h} ∈ V_h × M_h such that

    (∇u_h, ∇v_h) + ξ(u_h, v_h) − (div v_h, p_h) = (f, v_h)   for all v_h ∈ V_h,
    (div u_h, q_h) = 0                                       for all q_h ∈ M_h.   (2.2)

In practice the discrete space Mh for the pressure is constructed by taking a standard finite element space, which we denote by Mh+ (for example, continuous piecewise linear functions) and then adding an orthogonality condition: Mh = { ph ∈ Mh+ | (ph , 1) = 0 }


Note that dim(M_h) = dim(M_h^+) − 1. Let n := dim(V_h), m := dim(M_h^+). We assume standard (nodal) bases in V_h and M_h^+ and corresponding isomorphisms J_V : R^n → V_h, J_M : R^m → M_h^+. Let the stiffness matrices D ∈ R^{n×n}, B ∈ R^{m×n} and the mass matrix M ∈ R^{m×m} be given by

    ⟨Dx, y⟩ = (∇J_V x, ∇J_V y)     for all x, y ∈ R^n,
    ⟨Bx, y⟩ = (div J_V x, J_M y)   for all x ∈ R^n, y ∈ R^m,                      (2.3)
    ⟨Mx, y⟩ = (J_M x, J_M y)       for all x, y ∈ R^m.

Here ⟨·, ·⟩ denotes the standard Euclidean scalar product. The discrete problem has a matrix-vector representation of the form

    [ A   B^T ] [ x ]   [ f ]
    [ B   0   ] [ y ] = [ 0 ] ,     A := D + ξM,                                  (2.4)

with f such that ⟨f, y⟩ = (f, J_V y) for all y ∈ R^n. We introduce the notation

    K = [ A   B^T ]
        [ B   0   ] ,     S := B A^{-1} B^T .

Define the constant vector e := J_M^{-1} 1 = (1, . . . , 1)^T ∈ R^m. Note that B^T e = 0 and that both the Schur complement S and the matrix K are singular and have a one-dimensional kernel. For the Schur complement this kernel is ker(S) = span{e}. Note that

    (J_M y, 1) = 0  ⇔  (J_M y, J_M e) = 0  ⇔  ⟨My, e⟩ = 0 .                       (2.5)

Hence, with e^{⊥_M} := { y ∈ R^m | ⟨y, Me⟩ = 0 } we have M_h = { J_M y | y ∈ e^{⊥_M} } and we get the following matrix-vector representation of the discrete problem (2.2):

    Find x ∈ R^n, y ∈ e^{⊥_M} such that (2.4) holds.                              (2.6)
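The kernel properties noted above can be verified on a small synthetic example. The matrices below are random stand-ins for the Stokes blocks (not an actual finite element discretization); B is projected so that B^T e = 0 holds, as it does for the discrete divergence matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3

# SPD "velocity" block A and a divergence-like block B with B^T e = 0
Q = rng.standard_normal((n, n))
A = Q @ Q.T + n * np.eye(n)          # symmetric positive definite
e = np.ones(m)
B0 = rng.standard_normal((m, n))
B = B0 - np.outer(e, e @ B0) / m     # subtract column means so that e^T B = 0

S = B @ np.linalg.solve(A, B.T)      # Schur complement S = B A^{-1} B^T

# B^T e = 0, hence S e = 0: the constant vector spans ker(S)
print(np.allclose(B.T @ e, 0), np.allclose(S @ e, 0))
```

Since e lies in the left null space of B, the Schur complement of this toy system is singular with the one-dimensional kernel span{e}, mirroring the statement for the discrete Stokes matrices.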

3. Iterative solvers. In this section we describe three iterative solvers for the problem (2.6). The first two methods, a preconditioned conjugate gradient method (BPCG) and a preconditioned minimal residual method (PMINRES), are known from the literature and often used in practice. The third method, which may be less known, is a simple variant of the Uzawa method as presented in [2]. In section 4 a convergence analysis of this method is given. In all three solvers we need symmetric positive definite preconditioners Q_A of A and Q_S of S. The quality of these preconditioners is described by the following spectral equivalences. Let γ_A > 0, Γ_A, γ_S > 0, Γ_S be such that

    γ_A Q_A ≤ A ≤ Γ_A Q_A ,                                                       (3.1)
    γ_S Q_S ≤ S ≤ Γ_S Q_S    on e^{⊥_M} .                                         (3.2)

In section 5 we will discuss concrete choices for the preconditioners QS and QA .
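For given symmetric positive definite matrices the sharp constants in a spectral equivalence like (3.1) are the extreme generalized eigenvalues of the pencil (A, Q_A). A small numerical check, with arbitrary SPD stand-ins for A and Q_A:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
n = 6
Q0 = rng.standard_normal((n, n))
A = Q0 @ Q0.T + n * np.eye(n)        # SPD matrix playing the role of A
QA = A + 0.3 * np.eye(n)             # SPD preconditioner stand-in

# generalized eigenvalues of A x = lambda QA x give the sharp constants:
# gamma_A = lambda_min and Gamma_A = lambda_max in gamma_A QA <= A <= Gamma_A QA
lam = eigh(A, QA, eigvals_only=True)
gamma_A, Gamma_A = lam[0], lam[-1]

# verify the operator inequalities on random vectors
for _ in range(100):
    x = rng.standard_normal(n)
    qx, ax = x @ QA @ x, x @ A @ x
    assert gamma_A * qx <= ax + 1e-9 and ax <= Gamma_A * qx + 1e-9
print(gamma_A, Gamma_A)
```

With the choice QA = A + 0.3 I all generalized eigenvalues lie in (0, 1), so this example also satisfies the scaling assumption A ≤ Q_A used later in (4.1).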


3.1. Preconditioned conjugate gradient method. We describe the preconditioned conjugate gradient (BPCG) method introduced in [3]. For this method we need the following assumption on the preconditioner Q_A:

    0 < Q_A < A ,                                                                 (3.3)

i.e., γ_A > 1 in (3.1). Clearly, using some suitable scaling one can always satisfy the condition (3.3). Premultiplication of (2.4) by the matrix

    G = [ Q_A^{-1}      0  ]
        [ B Q_A^{-1}   −I  ]

yields an equivalent system

    K̂ [ x ] = G [ f ] ,    K̂ := GK = [ Q_A^{-1} A           Q_A^{-1} B^T   ]
      [ y ]     [ 0 ]                 [ B Q_A^{-1} A − B    B Q_A^{-1} B^T ] .    (3.4)

Due to the assumption (3.3) the bilinear form

    ⟨ (x_1, x_2), (y_1, y_2) ⟩ := ⟨(A − Q_A) x_1, y_1⟩ + ⟨x_2, y_2⟩               (3.5)

defines an inner product. In [3] it is shown that the matrix K̂ is symmetric and positive definite on R^n × e^{⊥_M} with respect to this inner product. The matrix

    K̃ := [ I   0   ]
         [ 0   Q_S ]

is also symmetric positive definite with respect to this scalar product. Hence we can apply the preconditioned CG method in the scalar product ⟨·, ·⟩ and with preconditioner K̃ to the linear system in (3.4). The algorithm is as follows:

    v^0 = (x^0, y^0) a given starting vector;  r̄^0 := (f, 0)^T − K v^0,  r^0 := G r̄^0
    for k ≥ 0 (if r^k ≠ 0):
        Solve z^k from K̃ z^k = r^k
        β_k^{(n)} := ⟨z^k, r^k⟩,  β_k := β_k^{(n)} / β_{k−1}^{(n)}  (β_0 := 0)
        p^k := z^k + β_k p^{k−1}  (p^0 := z^0)                                    (3.6)
        α_k^{(d)} := ⟨K̂ p^k, p^k⟩,  α_k := β_k^{(n)} / α_k^{(d)}
        v^{k+1} := v^k + α_k p^k
        r^{k+1} := r^k − α_k K̂ p^k

This is the standard PCG algorithm. We now discuss an efficient implementation of this algorithm based on ideas from [11]. In particular we want to avoid the evaluation of Q_A. We use indices 1 and 2 to denote the first and second vector components, respectively (as in (3.5)). First note that due to the structure of K̃ we have z_1^k = r_1^k. We introduce r̄_1^k := Q_A r_1^k and d := A r_1^k. We then have

    ⟨z^k, r^k⟩ = ⟨(A − Q_A) z_1^k, r_1^k⟩ + ⟨z_2^k, r_2^k⟩
               = ⟨(A − Q_A) r_1^k, r_1^k⟩ + ⟨z_2^k, r_2^k⟩
               = ⟨d, r_1^k⟩ − ⟨r̄_1^k, r_1^k⟩ + ⟨z_2^k, r_2^k⟩ .                   (3.7)


Also define s^k := A p_1^k. From p^k = z^k + β_k p^{k−1} it follows that for s^k we have the relation s^k = A r_1^k + β_k s^{k−1} = d + β_k s^{k−1}. With

    [ t ] := K p^k = [ s^k + B^T p_2^k ] ,    [ y ] := K̂ p^k = [ Q_A^{-1} t       ]
    [ u ]            [ B p_1^k         ]      [ z ]            [ B Q_A^{-1} t − u ]

we obtain

    ⟨K̂ p^k, p^k⟩ = ⟨(A − Q_A) y, p_1^k⟩ + ⟨z, p_2^k⟩
                 = ⟨y, A p_1^k⟩ − ⟨t, p_1^k⟩ + ⟨z, p_2^k⟩
                 = ⟨y, s^k⟩ − ⟨t, p_1^k⟩ + ⟨z, p_2^k⟩ .                            (3.8)

Finally note that for r̄_1^k we have the recursion

    r̄_1^{k+1} := r̄_1^k − α_k (K p^k)_1 = r̄_1^k − α_k t .

Summarizing we then obtain the following efficient implementation of the BPCG algorithm:

    v^0 = (x^0, y^0) a given starting vector;  r̄^0 := (f, 0)^T − K v^0,  r^0 := G r̄^0
    for k ≥ 0 (if r^k ≠ 0):
        Solve z_2^k from Q_S z_2^k = r_2^k,  z_1^k := r_1^k
        d := A r_1^k
        β_k^{(n)} := ⟨d, r_1^k⟩ − ⟨r̄_1^k, r_1^k⟩ + ⟨z_2^k, r_2^k⟩,  β_k := β_k^{(n)} / β_{k−1}^{(n)}  (β_0 := 0)
        p^k := z^k + β_k p^{k−1}  (p^0 := z^0),  s^k := d + β_k s^{k−1}  (s^0 := d)
        t := s^k + B^T p_2^k,  u := B p_1^k,  y := Q_A^{-1} t,  z := B Q_A^{-1} t − u     (3.9)
        α_k^{(d)} := ⟨y, s^k⟩ − ⟨t, p_1^k⟩ + ⟨z, p_2^k⟩,  α_k := β_k^{(n)} / α_k^{(d)}
        v^{k+1} := v^k + α_k p^k
        r^{k+1} := r^k − α_k (y, z)^T,  r̄_1^{k+1} := r̄_1^k − α_k t

In this algorithm we do not need the evaluation of Q_A and only one evaluation of Q_A^{-1} per iteration. Furthermore, per iteration we need one evaluation of Q_S^{-1}, two matrix-vector products with B, one with B^T and one with A. We want to use this algorithm to solve the problem (2.6). Then it is desirable that the iterates v^k = (x^k, y^k) are such that y^k ∈ e^{⊥_M} holds. Therefore we assume that the preconditioner Q_S satisfies the consistency condition

    Q_S e = α M e ,    α ∈ R .                                                    (3.10)

From this assumption it follows that y^0 ∈ e^{⊥_M} implies y^k ∈ e^{⊥_M} for all k ≥ 1. Moreover, if (3.10) holds then K̃^{-1} K̂ is bijective on R^n × e^{⊥_M}. We now formulate a main convergence result for the BPCG algorithm due to [3, 29]. Let κ_*(K̃^{-1} K̂) be the spectral condition number of the matrix K̃^{-1} K̂ on the subspace R^n × e^{⊥_M}. This quantity plays a key role in the convergence analysis of the PCG method.

Theorem 3.1. Assume that (3.1), (3.2), (3.3) and (3.10) hold. For x ∈ R define

    g(x) = (1/2)(1 + x) + (1/2) sqrt( (1 + x)^2 − 4x Γ_A^{-1} ) .

For the BPCG method we have

    κ_*(K̃^{-1} K̂) ≤ (Γ_S Γ_A / γ_S) · g(Γ_S) g(γ_S) / Γ_S .                       (3.11)

Proof. We can apply theorem 4.1 in [29] in the subspace R^n × e^{⊥_M}. This yields κ_*(K̃^{-1} K̂) ≤ (1 − ρ_1)/(1 − ρ_2), with

    ρ_1 := −(2 − (1 + Γ_S) Γ_A)/2 + sqrt( (2 − (1 + Γ_S) Γ_A)^2 / 4 + Γ_A − 1 ) ,
    ρ_2 :=  (2 − (1 + γ_S) Γ_A)/2 + sqrt( (2 − (1 + γ_S) Γ_A)^2 / 4 + Γ_A − 1 ) .

Elementary manipulations show that (1 − ρ_1)/(1 − ρ_2) = (Γ_S Γ_A / γ_S) · g(Γ_S) g(γ_S) / Γ_S holds.

Remark 1. Note that Γ_A ≥ 1 and thus (1/2)(1 + x) + (1/2)|1 − x| ≤ g(x) ≤ (1/2)(1 + x) + (1/2)|1 + x|. Hence, 1 ≤ g(x) ≤ 1 + x if x ∈ [0, 1] and x ≤ g(x) ≤ 1 + x if x ≥ 1. From this it follows that for the factor C(γ_S, Γ_S) := g(Γ_S) g(γ_S) / Γ_S in (3.11) we have

    1/Γ_S ≤ C(γ_S, Γ_S) ≤ 4/Γ_S          if Γ_S ≤ 1 ,
    1 ≤ C(γ_S, Γ_S) ≤ 2 + 2/Γ_S ≤ 4      if γ_S ≤ 1 ≤ Γ_S ,
    γ_S ≤ C(γ_S, Γ_S) ≤ 3 + γ_S ≤ 4γ_S   if 1 ≤ γ_S .

From this we see that although the condition number of the preconditioned Schur complement κ_*(Q_S^{-1} S) ≤ Γ_S / γ_S is independent of the scaling of the preconditioner Q_S, the bound for the condition number κ_*(K̃^{-1} K̂) in (3.11) does depend on the scaling of Q_S. A poor scaling (i.e., Γ_S ≪ 1 or γ_S ≫ 1) leads to a large upper bound. We consider an example which shows that this bound is sharp. Take Q_A = Γ_A^{-1} A and Q_S = Γ^{-1} S (on e^{⊥_M}) with Γ_A > 1 and Γ > 0. Then (3.2) is fulfilled with γ_S = Γ_S = Γ. A simple computation shows that σ_*(K̃^{-1} K̂) = { Γ/g(Γ), Γ_A, Γ_A g(Γ) } and thus κ_*(K̃^{-1} K̂) = Γ_A g(Γ)^2 / Γ, which agrees with the bound in (3.11). Hence κ_*(K̃^{-1} K̂) ≥ Γ_A / Γ if Γ ≤ 1 and κ_*(K̃^{-1} K̂) ≥ Γ_A Γ if Γ ≥ 1. Note, however, that although κ_*(K̃^{-1} K̂) is large if Γ ≪ 1 or Γ ≫ 1 the BPCG method will converge in three iterations because σ_*(K̃^{-1} K̂) contains only three elements.
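The efficient implementation (3.9) can be exercised on a small dense synthetic problem. Everything below is an illustrative stand-in: random SPD A, random full-rank B (so the pressure kernel issue does not arise), Q_A = A/2 to enforce (3.3), and Q_S = I; the paper's multigrid preconditioners are not used.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 8, 3
Q0 = rng.standard_normal((n, n))
A = Q0 @ Q0.T + n * np.eye(n)            # SPD A-block
B = rng.standard_normal((m, n))          # full-rank B (no pressure kernel here)
QA_inv = np.linalg.inv(0.5 * A)          # QA = A/2 satisfies 0 < QA < A, cf. (3.3)
f = rng.standard_normal(n)

K = np.block([[A, B.T], [B, np.zeros((m, m))]])
v_exact = np.linalg.solve(K, np.concatenate([f, np.zeros(m)]))

# algorithm (3.9) with QS = I; rb1 tracks QA r_1 without ever applying QA
x, y = np.zeros(n), np.zeros(m)
rb1 = f - A @ x - B.T @ y                # (bar r^0)_1 = f - A x^0 - B^T y^0
rb2 = -(B @ x)                           # (bar r^0)_2 = 0 - B x^0
r1 = QA_inv @ rb1                        # r^0 = G bar r^0
r2 = B @ r1 - rb2
rb1 = 0.5 * A @ r1                       # bar r_1 = QA r_1 (unchanged here)
p1 = p2 = s = None
bn_old = 1.0
for k in range(40):
    if np.sqrt(r1 @ r1 + r2 @ r2) < 1e-13:
        break
    z1, z2 = r1, r2                      # solve tilde K z = r with QS = I
    d = A @ r1
    bn = d @ r1 - rb1 @ r1 + z2 @ r2     # beta_k^{(n)}, cf. (3.7)
    if k == 0:
        p1, p2, s = z1.copy(), z2.copy(), d.copy()
    else:
        beta = bn / bn_old
        p1, p2, s = z1 + beta * p1, z2 + beta * p2, d + beta * s
    bn_old = bn
    t = s + B.T @ p2                     # (K p)_1
    u = B @ p1                           # (K p)_2
    yv = QA_inv @ t                      # (hat K p)_1
    zv = B @ yv - u                      # (hat K p)_2
    alpha = bn / (yv @ s - t @ p1 + zv @ p2)
    x, y = x + alpha * p1, y + alpha * p2
    r1, r2, rb1 = r1 - alpha * yv, r2 - alpha * zv, rb1 - alpha * t

err = np.linalg.norm(np.concatenate([x, y]) - v_exact)
print(err)
```

With these choices the iteration is an exact CG method in the inner product (3.5), so the error drops to the order of machine precision within at most n + m iterations.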

3.2. Preconditioned minimal residual method. In this section we consider a preconditioned minimal residual (PMINRES) method for solving the discretized Stokes problem. This class of methods has been developed in [20, 21, 23]. We consider a symmetric positive definite preconditioner

    K̃ = [ Q_A   0   ]
        [ 0     Q_S ]                                                             (3.12)

for K. Define the norm ‖v‖_K̃ := ⟨K̃ v, v⟩^{1/2} for v ∈ R^{n+m}. Given a starting vector v^0 with corresponding error e^0 := v^* − v^0, in the preconditioned MINRES method one computes the vector v^k ∈ v^0 + span{ K̃^{-1} K e^0, . . . , (K̃^{-1} K)^k e^0 } which minimizes the preconditioned residual ‖K̃^{-1} K (v^* − v)‖_K̃ over this subspace.

An efficient implementation of this method can be derived using the Lanczos algorithm and Givens rotations. For such an implementation we refer to the literature, e.g. [19, 14]. In an efficient implementation of this method one needs per iteration one evaluation of Q_A^{-1}, one evaluation of Q_S^{-1} and one matrix-vector product with K. We assume that the preconditioner Q_S satisfies the consistency condition (3.10). Then it follows that the approximations y^k of the discrete pressure are in the subspace e^{⊥_M} if the starting vector y^0 is in this subspace. Note that for this algorithm one does not need the scaling condition (3.3) for Q_A. We give two main theoretical convergence results on the PMINRES algorithm from the literature (cf. [14, 21, 23]).

Theorem 3.2. Let v^0 = (x^0, y^0) be a given starting vector with y^0 ∈ e^{⊥_M}. For v^k, k ≥ 0, computed in the PMINRES algorithm we define r̃^k = K̃^{-1} ( (f, 0)^T − K v^k ). The following holds:

    ‖r̃^k‖_K̃ = min_{p_k ∈ P_k; p_k(0)=1} ‖p_k(K̃^{-1} K) r̃^0‖_K̃
             ≤ min_{p_k ∈ P_k; p_k(0)=1} max_{λ ∈ σ_*(K̃^{-1} K)} |p_k(λ)| ‖r̃^0‖_K̃ ,   (3.13)

where σ_*(·) denotes the spectrum on the subspace R^n × e^{⊥_M}.

Theorem 3.3. Assume that (3.1), (3.2) and (3.10) hold. For the spectrum of the preconditioned matrix K̃^{-1} K we have:

    σ_*(K̃^{-1} K) ⊂ [ (1/2)( γ_A − sqrt(γ_A^2 + 4Γ_S Γ_A) ), (1/2)( γ_A − sqrt(γ_A^2 + 4γ_S γ_A) ) ]
                    ∪ [ γ_A, (1/2)( Γ_A + sqrt(Γ_A^2 + 4Γ_S Γ_A) ) ]              (3.14)

From the results in these theorems it follows that the rate of convergence of the PMINRES method is robust with respect to variation in the parameters h and ξ if the spectral constants γ_S, Γ_S, γ_A, Γ_A do not depend on these parameters. Moreover we can expect faster convergence if we have better preconditioners.

Remark 2. Related to the scaling we consider the same example as in remark 1: Q_A = Γ_A^{-1} A, Q_S = Γ^{-1} S (on e^{⊥_M}). We then have γ_A = Γ_A, γ_S = Γ_S = Γ. A simple computation yields σ_*(K̃^{-1} K) = { (1/2)( Γ_A − sqrt(Γ_A^2 + 4ΓΓ_A) ), Γ_A, (1/2)( Γ_A + sqrt(Γ_A^2 + 4ΓΓ_A) ) }. This shows that for this example the result in (3.14) is sharp. Furthermore, the scaling of Q_S has a significant influence on the endpoints of the intervals in (3.14). However, for this example, the PMINRES method will converge in three iterations since σ_*(K̃^{-1} K) contains only three elements.
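For comparison, an off-the-shelf MINRES routine with the block-diagonal preconditioner (3.12) can be set up as follows. SciPy's minres is used here as a stand-in for the PMINRES implementation, and the choices Q_A = A, Q_S = I on a random toy system are purely illustrative.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, minres

rng = np.random.default_rng(3)
n, m = 8, 3
Q0 = rng.standard_normal((n, n))
A = Q0 @ Q0.T + n * np.eye(n)
B = rng.standard_normal((m, n))
K = np.block([[A, B.T], [B, np.zeros((m, m))]])   # symmetric indefinite
b = np.concatenate([rng.standard_normal(n), np.zeros(m)])

# block-diagonal SPD preconditioner tilde K = diag(QA, QS) as in (3.12);
# scipy expects the action of its inverse. QA = A and QS = I are toy choices.
A_inv = np.linalg.inv(A)
Minv = LinearOperator((n + m, n + m),
                      matvec=lambda v: np.concatenate([A_inv @ v[:n], v[n:]]))

v, info = minres(K, b, M=Minv, maxiter=200)
rel_res = np.linalg.norm(K @ v - b) / np.linalg.norm(b)
print(info, rel_res)
```

Note that MINRES, unlike CG, is applicable to the indefinite matrix K directly, which is why no transformation as in (3.4) and no scaling condition (3.3) are needed.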


3.3. An inexact Uzawa method. A basic method for saddle point problems is the Uzawa method. This method is closely related to the block factorization

    K = [ A   0  ] [ I   A^{-1} B^T ]
        [ B  −I  ] [ 0   S          ] .

Solving the problem K (x, y)^T = (f, 0)^T by block forward-backward substitution yields the equivalent problem:

    1. Solve A z = f .                                                            (3.15)
    2. Solve S y = B z ,  y ∈ e^{⊥_M} .                                           (3.16)
    3. Solve A x = f − B^T y .                                                    (3.17)
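These three steps are easily checked against a direct solve of the full saddle point system (random SPD A and full-rank B as stand-ins, all solves exact):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 8, 3
Q0 = rng.standard_normal((n, n))
A = Q0 @ Q0.T + n * np.eye(n)
B = rng.standard_normal((m, n))
f = rng.standard_normal(n)

# steps (3.15)-(3.17) with exact solves
z = np.linalg.solve(A, f)                    # 1. A z = f
S = B @ np.linalg.solve(A, B.T)              #    Schur complement S = B A^{-1} B^T
y = np.linalg.solve(S, B @ z)                # 2. S y = B z
x = np.linalg.solve(A, f - B.T @ y)          # 3. A x = f - B^T y

# compare with a direct solve of the full saddle point system
K = np.block([[A, B.T], [B, np.zeros((m, m))]])
v = np.linalg.solve(K, np.concatenate([f, np.zeros(m)]))
print(np.allclose(np.concatenate([x, y]), v))
```

The inexact Uzawa variant below replaces the exact A-solves and the exact Schur complement solve by the preconditioner Q_A and the approximate inverse Ψ, respectively.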

In the Uzawa method one applies an iterative solver (e.g., CG) to the Schur complement system in step 2. The A-systems that occur in each iteration of this method and in the steps 1 and 3 are solved sufficiently accurately using some fast Poisson solver. Here we reconsider a simple variant of this method which was introduced in [2]. Let Q_A be a preconditioner of A as in (3.1). We use this preconditioner in the steps 1 and 3 and also for the approximation of the Schur complement in step 2. For this we introduce the notation

    Ŝ := B Q_A^{-1} B^T .                                                         (3.18)

Note that Ŝ : e^{⊥_M} → e^⊥ is bijective. We use a (nonlinear) approximate inverse of Ŝ denoted by Ψ : e^⊥ → e^{⊥_M}. For each w ∈ e^⊥, Ψ(w) is an approximation to the solution z^* ∈ e^{⊥_M} of Ŝ z = w. We assume that

    ‖Ψ(w) − z^*‖_Ŝ ≤ δ ‖z^*‖_Ŝ    for all w ∈ e^⊥                                 (3.19)

holds with some δ < 1. In our experiments Ψ will be the PCG method. Let (x^k, y^k) be a given approximation to the solution (x, y). Note that using the block factorization of K we get

    [ x ]   [ x^k ]   [ I   −A^{-1} B^T S^{-1} ] [ A^{-1}     0  ] ( [ f ]     [ x^k ] )
    [ y ] = [ y^k ] + [ 0    S^{-1}            ] [ B A^{-1}  −I  ] ( [ 0 ] − K [ y^k ] ) .   (3.20)

With A^{-1} ≈ Q_A^{-1}, Ŝ^{-1} w ≈ Ψ(w) and r_1^k := f − A x^k − B^T y^k we obtain the (nonlinear) iterative method

    x^{k+1} = x^k + Q_A^{-1} r_1^k − Q_A^{-1} B^T Ψ( B (Q_A^{-1} r_1^k + x^k) )
    y^{k+1} = y^k + Ψ( B (Q_A^{-1} r_1^k + x^k) )                                 (3.21)


We obtain the following algorithmic structure:

    v^0 = (x^0, y^0) a given starting vector, y^0 ∈ e^{⊥_M};  r_1^0 := f − A x^0 − B^T y^0
    for k ≥ 0 :
        w := x^k + Q_A^{-1} r_1^k                                                 (3.22)
        z := Ψ(B w) ∈ e^{⊥_M}                                                    (3.23)
        x^{k+1} := w − Q_A^{-1} B^T z                                             (3.24)
        y^{k+1} := y^k + z                                                        (3.25)
        r_1^{k+1} := r_1^k − A (x^{k+1} − x^k) − B^T z                            (3.26)

This is the same method as the one presented in section 4 in [2]. An obvious choice for Ψ(Bw), which is also used in our experiments, is the following:

    Ψ(Bw) = result of ℓ PCG iterations with starting vector 0 and
            preconditioner Q_S applied to Ŝ y = B w .                             (3.27)

Remark 3. We comment on some implementation issues and on the arithmetic costs per iteration of the algorithm (3.22)-(3.27). In each iteration of the PCG algorithm in (3.27) one needs an evaluation of Q_S^{-1} and of Ŝ = B Q_A^{-1} B^T. Thus per PCG iteration we need one evaluation of Q_S^{-1}, one of Q_A^{-1} and a matrix-vector multiplication with B and with B^T. In the PCG method one computes approximations z^0 = 0, z^1, . . . , z^ℓ =: z. For these approximations we have a recursion of the form z^{i+1} = z^i + α_i p^i (cf. algorithm (3.6)) and the vectors Ŝ p^i = B(Q_A^{-1}(B^T p^i)) are computed. Hence the vectors z̄^i := B^T z^i, ẑ^i := Q_A^{-1} z̄^i can be computed by cheap recursions (AXPY operations). Using this approach in the PCG algorithm we can obtain z̄^ℓ = B^T z^ℓ = B^T z and ẑ^ℓ = Q_A^{-1} z̄^ℓ = Q_A^{-1} B^T z with negligible computational costs. Hence for the computation of B^T z and Q_A^{-1} B^T z in (3.24), (3.26) we do not need significant arithmetic work. Summarizing, per outer iteration of the algorithm (3.22)-(3.27) we need ℓ + 1 evaluations of Q_A^{-1}, ℓ evaluations of Q_S^{-1}, ℓ + 1 matrix-vector multiplications with B, ℓ matrix-vector multiplications with B^T and one matrix-vector multiplication with A. Clearly an important issue in the algorithm is the stopping criterion for the inner iteration. From the analysis presented in section 4 it follows that for an efficient method this stopping criterion should depend on the accuracy of the preconditioner Q_A. A very efficient Poisson solver is the multigrid method.
Thus for the preconditioner Q_A we propose the following:

    Q_A^{-1} w := one symmetric multigrid V-cycle iteration, using one
                  symmetric Gauss-Seidel iteration for pre- and post-smoothing,
                  with starting vector 0, applied to A x = w.                     (3.28)

The convergence factor per iteration of this method typically is between 0.1 and 0.2. From the analysis (cf. remark 4) we deduce the following two possibilities for the stopping criterion of the PCG method:
• Let r^i be the residual in the PCG algorithm. We stop the PCG iteration if

    ‖r^ℓ‖_2 ≤ σ_acc ‖r^0‖_2                                                       (3.29)

is satisfied, with a given σ_acc ∈ [0.2, 0.6].


• We use a fixed low number of iterations: ℓ ∈ [1, 4].
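The outer-inner structure (3.22)-(3.27) can be sketched compactly in NumPy. Here Q_A = 1.25 A is an artificial preconditioner that satisfies (4.1) (not the multigrid V-cycle of (3.28)), and Ψ consists of ℓ = 3 unpreconditioned CG steps, so both choices are illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 8, 3
Q0 = rng.standard_normal((n, n))
A = Q0 @ Q0.T + n * np.eye(n)            # SPD A-block
B = rng.standard_normal((m, n))          # full-rank B
QA_inv = np.linalg.inv(1.25 * A)         # A <= QA as in (4.1)
Shat = B @ QA_inv @ B.T                  # hat S = B QA^{-1} B^T, cf. (3.18)
f = rng.standard_normal(n)

def psi(rhs, ell=3):
    """ell CG steps for hat S z = rhs, starting vector 0 (QS = I), cf. (3.27)."""
    z, r = np.zeros(m), rhs.copy()
    p, rho = r.copy(), r @ r
    for _ in range(ell):
        if rho < 1e-28:
            break
        q = Shat @ p
        a = rho / (p @ q)
        z, r = z + a * p, r - a * q
        rho_new = r @ r
        p, rho = r + (rho_new / rho) * p, rho_new
    return z

x, y = np.zeros(n), np.zeros(m)
r1 = f - A @ x - B.T @ y
for k in range(30):                      # outer loop (3.22)-(3.26)
    w = x + QA_inv @ r1                  # (3.22)
    z = psi(B @ w)                       # (3.23)
    x_new = w - QA_inv @ (B.T @ z)       # (3.24)
    r1 = r1 - A @ (x_new - x) - B.T @ z  # (3.26)
    x, y = x_new, y + z                  # (3.25)

K = np.block([[A, B.T], [B, np.zeros((m, m))]])
res = np.linalg.norm(K @ np.concatenate([x, y]) - np.concatenate([f, np.zeros(m)]))
print(res)
```

With γ_A = 0.8 (so μ_A = 0.2) and a nearly exact inner solve, the contraction factor g(μ_A, δ) of theorem 4.3 is about 0.4, and the residual after 30 outer iterations is correspondingly tiny.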

4. Analysis of the inexact Uzawa algorithm. The inexact Uzawa method has been analyzed in [2, 29]. Here we present an alternative analysis. This analysis is much simpler than those presented in [2] and [29] and the result is different. In remark 7 we discuss the main differences with the results from [2, 29]. For simplicity we assume that for the preconditioner Q_A we have A ≤ Q_A, i.e.

    γ_A Q_A ≤ A ≤ Q_A .                                                           (4.1)

We make the scaling assumption A ≤ Q_A because, opposite to the scaling condition in (3.3), it is fulfilled for the multigrid preconditioner in (3.28). Moreover, it improves the presentation of the analysis. However, this scaling assumption is essential neither for the algorithm nor for the convergence analysis. In the analysis we use the following natural norms:

    ‖x‖_Q := ‖Q_A^{1/2} x‖ = ⟨Q_A x, x⟩^{1/2}   for x ∈ R^n ,
    ‖y‖_Ŝ := ‖Ŝ^{1/2} y‖ = ⟨Ŝ y, y⟩^{1/2}       for y ∈ e^{⊥_M} .

For the error in algorithm (3.22)-(3.26) we use the notation

    e^k = [ x ] − [ x^k ] =: [ e_1^k ]
          [ y ]   [ y^k ]    [ e_2^k ] .

Note that e_2^k ∈ e^{⊥_M} holds.

Lemma 4.1. For w as in (3.22) we have the bound

    ⟨Ŝ^{-1} B w, B w⟩^{1/2} ≤ (1 − γ_A) ‖e_1^k‖_Q + ‖e_2^k‖_Ŝ .

Proof. For y ∈ range(Ŝ) = range(B) define ‖y‖_{Ŝ^{-1}} := ⟨Ŝ^{-1} y, y⟩^{1/2}. Note that for the exact discrete solution x we have B x = 0. Using this and the definition in (3.22) we get

    B w = B x^k − B x + B Q_A^{-1} (A e_1^k + B^T e_2^k) = −B (I − Q_A^{-1} A) e_1^k + Ŝ e_2^k .

Hence,

    ⟨Ŝ^{-1} B w, B w⟩^{1/2} = ‖B w‖_{Ŝ^{-1}} ≤ ‖B (I − Q_A^{-1} A) e_1^k‖_{Ŝ^{-1}} + ‖Ŝ e_2^k‖_{Ŝ^{-1}}
        ≤ ‖Ŝ^{-1/2} B (I − Q_A^{-1} A) Q_A^{-1/2}‖ ‖Q_A^{1/2} e_1^k‖ + ‖Ŝ^{1/2} e_2^k‖ .

Now note that

    ‖Ŝ^{-1/2} B (I − Q_A^{-1} A) Q_A^{-1/2}‖ ≤ ‖Ŝ^{-1/2} B Q_A^{-1/2}‖ ‖Q_A^{1/2} (I − Q_A^{-1} A) Q_A^{-1/2}‖

and

    ‖Ŝ^{-1/2} B Q_A^{-1/2}‖^2 = ρ( Ŝ^{-1/2} B Q_A^{-1} B^T Ŝ^{-1/2} ) = ρ(I) = 1 ,
    ‖Q_A^{1/2} (I − Q_A^{-1} A) Q_A^{-1/2}‖ = ρ( I − Q_A^{-1/2} A Q_A^{-1/2} ) ≤ 1 − γ_A .

This completes the proof.

We now formulate the main convergence result.

Theorem 4.2. Consider the inexact Uzawa method (3.22)-(3.27) with Ψ such that (3.19) holds. For the error e^k = (e_1^k, e_2^k) we have the bounds

    ‖e_1^{k+1}‖_Q ≤ (1 − γ_A) ‖e_1^k‖_Q + ‖e_2^{k+1}‖_Ŝ ,                         (4.2)
    ‖e_2^{k+1}‖_Ŝ ≤ (1 − γ_A)(1 + δ) ‖e_1^k‖_Q + δ ‖e_2^k‖_Ŝ ,                    (4.3)

with γ_A from (4.1) and δ from (3.19).

Proof. For the error component e_1^{k+1} we have the relations

    e_1^{k+1} = x − x^{k+1} = x − w + Q_A^{-1} B^T z
              = x − x^k − Q_A^{-1} (A e_1^k + B^T e_2^k − B^T z)
              = (I − Q_A^{-1} A) e_1^k − Q_A^{-1} B^T (e_2^k − z)
              = (I − Q_A^{-1} A) e_1^k − Q_A^{-1} B^T e_2^{k+1} .

Hence,

    ‖e_1^{k+1}‖_Q ≤ ‖I − Q_A^{-1/2} A Q_A^{-1/2}‖ ‖e_1^k‖_Q + ‖Q_A^{-1/2} B^T Ŝ^{-1/2}‖ ‖e_2^{k+1}‖_Ŝ .

In combination with

    ‖I − Q_A^{-1/2} A Q_A^{-1/2}‖ ≤ 1 − γ_A ,    ‖Q_A^{-1/2} B^T Ŝ^{-1/2}‖ = 1 ,

this proves the inequality in (4.2). For the error component e_2^{k+1} we obtain

    e_2^{k+1} = y − y^{k+1} = e_2^k − z = (e_2^k − Ŝ^{-1} B w) + (Ŝ^{-1} B w − z) .   (4.4)

Using B x = 0 we get

    ‖e_2^k − Ŝ^{-1} B w‖_Ŝ = ‖e_2^k − Ŝ^{-1} B (x^k + Q_A^{-1} A e_1^k + Q_A^{-1} B^T e_2^k)‖_Ŝ
        = ‖e_2^k + Ŝ^{-1} B (I − Q_A^{-1} A) e_1^k − Ŝ^{-1} B Q_A^{-1} B^T e_2^k‖_Ŝ
        = ‖Ŝ^{-1} B (I − Q_A^{-1} A) e_1^k‖_Ŝ
        ≤ ‖Ŝ^{-1/2} B Q_A^{-1/2}‖ ‖I − Q_A^{-1/2} A Q_A^{-1/2}‖ ‖e_1^k‖_Q
        ≤ (1 − γ_A) ‖e_1^k‖_Q .                                                   (4.5)

Furthermore, using (3.19) and lemma 4.1 we also have

    ‖Ŝ^{-1} B w − z‖_Ŝ = ‖Ŝ^{-1} B w − Ψ(B w)‖_Ŝ ≤ δ ‖Ŝ^{-1} B w‖_Ŝ = δ ⟨Ŝ^{-1} B w, B w⟩^{1/2}
        ≤ δ ( (1 − γ_A) ‖e_1^k‖_Q + ‖e_2^k‖_Ŝ ) .                                 (4.6)
Combination of the results in (4.4), (4.5), (4.6) proves the inequality in (4.3). As a simple consequence of this theorem we obtain the following convergence result: Theorem 4.3. Define µA := 1 − γA ,

g(µA , δ) := 2µA + δ(1 + µA ) .

(4.7)

12

J. PETERS, V. REICHELT AND A. REUSKEN

Consider the inexact Uzawa method (3.22)-(3.27) with Ψ such that (3.19) holds. For the error ek = (ek1 , ek2 ) we have the bounds   kSˆ ≤ g(µA , δ) max kek1 kQ , kek2 kSˆ , (4.8) kQ , kek+1 max kek+1 2 1 p   1 g(µA , δ) + g(µA , δ)2 − 4µA δ k 0 (ke1 kQ + ke02 kSˆ ) . (4.9) kek1 kQ + kek2 kSˆ ≤ 3 2 2 Proof. Define the matrix C= Due to theorem 4.2 we obtain 



µA (2 + δ) δ µA (1 + δ) δ

kQ kek+1 1 kek+1 kSˆ 2





kek1 kQ ≤C kek2 kSˆ 

.



.

¿From kCk∞ = g(µA , δ) we obtain the result in (4.8). A simple (MAPLE) computation yields the eigenvector decomposition C = VDV−1 with a diagonal matrix D. From this we get  p 1 g(µA , δ) + g(µA , δ)2 − 4µA δ 2 1 max kVk1 kV−1 k1 ≤ 3 0 0, C1 independent of h and ξ, such that ˜ −1 yi ˜ −1 yi ≤ C1 hy, Q ˜ −1 y, Q ˜ −1 yi ≤ hSQ C0 hy, Q S S S S

for all y ∈ e⊥

(5.7)

Note that many pairs of finite element spaces satisfy the assumptions in this theorem. In our experiments we use such a pair, namely the Hood-Taylor P2 − P1 pair. Although the Laplace problem for the pressure that has to be solved in each applica˜ −1 is much smaller than the problems involving A or QA , often it is inefficient tion of Q S to solve it exactly. Let QT be a preconditioner for Th that is spectrally equivalent to ˜ −1 in (5.6) one can then use Th , uniformly in h. Instead of Q S ( M−1 + ξQ−1 if ξ ≤ h−2 , T , Q−1 (5.8) S := −2 ξh2 M−1 + ξQ−1 ≤ξ T , if h
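The two facts used in the proof of theorem 4.3 — that ‖C‖_∞ equals the contraction factor g(μ_A, δ) of (4.8) and that the largest eigenvalue of C equals the rate in (4.9) — are easy to confirm numerically for sample values of μ_A and δ:

```python
import numpy as np

def g(mu, delta):
    return 2 * mu + delta * (1 + mu)

for mu, delta in [(0.2, 0.5), (0.1, 0.6), (0.05, 0.3)]:
    C = np.array([[mu * (2 + delta), delta],
                  [mu * (1 + delta), delta]])
    # row-sum (infinity) norm equals the contraction factor g(mu, delta) in (4.8)
    assert np.isclose(np.abs(C).sum(axis=1).max(), g(mu, delta))
    # the largest eigenvalue of C equals the rate appearing in (4.9)
    lam = np.linalg.eigvals(C).real.max()
    rate = 0.5 * (g(mu, delta) + np.sqrt(g(mu, delta) ** 2 - 4 * mu * delta))
    assert np.isclose(lam, rate)
    print(mu, delta, g(mu, delta), rate)
```

The eigenvalue formula follows from trace(C) = g(μ_A, δ) and det(C) = μ_A δ, which the check above verifies for concrete parameter values.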


In [4] (remark 4.1) it is shown that for this preconditioner a result as in (5.7) holds with constants C_0 > 0 and C_1 independent of h and ξ. Hence, the preconditioner Q_S as in (5.8) is spectrally equivalent to S (on e^{⊥_M}) uniformly in h and ξ. In the numerical experiments in section 6 for Q_T we use a symmetric multigrid V-cycle method.

6. Numerical experiments. In this section we present a comparison of the three iterative solvers discussed above. These solvers are denoted by BPCG (algorithm (3.9)), PMINRES (section 3.2) and MGUZAWA (algorithm (3.22)-(3.27)). We consider a problem as in (1.1) on the unit cube Ω = (0, 1)^3. For the discretization we start with a uniform tetrahedral grid with h = 1/2 and we apply regular refinements to this starting triangulation. For the finite element discretization we used the LBB stable Hood-Taylor P2-P1 pair, i.e. continuous piecewise quadratics for the velocity and continuous piecewise linears for the pressure. We performed computations for the cases h = 1/16, h = 1/32. Note that for h = 1/32 we have approximately 7.5 · 10^5 velocity unknowns and 3.3 · 10^4 pressure unknowns (n ≈ 7.5 · 10^5, m ≈ 3.3 · 10^4). We consider the linear system as in (2.4) with solution (x, y) = 0. We take a fixed arbitrary starting vector (x^0, y^0), with y^0 ∈ e^{⊥_M}. The iteration is stopped after a reduction of the Euclidean norm of the starting residual by a factor 10^6, i.e., ‖r^k‖ ≤ 10^{-6} ‖r^0‖, with r^k = (f, 0)^T − K (x^k, y^k)^T.

Experiments for the case ξ = 0. We apply the BPCG, PMINRES and MGUZAWA methods described above. For the preconditioner Q_A we use one multigrid V-cycle iteration as in (3.28). For the preconditioner Q_S for the Schur complement we use one multigrid V-cycle iteration as in (3.28), but now applied to the mass matrix M. For the BPCG method we need a suitable scaling of Q_A. To determine the scaling parameter, we compute λ̃ ≈ λ_max(I − Q_MG^{-1} A) by a few power iterations and use Q_A := (1 − α λ̃) Q_MG, α ∈ [1, λ̃^{-1}), as scaled preconditioner. The approximation λ̃ satisfies λ̃ ≤ λ_max(I − Q_MG^{-1} A), hence we take α ≥ 1. In our experiments we use α = 1.1. Although not negligible, the arithmetic costs for determining this scaling parameter are not taken into account in the tables presented below. The PMINRES method is parameter free. In the MGUZAWA method we have to choose a stopping criterion for the inner iteration. We use the one in (3.29) with σ_acc = 0.5. For all three methods the costs are dominated by the evaluations of Q_A^{-1}. Therefore to measure the arithmetic costs we count the number of Q_A^{-1} evaluations, which is denoted by #Q_MG. Results are presented in table 6.1.
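The power iteration used for the scaling parameter can be sketched as follows; Q_MG here is an arbitrary SPD stand-in for the multigrid preconditioner, constructed so that the exact value of λ_max(I − Q_MG^{-1} A) is known:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 20
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = V @ np.diag(np.arange(1.0, n + 1)) @ V.T      # SPD with known spectrum 1..n
QMG = A + 0.4 * np.eye(n)                          # SPD stand-in with A <= QMG
QMG_inv = np.linalg.inv(QMG)

# power iteration for the dominant eigenvalue of E = I - QMG^{-1} A;
# for this construction the exact value is 0.4 / (1 + 0.4)
x = rng.standard_normal(n)
for _ in range(100):
    x = x - QMG_inv @ (A @ x)                      # x <- E x
    x /= np.linalg.norm(x)
lam_tilde = x @ (x - QMG_inv @ (A @ x))            # Rayleigh quotient estimate
print(lam_tilde)
```

The iterate converges geometrically with ratio λ_2/λ_1, so in practice only a few iterations are needed for the rough estimate λ̃ that the scaling requires.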

                h = 1/16   h = 1/32
    BPCG           29         29
    PMINRES        49         49
    MGUZAWA        33         30

Table 6.1: #Q_MG for the three methods.

In figure 6.1 we give the convergence behaviour of krk k for the case h = 1/32. Note that in the MGUZAWA method in each iteration we have an inner iteration. In table 6.2 we give the number of inner iterations per outer iteration.


Fig. 6.1. Convergence behaviour of ‖r^k‖ for the case h = 1/32: log_10(‖r^k‖) versus iteration number k for PMINRES, MGUZAWA and BPCG.

    k            1  2  3  4  5  6  7  8  9  10  11  12
    # inner it.  1  1  1  2  1  2  1  2  1   1   3   2

Table 6.2: Number of inner iterations in MGUZAWA, h = 1/32.

The behaviour of the BPCG method depends on the scaling parameter α in (1 − α λ̃) Q_MG. In table 6.3 we show the dependence of #Q_MG on this scaling parameter for λ̃ = λ_max(I − Q_MG^{-1} A). The convergence of the MGUZAWA method depends on σ_acc. In table 6.4 we show the dependence of #Q_MG on this parameter.

    α        0.7   0.8   0.9   1.0   1.1   1.2   1.3   1.5   2.0
    #Q_MG    div   div    28    28    29    30    31    32    36

Table 6.3: Dependence of BPCG method on scaling parameter; h = 1/32.

    σ_acc    0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9
    #Q_MG     38    33    34    30    30    29    30    30    52

Table 6.4: Dependence of MGUZAWA method on σ_acc; h = 1/32.


According to remarks 1, 2, 5 the bounds for the convergence rates of BPCG and PMINRES depend on the scaling parameter ρ if we use ρ Q_S (ρ > 0) as a preconditioner for the Schur complement. The convergence rate of MGUZAWA does not depend on the scaling of Q_S. Table 6.5 shows #Q_MG for different values of ρ.

    ρ          10^{-4}   10^{-2}   1.0   10^2   10^4
    BPCG         110       58       29     44     45
    PMINRES      135       91       49     30     40
    MGUZAWA       30       30       30     30     30

Table 6.5: Dependence of #Q_MG on the scaling of Q_S; h = 1/32.

Experiments for the case ξ > 0. We again apply the three methods described above. For the preconditioner QA we use the same multigrid V-cycle iteration as for the case ξ = 0. For the preconditioner for the Schur complement we use the one in (5.8). For QT we apply one multigrid V-cycle iteration with one symmetric GaussSeidel pre- and post-smoothing iteration. For the BPCG method we take the same scaling of QA as we used for the case ξ = 0. Note that this may not be the optimal scaling if ξ > 0. In the MGUZAWA method we take σacc = 0.6. In the tables 6.6 and 6.7 we present results for ξ = h−1 and ξ = h−2 .

           BPCG   PMINRES   MGUZAWA
h = 1/16   29     48        26
h = 1/32   28     48        29

Table 6.6. #Q_MG for the three methods; ξ = h^{-1}.

           BPCG   PMINRES   MGUZAWA
h = 1/16   26     44        27
h = 1/32   24     41        25

Table 6.7. #Q_MG for the three methods; ξ = h^{-2}.

7. Concluding remarks. In this paper we considered three fast iterative methods for discretized Stokes equations. The BPCG method is a preconditioned CG method applied to a transformed system and equipped with a special scalar product. The PMINRES method is a standard preconditioned minimal residual method. The MGUZAWA method results from an approximation of an exact block factorization of the matrix K (cf. (3.20), (3.21)). The MGUZAWA algorithm has the structure of an outer-inner iteration; thus for this method a stopping criterion (tolerance parameter) for the inner iteration is needed. The PMINRES method is parameter free. In the BPCG method a scaling parameter for the preconditioner Q_A has to be determined. If for Q_A we use a multigrid solver, then for all three methods the costs per iteration are dominated by the evaluations of Q_A^{-1}. In the BPCG and the PMINRES method one needs one Q_A^{-1} evaluation per iteration. In the MGUZAWA method ℓ + 1 Q_A^{-1} evaluations per iteration are needed (ℓ: number of inner iterations). For all three methods convergence analyses are available in which bounds for the rate of convergence are derived that depend only on the constants in the spectral equivalences γ_A Q_A ≤ A ≤ Γ_A Q_A, γ_S Q_S ≤ S ≤ Γ_S Q_S. For the MGUZAWA method we presented a convergence analysis which shows that if one uses a multigrid preconditioner for A, then a very low accuracy in the inner iteration already suffices for a convergent and efficient method. In this convergence analysis we use a natural norm.

In the numerical experiments for the stationary case, all three methods have a rate of convergence independent of h. In our experiments the BPCG method and MGUZAWA have almost the same efficiency. The PMINRES method is less efficient. The latter method, however, is the most robust one: it is parameter free and converges even without preconditioning. From Table 6.3 we conclude that the performance of the BPCG method depends only mildly on α (provided α ≥ 1). Thus we need only a rough approximation λ̃ ≈ λmax(I − Q_A^{-1}A) to obtain an efficient solver. The results in Table 6.4 show that the MGUZAWA method is efficient for a broad range of σ_acc values. Note that the method converges even for large σ_acc values. This confirms the theory in Section 4. The rates of convergence of both BPCG and PMINRES depend on the scaling of Q_S, whereas for the MGUZAWA method this is not the case. For particular choices of this scaling the PMINRES method may be more efficient than the BPCG method (cf. Table 6.5). The experiments for the instationary case demonstrate the effectiveness of the preconditioner in (5.8). The rate of convergence is independent of h for ξ = h^{-1} and ξ = h^{-2}. If instead of the preconditioner in (5.8) one uses Q_S = M̄, then for ξ = h^{-2} all three solvers would need several hundred to a thousand iterations.
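The parameter-free character of PMINRES with a block-diagonal SPD preconditioner can be sketched as follows (assuming SciPy). The matrices here are random stand-ins, not the Hood-Taylor matrices of Section 6, and for illustration the "ideal" blocks Q_A = A and Q_S = B A^{-1} B^T are used; in practice these would be replaced by a multigrid cycle and a cheap Schur complement approximation.

```python
import numpy as np
from scipy.sparse.linalg import minres, LinearOperator

# Small SPD "stiffness" block A and (generically) full-row-rank block B.
n, m = 8, 3
rng = np.random.default_rng(1)
T = rng.standard_normal((n, n))
A = T @ T.T + n * np.eye(n)
B = rng.standard_normal((m, n))

# Symmetric indefinite saddle point matrix and right-hand side.
K = np.block([[A, B.T],
              [B, np.zeros((m, m))]])
rhs = np.concatenate([rng.standard_normal(n), np.zeros(m)])

# Block-diagonal SPD preconditioner diag(Q_A, Q_S) with the ideal choices
# Q_A = A and Q_S = B A^{-1} B^T, applied via exact solves (illustration only).
S = B @ np.linalg.solve(A, B.T)
def apply_prec(v):
    return np.concatenate([np.linalg.solve(A, v[:n]),
                           np.linalg.solve(S, v[n:])])
M = LinearOperator((n + m, n + m), matvec=apply_prec)

x, info = minres(K, rhs, M=M, maxiter=200)
residual = np.linalg.norm(K @ x - rhs) / np.linalg.norm(rhs)
```

No tuning parameter enters the iteration itself, which is the robustness property noted above; only the quality of Q_A and Q_S influences the iteration count.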

REFERENCES

[1] Arrow, K., Hurwicz, L., and Uzawa, H.: Studies in Nonlinear Programming, Stanford University Press, Stanford, CA, 1958.
[2] Bank, R.E., Welfert, B.D., and Yserentant, H.: A class of iterative methods for solving saddle point problems. Numer. Math., 56, 1990, 645–666.
[3] Bramble, J.H., and Pasciak, J.E.: A preconditioning technique for indefinite systems resulting from mixed approximations of elliptic problems. Math. Comp., 50, 1988, 1–17.
[4] Bramble, J.H., and Pasciak, J.E.: Iterative techniques for time dependent Stokes problems. Comput. Math. Appl., 33, 1997, 13–30.
[5] Bramble, J.H., Pasciak, J.E., and Vassilev, A.T.: Analysis of the inexact Uzawa algorithm for saddle point problems. SIAM J. Numer. Anal., 34, 1997, 1072–1092.
[6] Bramble, J.H.: Multigrid Methods, Pitman Research Notes in Mathematics, Vol. 294, John Wiley, New York, 1993.
[7] Brezzi, F., and Fortin, M.: Mixed and Hybrid Finite Element Methods, Springer, New York, 1991.
[8] Cahouet, J., and Chabard, J.P.: Some fast 3D finite element solvers for the generalized Stokes problem. Int. J. Numer. Methods Fluids, 8, 1988, 869–895.
[9] Ciarlet, P.G.: The Finite Element Method for Elliptic Problems, North-Holland, Amsterdam, 1978.
[10] Ciarlet, P.G.: Basic error estimates for elliptic problems. In: Handbook of Numerical Analysis, Vol. II, P.G. Ciarlet and J.L. Lions, eds., Elsevier Science, 1991, 17–351.
[11] Elman, H.C.: Multigrid and Krylov subspace methods for the discrete Stokes equations. Int. J. Numer. Methods Fluids, 22, 1996, 755–770.
[12] Elman, H.C., and Golub, G.H.: Inexact and preconditioned Uzawa algorithms for saddle point problems. SIAM J. Numer. Anal., 31, 1994, 1645–1661.
[13] Girault, V., and Raviart, P.-A.: Finite Element Methods for Navier-Stokes Equations, Springer, Berlin, 1986.


[14] Greenbaum, A.: Iterative Methods for Solving Linear Systems, Frontiers in Applied Mathematics, Vol. 17, SIAM, Philadelphia, 1997.
[15] Hackbusch, W.: Multi-grid Methods and Applications, Springer, Berlin, Heidelberg, 1985.
[16] Hackbusch, W.: Iterative Solution of Large Sparse Systems of Equations, Springer, Berlin, Heidelberg, 1994.
[17] Kobelkov, G.M., and Olshanskii, M.A.: Effective preconditioning of Uzawa type schemes for a generalized Stokes problem. Numer. Math., 86, 2000, 443–470.
[18] Olshanskii, M.A., and Reusken, A.: On the convergence of a multigrid algorithm for linear reaction-diffusion problems. Computing, 65, 2000, 193–202.
[19] Paige, C.C., and Saunders, M.A.: Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal., 11, 1974, 197–209.
[20] Powell, C., and Silvester, D.: Black-box preconditioning for mixed formulation of self-adjoint elliptic PDEs. In: Challenges in Scientific Computing - CISC 2002, ed. E. Bänsch, Lecture Notes in Computational Science and Engineering, Vol. 35, 2003, 268–285.
[21] Rusten, T., and Winther, R.: A preconditioned iterative method for saddle point problems. SIAM J. Matrix Anal. Appl., 13, 1992, 887–904.
[22] Silvester, D., Elman, H., Kay, D., and Wathen, A.: Efficient preconditioning of the linearized Navier-Stokes equations for incompressible flow. J. Comp. Appl. Math., 128, 2001, 261–279.
[23] Silvester, D., and Wathen, A.: Fast iterative solution of stabilised Stokes systems. Part II: Using general block preconditioners. SIAM J. Numer. Anal., 31, 1994, 1352–1367.
[24] Verfürth, R.: A combined conjugate gradient-multigrid algorithm for the numerical solution of the Stokes problem. IMA J. Numer. Anal., 4, 1984, 441–455.
[25] Wathen, A.: Realistic eigenvalue bounds for the Galerkin mass matrix. IMA J. Numer. Anal., 7, 1987, 449–457.
[26] Wathen, A., and Silvester, D.: Fast iterative solution of stabilised Stokes systems. Part I: Using simple diagonal preconditioners. SIAM J. Numer. Anal., 30, 1993, 630–649.
[27] Wesseling, P.: An Introduction to Multigrid Methods, Wiley, Chichester, 1992.
[28] Wittum, G.: Multigrid methods for Stokes and Navier-Stokes equations. Transforming smoothers: algorithms and numerical results. Numer. Math., 54, 1989, 543–563.
[29] Zulehner, W.: Analysis of iterative methods for saddle point problems: A unified approach. Math. Comp., 71, 2001, 479–505.
