
COMMUNICATIONS IN NUMERICAL METHODS IN ENGINEERING Commun. Numer. Meth. Engng 2002; 18:645–657 (DOI: 10.1002/cnm.524)

A null space algorithm for mixed finite-element approximations of Darcy's equation

Mario Arioli 1,∗,† and Gianmarco Manzini 2

1 Rutherford Appleton Laboratory, Chilton, Oxfordshire, OX11 0QX, U.K.
2 IAN - CNR, via Ferrata 1, 27100 Pavia, Italy

SUMMARY

A null space algorithm is considered to solve the augmented system produced by the mixed finite-element approximation of Darcy's law. The method is based on the combination of an orthogonal factorization technique for sparse matrices with an iterative Krylov solver. The computational efficiency of the method relies on a suitable stopping criterion for the iterative solver. We experimentally investigate its performance on a realistic set of selected application problems. Copyright © 2002 John Wiley & Sons, Ltd.

KEY WORDS: augmented systems; sparse matrices; mixed finite elements

1. INTRODUCTION

The mixed finite-element approximation of Darcy's law produces a finite-dimensional problem, which is defined by an augmented system. To solve it, we present in this paper a null space method which combines a direct Householder factorization solver with a conjugate gradient iterative solver. The properties of this method have been studied in Reference [1], where its backward stability in finite-precision arithmetic is proved, and where a review of the bibliography on the topic is also presented. The efficiency of this kind of approach is strongly affected by the criterion used to stop the iterative solver. The one adopted in this paper is based on an estimate of the approximation error in the energy norm of the problem; see Reference [2] for a more detailed theoretical presentation. We remark that this method can be applied to any diffusion equation. For the sake of simplicity, we will focus on Darcy's law, which is a significant example among saddle point problems. The performance of the final algorithm is experimentally investigated on a representative set of problems.

∗ Correspondence to: M. Arioli, Rutherford Appleton Laboratory, Chilton, Didcot, Oxon OX11 0QX, U.K.
† E-mail: [email protected]

Received 17 January 2001 Accepted 18 December 2001


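In the presentation of the algorithm, we will denote by $E_1$ and $E_2$ the $n \times m$ ($m \le n$) matrix

$$E_1 = \begin{bmatrix} I_m \\ 0_{n-m,m} \end{bmatrix} \tag{1}$$

and the $n \times (n-m)$ matrix

$$E_2 = \begin{bmatrix} 0_{m,n-m} \\ I_{n-m} \end{bmatrix} \tag{2}$$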

2. PROBLEM FORMULATION

Let $\Omega$ be a simply connected, bounded, polygonal domain in $\mathbb{R}^2$, defined by a closed curve $\Gamma$. $\Gamma$ is usually the union of two parts $\Gamma_D$ and $\Gamma_N$, where different Dirichlet- and Neumann-type boundary conditions are imposed, that is $\Gamma = \Gamma_D \cup \Gamma_N$. Darcy's law can be formulated as follows:

$$
\begin{aligned}
u(x) &= -K(x)\,\mathrm{grad}\,p(x), & x &\in \Omega \\
\mathrm{div}\,u(x) &= f(x), & x &\in \Omega
\end{aligned} \tag{3}
$$

with a set of boundary conditions for both $u$ and $p$

$$
\begin{aligned}
p(x) &= g_D(x), & x &\in \Gamma_D \\
u \cdot n &= g_N(x), & x &\in \Gamma_N
\end{aligned} \tag{4}
$$

using two regular functions $g_D$ and $g_N$ for Dirichlet and Neumann conditions, and where $n$ denotes the external normal to $\Gamma$. In the following, we will assume that $g_N = 0$.

Darcy's law describes the relationship between the pressure $p(x)$ (the total head) and the velocity field $u(x)$ (the visible effect) in ground-water flow. In system (3), $K(x)$ is the hydraulic conductivity tensor and $f(x)$ is a source–sink term. The former equation in (3) relates the vector field $u$ to the scalar field $p$ via the permeability tensor $K$, which accounts for the soil characteristics. The latter equation in (3) relates the divergence of $u$ to the source–sink term $f(x)$.

Let $T_h$ be a family of triangulations of $\Omega$, i.e. each $T_h$ is a set of disjoint triangles which cover $\Omega$ in such a way that no vertex of any triangle lies in the interior of an edge of another triangle. Let $h = \max\{\mathrm{diam}(\tau) : \tau \in T_h\}$. We assume that $T_h$ is regular in the sense of Ciarlet, see Reference [3], i.e. triangles do not degenerate as $h \to 0$. Moreover, we assume that no triangle has one vertex on $\Gamma_D$ and another vertex on $\Gamma_N$, and that no triangle has more than one edge lying on $\Gamma$.

The mixed finite-element approximation of system (3) with the usual Raviart–Thomas space (we refer to Reference [4] for a detailed analysis) leads to the solution of the following system of linear equations:

$$
\begin{bmatrix} M & A \\ A^T & 0 \end{bmatrix}
\begin{bmatrix} u \\ p \end{bmatrix}
=
\begin{bmatrix} q \\ b \end{bmatrix} \tag{5}
$$


where, denoting by $n$ the number of edges and by $m$ the number of triangles, we have that $M \in \mathbb{R}^{n \times n}$ is a symmetric and positive definite matrix, and that $A \in \mathbb{R}^{n \times m}$ is a submatrix of a totally unimodular matrix with $m + 1$ columns. Therefore, $A$ is full rank and its entries are equal to either 1, −1, or 0. The augmented system (5) is non-singular because $\mathrm{Ker}(A^T) \cap \mathrm{Ker}(M) = \{0\}$.

3. NUMERICAL ALGORITHM

In this section, we illustrate the classical null space algorithm, which is described in Reference [5], for the minimization of linearly constrained quadratic forms. Let $Y \in \mathbb{R}^{n \times m}$ and $Z \in \mathbb{R}^{n \times (n-m)}$ be two matrices such that

$$Y^T A = I_m \quad \text{and} \quad Z^T A = 0_{n-m,m} \tag{6}$$

The algorithm can be formulated as follows:

Null Space Algorithm:

$$
\begin{array}{ll}
1: & u_0 = Y b \\
2: & Z^T M Z w = Z^T q - Z^T M u_0 = s \\
3: & u = u_0 + Z w \\
4: & p = Y^T q - Y^T M u
\end{array} \tag{7}
$$
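A short consistency check may help here; it is a sketch derived only from (5), (6), and the four steps of the algorithm, under the additional assumption that $Z$ has full column rank so that the $n \times n$ matrix $[\,Y \; Z\,]$ is non-singular. It shows that the computed $u$ and $p$ satisfy both block rows of (5).

```latex
\begin{aligned}
A^T u &= A^T Y b + A^T Z w = I_m b + 0 = b
  && \text{(transposes of (6), Steps 1 and 3)} \\
Z^T (M u + A p - q) &= Z^T M Z w - (Z^T q - Z^T M u_0) + (Z^T A)\,p = 0
  && \text{(Step 2 and } Z^T A = 0\text{)} \\
Y^T (M u + A p - q) &= Y^T M u + (Y^T A)\,p - Y^T q = p - (Y^T q - Y^T M u) = 0
  && \text{(Step 4 and } Y^T A = I_m\text{)}
\end{aligned}
```

Because $[\,Y \; Z\,]^T$ is non-singular, the last two lines together give $M u + A p = q$, which is the first block row of (5).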

The matrices $Y$ and $Z$ can be computed using either an orthogonal factorization or a Gaussian factorization of the matrix $A$. In Reference [6], it is shown that, by a sparse version of the Householder algorithm, the matrix $A$ can be factored as follows:

$$A = H \begin{bmatrix} R \\ 0 \end{bmatrix} \tag{8}$$

where $R$ is an $m \times m$ sparse, non-singular, upper triangular matrix and $H$ is an $n \times n$ orthogonal matrix. The matrix $H$ can be stored implicitly as the set of sparse vectors that generate the elementary Householder transformations; see Reference [6] for details on the sparsity of $R$ and on the sparse storage of $H$. With this choice, we have

$$Y = H E_1 R^{-T} \quad \text{and} \quad Z = H E_2 \tag{9}$$

The columns of $Z$ form an orthonormal basis of the kernel of $A^T$. It is very important to observe that we never need to explicitly compute either $Y$ or $Z$. Indeed, the sparse result of the Householder factorization can be used to compute implicitly all the matrix–vector products required by the algorithm. We can perform the product of the projected Hessian matrix $Z^T M Z$ by a vector and the product of $Y$ or $Y^T$ by a vector in the following ways:

$$
\begin{aligned}
Z^T M Z w &= E_2^T (H^T (M (H (E_2 w)))) \\
Y^T y &= R^{-1} (E_1^T (H^T y)) \\
Y x &= H (E_1 (R^{-T} x))
\end{aligned} \tag{10}
$$
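The four steps of the null space algorithm and the implicit products in (10) can be sketched on a small dense example. This is only an illustration under stated assumptions: it uses NumPy's dense complete QR factorization in place of the sparse Householder code MA49, the function name `null_space_solve` and the random test data are hypothetical, and Step 2 is solved by forming $Z^T M Z$ explicitly because the example is tiny.

```python
import numpy as np

def null_space_solve(M, A, q, b):
    """Solve [[M, A], [A.T, 0]] [u; p] = [q; b] by the null space method."""
    n, m = A.shape
    # Complete QR: A = H @ [[R], [0]], with H n x n orthogonal and R m x m.
    H, R_full = np.linalg.qr(A, mode="complete")
    R = R_full[:m, :]

    # Y x = H (E1 (R^{-T} x)): pad R^{-T} x with n - m zeros, then apply H.
    def Y_mul(x):
        return H @ np.concatenate([np.linalg.solve(R.T, x), np.zeros(n - m)])

    # Y^T y = R^{-1} (E1^T (H^T y)): keep the first m entries of H^T y.
    def Yt_mul(y):
        return np.linalg.solve(R, (H.T @ y)[:m])

    # Z w = H (E2 w) and Z^T y = E2^T (H^T y).
    def Z_mul(w):
        return H @ np.concatenate([np.zeros(m), w])

    def Zt_mul(y):
        return (H.T @ y)[m:]

    u0 = Y_mul(b)                          # Step 1
    s = Zt_mul(q - M @ u0)                 # right-hand side of Step 2
    Z = H[:, m:]                           # explicit Z, only for this tiny case
    w = np.linalg.solve(Z.T @ M @ Z, s)    # Step 2 (dense solve)
    u = u0 + Z_mul(w)                      # Step 3
    p = Yt_mul(q - M @ u)                  # Step 4
    return u, p

# Hypothetical test data: random SPD M and random full-rank A.
rng = np.random.default_rng(0)
n, m = 8, 3
B = rng.standard_normal((n, n))
M = B @ B.T + n * np.eye(n)
A = rng.standard_normal((n, m))
q = rng.standard_normal(n)
b = rng.standard_normal(m)

u, p = null_space_solve(M, A, q, b)

# Reference solution from a direct solve of the full augmented system.
K = np.block([[M, A], [A.T, np.zeros((m, m))]])
ref = np.linalg.solve(K, np.concatenate([q, b]))
print(np.allclose(np.concatenate([u, p]), ref))
```

In a sparse setting one would keep `H` in factored Householder form and replace the dense solve in Step 2 by an iterative method, as the text goes on to discuss.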


This approach has the advantage of performing backward and forward substitutions for sparse triangular matrices, and of using the sparse Householder elementary matrices to perform matrix–vector products.

In Step 2 of the null space algorithm, we also need to solve a system involving $Z^T M Z$, the projected Hessian matrix. We have two alternative ways to proceed. If $n - m$ is small (the number of constraints is very close to the number of unknowns), or the projected Hessian matrix is still sparse, we can explicitly compute $Z^T M Z$, and then solve the system $Z^T M Z w = s$ using the Cholesky factorization. Otherwise, when the product $Z^T M Z$ cannot be formed explicitly, either because the complexity would be too high, $O(n^3)$, or because the resulting matrix would be fairly dense despite the sparsity of $M$, we can solve the linear system $Z^T M Z w = s$ using a conjugate gradient algorithm, computing the matrix-by-vector products implicitly. The mixed finite-element approximation of (3), (4) produces augmented systems where we need to use this second approach.

4. STOPPING CRITERION

If we use the conjugate gradient method, it is quite natural to have a stopping criterion which takes advantage of the minimization property of this method. At each step $j$ the conjugate gradient minimizes the energy norm of the error $\delta w = w - w^{(j)}$ on a Krylov space. The space $\mathbb{R}^{n-m}$ with the norm

$$\|y\|_{Z^T M Z} = (y^T Z^T M Z y)^{1/2} \tag{11}$$

induces on its dual space the dual norm

$$\|f\|_{(Z^T M Z)^{-1}} = (f^T (Z^T M Z)^{-1} f)^{1/2} \tag{12}$$

A stopping criterion such as

$$\text{if} \quad \|Z^T M Z w^{(j)} - s\|_{(Z^T M Z)^{-1}} \le \eta \, \|s\|_{(Z^T M Z)^{-1}} \quad \text{then STOP} \tag{13}$$

will guarantee that the computed solution $w^{(j)}$ satisfies the perturbed linear system

$$
\begin{aligned}
Z^T M Z w^{(j)} &= s + f \\
\|f\|_{(Z^T M Z)^{-1}} &\le \eta \, \|s\|_{(Z^T M Z)^{-1}}
\end{aligned} \tag{14}
$$
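The link between the dual norm of the residual and the energy norm of the error can be made explicit with a short worked derivation. It is a sketch that uses only the definitions (11) and (12), the exact solution $w$ of $Z^T M Z w = s$, and the symbol $\eta$ assumed here for the tolerance in the stopping test; rounding errors are ignored.

```latex
\begin{aligned}
f &= Z^T M Z w^{(j)} - s = -Z^T M Z \,(w - w^{(j)}) = -Z^T M Z \,\delta w, \\
\|f\|_{(Z^T M Z)^{-1}}^2
  &= \delta w^T (Z^T M Z)(Z^T M Z)^{-1}(Z^T M Z)\,\delta w
   = \|\delta w\|_{Z^T M Z}^2, \\
\|s\|_{(Z^T M Z)^{-1}}^2
  &= w^T (Z^T M Z)(Z^T M Z)^{-1}(Z^T M Z)\, w
   = \|w\|_{Z^T M Z}^2 .
\end{aligned}
```

Hence the stopping test is equivalent to $\|\delta w\|_{Z^T M Z} \le \eta \, \|w\|_{Z^T M Z}$, that is, to a bound on the relative error in the energy norm, which is the quantity that the conjugate gradient method minimizes at each step.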

The choice of $\eta$ will depend on the properties of the problem that we want to solve, and, in the practical cases, $\eta \gg \varepsilon$, where $\varepsilon$ is the rounding unit. Therefore, it is appropriate to analyse the influence of the perturbations on the error between the exact solutions $u$ and $p$ and the computed $u^*$ and $p^*$, neglecting the part depending on $\varepsilon$. Using the results of References [1, 7], we can prove that

$$
\begin{aligned}
\|u - u^*\|_M &\le \eta \, \|u - u_0\|_M \\
\|p - p^*\|_{A^T M^{-1} A} &\le \eta \, \rho \, \|u - u_0\|_M
\end{aligned} \tag{15}
$$

where $\rho$ is the spectral radius of $Z^T M Z \, E_2^T M^{-1} E_2$.

Furthermore, we need to add some tool within the conjugate gradient algorithm for estimating the values in (13). The estimate $\xi_j$ of $\|Z^T M Z w^{(j)} - s\|_{(Z^T M Z)^{-1}}$ can be computed using a Gauss quadrature rule as proposed in Reference [8] and tested in References [2, 9]. At step $j$ of the conjugate gradient, $\xi_j$ estimates the true value of the error at step $j - d$, where $d$ is an a priori selected integer value. Finally, we must estimate $\|s\|_{(Z^T M Z)^{-1}}$. Taking into account that $\|s\|_{(Z^T M Z)^{-1}} = \|w\|_{Z^T M Z}$, we could replace $\|s\|_{(Z^T M Z)^{-1}}$ with $\|w^{(j)}\|_{Z^T M Z}$ at step $j$ of the conjugate gradient if the current estimate $\xi_j$ is less than or equal to $\|s\|_2$. Therefore, we can only use (13) after an additional check:

$$
\begin{array}{l}
\text{if } \; \xi_j \le \|s\|_2 \; \text{ then} \\
\quad \text{if } \; \xi_j \le \eta \, \|w^{(j)}\|_{Z^T M Z} \; \text{ then STOP} \\
\text{endif}
\end{array} \tag{16}
$$

Moreover, using (16) we can avoid too many additional matrix–vector products, and the additional floating-point operations needed are negligible in comparison with the total number of floating-point operations performed by the conjugate gradient.
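The two-level test (16) can be illustrated with a small conjugate gradient loop. This is a minimal sketch under stated assumptions rather than the authors' implementation: instead of the Gauss quadrature code of Reference [8], it uses the simple delayed lower bound obtained from the identity $\|w - w^{(k)}\|_A^2 - \|w - w^{(k+1)}\|_A^2 = \alpha_k \|r_k\|_2^2$, summed over the last $d$ steps. The function name `cg_energy_stop`, the symbol names `eta` and `xi`, and the diagonal test matrix are all hypothetical.

```python
import numpy as np

def cg_energy_stop(matvec, s, eta, d, max_iter=1000):
    """CG for A w = s (A symmetric positive definite, w0 = 0).

    Stops with the two-level test (16), using a delayed lower bound
    xi_j of the energy norm of the error at step j - d.
    """
    w = np.zeros_like(s)
    r = s.copy()
    p = r.copy()
    rho = r @ r
    s_norm2 = np.linalg.norm(s)
    terms = []                        # alpha_i * ||r_i||^2, one per step
    for j in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rho / (p @ Ap)
        terms.append(alpha * rho)
        w = w + alpha * p
        r = r - alpha * Ap
        rho_new = r @ r
        if j >= d:
            # xi_j^2 = sum of the last d terms <= ||w - w^(j-d)||_A^2
            xi = np.sqrt(sum(terms[-d:]))
            if xi <= s_norm2:                         # cheap first check
                w_energy = np.sqrt(w @ matvec(w))     # one extra product
                if xi <= eta * w_energy:
                    return w, j
        if rho_new == 0.0:            # exact convergence, nothing left to do
            return w, j
        p = r + (rho_new / rho) * p
        rho = rho_new
    return w, max_iter

# Hypothetical well-conditioned test problem (condition number 10).
rng = np.random.default_rng(1)
diag = np.linspace(1.0, 10.0, 50)
A = np.diag(diag)
s = rng.standard_normal(50)
eta = 1e-3

w, n_iter = cg_energy_stop(lambda v: A @ v, s, eta=eta, d=5)

# True relative error in the energy norm, available here because A is diagonal.
w_exact = s / diag
err = w_exact - w
rel_energy_err = np.sqrt(err @ A @ err) / np.sqrt(w_exact @ A @ w_exact)
print(n_iter, rel_energy_err)
```

The extra product `matvec(w)` is only evaluated once the cheap first check has passed, which mirrors the remark that (16) avoids too many additional matrix–vector products.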

5. NUMERICAL EXPERIMENTS

All our experiments were performed on a SUN workstation using the sparse code MA49 of the HSL [10], described in Reference [6]. In MA49 the Householder factorization is implemented using a multifrontal approach for sparse matrices: in the symbolic part, the code computes the column ordering of $A$ following the minimum degree reordering of $A^T A$ in order to minimize the fill-in in the upper triangular matrix $R$, and then a row ordering that orders $A$ by its leading entries.

Figure 1. Sketch of the test problem.

Table I. Parameters of the runs.

          n       m       nnz(A)   nnz(M)   nnz(R)   stg(H)   h
Mesh 1    237     147     416      1119     940      3023     0.16
Mesh 2    2309    1497    4396     11291    16058    47120    5.8 × 10^-2
Mesh 3    22919   15136   45084    113735   250944   649793   1.92 × 10^-2

Table II. Stopping iteration and final residual (without scaling).

                                      K1 = 10^-2    K1 = 10^-4    K1 = 10^-6    K1 = 10^-8
Mesh 1    Niter                       21            25            29            31
d = 15    ||u - u*||_2 / ||u||_2      2.8 × 10^-3   2.8 × 10^-2   2.8 × 10^-2   3.5 × 10^-2
          ||p - p*||_2 / ||p||_2      5.1 × 10^-4   6.6 × 10^-3   3.4 × 10^-2   22.7
Mesh 2    Niter                       42            66            80            93
d = 25    ||u - u*||_2 / ||u||_2      9.0 × 10^-4   2.4 × 10^-3   5.1 × 10^-3   1.7 × 10^-2
          ||p - p*||_2 / ||p||_2      4.0 × 10^-4   5.2 × 10^-3   6.5 × 10^-1   2.08
Mesh 3    Niter                       59            96            119           143
d = 30    ||u - u*||_2 / ||u||_2      4.2 × 10^-4   1.2 × 10^-3   3.1 × 10^-3   7.8 × 10^-3
          ||p - p*||_2 / ||p||_2      1.8 × 10^-5   1.4 × 10^-3   2.0 × 10^-4   1.66

Table III. Stopping iteration and final residual (with scaling).

                                      K1 = 10^-2    K1 = 10^-4    K1 = 10^-6    K1 = 10^-8
Mesh 1    Niter                       6             6             6             6
d = 5     ||u - u*||_2 / ||u||_2      1.1 × 10^-5   4.0 × 10^-6   2.0 × 10^-6   1.9 × 10^-6
          ||p - p*||_2 / ||p||_2      4.4 × 10^-7   1.5 × 10^-6   1.6 × 10^-6   1.6 × 10^-6
Mesh 2    Niter                       7             7             7             7
d = 5     ||u - u*||_2 / ||u||_2      1.4 × 10^-5   5.4 × 10^-6   4.5 × 10^-6   4.5 × 10^-6
          ||p - p*||_2 / ||p||_2      2.8 × 10^-7   1.0 × 10^-6   1.1 × 10^-6   1.1 × 10^-6
Mesh 3    Niter                       8             8             8             8
d = 5     ||u - u*||_2 / ||u||_2      4.5 × 10^-6   4.4 × 10^-6   4.4 × 10^-6   4.4 × 10^-6
          ||p - p*||_2 / ||p||_2      1.8 × 10^-8   1.9 × 10^-8   1.9 × 10^-8   1.9 × 10^-8

In our experiments, we compare the exact solution $u$ and $p$ of (5) with the values $u^*$ and $p^*$ computed by the null space algorithm where, in Step 2, we used the conjugate gradient method with (16), and we chose $w^{(0)} = 0$. We assume that the values computed by the HSL routine MA47 [11], which implements a sparse Gaussian factorization applied to (5), are exact. We also report on the performance of the algorithm when a symmetric scaling is performed on the problem. The entries of the diagonal scaling matrix $D$ are equal to $(\sqrt{M_{ii}})_{i=1,\dots,n}$ and the system (5) is scaled as follows:

$$
\begin{bmatrix} D^{-1} & 0 \\ 0 & I_m \end{bmatrix}
\begin{bmatrix} M & A \\ A^T & 0 \end{bmatrix}
\begin{bmatrix} D^{-1} & 0 \\ 0 & I_m \end{bmatrix}
\begin{bmatrix} D u \\ p \end{bmatrix}
=
\begin{bmatrix} D^{-1} q \\ b \end{bmatrix} \tag{17}
$$
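The effect of the symmetric scaling (17) can be checked on a small dense example. The sketch below is illustrative only: the matrices are random and hypothetical, the augmented systems are solved with a dense direct solver rather than with MA47 or the null space algorithm, and the variable names (`M_s`, `A_s`, `q_s`) are assumptions introduced for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 6, 2

# Hypothetical badly scaled SPD matrix: S (B B^T + I) S with diagonal S.
B = rng.standard_normal((n, n))
S = np.diag(10.0 ** rng.uniform(-1.0, 1.0, n))
M = S @ (B @ B.T + np.eye(n)) @ S
A = rng.standard_normal((n, m))
q = rng.standard_normal(n)
b = rng.standard_normal(m)

d = np.sqrt(np.diag(M))        # diagonal of D = diag(sqrt(M_ii))
M_s = M / np.outer(d, d)       # D^{-1} M D^{-1}
A_s = A / d[:, None]           # D^{-1} A
q_s = q / d                    # D^{-1} q

# Solve the scaled system (17); its unknowns are (D u, p).
K_s = np.block([[M_s, A_s], [A_s.T, np.zeros((m, m))]])
x_s = np.linalg.solve(K_s, np.concatenate([q_s, b]))
u = x_s[:n] / d                # recover u from D u
p = x_s[n:]

# Solve the original system (5) for comparison.
K = np.block([[M, A], [A.T, np.zeros((m, m))]])
x = np.linalg.solve(K, np.concatenate([q, b]))

print(np.allclose(np.diag(M_s), 1.0), np.allclose(np.concatenate([u, p]), x))
```

The scaled matrix `M_s` has unit diagonal, and the pressure block `p` is unchanged by the scaling, so only the velocity unknowns need to be rescaled after the solve.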

5.1. Test problems

In Figure 1, we illustrate the geometry of the domain $\Omega$ and the boundary conditions. We generated three regular meshes on $\Omega$ with an increasing number $m$ of triangles (see Table I). For each mesh, we assembled $M$, $A$, $q$, and $b$ corresponding to a permeability that assumes two values $K_0$ and $K_1$ (see Figure 1). The value $K_0 = 1$ is constant for all the test problems, and, for each mesh, we have four values of $K_1$: $10^{-2}$, $10^{-4}$, $10^{-6}$, and $10^{-8}$. We assume that internal sources are absent, and, therefore, $b = 0$ for all the test problems. In Table I, we report the values of $n$ and $m$ (the numbers of degrees of freedom of the problem), and the number of non-zero entries in $M$ and $A$. Moreover, in Table I, we report the storage (measured in words) needed for $H$ and $R$, and the values of $h$.


Figure 2. Velocity fields produced by the direct (top) and iterative (bottom) solvers on Mesh 2 with $K_1 = 10^{-8}$.

5.2. Numerical results

In Tables II and III, we summarize the numerical performance of the algorithm for each test problem and each mesh, without and with scaling, respectively. In Figures 3, 4, and 5, we plot the convergence history of $\|Z^T M Z w^{(j)} - s\|_2 / \|s\|_2$, and of $\|f\|_{(Z^T M Z)^{-1}} / \|s\|_{(Z^T M Z)^{-1}}$ and its estimate, for each mesh and each value of $K_1$. We choose $\eta = h$ (see the values of $h$ in Table I) in (16) as suggested in Reference [2]. In Reference [2], it is proved, in the framework of a classical finite-element approximation, that the energy norm of the difference between the function corresponding to the computed algebraic solution and the true solution of the continuous problem is of order $h$. Our numerical experiments support the same conclusion for the velocity field in the framework of the mixed finite-element approximation.

We denote by $N_{iter}$ the iteration at which (16), with $\eta$ chosen as in Table I, stops the conjugate gradient. We point out that the error estimate is relative to the error at step $N_{iter} - d$. Nevertheless, because the convergence of the conjugate gradient in the energy norm is monotone, we can safely use the final value $w^{(N_{iter})}$. In Table II, we report the stopping iteration $N_{iter}$, the values chosen for the parameter $d$ used in computing the estimate $\xi_j$, and the relative errors between $u$ and $u^*$, and between $p$ and $p^*$. For the sake of simplicity, we used the same value of $d$ within each mesh. We point out that for the largest values of $K$, we obtained similar results using smaller values of $d$.

Figure 3. Mesh 1 calculations for different relative permeabilities: 'exact' error (solid), estimated error (circles), residual norm (dotted).


Figure 4. Mesh 2 calculations for different relative permeabilities: 'exact' error (solid), estimated error (circles), residual norm (dotted).

The condition number $\kappa(Z^T M Z) = \|Z^T M Z\| \, \|(Z^T M Z)^{-1}\|$ of the projected Hessian matrix is uniformly bounded by a value which depends on $K$ but not on $h$ [4]. Therefore, for decreasing values of $h$, the convergence of the conjugate gradient, measured in the energy norm of the error, will depend only on $K$. In our test problems, the convergence has a staircase behaviour where the length, the depth, and the steepness of the steps increase when $K_1$ decreases. The choice of $d$ depends on the length and the steepness of these steps and moderately on $h$. If we choose a smaller value for $d$, then we can have phenomena similar to the one of Figure 5 relative to $K_1 = 10^{-8}$. When scaling is performed, the convergence is faster and without steps because the condition number of the scaled projected Hessian matrix depends neither on $K$ nor on $h$ (see Table III).

In Figure 2, we plot the velocity field $u_h(x)$, which is obtained by solving (5) with MA47, and the velocity field $u^*(x)$, which is computed by the null space algorithm using (16), for specific choices of the mesh and permeability. From Table II, the error between the two fields is $\approx 10^{-2}$ while $h = 5.8 \times 10^{-2}$. We point out that $p^*$ is computed with less accuracy (see Table II).



Figure 5. Mesh 3 calculations for different relative permeabilities: 'exact' error (solid), estimated error (circles), residual norm (dotted).

In Figure 6, we show that the energy norm of the solution at each step converges quite fast to the energy norm of $u$. Finally, we observe that the symmetric scaling (17), which normalizes to 1 the diagonal entries $M_{ii}$, increases the rate of convergence of the conjugate gradient for 2D problems. Nevertheless, this improvement is much less dramatic when the permeability is not isotropic, and when we solve 3D problems. Moreover, we point out that this null space algorithm can be easily generalized to solve non-linear problems in which the permeability tensor depends on the solution. Within this framework, the scaling process imposes a new factorization at each step of the Newton algorithm, making the total cost prohibitive.

Figure 6. Ratio of the energy norm of the 'exact' and the iterative solution for the three calculations.

6. CONCLUSIONS

We can see from the numerical experiments that the null space algorithm based on the Householder factorization gives a projected Hessian matrix whose conditioning is asymptotically independent of $n$ and $m$ and, thus, of the parameter $h$. This follows from the inf–sup condition [4] and the choice of the Raviart–Thomas finite-element approximation functions. Nevertheless, when the physical problem is very ill-conditioned owing to the presence of very low permeability regions in the domain, the rate of convergence of the conjugate gradient does not deteriorate dramatically. The Householder decomposition-based algorithm stores a number of non-zero entries for $R$ and $H$ which increases with $n$ and $m$. In Reference [12], the authors point out that, for 3D problems, the computer memory requirement for storing $H$ can become prohibitive.

REFERENCES

1. Arioli M. The use of QR factorization in sparse quadratic programming and backward error issues. SIAM Journal on Matrix Analysis and Applications 2000; 21:825–839.
2. Arioli M. A stopping criterion for the conjugate gradient algorithm in a finite element method framework. Technical Report 1179, IAN-CNR, Pavia, Italy, 2000.
3. Ciarlet PG. The Finite Element Method for Elliptic Problems. North-Holland: Amsterdam, Holland, 1978.
4. Brezzi F, Fortin M. Mixed and Hybrid Finite Element Methods. Springer: Berlin, 1992.
5. Gill PE, Murray W, Wright MH. Practical Optimization. Academic Press: London, UK, 1981.
6. Amestoy P, Duff I, Puglisi C. Multifrontal QR factorization in a multiprocessor environment. Numerical Linear Algebra with Applications 1996; 3:275–300.
7. Arioli M, Baldini L. A backward error analysis of a null space algorithm in sparse quadratic programming. SIAM Journal on Matrix Analysis and Applications 2001; 23:425–442.
8. Golub G, Meurant G. Matrices, moments and quadrature II: how to compute the norm of the error in iterative methods. BIT: Numerical Mathematics 1997; 37:687–705.
9. Meurant G. The computation of bounds for the norm of the error in the conjugate gradient algorithm. Numerical Algorithms 1997; 16:77–87.
10. HSL. A collection of Fortran codes for large scale scientific computation. 2000. http://www.cse.clrc.ac.uk/Activity/HSL.
11. Duff I, Reid J. The multifrontal solution of indefinite sparse linear systems. ACM Transactions on Mathematical Software 1983; 9:302–325.
12. Arioli M, Maryška J, Rozložník M, Tůma M. Dual variable methods for mixed-hybrid finite element approximation of the potential fluid flow problem in porous media. Technical Report RAL-TR-2001-023, RAL-CLRC, Oxfordshire, UK, 2001.
