Numerical Algorithms (2005) 40: 103–124 DOI 10.1007/s11075-005-1523-5
Springer 2005
Differential equations and solution of linear systems

Jean-Paul Chehab a,b and Jacques Laminie b

a Laboratoire de Mathématiques Paul Painlevé, CNRS UMR 8524, Université de Lille, France
E-mail: [email protected]

b Laboratoire de Mathématiques, CNRS UMR 8628, Equipe ANEDP, Université Paris Sud, Orsay, France
E-mail: [email protected]
Received 14 June 2003; accepted 12 December 2004. Communicated by H. Sadok.
Many iterative processes can be interpreted as discrete dynamical systems and, in certain cases, they correspond to a time discretization of differential systems. In this paper, we propose to derive iterative schemes for solving linear systems of equations by modeling the problem to be solved as a stable state of a suitable differential system; the solution of the original linear problem is then computed numerically by applying a time marching scheme. We discuss some aspects of this approach, which allows us to recover some known methods but also to introduce new ones. We give convergence results and numerical illustrations.

Keywords: differential equation, numerical schemes, numerical linear algebra, preconditioning

AMS subject classification: 65F10, 65F35, 65L05, 65L12, 65L20, 65N06
1. Introduction
The connections between differential equations and linear algebra are numerous: on the one hand, linear algebra tools and concepts are used for studying theoretical aspects of ODEs, such as the properties of equilibrium points [14,15,19]; in a parallel way, techniques of numerical linear algebra are intensively applied in the numerical analysis of ODEs, e.g., for the analysis of time marching methods. On the other hand, in some cases, iterative processes for solving linear as well as nonlinear systems of equations can be derived from the discretization of an ODE, as pointed out, e.g., in [8,9,11,15,19] for the computation of fixed points, but also in [7] for the interpretation of convergence acceleration algorithms. During the last two decades, Numerical Linear Algebra (NLA) has been considerably enriched by the introduction of methods like GMRES [18], Bi-Cgstab [20] or QMR [12], since they allow the efficient solution of large scale non-symmetric problems. These algorithms are based on Krylov subspace techniques and, if we omit some variations on these methods, no fundamentally new algorithm has been proposed since, making preconditioning a central topic in NLA [3].
Several classical iterative methods can be recovered by a proper discretization of ODEs; in particular, some descent methods can be interpreted as discrete versions of gradient flows [13]. One of the simplest examples is given by the relation between Richardson-like methods and the forward Euler scheme, the relaxation parameter and the time step size playing the same role, see [9,17]. Let P be an n × n symmetric positive definite matrix. Consider the equation

$$\frac{dU}{dt} = b - PU, \qquad U(0) = U_0, \tag{1.1}$$

whose steady state is the solution of the linear system

$$PU = b. \tag{1.2}$$
The steady state is asymptotically stable and can therefore be computed numerically by using an explicit time marching scheme. This is a simple but very important property since, in that case, the time discretization consists in building a sequence of vectors satisfying a simple (linear) recurrence relation. The application of the forward Euler scheme to (1.1) generates the iterations

$$U^{k+1} = U^k + \Delta t\,\bigl(b - PU^k\bigr), \qquad k = 0, 1, \ldots \tag{1.3}$$

Scheme (1.3) is nothing else but the classical Richardson iteration; if P is positive definite, the stability condition is $0 < \Delta t < 2/\rho(P)$, where $\rho(P)$ denotes the spectral radius of P. However, since the goal is to approach the steady state as fast as possible, many variants not directly connected to the numerical analysis of ODEs can be considered; the time step $\Delta t$ can depend on k, so some descent methods enter this framework. Of course, other classical methods can be recovered following this approach.
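To fix ideas, here is a minimal sketch (not from the paper) of the Richardson/forward Euler correspondence (1.1)–(1.3); the test matrix, right-hand side and time step below are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
A = rng.standard_normal((n, n))
P = A @ A.T + n * np.eye(n)             # symmetric positive definite test matrix
b = rng.standard_normal(n)

dt = 1.0 / np.linalg.eigvalsh(P).max()  # any 0 < dt < 2/rho(P) gives stability
U = np.zeros(n)                         # U(0) = U0
for k in range(5000):
    U = U + dt * (b - P @ U)            # forward Euler on dU/dt = b - PU, i.e. (1.3)

print(np.linalg.norm(b - P @ U))        # residual of the Richardson iteration
```

With a constant time step this is exactly the Richardson iteration; letting $\Delta t$ depend on $k$ is the door through which descent methods enter, as developed in section 2.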
In this article we propose to generate numerical methods in NLA by modeling the linear system to be solved as a given state of a dynamical system; the solution can be reached asymptotically, as an (asymptotically stable) steady state, but also at finite time (shooting methods). In that way, any (stable) numerical scheme for the integration of such a problem can be presented as a method for solving linear systems. This idea was introduced in [10] for building sequences of inverse preconditioners. We then propose to generate schemes in numerical linear algebra following two steps:

1. Construction/derivation of a dynamical system.
2. Discretization of the dynamical system by, e.g., time marching techniques.

We discuss here some aspects of this approach and show that the derived methods can be of numerical interest.

The article is organized as follows: in section 2 we consider a family of coupled dynamical systems whose discretization allows us to recover classical descent methods but also to define new ones. Then, in section 3, we propose to reach the solution of the linear system at finite time by implementing a shooting method. In section 4, we consider different time marching schemes for differential systems such as (1.1). Finally, we present some numerical results in section 5. The numerical results were obtained using Matlab 6 software on a cluster of bi-processor 800 (Pentium III) machines at Université Paris XI, Orsay, France.

2. Coupled differential systems and descent methods
Basically, the iterations of a descent method satisfy a recurrence relation of the type

$$u^{k+1} = u^k + \alpha_k z^k, \tag{2.1}$$

where $u^k$ is the approximation of the solution of the system at step k, $\alpha_k$ is the step size and $z^k$ the descent direction vector. The residual $r^k = b - Pu^k$ satisfies the relation

$$r^{k+1} = r^k - \alpha_k P z^k. \tag{2.2}$$
Here P is a regular matrix, not necessarily symmetric positive definite. If $\alpha_k$ plays the role of a time step, we can identify the above iterative process with a time marching scheme applied to a differential system.
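As a concrete instance of the stencil (2.1)–(2.2), here is a minimal sketch; the particular choices $z^k = r^k$ and the steepest-descent step $\alpha_k = \langle r^k, r^k\rangle / \langle r^k, P r^k\rangle$ (valid for a symmetric positive definite P) are ours for illustration — generating such choices systematically is precisely the purpose of the differential-system viewpoint below.

```python
import numpy as np

def descent(P, b, steps=500):
    """Generic descent stencil (2.1)-(2.2).
    Illustrative choices: direction z = r and the steepest-descent step (P SPD)."""
    u = np.zeros_like(b)
    r = b - P @ u
    for _ in range(steps):
        z = r                               # descent direction (illustrative choice)
        alpha = (r @ r) / (r @ (P @ z))     # step size (illustrative choice)
        u = u + alpha * z                   # (2.1)
        r = r - alpha * (P @ z)             # (2.2)
    return u, r
```

For a symmetric positive definite P these particular choices give the classical steepest-descent method; the systems studied in section 2.2 generate other choices of $z^k$ and $\alpha_k$.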
One way to recover the above stencil is to consider the time discretization of linear differential systems such as

$$\begin{pmatrix} \dfrac{du}{dt} \\[4pt] \dfrac{dz}{dt} \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} r \\ z \end{pmatrix}. \tag{2.3}$$

We have set here $r = b - Pu$. So, under suitable assumptions on the matrices A, B, C and D, the convergence of the system to the trivial equilibrium point $(r, z) = (0, 0)$ implies that $\lim_{t\to+\infty} u(t) = P^{-1}b$. We discuss hereafter different strategies for choosing these matrices.

2.1. The general case

The system (2.3) is consistent with the solution of the linear problem $Pu = b$ once the matrix
$$\begin{pmatrix} A & B \\ C & D \end{pmatrix}$$

is regular. If we neglect the time derivative in z, the above system reads

$$\begin{pmatrix} \dfrac{du}{dt} \\[4pt] 0 \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} r \\ z \end{pmatrix}. \tag{2.4}$$
If we eliminate z with the algebraic relation $Cr + Dz = 0$, we obtain formally

$$\frac{du}{dt} = \bigl(A - BD^{-1}C\bigr)\, r,$$

say

$$\frac{dr}{dt} = -P\bigl(A - BD^{-1}C\bigr)\, r.$$

Hence, if

$$A - BD^{-1}C = P^{-1}, \tag{2.5}$$
the solution of the linear system is reached in one iteration by taking $\Delta t = 1$: $S = A - BD^{-1}C$ is a Schur complement which can be interpreted as an inverse preconditioner of P. This indicates how to choose the matrices A, B, C, D. For example, if we let $A = 0$, $B = -C = Id$, then, according to (2.5), the matrix D must be chosen such that $PD^{-1} \approx Id$, which means that D must be a preconditioner of P. In a general way, the iterative solution of the algebraic equation $Cr + Dz = 0$ can be seen as a projection on a linear manifold. If we take $D = P$, the projection reduces to the equation $r = Pz$ and can be interpreted as a preconditioning step, the implementation of the preconditioning consisting in solving this last system iteratively; see also section 5.

Remark 1. In (2.4), the expression $Bz$ together with the relation $z = -D^{-1}Cr$ can be interpreted as a feedback control of the system; see also [4] for the relations between control of linear systems and descent methods.

2.2. A family of descent methods

2.2.1. Derivation of the system

In order to build inverse preconditioners of a given regular matrix P, it was proposed in [10] to integrate numerically matrix differential equations which have $P^{-1}$ as steady state, such as the following Riccati equation:

$$\frac{dQ}{dt} = Q(Id - PQ), \qquad Q(0) = Q_0. \tag{2.6}$$
It can be shown, under suitable assumptions, that $Q(t)$ converges to $P^{-1}$ as $t \to +\infty$, see [10]. Unfortunately, since the equation is nonlinear, it is not possible to derive a simple (linear) recurrence relation when integrating this system by, e.g., Euler's method. For these reasons, we consider a linearized version of the above system:

$$\frac{dQ}{dt} = \hat{Q}\,(Id - PQ), \qquad Q(0) = Q_0, \tag{2.7}$$
where here $\hat{Q}$ is an inverse preconditioner of P; $\hat{Q}$ can be a fixed matrix as well as a function matrix $\hat{Q} = \hat{Q}(t)$. Of course, the convergence is sped up when $\lim_{t\to+\infty} \hat{Q}(t) = P^{-1}$, and we can build $\hat{Q}(t)$ as the solution of a linear differential equation:

$$\frac{d\hat{Q}}{dt} = \bar{Q}\,\bigl(Id - P\hat{Q}\bigr), \qquad \hat{Q}(0) = \hat{Q}_0, \tag{2.8}$$

where here $\bar{Q}$ is now a constant matrix.

Remark 2. When $\hat{Q}$ is a constant matrix, the integration of system (2.7) by the forward Euler scheme coincides with preconditioned Richardson iterations, $\hat{Q}$ playing the role of the preconditioner. The Richardson iterations are accelerated when $\hat{Q} \to P^{-1}$, see [5,6].

The matrices $Q$ and $\hat{Q}$ are solution of the coupled system

$$\begin{cases} \dfrac{dQ}{dt} = \hat{Q}\,(Id - PQ), \\[6pt] \dfrac{d\hat{Q}}{dt} = \bar{Q}\,\bigl(Id - P\hat{Q}\bigr), \\[6pt] Q(0) = Q_0, \qquad \hat{Q}(0) = \hat{Q}_0. \end{cases} \tag{2.9}$$
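The flavour of (2.6)–(2.9) can be seen on a small example: integrating the matrix flow $dQ/dt = Q(Id - PQ)$ by forward Euler produces a sequence of approximate inverses of P. The sketch below is ours; the initial guess $Q_0 = Id/\|P\|_2$ and the time step are our own choices (with $\Delta t = 1$ one recognizes a Newton–Schulz-type step).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
A = rng.standard_normal((n, n))
P = A @ A.T + n * np.eye(n)               # SPD test matrix
Q = np.eye(n) / np.linalg.norm(P, 2)      # Q0 = Id/||P||_2, so ||Id - P Q0||_2 < 1
dt = 1.0                                  # dt = 1: Q <- Q(2 Id - P Q), Newton-Schulz-like

for k in range(30):
    Q = Q + dt * Q @ (np.eye(n) - P @ Q)  # forward Euler on dQ/dt = Q(Id - PQ), cf. (2.6)

print(np.linalg.norm(np.eye(n) - P @ Q))  # ||Id - PQ||: Q is now a good inverse preconditioner
```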
We now introduce $u = Qb$, in such a way that $\lim_{t\to+\infty} u(t) = P^{-1}b$. We multiply, on the right, the first matrix equation by the fixed vector b, and the second one by $r = b - Pu$. We obtain

$$\begin{cases} \dfrac{du}{dt} = \hat{Q}\, r, \\[6pt] \dfrac{d\hat{Q}}{dt}\, r = \bar{Q}\,\bigl(r - P\hat{Q}r\bigr). \end{cases} \tag{2.10}$$

Letting $z = \hat{Q}r$ and using the relation

$$\frac{d\hat{Q}r}{dt} = \frac{d\hat{Q}}{dt}\, r + \hat{Q}\,\frac{dr}{dt},$$

we get

$$\begin{cases} \dfrac{du}{dt} = z, \\[6pt] \dfrac{dz}{dt} - \hat{Q}\,\dfrac{dr}{dt} = \bar{Q}\,(r - Pz). \end{cases} \tag{2.11}$$
Finally, since $dr/dt = -Pz$, we obtain the system

$$\begin{cases} \dfrac{du}{dt} = z, \\[6pt] \dfrac{dz}{dt} = -\hat{Q}Pz + \bar{Q}\,(r - Pz). \end{cases} \tag{2.12}$$

Remark 3. Following the presentation of (2.3), this last system can be written as

$$\begin{pmatrix} \dfrac{du}{dt} \\[4pt] \dfrac{dz}{dt} \end{pmatrix} = \begin{pmatrix} 0 & Id \\ \bar{Q} & -\bigl(\hat{Q} + \bar{Q}\bigr)P \end{pmatrix} \begin{pmatrix} r \\ z \end{pmatrix}. \tag{2.13}$$

The associated Schur complement is here $S = -P^{-1}\bigl(\hat{Q} + \bar{Q}\bigr)^{-1}\bar{Q}$.

Remark 4. We can of course repeat the process by defining $\bar{Q}$ as the solution of a linear differential equation, and so on. More precisely, if we consider N levels of these iterations, we obtain the differential system

$$\begin{cases} \dfrac{du}{dt} = z_1, \\[6pt] \dfrac{dz_i}{dt} = -(Q_i + Q_{i+1})Pz_i + z_{i+1}, \quad i = 1, \ldots, N-1, \\[6pt] Q_N = Id, \quad z_N = 0, \end{cases} \tag{2.14}$$

where the matrices $Q_i$ are defined by

$$\frac{dQ_i}{dt} = Q_{i+1}\,(Id - PQ_i),$$

and where we have set $z_i = Q_i r$, $i = 1, \ldots, N-1$.

2.2.2. Some derived differential systems

In (2.12), the matrix $\hat{Q}$ must be computed at each step when integrating the system: this is not compatible with the general stencil of a descent method, in which only sequences of vectors and fixed matrices are handled. A way to overcome this difficulty is to approximate the matrix $\hat{Q}P$. We hereafter propose some dynamical systems deduced from such approximations, which allow us to derive descent methods by numerical integration.

1. $\hat{Q}P \approx Id$. This approximation is motivated by the assumption $\lim_{t\to+\infty} \hat{Q} = P^{-1}$. The dynamical system is, in that case,

$$\begin{cases} \dfrac{du}{dt} = z, \\[6pt] \dfrac{dz}{dt} = -z + \bar{Q}\,(r - Pz). \end{cases} \tag{2.15}$$
2. $\hat{Q} \approx \bar{Q}$. The derived dynamical system is then

$$\begin{cases} \dfrac{du}{dt} = z, \\[6pt] \dfrac{dz}{dt} = \bar{Q}\,(r - 2Pz). \end{cases} \tag{2.16}$$

3. $\hat{Q}P = 0$. This approximation is obtained by considering the steady state $z = 0$. The dynamical system is here

$$\begin{cases} \dfrac{du}{dt} = z, \\[6pt] \dfrac{dz}{dt} = \bar{Q}\,(r - Pz). \end{cases} \tag{2.17}$$

4. Replace $dz/dt$ by 0 in (2.12):

$$\begin{cases} \dfrac{du}{dt} = z, \\[6pt] \hat{Q}Pz = \bar{Q}\,(r - Pz). \end{cases} \tag{2.18}$$
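As an illustration of how such a system turns into a concrete iteration, here is a minimal sketch of the forward Euler discretization of (2.17) with $\bar{Q} = Id$; the test matrix and the conservative time step $\Delta t = 1/\lambda_{\max}(P)$ are our own choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
A = rng.standard_normal((n, n))
P = A @ A.T + n * np.eye(n)                  # SPD test matrix
b = rng.standard_normal(n)

dt = 1.0 / np.linalg.eigvalsh(P).max()       # conservative explicit time step
u = np.zeros(n)
z = np.zeros(n)
for k in range(20000):
    r = b - P @ u                            # residual
    u, z = u + dt * z, z + dt * (r - P @ z)  # forward Euler on (2.17) with Qbar = Id

print(np.linalg.norm(b - P @ u))             # decreases steadily (slowly without preconditioning)
```

Taking a nontrivial $\bar{Q}$ in place of the identity plays the role of a preconditioner, exactly as in Remark 2.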
Various dynamical systems can be derived by considering different approximations of $\hat{Q}P$. Let us consider the particular case $\bar{Q}P \approx Id$, $\hat{Q}P \approx \alpha(t)\,Id$. The discretization of such a system by a forward Euler method with variable time step reads

$$\begin{cases} u^{k+1} = u^k + \beta^k z^k, \\[2pt] z^{k+1} = r^k + \alpha^k z^k. \end{cases} \tag{2.19}$$

This is the general stencil of the conjugate gradient method.

2.2.3. Convergence results

As stated before, any (stable) discretization of the above dynamical systems yields a numerical method for solving $Pu^* = b$. Of course, these methods must be explicit, and their stability requires the equilibrium point $(u, z) = (u^*, 0)$ or $(r, z) = (0, 0)$ to be asymptotically stable. The differential systems can be written as

$$\begin{pmatrix} \dfrac{dr}{dt} \\[4pt] \varepsilon\,\dfrac{dz}{dt} \end{pmatrix} = M \begin{pmatrix} r \\ z \end{pmatrix},$$

with $\varepsilon = 0$ or 1. Here M is the matrix of the system. The point $(r, z) = (0, 0)$ is asymptotically stable when all the eigenvalues of M have real part bounded from above by a strictly negative real number, see [14].
We have the following result:

Proposition 5. We set $\bar{Q} = Id$. Then the vector u defined by the differential system (2.15) converges to $P^{-1}b$ as $t \to +\infty$. Moreover, $(r - Pz) \to 0$ at an exponential rate as $t \to +\infty$.

Proof. System (2.15) is equivalent to

$$\begin{pmatrix} \dfrac{dr}{dt} \\[4pt] \dfrac{dz}{dt} \end{pmatrix} = \begin{pmatrix} 0 & -P \\ Id & -Id - P \end{pmatrix} \begin{pmatrix} r \\ z \end{pmatrix}. \tag{2.20}$$

We establish the result by showing that the real parts of the eigenvalues of the matrix

$$M = \begin{pmatrix} 0 & P \\ -Id & Id + P \end{pmatrix}$$

are positive: in that case all the orbits converge to the equilibrium point at an exponential rate [14]. Let $(u, v)^T$ be an eigenvector of M with associated eigenvalue $\lambda$. We have the relations

$$Pv = \lambda u, \qquad -u + (Id + P)v = \lambda v,$$

from which we deduce $(1 - \lambda)Pu = \lambda(1 - \lambda)u$. Hence, the eigenvalues of M are $\{1\} \cup \sigma(P)$, where $\sigma(P)$ is the spectrum of P. Now, returning to (2.15) and adding the two equations, we obtain, after multiplication by $-P$ and the usual simplifications,

$$\frac{d(r - Pz)}{dt} + P(r - Pz) = 0.$$

Integrating this equation, we get $(r - Pz)(t) = e^{-tP}(r - Pz)(0)$. Hence the last assertion. The proof is achieved.
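A quick numerical sanity check of the spectral claim in this proof (our own verification, not part of the paper): for an SPD matrix P, the eigenvalues of M are $\{1\} \cup \sigma(P)$ and therefore have positive real part.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20
A = rng.standard_normal((n, n))
P = A @ A.T + n * np.eye(n)                         # SPD test matrix
I = np.eye(n)
M = np.block([[np.zeros((n, n)), P], [-I, I + P]])  # the matrix M from the proof

eig_M = np.linalg.eigvals(M)
expected = np.sort(np.concatenate([np.ones(n), np.linalg.eigvalsh(P)]))
print(np.allclose(np.sort(eig_M.real), expected))   # eigenvalues of M are {1} U sigma(P)
print(eig_M.real.min() > 0)                         # hence all real parts are positive
```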
In a similar way, we can prove the following results:

Proposition 6. We set $\bar{Q} = Id$. Assume that the eigenvalues of P are real and larger than 1. Then the vector u defined by the differential system (2.16) converges to $P^{-1}b$ as $t \to +\infty$.

Proof. We proceed as above. Let $w = (u, v)^T$ be an eigenvector of M with $\lambda$ as associated eigenvalue. We have the relation

$$(2\lambda - 1)Pu = \lambda^2 u.$$
It follows that $\lambda^2/(2\lambda - 1)$ is an eigenvalue of P (we cannot have $\lambda = \tfrac{1}{2}$, because in that case we would get $w = 0$, while $w$ is an eigenvector). So $\lambda$ satisfies the equation

$$\lambda^2 - 2\mu\lambda + \mu = 0, \qquad \mu \in \sigma(P).$$

We then have

$$\lambda = \mu \pm \sqrt{\mu^2 - \mu}.$$
Hence, since µ > 1, we have λ > 0. The stability of fixed points of system (2.17) is given by:
Proposition 7. We let $\bar{Q} = Id$. Assume that all the eigenvalues of P are real and larger than $\tfrac{1}{2}$. Then the vector u defined by the differential system (2.17) converges to $P^{-1}b$ as $t \to +\infty$.

Proof. The proof is very similar to the previous one.
Remark 8. Assumptions like $\sigma(P) \subset [\tfrac{1}{2}, +\infty[$ are not restrictive at all: indeed, they can be obtained after a simple rescaling since P is positive definite.

Iterations (2.19) can be derived by time discretization of the system

$$\frac{du}{dt} = z, \qquad z = r - \alpha(t)\,z.$$

Here $\alpha(t)$ is a (regular) function to be chosen. Now, using the relation

$$\frac{dr}{dt} = -P\,\frac{du}{dt},$$

we have

$$\frac{dz}{dt} = -Pz - \alpha(t)\,\frac{dz}{dt} - \frac{d\alpha(t)}{dt}\, z.$$

Hence,

$$\frac{dz}{dt} = -\frac{1}{1 + \alpha(t)}\left(Pz + \frac{d\alpha(t)}{dt}\, z\right),$$

and therefore

$$\frac{d^2 r}{dt^2} = -\frac{1}{1 + \alpha(t)}\left(P + \frac{d\alpha(t)}{dt}\, Id\right)\frac{dr}{dt}. \tag{2.21}$$
From these equations, we deduce the following result:

Proposition 9. Assume that (i) $\alpha(t) > -1$ for all $t \geq 0$, and (ii) $d\alpha(t)/dt > \lambda_{\min}(P)$. Then all the orbits of (2.18) converge to the solution $(0, 0)$.

Proof. The proof is obtained by a classical computation.
Remark 10. All the coupled dynamical systems introduced above can be written as a second order differential system. Indeed, thanks to the relation $du/dt = z$, we can write (2.12) as

$$\frac{d^2 u}{dt^2} + \bigl(\hat{Q} + \bar{Q}\bigr)P\,\frac{du}{dt} + \bar{Q}Pu - \bar{Q}b = 0. \tag{2.22}$$
Remark 11. Bi-gradient methods can be obtained by a particular time discretization of a coupled dynamical system. In this case there are two descent direction vectors. For example, Bi-Cgstab is derived from

$$\begin{cases} \dfrac{dr}{dt} = -P(s + q), \\[4pt] q = r + \omega_0(Id - \alpha P)q, \\[2pt] s = r - \omega Pq. \end{cases} \tag{2.23}$$

Here, s and q are the descent direction vectors [20].

3. Shooting methods
The solution of the linear problem $Pu = b$ was previously defined as the steady state of some differential systems. A way to reach the solution for a finite value of the independent variable is to model the linear system as an objective. Let us turn back to the stencil of the differential systems associated with the descent methods, as they were built above. We have the system

$$\begin{pmatrix} \dfrac{dr}{dt} \\[4pt] \dfrac{dz}{dt} \end{pmatrix} = \begin{pmatrix} 0 & -P \\ C & D \end{pmatrix} \begin{pmatrix} r \\ z \end{pmatrix}. \tag{3.1}$$

Now, let $T > 0$ be a given real number. We define the problem as follows:

$$\text{Find } z(0) \in \mathbb{R}^n \text{ such that } r(T) = 0. \tag{S}$$
Letting

$$M = \begin{pmatrix} 0 & -P \\ C & D \end{pmatrix},$$

we rewrite problem (S) as: given $r(0)$, find $z(0)$ such that

$$\begin{pmatrix} r(0) \\ z(0) \end{pmatrix} = e^{-TM} \begin{pmatrix} 0 \\ z(T) \end{pmatrix}. \tag{3.2}$$
At this point we introduce the flow function F defined by

$$F : z(0) \longmapsto r(T),$$

in such a way that (S) reduces to finding a zero of F. We now consider the case $C = Id$, $D = -Id - P$.

Remark 12. A natural idea could be to consider a pointwise version of the classical shooting method for solving second order boundary value problems. This consists in solving twice the problem

$$\begin{cases} \dfrac{dr}{dt} = -Pz, \\[4pt] \dfrac{dz}{dt} = -z + (r - Pz), \\[2pt] r(0) = r_0, \quad z(0) = z_0, \end{cases} \tag{3.3}$$

for two different values of $z(0)$. Denoting by $r_1(t)$ and $r_2(t)$ (resp. $z_1(t)$ and $z_2(t)$) the solutions of the above system for $z(0) = \xi_1$ and $z(0) = \xi_2$, we build the function $r(t)$ as $r(t) = \Theta_1 r_1(t) + \Theta_2 r_2(t)$, where $\Theta_1$ and $\Theta_2$ are two diagonal matrices such that

$$r(0) = \Theta_1 r_1(0) + \Theta_2 r_2(0)$$

and

$$0 \;(= r(T)) = \Theta_1 r_1(T) + \Theta_2 r_2(T).$$

We have immediately $\Theta_2 = Id - \Theta_1$ and

$$(\Theta_1)_i = \frac{(r_2(T))_i}{(r_2(T))_i - (r_1(T))_i}.$$

Unfortunately, this does not allow us to give a simple and explicit value of $z(t)$ with $F(z(0)) = 0$, since we have

$$z(t) = P^{-1}\Theta_1 P z_1(t) + P^{-1}\Theta_2 P z_2(t).$$
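To make the shooting idea concrete, here is a minimal sketch for the case $C = Id$, $D = -Id - P$ treated above. The map $z(0) \mapsto r(T)$ defined by a fixed discrete integrator is affine, so its matrix can be assembled column by column and problem (S) reduces to one linear solve of size n; the test matrix, horizon T, step count and integrator are our own illustrative choices.

```python
import numpy as np

def propagate(P, b, z0, T, N):
    """Forward Euler on (3.3): dr/dt = -P z, dz/dt = -z + (r - P z),
    carried together with du/dt = z so that r = b - P u holds exactly."""
    dt = T / N
    u = np.zeros_like(b)
    r = b - P @ u
    z = z0.copy()
    for _ in range(N):
        u, r, z = u + dt * z, r - dt * (P @ z), z + dt * (r - z - P @ z)
    return u, r

def shoot(P, b, T=2.0, N=400):
    """Solve problem (S): find z(0) such that r(T) = 0, then recover u."""
    n = len(b)
    r_free = propagate(P, b, np.zeros(n), T, N)[1]    # F(0)
    # the map z0 -> r(T) is affine: build its linear part column by column
    J = np.column_stack([propagate(P, b, e, T, N)[1] - r_free for e in np.eye(n)])
    z0 = np.linalg.lstsq(J, -r_free, rcond=None)[0]   # solve F(z0) = 0
    return propagate(P, b, z0, T, N)[0]

rng = np.random.default_rng(4)
n = 8
A = rng.standard_normal((n, n))
P = A @ A.T / n + np.eye(n)          # small SPD test matrix
b = rng.standard_normal(n)
u = shoot(P, b)
print(np.linalg.norm(b - P @ u))     # close to zero: u approximates P^{-1} b
```

Assembling the affine map explicitly is of course only sensible for small n; it is meant to illustrate the structure of problem (S), not to be an efficient solver.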
4. Numerical integration
4.1. Enhanced stable time marching scheme

4.1.1. Definition of the scheme

The computation of a steady state by an explicit scheme can be sped up by enlarging the stability domain of the scheme, since this allows the use of larger time steps; in that context the accuracy of the time marching scheme is not a priority. A simple way to derive more stable methods is to use parametrized one-step schemes and to fit the parameters, not for increasing the accuracy as in the classical schemes (Heun's, Runge–Kutta's), but for improving the stability. For example, in [9] a method was defined for computing fixed points iteratively with a larger descent parameter, starting from a specific numerical time scheme. More precisely, this method consists in integrating the differential equation

$$\frac{dU}{dt} = F(U), \qquad U(0) = U_0, \tag{4.1}$$

by the two-stage scheme

$$\begin{cases} K_1 = F(U^k), \\[2pt] K_2 = F(U^k + \Delta t\, K_1), \\[2pt] U^{k+1} = U^k + \Delta t\,\bigl(\alpha K_1 + (1 - \alpha)K_2\bigr). \end{cases} \tag{4.2}$$
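Before turning to the choice of the parameter, here is a minimal sketch of scheme (4.2) in the linear case $F(U) = b - PU$; the test matrix and the values of $\Delta t$ and $\alpha$ below are our own illustrative choices, not the ones analyzed in the paper.

```python
import numpy as np

def two_stage_step(U, F, dt, alpha):
    """One step of scheme (4.2)."""
    K1 = F(U)
    K2 = F(U + dt * K1)
    return U + dt * (alpha * K1 + (1.0 - alpha) * K2)

rng = np.random.default_rng(5)
n = 80
A = rng.standard_normal((n, n))
P = A @ A.T / n + np.eye(n)                 # SPD test matrix
b = rng.standard_normal(n)
F = lambda U: b - P @ U                     # the linear case F(U) = b - PU

dt = 1.5 / np.linalg.eigvalsh(P).max()      # illustrative choices of dt and alpha
alpha = 0.5
U = np.zeros(n)
for _ in range(400):
    U = two_stage_step(U, F, dt, alpha)

print(np.linalg.norm(b - P @ U))            # residual after 400 steps
```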
Here $\alpha$ is a parameter to be fixed. This scheme allows larger stability as compared to the forward Euler scheme. For example, when $F(U) = b - PU$, we have the following result:

Lemma 13. Assume that P is positive definite; then the scheme is convergent iff $\alpha < \tfrac{7}{8}$ and $\Delta t$

Proof. We have