SHOOTING METHODS FOR TWO-POINT BVPs WITH PARTIALLY SEPARATED ENDCONDITIONS

M. HERMANN AND D. KAISER

Friedrich-Schiller-Universität Jena, Fakultät für Mathematik und Informatik, Institut für Angewandte Mathematik, Leutragraben 1, D-07740 Jena, Germany. E-mail: [email protected]

October 22, 1997

Abstract

The stabilized march technique is extended to nonlinear two-point boundary value problems via a new Generalized Brent Method for systems of nonlinear algebraic equations. The resulting algorithms can be used to solve systems of nonlinear first-order ordinary differential equations under partially separated nonlinear boundary conditions economically. Numerical results which compare the nonlinear stabilized march method with the standard multiple shooting method are given.

short title: SHOOTING METHODS

AMS (MOS) subject classification: 65L10

1 Introduction

In this paper we consider the two-point boundary value problem (BVP) for a nonlinear system of n ordinary differential equations

x′(t) = f(t, x(t)),  a < t < b,  (1.1)

subject to n nonlinear partially separated boundary conditions

r_p(x(a)) = 0,  r_q(x(a), x(b)) = 0,  (1.2)

where x(t) ∈ R^n, f : Ω_1 → R^n, Ω_1 ≡ (a, b) × R^n, r_p : Ω_2^{(1)} → R^p, Ω_2^{(1)} ⊆ R^n, r_q : Ω_2^{(2)} → R^q, Ω_2^{(2)} ⊆ R^n × R^n, n = p + q. Assume that (1.1), (1.2) has an isolated solution and, for convenience, that f, r_p and r_q are as smooth as desired.

The numerical techniques for (1.1), (1.2) can be split into two classes, global and sequential. The need for a mesh over which the problem can be discretized and the resulting linear system solved is common to global methods. Here collocation and finite differences are involved, and two of the most widely tested general-purpose codes are based on their respective implementations (PASVA [15] and COLSYS [2]). Sequential (or so-called "shooting") methods require the integration of associated initial value problems (IVPs). This can involve unidirectional or bidirectional integration. A typical representative of this class is the well-known multiple shooting technique (e.g. the standard solvers RWPM [9], MUSN [19] and BVPSOL [5]).

There are various reasons for the recent revival of interest in sequential methods. One is the existence of widely available IVP software. Another is that global methods can require excessive storage in many situations, and the mesh selection procedure can be expensive.

In the case of linear differential equations

x′(t) = A(t)x(t) + g(t),  A(t) ∈ R^{n×n},  g(t) ∈ R^n,  (1.3)

existing shooting techniques can be classified as follows:

(i) methods constructed for problems with (most) general, i.e. nonseparated boundary conditions

B_a x(a) + B_b x(b) = β,  B_a, B_b ∈ R^{n×n},  β ∈ R^n  (1.4)

(e.g. simple shooting method, multiple shooting method), and

(ii) methods constructed for problems with special, i.e. separated or at least partially separated boundary conditions

[B_a^{(1)}; B_a^{(2)}] x(a) + [0; B_b^{(2)}] x(b) = [β_1; β_2],  (1.5)

where B_a^{(1)} ∈ R^{p×n}, B_a^{(2)}, B_b^{(2)} ∈ R^{q×n}, β_1 ∈ R^p, β_2 ∈ R^q, n = p + q (e.g. method of complementary functions, stabilized march method).

The advantage of the methods of category (ii) is that the number of IVPs to be solved and the dimension of the associated matrix system are normally less than for methods of category (i). This aspect is of paramount importance since in a typical application the integration techniques require more than 90% of the total computational effort of a shooting method. Therefore, the reduction of the number of integrations is the central point of research in this field. However, the shooting algorithms for nonlinear differential equations (1.1) which have been referred to up to now (see e.g. [1, 5, 24]) are all nonlinear versions of methods belonging to the first category. They have been developed to solve (1.1) under rather general nonlinear boundary conditions

r(x(a), x(b)) = 0,  r : Ω_2 → R^n,  Ω_2 ⊆ R^n × R^n.  (1.6)

Therefore, these nonlinear shooting methods do not take advantage of the special structure of the boundary conditions (e.g. separated or partially separated linear/nonlinear boundary conditions).

The purpose of this paper is to extend the principles of the methods of category (ii) to nonlinear BVPs. The resulting algorithms can be used to solve nonlinear differential equations (1.1) economically under partially separated nonlinear boundary conditions (1.2). A first attempt in this direction was undertaken in [7, 8, 14], but here we give a more rigorous and systematic approach to the problem. We demonstrate that the nonlinear stabilized march algorithm results immediately from the application of a so-called Generalized Brent Method to the nonlinear algebraic shooting equations. By means of the known convergence results for Brent's method it is possible to study the convergence of these nonlinear shooting algorithms. The linear stabilized march method [6, 13, 21] comes out as a special case.

It should be mentioned that an alternative approach consists in the combination of the quasilinearization method with the linear stabilized march method, as proposed in [1]. The resulting algorithm does not belong to the class of sequential techniques. Consequently, modern integration routines (with automatic step-size control) cannot be used directly but only together with interpolation methods or related techniques. It is precisely the sequential elimination of the nonlinear algebraic shooting equations (Brent method) that enables the construction of a nonlinear stabilized march method which possesses the characteristic features of a shooting method.

An outline of the paper is as follows. In section 2, we give a brief explanation of two standard shooting techniques, the simple shooting method and the multiple shooting method. These results will be used constantly in the subsequent sections.
In section 3, a direct approach is presented to transmit the principles of the method of complementary functions and the stabilized march method to nonlinear BVPs. In this context it has to be assumed that p boundary conditions are linear (separated) and q boundary conditions are nonlinear (nonseparated), where n = p + q. In section 4, we derive a so-called Generalized Brent Method for the nonlinear algebraic equations arising in shooting methods. We prove local convergence of the method. In section 5, we show how the multiple shooting method can be combined with the generalized Brent method to produce a stabilized march method for the BVP (1.1), (1.2). The methods of section 3 are special cases of this new shooting technique. Finally, section 6 is concerned with numerical experiments which compare the nonlinear stabilized march method with the standard multiple shooting method.

2 The standard techniques: SSM and MSM

In order to prepare the subsequent discussions, let us first consider two standard shooting techniques which are nonlinear analogues of methods belonging to category (i). Given the nonlinear BVP (1.1), (1.6)

x′(t) = f(t, x(t)),  a < t < b,  r(x(a), x(b)) = 0.  (2.1)

With (2.1) we associate the related IVP

u′(t) = f(t, u(t)),  u(a) = s ∈ R^n,  t ∈ [a, b].  (2.2)

Let u ≡ u(t, s) denote the solution of (2.2). The simple shooting method (SSM) determines the unknown vector s in (2.2) so that the corresponding trajectory u(t, s) satisfies the boundary conditions of (2.1). This yields a system of n nonlinear algebraic equations for s ∈ R^n:

F(s) ≡ r(s, u(b, s)) = 0.  (2.3)

If s = s* is a root of (2.3), then x*(t) ≡ u(t, s*) is a solution of (2.1). Conversely, for any solution x*(t) of (2.1) the vector s* ≡ x*(a) is a root of (2.3). Thus the BVP (2.1) is reduced to the problem of computing the zeros of F(s). To solve system (2.3), Newton's method is most frequently used. The Newton iterates {s_k}_{k=0}^∞ are defined for any s_0 ∈ S_ρ(s*) ≡ {s ∈ R^n : ‖s − s*‖ ≤ ρ} by

M_k c_k = q_k,  s_{k+1} = s_k − c_k,  k = 0, 1, …,  (2.4)

where M_k ≡ B_{a,k} + B_{b,k} X_k^e, q_k ≡ F(s_k), X_k^e = X(b, s_k) ≡ ∂u(b, s_k)/∂s, and B_{i,k} = B_i(s_k) ≡ ∂r(s_k, u(b, s_k))/∂x(i), i = a, b. Here X(t, s_k) satisfies the matrix IVP

X′(t) − A(t, s_k) X(t) = 0,  a < t < b,  X(a) = I,  (2.5)

where A(t, s_k) ≡ ∂f(t, u(t, s_k))/∂u.
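As a concrete illustration, here is a minimal pure-Python sketch of the SSM: a classical RK4 integrator stands in for the IVP solver, and the Newton matrix M_k of (2.4) is approximated column by column with forward differences instead of integrating the variational system (2.5). The test problem (x1′ = x2, x2′ = x1 with x1(0) = 0, x1(1) = sinh 1), the step counts and the tolerances are illustrative choices, not taken from the paper.

```python
import math

def rk4(f, t0, t1, x0, steps=200):
    """Integrate x' = f(t, x) from t0 to t1 with the classical RK4 scheme."""
    h = (t1 - t0) / steps
    t, x = t0, list(x0)
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
        k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
        k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t += h
    return x

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for j in range(n):
        p = max(range(j, n), key=lambda i: abs(M[i][j]))
        M[j], M[p] = M[p], M[j]
        for i in range(j + 1, n):
            m = M[i][j] / M[j][j]
            for l in range(j, n + 1):
                M[i][l] -= m * M[j][l]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][l] * x[l] for l in range(i + 1, n))) / M[i][i]
    return x

def simple_shooting(f, r, a, b, s, tol=1e-10, kmax=25):
    """Newton iteration (2.4) on F(s) = r(s, u(b, s)); Jacobian by forward differences."""
    n = len(s)
    for _ in range(kmax):
        F = r(s, rk4(f, a, b, s))
        if max(abs(v) for v in F) < tol:
            break
        J = [[0.0] * n for _ in range(n)]
        for j in range(n):              # columns of M_k by finite differences
            sp = list(s)
            eps = 1e-7 * (abs(s[j]) + 1)
            sp[j] += eps
            Fp = r(sp, rk4(f, a, b, sp))
            for i in range(n):
                J[i][j] = (Fp[i] - F[i]) / eps
        c = solve(J, F)
        s = [si - ci for si, ci in zip(s, c)]
    return s

# BVP: x1' = x2, x2' = x1 on (0, 1), x1(0) = 0, x1(1) = sinh(1); exact x1 = sinh(t)
f = lambda t, x: [x[1], x[0]]
r = lambda xa, xb: [xa[0], xb[0] - math.sinh(1.0)]
s = simple_shooting(f, r, 0.0, 1.0, [0.5, 0.5])
print(abs(s[1] - 1.0) < 1e-6)  # the computed slope x2(0) converges to cosh(0) = 1
```

Because the test problem is linear, the finite-difference Jacobian is exact up to roundoff and the iteration converges in essentially one Newton step.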

The convergence properties of this scheme have been thoroughly studied and are well known (see e.g. [13]). Computational difficulties often arise because the solutions of the IVPs (2.2) may diverge rapidly, such that a solution manifold loses dimensions. The multiple shooting method (MSM) attempts to circumvent this difficulty by requiring that IVPs be solved over smaller subintervals only. In particular, given

a = τ_0 < τ_1 < ⋯ < τ_{m−1} < τ_m = b,  (2.6)

on each segment [τ_j, τ_{j+1}], 0 ≤ j ≤ m − 1, an IVP is posed:

u_j′(t) = f(t, u_j(t)),  τ_j ≤ t ≤ τ_{j+1},  u_j(τ_j) = s_j ∈ R^n.  (2.7)

Using the notation u_j ≡ u_j(t, s_j), the continuity of the solution of (2.1) at each interior point τ_j is expressed by the conditions

u_j(τ_{j+1}, s_j) − s_{j+1} = 0,  j = 0, …, m − 2.  (2.8)

Writing equations (2.8) first and then the boundary conditions of (2.1),

r(s_0, u_{m−1}(b, s_{m−1})) = 0,  (2.9)

an algebraic system of mn equations is obtained:

F^{(m)}(s^{(m)}) ≡ [u_0(τ_1, s_0) − s_1; u_1(τ_2, s_1) − s_2; ⋮; u_{m−2}(τ_{m−1}, s_{m−2}) − s_{m−1}; r(s_0, u_{m−1}(b, s_{m−1}))] = 0,  s^{(m)} ≡ (s_0, …, s_{m−1})^T.  (2.10)

Applying Newton's method to (2.10) requires, at the kth stage of the iteration, solving the linear algebraic system

M_k^{(m)} c_k^{(m)} = q_k^{(m)},  s_{k+1}^{(m)} = s_k^{(m)} − c_k^{(m)}.

The coefficient matrix M_k^{(m)} has the following structure:

          [ X_{0,k}^e   −I                                ]
          [       ⋱        ⋱                              ]
M_k^{(m)} = [          X_{m−2,k}^e   −I                   ],  (2.11)
          [ B_{a,k}               B_{b,k} X_{m−1,k}^e     ]

where X_{j,k}^e ≡ ∂u_j(τ_{j+1}, s_{j,k})/∂s_j ∈ R^{n×n}, j = 0, …, m−1, q_k^{(m)} ≡ F^{(m)}(s_k^{(m)}), and B_{i,k} = B_i(s_k) ≡ ∂r(s_{0,k}, u_{m−1}(b, s_{m−1,k}))/∂x(i) ∈ R^{n×n}, i = a, b.
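The MSM iteration can be sketched in the same pure-Python spirit as the SSM above. For brevity the structured matrix M_k^{(m)} is approximated by full finite differences and the linear system is solved by dense Gaussian elimination, ignoring the fill-in economies exploited by production codes; the mesh, test problem and tolerances are illustrative.

```python
import math

def rk4(f, t0, t1, x0, steps=50):
    """Classical RK4 integrator for x' = f(t, x) on [t0, t1]."""
    h = (t1 - t0) / steps
    t, x = t0, list(x0)
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
        k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
        k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t += h
    return x

def gauss(A, b):
    """Dense Gaussian elimination with partial pivoting (no fill-in economy)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for j in range(n):
        p = max(range(j, n), key=lambda i: abs(M[i][j]))
        M[j], M[p] = M[p], M[j]
        for i in range(j + 1, n):
            m = M[i][j] / M[j][j]
            for l in range(j, n + 1):
                M[i][l] -= m * M[j][l]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][l] * x[l] for l in range(i + 1, n))) / M[i][i]
    return x

n, m = 2, 4
tau = [j / m for j in range(m + 1)]                    # mesh (2.6)
f = lambda t, x: [x[1], x[0]]                          # x1' = x2, x2' = x1
r = lambda xa, xb: [xa[0], xb[0] - math.sinh(1.0)]     # x1(0) = 0, x1(1) = sinh 1

def Fm(S):
    """The stacked shooting equations F^(m)(s^(m)) of (2.10)."""
    out = []
    for j in range(m - 1):                             # continuity conditions (2.8)
        u = rk4(f, tau[j], tau[j + 1], S[n * j:n * (j + 1)])
        out += [ui - si for ui, si in zip(u, S[n * (j + 1):n * (j + 2)])]
    ub = rk4(f, tau[m - 1], tau[m], S[n * (m - 1):n * m])
    out += r(S[0:n], ub)                               # boundary conditions (2.9)
    return out

S = [0.5] * (n * m)
for _ in range(20):                                    # plain Newton iteration (2.11)
    q = Fm(S)
    if max(abs(v) for v in q) < 1e-10:
        break
    J = [[0.0] * (n * m) for _ in range(n * m)]
    for j in range(n * m):                             # FD approximation of M_k^(m)
        Sp = list(S)
        eps = 1e-7 * (abs(S[j]) + 1)
        Sp[j] += eps
        qp = Fm(Sp)
        for i in range(n * m):
            J[i][j] = (qp[i] - q[i]) / eps
    c = gauss(J, q)
    S = [si - ci for si, ci in zip(S, c)]

print(all(abs(S[n * j] - math.sinh(tau[j])) < 1e-6 for j in range(m)))
```

The converged segment starting values s_j reproduce the exact trajectory x1 = sinh t at the mesh points.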

The convergence properties of scheme (2.11) are studied in [13]. Moreover, in the monograph [24] we describe an implementation of the MSM (code RWPM) for approximately solving (2.1) and give the complete FORTRAN source. The main features of RWPM are an automatic procedure for choosing the segmentation points, sophisticated linear and nonlinear equation solvers, and up-to-date codes for solving stiff and nonstiff IVPs.

3 Nonlinear problems with linear/nonlinear partially separated endconditions

In this section we demonstrate that the method of complementary functions (MCF) and the stabilized march method (SMM) designed for linear BVPs (see e.g. [6, 13, 24]) can be extended in a natural and straightforward way to handle nonlinear BVPs. Both shooting techniques belong to category (ii). For such an approach we have to assume that the first part of the boundary conditions (1.2) is linear. Later it will be proved that these methods are special cases of shooting algorithms which are based on the application of a generalized Brent method to the algebraic shooting equations (2.3) and (2.10). Then the assumption that r_p(x(a)) is a linear function can be dropped.

Consider the BVP

a) x′(t) = f(t, x(t)),  a < t < b,
b) r_p(x(a)) ≡ B_a^{(1)} x(a) − β_1 = 0,  (3.1)
c) r_q(x(a), x(b)) = 0,

where B_a^{(1)}, β_1 and r_q are defined as in (1.5) and (1.2). Assume rank(B_a^{(1)}) = p.

3.1 Nonlinear MCF

If we incorporate the particular form (3.1 b, c) of the boundary conditions (1.2) into (2.4), we find

[B_a^{(1)}; B_{a,k}^{(2)} + B_{b,k}^{(2)} X_k^e] c_k = [B_a^{(1)} s_k − β_1; r_q(s_k, u(b, s_k))],  (3.2)

where B_{i,k}^{(2)} = B_i^{(2)}(s_k) ≡ ∂r_q(s_k, u(b, s_k))/∂x(i), i = a, b.

Using the orthogonal factorization

(B_a^{(1)})^T = Q [U; 0] = [Q^{(1)} | Q^{(2)}] [U; 0] = Q^{(1)} U,  (3.3)

we transform the vector c_k as follows:

c_k = Q Q^T c_k = Q [y_k; z_k] = [Q^{(1)} | Q^{(2)}] [y_k; z_k],  (3.4)

where y_k ∈ R^p, z_k ∈ R^q, Q^{(1)} ∈ R^{n×p}, Q^{(2)} ∈ R^{n×q} and U ∈ R^{p×p} is upper triangular. Substituting (3.4) into (3.2), we obtain

[ U^T, 0; (B_{a,k}^{(2)} + B_{b,k}^{(2)} X_k^e) Q^{(1)}, (B_{a,k}^{(2)} + B_{b,k}^{(2)} X_k^e) Q^{(2)} ] [y_k; z_k] = [B_a^{(1)} s_k − β_1; r_q(s_k, u(b, s_k))].  (3.5)

The solution of the first block row is obvious. To treat the second block row numerically, u(b, s_k) and X_k^e have to be computed as solutions of the n + 1 IVPs (2.2) and (2.5). However, since X_k^e appears only in combination with Q^{(2)} and Q^{(1)} y_k, the terms X_k^e Q^{(2)} and X_k^e Q^{(1)} y_k can be approximated by discretized directional derivatives. Assume that the solution u(t_1; t_0, s) of the IVP u′ = f(t, u), u(t_0) = s has already been computed. Then Algorithm 3.1 produces an approximation XR ∈ R^{n×j} of X(t_1, t_0, s) R, where X(t_1, t_0, s) ≡ ∂u(t_1; t_0, s)/∂s ∈ R^{n×n} and R ∈ R^{n×j}; e^{(i)} denotes the ith unit vector in R^j and h is a small positive constant.

Remark 3.1 In the following algorithm the notation "{ }" is used to formulate comments.

Algorithm 3.1:

Input: {dimension} j; {interval} t_0, t_1; {point} s; {direction} R; {stepsize} h and {the function} f(t, x)
Set: ε = (‖s‖ + 1) h
For i = 1 : j Do
  Compute: u(t_1; t_0, s + εRe^{(i)})  {as the solution of the IVP u′ = f(t, u), u(t_0) = s + εRe^{(i)}}
  Set: XR_{·,i} = {u(t_1; t_0, s + εRe^{(i)}) − u(t_1; t_0, s)} / ε
Endfor
Output: XR
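Algorithm 3.1 translates almost line by line into code. The sketch below uses an RK4 integrator in place of the IVP solver and represents R by a list of its columns; the linear test system is chosen so that X(t_1, t_0) is known in closed form. All problem data are illustrative.

```python
import math

def rk4(f, t0, t1, x0, steps=100):
    """Classical RK4 integrator for u' = f(t, u) on [t0, t1]."""
    h = (t1 - t0) / steps
    t, x = t0, list(x0)
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
        k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
        k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t += h
    return x

def alg31(f, j, t0, t1, s, R, h=1e-6):
    """Algorithm 3.1: approximate X(t1, t0, s) R by discretized directional
    derivatives; R is given as a list of j column vectors of length n."""
    eps = (math.sqrt(sum(si * si for si in s)) + 1.0) * h
    u0 = rk4(f, t0, t1, s)                  # u(t1; t0, s)
    XR = []
    for i in range(j):
        sp = [si + eps * ri for si, ri in zip(s, R[i])]
        ui = rk4(f, t0, t1, sp)             # u(t1; t0, s + eps * R e^(i))
        XR.append([(a - b) / eps for a, b in zip(ui, u0)])
    return XR                               # XR[i] approximates X(t1, t0, s) R e^(i)

# check against the linear test system u' = (u2, 0): X(t1, t0) = [[1, t1 - t0], [0, 1]]
f = lambda t, u: [u[1], 0.0]
XR = alg31(f, 2, 0.0, 0.5, [0.3, -0.2], [[1.0, 0.0], [0.0, 1.0]])
print(XR[0], XR[1])  # approximately [1, 0] and [0.5, 1]
```

Only j + 1 integrations are performed, which is exactly the economy the text exploits below.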

We have to set t_0 = a, t_1 = b, j = q, s = s_k, R = Q^{(2)} [or t_0 = a, t_1 = b, j = 1, s = s_k, R = Q^{(1)} y_k] so that Algorithm 3.1 generates an approximation of X_k^e Q^{(2)} [or X_k^e Q^{(1)} y_k]. As can easily be seen, this strategy requires only the integration of q + 2 IVPs: u(b, s_k), u(b, s_k + ε Q^{(2)} e^{(i)}), i = 1(1)q, and u(b, s_k + ε Q^{(1)} y_k) have to be computed. The shooting technique described above (including Algorithm 3.1) is called the Newton form of the (nonlinear) MCF. Note that the first block row of (3.5) has to be solved only at the first step of the iteration process. For the next iterates y_k = 0, k ≥ 2, holds, and the iteration has to be executed for z_k only, i.e. the problem is reduced to q dimensions. The total number of IVPs to be integrated in the nonlinear MCF can be further reduced. Going back to system (3.5), we now change s_k so that the condition

B_a^{(1)} s_k = β_1  (3.6)

is fulfilled. The general solution ŝ_k of this underdetermined system can be represented in the form

ŝ_k = (B_a^{(1)})^+ β_1 + [I − (B_a^{(1)})^+ B_a^{(1)}] ω,  ω ∈ R^n,  (3.7)

where (B_a^{(1)})^+ denotes the Moore-Penrose pseudoinverse of B_a^{(1)}. Since rank(B_a^{(1)}) = p, we have

ŝ_k = Q^{(1)} U^{−T} β_1 + Q^{(2)} (Q^{(2)})^T ω.  (3.8)

If we set ω = s_k, then formula (3.8) transforms s_k into a vector ŝ_k which satisfies (3.6). Substituting (3.8) into (3.5), we obtain y_k = 0, and the n-dimensional system (3.2) is reduced to the following system of q linear equations for z_k ∈ R^q:

M̂_k z_k = q̂_k,  (3.9)

where M̂_k ≡ [B_a^{(2)}(ŝ_k) + B_b^{(2)}(ŝ_k) X(b, ŝ_k)] Q^{(2)} and q̂_k ≡ r_q(ŝ_k, u(b, ŝ_k)). Finally, the new iterate s_{k+1} is determined according to

s_{k+1} = ŝ_k − Q^{(2)} z_k.  (3.10)

As can easily be shown, this iterate immediately satisfies condition (3.6). Consequently, transformation (3.8) has to be executed only for the initial guess s_0 ∈ R^n. However, to avoid numerical instabilities this transformation should be performed at every kth step (k ≥ 10) of the iteration procedure. If we define t_0 = a, t_1 = b, j = q, s = ŝ_k, R = Q^{(2)}, then Algorithm 3.1 can be used to approximate X(b, ŝ_k) Q^{(2)} (see formula (3.9)) by discretized directional derivatives. In order to perform this approximation, q + 1 integrations are required, namely u(b, ŝ_k) and u(b, ŝ_k + ε Q^{(2)} e^{(i)}), i = 1(1)q. Thus the number of IVPs is diminished from n + 1 (SSM) to q + 1 (MCF). The shooting technique (3.3), (3.8)-(3.10) [including Algorithm 3.1] is called the standard form of the (nonlinear) MCF. In accordance with the linear MCF, the columns of X(b, s) Q^{(2)} are referred to as complementary functions. Both forms of the nonlinear MCF can be used to solve linear BVPs. In that case only one step of the iteration procedure is theoretically necessary. However, additional steps improve the accuracy of the results significantly in comparison with the linear MCF. They can be regarded as an iterative refinement, as known from numerical linear algebra.
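For the smallest nontrivial case (n = 2, p = q = 1) the standard form of the nonlinear MCF collapses to a few lines. The sketch below hard-codes the trivial factorization (3.3) for B_a^{(1)} = [1 0] (so Q^{(1)} = e_1, Q^{(2)} = e_2, U = [1]) and performs the q + 1 = 2 integrations per step; the test problem and tolerances are illustrative choices.

```python
import math

def rk4(f, t0, t1, x0, steps=200):
    """Classical RK4 integrator for u' = f(t, u) on [t0, t1]."""
    h = (t1 - t0) / steps
    t, x = t0, list(x0)
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
        k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
        k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t += h
    return x

# BVP (3.1) with n = 2, p = q = 1: B_a^(1) = [1 0], beta_1 = 0 (so x1(0) = 0)
# and r_q(x(a), x(b)) = x1(b) - sinh(1)
f = lambda t, x: [x[1], x[0]]
beta1, target = 0.0, math.sinh(1.0)

s = [0.5, 0.5]                       # initial guess s_0
for _ in range(25):
    s = [beta1, s[1]]                # transformation (3.8): enforce B_a^(1) s = beta_1
    ub = rk4(f, 0.0, 1.0, s)         # IVP 1 of q+1: u(b, s_hat)
    res = ub[0] - target             # q_hat = r_q(s_hat, u(b, s_hat))
    if abs(res) < 1e-10:
        break
    eps = 1e-6 * (math.hypot(*s) + 1)
    up = rk4(f, 0.0, 1.0, [s[0], s[1] + eps])   # IVP 2 of q+1: direction Q^(2)
    Mhat = (up[0] - ub[0]) / eps     # M_hat = B_b^(2) X(b, s_hat) Q^(2); B_a^(2) = 0 here
    s = [s[0], s[1] - res / Mhat]    # (3.9)/(3.10): s_{k+1} = s_hat - Q^(2) z_k
print(abs(s[1] - 1.0) < 1e-6)        # exact solution x1 = sinh t has x2(0) = 1
```

The iteration touches only the single free component of s, illustrating the reduction from n + 1 to q + 1 integrations per step.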

3.2 Nonlinear SMM

We now present nonlinear versions of the SMM for BVPs of type (3.1). Remember that the MSM involves at the kth stage of the algorithm the solution of the linear system (2.11). In the multiple shooting setting these linear equations are normally solved by Gaussian elimination with an economized fill-in (see e.g. [9, 18, 20]). If, however, the boundary conditions are separated, special block elimination techniques enable a considerable reduction of storage. Moreover, a significant economization on the number of integrations is obtained as a by-product. The latter result is much more important than the (already quite low) storage requirements of the shooting method. Substituting the special form (3.1 b, c) of the boundary conditions (1.2) into (2.11) and defining u_{j,k}^e ≡ u_j(τ_{j+1}, s_{j,k}), we obtain:

[ X_{0,k}^e   −I                                                  ] [ c_{0,k}   ]   [ u_{0,k}^e − s_{1,k}                                   ]
[      ⋱         ⋱                                                ] [    ⋮      ]   [          ⋮                                            ]
[           X_{m−2,k}^e   −I                                      ] [ c_{m−2,k} ] = [ u_{m−2,k}^e − s_{m−1,k}                               ]  (3.11)
[ [B_a^{(1)}; B_{a,k}^{(2)}]      [0; B_{b,k}^{(2)}] X_{m−1,k}^e  ] [ c_{m−1,k} ]   [ [B_a^{(1)} s_{0,k} − β_1; r_q(s_{0,k}, u_{m−1,k}^e)]  ]

The following transformations are based on two orthogonal matrices. They are constructed to rearrange the above matrix so that it splits into a 2 × 2 block matrix with a lower triangular (1,1)-block and a zero (1,2)-block. Using the orthogonal matrix Q_k ∈ R^{mn×mn} (see (3.14)), we perform a coordinate transformation

c_k^{(m)} = Q_k d_k^{(m)},  d_k^{(m)} ∈ R^{mn}.  (3.12)

Substituting (3.12) into (3.11) and multiplying the resulting system by the orthogonal matrix Q̃_k^T ∈ R^{mn×mn} (see (3.14)), we have

Q̃_k^T M_k^{(m)} Q_k d_k^{(m)} = Q̃_k^T q_k^{(m)}.  (3.13)

In (3.13) we let Q_k and Q̃_k be orthogonal matrices of the form

Q_k = diag[Q_{0,k}, Q_{1,k}, …, Q_{m−1,k}] ∈ R^{mn×mn},  (3.14)

and Q̃_k^T ∈ R^{mn×mn} the block matrix which applies I_p to the block row of the linear boundary condition, (Q_{i,k}^{(1)})^T, i = 1(1)m−1, to the continuity equations, then (Q_{i,k}^{(2)})^T, i = 1(1)m−1, to the continuity equations again, and I_q to the block row of r_q, the block rows being ordered so that all p-dimensional components come first, where the block matrices Q_{i,k}^{(1)} ∈ R^{n×p} and Q_{i,k}^{(2)} ∈ R^{n×q} are recursively defined:

i = 0: Compute the orthogonal factorization (3.3) of (B_a^{(1)})^T. Choose Q_{0,k} ≡ [Q_{0,k}^{(1)} | Q_{0,k}^{(2)}] = [Q^{(1)} | Q^{(2)}].  (3.15)

i = 1(1)m−1: Compute the following orthogonal factorization of X_{i−1,k}^e Q_{i−1,k}^{(2)}:

X_{i−1,k}^e Q_{i−1,k}^{(2)} = [Q̃_{i−1,k}^{(1)} | Q̃_{i−1,k}^{(2)}] [Ũ_{i−1,k}; 0].  (3.16)

Choose Q_{i,k} ≡ [Q_{i,k}^{(1)} | Q_{i,k}^{(2)}] = [Q̃_{i−1,k}^{(2)} | Q̃_{i−1,k}^{(1)}].  (3.17)

If we set

U_{0,k} ≡ U^T,  U_{i,k} ≡ Ũ_{i−1,k},  i = 1(1)m−1,
Z_{0,k} ≡ B_{a,k}^{(2)} Q_{0,k}^{(1)},  Z_{i,k} ≡ (Q_{i,k}^{(2)})^T X_{i−1,k}^e Q_{i−1,k}^{(1)},  i = 1(1)m−1,
Z_{m,k} ≡ B_{b,k}^{(2)} X_{m−1,k}^e Q_{m−1,k}^{(1)},  V_{i,k} ≡ (Q_{i,k}^{(1)})^T X_{i−1,k}^e Q_{i−1,k}^{(1)},  i = 1(1)m−1,  (3.18)
R_{a,k} ≡ B_{a,k}^{(2)} Q_{0,k}^{(2)},  R_{b,k} ≡ B_{b,k}^{(2)} X_{m−1,k}^e Q_{m−1,k}^{(2)},

then system (3.13) can be explicitly written as

[ U_{0,k}                        |                              ]
[ V_{1,k}  −I_p                  |              0               ]
[     ⋱        ⋱                 |                              ]   [ r_k^{(1)} ]   [ ψ_k^{(1)} ]
[     V_{m−1,k}  −I_p            |                              ]   [ ————————— ] = [ ————————— ]  (3.19)
[ ———————————————————————————————+————————————————————————————— ]   [ r_k^{(2)} ]   [ ψ_k^{(2)} ]
[ Z_{1,k}                        | U_{1,k}  −I_q                ]
[     ⋱                          |     ⋱        ⋱               ]
[     Z_{m−1,k}                  |     U_{m−1,k}  −I_q          ]
[ Z_{0,k}            Z_{m,k}     | R_{a,k}            R_{b,k}   ]

where

d_k^{(m)} ≡ [d_{0,k}^{(1)} | d_{0,k}^{(2)} | … | d_{m−1,k}^{(1)} | d_{m−1,k}^{(2)}]^T,  d_{i,k}^{(1)} ∈ R^p,  d_{i,k}^{(2)} ∈ R^q,
r_k^{(1)} ≡ [d_{0,k}^{(1)}, …, d_{m−1,k}^{(1)}]^T,  r_k^{(2)} ≡ [d_{0,k}^{(2)}, …, d_{m−1,k}^{(2)}]^T,
ψ_k^{(1)} ≡ [B_a^{(1)} s_{0,k} − β_1, (Q_{1,k}^{(1)})^T (u_{0,k}^e − s_{1,k}), …, (Q_{m−1,k}^{(1)})^T (u_{m−2,k}^e − s_{m−1,k})]^T,
ψ_k^{(2)} ≡ [(Q_{1,k}^{(2)})^T (u_{0,k}^e − s_{1,k}), …, (Q_{m−1,k}^{(2)})^T (u_{m−2,k}^e − s_{m−1,k}), r_q(s_{0,k}, u_{m−1,k}^e)]^T.

If the BVP (3.1) is completely separated, i.e. r_q = r_q(x(b)), then system (3.19) can be solved by the following stable block elimination technique, which works with minimal storage. Note that this assumption implies Z_{0,k} = R_{a,k} = 0.

Firstly, r_k^{(1)} is computed from the bidiagonal (1,1)-block matrix system. Then this result is substituted into the second half of the equations (3.19). Finally, r_k^{(2)} is determined from the bidiagonal (2,2)-block matrix system. An adequate implementation (i.e. one minimizing storage) of this solution strategy requires only mq + n² (m > 2) storage. Adding an iterative refinement increases the storage to m(n² + 2n + q) + n² − 2q. In comparison, Gaussian elimination requires approximately 3mn² (without iterative refinement) and (4m − 8)n² + 2mn (with iterative refinement) storage; see [12]. The CPU time is normally somewhat longer for the block elimination technique than for the Gaussian elimination (depending on the value of p and the orthonormalization method used). However, in a shooting algorithm the linear equation solver requires only a small percentage of the total CPU time; the major part is needed for the integration of the associated IVPs.

Let us return to the partially separated problem (3.1). As mentioned above, we are interested in a generalization of the linear SMM. There are at least two ways in which nonlinear versions of the SMM can be constructed. The resulting algorithms differ in the computational amount as well as in the convergence properties. The first approach begins with a block elimination of the (1,1)-block system of (3.19), i.e. r_k^{(1)} is determined from

U_{0,k} d_{0,k}^{(1)} = B_a^{(1)} s_{0,k} − β_1,
d_{i,k}^{(1)} = (Q_{i,k}^{(1)})^T (u_{i−1,k}^e − s_{i,k}) + (Q_{i,k}^{(1)})^T X_{i−1,k}^e Q_{i−1,k}^{(1)} d_{i−1,k}^{(1)},  i = 1(1)m−1.  (3.20)
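The minimal-storage claim rests on the forward-substitution character of this recursion: each block of r_k^{(1)} is obtained from its predecessor alone. A minimal sketch with scalar blocks (p = 1) and hypothetical data; the sign convention is the generic one for the block-bidiagonal shape stated in the docstring, not a transcription of the paper's bookkeeping.

```python
def solve_block_bidiagonal(U0, V, psi):
    """Forward substitution for a lower block-bidiagonal system of the shape of
    the (1,1)-block of (3.19):
        U0 * d[0] = psi[0],
        V[i-1] * d[i-1] - d[i] = psi[i],   i = 1, ..., m-1.
    Scalar blocks (p = 1) keep the sketch short; for p > 1 each product becomes
    a small matrix-vector multiply and the division a triangular solve.  Only
    the previous block of d is needed at any time (the minimal-storage point)."""
    d = [psi[0] / U0]
    for i in range(1, len(psi)):
        d.append(V[i - 1] * d[-1] - psi[i])
    return d

# hypothetical scalar data with m = 4
U0, V, psi = 2.0, [0.5, -1.5, 3.0], [4.0, 1.0, 0.5, -2.0]
d = solve_block_bidiagonal(U0, V, psi)
# verify every block row of the system
print(abs(U0 * d[0] - psi[0]) < 1e-12 and
      all(abs(V[i] * d[i] - d[i + 1] - psi[i + 1]) < 1e-12 for i in range(3)))
```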

Since the matrix X_{i−1,k}^e appears in (3.20) only in combination with Q_{i−1,k}^{(1)} d_{i−1,k}^{(1)}, the corresponding terms can be approximated by discretized directional derivatives. Using Algorithm 3.1 (t_0 = τ_{i−1}, t_1 = τ_i, j = 1, s = s_{i−1,k}, R = Q_{i−1,k}^{(1)} d_{i−1,k}^{(1)}, i = 1(1)m−1), these approximations can be done with 2(m − 1) integrations. Now, substituting r_k^{(1)} into (3.19), the vector r_k^{(2)} is the solution of the (2,2)-block matrix system of (3.19); the iteration index k is dropped for simplicity:

[ U_1   −I_q                                                 ] [ d_0^{(2)}     ]   [ (Q_1^{(2)})^T (u_0^e − s_1) − (Q_1^{(2)})^T X_0^e Q_0^{(1)} d_0^{(1)}                         ]
[     ⋱     ⋱                                                ] [    ⋮          ]   [          ⋮                                                                                     ]
[     U_{m−1}   −I_q                                         ] [ d_{m−2}^{(2)} ] = [ (Q_{m−1}^{(2)})^T (u_{m−2}^e − s_{m−1}) − (Q_{m−1}^{(2)})^T X_{m−2}^e Q_{m−2}^{(1)} d_{m−2}^{(1)} ]  (3.21)
[ B_a^{(2)} Q_0^{(2)}        B_b^{(2)} X_{m−1}^e Q_{m−1}^{(2)} ] [ d_{m−1}^{(2)} ]   [ r_q(s_0, u_{m−1}^e) − B_a^{(2)} Q_0^{(1)} d_0^{(1)} − B_b^{(2)} X_{m−1}^e Q_{m−1}^{(1)} d_{m−1}^{(1)} ]

In order to solve system (3.21), the following information is required: u_{i,k}^e and X_{i,k}^e Q_{i,k}^{(1)} d_{i,k}^{(1)}, i = 0(1)m−1; X_{i−1,k}^e Q_{i−1,k}^{(2)} (yielding U_{i,k}), i = 1(1)m−1; and X_{m−1,k}^e Q_{m−1,k}^{(2)}. Note that u_{i,k}^e and X_{i,k}^e Q_{i,k}^{(1)} d_{i,k}^{(1)}, i = 0(1)m−2, have already been computed in the course of determining the d_{i,k}^{(1)}'s (see formula (3.20)). To determine u_{m−1,k}^e as well as an approximation of X_{m−1,k}^e Q_{m−1,k}^{(1)} d_{m−1,k}^{(1)} (using Algorithm 3.1), two additional integrations are necessary. If we set t_0 = τ_{i−1}, t_1 = τ_i, j = q, s = s_{i−1,k}, R = Q_{i−1,k}^{(2)}, i = 1(1)m, Algorithm 3.1 can be used to approximate X_{i−1,k}^e Q_{i−1,k}^{(2)}, i = 1(1)m, on the basis of mq integrations. Thus, at the kth stage of our first nonlinear stabilized march algorithm, which we call the Newton form of the SMM, a total of (q + 2)m IVPs have to be integrated. Compared with the MSM we have a reduction of (p − 1)m IVPs. Furthermore, the dimension of the corresponding linear algebraic system is diminished from mn to mq.

Once a solution d_k^{(m)} of the equations (3.19) has been obtained, a new iterate s_{k+1}^{(m)} is computed according to

s_{k+1}^{(m)} = s_k^{(m)} − Q_k d_k^{(m)}.  (3.22)

Obviously, the iteration (3.22) proceeds in the same way as Newton's method applied to (3.11). Thus, damping and regularization strategies can also be used successfully in (3.22). For linear BVPs only one step of the iteration procedure (3.22) is theoretically necessary. However, it should be mentioned that this iteration step does not agree with the linear SMM (here, (q + 1)m integrations have to be performed). The second approach to generalizing the linear SMM is based on a decoupling of the system (3.19) by introducing an intermediate step involving the transformation of s_{0,k}, …, s_{m−1,k} into vectors ŝ_{0,k}, …, ŝ_{m−1,k} which satisfy

(a) (Q_{i,k}^{(1)})^T (u_{i−1,k}^e − ŝ_{i,k}) = 0,  i = 1(1)m−1,
(b) B_a^{(1)} ŝ_{0,k} − β_1 = 0.  (3.23)

Then in (3.19) the regularity of the (1,1)-block matrix implies r_k^{(1)} = 0, and system (3.19) is reduced to the following mq-dimensional system:

M̂_k^{(m)} z_k^{(m)} = q̂_k^{(m)},  (3.24)

where M̂_k^{(m)} denotes the (2,2)-block matrix in (3.19), z_k^{(m)} ≡ r_k^{(2)} and q̂_k^{(m)} ≡ ψ_k^{(2)}. The vectors ŝ_{i,k}, i = 0, …, m−1, satisfying conditions (3.23) can be constructed in the same way as in section 3.1. In particular, the general solution of the underdetermined system (3.23 a) is

ŝ_{i,k} = {(Q_{i,k}^{(1)})^T}^+ (Q_{i,k}^{(1)})^T u_{i−1,k}^e + [I − {(Q_{i,k}^{(1)})^T}^+ (Q_{i,k}^{(1)})^T] ω_{i,k},  ω_{i,k} ∈ R^n.  (3.25)

Since the matrices Q_{i,k}^{(1)} have full rank, (3.25) can be written in the form

ŝ_{i,k} = Q_{i,k}^{(1)} (Q_{i,k}^{(1)})^T u_{i−1,k}^e + Q_{i,k}^{(2)} (Q_{i,k}^{(2)})^T ω_{i,k}.  (3.26)

The general solution of (3.23 b) is (see formula (3.8)):

ŝ_{0,k} = Q_{0,k}^{(1)} (U_{0,k})^{−1} β_1 + Q_{0,k}^{(2)} (Q_{0,k}^{(2)})^T ω_{0,k}.  (3.27)

If we set ω_{i,k} ≡ s_{i,k}, i = 0(1)m−1, then formulas (3.26), (3.27) transform s_{i,k} into vectors ŝ_{i,k} which satisfy (3.23). Finally, it should be mentioned that the number of integrations can also be diminished. In order to accomplish this, we note that in (3.24) the matrices X_{i,k}^e appear only as X_{i,k}^e Q_{i,k}^{(2)}. Therefore, if we approximate these latter terms by discretized directional derivatives (using Algorithm 3.1), the amount of work decreases significantly: from m(n + 1) IVPs [MSM] to m(q + 1) IVPs [SMM]. Once a solution z_k^{(m)} of the system (3.24) has been obtained, the new iterate s_{k+1}^{(m)} is computed according to
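The projector identity behind (3.26) is easy to check numerically. In the sketch below Q1 and Q2 are the columns of a 2 × 2 rotation (any orthogonal splitting works), and the assertion verifies condition (3.23 a); all data are hypothetical.

```python
import math

def shat(Q1, Q2, u_prev, omega):
    """Transformation (3.26): s_hat = Q1 Q1^T u_prev + Q2 Q2^T omega, so that
    Q1^T (u_prev - s_hat) = 0.  Q1 (n x p) and Q2 (n x q) are orthonormal
    column blocks, each given as a list of columns."""
    def proj(cols, v):  # y = C (C^T v) for C given as a list of columns
        out = [0.0] * len(v)
        for c in cols:
            coeff = sum(ci * vi for ci, vi in zip(c, v))
            for i, ci in enumerate(c):
                out[i] += coeff * ci
        return out
    return [a + b for a, b in zip(proj(Q1, u_prev), proj(Q2, omega))]

# n = 2, p = q = 1: take Q1, Q2 as the columns of a rotation by 30 degrees
c, sn = math.cos(math.pi / 6), math.sin(math.pi / 6)
Q1, Q2 = [[c, sn]], [[-sn, c]]
sh = shat(Q1, Q2, [1.0, 2.0], [0.7, -0.4])
# condition (3.23 a): Q1^T (u_prev - s_hat) = 0
print(abs(sum(q * (u - v) for q, u, v in zip(Q1[0], [1.0, 2.0], sh))) < 1e-9)
```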

s_{k+1}^{(m)} = s_k^{(m)} − diag[Q_{0,k}^{(2)}, …, Q_{m−1,k}^{(2)}] z_k^{(m)}.  (3.28)

We call the shooting technique (3.24) - (3.28) the standard form of the SMM.

4 A σ-stage GBM for the numerical treatment of systems of nonlinear algebraic equations

Systems of nonlinear algebraic equations are the basis of each shooting technique. In this chapter we develop and study a new Generalized Brent Method (GBM) for the numerical treatment of such systems. The application of the GBM to the algebraic shooting equations (2.10) leads to a new (nonlinear) shooting method which is economized with respect to the number of IVPs to be solved. In particular, if we consider BVPs of the form (1.1), (1.2), the new method requires only m(q + 1) integrations per iteration step. This gives rise to its further classification as a stabilized march method. All variants of the SMM (linear and nonlinear) which have been referred to up till now are special cases of this shooting technique.

4.1 The algorithm

Consider the ν-dimensional system of nonlinear algebraic equations

F(x) = 0,  (4.1)

where F : D_F ⊆ R^ν → R^ν belongs to C^s (s > 1). Let J(x) be the Jacobian of F(x). Assume that an isolated solution x* of (4.1) exists, i.e. J(x*) is nonsingular, and that J(x) satisfies a Lipschitz condition with constant L > 0 for all x ∈ S(x*, δ*), δ* > 0. The GBM generates a sequence of approximate solutions {x^{(k)}}_{k=0}^∞ to (4.1). A starting vector x^{(0)} has to be chosen which is close to the exact solution x*. The iteration x^{(k)} → x^{(k+1)} is performed by intermediate steps y_1, y_2, …, y_{σ+1}, where y_1 ≡ x^{(k)} and x^{(k+1)} ≡ y_{σ+1}. The transition from y_i to y_{i+1} is called the ith elimination step, and the number of elimination steps σ is referred to as the stage of the method. In the following, the iteration index k is often dropped for simplicity. Throughout the remainder of this paper we denote by ‖·‖ the 2-norm of a vector or a matrix.

Before executing the method the user has to fix two constants κ, ω ∈ [1, ∞) as well as the integers n_1, …, n_σ > 0 with Σ_{j=1}^{σ} n_j = ν. At the beginning of the ith elimination step (i = 1, …, σ) a matrix A_i(y_i) has to be chosen in such a way that

(i) rank(A^{(i)}) = l_i, where A^{(i)} ≡ [A_1(y_1); …; A_i(y_i)] ∈ R^{l_i×ν}, l_i ≡ Σ_{j=1}^{i} n_j;

(ii) if (A^{(i)})^T is orthogonally factorized in the form (A^{(i)})^T = [Q_i | (Q_A^{(i)})^T] [U_A; 0] (the two column blocks having widths l_i and ν − l_i), then the matrix Γ_i ≡ [A_1(y_1); …; A_i(y_i); Q_A^{(i)}] ∈ R^{ν×ν} fulfills ‖Γ_i‖ ≤ κ and ‖Γ_i^{−1}‖ ≤ ω.  (4.2)

The relatively large degree of freedom in the choice of the stage σ, the matrices A_i, the subdimensions n_1, …, n_σ and the constants ω, κ enables a great variety of specializations. Thus, the method of Brent [4] as well as the orthogonalization techniques described in [22, 17, 16] are particular cases of our GBM. However, for the construction of a nonlinear stabilized march method only one choice of the parameters is necessary (see section 5). Let us now demonstrate the principles of the method for the case σ = 3. Given the iterate

y_1 = x^{(k)}, we solve successively the following systems for their minimum 2-norm solutions:

Ā_1 z^{(1)} = −A_1(y_1) F(y_1),  [Ā_1; Ā_2] z^{(2)} = [0; −A_2(y_2) F(y_2)],
[Ā_1; Ā_2; Ā_3] z^{(3)} = [0; 0; −A_3(y_3) F(y_3)],  (4.3)

where Ā_i ≡ A_i(y_i) J(y_i), y_{i+1} = y_i + z^{(i)} (i = 1, 2, 3) and x^{(k+1)} = y_4. The minimum 2-norm solution of (4.3, a) can be written in the form

z^{(1)} = −Ā_1^+ A_1(y_1) F(y_1).  (4.4)

Here Ā_1^+ denotes the Moore-Penrose pseudoinverse of Ā_1. Assume that rank(Ā_1) = n_1. Then the orthogonal factorization

Ā_1^T = [Q_1^{(1)} | Q_2^{(1)}] [U_{11}; 0] = Q_1^{(1)} U_{11}

(with Q_1^{(1)} of width n_1 and Q_2^{(1)} of width n_2 + n_3) yields Ā_1^+ = Q_1^{(1)} U_{11}^{−T}, and the minimum 2-norm solution is

z^{(1)} = −Q_1^{(1)} U_{11}^{−T} A_1(y_1) F(y_1).  (4.5)

Assume that the matrices of the linear systems (4.3, b) and (4.3, c) have full rank. Then the solutions z^{(2)} and z^{(3)} of these systems can be determined in a similar manner. Using the orthogonal factorizations

[Ā_1^T | Ā_2^T] = [Q_1^{(1)} | Q_1^{(2)} | Q_2^{(2)}] [U_{11}, U_{12}; 0, U_{22}; 0, 0]  (4.6)

(column blocks of widths n_1, n_2, n_3) and

[Ā_1^T | Ā_2^T | Ā_3^T] = [Q_1^{(1)} | Q_1^{(2)} | Q_1^{(3)}] [U_{11}, U_{12}, U_{13}; 0, U_{22}, U_{23}; 0, 0, U_{33}],  (4.7)

the minimum 2-norm solutions of (4.3, b) and (4.3, c) can be expressed in the form

z^{(2)} = [Q_1^{(1)} | Q_1^{(2)}] [U_{11}, U_{12}; 0, U_{22}]^{−T} [0; −A_2(y_2) F(y_2)] = −Q_1^{(2)} U_{22}^{−T} A_2(y_2) F(y_2),  (4.8)

z^{(3)} = [Q_1^{(1)} | Q_1^{(2)} | Q_1^{(3)}] [U_{11}, U_{12}, U_{13}; 0, U_{22}, U_{23}; 0, 0, U_{33}]^{−T} [0; 0; −A_3(y_3) F(y_3)] = −Q_1^{(3)} U_{33}^{−T} A_3(y_3) F(y_3).  (4.9)

Note that only Q_1^{(2)} and U_{22} [Q_1^{(3)} and U_{33}] appear in formula (4.8) [(4.9)]. If the factorization (4.6) is multiplied by Q_2^{(1)} (Q_2^{(1)})^T, we get

Q_2^{(1)} (Q_2^{(1)})^T [Ā_1^T | Ā_2^T] = [0 | Q_1^{(2)} | Q_2^{(2)}] [U_{11}, U_{12}; 0, U_{22}; 0, 0] = [0 | Q_1^{(2)} U_{22}].  (4.10)

Thus, in order to compute (4.8) it suffices to determine the factorization

Q_2^{(1)} (Q_2^{(1)})^T Ā_2^T = Q_1^{(2)} U_{22}

instead of (4.6). Moreover, since in (Q_2^{(1)})^T Ā_2^T = (Q_2^{(1)})^T J(y_2)^T A_2(y_2)^T the matrix J appears only in combination with Q_2^{(1)}, the term J(y_2) Q_2^{(1)} can be approximated by discretized directional derivatives.

Let Q̂_2^{(2)} be determined from the trivial orthogonal factorization

[Q_1^{(1)} | Q_1^{(2)}] = [Q_1^{(1)} | Q_1^{(2)} | Q̂_2^{(2)}] [I_{n_1}, 0; 0, I_{n_2}; 0, 0].

The multiplication of (4.7) by Q̂_2^{(2)} (Q̂_2^{(2)})^T results in a formula which is similar to (4.10):

Q̂_2^{(2)} (Q̂_2^{(2)})^T [Ā_1^T | Ā_2^T | Ā_3^T] = [0 | 0 | Q_1^{(3)} U_{33}].

Thus, for the determination of Q_1^{(3)} and U_{33} in (4.9) only the orthogonal factorization

Q̂_2^{(2)} (Q̂_2^{(2)})^T Ā_3^T = Q_1^{(3)} U_{33}

and the computation of J(y_3) Q̂_2^{(2)} by discretized directional derivatives are necessary.

Remark 4.1 Since (Q̂_2^{(2)})^T Ā_3^T is a nonsingular n_3 × n_3 matrix, the vector z^{(3)} can be computed by the following formula, which is an alternative to (4.9):

z^{(3)} = −Q̂_2^{(2)} [A_3(y_3) J(y_3) Q̂_2^{(2)}]^{−1} A_3(y_3) F(y_3).  (4.11)

The above 3-stage GBM allows us to formulate a general σ-stage GBM as follows. Note: In the following algorithm we use the notation described in Remark 3.1.

Algorithm 4.1:
01  Input: {dimensions} \mu, \sigma, n_1, ..., n_\sigma; {tolerances} TOL1, TOL2, kmax;
           {starting vector} x^{(0)} and {the function} F(x)
02  For k = 0 : kmax Do
03    Set: y_1 = x^{(k)} and \hat Q_2^{(0)} = I_\mu
04    For i = 1 : \sigma - 1 Do
05      Choose: A_i(y_i) {which satisfies (4.2)}
06      Compute: f_i = A_i(y_i) F(y_i)
07      Compute: Z^{(i)} = A_i(y_i) J(y_i) \hat Q_2^{(i-1)} {by directional derivatives}
08      Compute: Q_1^{(i)}, U^{(i)} {by the orthogonal factorization \hat Q_2^{(i-1)} (Z^{(i)})^T = Q_1^{(i)} U^{(i)}}
09      Compute: \hat Q_2^{(i)} {by the trivial orthogonal factorization
            [Q_1^{(1)} | ... | Q_1^{(i)}] = [Q_1^{(1)} | ... | Q_1^{(i)} | \hat Q_2^{(i)}] \begin{bmatrix} I_{l_i} \\ 0 \end{bmatrix}}
10      If det(U^{(i)}) = 0 Then Stop {unsuitable starting vector}
11      Set: y_{i+1} = y_i - Q_1^{(i)} (U^{(i)})^{-T} f_i
12    Endfor {end of the i-th elimination step}
      {perform the final (\sigma-th) elimination step}
13    Choose: A_\sigma(y_\sigma) {which satisfies (4.2)}
14    Compute: f_\sigma = A_\sigma(y_\sigma) F(y_\sigma)
15    If Fnorm \equiv \sqrt{\sum_{i=1}^{\sigma} \|f_i\|^2} \le TOL1 Then Goto Output {small residuals}
16    Compute: Z^{(\sigma)} = A_\sigma(y_\sigma) J(y_\sigma) \hat Q_2^{(\sigma-1)} {by directional derivatives}
17    If det(Z^{(\sigma)}) = 0 Then Stop {unsuitable starting vector}
18    Compute: Q_1^{(\sigma)}, U^{(\sigma)} {by the orthogonal factorization \hat Q_2^{(\sigma-1)} (Z^{(\sigma)})^T = Q_1^{(\sigma)} U^{(\sigma)}}
19    Set: x^{(k+1)} = y_{\sigma+1} = y_\sigma - Q_1^{(\sigma)} (U^{(\sigma)})^{-T} f_\sigma
20    If \|x^{(k+1)} - x^{(k)}\| \le TOL2 Then Goto Output {stationary point}
21  Endfor {end of the k-th iteration step}
22  Output: x^{(k+1)}, Fnorm

Remark 4.2 Considering Remark 4.1, there is a second possibility to determine y_{\sigma+1}: Replace the lines 18 and 19 by

18'   Compute: d {as the solution of the linear system Z^{(\sigma)} d = f_\sigma}
19'   Set: x^{(k+1)} = y_{\sigma+1} = y_\sigma - \hat Q_2^{(\sigma-1)} d

The Algorithm 4.1 modified by the lines 18' and 19' is called Algorithm 4.2.
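The elimination pattern of Algorithm 4.1 (correct y inside the orthogonal complement of the directions already used) can be sketched compactly in NumPy. This is a minimal illustrative Brent-type sweep of our own, assuming the simplest admissible row selectors A_i (rows of the identity) and finite-difference directional derivatives; it is not the RWPM implementation, and the toy system and all function names are hypothetical:

```python
import numpy as np

def directional_jacobian(F, y, V, h=1e-7):
    # finite-difference approximation of J(y) @ V (directional derivatives)
    return np.column_stack([(F(y + h * v) - F(y)) / h for v in V.T])

def gbm_iterate(F, x0, row_blocks, sweeps=8):
    """Brent-type generalized elimination, one sweep = sigma stages.

    row_blocks: list of index arrays partitioning the equations of F."""
    x = np.asarray(x0, dtype=float)
    mu = x.size
    for _ in range(sweeps):
        y = x.copy()
        Q2_hat = np.eye(mu)                      # \hat Q_2^{(0)} = I_mu
        Q_cols = np.empty((mu, 0))
        for idx in row_blocks:
            A = np.eye(mu)[idx]                  # row selector A_i
            f = A @ F(y)
            Z = A @ directional_jacobian(F, y, Q2_hat)
            Q, U = np.linalg.qr(Q2_hat @ Z.T)    # = Q_1^{(i)} U^{(i)}
            y = y - Q @ np.linalg.solve(U.T, f)  # stage update
            Q_cols = np.column_stack([Q_cols, Q])
            # \hat Q_2^{(i)}: orthonormal complement of processed directions
            Q_full, _ = np.linalg.qr(Q_cols, mode="complete")
            Q2_hat = Q_full[:, Q_cols.shape[1]:]
        x = y
    return x

# toy system with root (1, 1), split into two 1x1 row blocks
F = lambda v: np.array([v[0] ** 2 + v[1] ** 2 - 2.0, v[0] - v[1]])
root = gbm_iterate(F, [2.0, 0.5], [np.array([0]), np.array([1])])
assert np.allclose(root, [1.0, 1.0], atol=1e-8)
```

Each stage solves only its own block of linearized equations, yet by construction it does not destroy the corrections of earlier blocks, which is the property exploited in the convergence proof below.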

4.2 Convergence of the method

The following theorem contains the basic result on the convergence of the \sigma-stage GBM.

Theorem 4.1 Assume that F : D_F \subseteq R^\mu \to R^\mu is differentiable on D_F, that there exists an isolated solution x^* of (4.1), i.e. J(x^*) is nonsingular, and that J(x) satisfies a Lipschitz condition with the constant L > 0 for all x \in S(x^*, \delta^*) \subseteq int(D_F), where S(x^*, \delta^*) \equiv \{x \in R^\mu : \|x^* - x\| \le \delta^*\}, \delta^* > 0. Suppose that the matrices A_i(y) satisfy the conditions (4.2) (with the constants \kappa and \omega). Then real constants \delta and C, with \delta^* \ge \delta > 0, C > 0 and C \delta < 1, exist so that the iterates x^{(k)} of the \sigma-stage GBM (Algorithm 4.1) are well-defined for all x^{(0)} \in S(x^*, \delta), remain in S(x^*, \delta) and converge to x^*. Moreover, the error estimate

    \|x^{(k+1)} - x^*\| \le C \|x^{(k)} - x^*\|^2,  k = 0, 1, ...,      (4.12)

holds.

Proof: The proof is based on the assumption that all directional derivatives which appear in Algorithm 4.1 are computed exactly. The influence of approximated directional derivatives is studied in [12].

Let \alpha, \beta \in R be defined by \alpha \equiv \|J(x^*)\| and \beta \equiv \|J(x^*)^{-1}\| \omega. Fix an arbitrary q \in (0, 1) and set \gamma \equiv \beta/(1-q). In the sequel the following notation is used:

    \delta_1 \equiv \min\{\delta^*, q/(\beta \kappa \sigma L), 2q/(\beta L)\};   N_0 \equiv (L \delta_1/2 + \alpha) \kappa;
    c_0 \equiv 1 + \gamma N_0;   c_1 \equiv c_0^\sigma;   c_2 \equiv 2 L \kappa c_1^2;   c_3 \equiv \gamma \omega \sqrt{\sigma} c_2.

We can prove that the statements of the theorem are fulfilled by

    \delta \equiv \min\{q/c_3, \delta_1/c_1\}  and  C = c_3.

In order to do this we show by induction (for \nu = 1(1)\sigma) that

    (i)    \|A_\nu(y_\nu) F(y_\nu)\| \le N_0 \|y_\nu - x^*\|;
    (ii)   \|(U^{(\nu)})^{-T}\| \le \gamma and y_{\nu+1} is well-defined;
    (iii)  \|y_{\nu+1} - x^*\| \le c_0^\nu \|x^{(k)} - x^*\| \le c_1 \delta \le \delta_1;      (4.13)
    (iv)   \|A_j(y_j) F(y_{\nu+1})\| \le c_2 \|x^{(k)} - x^*\|^2 for all j \le \nu.

Suppose the relations (i)-(iv) are satisfied for \nu = 1, ..., i-1.

Proposition (i):

    \|A_i(y_i) F(y_i)\| \le \|A_i(y_i)\| \|F(y_i)\|
        \le \kappa \{\|F(y_i) - F(x^*) - J(x^*)(y_i - x^*)\| + \|J(x^*)(y_i - x^*)\|\}
        \le \kappa \{(L/2) \|y_i - x^*\|^2 + \|J(x^*)\| \|y_i - x^*\|\}      (4.14)
        \le \kappa \{(L/2) \|y_i - x^*\| + \|J(x^*)\|\} \|y_i - x^*\| \le N_0 \|y_i - x^*\|.

Proposition (ii): Relation (iii) implies y_1, ..., y_i \in S(x^*, \delta_1). Furthermore, for \nu = 1(1)i-1, the following orthogonal factorizations have been computed:

    \hat Q_2^{(\nu-1)} (Z^{(\nu)})^T = Q_1^{(\nu)} U^{(\nu)}  and
    [Q_1^{(1)} | ... | Q_1^{(\nu)}] = [Q_1^{(1)} | ... | Q_1^{(\nu)} | \hat Q_2^{(\nu)}] \begin{bmatrix} I_{l_\nu} \\ 0 \end{bmatrix} = [\hat Q_1^{(\nu)} | \hat Q_2^{(\nu)}] \begin{bmatrix} I_{l_\nu} \\ 0 \end{bmatrix},      (4.15)

where \hat Q_2^{(0)} \equiv I_\mu. We now define the matrix

    F_i \equiv \begin{bmatrix} Z^{(1)} (\hat Q_2^{(0)})^T \\ Z^{(2)} (\hat Q_2^{(1)})^T + A_2(y_2) J(y_2) \hat Q_1^{(1)} (\hat Q_1^{(1)})^T \\ \vdots \\ Z^{(i)} (\hat Q_2^{(i-1)})^T + A_i(y_i) J(y_i) \hat Q_1^{(i-1)} (\hat Q_1^{(i-1)})^T \\ Q^{A(i)} J(x^*) \end{bmatrix},

where Q^{A(i)} is given in (4.2), and we show that F_i^{-1} is bounded. Defining \tilde J_i \equiv [A_1 J(y_1); ...; A_i J(y_i); Q^{A(i)} J(x^*)], we have

    \|F_i - \Omega^{(i)} J(x^*)\| \le \|F_i - \tilde J_i\| + \|\tilde J_i - \Omega^{(i)} J(x^*)\|
        \le \|(Z^{(1)} - A_1 J(y_1) \hat Q_2^{(0)})(\hat Q_2^{(0)})^T\| + ... + \|(Z^{(i)} - A_i J(y_i) \hat Q_2^{(i-1)})(\hat Q_2^{(i-1)})^T\|
            + \|A_1\| \|J(y_1) - J(x^*)\| + ... + \|A_i\| \|J(y_i) - J(x^*)\|
        \le \|\Omega_1\| \|J(y_1) - J(x^*)\| + ... + \|\Omega_i\| \|J(y_i) - J(x^*)\|
        \le \kappa L \{\|y_1 - x^*\| + ... + \|y_i - x^*\|\}
        \le \kappa \sigma L \delta_1.

Since \|(\Omega^{(i)} J(x^*))^{-1}\| \le \|J(x^*)^{-1}\| \|(\Omega^{(i)})^{-1}\| \le \beta and \beta \kappa \sigma L \delta_1 \le q < 1, the assumptions of the Perturbation Lemma (I + E is invertible if \|E\| < 1) are fulfilled. Thus, F_i is nonsingular and F_i^{-1} is bounded by \|F_i^{-1}\| \le \beta/(1-q) = \gamma.

Let us consider the following orthogonal factorization of F_i^T:

    F_i^T = [\tilde Q_1^{(1)} | ... | \tilde Q_1^{(i)} | \tilde Q_2^{(i)}] \begin{bmatrix} \tilde U_{11} & \cdots & \tilde U_{1,i+1} \\ & \ddots & \vdots \\ & & \tilde U_{i+1,i+1} \end{bmatrix}.      (4.16)

Obviously, \|F_i^{-T}\| \le \gamma implies

    \|\tilde U_{ii}^{-1}\| = \|\tilde U_{ii}^{-T}\| \le \gamma.      (4.17)

In order to express (4.17) in terms of U^{(i)} we next show by induction that

    \tilde Q_1^{(\nu)} \tilde U_{\nu\nu} = \hat Q_2^{(\nu-1)} (Z^{(\nu)})^T = Q_1^{(\nu)} U^{(\nu)},  \nu = 1(1)i,      (4.18)

where the matrices \hat Q_2^{(\nu-1)} are determined by (4.15). The case \nu = 1 follows immediately from (4.15), (4.16). Suppose that (4.18) is valid for \nu = 1(1)i-1. Multiplying (4.16) by the orthogonal projection \hat Q_2^{(i-1)} (\hat Q_2^{(i-1)})^T, we obtain

    \hat Q_2^{(i-1)} (\hat Q_2^{(i-1)})^T F_i^T = [0 | ... | 0 | \tilde Q_1^{(i)} | \tilde Q_2^{(i)}] \begin{bmatrix} \tilde U_{11} & \cdots & \tilde U_{1,i+1} \\ & \ddots & \vdots \\ & & \tilde U_{i+1,i+1} \end{bmatrix}.      (4.19)

This implies

    \tilde Q_1^{(i)} \tilde U_{ii} = \hat Q_2^{(i-1)} (\hat Q_2^{(i-1)})^T \{\hat Q_2^{(i-1)} (Z^{(i)})^T + \hat Q_1^{(i-1)} (\hat Q_1^{(i-1)})^T J(y_i)^T A_i^T\} = \hat Q_2^{(i-1)} (Z^{(i)})^T = Q_1^{(i)} U^{(i)}.

Therefore, the factorization (4.16) can also be accomplished by

    \tilde Q_1^{(\nu)} \equiv Q_1^{(\nu)},  \tilde U_{\nu\nu} \equiv U^{(\nu)},  \nu = 1(1)i,

and modifications of the upper triangular entries \tilde U_{kl}, k < l, and \tilde U_{i+1,i+1}. In that case (4.17) reads \|(U^{(i)})^{-T}\| \le \gamma and the intermediate vector y_{i+1} is well-defined.

Proposition (iii):

    \|y_{i+1} - x^*\| = \|y_i - Q_1^{(i)} (U^{(i)})^{-T} A_i F(y_i) - x^*\|
        \le \|y_i - x^*\| + \|Q_1^{(i)} (U^{(i)})^{-T}\| \|A_i F(y_i)\|
        \le \|y_i - x^*\| + \gamma N_0 \|y_i - x^*\| = (1 + \gamma N_0) \|y_i - x^*\| = c_0 \|y_i - x^*\|
        \le c_0^i \|x^{(k)} - x^*\| \le c_1 \delta \le \delta_1.

Proposition (iv): From the definition of y_{i+1}, it follows that

    A_j F(y_j) + A_j J(y_j)(y_{i+1} - y_j) = 0,  j \le i.

Thus, for j \le i, we obtain

    \|A_j F(y_{i+1})\| = \|A_j F(y_{i+1}) - A_j F(y_j) - A_j J(y_j)(y_{i+1} - y_j)\|
        \le \|A_j\| \|F(y_{i+1}) - F(y_j) - J(y_j)(y_{i+1} - y_j)\|
        \le \kappa (L/2) \|y_{i+1} - y_j\|^2 = \kappa (L/2) \|(y_{i+1} - x^*) + (x^* - y_j)\|^2
        \le \kappa (L/2) (\|y_{i+1} - x^*\| + \|y_j - x^*\|)^2
        \le \kappa (L/2) (2 c_1 \|x^{(k)} - x^*\|)^2 \le 2 L \kappa c_1^2 \|x^{(k)} - x^*\|^2 = c_2 \|x^{(k)} - x^*\|^2.

For \nu = 1, it is quite easy to see that the propositions (i)-(iv) are fulfilled. We will now show that the iterates x^{(k)} converge quadratically. For j \le \sigma and x^{(k+1)} \equiv y_{\sigma+1}, (4.13, iv) implies (with A^{(\sigma)} \equiv [A_1^T, ..., A_\sigma^T]^T)

    \|F(x^{(k+1)})\| = \|(A^{(\sigma)})^{-1} A^{(\sigma)} F(x^{(k+1)})\| \le \|(A^{(\sigma)})^{-1}\| \|A^{(\sigma)} F(x^{(k+1)})\|
        \le \omega \sqrt{\sum_{j=1}^{\sigma} \|A_j F(x^{(k+1)})\|^2} \le \omega \sqrt{\sum_{j=1}^{\sigma} c_2^2 \|x^{(k)} - x^*\|^4} = \omega c_2 \sqrt{\sigma} \|x^{(k)} - x^*\|^2.      (4.20)

If x \in S(x^*, \delta_1), then

    | \|F(x)\| - \|J(x^*)(x - x^*)\| | \le \|F(x) - F(x^*) - J(x^*)(x - x^*)\| \le (L/2) \|x - x^*\|^2.

Hence, \|F(x)\| \ge \|J(x^*)(x - x^*)\| - (L/2) \|x - x^*\|^2. Since \|x - x^*\| = \|J(x^*)^{-1} J(x^*)(x - x^*)\| \le \beta \|J(x^*)(x - x^*)\|, we have

    \|F(x)\| \ge (1/\beta - (L/2) \|x - x^*\|) \|x - x^*\| = (\|x - x^*\|/\beta)(1 - (\beta L/2) \|x - x^*\|).      (4.21)

Relation (4.13, iii) implies \|y_{\sigma+1} - x^*\| = \|x^{(k+1)} - x^*\| \le c_1 \|x^{(k)} - x^*\| \le \delta_1, i.e. x^{(k+1)} \in S(x^*, \delta_1).

Thus, (4.21) is also satisfied for x = x^{(k+1)}, and we obtain

    (\beta L/2) \|x^{(k+1)} - x^*\| \le (\beta L/2) \delta_1 \le q.

Using this result in (4.21), it follows that

    \|F(x^{(k+1)})\| \ge \|x^{(k+1)} - x^*\| (1 - q)/\beta = (1/\gamma) \|x^{(k+1)} - x^*\|.      (4.22)

Combining the inequalities (4.20) and (4.22), we have

    \|x^{(k+1)} - x^*\| \le \gamma \omega \sqrt{\sigma} c_2 \|x^{(k)} - x^*\|^2 = c_3 \|x^{(k)} - x^*\|^2.

This states that hypothesis (4.12) is satisfied with C = c_3. Since (4.22) is still true if x^{(k+1)} is replaced by an arbitrary vector x \in S(x^*, \delta_1), no other solution than x^* exists in S(x^*, \delta). For x^{(k)} \in S(x^*, \delta), the inequality

    \|x^{(k+1)} - x^*\| \le c_3 \|x^{(k)} - x^*\|^2 \le c_3 \delta \|x^{(k)} - x^*\| < q \|x^{(k)} - x^*\| < \|x^{(k)} - x^*\| \le \delta

shows that the absolute errors \|x^{(k)} - x^*\| are strictly decreasing. It remains to be proven that x^{(k+1)} = x^{(k)} implies x^{(k)} = x^*. Let us assume x^{(k+1)} - x^{(k)} = 0. Then,

    0 = x^{(k+1)} - x^{(k)} = \sum_{i=1}^{\sigma} (y_{i+1} - y_i) = -\sum_{i=1}^{\sigma} Q_1^{(i)} (U^{(i)})^{-T} A_i F(y_i)
      = -[Q_1^{(1)} | ... | Q_1^{(\sigma)}] \begin{bmatrix} (U^{(1)})^{-T} A_1 F(y_1) \\ \vdots \\ (U^{(\sigma)})^{-T} A_\sigma F(y_\sigma) \end{bmatrix}.

Since [Q_1^{(1)} | ... | Q_1^{(\sigma)}] and U^{(j)}, j = 1(1)\sigma, are nonsingular matrices, it follows that

    \begin{bmatrix} A_1 F(y_1) \\ \vdots \\ A_\sigma F(y_\sigma) \end{bmatrix} = 0.      (4.23)

Substituting (4.23) into y_{i+1} = y_i - Q_1^{(i)} (U^{(i)})^{-T} A_i F(y_i), i = 1(1)\sigma, we have

    y_1 = y_2 = ... = y_{\sigma+1}  and  F(y_1) = F(y_2) = ... = F(y_\sigma).

Therefore, formula (4.23) reads

    \begin{bmatrix} A_1 \\ \vdots \\ A_\sigma \end{bmatrix} F(y_1) = ... = \begin{bmatrix} A_1 \\ \vdots \\ A_\sigma \end{bmatrix} F(y_\sigma) = 0.

Then F(y_1) = F(y_2) = ... = F(y_\sigma) = 0 by virtue of the rank hypothesis (4.2, i). This shows that x^{(k)} = y_1 is a root of the equation (4.1). Since in the neighbourhood of x^* there is no other solution, we conclude that x^{(k)} = x^*. Q. E. D.

5 A general nonlinear SMM

In the previous chapter we developed and studied a \sigma-stage GBM for the numerical treatment of systems of nonlinear algebraic equations. Our objective in this chapter is to show how the MSM can be combined with the new GBM to produce a SMM for the partially separated BVP (1.1), (1.2). For simplicity, we subdivide the interval [a, b] in two segments only, i.e.

    a = \tau_0 < \tau_1 < \tau_2 = b.      (5.1)

Then, the shooting equations (2.10) read

    F(s) \equiv \begin{bmatrix} F_1(s_0, s_1) \\ F_2(s_0, s_1) \end{bmatrix} \equiv \begin{bmatrix} u_0(\tau_1; s_0) - s_1 \\ \begin{bmatrix} r_p(s_0) \\ r_q(s_0, u_1(b; s_1)) \end{bmatrix} \end{bmatrix} = 0,      (5.2)

where s \equiv (s_0, s_1)^T, s_0, s_1 \in R^n. The restriction to two subintervals is not essential since the missing (block) rows

    u_1(\tau_2; s_1) - s_2 = 0, ..., u_{m-2}(\tau_{m-1}; s_{m-2}) - s_{m-1} = 0

have the same structure as F_1(s_0, s_1). The Jacobian J of F at s = y_i \equiv [s_{0,i}, s_{1,i}]^T is

    J(y_i) = \begin{bmatrix} X_{0,i}^e & -I_n \\ \begin{bmatrix} B_{a,i}^{(1)} \\ B_{a,i}^{(2)} \end{bmatrix} & \begin{bmatrix} 0 \\ B_{b,i}^{(2)} X_{1,i}^e \end{bmatrix} \end{bmatrix},

where B_{j,i}^{(2)} = B_j^{(2)}(y_i) \equiv \partial r_q(s_{0,i}, u_1(b; s_{1,i}))/\partial x(j), j = a, b; B_{a,i}^{(1)} = B_a^{(1)}(y_i) \equiv \partial r_p(s_{0,i})/\partial x(a); and X_{j,i}^e = X_j^e(y_i) \equiv \partial u_j(t; s_j)/\partial s_j |_{(\tau_{j+1}, s_{j,i})}, j = 0, 1.

To solve the algebraic shooting equations (5.2), we apply a special GBM. The structure of these equations defines \mu = 2n and suggests the choice \sigma = 3. The values n_i and the matrices A_i(y_i), i = 1, ..., \sigma, are chosen such that the application of the GBM to (5.2) results in a generalization of the SMM, i.e. a shooting method which is economized with respect to the number of integrations.
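To make the structure of the shooting equations (5.2) concrete, the following sketch assembles the residual F(s) for a toy BVP of our own choosing (x_1' = x_2, x_2' = -x_1 on [0, 1] with one separated and one coupled boundary condition); the RK4 integrator and all names are illustrative assumptions, not part of the paper's codes:

```python
import numpy as np

def rk4(f, t0, t1, x0, steps=200):
    # classical Runge-Kutta integration of x' = f(t, x) from t0 to t1
    t, x, h = t0, np.asarray(x0, float), (t1 - t0) / steps
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h / 2 * k1)
        k3 = f(t + h / 2, x + h / 2 * k2)
        k4 = f(t + h, x + h * k3)
        x = x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x

# toy BVP (solution x1 = sin t, x2 = cos t), n = 2, p = q = 1
f   = lambda t, x: np.array([x[1], -x[0]])
r_p = lambda xa: np.array([xa[0]])                    # x1(0) = 0
r_q = lambda xa, xb: np.array([xb[0] - np.sin(1.0)])  # x1(1) = sin 1

tau = [0.0, 0.5, 1.0]                                 # a = tau_0 < tau_1 < tau_2 = b

def F(s0, s1):
    # shooting equations (5.2) for the two-segment subdivision
    u0 = rk4(f, tau[0], tau[1], s0)   # u_0(tau_1; s_0)
    u1 = rk4(f, tau[1], tau[2], s1)   # u_1(b; s_1)
    return np.concatenate([u0 - s1, r_p(s0), r_q(s0, u1)])

# at the exact solution the residual vanishes up to discretization error
s0 = np.array([0.0, 1.0])                 # (sin 0, cos 0)
s1 = np.array([np.sin(0.5), np.cos(0.5)])
assert np.max(np.abs(F(s0, s1))) < 1e-8
```

The GBM stages described next exploit exactly this block structure: the continuity block u_0 - s_1 (n rows), the separated block r_p (p rows), and the coupled block r_q (q rows).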

Let us now describe the specification of Algorithm 4.2 for the problem (5.2).

01  Input: {dimensions} \mu = 2n, \sigma = 3, n_1 = n_2 = p, n_3 = 2q; {tolerances} TOL1, TOL2, kmax;
           {starting vector} x^{(0)} = s^{(0)} and {the function} F(x) \equiv F(s) {see formula (5.2)}
02  For k = 0 : kmax Do
03    Set: y_1 \equiv (s_{0,1}, s_{1,1})^T = x^{(k)}
04    {i = 1}
05    Choose: A_1(s_{0,1}, s_{1,1}) = [0 | I_p | 0] {block widths n, p, q}
06    Compute: f_1 = r_p(s_{0,1})
07    Compute: Z^{(1)} = [B_{a,1}^{(1)} | 0] {block widths n, n}
08    Compute: Q_1^{(1)}, U^{(1)} {if the orthogonal factorization

          (B_{a,1}^{(1)})^T = [Q_0^{(1)} | Q_0^{(2)}] \begin{bmatrix} U_0 \\ 0 \end{bmatrix} = Q_0^{(1)} U_0

        is computed, then the factorization formulated in Algorithm 4.2 is

          \hat Q_2^{(0)} (Z^{(1)})^T = \begin{bmatrix} (B_{a,1}^{(1)})^T \\ 0 \end{bmatrix} = \begin{bmatrix} Q_0^{(1)} \\ 0 \end{bmatrix} U_0 \equiv Q_1^{(1)} U^{(1)}}
09    Compute: \hat Q_2^{(1)} {by the trivial orthogonal factorization

          \begin{bmatrix} Q_0^{(1)} \\ 0 \end{bmatrix} = \begin{bmatrix} Q_0^{(1)} & Q_0^{(2)} & 0 \\ 0 & 0 & I_n \end{bmatrix} \begin{bmatrix} I_p \\ 0 \end{bmatrix},  i.e.  \hat Q_2^{(1)} = \begin{bmatrix} Q_0^{(2)} & 0 \\ 0 & I_n \end{bmatrix}}
10    If det(U^{(1)}) = 0 Then Stop
11    Set: y_2 \equiv \begin{bmatrix} s_{0,2} \\ s_{1,2} \end{bmatrix} = \begin{bmatrix} s_{0,1} \\ s_{1,1} \end{bmatrix} - \begin{bmatrix} Q_0^{(1)} U_0^{-T} r_p(s_{0,1}) \\ 0 \end{bmatrix} = \begin{bmatrix} s_{0,1} - Q_0^{(1)} U_0^{-T} r_p(s_{0,1}) \\ s_{1,1} \end{bmatrix}
12    Endfor {end of the first elimination step}
04    {i = 2}
05    Choose: A_2(s_{0,2}, s_{1,2}) = [(Q_1^{(1)})^T | 0 | 0] {block widths n, p, q}

        {Q_1^{(1)} is obtained from the orthogonal factorization

          X_{0,2}^e Q_0^{(2)} = [Q_1^{(2)} | Q_1^{(1)}] \begin{bmatrix} U_1 \\ 0 \end{bmatrix}}      (5.3)

06    Compute: f_2 = (Q_1^{(1)})^T [u_0(\tau_1; s_{0,2}) - s_{1,2}]
07    Compute: Z^{(2)} = [0 | -(Q_1^{(1)})^T]
08    Compute: Q_1^{(2)}, U^{(2)} {by the orthogonal factorization

          \hat Q_2^{(1)} (Z^{(2)})^T = \begin{bmatrix} 0 \\ -Q_1^{(1)} \end{bmatrix} = \begin{bmatrix} 0 \\ -Q_1^{(1)} \end{bmatrix} I_p \equiv Q_1^{(2)} U^{(2)}}
09    Compute: \hat Q_2^{(2)} {by the trivial orthogonal factorization

          \begin{bmatrix} Q_0^{(1)} & 0 \\ 0 & -Q_1^{(1)} \end{bmatrix} = \begin{bmatrix} Q_0^{(1)} & 0 & Q_0^{(2)} & 0 \\ 0 & -Q_1^{(1)} & 0 & Q_1^{(2)} \end{bmatrix} \begin{bmatrix} I_{2p} \\ 0 \end{bmatrix},  i.e.  \hat Q_2^{(2)} = \begin{bmatrix} Q_0^{(2)} & 0 \\ 0 & Q_1^{(2)} \end{bmatrix}}
10    If det(U^{(2)}) = 0 Then Stop
11    Set: y_3 \equiv \begin{bmatrix} s_{0,3} \\ s_{1,3} \end{bmatrix} = \begin{bmatrix} s_{0,2} \\ s_{1,2} \end{bmatrix} - \begin{bmatrix} 0 \\ -Q_1^{(1)} (Q_1^{(1)})^T [u_0(\tau_1; s_{0,2}) - s_{1,2}] \end{bmatrix} = \begin{bmatrix} s_{0,2} \\ s_{1,2} + Q_1^{(1)} (Q_1^{(1)})^T (u_0(\tau_1; s_{0,2}) - s_{1,2}) \end{bmatrix}
12    Endfor {end of the second elimination step}
      {perform the final (third) elimination step}
13    Choose: A_3(s_{0,3}, s_{1,3}) = \begin{bmatrix} (Q_1^{(2)})^T & 0 & 0 \\ 0 & 0 & I_q \end{bmatrix} {block widths n, p, q}
14    Compute: f_3 = \begin{bmatrix} (Q_1^{(2)})^T [u_0(\tau_1; s_{0,3}) - s_{1,3}] \\ r_q(s_{0,3}, u_1(b; s_{1,3})) \end{bmatrix}
15    If Fnorm \equiv \sqrt{\sum_{i=1}^{3} \|f_i\|^2} \le TOL1 Then Goto Output
16    Compute:

        Z^{(3)} = \begin{bmatrix} (Q_1^{(2)})^T X_{0,3}^e Q_0^{(2)} & -(Q_1^{(2)})^T Q_1^{(2)} \\ B_{a,3}^{(2)} Q_0^{(2)} & B_{b,3}^{(2)} X_{1,3}^e Q_1^{(2)} \end{bmatrix} = \begin{bmatrix} U_1 & -I_q \\ B_{a,3}^{(2)} Q_0^{(2)} & B_{b,3}^{(2)} X_{1,3}^e Q_1^{(2)} \end{bmatrix}      (5.4)

      {s_{0,3} = s_{0,2} implies X_{0,3}^e = X_{0,2}^e; thus, the (1,1)-block is U_1}
17    If det(Z^{(3)}) = 0 Then Stop
18    Compute: d {as the solution of the linear system Z^{(3)} d = f_3}
19    Set: x^{(k+1)} = y_4 = y_3 - \hat Q_2^{(2)} d
20    If \|x^{(k+1)} - x^{(k)}\| \le TOL2 Then Goto Output
21  Endfor {end of the k-th iteration step}
22  Output: x^{(k+1)}, Fnorm

Remark 5.1 Only the computation of X_{0,2}^e Q_0^{(2)} in (5.3), of X_{1,3}^e Q_1^{(2)} in (5.4), and of f_2 and f_3 requires the solution of IVPs. If X_{0,2}^e Q_0^{(2)} and X_{1,3}^e Q_1^{(2)} are determined by directional derivatives, the total number of integrations per iteration step is 2(q + 1). The MSM requires 2(n + 1) integrations.

Remark 5.2 Note that in the method described above the matrices A_i, i = 1(1)3, satisfy the condition (4.2) with \kappa = \omega = 1.

We are now in a position to formulate the algorithm of a general SMM which can be used to solve (1.1), (1.2) under an arbitrary segmentation a = \tau_0 < \tau_1 < ... < \tau_m = b. Obviously, the adequate GBM is a (m+1)-stage method with \mu = mn, n_i = p (i = 1(1)m), n_{m+1} = q.
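The operation count of Remark 5.1 extends to m segments (cf. Remark 5.3 below): per iteration the SMM needs one trajectory per subinterval plus q directional derivatives per factorization, i.e. m(q + 1) integrations, while the MSM, which propagates the full fundamental matrix, needs m(n + 1) by the same pattern. A trivial arithmetic sketch (function names are ours):

```python
def smm_integrations(m, q):
    # m trajectories f_j plus q directional-derivative IVPs per subinterval
    return m * (q + 1)

def msm_integrations(m, n):
    # multiple shooting propagates the full n x n fundamental matrix
    return m * (n + 1)

# two segments, as in the specification above: 2(q+1) versus 2(n+1)
assert smm_integrations(2, 1) == 4
# dimensions of the test problem of Section 6 (n = 5, q = 1), 10 segments
assert smm_integrations(10, 1) == 20
assert msm_integrations(10, 5) == 60
```

The gap between the two counts is exactly what the "economized" in the chapter title refers to: the saving grows with n - q, the number of separated end conditions.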

Algorithm 5.1:
    Input: {dimensions} n, m, p, q; {tolerances} TOL1, TOL2, kmax;
           {shooting points} \tau_0, ..., \tau_m; {starting vector} s^{(0)} \equiv (s_0, ..., s_{m-1})^T and
           {the functions} f(t, x), r_p(x(a)), r_q(x(a), x(b))
    For k = 0 : kmax Do
      Compute: r_p(s_0) and B_a^{(1)} \equiv \partial r_p(s_0)/\partial x(a)
      Set: Fnorm = \|r_p(s_0)\|^2
      Compute: (B_a^{(1)})^T = [Q_0^{(1)} | Q_0^{(2)}] \begin{bmatrix} U_0 \\ 0 \end{bmatrix} {orthogonal factorization}
      Set: s_0 = s_0 - Q_0^{(1)} U_0^{-T} r_p(s_0)
      For i = 1 : m Do
        Compute: u_{i-1} = u_{i-1}(\tau_i; s_{i-1})
        Compute: X_{i-1} = X_{i-1}^e Q_{i-1}^{(2)} {by directional derivatives and Algorithm 3.1}
        If i \ne m Then
          Compute: X_{i-1} = [Q_i^{(2)} | Q_i^{(1)}] \begin{bmatrix} U_i \\ 0 \end{bmatrix} {orthogonal factorization}
          Set: f_i = u_{i-1} - s_i
          Set: s_i = s_i + Q_i^{(1)} (Q_i^{(1)})^T f_i
        Endif
      Endfor {end of the i-th elimination step}
      Set: f_m = r_q(s_0, u_{m-1})
      Compute: B_j^{(2)} = \partial r_q(s_0, u_{m-1})/\partial x(j), j = a, b
      Set: R_a = B_a^{(2)} Q_0^{(2)} and R_b = B_b^{(2)} X_{m-1}
      Set: \hat q^{(m)} = ((Q_1^{(2)})^T f_1, ..., (Q_{m-1}^{(2)})^T f_{m-1}, f_m)^T
      Set: Fnorm = Fnorm + \|\hat q^{(m)}\|^2
      If Fnorm \le TOL1 Then Goto Output {small residuals}
      Compute: z^{(m)} \equiv (d_0, ..., d_{m-1})^T {as the solution of the linear system \hat M^{(m)} z^{(m)} = \hat q^{(m)}, where

          \hat M^{(m)} \equiv \begin{bmatrix} U_1 & -I_q & & \\ & \ddots & \ddots & \\ & & U_{m-1} & -I_q \\ R_a & & & R_b \end{bmatrix}}

      If \|z^{(m)}\| \le TOL2 Then Goto Output
      Set: s_j = s_j - Q_j^{(2)} d_j, j = 0(1)m-1
      Set: s^{(k+1)} = (s_0, ..., s_{m-1})^T
    Endfor {end of the k-th iteration step}
    Output: s^{(k)}, Fnorm

Remark 5.3

(a) In order to construct the system \hat M^{(m)} z^{(m)} = \hat q^{(m)}, the following integrations have to be performed: for each f_j (j = 1(1)m): 1 IVP; for each U_i (i = 1(1)m-1) and for X_{m-1}: q IVPs.

(b) A look at formula (3.24) shows that the standard form of the SMM is a special case of Algorithm 5.1.
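The structured system \hat M^{(m)} z^{(m)} = \hat q^{(m)} is block lower bidiagonal except for its last row, so it can be reduced to a single q x q system for d_0 by forward substitution. The paper does not prescribe a particular solver; the following NumPy sketch (our own, with hypothetical names) shows one way and checks it against a dense solve:

```python
import numpy as np

def solve_block_system(U_blocks, Ra, Rb, rhs_blocks):
    """Solve M_hat z = q_hat by forward substitution.

    Rows i = 1..m-1:  U_i d_{i-1} - d_i = q_i  =>  d_i = U_i d_{i-1} - q_i,
    so every d_i is an affine function of d_0; the last row
    R_a d_0 + R_b d_{m-1} = q_m then yields a small q x q system for d_0."""
    q = Ra.shape[0]
    P, w = np.eye(q), np.zeros(q)          # d_i = P_i d_0 + w_i
    Ps, ws = [P], [w]
    for U, rhs in zip(U_blocks, rhs_blocks[:-1]):
        P, w = U @ P, U @ w - rhs
        Ps.append(P); ws.append(w)
    d0 = np.linalg.solve(Ra + Rb @ Ps[-1], rhs_blocks[-1] - Rb @ ws[-1])
    return [Pi @ d0 + wi for Pi, wi in zip(Ps, ws)]

# random consistency check against a dense solve (m = 3, q = 2)
rng = np.random.default_rng(1)
q, m = 2, 3
U_blocks = [rng.standard_normal((q, q)) for _ in range(m - 1)]
Ra, Rb = rng.standard_normal((q, q)), rng.standard_normal((q, q))
rhs = [rng.standard_normal(q) for _ in range(m)]

M_hat = np.zeros((m * q, m * q))
for i, U in enumerate(U_blocks):
    M_hat[i*q:(i+1)*q, i*q:(i+1)*q] = U
    M_hat[i*q:(i+1)*q, (i+1)*q:(i+2)*q] = -np.eye(q)
M_hat[-q:, :q] = Ra
M_hat[-q:, -q:] = Rb

d = solve_block_system(U_blocks, Ra, Rb, rhs)
assert np.allclose(np.concatenate(d), np.linalg.solve(M_hat, np.concatenate(rhs)))
```

Note that such a compactification is only one option; for stability reasons a structured LU or QR solve of the whole block system may be preferable in practice (cf. [18]).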

6 Numerical results

In this section we summarize some numerical experiments which compare the nonlinear SMM with the standard MSM. For the tests we used implementations of these methods (RWPSM: stabilized march, RWPMS: multiple shooting) which are part of the software package RWPM (see e.g. [10, 11]). Note that in RWPM damped versions of the nonlinear equation solver (Newton's method or GBM) are implemented to obtain semilocal convergence. All computations were executed on a 486-AT in Microsoft FORTRAN 5.0 carrying a mantissa of 16 significant digits. The interval [0, 1] was subdivided into 10 and 20 equidistributed segments. The resulting IVPs were solved by a semi-implicit extrapolation method SIMPRS [3].

Problem: (see e.g. [1, 23])

    x_1' = a (x_1/x_2)(x_3 - x_1),
    x_2' = -a (x_3 - x_1),
    x_3' = (1/x_4) {0.9 - 1000 (x_3 - x_5) - a x_3 (x_3 - x_1)},      a = 100,      (6.5)
    x_4' = a (x_3 - x_1),
    x_5' = -100 (x_5 - x_3),

    x_1(0) = x_2(0) = x_3(0) = 1,  x_4(0) = -10,  x_3(1) = x_5(1).

Here, n = 5, q = 1 and p = 4. In [1] only the case a = 0.5 is solved.
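The right-hand side of (6.5) is easy to transcribe, and the homotopy step described below can be checked directly: for a = 0 the components x_1, x_2 and x_4 have zero derivative, so the constant starting trajectory already reproduces them exactly. A small illustrative check (the RK4 integrator stands in for SIMPRS, which we do not reproduce here):

```python
import numpy as np

def rhs(t, x, a=100.0):
    # right-hand side of the test problem (6.5)
    x1, x2, x3, x4, x5 = x
    return np.array([
        a * (x1 / x2) * (x3 - x1),
        -a * (x3 - x1),
        (0.9 - 1000.0 * (x3 - x5) - a * x3 * (x3 - x1)) / x4,
        a * (x3 - x1),
        -100.0 * (x5 - x3),
    ])

def rk4(f, x0, t0=0.0, t1=1.0, steps=2000):
    x, h, t = np.asarray(x0, float), (t1 - t0) / steps, t0
    for _ in range(steps):
        k1 = f(t, x); k2 = f(t + h/2, x + h/2*k1)
        k3 = f(t + h/2, x + h/2*k2); k4 = f(t + h, x + h*k3)
        x = x + h/6*(k1 + 2*k2 + 2*k3 + k4); t += h
    return x

# reduced problem a = 0: x1, x2, x4 are constant along the trajectory
x0 = np.array([1.0, 1.0, 1.0, -10.0, 1.0])
xb = rk4(lambda t, x: rhs(t, x, a=0.0), x0)
assert np.allclose(xb[[0, 1, 3]], [1.0, 1.0, -10.0])
```

This is exactly why the reduced problem is a convenient first step of the continuation: only x_3 and x_5 have to be computed before switching on a = 100.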

Accuracy: \epsilon = 10^{-6}

Starting Trajectory: In a first step we set a = 0 in (6.5) and solved this reduced problem with RWPMS and RWPSM using the following starting trajectory:

    x_1(t) = x_2(t) = x_3(t) \equiv 1,  x_4(t) \equiv -10,  x_5(t) \equiv 1.

In the second step the solution of the reduced problem was taken as starting trajectory for the given problem (a = 100).

Computational Results:

                  a = 0                       a = 100
    Code     it  nivp  node   CPU       it  nivp   node    CPU
    RWPMS     1    70  3430   1.7        6   370   88706   21.5
    RWPSM     1    40  1124   0.6        5   120   15224    4.8

    Table 6.1: 10 equidistributed segments

                  a = 0                       a = 100
    Code     it  nivp  node   CPU       it  nivp   node    CPU
    RWPMS     1   140  6860   2.5        6   740  125066   30.6
    RWPSM     1    80  2076   1.3        4   200   17018    5.3

    Table 6.2: 20 equidistributed segments

it = number of iteration steps, nivp = number of IVP-solver calls, node = number of ODE calls; CPU measured in seconds.

Acknowledgment. We are grateful to the referee for helpful remarks and suggestions.

References

[1] U. M. Ascher, R. M. M. Mattheij, and R. D. Russell. Numerical Solution of Boundary Value Problems for Ordinary Differential Equations. Prentice Hall Series in Computational Mathematics. Prentice-Hall Inc., Englewood Cliffs, 1988.

[2] G. Bader and U. Ascher. A new basis implementation for a mixed order boundary value ODE solver. SIAM J. Sci. Stat. Comput., 8:483-500, 1987.

[3] G. Bader and P. Deuflhard. A semi-implicit midpoint rule for stiff systems of ODEs. Numerische Mathematik, 41:373-398, 1983.

[4] R. P. Brent. Some efficient algorithms for solving systems of nonlinear equations. SIAM J. Numer. Anal., 10:327-344, 1973.

[5] P. Deuflhard and G. Bader. Multiple shooting techniques revisited. In P. Deuflhard and E. Hairer, editors, Numerical Treatment of Inverse Problems in Differential and Integral Equations, pages 74-94, Boston, Basel, Stuttgart, 1983. Birkhäuser Verlag.

[6] M. Hermann. The numerical treatment of linear boundary value problems with partially separated boundary conditions by shooting methods. In M. Hermann, editor, Numerische Behandlung von Differentialgleichungen II, pages 91-145, Jena, 1984. Wissenschaftliche Beiträge der Friedrich-Schiller-Universität.

[7] M. Hermann. Efficient shooting algorithms for solving nonlinear TPBVPs with separated endconditions. In P. Rozsa and G. Stoyan, editors, Numerical Methods, pages 260-276, Amsterdam, 1987. North-Holland Publishers.

[8] M. Hermann. Shooting algorithms for two-point BVPs. In K. Strehmel, editor, Numerical Treatment of Differential Equations, Teubner-Texte zur Mathematik, Bd. 104, pages 74-83, Leipzig, 1988. Teubner Verlag.

[9] M. Hermann and H. Bernd. RWPM: a multiple shooting code for nonlinear two-point boundary value problems, version 4. Technical Report 67, 68, 69, Friedrich-Schiller-Universität Jena, 1982.

[10] M. Hermann and D. Kaiser. RWPM: a software package of shooting methods for nonlinear two-point boundary value problems. Appl. Numer. Math., 13:103-108, 1993.

[11] M. Hermann and D. Kaiser. RWPM: a software package of shooting methods for nonlinear two-point boundary value problems, documents and programs, 1994. anonymous ftp: ftp.uni-jena.de, directory: /pub/mathematik/rwpm.

[12] D. Kaiser. Zur numerischen Behandlung von partiell separierten Zweipunktrandwertproblemen mit Schießverfahren. PhD thesis, Friedrich-Schiller-Universität Jena, 1991.

[13] H. B. Keller. Numerical Solution of Two-Point Boundary Value Problems. SIAM, Philadelphia, 1976.

[14] F. T. Krogh, J. P. Keener, and W. H. Enright. Reducing the number of variational equations in the implementation of multiple shooting. In U. M. Ascher and R. D. Russell, editors, Numerical Boundary Value ODEs, pages 121-135, Boston, Basel, Stuttgart, 1985. Birkhäuser Verlag.

[15] M. Lentini and V. Pereyra. An adaptive finite difference solver for nonlinear two-point boundary value problems with mild boundary layers. SIAM J. Numer. Anal., 14:91-111, 1977.

[16] W. Mackens. Some notes on block Gauss-Seidel-Newton iteration for the solution of sparse nonlinear problems. Technical Report 37, Rheinisch-Westfälische Technische Hochschule Aachen, Germany, 1986.

[17] J. M. Martinez. Generalization of the methods of Brent and Brown for solving nonlinear simultaneous equations. SIAM J. Numer. Anal., 16:434-448, 1979.

[18] R. M. M. Mattheij. Stability of block LU-decompositions of matrices arising from BVP. SIAM J. Discr. Math., 5:314-331, 1984.

[19] R. M. M. Mattheij and G. W. Staarink. BOUNDPACK user's manual. Technical Report 84-01, Math. Inst., Kath. Univ. Nijmegen, 1984.

[20] R. M. M. Mattheij and G. W. Staarink. An efficient algorithm for solving general linear two point BVP. SIAM J. Sci. Stat. Comput., 5:745-763, 1984.

[21] M. Osborne. The stabilized march is stable. SIAM J. Numer. Anal., 16:923-933, 1979.

[22] J. W. Schmidt and W. Hoyer. Ein Konvergenzsatz für Verfahren vom Brown-Brent-Typ. ZAMM, 57:397-405, 1977.

[23] M. R. Scott and H. A. Watts. A systematized collection of codes for solving two-point boundary value problems. In L. Lapidus and W. E. Schiesser, editors, Numerical Methods for Differential Systems, New York and London, 1976. Academic Press.

[24] W. Wallisch and M. Hermann. Schießverfahren zur Lösung von Rand- und Eigenwertaufgaben. Teubner-Texte zur Mathematik, Bd. 75. Teubner Verlag, Leipzig, 1985.

28