Convergence of the forward-backward sweep method in optimal control

Michael McAsey, Libin Mou & Weimin Han

Computational Optimization and Applications: An International Journal. ISSN 0926-6003, Volume 53, Number 1. Comput Optim Appl (2012) 53:207–226. DOI 10.1007/s10589-011-9454-7


Received: 8 June 2011 / Published online: 13 January 2012 © Springer Science+Business Media, LLC 2012

Abstract  The Forward-Backward Sweep Method is a numerical technique for solving optimal control problems. The technique is one of the indirect methods, in which the differential equations from the Maximum Principle are solved numerically. After the method is briefly reviewed, two convergence theorems are proved for a basic type of optimal control problem. The first shows that recursively solving the system of differential equations produces a sequence of iterates converging to the solution of the system. The second shows that a discretized implementation of the continuous system also converges as the iteration count and the number of subintervals increase. The hypotheses of the theorems combine basic Lipschitz conditions with a restriction on the length of the interval of integration. An example illustrates the performance of the method.

Keywords  Optimal control · Numerical solution · Convergence · Indirect method

M. McAsey · L. Mou
Department of Mathematics, Bradley University, Peoria, IL 61625, USA
e-mail: [email protected]

W. Han
Department of Mathematics, University of Iowa, Iowa City, IA 52242, USA

1 Introduction

Solutions of optimal control problems are often difficult to obtain. Yet when either learning or teaching optimal control, it is helpful to have some examples with closed form solutions. It is also useful to have a simple numerical scheme that can produce a numerical approximation to solutions of problems for which closed form solutions are not available. The textbook by Lenhart and Workman [38] provides just that. The Forward-Backward Sweep Method (FBSM) in [38] is easy to program and runs quickly. The method is designed to solve the differential-algebraic system generated by the Maximum Principle that characterizes the solution. A detailed convergence analysis of the method is not appropriate for the intended audience of [38]. However, after seeing the method work on problems from several disciplines, and knowing that indirect methods have some difficulties, it seems natural to ask about its convergence properties. In this paper we prove a convergence result for the method applied to a very basic class of problems.

The literature on numerical solutions of optimal control problems is large. To put some of it into perspective, consider a basic problem: choose a control u(t) to optimize an integral ∫_0^T f(t, x(t), u(t)) dt subject to a differential equation constraint x′(t) = g(t, x(t), u(t)), x(t0) = x0. The main analytical technique is provided by Pontryagin's Maximum Principle, which gives necessary conditions that the control u(t) and the state x(t) must satisfy. These conditions can be solved explicitly in a few examples. However, for most problems, especially those that also involve additional constraints on the state or control, the conditions are too algebraically involved to be solved explicitly, so numerical approaches are used to construct approximations to the solutions. Useful techniques and surveys on numerical methods can be found in texts, articles, and introductions to articles; examples include [5, 7, 9, 10, 16, 24, 36, 47].

Numerical techniques for optimal control problems can often be classified as either direct or indirect. For a direct method, the differential equation and the integral are discretized and the problem is converted into a nonlinear programming problem. Many choices are available for discretizing the integral and the differential equation, and for solving the resulting nonlinear programming problem, leading to several different numerical methods.
For example, in an early paper [26], Hager considers an unconstrained problem to minimize the final state value subject to the differential equation x′(t) = f(x(t), u(t)). The paper treats discretizations by both one-step and multistep approximations. Dontchev, Hager, and co-authors have produced not only convergence results but also rates of convergence for direct techniques on problems that include state and control constraints; for a sample, see [13–16, 26]. More recent contributions to the discretization approach include papers by Birgin, Kaya, and Martínez (see [8, 36, 37]) in which the technique of inexact restoration is applied to the nonlinear programming problem resulting from the discretization. Local convergence analysis of the inexact restoration method, for the nonlinear programming problem only, is given in [8]; comparisons with existing software are found there as well. Convergence properties of inexact restoration as applied to Euler and, more generally, Runge-Kutta discretizations of optimal control problems are discussed in [36, 37], respectively. The direct transcription method, a full discretization method, is described, illustrated, and updated in the recent edition of the book by Betts [7], highlighting the Sparse Optimal Control Software (SOCS) developed by the Boeing Company. Betts, Campbell, and Engelsone [6] discuss a test problem using SOCS. Using different code, Kameswaran and Biegler [32] solve the same test problem. In related work [31] these last authors also provide convergence rates for the direct transcription methods.

Another discretization method, sometimes called control parametrization or partial discretization, approximates the controls in each subdivision by polynomials of degree zero or one, or by splines, while the dynamics of the problem are solved for the states using these approximate controls. Older examples include [45] (and the MISER3 code) and [44]; more recent work includes [11, 34, 35]. Pseudospectral methods represent discretization of a different flavor but are also useful tools. In these methods the states and controls are approximated by polynomials using specific collocation points. Early examples include [17, 18, 46]; more recent examples are found in [19, 24, 33, 41, 42]. Very recent papers by Hager and co-authors on pseudospectral methods in optimal control are [12, 21–23].

Indirect methods approximate solutions to optimal control problems by numerically solving the boundary value problem for the differential-algebraic system generated by the Maximum Principle. Techniques for solving boundary value problems can be found in the venerable book by Keller [29] and include shooting, finite difference, and collocation methods. More recently, Iserles' book [28] treats boundary value problems via finite element methods. The addition of an algebraic constraint or of state/control constraints presents additional difficulties that do not appear in classical boundary value problems. Relevant for the present paper is work by Hackbusch [25] that approximates solutions to boundary value problems with two parabolic equations. The books [1, 27, 30] have extensive treatments of differential-algebraic systems, concentrating more on initial value problems than on boundary value problems. The paper by Bertolazzi [4] has an informative introduction, highlighting the various numerical techniques in optimal control and their advantages and disadvantages.

The idea exploited by the FBSM is that the initial value problem for the state equation is solved forward in time, using an estimate for the control and costate variables. Then the costate final value problem is solved backwards in time.
An early reference to a technique with the forward-backward flavor is [39], where the update step is different from that considered here. In [20] Enright and Muir use both explicit and implicit Runge-Kutta methods (and an average of the two) for two-point boundary value problems, which also has some of the flavor of the FBSM. The lack of recent literature on the FBSM is due to the difficulties inherent in using indirect methods. These difficulties include (see [5]) the need to compute various partial derivatives of the Hamiltonian, the need to make an initial guess of the costate variable, the sensitivity of the method to changes in initial guesses, and the limiting behavior of the method. The authors are not suggesting that the FBSM overcomes any of these difficulties, but rather that the method is easy to use on small problems and can provide a quick check on initial uses of more powerful methods. In addition, the method provides some insight into the Maximum Principle, and the proof of the convergence result is a nice illustration of the proof techniques in numerical differential equations, as in [3].

In Sect. 2 of this paper, we describe the type of optimal control problems to which we will apply the FBSM. These are among the most basic in the subject. In Sect. 3 we investigate the convergence issue for the simplest case of the method; we prove both a continuous version and a discrete version of the convergence theorem. In Sect. 4 we illustrate the numerical performance of the method through simulations of solutions to an example in optimal control. The paper closes in Sect. 5 with a few remarks on more general problems.


2 Basic problem

The basic problem to be considered is to choose a control function u(t) to maximize an integral objective function:

    max_u ∫_{t0}^{t1} f(t, x(t), u(t)) dt    (2.1)

subject to the state equation

    x′(t) = g(t, x(t), u(t)),    x(t0) = x0.    (2.2)

For this formulation of the problem, assume that x and u are vector-valued functions on [t0, t1] with values in Rⁿ and Rᵐ, respectively. Assume f and g map R × Rⁿ × Rᵐ into R and Rⁿ, respectively. The basic problem can be generalized in several ways. Some of these include: (1) the terminal value of the state x(t1) may be fixed; (2) the end time t1 could be a choice variable; and (3) the objective may include a "scrap function" φ(t1) in addition to the integral. There are more variations on the basic problem, of course, but these are the main problems in [38] that can be computed by the FBSM. The convergence results of this paper have not yet been extended to these problems; see the remarks in the final section.

The FBSM is one of the so-called indirect methods for solving optimal control problems. Begin by using the Maximum Principle to characterize the method as applied to the basic problem. This is considered in detail in [38] (p. 13), and we provide a brief sketch. Assume that f and g are continuously differentiable in all three variables. We also assume that a solution to the basic problem exists in which x is continuously differentiable and u is piecewise continuous. Form the Hamiltonian H(t, x, u, λ) = λ0 f(t, x, u) + λ g(t, x, u), where λ = λ(t) is the adjoint or co-state variable. (We will take the constant λ0 to be equal to 1 for the problems considered here, although in general this cannot be assumed.) The Maximum Principle says that there is a co-state variable λ(t) such that an optimal state x(t) and optimal control u(t) must necessarily (1) satisfy the state equation, x′(t) = g(t, x(t), u(t)), x(t0) = x0; (2) satisfy the co-state equation

    dλ/dt = −∂H/∂x,    λ(t1) = 0;    (2.3)

and (3) maximize the Hamiltonian, considered as a function of the control. These three conditions result in a two-point boundary value problem together with an additional algebraic equation from the optimality condition (3). Assuming enough structure on the functions, condition (3) can be written as H_u = 0. Although it is not necessary for the numerical algorithm, in many problems this equation can be solved for u, and that is how we shall state the FBSM.

In brief, the Forward-Backward Sweep Method first solves the state equation x′ = g(t, x, u) forward in time with a Runge-Kutta routine, then solves the costate equation (2.3) backwards in time with the Runge-Kutta solver, and then updates the control. This produces a new approximation of the state, costate, and control (x, λ, u). The method continues by using these new updates and calculating new Runge-Kutta approximations and control updates, with the goal of finding a fixed "point" (x, λ, u). The method terminates when there is sufficient agreement between the states, costates, and controls of two passes through the approximation loop.
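The loop just described can be sketched in code. The sketch below is a minimal illustration, not the implementation of [38]: it uses plain Euler steps instead of Runge-Kutta, and it is exercised on a small linear-quadratic test problem (state x′ = −x + u, costate λ′ = λ − x, update u = −λ; the same system appears as the example in Sect. 4).

```python
def fbsm(g, phi, u_from, x0, t0, t1, n=1000, sweeps=25):
    """Forward-Backward Sweep, sketched with Euler steps.

    g(t, x, u)      -- state dynamics, x' = g
    phi(t, x, u, l) -- costate dynamics, lambda' = phi
    u_from(t, x, l) -- control update from the optimality condition H_u = 0
    """
    h = (t1 - t0) / n
    t = [t0 + j * h for j in range(n + 1)]
    u = [0.0] * (n + 1)                      # initial guess u ≡ 0
    diff = float("inf")
    for _ in range(sweeps):
        # forward sweep for the state, x(t0) = x0
        x = [x0] * (n + 1)
        for j in range(n):
            x[j + 1] = x[j] + h * g(t[j], x[j], u[j])
        # backward sweep for the costate, lambda(t1) = 0
        lam = [0.0] * (n + 1)
        for j in range(n, 0, -1):
            lam[j - 1] = lam[j] - h * phi(t[j], x[j], u[j], lam[j])
        # control update; diff measures agreement between two passes
        u_new = [u_from(t[j], x[j], lam[j]) for j in range(n + 1)]
        diff = max(abs(a - b) for a, b in zip(u, u_new))
        u = u_new
    return x, lam, u, diff
```

On the linear-quadratic test problem the sweep contracts quickly, and the returned `diff` serves as the stopping quantity described above.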

3 Convergence of the FBSM

To better understand the idea of the convergence analysis, we first consider the FBSM at the continuous level; this is done in Sect. 3.1. The argument can then be adapted for the convergence study of the FBSM applied to the discrete systems, in Sect. 3.2. Throughout this section we will assume that an optimal solution exists for the basic problem (2.1)–(2.2). The Lipschitz conditions to be assumed shortly are enough to be able to apply the Maximum Principle. (See [43, p. 85] for a statement of the Maximum Principle.) This in turn implies that the boundary value problem of interest has a solution. Thus the real problem is solving a boundary value problem of a specific form.

3.1 Convergence for the continuous system

For notational simplicity, we express the problem as finding (x(t), λ(t), u(t)) such that

    x′(t) = g(t, x(t), u(t)),    x(t0) = x0,    (3.1)
    λ′(t) = h1(t, x(t), u(t)) + λ(t) h2(t, x(t), u(t)),    λ(t1) = 0,    (3.2)
    u(t) = h3(t, x(t), λ(t)).    (3.3)

Here x0 ∈ Rⁿ and t0 < t1 are given, and g, h1 and h2 are given functions satisfying the continuity properties mentioned in Sect. 2, so that the system (3.1)–(3.3) has a unique solution (x(t), λ(t), u(t)). The forms of (3.2) and (3.3) are chosen for convenience. The motivation for h1 and h2 comes from the optimal control problem in Sect. 2, so that h1 can be thought of as −λ0 f_x and h2 as −g_x. It will be assumed here that u is defined uniquely by the optimality condition H_u = 0 in the Maximum Principle, and h3 is then the solution written explicitly as a function of t, x, and λ. (In practice, writing u as an explicit function of t, x, and λ is not needed; there are several ways to approximate u for use in the algorithm.)

The FBSM for the system (3.1)–(3.3) reads as follows.

Initialization: choose an initial guess u^(0) (= u^(0)(t)).

Iteration: for k ≥ 0, solve

    dx^(k+1)(t)/dt = g(t, x^(k+1)(t), u^(k)(t)),    x^(k+1)(t0) = x0,    (3.4)
    dλ^(k+1)(t)/dt = h1(t, x^(k+1)(t), u^(k)(t)) + λ^(k+1)(t) h2(t, x^(k+1)(t), u^(k)(t)),    λ^(k+1)(t1) = 0,    (3.5)
    u^(k+1)(t) = h3(t, x^(k+1)(t), λ^(k+1)(t)).    (3.6)


For each loop of the algorithm, note that (3.4) is solved forward from t0 to t1. Then (3.5) is solved backward from t1 to t0. Finally, u is updated in (3.6).

For a convergence analysis of the above FBSM, we will make the following assumptions.

(A) The functions g, h1, h2 and h3 are Lipschitz continuous with respect to their second and third arguments, with Lipschitz constants Lg, Lh1, etc.; e.g.,

    |g(t, x1, u1) − g(t, x2, u2)| ≤ Lg (|x1 − x2| + |u1 − u2|).

Moreover, Λ = ‖λ‖∞ < ∞ and H = ‖h2‖∞ < ∞.

Note that in the convergence analysis of numerical methods for ODEs, it is standard to assume Lipschitz conditions (see, e.g., [2]). In the proof of the next theorem, we will apply a simple form of the well-known Gronwall inequality (see [3, Exercise 5.2.12]): suppose f and g are continuous functions on [a, b] and g is non-decreasing; then

    f(t) ≤ g(t) + c ∫_a^t f(s) ds    ⟹    f(t) ≤ e^{c(t−a)} g(t)    ∀ t ∈ [a, b].    (3.7)

Similarly, if f and g are continuous functions on [a, b] and g is non-increasing, then

    f(t) ≤ g(t) + c ∫_t^b f(s) ds    ⟹    f(t) ≤ e^{c(b−t)} g(t)    ∀ t ∈ [a, b].    (3.8)

Theorem 3.1  Under the assumptions (A), if

    c0 ≡ Lh3 { [exp(Lg(t1 − t0)) − 1]
             + (Lh1 + Λ Lh2) (1/H) [exp(H(t1 − t0)) − 1] [exp(Lg(t1 − t0)) + 1] } < 1,    (3.9)

then we have convergence: as k → ∞,

    max_{t0≤t≤t1} |x(t) − x^(k)(t)| + max_{t0≤t≤t1} |λ(t) − λ^(k)(t)| + max_{t0≤t≤t1} |u(t) − u^(k)(t)| → 0.    (3.10)
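Before turning to the proof, note that condition (3.9) is easy to evaluate for given constants. A minimal sketch (the constants in the usage comment are illustrative only, not drawn from any particular control problem):

```python
import math

def c0(L_h3, L_g, L_h1, L_h2, Lam, H, T):
    """Left-hand side of condition (3.9).

    Lam stands for ||lambda||_inf, H for ||h2||_inf, T for t1 - t0;
    all arguments are hypothetical problem constants supplied by the user.
    """
    a = math.exp(L_g * T) - 1.0
    b = (L_h1 + Lam * L_h2) * (math.exp(H * T) - 1.0) * (math.exp(L_g * T) + 1.0) / H
    return L_h3 * (a + b)

# e.g. c0(0.1, 0.5, 0.2, 0.2, 1.0, 0.5, 1.0) is below 1, so (3.9) holds
```

As the theorem suggests, the value grows with the interval length T and with the Lipschitz constants, so shrinking either restores the contraction condition.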

Proof  Denote the errors

    e_x^(k) = x − x^(k),    e_λ^(k) = λ − λ^(k),    e_u^(k) = u − u^(k).

These errors are all functions of t as well as k. From (3.1) and (3.4), we have

    d e_x^(k+1)(t)/dt = g(t, x(t), u(t)) − g(t, x^(k+1)(t), u^(k)(t)),    e_x^(k+1)(t0) = 0.

Then

    e_x^(k+1)(t) = ∫_{t0}^{t} [g(s, x(s), u(s)) − g(s, x^(k+1)(s), u^(k)(s))] ds.

Applying the Lipschitz condition on g,

    |e_x^(k+1)(t)| ≤ Lg ∫_{t0}^{t} (|e_x^(k+1)(s)| + |e_u^(k)(s)|) ds,    t ∈ [t0, t1].    (3.11)

Similarly, from (3.2) and (3.5), we have

    e_λ^(k+1)(t) = ∫_{t1}^{t} [ h1(s, x(s), u(s)) − h1(s, x^(k+1)(s), u^(k)(s))
        + λ(s) ( h2(s, x(s), u(s)) − h2(s, x^(k+1)(s), u^(k)(s)) )
        + e_λ^(k+1)(s) h2(s, x^(k+1)(s), u^(k)(s)) ] ds.

Hence,

    |e_λ^(k+1)(t)| ≤ ∫_t^{t1} [ (Lh1 + Λ Lh2) (|e_x^(k+1)(s)| + |e_u^(k)(s)|) + H |e_λ^(k+1)(s)| ] ds,    t ∈ [t0, t1].    (3.12)

Furthermore, from (3.3) and (3.6), we obtain

    |e_u^(k+1)(t)| ≤ Lh3 ( |e_x^(k+1)(t)| + |e_λ^(k+1)(t)| ),    t ∈ [t0, t1].    (3.13)

Apply Gronwall's inequality (3.7) to (3.11) to obtain

    |e_x^(k+1)(t)| ≤ exp(Lg(t − t0)) Lg ∫_{t0}^{t} |e_u^(k)(s)| ds,    t ∈ [t0, t1].    (3.14)

Apply Gronwall's inequality (3.8) to (3.12) to obtain

    |e_λ^(k+1)(t)| ≤ exp(H(t1 − t)) (Lh1 + Λ Lh2) ∫_t^{t1} (|e_x^(k+1)(s)| + |e_u^(k)(s)|) ds,    t ∈ [t0, t1].

Then plug (3.14) into the right side of this inequality and use integration by parts to obtain

    |e_λ^(k+1)(t)| ≤ exp(H(t1 − t)) (Lh1 + Λ Lh2)
        × { [exp(Lg(t1 − t0)) − exp(Lg(t − t0))] ∫_{t0}^{t} |e_u^(k)(s)| ds
          + [exp(Lg(t1 − t0)) − exp(Lg(t − t0)) + 1] ∫_t^{t1} |e_u^(k)(s)| ds },    t ∈ [t0, t1].    (3.15)


Use (3.14) and (3.15) in (3.13):

    |e_u^(k+1)(t)| ≤ Lh3 { exp(Lg(t − t0)) Lg ∫_{t0}^{t} |e_u^(k)(s)| ds
        + exp(H(t1 − t)) (Lh1 + Λ Lh2) ∫_t^{t1} |e_u^(k)(s)| ds
        + exp(H(t1 − t)) (Lh1 + Λ Lh2) [exp(Lg(t1 − t0)) − exp(Lg(t − t0))] ∫_{t0}^{t1} |e_u^(k)(s)| ds },
    t ∈ [t0, t1].    (3.16)

We integrate (3.16) over the interval [t0, t1] to obtain

    ∫_{t0}^{t1} |e_u^(k+1)(t)| dt ≤ c1 ∫_{t0}^{t1} |e_u^(k)(t)| dt,

where

    c1 = c0 − Lh3 (Lh1 + Λ Lh2) ∫_{t0}^{t1} exp(−(H − Lg) t + H t1 − Lg t0) dt.

Hence,

    ∫_{t0}^{t1} |e_u^(k)(t)| dt ≤ (c1)^k ∫_{t0}^{t1} |e_u^(0)(t)| dt.    (3.17)

Thus, if c1 < 1, which is valid under the assumption (3.9), we have

    ∫_{t0}^{t1} |e_u^(k)(t)| dt → 0    as k → ∞.    (3.18)

Using this convergence in (3.14), (3.15) and (3.16), we conclude the statement (3.10).  □

Remark 3.2  The condition (3.9) is valid if Lh3 is sufficiently small, or if Lg(t1 − t0) and Lh1 + Λ Lh2 are sufficiently small. As is seen from the proof, this condition can be replaced by the weaker one c1 < 1. It is possible to further sharpen the condition (3.9). Other iteration methods may be studied as well. For the iteration, one may consider using

    dx^(k+1)(t)/dt = g(t, x^(k)(t), u^(k)(t)),    x(t0) = x0,    (3.19)

instead of (3.4); then only an integration is required to get x^(k+1). A similar comment applies to (3.5). The price to pay for using the alternative iteration is slower convergence. The difference between the iteration schemes in (3.19) and (3.4) is reminiscent of the difference between Jacobi iteration and Gauss-Seidel iteration in solving linear algebraic equations. Updating the iterate as soon as possible, as in Gauss-Seidel, often produces faster convergence. Some numerical experiments provide evidence for faster convergence using (3.4) rather than (3.19), but we have no proof of a general fact.

3.2 The numerical algorithm and convergence for the discretized system

In a numerical implementation of the FBSM we are not actually solving the state and co-state differential equations (3.1) and (3.2), of course, but are instead finding numerical approximations of the solutions at discrete points in the interval. The convergence theorem in this section will show that when the Lipschitz constants are small enough or the time interval is short enough, there are a grid size and an iteration count for which the error between the solution at the nodes and the discrete approximation can be made small.

Recall that the system being solved is (3.1)–(3.3). For notational convenience the interval is now assumed to be [0, T], the initial state is denoted by x(0) = a, and the final value of the costate is written λ(T) = 0. The assumptions (A) continue in force for this section. Recall these assumptions: the functions g, h1, h2, h3 are continuous in t and Lipschitz in x, u, and λ. We continue to assume that a solution exists to the optimal control problem (2.1)–(2.2). These hypotheses and the Maximum Principle then imply that the boundary value problem (3.1)–(3.3) has a solution. In this section we also assume that the solutions x(t), λ(t) and u(t) are continuous.

Let n be a positive integer and define the step size h = T/n. Denote xj = x(tj) and λj = λ(tj), where tj = t0 + jh = jh and x(t), λ(t), u(t) are the actual solutions to the system (3.1)–(3.3). For each k ≥ 0, let xj^k, λj^k, uj^k be the k-th approximations to xj, λj, uj, as defined below. Our goal is to show that the approximations converge to the solution as k → ∞ and h → 0.
3.2.1 Discrete approximations

Consider a discrete approximation to a general initial value problem

    y′ = g(t, y),    α < t ≤ β;    y(α) = y0,

given by a scheme of the form

    y_{j+1} = y_j + h G(t_j, y_j, h; g),

where G is a function such that

    ηg(h) ≡ sup{ |g(t, y) − G(t, y, h; g)| : α ≤ t ≤ β, −∞ < y < ∞ } → 0    (3.20)

as h → 0. The classical methods (e.g., Euler, Runge-Kutta) have this form. A simple example is the Modified Euler method: G = (g(t, y) + g(t + h, y + h g(t, y)))/2.

A discrete approximation method will be applied to approximate x(t) using (3.1) forward in time, and then applied to the second equation (3.2) backwards in time to approximate λ(t). For simplicity, the functions G for the methods associated with g(t, x, u) and ϕ(t, x, u, λ) = h1(t, x, u) + λ h2(t, x, u) are denoted by G and F in the algorithm (3.21). For j = 0, . . . , n − 1 define the forward and backward difference operators

    Δj x = x_{j+1} − x_j,    δj x = x_j − x_{j+1}.

The operators Δj and δj apply to the approximating sequences xj^k, λj^k, uj^k as well.

3.2.2 The algorithm

• Initialization: choose an initial guess uj^0, j = 0, . . . , n.
• Iteration: for k ≥ 0, define xj^(k+1), λj^(k+1), uj^(k+1), j = 0, . . . , n, by the equations

    Δj x^(k+1) = h G(tj, xj^(k+1), uj^k),    x0^(k+1) = a,    j = 0, . . . , n − 1,
    δ_{j−1} λ^(k+1) = h F(tj, xj^(k+1), uj^k, λj^(k+1)),    λn^(k+1) = 0,    j = n, . . . , 1,    (3.21)
    uj^(k+1) = h3(tj, xj^(k+1), λj^(k+1)),    j = 0, . . . , n.
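A concrete choice of the one-step map G in (3.21) is the Modified Euler rule from Sect. 3.2.1. The sketch below shows one forward sweep with this G; the scalar test equation y′ = −y is purely illustrative (a stepper check, not a control problem):

```python
import math

def G_modified_euler(g, t, y, h):
    # G(t, y, h; g) = [g(t, y) + g(t + h, y + h*g(t, y))] / 2
    return 0.5 * (g(t, y) + g(t + h, y + h * g(t, y)))

def sweep_forward(g, y0, a, b, n):
    """One forward sweep y_{j+1} = y_j + h*G(t_j, y_j, h; g) on [a, b]."""
    h = (b - a) / n
    y = [y0]
    for j in range(n):
        y.append(y[-1] + h * G_modified_euler(g, a + j * h, y[-1], h))
    return y
```

Since the Modified Euler map satisfies (3.20) with ηg(h) = O(h), the sweep reproduces the exact solution as h → 0; the backward sweep for λ works the same way with F in place of G.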

3.2.3 Convergence theorem

Theorem 3.3  Suppose the assumptions (A) hold, and suppose that either the Lipschitz constants are small enough or T is small enough. Then

    max{ |x(tj) − xj^k| + |λ(tj) − λj^k| + |u(tj) − uj^k| : j = 0, . . . , n } → 0    as k, n → ∞.

That is, for every ε > 0 there exist N, K > 0 such that

    max{ |x(tj) − xj^k| + |λ(tj) − λj^k| + |u(tj) − uj^k| : j = 0, . . . , n } < ε

for all n > N and k > K.

Remark 3.4  The first approximating equation in (3.21) can be replaced by

    Δj x^(k+1) = h G(tj, xj^k, uj^k),    x0^(k+1) = a.    (3.22)

See Remark 3.6.1 for a discussion.

Proof  Denote the errors by

    e_{x,j}^k = xj − xj^k,    e_{λ,j}^k = λj − λj^k,    e_{u,j}^k = uj − uj^k

for k ≥ 0, j = 0, 1, . . . , n. The proof follows the general outline of the proof for the continuous approximation. The essence is to find bounds for the errors in x and λ in terms of the error in u, and then to show that this last error can be made small. Define the following average errors:

    E_x^k = h Σ_{j=0}^{n} |e_{x,j}^k|,    E_λ^k = h Σ_{j=0}^{n} |e_{λ,j}^k|,    E_u^k = h Σ_{j=0}^{n} |e_{u,j}^k|.    (3.23)


Since h = T/n, E_x^k is an "average" error of the approximation xj^k. The idea of the proof is to show that E_u^k → 0 as k → ∞ and h → 0, which implies the desired result.

• Inequality for e_{x,j}^k

Note that e_{x,0}^{k+1} = 0. From the equations for x and x^{k+1} we get, for i = 0, . . . , n − 1,

    Δi e_x^{k+1} = e_{x,i+1}^{k+1} − e_{x,i}^{k+1} = (x_{i+1} − x_{i+1}^{k+1}) − (x_i − x_i^{k+1})
                 = (x_{i+1} − x_i) − (x_{i+1}^{k+1} − x_i^{k+1}) = Δi x − Δi x^{k+1}.

It follows that

    Δi e_x^{k+1} = Δi x − Δi x^{k+1} = ∫_{ti}^{t_{i+1}} [g(t, x(t), u(t)) − G(ti, x_i^{k+1}, u_i^k)] dt.    (3.24)

To analyze the preceding difference, subtract and add both of the quantities g(ti, x(ti), u(ti)) and g(ti, x_i^{k+1}, u_i^k). First, using the continuity of g, x, and u, we get

    |g(t, x(t), u(t)) − g(ti, x(ti), u(ti))| ≤ ωg(h),    (3.25)

where ωg(h) is the oscillation of the function g(t, x(t), u(t)) considered as a function of t; that is,

    ωg(h) = sup{ |g(r, x(r), u(r)) − g(s, x(s), u(s))| : r, s ∈ [0, T], |r − s| ≤ h }.

Second, by the Lipschitz condition on g we get

    |g(ti, x(ti), u(ti)) − g(ti, x_i^{k+1}, u_i^k)| ≤ Lgx |x(ti) − x_i^{k+1}| + Lgu |u(ti) − u_i^k|
                                                  ≤ Lgx |e_{x,i}^{k+1}| + Lgu |e_{u,i}^k|.    (3.26)

Third, by the definition of η in (3.20), we have

    |g(ti, x_i^{k+1}, u_i^k) − G(ti, x_i^{k+1}, u_i^k)| ≤ ηg(h).    (3.27)

Putting the three pieces (3.25)–(3.27) together, we have

    |Δi e_x^{k+1}| ≤ Lgx h |e_{x,i}^{k+1}| + Lgu h |e_{u,i}^k| + o1(h)    (3.28)

for i = 0, . . . , n − 1, where o1(h) = h ωg(h) + h ηg(h). Note that

    e_{x,j}^{k+1} = e_{x,0}^{k+1} + Σ_{i=0}^{j−1} Δi e_x^{k+1},    e_{x,0}^{k+1} = 0.

This and (3.28) imply

    |e_{x,j}^{k+1}| ≤ Σ_{i=0}^{j−1} [Lgx h |e_{x,i}^{k+1}| + Lgu h |e_{u,i}^k| + o1(h)].    (3.29)


Sum both sides of (3.29) over j and change the order of summation to get

    Σ_{j=0}^{n} |e_{x,j}^{k+1}| ≤ Σ_{j=0}^{n} Σ_{i=0}^{j−1} [Lgx h |e_{x,i}^{k+1}| + Lgu h |e_{u,i}^k| + o1(h)]
                               = Σ_{i=0}^{n} (n − i) [Lgx h |e_{x,i}^{k+1}| + Lgu h |e_{u,i}^k| + o1(h)].

Multiply both sides of this inequality by h = T/n. Note that (n − i)h ≤ nh = T. Using the notation (3.23) for the "average" errors, we get

    E_x^{k+1} ≤ T [Lgx E_x^{k+1} + Lgu E_u^k + n · o1(h)].    (3.30)

So if T Lgx < 1, then

    E_x^{k+1} ≤ T [Lgu E_u^k + n · o1(h)] / (1 − T Lgx).    (3.31)

Thus we have a bound for the errors in x written in terms of the errors in u.

• Inequality for e_{λ,j}^k

Next we derive a similar inequality for the errors in λ. We use the equation in (3.2) for λ and that in (3.21) for λ^k to get, for j = n, . . . , 1,

    δ_{j−1} e_λ^{k+1} = δ_{j−1} λ − δ_{j−1} λ^{k+1}
        = ∫_{tj}^{t_{j−1}} [ϕ(t, x(t), u(t), λ(t)) − F(tj, xj^{k+1}, uj^k, λj^{k+1})] dt
        = ∫_{tj}^{t_{j−1}} [ϕ(t, x(t), u(t), λ(t)) − ϕ(tj, x(tj), u(tj), λ(tj))] dt
        + ∫_{tj}^{t_{j−1}} [ϕ(tj, x(tj), u(tj), λ(tj)) − ϕ(tj, xj^{k+1}, uj^k, λj^{k+1})] dt
        + ∫_{tj}^{t_{j−1}} [ϕ(tj, xj^{k+1}, uj^k, λj^{k+1}) − F(tj, xj^{k+1}, uj^k, λj^{k+1})] dt.

Using a computation similar to (3.28), we get

    |δ_{j−1} e_λ^{k+1}| ≤ Lϕx h |e_{x,j}^{k+1}| + Lϕu h |e_{u,j}^k| + Lϕλ h |e_{λ,j}^{k+1}| + o2(h),

where o2(h) = h ωϕ(h) + h ηϕ(h). Recall that ωϕ(h) is the oscillation function of ϕ(t, x(t), u(t), λ(t)), ηϕ(h) is defined as in (3.20), and Lϕx, Lϕu, Lϕλ are the Lipschitz constants of ϕ = h1(t, x, u) + λ h2(t, x, u) with respect to x, u, λ, respectively. Since

    e_{λ,j−1}^{k+1} = e_{λ,n}^{k+1} + Σ_{i=j}^{n} δ_{i−1} e_λ^{k+1}


and e_{λ,n}^{k+1} = 0, by the triangle inequality we have

    |e_{λ,j−1}^{k+1}| ≤ Σ_{i=j}^{n} |δ_{i−1} e_λ^{k+1}| ≤ Σ_{i=j}^{n} [Lϕx h |e_{x,i}^{k+1}| + Lϕu h |e_{u,i}^k| + Lϕλ h |e_{λ,i}^{k+1}| + o2(h)].    (3.32)

The next step is to rewrite (3.32) so that the errors in λ appear on the left side only.

• Eliminate e_{λ,j}^k

We need the following discrete Gronwall inequality for sequences fn, pn, and kn.

Lemma 3.5  Assume that g0 ≥ 0, pn ≥ 0 and kn ≥ 0 for n ≥ 0, and that f0 ≤ g0 and, for n ≥ 1,

    fn ≤ g0 + Σ_{j=0}^{n−1} pj + Σ_{j=0}^{n−1} kj fj.

Then fn ≤ (g0 + Σ_{j=0}^{n−1} pj) exp(Σ_{j=0}^{n−1} kj).

A proof can be found in Quarteroni and Valli [40], p. 14. Apply this lemma to (3.32) (backwards) with g0 = 0, pj = Lϕx h |e_{x,j}^{k+1}| + Lϕu h |e_{u,j}^k| + o2(h) and kj = Lϕλ h to get

    |e_{λ,j−1}^{k+1}| ≤ M_{j−1} Σ_{i=j}^{n} [Lϕx h |e_{x,i}^{k+1}| + Lϕu h |e_{u,i}^k| + o2(h)],    j = n, . . . , 1,    (3.33)

where Mj = exp(Lϕλ h(n − j)). Note that Mj ≤ M0 = exp(T Lϕλ) because hn = T. This gives a bound for the errors in λ written in terms of the errors in x and u.

• Show E_u^k → 0

By the third equation in (3.21) we obtain, for j = 0, . . . , n,

    |e_{u,j}^{k+1}| ≤ Lh3 ( |e_{x,j}^{k+1}| + |e_{λ,j}^{k+1}| ).    (3.34)

Replace j by j + 1 in (3.33) and substitute it into (3.34) to get

    |e_{u,j}^{k+1}| ≤ Lh3 [ |e_{x,j}^{k+1}| + Mj Σ_{i=j+1}^{n} (Lϕx h |e_{x,i}^{k+1}| + Lϕu h |e_{u,i}^k| + o2(h)) ].    (3.35)

Sum (3.35) over j from j = 0 to j = n to get

    Σ_{j=0}^{n} |e_{u,j}^{k+1}| ≤ Lh3 [ Σ_{j=0}^{n} |e_{x,j}^{k+1}| + Σ_{j=0}^{n} Σ_{i=j+1}^{n} Mj (Lϕx h |e_{x,i}^{k+1}| + Lϕu h |e_{u,i}^k| + o2(h)) ]
        ≤ Lh3 [ Σ_{j=0}^{n} |e_{x,j}^{k+1}| + Σ_{i=1}^{n} Σ_{j=0}^{i−1} Mj (Lϕx h |e_{x,i}^{k+1}| + Lϕu h |e_{u,i}^k| + o2(h)) ]
        = Lh3 [ Σ_{i=1}^{n} Ki |e_{x,i}^{k+1}| + Σ_{i=1}^{n} Lϕu Ni h |e_{u,i}^k| + o2(h) Σ_{i=1}^{n} Ni ],    (3.36)

where, for i = 1, . . . , n,

    Ni = Σ_{j=0}^{i−1} Mj = [exp(Lϕλ h(n + 1)) − exp(Lϕλ h(n − i + 1))] / [exp(Lϕλ h) − 1],    Ki = 1 + Lϕx Ni h.

Note that Ni ≤ i M0. It follows that

    Ki ≤ 1 + T M0 Lϕx,    Σ_{i=1}^{n} Ni ≤ (1/2) n(n + 1) M0 ≤ n² M0.

Now (3.36) implies that

    Σ_{j=0}^{n} |e_{u,j}^{k+1}| ≤ Lh3 [ (1 + T M0 Lϕx) Σ_{j=0}^{n} |e_{x,j}^{k+1}| + M0 n h Σ_{i=1}^{n} |e_{u,i}^k| + n² M0 o2(h) ].    (3.37)

Multiplying both sides of (3.37) by h, and using T = nh and the definition of the average errors, we get

    E_u^{k+1} ≤ Lh3 [ (1 + T M0 Lϕx) E_x^{k+1} + M0 T E_u^k + o2(h) T² M0 h⁻¹ ].

Combining this with (3.31), we get, for k = 0, 1, 2, . . . ,

    E_u^{k+1} ≤ B E_u^k + o3(h),    (3.38)

where

    B = Lh3 M0 T + Lh3 (1 + T M0 Lϕx) T Lgu / (1 − T Lgx),
    o3(h) = Lh3 M0 T² o2(h)/h + Lh3 (1 + T M0 Lϕx) T² o1(h) / [(1 − T Lgx) h].

Iterating (3.38), we obtain

    E_u^k ≤ B^k E_u^0 + Σ_{i=0}^{k} B^i o3(h).

Note that |B| < 1 when either T or the Lipschitz constants are small enough. Therefore B^k → 0 and Σ_{i=0}^{k} B^i ≤ 1/(1 − B) is bounded. Moreover, the definitions of o1(h) and o2(h) imply that

    o1(h)/h → 0,    o2(h)/h → 0


as h → 0, which implies that o3(h) → 0. Therefore E_u^k → 0 as k → ∞ and h → 0. All the pieces are now in place. By (3.31), we also get E_x^k → 0 as k → ∞ and h → 0. Going back to (3.29), we see that

    max_{j=0,...,n} |e_{x,j}^{k+1}| ≤ Lgx E_x^{k+1} + Lgu E_u^k + T o1(h)/h → 0

as k → ∞ and h → 0. From (3.35) we see that

    max_{j=0,...,n} |e_{u,j}^{k+1}| ≤ Lh3 [ max_{j=0,...,n} |e_{x,j}^{k+1}| + M0 (Lϕx E_x^{k+1} + Lϕu E_u^k + T o2(h)/h) ] → 0.

Finally, from (3.33) we see that, as k → ∞ and h → 0,

    max_{j=0,...,n} |e_{λ,j}^{k+1}| ≤ M0 [ Lϕx E_x^{k+1} + Lϕu E_u^k + T o2(h)/h ] → 0.

This finishes the proof.  □

Remark 3.6
1. The proof for the alternative approximating equation

    Δj x^(k+1) = h G(tj, xj^k, uj^k),    x0^(k+1) = a,    (3.39)

is similar. In this case, (3.30) is replaced by

    E_x^{k+1} ≤ T [Lgx E_x^k + Lgu E_u^k + n · o1(h)],

and we no longer have (3.31). Then (3.38) is replaced by E_u^{k+1} ≤ A0 E_x^k + B0 E_u^k + o4(h), with A0, B0 and o4(h) similar expressions in T and the Lipschitz constants. So we get the following iterative inequality for the average errors:

    (E_x^{k+1}, E_u^{k+1})ᵀ ≤ A (E_x^k, E_u^k)ᵀ + C,

where A is the 2 × 2 matrix with rows (T Lgx, T Lgu) and (A0, B0), and C = (T n o1(h), o4(h))ᵀ. Under the same conditions, we have ‖A‖ < 1 and C → 0, which imply the desired results.

2. Lenhart and Workman [38] show how to use the FBSM for problems with bounded controls, a ≤ u(t) ≤ b, by changing the characterization of u from, say, u(t) = h3(t, x(t), λ(t)) in (3.3) to u(t) = min(b, max(a, h3(t, x(t), λ(t)))). Note that the function min(b, max(a, h3(t, x, λ))) is also Lipschitz, because the functions min{a, x} and max{a, x} are Lipschitz and the composition of Lipschitz functions is Lipschitz. So Assumption (A) continues to hold and the FBSM is applicable.
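The projected update in item 2 is a one-line clamp. A quick numerical sanity check of the Lipschitz property used above (the bounds [−1, 1] and the sample grid are arbitrary illustrative choices):

```python
def clamp(v, a, b):
    """Bounded-control update of Remark 3.6.2: project the value of h3 onto [a, b]."""
    return min(b, max(a, v))

# the projection never increases distances, so it is 1-Lipschitz
samples = [k / 10.0 for k in range(-30, 31)]
lipschitz_ok = all(
    abs(clamp(v1, -1.0, 1.0) - clamp(v2, -1.0, 1.0)) <= abs(v1 - v2) + 1e-12
    for v1 in samples for v2 in samples
)
```

Because the clamp is 1-Lipschitz, composing it with h3 leaves the Lipschitz constant Lh3 unchanged, which is why Assumption (A) survives the modification.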


4 Example

The following simple linear-quadratic problem has been used as an example in several papers; see, for example, Vlassenbroeck and Van Dooren [46]:

    min_u (1/2) ∫_0^1 [x(t)² + u(t)²] dt    (4.1)

subject to the state equation

    x′(t) = −x(t) + u(t),    x(0) = 1.    (4.2)

The Maximum Principle can be used to construct an analytic solution. The costate equation is λ′(t) = λ(t) − x(t), and the optimality condition on the Hamiltonian gives u(t) = −λ(t). Together with the state equation, the result is the following linear differential-algebraic system:

    x′(t) = −x(t) + u(t),    x(0) = 1,    (4.3)
    λ′(t) = λ(t) − x(t),    λ(1) = 0,    (4.4)
    u(t) = −λ(t).    (4.5)

The solution is

    x(t) = [√2 cosh(√2(t − 1)) − sinh(√2(t − 1))] / [√2 cosh(√2) + sinh(√2)]

and

    λ(t) = − sinh(√2(t − 1)) / [√2 cosh(√2) + sinh(√2)].

The optimal value of the objective functional is J = 0.1929092981. The final value of the state is x(1) = 0.2819695346 and the initial value of the co-state is λ(0) = 0.3858185962. The numerical computations of the FBSM algorithm were implemented in Mathematica. The initial guess for the control is u ≡ 0. The differential equation solver used is fourth-order Runge-Kutta on the interval [0, 1], partitioned into N subintervals. The stopping criterion is determined by computing the relative errors for the state, the co-state, and the control, and requiring that all three be less than a specified value δ. The desired relative error for the state variable, for example, is ‖x^k − x^{k−1}‖ / ‖x^k‖.
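The closed-form solution and the reported values can be checked directly. A short sketch (the trapezoidal rule and the number of sample points are arbitrary implementation choices, not those used in the paper):

```python
import math

r2 = math.sqrt(2.0)
den = r2 * math.cosh(r2) + math.sinh(r2)

def x_exact(t):
    # x(t) = [√2 cosh(√2(t−1)) − sinh(√2(t−1))] / [√2 cosh(√2) + sinh(√2)]
    return (r2 * math.cosh(r2 * (t - 1.0)) - math.sinh(r2 * (t - 1.0))) / den

def lam_exact(t):
    # λ(t) = −sinh(√2(t−1)) / [√2 cosh(√2) + sinh(√2)]
    return -math.sinh(r2 * (t - 1.0)) / den

# J = (1/2) ∫₀¹ (x² + u²) dt with u = −λ, via the composite trapezoidal rule
n = 2000
h = 1.0 / n
vals = [0.5 * (x_exact(j * h) ** 2 + lam_exact(j * h) ** 2) for j in range(n + 1)]
J = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
```

The quadrature reproduces J = 0.1929092981 to well within the trapezoidal error, and the endpoint values x(1) and λ(0) match the figures quoted above.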