How to Verify Optimal Controls Computed by Direct Shooting Methods? – A Tutorial

Ralf Hannemann-Tamás^{a,b}, Wolfgang Marquardt^{a,*}

^a Aachener Verfahrenstechnik – Process Systems Engineering, RWTH Aachen University, Turmstr. 46, 52064 Aachen, Germany
^b German Research School for Simulation Sciences GmbH, 52425 Jülich, Germany
Abstract

For the solution of optimal control problems, direct methods have become established in the process engineering community. If set up correctly, they robustly provide more or less accurate approximations of the exact solution. In usual engineering practice, however, neither the distance to the exact solution is considered, nor is compliance with the continuous necessary conditions in the form of Pontryagin's Minimum Principle checked. In the end, some approximate solution is available, but its quality is in question. This tutorial addresses the problem of verifying optimal controls computed by direct shooting methods. We focus on this popular transcription method, though the results are also relevant for other solution strategies. We review known results spread across the mathematical literature on optimal control to show how the output of the nonlinear programs (NLPs) resulting from single shooting transcriptions of optimal control problems can be interpreted in the context of Pontryagin's Minimum Principle. In particular, we show how to approximate continuous adjoint variables by means of the dual information provided by the NLP solver. Based on this adjoint approximation, we use a multi-level setting to construct an estimate of the distance to a true extremal solution satisfying the continuous necessary conditions of optimality. A comprehensive case study illustrates the theoretical results.

Keywords: optimal control, direct single shooting, indirect multiple shooting,
This work was supported by the German Research Foundation (DFG) under grant MA 1188/28-1.
* Corresponding author: [email protected]

Preprint submitted to Journal of Process Control, November 3, 2013
multipoint boundary value problem, Pontryagin, adjoints

1. Introduction

Optimal control problems arise in many engineering applications. Originally dealing with problems in aerospace engineering, optimal control and especially nonlinear model predictive control (NMPC) are nowadays popular research topics in process control. Usually, optimal control problems are solved (approximately) by so-called direct methods, where the continuous problem is discretized. A brief review of the most popular direct methods has been compiled by Binder et al. (2001). If set up correctly, direct methods robustly provide more or less accurate approximations of the exact solution. In current practice, neither the distance to the exact solution is considered, nor is compliance with the continuous necessary conditions in the form of Pontryagin's Minimum Principle checked. Consequently, there is no way to assess the quality of the approximate solution, which is at least unsatisfactory or even unacceptable from both a theoretical and a practical point of view.

Within the class of direct methods, direct single shooting, also referred to as control vector parametrization, is popular in engineering practice. Here, only the control vector is parameterized to transcribe the infinite-dimensional optimal control problem into a finite-dimensional nonlinear program (NLP). Typically, this NLP is smooth and can be solved by gradient-based algorithms like sequential quadratic programming (SQP) or interior-point methods. Usually, these nonlinear programming methods not only provide the optimal solution but also dual information in terms of the Lagrange multipliers of the well-known Karush-Kuhn-Tucker necessary conditions of optimality. The aim of this tutorial is to show how this dual information can be utilized to estimate the distance of the direct single shooting solution to a true extremal of the continuous (Pontryagin-type) necessary conditions of optimality.

Of course, an essential prerequisite for such an estimate is the convergence of the finite-dimensional nonlinear program to a Pontryagin extremal of the original optimal control problem. In the past, several publications addressed the latter problem: The early work of Daniel (1969) derives some convergence results for the direct method of Rosen (1966), a variant of the simultaneous approach using the explicit Euler discretization. Alt (1984), Dontchev (1996) and Malanowski et al. (1998) also derive convergence relations for the
explicit Euler discretization applied to a wider class of optimal control problems. Pytlak (1999) proposes an algorithm and a proof of convergence for state-constrained optimal control problems in a space of relaxed controls, and shows how his theoretical results can be implemented by means of some kind of sequential approach. Hager (2000) proves convergence for control-constrained problems discretized by Runge-Kutta methods. Dontchev and Hager (2001) address the explicit Euler discretization in the context of the simultaneous control and state variable discretization for optimal control problems with pure state and without control constraints; they prove the convergence of the nonlinear program's Lagrange multipliers to the Pontryagin adjoints if the true optimal control profile is Lipschitz continuous. Kameswaran and Biegler (2008) investigate direct collocation with respect to consistency and convergence to the continuous optimal solution. For readers interested in a general theory of optimal control, we refer to the monograph of Vinter (2010), who gives an almost self-contained introduction to optimal control and presents proofs of different Maximum Principles with a strong emphasis on set-valued analysis.

All these investigations rely on a common principle: at least the control vector is discretized on a grid, the fineness of which can be measured by some positive number ∆ > 0. Taking the limit ∆ → 0, it is shown that the solution of the discretized problem converges in some norm¹ to the true solution. In general, no criteria for the distance of the approximation to the true solution are derived. A remarkable exception is the work of Dontchev (1996), who considers the Euler discretization and computes a priori estimates for the distance of the approximations to the true solution. Accordingly, in this tutorial, we are not only focusing on the convergence of direct methods but also on quantifying the distance of an approximate solution to a Pontryagin-type extremal, subsequently referred to as the true solution. In particular, we want to show how a measure² can be defined to estimate the distance of a given approximate solution to the true solution in the context of direct single shooting.

Early descriptions of the single shooting approach comprise the works of Brusch (1974) and Sargent and Sullivan (1978). Vassiliadis et al. (1994a,b)

¹ Usually, the convergence of the optimal control is regarded with respect to the L∞ norm, whereas the convergence of the state is measured by means of the W^{1,∞} norm.
² This measure is defined by a calculation rule. In particular, it is not a measure in the sense of mathematical measure theory.
use direct single shooting to solve multi-stage optimal control problems. Adaptive direct single shooting techniques for single- and multi-stage optimal control problems are described in the contributions of Schlegel et al. (2005) and Schlegel and Marquardt (2006). However, though direct single shooting is popular in engineering practice and a number of different implementations exist, to the authors' knowledge no currently available direct single shooting code estimates to what extent the computed solution satisfies the necessary conditions of optimality in the sense of Pontryagin's Minimum Principle (Pontryagin et al., 1962). Hence, we do not know how close a given solution computed by some optimal control algorithm really is to the true solution.

To check Pontryagin's Minimum Principle by means of an approximation resulting from a direct method, we need to estimate adjoint variables, and possibly multiplier functions for constrained problems. In the past, different approaches were suggested to utilize information provided by the NLP solvers in direct methods to estimate continuous adjoints. Von Stryk and Bulirsch (1992) use a full discretization approach based on collocation and provide estimates for the adjoint variables. Grimm (1993) and Grimm and Markl (1997) provide estimates for the adjoints based on direct multiple shooting³. Büskens (1998) implements the direct single shooting technique to provide an a posteriori estimate of the adjoints. We will demonstrate an alternative approach, related to the technique of Grimm and Markl (1997), to obtain accurate estimates of adjoint variables and associated multiplier functions. This information is used to construct a measure of the distance to the true solution complying with Pontryagin's Minimum Principle. This measure relies on a multi-level solution strategy, where the optimal control problem is solved with increasingly fine discretizations by direct single shooting. This measure can then be used to decide when the approximate solution is sufficiently close to the true solution, i.e., when to stop the refinement.

The paper is organized as follows. The class of optimal control problems to be considered is introduced in Section 2. In Section 3, we state a proven variant of the Minimum Principle for optimal control problems with pure state path constraints, mixed control-state constraints and simple bounds on
³ The term direct multiple shooting refers to an NLP-based approach for the solution of optimal control problems as introduced by Bock and Plitt (1984); it should not be mixed up with (indirect) multiple shooting, a technique for the solution of multipoint boundary value problems (Osborne, 1969).
the controls. The direct transcription of optimal control problems by means of the single shooting approach is sketched in Section 4. In Section 5, we introduce composite adjoints, which are later used to estimate the continuous adjoint variables. In Section 6, we introduce a conjecture which states the convergence of composite adjoints and associated Lagrange multipliers to the adjoints of the Minimum Principle presented in Section 3. These convergence results are used in Section 7 to construct a measure for the distance to the true solution, which can be utilized for the verification of solutions stemming from direct single shooting. The case study of a nontrivial optimal control problem in Section 8 illustrates the conjecture and the usefulness of the verification procedure. Finally, in Section 9, we draw some conclusions.

2. A Class of Optimal Control Problems

Though the techniques presented in this tutorial can be extended to address multi-stage optimal control problems⁴ for differential-algebraic systems, we only consider a single-stage formulation for ordinary differential equations, since the assessment of the necessary conditions of optimality is much simpler in this case than in the case of multi-stage problems with differential-algebraic equation (DAE) models.

Let L∞([t0, tf], Rⁿ) denote the space of essentially bounded, measurable functions mapping [t0, tf] to Rⁿ, and W^{1,∞}([t0, tf], Rⁿ) denote the subspace of absolutely continuous functions⁵ with essentially bounded derivatives⁶ ⁷.
⁴ In multi-stage problems, we have a finite set K = {1, . . . , nK} of indices such that for each index i ∈ K we have possibly different right-hand sides fi(xi, ui), different constraints of type (4)–(8) and some rules mapping the state vector from stage i to the next stage i + 1. For readers not familiar with multi-stage problems we refer to Vassiliadis et al. (1994a,b).
⁵ Absolute continuity is a stronger type of continuity. A function f : [t0, tf] → Rⁿ is called absolutely continuous if for every ε > 0 there exists a positive number δ > 0 such that for each finite sequence of pairwise disjoint subintervals (sk, tk) with sk, tk ∈ [t0, tf], the inequality Σk |tk − sk| < δ implies Σk ‖f(tk) − f(sk)‖ < ε. In particular, every absolutely continuous function is uniformly continuous and its derivative exists except on a set of zero measure.
⁶ Essentially bounded means bounded everywhere except on a set of zero measure.
⁷ Actually, the space W^{1,∞}([t0, tf], Rⁿ) can be characterized as the set of all Lipschitz continuous functions from [t0, tf] to Rⁿ, which means that x(·) ∈ W^{1,∞}([t0, tf], Rⁿ) iff there exists a Lipschitz constant K > 0 such that ‖x(t2) − x(t1)‖ ≤ K|t2 − t1| for all t1, t2 ∈ [t0, tf]. It is easy to see that every Lipschitz continuous function is absolutely continuous: in the situation of Footnote 5, let ε > 0 be given and take δ = ε/K.
In general, we are searching for a pair (x(·), u(·)) ∈ W^{1,∞}([t0, tf], R^{nx}) × L∞([t0, tf], R^{nu}) which solves the optimal control problem

  min_{x(·),u(·)}  Φ(x(tf))                                      (1)
  s. t.  ẋ(t) = f(x(t), u(t))  almost everywhere⁸ (a.e.),        (2)
         x(t0) = x0 ∈ R^{nx},                                    (3)
         s(x(t)) ≤ 0 ∈ R^{ns}  ∀ t ∈ [t0, tf],                   (4)
         c(x(t), u(t)) ≤ 0 ∈ R^{nc}  ∀ t ∈ [t0, tf],             (5)
         g(x(tf)) = 0 ∈ R^{ng},                                  (6)
         h(x(tf)) ≤ 0 ∈ R^{nh},                                  (7)
         u(t) ∈ U := [umin, umax] ⊂ R^{nu}  ∀ t ∈ [t0, tf].      (8)
Here, Φ : R^{nx} → R, f : R^{nx} × R^{nu} → R^{nx}, s : R^{nx} → R^{ns}, c : R^{nx} × R^{nu} → R^{nc}, g : R^{nx} → R^{ng} and h : R^{nx} → R^{nh} are sufficiently smooth functions. The variables t0 and tf denote the initial and final time, respectively. The objective functional in eq. (1) is of Mayer type. The dynamics of the system is given by eq. (2), where the vector x0 ∈ R^{nx} in eq. (3) prescribes fixed initial values. Eqs. (4) and (5) prescribe the pure state and mixed control-state constraints, respectively. Equality and inequality endpoint constraints are stated in eqs. (6) and (7). In eq. (8), the notation [umin, umax] is defined by the Cartesian product

  [umin, umax] := Π_{i=1}^{nu} [umin,i, umax,i],
where umin, umax ∈ R^{nu} are the control bounds. Since the mixed control-state constraint (5) imposes additional restrictions on the control vector, the sets

  U(x) := {u ∈ R^{nu} | umin ≤ u ≤ umax, c(x, u) ≤ 0},   x ∈ R^{nx},        (9)
Furthermore, at points where the first-order derivative exists, the associated difference quotient and therewith also the derivative itself are bounded by the Lipschitz constant K. On the other hand, if an absolutely continuous function has an essentially bounded derivative, say ‖ẋ‖∞ = K < ∞, where ‖·‖∞ denotes the essential supremum norm, then ‖x(t2) − x(t1)‖ = ‖∫_{t1}^{t2} ẋ(t) dt‖ ≤ ∫_{t1}^{t2} ‖ẋ‖∞ dt = K|t2 − t1|.
⁸ Almost everywhere is a concept stemming from measure theory. A property which holds almost everywhere is valid everywhere except on a set of zero measure. For example, at discontinuities of u(t), the differential equation (2) may not be valid. But then, the set of discontinuity points of u(t) should be of zero measure. For the reader interested in general measure theory we refer to the book of Elstrodt (2009) or any other textbook on measure and integration theory.
are of special interest, since

  u(t) ∈ U(x(t))                                                           (10)

has to hold for all feasible control trajectories u(t); in particular, condition (10) is equivalent to eqs. (5) and (8).

If it exists, the optimal solution of the infinite-dimensional optimal control problem (1)-(8) is characterized by the necessary conditions of optimality. Necessary conditions of optimality have been derived by Gerdts (2005). He treats optimal control problems with semi-explicit DAE constraints of index one, assuming that the path constraints (4) involve only differential variables. Necessary conditions for two-stage optimal control problems with ODE constraints have been derived by Tomiyama (1985). Hartl et al. (1995) provide an excellent survey of different maximum principles for single-stage problems subject to ODE constraints; some of them are also applicable to our formulation. Before we discuss a typical implementation of the direct single shooting method, we present the necessary conditions of optimality for the true solution.

3. Pontryagin's Minimum Principle

There are a couple of proven or informal minimum principles for optimal control problems with state constraints. Hartl et al. (1995) also present a proven minimum principle for the type of optimal control problems given by eqs. (1)-(8).

3.1. Preliminaries

To formulate the Minimum Principle of Pontryagin or, in other words, the necessary conditions of optimality, we introduce the Hamiltonian

  H(x, λ, u) := λ^T f(x, u),                                               (11)
and the extended Hamiltonian

  Ĥ(x, λ, u, µL, µU, γ, η) := H(x, λ, u) + µL^T (umin − u) + µU^T (u − umax) + γ^T c(x, u) + η^T s(x),   (12)

where x, λ ∈ R^{nx}, u ∈ R^{nu}, µL, µU ∈ R^{nu}, γ ∈ R^{nc}, η ∈ R^{ns}. If no ambiguity is possible, we denote expressions related to the trajectories x(t), u(t), λ(t), . . . , by the [t]-notation, e.g., H[t] := H(x(t), λ(t), u(t)).
Further, we denote limits from the left or right at the time t = t1 by the arguments t1− and t1+, respectively, e.g.,

  λ(t1−) = lim_{ε>0, ε→0} λ(t1 − ε),   λ(t1+) = lim_{ε>0, ε→0} λ(t1 + ε).
To formulate some regularity conditions, we introduce the concepts of interior and boundary intervals, entry and exit times, as well as the order of a pure state constraint. Let τ1 < τ2 be real numbers. A subinterval (τ1, τ2) ⊂ [t0, tf] is called an interior interval of the trajectory x(·) with respect to the pure state constraint si if si(x(t)) < 0 for all t ∈ (τ1, τ2). In contrast, a subinterval [τ1, τ2] ⊂ [t0, tf] is called a boundary interval if si(x(t)) = 0 for all t ∈ [τ1, τ2]. Let t0 < τ1 < τ2 < tf and [τ1, τ2] be a maximal boundary interval. Then, τ1 is called entry time and τ2 is called exit time; taken together, they are called junction times. We will not discuss the case of a boundary point τ (Jacobson et al., 1971), where s(x(τ)) = 0 but x(·) is in the interior just before and just after τ.

To analyze the optimal control within a boundary interval [τ1, τ2], one has to introduce the concept of the order of the corresponding state constraint. We assume that, for a particular index i, the additional equation si(x(t)) = 0 uniquely determines one control variable, say u1, on [τ1, τ2]. Since si(x(t)) = 0 does not explicitly depend on u1, we have to differentiate this equation recursively with respect to t. We define

  s_i^0 := si,   s_i^{j+1} := (∂s_i^j/∂x) f(·),   j = 0, 1, . . . .

We say si(x(t)) ≤ 0 is of order q ≥ 1 iff

  ∂s_i^{q−1}/∂u ≡ 0   but   ∂s_i^q/∂u ≠ 0.

If the order of a state constraint is q, then, in many cases, the equation s_i^q(x, u) = 0 may be used to eliminate one degree of freedom (e.g., u1 = u1(x, (uj)_{j≠1})) on a boundary interval.
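To make the notion of order concrete, here is a small illustrative example of our own (not part of the problem data used later): consider the double integrator ẋ1 = x2, ẋ2 = u with the pure state constraint s(x) = x1 − xmax ≤ 0. Then

  s^0 = x1 − xmax,   s^1 = (∂s^0/∂x) f = x2,   s^2 = (∂s^1/∂x) f = u,

so that ∂s^1/∂u ≡ 0 but ∂s^2/∂u = 1 ≠ 0. Hence the constraint is of order q = 2, and on a boundary interval the condition s^2(x, u) = u = 0 determines the control.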
3.2. Global assumptions

In order to state the necessary conditions of optimality we have to require some assumptions (cf. Hartl et al. (1995)).

Assumption 1. The gradients of the equality constraints in eq. (6) and of the active inequality constraints in eq. (7) with respect to x are linearly independent:

  rank [ gx[tf]   0
         hx[tf]   diag(h[tf]) ] = ng + nh.⁹

Assumption 2. The gradients of the mixed control-state constraints in eq. (5) and the control bounds in eq. (8) with respect to u are linearly independent along the optimal trajectories x∗(t) and u∗(t):

  rank [ cu[t]   diag(c[t])   0                   0
         −I      0            diag(umin − u[t])   0
         I       0            0                   diag(u[t] − umax) ] = nc + 2 nu   ∀ t ∈ [t0, tf].

Assumption 3. Let qi denote the order of the pure state path inequality constraint si(x) ≤ 0 and let s^q := (s_1^{q1}, . . . , s_{ns}^{q_{ns}})^T. We require that the gradients of s_i^{qi}, i = 1, . . . , ns, with respect to x are linearly independent:

  rank [ (∂s^q/∂x)[t]   diag(s[t]) ] = ns   ∀ t ∈ [t0, tf].

In the following, we tacitly assume the validity of Assumptions 1 to 3 for the optimal control problems considered in this work.
⁹ The term diag(h[tf]) in the second block column has the purpose of testing only the active inequality constraints with respect to linear independence. For example, if all inequality constraints are active, then diag(h[tf]) = 0 and

  rank [ gx[tf]  0 ; hx[tf]  diag(h[tf]) ] = rank [ gx[tf] ; hx[tf] ].

On the other hand, if no inequality constraint is active, then the rows belonging to h are linearly independent by virtue of the nonzero diagonal entries, and the condition reduces to rank(gx[tf]) = ng. The same notational trick is employed in the statement of Assumptions 2 and 3.
3.3. The Minimum Principle

Now, we are able to state a Minimum Principle for problem (1)-(8).

Theorem 1. Let x∗(t), u∗(t) be an optimal solution of problem (1)-(8). Then, there exist

• a nonnegative multiplier λ0 ∈ R, associated with the objective function Φ(x(tf)),
• a multiplier ρ ∈ R^{ng}, associated with the constraint g(x(tf)) = 0,
• a multiplier ζ ∈ R^{nh}, associated with the constraint h(x(tf)) ≤ 0,
• a right-continuous function λ : [t0, tf] → R^{nx}, the adjoint variables, also called costates,
• right-continuous functions µL, µU : [t0, tf] → R^{nu}, associated with the bound constraints umin ≤ u(·) ≤ umax,
• a right-continuous function γ : [t0, tf] → R^{nc}, associated with the mixed control-state constraints c(x(t), u(t)) ≤ 0,
• a right-continuous, nonincreasing function of bounded variation η : [t0, tf] → R^{ns}, which can be normalized to η(tf) = 0, and which is associated with the pure state constraint s(x(·)) ≤ 0,

such that the following statements hold:

  (λ0, λ(t), µL(t), µU(t), γ(t), η(tf) − η(t0), ρ, ζ) ≠ 0  for all t ∈ [t0, tf],   (13)

  u∗(t) ∈ arg min_{u ∈ U(x(t))} H(x∗(t), λ(t), u)  almost everywhere (a.e.),       (14)

  Ĥu[t] = 0  a.e.,                                                                 (15)

  µL, µU ≥ 0  and  µL^T (umin − u(·)) = 0,  µU^T (u(·) − umax) = 0  a.e.,           (16)

  γ ≥ 0  and  γ(t)^T c(x(t), u(t)) = 0  a.e.,                                      (17)
  λ^T(tf−) = λ0 Φx[tf] + ρ^T gx[tf] + ζ^T hx[tf],                                  (18)

  ρ^T g(x(tf)) = 0,                                                                (19)

  ζ ≥ 0,  ζ^T h(x(tf)) = 0.                                                        (20)
On intervals (t1, t2) ⊂ [t0, tf], where si[t] < 0 holds, we have

  ηi(t) = const.                                                                   (21)

Furthermore,

  λ^T(t2+) − λ^T(t1+) = ∫_{t1}^{t2} (−Hx[t] − γ(t)^T cx[t]) dt + ∫_{(t1,t2]} sx[t] dη(t),   (22)

  H[t] = const.                                                                    (23)
For a proof of this theorem we refer to Hartl et al. (1995), Theorem 4.2. The rightmost term in eq. (22) refers to a Lebesgue-Stieltjes integral using the shorthand notation

  ∫_{(t1,t2]} sx[t] dη(t) := Σ_{i=1}^{ns} ∫_{(t1,t2]} (si)x[t] dηi(t).
For readers not familiar with Lebesgue-Stieltjes integrals, we refer to the book of Elstrodt (2009) or any other advanced textbook on measure and integration theory.

Remark 1. If λ0 > 0 holds, the multiplier can be normalized to λ0 = 1. This situation is called the normal case in the literature, while the case λ0 = 0 is called abnormal.

Assumption 4. In the following, we assume the normal case, i.e., λ0 = 1.
Remark 2. Let Γ be the antiderivative of the multiplier function γ, multiplied by (−1)¹⁰, which is defined by

  Γ(t) := ∫_t^{tf} γ(s) ds.

Obviously, Γ is of bounded variation and normalized with Γ(tf) = 0, such that eq. (22) can be rewritten as

  λ^T(t2+) − λ^T(t1+) = ∫_{t1}^{t2} −Hx[t] dt + ∫_{(t1,t2]} cx[t] dΓ(t) + ∫_{(t1,t2]} sx[t] dη(t).   (24)
Remark 3. Assume that the optimal control u∗(·) is continuous almost everywhere. Then, the associated adjoint λ(t) is differentiable almost everywhere¹¹ and

  λ̇^T(t) = −Hx[t] − γ(t)^T cx[t] + η̇(t)^T sx[t]

holds almost everywhere. This equation relates to the well-known necessary conditions of Jacobson et al. (1971).

3.4. Interpretation of adjoints

The following remark interprets the adjoints as derivatives of the objective function with respect to initial values.

Remark 4. Let (x∗, u∗) be a strict minimizer of the optimal control problem which satisfies the second-order sufficient conditions stated by Augustin and Maurer (2001). Then, the optimal objective function value is differentiable with respect to the initial conditions x(t0) = x0. Let Ψ^τ(xτ) denote the optimal value resulting from a solution of problem (1)-(8) defined on the interval [τ, tf] (instead of [t0, tf]), where eq. (3) is replaced by x(τ) = xτ ∈ R^{nx}. Note that we have Φ(x∗(tf)) = Ψ^{t0}(x0) = Ψ^τ(x∗(τ)), where the right equality is a consequence of the optimality principle of Bellman (1957).
¹⁰ Note that the time variable t is the lower integration limit in the following definition of Γ(t).
¹¹ This is due to the fact that every function of bounded variation is differentiable (and therefore continuous) almost everywhere.
Further, assume that τ is a point of continuity of the adjoint variables, and let dτ ∈ R^{nx} be a direction such that for a sufficiently small ε > 0 the optimal control problem is feasible for the initial condition x(τ) = xτ + ε dτ. Then, following the approach of Breakwell (1959), we have for xτ = x(τ)

  δΨ^τ(xτ; dτ) = λ^T(τ) dτ,                                                        (25)

where δΨ^τ(xτ; dτ) denotes the Gateaux derivative¹² of Ψ^τ(xτ) with respect to the direction dτ. If a linearly independent set of nx feasible directions exists, then λ(τ) is uniquely determined by eq. (25).

4. Direct Single Shooting

So far, we have characterized the optimal solution of the continuous (infinite-dimensional) optimal control problem. Next, we state how to approximate optimal solutions by means of direct single shooting.

4.1. Parametrization of the control vector

The basic idea of direct single shooting is to substitute the control vector u(t) by an approximation ũ(t, p), parameterizing each control ui by employing parameters pij, j = 1, . . . , Pi, and basis functions φij(t), such that

  ũi(t, p) := Σ_{j=1}^{Pi} pij φij(t),   pij ∈ R,  i = 1, . . . , nu.              (26)
The functions φij(t) are assumed to be constant or linear B-splines¹³. The multi-vector p = (p11, . . . , p_{nu,Pnu})^T ∈ R^{np} concatenates all degrees of freedom. The control bounds ui(t) ∈ [umin,i, umax,i] are directly translated into pij ∈ [umin,i, umax,i], j = 1, . . . , Pi.
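To make the parametrization concrete, the following minimal sketch evaluates a piecewise constant control ũ(t, p) of type (26) for a single scalar control; the function name and interface are our own illustrative choices, not part of any particular shooting code:

import numpy as np

def u_tilde(t, p, grid):
    """Evaluate a piecewise constant control u~(t, p), eq. (26).

    p    -- array of P control parameters p_j (scalar control assumed)
    grid -- array of P + 1 grid points t_0 < t_1 < ... < t_P
    """
    # Locate the interval [t_{j-1}, t_j) containing t; side='right'
    # makes the control right-continuous at the grid points.
    j = np.searchsorted(grid, t, side='right') - 1
    j = min(max(j, 0), len(p) - 1)  # clamp to the first/last interval
    return p[j]

# Example: three intervals on [0, 3] with values 1.0, 0.5 and 2.0
grid = np.array([0.0, 1.0, 2.0, 3.0])
p = np.array([1.0, 0.5, 2.0])
assert u_tilde(0.5, p, grid) == 1.0
assert u_tilde(2.5, p, grid) == 2.0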
¹² The Gateaux derivative of a functional f : X → R is the directional derivative defined by δf(x; d) = lim_{ε→0} (f(x + εd) − f(x))/ε for x, d ∈ X.
¹³ B-splines of order one correspond to piecewise constant functions, B-splines of order two to piecewise linear functions. For readers interested in the general theory of B-splines we refer to De Boor (1978).
The state vector x̃(t, p) depends on the parameter p and follows from a solution of the parametric initial value problem

  d/dt x̃(t, p) = f(x̃(t, p), ũ(t, p)),                                             (27)
  x̃(t0, p) = x0,                                                                   (28)

determined, for example, by Runge-Kutta or extrapolation methods. The path constraints in eqs. (4) and (5) are relaxed by collocation on a grid t0 < t1 < t2 < · · · < tN = tf to result in

  c(x̃(tk, p), ũ(tk, p)) ≤ 0,  s(x̃(tk, p)) ≤ 0,  k = 0, . . . , N.

The single shooting approach approximates the infinite-dimensional optimal control problem by the finite-dimensional nonlinear program (NLP)

  min_{p ∈ R^{np}}  Φ(x̃(tN, p))                                                   (29)
  s. t.  c(x̃(tk, p), ũ(tk, p)) ≤ 0,  k = 0, . . . , N,                             (30)
         s(x̃(tk, p)) ≤ 0,  k = 0, . . . , N,                                       (31)
         g(x̃(tN, p)) = 0,                                                          (32)
         h(x̃(tN, p)) ≤ 0,                                                          (33)
         umin,i ≤ pi ≤ umax,i,  i = 1, . . . , nu.                                  (34)
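As a hedged illustration of eqs. (27)-(31), the following sketch evaluates the single shooting objective and samples the states at the collocation points for a given parameter vector; the helper names and the use of scipy's solve_ivp are our own assumptions, and a production code would typically use a sensitivity-capable integrator:

import numpy as np
from scipy.integrate import solve_ivp

def shooting_evaluation(p, f, Phi, x0, grid, t_coll):
    """Single shooting: integrate eqs. (27)-(28) and sample the states.

    f(x, u) -- ODE right-hand side, Phi(x) -- Mayer objective,
    grid/p  -- control grid and piecewise constant parameters (eq. (26)),
    t_coll  -- collocation points t_0, ..., t_N for eqs. (30)-(31).
    """
    def rhs(t, x):
        j = np.searchsorted(grid, t, side='right') - 1
        j = min(max(j, 0), len(p) - 1)
        return f(x, p[j])

    sol = solve_ivp(rhs, (t_coll[0], t_coll[-1]), x0,
                    t_eval=t_coll, rtol=1e-8, atol=1e-10)
    x_coll = sol.y.T  # states x~(t_k, p) at the collocation points
    return Phi(x_coll[-1]), x_coll

The returned states x̃(tk, p) are then handed to the constraint functions c and s, and the resulting NLP (29)-(34) is solved by an SQP or interior-point method that also reports the Lagrange multipliers.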
To formulate the necessary conditions of optimality for problem (29)-(34) we introduce some simplifying notation. Let µ̃L, µ̃U be multi-vectors of the same dimension as p. For convenience, we denote

  µ̃L^T (umin − p) := Σ_{i=1,...,nu} Σ_{j=1,...,Pi} µ̃L,ij (umin,i − pij),

and use an analogous definition for the expression µ̃U^T (p − umax). Then, the Lagrangian of the NLP (29)-(34) is given by

  L(p, µ̃L, µ̃U, γ̃, η̃, ρ̃, ζ̃) = Φ(x̃(tN, p)) + µ̃L^T (umin − p) + µ̃U^T (p − umax)
      + Σ_{k=0}^{N} γ̃k^T c(x̃(tk, p), ũ(tk, p)) + Σ_{k=0}^{N} η̃k^T s(x̃(tk, p))
      + ρ̃^T g(x̃(tN, p)) + ζ̃^T h(x̃(tN, p)).                                      (35)
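For illustration, the multi-point structure of eq. (35) can be assembled directly from trajectory data; this is a sketch with hypothetical argument names, and all quantities are assumed to be NumPy arrays:

import numpy as np

def nlp_lagrangian(Phi_val, p, u_min, u_max, mu_L, mu_U,
                   gamma, eta, rho, zeta, c_coll, s_coll, g_end, h_end):
    """Assemble the Lagrangian (35).

    c_coll[k], s_coll[k] -- residuals c(x~(t_k), u~(t_k)) and s(x~(t_k))
    gamma[k], eta[k]     -- the associated multipliers, k = 0, ..., N
    """
    L = Phi_val + mu_L @ (u_min - p) + mu_U @ (p - u_max)
    for k in range(len(c_coll)):
        L += gamma[k] @ c_coll[k] + eta[k] @ s_coll[k]
    return L + rho @ g_end + zeta @ h_end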
4.2. Karush-Kuhn-Tucker conditions

The well-known Karush-Kuhn-Tucker necessary conditions characterize the optimal solution of the NLP (29)-(34). Let p∗ be an optimal solution. We assume that the linear independence constraint qualification (cf. Nocedal and Wright (1999), Chapter 12) is satisfied. Then, there exist unique Lagrange multipliers µ̃L, µ̃U ∈ R^{np}, γ̃k ∈ R^{nc}, k = 0, . . . , N, η̃k ∈ R^{ns}, k = 0, . . . , N, ρ̃ ∈ R^{ng} and ζ̃ ∈ R^{nh}, such that

1. Lp(p∗, µ̃L, µ̃U, γ̃, η̃, ρ̃, ζ̃) = 0,
2. the constraints (30)-(34) are satisfied,
3. the multipliers γ̃k, k = 0, . . . , N, η̃k, k = 0, . . . , N, ζ̃ and µ̃L, µ̃U, belonging to the inequalities in eqs. (30), (31), (33) and (34), are nonnegative, and
4. the multipliers belonging to inactive constraints are equal to zero (the complementary slackness condition).

5. Composite Adjoints

Suppose we have found an optimal solution p∗ of the NLP (29)-(34) together with a set of optimal Lagrange multipliers µ̃L, µ̃U, γ̃, η̃, ρ̃ and ζ̃. We realize that the Lagrangian in eq. (35) is a multi-point ODE-embedded functional, i.e., the Lagrangian relies on the state vector at different points in time. The gradients of such multi-point ODE-embedded functionals can be efficiently computed by composite adjoints, a technique recently introduced by the authors (Hannemann and Marquardt, 2010). As we will see in the following, composite adjoints can also provide accurate estimates for the adjoints λ(t) of the Minimum Principle (Theorem 1). Computational aspects of composite adjoints have been discussed in detail by Hannemann and Marquardt (2010). Here, we present a slightly different introduction to composite adjoints to emphasize the relation to the adjoints of the Minimum Principle (Theorem 1).

5.1. Characterization of composite adjoints

The optimal parameter vector p∗ uniquely determines the control functions ũ∗(t) = ũ(t, p∗). Furthermore, by means of eqs. (2) and (3), we can compute the associated state trajectory x̃∗(t). To characterize the composite
adjoints, we introduce the function η̃0 : R → R^{ns} defined by

  η̃0(t) = { Σ_{k=0}^{N} η̃k      for t < t0,
           { Σ_{k=l+1}^{N} η̃k    for t ∈ [tl, tl+1), l = 0, . . . , N − 1,        (36)
           { 0                    for t ≥ tN = tf.
Since the Lagrange multipliers η̃k are all nonnegative, we can state the following remark.

Remark 5. Like the multiplier function η(·) in Theorem 1, the function η̃0(·) is right-continuous, non-increasing, of bounded variation and normalized with η̃0(tf) = 0.

Later, it will turn out that η̃0(·) approximates the corresponding multiplier function η(·) of Theorem 1. Similarly, to cope with the mixed control-state constraints in eqs. (5) and (30), respectively, we introduce the right-continuous function Γ̃, which is of bounded variation and normalized with Γ̃(tf) = 0, by setting

  Γ̃(t) = { Σ_{k=0}^{N} γ̃k      for t < t0,
          { Σ_{k=l+1}^{N} γ̃k    for t ∈ [tl, tl+1), l = 0, . . . , N − 1,         (37)
          { 0                    for t ≥ tN = tf.

We denote the Hamiltonian in eq. (11) evaluated at (x̃∗(t), λ̃(t), ũ∗(t)) by H̃[t] and define c̃[t], s̃[t] accordingly. Then, the composite adjoints are piecewise continuous, right-continuous functions λ̃ : [t0, tf] → R^{nx} satisfying for all t1, t2 ∈ [t0, tf], t1 < t2, the integral equation

  λ̃^T(t2+) − λ̃^T(t1+) = ∫_{t1}^{t2} −H̃x[t] dt + ∫_{(t1,t2]} c̃x[t] dΓ̃(t) + ∫_{(t1,t2]} s̃x[t] dη̃0(t).   (38)

The similarity to eq. (24) (which is itself equivalent to eq. (22)) is obvious and becomes even more important when stating the boundary condition of the composite adjoints as

  λ̃^T(tN) = Φx(x̃(tN)) + ρ̃^T gx(x̃(tN)) + ζ̃^T hx(x̃(tN)),                         (39)
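In implementation terms, the step functions (36) and (37) are reverse cumulative sums of the NLP multipliers; a minimal sketch (our own helper, with hypothetical argument names) reads:

import numpy as np

def multiplier_step_function(multipliers, grid):
    """Build the right-continuous step function eta~0 of eq. (36) from the
    NLP multipliers eta~_k on the constraint grid t_0 < ... < t_N; applied
    to the gamma~_k, the same construction yields Gamma~ of eq. (37).

    multipliers -- 2-D array of shape (N + 1, n_s)
    """
    m = np.asarray(multipliers, dtype=float)
    tail = np.cumsum(m[::-1], axis=0)[::-1]  # tail[l] = sum_{k=l}^{N} m_k

    def step(t):
        if t < grid[0]:
            return tail[0]                    # sum over k = 0, ..., N
        if t >= grid[-1]:
            return np.zeros(m.shape[1])       # normalized to 0 at t_f
        l = np.searchsorted(grid, t, side='right') - 1
        return tail[l + 1]                    # sum over k = l+1, ..., N
    return step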
The boundary condition (39) is essentially eq. (18) with λ0 = 1 (the normal case in Remark 1 and Assumption 4). Since x̃∗(t) and ũ∗(t) are uniquely determined by p∗, and since the composite adjoints are piecewise continuously differentiable by construction, they can be computed by a backwards integration using eqs. (38) and (39).

5.2. Gradient computation of the Lagrangian

If we set µ̃L and µ̃U to zero in the argument of the Lagrangian (35)¹⁴, the gradient with respect to p is given by the integral

  Lp(p∗, 0, 0, γ̃, η̃, ρ̃, ζ̃) = ∫_{t0}^{tN} H̃u[t] ũp(t, p∗) dt − ∫_{(t0,tN]} c̃u[t] dΓ̃(t).   (40)
At an optimal point, due to the linearity of the control bounds and employing Lp = 0, we have the identity

  µ̃L − µ̃U + ∫_{(t0,tN]} c̃u[t] dΓ̃(t) = ∫_{t0}^{tN} H̃u[t] ũp(t, p∗) dt,            (41)
which can be used to estimate switching functions for optimal control problems where the controls occur linearly.

5.3. Interpretation of composite adjoints

Exactly like the continuous adjoints in Remark 4, composite adjoints can also be interpreted as derivatives of the objective function with respect to initial values. In particular, we state

Remark 6. We assume that the NLP (29)-(34) has an optimal solution p∗ and that the second-order sufficient conditions hold (cf. (Nocedal and Wright, 1999, Theorem 12.6)). Consequently, the optimal objective function value is continuously differentiable with respect to the initial conditions x̃(t0) = x0. Then, the composite adjoints can be interpreted as derivatives of the objective function with respect to the initial values x̃(t0) = x0.
¹⁴ Indeed, when an NLP solver requires the evaluation of the gradient of the Lagrangian, the user usually does not have to deal with the multipliers for simple bounds on the variables, since they are typically handled internally, e.g., in the solver IPOPT (Wächter and Biegler, 2006). Hence, formally, µ̃L and µ̃U are set to zero.
Let τ ∈ [t0, tf] be a point of continuity of the composite adjoints λ̃(t). Let Ψ̃^τ(xτ) denote the optimal value of the discretized problem (29)-(34) on the horizon [τ, tf], where the initial condition is given by x(τ) = xτ. Again, we have

  Φ(x̃∗(tN)) = Ψ̃^{t0}(x0) = Ψ̃^τ(x̃∗(τ)).

Further, we recall

  λ̃^T(τ) = ∂Ψ̃^τ/∂xτ |_{xτ = x̃(τ)}.                                               (42)
In contrast to the assumptions in Remark 4, we do not have to consider only feasible directions dτ ∈ R^{nx}, since the fulfillment of the second-order sufficient conditions implies that all directions are feasible.

6. Convergence Relation

From now on, we assume the controls to be approximated by piecewise constant B-splines, though the results can be extended to piecewise linear approximations.

6.1. Multipliers associated with the control bounds

Grimm (1993) has proven the convergence of the multipliers µ̃L, µ̃U to the continuous multiplier functions µL(·) and µU(·) for scalar controls¹⁵. Let i ∈ {1, . . . , nu} be a control index, j ∈ {1, . . . , Pi} be an index referring to a discretization parameter of control ui, and tj−1 < tj ∈ [t0, tf] the associated grid points. Further, let φij denote the associated constant B-spline defined by

  φij(t) = 1 for t ∈ [tj−1, tj],   0 else.

Then, Grimm's result implies that the corresponding multipliers µ̃L,ij, µ̃U,ij can be related to the continuous multipliers by

  ∫_{tj−1}^{tj} µL,i(t) dt ≈ µ̃L,ij.                                               (43)

¹⁵ A sketch of the proof has been given by Grimm and Markl (1997), but the full proof is only contained in the report of Grimm (1993), which is unfortunately not easily accessible.
Similar relations hold for µ̃U,ij and µU,i(·). These relations give rise to the following definitions. Let t0 < t1 < . . . < t_{Pi} be the grid of the piecewise constant approximation of ui(t). Then, we set

  µ̃0_{L,i}(t) := µ̃L,ij / (tj − tj−1)  for t ∈ [tj−1, tj),   0 else.               (44)

The functions µ̃0_{U,i}(t) are defined accordingly. If the control grids are iteratively refined, we expect that the functions µ̃0_L(·), µ̃0_U(·) tend to their continuous equivalents µL(·), µU(·).

6.2. General convergence relations

The latter considerations, together with the similarities between eqs. (22) and (38), suggest that the NLP (29)-(34) converges to the optimal control problem (1)-(8). The idea is to solve a sequence of NLP problems of type (29)-(34), where we successively increase the number of collocation points N for the path constraints and the number Pi of B-splines approximating the control ui(t). N and Pi, i = 1, . . . , nu, are increased such that the maximal distance ∆ between the corresponding time grid points gradually tends to zero.

To expect convergence, some regularity assumptions have to be satisfied. Though we do not have a proof of Conjecture 2, formulated below, we have a rough idea of what kind of regularity we have to require. Since for a given ∆ we have only a finite number of degrees of freedom to approximate the control, we assume the optimal controls behave well in the sense of the following

Assumption 5. The (possibly locally) optimal control u∗(t) of the optimal control problem (1)-(8) is piecewise continuous. In particular, it exhibits only a finite number of discontinuities. Further, the set of entry and exit points is finite.

For a sufficiently small ∆, we might interpret the optimal solution of NLP (29)-(34) as the true solution of some slightly perturbed optimal control problem (1)-(8). In this context, a decrease in the fineness ∆ corresponds to smaller perturbations. This informal requirement can be virtually quantified by the following thought experiment. We add some virtual perturbation parameter vector
ε∆ ∈ Rⁿ to the data of the problem and augment all functions in problem (1)-(8) with ε∆, e.g., ẋ(t) = f(x(t), u(t), ε∆), c(x(t), u(t), ε∆) ≤ 0, . . . . We can arrange matters such that ε∆ = 0 corresponds to the unperturbed problem. In this context we have to require that for small ε∆ the corresponding optimal solution is close to the optimal solution of the unperturbed problem. This will certainly be the case if the requirements for solution differentiability of parametric optimal control problems¹⁶, as discussed for example by Malanowski and Maurer (2001), are satisfied. Loosely speaking, these requirements arise in the form of some second-order sufficient conditions of strict complementarity type.

Assumption 6. The pair (x∗, u∗) is a local minimizer of the optimal control problem (1)-(8) which satisfies second-order sufficient conditions of strict complementarity type.

The formulation of such second-order sufficient conditions for problems with higher-order state constraints has been reported by Malanowski and Maurer (2001), whereas Maurer and Augustin (2001) provide appropriate conditions for problems with mixed control-state constraints and a scalar pure state constraint. In practice, Assumption 6 can hardly be checked, especially in the context of direct single shooting when the true solution is not known. As a weak remedy, one might check the nonlinear program's second-order sufficient conditions as formulated by Fiacco (1983) or by Nocedal and Wright (1999, Theorem 12.6). To establish the existence of unique Lagrange multipliers of the nonlinear program (29)-(34) we have to impose the LICQ condition.

Assumption 7. If ∆ is sufficiently small, the linear independence constraint qualification for the NLP (29)-(34) holds at the corresponding optimal solution p∆.

Loosely speaking, we assume that for sufficiently small ∆ the NLP (29)-(34) consistently approximates the optimal control problem¹⁷. Now, we are in the
20
position to state Conjecture 2. We assume the validity of Assumptions 1–7. Suppose, we construct a sequence of nonlinear programs of type (29)-(34) where the maximal distance ∆ between the collocation points of the path constraints and the control grids is gradually reduced to zero. Then, there exists a δ > 0 such that for each ∆ < δ, there exist a strict local minimizer p∆ and corresponding associated Lagrange multipliers µ ˜L∆ , µ ˜U ∆ , γ˜∆ , η˜∆ , ρ˜∆ , ζ˜∆ such that the following convergence results hold for the corresponding states x˜∆ , con0 ˜ ∆ , multiplier functions Γ ˜ ∆ , η˜0 , µ trols u˜∆ , composite adjoints λ ∆ ˜ ∆ as well as ˜ for the multipliers ρ˜∆ , ζ∆ : x˜∆ → x∗ , ˜ ∆ → λ, λ
u˜∆ → u∗ , ˜ ∆ → Γ, Γ
ρ˜∆ → ρ, 0 η˜∆ → η,
ζ˜∆ → ζ, µ ˜0∆ → µ.
(45)
More precisely, we expect weak convergence for the primal quantities x, u and weak-star convergence18 for the dual quantities ρ, ζ, λ, Γ, η, µ. Remark 7. It is important to note, that we do not expect strong convergence. For example, it is intuitive that we can only expect weak-star conver0 gence for the multipliers η˜∆ because of their construction. More precisely, 0 we interpret η˜∆ as an element of C([t0 , tf ], Rns )∗19 . Remark 8. Note that we have formulated the conjecture in terms of strict locally optimal solutions. Especially, in the ∆-sequence of the nonlinear programs (29)-(34), each NLP might have multiple strict locally optimal solutions which correspond to different strict locally optimal solutions of the original optimal control problem. However, if the true global solution of the optimal control problem is a strict minimizer and satisfies Assumptions 1–6, then we expect that Conjecture 2 can be strengthened by stating the existence of a global minimizer p∆ . In practice, the nonlinear program (29)-(34) could be solved by a global NLP algorithm like α-BB (Adjiman et al., 1996). Actually, the latter approach has been implemented by Esposito and Floudas (2000). A related approach based on the outer approximation algorithm (Duran and Grossmann, 1986) is presented by Chachuat et al. (2005). 18
For the definition of strong, weak and weak-star convergence, we refer to the book of Luenberger (1969) or any other textbook about functional analysis. 19 Let C([t0 , tf ], Rns ) denote the vector space of continuous functions mapping [t0 , tf ] to ns R with the usual norm. Then, C([t0 , tf ], Rns )∗ is its normed dual space.
21
Rather than providing a proof of Conjecture 2 we present the following plausibility consideration: The convergence of the states x˜∆ and the controls u˜∆ follows from the assumption of a strict minimizer. In consequence ˜ τ (x∆ (τ )) → Ψτ (x∗ (τ )). If we Φ(˜ x∆ (tf , p∆ )) → Φ(x∗ (tf )). Hence, we expect Ψ ˜ τ (x∆ (τ ); dτ ) → further assume the convergence of the Gateaux derivatives δ Ψ δΨτ (x∗ (τ ); dτ ) for nx linearly independent feasible directions dτ ∈ Rnx , then, ˜ ∆ → λ. the combination of Remark 4 with Remark 6 yields λ 0 ˜ As a direct consequence, the convergence of Γ∆ , η˜∆ , ρ˜∆ , ζ˜∆ follows from the comparison of eqs. (38) and (39) with eqs. (18) and (22). The convergence of µ ˜0∆ has been shown by Grimm (1993) for the scalar case and can likely be generalized to the multi-dimensional case. Remark 9. The novelty in this and the previous section is the definition ˜ in eqs. (36) and (37), respectively, which relate to of the functions η˜0 and Γ the multiplier functions η and γ of Theorem 1. This relation enables us to conjecture a quite general convergence result for optimal control problems with multiple controls and an arbitrary number of pure state and mixed control-state constraints. 7. Verifying Solutions from Direct Single Shooting We are now in the position to recall our original motivation for the preceding analysis, namely, providing a measure for the distance of the solution of the nonlinear program (29)-(34) to the true solution of the original optimal control problem (1)-(8). 7.1. Check of weak-star convergence by test functions To build a measure for the distance to the true solution based on Pontryagin’s Minimum Principle, we assume the validity of Conjecture 2. Then, necessary conditions for the convergence of the iterative refinement to the true solution are given by the weak(-star) convergence results in eq. (45). For the sake of a simple presentation and ease of implementation we check ˜ ∆ in eq. (45). This only the weak-star convergence of the composite adjoints λ weak-star convergence implies that for each interval [t1 , t2 ] ⊂ [t0 , tf ] and for arbitrarily chosen continuous test functions χ : [t1 , t2 ] → R we have the (strong) convergence relation Z t2 Z t2 ∆→0 ˜ χ(t)λ∆ (t) dt −−−→ χ(t)λ(t) dt. (46) t1
t1
22
We will employ rather simple test functions χ[t1 ,t2 ] : [t1 , t2 ] → R defined by χ[t1 ,t2 ] (t) :≡
1 , t2 − t1
(47)
which has the advantage that if λ(t) is continuous on [t1 , t2 ] and (t2 − t1 ) is sufficiently small, by continuity of λ and the mean value theorem, we can state Z t2 t1 + t2 20 χ[t1 ,t2 ] (t)λ(t) dt ≈ λ . (48) 2 t1 To check the convergence of the iterative refinement process, we define a ˜ ∆ ∈ Rnx ×nΛ by sufficiently small grid t0 < t1 < · · · < tnΛ = tf and a matrix Λ ˜∆ = Λ ij
Z
tj
˜ ∆i (t) dt, χ[tj−1 ,tj ] (t)λ
i = 1, . . . , nx ,
j = 1, . . . , nΛ .
(49)
tj−1
A similar matrix Λ ∈ Rnx ×nΛ corresponding to the true solution can be defined. Sometimes, and in particular when we want to stress the relations (48) and (49) or want to plot some time-dependent graphs, we employ by abuse of notation the convention t + t j−1 j ∆ ˜∆ ˜i := Λ Λ ij , 2 ˜ ∆ can serve as an approximate for λi ((tj−1 + tj )/2). since Λ ij 7.2. Approximation of switching functions For the analysis of optimal control problems where the controls ui (t) only ˜ ∆ ∈ Rnx ×nΛ appear linearly, it may be helpful to introduce the matrix Σ defined by Z tj ∆ ˜ ˜ ∆ [t] dt, i = 1, . . . , nx , j = 1, . . . , nΛ , Σij = χ[tj−1 ,tj ] (t)H (50) ui tj−1
20
Actually, for each i = 1, . . . , nx , there exists a θi ∈ [t1 , t2 ], such that the identity χ [t1 ,t2 ] (t)λi (t) dt = λi (θi ) holds. Not knowing better, we assume that each θi is close t1 to (t1 + t2 )/2, which is certainly true when |t2 − t1 | is sufficiently small. R t2
23
˜ ∆ (t), u˜∆ (t)). ˜ ∆ [t] denoting the Hamiltonian (11) evaluated at (˜ with H x∆ (t), λ ui Then again, by abuse of notation we set t + t j−1 j ∆ ˜ ∆ , i = 1, . . . , nu , ˜ := Σ Σ ij i 2 to approximate the so-called switching functions σi (t) (see eq. (54) for illustration in the case study in Section 8) which characterize the structure of the optimal control profiles (see also Fig. 3, below). 7.3. The distance of direct single shooting solutions to the true solution Now we return to the definition of a measure for direct single shooting solutions to the true solution. We construct a sequence of nonlinear programs of type (29)-(34) such that the fineness ∆ is halved in each subsequent refinement step21 . More concretely, we construct a sequence ∆1 , ∆2 , . . . , such that 1 ∆i+1 = ∆i , 2 and ∆i → 0 for i → ∞. Then, the solution of the NLP (29)-(34) corresponding to iteration i is sufficiently close to the true solution if the error with respect to some scaled norm
˜ ∆i
Λ − Λ (51)
S
is below a prescribed tolerance T ol > 0. Of course, we do not now Λ in eq. ˜ ∆i+1 and shift the index i by −1 to yield (51). As a remedy we replace Λ by Λ
˜ ∆i−1 ˜ ∆i − Λ < T ol (52)
Λ S
as a termination criterion to accept iterate i of the refinement procedure. 21
In the context of adaptive shooting methods like the one presented by Schlegel et al. (2005), it is not necessary to really halve the fineness ∆. It suffices to create possible non-equidistant grids which largely behave like grids where the maximal distance between two neighboring grid points is halved.
24
7.4. A direct single shooting approach with solution verification A direct single shooting algorithm with solution verification is proposed in Table 1. Here, the Boolean variable Success indicates a successful application of our algorithm. The integer imax is the maximal number of iterations. The NLPs can be solved by either local or global methods. If local methods are used, we cannot assure that in each iteration the NLP’s local solution corresponds to the same true solution. But we if employ the local solution of iteration i − 1 as a starting point for the NLP in iteration i, we expect that all NLP solutions can be associated with the same true local solution. Table 1: Direct single shooting with solution verification
Success = false Create initial control/constraint grids with effective fineness ∆1 Solve NLP (29)-(34) on ∆1 for i = 2 to imax do Create new grids with ∆i = 21 ∆i−1 Solve NLP (29)-(34) on ∆i if stopping criterion (52) is fulfilled then Success = true Break for loop end if end for
7.5. Definition of the scaled “norm” It is well-known that the concept of relative error collapses at zero. Therefore, the scaled norm of the stopping criterion in eq. (52) has to take into account some absolute error tolerance AbsT ol > 0 as well as some relative error tolerance RelT ol > 0. Hence, we define the left hand side of eq. (52) by
˜ ∆i−1 − Λ ˜ ∆i | |Λ
˜ ∆i−1 ˜ ∆i ij ij − Λ := max . (53)
Λ ∆ i ˜ i,j RelT ol · |Λ | + AbsT ol S ij
Though this definition does not deserve to be called a norm, it takes into account both quantities AbsT ol and RelT ol if the tolerance in eq. (52) is set to T ol = 1.
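In code, eq. (53) is essentially a one-liner (a sketch; the array arguments are the matrices of eq. (49) from two successive refinements):

import numpy as np

def scaled_norm(lam_prev, lam, rel_tol=0.01, abs_tol=0.01):
    """Scaled error 'norm' of eq. (53); with Tol = 1 in criterion (52),
    both the relative and the absolute tolerance take effect."""
    return np.max(np.abs(lam_prev - lam) / (rel_tol * np.abs(lam) + abs_tol))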
7.6. Scalability

In principle, the algorithm with solution verification scales in the same way as the underlying single shooting algorithm. The only difference is the convergence check at the end of each iteration. The cost of this convergence check is dominated by the computation of the composite adjoints. However, the computational time for the determination of the composite adjoints is in general only a small multiple, typically a factor between 2 and 10 independent of the number of degrees of freedom, of the cost of the forward state integration (Hannemann and Marquardt, 2010). Moreover, within an iteration, each function evaluation of the NLP solver triggers a forward integration, and each Jacobian evaluation triggers a (forward) sensitivity analysis. Hence, in typical applications, the computational costs for the convergence check are at least one order of magnitude lower than the costs of the iteration itself.

8. Illustrative Case Study

To demonstrate the applicability of the suggested novel stopping criterion, we compute the optimal solution of the Williams-Otto semi-batch reactor as introduced by Forbes (1994), a practically relevant benchmark problem with two controls and a scalar state constraint. The optimal control problem is solved using

1. indirect multiple shooting in order to obtain a highly accurate numerical solution, the reference "true" solution, and
2. direct single shooting with iterative refinement applying equidistant piecewise constant parametrizations.

8.1. Background of the optimal control problem

In the reactor, the reactions A + B → C, C + B → P + E and P + C → G take place. Reactant A is already in the reactor at initial time, whereas reactant B is fed continuously into the reactor during operation. The products P and E as well as the side-product G are formed. The heat generated by the exothermic reactions is removed by a cooling jacket, which is controlled by manipulating the cooling water temperature TW. At the end of the batch, the conversion to the desired products P and E should be maximized. During the batch, path constraints on the inlet flow rate of reactant B, denoted as FB,in, the reactor temperature TR, the reactor volume V and the scaled cooling water temperature TW must be observed. We have
nx = 9 states and nu = 2 controls. The manipulated control variables of this process are the cooling water temperature, u1(t) = TW(t), and the flow rate of B, u2(t) = FB,in(t). The batch time is 1000 seconds. The economic objective is to maximize the yield of the main products at the end of the batch. The optimal control problem summarized in Table 2 is of type (1)-(8), where the pure state constraint (4) refers to the reactor temperature.

8.2. The "true" solution

We computed a numerical solution of the optimal control problem in Table 2 by means of an indirect multiple shooting strategy using the code BNDSCO (Oberle and Grimm, 1989). For proper initialization of this method, we employ direct single shooting and use the convergence statements in Conjecture 2 to provide good estimates for the adjoint variables and the related multipliers. The solution obtained by indirect multiple shooting is very accurate and satisfies Pontryagin's Minimum Principle (Theorem 1) by construction. In the sequel, we will refer to this (reference) solution as the true solution.

In optimal control problems where the controls occur linearly in the dynamics, also referred to as input-affine problems, the switching functions

  σi(t) := Hui[t],   i = 1, . . . , nu,                                             (54)

characterize the optimal controls as follows:

  u∗(t) ∈ {u ∈ U | u^T σ(t) = min_{v ∈ U(x∗(t))} v^T σ(t)}.                         (55)
Table 2: Williams Otto semi-batch reactor optimal control problem
min Φ(tf ) x,u
1 xA FB,in − k1 ξ1 xA xB , 1000 V 1 FB,in (1 − xB ) − k1 ξ1 xA xB − k2 ξ2 xB xC , x˙ B = 1000 V 1 xC FB,in x˙ C = − + k7 ξ1 xA xB − k3 ξ2 xB xC − k6 ξ3 xC xP , 1000 V 1 xP FB,in x˙ P = − + k2 ξ2 xB xC − k4 ξ3 xC xP , 1000 V 1 xE FB,in x˙ E = − + k3 ξ2 xB xC , 1000 V 1 xG FB,in x˙ G = − + k5 ξ3 xC xP , 1000 V 1 FB,in (Tin − TR ) T˙R = + k8 ξ1 xA xB + k9 ξ2 xB xC 1000 V + k10 ξ3 xC xP − l1 TR + l2 TW , 1 V˙ = FB,in , 1000 Φ˙ = −5554.1 (k2 ξ2 xB xC − k4 ξ3 xC xP ) V − k11 ξ2 xB xC V , (xA , xB , xC , xP , xE , xG , TR , V, Φ)(0) = (1, 0, 0, 0, 0, 0, 65, 2, 0) , 60 ≤ TR (t) ≤ 90, ∀ t ∈ [0, tf ], V (tf ) ≤ 5, FB,in (t) ∈ [0, 5.784], ∀ t ∈ [0, tf ], TW (t) ∈ [0.02, 0.1], ∀ t ∈ [0, tf ],
s. t. x˙ A = −
where x = (xA , xB , xC , xP , xE , xG , TR , V, Φ), u = (TW , FB,in ) and k1 = 1659900.0, k2 = 721170000.0, k3 = 1442340000.0, k4 = 1337250000000.0, k5 = 4011750000000.0, k6 = 2674500000000.0, k7 = 3319800.0, k8 = 104656218.9, k9 = 27285184270.0, k10 = 144655676400000.0, k11 = 181605029400.0, l1 = 0.0002434546857, l2 = 1000 · l1 , Tin = 35, b2 = 6.6667 −1000.0 b2
−8333.3
−11111.0
ξ1 = e TR +273.15 , ξ2 = e TR +273.15 , ξ3 = e TR +273.15 . 28
0.3
0.1
0.2
0.08
0.1 σ1
u1
0.12
0.06
0
0.04
−0.1
0.02
−0.2
0 0 τ1 τ2
τ3
t
τ5
τ6 1000
0 τ1 τ2
τ3
t
τ5
τ6 1000
Figure 1: True optimal control u1 (left) and its switching function (right) computed by the indirect method
right). Setting τ0 = t0 = 0 and τ7 = tf = 1000 we can identify seven intervals Ik = [τk−1 , τk ], k = 1, . . . , 7. The structure of the optimal solution is shown in Table 3. The bold ’l’ and ’u’ mark controls which are at their lower or upper bounds, respectively. The bold ’s’ is reserved for singular intervals associated to a control ui (t), where the switching function σi (t) vanishes. Here, I3 is the only singular interval, which is identical to the boundary interval where the state constraint 60 ≤ TR is active. The corresponding trajectory of TR is found in Figure 2, where also the second control is shown. The switching times can be found in Table 4. 8.3. Sequence of direct single shooting solutions We compute approximate solutions of the benchmark problem by means of a sequence of direct single shooting problems. We start with an equidistant piecewise constant approximation of the controls ui (t), i = 1, 2, with Pi = 8 intervals. The initial grid for the path constraint is twice as fine (N = 16) as the one for the control variables. In each subsequent iteration, we double the number of discretization parameters and grid points. Table 3: Structure of the optimal control
I1 I2 I3 u1 (t) l u s u2 (t) u u u l: ui = umini , u: ui
I4 I5 u u u l = umaxi ,
29
I6 l l s: σi
I7 u l ≡0
6
80
5
75
4 u2
TR
70
3
65
2
60
1
55 0
0
τ2
τ3
1000
0
t
τ4 t
1000
Figure 2: Reactor temperature with active path constraint on [τ2 , τ3 ] (left) and optimal control u2 (right)
We choose nΛ = 128 and an equidistant grid for the test functions χ[ti−1 ,ti ] in eq. (47). The absolute and relative error tolerances in eq. (53) are set to AbsT ol = RelT ol = 0.01 . Table 5 shows the results of the run implementing the presented stopping criterion. The first and second column show the refinement iteration number i and the corresponding fineness ∆i . In the third column, the stopping criterion (52) is checked. In the eighth refinement, this value becomes less than 1. Hence, the prescribed tolerance is achieved and the algorithm stops. In column 4, we compute the scaled error norm with respect to the true solution. The results indicate that the desired accuracy is already achieved after the seventh refinement. However, in general, the matrix Λ is not accessible and some estimate of the true solution has to be used. Table 4: Switching points
switching point time τ1 0.523376E+02 τ2 0.148258E+03 τ3 0.359866E+03 τ4 0.518672E+03 τ5 0.538685E+03 τ6 0.858796E+03 30
Table 5: Case study employing the novel stopping criterion
Refinement i
∆i
˜ ∆i−1 − Λ ˜ ∆i kS kΛ
˜ ∆i − ΛkS kΛ
˜ ∆i−1 − Φ ˜ ∆i kS kΦ
1 2 3 4 5 6 7 8
2−3 · tf 2−4 · tf 2−5 · tf 2−6 · tf 2−7 · tf 2−8 · tf 2−9 · tf 2−10 · tf
– 112 102 41.3 6.35 2.72 1.36 0.723
132 102 41.3 5.49 4.71 2.04 0.887 0.704
– 0.113 4.28E-2 1.34E-2 1.75E-3 7.48E-4 1.09E-4 4.30E-5
Some single shooting implementations, including the code of Schlegel et al. (2005), also perform an iterative refinement of the discretization but employ a termination criterion based solely on the decrease of the objective function. Such a stopping criterion, however, may suggest a false solution quality. In the context of our case study, column 5 of Table 5 shows the scaled error norm based on objective function values only. Using this criterion, the algorithm would stop already after refinement 2, although, as discussed below, the true solution structure has not yet been found at this point. We present some exemplary plots after refinements 2 and 8. Figure 3 displays the approximate control ũ1 and its associated switching function Σ̃1, after refinement 2 based on objective function values and after refinement 8 using the novel stopping criterion. The control ũ1 misses the first umin-interval in [0, τ1] in the first case (Fig. 3, left), even though its “switching function” indicates that the control should start at its lower bound. In contrast, ũ1 matches the true solution structure in the second case (Fig. 3, right). Further, comparing the approximate switching function Σ̃1 with the true switching function in Figure 1, we see very good correspondence. However, Figure 4 shows the logarithmic difference log10(|u∗1(t) − ũ1(t)|) after iteration 8. We observe that the approximated optimal control ũ1 captures the bang arcs very well, though the switching points cannot be captured exactly. In the interval [τ2, τ3] of the active path constraint, the piecewise constant control approximation has some trouble matching the true solution, but the absolute error is still acceptably low.
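Schematically, the multi-level refinement with the two competing stopping tests can be summarized as follows. Here solve_single_shooting (returning the adjoint estimate of eq. (47) and the objective value for a given grid fineness) and scaled_norm (implementing the scaled norm of eq. (53), with AbsTol = RelTol = 0.01 absorbed into the scaling) are hypothetical placeholders; this sketch is not the code used to produce Table 5.

def refine_until_verified(solve_single_shooting, scaled_norm,
                          t_f=1000.0, i_max=12):
    """Double the discretization until the scaled discrepancy of successive
    adjoint estimates drops below 1 (cf. criterion (52))."""
    Lambda_prev = Phi_prev = None
    for i in range(1, i_max + 1):
        delta = 2.0 ** (-(i + 2)) * t_f      # fineness 2^-3 tf, 2^-4 tf, ...
        Lambda_i, Phi_i = solve_single_shooting(delta)
        if Lambda_prev is not None:
            crit = scaled_norm(Lambda_prev - Lambda_i)  # multiplier-based test
            # An objective-only test, scaled_norm(Phi_prev - Phi_i) < 1, would
            # already fire at refinement 2 here, before the structure is found.
            if crit < 1.0:
                return Lambda_i, i
        Lambda_prev, Phi_prev = Lambda_i, Phi_i
    raise RuntimeError("tolerance not reached within i_max refinements")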
Figure 3: Direct single shooting solution ũ1 and the corresponding switching function Σ̃1 after refinement 2 (left) and refinement 8 (right).
Figure 4: Comparison of the true and the approximated control profile: log10(|u∗1 − ũ1|) after refinement 8.
8.4. Multiplier function of the path constraint

The particular definition of the function η̃0 in eq. (36) is essential for the statement of Conjecture 2. Figure 5 shows the true multiplier η and its approximation η̃0 for selected steps of the multi-level refinement algorithm. The approximation η̃0 approaches the true function as the number of refinement steps increases; after iteration 8, practically no difference between η and η̃0 is visible to the eye. For closer inspection, we plot the logarithmic difference log10(|η(t) − η̃0(t)|) in Figure 6. Here, the “jump” at t = τ2 is conspicuous. This jump, whose absolute value is of the order of the jump of the true multiplier at t = τ2 in Figure 5, is caused by the fact that discrete approximations of optimal control problems are almost never capable of capturing the true switching points and the associated jumps of the multiplier functions. Moreover, the true multiplier function η is continuously differentiable on the open interval (τ2, τ3), which is not the case for the approximations η̃0. This observation confirms Remark 7, which states that we should not expect strong convergence of the multiplier functions.
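The distinction between strong and weak convergence can be checked numerically along the lines of the following sketch, which contrasts the sup-norm of η − η̃0 with its discrepancy measured against piecewise-constant test functions. The grid and the number of test functions are illustrative choices only and not those of eq. (47).

import numpy as np

def strong_and_weak_error(eta, eta_tilde, t0=0.0, tf=1000.0,
                          n_grid=100001, n_tests=128):
    """Sup-norm (strong) vs. test-function-based (weak) error of two
    multiplier functions given as callables on [t0, tf]."""
    t = np.linspace(t0, tf, n_grid)
    e = eta(t) - eta_tilde(t)
    strong = np.abs(e).max()  # stays O(1): the jump at tau_2 is never matched
    edges = np.linspace(t0, tf, n_tests + 1)
    weak = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        m = (t >= a) & (t <= b)
        # integral of e against the indicator of [a, b], trapezoidal rule
        weak = max(weak, abs(np.trapz(e[m], t[m])))
    return strong, weak

One would expect the strong error to stagnate at the size of the multiplier jump while the weak error decreases under refinement, in line with Remark 7.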
9. Conclusions

We have given a tutorial on how to verify direct single shooting solutions of optimal control problems subject to an arbitrary number of pure state and mixed control-state constraints. The verification relies on relating the NLP information of direct single shooting approximations to the necessary conditions of optimality of the continuous optimal control problem (Pontryagin’s Minimum Principle). We have shown how a measure can be defined to estimate the distance of the solution of the discretized problem from a true solution of the continuous optimal control problem. Based on this analysis, we have proposed a multi-level direct single shooting algorithm which produces verified solutions close to the true solution up to a user-defined tolerance. Such solution verification strategies have not previously been reported in the literature on shooting methods. The strategy carries over to direct multiple shooting in a straightforward manner. In a case study, the proposition of the conjecture has been illustrated and the usefulness of the verification procedure has been demonstrated.
Figure 5: Convergence behavior of the multiplier function η̃0 associated with the path constraint 60 ≤ TR: the true multiplier function η and the approximations η̃0 after refinements 3, 5, and 8.
Figure 6: Comparison of the true and the approximated multiplier function: log10(|η − η̃0|) after refinement 8.
References

Adjiman, C., Androulakis, I., Maranas, C., Floudas, C., 1996. A global optimization method, αBB, for process design. Computers & Chemical Engineering 20 (Supplement 1), 419–424.
Alt, W., 1984. On the approximation of infinite optimization problems with an application to optimal control problems. Applied Mathematics & Optimization 12, 15–27.
Augustin, D., Maurer, H., 2001. Computational sensitivity analysis for state constrained control problems. Annals of Operations Research 101, 75–99.
Bellman, R., 1957. Dynamic Programming. Princeton University Press, Princeton, NJ.
Binder, T., Blank, L., Bock, H., Bulirsch, R., Dahmen, W., Diehl, M., Kronseder, T., Marquardt, W., Schlöder, J., von Stryk, O., 2001. Introduction to model based optimization of chemical processes on moving horizons. In: Grötschel, M., Krumke, S., Rambau, J. (Eds.), Online Optimization of Large Scale Systems. Springer, Berlin, Ch. 3, pp. 295–339.
Bock, H., Plitt, K., 1984. A multiple shooting algorithm for direct solution of optimal control problems. In: Proceedings of the 9th IFAC World Congress, Budapest. Pergamon Press, pp. 242–247.
Breakwell, J. V., 1959. The optimization of trajectories. Journal of the Society for Industrial and Applied Mathematics 7 (2), 215–247.
Brusch, R. G., 1974. A nonlinear programming approach to space shuttle trajectory optimization. Journal of Optimization Theory and Applications 13 (1), 94–118.
Büskens, C., 1998. Optimierungsmethoden und Sensitivitätsanalyse für optimale Steuerprozesse mit Steuer- und Zustandsbeschränkungen. Ph.D. thesis, University of Münster, Germany.
Chachuat, B., Singer, A. B., Barton, P. I., 2005. Global mixed-integer dynamic optimization. AIChE Journal 51 (8), 2235–2253.
Daniel, J. W., 1969. On the convergence of a numerical method for optimal control problems. Journal of Optimization Theory and Applications 4, 330–342.
De Boor, C., 1978. A Practical Guide to Splines. Vol. 27 of Applied Mathematical Sciences. Springer, New York.
Dontchev, A. L., 1996. An a priori estimate for discrete approximations in nonlinear optimal control. SIAM Journal on Control and Optimization 34 (4), 1315–1328.
Dontchev, A. L., Hager, W., 2001. The Euler approximation in state constrained optimal control. Mathematics of Computation 70 (233), 173–203.
Duran, M., Grossmann, I., 1986. An outer-approximation algorithm for a class of mixed-integer nonlinear programs. Mathematical Programming 36, 307–339.
Elstrodt, J., 2009. Maß- und Integrationstheorie. Springer, Berlin.
Esposito, W. R., Floudas, C. A., 2000. Deterministic global optimization in nonlinear optimal control problems. Journal of Global Optimization 17, 97–126.
Fiacco, A., 1983. Introduction to Sensitivity and Stability Analysis in Nonlinear Programming. Academic Press, New York.
Forbes, J., 1994. Model structure and adjustable parameter selection for operations optimizations. Ph.D. thesis, McMaster University, Hamilton, Canada.
Gerdts, M., 2005. Local minimum principle for optimal control problems subject to index one differential-algebraic equations. Tech. rep., Department of Mathematics, University of Hamburg. URL: www.math.uni-hamburg.de/home/gerdts/Report_index1.pdf
Grimm, W., 1993. Convergence relations between optimal control and optimal parametric control. Tech. Rep. 420, Institut für Flugmechanik und Flugregelung, University of Stuttgart, Germany. Schwerpunktprogramm der Deutschen Forschungsgemeinschaft: Anwendungsbezogene Optimierung und Steuerung.
Grimm, W., Markl, A., 1997. Adjoint estimation from a direct multiple shooting method. Journal of Optimization Theory and Applications 92 (2), 263–283.
Hager, W., 2000. Runge–Kutta methods in optimal control and the transformed adjoint system. Numerische Mathematik 87, 247–282.
Hannemann, R., Marquardt, W., 2010. Continuous and discrete composite adjoints for the Hessian of the Lagrangian in shooting algorithms for dynamic optimization. SIAM Journal on Scientific Computing 31 (6), 4675–4695.
Hartl, R., Sethi, S., Vickson, R., 1995. A survey of the maximum principles for optimal control problems with state constraints. SIAM Review 37 (2), 181–218.
Jacobson, D. H., Lele, M. M., Speyer, J. L., 1971. New necessary conditions of optimality for control problems with state-variable inequality constraints. Journal of Mathematical Analysis and Applications 35, 255–284.
Kameswaran, S., Biegler, L. T., 2008. Convergence rates for direct transcription of optimal control problems using collocation at Radau points. Computational Optimization and Applications 41 (1), 81–126.
Luenberger, D., 1969. Optimization by Vector Space Methods. John Wiley & Sons, New York.
Malanowski, K., Büskens, C., Maurer, H., 1998. Convergence of approximations to nonlinear optimal control problems. In: Fiacco, A. V. (Ed.), Mathematical Programming with Data Perturbations. Vol. 195 of Lecture Notes in Pure and Applied Mathematics. Marcel Dekker, New York, pp. 253–284.
Malanowski, K., Maurer, H., 2001. Sensitivity analysis for optimal control problems subject to higher order state constraints. Annals of Operations Research 101, 43–73.
Maurer, H., Augustin, D., 2001. Sensitivity analysis and real-time control of parametric optimal control problems using boundary value methods. In: Online Optimization of Large Scale Systems. Springer, Berlin, pp. 17–33.
Nocedal, J., Wright, S., 1999. Numerical Optimization. Springer, New York.
Oberle, H. J., Grimm, W., 1989. BNDSCO – A program for the numerical solution of optimal control problems. Report 515, Institute for Flight Systems Dynamics, DLR, Oberpfaffenhofen, Germany.
Osborne, M., 1969. On shooting methods for boundary value problems. Journal of Mathematical Analysis and Applications 27, 417–433.
Polak, E., 1997. Optimization: Algorithms and Consistent Approximations. Springer, New York.
Pontryagin, L., Boltyanski, V., Gamkrelidze, R., Miščenko, E., 1962. The Mathematical Theory of Optimal Processes. Wiley, New York.
Pytlak, R., 1999. Numerical Methods for Optimal Control Problems with State Constraints. Springer, Berlin.
Rosen, J. B., 1966. Iterative solution of nonlinear optimal control problems. SIAM Journal on Control 4 (1), 223–244.
Sargent, R., Sullivan, G., 1978. The development of an efficient optimal control package. In: Stoer, J. (Ed.), Proceedings of the 8th IFIP Conference on Optimization Techniques. Springer, pp. 158–168.
Schlegel, M., Marquardt, W., 2006. Detection and exploitation of the control switching structure in the solution of dynamic optimization problems. Journal of Process Control 16 (3), 275–290.
Schlegel, M., Stockmann, K., Binder, T., Marquardt, W., 2005. Dynamic optimization using adaptive control vector parameterization. Computers & Chemical Engineering 29 (8), 1731–1751.
Tomiyama, K., 1985. Two-stage optimal-control problems and optimality conditions. Journal of Economic Dynamics & Control 9 (3), 317–337.
Vassiliadis, V., Sargent, R., Pantelides, C., 1994a. Solution of a class of multistage dynamic optimization problems. 1. Problems without path constraints. Industrial & Engineering Chemistry Research 33 (9), 2111–2122.
Vassiliadis, V., Sargent, R., Pantelides, C., 1994b. Solution of a class of multistage dynamic optimization problems. 2. Problems with path constraints. Industrial & Engineering Chemistry Research 33 (9), 2123–2133.
Vinter, R., 2010. Optimal Control. Modern Birkhäuser Classics. Birkhäuser, Boston.
Von Stryk, O., Bulirsch, R., 1992. Direct and indirect methods for trajectory optimization. Annals of Operations Research 37 (1–4), 357–373.
Wächter, A., Biegler, L., 2006. On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Mathematical Programming 106 (1), 25–57.