Proceedings of the 41st IEEE Conference on Decision and Control Las Vegas, Nevada USA, December 2002

WeA12-3

Suboptimal receding horizon control for continuous–time systems¹

Franco Blanchini²   Stefano Miani³   Felice Andrea Pellegrino⁴

Abstract

In this paper, a continuous–time optimal control problem is approached in a suboptimal way by introducing the concept of suboptimal value function, namely any function satisfying the Hamilton–Jacobi–Bellman inequality. It is shown that, as long as the Euler Approximating System (EAS) of a given continuous–time plant admits a positive definite convex suboptimal value function, it is possible to determine a stabilizing control for the continuous–time system whose cost not only converges to the optimal one, but is also upper bounded by the discrete–time cost, no matter how the "discretization time parameter" is chosen.

1 Introduction

Optimizing transient behavior in control systems is one of the main goals of control engineering, from both the theoretical and the practical standpoint. It is also well known that, except in very special cases, the problem is very hard to solve. The two traditional approaches, namely those based on the maximum principle and on dynamic programming, are limited in application: the former is basically suitable for open–loop solutions only, while the latter involves serious analytical and numerical difficulties.

A very popular approach to optimizing the control is the so-called receding horizon control (often referred to as rolling–horizon or model–predictive control). Receding horizon control has a long history in both the academic and the industrial world (see the survey [8] or the more recent survey on model predictive control for constrained dynamic systems [13]). Basically, it consists in optimizing on–line the control trajectory in the open–loop sense, given the current state, and then applying only the first value of the optimizing input sequence. By its nature, receding horizon control is suitable for discrete–time systems, because optimization of the control requires a certain amount of time: as the optimization is performed on–line, this amount of time must be smaller than the sampling time. For trajectory–optimization–based controllers with a long decision horizon, the computation can be very time–demanding. This is why receding horizon control is normally applied to "slow" processes, for which the sampling time may be taken "large". The computational difficulties encountered in on–line optimization are due to two factors. First, the smaller the sampling time, the greater the horizon length (in terms of number of steps, and hence of decision variables) which must be taken into account. Second, the smaller the sampling time, the smaller the time available for computation.

The basic idea of this work consists in introducing two distinct time intervals: 1) the implementation sampling time T: we assume this to be very small (virtually zero from the process point of view), in order to cope with fast unstable or poorly damped system dynamics; 2) the model time parameter τ: this is used to derive the discrete–time Euler Approximating System (EAS), on which the on–line control computation is carried out. The parameter τ does not need to be small; in any case, it must be large enough to allow the on–line evaluation of the control within the implementation sampling time.

If implementation and computation are performed with different "sampling" times, three obvious problems arise. The first is how to ensure stability; the second is how to ensure constraint satisfaction; the third is how to guarantee the performance expressed by an integral cost functional. We deal with these three problems by investigating the concept of suboptimal value function, i.e. a positive definite function of the state variable which satisfies the Hamilton–Jacobi–Bellman (HJB) inequality. By means of a suboptimal value function for the EAS, a suboptimal controller for the continuous–time system can be derived by solving on–line an auxiliary optimal control problem for the EAS. More precisely, the following results will be presented:

a) It will be shown that any convex suboptimal value function for the EAS is also a suboptimal value function for the continuous–time cost. As a consequence, the auxiliary cost is an upper bound for the continuous–time cost, and stability of the resulting closed–loop system is guaranteed no matter how the time parameter is chosen.

b) The suboptimal value converges from above to the true optimal value as τ goes to zero.

c) For linear systems with convex cost and constraints, it will be shown that if constraint satisfaction is assured for the EAS, the continuous–time system does not violate the constraints either.
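The receding–horizon mechanism recalled above (optimize an open–loop input sequence from the current state, apply only its first value, repeat) can be sketched in a few lines. The double–integrator model, horizon length, and cost weights below are illustrative placeholders, not taken from the paper, and a generic numerical solver stands in for the on–line optimization.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative discrete-time model: double integrator with step tau.
tau = 0.1
A = np.array([[1.0, tau], [0.0, 1.0]])
B = np.array([[0.0], [tau]])

def g(x, u):
    """Quadratic stage cost (placeholder weights)."""
    return float(x @ x + 0.1 * (u @ u))

def open_loop_cost(u_seq, x0, N):
    """Cost of applying the input sequence u_seq from state x0."""
    x, cost = x0.copy(), 0.0
    for k in range(N):
        u = u_seq[k:k + 1]
        cost += g(x, u) * tau
        x = A @ x + B @ u
    return cost

def receding_horizon_step(x0, N=10):
    """Optimize the open-loop sequence; return only its first element."""
    res = minimize(open_loop_cost, np.zeros(N), args=(x0, N))
    return res.x[0]

# Closed loop: re-optimize at every step, apply the first input only.
x = np.array([1.0, 0.0])
for _ in range(30):
    u0 = receding_horizon_step(x)
    x = A @ x + B @ np.array([u0])
```

Note that the whole optimization is repeated at every sampling instant, which is exactly why the sampling time must leave enough room for the solver to run.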

¹ Supported by MURST, Italy
² DIMI, Università di Udine, 33100 Udine, Italy, e-mail: [email protected]
³ DIEGM, Università di Udine, 33100 Udine, Italy, e-mail: [email protected]
⁴ S.I.S.S.A., via Beirut 4, 34014 Trieste, Italy, e-mail: [email protected]

0-7803-7516-5/02/$17.00 ©2002 IEEE


2 Suboptimal value functions

Given a continuous function Ψ : IR^n → IR, we denote the upper directional derivative by

D+Ψ[x, v] = lim sup_{h→0+} [Ψ(x + hv) − Ψ(x)] / h        (1)

We denote by N[Ψ, µ] the (possibly empty) sub-level set

N[Ψ, µ] = {x : Ψ(x) ≤ µ}        (2)

Consider the system

ẋ(t) = f(x(t), u(t))        (3)

where x ∈ IR^n, u ∈ IR^m, and where f(x, u) is locally Lipschitz. We assume that x = 0 is an equilibrium point corresponding to u = 0, namely f(0, 0) = 0. We denote by x_0 = x(0) the initial state. We also assume that the following constraints must be satisfied:

x(t) ∈ X,   u(t) ∈ U        (4)

where X and U are convex and closed sets including the origin in their interior. Consider for this system a cost function of the form

J = ∫_0^∞ g(x, u) dt        (5)

where g(x, u) is locally Lipschitz and positive definite. The basic problem faced here is that of finding a stabilizing feedback control law

u(t) = K(x(t))        (6)

such that the corresponding cost is minimized and the constraints (4) are satisfied. This problem is very hard to solve (even in the absence of constraints). Typically, to find the optimal strategy one has to solve the HJB equations [1], which present serious problems even when handled numerically. Therefore we try to solve the problem in a suboptimal way, and to this purpose we now introduce the concept of suboptimal value function.

Definition 2.1 Let Ψ : IR^n → IR_+ be a locally Lipschitz positive definite function. Assume that there exists µ > 0 such that the sub-level set is inside the constraint set,

P_µ = N[Ψ, µ] ⊂ X,

and a control function K : IR^n → U, assuring the existence of a global (i.e. defined for all t ≥ 0) solution for the closed–loop system, such that

D+Ψ[x, f(x, K(x))] + g(x, K(x)) ≤ 0        (7)

for all x ∈ P_µ. Then we say that Ψ is a suboptimal value function (SVF) and K(x) is a suboptimal control for the constrained problem.

Denoting by D+Ψ(x(·)) the upper–right Dini derivative of Ψ(x(·)), we have that

D+Ψ(x(t)) = D+Ψ[x, f(x, K(x))] ≤ −g(x(t), K(x(t)))        (8)

This condition implies that P_µ is positively invariant for the closed–loop system. Therefore, for each initial condition inside P_µ the constraints are not violated. Moreover, since g is positive definite, the closed–loop system is stable. Furthermore, by integrating (8) we get

∫_0^∞ g(x(t), u(t)) dt ≤ Ψ(x_0)        (9)

namely the function Ψ evaluated at x_0 is an upper bound for the cost associated with the control K(x) and the initial condition x_0.

To determine a suboptimal solution we consider the following Euler Approximating System (EAS)

x(k + 1) = x(k) + τ f(x(k), u(k))        (10)

where τ > 0 is the time parameter. We associate with this system the cost function

J_D = ∑_{k=0}^∞ g(x(k), u(k)) τ        (11)

and, similarly to what has been done in the continuous–time case, we define a suboptimal value function for the EAS.

Definition 2.2 Let Ψ_D : IR^n → IR_+ be a locally Lipschitz positive definite function. Assume that there exists µ > 0 such that the sub-level set is inside the constraint set,

P_µ = {x : Ψ_D(x) ≤ µ} ⊂ X,

and a control function K : IR^n → U such that

Ψ_D(x + τ f(x, K(x))) − Ψ_D(x) + τ g(x, K(x)) ≤ 0        (12)

for all x ∈ P_µ. Then we say that Ψ_D is a suboptimal value function (SVF) and K is a suboptimal control for the discrete–time constrained problem.

Again, the control u(k) = K(x(k)), as in Definition 2.2, assures that the system is stable with domain of attraction P_µ. Moreover, for every initial condition x(0) = x_0 ∈ P_µ the constraints are not violated and

∑_{k=0}^∞ g(x(k), K(x(k))) τ ≤ Ψ_D(x_0)

Now we will restrict our attention to convex SVFs, showing how an SVF for the discrete–time EAS and the corresponding suboptimal control result in an SVF and a suboptimal control for the continuous–time system.


Theorem 2.1 Assume that Ψ_D(x) is a convex suboptimal value function and K(x) is a suboptimal control for the EAS (10) (for given τ > 0) according to Definition 2.2. If K(x) assures the existence of a solution for the continuous–time system, then it is a suboptimal control and Ψ_D(x) is a SVF for the continuous–time system. In particular, for x(0) ∈ P_µ, the closed–loop system state satisfies the constraints, converges to zero, and bound (9) is satisfied with Ψ = Ψ_D.

The consequences of the theorem are important and they are stated in the following corollary.

Corollary 2.1 Let Ψ̂(x_0) be the optimal cost of the continuous–time optimal control problem with objective J, constraints (4) and initial state x_0 (the cost–to–go function). Let Ψ̂^(τ)(x_0) be the cost–to–go function for the EAS (10) with time parameter τ. Assume that Ψ_D is a convex SVF for the EAS (10). Let 0 < τ′ < τ. Then for all x_0 ∈ P_µ := N[Ψ̂, µ] ⊂ X the following condition holds:

Ψ̂(x_0) ≤ Ψ̂^(τ′)(x_0) ≤ Ψ̂^(τ)(x_0) ≤ Ψ_D(x_0)        (13)

The previous results show that the performance of the EAS dominates that of the corresponding continuous–time system. We recall that similar results hold in the context of the persistent disturbance rejection (L_∞–norm bounded) problem [5, 11]. Clearly, here we are considering a different matter, our cost being expressed as an integral.

In our definition of SVF we have explicitly required the existence of a control such that the closed–loop system admits a solution and (7) and (12) are satisfied. The existence of a solution is not an issue for discrete–time systems. However, in the continuous–time case some regularity properties are required for the control K. We have the following proposition.

Proposition 1 Assume that (3) is control affine, i.e. ẋ = A(x) + B(x)u, and continuous. Assume that g(x, u) is strictly convex¹ w.r.t. u, and that Ψ_D(x) is a convex SVF for the EAS (10). Assume that U is compact (besides convex). Consider the control defined as the unique minimizer of the function

ω(x, u) := Ψ_D(x + τA(x) + τB(x)u) + g(x, u) τ

namely κ(x) := arg min_{u∈U} ω(x, u). Then the control κ(x) is continuous and it is a suboptimal control (associated with the SVF Ψ_D) for the continuous–time system.

In the cases in which K(x) may be not continuous, it can be shown that, since we are dealing with a convex Lyapunov function, we can always provide a (possibly discontinuous) control whose closed–loop solution is defined and which is "stabilizing" in the sense of [7].

In the next proposition we state that, by reducing τ to zero, we get suboptimal costs which are arbitrarily close to the optimal ones, as long as the cost–to–go function for the continuous–time problem is convex. Its proof is reported in the appendix.

Proposition 2 Assume that the positive definite function Ψ(x) is a convex suboptimal value function for the continuous–time system, defined and bounded over a compact neighborhood of the origin P. Assume that there exists a neighborhood W of the origin which is a domain of attraction for the EAS for some τ* > 0 (with some control K(x)). Then for any ε > 0 there exists τ̄ such that, for 0 < τ ≤ τ̄, the cost–to–go function for the EAS Ψ̂^(τ)(x) is such that

Ψ̂^(τ)(x) ≤ Ψ(x) + ε,  for all x ∈ P.

In particular, if Ψ̂(x) is the cost–to–go function of the continuous–time system, we have

Ψ̂(x) ≤ Ψ̂^(τ)(x) ≤ Ψ̂(x) + ε,  for all x ∈ P        (14)

3 Linear systems with convex constraints and cost

In this section we assume that the system is linear time–invariant,

ẋ(t) = Ax(t) + Bu(t)        (15)

so that the corresponding EAS is

x(k + 1) = [I + τA]x(k) + τBu(k).        (16)

Next, we will work under the following assumption.

Assumption 1 The pair (A, B) is stabilizable and the sets X and U are convex, closed and include the origin in their interior. The function g is convex (besides being positive definite).

The following fundamental property is well known (see for instance [2]).

Proposition 3 Let Ψ̂(x_0) and Ψ̂^(τ)(x_0) be the optimal values of the optimal control problem with constraints (4) and initial condition x_0, for (15) and (16) respectively. Then Ψ̂(x_0) and Ψ̂^(τ)(x_0) are both convex functions, defined on an open convex set including the origin.
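Proposition 1's one–step minimizer κ(x) = arg min_{u∈U} ω(x, u), combined with the two time scales T and τ introduced earlier, can be sketched on a scalar example. The plant ẋ = x + u, the cost g = x² + u², the SVF Ψ_D(x) = 3x², the input set U = [−5, 5], and all parameter values below are illustrative assumptions, not the paper's example.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative scalar setup: unstable plant xdot = x + u.
f = lambda x, u: x + u
g = lambda x, u: x**2 + u**2

tau = 0.1        # model time parameter (EAS)
T = 0.001        # implementation sampling time (much smaller than tau)
Psi_D = lambda x: 3.0 * x**2   # convex SVF for the EAS (assumed)

def kappa(x):
    """One-step minimizer of omega(x,u) = Psi_D(x + tau*f(x,u)) + g(x,u)*tau
    over the compact input set U = [-5, 5]."""
    omega = lambda u: Psi_D(x + tau * f(x, u)) + g(x, u) * tau
    return minimize_scalar(omega, bounds=(-5.0, 5.0), method="bounded").x

# Apply the EAS-derived control to the continuous-time plant: the control
# is recomputed every T seconds, while the minimization itself uses tau.
x, cost = 2.0, 0.0
for _ in range(5000):          # 5 seconds of simulated time
    u = kappa(x)
    cost += g(x, u) * T        # accumulate the continuous-time cost
    x = x + T * f(x, u)        # fine-grained integration step
```

In this run the state converges near zero and the accumulated continuous–time cost stays below Ψ_D(x_0) = 12, consistent with the bound of Theorem 2.1, even though the minimization uses the coarse parameter τ rather than T.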

¹ We mean g(x, αu_1 + (1 − α)u_2) < αg(x, u_1) + (1 − α)g(x, u_2), for 0 < α < 1.

[...] for every ε > 0 there exist N̄ and τ̄ such that for τ < τ̄ and N > N̄ we have (1 − ε)S ⊂ S_N^(τ) ⊂ S.

In general, determining a domain of attraction for a (even linear) system under constraints is hard. For instance, the N–step controllability set for a system with linear constraints has a representation which grows exponentially with the number of steps [9]. The above domain does not need to be explicitly determined: indeed, its implicit determination is straightforward, because checking whether x_0 ∈ S_N^(τ) is equivalent to checking the feasibility of the corresponding optimization problem.

In particular, this observation can be useful to determine whether a certain candidate polytope X_0 ⊂ X of possible initial states is inside S_N^(τ). To this aim, set Ψ̂_N^(τ)(x_0) = ∞ if the problem is unfeasible for the initial state x_0, and define M = max_{x_0 ∈ vert{X_0}} Ψ̂_N^(τ)(x_0), the maximum of the costs over the set of vertices vert{X_0}. Note that M does not require the knowledge of Ψ̂_N^(τ), but it can be computed by applying the procedure with all the elements of vert{X_0} as initial states. With the above notation in mind we can introduce the next corollary (see also [6] in the context of linear–quadratic constrained optimal control).

Corollary 4.1 The polytope X_0 ⊂ X is included in S_N^(τ) if and only if M < ∞. Furthermore, if X_0 is the µ–ball of a given norm ‖·‖, X_0 = {x : ‖x‖ ≤ µ}, and g is positively homogeneous of degree p, i.e., for λ > 0, g(λx, λu) = λ^p g(x, u), then inside X_0 the bound

J(x_0) ≤ M (‖x_0‖ / µ)^p

holds.

We remind the reader that the exponential approximation of a continuous–time system guarantees that, for piecewise–constant inputs, the sampled continuous–time trajectory coincides with the discrete–time trajectory. However, the continuous–time trajectory has a cost which is not necessarily upper bounded by the cost of the discrete–time trajectory, and it could even be unfeasible, whereas the upper bound and feasibility are assured by the EAS, although the discrete– and continuous–time trajectories may be quite distant.

5 Concluding discussions

A suboptimal receding–horizon controller for continuous–time systems, based on the Euler approximation, has been proposed. It has been shown that this controller guarantees closed–loop stability and constraint satisfaction, and that the resulting continuous–time cost is upper bounded by the cost of the discrete–time Euler Approximating System.

Appendix: Proof of Proposition 2

[...] which doesn't exactly fit our case. To prove the proposition we actually prove the following: for all x ∈ P the bound

Ψ̂^(τ)(x) ≤ (1 + ε)Ψ(x) + ε        (19)

holds true. Since Ψ(x) is bounded over P, say Ψ(x) ≤ M, this is equivalent to Ψ̂^(τ)(x) ≤ Ψ(x) + ε′ with ε′ = ε(1 + M), arbitrarily small.

Consider the set W = {x : Ψ̂^(τ*)(x) ≤ ε} and take ε small enough in such a way that W ⊂ int{P} is a domain of attraction for the EAS. For every τ ≤ τ* the optimal transient cost starting from any x̃ ∈ W is less than ε by Corollary 2.1. Then, to prove (19), we need to show that the optimal cost for the EAS to reach W from x_0 ∈ P is bounded by (1 + ε)Ψ(x_0). We do this by showing that for every ε there exists τ̄ (without restriction we assume τ̄ ≤ τ*) such that for all 0 < τ ≤ τ̄ the following property holds: for all x ∈ Z = P − int{W} there exists u ∈ U such that

(1 + ε) [Ψ(x + τ f(x, u)) − Ψ(x)] / τ + g(x, u) ≤ 0        (20)

By contradiction, assume that there exist sequences τ_k → 0, x_k ∈ Z such that, for all u ∈ U and all k,

∆_k = (1 + ε) [Ψ(x_k + τ_k f(x_k, u)) − Ψ(x_k)] / τ_k + g(x_k, u) > 0.        (21)

Since Z is compact, x_k admits a converging subsequence. Let x̄ be the limit of such a converging subsequence; without restriction, let us assume x_k → x̄. For such an x̄ there exist τ̄ and ū = K(x̄) such that

ν := (1 + ε) [Ψ(x̄ + τ̄ f(x̄, ū)) − Ψ(x̄)] / τ̄ + g(x̄, ū) < 0.        (22)

This in turn comes from the assumption

lim_{τ→0+} [Ψ(x̄ + τ f(x̄, ū)) − Ψ(x̄)] / τ ≤ −g(x̄, ū)

so that the difference quotient must be negative for τ̄ small enough. Now we have the following (we recall that the difference quotient is a non–increasing function of τ):

∆_k = (1 + ε) [Ψ(x_k + τ_k f(x_k, ū)) − Ψ(x_k)] / τ_k + g(x_k, ū)
    ≤ (1 + ε) [Ψ(x_k + τ̄ f(x_k, ū)) − Ψ(x_k)] / τ̄ + g(x_k, ū)
    = (1 + ε) [Ψ(x̄ + τ̄ f(x̄, ū)) − Ψ(x̄)] / τ̄ + g(x̄, ū) + a_k + b_k + c_k
    = ν + a_k + b_k + c_k

where a_k, b_k and c_k account for replacing x_k by x̄. By continuity of f and Ψ, the quantities a_k, b_k and c_k converge to 0 as k → ∞. Therefore, for k large enough, ∆_k becomes negative, in contradiction with (21).

To complete the proof, note that from equation (20) we have that for each x ∈ Z there exists u(x) ∈ U such that

(1 + ε) [Ψ(x + τ f(x, u(x))) − Ψ(x)] ≤ −τ g(x, u(x)) ≤ 0

Being Ψ positive definite, the control u(x) assures the state to reach W in finite time from any x_0. Assume that x̃ = x(Ñ) ∈ W for some finite Ñ. Then, by definition, Ψ̂^(τ*)(x̃) ≤ ε, and by Corollary 2.1 we have Ψ̂^(τ)(x̃) ≤ ε for τ ≤ τ̄ ≤ τ*. Then, from the above expression we get

∑_{k=0}^{Ñ−1} g(x(k), u(k)) τ ≤ (1 + ε) [Ψ(x_0) − Ψ(x̃)] ≤ (1 + ε) Ψ(x_0)        (23)

Now, as mentioned above, the transient cost from x̃ ∈ W to 0 is bounded by ε, and then

∑_{k=Ñ}^∞ g(x(k), u(k)) τ ≤ ε        (24)

Summing (23) and (24) we get a bound for the optimal cost which does not exceed (1 + ε)Ψ(x_0) + ε, and this proves (19).

References

[1] Bardi M. and Capuzzo–Dolcetta I., Optimal Control and Viscosity Solutions of Hamilton–Jacobi–Bellman Equations, Birkhäuser, Boston, 1997.
[2] Bemporad A., Morari M., Dua V. and Pistikopoulos E.N., "The explicit linear quadratic regulator for constrained systems", Automatica, vol. 38, no. 1, pp. 3–20, Jan. 2002.
[3] Blanchini F., "Set invariance in control – a survey", Automatica, vol. 35, no. 11, pp. 1747–1767, 1999.
[4] Blanchini F. and Miani S., "Constrained stabilization of continuous–time linear systems", Syst. & Contr. Lett., vol. 28, no. 2, pp. 95–102, 1996.
[5] Blanchini F. and Sznaier M., "Rational L1 suboptimal compensators for continuous–time systems", IEEE Trans. Autom. Contr., vol. 39, no. 7, pp. 1487–1492, July 1994.
[6] Chmielewski D. and Manousiouthakis V., "On constrained infinite–time linear quadratic optimal control", Syst. & Contr. Lett., vol. 29, pp. 121–129, 1996.
[7] Clarke F.H., Ledyaev Yu.S., Sontag E.D. and Subbotin A.I., "Asymptotic controllability implies feedback stabilization", IEEE Trans. Autom. Contr., vol. 42, pp. 1394–1407, 1997.
[8] Garcia C.E., Prett D.M. and Morari M., "Model predictive control: theory and practice – a survey", Automatica, vol. 25, no. 3, pp. 335–348, 1989.
[9] Gutman P.O. and Cwikel M., "Convergence of an algorithm to find maximal state constraint sets for discrete–time linear dynamical systems with bounded controls and states", IEEE Trans. Autom. Contr., vol. 31, no. 5, pp. 457–459, 1986.
[10] Kerrigan E.C. and Maciejowski J.M., "Invariant sets for constrained nonlinear discrete–time systems with application to feasibility in model predictive control", Proc. 39th IEEE Conf. Decision and Control, Sydney, Australia, Dec. 2000.
[11] Lu W.M., "Rejection of persistent L∞–bounded disturbances for nonlinear systems", IEEE Trans. Autom. Contr., vol. 43, no. 12, pp. 1692–1702, 1998.
[12] Magni L., De Nicolao G., Magnani L. and Scattolini R., "A stabilizing model–based predictive control algorithm for nonlinear systems", Automatica, vol. 37, pp. 1351–1362, 2001.
[13] Mayne D.Q., "Control of constrained dynamic systems", Europ. J. Contr., vol. 7, pp. 87–99, 2001.
[14] Mayne D.Q., Rawlings J.B., Rao C.V. and Scokaert P.O.M., "Constrained model predictive control: stability and optimality", Automatica, vol. 36, pp. 789–814, 2000.
[15] Michalska H. and Mayne D.Q., "Robust receding–horizon control of constrained nonlinear systems", IEEE Trans. Autom. Contr., vol. 38, no. 11, pp. 1623–1633, 1993.
[16] Parisini T. and Zoppoli R., "A receding–horizon regulator for nonlinear systems and a neural approximation", Automatica, vol. 31, no. 10, pp. 1443–1451, 1995.
[17] Yang T.H. and Polak E., "Moving horizon control of nonlinear systems with input saturation, disturbances and plant uncertainty", Int. J. Contr., vol. 58, pp. 875–903, 1993.