BULLETIN of the Malaysian Mathematical Sciences Society

Bull. Malays. Math. Sci. Soc. (2) 29(1) (2006), 131–146

http://math.usm.my/bulletin

Direct Gradient Descent Control as a Dynamic Feedback Control for Linear System

J. Naiborhu, S.M. Nababan, R. Saragih and I. Pranoto

Department of Mathematics, Institut Teknologi Bandung, Jl. Ganesa 10, Bandung 40132, Indonesia

Abstract. In this paper we study the stabilization of linear control systems by dynamic state/output feedback control. We prove that a dynamic state feedback control exists whenever a static state feedback control exists. The existence of a dynamic output feedback control, however, cannot be guaranteed by the existence of a static output feedback control alone; a passivity condition is also needed. By applying the direct gradient descent control as a dynamic state feedback control to a linear system, we obtain a condition guaranteeing the asymptotic stability of the system.

2000 Mathematics Subject Classification: 93D05, 93D15, 49J50

Key words and phrases: Dynamic feedback control, direct gradient descent control.

1. Introduction

A well-known method for obtaining arbitrary asymptotic behavior of a linear system is the pole-assignment method. In this paper we consider the stabilization problem of a linear system by dynamic feedback control. By dynamic feedback control we mean the control obtained by adding an integrator to the system; this definition differs from the one proposed by Isidori [3] or Sontag [10]. Consider the linear system

(1.1) $\dot{x} = Ax + Bu,\quad x \in \mathbb{R}^n,\ u \in \mathbb{R}^r,$
(1.2) $y = Cx,\quad y \in \mathbb{R}^m,$

where $A$, $B$ and $C$ are real matrices of dimension $(n \times n)$, $(n \times r)$ and $(m \times n)$. We are interested in finding a dynamic feedback control law

(1.3) $\dot{u} = K_1 x + K_2 u$

or

(1.4) $\dot{u} = L_1 \dot{y} + L_2 y + L_3 u$

Received: March 19, 2004; Revised: November 4, 2004.


that makes the extended system

(1.5) $\dot{x} = Ax + Bu,\quad \dot{u} = K_1 x + K_2 u$

or

(1.6) $\dot{x} = Ax + Bu,\quad \dot{u} = L_1 \dot{y} + L_2 y + L_3 u,\quad y = Cx,$

asymptotically stable about $(x, u) = (0, 0)$.

In [11], it has been proved that if the nonlinear system

(1.7) $\dot{x} = f(x, u)$

is locally asymptotically stabilizable by a $C^1$ feedback law $u(x)$, then

(1.8) $\dot{x} = f(x, u),$
(1.9) $\dot{u} = v$

is locally asymptotically stabilizable. Coron and Praly [2] studied in more detail the relationship between the stabilizability of system (1.7) and that of the extended system (1.8)-(1.9). In the case where system (1.8)-(1.9) is considered as a cascade system, Krstic et al. [4] calculated $v$ by the backstepping method.

The purpose of this paper is as follows. Based on the results in [2, 4, 11], we show that for a linear system a dynamic state/output feedback control exists if a static state/output feedback control exists such that the system is asymptotically stable. Then, we apply the direct gradient descent control [6, 7] as a dynamic feedback control to stabilize the linear system.

2. Existence of dynamic feedback

2.1. Dynamic state feedback control. Consider a linear system

(2.1) $\dot{x} = Ax + Bu,\quad x \in \mathbb{R}^n,\ u \in \mathbb{R}^r,$

where $A$ and $B$ are real matrices of dimension $(n \times n)$ and $(n \times r)$. Assume that there exists $K$ such that $A + BK$ is Hurwitz. Then, by the static feedback control law

(2.2) $u = Kx,$

system (2.1) is asymptotically stable. However, if we take the time derivative of the static feedback control (2.2), the extended system formed together with (2.1), i.e.,

(2.3) $\dot{x} = Ax + Bu,\quad \dot{u} = KAx + KBu,$

is not always asymptotically stable.
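This can be seen already for the scalar plant used in Example 2.1 below ($\dot{x} = x + u$, $K = -2$). A quick numerical check, our own illustration assuming NumPy, shows that the matrix of the naively differentiated system (2.3) has a zero eigenvalue:

```python
import numpy as np

# Scalar plant x' = x + u with stabilizing static gain K = -2 (A + BK = -1).
A, B, K = np.array([[1.0]]), np.array([[1.0]]), np.array([[-2.0]])

# Extended system (2.3): x' = Ax + Bu, u' = KAx + KBu.
ext = np.block([[A, B],
                [K @ A, K @ B]])
print(np.linalg.eigvals(ext))  # eigenvalues 0 and -1: not asymptotically stable
```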

In the following proposition, however, we show that a dynamic feedback controller of the form $\dot{u} = K_1 x + K_2 u$ exists whenever the static feedback control (2.2) exists.


Proposition 2.1. Consider the linear system (2.1) and assume that there exists $K$ such that $A + BK$ is Hurwitz. Then there exist $K_1$ and $K_2$ such that the extended system

(2.4) $\dot{x} = Ax + Bu,\quad \dot{u} = K_1 x + K_2 u$

becomes asymptotically stable about $(x, u) = (0, 0)$.

Proof. Consider the augmented system

(2.5) $\dot{x} = Ax + Bu,$
(2.6) $\dot{u} = v.$

Let $A_K = A + BK$. Since $A_K$ is Hurwitz (by assumption), for a given $Q > 0$ there exists $P > 0$ such that the Lyapunov equation below is satisfied:

(2.7) $P A_K + A_K^T P = -Q.$

Let $V(x) = x^T P x$ be a Lyapunov function of system (2.5), and let

(2.8) $z = u - Kx.$

Substitute $z$ into (2.5)-(2.6) to obtain

(2.9) $\dot{x} = A_K x + Bz,$
(2.10) $\dot{z} = v - K(A_K x + Bz).$

Let $V_a(x, z) = x^T P x + z^T z$ be a Lyapunov function of system (2.9)-(2.10). Then we have

(2.11) $\dot{V}_a(x, z) = \dot{x}^T P x + x^T P \dot{x} + 2z^T \dot{z} = x^T (A_K^T P + P A_K) x + 2z^T B^T P x + 2z^T (v - K(A_K x + Bz)).$

From (2.7) we have $x^T (A_K^T P + P A_K) x = -x^T Q x < 0$. Our objective is to find $v$ such that $\dot{V}_a(x, z) < 0$. This holds if

(2.12) $2z^T B^T P x + 2z^T (v - K(A_K x + Bz)) \le 0.$

There are many ways to find $v$ such that (2.12) is satisfied. One of them is to find $v$ such that, for some $R > 0$,

(2.13) $2z^T B^T P x + 2z^T (v - K(A_K x + Bz)) = -2z^T R z.$

To satisfy (2.13), it is sufficient that

(2.14) $v = K(A_K x + Bz) - B^T P x - Rz.$

So,

(2.15) $\dot{V}_a(x, z) = -x^T Q x - 2z^T R z < 0.$

Substitute (2.14) into (2.10) to obtain

(2.16) $\dot{x} = A_K x + Bz,$
(2.17) $\dot{z} = -B^T P x - Rz.$

Since (2.15) holds, system (2.16)-(2.17) is asymptotically stable.


Substitute (2.8) into (2.14) to obtain

(2.18) $v = K(A_K x + B(u - Kx)) - B^T P x - R(u - Kx) = (KA - B^T P + RK)x + (KB - R)u.$

Substitute (2.18) into (2.6). Then the resulting system in the $(x, u)$ coordinates is

(2.19) $\dot{x} = Ax + Bu,$
(2.20) $\dot{u} = (KA - B^T P + RK)x + (KB - R)u.$

Since system (2.16)-(2.17) is equivalent to system (2.19)-(2.20), system (2.4) is asymptotically stable with $K_1 = KA - B^T P + RK$ and $K_2 = KB - R$. □

In the proof of Proposition 2.1 we can see the stability improvement of system (2.1). Consider equation (2.1) as the original system and equations (2.19)-(2.20) as the extended system. For the original system the Lyapunov function is $V(x) = x^T P x$, and for the extended system it is $V_a(x) = x^T P x + (u - Kx)^T (u - Kx)$, where $K$ is chosen such that $A + BK$ is Hurwitz. Under the static feedback control $u = Kx$, $\dot{V}(x) = -x^T Q x < 0$, and based on this static feedback control we get $\dot{u} = (KA - B^T P + RK)x + (KB - R)u$, $\dot{V}_a(x) = -x^T Q x - 2(u - Kx)^T R (u - Kx) < 0$, and

(2.21) $\dot{V}_a(x) \le \dot{V}(x) < 0.$

This means that by adding an integrator into the system, the extended system goes to zero quicker than, or at least as quickly as, the original system.
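The proof is constructive. Below is a minimal numerical sketch of the construction, not part of the original paper, assuming NumPy and SciPy; `dynamic_state_feedback` is our own helper name, and the matrices are those of Example 2.2 below.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def dynamic_state_feedback(A, B, K, Q=None, R=None):
    """Gains K1, K2 of Proposition 2.1 from a stabilizing static gain K."""
    n, r = B.shape
    Q = np.eye(n) if Q is None else Q
    R = np.eye(r) if R is None else R
    AK = A + B @ K                        # Hurwitz by assumption
    # Solve the Lyapunov equation (2.7): A_K^T P + P A_K = -Q.
    P = solve_continuous_lyapunov(AK.T, -Q)
    K1 = K @ A - B.T @ P + R @ K          # as in (2.20)
    K2 = K @ B - R
    return K1, K2, P

# Plant and static gain of Example 2.2 below.
A = np.array([[0.0, 1.0], [-1.0, 2.0]])
B = np.array([[0.0], [1.0]])
K = np.array([[-0.412, -4.4142]])
K1, K2, _ = dynamic_state_feedback(A, B, K)
ext = np.block([[A, B], [K1, K2]])        # extended system (2.4)
print(np.linalg.eigvals(ext))             # all real parts negative
```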

Example 2.1. Consider the linear system $\dot{x}(t) = x(t) + u(t)$. By the static control law $u_s(t) = -2x(t)$ we have $x_s(t) = x_0 e^{-t}$, and the system becomes asymptotically stable. Applying Proposition 2.1 to compute the dynamic feedback control, we have the extended system

(2.22) $\dot{x}(t) = x(t) + u(t),$
(2.23) $\dot{u}(t) = -4.5x(t) - 3u(t),$

which has the solution

$x_d(t) = x_0 e^{-t} \cos\tfrac{\sqrt{2}}{2}t \quad\text{and}\quad u_d(t) = -2x_0 e^{-t}\left(\cos\tfrac{\sqrt{2}}{2}t + \tfrac{\sqrt{2}}{4}\sin\tfrac{\sqrt{2}}{2}t\right).$

Based on this calculation we conclude that

$|x_d(t)| \le |x_s(t)|.$
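A quick numerical check of this closed-form solution, as our own illustrative script assuming SciPy:

```python
import numpy as np
from scipy.integrate import solve_ivp

x0 = 1.0
# Extended system (2.22)-(2.23) with x(0) = x0, u(0) = u_d(0) = -2*x0.
rhs = lambda t, w: [w[0] + w[1], -4.5 * w[0] - 3.0 * w[1]]
sol = solve_ivp(rhs, (0.0, 5.0), [x0, -2.0 * x0], rtol=1e-9, atol=1e-12)

xd = x0 * np.exp(-sol.t) * np.cos(np.sqrt(2.0) / 2.0 * sol.t)  # claimed x_d(t)
print(np.max(np.abs(sol.y[0] - xd)))   # agrees up to integration tolerance
```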


In other words, the system with dynamic feedback control goes to zero quicker than the system with static control.

Example 2.2. Consider the linear system

(2.24) $\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ -1 & 2 \end{bmatrix}\begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u,\qquad A = \begin{bmatrix} 0 & 1 \\ -1 & 2 \end{bmatrix},\ B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$

The unforced dynamics of system (2.24) is unstable, but the system is controllable and can therefore be stabilized by static feedback control. From the optimal control problem with $V(x(0), u(\cdot), 0) = \int_0^\infty (x^T x + u^2)\, dt$, we have

(2.25) $u(t) = Kx(t) = \begin{bmatrix} -0.412 & -4.4142 \end{bmatrix} x(t).$

Based on the static feedback control (2.25), we calculate the dynamic feedback control (2.20). First we find $P$ such that $A_K^T P + P A_K = -Q$ with $Q = I$:

(2.26) $\begin{bmatrix} 0 & -1.412 \\ 1 & -2.4142 \end{bmatrix}\begin{bmatrix} p_1 & p_2 \\ p_2 & p_4 \end{bmatrix} + \begin{bmatrix} p_1 & p_2 \\ p_2 & p_4 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ -1.412 & -2.4142 \end{bmatrix} = -\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$

i.e.,

$\begin{bmatrix} -2.824 p_2 & p_1 - 2.4142 p_2 - 1.412 p_4 \\ p_1 - 2.4142 p_2 - 1.412 p_4 & 2p_2 - 4.8284 p_4 \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}.$

From equation (2.26) we have

(2.27) $P = \begin{bmatrix} \tfrac{3.8262}{2.824} & \tfrac{1}{2.824} \\ \tfrac{1}{2.824} & \tfrac{1}{2.824} \end{bmatrix}.$

Taking $R = I$, we have

(2.28) $K_1 = \begin{bmatrix} 4.4142 & -9.2404 \end{bmatrix} - \begin{bmatrix} \tfrac{1}{2.824} & \tfrac{1}{2.824} \end{bmatrix} + \begin{bmatrix} -0.412 & -4.4142 \end{bmatrix} = \begin{bmatrix} 4.0022 - \tfrac{1}{2.824} & -13.6546 - \tfrac{1}{2.824} \end{bmatrix},$

(2.29) $K_2 = \begin{bmatrix} -0.412 & -4.4142 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} - 1 = -5.4142.$

Then we have the extended system

(2.30) $\dot{x}(t) = \begin{bmatrix} 0 & 1 \\ -1 & 2 \end{bmatrix}\begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t),$

(2.31) $\dot{u}(t) = \left(4.0022 - \tfrac{1}{2.824}\right) x_1(t) + \left(-13.6546 - \tfrac{1}{2.824}\right) x_2(t) - 5.4142\, u(t).$

Simulation results are shown in Fig. 1 with initial conditions $x_1(0) = 1$, $x_2(0) = 0.5$, $u(0) = -2.6191$. From the simulation results we see that the extended system goes to zero quicker than the original system.
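A minimal simulation sketch reproducing this comparison, assuming SciPy, with the gains as computed above:

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [-1.0, 2.0]])
B = np.array([[0.0], [1.0]])
K = np.array([[-0.412, -4.4142]])
k1, k2 = 4.0022 - 1.0 / 2.824, -13.6546 - 1.0 / 2.824   # gains of (2.31)

def static(t, x):                      # closed loop under u = Kx
    return (A + B @ K) @ x

def dynamic(t, w):                     # extended system (2.30)-(2.31)
    x, u = w[:2], w[2]
    return np.concatenate([A @ x + (B * u).ravel(),
                           [k1 * x[0] + k2 * x[1] - 5.4142 * u]])

x0 = [1.0, 0.5]
s = solve_ivp(static, (0.0, 10.0), x0, max_step=0.01)
d = solve_ivp(dynamic, (0.0, 10.0), x0 + [-2.6191], max_step=0.01)
print(np.abs(s.y[:2, -1]), np.abs(d.y[:2, -1]))  # dynamic decays faster (Fig. 1)
```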

Figure 1. Static: static feedback control; dynamic: dynamic feedback control. (Panels: $x_1(t)$, $x_2(t)$ and $u(t)$ versus time under the static and dynamic feedback laws.)

2.2. Dynamic output feedback control. Consider the linear plant

(2.32) $\dot{x} = Ax + Bu,\quad x \in \mathbb{R}^n,\ u \in \mathbb{R}^r,$
(2.33) $y = Cx,\quad y \in \mathbb{R}^m,$

where $A$, $B$ and $C$ are real matrices of dimension $(n \times n)$, $(n \times r)$ and $(m \times n)$. It is well known that a stabilizing dynamic feedback compensator can be constructed if the system (2.32)-(2.33) is both stabilizable and detectable; see, for example, [5]. In this section, we find $K_1$, $K_2$ and $K_3$ such that a dynamic controller of the form

(2.34) $\dot{u} = K_1 \dot{y} + K_2 y + K_3 u$

stabilizes the system (2.32) asymptotically.

Proposition 2.2. Consider the linear plant (2.32)-(2.33) and assume that there exists $K$ such that $A + BKC$ is Hurwitz and $B^T P = C$, where $P > 0$, $P = P^T$, satisfies $P(A + BKC) + (A + BKC)^T P = -Q$ for a given $Q > 0$. Then there exist $K_1$, $K_2$ and $K_3$ such that the extended system

(2.35) $\dot{x} = Ax + Bu,\quad \dot{u} = K_1 \dot{y} + K_2 y + K_3 u$

becomes asymptotically stable about $(x, u) = (0, 0)$.


Proof. Consider the augmented system

(2.36) $\dot{x} = Ax + Bu,$
(2.37) $\dot{u} = v.$

Let $\bar{K} = KC$ and $A_{\bar{K}} = A + B\bar{K}$. Since $A_{\bar{K}}$ is Hurwitz (by assumption), for a given $Q > 0$ there exists $P > 0$ such that the Lyapunov equation below is satisfied:

(2.38) $P A_{\bar{K}} + A_{\bar{K}}^T P = -Q.$

Let $V(x) = \tfrac{1}{2}x^T P x$ be a Lyapunov function of system (2.36), and let

(2.39) $z = u - \bar{K}x.$

Substitute $z$ into (2.36)-(2.37) to obtain

(2.40) $\dot{x} = A_{\bar{K}} x + Bz,$
(2.41) $\dot{z} = v - \bar{K}(A_{\bar{K}} x + Bz).$

Let $V_a(x, z) = x^T P x + z^T z$ be a Lyapunov function of system (2.40)-(2.41). Then we have

(2.42) $\dot{V}_a(x, z) = \dot{x}^T P x + x^T P \dot{x} + 2z^T \dot{z} = x^T (A_{\bar{K}}^T P + P A_{\bar{K}}) x + 2z^T B^T P x + 2z^T (v - \bar{K}(A_{\bar{K}} x + Bz)).$

From (2.38) we have $x^T (A_{\bar{K}}^T P + P A_{\bar{K}}) x = -x^T Q x < 0$. Our objective is to find $v$ such that $\dot{V}_a(x, z) < 0$. This holds if

(2.43) $2z^T B^T P x + 2z^T (v - \bar{K}(A_{\bar{K}} x + Bz)) \le 0.$

There are many ways to find $v$ such that (2.43) is satisfied. One of them is to find $v$ such that, for some $R > 0$,

(2.44) $2z^T B^T P x + 2z^T (v - \bar{K}(A_{\bar{K}} x + Bz)) = -2z^T R z.$

To satisfy (2.44), it is sufficient that

(2.45) $v = \bar{K}(A_{\bar{K}} x + Bz) - B^T P x - Rz.$

So,

(2.46) $\dot{V}_a(x, z) = -x^T Q x - 2z^T R z < 0.$

Substitute (2.45) into (2.41) to obtain

(2.47) $\dot{x} = A_{\bar{K}} x + Bz,$
(2.48) $\dot{z} = -B^T P x - Rz.$

Since (2.46) holds, system (2.47)-(2.48) is asymptotically stable. Substitute (2.39) into (2.45) to obtain

(2.49) $v = \bar{K}(A_{\bar{K}} x + B(u - \bar{K}x)) - B^T P x - R(u - \bar{K}x) = (\bar{K}A - B^T P + R\bar{K})x + (\bar{K}B - R)u.$

By the condition $B^T P = C$, (2.49) becomes

(2.50) $v = \bar{K}(Ax + Bu) + (-C + R\bar{K})x - Ru = K\dot{y} + (-I + RK)y - Ru.$


Substitute (2.50) into (2.37). Then the resulting system in the $(x, u)$ coordinates is

(2.51) $\dot{x} = Ax + Bu,$
(2.52) $\dot{u} = K\dot{y} + (-I + RK)y - Ru.$

Since system (2.47)-(2.48) is equivalent to system (2.51)-(2.52), system (2.35) is asymptotically stable with $K_1 = K$, $K_2 = -I + RK$ and $K_3 = -R$. □

By Proposition 2.2, the existence of a static output feedback control is not enough to guarantee the existence of the dynamic output feedback control (2.34); the condition $B^T P = C$ (a passivity condition [8]) is needed in the proposition to guarantee it.
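A numerical sketch of this construction, as our own illustration assuming NumPy, using the plant of Example 2.3 below; it checks the passivity condition $B^T P = C$ for a candidate $P$ and confirms that the resulting closed loop is Hurwitz:

```python
import numpy as np

A = np.array([[1.0, 0.0], [-1.0, -2.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[1.0, 0.0]])
K = np.array([[-2.0]])                 # static output gain: A + BKC Hurwitz
R = np.eye(1)

AK = A + B @ K @ C
P = np.eye(2)                          # candidate P; here B^T P = C exactly
Q = -(P @ AK + AK.T @ P)               # Q induced by the Lyapunov equation
assert np.allclose(B.T @ P, C)         # passivity condition of Proposition 2.2
assert np.all(np.linalg.eigvalsh(Q) > 0)   # Q > 0, so P is admissible

K1, K2, K3 = K, -np.eye(1) + R @ K, -R     # gains from Proposition 2.2
# Closed loop with u' = K1*y' + K2*y + K3*u and y' = C(Ax + Bu):
ext = np.block([[A, B],
                [K1 @ C @ A + K2 @ C, K1 @ C @ B + K3]])
print(np.linalg.eigvals(ext))          # all real parts negative
```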

Example 2.3. Consider the linear plant

(2.53) $\dot{x}(t) = \begin{bmatrix} 1 & 0 \\ -1 & -2 \end{bmatrix} x(t) + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u,$
(2.54) $y(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} x(t),$

so that $A = \begin{bmatrix} 1 & 0 \\ -1 & -2 \end{bmatrix}$, $B = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $C = \begin{bmatrix} 1 & 0 \end{bmatrix}$. The unforced dynamics of system (2.53) is unstable, but the system is controllable. The plant (2.53)-(2.54) is unobservable. This system can be stabilized by the static output feedback control

(2.55) $u(t) = Ky(t) = -2y(t) = \begin{bmatrix} -2 & 0 \end{bmatrix} x(t),\qquad \bar{K} = KC = \begin{bmatrix} -2 & 0 \end{bmatrix}.$

Based on the static output feedback control (2.55), we stabilize system (2.53) by a dynamic control of the form $\dot{u}(t) = K_1 y(t) + K_2 u(t)$ (here $\dot{y} = \dot{x}_1 = y + u$, so the $\dot{y}$ term of (2.34) can be absorbed into the $y$ and $u$ terms), where

(2.56) $\begin{bmatrix} K_1 & 0 \end{bmatrix} = \bar{K}A - C + R\bar{K} = \begin{bmatrix} -2 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ -1 & -2 \end{bmatrix} - \begin{bmatrix} 1 & 0 \end{bmatrix} + \begin{bmatrix} -2 & 0 \end{bmatrix} = \begin{bmatrix} -5 & 0 \end{bmatrix},$

and therefore $K_1 = -5$. Similarly,

(2.57) $K_2 = \bar{K}B - R = \begin{bmatrix} -2 & 0 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} - 1 = -3.$

Then we have

(2.58) $\dot{x}(t) = \begin{bmatrix} 1 & 0 \\ -1 & -2 \end{bmatrix}\begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u(t),$
(2.59) $\dot{u}(t) = -5y(t) - 3u(t).$

Simulation results are shown in Fig. 2 with initial conditions $x_1(0) = 1$, $x_2(0) = 0.5$, $u(0) = -2$. From the simulation results we see that the extended system goes to zero quicker than the original system.
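A simulation sketch of this comparison, assuming SciPy, with the initial conditions as above:

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[1.0, 0.0], [-1.0, -2.0]])
B = np.array([[1.0], [0.0]])

def static(t, x):                      # u = Ky = -2*x1
    return A @ x + (B * (-2.0 * x[0])).ravel()

def dynamic(t, w):                     # extended system (2.58)-(2.59)
    x, u = w[:2], w[2]
    return np.concatenate([A @ x + (B * u).ravel(),
                           [-5.0 * x[0] - 3.0 * u]])

x0 = [1.0, 0.5]
s = solve_ivp(static, (0.0, 10.0), x0, max_step=0.01)
d = solve_ivp(dynamic, (0.0, 10.0), x0 + [-2.0], max_step=0.01)
print(s.y[:2, -1], d.y[:2, -1])        # both tend to zero; cf. Fig. 2
```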

Figure 2. Static: $u(t) = Ky(t)$; dynamic: $\dot{u}(t) = K_1 y(t) + K_2 u(t)$. (Panels: $x_1(t)$, $x_2(t)$ and $u(t)$ versus time under the static and dynamic output feedback laws.)

3. Direct gradient descent control

Consider a nonlinear control system

(3.1) $\dot{x}(t) = f(x(t), u(t)),\quad x(t_0) = x_0,$

where $x(t) \in \mathbb{R}^n$ is the state vector and $u(t) \in \mathbb{R}^m$ is the control vector. In general, the aim of control is to decrease a performance index $F(x(t), u(t))$ at any time $t$ along the trajectory of system (3.1). In particular, our problem is formulated as follows:

(3.2) $\operatorname{decrease}_{u(t)}\ F(x(t), u(t))$
(3.3) $\text{subject to}\quad \dot{x}(t) = f(x(t), u(t)),\quad x(t_0) = x_0.$

As the class of admissible controls, we consider a space $U_t$ consisting of $m$-dimensional vector-valued functions which are continuous a.e. on $[t_0, t]$, and define the following inner product:

(3.4) $\langle u, v \rangle = \int_{t_0}^{t} u(\tau)^T v(\tau)\, d\tau,$

where $u, v \in U_t$. In simple form, the problem (3.2)-(3.3) can be written as

(3.5) $\operatorname{decrease}_{u(t)}\ F(x(t; u), u(t)),$


where

(3.6) $x(t; u) = x(t_0) + \int_{t_0}^{t} f(x(\tau), u(\tau))\, d\tau.$

A very broad and perhaps the most important class of methods for unconstrained optimization is that based on gradient descent. By applying the gradient descent method, we obtain as a control law the first-order ordinary differential equation

(3.7) $\dot{u}(t) = -\alpha \nabla_u F(x(t; u), u(t)),$

where $\alpha$ can be a constant or a function of $x(t)$, $u(t)$ and $t$. Next, the gradient of the performance index $F(x(t; u), u(t))$ with respect to $u(t)$ will be derived. Assume that

A.1: $f$ and $F$ are continuously differentiable in $(x, u) \in \mathbb{R}^n \times \mathbb{R}^m$.
A.2: $f_x$ and $f_u$ are Lipschitz continuous.

Definition 3.1. The functional $x(t; u)$ is Gateaux differentiable if for arbitrary $s \in U_t$,

(3.8) $\delta_s x(t; u) = \frac{d}{d\varepsilon} x(t; u + \varepsilon s)\Big|_{\varepsilon=0}$

exists.

Theorem 3.1. The functional $x(t; u)$ defined by (3.6) is Gateaux differentiable with

(3.9) $\delta_s x(t; u) = \int_{t_0}^{t} \nabla x(t; u)(\tau)^T s(\tau)\, d\tau,$

where

(3.10) $\nabla x(t; u)(\tau) = f_u(x(\tau; u), u(\tau))^T \Phi(t, \tau)^T,\quad t_0 \le \tau \le t,$

and $\Phi(t, \tau)$ is a continuous transition matrix function.

Proof. Integrating (3.1) from $t_0$ to $t$ with $u + \varepsilon s$ given, we have

(3.11) $x(t; u + \varepsilon s) = x(t_0) + \int_{t_0}^{t} f(x(\tau; u + \varepsilon s), u(\tau) + \varepsilon s(\tau))\, d\tau.$

Differentiating (3.11) with respect to $\varepsilon$ and letting $\varepsilon = 0$, and then differentiating with respect to $t$, we finally obtain

(3.12) $\frac{d}{dt}\frac{d}{d\varepsilon} x(t; u + \varepsilon s)\Big|_{\varepsilon=0} = f_x(x(t; u), u(t))\, \frac{d}{d\varepsilon} x(t; u + \varepsilon s)\Big|_{\varepsilon=0} + f_u(x(t; u), u(t))\, s(t).$

Equation (3.12) is a time-varying linear differential equation in the unknown $\delta_s x(t; u) = \frac{d}{d\varepsilon} x(t; u + \varepsilon s)\big|_{\varepsilon=0}$. Its solution is given by

(3.13) $\delta_s x(t; u) = \int_{t_0}^{t} \Phi(t, \tau) f_u(x(\tau; u), u(\tau)) s(\tau)\, d\tau,$


where $\Phi(t, \tau)$ is a continuous transition matrix function with the properties

$\frac{\partial}{\partial t}\Phi(t, \tau) = f_x(x(t; u), u(t))\Phi(t, \tau),\quad t_0 \le \tau \le t,\qquad \Phi(\tau, \tau) = I.$

According to (3.4), $\delta_s x(t; u)$ can be rewritten in the inner product form $\delta_s x(t; u) = \langle \nabla x(t; u), s \rangle$, where

(3.14) $\nabla x(t; u)(\tau) = f_u(x(\tau; u), u(\tau))^T \Phi(t, \tau)^T,\quad t_0 \le \tau \le t.$ □

Theorem 3.2. Defining $\nabla_u x(t; u) = \nabla x(t; u)(t)$, the gradient of the objective function $F(x(t; u), u(t))$ with respect to $u(t)$ at time $t$ is

(3.15) $\nabla_u F(x(t; u), u(t)) = f_u(x(t; u), u(t))^T F_x(x(t; u), u(t))^T + F_u(x(t; u), u(t))^T.$

Proof. By the chain rule,

(3.16) $\nabla_u F(x(t; u), u) = (F_x(x(t; u), u)\nabla_u x(t; u))^T + (F_u(x(t; u), u))^T.$

From (3.10),

(3.17) $\nabla x(t; u)(t) = f_u(x(t; u), u(t))^T \Phi(t, t)^T = f_u(x(t; u), u(t))^T.$

By substituting equation (3.17) into equation (3.16), we obtain

(3.18) $\nabla_u F(x(t; u), u) = f_u(x(t; u), u(t))^T F_x(x(t; u), u(t))^T + F_u(x(t; u), u(t))^T.$ □

Thus, the dynamic control (3.7) becomes

(3.19) $\dot{u}(t) = -\alpha\{f_u(x(t; u), u(t))^T F_x(x(t; u), u(t))^T + F_u(x(t; u), u(t))^T\}.$

This control law is called the direct gradient descent control [6, 7].
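To make (3.19) concrete, here is a minimal sketch, not from the paper, of the closed loop under the direct gradient descent control, assuming NumPy/SciPy; `make_dgdc_rhs` and the scalar demo system $\dot{x} = -x + u$ with $F = \tfrac{1}{2}(x^2 + u^2)$ are our own illustrative choices.

```python
import numpy as np
from scipy.integrate import solve_ivp

def make_dgdc_rhs(f, f_u, F_x, F_u, alpha, n):
    """Closed-loop right-hand side under the DGDC law (3.19):
    x' = f(x, u),  u' = -alpha * (f_u(x,u)^T F_x(x,u)^T + F_u(x,u)^T)."""
    def rhs(t, w):
        x, u = w[:n], w[n:]
        udot = -alpha * (f_u(x, u).T @ F_x(x, u) + F_u(x, u))
        return np.concatenate([np.atleast_1d(f(x, u)), np.atleast_1d(udot)])
    return rhs

# Demo: x' = -x + u with F = (x^2 + u^2)/2, so f_u = 1, F_x = x, F_u = u
# and the control law reduces to u' = -alpha*(x + u).
rhs = make_dgdc_rhs(lambda x, u: -x + u,
                    lambda x, u: np.array([[1.0]]),
                    lambda x, u: x,
                    lambda x, u: u,
                    alpha=1.0, n=1)
sol = solve_ivp(rhs, (0.0, 10.0), [1.0, 0.0], max_step=0.01)
print(sol.y[:, -1])                    # (x, u) tends to (0, 0)
```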

Example 3.1. Consider the nonlinear control system

(3.20) $\dot{x}_1 = x_1 x_2,$
(3.21) $\dot{x}_2 = -x_2 - 2x_1^2 + x_2 u.$

It is easy to show that system (3.20)-(3.21) cannot be asymptotically stabilized by linearization. Let the performance index be $F(x, u) = \tfrac{1}{2}(x_1^2 + x_2^2 + u^2)$. Applying the direct gradient descent control to the nonlinear system (3.20)-(3.21), we have the extended system

(3.22) $\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{u} \end{bmatrix} = \begin{bmatrix} x_1 x_2 \\ -x_2 - 2x_1^2 + x_2 u \\ -\alpha x_2^2 - \alpha u \end{bmatrix}.$


Then, we find the admissible values of the parameter $\alpha$ by center manifold theory. Let $x_1 = z$ and $y = [y_1\ y_2]^T = [x_2\ u]^T$. The extended system (3.22) can be written as

(3.23) $\dot{z} = zy_1 = Az + f(z, y),$
(3.24) $\dot{y} = \begin{bmatrix} -1 & 0 \\ 0 & -\alpha \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} + \begin{bmatrix} -2z^2 + y_1 y_2 \\ -\alpha y_1^2 \end{bmatrix} = By + g(z, y),$

where $A = 0$ and $f(z, y) = zy_1$. Let $y = h(z)$ be a center manifold for this system and approximate it by

(3.25)-(3.26) $y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} x_2 \\ u \end{bmatrix} = \begin{bmatrix} h_1(z) \\ h_2(z) \end{bmatrix} = \begin{bmatrix} a_1 z^2 + a_2 z^3 + \dots \\ b_1 z^2 + b_2 z^3 + \dots \end{bmatrix}.$

To find the coefficients $a_i$, $b_i$ we solve the equations

(3.27) $\dot{y} = Dh(z)\dot{z},$
(3.28) $By + g(z, y) = Dh(z)[Az + f(z, y)],$

i.e.,

(3.29)-(3.30) $\begin{bmatrix} -1 & 0 \\ 0 & -\alpha \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} + \begin{bmatrix} -2z^2 + y_1 y_2 \\ -\alpha y_1^2 \end{bmatrix} = \begin{bmatrix} 2a_1 z + 3a_2 z^2 + \dots \\ 2b_1 z + 3b_2 z^2 + \dots \end{bmatrix} zy_1,$

and we obtain $a_1 = -2$, $b_1 = -4$ for $\alpha > 0$, so that

(3.31) $y = \begin{bmatrix} -2z^2 + O(z^4) \\ -4z^2 + O(z^4) \end{bmatrix}.$

Thus, the vector field on the center manifold is

(3.32) $\dot{z} = -2z^3 + O(z^5).$

According to center manifold theory, the local behavior of system (3.20)-(3.21) under the control (3.22) is determined by the behavior of the reduced system (3.32). Since the solution of (3.32) is asymptotically stable, the solution of system (3.20)-(3.21) is asymptotically stable. Simulation results are shown in Fig. 3.
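A simulation sketch of the extended system (3.22), assuming SciPy; $\alpha = 1$ and the initial condition are our own illustrative choices, and the guarantee is local, per the center manifold analysis above.

```python
import numpy as np
from scipy.integrate import solve_ivp

alpha = 1.0                            # any alpha > 0, per the analysis above

def extended(t, w):
    """Extended system (3.22): plant (3.20)-(3.21) plus the DGDC integrator."""
    x1, x2, u = w
    return [x1 * x2,
            -x2 - 2.0 * x1**2 + x2 * u,
            -alpha * x2**2 - alpha * u]

# Start near the origin (inside the domain of attraction of the local result).
sol = solve_ivp(extended, (0.0, 10.0), [0.5, 0.5, 0.0], max_step=0.01)
print(sol.y[:, -1])                    # (x1, x2, u) approaches (0, 0, 0)
```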

3.1. Direct gradient descent control as a dynamic feedback control. Consider the linear system

(3.33) $\dot{x} = Ax + Bu,\quad x \in \mathbb{R}^n,\ u \in \mathbb{R}^r,$

where $A$ and $B$ are real matrices of dimension $(n \times n)$ and $(n \times r)$. In the following, we apply the DGDC to stabilize system (3.33) asymptotically.

Figure 3. Gradient Descent Control. ($x_1(t)$, $x_2(t)$ and $u(t)$ versus time $t$ for Example 3.1.)

Case I: Unforced system is stable. Consider the linear system (3.33) and assume that $\dot{x} = Ax$ is stable. Define the performance index

(3.34) $\hat{F}(x, u) = \tfrac{1}{2}\left(x^T Q x + u^T R u\right),$

where $R > 0$, $Q > 0$ and

(3.35) $x^T Q A x \le 0.$

(Since $\dot{x} = Ax$ is stable, such a $Q$ exists; any $Q > 0$ with $QA + A^T Q \le 0$, for instance $Q = I$ when $A + A^T \le 0$, will do.) The direct gradient descent control for system (3.33) with performance index (3.34) is

(3.36) $\dot{u} = -\alpha\left(B^T Q x + R u\right).$

Take $\alpha = R^{-1}$. Then we have the extended system

(3.37) $\dot{x} = Ax + Bu,$
(3.38) $\dot{u} = -R^{-1} B^T Q x - u.$

In this section, we investigate when the extended system (3.37)-(3.38) becomes asymptotically stable. First, consider the objective function (3.34) as a candidate Lyapunov function for the extended system (3.37)-(3.38). The time derivative of the performance index (3.34) along the trajectory of the extended system (3.37)-(3.38) is

(3.39) $\dot{\hat{F}}(x, u) = x^T Q A x - u^T R u \le 0.$

From (3.35) we have

(3.40) $\dot{\hat{F}}(x, u) = x^T Q A x - u^T R u < 0,\quad \forall\, u \ne 0.$

But $\dot{\hat{F}}(x, u) \le 0$ if $u = 0$. To obtain asymptotic stability of the extended system (3.37)-(3.38), the value of $\dot{\hat{F}}(x, u)$ must be less than zero for all nonzero $x$ and $u$. For this we need the condition: if $u$ becomes zero then $x$ becomes zero. Below we investigate when this condition is satisfied. From (3.34), (3.39) and the LaSalle-Yoshizawa theorem [4] we have

$\lim_{t\to\infty} \dot{\hat{F}}(x(t), u(t)) = 0.$


Hence,

$\lim_{t\to\infty} u(t) = 0.$

According to Barbalat's lemma [9],

$\lim_{t\to\infty} \dot{u}(t) = 0.$

Let us assume that for $t > t_1$, $u(t) \approx 0$ and $\dot{u}(t) \approx 0$. Then, for $t > t_1$, equations (3.37)-(3.38) can be written as

(3.41) $\dot{x} = Ax,\qquad 0 = -R^{-1} B^T Q x.$

If the system (3.41) is asymptotically stable, then the system (3.37)-(3.38) becomes asymptotically stable.

Example 3.2. Let the linear control system be given as follows:

(3.42) $\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u,$

where

(3.43) $\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$

is stable. With the performance index

(3.44) $F(x, u) = \tfrac{1}{2} x^T x + \tfrac{1}{2} r u^2,$

we have the direct gradient descent control

(3.45) $\dot{u} = -\tfrac{1}{r} B^T Q x - u = -\tfrac{1}{r} x_1 - u,$

and the extended system

(3.46) $\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{u} \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & 1 \\ b_1 & b_2 & 0 \\ -\tfrac{1}{r} & 0 & -1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ u \end{bmatrix},$

with

(3.47) $\dot{F}(x, u) = x^T A x - r u^2 \le 0.$
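As a quick numerical check of this setup, our own illustration assuming NumPy: form the extended matrix of (3.46) and test whether it is Hurwitz. The values of $a_i$, $b_i$ and $r$ are illustrative choices for which (3.43) is stable and $x^T A x \le 0$.

```python
import numpy as np

a1, a2, b1, b2, r = -1.0, 2.0, 0.0, -3.0, 1.0   # (3.43) stable, x^T A x <= 0
A_ext = np.array([[a1,       a2,   1.0],
                  [b1,       b2,   0.0],
                  [-1.0 / r, 0.0, -1.0]])        # extended matrix of (3.46)
eigs = np.linalg.eigvals(A_ext)
print(eigs, bool(np.all(eigs.real < 0)))         # Hurwitz: asymptotically stable
```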

Below, we give the steps to analyze the behavior of the system in equation (3.46).
• $F(x(t), u(t))$ is a decreasing function of time $t$. Then $\lim_{t\to\infty} \dot{F}(x(t), u(t)) = 0$.
• Hence $u(t) \to 0$ as $t \to \infty$, and $x(t)$ is bounded.
• Since $x(t)$ is bounded, $\ddot{u}(t) = \tfrac{1}{r}(1 - a_1)x_1 - \tfrac{a_2}{r}x_2 + (1 - \tfrac{1}{r})u$ is bounded.
• By Barbalat's lemma, $\dot{u}(t) \to 0$ as $t \to \infty$.

• Assume that for $t \ge t_1$, $\dot{u}(t) \approx 0$.