
TP12 3:50

Proceedings of the American Control Conference
Seattle, Washington, June 1995

Discrete-Time Learning Control Algorithm for a Class of Nonlinear Systems

Samer S. Saab
Union Switch & Signal
Pittsburgh, PA 15237
[email protected]

Abstract

In this paper, we apply a discrete-time learning algorithm [1] to a class of discrete time-varying nonlinear systems. We investigate the robustness of the algorithm to state disturbance, measurement noise, and reinitialization errors. We then prove that the input and the state variables will always be bounded if certain conditions are met. Moreover, we show that the input error and state error will converge uniformly to zero in the absence of all disturbances. A numerical example is added to illustrate the results.

1. Introduction

Algorithms applied to identical tasks which improve system performance with repeated trials are known as learning algorithms. Learning control algorithms, which update the most recent control, consist of an input correction based on stored preceding inputs and their respective output error data. To date, most learning control algorithm dynamics, first offered by Arimoto et al [2], update the system in a linear input action and the control in a pointwise fashion. Learning control is applied to a system as an open-loop control and updated as a closed-loop control. The main advantages of learning algorithms are: 1. they are potentially suited for dynamics with some parameter uncertainties; 2. the tracking convergence is uniform, i.e., there are no transient errors; 3. they are easy to implement. Most of the successful applications of such algorithms are reported in the field of robotic systems.

In the past few years, several researchers have proposed different designs of learning algorithms for different systems, where they have shown convergence, and many researchers have further included robustness issues of their algorithms. For continuous systems, Arimoto et al [3] dealt with time-invariant mechanical systems and demonstrated robustness to initial state error and differentiable state disturbance with an initial trajectory in a small neighborhood of the desired one. Bondi et al [4] proved the uniform boundedness of trajectories, in a local sense, for time-invariant mechanical systems. Heinzinger et al [5] studied the robustness problem for a class of time-varying nonlinear systems, which was introduced by Hauser [12], where the convergence of the algorithm was shown. Arimoto [6] proved robustness of P-type learning control based on the passivity analysis of robot dynamics. The author [7] proved global boundedness and uniform convergence of trajectories using a P-type learning control for a class of time-varying nonlinear systems. Unfortunately, for discrete-time systems, the robustness problem has not been heavily addressed. Togai et al [8] discussed the optimality of the learning control scheme in a discrete-system approach for linear time-invariant systems. Geng et al [9, 10] proposed another discrete-time learning control where algorithm convergence was presented. Kurek and Zaremba [11] showed that the algorithm proposed by Togai et al will drive the error of a linear time-invariant system to zero if and only if the product of the input/output coupling matrices (CB) is full row rank. The author [1, 13] proposed a new discrete-time algorithm applied to linear time-invariant systems. It was shown that the error will converge uniformly to zero if and only if (CB) is full column rank¹. Moreover, it was shown that the same condition is sufficient for global robustness of the proposed algorithm to state disturbance, measurement noise, and reinitialization errors. Neither work directly considers nonlinear discrete systems.

In this paper, we consider the following class of discrete time-varying nonlinear systems, given by the state-space difference equations

x(t+1) = f(x(t), t) + B(x(t), t) u(t)
y(t) = C(t) x(t)

The main restriction on the system is that the functions f(.) and B(.) satisfy the Lipschitz condition, and the sufficient condition for uniform convergence and global boundedness of trajectories, proved in the sequel, is that C(.)B(.) is always full column rank. The robustness and convergence results for this more general class of systems constitute the main contribution of this paper. In Section 2, we state the problem formally and present the main results, in particular a concise proof of global boundedness of trajectories in the presence of state disturbance, measurement noise, and reinitialization errors, and we show that these bounds are continuous functions of the disturbances; i.e., all the trajectory errors converge uniformly to zero in the absence of disturbances. In Section 3, we give a numerical example to illustrate the results, and we conclude in Section 4.

1. If M is full column rank, then there exists M+ such that M+M = I. On the other hand, if M is full row rank, then there exists M+ such that MM+ = I. Regarding the state-space representation, if CB is full column rank, then the number of outputs must be greater than or equal to the number of inputs; if CB is full row rank, then the number of outputs must be less than or equal to the number of inputs.
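The rank conditions in the footnote are easy to check numerically; the sketch below (the matrix M is an arbitrary illustration, not taken from the paper) verifies both identities with NumPy's pseudo-inverse:

```python
import numpy as np

# A 3x2 matrix with full column rank: more outputs (rows) than inputs (cols).
M = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
assert np.linalg.matrix_rank(M) == M.shape[1]   # full column rank

# Left inverse: M+ = pinv(M) satisfies M+ M = I.
M_plus = np.linalg.pinv(M)
assert np.allclose(M_plus @ M, np.eye(2))

# Full row rank (wide) case: M^T has a right inverse, so M^T (M^T)+ = I.
N = M.T
N_plus = np.linalg.pinv(N)
assert np.allclose(N @ N_plus, np.eye(2))
```

The left-inverse case (more outputs than inputs) is exactly the configuration required by the full-column-rank condition on CB.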


2. Problem Statement and Main Results

In this section we prove the global robustness of the learning algorithm proposed in [1] for a class of nonlinear systems. In the robustness proof, we include measurement noise at the output, state disturbance, and reinitialization errors at each iteration. Consider the state-space description of a class of nonlinear discrete time-varying systems of the form

x(t+1, k) = f(x(t, k), t) + B(x(t, k), t) u(t, k) + w(t, k)
y(t, k) = C(t) x(t, k) + v(t, k) + v_b(k)    (EQ 1)

where k is the iterative index of the learning algorithm; t = 0, 1, ..., N is the discrete-time index² (assuming N is an integer); f(.) is a vector-valued function with range in R^n; C(t) is in R^(q x n), 1 <= q <= n; B(.) is a matrix-valued function with range in R^(n x p); w(.) and v(.) are the state disturbance and output measurement noise, respectively, with the appropriate dimensions; and v_b is an undesirable constant bias vector which can vary with iterations.

Lemma 1: Suppose the system given in (1) satisfies assumptions (A1)-(A4). Then the following norm inequality is satisfied:

||δx(t, k)|| <= a^t b_x0 + sum_{i=0}^{t-1} a^(t-1-i) ( b_B ||u_d(i) - u(i, k)|| + b_w )

where a = M_f + b_u M_B, with M_f, M_B, b_u, b_B, b_w, and b_x0 defined in (A1)-(A4) below.
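Lemma 1 can be sanity-checked on a toy instance of (1). The scalar system below is a hypothetical stand-in (f(x, t) = 1.1x, B = 1, so M_f = 1.1, M_B = 0, b_B = 1, and a = M_f + b_u M_B = 1.1); the simulated state error is compared against the lemma's bound at every t:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scalar instance of system (1): f(x,t) = 1.1*x, B(x,t) = 1.
a, b_B, N = 1.1, 1.0, 20

def step(x, u, w=0.0):
    return 1.1 * x + 1.0 * u + w

u_d = rng.uniform(-1, 1, N)             # desired input sequence
u   = u_d + rng.uniform(-0.5, 0.5, N)   # perturbed input of iteration k
w   = rng.uniform(-0.1, 0.1, N)         # state disturbance, |w| <= b_w
b_w  = 0.1
b_x0 = 0.2

x_d, x = 0.0, b_x0                      # reinitialization error of size b_x0
for t in range(1, N + 1):
    x_d = step(x_d, u_d[t - 1])
    x   = step(x,   u[t - 1], w[t - 1])
    # Lemma 1: |dx(t)| <= a^t b_x0 + sum_{i<t} a^(t-1-i) (b_B |du(i)| + b_w)
    bound = a**t * b_x0 + sum(
        a**(t - 1 - i) * (b_B * abs(u_d[i] - u[i]) + b_w) for i in range(t))
    assert abs(x_d - x) <= bound + 1e-12
```

The bound holds at every time step, as the inductive proof of the lemma predicts.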

Next, consider the update law

u(t, k+1) = u(t, k) + K(t, k) ( e(t+1, k) - e(t, k) )    (EQ 2)

where K(.) is the learning control gain matrix, e(., .) is the output error, i.e., e(t, k) = y_d(t) - y(t, k), and y_d(.) is the desired trajectory. The update law given in (2) is similar to the update law proposed in [1].
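The update law (2) acts pointwise in t and uses only signals stored from iteration k. A minimal sketch (the array shapes and the constant-gain simplification are illustrative assumptions; the paper's gain K(t, k) may vary with t and k):

```python
import numpy as np

def ilc_update(u_k: np.ndarray, e_k: np.ndarray, K: np.ndarray) -> np.ndarray:
    """One pass of update law (2): u(t, k+1) = u(t, k) + K (e(t+1, k) - e(t, k)).

    u_k : (N, p)   stored input of iteration k, for t = 0..N-1
    e_k : (N+1, q) stored output error of iteration k, for t = 0..N
    K   : (p, q)   learning gain (held constant here for simplicity)
    """
    de = e_k[1:] - e_k[:-1]          # e(t+1, k) - e(t, k), shape (N, q)
    return u_k + de @ K.T            # updated input, shape (N, p)

# Tiny usage example with scalar input/output (p = q = 1):
u0 = np.zeros((5, 1))
e0 = np.ones((6, 1))                 # constant error profile
u1 = ilc_update(u0, e0, K=np.array([[0.5]]))
```

Note that a constant error profile produces no correction, since e(t+1, k) - e(t, k) = 0; this is consistent with footnote 3 below: a constant measurement bias drops out of the update.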

Notations: ||.|| is any consistent matrix norm, e.g., the Euclidean norm, and

f_{d,t} = f(x_d(t), t),  f_{t,k} = f(x(t, k), t),  B_{d,t} = B(x_d(t), t),  B_{t,k} = B(x(t, k), t),
δu(t, k) = u_d(t) - u(t, k),  δx(t, k) = x_d(t) - x(t, k).

Assumptions:

(A1) The desired trajectory y_d(.) is realizable; i.e., given y_d(t) for t = 0, 1, ..., N with y_d(0) = C(0)x_d(0), there exists a sequence u_d(t) such that y_d(t) = C(t)x_d(t) and x_d(t+1) = f(x_d(t), t) + B(x_d(t), t)u_d(t) for all t in [0, N]. Since all the desired trajectories are implicitly assumed to be bounded, we set b_u = sup_{t in [0,N]} ||u_d(t)||.

(A2) The state disturbance and measurement noise are bounded at all times; i.e., there exist constants b_w and b_v such that, for all k,

sup_{t in [0,N]} ||w(t, k)|| <= b_w,    sup_{t in [0,N]} ||v(t, k)|| <= b_v,

and the reinitialization errors are bounded for all k; i.e., there exists a constant b_x0 such that, for all k,

||x_d(0) - x(0, k)|| <= b_x0.

(A3) The range of the operator B(.) is always bounded; i.e., there exists a constant b_B such that, for all k, ||B(x(t, k), t)|| <= b_B.

(A4) The functions f(.) and B(.) satisfy the Lipschitz condition in x, uniformly in t, with Lipschitz constants M_f and M_B, respectively.

Proof of Lemma 1: The proof proceeds by induction. From (1), taking norms and using the Lipschitz condition and the bounds defined in (A1)-(A4), the norm inequality of Lemma 1 holds for t = 1. Next, we assume that the norm inequality is true for t - 1 and show that it is valid for t; applying the same estimate to (1) once more extends the inequality from t - 1 to t.

2. In this paper we mean by t in [0, N] to be equivalent to t = 0, 1, ..., N.

In this paper, a = M_f + b_u M_B (> 0), where M_f and M_B are the Lipschitz constants and b_u is the upper bound defined in (A1)-(A4). If a <= 1, the estimates below only simplify; without loss of generality, it will be assumed that a > 1. For λ > 0, define the λ-norm

||z(.)||_λ = sup_{t in [0,N]} a^(-λt) ||z(t)||,

where a > 1 and λ > 0. It can easily be verified that the λ-norm is equivalent to the sup-norm [1]; thus boundedness and convergence can be proved employing the λ-norm.

Theorem 1: Let the system described by (1) satisfy assumptions (A1)-(A4). Given a realizable trajectory y_d(.), if the matrix C(t+1)B(x(t, k), t) is full column rank for all (x, t) in R^n x [0, N], then there exists K(t, k) such that the learning operator given in (2) will generate a sequence of inputs for which the input, output, and state errors are all bounded on [0, N] and for all k. Moreover, the input, output, and state errors will all converge uniformly to zero for t = 1, 2, ..., N as k -> ∞ in the absence of all disturbances³, in particular, whenever b_x0, b_w, b_v -> 0.

Proof: From (2), we have

δu(t, k+1) = δu(t, k) - K(t, k) ( e(t+1, k) - e(t, k) ).

Using (1), the expansion of the right-hand side contains the terms -K(t, k)C(t)x(t, k), -K(t, k)v(t, k), and -K(t, k)v_b(k), together with K(t, k)C(t+1)x(t+1, k), K(t, k)v(t+1, k), and K(t, k)v_b(k). Naturally, the two terms due to the constant measurement bias cancel. Collecting terms, we have

δu(t, k+1) = [ I - K(t, k)C(t+1)B_{t,k} ] δu(t, k) + (terms in δx(t, k), w(t, k), v(t, k), and v(t+1, k)).    (EQ 12)

Set b_C = sup_{t in [0,N]} ||C(t)||. Taking norms on both sides, using the Lipschitz condition and the bounds defined in (A1)-(A4), we get

||δu(t, k+1)|| <= ||I - K(t, k)C(t+1)B_{t,k}|| ||δu(t, k)|| + β ||δx(t, k)|| + b_C b_w b_K + 2 b_v b_K.

Since the matrix C(t+1)B_{t,k} is assumed to be full column rank for all (x, t) in R^n x [0, N], there exists K(t, k) such that ||I - K(t, k)C(t+1)B_{t,k}|| <= ρ < 1, and from (A3) there exists a constant b_K such that, for all k, sup_{t in [0,N]} ||K(t, k)|| <= b_K. Setting β = b_K b_C [ M_f + 1 + b_u M_B ], the above inequality reduces to

||δu(t, k+1)|| <= ρ ||δu(t, k)|| + β ||δx(t, k)|| + b_C b_w b_K + 2 b_v b_K.

Using the results given in Lemma 1 and expanding, we get

||δu(t, k+1)|| <= ρ ||δu(t, k)|| + β [ a^t b_x0 + sum_{i=0}^{t-1} a^(t-1-i) ( b_B ||δu(i, k)|| + b_w ) ] + b_C b_w b_K + 2 b_v b_K.

Next, we multiply both sides of the above inequality by a^(-λt) and then take the supremum over [0, N].

3. Except for the measurement constant bias vector v_b(k).
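The gain condition ||I - K(t, k)C(t+1)B_{t,k}|| <= ρ < 1 used in the proof can be realized explicitly when C(t+1)B_{t,k} is full column rank: the pseudo-inverse K = (CB)+ gives ρ = 0, and a detuned gain αK with 0 < α < 1 gives ρ = 1 - α. A sketch with illustrative matrices (not from the paper):

```python
import numpy as np

# Illustrative coupling matrices: 2 inputs, 3 outputs, CB full column rank.
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0],
              [0.0, 0.0]])                      # 4x2 input matrix
C = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])            # 3x4 output matrix
CB = C @ B                                       # 3x2, full column rank
assert np.linalg.matrix_rank(CB) == CB.shape[1]

K = np.linalg.pinv(CB)           # "optimum" gain: K (CB) = I, so rho = 0
rho0 = np.linalg.norm(np.eye(2) - K @ CB, 2)

alpha = 0.5                      # detuned gain: rho = 1 - alpha
rho_half = np.linalg.norm(np.eye(2) - alpha * K @ CB, 2)
```

As the Example section discusses, ρ = 0 is "optimal" for convergence speed but amplifies the effect of random disturbances, which motivates the detuned choice.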

Note the following facts, valid for a > 1 and λ > 1:

1. ||c||_λ = ||c|| for any constant vector c (the supremum of a^(-λt)||c|| over [0, N] is attained at t = 0).

2. sup_{t in [0,N]} a^(-t(λ-1)) sum_{i=0}^{t-1} a^((λ-1)i) = ( 1 - a^(-(λ-1)N) ) / ( a^(λ-1) - 1 ).

3. sup_{t in [0,N]} a^(-t(λ-1)) = 1.

Set

b_1 = β a^(-1) sup_{t in [0,N]} a^(-t(λ-1)) sum_{i=0}^{t-1} a^((λ-1)i).

Using facts 1-3, we get

||δu(., k+1)||_λ <= ( ρ + b_B b_1 ) ||δu(., k)||_λ + β b_x0 + b_C b_w b_K + 2 b_v b_K + b_1 b_w.

Therefore, the last inequality can be written as

||δu(., k+1)||_λ <= ρ̂ ||δu(., k)||_λ + ε,

where ρ̂ = ρ + b_B b_1 and ε = β b_x0 + b_C b_w b_K + 2 b_v b_K + b_1 b_w. Note that as b_x0, b_w, and b_v -> 0, then ε -> 0. Since ρ < 1, there exists λ large enough such that ρ̂ < 1. Then, iterating on k, we have

||δu(., k)||_λ <= ρ̂^k ||δu(., 0)||_λ + ( (1 - ρ̂^k) / (1 - ρ̂) ) ε,

and since ρ̂ < 1,

lim_{k->∞} ||δu(., k)||_λ <= ε / (1 - ρ̂).

Since ε is bounded and ρ̂ ≠ 1, the input error is bounded for all k on [0, N], and as b_x0, b_w, and b_v -> 0, then ε -> 0; hence the input error -> 0 (pointwise) as k -> ∞. But since [0, N] is a finite set (compact), the input error converges uniformly to zero on [0, N] as k -> ∞.

Next, multiply both sides of the inequality given in Lemma 1 by a^(-λt), take the supremum over [0, N], and use facts 2-3:

||δx(., k)||_λ <= b_x0 + a^(-1) ( (1 - a^(-(λ-1)N)) / (a^(λ-1) - 1) ) ( b_B ||δu(., k)||_λ + b_w ).

Taking the limit on both sides, the last inequality yields

lim_{k->∞} ||δx(., k)||_λ <= b_x0 + a^(-1) ( (1 - a^(-(λ-1)N)) / (a^(λ-1) - 1) ) ( b_B ε / (1 - ρ̂) + b_w ).

Again, since ε is bounded and ρ̂ ≠ 1, the state error is bounded for all k on [0, N], and as b_x0, b_w, and b_v -> 0, then ε -> 0; hence the state error -> 0 (pointwise) as k -> ∞. But since [0, N] is a finite set (compact), the state error converges uniformly to zero on [0, N] as k -> ∞. Since y_d(t) - y(t, k) = C(t) δx(t, k), the output error is bounded, and it converges uniformly to zero on [0, N] as k -> ∞ and as b_x0, b_w, and b_v -> 0.

3. Example

In this section we apply the proposed algorithm (2) to a nonlinear system to illustrate the tracking performance, in particular robustness and uniform convergence. Consider the following system:

x_1(t+1, k) = x_1(t, k) + h x_2(t, k) + h randn
x_2(t+1, k) = x_2(t, k) + h sin(x_2(t, k)) + ( sin(x_2(t, k)) / (|x_2(t, k)| + 1) ) u(t, k) + h randn    (EQ 3)
y(t, k) = x_2(t, k) + h randn

where randn is a scalar random generator⁴, supported by MATLAB, with normal distribution, mean = 0, and variance = 1 (white Gaussian noise). The integration frequency is 100 Hz and the constant h = 0.01. The desired trajectories are prescribed for t in [0, 5] (seconds); x_1(t, d) and x_2(t, d) can be obtained by using the system above excluding the disturbances. The initial input, for t in [0, 5], is set to zero, and x(0, k) = x(0, d) + h randn.

Before we apply the learning algorithm, one should check that all the assumptions are met. Conditions (A1)-(A3) can be readily verified. For the Lipschitz condition, recall three properties of Lipschitz functions: 1. the sum of Lipschitz functions (Lf's) is an Lf; 2. scalar multiplication of an Lf is an Lf (a product of Lf's is generally not an Lf); 3. if a function has continuous and bounded partial derivatives (in x), then it is Lipschitz. Therefore, from (3) we may conclude that f_{t,k} is Lipschitz, since sin(.) has a bounded derivative. It remains to show that B_{t,k} is Lipschitz, for which it is sufficient to show that g(x) = sin(x) / (|x| + 1) is an Lf. Note that |x| + 1 >= 1, and g has continuous, bounded partial derivatives in x.

4. randn generates a new random scalar every time it is called, which depends on the discrete-time sample, iteration, and variable.
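The Section 3 experiment can be approximated in a few lines. The sketch below is an assumption-laden reconstruction: Python replaces MATLAB, the run is noise-free, the desired trajectory (not recoverable from this copy) is replaced by a smooth curve starting at pi/2 and kept away from sin(x_2) = 0, and the gain magnitude is fixed at a = 0.5 as in the disturbed case:

```python
import numpy as np

h, T = 0.01, 5.0                     # 100 Hz integration over 5 seconds
N = int(T / h)
t_axis = np.arange(N + 1) * h

# Substituted desired trajectory (illustrative; not the paper's formula),
# chosen so that sin(x2) stays well away from zero along the motion.
x2_d = np.pi / 2 + 0.5 * np.sin(2 * np.pi * t_axis / T)

def simulate(u):
    """Noise-free rollout of system (3); x1 is simulated but not measured."""
    x1 = np.zeros(N + 1)
    x2 = np.full(N + 1, np.pi / 2)
    for k in range(N):
        x1[k + 1] = x1[k] + h * x2[k]
        x2[k + 1] = (x2[k] + h * np.sin(x2[k])
                     + (np.sin(x2[k]) / (abs(x2[k]) + 1.0)) * u[k])
    return x2                        # y = x2 in the disturbance-free case

u = np.zeros(N)                      # initial input set to zero
err_hist = []
for _ in range(20):
    x2 = simulate(u)
    e = x2_d - x2                    # output error e(t, k), t = 0..N
    err_hist.append(np.max(np.abs(e)))
    K = 0.5 * np.sign(np.sin(x2[:N]))    # |K| = a = 0.5, sign of sin(x2)
    u = u + K * (e[1:] - e[:-1])     # update law (2)
```

Over the 20 iterations, max_t |e(t, k)| recorded in err_hist should decrease, mirroring the uniform-convergence behavior reported in Figure 2.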


Thus g is an Lf (more precisely, it is a contraction function). The sufficient condition given in Theorem 1 can also be met by fixing the controller gain 0 < |K| < 1, with the sign of K the same as the sign of sin(x_2(t, k)). From the convergence part of the proof of Theorem 1, one can see that an "optimum" choice of ρ is zero. On the other hand, such a choice of K(.) can increase the fluctuation due to random disturbances. For now, we set the magnitude of K(t) to a, 0 < a <= 1, where a is chosen closer to zero when the fluctuation due to random disturbances is large, and vice versa. In this example, a = 0.5 with 20 iterations in the presence of disturbances, and a = 1 with 10 iterations in the absence of all disturbances. In Figures 1 (bottom) and 2, solid and dashed lines represent the desired and 20th-iterate trajectories, respectively. In Figures 1 and 2 (top), we show max |x_i(t, d) - x_i(t, k)| over t in [0, 5], for i = 1, 2. Figure 1 shows the robustness of the algorithm. Figure 2 shows the uniform convergence of the state variables to their respective desired trajectories.

Figure 1. Algorithm performance in the presence of measurement, state, and reinitialization disturbances (top: absolute errors vs. iterations; bottom: control inputs vs. seconds).

Figure 2. Algorithm performance in the absence of all disturbances (top: absolute errors vs. iterations; bottom: control inputs vs. seconds).

4. Conclusion

We have shown, without system linearization, the global boundedness of all trajectory errors in the presence of state disturbance, measurement biased noise, and reinitialization errors. Moreover, these bounds are continuous functions of the disturbances; hence, all the trajectory errors will converge uniformly to zero in the absence of all disturbances. The main restriction on the system is that the functions f(.) and B(.) satisfy the Lipschitz condition. The sufficient condition for uniform convergence and global boundedness of trajectories is that C(.)B(.) is always full column rank. A numerical example was given to verify the results.

References

1. Saab, S., "A Discrete-Time Learning Control Algorithm," Proceedings of the American Control Conference, Baltimore, MD, June 1994.

2. Arimoto, S., Kawamura, S., and Miyazaki, F., "Bettering Operation of Robots by Learning," Journal of Robotic Systems, Vol. 1, 1984.

3. Arimoto, S., Kawamura, S., and Miyazaki, F., "Convergence, Stability, and Robustness of Learning Control Schemes for Robot Manipulators," Int. Symp. on Robot Manipulators: Modeling, Control, and Education, Albuquerque, NM, 1986.

4. Bondi, P., Casalino, G., and Gambardella, L., "On the Iterative Learning Control Schemes for Robot Manipulators," IEEE Journal of Robotics and Automation, 1988.

5. Heinzinger, G., Fenwick, D., Paden, B., and Miyazaki, F., "Stability of Learning Control with Disturbances and Uncertain Initial Conditions," IEEE Transactions on Automatic Control, Jan. 1992.

6. Arimoto, S., "Learning Control Theory for Robotic Motion," International Journal of Adaptive Control and Signal Processing, Vol. 4, 1990.

7. Saab, S., "On the P-type Learning Control," IEEE Transactions on Automatic Control, Vol. 39, No. 11, November 1994.

8. Togai, M., and Yamano, O., "Analysis and Design of an Optimal Learning Control Scheme for Industrial Robots: A Discrete System Approach," Proceedings of the 24th Conference on Decision and Control, Ft. Lauderdale, FL, Dec. 1985.

9. Geng, Z., Carroll, R., and Xie, J., "Two-dimensional Model and Algorithm Analysis for a Class of Iterative Learning Control Systems," International Journal of Control, Vol. 52, 1990.

10. Geng, Z., Jamshidi, M., and Carroll, R., "An Adaptive Learning Control Approach," Proceedings of the 30th Conference on Decision and Control, Dec. 1991.

11. Kurek, J., and Zaremba, M., "Iterative Learning Control Synthesis Based on 2-D System Theory," IEEE Transactions on Automatic Control, Jan. 1993.

12. Hauser, J., "Learning Control for a Class of Nonlinear Systems," Proceedings of the 26th Conference on Decision and Control, Los Angeles, CA, Dec. 1987.

13. Saab, S., "A Discrete-Time Learning Control Algorithm for a Class of LTI Systems," IEEE Transactions on Automatic Control, 1995 (to appear).
