IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 40, NO. 6, JUNE 1995
A Discrete-Time Learning Control Algorithm for a Class of Linear Time-Invariant Systems
Samer S. Saab

Abstract-A discretized version of the D-type learning control algorithm is presented for a MIMO linear discrete-time system. A necessary and sufficient condition for uniform convergence of the proposed learning algorithm is presented. We then prove that the same condition is sufficient for the global robustness of the proposed learning algorithm when state disturbances, measurement noise at the output, and reinitialization errors are present at each iteration. A numerical example is given to illustrate the results.

I. INTRODUCTION

Disturbances in real systems are unavoidable, in particular measurement noise, state disturbances, and initialization errors. Since learning control algorithms are iterative schemes, any accumulation of undesired signals can lead to divergence; hence uniform boundedness of all trajectories during learning should be guaranteed. Heinzinger et al. [1] studied the robustness problem for nonlinear systems for a class of PID-type learning controls. Arimoto, in his recent work [2], proved robustness of P-type and D-type learning controls based on the passivity analysis of robot dynamics. The author [3] recently proved the convergence and global robustness of the P-type learning control for a class of nonlinear time-varying systems. For discrete-time learning algorithms, Togai [4] discussed the optimality of the learning control scheme for linear time-invariant systems. Geng et al. [5], [6] proposed another discrete-time learning algorithm, in which they assume that the output vector is equal to the state vector, i.e., measurements of all the states are required at all times. Recently, Kurek et al. [7] proved that the algorithm proposed in [4] will generate a sequence of inputs such that the error converges to zero if and only if the product of the input/output coupling matrices (CB) is full row rank.¹ None of the papers involving discrete-time algorithms considers the robustness problem. In this paper, we propose a discrete-time learning control algorithm and consider a multi-input multi-output (MIMO) linear time-invariant system. We give a necessary and sufficient condition for uniform convergence, and we show that the same condition is sufficient for global robustness of all trajectories.

Manuscript received April 9, 1993. The author is with Union Switch and Signal, Pittsburgh, PA 15237 USA. IEEE Log Number 9410755.
¹This condition requires that the number of inputs be greater than or equal to the number of outputs.

II. NECESSARY AND SUFFICIENT CONDITION FOR CONVERGENCE

In this section we prove the convergence of the proposed learning algorithm. The description of the system, assumptions, update law, and the proof technique are similar to those in [5], [6], and [8]. Since the system has two “discrete” dimensions (time and iteration), the problem is to establish a two-dimensional model which has the same form as the Roesser model [8]. We consider the following discrete-time system

$$x(t+1,k) = A\,x(t,k) + B\,u(t,k), \qquad y(t,k) = C\,x(t,k) \tag{1}$$
where the state vector $x \in \mathbb{R}^n$, the input vector $u \in \mathbb{R}^m$, and the output $y \in \mathbb{R}^p$. The proposed update law is as follows

$$u(t,k+1) = u(t,k) + K[e(t+1,k) - e(t,k)] \tag{2}$$
where $k$ is the iteration index, $t$ denotes the discrete time, $u$ is the input vector, $K$ is the learning gain matrix, and $e(t,k) = y_d(t) - y(t,k)$ is the output error. Consider the following restrictions on (1):
A1) Each operation ends after a finite number of steps $N$.
A2) A desired output $y_d(t)$ is given a priori over the same time duration, for $t = 0, 1, \ldots, N$.
A3) Reinitialization is satisfied throughout repeated trainings, i.e., $x(0,k) = x_d(0)\ \forall k$.

Without loss of generality, it is assumed that the initial input is zero, i.e., $u(t,0) = 0\ \forall t$.

Remark 1: A vector $z(t,k) \to 0$ for any given $t \in [0,N]$ as $k \to \infty$ if and only if $z(t,k) \to 0$ uniformly in $[0,N]$ as $k \to \infty$.

Proof: Necessity: Given any $\varepsilon > 0$, for each $i = 0, 1, \ldots, N$ $\exists$ an integer $M_i$ such that $\forall k \ge M_i$ implies

$$\|z(i,k)\| < \varepsilon.$$

Taking $M_m = \max_i M_i$, we then have that $\forall k \ge M_m$ implies

$$\|z(t,k)\| < \varepsilon \quad \forall t \in [0,N].$$

Thus we have uniform convergence. Sufficiency: Straightforward. Q.E.D.

Theorem 1: Suppose the system (1) satisfies assumptions A1)-A3). Given a realizable trajectory $y_d(t)$, $\exists$ a $K$ such that the learning operator given by (2) will generate a sequence of inputs $u(t,k)$, $t = 0, 1, \ldots, N$, such that the input error $[u_d(t) - u(t,k)]$ and the state error $[x_d(t) - x(t,k)]$ converge to zero uniformly in $[0,N]$ as $k \to \infty$ if and only if the product $CB$ is full column rank.²

Proof: Using (1), (2) gives

$$u_d(t) - u(t,k+1) = u_d(t) - u(t,k) + K[Cx(t+1,k) - Cx_d(t+1)] - K[Cx(t,k) - Cx_d(t)]. \tag{3}$$
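Using (1), (3) gives

$$u_d(t) - u(t,k+1) = u_d(t) - u(t,k) + K[CAx(t,k) + CBu(t,k) - CAx_d(t) - CBu_d(t)] - K[Cx(t,k) - Cx_d(t)]. \tag{4}$$

²This condition requires that the number of outputs be greater than or equal to the number of inputs.
0018-9286/95$04.00 © 1995 IEEE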
Define $\Delta u(t,k) \triangleq u_d(t) - u(t,k)$. Arranging terms, we have

$$\Delta u(t,k+1) = (I - KCB)\,\Delta u(t,k) + (KC - KCA)[x_d(t) - x(t,k)]. \tag{5}$$

Set

$$\eta(t,k) \triangleq x_d(t) - x(t,k). \tag{6}$$

Thus, (5) can be rewritten as

$$\Delta u(t,k+1) = (I - KCB)\,\Delta u(t,k) + (KC - KCA)\,\eta(t,k). \tag{7}$$

Employing (1), (6) implies

$$\eta(t+1,k) = A x_d(t) + B u_d(t) - A x(t,k) - B u(t,k) \tag{8}$$

or

$$\eta(t+1,k) = A\,\eta(t,k) + B\,\Delta u(t,k). \tag{9}$$

Now, we can write the error equations (7) and (9) in the two-dimensional Roesser model [8]

$$\begin{bmatrix} \eta(t+1,k) \\ \Delta u(t,k+1) \end{bmatrix} = \begin{bmatrix} A & B \\ KC - KCA & I - KCB \end{bmatrix} \begin{bmatrix} \eta(t,k) \\ \Delta u(t,k) \end{bmatrix}. \tag{10}$$

Using A3) and the choice of $u(t,0) = 0\ \forall t$, we find the boundary conditions for (10)

$$\eta(0,k) = 0\ \ \forall k, \qquad \Delta u(t,0) = u_d(t)\ \ \forall t.$$

The following is the implication of the Roesser model [8] which is employed in [5] and [7]. Applying the boundary conditions of (10), we have

where $\Phi^{i,j}$ is the state transition matrix. It can be noted from (11) that the remainder of the proof is to show that, for any fixed $i$, $\Phi^{i,j} \to 0$ as $j \to \infty$ iff $CB$ is full column rank, since by applying Remark 1 we have that $\Delta u(t,k)$ and $\eta(t,k) \to 0$ as $k \to \infty$ pointwise in $t$ (i.e., for every given $t \in [0,N]$) iff $\Delta u(t,k)$ and $\eta(t,k)$ converge uniformly to zero in $[0,N]$ as $k \to \infty$. The following are some useful properties of $\Phi^{i,j}$ (given in [8])

Employing the results in [8] or Lemma 1 [7]: The state transition matrix $\Phi^{i,j} \to 0$ as $j \to \infty$ for any given $i$ if and only if $\Phi^{0,1}$ is asymptotically stable, i.e., if all its eigenvalues are inside the unit circle. The proof of the above statement is as follows.

Necessity: From (12)

Since $\Phi^{i,j}$ is asymptotically stable $\forall i$, $\therefore \Phi^{0,1}$ is asymptotically stable.

Sufficiency: Assuming that $\Phi^{0,1}$ is asymptotically stable, we have

bounded, and hence the following equality is satisfied [7]

Since the index $i$ is fixed and $\Phi^{0,j} \to 0$ as $j \to \infty$, $\Phi^{0,j}$ is bounded; hence the right-hand side of (16) is bounded.

Therefore, (13) is asymptotically stable iff $I - KCB$ is asymptotically stable. $\exists$ a gain matrix $K$ that moves all the eigenvalues of $I - KCB$ inside the unit circle iff $CB$ has full column rank. Q.E.D.

Note that as $k \to \infty$, $\Delta u(t,k)$ converges to zero uniformly if and only if $\eta(t,k)$ converges to zero uniformly.

Proof: Sufficiency: Writing a summation expression for $[x_d(t) - x(t,k)]$, A3) and (1) imply

$$\eta(t,k) = \sum_{i=0}^{t-1} A^{t-1-i} B\,\Delta u(i,k).$$

Necessity: Equation (1) implies

$$B[u_d(t) - u(t,k)] = [x_d(t+1) - x(t+1,k)] - A[x_d(t) - x(t,k)].$$

Since $CB$ is assumed to be full column rank, $B$ is implicitly assumed to be full column rank. Hence the above equation can be rewritten as follows

$$\Delta u(t,k) = (B^T B)^{-1} B^T \{\eta(t+1,k) - A\,\eta(t,k)\}.$$

Q.E.D.
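The update law (2) and the convergence condition of Theorem 1 can be illustrated numerically. The following is a minimal sketch, not the paper's numerical example: the two-state, single-input, single-output plant `A`, `B`, `C`, the gain `K`, the horizon `N`, and the desired input `u_d` are all hypothetical values chosen only so that `CB` is nonzero and `|1 - K*CB| < 1`. The desired output is generated from `u_d`, so it is realizable by construction.

```python
import math

# Hypothetical plant (assumed for illustration, not taken from the paper).
A = [[0.5, 0.1],
     [0.0, 0.8]]
B = [1.0, 0.5]
C = [1.0, 0.0]
CB = sum(c * b for c, b in zip(C, B))  # scalar product C*B
K = 0.8   # assumed gain, chosen so that |1 - K*CB| < 1
N = 10    # number of time steps per operation


def run_trial(u):
    """Simulate system (1) from x(0) = 0 and return y(0), ..., y(N)."""
    x = [0.0, 0.0]
    y = []
    for t in range(N + 1):
        y.append(C[0] * x[0] + C[1] * x[1])
        if t < N:
            x = [A[0][0] * x[0] + A[0][1] * x[1] + B[0] * u[t],
                 A[1][0] * x[0] + A[1][1] * x[1] + B[1] * u[t]]
    return y


def learn(u_d, iterations):
    """Iterate update law (2); return the final input and the error history."""
    y_d = run_trial(u_d)   # realizable desired output
    u = [0.0] * N          # initial input u(t, 0) = 0
    history = []           # max_t |u_d(t) - u(t, k)| for each iteration k
    for _ in range(iterations):
        history.append(max(abs(a - b) for a, b in zip(u_d, u)))
        y = run_trial(u)
        e = [yd - yk for yd, yk in zip(y_d, y)]  # e(t,k) = y_d(t) - y(t,k)
        u = [u[t] + K * (e[t + 1] - e[t]) for t in range(N)]
    history.append(max(abs(a - b) for a, b in zip(u_d, u)))
    return u, history


u_d = [math.sin(0.3 * t) for t in range(N)]
u_final, history = learn(u_d, iterations=50)
print(f"|1 - K*CB| = {abs(1 - K * CB):.2f}")
print(f"max input error: start {history[0]:.3e}, end {history[-1]:.3e}")
```

In this sketch the input error need not shrink monotonically at every time step, because the term multiplying the state error couples earlier time steps into later ones; what the condition on `I - KCB` governs is the eventual decay over iterations.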
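Using other important properties and comparing with (10), we get

$$\Phi^{1,0} = \begin{bmatrix} A & B \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad \Phi^{0,1} = \begin{bmatrix} 0 & 0 \\ KC - KCA & I - KCB \end{bmatrix}. \tag{13}$$

III. ROBUSTNESS AND CONVERGENCE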
In this section we prove the global robustness of the proposed algorithm for the system given in (1), and we show the uniform convergence of the system state whenever all disturbances tend to zero. In our proof we include state disturbances, measurement noise at the output, and reinitialization error at each iteration. Consider the following discrete-time system

$$x(t+1,k) = A\,x(t,k) + B\,u(t,k) + w(t,k), \qquad y(t,k) = C\,x(t,k) + v(t,k) \tag{17}$$
where the disturbance input vector $w(t,k) \in \mathbb{R}^n$, the output noise input vector $v(t,k) \in \mathbb{R}^p$, and the rest of the variables are the same as in (1). Consider the following restrictions on (17):
B1) and B2) are A1) and A2), which are defined in Section II.
B3) Repeatability of initialization is satisfied throughout repeated trainings within a small error, i.e., $\|x(0,k) - x_d(0)\| \le b_{x0}\ \forall k$.
B4) $w(t,k)$ and $v(t,k)$ are bounded $\forall k$ and $t \in [0,N]$, i.e., $\|w(t,k)\| \le b_w$ and $\|v(t,k)\| \le b_v$.

Taking norms, using their properties and using the bounds, (20) yields

$$\|u_d(t) - u(t,k+1)\| \le \|I - KCB\|\,\|u_d(t) - u(t,k)\| + \|KC - KCA\|\,\|x_d(t) - x(t,k)\| + (\|C\|\,b_w + 2b_v)\|K\|. \tag{21}$$

Now writing a summation expression for $[x_d(t) - x(t,k)]$ and $t > 0$, we obtain

$$x_d(t) - x(t,k) = A^t[x_d(0) - x(0,k)] + \sum_{i=0}^{t-1} A^{t-1-i}\{B[u_d(i) - u(i,k)] - w(i,k)\}.$$
Taking norms and bounds, we have

$$\|x_d(t) - x(t,k)\| \le a^t\,\|x_d(0) - x(0,k)\| + \sum_{i=0}^{t-1} a^{t-1-i}\big(\|B\|\,\|u_d(i) - u(i,k)\| + b_w\big).$$

Notation: $\|\cdot\|$ is a consistent norm such as the Euclidean norm. We denote by $a \triangleq \|A\|$, where $A$ is the state matrix.

Definition: We define the $\lambda$ norm for a vector $f(t)$, $t \in [0,N]$, as

$$\|f\|_\lambda \triangleq \sup_{t \in [0,N]} a^{-\lambda t}\,\|f(t)\|$$

where $\lambda > 0$ if $a > 1$, and $\lambda < 0$ if $a < 1$. Note that if $a = 1$, then choose instead of the Euclidean norm any other consistent equivalent norm such that $a \ne 1$ and then redefine the $\lambda$ norm. Without loss of generality, $\|\cdot\|$ represents the Euclidean norm in the proof of Theorem 2.

Remark: If we let $\|f\|_\infty \triangleq \sup_{t \in [0,N]} \|f(t)\|$, then we have the following
$$\|f\|_\lambda \le \|f\|_\infty \le a^{\lambda N}\,\|f\|_\lambda.$$

Therefore, these two norms are equivalent. Thus the boundedness and convergence can be proved employing the $\lambda$ norm.

Theorem 2: Let the system described by (17) satisfy assumptions B1)-B4). Given a realizable trajectory $y_d(\cdot)$, if the matrix $CB$ is full column rank, then $\exists$ a $K$ such that the learning operator given in (2) will generate a sequence of inputs such that the input error $[u_d(\cdot) - u(\cdot,k)]$, the state error $[x_d(\cdot) - x(\cdot,k)]$, and the output error are all bounded in $[0,N]\ \forall k$. Moreover, the state, output, and input errors will converge uniformly to zero in $[0,N]$ as $k \to \infty$ and in the absence of all disturbances, i.e., whenever $b_{x0}$, $b_w$, and $b_v \to 0$.

Proof: Using (17), (2) gives

$$u_d(t) - u(t,k+1) = u_d(t) - u(t,k) + K[Cx(t+1,k) + v(t+1,k) - Cx_d(t+1)] - K[Cx(t,k) - Cx_d(t) + v(t,k)]. \tag{18}$$
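The setting of Theorem 2 can be explored with a small simulation of the learning law (2) applied to the disturbed system (17). The following is a minimal sketch under stated assumptions: the two-state, single-input, single-output plant, the gain `K`, the horizon `N`, and the uniformly distributed disturbances are hypothetical choices and are not taken from the paper. A single parameter `bound` scales the reinitialization error, the state disturbance, and the measurement noise together, and the same random seed is reused so that runs with different bounds differ only by that scale.

```python
import math
import random

# Hypothetical plant and gain (assumed for illustration only).
A = [[0.5, 0.1],
     [0.0, 0.8]]
B = [1.0, 0.5]
C = [1.0, 0.0]
K = 0.8   # |1 - K*CB| = 0.2 < 1 for this plant
N = 10    # number of time steps per operation


def run_trial(u, x0, w, v):
    """One operation of system (17); returns measured outputs y(0..N)."""
    x = list(x0)
    y = []
    for t in range(N + 1):
        y.append(C[0] * x[0] + C[1] * x[1] + v[t])
        if t < N:
            x = [A[0][0] * x[0] + A[0][1] * x[1] + B[0] * u[t] + w[t][0],
                 A[1][0] * x[0] + A[1][1] * x[1] + B[1] * u[t] + w[t][1]]
    return y


def final_input_error(bound, iterations=50, seed=0):
    """Run update law (2) under disturbances scaled by `bound`.

    Returns max_t |u_d(t) - u(t, k)| after the last iteration.
    """
    rng = random.Random(seed)
    u_d = [math.sin(0.3 * t) for t in range(N)]
    no_w = [[0.0, 0.0]] * N
    y_d = run_trial(u_d, [0.0, 0.0], no_w, [0.0] * (N + 1))  # x_d(0) = 0
    u = [0.0] * N
    for _ in range(iterations):
        # Fresh bounded disturbances at every iteration.
        x0 = [bound * rng.uniform(-1, 1) for _ in range(2)]
        w = [[bound * rng.uniform(-1, 1) for _ in range(2)] for _ in range(N)]
        v = [bound * rng.uniform(-1, 1) for _ in range(N + 1)]
        y = run_trial(u, x0, w, v)
        e = [yd - yk for yd, yk in zip(y_d, y)]
        u = [u[t] + K * (e[t + 1] - e[t]) for t in range(N)]
    return max(abs(a - b) for a, b in zip(u_d, u))


for b in (0.0, 1e-4, 1e-2):
    print(f"bound = {b:g}: final max input error = {final_input_error(b):.3e}")
```

Because the plant and the update law are linear, the disturbance-driven part of the input error in this sketch scales with `bound`, which is the qualitative behavior the theorem describes: the error stays bounded and shrinks as the disturbance bounds shrink.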
Using (17), (18) gives

$$u_d(t) - u(t,k+1) = u_d(t) - u(t,k) + K[CAx(t,k) + CBu(t,k) + Cw(t,k) + v(t+1,k) - CAx_d(t) - CBu_d(t)]$$
- I