No-Reset Iterative Learning Control

Luis G. Sison*        Edwin K.P. Chong†
School of Electrical Engineering, Purdue University, West Lafayette, IN 47907-1285
{sison,[email protected]}

Submitted to the 35th IEEE Conference on Decision and Control as a regular paper

Abstract

A No-Reset Iterative Learning Control (NRILC) system is an Iterative Learning Control (ILC) system where the plant is not reset at the beginning of each iteration. We compare NRILC with ILC and repetitive control systems in terms of structure. We apply this new scheme to discrete-time, LTI, SISO plants, but the approach can be extended to linear time-varying and MIMO plants. We show that an NRILC equilibrium point exists if the desired trajectory lies in a certain well-defined subspace. The convergence of the system depends on the spectral radius of an equivalent system matrix. Using results from output feedback theory, we show that the closed-loop eigenvalues of the system can be placed almost always with the selection of an appropriate finite learning gain.

1 Introduction

Iterative learning control (ILC) is a control approach that uses past performance to improve future performance of plants that repeatedly follow a specific trajectory or trajectories [10].

*Supported by a scholarship from the Department of Science and Technology - Engineering and Science Education Program, Philippines.
†Supported in part by the National Science Foundation through grants ECS-9410313 and ECS-9501652.


[Figure 1: General ILC Architecture. At the k-th iteration, the input trajectory u_k drives the plant to produce the output trajectory y_k; the learning controller combines u_k, y_k, and the desired trajectory y_d to produce u_{k+1}, which drives the plant at the (k+1)-th iteration.]
Typical applications of ILC include robotic manipulators [12] and CNC machine tools [8]. It is standard in ILC to assume that the plant is reset to the same state after each execution of the trajectory. Figure 1 shows a typical ILC scheme. In this figure, the quantities do not refer to individual time samples, but to a trajectory over a finite horizon. Each cycle or iteration consists of driving the plant with a sequence of control inputs, determining the error of the output trajectory with respect to a desired trajectory, and updating the control trajectory based on this error. The control update equation is typically of the form $u_{k+1} = u_k + T_e e_k$, where $T_e$ is the operator-theoretic notation for the learning gain. An ILC system can also be interpreted as a 2-dimensional system which progresses not only with respect to time within a trajectory but also with respect to iterations [1]. The design problem is to select a learning gain that will ensure convergence to the desired trajectory at an acceptable rate while requiring minimal information about the plant and minimizing the sensitivity of the system to plant variations. Sufficient conditions for the stability of various ILC architectures have been obtained, e.g., [2], [10], and [7].

Repetitive control (RC) involves the design of feedback systems where the controlled variables follow periodic reference commands [6]. A controller is designed so that the plant tracks the desired trajectory asymptotically. Common to all RC schemes is an $N$-th order delay/sum block, which is required by the Internal Model Principle [6]. An RC architecture is shown in Figure 2.

[Figure 2: Repetitive Control Architecture. The desired trajectory y_d is compared with the plant output y; the error passes through a gain/filter block K_l Q(z) and an N-sample delay/sum loop z^{-N} before driving the plant H(z).]

[Figure 3: No-Reset ILC Architecture. The learning controller combines y_d, y_k, and the delayed control trajectory u_k = z^{-N} u_{k+1} to produce u_{k+1}, which drives the plant.]

Sadegh [11] derived necessary and sufficient conditions for discrete-time MIMO LTI systems. He assumed that the plant is proper (but not strictly proper) and exponentially stable. Design is based in the frequency domain, using Nyquist diagrams to determine the scalar gain, and frequency response analyses of the plant for designing the compensator.

A No-Reset ILC (NRILC) scheme, as shown in Figure 3, is a new architecture for implementing repetitive control. As its name implies, the architecture is the same as ILC except that the plant is not reset at the start of each iteration. The controller in an RC system typically includes a filter operating continuously on the output error signal in addition to the delay/sum block. In our NRILC scheme, the counterpart of this filter, the learning gain, operates exclusively on the output error from one period, without any overlap from the previous period. We show that this NRILC scheme leads to results different from those in conventional RC, the most interesting of which is the ability to arbitrarily assign the eigenvalues of the closed-loop system even if the plant is unstable.

In the next section, we introduce the NRILC equations for a linear discrete-time plant. In Section 3, we define the conditions for which this scheme can track a trajectory with asymptotically zero error. Section 4 discusses the eigenvalue assignment problem using results from output feedback theory. Section 5 contains numerical examples that demonstrate the feasibility of this approach, followed by a discussion of the limitations of the design procedure.

2 Problem Formulation

Consider an ILC system with the plant initial conditions carried over from the previous iteration. The $n$-th order linear discrete-time SISO plant equations over a period of $N$ samples can be written in operator notation as
\[
\begin{aligned}
 y_k &= T_s u_k + T_o x_k, &(1)\\
 x_{k+1} &= T_u u_k + T_x x_k, \qquad k \ge 0, &(2)
\end{aligned}
\]

where $y_k \in \mathbb{R}^N$ is the output trajectory at the $k$-th iteration, $u_k \in \mathbb{R}^N$ is the control input trajectory at the $k$-th iteration, and $x_k \in \mathbb{R}^n$ is the plant state at the start of the $k$-th iteration. These quantities should not be confused with the samples at an instant in time. The output trajectory $y_k$ is the aggregate of $N$ samples of the output, $y_k = [y_k(0), y_k(1), \ldots, y_k(N-1)]^T$, where $y_k(i)$ is the $i$-th output sample in iteration $k$. The learning law is given by
\[
\begin{aligned}
 e_k &= y_d - y_k, &(3)\\
 u_{k+1} &= T_p u_k + T_e e_k. &(4)
\end{aligned}
\]

For a discrete-time, linear, SISO plant $(A, b, c, d)$, with $A \in \mathbb{R}^{n\times n}$, $b \in \mathbb{R}^{n\times 1}$, $c \in \mathbb{R}^{1\times n}$, and $d \in \mathbb{R}$, the coefficients in equations (1) to (4) can be derived easily from the state and output equations
\[
 x_{k+1} = A^N x_k + \sum_{t=0}^{N-1} A^{N-t-1} b\, u_k(t), \quad\text{and}
\]
\[
 y_k(t) = c\left[ A^t x_k + \sum_{\tau=0}^{t-1} A^{t-\tau-1} b\, u_k(\tau) \right] + d\, u_k(t), \qquad 0 \le t \le N-1,
\]
respectively. The coefficients are given by

\[
\begin{aligned}
 T_x &= A^N, \\
 T_u &= [A^{N-1}b, \ldots, Ab, b], \\
 T_o &= [c, cA, \ldots, cA^{N-1}]^T, \\
 T_s &= \mathcal{L}\{[d, cb, cAb, \ldots, cA^{N-2}b]^T\},
\end{aligned}
\]
where $\mathcal{L}\{v\}$ denotes the lower-triangular Toeplitz matrix whose first column is the vector $v$. We will also restrict $T_p$ and $T_e$ to appropriately sized real matrices.
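For concreteness, the lifted matrices above can be computed directly from $(A, b, c, d)$. The following is a minimal NumPy sketch; it is not part of the original paper, and the helper name `lift_lti` is introduced here only for illustration:

```python
import numpy as np

def lift_lti(A, b, c, d, N):
    """Build T_x, T_u, T_o, T_s of equations (1)-(2) for an LTI SISO plant.
    Shapes: A (n,n), b (n,1), c (1,n), d scalar."""
    n = A.shape[0]
    Tx = np.linalg.matrix_power(A, N)                                   # T_x = A^N
    # T_u = [A^{N-1} b, ..., A b, b]
    Tu = np.hstack([np.linalg.matrix_power(A, N - 1 - t) @ b for t in range(N)])
    # T_o has rows c, cA, ..., c A^{N-1}
    To = np.vstack([c @ np.linalg.matrix_power(A, t) for t in range(N)])
    # first column of the lower-triangular Toeplitz T_s: [d, cb, cAb, ..., c A^{N-2} b]
    col = np.zeros(N)
    col[0] = d
    for i in range(1, N):
        col[i] = (c @ np.linalg.matrix_power(A, i - 1) @ b).item()
    Ts = np.zeros((N, N))
    for i in range(N):
        for j in range(i + 1):
            Ts[i, j] = col[i - j]
    return Tx, Tu, To, Ts
```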

We can rewrite equations (1) to (4) as
\[
 \begin{bmatrix} x_{k+1} \\ u_{k+1} \end{bmatrix}
 = \begin{bmatrix} T_x & T_u \\ -T_e T_o & T_p - T_e T_s \end{bmatrix}
   \begin{bmatrix} x_k \\ u_k \end{bmatrix}
 + \begin{bmatrix} 0_{n\times N} \\ T_e \end{bmatrix} y_d
 = \mathcal{A} \begin{bmatrix} x_k \\ u_k \end{bmatrix} + \mathcal{B}\, y_d, \qquad (5)
\]
where $z_k \triangleq [x_k^T, u_k^T]^T \in \mathbb{R}^{n+N}$ is the combined state of the plant and the controller. A necessary and sufficient condition for the asymptotic stability of (5) is

\[
 \rho\{\mathcal{A}\} < 1, \qquad (6)
\]

where $\rho\{\mathcal{A}\}$ denotes the spectral radius of $\mathcal{A}$. Notice that the lower right block of $\mathcal{A}$, $G = T_p - T_e T_s$, is the mapping from $e_k$ to $e_{k+1}$ for an ILC system with initial condition reset [10], i.e., with $T_u = 0$, $T_x = 0$, and, therefore, constant $x_k$. The eigenvalues of $G$ determine the convergence properties of this ILC system, whereas the eigenvalues of $\mathcal{A}$ determine the convergence of equations (1) to (4). The relationship between the eigenvalues of $G$ and $\mathcal{A}$ is not straightforward. We are interested in the relationship of the stability of the plant, as determined from $A$, to the stability of the NRILC system, as determined from $\mathcal{A}$. Specifically, we are interested in whether, for a given plant $(A, b, c, d)$, gains $T_p$ and $T_e$ can be found such that $e_k$ converges to some finite value, ideally to zero.
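As an illustration of condition (6), the combined system matrix can be assembled and its spectral radius checked numerically. The sketch below is not from the paper; it reuses the matrices returned by the `lift_lti` helper above, and the function names are ours:

```python
import numpy as np

def nrilc_closed_loop(Tx, Tu, To, Ts, Te, Tp=None):
    """Assemble cal_A and cal_B of equation (5)."""
    n, N = Tu.shape
    if Tp is None:
        Tp = np.eye(N)              # the case T_p = I_N used later in the paper
    calA = np.block([[Tx, Tu],
                     [-Te @ To, Tp - Te @ Ts]])
    calB = np.vstack([np.zeros((n, N)), Te])
    return calA, calB

def spectral_radius(M):
    return np.max(np.abs(np.linalg.eigvals(M)))

# condition (6): the NRILC iteration (5) is asymptotically stable
# exactly when spectral_radius(calA) < 1
```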

If the SISO plant $(A(t), b(t), c(t), d(t))$ is linear but periodically time-varying with a period $N$, equations (1) to (4) still apply, but the coefficients are given by
\[
\begin{aligned}
 T_x &= \Phi(N,0), \\
 T_u &= [\Phi(N,1)b(0), \ldots, \Phi(N,N-1)b(N-2), b(N-1)], \\
 T_o &= [c(0), c(1)\Phi(1,0), \ldots, c(N-1)\Phi(N-1,0)]^T, \\
 T_s &= \begin{bmatrix}
   d(0) & 0 & \cdots & 0 \\
   c(1)b(0) & d(1) & \ddots & \vdots \\
   \vdots & & \ddots & 0 \\
   c(N-1)\Phi(N-1,1)b(0) & \cdots & c(N-1)b(N-2) & d(N-1)
 \end{bmatrix},
\end{aligned}
\]
where $\Phi(s,t) \triangleq \prod_{\tau=t}^{s-1} A(\tau)$ is the state transition matrix of the plant.
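A direct transcription of these time-varying formulas is shown below; again this is an illustrative sketch rather than the authors' code, and the helper name is ours:

```python
import numpy as np

def lift_ltv(A_seq, b_seq, c_seq, d_seq, N):
    """Coefficients for a periodically time-varying SISO plant with period N.
    A_seq[t], b_seq[t], c_seq[t], d_seq[t] describe the plant at time t, 0 <= t < N."""
    n = A_seq[0].shape[0]

    def Phi(s, t):
        # state transition matrix: Phi(s, t) = A(s-1) A(s-2) ... A(t), Phi(t, t) = I
        P = np.eye(n)
        for tau in range(t, s):
            P = A_seq[tau] @ P
        return P

    Tx = Phi(N, 0)
    Tu = np.hstack([Phi(N, t + 1) @ b_seq[t] for t in range(N)])
    To = np.vstack([c_seq[t] @ Phi(t, 0) for t in range(N)])
    Ts = np.zeros((N, N))
    for t in range(N):
        Ts[t, t] = d_seq[t]
        for tau in range(t):
            Ts[t, tau] = (c_seq[t] @ Phi(t, tau + 1) @ b_seq[tau]).item()
    return Tx, Tu, To, Ts
```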

For a MIMO plant with $m$ inputs and $p$ outputs, equation (2) becomes
\[
 x_{k+1} = T_x x_k + T_{u_1} u_{k,1} + \cdots + T_{u_m} u_{k,m},
\]

where $u_{k,i} \in \mathbb{R}^N$ is the trajectory of the $i$-th control input. Likewise, the trajectory of the $j$-th output, $y_{k,j} \in \mathbb{R}^N$, becomes
\[
 y_{k,j} = T_{o_j} x_k + T_{s_{j,1}} u_{k,1} + \cdots + T_{s_{j,m}} u_{k,m}.
\]

Defining $z_k \triangleq [x_k^T, u_{k,1}^T, \ldots, u_{k,m}^T]^T$ and $y_k \triangleq [y_{k,1}^T, \ldots, y_{k,p}^T]^T$, we get
\[
\begin{aligned}
 T_u &= [T_{u_1}, \ldots, T_{u_m}], \\
 T_o &= [T_{o_1}^T, \ldots, T_{o_p}^T]^T, \\
 T_s &= \begin{bmatrix} T_{s_{1,1}} & \cdots & T_{s_{1,m}} \\ \vdots & & \vdots \\ T_{s_{p,1}} & \cdots & T_{s_{p,m}} \end{bmatrix}.
\end{aligned}
\]
The derivation of $T_{u_i}$, $T_{o_j}$, and $T_{s_{j,i}}$ from the plant matrices is left to the reader. For simplicity, we restrict further discussion to LTI SISO plants.
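Assembling the MIMO coefficients from the per-channel blocks is then just a matter of stacking. A brief sketch (ours, with hypothetical helper names), assuming the per-channel blocks $T_{u_i}$, $T_{o_j}$, and $T_{s_{j,i}}$ have already been formed:

```python
import numpy as np

def assemble_mimo(Tu_list, To_list, Ts_blocks):
    """Tu_list[i] = T_{u_i} (n x N); To_list[j] = T_{o_j} (N x n);
    Ts_blocks[j][i] = T_{s_{j,i}} (N x N)."""
    Tu = np.hstack(Tu_list)                     # [T_{u_1}, ..., T_{u_m}]
    To = np.vstack(To_list)                     # stack T_{o_1}, ..., T_{o_p}
    Ts = np.block([[Ts_blocks[j][i] for i in range(len(Tu_list))]
                   for j in range(len(To_list))])
    return Tu, To, Ts
```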

3 Characterization of Trackable Trajectories

Since the objective of NRILC is to asymptotically track a given trajectory, we have to establish whether it can do so with zero steady-state error for all possible trajectories. Let $x^*$, $u^*$, $e^*$, and $y^*$ denote the equilibrium values of $x_k$, $u_k$, $e_k$, and $y_k$, respectively.

Definition 1 A trajectory $y_d$ is trackable if the NRILC system in (1) to (4) possesses a fixed point $(x^*, u^*)$ for which $e^* = 0$.

When $T_p \ne I_N$, equation (4) implies that when $e^* = 0$, we have $(I_N - T_p)u^* = 0$, a restrictive condition. For this reason, we limit further discussion to the case $T_p = I_N$. With this assumption, we can now characterize the set of trackable trajectories.

Theorem 1 A trajectory $y_d$ is trackable if and only if
\[
 y_d \in \mathcal{T} \triangleq \{[T_o, T_s]z \mid z \in \mathcal{N}\{[I_n - T_x, -T_u]\}\}, \qquad (7)
\]
where $\mathcal{N}\{A\}$ denotes the null space of the matrix $A$.

Proof: The fixed point equations when $e^* = 0$ are
\[
\begin{aligned}
 y_d &= T_o x^* + T_s u^*, &(8)\\
 x^* &= T_x x^* + T_u u^*. &(9)
\end{aligned}
\]
Equation (9) gives us
\[
 \begin{bmatrix} x^* \\ u^* \end{bmatrix} \in \mathcal{N}\{[I_n - T_x, -T_u]\},
\]
while (8) gives us
\[
 y_d = [T_o, T_s] \begin{bmatrix} x^* \\ u^* \end{bmatrix}. \qquad \Box
\]

While the above theorem is useful, we are often more interested in whether all trajectories are trackable for a given plant. From the fixed point equations in the preceding proof, we can determine when this condition holds.

Theorem 2 Any trajectory $y_d$ is trackable if the matrix $\tilde{O}$ is non-singular, where
\[
 \tilde{O} = \begin{bmatrix} I_n - T_x & -T_u \\ T_o & T_s \end{bmatrix}. \qquad (10)
\]

The proof of Theorem 2 follows from the equilibrium equations (8) and (9). It also follows that if $\tilde{O}$ is non-singular, then for any arbitrary trajectory $y_d$, we can obtain the initial condition and control input that generate that trajectory from the matrix equation $\tilde{O}[x^{*T}, u^{*T}]^T = [0_n^T, y_d^T]^T$.
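Numerically, checking Theorem 2 and recovering the corresponding fixed point amounts to one linear solve. A sketch (not from the paper; helper names are ours):

```python
import numpy as np

def trackable_input(Tx, Tu, To, Ts, yd):
    """Solve O_tilde [x*; u*] = [0; y_d] (Theorem 2); assumes O_tilde is nonsingular."""
    n, N = Tu.shape
    O_tilde = np.block([[np.eye(n) - Tx, -Tu],
                        [To, Ts]])
    if abs(np.linalg.det(O_tilde)) < 1e-12:
        raise ValueError("O_tilde is (numerically) singular; not every y_d is trackable")
    sol = np.linalg.solve(O_tilde, np.concatenate([np.zeros(n), yd]))
    x_star, u_star = sol[:n], sol[n:]
    # sanity checks: the fixed-point equations (8) and (9)
    assert np.allclose(To @ x_star + Ts @ u_star, yd)
    assert np.allclose(Tx @ x_star + Tu @ u_star, x_star)
    return x_star, u_star
```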

4 NRILC Eigenvalue Assignment

Suppose we decompose $\mathcal{A}$ as follows,
\[
 \mathcal{A} = \begin{bmatrix} T_x & T_u \\ 0 & I_N \end{bmatrix}
 - \begin{bmatrix} 0 \\ I_N \end{bmatrix} T_e \begin{bmatrix} T_o & T_s \end{bmatrix}
 = \bar{A} - \bar{B} K \bar{C},
\]
where
\[
 \bar{A} = \begin{bmatrix} T_x & T_u \\ 0 & I_N \end{bmatrix}, \qquad
 \bar{B} = \begin{bmatrix} 0 \\ I_N \end{bmatrix}, \qquad
 K = T_e, \qquad
 \bar{C} = \begin{bmatrix} T_o & T_s \end{bmatrix}.
\]
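The identity $\mathcal{A} = \bar{A} - \bar{B} T_e \bar{C}$ (with $T_p = I_N$) is easy to verify numerically. A small sketch of ours, using randomly generated matrices of consistent dimensions purely as a consistency check:

```python
import numpy as np

def output_feedback_form(Tx, Tu, To, Ts):
    """Abar, Bbar, Cbar such that cal_A = Abar - Bbar @ Te @ Cbar when Tp = I_N."""
    n, N = Tu.shape
    Abar = np.block([[Tx, Tu], [np.zeros((N, n)), np.eye(N)]])
    Bbar = np.vstack([np.zeros((n, N)), np.eye(N)])
    Cbar = np.hstack([To, Ts])
    return Abar, Bbar, Cbar

rng = np.random.default_rng(0)
n, N = 2, 5
Tx, Tu = rng.standard_normal((n, n)), rng.standard_normal((n, N))
To, Ts = rng.standard_normal((N, n)), rng.standard_normal((N, N))
Te = rng.standard_normal((N, N))                 # arbitrary gain, just for the identity
Abar, Bbar, Cbar = output_feedback_form(Tx, Tu, To, Ts)
calA = np.block([[Tx, Tu], [-Te @ To, np.eye(N) - Te @ Ts]])
assert np.allclose(calA, Abar - Bbar @ Te @ Cbar)
```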

This decomposition has the same form as the output feedback problem for an $(N+n)$-th order MIMO plant $(\bar{A}, \bar{B}, \bar{C})$, where we have to choose the feedback gain $K = T_e$ so that $\bar{A} - \bar{B}K\bar{C}$ is stable. It turns out that not only can such a gain be found, but we can also place all the eigenvalues of $\mathcal{A}$ to any arbitrary accuracy for almost any plant, regardless of the stability of the plant. This result is elaborated below and draws from the results of Davison et al. on the eigenvalue assignment by output feedback (EVAOF) problem ([3], [5], and [4]), as stated in the theorem below.

Theorem 3 (Eigenvalue Assignment with Output Feedback (EVAOF) Theorem) Given a controllable and observable system $(A, B, C)$ with $A \in \mathbb{R}^{n\times n}$, $\operatorname{rank}(B) = m$, and $\operatorname{rank}(C) = p$, then for almost all $(B, C)$ pairs, there exists a constant gain output feedback matrix $K$ such that $A + BKC$ has $\min(n, m+p-1)$ eigenvalues assigned arbitrarily close to $\min(n, m+p-1)$ specified locations in the complex plane, with complex eigenvalues occurring in conjugate pairs.

Before we present the main theorem, a lemma is needed to establish the conditions sufficient for the application of Theorem 3.

Lemma 1 Given a SISO LTI plant $(A, b, c, d)$, if the following conditions hold:
(i) the plant is completely controllable and observable but not strictly proper, and
(ii) $\tilde{O}$, as defined in (10), is non-singular,
then $(\bar{A}, \bar{B}, \bar{C})$ is completely controllable and observable, and
\[
 \operatorname{rank}(\bar{B}) = \operatorname{rank}(\bar{C}) = N. \qquad (11)
\]

Proof: Clearly, $\bar{B}$ is full rank. Also,

\[
 \bar{C} = [T_o, T_s] = \begin{bmatrix}
   c & d & & & \\
   cA & cb & d & & \\
   \vdots & \vdots & \ddots & \ddots & \\
   cA^{N-1} & cA^{N-2}b & \cdots & cb & d
 \end{bmatrix} \qquad (12)
\]
is full rank if $d \ne 0$, which is equivalent to the plant not being strictly proper. To show that $(\bar{A}, \bar{B})$ is completely controllable, consider its controllability matrix,
\[
 \mathcal{C} = [\bar{B}, \bar{A}\bar{B}, \ldots, \bar{A}^{N+n-1}\bar{B}].
\]
Since $(A, b)$ is completely controllable,
\[
 T_u = [A^{N-1}b, \ldots, Ab, b]
\]
has rank $n$, and thus the submatrix
\[
 [\bar{B}, \bar{A}\bar{B}] = \begin{bmatrix} 0_{n\times N} & T_u \\ I_N & I_N \end{bmatrix}
\]
has rank $N+n$. Recall that $(\bar{C}, \bar{A})$ is completely observable if and only if for all $s$,
\[
 \operatorname{rank}\begin{bmatrix} sI_{N+n} - \bar{A} \\ \bar{C} \end{bmatrix} = N+n.
\]
When $s \ne 1$,
\[
 \begin{bmatrix} sI_{N+n} - \bar{A} \\ \bar{C} \end{bmatrix}
 = \begin{bmatrix} sI_n - A^N & -T_u \\ 0_{N\times n} & (s-1)I_N \\ T_o & T_s \end{bmatrix}
\]
has rank $N+n$, since $d \ne 0$ and $(c, A)$ is completely observable, i.e.,
\[
 \operatorname{rank}\begin{bmatrix} c \\ cA \\ \vdots \\ cA^{N-1} \end{bmatrix} = n.
\]
When $s = 1$,
\[
 \begin{bmatrix} sI_{N+n} - \bar{A} \\ \bar{C} \end{bmatrix}
 = \begin{bmatrix} I_n - A^N & -T_u \\ 0_{N\times n} & 0_{N\times N} \\ T_o & T_s \end{bmatrix}.
\]
Removing the rows of zeros in the middle gives us $\tilde{O}$ as defined in (10). Therefore, $(\bar{C}, \bar{A})$ is completely observable if $\operatorname{rank}(\tilde{O}) = N+n$, which completes the proof. $\Box$

Condition (ii) in the previous lemma defines a hypersurface in the space of all possible plants $(A, b, c, d)$, and we can therefore remove condition (ii) from the "if" part of the lemma as long as we replace it with the qualifier "almost any". It is interesting to note that this is the same condition that guarantees that the system can track any given trajectory, as shown in Theorem 2.
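The two rank arguments in the proof can be probed numerically for a given plant. A sketch of ours, assuming the lifted matrices have already been computed (e.g., with the `lift_lti` sketch from Section 2):

```python
import numpy as np

def lemma1_rank_checks(Tx, Tu, To, Ts, d):
    """Probe the rank conditions used in the proof of Lemma 1."""
    n, N = Tu.shape
    Abar = np.block([[Tx, Tu], [np.zeros((N, n)), np.eye(N)]])
    Bbar = np.vstack([np.zeros((n, N)), np.eye(N)])
    # controllability: the submatrix [Bbar, Abar Bbar] should already have rank N + n
    ctrb2 = np.hstack([Bbar, Abar @ Bbar])
    # observability: at s = 1 the test reduces to the rank of O_tilde from (10)
    O_tilde = np.block([[np.eye(n) - Tx, -Tu], [To, Ts]])
    return {
        "d_nonzero": d != 0,
        "rank_[B,AB]": int(np.linalg.matrix_rank(ctrb2)),
        "rank_O_tilde": int(np.linalg.matrix_rank(O_tilde)),
        "target_rank": n + N,
    }
```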

Theorem 4 (Sufficient Condition for Pole Placement of $\mathcal{A}$) For almost any SISO LTI plant satisfying the sufficiency conditions in Lemma 1, there exists a learning gain $T_e$ such that $\mathcal{A}$ has all $N+n$ eigenvalues assigned arbitrarily close to $N+n$ specified locations in the complex plane, with complex eigenvalues occurring in conjugate pairs.

The theorem follows from Lemma 1 and Davison's result (Theorem 3). An algorithm for computing $T_e$ is given in [5]. The plants for which Theorem 3 does not guarantee the existence of an output feedback matrix lie in a hypersurface defined differently from the one in Lemma 1. More study is needed to determine how these two hypersurfaces are related. More importantly, we have removed the condition of plant stability as a prerequisite to stabilizing $\mathcal{A}$, unlike in the results from repetitive control theory [11]. The stabilizing $T_e$ obtained from Davison's algorithm, however, does not have any structure. This lack of structure may result in a computationally intensive implementation of this scheme, as opposed to, say, a Toeplitz $T_e$.

Pole placement as described above requires full knowledge of the plant dynamics. This information may be obtained using online or offline system identification techniques. Some problems with this approach include the sensitivity of the obtained gain to the plant dynamics and desired poles, and numerical problems both in obtaining the learning gain and when the computed gain has large-valued elements. These problems generally worsen as $N$ increases. We can see this from (12), which has increasing powers of $A$ as we go down the matrix. As we increase $N$, these elements either become very large or very small, requiring large gain elements (or at least a wide dynamic range) to obtain a specified set of eigenvalues. This exponential dependence of the matrix elements on $N$ also leads to dynamic range problems.

Robustness of the learning gain to plant parameter variations should also be studied. Robustness analysis is complicated by the nonuniqueness of the learning gain and the nonlinearity of the pole placement problem and its algorithms. For some algorithms, e.g., the MEVAO algorithm by Miminis [9], the computed learning gain is extremely sensitive to the set of desired poles, easily demonstrated by slightly perturbing the desired poles and reapplying the algorithm. A simple algorithm that produces a unique and continuous mapping from the plant parameters and desired poles to the learning gain will facilitate robustness analysis. We can then perform this analysis using results from robust control.
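The dynamic-range issue can be made concrete by looking at the spread of the nonzero entries of $\bar{C} = [T_o, T_s]$ as $N$ grows. A sketch of ours, using the example plant (13) of Section 5 purely for illustration:

```python
import numpy as np

def cbar_dynamic_range(A, b, c, d, N):
    """Ratio of largest to smallest nonzero |entry| of Cbar = [To, Ts],
    a rough indicator of the conditioning problem discussed above."""
    To = np.vstack([c @ np.linalg.matrix_power(A, t) for t in range(N)])
    col = np.array([d] + [(c @ np.linalg.matrix_power(A, i - 1) @ b).item()
                          for i in range(1, N)])
    Ts = np.zeros((N, N))
    for i in range(N):
        Ts[i, :i + 1] = col[:i + 1][::-1]       # lower-triangular Toeplitz
    Cbar = np.hstack([To, Ts])
    mags = np.abs(Cbar[Cbar != 0])
    return mags.max() / mags.min()

A = np.array([[0.4231, -0.3029], [-0.3029, 0.2777]])   # plant (13) from Section 5
b = np.array([[0.0077], [0.0]])
c = np.array([[0.0, 0.4175]])
d = 0.6868
for N in (10, 20, 50, 100):
    print(N, cbar_dynamic_range(A, b, c, d, N))        # spread grows rapidly with N
```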


5 Numerical Examples

Consider a second-order plant $(A, b, c, d)$ with
\[
 A = \begin{bmatrix} 0.4231 & -0.3029 \\ -0.3029 & 0.2777 \end{bmatrix}, \quad
 b = \begin{bmatrix} 0.0077 \\ 0 \end{bmatrix}, \quad
 c = \begin{bmatrix} 0 & 0.4175 \end{bmatrix}, \quad
 d = 0.6868. \qquad (13)
\]
The eigenvalues of $A$ are 0.6619 and 0.0388. The reader can verify that the plant is proper but not strictly proper, and is completely controllable and observable. Using (10), we obtain $\det(\tilde{O}) = 0.133 \ne 0$. With $N = 20$, the NRILC system matrices are
\[
 \bar{A} = \begin{bmatrix} A^{20} & T_u \\ 0_{20\times 2} & I_{20} \end{bmatrix},
 \quad\text{with}\quad
 A^{20} = \begin{bmatrix} 0.1608 & -0.1268 \\ -0.1268 & 0.1000 \end{bmatrix}\times10^{-3},
 \qquad
 \bar{B} = \begin{bmatrix} 0_{2\times 20} \\ I_{20} \end{bmatrix},
\]
and
\[
 \bar{C} = \begin{bmatrix}
   0 & 0.4175 & 0.6868 & 0 & 0 & \cdots \\
   -0.1265 & 0.1159 & 0 & 0.6868 & 0 & \cdots \\
   -0.0886 & 0.0705 & -0.0010 & 0 & 0.6868 & \cdots \\
   -0.0589 & 0.0464 & -0.0007 & -0.0010 & 0 & \cdots \\
   -0.0390 & 0.0307 & -0.0005 & -0.0007 & -0.0010 & \cdots \\
   \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
 \end{bmatrix}.
\]
The transmission zeros of the system $(\bar{A}, \bar{B}, \bar{C})$ are easily determined using the Matlab function tzero, which yields $\{0, 0.2792\times10^{-3}\}$.
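The quantities above are straightforward to reproduce (except the transmission zeros, which need a tool such as tzero). A sketch of ours, reusing the `lift_lti` helper from Section 2:

```python
import numpy as np

A = np.array([[0.4231, -0.3029], [-0.3029, 0.2777]])
b = np.array([[0.0077], [0.0]])
c = np.array([[0.0, 0.4175]])
d = 0.6868
N = 20

Tx, Tu, To, Ts = lift_lti(A, b, c, d, N)            # sketch from Section 2
O_tilde = np.block([[np.eye(2) - Tx, -Tu], [To, Ts]])

print(np.linalg.eigvals(A))       # reported in the paper: 0.6619 and 0.0388
print(np.linalg.det(O_tilde))     # reported in the paper: 0.133
```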

Using the MEVAO algorithm by Miminis [9] with the desired poles at 0.5, we obtain the learning gain
\[
 T_e = \begin{bmatrix}
   0.0299 & 0.0112 & -0.0299 & 0.0018 & 0.0065 & \cdots \\
   -0.3006 & -0.1128 & 0.3030 & -0.0133 & -0.0592 & \cdots \\
   0.3685 & 0.1401 & -0.3715 & 0.0171 & 0.0723 & \cdots \\
   -0.1861 & -0.0718 & 0.1909 & -0.0103 & -0.0381 & \cdots \\
   1.8326 & 0.6979 & -1.8547 & 0.0872 & 0.3600 & \cdots \\
   \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
 \end{bmatrix} \times 10^3,
\]

which places the eigenvalues of $\mathcal{A}$ within a ball centered at 0.5 with a radius of 0.2093. Note the complete lack of structure in $T_e$. The output ($y_k$) and error-norm ($\|e_k\|_2$) history of the NRILC system for an arbitrary $y_d$ are shown in Figures 4 and 5.

[Figure 4: Output history. The output trajectories y_k for iterations k = 0 and k = 1, together with the desired trajectory y_d, plotted against t over the 20-sample horizon.]

Suppose we shift the smaller eigenvalue of the plant in (13) to 1.2. The corresponding plant,
\[
 A_u = \begin{bmatrix} 1.5843 & 1.1699 \\ -0.3029 & 0.2777 \end{bmatrix},
\]
has eigenvalues at 1.2 and 0.6619. Using MEVAO with the desired eigenvalues at the origin, we obtain the learning gain
\[
 T_e = \begin{bmatrix}
   4.6362 & 3.2312 & 19.5071 & 115.3617 & 18.3846 & \cdots \\
   4.9677 & 4.8040 & -32.8725 & -202.0584 & -41.2405 & \cdots \\
   46.0379 & 22.9950 & 54.8202 & 324.9247 & 42.7656 & \cdots \\
   43.2578 & 25.0801 & -5.6840 & 44.6073 & -10.0994 & \cdots \\
   -46.4472 & -25.6286 & -114.7523 & -596.2693 & -87.1475 & \cdots \\
   \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
 \end{bmatrix},
\]
resulting in $\rho\{\mathcal{A}\} = 0.8717$. For a given $\rho\{\mathcal{A}\}$, we expect larger gain elements in $T_e$ for unstable plants, since one of the eigenvalues of $\bar{A}$ will be $\lambda_u^N$, where $\lambda_u$ is an unstable eigenvalue of $A$.

[Figure 5: Error history. The error norm ||e_k||_2 plotted against iteration k on a logarithmic scale (from 10^{-6} to 10^{1}), for k = 0 to 20.]

The eigenvalue $\lambda_u^N$ has to be "pushed" into the unit circle, thus requiring large gain elements. In general, as $N$ increases, so will the elements of $T_e$. The output and error-norm histories of this system are shown in Figures 6 and 7.

[Figure 6: Output history of unstable plant. The output trajectories y_k for iterations k = 0, 5, and 10, together with the desired trajectory y_d, plotted against t over the 20-sample horizon.]

[Figure 7: Error history of unstable plant. The error norm ||e_k||_2 plotted against iteration k on a logarithmic scale (from 10^{-5} to 10^{1}), for k = 0 to 60.]
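The iteration histories in Figures 4 to 7 can be reproduced qualitatively by iterating equations (1) to (4) directly once a stabilizing gain is available. A minimal simulation sketch (ours, not the authors' code), with $T_e$ supplied by the user, e.g., from an output-feedback pole-placement routine such as MEVAO:

```python
import numpy as np

def simulate_nrilc(Tx, Tu, To, Ts, Te, yd, x0, iters=20):
    """Iterate (1)-(4) with T_p = I_N; return the error norms ||e_k||_2."""
    u = np.zeros(yd.size)
    x = np.array(x0, dtype=float)
    errs = []
    for _ in range(iters):
        y = To @ x + Ts @ u          # (1): output over one period
        e = yd - y                   # (3)
        errs.append(np.linalg.norm(e))
        x = Tx @ x + Tu @ u          # (2): state carried over -- no reset
        u = u + Te @ e               # (4) with T_p = I_N
    return np.array(errs)

# Example call (Te is a placeholder here; a stabilizing gain is assumed):
# errs = simulate_nrilc(Tx, Tu, To, Ts, Te, yd=np.ones(20), x0=np.zeros(2))
```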

6 Conclusion

We presented a new scheme for the repetitive control of discrete-time, linear SISO plants. A sufficient condition for the scheme to asymptotically track any periodic signal is the non-singularity of the matrix $\tilde{O}$. This scheme allows the eigenvalues of the closed-loop system to be arbitrarily placed almost always. Unlike in conventional repetitive control schemes, plant stability is not required. However, some numerical issues have to be addressed to make the scheme practical, especially for large $N$. We have also observed that even though the pole assignment guarantees the asymptotic convergence of the error to zero, in some cases the control inputs for the transient period may be unacceptably large, especially when the plant is unstable. This behavior motivates the study of constrained-input NRILC and alternative design schemes that minimize undesirable behavior in the transient period.

Although some of the results of this study can be extended to linear time-varying and MIMO plants, an extension of Lemma 1, which determines the applicability of pole placement of $\mathcal{A}$, to these types of plants has to be formulated.

References

[1] N. Amann, D. H. Owens, and E. Rogers, "New results in iterative learning control," in Proceedings of the International Conference on CONTROL '94, (Coventry, UK), pp. 640-645, Mar. 1994. IEE Conference Publication vol. 1, no. 389.

[2] S. Arimoto, S. Kawamura, and F. Miyazaki, "Bettering operation of robots by learning," Journal of Robotic Systems, vol. 1, no. 2, pp. 123-140, 1984.

[3] E. J. Davison, "On pole assignment in linear systems with incomplete state feedback," IEEE Transactions on Automatic Control, vol. 15, pp. 348-351, June 1970. Short Paper.

[4] E. J. Davison and R. Chatterjee, "A note on pole assignment in linear systems with incomplete state feedback," IEEE Transactions on Automatic Control, vol. 16, pp. 98-99, Feb. 1971. Technical Notes and Correspondence.

[5] E. J. Davison and S. H. Wang, "On pole assignment in linear multivariable systems using output feedback," IEEE Transactions on Automatic Control, vol. 20, pp. 516-518, Aug. 1975. Short Paper.

[6] S. Hara, Y. Yamamoto, T. Omata, and M. Nakano, "Repetitive control system: A new type servo system for periodic exogenous signals," IEEE Transactions on Automatic Control, no. 7, July 1988.

[7] T. Ishihara, K. Abe, and H. Takeda, "A discrete-time design of robust iterative learning controllers," IEEE Transactions on Systems, Man, and Cybernetics, vol. 22, no. 1, pp. 74-84, January/February 1992.

[8] D.-I. Kim and S. Kim, "On iterative learning control algorithm for industrial robots and CNC machine tools," in Proceedings of the 19th Annual International Conference on Industrial Electronics, Control, and Instrumentation, (Maui, Hawaii, USA), pp. 601-606, Nov. 1993.

[9] G. S. Miminis, "Using deflation in the pole assignment problem with output feedback," in Third Annual Conference on Aerospace Computational Control, Aug. 1989.

[10] K. L. Moore, M. Dahleh, and S. P. Bhattacharyya, "Iterative learning control: A survey and new results," Journal of Robotic Systems, vol. 9, no. 5, pp. 563-594, July 1992.

[11] N. Sadegh, "Synthesis of a stable discrete-time repetitive controller for MIMO systems," Transactions of the ASME, pp. 92-98, Mar. 1995.

[12] D. Wang, Y. C. Soh, and C. C. Cheah, "Robust motion and force control of constrained manipulators by learning," Automatica, vol. 31, no. 2, pp. 257-262, Feb. 1995.
