Stability and convergence of neurologic model based robotic controllers

1 downloads 0 Views 511KB Size Report
controller which utilized a generic multilayer ANN model was introduced by ..... New Haven, Conn., 1983. [22] K. S. Narendra and A. P. Morgan, “On the uniform.
l b , Fnncs - my 1992

Stability and Convergence of Neurologic Model Based Robotic Controllers * M. Kemal Calaz Electrical Engineering Department, Bosphorous University Bebek, Istanbul 80815 Turkey

Can I4zk Electrical and Computer Engineering Department Syracuse University, Syracuse, NY 132441240 U.S.A.

Abstract

dynamics and basically adopted the adaptive controller structure of Craig et al. [4]. A trainable neuromorphic controller which utilized a generic multilayer ANN model was introduced by Guez et al. [5] and used efficiently for the stabilization of an inverted pendulum. Bassi and Bekey [6]used the same architecture as that of Psaltis et al. (i.e. with no feedback), but suggested a functional decomposition approach which they claimed would be more effective in off-line learning. A table look-up learning method in a neurological setting was applied to manipulator control by Miller et al.[?] where they extended Albus’s CMAC model [8] utilizing an efficient hashing memory structure for fast feedforward torque computation. In [9] Jordan suggested the use of ANN based forward models in a feedforward controller architecture. He incorporated a forward model and a feedforward controller in a single neural network architecture. However this method required a separate learning process in order to be able to utilize the feedforward controller effectively. The above list of references is not meant to be an exhaustive one, however it represents a good spectrum of the research efforts in this field. The present paper discusses the stability properties of an ANN based trajectory following controller. The controller algorithm in question was originally proposed by the authors in [lo, 111 and successfully tested for trajectory following tasks through simulation experiments. The controller is basically similar to the adaptive inverse dynamics control method of Craig et al. [4]. However instead of using an explicit parametric model, the controller utilizes generic multilayer ANNs to adaptively approximate the manipulator dynamics over a specified region of the state space for a given desired trajectory. This generic neural network structure can be viewed as a nonlinear extension of a deterministic auto-regressive model which is commonly used in model matching problems for linear systems.

Control of robotic manipulators with unknown or uncertain dynamics has been an important research area in the last decade. Without a parametric model of system dynamics, learning control techniques are still the most effective control methods for repeated trajectory following tasks. In this class of controllers, neurologically inspired algorithms have been gaining much attention in recent years. Although such techniques are shown to work effectively through simulation experiments, coupled and nonlinear nature of parameter update dynamics renders an effective mathematical analysis of closed loop system stability and convergence difficult. This paper investigates the local convergence properties of an artificial neural network based learning controller, using linearization techniques. The results obtained reflect the local properties of the closed loop nonlinear system dynamics, hence qualitative statements about the nonlinear system and its solutions can be made.

1. Introduction With the resurgence of artificial neural network (ANN) research for various engineering problems of nonlinear nature, many researchers have attempted to apply neurologically inspired algorithms for manipulator control. The main underlying assumption in these applications is the efficient capability of ANNs to approximate multivariable functions. With the increasing interest in this area, IEEE Control Systems Magazine had three issues which covered special sections on neural networks for control systems [l]. A learning control algorithm which utilized a multilayer neural network that was first trained to learn the inverse dynamics of the system under consideration, without any feedback, and then used this model as the feedforward element in a control loop was proposed by Psaltis et al. [2]. Kawato et al. [3]proposed a learning controller for manipulators utilizing an adaptive inverse dynamics model based on the parameterization of system dynamics. This approach required explicit knowledge of the manipulator’s

2.

In this section we introduce the trajectory following control problem and then write the closed loop system dy-

*Thiswork is in part sponsored by Westinghouse Education Foundation under contract No:3597262.

205 I

0-8186-2720-4/92 $3.00 81992 IEEE

ANN-Based Trajectory Following Control

The manipulator's inverse dynamics which is given in (2) and represented by the nonlinear transformation R-', can be decomposed into n transformations, namely

namics for the proposed ANN based controller. Consider the vector representation of an n link rigid manipulator dynamics, given as

where r is the n x 1 vector of joint torques, and q is the n x 1 vector of joint positions. The matrix M(q) is the n x n positive definite "inertia matrix". v ( q , q ) is n x 1 vector function representing centrifugal and Coriolis effects, and finally n x 1 vector function g(q) represents torques due to gravity. The derivation of (1) can be found in common reference texts [12, 131. Equation 1 can be put in a more compact form as ,

where each rF'(q,q,ii), i = 1 , . .. ,n defines the inverse dynamics of the corresponding joint, that is,

-

rF'(q,q,G) : 'R3"

Each entry TF' of the vector function R-' can be modeled by an ANN such that the overall system's inverse dynamics model is represented by, (e)

Given a bounded desired trajectory in joint variables ( q d , q d , G d ) , the control designer's task is to devise a controller to track this desired trajectory as closely as possible. The most common approach is the model based control method which is also referred to as the computed torque or the inverse dynamics method. Here the controller makes full use of the dynamic equations of the manipulator, hence the performance solely depends on the accuracy of the dynamic model available to the designer. If exact manipulator dynamics is available, then the control r = M(q)(K,e Kpe G d ) h(q,4) (3) will result in an error equation of the form,

+

[ ] [ " '1 '

i-' (iz (t ) )

r = k-'(q,q,G) =

fR'(z(t1)

=

N n ( z ,p n )

(7) where (:) denotes the estimated models, and Ni(.), i = 1 , . . .,n represents the output of each ANN model that is used to realize the nonlinear mapping T ; ' ( . ) . z(t) can be considered as an augmented state vector of the robot dynamics, z ( t ) = (qT(t),qT(t),GT(t))TE 7L3" which denotes the time dependent input vector of the inverse dynamics and pi is the vector of all adjustable weights of the corresponding ANN model. Assuming that a three-layer ANNwith "k" inputs, "m" hidden layer units and one output unit is used to model a joint's inverse dynamics, we can explicitly write this model as,

+ +

e + K,e+ Kpe = 0

72, with i = 1,... , n

(4)

due to the cancellation of nonlinear terms, where e = qdq and i! = q d - 4. K,, and Kp are the diagonal matrices of velocity and position servo feedback gains, respectively. Note that the above error equation is a decoupled one due to the diagonal nature of the constant matrices K p and K,,. Therefore adjusting these gain matrices properly, tracking errors can,be effectively forced to zero. It should be noted here that, even if an exact dynamic model of the manipulator is not available, some form of feedforward torque compensation is necessary in order to achieve satisfactory trajectory performance. For example in [13], a DAR (deterministic auto regressive model) is used to approximate the system inverse dynamics. In [14, 41 parameter based adaptive models were assumed to exist and the unknown parameters are updated based on the minimization of the trajectory tracking error. Here we propose the use of a generic ANN to model the inverse dynamic structure of each joint. For an nlink manipulator, inverse system dynamics given in (2) can be defined by a transformation R-' : 'R3" + 'R" which maps the 3n dimensional output space of the system (position, velocity and acceleration vectors) to the n-dimensional input space (joint torque vector r). This nonlinear transformation can be effectively modeled by ANNs as discussed by Funahashi [15], Cybenko [16] and Hornik, Stinchcombe and White [17].

f-'(~(i?))

= N i ( z ( t ) , w i ( t )H, i ( t ) ) = w T T ( H ~ z ) (8)

where wi(t) E 'R" is the adaptive output layer weight (parameter) vector, H i ( t ) E Rmxkis the hidden layer adaptive weight matrix of the "i"th ANN model. z ( t ) E 'Rk with k = 3n is the input vector as defined before. In the rest of the text, time argument of these vectors will sometimes be dropped for the brevity of the analysis. The vector function T(.)E 'R" is defined as T(.)= ( S I ( . ) , . . . , g m ( . ) ) ' where s i ( . ) E 'R is by definition a monotone increasing function which is taken to be a sigmoid function in this case, based on the justification (1t e-") given by the theorems in [15, 171. Hence g i ( z ) = 1 and it is bounded as 0 5 g i ( z ) 5 1. To put the hidden layer adaptive weight matrix Hi of the ANN model in vectoral form, we define vi = vec(Hi) E 'Rmkwhere vet(.) operator gives a vector which is obtained by stacking the columns of its matrix argument. Then we can rewrite the argument of vector function T(.)as, H ~= z @vi (9) where v = { H I ' , &I,. . . , H m l , .. . , H l k , . . .,H , k } = and @ E 7Lmxmk is a matrix which can be considered as the

2052

modified input of the ANN model and is defined as,

h-l(z) =

... 0 ...... Zk 0 ... 0 ... 0 ...... 0 z k ... 0 a = .. .. .. .. . * . .) . . . . . . . . .. . .. .. . .. 0 0 .... 21 ...... 0 0 ... Zk (10) where z, E 12, i = 1,...,k are the elements of the input 21

0

0

21

(

where ?r'(z,p,) with i = 1,....n denotes the error in inverse dynamic modeling for each joint, and pi = E Rmtmkis the adaptive weight vector of the corresponding ("i" th) ANN model. Using (15) and (18) the error dynamics (with diagonal Kp and K , matrices) for each joint can be written as follows,

{wT,vT}~

vector z E R k . Hence (8) can now be written as, +,-'(z,Wi,vi) = Ni(z,wi,vi) = WTT(avi)

(11)

Now that we have an adaptive manipulator model, we can write the control law based on this adaptive model. Here we note that, since an ANN model basically realizes a direct implicit transformation from the input vector z to joint torques 7 , this model does not convey any explicit information on manipulator's estimated dynamic components such as the inertia matrix, Coriolis forces etc. Hence an adaptive control architecture based on a computed torque model which is discussed earlier is not possible. However the feedforward compensation torques can be directly computed based on the proposed generic ANN model which would approximate the system dynamics for a specific desired trajectory and then added to a feedback servo signal to generate torques that will finally drive the actuators. Hence we define the control law which has an adaptive feedforward compensator and a servo feedback signal as, r

~i

.,-I(.)

= w:T(av;o)

(20)

--

p) = W:T(@vo)

?(z,

r-1

- wTT(@v)

(21)

P - 1

With this set up, we can analyze the error equation given in (19) which is in fact a stable linear system with a nonlinear forcing function. The error dbaamics in (19) can be written as,

-

= M(q)ii

(19)

where wio and vi0 represent the desired output layer and hidden layer weights that generate the desired mapping, F;'. In the rest of the analysis we drop the subscript "i",for the brevity of the presentation. Using the desired mapping given in (20), the error in the approximation of the inverse dynamics of a joint can be written as,

where K , E 'Rnxnand Kp E Rnxnare the diagpnal gain matrices with entries k, and k,, respectively, R-'(z) = N(z) = { N I , ... ,Nn}T E 12" is the dynamic model estimate which consists of "n" ANN models which represent the actuators' inverse d y n e c s . Note- that the computation of the transformation R-'(z) = R-'(q, q, 6) in (12) requires the information on Q in addition to the manipulator's state vector {q,q}. The acceleration vector Q can be computed by differentiating the velocity vector q using a first order filter [4, 181. With the control vector given in (U), the system's error dynamics can be written by substituting (12) in (2),

+ t + K,Q+ K,e

= ~,-'(z,p,),for i = 1,....n.

where ei, di and Zi denote the position, velocity and acceleration errors at joint a, respectively, kp and t, are the individual servo gains, respectively. Based on the existence theorems given in [15, 171, we assume that there exists a three layer ANN model-which is defined h (11)- that closely approximates the joint's inverse dynamics. That is,

= &-'(z)+Kvc5+K,e = N(z)+G+K,&+K,e (12)

k'(z)

+ k,ii + k,ei

where H ( s ) is a strictly positive real (SPR)transfer function, due to the positive gain terms k, and k, in (19). Based on this error dynamics, a gradient update algorithm for the parameters p can be written as,

+ h(q,q)(l3)

R-l(Z)

Z+K,i!+K,e

= R-'(z)-k-'(z)(l4)

t+K,Q+K,e

=

h(z)

(23)

(15)

where r E R(mtmk)x(mtmk) is a diagonal gain matrix which includes the learning rates as its diagonal entries. Then the state space representation of (19) is given by,

where R-'(z) = k ' ( q , q, Q) E 'R" denotes the error between the actual inverse dynamics R-' and the estimated and can be explicitly written as, model kl, P ( 2 )

=

[ rm;f;l(z)]

it

(16)

e

- i,'(z) rT'(z) - N ( z ,P I )

r,' (z)

P ( 2 )

= AX+B?-'(Z,~) = cx

(24) (25)

where x = ( e , i ) T E R Z x 2is the state vector, and A, B and and c denote the minimal state space representation due to strictly positive real H ( s ) . Since H ( s ) is an SPR transfer function, based on the Kalman-Yakubovich lemma [lg], for a given symmetric positive definite (p.d.)

=

2053

where Z. = ( q d T , q d T , 4dT) which basically corresponds to the nominal state x. = 0. Evaluating the partials, we get

matrix Q, there exists another symmetric positive definite matrix P which satisfies the Lyapunov equation, A~P+PA=-Q

(26)

= A% - BY~(~.v,)w -

and the output relation, c=

j,

BTP

(27)

~~rJAC

+ BY~(~P.V,)G + BW? JAG

(31) (32)

where J. E Rmxmis a diagonal Jacobian matrix evaluated at the nominal values,

Replacing the output vector c in (23) by (27), the gradient update law take sthe form,

and 9. is the vector function (defined in (10)) which is evaluated at z*.Writing (32) in a compact form,

where N(z,p)is the output the ANN model. Due to the nonlinear parametric dependence of the term N(z,p), the computation of the partial derivative term in (29) requires the so-called backpropagation algorithm. Note that the closed loop adaptive system given by (24) and (29) defines a coupled nonlinear system of differential equations and this makes a global convergence and stability analysis of the closed loop system difficult. However local properties of the system dynamics can be studied through the use of linearization techniques. Linearization dictates that, subject to smoothness of the nonlinear oprators in (24) and (29), one can constitute a linearized system whose stability properties are identical to the local stability properties of (24) and (29). In this way, qualitative statements about the nonlinear system and its solutions can be made.

3.

= AX

X =AX

+ B\k.(z., w.,v*)P

(34)

where

can be considered as the linearized regressor vector of the error dynamics, and

is the parameter error vector. Using the same arguments and linearizing (29) explicitly for w and V, we get [ll], W

5

Local Stability and Convergence Analysis

= I'lY(9,v,)BTPii = I'29TJ,w,BTP%

(37) (38)

where 9*and J , are as defined previously and I'I E R m k x m k and rz E R k X kare diagonal matrices which are in fact the partitions of the gain matrix I?. Combining equations (37) and (38), we can write the linearized update equation for the parameter vector p as follows,

Now, let p* and x. denote the nominal (tuned) values of the parameter vector and the state of the system, respectively. In order to linearize the system defined by (24) and (29) around a nominal trajectory, we choose the nominal values x* and p* at the desired values of x and p without loss of generality. That is x* = xo = 0 and p. = po = (W:,V;)~. This basically corresponds to a condition where the system operates in the vicinity of the desired trajectories ( q d ,q d , q d ) , and the ANN model parameters are close to their desired values. This choice is also intuitively appealing since even at the start of the parameter update algorithm, the servo feedback signal forces the error vector x to take small values. With this choice of the nominal signals, the perturbation vectors can now be defined as,W = xo-x = -x which is actually the tracking error vector and p = po - p which is the parameter error vector. With this set up, we first linearize (24) around X. = 0 and p. = po as follows,

p p

= I'QTBTPi = -rQyBTPx,

since % = -x

(39) (40)

where \k. is as defined in (35). Note that equations the (34) and (40) constitute the linearized closed loop system dynamics. Based on this linearized model we can now analyze the local convergence and stability properties of the system given by (24) and (29). We first state a Lemma which will be used in a following theorem. Lemma 1: If a diflerentiable function f ( t ) has a finite ljmit as t + CO, and i f f i s uniformly continuous, then f ( t ) + 0 as t-*CO. The proof of the above lemma can be found in [20]. Now we can state a theorem on the stability of the linearized system. Theorem 1: Given the linearized system dynamics in (34) and the update law in (do), the tracking error vector x of (34) satisfies, x-+O a s t + w

2054

Since @* is bounded in this case, the left hand inequality of (45) holds directly, hence the p.e. condition can be written as,

with all signals remaining bounded.

Proof: Consider the Lyapunov function,

+ pTr-lp

v(x,fi) = X ~ P X

lototT

(41)

@.Ti@*&

where I? is the p.d. diagonal gain matrix as defined before, and P is the p.d. symmetric matrix solution of the Lyapunov equation ATP P A = -Q with Q chosen as an arbitrary symmetric p.d. matrix. Then the derivative of V(x, 6) can be computed as,

+ 2fiT@TBTPx+ 2fiTr-’p

(42)

Replacing p in (42) by the update law 6 = -r@zBTPx, we get V ( x , 6) = -xTQx 5 0 (43) Therefore the linearized system defined by (34) and (40) is stable and x and p are bounded. Since (43) is positive semi-definite, in order to show the convergence of the X , we have to show the uniform continuity of V. Note that, since @* is bounded (due to the boundedness of z.), this implies that x is bounded as seen from (34). Hence, this shows that V is uniformly continuous, since its derivative V = -2xQX is bounded. Then based on the above Lemma [20], boundedness of V and uniform CO, and hence continuity of V imply that V -+ 0 as t x + 0 as t + 00. This completes the proof. The above result ensures the convergence of the tracking error vector x in the vicinity of a nominal solution. However note that the above theorem does not guarantee the convergence of the parameter error vector 6, although it shows its boundedness. Now let’s write the equations (34) and (40) which describe the linearized system, as, -+

lototT *T@*dt 2 y I

4.

Conclusion

ANN based robotic controllers have been receiving much attention in recent years due to their effective learning capabilities and high levels of autonomy. Although there are many reports on successful simulation results, there is not yet a well defined mathematical analysis for their stability and convergence. This is mainly due to the nonlinear nature of parameter update dynamics which stem from the architecture of the ANN models used in these controllers. In this paper, we investigated the local properties of an ANN based robotic controller which was previously tested successfully through simulation experiments. The results obtained in this paper are based on a linearized model of the closed loop system dynamics and demonstrate the local stability and convergence properties of the controller.

References [l] “IEEE Control Systems Magazine,” April 1988-19891990. Special Section on Neural Networks.

[2] D. Psaltis, A. Sideris, and A. A. Yamamura, “A multilayer neural Network Controller,” IEEE Control Systems Magazine, pp. 17-21, April 1988.

Equations similar to (44) arise in most adaptive control applications, and its asymptotic stability has been studied by several researchers [21, 22, 41. In [22], it is shown that the equation (44) is asymptotically stable, if the transfer function BTP(sI- A)-’B is strictly positive real (which is previously satisfied for this case) and if

PI 2

(46)

Finally, note that the choice of @* is related not only to the desired trajectory vector z* = ( q d T , q d T , G d T ) T , but also to the nominal parameters wt and v, as seen from (35). Thertfore a direct p.e. relation on the desired trajectory vector z+ is not trivial, and requires further research.

+

V(x,fi) = -xTQx

2 -/I

[3] M. Kawato, K. Furukawa, and R. Suzuki, “A hierarchical neural network model for control and learning of voluntary movement,” Biological Cybernetics, no. 57, pp. 169-185, 1987. [4] J. J. Craig, P. Hsu, and S. S. Sastry, “Adaptive

control of mechanical manipulators,” International Journal of Robotics Research, vol. 6, no. 2, pp. 1628, 1987.

(45)

holds for all t o , where p, y, and T are positive constants and I E 12(mtmk)x(mtmk)is the identity matrix. Hence, if the condition (45) is satisfied, then ficonverges asymp totically. The above condition means that the integral of @Fa.must be positive definite and bounded over all intervals of T. Equation (45) which is usually referred to as the persistent excitation (p.e.) condition states that @* must vary sufficiently over the interval T such that the entire ( m mk) dimensional space (i.e. the parameter space of the ANN model) is spanned to ensure the convergence of fi [4’.

[5] A. Guez and J. Selinsky, “A trainable neuromorphic controller,” Journal of Robotic Systems, vol. 5, no. 4, pp. 363-388, 1988. [6] D. F. Bassi and G. A. Bekey, “High precision position control by Cartesian trajectory feedback and connectionist inverse dynamics feedforward,” in Proc. IEEE Int. Conf. on Neural Neural Networks, vol. 2, (Washington D.C.), pp. 325-331, 1989.

+

[7] W. T. Miller, F. H. Glanz, and L. G. Kraft, “Application of a learning algorithm to the control of

2055

robotic manipulators,” Int. Journal of Robotics Research, vol. 6, no. 2, pp. 84-98, 1987. [8] J. S. Albus, “A new approach to manipulator control: The Cerebellar Model Articulation Controller (CMAC),” Journal of Dynamic Systems, Measurement, and Control, pp. 220-227, 1975. [9] M. I. Jordan, “Generic constraints on underspecified target trajectories,” in Proc. IEEE Int. Conf. on Neural Neural Networks, (Washington D.C.), pp. 217-225, 1989. [IO] K. Ciliz and C. Igik, “Trajectory learning control of robotic manipulators using Neural Networks,” in IEEE International Symposium on Intelligent Control, (Fairfax, VA), 1990.

[ll] K. Ciliz, Artificial Neural Network Based Control of Nonlinear Systems With Application to Robotic Manipulators. PhD thesis, Syracuse University, Electrical Engineering Department, 1990. [12] R. P. Paul, Robot Manipulators: Mathematics, Programming and Control. Cambridge, Massachusetts: MIT Press, 1981. [13] A. J. Koivo, Fundamentals for Control of Robotic Manipulators. New York: John Wiley & Sons, 1989. [14] J. E. Slotine and W. Li, “On the adaptive control of robot manipulators,” Int. Journal of Robs. Research, vol. 6, no. 3, pp. 49-59, 1987. [15] K. C. Funahashi, “On the approximate realization of continuous mappings by Neural Networks,” Neural Networks, vol. 2, pp. 183-192, 1989. [16] G. Cybenko, “Approximations by superpositions of a sigmoidal function,” tech. rep., University of Illinois, Center for Supercomputer Research and Development, Urbana, IL, 1989. [17) K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Networks, vol. 2, pp. 359-366, 1989. [18] P. Khosla and V. D. Tourassis, “Adaptive control strategies for robot manipulators,” in Proc. 30th Midwest Symposium on Circuits and Systems, (Syracuse, N.Y.), 1987. [19] C. A. Desoer and M. Vidyasagar, Feedback Systems : Input- Output Properties. New York, N Y Academic Press, 1975. [20] J. J. E. Slotine and W. Li, Applied Nonlinear Control. Englewood Cliffs, N.J.: Prentice Hall, 1991. [21] K. S. Narendra and A. M. Annaswamy, “Persistent excitation and adaptive systems,” Tech. Report No.8302, Center for System Science, Yale University, New Haven, Conn., 1983. [22] K. S. Narendra and A. P. Morgan, “On the uniform asymptotic stability of certain linear nonautonomous differential equations,” SIAM Jour. Cont. and Opt., no. 15, 1977.

2056

Suggest Documents