Neural Network Reference Compensation Technique for Position Control of Robot Manipulators

Seul Jung and T.C. Hsia
Robotics Research Laboratory
Department of Electrical and Computer Engineering
University of California, Davis
Davis, CA 95616
e-mail: [email protected] and [email protected]

Abstract

In this paper, a new neural network technique for robot manipulator control is proposed. This technique, called the reference compensation technique (RCT), compensates for uncertainties in robot dynamics at the input trajectory level rather than at the joint torque level. The ultimate goal of the proposed technique is to achieve an ideal computed-torque controlled system. Compensating at the trajectory level carries several advantages over other neural network control schemes that compensate at the robot joint torques: first, the position tracking performance is better; second, the neural controller is more robust to feedback controller gain variations; finally, practical implementation is easy because the internal control algorithm need not be changed. Simulation studies have been conducted for various neural network structures and different training signals. The results show the superior performance of the RCT over other NN control schemes.

1 Introduction

Neural networks are known to be excellent nonlinear mapping devices and have been used in many areas. In particular, interest in neural network applications to control systems has increased enormously in the last few years as they have delivered successful results in various control applications. The merits of NNs most used in the control area are their learning and adaptation capabilities for nonlinear plants that are difficult to control with conventional linear controllers. Many neural control schemes based on the back-propagation training algorithm have been proposed to solve the identification and control problems of complex nonlinear systems by exploiting the nonlinear mapping abilities of the NN [1]. At the same time, extensive investigations have been carried out to design NN controllers for robot manipulators [2, 3, 4]. The fundamental approach to robot manipulator control is the model-based computed-torque control scheme, which suffers poor tracking performance under unknown uncertainties and model mismatches. The purpose of using a NN is to achieve an ideal computed-torque controlled robot system. Within this framework, the NN controller is used to generate auxiliary joint torques that compensate on-line for the uncertainties in the nonlinear robot dynamic model. Variations of this basic idea can be found in other designs [2, 4, 5]. A NN can also be trained off-line to model the nonlinear robot dynamics and then used to implement the computed-torque controller, with further on-line modifications to account for uncertainties [4]. However, off-line training is not desirable because a dynamical system cannot easily be learned completely; in other words, it is difficult to train a system fully with a finite set of data. In previous research, the authors proposed the feedforward neural control scheme shown in Figure 1.
The performance of that scheme has been compared with an adaptive control scheme and another neural network control scheme, and the results were superior [6]. Here, a conceptually new compensating technique is proposed. Its goal is the same as that of the feedforward neural control scheme: to achieve the ideal computed-torque controlled system. The proposed technique is called the reference compensation technique (RCT), which compensates for

uncertainties at the reference trajectory level rather than at the joint torque or control input level. The proposed RCT control structure is depicted in Figure 2. Compensation of robot model uncertainty is achieved by modifying the reference trajectory q_r using an appropriately trained NN serving as a nonlinear filter or compensator. The compensated trajectories q_r, \dot{q}_r, \ddot{q}_r become the desired reference inputs to the computed-torque robot control system. The ideal second-order trajectory tracking error equation v is used as the training signal. Several variations of this basic approach, depending on the neural controller structure and the training signal, are examined. In the following, we develop the NN controller training algorithms and present simulation results for on-line trajectory tracking. It is shown that the proposed control scheme is practical and that its performance is excellent.

2 Computed-Torque Control

The dynamic equation of an n degrees-of-freedom manipulator in joint space coordinates is given by:

D(q)\ddot{q} + h(q,\dot{q}) + f(\dot{q}) = \tau   (1)

where the vectors q, \dot{q}, \ddot{q} are the joint angle, joint velocity, and joint acceleration, respectively; D(q) is the n x n symmetric positive definite inertia matrix; h(q,\dot{q}) = C(q,\dot{q})\dot{q} + G(q), where C(q,\dot{q})\dot{q} is the n x 1 vector of Coriolis and centrifugal torques and G(q) is the n x 1 vector of gravitational torques; \tau is the n x 1 vector of joint actuator torques; and f(\dot{q}) is an n x 1 vector representing Coulomb and viscous friction forces. The computed-torque method is the basic approach to robot control when a nominal robot dynamic model is available. The control law can be written as

\tau(t) = \hat{D}(q)U(t) + \hat{h}(q,\dot{q})   (2)

where \hat{D}(q) and \hat{h}(q,\dot{q}) are estimates of D(q) and h(q,\dot{q}), and U(t) is given by

U(t) = \ddot{q}_r + K_D(\dot{q}_r - \dot{q}) + K_P(q_r - q)   (3)

where K_P and K_D are n x n symmetric positive definite gain matrices. Combining (1), (2), and (3) yields the closed-loop dynamic equation

\ddot{e} + K_D\dot{e} + K_P e = \hat{D}^{-1}(\Delta D(q)\ddot{q} + \Delta h(q,\dot{q}) + f(\dot{q}))   (4)

where \Delta D(q) = D(q) - \hat{D}(q), \Delta h(q,\dot{q}) = h(q,\dot{q}) - \hat{h}(q,\dot{q}), e = q_r - q, and q_r = q_d + \phi_p. If \hat{D} = D, \hat{h} = h, f = 0, and \phi = 0, then q_r = q_d and equation (4) becomes the ideal linear second-order equation of motion in error space:

\ddot{\varepsilon} + K_D\dot{\varepsilon} + K_P\varepsilon = 0   (5)

where \varepsilon = q_d - q. Since there are always uncertainties in the robot dynamic model, the ideal error response (5) cannot be achieved and the performance is degraded as indicated by (4). Thus computed-torque based control is not robust in practice. To improve robustness, a NN controller is proposed to compensate for the uncertainties as shown in Figure 2.
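As a concrete illustration of the control law (2)-(3), the sketch below applies computed torque to a single-link pendulum with a deliberately wrong mass estimate. The link parameters are illustrative stand-ins, not the paper's PUMA 560 model; only the PD gains follow Table 1.

```python
import numpy as np

# Illustrative single-link pendulum under computed-torque control.
m, l, g = 1.0, 0.5, 9.81           # true link mass, length, gravity
m_hat = 0.8                        # deliberately wrong mass estimate
Kp, Kd = 625.0, 50.0               # PD gains, as in Table 1

def D(q):         return m * l**2                  # true inertia
def h(q, qd):     return m * g * l * np.sin(q)     # true gravity torque
def D_hat(q):     return m_hat * l**2              # model estimate D^
def h_hat(q, qd): return m_hat * g * l * np.sin(q) # model estimate h^

dt, T = 0.001, 2.0
q, qd = 0.3, 0.0                   # start away from the reference
for _ in range(int(T / dt)):
    q_r, qd_r, qdd_r = 0.0, 0.0, 0.0                # constant reference
    U = qdd_r + Kd * (qd_r - qd) + Kp * (q_r - q)   # eq. (3)
    tau = D_hat(q) * U + h_hat(q, qd)               # eq. (2)
    qdd = (tau - h(q, qd)) / D(q)                   # plant dynamics, eq. (1)
    qd += qdd * dt                                  # semi-implicit Euler step
    q  += qd * dt
# q now sits near zero; a small residual remains because D^ != D, h^ != h,
# which is exactly the degradation described by eq. (4).
```

With exact model estimates the residual would vanish, recovering the ideal error dynamics (5).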

3 Neural Network Control

The inputs to the neural controller are the desired trajectories q_d, \dot{q}_d, \ddot{q}_d. The compensating signals from the neural compensator, \phi_a, \phi_d, \phi_p, are added to the desired trajectories. The control law is

\tau(t) = \hat{D}(\ddot{q}_d + \phi_a + K_D(\dot{\varepsilon} + \phi_d) + K_P(\varepsilon + \phi_p)) + \hat{h}   (6)

where \phi_a, \phi_d, and \phi_p are the outputs of the NN. Combining (6) with (1) yields

v = \ddot{\varepsilon} + K_D\dot{\varepsilon} + K_P\varepsilon = \hat{D}^{-1}(\Delta D\ddot{q} + \Delta h + f) - \Phi_A   (7)

where \Phi_A = \phi_a + K_D\phi_d + K_P\phi_p. Ideally, at v = 0, the ideal value of the NN output \Phi_A is

\Phi_A = \hat{D}^{-1}(\Delta D\ddot{q} + \Delta h + f)   (8)

which is exactly the uncertainty in the robot dynamic model. Let us call this result Scheme A.

4 Simplification of the NN Controller

The structure of the NN can be simplified by reducing the number of NN outputs. Compensating only the position and velocity trajectories reduces the number of hidden neurons required without losing much tracking accuracy. We call this two-state compensation scheme Scheme B. In this case, the compensating signals are introduced at the position and velocity levels, so the control law becomes

\tau(t) = \hat{D}(\ddot{q}_d + K_D(\dot{\varepsilon} + \phi_d) + K_P(\varepsilon + \phi_p)) + \hat{h}   (9)

and the closed-loop equation is

v = \hat{D}^{-1}(\Delta D\ddot{q} + \Delta h + f) - \Phi_B   (10)

where \Phi_B = K_D\phi_d + K_P\phi_p. The neural controller in Scheme B has to compensate for the same uncertainty signal as in Scheme A, but with one less degree of freedom for compensation. The controller can be simplified further by compensating only the position trajectory:

\tau(t) = \hat{D}(\ddot{q}_d + K_D\dot{\varepsilon} + K_P(\varepsilon + \phi_p)) + \hat{h}   (11)

with the closed-loop equation

v = \hat{D}^{-1}(\Delta D\ddot{q} + \Delta h + f) - \Phi_C   (12)

where \Phi_C = K_P\phi_p. This is called Scheme C. In Scheme C considerable performance degradation is expected because only the position is compensated.

In addition to the above schemes, different control structures can be formed depending on how the training signals are generated. From equation (10), a new training signal can be defined as

v_{PD} = K_D\dot{\varepsilon} + K_P\varepsilon = \hat{D}^{-1}(\Delta D\ddot{q} + \Delta h + f) - \ddot{\varepsilon} - \Phi_B   (13)

Here \Phi_B compensates for a more complicated signal than in (10). The training signal can be simplified further to

v_P = K_P\varepsilon = \hat{D}^{-1}(\Delta D\ddot{q} + \Delta h + f) - \ddot{\varepsilon} - K_D\dot{\varepsilon} - \Phi_B   (14)

Note that \Phi_B driven by (14) minimizes only the position errors; hence the corresponding velocity error is expected to be slightly larger.
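The three training signals are all formed from the same joint-space error measurements. A minimal sketch, using hypothetical error snapshots for a 3-link arm and the gain values of Table 1:

```python
import numpy as np

# Hypothetical instantaneous tracking errors for a 3-link arm (stand-ins,
# not values from the paper's simulation); gains follow Table 1.
Kp = np.diag([625.0, 625.0, 625.0])
Kd = np.diag([50.0, 50.0, 50.0])
eps    = np.array([0.01, -0.02, 0.005])   # position error      qd - q
eps_d  = np.array([0.10, -0.05, 0.020])   # velocity error      qd_dot - q_dot
eps_dd = np.array([0.50, -0.30, 0.100])   # acceleration error  qd_ddot - q_ddot

v    = eps_dd + Kd @ eps_d + Kp @ eps     # full 2nd-order signal, eq. (7)
v_PD = Kd @ eps_d + Kp @ eps              # eq. (13)
v_P  = Kp @ eps                           # eq. (14)
```

Dropping terms from v shifts them onto the uncertainty the network must absorb, which is why (13) and (14) leave more for \Phi_B to compensate.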

5 Neural Network Controller Design

Since we are minimizing the error function v, the objective function J can be defined as

J = \frac{1}{2} v^T v   (15)

Differentiating (15) yields the gradient

\frac{\partial J}{\partial w} = \frac{\partial v^T}{\partial w} v = -\frac{\partial \Phi^T}{\partial w} v   (16)

in which the fact \frac{\partial v^T}{\partial w} = -\frac{\partial \Phi^T}{\partial w} is used. The back-propagation updating rule for the weights with a momentum term is

\Delta w(t) = \eta \frac{\partial \Phi^T}{\partial w} v + \alpha \Delta w(t-1)   (17)

where \eta is the update rate and \alpha is the momentum coefficient.
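The update rule (17) can be sketched for a one-hidden-layer network with a scalar output \phi (one joint). Everything below, including the network size, rates, and the toy training target, is illustrative and not taken from the paper:

```python
import numpy as np

# One-hidden-layer network; weights move along (dPhi/dw)^T v with momentum,
# as in eq. (17). Sizes and rates are illustrative.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 8
W1 = rng.normal(0, 0.1, (n_hid, n_in)); b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.1, n_hid);         b2 = 0.0
eta, alpha = 0.01, 0.9                   # update rate eta, momentum alpha
dW1 = np.zeros_like(W1); db1 = np.zeros_like(b1)
dW2 = np.zeros_like(W2); db2 = 0.0

def forward(x):
    z = np.tanh(W1 @ x + b1)             # hidden activations
    return W2 @ z + b2, z                # scalar output phi

def train_step(x, v):
    """One online update of eq. (17): dw = eta*(dphi/dw)*v + alpha*dw_prev."""
    global W1, b1, W2, b2, dW1, db1, dW2, db2
    phi, z = forward(x)
    g2, gb2 = z, 1.0                     # dphi/dW2, dphi/db2
    gz = W2 * (1 - z**2)                 # back-propagate through tanh
    g1, gb1 = np.outer(gz, x), gz        # dphi/dW1, dphi/db1
    dW2 = eta * g2 * v + alpha * dW2;  W2 += dW2
    db2 = eta * gb2 * v + alpha * db2; b2 += db2
    dW1 = eta * g1 * v + alpha * dW1;  W1 += dW1
    db1 = eta * gb1 * v + alpha * db1; b1 += db1
    return phi

# Toy check: when v is the gap between a target and phi, phi tracks the target.
x, target = np.array([0.2, -0.1, 0.4]), 0.7
for _ in range(200):
    train_step(x, target - forward(x)[0])
```

Because the weights ascend the output gradient scaled by v, a positive v pushes \phi up and a negative v pushes it down, driving v toward zero; the momentum term smooths the on-line updates.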


6 Simulation Studies

6.1 Comparison Studies between Different Control Schemes

In order to demonstrate the performance of the proposed RCT, simulation studies are performed. The simulation model is a three-link rotary robot whose dynamic parameters are taken from the first three links of the PUMA 560 manipulator. The robot is required to follow a circular trajectory as shown in Figure 3: a circle of 30 cm diameter tilted 45 degrees in 3D Cartesian space, with a cycle time of 4 sec. To check the neural controller's robustness, the same tracking task is performed with two sets of controller gains: Table 1 lists the results for high feedback gains and Table 2 for low feedback gains. System performance is measured over the second cycle by

E_p = \sum_{t=P+\Delta}^{2P} \| q_d - q \|  (rad),
E_v = \sum_{t=P+\Delta}^{2P} \| \dot{q}_d - \dot{q} \|  (rad/sec),
E_c = \sum_{t=P+\Delta}^{2P} \sqrt{(x_d-x)^2 + (y_d-y)^2 + (z_d-z)^2}  (m)   (18)

where P = 4 sec is the cycle time and \Delta is the sampling time.

In this simulation study, we test the proposed neural controllers when the same training signal v is used for all controllers. As listed in Table 1, all neural controllers perform very well compared with the uncompensated case. The trajectories of Scheme A and the uncompensated case over two cycles (8 sec) are plotted together in Figure 3. The trajectory of Scheme A is initially oscillatory but converges to the reference trajectory within one second; the tracking error of the uncompensated controller is clearly visible. The performances of Scheme A and the feedforward scheme are also compared in Table 1, with the corresponding errors plotted in Figure 4; clear differences in performance are demonstrated. Among the proposed schemes, Schemes A and B are excellent and very comparable to each other, while Scheme C performs about 10 times worse.

Table 1. Circular tracking errors during the second iteration (\eta is optimized) under high feedback controller gains (K_D = diag[50 50 50], K_P = diag[625 625 625])

          Scheme A   Scheme B   Scheme C   Feedforward   Uncompensated
 \eta     0.0001     0.0002     0.00001    0.009         N/A
 \alpha   0.9        0.9        0.9        0.9           N/A
 E_p      0.00043    0.00028    0.00436    0.00822       0.9412
 E_v      0.82856    1.10915    0.63987    1.34556       1.6467
 E_c      0.000086   0.000055   0.000679   0.0013        0.2495

This result shows that Scheme B is the best choice in terms of performance and efficiency of the neural network structure. The distinctions between controllers become clearer with low controller gains; Table 2 lists those results. The poor tracking performance of the uncompensated case is clearly demonstrated in Figure 5, and the superior performance of Scheme A over the feedforward scheme is shown in Figure 6. Among the proposed controllers, the notably poor performance of Scheme C stands out, while Schemes A and B remain comparable.
As we can see from the results in Table 2, one-state compensation is not a good idea even though the neural controller structure becomes simpler, especially when the feedback controller gains are low. The results also show that at least two-state compensation, of position and velocity, is required for robustness to feedback gain variations as well as good tracking performance.

Table 2. Circular tracking errors during the second iteration (\eta is optimized) under low feedback controller gains (K_D = diag[10 10 10], K_P = diag[25 25 25])

          Scheme A   Scheme B   Scheme C   Feedforward   Uncompensated
 \eta     0.0045     0.0045     0.00001    0.01          N/A
 \alpha   0.9        0.9        0.9        0.9           N/A
 E_p      0.006155   0.00665    32.0186    0.7067        252.2535
 E_v      0.82415    0.083      80.1902    18.7449       108.6383
 E_c      0.00093    0.00097    6.4688     0.0977        69.7055
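The error measures in (18) are simple sums over the second cycle of logged trajectory samples. A minimal sketch, where the "measured" trajectory is a hypothetical time-lagged copy of the desired one standing in for simulation data:

```python
import numpy as np

# Performance measures of eq. (18), summed over the second cycle
# (t from P to 2P) of a P-second trajectory; data are stand-ins.
dt, P = 0.001, 4.0                        # sampling time and cycle time
t = np.arange(0.0, 2 * P, dt)
w = 2 * np.pi / P
qd = np.stack([np.sin(w*t), np.cos(w*t), 0.5*np.sin(w*t)], axis=1)
q  = np.stack([np.sin(w*(t-0.01)), np.cos(w*(t-0.01)),     # slight lag as
               0.5*np.sin(w*(t-0.01))], axis=1)            # a toy error

second = t >= P                           # samples of the second iteration
Ep = np.sum(np.linalg.norm(qd[second] - q[second], axis=1))           # rad
qd_dot = np.gradient(qd, dt, axis=0)
q_dot  = np.gradient(q, dt, axis=0)
Ev = np.sum(np.linalg.norm(qd_dot[second] - q_dot[second], axis=1))   # rad/s
# E_c is formed the same way from the Cartesian end-point coordinates.
```

Summing only over the second cycle skips the initial transient, so the measures reflect steady-state tracking quality.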

6.2 Comparison Studies between Different Training Signals

It is interesting to examine the performance obtained with different training signals. In the previous simulation studies, the second-order error signal v was used to train Schemes A, B, and C. In v, the acceleration term \ddot{q}(t) is obtained by finite differencing rather than measurement:

\ddot{q}(t) = \frac{\dot{q}(t) - \dot{q}(t - \Delta)}{\Delta}   (19)

Table 3 lists the circular trajectory tracking results of Scheme B with three different training signals: v, v_{PD} = K_D\dot{\varepsilon} + K_P\varepsilon, and v_P = K_P\varepsilon. It clearly shows that the performance with the training signal v is the best, followed by v_{PD} and then v_P. As (13) indicates, with the v_{PD} training signal the controller must compensate for the acceleration error \ddot{\varepsilon} in addition to the robot uncertainties, and acceleration terms are usually noisy. Furthermore, the network trained with v_P in (14) has to compensate for still more uncertainty.

Table 3. Performance of Scheme B with different training signals during the second iteration (\eta is optimized)

               High gains                      Low gains
          v         v_PD       v_P        v         v_PD      v_P
 \eta     0.0002    0.000009   0.000005   0.0045    0.0045    0.00001
 E_p      0.00028   0.00609    0.061715   0.006652  0.013218  76.2080
 E_v      1.10915   1.80222    7.8955     0.083     10.6534   189.6862
 E_c      0.000055  0.00106    0.012793   0.000974  0.003     13.0064
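The backward-difference estimate (19) is straightforward to implement; applying it to a known velocity signal lets its accuracy be checked against the analytic acceleration:

```python
import numpy as np

# Backward-difference acceleration estimate of eq. (19), applied to a
# known signal (qdot = cos t, so the true acceleration is -sin t).
dt = 0.001                               # sampling time Delta
t = np.arange(0.0, 1.0, dt)
qdot = np.cos(t)                         # sampled joint velocity

qddot = np.empty_like(qdot)
qddot[1:] = (qdot[1:] - qdot[:-1]) / dt  # (qdot(t) - qdot(t - dt)) / dt
qddot[0] = qddot[1]                      # no earlier sample at t = 0
```

The estimate carries an O(\Delta) bias and amplifies measurement noise by 1/\Delta, which is the noise sensitivity noted above for acceleration-dependent training signals.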

7 Conclusion

A new neural network compensating technique called the reference compensation technique (RCT) was proposed. The goal of the proposed technique was to achieve an ideal computed-torque controlled system by compensating for uncertainties at the input trajectory level. Sample circular trajectory tracking was executed for each controller scheme, and the results showed that the proposed scheme tracks better than the feedforward scheme. Among the proposed controller structures and training signals, the best choice is Scheme B with the v training signal. Compensating at the reference trajectory yields several advantages: first, tracking performance is much improved; second, the robustness of the neural controller to feedback gain variation is improved; finally, real-time control systems can be implemented easily without modifying internal control algorithms.

References

[1] K. Narendra and K. Parthasarathy, "Identification and control of dynamical systems using neural networks", IEEE Trans. on Neural Networks, vol. 1, pp. 4-27, 1990.
[2] A. Ishiguro, T. Furuhashi, S. Okuma, and Y. Uchikawa, "A neural network compensator for uncertainties of robot manipulator", IEEE Trans. on Industrial Electronics, vol. 39, pp. 61-66, December 1992.
[3] H. Miyamoto, M. Kawato, T. Setoyama, and R. Suzuki, "Feedback error learning neural network for trajectory control of a robotic manipulator", IEEE Trans. on Neural Networks, vol. 1, pp. 251-265, 1988.
[4] T. Ozaky, T. Suzuki, T. Furuhashi, S. Okuma, and Y. Uchikawa, "Trajectory control of robotic manipulators using neural networks", IEEE Trans. on Industrial Electronics, vol. 38, pp. 195-202, 1991.
[5] S. Jung and T. C. Hsia, "A new neural network control technique for robot manipulators", Robotica, vol. 13, pp. 477-484, 1995.
[6] S. Jung and T. C. Hsia, "On-line neural network control of robot manipulators", Proc. of International Conference on Neural Information Processing, pp. 1663-1668, Seoul, 1994.

Figure 1: Feedforward Neural Compensator Structure

Figure 2: The Proposed Neural Network Control Structure

Figure 3: End point trajectory with high PD gains: (a) uncompensated, (b) Scheme A, (c) reference

Figure 4: Joint position errors with high PD gains: Scheme A and feedforward scheme

Figure 5: End point trajectory with low PD gains: (a) uncompensated, (b) Scheme A

Figure 6: Joint position errors with low PD gains: Scheme A and feedforward scheme
