Proceedings of the 4th International Conference on Autonomous Robots and Agents, Feb 10-12, 2009, Wellington, New Zealand
Neural-Adaptive Control of Robotic Manipulators Using a Supervisory Inertia Matrix

Dean Richert, Arash Beirami, and Chris J.B. Macnab
Dept. of Electrical and Computer Engineering, University of Calgary, Calgary, Alberta, Canada
Email: [email protected], [email protected]

Abstract: This paper presents a novel neural-adaptive method for controlling a two-link robotic manipulator. The method does not resort to estimating the inverse dynamics. Instead, the control utilizes the full dynamic model estimate, including an inertia matrix estimate, which is referred to as a forward dynamics approach. Our novel contribution is to use an inertia matrix estimate to supervise the training of the neural networks. We find this overcomes the practical difficulties typically encountered with the forward dynamics method. The proposed method greatly improves performance over the forward dynamics approach, as verified in experiment. The method is robust to changes in the real inertia matrix caused by a payload, even though the supervisory inertia matrix remains constant.

Index Terms: Neural Network Control, Adaptive Control, Robotic Manipulator, Inverse Dynamics, Forward Dynamics, Cerebellar Model Articulation Controller
I. INTRODUCTION

Traditional control methods require near-complete knowledge of a robot's nonlinear dynamics in order to generate effective control signals. Obtaining accurate model parameters costs time and money, and the parameters lose accuracy over time. Not only do the dynamics change when motor gears and joints deteriorate, but a payload also introduces significant variability. Adaptive control schemes solve many of these problems by adjusting parameters online based on error feedback. Adaptive schemes typically rely on an inverse-dynamics approach in the control, as the inertia-matrix inverse is not linear in its parameters. However, many advanced adaptive control approaches require a complete dynamic estimate of the system including the inertia matrix (for example, backstepping with tuning functions or feedback linearization).

This paper presents a neural-adaptive method to estimate a robot arm's dynamics and provide control signals. Associative memory neural networks attempt to cancel out nonlinearities. The Cerebellar Model Articulation Controller (CMAC) algorithm indexes the (local) basis functions [1]. Most often, neural-adaptive methods generate control signals based on an inverse-dynamics approach following the adaptive-controller derivations [2],[3],[4],[5]. This method is relatively simple and effective because the weight updates always have the same sign as the state error, and the resulting increase or decrease in control torque is consistent with reducing the error. Neural networks may even be able to adapt to significant payloads [6], although more advanced methods are probably required to achieve quick adaptation with quantifiable performance [7].

We first examine straightforward inverse-dynamics and forward-dynamics approaches. With the forward dynamics approach, the inertia-matrix estimate updates no longer have the same sign as the state error. Our new method utilizes a supervisory inertia matrix to guide the training of these estimates, so that the estimates remain much more reasonable and consistent with reducing the state error [8]. This paper presents experimental results to validate the method and to show that the supervisory inertia matrix need not be accurate. This allows the method to work better than the inverse-dynamics approach even when the robot carries a significant payload.

978-1-4244-2713-0/09/$25.00 ©2009 IEEE

II. BACKGROUND

Consider the dynamics of a planar (horizontal) $n$-link robot manipulator

$$M(q)\ddot{q} = F(q,\dot{q}) + \tau, \tag{1}$$

where $\tau \in \mathbb{R}^n$ contains control torques, $q(t) \in \mathbb{R}^n$ is a vector of link angles, $M(q) \in \mathbb{R}^{n\times n}$ is the inertia matrix, and $F(q,\dot{q}) \in \mathbb{R}^n$ is a vector of Coriolis, centripetal, and damping terms. Given the desired trajectory $q_d(t)$, $\dot{q}_d(t)$, $\ddot{q}_d(t)$, the tracking errors are $e_1(t) = q_1 - q_d(t)$ and $e_2(t) = q_2 - \dot{q}_d(t)$, with states $q_1 = q$ and $q_2 = \dot{q}$, and thus the error dynamics become

$$\dot{e}_2 = M^{-1}(q)F(q,\dot{q}) + M^{-1}(q)\tau - \ddot{q}_d(t). \tag{2}$$
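As a brief worked step, and assuming the state convention $q_1 = q$, $q_2 = \dot{q}$ (so that $e_2 = \dot{q} - \dot{q}_d$), the error dynamics (2) follow from (1) by differentiating $e_2$ and solving (1) for $\ddot{q}$. The inversion is possible because the inertia matrix is symmetric positive definite:

```latex
\begin{align*}
\dot{e}_2 &= \ddot{q} - \ddot{q}_d(t) \\
          &= M^{-1}(q)\left[F(q,\dot{q}) + \tau\right] - \ddot{q}_d(t) \\
          &= M^{-1}(q)F(q,\dot{q}) + M^{-1}(q)\tau - \ddot{q}_d(t).
\end{align*}
```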
Calculating a control based on the robot's inverse dynamics results in

$$\tau = -F(q,\dot{q}) + M(q)\ddot{q}_d(t) - G_1 e_1 - G_2 e_2, \tag{3}$$

where the constant gain matrices $G_1$, $G_2$ are positive definite. When a neural network estimates the nonlinear terms, the inverse-dynamics control becomes

$$\tau = -\hat{C}_I(q,\dot{q},\ddot{q}_d,w_I) - G_1 e_1 - G_2 e_2, \tag{4}$$

where $\hat{C}_I$ is an $n \times 1$ vector of CMAC outputs and $w_I$ is a vector of weights in the neural network.

A forward-dynamics approach uses the control and the corresponding neural control

$$\tau = M(q)\left[-M^{-1}(q)F(q,\dot{q}) + \ddot{q}_d(t) - G_1 e_1 - G_2 e_2\right], \tag{5}$$

$$\tau = \hat{C}_M^{-1}(q,w_M)\left[-\hat{C}_F(q,\dot{q},w_F) + \ddot{q}_d(t) - G_1 e_1 - G_2 e_2\right], \tag{6}$$

where $\hat{C}_M$ is an $n \times n$ matrix of CMAC outputs and $\hat{C}_F$ is $n \times 1$. The control torque calculated using a forward dynamics approach requires models of both $M$ and $M^{-1}F$. This requires $n$ CMAC models for $M^{-1}F$, with additional CMAC models required for $M$. Exploiting the symmetric positive definite characteristic of the inertia matrix, 3 additional CMAC models must be trained for a 2-link robot manipulator. The inertia matrix poses a challenge to the online training of CMACs. This challenge is addressed by using a supervisory learning term in the training of $\hat{C}_M$.

Fig. 1. Two-link robot arm.

The CMAC consists of $r$-dimensional hypercubes, where $r$ is the number of inputs. Each basis function is confined to its own local hypercube domain. The CMAC structure consists of $m$ offset layers of hypercubes, and the input indexes one hypercube on each layer. As in the original CMAC scheme, a hash-coding algorithm ensures a reasonable size of memory allocation. In our experiments all CMACs use 50 layers and binary basis functions.

The results in Section IV compare three different control schemes. Methods 1 and 2 are the traditional inverse and forward dynamics methods, respectively. Method 3 is our proposed scheme, where the inertia-matrix estimates in the forward dynamics are supervised by constant estimates.

Fig. 2. Comparing the performance of the three methods. (Plot: End Effector Position RMS Error (mm) versus Time (sec) for Method 1, Method 2, and Method 3.)

A. Inverse Dynamics (Method 1)

Method 1 is the conventional approach to neural-adaptive control, where the control signal is given by (4) and is determined based on the inverse dynamics of the plant. Such a method requires two CMAC neural networks to estimate $\hat{C}_I(q,\dot{q},\ddot{q}_d)$ for our two-link manipulator. A sum of weighted basis functions provides the ideal CMAC output

$$C_I(x,w_I) = \begin{bmatrix} \phi_I^T(x)\,w_1 \\ \phi_I^T(x)\,w_2 \end{bmatrix}, \tag{7}$$

with input $x = [q,\dot{q},\ddot{q}_d]^T$, basis functions $\phi_I(x) \in \mathbb{R}^{1\times m}$, and ideal weights $w_i \in \mathbb{R}^m$. (Terms like $\hat{C}$ and $\hat{w}$ will refer to the real CMAC outputs and weights, which are considered estimates of the idealizations $C$ and $w$.) Like all associative memories, a CMAC can uniformly approximate our inverse-dynamic nonlinear functions $f(x)$ in a local subspace of $\mathbb{R}^r$ such that

$$f(x) = C_I(x,w) + d(x), \qquad \|d(x)\| \le d_{\max}, \tag{8}$$
where $d_{\max}$ is a positive constant that bounds the approximation error. The stable weight updates, which result in uniformly ultimately bounded signals according to a Lyapunov stability analysis, are

$$\dot{\hat{w}}_i = \beta_i\left(\phi_I(x)\,s - \nu\hat{w}_i\right), \quad \text{for } i = 1, 2, \tag{9}$$

where $s = \Lambda e_1 + e_2$ is a sliding surface. Proof of stability is provided in Appendix A.

B. Forward Dynamics (Method 2)

For our 2-link manipulator experiment, five CMAC outputs estimate the nonlinear terms as follows:

$$M^{-1} = \begin{bmatrix} \phi_M^T w_1 + d_1 & \phi_M^T w_2 + d_2 \\ \phi_M^T w_2 + d_2 & \phi_M^T w_3 + d_3 \end{bmatrix} = C_M + D, \tag{10}$$

$$M^{-1}F = \begin{bmatrix} \phi_F^T w_4 + d_4 \\ \phi_F^T w_5 + d_5 \end{bmatrix} = C_F + d, \tag{11}$$

where $D$ and $d$ contain CMAC approximation errors. Again, stable weight updates are determined based on Lyapunov stability analysis. For the forward dynamics approach, these weight updates become

$$\dot{\hat{w}}_1 = \beta_1\left(\phi_M (\tau z^T P B)_{11} - \nu\hat{w}_1\right), \tag{12}$$

$$\dot{\hat{w}}_2 = \beta_2\left(\phi_M (\tau z^T P B)_{12} - \nu\hat{w}_2\right), \tag{13}$$

$$\dot{\hat{w}}_3 = \beta_3\left(\phi_M (\tau z^T P B)_{22} - \nu\hat{w}_3\right), \tag{14}$$

$$\dot{\hat{w}}_4 = \beta_4\left(\phi_F (z^T P B)_{1} - \nu\hat{w}_4\right), \tag{15}$$

$$\dot{\hat{w}}_5 = \beta_5\left(\phi_F (z^T P B)_{2} - \nu\hat{w}_5\right). \tag{16}$$

The control is based on (6), the forward dynamics, but is augmented with a nonlinear damping term for stability, as seen in Appendix B:

$$\tau = \hat{C}_M^{-1}\left(-\hat{C}_F + \ddot{q}_d - G_1 e_1 - G_2 e_2 - \alpha\|u_w\|^2 B^T P^T z\right).$$
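The CMAC structure described in Section II and the leakage-type update (9) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `BinaryCMAC`, the memory size, the number of cells per input, the input range, the uniform layer offsets, and the use of Python's built-in tuple hash are all hypothetical choices. The update uses a scalar `s_i`, assumed here to be the component of the sliding surface $s$ that drives the $i$-th network, and (9) is discretized with a simple Euler step.

```python
import numpy as np

class BinaryCMAC:
    """Sketch of a CMAC: m offset layers, binary basis functions, hashed weights."""

    def __init__(self, n_layers=50, cells_per_dim=10, memory_size=4096,
                 lo=-1.0, hi=1.0):
        self.m = n_layers              # 50 layers, as stated in the text
        self.cells = cells_per_dim     # hypothetical resolution per input
        self.size = memory_size        # hypothetical hashed memory size
        self.lo, self.hi = lo, hi      # hypothetical input range
        self.w = np.zeros(memory_size)
        # Assumption: layer k is shifted by k/m of a cell along every input.
        self.offsets = np.arange(n_layers) / n_layers

    def active_cells(self, x):
        """Return the hashed index of the one hypercube activated on each layer."""
        u = (np.asarray(x, dtype=float) - self.lo) / (self.hi - self.lo) * self.cells
        idx = []
        for k in range(self.m):
            cell = np.floor(u + self.offsets[k]).astype(int)
            # Hash coding keeps memory bounded; integer tuples hash deterministically.
            idx.append(hash((k, *cell.tolist())) % self.size)
        return np.array(idx)

    def output(self, x):
        """phi(x)^T w with binary basis functions (sum of the activated weights)."""
        return float(self.w[self.active_cells(x)].sum())

    def update(self, x, s_i, beta, nu, dt):
        """One Euler step of (9): w_dot = beta * (phi * s_i - nu * w)."""
        phi = np.zeros(self.size)
        np.add.at(phi, self.active_cells(x), 1.0)   # accumulates on hash collisions
        self.w += dt * beta * (phi * s_i - nu * self.w)


cmac = BinaryCMAC()
x = [0.1, -0.2, 0.3]          # stands in for one sample of [q, q_dot, q_ddot_d]
cmac.update(x, s_i=1.0, beta=1.0, nu=0.0, dt=0.1)
```

With `nu = 0` the activated weights simply integrate the error signal; a positive `nu` adds the leakage term of (9), which pulls every weight back toward zero and keeps the weights bounded.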
With the forward dynamics, additional neural networks are required to fully describe the forward dynamics of the robot (one for each distinct term in the inertia matrix). Also note that the stable weight updates for the inertia matrix model contain the actual control signal. This control signal is subsequently multiplied by the error term, resulting in a weight update lacking a simple relationship to the sign of the error.

Fig. 3. Method 1 (inverse dynamics) CMAC estimates. (Panels: $C_{I1}$, $C_{I2}$; actual versus CMAC estimate; horizontal axis: Time (sec).)

III. PROPOSED METHOD (METHOD 3)

In order to improve the weight update law from Method 2, a supervisory inertia matrix is utilized. A Lyapunov stability analysis for this approach is provided in Appendix B. The weight updates become

$$\dot{\hat{w}}_1 = \beta_1\left\{\phi_M\left[(\tau z^T P B)_{11} + \gamma_1(g_1 - \phi_M^T\hat{w}_1)\right] - \nu\hat{w}_1\right\}, \tag{17}$$

$$\dot{\hat{w}}_2 = \beta_2\left\{\phi_M\left[(\tau z^T P B)_{12} + (\tau z^T P B)_{21} + \gamma_2(g_2 - \phi_M^T\hat{w}_2)\right] - \nu\hat{w}_2\right\}, \tag{18}$$

$$\dot{\hat{w}}_3 = \beta_3\left\{\phi_M\left[(\tau z^T P B)_{22} + \gamma_3(g_3 - \phi_M^T\hat{w}_3)\right] - \nu\hat{w}_3\right\}, \tag{19}$$

$$\dot{\hat{w}}_4 = \beta_4\left\{\phi_F (z^T P B)_{1} - \nu\hat{w}_4\right\}, \tag{20}$$

$$\dot{\hat{w}}_5 = \beta_5\left\{\phi_F (z^T P B)_{2} - \nu\hat{w}_5\right\}, \tag{21}$$
where $g_1$, $g_2$, and $g_3$ are the supervisory inertia matrix inverse values, calculated using the standard 2-link planar robot description [9] based on parameter values provided by the manufacturer. Our experiments show the new method is robust to large uncertainties in the supervisory inertia matrix caused by a payload.
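The supervised updates (17)-(21) can be sketched as one Euler-discretized step in Python. This is an illustrative sketch, not the authors' code: the function name `method3_weight_step`, the step size, the gains, and the random binary basis vector are hypothetical, and $z = [e_1\ e_2]^T$ is assumed to be a 4-vector for the two-link arm with $B = [0\ \ I]^T$. The demonstration sets the tracking error to zero and starts the weights at zero (the experiments instead start $\hat{C}_M$ at the supervisory values), so that only the supervisory term acts and the CMAC outputs $\phi_M^T\hat{w}_i$ are pulled toward the $g_i$ values of (22).

```python
import numpy as np

def method3_weight_step(w, phi_M, phi_F, z, tau, P, B, g, beta, gamma, nu, dt):
    """One Euler step of (17)-(21).

    w     : list of five weight vectors [w1, ..., w5]
    g     : (g1, g2, g3), supervisory inverse-inertia values
    beta  : five adaptation gains; gamma: three supervisory gains
    """
    zPB = z @ P @ B                      # row vector z^T P B, shape (2,)
    E = np.outer(tau, zPB)               # tau z^T P B, shape (2, 2)
    drive = [E[0, 0], E[0, 1] + E[1, 0], E[1, 1]]
    new_w = []
    for i in range(3):                   # inertia-matrix networks, (17)-(19)
        supervise = gamma[i] * (g[i] - phi_M @ w[i])
        w_dot = beta[i] * (phi_M * (drive[i] + supervise) - nu * w[i])
        new_w.append(w[i] + dt * w_dot)
    for j, i in enumerate((3, 4)):       # M^{-1}F networks, (20)-(21)
        w_dot = beta[i] * (phi_F * zPB[j] - nu * w[i])
        new_w.append(w[i] + dt * w_dot)
    return new_w


# Demonstration with hypothetical sizes: 200 weights, 50 active binary cells.
rng = np.random.default_rng(0)
phi = np.zeros(200)
phi[rng.choice(200, size=50, replace=False)] = 1.0
P = np.eye(4)
B = np.vstack((np.zeros((2, 2)), np.eye(2)))
g = (13.0, -14.0, 38.0)                  # values from (22)
w = [np.zeros(200) for _ in range(5)]
for _ in range(200):
    w = method3_weight_step(w, phi, phi, z=np.zeros(4), tau=np.zeros(2), P=P, B=B,
                            g=g, beta=[1.0] * 5, gamma=[1.0] * 3, nu=0.0, dt=1e-3)
estimates = [float(phi @ w[i]) for i in range(3)]
```

With a nonzero tracking error the error-driven terms and the supervisory term act together, and the gains $\gamma_i$ set how strongly the inertia-matrix estimates are held near the supervisory values.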
Fig. 4. Method 2 (forward dynamics) CMAC estimates.
IV. RESULTS

Our experiment uses a two-link robot, originally built with flexible joints but modified to have rigid joints for the purpose of this experiment (Fig. 1). The robot arm follows a smooth trajectory consisting of sinusoidal movements in angular coordinates. Each link moves with a maximum angular velocity of 11.46 deg/s and a maximum angular acceleration of 3.6 deg/s². The manufacturer provides model parameters. Link 1 has mass 1.5 kg and length 34.29 cm, while link 2 has mass 0.8733 kg and length 26.35 cm. Using these values to calculate an inertia matrix (see [2] for instance) at link angles equal to zero produces our estimate of $M^{-1}$ that supervises the training:

$$M^{-1}_{\text{supervisory}} = \begin{bmatrix} g_1 & g_2 \\ g_2 & g_3 \end{bmatrix} = \begin{bmatrix} 13 & -14 \\ -14 & 38 \end{bmatrix}\ (\mathrm{kg\,m^2})^{-1}. \tag{22}$$

This also provides the initial conditions for the $\hat{C}_M$ estimates. (The CMACs in $\hat{C}_F$ start at zero outputs.) When different payloads are added in the experiments, the supervisory values $g_1$, $g_2$, $g_3$ remain constant.

Method 1 significantly outperforms Method 2 for the given trajectory (Figure 2). Method 1 achieves a root-mean-square (RMS) error of 100 mm in the end-effector position, while Method 2 exhibits an RMS error of 444 mm. Method 1 results in more accurate CMAC estimates of the robot dynamics (Figure 3) than Method 2 (Figure 4). The estimates of the inertia matrix in Method 2 are especially poor (the main difficulty we are addressing in this paper). Our novel Method 3 outperforms the Method 2 forward dynamics, achieving an RMS error of 87 mm. Method 3 achieves fairly accurate estimates of the inertia matrix, which is not surprising since these estimates are essentially undergoing supervised learning in this case. The performance of Method 3 is approximately equivalent to that of the Method 1 inverse dynamics approach (although the error is slightly less, the torque is slightly greater).

Fig. 5. Method 3 (novel method) CMAC estimates. (Panels: $M^{-1}_{11}$, $M^{-1}_{12}$ or $M^{-1}_{21}$, $M^{-1}_{22}$, $M^{-1}F_1$, $M^{-1}F_2$; actual versus CMAC estimate; horizontal axis: Time (sec).)

A. Robustness of proposed method

Method 3 must be robust to uncertainties in the supervisory inertia matrix in order to be practical. These uncertainties arise from both changing payloads and modeling errors. To test the robustness of the proposed method, we add various payloads to the end-effector: 0, 1, 2, and 3 kg. Meanwhile, the supervisory inertia matrix remains constant, using the value without payload of $m_2 = 0.8733$ kg. Method 2 does not work at all with the changing payloads (using the same gains) and is not included in the comparisons.

Method 3 adapts to changing payloads (Figure 6). In fact, there is very little noticeable difference in the RMS error when a payload of 3 kg is used compared with no payload (actually, the end-effector RMS error decreases by 0.16 mm between no payload and a 3 kg payload). Method 1, on the other hand, has a slight increase in error (6.5 mm) when a payload of 3 kg is used. Thus, Method 3 actually has a slight improvement even over the inverse dynamics approach in terms of robustness, as the average control effort for Method 3 remains roughly the same as the payload increases, unlike Method 1 (Figure 7). (The slightly better performance of Method 3 over Method 1 in all tests is explained by the fact that it uses slightly more torque.) Although the inverse dynamics requires no model parameters, the fact that only very limited information is needed for Method 3 makes it a practical alternative.

Fig. 6. Performance of Method 1 and Method 3 for different payloads. (Panels: 0 kg, 1 kg, 2 kg, 3 kg; vertical axis: End Effector RMS Error (mm); horizontal axis: Time (sec).)

Fig. 7. Torque of Method 1 and Method 3 for different payloads. (Panels: 0 kg, 1 kg, 2 kg, 3 kg; vertical axis: RMS Control Torque; horizontal axis: Time (s).)
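The end-effector RMS error reported in Section IV can be illustrated with a short Python sketch. The paper does not spell out how the RMS error or the sinusoidal trajectory was computed, so everything below is an assumption made for illustration: standard planar two-link forward kinematics with the second joint angle measured relative to the first link, an RMS taken over the Euclidean end-effector distance, and a single sinusoid $q_d(t) = A\sin(\omega t)$ per joint whose amplitude and frequency are inferred from the stated velocity and acceleration maxima. The function names are hypothetical.

```python
import numpy as np

L1, L2 = 0.3429, 0.2635          # link lengths in metres (Section IV)

def end_effector(q):
    """Planar two-link forward kinematics; q has shape (N, 2), in radians."""
    q = np.atleast_2d(np.asarray(q, dtype=float))
    x = L1 * np.cos(q[:, 0]) + L2 * np.cos(q[:, 0] + q[:, 1])
    y = L1 * np.sin(q[:, 0]) + L2 * np.sin(q[:, 0] + q[:, 1])
    return np.column_stack((x, y))

def rms_position_error_mm(q, q_d):
    """RMS Euclidean distance between actual and desired end-effector positions."""
    err = end_effector(q) - end_effector(q_d)
    return 1000.0 * np.sqrt(np.mean(np.sum(err ** 2, axis=1)))

# For q_d(t) = A sin(w t): max velocity = A w, max acceleration = A w^2.
v_max, a_max = 11.46, 3.6        # deg/s and deg/s^2 (Section IV)
omega = a_max / v_max            # rad/s
amplitude_deg = v_max / omega    # degrees
period = 2.0 * np.pi / omega     # seconds
```

Under the single-sinusoid assumption, the stated maxima correspond to a period of roughly 20 s and an amplitude of roughly 36.5 degrees per joint.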
V. CONCLUSIONS

We contrast the inverse dynamics and forward dynamics approaches for neural-network (adaptive) control of robotic manipulators, and identify the difficulties with using forward dynamics. To overcome these difficulties, the proposed method uses a supervisory inertia matrix to guide the weight updates when using a forward dynamics approach. Lyapunov techniques verify the stability properties. Experimental results with a two-link planar arm verify that the new approach greatly outperforms regular forward dynamics control and is slightly more robust to payload changes than inverse dynamics control. By successfully estimating the dynamics including the inertia matrix, this new control method allows the development of adaptive control schemes that require a complete dynamic model estimate, including the inertia matrix.
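APPENDIX A

This section establishes the uniform ultimate boundedness of signals using the inverse dynamics Method 1. The sliding surface is $s = \Lambda e_1 + e_2$ and the adaptive Lyapunov function is

$$V = \frac{1}{2}s^T M s + \frac{1}{2\beta}\tilde{w}^T\tilde{w}. \tag{23}$$

Taking the derivative, utilizing the dynamics in (1), gives

$$\dot{V} = s^T\left(M e_2 + F + \tau + M\ddot{q}_d + 0.5\dot{M}s\right) + \frac{1}{\beta}\tilde{w}^T(-\dot{\hat{w}}). \tag{24}$$

Replacing the nonlinear functions with neural network estimates as in (8) produces

$$\dot{V} = s^T(C_I + d + \tau) + \frac{1}{\beta}\tilde{w}^T(-\dot{\hat{w}}). \tag{25}$$

Rearranging gives

$$\dot{V} = s^T(\phi_I^T\hat{w} + d + \tau) + \tilde{w}^T\left(\phi_I s - \frac{\dot{\hat{w}}}{\beta}\right). \tag{26}$$

Choosing a control equivalent to (4),

$$\tau = -\phi_I^T\hat{w} - Gs, \tag{27}$$

and the weight updates (9) results in

$$\dot{V} = -s^T G s + s^T d + \nu\tilde{w}^T w - \nu\tilde{w}^T\tilde{w}. \tag{28}$$

Since $\dot{V}$ becomes less than zero if either $\|s\|$ or $\|\tilde{w}\|$ grows large enough, all signals are uniformly ultimately bounded.

APPENDIX B

This section establishes the uniform ultimate boundedness of signals using the proposed Method 3. Substituting Eq. (10) and (11) into (2) results in

$$\dot{e}_2 = C_F + d + C_M\tau + D\tau - \ddot{q}_d(t). \tag{29}$$

Using the relations $C_M = \hat{C}_M + \tilde{C}_M$ and $C_F = \hat{C}_F + \tilde{C}_F$, Eq. (29) can be rewritten as

$$\dot{e}_2 = \hat{C}_F + \tilde{C}_F + d + \hat{C}_M\tau + \tilde{C}_M\tau + D\tau - \ddot{q}_d(t). \tag{30}$$

The control term is chosen to be

$$\tau = \hat{C}_M^{-1}\left(-\hat{C}_F + \ddot{q}_d - G_1 e_1 - G_2 e_2 + u_{ND}\right), \tag{31}$$

where $u_{ND}$ is a robustifying term (nonlinear damping). Substituting Eq. (31) into (30) results in

$$\dot{e}_2 = -G_1 e_1 - G_2 e_2 + \tilde{C}_F + \tilde{C}_M\tau + u_{ND} + d + D\tau. \tag{32}$$

Using an error vector $z = [e_1\ \ e_2]^T$, the tracking errors are

$$\dot{z} = Az + B\left[\tilde{C}_F + \tilde{C}_M\tau + u_{ND} + d + D\tau\right], \tag{33}$$

where

$$A = \begin{bmatrix} 0 & I \\ -G_1 & -G_2 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ I \end{bmatrix}.$$

Consider the Lyapunov candidate

$$V = \frac{1}{2}z^T P z + \sum_{i=1}^{5}\frac{1}{2\beta_i}\tilde{w}_i^T\tilde{w}_i = x^T R x, \tag{34}$$

where $\tilde{w} = w - \hat{w}$, the positive definite matrix $P$ is the solution of the Lyapunov equation $A^T P + PA = -Q$ with positive definite $Q$, $x = [z^T\ \tilde{w}_1^T \cdots \tilde{w}_5^T]^T$, and $R = \mathrm{diag}(P, I/(2\beta))$. The positive quadratic function $x^T R x$ satisfies $\lambda_{\min}(R)\,x^T x \le x^T R x \le \lambda_{\max}(R)\,x^T x$. Taking the derivative of the Lyapunov function and substituting $\dot{z}$ from Eq. (33) results in

$$\dot{V} = -z^T Q z + z^T P B\left[\tilde{C}_F + \tilde{C}_M\tau\right] + \sum_{i=1}^{5}\frac{1}{\beta_i}\tilde{w}_i^T\dot{\tilde{w}}_i + \zeta,$$

where $\zeta = z^T P B\left[D\tau + d + u_{ND}\right]$. Assuming the CMACs are all the same ($\phi_M = \phi_F = \phi$) and letting $z^T P B = [\epsilon_4\ \ \epsilon_5]$ results in

$$\dot{V} = -z^T Q z + \sum_{i=1}^{5}\frac{1}{\beta_i}\tilde{w}_i^T\dot{\tilde{w}}_i + \zeta + [\epsilon_4\ \ \epsilon_5]\left(\begin{bmatrix}\phi^T\tilde{w}_4 \\ \phi^T\tilde{w}_5\end{bmatrix} + \begin{bmatrix}\phi^T\tilde{w}_1\tau_1 + \phi^T\tilde{w}_2\tau_2 \\ \phi^T\tilde{w}_2\tau_1 + \phi^T\tilde{w}_3\tau_2\end{bmatrix}\right)$$

$$\phantom{\dot{V}} = -z^T Q z + \tilde{w}_4^T\phi\epsilon_4 + \tilde{w}_5^T\phi\epsilon_5 + \tilde{w}_1^T\phi\epsilon_4\tau_1 + \tilde{w}_2^T\phi\epsilon_4\tau_2 + \tilde{w}_2^T\phi\epsilon_5\tau_1 + \tilde{w}_3^T\phi\epsilon_5\tau_2 + \sum_{i=1}^{5}\frac{1}{\beta_i}\tilde{w}_i^T\dot{\tilde{w}}_i + \zeta.$$

Let

$$\tau z^T P B = \begin{bmatrix}\epsilon_1 & \epsilon_{12} \\ \epsilon_{21} & \epsilon_3\end{bmatrix}$$

and $\epsilon_2 = \epsilon_{12} + \epsilon_{21}$ to get

$$\dot{V} = -z^T Q z + \tilde{w}_4^T\phi\epsilon_4 + \tilde{w}_5^T\phi\epsilon_5 + \tilde{w}_1^T\phi\epsilon_1 + \tilde{w}_2^T\phi\epsilon_2 + \tilde{w}_3^T\phi\epsilon_3 - \sum_{i=1}^{5}\frac{1}{\beta_i}\tilde{w}_i^T\dot{\hat{w}}_i + \zeta$$

$$\phantom{\dot{V}} = -z^T Q z + \sum_{i=1}^{5}\tilde{w}_i^T\left(\eta_i - \beta_i^{-1}\dot{\hat{w}}_i\right) + \zeta, \tag{35}$$

where $\eta_i = \phi\epsilon_i$. Choose the weight update to be

$$\dot{\hat{w}}_i = \beta_i\left(\eta_i - \nu\hat{w}_i + \gamma_i\phi(g_i - \phi^T\hat{w}_i)\right), \quad i = 1, 2, \cdots, 5,$$

where $g_i$ are the parameters representing the estimate of the mass matrix:

$$\hat{M}^{-1} = \begin{bmatrix} g_1 & g_2 \\ g_2 & g_3 \end{bmatrix}.$$

Substituting the weight update into (35), and letting $w = [w_1^T \cdots w_5^T]^T$, results in

$$\dot{V} = -z^T Q z + \nu\tilde{w}^T\hat{w} - \tilde{w}^T k + \tilde{w}^T S\hat{w} + z^T P B\left[D\tau + d + u_{ND}\right],$$

where $k = [k_1^T \cdots k_5^T]^T$, $k_i = \gamma_i\phi g_i$, $i = 1, 2, \cdots, 5$, and $S = \mathrm{diag}(C_1 \cdots C_5)$ with $C_i = \gamma_i\phi\phi^T I_{n\times n}$. Let $u_w = -\hat{C}_F + \ddot{q}_d - G_1 e_1 - G_2 e_2$ denote the nominal part of the control (31), so that $\tau = \hat{C}_M^{-1}(u_w + u_{ND})$. Then

$$\dot{V} = -z^T Q z + \nu\tilde{w}^T w - \nu\tilde{w}^T\tilde{w} - \tilde{w}^T k + \tilde{w}^T S w - \tilde{w}^T S\tilde{w} + z^T P B D\hat{C}_M^{-1}u_w + z^T P B D\hat{C}_M^{-1}u_{ND} + z^T P B d + z^T P B u_{ND}. \tag{36}$$

Choose the robust control

$$u_{ND} = -\alpha\|u_w\|^2 B^T P^T z. \tag{37}$$

Substituting (37) into (36) results in

$$\dot{V} = -z^T Q z + z^T P B d - \alpha\|u_w\|^2 z^T P B\left(D\hat{C}_M^{-1} + I\right)B^T P^T z + z^T P B D\hat{C}_M^{-1}u_w + \nu\tilde{w}^T w - \nu\tilde{w}^T\tilde{w} - \tilde{w}^T k + \tilde{w}^T S w - \tilde{w}^T S\tilde{w}.$$

Since the modeling errors $D$ and $d$ are bounded ($\lambda_{\max}(D) \le d_{1\max}$, $\|d\| \le d_{2\max}$), the upper bound of the derivative of the Lyapunov function can be expressed as

$$\dot{V} \le -\lambda_{\min}(Q)\|z\|^2 + \|z\|\lambda_{\max}(PB)\,d_{2\max} + \|z\|\|u_w\|\lambda_{\max}(PB)\lambda_{\max}(D\hat{C}_M^{-1}) - \alpha\lambda_{\min}(PB)\lambda_{\min}(D\hat{C}_M^{-1} + I)\|u_w\|^2\|z\|^2 - \|\tilde{w}\|^2\left(\nu + \lambda_{\min}(S)\right) + \|\tilde{w}\|\left(\|w\|(\nu + \lambda_{\max}(S)) - \|k\|\right).$$

Matrices $Q$, $PB$, $S$ are positive definite, and $\nu > 0$. Thus, the only condition needed to ensure that $\dot{V}$ is negative outside of a compact region (and that all signals are uniformly ultimately bounded) is that the matrix $(D\hat{C}_M^{-1} + I)$ be positive definite. The following projection rule is used to make sure this condition is satisfied:

$$\dot{\hat{w}}_i = \begin{cases} 0 & \text{if } \lambda_{\min}(D\hat{C}_M^{-1} + I) \le 0, \\ \beta_i\left(\eta_i - \nu\hat{w}_i + \gamma_i\phi(g_i - \phi^T\hat{w}_i)\right) & \text{otherwise.} \end{cases}$$

REFERENCES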
[1] J. S. Albus, "A new approach to manipulator control: the cerebellar model articulation controller (CMAC)," J. Dyn. Syst., Meas., Contr., vol. 97, pp. 220–227, Sept. 1975.
[2] S. S. Ge, T. H. Lee, and C. J. Harris, Adaptive Neural Network Control of Robot Manipulators. New Jersey: World Scientific, 1998.
[3] M. Meng and X. Yang, "Real-time tracking control of robot manipulators with online learning based approach," in Proc. IEEE Canadian Conf. Elec. Comp. Eng., vol. 2, pp. 1035–1040, May 1999.
[4] L. Peng and W. Peng-Yung, "Neural-fuzzy control system for robotic manipulators," IEEE Control Systems Magazine, vol. 22, pp. 53–63, Feb. 2002.
[5] T. Lee, S. Kumarawadu, and J. Perng, "Direct-adaptive neurocontrol of robots with unknown nonlinearities and velocity feedback," in Proc. IEEE Int. Conf. Sys., Man, Cyber., vol. 3, pp. 2073–2077, Oct. 2005.
[6] J. Yegerlehner and P. Meckl, "Experimental implementation of neural network controller for robot undergoing large payload changes," in Proc. IEEE Int. Conf. Robot. Aut., vol. 2, pp. 744–749, 1993.
[7] P. Meckl and N. Nho, "Intelligent feedforward control and payload estimation for a two-link robotic manipulator," IEEE/ASME Trans. Mechatronics, vol. 8, pp. 277–282, June 2003.
[8] A. Beirami and C. Macnab, "Direct neural-adaptive control of robotic manipulators using a forward dynamics approach," in Proc. IEEE Canadian Conf. Elec. Comp. Eng., (Ottawa), pp. 363–367, May 2006.
[9] L. Sciavicco and B. Siciliano, Modelling and Control of Robot Manipulators. Springer, 1996.