Nonlinear System Identification and Control Using Neural Networks
Swagat Kumar
Department of Electrical Engineering, Indian Institute of Technology, Kanpur. July 12, 2004
Intelligent Systems Laboratory
Synopsis

• Introduction
• System Identification with Memory Neuron Network (MNN)
• Lyapunov based training algorithm for FFN
• Robot Manipulators
• Manipulator Control
• Pendubot
• Summary
Introduction
• Artificial Neural Network
• System Identification
• Nonlinear Control
• Underactuated systems
• My work
Introduction contd...

Artificial Neural Network
• Inspired by the biological nervous system
• Local processing in artificial neurons (Processing Elements, PEs)
• Massively parallel processing implemented by a rich connection pattern between PEs
• Ability to acquire knowledge via learning/experience
• Knowledge stored in a distributed memory, the synaptic PE connections

Figure 1: An artificial neural network
Introduction contd...

System Identification
• Building good models of an unknown plant from measured data
• Involves two distinct steps:
  – Choosing a proper model
  – Adjusting the parameters of the model so as to minimize a fit criterion
• Neural networks are extensively used because of their good approximation capability

Figure 2: System identification model. The model output Y is compared with the plant output Yd, and the error e drives the adjustment of the model weights W.
Introduction contd...
Nonlinear Control
• Nonlinear systems
  – Difficult to model
  – No standard or generalized technique
  – May involve parameter uncertainty and variation
  – May have unstable zero dynamics
• Conventional techniques
  – Adaptive control
  – Robust and optimal control
  – Many others: backstepping, sliding mode, feedback linearization, singular perturbation, etc.
Introduction contd...
Underactuated Systems
• The number of actuators is less than the number of DOFs
• Some examples:
  – Flexible-link and flexible-joint manipulators
  – Inertia wheel pendulum, Pendubot, Acrobot, Furuta pendulum
  – Underwater vehicles, PVTOL, space structures with floating platforms
• Control techniques:
  – Energy based methods: Fantoni and Lozano, 2002
  – Passivity based methods
  – Partial feedback linearization: Spong, Ortega, Block, et al.
  – Backstepping and a variant of sliding mode control: Bapiraju, 2004
  – Coordinate transforms and various other methods: Saber, 2001
Introduction contd...
My Work
• Learning algorithms for MNN
• A new learning algorithm for FFN
• NN based robot manipulator control
• NN based control technique for the Pendubot
System Identification with MNN
• Feedforward and Recurrent Networks
• Memory Neuron Network
• Learning algorithms
  – Back Propagation Through Time (BPTT)
  – Real Time Recurrent Learning (RTRL)
  – Extended Kalman Filtering (EKF)
• Simulation results
• Comparison and inference
System identification with MNN contd...
Feed-forward networks
• Perform a static (point-to-point) mapping
• Do not preserve dynamics
• A priori knowledge of the exact order of the system is required
• All states should be measurable
• However, easy to train

Recurrent networks
• Capable of learning dynamics
• No a priori knowledge of the order of the system is required
• All states need not be available for measurement
• But computationally complex because of the feedbacks
System identification with MNN contd...
Memory Neuron Model
• Sastry et al., 1994
• A memory neuron is added to each network neuron to capture the dynamics of the recurrent network
• No need to store past values
• It converts a recurrent network into a feed-forward network
• Locally recurrent and globally feedforward in nature
• Easy to train as compared to a fully recurrent network

Figure 3: Structure of the memory neuron model. The network neuron (NN) output x = f(s) feeds its memory neuron (MN) through a gain α, and the memory neuron feeds itself back through (1 − α).
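To make the structure concrete, here is a minimal sketch of one plausible memory-neuron update, reading Figure 3 as an exponentially weighted trace of the parent neuron's past outputs; the exact recursion and the variable names are assumptions, not taken from Sastry et al.:

```python
import numpy as np

# Sketch of a single memory neuron: an exponentially weighted trace of
# its parent network neuron's past outputs (one reading of Figure 3).
def memory_neuron_step(x_prev, v_prev, alpha):
    return alpha * x_prev + (1.0 - alpha) * v_prev

# Example: the memory value tracks a slowly decaying history of x
v, alpha = 0.0, 0.3
for x in np.sin(np.linspace(0.0, 2.0 * np.pi, 10)):
    v = memory_neuron_step(x, v, alpha)
```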
System identification with MNN contd...
Memory Neuron Network

Figure 4: System identification model with MNN. The plant input u(k) and the delayed plant output y(k − 1) drive the memory neuron network, whose prediction ŷ(k) is compared with the plant output y(k) to produce the training error e(k).
System identification with MNN contd...
Learning Algorithms

Training algorithms for RNN:
• Back Propagation Through Time (BPTT): P. J. Werbos, 1990
• Real Time Recurrent Learning (RTRL): R. J. Williams and D. Zipser, 1990
• Extended Kalman Filter (EKF): R. J. Williams, 1992; Iiguni et al., 1992

What we have done:
• We used RTRL and EKF for the MNN (Sastry et al. used BPTT for this network).
• A comparative study of the above three algorithms has been made.
• We have tested the results on both SISO and MIMO systems.
System identification with MNN contd...
Back Propagation Through Time
• Extension of the back propagation algorithm
• The recurrent network is unfolded in time into a multilayer feedforward network, with a new layer added at every time step
• Offline technique, as we wait until we reach the end of the sequence
• Information is propagated in the backward direction for updating the weights:

Δwᵢ(t) = −η ∂E/∂wᵢ

• Slow convergence

Figure 5: Rolling the network in time. The weights w and g are shared across the unfolded copies at t = 0, 1, 2, ...
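For concreteness, a minimal sketch of the BPTT gradients for the scalar recurrent model used on the next slide, y(t + 1) = tanh(w·x(t) + g·y(t)); the tanh nonlinearity and the zero initial state are illustrative assumptions:

```python
import numpy as np

def bptt_grads(x_seq, yd_seq, w, g):
    # forward pass over the whole sequence, storing every state
    ys = [0.0]
    for x in x_seq:
        ys.append(np.tanh(w * x + g * ys[-1]))
    # backward pass through time: dE/dw, dE/dg for E = 0.5*sum(err^2)
    dw = dg = dy = 0.0
    for t in reversed(range(len(x_seq))):
        dy += ys[t + 1] - yd_seq[t]        # local error at step t
        ds = dy * (1.0 - ys[t + 1] ** 2)   # through tanh'
        dw += ds * x_seq[t]
        dg += ds * ys[t]
        dy = ds * g                        # propagate to the previous step
    return dw, dg                          # use as: w -= eta*dw; g -= eta*dg
```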
System identification with MNN contd...
Real Time Recurrent Learning

Plant output:

y(t + 1) = f(s(t + 1)),  where s(t + 1) = w x(t) + g y(t)

Cost function for a single output:

E = ½ (yᵈ(t + 1) − y(t + 1))²

Gradient descent:

wₙ = wₒ − η ∂E/∂w
∂E/∂w = −[yᵈ(t + 1) − y(t + 1)] ∂y(t + 1)/∂w
System identification with MNN contd...
RTRL contd...

Define

P_w(t + 1) = ∂y(t + 1)/∂w = [∂y(t + 1)/∂s(t + 1)] [∂s(t + 1)/∂w]
           = y′(t + 1) [x(t) + g P_w(t)]

so that

∂E/∂w = −[yᵈ(t + 1) − y(t + 1)] P_w(t + 1)

and we have the following recursion:

P_w(t + 1) = y′(t + 1) [x(t) + g P_w(t)]
System identification with MNN contd...
RTRL contd...

• Propagates information forward to compute the gradient term
• Online (real-time) training
• Increased computational complexity
• Requires a large number of training patterns
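A minimal sketch of the recursion applied online to the scalar model y(t + 1) = tanh(w·x(t) + g·y(t)); the companion recursion for the feedback weight g is derived the same way and is an assumption here, since the slide spells out only P_w:

```python
import numpy as np

def rtrl_identify(x_seq, yd_seq, eta=0.05, w=0.1, g=0.1):
    y = P_w = P_g = 0.0
    for x, yd in zip(x_seq, yd_seq):
        y_new = np.tanh(w * x + g * y)
        fprime = 1.0 - y_new ** 2        # tanh'(s)
        P_w = fprime * (x + g * P_w)     # recursion from the slide
        P_g = fprime * (y + g * P_g)     # analogous recursion for g (assumed)
        e = yd - y_new
        w += eta * e * P_w               # online gradient descent step
        g += eta * e * P_g
        y = y_new
    return w, g
```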
System identification with MNN contd...
Extended Kalman Filtering
• EKF is a state estimation method for nonlinear systems and can also be used for parameter estimation.
• A multilayered NN is a multi-input, multi-output nonlinear system with a layered structure, so its learning can be regarded as a parameter estimation problem.
• EKF applied to RNNs (Williams, 1992) shows a 6-10 times reduction in the number of presentations required for training.
• Increased computation time per iteration.
• Iiguni et al., 1992 proposed an online EKF training algorithm for the MLP.
System identification with MNN contd...
EKF contd...

System equations:

a(t) = a(t − 1)
yᵈ(t) = h[a(t)] + ε(t)

Update equations:

ŷ(t) = h[â(t − 1)]
â(t) = â(t − 1) + K(t)[yᵈ(t) − ŷ(t)]
K(t) = P(t − 1)Hᵀ(t)[H(t)P(t − 1)Hᵀ(t) + R(t)]⁻¹
P(t) = P(t − 1) − K(t)H(t)P(t − 1)

where H(t) = ∂h/∂a, K(t) is the Kalman gain, R(t) the covariance matrix of the measurement noise, and P(t) the error covariance matrix.
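A minimal sketch of one such update for a scalar-output network, treating the weight vector a as the state; the function handles h and H_fn (returning the Jacobian row ∂h/∂a) and the noise value R are illustrative assumptions:

```python
import numpy as np

# One EKF weight update for a scalar-output model yd = h(a) + noise,
# following the equations above; a_hat is the M-vector weight estimate.
def ekf_step(a_hat, P, yd, h, H_fn, R=0.01):
    H = H_fn(a_hat).reshape(1, -1)        # Jacobian dh/da, 1 x M
    S = H @ P @ H.T + R                   # innovation covariance (1 x 1)
    K = P @ H.T / S                       # Kalman gain, M x 1
    a_hat = a_hat + (K * (yd - h(a_hat))).ravel()
    P = P - K @ H @ P                     # error covariance update
    return a_hat, P
```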
System identification with MNN contd...
Simulation Results

Example 1: SISO plant

y_p(k + 1) = f(y_p(k), y_p(k − 1), y_p(k − 2), u(k), u(k − 1))

where

f(x₁, x₂, x₃, x₄, x₅) = [x₁ x₂ x₃ x₅ (x₃ − 1) + x₄] / (1 + x₃² + x₂²)
System identification with MNN contd...

SISO Plant 1

Figure: Desired vs. actual output of the SISO plant over 1000 time steps for (a) the BPTT algorithm, (b) the RTRL algorithm, and (c) the EKF algorithm.
System identification with MNN contd...
Example 2: MIMO plant

y_p1(k + 1) = 0.5 [ y_p1(k) / (1 + y_p2²(k)) + u₁(k) ]
y_p2(k + 1) = 0.5 [ y_p1(k) y_p2(k) / (1 + y_p2²(k)) + u₂(k) ]
System identification with MNN contd...

Figure: MIMO plant identification, desired vs. actual for output 1 and output 2 over 1000 data points, using (d)-(e) the BPTT algorithm, (f)-(g) the RTRL algorithm, and (h)-(i) the EKF algorithm.
System identification with MNN contd...
Comparative Analysis

Example           BPTT        RTRL        EKF
Ex. No. 1         0.013293    0.006088    0.006642
Ex. No. 2, o/p 1  0.005753    0.001414    0.002595
Ex. No. 2, o/p 2  0.008460    0.001258    0.001563

Table 1: Mean square error while identifying with MNN
System identification with MNN contd...
Conclusion

• MNN offers a simple way to convert a recurrent network into a feed-forward network, thereby simplifying implementation.
• BPTT takes a huge time for convergence, and the identification is poor.
  – Online implementation is not possible.
• RTRL shows the best approximation accuracy and also enables online training.
  – But it takes a long time to converge.
• EKF is the fastest among all three in terms of the number of training examples, with performance comparable to RTRL.
  – But it is computationally complex.
Lyapunov Based Training Algorithm

• Introduction
• Back Propagation Algorithm
• Lyapunov Based Approach
  – LF-I
  – LF-II
• Simulation Results
• Conclusion
Lyapunov Based Algorithm Contd ...
• Desirable features of learning algorithms for feed-forward networks
  – Locating the global minimum of the cost function
  – Fast convergence
  – Good generalization: learning from a minimum of examples
  – Low computational complexity: less training time
• Overview of present learning algorithms
  – Back Propagation Algorithm
    ∗ Most likely to find a local minimum
    ∗ Slow convergence
    ∗ Data sufficiency: training patterns should span the complete input-output space
    ∗ But easy to implement
Lyapunov Based Algorithm Contd ...
Introduction contd...

– Extended Kalman Filtering
  ∗ Fast convergence
  ∗ Fewer training patterns
  ∗ Computationally intensive
  ∗ Global convergence is not guaranteed
– Other algorithms (Newton's method, Levenberg-Marquardt)
  ∗ Fast algorithms
  ∗ Global convergence is not guaranteed
  ∗ Computationally intensive

• What we have done: a globally convergent algorithm using a Lyapunov function for feedforward neural networks.
Lyapunov Based Algorithm Contd ...
Back Propagation Algorithm

The Back Propagation algorithm is based on the gradient descent (GD) method, in which the weights are updated in the direction that reduces the error. The weight update rule is given by

Δwᵢ(t) = −η ∂E/∂wᵢ
Δw_ji(n) = η δⱼ(n) yᵢ(n)

where η is the learning rate and δⱼ(n) is the local gradient:

δⱼ(n) = eⱼ(n) ψ′ⱼ(vⱼ(n))                    if neuron j is an output node
δⱼ(n) = ψ′ⱼ(vⱼ(n)) Σₖ δₖ(n) w_kj(n)         if neuron j is a hidden node

where k runs over the neurons of the next layer.
Lyapunov Based Approach Contd...
Lyapunov Based Approach

Lyapunov Stability Criterion
• Used extensively in control system problems.
• If we choose a Lyapunov function candidate V(x(t), t) such that
  – V(x(t), t) is positive definite, and
  – V̇(x(t), t) is negative definite,
  then the system is stable. Moreover, if V̇(t) is uniformly continuous and bounded, then by Barbalat's lemma, V̇ → 0 as t → ∞.
• The problem lies in choosing a proper Lyapunov function candidate.
Lyapunov Based Approach Contd...
Figure 6: A feed-forward network with inputs x₁, x₂, weights W, and output y

Here W ∈ R^M is the weight vector. The training data consists of, say, N patterns {xᵖ, yᵖ}, p = 1, 2, ..., N.
Lyapunov Based Approach Contd...
The network output is given by

ŷᵖ = f(W, xᵖ),  p = 1, 2, ..., N        (1)

The usual quadratic cost function is

E = ½ Σₚ (yᵖ − ŷᵖ)²,  p = 1, ..., N        (2)

Let us choose a Lyapunov function candidate for the system as

V = ½ ỹᵀỹ        (3)

where ỹ = [y¹ − ŷ¹, ..., yᵖ − ŷᵖ, ..., y^N − ŷ^N]ᵀ.
Lyapunov Based Approach Contd...
The time derivative of the Lyapunov function V is given by

V̇ = −ỹᵀ (∂ŷ/∂W) Ẇ = −ỹᵀ J Ẇ        (4)

where J = ∂ŷ/∂W, J ∈ R^(N×M).

Theorem 1: If an arbitrary initial weight W(0) is updated by

W(t₀) = W(0) + ∫₀^t₀ Ẇ dt        (5)

where

Ẇ = (‖ỹ‖² / ‖Jᵀỹ‖²) Jᵀ ỹ        (6)

then ỹ converges to zero, under the condition that Ẇ exists along the convergence trajectory.
Lyapunov Based Approach Contd...
LF-I Algorithm

The weight update law given in equation (5) is a batch update law. The instantaneous LF-I learning algorithm can be derived as

Ẇ = (‖ỹ‖² / ‖Jᵢᵀỹ‖²) Jᵢᵀ ỹ        (7)

where ỹ = yᵖ − ŷᵖ ∈ R and Jᵢ = ∂ŷᵖ/∂W ∈ R^(1×M). The difference equation representation of the weight update is

Ŵ(t + 1) = Ŵ(t) + μ Ẇ(t)        (8)

where μ is a constant.
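A minimal sketch of one LF-I step, assuming a scalar-output network yhat_fn(W, x) whose gradient row Jᵢ is supplied by grad_fn; the small eps guard against a vanishing gradient is an implementation detail not on the slides:

```python
import numpy as np

# One instantaneous LF-I update, Eqs. (7)-(8): a gradient step whose
# effective learning rate adapts to the current error (see Eq. (10)).
def lf1_step(W, x, y, yhat_fn, grad_fn, mu=0.55, eps=1e-12):
    e = y - yhat_fn(W, x)               # scalar error y~
    Ji = grad_fn(W, x)                  # gradient row d yhat / d W
    g = Ji * e                          # Ji^T * y~  (M-vector)
    W_dot = (e**2 / (g @ g + eps)) * g  # Eq. (7)
    return W + mu * W_dot               # Eq. (8)
```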
Lyapunov Based Approach Contd...
Comparison with the BP Algorithm

In the gradient descent method we have

ΔW = −η ∂E/∂W = η Jᵢᵀ ỹ
Ŵ(t + 1) = Ŵ(t) + η Jᵢᵀ ỹ        (9)

The update equation for the LF-I algorithm is

Ŵ(t + 1) = Ŵ(t) + μ (‖ỹ‖² / ‖Jᵢᵀỹ‖²) Jᵢᵀ ỹ

Comparing the two equations, we find that the fixed learning rate η of the BP algorithm is replaced by its adaptive version

ηₐ = μ ‖ỹ‖² / ‖Jᵢᵀỹ‖²        (10)
Lyapunov Based Approach Contd...
Adaptive Learning Rate of LF-I

Figure 7: Adaptive learning rate for LF-I on the XOR problem, plotted against the number of iterations (4 × number of epochs)

Observations:
• The learning rate is not fixed, unlike in the BP algorithm.
• The learning rate goes to zero as the error goes to zero.
Lyapunov Based Approach Contd...
LF-II Algorithm

In order to ensure a smooth search in weight space, we consider a modified Lyapunov function candidate:

V = ½ (ỹᵀỹ + λ W̃ᵀW̃)        (11)

where W̃ = Ŵ − W = ΔW and λ is a constant. The time derivative of the Lyapunov function is given by

V̇ = −ỹᵀ (∂ŷ/∂W) Ẇ − λ W̃ᵀẆ = −ỹᵀ (J + D) Ẇ        (12)

where J = ∂ŷ/∂W is the N × M Jacobian matrix, ỹ̇ = −J Ẇ, and

D = (λ / ‖ỹ‖²) ỹ W̃ᵀ        (13)
Lyapunov Based Approach Contd...
LF-II Algorithm contd...

Theorem 2: If an arbitrary initial weight is updated by

W(t₀) = W(0) + ∫₀^t₀ Ẇ dt        (14)

where Ẇ is given by

Ẇ = (‖ỹ‖² / ‖(J + D)ᵀỹ‖²) (J + D)ᵀ ỹ        (15)

then ỹ converges to zero, under the condition that Ẇ exists along the convergence trajectory.

The instantaneous weight update equation using the modified Lyapunov function can be expressed as

Ŵ(t + 1) = Ŵ(t) + μ (‖ỹ‖² / ‖(Jᵢ + D)ᵀỹ‖²) (Jᵢ + D)ᵀ ỹ        (16)
Lyapunov Based Approach Contd...
Adaptive Learning Rate for the LF-II Algorithm

Figure 8: Adaptive learning rate for LF-II on the XOR problem
Lyapunov Based Approach Contd...
Smooth Search in Weight Space

Figure 9: Comparison of convergence time in terms of training epochs, over 50 runs, between LF-I and LF-II on (a) XOR, (b) 3-bit parity, and (c) the 4-2 encoder
Lyapunov Based Approach Contd...
Simulation Results

XOR

Algorithm   epochs   time (sec)   parameters
BP          5620     0.0578       η = 0.5
BP          3769     0.0354       η = 0.95
EKF         3512     0.1662       λ = 0.9
LF-I        165      0.0062       μ = 0.55
LF-II       109      0.0042       μ = 0.55, λ = 0.01

Table 2: Comparison among the three algorithms for the XOR problem
Lyapunov Based Approach Contd...
Simulation Results

Figure 10: Convergence time (seconds) over 50 runs for XOR, comparing BP, EKF and LF-II

Observation: LF takes almost the same time for any arbitrary initial condition.
Lyapunov Based Approach Contd...
Simulation Results

3-bit Parity

Algorithm   epochs   time (sec)   parameters
BP          12032    0.483        η = 0.5
BP          5941     0.2408       η = 0.95
EKF         2186     0.4718       λ = 0.9
LF-I        796      0.1688       μ = 0.53
LF-II       403      0.0986       μ = 0.45, λ = 0.01

Table 3: Comparison among the three algorithms for the 3-bit parity problem
Lyapunov Based Approach Contd...
Simulation Results

4-2 Encoder

Algorithm   epochs   time (sec)   parameters
BP          2104     0.3388       η = 0.5
BP          1141     0.1848       η = 0.95
EKF         1945     2.4352       λ = 0.9
LF-I        81       0.1692       μ = 0.29
LF-II       70       0.1466       μ = 0.3, λ = 0.02

Table 4: Comparison among the three algorithms for the 4-2 encoder problem
Lyapunov Based Approach Contd...
Simulation Results: System Identification

We consider the following system identification problem:

x(k + 1) = x(k) / (1 + x²(k)) + u³(k)        (17)

Training: 40000 data points, randomly generated between 0 and 1.
Test data: a sinusoidal input u = sin(t) for three periodic cycles.
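A short sketch of this data-generation setup (the zero initial state and the sampling density of the test signal are assumptions):

```python
import numpy as np

# Plant of Eq. (17): x(k+1) = x(k)/(1 + x(k)^2) + u(k)^3
def plant(x, u):
    return x / (1.0 + x**2) + u**3

rng = np.random.default_rng(0)
u_train = rng.uniform(0.0, 1.0, 40000)    # 40000 random inputs in [0, 1]
x, X, Y = 0.0, [], []
for u in u_train:
    x_next = plant(x, u)
    X.append([x, u]); Y.append(x_next)    # (state, input) -> next state
    x = x_next

# Test signal: u = sin(t) over three periodic cycles
t = np.linspace(0.0, 3 * 2 * np.pi, 600)
u_test = np.sin(t)
```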
Algorithm    BP          EKF          LF-II
RMS error    0.0539219   0.0807811    0.052037

Table 5: RMS error on test data (the convergence of EKF is not uniform across different initial conditions)
Lyapunov Based Approach Contd...
Simulation Results: System Identification

Figure 11: System identification, with the trained network tested on the test data (Tmax = 40000): (a) BP, RMS error 0.0539219; (b) EKF, RMS error 0.0807811; (c) LF-II, RMS error 0.052037
Lyapunov Based Approach Contd...
Conclusion

• The LF algorithm is a theoretically globally convergent algorithm.
• The LF algorithms perform better than both the EKF and BP algorithms in terms of speed and accuracy.
• The convergence speed of the modified LF algorithm (LF-II), based on a smooth search in weight space, is found to be independent of the initial weights.
• The LF-I algorithm has an interesting parallel with the conventional BP algorithm, in which the fixed learning rate of BP is replaced by an adaptive learning rate.
• The LF algorithms outperform BP and EKF as the complexity of the network increases.
• Choosing the right Lyapunov function candidate is a key issue in our algorithm.
Robot Manipulators

Dynamics:

M(q)q̈ + Vm(q, q̇)q̇ + F(q̇) + G(q) + τd = τ        (18)

Properties:
• The inertia matrix M(q) is symmetric, positive definite, and bounded, so that μ₁I ≤ M(q) ≤ μ₂I for all q(t).
• The Coriolis/centripetal vector Vm(q, q̇)q̇ is quadratic in q̇ and bounded, so that ‖Vm q̇‖ ≤ v_B ‖q̇‖².
• The Coriolis/centripetal matrix can always be selected so that S(q, q̇) ≡ Ṁ(q) − 2Vm(q, q̇) is skew symmetric; therefore xᵀSx = 0 for all vectors x.
• The gravity vector is bounded, so that ‖G(q)‖ ≤ g_B.
• The disturbances are bounded, so that ‖τd(t)‖ ≤ d_B.
Manipulator Control

• Overview of various control techniques
  – PID controllers
    ∗ Most widely used for manipulators
    ∗ Cannot be used under high speed and accuracy requirements
    ∗ Lose accuracy when certain parameters change - no learning capability
    ∗ High gains may make the system susceptible to high frequency noise
  – Traditional adaptive control techniques
    ∗ Spong, Vidyasagar, Kokotovic, Slotine and Li
    ∗ Assumption of linearity in the unknown system parameters: f(x) = R(x)ξ
    ∗ Computation of the regression matrix R(x) is quite complex and must be repeated for each different manipulator
Manipulator control contd...
– NN based adaptive control
  ∗ NNs possess a universal approximation capability: f(x) = Wᵀσ(Vᵀx) + ε
  ∗ No LIP assumption needed
  ∗ No tedious computations required - no regression matrices
  ∗ Portable: the same controller can be used for different manipulators
  ∗ Online tuning possible - can deal with parameter variation and external disturbances
  ∗ NN control down the time-line:
    · Narendra and Parthasarathy, 1990-91: system identification and control
    · Chen and Khalil, 1992: NN based adaptive control
    · Levin and Narendra, 1993: dynamic back propagation, feedback linearization
    · Delgado, Kambhampati and Warwick, 1995: dynamic neural network input/output linearization
Manipulator control contd...
    · Behera, 1996: network inversion control
    · Lewis, Jagannathan and Yesildirek, 1998-99: NN control for robots
    · Kwan and Lewis, 1998-2003: NN based robust backstepping control

• What I have done?
  – Nothing new :-(
  – Implemented and analyzed recent nonlinear control techniques for robot manipulators, including flexible-link and flexible-joint manipulators.
Manipulator control contd...
NN based controls
• NN based simple adaptive control
• NN based robust backstepping control
• NN control based on the singular perturbation technique
• Summary
Manipulator control contd...
NN based simple adaptive control

Consider a single-link manipulator (affine form):

a q̈ + b sin(q) = u

Let us define e = q_d − q, ė = q̇_d − q̇ and r = ė + λe. Then we have

a ṙ = a(q̈_d + λė) + b sin(q) − u = F(q̈_d, q, q̇, e) − u

A NN can be used to approximate this nonlinear function:

F = Wᵀφ + ε,   and suppose F̂ = Ŵᵀφ

By choosing the control u = F̂ + Kr, the closed loop error dynamics can be written as

a ṙ = W̃ᵀΦ − Kr,   where W̃ = W − Ŵ
Manipulator control contd...
NN based simple adaptive control contd...

Consider a Lyapunov function

V = ½ rᵀr + ½ tr(W̃ᵀΓW̃)

Its time derivative is

V̇ = −rᵀKr + tr(W̃ᵀΦrᵀ − W̃ᵀΓŴ̇)

Choose the weight update law (Kwan and Lewis, 2000):

Ŵ̇ = ΓΦrᵀ − m Γ ‖r‖ Ŵ

Then

V̇ ≤ −‖r‖ [ λ_min ‖r‖ + m (‖W̃‖_F − W_M/2)² − m W_M²/4 ]

where λ_min is the minimum eigenvalue of K and ‖W‖_F ≤ W_M. The bracketed term is positive if either of the following conditions holds:

‖r‖ > m W_M² / (4 λ_min)   or   ‖W̃‖_F > W_M

Thus V̇ is negative outside a compact set, and the control gain K can always be selected so as to satisfy these conditions. Hence both ‖r‖ and ‖W̃‖_F are UUB (uniformly ultimately bounded).
Manipulator control contd...
Single Link Manipulator: Simulation

The plant model:

m l² q̈ + m g l sin(q) = u,   where m = 1 kg, l = 1 m and g = 9.8 m/s²
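A minimal simulation sketch of the above controller on this plant; the Gaussian feature vector (number of RBF centers and their placement), the damping gain value and the Euler integration step are illustrative assumptions, not taken from the slides:

```python
import numpy as np

# NN adaptive control of the single-link plant a*qdd + b*sin(q) = u,
# with u = F_hat + K*r and W_dot = Gamma*phi*r - m*Gamma*|r|*W.
rng = np.random.default_rng(1)
a, b = 1.0, 9.8                          # m*l^2 and m*g*l for m = l = 1
lam, K, Gamma, m_gain = 5.0, 30.0, 300.0, 0.01
C = rng.uniform(-2.0, 2.0, (20, 3))      # RBF centers (assumed)

def phi(z):                              # Gaussian features of the input z
    return np.exp(-np.sum((C - z) ** 2, axis=1))

dt, W, q, qdot = 1e-3, np.zeros(20), 0.0, 0.0
for k in range(int(10.0 / dt)):
    t = k * dt
    qd, qd_dot, qd_dd = np.sin(t), np.cos(t), -np.sin(t)
    e, edot = qd - q, qd_dot - qdot
    r = edot + lam * e                   # filtered tracking error
    z = np.array([q, qdot, qd_dd + lam * edot])
    u = W @ phi(z) + K * r               # u = F_hat + K*r
    W += dt * (Gamma * phi(z) * r - m_gain * Gamma * abs(r) * W)
    qdd = (u - b * np.sin(q)) / a        # plant dynamics
    q, qdot = q + dt * qdot, qdot + dt * qdd
```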
Figure 12: Neural network based adaptive controller with λ = 5, K = 30 and Γ = 300: (a) link position tracking, (b) link velocity tracking, (c) position tracking error for several gain settings (K = 20, 30, 40)
Manipulator control contd...
SLM: Simulation contd...

Figure 13: NN controller comparison (λ = 5, K = 30, Γ = 300): (a) control input, (b) unknown function approximation, (c) PD control input with K_p = 400, K_d = 80
Manipulator control contd...
NN based Robust Back-stepping Control

System description (strict feedback form):

ẋ₁ = F₁(x₁) + G₁(x₁)x₂
ẋ₂ = F₂(x₁, x₂) + G₂(x₁, x₂)x₃
ẋ₃ = F₃(x₁, x₂, x₃) + G₃(x₁, x₂, x₃)x₄
...
ẋₘ = Fₘ(x₁, x₂, ..., xₘ) + Gₘ(x₁, x₂, ..., xₘ)u

Fᵢ, Gᵢ ∈ R^(n×n), i = 1, 2, ..., m, are nonlinear functions that contain both parametric and nonparametric uncertainties, and the Gᵢ are known and invertible.

Note: Back-stepping can be applied if the internal dynamics are stabilizable.
Manipulator control contd...
Back-stepping method (Krstic, Kanellakopoulos and Kokotovic, 1995):
• Choose x₂ = x₂d such that the system ẋ₁ = F₁(x₁, x₂d) has stable tracking of x₁d by x₁.
• Choose x₃ = x₃d such that x₂ tracks x₂d, and so on.
• Finally, select u(t) such that xₘ tracks xₘd.

Problems with traditional robust and adaptive backstepping control:
• Computation of the regression matrices at each design step is tedious and time consuming.
• The LIP assumption is quite restrictive and may not hold in practical situations.
Manipulator control contd...
Robust Back-stepping Control Design using NN

• Step 1: Design fictitious controllers for x₂, x₃, ..., and xₘ.

Consider the first subsystem:

ẋ₁ = F₁(x₁) + G₁(x₁)x₂

Choosing the fictitious controller for x₂ as

x₂d = G₁⁻¹(−F̂₁ + ẋ₁d − K₁e₁)

the closed loop error dynamics of this subsystem become

ė₁ = F₁ − F̂₁ − K₁e₁ + G₁e₂

with e₂ = x₂ − x₂d. We will have to ensure that e₂ is bounded.
Manipulator control contd...
NN back-stepping contd...

Differentiating e₂ gives

ė₂ = ẋ₂ − ẋ₂d = F₂ + G₂x₃ − ẋ₂d

Choosing a fictitious controller for x₃ of the form

x₃d = G₂⁻¹(−F̂₂ + ẋ₂d − K₂e₂ − G₁ᵀe₁)

we get the closed loop error dynamics for the second subsystem:

ė₂ = F₂ − F̂₂ − K₂e₂ − G₁ᵀe₁ + G₂e₃

with e₃ = x₃ − x₃d. And the design continues...
Manipulator control contd...
NN back-stepping contd...

• Step 2: Design of the actual control u.

Differentiating eₘ = xₘ − xₘd yields

ėₘ = ẋₘ − ẋₘd = Fₘ + Gₘu − ẋₘd

Choosing the controller of the form

u = Gₘ⁻¹(−F̂ₘ + ẋₘd − Kₘeₘ − Gₘ₋₁ᵀeₘ₋₁)

gives the following dynamics for the error eₘ:

ėₘ = Fₘ − F̂ₘ − Kₘeₘ − Gₘ₋₁ᵀeₘ₋₁

Here Kᵢ, i = 1, 2, ..., m, are design parameters and the F̂ᵢ are NN outputs.
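The two-step version of this design can be sketched in a few lines. A scalar two-subsystem case is assumed below, with known scalar gains G1, G2 and the NN estimates F1_hat, F2_hat passed in; the numeric differentiation of x2d is a simplification of the analytic derivative used in the derivation:

```python
# Scalar sketch of the two-step NN backstepping law above, for
#   x1' = F1(x1) + G1*x2,   x2' = F2(x1, x2) + G2*u
def backstep_control(x1, x2, x1d, x1d_dot, x2d_prev, dt,
                     F1_hat, F2_hat, G1, G2, K1=5.0, K2=5.0):
    e1 = x1 - x1d
    x2d = (-F1_hat + x1d_dot - K1 * e1) / G1     # fictitious control for x2
    x2d_dot = (x2d - x2d_prev) / dt              # numeric stand-in for x2d'
    e2 = x2 - x2d
    u = (-F2_hat + x2d_dot - K2 * e2 - G1 * e1) / G2
    return u, x2d
```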
Manipulator control contd...
NN back-stepping contd...

Figure 14: Backstepping NN control of nonlinear systems in "strict-feedback" form. One NN per subsystem supplies the estimates F̂₁, ..., F̂ₘ, and the fictitious controls x₂d, ..., xₘd cascade into the actual plant input u.
Manipulator control contd...
The error dynamics of the whole plant:

ė₁ = W̃₁ᵀφ₁ − K₁e₁ + G₁e₂ + ε₁
ė₂ = W̃₂ᵀφ₂ − K₂e₂ − G₁ᵀe₁ + G₂e₃ + ε₂
ė₃ = W̃₃ᵀφ₃ − K₃e₃ − G₂ᵀe₂ + G₃e₄ + ε₃
...
ėₘ = W̃ₘᵀφₘ − Kₘeₘ − Gₘ₋₁ᵀeₘ₋₁ + εₘ

Define ζ = [e₁ᵀ e₂ᵀ ... eₘᵀ]ᵀ, Z̃ = diag{W̃₁, W̃₂, ..., W̃ₘ}, K = diag{K₁, K₂, ..., Kₘ}, and φ = [φ₁ᵀ φ₂ᵀ ... φₘᵀ]ᵀ.

The closed loop system error dynamics can be rewritten as

ζ̇ = −Kζ + Z̃ᵀφ + Hζ + ε

The weight update algorithm is chosen so that the term Z̃ᵀφ is bounded.
Manipulator control contd...
NN Back-stepping: RLED

Rigid Link Electrically Driven (RLED) robot manipulator model:

M(q)q̈ + Vm(q, q̇)q̇ + G(q) + F(q̇) + T_L = K_T I
L İ + R(I, q̇) + T_E = u_E

where L, R, K_T ∈ R^(n×n) are positive definite, constant, diagonal matrices, and

a = l₂²m₂ + l₁²(m₁ + m₂),  b = 2l₁l₂m₂,  c = l₂²m₂,
d = (m₁ + m₂)l₁g₀,  e = m₂l₂g₀

M(q) = [ a + b cos(q₂)       c + (b/2) cos(q₂) ]
       [ c + (b/2) cos(q₂)   c                 ]
Manipulator control contd...
NN Back-stepping: RLED model contd...

Vm q̇ = [ −b sin(q₂)(q̇₁q̇₂ + 0.5 q̇₂²) ]
       [ 0.5 b sin(q₂) q̇₁²           ]

G(q) = [ d cos(q₁) + e cos(q₁ + q₂) ]
       [ e cos(q₁ + q₂)             ]

The parameter values are l₁ = 1 m, l₂ = 1 m, m₁ = 0.8 kg, m₂ = 2.3 kg, and g₀ = 9.8 m/s². The motor parameters are R_j = 1 Ω, L_j = 0.01 H, K_jT = 2.0 N·m/A, j = 1, 2.
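As a quick illustrative check (not on the slides), the skew-symmetry property S = Ṁ − 2Vm listed under the manipulator properties can be verified numerically for this model; the matrix form of Vm below is the standard one consistent with the Vm·q̇ vector above, and Ṁ is computed analytically from M(q):

```python
import numpy as np

# Numeric check that S = Mdot - 2*Vm is skew-symmetric for the two-link
# model above, with b = 2*l1*l2*m2 from the stated masses and lengths.
l1, l2, m2 = 1.0, 1.0, 2.3
b = 2.0 * l1 * l2 * m2
q2, q1d, q2d = 0.7, 0.3, -0.5            # arbitrary test point

h = -0.5 * b * np.sin(q2)                # matrix form consistent with Vm*qdot
Vm = np.array([[h * q2d, h * (q1d + q2d)],
               [-h * q1d, 0.0]])
Mdot = q2d * np.array([[-b * np.sin(q2), -0.5 * b * np.sin(q2)],
                       [-0.5 * b * np.sin(q2), 0.0]])
S = Mdot - 2.0 * Vm
print(np.allclose(S, -S.T))              # True: x^T S x = 0 for all x
```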
The inputs to the NNs are given by

x = [ζ₁ᵀ  ζ₂ᵀ  cos(q)ᵀ  sin(q)ᵀ  Iᵀ  1]ᵀ

where ζ₁ = q̈_d + Λė and ζ₂ = q̇_d + Λe.
Manipulator control contd...
RLED: NN BS Design

The robot dynamics in terms of the filtered error can be expressed as

M ṙ = F₁ − Vm r + T_L − K_T I        (19)

The complicated nonlinear function F₁ is defined as

F₁ = M(q)(q̈_d + Λė) + Vm(q, q̇)(q̇_d + Λe) + G(q) + F(q̇)

Step 1: Choose I = I_d such that r → 0.

Then (19) can be rewritten as

M ṙ = F₁ − Vm r + T_L − K_T I_d + K_T η        (20)

where η = I_d − I is an error signal. Choose I_d as

I_d = (1/k₁)[F̂₁ + k_τ r + ν_τ]        (21)
Manipulator control contd...
RLED: NN BS Design contd...

Substituting (21) into (20), the closed loop dynamics become

M ṙ = −Vm r + W̃₁ᵀφ₁ − (K_T/k₁) k_τ r + ε₁ + T_L + (I − K_T/k₁)(Ŵ₁ᵀφ₁ − ν_τ) + K_T η        (22)

Step 2: Choose u_E such that η → 0.

Differentiating η = I_d − I, we get

L η̇ = F₂ + T_E − u_E        (23)

where F₂ is a very complicated nonlinear function of q, q̇, r, I_d, and I. The control signal u_E can be chosen as

u_E = F̂₂ + k_ν η        (24)

The closed loop dynamics:

L η̇ = W̃₂ᵀφ₂ + ε₂ + T_E − k_ν η
Page 70
Manipulator control contd...
RLED Simulation I
+
-
Id
Fictitious controller
Fˆ1
2-layer Neural Network
kτ
kν qd q˙d
-
e e˙
[Λ
I]
r
Controller
Fˆ2
RLED System
q q˙
2-layer Neural Network
Figure 15: NN controller structure for RLED July 12, 2004
Manipulator control contd...
RLED simulation results

Figure 16: PD control of the 2-link RLED with K = diag{100, 100} and K_d = diag{20, 20}: (a) link 1 position tracking, (b) link 2 position tracking, (c) control torques
Manipulator control contd...
RLED simulation results

Figure 17: NN backstepping control of the 2-link RLED with k_τ = diag{60, 60} and k_ν = diag{1.5, 1.5}: (a) link 1 position tracking, (b) link 2 position tracking, (c) control torques
Manipulator control contd...
RLED simulation results

Observations:
• The problem of weight initialization does not arise, since the weights Ŵᵢ(0) are taken as zeros.
• The tuning algorithm ensures that the weights are bounded.
• The PD controller requires large gains, which might excite high frequency unmodeled dynamics.
• The NN controller improves the tracking performance.
Manipulator control contd...
Singular Perturbation Design

• A large class of systems has slow dynamics and fast dynamics which operate on very different time-scales and are essentially independent.
• The control can be decomposed into a fast portion and a slow portion which act on different time-scales, increasing the control effectiveness.
• Lewis, Jagannathan and Yesildirek, 1999
Manipulator control contd...
Singular Perturbation Design

A large class of nonlinear systems can be described by the equations

ẋ₁ = f₁(x, u)
ε ẋ₂ = f₂₁(x₁) + f₂₂(x₁)x₂ + g₂(x₁)u

where the state x = [x₁ᵀ x₂ᵀ]ᵀ and ε ≪ 1 is a small positive parameter.

Pendubot: switching control

Once link 2 reaches close to the top position (|q_d1 − q₁| ≤ 0.2, |q_d2 − q₂| ≤ 0.2), we switch over to a linear state feedback controller, where the state feedback gain is given by

K = [−32.68  −7.14  −32.76  −4.88]
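A sketch of this switching logic; the state ordering, the error definition and the sign convention u = Kx are assumptions, since the slides give only the gain vector and the switching tolerance:

```python
import numpy as np

# Swing-up / balance switching for the Pendubot: track the swing-up
# input until both links are near the inverted position, then latch
# onto the linear state-feedback balancer with the gain K above.
K = np.array([-32.68, -7.14, -32.76, -4.88])

def pendubot_control(state, qd, u_swingup, balancing, tol=0.2):
    q1, q1dot, q2, q2dot = state
    qd1, qd2 = qd
    if not balancing and abs(qd1 - q1) <= tol and abs(qd2 - q2) <= tol:
        balancing = True                 # one-way switch to the balancer
    if balancing:
        x = np.array([q1 - qd1, q1dot, q2 - qd2, q2dot])
        return float(K @ x), balancing   # linear state feedback
    return u_swingup, balancing          # NN-based swing-up input
```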
Pendubot
Simulation Results

Figure 24: NN based partial feedback linearization control of the Pendubot: (a) position tracking of links 1 and 2 against q_d(1), (b) velocity tracking, (c) control torque
Pendubot
Conclusion

• The NN helps in taking unmodelled dynamics into account.
• The control input is bounded.
• Choosing a proper initial trajectory for link 1 is crucial.
Summary

• Two algorithms, RTRL and EKF, have been proposed for the MNN. A performance comparison has been made on both SISO and MIMO systems.
• A novel learning algorithm based on a Lyapunov function has been proposed for the FFN, with a performance comparison against BP and EKF.
• Various types of NN controllers have been analyzed and implemented for robot manipulators.
• A new NN based partial feedback linearization control has been suggested for the Pendubot.
Thank You