Nonlinear System Identification and Control Using Neural Networks

Swagat Kumar

Department of Electrical Engineering, Indian Institute of Technology, Kanpur. July 12, 2004

Intelligent Systems Laboratory


Synopsis
• Introduction
• System Identification with Memory Neuron Network (MNN)
• Lyapunov based training algorithm for FFN
• Robot Manipulators
• Manipulator Control
• Pendubot
• Summary


Introduction

• Artificial Neural Network
• System Identification
• Nonlinear Control
• Underactuated systems
• My work


Introduction contd...

Artificial Neural Network
• Inspired by the biological nervous system
• Local processing in artificial neurons (Processing Elements, PEs)
• Massively parallel processing implemented by a rich connection pattern between the PEs
• Ability to acquire knowledge via learning/experience
• Knowledge storage in distributed memory, the synaptic PE connections

Figure 1: An artificial neural network

Introduction contd...

System Identification
• Building good models of an unknown plant from measured data
• Involves two distinct steps:
  – Choosing a proper model
  – Adjusting the parameters of the model so as to minimize a fit criterion

Figure 2: System identification model

• Neural networks are extensively used because of their good approximation capability


Introduction contd...

Nonlinear Control
• Nonlinear systems
  – Difficult to model
  – No standard or generalized technique
  – May involve parameter uncertainty and variation
  – May have unstable zero dynamics
• Conventional techniques
  – Adaptive control
  – Robust and optimal control
  – Many others, such as backstepping, sliding mode, feedback linearization, singular perturbation, etc.

Introduction contd...

Underactuated Systems
• The number of actuators is less than the number of DOFs
• Some examples:
  – Flexible-link and flexible-joint manipulators
  – Inertia wheel pendulum, Pendubot, Acrobot, Furuta pendulum
  – Underwater vehicles, PVTOL aircraft, space structures with floating platforms
• Control techniques:
  – Energy based methods: Fantoni and Lozano, 2002
  – Passivity based methods
  – Partial feedback linearization: Spong, Ortega, Block, et al.
  – Backstepping and a variant of sliding mode control: Bapiraju, 2004
  – Coordinate transforms and various other methods: Saber, 2001

Introduction contd...

My Work

• Learning algorithms for the MNN
• A new learning algorithm for the FFN
• NN based robot manipulator control
• NN based control technique for the Pendubot

System Identification with MNN

• Feedforward and recurrent networks
• Memory Neuron Network (MNN)
• Learning algorithms
  – Back Propagation Through Time (BPTT)
  – Real Time Recurrent Learning (RTRL)
  – Extended Kalman Filtering (EKF)
• Simulation results
• Comparison and inference

System identification with MNN contd...

Feed-forward networks
• Perform static (point-to-point) mapping
• Do not preserve dynamics
• Require a priori knowledge of the exact order of the system
• All the states must be measurable
• However, they are easy to train

Recurrent networks
• Capable of learning dynamics
• No a priori knowledge of the order of the system is required
• All states need not be available for measurement
• But computationally complex because of the feedback connections

System identification with MNN contd...

Memory Neuron Model
• Sastry et al., 1994
• A memory neuron is added to each network neuron to capture the dynamics of the recurrent network
• No need to store past values
• It converts a recurrent network into a feed-forward network
• Locally recurrent and globally feedforward in nature
• Easy to train compared to a fully recurrent network

Figure 3: Structure of the memory neuron model (the memory neuron MN is fed by its network neuron through the coefficient α and by its own past value through 1 − α)
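To make the memory-neuron idea concrete, here is a minimal sketch in which the memory neuron is a first-order filter of its network neuron's past output with coefficient α, following the structure in Figure 3; the function and variable names are illustrative, not taken from the original implementation.

```python
import numpy as np

def memory_neuron_step(x_prev, v_prev, alpha):
    """One step of a memory neuron: an exponentially weighted memory
    of the associated network neuron's past output.

    x_prev : previous output of the network neuron
    v_prev : previous memory-neuron activation
    alpha  : memory coefficient in (0, 1), learned along with the other weights
    """
    return alpha * x_prev + (1.0 - alpha) * v_prev

# Example: the memory activation smoothly summarizes a varying signal
v = 0.0
for x in np.sin(0.1 * np.arange(50)):
    v = memory_neuron_step(x, v, alpha=0.3)
```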


System identification with MNN contd...

Memory Neuron Network

[Block diagram: the plant output y(k) is compared with the MNN prediction ŷ(k) to form the error e(k); the input u(k) and the delayed output y(k − 1), through a unit delay z⁻¹, feed the network]

Figure 4: System identification model with MNN

System identification with MNN contd...

Learning Algorithms
Training algorithms for RNNs:
• Back Propagation Through Time (BPTT): P. J. Werbos, 1990
• Real Time Recurrent Learning (RTRL): R. J. Williams and D. Zipser, 1990
• Extended Kalman Filter (EKF): R. J. Williams, 1992; Iiguni et al., 1992

What we have done:
• We used RTRL and EKF for the MNN (Sastry et al. used BPTT for this network).
• A comparative study of the above three algorithms has been made.
• We have tested the results on both SISO and MIMO systems.

System identification with MNN contd...

Back Propagation Through Time
• Extension of the back propagation algorithm
• The recurrent network is unfolded into a multilayer feedforward network, with a new layer added at every time step
• Offline technique, since we wait until we reach the end of the sequence
• Information is propagated in the backward direction to update the weights:
$$\Delta w_i(t) = -\eta\,\frac{\partial E}{\partial w_i}$$
• Slow convergence

Figure 5: Rolling the network out in time

System identification with MNN contd...

Real Time Recurrent Learning
Plant output:
$$y(t+1) = f(s(t+1)), \qquad s(t+1) = w\,x(t) + g\,y(t)$$
Cost function for a single output:
$$E = \frac{1}{2}\big(y^{d}(t+1) - y(t+1)\big)^2$$
Gradient descent update:
$$w_{n} = w_{o} - \eta\,\frac{\partial E}{\partial w}, \qquad
\frac{\partial E}{\partial w} = -\big[y^{d}(t+1) - y(t+1)\big]\,\frac{\partial y(t+1)}{\partial w}$$

System identification with MNN contd...

RTRL Contd...
Define
$$P_w(t+1) = \frac{\partial y(t+1)}{\partial w}
           = \frac{\partial y(t+1)}{\partial s(t+1)}\,\frac{\partial s(t+1)}{\partial w}
           = y'(t+1)\,\big[x(t) + g\,P_w(t)\big]$$
so that
$$\frac{\partial E}{\partial w} = -\big[y^{d}(t+1) - y(t+1)\big]\,P_w(t+1)$$
and we have the recursion
$$P_w(t+1) = y'(t+1)\,\big[x(t) + g\,P_w(t)\big]$$
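A minimal sketch of this recursion for a single recurrent neuron $y(t+1) = f(w\,x(t) + g\,y(t))$ with scalar weights; it only illustrates the gradient bookkeeping and is not the exact MNN architecture used in the experiments.

```python
import numpy as np

def f(s):                 # neuron nonlinearity
    return np.tanh(s)

def f_prime(s):           # derivative of the nonlinearity
    return 1.0 - np.tanh(s) ** 2

def rtrl_train(x_seq, yd_seq, eta=0.05):
    """Online RTRL for y(t+1) = f(w*x(t) + g*y(t)) with scalar weights."""
    w, g = 0.1, 0.1
    y = 0.0
    Pw = Pg = 0.0                       # sensitivities dy/dw and dy/dg
    for x, yd in zip(x_seq, yd_seq):
        s = w * x + g * y
        y_next = f(s)
        # Recursions: P(t+1) = f'(s(t+1)) * [input + g * P(t)]
        Pw = f_prime(s) * (x + g * Pw)
        Pg = f_prime(s) * (y + g * Pg)
        e = yd - y_next                 # instantaneous error
        w += eta * e * Pw               # since dE/dw = -e * Pw
        g += eta * e * Pg
        y = y_next
    return w, g
```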


System identification with MNN contd...

RTRL Contd ...

• Propagates information forward to compute the gradient term
• Online (real-time) training
• Increased computational complexity
• Requires a large number of training patterns

System identification with MNN contd...

Extended Kalman Filtering
• EKF is a state estimation method for nonlinear systems and can also be used for parameter estimation.
• A multilayered NN is a multi-input, multi-output nonlinear system with a layered structure, so its learning can be regarded as a parameter estimation problem.
• EKF applied to RNNs (Williams, 1992) shows a 6-10 times reduction in the number of presentations required for training.
• Increased computation time per iteration.
• Iiguni et al., 1992, proposed an online EKF training algorithm for the MLP.

System identification with MNN contd...

EKF Contd ...

System equations:
$$a(t) = a(t-1)$$
$$y^{d}(t) = h[a(t)] + \epsilon(t) = \hat{y}(t) + \epsilon(t)$$
Update equations:
$$\hat{a}(t) = \hat{a}(t-1) + K(t)\,\big[y^{d}(t) - \hat{y}(t)\big]$$
$$K(t) = P(t-1)H^{T}(t)\,\big[H(t)P(t-1)H^{T}(t) + R(t)\big]^{-1}$$
$$P(t) = P(t-1) - K(t)H(t)P(t-1)$$
where $H(t) = \partial h/\partial a$, $K(t)$ is the Kalman gain, $R(t)$ is the covariance matrix of the measurement noise, and $P(t)$ is the error covariance matrix.
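A minimal sketch of these update equations applied to a generic parameter vector $a$ of a model $h(a, x)$; the model, its Jacobian routine, and the dimensions are illustrative assumptions rather than the exact network used here.

```python
import numpy as np

def ekf_step(a_hat, P, x, y_d, h, H_of, R):
    """One EKF update for the parameter (weight) vector a_hat.

    h(a, x)    : model output, shape (n_out,)
    H_of(a, x) : Jacobian dh/da, shape (n_out, n_params)
    R          : measurement-noise covariance, shape (n_out, n_out)
    """
    H = H_of(a_hat, x)
    S = H @ P @ H.T + R                        # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
    a_hat = a_hat + K @ (y_d - h(a_hat, x))    # weight update
    P = P - K @ H @ P                          # error covariance update
    return a_hat, P
```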


System identification with MNN contd...

Simulation Results

Example 1: SISO plant
$$y_p(k+1) = f\big(y_p(k),\, y_p(k-1),\, y_p(k-2),\, u(k),\, u(k-1)\big)$$
where
$$f(x_1, x_2, x_3, x_4, x_5) = \frac{x_1\,x_2\,x_3\,x_5\,(x_3 - 1) + x_4}{1 + x_2^{2} + x_3^{2}}$$
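For reference, a small sketch of how identification data for this benchmark plant can be generated; the uniformly random excitation in [−1, 1] and the zero initial conditions are illustrative assumptions.

```python
import numpy as np

def plant_step(y_hist, u_hist):
    """y_hist = [yp(k), yp(k-1), yp(k-2)], u_hist = [u(k), u(k-1)] -> yp(k+1)."""
    x1, x2, x3 = y_hist
    x4, x5 = u_hist
    return (x1 * x2 * x3 * x5 * (x3 - 1.0) + x4) / (1.0 + x2**2 + x3**2)

rng = np.random.default_rng(0)
y_hist, u_hist, data = [0.0, 0.0, 0.0], [0.0, 0.0], []
for k in range(1000):
    u_hist = [rng.uniform(-1.0, 1.0), u_hist[0]]      # random excitation
    y_next = plant_step(y_hist, u_hist)
    data.append((y_hist + u_hist, y_next))            # (regressors, target)
    y_hist = [y_next, y_hist[0], y_hist[1]]
```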


System identification with MNN contd...

SISO Plant

[Desired vs. actual plant output over 1000 time steps: (a) BPTT algorithm, (b) RTRL algorithm, (c) EKF algorithm]

System identification with MNN contd...

Example 2: MIMO plant
$$y_{p1}(k+1) = 0.5\left[\frac{y_{p1}(k)}{1 + y_{p2}^{2}(k)} + u_1(k)\right]$$
$$y_{p2}(k+1) = 0.5\left[\frac{y_{p1}(k)\,y_{p2}(k)}{1 + y_{p2}^{2}(k)} + u_2(k)\right]$$

System identification with MNN contd...

MIMO Plant: BPTT

[Desired vs. actual outputs over 1000 data points: (d) BPTT output 1, (e) BPTT output 2]

System identification with MNN contd...

MIMO Plant: RTRL

[Desired vs. actual outputs over 1000 data points: (f) RTRL output 1, (g) RTRL output 2]

System identification with MNN contd...

MIMO Plant: EKF

[Desired vs. actual outputs over 1000 data points: (h) EKF output 1, (i) EKF output 2]

System identification with MNN contd...

Comparative Analysis

Example           BPTT       RTRL       EKF
Ex. No. 1         0.013293   0.006088   0.006642
Ex. No. 2, o/p 1  0.005753   0.001414   0.002595
Ex. No. 2, o/p 2  0.008460   0.001258   0.001563

Table 1: Mean square error while identifying with MNN

System identification with MNN contd...

Conclusion
• The MNN offers a simple way to convert a recurrent network into a feed-forward network, thereby simplifying implementation.
• BPTT takes a huge time to converge and its identification is poor; online implementation is not possible.
• RTRL shows the best approximation accuracy and also enables online training, but it takes a long time to converge.
• EKF is the fastest of the three in terms of the number of training examples, and its performance is comparable to RTRL, but it is computationally complex.

Lyapunov Based Training Algorithm
• Introduction
• Back Propagation Algorithm
• Lyapunov Based Approach
  – LF-I
  – LF-II
• Simulation Results
• Conclusion

Lyapunov Based Algorithm Contd ...

• Desirable features of learning algorithms for feed-forward networks:
  – Locating the global minimum of the cost function
  – Fast convergence
  – Good generalization: learning from a minimum number of examples
  – Low computational complexity: less training time
• Overview of present learning algorithms
  – Back Propagation Algorithm
    ∗ Most likely to find a local minimum
    ∗ Slow convergence
    ∗ Data sufficiency: training patterns should span the complete input-output space
    ∗ But easy to implement

Lyapunov Based Algorithm Contd ...

Introduction contd...
  – Extended Kalman Filtering
    ∗ Fast convergence
    ∗ Fewer training patterns needed
    ∗ Computationally intensive
    ∗ Global convergence is not guaranteed
  – Other algorithms (Newton's method, Levenberg-Marquardt)
    ∗ Fast algorithms
    ∗ Global convergence is not guaranteed
    ∗ Computationally intensive
• What we have done:
  – A globally convergent algorithm using a Lyapunov function for feedforward neural networks.

Lyapunov Based Algorithm Contd ...

Back Propagation Algorithm
The back propagation algorithm is based on the gradient descent (GD) method, in which the weights are updated in the direction that reduces the error:
$$\Delta w_i(t) = -\eta\,\frac{\partial E}{\partial w_i}$$
The weight update rule is given by
$$\Delta w_{ji}(n) = \eta\,\delta_j(n)\,y_i(n)$$
where $\eta$ is the learning rate and $\delta_j(n)$ is the local gradient:
$$\delta_j(n) =
\begin{cases}
e_j(n)\,\psi_j'(v_j(n)) & \text{if neuron } j \text{ is an output node}\\[4pt]
\psi_j'(v_j(n))\sum_k \delta_k(n)\,w_{kj}(n) & \text{if neuron } j \text{ is a hidden node}
\end{cases}$$
In the above equation, $k$ refers to the next layer.
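A minimal sketch of one BP update for a single-hidden-layer sigmoidal network, matching the local-gradient expressions above; the layer sizes and names are illustrative.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def bp_step(x, y_d, W1, W2, eta=0.5):
    """One gradient-descent (BP) update for a 1-hidden-layer network."""
    # Forward pass
    v1 = W1 @ x;  y1 = sigmoid(v1)                  # hidden layer
    v2 = W2 @ y1; y2 = sigmoid(v2)                  # output layer
    # Local gradients (psi'(v) = y(1 - y) for the sigmoid)
    delta2 = (y_d - y2) * y2 * (1.0 - y2)           # output nodes
    delta1 = (W2.T @ delta2) * y1 * (1.0 - y1)      # hidden nodes
    # Weight updates: dw_ji = eta * delta_j * y_i
    W2 += eta * np.outer(delta2, y1)
    W1 += eta * np.outer(delta1, x)
    return W1, W2
```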


Lyapunov Based Approach Contd...

Lyapunov Based Approach

Lyapunov Stability Criterion

• Used extensively in control system problems.
• If we choose a Lyapunov function candidate $V(x(t), t)$ such that
  – $V(x(t), t)$ is positive definite, and
  – $\dot{V}(x(t), t)$ is negative definite,
  then the system is stable. Moreover, if $\dot{V}(t)$ is uniformly continuous and bounded, then by Barbalat's lemma $\dot{V} \to 0$ as $t \to \infty$.
• The problem lies in choosing a proper Lyapunov function candidate.

Lyapunov Based Approach Contd...

Figure 6: A feed-forward network (inputs $x_1, x_2$, weights $W$, output $y$)

Here, $W \in \mathbb{R}^{M}$ is the weight vector. The training data consists of, say, $N$ patterns $\{x^p, y^p\}$, $p = 1, 2, \ldots, N$.

Lyapunov Based Approach Contd...

The network output is given by
$$\hat{y}^p = f(W, x^p), \quad p = 1, 2, \ldots, N \tag{1}$$
The usual quadratic cost function is given as
$$E = \frac{1}{2}\sum_{p=1}^{N}\big(y^p - \hat{y}^p\big)^2 \tag{2}$$
Let us choose a Lyapunov function candidate for the system as
$$V = \frac{1}{2}\,\tilde{y}^{T}\tilde{y} \tag{3}$$
where $\tilde{y} = [\,y^1 - \hat{y}^1, \ldots, y^p - \hat{y}^p, \ldots, y^N - \hat{y}^N\,]^{T}$.

Lyapunov Based Approach Contd...

The time derivative of the Lyapunov function $V$ is given by
$$\dot{V} = -\tilde{y}^{T}\,\frac{\partial\hat{y}}{\partial W}\,\dot{W} = -\tilde{y}^{T} J\,\dot{W} \tag{4}$$
where $J = \partial\hat{y}/\partial W$, $J \in \mathbb{R}^{N\times M}$.

Theorem 1. If an arbitrary initial weight $W(0)$ is updated by
$$W(t_0) = W(0) + \int_{0}^{t_0}\dot{W}\,dt \tag{5}$$
where
$$\dot{W} = \frac{\|\tilde{y}\|^{2}}{\|J^{T}\tilde{y}\|^{2}}\,J^{T}\tilde{y} \tag{6}$$
then $\tilde{y}$ converges to zero under the condition that $\dot{W}$ exists along the convergence trajectory.
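The key step behind Theorem 1 is that the update (6) makes $\dot{V}$ strictly negative whenever $\tilde{y} \neq 0$; substituting (6) into (4) gives
$$\dot{V} = -\tilde{y}^{T} J\,\frac{\|\tilde{y}\|^{2}}{\|J^{T}\tilde{y}\|^{2}}\,J^{T}\tilde{y}
         = -\frac{\|\tilde{y}\|^{2}}{\|J^{T}\tilde{y}\|^{2}}\,\|J^{T}\tilde{y}\|^{2}
         = -\|\tilde{y}\|^{2} \le 0,$$
so $V$ decreases monotonically as long as $J^{T}\tilde{y} \neq 0$ and $\dot{W}$ remains well defined.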


Lyapunov Based Approach Contd...

LF-I Algorithm
The weight update law given in equation (5) is a batch update law. The instantaneous LF-I learning algorithm can be derived as
$$\dot{W} = \frac{\|\tilde{y}\|^{2}}{\|J_i^{T}\tilde{y}\|^{2}}\,J_i^{T}\tilde{y} \tag{7}$$
where $\tilde{y} = y^p - \hat{y}^p \in \mathbb{R}$ and $J_i = \partial\hat{y}^p/\partial W \in \mathbb{R}^{1\times M}$. The difference equation representation of the weight update is given by
$$\hat{W}(t+1) = \hat{W}(t) + \mu\,\dot{W}(t) \tag{8}$$
Here $\mu$ is a constant.
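A minimal sketch of the instantaneous LF-I update (7)-(8) for a single-output network; net_forward and net_jacobian are placeholders for the user's forward pass and its gradient with respect to the flattened weight vector (illustrative names, not from the thesis).

```python
import numpy as np

def lf1_step(W, x_p, y_p, net_forward, net_jacobian, mu=0.5, eps=1e-12):
    """One LF-I update for pattern (x_p, y_p) with a scalar network output.

    net_forward(W, x)  -> scalar prediction
    net_jacobian(W, x) -> d yhat / d W, shape (M,)
    """
    y_hat = net_forward(W, x_p)
    e = y_p - y_hat                           # scalar error ytilde
    Ji = net_jacobian(W, x_p)                 # Jacobian row as a vector
    denom = (e ** 2) * np.dot(Ji, Ji) + eps   # ||Ji^T ytilde||^2
    W_dot = (e ** 2 / denom) * Ji * e         # adaptive-learning-rate step
    return W + mu * W_dot
```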


Lyapunov Based Approach Contd...

Comparison with BP Algorithm
In the gradient descent method we have
$$\Delta W = -\eta\,\frac{\partial E}{\partial W} = \eta\,J_i^{T}\tilde{y}, \qquad
\hat{W}(t+1) = \hat{W}(t) + \eta\,J_i^{T}\tilde{y} \tag{9}$$
The update equation for the LF-I algorithm is
$$\hat{W}(t+1) = \hat{W}(t) + \mu\left(\frac{\|\tilde{y}\|^{2}}{\|J_i^{T}\tilde{y}\|^{2}}\right)J_i^{T}\tilde{y}$$
Comparing the above two equations, we find that the fixed learning rate $\eta$ of the BP algorithm is replaced by its adaptive version $\eta_a$:
$$\eta_a = \mu\,\frac{\|\tilde{y}\|^{2}}{\|J_i^{T}\tilde{y}\|^{2}} \tag{10}$$

Lyapunov Based Approach Contd...

Adaptive Learning Rate of LF-I

Figure 7: Adaptive learning rate for LF-I on the XOR problem (learning rate vs. number of iterations, 4 × number of epochs)

Observations:
• The learning rate is not fixed, unlike in the BP algorithm.
• The learning rate goes to zero as the error goes to zero.

Lyapunov Based Approach Contd...

LF-II Algorithm
In order to ensure a smooth search in weight space, we consider a modified Lyapunov function candidate:
$$V = \frac{1}{2}\big(\tilde{y}^{T}\tilde{y} + \lambda\,\tilde{W}^{T}\tilde{W}\big) \tag{11}$$
where $\tilde{W} = \hat{W} - W = \Delta W$ and $\lambda$ is a constant. The time derivative of the Lyapunov function is given by
$$\dot{V} = -\tilde{y}^{T}\,\frac{\partial\hat{y}}{\partial W}\,\dot{W} - \tilde{W}^{T}\dot{W} = -\tilde{y}^{T}(J + D)\,\dot{W} \tag{12}$$
where $J = \partial\hat{y}/\partial W$ is the $N\times M$ Jacobian matrix, $\dot{\tilde{W}} = -\dot{W}$, and
$$D = \lambda\,\frac{1}{\|\tilde{y}\|^{2}}\,\tilde{y}\,\tilde{W}^{T} \tag{13}$$

Lyapunov Based Approach Contd...

LF-II Algorithm contd...
Theorem 2. If an arbitrary initial weight is updated by
$$W(t_0) = W(0) + \int_{0}^{t_0}\dot{W}\,dt \tag{14}$$
where $\dot{W}$ is given by
$$\dot{W} = \frac{\|\tilde{y}\|^{2}}{\|(J + D)^{T}\tilde{y}\|^{2}}\,(J + D)^{T}\tilde{y} \tag{15}$$
then $\tilde{y}$ converges to zero under the condition that $\dot{W}$ exists along the convergence trajectory.

The instantaneous weight update equation using the modified Lyapunov function can be expressed as
$$\hat{W}(t+1) = \hat{W}(t) + \mu\left(\frac{\|\tilde{y}\|^{2}}{\|(J_i + D)^{T}\tilde{y}\|^{2}}\right)(J_i + D)^{T}\tilde{y} \tag{16}$$

Lyapunov Based Approach Contd...

Adaptive learning rate for the LF-II Algorithm

Figure 8: Adaptive learning rate for LF-II on the XOR problem (learning rate vs. training data)

Lyapunov Based Approach Contd...

Smooth search in weight space

[Training epochs per run over 50 runs, LF-I vs. LF-II, for (a) XOR, (b) 3-bit parity, (c) 4-2 encoder]

Figure 9: Comparison of convergence time, in terms of iterations, between LF-I and LF-II

Lyapunov Based Approach Contd...

Simulation Results: XOR

Algorithm   epochs   time (sec)   parameters
BP          5620     0.0578       η = 0.5
BP          3769     0.0354       η = 0.95
EKF         3512     0.1662       λ = 0.9
LF-I        165      0.0062       µ = 0.55
LF-II       109      0.0042       µ = 0.55, λ = 0.01

Table 2: Comparison among the three algorithms for the XOR problem

Lyapunov Based Approach Contd...

Simulation Results

[Convergence time in seconds over 50 runs for BP, EKF and LF-II on XOR]

Figure 10: Convergence time comparison for XOR among BP, EKF and LF-II

Observation: LF takes almost the same time for any arbitrary initial condition.

Lyapunov Based Approach Contd...

Simulation Results: 3-bit Parity

Algorithm   epochs   time (sec)   parameters
BP          12032    0.483        η = 0.5
BP          5941     0.2408       η = 0.95
EKF         2186     0.4718       λ = 0.9
LF-I        796      0.1688       µ = 0.53
LF-II       403      0.0986       µ = 0.45, λ = 0.01

Table 3: Comparison among the three algorithms for the 3-bit parity problem

Lyapunov Based Approach Contd...

Simulation Results: 4-2 Encoder

Algorithm   epochs   time (sec)   parameters
BP          2104     0.3388       η = 0.5
BP          1141     0.1848       η = 0.95
EKF         1945     2.4352       λ = 0.9
LF-I        81       0.1692       µ = 0.29
LF-II       70       0.1466       µ = 0.3, λ = 0.02

Table 4: Comparison among the three algorithms for the 4-2 encoder problem

Lyapunov Based Approach Contd...

Simulation Results: System Identification
We consider the following system identification problem:
$$x(k+1) = \frac{x(k)}{1 + x^{2}(k)} + u^{3}(k) \tag{17}$$
Training: 40000 data points, with inputs randomly generated between 0 and 1.
Test data: a sinusoidal input u = sin(t) for three periodic cycles.

Algorithm   BP          EKF         LF-II
RMS error   0.0539219   0.0807811   0.052037

Table 5: RMS error on the test data (the convergence of EKF is not uniform across different initial conditions)

Lyapunov Based Approach Contd...

Simulation Results: System Identification

[Normalized output, desired vs. actual, on the test data (Tmax = 40000): (a) BP, RMS error 0.0539219; (b) EKF, RMS error 0.0807811; (c) LF-II, RMS error 0.052037]

Figure 11: System identification - the trained network is tested on the test data

Lyapunov Based Approach Contd...

Conclusion
• The LF algorithm is theoretically a globally convergent algorithm.
• The LF algorithms perform better than both the EKF and BP algorithms in terms of speed and accuracy.
• The convergence speed of the modified LF algorithm (LF-II), based on a smooth search in weight space, is found to be independent of the initial weights.
• The LF-I algorithm has an interesting parallel with the conventional BP algorithm, where the fixed learning rate of BP is replaced by an adaptive learning rate.
• The LF algorithms outperform BP and EKF as the complexity of the network increases.
• Choosing the right Lyapunov function candidate is a key issue in our algorithm.

Robot Manipulators
Dynamics:
$$M(q)\ddot{q} + V_m(q,\dot{q})\dot{q} + F(\dot{q}) + G(q) + \tau_d = \tau \tag{18}$$

Properties:
• The inertia matrix $M(q)$ is symmetric, positive definite, and bounded so that $\mu_1 I \le M(q) \le \mu_2 I$ for all $q(t)$.
• The Coriolis/centripetal vector $V_m(q,\dot{q})\dot{q}$ is quadratic in $\dot{q}$ and bounded so that $\|V_m\dot{q}\| \le v_B\,\|\dot{q}\|^{2}$.
• The Coriolis/centripetal matrix can always be selected so that $S(q,\dot{q}) \equiv \dot{M}(q) - 2V_m(q,\dot{q})$ is skew symmetric; therefore $x^{T}Sx = 0$ for all vectors $x$.
• The gravity vector is bounded so that $\|G(q)\| \le g_B$.
• The disturbances are bounded so that $\|\tau_d(t)\| \le d_B$.

Manipulator Control
• Overview of various control techniques
  – PID controllers
    ∗ Most widely used for manipulators
    ∗ Cannot be used under high speed and accuracy requirements
    ∗ Lose accuracy when certain parameters change - no learning capability
    ∗ High gains may make the system susceptible to high frequency noise
  – Traditional adaptive control techniques
    ∗ Spong, Vidyasagar, Kokotovic, Slotine and Li
    ∗ Assumption of linearity in the unknown system parameters: $f(x) = R(x)\xi$
    ∗ Computation of the regression matrix $R(x)$ is quite complex and must be repeated for each different manipulator

Manipulator control contd...

  – NN based adaptive control
    ∗ NNs possess a universal approximation capability: $f(x) = W^{T}\sigma(V^{T}x) + \epsilon$
    ∗ No LIP assumption needed
    ∗ No tedious computations required - no regression matrices
    ∗ Portable: the same controller can be used for different manipulators
    ∗ Online tuning possible - can deal with parameter variation and external disturbances
    ∗ NN control down the time-line:
      · Narendra and Parthasarathy, 1990-91: system identification and control
      · Chen and Khalil, 1992: NN based adaptive control
      · Levin and Narendra, 1993: dynamic back propagation, feedback linearization
      · Delgado, Kambhampati and Warwick, 1995: dynamic neural network input/output linearization

Manipulator control contd...

      · Behera, 1996: network inversion control
      · Lewis, Jagannathan and Yesildirek, 1998-99: NN control for robots
      · Kwan and Lewis, 1998-2003: NN based robust backstepping control
• What I have done
  – Nothing new :-(
  – Implemented and analyzed recent nonlinear control techniques for robot manipulators, including flexible-link and flexible-joint manipulators.

Manipulator control contd...

NN based controls

• NN based simple adaptive control
• NN based robust backstepping control
• NN control based on the singular perturbation technique
• Summary

Manipulator control contd...

NN based simple adaptive control
Consider a single-link manipulator (affine form):
$$a\ddot{q} + b\sin(q) = u$$
Let us define $e = q_d - q$, $\dot{e} = \dot{q}_d - \dot{q}$ and $r = \dot{e} + \lambda e$. Then we have
$$a\dot{r} = a(\ddot{q}_d + \lambda\dot{e}) + b\sin(q) - u = F(\ddot{q}_d, q, \dot{q}, e) - u$$
A NN can be used to approximate this nonlinear function: suppose $F = W^{T}\phi + \epsilon$ and $\hat{F} = \hat{W}^{T}\phi$. By choosing the control as $u = \hat{F} + Kr$, the closed-loop error dynamics can be written as
$$a\dot{r} = \tilde{W}^{T}\Phi - Kr, \qquad \tilde{W} = W - \hat{W}$$

NN based simple adaptive control contd... Consider a Lyapunov function

V =

1 T 1 ˜ T ΓW ˜) r r + tr(W 2 2

The time derivative of the Lyapunov function

˜ T Φr T − W ˜ T ΓW ˆ˙ ) V˙ = −r T Kr + tr(W Choose a weight update law (Kwan and Lewis, 2000)

ˆ ˆ˙ = ΓΦr T − mΓkrkW W Then, 2   W W M 2 ∗ M ˜ kF − ) −m V˙ ≤ −krk λmin krk + m(kW 2 4 } {z |

where λmin is the minimum eigenvalue of K and kW kF July 12, 2004

Intelligent Systems Laboratory

≤ WM

Page 56

Manipulator control contd...

NN based simple adaptive control contd...
The term marked $*$ above will be positive if either of the following conditions is satisfied:
$$\|r\| > m\,\frac{W_M^{2}}{4\lambda_{\min}} \qquad \text{or} \qquad \|\tilde{W}\|_F > W_M$$
Thus, $\dot{V}$ is negative outside a compact set. The control gain $K$ can always be selected so as to satisfy the above two conditions. Hence, both $\|r\|$ and $\|\tilde{W}\|_F$ are UUB (uniformly ultimately bounded).
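A minimal simulation sketch of this controller for the single-link arm $a\ddot{q} + b\sin(q) = u$ used on the next slide, with a Gaussian radial-basis vector playing the role of $\phi$ and Euler integration; the basis centres, gains, damping term m, and reference trajectory are illustrative assumptions, not the exact setup used here.

```python
import numpy as np

a, b = 1.0, 9.8                                # m*l^2 and m*g*l for m = l = 1
lam, K, Gamma, m_mod = 5.0, 30.0, 300.0, 0.01  # controller gains (illustrative)
dt, T = 0.001, 10.0

centers = np.linspace(-2.0, 2.0, 15)           # RBF centres (assumption)
def phi(q, qdot, e):
    z = np.array([q, qdot, e])
    return np.exp(-np.sum((z[:, None] - centers) ** 2, axis=0))

W_hat = np.zeros(centers.size)                 # NN weights start at zero
q = qdot = 0.0
for k in range(int(T / dt)):
    t = k * dt
    qd, qd_dot = np.sin(t), np.cos(t)          # reference trajectory
    e, e_dot = qd - q, qd_dot - qdot
    r = e_dot + lam * e                        # filtered tracking error
    F_hat = W_hat @ phi(q, qdot, e)            # NN estimate of F(.)
    u = F_hat + K * r                          # control law u = F_hat + K*r
    # Weight update: W_hat_dot = Gamma*phi*r - m*Gamma*||r||*W_hat
    W_hat += dt * (Gamma * phi(q, qdot, e) * r
                   - m_mod * Gamma * abs(r) * W_hat)
    qdd = (u - b * np.sin(q)) / a              # plant dynamics
    qdot += dt * qdd
    q += dt * qdot
```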


Manipulator control contd...

Single Link Manipulator: Simulation
The plant model: $ml^{2}\ddot{q} + mgl\sin(q) = u$, where $m = 1\,\mathrm{kg}$, $l = 1\,\mathrm{m}$ and $g = 9.8\,\mathrm{m/s^{2}}$.

[NN based controller with λ = 5, K = 30, Γ = 300: (a) link position tracking, (b) link velocity tracking, (c) position tracking error]

Figure 12: Neural network based adaptive controller: λ = 5, K = 30 and Γ = 300

Manipulator control contd...

SLM: Simulation Contd...

[Results for λ = 5, K = 30, Γ = 300: (a) control input, (b) unknown function approximation, (c) PD control input with Kp = 400, Kd = 80]

Figure 13: NN controller: comparison

Manipulator control contd...

NN based Robust Back-stepping Control
System description (strict feedback form):
$$\dot{x}_1 = F_1(x_1) + G_1(x_1)\,x_2$$
$$\dot{x}_2 = F_2(x_1, x_2) + G_2(x_1, x_2)\,x_3$$
$$\dot{x}_3 = F_3(x_1, x_2, x_3) + G_3(x_1, x_2, x_3)\,x_4$$
$$\vdots$$
$$\dot{x}_m = F_m(x_1, x_2, \ldots, x_m) + G_m(x_1, x_2, \ldots, x_m)\,u$$
$F_i, G_i \in \mathbb{R}^{n\times n}$, $i = 1, 2, \ldots, m$, are nonlinear functions that contain both parametric and nonparametric uncertainties, and the $G_i$ are known and invertible.
Note: back-stepping can be applied if the internal dynamics are stabilizable.

Manipulator control contd...

Back-stepping method (Krstic, Kanellakopoulos and Kokotovic, 1995):
• Choose $x_2 = x_{2d}$ such that the system $\dot{x}_1 = F_1(x_1, x_{2d})$ gives stable tracking of $x_{1d}$ by $x_1$.
• Choose $x_3 = x_{3d}$ such that $x_2$ tracks $x_{2d}$, and so on.
• Finally, select $u(t)$ such that $x_m$ tracks $x_{md}$.

Problems with traditional robust and adaptive backstepping control:
• Computation of regression matrices at each design step is tedious and time consuming.
• The LIP assumption is quite restrictive and may not hold in practical situations.

Manipulator control contd...

Robust Back-stepping Control Design using NN
• Step 1: Design fictitious controllers for $x_2, x_3, \ldots, x_m$.
Consider the first subsystem
$$\dot{x}_1 = F_1(x_1) + G_1(x_1)\,x_2$$
Choosing the fictitious controller for $x_2$ as
$$x_{2d} = G_1^{-1}\big(-\hat{F}_1 + \dot{x}_{1d} - K_1 e_1\big)$$
the closed-loop error dynamics of the above subsystem become
$$\dot{e}_1 = F_1 - \hat{F}_1 - K_1 e_1 + G_1 e_2$$
with $e_2 = x_2 - x_{2d}$. We will have to ensure that $e_2$ is bounded.

Manipulator control contd...

NN back-stepping contd...
Differentiating $e_2$ gives
$$\dot{e}_2 = \dot{x}_2 - \dot{x}_{2d} = F_2 + G_2 x_3 - \dot{x}_{2d}$$
Choosing a fictitious controller for $x_3$ of the form
$$x_{3d} = G_2^{-1}\big(-\hat{F}_2 + \dot{x}_{2d} - K_2 e_2 - G_1^{T} e_1\big)$$
we get the closed-loop error dynamics of the second subsystem as
$$\dot{e}_2 = F_2 - \hat{F}_2 - K_2 e_2 - G_1^{T} e_1 + G_2 e_3$$
with $e_3 = x_3 - x_{3d}$. And the design continues...

Manipulator control contd...

NN back-stepping contd...
• Step 2: Design of the actual control $u$.
Differentiating $e_m = x_m - x_{md}$ yields
$$\dot{e}_m = \dot{x}_m - \dot{x}_{md} = F_m + G_m u - \dot{x}_{md}$$
Choosing the controller of the form
$$u = G_m^{-1}\big(-\hat{F}_m + \dot{x}_{md} - K_m e_m - G_{m-1}^{T} e_{m-1}\big)$$
gives the following dynamics for the error $e_m$:
$$\dot{e}_m = F_m - \hat{F}_m - K_m e_m - G_{m-1}^{T} e_{m-1}$$
Here, $K_i$, $i = 1, 2, \ldots, m$, are design parameters and the $\hat{F}_i$ are NN outputs.
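A minimal sketch of one evaluation of this design for the two-subsystem (m = 2) case, with the NN estimates $\hat{F}_i$ supplied as callables; the names, and the assumption that the derivative of the virtual control $\dot{x}_{2d}$ is available, are illustrative simplifications.

```python
import numpy as np

def backstep_control(x1, x2, x1d, x1d_dot, x2d_dot,
                     F1_hat, F2_hat, G1, G2, K1, K2):
    """Two-step NN backstepping: returns the virtual control x2d and actual u.

    F1_hat, F2_hat : NN estimates of the unknown functions F1(x1), F2(x1, x2)
    G1, G2         : known, invertible input matrices
    x2d_dot        : time derivative of the virtual control (assumed available)
    """
    e1 = x1 - x1d
    # Step 1: virtual (fictitious) control for x2
    x2d = np.linalg.solve(G1, -F1_hat(x1) + x1d_dot - K1 @ e1)
    e2 = x2 - x2d
    # Step 2: actual control u
    u = np.linalg.solve(G2, -F2_hat(x1, x2) + x2d_dot - K2 @ e2 - G1.T @ e1)
    return x2d, u
```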

Intelligent Systems Laboratory

Page 64

Manipulator control contd...

NN back-stepping contd...

[Block diagram: NNs 1 to m supply F̂1, ..., F̂m; the virtual controls x2d, ..., xmd and the actual control u drive the plant states x1, ..., xm]

Figure 14: Backstepping NN control of nonlinear systems in "strict-feedback" form


Manipulator control contd...

The error dynamics of the whole plant:
$$\dot{e}_1 = \tilde{W}_1^{T}\phi_1 - K_1 e_1 + G_1 e_2 + \epsilon_1$$
$$\dot{e}_2 = \tilde{W}_2^{T}\phi_2 - K_2 e_2 - G_1^{T} e_1 + G_2 e_3 + \epsilon_2$$
$$\dot{e}_3 = \tilde{W}_3^{T}\phi_3 - K_3 e_3 - G_2^{T} e_2 + G_3 e_4 + \epsilon_3$$
$$\vdots$$
$$\dot{e}_m = \tilde{W}_m^{T}\phi_m - K_m e_m - G_{m-1}^{T} e_{m-1} + \epsilon_m$$
Define $\zeta = [e_1^{T}\,e_2^{T}\ldots e_m^{T}]^{T}$, $\tilde{Z} = \mathrm{diag}\{\tilde{W}_1, \tilde{W}_2, \ldots, \tilde{W}_m\}$, $K = \mathrm{diag}\{K_1, K_2, \ldots, K_m\}$, and $\phi = [\phi_1^{T}\,\phi_2^{T}\ldots\phi_m^{T}]^{T}$.
The closed-loop system error dynamics can be rewritten as
$$\dot{\zeta} = -K\zeta + \tilde{Z}^{T}\phi + H\zeta + \epsilon$$
The weight update algorithm is chosen so that the term $\tilde{Z}^{T}\phi$ is bounded.

Intelligent Systems Laboratory

Page 66

Manipulator control contd...

NN Back-stepping: RLED
Rigid Link Electrically Driven (RLED) robot manipulator model:
$$M(q)\ddot{q} + V_m(q,\dot{q})\dot{q} + G(q) + F(\dot{q}) + T_L = K_T I$$
$$L\dot{I} + R(I,\dot{q}) + T_E = u_E$$
where $L, R, K_T \in \mathbb{R}^{n\times n}$ are positive definite, constant, diagonal matrices, and
$$a = l_2^{2}m_2 + l_1^{2}(m_1 + m_2), \quad b = 2l_1 l_2 m_2, \quad c = l_2^{2}m_2, \quad d = (m_1 + m_2)l_1 g_0, \quad e = m_2 l_2 g_0$$
$$M(q) = \begin{bmatrix} a + b\cos(q_2) & c + \frac{b}{2}\cos(q_2) \\[4pt] c + \frac{b}{2}\cos(q_2) & c \end{bmatrix}$$

Manipulator control contd...

NN Back-stepping: RLED model contd...
$$V_m\dot{q} = \begin{bmatrix} -b\sin(q_2)\,(\dot{q}_1\dot{q}_2 + 0.5\,\dot{q}_2^{2}) \\[4pt] 0.5\,b\sin(q_2)\,\dot{q}_1^{2} \end{bmatrix}
\qquad
G(q) = \begin{bmatrix} d\cos(q_1) + e\cos(q_1 + q_2) \\[4pt] e\cos(q_1 + q_2) \end{bmatrix}$$
The parameter values are $l_1 = 1\,\mathrm{m}$, $l_2 = 1\,\mathrm{m}$, $m_1 = 0.8\,\mathrm{kg}$, $m_2 = 2.3\,\mathrm{kg}$, and $g_0 = 9.8\,\mathrm{m/s^{2}}$. The motor parameters are $R_j = 1\,\Omega$, $L_j = 0.01\,\mathrm{H}$, $K_{jT} = 2.0\,\mathrm{Nm/A}$, $j = 1, 2$.
The inputs to the NNs are given by
$$x = [\,\zeta_1^{T}\ \zeta_2^{T}\ \cos(q)^{T}\ \sin(q)^{T}\ I^{T}\ 1\,]^{T}$$
where $\zeta_1 = \ddot{q}_d + \Lambda\dot{e}$ and $\zeta_2 = \dot{q}_d + \Lambda e$.

Manipulator control contd...

RLED: NN BS Design
The robot dynamics in terms of the filtered error can be expressed as
$$M\dot{r} = F_1 - V_m r + T_L - K_T I \tag{19}$$
The complicated nonlinear function $F_1$ is defined as
$$F_1 = M(q)(\ddot{q}_d + \Lambda\dot{e}) + V_m(q,\dot{q})(\dot{q}_d + \Lambda e) + G(q) + F(\dot{q})$$
Step 1: Choose $I = I_d$ such that $r \to 0$. Then (19) can be rewritten as
$$M\dot{r} = F_1 - V_m r + T_L - K_T I_d + K_T\eta \tag{20}$$
where $\eta = I_d - I$ is an error signal. Choose $I_d$ as
$$I_d = \frac{1}{k_1}\big[\hat{F}_1 + k_\tau r + \nu_\tau\big] \tag{21}$$

Manipulator control contd...

RLED: NN BS Design contd...
$$M\dot{r} = -V_m r + \tilde{W}_1^{T}\phi_1 - \frac{K_T}{k_1}k_\tau r + \varepsilon_1 + T_L
+ \Big(I - \frac{K_T}{k_1}\Big)\hat{W}_1^{T}\phi_1 - \frac{K_T}{k_1}\nu_\tau + K_T\eta \tag{22}$$
Step 2: Choose $u_E$ such that $\eta \to 0$. Differentiating $\eta = I_d - I$, we get
$$L\dot{\eta} = F_2 + T_E - u_E \tag{23}$$
where $F_2$ is a very complicated nonlinear function of $q$, $\dot{q}$, $r$, $I_d$, and $I$. The control signal $u_E$ can be chosen as
$$u_E = \hat{F}_2 + k_\nu\eta \tag{24}$$
The closed-loop dynamics:
$$L\dot{\eta} = \tilde{W}_2^{T}\phi_2 + \varepsilon_2 + T_E - k_\nu\eta$$

RLED Simulation I

+

-

Id

Fictitious controller

Fˆ1

2-layer Neural Network



kν qd q˙d

-

e e˙



I]

r

Controller

Fˆ2

RLED System

q q˙

2-layer Neural Network

Figure 15: NN controller structure for RLED July 12, 2004

Intelligent Systems Laboratory

Page 71

Manipulator control contd...

RLED simulation Results

[PD control of the 2-link RLED: (a) link 1 position tracking, (b) link 2 position tracking, (c) control torques]

Figure 16: PD control of 2-link RLED, K = diag{100, 100} and Kd = diag{20, 20}

Manipulator control contd...

RLED simulation Results

[NN backstepping control of the 2-link RLED: (a) link 1 position tracking, (b) link 2 position tracking, (c) control inputs]

Figure 17: NN backstepping control of 2-link RLED, kτ = diag{60, 60} and kν = diag{1.5, 1.5}

Manipulator control contd...

RLED simulation Results

Observations:
• The problem of weight initialization does not arise, since the weights Ŵi(0) are taken as zero.
• The tuning algorithm ensures that the weights are bounded.
• The PD controller requires large gains, which might excite high-frequency unmodeled dynamics.
• The NN controller improves tracking performance.

Manipulator control contd...

Singular Perturbation Design

• A large class of systems has slow dynamics and fast dynamics that operate on very different time scales and are essentially independent.
• The control can be decomposed into a fast portion and a slow portion, which act on different time scales, increasing the control effectiveness.
• Lewis, Jagannathan and Yesildirek, 1999

Manipulator control contd...

Singular Perturbation Design
A large class of nonlinear systems can be described by the equations
$$\dot{x}_1 = f_1(x, u)$$
$$\varepsilon\,\dot{x}_2 = f_{21}(x_1) + f_{22}(x_1)\,x_2 + g_2(x_1)\,u$$
where the state is $x = [x_1^{T}\ x_2^{T}]^{T}$ and $\varepsilon$ is a small parameter.

Pendubot
Once link 2 reaches close to the top position ($|q_{d1} - q_1| \le 0.2$, $|q_{d2} - q_2| \le 0.2$), we switch over to a linear state feedback controller, where the state feedback gain is given by
$$K = [\,-32.68 \quad -7.14 \quad -32.76 \quad -4.88\,]$$

Pendubot

Simulation Results

20

60

qd(1) Link - 1 Link - 2

15

Pendubot

40

Link - 1 Link - 2

30

40

10

5

Control Input (u)

Link Velocities

Link Positions

20

20

0

10 0 -10 -20

-20

0

-30

0

5

10

Time (seconds)

15

(a) Position Tracking

20

-40

0

5

10

Time (seconds)

15

(b) Velocity Tracking

20

-40

0

5

10

Time (seconds)

15

(c) Control Torque

Figure 24: NN based Partial Feedback Linearization control of Pendubot

July 12, 2004

Intelligent Systems Laboratory

Page 97

20

Pendubot

Conclusion

• The NN helps in taking unmodelled dynamics into account.
• The control input is bounded.
• Choosing a proper initial trajectory for link 1 is crucial.

Summary
• Two algorithms, namely RTRL and EKF, have been proposed for the MNN. Performance comparison has been done on both SISO and MIMO systems.
• A novel learning algorithm based on a Lyapunov function has been proposed for FFNs. Performance comparison has been done with BP and EKF.
• Various types of NN controllers have been analyzed and implemented for robot manipulators.
• A new NN based partial feedback linearization control has been suggested for the Pendubot.

Thank You
