DEEP NEURAL NETWORKS FOR MODELING NONLINEAR DYNAMICS
A comparison of Shannon's cross entropy and mean squared error
Najeeb Khan and Ian Stavness, Department of Computer Science, University of Saskatchewan

INTRODUCTION
We used deep neural networks for the control of a two-link planar arm model.
• Arm reaching movements can be modeled as a mapping [1]: $\{x_i, x_f, T\} \rightarrow \{x(t), u(t)\}$ (1)
• Highly non-linear dimensionality reduction using deep autoencoders
• Autoencoder training using Mean Squared Error (MSE) vs Cross Entropy (CE); a short code sketch of both criteria follows this list:
$J_{MSE} = \frac{1}{K} \sum_{k=1}^{K} (x_k - \hat{x}_k)^2$ (2)
$J_{CE} = -\sum_{k=1}^{K} \left[ x_k \log \hat{x}_k + (1 - x_k) \log(1 - \hat{x}_k) \right]$ (3)
• MSE assumes independent Gaussian residuals; CE is an average measure of information.
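A minimal sketch of the two criteria in Eqs. (2) and (3), written in NumPy; the array names x (targets) and x_hat (reconstructions) are illustrative, not taken from the authors' code.

import numpy as np

def j_mse(x, x_hat):
    # Eq. (2): mean squared error, which assumes independent Gaussian residuals.
    return np.mean((x - x_hat) ** 2)

def j_ce(x, x_hat, eps=1e-12):
    # Eq. (3): cross entropy, treating each unit as a Bernoulli probability,
    # so reconstructions must lie in (0, 1), e.g. sigmoid outputs.
    x_hat = np.clip(x_hat, eps, 1.0 - eps)  # guard against log(0)
    return -np.sum(x * np.log(x_hat) + (1.0 - x) * np.log(1.0 - x_hat))

x = np.array([0.2, 0.8, 0.5])
x_hat = np.array([0.25, 0.70, 0.55])
print(j_mse(x, x_hat), j_ce(x, x_hat))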
Figure 1: Two-link planar arm model (adapted from Berniker et al., Nat. Neurosci. 2008).
Figure 2: A rectangular region in the joint space (shoulder angle θ1 vs. elbow angle θ2) transforms into a non-rectangular region in the hand space (x vs. y).

METHODS
Data set generation (Figure 3): generate random initial and final points χ → compute minimum-jerk trajectories → apply the inverse dynamics of the arm → torque control trajectory. A sketch of the minimum-jerk step follows.
Figure 3: Data set generation.
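The minimum-jerk step can be sketched with the standard fifth-order polynomial profile (Flash and Hogan, 1985); the sampling range, duration, and step count below are illustrative assumptions, not values from the poster.

import numpy as np

def minimum_jerk(x_i, x_f, n_steps=50):
    # Minimum-jerk position profile from x_i to x_f over a movement of duration T:
    # x(t) = x_i + (x_f - x_i) * (10 s^3 - 15 s^4 + 6 s^5), with s = t / T.
    s = np.linspace(0.0, 1.0, n_steps)
    shape = 10 * s**3 - 15 * s**4 + 6 * s**5
    return x_i + (x_f - x_i) * shape[:, None]

rng = np.random.default_rng(0)
chi_i, chi_f = rng.uniform(-0.5, 0.5, size=(2, 2))  # assumed workspace bounds
traj = minimum_jerk(chi_i, chi_f)  # (n_steps, 2) hand-space trajectory
# The torque control trajectory then follows from the arm's inverse dynamics.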
Unsupervised layer-wise pre-training [2] was used for dimensionality reduction of the torque trajectories; a code sketch follows Figure 4:
(a) Learning hidden features α from the torque trajectory (inputs τ1-τ8, reconstructions τ̂1-τ̂8)
(b) Learning hidden features β from the hidden features α (reconstructions α̂1-α̂6)
(c) Pre-training a deep autoencoder (torque trajectory → hidden layer α → hidden layer β → reconstructed trajectory)
(d) Deep network predicting the torque trajectory from the initial and final states χ1, χ2
Figure 4: Unsupervised pre-training for torque trajectory (a-c) dimensionality reduction and (d) prediction.
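A minimal sketch of greedy layer-wise pre-training in the spirit of [2], using plain sigmoid autoencoders trained by batch gradient descent rather than the conjugate-gradient optimizer implied by Figure 9; the hidden sizes (50, then 4) follow the figure captions, while the 8-dimensional toy input (mirroring the τ1-τ8 labels), learning rate, and epoch count are assumptions.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.1, epochs=500, seed=0):
    # Train one sigmoid autoencoder on X under the cross-entropy criterion (Eq. 3).
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_in)); b2 = np.zeros(n_in)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)           # hidden features
        X_hat = sigmoid(H @ W2 + b2)       # reconstruction
        d_out = (X_hat - X) / len(X)       # gradient of CE through the output sigmoid
        d_hid = (d_out @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ d_out); b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_hid); b1 -= lr * d_hid.sum(axis=0)
    return W1, b1, sigmoid(X @ W1 + b1)

tau = np.random.default_rng(1).uniform(0.0, 1.0, (200, 8))  # toy torque trajectories
_, _, alpha = train_autoencoder(tau, n_hidden=50)   # (a) hidden features alpha
_, _, beta = train_autoencoder(alpha, n_hidden=4)   # (b) hidden features beta
# (c) the stacked weights then initialize a deep autoencoder for fine-tuning.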
A deep network was then used to map the initial and final state to the torque trajectory.

RESULTS
• Evaluation metric: root mean squared error (RMSE)
• 5 repetitions of 10-fold cross validation (the data set is partitioned into k folds, each serving once as the test fold); a sketch of this protocol follows Figure 5.
Figure 5: 10-fold cross validation.
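A sketch of the evaluation protocol under the assumption that each held-out fold is scored by reconstruction RMSE; train_and_reconstruct is a hypothetical stand-in for training an autoencoder on the training folds and reconstructing the test fold.

import numpy as np

def rmse(x, x_hat):
    return np.sqrt(np.mean((x - x_hat) ** 2))

def cross_validate(X, train_and_reconstruct, k=10, repetitions=5, seed=0):
    # 5 x 10-fold cross validation: average test RMSE over all folds and repetitions.
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(repetitions):
        idx = rng.permutation(len(X))
        for fold in np.array_split(idx, k):
            train = np.setdiff1d(idx, fold)
            x_hat = train_and_reconstruct(X[train], X[fold])
            scores.append(rmse(X[fold], x_hat))
    return np.mean(scores), np.std(scores)

# Trivial baseline: "reconstruct" each test item as the training mean.
# mean_rmse, sd = cross_validate(X, lambda tr, te: np.tile(tr.mean(axis=0), (len(te), 1)))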
Figure 6: Mean and 95 percent confidence intervals of test reconstruction error (RMSE vs. number of epochs) for the first autoencoder with hidden-layer size 50, comparing Cross Entropy and Mean Squared Error training.
Figure 7: Mean and 95 percent confidence intervals of test reconstruction error for the second autoencoder with hidden-layer size 4.
Figure 8: Mean and 95 percent confidence intervals of test reconstruction error for the deep autoencoder.
Figure 9: Impact of hyper-parameters (minibatch sizes 50 and 500; 3 vs. 10 conjugate-gradient line searches) on the performance of CE vs. MSE for training: autoencoder with hidden layer 50 (left), autoencoder with hidden layer 4 (center), deep autoencoder (right).
CONCLUSIONS
• First detailed evaluation of learning criteria for deep autoencoders.
• Empirically showed that CE achieves a lower reconstruction RMSE than MSE.
• These results are independent of hyper-parameters such as minibatch size and the number of conjugate-gradient line searches.
• As future work, the impact of data set size and of regularization on the cost functions may be evaluated.
REFERENCES
[1] Max Berniker and Konrad P. Kording. Deep networks for motor control functions. Frontiers in Computational Neuroscience, 9, 2015.
[2] Yoshua Bengio. Greedy layer-wise training of deep networks. NIPS, 19:153, 2007.