DEEP NEURAL NETWORKS FOR MODELING NONLINEAR DYNAMICS
A comparison of Shannon's cross entropy and mean squared error
Najeeb Khan and Ian Stavness, Department of Computer Science, University of Saskatchewan

INTRODUCTION
We used deep neural networks for the control of a two-link planar arm model.
• Arm reaching movements can be modeled as a mapping [1]: $\{x_i, x_f, T\} \rightarrow \{x(t), u(t)\}$ (1)
• Highly non-linear dimensionality reduction using deep autoencoders
• Autoencoder training using Mean Squared Error (MSE) vs Cross Entropy (CE); a short code sketch of both criteria follows this list:
$J_{MSE} = \frac{1}{K} \sum_{k=1}^{K} (x_k - \hat{x}_k)^2$ (2)
$J_{CE} = -\sum_{k=1}^{K} \left[ x_k \log \hat{x}_k + (1 - x_k) \log(1 - \hat{x}_k) \right]$ (3)
• MSE assumes independent Gaussian residuals; CE is an average measure of information.
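A minimal sketch of the two criteria in Eqs. (2) and (3), written in NumPy; the array names x (targets) and x_hat (reconstructions) are illustrative, not taken from the authors' code.

import numpy as np

def j_mse(x, x_hat):
    # Eq. (2): mean squared error, which assumes independent Gaussian residuals.
    return np.mean((x - x_hat) ** 2)

def j_ce(x, x_hat, eps=1e-12):
    # Eq. (3): cross entropy, treating each unit as a Bernoulli probability,
    # so reconstructions must lie in (0, 1), e.g. sigmoid outputs.
    x_hat = np.clip(x_hat, eps, 1.0 - eps)  # guard against log(0)
    return -np.sum(x * np.log(x_hat) + (1.0 - x) * np.log(1.0 - x_hat))

x = np.array([0.2, 0.8, 0.5])
x_hat = np.array([0.25, 0.70, 0.55])
print(j_mse(x, x_hat), j_ce(x, x_hat))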
Figure 1: Two-link planar arm model (adapted from Berniker et al., Nat. Neurosci. 2008).
Figure 2: A rectangular region in the joint space (shoulder angle θ1 vs. elbow angle θ2) transforms into a non-rectangular region in the hand space (x vs. y).

METHODS
Data set generation (Figure 3): generate random initial and final points χ → compute minimum-jerk trajectories → apply the inverse dynamics of the arm → torque control trajectory. A sketch of the minimum-jerk step follows.
Figure 3: Data set generation.
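The minimum-jerk step can be sketched with the standard fifth-order polynomial profile (Flash and Hogan, 1985); the sampling range, duration, and step count below are illustrative assumptions, not values from the poster.

import numpy as np

def minimum_jerk(x_i, x_f, n_steps=50):
    # Minimum-jerk position profile from x_i to x_f over a movement of duration T:
    # x(t) = x_i + (x_f - x_i) * (10 s^3 - 15 s^4 + 6 s^5), with s = t / T.
    s = np.linspace(0.0, 1.0, n_steps)
    shape = 10 * s**3 - 15 * s**4 + 6 * s**5
    return x_i + (x_f - x_i) * shape[:, None]

rng = np.random.default_rng(0)
chi_i, chi_f = rng.uniform(-0.5, 0.5, size=(2, 2))  # assumed workspace bounds
traj = minimum_jerk(chi_i, chi_f)  # (n_steps, 2) hand-space trajectory
# The torque control trajectory then follows from the arm's inverse dynamics.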
Unsupervised layer-wise pre-training [2] was used for dimensionality reduction of the torque trajectories; a code sketch follows Figure 4:
(a) Learning hidden features α from the torque trajectory (inputs τ1-τ8, reconstructions τ̂1-τ̂8)
(b) Learning hidden features β from the hidden features α (reconstructions α̂1-α̂6)
(c) Pre-training a deep autoencoder (torque trajectory → hidden layer α → hidden layer β → reconstructed trajectory)
(d) Deep network predicting the torque trajectory from the initial and final states χ1, χ2
Figure 4: Unsupervised pre-training for torque trajectory (a-c) dimensionality reduction and (d) prediction.
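A minimal sketch of greedy layer-wise pre-training in the spirit of [2], using plain sigmoid autoencoders trained by batch gradient descent rather than the conjugate-gradient optimizer implied by Figure 9; the hidden sizes (50, then 4) follow the figure captions, while the 8-dimensional toy input (mirroring the τ1-τ8 labels), learning rate, and epoch count are assumptions.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.1, epochs=500, seed=0):
    # Train one sigmoid autoencoder on X under the cross-entropy criterion (Eq. 3).
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_in)); b2 = np.zeros(n_in)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)           # hidden features
        X_hat = sigmoid(H @ W2 + b2)       # reconstruction
        d_out = (X_hat - X) / len(X)       # gradient of CE through the output sigmoid
        d_hid = (d_out @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ d_out); b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_hid); b1 -= lr * d_hid.sum(axis=0)
    return W1, b1, sigmoid(X @ W1 + b1)

tau = np.random.default_rng(1).uniform(0.0, 1.0, (200, 8))  # toy torque trajectories
_, _, alpha = train_autoencoder(tau, n_hidden=50)   # (a) hidden features alpha
_, _, beta = train_autoencoder(alpha, n_hidden=4)   # (b) hidden features beta
# (c) the stacked weights then initialize a deep autoencoder for fine-tuning.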
A deep network was then used to map the initial and final state to the torque trajectory.

RESULTS
• Evaluation metric: root mean squared error (RMSE)
• 5 repetitions of 10-fold cross validation (the data set is partitioned into k folds, each serving once as the test fold); a sketch of this protocol follows Figure 5.
Figure 5: 10-fold cross validation.
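A sketch of the evaluation protocol under the assumption that each held-out fold is scored by reconstruction RMSE; train_and_reconstruct is a hypothetical stand-in for training an autoencoder on the training folds and reconstructing the test fold.

import numpy as np

def rmse(x, x_hat):
    return np.sqrt(np.mean((x - x_hat) ** 2))

def cross_validate(X, train_and_reconstruct, k=10, repetitions=5, seed=0):
    # 5 x 10-fold cross validation: average test RMSE over all folds and repetitions.
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(repetitions):
        idx = rng.permutation(len(X))
        for fold in np.array_split(idx, k):
            train = np.setdiff1d(idx, fold)
            x_hat = train_and_reconstruct(X[train], X[fold])
            scores.append(rmse(X[fold], x_hat))
    return np.mean(scores), np.std(scores)

# Trivial baseline: "reconstruct" each test item as the training mean.
# mean_rmse, sd = cross_validate(X, lambda tr, te: np.tile(tr.mean(axis=0), (len(te), 1)))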
Figure 6: Mean and 95 percent confidence intervals of test reconstruction error (RMSE vs. number of epochs) for the first autoencoder with hidden-layer size 50, comparing Cross Entropy and Mean Squared Error training.
Figure 7: Mean and 95 percent confidence intervals of test reconstruction error for the second autoencoder with hidden-layer size 4.
Figure 8: Mean and 95 percent confidence intervals of test reconstruction error for the deep autoencoder.
Figure 9: Impact of hyper-parameters (minibatch sizes 50 and 500; 3 vs. 10 conjugate-gradient line searches) on the performance of CE vs. MSE for training: autoencoder with hidden layer 50 (left), autoencoder with hidden layer 4 (center), deep autoencoder (right).
CONCLUSIONS
• First detailed evaluation of learning criteria for deep autoencoders.
• Empirically showed that CE achieves a lower reconstruction RMSE than MSE.
• These results are independent of hyper-parameters such as minibatch size and the number of conjugate-gradient line searches.
• As future work, the impact of data set size and of regularization on the cost functions may be evaluated.
REFERENCES
[1] Max Berniker and Konrad P. Kording. Deep networks for motor control functions. Frontiers in Computational Neuroscience, 9, 2015.
[2] Yoshua Bengio. Greedy layer-wise training of deep networks. NIPS, 19:153, 2007.