Deep Belief Networks
Intro to Deep Neural Networks, 26th to 27th August 2016
Supervised by: Dr. Asifullah
Presented by: Muhammad Islam (DCIS, PIEAS)
Pattern Recognition Lab, Department of Computer Science & Information Sciences, Pakistan Institute of Engineering & Applied Sciences
Motivation: Applications of DBNs
• Object Recognition
Deep Belief Network
Applications of DBNs (cont.)
• Image Retrieval
Applications of DBNs (cont.)
• Document Modeling
Applications of DBNs (cont.)
• Document Retrieval
Background
• Deep neural networks did exist before 2000
• However, training deep networks was quite difficult
• Hence other, simpler algorithms prevailed
• Now the situation has changed
Background (cont.)
• Deep belief networks became popular in 2006
• The most prominent work was done by Geoffrey Hinton
• A great deal of research followed
• And now more powerful tools exist
Introduction
• Deep belief networks are basically directed graphs
• Built by stacking individual units called Restricted Boltzmann Machines (RBMs)
[Figure: an example DBN for 28 x 28 pixel images, with two hidden layers of 500 units each and a top layer of 2000 units]
Introduction (cont.)
• The keyword "belief" indicates an important property: like a belief (Bayesian) network, a DBN defines a joint probability distribution over its variables
Boltzmann Machines
• Stochastic generative model
• Estimates the distribution of observations (say, p(image)) instead of their classification p(label|image)
• One input layer and one hidden layer
• Defines an energy of the network and a probability for each unit's state
Restricted Boltzmann Machines
• Bipartite graph structure with two layers
• A visible layer (binary or Gaussian units) and a hidden layer (usually binary units)
• No intra-layer connections
• Given one layer, the units of the other layer are conditionally independent of each other
BM vs RBM
[Figure: a Boltzmann Machine with connections within layers vs an RBM with connections only between the hidden layer, h, and the visible layer, v]
Restricted Boltzmann Machines
• Two quantities define an RBM:
• the states of all the units: obtained through a probability distribution
• the weights of the network: obtained through training (Contrastive Divergence)
Restricted Boltzmann Machines
• The energy of the RBM is defined as:

E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_i \sum_j v_i w_{i,j} h_j

where E is the energy of the given configuration, and a_i, b_j and w_{i,j} are the visible-layer biases, the hidden-layer biases and the connection weights, respectively.
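As a concrete illustration, the energy above can be computed directly with NumPy (a minimal sketch; the function and variable names are illustrative, not from any particular library):

```python
import numpy as np

# Energy of a joint configuration (v, h) of an RBM:
#   E(v, h) = -a.v - b.h - v.W.h
# (a: visible biases, b: hidden biases, W: connection weights)
def rbm_energy(v, h, a, b, W):
    return -(a @ v) - (b @ h) - (v @ W @ h)

rng = np.random.default_rng(0)
v = rng.integers(0, 2, size=4).astype(float)  # binary visible states
h = rng.integers(0, 2, size=3).astype(float)  # binary hidden states
a = np.zeros(4)                               # visible biases
b = np.zeros(3)                               # hidden biases
W = rng.normal(0.0, 0.1, size=(4, 3))         # connection weights
E = rbm_energy(v, h, a, b, W)
```

Lower-energy configurations are assigned higher probability under the Boltzmann distribution, which is what the P(v) formula on the next slide expresses.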
Restricted Boltzmann Machines
• The distribution of the visible layer of the RBM is given by

P(v) = \frac{1}{Z} \sum_h e^{-E(v, h)}

where Z is the partition function, defined as the sum of e^{-E(v, h)} over all possible configurations {v, h}.
• The probability that a hidden unit j is on (binary state 1) is

P(h_j = 1 \mid v) = \sigma\left( b_j + \sum_{i=1}^{m} w_{i,j} v_i \right)

where \sigma is the logistic sigmoid function.
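The conditional P(h_j = 1 | v) leads directly to a sampling routine for the hidden layer (a sketch; `sample_hidden` is an illustrative name, not a library function):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Sample binary hidden states given visible states, using
#   P(h_j = 1 | v) = sigmoid(b_j + sum_i w_ij v_i)
def sample_hidden(v, b, W, rng):
    p = sigmoid(b + v @ W)                        # per-unit activation probabilities
    h = (rng.random(p.shape) < p).astype(float)   # Bernoulli samples
    return h, p

rng = np.random.default_rng(0)
v = np.array([1.0, 0.0, 1.0])          # a visible configuration
W = rng.normal(0.0, 0.1, size=(3, 2))  # visible-to-hidden weights
b = np.zeros(2)                        # hidden biases
h, p = sample_hidden(v, b, W, rng)
```

By the symmetry of the model, the visible units can be sampled the same way from the hidden states, with W transposed and the visible biases a_i in place of b_j.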
Restricted Boltzmann Machines
• For calculating the update of a particular weight between two units, the gradient of the log-likelihood is

\frac{\partial \log p(v)}{\partial w_{i,j}} = \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{model}

hence

\Delta w_{i,j} = \varepsilon \left( \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{model} \right)

where \varepsilon is the learning rate.
Training an RBM
Contrastive Divergence
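Contrastive divergence approximates the intractable \langle v_i h_j \rangle_{model} term with statistics taken after a single Gibbs step (CD-1). A minimal sketch for a binary RBM, with illustrative names throughout:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One CD-1 update for a binary RBM.
# <v h>_data uses the data-driven hidden probabilities;
# <v h>_model is approximated after one Gibbs step (the CD-1 shortcut).
def cd1_update(v0, a, b, W, lr, rng):
    ph0 = sigmoid(b + v0 @ W)                          # P(h=1 | v0)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)   # sample hidden states
    pv1 = sigmoid(a + h0 @ W.T)                        # reconstruct visibles
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(b + v1 @ W)                          # hidden probs for v1
    W_new = W + lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
    a_new = a + lr * (v0 - v1)                         # visible bias update
    b_new = b + lr * (ph0 - ph1)                       # hidden bias update
    return W_new, a_new, b_new

rng = np.random.default_rng(0)
v0 = np.array([1.0, 0.0, 1.0, 1.0])    # one training example
W = rng.normal(0.0, 0.1, size=(4, 3))
a, b = np.zeros(4), np.zeros(3)
W, a, b = cd1_update(v0, a, b, W, lr=0.1, rng=rng)
```

In practice the updates are averaged over mini-batches and repeated for many epochs; using probabilities rather than sampled states for the final statistics is a common variance-reduction choice.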
Training DBNs
• First train a layer of features that receives input directly from the pixels.
• Then treat the activations of the trained features as if they were pixels, and learn features of features in a second hidden layer.
• It can be proved that each time we add another layer of features, we improve a variational lower bound on the log probability of the training data.
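The greedy layer-by-layer recipe above can be sketched as follows (`train_rbm` is a stand-in for a real RBM trainer such as contrastive divergence; here it only returns random weights so that the stacking logic itself stays runnable):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Stand-in RBM trainer: a real version would run contrastive divergence.
def train_rbm(data, n_hidden, rng):
    W = rng.normal(0.0, 0.01, size=(data.shape[1], n_hidden))
    b = np.zeros(n_hidden)
    return W, b

# Greedy layer-wise pretraining: each RBM is trained on the hidden
# activations of the previous one, treating them "as if they were pixels".
def pretrain_dbn(data, layer_sizes, rng):
    layers = []
    x = data
    for n_hidden in layer_sizes:
        W, b = train_rbm(x, n_hidden, rng)
        layers.append((W, b))
        x = sigmoid(b + x @ W)    # features become the next layer's input
    return layers

rng = np.random.default_rng(0)
images = rng.random((100, 784))   # e.g. flattened 28 x 28 images
layers = pretrain_dbn(images, [500, 500, 2000], rng)
```

The layer sizes here mirror the example architecture from the introduction (784 -> 500 -> 500 -> 2000); after pretraining, the stack is typically fine-tuned for the task at hand.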
References
• DBN lecture by Geoffrey Hinton; videos and slides at http://videolectures.net/mlss09uk_hinton_dbn/