Deep Neural Network and Transfer Learning


Workshop on Intro to Deep Neural Networks 26th to 27th August 2016

Presented by: Aqsa Saeed Qureshi
Supervised by: Dr. Asifullah Khan (DCIS, PIEAS)

Outline

- Learning feature hierarchies (deep learning)
- Auto-encoder
- Deep Belief Net
- Transfer learning
- Transfer learning in deep neural networks

Deep Learning Overview

- Train networks with many layers (vs. shallow nets with just a couple of layers)
- Multiple layers work to build an improved feature space

Learning feature hierarchies/Deep learning

“Confronted with an array of pixels, no computer inherently knows the difference between a house, a tree and a cat. Deep learning is the guy with the paint brush.”

http://deeplearning4j.org/whydeeplearning.html

Image features

- Features = local detectors
- Combined to make a prediction (in reality, features are more low-level)

[Figure: local detectors for nose, eye, eye and mouth are combined to predict "Face!"]

Standard image classification approach

Input -> Extract features -> Use simple classifier (e.g., logistic regression, SVMs) -> Face

Computer vision features: SIFT, Spin image, HoG, RIFT, Textons, GLOH

Slide credit: Honglak Lee

Many hand-crafted features exist…

Computer vision features: SIFT, Spin image, HoG, RIFT, Textons, GLOH

… but they are very painful to design.

Slide credit: Honglak Lee

Change image classification approach?

Input -> Extract features -> Use simple classifier (e.g., logistic regression, SVMs) -> Face

Computer vision features: SIFT, Spin image, HoG, RIFT, Textons, GLOH

Can we learn features from data?

Slide credit: Honglak Lee

Why feature hierarchies

[Figure: feature hierarchy, from top to bottom]
- object models
- object parts (combination of edges)
- edges
- pixels

http://ufldl.stanford.edu/eccv10-tutorial/

Deep learning algorithms

- Deep Belief Network (DBN) (Hinton)
- Deep sparse auto-encoders (Bengio)

[Other related work: LeCun, Lee, Yuille, Ng …]

Deep learning with autoencoders

- Logistic regression
- Neural network
- Sparse autoencoder
- Deep autoencoder

Logistic regression

Draw a logistic regression unit as:

[Figure: a single logistic unit with inputs x1, x2, x3 and a bias input +1]

http://ufldl.stanford.edu/eccv10-tutorial/
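The logistic unit drawn on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the tutorial's own code: the function names and the example weights are hypothetical, and the bias `b` stands for the weight on the constant +1 input.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def logistic_unit(x, w, b):
    """One logistic unit: weighted sum of the inputs plus bias, then sigmoid.

    The bias b plays the role of the weight on the constant +1 input.
    """
    return sigmoid(np.dot(w, x) + b)

# Hypothetical example values, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0])      # inputs x1, x2, x3
w = np.array([0.5, -0.25, 0.0])    # one weight per input
b = 0.0                            # weight on the +1 input
output = logistic_unit(x, w, b)    # weighted sum is 0, so the output is 0.5
```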

Neural Network

String a lot of logistic units together. Example 3-layer network:

[Figure: Layer 1 with inputs x1, x2, x3 and bias +1; Layer 2 with hidden units a1, a2, a3 and bias +1; Layer 3 with the output unit]

http://ufldl.stanford.edu/eccv10-tutorial/

Example 4-layer network with 2 output units:

[Figure: Layer 1 with inputs x1, x2, x3 and bias +1; Layers 2 and 3 are hidden layers, each with a bias +1; Layer 4 has 2 output units]

http://ufldl.stanford.edu/eccv10-tutorial/

Training a neural network

Given training set (x1, y1), (x2, y2), (x3, y3), …

Adjust parameters θ (for every node) to make the network output match the labels: $h_\theta(x_i) \approx y_i$.

(Use gradient descent. “Backpropagation” algorithm. Susceptible to local optima.)

http://ufldl.stanford.edu/eccv10-tutorial/
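Gradient descent with backpropagation can be sketched on a tiny network as follows. This is an illustrative NumPy sketch under stated assumptions: one hidden layer of sigmoid units, a half mean-squared-error loss, full-batch updates, and hypothetical hyperparameters (`n_hidden`, `lr`, `n_steps`). None of these choices come from the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_network(X, y, n_hidden=4, lr=0.5, n_steps=2000, seed=0):
    """Train a 3-layer sigmoid network with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.5, size=(n_hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(n_steps):
        # Forward pass: input -> hidden activations -> output.
        a1 = sigmoid(X @ W1 + b1)
        out = sigmoid(a1 @ W2 + b2)
        err = out - y
        losses.append(0.5 * float(np.mean(err ** 2)))
        # Backward pass: propagate the error through each sigmoid layer.
        d_out = err * out * (1.0 - out)
        d_a1 = (d_out @ W2.T) * a1 * (1.0 - a1)
        # Gradient descent step on every parameter.
        W2 -= lr * (a1.T @ d_out) / n
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_a1) / n
        b1 -= lr * d_a1.mean(axis=0)
    return (W1, b1, W2, b2), losses

# XOR as a small training set (x_i, y_i); chosen only for illustration.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])
params, losses = train_network(X, y)
```

Different seeds can end in different solutions, which is one way to see the "susceptible to local optima" remark in practice.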

Unsupervised feature learning with a neural network

Autoencoder. Network is trained to output the input (learn the identity function).

[Figure: Layer 1 with inputs x1 to x6 and bias +1; Layer 2 with hidden units a1, a2, a3 and bias +1; Layer 3 with outputs x1 to x6]

Trivial solution unless:
- Constrain number of units in Layer 2 (learn compressed representation), or
- Constrain Layer 2 to be sparse.

http://ufldl.stanford.edu/eccv10-tutorial/
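A bottleneck autoencoder of the shape shown here (6 inputs, 3 hidden units, 6 outputs) might be sketched like this. The code is an illustration only: the squared-error loss, sigmoid units, random toy data and hyperparameters are assumptions, not details given in the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.5, n_steps=500, seed=0):
    """Train a one-hidden-layer autoencoder to reproduce its own input."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, n_hidden))   # encoder weights
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, d))   # decoder weights
    b2 = np.zeros(d)
    losses = []
    for _ in range(n_steps):
        h = sigmoid(X @ W1 + b1)          # Layer 2: compressed code
        X_hat = sigmoid(h @ W2 + b2)      # Layer 3: reconstruction
        err = X_hat - X                   # the target is the input itself
        losses.append(0.5 * float(np.mean(np.sum(err ** 2, axis=1))))
        d_out = err * X_hat * (1.0 - X_hat)
        d_h = (d_out @ W2.T) * h * (1.0 - h)
        W2 -= lr * (h.T @ d_out) / n
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_h) / n
        b1 -= lr * d_h.mean(axis=0)
    return (W1, b1), losses

# Toy unlabeled data with 6 features in [0, 1].
rng = np.random.default_rng(1)
X = rng.random((50, 6))
(W_enc, b_enc), losses = train_autoencoder(X, n_hidden=3)
codes = sigmoid(X @ W_enc + b_enc)   # learned 3-dimensional features
```

Because Layer 2 has fewer units than the input, the network cannot simply copy its input through, which is the first of the two constraints listed above.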

So: multiple layers make sense

Many-layer neural network architectures should be capable of learning the true underlying features and ‘feature logic’, and therefore generalise very well …

But, until very recently, our weight-learning algorithms simply did not work on multi-layer architectures.

http://ufldl.stanford.edu/eccv10-tutorial/

The new way to train multi-layer NNs…

Train this layer first, then this layer, then this layer, then this layer, finally this layer.

EACH of the (non-output) layers is trained to be an auto-encoder. Basically, it is forced to learn good features that describe what comes from the previous layer.

http://ufldl.stanford.edu/eccv10-tutorial/

An auto-encoder is trained, with an absolutely standard weight-adjustment algorithm, to reproduce the input.

By making this happen with (many) fewer units than the inputs, this forces the ‘hidden layer’ units to become good feature detectors.

Intermediate layers are each trained to be auto-encoders.

http://ufldl.stanford.edu/eccv10-tutorial/
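The layer-by-layer procedure can be sketched as a loop that trains one auto-encoder per layer and feeds its hidden activations to the next. This is a hedged NumPy sketch: `train_autoencoder` and `greedy_pretrain` are hypothetical helper names, the layer sizes are arbitrary, and the final supervised output layer is left out.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.5, n_steps=300, seed=0):
    """Train one auto-encoder on X and return only its encoder (W, b)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(n_steps):
        h = sigmoid(X @ W1 + b1)
        X_hat = sigmoid(h @ W2 + b2)
        d_out = (X_hat - X) * X_hat * (1.0 - X_hat)
        d_h = (d_out @ W2.T) * h * (1.0 - h)
        W2 -= lr * (h.T @ d_out) / n
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_h) / n
        b1 -= lr * d_h.mean(axis=0)
    return W1, b1

def greedy_pretrain(X, layer_sizes):
    """Train each hidden layer as an auto-encoder on the previous layer's output."""
    encoders = []
    layer_input = X
    for i, n_hidden in enumerate(layer_sizes):
        W, b = train_autoencoder(layer_input, n_hidden, seed=i)
        encoders.append((W, b))
        # The features of this layer become the "input" of the next one.
        layer_input = sigmoid(layer_input @ W + b)
    return encoders, layer_input

rng = np.random.default_rng(0)
X = rng.random((40, 8))                       # toy unlabeled data
encoders, top_features = greedy_pretrain(X, layer_sizes=[5, 3])
```

`top_features` is what a final supervised layer would be trained on.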

Auto-Encoders

http://ufldl.stanford.edu/eccv10-tutorial/

Stacked Auto-Encoders

http://ufldl.stanford.edu/eccv10-tutorial/

Sparse Encoders

At any given time, many/most of the features will have a 0 value.
- Thus there is an implicit compression each time, but with varying nodes
- This leads to more localist, variable-length encodings, where a particular node with a nonzero value signifies the presence of a feature (small set of bases)
- A type of simplicity bottleneck (regularizer)
- This is easier for subsequent layers to use for learning

Sparsity regularization: The sparsity regularizer attempts to enforce a constraint on the sparsity of the output from the hidden layer.

L2 regularization: When training a sparse autoencoder, it is possible to make the sparsity regularizer small by increasing the values of the weights w. Adding a regularization term on the weights to the cost function prevents this from happening.
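One common way to combine the two regularizers with the reconstruction error is sketched below. The slides do not say which sparsity penalty is used; the KL-divergence penalty here, the target activation `rho`, and the coefficients `beta` and `lam` are all assumptions made for illustration.

```python
import numpy as np

def kl_sparsity(rho, rho_hat):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat), summed over hidden units."""
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))

def sparse_autoencoder_cost(X, X_hat, H, weights, rho=0.05, beta=3.0, lam=1e-3):
    """Reconstruction error + sparsity penalty + L2 weight penalty.

    X, X_hat : inputs and reconstructions, shape (n, d)
    H        : hidden activations in (0, 1), shape (n, n_hidden)
    weights  : list of weight matrices to regularize
    """
    n = X.shape[0]
    reconstruction = 0.5 * np.sum((X_hat - X) ** 2) / n
    rho_hat = H.mean(axis=0)                 # average activation per hidden unit
    sparsity = beta * kl_sparsity(rho, rho_hat)
    l2 = 0.5 * lam * sum(np.sum(W ** 2) for W in weights)
    return reconstruction + sparsity + l2

# Toy check: perfect reconstruction, zero weights, two activation levels.
X = np.ones((10, 6))
W = [np.zeros((6, 4)), np.zeros((4, 6))]
cost_sparse = sparse_autoencoder_cost(X, X, np.full((10, 4), 0.05), W)
cost_dense = sparse_autoencoder_cost(X, X, np.full((10, 4), 0.5), W)
```

In the toy check, hidden units that are active half the time are penalized, while units whose average activation matches `rho` contribute essentially nothing.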

Unsupervised feature learning with a neural network

Training a sparse autoencoder.

Given unlabeled training set x1, x2, …

[Figure: hidden units a1, a2, a3]

Final layer trained to predict class based on outputs from previous layers

http://ufldl.stanford.edu/eccv10-tutorial/

And that’s that

- That’s the basic idea
- There are many, many types of deep learning: different kinds of autoencoder, variations on architectures and training algorithms, etc…
- Very fast growing area …

Auto-Encoders

- A type of unsupervised learning which tries to discover generic features of the data
- Learn identity function by learning important sub-features (not by just passing through data).
- Compression, etc.
- Can use just new features in the new training set or concatenate both

Deep Belief Net

Deep Belief Net (DBN) is another algorithm for learning a feature hierarchy. Building block: 2-layer graphical model (Restricted Boltzmann Machine).

“Deep belief nets are probabilistic generative models that are composed of multiple layers of stochastic latent variables. The latent variables typically have binary values and are often called hidden units or feature detectors. [...] The lower layers receive top-down, directed connections from the layers above. The states of the units in the lowest layer represent a data vector.”

Motivation: The robustness and efficiency with which humans can recognize objects has long been an intriguing challenge in computational intelligence. Theoretical results suggest that deep architectures are fundamental to learning complex functions that can represent high-level abstractions (e.g. vision, language) [Bengio, 2009].

[Figure: deep versus shallow architecture]

http://ufldl.stanford.edu/eccv10-tutorial/

DBNs are composed of several Restricted Boltzmann Machines (RBMs) stacked on top of each other.

An RBM is an energy-based generative model that consists of a layer of binary visible units, v, and a layer of binary hidden units, h.

Given an observed state, the energy of the joint configuration of the visible and hidden units (v, h) is given by (1):
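The formula for equation (1) is missing from this copy. As a hedged reconstruction, the standard energy of a binary RBM is shown below. The symbol names ($W_{ij}$ for the weight between visible unit $i$ and hidden unit $j$, $b_i$ and $c_j$ for the visible and hidden biases) are assumptions, not taken from the slides.

$$E(v, h) = -\sum_i b_i v_i - \sum_j c_j h_j - \sum_i \sum_j v_i W_{ij} h_j \tag{1}$$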

The RBM defines a joint probability over (v, h):

where Z is the partition function, obtained by summing $e^{-E(v, h)}$ over all possible (v, h) configurations:
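The two formulas this slide refers to are missing from this copy. In the standard formulation of an RBM with energy function $E(v, h)$, they read as follows (the primed summation variables are a notational assumption):

$$P(v, h) = \frac{e^{-E(v, h)}}{Z}, \qquad Z = \sum_{v', h'} e^{-E(v', h')}$$

Low-energy configurations are therefore the most probable ones, and $Z$ only serves to make the probabilities sum to 1.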

Given a random input configuration v, the state of the hidden unit j is set to 1 with probability:

Similarly, given a random hidden vector, h, the state of the visible unit i can be set to 1 with probability:
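The two conditional probabilities are missing from this copy. For a binary RBM they take the standard form below. The symbols are assumptions: $W_{ij}$ is the weight between visible unit $i$ and hidden unit $j$, $b_i$ and $c_j$ are the visible and hidden biases, and $\sigma$ is the logistic function.

$$P(h_j = 1 \mid v) = \sigma\Big(c_j + \sum_i v_i W_{ij}\Big), \qquad P(v_i = 1 \mid h) = \sigma\Big(b_i + \sum_j W_{ij} h_j\Big), \qquad \sigma(x) = \frac{1}{1 + e^{-x}}$$

Because there are no connections within a layer, all hidden units can be sampled in parallel given v, and all visible units in parallel given h.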

Gibbs Sampling:

[Figure: Gibbs sampling]

Alternating Gibbs Sampling:

[Figure: alternating Gibbs sampling]

CONTRASTIVE DIVERGENCE (CD–k):
- v(0) ← x
- Compute the binary (feature) states of the hidden units, h(0), using v(0)
- for n ← 1 to k
  - Compute the “reconstruction” states for the visible units, v(n), using h(n−1)
  - Compute the “reconstruction” states for the hidden units, h(n), using v(n)
- end for
- Update the weights and biases from the difference between the data statistics (v(0), h(0)) and the reconstruction statistics (v(k), h(k)).
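The CD–k steps can be sketched in NumPy as below. The update rule shown on the original slide is not available in this copy, so this sketch uses the commonly used form: each parameter moves by the learning rate times (data statistics minus reconstruction statistics). The names `W`, `b` (visible biases), `c` (hidden biases), the learning rate, and the use of hidden probabilities rather than sampled states in the statistics are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd_k_update(x, W, b, c, k=1, lr=0.1, rng=None):
    """One CD-k update for a binary RBM on a batch x of shape (n, n_visible).

    W: (n_visible, n_hidden) weights, b: visible biases, c: hidden biases.
    Returns updated copies of (W, b, c).
    """
    rng = np.random.default_rng() if rng is None else rng
    v0 = x
    p_h0 = sigmoid(v0 @ W + c)                         # P(h = 1 | v(0))
    h = (rng.random(p_h0.shape) < p_h0).astype(float)  # binary states h(0)
    v, p_h = v0, p_h0
    for _ in range(k):
        p_v = sigmoid(h @ W.T + b)                     # reconstruct visible units
        v = (rng.random(p_v.shape) < p_v).astype(float)
        p_h = sigmoid(v @ W + c)                       # reconstruct hidden units
        h = (rng.random(p_h.shape) < p_h).astype(float)
    n = x.shape[0]
    # Data statistics minus reconstruction statistics, averaged over the batch.
    W_new = W + lr * (v0.T @ p_h0 - v.T @ p_h) / n
    b_new = b + lr * (v0 - v).mean(axis=0)
    c_new = c + lr * (p_h0 - p_h).mean(axis=0)
    return W_new, b_new, c_new

# Toy batch of 20 binary vectors with 6 visible units; 3 hidden units.
rng = np.random.default_rng(0)
x = (rng.random((20, 6)) < 0.5).astype(float)
W0, b0, c0 = np.zeros((6, 3)), np.zeros(6), np.zeros(3)
W1, b1, c1 = cd_k_update(x, W0, b0, c0, k=2, rng=rng)
```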

Deep learning examples

Convolutional DBN on face images

[Figure: learned feature hierarchy, from pixels to edges to object parts (combination of edges) to object models]

Learning of object parts

Examples of learned object parts from object categories: Faces, Cars, Elephants, Chairs

Deep Net with Greedy Layer-Wise Training

[Figure: Original Inputs -> Unsupervised Learning -> New Feature Space -> Supervised Learning -> ML Model]

http://axon.cs.byu.edu/~martinez/classes/678/Slides/Deep-Learning.pptx

TRANSFER OF LEARNING

http://www.slideshare.net/ocmonmoveonpeople/transfer-of-learning-by-lorraine-anoran?qid=2d5fdd3b-13e2-449b-9410dea9dcb2ed56&v=&b=&from_search=5

Transfer of Learning

- The study of the dependency of human conduct, learning or performance on prior experience.
- [Thorndike and Woodworth, 1901] explored how individuals would transfer learning in one context to another context that shares similar characteristics.
  - C++ -> Java
  - Maths/Physics -> Computer Science/Economics

Transfer Learning

- The ability of a system to recognize and apply knowledge and skills learned in previous tasks to novel tasks or new domains, which share some commonality.
- Given a target task, how do we identify the commonality between the task and previous (source) tasks, and transfer knowledge from the previous tasks to the target one?

Positive vs. Negative Transfer of Learning

Positive transfer:
- when learning in one context improves performance in some other context

Negative transfer:
- when learning in one context has a negative impact on performance in another context

http://www.slideshare.net/ocmonmoveonpeople/transfer-of-learning-by-lorraine-anoran?qid=2d5fdd3b-13e2-449b-9410dea9dcb2ed56&v=&b=&from_search=5

Motivation

[Figure: a model and a new example to classify (?)]

Assumptions:
1. Training and test data are from the same distribution
2. Training and test data are in the same feature space

Examples: Web-document Classification

[Figure: web documents in the categories Physics, Machine Learning and Life Science; a model; a new document to classify (?)]

Learn a new model:
1. Collect new labeled data
2. Build new model

Reuse & adapt the already learned model!

Examples: Image Classification

[Figure: Task One: Features -> Model One. Task Two (Cars vs. Motorcycles): the features from Task One are reused as the features for Task Two, which feed Model Two.]

Traditional Machine Learning vs. Transfer

[Figure: Traditional machine learning: different tasks, each with its own learning system. Transfer learning: knowledge from the learning system for the source task is passed to the learning system for the target task.]

Traditional ML vs. TL

Humans can learn in many domains. Humans can also transfer from one domain to other domains.

[Figure: training items and test items under traditional ML in multiple domains vs. transfer of learning across domains]

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 10, OCTOBER 2010

http://www.slideshare.net/butest/ppt-3860159

Notation

Domain: It consists of two components: a feature space $\mathcal{X}$ and a marginal distribution $P(X)$, where $X = \{x_1, \dots, x_n\}$ and each $x_i \in \mathcal{X}$.

In general, if two domains are different, then they may have different feature spaces or different marginal distributions.

Task: Given a specific domain and a label space $\mathcal{Y}$, for each $x_i$ in the domain, to predict its corresponding label $y_i$.

In general, if two tasks are different, then they may have different label spaces or different conditional distributions $P(Y \mid X)$.

For simplicity, we only consider at most two domains and two tasks.
- Source domain: $\mathcal{D}_S$
- Task in the source domain: $\mathcal{T}_S$
- Target domain: $\mathcal{D}_T$
- Task in the target domain: $\mathcal{T}_T$

http://www.slideshare.net/butest/ppt-3860159

Why Transfer Learning?

- In some domains, labeled data are in short supply.
- In some domains, the calibration effort is very expensive.
- In some domains, the learning process is time consuming.

- How to extract knowledge learnt from related domains to help learning in a target domain with only a few labeled data?
- How to extract knowledge learnt from related domains to speed up learning in a target domain?

- Transfer learning techniques may help!

http://www.slideshare.net/butest/ppt-3860159

Settings of Transfer Learning

Transfer learning settings        Labeled data in a source domain   Labeled data in a target domain   Tasks
Inductive Transfer Learning       × or √                            √                                 Classification, Regression, …
Transductive Transfer Learning    √                                 ×                                 Classification, Regression, …
Unsupervised Transfer Learning    ×                                 ×                                 Clustering, …

http://www.slideshare.net/butest/ppt-3860159

An overview of various settings of transfer learning

Transfer Learning
- Labeled data are available in a target domain: Inductive Transfer Learning
  - Case 1: No labeled data in a source domain: Self-taught Learning
  - Case 2: Labeled data are available in a source domain, and source and target tasks are learnt simultaneously: Multi-task Learning
- Labeled data are available only in a source domain: Transductive Transfer Learning
  - Assumption: different domains but single task: Domain Adaptation
  - Assumption: single domain and single task: Sample Selection Bias / Covariate Shift
- No labeled data in both source and target domain: Unsupervised Transfer Learning

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 22, NO. 10, OCTOBER 2010

Conclusions

- Transfer learning re-uses source knowledge to help a target learner
- Self-taught learning transfers features learned from unlabeled data

Challenges of deep learning


Deep learning workflow

[Figure: lots of labeled data is split into a training set (80%) and a validation set (20%); the training set is used to learn the deep neural net model, and the validation set is used to validate it]

http://www.slideshare.net/AmazonWebServices/cmp305-deep-learning-on-aws-made-easycmp305
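The 80%/20% split in this workflow might look like the following sketch. The helper name `train_validation_split`, the shuffling step and the fixed seed are assumptions added for illustration.

```python
import numpy as np

def train_validation_split(X, y, train_fraction=0.8, seed=0):
    """Shuffle the labeled data, then split it into training and validation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_train = int(round(train_fraction * len(X)))
    train_idx, val_idx = order[:n_train], order[n_train:]
    return (X[train_idx], y[train_idx]), (X[val_idx], y[val_idx])

# Toy labeled data: 100 examples with one feature each.
X = np.arange(100, dtype=float).reshape(100, 1)
y = np.arange(100)
(X_train, y_train), (X_val, y_val) = train_validation_split(X, y)
```

The model is fitted on `X_train, y_train` only; `X_val, y_val` is held back to validate it.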

Deep learning score card

Pros
- Enables learning of features rather than hand tuning
- Impressive performance gains on
  - Computer vision
  - Speech recognition
- Potential for much more impact

Cons
- Computationally really expensive
- Requires a lot of data for high accuracy
- Extremely hard to tune
  - Choice of architecture
  - Parameter types
  - Hyperparameters

Deep features: Deep learning + Transfer learning

Transfer learning: idea

Instead of training a deep network from scratch for your task:
- Take a network trained on a different domain for a different source task
- Adapt it for your domain and your target task

http://www.slideshare.net/xavigiro/deep-learning-for-computer-vision-transfer-learning-and-domain-adaptation-upc-2016

Algorithms: Self-Taught Learning

Framework:
- Source: unlabeled data set
- Target: labeled data set

Build classifier for cars and motorbikes.

http://ufldl.stanford.edu/eccv10-tutorial/

Transfer learning: Use data from one domain to help learn on another

[Figure: Lots of data -> Learn neural net -> Great accuracy. Some data -> Neural net as feature extractor + simple classifier -> Great accuracy on new problem.]

What’s learned in a neural net

[Figure: neural net trained for Task 1. Earlier layers: more generic, can be used as feature extractor. Final layers: very specific to Task 1.]

Transfer learning in more detail…

[Figure: neural net trained for Task 1. Earlier layers (more generic, can be used as feature extractor): keep weights fixed! Final layers (very specific to Task 1): for Task 2, learn only the end part, using a simple classifier, e.g., logistic regression, SVMs, to predict the class.]
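The "keep weights fixed, learn only the end part" recipe can be sketched as follows. This is an illustration under loud assumptions: `W_frozen` and `b_frozen` are random stand-ins for weights that would really come from a network trained on Task 1, the Task 2 data is synthetic, and the simple classifier is a hand-written logistic regression.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def extract_features(X, W_frozen, b_frozen):
    """Run inputs through the frozen lower layer; its weights are never updated."""
    return sigmoid(X @ W_frozen + b_frozen)

def train_logistic_regression(F, y, lr=0.5, n_steps=500):
    """Fit the only trainable part: a logistic regression on the extracted features."""
    w = np.zeros(F.shape[1])
    b = 0.0
    for _ in range(n_steps):
        p = sigmoid(F @ w + b)
        grad = p - y                       # gradient of the log loss w.r.t. the logits
        w -= lr * (F.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

rng = np.random.default_rng(0)
# Stand-in for a layer learned on Task 1 (assumption: real weights would be loaded here).
W_frozen = rng.normal(size=(10, 6))
b_frozen = np.zeros(6)
W_before = W_frozen.copy()

# Small synthetic labeled set for Task 2.
X_task2 = rng.normal(size=(30, 10))
y_task2 = (X_task2[:, 0] > 0).astype(float)

F = extract_features(X_task2, W_frozen, b_frozen)
w, b = train_logistic_regression(F, y_task2)
```

Only `w` and `b` change during training, so the number of parameters fitted on the small Task 2 data set stays small.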

Example: PASCAL VOC 2007

- Standard classification benchmark, 20 classes, ~10K images, 50% train, 50% test
- Deep networks can have many parameters (e.g. 60M in AlexNet)
- Direct training (from scratch) using only 5K training images can be problematic.
- How can we use deep networks in this setting?

http://www.slideshare.net/xavigiro/deep-learning-for-computer-vision-transfer-learning-and-domain-adaptation-upc-2016

“Off-the-shelf”

Idea: use outputs of one or more layers of a network trained on a different task as generic feature detectors. Train a new shallow model on these features.

http://www.slideshare.net/xavigiro/deep-learning-for-computer-vision-transfer-learning-and-domain-adaptation-upc-2016

Works surprisingly well in practice! Surpassed or on par with state-of-the-art in several tasks in 2014.

Image classification:
- PASCAL VOC 2007
- Oxford flowers
- CUB Bird dataset
- MIT indoors

Image retrieval:
- Paris 6k
- Holidays
- UKBench

Razavian et al, CNN Features off-the-shelf: an Astounding Baseline for Recognition, CVPRW 2014 http://arxiv.org/abs/1403.6382

Can we do better than off-the-shelf features?

Domain adaptation

Fine-tuning: supervised domain adaptation

Train deep net on “nearby” task for which it is easy to get labels, using standard backprop
- E.g. ImageNet classification
- Pseudo classes from augmented data

Cut off top layer(s) of network and replace with supervised objective for target domain

Fine-tune network using backprop with labels for target domain until validation loss starts to increase

http://www.slideshare.net/xavigiro/deep-learning-for-computer-vision-transfer-learning-and-domain-adaptation-upc-2016

Freeze or fine-tune?

Bottom n layers can be frozen or fine-tuned.
- Frozen: not updated during backprop
- Fine-tuned: updated during backprop

Which to do depends on target task:
- Freeze: target task labels are scarce, and we want to avoid overfitting
- Fine-tune: target task labels are more plentiful

In general, we can set learning rates to be different for each layer to find a tradeoff between freezing and fine-tuning.

http://www.slideshare.net/xavigiro/deep-learning-for-computer-vision-transfer-learning-and-domain-adaptation-upc-2016
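Per-layer learning rates can be sketched as a plain gradient step in which each layer has its own rate. The helper `sgd_step`, the layer shapes and the rate values are hypothetical; the gradients here are constant placeholders rather than real backprop output.

```python
import numpy as np

def sgd_step(params, grads, layer_lrs):
    """One gradient step with a separate learning rate per layer.

    A rate of 0 freezes a layer; a small rate fine-tunes it gently.
    """
    return [p - lr * g for p, g, lr in zip(params, grads, layer_lrs)]

# Three layers, bottom to top, with placeholder gradients.
params = [np.ones((3, 3)), np.ones((3, 2)), np.ones((2, 1))]
grads = [np.full_like(p, 0.5) for p in params]

# Bottom layer frozen, middle layer fine-tuned slowly, top layer trained fastest.
layer_lrs = [0.0, 0.01, 0.1]
new_params = sgd_step(params, grads, layer_lrs)
```

Setting a rate between 0 and the top-layer rate gives the tradeoff between freezing and fine-tuning mentioned above.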

How transferable are features?

- Lower layers: more general features. Transfer very well to other tasks.
- Higher layers: more task specific.

Fine-tuning improves generalization when sufficient examples are available. Transfer learning and fine-tuning often lead to better performance than training from scratch on the target dataset. Even features transferred from distant tasks are often better than random initial weights!

Summary

- Possible to train very large models on small data by using transfer learning and domain adaptation
- Off-the-shelf features work very well in various domains and tasks
- Lower layers of network contain very generic features, higher layers more task-specific features
- Supervised domain adaptation via fine-tuning almost always improves performance

Questions…

Thank You