VISUALISING DEEP NEURAL NETWORK DECISIONS: PREDICTION DIFFERENCE ANALYSIS

Luisa M. Zintgraf (1,2), Taco S. Cohen (1), Tameem Adel (1), Max Welling (1,3)

(1) University of Amsterdam, (2) Vrije Universiteit Brussel, (3) Canadian Institute for Advanced Research

Understanding individual decisions of a DCNN: which features are important?

Approach [1]: How does the prediction change when a feature is unknown?

CONTRIBUTIONS

We extend the prediction difference analysis of [1] with conditional sampling, multivariate analysis, and deep visualisation of hidden layers.

Input features are 'removed' by (approximately) marginalising them out:

p(c | x\i) = Σ_{x_i} p(x_i | x\i) p(c | x\i, x_i)    (1)

where x\i denotes the input x with feature x_i removed. The importance of feature x_i is the difference between p(c | x) and p(c | x\i).
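As a rough illustration of this approach (not the authors' implementation), the Python sketch below estimates the prediction difference using marginal sampling as in [1]; `model_predict` (a batch of inputs → class probabilities) and `feature_samples` (values of x_i collected from training data) are assumed placeholders.

```python
import numpy as np

def prediction_difference(x, i, model_predict, feature_samples, num_samples=10):
    """Estimate the change in p(c | x) when feature i is marginalised out."""
    p_with = model_predict(x[None, :])[0]        # p(c | x) on the intact input

    # Approximate (1) by averaging predictions over sampled values of x_i,
    # here drawn from the marginal p(x_i) as in [1].
    xs = np.repeat(x[None, :], num_samples, axis=0)
    xs[:, i] = np.random.choice(feature_samples, size=num_samples)
    p_without = model_predict(xs).mean(axis=0)   # p(c | x\i)

    # Positive entries: evidence for class c; negative: evidence against it.
    return p_with - p_without
```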

MARGINAL vs. CONDITIONAL SAMPLING

CONDITIONAL SAMPLING

We propose a more accurate approximation than [1], who replace the conditional p(x_i | x\i) in (1) with the marginal distribution p(x_i). Our approach is based on two insights:
• a pixel depends most strongly on a small neighbourhood around it, and
• the conditional of a pixel given its neighbourhood does not depend on the position of the pixel.

RED: evidence for the predicted class; BLUE: evidence against the predicted class; TRANSPARENT: no influence on the prediction. Intensity reflects relative pixel importance.

We therefore approximate the conditional in (1) as

p(x_i | x\i) ≈ p(x_i | x̂\i),

where x̂\i is an (l x l)-sized patch around pixel x_i (excluding x_i itself); a sketch of this sampling step follows below.

A feature now becomes relevant if:
• it is relevant to predict the class of interest, and
• it is hard to predict from the neighbouring pixels.
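A hedged sketch of this sampling step, under the assumption that the patch statistics come from a multivariate Gaussian fitted to training patches beforehand; `mu` and `cov` are that fitted mean and covariance, and the function name is ours, not the released code's.

```python
import numpy as np

def sample_pixel_conditionally(patch_flat, center, mu, cov, num_samples=10):
    """Draw the centre pixel of a flattened (l*l,) patch given the other pixels.

    patch_flat : observed patch values (flattened); entry `center` is x_i
    mu, cov    : mean and covariance of a Gaussian fitted to such patches
    """
    rest = np.delete(np.arange(patch_flat.size), center)

    # Standard Gaussian conditioning: x_i | neighbourhood ~ N(m, v).
    s12 = cov[center, rest]
    s22_inv = np.linalg.inv(cov[np.ix_(rest, rest)])
    m = mu[center] + s12 @ s22_inv @ (patch_flat[rest] - mu[rest])
    v = cov[center, center] - s12 @ s22_inv @ s12

    return np.random.normal(m, np.sqrt(max(v, 1e-12)), size=num_samples)
```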

MULTIVARIATE ANALYSIS

We marginalise pixel patches of size (k x k) instead of single pixels, implemented in a sliding-window fashion. Marginalising different patch sizes k leads to different resolutions.

Visualisation of how different patch sizes (i.e., how many pixels are removed at once) influence the result.
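A minimal sketch of this sliding-window scheme, assuming a grayscale image, a `model_predict` batch function, and a `sample_patch(k)` sampler that provides (k x k) replacement values (all illustrative names, not the released implementation):

```python
import numpy as np

def patch_relevance(img, model_predict, sample_patch, k=10, num_samples=10):
    """Pixel relevance map obtained by marginalising (k x k) windows at once."""
    h, w = img.shape
    relevance = np.zeros((h, w))
    counts = np.zeros((h, w))

    p_full = model_predict(img[None])[0]     # p(c | x) on the intact image
    c = p_full.argmax()                      # explain the top-scoring class

    for top in range(h - k + 1):
        for left in range(w - k + 1):
            # Replace the window with samples to estimate p(c | x\window).
            imgs = np.repeat(img[None], num_samples, axis=0)
            for n in range(num_samples):
                imgs[n, top:top + k, left:left + k] = sample_patch(k)
            p_removed = model_predict(imgs)[:, c].mean()

            # Every pixel in the window gets the same evidence score;
            # overlapping windows are averaged at the end.
            relevance[top:top + k, left:left + k] += p_full[c] - p_removed
            counts[top:top + k, left:left + k] += 1

    return relevance / np.maximum(counts, 1)
```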

PRE-SOFTMAX vs. OUTPUT LAYER

Support for the top-three scoring classes in the pre-softmax layer and in the output layer.

NETWORK COMPARISON

Comparison of explanations from different networks: AlexNet, GoogLeNet, VGG net.

DEEP VISUALISATION

We adapt (1) to analyse the relationship between any two nodes in the network, allowing us to look at activation differences, i.e., to visualise the role of deep convolutional filters.

Visualisation of three different feature maps taken from a middle layer of the GoogLeNet; visualisation of feature maps from three different layers of the GoogLeNet (left to right: increasing depth).
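In the same spirit, a hedged sketch of this adaptation: marginalise an input feature and record the change in an arbitrary unit's activation rather than in the class probability. `get_activation` is a hypothetical hook onto the unit of interest (in practice, a forward hook or an intermediate-output model).

```python
import numpy as np

def activation_difference(x, i, get_activation, feature_samples, num_samples=10):
    """Change in a hidden unit's activation when input feature i is marginalised."""
    a_full = get_activation(x)                   # activation on the intact input

    # Marginalise feature i exactly as for the output layer, but record
    # the hidden unit's activation instead of a class probability.
    xs = np.repeat(x[None, :], num_samples, axis=0)
    xs[:, i] = np.random.choice(feature_samples, size=num_samples)
    a_marg = np.mean([get_activation(x_s) for x_s in xs])

    return a_full - a_marg
```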

MRI CLASSIFICATION EXPLANATIONS

Trust me, I'm a robot.

Explanations of predictions for MRI scans from the two classes "healthy" and "sick"; explanations across different slices of the scan.

EXPERIMENTS

ImageNet: images from the ILSVRC challenge [4] (natural images from 1,000 categories); classifiers: AlexNet [3], VGG net [6], GoogLeNet [5].

MRI: scans from the COBRA dataset [7] (70 healthy subjects and 70 HIV patients); classifier: logistic regression.

REFERENCES
[1] Robnik-Šikonja, Marko, and Igor Kononenko. "Explaining classifications for individual instances." IEEE Transactions on Knowledge and Data Engineering 20.5 (2008): 589-600.
[2] Simonyan, Karen, Andrea Vedaldi, and Andrew Zisserman. "Deep inside convolutional networks: Visualising image classification models and saliency maps." arXiv preprint arXiv:1312.6034 (2013).
[3] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems. 2012.
[4] Russakovsky, Olga, et al. "ImageNet large scale visual recognition challenge." International Journal of Computer Vision 115.3 (2015): 211-252.
[5] Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
[6] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
[7] COBRA dataset, Academic Medical Center (AMC), Amsterdam, The Netherlands.


CORRESPONDING AUTHOR: Luisa Zintgraf, [email protected]

CODE AVAILABLE AT: github.com/lmzintgraf/DeepVis-PredDiff