Computational Photography
CVPR Easter School, March 14–18, 2011, ANU Kioloa Coastal Campus
Michael S. Brown School of Computing National University of Singapore
Goal of this tutorial • Introduce you to the wonderful world of “Computational Photography” – A hot topic in Computer Vision (and Computer Graphics)
• Plan: to give you an overview of several areas where computational photography is making inroads – Computational Optics – Computational Illumination – Computational Processing (and other cool stuff)
• Attempt to do this in ~3.5 hours. – This is traditionally a full-semester course – I’ve had to perform academic triage to size down the materials, so my notes are by no means exhaustive
Outline
• Part 1: Preliminaries – Motivation and preliminaries – Refresher on image processing and basics of photography
• Part 2: Computational Optics – Catadioptric cameras, flutter shutter camera, coded aperture, extended depth of field camera, hybrid cameras
BREAK
• Part 3: Computational Illumination – Radiometric calibration, flash/no-flash low-light imaging, multi-flash (gradient camera), dual photography
• Part 4: Computational Processing (and other cool stuff) – User-assisted segmentation, Poisson image editing, seam carving, image colorization
Added bonuses: we will touch on Markov Random Fields, the bilateral filter, the Poisson equation, and various other useful tools
Part 1 Motivation and Preliminaries
Modern photography pipeline
Scene Radiance
Pre-Camera: lens filter, lens, shutter, aperture
In-Camera: CCD response (RAW), demosaicing, gamut mapping, preferred color selection, tone-mapping
Starting point: reality (in radiance)
Final output: camera output (sRGB)
Ending point: better than reality (in RGB)
Post-Processing: touch-up, histogram equalization, spatial warping, etc.
Even if we stopped here, the original CCD response potentially has had many levels of processing.
Inconvenient or convenient truth?
• Modern photography is about obtaining “perceptually optimal” images
• Digital photography makes this more possible than ever before
• Images are made to be processed
Let’s be pragmatic
• PhD in Philosophy? – No? Then forget the moral dilemma
• We are “scientists” and engineers – What cool things can we do? – Is it publishable?
[Images: “Nerd (you)”; Bill Freeman, Spacetime photography; Daniel Reetz/Matti Kariloma (futurepicture.org); “Paper needed for graduation, promotion, travel to Hawaii”]
Images are made to be processed
• Fact: Camera images are going to be manipulated.
• Opportunity: This gives us the freedom to “do things” with the knowledge that we will process them later
[Examples: Levin et al.’s coded aperture; Nayar’s catadioptric imaging; Raskar’s multi-flash camera; Tai’s hybrid camera]
So, what should we do?
• Well, first – Do the obvious* – Address limitations of conventional cameras – Requires creativity and engineering
• Second – Think “outside the box” – Go beyond conventional cameras – Requires creativity, possibly beer . . .
*note: obvious does not imply easiest
Examples of conventional photography limitations
Image Blur/Camera Shake
Limited Depth of Field
No or bad coloring
Limited Dynamic Range
Limited Resolution
Sensor noise
Slide idea from Alyosha Efros
Beyond conventional photography
[Examples: non-photorealistic camera; adjustable illumination (camera view / illumination view / static view); beyond static – “space wiggle” (Gasparini); inverse light transport]
Where can we make changes?
Shree Nayar’s (Columbia University) view on “Computational Cameras”...
What else can we do?
Put the “human in the loop” when editing photographs. Similar philosophy to Anton van den Hengel’s interactive image-based modeling.
Excited? Ready to go!!!
We will need some preliminaries first.
Frequency Domain Processing
• For some problems it is easy to think about manipulating an image’s frequencies directly – It is possible to decompose an image into its frequencies – We can manipulate the frequencies (that is, filter them) and then reconstruct the resulting image – This is “frequency domain” processing
• This is a very powerful approach for “global” processing of an image
• There is also a direct relationship between this and spatial-domain filtering
Discrete Fourier Transform
The heart of frequency-domain processing is the Fourier Transform.
2D Discrete Forward Fourier Transform (to the frequency domain):
F(u,v) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j 2\pi (ux/M + vy/N)}
2D Discrete Inverse Fourier Transform (back to the pixel/spatial domain):
f(x,y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u,v)\, e^{j 2\pi (ux/M + vy/N)}
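A minimal MATLAB sketch of the round trip (cameraman.tif ships with the Image Processing Toolbox; note MATLAB applies the 1/MN normalization on the inverse rather than the forward transform):
f  = im2double(imread('cameraman.tif'));
F  = fft2(f);            % forward 2D DFT (no 1/MN factor in MATLAB)
f2 = real(ifft2(F));     % inverse 2D DFT (1/MN applied here)
max(abs(f(:) - f2(:)))   % ~1e-15: the round trip recovers the image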
2D Discrete Fourier Transform
• Converts an image into a set of 2D sinusoidal patterns
• The DFT returns a set of complex coefficients that control both the amplitude and phase of the basis
[Figure: some examples of sinusoidal basis patterns]
Simple way of thinking of the FT
The FT computes a set of complex coefficients F(u,v) = a + ib. Each (u,v) controls a corresponding unique frequency basis. For example:
F(0,0) is the mean of the signal (also called the DC component)
F(0,1) is the first “vertical” basis
F(1,0) is the first “horizontal” basis
F(1,1) is the first mixed basis
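A quick MATLAB check of the DC interpretation (the test image here is arbitrary; MATLAB arrays are 1-indexed, so F(1,1) holds the (u,v) = (0,0) coefficient, and fft2 omits the 1/MN factor used in the forward transform above):
f = magic(8) / 64;     % any small test image
F = fft2(f);
F(1,1) / numel(f)      % equals mean(f(:)): the DC component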
Complex coefficient
The complex coefficient controls the phase and magnitude: F(u,v) \cdot e^{i \cdot stuff} = F(u,v)(\cos(stuff) + i\sin(stuff)) = (a + ib)(\cos(stuff) + i\sin(stuff))
[Figure: basis (0,1) rendered with different F(0,1) coefficients]
The frequency of the sinusoid doesn’t change; only the phase and the amplitude do.
For basis (0,1):
F(0,1) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j 2\pi (0 \cdot x/M + 1 \cdot y/N)}
The coefficient corresponding to this basis is computed from the contributions of all image pixels f(x,y).
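In MATLAB the magnitude and phase of each coefficient can be read off directly (a minimal sketch):
F   = fft2(im2double(imread('cameraman.tif')));
mag = abs(F);       % amplitude of each sinusoidal basis
ph  = angle(F);     % phase of each sinusoidal basis
% mag .* exp(1i * ph) reconstructs the complex coefficients exactly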
Inversing from Frequency Domain
Inverting is just a matter of summing up the basis patterns weighted by their F(u,v) contributions.
[Figure: partial reconstructions after summing only the first 16* basis, then 32, 64, 128, 256, and 512 basis, compared against the original image]
* f(x,y) \approx \sum_{v=0}^{3} \sum_{u=0}^{3} F(u,v)\, e^{j 2\pi (ux/M + vy/N)}  (the first 4 × 4 = 16 coefficients)
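A minimal MATLAB sketch of the same idea, reconstructing from only a low-frequency block of coefficients (the cutoff k is an illustrative choice):
f = im2double(imread('cameraman.tif'));
F = fftshift(fft2(f));                 % move the (0,0) coefficient to the center
[M, N] = size(f);  k = 16;             % keep a (2k+1)x(2k+1) block around DC
mask = zeros(M, N);
mask(floor(M/2)+1 + (-k:k), floor(N/2)+1 + (-k:k)) = 1;
g = real(ifft2(ifftshift(F .* mask))); % blurry low-frequency approximation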
Filtering
• We can “filter” the DFT coefficients. This means we throw away or suppress some of the basis in the frequency domain. The filtered image is obtained by the inverse DFT.
• We often refer to the various filters based on the type of information they allow to “pass” through (see the sketch after this list):
– Lowpass filter • Low-order basis coefficients are kept
– Highpass filter • High-order basis coefficients are kept
– Bandpass filter • Selected “bands” of coefficients are kept • Can also be considered “band reject”
– Etc . . .
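The filter types are simple complements of one another; a minimal MATLAB sketch built on an ideal lowpass mask (grid size and cutoff radii here are illustrative):
[V, U] = meshgrid(-128:127, -128:127);
D   = hypot(U, V);                        % distance from the centered DC term
Hlp = double(D <= 60);                    % ideal lowpass, cutoff radius 60
Hhp = 1 - Hlp;                            % highpass: pass what the lowpass rejects
Hbp = double(D <= 60) - double(D <= 20);  % bandpass between radii 20 and 60
Hbr = 1 - Hbp;                            % band-reject: complement of the bandpass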
Filtering and Image Content
• Consider image noise:
[Figure: original image and its noise component]
• Does noise contribute more to high or low frequencies?
Typical Filtering Approach
From: Digital Image Processing, Gonzalez and Woods.
Example
[Figure: F(u,v), the filter H(u,v) (values from 0 to 1.0), and the product G(u,v) = H(u,v)F(u,v)]
I = imread('saturn.tif');   % f(x,y)
F = fft2(I);
H = yourOwnFilter(...);     % design your own filter H(u,v), e.g., in yourOwnFilter.m
G = H .* F;
g = real(ifft2(G));         % g(x,y)
Note that G(u,v) = H(u,v)F(u,v) is not a matrix multiplication; it is an element-wise multiply.
* The examples here have fftshift-ed the F, H, and G matrices for visualization; the log-magnitudes of F and G are shown.
Equivalence in Spatial Domain
Recall the convolution theorem: f(x,y) * h(x,y) \Leftrightarrow F(u,v)\, H(u,v)
(In the spatial domain we call h a point spread function; in the frequency domain we often call H an optical transfer function.)
Spatial filtering: g(x,y) = f(x,y) * h(x,y)
Frequency-domain filtering: G(u,v) = F(u,v)\, H(u,v), with g(x,y) = \mathcal{F}^{-1}\{G(u,v)\}
The frequency-domain filter H can be inverse-transformed to obtain h(x,y): h(x,y) = \mathcal{F}^{-1}\{H(u,v)\}
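A minimal MATLAB sketch verifying the convolution theorem numerically (fspecial needs the Image Processing Toolbox; sizes are illustrative, and both operands are zero-padded to the full convolution size before multiplying spectra):
f  = rand(64);                             % any 64x64 "image"
h  = fspecial('gaussian', 9, 2);           % a 9x9 Gaussian PSF
g1 = conv2(f, h);                          % full spatial convolution: 72x72
G  = fft2(f, 72, 72) .* fft2(h, 72, 72);   % pointwise product of padded spectra
g2 = real(ifft2(G));
max(abs(g1(:) - g2(:)))                    % ~1e-15: both routes agree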
Ideal Lowpass Filter
From: “Digital Image Processing”, Gonzalez and Woods
Example from DIP Book
From: “Digital Image Processing”, Gonzalez and Woods.
[Figure: ideal lowpass filtering of the original image with cutoff radii D0 = 5, 15, 30, 80, and 230]
D0 is the filter radius cutoff; that is, all basis outside D0 are thrown away. Note the “ringing” artifacts at D0 = 15 and 30.
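A minimal MATLAB sketch of ideal lowpass filtering at the cutoffs shown above (assuming cameraman.tif as the input):
f = im2double(imread('cameraman.tif'));
[M, N] = size(f);
[V, U] = meshgrid(1:N, 1:M);
D = hypot(U - (floor(M/2)+1), V - (floor(N/2)+1));  % distance from centered DC
for D0 = [5 15 30 80 230]
    H = double(D <= D0);                            % ideal lowpass mask
    g = real(ifft2(ifftshift(fftshift(fft2(f)) .* H)));
    figure, imshow(g), title(sprintf('D_0 = %d', D0));
end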
Ringing
• Why ringing?
• This is best demonstrated by looking at the inverse of the ideal filter back in the spatial domain: h(x,y) = \mathcal{F}^{-1}\{H(u,v)\}
[Figure: H(u,v) and its spatial-domain inverse h(x,y)]
Imagine the effect of performing spatial convolution with this filter. It would probably look like “ringing” . . .
Making smoother filters • The sharp cut-off of the ideal filter results in a sinc function in the spatial domain which leads to ringing in spatial convolution. • Instead, we prefer to use smoother filters that have better properties. • Some common ones are: Butterworth and Gaussian
Butterworth Lowpass Filter (BLPF)
• This filter does not have a sharp discontinuity – Instead it has a smooth transition
• A Butterworth filter of order n and cutoff frequency locus at a distance D0 has the form
H(u,v) = \frac{1}{1 + [D(u,v)/D_0]^{2n}}
– where D(u,v) is the distance from the center of the frequency plane.
1. The BLPF transfer function does not have a sharp discontinuity that sets up a clear cutoff between passed and filtered frequencies.
2. No ringing artifact is visible when n = 1; very little appears at low orders, and ringing becomes more pronounced as n increases.
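A minimal MATLAB sketch of the BLPF transfer function (grid size, cutoff, and order are illustrative; it can replace the ideal mask H in the filtering sketch above):
n  = 2;  D0 = 30;                     % assumed order and cutoff
[V, U] = meshgrid(1:256, 1:256);
D  = hypot(U - 129, V - 129);         % distance from the centered DC term
H  = 1 ./ (1 + (D ./ D0).^(2*n));     % Butterworth lowpass transfer function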
Example
[Figure: the convolution matrix (H), i.e., the motion blur kernel]
The motion blur
This is basically the motion of the camera as “experienced” by the high-resolution image’s pixels over the exposure time.
Deblurring
Using standard deconvolution with the estimated PSF.
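A minimal MATLAB sketch of standard non-blind deconvolution with a known/estimated PSF (deconvwnr and deconvlucy are Image Processing Toolbox functions; the blur parameters and noise-to-signal ratio here are illustrative, not values from the paper):
f   = im2double(imread('cameraman.tif'));
psf = fspecial('motion', 15, 45);            % assumed 15-pixel blur at 45 degrees
g   = imfilter(f, psf, 'conv', 'circular');  % simulate the blurry capture
fw  = deconvwnr(g, psf, 0.01);               % Wiener deconvolution
fl  = deconvlucy(g, psf, 20);                % Richardson-Lucy, 20 iterations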
Tai’s extension
• Beyond camera shake – Use a high frame-rate camera to compute optical flow – Tai generated per-pixel convolution kernels (h)
Tai’s Optimization Procedure
• Global invariant kernel (hand shake): an energy combining a deconvolution term, a low-resolution regularization term, and a kernel regularization term
• Spatially varying kernels (object motion): reformulated the problem to perform spatially varying deconvolution plus regularization against the low-res images
This slide is just to prove we are smart.
Why the last slide?
• To show that computational photography leads to new mathematical innovations on classical problems
• Previously, spatially varying deconvolution was not popular – Why? It was impossible to acquire the necessary information – Computational photography design makes this doable
Spatially Varying Deblurring
[Figure from Neel Joshi’s SIGGRAPH 2010 talk: blurry input; spatially-varying kernels (single depth plane); result deblurred using the correct kernel vs. deblurred using the center kernel]
Hybrid Imaging Summary
• Idea is simple
• Should be able to do this on a single chip – Will just take a redesign of the CCD sampling
• Allows camera ego motion to be computed reasonably accurately
• Produces “good” results – Type of motion is limited, though – limited to global in-plane translation
• Extended by Tai to handle spatially varying blur
• Extended by Joshi to use inertial sensors
Computational Optics Summary
• Exploit the fact that images will be processed
• Early work involved simple image warping (omnicamera)
• Raskar showed a very simple modification to the exposure made 1D motion deblurring significantly better
• Anat followed by redesigning the aperture to improve DoF applications
• Hajime produced an image that is completely undesirable unless processed. By moving the sensor he induced a depth-invariant blur for extended DoF.
• Moshe (and others) exploited auxiliary information in the processing to address deblurring