Display Week 2016, Sunday Short Course – 22nd May 2016

Fundamentals of Light Field Imaging and Display Systems

Dr. Nikhil Balram Ricoh Innovations Corporation (RIC)

Copyrights for all material used here belong to the original sources or author. All rights reserved.

Acknowledgements Materials* and/or insights provided by: • Dr. Ivana Tošić (RIC) • Dr. Gordon Wetzstein (Stanford University) • Prof. Marty Banks (UC Berkeley) • Dr. Noah Bedard (RIC) • Dr. Kurt Akeley (Lytro) • Prof. Xu Liu (Zhejiang University) • Dr. Jim Larimer (formerly NASA Ames) • Dr. Wanmin Wu (formerly RIC) • Dr. Kathrin Berkner (formerly RIC) *Copyrights of material provided belong to original owners

1

Overview • Section 1: Fundamentals of Human Visual System • Section 2: Introduction to Light Fields

• Section 3: Light Field Imaging • Section 4. Light Field Displays

• Section 5. Summary • Section 6. References

2

Section 1: Fundamentals of Human Visual System (HVS)

3

Human Visual System • “The human visual system detects and interprets information from visible light to build a representation of the surrounding environment.”* • The visual pathway begins at the eyes and ends at the visual cortex

* https://en.wikipedia.org/wiki/Visual_system

4

HVS Front-end
Retina has 4 types of photoreceptors
• Rods
– Achromatic
– Concentrated in the periphery
– Used for scotopic vision (low light levels)
• Cones
– Three broadband receptors – S (Short), M (Medium), L (Long) – with peaks roughly at blue, green and red wavelengths respectively
– Concentrated in the fovea (centre of retina)
– Used for photopic vision (daylight levels)
[Figure: Smith & Pokorny cone fundamentals – sensitivity of the L, M and S cones versus wavelength (400–700 nm), log scale]

5

Opponent Channels • Cones are organized to produce 3 opponent channels – White/Black (achromatic) – Red/Green – Yellow/Blue

• Opponent channels differ in spatial resolution – White/Black has the highest resolution because it uses only 1 pair of receptors – Yellow/Blue has the lowest spatial resolution because it uses S cones, which are sparse
[Diagram: receptor mosaic feeding the W/B achromatic mechanism and the R/G and Y/B chromatic mechanisms]

Diagram for illustration only – real receptor spacing and density is highly irregular

6

Contrast Sensitivity Function (Spatial) • Contrast Sensitivity Function (CSF) – Minimum modulation required to discriminate a sine wave grating at various spatial frequencies – “Spatial MTF of the HVS” – Envelope of narrowband tuned filters

Kelly (1974)

7

Contrast Sensitivity Function (Spatio-Temporal) • Spatio-Temporal Contrast Sensitivity Function (CSF) – Minimum modulation required to discriminate a sine wave grating at various spatial and temporal frequencies – Diamond shaped 2D response shows tradeoff between spatial and temporal resolution

Kelly (1966, 1979) 8

Resolution Limit of HVS – Visual Acuity (Snellen) • Snellen acuity refers to the ability to discriminate features – Based on the density of cones and the optics of the eye – Normal ("20/20") vision refers to the ability to distinguish a feature subtending 1 arc minute, which corresponds to 30 cycles/degree – The minimum contrast required for detection is plotted as the contrast sensitivity function (CSF)

http://webvision.med.utah.edu/book/part-viii-gabac-receptors/visual-acuity/ http://en.wikipedia.org/wiki/Visual_acuity
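As a quick back-of-the-envelope check of the 30 cycles/degree figure: one cycle of a grating is one light plus one dark bar, so resolving 1-arcminute features corresponds to 2-arcminute cycles,

$$\frac{60\ \text{arcmin/degree}}{2\ \text{arcmin/cycle}} = 30\ \text{cycles/degree}.$$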

9

L/M ratio variation in eyes with normal color vision

[Figure: pseudocolour cone-mosaic images for subjects HS, YY, AN, AP (nasal and temporal), MD, JP, JC, RS, JW (nasal and temporal), BS; scale bar 5 arcmin]

* Roorda & Williams (Nature 1999) Hofer et al., (J. Neurosci. 2005)

Modeling HVS Processing • For many problems in visual processing, the optimal solution is based on minimizing a cost function – The cost function comprises a Data term and a Model term

Cost Function: F = A∫ϕ₁(Data) + B∫ϕ₂(Model), with the two terms combined using various statistics



• This Universal Optimization Function* may also represent the way human visual processing works – The image reconstructed by human vision is a weighted combination of the image captured by the retina and the “prior model”

• Size illusion demo (next slides) shows the effect of the prior model *“A Poet’s Guide to Video: The Art and Science of the Movie Experience”, J. Larimer, N. Balram, S. Poster, draft book manuscript (never completed)

11

Which One Looks Bigger?

12

Which One Looks Bigger?

13

Which One Looks Bigger?

14

Taxonomy of Depth Cues* • The HVS uses a number of cues to "see" depth – to interpret and understand the world around us – Stereoscopy (binocular disparity) is only one type of depth cue – There are many monocular depth cues that are used in normal viewing and have been used in 2D movies for over a hundred years

• Monocular cues
– Geometry: perspective, occlusion, motion parallax, texture gradient, size
– Color: lighting, shading, aerial perspective
– Focus: accommodation, retinal blur

• Binocular cues
– Convergence, retinal disparity

*From Banks Schor Lab seminar, K. Akeley, June 2010

15

Example – Missing or Incorrect Blur Cues* • Retinal blurring is an important depth and size cue – in real-world viewing, distant objects produce blurred images while close ones are sharp • When blur cues are incorrect, perceived depth and size are affected, causing effects like miniaturization • Example of the same image with different blur – compare how you perceive the buildings that stay in focus in the first image versus the second image

• Remember the UOF model – what happens when the data is suspect? F = A∫ϕ1 (Data) + B∫ϕ2 (Model)

*R. T. Held, E. Cooper, J. F. O’Brien, M. S. Banks, Using blur to affect perceived distance and size, ACM Trans. Graph. 29, 2, March 2010

16

17

18

HVS: Key Points to Remember • “The human visual system detects and interprets information from visible light to build a representation of the surrounding environment.”* • The visual pathway begins at the eyes and ends at the visual cortex • What we “see” is not the raw image on the retina but our interpretation of it • The interpretation depends on a set of sensory information (“cues”) that we extract from the data and on the rules that our system has developed during the course of our evolution (“prior model”) • Confusion (optical illusions) can arise when the data is considered suspect and is overruled by the prior model • Cue conflicts can cause physical ill-effects like nausea and fatigue * https://en.wikipedia.org/wiki/Visual_system

19

Section 2: Introduction to Light Fields

20

Definition of Light Fields*
• Originally defined by Gershun in 1936 as the amount of light traveling in every direction through every point in space
• 7D plenoptic function defined as the flow of light through 3D space (Adelson & Bergen, 1991)
• 4D light field defined as "The radiance as a function of position and direction, in regions of space free of occluders (free space)." (Levoy & Hanrahan, 1996)*

Figure 1: Liu et al., Information Display 6/14 *M. Levoy, P. Hanrahan, Light field rendering, SIGGRAPH 1996

21

What is the Plenoptic Function? • [Adelson, Bergen, 1991]: Plenoptic function tells us the intensity of light seen from any viewpoint, at any time instant, for any wavelength of the visible spectrum

22

Parametrization
• In spherical coordinates (spherical camera): P(θ, φ, λ, t, Vx, Vy, Vz), where (θ, φ) are the spherical pixel coordinates, λ the wavelength, t the time, and (Vx, Vy, Vz) the camera center 3D coordinates
• In Cartesian coordinates (planar camera): P(x, y, λ, t, Vx, Vy, Vz), where (x, y) are the planar pixel coordinates, λ the wavelength, t the time, and (Vx, Vy, Vz) the camera center 3D coordinates

23

Structure of the Plenoptic Function (1/2)*

default values:

*E. H. Adelson, J. R. Bergen, The plenoptic function and the elements of early vision, Computational Models of Visual Processing, MIT Press, 1991

24

Structure of the Plenoptic Function (2/2)* • Slices through different dimensions*

x-y slice (2D image)

x-t slice (1D image scanline across time)

x-λ slice (1D image scanline across wavelengths)

x-Vx slice (1D image scanline across horizontal views)

x-Vy slice (1D horizontal image scanline across vertical views)

x-Vz slice (1D horizontal image scanline across views in depth z)

• The plenoptic function is too complex to handle in its full dimensions, but it is highly structured and that structure can be exploited to extract information that is needed for specific purposes *E. H. Adelson, J. R. Bergen, The plenoptic function and the elements of early vision, Computational Models of Visual Processing, MIT Press, 1991

25

Light Field • Fix wavelength and time, and look at rays passing through two parallel planes – Light field as 4D parametrization of the plenoptic function* – Easier to handle, and convenient for rendering of new views

*M. Levoy, P. Hanrahan, Light field rendering, SIGGRAPH 1996
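To make the two-plane parametrization concrete, here is a minimal sketch (not from the course material; the array layout and the shift-and-add refocusing rule are illustrative assumptions) of how a 4D light field stored as L[u, v, s, t] can be used to extract a sub-aperture view and to synthesize a refocused image:

```python
import numpy as np

def sub_aperture_view(L, u, v):
    """Return the 2D view seen through aperture sample (u, v).

    L is a 4D light field indexed as L[u, v, s, t]:
    (u, v) index the aperture/view plane, (s, t) the spatial plane.
    """
    return L[u, v]

def refocus(L, shift):
    """Synthetic-aperture refocusing by shift-and-add.

    Each view is shifted in proportion to its offset from the central
    view and the results are averaged; 'shift' (pixels per view step)
    selects the synthetic focal plane.
    """
    U, V, S, T = L.shape
    uc, vc = U // 2, V // 2
    out = np.zeros((S, T))
    for u in range(U):
        for v in range(V):
            du = int(round((u - uc) * shift))
            dv = int(round((v - vc) * shift))
            out += np.roll(L[u, v], (du, dv), axis=(0, 1))
    return out / (U * V)

# Example: a random 5x5-view light field with 64x64-pixel views
L = np.random.rand(5, 5, 64, 64)
center = sub_aperture_view(L, 2, 2)
refocused = refocus(L, shift=1.0)
```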

26

Conventional Images as Light Field Slices
Capture
• The sensor integrates incoming light rays from various directions
• The sensor-plane slice is positioned perpendicular to the optical axis
• Projection of light rays onto the sensor plane

Display
• Emission of light rays in various directions to achieve a certain appearance on a specific focal plane
• Human eyes focus on a specific plane
• The focal plane is imaged by the pupil onto the retina

27

How to Acquire the Light Field? Camera arrays & moving cameras

Stanford camera array

Hand-held plenoptic cameras

Stanford lego gantry

Multi-aperture cameras

Lytro Immerge

Jaunt ONE

28

Camera Arrays and Moving Rigs
• Camera arrays
– Pros:
• High spatial resolution
• Good image quality
• Can be wide-baseline
• Enable depth estimation for far objects

Stanford camera array

– Cons: • Bulkier than handheld camera • View resolution might be limited due to physical spacing • Harder to calibrate and synchronize

• Moving camera rig: – Same as camera array but limited to static scenes

Lytro Immerge

Jaunt ONE

29

Light Fields From Camera Arrays • Stanford light field archive – http://lightfield.stanford.edu/lfs.html

• Heidelberg collaboratory for image processing (HCI) database – –

http://hci.iwr.uni-heidelberg.de/HCI/Research/LightField/lf_benchmark.php Datasets and Benchmarks for Densely Sampled 4D Light Fields, Wanner et al. VMV 2013

30

Hand-held Plenoptic Cameras • Hand-held plenoptic (light field) camera – Ng et al. 2005; Lytro – Perwass et al. 2012; Raytrix – Horstmeyer et al. 2009 (multimodal camera)

• Pros:
– Small form factor
– Very dense views
– Calibration done once (fixed setup)
– Can trade off views for different wavelengths

• Cons: – Reduced spatial resolution – Small baseline (3D estimation limited to near objects) 31

Light Fields from Plenoptic Cameras • Dataset of Lytro images of objects (EPFL) – http://lcav.epfl.ch/page-104332-en.html – Ghasemi et al. LCAV-31: A Dataset for Light Field Object Recognition, SPIE 2014 Available at https://github.com/aghasemi/lcav31 - Database of Lytro light fields of different objects (for recognition) - Light fields already extracted from raw plenoptic images

• Raytrix camera light fields from HCI (obtained from Raytrix scenes)

32

Multi-Aperture Cameras • Multi-aperture cameras – One sensor, multiple lenses mounted on the sensor – Pros: • Very small form factor (can be a cell phone camera) • Spatial resolution usually better than light field camera

– Cons: • Small number of views • Small baseline

– Pelican Imaging camera • PiCam: an ultra-thin high performance monolithic camera array, SIGGRAPH 2013 33

Light Field Pipeline – From Capture to Display*

*K. Akeley, Light-field imaging approaches commercial viability, Information Display 6/15, Nov./Dec. 2015

34

Light Fields: Key Points to Remember • Plenoptic function is a 7D function describing light flowing through space • This can be reduced to various useful subsets • Light field is a 4D function describing radiance as a function of position and direction – Simple representation using two parallel planes with 2D views (u,v) and 2D positions (s, t)

• Light fields can be captured using an array of cameras or a small-form factor camera with micro-lenses or multiple apertures – Each form of capture has tradeoffs and the best choice depends on the objectives

• Light fields can be displayed using an array of display engines or a display with special optical layers 35

Section 3: Light Field Imaging

36

Light Field Imaging
• Array of cameras vs. a compact single camera

Stanford Array

Jaunt ONE

Lytro Immerge



Enables: generation of depth maps, digital refocusing, multiple views, volumetric rendering, multispectral imaging, etc. 37

Light Field Imaging*

*K. Akeley, Light-field imaging approaches commercial viability, Information Display 6/15, Nov./Dec. 2015

38

Light Field Imaging For Capturing 3D Scene Information
[Diagram: main lens → Micro Lens Array (MLA) → detector → processing]

39

Light Field Imaging For Capturing 3D Scene Information • Main lens focuses image onto the Micro Lens Array (MLA) and each micro lens separates different views onto the sensor elements behind it • Alternate plenoptic systems focus the image before the MLA and enable higher resolution at the expense of greater complexity and other limitations

[Figure: the raw plenoptic (sensor) image is processed into a light field of multiple views from different angles, organized as views × pixels]
40

Light Field Imaging For Capturing Multi-Spectral Information
[Diagram: main lens with X, Y, Z spectral filters in the aperture → Micro Lens Array (MLA) → monochrome detector]

41

Light Field Imaging – f/#

*K. Akeley, Light-field imaging approaches commercial viability, Information Display 6/15, Nov./Dec. 2015

42

Light Field Imaging - System Model • System model for task-specific designs

Scene → Optics → Sensor → Digital processing → Performance metric
Joint end-to-end system design is necessary for reaching optimal performance!
Light Field Imaging Core Technology: image processing, calibration, optics design, optimization, MLA manufacturing
43

Light Field Imaging: Models for Image Formation

44

Models of Plenoptic Image Formation • Image formation for plenoptic cameras – Different from traditional cameras – Micro-lens array in front of the sensor changes the image formation

• Image formation models – Geometric models • Ray-based modeling of light

– Diffraction models • Wave-based modeling of light

45

Models of Plenoptic Image Formation • Image formation for plenoptic cameras – Different from traditional cameras – Micro-lens array in front of the sensor changes the image formation

• Image formation models – Geometric models • Ray-based modeling of light

– Diffraction models • Wave-based modeling of light

46

Single Lens Stereo • Adelson and Wang, 1992* – Thin lens model for the main lens – Pinhole model for microlenses

[Diagram: an object point is imaged by the main lens (thin-lens model, focal length f) toward the plane conjugate to the sensor plane, with the microlenses (pinhole model) at the sensor side. Annotated quantities: displacement of the aperture, and the resulting displacement of the object's image on the sensor plane]

*E. H. Adelson, J. Y. A. Wang, Single lens stereo with a plenoptic camera, IEEE Trans. PAMI, Feb. 1992

47

How is Depth Information Captured?
• Let us look at the rays falling on each pixel, through multiple pinholes
[Figure panels: object in front of focus; object behind focus; object in focus]

Angle of these linear structures encodes the depth! *E. H. Adelson, J. Y. A. Wang, Single lens stereo with a plenoptic camera, IEEE Trans. PAMI, Feb. 1992

48

Depth Estimation
• Using similar triangles and the thin-lens equation, the measured image displacement can be converted into object depth (a reconstruction of the omitted relations is sketched below)

*E. H. Adelson, J. Y. A. Wang, Single lens stereo with a plenoptic camera, IEEE Trans. PAMI, Feb. 1992
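The equations on the original slide did not survive extraction; the following is a hedged reconstruction of the kind of relation Adelson and Wang derive (the symbols here are illustrative, not necessarily the paper's notation). A point at object distance $a$ images at distance $b$ behind the lens, the sensor sits at distance $b_0$, and a viewpoint displaced by $\Delta u$ in the aperture sees the point displaced by $\Delta x$ on the sensor. Similar triangles and the thin-lens equation give

$$\frac{\Delta x}{\Delta u} = \frac{b - b_0}{b}, \qquad \frac{1}{a} + \frac{1}{b} = \frac{1}{f},$$

so the measured parallax $\Delta x/\Delta u$ determines $b$, and the thin-lens equation then gives the object distance $a$ (with $f$ the focal length of the main lens).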

49

Light Field Imaging: Core Technologies

50

Core Technology 1: Calibration
[Pipeline: raw data → calibration → pupil images, multiviews]
Calibration and multiview extraction
51

Calibration of Plenoptic Cameras
• Calibration is an important part of plenoptic image processing
• Some issues that make calibration challenging:
– Rotation of the micro-lens array (MLA)
– Distortions (main lens and MLA)
– Vignetting
– Hexagonal lattice for packing of microlenses

• Typical calibration process 1. Precise localization of MLA centroids (using a white image) 2. Unpacking pixels from under lenslets to multiple views 3. Interpolation if unpacked pixels are on a non-uniform / nonrectangular lattice 52

Calibration of Plenoptic Cameras • Dansereau et al. “Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras”, CVPR 2013 – A method for decoding raw camera images into 4D light fields – A method for calibrating images from a Lytro camera • 15-parameter plenoptic camera model • 4D intrinsic matrix based on a projective pinhole and thin-lens model • A radial direction-dependent distortion model

– Matlab toolbox publicly available (Light Field Toolbox v0.2) – Does not deal with demosaicing (uses linear demosaicing, not optimal)

53

Decoding Raw Camera Images into 4D Light Fields* • Extracting pixels and re-arranging into a light field 1. Capture a white image through a white diffuser 2. Locate lenslet image centers 3. Estimate grid 4. Align grid 5. Slice into 4D

*Dansereau et al. Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras, CVPR 2013
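A minimal sketch of the final "slice into 4D" step (hypothetical array and parameter names; it assumes the lenslet centers have already been located and the raw image resampled so that each lenslet occupies an integer-sized, axis-aligned patch – the published toolbox handles the harder general case):

```python
import numpy as np

def slice_into_4d(aligned_img, px_per_lenslet):
    """Rearrange an aligned lenslet image into a 4D light field.

    aligned_img: 2D raw image, already rotated/scaled so that lenslet
                 sub-images lie on a rectangular grid.
    px_per_lenslet: (nu, nv) pixels under each lenslet.
    Returns L[u, v, s, t] where (u, v) index pixels under a lenslet
    (angular samples) and (s, t) index the lenslet grid (spatial samples).
    """
    nu, nv = px_per_lenslet
    H, W = aligned_img.shape
    ns, nt = H // nu, W // nv
    # crop to a whole number of lenslets, then split into lenslet blocks
    img = aligned_img[:ns * nu, :nt * nv]
    L = img.reshape(ns, nu, nt, nv)          # (s, u, t, v)
    return L.transpose(1, 3, 0, 2)           # (u, v, s, t)

# Example with a synthetic 'aligned' lenslet image: 9x9 pixels per lenslet
raw = np.random.rand(9 * 40, 9 * 60)
lf = slice_into_4d(raw, (9, 9))
print(lf.shape)   # (9, 9, 40, 60)
```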

54

Calibration Model* • Ray propagation model • Assumptions: – Thin lens model of the main lens – Pinhole array for the MLA

rectified ray (homogeneous coordinates) = intrinsic matrix for the whole system × input ray (homogeneous coordinates)

*Dansereau et al. Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras, CVPR 2013

55

Calibration Matrices for 2D Case* (1/2)
• Conversion from relative to absolute coordinates (parameters: number of pixels per lenslet; translational pixel offset)
• Conversion from absolute coordinates to rays (parameters: spatial frequencies in samples, for pixels and lenslets; offsets in samples, for pixels and lenslets)
• Express rays in position and direction (parameter: distance between the microlens array and the sensor)

*Dansereau et al. Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras, CVPR 2013

56

Calibration Matrices for 2D Case* (2/2)
• Propagate to the main lens (parameter: distance between the main lens and the microlens array)
• Refraction through the main lens (parameter: focal length of the main lens)
• Express back in ray coordinates (parameter: distance between the object and the main lens)

*Dansereau et al. Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras, CVPR 2013
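The individual matrices on the original slides were images. As background, the propagation and refraction steps named above correspond to the standard 2×2 ray-transfer (ABCD) matrices acting on a ray $(x, \theta)^\top$ (position, direction); the exact matrices in the paper additionally absorb the pixel/lenslet sampling terms:

$$\text{free-space propagation over distance } d:\;
\begin{pmatrix} x' \\ \theta' \end{pmatrix} =
\begin{pmatrix} 1 & d \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ \theta \end{pmatrix},
\qquad
\text{thin lens of focal length } f:\;
\begin{pmatrix} 1 & 0 \\ -1/f & 1 \end{pmatrix}.$$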

57

Overall Calibration Matrix for 4D Case* • Multiplying all matrices we get: 12 non-zero terms

• Correction of projection through the lenslets – Transformation from the real to a virtual light field camera due to resizing, rotating, interpolating, and centering of lenslet images - superscripts indicate that a measure applies to the physical sensor (S), or to the virtual “aligned” camera (A)

*Dansereau et al. Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras, CVPR 2013

58

Calibration Optimization • Radial distortion

distortion parameters

distorted (d) and undistorted (u) 2D ray directions

• Minimization of an objective for checkerboard pattern images – use checkerboard images to find the intrinsic matrix H, camera poses T , and distortion parameters d

calibration pattern points images

light field views in two directions

distance between reprojected rays and feature point locations

*Dansereau et al. Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras, CVPR 2013
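The distortion equation itself did not survive extraction; a typical radial model of the kind described (illustrative form, not necessarily the paper's exact expression) applies polynomial radial terms to the undistorted 2D ray direction:

$$\boldsymbol{\theta}_d = \left(1 + k_1 r^2 + k_2 r^4 + \dots\right)\boldsymbol{\theta}_u, \qquad r = \lVert\boldsymbol{\theta}_u\rVert,$$

where $\boldsymbol{\theta}_u$ and $\boldsymbol{\theta}_d$ are the undistorted and distorted 2D ray directions and the $k_i$ are the distortion parameters estimated in the optimization.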

59

Core Technology 2: 3D Estimation
[Pipeline: raw plenoptic image* → calibration → multi-view images (light field) → preprocessing + slicing → light field slices (line angles give depth) → scale-depth transform for light fields + occlusion detection + dense depth estimation → depth map]
* From a plenoptic camera without spectral filters
60

How to Recover a 3D Scene From LF Data?
• Given the light field data, reconstruct the objects within different depth layers
[Figure: a multi-view system captures a light field of objects lying in layers 1–3]
• Challenge: occlusions!
61

Premise of Most Geometric Approaches
• Exploit the line structure of the light field
[Figure: a Light Field (LF) slice (EPI plane [Bolles et al.]) showing light reflected from a closer object and from a farther object]
• Objects at different depths produce lines with different angles!
• Ray bundles from the front object occlude ray bundles from the background object
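The slope-to-depth idea can be sketched in a few lines (an illustrative local estimator, not any specific published method): estimate the dominant line slope at each EPI pixel from image gradients, which gives disparity in pixels of shift per view:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def epi_disparity(epi, sigma=1.0, eps=1e-6):
    """Local disparity (slope) estimate for an EPI of shape (U, X).

    Along a line x = x0 + s*u of slope s, intensity is constant, so
    dI/du + s*dI/dx = 0 and s = -dI/du / dI/dx at each pixel
    (meaningful near edges, unreliable in uniform regions and at occlusions).
    """
    epi = gaussian_filter(epi.astype(float), sigma)
    dI_du, dI_dx = np.gradient(epi)        # derivatives along u and x
    disparity = -dI_du / (dI_dx + eps)
    confidence = np.abs(dI_dx)             # gradient strength as confidence
    return disparity, confidence

# Example on a synthetic EPI: a single edge moving 2 pixels per view
U, X = 9, 100
epi = np.zeros((U, X))
for u in range(U):
    epi[u, 30 + 2 * u:] = 1.0
d, c = epi_disparity(epi)
```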

62

Scene Recovery Using Geometric Models • Layer-based approaches – LF segmentation and sparse representation [Gelman et al., 2012] – Sparse generative layered model [Lin et al. 2013]

• Dense depth estimation – Structure tensor approach [Wanner et al., 2012] – Scale-depth space approach [Tošić et al., 2014]

63

Scene Recovery Using Geometric Models • Layer-based approaches – Joint segmentation of multiple views [Gelman et al., 2012] – Sparse generative layered model [Lin et al. 2013]

• Dense depth estimation – Structure tensor approach [Wanner et al., 2012] – Scale-depth space approach [Tošić et al., 2014]

64

Light Field (LF) Model*
• A generative non-linear model that can model occlusions by masking
[Figure: scene layers 1–3 with masks for layers 1 and 2, and an LF slice over u (view coordinate) and x (spatial coordinate) showing an occlusion]
• Each layer is a linear combination of ray-like functions r with a given angle, weighted by a vector of coefficients over an overcomplete dictionary

*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013

65

Algorithm for 3D Layer Reconstruction* • To reconstruct 3D layers, we need to estimate the following unknowns: – Coefficients for each LF slice – Mask for each LF slice

• Method: an iterative algorithm: – Initialize mask and coefficients Step 1: Solve for coefficients Step 2: Refine the mask and go to Step 1

*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013

66

Algorithm Description Through an Example* • Example light field: – Humvee dataset (Stanford camera array), 16 views (horizontal parallax)

view 2

view 8

– Ray-like functions approximated by dictionary of Ridgelets *Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013

67

Algorithm: Step 1*
• Step 1: Sparse Recovery
– Relax the problem into a linear one (example: two layers); an error vector subsumes the occlusion error, and the ray-like functions form an overcomplete dictionary
– Solve the convex problem: Layered Sparse Reconstruction Algorithm (LSRA), with one term that enforces sparsity over the different angles and one term that enforces sparsity of the occluded pixels

*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013
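In symbols, the relaxed problem has roughly this shape (a paraphrase in the notation used here, not the paper's exact formulation): with $\mathbf{y}$ the vectorized LF slice, $\boldsymbol{\Phi}$ the overcomplete Ridgelet dictionary, $\mathbf{c}$ the coefficients and $\mathbf{e}$ the error vector that subsumes occlusions,

$$\min_{\mathbf{c},\,\mathbf{e}} \;\; \tfrac{1}{2}\,\lVert \mathbf{y} - \boldsymbol{\Phi}\mathbf{c} - \mathbf{e} \rVert_2^2 \;+\; \lambda_1 \lVert \mathbf{c} \rVert_1 \;+\; \lambda_2 \lVert \mathbf{e} \rVert_1,$$

where the first $\ell_1$ term enforces sparsity over the ray angles and the second enforces sparsity of the occluded pixels.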

68

Result After Step 1*
• After step 1 we:
– Group Ridgelets into layers (by clustering with respect to the angle)
– Reconstruct each layer using only the Ridgelets in that cluster
[Figure: view 1, layer 1 in view 1, layer 2 in view 1]

*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013

69

Algorithm: Step 2*
• Step 2: Refine the Mask
– Use the result from the previous step to define the mask
– Use image segmentation (e.g., active contour, 500 iterations) in the spatial domain to refine the mask
– Solve LSRA again with an updated model (includes the mask)
[Figure: (a) one view of the image; (b) result from the previous step; (c) active contour]

*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013

70

Final Result* • After two iterations we get the final result:

layer 1 (after two iterations)

layer 2 (after two iterations)

layer 2 (after Step 1 of the first iteration)

*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013

71

Results on the Stanford Dataset*
[Figure: Humvee dataset – views 1 and 2, recovered layers 1 and 2; Chess dataset – views 1 and 2, recovered layers 1–4]
*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013

72

Results on Lytro Data* • Use Lytro data, 6 different views – We retrieve the occluded part of second layer

view 1

view 2

layer 1

layer 2

*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013

73

Scene Recovery Using Geometric Models • Layer-based approaches – Joint segmentation of multiple views [Gelman et al., 2012] – Sparse generative layered model [Lin et al. 2013]

• Dense depth estimation – Structure tensor approach [Wanner et al., 2012] – Scale-depth space approach [Tošić et al., 2014]

74

3D Information Within Light Fields
• Plenoptic cameras acquire multi-view images
[Figure: raw plenoptic image → light field (LF) over views u and pixels (x, y) → light field slice (EPI*) over u and pixels in the horizontal direction x]
– The angle of a line in the EPI is associated with the depth (via a mapping based on camera parameters)
– Dense depth map estimation = find an angle for each point on x

*[Bolles et al., IJCV ’87]

75

Analysis of Light Field Slices (EPIs)
• EPI structure
– Line edges ("ray edges", discontinuities) with a certain angle
– Uniform regions ("rays") between line edges
[Figure: EPI over views u (horizontal parallax) and pixels in the horizontal direction x]

• How to: – Detect both structures? – Get the depth (angle) information at the same time?

• Useful approach: scale space analysis for light fields* * I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014

76

Scale-Spaces: Background
• Used for multi-scale image analysis since the early 80s
– Gaussian scale spaces are well known
[Figure: an image shown at increasing scales of its Gaussian scale-space]
– Scale invariance: the scale-space of an image downsampled by a factor equals the scale-space of the original image evaluated at a correspondingly larger scale
77

Derivatives of Gaussian Scale-Space* • First and Second derivatives of Gaussian scale spaces – Used for low-level image processing (edge and blob detection)

• Normalized first derivative → edge detection
• Normalized second derivative → blob detection
*T. Lindeberg, Scale-space theory: A basic tool for analysing structures at different scales, '94

78

Scale Space Construction for Light Fields: Kernel* • We first need a kernel for constructing scale spaces “Ray Gaussian” (RG) filters:

• Parameters of the RG: – Scale (width): – Angle:

• Angle: – Can be uniquely mapped to depth * I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014

79

Scale Space Construction for Light Fields*
• Multi-scale, multi-depth representation of LF slices: the Light field scale and depth (Lisad) space
• The LF slice is convolved with Ray Gaussians over x only
• We obtain a representation in scale AND angle (depth) at every pixel position

* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014
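A minimal sketch of the construction (illustrative code, assuming the Ray Gaussian form given above; normalization and boundary handling are simplified):

```python
import numpy as np

def ray_gaussian(sigma, phi, n_views, half_width):
    """Sampled Ray Gaussian kernel over (u, x)."""
    u = np.arange(n_views) - (n_views - 1) / 2.0
    x = np.arange(-half_width, half_width + 1)
    X, U = np.meshgrid(x, u)                 # shape (n_views, n_x)
    return np.exp(-((X + U * np.tan(phi)) ** 2) / (2.0 * sigma ** 2))

def lisad_space(epi, sigmas, phis):
    """Lisad space: inner products with Ray Gaussians, sliding along x only.

    epi: (U, X) light field slice. Returns an array of shape
    (len(sigmas), len(phis), X): one response per scale, angle and position.
    """
    U, X = epi.shape
    out = np.zeros((len(sigmas), len(phis), X))
    for i, s in enumerate(sigmas):
        for j, p in enumerate(phis):
            k = ray_gaussian(s, p, U, half_width=int(np.ceil(4 * s)))
            hw = (k.shape[1] - 1) // 2
            padded = np.pad(epi, ((0, 0), (hw, hw)))
            for x in range(X):
                # inner product over all views, window slides along x only
                out[i, j, x] = np.sum(padded[:, x:x + k.shape[1]] * k)
    return out
```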

80

Scale Invariance of Lisad Spaces* • Relation between the scale of Ray Gaussian and the scale of the light field signal

* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014

81

Depth Invariance of the Inner Product with RG* • The value of the inner product of a light field with a RG does not depend on the angle – Convolution with RGs of different angles does not introduce a bias towards some depths – Valid in the case of no occlusions

* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014

82

Ray Edge Detection Using Lisad Spaces*
• Extrema in the Lisad space of the first derivative of the Ray Gaussian give us ray edges
[Figure: detected ray edges in an EPI over views u (horizontal parallax) and pixels in the horizontal direction x; Lisad axes: angle/depth, scale, pixel position]
* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014; I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014

83

Ray Detection Using LF Scale Space*
• Extrema in the Lisad space of the second derivative of the Ray Gaussian give us whole rays
[Figure: detected rays in an EPI over views u (horizontal parallax) and pixels in the horizontal direction x; Lisad axes: angle/depth, scale, pixel position]
* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014; I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014

84

Example: Lisad Space for the Normalized Second Derivative of RG*
[Figure: a light field slice (EPI) over (x, u) is convolved with the normalized second derivative of the RG at multiple scales and angles, producing the Lisad space over (x, scale, angle)]
* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014; I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014

Depth Estimation Using Lisad Spaces*

• Problem statement: estimate angles in LF slices for all x, y
– Local estimation (independently for each ray edge) has problems with uniform image regions
• Approach: whole ray detection with Lisad spaces
– Based on scale-spaces
– Operates on whole rays
– Multi-scale approach: find both the angles and the widths of rays

* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014

87

3D Keypoint Detection* • Extrema detection in the first derivative Lisad space – Each keypoint is assigned an angle that determines depth – hotter colors = closer points

*I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014

88

Dense Depth Estimation Method*
• Ray detection + edge detection + occlusion detection + post-processing
[Pipeline: light field → scale-space construction (ray-Gauss second derivative) → find extrema → occlusion detection; in parallel, scale-space construction (ray-Gauss first derivative) → find extrema → occlusion detection; both feed depth assignment → depth map]
• Processing per slice, both horizontal and vertical views
• Occlusion detection in ray space:
– Possible occlusion: closer object in front of the farther
– Impossible occlusion: farther object in front of the closer – remove the ray with the larger variance

* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014

89

Evaluation on the HCI Light Field Dataset • Heidelberg collaboratory for image processing (HCI) database – http://hci.iwr.uni-heidelberg.de/HCI/Research/LightField/lf_benchmark.php – Datasets and Benchmarks for Densely Sampled 4D Light Fields, Wanner et al. VMV 2013

90

Experimental Results: Synthetic Scenes*
• Evaluation on the HCI database (Blender data, ground truth available)
[Figure: middle view and depth (Lisad); fraction of pixels with depth error >1% – Lisad: 1.2%, Wanner et al.: 3.5%]

* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014

91

Experimental Results: Real Plenoptic Images*
• Raytrix® and Ricoh prototype cameras
[Figure: middle views; disparity (Lisad); disparity (Wanner et al. 2014)]
* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014; I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014

92

Core Technology 3: Resolution Enhancement*
• Spatial resolution is a key challenge
• Several ways of improving or enhancing resolution (each with its own tradeoffs), including:
– Use of advanced super-resolution algorithms
– Reducing the micro-lens diameter (and sensor pitch) and increasing the number of micro-lenses
– Increasing the sensor size and number of micro-lenses
– Using a different plenoptic architecture – focusing the micro-lenses on the image plane of the main lens ("Focused Plenoptic" / "Plenoptic 2.0")**

*K. Akeley, Light-field imaging approaches commercial viability, Information Display 6/15, Nov./Dec. 2015 ** T. Georgiev, The focused plenoptic camera, http://www.tgeorgiev.net/EG10/Focused.pdf

93

Core Technology 4: Multi-Spectral Image Processing
[Figure: multispectral views (e.g. Green, Amber) before multispectral parallax rectification (color fringing) and after]

94

Light Field Imaging: Application Examples

95

Example 1: Color Inspection Camera • Color analyzer camera – launched in Nov. 2014 • Single-sensor, single-snapshot color accuracy measurement for displays • Uses XYZ filters in the aperture • Color accuracy is measured in Delta-E metric in CIELAB color space

https://www.ricoh.com/fa_security/other/cv-10a/

96

System Diagram for Color Accuracy Camera
[Diagram: ColorChecker reflectance spectra and the LED illumination spectrum (wavelength in nm) feed a camera model; the pipeline Scene → Optics → Sensor → Digital processing → Performance metric reconstructs xyz and compares measurement vs. reference chromaticity (a, b) to compute the chromaticity error, inside an optimization loop]

97

Filter Layout Optimization According to Chromaticity Error
[Figure: non-optimized vs. optimized layout of the X, Y, Z filters; plot of Std(ΔE) per color patch (1–24) for the non-optimized and optimized layouts]

98

Example 2: Light Field Otoscope (Prototype)* • Ear infection is the most common reason for antibiotic prescription to children in the US • 25M office visits, 20M prescriptions, and $2B costs • Difficult to differentiate between different conditions Kuruvilla, et al., IJBI 2013.

[Figure: eardrum images – acute otitis media (AOM), otitis media with effusion (OME), no effusion (NOE)]

• 3D shape and color of the ear drum are the most important features for diagnosing AOM** *N. Balram, I. Tosic, H. Binnamangalam, Digital health in the age of the infinite network, Journal APSIPA, 2016 **Shaikh, Hoberman, Rockette, and Kurs-Lasky, Development of an algorithm for the diagnosis of otitis media, Academic Pediatrics, 2012.

99

Clinical Features of Otitis Media*

Goal: Leverage advances in 3D imaging and multispectral imaging to enhance detection of diagnostic features

*Shaikh, Hoberman, Rockette, and Kurs-Lasky, “Development of an algorithm for the diagnosis of otitis media,” Academic Pediatrics, 2012.

100

Light Field Otoscope (Prototype) Design
[Diagram: custom MLA, large field of view, image-conjugate and pupil-conjugate planes, bright illumination]

101

3D Ear Drum Reconstruction
[Figure: 2D image, depth map (closer to farther), 3D reconstruction]
• 3D imaging with 0.25 mm depth accuracy, in RGB at 12.5 FPS

102

Light Field Otoscope (Prototype) - Trials • Testing 3D and spectral imaging with Children’s Hospital Pittsburgh* – Established in 1880 – 296-bed children’s general facility – 13,687 admissions in 2013, 5,734 annual inpatient and 19,313 outpatient surgeries – Pioneer in pediatric medicine

*N. Bedard, I. Tošić, L. Meng, A. Hoberman, J. Kovacevic, K. Berkner, “In vivo ear imaging with a light field otoscope”, Bio-Optics: Design and Application, April 2015

103

Prototype Demo

104

Light Field Imaging: Key Points to Remember
• A number of tradeoffs have to be made based on the specific target application
– Possible applications include medical, factory automation/inspection, consumer content creation, etc.
• A robust system methodology exists for design of the end-to-end system based on key performance metrics for the target application
– When designing the system, figure out requirements for spatial resolution, angular resolution (#views), depth resolution and range, temporal resolution, and spectrum
• Can use an array of cameras (sensors) or a single camera (sensor)
– The array approach enables high spatial resolution and wider baseline (provides depth for distant objects) but is bulkier and more costly
– The single-camera approach enables a compact system and high angular resolution but has limited spatial resolution and narrow baseline (provides depth for closer objects only)
• Calibration is a critical first step of the processing
• Depth can be estimated using layer-based approaches or dense-field (pixel-based) approaches
• Processing based on geometric models is applicable in most cases, but diffraction models are needed for applications involving high magnification

105

Section 4: Light Field Displays

106

Why Light Field Displays?* • To display 3D content in a way that appears natural to the human visual system – Providing natural and consistent stereo, parallax and focus cues – Avoiding cue conflicts like the vergence-accommodation conflict (VAC) posed by current Stereoscopic 3D (S3D) displays

*N. Balram, Is 3-D dead (again)?, Guest Editorial, Information Display 3/13 N. Balram, The next wave of 3-D - light field displays, Guest Editorial, Information Display 6/14

107

Vergence-Accommodation Conflict (VAC) of Stereoscopic 3D (S3D)*
• Natural viewing: eyes converge AND focus at the same distance
[Figure: focal distance vs. vergence distance (diopters), showing the natural-viewing line where focal distance equals vergence distance, the zone of clear single binocular vision, and Percival's zone of comfort]
*"Conflicting Focus Cues in Stereoscopic Displays", M. Banks et al., Information Display, July 2008

108

Vergence-Accommodation Conflict (VAC) of Stereoscopic 3D (S3D)*
• Stereo display: eyes always focus on the screen BUT converge wherever an object is placed – leading to a cue conflict
– This can produce severe discomfort
[Figure: focal distance vs. vergence distance (diopters), with the focal distance pinned at the location of the screen while the vergence distance varies, relative to the zone of clear single binocular vision and Percival's zone of comfort]
*"Conflicting Focus Cues in Stereoscopic Displays", M. Banks et al., Information Display, July 2008

109

Vergence-Accommodation Conflict (VAC)* • Experiment of Cues-Consistent vs Cues Inconsistent viewing* – 600 ms stimulus at near or far vergence-specified distance

*D. Hoffman, A. Girshick, K. Akeley, M. S. Banks, Vergence-accommodation conflicts hinder visual performance and cause visual fatigue, Journal of Vision 8(3):33, 2008

110

Vergence-Accommodation Conflict (VAC)*
• Results of the experiment on cues-consistent vs. cues-inconsistent viewing*
[Figure: severity-of-symptom ratings for cues-inconsistent vs. cues-consistent viewing; ** = p < 0.01 (Wilcoxon test)]

*D. Hoffman, A. Girshick, K. Akeley, M. S. Banks, Vergence-accommodation conflicts hinder visual performance and cause visual fatigue, Journal of Vision 8(3):33, 2008

111

What is a Light Field Display?* • A display that presents a light field to the viewer and enables natural and comfortable viewing of a 3D scene • Fundamentally divided into two different types: – Group/Multi-user Light Field Displays** – Personal (Near-to-Eye/Head-Mounted) Light Field Displays***

• We will discuss both these types in the rest of this section *N. Balram, The next wave of 3-D - light field displays, Guest Editorial, Information Display 6/14 **X. Liu and H. Li, The progress of light field 3-D displays, Information Display 6/14 ***W. Wu, K. Berkner, I. Tosic, N. Balram, Personal near-to-eye light field displays, Information Display 6/14

112

How to Create a Light Field Display? • Need to create a natural accommodation response – Create the correct retinal blur corresponding to the 3D location of an object

• Fundamentally only two different ways to do this:
A. Create parallax across each eye that produces the correct retinal blur corresponding to the 3D location of the object being viewed – by presenting multiple views (integral imaging approach), or
B. Physically place the object at the appropriate focal plane corresponding to its 3D location – by providing multiple focal planes (multi-focal-plane approach)

• All real light field displays use one of these two ways – Group/multi-user displays typically use approach A – Single-user (Near-Eye/Head-Mounted) displays use approach A or B depending on their design point

• Fundamental questions for each approach: – For A, how many views are needed? – For B, how many planes are needed? 113

How to Create a Light Field Display? • Need to create a natural accommodation response – Create the correct retinal blur corresponding to the 3D location of an object

• Fundamentally only two different ways to do this:
A. Create parallax across each eye that produces the correct retinal blur corresponding to the 3D location of the object being viewed – by presenting multiple views (integral imaging approach), or
B. Physically place the object at the appropriate focal plane corresponding to its 3D location – by providing multiple focal planes (multi-focal-plane approach)

• All real light field displays use one of these two ways – Group/multi-user displays typically use approach A – Single-user (Near-Eye/Head-Mounted) displays use approach A or B depending on their design point

• Fundamental questions for each approach: – For A, how many views are needed? – For B, how many planes are needed? 114

Creating Focal Cues Using Multiple Views • Providing two or more views to each eye to enable focus cues*

*Y. Takaki, K. Tanaka, J. Nakamura, Super multi-view display with a lower resolution flat-panel display, Opt. Express, 19, 5 (Feb) 2011

115

Creating Focal Cues Using Multiple Views • The n pixels in each pixel group are magnified to generate the n viewing zones with each pixel generating one view*

*Y. Takaki, K. Tanaka, J. Nakamura, Super multi-view display with a lower resolution flat-panel display, Opt. Express, 19, 5 (Feb) 2011

116

Standard Single Plane Display*
[Diagram: rays (x, u) leaving the display focus plane and entering the eye, forming an image on the retina]
Retinal image: $I(x) = \int_{-\infty}^{+\infty} l(x,u)\,A(u)\,du$

*F. C. Huang, G. Wetzstein, B. Barsky, R. Rasker, Eyeglasses-free display: Towards Correcting Visual Aberrations 117 With Computational Light Field Displays, Siggraph 2014

Light Field Display – Creating Focal Cues*
[Diagram: display light field l^d over (x, u) propagating through the eye to the retina, with the eye focused at a given plane]
Retinal image: $I(x) = \int_{-\infty}^{+\infty} l(x,u)\,A(u)\,du = \int_{-r/2}^{+r/2} l^{d}\big(\Psi(x,u)\big)\,du$ – the display light field $l^{d}$, integrated over the pupil of width r, provides more degrees of freedom than a single plane
*F. C. Huang, G. Wetzstein, B. Barsky, R. Raskar, Eyeglasses-free display: Towards Correcting Visual Aberrations With Computational Light Field Displays, Siggraph 2014

Light Field Display – Creating Focal Cues*
[Diagram: which display light field l^d produces the desired retinal image for a given focus plane?]
Retinal image: $I(x) = \int_{-\infty}^{+\infty} l(x,u)\,A(u)\,du = \int_{-r/2}^{+r/2} l^{d}\big(\Psi(x,u)\big)\,du$
*F. C. Huang, G. Wetzstein, B. Barsky, R. Raskar, Eyeglasses-free display: Towards Correcting Visual Aberrations With Computational Light Field Displays, Siggraph 2014

Light Field Display – Projection Matrix*
In discrete form the retinal image is a linear projection of the display light field: $\mathbf{P}\cdot\mathbf{L}^{d} = \mathbf{I}$
*F. C. Huang, G. Wetzstein, B. Barsky, R. Raskar, Eyeglasses-free display: Towards Correcting Visual Aberrations With Computational Light Field Displays, Siggraph 2014

Light Field Display – Projection Matrix*
Formally inverting the projection: $\mathbf{L}^{d} = \mathbf{P}^{-1}\mathbf{I}$ – but is this inverse problem well-posed?
*F. C. Huang, G. Wetzstein, B. Barsky, R. Raskar, Eyeglasses-free display: Towards Correcting Visual Aberrations With Computational Light Field Displays, Siggraph 2014

Light Field Display – Projection Matrix*
[Diagram: display light field l^d over (x, u), retina and focus plane]
With the additional degrees of freedom of a light field display, does $\mathbf{L}^{d} = \mathbf{P}^{-1}\mathbf{I}$ become well-posed?

*F. C. Huang, G. Wetzstein, B. Barsky, R. Rasker, Eyeglasses-free display: Towards Correcting Visual Aberrations 122 With Computational Light Field Displays, Siggraph 2014

Condition Number of the Projection Matrix* Can we answer the question - how many rays (views) are needed per eye?

(Lower is better)

*F. C. Huang, G. Wetzstein, B. Barsky, R. Rasker, Eyeglasses-free display: Towards Correcting Visual Aberrations 123 With Computational Light Field Displays, Siggraph 2014
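A toy numerical illustration of the inverse problem above (illustrative only; the real projection matrix depends on the display and eye geometry, and the published methods additionally enforce nonnegativity and perceptual constraints): build a small sparse random P, solve for the display light field in a least-squares sense, and inspect the conditioning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy projection matrix: each retinal pixel integrates a few display rays.
n_retina, n_rays = 64, 256          # more display rays (views x pixels) than retinal samples
P = rng.random((n_retina, n_rays)) * (rng.random((n_retina, n_rays)) < 0.1)

i_target = rng.random(n_retina)     # desired retinal image

# Least-squares display light field (P is not square, so lstsq uses the pseudoinverse)
l_display, *_ = np.linalg.lstsq(P, i_target, rcond=None)

print("condition number of P:", np.linalg.cond(P))
print("retinal reconstruction error:", np.linalg.norm(P @ l_display - i_target))
```

Increasing the number of rays per retinal sample (more views per eye) is what drives the condition number down in the analysis cited above.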

How to Create a Light Field Display? • Need to create a natural accommodation response – Create the correct retinal blur corresponding to the 3D location of an object

• Fundamentally only two different ways to do this:
A. Create parallax across each eye that produces the correct retinal blur corresponding to the 3D location of the object being viewed – by presenting multiple views (integral imaging approach), or
B. Physically place the object at the appropriate focal plane corresponding to its 3D location – by providing multiple focal planes (multi-focal-plane approach)

• All real light field displays use one of these two ways – Group/multi-user displays typically use approach A – Single-user (Near-Eye/Head-Mounted) displays use approach A or B depending on their design point

• Fundamental questions for each approach: – For A, how many views are needed? – For B, how many planes are needed? 124

Achieving Natural 3D With Multi-Focal Displays • Spatial or temporal multiplexing can create multiple focal planes that place objects at appropriate distances to be consistent with vergence – Akeley (2004)* made the case that 16 or fewer depth planes are sufficient to provide an appearance of continuous depth and showed that interpolation could be used to place objects in between the display planes – Akeley (2004)* used beam splitters to superimpose images of different parts of a monitor on the same viewing axis – Love et al. (2009) used high-speed switchable lenses to change the optical distance of the monitor at different time instants to produce the effect of multiple planes

*K. Akeley, Achieving near-correct focus cues using multiple image planes” PhD thesis (Stanford 2004)

125

Light Field Displays: Group /Multi-User Displays

126

Traditional Stereoscopic 3D Displays • Two main types – With glasses – Without glasses

• Current large screen (group viewing) consumer 3D systems are based on glasses – which are one of two types:
– Passive glasses: wavelength-based, polarization-based
– Active glasses: electronically controlled liquid crystal shutters
[Diagram: With Glasses – wavelength division multiplexing, light polarization, light shuttering; Without Glasses (auto-stereoscopic) – parallax-barrier based, lenticular based]

127

Light Field Displays for Group Viewing • Major types: 1. Scanning-type (with rotating structure)* 2. Multi-projector arrays* 3. Multi-layer (with stacked layers of LCDs and optical elements)**

*X. Liu and H. Li, The progress of light field 3-D displays, Information Display 6/14, 2014 ** G. Wetzstein, Why people should care about light-field displays, Information Display 2/15, 2015

128

1. Scanning-Type*

129

*X. Liu and H. Li, The progress of light field 3-D displays, Information Display 6/14, 2014

2. Multi-Projector Array Type*

130

*X. Liu and H. Li, The progress of light field 3-D displays, Information Display 6/14, 2014

3. Multi-Layer Type* • Multi-layer approaches date back to the use of parallax barriers (Ives 1901) and lenslets (Lippmann 1908), which have also been used in autostereoscopic displays in the recent past. In the older approaches these layers were passive, but more recent ones include active (electronically operated) layers

mask 2

mask 1

*G. Wetzstein, Why people should care about light-field displays, Information Display 2/15, 2015

131

3. Multi-Layer Type – Compressive Displays* • Combination of stacked programmable light modulators and refractive optical elements – Leverage high correlation between the views in a light field to produce a more efficient display – use nonnegative tensor factorization to compress light field with high angular resolution into a set of patterns that can be displayed on a stack of LCD panels

• Tensor Displays (subset of compressive displays): – Uses (N) multi-layers, fast temporal modulation (M frames), and directional backlighting – Represent light field as Nth order rank-M tensor and use nonnegative tensor factorization (NTF) optimization framework to generate the required N x M patterns to be displayed Nonlinear optimization Problem*

Iterative update Rules* *G. Wetzstein. D. Lanman, M. Hirsch, R. Raskar, Tensor displays: compressive light field synthesis using multilayer displays with directional backlighting, SIGGRAPH 2012

132

3. Multi-Layer Type – Compressive Displays*

*G. Wetzstein. D. Lanman, M. Hirsch, R. Raskar, Tensor displays: compressive light field synthesis using multilayer displays with directional backlighting, SIGGRAPH 2012

133

Light Field Displays: Near-Eye (HeadMounted) Displays

134

The MIG Vision* • The next BIG change in mobile devices will be in the human interface • Driving a transition from Smartphone to Mobile Information Gateway (MIG) – the platform that provides True Mobility • MIG will be: – A mobile compute/communications module (CCM) + – A rich wearable human interface module (HIM) *N. Balram, W. Wu, K. Berkner, I. Tosic, Mobile Information Gateway – Enabling True Mobility, The 14th International Meeting on Information Display (IMID). Aug. 2014

135

MIG – Compute/Communications Module (CCM) The MIG-CCM will:

1. Provide general, graphics and multimedia processing for any type of application 2. Run a standard High-Level-Operating-System (Android, Windows, iOS) with a huge environment of third-party applications

3. Communicate with high bandwidth WAN, LAN, PAN, including intelligent self-powered sensors incorporated into the body and/or clothing
4. Come in both traditional and novel form factors
[Diagram: Android software stack – Applications, Application Framework, Libraries / Android Runtime, Linux Kernel (High Level Operating System); the CCM can have a traditional or new form factor]
136

MIG – Human Interface Module (HIM)
The MIG-HIM will need to be a lightweight Head-Mounted Display (HMD) that resembles a pair of eyeglasses – the only practical means of meeting the key requirements:
1. Wide field of view to make a large image
2. Ability to capture and interpret gestures
3. Seamless overlay of digital over the real world
4. Natural 3D
[Image: lightweight eyeglasses]
137

Classification of Head-Mounted Displays (HMDs)
Requirements: 1. Wide field of view to make a large image; 2. Ability to capture and interpret gestures; 3. Seamless overlay of digital over the real world; 4. True (volumetric) 3D
• Virtual reality (VR) displays
– Stereoscopic (e.g. Oculus Rift, Sony)
– Light Field: no products available yet in this category
• Augmented reality (AR) displays – video see-through or optical see-through; monocular (e.g. Google) or binocular (e.g. Epson)
– Stereoscopic
– Light Field: no products available yet in this category – the Mobile Information Gateway Human Interface Module targets this slot
138

Light Field Displays: Head-Mounted Displays (HMDs): Virtual Reality (VR)

139

VR HMD Examples* • Many VR HMDs shipping or announced • No light field products yet

*Courtesy Gordon. Wetzstein

140

VR HMD Applications – Gaming*

*Courtesy Gordon. Wetzstein

141

VR HMD Applications – Entertainment*

*Courtesy Gordon. Wetzstein

142

Depth Cues At Different Distances*
[Figure: depth contrast of depth cues (e.g. aerial perspective, relative height) vs. distance from 1 m to 10,000 m, spanning personal, action and vista space, with the range addressed by current HMDs marked; after Cutting and Vishton 1995]
*F. C. Huang, K. Chen, G. Wetzstein, The Light Field Stereoscope: Immersive Computer Graphics Via Factored Near-Eye Light Field Displays With Focus Cues, SIGGRAPH 2015

143

Light Field Stereoscope (Prototype)* • Simple multi-layer (“compressive”) display – Only two LCD panels – No temporal multiplexing required

*F. C. Huang, K. Chen, G. Wetzstein, The Light Field Stereoscope: Immersive Computer Graphics Via Factored Near-Eye Light Field Displays With Focus Cues, SIGGRAPH 2015

144

Light Field Stereoscope (Prototype)* • Parallax across the eye provides focal cues

Rays on a horizontal scanline observed at the centre of the viewer's left and right pupils are shown in the 2D (x,u) diagrams on the right. The bottom diagram shows that for a conventional stereo display there is no parallax, while the top diagram shows that the light field stereoscope has parallax. *F. C. Huang, K. Chen, G. Wetzstein, The Light Field Stereoscope: Immersive Computer Graphics Via Factored Near-Eye Light Field Displays With Focus Cues, SIGGRAPH 2015

145

Light Field Stereoscope (Prototype)* • Limitations posed by diffraction*

*F. C. Huang, K. Chen, G. Wetzstein, The Light Field Stereoscope: Immersive Computer Graphics Via Factored Near-Eye Light Field Displays With Focus Cues, SIGGRAPH 2015

146

Light Field Stereoscope (Prototype)*
• Limitations posed by diffraction*
– Left figure: the higher the resolution of the front panel, the more blur is created on the rear panel due to diffraction. Assuming viewers focus on the virtual image of the rear panel (placed at 1.23 m in this case), high-resolution viewing experiences will only be possible using a low pixel density for the front panel.
– Right figure: for a fixed resolution of the front panel, the maximum number of light field views entering a 3 mm wide pupil is plotted. Up to ~175 dpi the angular sampling rate is limited by geometry; above that, it is limited by diffraction. But even up to 500 dpi the maximum number of views remains above 2, so accommodation could theoretically still be driven.

*F. C. Huang, K. Chen, G. Wetzstein, The Light Field Stereoscope: Immersive Computer Graphics Via Factored Near-Eye Light Field Displays With Focus Cues, SIGGRAPH 2015

147

Light Field Displays: Head-Mounted Displays (HMDs): Augmented Reality (AR)

148

Three Major Types of AR HMDs* • Type 1: Monocular basic system for simple tasks – Examples are Vuzix M100, Google Glass, Sony SmartEyeglass Attach

• Type 2: Binocular 2D/3D system for simple and moderate tasks – Examples are Epson Moverio, Sony SmartEyeglass SED-E1

• Type 3: Binocular 2D/3D system for moderate and complex tasks – Examples are Atheer Labs AiR, Magic Leap, Microsoft HoloLens – Light Field Displays are a subset of this type

*Insight Media, Market Analysis Report on B2B Augmented Reality HMD, Custom Report for Ricoh, April 2015

149

Major B2B Use Cases and Verticals (2020)*
Use Cases:
• Collecting items from a checklist
– Identify items on shelves, verify correct, place in basket/cart
• Mobile access to information and/or documentation
– Access and complete checklists, review manuals, etc.
Verticals:
• Manufacturing
• Transportation & Warehousing (Logistics)
• Retail Trade
• Healthcare & Social Services
• Construction, Repair, Maintenance
• First Responders (police, fire, security)

150

Example: Type 1 (Monocular) Use Case in Logistics

• Ricoh pilot with DHL http://www.dhl.com/content/dam/downloads/g0/about_us/logistics_insights/csi_augmented_reality_report_290414.pdf

151

Example: Type 3 (Light Field) Use Case in “Bank Branch of the Future” • “Bank Branch of the Future” – For the concept of the bank branch to exist in the future decades, it needs to be completely re-defined to become a much more useful, interactive and pleasant place – One concept of this branch of the future is for it to be like a first class airline lounge with a comfortable ambience where customers and bank employees can interact in ways that feel natural and enable much more customer value than they would get from an online interaction

• Animated illustration of the application:

152

Example: Type 3 (Light Field) Use Case in Entertainment • Mixed-reality gaming – Play digital fantasy game in real environment

153

High Level System Overview*

Image source

Focus modulator

Optical combiner + eye piece

Two alternatives 1. Matrix display 2. Laser scanning technology *W. Wu, K. Berkner, I. Tosic, N. Balram, Personal near-to-eye light field displays, Information Display 6/14

154

1A. Matrix Display – Reflective* **
[Diagram: image source – digital micromirror device (DMD); focus modulator – deformable membrane mirror device (DMMD)*; optical combiner; eyepiece. Temporal multiplexing required]

* X. Hu and H. Hua, Design and assessment of a depth-fused multi-focal plane display prototype, J. of Display Tech. 2014 ** P. Llull, N. Bedard, W. Wu, I. Tosic, K. Berkner, N. Balram, Design and optimization of a near-eye multi-focal display 155 system for augmented reality, COSI, June 2015

1A. Matrix Display – Reflective*

• Fast temporal image source enables the presentation of a number of focal planes • By using depth blending which distributes weighted image intensities across planes, such displays can approximate a continuous depth volume *W. Wu, K. Berkner, I. Tosic, N. Balram, Personal near-to-eye light field displays, Information Display 6/14

156

1B. Matrix Display – Emissive* OLED Spatial multiplexing required

Image source

Microlens array (MLA)

Focus modulator Optical combiner + eye piece

* H. Hua and B. Javidi, A 3D integral imaging optical see-through head-mounted display, Optics Express, 2014

157

2A. Laser Scanning Display* Laser image source

Scanning mirrors for x-y scanning

Deformable membrane mirror device (DMMD)*

Focus modulator

Temporal multiplexing required *B. T. Schowengerdt et al., True Three-Dimensional Displays that Allow Viewers to Dynamically Shift Accommodation, Bringing Objects Displayed at Different Viewing Distances Into and Out of Focus, Cyberpsychology & Behavior, 2004 158

2B. Laser Scanning Display – Fiber Array* Fiber array

Laser diodes, fiber optic scanning in x-y, “multifocal” fiber array

Laser scanning

Scanning mirror

No temporal multiplexing of focal planes required

Focus modulator

Optical combiner + eye piece

*B. T. Schowengerdt and E. J. Seibel, 3D volumetric scanned light display with multiple fiber optics light sources, IDW 2010

159

*B. T. Schowengerdt and E. J. Seibel, 3D volumetric scanned light display with multiple fiber optics light sources, IDW 2010

2B. Laser Scanning Display – Fiber Array* Eye focuses on foreground (background blurs naturally)

160

*B. T. Schowengerdt and E. J. Seibel, 3D volumetric scanned light display with multiple fiber optics light sources, IDW 2010

2B. Laser Scanning Display – Fiber Array* Eye focuses on background (foreground blurs naturally)

161

Key System Design Challenges • Display specification tradeoffs – Spatial resolution, color depth, depth resolution – Brightness, contrast, color gamut, power consumption

• Perceptually correct overlay of digital information over the appropriate real world objects

• Optical tradeoffs – Field of View (FOV), weight, form-factor

• System latency between inputs and outputs • Software platform and eco-system 162

Display System Tradeoffs*

*W. Wu, K. Berkner, I. Tosic, N. Balram, Personal near-to-eye light field displays, Information Display 6/14

163

Displaying Light Field Imagery on MultiFocal Displays (MFD) • A fundamental system design point is to choose the number of focal planes • There are significant tradeoffs that have to be made to increase the number of planes • Rule of thumb is to choose the minimum number necessary for the target application and focus on how to use them in the most effective manner • In many cases 6 focal planes may be a reasonable choice

[Diagram: six focal planes at depths z1 … z6]

164
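As a quick reference for the dioptric units used on this and the following slides, the sketch below converts focal plane positions in diopters to metric viewing distances (a diopter is simply an inverse metre; the 0.6D spacing is the value used later in the deck).

```python
def diopters_to_metres(D):
    """Convert an accommodation distance from diopters to metres (0 D = optical infinity)."""
    return float('inf') if D == 0 else 1.0 / D

for D in [0, 0.6, 1.2, 1.8, 2.4, 3.0]:     # e.g. six planes spaced 0.6 D apart
    print(f"{D:.1f} D -> {diopters_to_metres(D):.2f} m")
```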

Displaying Light Field Imagery on MultiFocal Displays (MFD) • Depth blending is used to create an appearance of continuous depth across the set of physical planes – This can be done using linear weighting of pixel values between adjacent planes (a minimal sketch follows this slide) or through nonlinear optimization techniques

• A number of approaches have been developed to determine the content that should be displayed on the focal planes to achieve the desired result

K. J. MacKenzie et al. Journal of Vision (2010).

• The current approaches can be divided into those that assume: (a). The location of the planes is fixed (static), or (b). The location of the planes can be varied dynamically

165
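As a concrete illustration of the linear case mentioned above, the sketch below splits each pixel's intensity between the two focal planes that bracket its depth, with weights proportional to dioptric proximity. It is a minimal sketch of standard depth-weighted (depth-fused) blending, not the exact implementation used in the cited work.

```python
import numpy as np

def linear_depth_blend(image, depth_D, plane_depths_D):
    """Distribute pixel intensities across focal planes by linear depth blending.

    image          : (H, W) intensity image
    depth_D        : (H, W) per-pixel depth in diopters
    plane_depths_D : increasing 1D array of focal plane depths in diopters
    returns        : (N, H, W) stack of plane images that sum to `image`
    """
    planes = np.asarray(plane_depths_D, dtype=float)
    stack = np.zeros((len(planes),) + image.shape)
    z = np.clip(depth_D, planes[0], planes[-1])
    # Index of the nearer (lower-diopter) bracketing plane for every pixel
    lo = np.clip(np.searchsorted(planes, z, side="right") - 1, 0, len(planes) - 2)
    hi = lo + 1
    # Weight of the farther plane grows linearly as the pixel depth approaches it
    w_hi = (z - planes[lo]) / (planes[hi] - planes[lo])
    w_lo = 1.0 - w_hi
    rows, cols = np.indices(image.shape)
    stack[lo, rows, cols] += w_lo * image
    stack[hi, rows, cols] += w_hi * image
    return stack

# Example: 6 planes spaced 0.6 D apart, as in the prototype described later in the deck
planes = np.arange(6) * 0.6
img = np.random.rand(4, 4)
depth = np.random.uniform(0, 3.0, size=(4, 4))
stack = linear_depth_blend(img, depth, planes)
assert np.allclose(stack.sum(axis=0), img)   # intensities are conserved across planes
```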

Presenting Images on MFD With Fixed Planes: Using Linear Depth Blending*

K. J. MacKenzie et al. Journal of Vision (2010).

*S. Ravikumar, K. Akeley, M. S. Banks, Creating effective focus cues in multi-plane 3D displays, Optics Express, Oct. 2011

166

Presenting Images on MFD With Fixed Planes: Using Linear Depth Blending*

Advantages*:
1. Computationally simple
2. Effective in maximizing retinal image contrast when the eye accommodates to the simulated distance
3. Provides appropriate contrast gradient to drive the eye’s accommodative response

Disadvantages**:
1. Does not handle complex scenes (with occlusions, reflections and other non-Lambertian phenomena) correctly

*S. Ravikumar, K. Akeley, M. S. Banks, Creating effective focus cues in multi-plane 3D displays, Optics Express, Oct. 2011
**R. Narain, R. A. Albert, A. Bulbul, G. J. Ward, M. S. Banks, J. F. O'Brien, Optimal Presentation of Imagery with Focus Cues on Multi-Plane Displays, ACM Trans. on Graphics, Vol. 34, No. 4, August 2015

167

Presenting Images on MFD With Fixed Planes: Using Content-Adaptive Opt.* • In the content-adaptive optimization approach the problem is formulated as follows: – Given a fixed number (N) of planes, equally spaced dioptrically, determine the intensity values for each pixel on each plane so that the image v(z) seen by the viewer is close to the desired image of the scene s(z) – The viewed image is described as the sum of the images on the respective planes, each convolved with the point-spread function (PSF) of the eye – Closeness between the desired and actual image is defined by an error metric – a traditional L2 distance weighted by the contrast sensitivity of the HVS – The cost function is simplified by working in the frequency domain and the optimization is done using the Primal-Dual Hybrid Gradient (PDHG) algorithm; see [Narain et al. 2015]* for details (a schematic formulation is sketched after this slide)

• The number of planes is chosen separately, based on past studies ([MacKenzie et al. 2010], [Ravikumar et al. 2011]) indicating that a separation between 0.6 and 0.9D is reasonable – In [Narain et al. 2015]*, 4 planes are used, placed at [1.4D, 2.0D, 2.6D, 3.2D] *R. Narain, R. A. Albert, A. Bulbul, G. J. Ward, M. S. Banks, J. F. O'Brien, Optimal Presentation of Imagery with Focus Cues on Multi-Plane Displays, ACM Trans. on Graphics, Vol. 34, No. 4, August 2015 168
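A schematic way to write the optimization described above, using generic notation rather than the paper's exact formulation, is:

```latex
% Schematic formulation with generic notation (not copied verbatim from the paper).
% I_k      : image presented on focal plane k at depth z_k (k = 1..N, fixed plane locations)
% h(z,z_k) : point-spread function of the eye for plane z_k when accommodated to depth z
% s(z)     : desired retinal image of the scene at accommodation distance z
% W        : frequency weighting derived from the contrast sensitivity function
v(z) \;=\; \sum_{k=1}^{N} h(z, z_k) * I_k ,
\qquad
\{I_k^{*}\} \;=\; \operatorname*{arg\,min}_{I_k \,\ge\, 0} \;\sum_{z} \bigl\| W \bigl( v(z) - s(z) \bigr) \bigr\|_2^2 .
```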

Presenting Images on MFD With Fixed Planes: Using Content-Adaptive Opt.*

*R. Narain, R. A. Albert, A. Bulbul, G. J. Ward, M. S. Banks, J. F. O. Brien, Optimal Presentation of Imagery with Focus Cues 169 on Multi-Plane Displays, ACM Trans. on Graphics, Vol. 34. No. 4, August 2015

Presenting Images on MFD With Fixed Planes: Using Content-Adaptive Opt.* • Improved occlusion handling by optimized blending versus linear blending

*R. Narain, R. A. Albert, A. Bulbul, G. J. Ward, M. S. Banks, J. F. O. Brien, Optimal Presentation of Imagery with Focus Cues 170 on Multi-Plane Displays, ACM Trans. on Graphics, Vol. 34. No. 4, August 2015

Presenting Images on MFD With Fixed Planes: Using Content-Adaptive Opt.*

*[R. Narain et al. 2015]

171

Presenting Images on MFD With Dynamically Variable Planes* • Allocate locations of planes based on content instead of fixing them

• Requires MFD that is capable of dynamic variation of plane locations

Image from Ng, Ren, et al., "Light field photography with a hand-held plenoptic camera," Computer Science Technical Report CSTR 2.11 (2005).

[Plot: depth distribution Pr(z) versus depth z] Conventional focal plane allocation (uniform spacing in diopters)

*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Near-eye display of light fields, IDW 2015

172

Presenting Images on MFD With Dynamically Variable Planes: K-Mean Opt.* • Consider placement of focal planes as a point clustering problem – 3D point cloud where the points have intensities and depth information

• We want to optimize the positions of focal planes q1, …, qM – Minimize the distance between data points and their nearest focal planes (to reduce contrast loss) – Solve using K-means (a minimal sketch follows this slide)

[Plot: depth histogram hist(z) with conventional focal plane positions q1 … q6 (blue lines) and content-adaptive positions q1* … q6* (red lines)]

Conventional focal plane allocation (blue lines); content-adaptive focal plane allocation (red lines)
*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Near-eye display of light fields, IDW 2015

173
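A minimal sketch of the clustering idea is shown below: the per-pixel scene depths (in diopters) are treated as a one-dimensional point set and K-means chooses the plane positions. This only illustrates the approach summarized on the slide; the published method operates on the full 3D point cloud with intensities and additional constraints.

```python
import numpy as np

def kmeans_plane_placement(depths_D, num_planes=6, iters=50, seed=0):
    """Place focal planes at the K-means centroids of scene depths (in diopters)."""
    rng = np.random.default_rng(seed)
    d = np.asarray(depths_D, dtype=float).ravel()
    planes = rng.choice(d, size=num_planes, replace=False)    # initial guesses
    for _ in range(iters):
        # Assign every depth sample to its nearest focal plane
        labels = np.argmin(np.abs(d[:, None] - planes[None, :]), axis=1)
        # Move each plane to the mean depth of its assigned samples
        for k in range(num_planes):
            if np.any(labels == k):
                planes[k] = d[labels == k].mean()
    return np.sort(planes)

# Example: a scene with objects clustered near 0.3 D, 1.0 D and 2.8 D
rng = np.random.default_rng(1)
depths = np.concatenate([rng.normal(0.3, 0.05, 4000),
                         rng.normal(1.0, 0.10, 3000),
                         rng.normal(2.8, 0.10, 3000)])
uniform_planes = np.arange(6) * 0.6                     # conventional 0-3 D spacing
adaptive_planes = kmeans_plane_placement(depths, num_planes=6)
print("uniform :", uniform_planes)
print("adaptive:", np.round(adaptive_planes, 2))
```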

Presenting Images on MFD With Dynamically Variable Planes: K-Mean Opt.*

Image from Ng, Ren, et al., "Light field photography with a hand-held plenoptic camera," Computer Science Technical Report CSTR 2.11 (2005).

[Illustration: scene content at depths z1 = 0.2D, z2 = 0.5D, z3 = 1.0D, z4 = 1.9D, z5 = 2.8D, with uniformly spaced focal planes at p1 = 0D, p2 = 0.6D, p3 = 1.2D, p4 = 1.8D, p5 = 2.4D, p6 = 3.0D, and the depth distribution Pr(z) versus z]

Conventional focal plane allocation (blue lines); content-adaptive focal plane allocation (red lines)
*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Near-eye display of light fields, IDW 2015

174

Presenting Images on MFD With Dynamically Variable Planes: K-Mean Opt.* • Results from a simulated 3D scene

[Figure: input image and depth map; simulated retinal images** accommodated at 0.41D and 0.71D, comparing uniform (0.6D spacing) and optimized plane placement]

Linear depth blending is used for data between focal planes; scene data from Scharstein et al., GCPR 2014

**Arizona eye model, J. Schwiegerling, SPIE, 2004

*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Near-eye display of light fields, IDW 2015

175

Presenting Images on MFD With Dynamically Variable Planes: K-Mean Opt.* • Real light field data

Light Field Camera Prototype

Depth

Plenoptic image: 3 books at specific depths

Image of a displayed scene

*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Near-eye display of light fields, IDW 2015

176

Rendered images captured by a camera behind the beamsplitter of the MFD

Presenting Images on MFD With Dynamically Variable Planes: K-Mean Opt.* Displayed scene – rendered images captured by a camera behind the beamsplitter of the MFD

*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Near-eye display of light fields, IDW 2015

177

Presenting Images on MFD With Dynamically Variable Planes: K-Mean Opt.* • Comparison of uniform versus optimized plane placement – using simulated retinal images – Optimized planes produce better image quality (greater contrast and sharpness)

*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Near-eye display of light fields, IDW 2015

178

Presenting Images on MFD With Dynamically Variable Planes: Content-Adaptive Opt.* • Find the optimal placement of focal planes by optimizing an objective function that characterizes the overall perceptual quality of the rendered 3D scene – Using a metric defined as “Multi-Focal Scene Defocus Quality (MSDQ)”

*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Content-adaptive focus configuration for near-eye multi-focal displays, ICME 2016

179

Presenting Images on MFD With Dynamically Variable Planes: Content-Adaptive Opt.* • Retinal image results – Optimal placement of planes produces sharper and higher contrast region of interest

Uniform placement of focal planes

Optimal placement

*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Content-adaptive focus configuration for near-eye multi-focal displays, ICME 2016

180

Spatio-Angular Frequency Analysis* • Compare the spectrum of multi-focal (additive) versus layered (multiplicative) displays* – The multi-focal display region comprises line spectra – with one line added for each plane – The layered display region is formed by the convolution of the support of each layer, producing a significantly larger region, but this is a theoretical upper bound that is not easily achieved in practice (a schematic statement of the two cases follows this slide)

*M. S. Banks, D. M. Hoffman, J. Kim, G. Wetzstein, 3D Displays, publication pending, 2016

181
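The contrast between the two cases follows from basic Fourier properties; a schematic statement (generic notation, not taken verbatim from the reference) is:

```latex
% Schematic statement with generic notation.
% Additive (multi-focal) display: the displayed light field is a sum of plane
% contributions, so its spatio-angular spectrum is the sum of the individual (line) spectra:
\hat{L}_{\text{add}}(f_s, f_u) \;=\; \sum_{k=1}^{N} \hat{L}_k(f_s, f_u)
% Multiplicative (layered) display: the layers attenuate light multiplicatively,
% so the spectrum is the convolution of the layer spectra, whose support can be
% much larger (a theoretical upper bound):
\hat{L}_{\text{mult}}(f_s, f_u) \;=\; \bigl(\hat{L}_1 * \hat{L}_2 * \cdots * \hat{L}_M\bigr)(f_s, f_u)
```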

Key System Design Challenges • Display specification tradeoffs – Spatial resolution, color depth, depth resolution – Brightness, contrast, color gamut, power consumption

• Perceptually correct overlay of digital information over the appropriate real world objects

• Optical tradeoffs – Field of View (FOV), weight, form-factor

• System latency between inputs and outputs • Software platform and eco-system 182

Overlay of Virtual Over Real World* • Overlaying virtual information on associated real objects at appropriate depth and size is a critical element of AR

• Calibration is critical to doing the overlay correctly* MIG user

Traditional: Human-Computer Interface; MIG: Human-Computer-World Interface *W. Wu, I. Tosic, K. Berkner, N. Balram, Depth-disparity calibration for augmented reality on binocular optical see-through displays, ACM Multimedia Systems Special Session on Augmented Reality, March 2015 183

Multi-Focal Display Research Prototype*

Prototype specifications:
• 31 degree diagonal FOV
• 6-bit grayscale imagery
• 55 fps per focal plane (>60 fps possible, but this approaches the resonant frequency of the tunable lens)
• 6 focal planes
• 0-5 Diopters focal workspace possible
• 1.3 arcmin resolution
• 4mm exit pupil diameter
• 12mm eye relief
• Large depth of field
• Achieved binocular display

*P. Llull, N. Bedard, W. Wu, I. Tosic, K. Berkner, N. Balram, Design and optimization of a near-eye multi-focal display system for augmented reality, COSI, June 2015

184

Multi-Focal Display Research Prototype • Primary components needed: display element, focal modulation element, relay optics – Must be capable of generating images and modulating focus at 60·N_f Hz, where N_f is the number of focal planes

• Desirables: large FOV, eye box, focal workspace; high resolution

[Plot: focal modulator optical power (D) versus time over one ocular critical sampling period (1/60 s); max. 2.8ms single image exposure time per focal plane]

185
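The timing requirement can be made concrete with a small calculation: the ocular critical sampling rate (taken as 60 Hz here) multiplied by the number of focal planes sets the plane rate, and its reciprocal sets the per-plane exposure budget.

```python
# Simple timing budget for a temporally multiplexed multi-focal display.
ocular_rate_hz = 60          # full focal volume must refresh at ~60 Hz
num_planes = 6               # focal planes in the prototype

plane_rate_hz = ocular_rate_hz * num_planes            # image/focus updates per second
exposure_ms = 1000.0 / plane_rate_hz                   # time available per focal plane

print(f"plane rate: {plane_rate_hz} Hz")                # 360 Hz
print(f"max exposure per plane: {exposure_ms:.1f} ms")  # ~2.8 ms, matching the slide
```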

Multi-Focal Display Research Prototype*

focal plane

*P. Llull, N. Bedard, W. Wu, I. Tosic, K. Berkner, N. Balram, Design and optimization of a near-eye multi-focal display system for augmented reality, COSI, June 2015

186

Multi-Focal Display Research Prototype 𝑓1 = 35𝑚𝑚

Tunable lens

𝑓2 = 35𝑚𝑚

Slotted ring

25mm tube

𝑓3 = −75𝑚𝑚 𝑓𝑒 = 25𝑚𝑚

Adjustable polarizers Eyepiece-beamsplitter adaptor

Projector 1 Projector 2 Optical rails for adjustable interpupillary distance 187

Focal Modulation Waveform Shaping • High-frequency step impulses result in ringing (“focal jitter”) and overshoot

• Use a “pyramid”-shaped waveform to reduce overshoot – Distribute the large jump along both sides of the periodic waveform

[Plot: liquid lens focal planes over 1 cycle (1/55 sec), conventional staircase waveform versus the “pyramid” waveform (“ours”)]

188

Focal Modulation Waveform Filtering • “Focal jitter”: ringing resulting from high-frequency step inputs. – Can reduce this by filtering the input waveform

• Which filter? – Can be designed, optimized, or empirically determined

[Plot: lens step response, unfiltered (2ms response time, 5ms settling time) versus filtered input]

189
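The sketch below illustrates the two ideas from this and the previous slide: a “pyramid” drive waveform that ramps up and back down so no single transition spans the full focal range, and Gaussian low-pass filtering of the command to suppress ringing. The waveform values and the filter width are placeholder assumptions, not the prototype's actual drive signal.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

planes_D = np.arange(6) * 0.6             # target optical powers, 0-3 D
dwell = 60                                # samples spent on each focal plane

# Conventional staircase: step 0 -> 3 D, then one large jump back to 0 D per cycle
staircase = np.repeat(planes_D, dwell)

# "Pyramid" ordering: ramp up then down, so the big return jump is avoided
pyramid_order = np.concatenate([planes_D, planes_D[-2:0:-1]])
pyramid = np.repeat(pyramid_order, dwell)

# Gaussian low-pass filtering of the command softens the step edges that
# excite ringing ("focal jitter") in the tunable lens
sigma_samples = 8                         # placeholder filter width
filtered = gaussian_filter1d(pyramid, sigma_samples)

print(staircase.shape, pyramid.shape, filtered.shape)
```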

Focal Waveform Filtering Experimental Procedure
Focal workspace 0-3D (6 focal planes). For each of the k = {1, …, 6} focus positions, find the σ*k that maximizes image sharpness:
1. Move the camera focus to focus position k
2. Test image sharpness for each filter standard deviation σk = {.001, .002, …, .014}:
– Change the filter standard deviation σk
– Capture several images with the camera
– Compute the mean of the captured images
– Compute the 2D sum of the Brenner gradient on the mean image
3. Average the σ*k to obtain the optimal filter standard deviation σ*

190
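The sharpness score used in the procedure above is the Brenner gradient; a minimal sketch of a 2D version (summing squared two-pixel differences along both image axes) follows. The exact normalization used in the experiment is not specified on the slide.

```python
import numpy as np

def brenner_2d(image):
    """2D Brenner gradient: sum of squared differences between pixels two apart."""
    img = np.asarray(image, dtype=float)
    dx = img[:, 2:] - img[:, :-2]          # horizontal two-pixel differences
    dy = img[2:, :] - img[:-2, :]          # vertical two-pixel differences
    return np.sum(dx ** 2) + np.sum(dy ** 2)

# Sharper images give a larger score: compare an image with a blurred copy
rng = np.random.default_rng(0)
sharp = rng.random((64, 64))
blurred = 0.25 * (sharp + np.roll(sharp, 1, 0) + np.roll(sharp, 1, 1)
                  + np.roll(np.roll(sharp, 1, 0), 1, 1))
print(brenner_2d(sharp) > brenner_2d(blurred))   # True
```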

Focal Waveform Filtering Results • Evaluated focus quality of each focal plane sequentially.

Focal plane 3 in focus

𝜎3∗ = 0.004 191

Multi-Focal Imagery Demo: Setup

Exposure time: 1/55s

192

Demo Video: Sweep Through Focus

[Video frames with three labeled regions (1, 2, 3) shown at two focus settings]

193

Light Field Displays: Key Points to Remember • Light field displays are intended to enable natural and comfortable viewing of 3D scenes – Providing natural and consistent stereo, parallax and focus cues – Avoiding well-known cue conflicts like VAC

• Natural accommodation response can be created in two ways: – Providing parallax across each eye that produces natural retinal blur corresponding to the 3D location of the object being viewed – integral imaging approach – Placing the object being viewed onto a focal plane at the appropriate distance – multi-focal-plane approach

• Need to make tradeoffs in design specifications based on target applications – Most fundamental separation by usage is Group/Multi-User versus Personal/Single-User (Near-Eye/Head-Mounted) – Group/Multi-User needs to support multiple viewpoints whereas Near-Eye/Head-Mounted can support a single viewpoint 194

Light Field Displays: Key Points to Remember

• Group/Multi-User light field displays:
– Three main types: Scanning, Multi-projector, Multi-layer
– Primarily use integral imaging approach (providing large number of views) – since this also provides parallax in addition to stereo and focus
– Compressive displays are a family of multi-layer displays that use advanced computer graphics computation algorithms to create and present a large number of views using a small number of spatial and temporal layers

• Personal/Single-User (Near-Eye/Head-Mounted) light field displays:
– Two main categories: Virtual Reality (VR) and Augmented Reality (AR)

• Virtual Reality (VR) HMDs:
– Gaming and entertainment are leading applications driving the market at this time
– Can use the compressive display approach to provide natural 3D viewing but diffraction is a major obstacle that needs to be addressed

• Augmented Reality (AR) HMDs:
– Large number of possible AR applications, especially in verticals like logistics, manufacturing, healthcare, construction, first responders etc.
– AR HMDs could be divided into three main types – I (Basic/monocular), II (Binocular 2D/3D), III (Binocular Advanced 3D) – with applications and products already existing for types I and II
– For Type III, there are many tradeoffs that need to be made based on target application
– Can use integral imaging or multi-focal-plane approaches – these correspond to using spatial or temporal multiplexing
– Practical multi-focal displays can be created with a small number of planes by using depth blending to provide the perception of continuous depth

195

Section 5: Summary

196

HVS: Key Points to Remember • “The human visual system detects and interprets information from visible light to build a representation of the surrounding environment.”* • The visual pathway begins at the eyes and ends at the visual cortex • What we “see” is not the raw image on the retina but our interpretation of it • The interpretation depends on a set of sensory information (“cues”) that we extract from the data and on the rules that our system has developed during the course of our evolution (“prior model”) • Confusion (optical illusions) can arise when the data is considered suspect and is overruled by the prior model • Cue conflicts can cause physical ill-effects like nausea and fatigue * https://en.wikipedia.org/wiki/Visual_system

197

Light Fields: Key Points to Remember • Plenoptic function is a 7D function describing light flowing through space • This can be reduced to various useful subsets • Light field is a 4D function describing radiance as a function of position and direction – Simple representation using two parallel planes with 2D views (u,v) and 2D positions (s, t)

• Light fields can be captured using an array of cameras or a small-form factor camera with micro-lenses or multiple apertures – Each form of capture has tradeoffs and the best choice depends on the objectives

• Light fields can be displayed using an array of display engines or a display with special optical layers 198

Light Field Imaging: Key Points to Remember

• A number of tradeoffs have to be made based on the specific target application
– Possible applications include medical, factory automation/inspection, consumer content creation etc.

• Robust system methodology exists for design of the end-to-end system based on key performance metrics for the target application
– When designing the system, figure out requirements for spatial resolution, angular resolution (#views), depth resolution and range, temporal resolution, and spectrum

• Can use an array of cameras (sensors) or a single camera (sensor)
– Array approach enables high spatial resolution and wider baseline (provides depth for distant objects) but is bulkier and more costly
– Single camera approach enables a compact system and high angular resolution but has limited spatial resolution and narrow baseline (provides depth for closer objects only)

• Calibration is a critical first step of the processing

• Depth can be estimated accurately using layer-based approaches or dense field approaches

• Processing based on geometric models is applicable in most cases, but diffraction models are needed for applications involving high magnification

199

Light Field Displays: Key Points to Remember • Light field displays are intended to enable natural and comfortable viewing of 3D scenes – Providing natural and consistent stereo, parallax and focus cues – Avoiding well-known cue conflicts like VAC

• Natural accommodation response can be created in two ways: – Providing parallax across each eye that produces natural retinal blur corresponding to the 3D location of the object being viewed – integral imaging approach – Placing the object being viewed onto a focal plane at the appropriate distance – multi-focal-plane approach

• Need to make tradeoffs in design specifications based on target applications – Most fundamental separation by usage is Group/Multi-User versus Personal/Single-User (Near-Eye/Head-Mounted) – Group/Multi-User needs to support multiple viewpoints whereas Near-Eye/Head-Mounted can support a single viewpoint 200

Light Field Displays: Key Points to Remember

• Group/Multi-User light field displays:
– Three main types: Scanning, Multi-projector, Multi-layer
– Primarily use integral imaging approach (providing large number of views) – since this also provides parallax in addition to stereo and focus
– Compressive displays are a family of multi-layer displays that use advanced computer graphics computation algorithms to create and present a large number of views using a small number of spatial and temporal layers

• Personal/Single-User (Near-Eye/Head-Mounted) light field displays:
– Two main categories: Virtual Reality (VR) and Augmented Reality (AR)

• Virtual Reality (VR) HMDs:
– Gaming and entertainment are leading applications driving the market at this time
– Can use the compressive display approach to provide natural 3D viewing but diffraction is a major obstacle that needs to be addressed

• Augmented Reality (AR) HMDs:
– Large number of possible AR applications, especially in verticals like logistics, manufacturing, healthcare, construction, first responders etc.
– AR HMDs could be divided into three main types – I (Basic/monocular), II (Binocular 2D/3D), III (Binocular Advanced 3D) – with applications and products already existing for types I and II
– For Type III, there are many tradeoffs that need to be made based on target application
– Can use integral imaging or multi-focal-plane approaches – these correspond to using spatial or temporal multiplexing
– Practical multi-focal displays can be created with a small number of planes by using depth blending to provide the perception of continuous depth

201

Section 6: References

202

Vision Science & Light Fields •

M. S. Banks, W. W. Sprague, J. Schmoll, J. A. Q. Parnell, G. D. Love, “Why do animal eyes have pupils of different shapes?”, Sci. Adv. August 2015



S. Wanner, S. Meister, B. Goldluecke, “Datasets and benchmarks for densely sampled 4D light fields”, Vision, Modeling & Visualization, The Eurographics Association, 2013



R. T. Held, E. Cooper, J. F. O’Brien, M. S. Banks, “Using blur to affect perceived distance and size”, ACM Trans. Graph. 29, 2, March 2010



D. Hoffman, A. Girshick, K. Akeley, M. S. Banks, “Vergence-accommodation conflicts hinder visual performance and cause visual fatigue”, Journal of Vision 8(3):33, 2008



M. S. Banks, et. al., “Conflicting focus cues in stereoscopic displays”, Information Display, July 2008



Levoy and Hanrahan, “Light field rendering”, SIGGRAPH 1996



E. H. Adelson, J. Y. A. Wang, “Single lens stereo with a plenoptic camera”, IEEE Trans. PAMI, Feb. 1992



E. H. Adelson, J. R. Bergen, “The plenoptic function and the elements of early vision”, Computational Models of Visual Proc., MIT Press, 1991 203

Light Field Imaging •

N. Balram, I. Tosic, H. Binnamangalam, “Digital health in the age of the infinite network”, Journal APSIPA, 2016



L. Meng, K. Berkner, “Parallax rectification for spectrally-coded plenoptic cameras”, IEEE-ICIP, 2015



N. Bedard, I. Tošić, L. Meng, A. Hoberman, J. Kovacevic, K. Berkner, “In vivo ear imaging with a light field otoscope”, Bio-Optics: Design and Application, April 2015



K. Akeley, “Light-field imaging approaches commercial viability”, Information Display 6/15, Nov./Dec. 2015



I. Tošić, K. Berkner, “3D Keypoint Detection by Light Field Scale-Depth Space Analysis,” ICIP, October 2014 (Best Paper Award)



N. Bedard, I. Tošić, L. Meng, K. Berkner, “Light field Otoscope,” OSA Imaging and Applied Optics, July 2014



J. Park, I. Tošić, K. Berkner, “System identification using random calibration patterns,” ICIP, October 2014 (Top 10% paper award) 204

Light Field Imaging •

I. Tošić, K. Berkner, “Light field scale-depth space transform for dense depth estimation,” CVPR workshops, June 2014



K. Masuda, Y. Yamanaka, G. Maruyama, S. Nagai, L. Meng, I. Tosic, “Single-snapshot 2D color measurement by plenoptic imaging systems”, SPIE Photonics West, OPTO, February 2014



Dansereau et al., “Decoding, calibration and rectification for lenselet-based plenoptic cameras”, CVPR 2013



Y. Lin, I. Tosic, K. Berkner, “Occlusion-aware layered scene recovery from light fields,” Proceedings of IEEE ICIP 2013



L. Meng, K. Berkner, “Optimization of filter layout for spectrally coded plenoptic camera,” OSA Applied Imaging congress, June 2013



K. Berkner, L. Meng, S. A. Shroff, I. Tosic, “Understanding the design space of a plenoptic camera through an end-to-end system model,” OSA Applied Imaging congress, June 2013 (invited talk)



S. A. Shroff, and K. Berkner. "Image formation analysis and high resolution image reconstruction for plenoptic imaging systems." Applied optics 52.10 (2013): D22-D31



S. A. Shroff, K. Berkner, “Plenoptic system response and image formation, “ OSA Applied Imaging congress, June 2013 (invited talk)

205

Light Field Imaging •

I. Tosic, S. A. Shroff, K. Berkner,”Dictionary learning for incoherent sampling with application to plenoptic imaging,“ Proceedings of IEEE ICASSP 2013



A. Gelman, J. Berent, P. L. Dragotti, “Layer-based sparse representation of multiview images”, EURASIP Journal on Advances in Signal Processing, 2012



S. Wanner, B. Goldluecke, “Globally consistent depth labeling of 4D light fields”, Computer Vision and Pattern Recognition (CVPR), 2012



L. Meng, K. Berkner, “System model and performance evaluation of spectrally coded plenoptic camera,” to be presented at OSA-Imaging and Applied Optics Congress, June 2012



S. A. Shroff, K. Berkner, “High resolution image reconstruction for plenoptic imaging systems using system response,” OSA-Imaging and Applied Optics Congress, June 2012



K. Berkner, S. A. Shroff, “Optimization of plenoptic imaging systems including diffraction effects,” International Conference on Computational Photography, April 2012



R. Horstmeyer, G. Euliss, R. Athale, M. Levoy, “Flexible multimodal camera with light field architecture”, IEEE International Conference on Computational Photography, 2009



R. Ng, M. Levoy, M. Bredif, G. Duval, “Light field photography with a handheld plenoptic camera”, Technical Report CSTR, 2005 206

Light Field Displays •

W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, “Content-adaptive focus configuration for near-eye multi-focal displays”, ICME 2016



W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, “Near-eye display of light fields”, IDW 2015



R. Narain, R. A. Albert, A. Bulbul, G. J. Ward, M. S. Banks, J. F. O. Brien, “Optimal presentation of imagery with focus cues on multi-plane displays”, ACM Trans. on Graphics, Vol. 34. No. 4, August 2015



P. Llull, N. Bedard, W. Wu, I. Tosic, K. Berkner, N. Balram, Design and optimization of a near-eye multi-focal display system for augmented reality, COSI, June 2015



W. Wu, I. Tošić, K. Berkner, N. Balram, Depth-disparity calibration for augmented reality on binocular optical see-through displays, ACM Multimedia Systems Special Session on Augmented Reality (MMSysAR), March 2015



F. C. Huang, K. Chen, G. Wetzstein, “The light field stereoscope: immersive computer graphics via factored near-eye light field displays with focus cues”, SIGGRAPH 2015



G. Wetzstein, “Why people should care about light-field displays”, Information Display 2/15

207

Light Field Displays •

N. Balram, W. Wu, K. Berkner, I. Tosic, “Mobile information gateway – enabling true mobility”, The 14th International Meeting on Information Display (IMID 2014 DIGEST), Aug. 2014. Invited paper



N. Balram, “The next wave of 3D – light field displays”, Guest Editorial, Information Display, Nov/Dec 2014 issue, Vol. 30, Number 6



X. Liu, H. Li, “The progress of light field 3-D displays”, Information Display, Nov/Dec 2014 issue, Vol. 30, Number 6. Invited paper



W. Wu, K. Berkner, I. Tošić, N. Balram, “Personal near-eye light field display”, Information Display, Nov/Dec 2014 issue, Vol. 30, Number 6. Invited paper



W. Wu, N. Balram, I. Tošić, K. Berkner, “System design considerations for personal light field displays for the mobile information gateway”, International Workshop on Display (IDW), Dec. 2014. Invited paper



X. Hu, H. Hua, “Design and assessment of a depth-fused multi-focal plane display prototype”, J. of Display Tech. 2014



H. Hua, B. Javidi, “A 3D integral imaging optical see-through head-mounted display”, Optics Express, 2014

208

Light Field Displays •

F. C. Huang, G. Wetzstein, B. Barsky, R. Rasker, “Eyeglasses-free display: towards correcting visual aberrations with computational light field displays”, SIGGRAPH 2014



G. Wetzstein. D. Lanman, M. Hirsch, R. Raskar, “Tensor displays: compressive light field synthesis using multi-layer displays with directional backlighting”, SIGGRAPH 2012



S. Ravikumar, K. Akeley, M. S. Banks, “Creating effective focus cues in multi-plane 3D displays”, Optics Express, Oct. 2011



Y. Takaki, K. Tanaka, J. Nakamura, “Super multi-view display with a lower resolution flatpanel display”, Opt. Express, 19, 5 (Feb) 2011



B. T. Schowengerdt, E. J. Seibel, “3D volumetric scanned light display with multiple fiber optics light sources”, IDW 2010



Y. Takaki, “High-density directional display for generating natural three-dimensional images”, Proc. IEEE 94, 3, 2006



B. T. Schowengerdt et al., “True Three-Dimensional Displays that Allow Viewers to Dynamically Shift Accommodation, Bringing Objects Displayed at Different Viewing Distances Into and Out of Focus”, Cyberpsychology & Behavior, 2004

209

210
