Display Week 2016, Sunday Short Course – 22nd May 2016
Fundamentals of Light Field Imaging and Display Systems
Dr. Nikhil Balram Ricoh Innovations Corporation (RIC)
Copyright for all material used here belongs to the original sources or the author. All rights reserved.
Acknowledgements Materials* and/or insights provided by: • Dr. Ivana Tošić (RIC) • Dr. Gordon Wetzstein (Stanford University) • Prof. Marty Banks (UC Berkeley) • Dr. Noah Bedard (RIC) • Dr. Kurt Akeley (Lytro) • Prof. Xu Liu (Zhejiang University) • Dr. Jim Larimer (formerly NASA Ames) • Dr. Wanmin Wu (formerly RIC) • Dr. Kathrin Berkner (formerly RIC) *Copyrights of material provided belong to original owners
1
Overview
• Section 1: Fundamentals of Human Visual System
• Section 2: Introduction to Light Fields
• Section 3: Light Field Imaging
• Section 4: Light Field Displays
• Section 5: Summary
• Section 6: References
2
Section 1: Fundamentals of Human Visual System (HVS)
3
Human Visual System • “The human visual system detects and interprets information from visible light to build a representation of the surrounding environment.”* • The visual pathway begins at the eyes and ends at the visual cortex
* https://en.wikipedia.org/wiki/Visual_system
4
HVS Front-end
Retina has 4 types of photoreceptors
• Rods
– Achromatic
– Concentrated in the periphery
– Used for scotopic vision (low light levels)
• Cones
– Three broadband receptors – S (Short), M (Medium), L (Long) – with blue, green and red peaks respectively
– Concentrated in fovea (centre of retina)
– Used for photopic vision (daylight levels)
[Figure: Smith & Pokorny cone fundamentals – log sensitivity of the L, M and S cones vs. wavelength, 400–700 nm]
5
Opponent Channels • Cones are organized to produce 3 opponent channels – White/Black (achromatic) – Red/Green – Yellow/Blue
• Opponent channels differ in spatial resolution
– White/Black has highest resolution because it uses only one pair of receptors
– Yellow/Blue has lowest spatial resolution because it uses S cones, which are sparse
[Diagram: receptor mosaic feeding the achromatic (W/B) and the R/G and Y/B chromatic mechanisms]
Diagram for illustration only – real receptor spacing and density are highly irregular
6
Contrast Sensitivity Function (Spatial) • Contrast Sensitivity Function (CSF) – Minimum modulation required to discriminate a sine wave grating at various spatial frequencies – “Spatial MTF of the HVS” – Envelope of narrowband tuned filters
Kelly (1974)
7
Contrast Sensitivity Function (Spatio-Temporal) • Spatio-Temporal Contrast Sensitivity Function (CSF) – Minimum modulation required to discriminate a sine wave grating at various spatial and temporal frequencies – Diamond shaped 2D response shows tradeoff between spatial and temporal resolution
Kelly (1966, 1979) 8
Resolution Limit of HVS – Visual Acuity (Snellen) • Snellen acuity refers to the ability to discriminate features – Based on density of cones and optics of the eye – Normal (“20/20”) vision refers to ability to distinguish a feature subtending 1 arc minute, which corresponds to 30 cycles/degree – Minimum contrast required for detection is plotted as the contrast sensitivity function (CSF)
http://webvision.med.utah.edu/book/part-viii-gabac-receptors/visual-acuity/ http://en.wikipedia.org/wiki/Visual_acuity
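The 30 cycles/degree figure follows directly from the 1-arc-minute feature size; as a quick check:
$\text{one cycle} = \text{one light bar} + \text{one dark bar} = 2 \times 1' = 2', \qquad \frac{60'\ \text{per degree}}{2'\ \text{per cycle}} = 30\ \text{cycles/degree}$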
9
L/M ratio variation in eyes with normal color vision
[Figure: pseudocolor cone mosaics for individual subjects with normal color vision (HS, YY, AN, AP nasal/temporal, MD, JP, JC, RS, JW nasal/temporal, BS), showing large variation in L/M ratio; scale bar 5 arcmin]
* Roorda & Williams (Nature 1999) Hofer et al., (J. Neurosci. 2005)
Modeling HVS Processing • For many problems in visual processing, the optimal solution is based on minimizing a cost function – The cost function comprises a Data term and a Model term
Cost Function (F) = A∫ϕ1 (Data) + B∫ϕ2 (Model) combined using various statistics
• This Universal Optimization Function* may also represent the way human visual processing works – The image reconstructed by human vision is a weighted combination of the image captured by the retina and the “prior model”
• Size illusion demo (next slides) shows the effect of the prior model *“A Poet’s Guide to Video: The Art and Science of the Movie Experience”, J. Larimer, N. Balram, S. Poster, draft book manuscript (never completed)
11
Which One Looks Bigger?
12
Which One Looks Bigger?
13
Which One Looks Bigger?
14
Taxonomy of Depth Cues* • The HVS uses a number of cues to “see” depth – to interpret and understand the world around us – Stereoscopy or binocular disparity is only one type of depth cue – There are many monocular depth cues that are used in normal viewing and that have been exploited in 2D movies for the last hundred or more years
• Monocular
– Geometry: perspective, occlusion, motion parallax, texture gradient, size
– Color: lighting, shading, aerial perspective
– Focus: accommodation, retinal blur
• Binocular
– Convergence, retinal disparity
*From Banks Schor Lab seminar, K. Akeley, June 2010
15
Example – Missing or Incorrect Blur Cues* • Retinal blurring is an important depth and size cue – in real-world viewing, distant objects produce blurred images while close ones are sharp • When blur cues are incorrect, perceived depth and size are affected, causing effects like miniaturization • Example of the same image with different blur – Compare how you perceive the buildings that stay in focus in the first image versus the second image
• Remember the UOF model – what happens when the data is suspect? F = A∫ϕ1 (Data) + B∫ϕ2 (Model)
*R. T. Held, E. Cooper, J. F. O’Brien, M. S. Banks, Using blur to affect perceived distance and size, ACM Trans. Graph. 29, 2, March 2010
16
[Slides 17–18: the same image of buildings rendered with two different blur gradients]
HVS: Key Points to Remember • “The human visual system detects and interprets information from visible light to build a representation of the surrounding environment.”* • The visual pathway begins at the eyes and ends at the visual cortex • What we “see” is not the raw image on the retina but our interpretation of it • The interpretation depends on a set of sensory information (“cues”) that we extract from the data and on the rules that our system has developed during the course of our evolution (“prior model”) • Confusion (optical illusions) can arise when the data is considered suspect and is overruled by the prior model • Cue conflicts can cause physical ill-effects like nausea and fatigue * https://en.wikipedia.org/wiki/Visual_system
19
Section 2: Introduction to Light Fields
20
Definition of Light Fields*
• Originally defined by Gershun in 1936 as the amount of light traveling in every direction through every point in space
• 7D plenoptic function defined as the flow of light through 3D space (Adelson & Bergen, 1991)
• 4D light field defined as “The radiance as a function of position and direction, in regions of space free of occluders (free space).” (Levoy & Hanrahan, 1996)*
Figure 1: Liu et al., Information Display 6/14
*M. Levoy, P. Hanrahan, Light field rendering, SIGGRAPH 1996
21
What is the Plenoptic Function? • [Adelson, Bergen, 1991]: Plenoptic function tells us the intensity of light seen from any viewpoint, at any time instant, for any wavelength of the visible spectrum
22
Parametrization
• In spherical coordinates (spherical camera): $P(\theta, \phi, \lambda, t, V_x, V_y, V_z)$, where $(\theta, \phi)$ are spherical pixel coordinates, $\lambda$ is wavelength, $t$ is time, and $(V_x, V_y, V_z)$ are the camera center 3D coordinates
• In Cartesian coordinates (planar camera): $P(x, y, \lambda, t, V_x, V_y, V_z)$, where $(x, y)$ are planar pixel coordinates, $\lambda$ is wavelength, $t$ is time, and $(V_x, V_y, V_z)$ are the camera center 3D coordinates
23
Structure of the Plenoptic Function (1/2)*
[Figure: the plenoptic function visualized two dimensions at a time, with the remaining dimensions held at default values]
*E. H. Adelson, J. R. Bergen, The plenoptic function and the elements of early vision, Computational Models of Visual Processing, MIT Press, 1991
24
Structure of the Plenoptic Function (2/2)* • Slices through different dimensions*
x-y slice (2D image)
x-t slice (1D image scanline across time)
x-λ slice (1D image scanline across wavelengths)
x-Vx slice (1D image scanline across horizontal views)
x-Vy slice (1D horizontal image scanline across vertical views)
x-Vz slice (1D horizontal image scanline across views in depth z)
• The plenoptic function is too complex to handle in its full dimensions, but it is highly structured and that structure can be exploited to extract information that is needed for specific purposes *E. H. Adelson, J. R. Bergen, The plenoptic function and the elements of early vision, Computational Models of Visual Processing, MIT Press, 1991
25
Light Field • Fix wavelength and time, and look at rays passing through two parallel planes – Light field as 4D parametrization of the plenoptic function* – Easier to handle, and convenient for rendering of new views
*M. Levoy, P. Hanrahan, Light field rendering, SIGGRAPH 1996
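As a concrete illustration of the two-plane parameterization, the sketch below assumes a light field already loaded as a 4D NumPy array indexed as L[u, v, s, t] (views (u, v), positions (s, t)); the array name, shape and shift parameter are hypothetical, and the refocusing is the simplest shift-and-add variant rather than any particular product's algorithm.

import numpy as np

def subaperture_view(L, u, v):
    """Return the 2D image seen from angular position (u, v) of a 4D light field L[u, v, s, t]."""
    return L[u, v, :, :]

def refocus(L, shift):
    """Shift-and-add refocusing: shift each view in proportion to its angular
    offset from the central view, then average. 'shift' (pixels per view step)
    selects which depth plane is brought into focus."""
    U, V, S, T = L.shape
    uc, vc = U // 2, V // 2
    out = np.zeros((S, T), dtype=np.float64)
    for u in range(U):
        for v in range(V):
            du, dv = u - uc, v - vc
            out += np.roll(L[u, v], (int(round(shift * du)), int(round(shift * dv))), axis=(0, 1))
    return out / (U * V)

# Example usage with a synthetic light field (5x5 views of 64x64 pixels)
L = np.random.rand(5, 5, 64, 64)
center = subaperture_view(L, 2, 2)
refocused = refocus(L, shift=1.0)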
26
Conventional Images as Light Field Slices Capture • The sensor integrates incoming light rays from various directions • The sensor-plane slice is positioned perpendicular to the optical axis • Projection of light rays onto the sensor plane
Display • Emission of light rays in various directions to achieve a certain appearance on a specific focal plane • Human eyes focus on a specific plane • The focal plane is imaged by the pupil onto the retina
27
How to Acquire the Light Field? Camera arrays & moving cameras
Stanford camera array
Hand-held plenoptic cameras
Stanford lego gantry
Multi-aperture cameras
Lytro Immerge
Jaunt ONE
28
Camera Arrays and Moving Rigs
• Camera arrays
– Pros:
• High spatial resolution
• Good image quality
• Can be wide-baseline
• Enable depth estimation for far objects
– Cons:
• Bulkier than a handheld camera
• View resolution might be limited due to physical spacing
• Harder to calibrate and synchronize
• Moving camera rig:
– Same as a camera array but limited to static scenes
[Images: Stanford camera array, Lytro Immerge, Jaunt ONE]
29
Light Fields From Camera Arrays • Stanford light field archive – http://lightfield.stanford.edu/lfs.html
• Heidelberg collaboratory for image processing (HCI) database – –
http://hci.iwr.uni-heidelberg.de/HCI/Research/LightField/lf_benchmark.php Datasets and Benchmarks for Densely Sampled 4D Light Fields, Wanner et al. VMV 2013
30
Hand-held Plenoptic Cameras • Hand-held plenoptic (light field) camera – Ng et al. 2005; Lytro – Perwass et al. 2012; Raytrix – Horstmeyer et al. 2009 (multimodal camera)
• Pros:
– Small form factor
– Very dense views
– Calibration done once (fixed setup)
– Can trade off views for different wavelengths
• Cons:
– Reduced spatial resolution
– Small baseline (3D estimation limited to near objects)
31
Light Fields from Plenoptic Cameras • Dataset of Lytro images of objects (EPFL) – http://lcav.epfl.ch/page-104332-en.html – Ghasemi et al. LCAV-31: A Dataset for Light Field Object Recognition, SPIE 2014 Available at https://github.com/aghasemi/lcav31 - Database of Lytro light fields of different objects (for recognition) - Light fields already extracted from raw plenoptic images
• Raytrix camera light fields from HCI (obtained from Raytrix scenes)
32
Multi-Aperture Cameras • Multi-aperture cameras – One sensor, multiple lenses mounted on the sensor – Pros: • Very small form factor (can be a cell phone camera) • Spatial resolution usually better than light field camera
– Cons: • Small number of views • Small baseline
– Pelican Imaging camera • PiCam: an ultra-thin high performance monolithic camera array, SIGGRAPH 2013 33
Light Field Pipeline – From Capture to Display*
*K. Akeley, Light-field imaging approaches commercial viability, Information Display 6/15, Nov./Dec. 2015
34
Light Fields: Key Points to Remember • Plenoptic function is a 7D function describing light flowing through space • This can be reduced to various useful subsets • Light field is a 4D function describing radiance as a function of position and direction – Simple representation using two parallel planes with 2D views (u,v) and 2D positions (s, t)
• Light fields can be captured using an array of cameras or a small-form factor camera with micro-lenses or multiple apertures – Each form of capture has tradeoffs and the best choice depends on the objectives
• Light fields can be displayed using an array of display engines or a display with special optical layers 35
Section 3: Light Field Imaging
36
Light Field Imaging
• Array of cameras vs. a compact single camera
[Images: Stanford Array, Jaunt ONE, Lytro Immerge]
• Enables: generation of depth maps, digital refocusing, multiple views, volumetric rendering, multispectral imaging, etc.
37
Light Field Imaging*
*K. Akeley, Light-field imaging approaches commercial viability, Information Display 6/15, Nov./Dec. 2015
38
Light Field Imaging For Capturing 3D Scene Information
[Diagram: main lens → micro lens array (MLA) → detector → processing]
39
Light Field Imaging For Capturing 3D Scene Information • Main lens focuses image onto the Micro Lens Array (MLA) and each micro lens separates different views onto the sensor elements behind it • Alternate plenoptic systems focus the image before the MLA and enable higher resolution at the expense of greater complexity and other limitations
[Figure: raw plenoptic image (sensor data) is decoded into a light field – multiple views from different angles, arranged as views × pixels]
40
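A minimal sketch of the view-extraction idea described above, assuming an idealized plenoptic camera whose lenslets sit on a square grid exactly aligned with the sensor; real cameras need the calibration described later (rotated, hexagonal grids, vignetting, interpolation). The array names and the lenslet pitch are hypothetical.

import numpy as np

def raw_to_views(raw, pitch):
    """Rearrange a raw plenoptic image into a stack of views.
    raw   : 2D sensor image whose dimensions are multiples of 'pitch'
    pitch : number of pixels behind each (square, axis-aligned) lenslet
    Returns views[u, v, s, t]: pixel (u, v) under every lenslet (s, t) is
    collected into view (u, v)."""
    H, W = raw.shape
    S, T = H // pitch, W // pitch
    # reshape into (lenslet row, pixel row, lenslet col, pixel col), then reorder
    lf = raw[:S * pitch, :T * pitch].reshape(S, pitch, T, pitch)
    return lf.transpose(1, 3, 0, 2)   # -> [u, v, s, t]

# Example: a synthetic 480x640 raw image with 8x8 pixels under each lenslet
raw = np.random.rand(480, 640)
views = raw_to_views(raw, pitch=8)   # shape (8, 8, 60, 80)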
Light Field Imaging For Capturing Multi-Spectral Information
[Diagram: main lens with X, Y, Z spectral filters in the aperture → micro lens array (MLA) → monochrome detector]
41
Light Field Imaging – f/#
*K. Akeley, Light-field imaging approaches commercial viability, Information Display 6/15, Nov./Dec. 2015
42
Light Field Imaging - System Model • System model for task-specific designs
[Diagram: Scene → Optics → Sensor → Digital processing → Performance metric]
• Joint end-to-end system design is necessary for reaching optimal performance!
• Light Field Imaging Core Technology: image processing, calibration, optics design, optimization, MLA manufacturing
43
Light Field Imaging: Models for Image Formation
44
Models of Plenoptic Image Formation • Image formation for plenoptic cameras – Different from traditional cameras – Micro-lens array in front of the sensor changes the image formation
• Image formation models – Geometric models • Ray-based modeling of light
– Diffraction models • Wave-based modeling of light
45
Single Lens Stereo • Adelson and Wang, 1992* – Thin lens model for the main lens – Pinhole model for microlenses
[Diagram: an object point imaged by the main lens onto the sensor plane; labels – focal length of the lens, displacement of the aperture, displacement of the object's image on the sensor plane, plane conjugate to the sensor plane]
*E. H. Adelson, J. Y. A. Wang, Single lens stereo with a plenoptic camera, IEEE Trans. PAMI, Feb. 1992
47
How is Depth Information Captured? • Let us look at the rays falling on each pixel, through multiple pinholes
[Figure: ray diagrams and resulting sensor patterns for an object in front of focus, behind focus, and in focus]
Angle of these linear structures encodes the depth! *E. H. Adelson, J. Y. A. Wang, Single lens stereo with a plenoptic camera, IEEE Trans. PAMI, Feb. 1992
48
Depth Estimation Using similar triangles:
And the lens equation:
we get: and:
*E. H. Adelson, J. Y. A. Wang, Single lens stereo with a plenoptic camera, IEEE Trans. PAMI, Feb. 1992
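The equations on this slide did not survive extraction; the following is a sketch of the derivation with symbols chosen here rather than taken from the slide. Let $f$ be the focal length of the lens, $D$ the lens-to-sensor distance, $z_o$ the object distance, $z_i$ the distance of the object's image behind the lens, $\Delta$ the displacement of the aperture, and $\delta$ the displacement of the object's image on the sensor plane.
All rays from the object point converge at $z_i$, so by similar triangles the ray through the displaced aperture lands on the sensor at
$\frac{\delta}{\Delta} = 1 - \frac{D}{z_i}$
and combining with the thin-lens equation $\frac{1}{z_o} + \frac{1}{z_i} = \frac{1}{f}$ gives
$\frac{1}{z_o} = \frac{1}{f} - \frac{1 - \delta/\Delta}{D}$
so the parallax $\delta/\Delta$ measured between sub-aperture views determines the depth $z_o$ (objects in the plane conjugate to the sensor give $\delta = 0$).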
49
Light Field Imaging: Core Technologies
50
Core Technology 1: Calibration
[Diagram: Raw Data → Calibration → Pupil Images / Multiviews – calibration and multiview extraction]
51
Calibration of Plenoptic Cameras • Calibration is an important part of plenoptic image processing • Some issues that make calibration challenging:
– Rotation of the micro-lens array (MLA)
– Distortions (main lens and MLA)
– Vignetting
– Hexagonal lattice for packing of microlenses
• Typical calibration process 1. Precise localization of MLA centroids (using a white image) 2. Unpacking pixels from under lenslets to multiple views 3. Interpolation if unpacked pixels are on a non-uniform / nonrectangular lattice 52
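A toy sketch of step 1 above (locating MLA centroids from a white image) under simplifying assumptions: it just takes local maxima of a smoothed white image, whereas production pipelines such as the toolbox mentioned on the next slide fit a full grid model. Function and parameter names are hypothetical.

import numpy as np
from scipy import ndimage

def lenslet_centroids(white_image, approx_pitch):
    """Estimate lenslet centers from a white (diffuser) image.
    Each lenslet produces a bright disk; smooth the image and keep pixels
    that are the maximum within a neighborhood of roughly one lenslet."""
    smoothed = ndimage.gaussian_filter(white_image, sigma=approx_pitch / 4.0)
    local_max = ndimage.maximum_filter(smoothed, size=int(approx_pitch)) == smoothed
    # suppress spurious maxima in dark (vignetted) regions
    bright = smoothed > 0.5 * smoothed.max()
    ys, xs = np.nonzero(local_max & bright)
    return np.stack([ys, xs], axis=1)   # (N, 2) array of row/col centroids

white = np.random.rand(480, 640)        # stand-in for a captured white image
centers = lenslet_centroids(white, approx_pitch=8)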
Calibration of Plenoptic Cameras • Dansereau et al. “Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras”, CVPR 2013 – A method for decoding raw camera images into 4D light fields – A method for calibrating images from a Lytro camera • 15-parameter plenoptic camera model • 4D intrinsic matrix based on a projective pinhole and thin-lens model • A radial direction-dependent distortion model
– Matlab toolbox publicly available (Light Field Toolbox v0.2) – Does not deal with demosaicing (uses linear demosaicing, not optimal)
53
Decoding Raw Camera Images into 4D Light Fields* • Extracting pixels and re-arranging into a light field 1. Capture a white image through a white diffuser 2. Locate lenslet image centers 3. Estimate grid 4. Align grid 5. Slice into 4D
*Dansereau et al. Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras, CVPR 2013
54
Calibration Model* • Ray propagation model • Assumptions: – Thin lens model of the main lens – Pinhole array for the MLA
[Equation: rectified ray (in homogeneous coordinates) = intrinsic matrix for the whole system × input ray (in homogeneous coordinates)]
*Dansereau et al. Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras, CVPR 2013
55
Calibration Matrices for 2D Case* (1/2)
• Conversion from relative to absolute coordinates (parameters: number of pixels per lenslet, translational pixel offset)
• Conversion from absolute coordinates to rays (parameters: spatial frequencies in samples, and offsets in samples, for pixels and lenslets)
• Express rays in position and direction (parameter: distance between the microlens array and the sensor)
*Dansereau et al. Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras, CVPR 2013
56
Calibration Matrices for 2D Case* (2/2)
• Propagate to the main lens (parameter: distance between the main lens and the microlens array)
• Refraction through the main lens (parameter: focal length of the main lens)
• Express back in ray coordinates (parameter: distance between the object and the main lens)
*Dansereau et al. Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras, CVPR 2013
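The matrices themselves did not survive extraction. As a point of reference (standard first-order ray optics, not necessarily the exact form used by Dansereau et al.), the two recurring building blocks – free-space propagation over a distance $d$ and refraction through a thin lens of focal length $f$ – act on a ray with position $x$ and direction $\theta$ as
$\begin{bmatrix} x' \\ \theta' \end{bmatrix} = \begin{bmatrix} 1 & d \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ \theta \end{bmatrix} \ \text{(propagation)}, \qquad \begin{bmatrix} x' \\ \theta' \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix}\begin{bmatrix} x \\ \theta \end{bmatrix} \ \text{(thin lens)}$
and the overall 2D calibration matrix is the product of such stages together with the pixel/lenslet sampling conversions listed above.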
57
Overall Calibration Matrix for 4D Case* • Multiplying all matrices we get: 12 non-zero terms
• Correction of projection through the lenslets – Transformation from the real to a virtual light field camera due to resizing, rotating, interpolating, and centering of lenslet images - superscripts indicate that a measure applies to the physical sensor (S), or to the virtual “aligned” camera (A)
*Dansereau et al. Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras, CVPR 2013
58
Calibration Optimization
• Radial distortion model (parameters: distortion parameters; distorted (d) and undistorted (u) 2D ray directions)
• Minimization of an objective over checkerboard pattern images – use checkerboard images to find the intrinsic matrix H, camera poses T, and distortion parameters d
– The objective sums, over calibration pattern points, images, and light field views in two directions, the distance between reprojected rays and feature point locations
*Dansereau et al. Decoding, Calibration and Rectification for Lenselet-Based Plenoptic Cameras, CVPR 2013
59
Core Technology 2: 3D Estimation
[Pipeline: raw plenoptic image* → calibration → multi-view images (light field) → preprocess + slicing → light field slices → scale-depth transform for light fields + occlusion detection + dense depth estimation → depth map; in the slices, angles give depth]
* From a plenoptic camera without spectral filters
60
How to Recover a 3D Scene From LF Data? • Given the light field data, reconstruct the objects within different depth layers
[Diagram: multi-view system capturing a light field of objects in layers 1, 2 and 3]
• Challenge: occlusions!
61
Premise of Most Geometric Approaches • Exploit the line structure of the light field
[Figure: Light Field (LF) slice (EPI plane [Bolles et al.]) showing light reflected from the closer object and from the farther object]
• Objects at different depths produce lines with different angles!
• Ray bundles from the front object occlude ray bundles from the background object
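To make the slope–depth relation concrete (a standard pinhole-geometry result stated here for reference, with symbols chosen here): for views captured by a camera of focal length $f$ translated by a baseline $b$ between adjacent positions, a point at depth $Z$ shifts on the image plane by the disparity
$\Delta x = \frac{f\,b}{Z}$
per view, so in the EPI the point traces a line whose slope with respect to the view coordinate is proportional to $1/Z$ – the line angle therefore encodes depth directly.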
62
Scene Recovery Using Geometric Models • Layered-based approaches – LF segmentation and sparse representation [Gelman et al., 2012] – Sparse generative layered model [Lin et al. 2013]
• Dense depth estimation – Structure tensor approach [Wanner et al., 2012] – Scale-depth space approach [Tošić et al., 2014]
63
Scene Recovery Using Geometric Models • Layered-based approaches – Joint segmentation of multiple views [Gelman et al., 2012] – Sparse generative layered model [Lin et al. 2013]
• Dense depth estimation – Structure tensor approach [Wanner et al., 2012] – Scale-depth space approach [Tošić et al., 2014]
64
Light Field (LF) Model* • A generative non-linear model that can model occlusions by masking
– The scene is decomposed into layers (layer 1, layer 2, layer 3, …), each with an associated mask (mask for layer 1, mask for layer 2, …)
– Each layer is a linear combination of ray-like functions r with a given angle, with a vector of coefficients over an overcomplete dictionary
[Figure: example LF slice over u (view coordinate) and x (spatial coordinate), showing an occlusion and an example of the overcomplete dictionary]
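A schematic way to write the layered model (notation chosen here, not necessarily that of the paper): with masks $M_k$ selecting the visible pixels of layer $k$ and ray-like dictionary atoms $r_{\theta}$ at angles $\theta$,
$L(x,u) \;\approx\; \sum_{k} M_k(x,u) \sum_{i} c_{k,i}\, r_{\theta_i}(x,u)$
so each layer is a linear combination of ray-like functions with coefficient vector $c_k$, and the masking makes the overall model non-linear and able to represent occlusions.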
*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013
65
Algorithm for 3D Layer Reconstruction* • To reconstruct 3D layers, we need to estimate the following unknowns: – Coefficients for each LF slice – Mask for each LF slice
• Method: an iterative algorithm:
– Initialize mask and coefficients
– Step 1: Solve for coefficients
– Step 2: Refine the mask and go to Step 1
*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013
66
Algorithm Description Through an Example* • Example light field: – Humvee dataset (Stanford camera array), 16 views (horizontal parallax)
view 2
view 8
– Ray-like functions approximated by dictionary of Ridgelets *Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013
67
Algorithm: Step 1* • Step 1: Sparse Recovery
– Relax the problem into a linear one (example: two layers); an error vector subsumes the occlusion error, and the ray-like functions form an overcomplete dictionary
– Solve the convex problem: Layered Sparse Reconstruction Algorithm (LSRA)
• One penalty term enforces sparsity over the different angles; another enforces sparsity of the occluded pixels
*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013
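In schematic form (symbols chosen here), the relaxation replaces the masked model by a linear one with an error term e absorbing the occluded pixels, and LSRA solves a convex problem of the type
$\min_{c,\,e}\ \tfrac{1}{2}\,\lVert y - D c - e \rVert_2^2 \;+\; \lambda_1 \lVert c \rVert_1 \;+\; \lambda_2 \lVert e \rVert_1$
where $y$ is the vectorized LF slice, $D$ the overcomplete dictionary of ray-like atoms, the first $\ell_1$ term enforces sparsity over the different angles and the second enforces sparsity of the occluded pixels.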
68
Result After Step 1* • After step 1 we:
– Group Ridgelets into layers (by clustering with respect to the angle)
– Reconstruct each layer using only Ridgelets in that cluster
[Figure: view 1; layer 1 in view 1; layer 2 in view 1]
*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013
69
Algorithm: Step 2* • Step 2: Refine the Mask
– Use the result from the previous step to define the mask
– Use image segmentation (e.g., active contour) in the spatial domain to refine the mask
– Solve LSRA again with an updated model (includes mask)
[Figure (after 500 iterations): (a) one view of the image, (b) result from the previous step, (c) active contour]
*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013
70
Final Result* • After two iterations we get the final result:
layer 1 (after two iterations)
layer 2 (after two iterations)
layer 2 (after Step 1 of the first iteration)
*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013
71
Results on the Stanford Dataset*
[Figures: Humvee dataset – views 1 and 2, recovered layers 1 and 2; Chess dataset – views 1 and 2, recovered layers 1–4]
*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013
72
Results on Lytro Data* • Use Lytro data, 6 different views – We retrieve the occluded part of the second layer
[Figures: views 1 and 2; recovered layers 1 and 2]
*Y. Lin, I. Tošić, K. Berkner, Occlusion-aware layered scene recovery from light fields, ICIP 2013
73
Scene Recovery Using Geometric Models • Layered-based approaches – Joint segmentation of multiple views [Gelman et al., 2012] – Sparse generative layered model [Lin et al. 2013]
• Dense depth estimation – Structure tensor approach [Wanner et al., 2012] – Scale-depth space approach [Tošić et al., 2014]
74
3D Information Within Light Fields • Plenoptic cameras acquire multi-view images
[Figure: raw plenoptic image → light field (LF) over views u and spatial coordinates (x, y) → light field slice (EPI*) over views u and pixels in the horizontal direction x]
– The angle of structures in the slice is associated with the depth (via a mapping based on camera parameters)
– Dense depth map estimation = find an angle for each point on x
*[Bolles et al., IJCV ’87]
75
Analysis of Light Field Slices (EPIs) • EPI structure
– Line edges (“ray edges”, discontinuities) with a certain angle
– Uniform regions (“rays”) between line edges
[Figure: EPI over views u (horizontal parallax) and pixels in the horizontal direction x]
• How to: – Detect both structures? – Get the depth (angle) information at the same time?
• Useful approach: scale space analysis for light fields* * I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014
76
Scale-Spaces: Background • Used for multi-scale image analysis since the early 80s – Gaussian scale spaces are well known
– The scale-space is obtained by convolving the image with Gaussians of increasing scale
– Scale invariance: computing the scale-space of a downsampled image is equivalent to sampling the scale-space of the original image at proportionally rescaled scales
77
Derivatives of Gaussian Scale-Space* • First and Second derivatives of Gaussian scale spaces – Used for low-level image processing (edge and blob detection)
• First derivative → normalized first derivative → edge detection
• Second derivative → normalized second derivative → blob detection
*T. Lindeberg, Scale-space theory: A basic tool for analysing structures at different scales, ‘94
78
Scale Space Construction for Light Fields: Kernel* • We first need a kernel for constructing scale spaces “Ray Gaussian” (RG) filters:
• Parameters of the RG: scale (width) and angle
• Angle: – Can be uniquely mapped to depth * I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014
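The kernel definition did not survive extraction; one plausible form consistent with the description – a Gaussian profile across a ray of angle $\phi$, constant along it – is given below. Treat this as an assumption for illustration, not the paper's exact expression:
$R_{\sigma,\phi}(x,u) \;=\; \frac{1}{\sqrt{2\pi}\,\sigma}\, \exp\!\left(-\frac{(x + u\tan\phi)^2}{2\sigma^2}\right)$
with scale (width) $\sigma$ and angle $\phi$, the angle mapping uniquely to depth.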
79
Scale Space Construction for Light Fields* • Multi-scale, multi-depth representation of LF slices: Light field scale and depth (Lisad) space
– Built by convolving the LF slice with Ray Gaussians (convolution over x only)
– We have a representation in scale AND angle
[Figure: Lisad space axes – angle/depth, scale, pixel position]
* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014
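A toy sketch of the construction, assuming the sheared-Gaussian kernel form suggested two slides back; all names, parameter values and the σ²-normalization choice here are illustrative, not taken from the paper. Each EPI is correlated, shifting over x only, with second-derivative Ray Gaussian kernels for a grid of scales and angles, giving a (scale, angle, x) response volume.

import numpy as np

def ray_gaussian_2nd(sigma, phi, n_views, half_width):
    """Second derivative (in x) of a sheared Gaussian: Gaussian across the
    ray x + u*tan(phi) = const, constant along it."""
    u = np.arange(n_views) - (n_views - 1) / 2.0
    x = np.arange(-half_width, half_width + 1)
    X = x[None, :] + np.tan(phi) * u[:, None]            # sheared coordinate
    g = np.exp(-X**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return (X**2 / sigma**4 - 1.0 / sigma**2) * g         # d^2/dx^2 of the Gaussian

def lisad_response(epi, kernel):
    """Convolution over x only: correlate each view row of the EPI with the
    corresponding kernel row and sum over views (kernel rows assumed shorter
    than the EPI rows)."""
    out = np.zeros(epi.shape[1])
    for u in range(epi.shape[0]):
        out += np.correlate(epi[u], kernel[u], mode='same')
    return out

def lisad_space(epi, sigmas, phis):
    """epi: 2D array (n_views, n_pixels). Returns responses[scale, angle, x]."""
    n_views, n_pixels = epi.shape
    out = np.zeros((len(sigmas), len(phis), n_pixels))
    for i, s in enumerate(sigmas):
        for j, p in enumerate(phis):
            half = int(4 * s + abs(np.tan(p)) * n_views / 2 + 3)
            k = ray_gaussian_2nd(s, p, n_views, half_width=half)
            out[i, j] = (s ** 2) * lisad_response(epi, k)  # sigma^2-normalized response
    return out

epi = np.random.rand(9, 128)   # stand-in EPI: 9 views, 128 pixels
responses = lisad_space(epi, sigmas=[1.0, 2.0, 4.0], phis=np.deg2rad([-30, -15, 0, 15, 30]))
# Depth and width estimates follow by locating extrema of 'responses' over (scale, angle) at each x.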
80
Scale Invariance of Lisad Spaces* • Relation between the scale of Ray Gaussian and the scale of the light field signal
* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014
81
Depth Invariance of the Inner Product with RG* • The value of the inner product of a light field with a RG does not depend on the angle – Convolution with RGs of different angles does not introduce a bias towards some depths – Valid in the case of no occlusions
* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014
82
Ray Edge Detection Using Lisad Spaces* • Extrema in the Lisad space of the first derivative of the Ray Gaussian give us ray edges
[Figure: extrema located in the Lisad space (angle/depth, scale, pixel position) and mapped back to the EPI over views u (horizontal parallax) and pixels x]
* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014
83
Ray Detection Using LF Scale Space* • Extrema in the Lisad space of the second derivative of the Ray Gaussian give us whole rays
[Figure: extrema located in the Lisad space (angle/depth, scale, pixel position) and mapped back to the EPI over views u (horizontal parallax) and pixels x]
* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014
84
Example: Lisad Space for the Normalized Second Derivative of RG*
[Figure: a light field slice (EPI) over (x, u) is convolved with the normalized second derivative of the RG over (scale, angle), producing the Lisad space over (x, scale, angle)]
* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014
Depth Estimation Using Lisad Spaces*
• Problem statement: estimate angles in LF slices for all x, y
– Local estimation (independently for all ray edges) has problems with uniform image regions
• Approach: whole ray detection with Lisad spaces
– Based on scale-spaces
– Operates on whole rays
– Multi-scale approach
– Finds both angles and widths of rays
* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014
87
3D Keypoint Detection* • Extrema detection in the first derivative Lisad space – Each keypoint is assigned an angle that determines depth – hotter colors = closer points
*I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014
88
Dense Depth Estimation Method* • Ray detection + edge detection + occlusion detection + post-processing
[Pipeline: light field → scale-space construction (ray-Gauss second derivative) → find extrema → occlusion detection; in parallel, scale-space construction (ray-Gauss first derivative) → find extrema → occlusion detection; both feed depth assignment → depth map]
• Processing per slice, for both horizontal and vertical views
• Occlusion detection in ray space:
– Possible occlusion: closer object in front of the farther one
– Impossible occlusion: farther object in front of the closer one – remove the ray with larger variance
* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014
89
Evaluation on the HCI Light Field Dataset • Heidelberg collaboratory for image processing (HCI) database – http://hci.iwr.uniheidelberg.de/HCI/Research/LightField/lf_benchmark.php – Datasets and Benchmarks for Densely Sampled 4D Light Fields, Wanner et al. VMV 2013
90
Experimental Results: Synthetic Scenes* • Evaluation on HCI database (Blender data, ground truth available)
[Figures: middle view; depth (Lisad); maps of pixels with depth error >1% – Lisad: 1.2%, Wanner et al.: 3.5%]
* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014
91
Experimental Results: Real Plenoptic Images* • Raytrix®
• Ricoh prototype
[Figures: middle view and disparity (Lisad) for each camera; disparity (Wanner et al. 2014) shown for comparison]
* I. Tosic, K. Berkner, Light field scale-depth space transform for dense depth estimation, CVPR workshops, 2014 I. Tošić, K. Berkner, 3D keypoint detection by light field scale-depth space analysis, ICIP 2014
92
Core Technology 3: Resolution Enhancement* • Spatial resolution is a key challenge • Several ways of improving or enhancing resolution (each with its own tradeoffs), including:
– Use of advanced super-resolution algorithms
– Reducing micro-lens diameter (and sensor pitch) and increasing the number of micro-lenses
– Increasing sensor size and the number of micro-lenses
– Using a different plenoptic architecture – focus the micro-lenses on the image plane of the main lens – “Focused Plenoptic (Plenoptic 2.0)”**
*K. Akeley, Light-field imaging approaches commercial viability, Information Display 6/15, Nov./Dec. 2015 ** T. Georgiev, The focused plenoptic camera, http://www.tgeorgiev.net/EG10/Focused.pdf
93
Core Technology 4: Multi-Spectral Image Processing
[Figure: multispectral views (green, amber); multispectral parallax rectification – before (color fringing) and after]
94
Light Field Imaging: Application Examples
95
Example 1: Color Inspection Camera • Color analyzer camera – launched in Nov. 2014 • Single-sensor, single-snapshot color accuracy measurement for displays • Uses XYZ filters in the aperture • Color accuracy is measured in Delta-E metric in CIELAB color space
https://www.ricoh.com/fa_security/other/cv-10a/
96
System Diagram for Color Accuracy Camera
[Diagram: optimization loop – ColorChecker reflectance and LED illumination spectrum (360–760 nm) → scene → optics → sensor → digital processing (reconstruct xyz) → performance metric (chromaticity error: measurement vs. reference in a*b* coordinates) → camera model]
97
Filter Layout Optimization According to Chromaticity Error
[Figure: non-optimized vs. optimized layout of the X, Y, Z filters; plot of Std(ΔE) for color patches 1–24 showing consistently lower values for the optimized layout]
98
Example 2: Light Field Otoscope (Prototype)* • Ear infection is the most common reason for antibiotic prescription to children in the US • 25M office visits, 20M prescriptions, and $2B in costs • Difficult to differentiate between different conditions (Kuruvilla, et al., IJBI 2013)
[Images: acute otitis media (AOM), otitis media with effusion (OME), no effusion (NOE)]
• 3D shape and color of the ear drum are the most important features for diagnosing AOM** *N. Balram, I. Tosic, H. Binnamangalam, Digital health in the age of the infinite network, Journal APSIPA, 2016 **Shaikh, Hoberman, Rockette, and Kurs-Lasky, Development of an algorithm for the diagnosis of otitis media, Academic Pediatrics, 2012.
99
Clinical Features of Otitis Media*
Goal: Leverage advances in 3D imaging and multispectral imaging to enhance detection of diagnostic features
*Shaikh, Hoberman, Rockette, and Kurs-Lasky, “Development of an algorithm for the diagnosis of otitis media,” Academic Pediatrics, 2012.
100
Light Field Otoscope (Prototype) Design
[Diagram: custom MLA, large field of view, image conjugate, pupil conjugate, bright illumination]
101
3D Ear Drum Reconstruction
[Figures: 2D image, depth map (closer to farther), 3D reconstruction]
• 3D imaging with 0.25 mm depth accuracy!
• In RGB, at 12.5 FPS
102
Light Field Otoscope (Prototype) - Trials • Testing 3D and spectral imaging with Children’s Hospital Pittsburgh* – Established in 1880 – 296-bed children’s general facility – 13,687 admissions in 2013, 5,734 annual inpatient and 19,313 outpatient surgeries – Pioneer in pediatric medicine
*N. Bedard, I. Tošić, L. Meng, A. Hoberman, J. Kovacevic, K. Berkner, “In vivo ear imaging with a light field otoscope”, Bio-Optics: Design and Application, April 2015
103
Prototype Demo
104
Light Field Imaging: Key Points to Remember
• A number of tradeoffs have to be made based on the specific target application
– Possible applications include medical, factory automation/inspection, consumer content creation, etc.
• A robust system methodology exists for design of the end-to-end system based on key performance metrics for the target application
– When designing the system, figure out requirements for spatial resolution, angular resolution (#views), depth resolution and range, temporal resolution, and spectrum
• Can use an array of cameras (sensors) or a single camera (sensor)
– The array approach enables high spatial resolution and a wider baseline (provides depth for distant objects) but is bulkier and more costly
– The single-camera approach enables a compact system and high angular resolution but has limited spatial resolution and a narrow baseline (provides depth for closer objects only)
• Calibration is a critical first step of the processing
• Depth can be estimated using layer-based approaches or dense field (pixel-based) approaches
• Processing based on geometric models is applicable in most cases, but diffraction models are needed for applications involving high magnification
105
Section 4: Light Field Displays
106
Why Light Field Displays?* • To display 3D content in a way that appears natural to the human visual system – Providing natural and consistent stereo, parallax and focus cues – Avoiding cue conflicts like the vergence-accommodation conflict (VAC) posed by current Stereoscopic 3D (S3D) displays
*N. Balram, Is 3-D dead (again)?, Guest Editorial, Information Display 3/13 N. Balram, The next wave of 3-D - light field displays, Guest Editorial, Information Display 6/14
107
Vergence-Accommodation Conflict (VAC) of Stereoscopic 3D (S3D)* • Natural Viewing: eyes converge AND focus at the same distance
[Figure: focal distance (diopters, 0–6) vs. vergence distance (diopters, 0–6), showing the zone of clear single binocular vision and Percival's zone of comfort around the natural-viewing line where focal distance equals vergence distance]
*“Conflicting Focus Cues in Stereoscopic Displays”, M. Banks, et al., Information Display, July 2008
108
Vergence-Accommodation Conflict (VAC) of Stereoscopic 3D (S3D)* • Stereo Display: eyes always focus on the screen BUT converge wherever an object is placed – leading to cue conflict – This can produce severe discomfort
[Figure: focal distance (diopters) vs. vergence distance (diopters); focal distance is pinned to the location of the screen while vergence distance varies, so stimuli can fall outside the zone of clear single binocular vision and Percival's zone of comfort]
*“Conflicting Focus Cues in Stereoscopic Displays”, M. Banks, et al., Information Display, July 2008
109
Vergence-Accommodation Conflict (VAC)* • Experiment of Cues-Consistent vs Cues Inconsistent viewing* – 600 ms stimulus at near or far vergence-specified distance
*D. Hoffman, A. Girshick, K. Akeley, M. S. Banks, Vergence-accommodation conflicts hinder visual performance and cause visual fatigue, Journal of Vision 8(3):33, 2008
110
Vergence-Accommodation Conflict (VAC)* • Results of the experiment on cues-consistent vs. cues-inconsistent viewing*
[Figure: reported severity of symptoms is significantly higher for cues-inconsistent than for cues-consistent viewing; ** = p < 0.01 (Wilcoxon test)]
*D. Hoffman, A. Girshick, K. Akeley, M. S. Banks, Vergence-accommodation conflicts hinder visual performance and cause visual fatigue, Journal of Vision 8(3):33, 2008
111
What is a Light Field Display?* • A display that presents a light field to the viewer and enables natural and comfortable viewing of a 3D scene • Fundamentally divided into two different types: – Group/Multi-user Light Field Displays** – Personal (Near-to-Eye/Head-Mounted) Light Field Displays***
• We will discuss both these types in the rest of this section *N. Balram, The next wave of 3-D - light field displays, Guest Editorial, Information Display 6/14 **X. Liu and H. Li, The progress of light field 3-D displays, Information Display 6/14 ***W. Wu, K. Berkner, I. Tosic, N. Balram, Personal near-to-eye light field displays, Information Display 6/14
112
How to Create a Light Field Display?
• Need to create a natural accommodation response – Create the correct retinal blur corresponding to the 3D location of an object
• Fundamentally there are only two different ways to do this:
A. Create parallax across each eye that produces the correct retinal blur corresponding to the 3D location of the object being viewed – by presenting multiple views (integral imaging approach), or
B. Physically place the object at the appropriate focal plane corresponding to its 3D location – by providing multiple focal planes (multi-focal-plane approach)
• All real light field displays use one of these two ways
– Group/multi-user displays typically use approach A
– Single-user (Near-Eye/Head-Mounted) displays use approach A or B depending on their design point
• Fundamental questions for each approach:
– For A, how many views are needed?
– For B, how many planes are needed?
113
Creating Focal Cues Using Multiple Views • Providing two or more views to each eye to enable focus cues*
*Y. Takaki, K. Tanaka, J. Nakamura, Super multi-view display with a lower resolution flat-panel display, Opt. Express, 19, 5 (Feb) 2011
115
Creating Focal Cues Using Multiple Views • The n pixels in each pixel group are magnified to generate the n viewing zones with each pixel generating one view*
*Y. Takaki, K. Tanaka, J. Nakamura, Super multi-view display with a lower resolution flat-panel display, Opt. Express, 19, 5 (Feb) 2011
116
Standard Single Plane Display*
[Figure: display (focus plane), pupil aperture and retina, parameterized by spatial coordinate x and pupil coordinate u]
Retinal image: $I(x) = \int_{-\infty}^{+\infty} l(x,u)\,A(u)\,du$
*F. C. Huang, G. Wetzstein, B. Barsky, R. Raskar, Eyeglasses-free display: Towards correcting visual aberrations with computational light field displays, SIGGRAPH 2014
117
Light Field Display – Creating Focal Cues*
[Figure: display light field $l^d$, focus plane and retina, parameterized by x and u – more degrees of freedom]
Retinal image: $I(x) = \int_{-\infty}^{+\infty} l(x,u)\,A(u)\,du = \int_{-r/2}^{r/2} l^{d}\!\left(\Psi \begin{bmatrix} x \\ u \end{bmatrix}\right) du$
*F. C. Huang, G. Wetzstein, B. Barsky, R. Raskar, Eyeglasses-free display: Towards correcting visual aberrations with computational light field displays, SIGGRAPH 2014
118
Light Field Display – Creating Focal Cues*
[Figure: given a target retinal image, what display light field $l^d$ should be shown?]
Retinal image: $I(x) = \int_{-\infty}^{+\infty} l(x,u)\,A(u)\,du = \int_{-r/2}^{r/2} l^{d}\!\left(\Psi \begin{bmatrix} x \\ u \end{bmatrix}\right) du$
*F. C. Huang, G. Wetzstein, B. Barsky, R. Raskar, Eyeglasses-free display: Towards correcting visual aberrations with computational light field displays, SIGGRAPH 2014
119
Light Field Display – Projection Matrix*
[Figure: the display light field and the retinal image are related by a linear projection]
$\mathbf{P}\cdot \mathbf{L}^{d} = \mathbf{I}$, where $\mathbf{L}^{d}$ is the vectorized display light field and $\mathbf{I}$ the vectorized retinal image
*F. C. Huang, G. Wetzstein, B. Barsky, R. Raskar, Eyeglasses-free display: Towards correcting visual aberrations with computational light field displays, SIGGRAPH 2014
120
Light Field Display – Projection Matrix*
[Figure: solving for the display light field given the target retinal image]
$\mathbf{P}\cdot \mathbf{L}^{d} = \mathbf{I} \;\Rightarrow\; \mathbf{L}^{d} = \mathbf{P}^{-1}\,\mathbf{I}$
*F. C. Huang, G. Wetzstein, B. Barsky, R. Raskar, Eyeglasses-free display: Towards correcting visual aberrations with computational light field displays, SIGGRAPH 2014
121
Light Field Display – Projection Matrix*
[Figure: display light field $l^d$, focus plane and retina over (x, u) – more degrees of freedom]
$\mathbf{L}^{d} = \mathbf{P}^{-1}\,\mathbf{I}$ – with the added angular degrees of freedom, does the inversion become well-posed?
*F. C. Huang, G. Wetzstein, B. Barsky, R. Raskar, Eyeglasses-free display: Towards correcting visual aberrations with computational light field displays, SIGGRAPH 2014
122
Condition Number of the Projection Matrix* Can we answer the question - how many rays (views) are needed per eye?
(Lower is better)
*F. C. Huang, G. Wetzstein, B. Barsky, R. Raskar, Eyeglasses-free display: Towards correcting visual aberrations with computational light field displays, SIGGRAPH 2014
123
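A toy numerical aside (this is not the construction from the paper – the projection matrix below is a simplistic stand-in): for a conventional single-view display, the projection of the displayed image onto a defocused retina is just a blur, and the condition number of that blur matrix shows why inverting it (pre-correcting the image) quickly becomes ill-posed; the paper's point is that the extra angular degrees of freedom of a light field display bring the condition number of the full projection matrix back down.

import numpy as np

def blur_matrix(n_pixels, blur_width):
    """Circulant matrix for a box defocus blur: what a conventional (single-view)
    display's projection onto the retina looks like when the eye is out of focus."""
    kernel = np.zeros(n_pixels)
    kernel[:blur_width] = 1.0 / blur_width
    return np.stack([np.roll(kernel, i) for i in range(n_pixels)])

for w in [1, 3, 7, 15]:
    P = blur_matrix(256, w)
    print(f"blur width {w:2d}: condition number {np.linalg.cond(P):.1e}")
# The condition number grows rapidly with blur width, i.e. pre-correction on a
# conventional display is ill-posed; added views make the problem better conditioned.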
How to Create a Light Field Display?
• Need to create a natural accommodation response – Create the correct retinal blur corresponding to the 3D location of an object
• Fundamentally there are only two different ways to do this:
A. Create parallax across each eye that produces the correct retinal blur corresponding to the 3D location of the object being viewed – by presenting multiple views (integral imaging approach), or
B. Physically place the object at the appropriate focal plane corresponding to its 3D location – by providing multiple focal planes (multi-focal-plane approach)
• All real light field displays use one of these two ways
– Group/multi-user displays typically use approach A
– Single-user (Near-Eye/Head-Mounted) displays use approach A or B depending on their design point
• Fundamental questions for each approach:
– For A, how many views are needed?
– For B, how many planes are needed?
124
Achieving Natural 3D With Multi-Focal Displays • Spatial or temporal multiplexing can create multiple focal planes that place objects at appropriate distances to be consistent with vergence – Akeley (2004)* made the case that 16 or fewer depth planes are sufficient to provide an appearance of continuous depth and showed that interpolation could be used to place objects in between the display planes – Akeley (2004)* used beam splitters to superimpose images of different parts of a monitor on the same viewing axis – Love et al. (2009) used high-speed switchable lenses to change the optical distance of the monitor at different time instants to produce the effect of multiple planes
*K. Akeley, “Achieving near-correct focus cues using multiple image planes,” PhD thesis (Stanford 2004)
125
Light Field Displays: Group /Multi-User Displays
126
Traditional Stereoscopic 3D Displays • Two main types – With glasses – Without glasses
• Current large-screen (group viewing) consumer 3D systems are based on glasses – which are one of two types:
– Passive glasses: wavelength-based, polarization-based
– Active glasses: electronically controlled liquid crystal shutters
[Diagram: With Glasses – wavelength division multiplexing, light polarization, light shuttering; Without Glasses (auto-stereoscopic) – parallax-barrier based, lenticular-display based]
127
Light Field Displays for Group Viewing • Major types: 1. Scanning-type (with rotating structure)* 2. Multi-projector arrays* 3. Multi-layer (with stacked layers of LCDs and optical elements)**
*X. Liu and H. Li, The progress of light field 3-D displays, Information Display 6/14, 2014 ** G. Wetzstein, Why people should care about light-field displays, Information Display 2/15, 2015
128
1. Scanning-Type*
129
*X. Liu and H. Li, The progress of light field 3-D displays, Information Display 6/14, 2014
2. Multi-Projector Array Type*
130
*X. Liu and H. Li, The progress of light field 3-D displays, Information Display 6/14, 2014
3. Multi-Layer Type* • Multi-layer approaches date back to the use of parallax barriers (Ives 1901) and lenslets (Lippmann 1908), which have also been used in autostereoscopic displays in the recent past. In the older approaches these layers were passive, but more recent ones include active (electronically operated) layers
mask 2
mask 1
*G. Wetzstein, Why people should care about light-field displays, Information Display 2/15, 2015
131
3. Multi-Layer Type – Compressive Displays* • Combination of stacked programmable light modulators and refractive optical elements – Leverage high correlation between the views in a light field to produce a more efficient display – use nonnegative tensor factorization to compress light field with high angular resolution into a set of patterns that can be displayed on a stack of LCD panels
• Tensor Displays (subset of compressive displays):
– Use (N) multi-layers, fast temporal modulation (M frames), and directional backlighting
– Represent the light field as an Nth-order rank-M tensor and use a nonnegative tensor factorization (NTF) optimization framework to generate the required N × M patterns to be displayed
[Equations: nonlinear optimization problem and iterative update rules*]
*G. Wetzstein, D. Lanman, M. Hirsch, R. Raskar, Tensor displays: compressive light field synthesis using multilayer displays with directional backlighting, SIGGRAPH 2012
132
3. Multi-Layer Type – Compressive Displays*
*G. Wetzstein. D. Lanman, M. Hirsch, R. Raskar, Tensor displays: compressive light field synthesis using multilayer displays with directional backlighting, SIGGRAPH 2012
133
Light Field Displays: Near-Eye (HeadMounted) Displays
134
The MIG Vision* • The next BIG change in mobile devices will be in the human interface • Driving a transition from Smartphone to Mobile Information Gateway (MIG) – the platform that provides True Mobility • MIG will be: – A mobile compute/communications module (CCM) + – A rich wearable human interface module (HIM) *N. Balram, W. Wu, K. Berkner, I. Tosic, Mobile Information Gateway – Enabling True Mobility, The 14th International Meeting on Information Display (IMID). Aug. 2014
135
MIG – Compute/Communications Module (CCM) The MIG-CCM will:
1. Provide general, graphics and multimedia processing for any type of application
2. Run a standard High-Level Operating System (Android, Windows, iOS) with a huge environment of third-party applications
3. Communicate with high-bandwidth WAN, LAN, PAN, including intelligent self-powered sensors incorporated into the body and/or clothing
4. Come in both traditional and novel form factors
[Diagram: Android software stack – Applications, Application Framework, Libraries / Android Runtime, Linux Kernel (the High Level Operating System); the CCM can have a traditional or new form factor]
136
MIG – Human Interface Module (HIM) The MIG-HIM will need to be a lightweight Head-Mounted Display (HMD) that resembles a pair of eyeglasses – the only practical means of meeting the key requirements:
1. Wide field of view to make a large image
2. Ability to capture and interpret gestures
3. Seamless overlay of digital over the real world
4. Natural 3D
[Image: lightweight eyeglasses]
137
Classification of Head-Mounted Displays (HMDs)
• Virtual reality (VR) displays
– Stereoscopic (e.g., Oculus Rift, Sony)
– Light Field (no products available yet in this category)
• Augmented reality (AR) displays – optical see-through or video see-through, monocular or binocular
– Stereoscopic (e.g., Google, Epson)
– Light Field (no products available yet in this category) – the target category for the Mobile Information Gateway Human Interface Module
• Requirements: 1. Wide field of view to make a large image 2. Ability to capture and interpret gestures 3. Seamless overlay of digital over the real world 4. True (volumetric) 3D
138
Light Field Displays: Head-Mounted Displays (HMDs): Virtual Reality (VR)
139
VR HMD Examples* • Many VR HMDs shipping or announced • No light field products yet
*Courtesy Gordon Wetzstein
140
VR HMD Applications – Gaming*
*Courtesy Gordon Wetzstein
141
VR HMD Applications – Entertainment*
*Courtesy Gordon Wetzstein
142
Depth Cues At Different Distances*
[Figure: depth contrast vs. distance (1 m to 10,000 m), with personal, action and vista space regions; cues such as aerial perspective and relative height become relevant at different distances, and current HMDs cover only part of the range – after Cutting and Vishton 1995]
*F. C. Huang, K. Chen, G. Wetzstein, The Light Field Stereoscope: Immersive Computer Graphics Via Factored Near-Eye Light Field Displays With Focus Cues, SIGGRAPH 2015
143
Light Field Stereoscope (Prototype)* • Simple multi-layer (“compressive”) display – Only two LCD panels – No temporal multiplexing required
*F. C. Huang, K. Chen, G. Wetzstein, The Light Field Stereoscope: Immersive Computer Graphics Via Factored Near-Eye Light Field Displays With Focus Cues, SIGGRAPH 2015
144
Light Field Stereoscope (Prototype)* • Parallax across the eye provides focal cues
Rays on a horizontal scanline, observed at the centre of the viewer's left and right pupils, are shown in the 2D (x,u) diagrams on the right. The bottom diagram shows that for a conventional stereo display there is no parallax, while the top diagram shows that the light field stereoscope has parallax
145
Light Field Stereoscope (Prototype)* • Limitations posed by diffraction*
*F. C. Huang, K. Chen, G. Wetzstein, The Light Field Stereoscope: Immersive Computer Graphics Via Factored Near-Eye Light Field Displays With Focus Cues, SIGGRAPH 2015
146
Light Field Stereoscope (Prototype)*
• Limitations posed by diffraction*
– See left figure: the higher the resolution of the front panel, the more blur is created on the rear panel due to diffraction.
– Assuming viewers focus on the virtual image of the rear panel (placed at 1.23 m in this case), high-resolution viewing experiences will only be possible using a low pixel density for the front panel.
– See right figure: for a fixed resolution of the front panel, the maximum number of light field views entering a 3 mm wide pupil is plotted. Up to ~175 dpi the angular sampling rate is limited by geometry, but above that it is limited by diffraction. Even up to 500 dpi the maximum number of views stays above 2, so theoretically accommodation could still be achieved.
*F. C. Huang, K. Chen, G. Wetzstein, The Light Field Stereoscope: Immersive Computer Graphics Via Factored Near-Eye Light Field Displays With Focus Cues, SIGGRAPH 2015
147
Light Field Displays: Head-Mounted Displays (HMDs): Augmented Reality (AR)
148
Three Major Types of AR HMDs* • Type 1: Monocular basic system for simple tasks – Examples are Vuzix M100, Google Glass, Sony SmartEyeglass Attach
• Type 2: Binocular 2D/3D system for simple and moderate tasks – Examples are Epson Moverio, Sony SmartEyeglass SED-E1
• Type 3: Binocular 2D/3D system for moderate and complex tasks – Examples are Atheer Labs AiR, Magic Leap, Microsoft HoloLens – Light Field Displays are a subset of this type
*Insight Media, Market Analysis Report on B2B Augmented Reality HMD, Custom Report for Ricoh, April 2015
149
Major B2B Use Cases and Verticals (2020)*
Use Cases:
• Collecting items from a checklist – identify items on shelves, verify correct, place in basket/cart
• Mobile access to information and/or documentation – access and complete checklists, review manuals, etc.
Verticals:
• Manufacturing
• Transportation & Warehousing (Logistics)
• Retail Trade
• Healthcare & Social Services
• Construction, Repair, Maintenance
• First Responders (police, fire, security)
*Insight Media, Market Analysis Report on B2B Augmented Reality HMD, Custom Report for Ricoh, April 2015
150
Example: Type 1 (Monocular) Use Case in Logistics
• Ricoh pilot with DHL http://www.dhl.com/content/dam/downloads/g0/about_us/logistics_insights/csi_augmented_reality_report_290414.pdf
151
Example: Type 3 (Light Field) Use Case in “Bank Branch of the Future” • “Bank Branch of the Future” – For the concept of the bank branch to exist in the future decades, it needs to be completely re-defined to become a much more useful, interactive and pleasant place – One concept of this branch of the future is for it to be like a first class airline lounge with a comfortable ambience where customers and bank employees can interact in ways that feel natural and enable much more customer value than they would get from an online interaction
• Animated illustration of the application:
152
Example: Type 3 (Light Field) Use Case in Entertainment • Mixed-reality gaming – Play digital fantasy game in real environment
153
High-Level System Overview*
[Diagram: image source → focus modulator → optical combiner + eyepiece]
• Two alternatives for the image source: 1. Matrix display 2. Laser scanning technology
*W. Wu, K. Berkner, I. Tosic, N. Balram, Personal near-to-eye light field displays, Information Display 6/14
154
1A. Matrix Display – Reflective* **
[Diagram: image source = digital micromirror device (DMD); focus modulator = deformable membrane mirror device (DMMD)*; optical combiner and eyepiece; temporal multiplexing required]
* X. Hu and H. Hua, Design and assessment of a depth-fused multi-focal plane display prototype, J. of Display Tech. 2014
** P. Llull, N. Bedard, W. Wu, I. Tosic, K. Berkner, N. Balram, Design and optimization of a near-eye multi-focal display system for augmented reality, COSI, June 2015
155
1A. Matrix Display – Reflective*
• Fast temporal image source enables the presentation of a number of focal planes • By using depth blending which distributes weighted image intensities across planes, such displays can approximate a continuous depth volume *W. Wu, K. Berkner, I. Tosic, N. Balram, Personal near-to-eye light field displays, Information Display 6/14
156
1B. Matrix Display – Emissive*
[Diagram: image source = OLED; focus modulator = microlens array (MLA); optical combiner + eyepiece; spatial multiplexing required]
* H. Hua and B. Javidi, A 3D integral imaging optical see-through head-mounted display, Optics Express, 2014
157
2A. Laser Scanning Display*
[Diagram: laser image source; scanning mirrors for x-y scanning; focus modulator = deformable membrane mirror device (DMMD)*; temporal multiplexing required]
*B. T. Schowengerdt et al., True Three-Dimensional Displays that Allow Viewers to Dynamically Shift Accommodation, Bringing Objects Displayed at Different Viewing Distances Into and Out of Focus, Cyberpsychology & Behavior, 2004
158
2B. Laser Scanning Display – Fiber Array*
[Diagram: laser diodes with fiber-optic scanning in x-y; a "multifocal" fiber array acts as the focus modulator; scanning mirror for laser scanning; optical combiner + eyepiece; no temporal multiplexing of focal planes required]
*B. T. Schowengerdt and E. J. Seibel, 3D volumetric scanned light display with multiple fiber optics light sources, IDW 2010
159
*B. T. Schowengerdt and E. J. Seibel, 3D volumetric scanned light display with multiple fiber optics light sources, IDW 2010
2B. Laser Scanning Display – Fiber Array* Eye focuses on foreground (background blurs naturally)
160
*B. T. Schowengerdt and E. J. Seibel, 3D volumetric scanned light display with multiple fiber optics light sources, IDW 2010
2B. Laser Scanning Display – Fiber Array* Eye focuses on background (foreground blurs naturally)
161
Key System Design Challenges • Display specification tradeoffs – Spatial resolution, color depth, depth resolution – Brightness, contrast, color gamut, power consumption
• Perceptually correct overlay of digital information over the appropriate real world objects
• Optical tradeoffs – Field of View (FOV), weight, form-factor
• System latency between inputs and outputs
• Software platform and ecosystem 162
Display System Tradeoffs*
*W. Wu, K. Berkner, I. Tosic, N. Balram, Personal near-to-eye light field displays, Information Display 6/14
163
Displaying Light Field Imagery on Multi-Focal Displays (MFD)
• A fundamental system design point is to choose the number of focal planes
• There are significant tradeoffs that have to be made to increase the number of planes
• Rule of thumb is to choose the minimum number necessary for the target application and focus on how to use them in the most effective manner
• In many cases 6 focal planes may be a reasonable choice
[Figure: example of six focal planes at depths z1 through z6]
164
Displaying Light Field Imagery on Multi-Focal Displays (MFD)
• Depth blending is used to create an appearance of continuous depth across the set of physical planes
– This can be done using linear weighting of pixel values between adjacent planes or through nonlinear optimization techniques (a minimal sketch of the linear case follows below)
• A number of approaches have been developed to determine the content that should be displayed on the focal planes to achieve the desired result
K. J. MacKenzie et al. Journal of Vision (2010).
• The current approaches can be divided into those that assume:
(a) the location of the planes is fixed (static), or
(b) the location of the planes can be varied dynamically
165
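As a concrete illustration of the linear weighting mentioned above, here is a minimal sketch of linear depth blending, assuming (as in [MacKenzie et al. 2010] and [Ravikumar et al. 2011], cited on the following slides) that each pixel's intensity is split between the two focal planes that bracket its depth, with weights interpolated linearly in diopters. This is an illustrative reconstruction, not the authors' implementation; the function name and NumPy-based structure are assumptions.

```python
import numpy as np

def linear_depth_blend(image, depth_d, plane_depths_d):
    """Distribute a 2D image across focal planes with linear depth blending.

    image          : HxW array of pixel intensities
    depth_d        : HxW array of per-pixel depths, in diopters
    plane_depths_d : sorted 1D array of focal plane depths, in diopters
    Returns an array of shape (num_planes, H, W): each pixel's intensity is
    split between the two planes bracketing its depth, with weights varying
    linearly in diopters.
    """
    planes = np.zeros((len(plane_depths_d),) + image.shape)
    # Clamp depths to the focal workspace so every pixel is representable.
    d = np.clip(depth_d, plane_depths_d[0], plane_depths_d[-1])
    # Index of the nearer (lower-diopter) bracketing plane for each pixel.
    lo = np.clip(np.searchsorted(plane_depths_d, d, side="right") - 1,
                 0, len(plane_depths_d) - 2)
    span = plane_depths_d[lo + 1] - plane_depths_d[lo]
    w_hi = (d - plane_depths_d[lo]) / span   # weight on the farther plane
    w_lo = 1.0 - w_hi                        # weight on the nearer plane
    for k in range(len(plane_depths_d) - 1):
        planes[k]     += np.where(lo == k, w_lo * image, 0.0)
        planes[k + 1] += np.where(lo == k, w_hi * image, 0.0)
    return planes
```

For example, with the uniformly spaced workspace used later in this section (plane depths of 0, 0.6, 1.2, 1.8, 2.4 and 3.0 diopters), a pixel at 0.9 D would be rendered at half intensity on the 0.6 D plane and half on the 1.2 D plane.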
Presenting Images on MFD With Fixed Planes: Using Linear Depth Blending*
K. J. MacKenzie et al. Journal of Vision (2010).
*S. Ravikumar, K. Akeley, M. S. Banks, Creating effective focus cues in multi-plane 3D displays, Optics Express, Oct. 2011
166
Presenting Images on MFD With Fixed Planes: Using Linear Depth Blending*
Advantages*:
1. Computationally simple
2. Effective in maximizing retinal image contrast when the eye accommodates to the simulated distance
3. Provides appropriate contrast gradient to drive the eye’s accommodative response
Disadvantages**:
1. Does not handle complex scenes (with occlusions, reflections and other non-Lambertian phenomena) correctly
*S. Ravikumar, K. Akeley, M. S. Banks, Creating effective focus cues in multi-plane 3D displays, Optics Express, Oct. 2011
**R. Narain, R. A. Albert, A. Bulbul, G. J. Ward, M. S. Banks, J. F. O'Brien, Optimal Presentation of Imagery with Focus Cues on Multi-Plane Displays, ACM Trans. on Graphics, Vol. 34, No. 4, August 2015 167
Presenting Images on MFD With Fixed Planes: Using Content-Adaptive Opt.*
• In the content-adaptive optimization approach the problem is formulated as follows (written out as a formula after this slide):
– Given a fixed number (N) of planes, equally spaced dioptrically, determine the intensity values for each pixel on each plane so that the image v(z) seen by the viewer is close to the desired image of the scene s(z)
– The viewed image can be described as the sum of the images on the respective planes convolved with the point-spread-function (PSF) of the eye
– Closeness between the desired and actual image is defined by an error metric – traditional L2 distance weighted by the contrast sensitivity of the HVS
– The cost function is simplified by working in the frequency domain and the optimization is done using the Primal-Dual Hybrid Gradient (PDHG) algorithm. See [Narain et al. 2015]* for the details.
• The number of planes is chosen separately based on past studies ([MacKenzie et al. 2010], [Ravikumar et al. 2011]) indicating that a separation between 0.6 and 0.9 D is reasonable
– In [Narain et al. 2015]*, 4 planes are used, placed at [1.4D, 2.0D, 2.6D, 3.2D]
*R. Narain, R. A. Albert, A. Bulbul, G. J. Ward, M. S. Banks, J. F. O'Brien, Optimal Presentation of Imagery with Focus Cues on Multi-Plane Displays, ACM Trans. on Graphics, Vol. 34, No. 4, August 2015 168
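Written out as a formula (my paraphrase of the formulation above, not the exact notation of [Narain et al. 2015]), the viewed image and the optimization can be expressed as:

$$ v(z) \;=\; \sum_{i=1}^{N} h(z, q_i) \ast I_i, \qquad \min_{I_1,\ldots,I_N \,\ge\, 0}\; \big\lVert\, W\,\big(v(z) - s(z)\big) \,\big\rVert_2^2 $$

where $I_i$ is the image presented on plane $i$ at dioptric distance $q_i$, $h(z, q_i)$ is the eye's point-spread-function when accommodated at distance $z$, $\ast$ denotes convolution, $s(z)$ is the desired retinal image, and $W$ is a weighting derived from the contrast sensitivity of the HVS.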
Presenting Images on MFD With Fixed Planes: Using Content-Adaptive Opt.*
*R. Narain, R. A. Albert, A. Bulbul, G. J. Ward, M. S. Banks, J. F. O'Brien, Optimal Presentation of Imagery with Focus Cues on Multi-Plane Displays, ACM Trans. on Graphics, Vol. 34, No. 4, August 2015 169
Presenting Images on MFD With Fixed Planes: Using Content-Adaptive Opt.* • Improved occlusion handling by optimized blending versus linear blending
*R. Narain, R. A. Albert, A. Bulbul, G. J. Ward, M. S. Banks, J. F. O'Brien, Optimal Presentation of Imagery with Focus Cues on Multi-Plane Displays, ACM Trans. on Graphics, Vol. 34, No. 4, August 2015 170
Presenting Images on MFD With Fixed Planes: Using Content-Adaptive Opt.*
*[R. Narain et al. 2015]
171
Presenting Images on MFD With Dynamically Variable Planes* • Allocate locations of planes based on content instead of fixing them
• Requires MFD that is capable of dynamic variation of plane locations
Image from Ng, Ren, et al., "Light field photography with a hand-held plenoptic camera." Computer Science Technical Report CSTR 2.11 (2005).
[Figure: depth probability distribution Pr(z) versus depth z, with conventional focal plane allocation (uniform spacing in diopters) marked]
*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Near-eye display of light fields, IDW 2015
172
Presenting Images on MFD With Dynamically Variable Planes: K-Mean Opt.* • Consider placement of focal planes as a point clustering problem – 3D point cloud where the points have intensities and depth information
• We want to optimize the positions of the focal planes q1, …, qM
– Minimize the distance between data points and their nearest focal planes (to reduce contrast loss)
– Solve using K-means (a minimal sketch follows after this slide)
[Figure: depth histogram hist(z) versus z; conventional focal plane allocation q1-q6 (blue lines) versus content-adaptive focal plane allocation q1*-q6* (red lines)]
*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Near-eye display of light fields, IDW 2015
173
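Below is a minimal sketch of the clustering idea described above: intensity-weighted one-dimensional K-means over pixel depths (in diopters), which places the M focal planes near the depths where most of the scene content lies. It is an illustrative reconstruction rather than the authors' code; the function name, the intensity-weighting detail, and the convergence criterion are assumptions.

```python
import numpy as np

def kmeans_focal_planes(depth_d, intensity, num_planes=6, iters=50):
    """Place focal planes by clustering pixel depths (in diopters).

    Each pixel contributes its depth weighted by its intensity, so depth
    regions carrying more image content attract focal planes, reducing the
    contrast loss introduced by depth blending.
    """
    z = depth_d.ravel()
    w = intensity.ravel()
    # Initialize plane positions uniformly across the scene's depth range.
    q = np.linspace(z.min(), z.max(), num_planes)
    for _ in range(iters):
        # Assignment step: each pixel goes to its nearest focal plane.
        labels = np.argmin(np.abs(z[:, None] - q[None, :]), axis=1)
        # Update step: move each plane to the weighted mean of its pixels.
        q_new = np.array([
            np.average(z[labels == k], weights=w[labels == k] + 1e-12)
            if np.any(labels == k) else q[k]
            for k in range(num_planes)
        ])
        if np.allclose(q_new, q):
            break
        q = q_new
    return np.sort(q)
```

On the example light field shown on the next slide, such content-adaptive placement concentrates the planes near the object depths (0.2-2.8 D) instead of spreading them uniformly from 0 to 3.0 D.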
Presenting Images on MFD With Dynamically Variable Planes: K-Mean Opt.*
Image from Ng, Ren, et al., "Light field photography with a hand-held plenoptic camera." Computer Science Technical Report CSTR 2.11 (2005).
[Figure: depth distribution Pr(z) versus z for the example light field; scene object depths z1 = 0.2D, z2 = 0.5D, z3 = 1.0D, z4 = 1.9D, z5 = 2.8D; conventional (uniform) focal plane allocation p1 = 0D, p2 = 0.6D, p3 = 1.2D, p4 = 1.8D, p5 = 2.4D, p6 = 3.0D shown as blue lines, content-adaptive allocation as red lines]
*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Near-eye display of light fields, IDW 2015
174
Presenting Images on MFD With Dynamically Variable Planes: K-Mean Opt.*
• Results from a simulated 3D scene
[Figure: input image and depth map; simulated retinal images** for accommodation at 0.41D and 0.71D, comparing uniform plane placement (0.6D spacing) with optimized placement]
Linear depth blending is used for data between focal planes. Data from: Scharstein et al., GCPR 2014
**Arizona eye model, J. Schwiegerling, SPIE, 2004
*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Near-eye display of light fields, IDW 2015
175
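The simulated retinal images referenced above were generated with the Arizona eye model (Schwiegerling, 2004). The sketch below is a much simpler thin-lens stand-in for the same idea, shown only to make the comparison concrete: each focal plane's image is blurred by a defocus kernel whose angular size grows with the dioptric distance between the plane and the eye's accommodation state, and the blurred planes are summed because the MFD is additive. The pupil diameter, kernel shape, and function names are assumptions, not the authors' method.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def simulate_retinal_image(plane_images, plane_depths_d, accommodation_d,
                           pupil_mm=4.0, pixels_per_mrad=2.0):
    """Crude retinal-image approximation for an additive multi-focal display.

    The angular blur-circle diameter of a defocused plane is approximately
    (pupil diameter in metres) x (defocus in diopters), in radians.
    """
    retina = np.zeros_like(np.asarray(plane_images[0], dtype=float))
    for img, plane_d in zip(plane_images, plane_depths_d):
        defocus_d = abs(plane_d - accommodation_d)       # diopters
        blur_rad = (pupil_mm * 1e-3) * defocus_d         # radians of visual angle
        blur_px = max(1, int(round(blur_rad * 1e3 * pixels_per_mrad)))
        retina += uniform_filter(np.asarray(img, dtype=float), size=blur_px)
    return retina
```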
Presenting Images on MFD With Dynamically Variable Planes: K-Mean Opt.* • Real light field data
[Figures: light field camera prototype; plenoptic image of 3 books at specific depths with estimated depth map; image of a displayed scene]
*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Near-eye display of light fields, IDW 2015
176
Presenting Images on MFD With Dynamically Variable Planes: K-Mean Opt.* Displayed scene – rendered images captured by a camera behind the beamsplitter of the MFD
*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Near-eye display of light fields, IDW 2015
177
Presenting Images on MFD With Dynamically Variable Planes: K-Mean Opt.* • Comparison of uniform versus optimized plane placement – using simulated retinal images – Optimized planes produce better image quality (greater contrast and sharpness)
*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Near-eye display of light fields, IDW 2015
178
Presenting Images on MFD With Dynamically Variable Planes: Content-Adaptive Opt.*
• Find the optimal placement of focal planes by optimizing an objective function that characterizes the overall perceptual quality of the rendered 3D scene
– Using a metric defined as “Multi-Focal Scene Defocus Quality (MSDQ)”
*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Content-adaptive focus configuration for near-eye multi-focal displays, ICME 2016
179
Presenting Images on MFD With Dynamically Variable Planes: Content-Adaptive Opt.* • Retinal image results – Optimal placement of planes produces sharper and higher contrast region of interest
Uniform placement of focal planes
Optimal placement
*W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, Content-adaptive focus configuration for near-eye multi-focal displays, ICME 2016
180
Spatio-Angular Frequency Analysis*
• Compare the spectrum of multi-focal (additive) versus layered (multiplicative) displays* (see the formulas after this slide)
– The multi-focal display region consists of line spectra – with one line added for each plane
– The layered display region is formed by the convolution of the spectral support of each layer, producing a significantly larger region, but this is a theoretical upper bound that is not easily achieved in practice
*M. S. Banks, D. M. Hoffman, J. Kim, G. Wetzstein, 3D Displays, publication pending, 2016
181
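In symbols (my paraphrase, not the notation of Banks et al.): an additive multi-focal display emits the sum of its plane contributions, so its spatio-angular spectrum is the sum of line spectra, whereas a multiplicative layered display emits the product of its layer transmittances, so its spectrum is the convolution of the layer spectra:

$$ \hat{L}_{\mathrm{MFD}}(f_x, f_\theta) \;=\; \sum_{i=1}^{N} \hat{L}_i(f_x, f_\theta), \qquad \hat{L}_{\mathrm{layered}}(f_x, f_\theta) \;=\; \hat{L}_1 \ast \hat{L}_2 \ast \cdots \ast \hat{L}_K $$

The support of a convolution is the (larger) Minkowski sum of the individual supports, which is why the layered display region is significantly larger but remains a theoretical upper bound.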
Key System Design Challenges • Display specification tradeoffs – Spatial resolution, color depth, depth resolution – Brightness, contrast, color gamut, power consumption
• Perceptually correct overlay of digital information over the appropriate real world objects
• Optical tradeoffs – Field of View (FOV), weight, form-factor
• System latency between inputs and outputs
• Software platform and ecosystem 182
Overlay of Virtual Over Real World* • Overlaying virtual information on associated real objects at appropriate depth and size is a critical element of AR
• Calibration is critical to doing the overlay correctly*
[Figure: MIG user. Traditional: Human-Computer Interface; MIG: Human-Computer-World Interface]
*W. Wu, I. Tosic, K. Berkner, N. Balram, Depth-disparity calibration for augmented reality on binocular optical see-through displays, ACM Multimedia Systems Special Session on Augmented Reality, March 2015 183
Multi-Focal Display Research Prototype*
Prototype specifications:
• 31 degree diagonal FOV
• 6-bit grayscale imagery
• 55 fps per focal plane (>60 fps possible, but this runs into the resonant frequency of the tunable lens)
• 6 focal planes; 0-5 diopter focal workspace possible
• 1.3 arcmin resolution
• 4 mm exit pupil diameter
• 12 mm eye relief
• Large depth of field
• Binocular display achieved
*P. Llull, N. Bedard, W. Wu, I. Tosic, K. Berkner, N. Balram, Design and optimization of a near-eye multi-focal display system for augmented reality, COSI, June 2015
184
Multi-Focal Display Research Prototype
• Primary components needed: display element, focal modulation element, relay optics
– Must be capable of generating images and modulating focus at 60·Nf Hz (a worked example follows after this slide)
• Desirables: large FOV, eye box, focal workspace; high resolution
[Figure: focal modulator optical power (D) versus time over the ocular critical sampling time (1/60 s); maximum single-image exposure time 2.8 ms]
185
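As a quick check of that rate requirement (my arithmetic, using the plane count from the specification slide above): with $N_f = 6$ focal planes, $60 \cdot N_f = 360$ Hz total image rate, i.e. at most $1/360\,\mathrm{s} \approx 2.8$ ms per focal-plane image, which matches the maximum single-image exposure time noted in the figure. The prototype's 55 fps per plane corresponds to $6 \times 55 = 330$ Hz, just below this target.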
Multi-Focal Display Research Prototype*
*P. Llull, N. Bedard, W. Wu, I. Tosic, K. Berkner, N. Balram, Design and optimization of a near-eye multi-focal display system for augmented reality, COSI, June 2015
186
Multi-Focal Display Research Prototype
[Figure: prototype bench layout – Projector 1 and Projector 2 on optical rails for adjustable interpupillary distance; lenses f1 = 35 mm, f2 = 35 mm, f3 = -75 mm, eyepiece fe = 25 mm; tunable lens, slotted ring, 25 mm tube, adjustable polarizers, eyepiece-beamsplitter adaptor]
187
Focal Modulation Waveform Shaping • High-frequency step impulses result in ringing (“focal jitter”) and overshoot
• Use a “pyramid”-shaped waveform to reduce overshoot
– Distribute the large jump along both sides of the periodic waveform
[Figure: liquid lens focal planes over 1 cycle (1/55 sec), conventional waveform versus the pyramid-shaped waveform (“Ours”)]
188
Focal Modulation Waveform Filtering • “Focal jitter”: ringing resulting from high-frequency step inputs. – Can reduce this by filtering the input waveform
• Which filter?
– Can be designed, optimized, or empirically determined
[Figure: unfiltered response (2 ms response time, 5 ms settling time) versus filtered response]
189
Focal Waveform Filtering Experimental Procedure
Focal workspace 0-3D (6 focal planes). For k = {1, …, 6} focus positions, find the σ*k that maximizes image sharpness:
1. Move camera focus to focus position k
2. For each filter standard deviation σk = {0.001, 0.002, …, 0.014}:
– Capture several images with the camera
– Compute the mean of the captured images
– Compute the 2D sum of the Brenner gradient on the mean image (sharpness score)
– Change the filter standard deviation σk and repeat
3. Average the σ*k values to obtain the optimal filter standard deviation σ*
190
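Below is a minimal sketch of the two computational pieces of the procedure above: low-pass filtering of the focal-modulation drive waveform and the Brenner-gradient sharpness score. It assumes the filter being swept was a Gaussian characterized by the standard deviations σ listed above, and that the “2D sum” means applying the Brenner gradient along both image axes; the function names are hypothetical.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def filter_waveform(drive_waveform, sigma_s, sample_rate_hz):
    """Low-pass the periodic focal-modulation drive waveform with a Gaussian
    of standard deviation sigma_s (seconds) to suppress the ringing
    ("focal jitter") excited by its high-frequency step content."""
    return gaussian_filter1d(drive_waveform, sigma_s * sample_rate_hz,
                             mode="wrap")

def brenner_sharpness(image):
    """2D sum of the Brenner gradient: squared differences between pixels
    two samples apart, summed along both axes. Higher means sharper."""
    img = np.asarray(image, dtype=float)
    gx = img[:, 2:] - img[:, :-2]
    gy = img[2:, :] - img[:-2, :]
    return float((gx ** 2).sum() + (gy ** 2).sum())
```

For each candidate σ, the captured images would be averaged and scored with brenner_sharpness, and the σ*k giving the highest score kept for that focus position, as in the flow above.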
Focal Waveform Filtering Results • Evaluated focus quality of each focal plane sequentially.
Focal plane 3 in focus
𝜎3∗ = 0.004 191
Multi-Focal Imagery Demo: Setup
Exposure time: 1/55s
192
Demo Video: Sweep Through Focus
[Video frames: scene content at focal planes labeled 1, 2 and 3 as focus sweeps through]
193
Light Field Displays: Key Points to Remember • Light field displays are intended to enable natural and comfortable viewing of 3D scene – Providing natural and consistent stereo, parallax and focus cues – Avoiding well-known cue conflicts like VAC
• Natural accommodation response can be created in two ways: – Providing parallax across each eye that produces natural retinal blur corresponding to the 3D location of the object being viewed – integral imaging approach – Placing the object being viewed onto a focal plane at the appropriate distance – multi-focal-plane approach
• Need to make tradeoffs in design specifications based on target applications – Most fundamental separation by usage is Group/Multi-User versus Personal/Single-User (Near-Eye/Head-Mounted) – Group/Multi-User needs to support multiple viewpoints whereas Near-Eye/Head-Mounted can support single viewpoint 194
Light Field Displays: Key Points to Remember
• Group/Multi-User light field displays:
– Three main types: Scanning, Multi-projector, Multi-layer
– Primarily use integral imaging approach (providing a large number of views) – since this also provides parallax in addition to stereo and focus
– Compressive displays are a family of multi-layer displays that use advanced computer graphics computation algorithms to create and present a large number of views using a small number of spatial and temporal layers
• Personal/Single-User (Near-Eye/Head-Mounted) light field displays:
– Two main categories: Virtual Reality (VR) and Augmented Reality (AR)
• Virtual Reality (VR) HMDs:
– Gaming and entertainment are leading applications driving the market at this time
– Can use the compressive display approach to provide natural 3D viewing but diffraction is a major obstacle that needs to be addressed
• Augmented Reality (AR) HMDs:
– Large number of possible AR applications, especially in verticals like logistics, manufacturing, healthcare, construction, first responders etc
– AR HMDs could be divided into three main types – I (Basic/Monocular), II (Binocular 2D/3D), III (Binocular Advanced 3D) – with applications and products already existing for Types I and II
– For Type III, there are many tradeoffs that need to be made based on target application
– Can use integral imaging or multi-focal-plane approaches – these correspond to using spatial or temporal multiplexing
– Practical multi-focal displays can be created with a small number of planes by using depth blending to provide the perception of continuous depth 195
Section 5: Summary
196
HVS: Key Points to Remember • “The human visual system detects and interprets information from visible light to build a representation of the surrounding environment.”* • The visual pathway begins at the eyes and ends at the visual cortex • What we “see” is not the raw image on the retina but our interpretation of it • The interpretation depends on a set of sensory information (“cues”) that we extract from the data and on the rules that our system has developed during the course of our evolution (“prior model”) • Confusion (optical illusions) can arise when the data is considered suspect and is overruled by the prior model • Cue conflicts can cause physical ill-effects like nausea and fatigue * https://en.wikipedia.org/wiki/Visual_system
197
Light Fields: Key Points to Remember • Plenoptic function is a 7D function describing light flowing through space • This can be reduced to various useful subsets • Light field is a 4D function describing radiance as a function of position and direction – Simple representation using two parallel planes with 2D views (u,v) and 2D positions (s, t)
• Light fields can be captured using an array of cameras or a small-form factor camera with micro-lenses or multiple apertures – Each form of capture has tradeoffs and the best choice depends on the objectives
• Light fields can be displayed using an array of display engines or a display with special optical layers 198
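As a small illustration of the two-plane (u, v, s, t) representation summarized above, the sketch below stores a light field as a 4D array and pulls out a single sub-aperture view and a simple all-view average; the array layout and function names are assumptions made for illustration only.

```python
import numpy as np

# Light field L[u, v, s, t]: (u, v) indexes the view (direction) and
# (s, t) indexes position, i.e. one 2D image per view.
def sub_aperture_view(lightfield, u, v):
    """Return the 2D image seen from view direction (u, v)."""
    return lightfield[u, v, :, :]

def all_view_average(lightfield):
    """Average over all views: a crude synthetic-aperture image focused on
    the (s, t) plane (shifting views before averaging would refocus)."""
    return lightfield.mean(axis=(0, 1))
```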
Light Field Imaging: Key Points to Remember
• A number of tradeoffs have to be made based on the specific target application
– Possible applications include medical, factory automation/inspection, consumer content creation etc.
• Robust system methodology exists for design of an end-to-end system based on key performance metrics for the target application
– When designing the system, figure out requirements for spatial resolution, angular resolution (#views), depth resolution and range, temporal resolution, and spectrum
• Can use an array of cameras (sensors) or a single camera (sensor)
– Array approach enables high spatial resolution and wider baseline (provides depth for distant objects) but is bulkier and more costly
– Single camera approach enables a compact system and high angular resolution but has limited spatial resolution and narrow baseline (provides depth for closer objects only)
• Calibration is a critical first step of the processing
• Depth can be estimated accurately using layer-based approaches or dense field approaches
• Processing based on geometric models is applicable in most cases but diffraction models need to be used for applications involving high magnification
199
Light Field Displays: Key Points to Remember • Light field displays are intended to enable natural and comfortable viewing of 3D scene – Providing natural and consistent stereo, parallax and focus cues – Avoiding well-known cue conflicts like VAC
• Natural accommodation response can be created in two ways: – Providing parallax across each eye that produces natural retinal blur corresponding to the 3D location of the object being viewed – integral imaging approach – Placing the object being viewed onto a focal plane at the appropriate distance – multi-focal-plane approach
• Need to make tradeoffs in design specifications based on target applications – Most fundamental separation by usage is Group/Multi-User versus Personal/Single-User (Near-Eye/Head-Mounted) – Group/Multi-User needs to support multiple viewpoints whereas Near-Eye/Head-Mounted can support single viewpoint 200
Light Field Displays: Key Points to Remember
• Group/Multi-User light field displays:
– Three main types: Scanning, Multi-projector, Multi-layer
– Primarily use integral imaging approach (providing a large number of views) – since this also provides parallax in addition to stereo and focus
– Compressive displays are a family of multi-layer displays that use advanced computer graphics computation algorithms to create and present a large number of views using a small number of spatial and temporal layers
• Personal/Single-User (Near-Eye/Head-Mounted) light field displays:
– Two main categories: Virtual Reality (VR) and Augmented Reality (AR)
• Virtual Reality (VR) HMDs:
– Gaming and entertainment are leading applications driving the market at this time
– Can use the compressive display approach to provide natural 3D viewing but diffraction is a major obstacle that needs to be addressed
• Augmented Reality (AR) HMDs:
– Large number of possible AR applications, especially in verticals like logistics, manufacturing, healthcare, construction, first responders etc
– AR HMDs could be divided into three main types – I (Basic/Monocular), II (Binocular 2D/3D), III (Binocular Advanced 3D) – with applications and products already existing for Types I and II
– For Type III, there are many tradeoffs that need to be made based on target application
– Can use integral imaging or multi-focal-plane approaches – these correspond to using spatial or temporal multiplexing
– Practical multi-focal displays can be created with a small number of planes by using depth blending to provide the perception of continuous depth 201
Section 6: References
202
Vision Science & Light Fields
• M. S. Banks, W. W. Sprague, J. Schmoll, J. A. Q. Parnell, G. D. Love, "Why do animal eyes have pupils of different shapes", Sci. Adv., August 2015
• S. Wanner, S. Meister, B. Goldluecke, "Datasets and benchmarks for densely sampled 4D light fields", Vision, Modeling & Visualization, The Eurographics Association, 2013
• R. T. Held, E. Cooper, J. F. O'Brien, M. S. Banks, "Using blur to affect perceived distance and size", ACM Trans. Graph. 29, 2, March 2010
• D. Hoffman, A. Girshick, K. Akeley, M. S. Banks, "Vergence-accommodation conflicts hinder visual performance and cause visual fatigue", Journal of Vision 8(3):33, 2008
• M. S. Banks et al., "Conflicting focus cues in stereoscopic displays", Information Display, July 2008
• Levoy and Hanrahan, "Light field rendering", SIGGRAPH 1996
• E. H. Adelson, J. Y. A. Wang, "Single lens stereo with a plenoptic camera", IEEE Trans. PAMI, Feb. 1992
• E. H. Adelson, J. R. Bergen, "The plenoptic function and the elements of early vision", Computational Models of Visual Proc., MIT Press, 1991 203
Light Field Imaging
• N. Balram, I. Tosic, H. Binnamangalam, "Digital health in the age of the infinite network", Journal APSIPA, 2016
• L. Meng, K. Berkner, "Parallax rectification for spectrally-coded plenoptic cameras", IEEE-ICIP, 2015
• N. Bedard, I. Tošić, L. Meng, A. Hoberman, J. Kovacevic, K. Berkner, "In vivo ear imaging with a light field otoscope", Bio-Optics: Design and Application, April 2015
• K. Akeley, "Light-field imaging approaches commercial viability", Information Display 6/15, Nov./Dec. 2015
• I. Tošić, K. Berkner, "3D Keypoint Detection by Light Field Scale-Depth Space Analysis", ICIP, October 2014 (Best Paper Award)
• N. Bedard, I. Tošić, L. Meng, K. Berkner, "Light field Otoscope", OSA Imaging and Applied Optics, July 2014
• J. Park, I. Tošić, K. Berkner, "System identification using random calibration patterns", ICIP, October 2014 (Top 10% paper award) 204
Light Field Imaging
• I. Tošić, K. Berkner, "Light field scale-depth space transform for dense depth estimation", CVPR Workshops, June 2014
• K. Masuda, Y. Yamanaka, G. Maruyama, S. Nagai, L. Meng, I. Tosic, "Single-snapshot 2D color measurement by plenoptic imaging systems", SPIE Photonics West, OPTO, February 2014
• Dansereau et al., "Decoding, calibration and rectification for lenselet-based plenoptic cameras", CVPR 2013
• Y. Lin, I. Tosic, K. Berkner, "Occlusion-aware layered scene recovery from light fields", Proceedings of IEEE ICIP 2013
• L. Meng, K. Berkner, "Optimization of filter layout for spectrally coded plenoptic camera", OSA Applied Imaging Congress, June 2013
• K. Berkner, L. Meng, S. A. Shroff, I. Tosic, "Understanding the design space of a plenoptic camera through an end-to-end system model", OSA Applied Imaging Congress, June 2013 (invited talk)
• S. A. Shroff and K. Berkner, "Image formation analysis and high resolution image reconstruction for plenoptic imaging systems", Applied Optics 52.10 (2013): D22-D31
• S. A. Shroff, K. Berkner, "Plenoptic system response and image formation", OSA Applied Imaging Congress, June 2013 (invited talk)
205
Light Field Imaging
• I. Tosic, S. A. Shroff, K. Berkner, "Dictionary learning for incoherent sampling with application to plenoptic imaging", Proceedings of IEEE ICASSP 2013
• A. Gelman, J. Berent, P. L. Dragotti, "Layer-based sparse representation of multiview images", EURASIP Journal on Advances in Signal Processing, 2012
• S. Wanner, B. Goldluecke, "Globally consistent depth labeling of 4D light fields", Computer Vision and Pattern Recognition (CVPR), 2012
• L. Meng, K. Berkner, "System model and performance evaluation of spectrally coded plenoptic camera", OSA Imaging and Applied Optics Congress, June 2012
• S. A. Shroff, K. Berkner, "High resolution image reconstruction for plenoptic imaging systems using system response", OSA Imaging and Applied Optics Congress, June 2012
• K. Berkner, S. A. Shroff, "Optimization of plenoptic imaging systems including diffraction effects", International Conference on Computational Photography, April 2012
• R. Horstmeyer, G. Euliss, R. Athale, M. Levoy, "Flexible multimodal camera with light field architecture", IEEE International Conference on Computational Photography, 2009
• R. Ng, M. Levoy, M. Bredif, G. Duval, "Light field photography with a hand-held plenoptic camera", Technical Report CSTR, 2005 206
Light Field Displays
• W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, "Content-adaptive focus configuration for near-eye multi-focal displays", ICME 2016
• W. Wu, I. Tosic, N. Bedard, P. Llull, K. Berkner, N. Balram, "Near-eye display of light fields", IDW 2015
• R. Narain, R. A. Albert, A. Bulbul, G. J. Ward, M. S. Banks, J. F. O'Brien, "Optimal presentation of imagery with focus cues on multi-plane displays", ACM Trans. on Graphics, Vol. 34, No. 4, August 2015
• P. Llull, N. Bedard, W. Wu, I. Tosic, K. Berkner, N. Balram, "Design and optimization of a near-eye multi-focal display system for augmented reality", COSI, June 2015
• W. Wu, I. Tošić, K. Berkner, N. Balram, "Depth-disparity calibration for augmented reality on binocular optical see-through displays", ACM Multimedia Systems Special Session on Augmented Reality (MMSysAR), March 2015
• F. C. Huang, K. Chen, G. Wetzstein, "The light field stereoscope: immersive computer graphics via factored near-eye light field displays with focus cues", SIGGRAPH 2015
• G. Wetzstein, "Why people should care about light-field displays", Information Display 2/15
207
Light Field Displays
• N. Balram, W. Wu, K. Berkner, I. Tosic, "Mobile information gateway – enabling true mobility", The 14th International Meeting on Information Display (IMID 2014 Digest), Aug. 2014. Invited paper
• N. Balram, "The next wave of 3D – light field displays", Guest Editorial, Information Display, Nov/Dec 2014 issue, Vol. 30, Number 6
• X. Liu, H. Li, "The progress of light field 3-D displays", Information Display, Nov/Dec 2014 issue, Vol. 30, Number 6. Invited paper
• W. Wu, K. Berkner, I. Tošić, N. Balram, "Personal near-eye light field display", Information Display, Nov/Dec 2014 issue, Vol. 30, Number 6. Invited paper
• W. Wu, N. Balram, I. Tošić, K. Berkner, "System design considerations for personal light field displays for the mobile information gateway", International Workshop on Display (IDW), Dec. 2014. Invited paper
• X. Hu, H. Hua, "Design and assessment of a depth-fused multi-focal plane display prototype", J. of Display Tech. 2014
• H. Hua, B. Javidi, "A 3D integral imaging optical see-through head-mounted display", Optics Express, 2014
208
Light Field Displays
• F. C. Huang, G. Wetzstein, B. Barsky, R. Raskar, "Eyeglasses-free display: towards correcting visual aberrations with computational light field displays", SIGGRAPH 2014
• G. Wetzstein, D. Lanman, M. Hirsch, R. Raskar, "Tensor displays: compressive light field synthesis using multi-layer displays with directional backlighting", SIGGRAPH 2012
• S. Ravikumar, K. Akeley, M. S. Banks, "Creating effective focus cues in multi-plane 3D displays", Optics Express, Oct. 2011
• Y. Takaki, K. Tanaka, J. Nakamura, "Super multi-view display with a lower resolution flat-panel display", Opt. Express, 19, 5 (Feb) 2011
• B. T. Schowengerdt, E. J. Seibel, "3D volumetric scanned light display with multiple fiber optics light sources", IDW 2010
• Y. Takaki, "High-density directional display for generating natural three-dimensional images", Proc. IEEE 94, 3, 2006
• B. T. Schowengerdt et al., "True Three-Dimensional Displays that Allow Viewers to Dynamically Shift Accommodation, Bringing Objects Displayed at Different Viewing Distances Into and Out of Focus", Cyberpsychology & Behavior, 2004
209
210