Uniformly Sampled Light Fields

Emilio Camahort
Department of Computer Sciences, The University of Texas at Austin

Apostolos Lerios
Align Technology, Inc.

Don Fussell
Department of Computer Sciences, The University of Texas at Austin

Abstract

Image-based or light field rendering has received much recent attention as an alternative to traditional geometric methods for modeling and rendering complex objects. A light field represents the radiance flowing through all the points in a scene in all possible directions. We explore two new techniques for efficiently acquiring, storing, and reconstructing light fields in a (nearly) uniform fashion. Both techniques sample the light field by sampling the set of lines that intersect a sphere tightly fit around a given object. Our first approach relies on uniformly subdividing the sphere and representing this subdivision in a compact data structure which allows efficient mapping of image pixels or rays to sphere points and then to subdivision elements. We sample a light field by joining pairs of subdivision elements and store the resulting samples in a multi-resolution, highly compressed fashion that allows efficient rendering. Our second method allows a uniform sampling of all five dimensions of the light field, using hierarchical subdivision for directional space and uniform grid sampling for positional space. Light field models are acquired using parallel projections along a set of uniform directions. Depth information can also be stored for high-quality image rendering. The system can provide bounds on key sources of error in the representation and can be generalized to arbitrary scenes comprising multiple complex objects.

1 Introduction

Image-based modeling and rendering techniques have recently received much attention as an alternative to traditional geometry-based techniques for image synthesis. Synthetic images are created from a prestored set of samples of the light field of an environment instead of a geometric description of the elements in the scene. While image-based techniques have long been used to augment geometric models in the form of texture maps or environment maps, for example, newer approaches completely replace geometric information with image data. This new approach allows the construction of models directly from real-world image data without conversion to geometry. It can also be more efficient than geometric techniques for complex models which require very large amounts of geometric detail to represent.

Image-based modeling and rendering relies on the concept of a light field. A light field represents the radiance flowing through all the points in a scene in all possible directions. For a given wavelength, we can represent a static light field as a 5D scalar function L(x, y, z, θ, φ) that gives radiance as a function of location and direction. Location is represented by a point (x, y, z) in 3D space. Direction is specified by a pair of angles (θ, φ) giving azimuth and elevation, respectively. Later we also use the alternative notation d to represent directions as unit vectors or, equivalently, points on a sphere's surface.

Taylor Hall 2.124, Austin TX 78712, {ecamahor,fussell}@cs.utexas.edu, http://www.cs.utexas.edu/users/{ecamahor,fussell}. 2991 El Camino Real, Suite 120, Redwood City CA 94061, [email protected], http://www-graphics.stanford.edu/~tolis/




Images are discretized 2D slices of this 5D function. Given a set of viewing parameters, we can render an image by evaluating the discrete light field function at the location of the eye for the set of directions within the field of view.

Current research in image-based models falls into one of two categories: (i) techniques related to computer vision, image warping, and view interpolation, and (ii) techniques related to the fundamental representation, sampling and reconstruction of the light field. With a few exceptions, most recent developments fall into the first category. Our paper, however, improves on current techniques for light field representations of single objects.

Previous light field models are based on the two-plane parameterization (2PP), where the set of lines in space is parameterized by the intersection points of each line with two planes. In order to sample all the lines intersecting the convex hull of an object we can choose different arrangements of pairs of planes. Levoy and Hanrahan call each pair a light slab [25]; Gortler et al. [19] propose a single arrangement of six pairs of planes called the lumigraph. Images produced using these two approaches show noticeable artifacts when the camera crosses the boundary between two light slabs. Even arrangements of 12 light slabs do not suffice to avoid this problem, called the disparity problem.

In this paper we propose a solution to the disparity problem. A light field representation should allow the user to move freely around an object without noticing any resolution changes in the model. This requires the representation to be invariant under both rotations and translations. Such a representation samples the light field function by uniformly sampling the set of lines intersecting the object's convex hull. As an approximation to the convex hull we use a sphere tightly fit around the object. Given that approximation, we introduce two new uniform representations for light field models, the two-sphere parameterization and the sphere-plane parameterization. The first one was studied by the second author while a doctoral candidate at Stanford University. The second one is the work of the first and third authors at the University of Texas at Austin.

Our representations improve on previous ones for several reasons. First, they allow us to sample the light field in a (nearly) uniform fashion. Uniformity is relevant because it solves the disparity problem, but it also has other important advantages. For example, when sampling a function whose variation is unknown a priori, uniform sampling provides a good preview of the function that can later be refined as more information about the function becomes known. Also, compression theory often makes the assumption of uniformly spaced samples; for instance, the discrete Fourier transform assumes that its input is a sequence of regularly spaced function samples. Second, our representations profit from the spherical nature of the light field function. Previous approaches avoid spherical representations due to their complexity; instead, they use cylindrical projections or planar perspective projections for simplicity. We show in this paper that it is possible to efficiently store and render a light field using an entirely spherical representation. Furthermore, our representations implement much of the functionality of previous ones, including interpolation, depth correction, and compression.
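As a concrete illustration of how an image is rendered from a discrete light field (the evaluation at the eye location for the directions in the field of view described at the start of this section), consider the minimal sketch below. It is only illustrative: the light field lookup is passed in as a generic callable because its implementation depends on the parameterization chosen, and all names (Vec3, renderFromLightField, and the camera parameters) are ours rather than the paper's.

#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

// Render a width x height image by evaluating a discretized light field at
// the eye location for every direction inside the field of view.  The light
// field is any callable taking (position, direction) and returning a color.
template <typename LightField>
std::vector<Vec3> renderFromLightField(const LightField& field, const Vec3& eye,
                                       const Vec3& forward, const Vec3& right,
                                       const Vec3& up, double fovY,
                                       int width, int height) {
    std::vector<Vec3> image(std::size_t(width) * height);
    const double aspect = double(width) / double(height);
    const double tanHalf = std::tan(fovY / 2.0);
    for (int j = 0; j < height; ++j) {
        for (int i = 0; i < width; ++i) {
            // Map the pixel to a viewing direction through the eye.
            double sx = (2.0 * (i + 0.5) / width - 1.0) * tanHalf * aspect;
            double sy = (1.0 - 2.0 * (j + 0.5) / height) * tanHalf;
            Vec3 d{ forward.x + sx * right.x + sy * up.x,
                    forward.y + sx * right.y + sy * up.y,
                    forward.z + sx * right.z + sy * up.z };
            double len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
            d = Vec3{ d.x / len, d.y / len, d.z / len };
            image[std::size_t(j) * width + i] = field(eye, d);
        }
    }
    return image;
}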

and the sphere-plane parameterization (SPP). They are illustrated in figure 1. In the first case, choosing both points with uniform probability implies that the line joining them follows a uniform distribution. In the second case, choosing both the great circle and the point on it following uniform distributions implies that the normal to the great circle through the point follows a uniform distribution. Using this principle we can now construct uniform samplings of the restriction of line space to the sphere by applying this notion of probabilistic uniformity to the placement of samples. We can build a 2SP representation of line space by taking a uniform sample of the points on the sphere and connecting each point to every other point, forming a uniform sample of the set of oriented lines through the sphere. Alternatively, we can build an SPP representation by uniformly sampling the set of (oriented) great circles of the sphere and, for each great circle, uniformly sampling the set of points on the disk it bounds. The normals to each circle through each of those points give a uniform sampling of the set of oriented lines in space. Note that each point on the sphere's surface represents a unique 3D direction and there is a one-to-one correspondence between the set of all 3D directions and the set of all (oriented) great circles of the sphere. Therefore, a sample of the set of (oriented) great circles of the sphere can easily be obtained by taking a uniform sample of the set of points on the sphere's surface.
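As a concrete illustration of the first construction, the following minimal sketch (our own code, not the paper's implementation) draws an oriented line with a uniform distribution over the set of lines intersecting the unit sphere by joining two independently chosen, uniformly distributed points on the sphere's surface.

#include <cmath>
#include <random>

struct Vec3 { double x, y, z; };

// Draw a point uniformly distributed on the unit sphere by normalizing
// a 3D Gaussian sample (a standard, rotation-invariant construction).
Vec3 uniformPointOnSphere(std::mt19937& rng) {
    std::normal_distribution<double> gauss(0.0, 1.0);
    Vec3 p{ gauss(rng), gauss(rng), gauss(rng) };
    double len = std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z);
    return { p.x / len, p.y / len, p.z / len };
}

struct OrientedLine { Vec3 from, to; };  // oriented chord P -> Q of the sphere

// 2SP-style sampling: two independent uniform points on the sphere give a
// uniformly distributed (oriented) line intersecting the sphere.
OrientedLine uniformLineThroughSphere(std::mt19937& rng) {
    return { uniformPointOnSphere(rng), uniformPointOnSphere(rng) };
}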

Figure 1: Two ways of uniformly selecting a random line intersecting a 3D sphere: (a) select two random points P and Q uniformly distributed on the sphere's surface and join them with a line L, and (b) select a random great circle C with uniform probability, select a random point P uniformly distributed over the disk bounded by C, and choose the normal L to C passing through P.







Additionally, they allow flexible multiresolution light field sampling densities and progressive transmission and rendering at interactive rates.

Finally, we do not advocate in this paper that non-uniformly sampled representations are useless. On the contrary, they usually give us more control over sample placement, so that we may increase the sample density in areas we deem more important. And, in general, non-uniform representations can divert samples from low-variance areas to detailed ones. For this reason, even though our methods start with a uniform sampling pattern, they both support hierarchical multiresolution so that groups of uniform samples in a low-variance area can be stored compactly.

In the following section we define the concept of a uniformly sampled light field and introduce the two representations that are the subject of this paper. Then we elaborate on our first representation, which relies on subdividing the unit sphere in a uniform fashion and then sampling a light field by joining pairs of subdivision elements. Section 4 presents our second approach to uniform light field sampling, which relies on the same subdivision as the first one but obtains the light field samples by parallel projections along the directions given by the bounding sphere's discretization. For each of our approaches, we define the representation and present algorithms for acquiring models of single objects as well as rendering those models efficiently. We also discuss time/space tradeoffs in the use of our techniques and give initial results of our implementations. The last two sections contain a survey of related papers and a comparison between our approaches and related methods.

3 The Two-Sphere Parameterization

Our first light field representation is based on the 2SP. We start by subdividing the surface of the sphere into (nearly) equilateral, (nearly) identical spherical triangles, called patches; then, we store one light field sample for every ordered pair of patches. We call this representation the two-sphere parameterization (2SP) of the light field, since it is equivalent to representing each line in free space by its intersection points with two overlapping spheres.

We generate a 2SP model of the external appearance (color) of a synthetic object by performing two steps:

1. Scale the object to fit snugly within the unit sphere.
2. For each ordered pair of patches,
   (a) randomly choose some of the rays crossing both patches,
   (b) use a ray tracer to determine the color of each ray, and
   (c) set the pair sample equal to the average ray color.

More elaborate methods may use filter kernels other than mere averaging (box filter). Also, one may build a 2SP model from another light field by resampling it, be it a collection of camera views of a real object or a set of fast perspective renderings of a synthetic object. Sample extraction for an arbitrary ray is also very easy in its simplest form:

2 Preliminaries

Consider a 3D object in free space. We want to sample the 4D light field that can be viewed from outside of its convex hull. To do so we approximate the convex hull by a bounding sphere tightly fit around the object. Then we consider the restriction of 3D oriented line space to those lines that intersect the bounding sphere. We want to choose lines from that set following a uniform distribution. From a probabilistic point of view this means that each line of the set has to be equally likely to be chosen. The problem with line space is that uniformity depends on the probability measure, which in turn depends on the parameterization of line space [31]. Different choices of parameterizations can be found in the literature [37]. Two of those choices have been shown to support the selection of random lines following a uniform distribution. The two parameterizations are the two-sphere parameterization (2SP)

1. The intersection of the given ray with the unit sphere is computed. If there is no intersection, a sentinel "background" value is returned.
2. Otherwise, we determine which pair of patches is crossed by the given ray, and in which order.
3. Finally, we return the light field sample for the intersected pair.

An obvious improvement to this approach uses reconstruction kernels which improve over our nearest-neighbor resampling scheme (piecewise-constant reconstruction). The ensuing sections will cover the details of our system. We will first describe how we uniformly subdivide the sphere, storing

Figure 2: The subdivided sphere at two different subdivision levels. On the left we used fewer subdivisions than on the right; at the finer level the result is a trivial image resembling white noise. In these pictures, each patch is assigned a random color.

the subdivision in a compact manner. Then, we will present the algorithms and data structures that allow us to map rays to sphere points and patch pairs efficiently. We conclude our presentation of the 2SP by describing the multi-resolution data structure in which we store the 2SP samples.



3.1 Uniformly Subdividing the Sphere

The most popular subdivisions of the unit sphere's surface are based on polyhedra which approximate this surface [13]: each point of the polyhedron, the generator, is radially projected onto the sphere, in a 1-1 manner. A patch of the subdivision then comprises all sphere points whose corresponding polyhedron points lie on the same polyhedron face. The Platonic solids are good generators: all their edges, faces, and vertices look identical and, as a result, they induce a perfectly uniform subdivision. However, the most complex Platonic solid has only 20 faces. So, in order to induce a more refined subdivision, we create a new generator out of the icosahedron, by means of triangulation:



For every face:
    Find its centroid
    Project the centroid onto the circumsphere
    Replace the original face by three triangles connecting the raised centroid to the original edges





 



  




Ideally, a subdivision should produce patches which are equilateral spherical triangles. With or without the extra triangulation step, the maximum patch aspect ratio remains small. In other words, the superior uniformity in area does not come at the expense of distorting the patches.

  

Storage of the patches can be explicit or implicit. An explicit scheme computes all the patch vertices and edges and stores their coordinates and connectivity. An implicit scheme is akin to dicing the plane into a rectangular grid: one need only store a few parameters (row and column count and size) and employ fast functions to compute all the other information, such as the top left corner of a grid cell. Fekete [12] presented such an implicit scheme for a subdivided sphere, and our work builds upon it. At its heart, this scheme establishes a method of indexing all the patches in an incremental fashion: it assigns indices not only to the final patches but also to the patches corresponding to every intermediate generator, from the top-level generator down to the k-times-refined one. Figure 3 illustrates this scheme for a single top-level face and its descendants. Each such family of faces is distinguished from all others by means of some secondary tag. Our work extends Fekete's scheme to include an incremental numbering of all features of the subdivision, including edges and vertices. We also define efficient operators that

The result is a generator with 4 x 60 faces. Applying this process k times, we can get a solid with 4^k x 60 faces. We usually choose k = 5 or k = 6, yielding 61k and 245k faces, respectively. Why? Because if we were to render the sphere on a square display of 300 x 300 pixels, so that the sphere's edge borders the display's edge, each induced patch consumes at most a few pixels, hence producing a smooth image; see figure 2. Prior work [12] [17] applied the above refinement process directly onto an icosahedron in order to produce refined generators; however, our experiments showed that the triangulated icosahedron is superior as a top-level generator for the following reasons:




Ideally, a subdivision into N patches should produce patches whose area is 4π/N. Without triangulation, the patches have areas between 95% and 120% of 4π/N; with triangulation, and for approximately the same total number of patches, the range narrows to 97–107%.



For every edge:
    Find its midpoint
    Project the midpoint onto the circumsphere
For each face:
    Replace it by four triangles connecting the raised midpoints to the original vertices
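The sketch below shows one way such a 4-to-1 refinement step can be implemented on an explicit triangle mesh whose vertices lie on the unit sphere. It is only illustrative, since our actual system stores the subdivision implicitly, and the names (refineOnce, Face, and so on) are ours.

#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

struct Vec3 { double x, y, z; };
using Face = std::array<std::uint32_t, 3>;   // indices into a vertex array

static Vec3 normalizeVec(const Vec3& v) {
    double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

// One 4-to-1 refinement step: raise every edge midpoint onto the unit
// sphere (the circumsphere) and replace each face by four triangles.
void refineOnce(std::vector<Vec3>& verts, std::vector<Face>& faces) {
    std::map<std::pair<std::uint32_t, std::uint32_t>, std::uint32_t> midpointOf;
    auto midpoint = [&](std::uint32_t a, std::uint32_t b) -> std::uint32_t {
        std::pair<std::uint32_t, std::uint32_t> key = std::minmax(a, b);
        auto it = midpointOf.find(key);
        if (it != midpointOf.end()) return it->second;  // shared edge already split
        Vec3 m = normalizeVec({ (verts[a].x + verts[b].x) / 2,
                                (verts[a].y + verts[b].y) / 2,
                                (verts[a].z + verts[b].z) / 2 });
        verts.push_back(m);
        std::uint32_t idx = std::uint32_t(verts.size() - 1);
        midpointOf[key] = idx;
        return idx;
    };
    std::vector<Face> refined;
    refined.reserve(faces.size() * 4);
    for (const Face& f : faces) {
        std::uint32_t ab = midpoint(f[0], f[1]);
        std::uint32_t bc = midpoint(f[1], f[2]);
        std::uint32_t ca = midpoint(f[2], f[0]);
        refined.push_back({ f[0], ab, ca });   // three corner children
        refined.push_back({ f[1], bc, ab });
        refined.push_back({ f[2], ca, bc });
        refined.push_back({ ab, bc, ca });     // central child
    }
    faces = std::move(refined);
}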






The result is a generator with 60 identical faces, all triangles but not equilateral ones. We can refine this generator as follows:




Figure 3: The face indexing scheme. A top-level face, and the corresponding patch, are given the index 0. The next four indices, 1 to 4, are assigned to the four children in counter-clockwise fashion, starting at the bottom-left corner, as shown. At the next level, we assign indices 5 to 8 to the children of face 1, indices 9 to 12 to face 2's children, etc. And, within each parent face, we follow the same counter-clockwise ordering, starting at the child face sharing its east-west and SW-NE edges with its parent. With this scheme, parent/child associations are easy, e.g. the child indices of face i are 4i+1 through 4i+4.
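Under the numbering of figure 3, parent/child relationships reduce to simple index arithmetic. The helper functions below (our names, and ignoring the per-family secondary tag mentioned above) illustrate the idea.

#include <array>
#include <cstdint>

// Face 0 is the top-level patch, its children are 1..4, face 1's children
// are 5..8, and so on: the children of face i are 4*i + 1 .. 4*i + 4.
std::array<std::uint64_t, 4> childFaces(std::uint64_t i) {
    return { 4 * i + 1, 4 * i + 2, 4 * i + 3, 4 * i + 4 };
}

// The parent of any face j > 0 is recovered by inverting the formula.
std::uint64_t parentFace(std::uint64_t j) {
    return (j - 1) / 4;
}

// The refinement level of a face: level 0 holds face 0, level 1 holds
// faces 1..4, level 2 holds faces 5..20, and so on.
int faceLevel(std::uint64_t j) {
    int level = 0;
    while (j > 0) { j = (j - 1) / 4; ++level; }
    return level;
}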





compute geometric information for a feature given its index, for example, the coordinates of a vertex; and compute topological information linking features, for example, the indices of the two endpoints of an edge, given the edge index.

Figure 4: The edge indexing scheme. The edges of the top-level face are given special indices. Indices 0 to 2 are assigned to the first three new edges, shown as dashed lines, in the usual counter-clockwise fashion and starting with the SE-NW edge. The next generation of edges, shown as dotted lines, is named in the same fashion: indices 3 to 5 are assigned to face 1, indices 6 to 8 to face 2, and so forth. Note that although edges 5 and 8 seem to lie on the same line in this drawing, the refinement process may make the patch boundaries 5 and 8 lie on different great circles; hence the need to assign unique indices to such seemingly collinear edges.



Figure 5: The vertex indexing scheme. This scheme is best explained as a 3-step process. (1) We first turn the triangle into a rectangular grid, where it is easier to express adjacency in terms of the (i, j) coordinate system shown. To this end, we cut the top-level face along edge number 2 and tip it to the right, rotating around vertex 4. As a result, distinct vertices end up overlapping along the thick line, but they should be assigned distinct indices (8 and 9, 0 and 1, etc.); also, each vertex along the top (dotted) edge appears in two copies; we assign an index to the left copy only. (2) Next, we draw the complete grid at the finest level of refinement. (3) We assign vertex indices by moving (i) in steps whose size is determined by the refinement level introducing the vertex; (ii) in increasing j, and within each j in increasing i; (iii) assigning no index to a vertex that lands on the right half of the top edge; (iv) assigning two indices to a grid vertex that falls on the thick line; and (v) skipping over previously indexed vertices. For example, indices 6–14, shown as white squares, are assigned by stepping across the grid in this fashion: vertices that already have an index or that fall on the right half of the top edge are skipped, and a vertex on the thick (double) line receives two indices (8 and 9).







Figures 4 and 5 discuss our indexing scheme. In practice, a mixture of explicit and implicit storage works best, aiming at the right trade-off between processor speed and memory system performance. For example, our system caches the equations of the patch boundaries to optimize point classification (discussed below) but stores no topological information.





3.2 Mapping Rays to Patches

Identifying the ordered patch pair that is intersected by a given ray is a two-step process.

1. Compute the two intersection points between the ray and the unit sphere.
2. Identify the patches which contain the two points. This step can be decomposed into two, each finding the index of the patch containing one of the points.

The first step is covered in most introductory textbooks on ray tracing, such as [16]. Nevertheless, one can make significant speed improvements when a sequence of coherent rays is cast, such as when a light field model is rendered in perspective; such rendering uses a family of rays which share the same origin but cross the image plane at different pixels (x, y). Computing the intersection of each ray with the unit sphere boils down to solving one quadratic equation per pixel. Each equation comprises coefficients which are themselves quadratics in x and y. For pixels whose rays do not hit the sphere, the equation has no solution, i.e., a negative discriminant. Since the discriminant is a quadratic in x and y, a simple analysis can determine the range of y's (rows) for which the equation may have a solution for some x's (columns); all other rows may be skipped. Also, for each row (fixed y), one can easily find the range of x's for which the equation is solvable and ignore all other columns. Finally, as the columns of a row are scanned, incremental evaluation of the quadratics using forward differencing can substantially reduce the number of expensive arithmetic operations needed to determine the equation's coefficients. The above improvements are very similar in nature to the work of Laporte on rendering quadrics [21], which may be consulted for further details. Besides offering a significant speedup, these improvements make rendering complexity proportional to the projected image size of the unit sphere. So, a world populated with 2SP models can be rendered in time proportional to the desired image size, regardless of the number of models (assuming no occlusions).
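For reference, the per-ray intersection test at the heart of the first step looks like the following minimal sketch (our code; the coherence optimizations described above, such as row/column culling and forward differencing, are omitted).

#include <cmath>
#include <optional>
#include <utility>

struct Vec3 { double x, y, z; };

static double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Intersect the ray o + t*d (d need not be normalized) with the unit
// sphere centered at the origin.  Returns the two parameter values
// t1 <= t2, or nothing when the discriminant is negative (ray misses).
std::optional<std::pair<double, double>>
intersectUnitSphere(const Vec3& o, const Vec3& d) {
    // Coefficients of the quadratic a*t^2 + b*t + c = 0.
    double a = dot(d, d);
    double b = 2.0 * dot(o, d);
    double c = dot(o, o) - 1.0;
    double disc = b * b - 4.0 * a * c;
    if (disc < 0.0) return std::nullopt;   // negative discriminant: no hit
    double s = std::sqrt(disc);
    double t1 = (-b - s) / (2.0 * a);
    double t2 = (-b + s) / (2.0 * a);
    return std::make_pair(t1, t2);
}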



 









We now return our attention to the second step of ray-to-patch mapping. Having discovered one intersection point between ray and sphere, we must find the patch containing the point. We will assume that the sphere subdivision is induced by a top-level generator G which has then been refined k times, yielding k+1 levels in the induced subdivision. Also, assume for now that we know which projected face of G, i.e. which top-level patch, contains the point. The refinement algorithm is such that, after one level of refinement, each old patch is precisely the union of four new patches. So, much like binary search in trees, we need only discover which of the four children of the parent contains the point. This classification comprises at most three tests between the point and the child edges contained in the parent's interior; more precisely, we need to find the placement of the 3D point relative to the three planes defined by the sphere's center and the three great circle arcs outlining the children's patch boundaries within the parent patch. Since a constant number of tests is performed at each refinement level, the overall complexity of this search is linear in k.

Now we address the question of discovering the top-level patch containing the given point. A simple method, applicable only to a simple generator G, is to check all its faces. A more elaborate method may assume that successive points will be nearby and start the new search in the vicinity of the last containing patch. Yet another method builds a data structure dicing space into uniform cubical cells and associates each cell with all top-level patches which intersect it. With the aid of this space subdivision, the top-level patch containing a point is found by first determining the cell containing it, and then checking which patch in the cell's set contains it. This uniform space subdivision is our method of choice. Two further refinements, listed below, produce a highly efficient data structure.
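Before turning to those refinements, here is a minimal sketch of the per-level classification just described, written with explicit vertex geometry for clarity; the paper's own implementation instead works with implicit face indices and caches the boundary-plane equations, and the names Patch, classifyChild and descend are ours.

#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}
static double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
static Vec3 midpointOnSphere(const Vec3& a, const Vec3& b) {
    Vec3 m{ a.x + b.x, a.y + b.y, a.z + b.z };
    double len = std::sqrt(dot(m, m));
    return { m.x / len, m.y / len, m.z / len };
}

struct Patch { Vec3 a, b, c; };  // spherical triangle given by its corner vertices

// One step of the descent: decide which of the four children of 'parent'
// contains the unit-sphere point p, using at most three plane-side tests
// against planes through the sphere's center.  Children 0, 1, 2 are the
// corner children at a, b, c; child 3 is the central child.
int classifyChild(const Patch& parent, const Vec3& p, Patch& child) {
    Vec3 mab = midpointOnSphere(parent.a, parent.b);
    Vec3 mbc = midpointOnSphere(parent.b, parent.c);
    Vec3 mca = midpointOnSphere(parent.c, parent.a);
    // The plane through the origin, mab and mca separates the corner child at a.
    auto sideOf = [&](const Vec3& u, const Vec3& v, const Vec3& corner) {
        Vec3 n = cross(u, v);
        if (dot(n, corner) < 0.0) n = { -n.x, -n.y, -n.z };  // orient toward the corner
        return dot(n, p) >= 0.0;
    };
    if (sideOf(mab, mca, parent.a)) { child = { parent.a, mab, mca }; return 0; }
    if (sideOf(mbc, mab, parent.b)) { child = { parent.b, mbc, mab }; return 1; }
    if (sideOf(mca, mbc, parent.c)) { child = { parent.c, mca, mbc }; return 2; }
    child = { mab, mbc, mca };
    return 3;
}

// Full descent: starting from the top-level patch containing p, apply
// 'levels' classification steps; the sequence of child numbers identifies
// the leaf patch (and can be turned into a face index as in figure 3).
Patch descend(Patch patch, const Vec3& p, int levels) {
    for (int l = 0; l < levels; ++l) {
        Patch child;
        classifyChild(patch, p, child);
        patch = child;
    }
    return patch;
}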









1. Instead of storing patches induced by G in the space subdivision, we may store patches induced by a refined descendant of G. Doing so will increase the average number of patches


Figure 6: The compact storage of the 2D analogue of the space subdivision discussed in the text. The nonempty cells in the first quadrant are shown in gray; all runs of cells sharing the same y coordinate form a row. We store these gray cells in a linear array, row by row from bottom to top, and from left to right. We also construct two tables: T1 maps each row to the number of gray cells below that row, while T2 contains the leftmost x coordinate of the row's cells. Given a point, we use its y coordinate to determine the row of cells containing the point; T1's entry for that row then tells us where in the linear array the row begins, while the distance between the point's x coordinate and T2's entry determines the array offset between the leftmost cell in the row and the one containing the point. Due to the symmetry of the circle (sphere), the 2D (3D) space subdivision only needs to maintain T1 and T2 for the first quadrant (octant).






It is built from a simple statistical analysis of the light field samples, requiring little user input. For color samples, we compute statistics in YUV space, weighting the individual components on the basis of their perceptual effect [29].

2. If the space subdivision were to be stored as a 3D array dicing the unit cube, most cells would be empty. Instead, for the sake of good cache performance and storage efficiency, we can store only the non-empty cells, and incur only an insignificant penalty via the two-level access scheme described in figure 6. In our experiments, we used a space subdivision with a fixed cell width.
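A minimal 2D sketch of the two-level access scheme of figure 6 is given below; T1 and T2 follow the figure's table names, while the struct and member names are ours and the per-cell payload is left abstract.

#include <cstddef>
#include <vector>

// 2D analogue of the two-level access scheme of figure 6.  Only the
// non-empty cells of the first quadrant are stored, packed row by row
// into 'cells'.  T1[y] is the number of stored cells below row y (i.e.
// the array offset where row y begins) and T2[y] is the x coordinate of
// the leftmost stored cell of row y.
struct CompactGrid2D {
    std::vector<int> cells;        // payload per non-empty cell (e.g. a patch list id)
    std::vector<std::size_t> T1;   // row start offsets
    std::vector<int> T2;           // leftmost x per row

    // Map first-quadrant cell coordinates (x, y) to the linear array.
    int& at(int x, int y) {
        return cells[T1[std::size_t(y)] + std::size_t(x - T2[std::size_t(y)])];
    }
};

// Usage: grid.at(cellX, cellY) retrieves the data of a non-empty cell; by
// symmetry, coordinates are first folded into the first quadrant (octant in 3D).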



The combined outcome of our representation for the sphere subdivision and our techniques for mapping pixels to patches is high performance: any function mapping one color to each patch, for the levels of subdivision we use, can be displayed at 30 frames a second, each frame measuring 300 x 300 pixels (where a patch maps to a pixel), on a single 195MHz R10k processor.



It allows for progressive transmission of light fields. The average colors of all top-level pairs are sent first. Then the average colors of the pairs at the next level follow, along with flags that tell the receiver whether or not each node is a leaf. Internal nodes will send their children's colors later, and so forth. In effect, we transmit the trees in a collective breadth-first fashion.





3.3 Hierarchical Representation of the 2SP

The simplest way to store the 2SP is a two-dimensional array, mapping each ordered patch pair (i.e., a pair of integer indices) to a light field sample. This array can be compressed using any of the standard image compression techniques; vector quantization (VQ) [14] is probably best for fast reconstruction. However, vector quantization fails to compress the large empty portion of the light field efficiently. As a result, we opted for a common variant of VQ employing trees. Our data structure is summarized in figure 7. Our scheme is a practical and efficient one.




the light field consumes 11.3GB. With VQ blocks and 16-bit codewords, pure VQ yields a 500MB data set. With hierarchical VQ, the figure drops to 150MB. In addition, disk storage can be reduced to 50MB using commercial Lempel-Ziv [40] coding (gzip).

per cell, but reduce the number of levels we have to descend in order to reach the sought-after patch at the finest level. In our experiments, we stored the patches of the third subdivision level in the space-subdivision cells.




Figure 7: The hierarchical storage scheme of the 2SP; for simplicity, we assume that the light field samples are ray colors. With each ordered pair of top-level patches — a source and a destination patch — we associate the average color of all rays crossing the pair, as well as a tree. This tree contains the colors for individual rays crossing the pair, as well as average colors for ray families that cross pairs of descendant patches at the same subdivision level. (a) Each tree grows only when a node — corresponding to a pair of patches at the same level — exhibits a variance across its children that exceeds some threshold; the children of a pair are the 16 pairs obtained by matching each of the four children of the source patch to each child of the destination. (b) If the children of a pair are all leaves, either because of low grandchild variances or because the children belong to the last subdivision level, we use a 4x4 VQ block to represent the 16 children's colors.
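One possible in-memory layout for the trees of figure 7 is sketched below; the field names are ours, and a real implementation would add the VQ codebook itself and the secondary tags that identify each top-level pair.

#include <array>
#include <cstdint>
#include <memory>

struct Color { std::uint8_t r, g, b; };

// One node of the per-top-level-pair tree of figure 7.  A node corresponds
// to an ordered pair of patches at the same subdivision level and stores the
// average color of all rays crossing that pair.  If the variance across its
// 16 children (4 source children x 4 destination children) is low, the node
// is a leaf; otherwise it stores the children, and a family whose children
// are all leaves can be replaced by a 4x4 VQ block (a codeword index here).
struct PairNode {
    Color average{};                                  // filtered color of the pair
    bool isLeaf = true;
    std::int32_t vqCodeword = -1;                     // >= 0 if children are VQ-coded
    std::array<std::unique_ptr<PairNode>, 16> child;  // indexed by 4*srcChild + dstChild
};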





Destination

Dst child

y

It allows progressive rendering of a light field. A few averages at the top levels can be extracted first to give a rough picture of the light field, with more detail extracted incrementally. It allows extraction of pre-filtered samples for distant views, much like texture mip-mapping. To facilitate this application, we chose to refine both source and destination patches when the variance is high. This way the filter kernel in line space maintains the same size in all four dimensions, ensuring no bias in the filter during rendering — much like texture mipmaps must refine both horizontally and vertically to accommodate arbitrary texture projections.

3.4 Discussion

We first presented a method to subdivide the unit sphere in a near-uniform fashion and store this representation in a compact data structure. We also described an algorithm which efficiently maps points on the sphere to elements of the subdivision, as well as a technique that efficiently computes such points during rendering.

It achieves high compression with minimal loss of quality. Consider, for example, the light field of the dragon shown in figure 13. We used our usual number of subdivision levels and a 24-bit RGB representation for each sample. Without compression,




Since the sphere is the natural domain of several physical quantities, our techniques find a home in applications which sample, store, and frequently retrieve (i) light intensity distributions, (ii) environment maps, (iii) topographical and geomagnetic data, (iv) 3D directions, and many other data sets. Our method can easily be combined with Fekete's sphere quadtrees [12] to allow hierarchical storage as well. We also presented a data structure which allows hierarchical storage of data defined over the Cartesian product of two spheres. We focused on light fields, but one can also use our techniques on bidirectional reflectance distribution functions (BRDFs) or form factors [17] [32]. The 2SP is an alternative to the data structures described by Levoy and Hanrahan [25] and Gortler et al. [19], both of which rely on two-plane parameterizations. The introduction compared our technique to prior work in terms of sampling quality. We now pit the two-plane parameterization against the 2SP, using the publicly available implementation of [25] for our comparisons.

Figure 8: The sphere-plane parameterization of a light field.



gzip and a 6 x 24 x 256 x 256 sampling rate; the equivalent 2SP model, for a substantially higher sampling rate, uses 20MB, only four times more. To be fair though, careful slab placement (and a choice of the four independent sampling rates tailored to an a priori known light field, especially one of a nearly flat object) allows an LIF to surpass the 2SP scheme; 2SP models are inflexible, their only degrees of control being the object placement within the unit sphere and the maximum level of sphere subdivision.

Rendering speed: Rendering the 2SP takes three times longer than a single-slab LIF, for comparable image quality and piecewise-constant reconstruction. For piecewise-linear reconstruction, the 2SP retrieves 9 samples per ray and applies barycentric interpolation, while LIFs retrieve 16 and apply quadralinear interpolation; since the main cost is sample retrieval, the 2SP thus narrows the single-slab rendering performance gap to a factor of two. Our results hint at two possible LIF performance improvements: cull overlapping slabs (if any) and introduce barycentric interpolation by treating each grid rectangle as two abutting triangles.

4 The Sphere-Plane Parameterization

Recall that a light field is represented by a scalar function L(x, y, z, θ, φ) of five parameters. In order to define the sphere-plane parameterization (SPP), let us consider a single direction d = (θ, φ). Now we define a 2D coordinate system (u, v) on the unique plane orthogonal to d that contains the point (x, y, z). Finally, we define w to be the distance from the origin to that plane. In the new coordinate system, any point (x, y, z) can be specified by the coordinates (u, v) of its projection along d and the distance w from the origin to the unique plane where it belongs. The geometry of this coordinate system is depicted in figure 8. Note that the (u, v, w) coordinates of a point can be obtained easily by applying a simple transformation to (x, y, z). Also, note that each of the points in the (u, v) plane corresponds to a line parallel to the direction given by d. If we restrict the light field to free space, we can drop the w component of the SPP and obtain a 4D function L(u, v, θ, φ). This is true because the light field remains constant along a given direction in free space.

Now we build a discrete representation of the SPP. First, we choose a set of disjoint pencils of directions, and approximate each pencil by the single direction d_k given by its center. We define the set of pencils using the geodesic approximation to the unit sphere [9] [12]. This choice has several advantages:

Retrieval speed: Rendering demands coherent access to the light field, which allows LIFs to showcase high performance via scan-conversion. But incoherent access, such as in the core of a ray tracer, paints a different picture. While the 2SP still takes twice the time to access a single sample randomly, surround LIFs which comprise multiple slabs — intersections with all of which must be computed — are inferior to the 2SP (usually by a factor of two to three). Retrieval quality: Surround light fields yield distracting artifacts on slab boundaries (visible seams) due to sudden changes in slab-to-object distance — causing discrepancy in filter kernel orientation between abutting slabs. To counter this effect, multiple slabs are employed to minimize these distance jumps; but even the surround dragon LIF of [25], which uses 16 slabs, does not entirely eliminate seams. No such sharp transitions in sampling rate are present in the 2SP.



Authoring: For synthetic scenes, LIFs may be built using off-the-shelf, efficient renderers; unfortunately, the 2SP requires a ray tracer which can be instructed to shoot individual rays, joining pairs of points on the sphere. For acquired data, both LIFs and the 2SP require resampling: the usual camera motion is on a sphere (suiting the 2SP, but requiring LIF resampling), with each view being a planar projection (suiting the LIF, but requiring 2SP resampling). The 2SP and LIFs can also be built from one another by resampling.

Compression: Compressing LIFs seems easier due to the rectangular domain over which they are defined. However, it is hard to exploit coherence across adjacent slabs of a surround field. In addition, standard compression schemes over rectangular domains cannot be directly applied to nonuniform samples (see section 1). By contrast, (tensor products of) spherical wavelets [33] can be applied onto a 2SP model to compress at once the full surround view. But even simple techniques, such as our hierarchical VQ scheme can yield very good results: the 1/4 surround dragon LIF uses 6MB with 10-bit VQ,

- each pencil is represented by a triangle T_k of the geodesic approximation and vice versa,
- the set of directions d_k gives a nearly uniform sample of directional space,
- the triangles T_k give a nearly uniform tessellation of the unit sphere,
- it allows easy construction of the set of pencils using adaptive subdivision, and
- it provides excellent support for multiresolution representations on the sphere (see [33]).
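To illustrate the coordinate system underlying the SPP, the following sketch (our code) builds, for a given unit pencil direction d, an orthonormal frame and returns the (u, v, w) coordinates of a point. The specific choice of the in-plane u axis is an arbitrary convention of this sketch, since the paper leaves that orientation free (it is later chosen, for example, to shrink the projected bounding box).

#include <cmath>

struct Vec3 { double x, y, z; };

static double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
}
static Vec3 normalizeVec(const Vec3& v) {
    double len = std::sqrt(dot(v, v));
    return { v.x / len, v.y / len, v.z / len };
}

struct SppCoords { double u, v, w; };

// Express a point (x, y, z) in the (u, v, w) coordinate system attached to a
// unit pencil direction d: (u, v) are the coordinates of the point's parallel
// projection along d, and w is its signed distance along d (the coordinate
// that is dropped for objects in free space).
SppCoords toSppCoords(const Vec3& p, const Vec3& d) {
    // Build an orthonormal basis (uAxis, vAxis, d).
    Vec3 helper = (std::fabs(d.x) < 0.9) ? Vec3{1, 0, 0} : Vec3{0, 1, 0};
    Vec3 uAxis = normalizeVec(cross(helper, d));
    Vec3 vAxis = cross(d, uAxis);
    return { dot(p, uAxis), dot(p, vAxis), dot(p, d) };
}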

The coarsest level of the geodesic approximation, level 0, has 20 triangles. Each subsequent level of the approximation has 4 times more triangles than the previous one. Such a tessellation can be stored as a hierarchy of triangles or, equivalently, pencils with different resolutions. Furthermore, the hierarchy does not need to be uniform: a non-uniform set of pencils can easily be constructed by adaptively subdividing according to some error measure. Given a set of directional samples d_k, we can now model an object using images obtained by parallel projection along each d_k. The choice


Figure 10: 2D analogy of the depth map approximation of an object's surface. The blobby object is surrounded by a box representing an imaginary volumetric grid. Inside the grid the bold line represents the surface's approximation given by the depth map along direction d_k. At rendering time rays are cast starting from the eye and passing through the vertices of the pencil's triangle T_k. The top ray hits the surface's approximation after visiting 3 cells in the vertical direction. The bottom ray misses the object after visiting 5 cells. Note that the number of cells traversed decreases with the (solid) angle subtended by a pencil.

Figure 9: Error measures. Left: the longitudinal or translational error. Right: the angular or rotational error.

of projection plane is what gives the SPP its "plane part". Regardless of the choice of plane, we can orient the (u, v) coordinate system on it so that it allows uniform sampling in both directions u and v. Furthermore, for volumetric data and arbitrary scenes with occlusion, we can define multiple projection planes along the w dimension for each d_k. Such planes could also be chosen to form a uniform grid together with the u and v discretizations. We further elaborate on the issue of choosing projection plane(s) after giving a rendering algorithm for an SPP model.

(1) Store a single image projected onto a plane orthogonal to the pencil direction and located at the center of the object.
(2) Store a single image projected onto the plane that provides the best approximation to the object's surface along the pencil direction.
(3) Store more than one image projected onto two or more different planes along the pencil direction.
(4) Store a depth map with Z-buffer information.

4.1 Building and Rendering an SPP Model

The rendering algorithm we use is an adapted version of the lumigraph algorithm [19]. Given a viewing direction and a field of view, we first determine the level-0 pencils covered by the viewing frustum. We then traverse the subtrees rooted at the visible level-0 pencils and render a texture-mapped triangle for each leaf pencil. The texture map for a pencil is given by the image associated with the direction through the pencil's center. In order to determine the texture coordinates, the vertices of each triangle need to be projected onto the plane containing the image using a perspective projection centered at the eye.

The location of the projection plane containing a pencil's image is relevant since it determines how the image is warped and projected onto the pencil's triangle. Specifically, the plane approximates the geometry of the object as viewed along the pencil direction. Such an approximation produces discontinuities or seams like those shown in figure 14. These are related to the errors incurred by the light field approximation. In the general case, we distinguish two error measures related to the light field approximation: longitudinal (translational) and angular (rotational) error (see figure 9). These two measures stem from the two parameters of a light field: position and direction. Longitudinal errors occur when we approximate the object's geometry by a single image plane. Figure 9 (left) illustrates this problem: the viewer does not see the far tip of the spout, but some point at a distance from the tip. Hence, the choice of projection plane certainly affects the quality of the rendering. Angular errors are a consequence of discretization in directional space. This is illustrated in figure 9 (right), where the light source reflected by the teapot's shiny surface misses the viewer by some angle. This occurs regardless of the absence of longitudinal error for that particular ray. Bounds for angular errors are given by the resolution of the discretization of our representation. The angular error cannot exceed the angle between a discrete direction d_k and the furthest direction within the pencil it approximates. This is an advantage of uniform sampling: the angular error is constant and bounded and depends only on the directional sampling rate. This has several advantages that will be described later. As for the choice of projection planes, we studied the options listed above.

Option (1) works well for objects like the clock and the teapot in figures 14 and 15, which have a fairly round shape. Option (2) requires minimizing the longitudinal error measure by optimizing both position and orientation of the projection plane. The results we obtained in this case did not improve on those obtained for option (1). We never tried option (3): even though it would be the right choice for modeling multiple objects, for a single object option (4) uses the same amount of storage and provides a much better approximation to the object's geometry.

The drawback of option (4) is that it requires a more expensive rendering algorithm. This is illustrated in figure 10. For each vertex of a pencil triangle, a ray is constructed starting at the eye position and passing through that vertex. The ray is then cast into an imaginary volumetric grid, where the depth information represents a surface approximating the object's geometry. The algorithm traverses the volumetric grid incrementally until an intersection with the surface is found. Typically, less than four or five iterations are sufficient to find an intersection, depending on the size (solid angle) of the pencil. However, it is often necessary to subdivide the triangle, because either some of the rays miss the object, or the difference between the depths along two rays is too large. In fact this constitutes the current bottleneck of our implementation. The triangle-subdivision algorithm is steered by two threshold values,

as illustrated in figure 11. Any triangle whose area in pixels is larger than the first threshold is subdivided, even if there is nothing but background behind it. The reason is that small or skinny objects may not be detected simply because no triangle vertex happens to project onto the image (or the depth map) of the object. This results in certain objects vanishing, or appearing broken, like the flagpole of the paddle boat in figure 16. Ideally we would like to reconstruct all objects in the scene, even the small and skinny ones. However, this would require setting the first threshold to one pixel, in which case we would no longer have an image-based rendering algorithm, but a modified (image) ray-tracing algorithm. The second threshold has the opposite effect: triangles are subdivided until their area in pixels falls below it. This ensures that the boundaries of the














Figure 12: Reconstruction kernels: (a) piecewise constant, (b) piecewise linear with small support, and (c) piecewise linear with large support. The numbers next to the vertices indicate α-value assignments. In (c) the support may reach anywhere between 9 and 12 neighboring sample directions.



we can still fall back to fast lower-quality rendering at lower directional resolutions. Also, we can easily modify it to choose the set of pencils by adaptive subdivision. A subdivision algorithm would start with a given level of the subdivision of the sphere. Then, for each pencil, it would compute an image and determine the error incurred by that image approximation. A possible error measure is, for example, the longitudinal error, as described above. Next, pencils would be sorted according to average error per unit solid angle. The solid angle associated with a level-0 pencil is π/5 steradians, one twentieth of the sphere's solid angle. Solid angles for higher-level pencils are computed by dividing by four every time a new subdivision level is started. After sorting all pencils by average error, the pencil with the largest error would be subdivided using the refinement procedure described for the 2SP above. Note that we associate with each new pencil a new triangle of size 1/4 of the original one. However, only three new directions and, thus, three new images are created in the process. Given the new set of pencils and images, the sorted list of pencils is updated according to the new information available. The subdivision process is then recursively repeated until the maximum number of pencils is reached. This algorithm has the advantage of concentrating the directional samples near directions with larger error measures. The procedure works independently of the illumination properties of the object or its representation. Furthermore, the nature of our representation is intrinsically hierarchical and, thus, can be viewed as a multiresolution representation. At rendering time we only use those pencils of the hierarchy that we can afford to render.
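A minimal sketch of this error-driven subdivision loop is given below. It is schematic: measureError stands in for rendering a pencil's image and evaluating its approximation error, makeChildren stands in for the 4-to-1 split of the pencil's triangle, and all names are ours.

#include <cmath>
#include <cstddef>
#include <queue>
#include <vector>

// A pencil in the adaptive-subdivision sketch: its level in the geodesic
// hierarchy fixes its solid angle (level 0 has 20 pencils of pi/5 steradians
// each, and every level divides the solid angle by four).
struct Pencil {
    int id = 0;
    int level = 0;
    double error = 0.0;  // error of the pencil's image approximation
    double solidAngle() const {
        const double kPi = 3.14159265358979323846;
        return (kPi / 5.0) / std::pow(4.0, level);
    }
    double errorPerSteradian() const { return error / solidAngle(); }
};

struct LowerError {
    bool operator()(const Pencil& a, const Pencil& b) const {
        return a.errorPerSteradian() < b.errorPerSteradian();  // max error on top
    }
};

// Repeatedly split the pencil with the largest average error per unit solid
// angle until the pencil budget is reached.  measureError(p) evaluates the
// image approximation error of pencil p; makeChildren(p) returns its four
// children (the paper notes that only three new images need to be rendered).
template <typename MeasureError, typename MakeChildren>
std::vector<Pencil> adaptivePencilSet(std::vector<Pencil> initial, std::size_t budget,
                                      MeasureError measureError, MakeChildren makeChildren) {
    std::priority_queue<Pencil, std::vector<Pencil>, LowerError> queue;
    for (Pencil& p : initial) { p.error = measureError(p); queue.push(p); }
    std::size_t count = initial.size();
    while (!queue.empty() && count + 3 <= budget) {
        Pencil worst = queue.top();
        queue.pop();
        for (Pencil child : makeChildren(worst)) {  // one pencil becomes four
            child.error = measureError(child);
            queue.push(child);
        }
        count += 3;
    }
    std::vector<Pencil> result;
    while (!queue.empty()) { result.push_back(queue.top()); queue.pop(); }
    return result;
}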



Figure 11: The effect of the two threshold values on the triangle subdivision of the image plane for a rendering of the bunny.

 



objects are properly sampled. Unfortunately, it needs to be set to subpixel values if we do not want the object boundaries to appear jagged in the image. All the images rendered for this paper using depth correction were generated with this threshold set to 0.5.

Once the texture coordinates of the triangle have been determined, its associated texture can be rendered using various reconstruction kernels. We have employed piecewise constant and piecewise linear kernels. For a constant kernel we just render the triangle associated with a given pencil (see figure 12(a)). For a linear kernel we use standard graphics hardware to perform bilinear interpolation in (u, v) space. In directional space we use α-blending to implement spherical linear interpolation. To illustrate this procedure, consider the center triangle in figure 12(b) and note that its associated image applies to all the pixels inside the shaded hexagon. In order to implement linear interpolation between that image and its three neighbors', we define three quadrilaterals, one for each neighbor, and associate to their vertices the α-values shown in the figure. When rendering the quadrilaterals of two neighboring images, interpolation is achieved by α-blending the quadrilaterals. Figure 12(c) shows an alternative larger support for linear interpolation. We observed that the smaller support was likely to produce artifacts near the vertices with α-values of 0.5. Widening the support of the linear kernel eliminated these artifacts.

We now describe how to construct a model of an object using the SPP. First we center the object at the origin and scale it to fit inside the unit sphere. Then we select the set of directions d_k that we want to sample for the model. For each direction d_k, we compute a parallel projection onto a plane, which can be chosen according to different criteria. First, we can orient the (u, v) coordinate system so that the projection of the object's bounding box along d_k has minimum area. Second, we can choose the plane's position and orientation by minimizing the average longitudinal error using some error norm (e.g., least squares). This can be accomplished by using depth information from the renderer's Z-buffer. Alternatively, we can just store the depth information together with the image obtained by orthographic projection along d_k.

It remains to describe how the set of pencils is constructed. Our current implementation computes a uniform subdivision of directional space. However, given a high resolution model,



4.2 Implementation and Results

Results from our preliminary implementation of the SPP are shown in figures 14 to 16. The captions have a full description of each image. We have both acquisition and rendering programs for SPP models of synthetic objects. The acquisition program allows optimization of each pencil's plane position and orientation to minimize longitudinal error. The rendering program supports the different reconstruction kernels. We have also implemented a rendering algorithm that uses depth maps for improved image quality. Our preliminary results show that we can render images from an SPP model at interactive rates. Rendering a model with depth correction still takes more than a second per frame on an Onyx2, but we expect substantial improvement after carefully optimizing our renderer. Also note that there is a time-quality tradeoff controlled by the two thresholds of the triangle subdivision algorithm. The figures were produced by setting these parameters for optimal quality rather than performance.

Our data sets are of moderate size, up to 175 Mbytes for the largest set, using a very simple compression scheme that achieves






a 60:1 compression ratio. We expect to improve on this figure by implementing a hierarchical compression scheme that warps images associated with neighboring pencils so that their correlation is maximized. Differencing after warping may yield huge storage savings for images associated with pencils near the bottom of the hierarchy. Also, we are looking into multispectral compression for both images and depth maps. One important detail is that our depth correction algorithm requires the outline of the object to be clearly defined. Therefore, lossy compression is not acceptable for the most significant bit of a depth map; otherwise, noticeable seams appear at the boundaries of the object due to the loss of information.

Our images also illustrate several advantages accruing from the directional uniformity of our sampling. First, interpolation using linear reconstruction kernels is not critical, particularly when depth correction is used. This allows us to avoid the blurring that can significantly degrade image quality, as in the image in figure 14. Also, note that using uniform parallel projections provides bounds for the angular error. In fact the angular error is uniform and bounded and depends only on the directional sampling rate. This allows for fast ray-depth map intersections, as well as more accurate illumination effects like the highlights in figures 14 and 15 and the refractive sphere in figure 14.

Another advantage our representation shares with the 2SP is support for adaptive frame-rate control. Given a model made of N pencils, the average number of pencils covered by a field of view is proportional to N times the fraction of the sphere's solid angle subtended by the viewing frustum.













 





fall into the category of modeling and rendering using image warping and interpolation, as described in the introduction of this paper. Our work, however, is more closely related to the fundamental representation of a light field. Work in that category was pioneered by Levoy and Hanrahan [25] and Gortler et al. [19]. More recently results have been obtained that improve the performance of the lumigraph [36], relate it to visibility events [20], and add illumination control to light fields [39]. Approaches similar to the SPP have been used before for different purposes, including ray-classification [2], global illumination [3], characterization of visibility events [8], and rendering of trees [27]. Our SPP approach most resembles Max’s [27], but he uses a fixed set of 22 directions and employs surface normal information at each pixel and subpixel masks in his rendering. Pulli et al. [30] use a similar approach based on perspective projections in developing a hybrid image/geometry based modeling scheme. Finally, error measures for image-based representations have been discussed by Maciel and Shirley [26], Shade et al. [35], and Lengyel and Snyder [23].

6 Discussion

Non-uniform light field parameterizations have several limitations. Models based on cylindrical and central planar projections either do not cover the entire directional space or show significant variations in image quality as a function of view direction. Our parameterizations, however, allow (nearly) uniform sampling of the light field in both directional space and 3D point space. Uniform sampling guarantees error bounds in all dimensions of the light field, so provisions can be made to reduce or even avoid interpolation altogether. Image quality is largely independent of camera position or orientation. Furthermore, the hierarchical subdivision schemes presented in this paper allow for multiresolution support and adaptive non-uniform construction of light field models. Other features of our models are fast reconstruction algorithms, progressive transmission and progressive rendering, and adaptive frame-rate control.

One drawback of our parameterizations is acquisition from real-world images. The 2SP is difficult to construct by taking images from the real world, since samples on the sphere require special camera equipment. An SPP representation, however, is easier to obtain by placing the camera at a distance from the object sufficiently large to give a reasonable parallel projection effect. Moving the camera away from the object poses camera calibration problems, though.

Finally, we compare our parameterizations against each other. Both parameterizations sample line space in a uniform fashion, and our experience shows that both implementations can be fine-tuned to render at interactive rates and provide similar functionality, like interpolation, hierarchical multiresolution and compression. The 2SP has the added advantage that artifacts of directional resolution are less noticeable: they appear as "hair" instead of "seams". Also, its implementation has a better discretization of directional space. Using a triangulated icosahedron as a generator provides a slightly more uniform sampling, and the implicit representation of the data structure is substantially more efficient in time and space, although more difficult to implement. Conversely, the SPP stores the discretization of directional space explicitly, which is easier to understand and implement. It also allows separate control of the directional and positional resolutions of a model. Suppose, for example, that we have storage for a light field model with a given number of samples, and suppose that we want to render images of the model at a fixed resolution, so that the object fits entirely inside each image. Our experiments show that a spatial resolution at which each sample covers 4 pixels of the screen is a good choice. Using the SPP, the remaining storage is left for sampling directional space which, we assume, is enough for the purpose of our application. The 2SP,



So we can easily obtain a count of the polygons to be rendered for a given view, by multiplying the expected number of visible pencils by the number of polygons required by the current reconstruction kernel. Given that count of polygons, we can estimate the rendering time for a frame and decide how far down the pencil hierarchy we want to go in order to keep an animation running at a given frame rate. This can also be useful to decide whether to render an image-based or a geometry-based model in a hybrid rendering system.

Finally, the SPP is particularly well suited to extension to 5D models, either volumetric models or models of arbitrary scenes with occlusions. Note that projection planes, and their associated images and depth maps, can be placed anywhere along the w dimension, the fifth dimension of the SPP, which we discard when we model objects in free space. Given a set of camera parameters, rendering proceeds in the same fashion as above. However, when a given pencil is processed, the associated image and depth map are found by searching along the w dimension (or, equivalently, along the pencil's sample direction d_k) for the closest projection plane in front of the eye.
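Returning to the frame-rate control just described, the arithmetic is simple enough to sketch (our code; the 20 * 4^l pencil count assumes a uniform hierarchy, and in practice the polygon budget would be calibrated against measured rendering times).

#include <cstddef>

// With N pencils in the model, a viewing frustum of solid angle 'fovSolidAngle'
// covers on average N * fovSolidAngle / (4*pi) of them; multiplying by the
// polygon count of the reconstruction kernel gives the expected polygon load.
double expectedPolygons(std::size_t pencilCount, double fovSolidAngle,
                        int polygonsPerPencil) {
    const double kPi = 3.14159265358979323846;
    double pencilsInView = double(pencilCount) * fovSolidAngle / (4.0 * kPi);
    return pencilsInView * polygonsPerPencil;
}

// Pick the deepest uniform hierarchy level whose expected polygon load still
// fits the per-frame budget (level l of the geodesic hierarchy has 20 * 4^l
// pencils); rendering then stops at that level of the pencil hierarchy.
int deepestAffordableLevel(double fovSolidAngle, int polygonsPerPencil,
                           double polygonBudget, int maxLevel) {
    int level = 0;
    for (int l = 0; l <= maxLevel; ++l) {
        std::size_t pencils = std::size_t(20) * (std::size_t(1) << (2 * l));
        if (expectedPolygons(pencils, fovSolidAngle, polygonsPerPencil) <= polygonBudget)
            level = l;
        else
            break;
    }
    return level;
}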

5 Related Work
The name and concept of the plenoptic function were coined by Adelson and Bergen [1], although the idea of a light field is due to Gershun and dates back to 1939 [15]. Epipolar geometry was first borrowed from computational geometry by Faugeras and Robert [11]. It later inspired the work on plenoptic modeling by McMillan and Bishop [28]. The first attempts at using images for scene representations are due to Chen and Williams [5], Laveau and Faugeras [22], and Faugeras et al. [10]. Most of the above authors use central planar projections to represent their samples of the light field. McMillan and Bishop, however, use cylindrical projections, as does Chen's QuickTime VR system [4]. All of these approaches use well-known image warping and manipulation techniques [38] [18] for rendering new images from a given set. Other researchers have concentrated on rendering by view interpolation [34] [6] and on improving geometry rendering by incorporating image-based techniques [26] [7] [35]. All of these references


6 Discussion
Non-uniform light-field parameterizations have several limitations. Models based on cylindrical and central planar projections either do not cover the entire directional space or show significant variations in image quality as a function of view direction. Our parameterizations, however, allow (nearly) uniform sampling of the light field in both directional space and 3D point space. Uniform sampling guarantees error bounds in all dimensions of the light field, so provisions can be made to reduce interpolation or avoid it altogether. Image quality is largely independent of camera position and orientation. Furthermore, the hierarchical subdivision schemes presented in this paper allow for multiresolution support and adaptive, non-uniform construction of light field models. Other features of our models are fast reconstruction algorithms, progressive transmission and progressive rendering, and adaptive frame-rate control.

One drawback of our parameterizations is acquisition from real-world images. The 2SP is difficult to construct by taking images of the real world, since acquiring samples on the sphere requires special camera equipment. An SPP representation, however, is easier to obtain by placing the camera at a distance from the object sufficiently large to give a reasonable parallel-projection effect. Moving the camera away from the object poses camera calibration problems, though.

Finally, we compare our parameterizations against each other. Both sample line space in a uniform fashion, and our experience shows that both implementations can be fine-tuned to render at interactive rates and provide similar functionality, such as interpolation, hierarchical multiresolution, and compression. The 2SP has the added advantage that artifacts of directional resolution are less noticeable: they appear as “hair” instead of “seams”. Its implementation also has a better discretization of directional space: using a triangulated icosahedron as a generator provides a slightly more uniform sampling, and the implicit representation of the data structure is substantially more efficient in time and space, although it is more difficult to implement. Conversely, the SPP stores the discretization of directional space explicitly, which is easier to understand and implement. It also allows separate control of the directional and positional resolutions of a model. Suppose, for example, that we have storage for a fixed number of light field samples, and that we want to render images in which the object fits entirely inside each frame. Our experiments show that a good choice of spatial resolution is one in which each sample covers 4 pixels of the screen. Using the SPP, the remainder of the sample budget is available for sampling directional space, which, we assume, is enough for the purposes of our application (a small sketch of this budget split appears at the end of this section). The 2SP, however, has its spatial resolution tied to its directional resolution and requires 4 times the number of samples of the SPP to achieve the same spatial resolution: to guarantee the same number of spatial samples for every possible direction, we would have to store correspondingly more samples. The number of directional samples of the 2SP would then obviously be higher and the image quality better.

The SPP has a few other advantages. First, the rendering time of a 2SP model depends on the resolution of the output image, while the rendering time of an SPP model depends on the camera's field of view. Larger images can thus be rendered without penalty using the SPP, as long as the camera's field of view remains unchanged. Second, SPP models can be built directly using planar projections, while 2SP models require either ray tracing or some type of resampling (or rebinning) to be constructed. Finally, the SPP representation can be more easily extended to handle 5D light fields of arbitrarily complex scenes.
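The budget split mentioned above can be made concrete with a short, purely illustrative calculation; the storage budget and image resolution below are assumptions chosen for the example, not figures from our experiments.

// Illustrative only: splitting a fixed SPP sample budget between positional
// and directional resolution. Each positional sample is assumed to cover
// 4 screen pixels, i.e. half the output image resolution per axis.
#include <cstdio>

int main() {
    const long long totalSamples = 1LL << 30;  // assumed storage budget, in samples
    const int imageRes = 512;                  // assumed output image resolution

    const int spatialRes = imageRes / 2;       // 4 pixels per positional sample
    const long long positionalSamples =
        static_cast<long long>(spatialRes) * spatialRes;
    const long long directionalSamples = totalSamples / positionalSamples;

    std::printf("positional samples: %lld (%d x %d)\n",
                positionalSamples, spatialRes, spatialRes);
    std::printf("directional samples available: %lld\n", directionalSamples);
    return 0;
}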

Acknowledgments
The first and third authors would like to thank Nina Amenta, Steve Gortler, Al Bovik, Pat Hanrahan and Marc Levoy for useful discussions, and Makoto Sadahiro and Brian Curless for providing the data used for building their light field models. The second author would like to thank Leo Guibas, Pat Hanrahan, and Marc Levoy for their guidance, Xing Chen and Lucas Pereira for many helpful discussions, and Matt Pharr for tweaking his ray tracer to generate 2SP models.

References

[1] E. H. Adelson and J. R. Bergen. The plenoptic function and the elements of early vision. In Landy and Movshon, editors, Computational Models of Visual Processing, pages 3–20, Cambridge, MA, 1991. MIT Press.
[2] J. Arvo and D. Kirk. Fast ray tracing by ray classification. In Proceedings of SIGGRAPH '87 (Anaheim, CA, July 27–31, 1987). In Computer Graphics, 21, 4 (July 1987), pages 269–278, New York, 1987. ACM SIGGRAPH.
[3] C. Buckalew and D. Fussell. Illumination networks: Fast realistic rendering with general reflectance functions. In Proceedings of SIGGRAPH '89 (Boston, MA, July 31–August 4, 1989). In Computer Graphics, 23, 4 (August 1989), pages 89–98, New York, 1989. ACM SIGGRAPH.
[4] S. E. Chen. QuickTime VR - an image-based approach to virtual environment navigation. In Proceedings of SIGGRAPH '95 (Los Angeles, CA, August 6–11, 1995). In Computer Graphics Proceedings, Annual Conference Series, pages 29–38. ACM SIGGRAPH, 1995.
[5] S. E. Chen and L. Williams. View interpolation for image synthesis. In Proceedings of SIGGRAPH '93 (Anaheim, CA, August 1–6, 1993). In Computer Graphics Proceedings, Annual Conference Series, pages 279–288. ACM SIGGRAPH, 1993.
[6] L. Darsa, B. C. Silva, and A. Varshney. Navigating static environments using image-space simplification and morphing. In 1997 Symposium on Interactive 3D Graphics, pages 25–34, New York, 1997. ACM.
[7] P. E. Debevec, C. J. Taylor, and J. Malik. Modeling and rendering architecture from photographs: A hybrid geometry- and image-based approach. In Proceedings of SIGGRAPH '96 (New Orleans, LA, August 4–9, 1996). In Computer Graphics Proceedings, Annual Conference Series, pages 11–20. ACM SIGGRAPH, 1996.
[8] F. Durand, G. Drettakis, and C. Puech. The 3D visibility complex: A new approach to the problems of accurate visibility. In Seventh Eurographics Workshop on Rendering, pages 245–256, Porto, Portugal, June 1996. Eurographics.
[9] G. Dutton. Locational properties of quaternary triangular meshes. In Proceedings of the Fourth International Symposium on Spatial Data Handling, pages 901–910, July 1990.
[10] O. Faugeras, S. Laveau, L. Robert, G. Csurka, and C. Zeller. 3-D reconstruction of urban scenes from sequences of images. Technical Report RR-2572, INRIA, Sophia-Antipolis, France, June 1995.
[11] O. Faugeras and L. Robert. What can two images tell us about a third one? Technical Report RR-2018, INRIA, Sophia-Antipolis, France, July 1993.
[12] G. Fekete. Rendering and managing spherical data with sphere quadtrees. In Proceedings of Visualization '90, pages 176–186, Los Alamitos, California, 1990. IEEE Computer Society Press.
[13] P. C. Gasson. Geometry of Spatial Forms: Analysis, Synthesis, Concept Formulation and Space Vision for CAD. Ellis Horwood Series in Mathematics and its Applications. Ellis Horwood, Chichester, UK, 1983.
[14] A. Gersho and R. M. Gray. Vector Quantization and Signal Compression. The Kluwer International Series in Engineering and Computer Science. Kluwer, Boston, MA, 1995.
[15] A. Gershun. Svetovoe Pole (The Light Field, in English). Journal of Mathematics and Physics, XVIII:51–151, 1939.
[16] A. S. Glassner, editor. An Introduction to Ray Tracing. Academic Press, San Diego, CA, 1989.
[17] J. S. Gondek, G. W. Meyer, and J. G. Newman. Wavelength dependent reflectance functions. In Computer Graphics Proceedings, Annual Conference Series, pages 213–220, New York, NY, July 1994. ACM SIGGRAPH. Conference Proceedings of SIGGRAPH '94.
[18] R. C. Gonzalez and R. E. Woods. Digital Image Processing. Addison-Wesley, Reading, MA, 1992.
[19] S. J. Gortler, R. Grzeszczuk, R. Szeliski, and M. F. Cohen. The lumigraph. In Proceedings of SIGGRAPH '96 (New Orleans, LA, August 4–9, 1996). In Computer Graphics Proceedings, Annual Conference Series, pages 43–54. ACM SIGGRAPH, 1996.
[20] X. Gu, S. J. Gortler, and M. F. Cohen. Polyhedral geometry and the two-plane parameterization. In Eighth Eurographics Workshop on Rendering, pages 1–12, Saint Etienne, France, June 1997. Eurographics.
[21] H. Laporte, E. Nyiri, M. Froumentin, and C. Chaillou. A graphics system based on quadrics. Computers & Graphics, 19(2):251–260, 1995.
[22] S. Laveau and O. Faugeras. 3-D scene representation as a collection of images and fundamental matrices. Technical Report RR-2205, INRIA, Sophia-Antipolis, France, February 1994.
[23] J. Lengyel and J. Snyder. Rendering with coherent layers. In Proceedings of SIGGRAPH '97 (Los Angeles, CA, August 3–8, 1997). In Computer Graphics Proceedings, Annual Conference Series, pages 223–242. ACM SIGGRAPH, 1997.
[24] M. Levoy and P. Hanrahan. Light field rendering. In Proceedings of SIGGRAPH '96 (New Orleans, LA, August 4–9, 1996). In Computer Graphics Proceedings, Annual Conference Series, pages 31–42. ACM SIGGRAPH, 1996.
[25] P. W. C. Maciel and P. Shirley. Visual navigation of large environments using textured clusters. In 1995 Symposium on Interactive 3D Graphics, pages 95–102. ACM SIGGRAPH, 1995.
[26] N. Max. Hierarchical rendering of trees from precomputed multi-layer z-buffers. In Seventh Eurographics Workshop on Rendering, pages 166–175, Porto, Portugal, June 1996. Eurographics.
[27] L. McMillan and G. Bishop. Plenoptic modeling: An image-based rendering system. In Proceedings of SIGGRAPH '95 (Los Angeles, CA, August 6–11, 1995). In Computer Graphics Proceedings, Annual Conference Series, pages 39–46. ACM SIGGRAPH, 1995.
[28] W. K. Pratt. Digital Image Processing. John Wiley & Sons, New York, NY, 1978.
[29] K. Pulli, M. F. Cohen, T. Duchamp, H. Hoppe, L. Shapiro, and W. Stuetzle. View-based rendering: Visualizing real objects from scanned range and color data. In Eighth Eurographics Workshop on Rendering, pages 23–34, Saint Etienne, France, June 1997. Eurographics.
[30] L. A. Santaló. Integral Geometry and Geometric Probability. Addison-Wesley, Reading, MA, 1976.
[31] M. Sbert. An integral geometry based method for fast form-factor computation. In R. J. Hubbold and R. Juan, editors, Computer Graphics Forum, volume 12(3), pages C-409–C-420, C-538, Oxford, UK, September 1993. Eurographics Association, Blackwell. Proceedings of Eurographics '93.
[32] P. Schröder and W. Sweldens. Spherical wavelets: Efficiently representing functions on the sphere. In Proceedings of SIGGRAPH '95 (Los Angeles, CA, August 6–11, 1995). In Computer Graphics Proceedings, Annual Conference Series, pages 161–172. ACM SIGGRAPH, 1995.
[33] S. M. Seitz and C. R. Dyer. View morphing. In Proceedings of SIGGRAPH '96 (New Orleans, LA, August 4–9, 1996). In Computer Graphics Proceedings, Annual Conference Series, pages 21–30. ACM SIGGRAPH, 1996.
[34] J. Shade, D. Lischinski, D. H. Salesin, T. DeRose, and J. Snyder. Hierarchical image caching for accelerated walkthroughs of complex environments. In Proceedings of SIGGRAPH '96 (New Orleans, LA, August 4–9, 1996). In Computer Graphics Proceedings, Annual Conference Series, pages 75–82. ACM SIGGRAPH, 1996.
[35] P.-P. Sloan, M. F. Cohen, and S. J. Gortler. Time critical lumigraph rendering. In 1997 Symposium on Interactive 3D Graphics, pages 17–23, New York, 1997. ACM.
[36] H. Solomon. Geometric Probability, volume 28 of CBMS-NSF Regional Conference Series in Applied Mathematics. SIAM, Philadelphia, PA, 1978.
[37] G. Wolberg. Digital Image Warping. IEEE Computer Society Press, Los Alamitos, CA, 1990.
[38] T.-T. Wong, P.-A. Heng, S.-H. Or, and W.-Y. Ng. Image based rendering with controllable illumination. In Eighth Eurographics Workshop on Rendering, pages 13–22, Saint Etienne, France, June 1997. Eurographics.
[39] J. Ziv and A. Lempel. A universal algorithm for sequential data compression. IEEE Transactions on Information Theory, 23:337–343, 1977.


Figure 13: The dragon light field. The light field, covering a 1/4 surround view, took 5 days to generate on 12 R4400 processors. For these pictures, each measuring 300 x 300 pixels, 4 rays are cast and averaged per pixel (accumulation-buffer supersampling), while the light field is resampled using piecewise-constant reconstruction. When our interactive light field viewer runs on a single 195 MHz R10000 processor and we cast only one ray per pixel, we can move the viewpoint at a rate of 6 frames per second.

Figure 14: This SPP model was obtained by ray tracing a 60000-polygon geometric model using LightWave, a commercial ray tracer. The scene contains multiple light sources, reflection, and refraction, and the model took four days to construct. We built it using a perspective camera located relatively far from the object, generating a total of 1280 images of 256 x 256 pixels each. Due to the lack of depth information, the images on the left, reconstructed using a constant kernel, exhibit seams. The images on the right were rendered using a linear kernel. All images are 512 x 512 pixels and took about 1/40 of a second to render on an Onyx2.
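Whether a perspective camera placed far from the object is an acceptable stand-in for the parallel projections the SPP assumes can be judged with a rough rule of thumb (an assumption for illustration, not a procedure from this paper): the rays through an object bounded by a sphere of radius R, seen from distance D, span a cone of half-angle asin(R/D). The sketch below simply evaluates that bound; the radius and distance are made-up example values.

// Rough check: angular deviation of a distant perspective camera's rays
// from parallel. Values are illustrative assumptions.
#include <cmath>
#include <cstdio>

int main() {
    const double objectRadius = 1.0;     // assumed bounding-sphere radius
    const double cameraDistance = 40.0;  // assumed camera distance

    const double pi = std::acos(-1.0);
    const double halfAngleRad = std::asin(objectRadius / cameraDistance);
    std::printf("maximum ray deviation from parallel: %.3f degrees\n",
                halfAngleRad * 180.0 / pi);
    return 0;
}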


Figure 15: These two SPP models correspond to the (classic) Utah teapot and the Stanford bunny. Each model was generated overnight on a high-end graphics workstation and contains 20480 images of 256 x 256 pixels each. The images were compressed using JPEG compression. The depth maps were separated into two components, a stencil bit and an 8-bit integer, and compressed using Lempel-Ziv-Welch (LZW) and JPEG compression, respectively. We achieved a 60:1 compression ratio; each data set is roughly 170 Mbytes. All images are 512 x 512 pixels. The left images show the data rendered using constant reconstruction; the middle images show the data rendered using linear reconstruction. Each image took about 2–4 seconds to render using depth correction on an Onyx2. The images on the right correspond to the objects' geometry and are included for reference.
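The depth-map separation described in the caption could look roughly like the following sketch; the background sentinel, depth range, and structure names are assumptions for illustration, and the compression itself (LZW for the stencil, JPEG for the 8-bit depth) would be applied to the two output buffers afterwards.

// Sketch: split a floating-point depth map into a 1-bit stencil (hit/no hit)
// and an 8-bit quantized depth. Names and conventions are illustrative.
#include <cstdint>
#include <vector>

struct SeparatedDepth {
    std::vector<std::uint8_t> stencil;  // 1 = ray hit the object, 0 = background
    std::vector<std::uint8_t> depth8;   // quantized depth, meaningful where stencil == 1
};

SeparatedDepth separateDepthMap(const std::vector<float>& depth,
                                float nearZ, float farZ,
                                float backgroundSentinel) {
    SeparatedDepth out;
    out.stencil.reserve(depth.size());
    out.depth8.reserve(depth.size());
    const float scale = 255.0f / (farZ - nearZ);
    for (float z : depth) {
        const bool hit = (z != backgroundSentinel);
        out.stencil.push_back(hit ? 1 : 0);
        float q = hit ? (z - nearZ) * scale : 0.0f;
        if (q < 0.0f) q = 0.0f;
        if (q > 255.0f) q = 255.0f;
        out.depth8.push_back(static_cast<std::uint8_t>(q + 0.5f));
    }
    return out;
}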

Figure 16: These four images correspond to a paddle-boat model illuminated by 3 light sources and texture-mapped with 20 different textures. The left images were rendered using an SPP model; the right images were rendered using the original geometry. The SPP model is made of 1280 images of 512 x 512 pixels each, rendered overnight on a high-end graphics workstation, and it includes depth information. Each left image has a resolution of 512 x 512 pixels and took between 1 and 2 seconds to render on an Onyx2.

