Reshading Light Fields
Daniel Meneveaux and Alain Fournier
Department of Computer Science, University of British Columbia
[email protected]
Abstract
A category of methods usually called image-based rendering has appeared recently to produce realistic images of objects. Instead of creating models from geometric primitives which are then rendered according to the point of view and the illumination given, the light reflected/emitted from objects is captured from enough different points of view to be able to reconstruct the light received from the objects from any new point of view. The resulting data structure is usually called a light field. This approach has many advantages, but also many disadvantages, the most obvious one being that the data has been captured under a specific illumination condition, and there is very little information available to reshade the objects, and therefore to insert them into arbitrary new environments. We propose here a series of techniques which allow reshading of light fields. We assume that there is no depth information available with the light field, and our steps consist of the following: (1) determining a coarse geometric shape of the object by creating an octree of cells occupied by the object, (2) estimating surface normals from the boundary points so determined, (3) estimating the reflectance properties by making simplifying assumptions about the surface reflectance function. Once these steps are done, one can insert the object within standard geometric models, determine visibility and shade the whole scene (including shadows and interreflection if necessary) to produce images where light fields and geometric objects are integrated. We have only begun the implementation work, and this paper is a research plan rather than a report of results. In it we give the motivation and background for our application, describe the techniques we plan to use, and give the reasons we selected them.
1 Introduction
Many applications need images and animation of scenes that can appear realistic enough to substitute for scenes from the real world. In an increasing number of these applications, such as driving or flying simulators, special effects in movies and video, and architectural rendering, one needs to merge as seamlessly as possible images of real scenes and of computer-generated scenes. No matter how it is measured, the appearance of the real world is very complex. To produce realistic or visually complex images there are essentially two methods: take a picture of the real world, or design geometric models, surface properties and illumination models complex enough that they will produce realistic images once rendered (of course this is also a function of the rendering method). The latter is hard to do, and in fact a large part of research in computer graphics in the last thirty years has been devoted to meeting that challenge. It is obvious that one would like to be able to integrate the two methods, and insert into real scenes computer-generated objects. This is the goal of Computer Augmented Reality (CAR) [four93]. Recently an intermediary method has been used, generally
known as image-based rendering (IBR). In it one captures the light reflected/emitted from objects from enough different points of view to be able to reconstruct the light received from the objects from any new point of view. The resulting data structure is usually called a light field. The main advantage of image-based rendering methods resides in their capability to produce some very realistic images (for they are in fact images of the real world) with no associated modelling cost if they are acquired from real objects. Of course light fields can be created from computer models, and these models will have the limitations shared by all our models. The main disadvantage of light field objects, if we want to use them under various illumination conditions and merge them in rendering with other kinds of objects (and if we don't, they are indeed of very limited use), is that the geometry of the object is not known, and that the pixel values were determined under a specific illumination. Modifying that illumination requires knowledge of the surface reflectance properties at each point, properties that can be summed up as the bidirectional reflectance distribution function, or BRDF, as well as knowledge of the geometry, since surface normals and surface orientations are required for a correct shading. One can improve the situation by acquiring all or part of this information. Sometimes the depth can be acquired along with the illuminance, as when the data comes from a 3D scanner, but in this case one can wonder why not build the geometric model at the same time. The BRDF, or a simplified version of it, can be acquired by capturing the light field under various illumination conditions, or of course by determining separately the BRDF of the materials of the object's surface.

In the proposed work we assume that these options are not available, that is, neither exact geometry nor depth is available, we will not have the BRDF of the surface explicitly, and we have only a very limited range of illumination conditions. We will also be careful about not making too many simplifying assumptions about the characteristics of the BRDF of the materials considered. Given these ground rules, it is clear that the problem is essentially ill-posed, that is, we do not have enough information to guarantee an accurate reshading; most of our steps will be approximations of the underlying reality, and will be known to fail under some circumstances. In the rest of the paper we summarize the relevant work from the literature, describe the techniques we plan to use to achieve effective reshading, and conclude with a few comments on the expected results.
2 Related Work
Ever since almost the beginning of computer graphics, techniques have been developed to use digitized images to enhance the realism of computed images, and to use lighting and illumination models to shade objects in a more realistic way. The literature on these subjects is quite vast, and the following is only a quick run through the field.
2.1 Textures and Mappings
Starting with the pioneering work of Catmull, Blinn and Newell [blin76], many techniques have been developed to map images onto object surfaces. An excellent survey of the first ten years or so has been written by Heckbert [heck89]. A step closer to IBR was environment mapping, that is, using real images mapped onto a virtual environment to affect reflection or illumination in the scene [gree86]. The basic concept of mapping images on geometric objects was also applied very early in animation, notably with movie maps [lipp80], where dynamic maps were implemented with the help of videodiscs. In this application, the user can drive through an unfamiliar space while controlling the speed, angle of view and route. The database consists of a sequence of animations that the user can play, as well as a set of possible turns. Moreover, computer-generated images allow the user to visualize the environment (which is actually a town in their implementation). The virtual museum [mill92] was also devised to visit an environment with the help of several movies. At selected points in the museum, a 360-degree panning movie allows the user to look around, and bidirectional movies are used to walk from one point to another. Those bidirectional movies contained a frame for each step in both directions along the path connecting the two points.

Another, more recent approach involving real images is the QuickTime VR software [chen95], based on panoramic images captured with specialized cameras or by stitching together overlapping photographs. The motion of the virtual camera is constrained to particular locations where environment maps are available. However, this is not too restricting since the user is free to look around. The user is also able to move around objects in some applications. Finally, hot-spot tracks are used to activate events during navigation. The Facade work at the University of California at Berkeley [debe96] is a recent, impressive implementation of similar ideas, mapping images acquired from real buildings onto a simple geometry.
2.2 The Plenoptic Function
The plenoptic function [adel91] potentially describes all of the radiant energy that can be received from the point of view of the observer. It can be expressed as follows:

p = P(θ, φ, λ, Vx, Vy, Vz, t)
where (Vx, Vy, Vz) represents the viewpoint, (θ, φ) defines a direction vector, λ is the wavelength and t is the time. In [mcmi95], L. McMillan and G. Bishop used a set of discrete samples of the plenoptic function (say a set of images) to generate a continuous representation of that function. In their system, called Plenoptic Modeling, the scene description is given as a set of reference images. The images are then warped and merged together in order to construct several cylindrical images, since cylinders are more practical for acquisition and rendering than other geometric shapes such as spheres or cubes. Then for any viewpoint, and for a set of cylinders, the plenoptic function can be reconstructed and new images can be rendered.
2.3 Light Fields
The plenoptic function involves a huge amount of data, and for some applications it is more than is needed. A more restricted version has become known as the light field. In this case a light field LF is given by

LF = F(θ, φ, λ, u, v)
where (θ, φ) is a direction, λ is the wavelength (often ignored or replaced by colours) and (u, v) are the parameters on a surface. This representation has been used for the representation of light in light transport algorithms in FIAT [dret90] and Lucifer [lewi96], and more recently in the work by Levoy and Hanrahan and by Gortler et al., also known as the Lumigraph [levo96, gort96]. Acquiring a light field consists of sampling the light emitted from an object, which corresponds to sampling the plenoptic function; the samples are interpreted as 2D slices of the 4D function. These authors define the light field as the radiance at a point in a given direction, which is equivalent to the plenoptic function. There are many ways to parameterize the two pairs (θ, φ) and (u, v), and the solution adopted by Levoy and Hanrahan and by Gortler et al. consists of using a pair of parallel planes parameterized by (u, v) and (s, t) respectively.
2.4 BRDF and its Estimation
Since the initial work of the Utah school of computer graphics there have been many ways to express the reflectance of surfaces. The most popular one is as a linear combination of diffuse and specular reflectors [torr66, blin77, cook82, beck87], the latter usually being a Phong model [phon75]. More elaborate models have been developed for computer graphics [he91], but more recently work has concentrated on the acquisition and use of the function known as the bidirectional reflectance distribution function, or BRDF [nico77, ward92]. The BRDF is normally defined as the ratio

f(θi, φi; θr, φr) = dLr(θr, φr) / (Li(θi, φi) cos θi dωi)
where Li(·) and Lr(·) are the incident and reflected radiance, respectively, and dωi is the differential solid angle of the incident beam. It is important to note that there is a close resemblance between light fields and BRDFs, since both are functions of 4 variables; in fact we have exploited this to use similar wavelet projections to represent and compress them [lalo97]. Closely related to our goal are recent efforts to re-render scenes under natural illumination [nime94, roma95], and to extract an illumination model from several views of a real object [sato97].
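To make the diffuse-plus-specular combination concrete, the following Python sketch (ours, not taken from any of the cited systems) evaluates a Lambertian term plus a normalized Phong lobe. The parameter names kd, ks and shininess are our own; in a fitting scenario such as the one discussed in Section 3.3 they would be the per-channel unknowns to estimate.

```python
import numpy as np

def phong_brdf(n, wi, wo, kd, ks, shininess):
    """Evaluate a simple diffuse + Phong-lobe BRDF (illustrative sketch).

    n, wi, wo : unit vectors (surface normal, direction toward the light,
                direction toward the viewer), numpy arrays of shape (3,).
    kd, ks    : diffuse and specular albedos (scalars or RGB arrays).
    shininess : Phong exponent.
    Returns the BRDF value; multiply by incident radiance and cos(theta_i)
    to obtain reflected radiance.
    """
    cos_i = max(float(np.dot(n, wi)), 0.0)
    if cos_i == 0.0:
        return 0.0 * kd                      # light below the surface
    r = 2.0 * cos_i * n - wi                 # mirror direction of the light
    cos_r = max(float(np.dot(r, wo)), 0.0)
    diffuse = kd / np.pi                     # ideal Lambertian term
    specular = ks * (shininess + 2.0) / (2.0 * np.pi) * cos_r ** shininess
    return diffuse + specular
```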
2.5 State of the Art
The field of image-based rendering has matured a lot in a few short years (and was the subject of a workshop last year in Monterey, CA), but a lot of work remains to be done before light fields in particular can be effectively integrated with other modelling and rendering systems, which is absolutely necessary for them to be really useful. At Imager we have recently explored the reconstruction and merging of several light fields, and the geometric alteration of light fields [chiu98, broo99]. The present work addresses what we feel is the most serious problem right now: the reshading of light fields.
3 The Essential Steps
We will assume here that the light fields are defined and parameterized as in Levoy and Hanrahan [levo96]. In fact we plan to use the same data structure, so that we can use the light fields that have already been generated. Figure 1 illustrates the parameterization and the viewing. A light field is defined as several (from 1 to 6) such pairs of planes, referred to as slabs; for each pair, one plane is parameterized in (u, v) and the other in (s, t). Note that the planes have a given position in space, and that one of them can be at infinity. To be able to insert a light field into a standard computer graphics scene, we have to assign it a geometry, including surface normals for shading purposes, and reflectance properties for the surface. Then we have to render it, determining visibility and blocking for
cast shadows, and shade it under the new viewing and illumination conditions.

Figure 1: A light slab and its viewing
3.1 Shape Recovery
The construction of an accurate geometric shape from a set of images is still an open problem, and has been the main goal of computer vision [horn89]. Since we have to deal with complex shapes (otherwise it would not be worth using light fields) and quite arbitrary reflectance properties (for the same reason), we cannot apply these techniques for the time being. However, we only need a coarse shape, and we can use any kind of algorithm robust enough to provide a complete but approximate shape of an object. We chose the algorithm proposed by Szeliski [szel93]. Originally Szeliski uses a set of images acquired from different viewpoints around an object. The root of the octree is positioned at the beginning of the algorithm as an enclosing box for the object, and recursively refined during the reconstruction process. For each view of the object (in the case of light fields, each direction of view, treated as an orthographic view), pixels are assigned 0 or 1 depending on whether they are covered by the object or not. The octree voxels obtained so far are then projected onto the image plane and compared to the object pixels. If a voxel is totally outside the object, then it will always be outside the object and is eliminated. A voxel inside the object is kept. Finally, if an octree voxel's projection contains both covered and uncovered pixels, it is marked as ambiguous, and it is split recursively until the children are no longer ambiguous or a minimum size criterion has been reached. We have successfully used this algorithm for merging light fields [chiu98]. It is obvious that it will give correct results (within the size criterion) only for objects whose boundaries are part of at least one silhouette from some viewing direction. For instance, in the case of a cup with a handle, the handle will be reconstructed correctly, but the inside of the cup will not be hollow, unless there is no bottom. It is a definite advantage that this algorithm gives us a hierarchical structure for the reconstructed object.
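The following Python sketch is our own simplified rendition of this carving loop, not a reproduction of [szel93]: it assumes orthographic views, binary coverage masks, and a cell projection approximated by the bounding rectangle of the projected corners, and is meant only to make the discard/keep/split logic explicit.

```python
import numpy as np

def carve(lo, hi, views, min_size):
    """Carve the axis-aligned cell [lo, hi] against binary silhouettes.

    Each view is (R, mask, res): R is a 3x3 rotation taking world points
    into the view frame (orthographic projection keeps the first two
    coordinates), mask is a 2D array with 1 where the object covers the
    pixel, and res is the pixel size in world units. Returns a list of
    (lo, hi) cells kept as part of the coarse shape.
    """
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    corners = np.array([[x, y, z] for x in (lo[0], hi[0])
                                  for y in (lo[1], hi[1])
                                  for z in (lo[2], hi[2])])
    ambiguous = False
    for R, mask, res in views:
        uv = (corners @ R.T)[:, :2] / res          # projected corners
        u0, v0 = np.floor(uv.min(axis=0)).astype(int)
        u1, v1 = np.ceil(uv.max(axis=0)).astype(int)
        window = mask[max(v0, 0):max(v1, 0), max(u0, 0):max(u1, 0)]
        if window.size == 0 or not window.any():
            return []                              # outside in this view
        if not window.all():
            ambiguous = True                       # mixed coverage
    if not ambiguous or (hi - lo).max() <= min_size:
        return [(lo, hi)]                          # keep this cell
    mid = 0.5 * (lo + hi)
    kept = []
    for dx in (0, 1):                              # recurse on 8 children
        for dy in (0, 1):
            for dz in (0, 1):
                clo = np.where([dx, dy, dz], mid, lo)
                chi = np.where([dx, dy, dz], hi, mid)
                kept += carve(clo, chi, views, min_size)
    return kept
```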
3.2 Estimation of Normals
After obtaining the object octree, we have to construct a smooth surface for it, or at least compute normal vectors at boundary points so that the object will appear reasonably smooth when shaded. This means that for each visible point of the light field we estimate a normal based on the octree cells known in its vicinity. An algorithm like those used for implicit surfaces would work with some modifications [wyvi86, lore87, bloo88], but we chose to use the technique developed by Hoppe et al. [hopp92] to estimate normals from a cloud of unorganized points. The reason is that we want to be able to use a reasonable number of octree vertices in the neighbourhood of the point whose normal we compute, and we do not want false proximity caused by the reconstruction to influence the result unduly. Moreover, we only have to compute the surface normals associated with each sample of the plenoptic function given by the light field.
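A simplified version of this idea, in the spirit of [hopp92], is to fit a plane to the k nearest boundary vertices by principal component analysis and take the direction of least variance as the normal. The Python sketch below illustrates this; the choice of k and the use of a rough outward direction to orient the normal are our own assumptions.

```python
import numpy as np

def estimate_normal(p, boundary_pts, k=12, outward=None):
    """Estimate the surface normal at point p from nearby boundary points.

    boundary_pts : (N, 3) array of octree boundary vertices.
    k            : number of nearest neighbours used for the plane fit.
    outward      : optional rough outward direction (e.g. p minus the
                   octree centroid) used to orient the normal consistently.
    """
    d = np.linalg.norm(boundary_pts - p, axis=1)
    nbrs = boundary_pts[np.argsort(d)[:k]]        # k nearest neighbours
    centered = nbrs - nbrs.mean(axis=0)
    # Principal component analysis: the normal is the direction of least
    # variance of the neighbourhood, i.e. the last right singular vector.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    n = vt[-1]
    if outward is not None and np.dot(n, outward) < 0.0:
        n = -n                                     # flip toward the outside
    return n / np.linalg.norm(n)
```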
3.3 Estimation of Reflectance Properties
We have not yet fully decided on the techniques we want to apply for this step. Generally speaking, we favour an approach similar to the one used by Sato and Ikeuchi [sato97], where a simple illumination model with few parameters is fitted to the data obtained from multiple views of the object. This assumes more information than we normally have from a light field: it is not guaranteed that we even have the light position and intensity when the light field was acquired, let alone more than one light field. Note that if we acquire the light field with three light positions we can estimate the surface normals regardless of albedo, provided we assume that the surface is a diffuse reflector. The other assumption that we have to make to apply a method similar to Sato et al. is that the BRDF is essentially the same over the whole object, except for three parameters per colour channel that have to be determined. We want to investigate situations where more variation of the BRDF over the object is assumed.
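The three-light observation is classical photometric stereo: for a Lambertian surface the three radiance samples at a point are linear in the albedo-scaled normal, so a 3 × 3 solve recovers both. The sketch below is a minimal illustration under those assumptions (distant lights with known unit directions), not a description of our final estimation procedure.

```python
import numpy as np

def photometric_stereo(I, L):
    """Recover albedo and unit normal from three intensity measurements.

    I : length-3 array of observed intensities at one surface point,
        one per light position.
    L : (3, 3) matrix whose rows are the unit directions toward the
        three (distant) light sources.
    Assumes a Lambertian surface, so I = albedo * L @ n.
    """
    g = np.linalg.solve(L, np.asarray(I, dtype=float))   # g = albedo * n
    albedo = np.linalg.norm(g)
    n = g / albedo if albedo > 0.0 else np.array([0.0, 0.0, 1.0])
    return albedo, n

# Example usage with three assumed (linearly independent) light directions.
L = np.array([[0.0, 0.0, 1.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]], dtype=float)
L /= np.linalg.norm(L, axis=1, keepdims=True)
albedo, normal = photometric_stereo([0.8, 0.9, 0.7], L)
```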
4 Integrating and Reshading
Once we have estimates of the geometry of the object, its surface normals and its reflectance properties, we can insert it into a standard computer graphics scene and renderer. We plan to use both ray-tracing and Z-buffer based rendering techniques in our implementation.
4.1 Positioning and Visibility
We have in effect a hierarchical geometric model of the object that we can transform and translate to the desired position. In an interactive modeller the positioning can be done in real time, displaying a suitable level of the octree as cubes. Within ray tracing, bounding-box tests are facilitated by the existence of the octree; in a Z-buffer method it also helps for clipping purposes. Determining visibility is relatively easy, perhaps easier within ray tracing, but it is essentially similar to the steps we took to compute the octree, except that now we have the 3D positions of the pixels.
4.2 Rendering
The rendering step can again be considered with respect to the two main rendering techniques we plan to use:
Ray tracing is very easy to use in our case because this technique is the one most used to render a light field. The only problem is that we have to determine the best sample of the plenoptic function when a given ray intersects the bounding box associated with the slabs of the light field. Once the pixel is determined, the shading is computed given its normal, the current light positions and the reflectance model obtained. This can of course all be implemented in a modular way, as most ray tracers isolate these functions for each type of primitive. We plan to use the Persistence of Vision Ray-Tracer (POV-Ray) as a well-established ray tracer whose source is available.

The rendering algorithm is slightly different in the case of the Z-buffer: we just have to determine the colour of each pixel associated with the projected bounding box. This is not difficult in the sense that for each of these pixels we have the viewpoint (given by the center of projection of the camera) and the direction (given by the pixel coordinates). These two parameters fully determine the correct sample in the light field, and the data necessary for the shading. Here we will use OpenGL as the interface for the rendering.
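In both renderers the core operation is mapping a ray to a light field sample: intersect the ray with the two slab planes and convert the hit points to (u, v) and (s, t) indices. The sketch below is a simplified illustration of that lookup, assuming axis-aligned planes at z = 0 and z = 1 and nearest-neighbour sampling instead of the quadrilinear interpolation a production viewer would use; the array layout lf[u, v, s, t, channel] is our assumption.

```python
import numpy as np

def lightfield_sample(origin, direction, lf, u_extent, v_extent,
                      s_extent, t_extent):
    """Fetch the nearest light field sample hit by a ray.

    lf is a 5D array indexed as lf[u, v, s, t, channel], with the uv plane
    assumed at z = 0 and the st plane at z = 1 (a simplification of the
    general slab geometry). The *_extent pairs give the world extent of
    each plane axis. Returns None if the ray misses the slab.
    """
    o, d = np.asarray(origin, float), np.asarray(direction, float)
    if abs(d[2]) < 1e-9:
        return None                          # ray parallel to the planes
    t_uv = (0.0 - o[2]) / d[2]
    t_st = (1.0 - o[2]) / d[2]
    if t_uv < 0.0 or t_st < 0.0:
        return None                          # slab lies behind the ray
    pu = o + t_uv * d                        # hit point on the uv plane
    ps = o + t_st * d                        # hit point on the st plane

    def to_index(x, extent, n):
        # Map a world coordinate on the plane to a nearest sample index.
        frac = (x - extent[0]) / (extent[1] - extent[0])
        if frac < 0.0 or frac > 1.0:
            return None
        return min(int(frac * n), n - 1)

    nu, nv, ns, nt, _ = lf.shape
    idx = (to_index(pu[0], u_extent, nu), to_index(pu[1], v_extent, nv),
           to_index(ps[0], s_extent, ns), to_index(ps[1], t_extent, nt))
    if any(i is None for i in idx):
        return None                          # ray does not pass the slab
    return lf[idx]
```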
5 Conclusion
It is obviously too early for conclusions. There are many issues we want to explore further. One challenge for this application is to obtain reasonable rendering times for light field objects, together with good image quality. One way to achieve both is to use the filtering capabilities of the rendering systems as much as possible. In the case of Z-buffer rendering that means using texture mapping, which is already used in light field viewers such as lifview from Stanford.
Acknowledgment
We thank all our associates at Imager for assistance with many conceptual and practical questions, NSERC and IRISA for generous support.

6 References
[adel91] E. H. Adelson and J. R. Bergen. "Computational Models of Visual Processing", Chapter 1. MIT Press, 1991.
[beck87] P. Beckmann and A. Spizzichino. The Scattering of Electromagnetic Waves from Rough Surfaces. Artech House Inc., 1987.
[blin76] J. F. Blinn and M. E. Newell. "Texture and Reflection in Computer Generated Images". Communications of the ACM, Vol. 19, No. 10, pp. 542–547, October 1976.
[blin77] James F. Blinn. "Models of Light Reflection for Computer Synthesized Pictures". Computer Graphics (SIGGRAPH '77 Proceedings), Vol. 11, No. 2, pp. 192–198, July 1977.
[bloo88] J. Bloomenthal. "Polygonization of Implicit Surfaces". Computer Aided Geometric Design, Vol. 5, No. 4, pp. 341–356, 1988.
[broo99] S. Brooks. "Dynamic Light Fields". M.Sc. Thesis, Department of Computer Science, University of British Columbia, 1999.
[chen95] Shenchang Eric Chen. "QuickTime VR – An Image-Based Approach to Virtual Environment Navigation". SIGGRAPH 95, pp. 29–38, August 1995.
[chiu98] C. Chiu. "Merging Multiple Light Fields". M.Sc. Thesis, Department of Computer Science, University of British Columbia, 1998.
[cook82] R. L. Cook and K. E. Torrance. "A Reflectance Model for Computer Graphics". ACM Transactions on Graphics, Vol. 1, No. 1, pp. 7–24, January 1982.
[debe96] Paul E. Debevec, Camillo J. Taylor, and Jitendra Malik. "Modeling and Rendering Architecture from Photographs: A Hybrid Geometry- and Image-Based Approach". SIGGRAPH 96 Conference Proceedings, Annual Conference Series, pp. 11–20, August 1996.
[dret90] G. Drettakis, Eugene Fiume, and Alain Fournier. "Tightly-Coupled Multiprocessing for a Global Illumination Algorithm". Eurographics '90, pp. 387–398, September 1990.
[four93] Alain Fournier, Atjeng S. Gunawan, and Chris Romanzin. "Common Illumination Between Real and Computer Generated Scenes". Proceedings of Graphics Interface '93, pp. 254–262, May 1993.
[gort96] Steven J. Gortler, Radek Grzeszczuk, Richard Szeliski, and Michael F. Cohen. "The Lumigraph". SIGGRAPH 96, August 1996.
[gree86] N. Greene. "Environment Mapping and Other Applications of World Projections". IEEE Computer Graphics and Applications, November 1986.
[he91] Xiao D. He, Kenneth E. Torrance, François X. Sillion, and Donald P. Greenberg. "A Comprehensive Physical Model for Light Reflection". Computer Graphics (SIGGRAPH '91 Proceedings), Vol. 25, pp. 175–186, July 1991.
[heck89] P. S. Heckbert. "Fundamentals of Texture Mapping and Image Warping". M.Sc. Thesis, Dept. of EECS, University of California, Berkeley, June 1989.
[hopp92] Hugues Hoppe, Tony DeRose, Tom Duchamp, John McDonald, and Werner Stuetzle. "Surface Reconstruction from Unorganized Points". SIGGRAPH 92, August 1992.
[horn89] Berthold K. P. Horn and Michael J. Brooks. Shape from Shading. The MIT Press, Cambridge, MA, 1989.
[lalo97] P. Lalonde and A. Fournier. "A Wavelet Representation of Measured Reflectance Functions". IEEE Transactions on Visualization and Computer Graphics, Vol. 3, No. 4, pp. 329–336, December 1997.
[levo96] Marc Levoy and Pat Hanrahan. "Light Field Rendering". SIGGRAPH 96, August 1996.
[lewi96] R. R. Lewis and A. Fournier. "Light-Driven Global Illumination with a Wavelet Representation of Light Transport". Seventh Eurographics Workshop on Rendering, pp. 12–21, June 1996.
[lipp80] Andrew Lippman. "Movie-Maps: An Application of the Optical Videodisc to Computer Graphics". SIGGRAPH 80, August 1980.
[lore87] William E. Lorensen and Harvey E. Cline. "Marching Cubes: A High Resolution 3D Surface Construction Algorithm". Computer Graphics (SIGGRAPH '87 Proceedings), Vol. 21, pp. 163–169, July 1987.
[mcmi95] Leonard McMillan and Gary Bishop. "Plenoptic Modeling: An Image-Based Rendering System". SIGGRAPH 95, August 1995.
[mill92] G. E. Miller, S. E. Offert, E. Chen, D. Patterson, S. Blacketter, S. A. Rubin, J. Yim, and J. Hanan. "The Virtual Museum: Interactive 3D Navigation of a Multimedia Database". The Journal of Visualization and Computer Animation, 1992.
[nico77] F. E. Nicodemus, J. C. Richmond, J. J. Hsia, I. W. Ginsberg, and T. Limperis. "Geometric Considerations and Nomenclature for Reflectance". Monograph 161, National Bureau of Standards (US), October 1977.
[nime94] Jeffry S. Nimeroff, Eero Simoncelli, and Julie Dorsey. "Efficient Re-rendering of Naturally Illuminated Environments". Fifth Eurographics Workshop on Rendering, pp. 359–373, June 1994.
[phon75] Bui-T. Phong. "Illumination for Computer Generated Pictures". Communications of the ACM, Vol. 18, No. 6, pp. 311–317, June 1975.
[roma95] C. Romanzin. "Aspects of Image Reshading". M.Sc. Thesis, Department of Computer Science, University of British Columbia, 1995.
[sato97] Yoichi Sato, Mark D. Wheeler, and Katsushi Ikeuchi. "Object Shape and Reflectance Modeling from Observation". SIGGRAPH 97, August 1997.
[szel93] Richard Szeliski. "Rapid Octree Construction from Image Sequences". CVGIP: Image Understanding, Vol. 58, No. 1, pp. 23–32, July 1993.
[torr66] K. E. Torrance and E. M. Sparrow. "Polarization, Directional Distribution, and Off-Specular Peak Phenomena in Light Reflected from Roughened Surfaces". Journal of the Optical Society of America, Vol. 56, No. 7, 1966.
[ward92] Gregory J. Ward. "Towards More Practical Reflectance Measurements and Models". Graphics Interface '92 Workshop on Local Illumination, pp. 15–21, May 1992.
[wyvi86] Brian Wyvill, Craig McPheeters, and Geoff Wyvill. "Data Structure for Soft Objects". The Visual Computer, Vol. 2, No. 4, pp. 227–234, 1986.