A practical framework for virtual viewing and relighting

Qi Duan, Jianjun Yu, Xubo Yang, Shuangjiu Xiao

Digital Art Laboratory, School of Software, Shanghai Jiao Tong University
{duanqi1983,jianjun-yu,yangxubo,xsjiu99}@sjtu.edu.cn
http://dalab.se.sjtu.edu.cn

Abstract. Many practical applications are concerned with observing objects that often have specular reflection properties. Their goal is to see how the specular reflections and surface details of an object vary under different light conditions and view positions, without computing any 3D information about the object. In this paper, we propose a framework that combines a novel view synthesis algorithm with a relighting algorithm to meet these requirements. The framework can present an object, including its specular reflection properties, under different light conditions and at arbitrary view positions. Another important feature is that all algorithms are purely image-based and never acquire a 3D model, since such a model may contain highly confidential information. An image measurement criterion is also proposed to verify the feasibility and effectiveness of the framework.

Key words: Novel View Synthesis, Image-based Relighting, Image Measurement

1 Introduction

Recently, techniques that can render images of a product under arbitrary illumination or viewpoints have become increasingly important in factory automation. They are also of great value in movies, museums, and electronic commerce. In machine vision applications, for example, such rendering is used to simulate and decide the optimal lighting or camera viewpoint. To enhance the exhibition effect of an object in special circumstances, various special lights are placed at certain positions to emphasize details on the product surface; the observation position and angle also influence the displayed result. The viewpoint position and the illumination condition are therefore the two key factors. To determine the best illumination and viewpoint, it is very helpful to have a tool that can precisely show the surface reflection of the target under various light conditions, such as different colors and types of LED light sources, and that can also show the object from arbitrary view positions. Considering that the product may contain highly confidential information, the requirements can be summarized as follows:

– A method to simulate the object placed under various light conditions, for example, different numbers, sizes, types and colors of light sources and different distances between the object and the lights. For practical applications, the best result should be obtained at minimal cost.

– A technique to synthesize novel view images of the object. Since the object may be an industrial product involving commercial secrets, the less 3D information about the target we compute, the better.

However, existing work addresses only one of these two requirements at a time. We therefore design a new practical framework to handle both.

1.1 Related work

Two main algorithms are employed in this framework: image-based relighting and novel view synthesis.

Image-based relighting (IBL) has attracted a lot of attention recently because it does not require precise physical models. Image-based relighting methods can be classified into three categories: reflectance-based relighting, basis function-based relighting and plenoptic function-based relighting. Reflectance function-based relighting techniques explicitly estimate the reflectance function at each visible point of the object or scene, for instance via the anisotropic reflection model[1] or the bidirectional surface scattering reflectance distribution function (BSSRDF)[2]. Lin et al.[3] studied the dual of the Lumigraph, in which the camera is fixed and a point light source is moved. Basis function-based relighting techniques exploit the linearity of the rendering operator with respect to illumination: relighting is accomplished by a linear combination of a set of pre-rendered "basis" images. Nimeroff et al.[4] used such a combination of images to relight a scene, and Debevec et al.[5] describe a Light Stage in which an object can be placed. Plenoptic function-based relighting techniques are based on the plenoptic function[6] as a computational model; they extract the illumination component from the aggregate time parameter and thus facilitate relighting of scenes.

Novel view synthesis has been studied over the last decade. Work in this area can roughly be divided into two main classes: image-based modeling and rendering (IBMR), and image-based morphing and rendering. The first class reconstructs 3D models from photographs. Debevec et al.[7] have shown a method for modeling and rendering architecture from a small number of photos. Criminisi et al.[8] create 3D texture-mapped models relying solely on a single input image. Byong Mok Oh et al.[9] take a single photo as input and model the scene; after modeling, the viewpoint can be changed and the shape, color and illumination of the scene can be modified. The second class does not compute any 3D information of the scene. It uses various methods to synthesize the novel view image, such as image warping[10], image interpolation[11] and image extrapolation. For example, Seitz and Dyer[12] interpolate along the baseline of image pairs to obtain correct in-between images. Levoy and Hanrahan[13] and Gortler et al.[14] interpolate between a dense set of several thousand example images to obtain novel view images. Peleg and Herman[15] relax the fixed-camera constraint to synthesize panoramic results. Avidan and Shashua[18] introduce the trilinear tensor and use it to synthesize novel view images.

However, image-based relighting alone only lets us view the target under various light conditions, and novel view synthesis alone only generates images from arbitrary viewpoints. Our framework integrates the two, so that images of a product can be rendered under arbitrary illumination and viewpoints. The remainder of the paper describes the framework, the experimental results and the conclusion. Section 2 introduces the method for simulating the object under various light conditions and the process for synthesizing a novel view image. The image measurement criterion and the framework's performance are described in Section 3. Finally, Section 4 concludes and presents ideas for future work.

2 Algorithms

2.1 Framework architecture

The most significant feature of the framework is the ability to observe the target under arbitrary illumination and from novel view positions, where all results are computed purely from images without acquiring any 3D information. The framework therefore integrates the two algorithms above as two modules; its structure is shown in Figure 1. First, in the image-based relighting module, the camera position is fixed and the relighting process is performed, producing a result image under an arbitrary illumination condition. This image is called the source image. Second, the camera is moved within a small range of positions and orientations, and a picture of the target is captured as the reference image. Using the source image and the reference image as input to the novel view synthesis module, the final novel image is synthesized. The relationship between the source image and the reference image depends only on the two camera positions and is computed just once in the whole procedure. Whenever the light environment changes, the relighting result is recalculated and the source image is updated while the reference image remains unmodified; the novel image is then synthesized again. With this framework, we can observe the target under various illumination conditions and from arbitrary viewpoints. Basis function-based relighting and trilinear tensor novel view synthesis are selected for this framework, rather than reflectance-based relighting, plenoptic function-based relighting or other novel view algorithms, because these two image-based methods are better at keeping the object confidential. The two methods are introduced in the following sections.

Fig. 1. Framework Architecture. The architecture contains two main algorithm modules. The source image is the output of the relighting module and, together with the reference image, the input of the second module. The novel image is the final result, output by the novel view synthesis module.

2.2 Image-Based Relighting

Image-based relighting (IBL) can synthesize realistic images without the complex and long rendering process of traditional geometry-based computer graphics. The algorithm is described in the following steps.

Basis image acquisition. A light stage, shown in Figure 2, is designed to capture the basis images. The stage is a spherical structure fitted with prepared light sources. A point light is used as the "standard" light source and the target is placed at the center of the stage. The system captures images of the target while it is illuminated only by the "standard" light. The light moves over the surface of the sphere, and for each light position a corresponding basis image is captured. This process is repeated until the entire sphere of incident illumination has been sampled at a pre-defined resolution. Since the distance between the light and the target is much larger than the size of the target, the point light source can be treated as a directional light source.
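As an illustration, the capture loop over the light sphere might look like the Python sketch below. The grid resolution and the capture_basis_image routine are hypothetical stand-ins for the actual light stage control code, which the paper does not give.

```python
import numpy as np

# Hypothetical sampling resolution of the light sphere (the paper's worst
# case in Section 3 uses a 36 x 17 grid of light directions).
N_PHI, N_THETA = 36, 17          # longitude (azimuth) x latitude (polar) bins

def capture_basis_image(theta, phi):
    """Placeholder for driving the light stage: move the 'standard' point
    light to direction (theta, phi) and grab a frame from the fixed camera.
    Here it just returns a dummy image so the sketch runs stand-alone."""
    return np.zeros((342, 512, 3), dtype=np.uint8)

basis_images = {}                 # (i_theta, i_phi) -> basis image
d_theta = np.pi / N_THETA
d_phi = 2.0 * np.pi / N_PHI
for i in range(N_THETA):
    theta = (i + 0.5) * d_theta          # polar angle at the bin center
    for j in range(N_PHI):
        phi = (j + 0.5) * d_phi          # azimuth at the bin center
        basis_images[(i, j)] = capture_basis_image(theta, phi)
```

Indexing the basis images by their (theta, phi) cell makes the luminance mapping of the next step straightforward.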


Fig. 2. Light Stage and Result Images. The target on the platform is illuminated by the "standard" light. For each light direction, a corresponding image is captured. The right image shows the captured results under our spherical sampling method.

Luminance mapping. For rendering it is convenient to have a table that maps each image number to a direction on the sphere. The sphere map is therefore unwrapped into a latitude-longitude map, as shown in Figure 3.

Fig. 3. Sphere map to latitude-longitude map. According to the light stage, the sphere surface covered by the light source is split into regions. These regions are then projected into a latitude-longitude map, with each grid cell corresponding to a region of the sphere surface.
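A minimal sketch of this mapping, assuming a z-up spherical parameterization with polar angle theta and azimuth phi (the paper does not state its exact convention), is:

```python
import numpy as np

N_PHI, N_THETA = 36, 17   # assumed lat-long resolution, matching the light stage

def direction_to_cell(d):
    """Map a unit light direction (x, y, z) to a (row, col) cell of the
    latitude-longitude map. Assumes z is 'up', theta is the polar angle
    measured from +z, and phi is the azimuth in the x-y plane."""
    x, y, z = d / np.linalg.norm(d)
    theta = np.arccos(np.clip(z, -1.0, 1.0))      # [0, pi]
    phi = np.arctan2(y, x) % (2.0 * np.pi)        # [0, 2*pi)
    row = min(int(theta / np.pi * N_THETA), N_THETA - 1)
    col = min(int(phi / (2.0 * np.pi) * N_PHI), N_PHI - 1)
    return row, col

print(direction_to_cell(np.array([0.0, 0.0, 1.0])))   # light from the pole -> (0, 0)
```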

Figure 3 shows that different sections of the sphere have different areas. To obtain an accurate result, the contribution of each sample region is scaled according to the area of its section on the sphere. The scaling factors in the latitude-longitude map are determined by equation 1:

A_i = \frac{1}{4\pi} \int_{\phi_i - \frac{\Delta\phi}{2}}^{\phi_i + \frac{\Delta\phi}{2}} \int_{\theta_i - \frac{\Delta\theta}{2}}^{\theta_i + \frac{\Delta\theta}{2}} \sin\theta \, d\theta \, d\phi = \frac{\Delta\phi}{4\pi} \left( \cos\left(\theta_i - \frac{\Delta\theta}{2}\right) - \cos\left(\theta_i + \frac{\Delta\theta}{2}\right) \right)    (1)
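In code, the weights of equation (1) can be precomputed for every cell of the latitude-longitude map. The following numpy sketch assumes the 36 x 17 grid mentioned in Section 3.

```python
import numpy as np

N_PHI, N_THETA = 36, 17
d_phi = 2.0 * np.pi / N_PHI
d_theta = np.pi / N_THETA

# A[i, j]: relative solid angle of the cell centered at (theta_i, phi_j),
# following equation (1). All cells along one latitude row share the same weight.
theta_centers = (np.arange(N_THETA) + 0.5) * d_theta
A_theta = (np.cos(theta_centers - d_theta / 2.0)
           - np.cos(theta_centers + d_theta / 2.0)) * d_phi / (4.0 * np.pi)
A = np.repeat(A_theta[:, None], N_PHI, axis=1)

print(A.sum())   # ~1.0: the cell weights cover the unit sphere exactly
```

Because the cosine terms telescope over theta and the azimuth bins cover 2*pi, the weights of all cells sum to one over the full sphere, which is a convenient sanity check.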

Relighting process. To render the target under arbitrary illumination, the virtual light illumination is first mapped onto the sphere map; the sphere map is then converted to the latitude-longitude map. The relighting problem can thus be solved by a linear combination of the basis images. Relighting with the sequence of basis images I_i is described by equation 2:

I = \sum_{i=1}^{N} I_i \int_{\phi_i - \frac{\Delta\phi}{2}}^{\phi_i + \frac{\Delta\phi}{2}} \int_{\theta_i - \frac{\Delta\theta}{2}}^{\theta_i + \frac{\Delta\theta}{2}} E(\phi, \theta) \, \sin\theta \, d\theta \, d\phi    (2)

Equation 2 is approximated by the linear combination in equation 3 of the images I_i and the mean E_i of the lighting environment over each solid angle, scaled by the relative area A_i of the solid angle projected onto the unit sphere:

I \approx \sum_{i=1}^{N} I_i A_i C_i E_i = \sum_{i=1}^{N} I_i A_i E_i \cdot I_D / I_R    (3)

where C_i is the ratio between the arbitrary light intensity and the standard light intensity; I_D and I_R are the intensities of the arbitrary light and the standard light respectively. With the image set, the latitude-longitude maps and the weights, an image under a novel illumination can be generated.
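A compact numpy sketch of equation (3) is given below. The array shapes and the reading of C_i as the intensity ratio I_D/I_R are our assumptions; the basis images here are random stand-ins for the captured data.

```python
import numpy as np

def relight(basis_images, A, E, I_D=1.0, I_R=1.0):
    """Approximate relighting as in equation (3): a weighted sum of the basis
    images. basis_images has shape (N_theta, N_phi, H, W, 3); A and E have
    shape (N_theta, N_phi). E is the mean intensity of the desired lighting
    environment over each cell; I_D / I_R rescales from the 'standard' light
    intensity to the desired one."""
    weights = A * E * (I_D / I_R)                       # per-cell scalar weights
    out = np.tensordot(weights, basis_images.astype(np.float64),
                       axes=([0, 1], [0, 1]))
    return np.clip(out, 0.0, 255.0).astype(np.uint8)

# Toy usage with random data standing in for the captured basis images.
rng = np.random.default_rng(0)
basis = rng.integers(0, 256, size=(17, 36, 32, 32, 3), dtype=np.uint8)
A = np.full((17, 36), 1.0 / (17 * 36))
E = np.zeros((17, 36)); E[3, 10] = 1.0                  # a single virtual light cell
image = relight(basis, A, E)
```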

Fig. 4. Virtual Ring Light Relighting. A virtual ring light is simulated by combining a spatial array of point light sources, all of which are mapped onto the latitude-longitude map. The relit image under the virtual ring light source is shown on the left.
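The ring light of Figure 4 can be sketched, under the same assumptions as above, by rasterizing an array of point sources into the environment map E and feeding it to the relighting sum; the resolution and ring angle below are illustrative only, not the paper's actual setup.

```python
import numpy as np

N_PHI, N_THETA = 36, 17
d_theta, d_phi = np.pi / N_THETA, 2.0 * np.pi / N_PHI

def ring_light_environment(theta_ring, n_points=64, intensity=1.0):
    """Rasterize a virtual ring light into the latitude-longitude map by
    placing n_points point sources at polar angle theta_ring (a sketch of
    the 'spatial array of point light sources' idea in Fig. 4)."""
    E = np.zeros((N_THETA, N_PHI))
    row = min(int(theta_ring / np.pi * N_THETA), N_THETA - 1)
    for k in range(n_points):
        phi = 2.0 * np.pi * k / n_points
        col = min(int(phi / (2.0 * np.pi) * N_PHI), N_PHI - 1)
        E[row, col] += intensity / n_points
    return E

E_ring = ring_light_environment(theta_ring=np.pi / 4)   # a 45-degree ring of lights
```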

2.3 Novel View Synthesis

The novel view synthesis algorithm is employed to synthesize images at arbitrary view positions. The technique is based on two or three sample images and its primary operation is the trilinear tensor, which fits our application well.

Feature point detection. The relationship between the sample images is crucial to this approach. To obtain an accurate result, matching feature points are detected automatically in the sample images: Harris corner points are detected first, then putative matches[16] of the feature points between the images are computed, and finally the RANSAC algorithm is used to select the feature point matches accurately and robustly.

Trilinear tensor operation. At the beginning of the process, a basis tensor is computed. In projective geometry, given three sample images there are three projection matrices [I, 0], [A, v'], [B, v'']. The trilinear tensor is a 3 x 3 x 3 array described by a bilinear function of the camera matrices A and B:

T_i^{jk} = v'^{j} b_i^k - v''^{k} a_i^j    (4)

where a_i^j, b_i^k, v'^{j}, v''^{k} are the elements of the homographies A, B and of the vectors v', v'' respectively. In our implementation, to accelerate and simplify the process, only two sample images are used, which means a_i^j = b_i^k. The fundamental matrix of the two images is computed and its elements are rearranged[17] into a tensor; the basis tensor can be obtained in this way. To synthesize a novel image of the object, a 3 x 3 homography R and a 3 x 1 vector t must be specified, representing the rotation and translation from the sample image to the novel image. The new tensor G_i^{jk} is computed by equation 5[18]:

G_i^{jk} = R_l^k T_i^{jl} + t^k a_i^j    (5)
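Equation (5) is a fixed contraction over the tensor indices, which translates directly into an einsum. The index convention T[i, j, k] for T_i^{jk} is our choice, not something the paper specifies.

```python
import numpy as np

def cascade_tensor(T, R, t, a):
    """Compute the new trilinear tensor of equation (5),
    G_i^{jk} = R_l^k T_i^{jl} + t^k a_i^j, with the convention T[i, j, k].
    R is 3x3, t has length 3, and a[i, j] holds the elements a_i^j of the
    homography between the two sample views."""
    return np.einsum('kl,ijl->ijk', R, T) + np.einsum('k,ij->ijk', t, a)

# Toy usage with placeholder values for the basis tensor and homography.
T = np.zeros((3, 3, 3)); a = np.eye(3)
R = np.eye(3); t = np.array([0.01, 0.0, 0.0])   # a small hypothetical camera motion
G = cascade_tensor(T, R, t, a)
```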

G_i^{jk} is the new trilinear tensor used to re-project the first two sample images onto the desired novel view.

Re-projection process. After obtaining the new trilinear tensor and the two sample images, the final step is to re-project the sample images onto the desired novel image. The novel image is generated from the source image using the re-projection equation 6[18]:

p^i s_j^u T_i^{jk} = p''^{k}    (6)

where p is a matching point in the source image, s_j^u represents the epipolar line intersecting the corresponding point of p in the reference image, and p'' is the point in the novel image. Using the new tensor, the two sample images and the re-projection equation, the novel image can be re-projected directly.
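Equation (6) likewise reduces to one contraction per point. The sketch below re-projects a single homogeneous point with a given tensor and line; how the line s and the dense correspondences are chosen follows [18] and is not reproduced here.

```python
import numpy as np

def reproject_point(G, p, s):
    """Re-project a source-image point following equation (6),
    p''^k = p^i s_j T_i^{jk}. p is the homogeneous source point, s a line
    through its corresponding point in the reference image (e.g. an epipolar
    line), and G the (new) trilinear tensor with convention G[i, j, k].
    The result is returned in inhomogeneous pixel coordinates."""
    p2 = np.einsum('i,j,ijk->k', p, s, G)
    return p2[:2] / p2[2]

# Applying this to every matched pixel of the source image (carrying its
# colour along) yields the synthesized novel view.
```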

3 Experiments

In this section, experiments are designed to verify the accuracy and efficiency of the framework. Since two different algorithms are involved, separate verification experiments are carried out for the two modules.

Fig. 5. Relighting Comparison. The left image is the generated relighting image and the right image is the captured real image. The region in the red square is the specular reflection area on the surface.

As Figure 5 shows, it is very difficult to distinguish the synthesized relighting image from the real image. To verify the accuracy of the relighting algorithm, a comparison between the synthesized image and the real photograph is essential. Generally speaking, taking the difference pixel by pixel is the most accurate way to measure the difference between two images. Most previous image measurement methods concentrate on the effect of contrast and sharpness on human perception; they compute the perceivable difference of an image pair, as in the CIELAB color model[19] and the CIE2000 color difference formula[20]. Those methods are not suitable for our verification because they focus on human perception, while our goal is to measure the quality and effectiveness of the relighting algorithm. We therefore design a new approach to evaluate the result of this framework. Given the purpose of this module, the luminance property of the image is important, so the histograms of the R, G and B channels as well as the luminance of the two images are compared. In addition to the histograms, the mean value and variance of the two images, two widely used statistical parameters, are calculated. The analysis results for the relighting result and the real photo are shown in Table 1 and Figure 6. As the analysis results and Figure 5 show, the position and intensity of the specular reflections and the self-occlusion shadows on the target surface under arbitrary illumination are simulated precisely, and the error between the image pair is limited to a very small fraction. This experiment shows that the method used for image-based relighting is accurate and efficient.


Fig. 6. Histogram Analysis. Histogram comparison of the image pair in red channel (left above), green channel (right above), blue channel (left below) and luminance (right below).
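The histogram, mean and variance comparison described above can be sketched as follows. The Rec. 601 luminance weights and the simple relative difference of means are our assumptions, since the paper does not state the exact formulas behind Table 1.

```python
import numpy as np

def channel_stats(img):
    """Per-channel histogram, mean and variance of an 8-bit RGB image,
    plus a luminance channel (Rec. 601 weights, assumed here)."""
    img = img.astype(np.float64)
    lum = 0.299 * img[..., 0] + 0.587 * img[..., 1] + 0.114 * img[..., 2]
    stats = {}
    for name, ch in zip(('red', 'green', 'blue', 'luminance'),
                        (img[..., 0], img[..., 1], img[..., 2], lum)):
        hist, _ = np.histogram(ch, bins=256, range=(0, 256))
        stats[name] = {'hist': hist, 'mean': ch.mean(), 'var': ch.var()}
    return stats

def compare(img_a, img_b):
    """Relative difference of the mean value between a relit image and the
    corresponding real photograph, per channel (cf. Table 1)."""
    sa, sb = channel_stats(img_a), channel_stats(img_b)
    return {k: abs(sa[k]['mean'] - sb[k]['mean']) / 255.0 for k in sa}

# Toy usage with random stand-ins for the synthesized and real images.
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=(64, 64, 3))
b = rng.integers(0, 256, size=(64, 64, 3))
print(compare(a, b))
```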

Table 1. Statistical results of the image pair.

Channel     Mean value (ours / real)   Variance (ours / real)   Error
Red         127 / 128                  89.16 / 93.29            0.03%
Green       115 / 115                  89.08 / 94.54            0.03%
Blue        120 / 122                  90.81 / 95.77            0.02%
Luminance   119 / 119                  81.94 / 88.04            0.02%

Because the novel view algorithm relies on interpolation and extrapolation, it is not well suited to a purely quantitative evaluation. Instead, the criterion is that the result looks visually plausible, that the geometric distortion of the target stays within an acceptable range, and that the intensity, specular reflections and shadow details are preserved in the novel image. Figure 7 shows the results of this module, which are also the final output of the framework.

Fig. 7. Novel View Results. The top-left image is the reference image and the bottom-left image is the generated source image. The remaining images are synthesized novel views obtained by moving the camera around the Z-axis of the target from left to right. In the bottom-right image, where the camera position changes over a large range, the distortion of the target becomes obvious and unacceptable.

The efficiency of the framework is also of concern. The test machine is a Dell Dimension E520 with an Intel Pentium D 915 dual-core processor, 1 GB of dual-channel DDR2 memory and a 160 GB SATA hard disk. The processing time of the framework depends on the resolution of the basis images; in this experiment the resolution is fixed at 512 x 342. During relighting, the processing time also depends on the virtual light configuration, such as the type, color, shape and distance of the light source. In the worst case, all 36 x 17 basis images must be combined, which takes more than 70 seconds. In practice, however, only part of the basis images is needed for a given light configuration; for the virtual ring light source shown in Figure 4, the processing time is about 12 seconds. The novel view stage only re-projects and computes each pixel of the result image once, so it takes less than one second, which is close to real time.

4 Conclusion and Future Work

In this paper we present a framework for observing a target object in a way that satisfies several practical requirements. We choose image-based methods so that no 3D information or model of the object has to be computed, and yet results can be synthesized under arbitrary illumination conditions and view positions. Under various light conditions and viewpoints, different details on the object's surface can be observed clearly, especially the specular and shadow details of the target. The accuracy and efficiency of the framework are acceptable. In future work, we plan to accelerate the relighting process, improve the accuracy of the novel view method, and extend the novel view algorithm so that the target can be observed globally.

Acknowledgments. This research is sponsored by the 863 National High Technology R&D Program of China (No. 2006AA01Z307) and the National Natural Science Foundation of China (No. 60403044).

References

1. James T. Kajiya: Anisotropic reflection models. SIGGRAPH '85: Proceedings of the 12th annual conference on Computer graphics and interactive techniques. ACM Press, New York, NY, USA (1985) 15–21
2. Henrik Wann Jensen, Stephen R. Marschner, Marc Levoy, Pat Hanrahan: A practical model for subsurface light transport. SIGGRAPH '01: Proceedings of the 28th annual conference on Computer graphics and interactive techniques. ACM Press, New York, NY, USA (2001) 511–518
3. Zhoulin Lin, Tien-Tsin Wong, Heung-Yeung Shum: Relighting with the Reflected Irradiance Field: Representation, Sampling and Reconstruction. IEEE Computer Vision and Pattern Recognition, Vol. 49. Springer Netherlands (2002) 229–246
4. J. Nimeroff, E. Simoncelli, J. Dorsey: Efficient rerendering of naturally illuminated environments. In Eurographics Rendering Workshop 1994, Darmstadt, Germany, June 1994. Springer-Verlag (1994)
5. Paul Debevec, Tim Hawkins, Chris Tchou, Haarm-Pieter Duiker, Westley Sarokin, Mark Sagar: Acquiring the reflectance field of a human face. SIGGRAPH '00: Proceedings of the 27th annual conference on Computer graphics and interactive techniques. ACM Press/Addison-Wesley Publishing Co., New York, NY, USA (2000) 145–156
6. Adelson, E. H., Bergen, J. R.: The plenoptic function and the elements of early vision. Computational Models of Visual Processing (1991) 3–20
7. Debevec, P. E., Taylor, C. J., Malik, J.: Modeling and rendering architecture from photographs: A hybrid geometry and image-based approach. In Proc. SIGGRAPH '96. ACM Press, New Orleans, LS, USA (1996) 11–20
8. A. Criminisi, I. Reid, A. Zisserman: Single View Metrology. International Journal of Computer Vision, Vol. 40. Springer Netherlands (2000) 123–148
9. Byong Mok Oh, Max Chen, Julie Dorsey, F. Durand: Image-based modeling and photo editing. SIGGRAPH '01: Proceedings of the 28th annual conference on Computer graphics and interactive techniques. ACM Press, New York, NY, USA (2001) 433–442
10. Thaddeus Beier, Shawn Neely: Feature-Based Image Metamorphosis. ACM SIGGRAPH Computer Graphics, Vol. 26. ACM Press, New York, NY, USA (1992) 35–42
11. Steven M. Seitz, Charles R. Dyer: Physically-Valid View Synthesis By Image Interpolation. IEEE Workshop on Representation of Visual Scenes. IEEE Computer Society, Los Alamitos, CA, USA (1995) 18
12. S. M. Seitz, C. R. Dyer: View Morphing. SIGGRAPH '96: Proceedings of the 23rd annual conference on Computer graphics and interactive techniques. ACM Press, New York, NY, USA (1996) 21–30
13. Marc Levoy, Pat Hanrahan: Light field rendering. SIGGRAPH '96: Proceedings of the 23rd annual conference on Computer graphics and interactive techniques. ACM Press, New York, NY, USA (1996) 31–42
14. Steven J. Gortler, Radek Grzeszczuk, Richard Szeliski, Michael F. Cohen: The lumigraph. SIGGRAPH '96: Proceedings of the 23rd annual conference on Computer graphics and interactive techniques. ACM Press, New York, NY, USA (1996) 43–54
15. S. Peleg, J. Herman: Panoramic Mosaic by Manifold Projection. IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, Los Alamitos, CA, USA (1997) 338
16. http://www.cs.unc.edu/~blloyd/comp290-089/fmatrix/
17. S. Avidan, A. Shashua: Unifying two-view and three-view geometry. In ARPA Image Understanding Workshop (1997)
18. S. Avidan, A. Shashua: Novel View Synthesis by Cascading Trilinear Tensors. IEEE Transactions on Visualization and Computer Graphics (TVCG), Vol. 4. IEEE Computer Society, Los Alamitos, CA, USA (1998) 293–306
19. B. Hill, Th. Roger, F. W. Vorhagen: Comparative analysis of the quantization of color spaces on the basis of the CIELAB color-difference formula. ACM Trans. Graph. ACM Press, New York, NY, USA (1997) 109–154
20. Luo, M. R., Cui, G., Rigg, B.: The development of the CIE 2000 colour-difference formula: CIEDE2000. Color Research and Application (2001) 26:340–350
