A Practical Single Image Based Approach for Estimating Illumination Distribution from Shadows

Taeone Kim and Ki-Sang Hong
Department of Electronic and Electrical Engineering, POSTECH
San 31, Hyojadong, Namgu, Pohang, 790-784, Republic of Korea
{kimm, hongks}@postech.ac.kr

Abstract

This paper presents a practical method that estimates illumination distribution from shadows, where the shadows are assumed to be cast on a textured, Lambertian surface. Previous methods usually require that the reflectance property of the surface be constant or uniform, or need an additional image to cancel out the effects of the varying albedo of the textured surface. We deal with an estimation problem for which surface albedo information is not available. In this case, the estimation problem corresponds to an underdetermined one. We show that the combination of regularization by correlation and some user-specified information can be a practical method for solving the problem. In addition, as an optimization tool for solving the problem, we develop a constrained Non-Negative Quadratic Programming (NNQP) technique into which not only regularization but also user-specified information are easily incorporated. We test and validate our method on both synthetic and real images and present experimental results.

1. Introduction

Illumination information can be used to produce better results in various applications, in particular computer graphics applications such as Augmented Reality (AR) [1]. AR techniques insert virtual objects seamlessly into video or images so that the virtual objects appear to be part of the original scene. The realism of virtual objects is much enhanced if we estimate the scene illumination and use it for realistic rendering of the virtual objects into the real scene [2, 3, 4]. There has been much research interest in the computer vision community in methods for estimating illumination from shading, specular reflection, or shadows of objects [2, 5, 4, 6]. Yang et al. [6] and Wang et al. [4] proposed methods for detecting a small number of light source directions using critical points induced by the occluding boundaries of an object. Ramamoorthi et al. [7] proposed a signal-processing framework based on a spherical harmonic representation for inverse rendering. However, these approaches usually assume that the surface is not textured. Sato et al. [2] proposed a method for estimating the illumination distribution from shadows; they estimated it by analyzing shadows cast by an object of known shape on a planar surface. The method, however, requires that the reflectance property of the planar surface be uniform over the surface, or needs an additional image to cancel out the effects of the varying albedo of the surface. Closely related to our work, Li et al. [5] proposed a method that integrates multiple cues from shading, shadow, and specular reflections in textured environments. However, the method seems to be applicable only to a specific type of light source (e.g., a small number of discrete point light sources).

Our concern is what can be done for illumination estimation when only a video or image acquired in the past is available. This is important, for example, when we are required to render virtual objects so that they look like part of the real scene in the video or image to produce better AR results, as in [2, 3]. In this paper, we propose a practical method that estimates the illumination distribution of a real scene from shadows using a single image which includes the shadows of an object on a textured planar surface. For the textured surface, Sato et al. [2] require two images, the shadow and surface images, the latter of which provides surface albedo information. The surface image is captured without the object, but it is not always obtainable, for example, when we have only a shadow image or the object is not easily removable from the scene. Our method uses only a single shadow image for illumination estimation. In this case, the estimation problem becomes an underdetermined system. In general, it is difficult to distinguish variations due to varying albedo from those due to changing shadow intensity when the albedo and shadow variations are correlated with each other. However, when those two variations are only weakly correlated or not correlated at all, we show that regularization by correlation can be used for resolving the underdetermined problem to some extent. In addition, to handle the case that regularization by correlation is not sufficient for resolving the problem completely, we also propose a practical method that combines some user-specified information with the regularization by correlation. As a mathematical optimization tool for our proposed problem, we develop a simple, constrained Non-Negative Quadratic Programming (NNQP) technique into which not only regularization but also multiple linear constraints induced by user-specified information are easily incorporated.

In Section 2, our illumination estimation problem is defined, and a practical method for solving the problem is proposed in Section 3. In Section 4, we test and validate the proposed method on synthetic data both qualitatively and quantitatively and also present qualitative results on real data. Finally, we conclude in Section 5.

2. Problem definition

2.1. Illumination estimation from shadows

The problem of illumination estimation from shadows requires a brief explanation. Let us assume that an object of known shape is illuminated by infinitely distant light sources, casting its shadows onto a planar Lambertian surface (see Fig. 1). In general, the shadows are induced by the occlusion of incoming light rays by the object. Sato et al. [2] modelled this process by the following shadow image formation equation:

p(x_i, y_i) = K(x_i, y_i) \sum_{j=1}^{n} L(\theta_j, \phi_j) \, S(\theta_j, \phi_j, x_i, y_i) \, \omega_j \cos\theta_j,

where (x_i, y_i) is the i-th image pixel coordinate, p(x_i, y_i) the pixel value, K(x_i, y_i) the albedo, L(\theta_j, \phi_j) the illumination radiance, S(\theta_j, \phi_j, x_i, y_i) the occlusion coefficient, and \omega_j the solid angle. They uniformly discretized the hemisphere to represent light source directions (note that the index j represents each discretized light source direction).

Figure 1: An example: shadow image vs. surface image; (a) Shadow image; (b) Surface image.

For future reference, we define the abbreviated notation p_i \triangleq p(x_i, y_i), K_i \triangleq K(x_i, y_i), L_j \triangleq L(\theta_j, \phi_j), S_{ij} \triangleq S(\theta_j, \phi_j, x_i, y_i) and \cos_j \triangleq \omega_j \cos\theta_j. Note that the occlusion of incoming light rays is described by S_{ij}: if the illumination radiance L_j directed toward the point (x_i, y_i) is occluded by the object, S_{ij} = 0, and otherwise S_{ij} = 1. In addition, by removing the occluding object from the scene (note that S_{ij} always becomes 1 in this case), we obtain the surface image p'_i \triangleq p'(x_i, y_i), where

p'_i = K_i \sum_{j=1}^{n} L_j \cos_j.    (1)

In general, for textured planar surfaces, the albedo value K_i varies with the pixel coordinate. If we have both the shadow image and the surface image obtained under the same illumination condition, the albedo term can be cancelled out simply by dividing [2]:

p_i / p'_i = \left( K_i \sum_{j=1}^{n} L_j S_{ij} \cos_j \right) \Big/ \left( K_i \sum_{k=1}^{n} L_k \cos_k \right)
           = \sum_{j=1}^{n} \frac{L_j}{\sum_{k=1}^{n} L_k \cos_k} \, S_{ij} \cos_j = \sum_{j=1}^{n} l_j S_{ij} \cos_j,    (2)

where l_j \triangleq L_j / \sum_{k=1}^{n} L_k \cos_k is an illumination radiance normalized by the overall illumination magnitude. Cascading m pixel values yields the following matrix representation:

\begin{bmatrix} p_1/p'_1 \\ \vdots \\ p_m/p'_m \end{bmatrix}
=
\begin{bmatrix} S_{11}\cos_1 & \cdots & S_{1n}\cos_n \\ \vdots & \ddots & \vdots \\ S_{m1}\cos_1 & \cdots & S_{mn}\cos_n \end{bmatrix}
\begin{bmatrix} l_1 \\ \vdots \\ l_n \end{bmatrix},    (3)

or, compactly, p_r = S l_r. If m > n, the non-negative vector l_r = [l_1, \cdots, l_n]^T in Eq. (3) can be solved for by the Non-Negative Least Squares (NNLS) method [8] or by the NNQP method [9] explained in Section 3.5.
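For concreteness, the following is a minimal numerical sketch of solving Eq. (3) when the surface image is available. The function name, the array layout, and the use of scipy.optimize.nnls are implementation choices for illustration; the paper itself only states that NNLS [8] or the NNQP method of Section 3.5 can be used.

```python
# Sketch: solving Eq. (3) for the normalized radiance l_r when the surface
# image is available.  S_occ holds the binary occlusion coefficients S_ij and
# cos_w the solid-angle-weighted cosines (cos_j = omega_j * cos(theta_j)).
import numpy as np
from scipy.optimize import nnls

def estimate_radiance_with_albedo(p_shadow, p_surface, S_occ, cos_w):
    """p_shadow, p_surface: (m,) pixel values; S_occ: (m, n) in {0, 1};
    cos_w: (n,).  Returns the non-negative normalized radiance l_r."""
    p_ratio = p_shadow / p_surface      # left-hand side of Eq. (3); albedo cancels
    S = S_occ * cos_w[None, :]          # matrix of Eq. (3): entries S_ij * cos_j
    l_r, _ = nnls(S, p_ratio)           # non-negative least squares [8]
    return l_r
```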

2.2. Unknown albedo: an underdetermined system

As explained in the previous section, albedo information is easily obtainable by manually removing the occluding object from the scene, and the illumination distribution is then computed by solving Eq. (3). However, we cannot construct Eq. (3) when the surface image or albedo information is not obtainable. With the shadow image only, we can construct, at most, the following matrix equation:

p = diag(k) S l = K S l,    (4)

where

p = [p_1, \cdots, p_m]^T, \quad l = [L_1, \cdots, L_n]^T, \quad k = [K_1, \cdots, K_m]^T, \quad K = diag(k),

S is the m \times n matrix of Eq. (3), and diag(k) produces a diagonal matrix whose diagonal elements are those of k. Note that k represents the texture variations and is called the albedo image, while Sl represents the shadows and is called the albedo-free shadow image. Combined, they produce the shadow image p. We want to estimate the non-negative quantities k and l, given the known quantities p and S. However, with this setting the solution is not unique but ambiguous, because there are a total of m+n unknown variables in k and l to be estimated while we have only m linear equations. As a result, we need some other information to solve the problem.

3. Proposed method

3.1. Reformulation in quadratic minimization

Eq. (4) can be recast as a quadratic cost function:

\min_{l, K} \; \frac{1}{2} (p - KSl)^T (p - KSl).    (5)

The above cost function has a bilinear quadratic form. With K fixed, the cost function becomes quadratic in l. If we can estimate the illumination radiance l, the albedo image k is automatically computed. We assume that the pixel intensity values p_i are normalized to the interval [0, 1], the albedo values K_i lie in [0, 1], and the radiance values L_j are non-negative. The minimization of the cost function can be performed by the following iterative procedure:

• Initialize k^{(0)} (or K^{(0)}).
• Iterate the procedure below until e^{(i)} < \epsilon_0:
  – Compute l^{(i)} using the NNQP algorithm (see Section 3.5).
  – Compute k^{(i)} (see Eq. (6)).
  – Compute the error: e^{(i)} = (p - K^{(i)} S l^{(i)})^T (p - K^{(i)} S l^{(i)}).
  – i = i + 1.

Here \epsilon_0 is a threshold value and l^{(i)} denotes the radiance estimate at the i-th iteration. At each iteration, the computation of k should satisfy

K_i = \begin{cases} p_i / (Sl)_i & \text{if } p_i / (Sl)_i < 1, \\ 1 & \text{if } p_i / (Sl)_i \ge 1, \end{cases}    (6)

where (a)_i denotes the i-th element of the column vector a. As a result, Eq. (5) can be minimized with respect to K and l. However, as mentioned in the previous section, an ambiguity problem arises: in general, we cannot obtain the desired solution using only a single shadow image.
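As an illustration, here is a minimal sketch of the alternating procedure above. It assumes S is the cos-weighted m x n matrix of Eq. (3)/(4), and that solve_nnqp(A, b) returns a non-negative minimizer of (1/2)v^T A v + b^T v (for instance, the routine sketched after Section 3.5). The tolerance and iteration cap are arbitrary choices of this sketch, not values from the paper.

```python
# Sketch of the alternating minimization of Eq. (5); not the authors' code.
import numpy as np

def estimate_l_and_k(p, S, solve_nnqp, eps0=1e-6, max_iter=50):
    """p: (m,) shadow image; S: (m, n) cos-weighted occlusion matrix.
    solve_nnqp(A, b): assumed non-negative QP solver."""
    m, n = S.shape
    k = np.ones(m)                                  # k^(0): start from uniform albedo
    l = np.zeros(n)
    for _ in range(max_iter):
        KS = k[:, None] * S                         # K^(i) S with K = diag(k)
        A = KS.T @ KS                               # quadratic term of Eq. (5), K fixed
        b = -(KS.T @ p)                             # linear term
        l = solve_nnqp(A, b)                        # l^(i)
        Sl = S @ l
        k = np.minimum(p / np.maximum(Sl, 1e-12), 1.0)  # Eq. (6): K_i = p_i/(Sl)_i, capped at 1
        r = p - k * Sl
        if float(r @ r) < eps0:                     # stop once e^(i) < epsilon_0
            break
    return l, k
```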

3.2. Regularization by correlation

If the albedo-free shadow image Sl and the albedo image k are heavily correlated with each other, it is difficult to estimate the illumination radiance. We assume that they are only weakly correlated or not correlated at all. This assumption seems reasonable, since such a coincidence of the two images rarely occurs in nature. This leads us to add a regularization term to Eq. (5):

\min_{l, K} \; \frac{1}{2} (p - KSl)^T (p - KSl) - \gamma \, p^T S l \quad \text{with } u^T l = 1,    (7)

where \gamma > 0 is a regularization parameter. The regularization term p^T Sl in Eq. (7) represents the correlation between p and Sl, and the illumination radiance l is constrained by the sum constraint u^T l = 1 to remove the scale ambiguity, where u = [\cos_1, \cdots, \cos_n]^T. Note that Eq. (7) is also in quadratic form and its minimization can be done using the iterative procedure described in the previous section. However, with a fixed \gamma, the minimization ends up with a degenerate solution pair (k, l) where k = p and l satisfies Sl = [1, \cdots, 1]^T. This occurs because the correlation between p and Sl is maximal at Sl = [1, \cdots, 1]^T; in fact, the value of (k, l) keeps changing until l satisfies Sl = [1, \cdots, 1]^T even after the error e = (p - KSl)^T (p - KSl) converges to 0. To avoid this degenerate case, \gamma is adjusted per iteration according to the distance between p and K^{(i)}S:

\gamma^{(i)} = \gamma^{(0)} \, \mathrm{dist}(p, K^{(i)} S),

where \mathrm{dist}(a, A) computes the distance between the vector a and the subspace spanned by the columns of A. Note that \gamma becomes 0 when the error e = 0; the regularization term then disappears and the solution pair no longer changes.
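A small sketch of the per-iteration adjustment of \gamma follows, computing dist(p, K^{(i)}S) as the least-squares residual of projecting p onto the column space of K^{(i)}S. The helper names are illustrative only.

```python
# Sketch: gamma^(i) = gamma^(0) * dist(p, K^(i) S), with dist() taken as the
# Euclidean distance from p to the column space of K^(i) S.
import numpy as np

def dist_to_column_space(a, A):
    x, *_ = np.linalg.lstsq(A, a, rcond=None)   # least-squares projection of a onto range(A)
    return float(np.linalg.norm(a - A @ x))

def adjusted_gamma(gamma0, p, k, S):
    KS = k[:, None] * S                          # K^(i) S
    return gamma0 * dist_to_column_space(p, KS)  # vanishes when p lies in range(K^(i) S)
```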

3.3. Incorporating user-specified information

When the surface is textured, there may be regions where p and Sl are partly correlated. This makes the use of correlation alone insufficient for solving the problem, because certain textured areas of the shadow image look like shadows. In this case, user-specified information should be provided about whether such ambiguous areas correspond to a shadow region or not. We accomplish this by designating that some pixel positions of the ambiguous areas are not shadowed. Mathematically, if the i-th pixel position is not shadowed, it can be represented by the linear constraint

s_i^T l = 1 \quad \text{if the i-th pixel position is not shadowed},

where s_i^T corresponds to the i-th row of the occlusion matrix S. Note that s_i^T l < 1 if the i-th pixel position is shadowed. In this way, q user-specified pixel positions which are not shadowed can be represented by q linear constraints: s_i^T l = 1 for i = 1, \cdots, q. Note that we assume the rows corresponding to the user-specified pixel positions are arranged as the first q rows of S. As a result, our quadratic cost function becomes

\min_{l, K} \; \frac{1}{2} (p - KSl)^T (p - KSl) - \gamma \, p^T S l,
\quad \text{subject to } u^T l = 1 \text{ and } s_i^T l = 1, \; i = 1, \cdots, q.    (8)

To minimize the above cost function, we develop a simple, multiple linear equality constrained NNQP technique in Section 3.5.
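As a small illustration, the q user-specified constraints and the sum constraint can be stacked into a single system B l = \beta for the constrained NNQP step. The function and variable names below are assumptions made for this sketch.

```python
# Sketch: stacking the user-specified non-shadow constraints s_i^T l = 1 with
# the scale constraint u^T l = 1 into B l = beta.
import numpy as np

def build_equality_constraints(S, u, nonshadow_idx):
    """S: (m, n) cos-weighted occlusion matrix of Eq. (4); u: (n,) vector of
    cos_j weights; nonshadow_idx: indices of user-specified non-shadow pixels."""
    B = np.vstack([S[nonshadow_idx, :],   # rows s_i^T for the q chosen pixels
                   u[None, :]])           # u^T = [cos_1, ..., cos_n]
    beta = np.ones(B.shape[0])            # every constraint target equals 1
    return B, beta
```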

3.4. Color image and parameterization

Up to now, we have considered the estimation problem in the gray-level setting. When we have a color shadow image as input, however, the cost function in Eq. (8) should take the RGB color bands into consideration. The shadow and albedo images are reparameterized as

p = [p_R \; p_G \; p_B]^T \quad \text{and} \quad K = diag([k_R, k_G, k_B]),

where p_R and k_R represent the R-band shadow image and albedo image, respectively. If we assume that the illumination sources are white, the albedo values (k_R)_i, (k_G)_i and (k_B)_i at the i-th pixel position can be parameterized as a scaled version of the RGB pixel intensity values:

[(k_R)_i \; (k_G)_i \; (k_B)_i] = \alpha \, [(p_R)_i \; (p_G)_i \; (p_B)_i],

where \alpha is a scale parameter.
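For the colour case, the three bands can be stacked into one system that shares the radiance l, as a rough sketch; the array shapes and this particular stacking are an implementation reading of the reparameterization above, not something the paper prescribes.

```python
# Sketch: building the colour version of Eq. (4), p = diag(k) S_rgb l, where
# all three bands share the same radiance l.
import numpy as np

def stack_color_system(p_rgb, k_rgb, S):
    """p_rgb, k_rgb: (m, 3) RGB shadow and albedo images; S: (m, n)
    cos-weighted occlusion matrix.  Returns stacked p, k and S_rgb."""
    p = p_rgb.T.reshape(-1)        # [p_R; p_G; p_B]
    k = k_rgb.T.reshape(-1)        # [k_R; k_G; k_B]
    S_rgb = np.vstack([S, S, S])   # same geometry for every band
    return p, k, S_rgb
```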

3.5. A multiple linear equality constrained NNQP technique

Sha et al. [9] considered the minimization of the quadratic objective function

F(v) = \frac{1}{2} v^T A v + b^T v    (9)

subject to the non-negativity constraints v_i \ge 0 for all i = 1, \cdots, n, where v = [v_1, \cdots, v_n]^T. The n \times n matrix A is assumed to be symmetric and positive semidefinite. Let us define the matrices A^+ and A^- as

A^+_{ij} = \begin{cases} A_{ij} & \text{if } A_{ij} > 0, \\ 0 & \text{otherwise}, \end{cases}
\qquad
A^-_{ij} = \begin{cases} |A_{ij}| & \text{if } A_{ij} < 0, \\ 0 & \text{otherwise}, \end{cases}

where A_{ij} is the (i, j)-th element of A. It follows trivially that A = A^+ - A^- and F(v) = F_a(v) + F_b(v) - F_c(v), where F_a(v) = \frac{1}{2} v^T A^+ v, F_b(v) = b^T v and F_c(v) = \frac{1}{2} v^T A^- v. Let us define the partial derivatives

a_i \triangleq \frac{\partial F_a}{\partial v_i} = (A^+ v)_i, \quad b_i \triangleq \frac{\partial F_b}{\partial v_i} = (b)_i, \quad c_i \triangleq \frac{\partial F_c}{\partial v_i} = (A^- v)_i.

Then the multiplicative update rule takes the form (refer to [9] for further details)

v_i \leftarrow v_i \left( \frac{-b_i + \sqrt{b_i^2 + 4 a_i c_i}}{2 a_i} \right).    (10)

Suppose that the elements of A are non-negative and those of b are non-positive (note that when the elements of b are also non-negative, the minimum of the cost function occurs trivially at v = [0, \cdots, 0]^T). Then, from the definitions of b_i and c_i and the non-negativity of v_i, the partial derivatives satisfy b_i \le 0 and c_i = 0 for all i, and the update rule of Eq. (10) simplifies to

v_i \leftarrow v_i \left( \frac{-b_i}{a_i} \right).    (11)

In fact, our estimation problem corresponds to this case. If we expand the cost function in Eq. (8) and compare it with Eq. (9), the following correspondence is obtained:

A \leftrightarrow S^T K^T K S, \quad b \leftrightarrow -S^T K^T p - \gamma S^T p, \quad v \leftrightarrow l.

Since the elements of K, S and p are all non-negative, the elements of A and b are non-negative and non-positive, respectively. As a result, we can use the simplified update rule of Eq. (11).

Incorporating the q + 1 linear equality constraints of Eq. (8) can be done using Lagrange multipliers:

F(v) = \frac{1}{2} v^T A v + b^T v + \sum_{k=1}^{r} \lambda_k (\beta_k^T v - \beta_k),    (12)

where \beta_k^T v = \beta_k \leftrightarrow s_k^T l = 1 for k = 1, \cdots, q and \beta_r^T v = \beta_r \leftrightarrow u^T l = 1; we set r = q + 1 for notational simplicity. Note that the effect of each Lagrangian term \lambda_k (\beta_k^T v - \beta_k) in Eq. (12) is to alter the coefficient of the term that is linear in v_i by an amount \lambda_k \beta_{ki}. As a result, the update rule of Eq. (11) is rewritten as

v_i \leftarrow v_i \left( \frac{-b_i(\lambda_1, \cdots, \lambda_r)}{a_i} \right),    (13)

where b_i(\lambda_1, \cdots, \lambda_r) \triangleq b_i + \lambda_1 \beta_{1i} + \cdots + \lambda_r \beta_{ri}. The updated v_i in Eq. (13) must satisfy all r linear equality constraints:

\sum_i \beta_{ki} \, v_i \left( \frac{-b_i(\lambda_1, \cdots, \lambda_r)}{a_i} \right) = \beta_k \quad \text{for } k = 1, \cdots, r.    (14)

Since Eq. (14) is linear in (\lambda_1, \cdots, \lambda_r), the Lagrange multipliers that satisfy the r constraints are easily computed using a linear method such as the Singular Value Decomposition (SVD).
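The following is a compact sketch of the constrained multiplicative-update scheme described above, under the paper's assumption that A has non-negative entries and b non-positive entries so that Eq. (11) applies. The lstsq-based multiplier solve, the clamp that keeps the update factor non-negative in floating point, and the fixed iteration count are choices of this sketch, not prescriptions of the paper.

```python
# Sketch: multiplicative updates for  min 0.5 v^T A v + b^T v
# subject to v >= 0 and B v = c (rows of B are the beta_k^T, c_k the targets).
import numpy as np

def nnqp_constrained(A, b, B, c, v0=None, n_iter=200, eps=1e-12):
    n = A.shape[0]
    v = np.full(n, 1.0 / n) if v0 is None else v0.astype(float).copy()
    for _ in range(n_iter):
        a = A @ v + eps                       # a_i = (A^+ v)_i; here A^- = 0
        w = v / a                             # common factor v_i / a_i
        # Choose the Lagrange multipliers so the updated v meets Eq. (14):
        #   sum_i beta_ki * v_i * (-b_i(lambda)) / a_i = c_k   for every k
        Bw = B * w[None, :]                   # rows: beta_ki * v_i / a_i
        M = -(Bw @ B.T)                       # M_kj = -sum_i beta_ki beta_ji v_i / a_i
        rhs = c + Bw @ b                      # c_k + sum_i beta_ki v_i b_i / a_i
        lam = np.linalg.lstsq(M, rhs, rcond=None)[0]   # SVD-based linear solve
        b_lam = b + B.T @ lam                 # b_i(lambda) = b_i + sum_k lambda_k beta_ki
        v = v * np.maximum(-b_lam, 0.0) / a   # update of Eq. (13), kept non-negative
    return v
```

In the estimation loop of Section 3.1, this routine would be called with A = S^T K^T K S, b = -S^T K^T p - \gamma S^T p, and (B, c) built from the user-specified constraints together with u^T l = 1.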

4. Experimental results

We have performed experiments on both synthetic and real images. The following three methods are tested and compared:

• Method 1: Sato et al.'s method without albedo information (see Eq. (4)).
• Method 2: Regularization by correlation (see Eq. (7)).
• Method 3: Regularization by correlation + user-specified information (see Eq. (8)).

Note that Method 1 sets the albedo information to k = [1, \cdots, 1]^T and solves for the illumination radiance vector l in Eq. (4) using the SVD method. Methods 2 and 3 adopt the iterative procedure described in Section 3.1 to minimize Eq. (7) and Eq. (8), respectively.


Figure 2: Left: Generation of a synthetic shadow image; (a) Surface image; (b) Synthetically textured surface image; (c) Synthetic shadow image rendered by using the illumination radiance shown in Fig. 3(a). Right: Magnification of the rectangle region; (d) Surface image; (e) Shadow image used for analysis (the rectangle region in (c)); (f) 4 user-specified points denoted by ’×’ which are used for Method 3.

4.1. Experiments on synthetic data

Synthetic data generation: For synthetic data generation, we generated a synthetically textured surface image (Fig. 2(b)) by adding a checker pattern to the surface image shown in Fig. 2(a), and then rendered a shadow image (Fig. 2(c)) by placing an occluding box of known shape illuminated by the illumination radiance shown in Fig. 3(a) (note that the illumination directions are represented by n = 305 uniformly discretized vertices on the hemisphere).

Figure 3: Left: Visualization of the estimated illumination radiance (upper view of the hemisphere); (a) True; (b) Method 1; (c) Method 2; (d) Method 3. Right: The corresponding estimated albedo-free shadow images Sl (note that the occluding box is masked out).

Results: We applied the three methods, Method 1, Method 2, and Method 3, to the shadow image shown in Fig. 2(e). For Methods 2 and 3, the albedo image is initialized to k^{(0)} = [1, \cdots, 1]^T and the initial regularization parameter to \gamma^{(0)} = 0.05. The optimization procedure usually converges within 2-3 iterations. The results of the illumination radiance vector l estimated by each method are visualized in the left column of Fig. 3. It is difficult to perform a direct comparison between the true and estimated illumination radiance because there is an ambiguity in illumination estimated from a single viewpoint, as indicated in [3]. Therefore, we evaluate the quality of the estimated illumination indirectly by comparing the albedo-free shadow images produced by the estimated illumination. The albedo-free shadow images are displayed in the right column of Fig. 3. Note that the albedo-free image obtained by Method 1, shown in Fig. 3(f), is very noisy compared to the images in Fig. 3(g) and (h). Furthermore, note that when we use only regularization by correlation (Method 2), some false shadows appear in the upper-right region of Fig. 3(g). These false shadows occur because there is a texture region which looks like shadows, and this texture region is not distinguishable from shadows when Method 2 is applied. To remove the false shadows, we supply some user-specified points (the points marked by '×' in Fig. 2(f)) which indicate that those points should lie in non-shadow regions. From Fig. 3(h), we see that Method 3 prevents those false shadows from appearing.

4.2. Experiments on real data

For the real data, we captured both the shadow and surface images without a change of illumination conditions and display them in Fig. 1. Sato et al.'s method using both images (the shadow and surface images) is applied, and its results are used as a reference for comparison. Method 1's estimated albedo-free shadow image in Fig. 4(e) is very noisy and its corresponding illumination radiance in Fig. 4(b) is far from the reference illumination in Fig. 4(a), while the results obtained by the proposed method (Method 2) are promising (Fig. 4(c) and (f)). Note that we do not need to provide user-specified information here, because the surface texture is almost uncorrelated with the shadows, and so regularization by correlation alone is sufficient.

Figure 4: Left: Visualization of the estimated illumination radiance; (a) Sato et al.'s method using two images; (b) Method 1; (c) Method 2. Right: The corresponding estimated albedo-free shadow images Sl (note that \gamma^{(0)} = 0.5).

Finally, we inserted a virtual box into the shadow image shown in Fig. 1. The shadows of the box are synthetically generated using the estimated illumination radiance; compare (a) with (b) in Fig. 5. The synthetic shadows in (b) match the neighboring real shadows better than those in (a) do. This shows that our method can be successfully applied to realistic rendering of virtual objects without albedo information.


Figure 5: An AR example: the shadows of the virtual box are generated by the illumination estimated by (a) Method 1 and (b) Method 2, respectively.

5. Conclusion

We have presented a practical method that estimates the illumination distribution from shadows using only a single shadow image. The method combines some user-specified information with regularization by correlation. In addition, as a mathematical optimization tool for the resulting problem, we have developed a simple, multiple linear equality constrained NNQP technique.

References

[1] Ronald T. Azuma. A survey of augmented reality. Presence: Teleoperators and Virtual Environments, 6(4):355–385, August 1997.
[2] I. Sato, Y. Sato, and K. Ikeuchi. Illumination distribution from shadows. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, pages 306–312, 1999.
[3] T. Kim, Y. Seo, and K. Hong. Improving AR using shadows arising from natural illumination distribution in video sequences. In Proc. Int. Conf. on Computer Vision, pages 329–334, 2001.


[4] Y. Wang and D. Samaras. Estimation of multiple illuminants from a single image of arbitrary known geometry. In Proc. European Conf. on Computer Vision, page III: 272 ff., 2002.
[5] Y. Li, S. Lin, H. Lu, and H.Y. Shum. Multiple-cue illumination estimation in textured scenes. In Proc. Int. Conf. on Computer Vision, pages 1366–1373, 2003.
[6] Y. Yang and A.L. Yuille. Sources from shading. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, pages 534–539, 1991.


[7] R. Ramamoorthi and P. Hanrahan. A signal-processing framework for inverse rendering. In Proc. SIGGRAPH, pages 117–128, 2001.
[8] C. L. Lawson and R. J. Hanson. Solving Least Squares Problems. Prentice-Hall, 1974.
[9] F. Sha, L. Saul, and D. Lee. Multiplicative updates for large margin classifiers. In Proc. Sixteenth Annual Conference on Computational Learning Theory, 2003.


