Surface topography using shape-from-shading

Pattern Recognition 34 (2001) 823-840

Philip L. Worthington, Edwin R. Hancock*
Department of Computer Science, University of York, York Y01 5DD, UK
Received 14 April 1999; received in revised form 18 February 2000; accepted 18 February 2000

Abstract

This paper demonstrates how a recently reported shape-from-shading scheme can be used to extract topographic information from 2D intensity imagery (Worthington and Hancock, IEEE Trans. Pattern Anal. Mach. Intell. 21 (1999) 1250-1267). The shape-from-shading scheme has two important features which enhance the recovered surface description. Firstly, it uses a geometric update procedure which allows the image irradiance equation to be satisfied as a hard constraint. This not only improves the data closeness of the recovered needle map, but also removes the necessity for extensive parameter tuning. Secondly, we use curvature information to impose topographic constraints on the recovered needle map. The topographic information is captured using the shape index of Koenderink and van Doorn (Image Vision Comput. 10 (1992) 557-565) and consistency is imposed using a robust error function. We show that the new shape-from-shading scheme leads to a meaningful topographic labelling of 3D surface structures. Moreover, the resulting topographic information proves to be useful in a simple histogram-based object recognition scheme. © 2001 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Shape-from-shading; Surface topography; Robust statistics

* Corresponding author. Tel.: +44-1904-43-3374; fax: +44-1904-43-2767. E-mail address: [email protected] (E.R. Hancock).

0031-3203/01/$20.00 © 2001 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. PII: S0031-3203(00)00036-4

1. Introduction

Marr identified shape-from-shading (SFS) as one of the key routes to understanding 3D surface structure via the 2½D sketch [1,2]. The process has been an active area of research for over two decades. It is concerned with recovering 3D surface shape from shading patterns. The subject has been tackled in a variety of ways since the pioneering work of Horn and his coworkers in the 1970s [3,4]. The classical approach to shape-from-shading is couched as an energy minimisation process using the apparatus of variational calculus [4,5]. Here the aim is to iteratively recover a needle map representing local surface orientation by minimising an error functional. The functional contains a data-closeness term and a regularising term that controls the smoothness of the recovered needle map. Since the recovery of the needle map is underconstrained, the variational equations must be augmented with boundary constraints.

Despite considerable progress in the recovery of needle maps using shape-from-shading [6-10], there are few examples of the use of the method for 3D surface analysis and recognition from 2D imagery [11]. One of the reasons for this is the lack of surface detail contained in the needle map. This can be attributed to the fact that most shape-from-shading schemes oversmooth the recovered needle map. This is a disappointing omission since there is strong psychophysical evidence that shading information is an important shape cue [12-19].

1.1. Shape-from-shading

From the perspective of object recognition, existing work on shape-from-shading can be criticised on two counts. Firstly, the data-closeness constraint invariably plays a relatively weak role in the recovery of the needle map, whilst the smoothness error dominates the process. Secondly, the modelling of the smoothness constraint is extremely simplistic. Most practical schemes opt for a quadratic regulariser operating upon the directional derivatives of the needle map [5,20]. This leads to the oversmoothing of fine surface detail and a failure to capture the differential or topographic structure of realistic surfaces.

We have recently developed an improved framework for shape-from-shading in which the dual deficiencies of poor data closeness and simplistic consistency constraints are redressed [21]. The algorithm is based on a geometric interpretation of the ambiguity structure of the image irradiance equation in the underconstrained conditions that apply in shape-from-shading. At each image location, the available intensity information together with the physics of the image irradiance equation mean that the recovered surface normal must fall on a cone of ambiguity. The axis of the cone points in the light source direction. The apex angle of the cone is determined by the local image brightness. By confining each normal to its cone, we satisfy the image irradiance equation as a hard constraint. We can impose organisation on neighbouring surface normals by allowing them to rotate on their respective cones of ambiguity so as to satisfy curvature consistency constraints. We use the shape index of Koenderink and van Doorn to impose topographic constraints on the local needle map. This is done by using the local variance of the shape index to control the width of an adaptive robust regulariser.

1.2. Paper motivation and outline

The aim in this paper is to explore the use of the previously reported shape-from-shading scheme [21] for the topographic analysis of 3D surfaces from 2D intensity images. Although topographic analysis is a routine procedure in range imagery [22], there has been little effort directed at extracting topographic structure using shape-from-shading. One notable exception is the work of Ferrie and Lagarde [23]. Here the curvature consistency process of Sander and Zucker [24] is applied to the needle map as a post-processing step so as to improve the organisation of the field of principal curvature directions. There is no attempt to exploit curvature consistency constraints in the recovery of needle maps via the image irradiance equation.

As noted above, our modelling of curvature consistency is based around Koenderink and van Doorn's shape index [25]. This is a scale-invariant measure which captures the different topographic classes using a continuous angular variable. Using the shape index allows surfaces to be segmented into meaningful topographic structures such as ridges or valleys, saddle points or lines, and domes or cups. These structures can be further organised into simply connected elliptical or hyperbolic regions which are separated from one another by parabolic lines.

Our consistency model uses robust error kernels to capture acceptable local variations in the shape index. The model encourages parabolic structures (i.e. ridges and ravines) to be thin and contour-like. Hyperbolic and elliptical structures (domes, cups, etc.) are encouraged to form contiguous regions.

The outline of this paper is as follows. In Section 2 we review the original variational framework for shape-from-shading developed by Horn and Brooks. Section 3 reviews our new geometric framework for shape-from-shading, which imposes compliance with the image irradiance equation as a hard constraint. In Section 4 we show how needle map smoothness can be incorporated into the new framework. This idea is developed further in Section 5, where we describe our modelling of curvature consistency via the shape index of Koenderink and van Doorn. In Section 6 we evaluate the effectiveness of the new algorithm for extracting topographic information from a variety of 2D intensity imagery. We illustrate the usefulness of the information provided by the new shape-from-shading scheme for object recognition in Section 7. Finally, Section 8 offers some conclusions and provides some suggestions for further work.

2. Variational approach to SFS of Horn and Brooks

Central to shape-from-shading is the idea that local regions in an image $E(x, y)$ correspond to illuminated patches of a piecewise continuous surface whose height function is $z(x, y)$. The measured brightness $E(x, y)$ will vary depending on the material properties of the surface (whether matte or specular, and its albedo), the orientation of the surface at the co-ordinates $(x, y)$, and the direction of illumination. The reflectance map, $R(p, q)$, characterises these properties and provides an explicit connection between the image brightness and the surface orientation. The surface orientation is characterised by the components of the surface gradient in the x and y directions, i.e. $p = \partial z/\partial x$ and $q = \partial z/\partial y$. The shape-from-shading problem is to recover the surface height function $z(x, y)$ from the image $E(x, y)$. As an intermediate step, we may attempt to recover a set of surface normals, or needle map, describing the orientations of surface patches which locally approximate the height function $z(x, y)$.

To simplify the problem, most research has concentrated on recovering ideal Lambertian surfaces illuminated by a single point source located at infinity [26]. A Lambertian surface has a matte appearance and reflects incident light uniformly in all directions. Hence, the light reflected by a surface patch in the direction of the viewer depends only on the orientation of the patch relative to the light source direction (it is proportional to the cosine of the angle between them) and is independent of viewing angle. If $\mathbf{n} = (-p, -q, 1)^T$ is the local unit surface normal and $\mathbf{s} = (-p_l, -q_l, 1)^T$ the global light source direction, then the reflectance function is given by $R(p, q) = \mathbf{n} \cdot \mathbf{s}$. The image irradiance equation for Lambertian surfaces states that the measured brightness of the image is proportional to the radiance at the corresponding point on the surface, which is $R(p, q)$. Normalising both image intensity and reflectance map, the constant of proportionality becomes unity, and the image irradiance equation is simply

$$E(x, y) = R(p, q). \tag{1}$$

This equation succinctly describes the mapping between the x, y co-ordinate space of the image and the p, q co-ordinates of the gradient space of the surface. However, it provides insufficient constraints for the unique recovery of the needle map. Additional constraints, based on assumptions about the structure of the recovered surface, must be utilised. Invariably, it is smoothness of the needle map that is assumed. Hence, the goal is to recover the smoothest surface satisfying the image irradiance equation. This is posed as a variational problem in which a global error functional is minimised through the iterative adjustment of the needle map. Here we consider the formulation of Brooks and Horn [26], which is couched in terms of unit surface normals. The Horn and Brooks error functional is defined to be

$$I = \iint \Bigg\{ \underbrace{\big(E(x, y) - \mathbf{n} \cdot \mathbf{s}\big)^2}_{\text{Brightness Error}} + \underbrace{\lambda \left( \left\| \frac{\partial \mathbf{n}}{\partial x} \right\|^2 + \left\| \frac{\partial \mathbf{n}}{\partial y} \right\|^2 \right)}_{\text{Regularizing Term}} + \underbrace{\mu \big( \|\mathbf{n}\|^2 - 1 \big)}_{\text{Normalizing Term}} \Bigg\} \, dx \, dy. \tag{2}$$

The functional has three distinct terms. Firstly, the brightness error encourages data closeness of the measured image intensity and the reflectance function. It is the only term which directly exploits shading information. The regularizing term imposes the smoothness constraint on the recovered surface normals; it penalises large local changes in surface orientation, measured by the magnitudes of the partial derivatives of the surface normals in the x and y directions. The final term imposes normalisation constraints on the recovered normals. The constants $\mu$ and $\lambda$ are Lagrangian multipliers. The functional is minimised by applying variational calculus and solving the Euler equation:

$$(E - \mathbf{n} \cdot \mathbf{s})\,\mathbf{s} + \lambda \nabla^2 \mathbf{n} - \mu \mathbf{n} = 0. \tag{3}$$
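The tension between the brightness error and the regularizing term of Eq. (2) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes unit normals (so the normalizing term vanishes), forward differences for the derivatives, and a synthetic unit sphere lit from the viewing direction. The function names are hypothetical.

```python
import numpy as np

def sphere_normals(size=9, extent=0.6):
    """Unit normals of a unit sphere sampled on a small grid (synthetic test data)."""
    x, y = np.meshgrid(np.linspace(-extent, extent, size),
                       np.linspace(-extent, extent, size))
    z = np.sqrt(1.0 - x**2 - y**2)
    return np.stack([x, y, z], axis=-1)

def brightness_error(n, E, s):
    """Discrete brightness error: sum over pixels of (E - n.s)^2."""
    return float(np.sum((E - n @ s) ** 2))

def smoothness_error(n):
    """Discrete regularizing term using forward differences in x and y."""
    dx = np.diff(n, axis=1)
    dy = np.diff(n, axis=0)
    return float(np.sum(dx ** 2) + np.sum(dy ** 2))

def functional(n, E, s, lam):
    """Eq. (2) for unit normals, where the normalizing term is zero."""
    return brightness_error(n, E, s) + lam * smoothness_error(n)

s = np.array([0.0, 0.0, 1.0])          # light source along the viewing direction
n_true = sphere_normals()
E = n_true @ s                         # Lambertian image, Eq. (1)
n_flat = np.zeros_like(n_true)
n_flat[..., 2] = 1.0                   # needle map of a flat plane
```

With a small `lam` the true sphere normals give the lower value of the functional, whereas with a large `lam` the flat needle map wins even though it does not explain the image.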

To obtain a numerical scheme for recovering the needle map, the Laplacian appearing in the variational equation must be discretised onto the pixel lattice. Suppose that the surface normals $\mathbf{n}_{i,j}$ are indexed according to their co-ordinates $(i, j)$ on the pixel lattice. The resulting fixed-point iterative scheme for updating the estimated normal at epoch $k+1$, using the previously available estimate from epoch $k$, is

$$\mathbf{n}^{k+1}_{i,j} = \frac{1}{1 + \mu_{i,j}\,(\epsilon^2/4\lambda)} \left[ \bar{\mathbf{n}}^{k}_{i,j} + \frac{\epsilon^2}{4\lambda} \big( E_{i,j} - \mathbf{n}^{k}_{i,j} \cdot \mathbf{s} \big)\, \mathbf{s} \right] \tag{4}$$

where $\epsilon$ is the spacing between lattice points and

$$\bar{\mathbf{n}}^{k}_{i,j} = \tfrac{1}{4} \big( \mathbf{n}^{k}_{i+1,j} + \mathbf{n}^{k}_{i-1,j} + \mathbf{n}^{k}_{i,j+1} + \mathbf{n}^{k}_{i,j-1} \big)$$

is the local average surface normal. At first sight, it appears necessary to solve for the Lagrangian multiplier $\mu_{i,j}$ on a pixel-by-pixel basis. However, it is important to note that $\mu_{i,j}$ only enters the update equation as a multiplying factor which does not affect the direction of update, so we can replace this factor by a normalisation step.

Finally, we comment on the geometry of the Horn and Brooks needle map update equation. It is clear that there are two distinct components. The first of these is in the direction of the average neighbourhood normal $\bar{\mathbf{n}}^{k}_{i,j}$. This component has a local smoothing effect. The second component is in the direction of the light source $\mathbf{s}$. This can be viewed as responding to the physics of the image irradiance equation and has step size proportional to $E_{i,j} - \mathbf{n}^{k}_{i,j} \cdot \mathbf{s}$. The relative step sizes are controlled by the Lagrange multiplier $\lambda$.

The principal criticism of the Horn and Brooks algorithm, and of similar approaches, is the tendency to oversmooth the recovered needle map. Specifically, the smoothness term dominates the data-closeness term. Since the smoothness constraint is formulated in terms of the directional derivatives of the needle map, it is trivially minimised by a flat surface. Thus, the conflict between the data and the model leads to a strongly smoothed needle map and the loss of fine detail. The problem is exacerbated by the need to select a conservative value for the Lagrange multiplier in order to ensure numerical stability [20]. Horn [20] attempts to reduce the model dominance problem by annealing the Lagrange multiplier to reduce the influence of the smoothness constraint as a final solution is approached. Meanwhile, we have used the apparatus of robust statistics to moderate the penalisation of discontinuities [27].
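One iteration of the update in Eq. (4) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the original implementation: the multiplier $\mu_{i,j}$ is replaced by an explicit renormalisation as described above, image borders are handled by replicating edge normals, and the function names and default parameter values are hypothetical.

```python
import numpy as np

def neighbour_mean(n):
    """4-neighbour average of an (H, W, 3) needle map; edge normals are replicated."""
    padded = np.pad(n, ((1, 1), (1, 1), (0, 0)), mode="edge")
    return 0.25 * (padded[2:, 1:-1] + padded[:-2, 1:-1]
                   + padded[1:-1, 2:] + padded[1:-1, :-2])

def horn_brooks_step(n, E, s, lam=1.0, eps=1.0):
    """One fixed-point update of Eq. (4), followed by renormalisation."""
    n_bar = neighbour_mean(n)                         # smoothing component
    residual = E - n @ s                              # brightness residual E - n.s
    update = n_bar + (eps ** 2 / (4.0 * lam)) * residual[..., None] * s
    # The factor 1 / (1 + mu * eps^2 / (4 * lam)) only rescales the vector,
    # so it is replaced here by normalisation to unit length.
    return update / np.linalg.norm(update, axis=-1, keepdims=True)
```

A needle map that already points at the light source everywhere, with a uniformly bright image, is left unchanged by this step, which makes the fixed-point character of the scheme easy to check.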

3. A novel framework for SFS

We have recently reported a new shape-from-shading framework which overcomes some of the problems listed above. The method is documented in detail in Ref. [21], where it is experimentally compared with a number of alternatives. The aim in this paper is to demonstrate the utility of the method for extracting topographic information from intensity images. For completeness, we reproduce here the main details of the method.


The idea underpinning our new framework for shape-from-shading, previously described in Ref. [21], is to guarantee data closeness by treating the image irradiance equation (IIR) as a hard constraint. In other words, we aim to recover a valid needle map which satisfies the IIR at every iteration. Subject to this data closeness constraint, the task of shape-from-shading becomes that of iteratively improving the needle map estimate. Here, we do this using curvature consistency constraints.

Our approach is a geometric one. We view the IIR as defining a cone of ambiguity about the light source direction for each surface normal. The individual surface normals which constitute the needle map can only assume directions that fall on this cone. At each iteration the updated normal is free to move away from the cone under the action of the local consistency constraints. However, it is subsequently mapped back onto the closest normal residing on the cone. By applying this constraint, we gain dual advantages: improved numerical stability and no need for a Lagrange multiplier. More importantly, the needle map evolves via a series of intermediate states which are each solutions of the IIR.

3.1. Hard constraints

The new framework requires us to minimise the constraint functional

$$I_C = \iint \psi\big(\mathbf{n}(x, y), N(x, y)\big) \, dx \, dy \tag{5}$$

whilst satisfying the hard constraint imposed by the image irradiance equation

$$\iint (E - \mathbf{n} \cdot \mathbf{s}) \, dx \, dy = 0. \tag{6}$$

Here $N(x, y)$ is the set of local neighbourhood vectors about location $(x, y)$. For example, in terms of lattice co-ordinates $i, j$, the 4-neighbourhood of $\mathbf{n}_{i,j}$ is defined as

$$N = \{ \mathbf{n}_{i+1,j},\ \mathbf{n}_{i-1,j},\ \mathbf{n}_{i,j+1},\ \mathbf{n}_{i,j-1} \}. \tag{7}$$

The function $\psi(\mathbf{n}(x, y), N(x, y))$ is a localized function of the current surface normal estimates. The size of the neighbourhood may be varied according to the nature of $\psi$. Clearly, it is possible to incorporate the hard data closeness constraint directly into $\psi$, but this needlessly complicates the mathematics. Instead, we choose to impose the constraint after each iteration by mapping the updated normals back to the most similar normal lying on the cone. The resulting update equation for the surface normals can be written as

$$\mathbf{n}^{k+1}_{i,j} = \Theta\, \hat{\mathbf{n}}^{k}_{i,j} \tag{8}$$

where $\hat{\mathbf{n}}^{k}_{i,j}$ is the surface normal that minimises the constraint functional $I_C$. The hard image irradiance constraint is imposed by the rotation matrix $\Theta$, which maps the updated normal to the closest normal lying on the cone of ambiguity. Another way to look at this is that we allow the smoothness constraint to select the direction of the normal estimate in the image plane only, whilst fixing the angle between the normal estimate and the light source direction.

To achieve the rotation, we define an axis perpendicular to the intermediate update normal, $\hat{\mathbf{n}}^{k}_{i,j}$, and the light source direction. The axis of rotation is found by taking the cross-product of the intermediate update with the light source direction

$$(u, v, w)^T = \hat{\mathbf{n}}^{k}_{i,j} \times \mathbf{s}. \tag{9}$$

The angle of rotation is the difference between the angle subtended by the intermediate update and the light source, and the apex angle of the cone of ambiguity. Since the image is normalized, the latter angle is simply $\cos^{-1} E$, giving a rotation angle of

$$\theta = -\cos^{-1}\!\left( \frac{\hat{\mathbf{n}}^{k}_{i,j} \cdot \mathbf{s}}{\|\hat{\mathbf{n}}^{k}_{i,j}\| \, \|\mathbf{s}\|} \right) + \cos^{-1} E.$$

Hence, the rotation matrix is given by

$$\Theta = \begin{pmatrix} c + u^2 c' & ws + uvc' & -vs + uwc' \\ -ws + uvc' & c + v^2 c' & us + vwc' \\ vs + uwc' & -us + vwc' & c + w^2 c' \end{pmatrix} \tag{10}$$

where $c = \cos\theta$, $c' = 1 - c$, and $s = \sin\theta$.
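The mapping of an updated normal back onto its cone of ambiguity can be sketched for a single pixel as follows. This is a hedged illustration, not the authors' code: it uses the vector form of Rodrigues' rotation formula with a unit rotation axis instead of building the matrix $\Theta$ explicitly, and the sign of the angle is chosen for that right-handed convention (rotation towards $\mathbf{s}$ is positive), so it may differ from the sign convention of the text. The function name and the fallback axis for the degenerate parallel case are assumptions.

```python
import numpy as np

def project_to_cone(n_hat, s, E):
    """Rotate n_hat about the axis n_hat x s so that its angle to s equals arccos(E)."""
    s = np.asarray(s, dtype=float)
    s = s / np.linalg.norm(s)
    n = np.asarray(n_hat, dtype=float)
    n = n / np.linalg.norm(n)
    axis = np.cross(n, s)
    length = np.linalg.norm(axis)
    if length < 1e-12:
        # n is parallel to s, so the axis is undefined: pick any axis perpendicular to s.
        helper = np.array([1.0, 0.0, 0.0]) if abs(s[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
        axis = np.cross(s, helper)
        length = np.linalg.norm(axis)
    axis = axis / length
    alpha = np.arccos(np.clip(n @ s, -1.0, 1.0))   # current angle to the light source
    beta = np.arccos(np.clip(E, 0.0, 1.0))         # apex angle of the cone
    theta = alpha - beta                           # positive values rotate towards s
    # Rodrigues' formula; the axis is perpendicular to n, so the third term drops out.
    return n * np.cos(theta) + np.cross(axis, n) * np.sin(theta)

s = np.array([0.0, 0.0, 1.0])
n_updated = np.array([1.0, 0.0, 1.0]) / np.sqrt(2.0)
n_on_cone = project_to_cone(n_updated, s, E=0.9)
```

The projected normal keeps the azimuth of the updated normal in the image plane and only changes its angle to the light source, which matches the geometric description given above.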

3.2. Initialization

The new framework requires an initialisation which ensures that the image irradiance equation is satisfied. This differs from the Horn and Brooks algorithm, which is usually initialised by estimating the occluding boundary normals, with all other normals set to point in the light source direction. To satisfy the image irradiance equation, we must choose a normal direction from the infinite possibilities defined by the cone of ambiguity. We choose to initialise each normal such that its projection onto the image plane lies in the opposite direction to the image gradient direction, as shown in Fig. 1. This results in an initialisation with an implicit bias towards convex rather than concave surfaces. In other words, bright regions are assumed to correspond to peaks, and the image gradient direction points towards these peaks.

We have also applied this initialisation to the Horn and Brooks algorithm in place of the traditional occluding boundary initialisation. We find that this initialisation produces significantly better and faster results for the Horn and Brooks algorithm. However, in common with the Horn and Brooks algorithm, our schemes are sensitive to initialisation. It is impossible to say whether using the image gradient as described above is the best possible initialisation, but it does have stable properties and is, intuitively, a reasonable method of estimating the initial normal direction.

Fig. 1. The set of surface normals at a point which satisfy the image irradiance equation define a cone such that $E - \mathbf{n} \cdot \mathbf{s} = 0$. A normal from this set is chosen such that the direction of its projection onto the image plane is opposite to the maximum intensity gradient direction, $\mathbf{g}$.

4. Needle map smoothness constraints

Before we describe our modelling of curvature consistency, we describe how needle map smoothness can be incorporated into our new framework for shape-from-shading. In a recent paper [27], we showed how needle map smoothness could be modelled using robust error kernels. Here the adopted framework was based on a regularised energy function similar to that underpinning the Horn and Brooks algorithm. However, rather than using a quadratic smoothness prior, we used a continuous variant of the Huber robust error kernel. In this section we show how this smoothness model can be used in conjunction with our geometric needle map update process. In essence, we consider that the recovered surface should be smooth, except where there is a high probability that a discontinuity is present, in which case the smoothing is reduced. We define the robust regularizer constraint function as

$$\psi(\mathbf{n}, N) = \rho_\sigma\!\left( \left\| \frac{\partial \mathbf{n}}{\partial x} \right\| \right) + \rho_\sigma\!\left( \left\| \frac{\partial \mathbf{n}}{\partial y} \right\| \right) \tag{11}$$

where $\rho_\sigma(\eta)$ is a robust kernel defined on the residual $\eta$ and with width parameter $\sigma$. Applying the calculus of variations to the resulting constraint functional $I_C$ yields the general update equation

$$\mathbf{n}^{k+1}_{i,j} = \Theta \left( \frac{\partial}{\partial x}\, \rho'_\sigma\!\left( \left\| \frac{\partial \mathbf{n}}{\partial x} \right\| \right) + \frac{\partial}{\partial y}\, \rho'_\sigma\!\left( \left\| \frac{\partial \mathbf{n}}{\partial y} \right\| \right) \right) \tag{12}$$

where

$$\rho'_\sigma\!\left( \left\| \frac{\partial \mathbf{n}}{\partial x} \right\| \right) = \frac{\partial}{\partial \mathbf{n}_x}\, \rho_\sigma\!\left( \left\| \frac{\partial \mathbf{n}}{\partial x} \right\| \right), \qquad \rho'_\sigma\!\left( \left\| \frac{\partial \mathbf{n}}{\partial y} \right\| \right) = \frac{\partial}{\partial \mathbf{n}_y}\, \rho_\sigma\!\left( \left\| \frac{\partial \mathbf{n}}{\partial y} \right\| \right), \tag{13}$$

with $\mathbf{n}_x = \partial \mathbf{n}/\partial x$ and $\mathbf{n}_y = \partial \mathbf{n}/\partial y$.

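In [28] we experimented with several robust error kernels, including Li's Adaptive Potential Functions [29], and the Tukey [30] and Huber [31] estimators. However, the sigmoidal-derivative M-estimator, a continuous version of Huber's estimator, proved to possess the best properties for handling surface discontinuities, and is defined by

$$\rho_\sigma(\eta) = \frac{\sigma}{\pi} \log \cosh\!\left( \frac{\pi \eta}{\sigma} \right). \tag{14}$$

With this kernel the surface normal that minimises $I_C$ is

$$
\begin{aligned}
\hat{\mathbf{n}}^{k}_{i,j} ={}& \left\| \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial x} \right\|^{-1} \tanh\!\left( \frac{\pi}{\sigma} \left\| \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial x} \right\| \right) \big( \mathbf{n}^{k}_{i+1,j} + \mathbf{n}^{k}_{i-1,j} \big) \\
&+ \left[ \frac{\pi}{\sigma} \left\| \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial x} \right\|^{-2} \operatorname{sech}^2\!\left( \frac{\pi}{\sigma} \left\| \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial x} \right\| \right) - \left\| \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial x} \right\|^{-3} \tanh\!\left( \frac{\pi}{\sigma} \left\| \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial x} \right\| \right) \right] \left( \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial x} \cdot \frac{\partial^2 \mathbf{n}^{k}_{i,j}}{\partial x^2} \right) \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial x} \\
&+ \left\| \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial y} \right\|^{-1} \tanh\!\left( \frac{\pi}{\sigma} \left\| \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial y} \right\| \right) \big( \mathbf{n}^{k}_{i,j+1} + \mathbf{n}^{k}_{i,j-1} \big) \\
&+ \left[ \frac{\pi}{\sigma} \left\| \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial y} \right\|^{-2} \operatorname{sech}^2\!\left( \frac{\pi}{\sigma} \left\| \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial y} \right\| \right) - \left\| \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial y} \right\|^{-3} \tanh\!\left( \frac{\pi}{\sigma} \left\| \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial y} \right\| \right) \right] \left( \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial y} \cdot \frac{\partial^2 \mathbf{n}^{k}_{i,j}}{\partial y^2} \right) \frac{\partial \mathbf{n}^{k}_{i,j}}{\partial y}.
\end{aligned} \tag{15}
$$

For comparison, the quadratic smoothness penalty used by Horn and Brooks [5] results in the local averaging of the surface normal. As a result

$$\hat{\mathbf{n}}^{k}_{i,j} = \tfrac{1}{4} \big( \mathbf{n}^{k}_{i+1,j} + \mathbf{n}^{k}_{i-1,j} + \mathbf{n}^{k}_{i,j+1} + \mathbf{n}^{k}_{i,j-1} \big). \tag{16}$$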


It is illuminating to compare the behaviour of the update equation obtained from the robust error kernel with that for local averaging, which results from a quadratic penalty. We do this for small and large smoothness errors. In the robust case, the averaging of the neighbourhood normals is moderated by a function of the form $(1/\|\eta\|) \tanh[(\pi/\sigma)\eta]$. This averaging effect is most pronounced when the smoothness error is small. The remaining contribution to the smoothness process is of the form $\eta \big( (1/\|\eta\|^2) \operatorname{sech}^2 \eta - (1/\|\eta\|^3) \tanh \eta \big)$. This term vanishes at the origin and tends towards zero for large values of $\eta$, coming into play only at intermediate error conditions.
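The behaviour described above can be checked numerically. The sketch below is illustrative only: it implements the log-cosh kernel of Eq. (14), its derivative, and the two moderating functions for a scalar residual, with the factor $\pi/\sigma$ written out explicitly as in Eq. (15). The function names are hypothetical.

```python
import numpy as np

def rho(eta, sigma=1.0):
    """Kernel of Eq. (14), (sigma/pi) log cosh(pi eta / sigma), evaluated stably."""
    x = np.pi * np.asarray(eta, dtype=float) / sigma
    # log cosh(x) = log(e^x + e^-x) - log 2, which avoids overflow in cosh.
    return (sigma / np.pi) * (np.logaddexp(x, -x) - np.log(2.0))

def rho_prime(eta, sigma=1.0):
    """Derivative of the kernel with respect to eta: tanh(pi eta / sigma)."""
    return np.tanh(np.pi * np.asarray(eta, dtype=float) / sigma)

def averaging_weight(eta, sigma=1.0):
    """Weight (1/eta) tanh(pi eta / sigma) applied to the neighbour average."""
    eta = np.asarray(eta, dtype=float)
    return np.tanh(np.pi * eta / sigma) / eta

def correction_term(eta, sigma=1.0):
    """eta * ((pi/sigma) sech^2(.) / eta^2 - tanh(.) / eta^3), the remaining contribution."""
    eta = np.asarray(eta, dtype=float)
    a = np.pi / sigma
    sech2 = 1.0 - np.tanh(a * eta) ** 2
    return eta * (a * sech2 / eta ** 2 - np.tanh(a * eta) / eta ** 3)
```

For small residuals the kernel is close to the quadratic $\pi\eta^2/(2\sigma)$, and for large residuals it grows only linearly, which is the Huber-like behaviour that moderates smoothing across discontinuities.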

5. Curvature consistency

Needle-map smoothness appears to be an over-strong and inappropriate constraint for shape-from-shading. This is primarily because real surfaces are more likely to be piecewise smooth; in other words, formed of smooth regions separated by sharp discontinuities in depth or orientation. The oversmoothing problem is exacerbated by the difficulty of formulating the continuous concept of smoothness on a discrete pixel lattice, as clearly illustrated by the fact that the Horn and Brooks smoothness constraint is trivially minimised by a needle map corresponding to a planar surface.

In [21] we have taken a different tack by using curvature consistency. Although the curvature classes on either side of a depth discontinuity may be completely unrelated, this is not the case for an orientation discontinuity. Orientation discontinuities usually correspond to ruts or ridges. Furthermore, the curvature classes for locations on either side of a rut or a ridge should be the most similar classes: either trough or saddle rut for a rut, or dome or saddle ridge for a ridge. This property of smooth variation in class suggests that curvature consistency may be a more appropriate constraint for SFS than smoothness, which strongly penalises legitimate orientation discontinuities.

The use of a curvature consistency measure was introduced to SFS by Ferrie and Lagarde [23]. They use global consistency of principal curvatures to refine the surface estimate returned by local shading analysis. Curvature consistency is formulated in terms of rotating the local Darboux frame to ensure that the principal curvature directions are locally consistent. An alternative method of representing curvature information is to use H-K labels, but these require us to set four thresholds to define the classes in terms of the mean and Gaussian curvatures. Instead, we propose to use curvature consistency based upon the shape index of Koenderink and van Doorn [25]. This is a continuous measure which encodes the same curvature class information as H-K labels in an angular representation, and has the further advantage of not requiring any thresholds.

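5.1. The shape index

We reformulate the definition of the shape index in terms of the needle map. This allows us to use the needle map directly, rather than needing to reconstruct the surface. The differential structure of a surface is captured by the Hessian matrix

$$H = \begin{pmatrix} -\dfrac{\partial^2 z}{\partial x^2} & -\dfrac{\partial^2 z}{\partial x \partial y} \\[2ex] -\dfrac{\partial^2 z}{\partial y \partial x} & -\dfrac{\partial^2 z}{\partial y^2} \end{pmatrix}.$$

The set of surface normals, $\mathbf{n} = (-p, -q, 1)^T = (-\partial z/\partial x, -\partial z/\partial y, 1)^T$, can be substituted into this matrix to give the local Hessian in terms of the needle map:

$$H = \begin{pmatrix} \left( \dfrac{\partial \mathbf{n}}{\partial x} \right)_x & \left( \dfrac{\partial \mathbf{n}}{\partial x} \right)_y \\[2ex] \left( \dfrac{\partial \mathbf{n}}{\partial y} \right)_x & \left( \dfrac{\partial \mathbf{n}}{\partial y} \right)_y \end{pmatrix} \tag{17}$$

where $(\cdots)_x$ and $(\cdots)_y$ denote the x and y components of the parenthesised vector, respectively. The eigenvalues of the Hessian matrix, found by solving the characteristic equation $|H - \kappa I| = 0$, are the principal curvatures of the surface. In terms of surface normals, these are given by

$$\kappa_{1,2} = -\frac{1}{2} \left[ \left( \frac{\partial \mathbf{n}}{\partial x} \right)_x + \left( \frac{\partial \mathbf{n}}{\partial y} \right)_y \right] \pm \frac{1}{2} \sqrt{ \left[ \left( \frac{\partial \mathbf{n}}{\partial x} \right)_x - \left( \frac{\partial \mathbf{n}}{\partial y} \right)_y \right]^2 + 4 \left( \frac{\partial \mathbf{n}}{\partial x} \right)_y \left( \frac{\partial \mathbf{n}}{\partial y} \right)_x }. \tag{18}$$

Koenderink and van Doorn [25] defined the shape index

$$\phi = \frac{2}{\pi} \arctan \frac{\kappa_2 + \kappa_1}{\kappa_2 - \kappa_1}, \qquad \kappa_1 \geq \kappa_2. \tag{19}$$

This may be expressed in terms of surface normals thus:

$$\phi = \frac{2}{\pi} \arctan \frac{ \left( \dfrac{\partial \mathbf{n}}{\partial x} \right)_x + \left( \dfrac{\partial \mathbf{n}}{\partial y} \right)_y }{ \sqrt{ \left[ \left( \dfrac{\partial \mathbf{n}}{\partial x} \right)_x - \left( \dfrac{\partial \mathbf{n}}{\partial y} \right)_y \right]^2 + 4 \left( \dfrac{\partial \mathbf{n}}{\partial x} \right)_y \left( \dfrac{\partial \mathbf{n}}{\partial y} \right)_x } }. \tag{20}$$

Fig. 2 shows the range of shape index values, the type of curvature which they represent, and the grey levels used to display different shape index values. Table 1 shows the relationship between the shape index and the mean and Gaussian curvature classes. Recall that the mean curvature is $H = \tfrac{1}{2}(\kappa_1 + \kappa_2)$ and the Gaussian curvature is $K = \kappa_1 \kappa_2$. The classes depend on whether the two curvatures are positive, zero or negative.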
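Computing the shape index directly from a needle map, in the spirit of Eq. (20), might be sketched as follows. The sketch rests on several assumptions: normals are stored as an (H, W, 3) array with rows indexing y and columns indexing x, derivatives are approximated with `np.gradient`, normals are rescaled to the form $(-p, -q, 1)$ before differentiation, and flat pixels are returned as NaN because the shape index is undefined there. The function name and tolerance are hypothetical.

```python
import numpy as np

def shape_index(needle, spacing=1.0, tol=1e-9):
    """Shape index of Eq. (20) from an (H, W, 3) needle map; NaN where the surface is flat."""
    needle = np.asarray(needle, dtype=float)
    g = needle[..., :2] / needle[..., 2:3]          # rescale to (-p, -q)
    dg_dy, dg_dx = np.gradient(g, spacing, axis=(0, 1))
    a, b = dg_dx[..., 0], dg_dx[..., 1]             # (dn/dx)_x, (dn/dx)_y
    c, d = dg_dy[..., 0], dg_dy[..., 1]             # (dn/dy)_x, (dn/dy)_y
    # Clip guards against small negative values caused by asymmetric finite differences.
    root = np.sqrt(np.clip((a - d) ** 2 + 4.0 * b * c, 0.0, None))
    phi = (2.0 / np.pi) * np.arctan2(a + d, root)   # arctan2 copes with root == 0
    flat = (np.abs(a + d) < tol) & (root < tol)
    return np.where(flat, np.nan, phi)

# Ridge z = -x^2 / 2 gives p = -x, q = 0, so the needle map is (x, 0, 1).
y, x = np.mgrid[-2:3, -2:3].astype(float)
ridge = np.stack([x, np.zeros_like(x), np.ones_like(x)], axis=-1)
phi_ridge = shape_index(ridge)
```

For this synthetic ridge every pixel evaluates to 0.5, and the mirrored surface $z = x^2/2$ gives -0.5, consistent with the ridge and rut values quoted later for the synthetic test object.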


Fig. 2. The shape index scale ranges from -1 to 1 as shown. The shape index values are encoded as a continuous range of grey-level values between 1 and 255, with grey-level 0 being reserved for background and flat regions (for which the shape index is undefined).

Table 1
Topographic classes

| Class        | Symbol | H | K | Region type | Shape index interval |
|--------------|--------|---|---|-------------|----------------------|
| Dome         | D      | - | + | Elliptic    | [5/8, 1)             |
| Ridge        | R      | - | 0 | Parabolic   | [3/8, 5/8)           |
| Saddle ridge | SR     | - | - | Hyperbolic  | [1/8, 3/8)           |
| Plane        | P      | 0 | 0 | Hyperbolic  | Undefined            |
| Saddle-point | S      | 0 | - | Hyperbolic  | [-1/8, 1/8)          |
| Saddle-rut   | SV     | + | - | Hyperbolic  | [-3/8, -1/8)         |
| Cup          | C      | + | + | Elliptic    | [-1, -5/8)           |
| Rut          | V      | + | 0 | Parabolic   | [-5/8, -3/8)         |

It is important to stress that there are adjacency constraints applying to the topographic classes. In particular, the cup (C) and dome (D) surface types may not appear adjacent to each other on a surface. Moreover, elliptic regions on the surface (those for which K is positive) must be separated from hyperbolic regions (those for which K is negative) by a parabolic line (where K = 0). Parabolic lines are effectively zero crossings of the mean or Gaussian curvatures. In other words, domes and cups are enclosed by ridge or valley lines. Moreover, domes or cups cannot be adjacent to saddle structures.

5.2. Adaptive robust regularizer using curvature consistency

As stated above, since the shape index is an angular, physical measure, we expect it to vary gradually over a smooth surface. For instance, with reference to Fig. 2, we would not expect the shape index at adjacent pixels to differ by more than one curvature class unless they lie on opposite sides of a surface discontinuity. Since the oversmoothing effect of the quadratic smoothness constraint stems directly from the indiscriminate averaging of normals lying across a discontinuity, we anticipate that weighting according to curvature consistency will reduce the problem in a physically principled manner.

To meet these goals we use curvature consistency to control the robust weighting kernel applied to the variation in the needle map direction. The idea is a simple one. We use the variance of the shape index in the neighbourhood $N$ to control the width $\sigma$ of the robust error kernel applied to the directional derivatives of the needle map. The kernel width determines the level of smoothing applied to the surface normals in the neighbourhood. If the variance of the shape index is large, i.e. the neighbourhood contains a lot of topographic structure, then we choose a small kernel width. This limits the local smoothing and allows significant local variation in the local needle map direction. From a topographic viewpoint, we can see the rationale for this choice by considering the behaviour of the needle map and the shape index at ridges and ravines. For such features, the direction of the needle map changes rapidly in a particular direction. These two structures are parabolic lines which intercede between elliptic and hyperbolic regions. As a result there is a rapid local variation in shape index. Turning our attention to the case where the shape index variance is small, the kernel width is large. This is the case when we are situated in a hyperbolic or elliptic region. Here the shape index is locally uniform and the needle map direction varies slowly.

The variance dependence of the kernel is controlled using the exponential function

$$\sigma = \sigma_0 \exp\!\left( - \left[ \frac{1}{N} \sum_{l \in N} \frac{(\phi_l - \phi_c)^2}{\Delta\phi_d^2} \right]^{1/2} \right). \tag{21}$$

Here $\phi_c$ is the shape index associated with the central normal of the neighbourhood, $\mathbf{n}_{i,j}$, $\phi_l$ is one of the neighbouring shape index values and $\Delta\phi_d$ is the difference in shape index between the centre values of adjacent curvature classes listed in Table 1. The number of neighbourhood normals used in calculating the finite difference approximations to $\partial \mathbf{n}/\partial x$ and $\partial \mathbf{n}/\partial y$ is denoted $N$, and $\sigma_0$ is a reference kernel width which we set to unity. On the scale of Fig. 2, $\Delta\phi_d$ is fixed by the class intervals of Table 1.

To summarise, if the shape index varies significantly over the neighbourhood, a small value of $\sigma$ results, and the robust regulariser saturates to produce a heavy smoothing effect. In contrast, when the shape index values are already similar, the kernel is widened so that little smoothing occurs. When this model is used, the needle map update equation is identical to that of Eq. (12). However, now the error kernels adapt locally in line with the shape index variance.
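The adaptive kernel width can be sketched for a single neighbourhood as below. This is a hedged illustration with two loudly labelled assumptions: the exponent is taken to be the root of the mean squared shape-index deviation scaled by $\Delta\phi_d$, and the default `dphi=0.25` is a hypothetical value chosen as the spacing between the centres of adjacent interior classes on a [-1, 1] scale. Both should be checked against the original equation before use.

```python
import numpy as np

def adaptive_sigma(phi_centre, phi_neighbours, sigma0=1.0, dphi=0.25):
    """Kernel width that shrinks as the neighbourhood shape index varies more.

    dphi (class-centre spacing) and the root-mean-square form are assumptions.
    """
    deviation = (np.asarray(phi_neighbours, dtype=float) - phi_centre) / dphi
    return sigma0 * np.exp(-np.sqrt(np.mean(deviation ** 2)))

uniform = adaptive_sigma(0.5, [0.5, 0.5, 0.5, 0.5])      # locally uniform shape index
one_class = adaptive_sigma(0.5, [0.75, 0.75, 0.25, 0.25])  # neighbours one class away
```

When all neighbours share the shape index of the centre pixel the width stays at the reference value `sigma0`, and it decays exponentially as the neighbourhood spans more curvature classes.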


6. Algorithm evaluation

In this section we provide some experimental evaluation of the new shape-from-shading technique. The evaluation focuses on the quality of the shape index information extracted from the intensity images used in our experiments. The images used in our study are taken from the Columbia University COIL data-base. We furnish some comparison between the use of curvature consistency constraints and the simple needle map smoothness constraint.

We commence in Fig. 3 by showing a sequence in which the shape index evolves with iteration number. In each case, the left-hand column shows the results obtained using the curvature consistency constraint developed in the paper, whilst the right-hand column shows the results of using a quadratic smoothness constraint within the same framework. Using the curvature consistency scheme, the elliptic and hyperbolic region classes become more connected while the parabolic lines (i.e. ridges and ravines) become thinner and more continuous. In contrast, using the smoothness constraint leads to loss of the ridge and ravine structure. The regions are noisy and exhibit poor connectivity. The final shape index images contain some features which merit special mention. In particular, the ravines defining the boundaries of the wing of the duck are well segmented. In addition, the slot in the top of the piggybank is correctly identified as a ravine.

Fig. 4 demonstrates that the curvature classes recovered using shape-from-shading are stable to viewpoint changes. Here we show six views of a toy duck. Notice how the valley lines around the beak and the wing are well recovered at each viewing angle. Also notice how the shape of the saddle structure below the wing is maintained.

Next, we show class probabilities for the different topographic structures. These are computed as follows. Suppose that $\phi_\omega$ is the mean value of the shape index for the topographic class labelled $\omega$. The probability that the shape index $\phi$ belongs to this curvature class is estimated by

$$P(\omega) = \frac{\exp\left[-(\phi - \phi_\omega)^2 / 2\Delta^2\right]}{\sum_{l} \exp\left[-(\phi - \phi_l)^2 / 2\Delta^2\right]} \qquad (22)$$

where $\Delta$ is the width of the curvature classes and $\phi_l$ is the mean of shape index class $l$. Fig. 5 shows the probabilities for the dome and rut classes for one of the toy-duck views from the COIL data-base. Notice how the dome label is most prominent on the chest and head. The rut class is most prominent around the cusps of the neck and the wing.

The COIL data-base images are captured under controlled lighting conditions. To explore more naturalistic lighting conditions, we furnish some additional examples on images of sculpture. Figs. 6 and 7 show the needle map and shape index information extracted from images of the sculptures "David" and "Bacchus". In both cases the needle maps contain most of the important surface detail. The shape index captures the ridge and ravine structures on the surface. For instance, notice how the ribs on the torso of "David" are located.

To provide further evidence of the stability of the topographic information extracted using shape-from-shading, we experiment with a motion sequence captured with a video camcorder. Fig. 8 shows images from the original motion sequence and Fig. 9 shows the shape index. There are several features worth noting from this sequence. Firstly, the ridge defining the jaw-line is cleanly segmented in each of the images. Secondly, the ravines outlining the nose and the ridges inside the ear are well segmented. Thirdly, there is a stable ravine which corresponds to a horizontal furrow on the first author's forehead. Finally, the ravine between the lips is cleanly segmented on all the images in the sequence.

To conclude the evaluation, we provide some comparison of the schemes using a simple synthetic image. The object used in this study is a sphere which has ridges and ruts embedded in its surface. The ideal shape index histogram for the object would consist of three peaks at shape index values of -0.5 (rut), +0.5 (ridge) and 1.0 (dome). Fig. 10 shows the shape index histograms recovered using curvature consistency (left) and needle map smoothness (right). The main feature to note from the two histograms is that in the case of curvature consistency the peaks are significantly narrower. This means that the measurement errors in the shape index are smaller. To illustrate the effect of noise on the shape index measurements, Fig. 11 shows a scatter plot for the shape index. The plot shows the measured versus the ground-truth shape index values for a synthetic surface. The points cluster about a clear regression line.
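The class probabilities of Eq. (22) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `class_probabilities`, the class-mean table and the width value 0.25 are assumptions made for illustration; only the three class means (-0.5 for rut, +0.5 for ridge, 1.0 for dome) are taken from the text of this section. The exponents are shifted by their maximum before exponentiation, a standard numerical safeguard that leaves the ratio in Eq. (22) unchanged.

```python
import math


def class_probabilities(phi, class_means, width):
    """Gaussian class probabilities for one shape index value, after Eq. (22).

    phi         -- measured shape index
    class_means -- dict mapping class label to its mean shape index
    width       -- width of the curvature classes (Delta in Eq. (22))
    """
    # Exponent of each Gaussian term: -(phi - phi_l)^2 / (2 * Delta^2)
    exponents = {
        label: -((phi - mean) ** 2) / (2.0 * width ** 2)
        for label, mean in class_means.items()
    }
    # Subtracting the largest exponent avoids underflow to 0/0;
    # the common factor cancels between numerator and denominator.
    shift = max(exponents.values())
    weights = {label: math.exp(e - shift) for label, e in exponents.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


# Illustrative class means only: the three values quoted in the text.
CLASS_MEANS = {"rut": -0.5, "ridge": 0.5, "dome": 1.0}

# Hypothetical width; the paper does not state a numerical value here.
probs = class_probabilities(0.9, CLASS_MEANS, width=0.25)
best = max(probs, key=probs.get)
```

Applied to every pixel of a shape index image, a function of this kind would yield per-class probability maps of the sort shown in Fig. 5.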

Fig. 3. Evolution of shape index classes with iterations of the SFS schemes.

Fig. 4. Different views of the toy duck.

Fig. 5. Label probabilities computed from the shape index.

7. Object recognition experiments

In this section we show how the extracted needle maps and shape index information can be used for object recognition. We adopt a simple histogram-based recognition strategy similar to that originally used by Swain and Ballard [32]. The idea underpinning this recognition method is to identify images of the same object by comparing attribute histograms. Although originally used with colour attributes, success has also been reported using edge orientation, texture and shape index [22]. In the work reported here we experiment with both one-dimensional and two-dimensional histograms. The one-dimensional histograms use raw grey-scale information and shape index as their attributes. The two-dimensional histograms use intensity gradient (i.e. directional edge information) and the x and y components of the needle map as their attributes. We measure histogram similarity using the Bhattacharyya distance

$$B(P_Q, P_M) = -\ln \sum_{i=1}^{n} \sqrt{P_Q(i)\, P_M(i)}$$

where $n$ is the number of bins, $P_Q$ is the query histogram and $P_M$ is one of the model histograms.

7.1. Test data

The data set used in this study is the Columbia Object Image Library (COIL), consisting of 20 arbitrary objects. There are 72 views of each object, each illuminated by a light source coincident with the camera, giving a database of 1440 images in total. The images are taken at 5° intervals along a great circle of the object's view sphere. Only around 9% of the view sphere is spanned by these 72 images, underlining the need for view grouping if appearance-based object recognition is not to require the storage and matching of unfeasibly large numbers of model views. Note that the objects themselves often break the fundamental assumptions of our SFS scheme by being noisy, non-Lambertian, and of non-constant albedo. Nonetheless, we succeed in recovering qualitatively good needle maps.

7.2. Needle maps and histograms

Fig. 12 illustrates the representations derived for 4 of the 20 objects in the test set. The first row of Fig. 12 shows the first image from each of the 72-view sequences for 4 objects in the data-base. The second row shows the grey-level histograms for the raw images. The third row contains the 2D histograms of the surface normal directions. Clearly, there is a great deal of variability in the structure of these 2D histograms. Finally, the fourth row shows the shape index histograms. Note that they are all broadly similar, in that each is bi-modal, with the two modes corresponding approximately to ruts and ridges or domes. In each histogram, the leftmost bin corresponds to background pixels and is excluded from the calculation of the Bhattacharyya distance between the histograms.

7.3. Ranking results

Fig. 13 shows histogram ranking results for each of the representations. These are plots of the average histogram rank returned by a query image. The average rank is plotted as a function of the angular distance between the image and the query. The ranks are averaged over all 72 views of each object. In other words, one of the 72 images is chosen as the query image, and all 1440 images in the database are ranked according to their distance from this query. Clearly, the query image itself has zero self-distance and hence is ranked 0. Views of the same object from similar viewpoints, i.e. those with small angular deviations in any direction on the view sphere, should come next in the ranking, and so on. Each image in the set corresponding to a given object is taken as the query in turn, and an average ranking is found for all images at a given angular distance either side of the query. This is repeated for each of the object representations, and subsequently for each object.

In the case of the one-dimensional histograms, the shape index outperforms the grey-scale histogram. In the case of the 2D histograms, the needle map components outperform the image gradient.

The bottom row shows the average ranking as a function of angular distance from the query taken over all images of all objects, i.e. we take each image in the database as the query in turn, and find the rankings of all views of the same object. In the bottom left plot, we show this over the full ±180° range of angular distances from the query. Note the dips which occur around ±17 images (±90°) away from the query, and the more significant dip corresponding to ±180°. These are due to many of the objects possessing twofold and fourfold rotational symmetry.

The bottom right plot of Fig. 13 is a magnification of the origin of the average plot, and is the most important result, since it compares the recognition performance on similar views for each of the representations, averaged over the entire database. We see that the needle map and shape index histograms retain the advantage, demonstrated for the 4 sample objects, over the entire data-set of 20 objects. Specifically, if an image is ±2 images (±10°) away from the query, then the needle map histogram receives a ranking approximately half that obtained using the intensity gradient histogram. In other words, fewer other views, either of the same object from a greatly differing viewpoint, or of other objects, are ranked ahead of the correct matches.

Fig. 6. Curvature consistency SFS applied to David, by Michelangelo.

Fig. 7. Curvature consistency SFS applied to Bacchus, by Michelangelo.

Fig. 8. Movie sequence.

Fig. 9. Shape index sequence.

Fig. 10. Distributions of shape index values resulting from the curvature consistency scheme (left) and using the smoothness constraint (right) applied to a synthetic image of a hemisphere on a plane. Ground truth for this object would result in a distribution with a tall delta function at shape index value 1, and a smaller delta at -0.5 corresponding to the perceived rut at the boundary of the sphere where it meets the plane.

Fig. 11. Scatter correlation plot for the shape index.
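The histogram comparison used in Section 7 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code. The names `normalise` and `bhattacharyya_distance` and the `skip_first_bin` flag are hypothetical. The text states that the leftmost (background) bin is excluded from the calculation; renormalising the remaining bins so that they sum to one is an assumption made here.

```python
import math


def normalise(counts):
    """Convert raw bin counts into a histogram that sums to one."""
    total = float(sum(counts))
    if total <= 0.0:
        raise ValueError("histogram has no counts")
    return [c / total for c in counts]


def bhattacharyya_distance(query_counts, model_counts, skip_first_bin=True):
    """B(P_Q, P_M) = -ln sum_i sqrt(P_Q(i) * P_M(i)).

    skip_first_bin drops the leftmost bin (background pixels) before
    normalisation; the renormalisation step is an assumption.
    """
    if len(query_counts) != len(model_counts):
        raise ValueError("histograms must have the same number of bins")
    start = 1 if skip_first_bin else 0
    p_q = normalise(query_counts[start:])
    p_m = normalise(model_counts[start:])
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p_q, p_m))
    if coefficient <= 0.0:
        return math.inf  # histograms with no overlapping mass
    # Clamp rounding error so that identical histograms give distance 0.
    return -math.log(min(coefficient, 1.0))


# Two 5-bin histograms that differ only in the background bin.
d_same = bhattacharyya_distance([900, 1, 2, 3, 4], [5, 1, 2, 3, 4])
# Partially overlapping histograms.
d_half = bhattacharyya_distance([0, 1, 1], [0, 2, 0])
```

Because the background bin is dropped before normalisation, `d_same` is unaffected by how much of each image is background. Smaller distances indicate more similar histograms, so database images are ranked in increasing order of this value.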

Fig. 12. Top row: Raw images. Row 2: 25 bin grey-level frequency histograms. Row 3: 15×15 bin 2D histograms of normal direction frequency. Row 4: 25 bin shape index frequency histograms.

8. Conclusions

This paper has shown how a new shape-from-shading algorithm which exploits curvature consistency can be used to extract topographic information from 2D intensity images. The novel contribution resides in the modelling of curvature consistency. We use the Koenderink and van Doorn shape index as a measure of surface topography. We use the variance of the shape index to control the width of a robust error kernel which controls needle map smoothness errors. In this way we ensure that the local smoothness of the needle map responds to the variability of the local surface topography. We illustrate the comparative advantages of the new method on real-world imagery from the COIL object data-base. Here it proves to be reliable in delivering both smooth elliptic and hyperbolic regions together with thin and continuous parabolic lines.

Our future plans revolve around further exploiting the topographic information delivered by the new shape-from-shading scheme. Our first objective is to investigate more sophisticated schemes for 3D object recognition from 2D imagery. At a more ambitious level, we intend to use the extracted shape index information and needle maps for view synthesis. These studies are already in hand and will be reported in due course.


Fig. 13. Plots of average ranking vs. distance from query over all images of a given object. Each one of the 72 images of the object is taken as the query image in turn, and all 1440 images in the database are ranked according to their distance from this query. The average ranking is found for all images at a given angular distance either side of the query. An angular distance of ±1 corresponds to images at ±5° from the query. The top two rows correspond to the objects in Fig. 12, whilst the bottom row shows plots averaged over all 20 objects in the database.
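The averaging procedure described in the caption of Fig. 13 can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code. The function `average_rank_by_offset` and its argument names are hypothetical, and the toy database (two objects with four views each, compared by Euclidean distance between 2D points) merely stands in for the 20 × 72 COIL views compared by histogram distance. Ties in distance are broken by database order, which is also an assumption.

```python
import math
from collections import defaultdict


def average_rank_by_offset(distance, object_ids, view_ids, n_views):
    """Average rank of same-object views as a function of view offset.

    distance   -- square matrix, distance[q][j] between images q and j
    object_ids -- object label of each image
    view_ids   -- view index (0 .. n_views-1) of each image
    n_views    -- number of views per object around the great circle
    """
    n = len(object_ids)
    rank_sum = defaultdict(float)
    rank_count = defaultdict(int)
    for q in range(n):
        # Rank the whole database by distance to the query; the query
        # itself is forced to the front of any zero-distance ties.
        order = sorted(range(n), key=lambda j: (distance[q][j], j != q))
        rank = {j: r for r, j in enumerate(order)}
        for j in range(n):
            if object_ids[j] != object_ids[q]:
                continue
            # Signed circular offset, in view steps either side of the query.
            offset = (view_ids[j] - view_ids[q]) % n_views
            if offset > n_views // 2:
                offset -= n_views
            rank_sum[offset] += rank[j]
            rank_count[offset] += 1
    return {k: rank_sum[k] / rank_count[k] for k in sorted(rank_sum)}


# Toy stand-in for the database: 2 objects x 4 views, one 2D point per view.
points = [(1, 0), (0, 1), (-1, 0), (0, -1),
          (11, 0), (10, 1), (9, 0), (10, -1)]
object_ids = [0] * 4 + [1] * 4
view_ids = [0, 1, 2, 3] * 2
distance = [[math.hypot(a[0] - b[0], a[1] - b[1]) for b in points]
            for a in points]
avg = average_rank_by_offset(distance, object_ids, view_ids, n_views=4)
```

With 72 views per object, one offset step corresponds to 5°, so the keys of the returned dictionary map directly onto the horizontal axes of the plots in Fig. 13.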


References

[1] D.C. Marr, Vision, Freeman, San Francisco, 1982.
[2] D.C. Marr, Colour indexing, Int. J. Comput. Vision 7 (1982) 11-32.
[3] B.K.P. Horn, Obtaining shape from shading information, in: P.H. Winston (Ed.), The Psychology of Computer Vision, McGraw Hill, New York, 1975, pp. 115-155.
[4] K. Ikeuchi, B.K.P. Horn, Numerical shape from shading and occluding boundaries, Artif. Intell. 17 (3) (1981) 141-184.
[5] B.K.P. Horn, M.J. Brooks, The variational approach to shape from shading, Comput. Vision Graphics Image Process. 33 (2) (1986) 174-208.
[6] M. Bichsel, A.P. Pentland, A simple algorithm for shape from shading, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 1992, pp. 459-465.
[7] A.M. Bruckstein, On shape from shading, Comput. Vision Graphics Image Process. 44 (1988) 139-154.
[8] R. Kimmel, A.M. Bruckstein, Tracking level sets by level sets: a method for solving the shape from shading problem, Comput. Vision Image Understanding 62 (1) (1995) 47-58.
[9] A.P. Pentland, Local shading analysis, IEEE Trans. Pattern Anal. Mach. Intell. 6 (1984) 170-187.
[10] R. Szeliski, Fast shape from shading, Comput. Vision Graphics Image Process.: Image Understanding 53 (1991) 129-153.
[11] P.N. Belhumeur, D.J. Kriegman, What is the set of images of an object under all possible lighting conditions? Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 1996, pp. 270-277.
[12] A. Blake, G. Brelstaff, Geometry from specularity, Proceedings of IEEE International Conference on Computer Vision, 1988, pp. 394-403.
[13] A. Blake, H. Bülthoff, Shape from specularities - computation and psychophysics, Philos. Trans. Roy. Soc. London, Ser. B 331 (1260) (1991) 237-252.
[14] J.J. Koenderink, A.J. van Doorn, Relief - pictorial and otherwise, Image Vision Comput. 13 (5) (1995) 321-334.
[15] J.J. Koenderink, A.J. van Doorn, C. Christou, J.S. Lappin, Perturbation study of shading in pictures, Perception 25 (9) (1996) 1009-1026.
[16] J.J. Koenderink, A.J. van Doorn, C. Christou, J.S. Lappin, Shape constancy in pictorial relief, Perception 25 (1996) 155-164.
[17] N.K. Logothetis, J. Pauls, H.H. Bülthoff, T. Poggio, View-dependent object recognition in monkeys, Curr. Biol. 4 (1994) 401-414.
[18] N.K. Logothetis, J. Pauls, H.H. Bülthoff, T. Poggio, Shape representation in the inferior temporal cortex of monkeys, Curr. Biol. 5 (1995) 552-563.
[19] E. Mingolla, J.T. Todd, Perception of solid shape from shading, Biol. Cybernet. 53 (1986) 137-151.
[20] B.K.P. Horn, Height and gradient from shading, Int. J. Comput. Vision 5 (1) (1990) 37-75.
[21] P.L. Worthington, E.R. Hancock, New constraints on data-closeness and needle map consistency for shape-from-shading, IEEE Trans. Pattern Anal. Mach. Intell. 21 (1999) 1250-1267.
[22] C. Dorai, A.K. Jain, Shape spectrum based view grouping and matching of 3D free-form objects, IEEE Trans. Pattern Anal. Mach. Intell. 19 (1997) 1139-1146.
[23] F.P. Ferrie, J. Lagarde, Curvature consistency improves local shading analysis, Proceedings of IEEE International Conference on Pattern Recognition, Vol. I, 1990, pp. 70-76.
[24] P. Sander, S.W. Zucker, Inferring surface structure and differential structure from 3D images, IEEE Trans. Pattern Anal. Mach. Intell. 12 (1990) 833-854.
[25] J.J. Koenderink, A.J. van Doorn, Surface shape and curvature scales, Image Vision Comput. 10 (1992) 557-565.
[26] M.J. Brooks, B.K.P. Horn, Shape and source from shading, International Joint Conference on Artificial Intelligence, 1985, pp. 932-936.
[27] P.L. Worthington, E.R. Hancock, Needle map recovery using robust regularizers, Proceedings of British Machine Vision Conference, Vol. I, BMVA Press, 1997, pp. 31-40.
[28] P.L. Worthington, E.R. Hancock, Needle map recovery using robust regularizers, Image Vision Comput. 17 (1999) 545-557.
[29] S.Z. Li, Discontinuous MRF prior and robust statistics: a comparative study, Image Vision Comput. 13 (3) (1995) 227-233.
[30] D.C. Hoaglin, F. Mosteller, J.W. Tukey (Eds.), Understanding Robust and Exploratory Data Analysis, Wiley, New York, 1983.
[31] P. Huber, Robust Statistics, Wiley, Chichester, 1981.
[32] M.J. Swain, D.H. Ballard, Colour indexing, Int. J. Comput. Vision 7 (1991) 11-32.

About the Author - PHILIP WORTHINGTON attained a B.A. (Hons) in Engineering and Computing Science from University College, Oxford, in 1996. During 1996-99 he undertook EPSRC-funded D.Phil. research in the Computer Vision group at the University of York, where he worked under Professor Edwin Hancock on Shape from Shading and Appearance-based Object Recognition. In 1998 he spent 2 months at NTT Basic Research Laboratories in Atsugi, Japan, courtesy of JISTEC and the British Council, working with Professor Hiroshi Murase on using Shape from Shading in conjunction with parametric eigenspace techniques. In 1999 he was awarded the University of York Gibbs-Plessey award to visit academic and commercial computer vision groups in the US. Philip has published several papers on Shape from Shading and Object Recognition in international conferences and journals. Since October 1999 he has worked in the Computational Electromagnetics Group at Roke Manor Research, Romsey, UK.

About the Author - EDWIN HANCOCK gained his B.Sc. in Physics in 1977 and Ph.D. in High Energy Nuclear Physics in 1981, both from the University of Durham, UK. After a period of postdoctoral research working on charm-photo-production experiments at the Stanford Linear Accelerator Centre, he moved into the fields of computer vision and pattern recognition in 1985. Between 1981 and 1991, he held posts at the Rutherford-Appleton Laboratory, the Open University and the University of Surrey. He is currently Professor of Computer Vision in the Department of Computer Science at the University of York, where he leads a group of some 15 researchers in the areas of computer vision and pattern recognition. Professor Hancock has published about 200 refereed papers in the fields of High Energy Nuclear Physics, Computer Vision, Image Processing and Pattern Recognition. He was awarded the 1990 Pattern Recognition Society Medal and received an outstanding paper award in 1997. Professor Hancock serves as an Associate Editor of the journals IEEE Transactions on Pattern Analysis and Machine Intelligence and Pattern Recognition. He has been a guest editor for the Image and Vision Computing Journal. He has recently completed guest-editing a special edition of the Pattern Recognition journal devoted to energy minimisation methods in computer vision and pattern recognition. He chaired the 1994 British Machine Vision Conference and has been a programme committee member for several national and international conferences.
