A probabilistic approach to 3-D inference of geons from a 2-D view

Alain Jacot-Descombes and Thierry Pun
Computer Vision Group, Computing Science Center, University of Geneva, 12 rue du Lac, 1207 Geneva, Switzerland
email: [email protected]

ABSTRACT

A new, probabilistic approach for inferring 3-D volumetric primitives from a single 2-D view is presented. This recognition relies on the assumption that every object can be decomposed into component parts that belong to a finite set or alphabet of volumetric primitives (geons). For each possible primitive from the permissible set, a conditional probability function is computed. This law specifies the probability of obtaining the primitive given an observable 2-D measure or feature. The distribution functions are determined by simulation, on the basis of a representative number of random projections of the primitives. The measures themselves are chosen in such a way that (a) they can easily be extracted from real images and (b) their discriminative power for the volumetric primitive inference is high. Examples illustrate the proposed approach.

1. INTRODUCTION

One of the major goals of a computer vision system is the interpretation of any given image or scene. This interpretation requires recognition of some relevant objects or features that compose the scene. In turn, recognizing the objects themselves can be performed by identifying the object components. Two underlying assumptions are necessary in order to make this recognition by components simple and robust. First, it must be possible to decompose every object into parts belonging to a finite set of geometric volumetric primitives. A vision system able to recognize this alphabet of primitives should therefore be capable of recognizing any object. The second assumption is that any of the geometric primitives can be distinguished from any other by some geometric properties, thus avoiding ambiguous identification.

In the present case, the geometric volumetric primitives are classes of geons [3,4]. This paper deals with the 3-D inference of a geon from a 2-D orthographic projection, based on various non-accidental properties. In addition to the 2-D geometric features proposed by Biederman, which provide a qualitative description of each class of geon, some measures are proposed for describing each 3-D inference in a more quantitative manner.

Section 2 presents the world of geons studied in this paper and discusses the varying visual distance between classes of geons. Section 3 explains the probabilistic approach used for the 3-D inference and describes the simulation method allowing the computation of the required probability distribution functions. It also reviews some properties of measures used for bottom-up inference in a geon-based vision system. Section 4 shows some results obtained by the probabilistic approach in order to infer the most probable 3-D description from a 2-D view of a geon.

2. THE WORLD OF GEONS

In this paper, only some of the 36 geon classes proposed by Biederman are studied. The mathematical model used here to represent geons is the generalized cylinder, defined by a cross-section ρ(θ), an axis and a sweeping rule r(z) of the cross-section along the axis [8], as illustrated in Fig. 1. The classes of geons used are those with a straight axis, a straight-edge cross-section (square or rectangle) or a curved-edge one (circle or ellipse), and a linear sweeping rule (constant or expanding).

The discrimination problem between the members of this geon alphabet does not always have the same complexity. Looking at a projection of a geon, it is quite easy to visually distinguish a straight-edge from a curved-edge cross-section. However, it becomes more difficult to tell a square from a rectangle, or a circle from an ellipse. This shows that the visual distance between classes of geons is not constant. This distance can even be zero, in which case other approaches must be considered to infer the most probable 3-D primitive. The 2-D properties are sometimes not sufficient to infer a non-ambiguous solution, as illustrated in Fig. 2; the 2-D projection can represent an elliptical cross-section geon as well as a circular one.

SPIE Proceedings Vol. 1708: “Applications of Artificial Intelligence X: Machine Vision and Robotics”, 20-24 April 1992, Orlando, Florida, USA.


[Figure 1 not reproduced. Labels: axes x, y, z; cross-section ρ(θ); sweeping rule r(z); viewing vector n; angles α, β.]

Fig. 1 The mathematical model used for the geon representation and the observation sphere. ρ(θ) defines the reference cross-section; r(z) defines the sweeping rule of this cross-section along the axis z. n is a vector of opposite direction to the viewing direction of the geon and perpendicular to the projection plane; its direction is defined by the pair of angles (α, β).

Actually, the circle is a particular case of an ellipse; this raises the need for a quantitative rather than qualitative discrimination between circle and ellipse. From the projection in Fig. 2, what are the parameters of the elliptical cross-section of the original geon? Unfortunately, this question has infinitely many answers. However, looking at a geon projection, the human visual system automatically constructs the most plausible 3-D description, while it cannot even imagine some of the mathematically possible solutions. This remark leads to a generalization of the initial question: given a 2-D projection of a geon, what is the most probable shape of the corresponding 3-D geon? To answer this question, a probabilistic approach has been developed to infer a 3-D geon from its projection, as described in the next section.

3. THE PROBABILISTIC APPROACH

3.1. General Principle

In order to infer the most probable 3-D geon of shape S from a 2-D image, a 2-D feature or measure M has to be found on the projection that gives characteristic information about the 3-D shape. Formally, the following expression must be known:

$$\mathrm{Prob}(S \mid M) \tag{1}$$

Considering the geon class with elliptical cross-section as illustrated in Fig. 2, a 2-D measure indicating the most probable shape for the 3-D geon cross-section has to be found. Amongst S1, S2, S3 (Fig. 2), this most probable shape would be the one giving the largest value amongst Prob(S1 | M), Prob(S2 | M), Prob(S3 | M).
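Written as a decision rule, the selection described above amounts to a maximum over the candidate shapes. The symbols Ŝ (the inferred shape) and m (the value of M measured on the projection) are introduced here only for illustration:

```latex
\hat{S} \;=\; \underset{S \in \{S_1,\, S_2,\, S_3\}}{\arg\max}\; \mathrm{Prob}(S \mid M = m)
```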


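[Figure 2 not reproduced. Three candidate 3-D geons, S1 (circle), S2 (ellipse) and S3 (ellipse), each linked by a question mark to the same 2-D projection.]

Fig. 2 2-D projection of a geon with three possible 3-D inferences.

In order to determine Prob(S | M), expression (1) can be rewritten as:

$$\mathrm{Prob}(S \mid M) = \frac{\mathrm{Prob}(M \mid S) \cdot \mathrm{Prob}(S)}{\mathrm{Prob}(M)} \tag{2}$$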

Looking at the right-hand side of equation (2), three terms now have to be evaluated. Prob(S) depends on the number p of volumetric primitives to be discriminated, and possibly on the context. This value is currently set to 1/p, but could be modified according to a priori information regarding the nature of the possible primitives. Prob(M | S) and Prob(M) are not known a priori, but can be obtained by simulation, as described below.

3.2. Simulation

In order to compute expression (2), two distributions dependent on the measure M have to be computed: Prob(M | S) and Prob(M).

Prob(M | S), the distribution of the measure M given a geon of shape S, is constructed by studying the values of M on a representative number of projections of the geon. In practice, the geon is positioned at the center of an observation sphere [2], as illustrated in Fig. 1; viewing directions, symbolized by n and defined by the pair of longitudinal and latitudinal angles (α, β), are then randomly chosen; finally, the geon is orthographically projected onto a plane perpendicular to the vector n. The random function has been carefully defined in such a way that the viewing directions are uniformly distributed on the surface of the sphere; it could be changed if some directions had to be favoured. The measure M is then evaluated on each projection and its distribution constructed.

Prob(M) expresses the probability that the measure M takes the value m, for any geon on which this measure is defined. In practice, the distribution of the measure M is constructed for each geon of shape S that can be discriminated by this measure; the global distribution of M is then computed according to

$$\mathrm{Prob}(M) = \sum_{i=1}^{p} \mathrm{Prob}(M \mid S_i) \tag{3}$$

where p is the number of primitives to be discriminated. It must be noted that the measure has to be carefully chosen in order to optimize the discrimination between geons; the selection of appropriate 2-D measures is presented in section 3.3.

The observation sphere also provides the values of the measure M as a function of the viewing direction (α, β). This information can be extremely useful for a vision system, since it is possible, from a measure M computed on a 2-D projection, to restrict and infer the possible viewing directions of an image. Constraints can therefore be established that help the system to recognize other components of this image.

3.3. The 2-D measures

Particular attention must be paid to the choice of the measure M if it is to be used for bottom-up inference in a real vision system. In the simulation phase, any measure can be computed, since precise data are generated mathematically by the projection of a geometrical model. In a real recognition system, however, the data used for computing the value of the indexing measure are extracted from an image and are therefore noisy. The measure must consequently be easily obtainable from an image.

On the one hand, a measure is defined and its distribution computed for each geon to be discriminated. On the other hand, this 2-D measure is extracted from an image during the recognition process. The extracted value is used for indexing the most probable geon through the probability distributions, as illustrated in Fig. 3. A more robust discrimination could be obtained by combining several measures, which would yield better peak separation.

[Figure 3 not reproduced. Plot of Prob(S_i | M) (vertical axis, 0 to 1) against the 2-D measure M for S1, S2 and S3; at the extracted measure m, Prob(S2 | m) > Prob(S3 | m) > Prob(S1 | m).]

Fig. 3 The indexing mechanism of a geon, given a measure.

This measure can be either local or global. A local measure is certainly easier to compute from an image but contains less information than a global one. Local measures can for example be the curvature or the angle at critical points such as junctions. A global measure is more significant, but its extraction from an image needs a grouping process [10,11,12] and is therefore more difficult to compute. As an example, the measure can be the eccentricity of an elliptical cross-section after an elliptic fit [1,5], or an analytical computation from two corresponding junctions. Besides being local or global, the measure used for discrimination must also be non-accidental [6], meaning that its domain of definition must be the entire observation sphere, except for a negligible number of viewing directions.


Expression (2) helps in defining what a good discriminant measure should be. Assuming that the p geons of shape S_i are equiprobable, expression (2) becomes:

$$\mathrm{Prob}(S_i \mid M) = \frac{1}{p} \cdot \frac{\mathrm{Prob}(M \mid S_i)}{\mathrm{Prob}(M)}, \qquad i = 1 \ldots p \tag{4}$$
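As a minimal sketch of how equation (4) could be evaluated on binned distributions, the following Python fragment turns per-shape histograms of M into posteriors and indexes the most probable shape for a measured value m. The function names `posteriors` and `most_probable_shape` and the toy numbers are hypothetical, not taken from the paper. One assumption should be stated clearly: the evidence term is computed with the prior weights (law of total probability) so that the posteriors of each bin sum to one; this differs from an unweighted sum only by a constant factor and does not change which shape wins.

```python
def posteriors(likelihoods, priors=None):
    """Bayes rule per histogram bin (hypothetical helper).

    likelihoods[i][k] is Prob(M in bin k | S_i); the result post[i][k]
    is Prob(S_i | M in bin k). Priors default to 1/p (equiprobable shapes).
    """
    p = len(likelihoods)
    n_bins = len(likelihoods[0])
    if priors is None:
        priors = [1.0 / p] * p
    post = [[0.0] * n_bins for _ in range(p)]
    for k in range(n_bins):
        # Assumption: prior-weighted evidence, so each column sums to one.
        evidence = sum(priors[i] * likelihoods[i][k] for i in range(p))
        if evidence > 0.0:
            for i in range(p):
                post[i][k] = priors[i] * likelihoods[i][k] / evidence
    return post


def most_probable_shape(post, m, lo, hi):
    """Index of the shape with the largest posterior for a measured value m.

    The measure range [lo, hi] is split into len(post[0]) equal bins.
    """
    n_bins = len(post[0])
    k = int((m - lo) / (hi - lo) * n_bins)
    k = max(0, min(k, n_bins - 1))
    scores = [row[k] for row in post]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy example with two shapes and three bins (made-up numbers).
toy_likelihoods = [
    [0.8, 0.2, 0.0],  # Prob(M | S_1)
    [0.1, 0.6, 0.3],  # Prob(M | S_2)
]
toy_post = posteriors(toy_likelihoods)
best_index, best_scores = most_probable_shape(toy_post, m=0.5, lo=0.0, hi=1.0)
```

In the toy example the measured value falls in the middle bin, where the second shape has the larger likelihood and therefore the larger posterior.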

In order to maximize the right-hand side of equation (4) for each shape S_i, the distribution of the measure M for each S_i has to be as distinct as possible from the others [7]. This means that when a measure is chosen, a distinct peak should be obtained in its distribution for each shape S_i to discriminate: this peak is the characterizing value of a measure for a given shape. This will be illustrated by the experiments.

4. EXPERIMENTS

4.1. Elliptical cylinder

Let us consider as an example of geon the cylinder with elliptical or circular cross-section and constant sweeping rule, such as S1, S2, S3 in Fig. 2. The chosen measure is the eccentricity of the projected cross-section, defined as follows:

$$e = \frac{\sqrt{a^2 - b^2}}{a} \tag{5}$$

where a and b are the two principal axes of the ellipse defined by:

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \tag{6}$$

This normalized measure varies between 0 (circle) and 1 (line); it is size independent. For this experiment, four geons have been defined (S1, S2, S3, S4), the respective eccentricities of their elliptical cross-sections being 0.866, 0.745, 0.436 and 0.

Fig. 4.a shows for each geon S_i the distribution Prob(M | S_i) of the eccentricity, obtained from 100000 randomly projected elliptical cross-sections. The peak at the right of each histogram represents an eccentricity of 1; it corresponds to the equatorial views of the geon, which are numerous due to the size of the equatorial area. These particular views, which give a projection looking like a 2-D planar rectangle, do not provide enough information for inferring a 3-D geon. However, they can be considered as accidental, because all views except equatorial and polar ones allow inference of a cylinder without ambiguity; incidentally, they will not disturb the maximum of Prob(S_i | M). For the present purpose, the interesting peak is located exactly at the eccentricity value of the 3-D geon cross-section and can therefore be considered as a characterizing value of the 2-D projections.

Fig. 4.b represents the probability Prob(S_i | M) of having each of the four geons, given the eccentricity value of the 2-D projected cross-section. These histograms are computed using equation (4). They confirm that the most probable value for the eccentricity of an inferred elliptical cylinder is the value of the eccentricity measured on the projected cross-section.

The graphic depiction of Fig. 4.c is obtained by superimposing the four histograms of Fig. 4.b. It shows the subdivision of the value domain of the eccentricity into distinct subranges, from which either S1, S2, S3 or S4 can be inferred, thus providing the desired 3-D from 2-D inference. The relative size of the four distinct peaks corresponding to the eccentricity value of the 3-D cylinder quantifies the strength of this discriminant measure.

Fig. 5 shows for each of the four geons a cartographic projection of the observation sphere, on which the eccentricity values of the 100000 randomly projected cross-sections have been represented by a grey level. These graphs represent the eccentricity as a function of the viewing direction, defined by the pair of angles (α, β). The colour varying from black to white corresponds to an eccentricity value varying from 1 to 0. The lack of values near the poles of the observation sphere (latitude equal to 0 and π) is due to the principle of the cartographic projection: far fewer points are mapped onto the top and bottom rows of the rectangular representation. The four clear spots are the locations where the 2-D projected cross-section is seen almost as a circle; the less elongated the cross-section of the 3-D geon, the more oblong and the nearer to the pole these spots are. Given a value for the eccentricity, these graphs can be used to restrict the possible viewing directions of the analyzed image and therefore help a vision system to recognize other components of this image [9].
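The simulation behind Fig. 4.a and Fig. 5 can be sketched as follows, under several stated assumptions. The paper does not give its random function; here directions are made uniform on the sphere by drawing cos β uniformly in [-1, 1] and α uniformly in [0, 2π), which is one standard choice. The projected eccentricity is obtained from the orthographic projections of the two semi-axes of the cross-section, whose Gram matrix has the squared semi-axes of the projected ellipse as eigenvalues. All function names and the default sample and bin counts are hypothetical.

```python
import math
import random


def random_view_direction(rng):
    """Unit vector drawn uniformly on the sphere (cos(beta) uniform in [-1, 1])."""
    z = rng.uniform(-1.0, 1.0)
    alpha = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(max(0.0, 1.0 - z * z))
    return (r * math.cos(alpha), r * math.sin(alpha), z)


def projected_eccentricity(a, b, n):
    """Eccentricity of the orthographic projection, along the unit vector n,
    of the ellipse with semi-axes a (along x) and b (along y) in the plane z = 0."""
    nx, ny, _ = n
    # Gram matrix of the two projected semi-axis vectors.
    g11 = a * a * (1.0 - nx * nx)
    g22 = b * b * (1.0 - ny * ny)
    g12 = -a * b * nx * ny
    half_trace = 0.5 * (g11 + g22)
    det = g11 * g22 - g12 * g12
    disc = math.sqrt(max(0.0, half_trace * half_trace - det))
    lam_max = half_trace + disc            # squared major semi-axis
    lam_min = max(0.0, half_trace - disc)  # squared minor semi-axis
    return math.sqrt(max(0.0, 1.0 - lam_min / lam_max))


def measure_histogram(a, b, n_views=20000, n_bins=50, seed=0):
    """Normalized histogram approximating Prob(M | S) for the eccentricity."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    for _ in range(n_views):
        e = projected_eccentricity(a, b, random_view_direction(rng))
        counts[min(int(e * n_bins), n_bins - 1)] += 1
    return [c / n_views for c in counts]


def compatible_directions(a, b, m, tol=0.02, n_alpha=72, n_beta=36):
    """Grid directions (alpha, beta) whose projected eccentricity lies within
    tol of a measured value m; beta = 0 and beta = pi are the poles."""
    hits = []
    for j in range(n_beta + 1):
        beta = math.pi * j / n_beta
        for i in range(n_alpha):
            alpha = 2.0 * math.pi * i / n_alpha
            n = (math.sin(beta) * math.cos(alpha),
                 math.sin(beta) * math.sin(alpha),
                 math.cos(beta))
            if abs(projected_eccentricity(a, b, n) - m) <= tol:
                hits.append((alpha, beta))
    return hits
```

A call such as `measure_histogram(2.0, 1.0)` approximates the distribution for a cross-section of eccentricity 0.866, and `compatible_directions(2.0, 1.0, m)` lists the grid directions consistent with a measured eccentricity m, in the spirit of the maps of Fig. 5.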


[Figure 4 not reproduced. Panels 4.a and 4.b: histograms for S1 to S4 against the eccentricity e, with vertical axes Prob(M | S_i) and Prob(S_i | M) ranging from 0 to 1; the cross-section eccentricities 0.866, 0.745, 0.436 and 0 are marked on the horizontal axes. Panel 4.c: superimposition of the four graphs of 4.b.]

Fig. 4 4.a: distribution of the eccentricity of the 2-D projected elliptical cross-section for each 3-D geon S_i. 4.b: probability of having the geon S_i given the eccentricity of the 2-D projected elliptical cross-section, computed by using equation (4). 4.c: graphical depiction obtained by superimposing the four graphs of 4.b, showing different intervals corresponding to different geons.


[Figure 5 not reproduced. Four grey-level maps, one per geon S1 to S4; horizontal axis: longitude (α), starting at 0; vertical axis: latitude (β), from 0 (pole) through the equator to π (pole).]

Fig. 5 The distributions of the eccentricity value for the four geons, depending on the viewing direction. The colour varying from black to white corresponds to an eccentricity value varying from 1 to 0. The projection is from sphere to cylinder.

4.2. Elliptical cone

Let us consider as a second example of geon the cone with elliptical cross-section and linearly expanding sweeping rule, as shown in Fig. 6.a. The chosen non-accidental 2-D measure is the value of the angle φ, defined by the two joined sides of the projected cone (Fig. 6.c). The values of this size-independent measure vary between 0° and 180°. This measure can also be considered in the case of a truncated cone (Fig. 1), by extending its two sides until they intersect each other. For this experiment, five geons have been defined (S1, S2, S3, S4, S5), with the eccentricity of their elliptical cross-section varying according to:

$$e_{i+1}^{2} = e_i \tag{7}$$

These five 3-D cones have a constant height h; their respective elliptical reference cross-sections have a constant width equal to h/2 and a length such that equation (7) is satisfied. Two characterizing angles Φ_min and Φ_max of the 3-D cone can be defined in space, similarly to the 2-D angle φ, as shown in Fig. 6.b. Φ_min, respectively Φ_max, is the angle between the opposite sides of the 3-D cone that intersect the axis b, respectively a, of the elliptical cross-section. The following table gives the numerical values of these parameters for this experiment:

Geon   height h   axis a   axis b   eccentricity e   Φ_min (°)   Φ_max (°)
S1     4.000      2.419    2.000    0.5625           14.036      16.824
S2     4.000      3.024    2.000    0.7500           14.036      20.705
S3     4.000      4.000    2.000    0.8660           14.036      26.565
S4     4.000      5.464    2.000    0.9306           14.036      34.334
S5     4.000      7.592    2.000    0.9647           14.036      43.502
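The entries of this table can be cross-checked with a short calculation. The sketch below starts from e = 0.5625 for S1, applies equation (7) read as e_{i+1}² = e_i (which the tabulated eccentricities satisfy), and inverts equation (5) to obtain the axis a from e and b. The relation used for the angles, arctan((axis/2)/h), is an assumption inferred from the tabulated numbers rather than a formula stated in the text. The function names are hypothetical.

```python
import math

H = 4.0  # cone height h, from the table
B = 2.0  # constant axis b = h / 2, from the table


def eccentricity(a, b):
    """Equation (5): e = sqrt(a^2 - b^2) / a, for a >= b > 0."""
    return math.sqrt(a * a - b * b) / a


def axis_from_eccentricity(e, b):
    """Invert equation (5): the long axis a for a given e and short axis b."""
    return b / math.sqrt(1.0 - e * e)


def characterizing_angle_deg(axis, h):
    """Assumed relation matching the table: arctan((axis / 2) / h), in degrees."""
    return math.degrees(math.atan2(axis / 2.0, h))


def cone_table(e1=0.5625, n=5, b=B, h=H):
    """Rows (name, a, e, phi_min, phi_max) for the cones S1..Sn."""
    rows = []
    e = e1
    for i in range(1, n + 1):
        a = axis_from_eccentricity(e, b)
        rows.append((
            "S%d" % i,
            a,
            e,
            characterizing_angle_deg(b, h),  # constant, since b is constant
            characterizing_angle_deg(a, h),  # grows with the axis a
        ))
        e = math.sqrt(e)  # equation (7): e_{i+1}^2 = e_i
    return rows


if __name__ == "__main__":
    for name, a, e, phi_min, phi_max in cone_table():
        print("%s  a=%.3f  e=%.4f  phi_min=%.3f  phi_max=%.3f"
              % (name, a, e, phi_min, phi_max))
```

Under these assumptions the computed rows agree with the tabulated values to within rounding, which also shows that Φ_min stays constant because the axis b is constant.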

[Figure 6 not reproduced. Panels 6.a and 6.b (3-D geons): the elliptical cone and its sections, with the cross-section spanning -a/2 to a/2 along y and -b/2 to b/2 along x, and the angles Φ_max and Φ_min. Panels 6.c and 6.d (2-D projections): the angle φ.]

Fig. 6 6.a: 3-D elliptical cone. 6.b: sections of the cone according to the y-z plane and the x-z plane, on which Φ_max and Φ_min are respectively defined. 6.c: a 2-D non-accidental view of the cone and the angle φ between the two joined sides of the cone. 6.d: a 2-D accidental view of the cone.

Fig. 7.a shows for each geon S_i the distribution Prob(M | S_i) of the angle φ, obtained from 100000 random projections. The bin at the left of each histogram, denoted by p on the horizontal axis of the histograms (Fig. 7), represents accidental views of the geons, i.e. the views on which only the geon cross-section is visible, as shown in Fig. 6.d. These particular views do not provide enough information for inferring a 3-D geon.

For the present purpose, there are two interesting values on the histograms. The first one, denoted by q on the horizontal axis of the histograms (Fig. 7), is the left bound of the histograms and is equal to Φ_min. This means that a cone is always seen with an angle φ ≥ Φ_min. The second value, denoted by r on the horizontal axis of the histograms, corresponds to a peak located exactly at Φ_max, and is therefore considered as a characterizing value of the 2-D projections.

Fig. 7.b represents the probability Prob(S_i | M) of having each of the five geons, given the value of the angle φ computed on the 2-D projected cones. These histograms are computed using equation (4). They confirm that the most probable value for the angle Φ_max, from which the eccentricity of the reference cross-section of the 3-D inferred cone can be computed, is the value of the angle φ measured on the 2-D projected cone.


[Figure 7 not reproduced. Panels 7.a and 7.b: histograms for S1 to S5 against the angle φ (0° to 180°), with vertical axes Prob(M | S_i) and Prob(S_i | M) ranging from 0 to 1 and the marks p, q and r on the horizontal axes. Panel 7.c: superimposition of the five graphs of 7.b.]

Fig. 7 7.a: distribution of the angle φ (Fig. 6.c) computed on the 2-D projections for each 3-D geon S_i. The particular values on the horizontal axis of the histograms are the accidental views (p), φ = Φ_min (q) and φ = Φ_max (r). 7.b: probability of having the geon S_i given the angle φ, computed by using equation (4). 7.c: graphical depiction obtained by superimposing the five graphs of 7.b, showing different peaks corresponding to different geons.


The graphic depiction of Fig. 7.c is obtained by superimposing the five histograms of Fig. 7.b. It shows the subdivision of the value domain of the angle φ into distinct subranges, from which either S1, S2, S3, S4 or S5 can be inferred, thus providing the desired 3-D from 2-D inference.

5. CONCLUSION

The general problem of recognition by components is seen here as the probabilistic inference of 3-D geons from 2-D projections. This approach first requires the definition of a measure able to discriminate between geons of the same or a different class. Using simulation, two distributions based on this measure are computed, yielding the probability of having a 3-D geon given a 2-D measure.

This approach can be applied to any finite set of volumetric primitives to be discriminated, as long as a measure can be found that characterizes each primitive. It should be noted that such an ideal measure does not exist. The approach nevertheless makes it possible to select the measure giving the best discrimination. In a vision system, this measure has to be chosen so that it is easy enough to compute from an image and contains enough information.

The model of the observation sphere makes it possible to determine the set of most probable viewing directions, given the value of a measure. This information can be used as a strong constraint that eases the recognition of other components. The model also makes it possible to control the distribution of the viewing directions, by changing the random function that generates these directions or by considering only part of the uniformly generated viewing directions. This provides a better adaptation of the model to the morphology of each primitive.

Future work will include the study of other geons and measures. The discrimination between visually distant classes of geons will be made by qualitative measures. For the other classes, or for geons belonging to the same class, measures will be selected in order to obtain quantitative inference. The combination of measures in the probabilistic equations will also be considered.

6. ACKNOWLEDGMENTS

This work is supported in part by grants from the Swiss National Fund for Scientific Research (FNRS 20-26475.89) and from the Swiss National Research Program 23 “AI and Robotics” (PNR23 4023-27036).

7. REFERENCES

1. D. H. Ballard, “Generalizing the Hough Transform to Detect Arbitrary Shapes”, Pattern Recognition, vol. 13, nº 2, pp. 111-122, 1981.
2. J. Ben-Arie, “The Probabilistic Peaking Effect of Viewed Angles and Distances with Application to 3-D Object Recognition”, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. PAMI-12, nº 8, pp. 760-774, 1990.
3. I. Biederman, “Human Image Understanding: Recent Research and a Theory”, Computer Vision, Graphics and Image Processing, vol. 32, pp. 29-73, 1985.
4. I. Biederman, “Aspects and Extensions of a Theory of Human Image Understanding”, in: Computational Processes in Human Vision: An Interdisciplinary Perspective, Z. W. Pylyshyn, Ed., pp. 370-427, Ablex Publishing Corporation, Norwood, New Jersey, 1988.
5. B. B. Chaudhuri and G. P. Samanta, “Elliptic Fit of Objects in Two and Three Dimensions by Moment of Inertia Optimization”, Pattern Recognition Letters, vol. 12, pp. 1-7, 1991.
6. D. G. Lowe, Perceptual Organization and Visual Recognition, Kluwer, Boston, 1985.
7. D. G. Lowe, “Visual Recognition as Probabilistic Inference from Spatial Relations”, in: AI and the Eye, A. Blake and T. Troscianko, Eds., pp. 261-279, John Wiley & Sons Ltd., 1990.
8. J. Ponce, D. Chelberg and W. B. Mann, “Invariant Properties of Straight Homogeneous Generalized Cylinders and Their Contours”, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. PAMI-11, nº 8, pp. 951-965, 1989.
9. T. Pun, “The Geneva Vision System: Modules, integration and primal access”, AI and Vision Group Report 90.06, Computing Science Center, University of Geneva, December 1990.
10. T. Pun, “Electromagnetic Models for Perceptual Grouping”, in: Advances in Machine Vision: Strategies and Applications, C. Archibald, Ed., World Scientific Publ. Co, 1992.
11. A. Shashua and S. Ullman, “Grouping Contours by Iterated Pairing Network”, in: Advances in Neural Information Processing Systems 3 (NIPS 90), R. P. Lippman, J. E. Moody and Touretzky, Eds., 1991.
12. S. W. Zucker, C. David, A. Dobbins and L. Iverson, “The Organization of Curve Detection: Coarse Tangent Fields and Fine Spline Covering”, Proc. 2nd Int. Conf. Comp. Vision, pp. 568-577, Tampa, Fl, December 1988.
