Indexing visual representations through the complexity map Benoit Dubuc and Steven W. Zucker
TR-CIM-95-02
March 1995
Computer Vision and Robotics Laboratory Centre for Intelligent Machines McGill University Montreal, Quebec, Canada
Accepted for the 5th International Conference on Computer Vision, Cambridge MA, June 1995
Postal Address: 3480 University Street, Montreal, Quebec, Canada H3A 2A7 Telephone: (514) 398-6319 Telex: 05 268510 FAX: (514) 398-7348 Email:
[email protected]
Abstract In differential geometry curves are characterized as mappings from an interval to the plane. In topology curves are characterized as a Hausdorff space with certain countability properties. Neither of these definitions captures the role that curves play in vision, however, in which curves can denote simple objects (such as a straight line), or complicated objects (such as a jumble of string). The difference between these situations is in part a measure of their complexity, and in part a measure of their dimensionality. Our goal is to develop a formal theory of curves appropriate for computational vision in general, and for problems like separating straight lines from strings in particular. The theory is applied to the problem of perceptual grouping.
Résumé In differential geometry, curves are characterized as a mapping from an interval to the plane. In topology, curves are characterized by a Hausdorff space with certain cardinality properties. Neither of these definitions captures the role that curves play in vision, however, in the sense that curves are sometimes simple objects (such as a straight line) and sometimes complicated objects (such as a bundle of tangled lines). The difference between these two cases is due in part to their complexity and in part to their dimensionality. Our goal here is to develop a formal theory of curves suited to computer vision and to problems such as separating simple curves from more complicated ones. Finally, our theory is applied to the problem of perceptual grouping.
1. Introduction
Edge detection implies a basic problem in perceptual grouping: once the local structure is established, the transition to global structure must be effected. To illustrate, imagine standing on an edge element in an unknown image, as in the top two panels of Fig. 1. Is this edge part of a curve, or perhaps part of a texture? If the former, which is the next element along the curve? If the pattern is a texture, is it a hair pattern (in which nearby elements are oriented similarly) or a spaghetti pattern (in which they are not)? The question is in part one of complexity, since curves are "simpler" than textures, and in part one of dimensionality, since curves are 1-D and textures are 2-D. In this paper we propose a measure of representational complexity that seeks to answer this question.

Measures for curves, dimensionality, and complexity are coupled notions, and the relationships between them are important practically as well as theoretically. Measures of curves might include their length, the number of components, or the area covered. However, the situation is more subtle than this, as is illustrated in Figs. 1 and 2. The first example (Fig. 2a) is a demonstration due to Kanizsa (1979), in which a pin-striped surface appears to be occluding a rectangle. Thus curves, or sets of curves, can actually connote either the outline of objects (as in the rectangle) or surfaces (the pinstripes). Closer examination reveals that the rectangle is actually continuous through the surface, however, suggesting that the visual inferences somehow group the pinstripes together and ignore the fact that the rectangle is the longest curve in the image. This example clearly shows that neither the length of the curve (defining the rectangle) nor the number of components (defining the pinstripes) is dominant. If another pinstripe were added (as shown in Fig. 2b), the percept would not change.
Moreover, in contradiction to Ullman (1990), the saliency of the rectangle is camouflaged, and the question of defining the subjective "edges" that delimit the pinstriped rectangle arises.

Our second example is an image of a statue called "Paolina" (see Fig. 1 bottom right). The result of applying our edge operator at a given scale is shown in Fig. 1 (bottom left), and the problem of integrating local information is the one we are interested in. Two areas in the edge map are highlighted, indicating different places where we would like to focus attention for integration. In the lower left region (shoulder), the resulting object is simple and a curve representation seems appropriate to group the edge elements. In the top region, subtending parts of the hair structure, a "curve" representation would be highly inadequate as opposed to a texture representation.

What is required is a measure of the complexity of curves, and our specific goal in this paper is to propose one. We show at the end of the paper how it successfully handles both the Kanizsa and the Paolina examples. It is based on an intermediate representation, the discrete tangent map, and a consequence of our analysis is that such intermediate representations are necessary for properly separating curve-like patterns that are sparse from those that fill areas and from those that extend mainly along their length. We further submit that these representational differences capture the finite stages of segmentation, but via complexity analysis rather than pixel grouping.
Figure 1. Example of a curve-like set: an edge map of a grey level image. Bottom left, image of a statue (Paolina) provided by Pietro Perona. Bottom right, detected edges (tangent map). The gray shaded areas show where to focus integration and illustrate the need for different representations for the grouping of local edge elements. It is clear from this that integrating the response of local edge detectors in the hair region, as shown in middle right, is highly problematic. By what principles should the tangents be grouped in the upper right blowup?
Figure 2. A curve-like set pattern due to Kanizsa, in which a pin-striped surface appears to occlude a rectangle. This figure illustrates the need to use different representations for integration. Notice how difficult it is to perceive the rectangle; the grating seems to predominate over the perception of the closed curve. Moreover, comparing (a) and (b), which texture has more stripes? Can you detect at first glance the difference between the two gratings? Texture representations must take both this difference and this similarity into account.
Much of the paper is mathematical. We introduce the techniques of geometric measure theory to computational vision to provide a definition of a curve-like set. This is a richer class of objects than the Jordan curve, and is independent of any specific functional form or parameterization. It is appropriate for vision in that the set is not unlike the set of pixels through which a curve might pass (at a finite scale). We are then able to introduce a parameterization-free intermediate representation, the Besicovitch tangent set, and to build our complexity map on top of it. The complexity map is related to an abstract notion of dimension, since it is dimension that separates (0-dimensional) dust patterns from (1-dimensional) contours from (2-dimensional) texture flows. For the above examples, the hair region of Paolina's tangent map approaches 2-D, while along the shoulder the map is 1-D. For the Kanizsa example, the occluding surface appears to be 2-D, while the protruding top and bottom portions of the rectangle are 1-D. Thus the complexity map will agree with our intuition and with the requirements for at least these two applications.

2. Continuous rectifiable curves and curve-like sets

This section is an attempt to answer the question "what is a curve?" in the context of curve detection. The most common definition of a curve is the one of Jordan, namely that a curve $\Gamma$ is the range of a continuous map $\gamma$ from an interval $I$ to Euclidean space (typically $\mathbb{R}^2$ or $\mathbb{R}^3$). Two other basic notions usually follow in any study of elementary differential geometry: that of length, denoted $L(\Gamma)$, and that of best local linear approximation, namely the tangent to $\Gamma$ at $x = \gamma(t)$, denoted $T(x)$. Within these are embedded the notions of local representation (the tangent) and global measure (the length), which are tied together through the map $\gamma$. Unfortunately, however, the map $\gamma$ is not given for general vision problems, but must be inferred. More abstract notions of a curve are thus required.
This is why a generalization of the previous ideas through measure theory is necessary. Geometric measure theory provides us with such a generalization where curve-like sets and their associated parameterization-free tangents will constitute a much better basis for our needs.
2.1. Introduction to geometric measure theory. One way to compute the length, area or volume of an object is to use the Hausdorff $s$-dimensional measure $\mathcal{H}^s$, where $s = 1$ in the case of a smooth rectifiable curve and $s = 2$ in the case of a surface. Consider the problem of defining the length of a set $E$ in the plane. Hausdorff's idea was to cover the set with small circles and to take the sum of the diameters (Fig. 3a). If the balls are restricted to be smaller than some given value $\delta > 0$, and if the "most economical" covering is chosen, we get an approximation of the length of the set at resolution $\delta$. Allowing arbitrary covers, instead of covers by balls, gives us an outer measure, and for $\delta > 0$ we write

$$\mathcal{H}^1_\delta(E) = \inf \sum_i |U_i|$$

where $|U|$ is the diameter of $U$ (i.e., $|U| = \sup\{|x - y| : x, y \in U\}$) and $\{U_i\}$ is any sequence of sets of diameter less than $\delta$ covering $E$. The infimum here is taken over all (countable) $\delta$-covers $\{U_i\}$ of $E$, and we can easily show that $\mathcal{H}^1_\delta(E)$ increases as $\delta$ decreases, therefore:

Definition 1 (Hausdorff measure [Falconer 1987]). The one-dimensional Hausdorff measure of $E$ is given by

$$\mathcal{H}^1(E) = \lim_{\delta \to 0} \mathcal{H}^1_\delta(E) = \sup_{\delta > 0} \mathcal{H}^1_\delta(E)$$
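The covering construction behind Definition 1 can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it assumes the set is given as an ordered sample of points along a curve (with sample spacing well below $\delta$), and it uses one particular greedy cover rather than the infimum over all covers, so the sum it returns is only an upper bound on the $\delta$-resolution length of the sampled polygon. The function name `cover_length` and the circle test data are hypothetical.

```python
import math

def cover_length(points, delta):
    """Sum of diameters of a greedy cover of an ordered point sample.

    Consecutive samples are grouped while the group diameter stays <= delta.
    Neighboring groups share an endpoint so that the pieces of curve between
    samples are covered as well (assumes sample spacing << delta).
    """
    total = 0.0
    i, n = 0, len(points)
    while i < n:
        group = [points[i]]
        diam = 0.0
        j = i + 1
        while j < n:
            # Diameter the group would have if points[j] were added.
            d = max(math.dist(points[j], p) for p in group)
            if max(diam, d) > delta:
                break
            diam = max(diam, d)
            group.append(points[j])
            j += 1
        total += diam
        # Restart from the last point of this group (or move on if it was a singleton).
        i = j - 1 if j - 1 > i else j
    return total

# Unit circle sampled at 2001 points (first and last coincide); true length is 2*pi.
circle = [(math.cos(2 * math.pi * k / 2000), math.sin(2 * math.pi * k / 2000))
          for k in range(2001)]
coarse = cover_length(circle, 0.5)
fine = cover_length(circle, 0.05)
```

On this example the estimate grows toward $2\pi$ as $\delta$ shrinks, mirroring the monotonicity of $\mathcal{H}^1_\delta(E)$ in $\delta$ noted above.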
One important result is that if $\Gamma$ is a curve, then the Hausdorff measure $\mathcal{H}^1(\Gamma)$ and the Jordan length $L(\Gamma)$ coincide [Falconer 1987]. But most importantly, we have

Definition 2 (Curve-like set [Falconer 1987]). A set $E$ with $0 < \mathcal{H}^1(E) < \infty$ will be called a curve-like set.

The goal of this paper will be to define discrete curve-like sets and to provide a classification scheme that would make the integration process of local information, such as the one found in edge detection, reliable and meaningful.

2.1.1. Local structure of curve-like sets. Before discussing the existence of tangents for curve-like sets, we will present an alternate definition of a tangent that doesn't rely on a parameterization of the set. This definition is due to Besicovitch:

Definition 3 (Tangent: Besicovitch [Besicovitch 1928]). A curve-like set $E$ has a tangent $T_B(x)$ at $x$ in the direction $\theta$ if (1) $x$ "is on" the set $E$, and (2) for every angle $\varphi > 0$,

$$\lim_{r \to 0} \frac{\mathcal{H}^1\big(E \cap (B_r(x) \setminus S_r(x, \theta, \varphi) \setminus S_r(x, -\theta, \varphi))\big)}{r} = 0$$

where $B_r(x)$ is the ball of radius $r$ centered at $x$ and $S_r(x, \theta, \varphi)$ is the sector of radius $r$ at angle $\theta$ with opening $\varphi$.

Suppose $x \in E$; then this definition means that at $x$ the set $E$ is locally concentrated around the line $T_B(x)$ with orientation $\theta$ passing through $x$. The first condition in this definition can be formally expressed in terms of the density of the set $E$ at $x$ [Falconer 1987]. The second condition ensures that the concentration is around the tangent line only. Fig. 3b illustrates the definition, namely that the second condition consists in looking at the rate at which the measure of what is found outside the local angular sector centered at $x$ changes with $r$. If this measure vanishes faster than $r$ as $r \to 0$, the curve is ensured (given the first condition) to be concentrated around the line with orientation $\theta$ at $x$. How does this definition of tangent relate to the usual definition of a tangent to a parameterized curve?
In his seminal work, Besicovitch [Besicovitch 1928] showed that his new parameterization-free definition was indeed equivalent to the usual definition. From now on we will therefore write $T(x)$ for the Besicovitch tangent.
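To make condition (2) of the Besicovitch definition concrete, the ratio inside the limit can be approximated on a sampled set. This is only a sketch under stated assumptions: the set is sampled uniformly by arc length (so that a point count times the sample spacing stands in for $\mathcal{H}^1$), the ratio is evaluated at a single finite radius rather than in the limit $r \to 0$, and the function name `outside_sector_ratio` is hypothetical.

```python
import math

def outside_sector_ratio(points, x, theta, phi, r, spacing):
    """Approximate H^1(E in B_r(x) outside the double sector) / r.

    points  : samples of E, assumed uniformly spaced by arc length
    theta   : candidate tangent orientation at x (radians)
    phi     : half-opening of each sector (radians)
    spacing : arc length represented by one sample
    """
    count = 0
    for p in points:
        dx, dy = p[0] - x[0], p[1] - x[1]
        d = math.hypot(dx, dy)
        if d == 0.0 or d > r:
            continue
        ang = math.atan2(dy, dx)
        # Angular distance from the undirected line of orientation theta,
        # which covers both S_r(x, theta, phi) and S_r(x, -theta, phi).
        diff = abs((ang - theta + math.pi / 2) % math.pi - math.pi / 2)
        if diff > phi:
            count += 1
    return count * spacing / r

# Horizontal segment sampled every 0.001 units of arc length (assumed test data).
spacing = 0.001
segment = [(i / 1000, 0.0) for i in range(-1000, 1001)]
origin = (0.0, 0.0)
along = outside_sector_ratio(segment, origin, 0.0, 0.1, 0.5, spacing)
across = outside_sector_ratio(segment, origin, math.pi / 2, 0.1, 0.5, spacing)
```

For the true orientation (`along`) nothing falls outside the double sector, while for the orthogonal candidate (`across`) the whole segment inside the ball does, so the ratio stays near 2 instead of going to zero.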
Figure 3. Illustrating the parameterization-free definitions of length and tangents. In (a) we have the approximation of the length of a set by a covering with $\delta$-balls, in (b) an illustration of the parameterization-free Besicovitch tangent, and in (c) the multiple tangents set. Definitions like those underlying (b) and (c) require the density of points in the angular wedge(s) to be high, and to remain high as the wedge(s) is/are made smaller.

The standard Besicovitch definition does not, however, allow for multiple tangents at a point, a key requirement in the type of sets we are studying when doing curve detection [Zucker et al. 1989]. In real world situations objects occlude each other, leading to discontinuities in bounding contours and to T-junctions. At these points "multiple tangents" must be represented. The following is an extension of the Besicovitch tangent to allow the representation of multiple tangents at a point:

Definition 4 (Multiple tangents: the tangent set). A curve-like set $E$ has a tangent set $\Theta(x)$ at $x$ if $x$ "is on" the set $E$ and, for every angle $\varphi > 0$,

$$\lim_{r \to 0} \frac{\mathcal{H}^1\Big(E \cap \big(B_r(x) \setminus \bigcup_{\theta \in \Theta(x)} [S_r(x, \theta, \varphi) \cup S_r(x, -\theta, \varphi)]\big)\Big)}{r} = 0 \qquad (1)$$

but also, for each $\theta \in \Theta(x)$, $\exists\, r_0$ and $\varphi_0$ such that $\forall\, 0 < \varphi < \varphi_0$ and $0 < r < r_0$,

$$\mathcal{H}^1\big(E \cap B_r(x) \cap [S_r(x, \theta, \varphi) \cup S_r(x, -\theta, \varphi)]\big) > 0 \qquad (2)$$

As in the definition of the Besicovitch tangent, the "on the set" condition can be formally expressed in terms of densities [Dubuc & Zucker 1994]. The condition given by Eq. 1 prevents things from being too crumpled around the point, while the third condition (Eq. 2) ensures that there is indeed something going on in each of the directions contained in the tangent set.
Existence of multiple tangents. It is easy to build a set with multiple tangents. The graph of the absolute value function, $G_f = \{(y, f(y)) : f(y) = |y|\}$, for instance has a multiple tangent at $x = (0, 0)$. In this case $\Theta(x) = \{\pi/4, 3\pi/4\}$. For both directions $\theta \in \Theta(x)$ we have, for all $r > 0$ and $0 < \varphi < \pi/8$,

$$\mathcal{H}^1\big(G_f \cap B_r(x) \cap [S_r(x, \theta, \varphi) \cup S_r(x, -\theta, \varphi)]\big) = r > 0.$$

Later on in this paper, we will discuss the cardinality of $\Theta(x)$. Because the set $E$ is rectifiable, we will see in the next section that multiple tangents cannot occur "too often". Meanwhile, by bundling the non-empty tangent sets, we obtain our key intermediate representation, the tangent map:

Definition 5 (Tangent map). Given a curve-like set $E$, we define the tangent map $\Theta$ to be

$$\Theta = \bigcup_{x \in E} (x, \Theta(x)),$$

where $\Theta(x)$ is the set of tangents (in the Besicovitch sense) at $x$.

2.1.2. Existence and separation theorems. Given a curve-like set $E$, what does the tangent map look like? A fundamental result about the existence of tangents gives us the following:

Theorem 1 (Structure theorem [Morgan 1987, Falconer 1987]). A curve-like set $E$ has a unique tangent $\mathcal{H}^1$-almost everywhere.

The next section presents original results. Specifically, it constrains the set of possible tangent maps using the assumption of rectifiability of the underlying curve-like sets.
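As an illustration of Definition 5 and of the absolute value example, a tangent map can be held as a plain mapping from points to sets of orientations. The sketch below is an assumption-laden toy, not part of the original method: the graph is sampled at finitely many points, orientations are taken modulo $\pi$, and the function name `tangent_map_abs` is hypothetical.

```python
import math

def tangent_map_abs(n=200):
    """Tangent map of the graph of f(y) = |y| sampled at 2n+1 points on [-1, 1].

    Orientations are undirected (modulo pi): the left arm has orientation
    3*pi/4, the right arm pi/4, and the corner point carries both.
    """
    tangent_map = {}
    for i in range(-n, n + 1):
        y = i / n
        point = (y, abs(y))
        if i < 0:
            tangent_map[point] = {3 * math.pi / 4}
        elif i > 0:
            tangent_map[point] = {math.pi / 4}
        else:
            tangent_map[point] = {math.pi / 4, 3 * math.pi / 4}
    return tangent_map

tmap = tangent_map_abs()
# Points whose tangent set holds more than one orientation.
multiple = [x for x, thetas in tmap.items() if len(thetas) > 1]
```

Only the corner at the origin carries more than one tangent, in line with Theorem 1: multiple tangents are confined to a set of measure zero.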
2.2. Tangent separation theorems. In the classical sense, rectifiability implies finite length, so rectifiable curves are the class relevant to vision. Geometrically, rectifiability constrains the global and local distribution of tangents. Basically, for a curve-like set to be rectifiable it cannot be too crumpled and cannot cut itself too often. The following theorems capture these statements and provide constraints on the underlying local representation of a curve-like set. We first start by looking in the neighborhood of a tangent, in the normal direction for instance, and get the following:

Theorem 2 (Parallel tangents separation). Let $E$ be a curve-like set. Consider a point $w$ on $E$ with tangent $T(w)$. If $N(w)$ is a line passing through $w$ with orientation different from $T(w)$, then

$$D(w, E) = \{y \in E \cap N(w) : T(w) = T(y)\}$$

is a totally disconnected set.

Sketch of proof. We will proceed by contradiction. Suppose there exists a connected component $C \subset D(w, E)$. Since $C \subset N(w)$ we have

$$L(C) = \mathcal{H}^1(C) = \mathrm{diam}(C) > 0$$
Figure 4. Illustration of the parallel tangents separation theorem (the figure shows the set $E$, the line $N(w)$ and the tangent $T(w)$). The key idea is that for a curve to be rectifiable, it cannot be squeezed too much or, equivalently, the parallel tangents cannot form a dense set.
Take now any point $z$ inside $C$. For small enough $\epsilon$ we get

$$\mathcal{H}^1(C \cap B_\epsilon(z)) = 2\epsilon.$$

Therefore, if $\theta$ is the angle for $T(w)$, then we have for all sufficiently small $\varphi$

$$\lim_{r \to 0} \frac{\mathcal{H}^1\big(C \cap (B_r(z) \setminus S_r(z, \theta, \varphi) \setminus S_r(z, -\theta, \varphi))\big)}{r} \geq 1,$$

which shows that there is no tangent at that point. This being true for all interior points of $C$, and since $\mathrm{diam}(C) > 0$, we have found a set of $\mathcal{H}^1$-measure greater than zero for which there does not exist a tangent. However, $E$ being a rectifiable set, this contradicts the fact that it should have a tangent $\mathcal{H}^1$-almost everywhere.

Up to now we did not put any constraint on the cardinality of $\Theta(x)$. The next two theorems address this issue. More importantly, however, Theorem 4 is the equivalent of Theorem 2 in orientation space rather than in the spatial domain.

Theorem 3. If $E$ is a curve-like set, then the set of tangents $\Theta(x)$ at $x$ is composed of a unique tangent for $\mathcal{H}^1$-almost all $x$ in $E$.

Proof. The proof is a simple corollary of Theorem 1.

Theorem 4 (Multiple tangents separation). If $w$ is a point on a curve-like set $E$, then $\Theta(w)$, the set of multiple tangents at $w$, is a totally disconnected set.
Sketch of proof. Suppose there exists a connected component $C \subset \Theta(w)$. We can then find a circular arc in the neighborhood of $w$ for which each point has a multiple tangent. That means we would be able to build a set of measure greater than zero with more than one tangent everywhere, which contradicts Theorem 3 since $E$ is rectifiable.

Remark 1. The proofs of Theorems 2 and 4 really show the spirit of Poincaré's cut dimension idea [Poincaré 1926]. In both proofs (by contradiction) we showed that in order to disconnect the set, we needed a continuum: a straight line in the case of Theorem 2, an arc of circle in Theorem 4. This led to a contradiction because then the object couldn't be a rectifiable curve.

Remark 2. The use of Hausdorff measure allowed us to study the local structure of sets and to provide constraints on the distribution of tangents. This will be used in the subsequent sections, since the tangent map will be exactly the first map to be inferred in the detection process. The Hausdorff measure is much better known these days for its use in the study of "fractals", i.e. sets with non-integer dimension. Fractal curves are those for which tangents exist almost nowhere. We will however take another slant by studying the sets up to a given scale (band limited) and derive what we call discrete curve-like sets.
3. From edge detection to discrete curve-like sets

One of the most curious features of the theory of curve-like sets was the Besicovitch tangent, formulated in a parameterization-free manner. This is important for computer vision, because it establishes an analogy with other parameterization-free methods for estimating tangents, namely edge detection. In particular, just as Besicovitch sought a dense collection of points within a cone, edge operators seek a dense collection of pixels at a certain contrast [Marr & Hildreth 1980, Canny 1986, Hubel & Wiesel 1962]. There are two important differences, however, and it is these differences that motivate the second half of this paper.

First is the notion of scale. The Besicovitch tangent was defined in the infinitesimal limit, and is related to a classification of curves as being either of finite length (i.e., rectifiable) or of infinite length (non-rectifiable). Those with finite length were called curve-like sets. The second difference is resolution. Orientation for the Besicovitch tangent is a real variable, as is spatial location; for any computation on a computer these will be quantized numbers.

Finite scale and finite resolution, as they arise in edge detection, make the definition of discrete curve-like sets a very delicate one. Since edge operators are all band limited, the inferred objects will all be finite, and the subtlety lies in their organization. Returning to Fig. 1 (middle left), we see that the tradeoff between finite scale and finite resolution results in a distribution of tangents that is more elongated than thick. This certainly captures the spirit of the Besicovitch construction, provided the limiting process is stopped at a finite cone. But inside the hair texture the tangent distribution is more turbulent (Fig. 1, middle right). Even though the length of hair is finite, many orientations are seen in a dense neighborhood.
We thus appeal to the multiple-tangents-separation theorem, and observe that not all tangents can exist at all positions, even for a bowl of spaghetti! Rather, the
structural criterion will be not how well the tangents continue "along" the curve, but how they fall orthogonally to it. As we show shortly, the classification of discrete curve-like sets at finite scale and resolution is much richer for vision than abstract rectifiability.

First, however, we define a discrete tangent map. As non-rectifiable curves are uninteresting for vision, we assume that a (rectifiable) curve-like set projects into a quantized image in such a way that each pixel through which the curve passes is a discrete trace point. The union of these trace points defines the full trace $\mathcal{T}$. Further, suppose a tangent set is given at each of these trace points by a curve-detection scheme. In our examples we used logical/linear operators [Iverson & Zucker in press] followed by relaxation labeling [Zucker et al. 1989], but the abstract requirement is that at least one orientation is assigned to each discrete trace point. We then have

Definition 6 (Discrete tangent map). Let $I$ denote an image, and let the points $(x_i, y_i) \in \mathcal{T}$ form a subset of its domain; then the discrete tangent map $\hat{\Theta}$ will be

$$\hat{\Theta} = \{((x_i, y_i), \hat{\theta}) \mid \hat{\theta} \in \hat{\Theta}(x_i, y_i)\}, \quad i = 1, 2, \ldots,$$

where $\hat{\theta}$ is a quantized orientation at the discrete image coordinate $(x_i, y_i)$.

Key observation 1. Recalling the property of curve-like sets that a tangent exists $\mathcal{H}^1$-almost everywhere, the discrete analogs (discrete curve-like sets) are those sets of discrete points at which a discrete tangent set is defined. From the above definition this is identified as the set of discrete trace points.

In the continuous domain, tangent maps were defined in a way that allowed multiple tangents at the same position, to represent those discontinuities that can arise at, e.g., points where one object occludes another (think of taking the limit into the singularity along each object). A key concern for discrete tangent maps is how to distinguish this situation from the multiple tangents that can arise due to finite scale and quantization. The solution to this issue, as we now show, is also related to the complexity of the discrete curve-like set: if the normal complexity is low, the multiplicity is simply an artifact of the discretization of orientation space; otherwise, the hypothesis of multiple tangents at a point must be retained.

4. Mapping complexity

Section 2 introduced continuous curve-like sets and their local structure. The last section made the link between edge detection and the theory in the continuous domain using the local structure of discrete curve-like sets, the discrete tangent map. This section will refine the classification and provide a means of deciding which representation should be used for the integration process.
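The discrete tangent map of Definition 6, and the quantization artifact discussed at the end of Section 3, can be sketched in a few lines. This is a toy under explicit assumptions: orientations are quantized into `n_bins` equal bins over $[0, \pi)$, pixels are unit squares, and both function names are hypothetical; the paper's own tangent estimates come from logical/linear operators followed by relaxation labeling, which are not reproduced here.

```python
import math

def quantize_orientation(theta, n_bins=8):
    """Index of the nearest orientation bin (orientations are taken modulo pi)."""
    width = math.pi / n_bins
    return int(round((theta % math.pi) / width)) % n_bins

def discrete_tangent_map(samples, pixel=1.0, n_bins=8):
    """Collect, for each pixel hit by a sample, its set of quantized orientations.

    samples : iterable of (x, y, theta) with theta the local tangent orientation
    """
    tmap = {}
    for x, y, theta in samples:
        key = (math.floor(x / pixel), math.floor(y / pixel))
        tmap.setdefault(key, set()).add(quantize_orientation(theta, n_bins))
    return tmap

# Two samples of one smooth curve inside the same pixel; the orientation drifts
# across a bin boundary (pi/16 is about 0.196), so the pixel receives two
# quantized tangents although the curve has a single tangent at every point.
samples = [(0.2, 0.5, 0.19), (0.8, 0.5, 0.21)]
dmap = discrete_tangent_map(samples)
```

Here the pixel `(0, 0)` ends up with two orientation labels purely because of quantization, which is the situation the complexity analysis is meant to tell apart from a genuine multiple tangent.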
4.1. From dilations to complexity. To assess the complexity of the tangent map, a variant of an approach due to Minkowski will be used. Originally it consisted in covering the object with balls of radius $\epsilon$ and computing the measure of the dilated object. The rate of growth of this measure can be linked to the complexity of the set. Our approach differs in the way that the dilation is achieved; in particular, it will not be done isotropically. After reviewing the standard technique, we present our variation.

4.1.1. Isotropic dilations. Minkowski dilations are routinely used in mathematical morphology [Matheron 1975, Serra 1982] and for the estimation of fractal dimension [Falconer 1990, Tricot 1991]. The approach consists in creating a new set which is the Minkowski sum with a dilating (structuring) element. The dilation is done isotropically over the set. More formally, given $X = \mathbb{R}^2$, we have

Definition 7 (Minkowski sum, dilation [Matheron 1975]). The Minkowski sum is defined in the power set of $X$ by putting

$$A \oplus B = \{a + b : a \in A,\ b \in B\}.$$

The translate of $B$ by $z$ being $B_z = B \oplus \{z\}$, the set $\{z : A \cap B_z \neq \emptyset\}$ of the points $z$ such that $A$ hits the translate $B_z$ is called the dilation of $A$ by $B$ and is denoted $A \oplus \check{B}$, where $\check{B} = \{-b : b \in B\}$; for a symmetric structuring element such as a ball this is simply $A \oplus B$. (Matheron [1975] actually calls this "dilatation", but the term "dilation" is more widespread in mathematical morphology [Serra 1982].)

An example of the isotropic dilation of a line segment with a ball is shown in Fig. 5a. In fractal analysis, the case of a dilation with a ball is often called the Minkowski sausage. Dilations with other shapes (squares, segments, etc.) are called generalized Minkowski sausages [Dubuc et al. 1989].

Remark 3. In the case where the dilation is done with a ball $B_\epsilon$ as structuring element (Minkowski sausage), we will write $E(\epsilon)$ for $E \oplus B_\epsilon$. The resulting set can also be thought of as the set of points that are at a distance less than $\epsilon$ from $E$:

$$E(\epsilon) = \{y \in \mathbb{R}^2 : |x - y| < \epsilon,\ x \in E\}.$$

4.1.2. Oriented dilations. One of the key differences between our approach and the standard fractal analysis techniques is the fact that the dilations will not be done isotropically but will adapt to the local structure of the set. We call them oriented dilations and we will show that they are necessary for separating sets of different complexity. Again taking the example of the line segment, Fig. 5b,c illustrates the concepts of both normal and tangential dilations:

Definition 8 (Normal and tangential dilations). Let $E$ be a curve-like set and $\Theta$ its tangent map. The normal dilation $E_N(\epsilon)$ of $E$ at a scale $\epsilon$ is the dilation of the set $E$ with the segment $(-\epsilon, \epsilon)$ in the direction normal to the tangents $\theta \in \Theta(x)$ at $x$ (Fig. 5b). The tangential dilation $E_T(\epsilon)$ is obtained by dilating with the segment $(-\epsilon, \epsilon)$ in the direction of the tangent (Fig. 5c).
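The difference between the isotropic dilation of Definition 7 and the oriented dilations of Definition 8 can be seen on a small lattice example. The sketch below makes several assumptions that are not in the text: sets are finite sets of integer lattice points, the scale is an integer half-length `k` in pixel units, the oriented segment is sampled at unit steps (which can leave gaps for oblique orientations), and the names `minkowski_sum`, `ball` and `oriented_dilation` are hypothetical.

```python
import math

def minkowski_sum(A, B):
    """Minkowski sum of two finite sets of lattice points."""
    return {(a[0] + b[0], a[1] + b[1]) for a in A for b in B}

def ball(k):
    """Lattice points of the closed disk of radius k centered at the origin."""
    return {(i, j) for i in range(-k, k + 1) for j in range(-k, k + 1)
            if i * i + j * j <= k * k}

def oriented_dilation(tangent_map, k, normal):
    """Dilate each point with a lattice segment of half-length k.

    The segment is laid along the normal (normal=True) or along the tangent
    (normal=False) of every orientation stored at the point.
    """
    out = set()
    for (x, y), thetas in tangent_map.items():
        for theta in thetas:
            ang = theta + math.pi / 2 if normal else theta
            ux, uy = math.cos(ang), math.sin(ang)
            for s in range(-k, k + 1):
                out.add((x + round(s * ux), y + round(s * uy)))
    return out

# Horizontal segment of 11 pixels, each with tangent orientation 0.
segment = {(i, 0): {0.0} for i in range(11)}
iso = minkowski_sum(set(segment), ball(3))
normal = oriented_dilation(segment, 3, normal=True)
tangential = oriented_dilation(segment, 3, normal=False)
```

On this segment the normal dilation thickens the line into an 11 by 7 band (77 pixels), the tangential dilation only lengthens it to 17 pixels, and the isotropic sausage (99 pixels) does both at once, which is why it cannot separate the two effects.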
Figure 5. Isotropic and oriented dilations. (a) Isotropic dilation with a ball of radius $\epsilon$. (b) Normal dilation. (c) Tangential dilation.
Key observation 2. The departure from the standard Minkowski dilation approach by using oriented dilations is essential for our analysis, since it separates the classification of curve vs. texture (using normal dilations) from that of dust vs. curve (using tangential dilations).
One way of understanding the implications of oriented dilations is to look at the results when applied to the Paolina tangent map. In this particular example, the dilation is done at two scales $\epsilon_1$ and $\epsilon_2$. The result displays the dilated sets with two different grays: the lighter one for the smaller scale, while the union of the darker and the lighter sets corresponds to the dilation at the larger scale. Moreover, in all cases the projection of the tangent map has been overlaid on the resulting dilated sets. In Fig. 6 we show the result of tangential and normal dilations on the hair and shoulder regions for the tangent map of the Paolina image. The chosen scales were $\epsilon_1 = 0.1$ and $\epsilon_2 = 0.05$ for this example (for all our examples we embedded the images into the unit square). From these images we get a better feel for the idea underlying the complexity. For the shoulder region (Fig. 6a,b), the normally dilated set seems to grow linearly while the tangential dilations don't grow at all after some point. In the hair area (Fig. 6c,d), it is the normally dilated regions that stop growing at an early stage, while it is more difficult to describe at a quick glance what is happening to the rate of growth of the tangentially dilated sets.

Figure 6. Close-ups on the oriented dilated sets for the Paolina image. In (a) and (c), the normal dilation, while in (b) and (d), the tangential dilation.

4.1.3. Normal and tangential complexity. The local information contained in the tangent map can be used to calculate what we will call the normal complexity $C_N(\epsilon)$ and the tangential complexity $C_T(\epsilon)$ of a curve-like set at a given scale $\epsilon$. The main idea is to be able to look at the rate of growth of the measure of the dilated sets. In the case of normal
complexity, it will be the area $|E_N(\epsilon)|_2$ at scale $\epsilon$. If the area of $E_N(\epsilon)$, denoted $|\cdot|_2$, is of order $\epsilon^\alpha$, we say that the normal complexity $C_N(\epsilon)$ for $E$ is $2 - \alpha$.

Definition 9 (Normal complexity). Let $E$ be a curve-like set. If $E_N(\epsilon)$ is the normal dilation of $E$ at scale $\epsilon$, and if $|\cdot|_2$ denotes its area, then the normal complexity log-log plot is defined as follows:

$$\left( \log \frac{1}{\epsilon},\ \log \frac{1}{\epsilon^2} |E_N(\epsilon)|_2 \right) \qquad (3)$$

Moreover, the normal complexity $C_N(\epsilon)$ at scale $\epsilon$ will be the left derivative of the normal complexity log-log plot.

Key observation 3. From the structure of curve-like sets, one can show that the normal complexity is indeed well-defined [Dubuc 1994]. This provides the connection between Minkowski dilations, normal complexity and the Hausdorff measure of the set $E$. Moreover, we should point out that by definition the normal complexity will be a number between 1.0 and 2.0: the denser the set is, the closer to 2.0 the complexity gets.

Much in the same way as we did for the normal complexity, we will build a measure of complexity from tangential dilations and call it tangential complexity. The tangential complexity will be sensitive to ends of lines, corners and points of high curvature, and will allow a segregation between dust and curves and/or a selection of the scale for integration.

Definition 10 (Tangential complexity). Let $E$ be a curve-like set. If $E_T(\epsilon)$ is the tangential dilation of $E$ at scale $\epsilon$, and if $|E_T(\epsilon)|_{(p,1)}$ denotes its perimeter, then the tangential complexity log-log plot is defined as follows:

$$\left( \log \frac{1}{\epsilon},\ \log \frac{1}{\epsilon} |E_T(\epsilon)|_{(p,1)} \right) \qquad (4)$$

Moreover, the tangential complexity $C_T(\epsilon)$ at scale $\epsilon$ will be the left derivative of the tangential complexity log-log plot.

To understand the idea, we can use the example of a line segment of length $l$. When dilated tangentially, it becomes a line of length $l + 2\epsilon$, therefore we have $|E_T(\epsilon)|_{(p,1)} = 2(l + 2\epsilon)$. The "measure" chosen is the perimeter (not the area, as was the case for the normal complexity). Therefore, we expect the rate of growth, $\beta$, to be between 0 and 1. We defined $C_T(\epsilon)$, the tangential complexity, to be $1 - \beta$. In our example, when $l$ is large with respect to $\epsilon$, the rate of growth $\beta$ is small and is "close" to zero, so the complexity reflects a "curve" structure. Otherwise, if $l$ is small with respect to $\epsilon$, then $\beta$ is closer to 1 and the object would be better described as having a "dust-like" structure.
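The line segment example can be checked numerically. The sketch below assumes only the perimeter formula $2(l + 2\epsilon)$ stated in the text; it approximates the left derivative of the log-log plot of Definition 10 by a one-sided finite difference (an implementation choice, with hypothetical names `perimeter` and `tangential_complexity`). Differentiating by hand gives $C_T(\epsilon) = l / (l + 2\epsilon)$ for this example, which the code reproduces.

```python
import math

def perimeter(l, eps):
    """Perimeter measure of a tangentially dilated segment of length l."""
    return 2.0 * (l + 2.0 * eps)

def tangential_complexity(l, eps, h=1e-4):
    """One-sided slope of log((1/eps) * perimeter) against log(1/eps) at eps."""
    def plot_point(e):
        return math.log(1.0 / e), math.log(perimeter(l, e) / e)
    x0, y0 = plot_point(eps)
    # A slightly larger eps lies slightly to the left on the log(1/eps) axis.
    x1, y1 = plot_point(eps * (1.0 + h))
    return (y0 - y1) / (x0 - x1)

curve_like = tangential_complexity(l=1.0, eps=0.01)   # l large relative to eps
dust_like = tangential_complexity(l=0.01, eps=1.0)    # l small relative to eps
```

A long segment seen at a fine scale gives a value close to 1 (curve), while a short segment seen at a coarse scale gives a value close to 0 (dust), matching the discussion above.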
4.2. Complexity indexes and complexity map. The complexity measures presented in the last sections had the characteristic of being local in scale. This was a major departure from the standard "fractal analysis" approach, in which the rate of growth is studied around zero. The other key difference is to make the complexity measure local in space by computing it over a given compact region $\Omega(x)$ centered at $x$. The normal complexity at $x$ can be obtained by first restricting the dilation to the region $\Omega(x)$:

$$E_N(x, \epsilon) = E_N(\epsilon) \cap \Omega(x). \qquad (5)$$
The case of the tangential dilation is a little more complex, but basically consists in not dilating any further than the region $\Omega(x)$, therefore leading to $E_T(x, \epsilon)$.

Definition 11 (Normal and tangential complexity indexes). The normal and tangential complexity indexes at $x$ over a region $\Omega(x)$ are denoted $C_N(x, \epsilon)$ and $C_T(x, \epsilon)$ and are obtained by looking at the rate of growth of their local oriented dilations. More formally, they are obtained as the left derivatives of their corresponding log-log plots

$$\left( \log \frac{1}{\epsilon},\ \log \frac{1}{\epsilon^{2}} |E_N(x, \epsilon)|_2 \right) \tag{6}$$

for $C_N(x, \epsilon)$, and

$$\left( \log \frac{1}{\epsilon},\ \log \frac{1}{\epsilon} |E_T(x, \epsilon)|_{(p,1)} \right) \tag{7}$$

for $C_T(x, \epsilon)$.
As we did in the case of the tangent map, we can bundle all the complexity indexes and build our main tool, the complexity map:

Definition 12 (Complexity map). Given a curve-like set $E$, we define the complexity map $C$ to be

$$C = \bigcup_{x \in E} \left( x,\ C_N(x, \epsilon),\ C_T(x, \epsilon) \right).$$

4.2.1. Discrete complexity map. The discrete complexity map is obtained by computing the local normal and tangential complexity at each image position subtending a non-empty tangent. The algorithm we used consisted of projecting the unit tangents from the discrete tangent map and then dilating the resulting intermediate image (see footnote 4). Computing the area of the dilated set and estimating the rate of growth from log-log data gives us two numbers, the normal and tangential complexity indexes, which are bundled into the complexity map.
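The estimation step described above can be sketched as follows. This is a minimal illustration and not the authors' PostScript implementation: the function name `complexity_index`, the two-scale backward difference used to approximate the left derivative, and the synthetic measurements are all assumptions made here.

```python
import math


def complexity_index(eps, measure, eps_coarse, measure_coarse, dim):
    """Backward-difference estimate of the left derivative of a log-log plot.

    The plot is (log 1/eps, log(measure / eps**dim)), with dim = 2 for the
    area of a normal dilation and dim = 1 for the perimeter of a tangential
    dilation.  eps_coarse must be larger than eps, so that its abscissa
    log(1/eps_coarse) lies to the left of log(1/eps).
    """
    if not 0 < eps < eps_coarse:
        raise ValueError("need 0 < eps < eps_coarse")
    x, x_left = math.log(1 / eps), math.log(1 / eps_coarse)
    y = math.log(measure / eps ** dim)
    y_left = math.log(measure_coarse / eps_coarse ** dim)
    return (y - y_left) / (x - x_left)


# Synthetic checks (hypothetical measurements, not data from the paper).
length = 100.0
# Isolated straight stroke: the normal dilation is assumed to be a strip
# of area 2 * eps * length.
c_n_line = complexity_index(4.0, 2 * 4.0 * length, 5.0, 2 * 5.0 * length, dim=2)
# Dense patch: the dilation already fills the window, so the area is constant.
c_n_dense = complexity_index(4.0, 225.0, 5.0, 225.0, dim=2)
# Segment of the given length: perimeter 2 * (length + 2 * eps), as in the text.
c_t_segment = complexity_index(1.0, 2 * (length + 2.0), 1.05, 2 * (length + 2.1), dim=1)
```

With these inputs the isolated stroke sits at the low end of the normal range, the dense patch at the high end, and the long segment close to the "curve" end of the tangential range. In practice one might prefer a least-squares fit over several scales, which is less sensitive to discretization noise than a two-point difference.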
4.3. Choosing the appropriate representation. We now return to questions of computer vision, and ask which types of geometric objects are natural for early representations. Curves are clearly among them, as are textures. But is texture a distinct class? Is there a difference between texture flows, as they arise from hair patterns that are well combed, and turbulent patterns such as wind-blown hair? What sort of object is the edge of the Kanizsa pin-stripes? The complexity map provides an answer, by classifying discrete curve-like sets into:
dust-like: sets in which the tangent map is sparse, the object almost nowhere extends along its length locally, and might be thought of as being curve-free;
Footnote 4. The oriented dilations were done using features of PostScript. The code for this will be made available soon on anonymous ftp.
Figure 7. Visualizing the complexity space and its use to segment an image. [Diagram: a square with axes $C_N$ (1.0 to 2.0) and $C_T$ (0 to 1.0) whose corners are labelled dust, curves, turbulence and flow.] The square represents the space of complexity index pairs that can be encountered. The normal complexity varies between 1.0 and 2.0, while the tangential complexity is between 0.0 and 1.0. Each of the corners of this "complexity space" corresponds to a different type of curve-like structure. Partitioning the complexity space results in an image segmentation scheme bound to the structure of the objects in the visual scene.
curve-like: discrete tangent maps for which a curve representation is completely adequate. Objects extend along their length, like Paolina's shoulder, and the density of other tangents is low almost everywhere along it in a local neighborhood;
turbulence-like: tangent maps that are characterized by objects that do not extend along their length but are dense in the normal direction; e.g., Paolina's uncombed hair;
texture-flow-like: tangent maps for which the objects extend along their length and are also dense in the normal direction.

The space of valid tangential/normal complexity pairs is a subset of $\mathbb{R}^2$, namely $[0.0, 1.0] \times [1.0, 2.0]$. This is because the tangential complexity is a number between 0.0 and 1.0 and the normal complexity a number between 1.0 and 2.0. Segmenting this space allows us to partition the space of possible patterns into equivalence classes (see Fig. 7). The following segmentation gives us the nomenclature:
dust: low normal complexity - low tangential complexity
curve: low normal complexity - high tangential complexity
turbulence: high normal complexity - low tangential complexity
flow: high normal complexity - high tangential complexity

In the next section we will show that this partitioning scheme can successfully segment an image in terms of these various kinds of curve-like substructures.

5. Experiments and results
5.1. Mapping complexity and choosing representation. We have estimated the complexity map of the Kanizsa pattern (Fig. 2a). The results are shown in Fig. 8a,b. In (a) we display the normal complexity mapped as black and white, while in (b) we map the tangential complexity. For the normal complexity, black stands for low normal complexity while white refers to tangents embedded in a complex region. From this we see that at the studied scale the patch stands out as requiring a different representation from the "curve" part (the parts of the hollow rectangle at the top and at the bottom). But our classification can be finer if we add the tangential complexity component (Fig. 8b). Now, black stands for low tangential complexity, and white for high tangential complexity (places where the objects extend along their length). The only black spots here are the four corners and the tips of the grating pattern. The tangential complexity is therefore a good means of detecting corners, ends of lines, and points of high curvature, while the normal complexity is ideal for segregating textures from curves.

The normal and tangential indexes can be collated and bundled to form the complexity map. Applying the classification scheme presented in the last section, it is possible to extract the various components: dust, curve, turbulence, and flow. This is shown in Fig. 9. The segmentation successfully extracts the corners, the ends of lines in the grating, the "curve" part and the grating itself. The resolution was 100x100. For the normal complexity, the local extent was 15 pixels and the scale at which the complexity was computed was around 5 pixels. For the tangential complexity, the local extent was 5 pixels and the scale was in the neighborhood of 1 pixel. The threshold values chosen for the segmentation were 0.90 for the tangential complexity and 1.4 for the normal complexity. Fascinatingly, the boundary of the occluding rectangle appears as one-dimensional turbulence.

The same analysis can be done on the Paolina image. The original image had a resolution of 512x480.
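The classification rule of Section 4.3 reduces to two threshold tests on a pair of complexity indexes. The sketch below is a hedged illustration: the function name `classify` is hypothetical, the defaults are the values quoted for the Kanizsa pattern (1.4 and 0.90), and treating a value equal to a threshold as "high" is an assumption, since the text does not say how ties are handled.

```python
def classify(c_n, c_t, n_thresh=1.4, t_thresh=0.90):
    """Map a (normal, tangential) complexity pair to one of the four classes.

    c_n is expected in [1.0, 2.0] and c_t in [0.0, 1.0].  A value equal to
    its threshold is counted as "high" (an assumption made in this sketch).
    """
    high_normal = c_n >= n_thresh
    high_tangential = c_t >= t_thresh
    if high_normal:
        return "flow" if high_tangential else "turbulence"
    return "curve" if high_tangential else "dust"


# One hypothetical pair per corner of the complexity space.
labels = [
    classify(1.1, 0.20),   # low normal, low tangential
    classify(1.1, 0.95),   # low normal, high tangential
    classify(1.8, 0.20),   # high normal, low tangential
    classify(1.8, 0.95),   # high normal, high tangential
]
```

Applying such a function to every entry of a complexity map yields four binary images, one per class. Other thresholds can be passed explicitly, for example `classify(c_n, c_t, n_thresh=1.4, t_thresh=0.75)`.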
Logical/linear operators were applied, followed by three iterations of relaxation labelling (for a complete description of how the tangent map was generated for this image,
Figure 8. Computing (a) the normal and (b) the tangential complexity for the Kanizsa pattern. In both cases the complexity has been remapped as black or white. For the normal complexity map, the white pixels refer to tangents embedded in a complex region while the black ones represent low complexity. The "curve" regions clearly stand out in black here. In the case of tangential complexity, the white pixels represent regions where the set extends along its length and the black pixels represent regions where the tangential complexity is low, i.e. ends of lines, corners and points of high curvature.
see [Iverson 1993]). The resulting tangent map is shown in Fig. 1 (bottom right), as was mentioned in the introduction. Normal dilations were done with a local extent of 11 pixels, and the chosen scale was in the neighborhood of 4 pixels. The tangential dilations were done over a smaller local extent (5 pixels) and the scale at which the complexity index was estimated was around 2 pixels. Once the complexity was calculated, we applied segmentation following the scheme presented in section 4.3. The chosen values were 1.4 for the normal and 0.75 for the tangential complexity. Results of the segmentation using the data from the complexity maps are shown in Fig. 10. The four types of structures stand out very clearly, namely dust, curves, turbulence and flow. Numerical methods for obtaining these segmentation values can be found in Dubuc (1994).

6. Conclusion

The transition from local representations to global ones is a key problem in computer vision. In this paper, we have presented an intermediate representation scheme for the local structure of a curve-like set, the tangent map, and addressed the problems that are
Figure 9. Segmented image of the Kanizsa pattern using the complexity map and our classification rule. (a) low normal, low tangential: dust; (b) low normal, high tangential: curve; (c) high normal, low tangential: turbulence; (d) high normal, high tangential: flow.
encountered when attempting to blindly integrate local information. We have demonstrated that the complexity of the tangent map, obtained through normal and tangential dilations, is key to determining the representation underlying the integration process. Stated differently, we showed how the idea of image segmentation could be lifted onto the tangent map and effected through notions of complexity.

Figure 10. Segmented image of the Paolina image using the complexity map and our classification rule. (a) low normal, low tangential: dust; (b) low normal, high tangential: curve; (c) high normal, low tangential: turbulence; (d) high normal, high tangential: flow.
The complexity maps we built constitute the first building blocks of an emerging representational complexity theory for curve-like sets. In the vein of Turing computable numbers [Vitanyi & Li 1993], it should now be possible to define the notion of representable curve. Representational complexity theory will generalize the notions presented here and will dictate the action that could be taken before integrating information. It will predict what should be the salient features in the scene and justify unambiguously the choices of possible intermediate representations.

References

Besicovitch, A. S. (1928), 'On the fundamental geometrical properties of linearly measurable plane sets of points', Mathematische Annalen.
Canny, J. F. (1986), 'A computational approach to edge detection', IEEE Trans. PAMI 8, 679–698.
Dubuc, B. (1994), On the complexity of curves and the representation of visual information, PhD thesis, McGill University, Dept of Electrical Engineering, Montreal.
Dubuc, B., Quiniou, J.-F., Roques-Carmes, C., Tricot, C. & Zucker, S. W. (1989), 'Evaluating the fractal dimension of profiles', Phys. Rev. A 39(3), 1500–1512.
Dubuc, B. & Zucker, S. W. (1994), Curve-like sets, complexity maps and representation, AI-TR 94-02, McGill University, Centre for Intelligent Machines, Montreal.
Falconer, K. J. (1987), The Geometry of Fractal Sets, Cambridge University Press.
Falconer, K. J. (1990), Fractal geometry: mathematical foundations and applications, John Wiley & Sons.
Hubel, D. H. & Wiesel, T. N. (1962), 'Receptive fields and functional architecture of monkey striate cortex', J. Physiology (London) 195, 215–243.
Iverson, L. A. (1993), Toward discrete geometric models for early vision, PhD thesis, McGill University, Dept of Electrical Engineering, Montreal.
Iverson, L. A. & Zucker, S. W. (in press), 'Logical/linear operators for image curves', IEEE Trans. PAMI.
Kanizsa, G. (1979), Organisation in vision, Praeger, New York.
Marr, D. & Hildreth, E. (1980), 'Theory of edge detection', Proc. Royal Society 207, 187–217.
Matheron, G. (1975), Random Sets and Integral Geometry, John Wiley & Sons, New York, N.Y.
Morgan, F. (1987), Geometric Measure Theory, Academic Press, Inc.
Poincaré, H. (1926), Dernières Pensées, Flammarion, chapter Pourquoi l'espace a trois dimensions.
Serra, J. (1982), Image analysis and mathematical morphology, Academic Press.
Tricot, C. (1991), Rectifiable and fractal sets, in S. Dubuc & J. Bélair, eds, 'Fractal Geometry and Analysis', Kluwer Academic Publishers, pp. 367–403.
Ullman, S. (1990), Three-dimensional object recognition, in 'Cold Spring Harbor Symposia on Quantitative Biology', Vol. LV, Cold Spring Harbor Laboratory Press, pp. 889–898.
Vitanyi, P. & Li, M. (1993), An introduction to Kolmogorov complexity and its applications, Springer-Verlag.
Zucker, S. W., Dobbins, A. & Iverson, L. A. (1989), 'Two stages of curve detection suggest two styles of visual computation', Neural Computation 1.
Acknowledgements. Research supported by the AFOSR and NSERC. S.W.Z. is a Fellow of the Canadian Institute for Advanced Research.

Centre for Intelligent Machines, McGill University, Montreal, Quebec, Canada H3A 2A7
E-mail address: [email protected], [email protected]