Pattern Recognition 35 (2002) 1869–1881

www.elsevier.com/locate/patcog

Junction detection and grouping with probabilistic edge models and Bayesian A∗

M. Cazorla∗, F. Escolano, D. Gallardo, R. Rizo

Departamento de Ciencia de la Computación e Inteligencia Artificial, Universidad de Alicante, E-03080, Alicante, Spain

Received 18 January 2001; accepted 27 July 2001

Abstract

In this paper, we propose and integrate two Bayesian methods, one for junction detection and the other for junction grouping. Our junction detection method relies on a probabilistic edge model and a log-likelihood test. Our junction grouping method relies on finding connecting paths between pairs of junctions. Path searching is performed by applying a Bayesian A∗ algorithm. This algorithm uses both an intensity model and a geometric model to define the rewards of a partial path, and prunes those paths with low rewards. We have extended this pruning with an additional rule which favors the stability of longer paths over shorter ones. We have tested experimentally the efficiency and robustness of the methods on an indoor image sequence. © 2002 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Junction detection; Grouping; Image segmentation; Bayesian inference

1. Introduction

∗ Corresponding author. Fax: +34-965-903902. E-mail address: [email protected] (M. Cazorla).

Many visual tasks, such as depth estimation, matching, segmentation, and motion tracking, may rely on junction extraction, because these features provide useful local information about geometric properties and occlusions. Consequently, methods for extracting these low-level features from real-world images must be efficient and reliable. Furthermore, the relation between junctions and specific tasks must be investigated. In this context, mid-level representations, which encode spatial relations between junctions, may be useful to reduce the complexity of these tasks. In this paper we propose two Bayesian methods to detect junctions and to group them along connecting edges. Some previous methods for junction extraction have focused on grouping edges in the neighborhood of

a candidate junction [1–3]. Alternatively, other methods rely on analyzing such a neighborhood to discover a characteristic geometry [4–6]. Recently, a method whose implementation is called Kona retains features of both approaches [7]. In Kona, junctions are modeled as piecewise constant regions (wedges) emanating from a central point. Junction detection is performed in two steps: center extraction, based on a local operator, and wedge detection, based on a template deformation framework that uses the minimum description length (MDL) principle [8]. However, the proposed strategy for finding wedges (dynamic programming) may be too slow for real-time purposes, and the robustness of the method may also need to be improved. This fact motivated the search for alternative methods which improve the efficiency of Kona while ensuring, at least, similar reliability in junction detection. In Ref. [9] we proposed two Bayesian methods which evolve from Kona, and we analyzed their reliability experimentally. That analysis resulted in the edge-based method included in the first part of this paper.

0031-3203/02/$22.00 © 2002 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. PII: S0031-3203(01)00150-9


M. Cazorla et al. / Pattern Recognition 35 (2002) 1869–1881

Fig. 1. (Top-left) Junction parametric model; (top-right) discrete accumulation of intensity along a direction; (bottom-left) example of an ideal junction; (bottom-right) intensity profile of the junction, where each peak represents the location of a limit.

On the other hand, the use of junctions in segmentation, matching, and recognition is the subject of several recent works. In Ref. [10] junctions are used as breaking points to locate and classify edges as straight or curved. Junctions are used as stereo cues in Ref. [11]. In Ref. [12] junctions are used as fixation cues. In that work, fixation is driven by a grouping strategy which forms groups of connected junctions separated from the background at depth discontinuities. The role of corners in recognition appears in Ref. [13], where a mixed bottom-up/top-down strategy is used to combine information derived from corners with the results of contour segmentation. Finally, junctions are used in Ref. [14] to constrain the grey level of image regions in segmentation. Consequently, we propose a way of building a junction map, that is, an undirected graph where nodes are assigned to junctions and edges are assigned to connecting paths between them, provided that these paths exist. In the second part of this paper we propose a method to connect junctions along edges. This method is based on recent results on edge tracking using non-linear filters under a statistical framework [15–19] and emanates from Ref. [20], where we presented our initial experiments. The rest of the paper is organized as follows: in Section 2, we present our junction detector and some experimental results with a sequence of indoor images. The analysis of these results motivates our junction-connecting approach, presented in Section 3. Grouping results are presented at the end of that section. Finally, we present our conclusions and future work.

2. Junction detection

2.1. Junction parametric model

The relation between the real configurations of junctions and their appearance is well documented in the literature [21,22]. A generic junction model can be encoded by a parametric template Θ = (x_c, y_c, r, M, {θ_i}, {T_i}), where (x_c, y_c) is the center, r is the radius, M is the number of wedges, {θ_i}, with i = 1, 2, ..., M, are the wedge limits, and {T_i} are the intensity distributions associated with these wedges (see Fig. 1). We assume that potential junction centers (x_c, y_c) can be localized by a local filter. Two well-known examples of this operator are the Plessey detector [23] and the Kitchen and Rosenfeld detector [24]. Here, we use SUSAN, a robust and fast non-linear filter that has been proposed recently [25]. SUSAN relies on a homogeneity principle. Given a candidate point, SUSAN estimates the proportion of pixels inside a local area which have intensity similar to the candidate. If this proportion is about half of its maximum value, the candidate point is labeled as belonging to a straight edge, and if the proportion is even lower, the point can be labeled as a corner, provided that a good threshold is used. In order to avoid distortions near the junction center, we discard a small circular domain centered at (x_c, y_c) with radius R_min, as suggested by Parida et al. [7]. Then r = R_max − R_min, where R_max is the scope of the junction.
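The parametric template Θ above maps naturally onto a small data structure. The sketch below is purely illustrative; the field names are our own, not the paper's:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class JunctionTemplate:
    """Parametric junction template Theta = (xc, yc, r, M, {theta_i}, {T_i})."""
    xc: float                 # center x-coordinate
    yc: float                 # center y-coordinate
    r: float                  # annulus width, r = Rmax - Rmin
    wedge_limits: List[float] = field(default_factory=list)       # {theta_i}, radians
    wedge_intensities: List[float] = field(default_factory=list)  # {T_i}, one per wedge

    @property
    def num_wedges(self) -> int:
        """M: on a circle, M wedge limits bound M wedges."""
        return len(self.wedge_limits)
```

With this representation, the detector's job is to fill in `wedge_limits` (and then `wedge_intensities`) given a center and a radius supplied by the local filter.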


Fig. 2. Edge statistics: P_on (top-left), P_off (top-right), log(P_on/P_off) (bottom-left), and the function used for P_ang (bottom-right).

Moreover, although Kona provides a method for estimating the optimal value of r around a given center, its cost is prohibitive for real-time purposes. We therefore assume that r can be estimated by the user.

2.2. The contrast profile and edge modeling

Once the center (x_c, y_c) and the radius r are known, the problem of finding M, {θ_i} and {T_i} can be solved by analyzing a one-dimensional contrast profile associated with the junction. Such a profile is computed by estimating, for each angle φ ∈ [0, 2π], the averaged accumulated contrast Ĩ_φ along the radius in that direction:

\tilde{I}_\phi = \frac{1}{R_{\max} - R_{\min}} \sum_{i=1}^{N} l_i E_{pixel_i},    (1)

where E_{pixel_i} is the intensity contrast of the pixel associated with the segment l_i, as shown in Fig. 1, and N is the number of segments needed to discretize the radius adequately along the corresponding direction (not necessarily the same for all directions). In the resulting profile, wedge limits, which in fact can be seen as edges emanating from the junction center, are associated with significant contrast peaks, and these peaks can be found by selecting an adequate threshold. However, the reliability of the method depends on how edges are modeled. For instance, in the situation shown in Fig. 1, we have considered the magnitude E_{pixel} = |\nabla (G_{\sigma=1} * I_{pixel})| of a simple smoothed gradient with unitary standard deviation as a measure of the "edgeness" of the pixel. The robustness of such a measure can be improved by embedding it in a decision test which performs a good classification of both edge and non-edge pixels. Recent studies in edge modeling

applied to road tracking tasks [15–19,26] point towards building such a decision test on the log-likelihood ratio, that is, the logarithm of the ratio between the probabilities of being "on" and "off" an edge. This criterion guarantees an optimal decision in the sense that it minimizes the Bayesian classification error [27], but the underlying distributions for "on" and "off" must be known beforehand. Such distributions can be estimated empirically by gathering and quantizing the frequencies of the filter responses in both cases. Then, the empirical probability that a given response is associated with an edge pixel is denoted by P_on(E_pixel), and the empirical probability that a given response corresponds to a non-edge pixel is denoted by P_off(E_pixel). In this paper we use the empirical distributions presented in Ref. [19]. These distributions were extracted from a range of images and quantized to take 20 values. Both distributions, and the corresponding plot of the log-likelihood ratio, are shown in Fig. 2. Taking the log-likelihood ratio into account, expression (1) is rewritten as

\tilde{I}_\phi = \frac{1}{R_{\max} - R_{\min}} \sum_{i=1}^{N} l_i \log \frac{P_{on}(E_{pixel_i})}{P_{off}(E_{pixel_i})}.    (2)
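The accumulation in Eq. (2) can be sketched as follows. This is a minimal illustration, assuming a caller-supplied `log_likelihood(x, y)` function (returning log(P_on/P_off) at coordinates relative to the junction center, a hypothetical interface) and unit-length segments, so l_i = 1 and N = R_max − R_min:

```python
import math

def contrast_profile(log_likelihood, r_min, r_max, num_angles=36):
    """Averaged accumulated log-likelihood contrast along each direction (Eq. (2)).

    The radius is discretized with unit steps, so each segment has l_i = 1
    and N = r_max - r_min segments per direction.
    """
    profile = []
    for k in range(num_angles):
        phi = 2.0 * math.pi * k / num_angles
        total = 0.0
        for rho in range(int(r_min), int(r_max)):
            # pixel on the ray at distance rho from the center
            x = int(round(rho * math.cos(phi)))
            y = int(round(rho * math.sin(phi)))
            total += log_likelihood(x, y)
        profile.append(total / (r_max - r_min))
    return profile
```

A direction aligned with an edge accumulates consistently positive log-likelihoods, producing the peaks the detector thresholds.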

Given an input image, as shown in Fig. 3, the log-likelihood ratio between the "on" and "off" probabilities gives a robust identification of edge pixels. However, this probabilistic model of the smoothed gradient filter can easily be improved by incorporating the orientation of the gradient in the definition of both the "on" and "off" probabilities. Consequently, the smoothed gradient at a given point is defined by \tilde{E}_{pixel} = (E_{pixel}, θ_{pixel}), where θ_{pixel} is the local estimation of θ*, the true orientation of the edge to which the


Fig. 3. (Top) Sample image and the value of the log-likelihood ratio log(P_on/P_off) for all its pixels; (bottom) magnitude (left) and orientation (right) of the gradient. In the case of orientation, grey is 0, white is π and black is −π.

pixel belongs. In Ref. [28], where the local estimations of the gradient are used to accumulate evidence through a Hough-like process addressed to estimate the vanishing points in an image, the "on" probability is defined in terms of P_on(\tilde{E}_{pixel} | θ*), the conditional probability of a gradient vector given the true orientation, that is, as a function of the true orientation. Such a definition makes sense because the probability of an edge being on must decrease as the estimated orientation diverges from the true orientation, and conversely it must increase as both orientations converge. Furthermore, it can be assumed that the magnitude of the gradient is independent of its orientation and vice versa, which leads to the factorization of the "on" probability into two terms, one depending on the gradient magnitude and the other on the divergence between the true and the estimated orientations:

P_{on}(\tilde{E}_{pixel} | \theta^*) = P_{on}(E_{pixel}) P_{ang}(\theta_{pixel} - \theta^*),    (3)

where P_ang(θ_pixel − θ*) is the probability of having the correct orientation. Although this probability can be estimated empirically, its shape is consistent with having a maximum both at 0 (when both orientations coincide) and at π (when both orientations are opposite). In this paper we have used the simple definition shown in Fig. 2.

On the other hand, the "off" probability can be redefined without considering the dependence between the estimated orientation and the true orientation. Therefore this probability P_off(\tilde{E}_{pixel}) depends only on the gradient vector. Assuming again the independence between gradient magnitude and orientation, the resulting probability also factorizes into two terms:

P_{off}(\tilde{E}_{pixel}) = P_{off}(E_{pixel}) U(\theta_{pixel}),    (4)

where U(θ_pixel) = 1/2π is the uniform distribution. The effectiveness of this model for estimating "edgeness" is shown in Fig. 3, where we represent the log-likelihood ratio and the magnitude and orientation of the gradient.

2.3. Wedge identification by thresholding

Given the latter extended model for the "on" and "off" probabilities, expression (2) for estimating the averaged accumulated evidence Ĩ_φ along a given direction

is replaced by

\tilde{I}_\phi = \frac{1}{R_{\max} - R_{\min}} \sum_{i=1}^{N} l_i \log \frac{P_{on}(\tilde{E}_{pixel_i} | \theta^*)}{P_{off}(\tilde{E}_{pixel_i})}.    (5)
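The factorized log-likelihood ratio of Eqs. (3)–(5) can be sketched as below. The exponential shape of `angular_prob` is an assumption of ours; the paper only requires P_ang to peak at both 0 and π (its exact definition is shown in Fig. 2), and the function names are illustrative:

```python
import math

def angular_prob(delta, k=2.0):
    """P_ang: maximal at 0 and at pi (an edge and its reverse are equally aligned).
    Exponential fall-off with rate k is our assumption, not the paper's curve."""
    d = abs(math.atan2(math.sin(delta), math.cos(delta)))  # wrap to [0, pi]
    d = min(d, math.pi - d)                                # distance to nearest of {0, pi}
    return math.exp(-k * d)

def oriented_log_likelihood(p_on, p_off, theta_pixel, theta_true):
    """log[P_on(E~|theta*) / P_off(E~)] under the factorizations of Eqs. (3)-(4):
    P_on(E, theta | theta*) = P_on(E) * P_ang(theta - theta*)
    P_off(E, theta)         = P_off(E) * U(theta),  with U = 1/(2*pi).
    """
    uniform = 1.0 / (2.0 * math.pi)
    return (math.log(p_on / p_off)
            + math.log(angular_prob(theta_pixel - theta_true) / uniform))
```

A pixel whose estimated orientation agrees with θ* (or with θ* + π) thus scores strictly higher than one whose orientation is orthogonal, even at equal gradient magnitude.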


Fig. 4. Contrast profile using the P_on/P_off method. Peaks above the threshold represent suitable limits.

Furthermore, as the quality of the contrast profile depends strongly on the correct localization of the junction center, we compensate for small localization errors by replacing the averaged evidence Ĩ_φ by the median Î_φ, which improves the robustness of the method. After introducing this consideration, the resulting profile is shown in Fig. 4. Each edge emanating from the center has an associated peak in the profile. Therefore, these peaks correspond to wedge limits. Significant limits, that is, limits with enough evidence, can be found by selecting a convenient threshold. The plots corresponding to both the magnitude and orientation terms are also shown in Fig. 4. This method finds M, the number of wedge limits, and {θ_i}, the limits themselves, provided that a good threshold is selected. The intensity distributions {T_i} associated with each wedge are estimated once the wedge limits are obtained. We include two junction filtering conditions: when the number of detected wedge limits is below M = 2, and when M = 2 and the junction is quasi-straight, that is, the relative angle between the limits is close to ±π. Furthermore, a discretization error of π/9 is assumed when looking for local maxima in the contrast profile, because this is the minimum wedge angle needed to declare a junction.

2.4. Experimental results for junction detection

We have tested our algorithm on a sequence of 18 indoor images of size 640 × 480 pixels, each having 100 junctions on average. In terms of efficiency, the average time spent to process an image was 4 s on a Pentium II 266 under Linux, considering an angular discretization


error of 1°. The results obtained for two of these images are shown in Fig. 5. Furthermore, two tests were performed to evaluate the stability of the algorithm: a localization test and a noise test. The detailed results of these tests have been reported in a previous work [9], where we compared this method with two others. In both tests, the angular error is considered. The localization test shows that the robust measure (median) introduced in the method is only able to compensate adequately for localization errors of 2 or 3 pixels, where the angular error is between 20° and 30°. At higher localization errors, the angular error reaches 60°, because the method has to face edges which are orthogonal, or at least not parallel, to the direction of evidence accumulation, and the method is not able to report enough contrast in these conditions. On the other hand, the noise test shows that the method is able to keep the angular error below 30° even when the variance of the white noise added to the image is 50, due to the robustness of the log-likelihood ratio and the median. We have also evaluated the incidence of the scope of the junction on the method, because it is a potential source of error. In order to accumulate enough evidence of contrast along a given direction, we need relatively wide scopes. In fact, our best results were obtained with r = 6, that is, R_min = 4 and R_max = 10, with threshold H = 0.5. Consequently, it is very probable that two or more neighboring junctions overlap, thus yielding incorrect results. In our indoor sequence, the proportion of incorrect detections due to scope is 45% (5% being the proportion due to bad localization and 50% the proportion of


Fig. 5. Indoor images: results of the edge-based method.

strictly correct detections). Some examples of both correct and incorrect detections are shown in Fig. 6. In order both to compensate for possible incorrect junction detections and to obtain a mid-level structural representation, we propose to perform junction grouping along potential connecting edges. As we will see in the second part of this paper, junction grouping makes it possible to remove false wedge limits due to bad center localization and scope selection.
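The peak selection and junction-filtering rules of Section 2.3 can be sketched as follows. The greedy separation-enforcing strategy and the bin-based angle handling are our own illustrative choices, not the paper's implementation:

```python
def wedge_limits(profile, threshold, min_sep_bins):
    """Wedge limits as local maxima of the circular contrast profile that
    exceed the threshold, kept at least min_sep_bins apart (the pi/9 minimum
    wedge angle, expressed in profile bins)."""
    n = len(profile)
    peaks = [i for i in range(n)
             if profile[i] > threshold
             and profile[i] >= profile[(i - 1) % n]
             and profile[i] >= profile[(i + 1) % n]]
    # greedily keep the strongest peaks while enforcing the angular separation
    peaks.sort(key=lambda i: profile[i], reverse=True)
    kept = []
    for i in peaks:
        if all(min(abs(i - j), n - abs(i - j)) >= min_sep_bins for j in kept):
            kept.append(i)
    return sorted(kept)

def is_valid_junction(limits_bins, n_bins):
    """Filtering rules from the text: reject M < 2, and reject quasi-straight
    junctions (M == 2 with the two limits separated by roughly pi)."""
    if len(limits_bins) < 2:
        return False
    if len(limits_bins) == 2:
        sep = abs(limits_bins[0] - limits_bins[1])
        sep = min(sep, n_bins - sep)
        if abs(sep - n_bins / 2) <= 1:   # relative angle close to +/- pi
            return False
    return True
```

Two limits facing each other across the center describe a plain edge rather than a junction, which is why the quasi-straight case is rejected.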

3. Connecting and filtering junctions

3.1. Path modeling for edge tracking

We are now interested in finding "connecting paths", i.e. paths that connect pairs of junctions along the edges between them, provided that these edges exist. More precisely, a connecting path P of length L, rooted at a junction center (x_c, y_c) and starting from the wedge limit


P* maximizes

E(\{p_j, \alpha_j\}) = \sum_{j=1}^{L} \log \frac{P_{on}(p_j)}{P_{off}(p_j)} + \sum_{j=1}^{L-1} \log \frac{P_{\Delta G}(\alpha_{j+1} - \alpha_j)}{U(\alpha_{j+1} - \alpha_j)}.    (6)

The first term of this function is the "intensity reward", which depends on the edge strength along each segment p_j. Defining the intensity reward of each segment of fixed length F in terms of the edge model used to compute the contrast profile yields

\log \frac{P_{on}(p_j)}{P_{off}(p_j)} = \frac{1}{F} \sum_{i=1}^{N} l_i \log \frac{P_{on}(\tilde{E}_{pixel_i} | \theta^*)}{P_{off}(\tilde{E}_{pixel_i})}.    (7)

The second term is the "geometric reward": P_G(α_{j+1} | α_j) = P_{ΔG}(α_{j+1} − α_j) models a first-order Markov chain on the orientation variables α_j. Curvature smoothing is provided by a negative exponential density function

P_{\Delta G}(\Delta\alpha_j) \propto \exp\left(-\frac{C}{2A} |\Delta\alpha_j|\right),    (8)

where Δα_j = α_{j+1} − α_j, A is the maximum angle between two consecutive segments, and C modulates the rigidity of the path. Additionally, U(α_{j+1} − α_j) is the uniform distribution of the angular variation, and it is included to keep the geometric and intensity terms in the same range.

3.2. Path searching with Bayesian A∗
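Before turning to the search itself, the path reward of Eqs. (6)–(8) can be sketched as follows. The normalization of P_ΔG on [−A, A] is worked out here for the sketch, and the default parameter values C = 5.0 and A = 0.2 follow the configuration reported in Section 3.4:

```python
import math

def geometric_log_reward(d_alpha, C=5.0, A=0.2):
    """log[P_dG(d_alpha) / U(d_alpha)] with P_dG from Eq. (8):
    P_dG(x) proportional to exp(-(C / (2A)) * |x|), normalized on [-A, A];
    U is the uniform density 1/(2A) on the same interval."""
    lam = C / (2.0 * A)
    # integral of exp(-lam*|x|) over [-A, A] = 2*(1 - exp(-lam*A))/lam
    z = 2.0 * (1.0 - math.exp(-lam * A)) / lam
    p = math.exp(-lam * abs(d_alpha)) / z
    u = 1.0 / (2.0 * A)
    return math.log(p / u)

def path_reward(intensity_terms, alphas, C=5.0, A=0.2):
    """E({p_j, alpha_j}) of Eq. (6): the summed per-segment intensity rewards
    plus the summed geometric rewards over consecutive orientation increments."""
    e = sum(intensity_terms)
    for j in range(len(alphas) - 1):
        e += geometric_log_reward(alphas[j + 1] - alphas[j], C, A)
    return e
```

With equal intensity evidence, a straight path outscores a wiggly one, which is exactly the smoothing role of the geometric term.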

Fig. 6. From left to right: correct T configuration; correct Y configuration; erroneous configuration due to over-segmentation; erroneous center localization.

defined by θ, is a collection of connected segments p_1, p_2, ..., p_L of fixed or variable length. We assume that the curvature of these paths must be smooth, so we also define second-order orientation variables α_1, α_2, ..., α_{L−1}, where α_j = θ_{j+1} − θ_j is the angle between segments p_{j+1} and p_j. Following the Bayesian approach of Yuille and Coughlan [19], the optimal path

Finding straight or curved connecting paths in cluttered scenes may be a difficult task, and it must be done in a short time, especially when real-time constraints are imposed. Coughlan and Yuille [17] have recently proposed a method, called Bayesian A∗, that exploits the statistical knowledge associated with the intensity and geometric rewards. This method is rooted in a previous theoretical analysis [16] of the connection between the twenty questions algorithm of Geman and Jedynak and the classical A∗ algorithm [29]. Given an initial junction center (x_{c0}, y_{c0}) and an orientation θ_0, the algorithm explores a tree in which each segment p_j can expand Q successors, so there are Q^N possible paths. The Bayesian A∗ reduces the conservative breadth-first behavior of the classical A∗ by exploiting the fact that we want to detect a target path against clutter, instead of finding the best choice from a population of paths. In consequence, there is one true path and many false paths. It is then possible to reduce the complexity of the search by pruning partial paths with low rewards. The algorithm evaluates the averaged intensity and geometric rewards of the last L_0 segments of a path


(the "segment block") and discards the path when one of these averaged rewards falls below a threshold, i.e. when

\frac{1}{L_0} \sum_{j=zL_0}^{(z+1)L_0 - 1} \log \frac{P_{on}(p_j)}{P_{off}(p_j)} < T \quad \text{or} \quad \frac{1}{L_0} \sum_{j=zL_0}^{(z+1)L_0 - 1} \log \frac{P_{\Delta G}(\Delta\alpha_j)}{U(\Delta\alpha_j)} < \hat{T},    (9)

where T and T̂ are the intensity and geometric thresholds that modulate the pruning behavior of the algorithm. These parameters establish the minimum averaged reward that a path needs to survive, and consequently they are closely related to the probability distributions used to design both the intensity and the geometric rewards. They must satisfy the following conditions:

-D(P_{off} \| P_{on}) < T < D(P_{on} \| P_{off}), \qquad -D(U_{\Delta G} \| P_{\Delta G}) < \hat{T} < D(P_{\Delta G} \| U_{\Delta G}),    (10)
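The block test of Eq. (9) and the admissibility bounds of Eq. (10) can be sketched as follows. Distributions are represented here as plain probability lists, and the function names are our own:

```python
import math

def kl(p, q):
    """Discrete Kullback-Leibler divergence D(p || q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def threshold_bounds(p_on, p_off):
    """Admissible range for the intensity threshold T (Eq. (10)):
    -D(P_off || P_on) < T < D(P_on || P_off)."""
    return -kl(p_off, p_on), kl(p_on, p_off)

def survives_pruning(intensity_block, geometry_block, T, T_hat):
    """Basic pruning rule (Eq. (9)): a partial path is discarded when the
    averaged intensity or geometric reward over the last block of L0
    segments drops below its threshold."""
    L0 = len(intensity_block)
    if sum(intensity_block) / L0 < T:
        return False
    if sum(geometry_block) / L0 < T_hat:
        return False
    return True
```

The bounds make the trade-off explicit: the further apart P_on and P_off are, the higher T can be set and the more aggressive the pruning.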

where D is the Kullback–Leibler divergence. The algorithm finds the best path that survives the pruning, and the expected convergence rate is O(N). Typically the values of T and T̂ are set close to their upper bounds. Additionally, if P_on diverges from P_off, the pruning rule will be very restrictive. Conversely, if these distributions are similar, the algorithm will be very conservative. The same reasoning holds for P_ΔG and U_ΔG. The existence of real-time considerations motivates the extension of the basic pruning rule with an additional, although inadmissible, rule. We also consider the "stability of long paths" against shorter paths. Long paths are more likely to be close to the target than shorter ones, because they have survived more reward prunings. Then, if L_best is the length of the best partial path, we also prune paths with lengths L_j when

L_{best} - L_j > Z L_0,    (11)
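The inadmissible length-stability rule of Eq. (11) can be sketched over a hypothetical mapping from path identifier to path length in segments:

```python
def prune_short_paths(paths, Z, L0):
    """Extended (inadmissible) pruning rule (Eq. (11)): drop any path whose
    length trails the current best by more than Z * L0 segments.

    `paths` maps a path id to its length in segments; the dict-based
    bookkeeping is an illustrative choice, not the paper's data structure.
    """
    L_best = max(paths.values())
    return {pid: L for pid, L in paths.items() if L_best - L <= Z * L0}
```

Small Z prunes aggressively (risking the true path); large Z lets short paths linger and costs time, which is the trade-off studied experimentally in Section 3.4.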

where Z > 0 sets the minimum allowed difference between the best path and the rest of the paths. Low values of Z introduce more pruning, and the risk of losing the true path is higher. When Z is large, shorter paths can survive. The algorithm selects for expansion the best partial path that survives the extended pruning rule. These paths are stored in a sorted queue. We consider that we have reached the end of a connecting path when the center (x_{cf}, y_{cf}) of a junction is found in a small neighborhood around the end of the selected path. In order to perform this test, we use a "range tree", a representation from computational geometry [30] that is suitable for searching efficiently within a range. The cost of generating the tree is O(J log J), where J is the number

of detected junctions. Using this representation, a range query can be performed with cost O(log J) in the worst case. Once a new junction is reached, the last segment of the path must lie on the limit θ_f between two wedges. We therefore use this condition to label the closest limit as "visited". If the last segment falls between two limits and the angle between them is below a given threshold, B = π/6, then both limits are labeled as visited. As the search of a new path can be started only along a non-visited limit, this mechanism avoids tracking the same edge in the opposite direction. However, the search may finish without finding a junction. This event is indicated by an empty queue. In this case, if the length of the last path expanded by the algorithm is below the block size L_0, we consider that this path emanates from a false limit, and this limit is cancelled. Otherwise, the search has reached a "termination point" and its coordinates are stored. If we find another termination point in a given neighborhood, both paths are connected. This connection is associated with a potential undetected junction when the angle between the last segments of the paths is greater than π/9, the minimum angle needed to declare a junction.

3.3. Junction grouping from connecting paths

Our "local-to-global" grouping algorithm starts from a given junction and performs path searching for each non-visited limit. When a new junction is reached, its corresponding limit is labeled as visited. Once all paths emanating from a junction are tested, the algorithm selects a new junction. Connected junctions are grouped. Labeling avoids path duplicity. Robustness is provided by the fact that an edge can be tracked in a given direction if the search from the opposite direction fails. As we have seen previously, it is possible to join partial paths at termination points. However, the most interesting feature of our grouping method is its role in correcting errors in junction detection.
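The end-of-path junction query above uses a range tree; the sketch below substitutes a simpler uniform-grid bucket index, which answers the same fixed-radius neighborhood query with less machinery (it assumes non-negative integer coordinates; names are ours):

```python
from collections import defaultdict

class JunctionIndex:
    """Spatial index over junction centers for the end-of-path test.
    A grid-bucket substitute for the paper's range tree: build is O(J),
    and a query touches only the few cells overlapping the search disc."""

    def __init__(self, centers, cell=8):
        self.cell = cell
        self.buckets = defaultdict(list)
        for (x, y) in centers:
            self.buckets[(x // cell, y // cell)].append((x, y))

    def near(self, x, y, radius):
        """Return all junction centers within `radius` of (x, y)."""
        out = []
        r_cells = radius // self.cell + 1
        cx, cy = x // self.cell, y // self.cell
        for dx in range(-r_cells, r_cells + 1):
            for dy in range(-r_cells, r_cells + 1):
                for (px, py) in self.buckets.get((cx + dx, cy + dy), []):
                    if (px - x) ** 2 + (py - y) ** 2 <= radius ** 2:
                        out.append((px, py))
        return out
```

Each time the search expands a path, a single `near` call around the path's endpoint decides whether a junction has been reached.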
False limits will be removed when their path searching fails, and false junctions will be removed when all their limits disappear. Furthermore, our method generates a mid-level representation: we can use the connectivity and the information contained in the paths for segmentation, tracking and recognition tasks.

3.4. Experimental results for junction grouping

In order to test our grouping algorithm, we assume the junction detector parameters proposed in Section 2.4. On the other hand, the general path-finding parameters are selected as follows: branching factor Q = 3 (with Q_0 = 5 and a relative angle of Δ = 12° at the first step of the algorithm); block size L_0 = 3 segments, with F = 5


Fig. 7. Grouping results with the standard configuration. Three images of the indoor sequence.

the length of each segment. The incidence angle threshold for labeling visited limits is set to B = π/6 = 30°. The empirical distributions P_on and P_off for segments have the following Kullback–Leibler divergences: −D(P_off ‖ P_on) = −2.776 and D(P_on ‖ P_off) = 2.856. Consequently, we have set the averaged intensity reward threshold to T = 0.0. On the other hand, the exponential distribution P_ΔG is defined by setting its rigidity to C = 5.0 and the maximum angle between segments to A = 0.2 rad, that is, approximately 12°. These parameters yield the following divergences: −D(U_ΔG ‖ P_ΔG) = −0.752 and D(P_ΔG ‖ U_ΔG) = 0.535. Consequently, we have set the averaged geometric reward threshold to T̂ = 0.4. Finally, the extended pruning parameter, that is, the minimum allowed difference between the length of the current best path and those of the rest of the paths, is set to Z = 2.3 times the segment block. Given this configuration, we have applied our junction grouping algorithm to the 18 640 × 480 images of the indoor sequence referred to in Section 2.4. Considering that the average number of paths per image was 274, the average processing time was 6.5 s per image (discounting the time spent in junction detection) on a Pentium II 266 under Linux (see some grouping results in Fig. 7). The value selected for the extended pruning parameter in the standard configuration (Z = 2.3) represents a trade-off between admissibility and efficiency. This trade-off is derived from analyzing the evolution of both the average processing time per image and the average number of partial paths generated during the search in an image (see the respective plots in Fig. 8 and illustrative grouping results in Fig. 9). Setting Z = 0 means that all paths shorter than the current best path are pruned. This setting results in extremely heavy pruning, and thus in non-admissible solutions to path finding.
When we increase Z in order to introduce admissibility into the search, the average number of partial paths, and thus the average processing time, also increases. The average processing time increases until it stabilizes, for Z > 4.


Fig. 8. Incidence of the Z parameter on processing time and number of paths.

Fig. 9. Experiments changing the Z parameter: (top) Z = 1; (bottom) Z = 3.

However, the average number of partial paths stabilizes earlier, for Z ≥ 2.3, meaning that almost all existing paths are effectively found under these conditions. Therefore, setting Z > 2.3 results in extra computing time, and setting



Z < 2.3 results in finding fewer paths. Thus, Z = 2.3 is the optimal setting. On the other hand, the distribution thresholds T and T̂ of the standard configuration are kept inside the admissibility bounds predicted by the Kullback–Leibler divergences. Their values are fixed so that they are as close as possible to their upper bounds while fitting the contrast and geometry of the scene. In Fig. 10 we show the effect of a too low intensity threshold (top), a correct threshold (middle) and a too high threshold (bottom). For instance, when this threshold is too high, only paths with high contrast evidence can survive. The same test is repeated for the geometric threshold in Fig. 11. In this case, a too high threshold results in the survival of almost straight paths. Finally, we have evaluated the incidence of our grouping algorithm in correcting errors in junction detection. In Section 2.4 we showed that these errors are motivated mainly by a bad localization of the junction center or by the overlapping of the scopes associated with different junctions (see examples in Fig. 6). We found that the proportion of incorrect junction detections in our indoor sequence was 50% (45% due to scope and 5% due to bad localization). Our grouping algorithm aims to reduce the number of false limits associated with these incorrect detections. We found that the proportion of eliminated false limits was 55%; that is, 45% of limits are the starting points of paths with enough edge evidence. Consequently, grouping contributes significantly to filtering incorrect limits. We also found that the number of junctions eliminated because all their limits are eliminated is negligible, and also that these junctions are associated with local details (lamps in the ceiling, plants, small objects) which really exist in the images but are not meaningful from the point of view of a global mid-level representation.

4. Conclusions and future work

The main contribution of this paper is the integration of recent Bayesian techniques in junction detection and junction grouping, and also the use of the models proposed in the paper to obtain a structured mid-level representation. Junction detection relies on a probabilistic edge model and a log-likelihood test. This model is used to build and segment a one-dimensional profile which contains the junction limits. Our junction grouping method relies on extending this edge model to take into account the geometry of connecting paths between limits of different junctions. Path searching is performed by the Bayesian A∗, which prunes partial paths with low intensity or geometric rewards. We complemented this pruning with a condition which favors the stability of long paths against shorter paths.

Fig. 10. Effect of T (intensity threshold): lower threshold T = −1 (top), correct threshold T = 0 (middle), and higher threshold T = 2 (bottom).

Our junction grouping algorithm produces a mid-level representation including junction connectivity information. This approach has been tested experimentally on an indoor image sequence. We have focused on both the efficiency and the robustness of the approach. We have found that connecting paths may be identified


5. Summary

Fig. 11. Effect of T̂ (geometric threshold): lower threshold T̂ = −0.5 (top), correct threshold T̂ = 0.4 (middle), and higher threshold T̂ = 1 (bottom).

through a trade-off between admissibility and efficiency, and also that junction detection errors may be corrected effectively by grouping. Future work includes the refinement of this structure and its use in segmentation and reconstruction tasks, especially in the context of robot navigation.

In this paper, we propose and integrate two Bayesian methods, one of them for junction detection and the other one for junction grouping. Junction detection relies on a probabilistic edge model recently proposed in the literature. This model defines the "edgeness" of a pixel in terms of the log-likelihood ratio between its probabilities of being "on" and "off" an edge. Given a candidate junction center, previously detected by a non-linear filter, and its associated neighborhood, we accumulate the averaged results of the log-likelihood ratios along each direction, yielding a one-dimensional contrast profile whose peaks represent the limits of the piecewise constant regions emanating from the center; these peaks can thus be identified by applying a convenient threshold. We present some experimental results showing the effectiveness of the method and the main sources of errors (neighborhood overlapping and incorrect center localization).

In order to reduce incorrect detections, we propose to extend the detected limits through potential edges to test whether or not they have enough contrast evidence. Such an extension is modeled as a connecting path between junctions (a collection of consecutive segments). In order to select the best path we use a cost function which complements the log-likelihood ratio due to the edgeness of each segment (intensity reward) with a second log-likelihood ratio relating the amount of geometric knowledge to the uniform distribution (geometric reward), yielding a preference for paths with smooth curvature. Maximizing this cost function is the core of the Bayesian A∗ method, also recently proposed. The effectiveness of this method relies on the fact that we are searching for a unique true path in a population of false paths due to clutter. This allows us to reduce the complexity of the search by pruning paths with low intensity or geometric rewards.
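The pruning condition on partial paths can be sketched as a pair of threshold tests on the averaged rewards. This is a hedged illustration of the idea only: the function name, the per-segment averaging and the strict inequalities are our assumptions; the paper additionally derives admissibility bounds for T and Tˆ that are not modeled here.

```python
def prune(intensity_rewards, geometric_rewards, T, T_hat):
    """Bayesian A*-style pruning test for one partial path.

    `intensity_rewards` and `geometric_rewards` hold the per-segment
    log-likelihood rewards accumulated so far. The path survives only
    if both averages stay above the thresholds T and T_hat.
    """
    n = len(intensity_rewards)
    if n == 0:
        return True  # an empty path carries no evidence against it
    avg_intensity = sum(intensity_rewards) / n
    avg_geometric = sum(geometric_rewards) / n
    return avg_intensity > T and avg_geometric > T_hat
```

A path with consistently positive edge evidence and smooth geometry survives, while one whose averaged intensity reward falls below T is discarded regardless of its geometry.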
The algorithm finds the best path among those which survive the pruning, and its expected convergence rate is linear in the length of the path. Furthermore, in order to reduce the complexity even more, we introduce an additional, non-admissible pruning rule which favors the stability of long partial paths against shorter ones. The latter path-finding algorithm is then embedded into a junction grouping algorithm which initiates path searching for all the limits in each junction. This includes a test to detect the end of each path (which is unknown beforehand); the test is performed by searching for junction centers in a small neighborhood around the selected partial path.

The grouping algorithm is tested in an indoor image sequence by selecting a suitable configuration for the parameters of both junction detection and grouping. We analyze experimentally the effect of all the parameters which define the pruning conditions and also evaluate the impact of grouping on correcting errors in junction detection. After these experiments we conclude
that our grouping approach is well suited to generate a mid-level representation which includes enough useful connectivity information to be used in tasks like feature matching and segmentation.

Acknowledgements

The authors would like to thank Alan Yuille and James Coughlan, both at the Smith-Kettlewell Eye Research Institute, San Francisco (CA), for their encouraging discussions about the application of Bayesian inference rules to junction detection, and also for their support and hospitality.

References

[1] W. Freeman, E. Adelson, Junctions detection and classification, Proceedings of ARVO, Sarasota, FL, 1991.
[2] K. Rohr, Recognizing corners by fitting parametric models, Int. J. Comput. Vision 9 (1992) 213–230.
[3] J. Matas, J. Kittler, Contextual junction finder, in: J.L. Crowley, H. Christensen (Eds.), Vision as Process, Springer, Berlin, 1995, pp. 133–141.
[4] J. Bigun, A structure feature for some image processing applications based on spiral functions, Comput. Vision, Graphics Image Process. 51 (1990) 166–194.
[5] R. Deriche, T. Blaszka, Recovering and characterizing image features using an efficient model based approach, Proceedings of Computer Vision and Pattern Recognition, 1993, pp. 530–535.
[6] W. Förstner, A framework for low level feature extraction, Proceedings of the European Conference on Computer Vision, Stockholm, 1994.
[7] L. Parida, D. Geiger, R. Hummel, Junctions: detection, classification, and reconstruction, IEEE Trans. Pattern Anal. Mach. Intell. 20 (7) (1998) 687–698.
[8] J. Rissanen, A universal prior for integers and estimation by minimum description length, Ann. Statist. 11 (1983) 416–431.
[9] M.A. Cazorla, F. Escolano, Two Bayesian methods for junction detection, IEEE Trans. Image Process. 2002, submitted for publication.
[10] T. Lindeberg, M.-X. Li, Segmentation and classification of edges using minimum description length approximation and complementary cues, Technical Report ISRN KTH/NA/P-96/01-SE, Royal Institute of Technology, 1996.
[11] J. Malik, On binocularly viewed occlusion junctions, Proceedings of the European Conference on Computer Vision, Stockholm, 1996.
[12] K. Brunnström, J.-O. Eklundh, T. Uhlin, Active fixation and scene exploration, Int. J. Comput. Vision 17 (2) (1996) 137–162.
[13] T.-L. Liu, D. Geiger, Visual deconstruction: recognizing articulated objects, in: M. Pelillo, E. Hancock (Eds.), Proceedings of the First International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, Venice, Italy, Lecture Notes in Computer Science, Vol. 1223, Springer, Berlin, 1997, pp. 295–309.

[14] H. Ishikawa, D. Geiger, Segmentation by grouping junctions, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Santa Barbara, CA, 1998.
[15] D. Geman, B. Jedynak, An active testing model for tracking roads in satellite images, IEEE Trans. Pattern Anal. Mach. Intell. 18 (1) (1996) 1–14.
[16] A. Yuille, J. Coughlan, Twenty questions, focus of attention and A∗: a theoretical comparison of optimization strategies, in: M. Pelillo, E. Hancock (Eds.), Proceedings of the First International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, Venice, Italy, Lecture Notes in Computer Science, Vol. 1223, Springer, Berlin, 1997, pp. 197–212.
[17] J. Coughlan, A. Yuille, Bayesian A∗ tree search with expected O(N) convergence rates for road tracking, in: M. Pelillo, E. Hancock (Eds.), Proceedings of the Second International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, York, UK, Lecture Notes in Computer Science, Vol. 1654, Springer, Berlin, 1999, pp. 189–204.
[18] A.L. Yuille, J. Coughlan, An A∗ perspective on deterministic optimization for deformable templates, Pattern Recognition 33 (2000) 603–616.
[19] A. Yuille, J. Coughlan, Fundamental limits of Bayesian inference: order parameters and phase transitions for road tracking, IEEE Trans. Pattern Anal. Mach. Intell. 22 (2) (2000) 160–173.
[20] M.A. Cazorla, F. Escolano, D. Gallardo, R. Rizo, Bayesian models for finding and grouping junctions, in: M. Pelillo, E. Hancock (Eds.), Proceedings of the Second International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, York, UK, Lecture Notes in Computer Science, Vol. 1654, Springer, Berlin, 1999, pp. 70–82.
[21] D. Waltz, Understanding line drawings of scenes with shadows, in: P.H. Winston (Ed.), The Psychology of Computer Vision, McGraw-Hill, New York, 1972.
[22] J. Malik, Interpreting line drawings of curved objects, Int. J. Comput. Vision 1 (1) (1987) 73–104.
[23] C.G. Harris, M. Stephens, A combined corner and edge detection, Proceedings of the Fourth Alvey Vision Conference, Manchester, 1988, pp. 147–151.
[24] L. Kitchen, A. Rosenfeld, Gray level corner detection, Pattern Recognition Lett. 1 (2) (1982) 95–102.
[25] S.M. Smith, J.M. Brady, SUSAN – a new approach to low level image processing, Int. J. Comput. Vision 23 (1) (1997) 45–78.
[26] S. Konishi, J. Coughlan, A. Yuille, S.C. Zhu, Fundamental bounds on edge detection: edge cues, Proceedings of the International Conference on Computer Vision and Pattern Recognition, Fort Collins, CO, 1999.
[27] R.O. Duda, P.E. Hart, Pattern Classification and Scene Analysis, Wiley, New York, 1973.
[28] J. Coughlan, A. Yuille, Manhattan world: compass direction from a single image by Bayesian inference, Proceedings of the International Conference on Computer Vision, Kerkyra, 1999.
[29] J. Pearl, Heuristics, Addison-Wesley, Reading, MA, 1984.
[30] M. de Berg, M. van Kreveld, M. Overmars, O. Schwarzkopf, Computational Geometry: Algorithms and Applications, Springer, Berlin, 1997.


About the Author—MIGUEL CAZORLA received his B.S. degree and Ph.D. in Computer Science from the University of Alicante (Spain) in 1995 and 2000, respectively. He is currently a lecturer with the Department of Computer Science and Artificial Intelligence of the University of Alicante. His research interest areas are computer vision, mobile robotics and web technology.

About the Author—FRANCISCO ESCOLANO received his B.S. in Computer Science from the Polytechnical University of Valencia (Spain) in 1992 and his Ph.D. in Computer Science from the University of Alicante in 1997. He is currently an Associate Professor with the Department of Computer Science and Artificial Intelligence of the University of Alicante. He is also the head of the Computer Vision and Image Synthesis Group of that department. His research interests include computer vision (biomedical applications), robotics (active vision), and the coupling between biological and computer vision (vision in brains and computers). He has been a post-doctoral fellow with Dr. Norberto Grzywacz at the Smith-Kettlewell Eye Research Institute, San Francisco, CA (USA), where he has also collaborated with Dr. Alan Yuille.

About the Author—DOMINGO GALLARDO received his B.S. in Computer Science from the Polytechnical University of Valencia (Spain) in 1992 and his Ph.D. in Computer Science from the University of Alicante in 1999. He is currently an Associate Professor with the Department of Computer Science and Artificial Intelligence of the University of Alicante. His research interests include computer vision, visual navigation and autonomous robots.

About the Author—RAMÓN RIZO received his B.S. in Mathematics from the University of Valencia (Spain) in 1977 and his Ph.D. in Computer Science from the Polytechnical University of Valencia in 1991. He is currently a Professor and the head of the Department of Computer Science and Artificial Intelligence of the University of Alicante. He is also the head of the Robotics and Artificial Intelligence Group of that department. His research interests include computer vision, artificial intelligence, robotics, and CAD.
