Structural Sensitivity for Large-Scale Line-Pattern Recognition

Benoit Huet and Edwin R. Hancock Department of Computer Science University of York, York, YO10 5DD, UK

Abstract. This paper provides a detailed sensitivity analysis for the problem of recognising line patterns from large structural libraries. The analysis focuses on the characterization of two different recognition strategies. The first is histogram-based while the second uses feature-sets. In the former case comparison is based on the Bhattacharyya distance between histograms, while in the latter case the feature-sets are compared using a probabilistic variant of the Hausdorff distance. We study the two algorithms under line-dropout, line fragmentation, line addition and line end-point position errors. The analysis reveals that while the histogram-based method is most sensitive to the addition of line segments and end-point position errors, the set-based method is most sensitive to line dropout.

1 Introduction

The recognition of objects from large libraries is a problem of pivotal importance in image retrieval [9, 7, 6, 1]. The topic has attracted massive interest over the past decade. Most of the literature has focussed on using low-level image characteristics such as colour [9], texture [2] or local feature orientation [6] for the purposes of recognition. One of the most efficient ways to realise recognition is to encode the distribution of image characteristics in a histogram [9]. Recognition is achieved by comparing the histogram for the query and those for the images residing in the library.

In a recent series of papers, we have embarked on a more ambitious programme of work where we have attempted large-scale object recognition from structural libraries rather than image libraries [5, 4, 3]. Specifically, we have shown how line-patterns segmented from 2D images can be recognised using a variety of structural summaries. We have looked at three different image representations and have investigated ways of recognising objects by comparing the representations. The simplest structural representation is a relational histogram. This is a variant of the pairwise geometric histogram [10] where Euclidean invariant relative attributes are binned provided that the line primitives are connected by an edge of a nearest neighbour graph [5]. Although relatively crude, object recognition via histogram comparison does not require explicit correspondences to be identified between individual line tokens. A more sophisticated representation is to store the set of pairwise attributes for the edges of the nearest-neighbour graph. Different sets of attributes can be compared using a fuzzy variant of the Hausdorff distance [4]. Here the problem of finding explicit correspondences between the elements of the set is circumvented. The final method is to use an efficient graph-matching technique to ensure that the pattern of correspondences is consistent [3].

It is important to stress that as the recognition strategy becomes more sophisticated, so the computational overheads increase. We have viewed the application of these different recognition strategies as a sequential refinement process. The idea is to commence by limiting the set of possible recognition hypotheses with a coarse histogram search. The candidates are then refined on the basis of the fuzzy Hausdorff distance and finally verified by detailed graph-matching. The critical question that underpins this strategy is how much pruning of the data-base can be effected in the histogram comparison step without leading to an unacceptably high probability of rejecting the true match. The answer to this question is one of noise sensitivity. Provided that the line patterns are not subjected to undue corruption, then the initial cut can be quite severe.

The aim in this paper is to provide an analysis of the two hypothesis refinement steps to better understand their noise sensitivity characteristics. We consider four corruption processes. The first of these is positional jitter. The second is the addition of clutter. The third is line dropout. The fourth and final process is that of line-fragmentation. We illustrate that the most destructive process is the addition of clutter. Based on this analysis, we provide ROC curves that can be used to set the rejection cutoff for both the relational histogram and the fuzzy Hausdorff distance. With this information to hand the processes can be integrated so as to deliver a pruned set of hypotheses which is both conservative and parsimonious.

2 Object Representation

We are interested in line-pattern recognition. The raw information available for each line segment is its orientation (angle with respect to the horizontal axis) and its length (see Figure 1). To illustrate how the pairwise feature attributes are computed, suppose that we denote the line segments indexed (ab) and (cd) by the vectors $x_{ab}$ and $x_{cd}$ respectively. The vectors are directed away from their point of intersection. The relative angle attribute is given by

$$\theta_{x_{ab},x_{cd}} = \arccos\left[\frac{x_{ab} \cdot x_{cd}}{|x_{ab}|\,|x_{cd}|}\right].$$

From the relative angle we compute the directed relative angle. This is an extension to the attribute used by Thacker et al. [10], that consists of giving the relative angle a positive sign if the direction of the angle from the baseline $x_{ab}$ to its pair $x_{cd}$ is clockwise and a negative sign if it is counter-clockwise. This allows us to extend the range of angles describing pairs of segments from $[0, \pi]$ to $[-\pi, \pi]$ and therefore to reduce indexation errors associated with angular ambiguities.

In order to describe the relative position between a pair of segments and resolve the local shape ambiguities produced by the relative angle attribute we introduce a second attribute. The directed relative position $\vartheta_{x_{ab},x_{cd}}$ is represented by the normalised length ratio between the oriented baseline vector $x_{ab}$ and the vector $x_{ib}$ joining the end (b) of the baseline segment (ab) to the intersection of the segment pair (cd):

$$\vartheta_{x_{ab},x_{cd}} = \left[\frac{1}{2} + \frac{D_{ib}}{D_{ab}}\right]^{-1}.$$

The physical range of this attribute is $(0, 1]$. A relative position of 0 describes parallel segments, while a relative position of 1 indicates that the two segments intersect at the middle point of the baseline.

Fig. 1. Geometry for shape representation (segment end-points a to i, relative angle $\theta_{ab,cd}$, relative position $\vartheta_{ab,cd}$, and the lengths $D_{ab}$ and $D_{ib}$).

We aim to augment the pairwise attributes with constraints provided by the edge-set of the N-nearest neighbour graph. Accordingly, we represent the sets of line-patterns as 4-tuples of the form $G = (V, E, U, B)$. Here the line-segments extracted from an image are indexed by the set $V$. More formally, the set $V$ represents the nodes of our nearest neighbourhood graph. The edge-set of this graph $E \subseteq V \times V$ is constructed as follows. For each node in turn, we create an edge to the N line-segments that have the closest distances. Associated with the nodes and edges of the N-nearest neighbour graph are unary and binary attributes. The unary attributes are defined on the nodes of the graph and are represented by the set $U = \{(\phi_i, l_i);\ i \in V\}$. Specifically, the attributes are the line-orientation $\phi_i$ and the line-length $l_i$. By contrast, the binary attributes are defined over the edge-set of the graph. The attribute set $B = \{(\theta_{i,j}, \vartheta_{i,j});\ (i,j) \in E \subseteq V \times V\}$ consists of the set of pairwise geometric attributes for line-pairs connected by an edge in the N-nearest neighbour graph.

We are concerned with attempting to recognise a single line-pattern $G_m = (V_m, E_m, U_m, B_m)$, or model, in a data-base of possible alternatives. The alternative data-patterns are denoted by $G_d = (V_d, E_d, U_d, B_d)$, $\forall d \in D$, where $D$ is the index-set of the data-base.
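As an illustration of the two pairwise attributes defined in this section, the following is a minimal sketch rather than the authors' implementation. The function names, the representation of a segment as a pair of (x, y) end-points, the choice of a y-up coordinate frame for the clockwise sign convention, and the rule that each vector runs from its end-point nearest the intersection to the one farthest from it are all assumptions made here for the example. Segments are assumed to have non-zero length.

```python
import math

def _intersection(a, b, c, d, eps=1e-12):
    """Intersection of the infinite lines through (a, b) and (c, d); None if parallel."""
    rx, ry = b[0] - a[0], b[1] - a[1]
    sx, sy = d[0] - c[0], d[1] - c[1]
    denom = rx * sy - ry * sx
    if abs(denom) < eps:
        return None
    t = ((c[0] - a[0]) * sy - (c[1] - a[1]) * sx) / denom
    return (a[0] + t * rx, a[1] + t * ry)

def _orient_away(p, q, origin):
    """Order the end-points so that the vector p->q ends at the point farthest from origin."""
    if origin is None:
        return p, q
    dp = math.hypot(p[0] - origin[0], p[1] - origin[1])
    dq = math.hypot(q[0] - origin[0], q[1] - origin[1])
    return (p, q) if dp <= dq else (q, p)

def pairwise_attributes(seg_ab, seg_cd):
    """Return (directed relative angle, directed relative position) for a segment pair."""
    i = _intersection(*seg_ab, *seg_cd)
    a, b = _orient_away(*seg_ab, i)
    c, d = _orient_away(*seg_cd, i)
    xab = (b[0] - a[0], b[1] - a[1])
    xcd = (d[0] - c[0], d[1] - c[1])
    dot = xab[0] * xcd[0] + xab[1] * xcd[1]
    cross = xab[0] * xcd[1] - xab[1] * xcd[0]
    # atan2(cross, dot) is counter-clockwise positive in a y-up frame,
    # so negate it to make clockwise rotations positive (assumed convention).
    theta = -math.atan2(cross, dot)
    if i is None:
        return theta, 0.0  # parallel segments: relative position tends to 0
    d_ab = math.hypot(*xab)
    d_ib = math.hypot(b[0] - i[0], b[1] - i[1])
    return theta, 1.0 / (0.5 + d_ib / d_ab)
```

For example, `pairwise_attributes(((0, 0), (2, 0)), ((1, -1), (1, 1)))` describes a segment crossing the baseline at its middle point, for which the relative position evaluates to 1.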

3 Relational Histograms

With the edge-set of the nearest neighbour graph to hand, we can construct the structurally gated geometric histogram [5]. The bin-incrementing process can be formally described as follows. Let i and j be two segments extracted from the raw image. The angle and position attributes $\theta_{i,j}$ and $\vartheta_{i,j}$ are binned provided the two segments are connected by an edge, i.e. $(i,j) \in E$. If this condition is met then the bin $H(\alpha,\beta)$ spanning the two attributes is incremented as follows

$$H(\alpha,\beta) = \begin{cases} H(\alpha,\beta) + 1 & \text{if } (i,j) \in E \text{ and } \theta_{i,j} \in A_\alpha \text{ and } \vartheta_{i,j} \in R_\beta \\ H(\alpha,\beta) & \text{otherwise} \end{cases}$$

where $A_\alpha$ is the range of directed relative angle attributes spanned by the $\alpha$th horizontal histogram-bin and $R_\beta$ is the range of directed relative position spanned by the $\beta$th vertical histogram-bin. Each histogram contains $n_A$ relative angle bins and $n_R$ length ratio bins.

The data-base is queried by computing the Bhattacharyya distance or histogram correlation. Suppose that $h_m$ is the normalised relational histogram for the query image and $h_d$ is the normalised histogram for the image indexed d in the data-base, then the Bhattacharyya distance is given by

$$R(G_d, G_m) = -\ln \sum_{\alpha=1}^{n_A} \sum_{\beta=1}^{n_R} \sqrt{h_d(\alpha,\beta) \times h_m(\alpha,\beta)}$$

The best-matched line pattern $G_d$ is the one that satisfies the condition

$$R(G_d, G_m) = \arg\min_{d' \in D} R(G_{d'}, G_m) \qquad (1)$$
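The histogram construction and query step of this section can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names, the default bin counts, the uniform binning of the angle over $[-\pi, \pi]$ and of the position over $[0, 1]$, and the representation of the data-base as a dictionary of normalised histograms are all hypothetical. The input is assumed to contain only the attribute pairs of line-pairs already connected by an edge of the nearest neighbour graph.

```python
import math

def relational_histogram(edge_attributes, n_a=8, n_r=8):
    """Bin (theta, position) pairs into an n_a x n_r histogram and normalise it.

    edge_attributes holds one (theta, position) pair per graph edge, with
    theta in [-pi, pi] and position in [0, 1]; bin counts are assumed values.
    """
    hist = [[0.0] * n_r for _ in range(n_a)]
    for theta, position in edge_attributes:
        alpha = int((theta + math.pi) / (2.0 * math.pi) * n_a)
        beta = int(position * n_r)
        alpha = max(0, min(alpha, n_a - 1))  # clamp the upper boundary into the last bin
        beta = max(0, min(beta, n_r - 1))
        hist[alpha][beta] += 1.0
    total = sum(sum(row) for row in hist)
    if total > 0:
        hist = [[count / total for count in row] for row in hist]
    return hist

def bhattacharyya_distance(h_d, h_m):
    """R = -ln sum_{alpha,beta} sqrt(h_d * h_m); infinite when the histograms do not overlap."""
    coefficient = sum(
        math.sqrt(p * q)
        for row_d, row_m in zip(h_d, h_m)
        for p, q in zip(row_d, row_m)
    )
    return -math.log(coefficient) if coefficient > 0 else math.inf

def best_match(database, h_m):
    """Return the index d of the data-base histogram with the smallest distance to h_m."""
    return min(database, key=lambda d: bhattacharyya_distance(database[d], h_m))
```

Because the histograms are normalised, two identical histograms give a distance of zero, and the distance grows as the overlap between the two distributions shrinks.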

4 Feature Sets

The second recognition strategy involves comparing the pairwise feature sets for the line-patterns. We measure the pattern similarity using pairwise attribute relations defined on the edges of the nearest-neighbour graph. Suppose that the set of nodes connected to the model-graph node I is $C_I^m = \{J \mid (I,J) \in E_m\}$. The corresponding set of data-graph nodes connected to the node i is $C_i^d = \{j \mid (i,j) \in E_d\}$. With these ingredients, the consistency criterion which combines evidence for the match of the graph $G_m$ onto $G_d$ is

$$Q(G_d, G_m) = \frac{1}{|V_m|\,|V_d|} \sum_{i \in V_d} \sum_{I \in V_m} \frac{1}{|C_i^d|} \sum_{j \in C_i^d} \frac{1}{|C_I^m|} \sum_{J \in C_I^m} P\left((i,j) \to (I,J) \mid v^m_{I,J}, v^d_{i,j}\right)$$

The probabilistic ingredients of the evidence combining formula need further explanation. The a posteriori probability $P\left((i,j) \to (I,J) \mid v^m_{I,J}, v^d_{i,j}\right)$ represents the evidence for the match of the model-graph edge (I, J) onto the data-graph edge (i, j) provided by the corresponding pair of attribute relations $v^m_{I,J}$ and $v^d_{i,j}$. In practice, these relations are the angle difference $\theta_{i,j}$ and the length ratio $\vartheta_{i,j}$ defined in Section 2. We assume that the conditional prior can be modelled as follows

$$P\left((i,j) \to (I,J) \mid v^m_{I,J}, v^d_{i,j}\right) = \Gamma_\sigma\left(\|v^m_{I,J} - v^d_{i,j}\|\right) \qquad (2)$$

where $\Gamma_\sigma\left(\|v^m_{I,J} - v^d_{i,j}\|\right)$ is a distance weighting function. In a previous study [4] we have shown that the most effective weighting kernel is a Gaussian of the form $\Gamma_\sigma(\rho) = \exp\left[-\rho^2/\sigma\right]$.

We now consider how to simplify the computation of relational consistency. We commence by considering the inner sum over the nodes in the model-graph neighbourhood $C_I^m$. Rather than averaging the edge-compatibilities over the entire set of feasible edge-wise associations, we limit the sum to the contribution of maximum probability. Similarly, we limit the sum over the node-wise associations in the model graph by considering only the matched neighbourhood of maximum compatibility. With these restrictions, the process of maximising the Bayesian consistency measure is equivalent to maximising the following relational-similarity measure

$$Q(G_d, G_m) = \sum_{i \in V_d} \max_{I \in V_m} \sum_{j \in C_i^d} \max_{J \in C_I^m} \Gamma_\sigma\left(\|v^m_{I,J} - v^d_{i,j}\|\right) \qquad (3)$$

With the similarity measure to hand, the best matched line pattern is the one which satisfies the condition

$$Q(G_d, G_m) = \arg\max_{d' \in D} Q(G_{d'}, G_m) \qquad (4)$$
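The relational-similarity measure of equation (3) can be sketched directly as nested sum and max operations. The sketch below is illustrative only. It assumes, hypothetically, that each graph is stored as a dictionary mapping a node to a dictionary of its neighbours, with the (angle, position) attribute pair stored on each edge, that the attribute distance is Euclidean without any rescaling of the two components, and that the Gaussian kernel takes the form $\exp[-\rho^2/\sigma]$ with an arbitrary width.

```python
import math

def gaussian_kernel(rho, sigma=0.5):
    """Gaussian distance weighting; the exact form and the width value are assumptions."""
    return math.exp(-(rho * rho) / sigma)

def attribute_distance(v_model, v_data):
    """Euclidean distance between two (angle, position) attribute pairs."""
    return math.hypot(v_model[0] - v_data[0], v_model[1] - v_data[1])

def relational_similarity(data_graph, model_graph, sigma=0.5):
    """Q = sum_i max_I sum_{j in C_i} max_{J in C_I} kernel(||v_IJ - v_ij||).

    Each graph maps node -> {neighbour: (theta, position)}.
    """
    total = 0.0
    for data_neighbours in data_graph.values():
        best_neighbourhood = 0.0
        for model_neighbours in model_graph.values():
            score = 0.0
            for v_data in data_neighbours.values():
                # Keep only the best-matching model edge for this data edge.
                score += max(
                    (gaussian_kernel(attribute_distance(v_model, v_data), sigma)
                     for v_model in model_neighbours.values()),
                    default=0.0,
                )
            best_neighbourhood = max(best_neighbourhood, score)
        total += best_neighbourhood
    return total

def best_match(database, model_graph, sigma=0.5):
    """Return the data-base index whose graph maximises the similarity to the model."""
    return max(database, key=lambda d: relational_similarity(database[d], model_graph, sigma))
```

Note that the measure is a sum over data-graph nodes and is not normalised, so when a graph is compared with itself every edge contributes a kernel value of 1 and the result equals the number of directed edges.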

5 Recognition Experiments

We provide some examples to illustrate the qualitative orderings that result from the two recognition experiments. The data-base used in our study consists of 2500 line-patterns segmented from a variety of images. There are three classes of image contained within the data-base: trademarks and logos, letters of the alphabet of different sizes and orientations, and 25 aerial images. There are 5 multiple segmentations for each aerial image. We have a digital map for a road network contained in two of the images. Since the aerial images are obtained using a line-scan process, they are subject to barrel distortion and are deformed with respect to the map.

Figures 2 and 3 compare the recognition rankings obtained from the data-base. In each case the left-hand panel is the result of using relational histograms while the right-hand panel is the result of using feature-sets. In each panel the thumbnails are ordered from left-to-right and from top-to-bottom according to decreasing rank. In Figure 2 we show an example of querying the data-base with the letter A. In the case of the feature-sets, the 12 occurrences of the letter A are ranked at the top of the order. It is interesting to note that the noisy versions of the letter are ranked in positions 11 and 12. In the case of the relational histograms the letter A's are more dispersed. The letters K and V disrupt the ordering. Finally, Figure 3 shows the result of querying the data-base with the digital map. In the case of the feature-sets, the eight segmentations of the two images containing the road-pattern are recalled in the top-ranked positions. In the case of the relational histogram, five of the segmentations are top-ranked. Another segmentation is ranked ninth and one segmentation falls outside the top 16.

Fig. 2. The result of querying the data-base with the letter "A". Panels (a), (b), (c).

Fig. 3. The result of querying the data-base with the digital map. Panels (a), (b), (c).

6 Sensitivity Analysis

The aim in this section is to investigate the sensitivity of the two recognition strategies to the systematics of the line-segmentation process. To this end we have simulated the segmentation errors that can occur when line-segments are extracted from realistic image data. Specifically, the different processes that we have investigated are:

- Extra lines: Additional lines with random lengths and angles are created at random locations.
- Missing lines: A fraction of line-segments are deleted at random locations.
- Split lines: A predefined fraction of line-segments are split into two.
- Segment end-point errors: Random displacements are introduced in the end-point positions for a predefined fraction of lines. The distribution of end-point errors is Gaussian with a standard deviation of 4 pixels.
- Combined errors: Here we have mixed the four different segment errors described above in equal proportion.

The performance measure used in our sensitivity analysis is the retrieval accuracy. This is the fraction of queries that return a correct recognition. We query the data-base with line patterns that are known to have a number of counterparts. Here the query pattern is a distorted version of the target in the data-base. An example is furnished by the digital map described earlier, which is a barrel-distorted version of the target.

Figure 4 compares the retrieval accuracy as a function of the fraction of lines that are subjected to segmentation errors. In the case of the relational histogram (Figure 4a) performance does not degrade until the fraction of errors exceeds 20%. The most destructive types of error are line-splitting, line segment end-point errors and the addition of extra lines. The line-splitting introduces additional combinatorial background that swamps the query pattern. The method is significantly less sensitive to missing lines and performs well under combined errors. In the case of the feature-sets (Figure 4b) the overall performance is much better. At large errors it is only missing lines that limit the effectiveness of the technique. However, the onset of errors occurs when as few as 40% of the lines are deleted. The line-patterns are least sensitive to segment end-point errors. In the case of both line-addition and line-splitting there is an onset of errors when the fraction of segment errors is about 20 percent. However, at larger fractions of segmentation errors the overall effect is significantly less marked than in the case of line-deletions.
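The five corruption processes investigated in this section could be simulated along the following lines. This is a sketch under stated assumptions, not the procedure actually used in the experiments. Only the Gaussian end-point error with a standard deviation of 4 pixels comes from the text. The function names, the image extent, the maximum length of an extra line, the choice of a contiguous split at a random interior point, and the order in which the combined errors are applied are all hypothetical. The `fraction` argument is assumed to lie between 0 and 1.

```python
import math
import random

def _pick(n_items, fraction, rng):
    """Indices of a random subset containing the given fraction of the items."""
    return set(rng.sample(range(n_items), round(fraction * n_items)))

def add_extra_lines(segments, fraction, rng, extent=256.0, max_length=50.0):
    """Append lines with random position, length and angle (extent, max_length assumed)."""
    extra = []
    for _ in range(round(fraction * len(segments))):
        x, y = rng.uniform(0.0, extent), rng.uniform(0.0, extent)
        length, angle = rng.uniform(1.0, max_length), rng.uniform(0.0, 2.0 * math.pi)
        extra.append(((x, y), (x + length * math.cos(angle), y + length * math.sin(angle))))
    return list(segments) + extra

def drop_lines(segments, fraction, rng):
    """Delete a random fraction of the segments."""
    dropped = _pick(len(segments), fraction, rng)
    return [s for k, s in enumerate(segments) if k not in dropped]

def split_lines(segments, fraction, rng):
    """Split a random fraction of the segments into two contiguous pieces."""
    chosen = _pick(len(segments), fraction, rng)
    out = []
    for k, (p, q) in enumerate(segments):
        if k in chosen:
            t = rng.uniform(0.25, 0.75)  # assumed range for the split point
            mid = (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
            out.extend([(p, mid), (mid, q)])
        else:
            out.append((p, q))
    return out

def jitter_endpoints(segments, fraction, rng, sigma=4.0):
    """Displace the end-points of a random fraction of segments by Gaussian noise."""
    chosen = _pick(len(segments), fraction, rng)
    out = []
    for k, (p, q) in enumerate(segments):
        if k in chosen:
            p = (p[0] + rng.gauss(0.0, sigma), p[1] + rng.gauss(0.0, sigma))
            q = (q[0] + rng.gauss(0.0, sigma), q[1] + rng.gauss(0.0, sigma))
        out.append((p, q))
    return out

def combined_errors(segments, fraction, rng):
    """Apply the four processes in equal proportion (the order is an assumption)."""
    share = fraction / 4.0
    out = add_extra_lines(segments, share, rng)
    out = drop_lines(out, share, rng)
    out = split_lines(out, share, rng)
    return jitter_endpoints(out, share, rng, sigma=4.0)
```

Passing a seeded generator such as `random.Random(0)` keeps a corruption run repeatable, which is convenient when the same distorted queries need to be presented to both recognition strategies.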

Fig. 4. Effect of various kinds of noise on the retrieval performance: (a) relational histograms, (b) feature sets. (c) Worst ranking position. Each panel plots the percentage of lines affected by noise (0 to 100) on the horizontal axis. Panels (a) and (b) show the accuracy of retrieval and panel (c) shows the worst ranking position, with one curve each for extra lines, missing lines, split lines, end-point errors and combined errors.

We now turn our attention to how the two recognition strategies may be integrated. The idea is to use the relational histogram as a filter that can be applied to the data-base to limit the search via feature-set comparison. The important issue is therefore the rank threshold that can be applied to the histogram similarity measure. The threshold should be set such that the probability of false rejection is low while the number of images that remain to be verified is small. To address this question we have conducted the following experiment. We have constructed a data-base of some 2500 line-patterns. The data-base contains several groups of images which are variations of the same object. Each group contains 10 variations. In Figure 4(c) we show the result of querying the data-base with an object selected from each group. The plot shows the worst ranked member of the group as a function of the amount of added image noise. The plot shows a different curve for each of the five different noise types listed above. The main conclusion to be drawn from this plot is that additional lines and end-point segment errors have the most disruptive effect on the ordering of the rankings. However, provided that less than 20% of the line-segments are subject to error, then the data-base can be pruned to 1% of its original size using the relational histogram comparison. If a target pruning rate of 25% is desired then the noise-level can be as high as 75%.
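The integration of the two strategies, and the worst-rank statistic used to choose the rank threshold, can be sketched generically as follows. This is a hedged illustration only: the function names are hypothetical, and the two scoring functions are passed in as parameters rather than tied to any particular implementation of the histogram distance or the feature-set similarity.

```python
import math

def rank_by_distance(query, database, coarse_distance):
    """Data-base indices sorted by increasing coarse (histogram) distance to the query."""
    return sorted(database, key=lambda d: coarse_distance(database[d], query))

def prune_then_refine(query, database, coarse_distance, fine_similarity, keep_fraction=0.01):
    """Keep the top-ranked fraction under the coarse distance, then re-rank that
    shortlist by decreasing fine (feature-set) similarity."""
    ranked = rank_by_distance(query, database, coarse_distance)
    n_keep = max(1, math.ceil(keep_fraction * len(ranked)))
    shortlist = ranked[:n_keep]
    return sorted(shortlist, key=lambda d: fine_similarity(database[d], query), reverse=True)

def worst_rank(query, database, coarse_distance, group):
    """1-based rank of the worst-placed member of a group of known counterparts."""
    ranked = rank_by_distance(query, database, coarse_distance)
    return max(ranked.index(member) for member in group) + 1
```

Under this reading, the pruning threshold is safe for a given query when `worst_rank` does not exceed the number of patterns kept by `prune_then_refine`, since every known counterpart then survives the histogram filter.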

7 Discussion and Conclusion

The main contribution in this paper has been to demonstrate some of the noise sensitivity systematics that limit the retrieval accuracy that can be achieved with two simple line-pattern recognition schemes. The first is based on pairwise geometric histogram comparison. The second involves comparing the set of pairwise geometric attributes. Our study reveals that the two methods have rather different noise systematics. The histogram-based method is most sensitive to noise processes that swamp the existing pattern. These include the addition of clutter and the fragmentation of existing lines. The feature-set based method, on the other hand, is relatively insensitive to the addition of line segments. However, it is more sensitive to the deletion of line segments.

References

1. T. Gevers and A. Smeulders. Image indexing using composite color and shape invariant features. IEEE ICCV'98, pages 576-581, 1998.
2. G. L. Gimelfarb and A. K. Jain. On retrieving textured images from an image database. Pattern Recognition, 29(9):1461-1483, 1996.
3. B. Huet, A. D. J. Cross, and E. R. Hancock. Graph matching for shape retrieval. Advances in Neural Information Processing Systems 11, edited by M. J. Kearns, S. A. Solla and D. A. Cohn, MIT Press, 1998. To appear (available May 1999).
4. B. Huet and E. R. Hancock. Fuzzy relational distance for large-scale object recognition. IEEE CVPR'98, pages 138-143, June 1998.
5. B. Huet and E. R. Hancock. Relational histograms for shape indexing. IEEE ICCV'98, pages 563-569, Jan 1998.
6. A. K. Jain and A. Vailaya. Image retrieval using color and shape. Pattern Recognition, 29(8):1233-1244, 1996.
7. R. W. Picard. Light-years from Lena: Video and image libraries of the future. IEEE ICIP'95, 1:310-313, 1995.
8. W. J. Rucklidge. Locating objects using the Hausdorff distance. IEEE ICCV'95, pages 457-464, 1995.
9. M. J. Swain and D. H. Ballard. Indexing via colour histograms. IEEE ICCV'90, pages 390-393, 1990.
10. N. A. Thacker, P. A. Riocreux, and R. B. Yates. Assessing the completeness properties of pairwise geometric histograms. Image and Vision Computing, 13(5):423-429, June 1995.
11. R. Wilson and E. R. Hancock. Structural matching by discrete relaxation. IEEE PAMI, 19(6):634-648, June 1997.
