Spatial Similarity-Based Retrievals and Image Indexing by Hierarchical Decomposition

Imran Ahmad and William I. Grosky
Multimedia Information Systems Laboratory
Department of Computer Science
Wayne State University
Detroit, MI 48202, USA
E-mail: [email protected], [email protected]

Abstract

For efficient search and spatial similarity-based retrieval of image contents, this paper introduces a new symbolic image representation and indexing technique. In this technique, an image is recursively decomposed into a spatial arrangement of feature points while preserving the spatial relationships among its various components. Quadtrees are used to manage the decomposition hierarchy and help in quantifying the measure of similarity. This scheme is incremental in nature and can be adapted to find a match at various levels of detail, from coarse to fine. The approach is translation, rotation and scale independent. For search and retrieval, a two-phase indexing scheme based on image signatures and quadtree matching is introduced. For a given query image, a facility is provided to rank-order the retrieved spatially similar images from the image database for subsequent browsing and user selection.

Keywords: Image Databases, Symbolic Image Representation, Image Indexing, Spatial Similarity-Based Retrieval, Signature-Based Filtering

1. Introduction

Visual information is an integral part of any multimedia information system. In many cases, most of this information consists of images, which require large amounts of external storage and complex operations for efficient management and retrieval. Existing methods for image storage and retrieval are based largely on the traditional approach of manual image annotation, which describes salient image features. Such annotations are quite subjective, however, and are often inadequate

to describe the true nature of the pictorial information. Therefore, there is a need for efficient automated techniques to store and retrieve images based on their contents [8]. However, different images of the same entity are usually not identical, having been taken from different locations and at different times. Hence, an exact match on the basis of image contents is generally not practical. To accommodate this natural variability in search and retrieval, the concept of similarity is required [6, 9]. In similarity-based retrieval, only those images which share the query domain and differ from it by no more than an allowable threshold [8] are retrieved from the database. Also, since a database may contain many similar images, a facility is needed to browse through the set of retrieved images for final selection by the user [7]. Retrieval of images based on spatial relationships among image objects is generally known as spatial similarity-based retrieval. This type of retrieval has been identified as an important class of similarity-based retrievals [8, 10] and is used in such applications as geographic and medical information systems.

The purpose of an index in a traditional database is to help organize the information for more efficient retrieval. The purpose of an index in an image database, however, is to filter out unwanted images by preprocessing, thus reducing the search space to only instances of a plausible match [2], as well as to rank-order the images which remain. Retrieval operations using actual images are time consuming and expensive. For efficient search and retrieval operations, an abstract or symbolic representation of a physical image is required. Such a representation is an abstraction of the spatial and semantic information of the image contents and is used for comparison purposes. The corresponding actual images are retrieved from the database only after the symbolic images have been successfully matched.

Several schemes for data modeling and symbolic image representation have been proposed [1, 4, 9, 13]. Some of these techniques [4, 13] require extensive use of image processing techniques for feature detection and object classification, as well as extensive manual annotation. One of the earliest schemes for symbolic representation was proposed in [3]. In this scheme, a picture is considered as a matrix of symbols, where each symbol corresponds to an object in the real image. A 2D string is obtained by symbolic projection of these symbols along the x and y axes, thus preserving the relative positions of the image components. Since a query image can also be transformed into a 2D string, the problem of pictorial information retrieval then becomes an instance of the longest common subsequence problem. However, this is equivalent to finding the maximal clique of a graph [9], which is NP-complete. Therefore, any such solution is not computationally feasible when dealing with images having a large number of objects. The encoding of spatial information in a 2D string makes it unfit for complete description and representation of spatial relationships between the constituent objects of an arbitrarily complex image [17] and for recognizing its rotational variants [9]. To overcome such limitations, different variations of the 2D string have been proposed. These include extended 2D strings [14], 2D C-strings [15], 2D C+-strings [13], 2-D B-strings [22], and the use of signature files with 2D B-strings [16, 17].

Definition 3.4 The ith approximation of a quadtree T is the quadtree consisting of all nodes of T at levels less than or equal to i. It is denoted by T^(i).
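For concreteness, the recursive decomposition into a quadtree of feature points and the ith approximation T^(i) can be sketched as follows. This is a minimal illustration only: the `QuadNode` class, the quadrant ordering, and the boundary convention are our own assumptions, not the paper's implementation, and the sketch assumes distinct feature points given as (x, y) coordinates in a square region.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]

@dataclass
class QuadNode:
    occupancy: int                                            # feature points under this node
    children: List["QuadNode"] = field(default_factory=list)  # 4 subtrees, or [] for a leaf

def build_quadtree(points: List[Point], x0: float, y0: float, size: float) -> QuadNode:
    """Recursively decompose a square region until every feature point
    lies in its own quadrant, so each leaf has occupancy 0 or 1
    (assumes the points are distinct)."""
    node = QuadNode(occupancy=len(points))
    if len(points) <= 1:
        return node
    half = size / 2.0
    # Quadrant ordering (an assumption): SW, SE, NW, NE
    quadrants = [(x0, y0), (x0 + half, y0), (x0, y0 + half), (x0 + half, y0 + half)]
    for (qx, qy) in quadrants:
        inside = [(x, y) for (x, y) in points
                  if qx <= x < qx + half and qy <= y < qy + half]
        node.children.append(build_quadtree(inside, qx, qy, half))
    return node

def approximation(t: QuadNode, i: int) -> QuadNode:
    """The ith approximation T^(i): keep only the nodes at levels <= i."""
    if i == 0 or not t.children:
        return QuadNode(t.occupancy)
    return QuadNode(t.occupancy, [approximation(c, i - 1) for c in t.children])

# Three feature points in a 100 x 100 image
tree = build_quadtree([(10, 10), (80, 20), (70, 90)], 0.0, 0.0, 100.0)
print(tree.occupancy, [c.occupancy for c in tree.children])  # 3 [1, 1, 0, 1]
```

Here the 0th approximation is just the root node with its occupancy, and deeper approximations reveal more of the spatial arrangement, which is what makes coarse-to-fine matching possible.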

If T1, T2, T3 and T4 are quadtrees whose roots have occupancies m1, m2, m3 and m4, respectively, where m1 + m2 + m3 + m4 > 0, then the tree whose root has occupancy m1 + m2 + m3 + m4 and whose four subtrees are T1, T2, T3 and T4 is a quadtree.

Definition 3.5 A quadtree is complete iff each leaf node has an occupancy of 0 or 1.

We now define a distance function d(T1, T2) that computes the distance between two quadtrees T1 and T2, as follows:

Definition 3.6 Consider the following cases.

Case 1: Suppose height(T1) = height(T2) = 1 and occupancy(root(T1)) + occupancy(root(T2)) = 0. We then define

d(T1, T2) = 0

Case 2: Suppose height(T1) = height(T2) = 1 and occupancy(root(T1)) + occupancy(root(T2)) > 0. For M = occupancy(root(T1)) and N = occupancy(root(T2)), we then define

d(T1, T2) = |M - N| / max(M, N)

Case 3: Suppose height(T1) = 1, occupancy(root(T1)) = 0, and height(T2) > 1. We then define

d(T1, T2) = 1

Case 4: Suppose height(T1) = 1, occupancy(root(T1)) = 1, height(T2) > 1, and each child of the root node of T2 has an occupancy greater than 0. For N = occupancy(root(T2)), we then define

d(T1, T2) = |N - 1| / N

Case 5: Suppose height(T1) = 1, occupancy(root(T1)) = 1, height(T2) > 1, and at least one child of the root node of T2 has an occupancy equal to 0. We then define

d(T1, T2) = 1

Case 6: Suppose height(T1) > 1 and height(T2) > 1. For 1 <= j <= 4, let the subtrees of T1 and T2 determined by the nodes having coordinate sequence j be called T1j and T2j, respectively. Let occupancy(root(T1)) = M and occupancy(root(T2)) = N. For 1 <= j <= 4, let occupancy(root(T1j)) = mj and occupancy(root(T2j)) = nj. We then define

d(T1, T2) = max( sum_{j=1..4} (mj/M) d(T1j, T2j), sum_{j=1..4} (nj/N) d(T1j, T2j) )

We now have the following theorem:

Theorem 3.1 Let T1 and T2 be two complete quadtrees. Then, for i >= 0,

d(T1^(i), T2^(i)) <= d(T1^(i+1), T2^(i+1))

Proof:

Base Case (i = 0):

Case 1: height(T1) = height(T2) = 1. Then we have T1^(0) = T1^(1) and T2^(0) = T2^(1), and our result follows.

Case 2: height(T1) = 1, occupancy(root(T1)) = 0, and height(T2) > 1. It follows that T1^(0) = T1^(1) = T1. Suppose that occupancy(root(T2^(0))) = N. We then have that N > 0, from which it follows by Case 2 of Definition 3.6 that d(T1^(0), T2^(0)) = 1. Now, height(T2^(1)) = 2, so by Case 3 of Definition 3.6, d(T1^(1), T2^(1)) = 1 as well, from which our result follows.

Case 3: height(T1) = 1, occupancy(root(T1)) = 1, and height(T2) > 1. It follows that T1^(0) = T1^(1) = T1. Also, height(T1^(1)) = 1. Suppose that occupancy(root(T2^(0))) = N. We then have that N > 0, from which it follows by Case 2 of Definition 3.6 that d(T1^(0), T2^(0)) = (N - 1)/N. Now, height(T2^(1)) = 2. Thus, from Cases 4 and 5 of Definition 3.6, we have that d(T1^(1), T2^(1)) >= (N - 1)/N, from which our result follows.

Case 4: height(T1) > 1 and height(T2) > 1. For 1 <= j <= 4, let the subtrees of T1 and T2 determined by the nodes having coordinate sequence j be called T1j and T2j, respectively. Let occupancy(root(T1)) = M and occupancy(root(T2)) = N. For 1 <= j <= 4, let occupancy(root(T1j)) = mj and occupancy(root(T2j)) = nj. Without loss of generality, assume that M >= N. Then, by Case 2 of Definition 3.6, we have that d(T1^(0), T2^(0)) = (M - N)/M.

Consider the quantity Q = sum_{j=1..4} (mj/M) |mj - nj| / max(mj, nj), where, if mj = nj = 0, the jth summand is equal to 0. For 1 <= d <= 4, if md >= nd, then (md/M) |md - nd| / max(md, nd) = (md - nd)/M, while if md < nd, then (md/M) |md - nd| / max(md, nd) >= 0 > (md - nd)/M. Thus, Q >= sum_{j=1..4} (mj - nj)/M = (M - N)/M.

Now, by Case 6 of Definition 3.6, together with Cases 1 and 2 applied to the height-one subtrees of T1^(1) and T2^(1), we have that d(T1^(1), T2^(1)) >= Q. Thus, d(T1^(1), T2^(1)) >= Q >= (M - N)/M = d(T1^(0), T2^(0)), and our result follows.

Induction Hypothesis: Assume that the theorem is true for i = 0, ..., k, where k >= 1. We will show that the theorem is true for i = k + 1. For 1 <= j <= 4, let the subtrees of T1 and T2 determined by the nodes having coordinate sequence j be called T1,j and T2,j, respectively. Let occupancy(root(T1)) = M and occupancy(root(T2)) = N. For 1 <= j <= 4, let occupancy(root(T1,j)) = mj and occupancy(root(T2,j)) = nj. We then have that the subtrees of T1^(k) and T2^(k) determined by the nodes having coordinate sequence j are T1,j^(k-1) and T2,j^(k-1), respectively. Also, the subtrees of T1^(k+1) and T2^(k+1) determined by the nodes having coordinate sequence j are T1,j^(k) and T2,j^(k), respectively. We also have that

d(T1^(k), T2^(k)) = max[ sum_{j=1..4} (mj/M) d(T1,j^(k-1), T2,j^(k-1)), sum_{j=1..4} (nj/N) d(T1,j^(k-1), T2,j^(k-1)) ]

and

d(T1^(k+1), T2^(k+1)) = max[ sum_{j=1..4} (mj/M) d(T1,j^(k), T2,j^(k)), sum_{j=1..4} (nj/N) d(T1,j^(k), T2,j^(k)) ].

But, by the induction hypothesis, d(T1,j^(k-1), T2,j^(k-1)) <= d(T1,j^(k), T2,j^(k)) for each j, and our result follows. QED.



Theorem 3.1 is quite important, as it implies that for two trees T1 and T2, if d(T1^(i), T2^(i)) > epsilon, then d(T1^(i+1), T2^(i+1)) > epsilon, where epsilon >= 0 is the allowable deviation in similarity, or threshold, and i >= 0. This means that we can walk down both the database tree and the query tree in breadth-first fashion and observe whether the distance between the given approximations of these trees is larger than our threshold. If this is the case, we may eliminate the database tree from further consideration, as all subsequent approximations, and indeed the original trees themselves, will have a distance from each other larger than the given threshold.
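To make the elimination procedure concrete, the distance of Definition 3.6 and the threshold test licensed by Theorem 3.1 can be sketched as follows. This is a minimal sketch rather than the authors' implementation: the `QuadNode` class and helper names are our own, we assume complete quadtrees (so a height-one tree has occupancy 0 or 1), and we treat d as symmetric, letting the shallower tree play the role of T1 in Cases 3-5.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class QuadNode:
    occupancy: int                                            # feature points under this node
    children: List["QuadNode"] = field(default_factory=list)  # 4 subtrees, [] for a leaf

def height(t: QuadNode) -> int:
    return 1 if not t.children else 1 + max(height(c) for c in t.children)

def approximation(t: QuadNode, i: int) -> QuadNode:
    """The ith approximation T^(i): the tree truncated to levels <= i."""
    if i == 0 or not t.children:
        return QuadNode(t.occupancy)
    return QuadNode(t.occupancy, [approximation(c, i - 1) for c in t.children])

def d(t1: QuadNode, t2: QuadNode) -> float:
    """Distance function of Definition 3.6 for complete quadtrees."""
    if height(t1) > height(t2):          # assumed symmetry: T1 is the shallower tree
        t1, t2 = t2, t1
    m, n = t1.occupancy, t2.occupancy
    if height(t2) == 1:                  # both trees are single nodes
        return 0.0 if m + n == 0 else abs(m - n) / max(m, n)   # Cases 1 and 2
    if height(t1) == 1:                  # t1 is a single node, t2 is deeper
        if m == 0:
            return 1.0                   # Case 3
        if all(c.occupancy > 0 for c in t2.children):
            return abs(n - 1) / n        # Case 4
        return 1.0                       # Case 5
    # Case 6: both heights > 1 -- recurse over the four corresponding subtrees
    # (internal nodes of complete decomposition trees have occupancy > 0)
    s1 = sum((c1.occupancy / m) * d(c1, c2) for c1, c2 in zip(t1.children, t2.children))
    s2 = sum((c2.occupancy / n) * d(c1, c2) for c1, c2 in zip(t1.children, t2.children))
    return max(s1, s2)

def passes_threshold(db_tree: QuadNode, query: QuadNode, epsilon: float) -> bool:
    """Breadth-first elimination justified by Theorem 3.1: the distances of
    successive approximations never decrease, so a database tree can be
    discarded at the first level whose distance exceeds the threshold."""
    for i in range(max(height(db_tree), height(query))):
        if d(approximation(db_tree, i), approximation(query, i)) > epsilon:
            return False
    return True

# Example: a query tree and a dissimilar candidate
query = QuadNode(3, [QuadNode(1), QuadNode(1), QuadNode(0), QuadNode(1)])
candidate = QuadNode(1, [QuadNode(1), QuadNode(0), QuadNode(0), QuadNode(0)])
print(round(d(query, candidate), 3))  # 0.667
```

With a threshold of 0.5, this candidate is already discarded at the 0th approximation (root occupancies 1 vs. 3 give a distance of 2/3), before any deeper levels are compared, which is exactly the filtering behavior the theorem guarantees.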

4. Experimental Results

For our experiments, the image database consists of 1200 random images of arbitrary size and complexity. Each image contains from 2 to 25 feature points. There are 200 basic independent images, the rest consisting of a known number of geometric variants. The general characteristics of the image data set are summarized in Table 2.

Table 2. Characteristics of the image data set

Number of original images:        200
Images and their variants:        1200
Feature points:                   2 - 25
Rotational variants per image:    1
Scale variants per image:         2
Translation variants per image:   2
Image size (min):                 100 x 100
Image size (max):                 1024 x 1024

Each of the 200 basic images in the database has 5 different variants in terms of the three essential geometric transformations: scaling, translation and rotation. The data set contains 2 scale variants and 2 translation variants of each image, plus a single rotational variant. Our experiments are based on a range of tolerance factor (TF) values to determine successful retrievals. We have tested our approach with TF in {0.0, 0.1, 0.2} to create signatures for the database images. When TF = 0.0, we must have an exact match. The results are collected for different approximations and averaged over the 200 original images in the database. It is important to note that since each of the 200 images has 5 variants, each image has exactly 6 known instances in the database.

Table 3. Tolerance factor and its effect on matching for the nth approximation

Tolerance Factor            0.00     0.10     0.20
Avg. Matched Signatures    11.42    39.95    90.17
Avg. Successful Matches     6.71     7.79     8.99
Avg. Non-Matches            4.71    32.16    81.18

Our query is composed of one of the known images in the database. Therefore, for each instance, we can predict the minimum number of retrievals. For TF = 0.0, an exact match, there should be only 6 matched instances for each image. However, since the number of feature points in an image ranges between 2 and 25, and the process of recursive decomposition in our current implementation stops when all of the feature points are in distinct quadrants, for images with only 2 or 3 feature points the chances of getting a false match are in general much higher. Moreover, based on the distribution of feature points, it may also be possible to have only one decomposition of the feature image and hence a maximum of two approximations. In such cases, however, the accuracy can be improved significantly by extending the decomposition hierarchy. With higher approximations, the chances of getting false matches are reduced due to the increase in the decomposition depth of the feature image. A comparison of results for the three values of TF is given in Figure 5. These results are in accordance with our proposed theory. Initially, for the root level or the 0th approximation, we start with a large number of successful matches. This is due to the fact that a number of different trees may have the same root occupancy. However, as proved earlier, this number reduces significantly for subsequently increasing approximations. The best results are obtained by matching the entire trees. The entire tree, or the nth approximation, is indicated by -1 in Figure 5.

Figure 5. ith approximation vs. average number of matched instances for TF = (a) 0.0, (b) 0.1 and (c) 0.2

A summary of results for the nth approximation for the three values of TF is given in Table 3, obtained by choosing a known image from the database and finding similar images with threshold = 0.5. As expected, TF = 0.0 resulted in fewer comparisons, and this number increased significantly for the other TFs. In such cases, although we started with a large number of images, due to the characteristics of the distance function only those trees qualified which were similar to the query image within the threshold. For the nth approximation, a comparison of the average number of retrieved signatures and successful matches for the different tolerance factors is given in Figure 6.

Figure 6. A comparison of the average number of retrieved signatures and successful matches for different TFs

5. Conclusion and Research Directions

In this paper, we have presented a symbolic image representation and indexing scheme which differs from earlier proposed techniques in many respects. As it stands, this scheme is independent of the size, translation and orientation of the query images. The symbolic representation does not involve any string comparisons, the search space is reduced significantly in the earlier phases of comparison to improve retrieval times, and non-matches are eliminated early due to the scheme's incremental nature. We are working on schemes to find a match at finer levels of detail by extending the decomposition hierarchy. This will allow us to locate a feature point at more precise positions for applications requiring greater levels of detail, such as medical systems. We are also working on extensions of this scheme to incorporate domain-dependent spatial similarity-based retrievals and on devising schemes to increase the flexibility of the querying process. This extension will enable us to formulate queries by simply describing the spatial relationships among the objects in terms of their relative positions.

References

[1] Y. A. Aslandogan, C. Thier, C. T. Yu, and C. Liu. Design, Implementation and Evaluation of SCORE (a System for COntent based REtrieval of pictures). In Proceedings of the 11th IEEE International Conference on Data Engineering, pages 280-287, Mar. 1995.
[2] A. D. Bimbo, P. Pala, and S. Santini. Visual Image Retrieval by Elastic Deformation of Object Shapes. In Proceedings of the IEEE Symposium on Visual Languages, pages 216-223, Oct. 1994.
[3] S. K. Chang and S. H. Liu. Iconic Indexing by 2D Strings. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 9(3):413-428, May 1987.
[4] S.-K. Chang, Q.-Y. Shi, and C.-W. Yan. Iconic Indexing by 2D Strings. In 1986 IEEE Computer Society Workshop on Visual Languages, pages 12-21, Dallas, Texas, June 1986.
[5] C. Faloutsos and S. Christodoulakis. Signature Files: An Access Method for Documents and Its Analytical Performance Evaluation. ACM Transactions on Office Information Systems, 2(4):267-288, Oct. 1984.
[6] W. I. Grosky and Z. Jiang. Hierarchical Approach to Feature Indexing. Image and Vision Computing, 12(5):275-283, June 1994.
[7] W. I. Grosky, Z. Jiang, and I. Ahmad. Structured Browsing of Image Databases. In Proceedings of Multimedia Information Systems and Hypermedia, pages 7-14, Japan, Mar. 1995.
[8] W. I. Grosky and R. Mehrotra. Image Database Management. In Advances in Computers, pages 237-291. Academic Press, N.Y., 1992.
[9] V. Gudivada.
