Design and Evaluation of Algorithms for Image Retrieval By Spatial Similarity¹

Venkat N. Gudivada
Department of Computer Science
Ohio University
Athens, OH 45701

Vijay V. Raghavan
The Center for Advanced Computer Studies
University of SW Louisiana
Lafayette, LA 70504

Abstract

Similarity-based retrieval of images is an important task in many image database applications. A major class of users' requests requires retrieving those images in the database that are spatially similar to the query image. In this paper, we propose an algorithm for computing the spatial similarity between two symbolic images. A symbolic image is a logical representation of the original image where the image objects are uniquely labeled with symbolic names. Spatial relationships in a symbolic image are represented as edges in a weighted graph referred to as the spatial orientation graph. Spatial similarity is then quantified in terms of the number of edges of the spatial orientation graph of the database image that conform to the corresponding edges of the spatial orientation graph of the query image, as well as the extent to which they conform. The proposed algorithm is robust in the sense that it can deal with translation, scale, and rotation variances in images. The algorithm has quadratic time complexity in terms of the total number of objects in both the database and query images. We also introduce the idea of quantifying a system's retrieval quality by having an expert specify the expected rank ordering with respect to each query for a set of test queries. This enables us to comprehensively assess the quality of algorithms for retrieval in image databases. The characteristics of the proposed algorithm are compared with those of the previously available algorithms using a testbed of images. The comparison demonstrates that our algorithm is not only more efficient but also provides a rank ordering of images that consistently matches the expert's expected rank ordering.

¹ This research is supported by the U.S. Department of Defense under Grant No. DAAL03-89-G-0118.

1. Introduction

Recently, there has been widespread interest in various kinds of database management systems for managing information from images, which do not lend themselves to being efficiently stored, flexibly retrieved, and manipulated within the framework of conventional database management systems. The Image Retrieval (IR) problem is concerned with retrieving images that are relevant to users' requests from a large collection of images, referred to as the image database. Tamura and Yokoya provide a survey of image database systems that were in use around the early 1980s [2]. Chock also provides a survey and comparison of the functionality of several image database systems for geographic applications [3]. The solutions that have been proposed to the IR problem can be characterized as ad hoc, at best, since their features have essentially evolved from the idiosyncrasies of diverse application area requirements. A survey of existing approaches to the IR problem and their limitations is given in [1]. To overcome some of these difficulties, a unified model for IR has also been proposed in [1].

The unified model identifies four major types of retrieval in image databases: Retrieval by Browsing, Retrieval by Nonsemantic Attributes, Retrieval by Spatial Constraints, and Retrieval by Semantic Attributes. Retrieval by Browsing is a user-friendly interface for retrieving information from image databases by employing techniques from visual languages. Typically, a browser is used when the user is very vague about his retrieval needs or when the user is unfamiliar with the structure and the types of information available in the database. The attributes that are used to describe high-level domain concepts which the image manifests are referred to as semantic attributes. Specification of these attributes necessarily involves some subjectivity, imprecision, and/or uncertainty. Retrieval by Semantic Attributes queries are formulated using the semantic attributes. Retrieval by Nonsemantic Attributes is similar to retrieval in conventional databases using SQL (Structured Query Language); retrieval is based on a perfect match on the attribute values. Only Retrieval by Spatial Constraints (RSC) is considered in this paper.
RSC facilitates a class of queries that are based on spatial relationships among the objects in an image. In RSC queries, spatial relationships may span a broad spectrum ranging from directional relationships to adjacency, overlap, and containment involving a pair of objects to multiple objects. We partition the RSC queries into two categories:
those that require retrieving all those database images that satisfy as many of the desired spatial relationships indicated in the query as possible, and those that require retrieving only those database images that precisely satisfy all the spatial relationships specified in the query image. For the first category of RSC queries, a function that computes spatial similarity between two images is desired. A spatial similarity function assesses the degree to which the spatial relationships in a database image conform to those specified in the query image. A rank ordering of database images with respect to a query image can then be obtained by applying such a spatial similarity function between the query image and each database image in turn. For the second category of RSC queries, however, spatial similarity functions are not appropriate. Rather, an algorithm is required that provides a yes/no type of response.

When the number of image objects involved in a query is small, it may not be cumbersome to explicitly specify the desired spatial relationships. When this is not the case, an RSC query can be specified elegantly by borrowing techniques from visual languages. Under this scheme, the user specifies a query by placing the graphic icons corresponding to the domain objects in a special window called the sketch pad window. The sketch pad window provides both the graphic icons of the domain objects and the necessary tools for selecting and placing these graphic icons for composing an RSC query. The spatial relationships among the icons in the sketch pad window implicitly indicate the desired spatial relationships among the domain objects in the images to be retrieved. Though the algorithms for the two classes of RSC queries are different, the sketch pad window can be used as the query specification scheme in both cases. In this paper, we propose an algorithm for the first category of RSC queries (i.e., a function for retrieval by spatial similarity).
The algorithm works with symbolic images. Given an image representing information at pixel level (called the physical image),
various image processing and understanding techniques are used to identify the objects in the image and their relative positions within the image. Though this task is computationally expensive, it is performed only at the time of image insertion into the database. Moreover, this task may be carried out in a completely automated fashion or in a (human-assisted) semi-automated fashion depending upon the domain and the complexity of the images. A symbolic image is then obtained by associating a name with each of the domain objects thus identified. Centroid coordinates of the image objects with reference to the image frame are also extracted.

Processing images that represent information only at the pixel level to construct interactive responses to high-level user queries is not economically viable, if not technologically impossible, in a database environment where the number of images tends to be large. The use of symbolic images obviates the need for repeated image understanding. Also, functions for retrieval by spatial similarity based on symbolic images are useful in distributed environments where physical images are stored only at a central location and symbolic images are stored at each local site. The storage required for a symbolic image is negligible compared to that of its physical representation. The decision as to which images are relevant to a query is arrived at by using only symbolic images. Once the relevant images are identified, only those images are transferred from the central location to a local site.

The idea of representing physical images by symbolic images helps to achieve domain independence and is similar to the representation of documents by index terms in Bibliographic Information Systems. In the latter, documents are uniformly represented by index terms in a domain independent fashion.
However, it should be noted that the indexing task itself is domain dependent and complex and is usually performed in a semiautomated fashion in commercially successful Bibliographic Information Systems. We use the terms image, symbolic image, and iconic image (explained in section 4) interchangeably in this paper.
Spatial relationships in a symbolic image are represented as edges in a weighted graph referred to as the spatial orientation graph. Spatial similarity is then quantified based on the number of edges of the spatial orientation graph of the database image that conform to the corresponding edges of the spatial orientation graph of the query image, as well as the extent to which they conform. The algorithm is robust in the sense that it can deal with translation, scale, and rotation variances in images. The algorithm has quadratic time complexity in terms of the total number of objects in both the database and query images.

The retrieval results obtained by using the proposed algorithm are compared with the results obtained by using the algorithm of Lee et al. [5]. Lee et al.'s algorithm is based on a spatial data structure referred to as the 2D-String and has exponential time complexity in terms of the number of objects that are common to the query and the database images. Also, the proposed algorithm is contrasted with the algorithm of Chang & Lee [6]. Chang & Lee's algorithm is based on exhaustively enumerating and storing all the spatial relationships among objects in all the images in the database. We also introduce the idea of using an expert-provided rank ordering of images, relative to each query in a set of test queries, in quantifying a system's retrieval quality. This quantification is based on a measure referred to as the Rnorm [9]. The Rnorm measure enables us to objectively assess the quality of algorithms for retrieval in image databases.

The remainder of the paper is structured as follows. The proposed algorithm for retrieval by spatial similarity is developed in section two. Section three provides a brief introduction to the related work. Section four describes the experimental setup, evaluates the proposed algorithm, and contrasts the performance of the proposed algorithm with that of one of the algorithms discussed in section three using the Rnorm measure. Finally, section five concludes the paper.
2. SIM - An Algorithm for Retrieval by Spatial Constraints

The approach involves the following steps. First, we obtain the symbolic representations of both images and queries. We then construct an edge list corresponding to each symbolic image and query image. A similarity function that computes the degree of closeness between two edge lists is then applied. The last two steps are described below in detail.

Assume that a symbolic image has $n$ objects and associate each of these objects with a vertex in a hypothetical weighted graph. We refer to this graph as the spatial orientation graph. An edge in the spatial orientation graph is a line connecting any two objects, and the weight associated with the edge is the slope of that edge. The collection of all such possible edges for a symbolic image constitutes the edge list for that image. The number of edges in the list is $n(n-1)/2$.

The following procedure is used in generating the edge list. First, the vertices of the spatial orientation graph are lexicographically sorted on the symbolic names associated with the vertices. Second, the vertex names are paired in such a way that the resulting edge names in the list remain in lexicographically sorted order. For example, if a symbolic image has four objects, say, A, B, C and D, then the edges in the list are: {AB, AC, AD, BC, BD, CD}. As we will see later, generating an edge list in this order is useful in obtaining a better time complexity for the proposed similarity algorithm. For each edge in the list, the objects connected by the edge and the slope of the edge are stored in the database; alternatively, an edge list can be generated at query processing time from the symbolic representation of the image. The slope values of edges are used to compute the spatial similarity. Our similarity function takes the edge lists corresponding to two symbolic images as arguments and produces a real number indicating their similarity.
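The edge list construction described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name `build_edge_list` and the dictionary-of-centroids input are assumptions, and the edge direction is stored as an angle from `math.atan2` instead of a raw slope so that vertical edges need no special handling.

```python
import math
from itertools import combinations


def build_edge_list(objects):
    """Return the edge list of a symbolic image.

    `objects` maps a symbolic name to its centroid (x, y).
    Each edge is (name_a, name_b, angle), with name_a < name_b, and the
    list is in lexicographic order of the edge names (AB, AC, ..., CD).
    The angle (radians) of the directed line a -> b stands in for the slope;
    this substitution is an assumption made for the sketch.
    """
    names = sorted(objects)  # step 1: sort vertices by symbolic name
    edges = []
    # step 2: combinations() of a sorted list yields pairs in sorted order
    for a, b in combinations(names, 2):
        (xa, ya), (xb, yb) = objects[a], objects[b]
        edges.append((a, b, math.atan2(yb - ya, xb - xa)))
    return edges


# Hypothetical four-object image; coordinates are made up for illustration.
example = build_edge_list({"D": (3, 0), "B": (1, 1), "A": (0, 0), "C": (2, 0)})
edge_names = [a + b for a, b, _ in example]
```

For the four objects A, B, C and D used in the text, `edge_names` lists the six edges in the same order as the example given there.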
Formally, we define the similarity function for symbolic images, SIM, as

$$\mathrm{SIM} : \{(E_1, E_2)\} \rightarrow \mathbb{R}$$

where $E_1$ represents the edge list of the first image and $E_2$ represents the edge list of the second image. The first of these corresponds to the query image and, in general, $\mathrm{SIM}(E_1, E_2) \neq \mathrm{SIM}(E_2, E_1)$. The details of the SIM algorithm are described below.

Consider two symbolic images $S_1$ and $S_2$ and suppose that we want to compute the similarity of $S_2$ with respect to $S_1$. The image $S_1$ is referred to as the query image and the image $S_2$ is referred to as the database image. Let $E_{qr}$ and $E_{db}$ denote the edge lists corresponding to $S_1$ and $S_2$. Also, let $n_1$ denote the number of objects in $S_1$. Then the number of edges in $E_{qr}$ is $n_1(n_1-1)/2$.

For the sake of clarity, we discuss the impact of the number of edges common to the two images separately from that of the differences in orientation between corresponding edges. Therefore, let us first assume that the slopes of the corresponding edges in the two images are equal. If all the edges of $E_{qr}$ are present in $E_{db}$, then the maximum possible similarity is assigned to $S_2$. Assuming a maximum possible similarity of 100.00, each edge in $E_{db}$ that is also present in $E_{qr}$ contributes a value of

$$\frac{100.00}{n_1(n_1-1)/2}$$

toward the similarity. The smaller the number of edges that are common to $E_{qr}$ and $E_{db}$, the fewer the edges contributing to the similarity value, and hence a lower similarity value should result.

Now consider the case in which the edges common to $E_{qr}$ and $E_{db}$ do not all have the same slope or orientation. Depending upon the degree by which the corresponding edge orientations differ, the contributing factor from an edge toward the similarity value has to be modified. The greater the difference in edge orientations, the larger the reduction in the contributing factor. If the angle between two corresponding edges in $E_{qr}$ and $E_{db}$ is $\theta$ (see below for details), the contributing factor from this edge pair is

$$\frac{100.00}{n_1(n_1-1)/2}\left(\frac{1+\cos\theta}{2}\right).$$

When $\theta = 0^\circ$, the contributing factor is $\frac{100.00}{n_1(n_1-1)/2}$, and when $\theta = 180^\circ$, the contributing factor is 0. The algorithm SIM is shown in Figure 1.

It should be noted that the angle $\theta$ depends not only on the edge orientations but also on the differences in the vertex names/labels of the edges. The angle $\theta$ between two edges is computed as follows. Consider the edges as directed line segments and apply a coordinate translation such that both starting points are at the origin (i.e., the end points of the directed line segments at the origin have the same object labels). Then $\theta$ is defined as the smaller of the two angles between the line segments thus formed, as illustrated in Figure 2 for two instances. Figure 2(A) shows an instance of two directed line segments, and the configuration of these two line segments after the coordinate translation is shown in Figure 2(B). Likewise, Figures 2(C) and 2(D) illustrate the $\theta$ computation for another instance of a pair of directed line segments. A more detailed, alternative method for computing $\theta$ is shown in [1].

The time complexity of SIM is $O(|E_{qr}| + |E_{db}|)$, since the algorithm involves only searching, for each edge in $E_{qr}$, for the corresponding edge in $E_{db}$, and both edge lists are generated in sorted order. The range of SIM is from 0 to 100. It should be noted that SIM is non-symmetric, i.e., $\mathrm{SIM}(E_1, E_2) \neq \mathrm{SIM}(E_2, E_1)$. SIM is robust with respect to both scale and translation variances in the sense that it assigns the highest similarity to an image that is a scale or translation variant of the query image. This is because SIM considers only the relative orientation of edges and neither the length of edges nor their absolute positions. Rotational variance is treated in the next section.
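As a worked illustration of the contributing factor (the numbers are chosen only for illustration and do not come from the test images), take a query image with $n_1 = 4$ objects, so that its edge list has six edges:

```latex
\[
\frac{100}{n_1(n_1-1)/2} = \frac{100}{6} \approx 16.67,
\qquad
\frac{100}{6}\cdot\frac{1+\cos 90^\circ}{2} = \frac{100}{6}\cdot\frac{1}{2} \approx 8.33 .
\]
```

If all six query edges are found in the database image, five of them with $\theta = 0^\circ$ and one with $\theta = 90^\circ$, the similarity would be about $5 \times 16.67 + 8.33 \approx 91.67$.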

Algorithm SIM
    Similarity ← 0.0
    n1 ← number of objects in the query image
    E_qr ← edge list of the query image
    E_db ← edge list of the database image
    For each edge e_i ∈ E_qr, find the corresponding edge e_j ∈ E_db
        If the corresponding edge is found, calculate the angle θ between e_i and e_j
            Similarity ← Similarity + (100.0 / (n1 (n1 − 1) / 2)) · ((1 + cos θ) / 2)

Figure 1: Algorithm SIM for Spatial Similarity
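The algorithm in Figure 1 can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the authors' code: the function name `sim` and the dictionary-of-centroids input are hypothetical, corresponding edges are found by dictionary lookup rather than by a merge over two sorted edge lists, and $\cos\theta$ is obtained from the dot product of the two directed edges. Edges of zero length (coincident centroids) are skipped, which is also an assumption.

```python
import math


def sim(query, database):
    """Spatial similarity of `database` with respect to `query`, in [0, 100].

    Both arguments map symbolic object names to centroids (x, y).
    """
    n1 = len(query)
    if n1 < 2:
        return 0.0
    weight = 100.0 / (n1 * (n1 - 1) / 2)  # maximum contribution of one edge
    names = sorted(query)
    similarity = 0.0
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a not in database or b not in database:
                continue  # edge of the query has no counterpart
            # Directed edges a -> b, translated so both start at the origin.
            qx, qy = query[b][0] - query[a][0], query[b][1] - query[a][1]
            dx, dy = database[b][0] - database[a][0], database[b][1] - database[a][1]
            norm = math.hypot(qx, qy) * math.hypot(dx, dy)
            if norm == 0.0:
                continue  # degenerate edge; skipped by assumption
            cos_theta = max(-1.0, min(1.0, (qx * dx + qy * dy) / norm))
            similarity += weight * (1.0 + cos_theta) / 2.0
    return similarity
```

Because only edge directions enter the computation, a scaled or translated copy of the query receives the same score as the query itself, which matches the robustness property stated in the text.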

2.1 Incorporating Rotational Invariance

We need to introduce the following terminology and notation for further discussion in this section. Let $O_{qr}$ and $O_{db}$ denote the sets of image objects in the query and database images. For each $o_i$ ($o_j$, $i \neq j$) $\in O_{qr} \cap O_{db}$, let $o_i^{qr}$ ($o_j^{qr}$) and $o_i^{db}$ ($o_j^{db}$) denote the corresponding objects in the query and the database images. Define $\theta_{ij}$ ($i \neq j$) as the (smaller) angle that the line joining the centroids of image objects $o_i^{qr}$ and $o_j^{qr}$ makes with the line joining the centroids of image objects $o_i^{db}$ and $o_j^{db}$. We call a database image a perfect rotational variant of a query image if all $\theta_{ij}$ are equal and $O_{db} = O_{qr} = O_{db} \cap O_{qr}$. Under this definition, in Figure 5, Image 2 is a perfect rotational variant of Image 1. Image 2 is obtained by rotating all the objects of Image 1 as a whole by 45 degrees in the counterclockwise direction. However, more often we are faced with situations where not all $\theta_{ij}$ are equal. Under such conditions, we say that such an image is a multiple rotational variant of the original image, since the image can be interpreted as a rotational variant of the original image corresponding to each group of objects of the image having the same magnitude of rotation.

Figure 2: Definition of θ

To determine the largest subimage that has rotated as a unit, we propose to partition the image objects in $O_{db}$ into various groups. Each group is characterized by a representative magnitude of rotation. Intuitively, if we have too many such groups in the database image, then the database image is not a rotational variant of the query image. On the other hand, if we have only a few groups, then the database image is a multiple rotational variant of the query image corresponding to each of the groups. We would like to recognize only the multiple rotational variant of the database image that would yield the maximum similarity with the query image as the actual rotational variant of the query image. If a group has large cardinality and hence a greater number of edges, the contribution from this group toward the similarity value is also greater. Therefore, identifying the group that has the largest cardinality and then rotating all the objects in the database image in the direction opposite to the direction of rotation of the group, by a magnitude equal to the representative magnitude of the group, would align the database image spatially closer to the query image and yield maximal similarity.

As an example, consider Images 9 and 11 shown in Figure 5. We refer to the image objects in Image 9 along the middle column as subgroup 1 objects and to the remaining image objects plus the image object at the center of the image as subgroup 2 objects. Image 11 can be interpreted as a rotational variant of Image 9 where only subgroup 1 objects have rotated about the image center point in the clockwise direction. Subgroup 2 objects have not undergone any rotation and their current locations are their original locations. Alternatively, Image 11 can be interpreted as a rotational variant of Image 9 where only subgroup 2 objects have rotated about the image center point in the counterclockwise direction. Subgroup 1 objects have not undergone any rotation and their current locations are their original locations. Identifying the latter as the true rotational variant (since the cardinality of subgroup 2 is larger than that of subgroup 1) and rotating all the objects in Image 11 in the clockwise direction will yield maximum similarity with Image 9.
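The grouping idea can be sketched in Python as follows. This is only an illustration under several assumptions: the helper names `pair_rotations`, `dominant_rotation`, and `rotate` are hypothetical; signed rotation angles are used so that the direction of rotation is retained; and a simple gap-based grouping of the sorted angles (with an arbitrary tolerance `tol`) stands in for the clustering step, which the text attributes to [4] and does not spell out. The sketch also ignores wrap-around of angles near ±180 degrees.

```python
import math
from itertools import combinations


def pair_rotations(query, database):
    """Sorted signed rotations (degrees) of each common edge, database vs. query."""
    common = sorted(set(query) & set(database))
    rotations = []
    for a, b in combinations(common, 2):
        q = math.atan2(query[b][1] - query[a][1], query[b][0] - query[a][0])
        d = math.atan2(database[b][1] - database[a][1], database[b][0] - database[a][0])
        diff = (math.degrees(d - q) + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
        rotations.append(diff)
    return sorted(rotations)


def dominant_rotation(rotations, tol=5.0):
    """Mean of the largest group of sorted angles; a gap > tol starts a new group."""
    if not rotations:
        return 0.0
    clusters = [[rotations[0]]]
    for r in rotations[1:]:
        if r - clusters[-1][-1] <= tol:
            clusters[-1].append(r)
        else:
            clusters.append([r])
    best = max(clusters, key=len)
    return sum(best) / len(best)


def rotate(points, degrees, pivot=(0.0, 0.0)):
    """Rotate every centroid counterclockwise by `degrees` about `pivot`."""
    t = math.radians(degrees)
    px, py = pivot
    return {
        name: ((x - px) * math.cos(t) - (y - py) * math.sin(t) + px,
               (x - px) * math.sin(t) + (y - py) * math.cos(t) + py)
        for name, (x, y) in points.items()
    }
```

A database image would then be rotated by the negative of the value returned by `dominant_rotation` before a similarity routine is applied to it.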
Let $\theta_{max}$ be the magnitude of rotation corresponding to the database image object group that has the largest cardinality; it is calculated using a clustering algorithm [4]. The input to this algorithm is a sorted list of the $\theta_{ij}$ values. The output of the algorithm is a set of clusters. $\theta_{max}$ is the cluster mean of the cluster that has the largest cardinality. Now all the objects in $O_{db}$ are
rotated by $-\theta_{max}$ about an arbitrary pivot point. Then we apply the SIM algorithm. Let $n$ be the cardinality of $O_{qr} \cap O_{db}$. Finding the rotations of vertices in $O_{qr} \cap O_{db}$ takes $O(n)$ time and sorting these vertices takes $O(n \log n)$ time. The time complexity of the clustering algorithm is $O(n^2)$. The extended algorithm that incorporates this enhancement to recognize the differences in images due to rotations is referred to as SIMR. The overall complexity of SIMR is quadratic in the total number of objects in both images, since the computation associated with SIM dominates that of the extension.

It should be noted that SIMR recognizes perfect rotational variants of an image, which are generated as a result of the rotation of the original image about any arbitrary pivot point (this is proved below). Also, SIMR recognizes images that are multiple rotational variants of the original image, provided that a significant number of image objects have rotated as a unit about any arbitrary pivot point (illustrated through an experimental study in section 4). To prove that SIMR recognizes perfect rotational variants, we introduce the following lemmas and theorems.

Lemma 1: If a straight line segment is rotated about an arbitrary pivot point $P$ by an angle $\theta$, the (smaller) angle between the line segments corresponding to the original and the new configuration of the line segment (due to the rotation about the pivot point $P$) is also $\theta$.

Proof: Consider the illustration shown in Figure 3. $O_1O_2$ is the original straight line segment and is rotated about an arbitrary pivot point $P$ by an angle $\theta$ (without loss of generality, say, in the clockwise direction) to yield the new configuration of the line segment $O'_1O'_2$. We show that $\angle O_2 N O'_2 = \theta$ as follows. Let the coordinates of $O_1$ and $O_2$ be $(x_1, y_1)$ and $(x_2, y_2)$ with reference to a Cartesian coordinate system. Also, denote the coordinates of the pivot point $P$ as $(x_p, y_p)$. Clearly, $\angle O_1 P O'_1 = \angle O_2 P O'_2 = \theta$, since both $O_1$ and $O_2$ were rotated about the pivot point $P$ by an angle $\theta$. We denote the coordinates of $O'_1$ and $O'_2$ as $(x'_1, y'_1)$ and $(x'_2, y'_2)$; they are given by [10]:

$$x'_1 = (x_1 - x_p)\cos\theta - (y_1 - y_p)\sin\theta + x_p,$$
$$y'_1 = (x_1 - x_p)\sin\theta + (y_1 - y_p)\cos\theta + y_p,$$
$$x'_2 = (x_2 - x_p)\cos\theta - (y_2 - y_p)\sin\theta + x_p,$$
$$y'_2 = (x_2 - x_p)\sin\theta + (y_2 - y_p)\cos\theta + y_p.$$

The slope $m$ of the line segment $O_1O_2$ is given by $(y_2 - y_1)/(x_2 - x_1)$. Also, the slope $m'$ of the line segment $O'_1O'_2$ is given by $(y'_2 - y'_1)/(x'_2 - x'_1)$ and is equal to

$$m' = \frac{(x_2 - x_1)\sin\theta + (y_2 - y_1)\cos\theta}{(x_2 - x_1)\cos\theta - (y_2 - y_1)\sin\theta}.$$

The angle between the line segments $O_1O_2$ and $O'_1O'_2$ (i.e., $\angle O_2 N O'_2$) is $\arctan\big((m' - m)/(1 + m'm)\big)$. By simplification, $\angle O_2 N O'_2 = \arctan(\tan\theta) = \theta$. ∎
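Lemma 1 can also be checked numerically. The following Python sketch is only a sanity check under assumed sample values (the segment end points, pivot, and angle are made up, and the helper names are hypothetical); it applies the rotation expressions from the proof and measures the angle between the original and rotated directed segments.

```python
import math


def rotate_point(point, pivot, theta):
    """Rotate `point` about `pivot` by `theta` radians (expressions from the proof)."""
    x, y = point
    xp, yp = pivot
    return ((x - xp) * math.cos(theta) - (y - yp) * math.sin(theta) + xp,
            (x - xp) * math.sin(theta) + (y - yp) * math.cos(theta) + yp)


def angle_between(seg_a, seg_b):
    """Smaller angle (radians) between two directed segments ((x1, y1), (x2, y2))."""
    dir_a = math.atan2(seg_a[1][1] - seg_a[0][1], seg_a[1][0] - seg_a[0][0])
    dir_b = math.atan2(seg_b[1][1] - seg_b[0][1], seg_b[1][0] - seg_b[0][0])
    diff = dir_b - dir_a
    # atan2(sin, cos) wraps the difference into (-pi, pi]; abs gives the smaller angle.
    return abs(math.atan2(math.sin(diff), math.cos(diff)))


# Assumed sample values, chosen arbitrarily for the check.
o1, o2 = (1.0, 2.0), (4.0, 3.0)
pivot = (-2.0, 5.0)
theta = math.radians(30.0)

r1, r2 = rotate_point(o1, pivot, theta), rotate_point(o2, pivot, theta)
measured = angle_between((o1, o2), (r1, r2))
```

Within floating-point tolerance, `measured` agrees with `theta` regardless of where the pivot is placed, which is the content of the lemma.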

Figure 3: Illustration for the Proof of Lemma 1

We now state and prove the first theorem.

Theorem 1: Rotation of a symbolic image $S$ about an arbitrary pivot point $P$ produces another symbolic image $S_R$ such that $S_R$ is a perfect rotational variant of $S$.

Proof: Consider a symbolic image $S$ and denote its edge list by $E$. Now, rotate the symbolic image $S$ about an arbitrary pivot point $P$ by an angle $\theta$. Denote the resulting symbolic image by $S_R$ and the edge list corresponding to $S_R$ by $E_R$. Clearly, $E = E_R$. Furthermore, by Lemma 1, the angles between the corresponding edges in $E$ and $E_R$ are all the same and are equal to $\theta$. Therefore, $S_R$ is a perfect rotational variant of $S$. ∎

We now introduce the second lemma.

Lemma 2: Rotating a straight line segment about an arbitrary pivot point $P$ by an angle $\theta$, and subsequently rotating the transformed line segment about another arbitrary pivot point $C$ by the same angle $\theta$ but in the opposite direction (of the first rotation), always results in a final orientation of the line segment that is parallel to its initial orientation.

Figure 4: Illustration for the Proof of Lemma 2

Proof: Consider the illustration shown in Figure 4. $O_1O_2$ is the original straight line segment and is rotated about an arbitrary pivot point $P$ by an angle $\theta$ (without loss of generality, say, in the clockwise direction) to yield the new configuration of the line segment $O'_1O'_2$. Now the line segment $O'_1O'_2$ is rotated about another arbitrary pivot point $C$ in the
counterclockwise direction by the same angle $\theta$ to produce the final configuration of the line segment $O''_1O''_2$. We show that the initial configuration of the line $O_1O_2$ and its final configuration $O''_1O''_2$ are parallel as follows. Let the coordinates of $O_1$, $O_2$, $O'_1$, $O'_2$, $O''_1$, $O''_2$ be $(x_1, y_1)$, $(x_2, y_2)$, $(x'_1, y'_1)$, $(x'_2, y'_2)$, $(x''_1, y''_1)$, $(x''_2, y''_2)$, respectively, with reference to a Cartesian coordinate system. Also, denote the coordinates of the pivot points $P$ and $C$ as $(x_p, y_p)$ and $(x_c, y_c)$. Since $O_1$ and $O_2$ have rotated about the pivot point $P$ by an angle $\theta$ to assume their new locations $O'_1$ and $O'_2$, we have the following expressions [10]:

$$x'_1 = (x_1 - x_p)\cos\theta - (y_1 - y_p)\sin\theta + x_p,$$
$$y'_1 = (x_1 - x_p)\sin\theta + (y_1 - y_p)\cos\theta + y_p,$$
$$x'_2 = (x_2 - x_p)\cos\theta - (y_2 - y_p)\sin\theta + x_p,$$
$$y'_2 = (x_2 - x_p)\sin\theta + (y_2 - y_p)\cos\theta + y_p.$$

Similarly, since $O'_1$ and $O'_2$ have rotated about the pivot point $C$ by an angle $-\theta$ to assume their new locations $O''_1$ and $O''_2$, we have the following expressions:

$$x''_1 = \big((x_1 - x_p)\cos\theta - (y_1 - y_p)\sin\theta + x_p - x_c\big)\cos\theta - \big((x_1 - x_p)\sin\theta + (y_1 - y_p)\cos\theta + y_p - y_c\big)(-\sin\theta) + x_c,$$
$$y''_1 = \big((x_1 - x_p)\cos\theta - (y_1 - y_p)\sin\theta + x_p - x_c\big)(-\sin\theta) + \big((x_1 - x_p)\sin\theta + (y_1 - y_p)\cos\theta + y_p - y_c\big)\cos\theta + y_c,$$
$$x''_2 = \big((x_2 - x_p)\cos\theta - (y_2 - y_p)\sin\theta + x_p - x_c\big)\cos\theta - \big((x_2 - x_p)\sin\theta + (y_2 - y_p)\cos\theta + y_p - y_c\big)(-\sin\theta) + x_c,$$
$$y''_2 = \big((x_2 - x_p)\cos\theta - (y_2 - y_p)\sin\theta + x_p - x_c\big)(-\sin\theta) + \big((x_2 - x_p)\sin\theta + (y_2 - y_p)\cos\theta + y_p - y_c\big)\cos\theta + y_c.$$

By simplification, $y''_2 - y''_1 = y_2 - y_1$ and $x''_2 - x''_1 = x_2 - x_1$. Therefore, the slope of the line segment $O''_1O''_2$ is $(y''_2 - y''_1)/(x''_2 - x''_1) = (y_2 - y_1)/(x_2 - x_1)$, which is the slope of the line segment $O_1O_2$. Hence, the line segments $O_1O_2$ and $O''_1O''_2$ are parallel. ∎

Now we state and prove the second theorem.

Theorem 2: SIMR recognizes perfect rotational variant images.

Proof: Assume that $S_R$ is a perfect rotational variant of a symbolic image $S$. Also, denote the edge lists corresponding to $S_R$ and $S$ as $E_R$ and $E$. Since $S_R$ is a perfect rotational variant of $S$ (by Theorem 1), the angles between the corresponding edges in $E_R$ and $E$ are all the same (refer to this angle as $\theta$). Therefore, the clustering algorithm [4] returns $\theta$ as $\theta_{max}$. Next, SIMR rotates all the edges in $E_R$ by $-\theta_{max}$ about an arbitrary pivot point $C$ ($C$ can be different from the pivot point $P$ (see Figure 4) about which the original symbolic image $S$ has been rotated to result in $S_R$). This rotation results in a new symbolic image $S_F$ (and the corresponding edge list $E_F$) such that the corresponding edges in $E$ and $E_F$ are parallel (by Lemma 2). Now that the corresponding edges are parallel and since $E = E_R = E_F$, SIMR returns a value of 100 for spatial similarity. Hence, SIMR recognizes perfect rotational variant images. ∎

Next, we briefly describe two other approaches to RSC reported in the literature and contrast our approach with these.

3. Related Work

An approach to image retrieval based only on spatial relationships is proposed by Chang & Lee [6]. The images in the database are represented as symbolic images. Further, each symbolic image is represented by a set of ordered triples of the form $(o_i, o_j, r_{ij})$, where $o_i$ and $o_j$ are objects in the symbolic image and $r_{ij}$ is the spatial directional relationship of $o_i$ with respect to $o_j$. The triple $(o_i, o_j, r_{ij})$ is an ordered triple if $o_i < o_j$ is true using lexicographic ordering. The spatial similarity problem now becomes the problem of matching the set of ordered triples of a query against the set of ordered triples corresponding to a database image. A hash function is used to map ordered triples into indexes in the picture table. Each index in the picture table corresponds to an ordered triple. A linked list of image names or image IDs is associated with each index. The entries in a linked list at an index are the IDs of all those images that have an ordered triple that maps to that index.

Retrieval is performed as follows. For a given query image, the set of ordered triples is computed first. Then the first ordered triple in this set is mapped into an index in the picture table. Next, all the image IDs contained in the linked list at this index are retrieved. Let $s_1$ designate the set of all these image IDs. The same procedure is repeated for the remaining ordered triples in the query image to obtain the sets $s_2$, $s_3$, and so on. The intersection of the sets $s_1, s_2, s_3, \ldots$ contains the candidate images to be retrieved for the given query.

There are several problems with this approach. First, all the possible spatial relationships are computed and explicitly stored. The additional storage required may render this approach unattractive for large image databases. Since all the spatial relationships are indexed through the picture table, the addition of new images requires the entire picture table to be reconstructed to avoid collisions. Thus this approach is not suitable for environments where images are incrementally added to the database. Furthermore, retrieval is based on a perfect match of spatial relationships, and this is not desirable in many image retrieval environments. All the spatial relationships must be approximated with either the eight directional relationships (North, Northeast, East, etc.) or the relationship "at the same location as." Lastly, their approach recognizes translation and scale invariances but not rotation invariance. We do not use this algorithm for

experimental comparison in this paper, primarily because its spatial similarity computation is based on a perfect match of spatial relationships. Typically, since there exists great diversity as well as varying degrees of similarity between database images in real applications, comparing the retrieval results of this algorithm with those of the other algorithms is not meaningful.

A spatial similarity measure based on the 2D-String representation is proposed in Lee et al. [5]. Both symbolic images and queries are uniformly represented by 2D-Strings. The image that has the largest subimage of a given query is deemed to be the most relevant image for that query. To find relevant images for a query, Lee et al. first determine the largest common subimages of the images in the collection with respect to the query. They then rank the images based on the size of the common subimages. The problem of finding images similar to a query image is transformed to the problem of evaluating the similarity between the 2D-Strings corresponding to the image and the query. In terms of 2D-Strings, finding the largest common subimage is equivalent to finding the longest subsequence common to two 2D-Strings. It is shown that finding the longest common subsequence between two 2D-Strings is the same as finding the maximal clique of a graph constructed with respect to the two 2D-Strings and the desired Type-i similarity. The edges in the graph are formed based on the conditions for a given type of similarity. These measures are referred to as Type-0, Type-1, and Type-2 similarities. Of the three, Type-2 requires the most stringent matching. Let $Q$ and $D$ be the query image and the database image. $D$ is retrieved as a Type-2 similarity image for $Q$ if there exists a subimage $D'$ of $D$ that contains only and all the objects in $Q$, in the same order along both the axes. The set $D - D'$ contains no objects whose projections fall strictly between projections of objects in $D'$.
For Type-1 similarity the above condition is relaxed (i.e., the set D − D′ may contain objects whose projections fall strictly between projections of objects in D′). Finally, Type-0 similarity differs from Type-1 similarity in that the spatial ordering of some objects in Q may project onto the same position on an axis. However, finding a maximal clique in a graph is known to be NP-Complete [7].

An earlier version of the proposed algorithm appeared in Raghavan & Gudivada [8]. That version recognizes only scale and translation invariances. The current algorithm extends that work to recognize rotational invariance as well. Moreover, the test data used now is extensive and facilitates comprehensive testing of the extended algorithm to understand its behavior vis-à-vis previously known algorithms. The details of the experimental study are presented next.

4. Experiments

First, we introduce the experimental setup. The discussion of the experimental results follows next.

4.1 Experimental Setup

The test data consists of a collection of 24 images as shown in Figure 5. We refer to these images as the Icon Image database. Graphic icons, rather than symbolic names, are used in composing these test images to facilitate the comparison of the user's intuitive results for the queries with those produced by the algorithms. However, both the spatial similarity algorithms (SIMR and Lee et al.'s) that we use in the experimental study operate on the corresponding symbolic representations. Images shown in Figure 5 are known as iconic images and are considered a logical representation of the corresponding physical images. The images are divided into five groups to facilitate the inquiry into how a spatial similarity algorithm deals with scale, translation, and rotation differences in images in general and how the parameters associated with rotation influence an algorithm's performance in particular.

All images are of size 7 units by 7 units. The bottom left corner of the square enclosing the images is considered the coordinate origin for all the images. Images are composed by spatially arranging a subset of 17 distinct image objects. Graphic icons corresponding to the image objects are obtained from the symbol library of a graphics software package. The labels for the image objects are as follows. In Image 1, starting from the top left corner and going in the counterclockwise (cc) direction along the edges of the square enclosing the image, the labels are Plant, Child, Monument, Animal, and Building. In Image 17, starting from the top left corner and going in the cc direction, we have Lamp, Child, Sewing Machine, Sofa, Computer, Bookshelf, Plant, Stereo, TV, End Table, Wheel Chair, and Chair. At the image centroid location, we have Table. In Image 10, Phone is located at the top right corner location. Image objects are considered as point objects situated at their centroid locations.

Figure 5: Test Images for Evaluating Spatial Similarity Algorithms

Images 1 through 4 have five objects each and form a logical grouping which we refer to as Group 1 images. Group 1 images are created to illustrate how a spatial similarity algorithm deals with scale, translation, and rotation differences in images. The number of objects in Group 1 images is kept at five so that the reader can easily compare his or her intuitive visual ranking of the images with the rankings induced by the spatial similarity algorithms. Images 2, 3, and 4 are rotation, scale, and translation variants of Image 1, respectively. Image 2 is obtained by rotating all objects in Image 1 by 45 degrees about the image centroid. The x-coordinate of the image centroid is the average of the x-coordinates of all the objects in the image. Similarly, the y-coordinate of the image centroid is the average of the y-coordinates of all the objects in the image. Image 3 is obtained from Image 1 by scaling up Image 1 by 1.2 along both the coordinate axes. Scaling is performed about the image centroid. Image 4 is obtained from Image 1 by translating objects in Image 1 by 0.2 units along both the coordinate axes.

Images 5 through 8 are created to contrast closely the SIMR algorithm with that of Lee et al. These images are referred to as Group 2 images.

Group 3, Group 4, and Group 5 images are created to assess how the spatial similarity algorithms handle the following aspects of the rotational differences in images: magnitude and direction of rotation, and multiple rotational variants. Group 3 consists of Images 9, 10, 11, 12, 13, and 14. Images 9 and 10 are similar in the sense that they have the same image objects at the corners of the square enclosing the image. Image objects placed at the middle point of each of the sides of the square in Image 10 are those of the corresponding objects in Image 9 after undergoing a clockwise (c) rotation by 90 degrees. Image objects at the center of Images 9 and 10 are different. Image 11 is derived from Image 9 by leaving the image object at the centroid in Image 9 at the same location, rotating two objects (TV and Wheel Chair) 15 degrees in the c direction, and rotating all other objects 20 degrees in the cc direction. Image 12 is also derived from Image 9, with the following changes. Three objects (Table, TV, and Wheel Chair) that lie on the vertical line passing through the image centroid remain in the same locations. Two objects (Stereo and Lamp) are rotated in the c direction by 20 degrees. Four objects (Bookshelf, Plant, End Table, and Telephone) are rotated in the cc direction by 30 degrees. Both Images 13 and 14 are derived from Image 10. Image 13 retains two objects (Lamp and Computer) of Image 10 in the same locations. One object (Plant) is rotated in the c direction by 10 degrees while the remaining objects are rotated 30 degrees in the cc direction. Image 14 is obtained from Image 10 with the following changes. One object (Wheel Chair) is rotated in the cc direction by 25 degrees, one other object (TV) is rotated in the c direction by 35 degrees, while all the remaining objects retain their original locations.

Images 15 and 16 are created to generate their rotational variants. Images 17, 18, 19, and 20 are rotational variants of Image 15 and are obtained by rotating Image 15 in the cc direction by 5, 15, 35, and 75 degrees, respectively. Images 15, 17, 18, 19, and 20 are referred to as Group 4 images. Images 21, 22, 23, and 24 are rotational variants of Image 16. Images 21 and 22 are obtained by rotating Image 16 in the cc direction by 50 and 90 degrees, respectively. Images 23 and 24 are obtained by rotating Image 16 in the c direction by 50 and 90 degrees, respectively. We refer to Images 16, 21, 22, 23, and 24 as Group 5 images.

Spatial similarity algorithms are applied on the database images in two stages. In stage one, the performance of the similarity algorithms within each of the logical groups (i.e., Groups 1-5) is compared and contrasted. The objective here is to inquire into the robust behavior of these algorithms and their abilities in recognizing scale, translation, and rotation invariances in general, and how the algorithms are affected by variation in the magnitude and the direction of rotation in particular.
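The construction of the Group 1 variants described above (rotation by 45 degrees, scaling by 1.2, and translation by 0.2 units, with rotation and scaling performed about the image centroid) can be sketched as follows. This is only an illustration: the object coordinates are hypothetical (the paper does not list them), the function names are ours, and a positive angle is assumed to mean counterclockwise rotation.

```python
import math

def centroid(points):
    """Image centroid: average of the x- and y-coordinates of all objects."""
    xs = [x for x, _ in points.values()]
    ys = [y for _, y in points.values()]
    return sum(xs) / len(xs), sum(ys) / len(ys)

def rotate(points, degrees):
    """Rotate all objects about the image centroid (positive = counterclockwise)."""
    cx, cy = centroid(points)
    t = math.radians(degrees)
    return {
        name: (cx + (x - cx) * math.cos(t) - (y - cy) * math.sin(t),
               cy + (x - cx) * math.sin(t) + (y - cy) * math.cos(t))
        for name, (x, y) in points.items()
    }

def scale(points, factor):
    """Scale all objects about the image centroid by the same factor on both axes."""
    cx, cy = centroid(points)
    return {name: (cx + factor * (x - cx), cy + factor * (y - cy))
            for name, (x, y) in points.items()}

def translate(points, dx, dy):
    """Shift every object by (dx, dy)."""
    return {name: (x + dx, y + dy) for name, (x, y) in points.items()}

# Hypothetical centroid coordinates for the five objects of Image 1.
image1 = {"Plant": (1.5, 5.5), "Child": (1.5, 1.5), "Monument": (3.5, 1.0),
          "Animal": (5.5, 1.5), "Building": (5.5, 5.5)}

image2 = rotate(image1, 45)            # rotation variant
image3 = scale(image1, 1.2)            # scale variant
image4 = translate(image1, 0.2, 0.2)   # translation variant
```

Rotation and scaling about the centroid leave the centroid itself unchanged, which is why the three variants differ from Image 1 only in orientation, size, or position.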

In stage two, the objective is to quantify the performance of the algorithms with reference to the expert provided results. Each database image is considered as a query image in turn. Similarity of each query image is evaluated with respect to every image in the database using each of the spatial similarity algorithms. This gives us a rank ordering of database images for each query image, which we refer to as the system provided rank ordering. We then evaluate the system provided rank ordering using the Rnorm measure. This measure was introduced in the LIVE-Project [9] and is used in assessing the quality of retrieval functions used in Bibliographic Information Systems (see Appendix A). Calculation of Rnorm for a query requires two rank orderings of the database images relative to the query image. The first one is the system provided rank ordering and the second one is the expert provided rank ordering that defines the desired system output.

A graduate student in Computer Science served as the expert. It was emphasized to the expert that rank orderings be based on the degree of conformance of spatial relationships in the database image with those present in the query image. The notion of spatial similarity was exemplified to the expert by using an illustration similar to the following. Consider the relevance of Images 6 and 7 for Image 5 as a query image. All possible spatial relationships between the objects in Image 5 are: 'Animal south Tree,' 'Child southeast Tree,' 'Monument south Tree,' 'Child east Animal,' 'Monument south Animal,' and 'Monument southwest Child.' For example, the spatial relationship 'Animal south Tree' means that the Animal is to the south of the Tree. In Image 6, all of the spatial relationships of Image 5 hold, with the following exception: two spatial relationships, 'Animal south Tree' and 'Monument south Animal,' are true only approximately. On the other hand, in Image 7 only four of the six spatial relationships of Image 5 hold. Hence, Image 6 should be preferred over Image 7 for Image 5 as a query. No other instructions were given to the expert. Each image in the Icon Image database is considered as a query image in turn. For each query image, the expert is asked to provide a rank ordering of the database images.
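The kind of reasoning given to the expert (listing pairwise directional relationships and counting how many of them hold in another image) can be sketched as below. This is a deliberately crude illustration, not the SIMR measure: directions are snapped to eight 45-degree compass sectors and compared for exact equality, and all coordinates are hypothetical layouts chosen only so that the relationships listed for Image 5 come out as stated.

```python
import math
from itertools import combinations

DIRECTIONS = ["east", "northeast", "north", "northwest",
              "west", "southwest", "south", "southeast"]

def direction(a, b):
    """Direction of point a as seen from point b, snapped to a 45-degree sector."""
    angle = math.degrees(math.atan2(a[1] - b[1], a[0] - b[0])) % 360
    return DIRECTIONS[int((angle + 22.5) // 45) % 8]

def relations(points):
    """Map each object pair (p, q) to the relation 'p <direction> q'."""
    return {(p, q): direction(points[p], points[q])
            for p, q in combinations(sorted(points), 2)}

def matching_relations(query, database):
    """Number of query relationships that hold exactly in the database image."""
    q, d = relations(query), relations(database)
    return sum(1 for pair, rel in q.items() if d.get(pair) == rel)

# Hypothetical layout reproducing the six relationships listed for Image 5.
image5 = {"Tree": (2, 6), "Animal": (2, 4), "Child": (4, 4), "Monument": (2, 2)}
# Hypothetical variant in which the Monument has been moved to the right.
variant = {"Tree": (2, 6), "Animal": (2, 4), "Child": (4, 4), "Monument": (3.2, 2)}

all_hold = matching_relations(image5, image5)    # every relationship holds
some_hold = matching_relations(image5, variant)  # only some relationships hold
```

An exact-match count of this kind cannot express that a relationship is "true only approximately," which is the distinction the expert was asked to take into account.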

Table B.1 in Appendix B shows the expert provided rank ordering for all the query images. Stage one experimental results are presented in the next section.

Stage One Experimental Results and Discussion

Based on the expert provided rank ordering (Table B.1), one would expect a similarity function to conclude that all the Group 1 images are the same by assigning the highest similarity value to any pair of images in this group. Indeed, SIMR performed as expected. The results obtained by applying Lee et al.'s algorithm on the same group of images are shown in Table 1. The row labels of the table indicate the query image numbers and the column labels indicate the database image numbers. An entry in the table corresponding to a given row label and column label indicates the similarity of the image designated by the column label with respect to the image designated by the row label. This algorithm also produces results as expected, except in those cases where Image 2 is involved as either a query image or a database image. This clearly demonstrates that Lee et al.'s algorithm fails to recognize rotational variance. This is true for all three types of similarity provided by Lee et al.'s algorithm, which we refer to as the Type-0, Type-1, and Type-2 algorithms.

            Type-0          Type-1          Type-2
Query    1  2  3  4      1  2  3  4      1  2  3  4
  1      5  2  5  5      5  2  5  5      5  1  5  5
  2      2  5  2  2      2  5  2  2      1  5  1  1
  3      5  2  5  5      5  2  5  5      5  1  5  5
  4      5  2  5  5      5  2  5  5      5  1  5  5

Table 1: Type-0, Type-1, and Type-2 Similarities for Group 1 Images

A pictorial representation of the comparative performance of SIMR and Lee et al.'s algorithms is shown in Figure 6. Since the range of SIMR (0 to 100) is different from that of Lee et al.'s algorithm (1 to 5 for Group 1 images), the range of the latter has been interpolated to conform to that of the former. All three variations of Lee et al.'s algorithm produce 1 as a minimum value (always) to indicate no similarity and a maximum value of 5 (in the case of Group 1 images) to indicate that the images are identical. Using linear interpolation, values 1, 2, 4, 5 in Table 1 interpolate to 0, 25, 75, 100, respectively. The depth dimension indicates the spatial similarity value in Figure 6.

Figure 6: Comparative Performance of Spatial Similarity Algorithms on Group 1 Images (panels: Query Numbers 1 through 4; horizontal axis: Database Image Numbers; similarity scale: 0 to 100; series: SIMR, Type-0, Type-1, Type-2)

For the following discussion on Group 2 images, all the expected rankings are based on the expert provided rank ordering. We expect the following ranking for Image 5 as a query image: 5, 6, 7, 8. For Image 6, the expected ranking is 6, 5, 7, and 8. The expected ranking for Image 7 is 7, 5, 8, 6. Finally, 8, 7, 5, and 6 is the expected ranking for Image 8. Ranking of Group 2 images by SIMR is shown in Table 2. By noting the entries in Table 2, we conclude that the ranking provided by SIMR in fact agrees with the expert provided ranking. All three types of ranking provided by Lee et al.'s algorithm are shown in Table 3. All three types of ranking agree with the expert provided ranking, with the following exception. Both Images 5 and 6 are assigned the same similarity value for Image 5 as a query image, whereas the expert provided ranking indicates that Image 5 should be assigned a higher value than Image 6. The rank ordering provided by Lee et al.'s algorithm is a weak ordering in the sense that several images are shown to have equal similarity for a given query. Moreover, as we see later, in some situations Lee et al.'s algorithm does not provide an adequate degree of resolution in the weak ordering for the ordering to be useful. We refer to this problem as the problem of inadequate resolution.

Since Group 2 images are not rotational variants and the expert has provided only a rank ordering, it is not known what numeric value is the right value to quantify the similarity. Hence, we do not provide a pictorial representation to depict the comparative performance of the algorithms on Group 2 images.

The results obtained by applying SIMR on Group 3 images are shown in Table 4. Tables 5, 6, and 7 show the results obtained by using Lee et al.'s algorithm for Type-0, Type-1, and Type-2 similarities, respectively. To analyze the results for this group, it is convenient to think of Images 9, 11, and 12 as one subgroup and Images 10, 13, and 14 as another subgroup. First we analyze the results of the former subgroup. For Image 9, according to the expert's rank ordering, Image 11 is more relevant than Image 12 since Image 11 is a closer rotational variant than Image 12. In general, results given by all the algorithms agree quite well, with the following exceptions. For Image 9, the similarity ordering provided by SIMR is: 9, 11, 12. The Type-0 measure considers Images 9, 11, and 12 to be equally similar, whereas the Type-1 and Type-2 measures assess both Images 11 and 12 to be of the same relevance. For Image 12, the similarity ordering provided by SIMR is: 12, 11, 9. Both Type-1 and Type-2 measures concur with this ordering while Type-0 provides a different ordering (12, 9, 11). Both Type-1 and Type-2 measures rank Image 12 as a more relevant image than Image 9 for Image 11 as a query image, whereas the Type-0 measure considers both Images 11 and 9 as equally relevant for Image 11 as a query image. Next we analyze the results of the second subgroup within Group 3.

Query       5        6        7        8
  5      100.00    93.02    74.31    60.48
  6       93.02   100.00    60.82    46.53
  7       74.31    60.82   100.00    85.54
  8       60.48    46.53    85.54   100.00

Table 2: SIMR Values for Group 2 Images
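The notion of a weak ordering, and the loss of resolution it implies, can be made concrete with a small sketch. The function below groups images that receive equal similarity values into the same preference level; the input values are the Image 5 rows of Table 2 (SIMR) and Table 3 (Type-0). The function name and the list-of-groups representation are our own illustrative choices.

```python
from itertools import groupby

def weak_ordering(similarities):
    """Group image numbers by similarity value, most similar group first."""
    ranked = sorted(similarities.items(), key=lambda kv: -kv[1])
    return [sorted(img for img, _ in group)
            for _, group in groupby(ranked, key=lambda kv: kv[1])]

# Similarity of database Images 5-8 to query Image 5.
simr_q5 = {5: 100.00, 6: 93.02, 7: 74.31, 8: 60.48}   # Table 2
type0_q5 = {5: 4, 6: 4, 7: 3, 8: 3}                   # Table 3, Type-0

simr_order = weak_ordering(simr_q5)     # four levels, one image each
type0_order = weak_ordering(type0_q5)   # two levels, two images each
```

With only a handful of distinct integer values available, the Type-0 ordering collapses Images 5 and 6 (and Images 7 and 8) into single levels, whereas the real-valued SIMR scores separate all four images.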

            Type-0          Type-1          Type-2
Query    5  6  7  8      5  6  7  8      5  6  7  8
  5      4  4  3  3      4  3  3  2      4  2  3  1
  6      4  4  3  2      3  4  2  2      2  4  2  1
  7      3  3  4  4      3  2  4  3      3  2  4  2
  8      3  2  4  4      2  2  3  4      1  1  2  4

Table 3: Type-0, Type-1, and Type-2 Similarities for Group 2 Images

Query       9       10       11       12       13       14
  9      100.00    61.59    97.22    93.53    64.73    61.61
 10       61.59   100.00    61.02    63.64    95.16    99.16
 11       97.22    61.02   100.00    96.55    65.85    60.98
 12       93.53    63.64    96.55   100.00    70.21    63.66
 13       64.73    95.16    65.85    70.21   100.00    94.59
 14       61.61    99.16    60.98    63.66    94.59   100.00

Table 4: SIMR Values for Group 3 Images

Query    9  10  11  12  13  14
  9      9   8   9   9   6   7
 10      8   9   6   6   7   9
 11      9   6   9   7   5   5
 12      9   6   7   9   5   5
 13      6   7   5   5   9   7
 14      7   9   5   5   7   9

Table 5: Type-0 Similarities for Group 3 Images

Query    9  10  11  12  13  14
  9      9   4   3   3   3   4
 10      4   9   3   2   3   7
 11      3   2   9   6   5   3
 12      3   2   6   9   5   3
 13      3   3   5   5   9   4
 14      4   7   3   3   4   9

Table 6: Type-1 Similarities for Group 3 Images


Query    9  10  11  12  13  14
  9      9   4   1   1   1   1
 10      4   9   1   1   1   2
 11      1   1   9   3   2   1
 12      1   1   3   9   1   1
 13      2   1   3   2   9   1
 14      1   2   1   1   1   9

Table 7: Type-2 Similarities for Group 3 Images

The expert provided ranking indicates that Image 14 is more similar to Image 10 than Image 13 is. The ordering provided by the SIMR function agrees with the ordering provided by all of the Type-0, Type-1, and Type-2 measures, with the exception that the problem of inadequate resolution is associated with the latter measures. A pictorial representation for Group 3 images is not provided for the same reasons mentioned earlier for Group 2 images.

A pictorial representation of the comparative performance of the algorithms on Group 4 images is shown in Figure 7. For any two images within Group 4, SIMR returns a value of 100. The results obtained by applying Lee et al.'s algorithm on Group 4 images are shown in numeric form in Tables C.1, C.2, and C.3 in Appendix C. A linear interpolation is used to transform the range of Lee et al.'s algorithm (1 to 13 for Group 4 images) to conform to that of SIMR. It is expected that Images 15, 17, 18, 19, and 20 be recognized as being identical to each other by assigning the highest similarity to any pair of images in this group. This is indeed the case with the SIMR function, as depicted by the height of the bars in the row labeled 'SIMR' in Figure 7. The heights of the bars in the rows labeled 'Type-0', 'Type-1', and 'Type-2' in Figure 7 illustrate the results of the Type-0, Type-1, and Type-2 similarity measures on Group 4 images. When the degree of rotation is small, the Type-0 measure is able to recognize rotational variance to a certain extent. For both the Type-1 and Type-2 measures, this capability is dramatically reduced and the problem of inadequate resolution becomes much more severe.
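The linear interpolation used to bring Lee et al.'s values onto the 0 to 100 scale of SIMR amounts to the following mapping. The function name and signature are illustrative; the ranges (1 to 5 for Group 1, 1 to 13 for Group 4) are those stated in the text.

```python
def to_simr_scale(value, lo, hi):
    """Linearly map a similarity value from the range [lo, hi] onto [0, 100]."""
    return 100.0 * (value - lo) / (hi - lo)

# Group 1: Lee et al.'s values range from 1 to 5.
group1 = [to_simr_scale(v, 1, 5) for v in (1, 2, 4, 5)]

# Group 4: Lee et al.'s values range from 1 to 13; the midpoint 7 is used as an example.
group4_mid = to_simr_scale(7, 1, 13)
```

For Group 1 this reproduces the values quoted earlier (1, 2, 4, 5 map to 0, 25, 75, 100).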

Figure 7: Comparative Performance of Spatial Similarity Algorithms on Group 4 Images (panels: Query Numbers 15, 17, 18, 19, and 20; horizontal axis: Database Image Numbers; similarity scale: 0 to 100; series: SIMR, Type-0, Type-1, Type-2)

A pictorial representation of the comparative performance of the algorithms on Group 5 images is shown in Figure 8. As in the case of Group 4 images, for any two images within Group 5, SIMR returns a value of 100. The results obtained by applying Lee et al.'s algorithm on Group 5 images are shown in numeric form in Tables C.4, C.5, and C.6. Here also one would expect the same results to occur as expected for Group 4 images. A linear interpolation is used to transform the range of Lee et al.'s algorithm (1 to 13 for Group 5 images) to conform to that of SIMR. Figure 8 clearly demonstrates the inability of Lee et al.'s algorithm to recognize rotational variance for the Type-0, Type-1, and Type-2 measures. This inability of Lee et al.'s algorithm to detect rotational variance is attributable to the following. Recall that a 2D-String is a projection of image objects on both the x- and y-axes. Lee et al.'s method can recognize rotational variance only when the 2D-Strings corresponding to an image's before and after rotation states are the same. This will occur only when the degree of rotation is extremely small. Even for small rotations, the 2D-Strings corresponding to the before and after rotation states of an image are likely to be different.

Figure 8: Comparative Performance of Spatial Similarity Algorithms on Group 5 Images (panels: Query Numbers 16, 21, 22, 23, and 24; horizontal axis: Database Image Numbers; similarity scale: 0 to 100; series: SIMR, Type-0, Type-1, Type-2)

Next we evaluate the performance of the spatial similarity algorithms using the Rnorm measure.
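The sensitivity of axis projections to rotation, discussed above, can be illustrated with a short sketch. It uses a simplified stand-in for a 2D-String, namely the order of object labels along the x-axis and along the y-axis, and ignores the operators a real 2D-String uses for objects that share a coordinate. The layout, labels, and function names are hypothetical.

```python
import math

def rotate(points, degrees):
    """Rotate all points counterclockwise about their centroid."""
    n = len(points)
    cx = sum(x for x, _ in points.values()) / n
    cy = sum(y for _, y in points.values()) / n
    t = math.radians(degrees)
    return {k: (cx + (x - cx) * math.cos(t) - (y - cy) * math.sin(t),
                cy + (x - cx) * math.sin(t) + (y - cy) * math.cos(t))
            for k, (x, y) in points.items()}

def axis_orders(points):
    """Object labels sorted by x-coordinate and by y-coordinate."""
    by_x = sorted(points, key=lambda k: points[k][0])
    by_y = sorted(points, key=lambda k: points[k][1])
    return by_x, by_y

# Hypothetical four-object layout with distinct x- and y-coordinates.
layout = {"A": (1, 1), "B": (3, 2), "C": (5, 4), "D": (2, 6)}

# A very small rotation leaves both projection orders intact.
small_rotation_same = axis_orders(rotate(layout, 2)) == axis_orders(layout)
# A large rotation reorders the objects along the axes.
large_rotation_differs = axis_orders(rotate(layout, 90)) != axis_orders(layout)
```

Any measure that compares such projection orders will therefore treat a strongly rotated copy of an image as a different arrangement, even though the relative configuration of the objects is unchanged.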

Stage Two Experimental Results and Discussion

Table 8 shows Rnorm values for the SIMR, SIM, Type-0, Type-1, and Type-2 algorithms. A pictorial representation corresponding to Table 8 is shown in Figure 9. The rank orderings of database images provided by the expert are used in calculating Rnorm values for the corresponding queries. Rnorm values range from 0 to 1.0, and a value of 1.0 indicates that the system provided rank ordering of the database images is an acceptable ranking with respect to the rank ordering of the database images provided by the expert. An acceptable ranking is a ranking that is either identical to the expert provided ranking or differs from the expert provided ranking only in the degree of resolution (i.e., the system provided ranking has a higher degree of resolution than the expert provided ranking).

Query Number    SIMR    SIM     Type-0   Type-1   Type-2
      1         1.00    1.00    0.91     0.91     0.75
      2         0.98    0.98    0.91     0.93     0.66
      3         1.00    1.00    0.91     0.91     0.75
      4         1.00    1.00    0.91     0.91     0.75
      5         0.98    0.98    0.97     0.93     0.71
      6         0.95    0.95    0.96     0.92     0.72
      7         0.97    0.97    0.97     0.93     0.78
      8         0.95    0.95    0.96     0.73     0.65
      9         0.98    0.98    0.99     0.93     0.60
     10         1.00    1.00    0.99     0.97     0.65
     11         0.98    0.98    1.00     0.92     0.66
     12         0.99    0.99    1.00     0.90     0.61
     13         1.00    1.00    1.00     0.96     0.70
     14         1.00    1.00    0.99     0.98     0.62
     15         1.00    0.99    0.92     0.86     0.59
     16         0.99    0.96    0.88     0.75     0.57
     17         0.96    0.97    0.93     0.89     0.61
     18         0.98    0.98    0.93     0.91     0.62
     19         0.98    0.99    0.95     0.93     0.56
     20         0.97    0.97    0.92     0.81     0.56
     21         1.00    0.95    0.81     0.80     0.59
     22         0.99    0.85    0.80     0.68     0.57
     23         0.96    0.90    0.64     0.59     0.54
     24         0.93    0.80    0.70     0.49     0.56

Table 8: Rnorm Values for the Spatial Similarity Algorithms

As shown by the first and second columns in Table 8, Rnorm values for the SIMR and SIM rank orderings are equal to 1.0 or very close to 1.0. By noting the Rnorm values in the first and second columns, it can be concluded that SIM and SIMR perform equally well in all cases except those involving rotationally variant images with a large magnitude of rotation.

Overall, the Type-0 algorithm performs reasonably well, and the situation quickly degenerates as we go from Type-0 to the Type-1 and Type-2 algorithms. The Type-0 and Type-1 algorithms perform well for small rotations and when the number of objects in rotationally variant images is also small. This is evidenced by the top four entries in columns 3 and 4 in Table 8. However, as the magnitude of rotation and the number of objects in rotationally variant images increase, the performance of the Type-0 and Type-1 algorithms suffers considerably, as evidenced by the last four entries in columns 3 and 4 of Table 8. The same observations about retrieval quality that apply to the Type-0 and Type-1 algorithms apply to the Type-2 algorithm in general. However, the retrieval quality of the Type-2 algorithm is severely affected by the number of objects and the magnitude of rotation, as evidenced by the last eight entries of column 5 in Table 8.

Figure 9: Rnorm Based Comparative Performance of Spatial Similarity Algorithms (vertical axis: Rnorm Value, 0.4 to 1; horizontal axis: Query Numbers; series: SIMR, SIM, Type-0, Type-1, Type-2)

Occasionally, Type-0 performed better than SIMR. The inability of the human perception process to recognize two rotationally variant images as the same image when the images contain many objects and the degree of rotation is high has actually helped the Type-0 algorithm. SIMR, however, is designed to recognize rotation invariance irrespective of the number of objects in the images and the magnitude of rotation. This result also indicates that the expert made decisions about the expected rank ordering independently of how the algorithms, SIM and SIMR, are designed.

5. Conclusions

Two algorithms (those of Chang & Lee and of Lee et al.) exist in the literature for spatial similarity computation. Since the algorithm of Chang & Lee exhaustively stores all the possible spatial relationships in an image, it may not be useful in complex image database applications. Moreover, its spatial similarity computation is based on exact match. As we have seen in section 3, Lee et al.'s algorithm has exponential time complexity, which may render it unattractive for image database applications requiring interactive retrieval. Also, their algorithm does not recognize rotational invariance in images. Moreover, the problem of inadequate resolution becomes pronounced as we progress from Type-0 to Type-2.

In this paper, we have introduced an algorithm for spatial similarity computation that performs as well as Lee et al.'s algorithm for scale and translation variants. In addition, it has been formally shown that the proposed algorithm recognizes rotational invariances in images. The algorithm has quadratic time complexity in terms of the total number of objects in both the database and query images. We have also introduced the idea of using expert judgments in quantifying a system's retrieval quality. This enables us to systematically assess the quality of algorithms for retrieval in image databases. By comparing the results produced by our algorithm with those of Lee et al.'s algorithm, we conclude that our algorithm is not only more efficient but also provides a rank ordering of images that consistently matches the user's expected rank ordering. In this context, we feel that the proposed algorithm and the measure to quantify the quality of retrieval algorithms are important contributions to the IR problem.

Acknowledgment

This research is supported by the U.S. Department of Defense under Grant No: DAAL03-89-G-0118. The authors wish to express their appreciation to the anonymous referees for their helpful suggestions that significantly improved the paper.

References

1. Gudivada, V.N. (1993), A Unified Framework for Retrieval in Image Databases, Ph.D. Dissertation, University of Southwestern Louisiana, Lafayette, LA.
2. Tamura, H. and Yokoya, N. (1984), "Image Database Systems: A Survey," Pattern Recognition, Vol. 17, No. 1, pp. 29-43.
3. Chock, M. (1982), A Database Management System for Image Processing, Ph.D. Dissertation, Department of Computer Science, University of California, Los Angeles.
4. Fisher, W. (1958), "On Grouping for Maximum Homogeneity," Journal of American Statistical Association, Vol. 53, pp. 789-798.
5. Lee, S.Y., Shan, M.K. and Yang, W.P. (1989), "Similarity Retrieval of ICONIC Image Database," Pattern Recognition, Vol. 22, No. 6, pp. 675-682.
6. Chang, C. and Lee, S. (1991), "Retrieval of Similar Pictures on Pictorial Databases," Pattern Recognition, Vol. 24, No. 7, pp. 675-680.
7. Garey, M.R. and Johnson, D.S. (1979), Computers and Intractability: A Guide to the Theory of NP-Completeness, W.H. Freeman, San Francisco, CA.
8. Raghavan, V.V. and Gudivada, V.N. (1990), "A Domain Independent Similarity Measure for Symbolic Images," First Indian Computing Congress, Hyderabad, India, November, pp. 195-203.
9. Bollmann, P., Jochum, F., Reiner, Weissmann, V., and Zuse, H. (1985), "The LIVE-Project - Retrieval Experiments Based on Evaluation Viewpoints," Proc. of the Eighth Annual International ACM/SIGIR Conference on Research & Development in Information Retrieval, Montreal, Canada, June 1985, pp. 213-214.
10. Foley, J., van Dam, A., Feiner, S., and Hughes, J. (1990), Computer Graphics: Principles and Practice, Addison-Wesley, Reading, MA.


Appendix A Definition of Rnorm

Let $I$ be a finite set of images with a user-defined preference relation ·≥ that is complete and transitive (a weak order). Let $\Delta_{usr}$ be the rank ordering of $I$ induced by the user preference relation. Also, let $\Delta_{sys}$ be some rank ordering of $I$ induced by the similarity values computed by an image retrieval system. Then Rnorm is defined as

$$R_{norm}(\Delta_{sys}) = \frac{1}{2}\left(1 + \frac{S^{+} - S^{-}}{S^{+}_{max}}\right)$$

where $S^{+}$ is the number of image pairs where a better image is ranked ahead of a worse one, $S^{-}$ is the number of pairs where a worse image is ranked ahead of a better one, and $S^{+}_{max}$ is the maximum possible value of $S^{+}$. It should be noted that the calculation of $S^{+}$, $S^{-}$, and $S^{+}_{max}$ is based on the ranking of image pairs in $\Delta_{sys}$ relative to the ranking of the corresponding image pairs in $\Delta_{usr}$. Rnorm was introduced in the LIVE-Project [9].

Example: Consider the following two rank orderings:

$\Delta_{usr} = (i_1, i_4 \mid i_2, i_3 \mid i_5)$ and $\Delta_{sys} = (i_5 \mid i_2, i_4 \mid i_1, i_3)$.

According to the user, $i_1$ and $i_4$ have the highest preference, followed by both $i_2$ and $i_3$ at the next level of preference, followed by $i_5$ at the lowest level of preference. The user considers $i_1$ equivalent to $i_4$ and $i_2$ equivalent to $i_3$. $\Delta_{sys}$ is interpreted in a similar way. Here we have $S^{+}_{max} = 8$, $S^{+} = 1$, and $S^{-} = 5$. Therefore,

$$R_{norm}(\Delta_{sys}) = \frac{1}{2}\left(1 + \frac{1 - 5}{8}\right) = 0.25.$$
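The pair counting behind Rnorm can be sketched in a few lines of Python. This is a minimal illustration of the definition in this appendix, not the authors' implementation: the function names `rank_levels` and `r_norm` and the list-of-tuples representation of a rank ordering are assumptions made for the example, and both orderings are assumed to contain the same images.

```python
from itertools import combinations


def rank_levels(ordering):
    """Map each image to its preference level; level 0 is the most preferred."""
    return {img: level for level, group in enumerate(ordering) for img in group}


def r_norm(usr, sys):
    """Rnorm of the system ordering `sys` relative to the user ordering `usr`.

    Each ordering is a list of tuples; images in the same tuple are tied.
    (Hypothetical representation chosen for this sketch.)
    """
    u, s = rank_levels(usr), rank_levels(sys)
    s_plus = s_minus = s_max = 0
    for a, b in combinations(u, 2):
        if u[a] == u[b]:
            continue  # the user has no preference between a and b
        s_max += 1  # every strictly ordered user pair could count toward S+
        better, worse = (a, b) if u[a] < u[b] else (b, a)
        if s[better] < s[worse]:
            s_plus += 1   # system agrees with the user
        elif s[better] > s[worse]:
            s_minus += 1  # system ranks the worse image ahead
        # a system tie contributes to neither S+ nor S-
    if s_max == 0:
        raise ValueError("user ordering expresses no strict preference")
    return 0.5 * (1 + (s_plus - s_minus) / s_max)


# The example from this appendix.
usr = [("i1", "i4"), ("i2", "i3"), ("i5",)]
sys = [("i5",), ("i2", "i4"), ("i1", "i3")]
print(r_norm(usr, sys))  # 0.25
```

Pairs that the system leaves tied count toward neither $S^{+}$ nor $S^{-}$, which is why $S^{+} + S^{-} = 6$ rather than 8 in the example.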

Appendix B

Each row lists the images from left to right in the expert's order. In the original table, images shown within the same cell of a row have the same relevance; the cell boundaries are not preserved in this listing.

Query    Expert Provided Rank Ordering
Number
  1      1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
  2      1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
  3      1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
  4      1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
  5      5 6 7 8 1 2 3 4 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
  6      6 5 7 8 1 2 3 4 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
  7      7 5 8 6 1 2 3 4 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
  8      8 7 1 2 3 4 5 6 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
  9      9 11 12 10 13 14 15 16 17 18 19 20 21 22 23 24 1 2 3 4 5 6 7 8
 10      10 14 13 9 11 12 15 16 17 18 19 20 21 22 23 24 1 2 3 4 5 6 7 8
 11      11 9 12 10 13 14 15 16 17 18 19 20 21 22 23 24 1 2 3 4 5 6 7 8
 12      12 9 11 10 13 14 15 16 17 18 19 20 21 22 23 24 1 2 3 4 5 6 7 8
 13      13 10 14 9 11 12 15 16 17 18 19 20 21 22 23 24 1 2 3 4 5 6 7 8
 14      14 10 13 9 11 12 15 16 17 18 19 20 21 22 23 24 1 2 3 4 5 6 7 8
 15      15 17 18 19 20 16 21 22 23 24 9 10 11 12 13 14 1 2 3 4 5 6 7 8
 16      16 21 22 23 24 15 17 18 19 20 9 10 11 12 13 14 1 2 3 4 5 6 7 8
 17      17 15 16 18 19 20 21 22 23 24 9 10 11 12 13 14 1 2 3 4 5 6 7 8
 18      18 15 17 19 16 20 21 22 23 24 9 10 11 12 13 14 1 2 3 4 5 6 7 8
 19      19 18 20 15 16 17 21 22 9 10 11 12 13 14 23 24 1 2 3 4 5 6 7 8
 20      20 18 19 15 16 17 9 10 11 12 13 14 21 22 23 24 1 2 3 4 5 6 7 8
 21      21 16 22 23 24 15 17 18 19 20 9 10 11 12 13 14 1 2 3 4 5 6 7 8
 22      22 16 21 23 24 15 17 18 19 20 9 10 11 12 13 14 1 2 3 4 5 6 7 8
 23      23 16 21 22 24 15 17 18 19 20 9 10 11 12 13 14 1 2 3 4 5 6 7 8
 24      24 16 21 22 23 15 17 9 11 12 18 19 20 12 13 14 1 2 3 4 5 6 7 8

Table B.1: Expert Provided Rank Ordering for Queries on Icon Image Database

Appendix C Results of Lee et al.'s Algorithms on Group 4 and Group 5 Images

        15   17   18   19   20
  15    13   13   13    5    4
  17    13   13   13    5    4
  18    13   13   13    5    4
  19     5    5    5   13    4
  20     4    4    4    4   13

Table C.1: Type-0 Similarities for Group 4 Images

        15   17   18   19   20
  15    13    5    5    3    1
  17     5   13   13    5    4
  18     5   13   13    5    4
  19     3    5    5   13    4
  20     1    4    4    4   13

Table C.2: Type-1 Similarities for Group 4 Images

        15   17   18   19   20
  15    13    1    1    1    1
  17     1   13   13    1    1
  18     1   13   13    1    1
  19     1    1    1   13    1
  20     1    1    1    1   13

Table C.3: Type-2 Similarities for Group 4 Images

        16   21   22   23   24
  16    13    4    4    4    4
  21     4   13    5    1    1
  22     4    5   13    1    1
  23     4    1    1   13    5
  24     4    1    1    5   13

Table C.4: Type-0 Similarities for Group 5 Images

        16   21   22   23   24
  16    13    3    1    3    1
  21     3   13    3    1    1
  22     1    3   13    1    1
  23     3    1    1   13    3
  24     1    1    1    3   13

Table C.5: Type-1 Similarities for Group 5 Images

        16   21   22   23   24
  16    13    1    1    1    1
  21     1   13    1    1    1
  22     1    1   13    1    1
  23     1    1    1   13    1
  24     1    1    1    1   13

Table C.6: Type-2 Similarities for Group 5 Images
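The loss of resolution from Type-0 to Type-2 similarity can be read off these tables by turning one row into a rank ordering with ties. The sketch below is only an illustration: the helper `rank_ordering` is hypothetical, and it assumes that a larger similarity value means a closer match (consistent with each image scoring 13 against itself). The two rows are those of query image 15 in Tables C.1 and C.3.

```python
from itertools import groupby


def rank_ordering(similarities):
    """Group images into tied levels, highest similarity first.

    `similarities` maps an image number to its similarity with the query.
    (Hypothetical helper; assumes larger values mean more similar.)
    """
    ranked = sorted(similarities.items(), key=lambda kv: -kv[1])
    return [tuple(img for img, _ in group)
            for _, group in groupby(ranked, key=lambda kv: kv[1])]


# Row for query image 15 in Table C.1 (Type-0) and Table C.3 (Type-2).
type0_q15 = {15: 13, 17: 13, 18: 13, 19: 5, 20: 4}
type2_q15 = {15: 13, 17: 1, 18: 1, 19: 1, 20: 1}

print(rank_ordering(type0_q15))  # [(15, 17, 18), (19,), (20,)]
print(rank_ordering(type2_q15))  # [(15,), (17, 18, 19, 20)]
```

Under Type-2 similarity every image other than the query falls into a single tied level, so the ordering carries no information about which of them is closer to the query.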
