Efficient Retrieval and Spatial Querying of 2D Objects *
Cyrus Shahabi, Maytham Safar
Integrated Media Systems Center, University of Southern California, Los Angeles, California 90089-0781
[shahabi, safar]@usc.edu
Abstract

Besides traditional applications (e.g., CAD/CAM and trademark registry), new multimedia applications such as structured video, animation, and the MPEG-4 standard require the storage and management of well-defined objects. For efficient retrieval of 2D objects by shape, we propose three index structures on features that are extracted from the objects' minimum bounding circles (MBC). A major observation is that those features are unique per object and can be utilized to filter out non-similar candidates. To evaluate our techniques, we conducted a simulation study on a database of 2D objects. The results show the superiority of our techniques as compared to a naive indexing (at least 40% improvement in I/O cost). We also identify one of the indexing structures as the superior one, independent of the size of the database and the number of vertices of the objects.

1. Introduction

Many applications in the areas of CAD/CAM and computer graphics need to store and access large databases. A major data type stored and managed by these applications is the representation of two-dimensional (2D) objects. These object databases are then queried and searched for different purposes. For example, a CAD application [4] for manufacturing might intend to reduce the cost of building new industrial parts by searching for reusable existing parts in a database. For an alternative trademark registry application [10], one might need to ensure that a newly registered trademark is sufficiently distinctive from the existing marks by searching the database. Meanwhile, new multimedia applications such as structured video [9], animation, and the MPEG-4 standard define specific objects that constitute different scenes of a continuous presentation. These scenes and their objects can be stored in a database for future queries. A sample query might be "find all the scenes that contain a certain object". Therefore, one of the important functionalities required by all these applications is the capability to find objects in a database that match a given object.

In this paper, we concentrate on whole matching and address two obstacles for efficient execution of whole match queries. First, the general problem of comparing two 2D objects under rotation, scaling, and translation invariances is known to be computationally expensive [6]. Second, the size of multimedia and CAD databases is growing, and hence, given a query object, the matching objects should be retrieved without accessing all the objects in the database. We address both obstacles collectively by identifying a set of six features that can be extracted from the MBC of each object. Different subsets of an object's MBC features are used to build three alternative index structures, which are enhanced further by incorporating multidimensional index structures (e.g., R*-tree). The index structures are consequently used to expedite the performance of variations of match queries (EM, RST, and SIM). The variations cover cases where the database is searched for exactly identical objects; identical objects that are rotation, scaling, and translation invariant; or similar objects per our definition of similarity.

For designing our index structures we take into consideration the completeness of the model; i.e., it reduces the number of false hits and guarantees no false drops. In the final step, by utilizing our feature-based variations of comparison algorithms, we eliminate all the false hits. To evaluate our index structures, we conducted a simulation study on a database of 2D objects. The results show the superiority of our techniques as compared to naive indexing, with a minimum margin of 40% improvement in the I/O cost and orders of magnitude improvement in CPU cost.

This study is different from the work in computational geometry, pattern recognition and computer vision. In computational geometry, the main focus of the research is on developing algorithms to compare two given objects to see if they match or not. In pattern recognition and computer vision the goal is to recognize objects in a given scene (e.g., by using edge detection algorithms). In our work, the set of objects in the database is already well defined by their vector representations, either manually as a result of drawing software or automatically as a result of an object recognition algorithm.

A subset of MBC features could also be utilized to answer other query types such as spatial queries (e.g., topological and direction queries). For the example of multimedia applications, a sample query might be "find all the scenes in which an object (A) is on top of or near an object (B)". MBC approximations and features maintain the most important features of the objects (position and extension). Therefore, those features may be used to expedite the performance of spatial queries. Due to the lack of space, we do not discuss the support for spatial queries any further. For further discussion on support of spatial queries using MBC, please consult [1].

The remainder of this paper is organized as follows. Sec. 2 provides some background material on related work. In Sec. 3, we define the six MBC features that can be extracted for a 2D object. Sec. 4 provides the description of three different query types considered in this study. In Sec. 5, we explain in detail our three alternative index structures based on MBC features. Sec. 6 reports on the performance results obtained from a simulation experiment. Finally, Sec. 7 concludes the paper and provides an overview of our future plans.

* This research was supported in part by NASA/JPL contract nr. 961518, and unrestricted cash/equipment gifts from Intel, NCR, and NSF grants EEC-9529152 (IMSC ERC) and MRI-9724567.

0-7695-0253-9/99 $10.00 © 1999 IEEE
Figure 1. MBC Features. (a): Original object. (b): MBC of the original object with its center C and radius r; TPs are denoted by X's and SP by a bullet. (c): TPAS: [φ1, φ2, φ3, φ4]. (d): VAS: [φ1, φ2, φ3, φ4, φ5, φ6, φ7, φ8, φ9].

2. Related Studies

In this section, we focus on the related works in the database area in order to distinguish this paper from those studies. Towards this end, we categorize those works into two groups. The first group of studies [5, 4, 2, 3] investigates techniques to speed up the search for 2-D objects that are "similar" to a target object based on their shapes. In [5], two 2-D polygons (constructed by the union of rectangles) are considered similar if they have similar proportions of area. The major problem of their algorithm is that the representation is not unique and is very sensitive to small changes in the coordinates of the rectangles. Also, the algorithm is not invariant against rotations, and is limited to representing only polygon objects with sides parallel to the coordinate system. Therefore it cannot guarantee no false drops. For the work in [4], the similarity of two objects is based on the similarity of their curvature. While this approach is rotation invariant, its major problems are the need to approximate the curvature of polygons and the existence of a large index. Our similarity shape retrieval of objects depends only on the vertex angle sequence (AS) of the objects. We build an index on the f_c coefficients of the Fourier transform of the AS of the objects. Then we search in the index for the objects that are within an ε of the query object's Fourier transform. Our method is rotation, scaling, and translation invariant and also guarantees no false drops.

The second group [2, 3] focuses on speeding up the similarity search for 1-D sequences. In [2], an indexing method to process similarity search queries for 1-D sequences of real numbers (time sequences) in a sequence database was proposed. They reduce the high dimensionality of time sequences to a lower dimensionality space by using a data-independent method like the Discrete Fourier Transform of the time sequence elements. Subsequently, they use traditional spatial access methods, like the R*-tree, to index the objects. These studies can be considered as a complement to our work. This is because we use object features in order to reduce the dimensionality into single-dimension sequences, and then we can utilize their proposed techniques to speed up the search further.
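The dimensionality reduction borrowed from [2] can be illustrated with a minimal sketch. It assumes an orthonormal DFT (scaled by 1/sqrt(n)); the helper names `dft` and `dist` and the two sample angle sequences are hypothetical and are not taken from the paper. Because such a transform preserves Euclidean distance, the distance measured on the first f_c coefficients can never exceed the distance between the full sequences, so a range search on the truncated coefficients may return false hits but cannot miss a true match.

```python
import cmath
import math

def dft(seq):
    """Orthonormal DFT (scaled by 1/sqrt(n)) of a real sequence."""
    n = len(seq)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(seq))
        / math.sqrt(n)
        for k in range(n)
    ]

def dist(a, b):
    """Euclidean distance between two equal-length sequences (real or complex)."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(a, b)))

# Two hypothetical angle sequences in radians; each sums to 2*pi.
a = [1.0, 2.0, 0.5, 1.5, 2 * math.pi - 5.0]
b = [1.2, 1.8, 0.7, 1.4, 2 * math.pi - 5.1]

fc = 3  # number of leading coefficients kept (assumed value)
full = dist(a, b)                            # distance in the original domain
transformed = dist(dft(a), dft(b))           # equal to `full` by Parseval's theorem
truncated = dist(dft(a)[:fc], dft(b)[:fc])   # lower bound on `full`
```

Searching with a threshold ε on `truncated` therefore keeps every sequence whose full distance is within ε; the extra candidates it admits are the false hits that a later comparison step has to remove.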
3. MBC Features

In this section, we describe a set of six features that can be extracted from the objects' MBCs. These features are very important in our study since we use them both for indexing the objects and for expediting the comparison between two objects. The six MBC features that can be identified for each simple polygon object (1) are: r (the radius of MBC), C (the coordinates of the center of MBC), TP (the set of touch points on MBC), TPAS (touch points angle sequence), VAS (vertex angle sequence), and SP (the start point of the angle sequence). Fig. 1 provides an example of an object with its six identified features.

(1) A simple polygon is a polygon that has no pair of non-consecutive edges sharing a point. In the rest of this paper we refer to a simple polygon as a polygon.

Definition 3.1: Minimum Bounding Circle (MBC) of an object is the closed disk of smallest radius containing all the points in the object (see Fig. 1(b)).

Definition 3.2: Touch Points (TP) of an object is the ordered set of points of the object that lie on the object's MBC (see Fig. 1(b)).

Definition 3.3: Angle Sequence (AS) of an object is the sequence of angles contained between vectors connecting the center of the object's MBC and a subset of its points.

Definition 3.4: Start Point (SP) of an angle sequence is the point in the angle sequence that makes the smallest angle, counterclockwise, with the x-axis passing through the center of the object's MBC (see Fig. 1(b)).

4. Query Types

Here, we describe three alternative matching query types that utilize different subsets of the MBC features. With the first query type, we investigate whether two objects are exactly identical in shape (termed EM). With the second query type, we investigate whether two objects are identical in shape but rotated, scaled, translated, or any combination of the three (termed RST). Finally, with the third query type, we investigate whether two objects are similar in shape per our definition of similarity (termed SIM). For different query types, different MBC features are used to support the query. Later in Sec. 5, we describe three index structures on different subsets of an object's MBC features in order to expedite these query types.

4.1. Exact Match Shape Retrieval - EM

With this type of object retrieval we are interested in answering the query "find all objects in a set of objects that are exactly identical to the query object". Some necessary (but not sufficient) conditions for two objects to be identical are: they should have the same number of vertices (m), the same number of touch points (n_TP), identical TPAS, identical SP, and the same MBC size with identical radii (r) and centers (C).

4.2. RST Shape Retrieval

With this type of object retrieval we are interested in answering two types of queries: "find all objects in a set of objects that are exactly identical to the query object regardless of its size and orientation", and "find all objects in a set of objects that are exactly identical to the query object with a specified size and/or orientation". For the first query type of RST, it is important that the matching objects be invariant with respect to translation, rotation and scaling. To support the first type of RST queries, we use those MBC features of polygons that are translation, scaling and rotation independent. These are AS (either TPAS or VAS) and n_TP. For the second query type of RST, it is important that the matching objects be within a specified rotation angle, scaling factor, translation vector, or any combination of the three. We use the object's SP to check if it is rotated and the direction of rotation, the object's r to check if it is scaled, and the object's C to check if it is translated. For further discussion on the description of different types of exact match queries, please consult [1].

4.3. Similarity Shape Retrieval - SIM

With this type of object retrieval we are interested in answering the query "find all objects in a set of objects that are similar to the query object". There are many notions of similarity, which depend on the application domain. The application domain or the user may specify and describe the properties of the notion of similarity. For example, a property might be the invariance of similarity of objects under some kind of transformation such as scaling and rotation. This type of notion can be supported by our RST query types. Our notion of similarity is that two objects are considered similar if the difference between their VASs is within a specified error margin; this error depends on two parameters, β and γ, where γ is the percentage of the vertices of the original object that were changed, and β is the percentage of change in the angle of the vector connecting the center of the MBC of the object to one of the object's vertices (see Fig. 2). Note that our similarity search is also scaling, rotation, and translation invariant, since we are using VAS as a measure of similarity.

Figure 2. β and γ factors

5. Index Structures

We propose three types of index structures on different subsets of MBC features. The first structure, I_TPAS, indexes the objects according to their m, n_TP, and their TPASs. The second structure, I_VAS, uses the VAS of the objects for indexing purposes. Finally, the third index structure, I_TPVAS, is a hybrid of the first two structures, where it uses the information provided by both the vertices of the objects (VAS and m) and their TPs. For the purpose of comparison, we start by describing a naive index structure, I_Naive, which only uses the objects' m's for indexing purposes. The index only helps to identify a set of candidates that satisfy the query, i.e., we might have false hits but no false drops. This set is then filtered in two steps in order to eliminate all the false hits. In the first step, we use a subset of MBC features (i.e., SP, r, and C) to identify rotated, scaled and/or translated candidates. Finally, in the last step, the final candidates can be compared to the query object (O_q) by employing any known algorithm proposed in the area of computational geometry. Instead, we continue utilizing MBC features for the final comparisons. Towards this end, we employ our feature-based comparison algorithms, termed PC_EM, PC_RST, and PC_SIM, which are used for EM, RST, and SIM queries, respectively (see [1] for further details).

Figure 3. Index Structures. (a): I_Naive. (b): I_TPAS. (c): I_VAS (R*-tree on Fourier transform of VAS). (d): I_TPVAS.

5.1. I_Naive Index Structure

The naive index structure is simply a hash table built on the objects' m's (see Fig. 3(a)). The returned set of objects from the index, in response to a query, forms the candidate set, which trivially contains false hits. To eliminate the false hits, we apply a polygon comparison algorithm PC proposed in [6]. The complexity of the proposed algorithm to compare two objects O^i and O^j is O(m^8), where in our case m^i = m^j = m. This computation complexity does not account for examining scaling invariance; it only checks for translation and rotation invariance.

5.2. I_TPAS Index Structure

This index structure uses an object's m, n_TP, and TPAS to construct the index, and can be used to solve the EM and RST queries (see Fig. 3(b)). The I_TPAS index structure consists of three levels. At the first level, we index the objects according to their m's using a hash table. At the second level, we use a hash table based on their n_TP's. Finally, at the third level the objects with the same m and n_TP are indexed using a multi-dimensional index structure (R*-tree) according to their TPAS. To eliminate false hits, further filtration is done by comparing the final set of objects to O_q to check if they are exactly the same. The overall complexity for EM queries using the I_TPAS index structure is of order O(m), and for RST queries is of order O(n_TP * m^2).

5.3. I_VAS Index Structure

The I_VAS index structure uses the ASs of all the vertices of the objects to construct the index, and can be used to solve the EM, RST and SIM queries. It consists of a single R*-tree on the Fourier transform of the VAS of the objects (see Fig. 3(c)). Experiments [8] have shown that R*-tree based methods work well for up to 20 dimensions. In addition, in [2] it was shown that using f_c ≤ 3 is sufficient to index data sequences of sizes larger than 256 elements. Therefore, we use only the first f_c coefficients of the Fourier transforms of the VASs to both reduce and fix the dimensionality of the problem. See [1] for an algorithm to compute the DFT coefficients of an AS. The Euclidean distance between sequences is used as a measure of similarity of two sequences. By Parseval's theorem [7], the Euclidean distance is preserved under orthonormal transforms such as the DFT. This guarantees that we have no false drops, but note that we might produce some false hits. To eliminate the false hits, further filtration is done by employing our feature-based comparison algorithms. The overall complexity for EM, RST, and SIM queries using the I_VAS index structure is of order O(m), O(m^3), and O(m^3), respectively.

5.4. I_TPVAS Index Structure

This index structure is similar to the I_VAS index structure except that we use the extra information provided by n_TP. The I_TPVAS index structure can be used to solve the EM and RST queries. It consists of two levels. At the first level, we index the objects according to their n_TP. At the second level, we index the objects with the same n_TP using an R*-tree (see Fig. 3(d)). As in I_VAS, the R*-tree is built on the first f_c Fourier coefficients of the object's VAS. The overall complexity for EM and RST queries using the I_TPVAS index structure is of order O(m) and O(m^3), respectively.
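The two-level layout of I_TPVAS can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the class and function names are hypothetical, a plain list scan stands in for the R*-tree, the DFT is assumed to be orthonormal, and the handling of rotated start points and the final feature-based comparison are omitted.

```python
import cmath
import math
from collections import defaultdict

def vas_features(vas, fc=3):
    """First fc orthonormal DFT coefficients of a VAS, flattened to real numbers."""
    n = len(vas)
    feats = []
    for k in range(fc):
        c = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(vas)) / math.sqrt(n)
        feats.extend((c.real, c.imag))
    return feats

class TPVASIndexSketch:
    """Level 1: hash table on n_TP. Level 2: list scan (stand-in for an R*-tree)."""

    def __init__(self, fc=3):
        self.fc = fc
        self.buckets = defaultdict(list)  # n_TP -> [(feature vector, object id)]

    def insert(self, obj_id, n_tp, vas):
        self.buckets[n_tp].append((vas_features(vas, self.fc), obj_id))

    def candidates(self, n_tp, vas, eps):
        """Ids in the n_TP bucket whose truncated features lie within eps of the query's."""
        q = vas_features(vas, self.fc)
        return [obj_id for feats, obj_id in self.buckets.get(n_tp, [])
                if math.dist(feats, q) <= eps]

# Hypothetical objects: (id, number of touch points, VAS in radians).
index = TPVASIndexSketch()
index.insert("A", 3, [1.0, 2.0, 0.5, 1.5, 2 * math.pi - 5.0])
index.insert("B", 3, [1.2, 1.8, 0.7, 1.4, 2 * math.pi - 5.1])
index.insert("C", 4, [1.0, 2.0, 0.5, 1.5, 2 * math.pi - 5.0])

query_vas = [1.0, 2.0, 0.5, 1.5, 2 * math.pi - 5.0]
exact_hits = index.candidates(3, query_vas, eps=1e-9)
loose_hits = index.candidates(3, query_vas, eps=10.0)
```

Object C has the same VAS as the query but is never examined, because the hash on n_TP removes it before any distance is computed; this is the extra filtering that distinguishes the hybrid structure from a single tree over VAS features.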
6. Performance Evaluation

We conducted three sets of experiments to evaluate the performance of the proposed shape-retrieval index structures. With the first set of experiments we investigated the impact of changing the number of vertices of the objects on the performance of the proposed index structures. The second set of experiments was conducted to see how our methods scale up as the database size increases. Finally, with the third set of experiments, we focus on the SIM query type to investigate the sensitivity of our similarity measure to the γ and β parameters. For further discussion on the experimental setup and the set of data objects used for the experiments, please consult [1].

With the first set of experiments, we generated 21 different groups of objects, where each group consists of 100 objects with the same number of vertices (varied from 7 to 27 vertices). We submitted a total of 2100 RST queries: 100 queries per group, where each query is an RST query for one of the objects in that group. We measured the I/O and CPU costs to serve each query and averaged them within each group. Fig. 4 demonstrates the results, where the X-axis is the number of vertices of the 100 objects within a group. In Fig. 4(a), the Y-axis is the CPU cost. Since all three index structures outperformed I_Naive by a very wide margin, their performance curves are not distinguishable in Fig. 4(a). Therefore, in Fig. 4(b), we report the same result with I_Naive removed from the graph. Comparing Fig. 4(a) and 4(b), the CPU cost of I_Naive was in the order of 10^13, while for the other methods it decreased to the order of 10^4 (a 10^9 improvement in CPU cost). I_TPVAS and I_TPAS outperformed I_VAS by an average of 75% (see Fig. 4(b)). At first glance this seems counter-intuitive, because the VAS of an object is a superset of its TPAS and hence one expects to see more false hits for an index based on TPASs.

However, note first that I_VAS is based on the first three coefficients of the Fourier transform of the VAS and not on the ASs themselves. Second, the reason that I_TPVAS resulted in fewer false hits as compared to I_VAS is its use of the hash table on n_TP as an extra filter. An interesting observation is that as the number of vertices increases, the degradation in I_VAS performance is higher than that of I_TPVAS and I_TPAS. This is because the total summation of the angles in a VAS is fixed at 2π. With a higher number of vertices, 2π gets divided into smaller angles. Consequently, this results in more objects with nearly identical VASs, and retaining only the first few coefficients of the Fourier transform increases the number of false hits with I_VAS. Fig. 4(c) concludes the first set of experiments by reporting on the percentage of improvement in the I/O cost as compared to I_Naive as the number of vertices for each group of objects increases. The I_TPVAS index structure had the best I/O cost improvement, followed by I_TPAS and I_VAS. Both I_TPVAS and I_TPAS outperformed I_Naive consistently by at least 40%. The performance of I_VAS, on the other hand, degrades as we increase m, due to both the increase in the number of false hits, as explained before, and the increase in the number of queries required to examine rotation invariance.

With the second set of experiments, we fixed the number of vertices for all the objects in the database to 19. We varied the size of the database from 100 to 1000 objects (the X-axis of the graphs in Fig. 5). The number of RST queries is 100, and they query for the 100 objects that constitute the smallest database. The results demonstrate identical trends to those of the first set of experiments (see Fig. 5). Note that at first glance it looks as if I_VAS is not as sensitive to the change in the database size as it is to the change in the number of vertices when measuring I/O cost (see Fig. 5(c)). However, this is not true: the real reason that I_VAS seems to have a better performance in Fig. 5(c) is the high sensitivity of I_Naive to the increase in the database size. That is, since the number of vertices was fixed in this set of experiments, I_Naive operates as if there were no index structure, and hence its performance gets worse in direct proportion to the size of the database. Therefore, since the result in Fig. 5(c) uses I_Naive as its baseline, it seems as if I_VAS is improving. This was not the case in the first set of experiments, when we were increasing the number of vertices.
In sum, all three index structures outperformed I_Naive by a wide margin in both CPU and I/O costs. Furthermore, I_TPVAS consistently performed better than I_TPAS and I_VAS. Note that the percentage of improvement in the I/O cost increases as the database size grows with the I_TPAS, I_VAS, and I_TPVAS index structures (see Fig. 5(c)). This shows that all our methods scale up as the size of the database increases. Some retrieval results for RST queries are shown in Fig. 6. This figure shows the query object O_q and the objects retrieved using the I_TPAS, I_VAS, and I_TPVAS index structures. All three index structures successfully retrieved the set of 5 objects that satisfied the RST query (wrapped in boxes in Fig. 6). However, observe that the number of false hits is lower with I_TPVAS than with I_TPAS and I_VAS. I_VAS had the highest number of false hits. Note that the false hits from I_TPAS are different from those of I_VAS.

Figure 4. First set of experiments: fixing the database size and varying the number of vertices. (a): CPU cost of all four index structures. (b): CPU cost of I_TPAS, I_VAS, and I_TPVAS index structures. (c): Percentage of improvement in the I/O cost over I_Naive.

Figure 5. Second set of experiments: fixing the number of vertices and varying the database size. (b): CPU cost of I_TPAS, I_VAS, and I_TPVAS index structures. (c): Percentage of improvement in the I/O cost over I_Naive.

Figure 6. A visual example of a query. (a): Query object. (b): I_TPVAS (5 objects). (c): I_TPAS (13 objects). (d): I_VAS (16 objects).

Figure 7. The impact of changing β and γ with SIM. (a): Varying β and γ. (b): Varying β for fixed values of γ. (c): Varying γ for fixed values of β.

With the third set of experiments, we focused on the SIM query type to study the impact of β and γ on the distortion of an object. The objective of these experiments is to see how we can translate an error parameter provided by an application domain to the allowable variations in our β and γ parameters. By choosing an object from the database, we varied β and γ to generate distorted objects. The Euclidean distance between the Fourier coefficients of the VASs of the original object and a distorted object is considered as the error. We varied β from -50% to 50% with an increment of 4% and γ from 5% to 100% with an increment of 5%. Fig. 7(a) demonstrates the observed error when varying both γ and β. To report on the impacts, let us look at the less complex, two-dimensional versions of this graph. Fig. 7(b) shows the observed error when varying β and fixing γ. Each curve represents a fixed value of γ. The results indicate that as β increases or decreases from 0, for the same value of γ, the error increases. This graph can be used in the following fashion. Suppose that the application domain specifies that two objects can be considered similar as long as the error is under 14%. Utilizing Fig. 7(b), we can guarantee such an error rate by (say) fixing γ at 5% and allowing β to vary from -50% to 50%.

Fig. 7(c) shows the observed error when varying γ and fixing β. Each curve represents a fixed value of β. One expects that the larger the value of γ, the larger the error should be. However, Fig. 7(c) shows that this is not true. Observe that as we increase the value of γ, for a fixed β, we see a roller coaster effect for the error. This is because as we rotate more vertices (i.e., increasing γ), the distorted objects become more similar to the original. That is, in the extreme case, if we rotate all the vertices, the distorted object will be identical to the original object.
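The extreme case described above can be reproduced with a small sketch. It rests on several assumptions that are not taken from the paper: vertices are listed counterclockwise around a known MBC center, β is interpreted as a fraction of a full turn, the first round(γ·m) vertices are the ones rotated, and the error is the Euclidean distance between the first f_c orthonormal DFT coefficients of the two VASs. All function names and the sample polygon are hypothetical.

```python
import cmath
import math

def vas(vertices, center):
    """Angles at the center between consecutive vertices (assumed counterclockwise)."""
    cx, cy = center
    theta = [math.atan2(y - cy, x - cx) for x, y in vertices]
    m = len(theta)
    return [(theta[(i + 1) % m] - theta[i]) % (2 * math.pi) for i in range(m)]

def dft_head(seq, fc=3):
    """First fc coefficients of the orthonormal DFT of seq."""
    n = len(seq)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(seq)) / math.sqrt(n)
            for k in range(fc)]

def distort(vertices, center, beta, gamma):
    """Rotate the first round(gamma * m) vertices about the center by beta * 2*pi."""
    cx, cy = center
    count = round(gamma * len(vertices))
    a = beta * 2 * math.pi
    out = []
    for i, (x, y) in enumerate(vertices):
        if i < count:
            dx, dy = x - cx, y - cy
            x = cx + dx * math.cos(a) - dy * math.sin(a)
            y = cy + dx * math.sin(a) + dy * math.cos(a)
        out.append((x, y))
    return out

def error(vertices, center, beta, gamma, fc=3):
    """Distance between truncated VAS spectra of the original and distorted object."""
    p = dft_head(vas(vertices, center), fc)
    q = dft_head(vas(distort(vertices, center, beta, gamma), center), fc)
    return math.sqrt(sum(abs(u - v) ** 2 for u, v in zip(p, q)))

# Hypothetical polygon: three vertices on the unit circle, three inside it.
angles_deg = [0, 50, 120, 200, 260, 320]
radii = [1.0, 0.8, 1.0, 0.9, 1.0, 0.7]
polygon = [(r * math.cos(math.radians(a)), r * math.sin(math.radians(a)))
           for a, r in zip(angles_deg, radii)]

half_rotated = error(polygon, (0.0, 0.0), beta=0.02, gamma=0.5)
all_rotated = error(polygon, (0.0, 0.0), beta=0.02, gamma=1.0)
```

Under these assumptions `half_rotated` is clearly positive, while `all_rotated` is zero up to floating-point error: rotating every vertex by the same angle leaves all the angles between consecutive vertices unchanged, so the VAS, and hence the error, is the same as for the original object.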
7. Conclusion and Future Work

In this paper, we presented a multi-step query processing method that efficiently supports the retrieval of 2D objects by shape. We proposed three index structures, I_TPAS, I_VAS, and I_TPVAS, based on features that are extracted from the objects' MBC. Those features are C, r, TP, SP, TPAS and VAS. Our index structures support similarity and exact match queries while guaranteeing no false drops and reducing the number of false hits. Our approach recognizes rotation, uniform scale and translation invariance in the query objects by using a subset of the extracted features. Further optimization and final elimination of false hits are achieved by using efficient comparison algorithms that are based on our identified features. Our simulation results indicate that while all three index structures outperformed I_Naive by a wide margin, I_TPVAS is the superior index structure considering both I/O and CPU costs. We intend to extend this work to efficiently support similarity search queries of objects under different transformations. We also would like to study other approximation techniques that could represent objects with curved parts and with holes. Finally, we want to extend this work to support 3D objects.

References

[1] M. Safar and C. Shahabi. 2D Topological and Direction Relations in the World of Minimum Bounding Circles. University of Southern California Technical Report 99-701, ftp://usc.edu/pub/csinfo/tech-reports/papers/99-701.ps.Z.
[2] R. Agrawal, C. Faloutsos, and A. Swami. Efficient Similarity Search In Sequence Databases. FODO, 1993.
[3] C. Faloutsos, M. Ranganathan, and Y. Manolopoulos. Fast Subsequence Matching in Time-Series Databases. SIGMOD, 1994.
[4] S. Berchtold, D. Keim, and H.P. Kriegel. Using Extended Feature Objects for Partial Similarity Retrieval. VLDB, 1997.
[5] Jagadish H. V. A Retrieval Technique for Similarity Shapes. Proc. ACM SIGMOD Int. Conf. on Management of Data, 1991, pp. 208-217.
[6] Alt H., Blomer J. Resemblance and Symmetries of Geometric Patterns. Data Structures and Efficient Algorithms, in: LNCS, Vol. 594, Springer, 1992, pp. 1-24.
[7] A. V. Oppenheim and R.W. Schafer. Digital Signal Processing. Prentice Hall, Englewood Cliffs, N.J., 1975.
[8] M. Otterman. Approximate Matching with High Dimensionality R-trees. M.Sc. scholarly paper, Dept. of Computer Science, Univ. of Maryland, College Park, MD, 1992.
[9] S. Ghandeharizadeh. Stream-Based Versus Structured Video Objects: Issues, Solutions, and Challenges. In S. Jajodia and V. Subrahmanian, eds., Multimedia DB Systems: Issues and Res. Direct., Springer-Verlag, 1995.
[10] J.P. Eakins. Retrieval of trade mark images by shape feature. Proc. First International Conference on Electronic Library and Visual Information System Research, de Montfort University, 1994, pp. 101-109.