Indexing of Variable Length Multi-attribute Motion Data ∗ Chuanjun Li
Gaurav Pradhan
S.Q. Zheng
B. Prabhakaran
Department of Computer Science University of Texas at Dallas, Richardson, TX 75083 {chuanjun, gnp021000, sizheng, praba}@utdallas.edu
ABSTRACT
Haptic data such as 3D motion capture data and sign language animation data are new forms of multimedia data. Motion data is multi-attribute, and indexing of multi-attribute data is important for quickly pruning the majority of irrelevant motions so that real-time animation applications become practical. Indexing of multi-attribute data has previously been attempted for data with a few attributes by using an R-tree or its variants after dimensionality reduction. In this paper, we exploit the singular value decomposition (SVD) properties of multi-attribute motion data matrices to obtain one representative vector for each motion data matrix of dozens or hundreds of attributes. Based on this representative vector, we propose a simple and efficient interval-tree based index structure for motion data with large numbers of attributes. At each tree level, only one component of the query vector needs to be checked during searching, compared to all the components of the query vector that would be involved if an R-tree or one of its variants were used. The searching time is independent of the number of pattern motions indexed by the tree, making the index structure scalable to large data repositories. Experiments show that up to 91∼93% of irrelevant motions can be pruned for a query with no false dismissals, and that the query searching time is less than 30 µs in the presence of motion variations.
Keywords
indexing, dimensionality reduction, similarity, singular value decomposition, multi-attribute motion
1.
INTRODUCTION
Haptic data such as 3D motion capture data and sign language animation data are new forms of multimedia data. For instance, a gesture sensing device such as the CyberGlove, with dozens of sensors, generates multi-attribute data when a person wearing the glove moves his or her hand to produce sign language for the hearing impaired. Each sensor produces one reading at each sampling time, and the data corresponding to one sign (e.g., the sign for TV or Milk) forms a matrix of multiple columns, or attributes, and hundreds or thousands of rows, where each row corresponds to one sampling of all the sensors. In the case of 3D motion capture data, a motion such as walking or running is captured using dozens or even hundreds of sensors placed over different parts of the moving object; the resulting data can thus have hundreds of attributes corresponding to the sensors and thousands of rows of readings. For many of these multimedia applications, more than one value is generated at each time step, rather than the single value per time step of a time series data sequence. As a result, the multi-attribute data of one motion forms a matrix, rather than a multi-dimensional vector as in a time series sequence. For example, the multi-attribute motion data used in this paper are matrices, each of which represents one motion and has 22 columns and hundreds or thousands of rows. The matrices of motion data can be of variable lengths, because motions can be carried out at different speeds and different times, they can have different durations, and motion sampling rates may also differ. Hence there are no continuous row-to-row correspondences between data of similar motions, as shown in Figure 1.
For similar motions, corresponding attributes can have large differences in values, and different corresponding attribute pairs can have different variations at different times due to local speed differences. These properties of multi-attribute motion data make it difficult to index the data efficiently, and no existing indexing techniques can index multi-attribute motion data directly or efficiently. Indexing should help prune the majority of irrelevant data quickly for a query, and that is the purpose of our proposed index structure for multi-attribute motion data.
Categories and Subject Descriptors H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing—Indexing methods
General Terms
Algorithms

∗Work supported partially by the National Science Foundation under Grant No. 0237954 for the project CAREER: Animation Databases. Any opinions, findings, and conclusions or recommendations expressed in this work are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. MMDB’04, November 13, 2004, Washington, DC, USA. Copyright 2004 ACM 1-58113-975-6/04/0011 ...$5.00.
singular vectors of multi-attribute data provide the essential geometric structures of the matrices of multi-attribute motion data. Section 2 reviews related work, and Section 4 generates indexing vectors from the first right singular vectors. Section 5 proposes an interval-based tree index structure, followed by Section 6, which experimentally evaluates the pruning efficiency of the proposed index structure. Finally, Section 7 concludes the paper.
2.
RELATED WORK
For time series sequences, which are multi-dimensional vectors, multi-dimensional index structures such as the R-tree [9] or its variants [2, 11] have been used to index dimensionality-reduced data. Dimensionality reduction techniques include the Discrete Fourier Transform (DFT) [1, 7], the discrete Haar Wavelet Transform (DWT) [5, 6, 16], Piecewise Aggregate Approximation (APP) [12], Adaptive Piecewise Constant Approximation (APCA) [4], and Singular Value Decomposition (SVD) [3, 4, 12]. Any signal, or time series of length n, can be decomposed into n sine/cosine wave superpositions, and many of the waves, each represented by a Fourier coefficient, have very low amplitude and contribute little to reconstructing the original signal. DFT [1, 7] reduces the dimensionality of time series data by retaining only the first few coefficients. To reduce the dimensionality of time series data with DWT, Haar wavelet functions are used to represent the data in terms of sums and differences of a mother wavelet, and part of the wavelet coefficients can be truncated. One drawback of DWT is that the sequence length is required to be an integral power of two, which is too restrictive for practical scenarios. APP [12] approximates the data by segmenting the sequences into equi-length sections and indexes the mean values of the sections in a lower-dimensional space, while APCA [4] approximates each time series by a set of constant-value segments of varying lengths. To use SVD for dimensionality reduction, the SVD of a matrix composed of the pattern time series is computed for transforming a query series, and the transformed query series is truncated with only a few components retained. Although it requires all sequences to have equal lengths and its computation is costly if the time series are very long, SVD reduces dimensionality optimally for linear projections.
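As a concrete illustration of the DFT approach described above, the following minimal numpy sketch truncates a time series to its first few Fourier coefficients and reconstructs an approximation; the series, the coefficient count k, and the error bound are all illustrative choices, not values from the cited work.

```python
import numpy as np

# Dimensionality reduction of a length-n time series by keeping only the
# first k DFT coefficients, as in the DFT-based approaches cited above.
rng = np.random.default_rng(5)
n, k = 128, 8
series = np.cumsum(rng.standard_normal(n))      # a smooth-ish random walk

coeffs = np.fft.rfft(series)
truncated = np.zeros_like(coeffs)
truncated[:k] = coeffs[:k]                      # retain first k coefficients
approx = np.fft.irfft(truncated, n)

# For smooth series, the low-frequency coefficients capture most of the
# energy, so the k-coefficient approximation has small relative error.
err = np.linalg.norm(series - approx) / np.linalg.norm(series)
assert err < 0.5
```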
Among these techniques, APCA demonstrated its superiority compared with the others, while SVD can provide better pruning efficiency than other techniques and scales well with database size [12]. Equal-length multi-attribute sequences are considered in [10], where a CS-index structure is proposed for shift and scale transformations. The I/O overhead of the CS-index increases with the number of attributes, making the index structure non-scalable to sequences with many attributes, and similarity search over multi-attribute sequences of different lengths cannot be solved by the distance definitions and the CS-index as proposed in [10]. In [14], multi-attribute sequences are partitioned into subsequences, each of which is contained in a Minimum Bounding Rectangle (MBR). Every MBR is indexed and stored in a database using an R-tree or one of its variants. Partitioning a sequence involves the MBRs of other sequences in the database, making the proposed method non-scalable to large databases. If two sequences are of different lengths,
Figure 1: Data for Two Similar Motions. Similar motion data can have different lengths, and different corresponding attribute pairs can have different variations at different times; hence there are no row-to-row correspondences between data of similar motions, making it difficult to index multi-attribute motion data.

Proposed Approach: We assume that all motions, including the pattern motions to be indexed in a database and any query motions, have the same number of attributes. One important observation is that similar motions have similar geometric structures. Based on this observation, we propose to index multi-attribute matrix data according to the geometric structures of the data matrices. Since Singular Value Decomposition (SVD) exposes the geometric structures of motion data matrices, as explored in Section 3, the multi-attribute motion matrices are decomposed using SVD, and the resulting first right singular vectors are employed to index the corresponding matrices. The dimensionalities of the first right singular vectors are in turn reduced using SVD, which is practical due to their short, equal lengths. Based on the distribution of the reduced first singular vectors, or their negations, of similar motions, a simple and efficient interval-tree based index structure is proposed for indexing the reduced first singular vectors of the pattern motions in the database. After a query matrix has been transformed into a reduced singular vector, only one component of the reduced query vector needs to be checked at each tree level during searching, compared to all the components of the query vector that get involved with an R-tree or its variants. Experiments show that up to 91∼93% of irrelevant motions can be pruned for a query, that the query searching time can be less than 30 µs even with reasonably large variations in similar motions, and that the searching time is independent of the number of pattern motions in the database.
The rest of the paper is organized as follows. Section 3 gives the background information about SVD for its use in the index structure to be proposed, and shows that first right
the shorter sequence is compared with the other by sliding it from the beginning to the end of the longer sequence. This approach is not suitable for recognizing similar sequences of different lengths or similar sequences with different local speeds. Dynamic time warping (DTW) and the longest common subsequence (LCS) are extended to similarity measures for multi-attribute data in [19]. Before the exact LCS or DTW is performed, sequences are segmented into MBRs to be stored in an R-tree, and similarity estimates based on the MBR intersections are computed to prune irrelevant sequences. Both DTW and LCS have a computational complexity of O(wd(m + n)), where w is a matching window size, d is the number of attributes, and m, n are the lengths of the two data sequences. When w is a significant portion of m or n, the computation of DTW and LCS can be quadratic in the length of the sequences. This makes DTW and LCS non-scalable to large databases with long multi-attribute sequences. It has been shown in [19] that the index performance degrades significantly when the warping length increases. Even with as few as 20 MBRs per long sequence, the index space requirements can be about a quarter of the dataset size. In [18], a weighted-sum SVD is defined as below for measuring the similarity of two multi-attribute motion sequences.
such that A = U ΣV^T, where Σ = diag(σ1, σ2, . . . , σmin(m,n)) ∈ R^{m×n} and σ1 ≥ σ2 ≥ . . . ≥ σmin(m,n) ≥ 0. Here σi is the ith singular value of A in non-increasing order, and the unit column vectors ui and vi are the ith left singular vector and the ith right singular vector of A for i = 1, . . . , min(m, n), respectively. For similar motions with different lengths, the left singular vectors are of different lengths, but the right singular vectors are of equal length. The singular values of matrix A are unique, and the singular vectors corresponding to distinct singular values are uniquely determined up to sign; that is, a singular vector can have either of two opposite signs [17]. SVD exposes the geometric structure of a matrix A. If the multi-dimensional row vectors, or points, in A have different variations along different directions, the SVD of A finds the direction with the largest variation. Along the direction of the first right singular vector, the row vectors in A have the largest variation; along the direction of the second right singular vector, the variation is the second largest, and so on. The singular values reflect the variations along the corresponding right singular vectors. Figure 2 illustrates the data in an 18 × 2 matrix together with its first and second right singular vectors v1 and v2. The 18 points in the matrix have different variations along different directions, and the data have the largest variation along v1, as shown in Figure 2.
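The geometric role of the right singular vectors can be checked with a minimal numpy sketch (the 18 × 2 toy matrix below is illustrative, loosely echoing the example of Figure 2; numpy's SVD conventions are assumed):

```python
import numpy as np

# Toy "motion matrix": 18 rows (samples) x 2 attributes, with the rows
# spread mostly along one direction, as in the 18 x 2 example of Figure 2.
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, size=18)
A = np.column_stack([t, 0.2 * t + 0.05 * rng.standard_normal(18)])

# A = U diag(sigma) V^T; numpy returns the singular values in
# non-increasing order, and the rows of Vt are the right singular vectors.
U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
v1, v2 = Vt[0], Vt[1]

assert sigma[0] >= sigma[1] >= 0          # non-increasing singular values
assert np.isclose(np.linalg.norm(v1), 1)  # right singular vectors are unit
assert np.isclose(np.dot(v1, v2), 0)      # ... and mutually orthogonal
```

Here v1 points along the direction of largest variation of the rows of A, and sigma[0] measures that variation.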
Ψ(Q, Pk) = min(Ψ1(M1, M2), Ψ2(M1, M2))

where M1 = Q^T Q, M2 = Pk^T Pk,

Ψ1(M1, M2) = (Σ_{i=1}^{n} σi (ui · vi)) / (Σ_{i=1}^{n} |σi|),
Ψ2(M1, M2) = (Σ_{i=1}^{n} λi (ui · vi)) / (Σ_{i=1}^{n} |λi|),

ui and vi are the respective ith right singular vectors of M1 and M2, and σi and λi are the respective ith largest singular values of M1 and M2. The weighted-sum SVD similarity measure computes the inner products of all the corresponding singular vectors of the two matrices, weighted by their singular values, and takes the minimum of the two weighted sums as the similarity of the matrices. This definition includes some noise components, since we observed that not all corresponding singular vectors of similar motions are close to each other, as shown in Figure 4. Hence, not all the inner products of singular vectors need be considered in the definition of the distance measure. The inner products of singular vectors can be both positive and negative, and the weights can be the singular values of either matrix. Hence, it is very likely that the weighted sum can drop or jump sharply even if a query approximately matches some pattern. Linear scanning is used in [18] to search for a similar sequence, which is computationally expensive when the number of sequences in the database is very large. The data indexed in previous research have fewer than ten attributes, whereas our proposed index structure can handle dozens or hundreds of data attributes.
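The weighted-sum SVD definition above can be sketched directly in numpy; this is only an illustration of the formula as summarized here (the random test matrix is, of course, not real motion data):

```python
import numpy as np

def weighted_sum_svd(Q, P):
    # Sketch of the weighted-sum SVD measure of [18] as summarized above:
    # compare the singular vectors of M1 = Q^T Q and M2 = P^T P, weighting
    # the inner products u_i . v_i by either matrix's singular values.
    M1, M2 = Q.T @ Q, P.T @ P
    sig, Ut = np.linalg.svd(M1)[1:]   # rows of Ut: singular vectors u_i
    lam, Vt = np.linalg.svd(M2)[1:]   # rows of Vt: singular vectors v_i
    dots = np.einsum("ij,ij->i", Ut, Vt)          # u_i . v_i for each i
    psi1 = np.sum(sig * dots) / np.sum(np.abs(sig))
    psi2 = np.sum(lam * dots) / np.sum(np.abs(lam))
    return min(psi1, psi2)

rng = np.random.default_rng(1)
Q = rng.standard_normal((300, 5))
# Identical motions give psi1 = psi2 = 1, since every u_i . v_i = 1.
assert np.isclose(weighted_sum_svd(Q, Q), 1.0)
```

Because the inner products can be negative and sign choices of singular vectors are not unique, the sum can behave erratically for near-matches, which is the weakness the text points out.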
3.
Figure 2: Geometric Structure of Matrix Exposed by its SVD
4.
In this section, we transform multi-attribute motion data into representative vectors for indexing.
4.1
Representative Vector Generation for Indexing
As in many similarity measure settings, motion matrices should have similar geometric structures if the corresponding motions are similar. Since the geometric similarity of matrix data can be captured by SVD, we propose to exploit SVD to generate representative vectors for motion matrices and to use these representative vectors as the basis for indexing the multi-attribute motion data. The motion patterns to be indexed in a database are assumed to be represented by different matrices Pk, k = 1, 2, . . . , p, where p is the total number of motion patterns in a
BACKGROUND INFORMATION OF SVD
In this section, we give the definition and geometric interpretation of SVD for its application to the indexing of multi-attribute motion data. As proved in [8], for any real m × n matrix A, there exist orthogonal matrices U = [u1, u2, . . . , um] ∈ R^{m×m} and V = [v1, v2, . . . , vn] ∈ R^{n×n}
INDEXING VECTOR GENERATION
Figure 4: The second singular vectors u2 and v2 for similar motions by two different subjects. u2 · v2 = 0.57. The large variations in the second singular vectors show that the second singular vectors might not be close to each other due to the variations in the similar motions.
Figure 3: Inner Products of the First Singular Vectors of Similar Motions and Those of Dissimilar Motions

database. Let Q be the query motion. If the motions represented by Q and Pk are similar, their corresponding first right singular vectors u1 and v1 should be mostly parallel to each other geometrically, so that |u1 · v1| = |u1||v1|| cos(θ)| ≈ |u1||v1| = 1, where θ is the angle between the two right singular vectors u1 and v1, and |u1| = |v1| = 1 by the definition of SVD. As Figure 3 shows, |u1 · v1| is very close to one when two motions are similar to each other, and is mostly less than 0.95 when two motions are not similar. This indicates that the first right singular vector of one motion, or its negation, is close to the first right singular vector of a similar motion, while the first right singular vectors of two dissimilar motions are most likely different from each other. Other corresponding singular vectors may not be close to each other even if two motions are similar, as shown in Figure 4. This immediately suggests that right singular vectors other than the first ones might not reflect the similarity of two similar motions and cannot be used to index multi-attribute motions efficiently, while the first right singular vectors can. It is worth noting that for motions to be similar, other information should also be considered; for example, their corresponding singular values should be proportional to each other, as shown in [15]. Although closeness of the first right singular vectors is only a necessary condition for similarity, it is sufficient for indexing purposes given a certain variation tolerance, as will be shown in Section 6. Hence, given multi-attribute motion data, we apply SVD to obtain the first right singular vectors and use them as the representative vectors for indexing.
For convenience, we will refer to right singular vectors as singular vectors, and consider them as transposed row vectors hereafter.
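The |u1 · v1| test can be sketched on synthetic data; the generator below is a stand-in for real sensor data (a dominant direction plus noise, with 22 attributes as in the paper's data), and the 0.95 threshold is the one quoted in the text:

```python
import numpy as np

rng = np.random.default_rng(2)
d_a = np.eye(22)[0]          # dominant direction of "motion A" (toy data)
d_b = np.eye(22)[1]          # dominant direction of "motion B" (toy data)

def make_motion(direction, frames, noise=0.05):
    # Rows vary mostly along `direction`, mimicking how real motion data
    # varies mostly along its first right singular vector.
    amp = rng.standard_normal(frames)
    return np.outer(amp, direction) + noise * rng.standard_normal((frames, 22))

def first_right_singular_vector(motion):
    # Motions of different lengths still yield vectors of length 22.
    return np.linalg.svd(motion, full_matrices=False)[2][0]

u1 = first_right_singular_vector(make_motion(d_a, 400))   # motion A
v1 = first_right_singular_vector(make_motion(d_a, 150))   # similar, shorter
w1 = first_right_singular_vector(make_motion(d_b, 300))   # dissimilar

assert abs(np.dot(u1, v1)) > 0.95    # similar motions: |u1 . v1| near 1
assert abs(np.dot(u1, w1)) < 0.95    # dissimilar motions: well below 1
```

Note the absolute value: it absorbs the sign non-uniqueness of singular vectors discussed later in Section 4.3.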
4.2
Figure 5: Component Distributions of the First Singular Vectors

Figure 5 illustrates the component distributions of the first singular vectors for six sets of motions. Motion 1 is similar to motion 2, and both are different from the other four motions. Motion 3 is similar to motion 4, and motion 5 is similar to motion 6; motions 3 and 4 are different from motions 5 and 6. Since the norm of a singular vector is 1 by the definition of SVD, the component values of the first singular vectors are between -1 and 1. It can be seen from Figure 5 that although the first singular vectors are close to each other for similar motions, they can have small variations at any component. Although the first singular vectors of dissimilar motions are mostly different from each other, some corresponding component values can be very close. All these distribution properties of the first singular vectors make the R-tree and its variants inapplicable to the first singular vectors. The zigzag variation of the first singular vectors illustrated in Figure 5 indicates that APCA [4], the superior time series dimensionality reduction technique, is not applicable to the first singular vectors either. Since they all have equal lengths, or dimensions, the first singular vectors form a singular vector matrix A if we put these row vectors together. The number of attributes needed to capture motions is usually
Dimensionality Reduction of Representative Vectors
The lengths, or dimensions, of the first singular vectors of multi-attribute motion data are usually greater than 15 (the length is 22 for our experimental data), and the performance of R-trees and their variants degrades significantly when the dimensionality is larger than 15. In order to index the first singular vectors efficiently, dimensionality reduction needs to be performed on them first.
small, say dozens or at most several hundred, making SVD the best dimensionality reduction technique, taking into consideration that SVD is optimal for linear projections. Let A be the matrix composed of the first right singular vectors of the motions to be indexed, and

A = W ΣZ^T    (1)

components have the largest variance among all components as shown in Figure 6, the first components would not cluster around 0. Reduced singular vectors whose first components are close to 0 are only a small fraction of all reduced singular vectors, so including both those reduced vectors and their negations in the index would increase the actual number of vectors to be indexed by only a small fraction. All reduced singular vectors are considered to have been negated if necessary from now on.
then AZ = W Σ gives the transformed first singular vectors of the motion patterns in the coordinate system spanned by the column vectors of Z [13], and for a singular vector u1 of a query motion, u1Z gives the corresponding transformed singular vector in the same system. By the singular value decomposition, the component variations of the transformed first singular vectors are the largest along direction z1 and decrease along directions z2, . . . , zn, as shown in Figure 6. The transformed first singular vectors in Figure 6 are for the same six motions as shown in Figure 5. Unlike the first singular vectors in Figure 5, the first several components of the transformed vectors have large variations, while the remaining components become very small and approach zero. The differences among the first singular vectors are optimally reflected in the first several dimensions of the transformed vectors, hence we can index the transformed first singular vectors by keeping only their first several components. Differences among all the other components are small even if motions are different, so those components can be ignored and the dimensionality reduced to the first several components. The reason for this behavior is that Z gives the directions along which the variations of the singular vectors in A decrease from the first column vector to the last column vector of Z. AZ projects the singular vectors in A along the directions of the column vectors of Z, and u1Z projects the query singular vector u1 along the same directions. Hence, the transformed vectors in AZ have the property that the variations among them decrease from the first component to the last, and the first r components optimally approximate the differences of the original singular vectors. Hence, only the first r components of the transformed vectors need to be kept, and all the remaining components can be truncated.
We refer to the transformed first singular vectors with reduced dimensionality as the reduced first singular vectors.
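The transformation of equation (1) and the truncation to r components can be sketched as follows; the row count p, the dimension n = 22, and r = 6 are illustrative values, and the rows of A stand in for real first singular vectors:

```python
import numpy as np

# A: p x n matrix whose rows are the first right singular vectors of the
# p pattern motions (n = 22 attributes in the paper's data).
rng = np.random.default_rng(3)
p, n, r = 50, 22, 6
A = rng.standard_normal((p, n))
A /= np.linalg.norm(A, axis=1, keepdims=True)   # rows are unit vectors

# A = W Sigma Z^T; the columns of Z are the directions along which the
# variation of the rows of A decreases.
W, Sigma, Zt = np.linalg.svd(A, full_matrices=False)
Z = Zt.T

reduced_patterns = (A @ Z)[:, :r]   # = (W Sigma)[:, :r], one row per pattern

def reduce_query(u1):
    # Transform a query's first singular vector into the same coordinate
    # system and keep only the first r components.
    return (u1 @ Z)[:r]

q = reduce_query(A[0])
assert np.allclose(q, reduced_patterns[0])   # query matches its own pattern
assert np.linalg.norm(q) <= 1.0 + 1e-9       # truncation of a unit vector
```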
4.3
Figure 6: Component Distributions of the Transformed First Singular Vectors
4.4
Effect of Large Component Variations
We now make some important observations. As Figure 6 shows, if two motions are different, their corresponding first few transformed components are very likely to be different, so the motions can very likely be separated. If two motions are similar, the first components of their reduced first singular vectors are very close to each other, while the other components can have relatively larger differences. The variations in the corresponding components, especially the very first few, should be within a small range for similar motions. If some corresponding components are quite different from each other, the two motions cannot be similar, even if the Euclidean distance between the two reduced vectors is small. The Euclidean distance between the reduced vectors of motion 1 and motion 2 is 0.051, while the Euclidean distance between the reduced vectors of motion 1 and motion 3 is 0.048, as shown in Figure 7. Due to the approximation caused by dimensionality reduction, the retained vectors of more similar motions can have a somewhat larger Euclidean distance than the retained vectors of less similar motions. If we used the Minimum Bounding Rectangle (MBR) concept as in an R-tree, motions 1 and 3 would be more likely to be grouped together than motions 1 and 2, although motions 1 and 2 are more similar to each other than motions 1 and 3. Hence the R-tree cannot be used to index the reduced first singular vectors of multi-attribute motions efficiently. The main reason is that two motions cannot be similar when any of the very first corresponding components of their reduced first singular vectors are quite different from each other, no matter how close all the other components are. We refer to this as the pruning effect of large component variations.
Negation of Reduced Singular Vectors
Due to the sign non-uniqueness of singular vectors, the reduced singular vectors of similar motions can have opposite signs. If the first component of a reduced first singular vector is negative, we negate all components of the vector, then similar motions will have similar reduced vectors after negation. If the first component of the reduced singular vector is close to zero, small variation in the first component of a similar motion may hide the opposite sign of the reduced singular vector, and negation according to its first component might not solve the sign non-uniqueness issue. In this case, the reduced vector and its negation are both included in the database in order not to miss similar motions for a query with the first component of reduced singular vector close to zero. Since the first singular vectors are unit vectors which have norms of one, the norms of the reduced first singular vectors are less than one. Hence the first components of the reduced singular vectors range over (-1,1). Because the first
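The sign-handling rule above can be sketched as a small helper; the closeness threshold `eps` is a hypothetical parameter for illustration, since the text does not specify how "close to zero" is decided:

```python
import numpy as np

def canonicalize(v_reduced, eps=0.05):
    # Negate the reduced vector when its first component is clearly
    # negative; when the first component is within eps of zero, keep both
    # the vector and its negation so no similar motion is missed.
    # (eps is an illustrative threshold, not a value from the paper.)
    if v_reduced[0] < -eps:
        return [-v_reduced]
    if v_reduced[0] > eps:
        return [v_reduced]
    return [v_reduced, -v_reduced]    # ambiguous sign: index both

a = np.array([-0.7, 0.2, 0.1])
b = np.array([0.01, -0.5, 0.3])
assert len(canonicalize(a)) == 1 and canonicalize(a)[0][0] > 0
assert len(canonicalize(b)) == 2     # near-zero first component: both signs
```

As the text notes, the ambiguous case affects only a small fraction of vectors, so the index grows only slightly.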
5.1
For each level i of the index tree, we have a tolerance εi which accounts for the variation in the ith component of the reduced first singular vectors v1^r of similar motions. Let the root node of the tree be T. The searching process is performed as follows.
• Query Processing: Given a query motion, we first compute its SVD and obtain its first singular vector u1. We then compute the transformed vector u1Z, where Z is the SVD component of the matrix A composed of the first singular vectors of the pattern motions, as in (1). Truncate u1Z so that its dimensionality is r, let u1^r be the reduced u1Z, and let ci be the ith component of u1^r.
Figure 7: Pruning Effect of Large Component Variations
5.
Searching
• Subtree Searching: If T is a non-leaf node, find all entries whose Ii's overlap with [ci − εi, ci + εi]. For each overlapping entry, search the subtree whose root node is pointed to by the cp of that entry.
INDEX TREE CONSTRUCTION
After generating the vectors for indexing, we construct an index tree structure as follows. Let r be the number of retained dimensions of the reduced first singular vectors, r < n. We designate one level of the index tree to each of the r dimensions. Let level 1 be the root node, let level i contain the nodes for dimension i, i = 1, 2, . . . , r, and let level r + 1 contain the leaf nodes. Leaf nodes contain unique motion identifiers Pk, and non-leaf nodes contain entries of the form
• Leaf Node Searching: If T is a leaf node, return all the motion pattern identifiers Pk contained in T.

Let Ii be the intermediate entry interval at level i; then ⌈2εi/Ii⌉ entries per node traversed at level i would be covered by the searching range, or ⌈2εi/Ii⌉ nodes would be traversed at level i + 1. Traversing a tree of r + 1 levels thus takes O(⌈2εi/Ii⌉^r) time. We construct the index tree such that ⌈2εi/Ii⌉ ≤ 3 and r < 10. Since each level has -1 and 1 as its lower and upper value bounds, the number of entries in a node and the entry intervals Ii are bounded and are independent of the number of pattern motions indexed by the tree. The width 2εi accounts for the variation tolerance at level i and is also independent of the number of pattern motions in the index. Hence the searching ranges, or the numbers of searched entries for a query, are bounded for similar motions at each level and are independent of the number of pattern motions. Because the number of tree levels r + 1 is bounded and is usually less than 10, the searching time is independent of the number of patterns indexed by the tree, making the index structure scalable to large numbers of multi-attribute motions. As will be shown in Section 6, the CPU time for searching is quite small for the 7-level index trees in our implementation. Figure 8 illustrates the first three levels of an example index tree. The root node at level 1 has four entries, each of which has a child node at level 2. Each node at levels 2 and 3 has three entries, each of which has a child node at the next lower level. Given a query u1^r = (0.65, 0.15, -0.1, . . .), let ε1 = 0.04 and εi = 0.08 for i ≥ 2. Entries at the root node are checked against [0.65 − 0.04, 0.65 + 0.04] = [0.61, 0.69]. Only the third entry overlaps with it, so the query u1^r is forwarded only to node n3 at level 2. At level 2, the query searching range is [0.15 − 0.08, 0.15 + 0.08] = [0.07, 0.23].
The second and third entries of node n3 overlap with the query searching range [0.07, 0.23], hence the query is forwarded to nodes n5 and n6 at level 3. At level 3, the query searching range is [−0.1 − 0.08, −0.1 + 0.08] = [−0.18, −0.02]. Only the second entries of nodes n5 and n6 overlap with this range, so the nodes pointed to by those entries are traversed. The search continues until the leaf nodes are reached, and all contained pattern identifiers Pk are returned as the result of the query.
(Ii, cp)

where Ii is a closed interval [a, b] describing the component value ranges of the reduced first singular vectors at level i, −1 ≤ a < 1, −1 < b ≤ 1, and cp is the address of one child node in the tree. The index tree has the following properties:

• The Ii's of sibling entries do not overlap except at their shared endpoints. The interval width is the same for all entries of the same node except the first and last entries at a level, and can differ for nodes at different levels.

• The first entry has Ii = [−δ, b1] at level 1 and Ii = [−1, b] at all other non-leaf levels. The last entries have Ii = [a, 1] at all non-leaf levels, where δ is some small positive value, 0 < b1 < 1, −1 < b < 0, and 0 < a < 1.

• The interval width of Ii is smaller at level 1 than those of the Ii's of non-leaf nodes at all other levels.

• The interval widths at different levels are determined according to the possible variations in the component values of the reduced first singular vectors of similar motions.

• Since singular vectors are unique up to sign, the reduced first singular vectors of two similar motions can have opposite signs. We negate all the vector components if the first component of the reduced vector is negative. If the first component of a reduced singular vector is close to zero, both the reduced vector and its negation are included in the index tree with the same identifier Pk.
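The tree layout and the level-by-level interval search can be sketched in plain Python; the node class, the two-level toy tree, and the pattern identifiers are illustrative, with the intervals and the query (0.65, 0.15) chosen to match the Figure 8 walkthrough in the text:

```python
class Node:
    def __init__(self, boundaries, children):
        # boundaries: sorted interval endpoints covering one level's range,
        # e.g. [-1, -0.2, 0.2, 1] -> entries [-1,-0.2], [-0.2,0.2], [0.2,1].
        self.boundaries = boundaries
        self.children = children          # one child (or leaf list) per entry

def search(node, query, tol, level=0):
    # Collect pattern ids whose reduced vectors fall within +/- tol[i]
    # of the query's i-th component at every level of the tree.
    if isinstance(node, list):            # leaf bucket of pattern ids Pk
        return set(node)
    lo, hi = query[level] - tol[level], query[level] + tol[level]
    result, b = set(), node.boundaries
    for j, child in enumerate(node.children):
        if b[j] <= hi and b[j + 1] >= lo:             # entry overlaps range
            result |= search(child, query, tol, level + 1)
    return result

# A tiny two-level tree (r = 2) modeled on Figure 8: level 1 splits the
# first component, level 2 the second; leaves hold pattern identifiers.
sub = lambda a, b, c: Node([-1, -0.2, 0.2, 1], [[a], [b], [c]])
tree = Node([-0.2, 0.3, 0.5, 0.7, 1.0], [
    sub("P1", "P2", "P3"), sub("P4", "P5", "P6"),
    sub("P7", "P8", "P9"), sub("P10", "P11", "P12"),
])

# Query (0.65, 0.15) with tolerances (0.04, 0.08): only [0.5, 0.7]
# overlaps [0.61, 0.69] at level 1, and entries [-0.2, 0.2] and [0.2, 1]
# overlap [0.07, 0.23] at level 2, as in the text's walkthrough.
hits = search(tree, (0.65, 0.15), (0.04, 0.08))
assert hits == {"P8", "P9"}
```

Since only one query component is compared per level and the number of overlapping entries per node is bounded, the cost is independent of how many patterns the leaves hold.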
Figure 8: An Index Tree Example Showing Three Non-leaf Levels. Bold lines show how an example query traverses the nodes.
5.2 Similarity Computation

After the index tree has been searched for a query, the majority of irrelevant motions have been pruned, and the similar motions, together with a small number of irrelevant motions, are returned as the result of the query. To find the motion most similar to the query, the similarity measure defined in [15], shown below, can be used to compute the similarity between the query and every returned motion; the motion with the highest similarity is the one most similar to the query:

Ψ(Q, Pk) = |u1 · v1| × (~σ · ~λ − η)/(1 − η)    (2)

where u1 and v1 are the first singular vectors of Q and Pk, respectively, ~σ = σ/|σ|, ~λ = λ/|λ|, and σ and λ are the vectors of the singular values of QᵀQ and PkᵀPk, respectively. The weight parameter η ensures that ~σ and ~λ contribute to the similarity measure to a degree similar to u1 and v1, and is determined by experiments; η can be set to 0.9 for the multi-attribute motion data.

6. PERFORMANCE EVALUATION

In this section, we evaluate the performance of the index tree in pruning non-promising motions for a query. Let Npr be the number of irrelevant motions pruned for a query by the index tree, and Nir be the total number of irrelevant motions in the database. We define the pruning efficiency P as

P = (Npr / Nir) × 100%

6.1 Motion Data Generation

The motion data was generated by using a CyberGlove, a fully instrumented glove with 22 sensors, as shown in Figure 9, that provides 22 high-accuracy joint-angle measurements representing the angles of the physical hand joints at different parts of a hand. When signs (mostly American Sign Language signs) are performed, each sensor emits normalized joint-angle values at a rate of about 120 samples per second. The angle data from all sensors depict the shape of the glove at each sampling time, and continuous shape changes show the motion of the glove; hence we consider the angle measurements from the sensors as (1D) motion data. One hundred different motions were carried out; each motion has a different duration, and the resulting motion data matrices have different lengths, ranging from about 200 to about 1500 rows. Each motion was repeated three times, so that each motion has two other similar motions with different motion variations and different durations. The data matrix of one motion can be more than twice the length of the data matrix of a similar motion. Each motion data matrix is given a unique pattern identifier Pk.

Figure 9: 22 Sensors in a CyberGlove

6.2 Index Structure Building

The index structure is built based on the distribution properties of the reduced singular vectors of the multi-attribute motion data. To test the performance of the index tree, indexes with different tree configurations have been built. Ii, r, and εi are determined as follows. The component value interval Ii is set to around 2εi so that high pruning efficiency can be obtained with quick query searching. The reason is twofold. Level i of the index tree contains nodes for dimension i, 1 ≤ i ≤ r. If the width of Ii is much smaller than 2εi, the search has to traverse multiple sibling nodes rather than a couple of nodes at level i + 1, and the subsequent traversal of the subtrees may cost much more searching time than necessary. On the other hand, if the width of Ii is much larger than 2εi, fewer irrelevant motions are pruned: although searching time can be saved by searching only one node at level i + 1, more motion data is included in the leaf nodes of the subtrees of an entry with a large component value interval Ii. We observed that 5 dimensions are sufficient to prune irrelevant motions and give high pruning efficiency. The transformed first singular vectors are thus truncated to dimensionality r = 5, and the index tree can have a height of only 6, including the leaf node level.

Figure 10 and Figure 11 show that for similar motions, the variations in the first components of the reduced first singular vectors are mostly within 0.04, and the other component variations are largely within 0.10. Taking into consideration that the variations of the reduced singular vectors of similar motions at dimension 1 are smaller than those at other dimensions, we build an index tree whose root node has more child nodes than any other non-leaf node. Let L be the number of levels of the index tree, L = r + 1, and let Ni be the number of entries a node at level i has, i ≥ 1. In our implementation, we have Ni = N2 and εi = ε2 for i > 2.
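Under a configuration of this shape (L = r + 1 levels, N1 entries at the root, Ni = N2 entries elsewhere), building and populating the interval tree might look like the following sketch. The function names and the equal-width interval split are illustrative assumptions; the paper's actual trees give the first and last entries at each level slightly different widths:

```python
# Illustrative sketch of building the interval-based index tree.
# Equal-width splitting of [-1, 1] is an assumption, not the paper's exact layout.

def build_tree(level, num_levels, fanouts):
    """Build an empty index tree: non-leaf level `level` partitions [-1, 1]
    into fanouts[level] non-overlapping entry intervals."""
    if level == num_levels - 1:                     # leaf level holds pattern ids
        return {"leaf": True, "patterns": []}
    n = fanouts[level]
    width = 2.0 / n                                 # equal split of [-1, 1]
    entries = []
    for j in range(n):
        a, b = -1.0 + j * width, -1.0 + (j + 1) * width
        entries.append(((a, b), build_tree(level + 1, num_levels, fanouts)))
    return {"leaf": False, "entries": entries}

def insert(tree, vec, pattern_id, level=0):
    """Route a reduced first singular vector to its leaf and record Pk there."""
    if tree["leaf"]:
        tree["patterns"].append(pattern_id)
        return
    for (a, b), child in tree["entries"]:
        if a <= vec[level] <= b:                    # vec falls in this entry's interval
            insert(child, vec, pattern_id, level + 1)
            return
```

For example, with r = 5 the tree has L = 6 levels, and build_tree(0, 6, [12, 7, 7, 7, 7]) corresponds to the N1 = 12, N2 = 7 configuration used below.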
Figure 10: Variations in First Components of Reduced Singular Vectors

Figure 11: Variations in the Second to Sixth Components of Reduced Singular Vectors

6.3 Pruning Efficiency

We tested the pruning efficiency of the index tree by using different tree configurations and different variation tolerances ε1 and ε2 for similar motions. We built trees with 8, 10, and 12 entries at level one, and 5, 6, and 7 entries per node at the other non-leaf levels, respectively. ε1 is set to 0.04∼0.1, and ε2 is set to 0.08∼0.2. For each test case, we issued 300 queries and computed the average pruning efficiency, as shown in Figure 12. When ε2 increases while the other parameters remain the same, more entries are covered at the second and higher levels, and larger searching ranges or more nodes have to be traversed, resulting in decreasing pruning efficiency, as shown in Figure 12.

When ε1 is set to 0.04, more than 98% of queries return all three similar motions, including one exact match. About 2% of queries return only the two most similar motions, including one exact match; the similar motions with the largest variations are pruned, i.e., falsely dismissed. In this case, more than 94% of irrelevant motions are pruned when 6 dimensions are retained (L = 7), and about 91% are pruned when 5 dimensions are retained (L = 6), as shown in Figure 12. When ε1 is set to 0.1, all queries return all three similar motions, and there are no false dismissals regardless of the motion variations. In this case, the pruning efficiency can be as high as 93% when N1 = 12 and N2 = 7. If we decrease the number of entries per node so that N1 = 10 and N2 = 6, the pruning efficiency is still as high as 91%. If ε1 < 0.04 or ε2 < 0.08, there would be false dismissals of similar motions with large variations. With ε1 = 0.1 and ε2 = 0.08, there are no false dismissals and the pruning efficiency can be up to 91∼93%.

Figure 12: Pruning Efficiency under Different Tree Configurations and Searching Ranges

6.4 Computational Efficiency

We measured the average CPU time spent on 300 queries using the index tree to prune irrelevant motions, and compared it with the CPU time for the same 300 queries by linear scan. All experiments were performed on a computer with Linux OS and one 1.4 GHz Intel Pentium 4 processor. On average, 1.25 ms is required for computing the SVD of a query matrix, and about 1 µs is required for computing the similarity of a query and one pattern after the SVD of the query matrix has been computed. About 300 µs are required to find the most similar motion of a query among all 300 patterns if linear scan is used. Dimensionality reduction of a query's first singular vector takes about 17 µs, and the CPU time for searching the index tree is shown in Figure 13. The CPU time increases, as expected, when ε2 increases. When only the most similar motions are expected for the queries, searching takes less than 10 µs with ε1 = 0.04 and ε2 = 0.08. The search time is less than 30 µs with ε1 = 0.1, with which all similar motions are returned for the 300 queries. Since the searching time is independent of the number of pattern motions, dimensionality reduction of a query's first singular vector plus the subsequent index-tree search takes less than 50 µs, and as many as 93% of irrelevant motions can be pruned. Hence, less than 80 µs are required for reducing the first singular vector of the query matrix, searching the index tree, and computing the similarities of the query and the returned patterns. This compares with about 300 µs required for a linear scan of the 300 patterns. It should be noted that the time taken by linear scan is proportional to the number of pattern motions: linear scan would take several milliseconds to compute the similarities between a query and several thousand pattern motions, while searching the index tree takes less than 30 µs no matter how many motion patterns are indexed in the tree.

Figure 13: CPU Time under Different Tree Configurations and Searching Ranges

7. CONCLUSIONS AND DISCUSSIONS

In this paper, we considered the problem of querying by example on a repository of multi-attribute motion data, focusing on developing an efficient indexing structure for this problem. We showed that the geometric structures exposed by the SVD of matrices of multi-attribute motion data can be exploited to develop an efficient index tree structure. Based on these observations, we proposed an efficient interval-tree based index structure that uses the component values of the reduced first singular vectors; one level of the index tree is designated to each dimension of the reduced singular vectors. The searching time is independent of the number of motions indexed by the tree, making the proposed index structure well scalable to large data repositories. After searching the index tree for a query Q, the similarity of Q and the returned patterns Pk can be computed by similarity measure (2) [15]. Experiments on real-world data generated with a CyberGlove show that the proposed index tree can prune up to 91∼93% of irrelevant motions on average with no false dismissals, and that a query search takes less than 30 µs. Hence, the proposed indexing technique can also be used for real-time multi-attribute motion data comparison.

The proposed index structure is based on the geometric structures revealed by the SVD of the motion data matrices. As long as the data sequences have geometric structures such that the variances along the directions of the first right singular vectors are much larger than those along other directions, the proposed index structure should be applicable to the sequences. Since typical multimedia applications do not generate random data, we believe the proposed index structure is applicable to typical multimedia applications.

8. REFERENCES

[1] R. Agrawal, C. Faloutsos, and A. Swami. Efficient similarity search in sequence databases. In Proceedings of the 4th Conference on Foundations of Data Organization and Algorithms, pages 69–84, October 1993.
[2] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger. The R*-tree: an efficient and robust access method for points and rectangles. In Proceedings of ACM SIGMOD Int'l Conference on Management of Data, pages 322–331, May 1990.
[3] V. Castelli, A. Thomasian, and C.-S. Li. CSVD: Clustering and singular value decomposition for approximate similarity search in high-dimensional spaces. IEEE Transactions on Knowledge and Data Engineering, 15(3):671–685, 2003.
[4] K. Chakrabarti, E. J. Keogh, S. Mehrotra, and M. J. Pazzani. Locally adaptive dimensionality reduction for indexing large time series databases. ACM Transactions on Database Systems, 27(2):188–228, 2002.
[5] K.-P. Chan and A. W.-C. Fu. Efficient time series matching by wavelets. In Proceedings of the 15th International Conference on Data Engineering, pages 126–133, March 1999.
[6] K.-P. Chan, A. W.-C. Fu, and C. Yu. Haar wavelets for efficient similarity search of time-series: With and without time warping. IEEE Transactions on Knowledge and Data Engineering, 15(3):686–705, 2003.
[7] C. Faloutsos, M. Ranganathan, and Y. Manolopoulos. Fast subsequence matching in time-series databases. In Proceedings of ACM SIGMOD Int'l Conference on Management of Data, pages 419–429, May 1994.
[8] G. H. Golub and C. F. Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, Maryland, 1996.
[9] A. Guttman. R-trees: a dynamic index structure for spatial searching. In Proceedings of ACM SIGMOD Int'l Conference on Management of Data, pages 47–57, June 1984.
[10] T. Kahveci, A. Singh, and A. Gurel. Similarity searching for multi-attribute sequences. In Proceedings of the 14th Int'l Conference on Scientific and Statistical Database Management, pages 175–184, July 2002.
[11] N. Katayama and S. Satoh. The SR-tree: an index structure for high-dimensional nearest neighbor queries. In Proceedings of ACM SIGMOD Int'l Conference on Management of Data, pages 369–380, May 1997.
[12] E. J. Keogh, K. Chakrabarti, M. J. Pazzani, and S. Mehrotra. Dimensionality reduction for fast similarity search in large time series databases. Knowledge and Information Systems, 3(3):263–286, 2001.
[13] F. Korn, H. V. Jagadish, and C. Faloutsos. Efficiently supporting ad hoc queries in large datasets of time sequences. In Proceedings of ACM SIGMOD Int'l Conference on Management of Data, pages 289–300, May 1997.
[14] S.-L. Lee, S.-J. Chun, D.-H. Kim, J.-H. Lee, and C.-W. Chung. Similarity search for multidimensional data sequences. In Proceedings of the 16th Int'l Conference on Data Engineering, pages 599–608, Feb./Mar. 2000.
[15] C. Li, P. Zhai, S.-Q. Zheng, and B. Prabhakaran. Segmentation and recognition of multi-attribute motion sequences. To appear in Proceedings of the ACM Multimedia Conference 2004, October 2004.
[16] I. Popivanov and R. J. Miller. Similarity search over time series data using wavelets. In Proceedings of the 18th International Conference on Data Engineering, pages 212–221, February 2002.
[17] B. De Schutter and B. De Moor. The singular value decomposition in the extended max algebra. Linear Algebra and Its Applications, 250:143–176, 1997.
[18] C. Shahabi and D. Yan. Real-time pattern isolation and recognition over immersive sensor data streams. In Proceedings of the 9th Int'l Conference on Multi-Media Modeling, pages 93–113, January 2003.
[19] M. Vlachos, M. Hadjieleftheriou, D. Gunopulos, and E. Keogh. Indexing multi-dimensional time-series with support for multiple distance measures. In Proceedings of ACM SIGMOD Int'l Conference on Management of Data, pages 216–225, August 2003.