Multi-feature Integration on 3D Model Similarity Retrieval

Saiful Akbar, Josef Küng, Roland Wagner
FAW, Johannes Kepler University of Linz, AUSTRIA
{saiful.akbar, jkueng, rrwagner}@faw.uni-linz.ac.at

Abstract In this paper, we describe several 3D shape descriptors for 3D model retrieval and integrate them to obtain higher performance than a single descriptor can yield. We analyze four feature-vector (FV) integration approaches: Pure FV Integration (PFI), Reduced FV Integration (RFI), Distance Integration (DI), and Rank Integration (RI), and we investigate which weighting factors are best for each approach. Our experiments show that the weighting factors consistently enhance retrieval performance, not only on the training dataset but also on an extended dataset. They also highlight that RFI, which can process unknown query objects, is the best of the four. On the other hand, DI provides faster processing because it uses precomputed distances, but it cannot process unknown query objects. Hence, the two approaches can be combined to obtain higher efficiency and effectiveness in a 3D model retrieval system for both known and unknown query objects.

1. Introduction In recent years, applications of 3D models have grown rapidly, including virtual reality, molecular biology, entertainment, automobile and machine manufacturing, and architecture. Due to increasing computational power and storage capacity and the spread of the internet, the number of 3D models available on the web or in local storage is growing quickly. This trend leads to the need for 3D model similarity retrieval [1] [2] [3]. 3D model similarity retrieval plays an important role in virtual reality systems and in molecular biology research, such as building virtual worlds and matching proteins or DNA [3] [4]. For example, in a virtual reality system, suppose we would like to develop a virtual apartment. Instead of creating every object, such as doors, windows, tables, or chairs, from scratch, we might retrieve objects from an available furniture-model database, place them in appropriate positions, and modify them as necessary. In this case,

we need a 3D model similarity retrieval system that provides objects similar to the user's query. There are many features or descriptors for representing 3D model shape, such as Shape Histograms [4], Geometry Images [5], Aspect Graphs [9], Visual Similarity [10], the 3D Fourier Transform based descriptor [11], Physical Moments [7], and Cube-Based 3D Similarity [8]. They are suitable for different categories of shape: a descriptor may be effective for one kind of shape characteristic but not for another. Moreover, to the best of our knowledge, these works each employ a single feature vector rather than combining several to obtain higher discriminating power. In order to improve on single-feature similarity search, [13] introduces multi-feature 3D model retrieval, which employs a combination of two features. The basic idea is as follows. When one extracts a feature vector from a 3D model, some information is lost, so the extracted feature vector cannot represent the characteristics of the 3D shape as a whole. Hence, a combination of several features should describe the 3D shape more effectively and, in turn, provide better retrieval performance. The experiments highlighted that combining two features, by merging either the features or the distances with a suitable weighting factor, improves search performance. Feature integration is also introduced in [15] and [17]. Bustos et al. [17] introduce a purity-weighted combination of distances, while Atmosukarto et al. [15] employ relevance feedback on a combination of ranks and adjust the weighting factor based on the effectiveness probability of each feature. Different from the previous works, this paper compares four possible approaches to feature integration and proposes the use of both Reduced FV Integration (RFI) and Distance Integration (DI).
In Section 2, we describe the five 3D shape descriptors extracted from 3D models for our experiments. Section 3 analyzes feature orthogonality and feature relevance, describes several possible integration approaches, and compares them. In Section 4, we present implementation issues related to the prototype of the 3D model retrieval system. In Section 5, we discuss the experimental results. Section 6 provides conclusions and directions for future research.

2. 3D Shape Descriptors In this section, we describe several 3D shape descriptors used in our experiments.

2.1 2D Contour (C2D) A 3D model viewed from the major axes z, y, and x forms three 2D projection images, lying on the xy, xz, and yz planes, respectively. A 2-dimensional contour is defined as the collection of border points, i.e. the outermost points of the image, adjacent to the image background, as depicted in Fig. 1 (a). Fig. 1 (b) shows that, tracing the contour in a clockwise direction, we record the distance between each contour point and the object's origin. The n-dimensional 2D-contour feature vector is defined as the n biggest magnitudes of the Fourier coefficients obtained by applying a 1D Fast Fourier Transform (FFT) to the sequence of distances between the contour points and the origin. Note that rotation invariance is obtained by ignoring the phase information and using only the magnitudes, while scaling invariance is obtained by dividing the magnitudes by the DC component.

Fig. 1 (a) 2D Contour of a 3D model, obtained by projecting it to yz-plane, xz-plane, and xy-plane, and (b) the centroid distances of the sample points of the third contour image, recorded in a clockwise direction.
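The contour-to-FV pipeline above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation; the sample contour and the dimension n are made up:

```python
import numpy as np

def contour_fv(distances, n=8):
    """Feature vector from a closed 2D contour, given the centroid
    distances of its sample points in clockwise order.

    Rotation invariance: keep only FFT magnitudes (phase discarded).
    Scale invariance: divide by the DC component |F[0]|.
    """
    coeffs = np.fft.fft(np.asarray(distances, dtype=float))
    mags = np.abs(coeffs)
    normalized = mags / mags[0]          # divide by DC component
    # n biggest magnitudes, excluding the (now constant) DC term
    return np.sort(normalized[1:])[::-1][:n]

# A smooth closed contour; rotating the start point or scaling the
# object leaves the feature vector unchanged.
d = 1.0 + 0.1 * np.cos(2 * np.pi * np.arange(64) / 64)
fv1 = contour_fv(d, n=4)
fv2 = contour_fv(np.roll(d, 13), n=4)   # same contour, different start point
fv3 = contour_fv(2 * d, n=4)            # same contour, scaled by 2
```

Circularly shifting the distance sequence only changes FFT phases, and uniform scaling cancels in the DC division, which is exactly the invariance argument in the text.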

2.2 Depth Buffer (DB) [12] A depth buffer stores the depth of the image projection of a 3D model from a certain direction. As there are three major axes in 3D space, we obtain six depth buffers containing the depth information of the model viewed from x+, x-, y+, y-, z+, and z-, as depicted in Fig. 2.

Fig. 2. Depth images of a 3D model, viewed from x+, x-, y+, y-, z+, and z-.

The depth-buffer-based FV is the n biggest magnitudes obtained by applying a 2D FFT to each image. Scaling invariance is obtained by scaling each object into a unit bounding ball.
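The six-view depth-buffer extraction can be sketched as follows for a voxelized model. The voxel grid, the background-depth convention, and n are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

def depth_buffer_fv(voxels, n=16):
    """FV from six depth buffers of a voxelized model (3D boolean array).

    For each of the six viewing directions (x+/-, y+/-, z+/-) we take,
    per pixel, the depth of the first occupied voxel, apply a 2D FFT,
    and keep the n biggest magnitudes over all six images.
    """
    v = np.asarray(voxels, dtype=bool)
    mags = []
    for axis in range(3):
        for flip in (False, True):          # the two opposite directions
            a = np.flip(v, axis=axis) if flip else v
            idx = np.argmax(a, axis=axis)   # index of first occupied voxel
            hit = a.any(axis=axis)
            # background pixels get the maximum depth
            depth = np.where(hit, idx, a.shape[axis])
            mags.append(np.abs(np.fft.fft2(depth)).ravel())
    return np.sort(np.concatenate(mags))[::-1][:n]

vox = np.zeros((8, 8, 8), dtype=bool)
vox[2:6, 2:6, 2:6] = True                   # a centred cube
fv = depth_buffer_fv(vox, n=16)
```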

2.3 Fractional Occupancy (FO) Adapting [14], we define fractional occupancy (density) as the ratio of the actual number of voxels in a partition (Vxi) to the maximum possible number of voxels in that partition (Vpi). In this case, we partition the 3D voxel set into grid partitions. The n-dimensional FO feature vector is defined as the n biggest magnitudes (lowest frequencies) of the Fourier coefficients of the collection of FO values. Scaling invariance is obtained by scaling each 3D object into a unit bounding ball and partitioning all objects with the same grid number.

2.4 The Inverse of Local Elongation and Object Bumpiness (LEBP) [15] By performing Principal Component Analysis (PCA) on the 3D voxels, we obtain eigenvalues λ1, λ2, and λ3 in decreasing order. Local elongation is inversely related to λ2/λ1, while bumpiness is defined as λ3/λ1. The n-dimensional LEBP feature vector is defined as the vector of the n biggest magnitudes of the Fourier coefficients obtained by applying a 3D FFT, as for the FO feature vector.
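The paper computes these eigenvalue ratios per partition and Fourier-transforms the resulting collection; the sketch below illustrates only the PCA ratio step on a single set of occupied-voxel coordinates:

```python
import numpy as np

def elongation_bumpiness(points):
    """Eigenvalues of the covariance of occupied-voxel coordinates in
    decreasing order lambda1 >= lambda2 >= lambda3; returns the ratios
    (lambda2/lambda1, lambda3/lambda1)."""
    pts = np.asarray(points, dtype=float)
    centred = pts - pts.mean(axis=0)
    cov = np.cov(centred.T)
    eig = np.sort(np.linalg.eigvalsh(cov))[::-1]   # decreasing order
    return eig[1] / eig[0], eig[2] / eig[0]

# An elongated box-like cloud: widest spread along x, narrowest along z
rng = np.random.default_rng(0)
pts = rng.normal(size=(1000, 3)) * np.array([10.0, 2.0, 1.0])
e, b = elongation_bumpiness(pts)
```

For this elongated cloud, both ratios are small, matching the intuition that λ2/λ1 shrinks as the object becomes more elongated.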

2.5 Cords and Spherical Harmonics (C3D) [12][14] A cord is a vector from the centre of mass of an object to a representative surface point. One issue is which point should be the representative point. We consider two candidates: the most distant point that intersects the cord [12], or the most distant point in the partition, whether or not it intersects the cord. In our implementation, a 3D model is sampled into a collection of voxels; hence we use the second option, as it is less sensitive to the sampling resolution than the first. We extract spherical harmonic coefficients by applying a Spherical Harmonic Transformation (SHT) to the collection of cord lengths using S2kit

[18]. The cords-based FV is defined as the collection of the first n coefficients. Note that this FV is inherently invariant to object rotation and scaling.

3. Multi Feature Integration 3.1 Feature Orthogonality In general, every feature vector represents its own relevant aspect or characteristic of the shape, and is therefore suitable only for that aspect. In order to enhance search performance, the candidate features should complement each other. We certainly cannot expect two totally complementary FVs, as with the x- and y-axes of a 2D Cartesian coordinate system; nevertheless, there should be FVs that are complementary to a certain degree. Our hypothesis is therefore that combining several FVs yields better retrieval performance than a single FV does.

3.2 Feature Relevance When combining features, some features may be more relevant in representing the shape of a 3D model than others. Practically, the relevance can be represented by weighting factors on the combined features. Let f1, f2, ..., fn be a set of FVs to combine; the feature combination f(1) with respect to the degrees of relevance wi is defined as

f(1) = w1·f1 ∘ w2·f2 ∘ ... ∘ wn·fn   (4)

3.3 Feature Integration Approaches We observe four approaches to feature integration: Pure Feature Integration (PFI), Reduced Feature Integration (RFI), Distance Integration (DI), and Rank Integration (RI). Note that in this paper we do not differentiate between unweighted and weighted integration, since the former is simply the latter with equal weighting factors.

3.3.1 Pure FV Integration (PFI)

Let f1, f2, ..., fn be a set of normalized feature vectors; the pure integrated feature vector f(1) is defined as

f(1) = w1·f1 ∘ w2·f2 ∘ ... ∘ wn·fn   (5)

In order to preserve proportionality when integrating them, the original feature vectors must be normalized beforehand. Note that the dimension of the integrated feature vector equals the total dimension of the original FVs.
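In code, PFI is just weighted concatenation of pre-normalized FVs. A minimal sketch with made-up vectors and weights:

```python
import numpy as np

def pfi(fvs, weights):
    """Pure FV Integration: concatenate the (already normalized) feature
    vectors, each scaled by its relevance weight. The result has the sum
    of the original dimensions."""
    return np.concatenate([w * np.asarray(f, dtype=float)
                           for f, w in zip(fvs, weights)])

f1 = np.array([0.2, 0.4])
f2 = np.array([0.1, 0.3, 0.5])
combined = pfi([f1, f2], weights=[2, 1])
```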

3.3.2 Reduced FV Integration (RFI)

Let f1, f2, ..., fn be a set of normalized FVs, and let f'1, f'2, ..., f'n be a new set of FVs obtained by reducing the dimension of the original FVs; the reduced integrated feature vector f(1) is defined as

f(1) = w1·f'1 ∘ w2·f'2 ∘ ... ∘ wn·f'n   (6)

The goal of reducing the dimension before integration is to keep the dimension of the integrated FV equal, or almost equal, to that of the original FVs, so that the cost of distance computation is almost the same as for a single original feature. Dimension reduction may also reduce discriminating power, which is undesirable. However, within a certain range of dimensions, e.g. 64-512, differences in dimensionality do not yield much difference in discriminating power. Therefore, this integration model can be employed provided that the reduction is not too drastic and the original FVs support a multi-resolution representation. A multi-resolution representation is obtained by sorting the elements of the FV in order of discriminating power; dimension reduction then only lowers the discriminating power without losing the capability of describing the 3D object as a whole. By using FFT and SHT, the second requirement is easily fulfilled.
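Under the multi-resolution assumption (earlier elements carry more discriminating power, as with FFT/SHT coefficients), dimension reduction is simple truncation. A sketch with illustrative sizes matching the paper's 5 x 51 = 255 setup:

```python
import numpy as np

def rfi(fvs, weights, target_dim):
    """Reduced FV Integration: truncate each normalized FV to
    target_dim // len(fvs) leading elements, then weight and
    concatenate. Truncation assumes a multi-resolution ordering,
    i.e. earlier elements are the most discriminating."""
    k = target_dim // len(fvs)
    return np.concatenate([w * np.asarray(f, dtype=float)[:k]
                           for f, w in zip(fvs, weights)])

fvs = [np.arange(256, dtype=float) for _ in range(5)]  # five 256-dim FVs
reduced = rfi(fvs, weights=[2, 3, 1, 0, 1], target_dim=255)
```

The integrated FV stays near the dimension of a single original FV, which is the efficiency argument made above.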

3.3.3 Distance Integration (DI)

Let d1, d2, ..., dn be a set of normalized distances between objects O1 and O2, computed using the feature vectors f(O1)1, ..., f(O1)n and f(O2)1, ..., f(O2)n, respectively. The integrated distance d(1) is defined as

d(1) = w1·d1 + w2·d2 + ... + wn·dn   (7)
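Equation (7) is a plain weighted sum of per-feature distances; the distances and weights below are made up:

```python
def di(distances, weights):
    """Distance Integration: weighted sum of the normalized per-feature
    distances between the same pair of objects."""
    return sum(w * d for d, w in zip(distances, weights))

# Three normalized per-feature distances, weighted 3, 1, 2
d_int = di([0.2, 0.5, 0.1], [3, 1, 2])
```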

3.3.4 Rank Integration (RI)

Let r1, r2, ..., rn be the integer ranks of an object O in the database under a query Q; the integrated rank r(1) is defined as

r(1) = w1·r1 + w2·r2 + ... + wn·rn   (8)

Note that by using integer ranks we ignore the absolute distance values and make use only of the order of the distances. Therefore, the gap between the most similar object and the second is treated the same as the gap between the second and the third, and so on. Table 1 compares the characteristics, advantages, and disadvantages of the integration approaches.
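Equation (8) is the same weighted sum applied to ranks instead of distances; the ranks and weights here are illustrative:

```python
def ri(ranks, weights):
    """Rank Integration: weighted sum of an object's integer ranks under
    each single-feature ordering; absolute distances are ignored."""
    return sum(w * r for r, w in zip(ranks, weights))

# An object ranked 1st, 4th and 2nd under three features, weighted 2, 1, 1
r_int = ri([1, 4, 2], [2, 1, 1])
```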

Table 1. A comparison between the four feature integration approaches

PFI
• Combining high-dimensional feature vectors yields a very high-dimensional feature vector.
• Distance computation is needed whenever a query is processed.
• Preserves the absolute position of the model in feature space; thus supports unknown-query processing.
• Feature normalization is needed in order to obtain a proportional integration.

RFI
• Preserves the same, or almost the same, number of dimensions as the originals.
• Distance computation is needed whenever a query is processed.
• Preserves the absolute position of the model in feature space; thus supports unknown-query processing.
• Feature normalization is needed in order to obtain a proportional integration.

DI
• Assumption: the distance value affects effectiveness more than the rank does.
• Distance computation is not needed when processing a known query; thus provides faster query processing.
• Preserves only the relative, not the absolute, position of the model; thus does not support unknown-query processing.
• Distance normalization is needed.

RI
• Assumption: the rank affects effectiveness more than the distance value does.
• Distance computation is not needed when processing a known query; thus provides faster query processing.
• Preserves only the relative, not the absolute, position of the model; thus does not support unknown-query processing.
• Normalization is not needed, as ranks are inherently normalized.

4. Implementation We implemented a prototype 3D retrieval system employing the multi-feature integration approaches described in Section 3. We compare the integration models using the Princeton datasets [1], i.e. the Training Dataset (PrincTrain) and the Testing Dataset (PrincTest). The former contains 907 models grouped into 90 categories; the latter contains 907 models grouped into 92 categories. The training dataset is used for obtaining the best weighting factors and measuring retrieval effectiveness, while the testing dataset is used for cross-validation, to confirm that the results apply not only to the training dataset but also to a more general dataset. After pose normalization using weighted PCA, a modification of [19], we extract the five features described in Section 2 from the models in the dataset.

A previous work [17] highlighted that a dimension of about 256 is quite stable; slightly increasing or decreasing this number does not lead to much difference in retrieval performance. Hence, we choose a dimension of about 256. In the case of RFI, we reduce the dimension of every FV to 51, so that the reduced integrated FV has a dimension of 255. To calculate the distance between two objects we use the L1 distance, as our preliminary experiments showed it to be the best among the candidates, such as the Euclidean and quadratic distances. We employ Gaussian normalization [16] for both features and distances, so that their values lie in [0,1]. We use four levels of relevance when integrating features: irrelevant, less relevant, relevant, and more relevant, with weighting factors 0, 1, 2, and 3, respectively. A feature that is not used at all is defined as irrelevant and assigned weighting factor zero. As we employ five FVs, there are five weighting factors. For uniformity, in what follows the weighting factors are always written in a fixed order: C2D, C3D, DB, FO, and LEBP.
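The distance and normalization choices can be sketched as follows. `gaussian_normalize` shows one common reading of Gaussian (3-sigma) normalization; the exact variant in [16] may differ in detail:

```python
import numpy as np

def l1_distance(f1, f2):
    """L1 (city-block) distance between two feature vectors."""
    return np.abs(np.asarray(f1, dtype=float)
                  - np.asarray(f2, dtype=float)).sum()

def gaussian_normalize(values):
    """Gaussian normalization: map values toward [0, 1] via
    (v - mean) / (3 * std), then shift and clip. One common reading
    of the scheme cited as [16]."""
    v = np.asarray(values, dtype=float)
    z = (v - v.mean()) / (3 * v.std())
    return np.clip((z + 1) / 2, 0.0, 1.0)

d = l1_distance([0.0, 1.0, 2.0], [1.0, 1.0, 0.0])
norm = gaussian_normalize([1.0, 2.0, 3.0, 4.0, 100.0])
```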

5. Experimental Results and Discussion We performed our experiments on both datasets. We use the PrincTrain dataset to find the best combinations of weighting factors, and the PrincTest dataset to apply these weighting factors and confirm that they work at the expected degree of effectiveness, i.e. more effectively than the best single feature vector. Our first experiment on the PrincTrain dataset shows that the best single FV is C3D, with an average R-precision of 0.4279, followed by DB (0.3967), C2D (0.3862), LEBP (0.3624), and FO (0.3616). The second experiment aims to find the best weighting factors for the four integration approaches. As we apply four levels of relevance to the five FVs, there are 4^5 = 1024 possible combinations of weighting factors. Of these, we take the best ten combinations for each integration approach, as depicted in Table 2. We also highlight that 72.0%-86.1% of the combinations yield retrieval performance greater than or equal to that of the best single FV. This means that if we take a random combination of weighting factors, there is a probability of at least 72.0% that the system performs at least as well as the best single FV. Table 2 also shows that, in general, C3D and C2D have greater relevance than the others, while FO and LEBP are

less relevant and tend to be interchangeable, and therefore less orthogonal, to each other. Another interesting observation is that RFI and DI are the best among the four. Recall that RFI and DI have different, complementary characteristics. Without losing efficiency (the dimension remains almost the same as the original FVs'), RFI preserves the absolute position of 3D objects in the feature space, and therefore supports unknown query objects. On the other hand, DI stores only the relative positions of 3D objects in the feature space, and therefore does not support unknown queries; nevertheless, DI is analytically faster than RFI. Our experiment on the PrincTest dataset confirms that the best combinations apply not only to the training dataset but also to another dataset. Recall that the best weighting factors depicted in Table 2 were obtained by running all possible combinations and selecting the best of them. The experiment shows that the best weighting factors, as depicted in the first row of Table 2, yield higher performance compared to the best single FV, i.e. C3D with an average R-precision of 0.4358, as depicted in Fig. 3. In fact, these weighting factors are not the best among the possible combinations for PrincTest itself; however, they consistently yield higher performance. The experiment on PrincTest also highlights that RFI and DI are still consistently the best among the four; therefore, we suggest using both for feature integration in 3D model

retrieval. Note that the two models have different, complementary characteristics: RFI supports unknown query objects, while DI does not but is faster, as no distance computation is needed for a known query object. A scenario using both approaches can be as follows. If the query is already available in the database, use DI to provide the similarity ranking; otherwise, calculate the FVs of the query object, compute the distances between the object and all objects in the database, and provide the similarity ranking from those distances.
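The combined scenario can be sketched as a simple dispatch. Function and variable names are illustrative; a plain L1 distance and a precomputed pairwise matrix stand in for the integrated RFI distances and DI distances:

```python
import numpy as np

def retrieve(query, db_fvs, db_distances):
    """Route a query as described above: a known query (given by its
    database id) is ranked from the precomputed integrated distance
    matrix (DI); an unknown query (given as a feature vector) is ranked
    by computing distances on the fly (RFI-style).

    db_distances[i][j] holds the integrated distance between database
    objects i and j."""
    if isinstance(query, int):                       # known object: its id
        return np.argsort(db_distances[query])
    dists = [np.abs(db_fvs[i] - query).sum()         # on-the-fly L1
             for i in range(len(db_fvs))]
    return np.argsort(dists)

# Tiny illustrative database of reduced integrated FVs
db_fvs = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
db_distances = np.abs(db_fvs[:, None, :] - db_fvs[None, :, :]).sum(-1)
known = retrieve(0, db_fvs, db_distances)            # DI path
unknown = retrieve(np.array([1.9, 2.1]), db_fvs, db_distances)  # RFI path
```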

Fig. 3. Average Precision vs Recall of the best weighting factor of PFI, RFI, DI, and RI, compared to the best single feature vector, C3D, applied to PrincTest dataset.

Table 2. The best ten weighting factors (in the order C2D, C3D, DB, FO, LEBP) and their average R-precision for the four integration approaches, applied to the PrincTrain dataset

No   PFI (72.0%)        RFI (78.7%)        DI (86.1%)         RI (80.5%)
     Weight  R-Prec.    Weight  R-Prec.    Weight  R-Prec.    Weight  R-Prec.
1    23101   0.4782     23110   0.4925     32101   0.4875     33211   0.4575
2    33101   0.4774     13110   0.4913     33101   0.4861     32211   0.4572
3    33201   0.4767     23101   0.4913     32110   0.4859     33111   0.4571
4    22101   0.4759     12110   0.4897     32201   0.4859     22101   0.4569
5    23110   0.4753     13101   0.4897     22101   0.4857     33201   0.4569
6    13101   0.4752     23210   0.4890     33201   0.4854     22110   0.4568
7    33110   0.4751     23111   0.4883     31101   0.4850     23101   0.4568
8    23201   0.4746     22110   0.4879     21101   0.4846     32201   0.4567
9    32101   0.4746     23201   0.4879     32210   0.4843     23111   0.4566
10   32201   0.4739     33220   0.4879     33110   0.4843     33101   0.4566

6. Conclusion In this paper, we observe and analyze four approaches to weighted FV integration in 3D model similarity retrieval. The integration aims to enhance retrieval effectiveness. Our hypothesis is that a single FV

describes only a partial aspect of 3D shape; therefore, integrating several FVs may describe the aspects of 3D shape more comprehensively and, in turn, enhance retrieval performance. Our experiments support the hypothesis: weighted integration of several FVs tends to enhance performance, with a probability of at least 72.0%. It is interesting to note that the weighting factors, which are selected based on the

Training dataset, also work consistently for an extended dataset, i.e. the Testing dataset. Among the four FV integration approaches, RFI and DI are the best. RFI, which has characteristics similar to PFI's, outperforms PFI not only in effectiveness but also in efficiency, because of dimension reduction. Likewise, DI, which has characteristics similar to RI's, outperforms RI in effectiveness. We propose the use of both RFI and DI to obtain higher efficiency and effectiveness in a 3D model retrieval system that supports both known and unknown query objects. Several research issues merit further investigation. How can the weighting factors be predicted dynamically when processing a query, instead of using fixed weighting factors derived from the Training dataset? How can different weighting factors be employed for different classes of 3D models? How can the retrieval system learn, so that the weighting factors are adjusted to obtain higher performance for subsequent queries?

Acknowledgement

During this work, Saiful Akbar was supported by the ASEA-UNINET Technology Grant of OeAD, which he gratefully acknowledges.

References

[1] Princeton Shape Retrieval and Analysis Group. 3D Model Search Engine. http://shape.cs.princeton.edu/search.html. Last access: September 2005.
[2] Dejan Vranić. Content-based Classification of 3D-models by Capturing Spatial Characteristics. http://merkur01.inf.uni-konstanz.de/CCCC/. Last access: September 2005.
[3] 3D Model Retrieval System based on LightField Descriptors / 3D Protein Retrieval System. http://3d.csie.ntu.edu.tw/index.html. Last access: September 2005.
[4] Mihael Ankerst, Gabi Kastenmueller, Hans-Peter Kriegel, Thomas Seidl. 3D Shape Histograms for Similarity Search and Classification in Spatial Databases. SSD, 1999.
[5] Hamid Laga, Hiroki Takahashi, and Masayuki Nakajima. Geometry Image Matching for Similarity Estimation of 3D Shapes. CGI'04, pp. 490-496.
[6] Motofumi T. Suzuki. A Web-based Retrieval System for 3D Polygonal Models. IFSA World Congress and 20th NAFIPS International Conference, 2001, Vol. 4, pp. 2271-2276.
[7] Michael Elad, Ayellet Tal, and Sigal Ar. Directed Search in a 3D Objects Database Using SVM. HPL-2000-20(R.1), 2000.
[8] Ching-Sheng Wang, Jia-Fu Chen, Lun-Ping Hung, and Cun-Hong Huang. Efficient Indexing and Retrieval Scheme for VRML Database. IC CSCW, 2004.
[9] Christopher M. Cyr and Benjamin B. Kimia. 3D Object Recognition Using Shape Similarity-Based Aspect Graph. ICCV, 2001.
[10] Ding-Yun Chen, Xiao-Pei Tian, Yu-Te Shen, and Ming Ouhyoung. On Visual Similarity Based 3D Model Retrieval. Eurographics, 2003.
[11] Dejan V. Vranić and D. Saupe. 3D Shape Descriptor Based on 3D Fourier Transform. Proceedings of the EURASIP Conference on Digital Signal Processing for Multimedia Communications and Services (ECMCS 2001), Budapest, Hungary, September 2001.
[12] Dejan V. Vranić. 3D Model Retrieval. Ph.D. Dissertation, Universität Leipzig, 2004.
[13] Saiful Akbar, Josef Küng, Roland Wagner. Multi-feature based 3D Model Similarity Retrieval. International Conference on Computing and Informatics, Malaysia, 2006.
[14] Eric Paquet, Marc Rioux, Anil Murching, Thumpudi Naveen, Ali Tabatabai. Description of Shape Information for 2-D and 3-D Objects. Signal Processing: Image Communication, Elsevier Science B.V., 2000.
[15] Indriyati Atmosukarto, Wee Kheng Leow, Zhiyong Huang. Feature Combination and Relevance Feedback for 3D Model Retrieval. 11th International Multimedia Modeling Conference (MMM), Australia, 2005, pp. 334-339.
[16] Qasim Iqbal and J. K. Aggarwal. Feature Integration, Multi-image Queries and Relevance Feedback in Image Retrieval. 6th International Conference on Visual Information Systems (VISUAL), Miami, Florida, 2003, pp. 467-474.
[17] Benjamin Bustos, Daniel Keim, Dietmar Saupe, Tobias Schreck, Dejan Vranić. Automatic Selection and Combination of Descriptors for Effective 3D Similarity Search. IEEE Sixth International Symposium on Multimedia Software Engineering (ISMSE'04), 2004, pp. 514-521.
[18] Peter J. Kostelec and Daniel R. Rockmore. S2kit: A Lite Version of SpharmonicKit. Department of Mathematics, Dartmouth College, 2004.
[19] F. Murtagh. Source code of Principal Component Analysis. Department of Statistics, Carnegie Mellon University. http://lib.stat.cmu.edu/multi/pca.c. Last access: December 2005.
