3D Free Form Object Recognition using Rotational Projection Statistics

Yulan Guo ⋆†, Mohammed Bennamoun †, Ferdous A. Sohel †, Jianwei Wan ⋆, Min Lu ⋆

⋆ National University of Defense Technology
† The University of Western Australia

{yulan.guo, kermitwjw, lumin}@nudt.edu.cn

{mohammed.bennamoun, ferdous.sohel}@uwa.edu.au

Abstract

Recognizing 3D objects in the presence of clutter and occlusion is a challenging task. This paper presents a 3D free form object recognition system based on a novel local surface feature descriptor. For a randomly selected feature point, a local reference frame (LRF) is defined by calculating the eigenvectors of the covariance matrix of the local surface, and a feature descriptor called rotational projection statistics (RoPS) is constructed by calculating the statistics of the point distribution on 2D planes defined from the LRF. The paper finally proposes a 3D object recognition algorithm based on RoPS features. Candidate models and transformation hypotheses are generated by matching the scene features against the model features in the library; these hypotheses are then tested and verified by aligning the model to the scene. Comparative experiments were performed on two publicly available datasets, and an overall recognition rate of 98.4% was achieved. Experimental results show that our method is robust to noise, mesh resolution variations and occlusion.

1. Introduction

Object recognition is a fundamental research area in computer vision with numerous applications including surveillance, robotics, and industrial inspection [20, 21]. The task of object recognition is to correctly identify the objects that are present in a scene and estimate their pose (i.e., position and orientation) [15]. Object recognition in 2D images has been extensively investigated, and significant progress has been achieved [14]. However, the performance of these methods is limited because 2D images are sensitive to illumination, viewpoint and scale [13, 10] and, most importantly, lack depth information. Range images provide an extrinsic geometric representation of the scene and do not suffer from these limitations. These advantages, together with the rapid development of 3D acquisition systems and computing devices, have made 3D object recognition a popular research topic in recent years.

Existing methods for object recognition can be broadly categorized into two classes, i.e., global feature based and local feature based methods [3]. Local feature based methods have been extensively studied in the literature due to their robustness to occlusion and clutter [3]. For example, Stein and Medioni [18] proposed a splash representation which encodes the relationship between the surface normals of a feature point and its neighboring points. Chua and Jarvis [7] introduced a point signature for 3D object recognition. A point signature describes the signed distances from the neighboring points to their projections on a fitted plane. One major limitation of the point signature is that its reference direction may not be unique [16]. Johnson and Hebert [12] proposed a 3D object recognition method based on spin images. However, the descriptiveness of spin images is limited, and they are sensitive to mesh resolution [23]. Yamany et al. [22] proposed surface signatures, which encode the surface curvature information into a 2D histogram. Mian et al. [15] proposed a robust multidimensional table representation (called Tensor) by defining a 3D local reference frame (LRF) for a pair of oriented points and encoding the intersected surface area. They also introduced a tensor based free form object recognition and segmentation algorithm. One major limitation is the combinatorial explosion of vertex pairs when defining a LRF. Furthermore, Mian et al. [16] proposed a keypoint detection and automatic scale selection method for feature extraction and object recognition. Bariya et al. [1, 2, 17] proposed an Exponential Map (EM) representation by mapping a 3D local surface to a 2D domain and encoding the surface normals of neighboring points in that domain. They also proposed an object recognition algorithm using a scale-hierarchical interpretation tree. Zhong [23] introduced an Intrinsic Shape Signature (ISS) by encoding the point distribution around a feature point of an object and presented an efficient feature indexing technique for fast model-based

object recognition. One drawback of this method is that the LRF of a feature point is not unique. More recently, Tombari et al. [20] proposed a Signature of Histograms of OrienTations (SHOT) by defining a unique LRF and encoding both the histogram and the spatial distribution of the surface normals of the neighboring points. Experimental results show that it outperformed the point signature, spin image and EM descriptors in the presence of noise. However, it proved to be sensitive to mesh resolution variations and has not been tested in the context of object recognition.

In this paper, we propose a hierarchical 3D object recognition algorithm based on the RoPS feature descriptor. To generate the RoPS feature descriptor, a unique and robust LRF is defined for each feature point by calculating the eigenvectors of the covariance matrix of the local surface. The feature descriptor is then constructed by rotationally projecting the neighboring points onto 2D planes and calculating statistics (low-order moments and entropy) of the distribution matrices on these 2D planes. During object recognition, a set of local features is first extracted from the scene and matched against all model features to generate candidate models and transformation hypotheses. The candidate models are then transformed to the scene and verified through surface matching. The accurately aligned objects in the scene are finally recognized. The performance of our proposed algorithm was tested on two publicly available datasets (namely, the Bologna dataset [20] and the UWA dataset [15]) and compared to the state-of-the-art methods with respect to noise, mesh resolution variation and occlusion. Experimental results show that our method outperformed the spin image, tensor, keypoint matching and EM matching methods.

The rest of this paper is organized as follows. Section 2 describes our proposed RoPS method for local surface feature description. Section 3 presents the hierarchical algorithm for 3D object recognition. Section 4 presents the results and analysis of comparative experiments. Section 5 concludes this paper.

2. Rotational Projection Statistics: A Novel Feature Descriptor

A local surface feature descriptor is a mathematical vector that encodes the information and characteristics of a local surface. It therefore plays an important role in a 3D object recognition system. A qualified feature descriptor should not only provide sufficient descriptive richness but also remain robust to various nuisances, such as pose variation, noise, mesh resolution, occlusion and clutter. As mentioned above, existing methods suffer from specific limitations. In order to overcome these deficiencies, a novel, effective and robust local surface feature descriptor (RoPS) has been developed and used for 3D object recognition.

Given a range image, a set of feature points {p1, p2, · · · , pN} is selected to generate the local feature descriptors, where N is the total number of feature points. These feature points can be selected randomly [12] or using a keypoint detection method (e.g., 3D Harris [9]). In our case, we simply adopted random selection rather than keypoint detection and achieved results superior to the existing methods (as shown in Section 4).

For a feature point p, the local surface within a support radius r is cropped from the range image. In order to achieve pose invariance, a local reference frame is defined for that feature point p. Different from the methods given in [20, 23], the LRF for our RoPS feature descriptor is constructed by performing an eigenvalue decomposition (EVD) on the covariance matrix of all the points lying on the local surface rather than just the mesh vertices, with a sign disambiguation that aligns the signs of the local axes to the majority of the vectors shooting from the feature point to the neighboring points. The proposed LRF is therefore robust to mesh resolution (as shown in Section 4.1). Once the LRF of feature point p is defined, all the neighboring points Q = {q1, q2, . . . , qM} can be expressed in this local coordinate system, resulting in a transformed pointcloud Q′ = {q′1, q′2, . . . , q′M}.

The process of generating a RoPS feature descriptor is described in detail as follows, and an illustration is shown in Fig. 1. First, the pointcloud Q′ is rotated around the x axis by an angle θk to get a rotated pointcloud Q′(θk). Then, Q′(θk) is projected onto the xy plane to obtain a 2D pointcloud Q̃′(θk). The process of projection offers an option to describe the 3D surface in a concise and efficient manner, and it preserves unique and particular 3D metric information of the local surface from this viewpoint θk.

Next, in order to encode the information in Q̃′(θk), the 2D plane is evenly divided into L × L bins, and an L × L matrix D is obtained by accumulating the number of points falling within each bin. We refer to the matrix D as a "distribution matrix" since it encodes the 2D distribution of the neighboring points. Since the surface can be represented at different mesh resolutions, the distribution matrix D is normalized such that the sum of all bins is equal to 1. However, the information in this distribution matrix should be further condensed in order to achieve computational and storage efficiency. We use central moments to encode the information of the distribution matrix D. The central moment µmn of order m + n is defined as

$$\mu_{mn} = \sum_{i=1}^{L}\sum_{j=1}^{L} (i-\bar{i})^{m} (j-\bar{j})^{n} D(i,j), \tag{1}$$

Figure 1: Illustration of the generation of a RoPS feature descriptor. (a) The Armadillo model and the local surface around a feature point. (b) The local surface is cropped and relocated in the local reference frame. (c) The local surface is rotated by a set of angles and projected onto a 2D plane. (d) The information in a 2D distribution matrix is encoded into 5 statistics. (e) The statistics of all rotations are concatenated to form an overall feature descriptor. (Figure best seen in color.)

where

$$\bar{i} = \sum_{i=1}^{L}\sum_{j=1}^{L} i\, D(i,j), \tag{2}$$

and

$$\bar{j} = \sum_{i=1}^{L}\sum_{j=1}^{L} j\, D(i,j). \tag{3}$$

It is clear that all the properties of the distribution can be characterized by an infinite set of central moments. In practice, only a small subset of the central moments is selected to represent the distribution matrix. Specifically, we select the low-order moments {µ11, µ12, µ21, µ22}. This is because, first, the zeroth central moment µ00 is one and the first-order central moments µ01 and µ10 are zero; they therefore carry no useful information. Second, these low-order moments {µ11, µ12, µ21, µ22} contain the most meaningful and significant information in the distribution matrix, so their descriptiveness is sufficiently high. Moreover, including higher-order moments not only increases the computational and storage cost, but also decreases the robustness to nuisances such as noise and variable mesh resolutions. We also use the Shannon entropy to encapsulate the information of the distribution matrix D and further improve the descriptiveness of our proposed feature descriptor. The Shannon entropy e is calculated as

$$e = -\sum_{i=1}^{L}\sum_{j=1}^{L} D(i,j)\, \log\big(D(i,j)\big). \tag{4}$$
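Eqs. (1)-(4) map directly to a few lines of code. The following minimal sketch (NumPy assumed; the function name rops_statistics is ours, not from the paper) computes the five statistics of a normalized distribution matrix D:

```python
import numpy as np

def rops_statistics(D):
    """Five RoPS statistics of a normalized L x L distribution matrix D:
    the central moments mu11, mu12, mu21, mu22 (Eqs. 1-3) and the
    Shannon entropy e (Eq. 4)."""
    L = D.shape[0]
    i = np.arange(1, L + 1)[:, None]   # row indices i = 1..L, column vector
    j = np.arange(1, L + 1)[None, :]   # column indices j = 1..L, row vector
    i_bar = np.sum(i * D)              # Eq. (2)
    j_bar = np.sum(j * D)              # Eq. (3)

    def mu(m, n):                      # central moment of order m + n, Eq. (1)
        return np.sum((i - i_bar) ** m * (j - j_bar) ** n * D)

    nz = D[D > 0]                      # skip empty bins, where log is undefined
    e = -np.sum(nz * np.log(nz))       # Eq. (4)
    return np.array([mu(1, 1), mu(1, 2), mu(2, 1), mu(2, 2), e])
```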

As a result, five statistics {µ11, µ12, µ21, µ22, e} of the distribution matrix D are calculated and used as a sub-feature for the rotation θk. In order to encode the "complete" information of the local surface, the pointcloud Q′ is rotated around the x axis by a set of angles {θ1, θ2, . . . , θT}, resulting in a set of statistics {fx(θ1), fx(θ2), . . . , fx(θT)}. These statistics constitute the sub-features for all rotations around the x axis. Further, the pointcloud Q′ is rotated by a set of angles around the y axis, and a set of statistics {fy(θ1), fy(θ2), . . . , fy(θT)} on the yz plane is calculated. Finally, the pointcloud Q′ is rotated by a set of angles around the z axis, and a set of statistics {fz(θ1), fz(θ2), . . . , fz(θT)} on the xz plane is calculated. The overall feature descriptor is then generated by concatenating all the sub-features of the different rotations into a vector, that is

$$f = \left\{ f_x(\theta_k),\, f_y(\theta_k),\, f_z(\theta_k) \right\}, \quad k = 1, 2, \cdots, T. \tag{5}$$
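To make the whole pipeline concrete, here is a hedged end-to-end sketch. It reuses rops_statistics from above, approximates our LRF (which is computed from all points on the local surface) with a vertex-based covariance, and follows our reading of the per-axis projection planes; compute_lrf, rops_descriptor and the default T = 9 (chosen to match the 3*9*5 = 135 dimensionality in Table 1) are illustrative, not normative:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def compute_lrf(p, Q):
    """Approximate LRF: EVD of the covariance matrix of the neighbors Q
    (the paper integrates over the whole local surface, not just the
    vertices used here), with each axis sign aligned to the majority of
    the vectors from the feature point p to its neighbors (a summed dot
    product is used as a proxy for the majority vote)."""
    C = np.cov((Q - p).T)                  # 3 x 3 covariance matrix
    _, vecs = np.linalg.eigh(C)            # eigenvalues in ascending order
    F = vecs[:, ::-1].T.copy()             # rows: x, y, z axes of the LRF
    for k in range(3):
        if np.sum((Q - p) @ F[k]) < 0:     # sign disambiguation
            F[k] = -F[k]
    return F

def rops_descriptor(Q_local, r, L=9, T=9):
    """RoPS descriptor (Eq. 5) from neighbors Q_local already expressed
    in the LRF. The rotation angles and the plane assigned to each axis
    follow our reading of the text."""
    thetas = np.linspace(0.0, np.pi, T, endpoint=False)
    planes = {'x': (0, 1), 'y': (1, 2), 'z': (0, 2)}   # xy, yz, xz planes
    feature = []
    for axis, (u, v) in planes.items():
        for theta in thetas:
            Qr = Rotation.from_euler(axis, theta).apply(Q_local)
            # accumulate the 2D distribution matrix over an L x L grid
            D, _, _ = np.histogram2d(Qr[:, u], Qr[:, v], bins=L,
                                     range=[[-r, r], [-r, r]])
            feature.append(rops_statistics(D / D.sum()))   # normalize bins
    return np.concatenate(feature)                         # length 3 * T * 5
```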

3. 3D Object Recognition

In this section, a new hierarchical 3D object recognition algorithm is proposed. Our 3D object recognition algorithm contains four major modules, i.e., model representation, candidate model generation, transformation hypotheses generation, and verification and segmentation. A flow chart illustration is given in Fig. 2.

3.1. Model Representation

We first construct a model library of the 3D objects. A total of Nm feature points are randomly selected to represent a model M. For each feature point pm, the local reference frame Fm and the feature descriptor (e.g., our RoPS feature) fm are generated using the method described in Section 2. The point position pm, LRF Fm and feature descriptor fm of all the feature points are then stored in a library for object recognition. To enable efficient feature matching during online recognition, the local feature descriptors of all the models are indexed using a k-d tree. Note that the model feature calculation and indexing can be performed offline, while the following modules operate online.
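A sketch of this offline stage might look as follows (SciPy's cKDTree is assumed; model_library, sample_feature_points and crop_local_surface are hypothetical helpers, while compute_lrf and rops_descriptor refer to the sketches in Section 2):

```python
import numpy as np
from scipy.spatial import cKDTree

# Offline model representation: describe every model feature point and
# index all descriptors in one k-d tree for fast online matching.
descriptors, owners = [], []               # owners[i] = (model id, feature id)
for model_id, mesh in enumerate(model_library):
    for feat_id, p in enumerate(sample_feature_points(mesh, n=1000)):
        Q = crop_local_surface(mesh, p, radius=r)   # neighbors within r
        F = compute_lrf(p, Q)
        f = rops_descriptor((Q - p) @ F.T, r)       # express Q in the LRF
        descriptors.append(f)
        owners.append((model_id, feat_id))
index = cKDTree(np.asarray(descriptors))            # queried during recognition
```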

3.2. Candidate Model Generation

Given a scene S, a set of Ns feature points is randomly selected. The local reference frame Fs and feature descriptor fs are then calculated for each feature point ps. The scene features are matched against all the model features in the library using a k-d tree. If the ratio between the smallest distance and the second smallest one is below a threshold τf, the scene feature and the closest model feature are considered a feature correspondence. Each feature correspondence therefore votes for a model. The models are sorted in descending order of received votes, and the models which receive more than τm votes are considered candidate models; these are verified in the subsequent steps.
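One possible reading of this matching-and-voting stage, reusing the cKDTree index built above (the threshold values shown are placeholders, not our tuned values):

```python
import numpy as np

def candidate_models(scene_feats, index, owners, n_models,
                     tau_f=0.8, tau_m=10):
    """Ratio test and voting of Section 3.2."""
    votes = np.zeros(n_models, dtype=int)
    correspondences = []
    for s_id, f in enumerate(scene_feats):
        (d1, d2), (m1, _) = index.query(f, k=2)   # two nearest model features
        if d1 / d2 < tau_f:                       # distance-ratio test
            model_id, feat_id = owners[m1]
            votes[model_id] += 1
            correspondences.append((s_id, model_id, feat_id, d1))
    order = np.argsort(votes)[::-1]               # descending order of votes
    candidates = [m for m in order if votes[m] > tau_m]
    return candidates, correspondences
```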

3.3. Transformation Hypotheses Generation

For each feature correspondence that belongs to the model M, a transformation can be calculated by aligning the LRF of the model feature point to the LRF of the scene feature point. Specifically, given the LRF Fs and point position ps of the scene feature point, and the LRF Fm and point position pm of the model feature point, the rigid transformation is estimated by

$$R = F_s^{T} F_m, \tag{6}$$

and

$$t = p_s - p_m R, \tag{7}$$

where R is the rotation matrix and t is the translation vector of the rigid transformation. Note that a transformation can be derived from a single feature correspondence using our RoPS feature descriptor. This is a major advantage over most of the existing methods (e.g., spin image), which require at least two correspondences to calculate a transformation. It not only eliminates the combinatorial explosion of feature correspondences but also improves the reliability of the estimated transformation.

Once all the plausible transformations between the scene S and the model M have been calculated, these transformations are grouped into several clusters. Each cluster center (Rc, tc) gives a potential transformation between the scene and the model. Next, a confidence score for each cluster is calculated as the ratio between the number nf and the average feature distance d of all the feature correspondences falling into that cluster, that is

$$s = \frac{n_f}{d}. \tag{8}$$

Only those hypotheses whose confidence scores are larger than a threshold τt are considered transformation hypotheses and used in the final verification.
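A minimal sketch of Eqs. (6)-(7) follows, assuming each LRF is stored as a 3 × 3 matrix whose rows are the LRF axes and that points are treated as column vectors; under this convention Eq. (7) takes the form t = ps − R pm:

```python
import numpy as np

def transformation_from_correspondence(F_s, p_s, F_m, p_m):
    """Rigid transformation from a single correspondence, Eqs. (6)-(7).
    F_s, F_m: 3 x 3 LRF matrices (rows are axes); p_s, p_m: 3-vectors."""
    R = F_s.T @ F_m          # Eq. (6): rotate the model LRF onto the scene LRF
    t = p_s - R @ p_m        # Eq. (7) under the column-vector convention
    return R, t              # a model point q then maps to R q + t
```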

3.4. Verification and Segmentation

Given the scene S, the candidate model M and the transformation hypothesis (Rc, tc), the model M is first transformed to the scene S using (Rc, tc). This transformation is then further refined using the Iterative Closest Point (ICP) algorithm [4]. If the model aligns accurately with a portion of the scene, the hypothesis is accepted; otherwise it is rejected. Specifically, the accuracy of alignment is calculated as

$$\alpha = \frac{n_c}{e}, \tag{9}$$

where nc is the number of corresponding vertices between the scene S and the model M, and e is the residual error of the ICP matching. Here, a scene point and a transformed model point are considered corresponding if their distance is less than twice the model resolution (mr). If the accuracy α is larger than a threshold τa, the candidate model M and the transformation hypothesis are accepted, and the scene vertices that correspond to this model are removed from the scene. Otherwise, this transformation hypothesis is rejected and the next transformation hypothesis is verified in turn. If no transformation hypothesis results in an accurate alignment, we conclude that the model M is not present in the scene S; if more than one transformation hypothesis is accepted, multiple instances of the model M are present in the scene S. Once all the transformation hypotheses for candidate model M have been tested, the object recognition algorithm proceeds to the next candidate model. This process continues either until all the candidate models have been verified or until too few points are left in the scene for recognition.
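The verification test can be sketched as follows (refine_icp stands in for any ICP implementation that returns a refined pose and its residual error e; it is not an API defined in this paper):

```python
import numpy as np
from scipy.spatial import cKDTree

def verify_hypothesis(scene_pts, model_pts, R, t, mr, tau_a):
    """Verification of Section 3.4: transform, refine with ICP, then
    accept the hypothesis if alpha = n_c / e exceeds tau_a (Eq. 9)."""
    transformed = model_pts @ R.T + t                 # apply the hypothesis
    R_ref, t_ref, e = refine_icp(transformed, scene_pts)
    transformed = transformed @ R_ref.T + t_ref       # refined alignment
    # n_c: model points with a scene point closer than twice the mesh resolution
    d, _ = cKDTree(scene_pts).query(transformed, k=1)
    n_c = int(np.sum(d < 2 * mr))
    alpha = n_c / e                                   # Eq. (9)
    return alpha > tau_a, alpha
```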

Figure 2: Flow chart of the 3D object recognition algorithm.

Table 1: Tuned parameter values for feature descriptors.

             Support Radius (mr)   Dimensionality   Length
Spin Image   15                    15×15            225
NormHist     15                    15×15            225
LSP          15                    15×15            225
THRIFT       15                    32×1             32
SHOT         15                    8×2×2×10         320
RoPS         15                    3×9×5            135

4. Experimental Results

In this section, we analyze the performance of our proposed RoPS feature descriptor and object recognition algorithm, and compare it with the state-of-the-art methods.

4.1. Performance of the RoPS Feature Descriptor

The performance of our proposed RoPS feature descriptor was tested on the Bologna dataset and compared with several state-of-the-art feature descriptors, including the spin image [12], Normal Histogram (NormHist) [11], Local Surface Patches (LSP) [6], THRIFT [8] and SHOT [20]. The parameters for generating these feature descriptors were tuned on independent data, and the resulting parameter values are shown in Table 1. The Bologna dataset contains 6 models taken from the Stanford 3D Scanning Repository, and 45 synthetic scenes generated by randomly rotating and translating different subsets of the model set so as to create clutter and pose variance [20]. We evaluate the performance of the feature descriptors using the global matching index (GB) criterion [5]. The global matching index expresses the ability to capture feature correspondences between different range images; the maximum value (i.e., GB = 1) means that all the right correspondences have been correctly discovered.

4.1.1 Robustness to Noise

Gaussian random noise with standard deviations of 0.1mr, 0.2mr, 0.3mr, 0.4mr and 0.5mr was added to the scene data for performance evaluation. The resulting GBs under different levels of noise are presented in Fig. 3(a). It is clear that:

i) For noise-free data, all feature descriptors achieved comparable performance, with GBs around 100%. As the noise level increased, the performance of LSP and THRIFT deteriorated sharply. Although NormHist and the spin image worked relatively well under minor and medium noise with a deviation less than 0.2mr, they failed completely under high-level noise.

ii) Our proposed RoPS feature descriptor achieved a performance comparable to SHOT and outperformed the other methods by a large margin in terms of GB. Specifically, RoPS performed better than SHOT under minor noise with a deviation less than 0.1mr. As the level of noise increased, SHOT performed slightly better than RoPS. However, RoPS was significantly better than SHOT under high levels of noise, e.g., with a noise deviation larger than 0.4mr. It is also notable that our RoPS feature descriptor was able to retrieve more than 30% of the feature correspondences even under noise with a deviation of 0.5mr.

iii) The robustness of our proposed RoPS feature descriptor can be explained by at least three facts. First, RoPS encodes the information of the local surface from various viewpoints and therefore encapsulates more information than the existing methods; it is thus highly descriptive. Second, RoPS uses only the low-order moments of the 2D distribution matrix and is therefore less affected by noise. Third, our proposed unique and stable LRF also helps to increase the descriptiveness and robustness of the RoPS feature descriptor.

Figure 3: Global matching indices of 6 feature descriptors. (a) Global matching indices in the presence of noise. (b) Global matching indices with respect to mesh decimation.

4.1.2 Robustness to Mesh Resolution

The noise-free scene meshes were resampled to 1/2, 1/4 and 1/8 of their original mesh resolution, and the resulting GBs with respect to the different levels of mesh decimation are presented in Fig. 3(b). It is clear that our proposed RoPS feature descriptor outperformed all the other descriptors under all levels of mesh decimation, followed by NormHist, the spin image and SHOT, while LSP and THRIFT were very sensitive to mesh resolution variation. The robustness of RoPS to mesh resolution may be due to at least two factors. First, the LRF of RoPS is derived by calculating the covariance matrix of all the points lying on the local surface rather than just the vertices, which makes RoPS robust to different mesh resolutions. Second, the 2D plane is coarsely partitioned, the distribution matrix is normalized, and only the low-order central moments are used to form the feature descriptor.

Taking both noise and mesh resolution into consideration, it can be inferred from Fig. 3(a) and (b) that our proposed RoPS achieved the best overall performance. Although SHOT achieved performance comparable to our RoPS feature descriptor in the presence of noise, it is very sensitive to different mesh resolutions. Besides, although NormHist performed only slightly worse than our RoPS feature descriptor with respect to mesh resolution, RoPS outperformed NormHist by a large margin in the presence of noise.

4.2. Recognition Results on the Bologna Dataset

To assess the performance of our proposed 3D object recognition algorithm, we used the aforementioned 6 feature descriptors to perform object recognition on the Bologna dataset. Since only SHOT and RoPS have their own associated LRFs, we used our proposed LRF construction method to define a LRF for each spin image, NormHist, LSP and THRIFT descriptor, so that these methods can be applied in our object recognition framework. In this experiment, 5000 feature points were randomly selected from each scene and 1000 feature points from each model.

The recognition rates in the presence of noise are given in Fig. 4(a). It is clear that both the RoPS and SHOT matching methods achieved satisfactory results, with recognition rates of 100% under all levels of noise. This indicates that our proposed 3D object recognition algorithm is effective and able to work on scenes with high levels of noise. The recognition rates with respect to mesh resolution are given in Fig. 4(b). The RoPS matching method achieved the best results, with recognition rates of 100% under all levels of mesh decimation; the SHOT matching method achieved comparable results. In this experiment, the high descriptiveness and robustness of our RoPS feature descriptor were validated again, and the effectiveness of our proposed hierarchical object recognition algorithm was initially demonstrated. It will be further evaluated on a dataset containing real scenes in the next section.

4.3. Recognition Results on the UWA Dataset

In order to further evaluate the performance of our 3D object recognition algorithm on cluttered real scenes, object recognition experiments were performed on the UWA dataset [15]. The UWA dataset is one of the most popular publicly available datasets to date; it contains 5 models and 50 real scenes acquired with a Minolta Vivid 910 scanner.

Figure 4: Recognition rates on the Bologna Dataset. (a) Recognition rate in the presence of noise. (b) Recognition rate with respect to mesh resolution.

Figure 5: Recognition rates on the UWA Dataset.

To achieve a rigorous and fair comparison, we executed the RoPS based 3D object recognition experiments using the same data and experimental setup as in Mian et al. [15] and Bariya et al. [2]. 5000 feature points were randomly selected from each model and scene for feature description and object recognition. The recognition rates of our proposed algorithm are presented in Fig. 5 and compared with the results given by the state-of-the-art algorithms, including the Tensor [15], spin image [15], keypoint [16], VD-LSD [19] and EM matching [2] algorithms. As shown in Fig. 5, our algorithm outperformed all of these methods. It achieved a recognition rate of 100% with up to 80% occlusion, and a recognition rate of 83.3% even under 90% occlusion. The average recognition rate of our RoPS matching algorithm was 98.4%, while the average recognition rates of the spin image, tensor and EM matching algorithms were 87.8%, 96.6% and 97.5% respectively. Note that our RoPS matching algorithm produced no false positives in this experiment, and only 3 of the 188 objects in the 50 scenes were not correctly recognized.

The good performance of our RoPS based 3D object recognition algorithm is due to several reasons. First, the high descriptiveness and strong robustness of our RoPS feature descriptor improve the accuracy of object recognition. Second, the unique and reliable LRF enables the estimation of a plausible transformation from a single feature correspondence, which reduces the errors of the transformation hypotheses; the probability of selecting one correct feature correspondence is much higher than the probability of selecting two or three correct correspondences. Moreover, our proposed hierarchical object recognition algorithm enables object recognition to be performed in an effective and efficient manner.

5. Conclusion

In this paper, we presented a novel RoPS local surface feature descriptor based on rotational projection statistics. The feature descriptor is generated by encoding the information of the local surface from various viewpoints; its high descriptiveness and strong robustness to noise and mesh resolution have been demonstrated by comparative experiments. Moreover, we proposed a novel hierarchical 3D object recognition algorithm based on the proposed RoPS feature descriptor. Recognition experiments were performed on two publicly available datasets, and the results show that our method outperformed the state-of-the-art methods.

Acknowledgements

This research is supported by a China Scholarship Council (CSC) scholarship (2011611067), a National Natural Science Foundation of China grant (61179010), Australian Research Council grants (DE120102960, DP110102166) and a UWA Postdoctoral Fellowship.

References

[1] P. Bariya and K. Nishino. Scale-hierarchical 3D object recognition in cluttered scenes. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1657–1664, 2010.
[2] P. Bariya, J. Novatnack, G. Schwartz, and K. Nishino. 3D geometric scale variability in range images: Features and descriptors. International Journal of Computer Vision, 99(2):232–255, 2012.
[3] N. Bayramoglu and A. A. Alatan. Shape index SIFT: Range image recognition using local features. In 20th International Conference on Pattern Recognition, pages 352–355, 2010.
[4] P. J. Besl and N. D. McKay. A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2):239–256, 1992.
[5] U. Castellani, M. Cristani, S. Fantoni, and V. Murino. Sparse points matching by combining 3D mesh saliency with statistical descriptors. In Computer Graphics Forum, volume 27, pages 643–652, 2008.
[6] H. Chen and B. Bhanu. 3D free-form object recognition in range images using local surface patches. Pattern Recognition Letters, 28(10):1252–1262, 2007.
[7] C. S. Chua and R. Jarvis. Point signatures: A new representation for 3D object recognition. International Journal of Computer Vision, 25(1):63–85, 1997.
[8] A. Flint, A. Dick, and A. Hengel. THRIFT: Local 3D structure recognition. In 9th Conference on Digital Image Computing Techniques and Applications, pages 182–188, 2007.
[9] P. Glomb. Detection of interest points on 3D data: Extending the Harris operator. Computer Recognition Systems 3, pages 103–111, 2009.
[10] Y. Guo, J. Wan, M. Lu, and W. Niu. A parts-based method for articulated target recognition in laser radar data. Optik. In press, 2012.
[11] G. Hetzel, B. Leibe, P. Levi, and B. Schiele. 3D object recognition from range images using local feature histograms. In IEEE Conference on Computer Vision and Pattern Recognition, volume 2, pages II–394, 2001.
[12] A. E. Johnson and M. Hebert. Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(5):433–449, 1999.
[13] Y. Lei, M. Bennamoun, and A. A. El-Sallam. An efficient 3D face recognition approach based on the fusion of novel local low-level features. Pattern Recognition. In press, 2012.
[14] D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, 2004.
[15] A. S. Mian, M. Bennamoun, and R. Owens. Three-dimensional model-based object recognition and segmentation in cluttered scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10):1584–1601, 2006.
[16] A. S. Mian, M. Bennamoun, and R. Owens. On the repeatability and quality of keypoints for local feature-based 3D object retrieval from cluttered scenes. International Journal of Computer Vision, 89(2):348–361, 2010.
[17] J. Novatnack and K. Nishino. Scale-dependent/invariant local 3D shape descriptors for fully automatic registration of multiple sets of range images. In 10th European Conference on Computer Vision, pages 440–453, 2008.
[18] F. Stein and G. Medioni. Structural indexing: Efficient 3-D object recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2):125–145, 1992.
[19] B. Taati and M. Greenspan. Local shape descriptor selection for object recognition in range data. Computer Vision and Image Understanding, 115(5):681–694, 2011.
[20] F. Tombari, S. Salti, and L. Di Stefano. Unique signatures of histograms for local surface description. In European Conference on Computer Vision, pages 356–369, 2010.
[21] J. Williams and M. Bennamoun. A multiple view 3D registration algorithm with statistical error modeling. IEICE Transactions on Information and Systems, 83(8):1662–1670, 2000.
[22] S. M. Yamany, A. M. El-Bialy, and A. A. Farag. Surface point signature (SPS): A new representation scheme for object registration and recognition. In Proceedings of SPIE, volume 3837, pages 311–, 1999.
[23] Y. Zhong. Intrinsic shape signatures: A shape descriptor for 3D object recognition. In IEEE International Conference on Computer Vision Workshops, pages 689–696, 2009.
