Color-Based Algorithm for Automatic Merging of Multiview 3D Point Clouds ELWIRA HOŁOWKO, JERZY WOJSZ, ROBERT SITNIK, and MACIEJ KARASZEWSKI, Warsaw University of Technology

In this article, a method of merging point clouds using the modified Harris corner detection algorithm for extracting interest points of textured 3D point clouds is proposed. A new descriptor characterizing point features for identifying corresponding points in datasets is presented. The merging process is based on the Random Sample Consensus (RANSAC) algorithm, which enables calculation of the geometric transformation between point clouds based on a set of interest points that includes incorrect samples, called outliers. The proposed processing path is designed to integrate many directional measurements, which are acquired with a 3D scanner and are represented as unsorted point clouds (x, y, z) with color information (R, G, B). Exemplary measurements shown in this article represent sections of ceiling in the King's Chinese Cabinet of the Museum of King Jan III's Palace at Wilanow in Warsaw, Poland, as well as some more complex objects. Experimental verification confirms the effectiveness of the proposed method in integrating directional measurements of objects with detailed texture, particularly if they have no unique geometric features.

Categories and Subject Descriptors: I.4.8 [Scene Analysis]: Color, Shape; I.4.7 [Feature Measurement]: Feature Representation, Texture; I.4.m [Miscellaneous]: Point Cloud Integration

General Terms: Point Cloud Integration

Additional Key Words and Phrases: Color features, automatic merging of directional point clouds, interest points, feature descriptor

ACM Reference Format: Elwira Hołowko, Jerzy Wojsz, Robert Sitnik, and Maciej Karaszewski. 2014. Color-based algorithm for automatic merging of multiview 3D point clouds. ACM J. Comput. Cult. Herit. 7, 3, Article 16 (April 2014), 21 pages. DOI: http://dx.doi.org/10.1145/2558306

1 INTRODUCTION

Three-dimensional (3D) scanning technology has enabled the creation of realistic 3D models in many application areas. In cultural heritage documentation, 3D measurements are becoming an essential tool by providing many advantages over traditional 2D images. Moreover, 3D measurements may become a standard in surface documentation [Ikeuchi and Miyazaki 2008]. The resolution and accuracy of a measurement system for digitization of cultural heritage objects must typically be high, particularly when the obtained model is used for conservation or restoration purposes.

Authors' address: E. Hołowko, J. Wojsz, R. Sitnik, and M. Karaszewski, Warsaw University of Technology, Department of Mechatronics, 8 Sw. Andrzeja Boboli Street, 02-525 Warsaw, Poland; email: {e.holowko, j.wojsz, r.sitnik, m.karaszewski}@mchtr.pw.edu.pl.
© 2014 ACM 1556-4673/2014/04-ART16 $15.00. DOI: http://dx.doi.org/10.1145/2558306
ACM Journal on Computing and Cultural Heritage, Vol. 7, No. 3, Article 16, Publication date: April 2014.

Because the high

resolution of a measurement implies a small working volume of the scanner, the digitization process is performed by acquiring many directional measurements with the scanner positioned in different relations to the object. Measurement results covering different parts of an object's surface must be integrated to obtain a complete 3D model. Automation of the merging process is one of the key directions in the development of 3D digitization techniques. Algorithms for performing this task may greatly simplify the process and shorten the time needed to create spatial documentation. This would help reduce the cost of scanning, which is a significant factor in popularizing 3D cultural heritage digitization [Sitnik et al. 2010].

Various approaches to automated measurement merging already exist. For example, Makadia et al. [2006] proposed estimating the transformation between subsequent views from the correlation of Extended Gaussian Images in the Fourier domain. Roth [1999] and Bendels et al. [2004] proposed correlating range images according to interest points extracted from 2D intensity data; this correlation is based on the assumption that a corresponding intensity image is available for every range image. Gómez-García-Bermejo et al. [2013] extended the Iterative Closest Point (ICP) algorithm to handle texture data, with attention to convergence and processing-time improvement. However, none of those algorithms can be used for point clouds that lack geometric features and are devoid of additional information about initial alignment or corresponding intensity images. In most existing methods, the geometric features of an object are used to integrate directional measurements.
The proposed method, on the other hand, addresses the issue of merging multiple point clouds to create complete spatial documentation of color-textured objects, specifically in cases where the object lacks characteristic geometric features and the initial alignment of measurements, such as from mechanical manipulators, is unknown. An exemplary dataset that fulfills these two conditions is used to evaluate the proposed method. The dataset is a set of point clouds from 3D measurements of the King’s Chinese Cabinet ceiling (Figure 1) in the Museum of King Jan III’s Palace at Wilanow in Warsaw, Poland. The ceiling is a smooth, locally planar surface with 4 × 4m dimensions and a rounded connection with the walls; it thereby lacks any characteristically shaped features. This fact prevents use of curvature distribution, for example, in the process of merging measurements. The ceiling, however, is covered with a painting, which can be used instead of geometric features. This object was digitized with more than 500 directional measurements, which were acquired with at least 30% overlap. Each point cloud covered a 500 × 350mm section of the ceiling and contained between 4 and 7 million points, with an average distance between points of 100μm. Measurements were acquired with the 3DMADMAC structured-light scanning system [Sitnik et al. 2002]. Extracting 3D features from color information contained within point clouds may be considered an extension of raster-image processing methods. Identifying and matching features in 2D images is a well-researched topic; moreover, numerous algorithms exist that robustly detect characteristic points, which are insensitive to brightness changes, scaling, or local occlusions. A method of interest point detection, based on analysis of intensity gradients in the point neighborhood, was introduced by Moravec [1980]. 
This algorithm verifies the correlation between points based on their surroundings, which is a square window containing 9 neighboring pixels. The Moravec algorithm is one of the simplest interest point detection algorithms; however, it is also the least accurate and least stable, and it is difficult to adapt to point clouds. Several algorithms for local feature descriptor calculation were compared in a survey by Mikolajczyk and Schmid [2005] with respect to their robustness to noise, lighting, and changing viewpoint. According to their survey, the Scale-Invariant Feature Transform (SIFT) method proposed by Lowe [2004] performs best. The scale-space-based SIFT method is invariant to scale changes because the image is analyzed at different levels of Gaussian blur. The SIFT detector is not directly transferable to 3D scans; however, some of the concepts used in its implementation,

Fig. 1. Ceiling painting of the King's Chinese Cabinet, Museum of King Jan III's Palace at Wilanow. © Eryk Bunsch.

such as extraction of a unique orientation of an analyzed feature, might be applied. Nevertheless, the property of scale invariance is dispensable in point cloud integration and may unnecessarily extend calculation time. From this perspective, therefore, applying a scale-sensitive detector, such as the Harris detector, may be considered. The Harris algorithm [Harris and Stephens 1988] analyzes local gradients of the intensity based on directional derivatives calculated for the point neighborhoods in a Gaussian window. This method integrates multiple views when the features to match are at the same scale. The Harris detector was extended to 3D mesh data by Sipiran and Bustos [2010] and modified to successfully extract characteristic vertices according to the surface shape. The method proposed herein is a modified Harris algorithm adapted to work with unordered point clouds. It is computationally simple, rotationally invariant, and highly stable, which makes it suitable for integrating directional measurements. Among 3D interest point detectors for point clouds, apart from SIFT, is the shape-based Intrinsic Shape Signatures detector [Zhong 2009]. An intrinsic shape signature characterizes both the local neighborhood of each point and a view-independent representation of the 3D shape to match points in different directional measurements. The Normal Aligned Radial Feature (NARF) detector proposed by Steder et al. [2011] also extracts features of a surface represented as a point cloud. It is intended to find points that correspond to major curvature changes but do not lie on unstable border regions. Regardless of the method used to detect characteristic points, some of them are often incorrectly determined or lie outside the overlapping area of the datasets to be merged. Filtering those nonmatching points is typically performed by comparing descriptors that characterize their features.
For 2D images, one of the most popular methods is the SIFT descriptor [Lowe 2004], which defines the distribution of gradient directions in a neighborhood as a set of histograms. Its modification, called PCA-SIFT

[Mikolajczyk and Schmid 2005], analyzes the environment at a higher resolution and simplifies the results by using Principal Component Analysis (PCA) [Jolliffe 2002]. An algorithm that uses an analogous principle is the Gradient Location-Orientation Histogram (GLOH) [Mikolajczyk and Schmid 2005]; however, it segments the data into polar coordinates instead of Cartesian ones. Other descriptors are based on analysis of correlations of point neighborhoods [Zhao 2006] or their moment coefficients. The construction of these descriptors is closely tied to the representation of raster images, for which they were designed, and they cannot be directly applied to unordered data such as point clouds. Most descriptors developed for 3D data characterize the shape of a 3D surface. These include Point Feature Histograms proposed by Rusu et al. [2008], extended as Fast Point Feature Histograms [Rusu et al. 2009], spin images presented by Johnson and Hebert [1999], and the Normal Aligned Radial Feature descriptor [Steder et al. 2011]. Descriptors that deal with both shape and texture information are less popular. The most renowned are the Combined Texture-Shape Signature of Histograms of OrienTations (CSHOT) descriptor proposed by Tombari et al. [2011] and MeshDOG proposed by Zaharescu et al. [2009]. Both descriptors are designed to process 3D meshes, not clouds of points. Although most 3D scanners return results as dense point clouds, no descriptor has been optimized for this type of data, so it seems reasonable to develop one. We present such a descriptor, which characterizes the distribution of color gradients as a function of angle in spherical neighborhoods of four points arranged around the interest point. In addition, we propose a new processing path that automates the merging of 3D point clouds into a complete model on the basis of surface color.
The process uses interest point detection and matching according to local descriptors of interest points, which has become the standard paradigm. The main novelties here are the modified Harris algorithm, which is used to detect interest points, and a descriptor characterizing their features based on color information.

2 PROCESSING PATH: COLOR-BASED MERGING OF POINT CLOUDS

The automated process of coarse view alignment for obtaining a complete 3D model is designed as a multistage process, as shown in Figure 2. It involves detection of interest points with the Harris corner detector. Before calculating a response map of the Harris algorithm, conversion from the RGB to the HSL color space is proposed. This conversion serves to deal with a more intuitive and perceptually relevant cylindrical-coordinate representation rather than the cube (RGB) representation. In the next step, features of interest points are calculated. For this purpose, a new rotation-invariant descriptor, which characterizes the distribution of gradients and their derivatives around the interest point, is proposed. For each pair of clouds in the set, all point pairs that satisfy the similarity condition are saved. Transformations between clouds are determined with the RANSAC algorithm according to the most similar interest points that satisfy some geometric constraints. In the following subsections, the detailed steps of the processing path are described.

2.1 Color Space

To describe texture information, instead of directly using the RGB coordinates, we recalculate them to the HSL color space, which includes three components—hue, saturation, and lightness (also called brightness)—that are more intuitive than those of the RGB system. In our studies, when dealing with object surfaces that lack characteristic geometric features, the lightness component (L) proved to be the most effective for determining characteristic points in the measurement data. This component contains information on texture details as well as surface defects, such as cracks and unevenness. Furthermore, the charge-coupled device (CCD) matrix of the scanner detector generates much more chrominance than luminance noise; therefore, the quality of data in the L channel is higher, which enables the capture of more stable interest points.
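The RGB-to-HSL recalculation can be sketched with Python's standard colorsys module; this is a minimal illustration of the color-space step, not the scanner pipeline's actual implementation, and the function name is our own:

```python
import colorsys

def rgb_to_hsl_channels(points_rgb):
    """Convert per-point (R, G, B) values in [0, 1] into separate
    H, S, L channel lists.

    Uses the standard cylindrical HSL model; channel names follow the
    paper (H = hue, S = saturation, L = lightness).
    """
    h_ch, s_ch, l_ch = [], [], []
    for r, g, b in points_rgb:
        # Note: colorsys returns channels in H, L, S order.
        h, l, s = colorsys.rgb_to_hls(r, g, b)
        h_ch.append(h)
        s_ch.append(s)
        l_ch.append(l)
    return h_ch, s_ch, l_ch

# A pure red point has hue 0, full saturation, and mid lightness:
h, s, l = rgb_to_hsl_channels([(1.0, 0.0, 0.0)])
```

In a merging pipeline the L list would feed the Harris response computation, while H serves the color fitting error.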
Hue has the most stable

Fig. 2. Color-based algorithm scheme for automatic merging of multiview 3D point clouds.

value in the corresponding sections of each point cloud because it is not affected by lighting direction and intensity. Consequently, the hue (H) and lightness (L) channels are used as the primary input for the descriptor of interest point features, whereas the saturation (S) channel is used as an auxiliary source. The influence of detector noise is reduced by averaging data in the local neighborhood of the point.

2.2 Extracting Interest Points

As mentioned earlier, the Harris algorithm is one of the most popular methods for detecting characteristic points in 2D images because it is invariant to rotation, illumination variation, and image noise. However, because of differences between the mathematical representation of 2D images and 3D point clouds, it is not possible to directly apply this algorithm to data acquired by 3D scanners. The main

problems with 3D data are variable density of points, surface discontinuities caused by object self-occlusions or depressions, and surface elevations greater than the measurement volume depth. Therefore, it is necessary to modify the original algorithm to compensate for these issues. The Harris detector describes the local autocorrelation function of a signal, which measures local changes of the signal with patches shifted by small amounts in different directions. Harris and Stephens proposed to analyze the eigenvalues of the local autocorrelation function; however, to avoid calculating the eigenvalues, whose computational complexity is high, they instead analyzed the Harris detector response defined by:

$$S(x, y) = \det(M) - k \cdot \operatorname{tr}(M)^2, \tag{1}$$

$$M = \sum_{x}\sum_{y} w(x, y)
\begin{bmatrix}
\left(\dfrac{\partial I}{\partial x}\right)^{2} & \dfrac{\partial I}{\partial x}\dfrac{\partial I}{\partial y}\\[6pt]
\dfrac{\partial I}{\partial x}\dfrac{\partial I}{\partial y} & \left(\dfrac{\partial I}{\partial y}\right)^{2}
\end{bmatrix}, \tag{2}$$
where S(x, y) is the stability coefficient of a point, k is an empirically set coefficient, and w(x, y) is the Gaussian window. According to the given formulas, the Harris algorithm requires calculating the partial derivatives of the intensity distribution in the XY plane. In our approach, when calculating the partial derivatives at each point, a plane is fitted with the least-squares method to the point's spherical neighborhood, and the set of points is rotated so that the normal of the fitted plane becomes the z′-axis and the central point lies at the origin of the X′Y′ plane. The orientation of the x′-axis is determined as an orthogonal projection of the x-axis onto the best-fitted plane; consequently, the y′-axis is oriented perpendicularly to the x′ and z′ axes. In this way, computation of local partial derivatives is limited to the X′Y′ plane of a new local coordinate system related to the analyzed point. The values of the directional derivatives calculated in the directions of the X′ and Y′ axes are given by the difference of mean lightness values of the neighboring points located on both sides of the plane perpendicular to the object surface. The radius of the neighborhood Rdr has a value that depends on the average distance between points, and it should enable determining the derivatives while maintaining most of the texture details. From the partial derivatives of the intensity distribution, the response of the Harris detector (stability coefficient) is computed. The stability of detection of the interest points is determined by the level of the intensity gradients in the directions of the x and y axes in the image coordinate system. The higher the value of the gradient, the higher the probability that the point is well localized. Values of the resulting Harris map (Figure 2(e)) have the following meaning:

—S > 0 – corner (characteristic area),
—S < 0 – edge,
—S ≈ 0 – flat area.
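The response computation of Equations (1) and (2) can be sketched as follows, assuming the lightness derivatives have already been estimated in the point's local plane; for brevity the sketch uses uniform rather than Gaussian weights, and the function name is illustrative:

```python
import numpy as np

def harris_response(Ix, Iy, k=0.04):
    """Harris stability coefficient S for one point neighborhood.

    Ix, Iy: 1-D arrays of intensity (lightness) derivatives at the
    neighboring points, expressed in the local plane fitted to the
    neighborhood. k is the empirically set Harris coefficient.
    """
    # Structure tensor M of Eq. (2), with uniform window weights.
    M = np.array([
        [np.sum(Ix * Ix), np.sum(Ix * Iy)],
        [np.sum(Ix * Iy), np.sum(Iy * Iy)],
    ])
    # Response of Eq. (1): det(M) - k * tr(M)^2.
    return np.linalg.det(M) - k * np.trace(M) ** 2
```

With gradients varying in both directions (a corner) the response is positive; with gradients confined to one direction (an edge) it is negative; near-zero gradients give a near-zero response, matching the classification above.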
The Harris map local maxima that have sufficient contrast are considered to be characteristic points of a point cloud (Figure 2(f)), with the condition that they are not localized on a point cloud boundary (places of point cloud discontinuity), where wrong results are more likely due to various measurement errors and noise.

2.3 Interest Point Features

Building a complete 3D model requires integration of many views. To determine the alignment between two point clouds unambiguously, at least three pairs of corresponding interest points from both datasets are needed. In terms of both computation time and the probability of obtaining a correct result, it is

necessary to minimize the number of analyzed pairs of points while maintaining the greatest possible number of correct pairs. A descriptor of the interest points must be developed to enable selection of the minimal number of pairs that are likely to indicate the same point of an object surface. We propose a new descriptor, called the Orbital Points descriptor (OrP), which characterizes the interest points by analyzing derivatives and average values of hue (H), saturation (S), or lightness (L) in their neighborhood RD. The descriptor was designed to be highly distinctive and invariant to point cloud rotation while also being simple to apply to unstructured point clouds. It is constructed according to the scheme presented in Table I. The descriptor of each point contains the values shown in Table II.

2.4 Merging of Point Clouds

Merging of point clouds is performed with the RANSAC algorithm [Zuliani 2008], which significantly accelerates the search for the proper transformation and rejects incorrect interest points without matches, called outliers. While merging two point clouds, the algorithm analyzes all possible pairs of points between the two datasets. The initial sifting process, which serves to exclude false matches, is conducted while accounting for the similarity coefficients of points, which are separately calculated for each of the descriptor components. The coefficients are defined as a Euclidean metric between the corresponding values, according to:

$$D = \sqrt{\sum_{i} \frac{(v_{ai} - v_{bi})^2}{v_{ai}^2 + v_{bi}^2}}, \tag{3}$$

where $v_{ai}$ is the value of the descriptor component in cloud a, and $v_{bi}$ is the value of the descriptor component in cloud b. The threshold of the analyzed coefficient, which defines what points can be considered similar, is selected according to the mean value of the coefficient calculated for the whole set of points.
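The similarity coefficient can be sketched as below, assuming Equation (3) sums the normalized squared differences over the components of one descriptor; the guard against a zero denominator is our own addition, not part of the paper:

```python
import math

def similarity(desc_a, desc_b):
    """Similarity coefficient D between a descriptor component vector
    from cloud a and the corresponding vector from cloud b (Eq. 3).

    Smaller D means more similar interest points; pairs whose D stays
    under a threshold derived from the mean coefficient are kept.
    """
    return math.sqrt(sum(
        (va - vb) ** 2 / (va ** 2 + vb ** 2)
        for va, vb in zip(desc_a, desc_b)
        if va != 0 or vb != 0  # assumed guard: skip undefined terms
    ))
```

Identical descriptors give D = 0, and D grows with per-component disagreement, which is what the initial sifting step thresholds on.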
More specifically, in the first step, pairs of points whose coefficients do not exceed a given percentage of the mean value are considered; if the merging process does not succeed, more pairs of points are analyzed. The aim of applying RANSAC in this task is to extract a set of at least three pairs of points from all possible filtered pairs and, from these pairs, calculate the transformation between the two datasets. The number of iterations of the RANSAC algorithm is set based on the number of pairs of similar points; therefore, the probability of finding corresponding points in two datasets is maintained at a certain level [Zuliani 2008]. Aside from the condition of descriptor similarity, the selected pairs of points must fulfill an additional distance condition: the differences of distances between each pair of points in the first and second point cloud must be smaller than a given threshold. Moreover, distances between selected point pairs must be larger than the distance between neighboring points, and the triangle inequality must be satisfied strictly, so that the selected points are not collinear. The correctness of the merged cloud transformation is initially evaluated by the number of votes of similar pairs of points. The more votes obtained for a particular transformation, the more likely the transformation is correct. The result of the voting does not provide explicit information about the correctness of the match; therefore, the transformations with the highest numbers of votes are stored, and the geometric and color fitting errors between the analyzed point clouds are calculated (for each of these candidate transformations) according to Formula 6. EG evaluates the geometric misalignment of two point clouds by analyzing the distance between their surfaces at randomly selected points from the first cloud. In the spherical neighborhood of each randomly selected point, two planes are locally fitted with the least-squares method to the surfaces of the first and second cloud.
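The transformation estimated inside each RANSAC iteration from (at least) three corresponding point pairs can be sketched with the standard least-squares rigid alignment (Kabsch/SVD); the paper does not specify its exact estimator, so this is an illustrative stand-in:

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rigid transformation (rotation R, translation t)
    mapping corresponding 3D points src onto dst (rows are points).

    Standard Kabsch/SVD alignment, as commonly used to fit the model
    hypothesis inside a RANSAC loop.
    """
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    cs, cd = src.mean(axis=0), dst.mean(axis=0)    # centroids
    H = (src - cs).T @ (dst - cd)                  # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))         # avoid reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t
```

Each candidate (R, t) would then be scored by the voting and the fitting errors described below.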
The distance used to formulate the EG error is the distance between the selected point's orthogonal projections onto the first and second planes. The geometric fitting error is defined with

Table I. Method of Calculating an Interest Point Descriptor.

Table II. Descriptor Components

the following:

$$E_G = \frac{1}{n}\sum_{j=1}^{n}\left[d_j(B_{P1}, B_{P2})\right]^2, \tag{4}$$

where n is the number of randomly selected points in the first cloud, and dj(BP1, BP2) is the Euclidean distance between the projections of the jth randomly selected point onto the planes best fitted to the first and second clouds. The influence of measurement noise is reduced by using best-fit planes (by the least-squares method) instead of directly calculating the distances between points. The color fitting error EC is evaluated for component H of the HSL color space. For the points in the overlapping area of the clouds, it is determined by the following formula:

$$E_C = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(H_{p1i} - H_{p2a}\right)^2}, \tag{5}$$

where Hp1i is the value of component H for the ith point in the first cloud, and Hp2a is the value of component H for the adjacent point in the second cloud. Point Hp2a, adjacent to point Hp1i of the first cloud, is chosen as the point closest to the line perpendicular to the surface of the first cloud and passing through Hp1i. The transformation with the smallest cumulative error, for which EG and EC are within the established thresholds, is considered correct. The cumulative error is determined as the sum of the geometric fitting error and the color fitting error multiplied by a weight factor:

$$E = E_G + k_E \cdot E_C. \tag{6}$$
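The evaluation of a candidate transformation via Equations (4)-(6) can be sketched as follows, assuming the point-to-plane distances and corresponding hue values have already been computed (the plane fitting and nearest-neighbor search are omitted); the function and argument names are our own:

```python
import math

def cumulative_error(geom_dists, hue_first, hue_second, k_e):
    """Cumulative fitting error E = E_G + k_E * E_C.

    geom_dists: distances d_j between plane projections for randomly
    sampled points of the first cloud; hue_first/hue_second: H values
    of corresponding points in the overlap area; k_e: weight factor.
    """
    n_g = len(geom_dists)
    e_g = sum(d * d for d in geom_dists) / n_g              # Eq. (4)
    n_c = len(hue_first)
    e_c = math.sqrt(
        sum((h1 - h2) ** 2 for h1, h2 in zip(hue_first, hue_second)) / n_c
    )                                                       # Eq. (5)
    return e_g + k_e * e_c                                  # Eq. (6)
```

The transformation minimizing this value, subject to the EG and EC thresholds, would be accepted and refined with ICP.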

The weight factor enables summing the difference of color values with the difference of Euclidean distances. It is described with:

$$k_E = (1 + \sigma_{rel}) \cdot A, \tag{7}$$

where σrel is the relative standard deviation of color in the common part of the merged clouds, and A is the average distance between points. If the transformation is accepted, the fitting accuracy is increased by applying the ICP algorithm [Besl and McKay 1992; Jost 2002; Pulli 1999]. The point cloud merging algorithm is able to align a single pair of clouds or conduct the alignment according to the "all to all" scheme. While merging the

Fig. 3. Result of data simplification. (Left) Computation time as a function of the degree of simplification; the highlighted area covers the degrees of simplification for which the correct transformation was obtained. (Right) Fitting error as a function of the degree of simplification.

entire set of clouds, the first step is to find an initial pair of clouds that can be successfully fitted. After finding and merging the first pair of clouds, they are marked as fitted and stable. Other clouds are iteratively fitted to the model while minimizing the cumulative error.

3 RESULTS

To validate the usefulness of the Harris detector and OrP descriptor in the automatic merging of directional point clouds, we performed several experiments using actual data obtained with a high-resolution 3D scanner. Before applying the coarse merging algorithm, each analyzed cloud was simplified to speed up calculations. The simplification factor was selected with consideration of the fitting precision achievable with simplified data. This precision was evaluated on the basis of the feasibility of Color ICP [Jost 2002], which uses color as a factor in stabilizing the fit. Its accuracy was determined by comparing the root-mean-square error between clouds transformed before and after correction. The chart in Figure 3 shows that neither too great nor too small a degree of simplification provides the correct result. The degree of simplification should therefore be adopted as a compromise between the acceptable fitting error and computation time. For testing purposes, point clouds were simplified by a factor of 50, which enabled a fitting inaccuracy of less than 1% of the average distance between points; this is an acceptable compromise between calculation time and fitting accuracy. All tests were performed on a machine with two Quad-Core Intel Xeon processors clocked at 2.66GHz and 32GB of RAM, running Microsoft Windows 7 Professional 64-bit. The first experiment was intended to compare the results obtained with the Harris detector and OrP descriptor against existing algorithms for extracting and describing interest points.
For this purpose, we performed some calculations using the Point Cloud Library (PCL) (http://pointclouds.org/documentation), an open-source library developed by numerous engineers and scientists from many well-known organizations. The PCL framework includes numerous algorithms for filtering, feature estimation, and surface reconstruction. We compared the results obtained with the Harris and PCL SIFT detectors (Figure 4). The SIFT 3D detector blurs the intensity of a point by analyzing its neighbors within a fixed-size radius (based on the scale of the Gaussian). Blurring and subtracting point clouds at different scales enables the creation of a 4D difference-of-Gaussian (DoG) scale space, in which the local maxima are selected as interest points. This involves analyzing both the nearest neighbors of points in a point cloud and the values at the nearest scales in the scale space. Table III provides an

Fig. 4. Sample pairs of point clouds from the King’s Chinese Cabinet ceiling with extracted interest points.



Table III. Processing Parameters for Interest Point Extraction. A = average point-to-point distance; ∗∗ denotes the heavily SSE-optimized SIFT implementation versus the only initially optimized Harris implementation.

overview of the processing parameters used in both cases. For this test, we chose two pairs of point clouds. For the first pair, the overlapping part of the two surfaces was estimated at 20%; for the second pair, it was estimated at 60% of the whole cloud surface. We used pairs of point clouds from the set presented in the second experiment; however, the detailed comparison of the results for all techniques was limited to two pairs because the other examples led to analogous conclusions. In all cases, the interest points were extracted with the same input parameter values. The Harris detector required four parameters, including the radii controlling the neighborhood for partial derivative calculation, extracting local maxima of the Harris map, and marking the point cloud boundary where interest points are not extracted. Additionally, a parameter that determines the level of acceptable contrast for the Harris detector response range was required. The SIFT detector used in these tests calculates the interest points according to color information and requires setting input parameters, including the standard deviation of the smallest scale in the scale space, the number of octaves to compute, the number of scales to compute within each octave, and the minimum contrast required for detection. The values of the input parameters set in the first experiment are provided in Table III. In our approach, these parameters are defined according to the average point-to-point distance in a point cloud. The




E. Hołowko et al.

Table IV. Processing Parameters for Descriptor Computation

Table V. Results of View Integration Using RIFT and OrP Descriptors

The initial tests demonstrated that the calculation time of the interest points was comparable in both cases. However, it should be noted that the PCL SIFT implementation was mature and heavily optimized with SSE instructions, whereas our Harris detector was implemented in a proof-of-concept manner and only initially optimized. In addition, according to the results of the first experiment, the Harris detector ensured better filtering of points that are assumed to be the characteristic ones (see Table V for a comparison of the number of filtered point pairs for Harris OrP and SIFT OrP). To compare the results of integration according to the interest points extracted with the Harris and SIFT detectors, the computation of Rotation-Invariant Feature Transform (RIFT) [Lazebnik et al. 2005] and OrP descriptors is introduced. The RIFT descriptor is a generalization of Lowe's SIFT algorithm implemented in the PCL library. It analyzes the neighborhood as concentric rings of equal width, and a histogram of gradient orientations is computed within each ring. This algorithm requires input parameters, including the number of bins to use in both the distance dimension and the gradient orientation dimension, as well as the radius of the RIFT feature (Table IV). The proposed OrP descriptor, on the


Fig. 5. Result of merging Pair 1 and Pair 2 with the Harris detector and OrP descriptor.

other hand, requires only the radius of the neighborhood being analyzed while identifying the orbital points and the color information channel that should be used. In our tests, both interest point detectors and both descriptors were used interchangeably to merge the point clouds of the Pair 1 and Pair 2 datasets. To obtain the final transformations, the RANSAC algorithm was used. The obtained results are presented in Table V. First, the numbers of point pairs that were filtered among all possible pairs according to the descriptors were compared. From those filtered pairs, three pairs of corresponding points were randomly chosen; if the distance-difference condition was fulfilled, the transformation between the clouds was calculated. For the few transformations with the highest voting results, the fitting error was calculated according to Formula 6. The errors presented in Table V are the minimal ones achieved for each case. These results clearly show the usefulness of the modified Harris detector along with the OrP descriptor for merging 3D data that lacks characteristic geometric features but has good color texture (Figure 5). In terms of speed and accuracy, this combination of methods performed far better than the SIFT detector in conjunction with the RIFT descriptor. When the overlapping area of two point clouds was sufficient, the SIFT RIFT combination merged point clouds with an error higher than that of Harris OrP but below the threshold value (the acceptable error threshold in the experiment was set to 1.0A = 1.56 mm). In addition, the RIFT computations took five times longer than those of Harris OrP. Decreasing the overlapping area between point clouds resulted in failure when using SIFT RIFT, whereas Harris OrP succeeded in merging the data with a fitting error below the threshold and with reasonable computation time. In this experiment, we also compared how the SIFT detector performed with the proposed OrP descriptor.
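The sampling-and-voting scheme described above can be sketched as follows. This is a minimal pure-Python illustration of RANSAC with a distance-difference consistency check, not the article's implementation: the rotation is recovered by aligning orthonormal frames built from the sampled triplets, and all function names, iteration counts, and thresholds are assumptions for the example.

```python
import math, random

def sub(a, b): return (a[0]-b[0], a[1]-b[1], a[2]-b[2])
def add(a, b): return (a[0]+b[0], a[1]+b[1], a[2]+b[2])
def dot(a, b): return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]
def cross(a, b):
    return (a[1]*b[2]-a[2]*b[1], a[2]*b[0]-a[0]*b[2], a[0]*b[1]-a[1]*b[0])
def norm(a): return math.sqrt(dot(a, a))
def unit(a):
    n = norm(a)
    return (a[0]/n, a[1]/n, a[2]/n)

def frame(p1, p2, p3):
    # orthonormal frame spanned by three non-collinear points
    u = unit(sub(p2, p1))
    w = unit(cross(u, sub(p3, p1)))
    v = cross(w, u)
    return (u, v, w)

def apply_rot(R, p):
    return tuple(sum(R[r][c] * p[c] for c in range(3)) for r in range(3))

def rigid_from_triplet(src, dst):
    # R maps the source frame onto the destination frame: R = Fd * Fs^T
    fs, fd = frame(*src), frame(*dst)
    R = [[sum(fd[k][r] * fs[k][c] for k in range(3)) for c in range(3)]
         for r in range(3)]
    t = sub(dst[0], apply_rot(R, src[0]))
    return R, t

def transform(R, t, p):
    return add(apply_rot(R, p), t)

def ransac_rigid(pairs, iters=200, eps=0.05, seed=0):
    rng = random.Random(seed)
    best, best_votes = None, -1
    for _ in range(iters):
        sample = rng.sample(pairs, 3)
        src = [s for s, _ in sample]
        dst = [d for _, d in sample]
        # distance-difference condition: a rigid transform preserves
        # pairwise distances, so inconsistent triplets are rejected early
        if any(abs(norm(sub(src[i], src[j])) - norm(sub(dst[i], dst[j]))) > eps
               for i in range(3) for j in range(i)):
            continue
        try:
            R, t = rigid_from_triplet(src, dst)
        except ZeroDivisionError:   # collinear sample, no frame
            continue
        # voting: count correspondences explained by this transform
        votes = sum(1 for s, d in pairs
                    if norm(sub(transform(R, t, s), d)) < eps)
        if votes > best_votes:
            best, best_votes = (R, t), votes
    return best, best_votes
```

The candidate with the most votes is kept, and its fitting error would then be evaluated against the acceptance threshold as in Formula 6.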
As shown by the data in Table V, the application of the OrP descriptor enabled significant improvements in view integration results compared to RIFT. This clearly shows the potential of the developed OrP descriptor. The results of Harris OrP processing are similar to those obtained with SIFT OrP. Considering that our Harris detector implementation was far from optimized, it can be assumed that after optimization it will easily outperform SIFT in computation time with similar qualitative results. In the second experiment, the proposed method was used to build a model of two sections of the King's Chinese Cabinet ceiling (Figure 1). The first section consisted of 22 point clouds from the area covered with distinctive paintings (Figure 6); the second section included 24 point clouds (Figure 7). The first dataset was properly merged with a fitting error under the threshold value. In the second case, one of the merged point clouds was improperly matched, which caused error propagation in the next three clouds. The error was too large to be corrected with ICP. A summary of the computation results is presented in Table VI.


Fig. 6. Result of merging the first test fragment. The enlarged fragments show neighboring clouds before and after Color ICP correction.

Fig. 7. Result of merging the second test section with ICP correction. The highlighted area is the overlapping section of the incorrectly merged pair of clouds.

Table VI. Time of Integration and Correction of Individual Fragments of the King's Chinese Cabinet Ceiling for the 50-fold Simplification of Point Clouds. ∗ denotes fully automated integration when all possible connections between point clouds were considered.



Fig. 8. Experimental data from the wall of the King’s Chinese Cabinet.

Table VII. Processing Parameters for the Model from Figure 8

In the next two experiments, the Harris detector and OrP descriptor were applied to more complex examples. In the first dataset (Figure 8), the object surface was no longer planar, and the overlapping area of two point clouds was estimated at 20% to 30% of the whole area of a point cloud. The processing parameters are presented in Table VII. Statistics for the integration process are presented in Table VIII. A comparison of the results shows that when the algorithm was able to eliminate many outliers by analyzing the descriptors' values, the time needed to obtain the proper solution was much shorter. In the opposite case, when the number of filtered point pairs was large, the computation time was much longer. Nevertheless, in all presented examples, the number of interest points described with distinctive features was sufficient to merge the point clouds with a fitting error below the acceptable threshold (the error threshold was set to 1.0A = 1.36 mm). Therefore, the developed pair of algorithms (the modified Harris detector and the OrP descriptor) is evidently also suitable for working with clouds with nonplanar, complicated surfaces. For comparison, the second part of Table VIII shows the results of integration with purely geometrical methods (the PCL Harris3D method, which identifies characteristic points by analyzing normal vectors, along with the SHOT descriptor). Our method clearly gives results similar to the purely geometrical one, but it is more universal, since it can also be used for processing data without distinct geometrical features.
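The effect of descriptor-based outlier elimination on RANSAC's workload can be sketched as follows. The Euclidean descriptor distance and the threshold below are assumptions for illustration, not the OrP matching rule; the point is that fewer surviving false correspondences means fewer random draws before a consistent triplet is found.

```python
def filter_pairs(desc_a, desc_b, max_dist):
    """Keep only candidate correspondences whose best descriptor
    distance is below max_dist; the survivors feed RANSAC."""
    def d(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5
    pairs = []
    for i, da in enumerate(desc_a):
        # nearest descriptor in the other cloud
        j, best = min(((j, d(da, db)) for j, db in enumerate(desc_b)),
                      key=lambda t: t[1])
        if best < max_dist:
            pairs.append((i, j))
    return pairs
```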




Table VIII. Results of View Integration

Fig. 9. (a) Different model views of a Chinese figurine from Museum of King Jan III’s Palace at Wilanow. (b) Six overlapping pairs of point clouds used in the repeatability test.

In the next experiment, a model of an ancient Chinese figurine (Figure 9) consisting of 41 directional measurements is presented. It was merged with the newly developed Harris detector and OrP descriptor. The parameters controlling the process are given in Table IX. In the first experiment (with planar clouds from the King's Chinese Cabinet ceiling), we confirmed the effectiveness of the proposed OrP descriptor compared to the RIFT descriptor from the PCL library. We also compared our Harris detector with PCL's SIFT detector in terms of interest point filtering ability (the Harris detector ensured slightly better filtering). The computation time was also comparable; however, considering that our implementation was not optimized, it can be assumed that its final performance after optimization will be significantly better than that of PCL's SIFT. Furthermore,


Table IX. Processing Parameters for the Model in Figure 9. ∗ denotes the whole dataset. ∗∗ signifies fully automated integration.

Table X. Statistics for Interest Point Extraction with Harris and SIFT Detectors. ∗ denotes heavily optimized SIFT with SSE optimization and initially optimized Harris.

in the following experiment, for exemplary views from the Chinese figurine model (Figure 9(b)), we compared the repeatability of interest point extraction with the proposed Harris detector and the SIFT detector (Table X). Six pairs of neighboring point clouds were manually aligned, and the overlapping part was extracted in each case (Figure 10). For each extracted region, interest points were calculated with both the Harris and SIFT detectors. This repeatability test evaluated how similarly each detector extracts interest points in a cloud pair. The more similar points in the cloud pair are identified,


Fig. 10. Exemplary pair of point clouds with interest points extracted with Harris and SIFT detectors.

Fig. 11. Repeatability test results for six exemplary pairs of point clouds.

the better the chance that the calculated transformation is the correct one. The measure of this similarity is the reciprocal of the mean distance between the closest interest points from the overlapping point clouds:

R = n / Σ_{i=1}^{n} d_i,    (8)

where d_i is the distance from the ith point of the first overlapping point cloud to the closest point of the second one, and n is the number of points in the first overlapping point cloud. The results of the repeatability test are presented in Figure 11. In this experiment, we additionally compared how many pairs of points extracted with the Harris and SIFT detectors voted for the correct transformation. For this purpose, the percentage of all pairs of points that voted (that is, are similar in each pair of point clouds) is given in Table XI. A higher number of votes given by pairs of corresponding points suggests that the performance of a detector is more stable.

Table XI. Results of Repeatability Test for Harris and SIFT Detectors. ∗ denotes calculation with Formula 8.

Fig. 12. Test point clouds that cannot be successfully merged. (Overlapping areas are highlighted.)

The success of view integration using the proposed method depends on both the number of interest points extracted from the common area and their distinguishability according to the descriptors used. In Figure 12, we present point clouds that cannot be successfully merged with the method proposed in this article. The overlapping area of the point clouds is approximately 20%. Moreover, approximately 100 interest points are extracted from this area; however, they cannot be distinguished from each other according to the OrP descriptor because the color gradient in their neighborhood is ambiguous.
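The repeatability measure of Formula 8 can be implemented directly. The sketch below assumes a simple brute-force nearest-neighbor search over the interest points detected in the two overlapping regions; the function name is our own.

```python
import math

def repeatability(points_a, points_b):
    """Formula 8: reciprocal of the mean nearest-neighbor distance
    between interest points detected in two overlapping clouds."""
    def nearest(p, cloud):
        return min(math.dist(p, q) for q in cloud)
    n = len(points_a)
    mean_d = sum(nearest(p, points_b) for p in points_a) / n
    return 1.0 / mean_d
```

A larger R means the two detections nearly coincide; perfectly repeated interest points drive the mean distance toward zero and R toward infinity.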

4. CONCLUSIONS

In this article, we presented a Harris corner detector designed to extract interest points from unordered point clouds containing color texture. We also proposed the new OrP descriptor, which is rotationally invariant and highly distinctive. We validated their performance by comparing them with the SIFT detector and RIFT descriptor implemented in the PCL. The comparison considered the ability to converge to a proper solution, the number of input parameters, computation time, and the repeatability of extracted interest points. The proposed detector extracts interest points as maxima of the Harris operator's response, which is calculated according to the gradient of the color information. Our implementation involves some additional steps, such as ignoring points with low contrast and eliminating the boundary of point clouds. These steps avoid extracting interest points in places where wrongly calculated points


may occur (Figure 10). The presented results confirmed the effectiveness of extracting interest points with our Harris detector for point clouds that have characteristic texture and are devoid of characteristic geometric features (planar surfaces), as well as for point clouds with more complex geometries. A comparison of our implementation of the Harris detector and PCL's SIFT 3D detector indicated that the obtained results are comparable for simple examples of planar point clouds with characteristic features; however, experiments with more complex datasets confirmed that the proposed detector has a more stable performance and better repeatability of interest point extraction. The presented calculation times are similar for both detectors; however, this outcome should be regarded as an initial comparison because the SIFT implementation from PCL is heavily optimized with SSE instructions, whereas our Harris detector implementation was only initially optimized and requires further improvements (at least multithreading). Nevertheless, the results of the initial tests obtained with the Harris detector show the high potential of the proposed method. In addition, the complexity of the SIFT detector, which requires a cascade filtering approach and detection of extrema of a 4D scale space, is much higher than that of the Harris detector, which only analyzes the maxima of its operator's response. Therefore, it is justifiable to expect much shorter computation times after further optimization. In addition, we compared our newly developed OrP descriptor with PCL's RIFT descriptor. The obtained results proved that our descriptor not only requires fewer input parameters but also ensures better distinctiveness of interest points. Consequently, our descriptor enables faster and more accurate integration of directional views.
We tested the proposed method on multiple examples that included both objects without distinctive geometric features and more complex ones. The obtained results proved that the Harris detector used with the OrP descriptor is able to successfully integrate such point cloud data according to color information, even if the overlapping area of two datasets is small (approximately 20%), with the only required condition being a sufficient number of distinctive features in the texture of the analyzed surface. The comparison with the purely geometrical integration method (for data with characteristic shape features), presented in Table VIII, shows that our method is capable of obtaining similar results. It must be noted that the algorithms we propose can also deal with data without distinct geometrical features and are therefore more universal.

ACKNOWLEDGMENTS

This work has been financially supported by the National Centre for Research and Development as part of the project "Reconstruction of the Course of Events Based on Bloodstain Pattern Analysis" (DOBR/0006/R/ID1/2012/03), realized within the Defense and Security Programme (2012–2015), and by statutory work of Warsaw University of Technology.

REFERENCES

G. H. Bendels, P. Degener, R. Wahl, M. Kortgen, and R. Klein. 2004. Image-based registration of 3D-range data using feature surface elements. In Proceedings of the 5th International Conference on Virtual Reality, Archaeology and Intelligent Cultural Heritage. 115–124.

P. J. Besl and N. D. McKay. 1992. A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence 14, 239–256.

J. Gómez-García-Bermejo, E. Zalama, and R. Feliz. 2013. Automated registration of 3D scans using geometric features and normalized color data. Computer-Aided Civil and Infrastructure Engineering 28, 98–111.

C. Harris and M. Stephens. 1988. A combined corner and edge detector. In Proceedings of the Fourth Alvey Vision Conference. 147–151.

K. Ikeuchi and D. Miyazaki. 2008. Digitally Archiving Cultural Objects. Springer.


A. E. Johnson and M. Hebert. 1999. Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence 21, 433–449.

I. T. Jolliffe. 2002. Principal Component Analysis (2nd ed.). Springer.

T. Jost. 2002. Fast Geometric Matching for Shape Registration. Retrieved from http://wwwa.unine.ch/parlab/pub/pdf/2002thTJ.pdf.

S. Lazebnik, C. Schmid, and J. Ponce. 2005. A sparse texture representation using local affine regions. IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 1265–1278.

D. G. Lowe. 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60, 91–110.

A. Makadia, A. I. Patterson, and K. Daniilidis. 2006. Fully automatic registration of 3D point clouds. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition 1, 1297–1304.

K. Mikolajczyk and C. Schmid. 2005. A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 1615–1630.

H. Moravec. 1980. Obstacle Avoidance and Navigation in the Real World by a Seeing Robot Rover. Robotics Institute, Carnegie Mellon University.

K. Pulli. 1999. Multiview registration for large data sets. In Proceedings of the 2nd International Conference on 3-D Digital Imaging and Modeling. 160–168.

G. Roth. 1999. Registering two overlapping range images. In Proceedings of the 2nd International Conference on 3-D Digital Imaging and Modeling. 191–200.

R. B. Rusu, N. Blodow, and M. Beetz. 2009. Fast point feature histograms (FPFH) for 3D registration. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA'09). 3212–3217.

R. B. Rusu, Z. C. Marton, N. Blodow, and M. Beetz. 2008. Persistent point feature histograms for 3D point clouds. In Proceedings of the 10th International Conference on Intelligent Autonomous Systems. 154, 477.

I. Sipiran and B. Bustos. 2010. A robust 3D interest points detector based on Harris operator. In Proceedings of the 3rd Eurographics Conference on 3D Object Retrieval. 7–14.

R. Sitnik, P. Bolewicki, J. Rutkiewicz, J. Michonski, M. Karaszewski, J. Lenar, K. Mularczyk, and W. Zaluski. 2010. Project "Revitalization and Digitization of the Seventeenth Century Palace Complex and Garden in Wilanow – Phase III," task "3D Digitalization of Selected Exhibits Collection." In Proceedings of EuroMed 2010. 7–13.

R. Sitnik, M. Kujawinska, and J. Woznicki. 2002. Digital fringe projection system for large-volume 360-degree shape measurement. Optical Engineering 41, 443–449.

B. Steder, R. B. Rusu, K. Konolige, and W. Burgard. 2011. Point feature extraction on 3D range scans taking into account object boundaries. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA'11). 2601–2608.

F. Tombari, S. Salti, and L. Di Stefano. 2011. A combined texture-shape descriptor for enhanced 3D feature matching. In Proceedings of the IEEE 18th International Conference on Image Processing (ICIP'11). 809–812.

A. Zaharescu, E. Boyer, K. Varanasi, and R. Horaud. 2009. Surface feature detection and description with applications to mesh matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'09). 373–380.

F. Zhao. 2006. Image matching by normalized cross-correlation. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'06).

Y. Zhong. 2009. Intrinsic shape signatures: A shape descriptor for 3D object recognition. In Proceedings of the IEEE 12th International Conference on Computer Vision Workshops (ICCV Workshops). 689–696.

M. Zuliani, Ed. 2008. RANSAC for Dummies. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?rep=rep1&type=pdf&doi=10.1.1.139.2808.

Received March 2013; revised November 2013; accepted December 2013

