F. Boughorbel, A. Koschan, and M. Abidi, "Automatic Registration of 3D Datasets using Gaussian Fields," in Proc. IEEE International Conference on Image Processing (ICIP 2005), Vol. III, Genoa, Italy, pp. 816-819, September 2005.
Automatic Registration of 3D Datasets using Gaussian Fields
Faysal Boughorbel
Video Processing and Visual Perception Group, Philips Research Labs, Eindhoven, The Netherlands
[email protected]

Andreas Koschan, Mongi Abidi
Department of Computer and Electrical Engineering, University of Tennessee, Knoxville, Tennessee, USA
[email protected], [email protected]
Abstract—In this paper we introduce a new automatic 3D registration method based on Gaussian fields and energy minimization. The method defines a simple C∞ energy function, which is convex in a large neighborhood of the alignment parameters, allowing for the use of powerful standard optimization techniques. We show that the size of the region of convergence can be significantly extended, reducing the need for close initialization and overcoming the local convergence problems of standard Iterative Closest Point (ICP) algorithms. Furthermore, the Gaussian criterion can be evaluated with linear computational complexity using Fast Gauss Transform methods, allowing for an efficient implementation of the registration algorithm. Experimental analysis of the technique using real-world datasets shows the usefulness as well as the limits of the approach.

Keywords—3D Registration; Gaussian Fields; Fast Gauss Transform; Optimization.
I. INTRODUCTION
Due to their limited field of view and to the occlusion problem, most 3D imaging systems, such as laser range scanners, provide only partial scans of a scene. In order to build a complete description of scene geometry, several of these partial views must be merged together. Since these datasets are originally represented in local sensor coordinate frames, registration is a fundamental step in most 3D modeling pipelines. In this paper we focus on the case of rigid transformations (R, t): 3D rotations and translations. Given point correspondences between the 3D views, the rotation and translation parameters can be computed in closed form [8]. Recovering those matches automatically is the first task in many modeling systems. This step, however, usually results in a rough registration that needs to be refined using Iterative Closest Point (ICP) algorithms [1][10]. To obtain point correspondences between 3D datasets, several invariant feature extraction techniques have been proposed. The methods surveyed by Campbell and Flynn [2] are mostly surface-based and use differential properties to build their representations. Most invariant feature methods assume that the partial surfaces were accurately reconstructed from the range maps. In real applications, however, these surfaces are affected by noise, reducing the accuracy of invariant-feature registration, hence the need for further refinement using point-based techniques.

The ICP algorithm is a locally convergent scheme that requires parameter initialization close to the aligned position. It operates at the point level, minimizing the mean squared distance between the datasets. Despite its popularity, ICP has several shortcomings, including the need for sufficient overlap between the datasets and local convergence. The main contribution of this work is a new point-set registration criterion which is differentiable and convex in a large neighborhood of the aligned position. The goal is to overcome the problems of standard registration techniques, in particular ICP. The main reason behind those limitations can be attributed to the non-differentiable cost function associated with ICP, which imposes heuristic minimization and local convergence. Hence, in real applications a preliminary point-feature extraction and matching step is necessary before proceeding to what is considered an ICP-based refinement step. Recently, similar work was done on the design of approximations to non-differentiable matching and similarity measures, including the work by Charpiat et al. [3] on approximating Hausdorff distances by a differentiable metric on shape space, and the use of distance functions and gradient-based optimization techniques to minimize the ICP criterion [5]. In our case we use a straightforward sum of Gaussians of distances that is defined for generalized point-sets with associated attributes. These attributes can be local moments computed from the datasets [12]. The criterion is convex in the neighborhood of the solution and everywhere differentiable, allowing for the use of a wide range of well-proven optimization techniques. Physically, the method can be interpreted in terms of Gaussian force fields attracting the two datasets toward the correct registration, similar to the Gaussian forces encountered in particle physics. The method can also be derived from combinatorial matching using a mollification and relaxation approach. We show that this criterion can be effectively used for registration, extending the region of convergence so that close initialization is not needed. More importantly, the criterion can be evaluated with linear computational complexity using recent numerical techniques known as Fast Gauss Transform methods [4]. In the following sections we first present the Gaussian energy function, then give an overview of the fast evaluation method, and finally show an analysis of our approach based on several experimental results.
II. THE GAUSSIAN FIELDS CRITERION
We start by introducing a very simple combinatorial criterion expressing the maximum (point-to-point) overlap of two point-sets M = {P_i}_{i=1...N_M} and D = {Q_j}_{j=1...N_D} that are registered by a transformation Tr*. We assume at this point the noiseless case. For the problem to be well-posed we also need to assume that M and D have a maximum point-to-point overlap at the aligned position. Then the following measure (1) will have a global maximum at Tr*:

$$E(Tr) = \sum_{\substack{i=1\ldots N_M \\ j=1\ldots N_D}} \delta\big(d(Tr(P_i), Q_j)\big) \qquad (1)$$

with δ(t) = 1 for t = 0 and δ(t) = 0 otherwise, where d(P, Q) is the distance (in our case Euclidean) between points. Incorporating local shape similarity in this criterion is straightforward and requires only a higher-dimensional representation of the datasets in which points are defined by both position and a vector of shape attributes: M = {(P_i, S(P_i))}_{i=1...N_M} and D = {(Q_j, S(Q_j))}_{j=1...N_D}.
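For concreteness, here is a minimal sketch (our illustration, not part of the original implementation) that evaluates criterion (1) directly for two point arrays. Since δ(t) = 1 only at t = 0, which cannot be tested exactly in floating point, a small tolerance `tol` is introduced as an assumption of the example; attribute vectors, when used, simply extend each point's coordinates.

```python
import numpy as np

def overlap_count(TM, D, tol=1e-9):
    """Criterion (1): count model/data pairs whose points coincide after the
    transformation has been applied to the model set, TM = Tr(M).
    TM: (N_M, k) array, D: (N_D, k) array; k = 3 for pure positions, or
    3 + len(S) when shape attributes S are appended to each point."""
    # all pairwise Euclidean distances d(Tr(P_i), Q_j)
    d = np.linalg.norm(TM[:, None, :] - D[None, :, :], axis=-1)
    # delta(d) = 1 for d == 0; a tolerance stands in for exact equality
    return int((d <= tol).sum())
```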
Obviously the resulting discrete criterion is not continuous with respect to the alignment transformations and can be visualized as a collection of "spikes" in parameter space. The resulting optimization problem is not practical, since it is difficult to find the global maximum of discrete combinatorial functions. One of the core ideas upon which our approach is built is to find a smooth approximation of the combinatorial criterion using an analytical method known as mollification. This approach has been used as a tool to regularize ill-posed problems with respect to differentiability [9].

Given the Gaussian kernel ρ_σ(t) = exp(−t²/σ²) and an arbitrary non-differentiable function f(t) defined on Ω ⊂ ℝ^d, a "mollified" function f_σ(t) can be obtained by convolution:

$$f_\sigma(t) = (\rho_\sigma * f)(t) = \int_\Omega \exp\!\left(-\frac{(t-s)^2}{\sigma^2}\right) f(s)\, ds \qquad (2)$$

The resulting function is an approximation of the original one such that lim_{σ→0} f_σ(t) = f(t). Furthermore, f_σ ∈ C^∞(Ω).
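To make Eq. (2) concrete, the following small sketch (our illustration; the grid, σ values, and spike construction are arbitrary choices of the example) mollifies a spike-like function, similar to the δ terms of criterion (1), by discretizing the convolution integral as a Riemann sum. The result is a smooth bump whose width is controlled by σ.

```python
import numpy as np

def mollify(f_vals, grid, sigma):
    """Discrete approximation of Eq. (2): convolve f with the Gaussian kernel
    exp(-(t - s)^2 / sigma^2) over a uniform grid (Riemann-sum integration)."""
    ds = grid[1] - grid[0]
    kernel = np.exp(-((grid[:, None] - grid[None, :]) ** 2) / sigma ** 2)
    return kernel @ f_vals * ds

t = np.linspace(-1.0, 1.0, 401)
spike = np.zeros_like(t)
spike[len(t) // 2] = 1.0 / (t[1] - t[0])          # unit-mass spike standing in for a delta term
for sigma in (0.05, 0.2):
    f_sigma = mollify(spike, t, sigma)
    width = np.sum(f_sigma > 0.5 * f_sigma.max()) * (t[1] - t[0])
    print(sigma, width)                            # the smooth bump widens as sigma grows
```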
This operation is also known as the Gauss Transform and is encountered in many applications. If we now apply discrete mollification to our combinatorial registration criterion (1) we obtain:

$$E_\sigma(Tr) = \sum_{\substack{i=1\ldots N_M \\ j=1\ldots N_D}} \int \exp\!\left(-\frac{\big(d(Tr(P_i),Q_j) - s\big)^2}{\sigma^2}\right) \delta(s)\, ds = \sum_{\substack{i=1\ldots N_M \\ j=1\ldots N_D}} \exp\!\left(-\frac{d^2\big(Tr(P_i),Q_j\big)}{\sigma^2}\right) \qquad (3)$$

The mollified criterion is a straightforward sum of Gaussians of the distances between all pairs of model and data points. Expression (3) can be re-interpreted physically as the integration of a potential field whose sources are located at the points of one dataset and whose targets are the points of the other. In the noisy case the Gaussian criterion can account for the noise affecting the positions of points by relaxing the parameter σ to values near the noise variance. Fig. 1 illustrates the working of the discrete combinatorial criterion and of its mollified version.

Figure 1. Mollification converts the discrete combinatorial criterion into a smooth sum of Gaussians (a). For relaxed σ the different Gaussians overlap; their mixture is our registration criterion, which has a dominant peak around the registered position (b).

Having met the first of our objectives, which is differentiability, we now examine the possibility of extending the basin of convergence of our criterion. (We focus here on the case of rigid registration, where Tr(P_i) = RP_i + t.) Being a sum of closely packed Gaussian functions, the profile of the criterion with respect to the transformation parameters will generally have the appearance of a Gaussian, with local convexity in the neighborhood of the registered position.

The differentiability and convexity properties allow for the use of standard and well-proven gradient-based optimization techniques. Extending the width of the basin of convergence is easily done by increasing the parameter σ. However, this relaxation comes at the price of decreasing the localization accuracy of the criterion. The tradeoff between registration accuracy and the size of the region of convergence is mainly due to the effect of outliers, i.e., the areas that lie outside the intersection of model and data.
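As an illustration, the sketch below evaluates Eq. (3) by the direct O(N_M × N_D) sum and maximizes it with an off-the-shelf gradient-based optimizer. This is a minimal sketch, not the implementation used in our experiments; the rotation-vector/translation parameterization, the use of SciPy's BFGS with finite-difference gradients, and the function names are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation

def gaussian_criterion(params, M, D, sigma):
    """Direct evaluation of Eq. (3): sum over all model/data pairs of
    exp(-d^2(Tr(P_i), Q_j) / sigma^2). params = [rx, ry, rz, tx, ty, tz]
    (rotation vector and translation) is an assumed parameterization of Tr."""
    R = Rotation.from_rotvec(params[:3]).as_matrix()
    TM = M @ R.T + params[3:]                               # Tr(P_i) = R P_i + t
    d2 = ((TM[:, None, :] - D[None, :, :]) ** 2).sum(-1)    # squared pairwise distances
    return np.exp(-d2 / sigma ** 2).sum()

def register(M, D, sigma, x0=None):
    """Maximize E_sigma by minimizing its negative with BFGS (numerical gradients)."""
    x0 = np.zeros(6) if x0 is None else x0
    res = minimize(lambda x: -gaussian_criterion(x, M, D, sigma), x0, method="BFGS")
    return res.x
```

A larger σ widens the basin such an optimizer has to work with, at the cost of localization accuracy, which is exactly the tradeoff discussed above.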
III. FAST GAUSS TRANSFORM METHODS
The registration criterion is essentially a mixture of N_D Gaussians evaluated at N_M points and then summed. The cost of direct evaluation is O(N_M × N_D), which for large datasets is computationally expensive. Similar limitations are encountered in other computer vision tasks, especially Gaussian kernel density estimation. A numerical method called the Fast Gauss Transform was recently employed in color modeling and tracking applications [4] in order to reduce the computational complexity of Gaussian mixture evaluation to O(N_M + N_D). The method, which belongs to a class of fast evaluation algorithms known as "fast multipole" methods, was first introduced by Greengard and Strain [6][7] and applied to potential field computations. The basic idea is to exploit the fact that all calculations are required only up to a certain accuracy. In this framework the sources and targets of the potential fields are clustered using suitable data structures, and the sums are replaced by smaller summations that are equivalent up to the given level of precision. To evaluate sums of the form

$$S(t_i) = \sum_{j=1}^{N} f_j \exp\!\left(-\left(\frac{s_j - t_i}{\sigma}\right)^{2}\right), \qquad i = 1, \ldots, M,$$

where {s_j}_{j=1,...,N} are the centers of the Gaussians, known as sources, and {t_i}_{i=1,...,M} are the targets, the following shifting identity and expansion in terms of Hermite series are used:
$$\exp\!\left(-\frac{(t-s)^2}{\sigma^2}\right) = \exp\!\left(-\frac{\big(t - s_0 - (s - s_0)\big)^2}{\sigma^2}\right) = \exp\!\left(-\frac{(t-s_0)^2}{\sigma^2}\right) \sum_{n=0}^{\infty} \frac{1}{n!} \left(\frac{s - s_0}{\sigma}\right)^{n} H_n\!\left(\frac{t - s_0}{\sigma}\right) \qquad (4)$$
where H_n are the Hermite polynomials. Given that this series converges rapidly and that only a few terms are needed for a given precision, the expression can be used to replace several sources by a single cluster center s_0 with linear cost at the desired precision; these clustered sources can then be evaluated at the targets. For a large number of targets, the Taylor series (5) can similarly be used to group targets together at a cluster center t_0, further reducing the number of computations:
$$\exp\!\left(-\left(\frac{t-s}{\sigma}\right)^{2}\right) = \exp\!\left(-\frac{\big(t - t_0 - (s - t_0)\big)^2}{\sigma^2}\right) \approx \sum_{n=0}^{p} \frac{1}{n!}\, h_n\!\left(\frac{s - t_0}{\sigma}\right) \left(\frac{t - t_0}{\sigma}\right)^{n} \qquad (5)$$
where the Hermite functions h_n(t) are defined by h_n(t) = e^{−t²} H_n(t). The method was shown to converge asymptotically to linear behavior as the number of sources and targets increases. Implementation details and analysis can be found in [4][7]. The main problem with the original Fast Gauss Transform is the exponential increase of its complexity with the number of dimensions. To address this limitation, a new variant of the method was proposed by Yang et al. [13], in which a data-clustering scheme along with a more intelligent multivariate Taylor expansion is used. Their experiments show gains of orders of magnitude in processing time for large datasets.
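The following one-dimensional sketch (our illustration; the expansion order p, test data, and function name are assumptions of the example) shows the source-clustering side of Eq. (4): the Gaussians of all sources near a cluster center s_0 are collapsed into a single truncated Hermite series whose coefficients are accumulated once, so evaluating the sum at each target no longer loops over the sources.

```python
import numpy as np
from math import factorial
from numpy.polynomial.hermite import hermval   # evaluates sums of physicists' Hermite polynomials

def clustered_gauss_sum(sources, weights, targets, s0, sigma, p=10):
    """Approximate S(t) = sum_j w_j exp(-((t - s_j)/sigma)^2) with the order-p
    Hermite expansion of Eq. (4) about the cluster center s0."""
    v = (sources - s0) / sigma
    # series coefficients c_n = (1/n!) * sum_j w_j * v_j^n, accumulated once over the sources
    c = np.array([np.sum(weights * v ** n) / factorial(n) for n in range(p)])
    u = (targets - s0) / sigma
    # sum_n c_n H_n(u), multiplied by exp(-u^2) so the terms are h_n(u) = exp(-u^2) H_n(u)
    return np.exp(-u ** 2) * hermval(u, c)

rng = np.random.default_rng(0)
src = 0.5 + 0.02 * rng.standard_normal(200)     # sources tightly clustered around s0 = 0.5
w = rng.random(200)
tgt = np.linspace(0.0, 1.0, 50)
direct = np.array([np.sum(w * np.exp(-((t - src) / 0.2) ** 2)) for t in tgt])
approx = clustered_gauss_sum(src, w, tgt, s0=0.5, sigma=0.2)
print(np.max(np.abs(direct - approx)))          # truncation error shrinks as the cluster tightens or p grows
```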
IV. ANALYSIS

One of the most important issues in registration is the size of the region of convergence (ROC) of the algorithm. The ROC determines the degree of automation of the technique and was a major focus in our analysis. To better understand the effect of σ on the ROC, we analyzed the so-called basins of convergence of the algorithm for real-world datasets (Fig. 2). The plots of Fig. 2(c) show the relationship between the initial value of the transformation parameters provided to the algorithm and the residual registration error at the end of the process. These basins of convergence were obtained for several values of σ. The resulting plots confirm the tradeoff between a large basin of convergence, obtained for a large value of σ but associated with a large residual error, and a smaller basin of convergence, obtained for lower values of σ, that comes with better registration accuracy. We note that the width of the basins grows fast at first but does not increase much beyond a certain value of the force-range parameter, which was already deduced from the profiles of the criterion. Also, the width of these basins is significantly larger than the value of σ. When the Gaussian Fields basins are compared with those of the point-based ICP algorithm (Fig. 2(c)), we notice that they are wider even for low values of σ. This is to be expected, since ICP is a close-range, locally convergent scheme. On the other hand, ICP has a smaller residual error except when compared with the Gaussian Fields algorithm tuned for close range. A balance between residual error and ROC size can clearly be achieved by an adaptive optimization strategy. We illustrate the improvement that our algorithm brings to registration as compared to ICP by using a uniform distribution of initial translations (along the x-axis), in the same way that we obtained the basins of convergence. We then compute the Mean Squared Error (MSE) over the different initializations. The results are obtained for several values of σ set as a fraction of the size of the datasets (ranging from 1% to 30%). The plots of Fig. 2(d) show the variation of the MSE criterion with respect to σ as compared to the MSE of the ICP algorithm. We show that there is a point at which the Gaussian method outperforms the ICP algorithm. We also clearly see that a minimum of the MSE is obtained with respect to σ, corresponding to an optimal behavior balancing the bias and variance constraints. The threshold below which the Gaussian method is better than ICP, as well as the optimal σ, are inevitably dependent on the datasets, but a typical range that ensures near-optimal behavior can be determined. Similar results were obtained for several other datasets.
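As a sketch of how such a basin-of-convergence curve can be produced (our illustration, reusing the hypothetical register() helper from the Section II example; the sweep range, step count, and the use of the translation error as the residual are assumptions), one can run the optimizer from a grid of initial x-translations and record the remaining error for each start.

```python
import numpy as np

def basin_of_convergence(register_fn, M, D, sigma, extent, n=25):
    """Sweep initial x-translations over [-extent, extent], register from each
    start, and return the residual translation error per initialization
    (assuming the ground-truth alignment is the identity)."""
    inits = np.linspace(-extent, extent, n)
    residuals = []
    for tx in inits:
        x0 = np.array([0.0, 0.0, 0.0, tx, 0.0, 0.0])   # same 6-vector layout as register()
        params = register_fn(M, D, sigma, x0)
        residuals.append(np.linalg.norm(params[3:]))
    return inits, np.array(residuals)

# Mean squared residual over the initializations, as used for the comparison with ICP:
# inits, res = basin_of_convergence(register, M, D, sigma=0.1, extent=0.5)
# mse = np.mean(res ** 2)
```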
V. CONCLUSION
In this paper we introduced a new registration criterion, based on the integration of a Gaussian potential field linking the datasets to be aligned and resulting in a highly automatic registration framework. The method enhances the automation of 3D modeling systems by overcoming the current need for a two-stage pipeline, in which initialization using manual or automatic feature extraction and matching is required before using ICP for refinement. By employing a unified framework based on the Gaussian energy function, we can start from initial transformations far from the aligning parameters and still converge accurately to the pose parameters. Furthermore, the method can be implemented with linear computational complexity using the recent numerical techniques known as Fast Gauss Transform methods. Analysis performed on real-world noisy datasets illustrates the behavior of the method, comparing it to standard techniques and demonstrating its applicability to real-world data. While in this paper we focused mainly on 3D rigid registration, extending the Gaussian Fields approach to the non-rigid registration case is also possible; this is a task that we are currently investigating.

The first author acknowledges the generous support of Philips Research Labs Eindhoven.
REFERENCES
[1] P. J. Besl and N. D. McKay, "A Method for Registration of 3-D Shapes," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 2, pp. 239-256, 1992.
[2] R. Campbell and P. Flynn, "A Survey of Free-form Object Representation and Recognition Techniques," Computer Vision and Image Understanding, vol. 81, no. 2, pp. 166-210, 2001.
[3] G. Charpiat, O. Faugeras, and R. Keriven, "Shape Metrics, Warping and Statistics," in Proc. International Conference on Image Processing, vol. 2, pp. 627-630, Barcelona, 2003.
[4] A. Elgammal, R. Duraiswami, and L. Davis, "Efficient Kernel Density Estimation using the Fast Gauss Transform with Applications to Color Modeling and Tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 11, pp. 1499-1504, 2003.
[5] A. W. Fitzgibbon, "Robust Registration of 2D and 3D Point Sets," Image and Vision Computing, vol. 21, pp. 1145-1153, 2003.
[6] L. Greengard, The Rapid Evaluation of Potential Fields in Particle Systems, MIT Press, Cambridge, Massachusetts, 1988.
[7] L. Greengard and J. Strain, "The Fast Gauss Transform," SIAM Journal on Scientific and Statistical Computing, vol. 12, pp. 79-94, 1991.
[8] B. Horn, "Closed-form Solution of Absolute Orientation using Unit Quaternions," Journal of the Optical Society of America A, vol. 4, no. 4, pp. 629-642, 1987.
[9] D. A. Murio, The Mollification Method and the Numerical Solution of Ill-Posed Problems, J. Wiley & Sons, New York, 1993.
[10] S. Rusinkiewicz and M. Levoy, "Efficient Variants of the ICP Algorithm," in Proc. 3D Digital Imaging and Modeling, IEEE Computer Society Press, pp. 145-152, 2001.
[11] F. A. Sadjadi and E. L. Hall, "Three-Dimensional Moment Invariants," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 2, no. 2, pp. 127-136, 1980.
[12] G. C. Sharp, S. W. Lee, and D. K. Wehe, "ICP Registration using Invariant Features," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 90-102, 2002.
[13] C. Yang, R. Duraiswami, N. A. Gumerov, and L. Davis, "Improved Fast Gauss Transform and Efficient Kernel Density Estimation," in Proc. Ninth International Conference on Computer Vision, pp. 464-471, Nice, France, October 2003.
Figure 2. Sample 3D datasets used in our experimental analysis and evaluation of the Gaussian Fields registration method (a: multiple parts and objects, b: a building). For each dataset we show an image of the scene as well as the two 3D views, shown in unregistered (middle picture) and registered (right) positions. One of the main objectives of the analysis is the comparison of the basin of convergence of our technique with that of ICP. In (c) we show the basins of convergence for several values of σ compared to ICP, obtained for the building dataset (initial translation and error are expressed as a fraction of the size of the scene). We also performed a Mean Squared Error (MSE) analysis with respect to initialization and for different values of σ: in (d) we show a comparison of the MSE of the Gaussian method with that of the ICP algorithm for the 'Parts' dataset of (a).