Application of Dixon resultant to maximization of the likelihood function of Gaussian mixture distribution

Robert Lewis, Fordham University, New York (USA)
Béla Paláncz, Budapest University of Technology and Economics, 1521 Budapest (Hungary)
Joseph Awange, Curtin University (Australia)
[email protected] Abstract In this presentation a new robust technique employing expectation maximization to separate outliers (corrupted data points) from inliers (true data points) iteratively, represented by different Gaussian distributions is introduced. Since in every iteration step, a new parameter estimation should be carried out, it is important to solve this parameter estimation as fast as possible. To do that, the problem of numerical global maximization of the likelihood function of the Gaussian mixture was transformed into the solution of a multivariate polynomial system. The symbolic solution of the resulting polynomial system consisting of four equations is quite challenging due to the high number of parameters. In order to solve it, a linear transformation was employed to reduce the number of the equations as well as the total degrees of the polynomials. This reduced system has been solved successfully via Dixon resultant accelerated by the Early Discovery of Factors method implemented in Fermat. The symbolic result was verified via numerical Groebner basis and the suggested robust technique was compared with other robust methods such as Danish and Random Sample Consensus methods based on the data set from a real laser scanning experiment.
Keywords: Robust parameter estimation, expectation maximization, maximum likelihood method, numerical Groebner basis, Dixon resultant
1 Introduction
Surface reconstruction from point clouds generated by laser scanning technology is encountered, e.g., in robotics, computer vision, digital photogrammetry, computational geometry, digital building modeling, forest planning and operational activities, etc. Point clouds, however, are limited by the fact that occlusions, multiple reflectance, and off-surface points corrupt the data, necessitating robust fitting techniques. Robust estimation techniques employed to eliminate outliers include the Danish and the RANdom SAmple Consensus (RANSAC) methods, which eliminate outliers using noise thresholds. For example, Huang and Tseng (2008) employed the Danish robust estimation method using Total Least Squares (TLS) as well as Principal Component Analysis (PCA). Other outlier elimination methods include, e.g., those based on the minimum covariance determinant method (Rousseeuw and Van Driessen, 1999), the expectation maximization method (Lakaemper and Latecki, 2006), Bayesian techniques (Diebel et al., 2006), a region growing algorithm (Chen and Stamos, 2007), and an improved 3D Hough Transform (Borrmann et al., 2011). All of these methods require the user to individually select one or two parameters of the procedure, which leads to a trial-and-error approach. In this contribution we suggest a parameter-free method based on the Expectation Maximization (EM) technique.
2 Maximum Likelihood Method
First, we carry out a parameter estimation assuming no outliers exist. Generally, to carry out a regression procedure using the maximum likelihood (ML) method, one needs a model M(x, y, z : θ) = 0, an error definition e_{M_i}(x_i, y_i, z_i : θ), and the probability density function of the error, PDF(e_M(x, y, z : θ)). The linear model is

M(x, y, z : θ) = αx + βy + γ − z,    (1)

with parameters θ = (α, β, γ). The error model, corresponding to TLS, is the shortest distance of a point P_i from its perpendicular projection onto the plane,

e_{M_i}(x_i, y_i, z_i : θ) = (z_i − x_i α − y_i β − γ) / √(1 + α² + β²).    (2)
The probability density function of the model errors is taken to be a Gaussian error distribution N(0, σ),

PDF(e_M(x, y, z : θ)) = exp( −(e_M)² / (2σ²) ) / ( √(2π) σ ).    (3)
Considering a set {(x₁, y₁, z₁), (x₂, y₂, z₂), ..., (x_N, y_N, z_N)} of measurement points, the maximum likelihood approach aims at finding the parameter vector θ that maximizes the likelihood of the joint error distribution. Assuming that the measurement errors are independent, one should maximize

L = ∏_{i=1}^{N} exp( −(e_{M_i})² / (2σ²) ) / ( √(2π) σ ).    (4)

In order to work with a sum instead of a product, one can consider the logarithm of Eq. (4),

LogL = Log( ∏_{i=1}^{N} PDF(e_{M_i}) ) = ∑_{i=1}^{N} Log( PDF(e_{M_i}) ).    (5)
If the Gaussian error distribution is considered, the function to be maximized becomes

LogL(α, β, γ) = −N Log( √(2π) σ ) − (1 / (2σ²)) ∑_{i=1}^{N} (z_i − x_i α − y_i β − γ)² / (1 + α² + β²).    (6)
From the necessary conditions for the optimum,

eq₁ = ∂LogL/∂α = 0,   eq₂ = ∂LogL/∂β = 0,   eq₃ = ∂LogL/∂γ = 0.    (7)
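Each derivative in Eq. (7) is a rational function whose denominator never vanishes, so only its numerator matters. As a reconstructed intermediate step (added here for clarity; it follows directly from Eq. (6)), with r_i = z_i − x_i α − y_i β − γ,

∂LogL/∂α = [ (1 + α² + β²) ∑_i x_i r_i + α ∑_i r_i² ] / ( σ² (1 + α² + β²)² ),

and analogously for β and γ. Setting the numerators to zero and expanding,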
one can obtain the following polynomial system:

eq₁ = i − bα + hα − iα² − eβ − 2gαβ + eα²β + iβ² − bαβ² + dαβ² − eβ³ − aγ − 2fαγ + aα²γ + 2cαβγ − aβ²γ + Nαγ²,

eq₂ = g − eα + gα² − eα³ − dβ + hβ − 2iαβ + bα²β − dα²β − gβ² + eαβ² − cγ − cα²γ − 2fβγ + 2aαβγ + cβ²γ + Nβγ²,

eq₃ = f − aα − cβ − Nγ,    (8)
where the constants (a, b, c, d, e, f, g, h, i) depend on the measured values (x_i, y_i, z_i), i = 1, 2, ..., N: they are the moment sums a = ∑x_i, b = ∑x_i², c = ∑y_i, d = ∑y_i², e = ∑x_i y_i, f = ∑z_i, g = ∑y_i z_i, h = ∑z_i², i = ∑x_i z_i. The solutions of this polynomial system are the possible optima of Eq. (6). The system can be solved, for example, using the Sylvester resultant, since the last expression of Eq. (8) is linear. The solution yields univariate polynomials of seventh degree for α and β. To illustrate the situation, consider Fig. 1, where inliers (blue points) and outliers (red points) are treated together as acceptable data points. The model error can be computed with the known parameters (α, β, γ), see Fig. 2. This distribution has a "tail" on the right-hand side, indicating that the histogram represents a mixture of two distributions.
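To make the construction concrete, the following minimal sketch (an illustration, not the authors' implementation; the function name fit_plane_ml and the use of SymPy are our assumptions) builds the moment constants from a point cloud, eliminates γ through the linear eq₃, and reduces eq₁ and eq₂ to a univariate polynomial via a Sylvester resultant:

import numpy as np
import sympy as sp

def fit_plane_ml(pts):
    # pts: (N, 3) NumPy array of measured points (x_i, y_i, z_i)
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    N = len(pts)
    # moment constants of Eq. (8)
    a, b, c, d = x.sum(), (x * x).sum(), y.sum(), (y * y).sum()
    e, f, g = (x * y).sum(), z.sum(), (y * z).sum()
    h, i_ = (z * z).sum(), (x * z).sum()
    al, be, ga = sp.symbols('alpha beta gamma')
    # sum of squared plane residuals, expressed through the moments
    r2 = (h + b * al**2 + d * be**2 + N * ga**2 - 2 * i_ * al - 2 * g * be
          - 2 * f * ga + 2 * e * al * be + 2 * a * al * ga + 2 * c * be * ga)
    w2 = 1 + al**2 + be**2
    eq1 = w2 * (i_ - b * al - e * be - a * ga) + al * r2
    eq2 = w2 * (g - e * al - d * be - c * ga) + be * r2
    ga_sol = (f - a * al - c * be) / N           # eq3 is linear in gamma
    p1 = sp.expand(eq1.subs(ga, ga_sol) * N**2)
    p2 = sp.expand(eq2.subs(ga, ga_sol) * N**2)
    res = sp.resultant(p1, p2, be)               # Sylvester resultant, eliminates beta
    best = None
    for ra in sp.Poly(res, al).nroots():
        if abs(sp.im(ra)) > 1e-8:
            continue                             # keep real roots only
        ra = float(sp.re(ra))
        for rb in sp.Poly(p1.subs(al, ra), be).nroots():
            if abs(sp.im(rb)) > 1e-8:
                continue
            rb = float(sp.re(rb))
            rg = float(ga_sol.subs({al: ra, be: rb}))
            s = float(r2.subs({al: ra, be: rb, ga: rg})) / (1 + ra**2 + rb**2)
            if best is None or s < best[0]:      # global minimum of the TLS sum
                best = (s, ra, rb, rg)
    return best[1:]                              # (alpha, beta, gamma)

Among the candidate critical points, the one minimizing the TLS sum of squares is kept, which corresponds to the global maximum of Eq. (6). Floating-point resultants can be numerically delicate; the sketch is meant only to mirror the elimination described above.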
3 Expectation Maximization
The components of this mixture can be separated by the EM algorithm. Assuming two Gaussian distributions, this method provides not only the means and standard deviations {µ1, σ1}, {µ2, σ2}, but also the membership functions {η1, η2}. Consequently, the data belonging to the two different distributions can be identified (see Fig. 3). We can see that some sample elements are misclassified. Using these clusters, let us compute new parameters of the plane (α, β, γ). Since the inliers as well as the outliers are now known, and the parameters of their approximating Gaussians have been computed, i.e., ({µ1, σ1}, {µ2, σ2}, {η1, η2}), we again employ the Maximum Likelihood method, but now for a Gaussian mixture.
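The EM step itself is standard; the following hand-rolled one-dimensional sketch (again illustrative rather than the authors' code; the name em_two_gaussians is our invention) shows the responsibilities-based separation applied to the plane-fit residuals:

import numpy as np

def em_two_gaussians(r, iters=200, tol=1e-10):
    # r: 1-D array of residuals; fits a two-component Gaussian mixture
    mu = np.array([r.min(), r.max()])            # crude initialization
    sig = np.array([r.std(), r.std()])
    eta = np.array([0.5, 0.5])                   # mixing weights
    for _ in range(iters):
        # E-step: responsibility of each component for each residual
        pdf = (eta / (np.sqrt(2 * np.pi) * sig)
               * np.exp(-0.5 * ((r[:, None] - mu) / sig)**2))
        resp = pdf / pdf.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, standard deviations
        nk = resp.sum(axis=0)
        mu_new = (resp * r[:, None]).sum(axis=0) / nk
        sig = np.sqrt((resp * (r[:, None] - mu_new)**2).sum(axis=0) / nk)
        eta = nk / len(r)
        converged = np.abs(mu_new - mu).max() < tol
        mu = mu_new
        if converged:
            break
    labels = resp.argmax(axis=1)                 # hard membership assignment
    return mu, sig, eta, labels

The hard labels (argmax of the responsibilities) give the provisional inlier/outlier split; the component with the smaller standard deviation can be taken as the inliers.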
Figure 1: The black plane is the zero approximation of the true (blue) plane
Figure 2: The histogram of the error distribution resulting from the zero approximation
Figure 3: The histograms and the data points of the error distributions resulting from the zero approximation
3.1 Maximum Likelihood Method for Gaussian mixture
The likelihood function for the Gaussian mixture is

LogL(x_i, θ) = ∑_{i∈N₁} Log( N(µ1, σ1, x_i) ) + ∑_{i∈N₂} Log( N(µ2, σ2, x_i) ) + N₁ Log(η1) + N₂ Log(η2),

where index "1" refers to the first component and index "2" to the second. The corresponding polynomial form of the maximization problem can be developed similarly to the single-component case. Writing W = 1 + α² + β² for brevity, the three necessary conditions become

σ2² A₁ + σ1² A₂ = 0,    (9)

σ2² B₁ + σ1² B₂ = 0,    (10)

σ2² ( f₁ − a₁α − c₁β − N₁γ − N₁µ1 √W ) + σ1² ( f₂ − a₂α − c₂β − N₂γ − N₂µ2 √W ) = 0,    (11)

with, for k = 1, 2,

A_k = N_k αγ² + N_k µ_k αγ √W + a_k (2α²γ − Wγ) + b_k (α³ − αW) + e_k (2α²β − βW) + i_k (W − 2α²) + a_k µ_k (α² √W − W^{3/2}) + 2c_k αβγ + d_k αβ² − 2f_k αγ + h_k α − 2g_k αβ + c_k µ_k αβ √W − f_k µ_k α √W,

B_k = N_k βγ² + N_k µ_k βγ √W + c_k (2β²γ − Wγ) + d_k (β³ − βW) + e_k (2αβ² − αW) + g_k (W − 2β²) + c_k µ_k (β² √W − W^{3/2}) + 2a_k αβγ + b_k α²β − 2f_k βγ + h_k β − 2i_k αβ + a_k µ_k αβ √W − f_k µ_k β √W,
where the unknowns are the parameters of the plane to be fitted (α, β and γ), while the remaining quantities are known constants: a_k, ..., i_k are the moment sums of the points in cluster k, and N_k, µ_k, σ_k come from the EM step. Solving this system with a numerical Groebner basis is feasible, but solving it symbolically is very difficult. We tried the Sylvester resultant, the improved Dixon resultant developed by Kapur, Saxena and Yang, as well as the Groebner basis implementations in Mathematica, Maple, Magma and Singular, but all of these experiments failed: the computations ran past a realistic time limit, normally one hour. However, Dixon's method implemented with the Early Discovery of Factors (EDF) heuristic was successful.
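For the numerical route, one does not even need the expanded polynomials: the mixture log-likelihood can be differentiated directly and its gradient handed to a root finder. The sketch below is an illustration under assumptions: refit_mixture is a hypothetical helper, and SymPy's nsolve is a local Newton-type solver standing in for the numerical Groebner basis used in the paper (it finds one root near the starting guess rather than all roots), so it is practical only with a good initial estimate and small samples:

import sympy as sp

def refit_mixture(pts, labels, mu, sig, theta0):
    # pts: (N, 3) NumPy array; labels: 0/1 cluster index from EM;
    # mu, sig: per-cluster Gaussian parameters; theta0: starting (alpha, beta, gamma)
    al, be, ga = sp.symbols('alpha beta gamma')
    w = sp.sqrt(1 + al**2 + be**2)
    logL = 0
    for k in (0, 1):
        for (xi, yi, zi) in pts[labels == k]:
            e = (zi - xi * al - yi * be - ga) / w   # TLS residual, Eq. (2)
            logL += -(e - mu[k])**2 / (2 * sig[k]**2)
    grad = [sp.diff(logL, v) for v in (al, be, ga)]  # conditions (9)-(11)
    sol = sp.nsolve(grad, (al, be, ga), theta0)
    return [float(s) for s in sol]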
4 Separation of inliers and outliers via iteration
In order to eliminate (that is, to separate) the outliers, we repeat the EM and ML steps until no changes occur in the two sets. Fig. 4 shows four phases of this iteration process, and Tables 1 and 2 illustrate its numerical progress.
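Tying the pieces together, the whole procedure might look like the following sketch (again illustrative: fit_plane_ml, em_two_gaussians and refit_mixture are the hypothetical helpers sketched earlier):

import numpy as np

def robust_fit(pts, max_iter=20):
    theta = fit_plane_ml(pts)                    # step 0: ML fit assuming no outliers
    labels = None
    for _ in range(max_iter):
        al, be, ga = theta
        r = (pts[:, 2] - pts[:, 0] * al - pts[:, 1] * be - ga) \
            / np.sqrt(1 + al**2 + be**2)         # TLS residuals, Eq. (2)
        mu, sig, eta, new = em_two_gaussians(r)  # EM separation of the residuals
        if labels is not None and np.array_equal(new, labels):
            break                                # memberships stable: done
        labels = new
        theta = refit_mixture(pts, labels, mu, sig, theta)  # mixture ML step
    return theta, labels                         # plane parameters and final split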
Figure 4: Separation of the inliers (blue) and outliers (red) in the subsequent iteration steps: step 1 (top left), step 2 (top right), step 3 (bottom left), and step 4 (bottom right)

Table 1: Parameters of the two-component Gaussian during the processing phase
Step    µ1         σ1        µ2          σ2          N1      N2
1       0.439112   1.05598   -0.101072   0.490322     861    10 639
2       0.936942   1.10364   -0.147129   0.444075     910    10 590
3       1.32548    1.07268   -0.188991   0.395388    1081    10 419
4       1.66079    1.05214   -0.22366    0.342297    1254    10 246
5       2.05269    1.06531   -0.252983   0.27383     1413    10 087
6       2.74818    1.042     -0.272046   0.178616    1493    10 007
7       3.54184    1.01137   -0.276444   0.103428    1499    10 001
8       3.72236    1.01355   -0.276785   0.098946    1500    10 000
9       3.73615    1.01359   -0.276808   0.0989182   1500    10 000
10      3.73694    1.01359   -0.276813   0.0989181   1500    10 000
Table 2: Model parameters during the processing phase
Step    α             β              γ
1       0.170587      0.0996839      5.00224
2       0.151338      0.0825265      5.22465
3       0.130191      0.0675083      5.42374
4       0.10248       0.0504112      5.64439
5       0.0597415     0.0282492      5.92556
6       0.012018      0.00547826     6.20785
7       0.00104897    -0.000217415   6.27581
8       0.000212315   -0.000643044   6.28092
9       0.000164903   -0.000668525   6.28123
10      0.000162183   -0.000669987   6.28125
5 Conclusions
This study has presented an algebraic technique for carrying out robust estimation via maximization of the likelihood function of a Gaussian mixture representing the mixed distribution of inlier and outlier data points. The separation of the two distributions was carried out by the EM algorithm. An iteration process has been developed in which every step requires the solution of a multivariate polynomial system. This system can be solved using a numerical Groebner basis or, alternatively, by Dixon EDF implemented in Fermat. The method has been illustrated on a synthetic data set; however, it was also successful for real laser scanner measurements. The suggested technique was compared with other robust methods, such as the Danish and Random Sample Consensus methods, on the data set of a real laser scanning experiment, and the results were very promising.
6 References
Borrmann D, Elseberg J, Lingemann K, Nüchter A (2011) The 3D Hough Transform for plane detection in point clouds: a review and a new accumulator design. 3D Res. 02, 02003, 3DR Express.

Chen CC, Stamos I (2007) Range image segmentation for modeling and object detection in urban scenes. 3DIM 2007.

Diebel JR, Thrun S, Brünig M (2006) A Bayesian method for probable surface reconstruction and decimation. ACM Transactions on Graphics (TOG).

Huang C-M, Tseng Y-H (2008) Plane fitting methods of Lidar point cloud. Dept. of Geomatics, National Cheng Kung University, Taiwan.

Lakaemper R, Latecki LJ (2006) Extended EM for planar approximation of 3D data. IEEE Int. Conf. on Robotics and Automation (ICRA), Orlando, Florida, May 2006.

Rousseeuw PJ, Van Driessen K (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41(3):212-223.