An Efficient Planar Feature Fitting Method Using Point Cloud Simplification and Threshold-Independent BaySAC

Zhizhong Kang, Ruofei Zhong, Ai Wu, Zhenwei Shi, and Zhongfei Luo

Abstract— Three-dimensional laser scanning can acquire point cloud data with high spatial resolution. However, for practical applications, such as point cloud fitting and 3-D reconstruction, there is usually significant data redundancy, which reduces the operational efficiency. In this letter, we propose a fast point cloud fitting algorithm that uses point cloud simplification to preserve feature boundaries and threshold-independent Bayesian sampling consensus (BaySAC) to fit planar features. We first extract the point features, such as corner points and contour points, using a smoothness analysis of the vicinities of scattered points and an angle analysis of vectors based on search points and their adjacent points. Then, keeping all the feature points, we thin the nonfeature points by constructing a cube grid. Finally, based on the least median squares and the BaySAC algorithm, we propose a robust, non-threshold-dependent method to perform the rapid fitting of planar features in the point cloud after thinning. We used three sets of point cloud data acquired using a 3-D laser scanner to verify the accuracy and efficiency of the planar feature fitting method. The experimental results indicate that the method can extract finer planar features and has significantly better accuracy and computational efficiency than the classical random sample consensus algorithm for the fitting of planar features without using a threshold.

Index Terms— Bayesian sampling consensus (BaySAC), least median squares (LMedS), point cloud fitting, point cloud simplification, random sample consensus (RANSAC).
Manuscript received August 2, 2016; revised September 18, 2016; accepted September 22, 2016. This work was supported in part by the Natural Science Foundation of China under Grant 41471360 and in part by the Fundamental Research Funds for the Central Universities under Grant 2652015176.

Z. Kang, A. Wu, Z. Shi, and Z. Luo are with the Department of Remote Sensing and Geo-Information Engineering, School of Land Science and Technology, China University of Geosciences, Beijing 100083, China (e-mail: [email protected]; [email protected]; [email protected]; [email protected]).

R. Zhong is with the State Key Laboratory Incubation Base of Urban Environmental Processes and Digital Simulation, Capital Normal University, Beijing 100048, China (e-mail: [email protected]).

Color versions of one or more of the figures in this letter are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/LGRS.2016.2614749

I. INTRODUCTION

Primitive fitting mechanisms serve as a central component of remote sensing applications, such as 3-D modeling and as-built surveys. Primitive fitting is primarily conducted to estimate model parameters from raw data that are contaminated by outliers. In general, two issues need to be addressed to improve the efficiency and robustness of primitive fitting.

The first issue is the massive volume of the data, which makes the efficient and accurate simplification of the point cloud necessary. Mesh-based simplification and point-based simplification
are commonly used methods. In contrast to point-based simplification methods, mesh-based ones require mesh generation, which is inherently more complex and cumbersome. Therefore, point-based simplification, as a newer research domain, has received increasing attention. Concerning the fitting of primitives, edge points carry more important and distinctive characteristics than nonedge points, so these special points should always be retained in the simplification process. Shi et al. [1] extended k-means clustering theory to simplify 3-D points. Yu et al. [2] proposed a so-called adaptive simplification method that preserves the geometric characteristics of the original point-based models using a hierarchical cluster tree, simplification criteria, and a local clustering approach. Sharp edge points were extracted and preserved in a simplification process that measures the importance of each point in the point cloud with a constructed nonnegative function and iteratively deletes the least significant points [3].

The second issue is the robust estimation of primitive parameters, and the majority of existing primitive-fitting techniques focus on this issue. Random sample consensus (RANSAC) [4] is a well-regarded technique for the segmentation and robust model fitting of laser scanning data because it has proven capable of handling data in which outliers account for more than 50% of the points. Schnabel et al. [6] improved the efficiency of RANSAC through local point selection and the incorporation of a simplified score function. Torr and Zisserman [7] applied the robust maximum likelihood estimation sample consensus (MLESAC) estimator to identify best-fitting roof models in a model-driven manner, and Torr and Davidson [8] presented IMPSAC, which combines importance sampling and RANSAC in a hierarchical resampling algorithm. Differing from RANSAC, a conditional sampling method, Bayesian sampling consensus (BaySAC) [9], was proposed to always select the minimum number of required data points with the highest inlier probabilities as the hypothesis set, thereby reducing the number of iterations needed to find a good model. However, [9] acknowledged that degenerate configurations incorrectly assumed to contain outliers could cause the sampling strategy to fail. To improve the robustness and applicability of the original BaySAC method, Kang et al. [10] optimized the BaySAC algorithm by developing a model-free algorithm for the statistical testing of candidate model parameters to compute the prior probability of each data point. However, BaySAC and other basic sampling algorithms assess the quality of the hypothesis model by counting the number of points that support the current hypothesis. The inliers are determined in terms of their point-to-model distances and
a user-defined threshold, which results in a threshold-dependent verification process.

To improve the efficiency of planar feature fitting, we propose a fast planar feature fitting algorithm that integrates point cloud simplification, which preserves the feature boundaries, with threshold-independent BaySAC. The method comprises two parts: a smoothness analysis of the neighborhood of each discrete point together with an analysis of the angles between vectors formed by a candidate point and its adjacent points, which simplifies the point cloud while retaining the feature points; and the application of the least median squares (LMedS) method, which yields a BaySAC algorithm that is threshold independent and allows the robust fitting of planar features. Fig. 1 shows a flowchart of the proposed method. The algorithm proceeds as follows.

1) Simplify the original point cloud.
2) Determine the prior inlier probability of each point in the simplified point cloud using the statistical testing process.
3) Select the n data points with the highest inlier probabilities as the hypothesis set.
4) Compute the primitive parameters corresponding to the n chosen data points.
5) Evaluate all data points with respect to the primitive parameters and determine a good consensus set using the LMedS-based cost function.
6) Update the inlier probability of each data point in the hypothesis set using Bayes' rule.
7) Repeat steps 3)–6) with the new inlier probabilities until the sampling number reaches T.
8) Select the primitive parameters with the smallest Med deviation as the optimum model.
9) Using all inlier points, compute the optimal model parameters through least-squares adjustment.

Fig. 1. Flowchart of the proposed method.

II. POINT CLOUD SIMPLIFICATION PRESERVING FEATURE POINTS

The critical step in the preprocessing of point cloud data is the simplification of the scanned data while ensuring accuracy. We begin with discrete 3-D laser scanned data and derive the feature points of the point clouds through an analysis of point cloud smoothness and boundary features. We then thin the nonfeature points to achieve point cloud simplification while retaining the feature points. Because the feature points are important for further processing, as many as possible are kept; although a few false feature points may be retained, their influence on the fitting process can be ignored. We first organize the scattered point clouds in a kd-tree data structure to enable an efficient neighbor search.
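As a concrete illustration of this preprocessing step, the following Python sketch organizes a point cloud in a kd-tree and queries the k nearest neighbors of each point; the use of scipy, the array name points, and the function name are assumptions made for illustration (the letter does not prescribe a library), while k = 30 corresponds to the setting reported in Section IV.

```python
import numpy as np
from scipy.spatial import cKDTree

def build_neighborhoods(points, k=30):
    """Organize an (N, 3) point array in a kd-tree and return, for each point,
    the indices of its k nearest neighbors (excluding the point itself) and the
    average point interval, which is later used to size the thinning voxels."""
    tree = cKDTree(points)
    # Query k + 1 neighbors because the closest "neighbor" is the point itself.
    distances, indices = tree.query(points, k=k + 1)
    avg_point_interval = float(np.mean(distances[:, 1]))  # mean nearest-neighbor spacing
    return tree, indices[:, 1:], avg_point_interval
```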
A. Extraction of Feature Points

Feature points can be divided into contour points and corner points. Contour points are data points located at the edge of a feature, e.g., a planar primitive, and corner points are data points located at the intersection of multiple features. To determine whether a point is a feature point, we adopt a new approach that extracts feature points from the point cloud using a smoothness analysis of the area adjacent to a point of interest and an angle analysis of the vectors formed by a candidate point and its neighbor points.

1) Extraction of Corner Points: In point clouds, the neighborhood of a corner point does not possess the characteristics of a plane or a smooth curved surface [11], and therefore we can use a smoothness analysis of the neighborhoods of scattered points to extract the corner points from the point cloud. To calculate the smoothness in the neighborhood of a point, we first compute the covariance matrix C of the coordinates of the points within its vicinity, and suppose that λ1 > λ2 > λ3 are the eigenvalues of C. The covariance matrix of the coordinates of the adjacent point set is constructed as follows [12]:

C = \begin{bmatrix} \mathrm{cov}(x,x) & \mathrm{cov}(x,y) & \mathrm{cov}(x,z) \\ \mathrm{cov}(y,x) & \mathrm{cov}(y,y) & \mathrm{cov}(y,z) \\ \mathrm{cov}(z,x) & \mathrm{cov}(z,y) & \mathrm{cov}(z,z) \end{bmatrix}    (1)

where

\mathrm{cov}(A,B) = \frac{1}{n-1} \sum_{i=1}^{n} (A_i - \bar{A})(B_i - \bar{B}).    (2)
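To make the corner-point test concrete, the following Python sketch builds the neighborhood covariance matrix of (1)–(2), takes its eigenvalues, and evaluates the smoothness index η = (λ2 − λ3)/λ2 defined in the next paragraph; the threshold value of 0.5 is the experimentally determined setting mentioned below, and the function and variable names are illustrative assumptions.

```python
import numpy as np

def is_corner_point(neighbor_coords, eta_threshold=0.5):
    """Classify a point as a corner point from the coordinates of its
    neighborhood (an (n, 3) array), following (1)-(2) and the smoothness
    index eta = (lambda2 - lambda3) / lambda2."""
    # Covariance matrix of the neighborhood coordinates, cf. (1)-(2).
    cov = np.cov(neighbor_coords, rowvar=False)           # 3 x 3 matrix
    eigenvalues = np.sort(np.linalg.eigvalsh(cov))[::-1]  # lambda1 >= lambda2 >= lambda3
    lam1, lam2, lam3 = eigenvalues
    if lam2 <= 0.0:                                        # degenerate neighborhood
        return False, 0.0
    eta = (lam2 - lam3) / lam2
    # A small eta indicates a neighborhood that is neither planar nor smoothly
    # curved, i.e., the candidate is likely a corner point.
    return eta < eta_threshold, eta
```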
The evaluation index of the point cloud smoothness is defined as η = (λ2 − λ3)/λ2, η ∈ (0, 1). The smoothness of the area adjacent to a corner point is small, whereas the smoothness of adjacent points that fit well to a smooth curved surface or a planar primitive is approximately 1. To determine whether a discrete point is a corner point, we calculate the smoothness of its adjacent points. If the smoothness is less than a given threshold (0.5, determined experimentally), the point is a corner point; otherwise, we test whether it is a contour point.

2) Extraction of Contour Points: In point clouds, the relationship between a point and its adjacent points can reflect whether it is a boundary feature point of a plane. We propose a boundary point extraction algorithm that uses a moving window. Fig. 2 shows a circular window with a predefined radius centered at the point of interest p. All points within the window are considered the neighboring points of point p and are projected onto the local fitted plane. The kd-tree data structure is used to represent the data, and the nearest neighbor point set of point p is determined based on the Euclidean distance. The vector u, i.e., the direction of the u-axis, is drawn from point p to point p1, which is the farthest of the adjacent points from point p. The direction of the v-axis is given by v = w × u, where w is the normal vector of the local plane fitted from the points within the circular window. The polar angles of the neighboring points are computed relative to point p (e.g., αi). We then calculate the differences between consecutive polar angles. If point p is a boundary point, the difference αi+1,i between boundary points pi and pi+1 is much larger than the difference αi+2,i+1 between boundary point pi+1 and interior point pi+2. Therefore, once a difference exceeds a self-adaptively defined threshold, point p is labeled as a boundary point. A histogram is employed to detect the highest peak of the distribution of the differences between consecutive polar angles, and the threshold is set to three times ᾱ, which is the average of the differences at the highest peak of the histogram. The histogram bin size is 3 × (360°/k), where k is the number of neighbor points.

Fig. 2. Extraction of boundary points using a moving window.

Because the same neighborhood of a candidate point is used to detect corner and contour points, contour point detection is performed on a candidate point only if it is not detected as a corner point. If the candidate point is neither a corner point nor a contour point, we consider it a nonfeature point.
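A possible implementation of the polar-angle test described above is sketched below in Python. The PCA-based fitting of the local plane, the helper names, and the fallback when the histogram peak is empty are assumptions made for illustration, while the 3ᾱ threshold and the bin size of 3 × (360°/k) follow the description in the text.

```python
import numpy as np

def is_boundary_point(p, neighbor_coords):
    """Decide whether point p (shape (3,)) is a contour/boundary point from its
    neighbors (shape (k, 3)) using gaps between consecutive polar angles."""
    k = len(neighbor_coords)
    # Fit the local plane by PCA: its normal is the direction of least variance.
    centered = neighbor_coords - neighbor_coords.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    w = vt[-1]                                            # normal of the local plane
    # u-axis: from p toward the farthest neighbor, projected into the plane.
    offsets = neighbor_coords - p
    farthest = offsets[np.argmax(np.linalg.norm(offsets, axis=1))]
    u = farthest - np.dot(farthest, w) * w
    u /= np.linalg.norm(u)
    v = np.cross(w, u)                                    # v-axis completes the frame
    # Polar angles of the projected neighbors relative to p, and the gaps
    # between consecutive angles (including the wrap-around gap).
    angles = np.sort(np.arctan2(offsets @ v, offsets @ u))
    gaps = np.degrees(np.diff(np.concatenate([angles, angles[:1] + 2.0 * np.pi])))
    # Self-adaptive threshold: three times the average gap in the histogram's
    # highest-peak bin, with bin size 3 * (360 / k).
    bin_size = 3.0 * (360.0 / k)
    counts, edges = np.histogram(gaps, bins=np.arange(0.0, 360.0 + bin_size, bin_size))
    peak = np.argmax(counts)
    in_peak = (gaps >= edges[peak]) & (gaps < edges[peak + 1])
    alpha_bar = gaps[in_peak].mean() if in_peak.any() else gaps.mean()
    return gaps.max() > 3.0 * alpha_bar
```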
B. Thinning of Nonfeature Points

The number of nonfeature points to be retained is governed by a user-defined simplification ratio. We propose to voxelize the nonfeature points. The dimensions of the voxels are fixed and are determined from the predefined simplification ratio and the average point interval, which is computed while generating the kd-tree from the point cloud. The voxel edges are set parallel to the coordinate axes. The mapping from the nonfeature point coordinates (x, y, z) to the voxel coordinates (i, j, k), which decides to which voxel a point belongs, is

i = \mathrm{int}(x/l + 0.5), \quad j = \mathrm{int}(y/w + 0.5), \quad k = \mathrm{int}(z/h + 0.5)    (3)

where (x, y, z) are the nonfeature point coordinates, (l, w, h) are the dimensions of the voxels, int() is the rounding operation, and (i, j, k) are the voxel coordinates. The calculation is performed for each nonfeature point to determine which voxel contains it. Afterward, for each voxel, only the nonfeature point closest to the center of mass of the voxel is retained as the postsimplification point. The coordinates of the center of mass are the mean values of the coordinates (x, y, z) of the points within the voxel.
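The voxel mapping in (3) and the subsequent per-voxel selection can be sketched as follows in Python; the dictionary-based grouping and the function name are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from collections import defaultdict

def thin_nonfeature_points(points, l, w, h):
    """Thin an (N, 3) array of nonfeature points: map each point to a voxel
    with (3) and keep, per voxel, the point closest to the voxel's center of mass."""
    voxel_ids = np.floor(points / np.array([l, w, h]) + 0.5).astype(int)  # (i, j, k) per point
    groups = defaultdict(list)
    for idx, key in enumerate(map(tuple, voxel_ids)):
        groups[key].append(idx)
    kept = []
    for indices in groups.values():
        voxel_pts = points[indices]
        centroid = voxel_pts.mean(axis=0)            # center of mass of the voxel
        nearest = np.argmin(np.linalg.norm(voxel_pts - centroid, axis=1))
        kept.append(indices[nearest])
    return points[np.sort(kept)]
```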
III. FITTING PLANAR PRIMITIVES USING LMedS-BASED CONDITIONAL SAMPLING

Based on the method of Kang et al. [10], we introduce the LMedS method to construct the cost function of the model hypothesis test, which avoids the selection of an empirical threshold and thus achieves threshold independence. The cost function is the basis for determining whether a data point is an inlier. Different model parameter estimates use different cost functions, such as the distance from a point in the point cloud to the fitted feature, or the distance between corresponding points in point cloud registration. However, to decide whether a data point passes the cost-function test, an empirical threshold must be predefined experimentally; if the threshold is too large or too small, the model consistency test cannot estimate the optimum model parameters. Based on the LMedS algorithm, we therefore propose the BaySAC-LMedS algorithm. The prior probabilities of the data points are estimated using a histogram that dynamically evaluates the convergence of the hypothesis primitives during the hypothesis testing process, by which a hypothesis point set is selected to compute the candidate parameter set. We then use the LMedS method, rather than the distance from the point to the fitted feature, to construct the cost function of the model consistency test. Finally, we select the candidate primitive parameter set with the smallest median of all the squared residuals of the hypothesis tests as the optimum.
A. Determination of the Prior Inlier Probabilities of Data Points

Because the prior inlier probabilities indicate how strongly each point is believed to be an inlier, determining correct prior inlier probabilities is important for the BaySAC algorithm. We implement the statistical testing process proposed in [10], which is generic and can be applied to any BaySAC problem. The process uses a histogram that illustrates the distribution of the discrete hypothesis model parameter sets computed during the different iterations and the degree of convergence of each candidate parameter set, which describes how the other sets converge to it. The degree of convergence of a bin in the histogram is calculated as the number of parameter sets in that bin divided by the total number of parameter sets. When the degree of convergence of a cluster in the distribution of parameter solutions reaches a predefined threshold, the first hypothesis set in that cluster is used to determine the prior inlier probabilities of the data points according to

P_i = \begin{cases} 1 - \dfrac{D_i}{m}, & D_i < m \\ 0, & D_i \ge m \end{cases}    (4)
where P_i denotes the prior probability of point i, D_i is the distance between point i and the fitted primitive, and m represents the predefined threshold for outlier identification, which is set to five times the point precision.
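As a minimal illustration, the prior probabilities of (4) can be computed from point-to-plane distances as in the following Python sketch; the plane representation (unit normal n and offset d) and the function name are assumptions made for illustration.

```python
import numpy as np

def prior_inlier_probabilities(points, normal, d, point_precision):
    """Prior inlier probability of each point per (4), given a candidate plane
    n . x + d = 0 with unit normal and m set to five times the point precision."""
    m = 5.0 * point_precision
    distances = np.abs(points @ normal + d)   # point-to-plane distances D_i
    return np.where(distances < m, 1.0 - distances / m, 0.0)
```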
Fig. 3. Simplification of data set I. (a) Original point cloud. (b) Extracted feature points. (c) Point cloud after simplification.
B. Principle of Probability Updating

After determining the prior inlier probability of each data point, the following equation [10] is used to update the inlier probabilities during consecutive iterations:

P_t(i \in I) = \begin{cases} \dfrac{k}{D}\, P_{t-1}(i \in I), & i \in H_t \\ P_{t-1}(i \in I), & i \notin H_t \end{cases}    (5)

where I is the set of all inliers, H_t is the hypothesis set of n data points used in iteration t of the hypothesis testing process, P_{t-1}(i \in I) and P_t(i \in I) denote the inlier probabilities of data point i during iterations t − 1 and t, respectively, k is the number of points consistent with the model during a test, and D is the total number of data points. The update follows Bayes' rule, P(A|B) = P(B|A)P(A)/P(B) ∝ L(A|B)P(A), with the likelihood approximated as L(A|B) ≈ k/D.

C. Construction of the Cost Function Using LMedS

We apply the LMedS method [13] to construct the cost function for determining the optimum model, which reduces the influence of human factors and improves the robustness of the model estimate. We first randomly select a subset of the samples to calculate the model parameters and then calculate the deviation (i.e., the squared distance from the point to the model) of all the other sample points with respect to the model. The median of all the model deviations, called the Med deviation, is

\mathrm{Med} = \operatorname*{med}_{1 \le i \le N} D_i^2    (6)

where D_i is the distance from point i to the model and N is the number of data points used to test the model parameters. We use the Med deviation, instead of the number of inlier points complying with the hypothesis model, as the criterion to determine the optimum model. After the iterative hypothesis testing process is finished, we select the model with the smallest Med deviation as the optimum model.
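Putting Sections III-A–III-C together, the following Python sketch illustrates one possible form of the BaySAC-LMedS iteration: at each iteration the n most probable points form the hypothesis set, a plane is fitted, its Med deviation (6) is computed, and the probabilities of the hypothesis points are updated with (5). The plane-fitting helper, the use of m = 5 × point precision to count the consistent points k, and all names are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through points: returns (unit normal, offset d)."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]
    return normal, -float(normal @ centroid)

def baysac_lmeds(points, probs, n=3, iterations=100, point_precision=0.005):
    """Sketch of the BaySAC-LMedS loop: hypothesis sets are the n most probable
    points, models are ranked by the Med deviation (6), and the probabilities of
    the hypothesis points are updated with (5)."""
    m = 5.0 * point_precision              # consistency margin used only to count k in (5)
    D = len(points)
    probs = probs.copy()
    best_model, best_med = None, np.inf
    for _ in range(iterations):
        hypothesis = np.argsort(probs)[-n:]          # n highest inlier probabilities
        normal, d = fit_plane(points[hypothesis])
        dist = np.abs(points @ normal + d)
        med = np.median(dist ** 2)                   # Med deviation, cf. (6)
        if med < best_med:
            best_med, best_model = med, (normal, d)
        k = int(np.count_nonzero(dist < m))          # points consistent with the model
        probs[hypothesis] *= k / D                   # Bayes update (5) for hypothesis points
    return best_model, best_med
```

As in step 9) of the flowchart, the inliers of the retained model would then be used in a final least-squares adjustment of the plane parameters.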
IV. EXPERIMENTAL RESULTS AND ANALYSIS

To verify the proposed method, we conducted experiments on two sets of point clouds acquired using a RIEGL Z-620 3-D laser scanner. Data set I was captured in an underground parking garage [Fig. 3(a)], while data set II was collected from a teaching building [Fig. 5(a)]. Data set I contains planar primitives, while data set II includes several concave characters, which appear as fine planar features in space. Although these point clouds seem small, they contain many finely detailed planar features, which we believe is a more important characteristic than a large size for verifying the proposed algorithm.

A. Simplification of the Point Cloud

To evaluate the feasibility of the proposed algorithm, we performed simplification experiments on the discrete point clouds. The two groups (data sets I and II) illustrate cases with different numbers of fitting planes before and after the simplification. Table I compares the simplification results for these two data sets and shows that the average simplification ratio is 59.6%.

TABLE I: Results of Point Cloud Simplification

Taking data set I as an example, Fig. 3(a) shows the original data, consisting of a total of 310 205 points. This data set was simplified with k = 30, ηthr = 0.5, and l = w = h = 0.1 m. Fig. 3(b) shows the 37 636 feature points extracted by the proposed method. Fig. 3(c) shows the result of the point cloud simplification for this data set, which represents an underground parking garage; the resulting point cloud contains a total of 119 089 discrete points.

B. Fitting of Point Clouds

We used the BaySAC algorithm proposed by Kang et al. [10] and the BaySAC-LMedS method for the planar primitive fitting of the two point clouds simplified using the proposed algorithm. The optimized BaySAC algorithm assesses the quality of the hypothesis model by counting the number of points that support the current hypothesis, and the inliers are determined in terms of their point-to-model distances and a user-defined threshold. Therefore, the thresholds for the distance from the laser point to the fitting plane were set to 0.02, 0.04, 0.06, and 0.08 m. The fitting results of data set I are shown in Fig. 4 (the original point clouds are on the left, and the simplified point clouds are on the right), and the different features are represented by surface patches with different colors.

Fig. 4. Comparison of the fitting results of different methods before and after the point cloud simplification of data set I. (a) Threshold of 0.08 (left: original; right: simplified). (b) BaySAC-LMedS method (left: original; right: simplified).

Data set I contains many planar features, so we analyzed the segment highlighted with a red box in Fig. 3(a) before and after the simplification; Fig. 4 shows images of the analyzed region. When the threshold for the original point cloud and the simplified point cloud in the BaySAC fitting method was greater than 0.06
(e.g., 0.08), several small-sized primitives were missed [highlighted with white rectangles in Fig. 4(a)]. Only with the BaySAC-LMedS method was it possible to fully reconstruct the 3-D details of the walls [Fig. 4(b)]. Compared with the results for the original point cloud, finer planar primitives were detected in the simplified point cloud thanks to the proposed simplification algorithm, which preserves the feature points and enhances the prominence of small-sized features after simplification [highlighted with the white rectangle in the right part of Fig. 4(b)]; consequently, the fitting time is not proportional to the number of points in the original and simplified data sets.

Data set II also contains many planar features, so we analyzed the segment highlighted with a red box in Fig. 5(a). In addition to one obvious planar feature, this example contains several detailed character features, which are also planar features in a geometric sense. The fitting of the simplified point cloud was tested. When the threshold for the original point cloud in the BaySAC method was greater than 0.06 (e.g., 0.08), the planar fitting was incorrect [Fig. 5(c)], and the character features were missed. However, the character features were successfully fitted by the BaySAC-LMedS method without setting any threshold [Fig. 5(d)].

Fig. 5. Simplification of data set II. (a) Original point cloud. (b) Point cloud after simplification. (c) Threshold of 0.08. (d) BaySAC-LMedS results.

C. Comparison of the Fitting Results

To validate the robustness of the plane fitting, correctness and completeness were adopted to evaluate the results. The correctness was computed as the number of correctly fitted planes divided by the number of all fitted planes, while the completeness was calculated as the number of fitted planes divided by the number of all planes contained in the point cloud. We also calculated the fitting time and compared the performance of the different algorithms in terms of their computational cost.

TABLE II: Comparison of Fitting Results Before and After Simplification

The results of the BaySAC algorithm in Table II were achieved using the optimal threshold, which was manually selected, and thus they are consistent with those of the threshold-independent BaySAC-LMedS algorithm. Table II illustrates that both the correctness and the completeness of the plane fitting were improved after the simplification of the point clouds using the proposed algorithm. The BaySAC-LMedS algorithm is more efficient than the BaySAC algorithm. Table II also shows that fitting the simplified point cloud is clearly more efficient than fitting the original one.
V. CONCLUSION

To address the significant data redundancy encountered when extracting planar features from 3-D scanned surface data, we propose a fast planar feature fitting algorithm that integrates point cloud simplification, which preserves the feature boundaries, with threshold-independent BaySAC. The key to this method is to retain all the feature points during the point cloud simplification and to achieve threshold independence during the robust model hypothesis test. The experimental results indicate that the proposed method can extract detailed features that are easily overwhelmed by features with large numbers of points. Moreover, the operational efficiency of the planar feature fitting after the point cloud simplification is substantially improved in comparison with fitting the original point cloud. The experimental results also indicate differences in the fitting accuracies when different thresholds are selected for the existing algorithms, such as RANSAC and BaySAC, whereas the fitting results of the BaySAC-LMedS method are independent of thresholds. The presented method addresses only the most prevalent planar features; in future work, it will be extended to the fitting of complex features, which will increase its adaptability.

REFERENCES

[1] B.-Q. Shi, J. Liang, and Q. Liu, "Adaptive simplification of point cloud using k-means clustering," Comput.-Aided Design, vol. 43, no. 8, pp. 910–922, 2011.
[2] Z. Yu, H.-S. Wong, H. Peng, and Q. Ma, "ASM: An adaptive simplification method for 3D point-based models," Comput.-Aided Design, vol. 42, no. 7, pp. 598–612, 2010.
[3] K. Demarsin, D. Vanderstraeten, T. Volodine, and D. Roose, "Detection of closed sharp edges in point clouds using normal estimation and graph theory," Comput.-Aided Design, vol. 39, no. 4, pp. 276–283, 2007.
[4] M. A. Fischler and R. Bolles, "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography," Commun. ACM, vol. 24, no. 6, pp. 381–395, 1981.
[5] O. Chum and J. Matas, "Randomized RANSAC with T(d,d) test," in Proc. Brit. Mach. Vis. Conf., vol. 2, 2002, pp. 448–457.
[6] R. Schnabel, R. Wahl, and R. Klein, "Efficient RANSAC for point-cloud shape detection," Comput. Graph. Forum, vol. 26, no. 2, pp. 214–226, 2007.
[7] P. H. S. Torr and A. Zisserman, "MLESAC: A new robust estimator with application to estimating image geometry," Comput. Vis. Image Understand., vol. 78, no. 1, pp. 138–156, 2000.
[8] P. H. S. Torr and C. Davidson, "IMPSAC: Synthesis of importance sampling and random sample consensus," IEEE Trans. Pattern Anal. Mach. Intell., vol. 25, no. 3, pp. 354–364, Mar. 2003.
[9] T. Botterill, S. Mills, and R. Green, "New conditional sampling strategies for speeded-up RANSAC," in Proc. Brit. Mach. Vis. Conf., 2009, pp. 1–11.
[10] Z. Kang, L. Zhang, B. Wang, Z. Li, and F. Jia, "An optimized BaySAC algorithm for efficient fitting of primitives in point clouds," IEEE Geosci. Remote Sens. Lett., vol. 11, no. 6, pp. 1096–1100, Jun. 2014.
[11] G. Vosselman, "Point cloud segmentation for urban scene classification," ISPRS Int. Arch. Photogramm., Remote Sens. Spatial Inf. Sci., vol. XL-7/W2, pp. 257–262, Nov. 2013.
[12] A. Kalaiah and A. Varshney, "Modeling and rendering of points with local geometry," IEEE Trans. Vis. Comput. Graphics, vol. 9, no. 1, pp. 30–42, Jan. 2003.
[13] P. J. Rousseeuw and A. M. Leroy, Robust Regression and Outlier Detection. New York, NY, USA: Wiley, 1987.