Automation in Construction 47 (2014) 78–91
Towards terrestrial 3D data registration improved by parallel programming and evaluated with geodetic precision

Janusz Będkowski a,⁎, Karol Majek a, Pawel Musialik a, Artur Adamek b, Dariusz Andrzejewski b, Damian Czekaj b

a Institute of Mathematical Machines, ul. Krzywickiego 34, 02-078 Warsaw, Poland
b Faculty of Geodesy and Cartography, Warsaw University of Technology, Pl. Politechniki 1, 00-661 Warsaw, Poland
⁎ Corresponding author. E-mail address: [email protected] (J. Będkowski).
Article info

Article history:
Received 24 October 2013
Received in revised form 10 June 2014
Accepted 25 July 2014
Available online 29 August 2014

Keywords:
Iterative closest point
Data registration
Mobile mapping
CUDA parallel programming
Spatial design support
Abstract

In this paper a quantitative and qualitative evaluation of the proposed ICP-based data registration algorithm, improved by parallel programming in CUDA (compute unified device architecture), is shown. The algorithm was tested on data collected with a 3D terrestrial laser scanner Z+F Imager 5010 mounted on the mobile platform Pioneer 3AT. The parallel implementation enables on-line data registration, even on a laptop with a standard hardware configuration (NVIDIA GeForce 6XX/7XX series graphics card). Robustness is assured by the use of CUDA-enhanced fast NNS (nearest neighbor search) applied in ICP (iterative closest point) with an SVD (singular value decomposition) solver. The evaluation is based on reference ground truth data registered with geodetic precision. The geodetic approach extends our previous work and gives an accurate benchmark for the algorithm. The data were collected in an urban area under a demolition scenario in a real environment. We compared four registration strategies concerning data preprocessing, such as subsampling and vegetation removal. The result is an analysis of the measured performance and the accuracy of the geometric maps. The system provides accurate metric maps on-line and can be used in several applications, such as mobile robotics for construction area modelling or spatial design support. It is a core component of our future work on mobile mapping systems. © 2014 Elsevier B.V. All rights reserved.
1. Introduction

The 6D-SLAM (simultaneous localization and mapping) algorithm, apart from solving the simultaneous localization and mapping problem, allows for the quick and reliable creation of digital models of large environments without the need for direct intervention. The "6D" comes from the six dimensions of the robot motion model, which integrates the 3D position coordinates (x, y, z) with orientation information (yaw, pitch and roll). Such a model is a natural choice for an outdoor environment. There is no limitation on using 6D-SLAM in indoor environments, but using pitch and roll angles on flat surfaces is not always necessary. The output of the algorithm, in most cases, is a map in one of two forms — dense or sparse. Dense maps are related to 3D point clouds [1], obtained typically with 3D laser scanners; sparse maps are related to features extracted mostly from images. The 3D data registration problem was introduced by Besl and McKay in Ref. [2]; from that moment on, many researchers have been trying to solve the problem of augmenting the accuracy and the performance of aligning two clouds of points. Based on the State of the Art, we can state that the solutions to the key issues of 3D GPGPU (general purpose computing on graphics processing units) data registration proposed in the important contributions are very close to optimal, but may
still be improved upon. An approach that is widely used for 3D data registration is the iterative closest point (ICP) algorithm. The goal of the ICP is to find the transformation matrix that minimizes the sum of distances between the corresponding points in two different data sets. The method's effectiveness depends mostly on the solutions to two important problems:
• the nearest neighbor search (NNS) and
• choosing the proper optimization technique for the minimization of the mentioned function (estimation of the 3D rigid transformation).
The NNS procedure is dominant compared to the rest of the ICP algorithm; therefore, many researchers are trying to optimize its execution time. The SoA provides several CUDA-based approaches to the NNS problem in the ICP algorithm. An approach from Ref. [3] uses regular grid decomposition [4], whereas in Ref. [5] a k-d tree is used. The second problem, choosing the proper optimization technique, has been a research topic in recent decades. A comparison of four algorithms for estimating the 3D rigid transformation is shown in Ref. [6]. The first algorithm, proposed in Ref. [7], uses singular value decomposition (SVD) of a derived matrix. The second approach, based on orthonormal matrices and the computation of an eigensystem of a derived matrix, is proposed in Ref. [8]. The third algorithm is shown in Ref. [9]; it finds the transformation for the ICP algorithm by using unit quaternions. The fourth algorithm, shown in Ref. [10], uses the so-called dual
quaternions. Apart from these four closed-form solution methods, a novel linear solution to the scan registration problem is shown in Ref. [11]. The advantage of these new linear solutions is that they can be extended straightforwardly to n-scan registration. It was stated that, under the assumption that the transformation (R, t) to be calculated by the ICP algorithm is small, it can be approximated by applying instantaneous kinematics. This solution was initially given in Refs. [12,13]. The reported experiments have shown that the helix transform performs qualitatively as well as the uncertainty-based algorithm using Euler angles.

The paper is composed of 11 chapters. The current one provides an introduction and a short summary of the state of the art. The second explains the motivation behind the research. The third explains the details of the real task scenario of the experiment. Chapter 4 describes the data acquisition and processing. In chapter 5, the methodology for evaluation is described, followed by the algorithm modifications: vegetation removal (chapter 6) and subsampling (chapter 7). Chapter 8 provides a detailed description of the experiment, with the analysis of the results in chapter 9. Chapter 10 introduces the end-user case study, and chapter 11 closes with a summary and conclusions.

2. Problem formulation

The goal of this work is to benchmark the 3D data registration method, improved by CUDA parallel programming, shown in the previous work [14] within the scope of quantitative spatial design support. The benchmark is analyzed using reference data of geodetic precision. The approach extends the state of the art by providing qualitative information concerning the accuracy of the proposed method. The secondary goal is to test the system in a real task scenario with an assumption of on-line performance. The resulting maps can be used for numerous applications: urban area modeling, spatial design support, basic space design, etc.

3. Real task scenario

To assure real-life conditions for the experiment, a proper environment had to be chosen. Fig. 1a shows the object of interest: a building in the village Klomino (Poland), abandoned since 1993. The choice is motivated by the hard terrain conditions of the area. The goal of the experiment is to create a metric model of this building. Data for the model are gathered with a geodetic laser range finder mounted onto a robotic
platform (Fig. 1c). This scenario simulates a potential real robotic application: the deployment of a mobile platform in a hazardous environment to gather data and provide a metric map in an on-line fashion. Similar equipment (RIEGL LMS-Z210) was involved in the disaster assessment at Fukushima 1 in 2011. The key factor is the accuracy of scan matching, which has to be as high as possible to increase the fidelity of the produced metric map.

Fig. 1. Real task scenario: (a) location of the scenario; (b) object of interest; (c) mobile robot and geodetic equipment.

4. Data registration

The ICP algorithm, with its point-to-point and point-to-plane variants, has become a well-known method since it appeared in Ref. [2]. The fastest implementation found in the literature needs 60 ms to align two point clouds of 320 × 240 data points each [15], but the authors unfortunately did not discuss the scalability of the proposed method. The key concept of the standard ICP algorithm can be summarized in two steps [16]:

1. Compute correspondences between the two scans (nearest neighbor search).
2. Compute a transformation which minimizes the distance between corresponding points.

Iteratively repeating these two steps should result in convergence to the desired transformation. Range images (scans) are defined as the model set M, where

|M| = N_m,  (1)

and the data set D, where

|D| = N_d.  (2)

The alignment of these two data sets is solved by minimizing the following cost function:

E(R, t) = \sum_{i=1}^{N_m} \sum_{j=1}^{N_d} w_{ij} \, \lVert m_i - (R d_j + t) \rVert^2  (3)

where w_{ij} is assigned 1 if the i-th point of M corresponds to the j-th point of D, and w_{ij} = 0 otherwise. R is the rotation matrix, t is the translation vector, m_i corresponds to points from the model set M, and d_j corresponds to
points from the data set D. It was already proven that the ICP algorithm needs a good prediction to achieve an accurate matching. Therefore, in this paper we show that by decreasing the radius of the NNS during ICP we can improve the accuracy. The main contribution of this paper is related to the following improvements of the 3D data registration shown in our previous work [3]:
• processing up to 64 × 1024 × 1024 points in a single step,
• using regular grid decomposition for a robust nearest neighbor search,
• using a parallel reduction for the correlation matrix computation (see the sketch after this list),
• implementation of an SVD solver in CUDA,
• data post-processing implemented in CUDA.
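To make the parallel-reduction step concrete, the listing below is a minimal, hypothetical CUDA sketch (ours, not the authors' code) of accumulating the 3 × 3 cross-covariance ("correlation") matrix H = Σ d_j m_j^T over centered corresponding point pairs; the rotation is afterwards recovered from the SVD of H (e.g. with cuSOLVER), following the closed-form solution of Ref. [7]. The kernel name, the block size of 256 threads and the data layout are our assumptions.

#include <cuda_runtime.h>

// Accumulate H = sum_j d_j * m_j^T over corresponding, already-centered point pairs
// (d is the set aligned to m). H points to 9 floats (row-major) that must be
// zero-initialized before the launch. blockDim.x is assumed to be 256 (matches the
// shared-memory layout below).
__global__ void crossCovarianceKernel(const float3* model, const float3* data,
                                      int n, float* H)
{
    __shared__ float sH[9][256];
    int tid = threadIdx.x;
    int idx = blockIdx.x * blockDim.x + tid;

    // per-thread partial sums over a grid-stride loop
    float h[9] = {0, 0, 0, 0, 0, 0, 0, 0, 0};
    for (int i = idx; i < n; i += gridDim.x * blockDim.x) {
        float3 m = model[i], d = data[i];
        h[0] += d.x * m.x; h[1] += d.x * m.y; h[2] += d.x * m.z;
        h[3] += d.y * m.x; h[4] += d.y * m.y; h[5] += d.y * m.z;
        h[6] += d.z * m.x; h[7] += d.z * m.y; h[8] += d.z * m.z;
    }
    for (int k = 0; k < 9; ++k) sH[k][tid] = h[k];
    __syncthreads();

    // block-level tree reduction in shared memory
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s)
            for (int k = 0; k < 9; ++k) sH[k][tid] += sH[k][tid + s];
        __syncthreads();
    }
    // one atomic per block and matrix entry combines the block results
    if (tid == 0)
        for (int k = 0; k < 9; ++k) atomicAdd(&H[k], sH[k][0]);
}

Given H, Ref. [7] recovers the rotation as R = V U^T from the decomposition H = U Σ V^T (with a determinant check against reflections), and the translation from the difference of the centroids.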
4.1. CUDA implementation of classic ICP

NVIDIA GPUs are fully programmable multi-core chips built around an array of processors working in parallel. The GPU is composed of an array of SM (FERMI)/SMX (KEPLER) multiprocessors, each of which can launch up to 1024 co-resident concurrent threads. Thread management (creation, scheduling, synchronization) is performed at the hardware level (SM/SMX); therefore, the overhead cost is extremely low. The SM/SMX multiprocessors work in an SIMT scheme (single instruction, multiple thread), where threads are executed in groups of 32, called warps. The CUDA programming model defines the host and the device. The host executes sequential CPU procedures, whereas the device executes parallel programs — kernels. A kernel works according to an SPMD scheme (single program, multiple data).
Fig. 2. Parallel implementation in CUDA: (a) scalable programming model in CUDA; (b) neighboring buckets with their indexing in the regular grid of buckets implemented on the GPU (k = 32, 64, 128, 256 or 512); (c) system overview for data registration.

Fig. 3. Vegetation detection and removal: (a) input data; (b) normal vectors; (c) removed vegetation.

Fig. 4. Geodetic network used for measuring ground truth data: (a) geodetic network; (b) numbered poses of the registered scans.

Fig. 5. Comparison of three strategies for registration of observations 1 and 2.
CUDA massively parallel computation is applied to several parts of the given problem. The main idea is to use the GPU for NNS by decomposing the 3D space (x, y, z ∈ ⟨−1, 1⟩) into a regular grid of 2^k × 2^k × 2^k (k ∈ {4, 5, 6, 7, 8, 9}) buckets. Another idea is to perform the calculations for each query point in parallel using SIMT — single-instruction, multiple-thread — in CUDA. Assuming the scalable programming model shown in Fig. 2a, which allows the GPU architecture to scale the number of multiprocessors and memory partitions, we can expect higher performance on GPUs with a higher number of multiprocessors. To demonstrate this, we show the performance difference between the high-end GeForce GTX TITAN (14 multiprocessors) and the common GeForce GT 650M (2 multiprocessors) in Fig. 8. Each multiprocessor in our implementation is able to perform calculations for up to 1024 query points in parallel. Therefore, increased performance can be observed for data sets of over 2048 data points. In 6D-SLAM, one of the problems is that single point clouds only partially overlap each other. Because the assumption of full overlap is violated, we are forced to add a maximum matching threshold parameter d_max. This threshold addresses the fact that some points will not have any correspondence in the second scan, preventing them from being matched. The value of the threshold is connected with the size of a single bucket: d_max < 2/2^k. In most implementations of ICP, the choice of d_max represents a tradeoff between convergence and accuracy. A low value may lead to not finding any neighbors and, as such, a very random solution. On the other hand, large values may result in bad convergence (far from the optimum). In our approach the State of the Art algorithm described in Ref. [17] is improved by replacing the complex k-d tree data structure with a CUDA regular grid. As creating the grid representation is much faster than building a full tree, the overall performance of the closest-point search is significantly increased. Parallel implementation further decreases the computation time. All derivations of the investigated registration method can be found in Ref. [18]. For comparison purposes, the classic ICP is listed as Algorithm 1, and the ICP algorithm using CUDA parallel programming is listed as Algorithm 2. The main idea is to decompose the 3D
space into a regular grid of buckets (Fig. 2b) and to perform NNS computation for each query point in parallel.
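As an illustration of the bucket-based NNS, the following is a minimal, hypothetical CUDA sketch; the data layout is our assumption, not the authors' implementation. The data points are assumed to be pre-sorted by bucket index (e.g. with thrust::sort_by_key on the host), bucketStart/bucketCount describe each bucket's contiguous range, and every thread handles one query point, scanning only the 27 buckets around it while respecting the matching threshold d_max.

#include <cuda_runtime.h>

__device__ inline int bucketOf(float v, int g) {
    // map v in <-1, 1> to a bucket index in [0, g)
    int b = (int)((v + 1.0f) * 0.5f * g);
    return min(max(b, 0), g - 1);
}

__global__ void nnsKernel(const float3* query, int nQuery,
                          const float3* sorted,       // data points sorted by bucket
                          const int* bucketStart,     // first index of each bucket
                          const int* bucketCount,     // number of points in each bucket
                          int g,                      // buckets per axis (2^k)
                          float dmax,                 // matching threshold
                          int* nnIndex)               // output: index of NN or -1
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nQuery) return;

    float3 q = query[i];
    int bx = bucketOf(q.x, g), by = bucketOf(q.y, g), bz = bucketOf(q.z, g);

    float best = dmax * dmax;   // squared distance threshold
    int bestIdx = -1;

    // scan the query point's own bucket and its 26 neighbors
    for (int dz = -1; dz <= 1; ++dz)
      for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx) {
            int x = bx + dx, y = by + dy, z = bz + dz;
            if (x < 0 || y < 0 || z < 0 || x >= g || y >= g || z >= g) continue;
            int b = (z * g + y) * g + x;
            int begin = bucketStart[b], end = begin + bucketCount[b];
            for (int j = begin; j < end; ++j) {
                float3 p = sorted[j];
                float ex = p.x - q.x, ey = p.y - q.y, ez = p.z - q.z;
                float d2 = ex * ex + ey * ey + ez * ez;
                if (d2 < best) { best = d2; bestIdx = j; }
            }
        }
    nnIndex[i] = bestIdx;
}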
5. Method of evaluation

The main idea behind evaluating the proposed method is to create, using geodetic methods, an accurate reference model and to compare the one resulting from 6D-SLAM with it. Before conducting the experiment with the robot, a local, high-accuracy geodetic control network has to be established. The number of control points of the network is chosen based on the shape of the object of interest; for the described experiment, four points were chosen. The positions of the control points have to be determined with high accuracy, as they are later used for transforming the 3D scans into the network's coordinate system. The precise measurement of the network was performed using the highly accurate Leica TCRP1201+ total station. The results were adjusted using the least squares method.
The final accuracy of the control network was approximately 0.3 mm. Apart from the control points, a set of artificial markers and natural tie points (building corners, etc.) may be used; in the experiment, a set of four paper markers was used. The reference model is built in two steps. First, a uniform local coordinate system is established for the scans. After that, the scans are transformed into the control network's geodetic coordinate system. The transformation parameters can be computed using a 3D Helmert transformation, which was used in our case. The final accuracy of the model was 2.9 mm. The whole process was done manually. A detailed description of the reference model building process is given in Section 8. Fig. 4a shows the final network. The process of evaluating the models created by the 6D-SLAM algorithm requires them to be transformed into the same geodetic coordinate system as the reference model. Often the control points, used as tie points in geodetic model building, cannot be used, as the ambiguity of the SLAM models is too high. Alternatively, the centroids of each single scan of the model can be chosen. Then a 3D Helmert transformation with six parameters (i.e., without changing the scale) can be used. This enables the exclusion of errors coming from the exterior pose of the full models, permitting a focus only on those coming from the matching process. The centroids' poses are compared with their reference model counterparts. The results are local fit errors that can be averaged to obtain the global error. This information supports the assessment of the model's accuracy.
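A minimal host-side sketch of this evaluation step (illustrative only; the routine and type names are our assumptions): the six-parameter Helmert (rigid-body) transformation is applied to the SLAM scan centroids, the local fit errors against the reference centroids are printed, and the global error is returned as their root mean square.

#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>   // for float3 / make_float3

struct RigidTransform { float R[9]; float3 t; };   // rotation (row-major) + translation

static float3 apply(const RigidTransform& T, float3 p) {
    return make_float3(T.R[0]*p.x + T.R[1]*p.y + T.R[2]*p.z + T.t.x,
                       T.R[3]*p.x + T.R[4]*p.y + T.R[5]*p.z + T.t.y,
                       T.R[6]*p.x + T.R[7]*p.y + T.R[8]*p.z + T.t.z);
}

// Returns the global (root-mean-square) fit error over all scan centroids.
float evaluateModel(const RigidTransform& helmert,
                    const std::vector<float3>& slamCentroids,
                    const std::vector<float3>& referenceCentroids)
{
    double sum = 0.0;
    for (size_t i = 0; i < slamCentroids.size(); ++i) {
        float3 p = apply(helmert, slamCentroids[i]);
        float3 r = referenceCentroids[i];
        double dx = p.x - r.x, dy = p.y - r.y, dz = p.z - r.z;
        double e = std::sqrt(dx * dx + dy * dy + dz * dz);     // local fit error
        std::printf("scan %zu: local error %.4f m\n", i, e);
        sum += e * e;
    }
    return (float)std::sqrt(sum / slamCentroids.size());       // global error
}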
6. Vegetation detection

Considering the nature of the ICP algorithm, it is evident that dynamic and unstable objects can decrease the accuracy of the matching. In many cases such interference may be ignored because it is local in nature (for example, a single person, even a moving one, is only a small part of the scan). The problem is more significant when we consider large unstable areas, such as vegetation. The amount of vegetation in the scan may influence the global accuracy of the matching. Therefore, we decided to implement a robust method for vegetation identification and removal. The process is based on normal vector analysis (Fig. 3). Estimating the surface normal is done by principal component analysis (PCA) of a covariance matrix C created from the nearest neighbors of the query point [19]. We developed a PCA solver based on the SVD method that performs the normal vector computation in parallel for each query point. In the last step of the algorithm, the orientation of the normal vector is determined. The basic principle behind the vegetation detection is checking, for each point, whether the direction and orientation of the neighbors' normal vectors are similar to those of the considered point. Points for which the percentage of similarity is lower than a threshold (10% in our case) are considered vegetation and are removed from the scan.
Fig. 6. Data for the registration experiment shown in Fig. 7: (a) raw data; (b) subsampled data; (c) subsampled data without trees.
This simple approach misclassifies some building points, especially on corners, but their number is much lower than the number of removed vegetation points, and thus this is not considered a problem.
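The per-point covariance computation can be parallelized in the same SIMT fashion as the NNS. The kernel below is a hypothetical sketch (the neighbor-list layout is assumed, not taken from the authors' code) that accumulates the 3 × 3 covariance matrix of each query point's k neighbors; the surface normal is then the eigenvector of C corresponding to the smallest eigenvalue, which the paper obtains with an SVD-based PCA solver.

#include <cuda_runtime.h>

// Build the symmetric 3x3 covariance matrix of each query point's neighborhood.
// neighborIdx stores k neighbor indices per point; C stores 6 floats per point
// (xx, xy, xz, yy, yz, zz). The eigen/SVD step is performed afterwards.
__global__ void covarianceKernel(const float3* points, int nPoints,
                                 const int* neighborIdx, int k, float* C)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nPoints) return;

    // centroid of the neighborhood
    float3 c = make_float3(0.f, 0.f, 0.f);
    for (int j = 0; j < k; ++j) {
        float3 p = points[neighborIdx[i * k + j]];
        c.x += p.x; c.y += p.y; c.z += p.z;
    }
    c.x /= k; c.y /= k; c.z /= k;

    // upper triangle of the symmetric covariance matrix
    float xx = 0.f, xy = 0.f, xz = 0.f, yy = 0.f, yz = 0.f, zz = 0.f;
    for (int j = 0; j < k; ++j) {
        float3 p = points[neighborIdx[i * k + j]];
        float dx = p.x - c.x, dy = p.y - c.y, dz = p.z - c.z;
        xx += dx * dx; xy += dx * dy; xz += dx * dz;
        yy += dy * dy; yz += dy * dz; zz += dz * dz;
    }
    float* out = &C[i * 6];
    out[0] = xx / k; out[1] = xy / k; out[2] = xz / k;
    out[3] = yy / k; out[4] = yz / k; out[5] = zz / k;
}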
Fig. 7. Registration errors for data from Fig. 6.
7. Data subsampling
In our research, we observed that an equal density of points in the initial 3D scans improves the matching accuracy. Therefore, for each scan we perform subsampling. The method counts the points within each bucket and keeps at most a given number of them (in our experiment we keep 1000 points per bucket, assuming a decomposition into 256 × 256 × 256 buckets).
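A minimal CUDA sketch of such bucket-limited subsampling (our assumption, not the authors' implementation): each point atomically reserves a slot in its bucket's counter and is kept only if the bucket has not yet reached the limit. Note that a counter array for a 256 × 256 × 256 grid occupies 256^3 integers (64 MB) of device memory.

#include <cuda_runtime.h>

// Keep at most maxPerBucket points per bucket of a 256x256x256 grid over the
// normalized space <-1, 1>. counters must be zero-initialized before the launch;
// keep[i] is set to 1 for points that survive subsampling.
__global__ void subsampleKernel(const float3* points, int n,
                                int* counters,        // one counter per bucket, zeroed
                                int maxPerBucket,     // e.g. 1000
                                unsigned char* keep)  // output flags
{
    const int g = 256;                                // buckets per axis
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float3 p = points[i];
    int bx = min(max((int)((p.x + 1.f) * 0.5f * g), 0), g - 1);
    int by = min(max((int)((p.y + 1.f) * 0.5f * g), 0), g - 1);
    int bz = min(max((int)((p.z + 1.f) * 0.5f * g), 0), g - 1);
    int bucket = (bz * g + by) * g + bx;

    // reserve a slot in the bucket; discard the point if the bucket is full
    int slot = atomicAdd(&counters[bucket], 1);
    keep[i] = (slot < maxPerBucket) ? 1 : 0;
}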
8. Experiments

Fig. 1 shows the environment in which the data were collected. To ensure full coverage of the building, 18 scanning points were chosen. Fig. 4b shows the 18 initial poses for the data registration evaluation. Data were collected with the 3D laser measurement system Z+F Imager 5010 mounted on the robotic platform. The initial poses were obtained manually using dedicated software. The goal was to register all 18 scans and to compare them with the reference model obtained with classical geodetic methods.

8.1. Ground truth data with geodetic precision

The control network was established using measurements from the geodetic survey total station, adjusted with the least squares method. In the first step, the distances and directions were averaged: for the control points in four series and for the markers in one. Slope distances were reduced at the instrument level, and the network side lengths were averaged from two measurements each. It was decided to adjust the poses of both the control points and the markers on the buildings in one calculation process. Thus the observation system consisted, besides the observations made on the control points, of the observations connecting the network with the markers.

8.2. Horizontal adjustment
Fig. 8. Comparison of the GeForce GT 650M and GeForce GTX TITAN performance for the total registration of 18 scans. Y axis — time in seconds; X axis — experiment number (1 — inner = 50 and outer = 50, data subsampled without trees; 2 — inner = 300 and outer = 300, data subsampled without trees; 3 — inner = 1000 and outer = 1000, data subsampled without trees; 4 — inner = 300 and outer = 300, data subsampled).
For the purposes of this paper, the coordinate system was defined as a local system with the X axis parallel to the line between points 1002 and 1001. The horizontal coordinates of point 1002 were set to X = 100.000 m and Y = 100.000 m. The control network was not tied to any external coordinate system, so it was decided to perform a free-network adjustment. Using the already known angle and distance values, the approximate coordinates of the four control points and the four targets marked on the building were calculated. The system of equations consisted of ten angle observations and ten distance observations.
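For reference, the weighted least-squares adjustment implied here can be written compactly as follows; this is a generic sketch in assumed notation, not a formula quoted from the paper (and in a free-network adjustment the regular inverse is replaced by a pseudo-inverse):

\hat{x} = (A^{T} P A)^{-1} A^{T} P\, l, \qquad v = A \hat{x} - l, \qquad \hat{\sigma}_{0}^{2} = \frac{v^{T} P v}{n - u}

where A is the design matrix of the linearized angle and distance observation equations, P is the weight matrix derived from the a priori observation errors, l is the vector of reduced observations (misclosures), v are the residuals, n = 20 is the number of observations and u is the number of unknown coordinates.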
Fig. 9. Reference geometric model obtained with geodetic precision.
These observations were weighted according to the root mean square error of each observation. The RMS errors were based on the accuracy of the Leica TCRP 1201+ total station used for the measurements (angle error 0.0003 gon, distance error in reflectorless mode 0.002 m + 2 ppm) and on the number of observation series. As a result of the adjustment, the coordinates and accuracies of eight points were obtained. Apart from two targets, whose accuracy was near 2 mm (caused by measurements made from only one position), the targets and control points were calculated with an accuracy better than 1 mm. The parameters of the mean error ellipses were also calculated and plotted in Fig. 4a. The last step of the horizontal adjustment was to evaluate statistical tests. Both the global tests, which check for gross errors and the correct choice of the adjustment model, and the local tests were passed successfully.

8.3. Height adjustment

The first step was calculating the height differences between the control points and the targets on the building. For that purpose, averaged vertical
angles and distances, reduced to the instrument level, were used. Point 1001 (Fig. 4) was assumed to be fixed, and its height was set to 10 m. A system of equations was created from ten equations of height differences. The observations were weighted by the parameter p, calculated as the inverse of the network's side length multiplied by the square root of the number of measurements m: m = 1 if the measurement was taken from one point, and m = 2 if the measurement was taken from both points.

8.4. Building of a reference model

As a first step, a rough manual orientation of the scans was performed. Afterwards, a precise orientation was performed based on the control points and markers visible in the scans. The orientation was done separately for the X and Y horizontal coordinates and the Z vertical coordinate. Measurements made on the control points were used to georeference (exterior orientation) the model to the network's coordinate system. The interior orientation was based on the paper markers on the walls and on characteristic, easy to identify
Fig. 10. Errors for each of our four evaluated registration strategies using the reference model obtained with geodetic precision, with an additional comparison to the SoA algorithm from 3DTK.
points of the building. For the best accuracy, the points were chosen regularly over the whole building. As a result, a compact and precise reference model of the object was compiled. The orientation parameters of all eighteen point clouds were calculated. An accuracy evaluation of the model was performed: based on the residuals vx, vy and vz obtained for every measured point, the RMS was calculated. Its value reached 2.9 mm.
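The paper does not state the exact RMS convention; one common definition consistent with the single reported value is the following (our assumption):

\mathrm{RMS} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( v_{x,i}^{2} + v_{y,i}^{2} + v_{z,i}^{2} \right)}

where n is the number of measured points.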
8.5. Evaluation of registration method

The experiment was meant to answer two main questions:
1. How to improve the accuracy of the final model?
2. How to efficiently reduce the data set and minimize the impact on the accuracy of the final model?
To improve the accuracy, we decided to apply a registration method that decreases the NNS radius parameter after performing a number of ICP iterations. This parameter strongly determines the NNS area. Fig. 5 demonstrates the impact of the NNS radius parameter on the improvement of the registration accuracy (for the demonstration, we registered observations 1 and 2). Three strategies for registering observations 1 and 2 were compared:
1. 300 iterations with radius = 6.25 m,
2. 200 iterations with radius = 6.25 m + 100 iterations with radius = 1.56 m,
3. 200 iterations with radius = 6.25 m + 50 iterations with radius = 1.56 m + 50 iterations with radius = 0.20 m.
In all cases (errors for angles and errors of displacement), the third registration strategy converges to a satisfactory result. We cannot start the registration with a small radius because the registration then tends to find a different local minimum. Therefore, a large radius value should be used for the initial registration. Another problem observed during the experiments was related to the raw scans. We concluded that, to achieve better results, proper subsampling, ensuring equal density, is needed. We show the results in Figs. 6 and 7, where we registered observations 1 and 2 with different strategies (raw, subsampled, subsampled without trees). We applied the following radius tuning strategy for the registration: 10 iterations with r = 6.25 m + 10 iterations with r = 3.125 m + 10 iterations with r = 1.5625 m + 10 iterations with r = 1.5625 m + 10 iterations with r = 1 m + 10 iterations with r = 0.80 m + 10 iterations with r = 0.60 m + 10 iterations with r = 0.40 m + 30 iterations with r = 0.20 m (a host-side sketch of this schedule is given below, after the list of experiments). The result shows a much higher error for the raw data than for the other approaches. The registration of the subsampled data and of the subsampled data without trees converges to a similar error, which leads to the conclusion that the proposed data reduction is beneficial. An important observation is that data reduction based on vegetation removal slightly reduces the accuracy of the data registration. However, as shown further in the paper, it provides an approximately 2-fold increase in time performance. Finally, we decided to compare this registration strategy within four experiments:
1. inner = 50 and outer = 50, data subsampled without trees (total number of data points 3,457,939),
2. inner = 300 and outer = 300, data subsampled without trees (total number of data points 3,457,939),
3. inner = 1000 and outer = 1000, data subsampled without trees (total number of data points 3,457,939),
4. inner = 300 and outer = 300, data subsampled (total number of data points 5,786,040).

The parameters inner and outer define how many points are used during the nearest neighbor search. Inner defines the number of random points chosen from the bucket containing the processed point. Outer defines the number of points taken from each of the 26 neighboring buckets. For further explanation of the inner and outer parameters, we encourage studying our previous work [3]. These parameters can drastically influence the performance of the registration, both in time and in accuracy. Accuracy is greater with higher values, whereas computation time is lower with lower ones. The best results were obtained for experiment 4 (inner = 300 and outer = 300, data subsampled). Data reduction slightly decreases the accuracy of the registration. A decreased number of searched points within the NNS procedure affects the accuracy, but the computation time is much lower.
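A hypothetical host-side sketch of the decreasing-radius schedule quoted above; the routine icpIterations() and the Pose type are our assumptions standing in for the CUDA ICP described in Section 4.1.

#include <cstdio>
#include <cuda_runtime.h>   // for float3

struct Pose { float R[9]; float t[3]; };   // rotation (row-major) and translation

// Assumed interface: runs the CUDA ICP (grid-based NNS + SVD step) for a fixed
// number of iterations with a given NNS radius and updates the pose in place.
void icpIterations(const float3* model, int nModel,
                   const float3* data, int nData,
                   int iterations, float radius, Pose* pose);

void registerWithSchedule(const float3* model, int nModel,
                          const float3* data, int nData, Pose* pose)
{
    // schedule taken from the text: start coarse, finish fine
    const struct { int iters; float radius; } schedule[] = {
        {10, 6.25f}, {10, 3.125f}, {10, 1.5625f}, {10, 1.5625f},
        {10, 1.0f},  {10, 0.80f},  {10, 0.60f},   {10, 0.40f}, {30, 0.20f}
    };
    for (const auto& s : schedule) {
        icpIterations(model, nModel, data, nData, s.iters, s.radius, pose);
        std::printf("ran %d iterations with radius %.4f m\n", s.iters, s.radius);
    }
}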
Fig. 11. Qualitative evaluation of the registration — cross-sections of the building at heights of (a) 1.5 m and (b) 12 m.
Fig. 12. Maps of differences (panels a–d): negative and positive residuals; the contour of the reference model is marked by the red line.
The total computation time for the registration of 18 scans is as follows (GeForce GT 650M 1 GB GDDR5, GeForce GTX TITAN 6 GB GDDR5):
1. inner = 50 and outer = 50, data subsampled without trees: total registration time (415 s, 70 s),
2. inner = 300 and outer = 300, data subsampled without trees: total registration time (1754 s, 228 s),
3. inner = 1000 and outer = 1000, data subsampled without trees: total registration time (4770 s, 607 s),
4. inner = 300 and outer = 300, data subsampled: total registration time (3527 s, 449 s).

Fig. 8 visualizes this comparison. The GeForce GTX TITAN (14 multiprocessors) is, on average, seven times faster than the GeForce GT 650M (2 multiprocessors); therefore, it supports the statement that scaling the number of multiprocessors can increase the performance of our parallel implementation.
Fig. 13. Maps of differences (panels a–d): negative and positive residuals; the contour of the reference model is marked by the red line.
9. Evaluation of the accuracy of the registration

The reference geometric model (2.9 mm accuracy) used in the quantitative evaluation is shown in Fig. 9. Fig. 10 shows the resulting errors for each of the four evaluated registration strategies. For comparison, the
results for the SoA implementation from Ref. [17] are shown (3DTK implementation). The smallest errors were observed for the fourth strategy: inner = 300 and outer = 300, data subsampled. Removing vegetation slightly decreases the accuracy, but the performance of the registration is much better (from 3527 s down to 1754 s of total registration time using the GeForce GT 650M). To demonstrate the qualitative result of our approach, we show the walls and corners of all models in reference to the geodetic model. Fig. 4b shows the numbering used for the analysis: black circles 1, 2, 3, 4 denote walls; white circles 1, 2, 3, 4 denote corners.
Table 1
Average error by method.

Model   σx [m]   σy [m]   σz [m]   σω [rad]   σφ [rad]   σκ [rad]   σxyz [m]
1       0.039    0.034    0.194    0.0085     0.0136     0.0019     0.119
2       0.035    0.034    0.036    0.0023     0.0035     0.0017     0.036
3       0.038    0.042    0.034    0.0022     0.0027     0.0014     0.039
4       0.017    0.034    0.018    0.0013     0.0017     0.0012     0.025
For each model, no matter what approach was used for its creation, consistency with the assumed accuracy is required. The model should provide a true representation of the geometry and structure of the scanned object. The models were tested by comparing selected elements of their visualization with those of the reference model. Two types of tests were conducted. The first was to generate a map of differences between the walls of the studied models and those of the reference one. In the corners of the building, cross sections were made. The tested models, for the most part, are located on one side of the reference model. It is evident that Model 1 has much larger deviations than the other models, which demonstrates significant geometric distortion. The most interesting are the results for wall 2 (Fig. 12) and wall 4 (Fig. 13): the deviations for the worst model (number 1) are 0.25–0.30 m. The building has been mapped most accurately in Model 4. On wall 1, displacements reached 15 cm. It can be noted that deviations are lower in Model 2 than in Models 3 and 4. Wall 4 has clearly been mapped worse than the other three. This may be the result of the iterative scan matching, as the scans of wall 4 were taken last; thus, they are biased with the error accumulated from fitting the previous point clouds. The results for all four walls show that approach 4 was the best, with approach 1 being the worst. Analysis of the control cross sections of the modeled object allowed us to obtain additional information about the quality of the models made using our algorithm. Sections were positioned in eight places — at the corners of the block at 1.5 m and 12 m above the ground (Fig. 11). The analysis confirms the findings from the previous examination. Considering all models, it is apparent that the coherence of Model 1 is the worst, and that of Model 4 is the best. There is no significant visible non-compliance in Model 4. Differences in deviations from the reference model between the sections at the bottom and at the top of the building suggest that the accuracy changes with height. Such discrepancies may be caused by the tilt of the models with respect to the reference model, a shift along the Z axis, or errors in the angular and linear orientation parameters of the scans.
In summary, both the difference maps and the cross-section analysis confirm the conclusions of the examination of the orientation of the models. Table 1 shows the average error for each pose parameter, computed for each of the presented models.

10. End user case study

Fig. 14 shows the mobile mapping system developed at the Institute of Mathematical Machines for field operations as a result of the research presented in this paper. Currently, this system is used in two research projects ("Research of Mobile Spatial Assistance System", No. LIDER/036/659/L-4/12/NCBR/2013, and FP7 ICARUS — "Integrated Components for Assisted Rescue and Unmanned Search operations"). The first project concerns the application of spatial design support, where a mobile mapping system is used for accurate mapping of urban environments. These maps are used for spatial design support by providing software tools for interaction with spatial intent. The second project concerns 3D mapping in SAR (search and rescue). The research on building a mobile mapping system is inspired by the Fukushima nuclear disaster in 2011, where 3D mapping in such scenarios could help in mission execution and 3D data collection. This practical and very important application scenario necessitates the on-line nature of the approach. Decreased time of measurement, data registration and visualization can decrease the risk of potential contamination of the workers involved in such a SAR mission. We are convinced that laser scanning technologies and the efficient exploitation of scanning results can contribute to future catastrophe prevention. The mobile mapping system is composed of the mobile robot Husky equipped with a 3D laser Z+F Imager 5010 and a ruggedized NVIDIA GRID system (hardware configured by the Boston Limited company from the UK). The NVIDIA GRID system is composed of a Supermicro RZ-1240iNVK2 server with Citrix software capable of GPU virtualization. The system needs two operators, and the initial phase of the operational procedure takes about 15 minutes. After this procedure, the system is capable of working for 2 hours, acquiring 20–30 local scans in different locations. The operator controls the robot and the laser from a laptop connected via a WiFi router. The software in the GRID system for data registration and visualization is designed in the SaaS (software as a service) model and is available from any device (laptop, smartphone, tablet). A single operator collects data from the robot and performs the data registration. The visualization of an accurate 3D map is then redistributed over the local network via Citrix XenApp. Therefore, it is accessible from any mobile device.
Fig. 14. Mobile mapping system for field operations: (a) robot Husky equipped with the 3D laser Z+F Imager 5010; (b) mobile ruggedized NVIDIA GRID system.
Fig. 15. Use cases for the mobile mapping system: (a) Institute of Mathematical Machines (design support scenario); (b) Klomino — the main object of interest in this paper (SAR scenario); (c) underground garage (SAR scenario); (d) indoor (SAR scenario). These locations were scanned with the Z+F Imager 5010 and registered with the software described in this paper.
Using GRID technology improves 3D rendering over Ethernet and solves the problem of high-performance computing (CUDA) in the cloud. The system is capable of registering 3D data in the field and immediately redistributing the resulting map over the local network to many end users. This functionality can help to increase awareness during a SAR operation. To summarize, the proposed mobile mapping system can be used for:
• spatial design support (Fig. 15a),
• search and rescue applications (Fig. 15b–d).

11. Conclusions

In this paper we have shown a quantitative and qualitative evaluation of the data registration algorithm improved by CUDA parallel programming. Data were collected in an urban-area demolition scenario using the 3D terrestrial laser scanner Z+F Imager 5010 mounted on the mobile platform Pioneer 3AT to simulate a real mobile robotic task. We have shown a system for on-line data registration together with an analysis of its accuracy. The proposed implementation is robust because of the fast nearest neighbor search applied in the iterative closest point algorithm with a singular value decomposition solver. The performed qualitative and quantitative evaluation is based on reference ground truth data measured with geodetic precision. The geodetic approach extends the previous work [3,14] and provides an accurate benchmark for the algorithm. We compared four registration strategies concerning data preprocessing, such as subsampling and vegetation removal. We observed that proper subsampling increases the accuracy and performance. Vegetation removal increases the
performance but, at the same time, slightly decreases the accuracy; it is thus recommended for applications that are time critical. Our system has already been tested in realistic experiments and provides accurate, consistent metric maps on-line. The accuracy level of the maps is appropriate for potential applications such as information gathering for urban area modeling, spatial design support and initial space planning. It is a core component of our future work on mobile mapping systems.

Acknowledgements

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007–2013) under grant agreement no. 285417 — project ICARUS (Integrated Components for Assisted Rescue and Unmanned Search operations). This work was also done with the support of the NCBiR (Polish Centre for Research and Development) project "Research of Mobile Spatial Assistance System", no. LIDER/036/659/L-4/12/NCBR/2013, and with the support of the NCN (Polish National Science Centre) project "Methodology of semantic models building based on mobile robots observations", no. DEC-2011/03/D/ST6/03175. We would like to thank our reviewers for their hard work and valuable comments.

References

[1] A. Nüchter, H. Surmann, K. Lingemann, J. Hertzberg, S. Thrun, 6D SLAM with an application in autonomous mine mapping, Proceedings of the IEEE International Conference on Robotics and Automation, 2004, pp. 1998–2003.
[2] P.J. Besl, N.D. McKay, A method for registration of 3-D shapes, IEEE Trans. Pattern Anal. Mach. Intell. 14 (2) (1992) 239–256, http://dx.doi.org/10.1109/34.121791.
[3] J. Bedkowski, A. Maslowski, G. de Cubber, Real time 3D localization and mapping for USAR robotic application, Ind. Robot. 39 (5) (2012) 464–474.
[4] T. Rozen, K. Boryczko, W. Alda, GPU bucket sort algorithm with applications to nearest-neighbour search, WSCG 16 (1–3) (2008) 161–167.
[5] D. Qiu, S. May, A. Nüchter, GPU-accelerated nearest neighbor search for 3D registration, Proceedings of the 7th International Conference on Computer Vision Systems, ICVS09, Springer-Verlag, Berlin, Heidelberg, 2009, pp. 194–203.
[6] A. Lorusso, D. Eggert, R. Fisher, A comparison of four algorithms for estimating 3-D rigid transformations, Proceedings of the 1995 British Conference on Machine Vision (BMVC95), Birmingham, vol. 1, BMVA Press, Guilford, 1995, pp. 237–246.
[7] K.S. Arun, T.S. Huang, S.D. Blostein, Least-squares fitting of two 3-D point sets, IEEE Trans. Pattern Anal. Mach. Intell. 9 (5) (1987) 698–700, http://dx.doi.org/10.1109/TPAMI.1987.4767965.
[8] B.K.P. Horn, H. Hilden, S. Negahdaripour, Closed-form solution of absolute orientation using orthonormal matrices, J. Opt. Soc. Am. A 5 (7) (1988) 1127–1135.
[9] B.K.P. Horn, Closed-form solution of absolute orientation using unit quaternions, J. Opt. Soc. Am. 4 (4) (1987) 629–642.
[10] M.W. Walker, L. Shao, R.A. Volz, Estimating 3-D location parameters using dual number quaternions, CVGIP: Image Underst. 54 (3) (1991) 358–367, http://dx.doi.org/10.1016/1049-9660(91)90036-O.
[11] A. Nüchter, J. Elseberg, P. Schneider, D. Paulus, Study of parameterizations for the rigid body transformations of the scan registration problem, Comput. Vis. Image Underst. 114 (8) (2010) 963–980, http://dx.doi.org/10.1016/j.cviu.2010.03.007.
[12] H. Pottmann, S. Leopoldseder, M. Hofer, Simultaneous registration of multiple views of a 3D object, ISPRS Arch. 34 (3A) (2002) 265–270.
[13] M. Hofer, H. Pottmann, Orientierung von Laserscanner-Punktwolken, Vermessung Geoinf. 91 (2003) 297–306.
[14] J. Bedkowski, Intelligent mobile assistant for spatial design support, Autom. Constr. 32 (2013) 177–186, http://dx.doi.org/10.1016/j.autcon.2012.09.009 (URL http://www.sciencedirect.com/science/article/pii/S0926580512001586).
[15] S.-Y. Park, S.-I. Choi, J. Kim, J. Chae, Real-time 3D registration using GPU, Mach. Vis. Appl. (2010) 1–14, http://dx.doi.org/10.1007/s00138-010-0282-z.
[16] A. Segal, D. Haehnel, S. Thrun, Generalized-ICP, Proceedings of Robotics: Science and Systems, Seattle, USA, 2009, pp. 1–8.
[17] A. Nüchter, K. Lingemann, J. Hertzberg, Cached k-d tree search for ICP algorithms, Proceedings of the Sixth International Conference on 3-D Digital Imaging and Modeling, IEEE Computer Society, Washington, DC, USA, 2007, pp. 419–426, http://dx.doi.org/10.1109/3DIM.2007.15.
[18] A. Nüchter, J. Hertzberg, Towards semantic maps for mobile robots, Robot. Auton. Syst. 56 (11) (2008) 915–926, http://dx.doi.org/10.1016/j.robot.2008.08.001.
[19] R.B. Rusu, Z.C. Marton, N. Blodow, M. Beetz, Learning informative point classes for the acquisition of object model maps, Proceedings of the 10th International Conference on Control, Automation, Robotics and Vision (ICARCV), Hanoi, Vietnam, 2008, pp. 1–8 (URL http://files.rbrusu.com/publications/Rusu08ICARCV.pdf).