Available online at www.sciencedirect.com
ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99 – 114 www.elsevier.com/locate/isprsjprs
Iterative processing of laser scanning data by full waveform analysis Michael Kirchhof a,⁎, Boris Jutzi a , Uwe Stilla b b
a FGAN-FOM Research Institute for Optronics and Pattern Recognition, 76275 Ettlingen, Germany Photogrammetry and Remote Sensing, Technische Universitaet Muenchen, 80290 Muenchen, Germany
Received 8 December 2006; received in revised form 22 August 2007; accepted 23 August 2007 Available online 24 October 2007
Abstract The latest developments in laser scanners allow capturing and digitizing of the full waveform of the backscattered pulse. The waveform can be analyzed for measurement features such as range, reflectance values and spreading of the pulse. These features are used to distinguish between locally planar surface elements and partly penetrable objects caused by partial occlusions. This presegmentation and the derived range values are used to automatically generate surface primitives in the form of planes. This allows refining each range value taking the surface geometry in a close neighborhood into account. To refine the modeling of the surface, partly occluded surface areas are extended by prediction of the expected range values. This prediction is further improved by considering the surface slope for the estimated received waveform. Then the point cloud associated with the surface is enhanced by additional range values that were missed in the first processing step due to weak signal response. This procedure is repeated several times until all useful range values are considered to estimate the surface. © 2007 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved. Keywords: Laser scanning; Waveform analysis; Feature extraction
1. Introduction The generation of accurate 3D models from laser scanning data for a description of man-made objects is of great interest in photogrammetric research. Detailed description of objects such as buildings requires sampling the surface as a point cloud as dense and complete as possible. However, depending on the scene and the point of view of the laser scanning system, foreground objects like trees in the line of sight of the laser beam interrupt the uniform sampling of the surfaces. These gaps in the sampling of the surface are called partly occluded regions. Maas (2000) included a discussion about partly occluded ⁎ Corresponding author. Tel.: +49 7243 992 213; fax: +49 7243 992 299. E-mail address:
[email protected] (M. Kirchhof).
regions in his work on TIN (Triangulated Irregular Network) structures and least-squares matching. A different strategy to handle gaps in the point cloud is the application of morphological operations (Gorte and Pfeifer, 2004). Morphological operations can also be used in the case that no reflections of the region can be measured due to total occlusion by an impervious object. However, vegetation often shows a semi-penetrable property. State of the art laser scanning systems allow recording the first and last pulse or a given number of pulses. While first pulse registration is the optimum choice to measure the hull of partially penetrable objects, e.g. canopy of trees, last pulse registration should be chosen to measure impenetrable surfaces, e.g. ground surface below vegetation for airborne applications or a building behind vegetation for terrestrial applications.
0924-2716/$ - see front matter © 2007 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved. doi:10.1016/j.isprsjprs.2007.08.006
100
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
Recent developments in commercial airborne laser scanning systems have led to systems such as OPTECH ALTM 3100, TOPEYE MK II, and TOPOSYS HARRIER 56 (based on RIEGL LMS-Q560) that allow capturing the waveform:. These systems are specified to operate with a transmitted pulse width of 4–10 ns at fullwidth-at-half-maximum (FWHM) and allow digitization and acquisition of the waveform with approximately 0.5–1 GSample/s. Detailed overviews for laser scanning systems are given in (Huising and Gomes Pereira, 1998; Wehr and Lohr, 1999; Baltsavias, 1999). In contrast to airborne systems, the prototype of the terrestrial laser scanning system ECHIDNA (Lovell et al., 2003) allows to capture the waveform of the backscattered pulse. Terrestrial applications often show a distribution of objects that range from several meters in the foreground to hundreds of meters in the background. Multiple reflections of objects within the beam corridor can be extracted from the full waveform data (Reitberger et al., 2006; Ullrich and Reichert, 2005). Assuming that waveforms are sampled with sufficiently high frequency, techniques of digital signal processing can be applied. Comparing the transmitted and the received waveform by the cross-correlation function can improve the range estimation. The argument of the maximum of the cross-correlation function estimates the range value with higher reliability and accuracy than the amplitude of the received waveform alone. Details of the improvement can be found in Hofton and Blair (2002), Jutzi and Stilla (2005), and Thiel et al. (2005). Different surfaces have to be analyzed for different applications. For example for urban objects, it is relevant to deal with objects at different elevations (Brenner et al., 2001). In rural environments it is relevant to deal with randomly distributed natural objects (Reitberger et al., 2006). The impact of the scene on the received waveform has been discussed using standard examples (Jutzi et al., 2002; Wagner et al., 2004; Jutzi and Stilla, 2006). The focus of this work is reconstruction of man-made objects that can be approximated by planar surfaces. A robust algorithm for finding planes can be implemented using the RANSAC algorithm (Fischler and Bolles, 1981; Hartley and Zisserman, 2000). Many authors have used RANSAC in 2D for extraction of low parameterized transformation models (Workshop 25 Years of RANSAC, 2006). However RANSAC can also be used to fit planes (Brenner et al., 2001) or cylinders (Beder and Förstner, 2006) to 3D point clouds. While Brenner et al. (2001) exploit well defined neighborhood relations to bound the search region for plane primitives, Beder and Förstner (2006) make no assumptions about the search regions. In both approaches a coarse estimation of the variance of
point distances to the fitted model is required. The membership of points to the fitted model is determined by a constant threshold. Local distortions of the point distribution caused by details of the object that are not represented by the model at the given level of detail (LOD) lead to exclusion of the points related to these details. This disadvantage can be overcome by an improved RANSAC based on automatic threshold detection. Speaking precisely, it is possible to capture all points of the object even when the current model does not fit the corresponding point cloud exactly at the current LOD. In this paper, we propose a method for iterative knowledge-based processing of terrestrial laser scanning data by full waveform analysis. The approach that we present improves the point density and accuracy of the range values. Furthermore, it allows closing gaps in partly occluded surface regions by knowledge-based search. In Section 2, the surface response of a planar surface with slope is derived. Using this surface response, the corresponding range value is estimated and the correlation function is analyzed. The derived property values are used in a pre-segmentation to separate locally planar surface elements from partly penetrable objects. The result of this process is used to extract surface primitives by RANSAC with automatic threshold selection. In an iterative processing step the slope of the surface is used to improve the accuracy of the range value by increasing the cross-correlation between the received waveform and the expected surface response. This procedure requires the assumption that the surface is locally planar. In Section 3, outdoor experiments with an experimental laser scanning system capturing an urban scene are described. The results of the iterative processing are presented in Section 4. In Section 5 the improvement of the point clouds and the pre-segmentation are discussed. Section 6 completes the contribution with a summary of the advantages of our approach and directions of further research. 2. Methods The following section describes the whole processing chain. An overview of the processing chain is depicted in Fig. 1. First we introduce the concept of estimating the surface response in Section 2.1. The general interaction between a surface and a laser pulse is analyzed. The output of the analysis is the reduction of the 3D surface characteristic to a 1D range dependent signal. In the following this signal will be called the surface response. A theoretical derivation of the surface response as a function of the slope of a planar surface is given in Section 2.2. In Section 2.3, we introduce a
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
101
2.5. Section 2.6 addresses the problem of missing knowledge about the distribution of the 3D points located on a surface primitive. Section 2.7 closes the processing loop and gives initial estimates of the surface response for the matched filter in the next iteration loop. 2.1. Estimating the surface response The received waveform of a laser pulse depends on the transmitted waveform s(t), the impulse response h(t) of the receiver unit, the spatial beam profile of the used laser P(x,y), and the illuminated surface S(x,y,z). Specifically the received waveform r(x,y,z,t) can be expressed by a convolution of these terms, rð x;y;z;t Þ ¼ sðt Þ⁎hðt Þ⁎Pð x;yÞ⁎S ð x;y;zÞ;
ð1Þ
where (⁎) denotes the convolution operation. The impulse response consists mainly of the receiving properties of the photodiode and amplifier that are used. The spatial beam profile typically takes the shape of a Gaussian or uniform distribution and the surface characteristics can be described by the geometry of the surface and reflectance properties (mixture of diffuse and specular). We assume a receiver unit consisting of an ideal photodiode and amplifier with an infinite bandwidth and a linear frequency characteristic. The 3D surface characteristic can thus be reduced to the rangedependent 1D surface response S(z). An analytical description of the surface response for a sloped plane surface is derived in the following section. 2.2. Plane surface with slope
Fig. 1. Flowchart of the processing procedure.
matched filter approach that is used to estimate the range value and additional features of the response. A method for pre-segmenting partly penetrable objects is described in Section 2.4. The result is a set of 3D points that are expected to be located on one surface of a building, e.g. facades or roofs. Here we assume that these surfaces are locally planar. A method for detecting these surface primitives using RANSAC is presented in Section
We assume the beam profile of our laser to be uniform (top-hat form), because measurements of the beam profile in the near field have shown that a uniform distribution fits our data best (Jutzi and Stilla, 2006). This is in contrast to the commonly used Gaussian distribution for the beam profile. The divergence of the beam is denoted by Θ. The range from the origin to the idealized plane surface A is denoted by r0A. The angle between the normal Y n of the surface and the optical axis at the centre of the beam is φ (Fig. 2b). We use a Cartesian coordinate system for which the z-axis points in the direction of propagation of the beam and x is chosen such that Y n coincides with the x–z plane. This implies that small variations in the y direction have negligible influence on the range value z. The general form of the beam cone is H 2 x2 þ y2 V z tan : ð2Þ 2
102
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
of the surface response S(t). If the range to the surface is large compared to the beam diameter, then the surface response is symmetric. 2.3. Estimating the range value by processing the correlation coefficient
Fig. 2. Schematic description of the projected beam (footprint) at a plane surface. a) Oblique view, b) side view.
The illuminated surface area can be parameterized by Inequality (2) combined with x sinu þ z cos u ¼ r0A cos u:
ð3Þ
Eq. (3) represents the orientation of the surface (Fig. 2). With x ¼ ðr0A zÞcotu
ð4Þ 2 and Inequality (2) y is bounded byymin ¼ ztan H2 2 ððr0A zÞcotuÞ2 Þ0:5 and ymax ¼ ztan H2 ððr0A zÞcotuÞ2 Þ0:5 . The surface response S(z) is directly related to the area of the illuminated surface at depth z. The choice of the coordinate system ensures that the range value z is independent of y. Therefore the surface response is given by SðzÞ ¼ ymax ymin s ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi H 2 ¼2 z tan ððr0A zÞcotuÞ2 : 2
ð5Þ
As can be seen in Fig. 2a the limits for the range z are given by the equation ymin = ymax = 0. It follows that the range z is only valid within the range interval " # r0A cot u r0A cot u za½zmin ; zmax ¼ ; : cot u þ tan H2 cot u tan H2 ð6Þ Substituting z = ct / 2, where c is the speed of light and t is the travel time, we obtain a temporal description
First the transmitted waveform s(t) and the received waveform r(t) have to be measured with the receiver unit of the laser system (Fig. 1(2) and (3)). Then, by the use of the transmitted waveform s(t) and the modeled surface response S(t) (Fig. 1(1)) of the known surface for each position on the surface, the estimated received waveform ˆr (t) (Fig. 1(3)) is calculated by a convolution to derive the matched filter. If the surface is unknown, a sub-optimal matched filter based on the transmitted waveform s(t) has to be used (dotted line). If the surface is known, the estimated received waveform ˆr (t) is compared with the measured waveform r(t) by determining the normalized cross-correlation function k(τ). The maximum coefficient of the normalized crosscorrelation function yields accurate range (Fig. 1(3)). 2.3.1. Matched filter approach The data analysis starts with the detection of the backscattered pulses in the temporal signal. This signal is usually disturbed by various noise components: background radiation, amplifier noise, photo detector noise, etc. Detecting the received waveform of the backscattered pulse in noisy data and extracting the associated travel time is a well-known problem and is discussed in detail in radar techniques (Skolnik, 1980) and system theory (Papoulis, 1984). In radar techniques, these problems are tackled by a matched filter. To improve the range accuracy and the signal-to-noise ratio (SNR), the matched filter for the waveform of the backscattered pulse has to be determined. In practice, it is difficult to determine the optimal matched filter. In cases where no optimal matched filter is available, sub-optimal filters may be used at the cost of decreasing the SNR. In the simplest case, the received waveform is the uniformly attenuated (isotropic attenuation by reflection or transmission of the pulse) copy of the transmitted waveform. In this case, the transmitted waveform of the emitted pulse is the best choice for the matched filter. This is assumed in the first iteration step, where the surface is unknown. In practice, the received waveform is influenced by the interaction between the laser beam and the illuminated surface. If the surface is known by prior knowledge or rough estimation from previous iteration, the estimated received waveform can be determined (Section 2.1). This estimated
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
103
2.3.2. Processing the range value The output signal Rsr with improved SNR is analyzed by searching for the local maximum to determine the travel time of the pulse. By using the correlation signal to calculate the travel time t, higher accuracy is reached than by operating on the waveform alone (Jutzi and Stilla, 2005). Instead of using a single value of the waveform, the shape of the waveform can be used to increase the accuracy. This is because the specific pulse properties (e.g., asymmetric shape, intensity fluctuations) are taken into account and so less time-varying influence (jitter) is expected on range estimation. Fig. 3. Example for the automatic threshold detection. a) Smoothed histogram function: ghist(x_), b) first derivation of the smoothed histogram function: ghist′( _x), c) normalized histogram function fhist( _x), and d) mixture fit function f (x_).
received waveform can be used to improve the matched filter. To be precise, the estimated received waveform is the best matched filter if the surface properties are exactly known. Let us assume that the noise components of the system mentioned above are sufficiently described by white noise with a constant factor N. Let the signal energy of the pulse be E. The maximum SNR occurs if the signal and the filter match. In this case, the associated travel time t of the delayed pulse is τ and the SNR is described by
2.4. Pre-segmentation Pre-segmentation (Fig. 1(4)) is used to compute an initial cue for partially penetrable or partially illuminated objects and impenetrable surfaces. These surfaces are the focus of interest in this contribution. Partially penetrable objects like vegetation spread the received waveform (Wagner et al., 2004). It is straightforward to show that this spreading propagates to the width W of the cross-correlation function. Since energy is conserved, the spreading also leads to decreased signal amplitude. It is again straightforward that the decreased signal amplitude leads to a decreased maximum I of the
2E : ð7Þ N An interesting fact of this postulate is that the maximum of the instantaneous SNR depends only on the signal energy of the emitted pulse and the noise. Generally, if the surface is unknown, the matched filter is computed by the normalized cross-correlation function Rsr between the transmitted waveform s(t) of the emitted pulse and the received waveform r(t) of the backscattered pulse (Fig. 1(2), dotted line). Assuming zero-mean waveforms, we obtain the output signal Rsr with a local maximum Isr at the delay time τ via Z l sðt Þ rðt þ sÞdt t¼l ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi: Rsr ðsÞ ¼ rZ ð8Þ Z l l s2 ðt Þdt r2 ðt Þdt SNRðsÞ ¼
t¼l
t¼l
If the surface is known, the estimated received waveform ˆr (t) is substituted for the transmitted waveform s(t) and the normalized cross-correlation function Rrˆ r is calculated in an analogous manner. The local maximum of Rrˆ r is denoted by Irˆ r .
Fig. 4. Transmitted waveform. a) Single example, b) adaptive overlay of 500 waveforms.
104
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
cross-correlation function. Thus a decreased crosscorrelation maximum I and increased width W of the cross-correlation function will indicate partially penetrable objects. From these two measures we can compose the feature F ¼ ð1 I ÞW :
ð9Þ
The new feature F has the physical interpretation of lost energy. An energy difference occurs if a surface is only partially illuminated or the object is partially penetrable. Using a single threshold we use the new feature F to separate points located on impenetrable surfaces from partially illuminated or partially penetrable objects. This threshold is derived manually from the histogram of the feature F. As it will be shown in the next section it is not important to detect every 3D point located on the building's surface by the pre-segmentation with a single
threshold. It is more important to exclude the vegetation from further processing to reduce ambiguities, because, for example, a virtual plane could be fitted through a large subset of 3d-points located on aligned trees. However this plane could have no physical reality and would therefore belong to no surface primitive. 2.5. Extraction of geometric primitives The segmentation of Section 2.4 produces a set of 3D points that are expected to be located on a surface primitive of a building (Fig. 1(4)). From this set we generate hypotheses for possible planes in the scene (Fig. 1(5)) based on the random sample consensus algorithm RANSAC (Fischler and Bolles, 1981). RANSAC is an architecture for hypothesis generation and testing. Hypotheses are generated from a minimal set of
Fig. 5. a) Photo of the test SCENE_A with different urban objects, b) complete 3D point cloud of SCENE_A (APR), c) photo of SCENE_B, d) complete 3D point cloud of SCENE_B.
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
correspondences. At least three 3D points are needed to compute an unique plane. The hypotheses are computed from a homogeneous representation of the 3D points Y xi by the formula: Y Y Y T Y nY x1 x xi B ; ¼ 0 ; with B ¼ ; 2 N d 1 1 1 ð10Þ Y n or ¼ argmin ð jj Bx jj Þ: d jxj¼1 The second formulation has to be used in the over determined case. Here {xY1 , Y x2 , …, Y xi } can be any arbitrary subset of the 3D point cloud with cardinality i N 2. Following the exact RANSAC idea, we use i = 3 for the hypothesis generation combined with least squares optimization (guided matching). The optimization is computed from the subset of 3D points that are initially layered as inliers to the current hypothesis. The T vector [nY d ]T is the hypothesis of the plane. The first three components nY describe the orientation of the plane while d represents the distance to the origin if Y n is a unit vector. With this parameterization, the geometric distance between the plane and a 3D point Y xi is given T T YT d ][ Y by |[n xi 1]T|. Eq. (10) for [nY d]T is linear and can be solved directly using singular value decomposition. Suitable hypotheses are then evaluated by the number of inliers in the dataset. An initial threshold is set (0.5 m) and each point is initially called an inlier if the geometric distance between the plane and the point is smaller than this threshold. The hypothesis with the largest number of inliers is our solution. The procedure of generating and testing is stopped if the probability that at least one hypothesis was computed only from inliers is greater than p = 95%. The number NH of needed hypotheses is computed from the formula (Hartley and Zisserman, 2000): NH ¼
logð1 pÞ
3 : finlier log 1 fpoints
ð11Þ
Here #inlier denotes the size of the currently largest inlier set and #points the size of the 3D point set. The Table 1 SCENE_A: Surfaces areas and distances between surfaces and the sensor SCENE_A SCENE_A SCENE_A SCENE_A FACADE1 ROOF1 FACADE2 ROOF2 Estimated area of 232 the surface [m2] Distance to center 115 of gravity [m]
221
132
60
118
110
112
105
Table 2 SCENE_B: Surfaces areas and distances between surfaces and the sensor SCENE_B surface number 1–4 2
Estimated area of the surface [m ] Distance to center of gravity [m]
1
2
3
4
59 113
42 92
143 58
42 98
Table 3 SCENE_B: Surfaces areas and distances between surfaces and the sensor SCENE_B surface number 5–8 2
Estimated area of the surface [m ] Distance to center of gravity [m]
5
6
7
8
134 63
129 59
93 63
16 93
finlier is the current estimate for the probability number fpoints that a 3D point lies on the largest remaining plane.
2.6. Automatic threshold selection The computation of the planar surface primitives (Fig. 1(5)) is based on an initial threshold (0.5 m) for the distance of points to the plane. The threshold can be chosen arbitrarily between the accuracy of the range values and the extension of the smallest plane to detect. This is sufficient to determine a suitable hypothesis for the plane but now we have to decide which of the points are parts of the plane and which are not. This is achieved by an automatic threshold detection algorithm (Fig. 1 (6)). We assume that the distribution of the distances ρ between the estimated plane and the 3D points in the interval [0; ρmax] is a mixture composed of a narrow Gaussian with mean μ = 0 for the inliers and a uniform distribution for the outliers. Here ρmax is an arbitrary limit for the range of interest greater than 2 m. The function f ðqÞ ¼ aeðbÞ þ c for qa½0; q max q 2
ð12Þ
is fitted to the histogram function fhist( ρ) by nonlinear optimization using the Levenberg–Marquardt algorithm (Marquardt, 1963). The function fhist(ρ) is constructed from piecewise R linear interpolation of the histogramR, normalized to ℝ fhist ðqÞdq ¼ 1 to be independent of the number of bins. The residuals f( ρ) − fhist(ρ) are weighted
q by the term 3 qmax representing the increasing uncertainty of our assumption with increasing range to the plane. Then the resulting equation fcost ðq;a;b;cÞ ¼
2 NB qi 2 P aeð b Þ þ c fhist ðqi Þ 3 q qi is minimized, where ρi max i¼1
are the bins of the histogram and NB is the number of bins. Let ghist( ρ) be the convolution of fhist(ρ) with a Gaussian
106
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
Fig. 6. Oblique view of the pre-segmentation results: Partly penetrable objects (left side) and 3D points corresponding to planar surfaces (right side). a) and b) APR, c) and d) MAY, e) and f) OCT, g) and h) NOV.
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
Fig. 7. Pre-segmentation result of the points labeled impervious without the main plane of the scene (APR).
function with mean μ =0 cm and a standard deviation of σ = 0.1 cm. Then the optimization is initialized by a0 ¼ max ðf fhist ðqÞjxa½0; qmax gÞ; Aghist ðqÞ b0 ¼ arg max ; and Aq qa½0;qmax Z qmax fhist ðqÞ dq: c0 ¼ q¼0m qmax
ð13Þ
A typical example for the automatic threshold detection is depicted in Fig. 3. The solid curve shows
107
the piecewise linear approximation of the histogram fhist( ρ), the smoothed histogram function ghist ( ρ) is depicted by a dashed curve, the dash–dot curve shows the first derivative of ghist( ρ) and the resulting mixture fit function f (ρ) is drawn as a dotted curve. The histogram is computed with 1000 bins. Because of the normalization used for the histogram, the result f(ρ) is independent of the number of bins but depends on quantization effects. These effects are suppressed by low pass filter, namely the convolution with the Gaussian function above. 2.7. Computation of the relative surface normal and the iterative loop The surface primitives and the corresponding point sets computed by RANSAC combined with the automatic threshold detection give us an initial guess for the scene geometry (Fig. 1 (5) and (6)). Since high level modeling methods require dense point clouds, we use this initial cue to improve our results. The convex hull of a point set is increased within the surface primitive to estimate the search region for new points in the scene. The process is similar to a morphological dilation within the plane in 3D. The range value z0 of a beam hitting the
Fig. 8. Examples of four different surface primitives. a) SCENE_A{FAÇADE}, b) SCENE_A{ROOF}, c) SCENE_A façade of the second building, d) SCENE_A roof of the second building.
108
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
geometric primitive within the increased convex hull is computed by the formula z0 ¼ d= Y vTY n:
ð14Þ
Again the vector [nY d] represents the geometric primitive as described in Section 2.5 and Y v is the normalized direction of the laser beam. T
T
The computed angle between the normal vector of the surface and the laser beam is used to calculate the expected surface response (Fig. 1 (7) and (1)). This surface response is used to increase the cross correlation function between the expected and the measured waveform. Since we have already computed the direction vector of the laser beam Y v , the angle
Fig. 9. Complete set of surface primitives detected in the SCENE_A (APR). a) Before iteration, b) after iteration. Complete set of surface primitives detected in the SCENE_A (MAY). c) Before iteration, d) after iteration. The complete set of evaluated surface primitives in SCENE_B. e) Before iteration, f) after iteration.
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
between the laser beam and the surface normal nY is given by T w u ¼ arc cos Y v nY : ð15Þ This can be used with Eq. (5) and z0 from Eq. (14) to compute the resulting surface response sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi w H 2 w 2 S z; u ¼ 2 ðz0 zÞcot u : ð16Þ z tan 2 w The estimated surface response S(z,u ) can now be used as prior knowledge in the matched filter approach described in Section 2.3.1.
3. Experiments 3.1. Laser scanning system An experimental setup that allows capturing the waveform was built to explore the capabilities of the laser scanning system described in Section 2. The laser scanning system has three main components: an emitter unit, a motion control unit, and a receiver unit. We use a laser pulse system with a pulse duration of 5 ns at full-width-at-half-maximum (FWHM) and a high repetition rate (42 kHz). The peak power of the laser is up to 10 kW. The multi-mode erbium fiber laser operates at a wavelength of 1550 nm with a beam divergence of 1 mrad. The system uses a photodiode to pump the multi-mode fiber cavity and a fiber amplifier. The transmitted waveform shows strong random modulation for each emitted pulse. In Fig. 4 a single example of the transmitted waveform and an adaptive overlay of 500 transmitted waveforms is depicted to make the strong random modulation clear. For the 2D scanning process, a moving mirror is used for an elevation scan with ±15 degrees in vertical direction (320 raster steps) and a rotation stage for an azimuth scan with 360 degree rotation in the horizontal direction (variable number and spacing of the raster steps). For the experiments, we selected 60° (600 raster
109
steps) for equiangular spacing of approximately 0.1° in vertical and horizontal direction. For our investigations we use a receiver with a bandwidth of 250 MHz containing a PIN photodiode sensitive at wavelengths of around 900 to 1700 nm. Furthermore, we use an A/D converter with 20 GSamples/s. The A/D conversion and digital recording is accomplished by using a digital memory oscilloscope (Le Croy – Wavemaster 8600A), where the bandwidth of the oscilloscope is limited to 6 GHz. 3.2. Data acquisition We used four datasets of SCENE_A (Fig. 5a, b) and one dataset of SCENE_B (Fig. 5 c, d) to assess the algorithms and to demonstrate the performance of our methodology. The datasets were taken in April (APR), May (MAY), October (OCT), and November (NOV) for SCENE_A and in February for SCENE_B. The sensor platform were placed at a height of 15 m for SCENE_A and 23 m for SCENE_B. Objects in the scene are buildings, streets, vehicles, parking lots, meadow, bushes, and trees. The scenes do not include coniferous trees but do include deciduous trees (birches and chestnuts). Some objects are partly occluded and the materials have different reflectance properties. Since the evaluation would not be reliable on small surface patches (Fig. 10), we focus on two large surfaces in SCENE_A: SCENE_A{ROOF} and SCENE_A {FAÇADE}. The area size of the evaluated surfaces and their distances (center of gravity) to the sensor are depicted in Tables 1, 2 and 3. The pre-segmentation is influenced by the foliage of the trees. Fig. 6 shows an oblique view of the partial penetrable 3D point cloud (left side) and the presegmentation results (right side) of all SCENE_A datasets. The evaluation shows that pre-segmentation is much easier if the trees are leafy. This was expected since leafy trees spread the waveform. For clarity the pre-segmentation result of the APR scene is shown without the main scene plane in Fig. 7. The detected
Table 4 Results of the evaluation methods for the surface primitive SCENE_A{FAÇADE} SCENE_A{FAÇADE} w between surface and beam Median angle u
μ mean of the improvements Ui r standard deviation of the improvements Ui Probability P(l N 0) (Eq. (17)) # points on plane without iteration # points on plane after iteration
APR
MAY
OCT
NOV
10.4° [10–3] 0.752 [10–3] 0.629 90% 4038 6013
8.7° [10–3] 0.637 [10–3] 0.479 90% 1830 3641
8.3° [10–3] 0.537 [10–3] 0.384 92% 3207 4398
8.5° [10–3] 0.705 [10–3] 0.583 89% 4102 4588
110
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
Table 5 Results of the evaluation methods for the surface primitive SCENE_A {ROOF} SCENE_A {ROOF} w Median angle u between surface and beam l mean of the improvements Ui r standard deviation of the improvements Ui Probability P(l N 0) (Eq. (17)) # points on plane without iteration # points on plane after iteration
Apr
May
Oct
Nov
37.2°
40.2°
40.55°
40.3°
correlation value computed in the matched filter by using the estimated surface normal. This evaluation also tests our theory of the surface response. If the surface response is modeled correctly, the use of the surface normal should increase the cross-correlation value between the estimated received waveform and the measured received waveform. The evaluation is conducted using the 3D points located on the detected planes. If the estimation is not improved, the “improvements” Φi = Irˆ r − IST will be normally distributed with zero mean. The improvements Φi are defined by the local maxima of the cross-correlation functions described in Section 2.3.1. The first measure is the mean μ of the improvements Φi for each surface. We now assume that the improvements Φi are normally distributed and we compute their standard deviation σ. Combining μ and σ into the cumulative probability density function Φμσ for the normal distribution, we can compute the probability of the hypothesis that the improvements Φi have a positive mean μ N 0 by
[10–3] 6.9 [10–3] 9.9 [10–3] 7.4 [10–3] 6.6 [10–3] 4.2 [10–3] 3.2 [10–3] 3.9 [10–3] 3.5
94.5%
99%
97%
98%
4701
4615
5565
2814
5543
5259
6037
4025
surface primitives SCENE_A{ROOF} and SCENE_A {FAÇADE} of the two buildings are shown in Fig. 8. Beside SCENE_A, a second urban scene was processed. SCENE_B is more complex, with many occlusions and objects of various sizes with dimensions of 15–400 m.
Z PðlN0Þ ¼ 1 Ulr ð0Þ ¼ 1
2
ð xlÞ 1 2 pffiffiffiffiffiffi e 2r dx r 2p l
0
ð17Þ 4. Results The second measure is the power of the 3D point set (number of elements) located on the estimated planes. Tables 4–6 show the power of the 3D point sets before and after iteration. The additional points were not detected in the initial processing step.
We evaluate our results quantitatively with the evaluation methods described in Section 4.1. A visual impression of the results is presented in Fig. 9. All 3D points located on the evaluated surfaces and the main scene plane are shown before and after the iteration procedure. The ground surface is only shown for visualization purposes. In Section 4.3, we focus on the influence of the surface response on the maximum of the cross-correlation function.
4.2. Numerical results Tables 4–6 are computed with the method described in Section 4.1. In SCENE_A we focus on the surfaces SCENE_A{FAÇADE} and SCENE_A{ROOF}. They provide reliable results because of their large size. The dependence of the reliability of the results on the size can be seen in Fig. 10 and is further analyzed in Section 4.3. Table 6 shows the results for eight
4.1. Evaluation methods We evaluate our results using two different measures. The first measure quantifies the increase of the cross-
Table 6 Results of the evaluation methods for the SCENE_B SCENE_B
1
2
3
4
5
6
7
8
w Median angle u l mean r standard deviation Probability P(l N 0) # without iteration # after iteration
6.8° [10–3] 0.40 [10–3] 0.17 99% 682 889
7.8° [10–3] 0.36 [10–3] 0.12 99.8% 1306 1494
15.9° [10–3] 0.49 [10–3] 0.22 98% 10912 13502
16.3° [10–3] 0.96 [10–3] 0.48 98% 1114 1390
27.8° [10–3] 1.3 [10–3] 0.9 93% 7839 9976
35.8° [10–3] 1.8 [10–3] 1.2 92% 6364 7986
37.5° [10–3] 2.3 [10–3] 1.6 92% 4219 5342
38.4° [10–3] 4.1 [10–3] 3.2 90% 371 525
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
111
Fig. 10. Histograms of the cross-correlation improvements Φi for all evaluated surfaces of SCENE_B with the angles a) 7°, b) 8°, c) 16°, d) 16.5°, e) 28°, f) 36°, g) 37°, and h) 38°.
randomly chosen surfaces of SCENE_B. Tables 4–6 show the results of the evaluation methods described in Section 4.1. 4.3. Improvements depending on the surface response Fig. 10 shows the 8 histograms of the improvements Φi evaluated for the surfaces of SCENE_B. The improvements Φi are approximately normally distributed. Mean and standard deviation rise with increasing surface slope. Surfaces elements with a large area (Fig. 10 c, e, f, g) provide many 3D points that can be analyzed. Therefore the histogram is similar to a continuous distribution of the improvements Φi. In addition to this it is known that the standard deviation of the mean estimation decreases 1 with pffiffiffiffiffiffi where n is the number of measurements that n1 contribute to the mean.
Fig. 11. Histogram of the angles estimated by the surface responses (SCENE_A{ROOF}).
112
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
Fig. 12. Enlargement of the surface boundaries. a) Before iteration data was incomplete at interesting regions like windows, b) after iteration an improvement is visible with additional points at interesting regions.
To test the validity of the results we conducted another experiment: Each range value of SCENE_A {ROOF} taken in November was reprocessed with w varying assumed surface slope u in the interval [1°;90°] and the angle with the maximal cross-correlation was recorded. The results are shown in Fig. 11. The mean surface angle was about 40° as can be seen in Table 5. 5. Discussion The experiments show that the use of the estimated surface response from the surface normal can increase the response of the matched filter for feature extraction. They also show that the methodology still holds in real life scenarios with low SNR. Tables 4, 5, and 6 show that the mean μ of the improvements Φi in row 3 is w correlated with the surface slope u in row 2. But the standard deviation σ of the improvements Φi in row 4 is w correlated with the slope u as well. More precisely, if we can achieve larger improvements Φi by considering the w estimated slope u , then the variations of the improvements Φi increase (Fig. 10). As mentioned in Section 2, an increased surface slope u has the effect of spreading the waveform, which decreases the amplitude of the waveform due to conservation of the energy. Therefore the estimated range value becomes less accurate, independent of the detection method used, for increasing the surface slope. Nevertheless the algorithm proposed improves the detection rate as demonstrated in rows 6 and 7 of Tables 4–6. This results show that the surface slope is relevant for almost every application. For example, actual sensors like RIEGL LMS-Q560 have a maximum field of view of 60°. Even if we assuming further a planar scene, the range variation increases with the angle between the
direction of propagation of the beam and the surface normal and becomes at least 30°. In Section 4.1 we assumed that the improvements Φi can be approximated by a normal distribution. This was required to compute our evaluation measure, the probability P(μ N 0). All histograms in Fig. 10 are obviously close to a normal distribution which matches our assumption. Due to noise and numerical effects, negative improvements Φi can be observed. The set of points associated with a surface primitive can be increased between by 10% and 100%. A huge number of the additional 3D points are located near the boundaries of the surface, which are the edges of the object (Fig. 12). If the surfaces are curved, the 3D points in these regions describe the edges more accurately than the intersections of the computed planes. Again, we would like to point out that we see our contribution as an input for high-level modeling which requires as few gaps as possible. For example, a method based on 3D morphological operations, such as that of Gorte and Pfeifer (2004), would benefit from additional points in partly occluded regions. Fig. 9 shows the point sets belonging to the detected planar surface primitives. Beside these the remaining points, which were initially detected, might correspond to partially penetrable objects like vegetation (e.g., foliage of trees) but could although correspond to manmade objects (e.g., fences or street lamps). The segmentation of vegetation and man-made objects is required for many applications in urban environments (Maas and Vosselman, 1999). The application of fullwaveform laser scanning systems in forestry environments has been demonstrated (Wynne and Nelson, 2006). In addition to laser scanning data, additional data like close infrared thermal images are typically used (Haala and Brenner, 1999) for segmentation.
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
The comparison of the pre-segmentation results of an identical scene at different times (Fig. 9) shows that the denser vegetation in summer makes the pre-segmentation more accurate. This might be caused by the foliage which leads to a spreading of the received waveform. This effect can be seen in Fig. 6c and d. For the same reason, we could detect only a few 3D points occluded by trees in summer. The effect is visible in Fig. 9c and d when compared with Fig. 6c. The trees become almost impervious in leaf-on situations. No reflections can be measured from the occluded regions. 6. Conclusion The iterative processing of full waveform laser data increases the set of 3D points associated with each surface. Additional 3D points can be detected in partially occluded and partly illuminated regions. A rough presegmentation can be achieved by a single threshold using a combination of the width and the amplitude of the cross correlation function. The use of prior knowledge of the surface normal improves the range and correlation value. Vice versa the surface response can be used to compute the surface normal without prior knowledge but at the cost of high computational effort. Local plane fitting with RANSAC is a useful method to compute this surface normal. The method presented requires full waveform laser data but is not restricted to terrestrial or airborne laser scanning systems. Additional work could be done on analyzing the improvements of the cross-correlation maximum Φi depending on the surface slope and reflectance properties. The next step for a more detailed description of the scene is to fit quadrics (e.g., cylinders, spheres, ellipsoids, hyperboloids) to the point clouds. Note here that planes are degenerate versions of a quadric and are therefore also described by the same mathematical model. This means that planes and curved surfaces can be determined using the same model without detection of planarity or curvature. Further, the performance of the RANSAC approach can be improved by using sets of points in the hypothesis generation step that are neighbored in a general way, e.g. the distances between pairs of points in the generation set is bounded. References Baltsavias, E.P., 1999. Airborne laser scanning: existing systems and firms and other resources. ISPRS Journal of Photogrammetry and Remote Sensing 54 (2–3), 164–198. Beder, C., Förstner, W., 2006. Direct solutions for computing cylinders from minimal sets of 3D points. In: Leonardis, A., Bischof, H., Pinz, A. (Eds.), Computer Vision— ECCV 2006: 9th European
113
Conference on Computer Vision, Part I. Lecture Notes in Computer Science, vol. 3951. Springer, pp. 135–146. Brenner, C., Haala, N., Fritsch, D., 2001. Towards fully automated 3D city model generation. In: Baltsavias, E., Grün, A., Van Gool, L. (Eds.), Automatic Extraction of Man-Made Objects from Aerial and Space Images (III). Balkema Publishers, Rotterdam, pp. 47–57. Fischler, M.A., Bolles, R.C., 1981. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communication of the Association for Computing Machinery 24 (6), 381–395. Gorte, B., Pfeifer, N., 2004. Structuring laser-scanned trees using 3D mathematical morphology. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 35 (Part B5), 929–933. Haala, N., Brenner, C., 1999. Extraction of buildings and trees in urban environments. ISPRS Journal of Photogrammetry and Remote Sensing 54 (2–3), 130–137. Hartley, R., Zisserman, A., 2000. Multiple View Geometry. Cambridge University Press, Cambridge, UK. Hofton, M.A., Blair, J.B., 2002. Laser altimeter return pulse correlation: A method for detecting surface topographic change. Journal of Geodynamics 34 (3), 491–502. Huising, E.J., Gomes Pereira, L.M., 1998. Errors and accuracy estimates of laser data acquired by various laser scanning systems for topographic applications. ISPRS Journal of Photogrammetry and Remote Sensing 53 (5), 245–261. Jutzi, B., Eberle, B., Stilla, U., 2002. Estimation and measurement of backscattered signals from pulsed laser radar. In: Serpico, S.B. (Ed.), Image and Signal Processing for Remote Sensing VIII. SPIE Proceedings, vol. 4885, pp. 256–267. Jutzi, B., Stilla, U., 2005. Measuring and processing the waveform of laser pulses. In: Grün, A., Kahmen, H. (Eds.), Optical 3-D Measurement Techniques VII, vol. I, pp. 194–203. Jutzi, B., Stilla, U., 2006. Range determination with waveform recording laser systems using a Wiener Filter. ISPRS Journal of Photogrammetry and Remote Sensing 61 (2), 95–107. Lovell, J.L., Jupp, D.L.B., Culvenor, D.S., Coops, N.C., 2003. Using airborne and ground based ranging Lidar to measure canopy structure in Australian forests. Canadian Journal of Remote Sensing 29 (5), 607–622. Maas, H.-G., 2000. Least-squares matching with airborne laserscanning data in a TIN structure. International Archives of Photogrammetry and Remote Sensing 33 (Part 3A), 548–555. Maas, H.-G., Vosselman, G., 1999. Two algorithms for extracting building models from raw laser altimetry data. ISPRS Journal of Photogrammetry and Remote Sensing 54 (2–3), 153–163. Marquardt, D., 1963. An algorithm for least-squares estimation of nonlinear parameters. Journal of the Society for Industrial and Applied Mathematics 11 (2), 431–441. Papoulis, A., 1984. Probability, Random Variables, and Stochastic Processes. McGraw-Hill, Tokyo. Reitberger, J., Krzystek, P., Stilla, U., 2006. Analysis of full waveform LIDAR data for tree species classification. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 36 (Part 3), 228–233. Skolnik, M.I., 1980. Introduction to radar systems, Second Edition. McGraw-Hill International Editions, New York. Thiel, K.H., Wehr, A., Hug, C., 2005. A new algorithm for processing fullwave laser scanner data. EARSeL 3D—Remote Sensing Workshop. (on CD-ROM). Ullrich, A., Reichert, R., 2005. High resolution laser scanner with waveform digitization for subsequent full waveform analysis. In:
114
M. Kirchhof et al. / ISPRS Journal of Photogrammetry & Remote Sensing 63 (2008) 99–114
Kamerman, G.W. (Ed.), Laser Radar Technology and Applications X. SPIE Proceedings, vol. 5791, pp. 82–88. Wagner, W., Ullrich, A., Melzer, T., Briese, C., Kraus, K., 2004. From single-pulse to full-waveform airborne laser scanners: potential and practical challenges. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 35 (Part B3), 201–206. Wehr, A., Lohr, U., 1999. Airborne laser scanning— an introduction and overview. ISPRS Journal of Photogrammetry and Remote Sensing 54 (2–3), 68–82.
Workshop 25 Years RANSAC, (conjunction with CVPR'06), 2006, ISBN 0-7695-2646-2, IEEE Computer Society, Los Alamitos USA. Wynne, R.H., Nelson, R.F. (Eds.), 2006. SilviScan special issue— Lidar applications in forest assessment and inventory. Photogrammetric Engineering and Remote Sensing, vol. 72 (12).