ISPRS Journal of Photogrammetry and Remote Sensing 101 (2015) 262–274
Automatic registration of UAV-borne sequent images and LiDAR data

Bisheng Yang, Chi Chen

State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430079, China
Article history: Received 13 May 2014; Received in revised form 25 November 2014; Accepted 29 December 2014

Keywords: Unmanned aerial vehicles mapping; LiDAR data; Sequent images; Registration; Linear features; Multi-view stereo
Abstract

Use of direct geo-referencing data leads to registration failure between sequent images and LiDAR data captured by mini-UAV platforms because of low-cost sensors. This paper therefore proposes a novel automatic registration method for sequent images and LiDAR data captured by mini-UAVs. First, the proposed method extracts building outlines from LiDAR data and images and estimates the exterior orientation parameters (EoPs) of the images with building objects in the LiDAR data coordinate framework based on corresponding corner points derived indirectly by using linear features. Second, the EoPs of the sequent images in the image coordinate framework are recovered using a structure from motion (SfM) technique, and the transformation matrices between the LiDAR coordinate and image coordinate frameworks are calculated using corresponding EoPs, resulting in a coarse registration between the images and the LiDAR data. Finally, 3D points are generated from sequent images by multi-view stereo (MVS) algorithms. Then the EoPs of the sequent images are further refined by registering the LiDAR data and the 3D points using an iterative closest-point (ICP) algorithm with the initial results from coarse registration, resulting in a fine registration between sequent images and LiDAR data. Experiments were performed to check the validity and effectiveness of the proposed method. The results show that the proposed method achieves high-precision robust co-registration of sequent images and LiDAR data captured by mini-UAVs.

© 2015 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.
1. Introduction

Airborne laser scanning (ALS) and aerial photogrammetry systems have become key tools for gathering information on Earth surfaces. The information derived from ALS data and imagery enables a wide variety of applications, including DEM generation (Chen et al., 2013; Zhang and Lin, 2013), building extraction and 3D reconstruction (Barnea and Filin, 2013; Yang et al., 2013), defoliation investigation (Solberg et al., 2006), and forest biomass inventory (Zhao et al., 2009). However, the spatial and temporal resolutions of standard remote acquisition systems (e.g., ALS, aerial photogrammetry) are limited because of the inflexibility and high costs of flying platforms. Unmanned aerial vehicles (UAVs) are showing great potential for observing and monitoring the Earth and provide a faster and easier solution because they can reach an area of interest in a short time. UAVs are small, flexible, and light aerial platforms equipped with multiple sensors (e.g., laser scanners, cameras)
depending on different applications such as fine-scale mapping (Lin et al., 2011) and biomass change detection (Jaakkola et al., 2010). UAVs provide a new way to complement standard remote acquisition systems in both spatial and temporal resolution because they can acquire information from low altitude. Mini-UAVs equipped with GPS, IMU, lightweight laser scanners, and consumer cameras present several interesting characteristics, including flexibility, high resolution, efficiency, and potential for customization, and have led to promising mapping solutions for many applications in recent years. Several mini-UAV-based applications and studies (e.g., terrain mapping, vegetation and tree mapping) have been reported (Jaakkola et al., 2010; Johnson and Danis, 2006; Lin et al., 2011; Nagai et al., 2009; Wallace et al., 2012).

Light detection and ranging (LiDAR) and aerial photogrammetry are based on different sensor technologies, and each has its own unique features. LiDAR point clouds provide accurate 3D surface information in the scattered-point data modality inherited from the range-detection principle. Aerial photogrammetry, on the other hand, recovers the 3D surface information through stereo vision while supplying semantic and texture information directly in the form of spectral imagery. Integration of optical imagery and LiDAR point clouds benefits from mutual compensation in both geometric
and spectral domains. Using both systems to acquire accurate geometric and semantic information simultaneously is preferable to using either of them alone. Extensive studies have investigated fusing ALS data and imagery captured by manned vehicles to improve the accuracy, robustness, and level of automation of classification (Gerke and Xiao, 2014; Singh et al., 2012), as well as segmentation (Barnea and Filin, 2013), building extraction (Awrangjeb et al., 2013; Li et al., 2013; Rottensteiner and Briese, 2003), and change detection (Qin and Gruen, 2014).

Mini-UAV-based laser-scanning and image-gathering systems have unique characteristics: low-altitude flying pattern, light weight, small selection of onboard sensors because of limited payload (generally less than 20 kg), and rapid consumption of battery power. Hence, a tradeoff must often be made between the accuracy and weight of positioning and orientation sensors (Wallace et al., 2012). Moreover, consumer cameras (Nagai et al., 2009) or even smartphones (Kim et al., 2013) instead of aerial surveying cameras are usually the choice for mini-UAVs. It is difficult to obtain highly accurate registration between laser-scanning points and imagery using hardware synchronization and bore-sight calibration because of unknown camera exposure delay, system calibration errors, and insufficient quality control of GPS and IMU observations (Skaloud, 2006), resulting in misalignments between the geo-referenced optical and laser datasets. Co-registration errors in multi-sensor data have a dramatic impact on fusion results. The potential of multi-sensor data fusion can be fully exploited only when the registration error achieves the single-pixel level (Habib and Schenk, 1999). As a prerequisite for fusion of optical imagery and laser point clouds, registration of the two datasets poses great challenges and is attracting more and more attention.

Registration between LiDAR data and optical imagery captured by manned vehicles or satellites has been studied over the last decade (e.g., Habib et al., 2005; Mastin et al., 2009; Mitishita et al., 2008; Parmehr et al., 2014). However, less attention has been paid to the registration of LiDAR data and optical imagery captured by mini-UAVs because use of mini-UAVs for laser scanning is still a new research field. On the other hand, use of direct geo-referencing data with existing registration methods for ALS data and imagery may result in registration failure because of the tradeoff between sensor accuracy and weight in mini-UAVs. Moreover, the low flying altitude of mini-UAVs and the relatively high-resolution imagery produced magnify any registration deviation, which may reach tens or hundreds of pixels, leading to difficulties in achieving accurate registration between mini-UAV LiDAR data and imagery.

This paper therefore proposes a robust coarse-to-fine method to register LiDAR data and imagery captured by mini-UAVs. First, the proposed method extracts building outlines from laser-scanning points, then extracts the corresponding building outlines from the images (defined as key images) with the help of direct geo-referencing data to estimate the exterior orientation parameters (EoPs) of the images in the LiDAR reference frame with the conjugate building outlines. Second, structure from motion (SfM) techniques are used to estimate the EoPs of each image in the photogrammetric coordinate system.
The transformation between the photogrammetric coordinate system and the LiDAR reference frame is solved using the key-image EoPs in both reference frames according to a voting procedure, leading to EoP estimates of the non-key images in the LiDAR reference frame. Hence, coarse registration between LiDAR points and images is achieved. In the final step, 3D dense points are generated from sequent images by the multi-view stereo (MVS) method. The EoP of each image is further refined by registering the LiDAR points and the 3D points by means of an iterative closest point (ICP) algorithm that uses the coarse registration results as the initial transformation. The main contributions of the proposed method are that it overcomes the shortcomings of geo-referencing data for registration
and corrects large deviations between LiDAR data and images captured by mini-UAVs by means of a three-dimensional registration refinement, resulting in robust and accurate registration.

The remainder of the paper is organized as follows: following the introduction, related literature is reviewed in Section 2. Section 3 elaborates the key components of the proposed method. Experimental studies on registering mini-UAV-based LiDAR data and imagery are presented in Section 4, and conclusions are drawn in the final section.

2. Literature review

Extensive studies have been carried out into registration between airborne and terrestrial laser-scanning points and satellite images. Existing methods for registration of imagery and laser-scanning points can generally be divided into two categories: area-based and feature-based (Palenichka and Zaremba, 2010).

2.1. Area-based registration methods

Area-based methods optimize the EoPs of the optical imagery by maximizing the statistical or grayscale similarities of the corresponding image areas from the LiDAR data and the optical imagery. Usually, digital surface models (DSMs) and return-pulse intensity images (or reflectance images) are generated from LiDAR data by interpolating the height coordinates or return-pulse intensities. With these two kinds of LiDAR images, the three- to two-dimensional registration problem degenerates into a two-dimensional image registration problem. Conjugate area matching is the key step in this kind of approach. Local or global area similarity is usually measured by mutual information, which exploits the statistical dependencies between the data to be registered and derives similarity measurements. Because of its nonlinear joint probability characteristics, mutual information has been widely adopted as a registration technique for heterogeneous data (Le Moigne et al., 2011; Suri and Reinartz, 2010). Mastin et al. (2009) registered aerial imagery onto LiDAR point clouds by maximizing the mutual information between the grayscale-encoded height or return-pulse intensity and optical imagery. Parmehr et al. (2014) used complementary information in the LiDAR DSM and LiDAR intensity data to improve registration accuracy and robustness using combined mutual-information techniques.

However, in the case of LiDAR data and imagery from mini-UAVs, using area-based methods may lead to registration failure. The flying altitudes of mini-UAVs can be as low as dozens of meters because of the limited power of onboard laser scanners, leading to dramatically varying values of individual laser scan range. These dramatically varying values cause intensity inconsistency within the same object in the absence of return-pulse intensity calibration (because the return-pulse reflectance value is a reciprocal biquadratic function of laser scan range (Wagner et al., 2006)), thus affecting the mutual-information similarity measurements. Although calibration of manned airborne LiDAR sensor intensity has been well studied (Kaasalainen et al., 2005, 2009; Roncat et al., 2014; Wagner et al., 2006), calibration of mini-UAV-borne LiDAR sensor intensity has not yet been reported to the authors' knowledge. On the other hand, the low-altitude mini-UAV data-acquisition procedure leads to much denser point clouds than those from manned LiDAR systems (Lin et al., 2011).
The density of the captured points reaches dozens of points per square meter, which reveals the potential of such systems for preserving fine features that cannot be gathered by manned LiDAR systems (Jaakkola et al., 2010; Nagai et al., 2009; Wallace et al., 2012). These denser point clouds offer a potential way to register the imagery with the LiDAR data using features preserved in the scattered point clouds.
2.2. Feature-based registration methods

Feature-based methods carry out the registration task by extracting features from imagery and LiDAR data to establish correspondences for camera pose estimation. Various types of scalar variables or geometrical primitives can be used to form the conjugate features. A comprehensive review of feature-based registration methods has been given by Rönnholm (2011). According to Rönnholm and Haggrén (2012), registration accuracy using artificial features is higher than that using natural ones (e.g., a tree canopy). Hence, the scope of this work has been restricted to registration using artificial features.

Exact feature points (e.g., nodes, corners) do not exist in LiDAR data because of its discrete sampling nature. However, describing a feature point by its supporting area information has proved to be a feasible approach for matching conjugate feature points from LiDAR data and imagery. Palenichka and Zaremba (2010) incorporated intensity descriptors, area shape characteristics, and coordinates for automated extraction of control points from LiDAR data and from imagery of both natural landscapes and structured scenes with man-made objects. Local scale-invariant image features such as the Scale-Invariant Feature Transform (SIFT) (Lowe, 2004) are also widely used to solve heterogeneous-data registration problems. Böhm and Becker (2007) used the SIFT feature detector to match corresponding feature points between reflectance images and imagery. Linear features fitted from the discrete sampling points around edges in LiDAR data are also used to register LiDAR data and imagery. Habib et al. (2005) presented a co-registration framework for LiDAR data and imagery using line segment pairs. Related studies also include registration based on connected line segments (Wang and Neumann, 2009), vanishing points derived from parallel line segments (Liu and Stamos, 2012), and planar features (e.g., roofs) (Mitishita et al., 2008). In particular, Zhao et al. (2005) used a dense 3D reconstruction technique to perform registration by transforming image-to-points (2D–3D) registration into points-to-points (3D–3D) registration, which can be solved using an ICP algorithm. However, automatic initialization of ICP algorithms is a non-trivial issue (Rusinkiewicz and Levoy, 2001).

Generally speaking, area-based registration methods perform the registration task using the statistical dependence (e.g., mutual information) between image and LiDAR data. Complex procedures including feature extraction and matching, which are error-prone, are omitted, which leads to more robust alignment results (Parmehr et al., 2014). However, area-based methods rely heavily on the quality and correctness of the intensity image, which is determined by the effectiveness of intensity calibration, and return-pulse intensity calibration for mini-UAV-borne LiDAR systems is still an open research issue. Feature-based registration methods aim to extract and match physical geometric primitives within the scene to align both data sets. Finding conjugate features in LiDAR data and imagery is a complex task, and mismatched features may reduce the accuracy and robustness of the registration results. To achieve accurate and robust registration between LiDAR data and imagery captured by mini-UAVs, this paper proposes a two-step registration method, including a coarse step to achieve good initialization of the image EoPs and a fine step to optimize the EoPs by refining the 3D–3D point alignment.
3. Methodology

The proposed method aims to register LiDAR data and image sequences captured by mini-UAVs in an automatic and robust manner. Three key components are encompassed in the proposed method: (1) extracting building outlines from LiDAR data and key images (images with building objects) and establishing
correspondences to estimate the key-image EoPs in the LiDAR reference frame; (2) recovering the EoPs of the sequent images in the photogrammetric coordinate system using the SfM technique and estimating the transformation between the LiDAR reference frame and the photogrammetric coordinate system by means of the key-image EoPs using a voting procedure to achieve a coarse registration; and (3) refining the coarse registration by registering the 3D points generated from the sequent images to the LiDAR data by means of an ICP algorithm using the coarse registration result as the initial transformation. Fig. 1 illustrates the framework of the proposed method. The key components of this method are described below.

3.1. Building outline extraction and matching

Building outlines have distinguishable features in LiDAR data and imagery which provide good hints for finding corresponding features. Many methods can be used to extract building outlines from LiDAR data alone or by fusion with auxiliary data (e.g., imagery, vector maps) (Awrangjeb et al., 2013; Filin and Pfeifer, 2006; Niemeyer et al., 2014; Rottensteiner and Briese, 2003; Xu et al., 2014). The method proposed by Yang et al. (2013) has been used here to extract building outlines from LiDAR data because of its robustness. The extracted building outlines are further converted into rectangle-based polygons using the recursive minimum bounding rectangle (RMBR) algorithm (Kwak, 2013), generating rich structure lines and corners which are used to find corresponding features in the imagery. Fig. 2 shows an example of building outline extraction and RMBR regularization from LiDAR data.

The extracted building outlines are back-projected onto all the images using the direct geo-referencing data from the onboard position and orientation system (POS), as illustrated in Fig. 3-1. If a back-projected building outline falls fully or partly within the image frame, the image is selected as a key image; otherwise it is a non-key image. The projected building outlines provide good prior knowledge to guide saliency-based building outline extraction from the corresponding imagery. However, significant position and orientation deviations may occur when extracting the outlines from the imagery because of camera synchronization limitations, system calibration errors, low-cost sensors, and low flying altitudes (Fig. 3-2). Once an extracted building outline has been back-projected onto the imagery, the approximate corresponding area in the imagery is determined (as illustrated in Fig. 3-2, red polygon R1). A buffer area around the building outline is then generated to search for the building area in the imagery, thus overcoming deviations in system calibration. The radius of the buffer area usually ranges from 30 to 50 pixels (in Fig. 3-4, the transparent area is the buffer area). To determine the accurate building area, the color structure tensor statistics of the buffer area are calculated. For a multi-channel image f = (f^1, f^2, ..., f^n)^T, the structure tensor G is given by van de Weijer et al. (2006) as:
G = \begin{pmatrix} \overline{f_x f_x} & \overline{f_x f_y} \\ \overline{f_y f_x} & \overline{f_y f_y} \end{pmatrix} \qquad (1)
where the subscripts indicate spatial derivatives, the bar (‾) indicates convolution with a Gaussian filter, and f_x, f_y represent the gradients in the horizontal and vertical directions. The structure tensor describes the local differential structure of images and is well suited to finding features such as edges and corners. In the case of color UAV imagery, f = (R, G, B)^T. With the aid of the spatial derivatives, the first two eigenvalues of the tensor G can be calculated as:
\lambda_1 = \frac{1}{2}\left(\overline{f_x f_x} + \overline{f_y f_y} + \sqrt{\left(\overline{f_x f_x} - \overline{f_y f_y}\right)^2 + \left(2\,\overline{f_x f_y}\right)^2}\right) \qquad (2)

\lambda_2 = \frac{1}{2}\left(\overline{f_x f_x} + \overline{f_y f_y} - \sqrt{\left(\overline{f_x f_x} - \overline{f_y f_y}\right)^2 + \left(2\,\overline{f_x f_y}\right)^2}\right) \qquad (3)

and the most prominent local gradient orientation can be calculated as:

\theta = \frac{1}{2}\arctan\left(\frac{2\,\overline{f_x f_y}}{\overline{f_x f_x} - \overline{f_y f_y}}\right) \qquad (4)

Fig. 1. Framework for registering mini-UAV sequent images and LiDAR data.

Fig. 2. Building outline extraction and regularization from mini-UAV-borne LiDAR data. (1) LiDAR data captured by a mini-UAV. (2) Building object extraction. (3) Building outline regularization.
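Equations (1)–(4) map directly onto array operations. The following is a minimal sketch, assuming an H × W × 3 float image and SciPy's Gaussian filter as the smoothing (bar) operator; the function name and parameter values are illustrative, not from the paper:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def color_structure_tensor(img, sigma=2.0):
    """Color structure tensor (Eq. 1), its eigenvalues (Eqs. 2-3), and the
    most prominent local gradient orientation (Eq. 4) for an HxWx3 image."""
    # Channel-wise spatial derivatives f_x, f_y
    fx = np.stack([np.gradient(img[..., c], axis=1) for c in range(3)], axis=-1)
    fy = np.stack([np.gradient(img[..., c], axis=0) for c in range(3)], axis=-1)
    # Tensor entries: inner product over channels, then Gaussian convolution (the bar)
    gxx = gaussian_filter((fx * fx).sum(-1), sigma)
    gyy = gaussian_filter((fy * fy).sum(-1), sigma)
    gxy = gaussian_filter((fx * fy).sum(-1), sigma)
    # Eigenvalues of the 2x2 tensor (Eqs. 2 and 3)
    root = np.sqrt((gxx - gyy) ** 2 + (2 * gxy) ** 2)
    lam1 = 0.5 * (gxx + gyy + root)
    lam2 = 0.5 * (gxx + gyy - root)
    # Dominant orientation (Eq. 4); arctan2 resolves the quadrant
    theta = 0.5 * np.arctan2(2 * gxy, gxx - gyy)
    return lam1, lam2, theta
```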
Using Eqs. (1)–(4), the tensor gradient magnitude and orientation of each pixel in the buffer area are calculated: each pixel's tensor gradient magnitude is obtained by applying non-maximum suppression to λ₁ along the prominent direction θ. The histogram of the most prominent local gradient orientations in the buffer area is also calculated,
as illustrated in Fig. 4. It has been found that the histograms of the tensor gradient orientation of a rectangular building area and an L-shaped building area have one peak and two peaks respectively (Fig. 4-1 and 4-2); the peak indicates the direction of the long edge of the building area. All the building outlines extracted from the LiDAR data are then back-projected onto the corresponding images, and the histogram statistics of tensor gradient orientation for the corresponding areas are calculated. If the tensor gradient orientation histogram of a building area does not fit the rectangular or L-shaped distributions described above, the extracted building outline and its corresponding image are filtered out and not used for finding corresponding outlines; a sketch of this histogram test is given below.
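One compact way to implement the peak test is to build an orientation histogram over the buffer area and count its significant peaks. The sketch below is a plausible implementation under stated assumptions (magnitude-weighted histogram, SciPy's find_peaks, hand-picked thresholds), not the paper's exact procedure:

```python
import numpy as np
from scipy.signal import find_peaks

def classify_building_area(theta, lam1, mask, n_bins=36):
    """Classify a buffer area as rectangular (one dominant orientation peak)
    or L-shaped (two peaks) from its tensor gradient orientation histogram."""
    # Histogram of orientations inside the buffer mask, weighted by gradient
    # strength (the weighting is an assumption, not stated in the paper)
    hist, _ = np.histogram(theta[mask], bins=n_bins,
                           range=(-np.pi / 2, np.pi / 2), weights=lam1[mask])
    # Peaks must clearly stand out; thresholds here are illustrative guesses
    peaks, _ = find_peaks(hist, height=0.5 * hist.max(), distance=n_bins // 8)
    if len(peaks) == 1:
        return "rectangular"
    if len(peaks) == 2:
        return "L-shaped"
    return "rejected"  # outline and image are filtered out, as described above
```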
Fig. 3. Extraction of building outlines from imagery. (3-1) Back-projection of a building outline onto the image. (3-2) Back-projected outline (R1) from LiDAR data (red polygon). (3-3) Approximate building outline (R2) in imagery (green dotted polygon). (3-4) Optimal building outline area (R3) in imagery (blue polygon).
Fig. 4. Histogram statistics of tensor gradient orientation of buildings. (4-1) Rectangular building. (4-2) L-shaped building.
The unfiltered building outlines are rotated to the orientation indicated by the peak; the rotated polygon R2 is illustrated in Fig. 3-3. The rotated building outline is then moved along its long axis and its orthogonal direction, pixel by pixel, within a local area. An optimal area is obtained when the sum of tensor gradient magnitudes inside the area reaches a maximum (denoted by R3), as illustrated in Fig. 3-4; a brute-force search of this kind is sketched below.
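The local search can be written as a double loop over pixel offsets. A minimal sketch, assuming a boolean mask of the rotated outline rendered at image resolution; searching the full 2D window is a simplification of moving along the long axis and its orthogonal direction, and wrap-around at image borders is ignored for brevity:

```python
import numpy as np

def search_optimal_outline(grad_mag, outline_mask, radius=50):
    """Slide the rotated building-outline mask pixel by pixel over a local
    window; keep the offset maximizing the sum of tensor gradient magnitudes
    inside the outline (the optimal area R3)."""
    best_score, best_offset = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(np.roll(outline_mask, dy, axis=0), dx, axis=1)
            score = grad_mag[shifted].sum()  # magnitudes enclosed by the outline
            if score > best_score:
                best_score, best_offset = score, (dy, dx)
    return best_offset
```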
Once a building area has been determined in a local area on the image, the building object shows high global contrast inside that image area (Fig. 5-1). The global contrast-based saliency cut (RCC) method proposed by Cheng et al. (2011) is then used to segment the building outline from its surroundings in the local area (Fig. 5-2), because saliency cut-based methods are good at detecting the most prominent image patch globally. Thus the contour and the regularized outline of the detected building are obtained, as illustrated in Fig. 5-3 and 5-4. The building outlines extracted from the key images and the LiDAR data are used to build the transformation from the photogrammetric coordinate system to the LiDAR reference frame,
resulting in a coarse registration between the images and the LiDAR data.

3.2. Coarse registration using matched building outlines

Using the method described above, the building outlines from the imagery and the LiDAR data are extracted and matched. Let one corner point of the building outline from one key image be m = (u, v, f)^T and the corresponding point in the building outline from the LiDAR data be M_las = (X, Y, Z)^T. The transformation between the corresponding points can be written as:
s_{pnp}\, m = A\,[R_{pnp} \mid t_{pnp}]\, M_{las} \qquad (5)
where A and s_pnp are the calibration matrix and scale factor, and R_pnp and t_pnp are the rotation and translation matrices. To solve Eq. (5), the efficient perspective-n-point (EPnP) algorithm (Lepetit et al., 2009) is used. In addition, co-planarity and collinearity constraints can be incorporated to optimize the EPnP results using the linear-feature-based registration method proposed by Habib et al. (2005). In this way, the key-image EoPs in the LiDAR reference frame are calculated.
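For reference, EPnP is available in common libraries; a minimal sketch of solving Eq. (5) with OpenCV follows (the helper name is illustrative, and the image points are assumed already undistorted):

```python
import numpy as np
import cv2

def key_image_eops(M_las, m, A):
    """Estimate a key image's EoPs in the LiDAR frame from 2D-3D corner
    correspondences (Eq. 5) using the EPnP solver of Lepetit et al. (2009).
    M_las: Nx3 LiDAR-frame corners; m: Nx2 image corners; A: 3x3 calibration."""
    ok, rvec, tvec = cv2.solvePnP(M_las.astype(np.float64),
                                  m.astype(np.float64), A, None,
                                  flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        raise RuntimeError("EPnP failed")
    R_pnp, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
    return R_pnp, tvec              # EoPs of the key image in C_w
```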
However, the EoPs of non-key images in the LiDAR reference frame still remain to be obtained. To estimate the EoPs of non-key images in the LiDAR reference frame (C_w), the EoPs of the non-key images in the photogrammetric coordinate system (C_MVS) are first calculated using the structure from motion (SfM) technique. Then the transformation between C_w and C_MVS is solved using the corresponding key-image EoPs in each reference frame. Let a 3D point in C_MVS be M_MVS and its 2D projection in an image be m; the collinearity constraint can then be written as:

s_{bundle}\, m = A\,[R_{bundle} \mid t_{bundle}]\, M_{MVS} \qquad (6)
where s_bundle is the scale factor and R_bundle, t_bundle are the rotation and translation matrices in C_MVS, which are solved using the incremental sparse bundle adjustment method proposed by Snavely et al. (2008). Hence, registration between the LiDAR data and the sequent images can be performed by solving the transformation between the two coordinate reference frames (C_MVS and C_w), which can be written as:
M_{las} = k\,R\,M_{MVS} + T \qquad (7)
In light of Eqs. (5)–(7), the rotation matrix R, the scale factor k, and the translation matrix T can be written as:
R = R_{pnp}^{T}\, R_{bundle} \qquad (8)

k = \sqrt{\frac{\sum_{i=1}^{n}\left(X_i^2 + Y_i^2 + Z_i^2\right)}{\sum_{i=1}^{n}\left(x_i^2 + y_i^2 + z_i^2\right)}} \qquad (9)

T = P_{pnp} - k\,R\,P_{bundle} \qquad (10)
where n is the number of key images, and P_bundle = (x_i, y_i, z_i)^T and P_pnp = (X_i, Y_i, Z_i)^T are the centralized coordinates of the camera positions in the coordinate frameworks C_MVS and C_w, calculated by SfM and from the corresponding building outlines respectively.

It should be understood that each key image in the coarse registration corresponds to one R_pnp and one t_pnp, generating one rotation matrix R according to Eq. (8). To obtain an accurate rotation matrix R, each R is decomposed into three rotation angles (φ, ω, κ). Because minor changes in the rotation angles can lead to obvious translation errors, R should be determined first. This is done by calculating the norm of any two groups of rotation angles, ν = √(Δφ² + Δω² + Δκ²), and clustering it using the k-means algorithm (Lee and Lee, 2013). The optimum rotation angles correspond to the largest cluster according to the voting principle (Fernandes and Oliveira, 2008); hence, the center of the largest cluster is taken as the correct set of rotation angles. The calculated rotation angles are then used to recalculate the translation parameters, and the norm of any two groups of translation parameters, d = √(ΔT_x² + ΔT_y² + ΔT_z²), is calculated and clustered using the k-means algorithm (Lee and Lee, 2013). The translation parameters are determined by the same principle used for the rotation parameters. Hence, the EoPs of non-key images in the LiDAR data coordinate framework are determined using the transformation (k, R, T), resulting in a coarse registration between the LiDAR data and the sequent images. However, the coarse registration needs to be further refined because of imprecise matching points caused by the discrete sampling nature of the LiDAR data and occlusions in the scene. A sketch of the voting procedure is given below.
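The voting can be implemented by clustering the per-key-image parameter triplets and taking the centre of the most populated cluster. The sketch below clusters the triplets directly with k-means rather than the pairwise norms described above, which is a simplification; the cluster count and the library choice (scikit-learn) are assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

def vote_for_parameters(samples, n_clusters=3):
    """Voting principle: k-means cluster the per-key-image triplets, e.g. the
    rotation angles (phi, omega, kappa) obtained by decomposing each
    R = R_pnp^T R_bundle (Eq. 8), and return the centre of the largest cluster."""
    km = KMeans(n_clusters=min(n_clusters, len(samples)), n_init=10).fit(samples)
    labels, counts = np.unique(km.labels_, return_counts=True)
    return km.cluster_centers_[labels[np.argmax(counts)]]

# Usage: vote for the rotation angles first, recompute each key image's
# translation with the winning rotation (Eq. 10), then vote for (Tx, Ty, Tz).
```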
Fig. 5. Extracting building outlines from imagery using prior knowledge from LiDAR data. (5-1) Detected area. (5-2) Saliency cut segmentation. (5-3) Contour extraction. (5-4) Regularization.
Fig. 6. Rotor-wing mini-UAV with a laser scanner, POS, and digital camera.
Table 1. Sensor specifications.

Laser scanner: Riegl LMS-Q160 scanner
POS: NovAtel Span
Digital camera: Canon 5D Mark II with 24 mm lens (calibrated)

Table 2. Dataset description.

                               Site 1      Site 2
Flight altitude (m)            80          130
Point density (pts/m²)         25.4        10.2
Resolution of image (cm)       2.1         3.5
Number of images               78          343
Number of LiDAR points (pts)   6,506,721   8,893,549
Area (km²)                     0.26        0.87
3.3. Fine registration between LiDAR data and sequent images

The coarse registration determines the EoPs of the sequent images and the transformation parameters between the LiDAR reference frame (C_w) and the photogrammetric coordinate system (C_MVS). To refine this coarse registration, dense 3D points are first generated from the sequent images using the Daisy MVS algorithm (Tola et al., 2010). Refinement is achieved by a variant of standard ICP which minimizes the distances between the generated MVS points (I_PC = {a_i}) and their corresponding nearest points in the LiDAR data (L_PC = {b_i}) to find the optimal transformation parameters T(k_{3d-3d}, R_{3d-3d}, T_{3d-3d}):
T = \arg\min_{T}\left\{\sum_{i} w_i \left\| g_i \cdot \left(T a_i - b_i\right)\right\|^2\right\} \qquad (11)
where g_i is the surface normal at b_i and w_i is a weight. Through the coarse registration procedure, a coarse transformation between the two reference frames has been obtained, providing a good initialization for the ICP algorithm and overcoming the problem of determining an initialization automatically. Once the initial transformation is determined, the ICP algorithm iterates two key steps: (1) find correspondences between the two point clouds; and (2) minimize the distance between the matched pairs to update the transformation.
Fig. 7. Aerial ortho-images and LiDAR data captured by a mini-UAV. (7-1) Data for site 1. (7-2) Data for site 2.
Fig. 8. Building outline extraction and regularization from LiDAR data. (8-1) LiDAR data. (8-2) Building point extraction. (8-3) Building outline regularization.
Fig. 9. Extracting building outlines from images using prior knowledge from LiDAR data. (9-1) Building outline from LiDAR data. (9-2) Segmented building patch. (9-3) Regularized outline.
These two steps are repeated iteratively until convergence, yielding the desired transformation. In the matching step, a k-d tree is used to accelerate the closest-point search. Because ICP is sensitive to outliers (e.g., points with no match) that may prevent convergence (Rusinkiewicz and Levoy, 2001), the relative motion threshold (RMT) rejection technique proposed by Pomerleau et al. (2010) is used to identify such outliers based on the error produced by paired points instead of a distance measurement. In the minimization step, the point-to-plane error function (Eq. (11)), originally proposed by Chen and Medioni (1991), is minimized with constant weights to obtain the optimal transformation. Hence, fine registration is achieved by applying this variant of the standard ICP algorithm, and the optimal transformation parameters (k_{3d-3d}, R_{3d-3d}, T_{3d-3d}) are calculated; a simplified sketch of the refinement loop is given below.
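The refinement loop can be sketched as follows. This is a simplified rigid point-to-plane ICP, assuming constant weights, a plain distance threshold in place of the RMT rejection, no scale factor, and precomputed LiDAR normals; the threshold and iteration count are illustrative:

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_point_to_plane(src, dst, dst_normals, T_init, iters=30, reject=0.5):
    """Refine a 4x4 transformation aligning MVS points (src = {a_i}) to LiDAR
    points (dst = {b_i}) by minimizing point-to-plane residuals (Eq. 11)."""
    T = T_init.copy()
    tree = cKDTree(dst)                      # k-d tree accelerates matching
    for _ in range(iters):
        p = src @ T[:3, :3].T + T[:3, 3]     # transform source points
        d, idx = tree.query(p)               # closest-point correspondences
        keep = d < reject                    # crude outlier rejection (not RMT)
        q, n = dst[idx[keep]], dst_normals[idx[keep]]
        e = ((p[keep] - q) * n).sum(axis=1)  # signed point-to-plane residuals
        # Linearize about the current pose: unknowns (rx, ry, rz, tx, ty, tz)
        J = np.hstack([np.cross(p[keep], n), n])
        x, *_ = np.linalg.lstsq(J, -e, rcond=None)
        rx, ry, rz = x[:3]
        dR = np.array([[1, -rz, ry],         # small-angle rotation update
                       [rz, 1, -rx],
                       [-ry, rx, 1]])
        dT = np.eye(4); dT[:3, :3] = dR; dT[:3, 3] = x[3:]
        T = dT @ T
        if np.linalg.norm(x) < 1e-8:         # converged
            break
    return T
```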
Based on the refined parameters (k_{3d-3d}, R_{3d-3d}, T_{3d-3d}), the EoPs (R_cam, T_cam) of each image can be calculated as:
\begin{pmatrix} R_{cam} \\ T_{cam} \end{pmatrix} = \begin{pmatrix} R_{3d\text{-}3d}\, R_{bundle} \\ k_{3d\text{-}3d}\, R_{3d\text{-}3d}\, T_{bundle} + T_{3d\text{-}3d} \end{pmatrix} \qquad (12)
The EoPs of all images are finally used to accomplish registration between the sequent images and the LiDAR data.

4. Experiments and analysis

Experiments were undertaken to check the validity and effectiveness of the proposed method using two datasets captured by a mini-UAV (Fig. 6). The sensor specifications of the mini-UAV are listed in Table 1. Fig. 7 contains snapshots of the two sites, and Table 2 lists the dataset description for the two sites.
Fig. 10. Coarse-to-fine registration results for sites 1 and 2. (10-1) Registration based on direct geo-referencing data. (10-2) Coarse registration based on corresponding outlines. (10-3) Fine registration based on ICP refinement.
4.1. Registration of imagery to LiDAR data

Fig. 8 illustrates the building outlines extracted from the LiDAR data for sites 1 and 2, showing good linear features and corner points for finding the corresponding features in the images. The extracted building outlines are back-projected onto images using direct geo-referencing data from the POS of the mini-UAV (as illustrated in Fig. 9-1). The overlapping results show that large
deviations occurred in the corresponding features in the LiDAR data and the image. However, the corresponding building outlines have been well extracted from the images based on prior knowledge from the LiDAR data. The extracted building outlines from the LiDAR data and the images provide good linear features and corner points for estimating the image EoPs. Fig. 10-1 illustrates a registration failure based on direct geo-referencing data, indicating that vibration of the mini-UAV,
low flying altitude, and limited camera synchronization and bore-sight calibration led to failure of the registration task. To correct the failed registration between the LiDAR data and the images (the first row in Fig. 10), the building outlines extracted from the LiDAR data and the images are used to calculate the key-image EoPs by means of the proposed method, resulting in a coarse registration between the LiDAR data and the images. The LiDAR points are back-projected onto the corresponding image to show the registration based on direct geo-referencing data (Fig. 10-1), which shows large deviations between the LiDAR data and the images. Compared with the registration based on direct geo-referencing data, the coarse registration based on corresponding building outlines has eliminated these large deviations, demonstrating that the image EoPs have been refined by matching the corresponding building outlines in the LiDAR data and the images. However, misalignments still occur in areas with occlusion or inaccurate outline extraction, as illustrated in Fig. 10-2. Fig. 10-3 illustrates details of the fine registration. The misalignments in the XY plane have been eliminated, showing that the image EoPs have been further improved by registering 3D points generated by the Daisy MVS algorithm (Tola et al., 2010) to the LiDAR data using an ICP algorithm initialized with the coarse registration. Hence, fine registration has been achieved.

The dense 3D points generated from the sequent images by the Daisy MVS algorithm were overlapped with the LiDAR data using the transformation parameters from the coarse and fine registrations, as illustrated in Fig. 11. Visual inspection determined that the horizontal and vertical misalignments had been well corrected in the fine registration, demonstrating that the proposed method achieves good registration quality between
mini-UAV-borne LiDAR data and sequent images. This experiment has also proved that the proposed method improves the accuracy of the EoPs of each sequent image, resulting in good registration between mini-UAV-borne LiDAR data and the dense 3D points generated from the sequent images.

Because of the low flying altitude, UAV imagery contains many occluded areas. To obtain the correct color of each associated laser-scanning point, occlusions are detected by the direct visibility of point sets algorithm (Katz et al., 2007) using the refined image EoPs from the fine registration. Fig. 12 shows a 3D perspective view of the colorized mini-UAV-borne LiDAR data for site 2. Visual inspection shows that the building outlines are colored correctly, which demonstrates the good registration quality of the proposed method. The registered mini-UAV-borne LiDAR data and images provide additional knowledge for many applications such as classification, segmentation, and ortho-image generation. The registration results can also be used to eliminate the effects of building height on ortho-image generation. Fig. 13 shows a true ortho-image generated using the registered LiDAR data and images. A detailed comparison demonstrates that the building outline has been rectified and that the effect of building height on the ortho-image has been eliminated.

4.2. Evaluation of the registration accuracy of the proposed method

To quantify the accuracy of the registration results from the proposed method, many artificial targets with high reflectance were mounted at different locations in the sites before the mini-UAV gathered the data (Fig. 14).
Fig. 11. Visual inspection of the registration between mini-UAV-borne LiDAR data (false color) and MVS points (true color) before and after the proposed ICP refinement. (11-1) Details of coarse registration. (11-2) Details of fine registration.
Fig. 12. Coloring mini-UAV-borne LiDAR data using registered images.
Fig. 13. Generating an ortho-image using registered data.
Fig. 14. High-reflectance targets in mini-UAV-borne LiDAR data and images. (14-1) Targets in LiDAR data. (14-2) Targets in image.
Fig. 15. Distribution of residuals of artificial target registration errors for site 1 (left) and site 2 (right).
Table 3. Registration errors of datasets in sites 1 and 2 (in pixels).

Registration method       Max error           Min error          Average error      RMSE
                          Site 1    Site 2    Site 1   Site 2    Site 1   Site 2    Site 1   Site 2
Direct geo-referencing    149.634   137.424   46.637   37.847    91.353   86.972    32.770   35.320
Coarse registration       32.518    31.283    2.637    3.784     13.886   15.390    8.531    7.398
Fine registration         1.637     1.283     0.008    0.462     0.619    0.834     0.493    0.219
Because the artificial targets had high reflectance, they were easily identified and measured in both the mini-UAV-borne LiDAR data and the sequent images. The pixel offsets between the registered LiDAR data and the images were measured to evaluate the registration errors of direct geo-referencing-based registration, coarse registration, and fine registration. Table 3 lists the registration errors of the datasets for sites 1 and 2, demonstrating that the proposed method has greatly improved the accuracy of the registration. Fig. 15 shows the distribution of the registration errors of the high-reflectance targets; the horizontal axis represents the target index, and the vertical axis the registration pixel offset. Most corresponding points had offsets of about 80 pixels before registration, and the pixel offset was reduced to about one pixel after fine registration (see Table 3).
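For one target, the offset can be computed by projecting the target's LiDAR coordinates with the estimated EoPs and comparing against the pixel position measured in the image. A minimal sketch (the function name is illustrative; Eq. (5) gives the projection model):

```python
import numpy as np

def target_pixel_offset(M_target, uv_measured, A, R, t):
    """Registration error at one high-reflectance target, in pixels."""
    x = A @ (R @ M_target + t)      # project LiDAR coordinates (cf. Eq. 5)
    uv_projected = x[:2] / x[2]     # perspective division
    return np.linalg.norm(uv_projected - uv_measured)
```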
5. Conclusions

Automated registration between mini-UAV-borne sequent images and LiDAR data is critically important for high-accuracy UAV mapping. Direct geo-referencing data-based registration is incapable of performing the task because of low flying altitudes, low-cost sensors, vibration of mini-UAV platforms, and limitations of system synchronization and calibration. After preliminary work, an automated method was proposed for accurate registration of mini-UAV-borne sequent images and LiDAR data. The method performs the registration task by means of coarse and fine registration. Coarse registration resolves the registration between mini-UAV-borne sequent images and LiDAR data using corresponding corner points of roofs derived indirectly by using linear features, thus determining image EoPs. Fine registration serves to refine the EoPs by registering 3D points generated from sequent images and mini-UAV-borne LiDAR data. Finally, good registration between mini-UAV-borne sequent images and LiDAR data is achieved. Experiments were conducted to evaluate the effectiveness and accuracy of the proposed method and demonstrated that the proposed method registers mini-UAV-borne sequent images and LiDAR data successfully with a registration error of less than one pixel. Colorized 3D point clouds and true ortho-images were then generated using the registered data to present promising applications of the proposed registration method. In the near future, multiple geometric features (e.g., points, lines) will be investigated to perform registration of natural scenes where few buildings exist.

Acknowledgments

Work described in this paper is jointly supported by the National Science Foundation of China Project under Grant No. 41371431, the National Basic Research Program of China under Grant No. 2012CB725301, and the research program from the Ministry of Education of China under Grant No. 20120141110035.
References

Awrangjeb, M., Zhang, C., Fraser, C.S., 2013. Automatic extraction of building roofs using LIDAR data and multispectral imagery. ISPRS J. Photogramm. Remote Sensing 83, 1–18.
Barnea, S., Filin, S., 2013. Segmentation of terrestrial laser scanning data using geometry and image information. ISPRS J. Photogramm. Remote Sensing 76, 33–48.
Böhm, J., Becker, S., 2007. Automatic marker-free registration of terrestrial laser scans using reflectance features. In: 8th Conference on Optical 3D Measurement Techniques, Zurich, Switzerland, pp. 338–344.
Chen, Y., Medioni, G., 1991. Object modeling by registration of multiple range images. In: IEEE International Conference on Robotics and Automation, Sacramento, CA, 9–11 April, vol. 3, pp. 2724–2729.
Chen, C., Li, Y., Li, W., Dai, H., 2013. A multiresolution hierarchical classification algorithm for filtering airborne LiDAR data. ISPRS J. Photogramm. Remote Sensing 82, 1–9.
Cheng, M., Zhang, G., Mitra, N.J., Huang, X., Hu, S., 2011. Global contrast based salient region detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA, pp. 409–416.
Fernandes, L.A.F., Oliveira, M.M., 2008. Real-time line detection through an improved Hough transform voting scheme. Pattern Recognition 41 (1), 299–314.
Filin, S., Pfeifer, N., 2006. Segmentation of airborne laser scanning data using a slope adaptive neighborhood. ISPRS J. Photogramm. Remote Sensing 60 (2), 71–80.
Gerke, M., Xiao, J., 2014. Fusion of airborne laserscanning point clouds and images for supervised and unsupervised scene classification. ISPRS J. Photogramm. Remote Sensing 87, 78–92.
Habib, A., Schenk, T., 1999. A new approach for matching surfaces from laser scanners and optical scanners. Int. Arch. Photogramm. Remote Sensing 32 (3/W14), 55–61.
Habib, A., Ghanma, M., Morgan, M., Al-Ruzouq, R., 2005. Photogrammetric and LiDAR data registration using linear features. Photogramm. Eng. Remote Sensing 71 (6), 699–707.
Jaakkola, A., Hyyppä, J., Kukko, A., Yu, X., Kaartinen, H., Lehtomäki, M., Lin, Y., 2010. A low-cost multi-sensoral mobile mapping system and its feasibility for tree measurements. ISPRS J. Photogramm. Remote Sensing 65 (6), 514–522.
Johnson, P.B., Danis, M., 2006. Unmanned aerial vehicle as the platform for lightweight laser sensing to produce sub-meter accuracy terrain maps for less than $5/km². Report, Mechanical Engineering Department, Columbia University, 48 p.
Kaasalainen, S., Ahokas, E., Hyyppa, J., Suomalainen, J., 2005. Study of surface brightness from backscattered laser intensity: calibration of laser data. IEEE Geosci. Remote Sensing Lett. 2 (3), 255–259.
Kaasalainen, S., Hyyppa, H., Kukko, A., Litkey, P., Ahokas, E., Hyyppa, J., Lehner, H., Jaakkola, A., Suomalainen, J., Akujarvi, A., Kaasalainen, M., Pyysalo, U., 2009. Radiometric calibration of LIDAR intensity with commercially available reference targets. IEEE Trans. Geosci. Remote Sensing 47 (2), 588–598.
Katz, S., Tal, A., Basri, R., 2007. Direct visibility of point sets. ACM Trans. Graphics 26 (3). http://dx.doi.org/10.1145/1275808.1276407
Kim, J., Lee, S., Ahn, H., Seo, D., Park, S., Choi, C., 2013. Feasibility of employing a smartphone as the payload in a photogrammetric UAV system. ISPRS J. Photogramm. Remote Sensing 79, 1–18.
Kwak, E., 2013. Automatic 3D building model generation by integrating LiDAR and aerial images using a hybrid approach. Ph.D. thesis, University of Calgary, Canada.
Le Moigne, J., Netanyahu, N.S., Eastman, R.D., 2011. Image Registration for Remote Sensing. Cambridge University Press, UK.
Lee, S., Lee, W., 2013. Evaluation of the selection of the initial seeds for K-means algorithm. Int. J. Database Theory Appl. 6 (5), 13–21.
Lepetit, V., Moreno-Noguer, F., Fua, P., 2009. EPnP: an accurate O(n) solution to the PnP problem. Int. J. Comput. Vision 81 (2), 155–166.
Li, Y., Wu, H., An, R., Xu, H., He, Q., Xu, J., 2013. An improved building boundary extraction algorithm based on fusion of optical imagery and LIDAR data. Optik – Int. J. Light Electron Opt. 124 (22), 5357–5362.
Lin, Y., Hyyppa, J., Jaakkola, A., 2011. Mini-UAV-borne LIDAR for fine-scale mapping. IEEE Geosci. Remote Sensing Lett. 8 (3), 426–430.
Liu, L., Stamos, I., 2012. A systematic approach for 2D-image to 3D-range registration in urban environments. Comput. Vision Image Understand. 116 (1), 25–37.
Lowe, D., 2004. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60 (2), 91–110.
Mastin, A., Kepner, J., Fisher, J., 2009. Automatic registration of LIDAR and optical images of urban scenes. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, pp. 2639–2646.
Mitishita, E., Habib, A., Centeno, J., Machado, A., Lay, J., Wong, C., 2008. Photogrammetric and lidar data integration using the centroid of a rectangular roof as a control point. Photogramm. Record 23 (121), 19–35.
Nagai, M., Tianen, C., Shibasaki, R., Kumagai, H., Ahmed, A., 2009. UAV-borne 3-D mapping system by multisensor integration. IEEE Trans. Geosci. Remote Sensing 47 (3), 701–708.
Niemeyer, J., Rottensteiner, F., Soergel, U., 2014. Contextual classification of lidar data and building object detection in urban areas. ISPRS J. Photogramm. Remote Sensing 87, 152–165.
Palenichka, R.M., Zaremba, M.B., 2010. Automatic extraction of control points for the registration of optical satellite and LiDAR images. IEEE Trans. Geosci. Remote Sensing 48 (7), 2864–2879.
Parmehr, E.G., Fraser, C.S., Zhang, C., Leach, J., 2014. Automatic registration of optical imagery with 3D LiDAR data using statistical similarity. ISPRS J. Photogramm. Remote Sensing 88, 28–40.
Pomerleau, F., Colas, F., Ferland, F., Michaud, F., 2010. Relative motion threshold for rejection in ICP registration. In: Howard, A., Iagnemma, K., Kelly, A. (Eds.), Field and Service Robotics. Springer, Berlin Heidelberg, pp. 229–238.
Qin, R., Gruen, A., 2014. 3D change detection at street level using mobile laser scanning point clouds and terrestrial images. ISPRS J. Photogramm. Remote Sensing 90, 23–35.
Roncat, A., Briese, C., Jansa, J., Pfeifer, N., 2014. Radiometrically calibrated features of full-waveform lidar point clouds based on statistical moments. IEEE Geosci. Remote Sensing Lett. 11 (2), 549–553.
Rönnholm, P., 2011. Registration quality – towards integration of laser scanning and photogrammetry. EuroSDR Official Publication No. 59, pp. 9–57.
Rönnholm, P., Haggrén, H., 2012. Registration of laser scanning point clouds and aerial images using either artificial or natural tie features. ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci. I-3, 63–68.
Rottensteiner, F., Briese, C., 2003. Automatic generation of building models from LIDAR data and the integration of aerial images. Int. Arch. Photogramm. Remote Sensing Spatial Inform. Sci. 34 (3/W13), 174–180.
Rusinkiewicz, S., Levoy, M., 2001. Efficient variants of the ICP algorithm. In: Third International Conference on 3-D Digital Imaging and Modeling, Quebec City, Canada, 28 May–1 June, pp. 145–152.
Singh, K.K., Vogler, J.B., Shoemaker, D.A., Meentemeyer, R.K., 2012. LiDAR-Landsat data fusion for large-area assessment of urban land cover: balancing spatial resolution, data volume and mapping accuracy. ISPRS J. Photogramm. Remote Sensing 74, 110–121.
Skaloud, J., 2006. Reliability in direct georeferencing: an overview of the current approaches and possibilities. In: International Calibration and Orientation Workshop, EuroCOW06, Castelldefels, Spain, 25–27 January (on CD-ROM).
Snavely, N., Seitz, S., Szeliski, R., 2008. Modeling the world from Internet photo collections. Int. J. Comput. Vision 80 (2), 189–210.
Solberg, S., Næsset, E., Hanssen, K.H., Christiansen, E., 2006. Mapping defoliation during a severe insect attack on Scots pine using airborne laser scanning. Remote Sensing Environ. 102 (3–4), 364–376.
Suri, S., Reinartz, P., 2010. Mutual-information-based registration of TerraSAR-X and IKONOS imagery in urban areas. IEEE Trans. Geosci. Remote Sensing 48 (2), 939–949.
Tola, E., Lepetit, V., Fua, P., 2010. DAISY: an efficient dense descriptor applied to wide-baseline stereo. IEEE Trans. Pattern Anal. Mach. Intelligence 32 (5), 815–830.
van de Weijer, J., Gevers, T., Smeulders, A.W.M., 2006. Robust photometric invariant features from the color tensor. IEEE Trans. Image Process. 15 (1), 118–127.
Wagner, W., Ullrich, A., Ducic, V., Melzer, T., Studnicka, N., 2006. Gaussian decomposition and calibration of a novel small-footprint full-waveform digitising airborne laser scanner. ISPRS J. Photogramm. Remote Sensing 60 (2), 100–112.
Wallace, L., Lucieer, A., Watson, C., Turner, D., 2012. Development of a UAV-LiDAR system with application to forest inventory. Remote Sensing 4 (12), 1519–1543.
Wang, L., Neumann, U., 2009. A robust approach for automatic registration of aerial images with untextured aerial LiDAR data. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, June 20–25, pp. 2623–2630.
Xu, S., Vosselman, G., Oude Elberink, S., 2014. Multiple-entity based classification of airborne laser scanning data in urban areas. ISPRS J. Photogramm. Remote Sensing 88, 1–15.
Yang, B., Xu, W., Dong, Z., 2013. Automated extraction of building outlines from airborne laser scanning point clouds. IEEE Geosci. Remote Sensing Lett. 10 (6), 1399–1403.
Zhang, J., Lin, X., 2013. Filtering airborne LiDAR data by embedding smoothness-constrained segmentation in progressive TIN densification. ISPRS J. Photogramm. Remote Sensing 81, 44–59.
Zhao, W., Nister, D., Hsu, S., 2005. Alignment of continuous video onto 3D point clouds. IEEE Trans. Pattern Anal. Mach. Intelligence 27 (8), 1305–1318.
Zhao, K., Popescu, S., Nelson, R., 2009. Lidar remote sensing of forest biomass: a scale-invariant estimation approach using airborne lasers. Remote Sensing Environ. 113 (1), 182–196.