Road Extraction from Lidar Data Using Support Vector Machine Classification Ali Akbar Matkan, Mohammad Hajeb, and Saeed Sadeghian
Abstract
This paper presents a method for road extraction from lidar data based on SVM classification. The lidar data are used exclusively, to evaluate their potential in the road extraction process. First, the SVM algorithm is used to classify the lidar data into five classes: road, tree, building, grassland, and cement. Then, some misclassified pixels in the road class are removed using the road values in the normalized Digital Surface Model and Normalized Difference Distance features. In the postprocessing stage, a method based on the Radon transform and spline interpolation is employed to automatically locate and fill the gaps in the road network. The experimental results show that the proposed gap-filling algorithm works well on straight roads. The proposed road extraction algorithm is tested on three datasets. An accuracy assessment indicated 63.7 percent, 60.26 percent, and 66.71 percent quality for the three datasets, respectively. Finally, the centerline of the detected roads is extracted using mathematical morphology.
Introduction
Road information plays an important role in many modern applications, including transportation, automatic navigation systems, traffic management, and crisis management, and it enables existing geographic information system (GIS) databases to be updated more efficiently. In the past two decades, automatic road extraction has become an important topic in remote sensing, photogrammetry, and computer vision. In addition, recent advances in lidar systems and their enormous potential for automatic feature extraction motivate the development of automatic road extraction algorithms based on lidar data. Many studies have been performed on road extraction from remotely sensed data. Mena (2003) provided a bibliography of nearly 250 references related to this topic. Hu (2003) proposed a method for road extraction from lidar data in which specified range and intensity thresholds were used in an exponential membership function. Alharty and Bethel (2003) successfully extracted roads from lidar data using constraints related to road properties such as intensity and proximity to a digital terrain model (DTM). Zhu et al. (2004) extracted city roads using digital images and laser data: the heights and edges of high objects were obtained from the laser data, road edges were detected from a digital image, and shadowed parts were reconstructed by a spline-approximation algorithm.

Ali Akbar Matkan and Mohammad Hajeb are with the Remote Sensing and GIS Department, Shahid Beheshti University, Evin, Tehran, Iran ([email protected]). Saeed Sadeghian is with the Geomatics College of the National Cartographic Center, Meraj Street, Azadi Square, Tehran, Iran.

Hu et al. (2004) used
high-resolution imagery combined with lidar data for road extraction; they used an iterative Hough transform algorithm to distinguish car parks from road strips. Clode et al. (2005) presented a road classification technique for lidar data based on region growing. Akel et al. (2005) suggested a method to extract roads from lidar data using a segmentation technique. Clode et al. (2007) used a hierarchical classification technique to progressively classify the lidar points into road or non-road groups; the resulting binary classification was then vectorized by convolving with a Phase Coded Disk (PCD). Youn et al. (2008) utilized lidar data and a true orthoimage for urban road extraction in sequential steps. First, the candidate road pixels were selected from the true orthoimage based on a free-passage measure called the "acupuncture" method. Then, a first-last return analysis and a morphological filter were used with the lidar data to mask building pixels, and supervised classification techniques were used with the lidar intensity and true orthoimage to mask grass pixels. Li et al. (2008) proposed a method based on a parallel algorithm for road extraction from lidar data. Harvey and McKeown (2008) successfully extracted roads using both lidar and multi-spectral source data. Choi et al. (2008) proposed a method to extract urban roads using range and intensity lidar data combined with clustered road point information and the global geometry of the road system. Tiwari et al. (2009) proposed an integrated approach to road extraction using lidar and high-resolution satellite data: an object-oriented fuzzy rule-based algorithm identified roads in the high-resolution satellite images, and a complete road network was then extracted from a combination of the lidar and satellite data.
Zhu and Mordohai (2009) segmented the lidar data based on both edge and region properties and combined these two features to obtain a heat map of road likelihood using hypothesis testing; a minimum-cover algorithm was then used to find the set of road segments that best covers this likelihood map. Samadzadegan et al. (2009) proposed a method based on a multiple classifier system (MCS) to extract roads from lidar data. Gong et al. (2010) extracted roads from lidar data using a k-means clustering method and refined the results using spectral information from aerial images. Zhang (2010) presented a method to identify road regions and road edges using lidar data; the road segments and road edge points were detected by a local extreme-signal detection filter applied to the elevation data, with a priori knowledge of the minimal width of roads. Wang et al. (2011) applied lidar data fused with aerial images to extract 3D road information, using an improved mean-shift algorithm for the classification process.

Photogrammetric Engineering & Remote Sensing, Vol. 80, No. 5, May 2014, pp. 409–422. 0099-1112/14/8005–409. © 2014 American Society for Photogrammetry and Remote Sensing. doi: 10.14358/PERS.80.5.409

Silva et al. (2011) introduced a method based on the iterative and localized Radon transform and optimization algorithms to extract roads from lidar data and images of rural areas. Jiangui and Guang (2011) presented a method in which an adaptive TIN (Triangulated Irregular Network) model filtering algorithm was utilized to classify the lidar point clouds into ground and non-ground points; the ground points were then classified into candidate road and non-road points using intensity information. Zhao et al. (2011) offered an unsupervised approach for efficient extraction of grid-structured urban roads from airborne lidar data. Boyko and Funkhouser (2011) described a method for extracting roads from a large-scale unstructured 3D point cloud; their method separates the road from other objects with the aid of a 2D road map. Huang et al. (2011) focused on feature-extraction algorithms for lidar data and spectral-lidar information fusion approaches; SVM-based multisource information fusion was implemented at three levels, the feature level (vector stacking), the multiclass output level (re-classification), and the decision level (post-processing), for urban information extraction (e.g., buildings and roads). Jin (2011) presented an integrated approach for automatic road extraction from high-resolution aerial images and lidar point clouds, in which an adaptive mean-shift (MS) segmentation algorithm was utilized to segment the original image and SVM classification was then applied to the segmented image to extract urban road objects; the lidar intensity image was used to remove the effects of shadows and trees, and the nDSM obtained from lidar was employed to filter out above-ground objects. Yang et al. (2012) extracted roads from lidar data using a threshold automatically specified by discrete discriminant analysis.
These extracted road segments, along with the lidar intensity values, are used for the extraction of road markings. Li et al. (2012) proposed a hierarchical algorithm to extract the terrain points from lidar data and then used information including elevation, intensity, morphological characteristics, and other features to extract the road network from the derived DTM. In Zhao and You (2012), lidar point clouds were first separated into ground and non-ground parts, and a structure template was then designed to search for roads in the intensity image of the ground points; road widths and orientations were determined by a subsequent voting scheme, and a Markov graph was constructed to make a global inference on the road network. Finally, a method was developed for elevated-road extraction from the non-ground points, and the whole network was formed by combining both the ground and elevated roads. Support vector machines (SVMs) have been successfully applied to many classical recognition problems, with results comparable or even superior to those of traditional classifiers such as decision trees, neural networks, and maximum likelihood (Yager and Sowmya, 2003). The application of SVMs to road recognition from different remotely sensed images has been investigated in several studies (Yager and Sowmya, 2003; Lai et al., 2005; Huang and Zhang, 2009; Ziems et al., 2011; Jin, 2011). Song and Civco (2004) suggested a method in which an SVM was used to classify an image into road and non-road groups; the road-group image was then segmented into geometrically homogeneous objects using a region-growing technique. Huang and Zhang (2008) proposed a novel mean-shift (MS) system for extracting urban features such as roads, grass, water, trees, trails, and roofs from hyperspectral imagery; MS was applied to obtain an object-oriented representation of the hyperspectral data, and SVM was then used to classify the feature set.
The roads were extracted from the classification map using postprocessing steps that included centerline extraction based on a morphological thinning algorithm and connected component analysis.
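Connected-component postprocessing of this kind can be sketched with standard tools. The snippet below is a minimal illustration, not any cited author's implementation: it removes small isolated blobs from a binary road mask using scipy.ndimage, with an invented size threshold.

```python
import numpy as np
from scipy import ndimage

def remove_small_components(road_mask, min_pixels=4):
    """Drop connected components smaller than min_pixels from a binary mask."""
    labels, n = ndimage.label(road_mask)   # 4-connected labeling by default
    sizes = ndimage.sum(road_mask, labels, index=range(1, n + 1))
    keep = [i + 1 for i, s in enumerate(sizes) if s >= min_pixels]
    return np.isin(labels, keep)

# A toy mask: one 6-pixel road-like strip and one isolated noise pixel.
mask = np.zeros((5, 8), dtype=bool)
mask[2, 1:7] = True   # road-like strip
mask[0, 7] = True     # isolated noise pixel
cleaned = remove_small_components(mask, min_pixels=4)
```

After cleaning, the strip survives and the single noise pixel is discarded; on a real classification map the threshold would be tied to the minimum plausible road-segment size.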
In our paper, a new algorithm based on SVM classification is introduced to extract roads from lidar data. A novel method is also presented for automatically finding and filling gaps using the Radon transform and spline interpolation, respectively. The paper is organized into four sections. The next section reviews the theory of SVM classification, the Radon transform, and spline interpolation, and then describes the proposed road extraction algorithm. The experimental results and an accuracy assessment of the algorithm are then provided, followed by the conclusions.
Methodology
SVM Classification
SVMs discriminate between two classes by fitting an optimal separating hyperplane to the training samples in a multidimensional feature space (Cortes and Vapnik, 1995). Consider a set of training instances (xi, yi), i = 1,2,…,l, where xi ∈ R^N and yi ∈ {−1,1}. The SVM method separates the classes with a hyperplane that maximizes the margin between them. For a separable training set, every sample satisfies yi(w.xi + b) − 1 ≥ 0 for i = 1,2,…,l, where w is an N-dimensional vector perpendicular to the hyperplane, b is an offset parameter, and w.xi is the dot product of the vectors w and xi. The optimal separating hyperplane is the one that separates the data with the maximum margin (Vapnik, 1995); it can be found by minimizing the norm of w subject to this constraint. The problem can be formulated as follows:

minimize(w,b) (1/2)||w||^2, subject to yi(w.xi + b) − 1 ≥ 0 for i = 1,2,…,l.   (1)
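The objective and constraint of Equation 1 can be checked numerically on a toy example. The four 2D points below are invented for illustration; for this configuration the maximum-margin hyperplane is known in closed form (the support vectors are (2,2) and (1,0), giving w = (0.4, 0.8), b = −1.4), so the code simply verifies the constraint and the resulting geometric margin 2/||w||.

```python
import numpy as np

# Toy 2D training set: two linearly separable classes (made-up data).
X = np.array([[2.0, 2.0], [2.0, 3.0],    # class +1
              [0.0, 0.0], [1.0, 0.0]])   # class -1
y = np.array([1, 1, -1, -1])

# Closed-form maximum-margin solution for this configuration.
w = np.array([0.4, 0.8])
b = -1.4

# Every training point must satisfy y_i (w.x_i + b) - 1 >= 0.
margins = y * (X @ w + b)
assert np.all(margins >= 1 - 1e-9)

# Support vectors (2,2) and (1,0) attain equality; the geometric
# margin is 2/||w||, here equal to sqrt(5), half the distance
# between the two class convex hulls.
geometric_margin = 2 / np.linalg.norm(w)
```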
In practice, a separating hyperplane may not exist, e.g., when different classes have overlapping instances. In this case, a standard approach is to introduce slack variables (ξi ≥ 0, i = 1,2,…,l), and the constraint is rewritten as yi(w.xi + b) ≥ 1 − ξi for i = 1,2,…,l. The optimal hyperplane can then be found by minimizing the objective function

minimize(w,b) (1/2)||w||^2 + C Σ(i=1…l) ξi, subject to yi(w.xi + b) ≥ 1 − ξi and ξi ≥ 0 for i = 1,2,…,l.   (2)
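The role of C can be made concrete by evaluating the objective of Equation 2 directly. The data and hyperplane below are invented for illustration: one class +1 outlier sits among the class −1 points, so any reasonable hyperplane must absorb a slack of 3 for it, and C decides how heavily that violation is weighted against the margin term.

```python
import numpy as np

def svm_objective(w, b, X, y, C):
    """Equation 2 with slacks xi_i = max(0, 1 - y_i (w.x_i + b))."""
    slacks = np.maximum(0.0, 1.0 - y * (X @ w + b))
    return 0.5 * float(w @ w) + C * float(slacks.sum())

# Toy data with one +1 outlier on the -1 side (made-up values).
X = np.array([[2.0, 0.0], [3.0, 1.0], [0.0, 0.0], [-1.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, 1.0])   # the last point is the outlier
w, b = np.array([1.0, 0.0]), -1.0     # hyperplane x = 1

# Slacks are [0, 0, 0, 3]: only the outlier violates its margin.
loose = svm_objective(w, b, X, y, C=0.1)   # 0.5 + 0.1*3 = 0.8
tight = svm_objective(w, b, X, y, C=10.0)  # 0.5 + 10*3 = 30.5
```

With a small C the outlier is cheap to ignore, so the large-margin hyperplane is preferred; with a large C the single violation dominates the objective, which is exactly the over-fitting risk noted above.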
The constant C > 0 controls the trade-off between the separation margin and the number of training instances lying on the wrong side of the hyperplane. If C is very small, a large fraction of the training samples may become support vectors. In contrast, if C is very large, the analysis may over-fit the training data, which can yield a low level of generalization ability (Su, 2009). The minimization problem in Equation 2 can be solved through a Lagrange dual optimization. The basic approach to SVM classification may be extended to allow for nonlinear decision surfaces by applying the kernel trick to maximum-margin hyperplanes. The kernel method converts a linear classifier into a non-linear one by mapping the original instances into a higher-dimensional space. Here, the final hyperplane decision function can be defined by

F(x) = sign(Σ(i=1…s) yi λi K(x, xi) + b)   (3)

where K(x, xi) is a kernel function, λi is a Lagrange multiplier, and s is the number of support vectors, which form the subset of training samples with λi > 0. A widely used kernel is the radial basis function (RBF) kernel (Chang and Lin, 2001), defined by K(x, xi) = exp(−g||x − xi||^2), where g
is the width of the kernel. The accuracy with which an SVM can classify a dataset depends on the magnitudes of the parameters C and g (Foody and Mathur, 2004). Both C and g depend on the data range and distribution, and they differ from one classification problem to another. These parameters are often selected based on a cross-validation analysis (Foody et al., 2006). The SVM was originally designed for binary classification, but various methods exist to extend the binary approach to multi-class classification (Hsu and Lin, 2002). An n-class problem is often divided into several sub-problems solved with individual binary SVMs. There are two main strategies for this purpose: one-against-one (OAO) and one-against-all (OAA). The OAO strategy trains n(n−1)/2 binary SVMs, one for each possible pair-wise classification problem. The OAA strategy trains n SVMs, each separating one class from all remaining classes.

The Radon Transform
The Radon transform is an integral transform that represents lines in an image as peaks in a domain of possible line parameters, and it has been employed in many line-detection applications in image processing and computer vision (Toft, 1996). The Radon transform computes projections of an image matrix along specified directions using a set of line integrals, taken along parallel beams in a specified direction. To represent an image, multiple parallel-beam projections are taken from different angles by rotating the source around the center of the image. More precisely, the Radon transform is the projection of the image intensity along a radial line oriented at a specified angle. The radial coordinates are the values along the x′-axis, which is oriented at θ degrees from the x-axis. A projection along an arbitrary angle θ is computed with the Radon transform equation,
Rθ(x′) = ∫∫ f(x, y) δ(x cos θ + y sin θ − x′) dx dy,

where the double integral is taken over −∞ < x, y < ∞.
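As a rough numerical sketch, and not the implementation used in this paper, a discrete Radon-style projection can be approximated by rotating the image and summing its columns; the rotation angle that concentrates a line into a single projection bin reveals that line's orientation.

```python
import numpy as np
from scipy import ndimage

def radon_peak_angle(image, angles):
    """Approximate Radon projections by rotate-and-sum; return the
    angle whose projection has the sharpest (largest) single bin."""
    best_angle, best_peak = None, -np.inf
    for theta in angles:
        rotated = ndimage.rotate(image, theta, reshape=False, order=1)
        peak = rotated.sum(axis=0).max()   # strongest projection bin
        if peak > best_peak:
            best_angle, best_peak = theta, peak
    return best_angle

# A vertical "road" one pixel wide in a 64x64 binary image.
img = np.zeros((64, 64))
img[:, 32] = 1.0
angle = radon_peak_angle(img, range(0, 180, 5))   # the line is at 0 degrees
```

The true Radon transform integrates along the beam direction as in the equation above; this rotate-and-sum scheme agrees with it up to interpolation error, which is sufficient for locating straight road segments and their gaps.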
Figure 2. Interpolation of discrete data.

Spline Interpolation
Assume that the values of an unknown function f(x) are available only on a finite subset of the function domain. For example, the values of the function may be known at a = x0 < x1 < …
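Interpolation over such a finite set of samples can be sketched as follows. The centerline coordinates and gap positions below are invented for illustration and are not taken from the paper's datasets; the spline is simply fitted to the known samples on either side of a gap and evaluated at the missing positions.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Known centerline samples (x, y) on either side of a gap in the road.
x_known = np.array([0.0, 1.0, 2.0, 5.0, 6.0, 7.0])   # gap between x=2 and x=5
y_known = np.array([0.0, 0.5, 1.0, 2.5, 3.0, 3.5])   # a straight road y = x/2

spline = CubicSpline(x_known, y_known)

# Fill the gap by evaluating the spline at the missing positions.
x_gap = np.array([3.0, 4.0])
y_gap = spline(x_gap)    # recovers [1.5, 2.0] for this straight road
```

For a straight road the spline reduces to the line through the samples, which matches the observation in the Abstract that the gap-filling procedure works well on straight roads; on curved roads the cubic pieces bend smoothly through the known samples instead.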