Performance evaluation of classification trees for building detection from aerial images and lidar data: a comparison of classification tree models

M. SALAH*†, J. C. TRINDER† and A. SHAKER‡

†School of Surveying and Spatial Information Systems, The University of New South Wales, Sydney, 2052, Australia
‡Department of Surveying Engineering, Shoubra Faculty of Engineering, Benha University, Cairo, 11269, Egypt

*Corresponding author. Email: [email protected]

(Received 00 month 200x; in final form 00 month 200x)

This study assesses the performance of three classification tree (CT) models (Entropy, Gain Ratio and Gini) for building detection by the fusion of laser scanner data and multispectral aerial images. Data from four study areas with different sensor and scene characteristics were used to assess the performance of the models. The performance evaluation is based on four criteria: model validation and testing; classification accuracies; relative importance of variables; and transferability of classification trees derived from one dataset to another. The lidar point clouds were filtered to generate a Digital Terrain Model (DTM) based on orthogonal polynomials, and then a Digital Surface Model (DSM) and a normalized Digital Surface Model (nDSM) were generated. A total of 25 uncorrelated feature attributes were derived from the aerial images, the lidar intensity image, the DSM and the nDSM. Finally, the three classification tree models were used to classify buildings, trees, roads and ground from the aerial images, lidar data and the generated attributes, with the most accurate average classifications of 95% being achieved. The Entropy splitting algorithm proved to be the preferable algorithm for building detection from aerial images and lidar data.
1. Introduction

Research on building detection from aerial images and lidar data has been fuelled in recent years by the need for data acquisition and updating for GIS. The high dimensionality of aerial and satellite imagery presents a challenge for traditional classification methods based on statistical assumptions. A number of approaches to building detection based on image and lidar data have been tested at the University of New South Wales (UNSW), Sydney, Australia, including machine learning (Cai et al. 2005), the Dempster-Shafer probability method (Lu et al. 2006, Rottensteiner et al. 2007) and a self-organizing map (SOM) neural network (Salah et al. 2009a). This paper is a continuation of that research and implements a method based on classification trees.

Classification trees (CT), also called decision trees, are a very effective alternative approach for land cover mapping from multidimensional data, because they are
nonparametric and do not require initial assumptions about the distribution of the input classes. Furthermore, classification trees have the advantage that they also work when the classification variables are a mixture of categorical and continuous data. Only the most prominent attributes, rather than all of them, are used in the final classification (Liu and Wu 2005). If new data sources become available in the future, they can easily be included in the classification process. This makes the classification method highly automatic and different from most other approaches, in which the input data must remain fixed.

There are two main motivations behind this study, which represent the innovations of this research. First, no results had been published prior to this study on assessing the performance of classification tree models for image and lidar data fusion. Secondly, there are several issues requiring consideration in the application of classification trees to images and lidar data: which classification tree model performs best for image and lidar data fusion; how does each algorithm perform with data acquired from sensors having different characteristics; which individual and groups of attributes used in data fusion contribute to the quality of the classification results; and how well do classification trees derived from one area transfer to other regions? It is the goal of this paper to answer these questions, based on a thorough evaluation of three classification tree models using four datasets of different sensor and scene characteristics.

This paper is organised as follows. Section 2 introduces related work on the extraction of buildings from images and lidar data. Section 3 describes the study areas and datasets. Section 4 describes the experiments, while Section 5 presents and evaluates the results. We summarise our results in Section 6.

2. Related Work

2.1 Building extraction from lidar data

There have been many research efforts on the application of lidar data for automatic building detection and extraction, including Maas and Vosselman (1999); Morgan and Tempfli (2000); Priestnall et al. (2000); Alharthy and Bethel (2002); Vögtle and Steinle (2003); Vosselman et al. (2004); and Neidhart and Sester (2008). Most of these methods use step-wise classification approaches to distinguish buildings from other objects. They normally begin by extracting the ground surface using a filtering algorithm; the most important remaining task is then to distinguish buildings from trees. Although there are several advantages in the use of only lidar data for automatic building localization and planar patch extraction compared to aerial photographs and satellite imagery, several difficulties can also occur. These include lower point densities or even gaps in the data, roof surfaces that have poor reflectance properties (e.g. wet or dark roofs), small roof features that contain no or few laser points, and the inability to discriminate between occluded regions next to buildings and roof parts with bad reflectance (Sander 2008).

2.2 Building extraction by fusion of lidar and image data

Research on building detection based on the fusion of multispectral aerial imagery and lidar
data has been undertaken so that the individual strengths of each data source can compensate for the weaknesses of the others. Low contrast, occlusions and shadow effects in the images can be compensated by the accurately detected planes in the lidar data. Rottensteiner et al. (2007) evaluated a method for building detection by the Dempster-Shafer fusion of lidar data and multispectral images. Salah et al. (2009a) tested the Self-Organizing Map (SOM) for building detection from aerial images and lidar data.

There have been very few applications of classification trees for building detection from aerial images and lidar data. Matikainen et al. (2007) used the Gini splitting criterion for classification trees for building detection. A DSM derived from last pulse laser scanner data was first segmented into the classes 'ground' and 'building or tree', using different combinations of 44 input attributes. The attributes were derived from the last pulse DSM, the first pulse DSM and a colour aerial ortho image. In addition, shape attributes calculated for the segments were used. Compared with a building reference map, a mean accuracy of almost 90% was achieved for extracting buildings.

3. Study areas and data sources

Four test datasets of quite different sensor and scene characteristics were used in this study, as summarized in tables 1 and 2. These scenes include data made available to the researchers by data providers in Australia, together with data provided by TopoSys in Germany, for a range of land covers, including high and low density urban areas, a rural township and a densely populated European town. The reason for using four different test areas is to reach a robust conclusion about the behaviour of the various splitting algorithms, which determine the splits at each node in the classification trees, when several attributes and splits are almost equally good.

Test area 1. A part of the region surrounding the University of New South Wales (UNSW) campus, Sydney, Australia, covering approximately 500 m x 500 m. It is a largely urban area that contains residential buildings, large campus buildings, a network of main and minor roads, trees, open areas and green areas. First and last pulse lidar data were acquired over the study area, including the lidar intensity data, which was used as infrared (IR) image data. The colour imagery was captured by a film camera at a scale of 1:6000. The film was scanned in three colour bands (red, green and blue) in TIFF format, with 15 µm pixel size (GSD of 0.09 m) and a radiometric resolution of 16 bits, as shown in figure 1(a).

Test area 2. A part of Bathurst city, NSW, Australia, covering approximately 1000 m x 1000 m of a largely rural area that contains small residential buildings, road networks, trees and green areas. First and last pulse lidar data were acquired over the area, including the lidar intensity data, which was used as IR image data. The colour (red, green and blue) images were captured by a Leica ADS40 line scanner and supplied as an ortho image, as shown in figure 1(b).

Test area 3. An area over suburban Fairfield, NSW, Australia, covering low density development in the southwest half of the scene and large industrial buildings in the northeast part, as shown in figure 1(c). First and last pulse lidar data were acquired over the area, including the lidar intensity data, which was used as IR image data. The image data were acquired by a film camera at a scale of 1:10 000, scan digitized and supplied
as an ortho image.

Test area 4. An area over Memmingen, Germany, featuring a densely developed historic centre in the north of the scene and industrial areas in the remainder. First and last pulse lidar data were acquired, as shown in figure 1(d). Multispectral images (CIR), including an infrared image with the same resolution as the colour bands, were acquired by a line scanner and supplied as an ortho image.

Table 1. Characteristics of image datasets.

                                                                                Look angle (deg.)
Test area    Size (km)   Bands   Pixel size (cm)   Camera                           along track    across track
UNSW         0.5 x 0.5   RGB     9                 LMK1000                          ±30            ±30
Bathurst     1 x 1       RGB     50                ADS40 line scanner               line scanner   46
Fairfield    2 x 2       RGB     15                LMK1000                          ±30            ±30
Memmingen    2 x 2       CIR     50                TopoSys Falcon II line scanner   line scanner   22
Table 2. Characteristics of lidar datasets.

                            UNSW               Bathurst      Fairfield          Memmingen
Lidar system                Optech ALTM 1225   Leica ALS50   Optech ALTM 3025   TopoSys
Spacing across track (m)    1.15               0.85          1.2                0.15
Spacing along track (m)     1.15               1.48          1.2                1.5
VL accuracy (m)             0.10               0.10          0.15               0.15
HL accuracy (m)             0.5                0.5           0.5                0.5
Density (points/m2)         1                  2.5           1                  4
Sampling intensity (mHz)    11                 150           167                125
Wavelength (µm)             1.047              1.064         1.047              1.56
Average altitude (m)        1100               1450          1500               800
Laser swath width (m)       800                777.5         700                750
Figure 1. Orthophotos for: (a) UNSW; (b) Bathurst; (c) Fairfield; and (d) Memmingen.

4. Methodology

4.1 Data pre-processing

4.1.1 Filtering of lidar data. First, the original lidar point clouds were filtered to separate on-terrain points from points falling on natural and human-made objects. A filtering technique based on orthogonal polynomial filtering (Abo Akel et al. 2004), which permits the use of polynomial interpolation functions without restricting the polynomial degree, was used. Data from both the first and the last pulse echoes were used in order to obtain denser terrain data and hence a more accurate filtering process. After that, the filtered lidar points were converted into an image DTM, and the DSM was generated from the original lidar point clouds. In order to compensate for the difference in resolution between image and lidar data, the DSM and DTM grids were interpolated to 30 cm for UNSW and Fairfield and to 50 cm for Bathurst and Memmingen. Then, the nDSM was generated by subtracting the DTM from the DSM, as shown in figure 2(a), (b) and (c). Finally, a height threshold of 3 m was applied to the nDSM to eliminate other objects, such as cars, and ensure that they are not included in the final classified image.

Kraus and Pfeifer (1998) have reported that with laser scanner technology, which is independent of the sun and uses the near-infrared portion of the electromagnetic spectrum to collect data, shadows do not pose any problem. Also, Baltsavias (1999) and Kraus (2003) have proposed that using the synergetic information content of photogrammetrically derived imagery and lidar data leads to high quality land cover classifications in shaded areas. In our case there is a significant overlap of 30% between the lidar strips, which is sufficient to address the shadow problem given the wide scan angles of 25° and 30° for the Optech and Leica lidars respectively. Therefore, the laser scanner data (DSM and nDSM) and the attributes generated from them are able to detect land cover classes in the shaded areas when combined with the multispectral images in one classifier.
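The nDSM step above reduces to a grid difference followed by a height mask. The following minimal numpy sketch illustrates it; the function and array names are illustrative only and are not part of the authors' MATLAB implementation.

```python
import numpy as np

def normalized_dsm(dsm, dtm, height_threshold=3.0):
    """Derive an nDSM from co-registered DSM/DTM grids (metres) and
    suppress low off-terrain objects such as cars (section 4.1.1)."""
    ndsm = dsm - dtm                      # object heights above the terrain
    ndsm[ndsm < height_threshold] = 0.0   # apply the 3 m height threshold
    return ndsm
```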
Figure 2. UNSW filtering results: (a) DSM; (b) DTM; and (c) the nDSM.

4.1.2 Generation of attributes. Because edges of features are not located accurately in lidar point clouds, owing to the lidar system's discrete sampling interval of 0.5 m to 1 m (Li and Wu 2008), we derived 25 attributes from both the aerial images and the lidar data using a number of models. The attributes were calculated per pixel as input data for the classification trees. In our tests, before generating the attributes, the aerial photographs (already orthorectified) were registered to the lidar intensity image using a projective transformation. Following the transformation, each image was resampled to the resolution of the lidar data: 30 cm for UNSW and Fairfield and 50 cm for Bathurst and Memmingen. Bilinear interpolation was used for resampling, which results in a better quality image than nearest-neighbour resampling and requires less processing than cubic convolution.

A set of 78 possible attributes was selected. Because of the way the texture equations derived from the Grey-Level Co-occurrence Matrix (GLCM) (Haralick 1979) are constructed, many of them are strongly correlated with one another (Clausi 2002). Based on these studies, only 25 of the 78 possible attributes were uncorrelated and hence available for the classification process, as shown in table 3. The attributes include those derived from the GLCM, Normalized Difference Vegetation Indices (NDVI), slope and the polymorphic texture strength based on the Förstner operator (Förstner and Gülch 1987). The NDVI values for the UNSW, Bathurst and Fairfield test areas were derived from the red image band and the lidar reflectance values, since the radiation emitted by the lidars is in the IR wavelengths. The resolutions of the lidar reflectance data for these study areas are lower than those of the images, and this may affect the ability to detect vegetation. Since the images for the Memmingen dataset include an IR channel, the NDVI there was derived from the image data only.

For GLCM construction and texture calculation, a small window of 3 x 3 pixels was used. Large window sizes usually result in lower producer's accuracy (the total number of correctly classified pixels in a class divided by the total number of pixels that should have been classified in that class), since the effect of between-class variance on the edge pixels causes many of these pixels to be placed in an incorrect category due to their high texture values (Ferro and Warner 2002). Furthermore, if a window has dimensions M x M, a strip (M–1)/2 pixels wide around the image will remain unoccupied. The usual way of
handling this is to fill in these edge pixels with the nearest texture calculation. Edge effects can be a problem in classification (Hall-Beyer 2008).

Table 3. The full set of 78 possible attributes from the aerial images and lidar data. For each of the six layers (R, G, B, intensity/IR, DSM and nDSM), 13 attributes were computed, grouped under the headings Spectral (mean, SD, strength), GLCM (contrast, dissimilarity, homogeneity, mean, entropy) and Height (mean, variance, correlation, SD, slope). In the original table, shading marked the 25 uncorrelated attributes available for the classification; as table 4 indicates, these are the strength, homogeneity, entropy and GLCM mean of each layer, plus the slope of the nDSM.
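To make the attribute definitions concrete, the sketch below computes three of them: the NDVI (here from the red band and the lidar intensity used as IR, as for UNSW, Bathurst and Fairfield), the slope of a height grid, and the GLCM entropy of one 3 x 3 window. It is a minimal illustration, assuming integer windows quantized to the stated number of grey levels; it is not the authors' code.

```python
import numpy as np
from skimage.feature import graycomatrix  # scikit-image

def ndvi(ir, red, eps=1e-6):
    """NDVI from an IR layer (lidar intensity, or the CIR infrared band
    for Memmingen) and the red image band."""
    ir, red = ir.astype(float), red.astype(float)
    return (ir - red) / (ir + red + eps)

def slope_deg(height, cell_size):
    """Slope (degrees) of a DSM/nDSM grid, one of the height attributes."""
    dy, dx = np.gradient(height, cell_size)
    return np.degrees(np.arctan(np.hypot(dx, dy)))

def glcm_entropy(window, levels=32):
    """GLCM entropy of a single (e.g. 3 x 3) window whose integer values
    have been quantized to `levels` grey levels (cf. table 3)."""
    glcm = graycomatrix(window, distances=[1], angles=[0],
                        levels=levels, symmetric=True, normed=True)[:, :, 0, 0]
    p = glcm[glcm > 0]
    return -np.sum(p * np.log2(p))
```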
Finally, the attributes were applied in seven separate groups (table 4): those from the red, green and blue bands of the aerial image; from the intensity image (or IR image in the case of the Memmingen data); from the DSM; from the nDSM; and the Total group of attributes. The three image bands (RGB) and the NDVI were considered the primary data source and were available in all tests.

For the implementation of the classification tree analysis, a MATLAB interface was developed. The interface enables the user to: filter the lidar point clouds into on-terrain and off-terrain points; generate attributes; select training pixels representative of each class; classify the image using the three classification tree models; and evaluate the classification results.

Table 4. The seven groups of attributes used as input for the classification process: Yes and No indicate whether the attribute was used in the group.

Attribute      Red     Green   Blue    Intensity/IR   DSM     nDSM
               group   group   group   group          group   group
RGB bands      Yes     Yes     Yes     Yes            Yes     Yes
Intensity      No      No      No      Yes            No      No
Strength       Yes     Yes     Yes     Yes            Yes     Yes
Homogeneity    Yes     Yes     Yes     Yes            Yes     Yes
Entropy        Yes     Yes     Yes     Yes            Yes     Yes
GLCM_mean      Yes     Yes     Yes     Yes            Yes     Yes
NDVI           Yes     Yes     Yes     Yes            Yes     Yes
DSM            No      No      No      No             Yes     No
nDSM           No      No      No      No             No      Yes
Slope          No      No      No      No             No      Yes

Total group: attributes from all groups (32 nonrepeated attributes), comprising the 25 uncorrelated feature attributes of table 3, the RGB bands, the intensity/IR image, the DSM, the nDSM and the NDVI.
4.1.3 Training datasets. The training data are sets of manually classified samples comprising the numbers of pixels shown in table 5. Eighty polygons of approximately equal areas, twenty for each land cover class, were overlaid on the images to generate the training data. The positions of the polygons were selected carefully to be representative and to capture changes in the spectral variance of each class. Each sample was then converted into a vector representing the attributes or features. All tests were conducted using identical training sets. Table 5 presents a summary of all training data configurations. According to Lippitt et al. (2008), classification trees are a robust and accurate classification technique even when presented with deficient training data.

Table 5. Training set statistics. B: buildings, T: trees, R: roads, G: ground.

                                        Total sample pixels/class
Input data          No. of attributes   B       T       R       G
Single attributes   1                   1644    1264    1395    1305
Red group           8                   13152   10112   11160   10440
Green group         8                   13152   10112   11160   10440
Blue group          8                   13152   10112   11160   10440
Intensity group     9                   14796   11376   12555   11745
DSM group           9                   14796   11376   12555   11745
nDSM group          10                  16440   12640   13950   13050
Total group         32                  52608   40448   44640   41760
4.2 Land cover classification

In this process, using the classification trees, we want to achieve a per-pixel classification of the input data into one of the four primary classes of interest for these tests: buildings (B), trees (T), roads (R) and ground (G). The class 'ground' mainly corresponds to grass, parking lots and bare fields.

The theory of classification trees was developed by Breiman et al. (1984). A brief introduction is given by Waske (2007), while a detailed description is given by Safavian and Landgrebe (1991). A classification tree is a nonparametric univariate technique built through a process known as binary recursive partitioning. This is an iterative procedure in which a heterogeneous set of training data consisting of multiple classes is hierarchically subdivided into progressively more homogeneous clusters using a binary splitting rule to form the tree, which is then used to classify other similar datasets. A classification tree comprises nodes joined by a series of branches and includes the following elements: the root node (the starting point of the tree); the non-terminal nodes, which are connected by branches to the root node and all other internal nodes; and the terminal nodes, each of which describes a group of pixels assigned to the same class. The classification rules are derived from training samples. Each split in the tree is generally derived using statistical methods in which the fundamental strategy for the creation of rule T at node N depends on the 'impurity'. If all samples contained in N belong to the same class, the node is pure and the impurity is 0, whereas the impurity is large if all classes are equally distributed within the samples (Waske 2007). If the logical if-then condition on the attribute value at a node is fulfilled, the branch to the left is followed; otherwise the right branch is chosen. The process continues until a node becomes pure, containing pixels from only one class, and is assigned as a terminal node. In this study, starting from the root node and using training data, pixels were split into groups according to a binary split rule. If the pixels were from the same class, that is, the impurity approached zero, they were combined to form a terminal node. Otherwise, a non-terminal node was assigned and the process of splitting continued. Three models were used as the splitting criteria in our study: Entropy, Gain Ratio and Gini (Breiman 1996).

4.2.1 Entropy model. Entropy is related to information content (after Shannon 1949): the higher the entropy, the more information is required to describe the data. We therefore aim to decrease the entropy until we reach pure terminal nodes, i.e. nodes with zero entropy, which represent pixels from one class (all pixels have the same value for the attribute). For a training dataset in node N, where n is the number of training samples in N with corresponding class labels X = {x_i}, i = 1, ..., l, the information required to identify the class x_i can be described as:

Entropy(N) = -\sum_{i=1}^{l} P(x_i) \log_2 P(x_i)    (1)
where P(x_i) is the probability or relative frequency of class x_i:
P(x_i) = n_{x_i} / n    (2)
n_{x_i} is the number of samples belonging to class x_i. The test T differentiates N into z outputs {N_1, N_2, ..., N_z}. The total information after applying the test T is equal to the weighted sum over the outputs:

Entropy_T(N) = \sum_{j=1}^{z} (n_j / n) \, Entropy(N_j)    (3)
Information Gain measures the reduction in entropy (gain in information) that would result from splitting N using rule T, and is defined as the difference between equations (1) and (3):

gain(T) = Entropy(N) - Entropy_T(N)    (4)
By calculating this value for each attribute, we can see which attribute splits the data more purely, resulting in the highest reduction in entropy or the highest information gain. We would therefore choose this attribute at that node to split the data into subsets corresponding to all the different values of that attribute. After that, we can proceed recursively through the subsets until terminal nodes have been reached and all subsets are pure with zero entropy.
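A compact numpy sketch of equations (1)-(4), including the exhaustive threshold search that picks the binary split rule at a node, may make the procedure concrete. The names are illustrative and integer-coded class labels are assumed; this is not the authors' implementation.

```python
import numpy as np

def entropy(labels):
    """Equation (1): entropy of the class labels reaching a node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(labels, left_mask):
    """Equation (4): gain(T), the entropy reduction of a binary split;
    left_mask is a boolean array selecting the samples sent left."""
    n = len(labels)
    left, right = labels[left_mask], labels[~left_mask]
    if left.size == 0 or right.size == 0:
        return 0.0                      # degenerate split: no reduction
    entropy_t = (left.size / n) * entropy(left) \
              + (right.size / n) * entropy(right)   # equation (3)
    return entropy(labels) - entropy_t

def best_threshold(values, labels):
    """Exhaustively test candidate thresholds on one attribute and return
    the one with the highest information gain (two distinct values assumed)."""
    thresholds = np.unique(values)[:-1]
    gains = [information_gain(labels, values <= t) for t in thresholds]
    best = int(np.argmax(gains))
    return thresholds[best], gains[best]
```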
4.2.2 Gain Ratio model. A drawback of using equation (4) is that the information gain tends to favour attributes with many distinct values. This is known as entropy bias. In order to compensate for this, the gain is normalized by the split information, computed from the proportions of samples sent to each output:

Entropy_s(T) = -\sum_{j=1}^{z} (n_j / n) \log_2 (n_j / n)    (5)

The final Information Gain Ratio is calculated as:

IGR(T) = gain(T) / Entropy_s(T)    (6)
4.2.3 Gini model. The Gini index measures the impurity at a given node and aims to separate the largest homogeneous group within the training data from the remaining samples (Zambon et al. 2006). The index is defined as:

gini(N) = \sum_{i=1}^{l} P(x_i) (1 - P(x_i))    (7)

where P(x_i) is the probability or relative frequency of class x_i, as defined in equation (2). The Gini index of all parts is summed for each split rule, and the test that results in the maximum reduction in impurity is selected. Several other splitting models based on the distribution of the data have been proposed, such as chi-squared tests, permutation tests and multivariate splits (Frank 1981), but these methods are not used in this paper.
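Continuing the sketch given after equation (4), the two remaining splitting criteria can be written as follows, reusing entropy and information_gain from that block; again this is illustrative only.

```python
def gain_ratio(labels, left_mask):
    """Equations (5) and (6): information gain normalised by the split
    information, which counters the entropy bias."""
    n = len(labels)
    sizes = np.array([left_mask.sum(), n - left_mask.sum()], dtype=float)
    sizes = sizes[sizes > 0]
    split_info = -np.sum((sizes / n) * np.log2(sizes / n))   # equation (5)
    if split_info == 0.0:
        return 0.0                                           # degenerate split
    return information_gain(labels, left_mask) / split_info  # equation (6)

def gini(labels):
    """Equation (7): Gini impurity of the class labels at a node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return np.sum(p * (1.0 - p))
```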
4.3 Pruning the classification trees

Pruning is a method to avoid overfitting, which occurs when the classification tree characterizes too much detail or noise in the training data and consequently predicts poorly. Studies have shown that pruning can produce smaller and more effective trees, reducing their size by up to 25% (Esposito et al. 1997). Different pruning techniques have been developed, such as Reduced Error Pruning and Minimum Description Length Pruning. Several studies have compared the pruning methods and found little variation in performance between them (Esposito et al. 1997). In this study the trees were pruned through a 10-fold cross-validation process, as described in section 5.1.1. Also, for this work, the classification trees were applied directly to generate the classification results, without converting them to production rules. As reported by Quinlan (1987), transforming a decision tree with 92 nodes or fewer into a set of production rules improves the average error rate by no more than 0.1%, which is negligible. The pruned classification tree obtained for the UNSW test area using the Entropy model and the Total group of attributes is shown in figure 3 as a typical example derived in this study.
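The paper's trees were built and pruned in a purpose-built MATLAB interface; as a rough equivalent, the scikit-learn sketch below grows a reference tree and selects the cost-complexity pruning level whose 10-fold cross-validation error is smallest (cf. section 5.1.1). The arrays X (pixel attribute vectors) and y (class labels) are assumed inputs, and note that scikit-learn offers the entropy and Gini criteria but not Gain Ratio.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def prune_by_cross_validation(X, y, criterion="entropy", n_folds=10):
    """Return the tree pruned to the size with minimum CV cost."""
    # Candidate pruning levels derived from the unpruned reference tree.
    path = DecisionTreeClassifier(criterion=criterion).cost_complexity_pruning_path(X, y)
    alphas = np.unique(np.clip(path.ccp_alphas, 0.0, None))
    best_tree, best_cost = None, np.inf
    for alpha in alphas:
        tree = DecisionTreeClassifier(criterion=criterion, ccp_alpha=alpha)
        cv_cost = 1.0 - cross_val_score(tree, X, y, cv=n_folds).mean()  # averaged error rate
        if cv_cost < best_cost:
            best_tree, best_cost = tree.fit(X, y), cv_cost
    return best_tree, best_cost
```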
Figure 3. Classification tree obtained for UNSW using the Entropy model and the Total group of attributes (table 4). The numbers given indicate the pixel values that were used as thresholds for each node condition.
Figure 4. Classifications using the Entropy model and the Total group of attributes for: (a) UNSW; (b) Bathurst; (c) Fairfield; and (d) Memmingen.

5. Results and analysis

5.1 Model evaluation

5.1.1 Model validation and testing. In order to find a simpler tree that performs as well as, or better than, more complex trees, a 10-fold cross-validation was carried out. The 10-fold cross-validation technique has been demonstrated to produce highly accurate results without requiring an independent dataset for assessing the accuracy of the model. Research has shown that little is gained by using more than 10 partitions (Sherrod 2008), and hence this validation test was based on 10 partitions. First, all of the observations in the training dataset were used to build the tree. This is called the reference or unpruned tree, and is the tree that best fits the training dataset. Then, the original training data were divided equally into 10 subsamples. Of the 10 subsamples, a single subsample was retained as the validation data for testing the model, and the remaining 9 subsamples were used as training data to build a tree, which was then used to classify the retained subsample. The classification error for those data was estimated and the number of terminal nodes of the tree recorded. The cross-validation process was then repeated, with each of the 10 subsamples used exactly once as the validation data. Once the 10 test trees had been built, their classification error rates as a function of tree size were averaged. This averaged error rate for a particular tree size is known as the cross-validation cost (CV cost). Figure 5 shows that the validation error decreases as the tree size grows but, beyond a certain point, increasing the tree size increases the error rate. The simplest tree is the tree with the minimum cross-validation cost.
Figure 5. The classification error for the UNSW test area using the Entropy model and the Total group of attributes (table 4), plotted against the number of terminal nodes.
The reference tree is then pruned back to the number of nodes matching the size that produces the minimum cross-validation cost. The pruning is done in a stepwise fashion, removing the least important nodes during each pruning cycle. The decision as to which node is the 'least important' is based on the cost-complexity measure described by Breiman et al. (1984). It is important to note that the test trees built during the cross-validation process are used only to find the optimal tree size. Their structure (which may differ in each test tree) has no bearing on the structure of the reference tree, which is constructed using the full training dataset. The reference tree pruned back to the optimal size determined by cross-validation is the best tree to use for the classification process.

The test was repeated for the four test areas using the three classification tree models and the Total group of attributes. Figure 6 shows the minimum errors obtained for the validation data (the error corresponding to the best choice). The Entropy model produced the most accurate results, while the Gini and Gain Ratio models produced lower classification accuracies that were comparable to each other.
Figure 6. Cross-validation errors of the best tree choices.

5.1.2 Classification accuracies. To evaluate the ability of the classification tree models to detect features using various combinations of the test datasets, the attributes described in section 4.1.2 were applied in the seven separate groups of table 4: the Red group; Green group; Blue group; Intensity/IR group; DSM group; nDSM group; and the Total group. The classification process, with the best trees derived from the cross-validation process, was then performed 21 times for each test area (7 groups of attributes for each of the 3 classification tree models). The contributions of each group of attributes to the overall classification accuracies were computed and plotted against the groups of attributes for the four test areas, as shown in figure 7(a)-(d).
Figure 7. Contribution of each group of attributes (table 4) determined from the accuracy of the classifications using the Entropy, Gain Ratio and Gini models for: (a) UNSW; (b) Bathurst; (c) Fairfield; and (d) Memmingen.

The results indicate a clear dependence on the range of input data included in the tests. Figure 7 shows that using attributes generated from, or including, height data (the DSM, nDSM and Total groups of attributes) resulted in significantly higher classification accuracies. In these cases the Entropy classification tree model produced the most accurate classifications, followed by the Gini and Gain Ratio models. On the other hand, using attributes generated from the red, green or blue bands or from the intensity/IR image resulted in lower classification accuracies; here the Entropy, Gain Ratio and Gini models produced classification accuracies comparable to one another. The use of attributes derived from the IR image in the Memmingen case study marginally improved the overall classification accuracy to about 57%, compared with accuracies of 54%, 54% and 55% obtained using attributes from the lidar intensity image as the IR image data in the UNSW, Bathurst and Fairfield case studies respectively.

Figure 8, which is a typical example of the performance of these tests, shows the classifications in a sub-area of the Fairfield test area using the Entropy model. Focusing on the buildings beside the white arrows, we can see that the Total group of attributes performed the best (figure 8(h)), while the Red, Green and Blue groups (figures 8(b), (c) and (d) respectively) performed the worst. The Intensity/IR and DSM groups (figures 8(e) and (f))
performed better than the Red, Green and Blue groups. The nDSM group (figure 8g) performed almost as well as the Total group of attributes.
Figure 8. Classification results using the Entropy model for 7 groups of attributes in a sub area of Fairfield. (a) Original image; classifications using attributes from: (b) the Red group; (c) the Green group; (d) the Blue group; (e) the Intensity/IR group; (f) the DSM group; (g) the nDSM group; (h) the Total group of attributes. (Red: buildings, dark green: trees, black: roads and light green: ground).
5.1.3 Relative importance of variables. The abilities of the Entropy, Gain Ratio and Gini models to take advantage of the input data were assessed by comparing the relative importance of each attribute as selected by each of the three models. Matikainen et al. (2007) demonstrated the relative importance of different attributes in the classification by counting the total number of pixels passing through the nodes with a given attribute. If a pixel passed more than one node with the same attribute, it was counted more than once for that attribute; the number of training pixels per attribute can therefore exceed the total number of pixels in the dataset. Figure 9(a)-(d), a typical example of this test, shows the number of pixels passing through nodes with each selected attribute from the Total group of attributes for the four test areas in the case of the Entropy model.
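With a fitted scikit-learn tree, the pixel-counting measure of Matikainen et al. (2007) described above can be reproduced from the decision path of each training pixel. A hedged sketch follows; the fitted `tree`, training matrix `X` and `feature_names` list are assumed inputs, not artefacts of the original study.

```python
import numpy as np

def attribute_pixel_counts(tree, X, feature_names):
    """Count, per attribute, the training pixels passing through nodes that
    split on that attribute; a pixel crossing several such nodes is counted
    once per node, as in Matikainen et al. (2007)."""
    indicator = tree.decision_path(X)                  # (pixels x nodes), sparse
    node_counts = np.asarray(indicator.sum(axis=0)).ravel()
    counts = {}
    for node, feat in enumerate(tree.tree_.feature):
        if feat >= 0:                                  # negative values mark leaves
            name = feature_names[feat]
            counts[name] = counts.get(name, 0) + int(node_counts[node])
    return counts
```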
Figure 9. Contribution of individual attributes to the classification results using the Entropy model and the Total group of attributes for: (a) UNSW; (b) Bathurst; (c) Fairfield; and (d) Memmingen.

By analysing the results from the three splitting models (Entropy, Gain Ratio and Gini) we found that some attributes were selected for only one test area, while other attributes were never selected. We considered an attribute important for the classification process using a certain classification tree model if it was selected by that model for at least two of the four case studies and resulted in a higher number of pixels passing through its nodes compared with other attributes. Table 6 shows the most useful attributes for each land cover class as selected by each classification tree model. It has been noted by Matikainen et al. (2007) that there can also be useful attributes among those that were not selected. That is because at each node the algorithm selects the best split according to the splitting criterion, yet there can be several attributes and splits that would be almost equally good, and the lower splits in the tree also depend on the splits selected earlier. In order to address this problem and to ensure the selection of the most useful attributes for each model, the classifications of the first two test areas were also undertaken using the same group of attributes and the Self-Organizing Map (SOM) classifier (Kohonen 1990) by Salah et al. (2009b). With the SOM classifier, most of the attributes performed almost equally, except for the few attributes that resulted in higher classification accuracies. The attributes that were most useful for detecting each land cover class using the SOM classifier were compared with the corresponding ones for the Entropy, Gain Ratio and Gini models, as shown in table 6.

Table 6. The most useful attributes for each land cover class as selected by Entropy, Gain Ratio, Gini and SOM.
Entropy
  UNSW: B, R, intensity, strength_DSM, entropy/nDSM, nDSM
  Bathurst: B, DSM, entropy/DSM, G, nDSM, NDVI, strength_R
  Fairfield: B, DSM, G, intensity, nDSM, strength/nDSM
  Memmingen: B, DSM, entropy/nDSM, G, GLCM_mean/DSM, GLCM_mean/G, GLCM_mean/nDSM, GLCM_mean/R, homogeneity_B, homogeneity_nDSM, IR, nDSM, NDVI, R, strength/G

Ratio
  UNSW: B, R, strength/nDSM, nDSM
  Bathurst: B, R, nDSM
  Fairfield: B, R, strength/nDSM, nDSM
  Memmingen: B, R, entropy/nDSM, DSM, GLCM_mean/DSM, homogeneity_B

Gini
  UNSW: B, strength/nDSM, nDSM
  Bathurst: B, nDSM
  Fairfield: B, strength/nDSM, nDSM
  Memmingen: B, entropy/nDSM, DSM, GLCM_mean/G, homogeneity_B

SOM
  UNSW: homogeneity/DSM, entropy/nDSM, homogeneity/nDSM, intensity
  Bathurst: strength/nDSM, homogeneity/nDSM, slope/nDSM
  Fairfield: entropy/nDSM, homogeneity/nDSM, nDSM
  Memmingen: entropy/nDSM, NDVI, intensity, homogeneity/DSM, homogeneity/nDSM
The results show the ability of the Entropy model to take advantage of many more attributes, followed by the Gain Ratio and Gini models. The numbers of attributes selected as useful by the Entropy model were 6, 7, 6 and 15 for UNSW, Bathurst, Fairfield and Memmingen respectively. For the Gain Ratio and Gini models, these numbers were (4, 3, 4 and 6) and (3, 2, 3 and 5) respectively. There was also a relatively high degree of similarity between the attributes selected by the Entropy model and by the SOM classifier. Furthermore, the use of the IR image data in the Memmingen area increased the number of attributes selected from the Total group compared with the numbers selected in the UNSW, Bathurst and Fairfield case studies, where the lidar intensity image was used, as shown in table 6. Finally, the results indicate the relative importance of the nDSM and its derivatives (entropy, homogeneity and strength) for the classification process.

5.1.4 Transferability of classification trees from one dataset to another. Tests were carried out to determine how well the classification trees derived from one dataset can be used to classify the other datasets. This is described as the transferability of the classification trees. The classification tree derived for the UNSW test area using the Total group of attributes for each of the Entropy, Gain Ratio and Gini models was used to classify the other three test areas, and the results are shown in figure 10.
Figure 10. The classification performance when the classification tree derived for the UNSW test area using the Total group of attributes was used to classify the other three test areas, for the 3 classification tree models.

The results in figure 10 indicate that the Entropy model yielded marginally higher classification accuracies when the UNSW tree was used to classify the other datasets (84.5%, 92.8% and 33.3% for the Bathurst, Fairfield and Memmingen datasets respectively), followed by the Gini model (84.5%, 91.2% and 32.3%) and the Gain Ratio model (83%, 87.2% and 29.7%). In general, the classification trees derived for UNSW, Bathurst and Fairfield achieved almost equally high classification accuracies when used to classify one another's datasets, but poor accuracy when used to classify the Memmingen test area. The classification tree generated from the UNSW dataset achieved a high classification accuracy of 92.8% when used to classify the Fairfield dataset, since both were captured using the same type of lidar sensor (Optech) and the images have approximately equal pixel sizes (10 and 15 cm respectively). The same classification tree achieved a classification accuracy of 84.5% when used to classify the Bathurst dataset, even though the images have different pixel sizes (10 and 50 cm respectively), because the lidar data were derived from two lidar systems (Optech and Leica) that operate at approximately the same wavelengths (1.047 µm and 1.064 µm respectively) and with the same type of scanning system. For the Memmingen test area, the UNSW classification tree achieved a very poor classification accuracy (33.3%), since the images have different pixel sizes (10 and 50 cm respectively) and the lidar data were derived from different sensors (Optech and TopoSys) that operate at different wavelengths (1.047 and 1.56 µm respectively) and with different scanning systems (a scanning mirror and a fibre optic system respectively). Furthermore, the Memmingen image is a CIR image, while for the other three test areas RGB images plus the lidar intensity image were used. The scene characteristics, comprising vegetation and building materials, in Australia and Germany differ significantly, and it is therefore not
surprising that the same classification trees could not be used for the classification of scenes in Australia and Germany. However, these results confirm that the classification tree generated for one test area can be used to classify other areas, provided the laser data were captured using the same type of lidar sensor and the multispectral images have approximately equal pixel sizes.

5.2 Evaluation of building detection results

5.2.1 Post-classification smoothing. The images classified using the Entropy model and the Total group of attributes, as the most accurate classification results, were used to evaluate the ability of classification trees to detect buildings. Since this study was primarily interested in building detection, buildings were separated by converting the classified image to a binary image: the digital numbers of building pixels were set to one, while those of non-building pixels (roads, trees and ground) were set to zero. Then, the smaller homogeneous building regions in the raster were merged into larger neighbouring homogeneous regions, or deleted, according to a 1 m distance threshold and a 30 m2 area threshold respectively. The area threshold represents the expected minimum building area, while the distance threshold was set to 1 m to fill in any gaps produced by the classification process. Regions were retained if they were larger than the given area threshold and/or lay within 1 m of a larger homogeneous region. Finally, building borders were cleaned by removing structures smaller than 8 pixels that were connected to the image border. This threshold was a compromise: values of less than 8 pixels may leave the original buildings uncleaned, while values greater than 8 pixels may remove parts of the buildings. The result was a black and white image that represents the detected buildings without noisy features and without holes. Figure 11(a) is a typical example derived in this study.

In order to evaluate the accuracy of the detection process, buildings were manually digitized in the multispectral image, as shown in figure 11(b), to serve as the reference data. Adjacent buildings that were joined but obviously separate were digitized as individual buildings; otherwise, they were merged into one polygon. In order to overcome the horizontal layover problem of tall objects such as buildings, roofs were first digitized and then each roof polygon was shifted, where possible, to make at least one point of the polygon coincide with the corresponding point on the ground. Finally, since the orthophotos and the lidar data correspond to different epochs, we excluded from the analysis all building polygons that were present in only one dataset.
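A hedged scipy sketch of the smoothing just described: gaps of up to 1 m between building pixels are closed, and regions below the 30 m2 minimum building area are dropped (the 8 pixel border-cleaning step is omitted for brevity). The binary mask and cell size are assumed inputs; this is not the authors' implementation.

```python
import numpy as np
from scipy import ndimage

def smooth_buildings(building_mask, cell_size, min_area=30.0, gap=1.0):
    """Post-classification smoothing of a binary building image."""
    # Merge regions separated by less than the 1 m distance threshold.
    radius = max(1, int(round(gap / cell_size)))
    closed = ndimage.binary_closing(
        building_mask, structure=np.ones((2 * radius + 1,) * 2, dtype=bool))
    # Delete connected regions smaller than the minimum building area.
    labels, n_regions = ndimage.label(closed)
    areas = ndimage.sum(closed, labels, index=np.arange(1, n_regions + 1)) * cell_size ** 2
    keep = np.zeros(n_regions + 1, dtype=bool)
    keep[1:] = areas >= min_area
    return keep[labels]
```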
Figure 11. (a) The final detected buildings; (b) the manually digitized buildings for the UNSW dataset.

5.2.2 Accuracy estimates of the building detection results. In comparison with the reference data, an average of 96% of all buildings were detected, with well defined edges and without holes. To gain an insight into the behaviour of the building detection process, completeness and correctness, as described by Rottensteiner et al. (2007), were computed as shown in equations (8) and (9). Completeness is the percentage of the actual buildings that are detected by an algorithm, and correctness is the percentage of the buildings detected by an algorithm that correspond to real buildings:
Completeness = TP / (TP + FN)    (8)

Correctness = TP / (TP + FP)    (9)
TP denotes the number of true positives, i.e. the number of entities found in both the reference and experiment datasets. FN is the number of false negatives, i.e. the number of entities in the reference dataset that were not detected automatically, and FP is the number of false positives, i.e. the number of entities that were detected but do not correspond to an entity in the reference dataset. As shown in figure 12, false positives and false negatives mostly occurred at the building outlines, because small elongated areas of buildings in the resulting binary building image were eliminated by the post-classification smoothing process. It is also clear that fences higher than 3 m, the height threshold applied to the nDSM, were detected as buildings, as shown by the green rectangles in the lower right corner of the figure. The improvement in the detection process achieved by the noise removal was also evaluated. From table 7, post-classification smoothing and noise removal improved the detection process by 3.8% and the average completeness by 5%, but decreased the average correctness by about 1.6%, which is insignificant.
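For per-pixel evaluation, equations (8) and (9) amount to simple mask arithmetic; a minimal sketch, assuming boolean detected/reference building masks:

```python
import numpy as np

def detection_quality(detected, reference):
    """Completeness (equation 8) and correctness (equation 9) from masks."""
    tp = np.sum(detected & reference)    # true positives
    fp = np.sum(detected & ~reference)   # false positives
    fn = np.sum(~detected & reference)   # false negatives
    return tp / (tp + fn), tp / (tp + fp)
```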
Figure 12. Evaluation of the results of building detection for the UNSW dataset. Black: correct building pixels; green: false positives; red: false negatives.

Table 7. No.: number of buildings. Cp, Cr: average completeness and correctness for building pixels before and after post-classification smoothing and noise removal.

             Before                     After
             No.     Cp (%)   Cr (%)    No.     Cp (%)   Cr (%)
UNSW         243     89.8     92.1      232     93.7     90.2
Bathurst     207     87.1     93.6      193     93.2     91.9
Fairfield    2041    86.3     94.1      2018    91.1     92.3
Memmingen    2098    86.2     93.4      2053    91.3     92.4

Figure 13(a)-(d) shows the completeness and correctness for buildings as a function of building size for the UNSW, Bathurst, Fairfield and Memmingen case studies respectively. For the UNSW case study, buildings of around 30 m2 were detected with completeness and correctness of around 77% and 79% respectively, and these statistics improve with increasing building size. For the Bathurst, Fairfield and Memmingen case studies, buildings of around 30 m2 were detected with completeness and correctness of around 73% and 70% respectively, again improving with increasing building size. In all cases, buildings larger than 70 m2 were detected with both completeness and correctness over 90%. The difference between completeness and correctness is a matter of 1-2%, except for buildings smaller than 70 m2, where the difference is up to 10%; this further confirms the lower reliability of detecting buildings smaller than 70 m2. The overall results are very consistent for all test areas, which were derived using 3 lidar systems differing in scanning pattern and resolution, in different urban environments in Australia and Germany with different vegetation types. It can therefore be concluded that these tests
strongly represent achievable accuracies for the detection of buildings by the method of classification trees using a combination of lidar data and images.
Figure 13. Completeness and correctness derived for classification trees plotted against building areas for: (a) UNSW; (b) Bathurst; (c) Fairfield; and (d) Memmingen.

6. Conclusion
A method for building detection based on the fusion of lidar data, multispectral aerial images and 25 auxiliary attributes by classification trees was presented for 4 test areas in different urban environments, based on lidar data derived from 3 different sensors and covering different vegetation types. Overall, the completeness and correctness for extracting buildings larger than 70 m2 were 90% or better, while these measures deteriorated to 80-85% for buildings 50 m2 in size. Tests also demonstrated that the classification trees derived from one area could be successfully transferred to another region, provided the laser data were captured using the same type of lidar sensor, the multispectral images have approximately equal pixel sizes and the ground cover types are similar.

The results show that using the Entropy splitting model for classification trees with the Total group of attributes usually performs best when measured in terms of misclassification errors, classification accuracies and the transferability of classification trees from one dataset to another. The results also show that using attributes generated from an infrared band instead of those generated from the lidar intensity image improved the classification accuracy by about 3%. An investigation into the relative importance of the
input data showed the relative importance of the nDSM and its derivatives (entropy, homogeneity and strength) for the classification process. This investigation has also shown that some attributes were useful only for the earlier SOM study and others only for classification trees, which suggests that the construction of a hybrid classifier based on multiple classifiers operating simultaneously, taking advantage of all attributes, should achieve a more effective and robust decision making process. Finally, these results are a prerequisite to the geometrical reconstruction of buildings from roof planes, which should be detectable from the building regions determined by fusing aerial imagery and lidar data, as demonstrated in this study.

Acknowledgements

The authors wish to acknowledge AAMHatch for the provision of the UNSW and Fairfield datasets, the Department of Lands, NSW, Australia for the Bathurst datasets and TopoSys GmbH, Germany for the Memmingen datasets.

References

ABO AKEL, N., ZILBERSTEIN, O. AND DOYTSHER, Y., 2004, A robust method used with orthogonal polynomials and road network for automatic terrain surface extraction from lidar data in urban areas. In 20th ISPRS Congress, Commission III, Geo-imagery Bridging Continents, XXXV/B3, 12-23 July 2004, Istanbul, Turkey.

ALHARTHY, A. AND BETHEL, J., 2002, Heuristic filtering and 3D feature extraction from lidar data. In Proceedings of the ISPRS Commission III Symposium on Photogrammetric Computer Vision, XXXIV/3A, 9-13 September 2002, Graz, Austria, p. A-29 ff.

BALTSAVIAS, E.P., 1999, A comparison between photogrammetry and laser scanning. ISPRS Journal of Photogrammetry and Remote Sensing, 54, pp. 83-94.

BREIMAN, L., FRIEDMAN, J.H., OLSHEN, R.A. AND STONE, C.J. (Eds), 1984, Classification and Regression Trees, 358 p. (New York: Chapman & Hall).
BREIMAN, L., 1996, Technical note: some properties of splitting criteria. Machine Learning, 24, pp. 41-47.

CAI, X., SOWMYA, A. AND TRINDER, J.C., 2005, Learning to recognize roads from high resolution remotely sensed images. In 2nd International Conference on Intelligent Sensors, Sensor Networks and Information Processing, 5-12 December 2005, Melbourne, Australia; published in Lecture Notes in Computer Science, Vol. 3851/2006, pp. 868-877.

CLAUSI, D.A., 2002, An analysis of co-occurrence texture statistics as a function of grey-level quantization. Canadian Journal of Remote Sensing, 28, pp. 45-62.
ESPOSITO, F., MALERBA, D. AND SEMERARO, G., 1997, A comparative analysis of methods for pruning decision trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, pp. 476-491.

FERRO, C.J.S. AND WARNER, T.A., 2002, Scale and texture in digital image classification. Photogrammetric Engineering and Remote Sensing, 68, pp. 51-63.

FÖRSTNER, W. AND GÜLCH, E., 1987, A fast operator for detection and precise location of distinct points, corners and centres of circular features. In ISPRS Intercommission Workshop, June 1987, Interlaken, pp. 281-305.

FRANK, O. (Ed.), 1981, A Survey of Statistical Methods for Graph Analysis, pp. 110-155 (San Francisco: Jossey-Bass).
HALL-BEYER, M., 2008, The GLCM tutorial home page. Available online at: http://www.fp.ucalgary.ca/mhallbey/tutorial.htm (accessed 25 February 2009).

HARALICK, R.M., 1979, Statistical and structural approaches to texture. Proceedings of the IEEE, 67, pp. 786-804.

KOHONEN, T., 1990, The self-organizing map. Proceedings of the IEEE, 78, pp. 1464-1480.

KRAUS, K. AND PFEIFER, N., 1998, Determination of terrain models in wooded areas with airborne laser scanner data. ISPRS Journal of Photogrammetry and Remote Sensing, 53, pp. 193-203.

KRAUS, K., 2003, Laser-Scanning - ein Paradigmawechsel in der Photogrammetrie. Bulletin SEV/VSE (invited), 9, pp. 19-22.

LI, Y. AND WU, H., 2008, Adaptive building edge detection by combining lidar data and aerial images. In 21st ISPRS Congress, XXXVII/B1, 3-11 July 2008, Beijing, China.

LIPPITT, C.D., ROGAN, J., LI, Z., EASTMAN, J.R. AND JONES, T.G., 2008, Mapping selective logging in mixed deciduous forest: a comparison of machine learning algorithms. Photogrammetric Engineering and Remote Sensing, 74, pp. 1201-1211.

LIU, W. AND WU, E.Y., 2005, Comparison of non-linear mixture models: sub-pixel classification. Remote Sensing of Environment, 94, pp. 145-154.

LU, Y.H., TRINDER, J.C. AND KUBIK, K., 2006, Automatic building detection using the Dempster-Shafer algorithm. Photogrammetric Engineering and Remote Sensing, 72, pp. 395-404.

MAAS, H. AND VOSSELMAN, G., 1999, Two algorithms for extracting building models from raw laser altimetry data. ISPRS Journal of Photogrammetry and Remote Sensing, 54, pp. 153-163.

MATIKAINEN, L., KAARTINEN, H. AND HYYPPÄ, J., 2007, Classification tree based building detection from laser scanner and aerial image data. In Proceedings of the ISPRS Workshop on Laser Scanning 2007 and SilviLaser 2007, 12-14 September 2007, Espoo, Finland, pp. 280-287.

MORGAN, M. AND TEMPFLI, K., 2000, Automatic building extraction from airborne laser scanning data. In 19th ISPRS Congress, 33(B3/2), 16-22 July 2000, Amsterdam, The Netherlands, pp. 616-623.

NEIDHART, H. AND SESTER, M., 2008, Extraction of building ground plans from lidar data. In 21st ISPRS Congress, Vol. 37, Part B2, 3-11 July 2008, Beijing, China.

PRIESTNALL, G., JAAFAR, J. AND DUNCAN, A., 2000, Extracting urban features from LIDAR digital surface models. Computers, Environment and Urban Systems, 24, pp. 65-78.
QUINLAN, J.R., 1987, Simplifying decision trees. International Journal of Man-Machine Studies, 27, pp. 227-248.

ROTTENSTEINER, F., TRINDER, J., CLODE, S. AND KUBIK, K., 2007, Building detection by fusion of airborne laser scanner data and multi-spectral images: performance evaluation and sensitivity analysis. ISPRS Journal of Photogrammetry and Remote Sensing, 62, pp. 135-149.

SAFAVIAN, S. AND LANDGREBE, D., 1991, A survey of decision tree classifier methodology. Available online at: http://cobweb.ecn.purdue.edu/~landgreb/SMC91 (accessed 1 June 2009).

SALAH, M., TRINDER, J. AND SHAKER, A., 2009a, Evaluation of the self-organizing map classifier for building detection from lidar data and multispectral aerial images. Journal of Spatial Science, accepted.
SALAH, M., TRINDER, J. AND SHAKER, A., 2009b, Aerial images and lidar data fusion for automatic feature extraction using the self-organizing map (SOM) classifier. In ISPRS Archives, Vol. XXXVIII, Part 3/W8, 1-2 September 2009, Paris, France, pp. 317-322.

SANDER, O., 2008, Problems in automated building reconstruction based on dense airborne laser scanning data. In 21st ISPRS Congress, Vol. 37, Part B3a, 3-11 July 2008, Beijing, China.
SHANNON, C.E. (Ed.), 1949 (reprinted 1998), The Mathematical Theory of Communication (Urbana, IL: University of Illinois Press).

SHERROD, P.H., 2008, DTREG Predictive Modeling Software: Users Manual. Available online at: www.dtreg.com/DTREG.pdf.

VÖGTLE, T. AND STEINLE, E., 2003, On the quality of object classification and automated building modeling based on laser scanning data. In Proceedings of the ISPRS Working Group III/3 Workshop on 3-D Reconstruction from Airborne Laserscanner and InSAR Data, XXXIV/3W13, 8-10 October 2003, Dresden, Germany, pp. 149-155.

VOSSELMAN, G., GORTE, B.G.H., SITHOLE, G. AND RABBANI, T., 2004, Recognising structure in laser scanner point clouds. In 20th ISPRS Congress, 35-B8, 12-23 July 2004, Istanbul, Turkey, pp. 33-38.

WASKE, B., 2007, Classifying multisensor remote sensing data: concepts, algorithms and applications. PhD thesis, Bonn University, Germany.

ZAMBON, M., LAWRENCE, R., BUNN, A. AND POWELL, S., 2006, Effect of alternative splitting rules on image processing using classification tree analysis. Photogrammetric Engineering and Remote Sensing, 72, pp. 25-30.