KSCE Journal of Civil Engineering (2013) 17(2):486-497 DOI 10.1007/s12205-013-1800-0
Surveying and Geo-Spatial Information Engineering
www.springer.com/12205
A Multispectral Image Segmentation Approach for Object-based Image Classification of High Resolution Satellite Imagery
Young Gi Byun*, You Kyung Han**, and Tae Byeong Chae***
Received November 15, 2011 / Revised May 8, 2012 / Accepted June 27, 2012
Abstract: Image segmentation has been recognized as an essential process for object-based, rather than pixel-based, classification of high-resolution satellite imagery. This paper presents an efficient image segmentation method that considers the spatial and spectral information of high-resolution pan-sharpened imagery. First, we conduct multispectral nonlinear edge-preserving smoothing and extract the multispectral edge, which is used as valuable information for seed selection and image segmentation. The initial seeds are automatically selected using the proposed edge-variation-based seed selection method, which uses the obtained multispectral edge in a local region. After automatic selection of significant seeds, image segmentation is achieved by applying the modified seeded region growing procedure, which integrates the multispectral and gradient information in the image to provide homogeneous image regions with accurate and closed boundaries. Experimental results on two multispectral satellite images show that the proposed approach is superior to previous segmentation techniques in both visual evaluation and quantitative comparative assessment.
Keywords: image segmentation, object-based classification, seed selection, unsupervised segmentation evaluation, multispectral edge
1. Introduction
There is a growing need for technologies that will enable automatic feature extraction and classification for urban applications, owing to the constantly increasing public availability of high-resolution satellite data sets. In extracting and classifying major object feature information from high-resolution satellite imagery, it is very difficult to obtain satisfactory results using pixel-based classification methods, which only utilize spectral information. The main reasons for this are that they have considerable difficulty dealing with the abundant information of high-resolution satellite data, they produce inconsistent salt-and-pepper classification maps, and they are not capable of extracting objects of interest (Van der Sande et al., 2003; Karantzalos and Argialas, 2009a). To overcome these problems, object-oriented classification, which subdivides the image into meaningful homogeneous regions (or image objects) and classifies them on the basis of not only spectral properties but also shape, texture, and other topological features, has recently been proposed (Van der Sande et al., 2003; Darwish et al., 2003; Wang et al., 2004; Lizarazo and Elsner, 2009). One of the most important issues in object-oriented classification is the accurate segmentation of the input image because
the quality of object-oriented classification is directly affected by the image segmentation quality (Richard et al., 2006). Image segmentation plays a crucial role in various image processing applications in several domains, including medicine and remote sensing. It describes the task of partitioning an image into several segments or regions. Generally, segmentation algorithms can be broadly divided into edge-based, cluster-based and region-based algorithms (Zhang, 1997; Guindon, 1997). Edge-based algorithms detect object contours using the discontinuity property, whereas region-based algorithms focus on grouping pixels using the similarity property based on certain homogeneity criteria. The spatial domain of a processed image is used in both cases. Cluster-based segmentation algorithms use an iterative moving method that attempts to search for a cluster configuration to separate distinct structures in the spectral feature domain. The clustering technique does not consider spatial information. Edge-based segmentation has not been very successful because of its poor performance in the detection of textured objects (Gong et al., 2006). This approach also requires post-processing procedures, such as edge tracking or gap filling, to identify object contours (Kermad and Chehdi, 2002). A large number of region-based segmentation algorithms have therefore been proposed for segmentation of remote sensing
*Senior Researcher, Korea Aerospace Research Institute (KARI), Daejeon 305-333, Korea (Corresponding Author, E-mail: [email protected])
**Ph.D. Candidate, Dept. of Civil and Environmental Engineering, Seoul National University, Seoul 151-744, Korea (E-mail: [email protected])
***Senior Researcher, Korea Aerospace Research Institute (KARI), Daejeon 305-333, Korea (E-mail: [email protected])
imagery (Kettig and Landgrebe, 1976; Eddy et al., 2008; Aiping et al., 2010). Most of these methods are based on conventional region growing (Gonzalez and Woods, 2002) and morphological watershed transformation (Li and Xiao, 2007). Mott et al. (2002) proposed a selective region-growing algorithm that combines the evaluation of class-specific spectral information and the immediate vicinity relations of pixels for classification of IKONOS multispectral imagery. As a morphological segmentation algorithm, the watershed transform is widely used in remote sensing owing to the resulting closed and connected regions. For example, Karantzalos and Argialas (2006b) proposed a scheme for improving edge detection and watershed segmentation for detecting olive trees from an IKONOS panchromatic image. Chen et al. (2006) presented an approach for segmentation of an IKONOS panchromatic image using watershed transformation based on image morphology combined with a multiple-scale region-merging process. An extension of the watershed transformation for multispectral image segmentation was presented by Li and Xiao (2007), who embedded a vector-based morphological approach to compute the gradient image. A region-based segmentation method combining texture and spectral distribution was recently proposed by Aiping et al. (2010); it consists of three steps: hierarchical splitting, modified agglomerative merging and pixelwise refinement. Although several methods based on conventional region growing or watershed transformation are able to successfully segment satellite images in some cases, there are drawbacks to using these two methods for segmenting high-resolution multispectral satellite images. Conventional region growing is highly sensitive to the threshold value for stopping the growth of a region, and it produces different results depending on the selection of the scanning direction for seed points (Kai and Muller, 1991).
The watershed transformation is intrinsically executed on a gradient image and cannot use the spectral information of a multispectral image, which results in over-segmentation due to the existence of many local minima (Zhang et al., 2008). It generally requires a complicated post-processing step, such as region merging, which can use spectral information to eliminate the over-segmentation problem. Most methods based on conventional region growing or watershed transformation cannot be wholly free from the drawbacks mentioned above. The purpose of this paper is to develop a new method for the segmentation of high-resolution satellite imagery that takes into account multiple-feature information, such as multispectral and edge information, and is more robust to seed selection. To do this, we take advantage of the Seeded Region Growing (SRG) approach, which is robust and rapid and does not require a tuning parameter (Adams and Bischof, 1994). In our approach, the SRG algorithm is modified to make use of information in all spectral bands and edge information for better image segmentation. A seed selection method based on the local variation characteristics of a multispectral edge is also developed to obtain seeds for the Modified SRG (MSRG) procedure.
The remainder of this paper is organized as follows. In section 2, we describe in more detail the framework and component parts of our algorithm, and evaluation methods for comparison of different segmentation methods are presented in section 3. Section 4 provides experimental results and discussions, which also include a comparison with other methods. Conclusions are presented in section 5.
2. MSRG Image Segmentation Algorithm
The proposed image segmentation algorithm consists of two stages. In the first stage, the high-resolution images are fused to generate high-resolution multispectral images (pan-sharpened images), and edge-preserving smoothing is conducted to alleviate spectral distortion and image noise in the fused images. Multispectral edge information is then extracted using an entropy operator for the modified SRG procedure. In the second stage, initial seed points are extracted through the proposed edge variation-based seed selection, which uses the obtained multispectral edge in a local region. Image segmentation is achieved by applying the MSRG procedure, which integrates the multispectral and gradient information to provide homogeneous image regions with accurate and closed boundaries.

2.1 Multispectral-based Image Processing

2.1.1 Multispectral Nonlinear Edge-Preserving Smoothing
A certain amount of colour distortion occurs when the multispectral image and panchromatic image are fused. This produces noise pixels, which may result in substantial errors, such as regions being too small in the segmentation process. Therefore, image smoothing is required before segmentation, especially when the images are degraded by noise. In general, since each individual band of a multispectral image can be considered a monochrome image, image smoothing of a multispectral image is carried out on each band separately. However, because each processing step is usually accompanied by a certain error, the formation of the output spectral vector from separately processed multispectral components usually produces spectral artifacts. In order to ensure robust performance of a segmentation algorithm, an enhancement algorithm that can process all bands simultaneously is necessary.
In this paper, we use the multi-valued anisotropic diffusion method proposed by Sapiro and Ringach (1996), which can handle all bands of a multispectral image simultaneously and preserve edge information while removing image noise. Let I(x1, x2) be a multivalued image with components I^n(x1, x2), n = 1, ···, N. To describe the image discontinuities of I, let us look at the differential of I. In Euclidean space:

$dI = \frac{\partial I}{\partial x_1}dx_1 + \frac{\partial I}{\partial x_2}dx_2$

and its squared norm is given by:

$\|dI\|^2 = \sum_{i=1}^{2}\sum_{j=1}^{2} g_{ij}\,dx_i\,dx_j$  (1)
In matrix form:

$\|dI\|^2 = \begin{bmatrix} dx_1 \\ dx_2 \end{bmatrix}^{T} \begin{bmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{bmatrix} \begin{bmatrix} dx_1 \\ dx_2 \end{bmatrix}, \quad \text{where } g_{ij} = \frac{\partial I}{\partial x_i} \cdot \frac{\partial I}{\partial x_j}$  (2)

This quadratic form is called the first fundamental form, which allows the measurement of changes in the image. The matrix (g_ij) has two positive eigenvalues, λ+ and λ−, and the corresponding orthogonal eigenvectors, θ+ and θ−, give the respective directions. The anisotropic vector diffusion model, based on the partial derivative of the multispectral image at diffusion time t, is defined as follows:

$\frac{\partial I}{\partial t} = g(\lambda_+ - \lambda_-)\,\frac{\partial^2 I}{\partial \theta_-^2}$  (3)
Equation (3) includes the second derivative of a vector in the tangential direction of the maximal variation and a decreasing function of the difference between the two eigenvalues, where the diffusion time t serves as a scale parameter: a larger value of t gives a more smoothed image. A diffusion time of 10 was selected based on our experiments. In order to preserve edges and simultaneously smooth within more homogeneous regions, the diffusivity function g(·) is chosen as a decreasing nonnegative function:

$g(s) = \exp\left(-\frac{s^2}{k^2}\right), \quad \text{where } s = \lambda_+ - \lambda_-$  (4)
where the constant k controls the sensitivity to edges; it was set to 16 in our experiments.

2.1.2 Generating an Edge Image from a Multispectral Image
Edge information provides simplified image information while preserving the dominant geometric structures and spatial relationships found in the original image. Thus, it can be used as useful additional information for seed selection and image segmentation. Edge information extracted from a single panchromatic image is commonly used in most segmentation approaches. However, with respect to the reliability of the extracted edge information, it is better to take into account the edge information of all bands simultaneously than to process only a single band. In this paper, a fast, simple edge operator based on entropy, proposed by Shiozaki (1986), is chosen for identifying the geometric structures of multispectral imagery. The entropy operator is a nonlinear rotation-invariant filter that calculates the entropy of brightness for the central pixel of the window mask. The entropy measure of an individual band, which provides information on the homogeneity of the window mask, is defined as follows:

$SH = -\sum_{i=0}^{n} p_i \log p_i \,/\, \log(n+1), \quad \text{where } p_i = a_i \Big/ \sum_{j=0}^{n} a_j$  (5)

Here, SH is the entropy measure of the central pixel in the local window, n is the total number of neighboring pixels, a_i represents the pixel values in the window, and p_i is the corresponding probability. The integrated entropy measure containing the contrast information of the individual bands can be expressed as a linear combination of the individual entropy measures:

$H = \sum_{i=0}^{N} q_i\, SH_i, \quad q_i = b_i \Big/ \sum_{i=0}^{N} b_i$  (6)
Here, N is the total number of bands, and b_i and SH_i indicate the band value and entropy measure of the individual bands, respectively. The entropy of a multispectral image takes high values in edge regions and low values in flat regions, within the range [0, 1].

2.2 Image Segmentation using the MSRG Procedure

2.2.1 Determining Seed Points
The SRG algorithm starts with initial seed points and attempts to aggregate the as-yet unlabelled pixels into one of the given seed regions. An obvious way to improve the SRG algorithm is to automate the process of seed selection. Chung et al. (2008) developed an automatic window-based seed generation technique to automate the SRG algorithm, in which homogeneous pixels were selected as seed points when their homogeneity levels were greater than a predefined threshold; the homogeneity level of a pixel was determined by the standard deviation and discontinuity of the subimage covered by the window mask. An edge-oriented seed selection method was presented by Fan et al. (2005), in which the initial seed points for SRG were automatically obtained from the centroids of adjacent labelled color edges. He et al. (2007) presented an automatic seed selection method in which initial seed points were extracted from the scalar force field derived from the Gradient Vector Flow (GVF) by iteratively minimizing an energy functional. Most of these methods, however, may induce an over-segmentation problem because they extract too many seed points in areas having homogeneous surface characteristics or over-detected color edges. In this section, we propose an edge-variation-based seed selection algorithm that automatically selects suitable seed points for more meaningful region growth. The basic concept for extracting seed points is to use the local minima of the image's edge information. This procedure is performed on the predefined multispectral edge image.
From the edge image, each central pixel moves toward the neighboring pixel having the lowest gradient, and this course is repeated until the point converges to a local minimum and no further movement is made. The region sharing the same local minimum shows a homogeneous property, so these local minima can be used as the seed points. However, owing to various effects of the image acquisition characteristics, subtle differences in edge variation are present even among points extracted from the same region. If this property is not considered when the seed points are extracted, too many seed points are extracted, causing over-segmentation. Therefore, we extract the final seed points by considering the difference in value between a detected local minimum and its eight neighboring pixels. If there are pixels for which the difference is within a tolerable range, the points are re-spread to other local minima. The final seed points are extracted through
this process, and the tolerable range is determined by the edge difference between a local minimum and its neighboring pixels:

$Tor(i) = \frac{\min_k\left(\left|G_i - G_k\right|\right)}{G_{\max} - G_{\min}}, \quad k = 1, 2, \ldots, N$  (7)
where G_i is the edge magnitude at a local minimum point i, N is the number of neighbours of G_i, G_k represents the N neighboring edge magnitudes, and G_max and G_min are the maximum and minimum edge magnitudes in the image, respectively. Local minima for which the value of Tor is greater than a specified threshold are selected as seed points. To promote a better understanding of the proposed seed extraction method, Fig. 1 shows, as an example, the process of extracting seed points in an edge image with 8-bit radiometric resolution. (Note that the actual multispectral edge images used for extracting seed points have floating-point pixel values.) The green pixels in Fig. 1 indicate local minima. If the proposed method is applied with a threshold of 10 as the tolerable range of movement, the local minimum point with an edge value of 30 is moved toward the neighbor with edge value 31, which has the smallest difference among its neighboring pixels within the tolerable range, and it spreads until it finally finds a new local minimum with value 0. Therefore, the level of image segmentation in the subsequent MSRG procedure can be determined by adjusting the threshold of the tolerable range in the proposed seed extraction method.
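The descent-to-local-minima seed selection described above can be sketched as follows. This is a minimal single-band illustration with our own function names, applying the tolerance test of Eq. (7) to each detected minimum (the paper applies it to the multispectral entropy edge image):

```python
import numpy as np

def descend_to_minimum(edge, start):
    """Follow the steepest-descent path from `start` to a local minimum
    of the edge-magnitude image (8-connected neighborhood)."""
    h, w = edge.shape
    y, x = start
    while True:
        best = (y, x)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and edge[ny, nx] < edge[best]:
                    best = (ny, nx)
        if best == (y, x):      # no strictly lower neighbor: minimum reached
            return best
        y, x = best

def select_seeds(edge, threshold):
    """Edge-variation-based seed selection (simplified sketch).
    A local minimum is kept as a seed only if its normalized contrast to
    the closest-valued neighbor, Tor of Eq. (7), exceeds `threshold`."""
    h, w = edge.shape
    g_range = edge.max() - edge.min()
    if g_range == 0:
        return []               # flat image: no meaningful minima
    minima = {descend_to_minimum(edge, (y, x))
              for y in range(h) for x in range(w)}
    seeds = []
    for (y, x) in minima:
        diffs = [abs(edge[y, x] - edge[ny, nx])
                 for ny in range(max(0, y - 1), min(h, y + 2))
                 for nx in range(max(0, x - 1), min(w, x + 2))
                 if (ny, nx) != (y, x)]
        if min(diffs) / g_range > threshold:
            seeds.append((y, x))
    return seeds
```

Raising the threshold keeps only the more distinct minima as seeds, which is how the level of segmentation is controlled in the subsequent growing step.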
Fig. 1. Example of Extracting Seed Points Considering the Variation Property of the Edge

2.2.2 The Modified SRG
We extend the grey-level version of the SRG procedure into multidimensional space and implement image segmentation using a novel similarity measure that combines structural and spectral information to provide homogeneous image regions with accurate and closed boundaries. Given a set of seeds S1, S2, ···, Sn, each step of SRG involves adding one additional pixel to one of the seed sets. Moreover, these initial seeds are further replaced by the centroids of the homogeneous regions R1, R2, ···, Rn generated by adding the additional pixels one at a time. The pixels in the same region are labelled with the same symbol, and the pixels in different regions are labelled with different symbols. Let H be the set of all unallocated pixels that are adjacent to at least one of the labelled regions:

$H = \left\{ (x, y) \notin \bigcup_{i=1}^{n} R_i \;\middle|\; N(x, y) \cap \bigcup_{i=1}^{n} R_i \neq \emptyset \right\}$  (8)

Here, N(x, y) denotes the eight nearest neighbors of the pixel. In each step of the algorithm, one pixel is taken from the set H and added to one of the regions that the neighbors N(x, y) of the pixel intersect. The pixel is given the label of that region. All pixels of N(x, y) are then examined, and their similarity to the neighboring regions is calculated. According to that similarity measure, they are put into the set H in increasing order. The novel similarity measure, which takes into account the edge-strength difference as well as the spectral difference, is defined as:

$\varphi(x, y) = \frac{I(x, y) \cdot \bar{I}(x, y)}{\left\| \bar{I}(x, y) \right\|^2} \times \left| G(x, y) - \bar{G}(x, y) \right|$  (9)

$\text{where } \bar{I}(x, y) = \frac{1}{m}\sum_{(x, y) \in R_i} I(x, y), \quad \bar{G}(x, y) = \frac{1}{m}\sum_{(x, y) \in R_i} G(x, y)$

where I(x, y) indicates the pixel vector of the pixel (x, y) under test, Ī(x, y) is the mean spectral vector for each region, G(x, y) is the edge magnitude of the multispectral edge map, Ḡ(x, y) is the mean edge strength for each region, R_i is the set of pixels included in each region, m is the number of pixels in each region, and ‖·‖ and · represent the vector norm and inner product, respectively. If N(x, y) intersects two or more of the labelled regions, φ(x, y) takes the value of i such that N(x, y) intersects R_i and φ(x, y) is minimized:

$\varphi(x, y) = \min_{(x, y) \in H} \left\{ \varphi(x, y, R_j) \mid j \in \{1, 2, \ldots, n\} \right\}$  (10)

Fig. 2. The Algorithm Procedure of the MSRG
The SRG procedure is repeated until all pixels in the image have been allocated to regions. The implementation of SRG involves ordering the data of H as a linked list according to φ(x, y); Adams and Bischof (1994) refer to this list as the Sequentially Sorted List (SSL). In this paper, a heap-based Priority-Queue (PQ) data structure is applied instead of the SSL because the time complexity of the priority queue is lower than that of the SSL. An overview of the algorithm for implementing the proposed method is presented in Fig. 2.
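The seeded-growing loop with a heap-based priority queue can be sketched as follows. This is a simplified single-band version with our own names, using an absolute intensity difference to the region mean in place of the full similarity measure of Eq. (9):

```python
import heapq
import numpy as np

def seeded_region_growing(image, seeds):
    """Simplified SRG on a single-band image: grow labelled regions from
    `seeds` by repeatedly allocating the unlabelled boundary pixel most
    similar to an adjacent region. A heap-based priority queue plays the
    role of the sequentially sorted list."""
    h, w = image.shape
    labels = np.zeros((h, w), dtype=int)       # 0 = unallocated
    region_sum = [0.0] * (len(seeds) + 1)      # per-region intensity sum
    region_cnt = [0] * (len(seeds) + 1)
    heap = []                                  # (dissimilarity, y, x, label)

    def push_neighbors(y, x, label):
        mean = region_sum[label] / region_cnt[label]
        for dy, dx in ((-1, -1), (-1, 0), (-1, 1), (0, -1),
                       (0, 1), (1, -1), (1, 0), (1, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] == 0:
                heapq.heappush(heap, (abs(image[ny, nx] - mean), ny, nx, label))

    for label, (y, x) in enumerate(seeds, start=1):
        labels[y, x] = label
        region_sum[label] += image[y, x]
        region_cnt[label] += 1
        push_neighbors(y, x, label)

    while heap:
        _, y, x, label = heapq.heappop(heap)
        if labels[y, x]:                       # already allocated: stale entry
            continue
        labels[y, x] = label
        region_sum[label] += image[y, x]
        region_cnt[label] += 1
        push_neighbors(y, x, label)
    return labels
```

Stale heap entries for already-labelled pixels are simply skipped on pop, which is the usual way to emulate a decrease-key operation with a binary heap.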
3. Evaluation of MSRG Segmentation Results
In order to evaluate the performance of this approach, the best segmentation results of the proposed method are compared with the outcomes derived by a conventional region growing method (Gonzalez and Woods, 2002), the toboggan watershed algorithm (Mortensen and Barrett, 1999) and the mean shift algorithm (Comaniciu and Meer, 2002). Visual assessment, an unsupervised segmentation evaluation measure, and object-based classification accuracy are used to compare and evaluate these segmentation results.

3.1 Relative Assessment using an Unsupervised Segmentation Evaluation Method
Unsupervised evaluation methods evaluate segmentation results by using a quality evaluation function. The key advantage of unsupervised evaluation is that it enables an objective comparison of different segmentation methods without making a comparison with a manually segmented reference image, which is often referred to as ground truth. We use the evaluation functions E and Q because they require no user-defined parameters and are independent of the content and type of the image. Before describing the evaluation metrics, some notation and definitions are given. Let I be the segmented image with height Ih and width Iw, let SI be the area of the full image (i.e., SI = Ih × Iw), and let Sj be the area of region j. The evaluation function E consists of the region entropy Hr, a measure of intra-region uniformity, and the layout entropy Hl, the entropy indicating which pixels belong to which regions. The E function, which combines these two measures, is defined as:

$E = H_l(I) + H_r(I)$  (11)

$\text{where } H_l(I) = -\sum_{j=1}^{N} \frac{S_j}{S_I}\log\frac{S_j}{S_I}, \quad H_r(I) = \sum_{j=1}^{N} \frac{S_j}{S_I}\, H(R_j), \quad H(R_j) = -\sum_{m \in V_j} \frac{L_j(m)}{S_j}\log\frac{L_j(m)}{S_j}$

where N is the total number of regions, Vj is the set of all possible intensity values in region j, and Lj(m) denotes the number of pixels in region j that have intensity value m in the panchromatic image. The evaluation function Q proposed by Borsotti et al. (1998), which measures the mean squared spectral error of the segments, is defined as:

$Q(I) = \frac{\sqrt{N}}{10000 \times S_I} \sum_{j=1}^{N} \left[ \frac{e_j^2}{1 + \log S_j} + \left( \frac{N(S_j)}{S_j} \right)^2 \right]$  (12)

$\text{where } e_j^2 = \sum_{p \in R_j} \left( C_x(p) - \bar{C}_x(R_j) \right)^2, \quad \bar{C}_x(R_j) = \frac{1}{S_j}\sum_{p \in R_j} C_x(p)$
where N is the total number of regions and e_j² is the quadratic error of the spectral values of the jth region; N(Sj) represents the number of regions that have an area equal to Sj. Generally, the smaller the values of E and Q, the better the segmentation result for the image.

3.2 Quantitative Assessment using Object-based Classification Accuracy
The unsupervised evaluation measures described in the previous section have limited reliability for satellite images because they were developed for images, such as indoor photos and composed pictures, that are less complicated than satellite images. Therefore, we use an additional assessment method for evaluating the proposed process. Object-based classification is applied to the segmentation results under the same conditions. The Support Vector Machine (SVM) classifier (Wu et al., 2004), a supervised method known for its good generalization performance, is applied to the result of the proposed method as well as to the other comparison results. To apply the supervised classification, the specific classes for each site must be determined. The classes were determined by visual analysis and limited to classes that can exist in urban areas, because all sites are urban areas in high-resolution satellite images. Regions that have ambiguous and complicated properties were therefore classified as a bare-soil class.
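The entropy-based evaluation function E of Eq. (11) can be sketched as follows; a minimal version for an integer-intensity image, with our own function names:

```python
import numpy as np

def evaluation_E(intensity, labels):
    """Unsupervised segmentation quality E = Hl + Hr (Eq. 11).
    `intensity`: 2-D integer image; `labels`: same-shape region labels.
    Lower E indicates a better segmentation."""
    S_I = intensity.size
    H_l = 0.0    # layout entropy: which pixels belong to which regions
    H_r = 0.0    # expected region entropy: intra-region uniformity
    for j in np.unique(labels):
        region = intensity[labels == j]
        S_j = region.size
        H_l -= (S_j / S_I) * np.log(S_j / S_I)
        # region entropy H(R_j) over the intensity histogram of region j
        _, counts = np.unique(region, return_counts=True)
        p = counts / S_j
        H_r += (S_j / S_I) * (-(p * np.log(p)).sum())
    return H_l + H_r
```

The two terms trade off against each other: a single all-image region drives Hl to zero but inflates Hr, while extreme over-segmentation does the opposite, so minimizing their sum penalizes both failure modes.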
4. Results and Discussion

4.1 Experimental Data
To estimate the performance of the proposed image segmentation method, two high-resolution satellite data sets – a QuickBird image acquired on 28 October 2006 and a GeoEye-1 image acquired on 5 February 2009 – were used in the experiment (Fig. 3). These two satellite data sets each comprise two satellite images: one obtained for the panchromatic band and the other obtained
Fig. 3. Study Sites: (a) Site A in the Pan-Sharpened QuickBird Image, (b) Site B in the Pan-Sharpened GeoEye-1 Image

Table 1. Specification of Study Sites
Site | Sensor | Size (pixels) | Spatial Resolution (m) | Spectral Resolution (bands) | Radiometric Resolution (bits)
Site A | QuickBird | 611×589 | 0.6 | B, G, R and NIR | 11
Site B | GeoEye-1 | 758×720 | 0.5 | B, G, R and NIR | 11
for four multispectral bands, with the former having four times the spatial resolution of the latter. Therefore, the panchromatic band needs to be fused with the multispectral bands to obtain multispectral data with higher spatial resolution. We use the Gram–Schmidt spectral sharpening module in ENVI software, developed by Laben and Brower (2002), to generate a four-band pan-sharpened multispectral image. The images are of dense urban areas in Daejeon, Korea, and in Hobart, Australia, respectively. Whereas the QuickBird image mainly contains buildings with blue roofs, many trees, and some principal roads located at the border of the image, the GeoEye-1 image contains some trees as well as many buildings and roads of various sizes and orientations. The data specifications of these sites are summarized in Table 1.

4.2 Comparison with Other Methods
In order to evaluate the performance of the proposed method, the results obtained using the proposed method were compared with the results obtained using conventional region growing, the toboggan watershed algorithm and the mean shift algorithm. The conventional region growing algorithm is composed of simple row-wise scanning for seed selection and a region growing process. This method uses the Euclidean distance as the region growing criterion, which decides whether or not the pixel under current observation should be added to the region. Merging parameters of 155 and 150 were selected for the QuickBird and GeoEye-1 images, respectively. Toboggan-based watershed segmentation is an edge-based segmentation algorithm without a tuning parameter, which segments image pixels by finding a downstream path from each pixel of the gradient magnitude image to a local minimum of the topographic surface (Mortensen and Barrett, 1999). The gradient
magnitude image is treated as a topographic surface, and it is obtained by applying a Sobel gradient operator to the Gaussian-smoothed input image. The mean shift algorithm is a nonparametric segmentation method that estimates the gradient of the probability density function to detect modes in a feature space composed of a spatial and a range domain (Comaniciu and Meer, 2002). It requires a minimum region size, a spatial bandwidth and a range bandwidth parameter, which control not only the size of the kernel but also the degree of image segmentation. Various parameter combinations were tested to find the optimal parameter setting. After a visual comparison of the results, the minimum region size, the spatial bandwidth and the range bandwidth were set at 30, 56 and 85.2, respectively, for the QuickBird image. In the case of the GeoEye-1 image, these three parameters were set at 30, 50 and 85.2, respectively.

4.2.1 Result Comparison using the Segmentation Evaluation Method
Before applying the proposed segmentation algorithm to a high-resolution pan-sharpened image, we first smoothed the input image and extracted edge information in pre-processing. After extraction of the edge information from the smoothed image, the seed points are selected using the proposed automatic seed selection method, which requires a threshold value for determining the level of segmentation. We performed the experiment using various threshold values to find the most acceptable value without a large loss of significant objects in the QuickBird image. The experiment showed that the image segmentation result was generally acceptable when the threshold value was set in the range from 0.1 to 0.8. The best result was acquired when we selected a threshold value of 0.5 for the seed selection in the QuickBird image (site A). If a higher threshold value is used, an insufficient number of pixels are classified as seeds and some objects may be missed.
The same threshold value was applied to the GeoEye-1 image (site B) to examine the influence of the selected threshold value on the seed selection performance. For a visualization of the results obtained by the different segmentation methods, the segmented images overlapped with the region boundaries are shown in Figs. 4 and 5, corresponding to the QuickBird image (site A) and the GeoEye-1 image (site B), respectively. In the experiments, toboggan-based watershed segmentation led to severe over-segmentation in the entire area of both sites. This was caused by too many detected regional minima, not all of which were perceptually significant. This is a general problem of watershed-based segmentation algorithms, and it becomes more serious when these algorithms are applied to complicated satellite images, such as the study sites, which include various objects with different textures. The conventional region growing method also produced superfluous segments for both the QuickBird image and the GeoEye-1 image, especially in non-man-made areas, such as forests and bare soil regions.
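The behavior of the conventional region growing baseline follows from its growing rule, which reduces to a single Euclidean-distance test against the region mean. A hedged sketch with our own names, where the merging parameter t plays the role of the 155/150 thresholds mentioned above:

```python
import numpy as np

def within_threshold(pixel, region_mean, t):
    """Conventional region-growing criterion: accept the candidate pixel
    if its Euclidean spectral distance to the current region mean is
    below the merging parameter t."""
    diff = np.asarray(pixel, dtype=float) - np.asarray(region_mean, dtype=float)
    return bool(np.linalg.norm(diff) < t)
```

Because a single global t governs every region, spectrally textured areas such as forests break apart into many small segments, which is consistent with the over-segmentation observed above.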
Fig. 4. Results of Different Segmentation Algorithms for Site A: (a) Result (39661 segments) by the Toboggan-based Watershed Segmentation, (b) Result (2436 segments) by the Conventional Region Growing Method, (c) Result (2482 segments) by the Proposed Method, (d) Result (2152 segments) by the Mean Shift Algorithm, (e) A Magnified Subimage Extracted from a Pre-Processed Image Overlapped with Region Boundaries of (b), (f) A Magnified Subimage Extracted from a Pre-Processed Image Overlapped with Region Boundaries of (c), (g) A Magnified Subimage Extracted from a Pre-Processed Image Overlapped with Region Boundaries of (d)
Fig. 5. Results of Different Segmentation Algorithms for Site B: (a) Result (75202 segments) by the Toboggan-based Watershed Segmentation, (b) Result (16110 segments) by the Conventional Region Growing Method, (c) Result (9938 segments) by the Proposed Method, (d) Result (5167 segments) by the Mean Shift Algorithm, (e) A Magnified Subimage Extracted from a Pre-Processed Image Overlapped with Region Boundaries of (b), (f) A Magnified Subimage Extracted from a Pre-Processed Image Overlapped with Region Boundaries of (c), (g) A Magnified Subimage Extracted from a Pre-Processed Image Overlapped with Region Boundaries of (d)

Figures 4(b) and 5(b) reveal that the diagonal road feature in site A was not correctly segmented and that some road and building patches in site B were segmented as part of a forest. The segmented images also fail to preserve the geometric shapes of all existing significant structures, as shown in the magnified subimages overlaid with region boundaries in Figs. 4(e) and 5(e). This is because the conventional region growing scheme depends on the merging order. The segmentation result of the mean shift algorithm was very good overall, but it tended to under-segment bright image areas. In Figs. 4(d) and 5(d), most features, even some very small ones, were correctly segmented. However, small regions appeared inside large regions, and some under- or over-segmentation took place. The proposed method and the mean shift algorithm both gave quite good performance. For the QuickBird image, our algorithm gave a better result than the mean shift algorithm with respect to the diagonally located road features, as shown in Fig. 4(f). However, some buildings in the lower center of the image were under-segmented by our method. Whereas both the proposed method and the mean shift algorithm performed well in segmenting buildings in the GeoEye-1 image, the former was slightly superior to the latter in defining roads. As shown in Figs. 4(f) and 5(f), which depict how closely the geometric positions of the segment borders fit the real-world positions, most segment boundaries of the proposed method coincide with actual object boundaries better than those of any other comparison method. This is because the proposed method combines edge information in the segmentation process. The images in Figs. 4 and 5 offer compelling evidence that our segmentation algorithm performs well on satellite images from different sensors with different ground resolutions. Such visual comparisons have been used extensively in the past to illustrate the capabilities of segmentation algorithms (Chen et al., 2006; Pesaresi and Benediktsson, 2001). In addition, we used an objective evaluation method, which allows the algorithms to be compared directly, to obtain a more reliable and objective measure of our algorithm's performance. For the quantitative analysis, the results of our segmentation algorithm were compared with those of the other segmentation methods in terms of the evaluation function values Hl, Hr, E, and Q. The quantitative comparison of the results of the four methods is listed in Table 2. As shown in this table, the proposed method provides the best segmentation result among the four methods in terms of the Q value, which represents the mean squared spectral error of the segments.
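Among the four measures, Q is the criterion most directly tied to spectral error. As a concrete reference, a minimal sketch of the Q function is given below; it follows our reading of Borsotti et al. (1998), so the helper name and the exact weighting should be treated as our reconstruction rather than code from the paper.

```python
import numpy as np

def borsotti_q(image, labels):
    """Sketch of the Q evaluation function of Borsotti et al. (1998).
    Small Q rewards low intra-region spectral error and penalizes many
    small regions of equal size. `image` is (H, W, bands), `labels` is
    an (H, W) integer segment map."""
    h, w = labels.shape
    regions = np.unique(labels)
    areas = {lab: int((labels == lab).sum()) for lab in regions}
    # R(A): number of regions having exactly area A.
    area_counts = {}
    for a in areas.values():
        area_counts[a] = area_counts.get(a, 0) + 1
    total = 0.0
    for lab in regions:
        a = areas[lab]
        pix = image[labels == lab].astype(float)
        e2 = ((pix - pix.mean(axis=0)) ** 2).sum()  # squared spectral error
        total += e2 / (1.0 + np.log(a)) + (area_counts[a] / a) ** 2
    return np.sqrt(len(regions)) / (1000.0 * h * w) * total
```

On a spectrally flat two-region image, the correct two-segment partition scores lower (better) than an over-segmented partition of the same image, which matches the way Q is used in Table 2.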
Table 2. Segmentation Evaluation Results and the Number of Segments for Each Site

QuickBird (site A)
  Method                                   Hl      Hr     E      Q       Number of regions
  Toboggan-based watershed segmentation    10.3    1.7    12.0   0.747   39661
  Conventional region growing              3.63    3.60   7.23   0.652   2436
  The proposed method                      6.54    2.92   9.46   0.0019  2482
  Mean shift segmentation                  6.69    3.00   9.69   0.0026  2152

GeoEye-1 (site B)
  Method                                   Hl      Hr     E      Q       Number of regions
  Toboggan-based watershed segmentation    10.98   1.74   12.7   22.38   75202
  Conventional region growing              4.33    3.45   7.78   58.46   16110
  The proposed method                      8.39    2.62   11.01  0.0025  9938
  Mean shift segmentation                  7.80    2.92   10.73  0.0031  5167

For both the QuickBird and GeoEye-1 images, the toboggan-based watershed segmentation method and the conventional region growing method gave inadequate results, with very large Q values compared to the other methods. The proposed method showed a slightly smaller Q value than the mean shift algorithm. However, the E value, measured from the intensity of the panchromatic image, presented a different picture. The conventional region growing method clearly shows the smallest E values for both sites, but it is difficult to say that it gave a good result, because the number of segments has a large influence on the Hl values, which tend to become smaller as the number of segments increases (Zhang et al., 2004). When considering the intra-region uniformity measure Hr, we find that the proposed approach gives the best result among the four methods for both sites. It is evident from these results that the segmentation quality of the proposed method is better than that of the other segmentation methods with respect to regional spectral error and region homogeneity. Because the reliability of the unsupervised segmentation evaluation measure is insufficient for an accurate comparison, another experiment was carried out to confirm the advantage of the proposed method. The next section presents a quantitative comparison of the segmentation results produced by the four algorithms using object-based classification accuracy.

4.2.2 Result Comparison using Object-based Classification Accuracy

The unsupervised evaluation measure used in the previous section has limited reliability for satellite images because it was developed for images, such as indoor photos and composed pictures, that are less complicated than satellite images. Therefore, in order to evaluate the performance of the proposed segmentation algorithm more objectively, we carried out another accuracy assessment, comparing the object-based classification accuracy of the proposed method with that of the other segmentation methods. The object-based classification was applied to each segmentation result under the same conditions. The SVM classifier, a supervised method known for good generalization, was applied to the proposed method's segmentation result as well as to the other comparison results. For each segmentation
result, the training data sets were extracted from the same training areas, and the test data sets, selected on the input pan-sharpened image through visual inspection, were used to assess the accuracy of each classification result. To apply the supervised classification, the specific classes for each site had to be determined. The classes were determined by visual analysis and limited to classes that can exist in urban areas, because both sites are urban areas in high-resolution satellite images. Regions with ambiguous and complicated properties were therefore classified as a bare-soil class. The QuickBird site was classified into six classes: building, road, bare-soil, impervious, shadow, and vegetation. The GeoEye-1 site was classified into four classes: building, road, vegetation, and bare-soil. The Radial Basis Function (RBF) kernel of the SVM, which handles cases in which the relationship between class labels and attributes is nonlinear, was used in all experiments; the gamma value determining the RBF kernel width and the penalty parameter controlling the margin error were set at 0.333 and 100, respectively. Figs. 6 and 7 show the object-based classification results applied to each segmentation result for the QuickBird image and the GeoEye-1 image, respectively. They confirm that, on the whole, the proposed method classifies the road, shadow, and building classes very well compared with the other methods, and also preserves quite well the shape of the structures perceived visually in the image. The toboggan-based watershed segmentation method shows a satisfactory classification result for all classes, but small segments, such as the road centerline, were misclassified into the impervious class owing to the over-segmentation mentioned above.
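The object-based classification step operates on per-segment feature vectors rather than on pixels. A minimal sketch of that feature extraction, together with the RBF kernel at the stated gamma, is shown below; the helper names are ours, and the actual experiments used a full SVM implementation, not this bare kernel.

```python
import numpy as np

def segment_features(image, labels):
    """Mean spectral value per band for each segment: the per-object
    feature vector that a classifier such as an SVM consumes.
    `image` is (H, W, bands); `labels` is an (H, W) segment map."""
    return {int(lab): image[labels == lab].mean(axis=0)
            for lab in np.unique(labels)}

def rbf_kernel(x, y, gamma=0.333):
    """RBF kernel used by the SVM: exp(-gamma * ||x - y||^2)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))
```

Averaging over whole segments is what makes the classification "object-based": a noisy pixel no longer votes on its own, so the segment boundaries produced in the previous step directly shape the feature vectors the SVM sees.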
Fig. 6. Object-based Classification Results of Different Segmentation Algorithms for Site A: (a) Object-based Classification Result by Toboggan-based Watershed Segmentation, (b) Object-based Classification Result by Conventional Region Growing, (c) Object-based Classification Result by the Proposed Method, (d) Object-based Classification Result by Mean Shift Segmentation

Fig. 7. Object-based Classification Results of Different Segmentation Algorithms for Site B: (a) Object-based Classification Result by Toboggan-based Watershed Segmentation, (b) Object-based Classification Result by Conventional Region Growing, (c) Object-based Classification Result by the Proposed Method, (d) Object-based Classification Result by Mean Shift Segmentation

In the conventional region growing method, some diagonal segments were correctly classified as the road class, but they do not coincide with the actual road shape. This result illustrates why segmentation is such an important step in object-based image analysis: the segmentation result directly affects the classification accuracy. Table 3 compares the overall accuracy and kappa coefficient of each method using the error matrix (Congalton, 1991). The proposed method shows the highest overall accuracy for each site, 91.15% and 96.70%, followed by the toboggan-based watershed method at 88.06% and 92.12% overall accuracy, respectively. The kappa statistics were also higher for the proposed method than for the other methods (Table 3). In terms of user and producer accuracy, the proposed method shows overall higher results in the building, shadow, and bare-soil classes compared with the others. The overall accuracies of the four object-based classification results were relatively high because only a few pure testing samples were used. For the QuickBird image, 16321 pixels, corresponding to 4.54% of all pixels, were used as testing pixels for the accuracy assessment. For the GeoEye-1 image, 12264 testing pixels were used, corresponding to 2.25% of all pixels, which is smaller than for the QuickBird image. Selecting more testing samples would yield lower accuracies, so the assessment indicates relative rather than absolute accuracy. Although the proposed method appears satisfactory, it has a limitation in that it may not generate the best results for all images, because the threshold value used in the seed selection process must be tuned for each image. In future work we therefore need to study more efficient criteria and adopt a local threshold to improve the quality of segmentation.
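Both summary statistics used above are derived from the error (confusion) matrix; a compact sketch of the standard formulas (Congalton, 1991) follows.

```python
import numpy as np

def overall_accuracy(cm):
    """Fraction of test samples on the diagonal of the error matrix."""
    return np.trace(cm) / cm.sum()

def kappa(cm):
    """Kappa coefficient: agreement beyond chance. p_o is the observed
    accuracy and p_e the chance agreement implied by the marginals."""
    n = cm.sum()
    po = np.trace(cm) / n
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2
    return (po - pe) / (1 - pe)
```

For a balanced two-class matrix with 45 correct and 5 confused samples per class, the overall accuracy is 0.9 while kappa is 0.8, reflecting the chance-agreement correction that makes kappa the stricter of the two measures.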
Table 3. Accuracy Assessment Values of Object-based Classification of Different Segmentation Methods for Each Site
(Producer and user accuracies in %; overall accuracy and kappa are given as QuickBird / GeoEye-1. The impervious and shadow classes apply to the QuickBird site only.)

Toboggan-based watershed segmentation
                 QuickBird              GeoEye-1
  Class type     Producer   User       Producer   User
  Building       97.37      99.95      91.75      90.63
  Road           90.61      80.79      88.19      91.01
  Vegetation     99.02      81.96      93.96      95.35
  Bare-soil      71.09      84.22      95.31      95.13
  Impervious     82.65      90.94      -          -
  Shadow         86.05      92.99      -          -
  Overall accuracy = 88.062 / 92.120; Overall Kappa = 0.8566 / 0.906

Conventional region growing
  Building       94.01      89.40      65.43      65.75
  Road           83.47      95.07      89.44      92.26
  Vegetation     67.35      99.20      67.09      86.16
  Bare-soil      97.07      67.58      84.86      71.85
  Impervious     94.30      94.65      -          -
  Shadow         87.73      90.06      -          -
  Overall accuracy = 87.247 / 74.352; Overall Kappa = 0.847 / 0.695

Proposed method
  Building       98.20      100        95.88      95.10
  Road           83.84      84.66      97.09      98.92
  Vegetation     90.53      97.36      97.22      97.46
  Bare-soil      95.86      85.86      97.42      97.32
  Impervious     90.89      83.33      -          -
  Shadow         87.63      97.68      -          -
  Overall accuracy = 91.153 / 96.709; Overall Kappa = 0.8938 / 0.9610

Mean shift segmentation
  Building       59.08      57.57      89.24      87.20
  Road           77.72      72.17      82.30      96.86
  Vegetation     71.87      49.44      96.21      98.00
  Bare-soil      40.77      69.06      97.99      90.97
  Impervious     96.79      94.78      -          -
  Shadow         70.79      87.52      -          -
  Overall accuracy = 69.365 / 90.88; Overall Kappa = 0.6319 / 0.892

5. Conclusions

We proposed a new method of automatic image segmentation for high-resolution pan-sharpened satellite imagery using a modified SRG that makes use of information in all spectral bands together with multispectral edge information. The multispectral edges were first obtained from the edge-preserved smoothed image, and the initial seeds were automatically generated from the local variation characteristics of the multispectral edge. A modified SRG algorithm that uses multispectral and edge information was then applied to the image to obtain the segmentation. We tested our method on two high-resolution satellite images of dense urban areas. To assess the effectiveness of the proposed algorithm, it was compared with other established algorithms; the results obtained for both satellite images were, in general, visually satisfactory and quantitatively somewhat superior to the other methods in discriminating buildings and roads from other objects. These experimental results show that our proposed method is promising for high-resolution satellite image segmentation. The proposed method has the advantage that it can use multiple types of feature information, including multispectral and edge information, contained in the image. Our future research will focus on further testing the proposed method in the automatic extraction of man-made objects and on handling the limitations of the algorithm to improve segmentation accuracy.
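The SRG core summarized above can be sketched compactly. The following is a minimal single-band version of classical SRG (Adams and Bischof, 1994); the paper's modification additionally folds multispectral and gradient information into the growing criterion, which is omitted here for brevity.

```python
import heapq
import numpy as np

def seeded_region_growing(image, seeds):
    """Minimal single-band SRG: unlabeled pixels join the neighbouring
    region whose running mean they differ from least, processed in
    order of increasing difference via a priority queue.
    `image` is (H, W); `seeds` is a list of (row, col) seed pixels."""
    h, w = image.shape
    labels = -np.ones((h, w), dtype=int)        # -1 means unassigned
    sums, counts, heap = {}, {}, []

    def push_neighbours(i, j, lab):
        mean = sums[lab] / counts[lab]
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if 0 <= ni < h and 0 <= nj < w and labels[ni, nj] == -1:
                diff = abs(float(image[ni, nj]) - mean)
                heapq.heappush(heap, (diff, ni, nj, lab))

    for lab, (i, j) in enumerate(seeds):
        labels[i, j] = lab
        sums[lab], counts[lab] = float(image[i, j]), 1
    for lab, (i, j) in enumerate(seeds):
        push_neighbours(i, j, lab)

    while heap:
        _, i, j, lab = heapq.heappop(heap)
        if labels[i, j] != -1:
            continue                    # already claimed by some region
        labels[i, j] = lab
        sums[lab] += float(image[i, j])
        counts[lab] += 1
        push_neighbours(i, j, lab)
    return labels
```

Because the queue always grows the most similar pixel first, region statistics stay coherent and the output is a complete labeling with closed region boundaries, which is the property the proposed modified SRG builds on.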
References

Adams, R. and Bischof, L. (1994). "Seeded region growing." IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 16, No. 6, pp. 641-647.
Aiping, W., Shugen, W., and Lucieer, A. (2010). "Segmentation of multispectral high resolution satellite imagery based on integrated feature distribution." International Journal of Remote Sensing, Vol. 31, No. 6, pp. 1471-1483.
Borsotti, M., Campadelli, P., and Schettini, R. (1998). "Quantitative evaluation of color image segmentation results." Pattern Recognition Letters, Vol. 19, No. 8, pp. 741-747.
Chen, Z., Zhao, Z., Gong, P., and Zeng, B. (2006). "A new process for the segmentation of high resolution remote sensing imagery." International Journal of Remote Sensing, Vol. 27, No. 22, pp. 4991-5001.
Chung, K. L., Yang, W. J., and Yan, W. M. (2008). "Efficient edge-preserving algorithm for color contrast enhancement with application to color image segmentation." Journal of Visual Communication and Image Representation, Vol. 19, No. 5, pp. 299-310.
Comaniciu, D. and Meer, P. (2002). "Mean shift: A robust approach toward feature space analysis." IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 5, pp. 603-619.
Congalton, R. G. (1991). "A review of assessing the accuracy of classifications of remotely sensed data." Remote Sensing of Environment, Vol. 37, No. 1, pp. 35-46.
Darwish, A., Leukert, K., and Reinhardt, W. (2003). "Image segmentation for the purpose of object-based classification." In IEEE Proceedings of IGARSS 2003, Toulouse, France, pp. 2039-2041.
Eddy, P. R., Smith, A. M., Hill, B. D., Peddle, D. R., Coburn, C. A., and Blackshaw, R. E. (2008). "Hybrid segmentation-artificial neural network classification of high resolution hyperspectral imagery for site-specific herbicide management in agriculture." Photogrammetric Engineering and Remote Sensing, Vol. 74, No. 10, pp. 1249-1257.
Fan, J., Zeng, G., Body, M., and Hacid, M. S. (2005). "Seeded region growing: An extensive and comparative study." Pattern Recognition Letters, Vol. 26, No. 8, pp. 1139-1156.
Gonzalez, R. C. and Woods, R. E. (2002). Digital image processing, Prentice-Hall, New Jersey.
Guindon, B. (1997). "Computer-based aerial image understanding: A review and assessment of its application to planimetric information extraction from high resolution satellite image." Canadian Journal of Remote Sensing, Vol. 23, No. 1, pp. 38-47.
He, Y., Luo, Y., and Hu, D. (2007). "Automatic seeded region growing based on gradient vector flow for color image segmentation." Optical Engineering, Vol. 46, No. 4, p. 047003.
Kai, L. and Muller, J. P. (1991). "Segmenting satellite imagery: A region growing scheme." In IEEE Proceedings of IGARSS 1991, Helsinki, Finland, pp. 3-6.
Karantzalos, K. and Argialas, D. (2006b). "Improving edge detection and watershed segmentation with anisotropic diffusion and morphological levelling." International Journal of Remote Sensing, Vol. 27, No. 24, pp. 5427-5434.
Karantzalos, K. and Argialas, D. (2009a). "A region-based level set segmentation for automatic detection of man-made objects from aerial and satellite images." Photogrammetric Engineering and Remote Sensing, Vol. 75, No. 6, pp. 667-677.
Kerman, C. D. and Chehdi, K. (2002). "Automatic image segmentation system through iterative edge-region co-operation." Image and Vision Computing, Vol. 20, No. 8, pp. 541-555.
Kettig, R. L. and Landgrebe, D. A. (1976). "Classification of multispectral image data by extraction and classification of homogeneous objects." IEEE Transactions on Geoscience Electronics, Vol. 14, No. 1, pp. 19-26.
Laben, C. A. and Brower, B. V. (2002). Process for enhancing the spatial resolution of multispectral imagery using pan-sharpening, US Patent 6,011,875, Patent and Trademark Office, Washington, D.C.
Li, P. and Xiao, X. (2007). "Multispectral image segmentation by a multichannel watershed-based approach." International Journal of Remote Sensing, Vol. 28, No. 19, pp. 4429-4452.
Lizarazo, I. and Elsner, P. (2009). "Fuzzy segmentation for object-based image classification." International Journal of Remote Sensing, Vol. 30, No. 6, pp. 1643-1649.
Mortensen, E. N. and Barrett, W. A. (1999). "Toboggan-based intelligent scissors with a four-parameter edge model." Proceedings of IEEE Computer Vision and Pattern Recognition, pp. 452-458.
Mott, C., Anderson, T., Zimmermann, S., Schneider, T., and Ammer, U. (2002). "Selective region growing: An approach based on object-oriented classification routines." In IEEE Proceedings of IGARSS 2002, Toronto, Canada, pp. 1612-1614.
Pesaresi, M. and Benediktsson, J. A. (2001). "A new approach for the morphological segmentation of high resolution satellite imagery." IEEE Transactions on Geoscience and Remote Sensing, Vol. 39, No. 2, pp. 309-320.
Richard, G. L., Paul, M., and Scott, H. (2006). "A multi-scale segmentation approach to mapping seagrass habitats using airborne digital camera imagery." Photogrammetric Engineering and Remote Sensing, Vol. 72, No. 6, pp. 665-675.
Sapiro, G. and Ringach, D. (1996). "Anisotropic diffusion of multivalued images with applications to color filtering." IEEE Transactions on Image Processing, Vol. 5, No. 11, pp. 1582-1585.
Shiozaki, A. (1986). "Edge extraction using entropy operator." Computer Vision, Graphics, and Image Processing, Vol. 36, No. 1, pp. 1-9.
Van der Sande, C. J., de Jong, S. M., and de Roo, A. P. J. (2003). "A segmentation and classification approach of IKONOS-2 imagery for land cover mapping to assist flood risk and flood damage assessment." International Journal of Applied Earth Observation and Geoinformation, Vol. 4, No. 3, pp. 217-229.
Wang, W., Zhao, S., and Chen, X. (2004). "Object-oriented classification and application in land use classification using SPOT-5 PAN imagery." In IEEE Proceedings of IGARSS 2004, Seoul, Korea, pp. 3721-3723.
Wu, T. F., Lin, C. J., and Weng, R. C. (2004). "Probability estimates for multi-class classification by pairwise coupling." Journal of Machine Learning Research, Vol. 5, No. 1, pp. 975-1005.
Yu, Q., Gong, P., Clinton, N., Biging, G., Kelly, M., and Schirokauer, D. (2006). "Object-based detailed vegetation classification with airborne high spatial resolution remote sensing imagery." Photogrammetric Engineering and Remote Sensing, Vol. 72, No. 7, pp. 799-811.
Zhang, Y. J. (1997). "Evaluation and comparison of different segmentation algorithms." Pattern Recognition Letters, Vol. 18, No. 10, pp. 963-974.
Zhang, Y., Feng, X., and Le, L. (2008). "Segmentation on multispectral remote sensing image using watershed transformation." Congress on Image and Signal Processing, Vol. 4, No. 1, pp. 773-777.
Zhang, H., Fritts, J. E., and Goldman, S. A. (2004). "An entropy-based objective segmentation evaluation method for image segmentation." Proceedings of SPIE: Electronic Imaging - Storage and Retrieval Methods and Applications for Multimedia 2004, San Jose, CA, pp. 38-49.