A fuzzy segmentation tool for remote sensing data
Ana C. Siravenha^a, Victor Brito^a and Evaldo G. Pelaes^a
^a Federal University of Para, Av. Augusto Correa - 1, Belem, Brazil
ABSTRACT
Remote sensing data are an important source of information for a variety of applications, such as coastal mapping, land-use monitoring, and charting wildlife habitats. One of the most important tasks in the analysis of these data is segmentation, that is, merging neighbouring pixels into segments (or regions) based on homogeneity or heterogeneity parameters. Traditional image segmentation methods seek to delineate discrete image objects with sharp edges, which is not always possible, since many geographic objects, both natural and man-made, do not appear clearly bounded in remotely sensed images. A fuzzy approach is a natural way to capture the structure of objects in the image, as it takes into account the fuzziness of the real world and the ambiguity of remote sensing imagery. The main goal of this work is to define the boundaries of objects in an image. The proposal aims to be faster than other segmentation approaches in the TerraLib tools by considering only the neighbourhood of a selected pixel. It uses image tone and colour to select and define objects in remote scenes based on fuzzy rules. The fuzzy set is defined by an input tolerance level, which can be adjusted according to the desired granularity of the selection. The proposed methodology is not limited to the selection of a single object: the mask can be composed of a set of objects with different features and tolerances. The algorithm also returns the size and proportion of the objects. The quality of the individual segmentation results is evaluated on multi-spectral Landsat 5-TM data by visual comparison, supplemented by a detailed investigation using visually interpreted reference areas.
Keywords: Fuzzy set, Segmentation, TerraLib, Landsat 5-TM
Further author information: (Send correspondence to Ana C. Siravenha) Ana C. Siravenha: E-mail: [email protected], Telephone: 55 091 3201 7674

1. INTRODUCTION
The use of remote sensing data is widespread and its applications cover a wide range of practical purposes. Segmentation methods are based on the assumption that spectrally similar data lie close together in clusters; the process aims to discover and organize the structure of data sets by quantifying the similarities among individual patterns [1]. Efforts to identify clusters by unsupervised, data-driven and pattern-based approaches, such as [2, 3], result in a crisp classification that poorly represents the spatial continuum present in the biophysical environment. The spectral and spatial vagueness of these images can be handled by fuzzy segmentation and classification techniques, which model the uncertainty by allowing individual points to belong partially to multiple classes. The partial membership of fuzzy logic better represents real phenomena without geometrically sharp boundaries, such as the transition from forest to terrain, where no exactly defined border exists.

Although numerous segmentation methods have already been developed, several factors hinder their direct use in the remote sensing domain. In particular, the multi-spectral, and sometimes multi-scale, data generated by remote sensors increase the redundancy and, consequently, the overall data complexity. In addition, objects with heterogeneous properties with respect to size, form, spectral behaviour, etc. have to be considered, mainly in model-based methods [4].

The use of statistical methods to segment an image requires an estimate of the region process given the observed images, according to a specified statistical criterion (maximum a posteriori, for example). Li and Peng [5] discuss the non-stationary texture of remote sensing data, represented by a non-stationary double random field. The segmentation parameters were estimated by the expectation-maximization algorithm. Besides the low-order dependence among pixels in this random field, they introduce a high-order dependence as a feature to recognize urban areas in a Landsat image. The low-order model is used to obtain a number of uniform texture regions, while the high-order model recognizes the objects with fewer false positive detections.

Mitra et al. [6] highlight that the success of an image analysis system depends on the quality of segmentation. They address the scarcity of labelled pixels in a supervised pixel classification framework with a support vector machine (SVM), initially designed using a small set of labelled points and subsequently refined by actively querying for the labels of pixels from a pool of unlabelled data. One of the most important requirements of supervised approaches is the availability of labelled data, mainly extracted from ground truth and by manual labelling. To overcome this limitation, active learning has been used, including by Mitra et al. [6], where the learner is able to select its own training data through an iterative process. Error-driven techniques [7], uncertainty sampling [8] and adaptive resampling [9] are examples of active learning strategies.

A simple and effective technique for data modelling is the Markov random field (MRF), which incorporates prior knowledge during segmentation. Many MRF models have been proposed to describe the different features encountered in the images of interest, including the local variations of image statistics [10, 11]. The hierarchical model used in [12] was introduced in [10] and refined in [11]. In [12], however, the lack of prior knowledge is addressed in a supervised framework. The authors explore the description capability of the model, as well as the limitations due to the superimposed tree structure, in addition to the recursive optimization procedure.
They assess the model performance with one Satellite Pour l'Observation de la Terre (SPOT) image and compare its results with other approaches based on minimum distance and discriminant analysis, among others, obtaining superior results, despite the small number of experiments carried out and the non-automatic procedure for finding the best tree structure.

Fuzzy c-means (FCM) approaches are well known and widely used for segmentation. In spite of its prevalence, the FCM algorithm does not take into account the spatial information of pixels and hence suffers from high sensitivity to noise. Weighted approaches to FCM clustering (see Ji et al. [13] and Fan et al. [1]), for example, aim to give biophysical meaning to classes and memberships without pre-set parameters and with lower computational complexity. Fan et al. [1] implemented a procedure that uses a single point iterative weighted fuzzy c-means clustering algorithm to segment remote sensing images. FCM is based on calculating terrain attributes following an iterative procedure to construct continuous spatial patterns of fuzzy landform class memberships. The weighted approach is designed to adjust the original samples to a uniform distribution and, in addition, the cyclic iteration can improve the precision of the algorithm. Despite the efficiency of the algorithm, it depends on an effective adjustment of the weights and on the design of the initial cluster centres without any a priori information.

Image patches naturally provide noise removal, especially in non-local algorithms. They were used by Ji et al. [13] to improve the FCM model by replacing each pixel used in constructing the objective function with the corresponding image patch, in which all pixels are iteratively weighted, as in [1]. Taking the patch, instead of the pixel, as the basic unit to be clustered, the spatial constraints are incorporated intrinsically into the clustering process. Tests were carried out with synthetic images and brain MR images and compared with other FCM-based algorithms; since the method effectively overcomes the impact of noise, it can be applied to remote sensing data.

FCM-based methodologies are used for a variety of purposes. Chu et al. [14] presented an FCM for tracking typhoon trajectories. Typhoon tracks in the western North Pacific (Taiwan region) are detected using kernel density estimation (KDE), and the track centres belong to six fuzzy clusters with different membership degrees. The final typhoon density map combines the KDE hotspots of the best-cluster results with the FCM weights and could be used in planning for disaster management. Mitra et al. [15] explore the concept of shadowed sets, which disambiguate and capture the essence of a cluster distribution. Shadowed clustering serves as a conceptual and algorithmic bridge between FCM and rough c-means (RCM), incorporating the merits of both and enhancing the exclusion zones, producing a general c-means algorithm. This technique reduces the effect of external parameters, while its faster convergence manages the uncertainty better.
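As background for the FCM variants reviewed above, the standard (unweighted, non-spatial) fuzzy c-means iteration can be sketched as follows. This is only an illustrative sketch on one-dimensional data, not the implementation of any of the cited works; the function names, the fuzzifier `m = 2`, the fixed number of iterations and the small constant `eps` are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def _memberships(data: Sequence[float], centres: Sequence[float],
                 m: float, eps: float) -> List[List[float]]:
    """Membership of every sample in every cluster (each row sums to 1)."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for x in data:
        # eps avoids a division by zero when a sample coincides with a centre.
        dists = [abs(x - v) + eps for v in centres]
        rows.append([1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                     for d_i in dists])
    return rows


def fuzzy_c_means(data: Sequence[float], initial_centres: Sequence[float],
                  m: float = 2.0, iterations: int = 50,
                  eps: float = 1e-9) -> Tuple[List[float], List[List[float]]]:
    """Alternate membership and centre updates for a fixed number of steps."""
    centres = list(initial_centres)
    for _ in range(iterations):
        u = _memberships(data, centres, m, eps)
        for i in range(len(centres)):
            # Each centre is the mean of the data weighted by u^m.
            weights = [row[i] ** m for row in u]
            centres[i] = sum(w * x for w, x in zip(weights, data)) / sum(weights)
    return centres, _memberships(data, centres, m, eps)


if __name__ == "__main__":
    gray_levels = [1.0, 1.2, 0.9, 8.0, 8.3, 7.9]  # hypothetical samples
    centres, u = fuzzy_c_means(gray_levels, initial_centres=[0.0, 10.0])
    print(centres)
    print(u[0])
```

Because every sample keeps a graded membership in every cluster, the result is a soft partition; the weighted and patch-based variants discussed above modify the samples or the distance term rather than this basic alternation.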
Quadtree-based methods have been used extensively for image segmentation in the wavelet domain. Fu et al. [16] applied them in the spatial context, overcoming the difficulties of image splitting by adding a spatial indexing mechanism based on improved Morton coding; the problems with region merging were treated by a process based on a region adjacency graph (RAG). The method was validated on GeoEye-1 and IKONOS colour images, and the results showed that both efficiency and accuracy increased considerably when compared with other typical algorithms.

Many classification algorithms based on fuzzy sets have been proposed. Among unsupervised methods, one can highlight the aforementioned FCM, fuzzy Gustafson-Kessel, fuzzy c-shells and genetic algorithms, among others. Supervised studies involve two steps: training and the classification itself. Training is the learning phase, in which the data signature to be processed by the second step is created. The signatures can be generated by unsupervised, classical or fuzzy-based methods, and evaluated by statistical metrics. The classification step can also be executed by a fuzzy membership methodology, or even by a classical one, but it cannot be evaluated separately from the first step, only as the result of the entire procedure. Some authors consider the use of fuzzy knowledge in both steps as the optimal setting for a supervised approach [17].

This work presents an effort to implement a faster segmentation algorithm to be added to the TerraLib library. The fuzzy tool is implemented similarly to the Magic Wand tool found in well-known graphics software such as Adobe Photoshop and the GNU Image Manipulation Program (GIMP). It uses the image's tone and colour to select and define objects in remote scenes based on fuzzy rules. The implementation allows the user to select two or more different objects to compose the mask of objects of interest. It is also possible to measure the size and proportion of the objects. The method is validated on Landsat-5 TM data through a detailed visual investigation.

The outline of the paper is as follows: Section 2 describes the background for the presented fuzzy application; Section 3 explores the designed methodology, as well as the challenges to be faced; Section 5 shows the achievements and their evaluation, which are discussed in Section 6.
2. FUZZY SET THEORY
Fuzzy sets were introduced in 1965 by Zadeh [18] as an extension of the classical notion of a set. Instead of sharp boundaries (binary terms of belonging), these sets are defined on a continuum of membership grades, with the aid of a membership function valued in the real unit interval [0, 1]. In fuzzy set theory, classical bivalent sets are usually called crisp sets [19]. Originally, Zadeh defined a fuzzy set as follows: let $X$ be a space of points, with a generic element of $X$ denoted by $x$, so that $X = \{x\}$. A fuzzy set $A$ in $X$ is characterized by a membership function $f_A(x)$ which associates with each point in $X$ a real number in the interval [0, 1], with the value of $f_A(x)$ at $x$ representing the "grade of membership" of $x$ in $A$. Thus, the nearer the value of $f_A(x)$ to unity, the higher the grade of membership of $x$ in $A$. In other words, all $x$ for which $f_A(x) = 1$ are full members of the set $A$, and all $x$ for which the membership function is between 0 and 1 have partial membership in the set. If $f_A(x) = 0$, $x$ has zero degree of membership, which in practice means that it is not a member of the set. Thus, the fuzzy set $A$ is a set of ordered pairs that assigns a grade of membership in $A$ to each $x$:

$$A = \{(x, f_A(x)) \mid x \in X\} \tag{1}$$

In this work, the triangular function is used for modelling the characteristic function. Such a function is defined as

$$f_A(x) = \begin{cases} 0, & x \le b \\ \dfrac{x - b}{a - b}, & b < x \le a \\ \dfrac{c - x}{c - a}, & a < x < c \\ 0, & x \ge c \end{cases}$$

This kind of set is specified by three parameters $a$, $b$, $c$: the most possible value, the lowest possible value, and the highest possible value, respectively, with $b < a < c$.
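The triangular membership function described above can be sketched in Python as follows. The parameter naming follows the text (a is the most possible value, b the lowest possible value, c the highest possible value); the function name and the requirement b < a < c are assumptions made for illustration and are not taken from the original implementation.

```python
def triangular_membership(x: float, a: float, b: float, c: float) -> float:
    """Grade of membership of x in a triangular fuzzy set.

    a: most possible value (peak, membership 1)
    b: lowest possible value (membership falls to 0)
    c: highest possible value (membership falls to 0)
    """
    if not (b < a < c):
        raise ValueError("expected b < a < c")
    if x <= b or x >= c:
        return 0.0
    if x <= a:
        return (x - b) / (a - b)  # rising edge between b and a
    return (c - x) / (c - a)      # falling edge between a and c


if __name__ == "__main__":
    # Hypothetical gray level 128 as the peak, with a support of +/- 32 levels.
    for gray in (96, 112, 128, 144, 160):
        print(gray, triangular_membership(gray, a=128, b=96, c=160))
```

With these example values the grade rises linearly from 0 at gray level 96 to 1 at 128 and falls back to 0 at 160, so pixels closer in tone to the peak receive higher membership.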
3. METHODOLOGY
The segmentation algorithm builds the fuzzy set from the similarity between gray levels. Accordingly, we assume that there is enough contrast in the scene to distinguish objects and that the user knows the spatial location of the object to be segmented. Through the graphical interface, the user selects any pixel that belongs to the object to be masked. Here, an object can represent a class, and it is defined by the neighbourhood of the selected pixel. The fuzzy membership function follows the triangular model, in which the central vertex is defined by the value captured at the mouse click position, and the membership boundaries are defined by the tolerance chosen by the user during the parameter definition step.

Tolerance. In the Photoshop Magic Wand, the Tolerance value in the Options Bar determines the range of brightness values selected around the clicked pixel, so that the selection is not limited to pixels with exactly the same tone and colour as the clicked area. Setting Tolerance to 0 tells Photoshop not to include any pixels except those that exactly match the colour and tone of the clicked area; clicking on the middle gray of a gradient then yields a very narrow selection outline, and every pixel that is not an exact match to that shade of gray is ignored.

Contiguous. With the Contiguous option selected, as it is by default, Photoshop only selects pixels that fall within the acceptable tone and colour range determined by the Tolerance option and that are adjacent to each other in the area that was clicked. Pixels that are within the Tolerance range but are separated from the clicked area by pixels outside that range are not included in the selection. For example, when two identical gradients are separated by a red bar, clicking on the upper gradient selects a range of pixels only in that gradient: the pixels in the lower gradient that would otherwise be selected are cut off from the clicked area by the pixels of the red bar, which are not within the Tolerance range. With Contiguous unchecked and Tolerance at its default value of 32, clicking again in the centre of the upper gradient also selects the pixels in the lower gradient that fall within the Tolerance range, even though they are still separated from the clicked area by the red bar.

The Fuzzy Select (Magic Wand) tool of GIMP is designed to select areas of the current layer or image based on colour similarity. When using this tool, it is very important to pick the right starting point: selecting the wrong spot may produce something very different from what is wanted, or even the opposite. The Wand is a good tool for selecting objects with sharp edges. It is easy to use, but selecting exactly the desired region, no more and no less, can be difficult, and more experienced users often find the Path and Color Select tools more efficient. Still, it is useful for selecting an area within a contour or for touching up imperfect selections, and it often works very well for selecting a solid-coloured (or nearly solid-coloured) background area.

Note that, as the selected area expands outward from the centre, it does not only propagate to pixels that touch each other: it is capable of jumping over small gaps, depending on the Threshold option. The Threshold can be increased or decreased during the use of Fuzzy Select, after the first button press, by dragging the pointer downward (or to the right) or upward (or to the left).
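The selection procedure described in this section can be sketched as a region-growing routine. The sketch below is a minimal illustration under stated assumptions and is not the TerraLib implementation: it works on a single-band image stored as nested lists, uses a symmetric triangular membership centred on the clicked gray level, accepts a pixel when its membership is strictly greater than zero, and uses 4-connectivity for the contiguous mode. All function names, the sample image and the `picks` structure are hypothetical.

```python
from collections import deque
from typing import List, Sequence, Set, Tuple

Pixel = Tuple[int, int]    # (row, column)
Image = List[List[float]]  # single-band image as nested lists


def membership(value: float, seed_value: float, tolerance: float) -> float:
    """Symmetric triangular membership centred on the seed gray level."""
    if tolerance <= 0:
        return 1.0 if value == seed_value else 0.0  # exact match only
    return max(0.0, 1.0 - abs(value - seed_value) / tolerance)


def fuzzy_select(image: Image, seed: Pixel, tolerance: float,
                 contiguous: bool = True) -> Set[Pixel]:
    """Pixels selected from one click, with or without the contiguity rule."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]

    def accepted(r: int, c: int) -> bool:
        return membership(image[r][c], seed_value, tolerance) > 0.0

    if not contiguous:
        return {(r, c) for r in range(rows) for c in range(cols) if accepted(r, c)}

    # Breadth-first growth over the 4-connected neighbourhood of the seed.
    selected: Set[Pixel] = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in selected and accepted(nr, nc)):
                selected.add((nr, nc))
                queue.append((nr, nc))
    return selected


def build_mask(image: Image, picks: Sequence[Tuple[Pixel, float]],
               contiguous: bool = True) -> Set[Pixel]:
    """Union of several selections, each with its own seed and tolerance."""
    mask: Set[Pixel] = set()
    for seed, tolerance in picks:
        mask |= fuzzy_select(image, seed, tolerance, contiguous)
    return mask


def size_and_proportion(image: Image, mask: Set[Pixel]) -> Tuple[int, float]:
    """Mask size in pixels and its share of the whole image."""
    total = len(image) * len(image[0])
    return len(mask), len(mask) / total


if __name__ == "__main__":
    image = [
        [10, 12, 200, 11],
        [11, 13, 200, 12],
        [90, 95, 200, 10],
    ]
    mask = build_mask(image, [((0, 0), 5), ((2, 0), 10)])
    print(sorted(mask), size_and_proportion(image, mask))
```

In the sample image the bright column plays the role of the red bar in the Magic Wand example: with the contiguous rule, a click at the top-left corner does not reach the similar pixels in the last column, whereas the non-contiguous mode selects them as well.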
4. IMPLEMENTATION DETAILS
5. RESULTS
6. DISCUSSION
7. CONCLUSION
ACKNOWLEDGMENTS
The authors thank the Amazon Foundation/Vale for the financial support (grant number 001/2010). They also thank the National Council of Technological and Scientific Development and the Federal University of Para for the partial support.
REFERENCES
[1] Fan, J., Han, M., and Wang, J., “Single point iterative weighted fuzzy c-means clustering algorithm for remote sensing image segmentation,” Pattern Recognition 42, 2527 – 2540 (2009).
[2] Andersson, M., Gudmundsson, J., and Levcopoulos, C., “Approximate distance oracles for graphs with dense clusters,” Computational Geometry 37 (3), 142 – 154 (2007).
[3] Martin, D. R., Fowlkes, C. C., and Malik, J., “Learning to detect natural image boundaries using local brightness, color, and texture cues,” IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (5), 530 – 549 (2004).
[4] Schiewe, J., “Segmentation of high-resolution remotely sensed data - concepts, applications and problems,” in [Symposium on Geospatial Theory, Processing and Applications], (2002).
[5] Li, F. and Peng, J., “Double random field models for remote sensing image segmentation,” Pattern Recognition Letters 25, 129 – 139 (2004).
[6] Mitra, P., Shankar, B. U., and Pal, S. K., “Segmentation of multispectral remote sensing images using active support vector machines,” Pattern Recognition Letters 25, 1067 – 1074 (2004).
[7] Meng, Q. and Lee, M., “Error-driven active learning in growing radial basis function networks for early robot learning,” in [IEEE International Conference on Robotics and Automation], 2984 – 2990 (2006).
[8] Joshi, A., Porikli, F., and Papanikolopoulos, N., “Multi-class active learning for image classification,” in [IEEE Conference on Computer Vision and Pattern Recognition], 2372 – 2379 (2009).
[9] Marusic, B., Kale, I., and Tasic, J., “Image compression based on fast adaptive resampling on a Hilbert-Peano curve,” in [IEEE Instrumentation and Measurement Technology Conference], 1091–5281 (1999).
[10] Poggi, G. and Zerubia, J. B., “Image segmentation by tree-structured Markov random field,” IEEE Signal Processing Letters 7, 155 – 157 (1999).
[11] D’Elia, C., Poggi, G., and Scarpa, G., “A tree-structured Markov random field model for Bayesian image segmentation,” IEEE Transactions on Image Processing 12 (10), 1259 – 1273 (2003).
[12] Poggi, G., Scarpa, G., and Zerubia, J. B., “Supervised segmentation of remote sensing images based on a tree-structured MRF model,” IEEE Transactions on Geoscience and Remote Sensing 43 (8), 1901 – 1911 (2005).
[13] Ji, Z., Xia, Y., Chen, Q., Sun, Q., Xia, D., and Feng, D. D., “Fuzzy c-means clustering with weighted image patch for image segmentation,” Applied Soft Computing 12, 1659 – 1667 (2012).
[14] Chu, H.-J., Liau, C.-J., Lin, C.-H., and Su, B.-S., “Integration of fuzzy cluster analysis and kernel density estimation for tracking typhoon trajectories in the Taiwan region,” Expert Systems with Applications 39, 9451 – 9457 (2012).
[15] Mitra, S., Pedrycz, W., and Barman, B., “Shadowed c-means: Integrating fuzzy and rough clustering,” Pattern Recognition 43, 1282 – 1291 (2010).
[16] Fu, G., Zhao, H., Li, C., and Shi, L., “Segmentation for high-resolution optical remote sensing imagery using improved quadtree and region adjacency graph technique,” Remote Sensing 45, 3259 – 3279 (2013).
[17] Droj, G., “The applicability of fuzzy theory in remote sensing image classification,” Studia Universitatis Babes-Bolyai 52, 89 – 96 (2007).
[18] Zadeh, L. A., “Fuzzy sets,” Information and Control 8, 338 – 353 (1965).
[19] Tobias, O. J. and Seara, R., “Image segmentation by histogram thresholding using fuzzy sets,” IEEE Transactions on Image Processing 11 (12), 1457 – 1465 (2002).