SCALE SPACE SEGMENTATION OF COLOR IMAGES USING WATERSHEDS AND FUZZY REGION MERGING

S. Makrogiannis1, I. Vanhamel2, H. Sahli2, S. Fotopoulos1

1: Electronics Lab., University of Patras, Greece [email protected], [email protected]
2: ETRO/IRIS, Vrije Universiteit Brussel, Belgium [email protected], [email protected]

ABSTRACT

A multi-resolution segmentation approach for color images is proposed. The scale space is generated using the Perona-Malik diffusion approach, and the watershed algorithm is employed to produce the regions in each scale. The dynamics of contours and the relative entropy of the color region distributions are estimated as region dissimilarity features across the scale-space stack and combined using a fuzzy rule-based system. A minima-linking process by downward projection is carried out, and subsequently the region dissimilarity, combining color, scale and homogeneity, is estimated for the finest scale (localization scale). The final segmentation is derived using a previously presented merging process. To validate its performance, qualitative and quantitative results are provided.

1. INTRODUCTION

Color image segmentation is a field that attracts growing interest. Several methods have been proposed in the past, depending on the nature of the problem involved [12]. Recent works have shown that region-based approaches outperform other methods in terms of segmentation accuracy, and satisfactory results have been presented by many researchers. The watershed transform has been adopted in several works to produce closed regions with one-pixel-wide contours. As in the grey level case, segmenting color images with the watershed transformation largely comes down to eliminating its main drawback, namely over-segmentation. Again as in the grey level case, the over-segmentation problem in color images has been treated by the following main approaches: markers [10], flat zones [4], waterfall [1], and dynamics of contours [2],[11],[14],[16].

Other approaches consider split and merge techniques [6],[7],[8],[9],[15] for color image segmentation. The merging step combines segments to form regions that correspond to objects or parts of them. It is applied by means of a dissimilarity function (or merging cost function), which measures the visual difference between two adjacent regions.

Recently, segmentation methods tend to incorporate the multiscale nature of images. This allows the integration of both the deep and superficial image structure [3],[5],[14],[16]. These approaches, known as scale-space or multi-resolution approaches, generally include a scale generation mechanism. For the latter, one commonly employs a linear method. However, the inherent problems of linear scale-space methods led to the investigation of their non-linear counterparts. To avoid problems such as delocalisation and the similar treatment of noise and information, the non-linear methods add extra information to guide the diffusion process.

In this paper a multi-resolution segmentation approach for color images is presented, which employs the scale generation and linking mechanism of a previously reported color segmentation method in scale-space [16]. The main idea is based on the analysis of the multi-scale behavior of the catchment basins and the gradient watersheds [14]. As in [16], the different scales are generated using the regularized Perona and Malik anisotropic diffusion [13]. The regions are produced by applying the watershed to the outcome of a non-parametric color difference estimator [8]. The one-to-one relationship between the catchment basins and the local minima of the color gradient is exploited to link them along the scale-space stack [14]. Once the linking is accomplished, a linkage list L(p,q) is created for each region pair (p,q) at the finest scale, i.e. the localization scale. Next, a multi-scale dissimilarity measure is estimated; it combines the dynamics of contours in scale space [14] and the relative entropy of color distributions by applying a fuzzy inference system (based on the idea in [9]). Finally, a region merging process that incorporates graph simplification and histogram thresholding is applied to derive the final segmentation results.

The paper is organized as follows. In section 2 the scale-space generator and the linking process are briefly reviewed. Section 3 is dedicated to the dissimilarity measure, and in section 4 the merging algorithm is briefly described. Experimental results are shown in section 5, and some conclusions are drawn in section 6.

2. SCALE-SPACE GENERATION AND LINKING

2.1. Perona and Malik Anisotropic Diffusion

In [13], Perona and Malik propose an anisotropic diffusion filter for scalar images. It avoids the blurring and delocalisation problems of linear diffusion filtering. They apply an inhomogeneous process that reduces the amount of diffusion at locations that are more likely to be edges.
This likelihood is measured by the squared gradient. The proposed filter is based on the following PDE:

$$\partial_t u = \operatorname{div}\left( g\left( |\nabla u|^2 \right) \cdot \nabla u \right) \qquad (1)$$

where g(·) is a function that determines the amount of diffusion and is referred to as the diffusivity. This work uses a numerical scheme based on finite differences, which was originally proposed in [13] and was adopted and extended to color images in [16].

The sampling of the non-linear scale-space stack is based on a combination of the scale-invariance property of Gaussian scale-space and its link with the PDE notation. The resulting sampling can be described as follows:

$$t = \frac{\sigma_0^2 \cdot e^{2 \cdot N \cdot \Delta\tau}}{2} \qquad (2)$$
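As an illustration of equations (1) and (2), the following minimal sketch computes the sampled scale parameters and performs one explicit finite-difference diffusion step on a single-channel image. It is not the regularized color scheme of [16]: the diffusivity g(s) = 1/(1 + s/k²), the contrast parameter `k`, the time step `dt` and all function names are assumptions chosen for illustration.

```python
import numpy as np

def scale_times(sigma0, dtau, n_scales):
    """Equation (2): t_N = sigma0^2 * exp(2 * N * dtau) / 2 for N = 0 .. n_scales-1."""
    n = np.arange(n_scales)
    return 0.5 * sigma0 ** 2 * np.exp(2.0 * n * dtau)

def diffusivity(grad_sq, k):
    """Assumed diffusivity g(s) = 1 / (1 + s / k^2); close to 0 across strong edges."""
    return 1.0 / (1.0 + grad_sq / k ** 2)

def perona_malik_step(u, k, dt=0.2):
    """One explicit 4-neighbour step of equation (1) on a 2-D float array."""
    p = np.pad(u, 1, mode="edge")          # replicated border gives zero flux
    diffs = (
        p[:-2, 1:-1] - u,                  # north neighbour
        p[2:, 1:-1] - u,                   # south neighbour
        p[1:-1, :-2] - u,                  # west neighbour
        p[1:-1, 2:] - u,                   # east neighbour
    )
    # Each directional difference is weighted by the diffusivity of its square.
    flux = sum(diffusivity(d ** 2, k) * d for d in diffs)
    return u + dt * flux
```

Iterating `perona_malik_step` until the accumulated time reaches each value returned by `scale_times` would give one image per scale. With a 4-neighbour stencil and g ≤ 1, the explicit scheme is stable for dt ≤ 0.25.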

In equation (2), $\sigma_0$ is the standard deviation of the Gaussian filter at the localization scale, $\Delta\tau$ is a sampling parameter, $N$ is the scale number, and $t$ is the scale parameter of the Perona-Malik filter.

2.2. Linking process

The watershed is routinely applied to the modulus of the intensity gradient, which measures the distance between neighboring pixels in terms of their intensity. In this work, an appropriate color distance is estimated as the relative color entropy, using local non-parametric kernel density estimation [8]. The linking of the regions across successive scales is performed using a proximity criterion [14]. This criterion checks the relative distance of all region minima at scale i that have been projected onto the same influence zone at scale i+1, with respect to the original minimum of this influence zone. The result of this procedure is a linkage list L(p,q), which contains, for each adjacent region pair (p,q) found at the localization scale, the corresponding region pairs in the scale-space stack. The linkage list is used for the valuation (weight) of each arc of the Region Adjacency Graph (RAG) at the localization scale.

3. MULTI-RESOLUTION REGION DISSIMILARITY

In this work, the dissimilarity measure used for the valuation of the RAG combines color, texture and scale information. The employed region features are the dynamics of contours and the relative entropy of the region distributions. These are used as fuzzy variables in a fuzzy logic system to measure the dissimilarity of two adjacent regions in each scale. Given the linking information from the previous stage, the outcome of the fuzzy logic system is then summed over the successive scales to obtain a more robust dissimilarity measure. The multi-resolution dissimilarity measure is expressed by the following equation:

$$RD(p,q) = \sum_{i=S_o}^{S_a} F\left( DC_i^{L(p,q)},\ RE_i^{L(p,q)} \right) \qquad (3)$$

where p and q are two adjacent regions at the localization scale $S_o$, L(p,q) is the linkage list derived from the scale-space generation stage, $DC_i$ denotes the dynamics of contours in scale i, $RE_i$ denotes the relative entropy of the sample distributions of the two examined regions in scale i, and F(·,·) denotes the fuzzy inference scheme that yields the region dissimilarity.

Dynamics of Contours. This is a contrast measure based on the notion of the dynamics of the minima that were traced during the watershed process and on the flooding scenario of the watershed transform. The latter is used to locate the most significant minimum of the flooding chain [11],[14].

Relative Entropy. This is a dissimilarity measure for two distributions and is defined as:

$$RE_{p,q} = \sum_{g=0}^{255} \left[ P_p(g) \cdot \log_{10}\frac{P_p(g)}{P_q(g)} + P_q(g) \cdot \log_{10}\frac{P_q(g)}{P_p(g)} \right] \qquad (4)$$

$P_k$ denotes the luminance probability density function for region k and $RE_{p,q}$ the relative entropy of the two distributions. Several techniques are available for the probability density estimation of multivariate samples. Non-parametric techniques have the advantage that they make no explicit assumptions about the underlying distribution. A non-parametric method widely used in the statistical community is the multivariate Parzen density estimator, and this is the method employed here. This estimation method is quite efficient for comparing the distributions of two regions regardless of their size. It should also be noted that the probability distribution is estimated as a vector process over the three channels of the image.

Fuzzy Rule-Based Scheme. The main advantage of this approach is the inherent flexibility of the fuzzy set definitions and of the inference mechanism, together with the fact that its outcome is normalized and can thus be easily handled in the merging process. In this work a two-input, one-output fuzzy reasoning system is employed to express the dissimilarity in each scale. Its input variables are the dynamics of contours and the relative entropy, and its output variable is the grade of dissimilarity. Both input variables are divided into two sets, namely SMALL and LARGE. The output variable comprises two fuzzy sets called SIMILAR and NOT_SIMILAR. The shape and range of all the fuzzy sets are depicted in Figure 1.
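The symmetric relative entropy of equation (4) can be sketched as follows for the luminance values of two regions. The Parzen estimate is implemented here as a Gaussian-smoothed 256-bin histogram; the bandwidth, the clipping constant `eps` that guards against zero probabilities, and the function names are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def parzen_pdf(samples, bandwidth=8.0, levels=256):
    """Gaussian Parzen estimate of a luminance pdf on the grid 0 .. levels-1.

    `samples` must hold integer luminance values in [0, levels).
    """
    g = np.arange(levels, dtype=float)
    hist = np.bincount(np.asarray(samples, dtype=int), minlength=levels).astype(float)
    # kernel[i, j] is the Gaussian weight of a sample at level j seen from level i.
    kernel = np.exp(-0.5 * ((g[:, None] - g[None, :]) / bandwidth) ** 2)
    pdf = kernel @ hist
    return pdf / pdf.sum()

def relative_entropy(p, q, eps=1e-12):
    """Equation (4): sum over g of p*log10(p/q) + q*log10(q/p)."""
    p = np.clip(p, eps, None)   # assumption: clip to avoid log of zero
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log10(p / q) + q * np.log10(q / p)))
```

Because each pdf is normalized, the measure does not depend on the number of pixels in either region, which matches the remark that regions can be compared regardless of their size.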

Figure 1: Fuzzy sets of input and output variables.
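A two-input, one-output scheme of this kind, together with the summation over linked scales of equation (3), might be sketched as below. The membership functions of Figure 1 are not reproduced in this text, so linear ramps on inputs normalized to [0, 1] are assumed; AND is taken as min, OR as max, and the output is defuzzified as a weighted average of the singleton values 0 (SIMILAR) and 1 (NOT_SIMILAR). The two rules are those of the paper's inference scheme (SMALL AND SMALL gives SIMILAR; LARGE OR LARGE gives NOT_SIMILAR); everything else, including the function names, is hypothetical.

```python
def large(x, lo=0.0, hi=1.0):
    """Assumed LARGE membership: linear ramp from 0 at `lo` to 1 at `hi`."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

def small(x, lo=0.0, hi=1.0):
    """Assumed SMALL membership: complement of LARGE."""
    return 1.0 - large(x, lo, hi)

def fuzzy_dissimilarity(dc, re):
    """F(DC, RE) for one scale; inputs are assumed normalized to [0, 1]."""
    w_similar = min(small(dc), small(re))        # rule 1: AND -> min
    w_not_similar = max(large(dc), large(re))    # rule 2: OR  -> max
    total = w_similar + w_not_similar
    # Weighted average of singleton outputs SIMILAR = 0 and NOT_SIMILAR = 1.
    return w_not_similar / total if total > 0 else 0.5

def multiscale_dissimilarity(linked_features):
    """Equation (3): sum F(DC_i, RE_i) over the scales in the linkage list.

    `linked_features` is an iterable of (DC_i, RE_i) pairs, one per linked scale.
    """
    return sum(fuzzy_dissimilarity(dc, re) for dc, re in linked_features)
```

For example, `multiscale_dissimilarity([(0.0, 0.0), (1.0, 0.0), (0.5, 0.5)])` adds the per-scale grades of three linked scales. Each grade lies in [0, 1], so the sum is bounded by the number of scales.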

The fuzzy inference includes the following rules:

1. If dynamics_of_contours (SMALL) AND relative_entropy (SMALL) THEN output (SIMILAR)
2. If dynamics_of_contours (LARGE) OR relative_entropy (LARGE) THEN output (NOT_SIMILAR)

4. REGION MERGING

The RAG is simplified by keeping, for each node, only the arc with the lowest weight RD(p,q); this results in the Most Coherent Neighbor Graph (MCNG). The arcs of the MCNG are registered and sorted in increasing order of weight. The nodes linked by the MCNG are subsequently merged and updated until the termination criterion is met. The termination criterion is a threshold determined from the distribution (histogram) of the MCNG weights. This threshold is experimentally set to 5% of the maximum value of the histogram. Figure 2 illustrates the distribution of the MCNG weights (merging costs) for the multi-resolution dissimilarity measure and its single scale counterpart.
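The graph simplification and one merging pass can be sketched as follows. The RAG is represented as a dictionary mapping region pairs to weights, regions are merged with a union-find structure, and the threshold is passed in by the caller. These choices and the function names are assumptions for illustration; the histogram-based threshold selection and the weight update after each merge are not reproduced.

```python
def most_coherent_neighbor_graph(rag):
    """Keep, for each node, only its lowest-weight arc; return arcs sorted by weight.

    `rag` maps (p, q) region pairs to their dissimilarity RD(p, q).
    """
    best = {}
    for (p, q), w in rag.items():
        for node in (p, q):
            if node not in best or w < best[node][0]:
                best[node] = (w, (p, q))
    arcs = {arc: w for w, arc in best.values()}   # drop duplicate arcs
    return sorted(arcs.items(), key=lambda kv: kv[1])

def merge_once(rag, threshold):
    """Merge regions joined by MCNG arcs whose weight does not exceed `threshold`.

    Returns a dict mapping each region label to its representative label.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for p, q in rag:
        find(p)
        find(q)
    for (p, q), w in most_coherent_neighbor_graph(rag):
        if w > threshold:
            break                           # arcs are sorted, so the rest are costlier
        parent[find(p)] = find(q)
    return {node: find(node) for node in list(parent)}
```

In a fuller implementation the RAG weights of the merged regions would be recomputed and the MCNG rebuilt before the next pass, which is the "merged and updated" loop described above.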

Figure 2: Histogram of the merging costs for (a) single scale and (b) multi-resolution dissimilarity measure (image Lena). [Two histogram panels, "Single Scale case" and "Multiple Scale case"; horizontal axis: Edge Cost (x0.05); vertical axis: Edge Histogram.]

5. EXPERIMENTAL RESULTS

In order to verify its efficiency, the proposed method (referred to as MSV: multi-resolution version) has been tested on several color images, subjectively and objectively, in comparison to its single scale counterpart (SSV: single scale version), where the dissimilarity measure is estimated only at the localization scale. Qualitative results are illustrated in figures 3-5, which show the original image, the oversegmentation produced at the localization scale, and the final results of the SSV and MSV segmentation schemes. A quantitative analysis of the segmentation results was obtained using the Liu and Yang criterion [17]:

$$LGC = \frac{R}{h \cdot w \cdot c} \cdot \sum_{i=1}^{R} \sigma_i^2 \cdot A_i \qquad (5)$$

where h, w and c are the numbers of rows, columns and channels of the image respectively, R is the total number of regions, $\sigma_i^2$ is the color error over region i, and $A_i$ is the number of pixels of region i. This criterion expresses the trade-off between the suppression of heterogeneity and the preservation of details. The smaller the value of LGC, the better the segmentation result. Table 1 gives a comparative evaluation for our test images in terms of initial and final number of regions and segmentation cost. These quantities correspond to the results of figures 3-5. For all three test images, the multiple scale version produces fewer final regions and a lower segmentation cost than the single scale one.

6. CONCLUSION

A multi-resolution, watershed-driven, split and merge color segmentation method has been presented. It utilizes information from the superficial and deep image structure to estimate the dissimilarity between adjacent regions at the localization scale. The employed features, i.e. dynamics of contours and relative entropy, are processed by a fuzzy rule-based scheme to quantify the dissimilarity of adjacent regions in each scale. This approach provides increased flexibility and a normalized outcome that can be handled more efficiently in the merging process. The final multiscale merging costs are derived from several scales by means of linking and downward projection across scales. The final segmentation is derived by the region merging process.

The effectiveness of this method has been evaluated qualitatively and quantitatively against its single scale counterpart to verify its reliability. It was concluded that integrating color and homogeneity information from different scales yields more effective results than the single scale counterpart. Our future objectives include the integration of an earlier fuzzy dissimilarity function [9] in the proposed scheme and a method to automatically select the localization scale.

Figure 3: (a) Original image House (b) oversegmentation (c) SSV (d) MSV.

Figure 4: (a) Original image Parrots (b) oversegmentation (c) SSV (d) MSV.

Figure 5: (a) Original image Lena (b) oversegmentation (c) SSV (d) MSV.

| Image | Method | Initial Regions | Final Regions | LGC |
|---|---|---|---|---|
| House | SSV | 1048 | 285 | 63.68 |
| House | MSV | 1048 | 140 | 45.82 |
| Lena | SSV | 718 | 173 | 53.66 |
| Lena | MSV | 718 | 79 | 33.38 |
| Parrots | SSV | 1011 | 121 | 69.8 |
| Parrots | MSV | 1011 | 101 | 58.42 |

Table 1: Comparison of single scale and multi-resolution segmentations for several images

7. ACKNOWLEDGMENTS

The above work was partially supported by the General Secretariat of Research and Technology of Greece under grant 97ΥΠ46 and the E.U. Socrates exchange program.

8. REFERENCES

[1] S. Beucher, Watershed, Hierarchical Segmentation and Waterfall Algorithm, In Proc. ISMM’94, pp. 69-76, 1994.
[2] A. Bleau and L. J. Leon, Watershed-Based Segmentation and Region Merging, Computer Vision and Image Understanding, vol. 77, pp. 317-370 (2000).
[3] J. Bosworth, T. Koshimizu, S.T. Acton, Automated Segmentation of Surface Soil Moisture From Landsat TM Data, In Proc. of the IEEE Southwest Symposium on Image Analysis and Interpretation, 1998.
[4] J. Crespo, R. Schafer, The flat-zone Approach and Color Images, In Proc. ISMM’94, pp. 85-92, 1994.
[5] P. de Smet, R. Pires, D. De Vleeschauwer, I. Bruyland, Activity Driven Non-linear Diffusion for Color Image Watershed Segmentation, Journal of Electronic Imaging (SPIE), vol. 8, no. 3, 1999.
[6] K. Haris, S. Efstratiadis, N. Maglaveras and A.K. Katsaggelos, Hybrid Image Segmentation Using Watersheds and Fast Region Merging, IEEE Transactions on Image Processing, vol. 7, no. 12, pp. 1684-1699 (December 1998).
[7] C. W. Lim, N. C. Kim, S.C. Jun and C. S. Jung, Rate-Distortion Based Image Segmentation Using Recursive Merging, IEEE Transaction Circuits and Systems for Video Technology, vol. 10, no. 7, pp. 1121-1134 (October 2000).
[8] S. Makrogiannis, G. Economou and S. Fotopoulos, A Color Preserving Image Segmentation Method, Proceedings of European Conference on Circuit Theory and Design (ECCTD), Stresa, Italy, vol. 2, September 1999, pp. 928-931.
[9] S. Makrogiannis, G. Economou and S. Fotopoulos, A Fuzzy Dissimilarity Function for Region Based Segmentation of Color Images, Int. Journal of Pattern Recognition and Artificial Intelligence, vol. 15, no. 2, pp. 255-267 (March 2001).
[10] F. Meyer, Color Image Segmentation, Proceedings of 4th IEEE Conference on Image Processing and Applications, vol. 354:53, pp. 303-306 (1992).
[11] L. Najman, M. Schmitt, Geodesic Saliency of Watershed Contours and Hierarchical Segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 12, pp. 1163-1173 (1996).
[12] N.R. Pal, S.K. Pal, A Review on Image Segmentation Techniques, Pattern Recognition, vol. 26, no. 9, pp. 1277-1294 (1993).
[13] P. Perona, J. Malik, Scale-Space and Edge Detection Using Anisotropic Diffusion, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 7, pp. 629-639 (1990).
[14] I. Pratikakis, H. Sahli, J. Cornelis, Hierarchical Segmentation using Dynamics of Multiscale Gradient Watershed, Proceedings of 11th Scandinavian Conference on Image Analysis 99, pp. 577-584 (1999).
[15] A. Tremeau and Philippe Colantoni, Regions Adjacency Graph Applied to Color Image Segmentation, IEEE Transactions Image Processing, vol. 9, no. 4 (April 2000).
[16] I. Vanhamel, I. Pratikakis, and H. Sahli, Hierarchical multiscale watershed segmentation of color images, Proceedings of First International Conference on Color in Graphics and Image Processing, Saint-Etienne, France, 2000, pp. 93-100.
[17] Y.-H. Yang and J. Liu, Multiresolution Image Segmentation, IEEE Transactions Pattern Analysis and Machine Intelligence, vol. 16, pp. 689-700 (1994).