Fusion of Color and Edge Information for Improved Segmentation and Edge Linking

Eli Saber$^{1,2}$, A. Murat Tekalp$^1$, and Gozde Bozdagi$^3$

$^1$ Department of Electrical Engineering and Center for Electronic Imaging Systems, University of Rochester, Rochester, New York 14627. Phone: (716) 275-3774, FAX: (716) 473-0486, E-mail: {saber, [email protected]
$^2$ Systems and Platforms Development, Xerox Corporation, Webster, NY 14580. Phone: (716) 383-7582, FAX: (716) 383-7395, E-mail: [email protected]
$^3$ Digital Imaging Technology Center, Xerox Corporation, Webster, NY 14580
Abstract
We propose a new method for combined color image segmentation and edge linking. The image is first segmented based on color information only. The segmentation map is modeled by a Gibbs random field, to ensure formation of spatially contiguous regions. Next, spatial edge locations are determined using the magnitude of the gradient of the 3-channel image vector field. Finally, regions in the segmentation map are split and merged by a region-labeling procedure to enforce their consistency with the edge map. The boundaries of the final segmentation map constitute a linked edge map. Experimental results are reported.

Keywords: Color segmentation, split and merge, edge linking.
1 Introduction

Image segmentation and edge linking are long-standing, difficult problems in image analysis and computer vision. Image segmentation based on color or gray level alone usually fails to identify meaningful objects, while edge detection using classical approaches usually fails to extract connected edges. We therefore propose a new method that uses both color and spatial edge information to obtain more "meaningful regions," defined as regions with distinct colors whose boundaries coincide with connected spatial edges. The approach rests on the principle that the segmentation map, which is an indicator of "similarities" between pixels, must be consistent with the edge map, which represents "discontinuities" between pixels. Recent approaches for color segmentation include clustering based on histograms [1, 2], minimum volume ellipsoids [3], fuzzy c-means [4, 5], competitive learning [6], and Gibbs random field (GRF) modeling in a Bayesian framework [7-12]. Among early work using GRF
is the method proposed by Wright [7], where each color component is segmented separately. More recently, Chang et al. [8] have extended the adaptive Bayesian segmentation approach of Pappas [9], which uses a space-varying image intensity model, to color images. Multiresolution approaches using Markov random fields (MRF) have been proposed in [10, 11]. In [12], Panjwani and Healey employed MRF texture models for color image segmentation, where textures are characterized in terms of spatial and color space interactions. For a more detailed review of gray and color image segmentation methods, the reader is referred to [13].

Edge detection on color images can be performed by: i) detecting edges in each color channel independently and then combining the results; ii) detecting edges in the luminance channel only; or iii) detecting edges based on the gradient of the 3-channel image field. Here, we employ the vector-gradient method proposed by Lee and Cok [14].

Prior art on integrating segmentation and edge detection can be classified as: i) knowledge-based methods, ii) pixel-wise Boolean methods, and iii) region refinement methods. Most of the recent work belongs to the third class. Among these, Pavlidis and Liow [15] describe a method to combine segments obtained using a region-growing approach, where edges between the regions are eliminated or modified based on contrast, gradient, and the shape of the boundary. Haddon and Boyce [16] generate regions by partitioning the image co-occurrence matrix, and then refine them by relaxation using the edge information. Chu and Aggarwal [17] present an optimization method to integrate segmentation and edge maps obtained from several channels, including visible, infrared, etc., where user-specified weights and arbitrary mixing of region and edge maps are allowed.

In this paper, we present a new, robust integration method by region labeling, as opposed to pixel labeling (see Figure 1 for a flowchart). We first compute an initial color segmentation map, where labels form spatially contiguous clusters, called regions. Then, region labels are optimized by split and merge procedures to enforce consistency with the edge map. Color segmentation and edge detection are performed in the YES luminance-chrominance color space, defined by a linear transformation from RGB [18]. However, the proposed segmentation and fusion framework can be applied in any other suitable color space [19]. Detailed descriptions of the components of the system are provided in the following: Section 2 describes the adaptive color segmentation module, which is an extension of the method presented in [8] to include a "small" class elimination step. Section 3 addresses estimation of the spatial gradient of the 3-channel image, and splitting of color regions containing edge segments.
Section 4 covers generation of a thinned edge field, and merging of regions based on this edge field. Potential applications and experimental results are presented in Section 5.
2 Adaptive Bayesian Color Segmentation

The maximum a posteriori probability (MAP) segmentation requires maximization of the a posteriori probability density function (pdf)

$$p(x \mid y) \propto p(y \mid x)\, p(x), \tag{1}$$

of the segmentation field $x$, given the multispectral image $y = [y_1, y_2, y_3]^T$. In (1), $p(y \mid x)$ denotes the conditional pdf of the image given the segmentation, and $p(x)$ is the a priori pdf, modeled by a Gibbs distribution

$$p(x) = \frac{1}{Z} \exp\Big[-\sum_{c \in C} V_c(x)\Big], \tag{2}$$

where $Z$ denotes the partition function and $C$ is the set of all cliques. The individual two-pixel clique potential $V_c(i,j;k,l)$ is given by

$$V_c(i,j;k,l) = \begin{cases} -\beta & \text{if } x(i,j) = x(k,l) \\ +\beta & \text{otherwise,} \end{cases} \tag{3}$$

where $(i,j)$ and $(k,l)$ denote pixels forming a clique $c$, and $\beta$ is a positive number in order to impose spatial connectivity of the segmentation field. The image can be modeled by a uniform mean [20], or a space-variant mean [9, 8], plus a zero-mean white Gaussian residual. Using the space-variant formulation, the conditional pdf in (1) can be expressed as [8]

$$p(y \mid x) \propto \exp\Big\{ -\sum_{(i,j)} \sum_{n=1}^{3} \frac{[y_n(i,j) - \mu_n^{x(i,j)}(i,j)]^2}{2\sigma_n^2} \Big\}, \tag{4}$$

where $\mu_n^{x(i,j)}(i,j)$ is the mean function for each distinct class as a function of location $(i,j)$ and color channel $n$, and $\sigma_n^2$ denotes the variance of the $n$th channel.

The algorithm employed to obtain the color segmentation is similar to [8], except for the "small" class elimination step incorporated within the iteration process. An iteration is composed of three steps (see Figure 2 for a flowchart): Starting with an initial segmentation with $K$ classes, the first step encompasses the calculation of the means for all pixels belonging to class $x(i,j)$ within a window, whose size is initially taken to be equal to the size of
the image itself. In the second step, the segmentation labels are updated by the iterated conditional modes (ICM) algorithm. This procedure often converges to a local minimum of the Gibbs potential within a relatively small number of cycles. The last step consists of eliminating "small" classes, which is accomplished by combining a class whose membership is less than a prespecified threshold with its nearest-color class. This is different from eliminating small regions, since several regions (at different locations) may have the same class membership. This last step adapts the number of classes to the image at hand. Upon convergence, the size of the window employed in the computation of the means is reduced by a factor of two in both the horizontal and vertical directions, and the procedure is repeated until the window size is smaller than a user-specified threshold. As a result, the algorithm starts with global estimates and slowly adapts to the local characteristics of each region.
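To make the label-update step concrete, the sketch below (Python/NumPy) performs a single ICM sweep over the local energy implied by Eqs. (2)-(4). It is a minimal illustration; the function name, the 4-neighbor clique system, and the array layout are our assumptions, not the authors' implementation.

```python
import numpy as np

def icm_sweep(y, x, means, sigma, beta, K):
    """One ICM sweep: relabel each pixel to minimize the local energy of
    Eqs. (2)-(4). y: (H, W, 3) image; x: (H, W) labels in [0, K);
    means: (K, H, W, 3) space-varying class means; sigma: (3,) channel stds."""
    H, W = x.shape
    for i in range(H):
        for j in range(W):
            best_k, best_e = x[i, j], np.inf
            for k in range(K):
                # Data term: squared residual against the class mean (Eq. 4).
                e = np.sum((y[i, j] - means[k, i, j])**2 / (2 * sigma**2))
                # Prior term (Eq. 3): -beta for agreeing 4-neighbors, +beta otherwise.
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W:
                        e += -beta if x[ni, nj] == k else beta
                if e < best_e:
                    best_k, best_e = k, e
            x[i, j] = best_k
    return x
```

A full iteration wraps such sweeps between the mean re-estimation and small-class elimination steps, halving the window once the sweep converges.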
3 Spatial Gradient Estimation and Region Splitting

In order to enforce consistency of the color segmentation map with color edge locations, we propose a split and merge procedure. The split process is guided by the magnitude of the spatial gradient of the 3-channel image.
3.1 Spatial Gradient Estimation

Both the split procedure and the edge field generation module use the magnitude of the gradient $G(i,j)$ of the 3-channel color image field, computed as described by Lee and Cok [14]. The computation is reviewed here for completeness. Let a color image $y$ be represented by a vector field $y = [y_1, y_2, y_3]^T$, where $y_1, y_2, y_3$ denote the three color channels. The spatial gradient matrix $D$ of the vector field $y$ can be defined as

$$D(i,j) = \begin{bmatrix} \partial y_1/\partial u & \partial y_1/\partial v \\ \partial y_2/\partial u & \partial y_2/\partial v \\ \partial y_3/\partial u & \partial y_3/\partial v \end{bmatrix} \tag{5}$$

where $u$ and $v$ are the spatial coordinates for the pixel $(i,j)$. The square root of the largest eigenvalue of

$$D^T D(i,j) = \begin{bmatrix} p(i,j) & t(i,j) \\ t(i,j) & q(i,j) \end{bmatrix} \tag{6}$$

and its corresponding eigenvector represent the equivalents of the magnitude and direction of the gradient of a scalar field, respectively. Note that $D^T D$ is symmetric and positive semidefinite, and hence has real, nonnegative eigenvalues. The largest eigenvalue of $D^T D$ can be expressed as

$$\lambda(i,j) = \frac{1}{2}\Big[\, p(i,j) + q(i,j) + \sqrt{(p(i,j) + q(i,j))^2 - 4\,(p(i,j)\,q(i,j) - t(i,j)^2)} \,\Big] \tag{7}$$

where

$$p(i,j) = \left(\frac{\partial y_1}{\partial u}\right)^2 + \left(\frac{\partial y_2}{\partial u}\right)^2 + \left(\frac{\partial y_3}{\partial u}\right)^2 \tag{8}$$

$$t(i,j) = \frac{\partial y_1}{\partial u}\frac{\partial y_1}{\partial v} + \frac{\partial y_2}{\partial u}\frac{\partial y_2}{\partial v} + \frac{\partial y_3}{\partial u}\frac{\partial y_3}{\partial v} \tag{9}$$

$$q(i,j) = \left(\frac{\partial y_1}{\partial v}\right)^2 + \left(\frac{\partial y_2}{\partial v}\right)^2 + \left(\frac{\partial y_3}{\partial v}\right)^2 \tag{10}$$

Finally, the magnitude of the gradient is $G(i,j) = \sqrt{\lambda(i,j)}$.
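As a concrete rendering of Eqs. (5)-(10), the sketch below computes $G(i,j)$ with NumPy. Finite differences via `np.gradient` stand in for whatever derivative operator an implementation prefers; the function name and axis convention are our assumptions.

```python
import numpy as np

def color_gradient_magnitude(y):
    """Magnitude of the vector-field gradient, Eqs. (5)-(10).
    y: (H, W, 3) float image; returns G with G[i, j] = sqrt(lambda_max)."""
    # Per-channel partial derivatives; rows map to v and columns to u
    # (an assumed convention). Each array has shape (H, W, 3).
    dv, du = np.gradient(y, axis=(0, 1))
    p = np.sum(du * du, axis=2)                      # Eq. (8)
    t = np.sum(du * dv, axis=2)                      # Eq. (9)
    q = np.sum(dv * dv, axis=2)                      # Eq. (10)
    # Largest eigenvalue of D^T D (Eq. 7); clip guards against tiny
    # negative values from floating-point roundoff.
    disc = np.sqrt(np.maximum((p + q)**2 - 4 * (p * q - t**2), 0.0))
    lam = 0.5 * (p + q + disc)
    return np.sqrt(lam)                              # G(i, j)
```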
3.2 Region Splitting

Color segments that have at least one edge segment within their boundary will be split into multiple regions. The splitting is accomplished by first thresholding $G(i,j)$ at 10% of the maximum of $G(i,j)$, and labeling all contiguous regions therein. Next, the labeled gradient map is overlaid on the color segmentation map, and color regions containing multiple edge-based regions are split accordingly. A flowchart of the split procedure is provided in Figure 3.

Region splitting is demonstrated by the example in Figure 4. The original image, shown in Figure 4a, is made up of two regions on a white background. Assume that the GRF segmentation has resulted in a single region, as shown in Figure 4b. Let the thresholded color gradient, depicted in Figure 4c, yield three contiguous regions. These three regions are used as seeds to initialize the splitting process. They are allowed to grow in the direction of increasing gradient, stopping short of crossing color segmentation boundaries, until all pixels within the image are assigned to one of the seed regions. In this example, the result of the split procedure is identical to the original image shown in Figure 4a: the shaded region in the color segmentation map (Figure 4b) is split into two regions. It should be noted that oversplitting of regions is well accommodated, since the split process is followed by a merging procedure (described in the next section).
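The sketch below mirrors the split procedure under stated assumptions: `scipy.ndimage.label` marks the contiguous low-gradient seed regions, and scikit-image's watershed, which floods in order of increasing gradient, stands in for the seed-growing rule. All names are ours, and the growing rule is an approximation of, not identical to, the procedure above.

```python
import numpy as np
from scipy import ndimage
from skimage.segmentation import watershed  # stand-in for the growing rule

def split_regions(seg, G, frac=0.10):
    """Split color regions containing multiple edge-based regions (Sec. 3.2).
    seg: (H, W) labels from the GRF stage; G: gradient magnitude."""
    low = G < frac * G.max()          # below 10% of max: interior pixels
    seeds, _ = ndimage.label(low)     # contiguous low-gradient seed regions
    out = np.zeros_like(seg)
    next_label = 1
    for r in np.unique(seg):
        mask = seg == r
        inside = np.unique(seeds[mask])
        inside = inside[inside != 0]
        if len(inside) <= 1:          # at most one seed: keep region intact
            out[mask] = next_label
            next_label += 1
            continue
        # Grow the seeds in order of increasing gradient, confined to this
        # region, so the split never crosses color segmentation boundaries.
        markers = np.where(mask & np.isin(seeds, inside), seeds, 0)
        grown = watershed(G, markers, mask=mask)
        for s in inside:
            out[grown == s] = next_label
            next_label += 1
    return out
```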
4 Edge Field Estimation and Region Merging

In this section, horizontal and vertical edge fields are first generated (as a function of the magnitude of the spatial gradient) between those sites which have different segmentation labels, to aid the merging step. Then, two methods are presented for region merging.
4.1 Edge Field Estimation

The sites of the horizontal and vertical edge fields are located in between the pixel sites $(i,j)$ and $(i+1,j)$, and $(i,j)$ and $(i,j+1)$, respectively. The module consists of two components. The first component is a binarization of $G(i,j)$ by two-level thresholding with a neighborhood check, similar to Canny's hysteresis idea [21]. This is done as follows: $L(i,j) = 1$ if $G(i,j) > T_H$, or if $G(i,j) > T_L$ and it is connected to at least one point that has a gradient magnitude greater than $T_H$, where $T_H$ and $T_L$ are the high and low thresholds, respectively; otherwise $L(i,j) = 0$. The second component performs edge thinning about the borders of the split-segmentation map using zero-crossings of the partials of the gradient magnitude. That is, the horizontal ($L_h(\cdot)$) and vertical ($L_v(\cdot)$) edge fields are generated between those sites which have different segmentation labels $x(i,j)$ and $x(i+1,j)$, and $x(i,j)$ and $x(i,j+1)$, respectively, according to

$$L_h(i,j;i+1,j) = \begin{cases} 1 & \text{if } L(i,j) = 1,\ G_x(i,j) > 0, \text{ and } G_x(i+1,j) < 0 \\ 1 & \text{if } L(i,j) = 1,\ G_x(i,j) < 0, \text{ and } G_x(i+1,j) > 0 \\ 0 & \text{otherwise} \end{cases} \tag{11}$$

$$L_v(i,j;i,j+1) = \begin{cases} 1 & \text{if } L(i,j) = 1,\ G_y(i,j) > 0, \text{ and } G_y(i,j+1) < 0 \\ 1 & \text{if } L(i,j) = 1,\ G_y(i,j) < 0, \text{ and } G_y(i,j+1) > 0 \\ 0 & \text{otherwise} \end{cases} \tag{12}$$

where $G_x(i,j)$ and $G_y(i,j)$ denote the partial derivatives of $G(i,j)$ in the $x$- and $y$-directions, which are approximated by finite difference operators.
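A minimal sketch of the first component, the hysteresis binarization, assuming SciPy connected-component labeling implements the neighborhood check; the function and parameter names are ours.

```python
import numpy as np
from scipy import ndimage

def hysteresis(G, t_low, t_high):
    """Two-level thresholding with a neighborhood check (Section 4.1):
    L = 1 where G > t_high, or where G > t_low and the pixel is connected
    to at least one pixel with G > t_high. Assumes t_high >= t_low."""
    strong = G > t_high
    weak = G > t_low
    # Label connected components of the weak map, then keep every
    # component that contains at least one strong pixel.
    lab, _ = ndimage.label(weak)
    keep = np.unique(lab[strong])
    return np.isin(lab, keep[keep != 0])
```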
4.2 Region Merging

Having obtained the color-only segmentation map after the split process, and the edge field, we propose two methods for merging regions which do not have a "significant" edge between them.
Algorithm A

1. Suppose we have $L$ regions and $K$ classes. Sort the regions $R_s$, $s = 1, \ldots, L$, from smallest to largest in size, and start with the smallest region.

2. Change the label $x_s$ ($x_s = 1, \ldots, K$) of the region $R_s$ under consideration to that of a neighboring region $R_t$, $s, t \in \{1, \ldots, L\}$, which minimizes the energy term given by

$$E_s(x_t) = \sum_{\substack{(i,j) \in R_s \\ x(i,j) = x_t}} \sum_{n=1}^{3} \frac{[y_n(i,j) - \mu_n^{x(i,j)}(i,j)]^2}{2\sigma_n^2} + \sum_{c \in C} V_c(x), \tag{13}$$

where

$$V_c(i,j;i+1,j) = \begin{cases} -\gamma & \text{if } x(i,j) \neq x(i+1,j) \text{ and } L_h(i,j;i+1,j) = 1 \\ -\gamma & \text{if } x(i,j) = x(i+1,j) \text{ and } L_h(i,j;i+1,j) = 0 \\ +\gamma & \text{otherwise} \end{cases} \tag{14}$$

$$V_c(i,j;i,j+1) = \begin{cases} -\gamma & \text{if } x(i,j) \neq x(i,j+1) \text{ and } L_v(i,j;i,j+1) = 1 \\ -\gamma & \text{if } x(i,j) = x(i,j+1) \text{ and } L_v(i,j;i,j+1) = 0 \\ +\gamma & \text{otherwise.} \end{cases} \tag{15}$$

The window size for computing the local means is taken equal to the last window size utilized in the color segmentation algorithm. The mean estimate for class $k$ at the pixel $(i,j)$ is deemed unreliable if the number of pixels of class $k$ inside the window is less than a pre-specified threshold, $N_{min} = \min(\text{window width}, \text{window height})$, and such pixels are not employed in the energy computation. The result of merging is, therefore, affected by the window size chosen; e.g., a larger window size would result in less merging.

3. If all regions have been visited, go to Step 4; otherwise find the next smallest region and go to Step 2.

4. Stop if none of the region labels have changed; otherwise re-label the regions, update the edge fields, and go to Step 1.
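For illustration, the following sketch evaluates the energy of Eq. (13) with the edge-dependent clique potentials of Eqs. (14)-(15) in vectorized NumPy. The names and array shapes are our assumptions, and the reliability test on the class means ($N_{min}$) is omitted for brevity.

```python
import numpy as np

def region_energy(y, x, region_mask, cand, means, sigma, gamma, Lh, Lv):
    """Energy E_s(x_t) of Eq. (13) for assigning candidate label `cand`
    to the region selected by the boolean mask `region_mask`.
    Lh: (H-1, W) and Lv: (H, W-1) boolean edge fields on the dual lattice."""
    x_try = np.where(region_mask, cand, x)
    # Data term over the region's pixels against the candidate class means.
    data = np.sum((y[region_mask] - means[cand][region_mask])**2
                  / (2 * sigma**2))
    # Clique terms of Eqs. (14)-(15): a label difference that coincides with
    # an edge, or agreement where there is no edge, contributes -gamma;
    # every other configuration contributes +gamma.
    dh = x_try[:-1, :] != x_try[1:, :]
    dv = x_try[:, :-1] != x_try[:, 1:]
    Vh = np.where(dh == Lh, -gamma, gamma)
    Vv = np.where(dv == Lv, -gamma, gamma)
    return data + Vh.sum() + Vv.sum()
```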
Algorithm B

1. Starting with $L$ regions and $K$ classes, compute a stability measure for each region $R_s$ as follows:

$$\text{Stability}_s = E_s(h) - E_s(x_s), \tag{16}$$

where

$$h = \arg\min_{m \in \{1, \ldots, K\},\ m \neq x_s} E_s(m),$$

and $E_s$ is computed using Eq. (13).

2. Change the label of the region with minimum stability to the one that corresponds to the lowest energy; then update its stability and that of its neighbors, until $\text{Stability}_s \geq 0$ for all regions.

3. Stop if none of the region labels have changed; otherwise re-label the regions, update the edge fields, and go to Step 1.
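A compact sketch of this stability-driven loop follows, assuming an `energy(r, k)` callable that evaluates Eq. (13) for region `r` under label `k`. For brevity it recomputes every stability each pass instead of updating only the relabeled region and its neighbors, and it does not re-derive the edge fields between passes; labels run 0 to K-1 as a code convention.

```python
def merge_regions_b(regions, labels, K, energy):
    """Stability-driven merging in the spirit of Algorithm B.
    regions: list of region ids; labels[r]: current class of region r;
    energy(r, k): E_r(k) per Eq. (13). All names are ours."""
    while True:
        best = {}
        for r in regions:
            rivals = [k for k in range(K) if k != labels[r]]
            h = min(rivals, key=lambda k: energy(r, k))
            best[r] = (energy(r, h) - energy(r, labels[r]), h)  # Eq. (16)
        r_min = min(best, key=lambda r: best[r][0])
        stability, h = best[r_min]
        if stability >= 0:     # every region is stable: done
            return labels
        labels[r_min] = h      # relabel the least stable region
```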
5 Applications and Results

Most multimedia image database indexing and annotation systems available today offer automatic classification based on pixel features, such as gray level and color. However, the use of more sophisticated image features, such as texture-based or shape-based classification, often requires user interactivity to select meaningful regions (object boundaries). Another active area of research today is object-based and region-based image and video compression, which requires automatic selection of regions of interest, including object-background separation. The improved color segmentation and edge linking methods proposed in this paper are intended for meaningful region selection in these and related applications.

The performance of the proposed color and edge fusion method is demonstrated on six images, shown in Figures 5, 6, 7, 8, 9, and 10. The original images are of RGB type, where each channel is quantized to 8 bits/pixel, with the sizes 266 x 166, 305 x 149, 288 x 130, 240 x 144, 256 x 256, and 228 x 256, respectively. In each figure, we show (a) the original image; (b) the output of the GRF-based approach; (c) the magnitude of the color gradient used throughout the segmentation; (d) the result of the splitting procedure; (e) the edge field employed in the region merging process; (f) the final segmentation obtained using Algorithm B with the split option turned off, and (g) with it turned on; (h) the final segmentation obtained using Algorithm A with the split option on; and (i) the linked edge map for the segmentation shown in (g). The thickness of the edges shown in (i) is two pixels. Except for the original images, different colors signify different classes. The edge values are shown as black on a white background, where the true edges occupy the dual lattice defined between any two pixels, thereby resulting in an edge map that is twice the size of the original image, as shown in Figures 5e, 6e, 7e, 8e, 9e, and 10e.

The parameters utilized throughout the segmentation and fusion are: $\beta = 3$, $\gamma = 3$, $\sigma_n = 5.0$, $T_L = 0.05$, and $T_H = 0.1$. The threshold for the split process was taken equal to 0.04. These were determined empirically through extensive experimentation. Regions smaller than 100 pixels were eliminated by grouping them with their closest neighboring regions in terms of color. However, these do not have to be eliminated if the user requires the level of detail conveyed by these "small" regions.

Figure 5 provides an example of the improvement obtained by the proposed splitting procedure. In Figure 5b, we can see that the side of the boat, the shadow on the water directly underneath, and the lower portion of the water body are grouped into one class. However, these regions are clearly separated in Figure 5d, where the initializations were obtained from Figure 5c. The effect of the splitting procedure on the final segmentation can be clearly observed when comparing Figures 5h and 5g with Figure 5f: the results obtained by employing both the region splitting and merging processes are clearly superior. In addition, the segmentations of the boat and water body are clearly superior to those obtained by the GRF algorithm alone. The groupings of the various regions are also consistent with the semantically "relevant" edges within the image.

Figures 6, 7, 8, 9, and 10 demonstrate the superior results obtained by our proposed algorithms when compared with the outcome of the GRF segmentation. The effect of the splitting procedure is minimal in these cases, since the segmentations obtained by the GRF algorithm (see Figures 6b, 7b, 8b, 9b, and 10b) resulted by and large in an oversegmentation with no inconsistencies across region boundaries. However, note the separation of the stairway in Figure 10d when compared with Figure 10b. The final segmentations obtained using either Algorithm A or B (see Figures 6h/7h/8h/9h/10h and 6g/7g/8g/9g/10g) are clearly more meaningful than those shown in Figures 6b/7b/8b/9b/10b. In particular, the background region is classified into one region in Figures 6g/6h, instead of multiple regions as in Figure 6b. Similar results can be observed when comparing Figures 7g/7h with Figure 7b, where the upper background, the road, and the body of the car are grouped into more meaningful regions.
6 Conclusions

We propose a new method to obtain an improved segmentation and a linked edge map by fusion of spatial edge information and regions resulting from adaptive Bayesian color segmentation. The color segmentation module generates contiguous regions by virtue of GRF modeling of the segmentation field. Regions are later merged by one of two region-labeling algorithms to obtain the desired results. The merging criterion favors combining two regions if there is no spatial edge between the region boundaries. Although both region merging algorithms (Algorithms A and B) provided more accurate segmentations than the basic GRF approach, our experiments indicate that Algorithm B provides a more meaningful segmentation than Algorithm A, and reaches the final solution more quickly. Finally, the execution time of our approach is on the order of a few minutes on a Sparc 20 without any software optimization.
7 Acknowledgement

This work is supported in part by Xerox, an SIUCRC grant from the National Science Foundation, and a New York State Science and Technology Foundation grant. The authors wish to extend their gratitude to Professor Glenn Healey and Mr. David Slater of the University of California at Irvine for providing the image shown in Figure 9a, and to the Computer Vision Research Group of the University of Massachusetts for the image shown in Figure 10a.
References

[1] R. Ohlander, K. Price, and D. R. Reddy, "Picture segmentation using a recursive region splitting method," Computer Graphics and Image Processing, vol. 8, pp. 313-333, 1978.
[2] Y. Ohta, T. Kanade, and T. Sakai, "Color information for region segmentation," Computer Graphics and Image Processing, vol. 13, pp. 222-241, 1980.
[3] J. M. Jolion, P. Meer, and S. Bataouche, "Robust clustering with applications in computer vision," IEEE Trans. Pattern Anal. Mach. Intel., vol. 13, pp. 791-802, Aug. 1991.
[4] T. L. Huntsberger, C. L. Jacobs, and R. L. Cannon, "Iterative fuzzy image segmentation," Pattern Recognition, vol. 18, no. 2, pp. 131-138, 1985.
[5] Y. W. Lim and S. U. Lee, "On the color image segmentation algorithm based on the thresholding and the fuzzy c-means techniques," Pattern Recognition, vol. 23, no. 9, pp. 935-952, 1990.
[6] T. Uchiyama and M. Arbib, "Color image segmentation using competitive learning," IEEE Trans. Pattern Anal. Mach. Intel., vol. 16, pp. 1197-1206, Dec. 1994.
[7] W. A. Wright, "Markov random field approach to data fusion and colour segmentation," Image and Vision Computing, vol. 7, pp. 144-150, 1989.
[8] M. M. Chang, M. I. Sezan, and A. M. Tekalp, "Adaptive Bayesian segmentation of color images," Journal of Electronic Imaging, vol. 3, pp. 404-414, Oct. 1994.
[9] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Trans. Signal Proc., vol. SP-40, pp. 901-914, Apr. 1992.
[10] C. L. Huang, T. Y. Cheng, and C. C. Chen, "Color image segmentation using scale space filter and Markov random field," Pattern Recognition, vol. 25, no. 10, pp. 1217-1229, 1992.
[11] J. Liu and Y. H. Yang, "Multiresolution color image segmentation," IEEE Trans. Pattern Anal. Mach. Intel., vol. 16, pp. 689-700, July 1994.
[12] D. K. Panjwani and G. Healey, "Markov random field models for unsupervised segmentation of textured color images," IEEE Trans. Pattern Anal. Mach. Intel., vol. 17, pp. 939-954, Oct. 1995.
[13] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277-1294, 1993.
[14] H. C. Lee and D. R. Cok, "Detecting boundaries in a vector field," IEEE Trans. Signal Proc., vol. SP-39, pp. 1181-1194, May 1991.
[15] T. Pavlidis and Y. T. Liow, "Integrating region growing and edge detection," IEEE Trans. Pattern Anal. Mach. Intel., vol. PAMI-12, pp. 225-233, Mar. 1990.
[16] J. F. Haddon and J. F. Boyce, "Image segmentation by unifying region and boundary information," IEEE Trans. Pattern Anal. Mach. Intel., vol. PAMI-12, pp. 929-948, Oct. 1990.
[17] C. C. Chu and J. K. Aggarwal, "The integration of image segmentation maps using region and edge information," IEEE Trans. Pattern Anal. Mach. Intel., vol. PAMI-15, pp. 1241-1252, Dec. 1993.
[18] "Xerox color encoding standards," Tech. Rep., Xerox Systems Institute, Sunnyvale, CA, 1989.
[19] H. C. Lee, "Color image quantization for multimedia," in SPSTJ: Symposia on Fine Imaging (Shigaku Kaikan, Tokyo, Japan), pp. 193-196, 1995.
[20] H. Derin and H. Elliott, "Modeling and segmentation of noisy and textured images using Gibbs random fields," IEEE Trans. Pattern Anal. Mach. Intel., vol. PAMI-9, pp. 39-55, Jan. 1987.
[21] J. Canny, "A computational approach to edge detection," IEEE Trans. Pattern Anal. Mach. Intel., vol. PAMI-8, pp. 679-698, Nov. 1986.
Figure 1: Flowchart of the proposed method. The color image feeds two parallel modules, adaptive Bayesian color segmentation using GRF and color edge detection; regions containing edges are split, and adjacent regions are then merged based on color and edge information, yielding the improved segmentation map and the linked edge map.
Figure 2: Flowchart of the GRF segmentation algorithm. An initial segmentation is obtained by 3-channel thresholding, with the window equal to the whole image. Each iteration estimates new class means given the segmentation x, updates x given the means, and eliminates small classes by combining them with the nearest-color class. Upon convergence of x (or at the maximum iteration count), the window size is reduced by two, and the loop repeats until the window falls below the minimum size.
Figure 3: Flowchart of the region splitting algorithm. Small regions are eliminated from the GRF segmentation; the color gradient is thresholded and labeled, with small regions therein also eliminated. Each GRF region is then checked for multiple intersecting regions in the thresholded, labeled gradient; if found, these intersecting regions initialize the split process and are expanded in the direction of increasing gradient until all pixels within the GRF region are visited. Splitting is complete once all regions have been visited.
Figure 4: Region splitting example. (a) Original image; (b) GRF segmentation, a single region whose internal edge is not comprehended within the segmentation; (c) thresholded color gradient (white = high gradient, black = low gradient), whose labeled regions seed the split.
Figure 5: Results on Boat Picture.

Figure 6: Results on Car-1 Picture.

Figure 7: Results on Car-2 Picture.

Figure 8: Results on Garden Picture.

Figure 9: Results on Beach Picture.

Figure 10: Results on House Picture.