Non-contrast Based Edge Descriptor for Image Segmentation

Byung-Gyu Kim∗, Pyeong-Soo Mah∗, Dong-Jo Park∗∗, Jik-Han Jung∗∗, Ju-Seok Park∗∗

∗ Real-time Multimedia Research Team, Embedded S/W Technology Center, ETRI, 161 Gajeong-dong, Yuseong-gu, Daejeon 305-350, Korea. E-mail: {bgkim, pmah}@etri.re.kr
∗∗ Dept. of EECS, KAIST, 373-1 Guseong-dong, Yuseong-gu, Daejeon 305-701, Korea. E-mail: [email protected], {jikhanjung, mangto}@kaist.ac.kr

Abstract

We present an efficient object segmentation strategy based on edge information to assist object-based video coding, motion estimation and motion compensation for the MPEG system. Two parameters are introduced, derived from edge information obtained by analyzing a local histogram. Using these features, a non-contrast based edge function is defined to generate an edge information map, which can be thought of as a gradient image. Improved marker-based region-growing and merging techniques are then derived to separate image regions. The proposed algorithm is tested on several standard images and demonstrates high reliability for object segmentation.

1. Introduction

Most computer vision applications require segmentation of objects or regions from an original image in order to understand or analyze a given image frame. In most algorithms for video object segmentation, moving objects in the time domain are considered to be the meaningful objects. Thus, many object segmentation algorithms are based on the motion field and on change detection between consecutive frames [2]-[3]. The motion field can be used, but it suffers from high noise sensitivity and a heavy computational load. Various algorithms based on edge and color information, using color clustering, watersheds or edge masks, have been suggested to detect the boundary of the video object [1]-[5]. Approaches based on data clustering are iterative and require exhaustive computation. Edge detection based on edge masks requires a spatial convolution at every pixel. In many natural images, these edge masks do not produce perfectly closed boundaries. Thus, additional post-processing such as edge linking, edge clustering and removal of isolated edges must be applied for the best result. Also, edges are often missed where the change of intensity or color is gradual.

We introduce an automatic object segmentation algorithm based on newly defined edge information. Analysis of local histogram properties is performed to define two parameters in an analysis window around every pixel. The modality of the histogram and the defined parameters allow the definition of a measure that provides edge information in the form of a gradient map [4], [5], [7].

2. Proposed Descriptor for Edge Information

An edge is usually defined by the magnitude of a convolution result with spatial masks in the image plane [2], [7]. Neighboring pixels around the pixel in question (the current pixel) are convolved with spatial masks to measure how much edge information it contains. According to this magnitude, the current pixel is classified as either an edge or a non-edge pixel. We instead define edge features based on an analysis of the local histogram rather than on spatial masks. An edge can be considered as adjacent regions having two or more distinct brightness or color values. If the intensity distribution of an image is multi-modal, the image probably contains an edge; otherwise, it carries no edge information.
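As a rough sketch of this modality test (the window contents, bin count, and helper names below are illustrative assumptions, not the paper's implementation), the following snippet builds the normalized local histogram of an analysis window and counts its modes; two or more modes suggest the window straddles an edge.

```python
import numpy as np

def local_histogram(window, bins=32):
    """Normalized gray-level histogram (a p.d.f.) of a window image."""
    hist, _ = np.histogram(window, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

def count_modes(pdf):
    """Count local maxima of the histogram p.d.f."""
    modes = 0
    for i in range(1, len(pdf) - 1):
        if pdf[i] > pdf[i - 1] and pdf[i] >= pdf[i + 1] and pdf[i] > 0:
            modes += 1
    return modes

# A 4 x 4 window straddling a step edge holds two distinct gray levels,
# hence a bi-modal local histogram; a flat window is uni-modal.
edge_window = np.array([[20, 20, 200, 200]] * 4)
flat_window = np.full((4, 4), 120)

print(count_modes(local_histogram(edge_window)))  # → 2 (multi-modal: edge)
print(count_modes(local_histogram(flat_window)))  # → 1 (uni-modal: no edge)
```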

2.1. Description of Edge Features

We propose new parameters for the edge information based on local histogram properties. These parameters should specify the properties of the local histogram of the window image.

• Region ratio (R): For a given image with two regions, the distinctiveness between the regions also depends on their areas. This is called the areal effect [6]: objects with small retinal images have high luminance thresholds, while objects with larger retinal images have lower perceptible thresholds. For the bi-modal distribution in Fig. 1, the gray-level distribution shows that the image has two distinct regions. In this situation, the region ratio (R) is defined as the ratio of the minimum area (B in Fig. 1) to the maximum area (A in Fig. 1):

R = Min{A^{R1}_{ν_i}, A^{R2}_{ν_i}} / Max{A^{R1}_{ν_i}, A^{R2}_{ν_i}}, for ν_1, · · · , ν_N,  (1)

where ν_i is the gray level of the i-th local minimum and N is the number of detected local minima. For any ν_i, A^{R1}_{ν_i} and A^{R2}_{ν_i} are defined as the sums of the probability density function in the left and right directions, respectively.

[Figure 1 shows the probability density function of an edge-contained block over gray levels 0-255: two regions A and B with centers c1 and c2, separated at a local minimum ν_i, with peak probabilities P1 and P2, valley probability P3, and R = Area of B / Area of A.]

Figure 1. The region ratio (R) and edge potential (P).

• Edge potential (P): In an image with several regions, the homogeneity of a region increases as the variation of the intensity decreases. Thus, we introduce another parameter to take the intensity variation within a region into account. For any local minimum ν_i, the edge potential can be expressed as

P = Min{p^{ν_i}_{left}, p^{ν_i}_{right}} − p(ν_i),  (2)

where p^{ν_i}_{left} is the probability of the maximum peak on the left side of ν_i, p^{ν_i}_{right} is the probability of the maximum peak on the right side of ν_i, and p(ν_i) is the probability at the local minimum ν_i. The image contains more edge information as the edge potential increases. The concept of the edge potential is illustrated in Fig. 1, where the edge potential P is |P2 − P3| for the local minimum ν_i.

We devise an edge function after analyzing the properties of the defined parameters R and P to obtain the edge information. A simple step-edge model with two regions is employed to investigate the responses of the region ratio (R) and the edge potential (P) in detail, as depicted in Fig. 2. From this result, we can see that the shapes of the curves approximately follow those of exponential distributions with different variances. These responses can also be modeled by Gaussian distributions with several variances.

[Figure 2 plots the scaled responses against (a) decreasing R and (b) decreasing P for window sizes 4 × 4, 6 × 6, 8 × 8 and 10 × 10.]

Figure 2. The responses of R and P for several window sizes.

Using these factors, we model the edge function as the response of a Gaussian model in this work. As seen in Fig. 2, the variance of the Gaussian model depends on the size of the analysis window. The devised edge function can be expressed as

D(R, P) = exp{ −[(P − P_max)² + (R − R_max)²] / σ² },  (3)

where R_max and P_max are the maximum values of the parameters, R_max = 1.0 and P_max = 0.5. In particular, the variance of each model is determined to fit the actual response as 2.25σ = L², where L denotes the size of the window for the histogram analysis. In the case of multiple valleys in the histogram, the two parameters {R_i, P_i} are determined for each local minimum point ν_i. If there are two local minimum points ν_1 and ν_2, two sets of parameters {R_1, P_1} and {R_2, P_2} are computed, one for each minimum. After obtaining these feature parameters, the edge information D(·) is computed by Eq. (3) for each local minimum ν_i; thus, there are as many values of D(·) as there are local minima. We take the maximum among the computed values of D(·) as the edge response at the given pixel.

0-7695-2128-2/04 $20.00 (C) 2004 IEEE
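The per-pixel computation of Eqs. (1)-(3) can be sketched as follows; this is a minimal illustration assuming a precomputed normalized histogram, and the neighbor-comparison valley detector and the function name `edge_descriptor` are our own assumptions rather than the authors' code.

```python
import numpy as np

def edge_descriptor(pdf, L):
    """Edge response at a pixel from the p.d.f. of its L x L window.

    Implements Eqs. (1)-(3): region ratio R, edge potential P, and the
    Gaussian edge function D with Rmax = 1.0, Pmax = 0.5, 2.25*sigma = L**2.
    """
    R_MAX, P_MAX = 1.0, 0.5
    sigma = L * L / 2.25
    best = 0.0
    # Scan interior bins for local minima (valleys) of the p.d.f.
    for i in range(1, len(pdf) - 1):
        if not (pdf[i] <= pdf[i - 1] and pdf[i] <= pdf[i + 1]):
            continue
        left, right = pdf[:i], pdf[i + 1:]
        if left.sum() == 0 or right.sum() == 0:
            continue  # no region on one side of the valley
        # Eq. (1): ratio of the smaller to the larger area around the valley.
        R = min(left.sum(), right.sum()) / max(left.sum(), right.sum())
        # Eq. (2): peak-to-valley edge potential.
        P = min(left.max(), right.max()) - pdf[i]
        # Eq. (3): Gaussian edge function.
        D = np.exp(-((P - P_MAX) ** 2 + (R - R_MAX) ** 2) / sigma ** 2)
        best = max(best, D)  # maximum over all local minima
    return best

# Two equal peaks split by a deep valley give the maximum response D = 1.
print(edge_descriptor(np.array([0.0, 0.5, 0.0, 0.5, 0.0]), L=4))  # → 1.0
```

Applying this function at every pixel of the image yields the edge information map used by the segmentation stage.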

3. Spatial Segmentation Algorithm

The edge information map has valuable properties for conventional marker-based region segmentation in computer vision and image processing. The characteristics of the generated edge map allow us to use a region-growing method to segment the image. Region growing consists of determining marker points and expanding from the marker locations, and is followed by region merging and filtering to yield the final segmentation result.

1) Marker Extraction: Marker areas are localized as minima of the local edge information map. In general, extracting good marker areas is a nontrivial problem and an important step toward better segmentation. The connected regions that have zero value in the edge information map are chosen as effective markers for region growing and flooding. The remaining pixels form the uncertainty region for the growing procedure.

2) Marker Growing: By marker extraction, the edge map is divided into two categories: marker areas and uncertainty regions. Each marker area is assigned a label, and the uncertainty regions are then grouped into the marker areas. A modified approach is used to implement this procedure, as follows:

1. The uncertainty pixels are grown with the following color similarity:

CD(c, m_i) = Max{d_R, d_G, d_B},  (4)

where c and m_i are the color vectors of an uncertainty pixel and the i-th marker area, respectively, and d_R, d_G and d_B are the Euclidean distances of the color components between the uncertainty pixel and the marker area. If the above

color distance between an uncertainty pixel and a labeled area is smaller than a predefined color distance, set here to 15, the pixel is assigned to that label.

2. The average R, G and B color values of each labeled region are updated. An unclassified pixel with only one labeled region in its neighborhood is assigned to that region. If the pixel has more than one labeled region in its neighborhood, it is assigned to the label of the region with the shortest color distance to the pixel, provided that distance is less than a proper threshold value. The proper threshold value is determined as the minimum color distance between neighboring regions.

3. The remaining uncertainty pixels are assigned to the labeled regions with the shortest color distance to each pixel. This process is repeated until no uncertainty pixels remain.

3) Region Merging and Filtering: The initial result often contains over-segmented and small regions. Thus, small areas of fewer than a defined size (50 pixels) are merged into the neighboring region with the shortest color distance to the small area.
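A compact sketch of the growing test driven by Eq. (4) follows; the marker colors, the helper names, and returning `None` for still-uncertain pixels are illustrative assumptions, and the paper's full procedure additionally updates region means and iterates until no uncertainty pixels remain.

```python
# Color-similarity test of Eq. (4) with the paper's threshold of 15.
COLOR_T = 15.0

def color_distance(c, m):
    """CD(c, m) = Max{dR, dG, dB}: the largest per-channel distance."""
    return max(abs(c[0] - m[0]), abs(c[1] - m[1]), abs(c[2] - m[2]))

def assign_uncertain(pixel, marker_means):
    """Attach an uncertainty pixel to the closest marker region if the
    color distance is below the predefined threshold; otherwise the
    pixel stays uncertain (None) for a later pass."""
    label, best = None, float("inf")
    for lbl, mean in marker_means.items():
        d = color_distance(pixel, mean)
        if d < best:
            label, best = lbl, d
    return label if best < COLOR_T else None

markers = {0: (200.0, 40.0, 40.0), 1: (30.0, 30.0, 220.0)}
print(assign_uncertain((195.0, 50.0, 45.0), markers))    # → 0 (close to region 0)
print(assign_uncertain((120.0, 120.0, 120.0), markers))  # → None (still uncertain)
```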

4. Simulation Results

For comparison with the proposed algorithm, the conventional marker-based watershed algorithm was employed [4], [5]. In that method, a gradient image is needed to extract the marker areas; here, the Canny operator with a 5 × 5 mask and σ = 2 is used to obtain the gradient image for the watershed algorithm. The size of the analysis window was set to 4 × 4 in the proposed algorithm. Figure 3 illustrates the resulting gradient information maps for Claire. In the result of the Canny operator, the collar of the woman's clothing has weak contrast with the rest of the clothing, so the spatial gradient based on the Canny operator is relatively low along the boundary between the collar and the rest of the clothing. In contrast, the proposed method gives a stable response for the edge information, as shown in Fig. 3(c), owing to the local histogram analysis. Figure 4 shows the final segmentation results. Visually, the watershed algorithm based on the Canny operator may seem to perform better than the proposed method; this is due to the larger number of segments produced by the watershed method. However, the proposed algorithm gives a reliable and detailed segmentation result with a smaller number of segments (or classes) for all tested images (e.g., the hats of the man and the woman in Foreman and Lena).


[Nine image panels (a)-(i), one row each for Claire, Foreman and Lena, three columns per row.]

Figure 3. Gradient information maps of Claire: the original, the Canny gradient operator, and the proposed histogram analysis, in column order.

We employ three measures to evaluate the objective segmentation performance of the presented algorithm: the number of segments (M), goodness (F), and signal-to-noise ratio (SNR) [5], [7]. Smaller values of F indicate better segmentation, whereas larger values of SNR indicate better segmentation. As shown in Table 1, the proposed method yields lower goodness values than the watershed algorithm for all images, while using fewer segments.

Table 1. Objective performance comparison.

Contents   Method             M     F        SNR (dB)
Claire     Watershed-based    41    37.06    31.79
           Proposed           36    32.73    31.31
Foreman    Watershed-based    105   174.13   30.88
           Proposed           85    140.08   30.48
Garden     Watershed-based    200   535.09   25.94
           Proposed           130   34.40    40.38

5. Conclusion

We have developed a novel intra-frame region segmentation algorithm for separating regions, based on edge features newly defined through local histogram analysis. An edge function based on the defined parameters was introduced to describe the degree to which two or more distinct brightness regions exist at a pixel. The characteristics of the information map produced by the local histogram analysis allow a region-growing method to be used to segment the image. Experimental results were provided to demonstrate the reliability of the developed algorithm.

Figure 4. Segmentation results: the original, the conventional marker-based watershed, and the proposed algorithm, in column order.

References

[1] L. Salgado, N. Garcia, J. M. Menéndez and E. Rendón, "Efficient image segmentation for region-based motion estimation and compression," IEEE Trans. Circuits Syst. Video Technol., vol. 10, pp. 1029-1039, 2000.
[2] T. Meier and K. N. Ngan, "Video segmentation for content-based coding," IEEE Trans. Circuits Syst. Video Technol., vol. 9, pp. 1190-1203, 1999.
[3] C. G. Kim and J. N. Hwang, "A fast and robust moving object segmentation in video sequences," Proc. International Conference on Image Processing, vol. 2, pp. 131-134, 1999.
[4] H. Gao, W. H. Siu and C. H. Hou, "Improved techniques for automatic image segmentation," IEEE Trans. Circuits Syst. Video Technol., vol. 11, pp. 1273-1280, 2001.
[5] J. B. Kim and H. J. Kim, "Multiresolution-based watersheds for efficient image segmentation," Pattern Recognition Letters, vol. 24, pp. 473-488, 2003.
[6] G. H. Clarence, Vision and Visual Perception, Wiley, New York, 1965.
[7] A. K. Jain, Fundamentals of Digital Image Processing, Prentice Hall, 1997.
