Homogeneous Region Merging Approach for Image Segmentation Preserving Semantic Object Contours

Hyun Sang Park and Jong Beom Ra
Dept. of Electrical Engineering, Korea Advanced Institute of Science and Technology
373-1 Kusongdong Yusonggu, Taejon, Korea
E-mail: [email protected], [email protected]

Abstract - Morphological image segmentation is an essential tool for various image analysis tasks. In this paper, we propose two successive region-merging algorithms that mainly aim to represent homogeneous objects simply and clearly. First, we define a marker cluster and extract markers having similar intensity values from each marker cluster. After initial segmentation using the markers, perceptually insignificant small regions are removed to reduce the number of regions based on some heuristics. Then, we classify all regions into three classes according to their homogeneity. Finally, gradient-based region merging is performed for all regions except inhomogeneous regions. Experimental results show that the proposed region-merging scheme can represent homogeneous objects with few regions while preserving most of the semantic object shapes. Hence, the proposed method can be a useful tool for semantic object segmentation.

1. Introduction

Many morphological image segmentation methods have been proposed and applied for various image analysis tasks [1-8]. Among these tasks, the goal of region-based image/video coding [2,3,4] is to improve subjective and objective quality at very low bit rates. To achieve a high compression ratio, the coding schemes have to reduce bits for region contour coding, which may occupy up to 45 % of the total bits [4]. So, they limit the number of regions regardless of their semantic contents, and thereby fail to preserve low-contrast object contours that may be perceptually important. In segmentation for other image analysis tasks, low-contrast edges are usually not preserved, because their discrimination is not essential for these tasks [1,5-8].

For semantic image analysis tasks, top-down approaches [2,3,6] have difficulty in defining perceptually important objects. In other words, it is hard to find a good rule to split coarsely segmented regions semantically. Therefore, bottom-up approaches [7,8] composed of region segmentation and merging are preferred because they begin with precise segmentation preserving most of the perceptually important object contents. Thus, we will consider the bottom-up approach for image segmentation in this paper.

One of the semantic image analysis applications is VOP (Video Object Plane) generation in MPEG-4 [9]. A semantic object consisting of several large distinct regions would be easily extractable by properly merging them. In practice, however, a semantic object is often decomposed into a large number of small regions of arbitrary shapes and inconsistent statistical properties. This makes it very difficult to directly extract and compose semantic objects from a video frame.

As a simple example of semantic object extraction, let us consider an image composed of two semantic objects: a human object and a plain background. Such scenes can be found in typical ‘head and shoulder’ image sequences. Here, the goal is to separate two semantic objects. However, without prior knowledge about objects, i.e., well-formulated semantics, it is very difficult to compose ‘a human object’ from a number of arbitrarily shaped and randomly distributed regions. That is, a semantic object cannot be composed using signal-processing techniques, but has to be treated on the basis of semantics or proper object models. However, if we first compose an object from segmented homogeneous regions of similar characteristics, the object extraction problem can be simplified. In other words, this object can be regarded as the background and the remaining regions as another object, regardless of its semantic contents. This simplified problem implies that homogeneous objects in an image should be represented as simply as possible to make the semantic segmentation problem easier.

The purpose of this paper is to present an image segmentation method that aims to represent homogeneous or simple objects with few regions while preserving low-contrast object contours. We adopt a bottom-up approach that consists of five successive steps as shown in Fig. 1. First, we will describe initial region segmentation in section 2. The proposed region merging method will be described in section 3. Section 4 presents simulation results. Finally, conclusions will be given in section 5.

Fig. 1. Overall procedure: original image → morphological image simplification → marker extraction → boundary decision by watershed algorithm → small region merging → homogeneous region merging.

2. Initial region segmentation

Let $f$ be a gray image and $f_s$ be the morphologically simplified image from $f$. Among a number of flat regions in $f_s$, regions brighter or darker than their surroundings are highly noticeable. Morphological gradient operators are very attractive in defining such regions, since the morphological gradient by dilation (or erosion) can locate a zero gradient value inside locally bright (or dark) components. Therefore, to locate those noticeable regions precisely, we propose an operation $G_m(f_s)$ as follows.

$$G_m(f_s) = \min\{G^+(f_s),\, G^-(f_s)\} = \min\{\delta_1(f_s) - f_s,\; f_s - \varepsilon_1(f_s)\}, \qquad (1)$$

where $G^+$, $G^-$, $\delta_N$ and $\varepsilon_N$ denote morphological gradient by dilation, morphological gradient by erosion, dilation of size $N$, and erosion of size $N$, respectively. Then, those regions as well as large flat regions can be discriminated by examining areas with zero values of $G_m$.

Let us define a marker cluster as a connected region with zero values of $G_m(f_s)$. A marker cluster includes various significant flat regions such as large flat regions, brighter or darker flat regions compared to the surroundings regardless of their sizes. It should be noted that they are all very important for representing object contents. Now we can restrict marker extraction within marker clusters without the loss of significant flat regions. Since markers are not separated by using $G_m(f_s)$, they are to be discriminated by the following procedures.

Step 1: Find a marker cluster $M$ in $G_m(f_s)$.
Step 2: Find the representative intensity value $I_p$ that provides the largest accumulated histogram from $I_p - 1$ to $I_p + 1$ in $M$.
Step 3: Find a connected region $R$ composed of the three intensity values $\{I_p - 1, I_p, I_p + 1\}$, and then update $M$ by excluding $R$ from $M$.
Step 4: If $\mathrm{Area}(R) \ge h$, $R$ is registered as a marker and go to step 3.
Step 5: If $\mathrm{Area}(M) \ge h$, go to step 2.
Step 6: If there remains any unexamined marker cluster, go to step 1.
Step 7: End.

Here, it is preferable that size criterion $h$ is set to the minimum area of preserved components after morphological filtering. Let $\psi_N$ be a morphological filter by reconstruction that employs a square structuring element whose side is $(2N + 1)$. It will remove all components that do not fit into the structuring element in a binary image. Thus, the proper value of $h$ is $(2N + 1) \times (2N + 1)$. However, it does not hold for a gray image. For a gray image composed of only rectangular components, the smallest region that is preserved after morphological filtering is always larger than (the side length of the structuring element) × 1. Then, the size criterion can be defined as

$$h = 2N + 1. \qquad (2)$$

However, this criterion is not valid for a gray image having components of arbitrary shapes. In this case, the simplified image may have very small regions including one-pixel regions. Nevertheless, experimental results show that the criterion given in Eq. (2) works well in most cases. Therefore, we use this value as a size criterion.

After marker extraction, initial region segmentation is performed by applying the modified watershed transform on extracted markers [2].

3. Homogeneous region merging

3.1. Ordered small region merging algorithm

Initial region segmentation contains a number of redundant small regions. In this section, we propose an ordered region-merging algorithm to reduce these small regions. Let $R_i$ be a region considered for merging, and $\xi(R_i)$ be a

region set whose element is one of the neighboring regions of $R_i$ by 4-connectivity [10]. First, valid merging candidates $\xi_M(R_i)$ have to be selected from $\xi(R_i)$. To find appropriate merging candidates, $\xi_M(R_i)$, we have observed some tendencies from several simple region-merging experiments [10], i.e.,

• most of the small regions are likely to be merged to larger ones;
• regions of low variances are often merged to regions of higher variances;
• regions near the image boundary may be components of the background or the simple objects.

The first two observations often happen at the same time. Let $S_i$, $\mu_i$, and $\sigma_i^2$ denote the area, statistical mean, and variance of $R_i$, respectively. Then, a region $R_j$ can be a valid merging candidate, if either of the following conditions is satisfied, i.e.,

$$\sigma_i \ge \sigma_j,\quad S_i \ge S_j \quad \text{and} \quad |\mu_i - \mu_j| \le \mathrm{DIFF},$$

or

$$\sigma_i < \sigma_j,\quad S_i < S_j \quad \text{and} \quad |\mu_i - \mu_j| \le \mathrm{DIFF},$$

where DIFF is a threshold value to prevent dissimilar regions from being merged. Additionally, if both $R_i$ and $R_j$ are adjacent to the image boundary and $|\mu_i - \mu_j| \le \mathrm{DIFF} + 2$, $R_j$ is also taken as a merging candidate. $\xi_M(R_i)$ can be determined in this way.

In region merging, the merging order is an important factor to consider, because merged results can be quite different according to the merging order even with the same merging criterion. In this paper, we regard the priority of a region as its variance value, and the region of a lower priority will be merged first. This ordering can be implemented by a loop operation that begins from one and proceeds to a sufficiently large number. Then, only the regions having a lower standard deviation than the argument in the loop will be considered for merging. Additionally, in order to promote an opportunity for a region to be merged, we expand $\xi_M(R_i)$ by increasing threshold value DIFF, as merging proceeds. The proposed region-merging algorithm is as follows.

Step 1: DIFF = 0, VAR = 1.
Step 2: Find a region $R_i$ such that $\sigma_i \le \mathrm{VAR}$, $S_i \le S_{MIN}$, and $\xi_M(R_i) \ne \emptyset$. If there is no such region, go to step 3. Otherwise, find a merging pair $(R_i, R_{j^*})$ that provides the smallest value of variance after merging, and then merge them. Repeat step 2.
Step 3: Increase VAR by one. If VAR > VARMAX, go to step 4. Otherwise, go to step 2.
Step 4: Increase DIFF by one. If DIFF > DIFFMAX, it is terminated. Otherwise, go to step 2.

3.2. Homogeneous region merging

The previous stage only removes redundant regions that do not affect object semantics. The main purpose of this paper is to represent homogeneous objects with few regions. Such homogeneous objects may be extractable more easily than other complex objects, since their components have very similar statistical properties to each other. However, low contrast boundaries between objects may result in merging objects of different semantics. To avoid non-semantic merging, we perform a ternary classification for segmented regions. We determine the class of a region $R_i$, $C(R_i)$, as follows.

$$C(R_i) = \begin{cases} H, & \text{if } \sigma_i^2 \le \mathrm{VAR\_TH}, \\ IAH, & \text{if } C(R_j) = H \text{ for some } R_j \in \xi(R_i), \\ IH, & \text{otherwise,} \end{cases} \qquad (3)$$

where VAR_TH is the variance of the largest region. Here, H, IAH, and IH are abbreviations of a homogeneous region, inhomogeneous region adjoining a homogeneous region, and inhomogeneous region adjoining only inhomogeneous regions, respectively. After classification, we examine only regions of class H for merging, and regard regions of class H or IAH as valid merging candidates. That is, we examine $R_i$ and $R_j$ for merging such that $C(R_i) = H$ and $R_j \in \xi(R_i)$, $C(R_j) \ne IH$. This restriction is to prevent two regions of different semantic contents from being merged. Here, $\xi_M(R_i)$ is selected by using a gradient-based criterion.

Pixel $p$ is defined as a boundary pixel between $R_i$ and $R_j$, if there are two pixels, $p_i$ and $p_j$, such that $p_i \in R_i$, $p_j \in R_j$, and $p_j \in N_4(p_i)$, where $N_4(p)$ is a set of pixels neighboring $p$ by 4-connectivity. Then, merging candidates $\xi_M(R_i)$ for a given $R_i$ can be determined by considering the weakness of boundary pixels. If $R_j$ satisfies both conditions, $C(R_j) \ne IH$ and at least half of the boundary pixels between two regions have gradient values less than $2\,\mathrm{VAR\_TH}$, $R_j$ is an element of $\xi_M(R_i)$. Using these merging candidates, we perform the following algorithm until it terminates automatically.

Step 1: Find $R_i$ such that $C(R_i) = H$ and $\xi_M(R_i) \ne \emptyset$.
Step 2: If there is no such region, the merging procedure is terminated. Otherwise, find merging pair $(R_i, R_{j^*})$ that provides the smallest value of variance after merging, and then merge them.
Step 3: Classify the merged region to H in order to expand this region continuously by merging.
Step 4: Go to step 1.

4. Simulation results

Experimental image segmentation has been performed on the first frames of typical video sequences: “Miss America” and “Mother and daughter.” The open-close by reconstruction filter of size one [2] is used for image simplification. Markers are extracted with h = 3 as described in section 2. Boundary decision is performed by using the modified watershed algorithm [2]. For merging, we select parameters of DIFFMAX = 1, VARMAX = 50, and SMIN = 200. We should mention that DIFFMAX is set to one in order to maintain region homogeneity as highly as possible. The first and second rows in Fig. 2 show segmentation results for “Miss America” and “Mother and daughter,” respectively. In each row, 4 figures represent an original image, and results of region segmentation, ordered small region merging, and homogeneous region merging, respectively. The number of regions after the final segmentation is 163 for “Miss America”, and 256 for “Mother and daughter”, respectively. Even though the number of regions is somewhat large, simulation results demonstrate that extracted homogeneous background contours coincide with natural object boundaries very well.

Fig. 2. (a) Original images, (b) initial image segmentation results, (c) small region merging results, and (d) the final results by homogeneous region merging.

5. Conclusions

In this paper, we adopt a bottom-up image segmentation approach and propose two region-merging algorithms. Experimental results show that the proposed image segmentation method preserves most of the semantic object contours well while clearly representing homogeneous objects with few regions. Hence, the proposed work is highly suitable for initial image segmentation in various image segmentation applications, including semantic image analysis tasks and region-based image coders. Our further work will be focused on a merging method that aims to reduce the number of regions in complex objects as well as homogeneous objects.

References

[1] F. Meyer and S. Beucher, “Morphological segmentation,” Journal of Visual Comm. and Image Representation, vol. 1, no. 1, pp. 21-46, Sep. 1990.
[2] P. Salembier, “Morphological multiscale segmentation for image coding,” Signal Processing, vol. 38, pp. 359-386, 1994.
[3] P. Salembier and M. Pardas, “Hierarchical morphological segmentation for image sequence coding,” IEEE Trans. on Image Process., vol. 3, no. 5, pp. 639-651, Sep. 1994.
[4] D. Wang, et al., “Segmentation-based motion-compensated video coding using morphological filters,” IEEE Trans. on Circ. and Syst. for Video Tech., vol. 7, no. 3, pp. 549-555, June 1997.
[5] D. Cortez, et al., “Image segmentation towards new image representation methods,” Signal Processing: Image Comm., vol. 6, pp. 485-498, 1995.
[6] J.G. Choi, et al., “Spatio-temporal segmentation using a joint similarity measure,” IEEE Trans. on Circ. and Syst. for Video Tech., vol. 7, no. 2, pp. 279-286, Apr. 1997.
[7] K. Haris, et al., “Hybrid image segmentation using watersheds,” Proc. of VCIP, vol. 2727, pp. 1140-1151, 1996.
[8] L. Shafarenko, et al., “Automatic watershed segmentation of randomly textured color images,” IEEE Trans. on Image Process., vol. 6, no. 11, pp. 1530-1544, Nov. 1997.
[9] “MPEG-4 video verification model V.5.0,” ISO/IEC JTC 1/SC 29/WG 11/N1469, Nov. 1996.
[10] A.K. Jain, Fundamentals of Digital Image Processing, Prentice-Hall, New Jersey, 1989.
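As an illustration of the ternary classification of Eq. (3) and the gradient-based candidate selection of section 3.2, the following Python sketch may be helpful. It is not the authors' implementation. The dictionary-based region description, the function names `classify_regions` and `merging_candidates`, and the toy data are assumptions introduced only for this example, and the weak-boundary threshold 2·VAR_TH is taken exactly as written in the text.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

H, IAH, IH = "H", "IAH", "IH"


def classify_regions(variance: Dict[int, float],
                     area: Dict[int, int],
                     neighbors: Dict[int, Set[int]]) -> Tuple[Dict[int, str], float]:
    """Assign H / IAH / IH to every region, following Eq. (3)."""
    # VAR_TH is the variance of the largest region (section 3.2).
    largest = max(area, key=area.get)
    var_th = variance[largest]
    # First pass: regions whose variance does not exceed VAR_TH are homogeneous.
    cls = {r: H for r, v in variance.items() if v <= var_th}
    # Second pass: remaining regions are IAH if they touch an H region, else IH.
    for r in variance:
        if r not in cls:
            touches_h = any(cls.get(n) == H for n in neighbors[r])
            cls[r] = IAH if touches_h else IH
    return cls, var_th


def merging_candidates(i: int,
                       cls: Dict[int, str],
                       neighbors: Dict[int, Set[int]],
                       boundary_gradients: Dict[FrozenSet[int], List[float]],
                       var_th: float) -> Set[int]:
    """Return xi_M(R_i): neighbors of class H or IAH with a weak shared boundary."""
    if cls[i] != H:  # only class-H regions are examined for merging
        return set()
    candidates = set()
    for j in neighbors[i]:
        if cls[j] == IH:
            continue
        grads = boundary_gradients[frozenset((i, j))]
        weak = sum(g < 2 * var_th for g in grads)
        # At least half of the boundary pixels must have a low gradient.
        if 2 * weak >= len(grads):
            candidates.add(j)
    return candidates


# Toy example (hypothetical data): region 0 is a large plain background.
variance = {0: 4.0, 1: 2.0, 2: 30.0, 3: 50.0}
area = {0: 1000, 1: 50, 2: 300, 3: 200}
neighbors = {0: {1, 2}, 1: {0}, 2: {0, 3}, 3: {2}}
boundary_gradients = {
    frozenset((0, 1)): [1.0, 2.0, 3.0, 20.0],    # mostly weak boundary
    frozenset((0, 2)): [10.0, 12.0, 3.0, 15.0],  # mostly strong boundary
    frozenset((2, 3)): [5.0, 6.0],
}

classes, var_th = classify_regions(variance, area, neighbors)
print(classes)
print(merging_candidates(0, classes, neighbors, boundary_gradients, var_th))
```

In this toy setting, regions 0 and 1 fall into class H, region 2 adjoins the homogeneous background and becomes IAH, and region 3 touches only region 2 and becomes IH. Region 0 then keeps region 1 as its only merging candidate, because the boundary it shares with region 2 is mostly strong.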