Robust Image Segmentation using Local Median

Jundong Liu
School of Electrical Engineering and Computer Science
Ohio University
Athens, OH 45701

Abstract

In recent years, region-based active contour models have gained great popularity for solving the image segmentation problem. These models usually share two assumptions about the image pixel properties: 1) within each region/object, the intensity values conform to a Gaussian distribution; 2) the "global means" (average intensity values) of different regions are distinct and can therefore be used to discriminate pixels. These two assumptions are often violated in practice, which results in segmentation leakage or misclassification. In this paper, we propose a robust segmentation framework that overcomes this drawback, which is present in most region-based active contour models. Our framework consists of two components: 1) instead of using a global average intensity value (mean) to represent a region, we use local medians as the region representative measure, to better characterize the local properties of the image; 2) the median and the sum of absolute differences (L1 norm) are used to formulate the energy functional to be minimized, for better handling of intensity variations and outliers. Experiments are conducted on several real images, and we compare our solution with a popular region-based model to show the improvements.

Keywords: Segmentation, Level Set, Chan-Vese Model

Proceedings of the 3rd Canadian Conference on Computer and Robot Vision (CRV’06) 0-7695-2542-3/06 $20.00 © 2006 IEEE

1 Introduction

Segmenting, or partitioning, an image into different homogeneous regions has long been a fundamental problem in image analysis. It has a large number of applications in areas including, but not limited to, medical imaging, remote sensing and optical imaging. Segmentation also often serves as a preprocessing step for other image analysis tasks, e.g., recognition, registration, retrieval and measurement.

Numerous solutions to the image segmentation problem have been proposed in the literature. Among them, active contour based models have long been a popular group, starting with the original work [4] proposed by Kass, Witkin and Terzopoulos 20 years ago. Active contour models can be classified using different criteria. Based on their representation and implementation, existing active contour models can be divided into parametric snakes and geometrical snakes. Parametric snakes are represented explicitly as parameterized contours, and the snake evolution is carried out on predetermined spline control points. Geometrical snakes, on the other hand, are represented implicitly as the zero-level sets of higher-dimensional surfaces, and the updating is performed on the surface function over the entire image domain.

Based on the features being used to drive the evolution, active contour models can also be classified as edge-based (often also referred to as boundary-based) and region-based models. Edge-based models are based on the notion of shrinking/expanding a curve until it reaches high-gradient areas (edges). Regions are characterized by properties of their contours only, and the active contour model usually relies on a stopping function to slow down the curve evolution when object boundaries are reached. A typical stopping function is defined as

$$ g(|\nabla u_0(x, y)|) = \frac{1}{1 + |\nabla G \otimes u_0(x, y)|^p}, \quad p \ge 1 \tag{1} $$

where G ⊗ u0 is the convolution of the image u0 with a Gaussian filter, which results in a smoother version of u0. The design goal of the stopping term g(|∇u0|) is to obtain a smoothly varying function that is positive in non-edge areas and tends to zero at the edges. The original snake model [4], together with many variants [5, 9, 12, 3, 11], belongs to this edge-based category.

There are several drawbacks associated with edge-based approaches. Firstly, the stopping function g(|∇u0(x, y)|), implemented with a discrete gradient, is never exactly zero on the edges; therefore many edge-based models [5, 9] suffer from the leakage problem, especially when weak edges are encountered. Secondly, when the input image is very noisy, or the boundary is extremely smooth, this group of models tends to rest at a local minimum and has trouble identifying the accurate location at which to stop. Thirdly, as they mostly rely on an image-gradient-based edge function to stop the curve evolution, these models can detect only objects with edges defined by gradient. Other types of discriminant features, e.g., texture boundaries, will not be treated as edges, so difficulties arise when trying to use edge-based models to capture regions separated by textures.

Region-based models [15, 1, 8, 7, 10], on the other hand, use homogeneity properties to decompose the image domain into different regions. They can overcome the above-mentioned drawbacks and have gained great popularity in recent years. The region competition method [15] proposed by Zhu & Yuille was a pioneering region-based model, which formulates the segmentation problem as a variational minimization problem under the Bayes/MDL (Minimum Description Length) framework, with two segmentation models, snakes and statistical region growing, as the built-in components. This model assumes that the pixels within each region conform to a Gaussian distribution. The snake deformation is carried out along the functional gradient, derived from the variational principle on the snake energy minimization. Unlike the region-based models proposed in later years, this model represents the evolving curve as a set of control points, and the level set method is not used.

Samson et al. [8] proposed a supervised classification model to partition an image into homogeneous regions. As in the Zhu & Yuille model, a Gaussian distribution is assumed for the pixels within each region, and the number of classes and the intensity profiles (mean, variance) of all the regions are assumed known. A system of coupled partial differential equations is used to drive the propagation of mutually exclusive curves, which are modeled as the zero level sets of several embedding level set functions. Each evolving curve is guided by internal forces (regularity of the interface) and external ones (data term, no vacuum, no region overlap).

In [1], Chan & Vese introduced a region-based active contour model that can detect contours both with and without gradient. A stopping term from the Mumford-Shah segmentation technique [6] is adopted in this model, and the entire curve evolution is conducted under the level set framework. The model assumes that there are only two regions whose segments are piecewise constant, but it also works for more general inputs in which pixels conform to Gaussian distributions both inside and outside the desired segmentation. Impressive experimental results have been reported in segmenting images with very blurred or even discontinuous boundaries. Another salient advantage over other models is that the initial curve can be placed anywhere in the image, and interior contours are automatically detected.

These three region-based models, as well as various extensions and variants [10, 7, 2], share several common assumptions, and therefore common formulations, regarding the segmentation criterion:

1. Within each region/object, the intensity values "globally" conform to a Gaussian distribution.
2. Correspondingly, a common L2 norm term (u0 − ci)^2/σ^2 can be found in the energy to be minimized in these models, where ci is the average intensity value of a region. The justification of this formulation stems from a fact: when the noise is independent and identically Gaussian distributed, the maximum likelihood solution is obtained by minimizing an L2 norm cost function.
3. The mean (average intensity value) is used as the entity that characterizes and differentiates regions. In this paper, we call such an entity a region representative. Most existing region-based active contour models use the global mean as the region representative, and no local intensity variation is taken into consideration in the segmentation procedure. "Global" here refers to the fact that the mean value is calculated over the entire region.

However, in practice these assumptions are often violated, especially for medical images. "Gaussian distribution with a global mean" may not be an accurate description of an image region. Take the 2D MRI slice in Figure 1 as an example. Due to the existence of the bias field, the pixel gray values gradually decrease along the vertical axis. Let c1 and c2 be the average values of the brain and the background. The value of c1 is dominated by the upper, brighter part of the brain, while c2 is quite close to zero. Some brain pixels, especially within the bottom half of the image, have intensity values closer to the background value near zero than to the global brain mean c1. Therefore, those pixels would likely be classified as background points if the above-mentioned region-based models were used.

Figure 1. An example where global mean is insufficient to segment the image correctly.

In addition, the L2 norm and the mean are not robust measures for characterizing a group of pixels when their intensity values are far from a Gaussian distribution. In practice, intensity outliers often exist within regions. Moreover, during the curve evolution process, the pixels enclosed by a moving curve often belong to two or more objects, so a multimodal distribution is more likely to appear than a unimodal one. All in all, a more robust measure, which can capture the major Gaussian component and reject the outliers, is desired.

In this paper, we propose a robust segmentation framework that overcomes the above-mentioned drawbacks present in many region-based active contour models. Our framework consists of two components: 1) in order to account for the local variation existing in regions, the region representative is computed locally; 2) in order to robustly represent the major object enclosed by an evolving curve, the median, instead of the mean, is used as the region representative for characterizing different regions/objects. Correspondingly, the absolute difference between the pixel and the median (L1 norm) is used as the criterion in the energy functional to be minimized.

The rest of the paper is organized as follows. Some drawbacks of a popular region-based active contour, the Chan-Vese model, are pointed out and analyzed in section 2. Our proposed framework is laid out in section 3, followed by experimental results in section 4. Section 5 closes this paper.
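To make the bias-field argument of the introduction concrete, the following is a minimal numerical sketch that is not taken from the paper. It assumes a hypothetical one-dimensional intensity profile (background near 5, brain intensities falling linearly from 100 to 15) and counts the brain samples that lie closer to the global background mean c2 than to the global brain mean c1. All numbers and variable names are illustrative assumptions.

```python
import numpy as np

# Hypothetical 1-D profile through an image with a bias field (assumed values).
background = np.full(20, 5.0)            # background samples, close to zero
brain = np.linspace(100.0, 15.0, 100)    # brain samples, darker toward the bottom

c1 = brain.mean()        # global mean of the object region
c2 = background.mean()   # global mean of the background region

# A two-phase L2 data term favours the class whose global mean is nearer.
closer_to_background = np.abs(brain - c2) < np.abs(brain - c1)

print(f"c1 = {c1:.2f}, c2 = {c2:.2f}")
print(f"brain samples nearer to c2 than to c1: "
      f"{int(closer_to_background.sum())} of {brain.size}")
```

Every sample flagged here would be pulled toward the background class by a global-mean data term, which is the failure mode described for the MRI slice in Figure 1.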

2 Chan-Vese model

Let C be an evolving curve in Ω. inside(C) denotes the region enclosed by C and outside(C) denotes the region outside of C. The Chan-Vese model [1] minimizes the functional defined as

$$
\begin{aligned}
F(c_1, c_2, C) = {} & \mu \cdot \mathrm{Length}(C) + v \cdot \mathrm{Area}(\mathrm{inside}(C)) \\
& + \lambda_1 \int_{\mathrm{inside}(C)} |u_0 - c_1|^2 \, dx\,dy \\
& + \lambda_2 \int_{\mathrm{outside}(C)} |u_0 - c_2|^2 \, dx\,dy
\end{aligned}
\tag{2}
$$

where c1 and c2 are the averages of u0 inside C and outside C, respectively. Mapped to the level set framework (φ is the level set function whose zero level set is C; H and δ are the Heaviside function and the Dirac measure), the functional that the Chan-Vese model tries to minimize is

$$
\begin{aligned}
F(c_1, c_2, C) = {} & \mu \int_\Omega \delta(\phi(x, y)) \, |\nabla \phi(x, y)| \, dx\,dy + v \int_\Omega H(\phi(x, y)) \, dx\,dy \\
& + \lambda_1 \int_\Omega |u_0 - c_1|^2 \, H(\phi(x, y)) \, dx\,dy \\
& + \lambda_2 \int_\Omega |u_0 - c_2|^2 \, (1 - H(\phi(x, y))) \, dx\,dy
\end{aligned}
\tag{3}
$$

The computation of c1 and c2 is conducted as

$$ c_1(\phi) = \frac{\int_\Omega u_0(x, y) \, H(\phi(x, y)) \, dx\,dy}{\int_\Omega H(\phi(x, y)) \, dx\,dy} \tag{4} $$

$$ c_2(\phi) = \frac{\int_\Omega u_0(x, y) \, (1 - H(\phi(x, y))) \, dx\,dy}{\int_\Omega (1 - H(\phi(x, y))) \, dx\,dy} \tag{5} $$

Note that both c1 and c2 are global values, computed over the entire image. Although impressive segmentation results on various types of images have been reported for this model, for medical images the global mean may not be the best characteristic to represent a region. Local variations prevail in many medical imaging modalities, and neglecting local information often results in undesired segmentations. As introduced in section 1, Fig. 2(a) is an MRI image before the bias field is removed. Fig. 2(b) shows the segmentation result of the Chan-Vese model. The evolving contour was initialized as a single circle. The final estimate of c1 is 47.06 and that of c2 is 7.03. The pixels in the bottom half of the brain have intensity values much closer to the background mean c2 than to the brain mean c1; the algorithm therefore evolves them into the background class, and an undesired segmentation (half of the brain versus the rest) results.

Figure 2. The Chan-Vese model's inability to handle local image variations. (a) A slice of a brain MRI image before bias correction. (b) The curve evolution result using the Chan-Vese model.

One solution to this problem is to take local intensity variations into consideration. More specifically, the global c1 and c2 should be replaced by localized region representative measures, which vary across locations and better account for local details. We call this approach localizing the representative. The details of our solution are given in the next section.

Having decided to choose a local representative over the global mean, we need to decide which entity is best suited to characterize a relatively uniform local region. The "intensity mean" + L2 norm combination has been used in most region-based active contour models. However, this discriminator works well only if the underlying distribution is Gaussian. In practice, intensity outliers introduce errors into the estimate of the mean, which are magnified by the L2 norm when seeking the minimum of the segmentation energy.

Figure 3 depicts a synthetic example where the mean may not lead the curve toward the desired position. Fig. 3(a) shows a rectangular region with a bright spot embedded inside. The red curve is the initial curve. One can notice that, within the initial curve, the (gray) pixels from the rectangle are dominant; the curve is therefore expected to expand to cover the entire (gray) rectangle area. However, because the mean is not a robust estimator and its value is greatly influenced by the larger components, the c1 value estimated using Eq. 4 leans toward the bright spot, and eventually the curve shrinks to capture the bright spot only, leaving the rectangle as part of the background. The yellow line surrounding the bright spot in Fig. 3(a) is the result generated by the Chan-Vese model. Clearly, the mean does not provide a good representative for this particular input image.

Many robust estimators [14], e.g., Cauchy, Welsch, Geman-McClure, etc., can be used to replace the mean in calculating the regional intensity representative. In this paper, we choose the median for its simplicity. The median is relatively robust to outliers and therefore reflects the typical value within an area more accurately than the mean. In particular, when one curve encloses two or more objects, the median has the inherent ability to pick out the dominant object.

A simple experiment is conducted to compare the mean and the median in an active contour model. We replace the mean with the median for c1 and c2 in Eq. 3 and redo the experiment in Fig. 3 with the same initial curve. As the rectangle pixels dominate, c1 takes the value of the rectangle pixels, and the curve evolution very quickly converges, with the rectangle, together with the bright spot, classified as the foreground. This time, the yellow line in Fig. 3(b) captures the correct object.

Figure 3. Necessity of using the median, and drawback of using the mean. (a) The result using the Chan-Vese model; (b) the result using a modified Chan-Vese model, where the mean is replaced by the median. The red line shows the initial position of the evolving curve, and the yellow line is the final result.

3 Our Local-Median + L1 Norm Model

3.1 Design goal

As we pointed out in the previous sections, the main drawback of most existing region-based models is that global means are used as region representatives. Local details are not taken into account in the curve evolution. Our solution, which aims to overcome these drawbacks, is designed with two considerations in mind: 1) localize the region representative to better account for local variations; 2) use the median as the region representative to improve robustness. If the Chan-Vese model can be called a "global" + "mean" combination, then our model is a "local" + "median" combination. In addition, the sum of absolute differences (L1 norm) is adopted in the energy functional; the justification is that the median minimizes the L1 norm.

3.2 Local median

In the Chan-Vese model, two global means c1 and c2 are computed for the two areas inside(C) and outside(C). In our model, we compute two local medians for each pixel (x, y) in the image domain. More specifically, we introduce two functions f1 and f2, both defined on the image domain, to represent the median values of the local pixels inside and outside the moving curve. "Local" means that only neighboring pixels are considered. The simplest implementation of the "neighborhood" is a rectangular window W of size (2k + 1) × (2k + 1), where k is a constant integer. Therefore,

$$ f_1 = \mathrm{median}(u_0 * \mathrm{inside}(C) * W) \tag{6} $$

$$ f_2 = \mathrm{median}(u_0 * \mathrm{outside}(C) * W) \tag{7} $$

The functions f1 and f2 are defined on the entire image domain. f1(x, y) is computed for each point (x, y) and takes the median intensity value of the neighboring pixels that are inside the moving curve C; f2(x, y) takes the median intensity value of the neighboring pixels that are outside the moving curve. Our segmentation model can then be defined as the minimization of the following energy:

$$
\begin{aligned}
F(f_1, f_2, C) = {} & \mu \cdot \mathrm{Length}(C) + v \cdot \mathrm{Area}(\mathrm{inside}(C)) \\
& + \lambda_1 \int_{\mathrm{inside}(C)} |u_0 - f_1| \, dx\,dy \\
& + \lambda_2 \int_{\mathrm{outside}(C)} |u_0 - f_2| \, dx\,dy
\end{aligned}
\tag{8}
$$

Compared with the Chan-Vese model, our solution incorporates two changes. Firstly, the global means c1 and c2 are replaced with the local medians f1 and f2. Secondly, the L1 norm of the data term replaces the original L2 norm.
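The local medians f1 and f2 and the L1 data term of the energy above can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not the author's implementation: the function names `local_median` and `data_term` are hypothetical, each region is given as a boolean mask, the window is effectively clipped at the image border, and a pixel whose window contains no pixel of the region falls back to that region's global median (the paper does not specify this case).

```python
import warnings

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_median(u0, mask, k):
    """Median of u0 over each (2k+1)x(2k+1) window, using only pixels where mask is True."""
    vals = np.where(mask, u0.astype(float), np.nan)   # hide pixels of the other region
    padded = np.pad(vals, k, mode="constant", constant_values=np.nan)
    windows = sliding_window_view(padded, (2 * k + 1, 2 * k + 1))
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", RuntimeWarning)  # windows with no valid pixel
        med = np.nanmedian(windows, axis=(-2, -1))
    # Assumption: empty windows fall back to the region's global median.
    fallback = np.median(u0[mask]) if mask.any() else 0.0
    return np.where(np.isnan(med), fallback, med)


def data_term(u0, inside, k, lam1=1.0, lam2=1.0):
    """Return f1, f2 and the discrete L1 data term for a given inside mask."""
    f1 = local_median(u0, inside, k)
    f2 = local_median(u0, ~inside, k)
    energy = (lam1 * np.abs(u0 - f1)[inside].sum()
              + lam2 * np.abs(u0 - f2)[~inside].sum())
    return f1, f2, energy


# Toy image: gray object (10) on the left with one bright outlier, dark background (0).
u0 = np.zeros((6, 6))
u0[:, :3] = 10.0
u0[2, 1] = 200.0
inside = np.zeros((6, 6), dtype=bool)
inside[:, :3] = True

f1, f2, energy = data_term(u0, inside, k=1)
print(f1[2, 1], f2[2, 4], energy)
```

In this toy case the outlier does not move the local median of its neighborhood, so only the outlier pixel itself contributes to the data term. Materializing every window is memory-hungry for large images; a production version would likely use a dedicated masked median filter instead.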

Figure 4. A glimpse of the f1 and f2 functions. (a) Curve evolution at a certain iteration; the yellow line is the moving curve. (b) f1 at that time. (c) f2 at that time.

3.3 Under level set framework

Using the Heaviside function H and the one-dimensional Dirac measure δ [1] as the bridge, the energy function F(f1, f2, C) can be minimized under the level set framework. We introduce a continuous function φ : Ω → R such that C = {X ∈ Ω : φ(X) = 0}, and we choose φ to be positive inside the moving curve C and negative outside C. We then have the following new functional to minimize:

$$
\begin{aligned}
F(f_1, f_2, C) = {} & \mu \int_\Omega \delta(\phi(x, y)) \, |\nabla \phi(x, y)| \, dx\,dy + v \int_\Omega H(\phi(x, y)) \, dx\,dy \\
& + \lambda_1 \int_\Omega |u_0 - f_1| \, H(\phi(x, y)) \, dx\,dy \\
& + \lambda_2 \int_\Omega |u_0 - f_2| \, (1 - H(\phi(x, y))) \, dx\,dy
\end{aligned}
\tag{9}
$$

Correspondingly, f1 and f2 are computed with

$$ f_1 = \mathrm{median}(u_0 * H(\phi) * W) \tag{10} $$

$$ f_2 = \mathrm{median}(u_0 * (1 - H(\phi)) * W) \tag{11} $$

What do f1 and f2 look like? Let us again use the MRI image with bias field as the example. Figure 4 shows a snapshot of the curve evolution, together with the corresponding f1 and f2 functions. The yellow line in Fig. 4(a) is the evolving active contour at a certain iteration, and f1 and f2 are given in Fig. 4(b) and Fig. 4(c), respectively. The window size used in this example is 21 × 21.

Under the level set framework, we derive the associated Euler-Lagrange equation for the level set function φ. Parameterizing the descent direction by an artificial time t ≥ 0, the gradient flow for φ(t, x, y) is given as

$$ \frac{\partial \phi}{\partial t} = \delta(\phi) \left[ \mu \, \mathrm{div}\left( \frac{\nabla \phi}{|\nabla \phi|} \right) - v - |u_0 - f_1| + |u_0 - f_2| \right] \tag{12} $$

$$ \phi(0, x, y) = \phi_0(x, y) \quad \text{in } \Omega \tag{13} $$

where φ0 is the level set function of the initial contour. Note that the weights λ1 and λ2 of Eq. 9 do not appear in Eq. 12, i.e., both are taken as 1 here. This gradient flow is the evolution equation of the level set function of our proposed method.

3.4 Implementation
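One explicit time step of the gradient flow in Eqs. 12-13 might look as follows. This is a sketch under assumptions rather than the paper's code: the time step `dt`, the default values of `mu` and `v`, the central-difference curvature and the small constant `tiny` are all illustrative choices, and δ is replaced by a smoothed Dirac function of the kind used in [1].

```python
import numpy as np


def curvature(phi, tiny=1e-8):
    """div(grad(phi) / |grad(phi)|) using central differences."""
    gy, gx = np.gradient(phi)              # derivatives along rows (y), then columns (x)
    norm = np.sqrt(gx ** 2 + gy ** 2) + tiny
    dyy, _ = np.gradient(gy / norm)        # d/dy of the y-component of the unit normal
    _, dxx = np.gradient(gx / norm)        # d/dx of the x-component of the unit normal
    return dxx + dyy


def evolve_step(phi, u0, f1, f2, mu=1.0, v=0.0, dt=0.1, eps=1.0):
    """One forward-Euler update of phi; mu, v, dt and eps are assumed values."""
    delta = (eps / np.pi) / (eps ** 2 + phi ** 2)   # smoothed Dirac function
    force = mu * curvature(phi) - v - np.abs(u0 - f1) + np.abs(u0 - f2)
    return phi + dt * delta * force


# Toy check: a bright square (1) on a dark background (0), flat initial phi.
u0 = np.zeros((8, 8))
u0[2:6, 2:6] = 1.0
phi = np.zeros_like(u0)
new_phi = evolve_step(phi, u0, f1=np.ones_like(u0), f2=np.zeros_like(u0), mu=0.2)
```

With f1 matching the square and f2 matching the background, φ rises on the square and falls elsewhere, so the zero level set moves toward the square's boundary. In a full iteration, f1 and f2 would be recomputed from the current φ before each call to `evolve_step`.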

In practice, the Heaviside function H and the Dirac function δ appearing in Eqs. 9-12 have to be approximated by smoothed versions. We adopt the H2,ε and δ2,ε used in [1], which are formulated as follows:

$$ H_{2,\epsilon}(z) = \frac{1}{2} \left( 1 + \frac{2}{\pi} \arctan\left( \frac{z}{\epsilon} \right) \right) \tag{14} $$

$$ \delta_{2,\epsilon}(z) = H'_{2,\epsilon}(z) = \frac{1}{\pi} \, \frac{\epsilon}{\epsilon^2 + z^2} \tag{15} $$

For all the experiments conducted in this paper, we set ε = 1.0 and the size of the window W to 21 × 21.

4 Experimental Results

We applied our proposed robust segmentation method to a number of synthetic and real images. Comparisons were made with the Chan-Vese model. In all the experimental results shown in this section, we initialize the evolving curve C as a circle, and the level set function φ is computed as the signed distance function with respect to C, positive inside C and negative outside C.
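The circular initialisation described above, together with the smoothed Heaviside and Dirac functions H2,ε and δ2,ε of [1], can be sketched as below. The function names, the image size and the circle parameters are hypothetical, chosen only for illustration.

```python
import numpy as np


def heaviside_eps(z, eps=1.0):
    """Smoothed Heaviside function: 0.5 * (1 + (2/pi) * arctan(z/eps))."""
    return 0.5 * (1.0 + (2.0 / np.pi) * np.arctan(z / eps))


def dirac_eps(z, eps=1.0):
    """Derivative of heaviside_eps: (1/pi) * eps / (eps^2 + z^2)."""
    return (eps / np.pi) / (eps ** 2 + z ** 2)


def circle_signed_distance(shape, center, radius):
    """Signed distance to a circle, positive inside and negative outside."""
    rows, cols = np.indices(shape)
    dist = np.hypot(rows - center[0], cols - center[1])
    return radius - dist


# Assumed image size and circle; the paper does not list these values.
phi0 = circle_signed_distance((140, 120), center=(70, 60), radius=30.0)
inside_weight = heaviside_eps(phi0)    # close to 1 well inside the circle
band_weight = dirac_eps(phi0)          # largest on the contour itself
```

Because the arctan-based H2,ε has support over the whole image, δ2,ε is non-zero everywhere, which lets the evolution act on pixels away from the current contour.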

The first example is the MRI image with bias field, which we showed in a previous section. The original image can be found in Fig. 1. As mentioned before, this image strongly violates the global mean assumption, so traditional region-based approaches are expected to fail. Figure 5 shows the result of the Chan-Vese model (Fig. 5(a)) and that of our local median model (Fig. 5(b)). As is evident, by dynamically adjusting its region representative measure (the local median) to accommodate local variations, our model successfully classifies the pixels of white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF) into the correct groups. We should also note that, although the model proposed in this paper is a two-phase one, which in theory can only separate an image into two different areas, a small amount of additional post-processing can make this model produce multiple segmentations.

Figure 5. Segmentation comparison of the Chan-Vese model and our model in handling intensity variations. (a) Curve evolution result using the Chan-Vese model; (b) result using our model.

We conducted a similar experiment on the bias-field-corrected version of the same MRI slice. The input image is shown in Fig. 6(a). Fig. 6(b) and Fig. 6(c) show the results of the Chan-Vese model and our model, respectively. Predictably, the Chan-Vese model divides the image into two parts, brain and background, as its region representative values (c1 and c2) are globally estimated. Our model, on the other hand, by locally adjusting f1 and f2, is able to discriminate more local details. As is evident, four regions (image background, white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF)) have all been clearly delimited, even though we are using a two-phase model. Another noteworthy comparison can be made between Fig. 5(b) and Fig. 6(c). Very similar segmentation results are observed for the MRI image before and after bias correction, which may indicate another merit of our model: it is quite independent of global intensity transformations.

Figure 6. Segmentation comparison made on a bias-corrected MRI slice. (a) The input image. (b) Curve evolution result from the Chan-Vese model. (c) Result from our model.

5 Discussion and Conclusions

In this paper, we propose a robust active contour model for solving the image segmentation problem. The design purpose is to overcome the drawbacks of most region-based active contours, which stem from the two assumptions these models make about image properties: 1) a global Gaussian distribution is assumed for each region, and 2) the global mean is sufficient as the discriminant measure. Our model integrates the concept of localization to account for intensity variations that appear locally. Two local median functions f1 and f2, defined on the entire image domain, replace the global means c1 and c2 used in the Chan-Vese model. As observed in the conducted experiments, our model also has the potential to represent multiple segmentations using only one curve. Exploring further properties of the new model is our planned future research direction.

References

[1] T. F. Chan and L. A. Vese, "Active contours without edges," IEEE Trans. Image Processing, vol. 10, no. 2, pp. 266-277, 2001.
[2] L. A. Vese and T. F. Chan, "A Multiphase Level Set Framework for Image Segmentation Using the Mumford and Shah Model," International Journal of Computer Vision, 50(3): 271-293, 2002.
[3] X. Han, C. Xu and J. Prince, "A topology preserving level set method for geometric deformable models," IEEE Trans. Patt. Anal. Mach. Intell., vol. 25, pp. 755-768, 2003.
[4] M. Kass, A. Witkin and D. Terzopoulos, "Snakes: active contour models," in First International Conference on Computer Vision, 1987, pp. 59-68.
[5] R. Malladi, J. A. Sethian and B. C. Vemuri, "Shape Modeling with Front Propagation: A Level Set Approach," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17(2), pp. 158-175, Feb. 1995.
[6] D. Mumford and J. Shah, "Optimal approximation by piecewise smooth functions and associated variational problems," Commun. Pure Appl. Math, vol. 42, pp. 577-685, 1989.
[7] N. Paragios and R. Deriche, "Coupled Geodesic Active Regions for Image Segmentation: A Level Set Approach," ECCV (2) 2000, pp. 224-240.
[8] C. Samson, L. Blanc-Féraud, G. Aubert and J. Zerubia, "A Level Set Model for Image Classification," International Journal of Computer Vision, 40(3): 187-197, 2000.
[9] V. Caselles, R. Kimmel and G. Sapiro, "Geodesic active contours," International Journal of Computer Vision, vol. 22, pp. 61-79, 1997.
[10] A. Tsai, A. Yezzi, W. Wells, C. Tempany, D. Tucker, A. Fan, E. Grimson and A. Willsky, "A Shape-Based Approach to Curve Evolution for Segmentation of Medical Imagery," IEEE Transactions on Medical Imaging, vol. 22, no. 2, pp. 137-154, February 2003.
[11] X. Xie and M. Mirmehdi, "RAGS: region-aided geometric snake," IEEE Transactions on Image Processing, 13(5): 640-652, 2004.
[12] C. Xu and J. L. Prince, "Snakes, Shapes, and Gradient Vector Flow," IEEE Transactions on Image Processing, 7(3), pp. 359-369, March 1998.
[13] A. Yezzi, A. Tsai and A. S. Willsky, "A statistical approach to snakes for bimodal and trimodal imagery," in Proceedings of 7th International Conference on Computer Vision, 1999, vol. 1, pp. 898-903.
[14] Z. Zhang, "Parameter Estimation Techniques: A Tutorial with Application to Conic fitting," Image and Vision Computing, vol. 25, pp. 59-76, 1997.
[15] S. Zhu and A. Yuille, "Region competition: Unifying snakes, region growing, and bayes/MDL for multiband image segmentation," PAMI, 18(9): 884-900, 1996.
