Optimal Level Curves and Global Minimizers of Cost ... - CiteSeerX

0 downloads 0 Views 817KB Size Report
models captures all the important scene variables but may be useful to get an insight .... The average value of f over a set A is de ned as fA = 1. jAj. Z. A f(x)dx. ? The set indicator function for a set A is de ned as 1x2A =1 if x 2A and 1x2A = 0 ...
Special issue of JMIV on Shapes and Textures

Optimal Level Curves and Global Minimizers of Cost Functionals in Image Segmentation Charles Kervrann ([email protected]) and Alain Trubuil ([email protected])

INRA - Biometrie, Domaine de Vilvert, 78352 Jouy-en-Josas, France

Abstract. We propose a variational framework for determining global minimizers

of rough energy functionals used in image segmentation. Segmentation is achieved by minimizing an energy model, which is comprised of two parts: the rst part is the interaction between the observed data and the model, the second is a regularity term. The optimal boundaries are the curves that globally minimize the energy functional. Our motivation comes from the observation that energy functionals are traditionally complex, for which it is usually dicult to precise global minimizers corresponding to \best" segmentations. Therefore, we focus on basic energy models, which global minimizers can be characterized. None of the proposed segmentation models captures all the important scene variables but may be useful to get an insight into objects, surfaces or parts of objects in the scene. In this paper, we prove that the set of curves that minimizes the cost functionals is a subset of level lines, i.e. the boundaries of level sets of the image. For the completeness of the paper, we present a fast algorithm for computing partitions with connected components. It leads to a sound initialization-free algorithm without any hidden parameter to be tuned. We illustrate the performance of our algorithm with several examples on both 2D biomedical and aerial images, and synthetic images.

Keywords: image segmentation, energy minimization, level sets, level lines, connected components.

1. Introduction Image segmentation and object boundaries estimation are among the most challenging and fundamental addressed problems in computer vision and image analysis. Conventional approaches have centered on formulating the problem of image segmentation as the minimization of a cost functional involving the image intensity and edge functions. Energy minimization-based segmentation methods then fall into two distinct classes, being either boundary-based or region-based. The former class looks at the image discontinuities near objects boundaries, while the latter examines the homogeneity of spatially localized features inside objects boundaries. Among the boundary-based approaches, general purpose closed contours (\snakes" [23]) controlled by elastic forces based on local curvature, in ating forces, and immersed in a potential eld (created for

c 2002 Kluwer Academic Publishers. Printed in the Netherlands.

2 Special issue of JMIV on shapes and textures instance by local edges) have been used to extract continuous contour lines. The interactive initialization of the energy-based active contour near the desired boundary signi cantly reduces the diculty of segmentation. The main diculty in this approach is the robustness to strong noise and artifacts and sensitiveness to the starting position. In addition, most algorithms optimizing the energy function associated to classical snake-type models, nd only local minima, and thus have no measure of the signi cance of the extracted boundary for the image as a whole. There has been signi cant progress on interactive contourbased segmentation: a \balloon force" which controls the interior area of the curve has been introduced in [11] to pass over weak edges and extract salient edges ; Geodesic active contours [25, 7] based on the theory of surfaces evolution and geometric ows [36, 45] and topologically adaptable snakes [30] have been introduced to detect an arbitrary number of objects in the image. These models precisely delineate object boundaries but can use unreliable local information to make a hard premature decision. Region-based approaches are less susceptible to noise and more suitable for unsupervised image segmentation. In the following, we will address this category of approaches which are our main interest. In contrast to boundary-based methods, region-based approaches try to nd partitions of the image pixels into zones the most homogeneous possible corresponding to coherent image properties such as brightness, color and texture. Homogeneity is traditionally measured by a given global objective function and hard decisions are made only when information from the whole image is examined at the same time. This approach generally assumes that the image contains parts of objects and their boundaries are the set of curves that minimizes a global energy function. Some energy models are based on a discrete model of the image, such as Markov random elds ([18, 27, 47] whereas variational models are based on a continuous model of the image [5, 35, 34, 33, 44, 21]. Leclerc [27] proposed a partition process based on a Minimum Description Length (MDL) representation of both the intensity variation within region and the enclosed boundary. Finally, Blake and Zisserman [5] and Mumford and Shah [35] have written about most aspects of the energy minimization-based approach to segmentation and have proposed various complex functionals whose minima correspond to segmented images. Depending on the formulation of the cost function with either continuous or discrete variables di erent minimization techniques can be applied. For cost functions of continuous variables, it is in many cases possible to solve a set of partial di erential equations, the Euler Lagrange equations, that are solved by the minima. For cost functions of discrete variables one must generally rely on direct minimization techniques. Often graduate-non-convexity [5, 27] or Monte Carlo Markov

3 Chain algorithms such as the well known simulated annealing [18] are the methods of choice. Typically, they estimate the curves that maximally separate unknown statistics inside and outside the curves and avoid bad local minima of cost functionals [5, 27, 19]. Unlike previous unsupervised segmentation techniques, other methods use speci c a priori knowledge to ease the segmentation task: they may assume that the number of objects [24, 10, 16], layers [13], classes [38, 48], or the statistics inside region boundaries [19, 43, 8, 3] are known or estimated using prohibitive Expectation-Maximization procedures or ad-hoc methods [13, 38, 48]. The region boundaries propagation can be then implemented using the level set theory [36, 45, 25] but these supervised segmentation methods may be sensitive to the initial curve conditions [3, 43, 8, 38]. Finally, there has been only very limited previous works aiming at integrating boundary and region information sources through a singleobjective functional [50, 38]. The global energy functional is optimized with a suboptimal optimization scheme [50] or a level set framework [38] which provides the advantages of numerical stability and topological

exibility. In practical imaging, these methods are more robust to noise and weak boundaries, but may su er from the problem of initialization of curves [50, 38], o -line estimation of the mixture model of Gaussians approximating the probability density function of the image [38], or selection of hyperparameters weighting the contribution of energy terms [50, 38]. Special issue of JMIV on shapes and textures

1.1. Motivation While the energy minimization-based approach o ers a powerful theoretical framework and minimizers exist [33, 50, 44], the main diculty is to nd more e ective and faster ways for computing the set of curves that minimizes the segmentation energy. A key issue arising in image segmentation is then the search for global minimizers of energy functionals, since global minima should correspond to \optimal partitions". The main contribution addressed in this paper is to list basic energy models, which global minimizers can be characterized in advance. Accordingly, iterative algorithms are not necessary to solve the optimization problem. Instead, meaningful segmentations can be computed by examining a family of possible partitions including the optimal partitions associated to global minima of the energy functional. Basically, the idea of our own work starts from the review by Morel and Solimini [33] of the energy model introduced in a discrete setting by Beaulieu and Goldberg for image partitioning [4]. The cost functional is composed of two terms, one that punishes deviations from the image and another

4 Special issue of JMIV on shapes and textures that encourages the emergence of a small number of regions [4, 33]. The present original investigation is based on a variational model to characterize global minimizers of similar energy models. A large part of this paper is devoted to the proof that the set of curves that minimizes a particular class of energy models related to the Beaulieu and Goldberg's energy is a subset of level lines de ned from level sets of the image. In other words, minimizing the energy is therefore equivalent to select a subset of connected components corresponding to image regions that optimally partition the image domain. We list some prior models (Markov connected component elds, entropy prior) which are consistent with this theoretical framework. In this sense, the method is equivalent to a procedure that splits the image into its natural components, the level curves, and selects a subset of these lines delivering a global or local minimum for the segmentation functional. Finally, we present a fast and original algorithm for computing partitions with connected components. It leads to a sound initialization-free algorithm without any hidden parameter to be tuned. The approach is robust to the initialization of both the number of regions and statistics inside regions. 1.2. Paper organization The remainder of this paper is organized as follows. Background and more related studies are presented in Section 2. In Section 3, we describe the energy-based models for image segmentation. We point out some properties of these models and outline minimizers. Section 4 is devoted to the numerical implementation of our models and the computation of the image segmentations. We describe a segmentation algorithm where user interaction is limited to providing a few sound parameters. In Section 5, we illustrate with some experiments on both synthetic and real images, the main features of our numerical schemes. Conclusions and perspectives are presented in Section 6. 1.3. Notation The notation used hereafter are the following: General notation: ? I is a digital image with 255 integer gray levels. ? S is a bounded and open subset of R2. ? f : S ! R is a ltered and continuous version of the original image I . We consider the possibility of examining the image at various scales using a Gaussian smoothing of the image I .

5 ? We use the terminology \site" or \pixel" to denote a point x 2 S , even in the continuous case. Each point is assigned a value f (x). ? pz is the standardized frequency of the number of pixels with value z 2 [0; 255]. ? The level sets L (f ) of f are de ned as: L (f ) = fx 2 S : f (x)  g; 2 R Special issue of JMIV on shapes and textures

and their boundaries constitute the level lines of f . ? We de ne the bilevel sets of f with levels v and w, 0  v < w, as the set of pixels x 2 S such as v  f (x) < w. ? BV (S ) is the space of functions with bounded variations [33, 6]. The BV space is a very straightforward space for image segmentation. A situation for an image I not to be in the BV space is to have level lines with in nite length or the sum of nite perimeters of level lines tends to in nity. We assume hereafter that small objects are not too numerous in I to be consistent with the BV assumption.

? We note jAj =

Z

dx the two-dimensional Lebesgue measure which corresponds to the area of the set A  S since S is de ned as a A

subset of the Euclidian space.

? The average value of f over a set A is de ned as f A = 1

jAj

Z

A

f (x)dx.

? The set indicator function for a set A is de ned as 1x 2 A = 1 if x 2 A and 1x 2 A = 0 otherwise. ? An usual distance between two closed subsets A and B is the Hausdor distance, de ned by (

)

d1(A; B) = max sup d(x; B); sup d(x; A) ; x2A

x2B

where d(x; A) denotes, as usually, the distance of a point x to the set A d(x; A) = yinf jx ? yj: 2A For two sets A and B , denote

Z

AnB

f def =

Z

A

f?

Z

B

f:

6 Special issue of JMIV on shapes and textures Partition notation: We consider that objects (or regions) i  S are connected, disjoint and non-empty image domains: i \i6=j j = ;. We note @ i the boundaries of objects i . Let be the non-connected image background de ned as the complementary subset of the union of objects:

= Sn

P [

i=1

i and i

\

= ;:

2. Background and related works There is a substantial literature describing the use of energy minimizationbased methods for image segmentation. In this section, we only touch on some of the work more closely related to ours. The Mumford-Shah functional [35] is known as the most synthetic criterion for the segmentation process [33], depending on two variables, the unknown image function f^ and the set of unknown boundaries denoted D: EMS (f;^ D) =

Z

n

S D

^ 2 dx +  (f(x) ? f(x))

Z

n

S D

^ j dx +  `(D): (1) jrf(x)

where f is the observed image, f^ the denoised image and D the set of discontinuities. This functional is the combination of three terms: the rst term penalizes a function f^ that di ers from f ; the second term penalizes large gradients rf^ ; the last term penalizes the total length of the segmentation curves. (f;^ D) is the segmentation of f corresponding implicitly to a piecewise smooth function f^ with a set of boundaries D. The  and  parameters control the relative weight of the three terms. The simplest energy functional associated with this generic description of Mumford and Shah is the restriction to piecewise-constant functions [26, 33, 50]:

EMS (f;^ D) =

Z

S nD

(f (x) ? f^(x))2 dx +  `(D);

(2)

where D is a union of boundaries in S with length `(D) and f^ is piecewise constant on S n D. This functional assumes highly constrained parametric models for pixel intensities within each region, whereas the complete Mumford-shah model naturally accommodates variability across each region without the need to model such variability parametrically. The constant  represents the scale parameter of the functional and measures the amount of boundaries: if  is low, a lot of boundaries are allowed ; as  increases, the segmentation gets coarser. This

7 functional represents the simplest compromise between accuracy of the regions and parcimony of the boundaries. If we x the boundaries D, then the corresponding minimal f^ is completely de ned by the fact that its value on each region i of S n D is equal to the average value f i of f over i . This corresponds to the \cartoon" model [50]: Special issue of JMIV on shapes and textures

Ecartoon (f; 1 ;    ; P ) =

P X

Z

=1 i

i

(f(x) ? f i )2 dx +  `(

P [ i=1

@ i ): (3)

This energy functional is adequate for extracting at zones with regular boundaries in images and, in this sense, the functionals we consider in this paper are quite related. In addition, from a numerical point of view, it is not easy to compute a minimizer for this functional since the set of boundaries is unknown. We mention the level set methods [36, 45] and the merging regions method [26, 33] to solve the so-called \minimal partition problem". In this paper, we propose an algorithm devoted to the search of global minima of basic energy functionals, i.e. less ambitious than the \cartoon" model, for which the length term has been relaxed. We note also that the energy functionals introduced in this paper (6) are also related to balloons models [11, 50, 40]. Snakes and balloons have been typically used with creative external forces to segment various anatomical structures. An active contour is a closed contour of a region @ , de ned by @ (s) where s 2 [0; 1] can be the arc length of the contour. The two-dimensional balloon model extends the snake energy to include a force N~ (s) which pushes the contour out (or in) along its normal N~ (s). Changing the sign of  makes this normal force in ating or de ating. Contrasting the balloon and snake models, we note that incorporation of the normal force in the balloon model allows the initial position of the contour to be further from the intended nal position, while still enabling convergence. As well, in the balloon model, the initial position can lie either inside or outside the intended contour, while the snake model requires the initial position to surround the intended contour [11, 40]. The additional in ation or de ation force can be derived from an Renergy functional. It corresponds to the following energy term j j = dx which is a reduced form of our prior model PP j

i=1 i j where P = 1 introduced in Section 3.4. Thus, the balloon model maximizes its area while smoothing its bounding contour @

and maximizing the intensity gradient along the contour. Finally, we want to point out that the approach presented in the current paper shares common aspects with the region competition approach of Zhu and Yuille [50]. The Zhu-Yuille's algorithm is derived by minimizing a generalized Bayes/MDL criterion using the variational principle. In this part of the section, we propose to determine the

8 Special issue of JMIV on shapes and textures motion for a point xb at the common boundary @ i (parametrized by s 2 [0; 1]) of a region i and the background by computing the gradient descent on (6). The time dependent position of the boundary @ i can be expressed parametrically by xb (s; t). The motion of the boundary @ i is governed by the Euler-Lagrange di erential equation. For any point xb (s; t) on the boundary we have: dxb(s; t) = ? E (f; i ; ) (4a) dt xb (s) ~ b (s; t)) + (f(x) ? f i )2 ? (f(x) ? f )2  N(x ~ b (s; t)) (4b) = N(x

where N~ (xb (s; t)) is the unit normal to @ i at point xb (s; t). There are two forces acting on the contour, both pointing along the normal: the rst term is analogous to a pressure term ; the second term is the statistics force. The better the point xb (s; t) satis es the homogeneity requirement, and the weaker is the statistics force. This equation can be seen as a degenerate case of the region competition algorithm [50]. The Euler-Lagrange equations solving for each region can be complex and the region competition algorithm (see [50]) nds a local minima. Using the level-set formulation [36, 8, 43, 38, 48], suitable numerical schemes have been derived for solving propagating equations. However, in both cases, seed regions must be provided by the user or randomly put across the image, and average values f i and f are updated at each step of the iterative algorithm. In Sections 3.3 and 3.4, we prove that the steady solution of a set of P equations of the form (4b) can be directly determined (P regions superimposed on a background). Most of the time, these energy functionals are designed as a combination of several terms, each of them corresponding to a precise property which much be satis ed by the optimal solution. The common approach consists in minimizing a functional including two parts: a prior model Ep(f^) and a data model Ed(f;^ f ). The prior term is sometimes called the \regularizer" because it was initially conceived to make the problem of minimizing the data model well-posed (in the sense of Hadamard). They may be used in conjunction with Bayes's theorem. The reviewed models are quite related to the segmentation energies described in the remainder of this paper, for which a complementary description of minimizers is given.

3. The framework The energy models we propose are very rough and incomplete models for real world images. They assume the existence of regions in images and try to give a relevant description explaining images.

9

Special issue of JMIV on shapes and textures LIST OF ALL OBJECTS

CONFIGURATION CODING

Ω1 Ω1 Ω2 Ω3 Ω4 0 0 0 0

IMAGE OF ALL OBJECTS

Ω2

MINIMAL ENERGY-BASED SELECTION

Ω2

Ω2

Ω1 Ω4



Ω3 Ω1 Ω2 Ω3 Ω4 0 1 0 0

Ω3

Ω3

Ω4 Ω1 Ω2 Ω3 Ω4 1 1 1 1

Figure 1. A minimal energy-based partition is given by the set of (P = 2) objects [ f 2 ; 3 g and a background = 1 4 .

3.1. Some preliminaries Our theoretical setting is the following. I is a digital image assumed to be in the BV space. In particular, this means that small objects are not too numerous in I . A more regular image f de ned on S can be obtained by Gaussian smoothing of I . Thus, we may work in a space of continuous functions. In this framework, we interpret f in the classical way as a nested family of level sets L (f ) de ned by Matheron [29]. In this section, we shall propose highly simpli ed models for image generation. We just aim at nding both the most meaningful regions corresponding to visible parts of occluded real objects and the image background. A partition of space S consists in nding a set f i gPi=1 and a background de ned as the complementary subset of the union of objects (see Fig. 1):

Sn =

P [

i=1

i ;

i

\

i6=j

j = ; and i

\

= ;:

(5)

3.2. Minimization problem We de ne the solution to the segmentation problem as the global minimum of a regularized criterion over all regions. We seek a strong segmentation, that is, a partition of the rectangle S into a nite set of patches, each of which corresponds to a part of the image where f is

10 Special issue of JMIV on shapes and textures approximated by a constant value. Moreover, we just wish to control the number P of regions in the image. The regularization theory leads to associate with the unknown domains i and the following regularized objective function, inspired from [4, 20]: E (f; 1;    ; P ; ) = Ed (f; 1;    ; P ; ) +  Ep ( 1 ;    ; P ; ) (6) where f is any continuous function, Ep( 1;    ; P ; ) is a penalty functional and  > 0 is the regularization parameter. Some choices of f have been recently listed in [21]. In [28], f is an image with local variance computed over a xed-sized window as the values at each pixel. Equation (6) is the most general form of energy we can optimize globally at present. We present three appropriate energy models for segmentation which attempt to capture homogeneous regions with unknown constant intensities. It will be clear that none of these models captures all the important scene variables but may be useful to provide a rough analysis of the scene. We shall see that objects can be extracted as connected components of image levels sets provided the prior model only includes nonlinear functions of the area of objects. The regularization term Ep ( 1;    ; P ; ) is herein introduced to encourage the emergence of a large background and gives no control on the smoothness of boundaries. Instead of xing a priori the cardinality of the segmentation, which is a highly arbitrary choice, it seems indeed more natural to control the emergence of regions by an object area-based penalty weighted by a parameter . The constant  will be interpreted as a scale parameter of the functional that only tunes the number of regions [4, 33]. If  is large, the background is encouraged to be a large part of the image. As  decreases, a lot of regions are allowed and the segmentation is ne. If  = 0, each point is potentially a region and = ; ; the global minimum value coincides with zero and this segmentation is called the \trivial segmentation" [26, 33]. From the many energy criteria that have been proposed in the literature, we recall three signi cant basic examples commonly used in image segmentation. Least squares criterion: In this modeling, implicitly, a Gaussian distribution for the noise is assumed [33, 50]. The data model is usually de ned as Ed(f; 1 ;    ; P ; ) =

P Z X

=1 i

i

(f(x) ? f i )2 dx +

Z



(f(x) ? f )2 dx (7)

where f i and f denote respectively the unknown average values of f over i and = S n[Pi=1 i . This means that one observes a corrupted function f = ftrue + ", where " is a zero-mean Gaussian white noise.

Special issue of JMIV on shapes and textures

11

The true image ftrue is supposed piecewise constant:

ftrue (x) =

P X i=1

f i

1x 2 i

+ f 1x 2 :

(8)

The standard deviation is assumed to be constant over the entire image [50]. The image domain S is split into unknown P disjoint regions

1;    ; P and a background region . The data model (7) is of a general form and includes the situation when the user speci es input a priori knowledge about objects and background [3]: by setting f i = 255 and f = 0, we are interested in the detection of regions of high intensity or bright spots against a dark background [21] ; in motion analysis, the form

Ed (f; 1;    ; P ; ) =

Z

S

f 2 (x) dx ?

P X

Z

i=1 i

f 2 (x) dx

(9)

where f is the temporal gradient image, has been used to detect temporal changes in image sequences. Intracluster variance criterion: The intracluster variance is simple and yet captures compact sets of points with large distance between clusters. The approximation error, introduced in [4], is then the sum per unit area of the squared di erences between the pixel values and the region average f i : Z Z 1 1 2 (f(x) ? f )2 dx:(10) Ed (f; 1 ;    ; P ; ) = j j (f(x) ? f i ) dx+ j

j i



i i=1 P X

The method of Beaulieu and Golberg [4] may be regarded as a problem of piecewise image approximation, which consists in nding the partition with small enough cardinality and variance [33]. Contrast criterion: One may be interested in identifying boundaries corresponding to sharp contrast in the image. We de ne the contrast of a boundary by the di erence between the average value of f per unit area on the inside of the boundary and the outside of the object, that is the background [37, 48]. Formally, the data model is

Ed(f; 1;    ; P ; ) = ?

P X i=1

(f i ? f )2:

(11)

Regions are assumed to be surrounding simple closed curves superimposed on the background. This data model does appear to have a

12 Special issue of JMIV on shapes and textures fairly wide application potential, especially in medical image analysis and confocal microscopy, where the regions of interest appear as bright objects relative to the dark background. This criterion may be further generalized by considering statistics other than means [48]. For the sake of clarity, we restrict ourselves to the rst case, i.e. the least squares criterion, and give major results for the other criteria. Our aim is now to de ne objects in f . Therefore, we de ne the following collection CP of P  0 admissible objects

CP = ff 1;    ; P g  S are connected ; S n =

i

\

j = ; g:

P [ i=1

i ;

1i6=jP

When P = 0, there is no object in the image. An optimal segmentation of image f is by de nition a global minimum of the energy (when exists)

( ?1;    ; ?P ? ; ?) = inf0P T inff 1 ;; P g2CP E (f; 1;    ; P ; )(12) where the collection CT denotes the bank of all admissible objects, i.e. CP  CT ; 8P  T , and T is the maximum number of registered objects in the image. In Fig. 1, T = 4 and a minimal energy-based partition is given by the set of (P = 2) objects f 2; 3g and a background

= 1 [ 4 . A joint minimization with respect to all unknown domains i and parameters f i is an intricate problem [5, 35, 33, 50, 12], even if T is low, because of the large number of possibilities of placing object boundaries inside S and P is unknown. In addition, the set of unknown variables (sets and functions) are, by de nition, clearly not independent. In Section 2.3, we prove that the object boundaries that minimize (6) are level lines of function f , which makes the problem tractable. 3.3. Minimizer description and level lines We assume here existence of minimizers of the energy E(f; 1;    ; P ; ). Our estimator is de ned by (when exists)

b ( b 1 ;    ; b Pb; ) = argmin0P T argminf 1 ;; P g2CP E (f; 1;    ; P ; ):(13)

Minimizing E with respect to f 1;    ; P g is herein equivalent to estimate a piecewise-constant function f^ (denoised image) in the space of functions of the form:

f^(x) =

^

P X i=1

f ^ i 1x 2 ^ i + f ^ 1x 2 ^ :

13 The question of the existence of a global minimum for the energy like Mumford and Shah's energy is a dicult problem [33]. In this paper, our aim is not to investigate conditions for having global minima and discuss their existence, which is quite dicult. In what follows, we just assume the existence of minimizers of the energy [33, 20]. Special issue of JMIV on shapes and textures

Minimizer description We propose the following lemma

LEMMA 1. If there exist minimizers, if in nitesimal variations in the neighborhood of the minimal solution do not introduce topological changes, and if Ep penalizes only the area of objects, then the set of curves that minimizes the energy is a subset of level lines of f :

fj@ bi  i ;

i = 1;    ; Pb :

i.e. the border @ b i is a boundary of a connected component b i of a level set of f . Proof (of Lemma 1). Let  be a variation of a set , i.e. the Hausdor distance d1 (  ; )   . To prove Lemma 1, we assume that, for any connected perturbation of such d1 (  ; )   , two neighboring sets and 0 do not merge into one single set [ 0 and, for any connected perturbation of such d1 (  ; )   , does not split into two new sets. This corresponds to prohibited topological changes. Without loss of generality, we prove Lemma 1 for one object

and a background , that is the closure Zof the complementary Z Zset of

. We de ne the image moments as m0 = 1; m1 = f; m2 = f 2 ;

M0 =

Z

1 ; M1 =

Z

Z





2 f : The di erence between the involved

f; M2 = S energies is equal to Ed(f; ; ) = E(f;  ;  ) ? E(f; ; ), i.e. E(f; ; ) = E d(f;  ;  ) ? Ed(f; ; ) + Ep (  ;  ) ? Ep( ; ) : (14) | | {z } {z } S

S

Ed (f; ; )

Ep ( ; )

Denote j j = j  j ? j j. Let xb be a xed point of the border @ . Choose  such that @  = @ except on a small neighborhood of xb . if j j ! 0, i.e. j  j ' j j, we obtain (higher order terms are neglected) (see Appendix) Ed (f; ; ) = j j

2 X

j =0

aj f j (xb ) + O(j j2)

(15)

where the coecients faj g are computed from image moments. For the three criteria described in Section 2.1, the coecients faj g are listed in table I.

14 as

Special issue of JMIV on shapes and textures

Now, if Ep penalizes only the area of objects, we can write Ep( ; ) Ep( ; ) = j j

2 X

j =0

bj f j (xb ) + O(j j2)

(16)

where the coecients fbj g are computed from the same image moments. The energy having a minimum for , f (xb ) needs to be solution of the following equation 2 E(f; ; ) = X (aj + bj ) f j (xb ) + O(j j) = 0: (17) j j j =0

By passing to the limit j j ! 0, we obtain 2 X

j =0

(aj + bj ) f j (xb ) = 0:

(18)

Equation (18) has not more than two solutions. The coecients faj + bj g do depend on neither xb nor f (xb ), and a0 + b0 6= 0. The function f is continuous on a domain S  R2 and @ is a connected curve. Therefore f (xb ) is constant when xb covers @ . In other words, @ is a level line of f . This completes the proof 2 We have proved Lemma 1 with a connected perturbation. Equation (18) states a necessary condition which is essential to prove that a subset of level lines is a minimizer of the energy. In this approach, the family of possible partitions, based on level lines which do not cross each other's, includes local and global minimizers. Indeed, placing a level curve inside S corresponds to select a local or a global minimum of the energy. A global minimizer can be then computed if all combinations of level curves are supervised. We basically consider further that a connected component is an object i : each bounded connected component has a topological border that is composed of edgels called lines lines ; a level line separates the plane into two disjoint connected parts, it bounded interior and its unbounded exterior ; i is comprised in the interior of one of the level lines. In the next section, we list two penalty functionals relying on the Markov connected component elds and entropy theories, which are consistent with Lemma 1. 3.4. Prior models One of the diculties in the Bayesian approach is to assign the prior law to re ect our prior knowledge about the solution. Besides, in consequence of Lemma 1, the set of penalty functionals is limited. The

Special issue of JMIV on shapes and textures

15

Table I. Coecient faj g associated to the least squares, intraclass variance and contrast criteria. a0

Least Square criterion

m21 m20

? ((MM10 ??mm10 ))22

? mm202 + 2mm301

a1

a2

? 2mm01 + 2(MM01??mm01 )

|

2

Intraclass variance criterion

+ (MM02??mm02)2

? 2((MM01??mm01))32

? 2mm201 + 2((MM01??mm01)2)

Contrast statistic criterion

2m21 m30 ? 2((MM01??mm01))32 ? 2m1 (1m?20 (2Mm00?)(mM01)+2 m1 )

? 2mm201

+ 2((MM01??mm01))2 + m2(0M(M1 ?0 ?2mm10)) 2

1 m0

? M0 ?1 m0

|

contribution of a given pixel to the prior does not depend on the relation with neighbors and the resulting regions may have noisy boundaries. Here, the proposed penalty functionals are not necessary convex but only enable to select the right number of regions. Instead of xing a priori the cardinality of the segmentation, which is a highly arbitrary choice, it seems more natural to control the emergence of regions by an object area-based penalty or by an information criterion weighted by a scale parameter . Markov connected component elds Beforehand, we recall that a grayscale image can be interpreted as a nested family of sets fL (f )g where L (f ) = fx 2 S : f (x)  g. These deterministic thresholded sets give rise to a random set if the index 2 [0; 255] is replaced by a random variable U de ned on the set of grey-levels. If U is uniformly distributed on the set of grey-levels, this yields a simple relation between the distribution of LU and the image f : Prfx 2 LU g = PrfU  f (x)g / f (x): (19) We will now discuss models based on some random partition of the plane into connected components. These primitives made from image level sets, have the ability to capture some global features of the image without the need to specify too tightly the exact scene contents. From

16 Special issue of JMIV on shapes and textures the relationship between gray-scale images and random sets, a new class of Gibbsian models with potentials associated to the connected components or homogeneous parts has been introduced in [31]. For these models, the neighborhood of a pixel is not xed as for Markov random elds, but given by the components which are adjacent to the pixel. These models are especially applicable for images where a relatively few number of gray levels occur, and where some prior knowledge is available about size and shape characteristics for the connected components [47]. The Markov connected component elds possess certain appealing Markov properties which have been established in [31]. Here we considered a Markov connected component eld whose probability density function is proportional to exp ?

( P X |

i=1



1 P + 2 j i j + {z

Ep( 1 ;:::; P ; )

3 j ij2

)

:

(20)

}

The parameter 3 controls the size of the components since the squared area of the union of two components is greater than the sum of the squared areas of each component. The size of the components is however also in uenced by the parameter 2 together with the parameters 1 and  = f0; 1g which control the number of components. Another Markov connected component eld can be obtained by penalizing small components [31]. Formally, the prior model is Ep ( 1; : : :; P ; ) = PP ?1 . These potentials Ep( 1 ; : : :; P ; ) are the more gen j

j 0 i i=1 eral form we can use since the boundaries of connected components cannot be penalized in our framework. The internal potentials can be separately used to select the right number of regions by setting 0 ; 1; 2 ; 3 = f0; 1g. Thereby, the prior model is of the form Ep( 1 ; : : :; P ; ) =

P 3 X X

=1 n=0

i

n j ijn?1; n = f0; 1g and

3 X

=0

n

n = 1: (21)

In Section 2.2, we proved Lemma 1 for one object and a background . Using the same notations, we easily write 3 X

n (j jn?1 ? j jn?1 ) (22a) n=0 = j j (? m02 + 2 + 23 m0 ) + j j2( m04 + 3 ): (22b) 0 0  0 Accordingly, we obtain b0 = (? m20 + 2 +23 m0) and b1 = b2 = 0, which Ep( ; ) =

is consistent with Lemma 1 if no pathological event (e.g. topological changes) occurs.

17 The application of Markov connected component elds is somewhat more computationally demanding than the application of Markov random elds. By the local Markov property the calculations for an update of a site in a single site updating algorithm only involves the components adjacent to this site. Our work may be regarded as an preliminary exploitation of the theoretical framework described by Mller et al. [31] in image segmentation. Special issue of JMIV on shapes and textures

Entropy prior The entropy function has been widely used as a prior in a Bayesian context for image restoration. Here, the entropy of the segmented image is written as follows [14]

Ep( 1; : : :; P ; ) = ? =?

P X i=1 P X i=1

p i log p i ? p log p

(23a)

j ij log j ij ? j j log j j (23b) jS j jS j jS j jS j

where fp 1 ; : : :; p P ; p g the standardized frequencies of the number of pixels with gray level values ff 1 ;    ; f P ; f g in the segmented image. The histogram entropy is minimized for a Dirac distribution corresponding to one single class in the segmented image. In image segmentation, we want to obtain a histogram sharper than the histogram of the initial image, so the entropy should be minimized [14]. The actual reduction of number of classes is obtained from the information prior Ep( 1; : : :; P ; ). Using the notations introduced in Section 2.2, we write j j 2 (24) Ep( ; ) = jSj j j log jS jj?

j + O(j j ):

j Accordingly, we obtain b0 = jS1 j log jS j?j j j and b1 = b2 = 0, which is consistent with Lemma 1.

3.5. Upper bound of the objects number It appears, most of the time, that variations in the values of this parameter  have signi cant e ects on the qualitative properties of the minimizer [49]. In this section, we show that the maximum number of objects is explicitly in uenced by . LEMMA 2. Suppose that the segmentation energy is maximum when there is no object in the image. If there exist minimizers of (6) and (7), P P if Ep ( 1;    ; P ; ) = Pi=1 3n=1 n j i jn?1 and if j i j  j minj; i =

18 Special issue of JMIV on shapes and textures 1;    ; P , then the optimal number P ? of objects is upper bounded by Z

Pmax = ?1 (1 + 2 j minj + 3 j minj2)?1 (f (x) ? f S )2 dx; S

with 0 = 0, 1 ; 2; 3 = f0; 1g and

3 X

n=1

n = 1.

Proof (of Lemma 2). If 0 = 0, we have



?

P X

3 X

i=1 n=1

n j ?i jn?1  E(f; ?1;    ; ?P ? ; ? ) 

Z

S

(f (x) ? f S )2dx:

3 P? X X 2 ? n If j i j  j minj, we have P (1 + 2 j minj + 3 j minj )  i=1 n=1 Z and P ?  ?1 (1 + 2 j minj + 3 j minj2 )?1 (f (x) ? f S )2dx S

j ?ijn?1 2

4. A stepwise algorithm for image segmentation This section describes our algorithmic procedure for object boundaries estimation using the result described above. We discuss issues that have arisen in converting the theory to practice for our applications. We present a formal description of the method starting with a description of the input parameters of the algorithm. Our recommendations for the concrete choice of these parameters as well as the default choices used in our simulations and applications are collected in this section. The algorithm we propose is automatic and does require neither the number of regions nor any initial average values for regions and background. 4.1. Level sets and object boundaries The key ingredient of the procedure is the construction of objects whose boundaries are level lines in the image [6]. In the today's technology, we can traditionally associate with an image 256 level sets fL (f )g,

2 f0; 1; 2;   ; 255g. Let be a xed level of the image, 0   255, and let u be the binary image at level of the discretized original image f , de ned by u (x) = 1 if f (x)  and u (x) = 0 otherwise. A crude way to build pixels sets corresponding to objects would be to proceed to a connected components labeling [32] of binary images fu g, 0   255, and to associate each label with an object i. Its boundary @ i would be the border of the connected component within the image level sets [6].

19 Instead of computing all the 256 level sets, we restrict only this computation to a small number of L(< 256) level sets and adaptively quantify the image histogram using an entropy method. Now, we consider the scenario where a point x belongs to one single connected component at once within the image level sets. We take into account this fact and de ne the L-bilevel sets of f [2]. For l 2 N varying from 1 to L, let bl be the binary image with bl (x) = 1 if f (x) 2 [tl?1 ; tl [ and bl (x) = 0 otherwise, where tl is a threshold (tl 2 [fmin ; fmax]). Each bilevel image represents a quantization level of the original image. In general, each bilevel set is made up of n(tl ) connected components, where n(tl ) is a function of the threshold tl . Notice that the connected components f tl;1 ; tl;2 ;    ; tl;n(tl) g, 1  l  L are disjoint and their union is the image domain S: Special issue of JMIV on shapes and textures

l[ =L l=1

[ tl;1

[

tl;2

[

[

   tl;n(tl)] = S:

(25)

The connected components f i g of level sets can be characterized by their surrounding curves f@ i g, that is the level lines [6, 2]. If we map these level lines for a given set of L levels, we get a segmentation of the image also called topographic map [6, 17]. Recently, Monasse and Guichard proposed a fast discrete algorithm to compute a topographic map using a contrast invariant tree representation of connected components [32]. As made clear in [6], the topographic map is the basic structure of the image. More generally, one can consider a segmentation achieved using only some connected components of level sets, which is the philosophy of our approach. The most perceptible level lines can be determined by an isoperimetric criterion [17] or the detection of Tjunctions of level lines [6]. Both criteria are strong indicators of region boundaries. Instead, we de ne herein perceptually signi cant level lines as the level sets boundaries of an adaptively quantized image by using L quantizers and an entropy method. As a consequence, the detection of meaningful level lines will depend on the quantization parameter L. Unlike previous criteria [6, 17], this quantization operation is not invariant to contrast changes. Nevertheless, we shall see that, in practice, L = f4; : : :; 8g seems sucient to detect physically meaningful objects with large areas in the image. 4.2. Adaptive image quantization Let us address the detection of bilevel intervals in a image histogram. This task is a preliminary step useful for separating objects from background. In histogram analysis, we can distinguish several classes of algorithms computing modes or meaningful intervals. First of all, the

20 Special issue of JMIV on shapes and textures normalized histogram may be regarded as the instance of a mixture of L gaussian random variables whose averages and variances have to be estimated. Clearly, optimization algorithms, e.g. EM procedures, can be used to solve this problem [38]. Moreover, many theories intend to threshold a histogram in an optimal way, that is divide the histogram into intervals according to some criterion [15]. The presence of several distinct objects makes traditionally the histogram multimodal. The thresholds are found by locating the valleys that separate the modes. Most grey-levels histogram are not multimodal, for this reason other techniques are needed. Information theoretic methods for image thresholding use the image histogram and an objective function derived using information theory to determine the optimal thresholds. The most popular criterion is the entropy criterion. Entropy methods seek to maximize the information content between objects and background pixels of an image. This problem is relevant in image analysis, since it leads to the problem of optimal quantization of the grey levels. We present a technique based on Shannon's entropy concept, for grey-scale image quantization [41, 9, 22]. The maximum entropy sum method proposed by Kapur et al. is based on the maximization of the information measure between two classes (objects and background) [22]. The distribution of the grey levels in the image can be approximated by a histogram which gives the normalized frequency of occurrence of each gray level in the image. The entropy of the entire image 255 X is given by H = ? pu log2 pu where pu represents the standardized u=1 frequency of the number of pixels with grey-level u. If we assume L bilevel intervals fIlg in the image, the a priori entropies are de ned as HIl = ?

tl L X pu log pu with p(I ) = X p and p(Il ) = 1:(26) l u p(Il ) 2 p(Il ) u=tl?1 u=tl?1 l=1 tl X

Denote Hl = ?

tl X u=tl?1

pu log2 pu . The information between intervals

I (I1;    ; IL) is de ned as I (I1;    ; IL) =

L X l=1

HIl =

L X

l : log2 p(Il ) + pH ( I l) l=1

(27)

The method due to Kapur et al. chooses the thresholds ftl g to be the values at which the information is maximum. In this approach, the number of levels L is assumed to be known. The estimation of L could be addressed using a more complex information criterion [27].

21 Experimental results show that this standard approach is very simple and ecient to implement compared to similar approaches in terms of computational complexity (computational time and memory requirement). Besides, it yet yields reliable estimates of the threshold levels with respect to noise e ect. The resulting algorithm thus provides an ecient and totally unsupervised threshold selection method. Intensive experiments of the algorithm with various characteristics has shown its e ectiveness in practical applications that will be further explained in Section 4. Special issue of JMIV on shapes and textures

4.3. The procedure The proposed algorithm is not a region growing algorithm as described in [26, 33, 4, 39] since all objects are built once and for all. It di ers from the wathershed approach since regions that emerge from the watershed segmentation are not necessarily connected components within the image level sets [46]. Unlike our method, the watershed approach includes an oversegmentation and computational expense. We post-process the connected components to remove any components whose surface area j ij is less than some threshold j minj (a parameter of the method) to eliminate regions corresponding to noise and artifacts in the original image. This parameter is commonly used in image segmentation [42, 1]. To implement our level set image segmentation based on energy minimization, a four stages method is used. Let L, , j minj be the input parameters set by the user. 1. Bilevel set construction The rst step completes a crude

mapping of each image pixel on a given bilevel set. At present, we quantize the function f 2 [fmin ; fmax] in L non-equal-sized and nonoverlapping intervals [tl?1 ; tl[, l = f1;    ; Lg. Given this set of intervals estimated using the maximum entropy sum method [22], let bl be the bilevel set image with bl(x) = 1 if f (x) 2 [tl?1 ; tl [ and bl(x) = 0 otherwise. 2. Object extraction A crude way to build pixels sets correspond-

ing to objects is to proceed to a connected components labeling of bilevels image bl and to associate each label with an object i . Though this process may work in the noise-free case, in general we would also need some smoothing e ect of the connected components labeling. So we consider a size-oriented morphological operator acting on sets that consists in keeping all connected components of the output of area larger than a limit j minj. This area operator does not introduce new features or edges and boundaries of connected components are preserved [42, 1]. The list of remained connected components then forms

22 Special issue of JMIV on shapes and textures the bank CT of admissible T objects f 1;    T g (P  T ) such as j ij  j minj. The connected components of area lower than j minj are a part of the background . 3. Configuration determination The connected components are

then combined during the third step to form object con gurations. For instance, these con gurations can be built by enumeration of all possible object combinations, i.e. 2T con gurations. Each con guration is made of a subset of objects taken in the bank f 1 ;    T g The background corresponds to the complementary set of objects selected for each con guration. Each possible con guration can then be indexed by a binary number bi which is the binary expansion of i (0  i  2T ? 1). The binary value of each bit in bi determines the presence or absence of a given object in the con guration (see Fig. 1). 4. Energy computation and object configuration selection

Energy calculations take the image intensities of the original (not quantized) image to establish piecewise-constant approximation errors. EnR ergies of the form f i (f (x) ? f i )2 dxg (Least squares criterion) are R computed once and stored in ram memory. The energy term (f (x) ? f )2 dx (Least squares criterion) is eciently updated for each con guration since is the complementary subset of the union of objects f igPi=1. The con guration that globally minimizes the energy functional corresponds to the optimal segmentation. The time necessary to perform image segmentation essentially depends on the size of the object bank CT , i.e. the number T of registered connected components. Nevertheless, all con gurations are independent and could be potentially evaluated on suitable parallel architectures. 4.4. Computational issues Now we discuss how some parameters of the procedure can be selected and indicate one possible choice used in our experimental results. On the discrete domain S (with rectangular tessellation) the neighborhoods of a pixel x are typically de ned via 4-connectivity or 8-connectivity. Number of bilevel sets The value of L is mainly determined by the number of meaningful objects that one wishes to extract and the computational e ort one is able to spend. Decreasing L allows to reduce the number of connected components and process a small bank of T objects. We propose herein a method for mapping a set of pixels to a small set of levels such as each connected component forms a relatively large and meaningful region. In practice, our approach successfully segmented various images into only 4 or 8 levels.

Special issue of JMIV on shapes and textures

23

ALGORITHM 1:

C

T f T P = 0, E(f; 1 ;    ; P ; ) E(f; S ).

Let T be a bank of objects and the observed image. Compute the energies of all individual objects. Initialize Repeat

?

j 2 CT n f b 1;    ; b P g compute E(f; j ; ; ). ? Choose b P +1 so that

b P +1 = arg max E(f; j ; ;  ): For each object

j 2CT nfb 1 ;;b P g

? f b 1;    ; b P g f b 1 ;    ; b P g

[

b P +1 , P

P + 1.

E(f; j ; ;  ) < 0; 8 j 2 CT n f b 1;    ; b P g: Pb = P and b = S n f b 1;    ; b Pbg.

Until Set

Figure 2. Object selection algorithm.

Minimal size of objects This area operator a ects the image by remaining connected components within the image level sets that do not satisfy the minimum area criterion [42, 1]. Boundaries of connect components are not distorted by this operator since it does not incorporate any shape of structuring element on the processed image. We post-process the connected components to eliminate patches corresponding to noise in the original date. Our default choice is j minj 2 [0:0001; 0:015]  jS j. Hyperparameter  The choice of this parameter determines mostly the properties of the segmentation result. Increasing this parameter reduces the nal number of objects to be extracted. If f is a function from S to [0; 255], a default choice for the hyperparameter is  2 [0:001; 1:]  2552. Of course larger values of  lead to the extraction of only one object. In practice,  = 0:01  2552 provides a reasonable compromise for most cases. However, we keep a possibility to tune this parameter in some speci c situations depending on what is important in each particular case. Energy minimization For a xed bank CT = f 1;    ; T g of T objects, one way to choose the optimal set of objects f b 1;    ; b Pb g, Pb  T de ned by (13), is to search for all possible combinations of P objects and compute the corresponding energy E(f; 1;    ; P ; ). Then, by

24 Special issue of JMIV on shapes and textures comparing the energies, we can see which collection of objects is the best. Enumerating all possible sets of objects in the object bank and comparing their energies is computationally too expensive if T is large (typically, it is infeasible if T > 32). Instead of a such a brute force search, only used in experiments when T  20, we propose the following stepwise greedy algorithm for minimizing E(f; 1 ;    ; P ; ). We start from P = 0 and introduce one object j at a time. Energies of all objects are assumed to be already stored in a ram memory. At the rst step, we compute the T energies with one single object j at once against the complementary subset = S n [Tj6=i=1 i . Let b 1 the estimated object that best lowers E . This object is stored on a ram memory as an object of the optimal con guration. It is removed from the initial bank of objects CT . At any steps of the algorithm, a new estimated object is chosen to maximally decrease the energy E. Suppose that at the P -th step, Pb and b are not known but we have estimated P objects f b 1;    ; b P g and a current background = S n f b 1;    ; b P g. Let E(f; b 1;    ; b P ; ) the current computed energy. Then at the (P + 1)-th step, for each object j 2 CT n f b 1;    ; b P g, we calculate the following energy di erence: E(f; j ; ;  ) = E (f; b 1;    ; b P ; ) ? E(f; b 1 ;    ; b P ; j ;  ) (28) where =  [ j . Intuitively, we choose the object which has the maximal di erence, i.e.,

b P +1 = arg

max

j 2CT nf b1 ;; bP g

E(f; j ; ; ):

(29)

The algorithm stops at P -th step when the adding of any object does not decrease E. This means that the optimal number of objects is Pb = P and the remained objects of the bank are a part of the estimated background, i.e. b = S n f b 1 ;    ; b Pbg. In summary, we propose the algorithm of object selection described in Fig. 2. This fast algorithm selects a suboptimal con guration of objects corresponding to a local minima of the energy functional. Using this algorithm, T (T2 +1) object con gurations are examined at the most, whereas the supervision of all the con gurations correspond to 2T iterations. The cpu-time taken by our implementation is reported in Section 4.

Special issue of JMIV on shapes and textures

25

5. Experimental results in image segmentation This section presents experiments on synthetic images as well as realworld images. We are interested in the use of the technique in the context of medical and meteorological imagery. Our system successfully segmented various images into a few regions. For the bulk of the experiments, the algorithm parameters were set in these experiments as follows: L = f4; 8g, and regions which areas j ij < j minj are discarded. The cleaning threshold j minj 2 [0:0001; 0:00075]  jS j indicates the minimum area of connected components to be retained in the segmentation. Recall that obtaining the most meaningful objects is the goal of this work. For this reason, L was set fairly low in the experiments to obtain large regions and to improve robustness to noise and artifacts in the image. For our method,  varies across the images depending on the image content. It is set empirically and values that gave visually better results were chosen. Only a small number of values was tried for this parameter. Most segmentations took approximately about 530 seconds on a 296MHz workstation. Simulations were conducted on synthetic as well as real-world images to evaluate the performance of the algorithm. We start by examining the in uence of the penalization parameter  on results. In uence of the penalization parameter  Figure 3a shows an aerial 256  256 image (in the visual spectrum) depicting the region of SaintLouis during the rising of the Mississipi and Missouri rivers in July 1993. We are interested in extracting the rivers and a background corresponding to textured urban areas. Figure 3 shows, from left to right, the original image, the topographic map when L = 8 and j minj = 0:00025  jS j and the piecewise constant approximation of the image. In this experiment, the maximum number of signi cant components is T = 291. The non-connected background labeled in \white", is composed of connected components that do not satisfy the minimum area criterion. The objects are labeled using f i . Figure 4 shows how the penalization parameter in uences the segmentation results for four di erent prior models combined with the Least Squares criterion. It takes about 10 seconds (42486 iterations at the most) of computing time for building the objects bank and selecting the best con guration using the stepwise greedy algorithm. Enumerating all the con gurations is infeasible since 2T = 3:981087 iterations ! In Fig. 4, we see that the constant  only tunes the number of regions. If  is large, the background is encouraged to be a large part of the image. As  decreases, a lot of regions are allowed and the segmentation is ne. In addition, we get similar segmentation results for any prior models by adjusting . Note that the extraction of large

26

Special issue of JMIV on shapes and textures

a. Original image

b. Topographic map (L c. Piecewise-constant =8; j minj =0:00025 jS j) approximation (T = 291) Figure 3. Level lines of a satellite image. P regions is encouraged as  increases, except when Ep = Pi=1 j i j and  is higher than a critical value c .

In uence of the number L of bilevel sets To go further in the comparisons, we can focus on the segmentation results we obtain by varying L = f2;    ; 17g. The performance of the minimization procedure is demonstrated for a MR image (181  217) shown in Fig. 5a. The image histogram is plotted in Fig. 5c. Figure 6 displays the crudely piecewiseconstant approximation results by varying L. In order to extract bright areas in the original image, the data model corresponds to (9) and PP 2  = 0:15  255 controls the prior model Ep = i=1 j i j. We observe that the image is under-segmented using L = f2; : : :; 7g (see Figs. 6af) and over-segmented using L = f10; : : :; 17g (see Figs. 6i-p). In this experiment, a relevant segmentation is obtained by setting L = 8 or L = 9. The boundary sets (L = 8) superimposed on the original image are shown Fig. 5b. Dynamic segmentation process An example of cloud detection is provided in Figs. 7 and 8. In this experiment, the solution is de ned as the argument which minimizes the least Squares criterion regularized by Ep = PPi=1 j i j. For this set of parameters L = 4,  = 0:1  2552, and j minj = 0:0001  jS j, the algorithm labeled seas, continents and small clouds as \background". The signi cant clouds are crudely extracted from the 383  260 image and labeled as \objects" ( rst row in Fig. 8). Figure 7 shows, from left to right, the original image, the topographic map and the piecewise constant approximation of the image if all the T = 207 regions are used. Figure 8 depicts three samples of the dynamic segmentation process: the piecewise-constant approximation and the boundary sets superimposed on the original image are shown at signi cant iterations of the stepwise greedy algorithm. Figure 8 shows

Special issue of JMIV on shapes and textures

27

Segmentation using Ep = P

a.  = 1  2552 (P = 105)

b.  = 5  2552 c.  = 10  2552 d.  = 102  2552 (P = 17) (P = 8) (P = 1) P Segmentation using Ep = Pi=1 j i j

e.  = 0:0025  2552 f.  = 0:01  2552 g.  = 0:025  2552 h.  = 0:05  2552 (P = 230) (P = 146) (P = 105) (P = 53) PP ? 1 Segmentation using Ep = i=1 j ij

i.  = 1  2552 j.  = 102  2552 k.  = 103  2552 l.  = 104  2552 (P = 260) (P = 58) (P = 17) (P = 4) PP j ij j

j ij Segmentation using Ep = ? i=1 jS j log jS j ? j S jj log jj S jj

m.  = 1  2552 n.  = 102  2552 o.  = 103  2552 p.  = 104  2552 (P = 260) (P = 58) (P = 17) (P = 4) Figure 4. Segmentation of a satellite image (L = 8, j minj = 0:00025  jS j pixels, T = 291).

Special issue of JMIV on shapes and textures

0

Number of pixels 500 1000

1500

28

0

50

100 150 Intensity magnitude

200

250

a. Original image b. Topographic map c. Image histogram Figure 5. Segmentation of a MR image (L = 8; j minj = 0:00075  jS j pixels).

the results at the 214th (P = 2), 2053th (P = 11)and the last 9781th (Pb = 55) iteration. The algorithm nally selects Pb = 55 regions and stops at the 9781th iteration (=21s of cpu time), i.e. before the maximal iteration T (2T +1) = 21528. For this experiment, numerical results of the stepwise greedy algorithm are shown in Table II. Note that the larger signi cant regions are rst extracted before examination of small clouds.

Mumford and shah functional In these experiments, we compare our segmentation results to those provided by a region growing method using the simpli ed version of the Mumford and Shah model which aims at approximating a given image with piecewise constant functions [26, 33]. The prior model ensures the discontinuity set has a small length [35]. A satisfying reconstruction using seven regions is shown in Figs. 9b-c. Our approach produces comparative visual results by P adjusting  that controls Ep = Pi=1 j ij (see Fig. 9h-i). In comparison, we run the stepwise greedy algorithm to extract regions using a preliminary uniform quantization of the image. Figs. 9h-i displays the crudely piecewise-constant approximation result. In this last experiment, we observe that the segmentation is not relevant and regions are not meaningful. The time necessary for building the bilevel sets and extracting 7 objects (entropic quantization) and 12 objects (uniform quantization) is respectively 13s and 22s on a 296 Mhz workstation. We need 228s to extract 7 segments using the region growing method (Fig. 9b-c).

Special issue of JMIV on shapes and textures

a. L = 2; T = 8; P =7

29

b. L = 3; T = 18; c. L = 4; T = 14; d. L = 5; T = 30; P = 10 P =6 P =8

e. L = 6; T = 40; f. L = 7; T = 39; g. L = 8; T = 55; h. L = 9; T = 57; P =9 P =8 P = 11 P = 14

i. L = 10; T = 35; k. L = 11; T = 58; k. L = 12; T = 85; l. L = 13; T = 105; P = 14 P = 27 P = 23 P = 33

m. L =14; T =107; n. L =15; T =103; o. L =16; T =123; p. L = 17; T = 122; P = 33 P = 44 P = 44 P = 66 Figure 6. Segmentation of a MR image (j minj = 0:00075jS j pixels ;  = 0:152552 ).

30

Special issue of JMIV on shapes and textures

a. Original image

b. Topographic map (L =4; j minj =0:0001 jS j) Figure 7. Level lines of a meteorological image.

c. Piecewise-constant approximation (T = 207)

Figure 8. Segmentation of a meteorological image ( = 0:1  2552 ). Left column (top to bottom): piecewise-constant approximation at iteration 214 (P = 2), 2053 (P = 11) and 9781 (P = 55). Right column (top to bottom): boundary sets superimposed on the original image at iteration 1214, 2053 and 9781.

Special issue of JMIV on shapes and textures

31

Segmentation using the simplified Mumford-Shah model

a. Original image

b. Piecewise-constant approximation using 7 regions

c. Boundary set of the 7 regions

0

500

Number of pixels 1000

1500

Segmentation using the level lines selection approach

0

50

100 150 Intensity magnitude

200

0

500

Number of pixels 1000

1500

d. Entropic quantization e. Piecewise-constant f. Boundary set using of the image histogram approxi mation using  = 0:025  2552  = 0:025  2552 (P = 7) (entropic quantization) (entropic quantization)

0

50

100 150 Intensity magnitude

200

g. Uniform quantization h. Piecewise-constant i. Boundary set using of the image histogram approximation using  = 0:025  2552  = 0:025  2552 (P = 12) (uniform quantization) (uniform quantization) Figure 9. Segmentation of the \hand" image. The rst row shows segmentation results using the simpli ed Mumford-Shah model. The second and third rows show segmentation results using the level lines selection approach (L = 4;  = 0:025  2552 ; j minj = 0:00035  jS j pixels) with an entropic (T = 10) and uniform (T = 27) quantization of the image histogram.

32

Special issue of JMIV on shapes and textures

Table II. Object selection at each step of the stepwise greedy algorithm. P

Iteration

E

P X

j i j

P

Iteration

E

i=1

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28

5 214 420 662 832 1028 1229 1449 1630 1844 2053 2258 2431 2620 2817 3019 3203 3401 3602 3783 3977 4147 4352 4530 4694 4881 5069 5248

8804 8360 8089 7983 7928 7902 7887 7877 7868 7862 7855 7850 7846 7843 7839 7836 7833 7830 7827 7825 7822 7820 7818 7816 7814 7812 7811 7809

15864 22185 27670 29051 30082 30418 30676 30838 31026 31137 31230 31293 31362 31415 31502 31569 31629 31671 31713 31746 31788 31833 32194 32222 32260 32293 32323 32352

P X

j i j

i=1

29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55

5420 5610 5785 5960 6140 6313 6481 6661 6834 7003 7170 7341 7508 7674 7834 8007 8167 8352 8490 8651 8824 8969 9142 9286 9457 9614 9781

7808 7807 7806 7805 7804 7803 7802 7801 7800 7799 7798 7798 7797 7796 7796 7795 7794 7794 7793 7793 7792 7792 7791 7791 7791 7790 7790

32378 32402 32424 32442 32456 32473 32491 32506 32520 32531 32545 32556 32567 32579 32591 32616 32628 32663 32674 32685 32720 32731 32747 32765 32782 32793 32809

6. Conclusion and perspectives In this paper, we have presented a level line selection approach for extracting structures in images. We proved that minimizers of our segmentation energies can be directly determined. A total cpu time of a few seconds on a 296 Mhz workstation for partitioning a 256  256 image into meaningful regions makes the method attractive for many time-critical applications. The contribution of this approach has been illustrated on synthetic as well as real-world images. Several promising directions may be explored for continued research. A direction for future work is to extend the proposed approach to operate on multi-spectral 3D images. In this setting, the structure of the algorithm would be largely the same, although there are a number of points which would need to be examined closely.

Special issue of JMIV on shapes and textures

33

Acknowledgments A part of this work has been done using the MegaWave2 image processing environment (Georges Koepfler { Copyright (C)1993-1199 CMLA, ENS Cachan, 94235 Cachan cedex, France - http://www.ceremade.dauphine.fr - All rights reserved {).

References 1. S.T. Acton and D.P. Mukherjee. Scale space classi cation using area morphology. IEEE Trans. Image Processing, 9(4):623{635, 2000. 2. L. Alvarez, Y. Gousseau, and J.M. Morel. Scales in natural images and a consequence on their bounded variation. In Int. Conf. on Scale-Space Theories Comp. Vis., pages 247{258, Kerkyra, Greece, September 1999. 3. O. Amadieu, E. Debreuve, M. Barlaud, and G. Aubert. Simultaneous inward and outward curve evolution. In Int. Conf. on Image Processing, Kobe, Japan, 1999. 4. J. Beaulieu and M. Goldberg. Hierarchy in picture segmentation: a stepwise optimization approach. IEEE Trans. Patt. Anal. and Mach. Int., 11(2):150{ 163, 1989. 5. A. Blake and A. Zisserman. Visual Reconstruction. MIT Press, Cambridge, Mass, 1987. 6. V. Caselles, B. Coll, and J.M. Morel. Topographic maps and local contrast changes in natural images. Int J. Computer Vision, 33(1):5{27, 1999. 7. V. Caselles, R. Kimmel, and G. Sapiro. Geodesic active contours. Int J. Computer Vision, 22(1):61{79, 1997. 8. T. Chan and L. Vese. Active contour model without edges. In Int. Conf. on Scale-Space Theories Comp. Vis., pages 141{151, Kerkyra, Greece, September 1999. 9. C. Chang, K. Chen, J. Wang, and M. Althouse. A relative entropy-based approach to image processing. Pattern Recognition, 27(9):1275{1289, 1994. 10. C. Chesnaud, P. Refregier, and V. Boulet. Statistical region snake-based segmentation adapted to di erent physical models. IEEE Trans. Patt. Anal. and Mach. Int., 21(11):1145{1157, 2000. 11. L.D. Cohen. On active contour models and balloons. CVGIP: Image Understanding, 53(2):211{218, 1991. 12. L.D. Cohen. Deformable curves and surfaces in image analysis. In Int. Conf. Curves and Surfaces, Chamonix, France, July 1996. 13. T. Darrell and A. Pentland. Cooperative robust estimation using layers of support. IEEE Trans. Patt. Anal. and Mach. Int., 17(5):474{487, 1995. 14. X. Descombes and F. Kruggel. A markov pixon information approach for lowlevel image description. IEEE Trans. Patt. Anal. and Mach. Int., 21(6):482{ 494, 1999. 15. A. Desolneux, L. Moisan, and J.M. Morel. Maximal meaningful events and applications to image analysis. Preprint of CMLA, Cachan, France, 2000. 16. M.A.T. Figueiredo, J.M.N. Leitao, and A.K. Jain. Unsupervised contour representation and estimation using b-plines and mininim description length criterion. IEEE Trans. Image Processing, 6(9):1075{1087, 2000. 17. J. Froment. Perceptible level lines and isoperimetric ratio. In Int. Conf. on Image Processing, Vancouver, Canada, 2000.

34

Special issue of JMIV on shapes and textures

18. S. Geman and D. Geman. Stochastic relaxation, gibbs distributions, and the bayesian restoration of images. IEEE Trans. Patt. Anal. and Mach. Int., 6(6):721{741, 1984. 19. U. Grenander and M.I. Miller. Representations of knowledge in complex systems. J. Royal Statistical Society, series B, 56(4):549{603, April 1994. 20. J. Istas. Statistics of processes and signal-image segmentation. University of Paris VII, 1997. 21. I.H. Jermyn and H. Ishikawa. Globally optimal regions and boundaries. In Int. Conf. on Comp. Vis., pages 904{910, Kerkyra, Greece, September 1999. 22. J.N. Kapur, P.K. Sahoo, and A.K.C.. Wong. A new method for gray-level picture thresholding using the entropy of the histogram. Comp. Vis. Graphics and Image Proc., 29:273{285, 1985. 23. M. Kass, A. Witkin, and D. Terzopoulos. Snakes: active contour models. Int J. Computer Vision, 12(1):321{331, 1987. 24. C. Kervrann and F. Heitz. A hierarchical markov modeling approach for the segmentation and tracking of deformable shapes. Graphical Models and Image Processing, 60(3):173{195, 1998. 25. S. Kichenesamy, A. Kumar, P. Olver, and A. Yezzi. Conformal curvature ows: from transition to active contours. Archive for Rational Mechanics and Anal., 134:275{301, 1996. 26. G. Koep er, C. Lopez, and J.M. Morel. A multiscale algorithm for image segmentation by variational method. SIAM J. Numerical Analysis, 31(1):282{ 299, 1994. 27. Y.G. Leclerc. Constructing simple stable descriptions for image partitioning. Int J. Computer Vision, 3:73{102, 1989. 28. L.M. Lorigo, O.D. Faugeras, W.E.L. Grimson, R. Keriven, and R. Kikinis. Segmentation of bone in clinical knee mri using texture-based geodesic active contours. In Int. Conf. Medical Image Computing and Computer-Assisted Intervention, pages 1195{1204, Cambridge MA, USA, October 1998. 29. G. Matheron. Random Sets and Integral Geometry. John Wiley, New York, 1975. 30. T. McInerney and D. Terzopoulos. T-snakes: topology adaptive snakes. Medical Image Analysis, 4(2):123{1361, 2000. 31. J. Mller and R.P. Waagepertersen. Markov connected component elds. Adv. in Applied Probability, pages 1{35, 1998. 32. P. Monasse and F. Guichard. Fast computation of a contrast-invariant image representation. IEEE Trans. Image Processing, 9(5):860{872, 2000. 33. J.M. Morel and S. Solimini. Variational methods in image segmentation. Birkhauser, 1994. 34. D. Mumford. The bayesian rationale for energy functionals. GeometryDriven Di usion in Computer Vision, pages 141{153, Bart Romeny ed., Kluwer Academic, 1994. 35. D. Mumford and J. Shah. Optimal approximations by piecewise smooth functions and variational problems. Communication on Pure and applied Mathematics, 42(5):577{685, 1989. 36. S. Osher and J. Sethian. Fronts propagating with curvature dependent speed: algorithms based on the hamilton-jacobi formulation. J. Computational Physics, 79:12{49, 1988. 37. F. O'Sullivan and M. Qian. A regularized contrast statistic for object boundary estimation { implementation and statistical evaluation. IEEE Trans. Patt. Anal. and Mach. Int., 16(6):561{570, 1994.

35

Special issue of JMIV on shapes and textures

38. N. Paragios and R. Deriche. Coupled geodesic active regions for image segmentation: a level set approach. In Euro. Conf. on Comp. Vis., volume 2, pages 224{240, Dublin, Ireland, June 2000. 39. T. Pavlidis and Y.T. Liow. Integrating region growing and edge detection. IEEE Trans. Patt. Anal. and Mach. Int., 12:225{233, 1990. 40. N. Rougon and F. Preteux. Directional adaptive deformable models for segmentation. J. of Electronic Imaging, 7(1):231{256, 1998. 41. P.K. Sahoo, C. Wilkins, and J. Yeager. Threshold selection using Renyi's entropy. Pattern Recognition, 30(1):71{84, 1997. 42. P. Salembier and J. Serra. Flat zones ltering, connected operators, and lters by reconstruction. IEEE Trans. Image Processing, 4(8):1153{1160, 1995. 43. C. Samson, L. Blanc-Feraud, G. Aubert, and J. Zerubia. A level set model for image classi cation. In Int. Conf. on Scale-Space Theories Comp. Vis., pages 306{317, Kerkyra, Greece, September 1999. 44. C. Schnorr. A study of a convex variational di usion approach for image segmentation and feature extraction. J. Math. Imaging and Vision, 3(8):271{ 292, 1998. 45. J. Sethian. Level Sets Methods: Evolving Interfaces in Geometry, Fluid Mechanics, Computer Vision, and Material Science. Cambridges University Press, 1996. 46. L. Vincent and P. Soille. Watershed in digital spaces: an ecient algorithm based on immersion simulations. IEEE Trans. Patt. Anal. and Mach. Int., 13(6):583{598, 1991. 47. J.P. Wang. Stochastic relaxation on partitions with connected components and its application to image segmentation. IEEE Trans. Patt. Anal. and Mach. Int., 20(6):619{636, 1998. 48. A. Yezzi, A. Tsai, and A. Willsky. A statistical approach to snakes for bimodal and trimodal imagery. In Int. Conf. on Comp. Vis., pages 898{903, Kerkyra, Greece, September 1999. 49. L. Younes. Calibrating parameters of cost functionals. In Euro. Conf. on Comp. Vis., Dublin, Ireland, June 2000. 50. S.C Zhu and A. Yuille. Region competition: unifying snakes, region growing, and bayes/mdl for multiband image segmentation. IEEE Trans. Patt. Anal. and Mach. Int., 18(9):884{900, 1996.

Appendix A. Energy variation for the least squares criterion We compute the energy variation for one object and a background

, where denotes the closure of the complementary set of . The data model is

Ed(f; ; ) =

Z



(f (x) ? f

)2 dx

+

Z



(f (x) ? f )2 dx:

Z

Z

(30) Z

For two sets A and B such that B  A, denote f def = f? f: A?B A B Let  be a small perturbation of such that   , i.e. the Haus-

36

Special issue of JMIV on shapes and textures Z

dor distance d1 (  ; )   : We de ne 2 Z

Z



f ?



f

2

=2

Z



Z

f

Z

f+

 n

 n

!2

 n

1 def = j  j ? j j {z

|

j j

}

and

f .

The di erence between the involved energies is equal to Ed (f; ; ) = Ed(f; ; ) ? Ed (f; ; ). Therefore, Ed(f; ; ) = T1 + T2 + T3 + T4

with

T1 = T3 =

Z Z



f2 ?

S n 

Z

f 2;

Z

f2 ?

S n

1

T2 = ? j 1 j

Z



f2 ;

!2

Z

2

f + j 1 j



1

2

Z



f ; (31a) (31b)

!2

Z

(31c) f f + jS j ? j j T4 = ? jS j ? j j S n  S n

 Denote j j = j  j ? j j. If j j ! 0, i.e. j  j ' j j, we obtain

(higher order terms are neglected) T1 = ?T3 =

Z

 n

f2 ;

(32a)

Z Z T2 = ? j 2 j f f ? j 1 j

 n

Z

 n

f

!2

Z Z T4 = jS j ?2 j j f f ? jS j ?1 j j

 n S n

1

Z

Z

1 ? (jS j ? j j)2

 n

S

n

f

!2

Z

Z + j 1j2 Z

 n

f

 n

!2

1



f

2

; (32b) (32c)

:

Z

Z

Z

Z

De ne the image moments as m0 = 1; m1 = f; M0 = 1; M1 = f:



S S Using the mean value theorem for double integral, which states that if f is continuous and a connected subsetZ E is bounded by a simple curve, then for some point x0 in E we have f (x)dE = f (x0 ) jE j where jE j E denotes the area of E , it follows that Ed(f; ; ) =

z "

m21 m20

a0

}|

?

{

# (M1 ? m1 )2 Z 1 (M0 ? m0 )2  n

a1 }|

z 

{

 Z 2 m 2( M ? m ) 1 1 1 + ? m + M ? m f (x0 )

0

0

0

 n

1

Special issue of JMIV on shapes and textures !2   Z 1 1 2 : ? m + M ? m f (x0) 1

 n

0 0 0

37 (33)

Let xb be a xed point of the border @ . Choose  such that @  = @

except on a small neighborhood of xb . The energy variation is then of the following form: (34) Ed(f; ; ) = j j [a0 + a1 f (xb )] + O(j j2): Address for O prints: Charles KERVRANN, INRA - Biometrie, Domaine de Vilvert, 78352 Jouy-en-Josas, France e-mail: [email protected] Fax: +33 1 34 65 22 27

Suggest Documents