GC-ASM: Synergistic Integration of Active Shape Modeling and Graph-Cut Methods

Xinjian Chen1, Jayaram K. Udupa1, Drew A. Torigian2 and Abass Alavi2

1 Medical Image Processing Group, Department of Radiology, University of Pennsylvania, Philadelphia, PA 19104, USA.
2 Hospital of the University of Pennsylvania, Department of Radiology, University of Pennsylvania, Philadelphia, PA 19104, USA.
Phone: (215) 662-6780 Fax: (215) 898-9145 E-mail: [email protected], [email protected]

ABSTRACT

Image segmentation methods may be classified into two categories: purely image based and model based. Each of these two classes has its own advantages and disadvantages. In this paper, we propose a novel synergistic combination of the image-based graph-cut (GC) methods with the model-based active shape model (ASM) methods to arrive at the GC-ASM method. GC-ASM effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability of the GC method. We propose a new GC cost function, which effectively integrates the specific image information with the ASM shape information. The ASM results are used to help GC in several ways: (1) for automatically selecting seeds for GC segmentation, thus helping GC with object recognition; (2) for refining the parameters of the GC algorithm from the ASM result; (3) for bringing object shape information into the GC cost computation. In turn, (4) the cost of the GC result is used to improve the ASM's object recognition process. The proposed methods are implemented to operate on 2D images and tested on a clinical abdominal CT data set. The results show: (1) GC-ASM becomes largely independent of initialization. (2) The number of landmarks can be reduced by a factor of 3 in GC-ASM over that in ASM. (3) The accuracy of segmentation via GC-ASM is considerably better than that of ASM.

Keywords: Recognition, Pattern Recognition, Segmentation, Active Shape Models, Graph Cut

1. INTRODUCTION

Automatic image segmentation is a fundamental and important process in computer vision and medical image analysis. Medical image segmentation is the process of identifying and delineating objects in medical images. In spite of several decades of research and many key advances [1, 2], several challenges still remain in this area. The whole segmentation operation can be thought of as consisting of two related processes: recognition and delineation. Recognition is the high-level process of determining roughly the whereabouts of an object of interest and distinguishing it from other object-like entities in the image. Delineation is the low-level process of determining the precise spatial extent of the object in the image. Effecting efficient cooperation between the high-level recognition process and accurate low-level delineation has remained a challenge in image segmentation. This paper is an attempt at efficiently incorporating high-level recognition help into a low-level delineation method.

Image segmentation methods may be classified into two categories: purely image based and model based. Many purely image-based segmentation methods have been proposed, such as graph cut [3, 4], live wire [5, 6] and level set [7]. These methods perform well when the image quality is consistently good, but they work less well, or fail, when boundary information is missing in some parts of the image. In recent years, there has been an increased interest in model-based segmentation methods. An advantage of these methods is that, even when boundary information is missing along some parts of the object boundary, such gaps are filled because of the closure, connectedness and shape information stored in the model. Deformable models [8], such as active contours [9] and their variants [10], are capable of modeling complex shapes via continuously deformable curves.
The active shape model (ASM) approaches [11] can effectively incorporate prior shape information into the model; as such, they have been successfully used in medical

Medical Imaging 2009: Image Processing, edited by Josien P. W. Pluim, Benoit M. Dawant, Proc. of SPIE Vol. 7259, 72590C · © 2009 SPIE CCC code: 1605-7422/09/$18 · doi: 10.1117/12.812182 Proc. of SPIE Vol. 7259 72590C-1

image segmentation. However, a drawback of these methods is that they rely on matching (blurred) statistical information in the model to the given image. Therefore, the specific information present in the given image cannot be exploited as directly as in purely image-based strategies, especially when the image contains good boundary information.

Recently, hybrid methods have been proposed that combine purely image-based and model-based methods to reinforce their strengths, such as a combination of the active shape model with the live wire method [12, 13] and a combination of shape-intensity prior models with level sets [14]. Udupa et al. [12, 13] have proposed an oriented active shape model (OASM) approach to combine the strength of ASM in recognition with the strength of live wire in delineation. The strategy of the OASM approach is to bring the strengths of live wire into the ASM approach.

In this paper, we propose a novel synergistic combination of the image-based graph-cut (GC) method with the model-based ASM method to arrive at the GC-ASM method. GC-ASM effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability of the GC method. A new GC cost function is proposed, which effectively integrates the ASM shape information into the GC paradigm. We also demonstrate how both automation and accuracy of segmentation can be improved in the proposed GC-ASM method. In the proposed method, the GC and ASM methods work closely to improve each other's performance: (1) the ASM shape information is incorporated into the GC cost computation; (2) the model is used to automatically select seeds for GC segmentation, thus helping GC with the recognition task; (3) the parameters of the GC algorithm are updated from the ASM results to help GC segmentation; (4) in turn, the optimal GC cost is used to help ASM's recognition performance.
The proposed method was evaluated on a clinical abdominal CT image data set. The results show: (a) GC-ASM becomes largely independent of initialization. (b) The number of landmarks can be reduced by a factor of 3 in GC-ASM over that in ASM. (c) The accuracy of segmentation via GC-ASM is considerably improved over ASM. In Section 2, the complete methodology of GC-ASM is described. In Section 3, we describe an evaluation of this method in terms of its accuracy and efficiency. In Section 4, we summarize our conclusions.

2. GRAPH CUT WITH ACTIVE SHAPE MODELS

2.1 Overview of Approach

In this section, an overview of the approach taken in this paper is presented. This paper addresses the 2D segmentation problem and proposes a new GC-ASM method which effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability of the GC method. Let O be a given physical object, such as the talus bone of the human ankle. For segmenting the boundaries of O in an image of a particular manifestation of O (such as the talus bone of a particular subject's foot), ASM captures the statistical variations in the boundary of O within O's family via a statistical shape model M. GC-ASM determines a cost structure K associated with M via the principles of optimal cut underlying the graph cut method. As per this cost structure, every shape instance x generated by M is assigned a total GC cost K(x) in a given image. This cost is determined by the graph cut method. GC-ASM seeks the boundary in the given image that satisfies the shape constraints of M and for which the cost K(x) in the given image is the smallest possible. The main steps involved in GC-ASM are listed below. Each step is described in some detail in a subsection of Section 2. The proposed system can be used to segment a single object or multiple objects.

Training module:
T1. Specify landmarks on training images.
T2. Construct ASM model M from landmarks and training images.
T3. Train GC parameters (the mean and standard deviation of intensity inside the object region) for each object from training images.

Segmentation/searching module:
S1. Automatic initialization/recognition.
S2. Find the shape x representing the best oriented boundary as per GC in the given image.


S3. If the convergence criterion is satisfied, output the best oriented boundary found in S2 and stop; else subject x to the constraints of model M and go to Step S2.

In this procedure, T1-T3 constitute training and model creation steps, and S1-S3 represent searching and segmentation steps. These steps are described below in detail.

2.2 T1: Specifying Landmarks

One way to describe a particular instance of the shape of an object of type O is by locating a finite number of points on its boundary, referred to as landmarks. A landmark [15] is a homologous point of correspondence on each object that matches within the same population. A mathematical representation of an n-point shape in a d-dimensional space may be obtained by concatenating each dimension into a dn-component vector. In this paper, 2D shapes are considered, hence d = 2. The proposed method can be used to segment single or multiple objects. Suppose there are m objects to segment, with $p_1$ landmarks for object 1, $p_2$ landmarks for object 2, …, and $p_m$ landmarks for object m. Then the vector representation for planar shapes would be:

$$x = \left( x_1^{O_1}, y_1^{O_1}, x_2^{O_1}, y_2^{O_1}, \ldots, x_{p_1}^{O_1}, y_{p_1}^{O_1}, x_1^{O_2}, y_1^{O_2}, \ldots, x_{p_2}^{O_2}, y_{p_2}^{O_2}, \ldots, x_{p_m}^{O_m}, y_{p_m}^{O_m} \right) \qquad (1)$$
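As an illustration of the layout in Equation (1), the following minimal sketch concatenates per-object landmark lists into a single shape vector. The function name `shape_vector` and the sample coordinates are hypothetical and are not taken from the paper; NumPy is assumed to be available.

```python
import numpy as np

def shape_vector(objects):
    """Concatenate landmark arrays into one vector laid out as in Equation (1).

    `objects` is a list with one entry per object; each entry holds that
    object's landmarks as (x, y) pairs, in landmark order.
    """
    # Flattening a (p_k, 2) array row by row yields x1, y1, x2, y2, ...
    parts = [np.asarray(obj, dtype=float).reshape(-1) for obj in objects]
    return np.concatenate(parts)

# Hypothetical example: object 1 has 3 landmarks, object 2 has 2 landmarks.
obj1 = [(10.0, 20.0), (12.0, 25.0), (15.0, 22.0)]
obj2 = [(40.0, 41.0), (44.0, 45.0)]
x = shape_vector([obj1, obj2])
print(x.shape)  # 2 * (3 + 2) components
```

Keeping all objects in one vector means that a single statistical model can, in principle, also capture the relative placement of the objects, although the paper builds the model for each object in the same way.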

In most ASM studies, a manual procedure is used to label the landmarks in a training set, although automatic methods are also available for this purpose. That is, for each image of the training set, operators locate the shape visually, and then identify significant landmarks on that shape. It is important that the landmarks are accurately located and that there is an exact correspondence between landmark labels in different instances of the training shapes. For our approach, any such method will work, although we have used the manual method in producing all presented results.

2.3 T2: Creating the ASM M

To obtain a true shape representation of an object family O, location, scale, and rotation effects within the family need to be filtered out. This is done by aligning shapes within O (in the training set) to each other by changing the pose parameters (scale, rotation, and translation) [15]. The standard ASM model building method is used in this paper. For multiple objects, the model is created for each object in the same way.

2.4 T3, S1, S2: GC Segmentation

Let Ι = (I, g) be an image, where I is a pixel array and, for p ∈ I, g(p) denotes the intensity of pixel p in Ι. Let $L_p$ be a fixed label (such as 1 or 0) assigned to p ∈ I that results from a segmentation operator Ψ applied to Ι. In the traditional graph cut method [3, 4], the energy function is usually composed of two parts: data penalty and boundary penalty terms. In this paper, we propose a new graph cut energy function, which is composed of three parts: data penalty, boundary penalty, and shape penalty. The function is defined as follows:

$$E(\Psi) = \sum_{p \in I} D_p(L_p) + \sum_{(p,q) \in N} V_{p,q}(L_p, L_q) + \sum_{p \in I} S_p(L_p) \qquad (2)$$

where $D_p(\cdot)$ is a data penalty function, $V_{p,q}$ is the boundary term, $N$ is the set of all pairs of neighboring pixels in Ι, and $S_p$ is the shape term. These components are defined as follows:

$$D_p(L_p) = K_D \cdot \left(1 - \exp\left(-\frac{\left(g(p) - \mathrm{Mean}_{L_p}\right)^2}{2\,\mathrm{Var}_{L_p}}\right)\right), \qquad (3)$$

$$V_{p,q}(L_p, L_q) = K_B \cdot \exp\left(-\frac{\left(g(p) - g(q)\right)^2}{2\delta^2}\right) \cdot \frac{1}{\mathrm{dist}(p,q)} \cdot \delta(L_p, L_q), \qquad (4)$$

and

$$\delta(L_p, L_q) = \begin{cases} 1 & \text{if } L_p \neq L_q \\ 0 & \text{otherwise.} \end{cases}$$

$K_D$ and $K_B$ are the weights for the data term and boundary term, respectively. $\mathrm{Mean}_{L_p}$ and $\mathrm{Var}_{L_p}$ are the intensity mean and variance of the region corresponding to label $L_p$ in the image.
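The data and boundary terms of Equations (3) and (4) can be sketched directly in code. This is only an illustration under stated assumptions: the function names, the default weights of 1.0, and the default value `delta=10.0` for the scale parameter in the exponent of Equation (4) are hypothetical, since the paper does not report numeric values for them.

```python
import numpy as np

def data_penalty(g, mean, var, k_d=1.0):
    """Equation (3): cost of giving a pixel of intensity g the label whose
    region has intensity mean `mean` and variance `var`."""
    g = np.asarray(g, dtype=float)
    return k_d * (1.0 - np.exp(-((g - mean) ** 2) / (2.0 * var)))

def boundary_penalty(g_p, g_q, label_p, label_q, delta=10.0, dist=1.0, k_b=1.0):
    """Equation (4): cost of separating neighboring pixels p and q.

    The indicator delta(L_p, L_q) makes the cost zero when both pixels
    carry the same label. `delta` here is the scale in the exponent
    (an assumed value), not the indicator function.
    """
    if label_p == label_q:
        return 0.0
    return k_b * float(np.exp(-((g_p - g_q) ** 2) / (2.0 * delta ** 2))) / dist

# A pixel matching the label's mean costs nothing; a distant one costs more.
print(data_penalty(100.0, mean=100.0, var=25.0))
print(data_penalty(110.0, mean=100.0, var=25.0))
# Cutting between similar intensities is expensive; across an edge it is cheap.
print(boundary_penalty(50.0, 55.0, 1, 0), boundary_penalty(50.0, 80.0, 1, 0))
```

The behavior matches the intent of the two terms: the data term rewards labels whose intensity statistics fit the pixel, and the boundary term encourages the cut to pass where neighboring intensities differ strongly.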


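$$S_p(L_p) = K_S \cdot \left(1 - \exp\left(-\frac{d(p, \mathrm{shape}_{ASM})}{R_{ASM}}\right)\right) \cdot \delta(L_p), \qquad (5)$$

and

$$\delta(L_p) = \begin{cases} 1 & \text{if } L_p = \text{source label} \\ 0 & \text{otherwise,} \end{cases}$$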

where $K_S$ is the weight for the shape term, $d(p, \mathrm{shape}_{ASM})$ is the distance from pixel p to the ASM shape boundary, and $R_{ASM}$ is the radius of a circle that just encloses the mean shape in M. There are several ways to compute the distance $d(p, \mathrm{shape}_{ASM})$; the linear transform method of reference [16] was applied in this paper.

During the training stage, the parameters Mean and Var for each object are estimated from the training images. The parameters $K_D$, $K_B$, and $K_S$ can also be estimated from training images using the least squares method [17]. These parameters are used in the GC segmentation that follows.

2.5 S1: Automatic Initialization

The purpose of this step is to recognize the desired boundary in Ι; that is, to find a shape instance $x_i$ of M which is sufficiently close to the boundary of O in Ι that we want to segment. $x_i$ is subsequently deformed in Steps S2-S3 to best fit the image information in Ι. The automatic initialization method proposed here is an essential underpinning of the GC-ASM method. It relies on the fact that, at a pose (meaning scale, orientation, and location) of a shape instance x of M that is close to the pose of the correct boundary of O in Ι, the total GC cost (minimum cut cost) is likely to be sharply smaller than the cost found at other poses in Ι.

Suppose p denotes the pose vector for the object assembly, which includes a location component (x and y), a scale component (s), and an orientation component (θ). Our goal is to find the best pose in Ι for the model M. Our experiments indicate that the small variations in orientation observed in clinical images can be handled automatically, and thus θ can be ignored. Thus p becomes 3-dimensional: $p = (x, y, s)^T$. Let K(p) be the sum of the GC minimum cut costs for all objects achieved at p in Ι; then the recognition task is to find:

$$p^* = \arg\min_{p} K(p). \qquad (6)$$

The algorithm for automatic initialization is as follows:

S1.1 For each potential pose vector p, do steps S1.2 and S1.3.
S1.2 ASM searching. Place model M with its geometric center at p and deform it as per the standard ASM method.
S1.3 GC segmentation. Based on the cost function E described in Equation (2), perform GC segmentation and get the minimum cut cost for each object separately.
S1.4 Pick the deformed ASM shape and GC result corresponding to the lowest total cut cost of all objects over all p as the automatic initialization result for the following steps.

In our implementation, the candidate poses p are determined by discretizing the search space of (x translation, y translation, scale) and doing an exhaustive search. During the GC segmentation stage, the image was represented as a 4-connectivity graph. The α-expansion [3] method was applied as the optimization method. When segmenting an object, its landmarks can be used as source seeds, and the landmarks of the other objects can be employed as sink seeds.

The strategy of this method is to evaluate the total GC cost, and choose the pose p for which this cost is the smallest over the plausible search space of p. As we shall see in Section 3.2 (Figure 3), the cost exhibits a sharp minimum at the pose corresponding to the correct recognition of the object.

2.6 S2: Finding the Optimum Boundary

This step assumes that the initialized (recognized) shape $x_p$ derived from Step S1 is sufficiently close to the actual boundary of O in Ι. It then determines what the new position of the landmarks of $x_p$ should be such that the sum of the


minimum graph cut costs is the smallest. The shape penalty part measures the agreement between the GC result and the shape information in ASM, so the total cost becomes smaller as the landmarks move closer to the true boundary.

2.7 S3: Testing Convergence and Subjecting to Model Constraints

The convergence criterion used here is a measure of the distance between the two shapes encountered in two consecutive executions of Step S2. This measure is simply the maximum of the distances between corresponding landmarks in the two shapes. If this distance is greater than 0.5 pixel units, the optimum shape found in Step S2 is subjected to the constraints of model M, and the iterative process continues by going back to Step S2. Otherwise, the GC-ASM process is considered to have converged, and it stops with an output of the optimum shape and the optimum oriented boundary found in Step S2.
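The convergence test of Step S3 can be sketched as follows. This is a minimal illustration, assuming that shapes are stored as flat vectors in the layout of Equation (1) and that the landmark distance is Euclidean; the function names are hypothetical.

```python
import numpy as np

def max_landmark_distance(shape_a, shape_b):
    """Largest Euclidean distance between corresponding landmarks of two
    shapes given as flat (x1, y1, x2, y2, ...) vectors."""
    a = np.asarray(shape_a, dtype=float).reshape(-1, 2)
    b = np.asarray(shape_b, dtype=float).reshape(-1, 2)
    return float(np.max(np.linalg.norm(a - b, axis=1)))

def has_converged(prev_shape, new_shape, tol=0.5):
    """Step S3: iteration continues only while the distance exceeds `tol`
    (0.5 pixel units in the paper)."""
    return max_landmark_distance(prev_shape, new_shape) <= tol

prev = [0.0, 0.0, 10.0, 0.0]
small_move = [0.3, 0.0, 10.0, 0.0]   # largest landmark shift is 0.3 pixels
large_move = [3.0, 4.0, 10.0, 0.0]   # largest landmark shift is 5 pixels
print(has_converged(prev, small_move), has_converged(prev, large_move))
```

Using the maximum rather than the mean displacement means that a single landmark that is still moving is enough to keep the S2-S3 loop running.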

3. EXPERIMENTAL RESULTS

In this section, we demonstrate both qualitatively, through image display, and quantitatively, through evaluation experiments, the extent of effectiveness of the GC-ASM method. A routinely acquired clinical abdominal CT image data set was used to test the proposed method. Our method of evaluation, based on the framework of reference [18], focuses on the analysis of the accuracy and efficiency of GC-ASM. We consider manual segmentation performed on this data set to constitute a surrogate of true segmentation for assessing the accuracy of the methods.

3.1 Image Data Set

The abdominal CT image data set consists of slices selected from 3D studies acquired on a Siemens Sensation 16 CT scanner with a slice spacing of 5 mm, an image size of 512x512, and a pixel size of 0.78x0.78 mm². The data set consists of 40 slices selected from full 3D images, acquired from fifteen different subjects. These slices are at approximately the same location in the body, so that, for each object, the 40 2D images can be considered to represent images of a family of objects of the same shape. Two to three slices are taken on average from the same subject's data, either from the same 3D image or from different 3D images. Among them, 25 images are randomly selected as training images, and the rest are used as testing images. We considered four objects in this data set, as illustrated in Fig. 1: the left and right pelvic bone, vertebra, and skin boundary.

Fig. 1. The four objects selected in the CT abdomen data set.

3.2 Qualitative Analysis

A subjective inspection revealed that, in all experiments and in all data, the GC-ASM results matched the perceived boundary very well. Some examples are displayed in Fig. 2. Automatic initialization based on location and scale search worked well in all cases, in the sense that initialized shapes were found close to the true boundary. Any orientation differences that occurred naturally among the subjects considered here were left in the data, and no particular effort was made to minimize them. Our experiments indicate that, because of the global optimality and orientedness of the model, the small variations in orientation observed in clinical images can be handled automatically. In Fig. 2, (b) displays the original image overlaid with the mean shape of the model, which is the starting point for automatic initialization; (c) shows the automatically initialized shapes output by Step S1, which constitute the initial shapes input to


Step S2 of the GC-ASM method; and (d) is the final GC-ASM segmentation result. In the vicinity of p*, where the method correctly recognizes the object, the cost is sharply smaller than at other locations. Here, p* occurred at scale = 1.0, x = 25, y = 0; that is, the cost of the recognized shape is at its minimum for this combination of scale and translation.
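The exhaustive pose search of Section 2.5 that produces such a minimum can be sketched as a grid search over (x, y, scale). In this sketch, `toy_cost` is a hypothetical stand-in for K(p): in the actual method, each evaluation would run the ASM search and the GC segmentation at that pose and sum the minimum cut costs over all objects. The grid ranges and step sizes are also assumptions.

```python
import itertools

def find_best_pose(cost, xs, ys, scales):
    """Evaluate cost(x, y, s) at every pose on the discretized grid and
    return the pose with the smallest cost, together with that cost."""
    best_pose, best_cost = None, float("inf")
    for pose in itertools.product(xs, ys, scales):
        c = cost(*pose)
        if c < best_cost:
            best_pose, best_cost = pose, c
    return best_pose, best_cost

def toy_cost(x, y, s):
    # Hypothetical cost surface with a sharp minimum at x=25, y=0, s=1.0.
    return (x - 25) ** 2 + y ** 2 + 100.0 * (s - 1.0) ** 2

xs = range(-50, 51, 5)            # assumed x translations, in pixels
ys = range(-50, 51, 5)            # assumed y translations, in pixels
scales = [0.8, 0.9, 1.0, 1.1, 1.2]
pose, k = find_best_pose(toy_cost, xs, ys, scales)
print(pose, k)
```

The search cost grows with the product of the three grid sizes, so the grid resolution trades recognition precision against the number of ASM and GC runs.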

Fig. 2. The original image and its recognition and delineation results. (a) Original image, (b) initial placement of model, (c) automatic initialization result, (d) GC-ASM delineation result.

3.3 Quantitative Analysis

Here, we focus on the analysis of the accuracy and efficiency of GC-ASM based on the framework of reference [18]. Accuracy relates to how well the segmentation results agree with the true delineation of the objects. Efficiency indicates the practical viability of the method, which is determined by the amount of time required for performing computations and for providing any user help needed in segmentation. The measures that are used under each of these groups and their definitions are given below.

3.4 Accuracy

The following measures, called true-positive volume fraction (TPVF) and false-positive volume fraction (FPVF), are used to assess the accuracy of the methods. TPVF indicates the fraction of the total amount of delineated tissue that lies in the true delineation. FPVF denotes the amount of tissue falsely identified (see [18] for details). Table 1 lists the mean and standard deviation values of TPVF and FPVF achieved in the two experiments by using ASM and GC-ASM methods. It shows that GC-ASM produces considerably more accurate segmentations than the basic ASM method.

Table 1. Mean and standard deviation of TPVF and FPVF for ASM and GC-ASM. Mean operator time To and computational time Tc are also shown (in seconds).

| Method | TPVF | FPVF | To | Tc |
| ASM | 87.09% ± 1.23% | 1.26% ± 0.95% | 160 s (n = 132) | 25 s |
| GC-ASM | 97.83% ± 0.17% | 0.43% ± 0.06% | 60 s (n = 44) | 29 s |
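For concreteness, the two accuracy measures can be computed from binary masks as sketched below. The paper defers the exact definitions to [18]; this sketch assumes the common formulation in which TPVF is the overlap between the delineated and true regions as a fraction of the true region, and FPVF is the falsely delineated area as a fraction of the image area outside the true region. The function name and the toy masks are hypothetical.

```python
import numpy as np

def tpvf_fpvf(delineated, truth):
    """Return (TPVF, FPVF) for two binary masks of equal shape.

    Assumed definitions (see the lead-in): TPVF = |D and T| / |T| and
    FPVF = |D and not T| / |not T|, with D the delineation and T the truth.
    """
    d = np.asarray(delineated, dtype=bool)
    t = np.asarray(truth, dtype=bool)
    tp = np.logical_and(d, t).sum()
    fp = np.logical_and(d, ~t).sum()
    return float(tp / t.sum()), float(fp / (~t).sum())

# Toy 4x4 example: the true object is a 2x2 block (4 pixels).
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True
# The delineation captures 3 of the 4 true pixels and adds 1 false pixel.
seg = np.zeros((4, 4), dtype=bool)
seg[1, 1] = seg[1, 2] = seg[2, 1] = True
seg[0, 0] = True
tpvf, fpvf = tpvf_fpvf(seg, truth)
print(tpvf, fpvf)
```

Reporting both fractions together is useful because a delineation can raise TPVF simply by growing, which FPVF then penalizes.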

3.5 Efficiency

The proposed method is implemented in MATLAB on an Intel Pentium IV PC with a 3.4 GHz CPU. In determining the efficiency of a segmentation method, two aspects should be considered: the computation time (Tc) and the human operator time (To). The mean Tc and To are listed in Table 1. The To measured here is the operator time required in the training step. Table 1 shows that the operator time (training) required in GC-ASM is less than that of ASM, since far fewer landmarks are needed in GC-ASM. The computation time required in GC-ASM is a little more than that of ASM because of the GC part of the algorithm.


4. CONCLUDING REMARKS

In this paper, we have presented a synergistic combination of the image-based GC and model-based ASM methods. GC-ASM effectively combines the rich statistical shape information embodied in ASM with the globally optimal delineation capability of the GC method. A new GC cost function is proposed, which effectively integrates the ASM shape information into the GC paradigm. The recognition strategy is based on the minimum cut cost, which utilizes combined prior shape and specific image information. In other words, an object or object assembly is recognized via the model at the pose where it best matches, over the entire image, both the expected shape (information coming from the model) and the specific image information (coming from GC via the minimum cut). The proposed method was evaluated on a clinical abdominal CT data set. The evaluation results indicate that: (a) An overall delineation accuracy of TPVF>97%, FPVF
