Liver Segmentation in CT Data: A Segmentation Refinement Approach

Reinhard Beichel(1,2), Christian Bauer(3), Alexander Bornik(3), Erich Sorantin(4), and Horst Bischof(3)

(1) Dept. of Electrical and Computer Engineering, The University of Iowa, USA
(2) Dept. of Internal Medicine, The University of Iowa, USA
    [email protected]
(3) Inst. for Computer Graphics and Vision, Graz University of Technology, Austria
(4) Department of Radiology, Medical University Graz

Christian Bauer was supported by the doctoral program Confluence of Vision and Graphics W1209.
Abstract. Liver segmentation is an important prerequisite for planning of surgical interventions like liver tumor resections. For clinical applicability, the segmentation approach must be able to cope with the high variation in shape and gray-value appearance of the liver. In this paper we present a novel segmentation scheme based on a true 3D segmentation refinement concept utilizing a hybrid desktop/virtual reality user interface. The method consists of two main stages. First, an initial segmentation is generated using graph cuts. Second, an interactive segmentation refinement step allows a user to fix arbitrary segmentation errors. We demonstrate the robustness of our method on ten contrast-enhanced liver CT scans. Our segmentation approach copes successfully with the high variation found in patient data sets and allows segmentations to be produced in a time-efficient manner.
1 Introduction
Liver cancer is one of the four most common deadly malignant neoplasms in the world, causing approximately 618,000 deaths in 2002 according to the World Health Organization (http://www.who.int/whr/2004/en). Tomographic imaging modalities like X-ray computed tomography (CT) play an important role in the diagnosis and treatment of liver diseases like hepatocellular carcinoma (HCC). Deriving a digital geometric model of hepatic (patho)anatomy from preoperative image data facilitates treatment planning [1]. Thus, methods for liver segmentation in volume data are needed which are applicable in clinical routine. In this context, several problems have to be addressed: (a) high shape variation due to natural anatomical variation, disease (e.g., cirrhosis), or previous surgical interventions (e.g., liver segment resection), (b) inhomogeneous gray-value appearance caused by tumors or metastases, and (c) low contrast to neighboring structures/organs like colon or stomach.
For practical application, segmentation must be capable of handling all possible cases in a time-efficient manner.

Several approaches to liver segmentation have been developed so far (see [2-6] for examples). However, basic bottom-up segmentation algorithms frequently fail, especially in more complex cases like livers with large tumors. In addition, purely model-based approaches are problematic because of the high shape variability of the liver. Very few approaches provide methods for the refinement or editing of segmentation results; in general, segmentation refinement approaches are rare. For example, a tool is reported in [7] and [8] where Rational Gaussian (RaG) surfaces are used to represent segmented objects; segmentation errors can be corrected by manipulating control points using a 2D desktop setup. Another tool for data-driven editing of pre-segmented images/volumes based on graph cuts or, alternatively, the random walker algorithm was proposed in [9]. All approaches mentioned so far are based on 2D interaction and monoscopic desktop-based visualization techniques, despite the fact that 3D objects are targeted. Usually, 2D interaction methods are not sufficient for the refinement of 3D models extracted from volumetric data sets, which is inherently a 3D task [10].

We propose a novel refinement approach to 3D liver segmentation. Based on an initial, highly automated graph cut segmentation, refinement tools allow the segmentation result to be manipulated in 3D, and thus possible errors to be corrected. Segmentation refinement is facilitated by a hybrid user interface combining a conventional desktop setup with a virtual reality (VR) system. The segmentation approach was developed for clinical application. In addition, our concept can be utilized for other segmentation tasks.
2 Methods
The proposed approach to liver segmentation consists of two main stages: initial segmentation and interactive segmentation refinement. As input for the first stage, a CT volume and one or more start regions marking liver tissue are used. The segmentation is then generated using a graph cut approach (note that graph cut segmentation is not used interactively, as proposed by Boykov et al. in [11], since the behavior of graph cuts is not always intuitive). In addition, a partitioning of the segmentation and the background into volume chunks is derived from edge/surface features calculated from the CT volume. These two types of output are passed on to the second stage, which allows for the correction/refinement of segmentation errors remaining after the first stage. Refinement takes place in two steps. First, volume chunks can be added or removed. This step is usually very fast, and the majority of segmentation errors occurring in practice can be fixed or at least significantly reduced. Second, after conversion of the binary segmentation to a simplex mesh, arbitrary errors can be addressed by deforming the mesh using various tools. Each of the refinement steps is facilitated by interactive VR-enabled tools for true 3D segmentation inspection and refinement, allowing for stereoscopic viewing and true 3D interaction.
Since the last stage of the refinement procedure is mesh-based, a voxelization method is used to generate a labeled volume [12].
2.1 Graph-Cut-based Initial Segmentation
An initial segmentation is generated using a graph cut [11] approach. From the image data, a graph $G = (V, E)$ is built, where nodes are denoted by $V$ and undirected edges by $E$. The nodes $V$ of the graph are formed by the data elements (voxels) and two additional terminal nodes, a source node $s$ and a sink node $t$. Edge weights allow modeling different relations between nodes (see [11] for details). Let $P$ denote the set of voxels from the input volume data set; to reduce computing time, only voxels with density values above $-600$ Hounsfield units (HU) are considered as potentially belonging to the liver. The partition $A = (A_1, \ldots, A_p, \ldots, A_{|P|})$ with $A_p \in \{\text{"obj"}, \text{"bkg"}\}$ can be used to represent the segmentation of $P$ into object ("obj") and background ("bkg") voxels. Let $N$ be the set of unordered neighboring pairs $\{p, q\}$ in $P$ according to the used neighborhood relation; in our case, a 6-neighborhood is used to save memory. The cost of a given graph cut segmentation $A$ is defined as
$$E(A) = B(A) + \lambda R(A),$$
where $R(A) = \sum_{p \in P} R_p(A_p)$ takes region properties into account and $B(A) = \sum_{\{p,q\} \in N} B_{p,q}\,\delta_{A_p \neq A_q}$, with $\delta_{A_p \neq A_q}$ equaling 1 if $A_p \neq A_q$ and 0 if $A_p = A_q$, accounts for boundary properties. The parameter $\lambda \geq 0$ allows trading off the influence of the two cost terms. Using the s-t cut algorithm, a partition $A$ can be found which globally minimizes $E(A)$. However, in practice a refinement of this segmentation result might be necessary for it to be useful in a given clinical application.

Region term. The region term $R(A)$ specifies the cost of assigning a voxel to a label based on its gray-value similarity to object and background regions. For this purpose, user-defined seed regions are utilized. Following the approach proposed in [13], the region cost $R_p(\cdot)$ for a given voxel $p$ is defined for the labels "obj" and "bkg" as the negative log-likelihoods $R_p(\text{"obj"}) = -\ln \Pr(I_p \mid \text{"obj"})$ and $R_p(\text{"bkg"}) = -\ln \Pr(I_p \mid \text{"bkg"})$ with
$$\Pr(I_p \mid \text{"obj"}) = e^{-(I_p - m_{obj})^2 / (2\sigma_{obj}^2)} \quad \text{and} \quad \Pr(I_p \mid \text{"bkg"}) = 1 - \Pr(I_p \mid \text{"obj"}),$$
respectively. From an object seed region placed inside the liver, the mean $m_{obj}$ and standard deviation $\sigma_{obj}$ are calculated. Clearly, this is a simplification, since the gray-value appearance of the liver is usually not homogeneous. However, the simplification works quite well in practice in combination with the other processing steps. Further, the specified object seeds are incorporated as hard constraints, and the boundary of the scene is used as background seeds.

Boundary term. The basic idea is to utilize a surfaceness measure as boundary term, which is calculated in four steps:

1. Gradient tensor calculation: First, to reduce the effect of unrelated structures on the gradient, the gray-value range of the image is adapted:
$$\tilde{I}_f = \kappa(I_f) = \begin{cases} v_{low} & \text{if } I_f < t_{low} \\ v_{high} & \text{if } I_f > t_{high} \\ I_f & \text{otherwise.} \end{cases}$$
Second, a gradient vector $\nabla f = (f_x, f_y, f_z)^T$ is calculated for each voxel $f$ of the data volume transformed with $\kappa$, by means of Gaussian derivatives with the kernel $g_\sigma = \frac{1}{(2\pi\sigma^2)^{3/2}}\, e^{-\frac{x^2+y^2+z^2}{2\sigma^2}}$ and standard deviation $\sigma$. The gradient tensor $S = \nabla f\, \nabla f^T$ is calculated for each voxel after gray-value transformation.

2. Spatial non-linear filtering: To enhance weak edges and to reduce false responses, a spatial non-linear averaging of the gradient tensors is applied. The non-linear filter kernel consists of a Gaussian kernel which is modulated by the local gradient vector $\nabla f$. Given a vector $\mathbf{x}$ that points from the center of the kernel to any neighboring voxel, the weight for this voxel is calculated as
$$h_{\sigma',\rho}(\mathbf{x}, \nabla f) = \begin{cases} \frac{1}{N}\, e^{-\frac{r}{2\sigma'^2}}\, e^{-\frac{\tan^2(\phi)}{2\rho^2}} & \text{if } \phi \neq \frac{\pi}{2} \\ 0 & \text{if } \phi = \frac{\pi}{2} \text{ and } r \neq 0 \\ \frac{1}{N} & \text{otherwise,} \end{cases}$$
with $r = \mathbf{x}^T\mathbf{x}$ and $\phi = \frac{\pi}{2} - \left|\arccos\!\left(\nabla f^T \mathbf{x} / (|\nabla f||\mathbf{x}|)\right)\right|$. The parameter $\rho$ determines the strength of the orientedness, and $\sigma'$ determines the strength of the punishment depending on the distance. $N$ is a normalization factor that makes the kernel integrate to unity. The resulting structure tensor is denoted as $W$.

3. Surfaceness measure calculation: Let $e_{1,W(x)}, e_{2,W(x)}, e_{3,W(x)}$ be the eigenvectors and $\lambda_{1,W(x)} \geq \lambda_{2,W(x)} \geq \lambda_{3,W(x)}$ the corresponding eigenvalues of $W(x)$ at position $x$. If $x$ is located on a plane-like structure, we can observe that $\lambda_1 \gg 0$, $\lambda_2 \approx 0$, and $\lambda_3 \approx 0$. Thus, we define the surfaceness measure as $t(W(x)) = \sqrt{\lambda_{1,W(x)} - \lambda_{2,W(x)}}$, and the direction of the normal vector to the surface is given by $e_{1,W(x)}$.

4. Boundary weight calculation: In liver CT images, objects are often separated only by weak boundaries, with higher gray-level gradients present in close proximity. To take these circumstances into account, we propose the boundary cost term $B_{p,q} = \min\{\xi(t(W(x_p))),\, \xi(t(W(x_q)))\}$ with the weighting function
$$\xi(t) = \begin{cases} c_1 & \text{if } t < t_1 \\ c_2 & \text{if } t > t_2 \\ (t - t_1)\,\frac{c_2 - c_1}{t_2 - t_1} + c_1 & \text{otherwise,} \end{cases}$$
which models an uncertainty zone between $t_1$ and $t_2$ (note: $t_1 < t_2$ and $c_1 > c_2$). Ideally, the graph cut segmentation should follow the ridges of the gradient magnitude. Therefore, we punish non-maximal responses in the gradient magnitude volume by adjusting the weighting function as follows: $\xi_{nonmax}(t) = \min\{\xi(t) + c_{nm}, 1\}$, where $c_{nm}$ is a constant.
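To make the energy construction above concrete, the following sketch assembles the region and boundary costs and solves the s-t cut. It is a minimal illustration under several assumptions, not the authors' implementation: it uses the PyMaxflow library for the min-cut, it expects the surfaceness volume t(W) to be precomputed (steps 1-3 above), and it omits the restriction to voxels above -600 HU, the non-maximum punishment ξ_nonmax, and any performance tuning.

```python
import numpy as np
import maxflow  # PyMaxflow (assumed available): pip install PyMaxflow


def graphcut_liver(volume, surfaceness, seed_mask,
                   lam=0.05, t1=2.0, t2=10.0, c1=1.0, c2=0.001):
    """Sketch of the initial graph cut segmentation (hypothetical helper).

    volume      -- CT gray values, 3D numpy array
    surfaceness -- t(W) per voxel, computed beforehand (steps 1-3)
    seed_mask   -- boolean mask of the user-defined object seed region
    """
    # Region term: Gaussian object likelihood from the seed statistics.
    m_obj = volume[seed_mask].mean()
    s_obj = volume[seed_mask].std()
    pr_obj = np.exp(-(volume - m_obj) ** 2 / (2.0 * s_obj ** 2))
    pr_obj = np.clip(pr_obj, 1e-6, 1.0 - 1e-6)   # avoid log(0)
    r_obj = -np.log(pr_obj)                      # R_p("obj")
    r_bkg = -np.log(1.0 - pr_obj)                # R_p("bkg")

    # Boundary term: piecewise-linear weighting xi of the surfaceness.
    xi = np.where(surfaceness < t1, c1,
                  np.where(surfaceness > t2, c2,
                           (surfaceness - t1) * (c2 - c1) / (t2 - t1) + c1))

    g = maxflow.Graph[float]()
    nodeids = g.add_grid_nodes(volume.shape)

    # n-links: B_{p,q} = min(xi_p, xi_q) for 6-connected neighbor pairs.
    # (Plain Python loop for clarity; far from optimal for full volumes.)
    for axis in range(3):
        sl_p, sl_q = [slice(None)] * 3, [slice(None)] * 3
        sl_p[axis], sl_q[axis] = slice(0, -1), slice(1, None)
        w = np.minimum(xi[tuple(sl_p)], xi[tuple(sl_q)])
        for p, q, cap in zip(nodeids[tuple(sl_p)].ravel(),
                             nodeids[tuple(sl_q)].ravel(), w.ravel()):
            g.add_edge(int(p), int(q), float(cap), float(cap))

    # t-links: lambda-weighted region costs; object seeds as hard constraints,
    # scene boundary as hard background seeds.
    inf = 1e9
    src = lam * r_bkg      # capacity source -> p (paid if p becomes "bkg")
    snk = lam * r_obj      # capacity p -> sink (paid if p becomes "obj")
    src[seed_mask] = inf
    snk[seed_mask] = 0.0
    border = np.zeros(volume.shape, dtype=bool)
    border[0, :, :] = border[-1, :, :] = True
    border[:, 0, :] = border[:, -1, :] = True
    border[:, :, 0] = border[:, :, -1] = True
    src[border] = 0.0
    snk[border] = inf
    g.add_grid_tedges(nodeids, src, snk)

    g.maxflow()
    return ~g.get_grid_segments(nodeids)   # True = object ("obj") voxels
```

With the parameter values from Section 3 (λ = 0.05, t1 = 2.0, t2 = 10.0, c1 = 1.0, c2 = 0.001) this reproduces the cost structure described above; the surfaceness computation itself (Gaussian-derivative gradients and the non-linear tensor averaging) is not shown.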
2.2 Chunk-based Segmentation Refinement
After the initial segmentation, objects with a similar gray-value range in close proximity can appear merged, or tumors with a different gray-value appearance might be missing. Therefore, a refinement may be needed in some cases. The first refinement stage is based on volume chunks, which subdivide the graph cut segmentation result (object) as well as the background into disjoint subregions.
Fig. 1. Mesh-based refinement using a sphere deformation tool. In this case the segmentation error is a leak. (a) Marking the region containing the segmentation error. (b) Refinement using the sphere tool. (c) After pushing the mesh surface back to the correct location with the sphere tool, the error is fixed. (d) The corrected region in wire frame mode highlighting the mesh contour.
Fig. 2. Initial graph cut (GC) segmentation results. From left to right, a sagittal, coronal and transversal slice from a relatively easy case (1, top), an average case (4, middle), and a relatively difficult case (3, bottom). The outline of the reference standard segmentation is in red, the outline of the segmentation of the method described in this paper is in blue. Slices are displayed with a window of 400 and a level of 70.
Thus, the initial segmentation can be represented by chunks, and it can be altered by adding or removing chunks. By thresholding t(W) with a threshold t_b, a binary boundary volume representing boundary/surface parts is generated and merged with the boundary of the graph cut segmentation using a logical "or" operation. Then the distance transformation is calculated. Inverting this distance map results in an image that can be interpreted as a height map. To avoid oversegmentation, all small local minima resulting from quantization noise in the distance map are eliminated. Applying a watershed segmentation to this height map results in volume chunks. Since boundary voxels are not part of the chunks, they are merged with the neighboring chunks containing the most similar adjacent voxels. Since the method can handle gaps in the edge scene, the threshold t_b can be set very conservatively to suppress background noise. Refinement can then be done very efficiently, since the user only has to select/deselect predefined chunks, which does not require a detailed border delineation. This step requires adequate tools for interactive data inspection and selection, which are provided by the hybrid user interface described in Section 2.4.
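A rough sketch of this chunk generation, using SciPy/scikit-image functions as stand-ins, is given below. The paper does not specify the implementation; the function name, the depth parameter h for suppressing shallow minima, and the omission of the final boundary-voxel merging step are our assumptions.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.morphology import h_minima
from skimage.segmentation import watershed


def generate_chunks(surfaceness, gc_boundary, tb=10.0, h=1.0):
    """Subdivide object and background into volume chunks (sketch).

    surfaceness -- t(W) volume
    gc_boundary -- boolean mask of the graph cut segmentation boundary
    tb          -- surfaceness threshold for boundary voxels
    h           -- depth below which local minima are suppressed (assumed)
    """
    # Combine edge/surface evidence with the graph cut boundary ("or").
    boundary = (surfaceness > tb) | gc_boundary

    # Distance to the nearest boundary voxel; inverted, it acts as a height map.
    dist = ndi.distance_transform_edt(~boundary)
    height = -dist

    # Suppress shallow minima caused by quantization noise and use the
    # remaining minima as watershed markers.
    minima = h_minima(height, h)
    markers, _ = ndi.label(minima)

    # Watershed on the height map yields the volume chunks; boundary voxels
    # are excluded here and would be merged with the most similar neighboring
    # chunk in a post-processing step (omitted for brevity).
    chunks = watershed(height, markers=markers, mask=~boundary)
    return chunks
```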
2.3 Simplex-Mesh-based Refinement
After the first refinement step, the selected chunks are converted to a simplex mesh representation. Different tools then allow deformation of the mesh representation; one example is shown in Fig. 1. More details regarding this mesh-based refinement step can be found in [14].
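As a simplified illustration of the sphere tool idea only (not the actual simplex-mesh mechanics of [14], which include internal smoothing forces and VR-based interaction), the following sketch pushes mesh vertices that fall inside an interactively placed sphere radially back onto its surface; the function name and the purely geometric update are our assumptions.

```python
import numpy as np


def sphere_push(vertices, center, radius):
    """Push mesh vertices out of a spherical tool (simplified sketch).

    vertices -- (N, 3) array of mesh vertex positions
    center   -- (3,) sphere center in world coordinates
    radius   -- sphere radius
    Returns the displaced vertex array; vertices outside the sphere are left
    untouched. Internal forces of the simplex mesh [14] are not modeled here.
    """
    v = vertices - center
    dist = np.linalg.norm(v, axis=1)
    inside = (dist < radius) & (dist > 1e-9)
    out = vertices.copy()
    # Project affected vertices radially onto the sphere surface.
    out[inside] = center + v[inside] * (radius / dist[inside])[:, None]
    return out
```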
2.4 Hybrid Desktop/Virtual Reality User Interface
To facilitate segmentation refinement, a hybrid user interface consisting of a desktop part and a virtual reality (VR) part was developed (see [10] for details). It allows individual refinement tasks to be solved with the best-suited interaction technique, either in 2D or 3D. The VR part of the system provides stereoscopic visualization on a large-screen projection wall, while the desktop part uses a touch screen for monoscopic visualization.
3 Data and Experimental Setup
For the evaluation of the segmentation approach, ten liver CT data sets with undisclosed manual reference segmentations were provided by the workshop organizers. Segmentation results were sent to the organizers, who in return provided evaluation results (see http://mbi.dkfz-heidelberg.de/grand-challenge2007/sites/eval.htm for details). For all experiments, the following parameters have been used: Gaussian derivative kernel: σ = 3.0; non-linear filtering: σ' = 6.0, ρ = 0.4; graph cut: λ = 0.05; weighting function: t1 = 2.0, t2 = 10.0, c1 = 1.0, c2 = 0.001, c_nm = 0.75; threshold for chunk generation: t_b = 10.0; gray-value transformation: t_low = −50, v_low = −150, t_high = 200, and v_high = 60.
To simulate the clinical workflow, the initial seed regions were provided manually, and the graph cut segmentation as well as the chunk generation were computed automatically. Based on the initial segmentation, a medical expert was asked to perform (a) chunk-based refinement (CBR) and (b) mesh-based refinement (MBR). Intermediate results and task completion times were recorded. Prior to the evaluation, the expert was introduced to the system by an instructor.
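For concreteness, the gray-value transformation κ from Section 2.1 with the parameter values listed above could be implemented as in the following sketch (the function name and the NumPy-based formulation are ours):

```python
import numpy as np


def kappa(volume, t_low=-50, v_low=-150, t_high=200, v_high=60):
    """Gray-value transformation applied before gradient computation (sketch).

    Values below t_low are set to v_low, values above t_high to v_high, and
    all other values are kept; parameters as listed in the experimental setup.
    """
    out = volume.astype(np.float32).copy()
    out[volume < t_low] = v_low
    out[volume > t_high] = v_high
    return out
```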
Fig. 3. Chunk-based segmentation refinement (CBR) results. From left to right, a sagittal, coronal and transversal slice from a relatively easy case (1, top), an average case (4, middle), and a relatively difficult case (3, bottom). The outline of the reference standard segmentation is in red, the outline of the segmentation of the method described in this paper is in blue. Slices are displayed with a window of 400 and a level of 70.
4 Results
Table 1 summarizes segmentation metrics and corresponding scores for the initial graph cut segmentation (Table 1(a)), CBR (Table 1(b)), and MBR (Table 1(c)). The averaged performance measures and scores clearly show the effectiveness
of the segmentation refinement concept: metrics and scores improve with each refinement stage. For example, after the initial graph cut segmentation, five cases have an overlap error larger than 10 %, and the overall average is 14.3 %. Using CBR, the average overlap error was reduced to 6.5 %, and it reached 5.2 % after the final MBR stage. The average time needed for seed placement is less than 30 seconds. The CBR step required 58 seconds on average, and the MBR step took approximately five minutes on average. Despite the low time consumption of the CBR step, it is quite effective in terms of segmentation quality improvement and already delivers a good segmentation result. Computation time for the graph cut segmentation and chunk generation was approximately 30 minutes per data set, which is not critical for our application.
Fig. 4. Mesh-based segmentation refinement (MBR) results. From left to right, a sagittal, coronal and transversal slice from a relatively easy case (1, top), an average case (4, middle), and a relatively difficult case (3, bottom). The outline of the reference standard segmentation is in red, the outline of the segmentation of the method described in this paper is in blue. Slices are displayed with a window of 400 and a level of 70.
In comparison, averages for the performance measures determined from an independent human segmentation of several test cases yielded: 6.4 % volumetric overlap; 4.7 % relative absolute volume difference; 1.0 mm average symmetric absolute surface distance; 1.8 mm symmetric RMS surface distance; and 19 mm maximum symmetric absolute surface distance (these values were provided by the workshop organizers). Thus, our refinement results (CBR and MBR) are within the observed variation range (see Table 1). Figs. 2, 3, and 4 depict a comparison of the reference and actual segmentations for the initial graph cut, CBR, and MBR results for three different data sets. Because of the formulation of the initial graph cut segmentation, larger tumors are not included in the segmentation result, as shown in the third row of Fig. 2. However, this can be easily fixed during the CBR stage (Fig. 3). Remaining errors can then be fixed in the MBR stage. The examples show that the maximum symmetric absolute surface distance of 15.7 mm on average can be explained by differences in the interpretation of the data in regions where vessels enter or leave the liver.
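To relate the reported numbers to the underlying masks, the two volumetric measures can be computed as in the following sketch (standard definitions assumed for the grand challenge evaluation; the surface-distance measures additionally require border extraction and distance transforms and are omitted):

```python
import numpy as np


def volumetric_metrics(seg, ref):
    """Volumetric overlap error and signed relative volume difference in percent.

    seg, ref -- boolean 3D arrays (segmentation result, reference standard)
    """
    seg = seg.astype(bool)
    ref = ref.astype(bool)
    intersection = np.logical_and(seg, ref).sum()
    union = np.logical_or(seg, ref).sum()
    overlap_error = 100.0 * (1.0 - intersection / union)
    volume_diff = 100.0 * (seg.sum() - ref.sum()) / ref.sum()
    return overlap_error, volume_diff
```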
5 Discussion
For our experiments, we have used a full-blown VR setup, which is quite expensive. However, a fully functional scaled-down setup can be built for a reasonable price, comparable to the cost of a radiological workstation. Several experiments with different physicians have shown that the system can be operated after a short learning phase (typically less than one hour), because of the intuitive 3D user interface.

The proposed refinement method can also easily be integrated into the clinical workflow. The CT volume together with the manually generated start region is sent by a radiology assistant to a computing node which performs the automated segmentation steps. As soon as a result is available, a radiologist is notified that the data are ready for further processing. After inspection, possible refinement, and approval of correctness, the segmentation can be used for clinical investigations or treatment planning.

A previous, independently performed evaluation with twenty routinely acquired CT data sets of potential liver surgery candidates yielded a comparable segmentation error. However, more time was needed for interactive refinement. This has several reasons: lower data quality (more partial volume effect, motion blur due to cardiac motion, etc.), more severely diseased livers with larger or multiple tumors, and more focus on details (e.g., consistently excluding the inferior vena cava). These observations lead to the following conclusions. First, the imaging protocol used impacts the time needed for segmentation refinement and should therefore be optimized. Second, the developed method allows the user to adjust the level of detail according to the requirements, in trade-off with interaction time.
6 Conclusion
In this paper we have presented an interactive true 3D segmentation refinement concept for liver segmentation in contrast-enhanced CT data. The approach consists of two stages: initial graph cut segmentation and interactive 3D refinement.
(a) Graph Cut (GC)

Dataset | Overlap Error [%] (Score) | Volume Diff. [%] (Score) | Avg. Dist. [mm] (Score) | RMS Dist. [mm] (Score) | Max. Dist. [mm] (Score) | Total Score
1       | 17.9 (30)  | 10.7 (43)   | 4.8 (0)   | 10.9 (0)  | 67.9 (11) | 17
2       | 18.2 (29)  | 16.5 (12)   | 3.3 (19)  | 8.6 (0)   | 54.1 (29) | 18
3       | 35.2 (0)   | 20.3 (0)    | 11.2 (0)  | 22.2 (0)  | 92.3 (0)  | 0
4       | 8.4 (67)   | 1.3 (93)    | 1.6 (61)  | 3.8 (47)  | 43.3 (43) | 62
5       | 7.0 (73)   | 1.8 (90)    | 1.2 (69)  | 2.7 (62)  | 27.8 (63) | 72
6       | 5.7 (78)   | -1.4 (92)   | 1.0 (76)  | 2.5 (65)  | 27.5 (64) | 75
7       | 12.2 (52)  | 0.7 (96)    | 5.3 (0)   | 11.3 (0)  | 61.6 (19) | 33
8       | 7.7 (70)   | -1.6 (92)   | 1.6 (59)  | 4.3 (40)  | 45.9 (40) | 60
9       | 7.5 (71)   | -0.0 (100)  | 2.2 (46)  | 5.4 (25)  | 31.9 (58) | 60
10      | 23.0 (10)  | -17.9 (5)   | 3.4 (14)  | 7.3 (0)   | 39.7 (48) | 15
Average | 14.3 (48)  | 3.1 (62)    | 3.6 (34)  | 7.9 (24)  | 49.2 (38) | 41

(b) Chunk-based Refinement (CBR)

Dataset | Overlap Error [%] (Score) | Volume Diff. [%] (Score) | Avg. Dist. [mm] (Score) | RMS Dist. [mm] (Score) | Max. Dist. [mm] (Score) | Total Score
1       | 8.2 (68)   | 2.5 (87)    | 2.0 (49)  | 5.4 (24)  | 47.9 (37) | 53
2       | 6.2 (76)   | 3.0 (84)    | 0.9 (78)  | 1.8 (75)  | 17.9 (76) | 78
3       | 7.1 (72)   | 3.2 (83)    | 1.6 (61)  | 3.5 (51)  | 34.7 (54) | 64
4       | 6.7 (74)   | -0.5 (97)   | 1.2 (69)  | 2.5 (66)  | 25.2 (67) | 75
5       | 6.4 (75)   | 1.9 (90)    | 1.1 (72)  | 2.4 (67)  | 21.5 (72) | 75
6       | 5.0 (80)   | 0.4 (98)    | 0.7 (81)  | 1.6 (78)  | 17.2 (77) | 83
7       | 5.4 (79)   | 2.2 (88)    | 0.8 (80)  | 1.5 (79)  | 13.0 (83) | 82
8       | 7.1 (72)   | -1.2 (94)   | 1.1 (72)  | 2.5 (65)  | 20.2 (73) | 75
9       | 5.1 (80)   | 2.2 (88)    | 0.6 (85)  | 1.2 (83)  | 16.8 (78) | 83
10      | 8.0 (69)   | -2.4 (87)   | 1.2 (71)  | 2.2 (69)  | 19.2 (75) | 74
Average | 6.5 (74)   | 1.1 (90)    | 1.1 (72)  | 2.5 (66)  | 23.4 (69) | 74

(c) Mesh-based Refinement (MBR)

Dataset | Overlap Error [%] (Score) | Volume Diff. [%] (Score) | Avg. Dist. [mm] (Score) | RMS Dist. [mm] (Score) | Max. Dist. [mm] (Score) | Total Score
1       | 5.3 (79)   | 2.3 (88)    | 0.8 (80)  | 1.5 (79)  | 15.9 (79) | 81
2       | 5.5 (79)   | 1.9 (90)    | 0.8 (81)  | 1.4 (81)  | 17.9 (76) | 81
3       | 4.1 (84)   | 1.5 (92)    | 0.8 (80)  | 1.3 (82)  | 14.2 (81) | 84
4       | 6.4 (75)   | 1.0 (95)    | 1.1 (72)  | 2.1 (70)  | 21.0 (72) | 77
5       | 5.4 (79)   | 0.1 (99)    | 0.9 (78)  | 1.7 (77)  | 19.2 (75) | 81
6       | 4.1 (84)   | -0.9 (95)   | 0.6 (85)  | 1.1 (85)  | 12.6 (83) | 86
7       | 4.4 (83)   | 2.5 (87)    | 0.6 (85)  | 1.1 (85)  | 12.5 (84) | 85
8       | 5.7 (78)   | 1.7 (91)    | 0.9 (77)  | 1.8 (76)  | 17.1 (77) | 80
9       | 4.2 (84)   | 2.4 (87)    | 0.5 (88)  | 0.8 (88)  | 16.8 (78) | 85
10      | 6.6 (74)   | -2.8 (85)   | 0.9 (77)  | 1.5 (79)  | 9.6 (87)  | 80
Average | 5.2 (80)   | 1.0 (91)    | 0.8 (80)  | 1.4 (80)  | 15.7 (79) | 82

Table 1. Results of the comparison metrics and corresponding scores (in parentheses) for all ten test cases and processing steps (see Section 4 for details).
The evaluation of our method on ten test CT data sets shows that a high segmentation quality (mean average distance of less than 1 mm) can be achieved by using this approach. In addition, the interaction time needed for refinement is quite low (approx. 6.5 minutes). Thus, the presented refinement concept is well suited for clinical application. The approach is not limited to a specific organ or modality, and therefore, it is very promising for other medical segmentation applications.
References

1. Reitinger, B., Bornik, A., Beichel, R., Schmalstieg, D.: Liver surgery planning using virtual reality. IEEE Comput. Graph. Appl. 26(6) (2006) 36-47
2. Schenk, A., Prause, G.P.M., Peitgen, H.O.: Efficient semiautomatic segmentation of 3D objects in medical images. In: Medical Image Computing and Computer-Assisted Intervention (MICCAI), Springer (2000) 186-195
3. Pan, S., Dawant, M.: Automatic 3D segmentation of the liver from abdominal CT images: A level-set approach. In Sonka, M., Hanson, K.M., eds.: Medical Imaging: Image Processing. Volume 4322 of Proc. SPIE. (2001) 128-138
4. Soler, L., et al.: Fully automatic anatomical, pathological, and functional segmentation from CT scans for hepatic surgery. Computer Aided Surgery 6(3) (2001) 131-142
5. Lamecker, H., et al.: Segmentation of the liver using a 3D statistical shape model. Technical report, Konrad-Zuse-Zentrum für Informationstechnik Berlin (2004)
6. Heimann, T., Wolf, I., Meinzer, H.P.: Active shape models for a fully automated 3D segmentation of the liver - an evaluation on clinical data. In: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Volume 4191 of Lecture Notes in Computer Science, Springer Berlin/Heidelberg (2006) 41-48
7. Jackowski, M., Goshtasby, A.: A computer-aided design system for revision of segmentation errors. In: Proc. Medical Image Computing and Computer-Assisted Intervention (MICCAI). Volume 2. (2005) 717-724
8. Beichel, R., et al.: Shape- and appearance-based segmentation of volumetric medical images. In: Proc. of ICIP 2001. Volume 2. (2001) 589-592
9. Grady, L., Funka-Lea, G.: An energy minimization approach to the data driven editing of presegmented images/volumes. In: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Volume 4191, Springer (2006) 888-895
10. Bornik, A., Beichel, R., Kruijff, E., Reitinger, B., Schmalstieg, D.: A hybrid user interface for manipulation of volumetric medical data. In: Proceedings of IEEE Symposium on 3D User Interfaces 2006, IEEE Computer Society (2006) 29-36
11. Boykov, Y., Funka-Lea, G.: Graph cuts and efficient N-D image segmentation. International Journal of Computer Vision (IJCV) 70(2) (2006) 109-131
12. Reitinger, B., et al.: Tools for augmented reality-based liver resection planning. In Galloway, R.L., ed.: Medical Imaging 2004: Visualization, Image-Guided Procedures, and Display. Volume 5367, SPIE (2004) 88-99
13. Boykov, Y., Jolly, M.P.: Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In: ICCV. Volume 1. (2001) 105-112
14. Bornik, A., Beichel, R., Schmalstieg, D.: Interactive editing of segmented volumetric datasets in a hybrid 2D/3D virtual environment. In: VRST '06: Proceedings of the ACM Symposium on Virtual Reality Software and Technology. (2006) 197-206