Abteilung Neuroinformatik
Prof. Dr. Günther Palm
Robust Segmentation of Human Cardiac Contours from Spatial Magnetic Resonance Images
Dissertation zur Erlangung des Doktorgrades Dr. rer. nat. der Fakultät für Informatik der Universität Ulm
Haythem El-Messiry
aus Ägypten
2004
Amtierender Dekan:
Prof. Dr. Friedrich von Henke
1. Gutachter:
Prof. Dr. Heiko Neumann
2. Gutachter:
Prof. Dr. Günther Palm
Tag der Promotion:
Abstract

Automated segmentation to find the endocardial boundary of the left heart ventricle from magnetic resonance (MR) images has proven to be a difficult task. Major problems in detecting the boundary are the shortcomings typical of discrete data, such as sampling artifacts and noise, which may cause the shape boundaries to be indistinct and disconnected. Furthermore, the structures inside the ventricular cavities, such as papillary muscles, are often indistinguishable from structures of interest for diagnostic analysis, such as the moving inner heart boundary. Thus, segmentation is error-prone and often incomplete. The aim of this work is to develop a model towards an automatic segmentation of the endocardial border. The proposed method is composed of two phases. The segmentation phase uses a bottom-up multi-scale analysis, based mainly on morphological scale-space processing, decomposing the image into a number of scales of different structure size. As a result of the decomposition, the structures adjacent to the endocardial border are located, and finally an estimated boundary is obtained regardless of those structures. The refinement phase subsequently asserts prior information about the local structure around defined points along the shape boundary, in order to obtain the best accuracy of the endocardial segmentation.
Acknowledgements

I would like to express my gratitude to the DAAD for the exchange scholarship PhD program. I would like to thank my scientific supervisor, Prof. Dr. Heiko Neumann, for his encouragement, guidance and support during my entire PhD study at Ulm University. He has been a valuable source of ideas and stimulating discussions on both technical and non-technical topics. It is my great fortune and honor to earn my degree under his supervision. I would like to especially thank Prof. Dr. Günther Palm for his helpful comments and suggestions in helping this dissertation take its final form. All my colleagues and staff from the Neural Information Processing Department throughout the years deserve much appreciation; especially Birgit Lonsinger-Miller, Pierre Bayerl, and Guilleum Pagés Gassull made me really enjoy my time there. Everybody there was always supportive and gave me a lot of good advice and help at the right moments. I would like to thank the Department of Internal Medicine II/Cardiology at Ulm University for providing the cardiac MR data used in this research; especially Dr. Hans Kestler and the staff of the MRI unit, for many helpful discussions on the heart and for providing much needed data and expertise. Finally, I dedicate this thesis to: My father Mohamed Fakhrey, to whom I dedicate every single success in my life. My mother Fatma and my sister Radwa, who have given their unconditional support, knowing that doing so contributed greatly to my absence these last four years. They were strong enough to let me go easily, to believe in me, and to let slip away all those years during which we could have been geographically closer. My wife Doaa, who deserves an award for her patience, understanding, and prayers during my PhD study and the writing of this thesis. My daughter Aimy, who is everything in my life.
Table of Contents

Abstract  i
Acknowledgements  iii
Table of Contents  v

1 Introduction  1
1.1 Introduction  1
1.2 Motivation  2
1.3 Medical Imaging  2
1.4 MR Medical Imaging  3
1.5 Endocardial Border Segmentation  7
1.6 Outline of the Segmentation Process  8
1.7 Thesis Contributions  9
1.8 Thesis Organization  10

2 Related Work  11
2.1 Medical Image Segmentation  11
2.1.1 Region-Based Segmentation  11
2.1.2 Local Feature Detection  12
2.1.3 Template Matching  12
2.1.4 Deformable Models  12
2.1.5 Statistical Prior Knowledge of Shape  15

3 Low-Level Visual Analysis Approach  17
3.1 Introduction  17
3.2 Mathematical Morphology  18
3.2.1 Foundation and Notation  18
3.2.2 Morphological Properties  20
3.3 Morphological Scale-space  21
3.3.1 Scale-Space  21
3.3.2 Multiscale Morphology  22
3.3.3 Morphology with Scaled Structuring Functions  24
3.4 Morphological Scale-space Decomposition  24
3.4.1 Morphological Filtering  28
3.4.2 Descriptor Scales in Opening-Closing Scale-space  32
3.4.3 Local Feature Extraction  34
3.5 Summary  49

4 Combined Low-High Level Visual Approach  51
4.1 Introduction  51
4.2 High-Level Approaches  52
4.2.1 Active Shape Model  52
4.2.2 Active Appearance Model  52
4.2.3 Problems of the AAM  56
4.3 Combined Low-High Level Visual Approach  58
4.3.1 Building the Knowledge-based Model  58
4.3.2 Searching Using Gray Appearance Knowledge  61
4.4 Example of Gray Appearance Knowledge Search  63

5 Experimental Results and Evaluation  67
5.1 Experimental Design  67
5.1.1 Data Acquisition and Preprocessing  67
5.1.2 Performance Assessment  71
5.2 Experimental Results  71
5.2.1 Demonstration of Low-Level Visual Approach Results  71
5.2.2 Combined Low-High Level Approach Evaluation  77
5.2.3 Evaluation Compared to Other Approaches  82

6 Summary and Conclusions  87
6.1 Summary and Conclusions of the Thesis  87

Zusammenfassung  93

A Principal Component Analysis (PCA)  95

B Image Warping  97
B.1 Piece-wise Affine  97

C Texture Normalization  101

D GVF Snakes Implementation  103

Bibliography  105
List of Figures

1.1 MRI scanner and its application in medical imaging.  4
1.2 Standard MR short-axis image sequence of the heart, starting from left to right and from top to bottom.  6
1.3 (a) Short-axis cross section. (b) Location of the endocardium in the LV.  7
1.4 Left: The hand-drawn endocardium border. Right: Contour segmentation models do not lead to satisfactory results (using a Canny edge detector).  8
1.5 The system architecture, illustrating the combination of the knowledge-based model with a bottom-up model for the endocardial border segmentation.  9
2.1 Results of GVF applied for segmenting the endocardial boundary (Chenyang Xu, CVPR'97).  14
3.1 Scale-space filtering.  22
3.2 Smoothing of a 1D signal by multiscale morphological dilation-erosion.  23
3.3 test  25
3.4 Left: test image containing regions that have different constant widths. Center: appropriate scales measured by eq. 3.4.1 are always small near edges. Right: magnitudes of appropriate scales measured by Köthe's method are constant within each region.  27
3.5 Band-pass decomposition of an image (center) with respect to structure sizes (shown above each scale). Left, top to bottom: s=1,2,4,8,16,32, dark blobs. Right: s=-1,-2,-4,-8,-16,-32, light blobs.  31
3.6 Reconstruction of the original image from the opening-closing scale decomposition.  32
3.7 Left: Descriptor scale of the closing scale-space decomposition. Right: Descriptor scale of the opening scale-space.  33
3.8 Left: Descriptor scale of the closing scale-space decomposition. Right: Descriptor scale of the opening scale-space. The variation in gray intensities comes from the maximum response over the scale-space decomposition.  33
3.9 Extracting the inner cavity and locating the appearance structures.  34
3.10 Segmentation process architecture. First, the morphological scale-space and scale descriptors are calculated. Second, both the inner cavity and appearance structures are segmented. Finally, combining the previous segmentation processes yields an estimated endocardial boundary.  35
3.11 Decomposition of the ROI (region of interest) with respect to structure sizes, using the closing and opening morphological operators.  35
3.12 Left: Descriptor scale of the closing decomposition. Right: Mapping of the closing descriptor scale into a color domain. The scales are shown around the right figure with their mapped colors.  36
3.13 Left: Descriptor scale of the opening decomposition. Right: Mapping of the opening descriptor scale into a color domain. The scales are shown around the right figure with their mapped colors.  36
3.14 Schematic diagram of the operation of grey-scale opening (erosion and dilation) in one dimension, showing (from the top): the original profile with the result of the first (maximum) pass, producing a new profile through the brightest points; the second step, in which a new profile passes through the darkest (minimum) points of the result from step 1; and a comparison of the final result to the original profile, showing rejection of noise and dark spikes.  37
3.15 Scale selection. Left: original images for 2 different phases of the same case. Middle: scale decomposition marked in green. Right: scale decomposition marked in blue.  38
3.16 The opening scale-space decomposition for the image. The scales are presented from left to right and from top to bottom; each scale represents a certain structure size from the image.  40
3.17 (a) and (c): The scales marked in green and blue, with the seed points as initialization for the region-growing algorithm. (b) and (d): The segmented regions using the algorithm.  41
3.18 Building the counter accumulator vectors for the segmented regions.  41
3.19 Left: The counter accumulator for the segmented region from the scale marked in green. Right: The counter accumulator for the segmented region from the scale marked in blue. The maximum counter accumulator votes for the scale representing the inner cavity.  42
3.20 (a) The inner cavity with appearance structures. (b) Applying an edge-detection algorithm segments those appearance structures as part of the inner boundary. (c) The desired inner boundary must lie beyond those appearance structures.  43
3.21 The closing scale-space decomposition for the image. The scales are presented from left to right and from top to bottom; each scale represents a certain structure size from the image.  44
3.22 Top left: The scale representing the inner appearance-structure information. Bottom left: The histogram of the above scale with the threshold value. Top right: The thresholded scale with the extracted appearance structures; some segmented structures located outside the inner region must be removed, like the ones at the top left or at the bottom middle.  45
3.23 The combination of inner region and appearance structures to obtain a smoothed estimated endocardial contour.  46
3.24 Final estimated contour with the 2 scales representing the inner-region and inner-structure segmentation. The arrows point toward the appearance structures and their localization in the descriptor scale.  47
3.25 Sequence of MR images for one case, showing the estimated endocardial contour using the proposed scale-space decomposition approach.  48
4.1 Combined shape and gray-level appearance model. First two modes of appearance variation of the inner cavity of the left ventricle.  54
4.2 Training image with manually labelled endocardial border.  58
4.3 (a) Labelled endocardial contour. (b) Control-point selection with 15-degree spacing in clockwise direction. (c) Selected control points marked with white dots.  60
4.4 Building the statistical appearance model. For every control point, the gray profile is sampled along a line passing through the control point in the direction of the average center of mass.  61
4.5 Search along the sampled profile to find the best fit, compared to the trained profile.  63
4.6 Estimated contour obtained from the low-level visual approach, with control points.  64
4.7 Sampled profile and sub-profiles.  65
4.8 Search for the best-fit sub-profile for the gray-level model.  65
4.9 Searching using the appearance knowledge. The search guides each control point separately to the optimal final position, such that large displacements are made for control points far from the best-fit position, while small displacements are generated for control points near the optimal fit.  66
5.1 Sample frame from the acquired MRI data set, with marked ROI.  68
5.2 Sample slices from the acquired MRI data set, normal cases.  69
5.3 Sample slices from the acquired MRI data set, abnormal cases with significant left ventricular hypertrophy.  70
5.4 Estimated contour with located inner region and inner structures.  73
5.5 Estimated contour obtained from morphological scale-space decomposition.  74
5.6 Estimated contour obtained from morphological scale-space decomposition (continued).  75
5.7 Estimated contour obtained from morphological scale-space decomposition (continued).  76
5.8 Visualization of the results of table 5.1. The mean square error distance (y-axis) for each training group in each test set (x-axis).  78
5.9 Left: Evaluation of the proposed model using the training set (T) against itself. Right: Evaluation of the proposed model using the training set (T) against a non-trained set (S). Each case shows the range of error in pixels.  79
5.10 Left: The mean square error distance (y-axis) for each control point (x-axis) of the initial contour. Right: The mean square error distance (y-axis) for each control point (x-axis) of the best fit of the final contour. Both with standard error.  80
5.11 Search using the combined approach to estimate the endocardial border.  81
5.12 Distance mean error of the combined model and GVF snakes.  83
5.13 Left: The final contour (white) after applying the proposed model. Right: The final contour (white) after applying the GVF snake. Both are initialized with the estimated contour (red).  83
5.14 Left: The image. Right: The vector flow of the GVF directed towards the appearance structures.  84
5.15 Illustration of the AAM search: (a) Initial contour and final contour. (b) Initial mode appearance and final mode appearance.  86
5.16 Comparing the 3 models.  86
6.1 Comparing the 3 models.  90
6.2 3 cases showing the failures of the proposed model in some control points.  91
B.1 Delaunay triangulation of the mean shape.  98
Chapter 1
Introduction

1.1 Introduction
Medical imaging is an important source of anatomical and functional information and is indispensable for the diagnosis and treatment of diseases. However, huge amounts of high-resolution three-dimensional spatial and temporal data cannot be effectively processed and utilized with traditional visualization techniques. It is generally insufficient, or at least inefficient, for physicians to only visually inspect the medical image data collected from MR, CT, PET and other modalities. The role of medical imaging is expanding, and the medical image analysis community is engaged with the challenging problem of creating quantification algorithms that make full use of the information in the flood of image data.
Among the primary tasks of medical image analysis are image segmentation, registration, and matching. Medical image analysis directly impacts applications such as image data fusion, quantitative and time-series analysis, biomechanics modelling, generation of anatomical atlases, visualization, virtual and augmented reality, and instrument and patient localization and tracking. Medical images are analyzed to ascertain the detailed shape and organization of anatomic structures, for example in operative therapy planning, enabling a surgeon to preoperatively plan an optimal approach to a target structure.
1.2 Motivation
A fundamental problem in image processing and computer vision is the segmentation of deforming objects from image sequence data. One deforming object in particular, the heart, has received a great deal of attention in recent years because of its complex motion and because automatic cardiac image analysis can be used to diagnose damage to the heart muscle caused by, e.g., a heart attack. The left ventricular wall of the heart is significantly thicker than the right ventricular wall. This difference in muscle mass is due to the fact that the blood pressure in the aorta is roughly four times the pressure in the pulmonary artery; as a result, the left ventricle must produce more pressure than the right ventricle [85]. Another consequence of this pressure difference is that any damage to the ventricular wall will have a significant effect on cardiac performance. Measurement of ventricular volumes, muscle mass and function is also based on determining the left ventricular endocardial and epicardial borders [86]. Most research efforts are focused on left ventricular performance [34, 61, 50, 3, 75]. This thesis is concerned with the segmentation of the inner (endocardial) contour of the left ventricle from spatial MR images. Since manual border detection is laborious, automated segmentation is highly desirable as a fast, objective and reproducible technique. Automated segmentation will thus enhance comparability between and within cardiac studies and increase accuracy by allowing acquisition of thinner MRI slices.
1.3 Medical Imaging
The achievements in medical imaging over the past decades are enabling physicians to look non-invasively inside the human body for the purpose of diagnosis and therapy [24, 2]. With the advent of medical imaging modalities that provide different measures of internal anatomical structure and function, physicians are now able to perform typical clinical tasks such as patient diagnosis and monitoring more safely and effectively than before such imaging technologies existed. Applications of imaging in medicine include computer-aided diagnosis (CAD), image-guided therapy and therapy evaluation, computer-assisted intervention, surgical simulation, planning, and navigation, medical telepresence and telesurgery, functional cardiology, etc. The introduction of these advanced medical imaging technologies, which allow the acquisition of high-resolution cross-sectional images of the human body, has significantly improved the quality of medical care available to patients. Short
descriptions of some of the common modalities follow.

• Planar (2D) X-ray images, as in mammography and chest X-rays [76], are projection (shadow) images of a patient's 3D region of interest. The images are produced from X-rays passing through the patient's body tissues, attenuated according to the varying tissue densities.

• Computed Tomography (CT), or Computer Axial (Computer Assisted) Tomography (CAT), is based on the same principle as conventional X-ray radiography [23]; however, stacks of axial slices or reconstructed volume (3D) images are produced. X-ray based imaging is useful for the investigation of bone structure and fat tissue. For adequate acquisition of soft-tissue images, invasive contrast agents are required, which may cause allergic reactions in some patients.

• Ultrasound imaging (such as B-mode and Doppler) uses pulsed or continuous high-frequency sound waves to image internal structures by recording the differently reflected signals. Among others, ultrasound imaging is used in echocardiography for studying heart function and in prenatal assessment. Ultrasonographic images typically do not reach the resolution of images obtained through CT or MRI.

• Nuclear medicine acquisition methods such as Single Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET) are functional imaging techniques. They use radioactive isotopes to localize physiological and pathological processes rather than anatomic information.
1.4 MR Medical Imaging
Magnetic Resonance Imaging (MRI) is a non-invasive and non-hazardous medical imaging technique [43] (figure 1.1). MRI is based on the principle of resonance: the absorption of energy from a source at a particular frequency, the resonant or natural frequency. The patient is placed in a strong, homogeneous magnetic field, and the protons of the (mostly hydrogen) atoms in the different body tissues are then excited with radio frequency (RF) pulses perpendicular to the magnetic field. The RF pulse frequency is set such that the resulting proton spin alternation between lower and higher energy levels occurs at the intrinsic precessional frequency; this resonance condition gives MRI its name [4].
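The resonance condition described above can be stated compactly via the Larmor relation, a standard MR physics result added here for clarity (not part of the original thesis text):

```latex
% Larmor relation: spins in a static field B_0 precess at angular frequency
% \omega_0; an RF pulse is absorbed only when tuned to this frequency.
\omega_0 = \gamma B_0
% \gamma is the gyromagnetic ratio; for hydrogen,
% \gamma / 2\pi \approx 42.58\,\mathrm{MHz/T}, so a 1.5\,\mathrm{T} scanner
% (as used in this work) excites protons near 63.9\,\mathrm{MHz}.
```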
The effects which are actually measured and imaged are the relaxation of the proton spins after termination of the RF pulses. The weighting of different relaxation effects, as well as different excitation techniques, allows for a wide variety in contrast and visibility of different tissues.
Figure 1.1: MRI scanner and its application in medical imaging.
MRI Techniques for Cardiac Images

For the acquisition of MR image sets, especially of moving organs such as the heart, fast MRI techniques are essential to obtain high-quality images without movement artifacts within a reasonable time. Numerous acquisition technologies have been developed for this purpose [36, 72, 71] to allow the acquisition of several images during one breath hold. The images we use have been provided by the Department of Internal Medicine II/Cardiology at Ulm University. Imaging was performed on a 1.5 T whole-body scanner (Intera CV, Philips Medical Systems) with Master Gradients (slew rate 150 T/m/s, amplitude 30 mT/m) and a 5-element phased-array cardiac coil. Three short survey scans were performed to define the position and true axis of the left ventricle. Afterwards, wall motion was imaged during breath holding in long- and short-axis slices using a steady-state free
precession sequence, which provides an excellent demarcation of the endocardium. Cardiac synchronization was achieved by prospective gating. The cine images were recorded with 23 heart phases (23 frames per heart cycle). The scan matrix was 179x224, reconstructed to 256x256 with a slice thickness of 10 mm. A single midventricular short-axis slice was chosen for the analysis. Short-axis slices are more representative of cardiac function than long-axis views, and planimetry results in a nearly circular area. In addition, partial volume effects are smaller in short-axis slices compared to the long axis. Endocardial contours for each of the 23 heart phases were hand-drawn on a Sun Ultra 60 workstation using the Easy Vision Software Release 4.4 (Philips, Best, The Netherlands). Movie sequences of the contours were then exported as MPEG files (reconstructed resolution 240x352). An example of such a sequence is shown in figure 1.2.
Figure 1.2: Standard MR short-axis image sequence of the heart starting from left to right and from top to bottom.
1.5 Endocardial Border Segmentation
As shown in figure (1.3a), the heart is composed of four chambers [30]: right atrium (RA), left atrium (LA), right ventricle (RV) and left ventricle (LV). The atria and ventricles are surrounded by muscle tissue called the myocardium, and together they form a pump which moves blood through the body. Contraction and relaxation of the muscle fibers in the myocardium cause the pumping action of the heart. The inner surface of the myocardium is called the endocardium, and the outer surface is called the epicardium (figure 1.3b).
Figure 1.3: (a) Short-axis cross section. (b) Location of the endocardium in the LV.

One of the major problems related to the detection of the boundary is the set of shortcomings typical of discrete data, such as sampling artifacts and noise, which may cause the shape boundaries to be indistinct and disconnected. Furthermore, the gray-level structures inside the ventricular cavities, such as papillary muscles, are often indistinguishable from structures of interest for diagnostic analysis, such as the moving inner heart boundary. Thus, segmentation is error-prone and often incomplete, as in figure (1.4).
Figure 1.4: Left: The hand-drawn endocardium border. Right: Contour segmentation models do not lead to satisfactory results (using a Canny edge detector).

Previous methods have suggested placing an initial contour either manually or using semi-robust techniques, in order to bypass the structures adjacent to the border. Consequently, those techniques require human intervention. The main concept here is therefore to introduce a model that automatically locates the inner cavity region of the left ventricle within the inner structures, and isolates such structures from the endocardial border.
1.6 Outline of the Segmentation Process
Our main concern here is to recognize and locate the inner appearance structures adjacent to the endocardial border; as a result, an estimated contour is determined. Then, by building a knowledge-based model, we can drive the estimated contour to the best fit on the endocardial border. Our approach to the segmentation problem is split into three major stages, as described in figure 1.5:

1. Low-level stage: The processing starts with a bottom-up multi-scale analysis, based mainly on a morphological scale-space, decomposing the image into a number of scales corresponding to different sizes of the structuring element. As a result of the decomposition, the gray-level appearance structures adjacent to the endocardial border are located, and finally an estimated boundary is obtained irrespective of the details of such structures.

2. High-level stage: Prior information about the local structure around defined points along the shape boundary is asserted.

3. Combined low-high level stage: Refinement of the estimated boundary from the low-level stage, using the high-level stage, to achieve an accurate segmentation of the endocardial border.
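The low-level stage rests on grayscale morphological opening (erosion followed by dilation), which suppresses bright structures narrower than the structuring element while leaving wider ones intact. A minimal 1D sketch, written for illustration and not taken from the thesis implementation:

```python
def erode(signal, k):
    """Grayscale erosion with a flat structuring element of half-width k."""
    n = len(signal)
    return [min(signal[max(0, i - k):min(n, i + k + 1)]) for i in range(n)]

def dilate(signal, k):
    """Grayscale dilation with a flat structuring element of half-width k."""
    n = len(signal)
    return [max(signal[max(0, i - k):min(n, i + k + 1)]) for i in range(n)]

def opening(signal, k):
    """Opening = erosion then dilation: removes bright peaks narrower than the element."""
    return dilate(erode(signal, k), k)

# A narrow spike (width 1) is removed; the wide plateau (width 4) survives.
sig = [0, 0, 5, 0, 0, 3, 3, 3, 3, 0]
print(opening(sig, 1))  # [0, 0, 0, 0, 0, 3, 3, 3, 3, 0]
```

Increasing k removes progressively larger structures; this is exactly the mechanism a scale-space decomposition exploits, since the difference between openings at successive scales isolates structures of one particular size.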
Figure 1.5: The system architecture, illustrating the combination of the knowledgebased model with a bottom-up model for the endocardial border segmentation.
1.7 Thesis Contributions
The contributions of the thesis to the segmentation of the endocardial border of the left ventricle are summarized in the following points:

• We applied a low-level method to separate the gray-level appearance structures inside the ventricular cavities, such as papillary muscles, from the endocardial border, since those structures are often indistinguishable from it.
• We investigated morphological scale-space decomposition based on multiscale spatial analysis, a powerful tool which presents many advantages: the preservation of scale-space causality, the localization of sharp edges, and the reconstruction of the original image from the scale-space decomposition.

• We defined two descriptor scales, each maximizing the response of the morphological decomposition at each point. These scales give constant values for structures of constant width.

• We developed an approach which provides a new initialization-and-matching strategy by combining the low-level approach, which gives a good initialization position, and the high-level approach in one framework, building a statistical model that describes the variability of the border's appearance in terms of a prior distribution of the templates. This technique offers greater robustness than either technique alone.
1.8 Thesis Organization
This thesis is organized as follows. Chapter 2 provides background material about medical imaging, reviewing the previous models in this area and discussing the appropriateness of those various approaches. Chapter 3 describes a new low-level method for image segmentation designed to solve the problems not addressed by the previous work. Chapter 4 describes a knowledge-based model combined with the new low-level method in one framework, to enhance the desired segmentation. Chapter 5 describes experiments demonstrating the approach on data sets from actual human patients, both normal and abnormal, and compares the results with other methods for endocardial border segmentation. Finally, Chapter 6 concludes the thesis with a summary and a discussion of future research directions.
Chapter 2 Related Work

2.1 Medical Image Segmentation
In this section, existing techniques for the segmentation of images into distinct objects or background are reviewed, and their strengths and weaknesses are discussed with respect to the problems outlined in Section 1.5. Several general surveys of such techniques are available; their highlights are summarized here. The reader is referred to the articles by Haralick and Shapiro [31] and McInerney and Terzopoulos [57] for further details.
2.1.1 Region-Based Segmentation
The simplest segmentation technique is thresholding based on pixel values. Objects are associated with distinct ranges of pixel values, and pixels are assigned to objects depending on which gray-level range they belong to. The gray-level ranges can either be manually selected by an expert, or it may be possible to choose the thresholds automatically based on histogram analysis. This method begins to break down when the histogram profiles for objects overlap, which is the case when different objects contain the same pixel values. Thresholding [29] alone does not consider the structure of objects and may produce many islands, such as small disconnected objects inside other objects or the background. This becomes a problem when processing noisy images, or images with many small unwanted objects. Region growing [65] can be used to remove the islands by starting at specific points in the image and recursively including other thresholded points based on pixel connectivity. The area of the connected components provides a parameter which can be used to remove small objects from the segmentation. Region growing does nothing to smooth object boundaries, however, and may still leave small holes inside objects if the island size parameter is not properly chosen.
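The thresholding-plus-region-growing pipeline described above can be sketched as follows (a minimal NumPy illustration, not taken from the thesis; the gray-level range and minimum-area values are arbitrary examples):

```python
import numpy as np
from collections import deque

def threshold(img, lo, hi):
    """Binary mask of pixels whose gray level falls in [lo, hi]."""
    return (img >= lo) & (img <= hi)

def remove_islands(mask, min_area):
    """Keep only 4-connected components of at least min_area pixels
    (region growing by breadth-first flood fill)."""
    out = np.zeros_like(mask)
    seen = np.zeros(mask.shape, dtype=bool)
    H, W = mask.shape
    for sy in range(H):
        for sx in range(W):
            if mask[sy, sx] and not seen[sy, sx]:
                comp, q = [], deque([(sy, sx)])
                seen[sy, sx] = True
                while q:                      # grow the region from the seed
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < H and 0 <= nx < W \
                           and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                if len(comp) >= min_area:     # drop islands below the area parameter
                    for y, x in comp:
                        out[y, x] = True
    return out
```

As the text notes, the result depends critically on the chosen gray-level range and island-size parameter, and the object boundaries remain unsmoothed.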
2.1.2 Local Feature Detection
Another low-level approach is the use of operators to detect local dissimilarities in the image. The most common of these operators are edge detectors [42] such as the Roberts, Sobel, or Prewitt operators or the Canny edge detector [8]. Typically these operators deliver a multitude of non-connected edges which then have to be linked together and identified with object boundaries. The possible presence of other objects, spurious edges not corresponding to desired features, and the presence of invisible contour parts in MR images prevent us from directly using local contour features to solve our segmentation and outlining problem.
2.1.3 Template Matching
If models or templates of the desired contour are available, one can try to find the desired object by finding the maximum of the correlation of these templates with the image or an edge map. However, this method is computationally expensive and very prone to misidentifying other structures in the image as the heart [28], despite well-fitting templates. Another problem is the principal difficulty of detecting a highly deformable object of possibly very different shape using rigid templates.
2.1.4 Deformable Models
Deformable models such as Snakes introduced by Kass et al. [45], are curves or surfaces defined within an image domain. They are designed to be attracted to image features (such as edges, which act as external forces on the shape model) while maintaining internal shape constraints (such as smoothness), thus progressively changing their shape in an effort to locate a desired structure in the image. The solution to these systems generally involves minimizing an energy function which quantifies the shape of the model and image information near the boundary of the object. To avoid getting stuck in local minima, most model-based techniques require the model to be initialized near the solution or supervised by a rule based system that can decide which minima to ignore. Snakes have been extended to accommodate splitting and merging. McInerney surveys the use of deformable models in image segmentation in [56].
Traditional Snakes

A traditional snake is a curve x(s) = (x(s), y(s)), s ∈ [0, 1], that moves through the image data to minimize an energy

E = E_int + E_ext,  (2.1.1)

where E_int is the internal energy and E_ext the external energy, defined as

E_int = c₀ ∫₀¹ ‖x′(s)‖² ds + c₁ ∫₀¹ ‖x″(s)‖² ds,  (2.1.2)

E_ext = ∫₀¹ P(x(s)) ds.  (2.1.3)
The potential function P(x) = P(x, y) is derived from the image so that it has a minimum value along the boundaries of interest. It can be shown using the calculus of variations that a curve minimizing E must satisfy the Euler equation

−∇P(x(s)) + 2α x″(s) − 2β x⁗(s) = 0  (2.1.4)

where ∇ is the gradient operator, and α and β are weighting parameters that control the deformation's tension and rigidity. This equation can be viewed as a force balance, as in equation 2.1.5, where the negative gradient of the potential balances the sum of the two internal forces arising from the elasticity and rigidity of the snake:

F_int + F_ext = 0  (2.1.5)
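The force balance suggests a simple discrete evolution scheme. The following sketch (an assumed explicit Euler discretization on a closed contour, not the thesis implementation; constant factors are folded into α and β) approximates x″ and x⁗ with cyclic finite differences and moves each snake point under the internal forces plus a sampled external force:

```python
import numpy as np

def snake_step(pts, fext, alpha=0.1, beta=0.01, tau=0.1):
    """One explicit Euler step of the snake force balance on a closed contour.
    pts: (N, 2) contour points; fext: (N, 2) external force sampled at pts."""
    # cyclic second difference approximates x''(s)
    d2 = np.roll(pts, -1, axis=0) - 2 * pts + np.roll(pts, 1, axis=0)
    # second difference of d2 approximates x''''(s)
    d4 = np.roll(d2, -1, axis=0) - 2 * d2 + np.roll(d2, 1, axis=0)
    # move points along the net force (elasticity - rigidity + external)
    return pts + tau * (alpha * d2 - beta * d4 + fext)
```

Under pure internal forces (fext = 0) the elasticity term makes a closed contour shrink steadily, which is why a snake needs an external potential to lock onto image boundaries.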
Gradient Vector Flow Snakes (GVF Snakes)

The gradient vector flow (GVF) snake [84] replaces the external force term in equation 2.1.5 by a gradient vector flow field, as in equation 2.1.6, derived from the image by minimizing a certain energy functional in a variational framework. The minimization is achieved by solving a pair of decoupled linear partial differential equations which diffuse the gradient vectors of a gray-level or binary edge map computed from the image:

v + 2α x″(s) − 2β x⁗(s) = 0  (2.1.6)

The GVF is the vector field v(x, y) = [u(x, y), v(x, y)] that minimizes the energy functional

E = ∬ μ(ux² + uy² + vx² + vy²) + |∇f|² |v − ∇f|² dx dy  (2.1.7)
where f represents the edge map derived from the image. Using the calculus of variations, it can be shown that the GVF field can be found by solving the following Euler equations

μ∇²u − (u − fx)(fx² + fy²) = 0  (2.1.8)

μ∇²v − (v − fy)(fx² + fy²) = 0  (2.1.9)

where ∇² is the Laplacian operator. In conclusion, the main advantages of the GVF snake can be summarized in the following points [83]:
• The GVF snake has a large capture range without distorting the boundary.
• The GVF snake can be initialized completely inside, outside, or across the target boundaries.
• The GVF field can smooth out weak gradients while maintaining strong gradients, which makes the deformable model perform better against strong image noise.
Still, the GVF model does not lead to satisfactory results for the problem defined in section 1.2, namely moving beyond the appearance structures towards the endocardial boundary, as shown in figure 2.1.
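Equations 2.1.8 and 2.1.9 can be treated as the steady state of a pair of diffusion-reaction equations and iterated explicitly. The sketch below is an assumed minimal NumPy version with periodic boundaries; the parameter values are illustrative, not from the thesis:

```python
import numpy as np

def gvf(f, mu=0.2, iters=200, dt=0.5):
    """Iterate the GVF diffusion equations (2.1.8)-(2.1.9) on an edge map f."""
    fy, fx = np.gradient(f.astype(float))     # edge-map gradient components
    mag2 = fx**2 + fy**2                      # (fx^2 + fy^2), the reaction weight
    u, v = fx.copy(), fy.copy()               # initialize the field with grad(f)

    def lap(a):                               # 5-point Laplacian, periodic borders
        return (np.roll(a, 1, 0) + np.roll(a, -1, 0)
                + np.roll(a, 1, 1) + np.roll(a, -1, 1) - 4 * a)

    for _ in range(iters):
        # diffusion spreads the vectors; the reaction term keeps them
        # equal to grad(f) where the edge response is strong
        u = u + dt * (mu * lap(u) - (u - fx) * mag2)
        v = v + dt * (mu * lap(v) - (v - fy) * mag2)
    return u, v
```

The explicit step size must respect the usual diffusion stability bound (dt·μ·4 ≤ 1 for this stencil), which the illustrative defaults satisfy.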
Figure 2.1: Results of GVF applied for segmenting the endocardial boundary (Chenyang Xu, CVPR'97).

There are numerous other papers in the literature on deformable contours. The variations involve different data models, different forms of curve parameterizations, and
different forms of smoothing penalties on the curves [13, 12, 11, 9, 10]. Other deformable approaches allow a user to specify only the distant end points of the curve, without having to supply a complete polygonal approximation [59, 78].
2.1.5 Statistical Prior Knowledge of Shape
The original snake formulation may be too general to give acceptable results when dealing with images where irregular shape and appearance are present due to, e.g., occlusions, closely located but irrelevant structures, and noise. This led to several techniques that utilize prior knowledge of object shape for segmentation, pioneered by the work on Active Shape Models (ASM) [19]. Additionally, introducing prior knowledge generally improves the segmentation results. ASM is a deformable shape modelling technique that is used for the segmentation of objects in digital images and has been used for locating anatomical structures in medical images [21, 20]. In ASM the statistical variation of shapes pertaining to a specific class of objects is modelled beforehand from a training set. An initial model guess is then applied and the model is allowed to deform according to the image data. Proposed deformations, which are chosen to minimize a certain energy (cost) function, are constrained to be consistent with the prior knowledge about the target object. The energy function is chosen in such a way that the model will be attracted to certain image features extracted from the gray-level values of the image. Several enhancements and improvements of the basic ASM method were developed [67]. An automatic landmark generation algorithm was proposed in [35]. A multi-resolution implementation of ASM was presented in [22]. Active Appearance Models (AAM) were introduced in [15], where a set of statistical models of shape and gray-level appearance is built to generate model parameters. Both ASM and AAM require the model to be initialized near the true shape. Chapter 4 describes the steps involved in AAM in more detail.
Chapter 3 Low-Level Visual Analysis Approach

This chapter describes a low-level, pixel-based method that uses a multiscale analysis technique based on mathematical morphology; features are derived after mapping the image into a nonlinear scale-space. Low-level methods operating on pixels have the potential to reduce errors made in model-based approaches due to inaccurate prior assumptions about the salient image features. The multiscale analysis used here differs from most low-level multiscale methods in that it exploits the derived features through defined descriptor scales, which allow independent structure analysis.
3.1 Introduction
The aim of the method introduced in this chapter is to define an analysis model of the left-ventricle image in order to extract visual structure features that are robust to varying imaging sequence conditions. This goal is achieved by decomposing the image into different structures defined in the scale-space domain using a low-level image analysis model. Scale-space extends and formalizes multi-resolution analysis by defining scale as a continuous parameter [79]. In a conventional multi-resolution method, coarse scale is associated with coarse resolution, and analysis takes place at a set of fixed scales. The scale-space representation, on the other hand, allows analysis over all scales.
Koenderink [38] formalized the causality requirement, which demands that a scale-space processor progressively simplify the signal as the scale parameter is increased. In addition to causality he introduced the further constraints of homogeneity and isotropy, which require all spatial points and all scales to be treated in the same manner. Under these conditions Koenderink showed that the scale-space representation of an image must satisfy the diffusion equation. However, the resulting scale-space of the linear diffusion method has several problems:
• Edges become blurred at increasing scales.
• An image at scale s may contain features at many scales.
• Multiple convolutions require significant computation.
The morphological scale-space decomposition based on multiscale spatial analysis represents an alternative, nonlinear method of scale-space filtering [53]. The approach comprises a powerful tool which presents many advantages: the preservation of scale-space causality, the localization of sharp edges [62, 87], and the reconstruction of the original image from the scale-space decomposition. An appropriate scale is defined as the scale that maximizes the response of the morphological filter through the scale-space at each point, giving constant scale values in a region of constant width. This chapter is organized as follows: in Sections 3.2 and 3.3, the required morphology operations are defined, their properties discussed, and the morphological scale-space introduced. In Section 3.4, the proposed method is described and demonstrated using example sequences. Finally, in Section 3.5, the developed approach is summarized.
3.2 Mathematical Morphology

3.2.1 Foundation and Notation
Mathematical morphology is a nonlinear analysis of signals using structuring elements. Two dual operations, erosion and dilation, are the most basic morphological operators: erosion is a shrinking operation, while dilation is an expanding one. By combining dilation and erosion, two new operations, opening and closing, can be defined. Morphological image processing defines a set of operations in which the spatial structure of objects within an image is modified using a mask of defined structure and size (the structuring element) scanned over the image. In the case of binary images, if the pattern of the mask matches the state of the pixels under the mask, the corresponding output pixel is set to "1"; if there is a mismatch, the output pixel is set to "0". This section is a basic introduction to the morphological operations. The original work by Matheron in 1975 [54] was extended by Serra [69, 44].

Erosion

Binary morphological erosion [33] is defined as the set of all points y for which the structuring element B completely fits inside the image X,

ε_B(X) = {y : ∀b ∈ B, y + b ∈ X}  (3.2.1)

X ⊖ B = ⋂_{b∈B} (X − b)  (3.2.2)

where ε_B(X) and X ⊖ B are alternative notations for the erosion of X by B. Equation 3.2.2 shows that binary erosion can be considered as an intersection of translates of the image determined by the structuring element. This can be extended to grayscale [73], i.e. the processing of a real function f(x) with a structuring element h(x) that need not be flat. Grayscale erosion is then defined as

ε_h f(x) = inf_{y∈H} [f(x + y) − h(y)]  (3.2.3)

If the space is discrete the infimum is replaced by the minimum operation,

ε_h f(x) = min_{y∈H} [f(x + y) − h(y)]  (3.2.4)
Dilation

Binary dilation [33] is the complement of erosion and defines the set of points y where the structuring element B overlaps the image X,

δ_B(X) = {y + b : y ∈ X, b ∈ B}  (3.2.5)

X ⊕ B = ⋃_{b∈B} (X + b)  (3.2.6)

again δ_B(X) and X ⊕ B are alternative notations for the dilation of X by B. Dilation is a union of translates of the image determined by the structuring element. This can also be extended to gray-level processing of a function f(x) with a structuring element h(x),

δ_h f(x) = sup_{y∈H} [f(x − y) + h(y)]  (3.2.7)

If the space is discrete the supremum is replaced by the maximum operation,

δ_h f(x) = max_{y∈H} [f(x − y) + h(y)]  (3.2.8)
Opening and Closing

The application of an erosion immediately followed by a dilation (min then max) using the same structuring element is referred to as an opening operation. The name is a descriptive one: the operation tends to open small gaps between touching objects in an image. This process succeeds in removing spurious black pixels, but does not remove the white ones.

ψ_h f(x) = δ_h(ε_h f(x))  (3.2.9)

This removes the positive extrema smaller than the structuring element size. The opposite of opening is closing, which is defined as a dilation followed by an erosion (max then min). If an opening creates small gaps in the image, a closing will fill them. The closing removes much of the white pixel noise, giving a fairly clean image.

γ_h f(x) = ε_h(δ_h f(x))  (3.2.10)

This removes the negative extrema smaller than the structuring element size.
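For flat (zero-height) structuring elements, grayscale erosion and dilation reduce to moving-window minimum and maximum, so all four operators can be sketched compactly (an illustrative NumPy reduction, assuming a k×k square structuring element rather than a disk):

```python
import numpy as np

def erode(f, k=3):
    """Flat grayscale erosion: minimum over a k x k window (+inf padding)."""
    p = k // 2
    g = np.pad(f.astype(float), p, constant_values=np.inf)
    H, W = f.shape
    return np.min([g[dy:dy + H, dx:dx + W]
                   for dy in range(k) for dx in range(k)], axis=0)

def dilate(f, k=3):
    """Flat grayscale dilation: maximum over a k x k window (-inf padding)."""
    p = k // 2
    g = np.pad(f.astype(float), p, constant_values=-np.inf)
    H, W = f.shape
    return np.max([g[dy:dy + H, dx:dx + W]
                   for dy in range(k) for dx in range(k)], axis=0)

def opening(f, k=3):
    """Erosion then dilation: removes positive extrema smaller than k."""
    return dilate(erode(f, k), k)

def closing(f, k=3):
    """Dilation then erosion: removes negative extrema smaller than k."""
    return erode(dilate(f, k), k)
```

The properties discussed in the next subsection can be checked directly on this sketch: opening is anti-extensive, closing is extensive, and both are idempotent.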
3.2.2 Morphological Properties
Now the main properties of the morphological operations are briefly covered, referring to [41]. To state the properties it is useful to introduce the symbol Φ for any of the morphological operators defined previously. In all cases a flat structuring element is assumed.
• Given an operator Φ, its dual is Φ∗(f) = −Φ(−f), i.e. the inverse of the dual operator applied to the inverse of the image. Erosion-dilation and opening-closing are duals.
• The operator Φ is monotonically increasing if g ≤ h implies Φ(g) ≤ Φ(h). This means that given an image that is a subset of another, the filtered output of the subset is a subset of the filtered output of the image. Erosion, dilation, opening, and closing are all increasing.
• The operator Φ is anti-extensive if Φ(f) ≤ f and extensive if f ≤ Φ(f). This describes operators whose output is contained in the original image and operators whose output contains it. Opening is anti-extensive and closing is extensive.
• The operator Φ is idempotent if Φ(Φ(f)) = Φ(f). This property means that once applied, repeated applications of the same filter have no effect. Opening and closing are idempotent.
3.3 Morphological Scale-Space

3.3.1 Scale-Space
In scale-space theory one embeds an image f : ℝ² → ℝ into a continuous family (T_t f | t ≥ 0) of gradually smoother versions of it. The original image corresponds to the scale t = 0, and increasing the scale should simplify the image without creating spurious structures (figure 3.1). Since a scale-space introduces a hierarchy of the image features, it constitutes an important step from a pixel-related image description to a semantic image description [81, 46]. The scale-space image is generated by convolution of the original image with Gaussians of increasing width. This Gaussian scale-space is equivalent to calculating (T_t f)(x) as the solution u(x, t) of the linear diffusion process

∂_t u = ½ ∇²u = ½ Σ_{i=1}^{N} ∂_{x_i x_i} u  (3.3.1)

with initial condition u(x, 0) = f(x), or equivalently, by convolution with the Gaussian kernel u(x, t) = g(x, t) ∗ f(x), where g(x, t) : ℝ^N × ℝ → ℝ is given by

g(x, t) = (2πt)^{−N/2} e^{−(x₁² + … + x_N²)/(2t)},  x = (x₁, …, x_N)ᵀ  (3.3.2)
Although the traditional scale-space theory provides a well-founded framework for handling image structures at different scales, it does not directly address the problem of selecting appropriate scales and structures from the scale-space representation for further analysis [7].
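For reference, the kernel of equation 3.3.2 can be sampled directly in one dimension (a small illustrative helper; the grid spacing and truncation radius are arbitrary choices):

```python
import numpy as np

def gaussian_kernel(t, half=20.0, dx=0.1):
    """Sample g(x, t) = (2*pi*t)^(-1/2) * exp(-x^2 / (2t)) on [-half, half]."""
    x = np.arange(-half, half + dx / 2, dx)
    return x, np.exp(-x**2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)
```

Increasing t lowers and widens the kernel while preserving unit mass, so convolving with it progressively smooths the image, consistent with the causality requirement.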
Figure 3.1: Scale-space filtering.
3.3.2 Multiscale Morphology
With a scaled structuring function, the morphology operations are joined at zero scale (i.e. the original signal) to form a single multiscale operation which unifies the morphological operations [82] as follows:

Definition 3.3.1. The multiscale dilation-erosion of a signal f(x) by the scaled structuring function h_σ(x) is denoted by f ~ h_σ, where σ is the scale parameter, and is defined by

f ~ h_σ = (f ⊕ h_σ)(x) if σ > 0;  f(x) if σ = 0;  (f ⊖ h_σ)(x) if σ < 0.  (3.3.3)

In other words, a scale-space is constructed from the morphological multiscale operation. The justification for joining the multiscale dilation and erosion would, however, be considerably strengthened if both operations approached f(x) as the scale parameter approaches zero from above and below; this can be shown on grounds of continuity, referring to [41]. With this method scale may be negative; it is |σ| which corresponds to the intuitive notion of scale. Unlike linear operators, dilation and erosion are non-self-dual [70], and positive and negative scales in scale-space contain differing aspects of the information in a signal: positive scales relate to local maxima in the signal, whereas negative scales correspond to local minima. The scale-space image F : D ⊆ ℝⁿ × ℝ → ℝ is defined by

F(x, σ) = (f ~ h_σ)(x)  (3.3.4)
where the (n + 1)-dimensional space given by D × ℝ is called the multiscale dilation-erosion scale-space. The geometric visualization of dilation and erosion is intuitively helpful [63]. For the moment, take the scaled structuring function to be an n-dimensional ball with the radius as the scale parameter: a positive radius corresponds to rolling the ball along the top of the surface of the signal, and a negative radius to rolling the ball along the underneath. The smoothed signal can be visualized as the surface traced out by the center of the ball as it is rolled over the top (dilation) or underneath (erosion) of the surface of the signal. We illustrate this operation for a 1-D signal in figure 3.2.
Figure 3.2: Smoothing of 1D signal by Multiscale Morphological Dilation-Erosion.
Intuitively, this new surface is smoother (in the sense of having flatter and fewer hills) than the original signal, and the larger the radius, the smoother the filtered surface becomes. In the limit, as the radius approaches zero the original signal is recovered (for continuous signals), and as the radius approaches infinity the output becomes flat. For positive and negative scales, it should be apparent that if the ball touches the top of a hill (local maximum) then a hill will appear on the output at exactly that point. If, however, the radius is such that the ball is prevented from touching that hill by nearby hills, then no hill will appear at that point on the output; more importantly, that hill cannot reappear for any increased value of the radius r. In more precise terms: the number of local maxima is a monotone decreasing function of r.
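The signed-scale smoothing just described can be sketched in one dimension with a flat disk (interval) structuring function: positive σ rolls over the top (dilation), negative σ underneath (erosion). This is an illustrative NumPy reduction, with edge replication at the borders as an assumed boundary handling:

```python
import numpy as np

def multiscale_de(f, sigma):
    """Multiscale dilation-erosion of a 1-D signal with a flat disk of
    radius |sigma| (eq. 3.3.3): max filter for sigma > 0, identity at 0,
    min filter for sigma < 0."""
    if sigma == 0:
        return f.astype(float)
    r = int(abs(sigma))
    g = np.pad(f.astype(float), r, mode='edge')       # replicate border values
    win = [g[i:i + len(f)] for i in range(2 * r + 1)]  # all shifts in the window
    return np.max(win, axis=0) if sigma > 0 else np.min(win, axis=0)
```

At σ = 0 the original signal is recovered, and the extensive/anti-extensive behavior of the two branches mirrors the ball-rolling picture above.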
In the same way, scaled morphological opening and closing can be combined:

Definition 3.3.2. The multiscale closing-opening¹ of the signal f(x) by the scaled structuring function h_σ(x) is denoted by f ⊚ h_σ and is defined by

f ⊚ h_σ = (f • h_σ)(x) if σ > 0;  f(x) if σ = 0;  (f ◦ h_σ)(x) if σ < 0.  (3.3.5)

3.3.3 Morphology with Scaled Structuring Functions
In image analysis it is often desirable to ensure isotropic properties in any operation. This requirement translates directly into the morphological structuring functions being circularly symmetric. Commonly used circularly symmetric functions are hemispheres, cylinders, cones, and paraboloids. In almost all practical cases we will use the smaller class of continuous convex structuring functions. Typical scaled structuring functions include [40]:
• flat structuring functions: h_σ(x) = 0 for x ∈ H, particularly scaled disks where H = {x : ‖x‖ ≤ |σ|};
• spheres: h_σ(x) = −|σ|(1 − (1 − ‖x/σ‖²)^{1/2}) for ‖x‖ ≤ σ;
• circular poweroids: h_σ(x) = −|σ| ‖x/σ‖^α for α ≥ 1, particularly the circular paraboloid α = 2;
• elliptic poweroids: h_σ(x) = −|σ| (√(xᵀAx)/|σ|)^α for α ≥ 1, where A is a symmetric positive definite matrix.
Examples of various 2-D circular structuring functions are shown in figure 3.3.
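Two of the listed structuring functions can be generated on a discrete grid as follows (an illustrative sketch; sampling the continuous functions onto integer offsets is an assumption, and −∞ is used to mark points outside a flat element's support):

```python
import numpy as np

def disk(sigma):
    """Flat disk: h = 0 inside radius |sigma|, -inf outside the support."""
    r = int(abs(sigma))
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    inside = x * x + y * y <= sigma * sigma
    return np.where(inside, 0.0, -np.inf)

def poweroid(sigma, alpha=2.0):
    """Circular poweroid h(x) = -|sigma| * ||x / sigma||^alpha
    (alpha = 2 gives the circular paraboloid)."""
    r = int(abs(sigma))
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    return -abs(sigma) * (np.sqrt(x * x + y * y) / abs(sigma)) ** alpha
```

Both are non-positive with their apex at the origin, matching the shapes shown in figure 3.3.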
3.4 Morphological Scale-space Decomposition
As mentioned previously, in order to segment the endocardial contour we must first develop a model to segment the appearance structures found adjacent to the boundary and classify them as part of the inner cavity. This can be achieved by using a multiscale decomposition of the image, as described throughout this section: first some background points are discussed, and then the developed approach is described.

¹ The symbol "•" refers to the closing operator; the symbol "◦" refers to the opening operator.
Figure 3.3: 2D circular structuring functions.(a) flat (i.e. cylinder); (b) sphere; (c) poweroid with α = 2 (i.e. paraboloid); (d) poweroid with α = 4 (i.e. quartoid).
Using a pure scale-space technique does not require any prior knowledge about the image content and is therefore suitable as a universal image representation. However, when one attempts to extract higher-level information from the representation, an important question arises: if no scale is special in any way, how do we know at which scale level the interesting information can be found? A very interesting answer to this question was given by Lindeberg [52]. He proposed to measure local appropriate scales which optimize the trade-off between smoothing and feature visibility. These measurements are then used to appropriately tune subsequent operators. Lindeberg defines the appropriate scale as the scale that maximizes the response of certain nonlinear operators with respect to scale [51]. For each basic feature type (blobs, corners, edges) a different operator is used. For example, a measure for the sizes of blobs and ridges, i.e. local extrema of image brightness, is obtained from the magnitude of the scale-normalized Laplacian of the Gaussian:

S_A(x) = arg max_s |s² ∇²_x G(x, s)|  (3.4.1)

where s is the scale parameter and G(x, s) is the Gaussian kernel. These functions give good results near the centers of blobs. However, near edges they reflect the sharpness of the edges rather than the sizes of the nearest blobs. Consequently, one has to localize blobs, respectively edges, before the results of equation 3.4.1 can be interpreted correctly. Köthe [49] suggested that the appropriate scale measurements would be even more useful if they were available prior to feature detection. To achieve this, an operator is needed that works uniformly all over the image, regardless of what feature type a pixel belongs to. As an immediate consequence of this requirement such an operator must be based on regions, because measurements at edges and corners would never be able to cover the entire image in a uniform manner. The difference between the two methods is illustrated by figure 3.4.
Figure 3.4: Left: test image containing regions that have different constant widths. Center: appropriate scales measured by eq. 3.4.1 are always small near edges. Right: magnitudes of appropriate scales measured by Köthe's method are constant within each region.

In order to better understand the issues involved in Köthe's idea, the notion of a uniform scale measurement needs to be clarified. Since it must be based on regions, its most intuitive behavior would be to associate the appropriate scale of a point with the width of the region it belongs to. In particular this means that all points in a region of constant width should have the same scale value. This enables him to define the local appropriate scale on the basis of a morphological band-pass filter as

S_A = arg max_{s_k = s_{−n},…,s_{−1},s_1,…,s_n} [ B_{s_k}^{s_{k∓1}} / (s_k − s_{k∓1}) ]  (3.4.2)

where S_A is the appropriate scale; s_{k−1} applies if s_k > 0 and s_{k+1} if s_k < 0. The expression B_{s_k}^{s_{k∓1}} / (s_k − s_{k∓1}) is the normalized band-pass filter. The reason for using a morphological filter is that morphological operations are more sensitive to geometrical shape than convolution-based operators, which helps to solve the two main problems pointed out by Köthe:
1. A point may belong to regions at different scales simultaneously. These different regions must be identified, and the size of the most salient among them should determine the appropriate scale.
2. The width of a region must be defined and measured at every point without making unnecessary assumptions about possible region properties.
In the next sections a model based on Köthe's idea is developed and applied to the spatial heart sequences, using the morphological opening-closing scale-space decomposition and defining one local appropriate scale for both the opening and closing decompositions. Contrary to the existing approach, our extension utilizes two local appropriate scales (referred to throughout the thesis as descriptor scales), representing the opening and closing decompositions separately. As a result, features are extracted from the left-ventricle MR images. The use of a scale-based decomposition allows the feature space to be defined using scale information rather than pixel intensity values directly.
3.4.1 Morphological Filtering
When dealing with a continuous signal, a filter is commonly defined as any operator that is linear, continuous, and invariant under translation. Any filter in the above sense can be expressed as a convolution product of the signal with a convolution kernel [64]. It is also common to imply some frequency-selective properties, such as being band-pass. Algebraically, an ideal band-pass filter has the property of idempotence, which means that a signal once filtered is unchanged by a second identical filtering; that is, if f is a real-valued signal on ℝⁿ and ψ is a mapping of real-valued functions from ℝⁿ to ℝⁿ, then ψ(ψ(f)) = ψ(f) [66]. This is true for an ideal band-pass filter, since once part of the signal spectrum is lost it will not be further affected by filtering.

Definition 3.4.1. A low-pass filter with respect to the structure size has the following properties [48]:
1. The filter should be isotropic.
2. Blobs smaller than |s| (i.e. local maxima if s < 0 and minima if s > 0) should not be present in the filtered image.
3. Blobs larger than |s| should not be affected.
A special case of property 3 is that a blob of infinite size, e.g. a single step edge, must not be changed for any finite |s| < ∞. This leads to the following proposition:

Proposition 3.4.1. Disks with radius s are the only isotropic, anti-convex structuring functions that do not change a blob of infinite size under opening or closing.

The relationship between blobs and opening-closing operations is established by the following proposition:

Proposition 3.4.2. Morphological opening and closing with disk structuring functions d_s are perfect low-pass filters with respect to the blob size, i.e.:
1. Light blobs smaller than |s| are not present in an image that is opened with the structuring function d_s. Dark blobs smaller than |s| are not present in an image that is closed with the structuring function d_s.
2. Points that constitute light blobs of size |s| or larger are not modified by an opening with structuring function d_s; points that constitute dark blobs of size |s| or larger are not modified by a closing with structuring function d_s.
Note also that the disk structuring functions have another unique property: as opposed to non-flat structuring functions they conserve dimensionality with respect to image brightness:

((λf) ⊗ d_s)(x) = λ(f ⊗ d_s)(x)  (3.4.3)

where ⊗ denotes any morphological operation. This property is important since the absolute relationship between image brightness and light intensity is usually unknown. As long as the interesting information lies in the relative brightness, an image operator should be invariant with respect to rescaling of f(x). Therefore disk structuring functions are used.

Definition 3.4.2. An image that has been morphologically high-pass filtered does not contain blobs of size s and larger, according to the following equation:

H(x, s) = f(x) − F(x, s) = f(x) − (f • d_s)(x) if s > 0;  0 if s = 0;  f(x) − (f ◦ d_s)(x) if s < 0.  (3.4.4)
Definition 3.4.3. A band-pass filter is defined by combining low-pass and high-pass filters with respect to blob size, and has the following properties:
1. The filter is isotropic.
2. Structures smaller than |s_l| are not present in the filtered image B_{s_l}^{s_u}(x).
3. Structures larger than |s_u| are not present in the filtered image.
The simplest way to define the band-pass filter is to apply the high-pass filter first and then low-pass filter the result, as a generalization of Wang et al. [77], starting at the coarsest scale and proceeding to the finer scales, according to the following:
1. A family of morphological band-pass filters with limiting blob sizes −∞ = s_{−n−1} < s_{−n} < … < s_0 = 0 < … < s_n < s_{n+1} = ∞ is obtained by the following formula:

for s_k ≥ 0:  H_{n+1} = f,  B_n = H_{n+1} • d_n,  H_n = H_{n+1} − B_n  (3.4.5)

and for s_k ≤ 0:  H_{n+1} = f,  B_n = H_{n+1} ◦ d_n,  H_n = H_{n+1} − B_n

The resulting B_n represent a morphological decomposition of the image into bands of different structure sizes with light and dark blobs, as shown in figure 3.5, limited by the disk structuring element d_n of size n. The H_n are intermediate high-pass filtered images.
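The positive-scale half of the recursion in equation 3.4.5 can be sketched in one dimension with flat closings (an assumed NumPy reduction: interval structuring elements replace the disks d_n, and including scale 0 makes the final residual vanish so that the bands sum back to f, as in equation 3.4.6):

```python
import numpy as np

def close1d(f, r):
    """1-D flat closing: dilation (max) then erosion (min), window radius r."""
    if r == 0:
        return f.copy()

    def filt(g, op, pad):
        gp = np.pad(g, r, constant_values=pad)
        return op([gp[i:i + len(g)] for i in range(2 * r + 1)], axis=0)

    return filt(filt(f, np.max, -np.inf), np.min, np.inf)

def bandpass_decompose(f, scales):
    """Band decomposition of eq. (3.4.5), positive scales:
    H_{k+1} = current residual, B_k = closing of H_{k+1} at scale s_k,
    H_k = H_{k+1} - B_k. Starts at the coarsest scale."""
    H, bands = f.astype(float), []
    for s in sorted(scales, reverse=True):
        B = close1d(H, s)
        bands.append(B)
        H = H - B          # residual carries the remaining (finer/darker) blobs
    return bands, H
```

By construction the bands and the final residual sum back to the original signal, which is the 1-D analogue of the reconstruction property stated next.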
Figure 3.5: Band-pass decomposition of an image (center) with respect to structure sizes (the size is shown above each scale). Left, from top to bottom: s = 1, 2, 4, 8, 16, 32, dark blobs. Right: s = −1, −2, −4, −8, −16, −32, light blobs.
2. The original image can be reconstructed from either the positive or the negative part of the decomposition, as shown in figure 3.6:

Σ_{k=0}^{n} B_{s_k}^{s_{k+1}}(x) = Σ_{k=−n}^{0} B_{s_k}^{s_{k−1}}(x) = f(x)   (3.4.6)
Figure 3.6: Reconstruction of the original image from opening-closing scale decomposition.
3.4.2 Descriptor Scales in Opening-Closing Scale-space
The descriptor scale is defined as the scale that maximizes the response of the band-pass morphological filter at each point in the image. This scale takes constant values in a region of constant width. The definition was given by Köthe [47], who defined a single descriptor scale for the whole morphological scale-space decomposition domain (opening and closing). In contrast, we define one descriptor scale for the opening scale-space decomposition and another for the closing scale-space decomposition (figures 3.7 and 3.8), through the following equations:

S_closing = arg max_{s_k = s_1, …, s_n}  B_{s_k}^{s_{k+1}} / (s_{k+1} − s_k)   (3.4.7)

S_opening = arg max_{s_k = s_{−n}, …, s_{−1}}  B_{s_k}^{s_{k−1}} / (s_k − s_{k−1})   (3.4.8)

where S_closing and S_opening are the two descriptor scales obtained from the closing and opening scale-space decompositions respectively. B_{s_k}^{s_{k+1}} is the band-pass filter for the positive scales k > 0 (closing scale-space decomposition) in equation 3.4.7, and B_{s_k}^{s_{k−1}} is the band-pass filter for the negative scales k < 0 (opening scale-space decomposition) in equation 3.4.8, as in figures 3.12 and 3.13.
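A minimal sketch of the per-pixel argmax in equation 3.4.7, assuming the band images are already available; the width normalisation, the function name, and the random toy bands are assumptions for illustration:

```python
# Descriptor scale sketch: at every pixel, pick the scale whose
# (width-normalised) band-pass response is maximal.
import numpy as np

SIZES = (1, 2, 4, 8, 16, 32)

def descriptor_scale(bands, sizes=SIZES):
    widths = np.diff(list(sizes) + [2 * sizes[-1]])   # band widths (assumption)
    responses = np.stack([np.abs(b) / w for b, w in zip(bands, widths)])
    best = np.argmax(responses, axis=0)               # winning scale index per pixel
    return np.asarray(sizes)[best]                    # map index back to blob size

rng = np.random.default_rng(0)
bands = [rng.random((8, 8)) for _ in SIZES]           # toy band-pass responses
S_closing = descriptor_scale(bands)
```

The result is an image-sized map whose value at each pixel is the structure size that responded most strongly there, which is exactly what the colour-mapped descriptor scales in figures 3.12 and 3.13 visualize.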
Figure 3.7: Left: Descriptor scale of the closing scale-space decomposition. Right: Descriptor scale of the opening scale-space.
Figure 3.8: Left: Descriptor scale of the closing scale-space decomposition. Right: Descriptor scale of the opening scale-space. The variation in gray intensities comes from the maximum response over the scale-space decomposition.
3.4.3 Local Feature Extraction
As mentioned before, the appearance structures prevent an accurate segmentation of the endocardial boundary. If we extract local features that guide the segmentation of the inner cavity and localize the appearance structures inside it, we can segment the endocardial border (figure 3.9). This is achieved by utilizing the morphological scale-space decomposition, which decomposes the image into scales of different structure sizes, and by defining the descriptor scales, which provide a visual measure for constant structure sizes, in order to determine which scales describe the inner cavity and the appearance structures (figure 3.10).
Figure 3.9: Extracting the inner cavity and locating the appearance structures.

The morphological band-pass filter based on equation 3.4.5 is applied to the region of interest (ROI; the ROI is manually extracted), using a flat disk structuring element of logarithmically increasing size (1, 2, 4, 8, 16, 32), obtaining a closing scale-space decomposition and an opening scale-space decomposition respectively. Figure 3.11 shows that every scale presents all structures of equal size found in the region of interest of the image. Applying equations 3.4.7 and 3.4.8 to the closing and opening scale-spaces separately, two descriptor scales are obtained, as shown in figures 3.12 and 3.13. For better visualization, we framed each scale by a unique color and mapped it into a color domain, in order to extract all the needed features.
Figure 3.10: Segmentation process architecture. First, the morphological scale-space and the scale descriptors are calculated. Second, both the inner cavity and the appearance structures are segmented. Finally, by combining the previous segmentation processes, an estimated endocardial boundary is detected.
Figure 3.11: Decomposition of the ROI (Region Of Interest) with respect to structure sizes, using the closing and opening morphological operators.
Figure 3.12: Left: descriptor scale of the closing decomposition. Right: the closing descriptor scale mapped into a color domain. The scales are shown around the right figure with their corresponding colors.
Figure 3.13: Left: descriptor scale of the opening decomposition. Right: the opening descriptor scale mapped into a color domain. The scales are shown around the right figure with their corresponding colors.
Inner Region Segmentation

One of our main interests in this section is to determine the best scale containing an almost complete presentation of the inner cavity. Since the inner area is brighter than the surrounding boundary, the opening scale-space decomposition can locate and describe this characteristic. This can be seen from the opening morphology definition, where two operations are performed on the region of interest: first each pixel value is replaced with the brightest value in its neighborhood; then, using this image, each pixel value is replaced by the darkest value in a neighborhood of the same size, rejecting noise and dark values from bright regions. Thus different structures of bright areas can be presented using the opening scale-space decomposition; figure 3.14 shows a one-dimensional representation of the image after applying this morphological process.
Figure 3.14: Schematic diagram of the operation of grey scale opening (erosion and dilation) in one dimension, showing (starting from the top): the original profile with result of first (maximum) pass, producing a new profile through brightest points; the second step in which a new profile passes through the darkest (minimum) points in the result from step 1; comparison of the final result to the original profile, showing rejection of noise and dark spikes.
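The two-pass filtering described above can be reproduced on a one-dimensional profile with scipy's 1-D rank filters; the window size and the toy profile with a single dark spike are assumptions for illustration:

```python
# 1-D sketch of the two passes described in the text: a maximum pass
# through the brightest points followed by a minimum pass with the same
# window, which rejects dark spikes narrower than the window.
import numpy as np
from scipy.ndimage import maximum_filter1d, minimum_filter1d

profile = np.full(40, 100.0)
profile[20] = 10.0                           # a one-pixel dark spike (noise)

bright = maximum_filter1d(profile, size=5)   # pass 1: brightest in window
result = minimum_filter1d(bright, size=5)    # pass 2: darkest in window
# the narrow dark spike is removed while the flat profile is preserved
```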
Segmenting the inner cavity poses a problem: if one traces the size of the inner cavity across the frames of the heart motion, one finds that it changes due to the expansion and contraction of the heart (chapter 1, figure 1.2). Using the opening scale-space decomposition, this problem becomes apparent, as shown in figure 3.15. The question is how the right scale, representing the inner cavity, can be selected automatically from the scale-space decomposition. The idea of choosing the best scale representing the inner cavity depends on finding the area covered by each scale in the mapped opening descriptor scale. Instead of searching for the best scale in the whole scale-space decomposition domain, the logarithmically increasing structuring element of the morphological scale-space decomposition allows us to restrict the search to only two scales. The selection of those scales was based on a statistical validation on 279 images, in which the size of the structuring element corresponding to the inner cavity oscillated between two scales. Depending on the image, the inner cavity is present either in the scale marked green or in the scale marked blue of the opening scale-space decomposition (figure 3.15).
Figure 3.15: Scale selection. Left: original images for two different phases of the same case. Middle: the decomposition scale marked green. Right: the decomposition scale marked blue.
The process of finding the best scale is summarized as follows:

1. We apply the opening scale-space decomposition to the image, obtaining the scale-space shown in figure 3.16. In each of the marked scales (green and blue) we segment the brightest connected area; the two segmentations are then compared using the opening descriptor scale to vote for the scale that best represents the inner cavity. The method utilizes a region growing algorithm that starts from a selected seed point and agglomerates pixels belonging to the same cluster, defined by an intensity threshold. The implementation of the algorithm is based on the description in [37]. The seed point is selected as the brightest pixel in the 8-connected neighborhood of the center of mass of each scale, where the center of mass is calculated from the moments

m_pq = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} x^p y^q S(x, y)   (3.4.9)

x̂ = m_10 / m_00,   ŷ = m_01 / m_00

where S is the given scale of size (M × N), m_pq is the moment transformation, and x̂ and ŷ are the coordinates of the center of mass, as shown in figure 3.17.
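A rough sketch (not the thesis code) of this seed selection: the centre of mass is computed from the raw moments of equation 3.4.9, the brightest pixel in its 8-connected neighbourhood becomes the seed, and a simple intensity-threshold region growing follows. The function names, the tolerance `tol`, and the toy scale image are assumptions:

```python
import numpy as np
from scipy import ndimage

def center_of_mass(S):
    """x_hat = m10/m00, y_hat = m01/m00 from the raw moments m_pq."""
    M, N = S.shape
    x = np.arange(M)[:, None]
    y = np.arange(N)[None, :]
    m00 = S.sum()
    return (x * S).sum() / m00, (y * S).sum() / m00

def seed_point(S):
    """Brightest pixel in the 8-neighbourhood of the centre of mass."""
    xh, yh = (int(round(c)) for c in center_of_mass(S))
    x0, y0 = max(xh - 1, 0), max(yh - 1, 0)
    nb = S[x0:xh + 2, y0:yh + 2]
    dx, dy = np.unravel_index(np.argmax(nb), nb.shape)
    return x0 + dx, y0 + dy

def grow_region(S, seed, tol=10.0):
    """Connected component of pixels within tol of the seed intensity."""
    mask = np.abs(S - S[seed]) <= tol
    labels, _ = ndimage.label(mask)
    return labels == labels[seed]

S = np.zeros((16, 16))
S[4:12, 4:12] = 200.0                     # a bright square standing in for the cavity
region = grow_region(S, seed_point(S))
```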
Figure 3.16: The opening scale-space decomposition of the image. The scales are presented from left to right and from top to bottom; each scale represents a certain structure size of the image.

2. We now have two segmented regions resulting from applying the region growing algorithm to each of the two scales. For every pixel (i, j) belonging to a segmented region we look up, using the mapped opening descriptor scale, which scale it belongs to. We thereby obtain an accumulator vector of size (1 × 6) whose elements (indexes) correspond to the scales and whose entries (cells) count the pixels (figure 3.18).
Figure 3.17: (a) and (c) The scales marked green and blue, with the seed points initializing the region growing algorithm. (b) and (d) The regions segmented by the algorithm.
Figure 3.18: Building the counter accumulator vectors for the segmented regions.
3. Finally, we compare the two segmented regions from both scales; the segmented region that covers the maximum area of its scale determines the best scale for the inner cavity. Figure 3.19 illustrates the process of choosing the best scale.
Figure 3.19: Left: the counter accumulator for the region segmented from the scale marked green. Right: the counter accumulator for the region segmented from the scale marked blue. The maximum accumulator count votes for the scale representing the inner cavity.
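The accumulator-and-vote step can be sketched as follows; the scale values, function names, and the toy descriptor image are illustrative assumptions:

```python
# Voting sketch: for each candidate region, count via the mapped
# descriptor scale how many of its pixels fall into each of the six
# scales, then let the region covering its own scale best win.
import numpy as np

SIZES = (1, 2, 4, 8, 16, 32)

def accumulator(region_mask, descriptor):
    """1x6 vector: pixel count per scale inside the segmented region."""
    return np.array([(descriptor[region_mask] == s).sum() for s in SIZES])

def vote(acc_green, acc_blue, scale_green, scale_blue):
    """The region with the larger count in its own scale wins."""
    g = acc_green[SIZES.index(scale_green)]
    b = acc_blue[SIZES.index(scale_blue)]
    return scale_green if g >= b else scale_blue

desc = np.full((8, 8), 4)
desc[2:6, 2:6] = 8                                        # patch belonging to scale 8
green = np.zeros((8, 8), bool); green[2:6, 2:6] = True    # candidate at scale 8
blue = np.zeros((8, 8), bool); blue[0:2, 0:2] = True      # candidate at scale 16
best = vote(accumulator(green, desc), accumulator(blue, desc), 8, 16)
```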
Inner Appearance Structures Segmentation

We now need to determine the appearance structures in the inner cavity since, as previously mentioned, those structures complicate the inner boundary segmentation, as shown in figure 3.20.
Figure 3.20: a. The inner cavity with appearance structures. b. Applying an edge detector algorithm will segment those appearance structures as part of the inner boundary. c. The desired inner boundary must be beyond those appearance structures.
From the closing descriptor scale-space we observed that the appearance structures are located in the scale marked black of the closing scale-space decomposition. This observation is plausible given the properties of the closing morphology operator as extended by the closing scale-space decomposition, where structures smaller and larger than the structuring element are removed, according to the band-pass filter definition 3.4.3. The appearance structures share the property of having a gray-level value lower than the surrounding structures. Besides, they all share the same size in the entire available data set. In figures 3.21 and 3.12 the reader can see that those structures are visualized in black in the mapped closing descriptor scale.
Figure 3.21: The closing scale-space decomposition of the image. The scales are presented from left to right and from top to bottom; each scale represents a certain structure size of the image.

To segment the appearance structures, we used a simple threshold method: a range of gray-level values is defined in the chosen scale, the pixels within this range are selected as foreground, and all other pixels are rejected to the background (displayed as a binary image). The threshold is adjusted using the histogram of the gray levels; we fixed the percentage of black pixels at 50%, i.e. the median gray level of the scale. The results of the threshold algorithm, shown in figure 3.22, indicate that no more complex algorithm is needed.
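A minimal sketch of this histogram-based threshold, assuming the 50% rule amounts to thresholding at the median grey level; the function name and toy scale are illustrative:

```python
# Fix the fraction of background (black) pixels at 50%: threshold the
# chosen scale at its median grey level and keep the dark pixels as the
# candidate appearance structures.
import numpy as np

def threshold_scale(scale, black_fraction=0.5):
    t = np.quantile(scale, black_fraction)    # 50% -> median grey level
    return scale <= t                          # dark pixels = appearance structures

scale = np.array([[10, 10, 200, 200],
                  [10, 10, 200, 200]], float)
mask = threshold_scale(scale)
```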
Figure 3.22: Top left: the scale representing the inner appearance structure information. Bottom left: the histogram of the above scale with the threshold value. Top right: the thresholded scale with the extracted appearance structures after applying the threshold algorithm; some segmented structures located outside the inner region, like the ones at the top left or at the bottom middle, must be removed.
Combining Inner Region with Appearance Structures

Previously, the segmentation of the inner region and of the appearance structures was shown. By combining both features we can obtain the inner cavity region, which leads to the endocardial boundary segmentation. One problem remains: the segmentation of the appearance structures produced some structures lying outside the inner cavity region, which need to be isolated. To do so, we follow these steps:

• Map every pixel of the segmented inner cavity to the threshold image in which the appearance structures are determined. We now have the segmented inner region together with the appearance structures (figure 3.23b).

• Assign every pixel of the determined appearance structures the same value as the estimated inner region if:
– The blue region representing the estimated inner cavity in figure 3.23a is defined as A.
– The thresholded inner appearance structures, marked black in figure 3.23b, are defined as B_k, k = 1, …, n, where n is the number of thresholded regions.

Operation:
  if A ∩ B_k ≠ ∅ then A ← A ∪ B_k, else A remains unchanged.

• From the previous step we obtain the inner cavity with some appearance structures assigned as part of it. Using a simple binary edge detector (e.g. Sobel) [60], we can segment the outer edge of the estimated inner cavity, obtaining the results in figures 3.23c and 3.24.
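The combination rule above can be sketched with connected-component labelling; the function name and the toy masks are assumptions for illustration:

```python
# Keep a thresholded structure B_k only if it intersects the estimated
# inner cavity A, in which case it is merged (A <- A ∪ B_k); isolated
# structures outside the cavity are discarded.
import numpy as np
from scipy import ndimage

def merge_overlapping(A, structures_mask):
    """A, structures_mask: boolean images.  Returns the merged cavity."""
    labels, n = ndimage.label(structures_mask)
    merged = A.copy()
    for k in range(1, n + 1):
        Bk = labels == k
        if (A & Bk).any():          # A ∩ B_k != empty set
            merged |= Bk            # A ∪ B_k
    return merged

A = np.zeros((8, 8), bool); A[2:6, 2:6] = True
structs = np.zeros((8, 8), bool)
structs[5:7, 5:7] = True            # overlaps the cavity -> merged
structs[0, 0] = True                # lies outside -> discarded
cavity = merge_overlapping(A, structs)
```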
Figure 3.23: The combination of inner region and appearance structures to obtain a smoothed estimated endocardial contour.
Figure 3.24: Final estimated contour with the two scales representing the inner region and the inner structure segmentation. The arrows point toward the appearance structures and their localization in the descriptor scale.
Figure 3.25 shows a sequence of regions of interest of left ventricle MR images. The results of applying the introduced approach to this image sequence, i.e. the estimated endocardial boundaries, are shown. Note how the estimated contour passes over the inner appearance structures in both the contraction and expansion phases of the heart [25].
Figure 3.25: Sequence of MR images for one case, showing the estimated endocardial contour using the proposed scale-space decomposition approach.
3.5 Summary
A new strategy is proposed to initialize a segmentation algorithm that finds the endocardial boundary. The approach utilizes a morphological scale-space decomposition based on multiscale analysis. We described two descriptor scales obtained from the closing and opening scale-space decompositions. The segmentation proceeds in three steps:

• The inner region is segmented, guided by the opening descriptor scale, which points toward the scale in the opening scale-space decomposition that presents the inner cavity.

• The inner appearance structures adjacent to the endocardial border are segmented, guided by the closing descriptor scale, which points toward the scale in the closing scale-space decomposition that presents those appearance structures.

• By combining the two previous processes we obtain an estimated endocardial border.
This approach comprises a powerful tool with several advantages: the preservation of scale-space causality and the localization of sharp edges. A descriptor scale is defined as the scale that maximizes the response of the morphological filter through the scale-space at each point in the image; it guided the approach in separating the gray-level appearance structures inside the ventricular cavities from the endocardial contour. The approach is subsequently combined with a high-level model to obtain full convergence to the endocardial boundary, as will be shown in the next chapter.
Chapter 4

Combined Low-High Level Visual Approach

In this chapter we describe a new combined low-high level method. The high-level approach is defined as a model-based approach that asserts prior information about the local structure around defined points along the shape boundary. By applying the described model-based approach to the output of the low-level model, the estimated contour is deformed to the best fit of the true endocardial boundary. Our contribution in this chapter is the refinement of the contour obtained by the approach introduced in chapter 3, whose output contours need to be enhanced in some parts to obtain the best fit to the endocardial border. We therefore define a learning-based model that is able to move parts of the contour to better locations.
4.1 Introduction
High-level approaches are used successfully in computer vision. Often these approaches use some form of template matching. Templates incorporate knowledge about both the shape of the object to be segmented and its appearance in the image, and are matched by correlation techniques [80]. However, template matching is likely to fail for segmentation tasks with a lot of variation in the appearance of objects and background, such as real-life images and medical data, even if several scaled and rotated versions of a template are used. Here, a few high-level methods are reviewed that are needed to understand the method developed in this thesis.
4.2 High-Level Approaches

4.2.1 Active Shape Model
Active Shape Models (ASMs) are an approach based on prior knowledge, introduced by Cootes and Taylor [17]. ASMs have been used for several segmentation tasks in medical images. The shape model is given by the principal components of vectors in which landmark points are stacked; the appearance model is built around the border of the object and consists of normalized profiles sampled perpendicular to the boundary at each landmark. The optimization minimizes a cost function (using the Mahalanobis distance) of the normalized profiles. The fitting procedure alternates landmark displacements and model fitting in a multi-resolution framework. The next section explains the active appearance model (AAM) in detail, which is compared with the results of our proposed model in chapter 5.
4.2.2 Active Appearance Model
Cootes and Taylor have extended the active shape models to active appearance models (AAMs) [14, 16]. In AAMs a combined principal component analysis of the landmarks and of the pixel values within the object is performed, which allows the generation of images [58, 32]. The iterative steps in the optimization of the segmentation are steered by the difference between the true pixel values and the modelled pixel values within the object.
Labelling and Aligning the Training Set

Before labelling the shapes of the training set, we need to determine the number of landmark points representing the shape outline. For each image in the training set we manually allocate the shape of interest and then determine significant landmarks on its boundary. The landmarks must be accurately located in order to obtain an exact correspondence between the different shapes in the training set. As a result, a labelled training set denoted by S is obtained. It contains N training shapes, each with n landmark points of coordinates (x_i, y_i); the vector describing the n points of the ith shape in the training set is

X_i = [x_{i1}, …, x_{in}, y_{i1}, …, y_{in}]   (4.2.1)
In order to study the variation of the position of each landmark through the set of training shapes, all shapes must be aligned to each other by changing the pose parameters (translation, scale, rotation) successively until the complete set is properly aligned [6, 27].
Model Variation

We now have a set S of aligned shapes represented by the vectors X_i, where X_i contains the new coordinates resulting from the alignment. These vectors form a distribution in the 2n-dimensional space. The goal is to model this distribution by deriving the principal modes that govern the variation of the N shapes in 2n-dimensional space, so that new examples similar to those in the original training set can be generated. A principal component analysis (PCA) [39] is applied to all vectors to obtain the model

x = x̄ + P_s b_s   (4.2.2)

where x̄ is the mean shape vector, P_s is a matrix of the orthogonal principal components of variation, and b_s is the vector of shape parameters.
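A minimal numpy sketch of such a PCA shape model (equation 4.2.2); the toy landmark vectors and the choice of three retained components are assumptions for illustration:

```python
# Stack aligned landmark vectors, compute the mean shape and principal
# components via SVD, and express a shape as x = x_mean + P_s b_s.
import numpy as np

rng = np.random.default_rng(1)
N, n2 = 20, 8                        # 20 shapes, 4 landmarks -> 2n = 8
base = rng.random(n2)
X = base + 0.1 * rng.standard_normal((N, n2))   # toy aligned training shapes

x_mean = X.mean(axis=0)
U, s, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
Ps = Vt.T[:, :3]                     # first 3 orthonormal principal components

x0 = X[0]
bs = Ps.T @ (x0 - x_mean)            # shape parameters of one sample
x_recon = x_mean + Ps @ bs           # approximate reconstruction
```

Projecting onto the leading components can only shrink the residual, so the reconstruction is never worse than using the mean shape alone.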
In addition to shape information, variations in the gray-level texture of the training set can be captured and expressed as a linear combination of principal components. This is achieved by warping each training example onto the mean shape to obtain normalized intensity vectors that are invariant to gray-level shifts and scaling [74, 5]. A PCA is then performed on the normalized intensity vectors to obtain the model

g = ḡ + P_g b_g   (4.2.3)

where ḡ is the mean normalized gray-level vector, P_g is a matrix of the orthogonal principal components of variation, and b_g is the vector of gray-level parameters.
Figure 4.1: Combined shape and gray-level appearance model. First two modes of appearance variation of the inner cavity of the left ventricle.

The shape and gray-level information of any training sample can be fused to generate a combined appearance model representing both shape variation and gray-level variation, as in figure 4.1. Since there are correlations between the shape and gray-level variations, a third PCA is applied to both the shape parameter vector b_s and the gray-level parameter vector b_g in the following manner:

• For each example in the training set a combined vector of shape (b_s) and gray-level (b_g) information is generated:

b = [ W_s b_s ; b_g ] = [ W_s P_s^T (x − x̄) ; P_g^T (g − ḡ) ]   (4.2.4)

where the weighting matrix W_s is a diagonal matrix that relates the different units of the shape and gray-level intensity coefficients.

• A PCA is applied to all vectors b of the training set, giving a further model

b = P_c c   (4.2.5)

where P_c denotes a set of eigenvectors and c is the vector of appearance parameters controlling both the shape and gray levels of the model. Each example can now be expressed as a linear combination of these appearance eigenvectors:

x = x̄ + P_s W_s^{−1} P_{c,s} c
g = ḡ + P_g P_{c,g} c   (4.2.6)
Matching the AAM

As stated above, given a set of model parameters c we can generate an appearance model consisting of a shape x and gray levels g_m. To compare this appearance model to a target image, the shape x is projected onto the target image domain, the gray-level intensities g_s are sampled accordingly, and the difference δg = g_s − g_m is computed. To determine the best match between the model and the target image, the magnitude of the difference vector, |δg|^2, is minimized by finding an affine transformation and varying the model parameters c. In order to achieve a better fit, prior knowledge of how to adjust the model parameters during image matching must be provided. This knowledge is obtained during the training phase, such that the AAM learns a linear relationship between δg and the error in the model parameters δc,

δc = R_c δg   (4.2.7)

which is used to minimize |δg|^2. The matrix R_c is obtained by multivariate linear regression over the set of training images: their model parameters c and pose parameters are systematically displaced, and the resulting difference vectors δg are recorded.
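The regression for R_c in equation 4.2.7 can be sketched on synthetic data (a linear toy model standing in for real training images; all dimensions and names are assumptions):

```python
# Learn the update matrix R_c by least squares: known parameter
# displacements dC produce residuals dG, and R_c is fit so that
# dC ≈ R_c dG (equation 4.2.7).
import numpy as np

rng = np.random.default_rng(2)
n_c, n_g, m = 4, 30, 200             # parameter dim, residual dim, samples
J = rng.standard_normal((n_g, n_c))  # unknown linear model: dG = dC J^T
dC = rng.standard_normal((m, n_c))   # applied parameter displacements
dG = dC @ J.T                        # recorded gray-level residuals

# multivariate least squares for R_c in  dC ≈ dG R_c^T
W, *_ = np.linalg.lstsq(dG, dC, rcond=None)
Rc = W.T                             # shape (n_c, n_g)

dc_pred = Rc @ dG[0]                 # predicted correction for one sample
```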
4.2.3 Problems of the AAM
The related problems are summarized in the following points:

• The computational complexity and the number of parameters to vary are relatively high [55].

• The method of labelling the training set affects the landmark accuracy.

• The AAM is optimized on global appearance and is thus less sensitive to local structures and boundary information.

• The main problem of most knowledge-based models is that no accurate convergence occurs when the target object is too distant from the mean object used to initialize the model.

To overcome the problems above, which are shared by most high-level models, we propose a combined low-high level approach. Such a combination exploits the benefits of both contributions: the low-level bottom-up processing for feature extraction and region segmentation described in chapter 3, which gives a good initialization position and is invariant to translation; and a statistical model that describes the variability of border instantiation in terms of a prior distribution on the deformations of a template. Most high-level approaches require the initial position of the object to be near its true position; if the object can be located anywhere in the input image, an exhaustive search to find a suitable initialization may be necessary.
4.3 Combined Low-High Level Visual Approach
Most of the high-level approaches based on ASMs and AAMs depend on building a shape model and an appearance model to describe an object. Instead, we propose to use the shape predicted by the morphological scale-space approach (low-level process) as an initialization, without the need to build a statistical model of shape variation. In addition, we build an appearance model around selected points along the shape (high-level process) in order to guide the estimated contour to an accurate fit of the endocardial boundary.
4.3.1 Building the Knowledge-based Model
Labelling the Training Set

To generate a training set for the proposed technique, each boundary of a training image is represented by a set of labelled points. The number of points should be large enough to capture the overall boundary. In this thesis a manual labelling procedure is applied, guided by an expert knowledge database (figure 4.2).
Figure 4.2: Training image with manually labelled Endocardial border.
Control Point Selection

From the generated set of labelled points a set of control points is selected, separated by equal orientation and located along a certain path, using the following approach:

1. Calculate the average center of mass over all labelled boundaries of the training images (in order to align the labelled points of the training boundaries into a common coordinate frame).

2. For each training image:

(a) Take the average center of mass of the closed contours calculated in step 1 as the point of origin.

(b) Perform a radial decomposition of the region into wedges of 30° opening with 15° overlap, along the labelled boundary.

(c) Select the intersection point of the wedge boundary with the labelled contour as a control point (figure 4.3).

The vector describing the n control points of a boundary in the training set is

x = (x_1, y_1, x_2, y_2, …, x_n, y_n)^T   (4.3.1)
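The radial selection of control points can be sketched as follows, assuming rays are cast from the average centre of mass every 15°; the function name and the synthetic circular contour are illustrative:

```python
# From the centre of mass, cast rays at equal angular steps and keep
# the boundary point closest (in angle) to each ray as a control point.
import numpy as np

def select_control_points(boundary, center, step_deg=15):
    """boundary: (m, 2) array of (x, y) contour points."""
    d = boundary - np.asarray(center)
    angles = np.degrees(np.arctan2(d[:, 1], d[:, 0])) % 360.0
    controls = []
    for a in np.arange(0, 360, step_deg):
        diff = np.abs((angles - a + 180.0) % 360.0 - 180.0)  # wrap-around distance
        controls.append(boundary[np.argmin(diff)])
    return np.array(controls)        # 360 / 15 = 24 control points

t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
circle = np.c_[10 + 5 * np.cos(t), 10 + 5 * np.sin(t)]   # synthetic contour
cps = select_control_points(circle, (10, 10))
```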
Modelling Gray-level Appearance Knowledge

Now that the control points are selected, we discuss how the statistical model around each control point is built, utilizing the ASM framework described in [17]. The main idea is to gather gray-level information in a region around each control point throughout the training set, concentrating on the gray profiles along a line passing through the control point in the direction of the selected origin (the average center of mass).
Figure 4.3: (a) Labelled endocardial contour. (b) Control point selection with 15° steps in clockwise direction. (c) Selected control points marked with white dots.

For every control point j in image i of the training set, a gray-level profile g_ij of length p pixels is extracted, centered at the control point (figure 4.4). Instead of using the raw intensities along the profile, we utilize the normalized derivative in order to reduce the effect of global intensity changes.
The gray-level profile of control point j in image i is a vector of p values,

g_ij = [ g_ij1, g_ij2, …, g_ijp ]^T   (4.3.2)

The derivative profile (of length p − 1) becomes

dg_ij = [ g_ij2 − g_ij1, g_ij3 − g_ij2, …, g_ijp − g_ijp−1 ]^T   (4.3.3)
The normalized derivative profile is given by

dg_ij^norm = dg_ij / Σ_{k=1}^{p−1} |dg_ijk|   (4.3.4)

Now the mean of the normalized derivative profiles of each control point is calculated over the training set:

ȳ_j = (1/N) Σ_{i=1}^{N} dg_ij^norm   (4.3.5)
Figure 4.4: Building the statistical appearance model. For every control point, the gray profile is sampled along a line passing through the control point in the direction of the average center of mass.

The covariance matrix of the normalized derivatives is given by

C_yj = (1/N) Σ_{i=1}^{N} (y_ij − ȳ_j)(y_ij − ȳ_j)^T   (4.3.6)

With this we obtain a statistical model for the gray levels around each control point j (24 control points), represented by ȳ_j and C_yj.
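Equations 4.3.2 to 4.3.6 for a single control point can be sketched as follows; the random profiles stand in for real training data:

```python
# Normalised derivative profiles of N training images, then their mean
# and covariance (equations 4.3.4 - 4.3.6).
import numpy as np

def normalized_derivative(profile):
    d = np.diff(profile)              # derivative profile, length p - 1
    return d / np.sum(np.abs(d))      # equation 4.3.4

rng = np.random.default_rng(3)
N, p = 15, 9                          # 15 training images, profile length 9
profiles = rng.random((N, p))         # toy gray-level profiles g_ij

Y = np.array([normalized_derivative(g) for g in profiles])
y_mean = Y.mean(axis=0)               # equation 4.3.5
C = (Y - y_mean).T @ (Y - y_mean) / N # equation 4.3.6
```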
4.3.2 Searching Using Gray Appearance Knowledge
The model of the gray-level statistics around each control point can be used to determine the adjustment of each control point x_j such that a better model-to-data fit is obtained.
The search is applied along the line passing through the control point in the direction of the average center of mass, yielding a sample profile. Within this sample profile we look for a sub-profile with characteristics that match those obtained from training. To do so, the gray-level values are collected along the sample profile, the derivative is computed, and the result is normalized. We then search within the normalized derivative sample profile (of length s, s > p) for a sub-profile that matches the mean normalized derivative profile (of length p) obtained from the training set (figure 4.5).
The sub-profile S_j around control point j is given by

S_j = [ S_j1, S_j2, …, S_jp ]^T   (4.3.7)

The derivative sub-profile of control point j is of length p − 1:

dS_j = [ S_j2 − S_j1, S_j3 − S_j2, …, S_jp − S_jp−1 ]^T   (4.3.8)

The normalized derivative sub-profile becomes

y_j^p = dS_j / Σ_{k=1}^{p−1} |dS_jk|   (4.3.9)

We now examine which y_j^p matches ȳ_j (the mean normalized derivative profile obtained from the training set). Denoting the sub-interval of y_j^p centered at the dth pixel by h(d), we find the value of d that makes the sub-interval h(d) most similar to ȳ_j. This can be done by minimizing the following square error function with respect to d:

f(d) = (h(d) − ȳ_j)^T C_yj^{−1} (h(d) − ȳ_j)   (4.3.10)
The advantage of the square error function defined in equation 4.3.10 is that it measures the distance of the sample from the model mean ȳ_j in terms of the covariance C_yj, making it very sensitive to inter-variable changes in the training data. It is also linearly related to the log of the probability that y_j^p is drawn from the distribution [18]; minimizing f(d) is thus equivalent to maximizing the probability that y_j^p comes from the distribution.
Figure 4.5: Search along sampled profile to find the best fit, compared to the trained profile.
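The sliding search of equation 4.3.10 can be sketched as follows; the trained mean, the identity covariance, and the synthetic sample profile are toy assumptions:

```python
# Slide a window of the trained profile length over the sampled profile
# and pick the offset d with the smallest Mahalanobis distance to the
# trained mean (equation 4.3.10).
import numpy as np

def best_offset(sample_norm_deriv, y_mean, C_inv):
    """Return d minimising f(d) = (h(d)-y)^T C^{-1} (h(d)-y)."""
    p1 = len(y_mean)                          # p - 1
    costs = []
    for d in range(len(sample_norm_deriv) - p1 + 1):
        h = sample_norm_deriv[d:d + p1]       # sub-interval h(d)
        r = h - y_mean
        costs.append(float(r @ C_inv @ r))
    return int(np.argmin(costs)), costs

y_mean = np.array([0.1, 0.4, 0.4, 0.1])       # trained mean profile (toy)
C_inv = np.eye(4)                              # identity covariance for the demo
sample = np.zeros(12)
sample[5:9] = y_mean                           # perfect match hidden at offset 5
d, costs = best_offset(sample, y_mean, C_inv)
```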
4.4 Example of Gray Appearance Knowledge Search
For a given example, the search process was initialized using the estimated boundary obtained as the output from the morphological scale-space processing (see chapter 3). The subsequent search algorithm is summarized as follows:
64 CHAPTER 4. COMBINED LOW-HIGH LEVEL VISUAL APPROACH
1. The given estimated contour is labelled by 24 control points, determined using the technique described in section 4.3.1 (figure 4.6).
Figure 4.6: Estimated contour obtained from the low-level visual approach, with control points.

2. For each control point:
(a) Sample a profile of length s pixels starting from the control point.
(b) Divide the sampled profile into sub-profiles of length p (equal in length to the profile used in training), as shown in figure 4.7.
(c) Test the quality of fit for every sub-profile using equation 4.3.10, and choose the position with the minimum cost value of f(d), as shown in figure 4.8.

Figure 4.9 demonstrates the result of using the appearance knowledge search. As the search progresses, increasingly suitable adjustments are made for each control point separately. The final convergence, according to the minimum cost of the energy function, gives a good match to the epicardial border. The algorithm guides each control point separately to its optimum final position, making large displacements for control points that are distant from the best-fit position and small displacements for those that are near the optimum fit [26].
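Step 2 above can be sketched as a sliding-window search (a hedged illustration; the trained mean profile and inverse covariance are assumed given as arrays, and the names are placeholders, not the thesis code):

```python
import numpy as np

def search_best_fit(sample, p, y_mean, C_inv):
    """Slide a window of length p along the sampled profile (length s > p),
    normalize the derivative of each sub-profile (equations 4.3.8-4.3.9),
    and return the offset d that minimizes the square error f(d) of
    equation 4.3.10, together with its cost."""
    sample = np.asarray(sample, dtype=float)
    best_d, best_cost = None, np.inf
    for d in range(len(sample) - p + 1):
        deriv = np.diff(sample[d:d + p])
        denom = np.sum(np.abs(deriv))
        if denom == 0:
            continue  # a flat sub-profile carries no edge information
        h = deriv / denom
        r = h - y_mean
        cost = float(r @ C_inv @ r)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost
```

On a synthetic profile with a step edge, the search locates the offset whose normalized derivative matches a trained edge template exactly.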
Figure 4.7: Sampled profile and sub-profiles.
Figure 4.8: Search the best fit sub-profile for the gray-level model.
Figure 4.9: Searching using the Appearance Knowledge. The search guides each control point separately to the optimum final position, in such a manner that large displacements are made for the control points which are far from the best fit position, while small displacements are generated for the control points which are near to the optimum fit.
Chapter 5 Experimental Results and Evaluation

In this chapter we present the evaluation of the method proposed in this thesis. To substantiate the claimed generality of the model, we first demonstrate the segmentation results of the applied low-level model. The results of the combined low-high level visual approach are then compared with two algorithms described in chapter 2: the data-driven gradient vector flow snakes (GVF) and the model-driven active appearance model (AAM).
5.1 Experimental Design

5.1.1 Data Acquisition and Preprocessing
The evaluation and training were performed using MR image sequence data acquired with a 1.5 T whole-body scanner with Master Gradients and a five-element phased-array cardiac coil. The image sequences consist of 23 frames per heart cycle; each frame is 256 × 256 pixels with a slice thickness of 10 mm. For each frame, a region of interest of size 95 × 95 pixels is extracted manually, as shown in figure 5.1.
Figure 5.1: Sample frame from acquired MRI data set, with marked ROI.

The data set is organized as follows:
• Set 1 - Normal hearts: contains 207 images, from a group with normal septal wall thickness (median 10.0 mm [9; 10 mm]) (figure 5.2).
• Set 2 - Abnormal hearts: contains 52 images, from a group with significant left ventricular hypertrophy (median 14.0 mm [12.0; 15.5 mm], p < 0.01) (figure 5.3).
Figure 5.2: Sample slices from acquired MRI data set from normal cases.
Figure 5.3: Sample slices from acquired MRI data set from abnormal cases with significant left ventricular hypertrophy.
5.1.2 Performance Assessment
To assess performance, the proposed model is compared to a known ground truth, as given by experts.

Comparison to Ground Truth

The comparison was based on ground truth given by a finite set of control points for each image; those points were selected from the manually drawn contours. We use a distance measure D(x_gt, x) that gives a scalar interpretation of the fit between the two borders, the ground truth border x_gt and the optimized border x. The distance measure used to assess the performance using control points is:
• Point to point error: defined as the Euclidean distance between corresponding control points:
D_pt.pt(x_gt, x) = Σ_{i=1}^{n} √( (x_i − x_gt,i)² + (y_i − y_gt,i)² )  (5.1.1)
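Equation 5.1.1 translates directly into a few lines of numpy (a sketch, not the evaluation code used in the thesis):

```python
import numpy as np

def point_to_point_error(x_gt, x):
    """Point-to-point error of equation 5.1.1: the summed Euclidean
    distance between corresponding ground truth and optimized control
    points, each given as an (n, 2) array of (x, y) coordinates."""
    x_gt = np.asarray(x_gt, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(np.linalg.norm(x - x_gt, axis=1)))
```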
5.2 Experimental Results
The results are presented and evaluated under three main points:
• Demonstration of the low-level visual approach results.
• Evaluation of the combined low-high level visual approach.
• Evaluation compared to high-level approaches.
5.2.1 Demonstration of Low-Level Visual Approach Results
This section shows the results of the low-level multiscale analysis approach described in chapter 3. All results were evaluated on the data set of section 5.1.1. The aim is to evaluate how well the morphological scale-space approach, combined with the descriptor scales, can locate the appearance structures and classify them as part of the inner ventricle cavity. Figure 5.4 shows the initial estimated contour of the endocardium boundary, using the band-pass decomposition and the descriptor scales for both the closing and opening scale-space decompositions described in chapter 3. We used a statistical hypothesis evaluation against the ground truth. The hypothesis categories are:
• TRUE: the low-level visual approach
1. selects the right scale representing the inner cavity,
2. segments the appearance structures correctly, and
3. yields an estimated contour beyond the appearance structures, towards the true endocardial contour.
• FALSE: otherwise.
The low-level visual approach located the best scales describing the inner region and inner structures, giving correct results in 254 images (91%) and failing in 25 images (9%), as shown in figures 5.5, 5.6 and 5.7.
Figure 5.4: Estimated contour with located inner region and inner structures.
Figure 5.5: Estimated contour obtained from Morphological scale-space decomposition.
Figure 5.6: Estimated contour obtained from morphological scale-space decomposition (continued).
Figure 5.7: Estimated contour obtained from morphological scale-space decomposition (continued).
5.2.2 Combined Low-High Level Approach Evaluation
To evaluate the proposed approach, several training sets were generated and a cross-validation experiment was conducted on all of them. The substantial increase in workload is justified by removing the uncertainty caused by a single division of the cases into a training set and an evaluation set. Furthermore, this methodology yields better models, since almost all variations are captured.
Training Data Sets

The data set is divided into four groups, each representing 25% of the images. In turn, one group serves as the training set and the remaining groups as test sets. The endocardium contour of the training set is drawn manually, guided by an expert knowledge database. From each drawn contour, a set of control points is selected automatically (24 control points at 15-degree angular increments), as described in chapter 4, section 4.3.1. A statistical model of the gray-level profile at each control point (profile length 13 pixels) is then built as described in chapter 4.

Results

For a given image, the ROI is initialized with the estimated contour, and the control points are placed on the initialized contour. The search procedure using the knowledge-based model described in chapter 4, section 4.3.2, is applied to the control points until the best localization of the estimated boundary is reached. The combined approach was tested on the data set of section 5.1.1, organized into three groups:
• D = ∪_{k=1}^{4} T_k: the complete data set.
• S = D \ T_i (i = 1, …, 4): the non-trained set.
• T_i (i = 1, …, 4): the training set.
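The four-fold grouping can be sketched as follows (a hedged illustration with placeholder identifiers, not the thesis code):

```python
import numpy as np

def make_folds(image_ids, k=4, seed=0):
    """Partition the data set D into k training groups T_1..T_k of roughly
    25% each; every non-trained set S_i is the complement of T_i in D."""
    ids = np.array(list(image_ids))
    np.random.default_rng(seed).shuffle(ids)
    T = np.array_split(ids, k)              # training groups T_1..T_k
    S = [np.setdiff1d(ids, t) for t in T]   # complementary test sets S_i
    return T, S
```

Each image appears in exactly one training group and in the other three groups' test sets, which is what removes the single-split uncertainty mentioned above.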
        T1            T2            T3            T4
        Mean  ±Std    Mean  ±Std    Mean  ±Std    Mean  ±Std
T_i     2.27  1.66    2.27  1.97    2.06  1.61    2.14  1.57
S       2.42  2.08    2.41  2.03    2.50  2.52    2.54  2.39
Table 5.1: Comparison of average results between the test sets trained with the four training groups.

The results are based on evaluating the combined approach, trained with the data sets described in section 5.2.2, on the data set groups. Table 5.1 and figures 5.8 and 5.9 summarize and visualize the results; the final errors recorded were averaged for each training group. The mean error difference in pixels between the training groups is almost the same, which shows the stability of the model even under variation of the training data set. The maximum distance mean error is 2.54 pixels from the ground truth contour, which also indicates the satisfactory accuracy reached by the proposed combined model.
Figure 5.8: Visualization of table 5.1 results. The mean square error distance (y-axis) for each training group in each test set (x-axis).
Figure 5.9: Left: evaluation of the proposed model using the training set (T) against itself. Right: evaluation of the proposed model using the training set (T) against the non-trained set (S). Each case shows the range of error in pixels.

Figure 5.10 evaluates the performance of the combined model on the whole data set by comparing the mean error of the control point positions of the initial contour (average distance error 6.28 pixels) with that of the final control point positions (average distance error 2.41 pixels). It can be seen that the accuracy of the control points increased after applying the search technique of the proposed knowledge-based model, compared to the ground truth contour. Figure 5.11 demonstrates the result of locating the endocardial border.
Figure 5.10: Left: the mean square error distance (y-axis) for each control point (x-axis) of the initial contour. Right: the mean square error distance (y-axis) for each control point (x-axis) of the best fit of the final contour. Both with standard error.
Figure 5.11: Search using the combined approach to estimate the endocardial border.
5.2.3 Evaluation Compared to Other Approaches
Combined Model versus GVF Snakes

The proposed combined approach was also compared to a data-driven model, the GVF snakes, as described in chapter 2. The advantages of the GVF snakes over other active contour models are their insensitivity to initialization, their ability to move into boundary concavities, and their large capture range, which make the GVF snakes a good model for comparison. The GVF snakes were tested on the previously described data sets. The model is initialized as follows:
• Model parameters: α = 0.1, β = 0.0, the weighting parameters that control the GVF snake's tension and rigidity.
• Initial deformable contour: The estimated contour obtained from the morphological scale-space is used as initial contour.
Figure 5.12 shows the mean error distance for each control point, with standard deviations. The search technique used by the proposed combined model performs better, giving an average mean distance error of 2.41 pixels, while the GVF snakes model yields 3.8 pixels. The proposed combined approach also has a larger capture range than the GVF snakes (figure 5.14), because its search technique does not get trapped in local minima. Owing to the learning from the training set, the model points are not always placed on the strongest edge in the locality; they may represent a weaker secondary edge. This increases the performance of the proposed model, as shown in figure 5.13.
Figure 5.12: Distance mean error of combined model and GVF snakes.
Figure 5.13: Left: The final contour (white) after applying the proposed model. Right: The final contour (white) after applying the GVF snake. Both are initialized with the estimated contour (red).
On the other hand, the connectivity between the contour control points during deformation is better in the GVF model than in the proposed approach, since the search technique is applied to every control point separately, without any curvature features relating them to each other.
Figure 5.14: Left: The image. Right: The vector flow of the GVF directed towards the appearance structures.
Combined Model versus AAM

The proposed combined model was compared to the AAM, which represents an extension of the ASM, as described in chapter 4. The model was trained and evaluated on the data described above. The endocardial boundary in the training set was manually annotated with 63 landmark points. From the training set, a shape model with 71 parameters, a gray-level model of 6400 pixels with 41 parameters, and a combined model with 22 parameters were constructed.

The model was displaced over an (x, y) range of ±10 pixels from its true hand-drawn position, invariant to scaling. AAM matching was used to predict the displacement on different endocardial appearances by iterative matching until convergence. The results were recorded and compared on the basis of two measurements: the least square error between the gray-level vector of the true image and the final appearance, and the mean distance between the hand-drawn contour points and both the starting and final appearance model points at which the AAM algorithm converged. The final errors recorded were averaged over the whole data set.

We can conclude that, starting from the best estimate of the correct pose, the AAM leads to a more accurate location of the hand-drawn contour. This is because the initial contour was close to the original contour, so that the predicted displacement forced the search towards the optimized appearance. As shown in figure 5.15, the AAM has the ability to pass over the structures adjacent to the boundary, but the fundamental drawbacks of the AAM still make our proposed model preferable. The drawbacks are summarized in the following points:
1. AAMs depend on a good initialization.
2. The matching phase is achieved by building prior knowledge of how to adjust the model parameters during image matching. Several crucial questions remain unanswered, such as:
• How many displacements should be used?
• How large should the displacements be?
• Should all parameters (i.e. rotation, translation and scale) be displaced at once or separately?
• Should the displacements be done in a deterministic or random fashion?
3. The model can only deform in ways observed in the training set. If the contour exhibits a type of deformation not present in the training set, the model will not fit it.
Figure 5.15: Illustration of the AAM search: (a) initial contour and final contour; (b) initial model appearance and final model appearance.

In conclusion, figure 5.16 compares the accuracy of the three models starting from the same position. The combined model achieved a smaller distance error from the ground truth than the other two approaches.
Figure 5.16: Comparing the 3 models.
Chapter 6 Summary and Conclusions

6.1 Summary and Conclusions of the Thesis
This thesis considered the problem of segmenting the endocardial boundary of the left ventricle from magnetic resonance (MR) images. A new model was developed and its performance validated against other models on numerous examples. The developed model combines new initialization and matching strategies in one framework. In this chapter the presented work and the main contributions are summarized, and new directions for future research are proposed. In the preceding chapters we have presented the following work.
In Chapter 1 we introduced the field of medical imaging and showed how its role has expanded beyond the simple visualization and inspection of anatomic structures, through the rapid development and proliferation of medical imaging technologies such as CT, ultrasound, MR, and other modalities. We then motivated and defined the challenging problem of robustly segmenting the left ventricular endocardial boundary of the human heart, which has received a great deal of attention in recent years. Finally, at the end of that chapter, we outlined the segmentation model (described in detail in the following chapters) with a diagram of all of its processes, and summarized the contributions of the thesis.
Chapter 2 provided a more technical literature review of medical image segmentation, introducing the standard approaches and their limitations, with a description and mathematical notation of two related and well-known deformable models: gradient vector flow snakes (GVF) and the active appearance model (AAM). These two approaches were chosen because they are widely used in medical image segmentation, especially of anatomical structures, and they provide a good comparison baseline for our model: the GVF has a large capture range without distorting the boundary, i.e. the initialization can be far from the desired boundary, while the AAM utilizes prior knowledge of object shape and gray-level appearance; this model is described in detail in chapter 4.
Chapter 3 presented the first stage of our model, a low-level analysis approach based on a multiscale decomposition technique and a local descriptor scale for the decomposed scale-space. The contribution of this new approach is to group all equally sized structures into one scale, so that the appearance structures adjacent to the endocardial border can easily be located, yielding accurate border segmentation. We reviewed the background, principles, and major operations of mathematical morphology; we follow the notation of Sternberg throughout this thesis. We then presented the morphological scale-space, addressing scale-space theory and how it corresponds directly to the problem of selecting descriptor scales and structures from the scale-space representation. Finally, we presented our approach by discussing the term morphological filter, reviewing the propositions and formulas needed to build closing and opening band-pass morphological filters, and describing two descriptor scales obtained from the closing and opening scale-space decompositions. Three steps are performed: first, the inner region is segmented, guided by the opening descriptor scale, which points towards the scale in the opening scale-space decomposition that represents the inner cavity. Second, the inner appearance structures adjacent to the endocardial border are segmented, guided by the closing descriptor scale, which points towards the scale in the closing scale-space decomposition that represents such appearance structures. Third, combining the two previous processes, we obtain an estimated endocardial border passing over those inner appearance structures. In summary, we developed an approach that separates the structures inside the ventricular cavities, such as papillary muscles, from the endocardial border, since those structures are often indistinguishable from the endocardial border and most proposed models segment them as part of it.

Chapter 4 presented the refinement of the estimated endocardial border, in order to obtain an accurate final segmentation. The refinement of the estimated contour is achieved by a high-level model-based approach, which asserts prior information about the local gray-level distribution around defined points along the estimated contour. The learning-based model deforms the estimated contour to the best fit of the true endocardial border. The chapter started by introducing the active appearance model (AAM) in detail and its limitations with respect to our problem. We then presented the proposed model: first, the knowledge-based model is built by generating a training set and selecting the control points. We proposed a new automatic selection of the control points, based on a radial decomposition at 15-degree angular increments along the labelled boundary, taking the intersection points as control points. Second, the statistical gray-level model around each control point is built. Finally, the search technique that determines the adjustment of each control point towards a better model-to-data fit was explained. At the end we demonstrated the results on some examples.
The contributions made in this chapter can be summarized in the following points:
• We described the search technique used by the developed approach, which makes it possible to describe the variability in the border's appearance in terms of a prior distribution of the templates.
• We introduced a robust selection of control points in the matching phase of the model.
In Chapter 5 the proposed model was evaluated and compared with two high-level approaches. We showed how the data set was organized for evaluation and training, and defined a measurement based on comparing the model to the ground truth. The results were evaluated and presented in three main stages: demonstration of the low-level model, evaluation of the combined low-high level approach, and comparison of the results with the gradient vector flow snakes approach and the active appearance model. The results showed that our model performs better (figure 6.1) than the other models on the following points:
• The developed model does not need a good initialization. The morphological scale-space decomposition, based on multiscale spatial analysis, together with the descriptor scale that gives constant values for structures of constant width, inherently localizes the edges, whereas most techniques applied to the segmentation of the endocardial border first need a rough assumption of where the contour lies in the ROI.
• The developed model minimizes human initialization and interaction.
• The model is well suited to tracking the endocardial boundary through image sequences.
Figure 6.1: Comparing the 3 models.
For future work, the model needs to overcome the drawback of missing connectivity between the contour control points during deformation (figure 6.2), since the search technique is applied to every control point separately, without any curvature features relating them to each other. Dynamic programming techniques could be used to overcome this drawback [68]. Further directions are generalizing the model to work on different modalities of medical images, such as CT and ultrasound, and on images of different dimensionality (2D and 3D).
Figure 6.2: Three cases showing failures of the proposed model at some control points.
Zusammenfassung (Summary)

A fundamental problem in image processing and computer vision is the segmentation of deforming objects from image sequence data. In medical image analysis, the temporal sequence of heart motion produces images composed of a complex of deforming shapes. In recent years, cardiac image analysis has received a great deal of attention because of the heart's complicated shape and motion, and because automatic cardiac image analysis can be used to diagnose cardiac damage caused by various diseases. The aim of this work is the development of a method for the automatic segmentation of the endocardial border. The main problems associated with border detection are errors typical of discrete data, such as sampling artifacts and noise, which can cause indistinct or disconnected shape boundaries. Furthermore, gray-level appearances of structures within the ventricular cavities, such as the papillary muscles, often cannot be distinguished from structures relevant for diagnostic analysis. The proposed method consists of two phases: the segmentation phase uses a bottom-up multiscale analysis based mainly on the morphological scale-space, decomposing the image into several scales of different structure sizes. As a result of the decomposition, structures adjacent to the endocardial border are located, and finally an estimated border is obtained independently of those structures. The refinement phase subsequently uses prior knowledge about the local structure around defined points on the border to obtain better accuracy of the endocardial segmentation. The proposed model was furthermore compared with other well-known methods from the computer vision literature. In summary, the developed method represents a first step towards the automatic segmentation of the left ventricle with improved performance.
Appendix A Principal Component Analysis (PCA)

The PCA is performed as an eigen-analysis of the covariance matrix of the aligned shapes. Suppose we wish to apply a PCA to s n-dimensional vectors x_i, where s < n. The covariance matrix is n × n, which may be very large. However, we can calculate its eigenvectors and eigenvalues from a smaller s × s matrix derived from the data. Because the time taken for an eigenvector decomposition grows as the cube of the size of the matrix, this can give considerable savings. Subtract the mean from each data vector and put them into the matrix D:

D = ((x_1 − x̄) | … | (x_s − x̄))  (A.0.1)

The covariance matrix can be written

S = (1/s) D D^T  (A.0.2)

Let T be the s × s matrix

T = (1/s) D^T D  (A.0.3)

Let e_i be the s eigenvectors of T with corresponding eigenvalues λ_i, sorted into descending order. It can be shown that the s vectors D e_i are all eigenvectors of S with corresponding eigenvalues λ_i, and that all remaining eigenvectors of S have zero eigenvalues. Note that D e_i is not necessarily of unit length, so it may require normalizing.
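The small-matrix trick can be sketched in numpy as follows (a hedged illustration, not the thesis code):

```python
import numpy as np

def pca_small(X):
    """Eigen-analysis of the covariance of s n-dimensional samples (rows
    of X) with s < n, via the small s x s matrix T = (1/s) D^T D of
    equation A.0.3 instead of the n x n covariance S of equation A.0.2."""
    s, n = X.shape
    mean = X.mean(axis=0)
    D = (X - mean).T                    # n x s matrix of centered columns
    T = (D.T @ D) / s                   # small s x s matrix
    lam, e = np.linalg.eigh(T)
    order = np.argsort(lam)[::-1]       # eigenvalues in descending order
    lam, e = lam[order], e[:, order]
    V = D @ e                           # D e_i are eigenvectors of S ...
    V = V / np.linalg.norm(V, axis=0)   # ... normalized to unit length
    return mean, lam, V
```

Since S(D e_i) = D((1/s) D^T D) e_i = λ_i D e_i, the returned columns of V are exact eigenvectors of the full covariance matrix.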
Appendix B Image Warping

Image warping transforms one spatial configuration of an image into another; even a simple translation of an image can be considered an image warp. Formally, I ∈ R^k ↦ I′ ∈ R^k; here we consider the planar case, k = 2. Suppose we wish to warp an image I so that a set of n control points x_i are mapped to new positions x′_i. We require a continuous vector-valued mapping function f such that

f(x_i) = x′_i, ∀ i = 1 … n  (B.0.1)

B.1 Piece-wise Affine
The simplest construction of an n-point based warp is to assume that f is locally linear. To use this in a planar framework, the term locally must be defined more tightly. One approach is to partition the convex hull of the points using a suitable triangulation, such as the Delaunay triangulation. The Delaunay triangulation connects an irregular point set by a mesh of triangles, each satisfying the Delaunay property: no triangle has any points inside its circumcircle, the unique circle that passes through all three vertices of the triangle. The Delaunay triangulation of the mean epicardial shape from the previous section is given in figure B.1.
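As a sketch (assuming SciPy is available; the five points below are hypothetical stand-ins for shape control points, not thesis data), `scipy.spatial.Delaunay` computes such a partition:

```python
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical control points: four corners of a unit square plus its center.
points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.5]])
tri = Delaunay(points)

# Each simplex is a triangle (three vertex indices) satisfying the
# empty-circumcircle property; together they tile the convex hull.
triangles = tri.simplices
```

`tri.find_simplex` then reports which triangle contains a query point, which is exactly the lookup needed by the piece-wise affine warp below.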
Figure B.1: Delaunay triangulation of the mean shape.
The warp is now realized by applying the triangular mesh of the first point set in I to the second point set in I′, in that each point in each triangle can be uniquely mapped onto the corresponding triangle of the second point set by an affine transformation, which consists of scaling, translation, and skewing. If x_1, x_2, and x_3 denote the vertices of a triangle in I, any internal point can be written as a superposition:

x = x_1 + β(x_2 − x_1) + γ(x_3 − x_1) = α x_1 + β x_2 + γ x_3  (B.1.1)

Thus α = 1 − (β + γ), giving α + β + γ = 1. To constrain x inside the triangle we must have 0 ≤ α, β, γ ≤ 1. The warp is now given by applying the relative position within the triangle, given by α, β, and γ, to the corresponding triangle in I′:

x′ = α x′_1 + β x′_2 + γ x′_3  (B.1.2)
Given the three vertices of a triangle, it is trivial to determine α, β, and γ by solving the system of two linear equations given by B.1.1 for a known point x = [x, y]^T:

α = 1 − (β + γ)
β = (y x_3 − x_1 y − x_3 y_1 − y_3 x + x_1 y_3 + x y_1) / (−x_2 y_3 + x_2 y_1 + x_1 y_3 + x_3 y_2 − x_3 y_1 − x_1 y_2)
γ = (x y_2 − x y_1 − x_1 y_2 − x_2 y + x_2 y_1 + x_1 y) / (−x_2 y_3 + x_2 y_1 + x_1 y_3 + x_3 y_2 − x_3 y_1 − x_1 y_2)  (B.1.3)
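Equations B.1.1–B.1.3 amount to solving a 2 × 2 linear system; a small numpy sketch (illustrative names, not the thesis code):

```python
import numpy as np

def barycentric(x, x1, x2, x3):
    """Solve equation B.1.1 for (alpha, beta, gamma): express x relative
    to the triangle (x1, x2, x3) via a 2x2 linear system in beta, gamma."""
    A = np.column_stack([x2 - x1, x3 - x1])
    beta, gamma = np.linalg.solve(A, x - x1)
    return 1.0 - beta - gamma, beta, gamma

def warp_point(x, src, dst):
    """Piece-wise affine warp of one point (equation B.1.2): reuse the
    barycentric coordinates of x in the source triangle on the target."""
    a, b, g = barycentric(x, *src)
    return a * dst[0] + b * dst[1] + g * dst[2]
```

Solving the linear system directly is equivalent to the closed-form expressions of equation B.1.3, whose denominators are twice the signed triangle area.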
Appendix C Texture Normalization

The texture model should be invariant to global changes in illumination. We compensate for linear changes by applying a scaling α and an offset β. If g_image denotes the actual pixel values sampled in the image:

g_norm = (g_image − β · 1) / α  (C.0.1)

where 1 is a unit vector. In practice each texture vector of m pixels is aligned to the standardized mean texture ḡ by offsetting it to zero mean:

g_zeromean = g − ḡ · 1,  ḡ = (1/m) Σ_{i=1}^{m} g_i = (1/m) g · 1  (C.0.2)

and scaling it to unit variance:

ĝ = (1/σ) g_zeromean,  σ² = (1/m) Σ_{i=1}^{m} (g_i − ḡ)²  (C.0.3)

Notice that the variance estimate simplifies to σ² = (1/m) Σ_{i=1}^{m} g_i² for g_zeromean, since it has zero mean. α and β can thus be written as:

α = g_image · ĝ  (C.0.4)
β = (g_image · 1) / m  (C.0.5)
Since α is defined in terms of the mean, an iterative approach must be taken.
The pseudo code is:
1. Do
2.   Estimate the mean of all texture vectors, ḡ
3.   Standardize ḡ
4.   For each texture vector g_image
5.     α = g_image · ĝ
6.     β = (g_image · 1)/m
7.     Normalize g_image using (C.0.1)
8.   End
9. Until ḡ is stable
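The core operations of this loop can be sketched in numpy (function names are illustrative, not from the thesis):

```python
import numpy as np

def standardize(g):
    """Offset a texture vector to zero mean and scale it to unit variance
    (equations C.0.2 and C.0.3)."""
    g0 = g - g.mean()
    return g0 / np.sqrt(np.mean(g0 ** 2))

def normalize_texture(g_image, g_ref):
    """Apply the linear illumination correction of equation C.0.1, with
    alpha = g_image . g_ref and beta = (g_image . 1)/m (C.0.4, C.0.5),
    where g_ref is the standardized reference texture."""
    m = g_image.size
    alpha = float(g_image @ g_ref)
    beta = float(g_image.sum()) / m
    return (g_image - beta) / alpha
```

In the full algorithm these two functions are iterated: the mean texture is re-estimated and standardized until it stabilizes, as the pseudo code above describes.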
Appendix D GVF Snakes Implementation

The GVF deformable contour model is characterized by the following dynamic equation:

x_t(s, t) = (α(s) x′(s, t))′ − (β(s) x″(s, t))″ + F_ext(x),  x(s, 0) = x_0(s)  (D.0.1)

where F_ext is the external force and x′ denotes the partial derivative of x with respect to s. The equation simplifies when the values of α and β are chosen to be constants. Approximating the derivatives with finite differences and converting to the vector notation x_i = (x_i, y_i) = (x(ih), y(ih)), we can rewrite equation (D.0.1) as

(x_i^t − x_i^{t−1}) / τ = −α_i (x_i^t − x_{i−1}^t) + α_{i+1} (x_{i+1}^t − x_i^t)
    − β_{i−1} (x_{i−2}^t − 2 x_{i−1}^t + x_i^t) + 2 β_i (x_{i−1}^t − 2 x_i^t + x_{i+1}^t) − β_{i+1} (x_i^t − 2 x_{i+1}^t + x_{i+2}^t)
    + F_ext(x_i^{t−1})  (D.0.2)
where x_i = x(ih), α_i = α(ih), β_i = β(ih), h is the step size in space, and τ the step size in time. In general, the external force F_ext is stored as a discrete vector field, i.e., a finite set of vectors defined on an image grid. The value of F_ext at any location x_i can be obtained through bilinear interpolation of the external force values at the grid points near x_i. Equation D.0.2 can be written in the compact matrix form

(x^t − x^{t−1}) / τ = A x^t + F_ext(x^{t−1})  (D.0.3)

where A is a pentadiagonal banded matrix. Equation D.0.3 can be solved iteratively by matrix inversion:

x^t = (I − τA)^{−1} (x^{t−1} + τ F_ext(x^{t−1}))  (D.0.4)
We note that since the above finite difference scheme is implicit with respect to the internal forces, it can handle very rigid deformable contours with a large step size [1]. The external force F_ext is derived from the image features of interest, such as boundaries. Given a gray-level image I(x, y), typical external forces that lead the snake toward step edges are:

F_ext(x, y) = −|∇I(x, y)|²
F_ext(x, y) = −|∇[G_σ(x, y) ∗ I(x, y)]|²

where G_σ(x, y) is a two-dimensional Gaussian function with standard deviation σ and ∇ is the gradient operator.
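The implicit update of equation D.0.4 can be sketched in numpy for constant α and β on a closed contour (a hedged illustration, not the thesis code; the external force array is supplied by the caller, e.g. sampled from −|∇(G_σ ∗ I)|²):

```python
import numpy as np

def snake_matrix(n, alpha, beta):
    """Circulant pentadiagonal internal-force matrix A of equation D.0.2
    for a closed contour with constant alpha (tension) and beta (rigidity);
    assumes n >= 5 so the five bands do not overlap."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, i] = -2.0 * alpha - 6.0 * beta
        A[i, (i - 1) % n] = alpha + 4.0 * beta
        A[i, (i + 1) % n] = alpha + 4.0 * beta
        A[i, (i - 2) % n] = -beta
        A[i, (i + 2) % n] = -beta
    return A

def snake_step(x, A, f_ext, tau):
    """One implicit step of equation D.0.4:
    x^t = (I - tau*A)^{-1} (x^{t-1} + tau*F_ext(x^{t-1}))."""
    n = len(x)
    return np.linalg.solve(np.eye(n) - tau * A, x + tau * f_ext)
```

With zero external force, the tension term alone contracts a circular contour slightly at every step, which is the expected behavior of the internal energy.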
Bibliography [1] W. F. Ames. Numerical Methods for Partial Differential Equations. Boston: Academic Press, 3rd edition, 1992. [2] N. Ayache. Medical Computer Vision, Virtual Reality and Robotics. Image and Vision Computing, 13(4):295–313, May 1995. [3] E. Bardinet, L.D. Cohen, and N. Ayache. Tracking and motion analysis of the left ventricle with deformable superquadrics. Medical Image Analysis, 1(2), 1996. Note: Also INRIA Research Report RR-2797,, 1996. [4] J.-P Berenger. A perfectly matched layer for the absorption of electromagnetic waves. J. Computational Phys, 114:185–200, 1994. [5] F. L. Bookstein. Principal warps: thin-plate splines and the decomposition of deformations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(6):567–585, 1989. [6] F. L. Bookstein. Landmark methods for forms without landmarks: Localizing group differences in outline shape. Medical Image Analysis, 1(3):225–244, 1997. [7] B. Burns, K. Nishihara, and S. Rosenschein. Appropriate Scale Local Centers: a Foundation for Parts-based Recognition. Technical report, Teleos Research, 1994. TR-9405. [8] J. Canny. A computational approach to edge detection. IEEE Trans. Pattern Analysis and Machine Intelligence, 8(6):679–698, 1986. 105
[9] V. Caselles, F. Catte, T. Coll, and F. Dibos. A geometric model for active contours. Numerische Mathematik, 66:1–31, 1993.
[10] V. Caselles, R. Kimmel, and G. Sapiro. Geodesic active contours. In Proceedings of the Fifth International Conference on Computer Vision, pages 694–699, 1995.
[11] G. C.-H. Chuang and C.-C. J. Kuo. Wavelet descriptor of planar curves: theory and applications. IEEE Transactions on Image Processing, 5:56–70, 1996.
[12] L. D. Cohen. On active contour models and balloons. CVGIP: Image Understanding, 53:211–218, March 1991.
[13] L. D. Cohen and I. Cohen. Finite-element methods for active contour models and balloons for 2-D and 3-D. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15:1131–1147, November 1993.
[14] T. Cootes, C. Beeston, G. Edwards, and C. Taylor. A Unified Framework for Atlas Matching using Active Appearance Models. In Proceedings of the International Conference on Information Processing in Medical Imaging, pages 322–333, 1999.
[15] T. Cootes, G. Edwards, and C. Taylor. Active Appearance Models. In Proceedings of the European Conference on Computer Vision, volume 2, pages 484–498. Springer, 1998.
[16] T. Cootes, K. Walker, G. Edwards, and C. Taylor. View-Based Active Appearance Models. In Proceedings of the International Conference on Face and Gesture Recognition, pages 227–232, 2000.
[17] T. Cootes, A. Hill, and C. Taylor. Medical Image Interpretation Using Active Shape Models: Recent Advances. In Y. Bizais, editor, Information Processing in Medical Imaging, pages 371–372. Kluwer Academic Publishers, 1995.
[18] T. Cootes and C. Taylor. Statistical models of appearance for medical image analysis and computer vision. In Proceedings of SPIE Medical Imaging, 2001.
[19] T. Cootes, C. Taylor, D. Cooper, and J. Graham. Training Models of Shape from Sets of Examples. In Proceedings of the British Machine Vision Conference, pages 9–18. Springer-Verlag, 1992.
[20] T. Cootes, C. Taylor, D. Cooper, and J. Graham. Active shape models: their training and application. Computer Vision and Image Understanding, 61(1):38–59, 1995.
[21] T. Cootes, C. Taylor, A. Hill, and J. Haslam. The Use of Active Shape Models for Locating Structures in Medical Images. In Proceedings of the 13th International Conference on Information Processing in Medical Imaging, pages 33–47. Springer-Verlag, 1993.
[22] T. Cootes, C. Taylor, and A. Lanitis. Active Shape Models: Evaluation of a Multi-Resolution Method for Improving Image Search. In Proceedings of the British Machine Vision Conference (BMVC 1994), pages 327–336, 1994.
[23] K. Doi and M. L. Giger. Computer-Aided Diagnosis in Medical Imaging. Elsevier Science, 1999.
[24] J. Duncan and N. Ayache. Medical image analysis: progress over two decades and the challenges ahead. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(1):85–106, 2000.
[25] H. El-Messiry, H. A. Kestler, O. Grebe, and H. Neumann. Scale-space Decomposition for Segmenting the Ventricular Structure in Cardiac MR Images. Bildverarbeitung für die Medizin, pages 181–185, 2003.
[26] H. El-Messiry, H. A. Kestler, O. Grebe, and H. Neumann. Segmenting the Endocardial Border of the Left Ventricle in Cardiac Magnetic Resonance Images. IEEE Conf. Computers in Cardiology, 30:181–185, 2003.
[27] C. A. Glasbey and K. V. Mardia. A review of image-warping methods. Journal of Applied Statistics, 25(2):155–172, 1998.
[28] Y. Gong and M. Sakauchi. Detection of regions matching specified chromatic features. Computer Vision and Image Understanding, 61:263–269, 1995.
[29] R. C. Gonzalez and R. E. Woods. Digital Image Processing. Prentice Hall, 2nd edition, 2002.
[30] H. Gray. Gray's Anatomy. C. Livingstone, New York, 37th edition, 1989.
[31] R. M. Haralick and L. G. Shapiro. Image segmentation techniques. Computer Vision, Graphics and Image Processing, 29(1):100–132, January 1985.
[32] G. Hamarneh. Deformable spatio-temporal shape modelling. PhD thesis, Department of Signals and Systems, Chalmers University of Technology, Sweden, 1999.
[33] H. Heijmans. Morphological Image Operators. Advances in Electronics and Electron Physics, Academic Press, Boston, 1994.
[34] W. E. Higgins, M. Chung, and E. L. Ritman. Extraction of left-ventricular chamber from 3-D CT images of the heart. IEEE Transactions on Medical Imaging, 9:384–395, 1990.
[35] A. Hill, C. Taylor, and A. Brett. A framework for automatic landmark identification using a new method of nonrigid correspondence. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(3):241–251, 2000.
[36] W. S. Hinshaw and A. H. Lent. An introduction to NMR imaging: from the Bloch equation to the imaging equation. Proceedings of the IEEE, 71(3):338–350, March 1983.
[37] R. J. Schalkoff. Digital Image Processing and Computer Vision. John Wiley & Sons, 1989.
[38] J. Sporring, M. Nielsen, L. Florack, and P. Johansen, editors. Gaussian Scale-Space Theory. Kluwer, 1997.
[39] J. E. Jackson. A User's Guide to Principal Components. John Wiley & Sons, 1991.
[40] P. T. Jackway. Morphological Scale-Space With Application to Three-Dimensional Object Recognition. PhD thesis, Queensland University of Technology, 1994.
[41] P. T. Jackway and M. Deriche. Scale-space Properties of the Multiscale Morphological Dilation-Erosion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(1):38–51, 1996.
[42] A. Jain. Fundamentals of Digital Image Processing. Prentice Hall, 1989.
[43] J. Jin. Electromagnetic Analysis and Design in Magnetic Resonance Imaging. CRC Press, New York, 1999.
[44] J. Serra. Introduction to mathematical morphology. Computer Vision, Graphics and Image Processing, 35:283–305, 1986.
[45] M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active Contour Models. International Journal of Computer Vision, 1(4):321–331, 1987.
[46] J. J. Koenderink. The structure of images. Biological Cybernetics, 50:363–370, 1984.
[47] U. Köthe. Local appropriate scale in morphological scale-space. Proc. 4th European Conference on Computer Vision, 1:219–228, 1996.
[48] U. Köthe. Morphological Appropriate Scale Measurements for Region Segmentation. Technical Report 96/19, Dept. of Computer Science, University of Copenhagen; in P. Johansen, editor, Proc. of the Copenhagen Workshop on Gaussian Scale-Space Theory, 1996.
[49] U. Köthe. Primary image segmentation. Proc. 17th DAGM Symposium. Springer, 1995.
[50] H. J. Lamb, J. Doornbos, E. A. van der Velde, M. C. Kruit, J. H. C. Reiber, and A. de Roos. Echo-planar MRI of the heart on a standard system: validation of measurements of left ventricular function and mass. Journal of Computer Assisted Tomography, 20(6).
[51] T. Lindeberg. On scale selection for differential operators. Proceedings of the 8th Scandinavian Conference on Image Analysis, pages 857–866, May 1993.
[52] T. Lindeberg. Scale-Space Theory in Computer Vision. Kluwer Academic, 1994.
[53] T. Lindeberg. Scale-space: A framework for handling image structures at multiple scales. CERN School of Computing, 1996.
[54] G. Matheron. Random Sets and Integral Geometry. Wiley, 1975.
[55] I. Matthews, T. Cootes, A. Bangham, and R. Harvey. Extraction of Visual Features for Lipreading. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(2).
[56] T. McInerney and D. Terzopoulos. Topology adaptive deformable surfaces for medical image volume segmentation. IEEE Transactions on Medical Imaging, 18(10):840–850, October 1999.
[57] T. McInerney and D. Terzopoulos. Deformable models in medical image analysis: a survey. Medical Image Analysis, 1(2):91–108, 1996.
[58] S. C. Mitchell, B. Lelieveldt, R. J. van der Geest, H. Bosch, J. Reiber, and M. Sonka. Multistage Hybrid Active Appearance Model Matching: Segmentation of Left and Right Ventricles in Cardiac MR Images. IEEE Transactions on Medical Imaging, 20(5), 2001.
[59] W. M. Neuenschwander, P. Fua, L. Iverson, G. Székely, and O. Kübler. Ziplock snakes. International Journal of Computer Vision, 25(3):191–201, 1997.
[60] H. Neumann. Methoden der primären Bildverarbeitung 2, 2004. Lecture Notes in Computer Vision I.
[61] P. M. Pattynama, H. J. Lamb, E. A. van der Velde, E. E. van der Wall, and A. de Roos. Left ventricular measurements with cine and spin-echo MR imaging: a study of reproducibility with variance component analysis. Radiology, 187:261–268, 1993.
[62] P. Perona and J. Malik. Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12:629–639, 1990.
[63] I. Pitas and A. Maglara. Range Image Analysis by Using Morphological Signal Decomposition. Pattern Recognition, 24(2):165–181, 1991.
[64] W. K. Pratt. Digital Image Processing. John Wiley & Sons, New York, 1978.
[65] P. K. Sahoo, S. Soltani, A. K. C. Wong, and Y. C. Chen. A survey of thresholding techniques. Computer Vision, Graphics and Image Processing, 41(2):233–260, February 1988.
[66] P. Salembier and M. Kunt. Size-sensitive multiresolution decomposition of images with rank order based filters. Signal Processing, 27(2):205–241, 1992.
[67] S. Sclaroff and J. Isidoro. Active blobs. Proc. of the International Conference on Computer Vision, pages 1146–1153, 1998.
[68] R. Sedgewick. Algorithms. Addison-Wesley, 2nd edition, 1988.
[69] J. Serra. Image Analysis and Mathematical Morphology, volume 1. Academic Press, 1982.
[70] J. Serra. Image Analysis and Mathematical Morphology, volume 2. Academic Press, 1988.
[71] K. K. Shung. Principles of Medical Imaging. Academic Press, San Diego, 1992.
[72] D. D. Stark and W. G. Bradley. Magnetic Resonance Imaging. C. V. Mosby Company, St. Louis, 1988.
[73] S. R. Sternberg. Grayscale morphology. Computer Vision, Graphics, and Image Processing, 35(3):333–355, 1986.
[74] M. Turk and A. Pentland. Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1):71–86, 1991.
[75] R. J. van der Geest, V. G. M. Buller, E. Jansen, H. J. Lamb, L. H. B. Baur, E. E. van der Wall, A. de Roos, and J. H. C. Reiber. Comparison between manual and semiautomated analysis of left ventricular volume parameters from short-axis MR images. Journal of Computer Assisted Tomography, 21(5):756–765, 1997.
[76] B. van Ginneken, B. M. ter Haar Romeny, and M. A. Viergever. Computer-Aided Diagnosis in Chest Radiography: A Survey. IEEE Transactions on Medical Imaging, 20(12):1228–1241, December 2001.
[77] D. Wang, V. Haese-Coat, A. Bruno, and J. Ronsin. Texture classification and segmentation based on iterative morphological decomposition. Journal of Visual Communication and Image Representation, 4(3):197–214, July 1993.
[78] J. Wang and X. Li. Guiding Ziplock Snakes with A Priori Information. IEEE Transactions on Image Processing, 12(2):176–185, 2003.
[79] J. Weickert, S. Ishikawa, and A. Imiya. Scale-Space has been Discovered in Japan. Technical Report TR-97/18, Department of Computer Science, University of Copenhagen, 1997.
[80] J. Weng, A. Singh, and M. Y. Chiu. Learning-based ventricle detection from cardiac MR and CT images. IEEE Transactions on Medical Imaging, 16:378–391, 1997.
[81] A. P. Witkin. Scale-space filtering. Proceedings of the 8th International Joint Conference on Artificial Intelligence, 2:1019–1022, 1983.
[82] A. P. Witkin. Scale-space filtering: a new approach to multiscale description. In Image Understanding 1984, pages 79–95, 1984.
[83] C. Xu. Deformable models with application to human cerebral cortex reconstruction from magnetic resonance images. PhD thesis, Johns Hopkins University, 2000.
[84] C. Xu and J. L. Prince. Generalized gradient vector flow external forces for active contours. Signal Processing, 71(2):132–139, 1998.
[85] F. R. C. Yin. Ventricular wall stress. Circulation Research, 49:829–842, 1981.
[86] A. A. Young, H. Imai, C.-N. Chang, and L. Axel. Two-dimensional left ventricular deformation during systole using magnetic resonance imaging with spatial modulation of magnetization. Circulation, 89(2):740–752, February 1994.
[87] A. L. Yuille and T. A. Poggio. Scaling theorems for zero crossings. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8:15–25, 1986.