Segmentation of the Parotid Glands

6 downloads 74 Views 254KB Size Report
TomoTherapy, Madison, WI. 3 Results. A common methodological background shared between all submissions consists of non-rigid registration between the ...
Head and Neck Auto-segmentation Challenge: Segmentation of the Parotid Glands Vladimir Pekar1 , St´ephane Allaire2,4 , Arish A. Qazi2,4 , John J. Kim2,3 and David A. Jaffray2,4 1

2

Philips Research North America, Markham, ON, Canada Princess Margaret Hospital, Radiation Medicine Program, Toronto, ON, Canada 3 Dept. of Radiation Oncology, University of Toronto, Toronto, ON, Canada 4 Dept. of Medical Biophysics, University of Toronto, Toronto, ON, Canada

Abstract. This paper presents the results of the Head and Neck Autosegmentation Challenge, a part of the workshop, “Medical Image Analysis in the Clinic: A Grand Challenge”, held in conjunction with the 13th International Conference on Medical Image Computing and Computer Assisted Interventions (MICCAI), Beijing, China, in September 2010. The aim of the challenge was to evaluate the performance of fully automated algorithms in segmenting the parotid glands in head and neck CT image data used in radiotherapy planning. We describe the motivation behind the clinical application selected for the challenge, the image data used, and the metrics applied for the quantitative assessment of the segmentation accuracy with respect to the ground truth segmentations provided by a clinical expert. The quantitative evaluation results of the auto-segmentations submitted by the workshop participants are included.

1

Introduction

Radiation therapy is one of the three principal treatments for cancer besides surgery and chemotherapy. It is based on the principle of damaging DNA of the malignant cells by applying ionizing radiation [1]. External beam radiation treatment planning is a process of setting up the treatment protocol including dose computation and beam placement and is typically done using 3-D computed tomography (CT) image data. Accurate segmentation of the target volumes and risk organs in the patient’s image is a crucially important part of the planning procedure. Although some commercial software products allowing for semi- and fully automated segmentation of risk organs have recently become available, their application is limited for many anatomical structures, where the common clinical practice is still 2-D manual contouring in axial slices using standard drawing tools. The planning of head and neck cancer radiation therapy is especially laborintensive due to the complexity of the underlying anatomy and the large number of contours that need to be generated. Manual contouring can often require

273

several hours to be spent on a single plan. On the other hand, automated segmentation of many organs at risk in the head and neck area is challenging due to poor soft tissue discrimination in CT, artifacts from dental fillings, and large variability of patient’s anatomy. The purpose of the Head and Neck Auto-segmentation Challenge is evaluating the performance of state-of-the-art fully automatic segmentation algorithms in CT image data used for radiotherapy planning. It is organized as a part of “The Grand Challenge” series of workshops series [2] held in conjunction with the Medical Image Computing and Computer Assisted Interventions (MICCAI). These workshops have been attracting considerable attention from the scientific community as they provide an excellent testground for systematic and unbiased evaluation of segmentation algorithms with a focus on important clinical applications. The first Head and Neck Auto-segmentation Challenge took place in London, UK, in September 2009 [3]. The present paper describes the offsite evaluation results for the contributions submitted to the second Head and Neck Auto-segmentation Challenge held in Beijing, China, in September 2010. The organization of this paper follows the structure of the paper summarizing the on-site results of the previous event [3]: Section 2 describes the challenge objectives, presents the image data used for the contest, and introduces the evaluation metrics used for the quantitative assessment of segmentation accuracy. In Section 3, the results of the evaluation study for the data submitted offsite by the participants are discussed. Section 4 concludes the paper.

2 2.1

The challenge Clinical background

Anatomical structures that need to be contoured in the planning routine include the treatment target volumes and a set of structures at risk. Among the target volumes, a differentiation is made between the gross tumor volumes (GTV) encompassing the visible extents of the disease and regional lymph nodes, the clinical target volumes (CTV) accounting for possible microscopic infiltration into the surrounding tissue, and the planning target volumes (PTV) which add margins to the CTV due to various geometric uncertainties at treatment time, such as patient setup differences, changes in the tumor volume, etc. The definition of the target volumes is usually highly patient specific. In most cases it cannot rely on image information alone and is based on the clinical case, hospital common practice, physician’s experience, and other subjective factors. Due to these reasons, automatic contouring of the target volumes is quite challenging. In addition to the target volumes, a set of critical structures at risk must also be contoured. The goal is to incorporate this information into the treatment plan, thus minimizing the dose delivered to the critical structures and leading to reduced radiation induced toxicity. Due to the complexity of the head and neck anatomy, a large number of organs needs to be contoured. Their size and appearance in the image is highly variable across patients. Based on their appearance

274

in CT, anatomical structures can be divided in two groups: high-contrast bones and low-contrast soft tissues. For the first Head and Neck Auto-segmentation Challenge, which took place in 2009, two prominent organs at risk were selected: i) mandibular bone and ii) brainstem, which is a soft tissue structure. Both organs are always contoured in a head and neck treatment plan and their excessive irradiation can lead to significant morbidity for the patient. The submitted segmentation results demonstrated that it was potentially possible to reduce the contouring time by applying fully automatic algorithms to segment the mandible, where the segmentation accuracy of approximately two thirds of all slices could be deemed acceptable. In contrast, the user would need to correct about half of the slices of the brainstem after applying auto-segmentation [3]. In this year’s challenge, the complexity of the task has been increased, and the participants were invited to develop fully automatic solutions to segment the parotid glands. These bilateral organs are responsible for the salivary function and represent important structures at risk contoured in most of the head and neck treatment plans [4]. The shape of the parotid glands is rather irregular with concavities, and their gray-value appearance in CT data spans the intensity range between fat and muscle but can be inhomogeneous and highly affected by dental artifacts. All these factors make their reliable fully automatic segmentation very challenging. 2.2

Image data

The datasets used were the planning CT images of 25 anonymized patients acquired at the Princess Margaret Hospital in Toronto, Canada, and were identical to the 2009 challenge. The reconstruction matrix for all datasets was 512×512 pixels with the pixel size of approximately 0.98×0.98 mm. The number of slices was in the range of 100-200 slices with the slice thickness of 2 mm. Manual delineations of both left and right parotid glands were generated by an expert radiation oncologist and stored as a set of axial contours for visualization purposes and as binary masks for the quantitative evaluation. Analogously to the 2009 challenge, the datasets were organized in 3 groups: 10 datasets could be used by the participants for training purposes, for which the manual ground truth segmentations were provided; 8 datasets were used for the offsite testing; 7 datasets were left for the online contest. 2.3

Evaluation metrics

The evaluation metrics used in the challenge have been selected to reflect different aspects of segmentation quality assessment for the clinical application in focus. The contours produced by any auto-segmentation algorithm in radiotherapy planning must be reviewed and approved by a clinician in order to be used instead of manual delineations, and interactive corrections are often required for the problematic areas. The review and correction process is typically performed, analogously to manual contouring, by inspecting axial slices of the dataset. Thus,

275

G;