INSTITUTE OF PHYSICS PUBLISHING
PHYSICS IN MEDICINE AND BIOLOGY
Phys. Med. Biol. 51 (2006) 4789–4806
doi:10.1088/0031-9155/51/19/005

An enhanced block matching algorithm for fast elastic registration in adaptive radiotherapy

U Malsch¹, C Thieke²,³, P E Huber²,³ and R Bendl¹

¹ Department of Medical Physics in Radiation Therapy, Deutsches Krebsforschungszentrum (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
² Clinical Cooperation Unit Radiooncology, Deutsches Krebsforschungszentrum (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
³ Department of Radiation Oncology, University of Heidelberg, Im Neuenheimer Feld 400, 69120 Heidelberg, Germany

E-mail: [email protected], [email protected], [email protected] and [email protected]

Received 13 April 2006, in final form 26 July 2006
Published 8 September 2006
Online at stacks.iop.org/PMB/51/4789

Abstract

Image registration has many medical applications in diagnosis, therapy planning and therapy. Especially for time-adaptive radiotherapy, an efficient and accurate elastic registration of images acquired for treatment planning, and at the time of the actual treatment, is highly desirable. Therefore, we developed a fully automatic and fast block matching algorithm which identifies a set of anatomical landmarks in a 3D CT dataset and relocates them in another CT dataset by maximization of local correlation coefficients in the frequency domain. To transform the complete dataset, a smooth interpolation between the landmarks is calculated by modified thin-plate splines with local impact. The concept of the algorithm allows separate processing of image discontinuities like temporally changing air cavities in the intestinal tract or rectum. The result is a full transformation of the 3D planning dataset (planning CT as well as delineations of tumour and organs at risk) to a verification CT, allowing evaluation and, if necessary, changes of the treatment plan based on the current patient anatomy without time-consuming manual re-contouring. Typically the total calculation time is less than 5 min, which allows the use of the registration tool between acquiring the verification images and delivering the dose fraction for online corrections. We present verifications of the algorithm for five different patient datasets with different tumour locations (prostate, paraspinal and head-and-neck) by comparing the results with manually selected landmarks, visual assessment and consistency testing. It turns out that the mean error of the registration is better than the voxel resolution (2 × 2 × 3 mm³). In conclusion, we present an algorithm for fully automatic elastic image registration that is precise and fast enough for online corrections in an adaptive fractionated radiation treatment course.

(Some figures in this article are in colour only in the electronic version)

Figure 1. Flow diagram of adaptive radiotherapy level 2. Diagnosis, initial planning (manual delineation, beam configuration, etc) and dose calculation for simulation are done once. Patient setup (positioning), verification CT scans, rigid (level 2a) or elastic (level 2b) registration of tissue deformations and adaptation of the plan have to be done repeatedly during the fractionated treatment course.

1. Introduction

Recent developments in radiotherapy techniques, most prominently intensity-modulated photon therapy and particle therapy, allow for a conformal irradiation where the high-dose area can be tailored even to complex-shaped target volumes with steep dose gradients to surrounding normal tissues. However, in a fractionated treatment course patient misalignments and changing internal anatomy become more critical, since higher conformality holds a higher risk of target underdosage or organ-at-risk (OAR) overdosage. Even with rigid fixation devices, maximum positioning errors higher than 1 cm are observable (Thieke et al 2006). In addition, deviations occur as a result of organ movements due to different fillings of hollow organs such as bladder and rectum, tissue-dependent reaction to irradiation and other changes, e.g. respiratory motion during therapy. These uncertainties are handled in clinical routine by standardized safety margins around the tumour to compensate for deviations in size, shape and position during fractionated therapy. As a result, healthy tissue is often exposed to therapeutic dose levels, which can increase the complication probability or force the planner to reduce the prescribed dose, resulting in a lower tumour control probability.

A novel strategy for a more precise radiotherapy is dynamic adaptation of the initial treatment plan to changes in position, size and shape of organs and target volume(s) during fractionated radiotherapy. By this strategy safety margins can potentially be reduced to decrease the normal tissue complication probability while preserving or increasing the tumour control probability. Different levels of image-guided adaptive radiotherapy can be defined: level 1 corrects for errors offline, i.e., a control CT scan is acquired and the patient is irradiated immediately afterwards. After several scans have been acquired, the systematic component of the patient misalignment is calculated and corrected for, either by simply shifting the target point (level 1a) or by elastic adaptation and re-optimization of the complete plan (level 1b). Level 2 uses the same procedures as level 1, but with online correction between the control CT scan and the following irradiation to compensate also for random errors. Level 3 means adaptation during a single fraction to compensate also for intrafractional variations like breathing motion, requiring an imaging device operating while the patient is irradiated. The general procedure for adaptive radiotherapy with online correction (level 2) is shown in figure 1.

Changing a plan requires new calculation and evaluation of the resulting dose distribution. This is in contrast to conventional static radiotherapy, where treatment is planned once before the first fraction without any changes during the course of therapy (level 0). To set up a new plan and evaluate the resulting dose distribution, an adaptation of the three-dimensional patient model is necessary.
This also requires a new delineation of all structures of interest (target volumes and organs at risk), which is a challenging task, especially if it has to be performed in the short period between acquisition of the verification images and dose delivery. New delineations can be achieved by automated segmentation. For instance, Gibou et al (2005) presented a PDE-based approach for automatic delineation, which uses a segmentation model for each organ that must be placed manually at the organs at risk, and Pekar et al (2004) use an adaptation of 3D deformable surface models to the boundaries of organs.

This paper is focused on a different technique to overcome this crucial bottleneck in adaptive radiotherapy in adaptation level 1b and especially level 2b (online elastic adaptation). An algorithm is presented that automatically performs an elastic transformation of the planning CT scan to match the actual control CT scan. Then the same transformation is applied to the original organ contours. The result is detailed information about the displacement of each single voxel in the form of a vector field, and fully automatic adaptation of the contours of targets and organs at risk to the actual control CT scan.

2. Image registration techniques

Several registration techniques have been proposed during the last few decades. Depending on the character of the deviation, a global rigid transformation, described by a single 4 × 4 transformation matrix R for translation and rotation, can be sufficient. In the following we will focus on more complex deviations between two images, requiring elastic registration described by a vector field varying over the whole image.

Let us assume an image A and another image B, both of which are full three-dimensional datasets. B is an image of the same scene as A, but taken at different time points (B_{t_1}, B_{t_2}, ..., B_{t_N}) and possibly also taken with a different orientation or modality (e.g. A_CT and B_MRI for CT and MRI images). The result of elastic registration between these images is a vector field T, mapping each element of image A to a corresponding element of image B:

    T_{BA}(B) \approx A \qquad \text{or} \qquad T_{BA}^{-1}(A) \approx B.    (1)

Every pixel is assigned an individually calculated translation vector t that maps its position p = (p_x, p_y, p_z) in image A to another position q = (q_x, q_y, q_z) in image B:

    A(p) = B(p - t(p)) = B(q).    (2)
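Equation (2) is a backward mapping: the transformed image samples B at p − t(p). As a minimal illustration (not taken from the paper), such a field can be applied with trilinear interpolation, here using scipy.ndimage.map_coordinates; the array layout of t (components first, in voxel units) is an assumption of this sketch.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_backward(image_b, t_field):
    """Apply equation (2): output(p) = B(p - t(p)).  `image_b` is the moving
    image, `t_field` has shape (3, nx, ny, nz) holding t_x, t_y, t_z in
    voxel units.  Trilinear interpolation (order=1) samples B off-grid."""
    grid = np.indices(image_b.shape, dtype=float)   # voxel coordinates p
    coords = grid - t_field                         # p - t(p)
    return map_coordinates(image_b, coords, order=1, mode='nearest')
```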

Several methods for calculating such a transformation vector field by elastic registration have been developed. Overviews of these methods are published in Hill et al (2001), Maintz and Viergever (1998) and Zitová and Flusser (2003). One of the first methods for elastic registration was optical flow (OF), introduced by Schunck and Horn (1981). OF calculates the translation vector field by treating image positions like particles in a liquid. The main part of the calculation is solving differential equations which describe the pixel motion by changes in the image gradient, i.e. image intensities are differentiated in 3D space and time. Motion estimations with OF are especially reliable for small movements in monomodal images of different time points. If the movements exceed the size of the moving objects, corresponding gradients are not mapped properly and the calculated displacement vectors are wrong.

Thirion (1998) proposed to consider elastic registration as a diffusion process: 'demons' at significant image positions deform the image according to local deviations between both images. The introduction of forces was inspired by optical flow equations. Adjustments of the result or a correction of input parameters can be difficult, because the position of the demons is changed often during the registration process to overcome some limitations of OF. Therefore it is difficult to predict the outcome of the algorithm, and if the method fails it is difficult to understand the reasons (Pennec et al 1999). Wang et al (2005a, 2005b) accelerated Thirion's algorithm markedly by the usage of an additional force, i.e. not only the gradients of the first image are used but also the gradients of the second image. Fast calculation times (approximately 6 min for a CT scan with 61 slices) and an impressive accuracy (within 1.5 pixels) were presented.

Other authors propose viscous fluid registration methods which can cope with larger deformations (e.g. Christensen et al (1994)). Properties of either solid or hypothetical compressible viscous fluids are simulated when the model is deformed. A refinement uses a convolution filter to reduce the calculation time of this approach (Bro-Nielsen and Gramkow 1996). Recent implementations allow deformable registration within calculation times of approximately 12 min per image (Foskey et al 2005) or within minutes with a multiresolution approach (Crum et al 2005).

Currently, free-form deformations (FFD) are a widely used registration approach (Lau et al 2001). They can be used for a wide range of images, monomodal as well as multimodal. FFD methods iteratively estimate the registration by minimizing a global voxel-based similarity measure (e.g. mutual information, correlation, summed absolute differences), calculated on the whole image. Regularly arranged grid nodes are transformed in all three dimensions, and the whole image is transformed in each optimization step, e.g. by thin-plate splines or B-splines. If N nodes are defined, an optimization of 3N variables must be performed. These algorithms are computationally expensive, and only special hardware (OpenGL, massively parallel architectures, shared memory) allows acceptable calculation times in the range of minutes (Hastreiter et al 2004, Rohlfing and Maurer 2003, Rueckert et al 1999).

Block matching algorithms use a set of corresponding landmarks to calculate an interpolated vector field (Bookstein 1989, Pekar et al 2006). To overcome the time-consuming manual selection of corresponding landmark pairs, Rösch et al (2001) introduced a fully automatic method. Clatz et al (2005) showed the possibility of parallelizing block matching for usage in intra-operative MRI registration. They reported calculation times of 35 s on a PC cluster consisting of about 15 dual Pentium CPUs.

For the usage of elastic registration in adaptive radiotherapy, several requirements must be considered. The most important constraint is the already mentioned short time frame available for calculation of the elastic transformation if online correction is desired. Another one is the ability to deal with images of reduced quality, e.g. when verification CT images are acquired with a lower dose compared to a diagnostic CT scan or when linac-attached kV imaging is used. The matching accuracy should be better than about two voxels in clinically relevant structures. The algorithm should be able to deal with image discontinuities, e.g. with structures visible in only one of the two images like gas cavities in the rectum, situations where different tissue types should be processed separately, or when closely adjacent structures actually move in opposite directions. Finally, it should be possible to adjust the segmented patient model (i.e. the contours of the target and OARs) by using the resulting vector field. Due to these requirements, we decided to use a block matching algorithm similar to that proposed by Rösch et al (2001).

In this paper we present our implementation of this approach and discuss the necessary extensions and modifications to deal sufficiently with all mentioned constraints, with the focus on application in adaptive radiotherapy.

3. Material and methods

The main steps of the algorithm are the identification of pairs of corresponding landmarks at suitable positions in both images, calculation of appropriate translation vectors and interpolation of vectors for all other image positions. Therefore the global translation vector field is not calculated at once (as in, e.g., optical flow based methods). Instead, first some promising positions p_1, ..., p_N are selected in the first image and relocated in the second image by searching for a surrounding sub-region Ω_{p_i} with M voxels arranged isotropically around p_i. Additional strategies are applied for handling discontinuities to allow for separate handling of moving air cavities etc. Then a translation vector field T is interpolated to get the complete elastic transformation, and the translation is applied to the organ contours as well. A sketch of this overall pipeline is given below.
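To make the sequence of steps concrete, the following Python sketch strings them together. The helper names (select_templates, relocate, fit_block_tps) are hypothetical placeholders for the steps detailed and sketched in sections 3.1-3.3, warp_backward is the small warping example after equation (2), and the acceptance threshold mirrors the value reported later in section 4.1; this is an outline under those assumptions, not the authors' implementation.

```python
def register_elastic(img_a, img_b, cc_threshold=0.8):
    """Hypothetical driver for the block matching pipeline of section 3:
    landmark selection, relocation by local correlation, thin-plate-spline
    interpolation and backward warping of the image (or the VOI contours)."""
    pairs = []
    for p in select_templates(img_a):            # section 3.1
        q, cc = relocate(img_a, img_b, p)        # section 3.2
        if cc > cc_threshold:                    # reject doubtful matches
            pairs.append((p, q))
    t_field = fit_block_tps(pairs, img_a.shape)  # section 3.3
    return warp_backward(img_b, t_field), t_field
```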

Figure 2. (a) Transversal slice of a prostate case. (b) Local variances of surrounding 7 × 7 × 3 voxels. Highest variances are located at the body surface and at bony structures. (c) Raw tissue separation by fixed thresholds with selected templates highlighted as white dots.

around pi . Additional strategies are applied for handling discontinuities to allow for separate handling of moving air cavities etc. Then a translation vector field T is interpolated to get the complete elastic transformation, and the translation is applied to the organ contours as well. 3.1. First step: selection of promising positions Initial median filtering reduces the noise in the CT images by preserving edge information. In the first calculation step, promising positions (templates) are identified for relocation. Promising positions are spots in dataset A which can presumably be recognized easily in dataset B. Usually the spots should be in areas with a high image structure. R¨osch et al recommended the identification of these areas by calculating the local variance:    1   ¯ pi 2 pi (x) −  (3) V pi = M x∈ pi

¯ pi the with M being the number of voxels in the given area pi around position pi and  mean value of this area, each position with a local variance higher than a given threshold is a promising candidate for a template. To avoid agglutination of templates which would result in useless increase of calculation time, a minimal distance between each template is an essential constraint. This approach works sufficiently, but since the determination of local variance is very time consuming we use a different strategy: usually high variances can be observed at tissue borders and identified quickly, especially in CT images, since tissue density corresponds to intensity values (Hounsfield Units, HU). With static thresholds a raw segmentation of different tissue types can be achieved very quickly, and a promising subset of possible template positions can be localized at the tissue borders (see figure 2(c)). We use the following procedure to generate the set of template positions: in a first cycle, each CT voxel is classified as one of five different tissue types. The following threshold values are used: Intensity (HU): Tissue:

−800 −130 −50 100 Air Lung Fat Muscles Bones

4794

U Malsch et al

These thresholds were derived from images from our CT scanner, a Siemens Emotion. Being part of the Siemens Primatom, it is an in-room CT scanner sharing the same couch with the linear accelerator Siemens Primus. The thresholds can be applied to most body regions, e.g. pelvis, trunk, head and neck. For other scanners it might be necessary to modify these levels. In a second cycle, tissue borders are detected by counting up neighbouring voxels belonging to the same intensity range. If less than six identical voxels are present, the centre voxel is located on a tissue border and it is a template candidate. It is finally added to the template list if it has a minimal distance to other already selected positions. 3.2. Second step: relocation Each template position pi of image A, representing a characteristic structure, must be relocated in image B. Therefore a surrounding area i around pi is defined and re-identified in the corresponding image by rigid registration, i.e. by rotating and translating i by a 4 × 4 matrix Ri until optimal similarity is found. An adequate similarity measurement  is needed: Optimize[(A(p) , B(Rp) )].
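A brute-force numpy sketch of this two-cycle selection is given below; the 3 × 3 × 3 neighbourhood, the minimum spacing value and the omission of the initial median filtering are simplifying assumptions of this illustration, not details taken from the paper.

```python
import numpy as np

HU_EDGES = [-800, -130, -50, 100]   # class boundaries from the table above

def select_templates(ct, min_dist=10):
    """Sketch of template selection: classify voxels into five tissue
    classes (first cycle), flag voxels whose neighbourhood contains fewer
    than six voxels of the same class as tissue-border candidates (second
    cycle), and keep candidates that respect a minimal mutual distance."""
    labels = np.digitize(ct, HU_EDGES)
    accepted = []
    nx, ny, nz = ct.shape
    for x in range(1, nx - 1):
        for y in range(1, ny - 1):
            for z in range(1, nz - 1):
                block = labels[x - 1:x + 2, y - 1:y + 2, z - 1:z + 2]
                same = np.count_nonzero(block == labels[x, y, z]) - 1
                if same >= 6:                      # not on a tissue border
                    continue
                c = (x, y, z)
                if all(sum((a - b) ** 2 for a, b in zip(c, k)) >= min_dist ** 2
                       for k in accepted):         # enforce minimal spacing
                    accepted.append(c)
    return accepted
```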

3.2. Second step: relocation

Each template position p_i of image A, representing a characteristic structure, must be relocated in image B. Therefore a surrounding area Ω_i around p_i is defined and re-identified in the corresponding image by rigid registration, i.e. by rotating and translating Ω_i by a 4 × 4 matrix R_i until optimal similarity is found. An adequate similarity measure Θ is needed:

    \text{Optimize}\left[ \Theta\left( \Omega_{A(p)}, \Omega_{B(Rp)} \right) \right].    (4)

The rigid transformation matrix R_i defines the best correspondence of sub-image Ω_i of A in the corresponding image B. Therefore, the corresponding position of p is q = Rp. Similarity measures such as the sum of squared differences (SSD), the sum of absolute differences (SAD), the correlation coefficient (CC) (Hill et al 2001) and ratio image uniformity (RIU) (Woods et al 1992) can be used for monomodal (CT to CT, MRI to MRI, etc) registration. Mutual information (MI) (Maes et al 1997, Viola and Wells 1997) can also be used for multimodal registration (CT to MRI, MRI to CT, etc). Each method has its particular advantages and disadvantages and varies in its complexity, calculation time and requirements regarding image information. Multimodal registration benefits from entropy-based similarity measures like the mutual information Θ_MI. For sufficiently good statistics, Θ_MI requires a reasonable number of voxels in both regions, but larger templates reduce the ability to detect a higher degree of elasticity. For monomodal registration the fastest similarity measures are SSD and SAD. For monomodal CT to CT registration, however, a better similarity measure is the normalized correlation coefficient Θ_CC, because different CT images may vary in absolute intensities. Since we want to calculate transformation parameters between two CT images (monomodal) in small limited local regions, we use Θ_CC as the best compromise between precision and calculation time. In addition, Θ_CC can be used as an absolute measure of conformity, which allows us to identify and reject doubtful pairs of correspondences.

The normalized correlation coefficient between a small domain Ω_{A(p)} at position p of image A and a rigidly displaced domain Ω_{B(Rp)} of image B is given by

    \Theta_{CC}\left( \Omega_{A(p)}, \Omega_{B(Rp)} \right) = \frac{\Theta_C\left( \Omega_{A(p)}, \Omega_{B(Rp)} \right)}{\left[ \Theta_C\left( \Omega_{A(p)}, \Omega_{A(p)} \right)\, \Theta_C\left( \Omega_{B(Rp)}, \Omega_{B(Rp)} \right) \right]^{1/2}}    (5)

with the cross correlation Θ_C defined as

    \Theta_C\left( \Omega_{A(p)}, \Omega_{B(Rp)} \right) = \sum_{x \in \Omega_A, \Omega_B} \left( \Omega_{A(p)}(x) - \bar{\Omega}_A \right) \left( \Omega_{B(Rp)}(x) - \bar{\Omega}_B \right).    (6)

If the images are normalized with mean values \bar{\Omega}_{A(p)} = \bar{\Omega}_{B(Rp)} = 0, equation (6) can be simplified to

    \Theta_C\left( \Omega_{A(p)}, \Omega_{B(Rp)} \right) = \sum_{x \in \Omega_A, \Omega_B} \Omega_{A(p)}(x)\, \Omega_{B(Rp)}(x)    (7)

which is a convolution in the spatial domain.

Figure 3. (a) Feature in image A at position pi (here located at a border of the bony spine), (b) search area in corresponding image B and (c) surrounding correlation values (calculated in frequency domain). The bright spot indicates the best translation. Size of feature: 11 × 11 × 7 voxel. Maximal detectable displacement: tx = ±8, ty = ±8, tz = ±5.

Therefore it is a simple multiplication in the frequency domain, and by transforming the areas Ω_{A(p)} and Ω_{B(Rp)} into the frequency domain by fast Fourier transformation (FFT), all relevant correlation coefficients can be calculated at once and the global optimum of problem (4) can easily be identified without iterative optimization. So, the correlation function we use is

    \Theta_C\left( \Omega_{A(p)}, \Omega_{B(Rp)} \right) = \text{FFT}^{-1}\left( \text{FFT}(\Omega_A)\, \text{FFT}(\Omega_B)^{*} \right)    (8)

with FFT(Ω_B)^{*} being the complex conjugate of FFT(Ω_B). CT data are usually not locally normalized, i.e. \bar{\Omega}_A = \bar{\Omega}_B \neq 0, but the application of a filter which subtracts the surrounding mean value \bar{\Omega} from each voxel can normalize the image within Ω:

    \Omega_{A(p)} = \Omega_{A(p)} - \bar{\Omega}_{A(p)}, \qquad \Omega_{B(Rp)} = \Omega_{B(Rp)} - \bar{\Omega}_{B(Rp)}.    (9)

To detect a feature Ω_A of image A in a specified search area Ω_B in image B, the area should be large enough to enclose all possible movements; on the other hand it should not be too large, to avoid mismatching and to reduce calculation time. Detectable movements of a feature of size d_f in a region of size d_s are limited to t = ±1/2 (d_s − d_f), see figure 3. Ω_A and Ω_B must have the same minimal size of d_s + d_f − 1 to avoid overlaps in the frequency domain due to the periodicity of the discrete FFT calculation. Therefore, Ω_A (containing the feature) is padded with zeros, see figure 3(a) (grey corresponds to HU = 0). If motions are small, Ω_A and Ω_B can be located at the same spatial position. If motions are larger than t, there must be an additional shift of Ω_B. It is sufficient to consider only translations and no rotations, since the template areas are small and rotations of larger structures can be described simply by different displacements of their surface points.

It should be noted that the cross correlation Θ_C is not an absolute measure of similarity since its value depends on the local intensity values of the examined pixels. However, the calculation of the normalized correlation coefficient Θ_CC gives the information necessary for accepting or rejecting candidate templates.
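The following numpy sketch mirrors this relocation step for a single template. The block sizes, the helper for cutting blocks around a position and the way the matched position q is recovered from the correlation peak are illustrative assumptions; only the FFT-based cross correlation (equation (8)) and the Θ_CC acceptance score (equation (5)) follow the text above.

```python
import numpy as np

def _block(img, centre, size):
    """Cut a block of the given size centred at `centre` (no bounds checks)."""
    sl = tuple(slice(c - s // 2, c - s // 2 + s) for c, s in zip(centre, size))
    return img[sl].astype(float)

def relocate(img_a, img_b, p, feature_size=(11, 11, 7), search_size=(27, 27, 17)):
    """Relocate the feature around p (image A) inside a larger search block
    of image B: cross-correlate the mean-subtracted, zero-padded blocks via
    FFT (eq. (8)), take the best shift, and score it with the normalized
    correlation coefficient (eq. (5)) for the accept/reject decision."""
    f = _block(img_a, p, feature_size)
    s = _block(img_b, p, search_size)
    f -= f.mean()
    s -= s.mean()
    pad = [(0, ds - df) for ds, df in zip(s.shape, f.shape)]
    corr = np.fft.ifftn(np.fft.fftn(s) * np.conj(np.fft.fftn(np.pad(f, pad)))).real
    valid = tuple(slice(0, ds - df + 1) for ds, df in zip(s.shape, f.shape))
    shift = np.unravel_index(np.argmax(corr[valid]), corr[valid].shape)
    win = s[tuple(slice(o, o + df) for o, df in zip(shift, f.shape))]
    win = win - win.mean()
    cc = (f * win).sum() / np.sqrt((f * f).sum() * (win * win).sum() + 1e-12)
    # matched position of p in B: search-block origin + shift + feature half-size
    q = tuple(pi - ss // 2 + sh + fs // 2
              for pi, ss, sh, fs in zip(p, search_size, shift, feature_size))
    return q, cc
```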


3.3. Third step: interpolation

After relocating and accepting template positions, we have a set of arbitrarily distributed translation vectors. Now, interpolation is needed to find the translation vectors for the other image positions. To register the whole image cube, translation vectors for all voxels must be generated. If only the segmented volumes of interest (VOIs) have to be transformed, it is sufficient to calculate translation vectors only for the vertices of the VOIs. Fast methods are trilinear or Bézier interpolation (Bézier 1972, Otte 2001), but these techniques need regularly arranged input values. Therefore we use thin-plate spline (TPS) interpolation (Bookstein 1989), which does not need the input vectors to be arranged on a regular grid.

Generally, the transformation vector field T(A, B) should be filled with translation vectors t for every voxel position or VOI vertex p = (p_x, p_y, p_z) ∈ R³. The translation t = (t_x, t_y, t_z) ∈ R³ depends on p and is defined as

    t_{x,y,z}(p) = a_0 + a_x p_x + a_y p_y + a_z p_z + \sum_{k=1}^{n} w_{k,x,y,z}\, u(|p_k - p|)    (10)

with the radial basis function u(t) describing the bending energy of thin metal plates in 2D or 3D as

    u(t) = \begin{cases} t \log(t) & \text{in 2D} \\ t & \text{in 3D.} \end{cases}    (11)

Equation (10) can be written in matrix form as Y = L W, with Y being the set of known target points q = t(p), which are the anchor points for the interpolation. L is the set of all combined distances between the template positions p_i weighted by the radial basis function u(t), and W = (w, a)^T are the unknown elastic and affine coefficients. Therefore the coefficients can be determined by W = L^{-1} Y. A 3D transformation between two images can be defined for each dimension by three separate TPS:

    t(p) = \left( t_x(p_x, p_y, p_z),\; t_y(p_x, p_y, p_z),\; t_z(p_x, p_y, p_z) \right)^T.    (12)
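For concreteness, a compact numpy sketch of the global TPS fit and evaluation (equations (10)-(12)) with the 3D basis u(r) = r is shown below; it solves L W = Y directly and ignores the block-wise splitting described next, so it is an illustration rather than the implementation used in the paper.

```python
import numpy as np

def fit_tps(sources, targets):
    """Fit one 3D thin-plate spline per displacement component
    (eqs. (10)-(12)) with the 3D radial basis u(r) = r: solve L W = Y,
    where Y holds the displacements t(p_k) = q_k - p_k at the anchors."""
    P = np.asarray(sources, dtype=float)                 # (n, 3) positions p_k
    Y = np.asarray(targets, dtype=float) - P             # (n, 3) displacements
    n = len(P)
    K = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)   # u(|p_k - p_j|)
    A = np.hstack([np.ones((n, 1)), P])                  # affine part [1, x, y, z]
    L = np.zeros((n + 4, n + 4))
    L[:n, :n], L[:n, n:], L[n:, :n] = K, A, A.T
    W = np.linalg.solve(L, np.vstack([Y, np.zeros((4, 3))]))     # (n + 4, 3)
    return P, W

def eval_tps(points, P, W):
    """Evaluate the interpolated translation t(p) at arbitrary points."""
    Q = np.asarray(points, dtype=float)
    U = np.linalg.norm(Q[:, None, :] - P[None, :, :], axis=-1)   # (m, n)
    return U @ W[:len(P)] + np.hstack([np.ones((len(Q), 1)), Q]) @ W[len(P):]
```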

The TPS interpolation is global in nature, and in the general formulation above each voxel is affected by each anchor point. This is clear because the number of coefficients is equal to the number of anchors. Since the calculation time for inverting the matrices increases quadratically with the number of anchor points, we use a series of locally defined TPS. This way one translation vector t_i does not affect all image positions, but only the nearby image fractions. This significantly speeds up the calculation and reduces the influence of far-off image positions. The image fractions are determined in a recursive process (see the sketch below): we start with the whole image as the first current fraction. If the current fraction contains more than a predefined number of template positions p_i, it is divided into two halves (in the first iteration along the x-, in the second iteration along the y- and in the last iteration along the z-direction, then starting again with the x-direction). This strategy is applied recursively to each new fraction. The result is M fractions, each containing at most a predefined maximum number of template positions. After all local fractions have been identified, each fraction is handled by a separate TPS interpolation. To guarantee a smooth transition between all fractions, an overlap of 10% is defined (see figure 4). Inside the overlapping regions the vectors are first calculated separately for all associated fractions, and then their linear interpolation is used as the final result. Although the overlapping regions increase the number of translation vectors, the overall calculation time is reduced markedly compared to one global transformation, since the number of TPS coefficients for each calculation is reduced and the inversion of the smaller matrices L_1, ..., L_M is more efficient. This makes it feasible to calculate the TPS coefficients in an acceptable time frame (approximately 5 s for 1400 templates), but the interpolation of a complete vector field is still time consuming (see table 1). If calculation time is highly critical, a further speed-up at the cost of reduced accuracy can be achieved by first interpolating some vectors on a fixed regular grid and then applying simpler and faster interpolation methods (trilinear, Bézier) to calculate the complete vector field.
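A sketch of this recursive partitioning, assuming axis-aligned halving that cycles through x, y and z, could look as follows; the 10% overlap and the linear blending across fraction borders are left out for brevity.

```python
def split_fractions(bounds, points, max_points, axis=0):
    """Recursively halve the bounding box (cycling through x, y, z) until
    each fraction contains at most `max_points` template positions.
    `bounds` is ((x0, x1), (y0, y1), (z0, z1)); returns (bounds, indices)
    pairs.  Overlap handling between fractions is omitted in this sketch."""
    idx = [i for i, p in enumerate(points)
           if all(lo <= p[a] < hi for a, (lo, hi) in enumerate(bounds))]
    lo, hi = bounds[axis]
    if len(idx) <= max_points or hi - lo < 2:     # small enough (or degenerate)
        return [(bounds, idx)]
    mid = 0.5 * (lo + hi)
    lower = tuple((lo, mid) if a == axis else b for a, b in enumerate(bounds))
    upper = tuple((mid, hi) if a == axis else b for a, b in enumerate(bounds))
    nxt = (axis + 1) % 3
    return (split_fractions(lower, points, max_points, nxt)
            + split_fractions(upper, points, max_points, nxt))
```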


Figure 4. Block TPS. The initial image is divided in a recursive process until each fraction contains 10% of all translation vectors (black dots in the left image). Overlapping regions of 10% of the fraction volume ensure a sufficiently smooth transition.

3.4. Discontinuities

As mentioned in section 2, image discontinuities like objects present in only one of the two image sets cannot be described by a continuous vector field and require separate processing. Especially for use in adaptive radiotherapy of prostate cancer, where gas-filled cavities in the colon or rectum can appear, move and disappear from day to day, the method presented so far has its limitations. These cavities can produce severe deformations (see figure 5) which prevent detection of corresponding areas by maximizing problem (4). Datasets with such deformations must be identified and processed separately.

Since maximization of Θ_CC is not possible because of the different image structures, an alternative search for correspondences is included. First, air cavities are detected by Gaussian filtering and applying an appropriate HU threshold. Then pairs of landmarks are set at the border of the cavity and at its centre. To determine the exact distance in 3D from the border points p_i to the corresponding centre points q_i, a distance map D is calculated:

    q_i = p_i + r_0 \nabla A(p_i), \qquad \text{with } r_0 = \arg\max_{r} D(p_i + r \nabla A(p_i))    (13)

where ∇A(p_i) is the gradient at border position p_i in image A. Tissue at the border of the cavity is moved perpendicularly, i.e. in the direction of the centre point. A final TPS interpolation of the surrounding vectors will move neighbouring tissue in the same direction, resulting in a collapse of the air cavity (see figure 6). To ensure that this process has only a local effect, the interpolated translation vectors are multiplied with a weighting factor which varies reciprocally with the distance to the air cavity. To avoid misleading distortions of bony structures in the direct neighbourhood of air cavities, additional fixation points are inserted to prevent deformations of these structures. Fixation points are placed on lines perpendicular to the border and within a maximum distance to the border. They are placed on the bony structures closest to the border, or at the maximum distance (if there is no bony structure within the maximum distance). To transform the surrounding tissue more realistically, an additional weighting of the vectors is added.

Figure 5. Frontal view of a prostate case. Severe deformations result from gas-filled hollow organs: in the planning dataset (a) the rectum is filled with gas, whereas in the first fraction (b) no gas bubbles can be observed.

Since fat is more elastic than muscle tissue, intensity values (HU) are used as a measure of tissue elasticity and to scale the translation vectors. Therefore every translation vector t is multiplied by a normalized grey-value weight k between 0 and 1, as defined in equation (14):

    t'(p) = k(p) \cdot t(p), \qquad k(p) = \begin{cases} 1 & \text{if } I(p) < -100 \\ -0.005 \cdot I(p) + 0.5 & \text{if } -100 \leqslant I(p) \leqslant 100 \\ 0 & \text{if } I(p) > 100. \end{cases}    (14)
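In numpy, this weighting (equation (14)) reduces to a clipped linear ramp; writing it this way is an equivalent reformulation of the piecewise definition, not a different scheme.

```python
import numpy as np

def elasticity_weight(hu):
    """Equation (14): map CT numbers to a weight k in [0, 1] so that fat
    (low HU) follows the interpolated cavity-collapse vectors while muscle
    and bone (high HU) stay fixed."""
    return np.clip(-0.005 * np.asarray(hu, dtype=float) + 0.5, 0.0, 1.0)

# weighted vectors: t_prime = elasticity_weight(hu_values)[..., None] * t_vectors
```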

3.5. Verification

Reliable quantification of the accuracy of our registration algorithm is difficult because no true 'gold standard' registration exists. We therefore use a combination of three verification strategies, described below in further detail: comparison using manual landmarks, visual assessment and consistency testing.

We have verified our procedure by comparing the automatically generated translation vectors with vectors calculated from carefully hand-selected corresponding anatomical landmarks. A set of n manually selected landmark pairs LM_{m,i} = (p_{A,i}, p_{B,i}) (i = 1, ..., n) allows us to quantify deviations between two images A and B by calculating the distances between corresponding points. Comparing these distances with those calculated by the automatic registration algorithm at the same positions, LM_{a,i} = (p_{A,i}, t(p_{A,i})), allows a quantification of the registration accuracy. Since that quantification is limited to a near neighbourhood of the selected landmarks, and since it is not reasonable to establish a global criterion for accepting or refusing the result, an additional visual assessment of both original images and the transformed image is mandatory to verify registration results.

A fast visual assessment is possible by comparing the transformed delineations with the underlying structures of the target image. This strategy is self-evident since adaptation of these delineations is the primary goal of our approach. Since an imprecise initial delineation could be an additional source of errors, a more reliable strategy is a comparison of the transformed planning image with the target (verification) image by using simple image fusion techniques like red/green imaging (one image is coloured in red, the other one in green) or subtraction mode (difference image A − B). Using red/green image fusion, deviations can be clearly identified by red and green areas, while regions with the same image information appear in yellow. In subtraction mode, higher deviations are indicated by brighter areas, whereby the size of these areas is more relevant than the absolute intensity values (e.g. deviations near bony structures will result in high difference intensities without necessarily indicating a big error).

Figure 6. Elimination of the gas-filled cavity in the planning CT as shown in figure 5(a). (a) Distance map indicating the distances from border to centre of the air cavity. Yellow contour indicates the automatically detected air cavity, the red contour shows the manual delineation of the rectum. (b) TPS interpolation, limited to a near neighbourhood by weighting the vectors with (c) the distances outside of the cavity. (d) Deformed air cavity in the same slice of planning CT. The blue contour is the result of the deformed red one. (e), (f) Transversal slice of planning CT and verification CT together with a 3D model of original (red) and adjusted (blue) rectum delineation.

It is clear that visual inspection can verify registration correctness only at tissue borders, i.e. for the parts of the images containing structural information. Inside homogeneous tissues, no differences are observable. To get a more comprehensive impression of the registration result, we also implemented consistency testing. This is a widely used method for registration evaluation (Freeborough et al 1996, Hill et al 2001, Holden et al 2000, Woods et al 1998a, 1998b). Consistency testing is based on a combination of forward and backward transformations (e.g. T_AB, T_BA). If only two images are involved, errors occurring at the same position in T_AB and T_BA could mutually neutralize themselves. This effect can be minimized by using at least three image sets and three transformations: T_AB, T_BC and T_CA. The error e at any position p is calculated by

    e(p) = \left\| T_{AB}(p) + T_{BC}(p) + T_{CA}(p) \right\|.    (15)

However, each of the translations T_AB, T_BC and T_CA may be erroneous. Verifying, e.g., the registration T_AB by using the two other registrations T_BC and T_CA does not reveal which of the three registrations has failed in which regions. Therefore e should be used simply to identify regions with less reliable information, but not as an absolute measure of registration correctness. In addition, low values of e do not guarantee the absence of errors. An algorithm might always have difficulties in determining local transformations in some regions. If the single translation vectors then have a size close to zero, e would be small even if the algorithm has failed to detect the correct transformation.
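As a small illustration, the error of equation (15) can be evaluated voxel-wise from three displacement fields; the array layout (displacement components first) is an assumption of this sketch.

```python
import numpy as np

def consistency_error(t_ab, t_bc, t_ca):
    """Equation (15): voxel-wise norm of the summed displacement fields
    T_AB, T_BC and T_CA, each stored as an array of shape (3, nx, ny, nz).
    Large values flag regions with less reliable registration information."""
    return np.linalg.norm(t_ab + t_bc + t_ca, axis=0)
```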

Table 1. Calculation times (in seconds)^a.

  Case   Init (s)   Search of p_i (s)   Search of q_i (s)    Sum 1 (s)   Interpolation (s)   Sum 2 (s)
  1      13         5                   189 (2516/1298)      211         51 (69 slices)      262
  2      19         4                   259 (2871/1180)      285         75 (66 slices)      360
  3      17         6                   172 (2038/690)       199         43 (60 slices)      242
  4      21         3                   76^b (1321/1007)     105         105 (77 slices)     210
  5      22         3                   122 (1349/1033)      150         106 (90 slices)     256

  ^a Measured on an Intel Pentium IV CPU machine with 2.8 GHz and 3 GB RAM.
  ^b The search region was reduced from 60 mm to 50 mm because of low deformations.

3.6. Clinical test cases

The algorithm was tested with datasets of five patients with different tumour locations: one prostate (1), two paraspinal (2, 3) and two head-and-neck (4, 5) cases. All datasets were taken from clinical routine at the German Cancer Research Center and represent clinically available image resolution and quality. According to our treatment protocol, all patients were positioned with custom-made fixation devices such as a vacuum pillow, head mask or wrap-around body cast, to minimize setup errors.

4. Results

4.1. Calculation time

Table 1 shows the calculation time required for the elastic matching process on a current standard computer system. The column 'init' shows the time needed for initialization of the datasets (loading, preparing, median filtering to reduce noise, etc). The column 'search of p_i' shows the time needed to select the initial template positions p_i, and 'search of q_i' represents the time for finding the corresponding positions q_i. A search region of 60 mm diameter was used for all datasets except for patient 4, where 50 mm were sufficient because of low deformations. In brackets, the initial number of selected template positions and the number of templates actually accepted are shown. In all cases a threshold of Θ_CC > 0.8 was used to accept a candidate template position. 'Sum 1' shows the overall calculation time necessary for transformation of the VOI vertices only. 'Interpolation' shows the duration of the interpolation of the complete vector field to transform the whole planning CT, which depends on the number of acquired slices. Finally, column 'sum 2' gives the total calculation time. The elimination of air cavities in the prostate case took approximately 17 s with 34 additional landmarks, including the additional filtering, selection of landmark pairs and transformation.

4.2. Verification of accuracy

4.2.1. Comparison with manual landmarks. A comparison with manual registration using clearly identifiable anatomical landmarks LM_m = (p_A, p_B) was performed for semi-quantitative quality control. In each of the five datasets landmarks were selected at muscle branches, calcifications, and edges of bony structures and organs. In regions where identification of corresponding landmarks was difficult or impossible, no landmarks were positioned, especially in the area of the small intestine in patients 1–3. A reliable identification of corresponding landmarks in that region was not possible for two reasons: sometimes corresponding structures have moved outside of the scanned volume due to the organ's flexibility, and the small intestine for the most part does not exhibit characteristic time-invariant landmarks.


Table 2. Deviations between images A and B (in mm) using manual landmarks. n = number of landmarks.

  Case   max|t_x|   max|t_y|   max|t_z|   mean|t_x|   mean|t_y|   mean|t_z|   n
  1      4.4        5.4        8.5        1.4         1.7         2.4         67
  2      3.4        5.4        9.0        0.9         2.3         6.7         59
  3      14.2       5.4        8.3        9.0         2.5         3.2         63
  4      6.1        9.3        6.4        2.2         3.5         2.5         78
  5      4.4        19.2       6.3        1.1         7.7         2.1         79

Table 3. Absolute differences between LM_m and LM_a (in mm).

  Case   max|d_x|   max|d_y|   max|d_z|   mean|d_x|   mean|d_y|   mean|d_z|   n
  1      1.9        2.2        3.1        0.7         0.8         1.2         67
  2      2.0        2.0        2.9        0.7         0.8         0.9         59
  3      3.2        2.0        2.9        1.1         0.6         0.9         63
  4      2.1        2.1        3.2        0.7         0.9         1.2         78
  5      2.0        2.9        3.0        0.6         1.1         0.8         79

Table 2 shows the maximal and mean distances of the manually selected landmarks in both images. Even with the rigid patient fixation, we observed deviations up to 14.2 mm in case 3 and 19.2 mm in case 5. The mean distances for these cases were up to 9.0 mm and 7.7 mm, respectively. Then the absolute distances between the manually (LM_m) and automatically calculated deviations (LM_a) at the same positions were calculated. Table 3 shows the absolute differences between LM_m and LM_a by maximal and mean values (max|d_x|, max|d_y|, max|d_z|, mean|d_x|, mean|d_y|, mean|d_z|). These values quantify the error of the automatic registration. We observed a larger error in the z-direction (maximal 3.2 mm in case 4) than in the x- or y-directions, with largest errors of 3.2 mm (case 3) and 2.9 mm (case 5), respectively. Most errors were in the range of the voxel size (2 × 2 × 3 mm³), which is considered to be sufficiently low. In datasets 3 and 5 a higher error was observed at a few positions (max|d_x| = 3.2 mm and max|d_y| = 2.9 mm). Landmark positions showing larger errors were mostly located in those regions showing the largest distortions between planning and verification datasets. Figure 7 visualizes the results by bar charts (whisker caps indicating the maximal error).

4.2.2. Visual assessment and consistency testing. The goal of our approach is the adaptation of delineations that were initially defined on the planning CT to the actual situation at the time of treatment given by the verification CT images. The accuracy of the matching result can be visualized by superimposing the original and transformed delineations on the planning and verification datasets, as shown in figure 8. The green crosshair indicates the same spatial coordinate in all images, given by the extracorporeal patient fixation frame. In the first image (planning CT), this coordinate is located at the centre below the spinal cord. In the second image (acquired directly prior to the first fraction) this coordinate is located left below the spinal cord. Due to the organ displacements, parts of the spinal cord (blue contour) are covered by target (pink contour) and boost volume (red contour) and would therefore be exposed to therapeutic dose levels. After automatic registration the transformed delineations precisely reflect the new situation. Figure 9 shows the evaluation of case 5 (head-and-neck case, rows 1 and 2) and case 3 (paraspinal case, rows 3 and 4).


Figure 7. Bar charts showing the difference between manual and automatic landmarks in the x-, y- and z-directions for the five selected patients. In addition, maximal errors are shown by whisker caps. The values are the absolute differences between the manually selected landmarks and the automatically calculated translation vector field at the same positions.

Figure 8. Transversal slices of the paraspinal case (3). (a) Planning dataset, (b) verification dataset with the original VOIs, (c) verification dataset with the adapted VOIs. (a), (b): pink target volume, red boost volume and blue spinal cord. Without compensation, the spinal cord has moved into regions classified as target or boost volume and would have been exposed to the therapeutic dose level. Due to smaller movements at the skin of the patient's back, the original shape of the target volume is deformed.

In the first row of each case, the original planning CT and the verification CT are shown, and the second rows show the respective transformed planning CT together with the verification CT. Both the red/green image fusion (column (a)) and the subtraction technique (column (b)) are used. In case 5, the strongest deviations can be observed in the ventro-dorsal direction; in case 3 the deviation occurs mainly in the lateral direction. Both deviations are compensated for the most part by the automatic registration. Nevertheless, deviations are still present in the area of the small intestine in case 3, and some discrepancies in case 5 are visible close to the clavicle (simply reflecting the fact that contrast media were applied only in the planning image) and the pharynx (as a result of different positions of the patient's epiglottis; obviously too few landmarks were placed to compensate for that complex motion).

Figure 9. Sagittal views of case 5 in the first two rows and transversal views of case 3 in the lower two rows. The upper row of each case shows the fusion of the original images and the lower row shows the fusion of the transformed planning CT and the verification CT image. (a) Red/green fusion (addition). (b) Subtraction image. Deviations between the images are mainly perceivable at tissue edges. The extent of deviation is not expressed by the intensity but by the size of the areas coloured in red and green (a) or showing values unequal to zero (b). (c) Values of the consistency testing and vector fields.

The difference image of case 5 also shows some deviations along bony structures, especially the spine, but they appear exaggerated due to the high intensity difference between bone and surrounding soft tissue; mostly they have only a limited spatial extension. For both cases, column (c) of figure 9 also shows the superimposed motion vector field of the automatic matching algorithm and the result of the consistency test. In case 5 a limitation of the consistency check becomes obvious: while the fusion images show deviations in the tongue area, the consistency test image does not report any problems there. Due to missing landmarks in that area, none of the three transformations has detected any deviations.

As stated above, a comprehensive and exact verification of the elastic matching is not possible. However, the qualitative checks of the clinical test cases presented here can be summarized as follows. We observed errors below the voxel size in the clinically relevant structures, i.e. the image parts containing the target volume and the highly critical organs at risk. Higher errors were observed at the patient's surface, the small intestine and some other areas of minor clinical relevance (e.g. the pharynx in case 5). Since the body surface is more or less evenly shaped, the number of unambiguous landmarks there is limited. The small intestine is quite a flexible organ with too few clearly identifiable and time-invariant landmarks.

5. Discussion

In an adaptive fractionated radiation treatment course, it is most desirable to adapt the outlines of target volumes and organs at risk exactly to each control CT scan. This is a prerequisite for a quantitative dosimetric analysis and, if necessary, correction of misalignments and anatomical changes compared to the original planning CT scan. Manual re-delineation of the VOIs for each scan is possible, but quite time consuming and therefore not suitable for clinical routine. Especially for online corrections, i.e. corrections between a control CT scan and the directly following irradiation, the whole procedure of plan evaluation and correction should not take more than approximately 10–15 min, leaving approximately 5 min for the outline adaptation.

The goal of the present work was to develop and test an algorithm for the elastic adaptation of the outlines that meets several criteria: it should run automatically without user interaction, take no longer than 5 min, be able to handle images with reduced quality and images containing discontinuities, and provide detailed information about the deformations observed. An additional constraint was accuracy within 1–2 voxels. The strategy to achieve these goals was to register the original planning CT scan onto the control CT scan and subsequently apply the same transformation to the original outlines. A block matching algorithm automatically identifies promising landmarks in the planning CT and tries to relocate them in the control scans by maximizing the local correlation coefficient in the frequency domain. Based on these landmarks, a complete vector field is interpolated which is used to transform the segmented structures. Plotting the vector field as an overlay together with the CT image visualizes the local deformations.

We have presented tests with five different datasets with different tumour locations (prostate, paraspinal and head-and-neck cases). In each of the five cases, the current implementation has satisfied the mentioned constraints. Matching accuracy can be improved by generating more landmarks; however, there is a trade-off between accuracy and calculation time. With the settings we chose, the calculation time was markedly shorter than required, with at most 3 min for a complete elastic transformation of the CT image data and VOI outlines, and the accuracy as determined by comparison with manual landmarks was, apart from a few outliers, in the range of 1–2 voxels. The mean error in this comparison was even lower than the size of a single voxel.

The algorithm has shown its limitations in regions where it is not possible to identify a sufficient number of reliable time-invariant landmarks, especially in the small intestine region. Since the intestine is not a typical target for highly conformal radiation therapy, we do not consider this a big problem. The therapy-relevant structures in the paraspinal and the head-and-neck test cases could be registered with excellent accuracy, so we think that for these tumour sites the presented algorithm has a high potential for clinical use in an adaptive radiation treatment course. Our first experiences with application to prostate treatments also show good results. However, more clinical test cases are necessary to clearly identify the situations where the algorithm can be used routinely. Some kind of quick manual check performed by the therapist (e.g. looking at the red/green images) will likely always be mandatory to prevent mistreatments in cases where the algorithm is severely misled.


In its current form, the algorithm is suitable for monomodal registration of CT images. Because the control CT scans were acquired with a lower dose protocol, it could already be shown that the algorithm is able to handle image data of lower quality. So far we have not applied the algorithm to cone beam CT images; however, because of the current results we do not expect fundamental problems. It should be noted that the algorithm can potentially be extended to monomodal registration of MRI images or even multimodal registration of CT and MRI images, probably simply by replacing the routines for relocating the landmarks (e.g. by using mutual information instead of the correlation coefficient).

This paper focused on automatic elastic registration. We are currently working on an efficient clinical workflow that integrates the presented method into adaptive treatment schemes. Such a workflow has to include tools for automatic data management (e.g. sending CT data from the scanner console to the planning workstation), tools for a quick visual assessment of deviations from the original treatment plan, supporting the therapist in the decision whether corrections are necessary at all, and eventually generating and documenting a new adapted treatment plan. Currently we are working on a retrospective planning study to quantify elastic deformations and their dosimetric impact on the actually delivered total dose for patients who underwent a fractionated treatment course and received regular control CT scans. Together with the clinical workflow to correct for those deformations, this will lead to an updated protocol for target definition and dose prescription to optimize both tumour control probability and the avoidance of adverse effects. We think that in this scenario automatic elastic registration plays a key role. In that sense the presented work might be considered a major step towards the next level of high-precision radiotherapy.

References

Bézier P 1972 Numerical Control, Mathematics and Applications (Wiley Series in Computing) ed C A Lang (New York: Wiley) pp 96–198
Bookstein F 1989 Principal warps: thin-plate splines and the decomposition of deformations IEEE Trans. Pattern Anal. Mach. Intell. 11 567–85
Bro-Nielsen M and Gramkow C 1996 Fast fluid registration of medical images Proc. Visualization in Biomedical Computing (VBC'96) (Berlin: Springer) pp 267–76
Christensen G E, Rabbitt R D and Miller M I 1994 3D brain mapping using a deformable neuroanatomy Phys. Med. Biol. 39 609–18
Clatz O, Delingette H, Talos I F, Golby A J, Kikinis R, Jolesz F A, Ayache N and Warfield S K 2005 Robust nonrigid registration to capture brain shift from intraoperative MRI IEEE Trans. Med. Imaging 24 1417–27
Cooley J W and Tukey J W 1964 An algorithm for the machine calculation of complex Fourier series Math. Comput. 19 297–301
Crum W R, Tanner C and Hawkes D J 2005 Anisotropic multi-scale fluid registration: evaluation in magnetic resonance breast imaging Phys. Med. Biol. 50 5153–74
Foskey M, Davis B, Goyal L, Chang S, Chaney E, Strehl N, Tomei S, Rosenman J and Joshi S 2005 Large deformation three-dimensional image registration in image-guided radiation therapy Phys. Med. Biol. 50 5869–92
Freeborough P A, Woods R P and Fox N C 1996 Accurate registration of serial 3D MR brain images and its application to visualizing change in neurodegenerative disorders J. Comput. Assist. Tomogr. 20 1012–22
Gibou F, Levy D, Liu P and Boyer A 2005 Partial differential equations-based segmentation for radiotherapy treatment planning Math. Biosci. Eng. 2 209–26
Hastreiter P, Rezk-Salama C, Soza G, Bauer M, Greiner G, Fahlbusch R, Ganslandt O and Nimsky C 2004 Strategies for brain shift evaluation Med. Image Anal. 8 447–64
Hill D L, Batchelor P G, Holden M and Hawkes D J 2001 Medical image registration Phys. Med. Biol. 46 R1–45
Holden M, Hill D L, Denton E R, Jarosz J M, Cox T C, Rohlfing T, Goodey J and Hawkes D J 2000 Voxel similarity measures for 3-D serial MR brain image registration IEEE Trans. Med. Imaging 19 94–102
Lau Y H, Braun M and Hutton B F 2001 Non-rigid image registration using a median-filtered coarse-to-fine displacement field and a symmetric correlation ratio Phys. Med. Biol. 46 1297–319


Maes F, Collignon A, Vandermeulen D, Marchal G and Suetens P 1997 Multimodality image registration by maximization of mutual information IEEE Trans. Med. Imaging 16 187–98
Maintz J B and Viergever M A 1998 A survey of medical image registration Med. Image Anal. 2 1–36
Otte M 2001 Elastic registration of fMRI data using Bezier-spline transformations IEEE Trans. Med. Imaging 20 193–206
Pekar V, Gladilin E and Rohr K 2006 An adaptive irregular grid approach for 3D deformable image registration Phys. Med. Biol. 51 361–77
Pekar V, McNutt T and Kaus M 2004 Automated model-based organ delineation for radiotherapy planning in prostatic region Int. J. Radiat. Oncol. Biol. Phys. 60 973–80
Pennec X, Cachier P and Ayache N 1999 Understanding the 'Demon's algorithm': 3D non-rigid registration by gradient descent Medical Image Computing and Computer-Assisted Intervention (MICCAI) 1999 (Berlin: Springer) pp 597–605
Rohlfing T and Maurer C R Jr 2003 Nonrigid image registration in shared-memory multiprocessor environments with application to brains, breasts, and bees IEEE Trans. Inf. Technol. Biomed. 7 16–25
Rösch P, Mohs T, Netsch T, Quist M, Penney G P, Hawkes D J and Weese J 2001 Template selection and rejection for robust non-rigid 3D registration in the presence of large deformations Proc. SPIE 4322 545–56
Rueckert D, Sonoda L I, Hayes C, Hill D L, Leach M O and Hawkes D J 1999 Nonrigid registration using free-form deformations: application to breast MR images IEEE Trans. Med. Imaging 18 712–21
Schunck B G and Horn B K P 1981 Determining optical flow Artif. Intell. 17 185–204
Thieke C, Malsch U, Schlegel W, Debus J, Huber P, Bendl R and Thilmann C 2006 Kilovoltage CT using a linac-CT scanner combination Br. J. Radiol. at press
Thirion J P 1998 Image matching as a diffusion process: an analogy with Maxwell's demons Med. Image Anal. 2 243–60
Viola P and Wells W M III 1997 Alignment by maximization of mutual information Int. J. Comput. Vis. 24 137–54
Wang H, Dong L, Lii M F, Lee A L, de Crevoisier R, Mohan R, Cox J D, Kuban D A and Cheung R 2005a Implementation and validation of a three-dimensional deformable registration algorithm for targeted prostate cancer radiotherapy Int. J. Radiat. Oncol. Biol. Phys. 61 725–35
Wang H et al 2005b Validation of an accelerated 'demons' algorithm for deformable image registration in radiation therapy Phys. Med. Biol. 50 2887–905
Woods R P, Cherry S R and Mazziotta J C 1992 Rapid automated algorithm for aligning and reslicing PET images J. Comput. Assist. Tomogr. 16 620–33
Woods R P, Grafton S T, Holmes C J, Cherry S R and Mazziotta J C 1998a Automated image registration: I. General methods and intrasubject, intramodality validation J. Comput. Assist. Tomogr. 22 139–52
Woods R P, Grafton S T, Watson J D, Sicotte N L and Mazziotta J C 1998b Automated image registration: II. Intersubject validation of linear and nonlinear models J. Comput. Assist. Tomogr. 22 153–65
Zitová B and Flusser J 2003 Image registration methods: a survey Image Vis. Comput. 21 977–1000