3D Multi-Modality Medical Image Registration Using Feature Space Clustering

Andre Collignon, Dirk Vandermeulen, Paul Suetens, Guy Marchal

Laboratory for Medical Imaging Research*, Katholieke Universiteit Leuven, Belgium.
Department of Electrical Engineering, ESAT, Kardinaal Mercierlaan 94, B-3001 Heverlee.
Department of Radiology, University Hospital Gasthuisberg, Herestraat 49, B-3000 Leuven.
Abstract. In this paper, 3D voxel-similarity-based (VB) registration algorithms that optimize a feature-space clustering measure are proposed to combine the segmentation and registration process. We present a unifying definition and a classification scheme for existing VB matching criteria and propose a new matching criterion: the entropy of the grey-level scatter-plot. This criterion requires no segmentation or feature extraction and no a priori knowledge of photometric model parameters. The effects of practical implementation issues concerning grey-level resampling, scatter-plot binning, Parzen-windowing and resampling frequencies are discussed in detail and evaluated using real world data (CT and MRI).
1 Introduction

3D multi-modality image registration is a prerequisite for optimal planning of complex radiotherapeutical and neurosurgical procedures. For instance, in radiotherapy planning, the anatomy of the target lesion, usually a soft tissue structure, can best be delineated on magnetic resonance (MR) images, while the electron density map of a CT is needed to calculate the isodose distribution. The specificity of diagnosis can be improved by combining the anatomical information depicted in CT or MR with functional information extracted from emission tomography.

In the past, many 3D registration algorithms have been proposed (see [1], [11], [15] for reviews). They can be categorized into: 1) stereotactic [18] or external-marker-based, 2) anatomical-point-landmark-based, 3) surface-based, and 4) VB registration. Since our work focuses on automatic registration algorithms that can be applied retrospectively, we have excluded algorithms of the first and the second type from our research scope. Our experience with surface-based registration algorithms ([6] - [4]) leads us to the following conclusions:
* Directors: Andre Oosterlinck & Albert L. Baert
1. Comparison with stereotactic registration has shown that the registration accuracy for real world data, especially when MRI is involved, may be worse than subvoxel due to inaccurate scanner calibration, image distortions and segmentation differences.

2. Completely automatic surface-based registration algorithms require robust surface segmentation algorithms, which are highly data and application dependent. This problem has been tackled by Neiw [12] for CT/MRI/PET, by Mangin [10] for MRI/PET and by van Herk [17] for CT/CT, CT/MRI and CT/SPECT. However, the reliable estimation of the registration error remains an unsolved problem because the "ground truth" is unknown or the "golden standard" is in error itself whenever real world data are involved.

In our work we look for a fully automatic solution, and therefore it is an absolute requirement that the accuracy is such that it is impossible to detect the registration error by visual inspection. This cannot be guaranteed by a straightforward application of surface-based registration without increasing the robustness of the surface-based matching criterion against errors that will inevitably be introduced by violation of the rigid body assumption (due to image distortion and inaccurate scanner calibration) and by violation of the surface correspondence assumption (due to the different structural image content of different modalities and due to segmentation errors). The robustness of surface-based matching algorithms can be increased by including corresponding pairs of anatomical landmark points, but this increases interactivity. Alternatively, one can design outlier treatment methods to eliminate conjectured outliers from the matching evaluation; however, these have not yet proven successful [4].

A conceptually different approach to increasing the robustness of automatic registration algorithms is to re-engineer the matching criterion and combine the segmentation and registration process by using VB matching criteria. Voxel-similarity-based matching criteria are measures of misregistration that are functions of the attributes (e.g. grey-level, gradient, texture) of all pairs of corresponding voxels from the images to be registered at a given misregistered position. Using all voxels is much more constraining than using surface voxels alone, and we therefore expect VB matching criteria to be more robust than surface-based criteria.

These considerations leave us with the problem of selecting appropriate features and designing a corresponding matching criterion. Therefore, in the next section we present a brief review and classification of existing VB matching criteria. In Sect. 3 we propose a new matching criterion: the entropy of the multi-modal grey-level scatter-plot. In Sect. 4 we analyse the new matching criterion by looking at its behaviour as a function of the registration parameters in the neighbourhood of a stereotactic registration solution for CT and MR images. In [5] we observed that such analyses reveal the influence of the choice of interpolation method used for grey-level resampling. In this paper we also analyse the effects of binning and Parzen-windowing of the grey-level scatter-plot. In Sect. 5 these measurements are interpreted, and in the final section some conclusions are formulated.
2 Voxel Similarity Based Registration Algorithms

3D multi-modality medical image registration involves two types of sensor transformations: 1) geometrical (rotation, translation, and scale), and 2) grey-level (CT-to-MR, MR-to-PET, etc.) transformations. While in surface-based registration algorithms grey-level transformations are used only implicitly during segmentation of the surfaces, in VB matching criteria geometric correspondence is modeled implicitly by the rigid body registration assumption.

Let α be an n-tuple of registration parameters: six due to the rigid body assumption and three more to allow for scaling corrections. Let s be the coordinates of the elements of the image grid of one image (the reference image), and let s' be the coordinates of the other image (called the floating image because it will be transformed to resample the reference image for all registration parameter values at which the matching criterion is evaluated during the registration process). Then, if T_α is the geometric transformation relating the image voxel coordinates and (V, V') are the grey-level transformations relating voxel grey-levels g_R(s) and g_F(s'), a VB matching criterion optimizes the following:
m(α) = f( { (a, b) | a = V(g_R(s')) ∧ b = V'(g_F(s)) ∧ ∀s, ∃s' = T_α(s) } )    (1)

This general definition of voxel similarity covers a broad collection of matching criteria. It also highlights the design problems of VB matching criteria: 1) some form of interpolation of voxel grey-levels will be needed to obtain corresponding voxel grey-levels (g_R, g_F) by resampling the reference image, 2) a consistent procedure is needed to account for changes in non-overlapping parts of the images due to changes in the registration parameters α, 3) an appropriate selection of features (V, V') needs to be made, and 4) a global similarity function f({(a, b)}) needs to be designed. (A schematic sketch of this evaluation loop is given at the end of this section.) The design of a complete VB registration algorithm also requires the selection of an appropriate optimisation algorithm as a function of the selected matching criterion. Ideally, the optimisation strategy should not affect the registration accuracy. Therefore, in this paper we focus on the comparison of VB matching criteria only.

We have classified existing VB matching criteria into two classes in terms of the complexity of the features used.

1. 0-th Order Image Intensity Based: V and V' are simple one-dimensional grey-level transformations. The grey-level transformations V and V' relating grey-levels of corresponding voxels of images from different modalities are non-linear and not one-to-one, and thus cross-correlation and its Fourier versions fail when applied to the untransformed grey-levels.

   (a) Gerlot-Chiron and Bizais [8] assume that the histogram H(T_α; d) of the difference image d contains a peak corresponding to registered pixels. Their "Region Overlap" (RO) criterion aims at maximizing the number of registered pixels in d, which it achieves by maximizing the peak area in the difference histogram using a Gaussian matched filter.
   (b) The algorithm proposed by Woods [19] is based on the idealised assumption that if two images are accurately aligned, then the value of any voxel in one image is related to the value of the corresponding voxel in the other image by a multiplicative factor R. There is a single value of R for all intensity values. In practice, he minimises the "variance of intensity ratios" (VIR) to guarantee optimal uniformness for R. He applied it symmetrically for PET/PET, and non-symmetrically for PET/MRI matching.

   (c) Hill et al [9] have investigated several measures of feature space dispersion, the most successful of which they found to be the VIR between voxels of selected tissues. So, they re-implemented Woods's work, but introduced additional flexibility by allowing the calculation of the variance over any combination of groupings (or bins) of denominator image intensity values. They applied the generalized algorithm to the matching of CT and MR. Hill also presents results he obtained minimising a 1D third order moment of the feature space scatter-plot on MR/MR and MR/CT data sets.

   (d) Van den Elsen [16] has recently proposed to minimize or to maximize cross-correlation between an MR image and the corresponding CT after the application of a delayed ramp mapping and a triangular intensity mapping respectively.

2. Higher Order Image Intensity Based: V(g) represents differential image intensity information, or, more generally, V(g) is a feature characterizing voxel neighbourhoods instead of single voxels. Van den Elsen et al [14] performed cross-correlation of "ridgeness" images using a scale-space based definition of image ridges and applied it to CT/MR matching.

Hill et al [9] describe the dispersing character of the feature space scatter-plot during misregistration. The VB matching criteria reviewed above can all be compared by looking at the driving or "clustering" forces that they generate to counteract the dispersion phenomenon. The analogy with mechanical forces can be made by considering the matching criteria to be potential functions from which the forces can be derived by taking the gradient. The clustering forces in the RO criterion are restricted in direction to minus unit slope lines in the scatter-plot. The VIR criterion applies vertical (and in the symmetric case also horizontal) clustering forces only. In the generalized VIR criterion proposed by Hill the clustering forces can be restricted to a subset of the scatter-plot. Using cross-correlation for f({(a, b)}), both van den Elsen's grey-level and ridgeness matching criteria generate clustering forces that are directed toward one of the diagonals of the scatter-plot of the transformed grey-levels and the ridgeness feature respectively.

We believe that all of the above mentioned restrictions on the directions of the clustering forces are heuristics which we prefer to eliminate. Instead we propose to have the clustering forces depend on the data alone. In a first attempt to do so we tried statistically modeling the normalized scatter-plot. We performed a case study [5] in which we interactively outlined parts of distinct regions t of the
brain (skull, soft tissue, background, skin and fat) in CT and MR images from which we determined their grey-level statistics (vector mean μ(t) and covariance Σ(t)) in the 2D multi-modality grey-level scatter-plot. Using these statistics to define Gaussian feature space probability distributions N(G(s); μ(t), Σ(t)), we traced two new candidate matching criteria: 1) the "number of unlabeled voxels" according to a set of labeling thresholds applied to the Mahalanobis distance, and 2) the "geometric mean maximum likelihood labeling probability" based on the same Gaussian distributions. The advantage of both criteria is that the clustering forces that they invoke are pointed toward the μ(t), which are determined by the data. The disadvantage is that a "photometric model" (μ(t) and Σ(t)) needs to be determined, which requires user interaction. In general, we call "photometric model" any 0-th order image intensity model, and we refer to any higher order model as a "scene model".

In the next section we present a new VB matching criterion that does not require any user interaction for the specification of its photometric model and that does not introduce any artificial constraints on the clustering forces.
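To make the structure of definition (1) concrete, the following is a minimal sketch of the evaluation loop referred to above, assuming NumPy volumes; the helper names (sample_nearest, evaluate_vb_criterion) are ours for illustration and do not come from the paper. Every floating-image voxel s is mapped by a caller-supplied transformation to s' = T_α(s), the reference image is resampled there, the feature transformations V and V' are applied, and a global similarity function f is evaluated over the resulting pairs.

```python
import numpy as np

def sample_nearest(volume, point):
    """Nearest-neighbour sample of a 3D `volume` at a real-valued `point`;
    returns None when the point falls outside the grid."""
    idx = np.rint(point).astype(int)
    if np.any(idx < 0) or np.any(idx >= np.array(volume.shape)):
        return None
    return volume[tuple(idx)]

def evaluate_vb_criterion(ref_img, flt_img, transform, V, V_prime, f):
    """Generic voxel-similarity criterion m(alpha) in the spirit of Eq. (1):
    collect feature pairs (a, b) over all floating-image voxels s whose
    transformed position s' = T_alpha(s) lies inside the reference image,
    then apply the global similarity function f to the set of pairs."""
    pairs = []
    for s in np.ndindex(flt_img.shape):                  # every floating-image voxel s
        s_prime = transform(np.asarray(s, dtype=float))  # s' = T_alpha(s)
        g_r = sample_nearest(ref_img, s_prime)           # resample the reference image at s'
        if g_r is None:                                  # non-overlapping voxel: skip
            continue
        pairs.append((V(g_r), V_prime(flt_img[s])))      # (a, b) = (V(g_R(s')), V'(g_F(s)))
    return f(pairs)
```

With V and V' the identity and f, for instance, a cross-correlation over the pairs, this reduces to plain 0-th order grey-level correlation; the entropy criterion of Sect. 3 corresponds to a different choice of f.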
3 Entropy of Multi-Modal Scatter-Plot Histogram

While Hill based his choice of the third order moment of the feature space scatter-plots on a list of detailed observations of such scatter-plots when moving them away from the registration solution (obtained by point landmark based registration), a new 0-th order image intensity based matching criterion is presented here, based 1) on the information theoretic consideration that the entropy of a random variable is a measure of the information required on the average to describe it [13], and 2) on the general observation that misregistration diffuses the multi-modal grey-level distribution. Since diffusion of the grey-level scatter-plot corresponds to an increase in the information required to describe it, we propose to use the entropy of the scatter-plot of image intensities as a matching criterion, since we expect it to be minimal in the registered position. Formally:
α* = arg min_α { − Σ_s p(g_R(T_α s), g_F(s)) log [ p(g_R(T_α s), g_F(s)) ] }    (2)
where p(g_R(α), g_F) is obtained by normalizing the scatter-plot into a probability distribution. Solving (2) requires the calculation of p(g_R(α), g_F). However, since the number of possible pairs (g_R, g_F) (e.g. 4096 × 4096 for 12 bit images) is larger than the number of samples that are available in the multi-modal image (e.g. 256 × 256 × 128 for full 3D MRI images), robust induction of p(g_R(α), g_F) is not trivial. In principle this can be solved by Parzen-windowing [7]. In practice, because it is much faster to implement and to execute, and because it reduces memory consumption, we have chosen to perform simple binning of the image grey-levels in order to reduce the number of possibilities. Binning has been performed uniformly by neglecting the least significant bits, i.e.

p(g_R, g_F) ≈ H(g_R >> n_R, g_F >> n_F) / N

where H(g_R >> n_R, g_F >> n_F) equals the binned version of the scatter-plot H(g_R, g_F), N equals the number of voxels taken into account in the estimation of p(g_R, g_F), and n_R and n_F are the numbers of bits neglected in the respective images.

In principle, what needs to be proved for any matching criterion is that, 1) for a wide range of unregistered positions around the true registered position, 2) the matching function is uni-modal, and 3) the optimum of the function is located at the registered position. In practice, at best, matching criteria can be evaluated by looking at traces of the matching function.
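As an illustration of the binning and entropy computation just described, here is a minimal sketch assuming NumPy and 1D integer arrays of already-paired grey-levels (the resampled reference values g_R(T_α s) and the floating values g_F(s)); the function name and default bin widths are ours, not taken from the paper.

```python
import numpy as np

def scatter_plot_entropy(g_ref, g_flt, n_ref=2, n_flt=2, bits=12):
    """Entropy of the binned grey-level scatter-plot, cf. Eq. (2).

    g_ref, g_flt : 1D integer arrays of corresponding grey-levels
    n_ref, n_flt : numbers of least significant bits neglected (n_R, n_F)
    bits         : grey-level depth of the original images
    """
    a = np.asarray(g_ref) >> n_ref          # uniform binning: g >> n
    b = np.asarray(g_flt) >> n_flt
    H = np.zeros((1 << (bits - n_ref), 1 << (bits - n_flt)), dtype=np.int64)
    np.add.at(H, (a, b), 1)                 # binned scatter-plot H(g_R >> n_R, g_F >> n_F)
    p = H[H > 0] / a.size                   # normalize non-empty bins to probabilities
    return -np.sum(p * np.log(p))           # -sum p log p
```

Minimising this value over the registration parameters α then realises criterion (2); a Parzen-windowed variant (see Sect. 5) would smooth H before normalising it.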
4 Experiments and Results

The data consist of a 12 bit, 39 slice (256 x 256) CT image, a 12 bit, 30 slice (256 x 256) T2-weighted MR image, and the same MR image after gadolinium enhancement (MRE), all having 1.33 x 1.33 x 4 mm3 voxels. The images contain stereotactic localisation marks, and can thus be registered fairly accurately. As a matter of fact, both MR images were registered by acquisition, while the CT image is only shifted and not rotated with respect to the MR images. This shift, however, involves subvoxel translations in all three directions. Note that all measurements shown hereafter have been performed using a 30 slice CT image acquired by trilinearly resampling the original CT image on the image grid of the MR images. Since we will be looking at the effect of resampling algorithms, this should be kept in mind. In order to avoid the overlap problem only part of the floating images is considered, by deliberately eliminating all voxels within 20 voxels from the image boundaries.

Figures 1 and 2 present traces for rotations of the MRE volume relative to the CT volume about the axis perpendicular to the image slices, ranging from -12 to +12 degrees, while Fig. 3 presents traces for medial-lateral translations of MRE relative to MR ranging from -2 to 2 cm. All entropy traces in the figures have been normalized for optimal use of figure space. Figure 1 shows the influence of the type of interpolation used for resampling the reference images (nearest neighbour, trilinear, and cubic convolution). Figure 2 shows the influence of the resolution of the samples relative to that of the data. These traces were obtained using 'all voxels' (46656 in number), by in-plane 'sub-sampling' (1 sample per 3 voxels in both directions), and by in-plane 'super-sampling' (10 samples per 3 voxels in both directions). Figure 3 demonstrates the problem of image grid interference and shows the effects of Parzen-windowing and super-sampling on this problem. It should be remarked that the traces were calculated taking into account voxels from the central slice only, so they are essentially 2D. We chose to do so after we found that the behaviour of the 3D and the 2D traces was the same, which saves a lot of computer time.
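For reference, the following is a small sketch of trilinear grey-level interpolation of the kind used for resampling in these experiments, assuming NumPy and sample positions expressed in voxel coordinates of the volume being resampled; the helper name and the clamping at the border are our choices, not the authors'.

```python
import numpy as np

def trilinear_sample(volume, points):
    """Trilinearly interpolate a 3D `volume` at an (N, 3) array of
    real-valued sample positions given in voxel coordinates.
    Positions are assumed to lie inside the volume; cells at the upper
    border are clamped so that both corners stay on the grid."""
    points = np.asarray(points, dtype=float)
    p0 = np.floor(points).astype(int)
    p0 = np.clip(p0, 0, np.array(volume.shape) - 2)   # keep p0 and p0 + 1 in range
    t = points - p0                                    # fractional offsets in the cell
    out = np.zeros(len(points))
    for dz in (0, 1):
        for dy in (0, 1):
            for dx in (0, 1):
                # Weight of the corner (dz, dy, dx) of the surrounding cell
                wz = t[:, 0] if dz else 1.0 - t[:, 0]
                wy = t[:, 1] if dy else 1.0 - t[:, 1]
                wx = t[:, 2] if dx else 1.0 - t[:, 2]
                out += wz * wy * wx * volume[p0[:, 0] + dz,
                                             p0[:, 1] + dy,
                                             p0[:, 2] + dx]
    return out
```

Super-sampling then simply amounts to generating the sample positions on a grid finer than the voxel grid (e.g. 10 samples per 3 voxels in-plane) before interpolating, while sub-sampling uses a coarser grid.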
[Figure 1: normalized entropy versus rotation about the caudal-cranial axis (degrees), for the three interpolation methods.]
Fig. 1. Normalized entropy traces in axial rotation space for the CT/MRE volume pair around the stereotactic reference solution. The traces have been obtained using nearest neighbour (bottom), linear (middle), and cubic convolution interpolation (top) respectively. Nearest neighbour interpolation has a flat minimum; its size is determined by the sampling resolution. The higher the order of the interpolation, the sharper the minimum, which precisely coincides with the stereotactic reference, and the 'noisier' the local behaviour. No Parzen-windowing, 'all voxels', and (n_R, n_F) = (2, 2) were used. The range of grey-values is 4092 for the CT volume, and 3580 for the MRE volume.
5 Discussion

Figure 1 clearly shows that nearest neighbour interpolation tends to find the best fit of the image grids and is inherently less accurate. The ringing effect in the close neighbourhood of the registration solution that is clearly visible for trilinear interpolation disappears with super-sampling. The opposite happens when combining super-sampling and nearest neighbour interpolation. Currently, we believe trilinear interpolation to offer the best compromise between speed and accuracy.

Figure 2 shows that the trace for super-sampling has its minimum slightly off from the stereotactic reference. There are two possible causes for this deviation. Either the stereotactic ground truth is in error, or one of the assumptions underlying the derivation of our entropy matching criterion is not satisfied; e.g. the rigid body registration assumption is probably in error due to small image distortions.

The translation traces in Fig. 3 are more complex due to large interpolation effects. Trilinear interpolation introduces ripples with a frequency that equals the grid spacing in the direction of the translation. Super-sampling considerably
[Figure 2: normalized entropy versus rotation about the caudal-cranial axis (degrees), for the three sampling resolutions (curves labelled 1, 2, 3).]
Fig. 2. Normalized entropy traces in axial rotation space for the CT/MRE volume pair around the stereotactic reference solution. The traces have been obtained using: 1) sub-sampling (1/3 sample/voxel), 2) all voxels (1/1 s/v), and 3) super-sampling (10/3 s/v) respectively. Both the scale and the amplitude of the 'noise' on the traces decrease with increasing sampling ratio. (n_R, n_F) = (2, 2).
reduces this artefact. Parzen-windowing did not have the effect that was hoped for, i.e. the ripple effect is only slightly reduced.
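For completeness, one possible realisation of Parzen-windowing of the binned scatter-plot is a separable Gaussian smoothing of the 2D histogram before normalisation; the kernel shape and width below are our assumptions, as the paper does not specify them.

```python
import numpy as np

def parzen_smooth(H, sigma=1.0, radius=3):
    """Smooth a binned 2D scatter-plot H with a separable Gaussian kernel
    (one possible form of Parzen-windowing; sigma and radius in bins)."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    k /= k.sum()
    # Separable convolution: smooth along rows, then along columns
    Hs = np.apply_along_axis(lambda row: np.convolve(row, k, mode='same'), 1, H.astype(float))
    Hs = np.apply_along_axis(lambda col: np.convolve(col, k, mode='same'), 0, Hs)
    return Hs

def smoothed_entropy(H):
    """Entropy of the Parzen-smoothed, normalized scatter-plot."""
    Hs = parzen_smooth(H)
    p = Hs / Hs.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))
```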
6 Conclusion

We have classified and reviewed VB matching criteria by looking at the complexity of the features used and the heuristic constraints used to determine feature space clustering forces. We have shown the entropy of the scatter-plot to be the simplest possible criterion that introduces no such heuristic constraints.

From the experiments we may conclude that if the entropy of the grey-level scatter-plot is calculated using super-sampling on a rectangular grid with trilinear grey-level interpolation, then traces of the rigid body rotation and translation parameters are well-behaved minimisation functions for registration of CT and MR. From the experiments we cannot determine the critical super-sampling resolution. Super-sampling renders the use of binning and/or Parzen-windowing superfluous; binning can be used, though, to reduce memory requirements.
Acknowledgements
This work is part of COVIRA (Computer Vision in Radiology), project A2003
[Figure 3: normalized entropy versus in-plane ear-to-ear translation (mm), showing the effects of Parzen-windowing and super-sampling on grid interference.]
Fig. 3. Normalized entropy traces in medial-lateral translation space for the MR/MRE volume pair around the stereotactic reference solution. These traces have been obtained using plain linear interpolation (dashed), linear interpolation combined with Parzen-windowing, and linear interpolation combined with super-sampling (10/3 s/v) (ragged). In the absence of relative rotation, and due to interference of image grids (MR and MRE have identical grids), the entropy matching criterion shows a relatively large periodic ripple, which is reduced by Parzen-windowing, but even more by super-sampling. (n_R, n_F) = (2, 2). The range of grey-values is 3357 for the MR volume.
of the AIM (Advanced Informatics in Medicine) programme of the European Commission. The software was developed on an IBM RS/6000 workstation using xlC C++. Graphics were created with MATLAB 4.2a from The MathWorks, Inc. We are grateful to Petra van den Elsen and Derek Hill for sending us their work in press.
References

1. Brown L.G.: A Survey of Image Registration Techniques. ACM Computing Surveys 24:4 (1992) 325-376
2. Collignon A., Vandermeulen D., Suetens P., Marchal G.: Surface based registration of 3D medical images. SPIE Int'l Conf. Medical Imaging 1993: Image Processing, 14-19 February 1993, Newport Beach, California, USA. SPIE 1898 (1993) 32-42
3. Collignon A., Vandermeulen D., Suetens P., Marchal G.: An Object Oriented Tool for 3D Multimodality Surface-based Image Registration. Computer Assisted Radiology, CAR93, 24-26 June 1993, Berlin, 568-573
4. Collignon A., Vandermeulen D., Suetens P., Marchal G.: Registration of 3D Multi-Modality Medical Images Using Surfaces and Point Landmarks. Pattern Recognition Letters 15 (1994) 461-467
5. Collignon A., Vandermeulen D., Suetens P., Marchal G.: Automatic Registration of 3D Images of the Brain Based on Fuzzy Objects. SPIE Int'l Conf. Medical Imaging 1994: Image Processing, 13-18 February 1994, Newport Beach, California, USA. SPIE 2167 (1994) 162-175
6. COVIRA, Computer Vision in Radiology: Deliverable 55, D5/2.5 - Demonstration of Final Pilot System for Conformal/Stereotactic Radiotherapy Planning. AIM Programme Project A2003 of The European Commission DG XIII, April 1994
7. Duda R.O., Hart P.E.: Pattern Classification and Scene Analysis. Stanford Research Institute, Menlo Park, CA, USA, A Wiley-Interscience Publication (1973)
8. Gerlot-Chiron P., Bizais Y.: Registration of Multimodality Medical Images Using Region Overlap Criterion. CVGIP: Graphical Models and Image Processing 54:5 (1992) 396-406
9. Hill D.L.G., Studholme C., Hawkes D.J.: Voxel Similarity Measures for Automated Image Registration. SPIE Int'l Conf. on Visualization in Biomedical Computing, October 4-7, 1994. SPIE 2359 (1994) in press
10. Mangin J.-F., Frouin V., Bloch I., Bendriem B., Lopez-Krahe J.: Fast nonsupervised 3D registration of PET and MR images of the brain. Journal of Cerebral Blood Flow and Metabolism 14 (1994) 749-762
11. Maurer C.R., Fitzpatrick J.M.: A Review of Medical Image Registration. In: Maciunas R.J. (ed): Interactive Image-Guided Neurosurgery. Park Ridge, IL, American Association of Neurological Surgeons (1993) 17-44
12. Neiw H.M., Chen C-T., Lin W.C., Pelizzari C.A.: Automated three-dimensional registration of medical images. SPIE Int'l Conf. Medical Imaging V: Image Processing, California, USA. SPIE 1445 (1991) 259-264
13. Shanmugam K.S.: Digital and Analog Communication Systems. University of Kansas, John Wiley & Sons (1979)
14. Van den Elsen P.A., Maintz J.B.A., Pol E.J.D., Viergever M.A.: Image fusion using geometrical features. SPIE Int'l Conf. on Visualization in Biomedical Computing, 1992. SPIE 1808 (1992) 172-186
15. Van den Elsen P.A., Pol E-J.D., Viergever M.A.: Medical Image Matching - A Review with Classification. IEEE Engineering in Medicine and Biology Magazine 12:1 (1993) 26-38
16. Van den Elsen P.A., Pol E.J.D., Sumanaweera T.S., Hemler P.F., Napel S., Adler J.R.: Grey value correlation techniques used for automatic matching of CT and MR brain and spine images. SPIE Int'l Conf. on Visualization in Biomedical Computing, October 4-7, 1994. SPIE 2359 (1994) 227-237
17. Van Herk M., Kooy H.M.: Automatic three-dimensional correlation of CT-CT, CT-MRI, and CT-SPECT using chamfer matching. Med. Phys. 21:7 (1994) 1163-1178
18. Verbeeck R., Vandermeulen D., Michiels J., Suetens P., Marchal G.: Computer Assisted Stereotactic Neurosurgery. Image and Vision Computing 11:8 (1993) 468-485
19. Woods R.P., Mazziotta J.C., Cherry S.R.: MRI-PET Registration with Automated Algorithm. Journal of Computer Assisted Tomography 17:4 (1993) 536-546