
Medical Image Analysis 12 (2008) 375–383 www.elsevier.com/locate/media

3D Gabor wavelets for evaluating SPM normalization algorithm

Linlin Shen a,*, Li Bai b

a Faculty of Information and Engineering, Shenzhen University, Shenzhen 518060, China
b School of Computer Science, University of Nottingham, Nottingham NG8 1BB, UK

Received 6 September 2006; received in revised form 16 December 2007; accepted 21 December 2007 Available online 11 January 2008

Abstract

A Gabor wavelet based method is proposed in this paper for evaluating and tuning the parameters of image registration algorithms. We propose a 3D local anatomical structure descriptor, the Maximum Responded Gabor Wavelet (MRGW), for measuring registration quality based on the anatomical variability of registered images. The effectiveness of the descriptor is demonstrated through a practical application: using the variance of the MRGW response to tune the parameters of a nonlinear spatial normalization algorithm that is part of the popular medical image processing package Statistical Parametric Mapping (SPM).

© 2008 Elsevier B.V. All rights reserved.

Keywords: Gabor wavelet; Image normalization; Image registration; Statistical Parametric Mapping

* Corresponding author. Tel.: +86 755 26536380; fax: +86 755 26536198. E-mail addresses: [email protected] (L. Shen), [email protected] (L. Bai).

1361-8415/$ - see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.media.2007.12.004

1. Introduction

Spatial normalization in brain imaging is the process of aligning, or warping, an individual's anatomy into a standardized space, so that meaningful comparisons of structural or functional data can be made in this space. The anatomical variability after normalization is expected to be minimal; this paper aims to quantify the residual variability and provide guidance for optimizing normalization methods. A large number of spatial normalization/registration techniques have been proposed in the literature. Some techniques first identify homologous features in the images and then find a transformation to bring the two images together. The features can be points, lines or surfaces (Thompson and Toga, 1996). However, it is often very difficult to locate these features automatically, and a time-consuming manual process is used instead. Compared to feature based techniques, whole image based approaches find a spatial transformation that minimizes the difference between the images (Hajnal et al., 1995; Zitova and Flusser, 2003; Gaens et al., 1998). While global transformations such as translation, rotation and shearing are used in linear methods, local deformations are considered in nonlinear methods (Ashburner and Friston, 1999). Several parameters, e.g. the cost function, basis functions, Bayesian model and regularization strength, need to be determined before a nonlinear method can be applied. However, in the 3D context many of these parameters are not specified through theoretical reasoning (Robbins et al., 2004), and an empirical performance measure is required to evaluate these design choices.

Only a few studies have been reported on assessing the performance of spatial normalization algorithms. The earliest work can be traced back to the evaluation project of (West et al., 1997), where ground truth rigid body transformations were obtained with the help of fiducial markers. The concept of registration consistency was introduced in (Holden et al., 2000) to evaluate different cost functions within the rigid body transformation framework; mutual information was shown to produce the best consistency. However, both methods are only applicable to rigid body linear normalization approaches. Frequency-adaptive wavelet shrinkage is proposed in (Dinov


et al., 2002) to transform 3D image data into a much more compact space, where the difference between images can be represented using the Euclidean distance. The variability among normalized images can then be measured using statistics of the pair-wise distances. However, such a measure is based on the whole image and might not precisely describe local anatomic variability. A recent work measures anatomical variability at each voxel using the entropy of anatomical "labels" (Robbins et al., 2004): each voxel is labelled with one of the tissue types (grey matter, white matter, CSF and background) using an automatic segmentation algorithm. Given a set of ideally aligned images, the labels at a given location would be the same, so the entropy of the labels can be used to measure the anatomical variability at each voxel. However, the method requires a highly reliable segmentation procedure, which is not practical: the segmentation algorithm itself is sensitive to the anatomical variability among different images.

In this paper, we propose a method based on 3D Gabor wavelets for evaluating and tuning spatial normalization algorithms. Gabor wavelets were first proposed as a joint time–frequency analysis tool for 1D signal decomposition (Gabor, 1946), and were observed to achieve the best resolution in both the time and frequency domains. They were then extended to the 2D domain (Granlund, 1978), and Daugman demonstrated the striking similarity between the shape of 2D Gabor wavelets and the receptive fields of the human visual cortex (Daugman, 1985). Since then, Gabor wavelets have been widely used and proven robust for feature extraction in texture segmentation (Jain and Farrokhnia, 1991; Weldon et al., 1996), retinal vessel segmentation (Li et al., 2006; Soares et al., 2006) and face recognition (Wiskott et al., 1997; Shen and Bai, 2006; Shen et al., 2007; Liu and Wechsler, 2002).
There are usually two steps to extracting image features with Gabor wavelets: (1) convolving the image with a set of Gabor wavelets, each tuned to a specific frequency, orientation and bandwidth; and (2) fusing the outputs/responses of the Gabor wavelets to obtain a local feature vector. While most applications use the magnitude of the wavelet response, the Gabor phase has also been considered for feature extraction and image registration (Elbakary and Sundareshan, 2005; Liu et al., 2002). Adaptively tuned Gabor wavelets have been used to segment and track tagging sheets/lines in cardiac MR images (Qian et al., 2006; Qian et al., 2003), where the parameters of the Gabor wavelet with the maximum response are used to find the translation, rotation and deformation of the tagging sheets/lines. Our method for evaluating and tuning spatial normalization algorithms based on 3D Gabor wavelets has advantages over, for example, those described in (Robbins et al., 2004; Dinov et al., 2002): it does not require a segmentation process, and it has complexity O(N) as opposed to O(N²), where N is the number of images. In the rest of the paper, we give details of our method (Section 2), describe the nonlinear registration algorithm in SPM (Section 3), and report experiments performed on real MR images (Section 4). Finally, we summarize the paper (Section 5).

2. Methods

2.1. 3D Gabor wavelets

A 3D Gabor wavelet is a sinusoidal wave modulated by a 3D Gaussian function, and can be defined as

$$\psi_{f,\theta,\varphi}(x,y,z)=S\,\exp\!\left(-\left(\left(\frac{x'}{\sigma_x}\right)^{2}+\left(\frac{y'}{\sigma_y}\right)^{2}+\left(\frac{z'}{\sigma_z}\right)^{2}\right)\right)\exp\!\left(j2\pi(xu+yv+zw)\right)$$

$$u=f\sin\varphi\cos\theta,\qquad v=f\sin\varphi\sin\theta,\qquad w=f\cos\varphi,\qquad [x'\;y'\;z']^{T}=R\,[x\;y\;z]^{T}\tag{1}$$

where S is a normalization scale, $f=\sqrt{u^{2}+v^{2}+w^{2}}$ is the amplitude of the complex sinusoid with frequency (u, v, w), 0 ≤ θ < π and 0 ≤ φ < π are the orientations of the wave vector in the 3D frequency domain (see Fig. 1a), and σx, σy, σz define the width of the Gaussian envelope along the x, y and z axes, respectively. Note that the Gaussian envelope and the sinusoid could have different orientations, though they are normally set to be the same for normalization purposes; R is the rotation matrix that transforms the Gaussian envelope to coincide with the orientation of the sinusoid. The Gaussian scale parameters σx, σy, σz could be tuned to the local structures. Unfortunately, prior information about the structures to be analyzed is usually unknown, and determining the best scales for visual processing is itself a difficult task when no knowledge about the structures and the effects of different sampling rates is available. Following the works described in (Elbakary and Sundareshan, 2005; Liu et al., 2002), we focus in this paper on Gaussians with the same width along each axis, i.e. σx = σy = σz = σ. Fig. 1b shows the projections of a 3D Gabor wavelet onto different 2D planes (only the real part is shown), with f = 1/8, θ = π/2, φ = π/2 and σ = 1/f. When prior information about the image structure to be analyzed is unavailable, a set of Gabor wavelets of different frequencies f_i and orientations (θ_j, φ_k) is required to obtain sufficient information about the image at a voxel:

$$\left\{\psi_{f_i,\theta_j,\varphi_k}(x,y,z)\;:\;f_i=f_{\max}\big/(\sqrt{2}\,)^{i},\;\;\theta_j=j\pi/J,\;\;\varphi_k=k\pi/K\right\}\tag{2}$$

Fig. 1. The 3D frequency domain (a) and an example Gabor wavelet (b).

where f_max is the highest possible frequency of the signal to be analyzed. In this paper we simplify the notation and denote the wavelets as {ψ_{i,j,k}, i = 0, …, I − 1; j = 0, …, J − 1; k = 0, …, K − 1}, with I = J = K = 4. Frequency and orientation information at a voxel z of an image volume V can now be represented as g_{i,j,k}(z) = (V ∗ ψ_{i,j,k})(z), where ∗ is the convolution operation. The magnitude of the convolution result, |g_{i,j,k}(z)|, the so-called response of the wavelet ψ_{i,j,k} to the signal, is normally used to represent this information (Wiskott et al., 1997). Since a set of Gabor wavelets {ψ_{i,j,k}} is used, I × J × K coefficients {|g_{i,j,k}(z)|, i = 0, …, I − 1; j = 0, …, J − 1; k = 0, …, K − 1} are obtained, which contain useful information about the intensity changes at voxel z.

2.2. The Maximum Responded Gabor Wavelet (MRGW)

The following sections describe our method for evaluating and tuning spatial normalization algorithms. Given a set of ideally aligned image volumes, the corresponding voxels among the images should have the same local structure. We first use 3D Gabor wavelets to extract local image information, and then calculate the entropy of the Maximum Responded Gabor Wavelet (MRGW) to assess the anatomical variability of registered images. The MRGW of a voxel is the Gabor wavelet that evokes the maximum response at that voxel; it represents the dominant frequency and orientation of intensity changes there. The variance of the MRGW among registered images thus provides a measure of anatomical variability. Details of the method are as follows. After convolving an image with the set of Gabor wavelets, a feature vector of dimension I × J × K, J(z) = [|g_{0,0,0}(z)| ⋯ |g_{i,j,k}(z)| ⋯ |g_{I−1,J−1,K−1}(z)|], is extracted to represent the anatomical structure around the voxel z.
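As a concrete illustration, Eqs. (1) and (2) can be implemented in a few lines of NumPy. This is our own minimal sketch, not the authors' code; the window size and the choice of S (unit L2 norm here) are assumptions:

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor3d(f, theta, phi, size=15):
    """Complex 3D Gabor wavelet of Eq. (1) with an isotropic envelope,
    sigma_x = sigma_y = sigma_z = 1/f (so the rotation R can be omitted)."""
    r = np.arange(size) - size // 2
    x, y, z = np.meshgrid(r, r, r, indexing="ij")
    u = f * np.sin(phi) * np.cos(theta)          # wave vector of Eq. (1)
    v = f * np.sin(phi) * np.sin(theta)
    w = f * np.cos(phi)
    sigma = 1.0 / f
    envelope = np.exp(-(x**2 + y**2 + z**2) / sigma**2)
    carrier = np.exp(2j * np.pi * (u * x + v * y + w * z))
    psi = envelope * carrier
    return psi / np.linalg.norm(psi)             # S chosen to give unit L2 norm (our assumption)

def gabor_bank(f_max=0.25, I=4, J=4, K=4, size=15):
    """The wavelet set of Eq. (2): I frequencies and J x K orientations."""
    return {(i, j, k): gabor3d(f_max / np.sqrt(2)**i, j * np.pi / J, k * np.pi / K, size)
            for i in range(I) for j in range(J) for k in range(K)}

def responses(volume, bank):
    """|g_{i,j,k}(z)|: magnitude of the convolution of the volume with each wavelet."""
    return {key: np.abs(fftconvolve(volume, psi, mode="same"))
            for key, psi in bank.items()}
```

With I = J = K = 4 as in the paper, the bank holds 64 complex filters, and stacking the 64 response magnitudes at a voxel yields the feature vector J(z).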
The difference between the structures at two voxels z1 and z2 can be described by the distance between these vectors, dist(J(z1), J(z2)). However, as the variance needs to be calculated over all registered images, the cost of such pair-wise comparisons grows with the number of images. We therefore use the MRGW to represent the local image structure. The MRGW is defined as a three-dimensional vector of indices (i_f, j_θ, k_φ) identifying the Gabor wavelet that gives the maximum response at voxel z, i.e. |g_{i_f, j_θ, k_φ}(z)| = max_{i,j,k} |g_{i,j,k}(z)|. While J(z) gives an overall description of the voxel structure at different frequencies and orientations, the MRGW reflects the dominant frequency and orientation. Compared with the high-dimensional J(z), the MRGW representation is more compact, and, as shown in the following section, computing the anatomical variability measure from it is more efficient. To show the effectiveness of the MRGW in representing local information at a voxel, we apply it to a 2D image


as shown in Fig. 2. A 2D image (Fig. 2a) with frequency 1/8 and orientation π/2 is synthesized and analyzed using 40 2D Gabor wavelets {ψ_{f_i,θ_j}(x, y); f_i = 0.25/(√2)^i, θ_j = jπ/8; i = 0, …, 4; j = 0, …, 7}. To discriminate between the wavelets, we denote each of them by a 2D vector (i, j), where i indexes the five frequencies and j the eight orientations; for display purposes, (i, j) is converted into a single index using the formula 8i + j. The MRGW index (i_f, j_θ) is computed for each pixel and displayed in Fig. 2c, with the response of the MRGW shown in Fig. 2b. As shown in the figure, the MRGW indices of all pixels point to the same Gabor wavelet ((2, 4) in this case), which is tuned to frequency 1/8 and orientation π/2, exactly those of the signal. Note that there are some border effects around the edges of the image, which could be effectively removed using masks. The MRGW of background pixels could also be removed by simply thresholding the response of the MRGW, i.e. there is no dominant frequency and orientation if the response is below a predefined threshold. We also tested the robustness of the MRGW representation against noise using a 2D brain slice. Gaussian noise (S/N ratio: 5) is added to the original slice to produce a noisy image, as shown in Fig. 3. The original and the noisy slices have very different gray values; in contrast, the MRGWs of the original and the noisy slice are visually the same. The differences between the MRGW responses are mainly located outside the skull, caused by high-frequency noise in the background, and can be efficiently removed by thresholding or brain masking. Note that the difference images have been re-scaled for display. We have also used the mean squared difference (MSD) for quantitative analysis.
While the MSD between the original image and the noisy image is as large as 145.923, it is reduced to 0.569 between the MRGW responses. This clearly shows that the MRGW representation is much more robust against noise than gray values in representing image structures. We also compare the MRGW representation with wavelet decomposition, which is widely used to denoise images (Bai and Liu, 2002). The MSD between the wavelet approximation coefficients of the original and the noisy slices is as large as 94.579, compared with 0.569 using the MRGW; the MRGW is thus much less sensitive to image noise.
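The synthetic experiment of Fig. 2 is easy to reproduce. The sketch below (our own code, with an assumed unit-norm scaling and a 33-pixel window) recovers the MRGW index (2, 4) at the image centre, i.e. frequency 0.25/(√2)² = 1/8 and orientation 4π/8 = π/2:

```python
import numpy as np

def gabor2d(f, theta, size=33):
    """Complex 2D Gabor wavelet with isotropic envelope, sigma = 1/f, unit L2 norm."""
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r, indexing="xy")
    envelope = np.exp(-(x**2 + y**2) / (1.0 / f)**2)
    carrier = np.exp(2j * np.pi * f * (np.cos(theta) * x + np.sin(theta) * y))
    g = envelope * carrier
    return g / np.linalg.norm(g)

# Synthetic image from the paper's example: frequency 1/8, orientation pi/2.
n = 64
yy = np.mgrid[0:n, 0:n][0]
image = np.cos(2 * np.pi * yy / 8)

# MRGW at the centre pixel: argmax of |response| over the 5 x 8 wavelet bank,
# where the response is the inner product with the local 33 x 33 patch.
patch = image[n//2 - 16:n//2 + 17, n//2 - 16:n//2 + 17]
resp = {(i, j): abs(np.sum(patch * np.conj(gabor2d(0.25 / np.sqrt(2)**i, j * np.pi / 8))))
        for i in range(5) for j in range(8)}
print(max(resp, key=resp.get))  # (2, 4)
```

The winning index corresponds exactly to the tuned wavelet, mirroring the result shown in Fig. 2c.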

Fig. 2. A 2D example: (a) original image; (b) response of MRGW; and (c) index of MRGW.


Fig. 3. An example slice: (a) original slice; (b) noisy slice; (c) difference between the original and noisy slices; (d) MRGW response of the original slice; (e) MRGW response of the noisy slice; (f) difference between the MRGW responses of the original and noisy slices; (g) wavelet reconstruction of the original slice; (h) wavelet reconstruction of the noisy slice; and (i) difference between the wavelet coefficients of the original and noisy slices.

2.3. Anatomical variability measure

We explained in the last section how to use the MRGW to describe the local image structure at a voxel. Given a set of aligned image volumes {V_m, m = 1, 2, …, M}, each voxel of V_m is now represented by its MRGW (i_f^m(z), j_θ^m(z), k_φ^m(z)). For ideally aligned volumes, the corresponding voxels should have the same anatomical structure, i.e. i_f^1(z) = ⋯ = i_f^M(z), j_θ^1(z) = ⋯ = j_θ^M(z) and k_φ^1(z) = ⋯ = k_φ^M(z). The anatomical variability at a particular voxel z among the images can thus be measured by the entropies of the MRGW indices, (h_f(z), h_θ(z), h_φ(z)). To measure these entropies we start with p_i(z), the probability distribution of the frequency index i_f^m(z), estimated as the fraction of the volumes whose voxel z has an MRGW with frequency index i (i = 0, …, I − 1). Similarly, we define the probabilities p_j(z) and p_k(z) for the two orientation indices. A standard measure of the variability of such probability distributions, entropy, can thus be defined as

$$h_f(z)=-\sum_i p_i(z)\log_2 p_i(z),\qquad h_\theta(z)=-\sum_j p_j(z)\log_2 p_j(z),\qquad h_\varphi(z)=-\sum_k p_k(z)\log_2 p_k(z)\tag{3}$$

The overall variability (H_f, H_θ, H_φ) can then be calculated as the average over all N voxels:

$$H_f=\frac{1}{N}\sum_{z} h_f(z),\qquad H_\theta=\frac{1}{N}\sum_{z} h_\theta(z),\qquad H_\varphi=\frac{1}{N}\sum_{z} h_\varphi(z)\tag{4}$$

A weighted measure is also defined to give a single variability score:

$$H_{f,\theta,\varphi}=a\,H_f+b\,H_\theta+c\,H_\varphi\tag{5}$$

where a, b and c are the weights for the frequency and the two orientation components, and a + b + c = 1.
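The entropy measures of Eqs. (3)–(5) are straightforward to compute once the MRGW indices of every volume are available; a minimal sketch (our own illustrative code, not the authors'):

```python
import numpy as np

def mrgw_entropy(indices, n_levels):
    """Per-voxel entropy, Eq. (3), of one MRGW index across M registered volumes.

    indices: integer array of shape (M,) + volume_shape, e.g. the frequency
    index i_f of every voxel in each of the M volumes.
    """
    # p(z): fraction of the M volumes whose voxel z carries each index value
    probs = np.stack([(indices == level).mean(axis=0) for level in range(n_levels)])
    # 0 * log2(0) is taken as 0
    terms = np.where(probs > 0, probs * np.log2(np.where(probs > 0, probs, 1.0)), 0.0)
    return -terms.sum(axis=0)

def weighted_variability(i_f, j_theta, k_phi, I=4, J=4, K=4, a=0.4, b=0.3, c=0.3):
    """Eq. (5): weighted sum of the voxel-averaged entropies of Eq. (4)."""
    H_f = mrgw_entropy(i_f, I).mean()
    H_theta = mrgw_entropy(j_theta, J).mean()
    H_phi = mrgw_entropy(k_phi, K).mean()
    return a * H_f + b * H_theta + c * H_phi
```

Perfectly aligned volumes give identical indices at every voxel and hence a score of 0; any disagreement raises the entropy.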

3. The nonlinear registration algorithm

A nonlinear registration algorithm integrated in the SPM (Statistical Parametric Mapping) package, developed by University College London, UK, is used to align the image volumes. As presented in (Ashburner and Friston, 1999), the algorithm starts with a linear affine registration, which globally aligns the source with the target by translation, rotation and shearing. The affine registration is followed by an iterative nonlinear local warping transformation that minimizes the sum of squared differences between the source and the target image. While a linear transformation can be easily represented by displacements and rotation angles, the nonlinear transform is approximated using a number of low-frequency DCT (Discrete Cosine Transform) basis functions. Given voxels x_i in the source volume and y_i in the target volume, the transformation from x_i to y_i is approximated as

$$y_i = x_i - \sum_{j=1}^{J} t_j\, b_j(x_i)\tag{6}$$

where b_j(x_i) are DCT basis functions, J is the number of basis functions used and t_j are the associated weights. A cut-off wavelength C_DCT is predefined such that only those DCT basis functions with wavelength > C_DCT are used for the approximation. The lower the value of C_DCT, the more DCT basis functions are used. As C_DCT moves towards positive infinity (Inf), no DCT basis is applied and the nonlinear normalization algorithm reduces to a linear affine one, i.e. only global affine transforms such as translation, rotation and shear are considered. A smoother

warping can be achieved when more DCT functions are used, at the cost of increased computational complexity. Since the algorithm involves an iterative optimization process, it may introduce unnecessary deformations that reduce the value of the objective function only slightly. There is thus a need to learn the prior distribution of the deformation parameters from training images, and, in addition to the difference between the source and the target image, the likelihood of the deformation is included in the objective function. When the weight of this prior information, i.e. the regularization strength RS, is too large, the algorithm will adjust the deformation parameters to increase the probability of the deformation rather than to decrease the difference between source and target, and the deformations will be underestimated. If the value is too small, the algorithm will focus too much on decreasing the image difference and allow irregular deformations. Unfortunately, there is no simple way to predict the best value. The iterative optimization process updates the transformation parameters repeatedly until a predefined number of iterations N_Ite is reached.

SPM spatial normalization also includes a spatial smoothing step to reduce the anatomical variability of different brains and to increase the signal-to-noise ratio. Since the average of 152 normal MRI scans is used in SPM as the template, smoothing can also speed up the registration process and help it converge. Smoothing is performed with a low-pass Gaussian filter whose kernel width, i.e. full width at half maximum (FWHM), can have a significant effect on the statistical analysis following image normalization. Again, it is not trivial to estimate the best kernel width without knowing much about the image to be analyzed.

4. Results

To test our method, we first use the SPM package to register a set of brain image volumes.
The proposed method is then applied to the registered volumes to assess how well they are aligned, and this information is used to tune the parameters of the normalization algorithm.

4.1. Data

Forty-three T1-weighted brain scans are used to test our method. These are scans of 43 healthy volunteers acquired on two 3T Philips MRI scanners, with voxel resolution from 0.8 × 0.8 × 0.8 mm to 1 × 1 × 1 mm and volume size from 256 × 256 × 184 to 256 × 256 × 160. The 43 volumes are randomly split into two sets: twenty-three volumes make up the tuning set, and the remaining 20 volumes form the test set. Once all of the volumes in a set have been aligned with the template, the anatomical variability among them is measured using the proposed method to assess the performance of the spatial normalization algorithm. While


the tuning set is used to find the optimized parameters, the test set is used to test the generalization ability of the algorithm, i.e. its alignment performance on a different image set. In this paper, we mainly investigate four parameters of the SPM normalization algorithm: the cut-off wavelength C_DCT (which controls the number of DCT basis functions), the regularization strength RS, the number of iterations N_Ite, and the full width at half maximum (FWHM) of the Gaussian smoothing kernel.

4.2. Tuning set

4.2.1. Number of DCT basis functions

In this section we investigate the effect of C_DCT on normalization results. Default values are used for RS and N_Ite, i.e. RS = 1 and N_Ite = 16. Fig. 4 shows the values of H_f, H_θ, H_φ and H_{f,θ,φ} for varying C_DCT. Three weighting schemes (a = 0.2, b = 0.4, c = 0.4; a = 0.4, b = 0.3, c = 0.3; and a = 0.6, b = 0.2, c = 0.2) are used in the tests. As shown in Fig. 4d, the variations of H_{f,θ,φ} under the different weighting schemes are quite similar; similar results are also observed in the tests with RS and N_Ite (see Table 1 and Fig. 6). One can also observe from Fig. 4 that the values of H_f, H_θ, H_φ and H_{f,θ,φ} increase monotonically with C_DCT, since fewer DCT bases are used. When C_DCT moves towards Inf, i.e. no nonlinear deformation is performed, H_θ and H_φ reach their maxima. However, when C_DCT ≥ 80, the DCT basis functions are unable to describe the local deformation precisely, and the value of H_f is even higher than when C_DCT = Inf. H_{f,θ,φ} reaches its minimum at C_DCT = 25. The average time required to normalize an image volume at C_DCT = 25 is about 60 s on a P4 3.0 GHz PC with 1 GB RAM. As the number of DCT bases increases, the computation and memory costs rise as well; for example, the average time increased to 120 s at C_DCT = 20. When C_DCT < 20, the computational complexity of the optimization process increases dramatically due to the large number of DCT functions used.
The normalization process then failed to converge in our experiments.

4.2.2. Regularization strength

In this section, we investigate the effect of the regularization strength RS on the performance of the SPM normalization algorithm. N_Ite is set to its default value, and C_DCT is set to 25, the value selected in the previous section. We tested three regularization strengths: very heavy (RS = 100), heavy (RS = 10) and medium (RS = 1). The values of H_f, H_θ, H_φ and H_{f,θ,φ} for the different regularization strengths are shown in Table 1, which clearly suggests that weaker regularization leads to better alignment. However, when the regularization is too weak, e.g. RS < 1, the nonlinear spatial normalization algorithm produces irregular results: Fig. 5 shows an example where the brain volume is deformed to an unreasonable shape due to weak regularization (RS = 0.1). As observed in Table 1, the value of H_φ is consistently higher than that of H_θ, which might be caused by the sulci and gyri of the brain cortex: since the orientations of intensity changes between sulci and gyri mostly differ in azimuth, a larger variation of orientation in the azimuth is observed.

Fig. 4. The effects of cut-off wavelength on anatomical variability.

Table 1
Effect of different regularization strengths

Anatomical variability                    Very heavy   Heavy    Medium
H_f                                       0.5530       0.5558   0.5311
H_θ                                       0.7980       0.7922   0.7577
H_φ                                       1.1527       1.1335   1.0818
H_{f,θ,φ}, a = 0.2, b = 0.4, c = 0.4      0.8909       0.8814   0.8420
H_{f,θ,φ}, a = 0.4, b = 0.3, c = 0.3      0.8064       0.8000   0.7643
H_{f,θ,φ}, a = 0.6, b = 0.2, c = 0.2      0.7219       0.7186   0.6866

4.2.3. Number of iterations

Once the values of C_DCT and RS have been selected, we can determine the effect of N_Ite on normalization

performance. An iterative process is used in SPM to adjust the weights of the DCT basis functions so as to minimize the MSD between the voxel intensities of the two volumes. However, voxel intensities are not robust, since they are easily corrupted by noise, and a reduction in MSD does not necessarily lead to better alignment. Fig. 6 displays the values of H_f, H_θ, H_φ and H_{f,θ,φ} for different N_Ite. As shown in the figure, the anatomical variability decreases rapidly during the initial stages (N_Ite < 6), matching the behaviour of the MSD. However, after reaching its minimum, the variability H_f starts to climb once N_Ite exceeds 14. The weighted measure H_{f,θ,φ} reaches its minimum at N_Ite = 18.

4.2.4. Width of the smoothing kernel

Table 2 summarizes the values of H_f, H_θ, H_φ and H_{f,θ,φ} for different FWHMs. Since previous tests showed that H_{f,θ,φ} is insensitive to the weighting scheme, we set a = 0.4, b = 0.3, c = 0.3 in the following experiments; this scheme balances the frequency and orientation information, i.e. the weight of frequency is similar to that of orientation. Table 2 clearly shows the drawback of smoothing, i.e. a loss of spatial precision: the anatomical variability increases with kernel size. However, a certain amount of smoothing is necessary for the subsequent subject/group comparison, and smoothing also speeds the registration process towards convergence.

Table 2
Effects of smoothing kernel width

FWHM of Gaussian kernel (mm)    8        6        4
H_f                             0.5325   0.5254   0.5175
H_θ                             0.7573   0.7428   0.7370
H_φ                             1.0811   1.0673   1.0651
H_{f,θ,φ}                       0.8064   0.8000   0.7643

Fig. 5. An irregularly deformed brain volume.

4.3. Test set

Once the parameters C_DCT = 25, RS = 1 and N_Ite = 18 have been selected using the tuning set, they are further tested on an independent test set. The 20 image volumes in the test set are first registered to the same template, and the anatomical variability is then computed. Note that the 20 volumes come from 20 different subjects who are not present in the tuning set. The variability measures for different C_DCT, N_Ite and RS are shown in Fig. 7 and Table 3, respectively. We observe that H_{f,θ,φ} reaches its minimum at C_DCT = 25 and N_Ite = 18, slightly lower than at C_DCT = 20 with N_Ite = 16 or 20, while Table 3 shows that H_{f,θ,φ} at RS = 1 is significantly lower than at RS = 100 or RS = 10. The tuned registration algorithm thus again produces the smallest anatomical variability among the volumes in the test set, which demonstrates the generalization ability of the tuning process.
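In effect, the tuning procedure of this section is a one-dimensional grid search that minimizes the weighted entropy for each parameter in turn. A schematic sketch follows; `normalize_and_score` is a hypothetical stand-in for running the SPM normalization on the tuning set and computing H_{f,θ,φ}, and the score values below are fabricated purely for illustration:

```python
def tune_parameter(candidates, normalize_and_score):
    """Return the candidate value minimizing the weighted entropy, plus all scores.

    normalize_and_score: hypothetical callable mapping a parameter value to
    the weighted entropy H_{f,theta,phi} obtained on the tuning set.
    """
    scores = {value: normalize_and_score(value) for value in candidates}
    return min(scores, key=scores.get), scores

# Illustration with a made-up score curve whose minimum sits at C_DCT = 25,
# mimicking the qualitative behaviour reported in Section 4.2.1.
fake_curve = {20: 0.80, 25: 0.76, 30: 0.78, 60: 0.82, float("inf"): 0.89}
best, scores = tune_parameter(fake_curve, fake_curve.get)
print(best)  # 25
```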

Fig. 6. The effects of iterations on anatomical variability.


Fig. 7. Anatomical variability of the test set.

Table 3
Effects of different regularization strengths

Anatomical variability    Very heavy   Heavy    Medium
H_f                       0.5657       0.5668   0.5429
H_θ                       0.8065       0.8016   0.7701
H_φ                       1.1569       1.1402   1.0987
H_{f,θ,φ}                 0.8153       0.8093   0.7778

We also show in Fig. 8 the weighted entropy map of the 20 individuals after spatial normalization with the SPM registration algorithm. While most of the variance is located around the edges between different brain structures, the nonlinear registration with the optimized parameters achieves less variance than the one with non-optimized parameters (C_DCT = 60, RS = 1 and N_Ite = 12) and the affine configuration (C_DCT = Inf, RS = 1 and N_Ite = 18).

Fig. 8. Entropy map of the test set after spatial normalization using the SPM registration algorithm with (a) optimized parameters; (b) non-optimized parameters; and (c) affine configuration.


5. Discussions

A robust and effective local anatomical structure descriptor, the Maximum Responded Gabor Wavelet (MRGW), has been proposed in this paper. The entropy of the MRGW has been successfully applied to tune the parameters of the SPM nonlinear spatial normalization algorithm. Using the selected parameters, the normalization algorithm has been shown to achieve less anatomical variability among 23 aligned T1 brain volumes. The tuned algorithm was further tested on a set of 20 volumes from different subjects, where the smallest variability was again observed, which clearly indicates the generalization ability of the parameter tuning method.

The parameter tuning framework used in this paper is similar to those described in (Robbins et al., 2004; Dinov et al., 2002), but there are fundamental differences. Since the method in (Dinov et al., 2002) uses a global distance in the wavelet space to measure the difference between two image volumes, information about local anatomic variability might be lost; in addition, the pair-wise strategy adopted to measure the anatomical variance among a set of N registered image volumes has O(N²) computational complexity, whereas the computation cost of our method scales linearly with the number of image volumes. Compared with the method proposed in (Robbins et al., 2004), no segmentation process is required in our framework. Since different evaluation strategies are used, different tuning frameworks could suggest different parameters for the same registration algorithm, and it is always difficult to compare these methods when ground truth registration accuracy is unavailable. Based on its ability to extract important local intensity frequencies and orientations, we are also extending the MRGW to brain landmark location (Zhong et al., 2004) and to general image registration algorithms.

Acknowledgement

We thank Prof.
Dorothee Auer of the Division of Academic Radiology, University of Nottingham, UK, for providing the images for the experiments. The work was partially supported by the National Natural Science Foundation of China under grants no. 60572100 and 60673122, and by Royal Society (UK) International Joint Project 2006/R3 – Cost Share with NSFC under grant no. 60711130233.

References

Ashburner, J., Friston, K.J., 1999. Nonlinear spatial normalization using basis functions. Human Brain Mapping 7, 254–266.
Bai, L., Liu, Y.H., 2002. When eigenfaces are combined with wavelets. Knowledge Based Systems 15, 343–347.
Daugman, J.G., 1985. Uncertainty relation for resolution in space, spatial-frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America A – Optics Image Science and Vision 2, 1160–1169.
Dinov, I.D., Mega, M.S., Thompson, P.M., Woods, R.P., Sumners, D.L., Sowell, E.L., Toga, A.W., 2002. Quantitative comparison and analysis of brain image registration using frequency-adaptive wavelet shrinkage. IEEE Transactions on Information Technology in Biomedicine 6, 73–85.


Elbakary, M., Sundareshan, M.K., 2005. Accurate representation of local frequency using a computationally efficient Gabor filter fusion approach with application to image registration. Pattern Recognition Letters 26, 2164–2173.

Gabor, D., 1946. Theory of communication. Journal of the Institution of Electrical Engineers 93, 429–457.

Gaens, T., Maes, F., Vandermeulen, D., Suetens, P., 1998. Non-rigid multimodal image registration using mutual information. In: Proceedings of Medical Image Computing and Computer-Assisted Intervention – MICCAI'98, pp. 1099–1106.

Granlund, G.H., 1978. In search of a general picture processing operator. Computer Graphics and Image Processing 8, 155–173.

Hajnal, J.V., Saeed, N., Soar, E.J., Oatridge, A., Young, I.R., Bydder, G.M., 1995. A registration and interpolation procedure for subvoxel matching of serially acquired MR images. Journal of Computer Assisted Tomography 19, 289–296.

Holden, M., Hill, D.L.G., Denton, E.R.E., Jarosz, J.M., Cox, T.C.S., Rohlfing, T., Goodey, J., Hawkes, D.J., 2000. Voxel similarity measures for 3-D serial MR brain image registration. IEEE Transactions on Medical Imaging 19, 94–102.

Jain, A.K., Farrokhnia, F., 1991. Unsupervised texture segmentation using Gabor filters. Pattern Recognition 24, 1167–1186.

Li, Q., You, J., Zhang, L., Bhattacharya, P., 2006. A multiscale approach to retinal vessel segmentation using Gabor filters and scale multiplication. In: Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, pp. 3521–3527.

Liu, C.J., Wechsler, H., 2002. Gabor feature based classification using the enhanced Fisher linear discriminant model for face recognition. IEEE Transactions on Image Processing 11, 467–476.

Liu, J., Vemuri, B.C., Marroquin, J.L., 2002. Local frequency representations for robust multimodal image registration. IEEE Transactions on Medical Imaging 21, 462–469.

Qian, Z., Montillo, A., Metaxas, D.N., Axel, L., 2003. Segmenting cardiac MRI tagging lines using Gabor filter banks. In: Proceedings of the IEEE International Conference of the Engineering in Medicine and Biology Society, pp. 630–633.

Qian, Z., Metaxas, D.N., Axel, L., 2006. Extraction and tracking of MRI tagging sheets using a 3D Gabor filter bank. In: Proceedings of the 28th IEEE EMBS Annual International Conference, New York City, USA, pp. 711–714.

Robbins, S., Evans, A.C., Collins, D.L., Whitesides, S., 2004. Tuning and comparing spatial normalization methods. Medical Image Analysis 8, 311–323.

Shen, L., Bai, L., 2006. MutualBoost learning for selecting Gabor features for face recognition. Pattern Recognition Letters 27, 1758–1767.

Shen, L., Bai, L., Fairhurst, M., 2007. Gabor wavelets and General Discriminant Analysis for face identification and verification. Image and Vision Computing 25, 553–563.

Soares, J.V.B., Leandro, J.J.G., Cesar, R.M., et al., 2006. Retinal vessel segmentation using the 2-D Gabor wavelet and supervised classification. IEEE Transactions on Medical Imaging 25, 1214–1222.

Thompson, P., Toga, A.W., 1996. A surface-based technique for warping three-dimensional images of the brain. IEEE Transactions on Medical Imaging 15, 402–417.

Weldon, T.P., Higgins, W.E., Dunn, D.F., 1996. Efficient Gabor filter design for texture segmentation. Pattern Recognition 29, 2005–2015.

West, J., Fitzpatrick, J.M., Wang, M.Y., et al., 1997. Comparison and evaluation of retrospective intermodality brain image registration techniques. Journal of Computer Assisted Tomography 21, 554–566.

Wiskott, L., Fellous, J.M., Kruger, N., von der Malsburg, C., 1997. Face recognition by elastic bunch graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence 19, 775–779.

Zhong, X., Shen, D.G., Davatzikos, C., 2004. Determining correspondence in 3-D MR brain images using attribute vectors as morphological signatures of voxels. IEEE Transactions on Medical Imaging 23, 1276–1291.

Zitova, B., Flusser, J., 2003. Image registration methods: a survey. Image and Vision Computing 21, 977–1000.
