A Fast Hierarchical Non-Iterative Registration Algorithm

M.E. Alexander

Institute for Biodiagnostics, National Research Council Canada, Winnipeg, Canada

Published in: International Journal of Imaging Systems and Technology, Vol. 10, pp. 242-257 (1999)

Mailing address of corresponding author: National Research Council Canada, Institute for Biodiagnostics, 435 Ellice Avenue, Winnipeg, Manitoba, CANADA R3B 1Y6

Telephone: (204) 984-6995 Fax: (204) 984-5472 email: [email protected]

ABSTRACT

This note describes a fast algorithm for registering pairs of images from a time sequence of images. The algorithm solves a linear regression problem based on a linearization of the image matching equation, in order to obtain the registration coefficients. The problem of ill-posedness caused by differentiation of a noisy image sampled on a finite lattice is solved by means of a ‘patch algorithm’. The algorithm uses an integrated form of the linearized displacement equation. Registration is simultaneously carried out on a set of computationally fast pre-filters providing three downsampled band-pass images for each input image. The filters are multi-level, and permit an efficient and versatile hierarchical registration procedure. Downsampling the images before registration significantly reduces computation time. As a final step, registration is repeated using full-size images. Results for 2D images using a 6-parameter affine registration transformation, and a 12-parameter second-order polynomial transformation indicate the method is 2.5 times faster (12.7 seconds per 256 × 256 image pair) than a previous, iterative method (28 seconds per pair), and, like the previous method, is robust to noise. The method may easily be generalized to 3D image registration and more general transformations, and is well-suited to parallel processing.


1. INTRODUCTION

With the advent of fast (TR < 100ms) imaging, motion artifacts during a single image acquisition tend to be smaller than those caused by motion between successive images in a time sequence. There are several methods available that perform real-time correction of inter-image patient motion1,2; for example, ‘orbital’ navigator echoes with ring-shaped k-space trajectories can extract both translation and rotation from 2D image sequences2. Higher order corrections – those that involve distortions from rigid-body displacements – are generally computationally expensive and require off-line post-processing. Furthermore, the k-space trajectories needed to detect more general motions (e.g., distortions), which require more parameters, are not generally known. Such trajectories may require techniques involving correlation between pairs or even triplets of points3, and are likely to be computationally too expensive for real-time processing.

Many registration algorithms are iterative5-7, and their indeterminate computational time makes them unsuitable for real-time registration. Fourier techniques4,8,9, though non-iterative, are restricted to rotations, translations and uniform scaling, and can be slow due to the number of FFTs required. Furthermore, Fourier domain methods appear to be less robust to noise than spatial domain methods5,6.

The method proposed here is non-iterative, robust to noise, and can handle arbitrary global distortions. In practice, rigid body motion forms the dominant type of displacement requiring registration, and is specified by 3 parameters in 2D or 6 parameters in 3D. Affine displacements are defined by 6 (2D) or 12 (3D) parameters, and include the special case of rigid body motion. Unlike rigid body motion (usually defined in terms of translation and rotation), affine displacements are linear in their parameters, and therefore easier to compute. The extra degrees of freedom of affine over rigid body displacements are useful in registering medical images of deformable tissue. More general non-linear displacements require larger numbers of parameters; such displacements will be present, for example, in MR imaging of the brain undergoing distortion during motion of the patient in the magnet.

Timing tests indicate that the present method is significantly faster than a previous, iterative method5,6, which uses feature matching at a small number of ‘registration points’ strategically located at the strongest edge points in the image. Timing tests also demonstrate that significant computational savings are achieved by partially registering downsampled versions of input images, rather than the full-size images.

The registration method of this paper employs a ‘patch’ algorithm, which integrates a linearized form of the image matching equation – which is analogous to the well-known ‘brightness constancy equation’ in the optical flow literature10 – over a set of patches in the spatial domain. A pair of images is brought into registration by applying a spatial distortion field (analogous to the optical flow field) that warps the features of one image until they match corresponding features in the other image. Prior to this registration, the pair of images is convolved with a set of pre-defined filters. Each filtered image pair is assumed to obey the same image matching equation – that is, the same distortion field applies to all filtered image pairs. By employing several filters – each highlighting a different set of features in the original images – it is possible to generate several equations of constraint for the distortion field. Furthermore, a filter with a given profile may be applied at more than one spatial scale, and this gives rise to a multiscale registration method. However, the distortion field at different scales will, in general, be different, so that the distortion field is derived using filters at a given spatial scale. The distortion derived at one spatial scale can then provide a starting approximation for deriving the distortion at a different (finer) scale. The equations of constraint thus generated provide, at each spatial scale, a system of regression equations for the unknown distortion field, one equation for each patch and each filter.

The actual set of filters used on image pairs, prior to registration, were derived from a low-pass/high-pass pair of one-dimensional (1D) filters due to Simoncelli11 . The lowpass filter serves both as an interpolating function and smoothing filter, while the highpass filter is an optimal approximation to the first derivative of the low-pass filter. The two-dimensional filters needed for the present application were formed from the tensor product of these 1D filters. The Simoncelli filters were specifically designed to compute derivatives (of different orders) of sampled signals, using the shift-invariant property of the derivative operator to express the derivative as a convolution. High-pass filters representing approximations to higher-order derivatives of the low-pass filter, can also be constructed12 .

These filters were shown to be effective for orientation-estimation tasks11 and optical flow estimation12. Only the first-derivative filters11 were used in the present paper, though in principle any mix of filters, of different orders of derivatives, could be included. Furthermore, for the present application, the Simoncelli filters were adapted to multiscale applications by incorporating them in Mallat’s pyramid algorithm13.

A considerable simplification results if the image matching equation can be made linear in the distortion field. In order to justify a linear approximation, certain conditions must be met, as will be discussed in the Method section. Then, if the distortion field that warps one image into another is expanded as a linear combination of prescribed basis functions, the regression equations become linear in the unknown expansion coefficients (registration parameters) and may be solved using standard techniques. The advantage of analyzing the integrated form of the matching equation is that the spatial derivatives of the noise-corrupted image function, which is defined only on a uniform sampling lattice, can be eliminated.

The paper is organized as follows. In Section 2, the patch algorithm is derived from an integrated form of the image matching equation. The role of the multiresolution approach in justifying the linear approximation is discussed. The distortion field is modeled as an affine displacement, and the images may undergo small changes in overall contrast (a situation that is common when acquiring a series of MR images over time, due to fluctuations in the acquisition hardware). In Section 3, some results are presented on the performance of the algorithm, for both an artificial case – recovering the computed distortions of a single image and matching against the original image – and a real situation involving large in-plane motions of a subject’s head. In addition, the performance of the algorithm is tested when noise is artificially overlaid on the latter images. Finally, in Section 4, generalizations of the algorithms to higher-order distortions, in both 2D and 3D images, and the opportunities for exploiting the highly parallel nature of the patch algorithm, are discussed.

2. METHOD

Let I(x) = I(x,t0) and I'(x) = I(x,t1) be two images acquired in a time sequence of images, where I' is the result of changes in I caused by a displacement field u(x) and a mean intensity change by factor (1+e). That is, I'(x) = (1+e)I(x+u). This ‘image matching equation’, apart from the factor (1+e), is just a statement of the ‘image brightness equation’ of Horn and Schunck10, expressing the constancy in brightness of each point of a feature of image I as it undergoes distortion in the second image I’. Usually this condition is stated in derivative form (letting t1 → t0, and assuming I to have continuous first-order derivatives), sometimes called the ‘optical flow’ equation,

$$\frac{dI}{dt} = \frac{\partial I}{\partial t} + \mathbf{u} \cdot \nabla I = 0.$$

A first-order approximation to the matching equation above is

$$\Delta I = eI + \mathbf{u} \cdot \nabla I \tag{1}$$

where ∆I(x) ≡ I'(x) – I(x) is the difference image. Note that ∆I plays the role of (–∂I/∂t) in the optical flow equation. Let u be parameterised by the coefficients {ck} of a linear expansion in a set of prescribed basis functions {Φk(x): k = 1,..,M} of the spatial coordinates x:

$$\mathbf{u} = \sum_{k=1}^{M} \mathbf{c}_k \, \Phi_k(\mathbf{x}) \tag{2}$$

We may evaluate Eq. (1) at several points, or regions, in the image and solve the resulting linear regression equations for the parameters {ck}. This description of the displacement field does not include the possibility of discontinuities of motion, such as multiple motions at a point in the image or occluded object boundaries. For time sequences of MR images, localized brightness changes of a given image feature (e.g., a boundary between tissue types), as opposed to changes at a given pixel, will also violate this model. This constraint is similar to the brightness constancy condition10 in optical flow. The present image matching equation is slightly more general than this, however, in that it allows for changes (by a factor (1+e)) in the overall brightness contrast of the distorted image I’.

A special case of Eq. (2), which will be considered in the rest of this paper, is the affine displacement: u = Ax + b, where matrix A and vector b contain the unknown parameters (a total of 6 for 2D, 12 for 3D images). Affine displacements are more general than rigid body displacements, so that they can accommodate global distortions (such as shear and isotropic expansion or contraction), yet are still simple enough that computations are fast. Non-linear distortions, requiring more parameters, are also important in medical imaging of deformable tissue. As shown by the example in Section 3, these are probably best incorporated by applying a small, corrective registration after registration using an affine model for displacements has been completed.
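To make the link between Eq. (2) and the affine model concrete, the following is a minimal sketch. It assumes the scalar basis Φ1 = 1, Φ2 = x, Φ3 = y (so M = 3) and separate coefficient vectors for the x- and y-components of u; the function names and the coefficient ordering are illustrative choices, not taken from the paper.

```python
import numpy as np

def affine_basis(x, y):
    """Stack the assumed basis functions Phi_1 = 1, Phi_2 = x, Phi_3 = y."""
    return np.stack([np.ones_like(x), x, y])

def displacement(cx, cy, x, y):
    """Evaluate u = sum_k c_k Phi_k(x) component-wise, as in Eq. (2).

    cx, cy: length-3 coefficient vectors for u_x and u_y, in the
    hypothetical ordering [offset, x-slope, y-slope].
    """
    phi = affine_basis(x, y)
    ux = np.tensordot(cx, phi, axes=1)
    uy = np.tensordot(cy, phi, axes=1)
    return ux, uy

# The same field written in matrix form, u = A x + b, for comparison.
A = np.array([[0.01, -0.02], [0.02, 0.01]])
b = np.array([1.5, -0.5])
cx = np.array([b[0], A[0, 0], A[0, 1]])
cy = np.array([b[1], A[1, 0], A[1, 1]])

ys, xs = np.mgrid[0:4, 0:5].astype(float)   # pixel coordinates, indexed [row, col]
ux, uy = displacement(cx, cy, xs, ys)
```

With this choice the six entries of A and b map one-to-one onto the six expansion coefficients, which is why the affine model stays linear in its parameters.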

2.1 Numerical instability

Equation (1) involves the spatial derivative of the image function I(x), whose values are known only on a discrete lattice and are corrupted by noise. Under these circumstances, numerical differentiation is ill-posed, leading to unstable solutions14 . Also, the linear approximation implied by the gradient term in Eq. (1) is justified only if the displacement u is smaller than the smallest scale of features appearing in the images I and I´. If this is not the case, two things happen: (i) The second and higher-order derivatives appearing in the Taylor expansion of I(x + u) are non-negligible, rendering the linear approximation invalid; (ii) the linear approximation may converge to a false (local) match instead of the true (global) one. For the optical flow equation, Kearney et al. 15 have shown that estimating the gradients by forward-differencing on the sampling lattice introduces systematic errors, which depend on the magnitude of the optical flow as well as the first and second spatial derivatives of I. Note also that, in the present paper, we specify a parametric form (the affine representation) for the distortion field u, which specifies the field globally, so that the ‘smoothness constraint’10 on the optical flow is not required.

A possible strategy to overcome the problems introduced by linearization is multiscale pre-filtering. Multiscale approaches have also been used for solving the optical flow equations16, in an attempt to eliminate errors caused by large motions (analogous to large values of u). In optical flow, it is found16 that when the spatial scale of a feature is smaller than half the size of the sampling grid, an error of up to 100% can occur in estimating the (local) velocity. An analogous difficulty occurs in the registration of image pairs. Initially, the images are low-pass filtered, which has the effect of reducing the size of the second- and higher-order derivative terms relative to the first-order term retained in Eq. (1). The error of linear approximation is reduced, allowing one to obtain an initial estimate of the field u from Eq. (1). Using this estimate, an approximate distortion correction is applied to I to reduce the disparity with corresponding features in image I’. The residual distortion field remaining after this correction will be smaller than the original one, so the same updating procedure may be repeated using a similar low-pass filter that retains features of the original image at a finer scale. One may solve the regression equations using this ‘coarse-to-fine’ filtering, in which the solution found from a previous coarse-filtered image pair is used to update I in preparation for solving at the next finer level. In this way, u can be incrementally improved. Pre-smoothing the images thus removes fine-scale details, reducing the risk of the solution to Eq. (1) becoming trapped in a false match when u happens to be larger than the smallest-scale features.
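The effect of feature scale on the linear approximation can be illustrated with a small 1D experiment. This is only a sketch under simplifying assumptions: the "image" is a single sinusoid, the contrast change e is zero, the gradient is taken analytically rather than from a filter, and the shift is estimated by least squares from the 1D form of Eq. (1). The function name and parameters are hypothetical.

```python
import numpy as np

def linear_shift_estimate(wavelength, shift, n=64):
    """Least-squares shift from the 1D linearized equation dI = u * dI/dx."""
    x = np.arange(n, dtype=float)
    k = 2.0 * np.pi / wavelength
    d_img = np.sin(k * (x + shift)) - np.sin(k * x)   # difference image
    grad = k * np.cos(k * x)                          # analytic gradient (assumption)
    return float(np.sum(d_img * grad) / np.sum(grad * grad))

true_shift = 3.0
# Feature scale comparable to the shift: the linearization breaks down.
fine = linear_shift_estimate(wavelength=8.0, shift=true_shift)
# Feature scale much larger than the shift: the linearization is adequate.
coarse = linear_shift_estimate(wavelength=64.0, shift=true_shift)
```

For a pure sinusoid sampled over whole periods the estimate reduces to sin(k·u)/k, so the 3-pixel shift is recovered as about 0.90 pixels at a wavelength of 8 pixels but as about 2.96 pixels at a wavelength of 64 pixels. This is the behaviour that motivates estimating u on smoothed images first.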

A second advantage of the multiscale procedure is that, by performing as much of the registration as possible on downsampled pre-filtered images, significant computational savings are possible. Downsampling is a method for reducing the redundancy of information resulting when an image is filtered over a sub-band of frequencies of the original image. Downsampling means that fewer points need to be processed. However, as discussed in Section 2.3 below, redundancy is essential for accurately interpolating image values at non-lattice sites x + u, as will be required in the course of the registration procedure. These considerations suggest that a hybrid technique using both downsampled and full-sampled data might be useful.

2.2 Pre-filtering and downsampling for efficient registration

There exist classes of filter pairs that admit the hierarchical coarse-to-fine approach above and are computationally efficient (i.e., require O(N) computations for N data points). One such class is wavelets13,18,19. In particular, the Mallat-Zhong wavelets17 have been widely used in image processing tasks, including registration5,6. Another class is the Simoncelli matched high-pass/low-pass (H/L) filter pairs11,12 described in the Introduction. Such a pair consists of a low-pass smoothing filter, defined in terms of an underlying continuous interpolation function, and a high-pass filter defined in terms of an optimal approximation to some order of derivative of the interpolation function.

For the results reported here, we use a 5-tap Simoncelli matched pair11 , for which the high-pass filter acts as a first-derivative operator. Two-dimensional filters can be constructed from this pair by forming their tensor product. If C(x) denotes the (continuous) interpolating function underlying the low-pass filter, then there are four possible filters that can be constructed:

$$
\begin{aligned}
\Phi(x,y) &= C(x)\,C(y) \\
\Psi_1(x,y) &= \frac{dC(x)}{dx}\,C(y) = \frac{\partial \Phi}{\partial x} \\
\Psi_2(x,y) &= C(x)\,\frac{dC(y)}{dy} = \frac{\partial \Phi}{\partial y} \\
\Psi_3(x,y) &= \frac{dC(x)}{dx}\,\frac{dC(y)}{dy} = \frac{\partial^2 \Phi}{\partial x\,\partial y}
\end{aligned}
\tag{3}
$$

Thus, the (discrete) convolution of an image I(x,y) with Φ produces a smoothed output image IΦ = I⊗Φ (where ⊗ denotes convolution), which we denote by LL (indicating low-pass filtering in both x- and y-directions). Convolution with Ψ1 produces an x-gradient image ∂IΦ/∂x, denoted by HL (high-pass filtering in x-direction, low-pass in y-direction); similarly convolution with Ψ2 produces a y-gradient image ∂IΦ/∂y, denoted LH; and Ψ3 produces the second-order xy-derivative image ∂²IΦ/∂x∂y, denoted HH. Thus, each pass of these filters produces 4 output images (LL, HL, LH, HH) for each input image. In complete analogy to the wavelet decomposition algorithm13, we may now apply the above filtering scheme recursively to obtain successively lower-pass filtered images, by operating on the LL image IΦ produced at the preceding step with a coarser-scaled version of the above filter set. As with wavelet applications, the scale factor between successive levels is chosen as 2. We may then define a generalization of the filter set in Eq. (3) to any level j (> 0) as follows.

$$
\begin{aligned}
\Phi_j(x,y) &= 2^{-2j}\,\Phi(2^{-j}x,\,2^{-j}y) \\
\Psi_{1,j}(x,y) &= 2^{-2j}\,\Psi_1(2^{-j}x,\,2^{-j}y) \\
\Psi_{2,j}(x,y) &= 2^{-2j}\,\Psi_2(2^{-j}x,\,2^{-j}y) \\
\Psi_{3,j}(x,y) &= 2^{-2j}\,\Psi_3(2^{-j}x,\,2^{-j}y)
\end{aligned}
\tag{4}
$$


When downsampling is used, the 4 output images are each one-quarter of the size of the input image and therefore together they occupy the same storage as the input image. Subsequent filter passes operate on the immediately preceding LL sub-image. Figure 1 describes the decomposition procedure for 2D images, and Figure 2 shows a 2-level example. Note that the LL image at a given level j (denoted by LLj) is overwritten by the 4 level-(j+1) images LLj+1 , HLj+1 , LHj+1 , HHj+1 , in the multiscale hierarchy. Only the three high-pass filter outputs are used in the registration – that is, at each level j, the HLj image originating from I is compared with the HLj image from I′, and similarly for LHj and HHj images. The LLj image is used only for propagating the filtering to higher (coarser) levels of decomposition. At each level j, the three band-pass image pairs are simultaneously used in forming the estimate of the registration parameters (see below).
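The decomposition described above can be sketched as a separable filter bank with downsampling by 2. The taps below are illustrative stand-ins (a binomial smoother and a smoothed difference), not the published Simoncelli 5-tap coefficients, and the zero-padded border handling of `np.convolve` is likewise an assumption; the function names are hypothetical.

```python
import numpy as np

# Stand-in 5-tap pair (assumption): binomial low-pass and a smoothed
# first-difference high-pass, ordered for use with np.convolve.
LOW = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
HIGH = np.array([1.0, 2.0, 0.0, -2.0, -1.0]) / 8.0

def filter_axis(img, taps, axis):
    """Convolve every 1D line of img along the given axis (zero-padded border)."""
    return np.apply_along_axis(
        lambda v: np.convolve(v, taps, mode="same"), axis, img)

def decompose_level(img):
    """One filter pass: LL, HL, LH, HH, each downsampled by 2 per axis."""
    lo_x = filter_axis(img, LOW, axis=1)    # low-pass along x (columns)
    hi_x = filter_axis(img, HIGH, axis=1)   # high-pass along x
    bands = {
        "LL": filter_axis(lo_x, LOW, axis=0),
        "HL": filter_axis(hi_x, LOW, axis=0),   # x-gradient, smoothed in y
        "LH": filter_axis(lo_x, HIGH, axis=0),  # y-gradient, smoothed in x
        "HH": filter_axis(hi_x, HIGH, axis=0),
    }
    return {name: band[::2, ::2] for name, band in bands.items()}

def pyramid(img, levels):
    """Recursive decomposition: only LL is propagated to the next level."""
    details, ll = [], img
    for _ in range(levels):
        bands = decompose_level(ll)
        details.append({s: bands[s] for s in ("HL", "LH", "HH")})
        ll = bands["LL"]
    return details, ll

# Example: a horizontal ramp, img[y, x] = x, decomposed to two levels.
img = np.tile(np.arange(16.0), (16, 1))
details, ll = pyramid(img, levels=2)
```

Away from the border the HL band of the ramp is 1 (its x-gradient) and the LH band is 0, and the four level-1 bands together hold as many samples as the input, matching the storage remark above.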

Since the above technique is identical to the one employed with wavelet filters, one would expect similar results. Only symmetric compact biorthogonal wavelets were considered18 for application within our registration method. Unlike the compact orthonormal wavelets, they may be symmetric (or anti-symmetric)19 which are desirable characteristics for image processing. The quality of the estimation of the registration parameters will depend on the number and type of features that are highlighted in each sub-band filter output.

Figure 4 shows the decomposition of an MR image to level-3 using both a biorthogonal wavelet (Ref. 18: Table 6.1, N = 2, 13-tap symmetric/anti-symmetric filter pair) and the Simoncelli 5-tap matched pair11. Figure 3 shows the magnitude Fourier transform of each of these filters at the same scale (j = 2 was chosen for clarity of display). From these plots, it is evident that the wavelet high-pass filter profile (Fig. 3) peaks at higher spatial frequencies than does the corresponding Simoncelli filter at the same scale. In Figure 4 this accounts for the fact that the Simoncelli filtered images contain considerably more detail than the wavelet-filtered images. Experiments bear out the expectation that the former afford much more accurate registration than the latter. Also, it is evident that the LL (low-pass) images contain fewer features than the three band-pass images, and their inclusion in the matching procedure was found to cause a deterioration in the stability and accuracy of estimation of the registration parameters – for both the Simoncelli and the wavelet filters. In essence, the Simoncelli filters are designed to operate as smoothing and smoothed-first-derivative operators, whereas the profile of the biorthogonal wavelet filter resembles a second-derivative operator. This special design property of the Simoncelli filter was not explicitly used, except insofar as first-order derivative (gradient) filters highlight edges, which are the dominant features suited to registration5,6, and the fact that a richer set of image features was afforded by this filter than by the particular wavelet filter considered. This is borne out by Figure 4, which demonstrates that the Simoncelli gradient-filtered images HL and LH have more contrast than either the HH (second-derivative) image or any of the wavelet-filtered images.

Another set of filters of interest are the quadratic-spline filters of Mallat and Zhong17, which were designed to perform as a multiscale edge detector and, like Simoncelli’s filters, resemble first-derivative operators. However, for each input, there are only two band-pass output images (HL and LH), instead of the three provided by the Simoncelli filters. The Mallat-Zhong filters have impulse-response profiles similar to the Simoncelli filters Ψ1 and Ψ2 in Eq. (3). Some preliminary tests indicate that the performances of these two filter types in registration were comparable.

The smoothing properties of the multiscale hierarchy, described above, mean that only those features in the original image that are at spatial scale ≥ 2^j will be present at level j. This sets a lower bound for the distance between image features, and a corresponding upper bound on the registration misalignment allowed. The advantage of having more than one filtered image-pair to register (in the present scheme, there are three at each level j), is that there are many more features, at each level, on which to register the pair, than are available from just one, unfiltered, image-pair on its own. This also provides more equations for the regression (see below).

2.3 Interpolation

The displacement field u(x) takes on arbitrary values, so that, in general, an image lattice point x will map to the interstitial point x + u(x). This requires that some form of interpolation procedure be used to evaluate image values I(x + u(x)) during the registration procedure. The downsampling of the images resulting from band-pass filtering, which is employed to reduce computational load, has the disadvantage of destroying the translational invariance or ‘shiftability’ property of these filters20. The shiftability property implies that the filter transform of a perturbed input (in this case, the perturbation is the displacement field u) can be computed as a linearly weighted combination of the transform coefficients of the unperturbed input. Since the downsampling operation retains the sampling rate at (or close to) the Nyquist rate, the shiftability property is not satisfied. However, by filtering at the full image size, thereby creating redundancy (oversampling) in the output, the shiftability property is preserved.

The strategy adopted in this paper is as follows. First, successive coarse-to-fine registration is carried out using the downsampling hierarchy of filters. After this has been done, the error of registration will be of the order of 2^Jmin pixels (where Jmin (> 0) is the level that involves filtering at the highest spatial frequencies of all the registration steps), depending on the interpolation scheme used. Second, using the approximate values of u (the displacement field) found in the first step, the same matching procedure is applied to the full-size filtered image pairs, in order to satisfy the shiftability criterion. One or two levels of coarse-to-fine registration using full-size images is generally sufficient to bring about a good match.

2.4 Equations for registration

There are several possible methods for solving the matching Eq. (1). In general, the images are pre-filtered by a set of filters h, such as those arising from the multi-level decomposition described above. The matching procedure may be implemented either by projecting the matching equation onto a subspace spanned by the translations of the filter function { h(x – n), n ∈ Z² } (that is, convolving with the filter h), or by applying the matching directly to the filtered images Ih ≡ I⊗h and Ih’ ≡ I’⊗h. In the first case, convolution of Eq. (1) with the filter h(x) yields, after some manipulation:

$$\Delta I_h = e I_h + (\mathbf{u} I) \otimes\!\cdot\, \nabla h - (I\, \nabla \cdot \mathbf{u}) \otimes h \tag{5}$$

where ∆Ih ≡ Ih’ – Ih. In the second case, applying Eq. (1) directly to the prefiltered images Ih and Ih’, and rearranging the last term yields

$$\Delta I_h = e I_h + \nabla \cdot (I_h \mathbf{u}) - I_h\, \nabla \cdot \mathbf{u} \tag{6}$$

Apart from the second (divergence) term on the right of Eq. (6), the derivative has been moved off the image function I onto the filter h or the displacement field u. The differentiation of Ih in the divergence term is eliminated by use of a ‘patch algorithm’ described below. Thus, since both u and h are modeled as continuously differentiable functions, the problem of ill-posedness is eliminated. Another benefit is that, at a sufficiently high (coarse) level of the filter-decomposition tree, pre-filtering I and I’ removes noise. In this formulation of the registration problem, the noise is not considered as part of the model, in contrast to several recent Bayesian approaches to registration and optical flow estimation12,21,22.


2.5 Patch registration algorithm

Eq. (6) is simpler to solve (at least in the spatial domain) than Eq. (5), so henceforth we consider only Eq. (6). An effective strategy for finding the displacement field vector components is to integrate Eq. (6) over a set of patches in the spatial domain. To integrate over a given patch P in the image, we use Gauss’s theorem to express the integral of ∇•(Ih u) over P as an integral of Ih u•n over the boundary ∂P of the patch14:

$$\int_P \Delta I_h \, d\mathbf{x} = e \int_P I_h \, d\mathbf{x} + \int_{\partial P} I_h\, \mathbf{u} \cdot \mathbf{n} \, ds - \int_P I_h\, \nabla \cdot \mathbf{u} \, d\mathbf{x} \tag{7}$$

where n is the unit outward normal vector on ∂P. See Figure 5.
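As a worked illustration of Eq. (7), and not part of the original derivation, consider the affine field u = Ax + b with components (u_x, u_y) on a rectangular patch P = [x0, x1] × [y0, y1]; the rectangular shape and the entry names a11, a22 are assumptions made for this example. The divergence is then constant, and the boundary integral splits into two edge pairs:

```latex
\nabla \cdot \mathbf{u} = a_{11} + a_{22},
\qquad
\int_{\partial P} I_h\, \mathbf{u} \cdot \mathbf{n}\, ds
  = \int_{y_0}^{y_1} \Bigl[ I_h\, u_x \Bigr]_{x=x_0}^{x=x_1} dy
  + \int_{x_0}^{x_1} \Bigl[ I_h\, u_y \Bigr]_{y=y_0}^{y=y_1} dx ,

\int_P \Delta I_h \, d\mathbf{x}
  = (e - a_{11} - a_{22}) \int_P I_h \, d\mathbf{x}
  + \int_{y_0}^{y_1} \Bigl[ I_h\, u_x \Bigr]_{x=x_0}^{x=x_1} dy
  + \int_{x_0}^{x_1} \Bigl[ I_h\, u_y \Bigr]_{y=y_0}^{y=y_1} dx .
```

In this form only values of Ih appear, on the patch and on its edges, and no derivative of the image is evaluated.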

Size and Placement of Patches

The optimal size and placement of the patches is not known. In general, the size of the patch will depend on the distribution of spatial frequencies in the image, which in turn depends on the level j at which it is pre-filtered prior to registration. Patches covering smooth areas in the image provide less reliable estimates than those covering areas that contain detail. To automate the method, rectangular patches were chosen. The choice of rectangular patches simplifies the evaluation of the surface and area integrals in Eq. (7). The size of the patches is specified by the user, and is kept fixed for each level j of the downsampled hierarchy; this scales the effective size of the patch to cover a larger fraction of the area of the downsampled image, the higher the level of decomposition. In this way, larger displacement fields can be accommodated at the higher levels. Also, approximate corrections for these displacements are applied after each stage of registration, before proceeding to the next, finer level, where the residual displacements are smaller and the effective size of the patch (i.e., relative to the image size) is halved along each coordinate direction. For placement of the patches, a simple ‘tiling’ procedure was chosen: one that does not take into account the local image characteristics such as contrast and spatial scale of features which may vary considerably across the image. For this procedure, the algorithm attempts to distribute the patches uniformly over the image. The number of patches is computed for each level j as the maximum number that can be included in the current-size image without overlap. If the images are downsampled, there will be fewer patches the larger j, since the patch size is kept fixed (see above). At successively higher levels of decomposition, downsampling will reduce the number of pixels sufficiently that the patches may partially overlap. A more sophisticated algorithm would locate the patches in places where there are the largest numbers of features, at that level, within the area of the patch.

The final refining step of registration uses full-size images. The patch size is scaled by a factor of 2^Jmin along each coordinate direction, where Jmin is the lowest (finest) level of the previous hierarchy of downsampled images. In this way, residual displacements resulting from the level-Jmin downsampled hierarchy of registration will adequately be ‘captured’ when registering the full-size image.
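The tiling and patch-scaling rules above can be sketched as follows. This is a minimal illustration that assumes square images and square patches, a level-j image side of size/2^j, and a tiling centred in the image as one way of distributing patches uniformly; it does not model the partial overlap mentioned for the coarsest levels. All function names are hypothetical.

```python
def tile_patches(height, width, patch):
    """Maximal non-overlapping tiling, centred in the image.

    Returns a list of (x0, x1, y0, y1) patch bounds in pixels.
    """
    ny, nx = height // patch, width // patch
    off_y = (height - ny * patch) // 2
    off_x = (width - nx * patch) // 2
    return [
        (off_x + i * patch, off_x + (i + 1) * patch,
         off_y + j * patch, off_y + (j + 1) * patch)
        for j in range(ny) for i in range(nx)
    ]

def patches_per_level(size, patch, j_min, j_max):
    """Number of fixed-size patches at each downsampled level, coarse to fine."""
    counts = {}
    for j in range(j_max, j_min - 1, -1):
        side = size >> j              # image side after j downsampling steps
        counts[j] = len(tile_patches(side, side, patch))
    return counts

def full_size_patch(patch, j_min):
    """Patch side used for the final full-size refinement: patch * 2**j_min."""
    return patch * 2 ** j_min

counts = patches_per_level(size=256, patch=8, j_min=1, j_max=3)
refine_patch = full_size_patch(patch=8, j_min=1)
```

With a fixed 8-pixel patch, a 256 × 256 image yields fewer patches at coarser levels, each covering a larger fraction of the downsampled image, as the text describes.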

Regression Equations

Substituting the linear representation of the displacement field (Eq. (2)) into Eq. (7), and evaluating the integrals over the rectangular patch Pα = [x0α, x1α] × [y0α, y1α], we arrive at the following set of scalar equations, one for each patch and filtered image Ih, that are linear in the (2M+1) unknowns $\mathbf{c} = (e, c_1^{(x)},\dots,c_M^{(x)}, c_1^{(y)},\dots,c_M^{(y)})^T$, defining the image contrast e and displacement field vector u:

$$\Delta_{\alpha h} = e\, E_{\alpha h} + \sum_{k=1}^{M} \left[ A^{(x)}_{k,\alpha h}\, c^{(x)}_k + A^{(y)}_{k,\alpha h}\, c^{(y)}_k \right] \tag{8}$$

where α := 1,.., npatch identifies the patch; h ∈ {HL, LH, HH} identifies the filter; and npatch is the number of patches at the given level. The scalar coefficients of c are defined by

$$
\begin{aligned}
\Delta_{\alpha h} &= \iint_{P_\alpha} \Delta I_h \, dx\, dy \\
E_{\alpha h} &= \iint_{P_\alpha} I_h \, dx\, dy \\
A^{(x)}_{k,\alpha h} &= \int_{y_{0\alpha}}^{y_{1\alpha}} \left[ I_h(x_{1\alpha},y)\,\Phi_k(x_{1\alpha},y) - I_h(x_{0\alpha},y)\,\Phi_k(x_{0\alpha},y) \right] dy - \iint_{P_\alpha} I_h(x,y)\, \frac{\partial \Phi_k}{\partial x}\, dx\, dy \\
A^{(y)}_{k,\alpha h} &= \int_{x_{0\alpha}}^{x_{1\alpha}} \left[ I_h(x,y_{1\alpha})\,\Phi_k(x,y_{1\alpha}) - I_h(x,y_{0\alpha})\,\Phi_k(x,y_{0\alpha}) \right] dx - \iint_{P_\alpha} I_h(x,y)\, \frac{\partial \Phi_k}{\partial y}\, dx\, dy
\end{aligned}
\tag{9}
$$

Eq. (8) may be written in matrix-vector form as

$$A\mathbf{c} = \boldsymbol{\Delta} \tag{10}$$

in which A is the ‘design matrix’ formed from the constants Eαh, Ak,αh(x), Ak,αh(y) on the right in Eq. (8); and ∆ the ‘data vector’ of ∆αh’s. The integrals are performed using Simpson’s rule over the image lattice points covered by Pα. For each level j, there are three band-pass filtered images Ih: namely, IHL, ILH, and IHH, and therefore 3npatch equations of the form (8) in the (2M+1) unknown components of c. To solve this set of linear regression equations, we require the number of patches to exceed (2M+1)/3. The solution to Eq. (10) is most efficiently computed using Singular Value Decomposition (SVD)23: the design matrix is factored as A = UΣV^T, where U is a 3npatch × (2M+1) matrix with orthonormal columns, V is a (2M+1) × (2M+1) orthogonal matrix, and Σ is a (2M+1) × (2M+1) diagonal matrix. The solution is then explicitly given by c = VΣ^(-1)U^T∆. This algorithm also yields an estimate of the condition number κ of the design matrix: κ = σmax/σmin, where σmax and σmin are the maximum and minimum singular values in the diagonal matrix Σ.
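The assembly and solution of the regression system can be sketched end to end on synthetic data. Everything specific here is an assumption made for illustration: the affine basis (1, x, y), three plane-wave functions standing in for the band-pass images, a difference image generated exactly from the linearized Eq. (1), sub-pixel sampling (129 samples per 16-pixel patch side) so that Simpson's rule is accurate, and all function names. On real images the samples would be the lattice pixels.

```python
import numpy as np

def simpson_weights(n, h):
    """Composite Simpson weights for n (odd) equally spaced samples."""
    if n % 2 == 0:
        raise ValueError("Simpson's rule needs an odd number of samples")
    w = np.ones(n)
    w[1:-1:2] = 4.0
    w[2:-1:2] = 2.0
    return w * h / 3.0

def patch_row(img, d_img, x0, x1, y0, y1, n=129):
    """Row of Eq. (8) for one patch and one band, affine basis (1, x, y).

    img, d_img: callables f(x, y) returning I_h and Delta I_h.
    """
    x = np.linspace(x0, x1, n)
    y = np.linspace(y0, y1, n)
    wx = simpson_weights(n, x[1] - x[0])
    wy = simpson_weights(n, y[1] - y[0])
    X, Y = np.meshgrid(x, y)                 # arrays indexed [iy, ix]
    I = img(X, Y)
    E = wy @ I @ wx                          # double integral of I_h
    a_x, a_y = [], []
    for phi, dphi_dx, dphi_dy in ((np.ones_like(X), 0.0, 0.0),
                                  (X, 1.0, 0.0),
                                  (Y, 0.0, 1.0)):
        g = I * phi
        flux_x = wy @ (g[:, -1] - g[:, 0])   # edges x = x1 and x = x0
        flux_y = wx @ (g[-1, :] - g[0, :])   # edges y = y1 and y = y0
        a_x.append(flux_x - dphi_dx * E)     # A^(x)_k of Eq. (9)
        a_y.append(flux_y - dphi_dy * E)     # A^(y)_k of Eq. (9)
    delta = wy @ d_img(X, Y) @ wx
    return np.array([E] + a_x + a_y), delta

def solve_svd(A, delta):
    """Least-squares solution c = V S^-1 U^T delta and condition number."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt.T @ ((U.T @ delta) / s), s.max() / s.min()

# Ground truth: c = (e, c_x coefficients, c_y coefficients).
e_true = 0.03
cx_true = np.array([0.4, 0.01, -0.02])      # u_x = 0.4 + 0.01 x - 0.02 y
cy_true = np.array([-0.3, 0.015, 0.005])    # u_y = -0.3 + 0.015 x + 0.005 y
c_true = np.concatenate([[e_true], cx_true, cy_true])

def make_band(kx, ky, phase):
    """Plane-wave stand-in for a band image and its linearized difference."""
    img = lambda x, y: np.sin(kx * x + ky * y + phase)
    def d_img(x, y):
        ux = cx_true[0] + cx_true[1] * x + cx_true[2] * y
        uy = cy_true[0] + cy_true[1] * x + cy_true[2] * y
        return e_true * img(x, y) + (ux * kx + uy * ky) * np.cos(kx * x + ky * y + phase)
    return img, d_img

bands = [make_band(0.21, 0.05, 0.1), make_band(-0.07, 0.19, 0.7),
         make_band(0.13, 0.16, 1.3)]
rows, deltas = [], []
for img, d_img in bands:
    for y0 in range(0, 64, 16):              # 4 x 4 tiling of a 64 x 64 domain
        for x0 in range(0, 64, 16):
            r, d = patch_row(img, d_img, x0, x0 + 16, y0, y0 + 16)
            rows.append(r)
            deltas.append(d)
A, delta = np.array(rows), np.array(deltas)
c_hat, cond = solve_svd(A, delta)
```

The 16 patches and three bands give 48 equations in 7 unknowns, comfortably above the (2M+1)/3 bound. Because the synthetic difference image obeys the linear model exactly, `c_hat` should reproduce `c_true` up to quadrature error; with real image pairs the linearization error discussed in Section 2.1 would add to this.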

Updating Procedure

Once estimates of the registration parameters e and ck have been obtained in this way, they are substituted into Eq. (2) and the image I is updated according to

$$I_{\mathrm{new}}(\mathbf{x}) = (1+e)\, I_{\mathrm{old}}(\mathbf{x}+\mathbf{u}), \tag{11}$$

and Inew becomes the updated image to be registered against I´ at the next (finer) level of registration (see Figure 1). Starting at the coarsest level Jmax , e and u are incrementally refined and the images correspondingly updated, to the finest level Jmin (< Jmax ), using the downsampled hierarchy. Subsequently, in order to refine the level-Jmin registered image and remove any residual registration errors caused by the downsampling, the same procedure is applied using full-size images. This second hierarchy may be chosen to start at level Kmax = Jmin and finish at level Kmin < Kmax ; and, in general, only one or two levels are required to achieve a good fit. In practice, most of the adjustments occur at the first level of registration (Jmax ). It was found that, by repeating this step a significant overall gain in computational efficiency and registration accuracy could be achieved.
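The update of Eq. (11) requires the image at non-lattice points x+u. A minimal sketch using bilinear interpolation is given below. The function name `warp_bilinear`, the convention that x indexes the first array axis, and the clamping of samples at the image border are assumptions made for the example.

```python
import numpy as np

def warp_bilinear(img, u1, u2, e=0.0):
    """Return (1 + e) * img evaluated at (x + u1, y + u2), Eq. (11).

    img    : 2D array of size at least 2 x 2, indexed as img[x, y] (assumption)
    u1, u2 : displacement in pixels, scalars or arrays of img.shape
    """
    n1, n2 = img.shape
    x, y = np.meshgrid(np.arange(n1), np.arange(n2), indexing="ij")
    # Clamp sample positions to the image domain (assumption).
    xs = np.clip(x + u1, 0, n1 - 1)
    ys = np.clip(y + u2, 0, n2 - 1)
    # Lower-left lattice corner of each sample, kept inside the array.
    x0 = np.minimum(np.floor(xs).astype(int), n1 - 2)
    y0 = np.minimum(np.floor(ys).astype(int), n2 - 2)
    fx, fy = xs - x0, ys - y0
    out = ((1 - fx) * (1 - fy) * img[x0, y0]
           + fx * (1 - fy) * img[x0 + 1, y0]
           + (1 - fx) * fy * img[x0, y0 + 1]
           + fx * fy * img[x0 + 1, y0 + 1])
    return (1.0 + e) * out
```

For an affine field, `u1` and `u2` would be arrays evaluated from the current parameters on the pixel lattice, after converting from normalized coordinates to pixels.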

In general, the range of pixel intensities in the band-pass filtered images will vary from filter to filter, giving rise to variations in the magnitudes of the integrals defining the constants E, A(x,y), ∆ (Eq. (9)) that appear in each row of the design matrix. These variations will in turn give undue weight to high-contrast filtered images relative to low-contrast ones. Therefore, for each band-pass filter at each level j – in both the downsampled and full-size image hierarchies – it is necessary to apply a compensating weight factor. In the present case, the weight was chosen according to the L2-energy of the image (i.e., the sum of squares of the pixel intensity values) in the sub-bands ε(j)HL, ε(j)LH, or ε(j)HH, at level j as

wS(j) = (εS(j))^(-1/2)    (12)

where S := HL, LH, or HH. The ε’s by definition scale as the square of magnitude of the band-pass filtered image intensity contrast, whereas the constants in Eq. (9) scale linearly with the intensity contrast. Therefore, low-contrast sub-band images will be given more comparable weight relative to high-contrast sub-band images, than if uniform weighting were applied.
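The weighting of Eq. (12) can be illustrated as follows. This is a sketch under the assumption that the three sub-band images at a given level are held in a dictionary keyed by filter name; the function name `subband_weights` is hypothetical.

```python
import numpy as np

def subband_weights(bands):
    """Weights w_S = (energy_S)^(-1/2) of Eq. (12), one per sub-band.

    bands : dict mapping 'HL', 'LH', 'HH' to 2D arrays at one level j.
    """
    weights = {}
    for name, band in bands.items():
        # L2-energy: sum of squares of the pixel intensities.
        energy = float(np.sum(np.asarray(band, dtype=float) ** 2))
        # An all-zero band carries no information; give it zero weight (assumption).
        weights[name] = energy ** -0.5 if energy > 0 else 0.0
    return weights
```

Each row of the design matrix and the matching entry of the data vector would then be multiplied by the weight of its sub-band, so that rescaling one sub-band by a constant leaves its weighted rows unchanged.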

Description of the Algorithm

The patch algorithm can be summarized as follows. Let I(j) denote the n1 x n2 image obtained after registering at the jth level of the hierarchy of processing. Also, define n1j = n1/2^j and n2j = n2/2^j, and npatch = number of patches. The computational procedure for a pair of n1 x n2 images I and I´ is:

Input: Image dimensions (n1,n2); Patch dimensions (s1,s2); Coarse and fine limits of downsampled registration hierarchy: Jmin, Jmax; Fine-scale limit of full image-size hierarchy: Kmin.

1. Using the given patch size s1 by s2, compute a ‘tiling’ of the spatial domain by patches;
2. I(Jmax) := I;
3. For j = Jmax,…,Jmin:
   - Apply j passes of the (downsampling) filter-set to I(j), producing n1j x n2j sub-images ILL, ILH, IHL, IHH;
   - For each patch Pα, evaluate integrals (using Simpson’s rule) in Eq. (5), with u parameterised as in Eq. (2), over each of the 3 band-pass sub-images IHL, ILH, and IHH. Weight the integrals according to energy in the pass-band filtered image (Eq. (12));
   - Solve the resulting set of 3npatch regression equations for the e and ck, using SVD;
   - Update the registration estimate: I(j-1)(x) = I(j)(x+u(j)), where u(j) is the estimate of u obtained from the e and ck’s found in the preceding step. (Bilinear interpolation was used to evaluate I at interstitial points (x+u(j)));
4. Resize each patch to (2^Jmin s1 + 1, 2^Jmin s2 + 1) and re-compute a tiling of the image plane.
5. Repeat Step 3 for the full-size hierarchy, starting at level Kmax = Jmin and finishing at level Kmin < Kmax.

Output: Registered image I; Corresponding registration parameters (e, {ck: k = 1,..,M})
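Steps 1 and 4 of the procedure can be sketched as below. The text does not specify how patches are placed, so the non-overlapping layout and the dropping of partial patches at the border are assumptions, as are the function names `tile_patches`, `full_size_patch` and `enough_patches`.

```python
def tile_patches(n1, n2, s1, s2):
    """Tile an n1 x n2 lattice with s1 x s2 patches (Step 1).

    Returns inclusive bounds (x0, x1, y0, y1) for each patch. Patches do not
    overlap and partial patches at the border are dropped (assumptions).
    """
    return [(x0, x0 + s1 - 1, y0, y0 + s2 - 1)
            for x0 in range(0, n1 - s1 + 1, s1)
            for y0 in range(0, n2 - s2 + 1, s2)]

def full_size_patch(s1, s2, jmin):
    """Patch size used for the full-size hierarchy (Step 4)."""
    return (2 ** jmin * s1 + 1, 2 ** jmin * s2 + 1)

def enough_patches(npatch, M):
    """Check that the number of patches exceeds (2M+1)/3."""
    return 3 * npatch > 2 * M + 1

# A 256 x 256 image downsampled three times is 32 x 32; with 5 x 5 patches:
patches_demo = tile_patches(32, 32, 5, 5)
```

With (s1,s2) = (5,5) and Jmin = 2, `full_size_patch` gives (21,21), the full-size patch used in Experiment (i); (7,7) gives (29,29), as quoted in the caption of Figure 10.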

The reliability of the estimates can be determined at each level from the condition number κ of the matrix in the linear regression equations. The expected displacements u between the unregistered input images must be smaller than 2^Jmax pixels in size, i.e. Jmax ≥ log2(|u|max), and Jmin and Kmin must be large enough that noise does not corrupt the solution. It is found that the corrections to u diminish rapidly; indeed, most of the correction is achieved in the first step: j = Jmax.

Affine registration

In the present paper, the displacement field is represented in its simplest form as the affine transformation:

u = b + Cx

that is, in Eq. (2) M = 3, Φ1(x) = 1, Φ2(x) = x, Φ3(x) = y, and

u1 = b1 + c11 x + c12 y
u2 = b2 + c21 x + c22 y    (13)

Affine transformations have the useful property that they form a group under composition. This implies that changes in the registration parameters (∆e, ∆b1, ∆c11, ∆c12, ∆b2, ∆c21, ∆c22), found in the incremental step going from a coarse to the next finer level, can be composed with the current estimate (e, b1, c11, c12, b2, c21, c22) of the registration parameters, to produce the updated parameters (e’, b1’, c11’, c12’, b2’, c21’, c22’), given by the requirement

Inew(x) = (1+∆e)(1+e) I(x + ∆b + ∆Cx + b + C(x + ∆b + ∆Cx)) ≡ (1+e’) I(x + b’ + C’x)

obtained by combining two steps of the matching equation. This identity yields the set of updated registration parameters

e’ = e + (1+e)∆e
C’ = C + (I+C)∆C    (14)
b’ = b + (I+C)∆b

where I in Eq. (14) is the identity matrix. Importantly, each time the image I needs to be updated, this is done directly from the original input image I using the new set (e’, b’, C’), not incrementally from the previously updated image using the increments (∆e, ∆b, ∆C). In this way, errors introduced by the interpolation procedure (required to compute I at interstitial points) cannot accumulate, as they arise only from a single interpolation step. For higher-order transformations of the form Eq. (2), this group property no longer holds, so successive compositions of transformations entail some form of approximation.
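The composition rule of Eq. (14) can be checked numerically with a short sketch. The function name `compose_affine` and the argument order are assumptions made for the example.

```python
import numpy as np

def compose_affine(e, b, C, de, db, dC):
    """Compose the current estimate (e, b, C) with an increment (de, db, dC).

    Implements Eq. (14): the increment is applied first, then the current
    estimate, and the result is expressed as one affine map (e', b', C').
    """
    b, C = np.asarray(b, dtype=float), np.asarray(C, dtype=float)
    db, dC = np.asarray(db, dtype=float), np.asarray(dC, dtype=float)
    I2 = np.eye(2)
    e_new = e + (1.0 + e) * de
    C_new = C + (I2 + C) @ dC
    b_new = b + (I2 + C) @ db
    return e_new, b_new, C_new

# Arbitrary illustrative values (not taken from the experiments).
e0, b0, C0 = 0.02, np.array([0.01, -0.02]), np.array([[0.05, -0.05], [0.06, 0.06]])
de0, db0, dC0 = 0.01, np.array([0.003, 0.001]), np.array([[0.01, 0.0], [0.0, -0.01]])
e1, b1, C1 = compose_affine(e0, b0, C0, de0, db0, dC0)
```

Because the composed parameters describe a single transformation of the original image, the warp is always applied once to the input image, which is the property used above to avoid accumulating interpolation error.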

3. RESULTS

We report results using the Simoncelli matched filter pair to achieve multiscale filtering. The algorithm described above was tested on the following pairs of 256 x 256 T2*-weighted 2D brain images:

(i) Image I´ was artificially constructed by distorting I using the affine transformation

u1 = 0.05x - 0.05y + 0.0195;
u2 = 0.06x + 0.06y + 0,    (15)

and e = 0. Here, the spatial domain has been scaled to [0,1] × [0,1]. Using the method described in the previous section, we attempted to recover the parameters of this transformation. The maximum displacement in this example is 30 pixels. The patch size used was (s1,s2) = (5,5) pixels. Figure 6(a) shows the image pair prior to registration.

(ii) Image I´ was artificially constructed by distorting I using the non-linear (quadratic) transformation

u1 = 0.05 + 0.05x – 0.05y + 0.1x^2 + 0.05xy + 0.1y^2
u2 = 0 + 0.06x + 0.06y + 0.02x^2 + 0.01xy – 0.1y^2    (16)

We attempted to recover the 12 registration parameters (as well as e) of this transformation. The same patch size as in (i) was used. The pre-registration image pair is shown in Figure 6(b). In Eqs. (15) and (16), the coefficients were chosen arbitrarily, merely to exercise the algorithm, and do not reflect any particular real situation.

(iii) Images I and I´ were obtained from two consecutive stationary positions of the subject during the same experiment. The two positions represent a large in-plane head movement between the two images. The patch size (7,7) was used. The pre-registration images are shown in Figure 7(a).

(iv) The images in (iii) were artificially corrupted independently by Rician noise (Ref. 24), which was added to each pixel of both images according to the formula

Snoise = √[S^2 + n^2]    (17)

where S is the pixel value in the original noise-free image, and n ~ 0.3 Smax N(0,1) is the noise component derived from the normal distribution with mean zero and standard deviation 0.3 Smax, where Smax is the maximum intensity of pixels in the original image. The same patch size as in (iii) was used. The pre-registration images are shown in Figure 7(b).
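The noise model of Eq. (17) is simple to reproduce. The sketch below follows the formula as stated; the function name `add_noise_eq17` and the use of NumPy's `default_rng` generator are assumptions for the example.

```python
import numpy as np

def add_noise_eq17(S, level=0.3, rng=None):
    """Corrupt image S according to Eq. (17): sqrt(S^2 + n^2).

    n is drawn independently per pixel from a normal distribution with mean
    zero and standard deviation level * max(S).
    """
    rng = np.random.default_rng() if rng is None else rng
    S = np.asarray(S, dtype=float)
    n = level * S.max() * rng.standard_normal(S.shape)
    return np.sqrt(S ** 2 + n ** 2)
```

Calling the function separately on each image of a pair, with independent draws, gives the independent corruption described for Experiment (iv).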

The accuracy of registration was defined in terms of the following error measure:

err(j) = ||I´ - I(j)|| / √(||I´|| ||I(j)||),    (18)

where ||I|| denotes the L2-norm of I, and was computed at each level j := Jmax,..,Jmin, using the full-size image pair I and I´ after registration. It is important to note that, even if perfect registration could be achieved, this error measure may still be greater than zero, and thus be somewhat misleading. There are two reasons for this: (a) The noise characteristics of the images to be registered will, in general, be different. This is not the case in (i) or (ii), where the transformations (15) or (16) were applied to each pixel in turn; (b) If any features are present in one image but absent in the other – for example, due to occlusion by an image boundary – then a complete matching of features will not be possible. This situation occurs even in cases (i) and (ii), since the transformations (15), (16) will move points near the border of image I outside the domain of image I´.
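For reference, the error measure of Eq. (18) can be computed as in the following sketch. The function name `registration_error` is hypothetical, and both images are assumed to be non-zero arrays of the same size.

```python
import numpy as np

def registration_error(ref, reg):
    """err = ||ref - reg|| / sqrt(||ref|| * ||reg||), Eq. (18).

    ref : reference image I'
    reg : registered image I(j), same shape as ref
    The norm is the L2-norm over all pixels.
    """
    ref = np.asarray(ref, dtype=float)
    reg = np.asarray(reg, dtype=float)
    denom = np.sqrt(np.linalg.norm(ref) * np.linalg.norm(reg))
    return np.linalg.norm(ref - reg) / denom
```

The measure is zero only when the two images agree pixel for pixel, which is why independent noise or features missing from one image keep it above zero even for a perfect alignment.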

In the experiment (i), Jmax = 3 and Jmin = 2. The level-3 step was executed twice to achieve better accuracy (see Section 2.5 and Figure 8). The error was found to vary as err(3) := 0.470, 0.076, err(2) = 0.151 for the first (downsampling) hierarchy. For the second (full-size image) hierarchy, the patch was enlarged to (2^Jmin s1 + 1, 2^Jmin s2 + 1) = (21,21) pixels. After two passes at level Kmax = 2, Kmin = 1 of the second (full-size image) hierarchy, err(2) = 0.007, err(1) = 0.0001. For comparison, the original error before registration was err(0) = 0.205. The initial increase in err (err(3) > 0.205), as well as an increase in going from j = 3 to 2 in the first hierarchy, may be due to the non-shiftability property of downsampling filters, as discussed in Section 2.3. This behaviour highlights the necessity of using the redundancy in the full-size filtered images to attain accurate final registration, but also the utility of downsampling for initially correcting large misalignments at reduced computational cost. Also, it should be noted that, since filtered images are being used for registration, the lowest level that can be registered is j = 1 (corresponding to a single pass of the multiscale filtering procedure).

The condition numbers were, for the first hierarchy: κ(3) := 18.6, 20.6, κ(2) = 36.4; and for the second hierarchy: κ(2) = 32.6, κ(1) = 34.6. By comparison, if regression is applied using only one image (LL) without downsampling, the condition numbers were found to be of order of a few thousand. The parameters were recovered with maximum error ≈ 0.0009 in C, 0.0004 in b (equivalent to 0.1 pixel, or 0.1 mm with the FOV = 25 cm), and 0.2% in e. (Note: The errors in b and C are normalized to the image size as unit). The final solution attains sub-pixel alignment accuracy (Figure 8). The total processing time on an SGI Challenge Series CPU with 150 MHz clock was 2.8 secs. for levels (3,3,2) of the first hierarchy, and a further 9.9 secs. for level 2 of the second hierarchy. These timings demonstrate the savings achieved by downsampling. Indeed, the majority of the computation time is spent in performing multiple passes of the filter on the full-size image, each pass at level j requiring 4^j times as many computations as its downsampled counterpart. For comparison, if the images are not downsampled at each filter pass, the entire computation takes 28.57 secs., and requires more levels of hierarchical processing (j := 5,4,3). Furthermore, the iterative method in Refs. 5 and 6 (generalized to affine transformations instead of just rigid body displacements) requires 28 secs. using the processing hierarchy Jmax = 3, Jmin = 1, and produces somewhat larger errors in the registration parameters. These timings indicate a speedup for the affine transformations by a factor 2.5 over the previous method (Refs. 5 and 6).

For the experiment (ii), it was found that the computations were more efficient if an affine approximate registration was first applied, followed by the full non-linear transformation. Figure 9 shows results of an affine transformation with Jmax = 6, Jmin = 2, followed by the quadratic transformation with Jmax = 3, Jmin = 1. For the final registration, err(1) = 0.009. [Note: In this case, no downsampling was used; if it were, then computations would have been more efficient and fewer levels would probably have been required.]

For experiments (iii) and (iv), a sequence of j-levels (3,3,3,3,2) was used in the downsampling hierarchy, and (2,2) in the full-size hierarchy (see Figures 10 and 12). In the first case, the final error following registration was err = 0.056, and in the second case err = 0.268. Figure 12 displays the original reference image ((a) and (c)) and difference between reference and registered images ((b) and (d)) for the two cases. However, as discussed above this error is misleading, due to the independent noise in the two images of each pair. Also, inspection of the pre-registered images in Figure 7a reveals small localized differences in anatomy that are not amenable to affine registration. Comparison of the difference images after registration (Figure 12 (b) and (d)) also shows that the registration was poorer in the presence of noise, but the algorithm did not fail catastrophically (as was found with Fourier-based methods5 ). Finally, it was found that there was no merit in proceeding to level j = 1, as the local differences in anatomy between the images ‘confused’ the affine registration procedure and no further improvement, in either err or the appearance of the difference image, could be achieved. However, the 13-parameter quadratic registration algorithm was applied to the images in (iii), after they had been affine-registered, using Jmax = 3, Jmin = 2. It was found that err was reduced slightly, from 0.056 to 0.042 – see Figure 11, but only marginal improvement in the difference images (a reduction in the range of intensities by 23%) was achieved.

Finally, the automatic choice of registration run-time parameters (patch size, number of levels in each of the hierarchies) is still a matter for investigation. The choices in the four experiments above were guided by the apparent misalignments between the unregistered images, the size (in pixels) of the images, and the spatial scale distribution of detail; the final values chosen were based on comparing the values of err between different trials. Although err is not an ideal measure of registration error, it was generally found adequate for the purpose of finding an optimal set of run-time parameters.

4. DISCUSSION AND CONCLUSION

An algorithm for fast, efficient registration of images has been presented. The method has strong links with the optical flow literature, but differs in that it uses only spatial filtering, not spatio-temporal filtering. The method of this paper is similar to the approach of Gupta and Kanal (Ref. 14), in which an integral form of the optical flow equation was used in order to circumvent the problem of numerical instabilities of approximating derivatives on a sampling lattice. For rapid sequences of MR images, such as are acquired during a functional MR imaging (fMRI) experiment, registration should include temporal filtering as well, and thus would more closely resemble the optical flow approach. The analysis of flow estimation errors in optical flow (Refs. 12, 15, 16) carries over to the registration approaches, though an analysis of stability and errors for the integral formulation of the present algorithm remains to be done.

It was found that by carrying out preliminary registration on reduced-size images, obtained by downsampling through a hierarchy of low-pass/band-pass filters, considerable computational savings could be achieved, even though the registration accuracy was limited by the lack of ‘shiftability’. However, the output images from this hierarchy were sufficiently well registered that a final registration using the full-size images (for which the shiftability property is satisfied) could make significant corrections to the residual errors. For example, the artificial experiment (i) (Section 3) showed that the residuals could be reduced to 0.1 pixel. For rapid imaging sequences, such as echoplanar imaging (EPI) in MR, in which the inter-image motion is generally of the order of a few pixels, the downsampling stage should not be used; only full-size image registration is of use. The algorithm has been successfully used to register long sequences of EPI images from an fMRI experiment, resulting in improved detection of activated regions in the brain in cases where inter-image displacements were greater than a few pixels. For example, registering a sequence of 200 64x64 EPI images took less than 5 minutes on an SGI Challenge Series CPU with 150 MHz clock.

Some outstanding questions with the method concern the automatic choices of patch size, the ranges of levels at which to carry out registration using both the downsampled and full-size images, and the adaptive placement of patches to take advantage of the structural features in the images. Adaptive methods exist for the optical flow methods16 , but the use of such methods for the present integral form of the image matching equation (Eq. (7)) requires separate investigation.

The method described has proved efficient and general enough to be extended to more elaborate models for global distortion. An example of a quadratic deformation, requiring 12 deformation parameters {ck} and one intensity parameter e, was given. It demonstrated that it is computationally more efficient to first apply a simple transformation (in this case, a 7-parameter affine transformation) to obtain approximate alignment, then apply the complete, more expensive transformation to do the final corrections. Quadratic deformations are of interest, for example, in cardiac imaging, and even brain imaging, wherever global deformations occur.

The robustness of the method to noise has been demonstrated (Section 3, Experiment (iv)). However, only Rician noise24 (arising from MR scanning hardware) was tested. The effects of various physiological artifacts are under investigation, particularly those arising during fMRI scans. One of the main difficulties arises from vasculature. These features usually deform in highly non-linear ways, and therefore will require the resources of a computationally expensive local-distortion registration program, using large numbers of parameters to define the distortion field u. The generalization of the method of the present paper to register local deformations is in progress (see below). Also, the present method allows for variations in overall image brightness (the parameter e above) which are common in MR imaging scans.

The computational speed makes the method a candidate for real time correction of inter-image patient motion in a fast imaging experiment. The pre-filtering transformations are computationally the limiting factor in this algorithm, and the speed of computation depends on the range of levels Jmax,…,Jmin, and Kmin, needed to successfully perform registration. Their values are dictated by the size of the distortion (from which Jmax may be estimated), and the amount of noise and deviation of true distortions from the registration model (from which Jmin and Kmin may be estimated). Smaller displacements allow smaller Jmax, thereby reducing the time required for pre-filtering.

Extensions to 3D images are immediate. For example, the integrals in Eq. (7) extend over patch volumes P and their bounding surfaces ∂P, and the evaluation of these integrals (as well as the placement of the patches in the spatial domain) is greatly facilitated by use of cubic rectangular shapes. However, the larger numbers of parameters (13 for an affine transformation and image intensity correction), and the need to compute filter transformations on larger image data sets, will both increase the computational time. For quadratic deformations (e.g., 3D fast MR cardiac imaging) the number of parameters increases to 31 (including image intensity correction). The time for filter transformations scales linearly with the number of data points, and the SVD algorithm needs to solve for nearly twice (affine) or four times (quadratic) as many registration parameters as in the 2D affine case.

The extension of the patch algorithm to handle arbitrary (including local) deformations would be of interest, and is under investigation. The global adjustments provided by the algorithm of the present paper could be followed by local corrections, caused by deviations from the assumed registration model (e.g., affine transformation). For local registration, similar to the approach of Szeliski et al. (Ref. 25), we consider basis functions of the form

Φk(x,y) = φ(x-kx) φ(y-ky)

where φ is a spline function or wavelet with small support. The indices k = (kx,ky) lie on a sub-lattice of the image lattice: for example, k = (mx,my) for some integer m > 0, where (x, y) lie on the image lattice. The optimal choice for size of the patches P will depend on the scale and distribution of image features. The localized nature of the patches and the basis functions together imply that the design matrix for this regression problem will be sparse, so sparse matrix algorithms are of importance for carrying out the Singular Value Decomposition, which would otherwise be prohibitively slow. If the size and placement of patches can be adapted to local image features, then the performance will probably be significantly improved. Local distortion correction is of great importance in medical imaging in general. However, since the methods employ many more parameters to describe the distortion field, a preliminary global registration, such as affine transformation, would be essential to reduce the computations to a manageable amount.
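As an illustration of such a basis, the sketch below uses the linear B-spline (‘hat’ function) for φ, scaled so that its support spans one knot spacing m on either side. This choice of φ, the scaling, and the function names `hat` and `local_basis` are assumptions made for the example; the text leaves the form of φ open.

```python
import numpy as np

def hat(t):
    """Linear B-spline: 1 at t = 0, falling linearly to 0 at |t| = 1."""
    return np.maximum(0.0, 1.0 - np.abs(t))

def local_basis(x, y, kx, ky, m):
    """Tensor-product basis function centred on the knot (kx, ky).

    The hat function is stretched to half-width m pixels, so the function is
    non-zero only for |x - kx| < m and |y - ky| < m.
    """
    return hat((x - kx) / m) * hat((y - ky) / m)

# Knots on a sub-lattice with spacing m = 4 over a 9 x 9 pixel lattice.
m_demo = 4
knots_demo = [(i * m_demo, j * m_demo) for i in range(3) for j in range(3)]
x_demo, y_demo = np.meshgrid(np.arange(9), np.arange(9), indexing="ij")
values_demo = np.stack([local_basis(x_demo, y_demo, kx, ky, m_demo)
                        for kx, ky in knots_demo])
```

At any pixel only a handful of the basis functions are non-zero (at most four for this choice of φ), which is the source of the sparsity of the design matrix noted above.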

Finally, the algorithm is well suited to a parallel processing implementation. The computations of Eqs. (9), for setting up the regression equations, are all of identical form and localized to the image data covered by the patch. Therefore, all patch computations can be carried out independently, and their results will be available for the SVD computations each after the same number of computations. Also, once the levels (Jmax, Jmin, Kmax, Kmin) and patch size (s1,s2) have been decided, the total number and duration of the computations are known. The number of registration parameters {ck} (Eq. (2)) that can be accommodated in real-time processing will therefore be limited by the degree of parallelism available for the computations.
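The independence of the patch computations can be illustrated with the simplest of the integrals in Eqs. (9), Eαh, evaluated by composite Simpson's rule and mapped over patches by a pool of workers. This is a sketch only: the function names are hypothetical, unit pixel spacing is assumed, and a thread pool is used purely for brevity (a process pool or dedicated parallel hardware would be needed for a real speedup).

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def simpson_weights(n):
    """Composite Simpson weights for n unit-spaced samples (n odd, n >= 3)."""
    if n < 3 or n % 2 == 0:
        raise ValueError("Simpson's rule needs an odd number of samples >= 3")
    w = np.ones(n)
    w[1:-1:2] = 4.0   # odd interior points
    w[2:-1:2] = 2.0   # even interior points
    return w / 3.0

def patch_integral(band, patch):
    """Integral of a filtered image over one patch, i.e. E in Eqs. (9).

    patch : inclusive lattice bounds (x0, x1, y0, y1)
    """
    x0, x1, y0, y1 = patch
    block = band[x0:x1 + 1, y0:y1 + 1]
    wx = simpson_weights(block.shape[0])
    wy = simpson_weights(block.shape[1])
    return float(wx @ block @ wy)

def all_patch_integrals(band, patches, workers=4):
    """Evaluate the patch integrals independently, one task per patch."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: patch_integral(band, p), patches))
```

The odd patch sizes quoted in the Results (5, 7, 21 and 29 pixels) are compatible with composite Simpson's rule, which needs an odd number of sample points along each axis.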

ACKNOWLEDGEMENTS

The author expresses his appreciation for the many useful comments of the two anonymous referees for improving this paper, and Dr L. Ryner of the Institute for Biodiagnostics for his help in acquiring the MR images used to test the registration programs.

REFERENCES

1. Lee, C.C., Jack, C.R., Grimm, R.C., Rossman, P.J., Felmlee, J.P., Ehman, R.L., and Riederer, S.J., “Real-time adaptive motion correction in functional MRI,” Magn. Reson. Medicine 36, 436-444 (1996).

2. Fu, Z.W., Wang, Y., Grimm, R.C., Rossman, P.J., Felmlee, J.P., Riederer, S.J., and Ehman, R.L., “Orbital navigator echoes for motion measurements in magnetic resonance imaging,” Magn. Reson. Medicine 34, 746-753 (1995).

3. Van Gool, L., Moons, T., Pauwels, E., and Oosterlink, A., “Vision and Lie’s approach to invariance,” Image and Vision Computing 13:259-277 (1995).

4. Maas, L.C., Frederick, B., and Renshaw, P.F., “Decoupled automated rotational and translational registration for functional MRI time series data: The DART registration algorithm,” Magn. Reson. Imaging 37:131-139 (1997).

5. Alexander, M.E. & Somorjai, R.L., “The registration of MR images using multiscale robust methods,” Magn. Reson. Imaging 14, 453-468 (1996).

6. Alexander, M.E., Scarth, G. & Somorjai, R.L., “An improved robust hierarchical registration algorithm,” Magn. Reson. Imaging 15, 505-514 (1997).

7. Woods, R.P., Cherry, S.R., & Mazziotta, J.C., “Rapid automated algorithm for aligning and reslicing PET images,” J. Comp. Assist. Tomogr.16, 620-633 (1992).

8. Chen, Q.-S., Defrise, M., & F. Deconinck, “Symmetric phase-only matched filtering of Fourier-Mellin transforms for image registration and recognition,” IEEE Trans. Pattern Anal. Mach. Intell. 16, 1156-1168 (1994).

9. Reddy, B.S. & Chatterji, B.N., “An FFT-based technique for translation, rotation, and scale-invariant image registration,” IEEE Image Processing 5, 1266-1271 (1996).

10. Horn, B.K.P., and Schunck, B.G., “Determining optical flow,” Artificial Intelligence 17, 185-203 (1981).

11. Simoncelli, E.P., “Design of multi-dimensional derivative filters,” Proceedings of ICIP-94, Austin, Texas, pp. 790-794, 1994.

12. Simoncelli, E.P., “Distributed representation and analysis of visual motion,” PhD thesis, Technical Report #209, Vision and Modeling Group, MIT Media Laboratory, 1993.

13. Mallat, S. G., “A theory for multiresolution signal decomposition: the wavelet representation,” IEEE Trans. Pattern Anal. Mach. Intell. 11, 674-693 (1989).

14. Gupta, N. & Kanal, L., “Gradient based image motion estimation without computing gradients,” Int. J. Computer Vision 22, 81-101 (1997).

15. Kearney, J.K., Thompson, W.B., and Boley, D.L., “Optical flow estimation: An error analysis of gradient-based methods with local optimization”, IEEE Trans. Pattern Anal. Mach. Intell. 9, 229-244 (1987).

16. Battiti, R., Amaldi, E., and Koch, C., “Computing optical flow across multiple scales: An adaptive coarse-to-fine strategy”, Int. J. Computer Vision 6, 133-145 (1990).

17. Mallat, S., and Zhong, S., “Characterization of signals from multiscale edges,” IEEE Trans. Pattern Anal. Mach. Intell. 14:710-732 (1992).

18. Cohen, A., Daubechies, I., and Feauveau, J.-C., “Biorthogonal bases of compactly supported wavelets,” Comm. Pure Appl. Math. 45, 485-560 (1992).

19. Daubechies, I., Ten Lectures on Wavelets, Society for Industrial and Applied Mathematics (Philadelphia, PA), 1992.

20. Simoncelli, E.P., Freeman, W.T., Adelson, E.H., and Heeger, D.J., “Shiftable multiscale transforms,” IEEE Trans. Inform. Theory 38:587-607 (1992).

21. Downie, T.R., “Wavelet methods in statistics,” PhD thesis, University of Bristol, Dept. of Mathematics, 1997.

22. Magarey, J., and Kingsbury, N., “Motion estimation using a complex-valued wavelet transform,” IEEE Trans. Signal Processing 46:1069-1084 (1998).

23. Press, W.H., Teukolsky, S.A., et al., Numerical Recipes in C, 2nd Edition (Cambridge University Press, Cambridge MA), 1992.

24. Gudbjartsson, H., and Patz, S., “The Rician distribution of noisy MR data”, Magn. Reson. Medicine 34:910-914 (1995).

25. Szeliski, R., and Coughlan, J., “Spline-based image registration,” Int. J. Computer Vision 22, 199-218 (1997).

LIST OF FIGURE CAPTIONS

Figure 1: Flow diagram for the patch registration algorithm. Image I is aligned with image I´ by successive refinement of the registration parameters. The registration is carried out from coarse (j = Jmax) to fine (j = Jmin) scales. At level j, the pre-filtered images for registration are obtained by j passes of the filters. Each such pass produces one low-pass (LL) and three band-pass (HL, LH, HH) images. [Note: Subscripts j on LLj etc., have been omitted for clarity]. The filtering can either include downsampling by a factor two along each coordinate direction, or retain the image size for each filter output. Note that image I´ need be filtered to level Jmax, prior to registration, only once. This is because the filter decomposition, as it proceeds from level 0 to Jmax, makes the filtered images available at all lower levels (i.e., finer scales) j < Jmax. These images can subsequently be used in successive stages of registration as it proceeds from coarse (j = Jmax) to fine (j = Jmin) levels. This is not the case with image I, since it is transformed into a different image by the current registration parameters at each stage. Consequently, the filtering operation needs to be repeatedly carried out for each new update of I. At each stage, only the band-pass images (HL, LH, HH) are used for registration, as shown in the figure. The low-pass filter does not participate in the registration (and is therefore not shown), but is used only by the multiscale filtering procedure. (Recall that, to obtain the filtered images at level (j+1), the filtering is carried out only on the low-pass image (LLj) at the finer level j.) The parameters found at level j are used to update I for more refined registration at level j-1. I´ does not need to be updated, and having been pre-computed is immediately available for this next stage of registration.
Figure 2: Hierarchical decomposition using Simoncelli’s matched filter pair, with downsampling by a factor two along each coordinate axis. As in the text, let LLj, HLj, LHj, HHj denote the low-pass and band-pass filtered images after j levels of decomposition. At level 1, the original image is decomposed into, and overwritten by, a low-pass (LL1) and three band-pass (HL1, LH1, HH1) images, each one-quarter the size of the input. At level 2, LL1 is again filtered, downsampled, and replaced by LL2, HL2, LH2, and HH2. The previously band-pass filtered images HL1, LH1, HH1, do not undergo any further filtering. This cascade continues to a maximum level Jmax (= 2 in this Figure). When the images are not downsampled – as is the case in the second part of the registration algorithm – all images LL, HL, LH, HH are the same size as the original image, and thus – unlike the downsampled case shown in the Figure – require extra storage.
Figure 3: Comparison of Simoncelli and biorthogonal wavelet filter impulse response profiles in the magnitude-Fourier and spatial domains. The Figure depicts these filters at level j = 2: low-pass Fourier (left column), high-pass Fourier (middle column), and high-pass spatial domain (right column). The 5-tap Simoncelli filter pair is taken from Ref. 11; the 13-tap wavelet/scaling function filter pair is taken from Ref. 18, Table 6.1. Both high-pass filters are directionally sensitive. The Simoncelli high-pass filter is asymmetric in the spatial domain, representing the first-derivative along this direction. The corresponding filter profile for the wavelet is symmetric, and resembles a second-derivative along this direction. Notice that the Simoncelli high-pass filter peaks at a lower spatial frequency than the corresponding wavelet filter.
Figure 4: 2-level decomposition of a MR image, using (a) bi-orthogonal wavelet filter (see caption to Figure 3), and (b) Simoncelli filter. The filtered images show that the Simoncelli band-pass filters produce much more prominent features than do the wavelet filters. For both filters, the composite ‘tiling’ of successive levels of decomposition of the images shown here follows the layout described in Figure 2.
Figure 5: The location of an arbitrary patch in the plane of a 2D image of size n1 by n2 pixels.
The arrow circuit shows the path of integration (∂P) for the boundary integral in Eq. (7). For each patch P, the integrals in Eq. (7) are carried out for each of the three band-pass filtered images (HL,LH,HH) and for each level j:=

43

Jmax ,..,Jmin (first hierarchy) and j = Kmax ,..,Kmin (second hierarchy). See Section 2.5 for notation. Figure 6: The pre-registration images for the affine (a) and quadratic (b) registration experiments (i) and (ii) in Section 3. The reference images are in the left column; the corresponding images in the right column were obtained by artificially distorting the reference image using the transformations given in Eqs. (15) and (16). All images are T2 * -weighted, of size 256x256 pixels. Figure 7: The pre-registration images taken from a MR multi-slice EPI scan, and used in Experiments (iii) and (iv) in Section 3. The images on the right were registered against those on the left as reference. (a) shows the actual images taken from the scanner, while the images in (b) contain independent 30% Ricean noise artificially added to the corresponding images in (a). The subject performed a large in-plane head displacement between the two images. All images are T2 * -weighted, of size 256x256 pixels. Figure 8: The progress of the registration algorithm for a 256 by 256 pixel image pair (see Figure 6(a)), for the affine distortion given in Section 3, Experiment (i). The patch size for downsampled registration was 5x5 pixels, and for the fullsize registration 21x21 pixels. (a) Difference image for the unregistered pair; (b)→ → (d) Difference images resulting from the downsampling hierarchy of registration at levels j := 3, 3, and 2, respectively. Residual errors due to the inability

to achieve sub-pixel alignment with downsampling (‘non-shiftability’) are evident; (e)→(f) Difference images after two passes of the second, full-size image registration (Kmax = 2, Kmin = 1), refining the registration in (d). The residual misalignment in (f) is of the order of 0.1 pixel.

Figure 9: The progress of the registration algorithm for a 256 by 256 pixel image pair for the quadratic distortion given in Section 3. Quadratic registration was performed on full-size images, with a patch size of 21x21 pixels. (a) Original reference image; (b) Difference image for the unregistered pair; (c) Difference image at the end of the hierarchy of affine (approximate) registration between levels Jmax = 6 and Jmin = 2; (d) Difference image at the end of the quadratic
(exact) registration hierarchy between levels Jmax = 3 and Jmin = 1, which refines the registration in (c). Throughout, the full-size images were used.

Figure 10: Affine registration of real MR (EPI) images in Experiment (iii) of Section 3. The patch size for the downsampled registration was 7x7 pixels, and for the full-size registration 29x29 pixels. (a) Original reference image; (b) Original target image for registration; (c) Final target image after registration; (d) Difference image ((c)-(a)) after registration. Registration used levels j = 3, 3, 3, 2 for the downsampled images, and levels j = 2, 2 for the full-size images. The error of registration was err = 0.056.

Figure 11: Comparison of affine registration and a subsequent quadratic refinement registration of the images from Experiment (iii) (see Fig. 7(a)). Quadratic registration was performed on the full-size images. (a) Original reference image; (b) Original target image; (c) Difference image after affine registration; (d) Difference image after quadratic refinement. The value of err decreased from 0.056 in (c) to 0.042 in (d). The patch size used for the quadratic registration was 29x29 pixels.

Figure 12: Comparison of affine registration on real image data from Experiment (iii) and the same data artificially corrupted by independent 30% Ricean noise. (a) Original reference image; (b) Difference image after registration of image pair without noise; (c) Reference image for registration of noise-corrupted image pair; (d) Difference image after registration of noise-corrupted image pair. The value of the error (err) is 0.056 in (b) and 0.268 in (d). For (d), the different noise characteristics of the reference and target images account for a significant part of the large value of err.
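The downsampled decomposition cascade described in the caption to Figure 2 can be illustrated with a minimal sketch. It is not the implementation used in this paper. The 2-tap filters `LO` and `HI` are placeholder assumptions standing in for the 5-tap Simoncelli or 13-tap wavelet filters, whose coefficients are not reproduced here. The function names, the circular boundary handling, and the convention that the first letter of a sub-band name refers to the horizontal filter are likewise assumptions made for illustration.

```python
import numpy as np

# Placeholder 2-tap filters (assumption): the paper's 5-tap Simoncelli and
# 13-tap biorthogonal wavelet filters are NOT reproduced here.
LO = np.array([0.5, 0.5])    # low-pass: local average
HI = np.array([0.5, -0.5])   # high-pass: local difference


def _filter_down(a, kernel, axis):
    """Filter along `axis` with circular boundaries, then keep every 2nd sample."""
    out = np.zeros(a.shape, dtype=float)
    for shift, coeff in enumerate(kernel):
        out += coeff * np.roll(a, -shift, axis=axis)
    index = [slice(None)] * a.ndim
    index[axis] = slice(None, None, 2)
    return out[tuple(index)]


def split_level(ll):
    """One cascade step: LL -> LL, HL, LH, HH, each with a quarter of the pixels.

    Naming assumption: first letter = filter along x (axis 1),
    second letter = filter along y (axis 0).
    """
    lo_x = _filter_down(ll, LO, axis=1)
    hi_x = _filter_down(ll, HI, axis=1)
    return {
        "LL": _filter_down(lo_x, LO, axis=0),
        "HL": _filter_down(hi_x, LO, axis=0),
        "LH": _filter_down(lo_x, HI, axis=0),
        "HH": _filter_down(hi_x, HI, axis=0),
    }


def decompose(image, j_max):
    """Return ({level: {"HL", "LH", "HH"}}, final low-pass image)."""
    ll = np.asarray(image, dtype=float)
    bands = {}
    for j in range(1, j_max + 1):
        sub = split_level(ll)
        ll = sub.pop("LL")   # only the low-pass image is filtered again
        bands[j] = sub       # band-pass images receive no further filtering
    return bands, ll


image = np.arange(64 * 64, dtype=float).reshape(64, 64)
bands, ll2 = decompose(image, j_max=2)
```

With downsampling, the sub-bands at all levels together occupy the same number of pixels as the input image, which is why the tiled layout of Figure 2 needs no extra storage.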

Figure 1 (flow diagram). Labels: Jmax, Jmax-1 and Jmax-2 level filtering applied to I, I(Jmax-1) and I(Jmax-2), with outputs labelled L and H; Registration at j = Jmax, j = Jmax-1 and j = Jmax-2.

Figure 2 (sub-band tiling). Labels: LL2, HL2, LH2, HH2, HL1, LH1, HH1.

Figure 3 (filter profiles). Labels: Simoncelli filters, Wavelet filters; Low-pass (Fourier), High-pass (Fourier), High-pass (spatial).

Figure 4 (images). Panels (a), (b).

Figure 5 (diagram). Labels: Image plane with origin O and axis limits n1-1, n2-1; patch P with boundary ∂P and unit normal vector; patch bounds x0α, x1α, y0α, y1α.

Figure 6 (images). Panels (a), (b).

Figure 7 (images). Panels (a), (b).

Figure 8 (images). Panels (a) to (f).

Figure 9 (images). Panels (a) to (d).

Figure 10 (images). Panels (a) to (d).

Figure 11 (images). Panels (a) to (d).

Figure 12 (images). Panels (a) to (d).
