
IMAGE BASED TEMPORAL REGISTRATION OF MRI DATA FOR MEDICAL VISUALIZATION

Meghna Singh1, Richard Thompson2, Anup Basu3, Jana Rieger4 and Mrinal Mandal1
Departments of 1Electrical & Computer Engineering, 2Biomedical Engineering, 3Computing Science, 4Rehabilitation Medicine, University of Alberta, Edmonton, Alberta, Canada.
Contact: [email protected], [email protected]

ABSTRACT
The capability of creating video data from MRI has many advantages in visualization for medical practitioners, including (i) not subjecting patients to harmful radiation, (ii) being able to monitor patients at short inter-exam time intervals, and (iii) being able to capture 3D volume data. The quality and speed with which MRI data can be acquired, however, pose a challenge to supporting good quality visualization. In this work we present results from our preliminary attempts at enhancing the temporal resolution of video captured via MRI. Our initial focus is on visualization of swallowing and associated problems that are broadly categorized as dysphagia. We present a method to register data from multiple swallows to generate high temporal resolution MRI videos.

1. INTRODUCTION
Medical visualization is fast becoming a vital tool for the detection and treatment of several diseases and disorders. Many clinicians assert that the most promising developments in medical visualization have been in MRI (magnetic resonance imaging), X-ray imaging (CT and video-fluoroscopy), ultrasound and endoscopy. Of these techniques, MRI has emerged as the front-runner because it can be tailored to visualize internal soft tissue structure or even blood flow, and because it is harmless to the human body (unlike X-ray scans, which are also limited in how often they can be conducted). MRI also allows for visualization in 3D and potentially 4D (with the fourth dimension being time). The other techniques fall behind because of their respective limitations: ultrasound has poor spatial resolution, poor soft tissue contrast and limited access windows; endoscopy is invasive; and X-rays result in harmful radiation exposure.

In this work we address the problem of visualizing disorders in swallowing. In a broad medical sense, difficulty in swallowing is referred to as dysphagia. The need for thorough swallow evaluations is rooted in the fact that dysphagia can lead to serious physical complications such as malnutrition, weight loss, and aspiration of food into the airway, all of which, if not managed properly, may eventually lead to serious morbidities and even mortality. Currently, there are two widely used technological methods of swallow assessment: flexible endoscopic evaluation of swallow (FEES) [1] and modified barium swallow (MBS) captured by video-fluoroscopy [2]. The drawbacks of current assessment techniques (invasiveness for endoscopy, and radiation exposure and poor soft tissue visualization for video-fluoroscopy) have prompted recent work exploring other avenues for better visualization of swallowing. One tool that has the potential to address these drawbacks is MRI [3].

However, the major challenge with MRI is the tradeoff between spatial and temporal resolution: high frame rate data capture can only be accomplished by losing spatial image quality. Our approach integrates multiple high spatial-resolution MRI scans (taken during multiple swallow events at low temporal resolution) by aligning them temporally and spatially. There are two major challenges in achieving acceptable results: (i) no two swallows are identical in duration; (ii) there is no external gating available to time stamp the multiple acquisitions.

The rest of this paper is organized as follows. In Section 2 we review recent developments in MRI and related areas such as temporal registration; we also highlight the novelty of our approach. In Section 3 we present our proposed methodology. In Section 4 we discuss experimental results and in Section 5 we present conclusions and future work.

2. RECENT DEVELOPMENTS: REVIEW
MRI was initially developed and used as a static imaging technique, but over time it has evolved and is now being used to image dynamic events, such as cardiac rhythms, blood flow and brain activity [4]. However, when imaging fast occurring events, the low temporal resolution of MRI becomes a bottleneck.
Researchers have tried to overcome this limitation by experimenting with multiple acquisitions of a cardiac cycle in order to get sufficient spatial and temporal resolution. An ECG signal that is captured simultaneously with the MRI is used as a gating signal. Other researchers are developing techniques for self-gating of the cardiac cycle which can be used for fetal cardiac imaging. Thompson et al. [5][6] use the sum of pixel intensities in regions of interest and the sum of pixel intensities in all of k-space as their self-gating signal. The advantage with cardiac imaging is that the ECG or image derived gating signal is periodic and repeatable; the problem we are addressing is different in that multiple instances of events such as swallowing are not entirely reproducible and no reliable gating method such as ECG is available for swallowing.

Developments in temporal registration such as generation of super-high resolution videos [11], frame rate upsampling [7] or registration of tongue models during speech [8] can be found in the literature. [11] addresses temporal registration between multiple camera acquisitions of a single event. The images are registered based on the different frame rates of the cameras and a known or computed offset in the capture time, which is modeled as a 1D affine transform. [8] registers 3D models of the tongue based on dynamic time warping (DTW) of the articulations (sound) that are captured. DTW is a well-established technique for audio processing and speech recognition that allows alignment of multiple vocalizations of the same syllable. Others such as [9] use a 1D B-spline deformable model and affine transformation on cardiac images for correcting local and global temporal misalignments.

Our approach is different in the following regards. Unlike [11], our problem statement deals with acquisitions (via a single source) of multiple repetitions of a single activity, and those repetitions of swallowing are not identical. Compared to [8], there is no external measurable event that can act as a gating signal. Drawing inspiration from work done by Thompson et al. [6], we propose to use image based techniques to self-gate multiple swallowing events and align them in time to generate high temporal resolution MRI swallow data for better medical visualization and diagnosis.

3. PROPOSED APPROACH
An image-based temporal registration approach allows us to derive parameters from the image sequences themselves that help register the sequences in time. Figure 1 shows an overview of our method; details are discussed in the following sections.

[Figure 1 diagram: Swallow 1 ... Swallow n (high spatial, low temporal resolution MRI) -> Spatial Registration -> Bolus Tracking -> Temporal Registration -> high spatial, high temporal resolution MRI]

Fig. 1: Overview of approach.

A. MRI acquisition strategies
When imaging a short-lived event such as a swallow, rapid capture of MRI images is crucial. There are many acquisition strategies for MRI, of which Cartesian acquisition is most commonly used, but other approaches like radial acquisition have been considered in research applications [6] (Fig. 2). In conventional Cartesian MRI acquisition, data is acquired in raster form and image resolution is sacrificed in order to attain faster imaging rates. In radial acquisition, data is acquired as a series of radial projections through the center of k-space. The advantage with radial acquisition is that information at the center of k-space (low frequency) is sampled at a higher temporal resolution without a significant loss of spatial resolution. We experimented with both of these capture methods in our work, and based on limits on the spatial resolution and temporal resolution obtained, we chose radial acquisition for our data capture.

[Figure 2 panels: (a) and (b), each drawn on Kx and Ky axes]

Fig. 2: MRI acquisition strategies. (a) Cartesian acquisition, (b) Radial acquisition.
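To make the geometry of radial acquisition concrete, the following minimal sketch generates sample positions for a set of radial projections. It is an illustration only, not the scanner's actual trajectory: the function name, the normalised radius range and the even angular spacing over 180 degrees are assumptions, and the default counts (96 projections of 192 points) are borrowed from the acquisition described in Section 4.

```python
import math

def radial_kspace_samples(n_projections=96, n_points=192):
    """Return a list of projections; each is a list of (kx, ky) samples.

    Assumed layout: radii normalised to [-1, 1) and angles spread evenly
    over 180 degrees, so each projection is a diameter through k = 0.
    """
    projections = []
    half = n_points // 2
    for p in range(n_projections):
        theta = math.pi * p / n_projections
        line = []
        for i in range(n_points):
            r = (i - half) / half  # sample index `half` sits at r = 0
            line.append((r * math.cos(theta), r * math.sin(theta)))
        projections.append(line)
    return projections

samples = radial_kspace_samples()

# Every projection revisits the centre of k-space.
centre_hits = sum(1 for line in samples if (0.0, 0.0) in line)

# Share of all samples that fall in the low-frequency disc |k| < 0.25.
total = sum(len(line) for line in samples)
central = sum(1 for line in samples for kx, ky in line
              if math.hypot(kx, ky) < 0.25)
fraction = central / total
```

In this layout each projection passes through the centre, so the low-frequency region is refreshed with every projection, and roughly a quarter of all samples lie within the central disc of radius 0.25. A uniform Cartesian grid over the same square would place only about 5% of its samples there (the area ratio π·0.25²/4).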

B. Spatial Registration
There is some expected spatial movement of the head in between swallows. To compensate for this movement we register the swallow images using a linear conformal transformation. Linear conformal transformations may include rotation, scaling and translation. In this type of transformation shapes and angles are preserved. Spatial registration is accomplished by manually selecting control points in the two image sets and computing a transform based on these control points.

C. Bolus Segmentation and Tracking
Once the images have been spatially registered, the next step is to segment the bolus and track its path in the space-time volume of the swallow (as illustrated in Fig. 5). In order to segment the moving bolus we use a standard background separation technique [10]. A background template image is calculated as the average of MRI images where no bolus is present (images towards the end of swallow acquisition). From this background image we subtract the swallow MRI images. Regions of motion are thus segmented as shown in Fig. 3.
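As an illustration of the control-point fit described in Section B, the sketch below estimates a linear conformal transformation (rotation, uniform scaling and translation) in the least-squares sense. It is not the implementation used in this work: the function names and the control-point coordinates are hypothetical, and the closed form relies on writing each 2D point as a complex number z, so that the transformation becomes w = a·z + b.

```python
import cmath
import math

def fit_linear_conformal(src_points, dst_points):
    """Least-squares fit of dst ~ a * src + b with complex a and b.

    abs(a) is the scale, the phase of a is the rotation angle and b is the
    translation. Needs at least two distinct control points.
    """
    z = [complex(x, y) for x, y in src_points]
    w = [complex(x, y) for x, y in dst_points]
    n = len(z)
    z_mean = sum(z) / n
    w_mean = sum(w) / n
    zc = [p - z_mean for p in z]
    wc = [q - w_mean for q in w]
    a = (sum(p.conjugate() * q for p, q in zip(zc, wc))
         / sum(abs(p) ** 2 for p in zc))
    b = w_mean - a * z_mean
    return a, b

def apply_linear_conformal(a, b, point):
    """Map one (x, y) point through w = a * z + b."""
    mapped = a * complex(*point) + b
    return mapped.real, mapped.imag

# Hypothetical control points picked in one swallow sequence ...
src = [(100.0, 120.0), (200.0, 130.0), (150.0, 220.0), (90.0, 250.0)]
# ... and their positions in the other, simulated here with a known
# transform: 3 degrees of rotation, 2% scaling, shift of (4, -2) pixels.
true_a = cmath.rect(1.02, math.radians(3.0))
true_b = complex(4.0, -2.0)
dst = [apply_linear_conformal(true_a, true_b, p) for p in src]

a, b = fit_linear_conformal(src, dst)
scale = abs(a)
angle_deg = math.degrees(cmath.phase(a))
```

Because the model has only four real parameters, two control points already determine it; additional manually selected points average out placement error. The complex form cannot represent a reflection, which matches the shape- and angle-preserving property stated above.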

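The background separation step of Section C can be sketched as follows. This is a minimal illustration under stated assumptions rather than the method of [10] or the code used in this work: the function name, the number of trailing frames averaged into the template (`n_background`), the use of an absolute difference and the relative threshold are all hypothetical choices.

```python
import numpy as np

def segment_bolus(frames, n_background=5, threshold=0.2):
    """Return boolean motion masks for a (T, H, W) stack of frames.

    The background template is the mean of the last `n_background` frames,
    which are assumed to contain no bolus. A pixel is marked as moving when
    its absolute difference from the template exceeds `threshold` times the
    largest difference in the stack (an assumed, relative threshold).
    """
    frames = np.asarray(frames, dtype=float)
    background = frames[-n_background:].mean(axis=0)
    difference = np.abs(frames - background)
    return difference > threshold * difference.max()

# Synthetic example: a bright 3x3 "bolus" moving down a dim, static image.
frames = np.full((8, 16, 16), 0.1)
frames[0, 2:5, 2:5] = 1.1
frames[1, 5:8, 2:5] = 1.1
frames[2, 8:11, 2:5] = 1.1
masks = segment_bolus(frames)
```

On real data the difference images contain residual noise, so small spurious regions are expected in the masks and have to be suppressed afterwards.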
Fig. 3: Background separated images from a swallow showing the bolus.

We then compute a weighted centroid of the bolus in all segmented images. The weighting is used to eliminate any residual noise: the centroid of the largest extracted region is assigned the highest weight and is retained, while other regions are considered as noise and removed from the space-time volume. A path of the centroid motion is then created as shown in Fig. 5 (c)&(d). Tracking centroidal motion has the advantage that it is stable and invariant to baseline transformations [12].

D. Temporal Registration
The progression of the bolus down the oral cavity and the throat is represented by a space-time volume. This spatio-temporal representation of the bolus motion is crucial for visualization of problems with swallowing, since abnormalities in a swallow will be reflected as deviations from the standard spatio-temporal path. Fig. 5 shows two such space-time volume representations. Temporal registration for the purpose of generating a high-resolution video of the swallow is based on matching the spatio-temporal paths of the bolus in the multiple swallows. Let the paths of the centroid in two swallows be represented as c(x, y, t) and c'(x', y', t'). As the multiple swallow images have been previously spatially registered, c'(x', y', t') equates to c'(x, y, t'), where t' = t + ∆t. Temporal registration of the two paths involves finding a sub-frame displacement ∆t that minimizes the distance between the centroidal positions in the two paths as follows:

$$C = \arg\min_{\Delta t} \sum_{\text{paths}} \left( (c_x - c'_x)^2 + (c_y - c'_y)^2 \right) \qquad (1)$$

[11] computes this sub-frame displacement by interpolating from the adjacent integer point locations. However, since each swallow is a related but non-identical event, interpolation from adjacent points may result in a sub-optimal solution. Instead, taking the entire event (swallow) dynamics into account for registration is a more favorable approach in our case. Thus, via curve fitting we compute the parametric curves that represent the entire paths of the centroid in continuous time. From these continuous curves, we then compute the sub-frame displacement that minimizes Eq. 1. Fig. 4 (a) shows the continuous curves generated at half-frame resolution and Fig. 4 (b) shows the temporal registration computed by matching the centroidal paths.
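The curve-fitting and minimisation steps above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the text does not specify the family of fitted curves, so cubic polynomials are assumed here, and the function names, the grid search, its range (`max_shift`) and its step are hypothetical. The cost is the mean rather than the sum of squared distances, an assumed modification so that shifts with fewer overlapping samples are not favoured.

```python
import numpy as np

def fit_path(t, xy, degree=3):
    """Fit one polynomial per coordinate to a centroid path sampled at times t."""
    xy = np.asarray(xy, dtype=float)
    return [np.polyfit(t, xy[:, k], degree) for k in range(2)]

def eval_path(coeffs, t):
    """Evaluate a fitted path at times t; returns an array of shape (len(t), 2)."""
    return np.stack([np.polyval(c, t) for c in coeffs], axis=-1)

def estimate_shift(t1, xy1, t2, xy2, max_shift=1.0, step=0.05, degree=3):
    """Grid-search the sub-frame shift dt that minimises the Eq. 1 cost.

    c is fitted to (t1, xy1) and c' to (t2, xy2); c(t) is compared with
    c'(t + dt), using only times where c' is not extrapolated.
    """
    c = fit_path(t1, xy1, degree)
    c_prime = fit_path(t2, xy2, degree)
    t1 = np.asarray(t1, dtype=float)
    best_dt, best_cost = 0.0, np.inf
    for dt in np.arange(-max_shift, max_shift + step / 2, step):
        shifted = t1 + dt
        valid = (shifted >= np.min(t2)) & (shifted <= np.max(t2))
        if valid.sum() < 3:
            continue
        diff = eval_path(c, t1[valid]) - eval_path(c_prime, shifted[valid])
        cost = np.mean(np.sum(diff ** 2, axis=1))
        if cost < best_cost:
            best_dt, best_cost = float(dt), cost
    return best_dt, best_cost

# Synthetic check: the second path is the first one delayed by half a frame.
def true_path(s):
    return np.stack([60.0 + 0.5 * s, 20.0 + 3.0 * s + 0.8 * s ** 2], axis=-1)

t = np.arange(10.0)
xy1 = true_path(t)        # c sampled at integer frame times
xy2 = true_path(t - 0.5)  # c'(t') = c(t' - 0.5), so c'(t + 0.5) = c(t)
dt_hat, cost = estimate_shift(t, xy1, t, xy2)
```

Fitting a single curve to the whole path uses every frame of the swallow when estimating the shift, in the spirit of the argument above against interpolating only between adjacent samples. A continuous optimiser could replace the grid search if finer resolution were needed.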


Fig. 4: (a) Curve fitting on the centroidal path to generate continuous time information. (b) Temporal registration of the paths shown in (a).
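For completeness, the tracking step of Section C (retaining the largest segmented region and discarding the rest as noise) can be sketched as below. This is an assumption-laden illustration: the function name is hypothetical, regions are taken to be 4-connected, and the weighting is reduced to its limiting case, in which the largest region has weight one and all others weight zero.

```python
from collections import deque

import numpy as np

def largest_region_centroid(mask):
    """Return the (x, y) centroid of the largest 4-connected region, or None."""
    mask = np.asarray(mask, dtype=bool)
    visited = np.zeros_like(mask)
    best = []
    for r, c in zip(*np.nonzero(mask)):
        if visited[r, c]:
            continue
        # Breadth-first flood fill collects one connected region.
        region, queue = [], deque([(r, c)])
        visited[r, c] = True
        while queue:
            i, j = queue.popleft()
            region.append((i, j))
            for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                inside = 0 <= ni < mask.shape[0] and 0 <= nj < mask.shape[1]
                if inside and mask[ni, nj] and not visited[ni, nj]:
                    visited[ni, nj] = True
                    queue.append((ni, nj))
        if len(region) > len(best):
            best = region
    if not best:
        return None
    rows, cols = np.array(best).T
    return float(cols.mean()), float(rows.mean())

# Synthetic mask: a 3x3 bolus region plus one isolated noise pixel.
mask = np.zeros((10, 10), dtype=bool)
mask[2:5, 3:6] = True
mask[8, 8] = True
centroid = largest_region_centroid(mask)
```

Applying the function to the mask of each frame yields one (x, y) position per frame, that is, a centroid path of the kind plotted in Fig. 5 (c)&(d).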

4. EXPERIMENTAL RESULTS
All image processing and registration algorithms have been implemented in the MATLAB 7.0 programming environment. The MRI scan was conducted at the University of Alberta at the Centre for the NMR Evaluation of Human Function and Disease. All image data was acquired with subjects lying supine in a Siemens Sonata 1.5T MRI scanner. Measured amounts of water (bolus) were delivered to the subject via a system of tubing and the swallow event was captured in the mid-sagittal plane. We were advised by collaborating clinicians that if the amount of bolus being fed to the subject is kept constant, multiple repetitions of swallowing will have a higher degree of similarity, thus making the combination of the MRI sequences more realistic. As the current work deals with a prototype system, we captured only two repetitions of the swallow.

The data was acquired as 96 radial projections of 192 points and reconstructed to an image size of 384x384. Acquisition time for each image with the above configuration is 0.138 seconds, which corresponds to a frame rate of approximately 7.2 fps. The MRI experiment was designed so that the water in the bolus appears bright in the images, which provides the clear contrast between the bolus and the surrounding tissue that is essential for tracking.

A linear conformal transformation was computed from the manually selected control points and was applied to the MRI sequence. Results of the segmented and tracked bolus are shown in Fig. 5. Temporal registration based on matching the spatio-temporal paths of the bolus is shown in Fig. 4(b). A sample of the MRI images aligned in time is shown in Fig. 6. By visual inspection it can be seen that the motion of the bolus has been interpolated accurately by the alignment, resulting in a doubling of the frame rate from about 7 fps to about 14 fps.
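The frame-rate figures above follow from simple arithmetic once the two sequences are placed on a common timeline. The sketch below is illustrative only: the function name is hypothetical, the sequence labels follow the legend of Fig. 6, and the half-frame offset is an idealised assumption rather than the shift measured in this experiment.

```python
def merge_registered_sequences(n_frames, frame_period, shift_frames):
    """Interleave two registered sequences on a common timeline.

    Returns (time_in_seconds, sequence_label, frame_number) tuples sorted
    by time. `shift_frames` is the temporal offset of the second sequence
    expressed in frames.
    """
    timeline = []
    for k in range(n_frames):
        timeline.append((k * frame_period, "Sw4", k + 1))
        timeline.append(((k + shift_frames) * frame_period, "Sw5", k + 1))
    return sorted(timeline)

frame_period = 0.138              # seconds per image, as reported above
native_fps = 1.0 / frame_period   # a little over 7 fps

timeline = merge_registered_sequences(
    n_frames=6, frame_period=frame_period, shift_frames=0.5)

# Effective frame rate of the merged sequence from its mean frame spacing.
gaps = [later[0] - earlier[0] for earlier, later in zip(timeline, timeline[1:])]
merged_fps = 1.0 / (sum(gaps) / len(gaps))
```

With an offset of exactly half a frame the merged frames are evenly spaced and the rate doubles. For any other sub-frame offset the average rate is still doubled, but the spacing between consecutive frames alternates between a short and a long interval.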

[Figure 5 panels (a) to (d): space-time volume plots with axes x and y (frame, spatial domain) and frame number (time domain)]

Fig. 5: Bolus track in space-time volume of (a) swallow 1 and (b) swallow 2. Centroid path in space-time volume of (c) swallow 1 and (d) swallow 2.

[Figure 6 panels: Sw4 frames f5 to f10 and Sw5 frames f1 to f6]

Fig. 6: Temporally registered MRI swallow images. Legend: Sw<sequence #>-f<frame #>.

5. CONCLUSIONS AND FUTURE WORK
In this preliminary study we demonstrated that image-based techniques can be used to generate high quality MRI video of fast events, such as swallowing, that are not entirely reproducible. Spatio-temporal path matching of the bolus centroid was used to register multiple swallows in time to produce images of high temporal quality. The frame rate of the video was doubled from about 7 fps to about 14 fps. Combining more swallows should increase the temporal resolution proportionally, for example to roughly 30 fps (conventional video frame rate) by combining 4 swallows. In the future, we plan to pursue warping techniques to compare swallows across multiple subjects. Eventually, we plan to model 3D swallows for better visualization.

ACKNOWLEDGEMENTS
The authors are indebted to Dr. Carol Boliek for volunteering as a subject for this preliminary study. They would also like to thank Jana Zalmanowitz for her helpful review of dysphagia.

REFERENCES
[1] S.B. Leder, L.M. Acton, H.L. Lisitano and J.T. Murray, “Fiberoptic endoscope evaluation of swallowing (FEES) with and without blue-dyed food”, Dysphagia, 20:157-162, 2005.
[2] R.E.R. Wright, C.S. Boyd and A. Workman, “Radiation doses to patients during pharyngeal videofluoroscopy”, Dysphagia, 13:113-115, 1998.
[3] A. Anagnostara, S. Stoeckli, O.M. Weber and S.S. Kollias, “Evaluation of the anatomical and functional properties of deglutition with various kinetic high-speed MRI sequences”, Journal of Magnetic Resonance Imaging, 14:194-199, 2001.
[4] M. Singh, W. Sungkarat, J. Jeong and Y. Zhou, “Extraction of temporal information in functional MRI”, IEEE Trans. on Nuclear Science, Vol. 49, No. 5, Oct. 2002.
[5] R.B. Thompson and E.R. McVeigh, “Flow-Gated Phase Contrast MRI Using Radial Acquisitions”, Magn. Reson. Med., 52:598-604, 2004.
[6] R.B. Thompson and E.R. McVeigh, “High Temporal Resolution Phase Contrast MRI with Multiple Echo Acquisitions”, Magn. Reson. Med., 47:499-512, 2002.
[7] A.M. Tourapis, C. Hye-Yeon, M.L. Liou and O.C. Au, “Temporal interpolation of video sequences using zonal based algorithms”, Proc. of ICIP, Vol. 3, Oct. 2001.
[8] C. Yang and M. Stone, “Dynamic programming method for temporal registration of three-dimensional tongue surface motion from multiple utterances”, Speech Communication, Vol. 38, No. 1-2, pp. 201-209, 2002.
[9] D. Perperidis, R. Mohiaddin and D. Rueckert, “Spatio-Temporal Free-Form Registration of Cardiac MR Image Sequences”, MICCAI (1), pp. 911-919, 2004.
[10] C.R. Wren, A. Azarbayejani, T. Darrell and A. Pentland, “Pfinder: Real Time Tracking of the Human Body”, IEEE Trans. on PAMI, pp. 780-785, 1997.
[11] Y. Caspi and M. Irani, “Spatiotemporal alignment of sequences”, IEEE Trans. on PAMI, Vol. 24, No. 11, Nov. 2002.
[12] L. Lee, R. Romano and G. Stein, “Monitoring activities from multiple video streams: Establishing a common coordinate frame”, IEEE Trans. on PAMI, Vol. 22, No. 8, pp. 758-767, Aug. 2000.