NeuroImage 62 (2012) 87–94

Contents lists available at SciVerse ScienceDirect

NeuroImage journal homepage: www.elsevier.com/locate/ynimg

A multi-center study: Intra-scan and inter-scan variability of diffusion spectrum imaging

A. Lemkaddem a,⁎, A. Daducci a, S. Vulliemoz b,g, K. O'Brien d,i, F. Lazeyras d,i, M. Hauf f, R. Wiest f, R. Meuli c,h, M. Seeck b,g, G. Krueger e, J.-P. Thiran a,c,h

a Ecole Polytechnique Fédérale de Lausanne, Signal Processing Laboratories (LTS5), Lausanne, Switzerland
b Epilepsy Unit, Neurology Clinic, University Hospitals of Geneva, Switzerland
c University Hospital Center (CHUV), Lausanne, Switzerland
d Service of Radiology, University of Geneva, Geneva, Switzerland
e Siemens Schweiz AG, Healthcare Sector IM&WS S, Renens, Switzerland
f Support Center for Advanced Neuroimaging, Institute of Diagnostic and Interventional Neuroradiology, Inselspital, Bern University Hospital, University of Bern, Switzerland
g Faculty of Medicine of Geneva, Switzerland
h University of Lausanne (UNIL), Switzerland
i University Hospitals of Geneva, Geneva, Switzerland

Article info

Article history: Accepted 23 April 2012. Available online 1 May 2012.

Keywords: DSI, DTI, ADC, FA, GFA, Reproducibility, Multi-center study

Abstract

The objective of this study was to investigate whether it is possible to pool together diffusion spectrum imaging (DSI) data from four different scanners, located at three different sites. Two of the scanners had identical configurations whereas two did not. To measure the variability, we extracted three scalar maps (ADC, FA and GFA) from the DSI data and used a region-based and a tract-based analysis. Additionally, a phantom study was performed to rule out potential factors arising from scanner performance in case a systematic bias occurred in the subject study. This work was split into three experiments: intra-scanner reproducibility, reproducibility with twin-scanner settings and reproducibility with other configurations. Overall, for the intra-scanner and twin-scanner experiments, the coefficient of variation (CV) was in a range of 1%–4.2% for the region-based analysis and below 3% for almost every bundle for the tract-based analysis. The uncinate fasciculus showed the worst reproducibility, especially for FA and GFA values (CV 3.7–6%). For the GFA and FA maps, an ICC value of 0.7 and above was observed in almost all the regions/tracts. In the last experiment, the outcomes from the two scanners with identical settings were found to be very similar. However, this was not the case for the two other imagers. Given that the overall variation in our study is low for the imagers with identical settings, our findings support the feasibility of cross-site pooling of DSI data from identical scanners. © 2012 Elsevier Inc. All rights reserved.

Introduction

Multi-center neuroimaging studies may provide a powerful framework to conduct comprehensive studies of basic neuroanatomy and clinical disorders for research, clinical and pharmaceutical purposes. Since patient access can be significantly increased, multi-center studies allow a substantial speed-up of research and development projects. Multi-center studies also provide the opportunity to introduce standards that could move MRI toward a quantitative and user-independent diagnostic tool. Collecting and combining images from different imaging devices, however, introduces additional sensitivity to systematic variability in the data. Variability may arise from device-dependent factors, differences in

⁎ Corresponding author. Fax: + 41 21 693 7600. E-mail address: alia.lemkaddem@epfl.ch (A. Lemkaddem). 1053-8119/$ – see front matter © 2012 Elsevier Inc. All rights reserved. doi:10.1016/j.neuroimage.2012.04.045

the reconstruction and processing strategies and, in the case of in-vivo measurements, also from physiological aspects that in the end might confound the interpretation of the results. Since the introduction of diffusion magnetic resonance imaging (dMRI) (Le Bihan and Breton, 1985), the method has evolved into a standard neuroimaging protocol in clinical diagnosis. During the past decade, dMRI has also been increasingly exploited in clinical and research studies as it allows assessing the micro-structural integrity of the brain. Some dMRI-derived parameters have been proven to be sensitive markers of disease-related structural changes for many pathologies and disorders. For example, a reduction of the apparent diffusion coefficient (ADC) has been demonstrated to be a clinical marker of tissue death (Roberts and Rowley, 2003), while changes in the fractional anisotropy (FA) were used as a biomarker in diseases like epilepsy (Focke et al., 2008; Schoene-Bake et al., 2009), multiple sclerosis (Ge et al., 2005; Goldberg-Zimring et al., 2005), schizophrenia (Sun et al., 2003), and even in normal aging


A. Lemkaddem et al. / NeuroImage 62 (2012) 87–94

(Pfefferbaum et al., 2000) and the developing brain (Bonekamp et al., 2007). The most common analysis approach used in dMRI group studies consists in defining regions of interest across the brain and then comparing the groups based on the values of some dMRI-derived measures, such as FA or ADC, in those regions (Bonekamp et al., 2007; Lau and Goodyear, 2007; Vollmar et al., 2010). An alternative connectivity-oriented approach is to use tractography (Basser et al., 2000) to compute fiber pathways through the white matter and then to consider the values of the scalar measures (i.e. ADC, FA etc.) along those pathways (Concha et al., 2010; Kanaan et al., 2006; Pannek et al., 2011). As these techniques are potentially of high clinical value and increasingly used in clinical research, it is very important to characterize and understand their reproducibility in detail. So far, however, reproducibility in diffusion MRI has received little attention and has been studied only in diffusion tensor imaging (DTI) (Ciccarelli et al., 2003; Heiervang et al., 2006; Marenco et al., 2006). Unfortunately, there are only very few studies addressing the reliability of cross-center measures at 3 T (Fox et al., 2011; Teipel et al., 2011; Vollmar et al., 2010; Zhu et al., 2011) that would allow us to draw general conclusions about dMRI reproducibility and, in particular, to understand the most limiting factors in cross-scanner comparisons. Moreover, high angular resolution diffusion imaging (HARDI) schemes, and in particular diffusion spectrum imaging (DSI), are becoming increasingly available and attractive, as these schemes are capable of measuring white matter fiber tracts more accurately and are increasingly used to study structural connectivity in the brain (Gigandet et al., 2008; Granziera et al., 2009, 2011; Hagmann et al., 2008; Honey et al., 2009; Knock et al., 2009; Wedeen et al., 2005).
To our knowledge, no reports of multi-site reproducibility from more advanced dMRI acquisition schemes have been published. In the context of a Swiss multi-center study involving 3 different hospitals and 4 scanners and aiming at measuring structural connectivity alterations in a cohort of epileptic patients, we acquired DSI MRI data to explore the following questions: (i) what are the factors determining intra- and inter-scanner variability in DSI data, (ii) how can we assess the intra- and inter-scanner variability in DSI measurements, (iii) are intra- and inter-scanner variations low enough to permit pooling of clinical dMRI data and (iv) what is the relation between diffusion tensor related FA and HARDI related GFA (generalized fractional anisotropy) measures.

Materials and methods

MRI data was collected from four 3 T MRI scanners from the same vendor (Siemens, Erlangen, Germany) located at three different imaging centers. Table 1 summarizes the differences in technical specifications. Scanners A and B (both Magnetom Trio, a Tim System) are equipped with a 32-channel head helmet coil and will be referred to as “twin scanners” in the following. Scanner C (Magnetom Verio) is also equipped with a 32-channel head helmet coil but is based on a 70 cm magnet bore opening. Scanner D (Magnetom Trio, a Tim System) is equipped with a 12-channel head matrix coil. Scanners C and D are located in the same hospital.

Table 1. Details of scanner configurations.

Scanner   Model   RF-coil (channels)   Coil combine
A         Trio    32                   Adaptive-combine
B         Trio    32                   Adaptive-combine
C         Verio   32                   Sum-of-square
D         Trio    12                   Adaptive-combine

Recent studies have pointed out the dependency of FA and ADC in the brain tissue on the underlying signal-to-noise ratio (SNR) (Burdette et al., 2001). Thus, SNR differences among scanners may introduce inter-scanner variations, and we consider a basic SNR measurement a meaningful criterion for qualification. Furthermore, differences in eddy-current and shim performance between scanners may introduce inter-scanner and inter-session variations. Thus, a further qualification criterion could be the assessment of the eddy-current performance and the obtained B0-homogeneity. It should be noted that those factors can be corrected if appropriate information and algorithms for artifact correction are applied.

Phantom study

SNR-measurements

SNR and eddy current (EC) performance of the scanners were assessed using a Siemens spherical imaging phantom (1000 g of distilled water doped with 1.25 g NiSO4·6H2O) that is also used for the service tune-up procedure. Since we used multi-coil receive head arrays and parallel imaging with an acceleration of 2 in our study, the obtained SNR strongly depends on the position of the object in the coil coordinate system. To obtain a pixel-wise SNR measurement, we acquired a time-series of images without diffusion weighting (b-value = 0 s/mm²) using a single-shot echo-planar imaging (EPI) product sequence consisting of 22 interleaved slices with 32 repetitions. The other image parameters of the sequence were: TR/TE = 4200/154 ms, voxel size = 2.2 × 2.2 × 3 mm, and matrix size = 96 × 96. The SNR of this EPI time-series, E(x, y, z, t), where t = 1, …, 32, was computed for each voxel as follows (Glover and Lai, 1998):

$$\mathrm{SNR}(x,y,z) = \frac{\mu(S)}{\sigma(N)}$$

where the signal and the noise are estimated as

$$S(x,y,z,1{:}16) = \frac{E(x,y,z,2i) + E(x,y,z,2i-1)}{2}$$

and

$$N(x,y,z,1{:}16) = \frac{E(x,y,z,2i) - E(x,y,z,2i-1)}{2},$$

respectively, where i = 1, …, 16, and μ and σ stand for the mean and the standard deviation. We thereafter compared the SNR dependence as a function of the spatial position in the magnet by plotting it along a diameter through the phantom volume, as shown in Fig. 3b).

Eddy-current-measurements

In high-angular resolution diffusion experiments, high b-values are applied that may introduce residual eddy-current related image distortions that depend on the diffusion weighting of each frame. This could lead to image blurring and erroneous ADC and FA estimates, and could affect fiber tract computation in the affected areas. Eddy currents (EC) were measured by acquiring 9 slices in a phantom with 4 b-values (0, 400, 700 and 1000 s/mm²) and 12 directions. An EC constrained registration was performed to obtain the expected shear, scale and translation parameters as a function of the diffusion gradient amplitude (Andersson and Skare, 2002). Thus, the shear, scale and translation parameters for a scan with a b-value of 6400 s/mm² could be estimated. The coefficients of the EC constrained registration were analyzed to assess the impact of distortions on the subject study.
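The pixel-wise SNR estimate defined in the SNR-measurements subsection can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `snr_map`, the (X, Y, Z, T) array layout and the use of the sample standard deviation (`ddof=1`) are choices made here, not details given in the text.

```python
import numpy as np

def snr_map(epi):
    """Voxel-wise SNR of an EPI time series with shape (X, Y, Z, T), T even.

    Consecutive frames are paired: their half-sum estimates the signal S,
    their half-difference estimates the noise N, and SNR = mean(S) / std(N).
    """
    epi = np.asarray(epi, dtype=float)
    if epi.shape[-1] % 2:
        raise ValueError("an even number of repetitions is required")
    odd = epi[..., 0::2]    # E(x, y, z, 2i-1), i.e. frames 1, 3, 5, ...
    even = epi[..., 1::2]   # E(x, y, z, 2i),   i.e. frames 2, 4, 6, ...
    signal = (even + odd) / 2.0
    noise = (even - odd) / 2.0
    mu = signal.mean(axis=-1)
    sigma = noise.std(axis=-1, ddof=1)  # sample SD: an assumed convention
    with np.errstate(divide="ignore", invalid="ignore"):
        # Voxels with zero noise estimate (e.g. background) are set to NaN.
        return np.where(sigma > 0, mu / sigma, np.nan)
```

With the 32 repetitions described above, each voxel yields 16 signal and 16 noise samples. Taking the difference of consecutive frames removes slow signal drift from the noise estimate, which is the usual motivation for this pairing.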


In-vivo study

A total of eight healthy volunteers (4 females and 4 males, age = 29 ± 5 years) were included in this study. None of the subjects had a history of neurological or psychiatric disorders. The ethical committee of every hospital involved in this work approved this study, and written informed consent was obtained from each participant.

Data acquisition

All data were acquired at the four scanners with a matched protocol, as follows. DSI data was acquired using a twice-refocused single-shot echo-planar imaging (EPI) product sequence and the following parameters: TR/TE = 8000/154 ms, voxel size = 2.2 × 2.2 × 3 mm, 44 slices, 256 directions covering half of the q-space with 4 shells and a maximum diffusion weighting of b = 6400 s/mm², and one image acquired without diffusion weighting (b = 0 s/mm²). The twice-refocused diffusion encoding applies a bipolar diffusion encoding scheme to minimize residual eddy-current sensitivity (Reese et al., 2003). For anatomical reference, 3D T1-weighted MPRAGE volumes (TR/TE = 2300/2.86 ms, FoV = 240 × 256 mm, voxel size = 1 × 1 × 1.2 mm) and 3D T2-weighted SPACE images (TR/TE = 3200/408 ms, FoV = 256 × 256 mm, voxel size = 1 × 1 × 1.2 mm) were acquired for each subject. The following measurements were performed in order to assess reproducibility:

(i) Intra-scanner reproducibility: all the subjects underwent an MRI scan twice on the same scanner (scanner A), but at different times.
(ii) Reproducibility with twin-scanner settings: all the subjects underwent an additional MRI session on scanner B, which is of the same model as scanner A, is equipped with the same coil hardware and was operated with identical reconstruction settings.
(iii) Reproducibility with other configurations: five out of the eight subjects were scanned on all four scanners at the three hospitals. Besides the twin scanners A and B, this also included scanners C and D.
Scanner C differs from the twin scanners in the employed reconstruction settings, as the signals from the different coil elements were combined using the so-called sum-of-square method (Roemer et al., 1990). All other scanners used the conventional adaptive-combine combination of coil signals. Scanner D was equipped with different RF-coil hardware, a 12-channel head matrix coil (all other scanners were equipped with a 32-channel head coil). All scanning sessions were carried out within a six-month interval, to minimize variability in the data due to biological changes. To minimize motion artifacts, we fixed the head of each subject with cushions and instructed the subjects to avoid any motion during the 30-minute scan. At scanner B, an additional scanning session was conducted with five subjects using the reconstruction settings of scanner C (sum-of-square).

Image processing

The intra- and inter-scanner variability of diffusion measurements was assessed using (i) a region-based evaluation of scalar maps and (ii) a tract-based evaluation of scalar maps. These two analysis techniques have also been employed in recent diffusion studies (Bonekamp et al., 2007; Lau and Goodyear, 2007; Vollmar et al., 2010). The orientation distribution function (ODF) was estimated using the Diffusion Toolkit (www.trackvis.org/dtk). The ODF is evaluated for a set of vectors which are the vertices of a regular polyhedron, the 362-vertex 6-fold geodesated icosahedron, with a mean nearest-neighbor separation of 0.16 rad (about 9°).
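As a quick consistency check on the ODF sampling figures quoted above, the vertex count and the angular spacing can be worked out directly. Reading "6-fold" as a subdivision frequency of ν = 6 is an assumption made here; it does reproduce the stated vertex count through the standard formula for a geodesic icosahedron.

```latex
V = 10\,\nu^{2} + 2 = 10 \cdot 6^{2} + 2 = 362,
\qquad
0.16\ \mathrm{rad} \times \frac{180^{\circ}}{\pi} \approx 9.2^{\circ}.
```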


The GFA was directly computed from the DSI raw data. Tuch (2004) defines the GFA as an analog for q-ball imaging of the FA in DTI. The GFA metric is defined as

$$\mathrm{GFA} = \frac{\mathrm{SD}(\mathrm{ODF})}{\mathrm{RMS}(\mathrm{ODF})}$$

where SD(ODF) is the standard deviation of the orientation distribution function ODF and RMS(ODF) is its root mean square. The computation of FA and ADC is based on diffusion weighted data with sufficient SNR. Since the DSI data contains images with weightings of up to b = 6400 s/mm², only a subset of the DSI data is considered for the calculation of the FA and ADC scalar maps. In fact, a DTI dataset was obtained by linearly interpolating points on a q-space shell at b = 1000 s/mm² along 64 directions distributed on the unit sphere.

(i) Region-based evaluation. The obtained FA, ADC, and GFA scalar maps were registered to the MNI space, where the corpus callosum splenium (CCS), cortico-spinal tract (CST) and uncinate fasciculus (UF) were manually delineated by an expert neurologist (S.V., 10 years of expertise) as depicted in Fig. 1. Registration was carried out using the nonlinear registration tool of FSL (www.fmrib.ox.ac.uk/fsl/). The mean value of each scalar map was computed inside every region-of-interest (ROI).

(ii) Tract-based evaluation. Fiber-tracking was performed using an in-house streamline-based algorithm adapted to work with DSI data. 30 seed points were created inside each voxel in the white matter. Then, from the MPRAGE image of each subject, 70 cortical and 8 sub-cortical regions with anatomical landmarks were mapped using Freesurfer 5.0 (surfer.nmr.mgh.harvard.edu) and subsequently non-linearly registered to the diffusion data. Six bundles-of-interest were isolated by selecting pairs of regions of interest known to be connected through these tracts, as shown in Fig. 2. These tracts were identified by an expert neurologist. The mean value of each scalar map was computed along every single fiber of a tract of interest and then averaged.
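The scalar measures used in this section can be illustrated with a short sketch. The GFA function follows the SD/RMS definition given above. The FA and ADC functions use the standard eigenvalue formulas, which the text does not spell out, and ADC is taken here as the mean of the three tensor eigenvalues; both of these, the function names and the use of the sample standard deviation (`ddof=1`) are assumptions of this example rather than a description of the Diffusion Toolkit implementation.

```python
import numpy as np

def gfa(odf):
    """Generalized fractional anisotropy: SD(ODF) / RMS(ODF)."""
    odf = np.asarray(odf, dtype=float)
    rms = np.sqrt(np.mean(odf ** 2))
    if rms == 0:
        return 0.0
    # ddof=1 (sample SD) is an assumed convention; it gives GFA = 1
    # for an ODF with a single non-zero sample.
    return float(np.std(odf, ddof=1) / rms)

def adc_fa(eigenvalues):
    """ADC (mean of the eigenvalues) and FA from three tensor eigenvalues."""
    lam = np.asarray(eigenvalues, dtype=float)
    adc = lam.mean()
    norm = np.sqrt(np.sum(lam ** 2))
    if norm == 0:
        return float(adc), 0.0
    fa = np.sqrt(1.5) * np.sqrt(np.sum((lam - adc) ** 2)) / norm
    return float(adc), float(fa)
```

Both anisotropy indices are 0 for a perfectly isotropic profile and approach 1 when diffusion is confined to one direction, which is why the two measures are expected to follow similar trends.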
The variability of the region-based and the tract-based analysis was assessed by means of the coefficient of variation (CV) of the measurements, which is the most commonly used relative measure of reproducibility reported in the literature (Bonekamp et al., 2007; Ciccarelli et al., 2003; Heiervang et al., 2006). The CV is defined as the standard deviation σ of the measurements divided by the corresponding mean μ,

$$\mathrm{CV} = \frac{\sigma}{\mu}.$$

We determined the CV for each subject between two scanners and thereafter averaged these per-subject CVs; the reported value is this mean CV across subjects. To further investigate any possible correlation between the scanners we also computed the intra-class correlation coefficient (ICC) (Shrout and Fleiss, 1979). The ICC considers both the within-subject variance arising from measurement error and the variance from the difference between subjects, according to
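The per-subject CV and its average across subjects can be sketched as follows. The function name `mean_cv`, the percentage scaling and the use of the sample standard deviation (`ddof=1`) for the two measurements of each subject are assumptions of this example; the text does not state which estimator was used.

```python
import numpy as np

def mean_cv(scan_1, scan_2):
    """Per-subject CV (in %) between two scans, with mean and SD across subjects.

    scan_1, scan_2: one value per subject (e.g. mean FA in a ROI),
    listed in the same subject order.
    """
    pairs = np.stack([np.asarray(scan_1, dtype=float),
                      np.asarray(scan_2, dtype=float)], axis=1)
    sigma = pairs.std(axis=1, ddof=1)   # SD of the two measurements per subject
    mu = pairs.mean(axis=1)             # mean of the two measurements per subject
    cv = 100.0 * sigma / mu
    return cv.mean(), cv.std(ddof=1), cv
```

For two measurements a and b, the sample standard deviation reduces to |a − b| / √2, so a subject measured at FA = 0.49 and 0.51 contributes a CV of about 2.8%.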

$$\mathrm{ICC}_{3,1} = \frac{\mathrm{BMS} - \mathrm{EMS}}{\mathrm{BMS} + (k-1)\,\mathrm{EMS}},$$

where BMS is the between subject mean square (between subject variation), EMS is the error mean square and k is the number of scanners.
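The ICC(3,1) formula above can be computed from a two-way decomposition of the sums of squares. The sketch below assumes the data are arranged as one row per subject and one column per scanner; the function name `icc_3_1` and this layout are choices made for the example.

```python
import numpy as np

def icc_3_1(data):
    """ICC(3,1) for an array of shape (n_subjects, k_scanners)."""
    x = np.asarray(data, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_total = np.sum((x - grand) ** 2)
    ss_subjects = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_scanners = n * np.sum((x.mean(axis=0) - grand) ** 2)
    # Between-subject mean square.
    bms = ss_subjects / (n - 1)
    # Error mean square: residual after removing subject and scanner effects.
    ems = (ss_total - ss_subjects - ss_scanners) / ((n - 1) * (k - 1))
    return (bms - ems) / (bms + (k - 1) * ems)
```

Because the scanner effect is removed before the error term is formed, a constant offset between two scanners does not lower ICC(3,1); it measures consistency rather than absolute agreement. In the pairwise comparisons used here, k = 2.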


Fig. 1. The three manually drawn regions-of-interest considered for the region-based analysis. From left to right: corpus callosum splenium (CCS), cortico-spinal tract (CST) and uncinate fasciculus (UF).

Whether we calculate CV or ICC, we always consider only the analysis between a pair of scanners.

Results

Phantom

As demonstrated in the phantom measurements in Fig. 3a), scanners A, B, and C (all equipped with a 32-channel head array) perform very similarly. Scanner D (equipped with a 12-channel head matrix coil) shows very similar SNR in the center of the coil but lower SNR in the proximity of the coil elements. The eddy-current measurements on the phantom showed that, even though the acquisition protocol was designed to be particularly sensitive to eddy currents, eddy currents were negligible across all scanners. As described in Materials and methods, we applied an EC model, similar to Andersson and Skare (2002), to estimate the expected translation, scaling and shear for the worst-case scenario, b-value = 6400 s/mm². This showed that the errors due to scaling, translation and shear were similar at all sites. The ranges of pixel shift for translation, scale and shear across the sites are 0.0424–0.42, 0.015–0.3 and 0.048–0.48, respectively.
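The worst-case estimate at b = 6400 s/mm² relies on extrapolating distortion parameters measured at low b-values. A minimal sketch of such an extrapolation is given below. It assumes that each distortion parameter scales linearly with the diffusion gradient amplitude and that, for fixed gradient timing, the amplitude is proportional to the square root of the b-value. The linear model, the zero intercept and the function name are assumptions of this example and not a description of the registration procedure actually used.

```python
import numpy as np

def extrapolate_distortion(b_values, shifts, b_target):
    """Extrapolate a distortion parameter (e.g. pixel shift) to a higher b-value.

    Fits shift = slope * sqrt(b) through the origin by least squares and
    evaluates the fit at b_target.
    """
    g = np.sqrt(np.asarray(b_values, dtype=float))  # relative gradient amplitude
    s = np.asarray(shifts, dtype=float)
    slope = np.sum(g * s) / np.sum(g ** 2)          # zero-intercept least squares
    return float(slope * np.sqrt(b_target))
```

Under these assumptions, going from b = 1000 to b = 6400 s/mm² scales a distortion by √6.4, roughly a factor of 2.5.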

Scanner A and twin-scanners

The first column in Fig. 4 reports a summary of the CV of the mean GFA, FA and ADC inside the ROIs and along the considered tracts. Some interesting considerations arise from these comparisons. First, the averaged CV for all scalar maps is in a range of 1%–4.2% for the region-based analysis and below 3% for all computed bundles of the tract-based approach except for the UF. The UF showed the lowest reproducibility, especially for FA and GFA values (CV ~ 3.7–6%). Second, a paired t-test showed no significant differences for both the intra- and inter-scanner comparisons (intra: p > 0.17, inter: p > 0.06). Third, in general the ADC shows the best reproducibility as compared to FA and GFA, while FA and GFA show very similar reproducibility. The second column of Fig. 4 compares the variability of the scalar measures obtained at the two twin-scanners by means of ICC. Values of ICC above 0.7 are considered as measures of high reproducibility (Marenco et al., 2006). From the figure, it appears that both FA and GFA have an ICC above this threshold in the majority of the investigated regions and tracts. No significant difference can be observed among them. This is however not the case for ADC, where some low ICC values down to 0.3 can be observed for smaller regions, bundles and interhemispheric bundles. Finally, the last observation is that intra-scanner acquisitions showed, on average, slightly better ICC (0.71 (ADC), 0.83 (FA) and 0.86 (GFA)) than the inter-scanner acquisitions (0.68 (ADC), 0.83 (FA) and 0.85 (GFA)), although this difference was not significant (ADC: p > 0.71, FA: p > 0.95, GFA: p > 0.76).

Fig. 2. The six manually delineated bundles-of-interest considered for the tract-based analysis. From left to right, top to bottom: uncinate fasciculus (UF), corpus callosum body (CCB), corpus callosum splenium (CCS), thalamo-cortical fibers (TCF), fronto-occipital fasciculus (FOF) and arcuate fasciculus (AF).

Fig. 3. a) The log(SNR) of the four scanners along a diameter. b) SNR map of scanner A.

Different scanners

Fig. 5 summarizes the comparison between scanner A vs itself, A vs C and A vs D concerning the reproducibility of the three considered scalar maps (i.e. ADC, FA and GFA) for the region-based and tract-based analyses. As in the previous section, the CV of the mean GFA, FA and ADC inside the ROIs and along the considered tracts is reported. The worst CV is for the A vs D, 9%–16% for the GFA, 4%–8.9% for the FA and 4.2%–10% for the ADC. Here again, the CV for the UF tends to have a consistently higher standard deviation for all three scalar maps, reflecting the relative size of this bundle compared to the rest. A paired t-test showed no significant difference for the A vs D comparison when looking at the GFA and FA maps, but this did not hold for the ADC map, where five of the ROIs and tracts showed p < 0.02. The CV for A vs C showed more significantly different regions and tracts, especially in the case of GFA, where all ROIs and tracts but the UF were significantly different at p < 0.036.

Fig. 6 highlights the relation of the FA vs. the ADC scalar maps for all the four scanners. Mean ADC and FA values are reported in the UF for the region-based analysis and along the CCB bundle for the tract-based analysis. Similar trends (not shown here) hold for all combinations of scalar measure (FA, ADC and GFA) and region/bundle considered. It can be noted that there is a very high similarity of the outcomes from scanners A and B, which are indeed identical imagers. However, FA and ADC measures from scanner D (12-channel coil) are slightly lower, and an even more pronounced shift can be observed for scanner C (sum-of-square coil combination). Each color represents a certain scanner and each symbol a specific subject. As mentioned in the Materials and methods section, five additional subjects were scanned on scanner B with the same reconstruction settings as scanner C (sum-of-square) to confirm the hypothesis that the shift of scanner C in Fig. 6 is indeed due to the different reconstruction method.

Fig. 4. ICC and CV of GFA (1st row), FA (2nd row) and ADC (3rd row) of the inter(green)- and intra(red)-site comparison of scanners A and B, CV is represented as mean ± standard deviation. ROIs: corpus callosum splenium (CCS), cortico-spinal tract (CST) and uncinate fasciculus (UF). Tracts: uncinate fasciculus (UF), corpus callosum body (CCB), corpus callosum splenium (CCS), thalamo-cortical fibers (TCF), fronto-occipital fasciculus (FOF) and arcuate fasciculus (AF).

Fig. 5. CV of the inter-scan comparison of scanners C and D vs A.

Discussion

The intra- and inter-scanner variability of the DSI measurements can be affected by a number of factors, including differences in hardware, protocol settings, the signal-to-noise level, eddy current distortions, field inhomogeneity, subject positioning, head and physiological motions, and differences in the reconstruction and post-processing strategies. Some of these contributors can be assessed with phantom scans; here we investigated the SNR and the distortions induced by eddy currents. The observation of the highest SNR with the 32-channel coil in the proximity of the coil elements, which at the same time exceeds the SNR obtained with the 12-channel coil in this area, is expected due to the coil design and is in very good agreement with recent publications (Wiggins et al., 2006). SNR properties are very similar for scanners A, B, and C. However, the 12-channel head array used in scanner D provides a more than two-fold reduction of SNR at around 8 cm from the coil center, which at first glance corresponds to the cortex location in human in-vivo measurements. As SNR differences may introduce variations in the FA, ADC and GFA computation (Burdette et al., 2001; Farrell et al., 2007), in particular in this investigation, which relies on an interpolation of a q-space shell with b = 1000 s/mm², it appears advantageous for multi-centric studies to ensure similar SNR across the platforms.

Fig. 6 shows a systematic shift in the FA and ADC measures obtained for scanners C and D. The 12-channel coil array used in scanner D provides a lower SNR away from the coil center, which most likely explains the observation of shifted FA and ADC values compared to scanners A and B. The shift in scanner C originates from a different reconstruction setting used for the combination of the signals from different coils. Whereas scanners A, B, and D used an adaptive-combine combination that generally leads to a lower noise contamination (in other words higher SNR), scanner C used the so-called sum-of-square coil combination method (Roemer et al., 1990). This interpretation of the results was confirmed by an additional acquisition on scanner B for a subset of the subjects that used the sum-of-square method for the reconstruction, as explained in the Results section.

Fig. 6. One of the ROIs and tracts that were analyzed. Each color represents a certain scanner and each symbol a specific subject.

It has been shown in Walsh et al. (2000) that, compared to the sum-of-square technique, adaptive combination reduces the RMS noise level in dark image regions by as much as a factor of √N, where N is the number of coils in the array. The sum-of-square method also tends to inadvertently enhance motion and flow artifacts in dark pixels. It is therefore recommended to use adaptive combine when the signal level is comparable to the noise level (cholangiography, DTI/DWI, etc.). In Benner et al. (2006), tractography yielded better results when using adaptive combine compared to the sum-of-square reconstruction. In summary, SNR can be modulated by the field strength, RF-coil hardware, and reconstruction algorithms including coil combination, but also by other parameters such as the parallel imaging implementation. The results of this study indicate that the computed scalar maps are clearly sensitive to the underlying SNR. Thus, it is recommended to closely monitor all hardware and protocol settings of the imaging platforms in a multi-center setting. Similarly, eddy current behavior and field inhomogeneities after shim should be monitored.
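The noise behavior of the sum-of-square combination discussed above can be illustrated with a small simulation. The sketch combines coil images by the root sum of squares and estimates the resulting noise floor in a signal-free region. The i.i.d. complex Gaussian noise model, the function names and the default parameters are assumptions made here; the adaptive-combine reconstruction itself is not implemented.

```python
import numpy as np

def sum_of_squares(coil_images):
    """Root-sum-of-squares combination over the first (coil) axis."""
    coil_images = np.asarray(coil_images)
    return np.sqrt(np.sum(np.abs(coil_images) ** 2, axis=0))

def noise_floor(n_coils, n_pixels=20000, sigma=1.0, seed=0):
    """Mean sum-of-square magnitude in a signal-free region.

    Assumes independent complex Gaussian noise with the same standard
    deviation sigma in every coil element.
    """
    rng = np.random.default_rng(seed)
    shape = (n_coils, n_pixels)
    noise = rng.normal(0.0, sigma, shape) + 1j * rng.normal(0.0, sigma, shape)
    return float(sum_of_squares(noise).mean())
```

Comparing `noise_floor(12)` with `noise_floor(32)` shows the background level rising roughly with the square root of the number of coil elements. This is one reason why a magnitude-only combination can bias heavily diffusion-weighted images, where the signal approaches the noise level.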
Our results indicated that at large b-values the distortion is expected to be on the order of 0.5 pixels; however, the different eddy current characteristics did not introduce a systematic variability across these scanners. In a multi-centric setting, any potential systematic bias that degrades the reliability of the diffusion parameters always needs to be considered. If one scanner has significantly different, and large, eddy current characteristics, strategies to correct the distortions need to be implemented. Ideally, if time is not a constraint, the residual eddy currents should be measured and corrected using a field map (Truong et al., 2008); alternatively, image registration procedures that use information from low b-value scans to predict the correction of higher b-values, similar to Andersson and Skare (2002), have shown merit. In addition, measuring the field inhomogeneities provides an attractive way to retrospectively correct for susceptibility-induced image distortions, which may improve subsequent processing steps. In this study we can notice that GFA and FA carry coherent information regarding the local diffusion anisotropy, as their values computed from different regions/tracts on different imagers follow comparable trends. This has been confirmed in clinical studies as well (Chiu et al., 2011; Liu et al., 2010; Lo et al., 2011; Tang et al., 2010). Another factor that has been demonstrated to affect the reproducibility of DWI results is cardiac pulsation (Pierpaoli et al., 2003); cardiac gating may provide better intra- and inter-session reproducibility in some brain regions (Vollmar et al., 2010). Unfortunately, gating has not been widely accepted in clinical studies because of the extended scanning time and occasional instability of gating depending on patient conditions such as arrhythmia. In this study, we decided not to use cardiac gating, in order to represent more common imaging protocols.
We report for the first time the reproducibility of GFA, FA and ADC measures derived from DSI at 3 T on healthy controls, using four scanners at three different sites. The acquisition protocols were identical for scanners A and B, but differed slightly for scanner C (different coil combination) and scanner D (different coil hardware). The reproducibility of the FA, ADC, and GFA measures was assessed by using a region-based and a tract-based approach.


Our results indicate that ADC values generally yield a lower CV and a high ICC (Fig. 4) in the intra-hemispheric connections, in particular the TCF, FOF and AF. One reason why the ICC is lower in the inter-hemispheric connections could be the partial-volume effect. At the spatial resolution commonly used for a slice (here 3 mm), fiber trajectories passing close to CSF regions, as is the case for the CC, are very sensitive to partial volume. The dynamic range of the ADC between white matter voxels and CSF voxels is much larger than that of FA/GFA. Thus, in regions with CSF partial volume, the ADC variability is affected more than the FA/GFA variability.

GFA, however, shows the best reproducibility in the region-based analysis compared with FA and ADC (Fig. 4). This could be because it is the true scalar map of the DSI and has therefore not been affected by interpolation, as FA and ADC might have been. GFA and FA nevertheless follow more or less the same pattern, which is not surprising since both quantify anisotropy.

The highest variability of GFA, FA and ADC was found in the UF, both in the region-based analysis, as in Vollmar et al. (2010), and in the tract-based analysis. This is expected, since this bundle is located in the frontal lobe, where susceptibility-induced distortions and partial volume effects can lead to imperfect co-registration, segmentation, etc.

When we refer to how reproducible particular scalar maps or ROIs/tracts are, we use the CV and ICC in Fig. 4. These values are based on all 8 subjects, which is statistically more reliable than using only the 5 subjects available for scanners C and D. The ICC is very sensitive and is therefore mainly used to assess the variability between two similar systems rather than to compare two systems that are significantly different. Both scanners C and D (C more than D) differed significantly from scanner A, as Fig. 6 also indicates; we could therefore only expect a high CV and a very low ICC for these scanners, in particular C. The CV (the average CV across subjects) and the ICC are in line with those reported by Vollmar et al. (2010) for the region-based analysis when looking at the GFA map, since this is the original map from the DSI and is not affected by interpolation as the FA and ADC are.

Conclusions
In this study we investigated the intra- and inter-site variability of DSI measurements on four different 3 T scanners, using the diffusion MRI scalar measures FA, ADC and GFA. We focused on DSI because it is a technique capable of providing the most complete information about the underlying diffusion process and allows disentangling more complex fiber architectures such as kissing and crossing fibers. To our knowledge, the reproducibility of DSI has not been studied so far. We have shown that reproducibility as good as that of standard clinical DTI acquisitions can be achieved, provided that some factors are taken into consideration. The most critical factor affecting the variability of the aforementioned outcomes in our study turned out to be the achieved SNR, which is modulated by the number of channels in the receive coil and by the strategy used to combine the signals from the separate coil elements. If the SNR is matched across acquisitions, however, properly designed protocols make it possible to pool DSI data from different scanners. The results suggest that DSI can be reliably used in clinical research involving different imaging centers, thereby increasing the power of clinical studies and the sensitivity of diffusion MRI to brain pathologies and neurological disorders.
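As an illustration of the two reproducibility measures discussed above, the following minimal Python sketch computes a subject-averaged CV and a one-way random-effects ICC in the sense of Shrout and Fleiss (1979). The choice of ICC form, the function names and the FA values are assumptions made for illustration only; they are not taken from the processing pipeline of this study.

```python
import numpy as np


def mean_cv_percent(x):
    """Average coefficient of variation across subjects, in percent.

    x has shape (n_subjects, n_scans). For each subject the CV is the
    sample standard deviation over repeated scans divided by the mean;
    the per-subject values are then averaged.
    """
    x = np.asarray(x, dtype=float)
    per_subject = x.std(axis=1, ddof=1) / x.mean(axis=1)
    return 100.0 * per_subject.mean()


def icc_oneway(x):
    """One-way random-effects ICC(1,1) of Shrout and Fleiss (1979).

    Assumed form: (BMS - WMS) / (BMS + (k - 1) * WMS), where BMS and WMS
    are the between-subject and within-subject mean squares.
    """
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    subject_means = x.mean(axis=1)
    bms = k * ((subject_means - grand_mean) ** 2).sum() / (n - 1)
    wms = ((x - subject_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (bms - wms) / (bms + (k - 1) * wms)


# Hypothetical mean FA values in one ROI: 4 subjects, 2 scans each.
fa = [
    [0.52, 0.53],
    [0.47, 0.46],
    [0.58, 0.57],
    [0.50, 0.52],
]
print("mean CV (%):", mean_cv_percent(fa))
print("ICC(1,1):", icc_oneway(fa))
```

A low CV combined with a high ICC indicates that the scan-to-scan variation is small compared with the differences between subjects, which is the situation described for the intra-scanner and twin-scanner experiments.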
Acknowledgments
This research was supported by the Swiss National Science Foundation, SPUM (grant nos 33CM30-124089 and 320030-141165) and by the Centre d'Imagerie BioMédicale (CIBM) of the University of Lausanne (UNIL), the Swiss Federal Institute of Technology Lausanne (EPFL),
the University of Geneva (UniGe), the Centre Hospitalier Universitaire Vaudois (CHUV), the Hôpitaux Universitaires de Genève (HUG), the Leenaards Foundation and the Jeantet Foundation. The authors would like to thank the MRI technician at CHUV, Nicolas Chevrey, for his valuable help especially at the beginning of this project.

References
Andersson, J., Skare, S., 2002. A model-based method for retrospective correction of geometric distortions in diffusion-weighted EPI. Neuroimage 16, 177–199.
Basser, P., Pajevic, S., Pierpaoli, C., Duda, J., Aldroubi, A., 2000. In vivo fiber tractography using DT-MRI data. Magn. Reson. Med. 44, 625–632.
Benner, T., Jellus, V., Wang, R., Wedeen, V.J., van der Kouwe, A.J., Wiggins, G.C., Sorensen, A.G., 2006. Adaptive Combination of Phased Array Signals for Diffusion Imaging. International Society for Magnetic Resonance in Medicine (ISMRM).
Bonekamp, D., Nagae, L.M., Degaonkar, M., Matson, M., Abdalla, W.M.A., Barker, P.B., Mori, S., Horska, A., 2007. Diffusion tensor imaging in children and adolescents: reproducibility, hemispheric, and age-related differences. Neuroimage 34, 733–742.
Burdette, J.H., Durden, D.D., Elster, A.D., Yen, Y.F., 2001. High b-value diffusion-weighted MRI of normal brain. J. Comput. Assist. Tomogr. 25, 515–519.
Chiu, C.H., Lo, Y.C., Tang, H.S., Liu, I.C., Chiang, W.Y., Yeh, F.C., Jaw, F.S., Tseng, W.Y.I., 2011. White matter abnormalities of fronto-striato-thalamic circuitry in obsessive–compulsive disorder: a study using diffusion spectrum imaging tractography. Psychiatry Res. Neuroimaging 192, 176–182.
Ciccarelli, O., Parker, G.J.M., Toosy, A.T., Wheeler-Kingshott, C.A.M., Barker, G.J., Boulby, P.A., Miller, D.H., Thompson, A.J., 2003. From diffusion tractography to quantitative white matter tract measures: a reproducibility study. Neuroimage 18, 348–359.
Concha, L., Livy, D.J., Beaulieu, C., Wheatley, B.M., Gross, D.W., 2010. In vivo diffusion tensor imaging and histopathology of the fimbria–fornix in temporal lobe epilepsy. J. Neurosci. 30, 996–1002.
Farrell, J., Landman, B.A., Jones, C.K., Smith, S.A., Prince, J.L., Van Zijl, P.C., Mori, S., 2007. Effects of signal-to-noise ratio on the accuracy and reproducibility of diffusion tensor imaging derived fractional anisotropy, mean diffusivity and principal eigenvector measurements at 1.5 T. J. Magn. Reson. Imaging 26, 756–767.
Focke, N.K., Yogarajah, M., Bonelli, S.B., Bartlett, P.A., Symms, M.R., Duncan, J.S., 2008. Voxel-based diffusion tensor imaging in patients with mesial temporal lobe epilepsy and hippocampal sclerosis. Neuroimage 40, 728–737.
Fox, R., Sakaie, K., Lee, J.C., Debbins, J., Liu, Y., Arnold, D., Melhem, E., Smith, C., Philips, M., Lowe, M., Fisher, E., 2011. A validation study of multicenter diffusion tensor imaging: reliability of fractional anisotropy and diffusivity values. Am. J. Neuroradiol. 33 (4), 695–700.
Ge, Y., Law, M., Grossman, R.I., 2005. Applications of diffusion tensor MR imaging in multiple sclerosis. Ann. N. Y. Acad. Sci. 1064, 202–219.
Gigandet, X., Hagmann, P., Kurant, M., Cammoun, L., Meuli, R., Thiran, J.P., 2008. Estimating the confidence level of white matter connections obtained with MRI tractography. PLoS One 3, 4006.
Glover, G.H., Lai, S., 1998. Self-navigated spiral fMRI: interleaved versus single shot. Magn. Reson. Med. 39, 361–368.
Goldberg-Zimring, D., Mewes, A.U.J., Maddah, M., Warfield, S.K., 2005. Diffusion tensor magnetic resonance imaging in multiple sclerosis. J. Neuroimaging 15, 68–81.
Granziera, C., Schmahmann, J.D., Hadjikhani, N., Meyer, H., Meuli, R., Wedeen, V., Krueger, G., 2009. Diffusion spectrum imaging shows the structural basis of functional cerebellar circuits in the human cerebellum in vivo. PLoS One 4, e5101.
Granziera, C., Daducci, A., Gigandet, X., Cammoun, L., Meskaldji, D., Michel, P., Maeder, P., Sorensen, A., Thiran, J.P., Meuli, R., Krüger, G., 2011. Diffusion spectrum imaging after stroke shows structural changes in the contra-lateral motor network correlating with functional recovery. 19th International Society for Magnetic Resonance in Medicine (ISMRM) conference, Montreal, Canada.
Hagmann, P., Cammoun, L., Gigandet, X., Meuli, R., Honey, C., Wedeen, V., Sporns, O., 2008. Mapping the structural core of human cerebral cortex. PLoS Biol. 6, 159.
Heiervang, E., Behrens, T., Mackay, C., Robson, M., Johansen-Berg, H., 2006. Between session reproducibility and between subject variability of diffusion MR and tractography measures. Neuroimage 33, 867–877.
Honey, C.J., Sporns, O., Cammoun, L., Gigandet, X., Thiran, J.P., Meuli, R., Hagmann, P., 2009. Predicting human resting-state functional connectivity from structural connectivity. Proc. Natl. Acad. Sci. 106, 2035–2040.
Kanaan, R., Shergill, S., Barker, G., Catani, M., Ng, V., Howard, R., McGuire, P., Jones, D., 2006. Tract-specific anisotropy measurements in diffusion tensor imaging. Psychiatry Res. Neuroimaging 146, 73–82.
Knock, S., McIntosh, A., Sporns, O., Kotter, R., Hagmann, P., Jirsa, V., 2009. The effects of physiologically plausible connectivity structure on local and global dynamics in large scale brain models. J. Neurosci. Methods 183, 86–94.
Lau, R.W.M., Goodyear, B.G., 2007. Minimum detectable change in water diffusion using 3-T magnetic resonance imaging. Neuroimage 36, 491–496.
Le Bihan, D., Breton, E., 1985. Imagerie de diffusion in-vivo par resonance magnetique nucleaire. C. R. Acad. Sci. (Paris) 301, 1109–1112.
Liu, I.C., Chiu, C.H., Chen, C.J., Kuo, L.W., Lo, Y.C., Tseng, W.Y.I., 2010. The microstructural integrity of the corpus callosum and associated impulsivity in alcohol dependence: a tractography-based segmentation study using diffusion spectrum imaging. Psychiatry Res. Neuroimaging 184, 128–134.
Lo, Y.C., Soong, W.T., Gau, S.S.F., Wu, Y.Y., Lai, M.C., Yeh, F.C., Chiang, W.Y., Kuo, L.W., Jaw, F.S., Tseng, W.Y.I., 2011. The loss of asymmetry and reduced interhemispheric connectivity in adolescents with autism: a study using diffusion spectrum imaging tractography. Psychiatry Res. Neuroimaging 192, 60–66.
Marenco, S., Rawlings, R., Rohde, G., Barnett, A., Honea, R., Pierpaoli, C., Weinberger, D., 2006. Regional distribution of measurement error in diffusion tensor imaging. Psychiatry Res. Neuroimaging 147, 69–78.
Pannek, K., Mathias, J., Bigler, E., Brown, G., Taylor, J., Rose, S., 2011. The average pathlength map: a diffusion MRI tractography-derived index for studying brain pathology. Neuroimage 55, 133–141.
Pfefferbaum, A., Sullivan, E.V., Hedehus, M., Lim, K.O., Adalsteinsson, E., Moseley, M., 2000. Age-related decline in brain white matter anisotropy measured with spatially corrected echo-planar diffusion tensor imaging. Magn. Reson. Med. 44, 259–268.
Pierpaoli, C., Marenco, S., Rohde, G., Jones, D.K., Barnett, A., 2003. Analyzing the contribution of cardiac pulsation to the variability of quantities derived from the diffusion tensor. ISMRM 11th Annual Meeting, Toronto, Ontario, Canada.
Reese, T., Heid, O., Weisskoff, R., Wedeen, V., 2003. Reduction of eddy-current-induced distortion in diffusion MRI using a twice-refocused spin echo. Magn. Reson. Med. 49, 177–182.
Roberts, T.P.L., Rowley, H.A., 2003. Diffusion weighted magnetic resonance imaging in stroke. Eur. J. Radiol. 45, 185–194.
Roemer, P.B., Edelstein, W.A., Hayes, C.E., Souza, S.P., Mueller, O.M., 1990. The NMR phased array. Magn. Reson. Med. 16, 192–225.
Schoene-Bake, J.C., Faber, J., Trautner, P., Kaaden, S., Tittgemeyer, M., Elger, C.E., Weber, B., 2009. Widespread affections of large fiber tracts in postoperative temporal lobe epilepsy. Neuroimage 46, 569–576.
Shrout, P.E., Fleiss, J.L., 1979. Intraclass correlations: uses in assessing rater reliability. Psychol. Bull. 86, 420–428.
Sun, Z., Wang, F., Cui, L., Breeze, J., Du, X., Wang, X., Cong, Z., Zhang, H., Li, B., Hong, N., et al., 2003. Abnormal anterior cingulum in patients with schizophrenia: a diffusion tensor imaging study. Neuroreport 14, 1833–1836.
Tang, P.F., Ko, Y.H., Luo, Z.A., Yeh, F.C., Chen, S.H., Tseng, W.Y., 2010. Tract-specific and region of interest analysis of corticospinal tract integrity in subcortical ischemic stroke: reliability and correlation with motor function of affected lower extremity. AJNR Am. J. Neuroradiol. 31, 1023–1030.
Teipel, S.J., Reuter, S., Stieltjes, B., Acosta-Cabronero, J., Ernemann, U., Fellgiebel, A., Filippi, M., Frisoni, G., Hentschel, F., Jessen, F., Kloppel, S., Meindl, T., Pouwels, P.J., Hauenstein, K.H., Hampel, H., 2011. Multicenter stability of diffusion tensor imaging measures: a European clinical and physical phantom study. Psychiatry Res. Neuroimaging 194, 363–371.
Truong, T.K., Chen, B., Song, A.W., 2008. Integrated sense DTI with correction of susceptibility- and eddy current-induced geometric distortions. Neuroimage 40, 53–58.
Tuch, D.S., 2004. Q-ball imaging. Magn. Reson. Med. 52, 1358–1372.
Vollmar, C., O'Muircheartaigh, J., Barker, G.J., Symms, M.R., Thompson, P., Kumari, V., Duncan, J.S., Richardson, M.P., Koepp, M.J., 2010. Identical, but not the same: intra-site and inter-site reproducibility of fractional anisotropy measures on two 3.0 T scanners. Neuroimage 51, 1384–1394.
Walsh, D.O., Gmitro, A.F., Marcellin, M.W., 2000. Adaptive reconstruction of phased array MR imagery. Magn. Reson. Med. 43, 682–690.
Wedeen, V.J., Hagmann, P., Tseng, W.Y.I., Reese, T.G., Weisskoff, R.M., 2005. Mapping complex tissue architecture with diffusion spectrum magnetic resonance imaging. Magn. Reson. Med. 54, 1377–1386.
Wiggins, G., Triantafyllou, C., Potthast, A., Reykowski, A., Nittka, M., Wald, L., 2006. 32-channel 3 tesla receive-only phased-array head coil with soccer-ball element geometry. Magn. Reson. Med. 56, 216–223.
Zhu, T., Hu, R., Qiu, X., Taylor, M., Tso, Y., Yiannoutsos, C., Navia, B., Mori, S., Ekholm, S., Schifitto, G., Zhong, J., 2011. Quantification of accuracy and precision of multicenter DTI measurements: a diffusion phantom and human brain study. Neuroimage 56, 1398–1411.
