Graphical Display of fMRI Data. Visualizing Multidimensional Space.

R. Baumgartner, R. Somorjai+

Institute for Biodiagnostics, National Research Council Canada, 435 Ellice Ave., Winnipeg Manitoba, Canada, R3B 1Y6

Running title: Graphical Display of fMRI Data

Submitted to Magnetic Resonance Imaging as a Technical Note

+ Corresponding author: R. Somorjai, Institute for Biodiagnostics, National Research Council Canada, 435 Ellice Avenue, Winnipeg, Manitoba, R3B 1Y6, Canada. Tel: 204-984-4538. Fax: 204-984-5472. Email: [email protected]

Abstract

Visualization of multidimensional data is an integral part of computational statistics and exploratory data analysis (EDA). We show how visualization of fMRI time-courses (TCs) may be used to reveal the structure of fMRI data. We consider the TCs as points in multidimensional space. In simulated and in vivo data, we show that minimum spanning tree (MST)-based sequencing of multivariate time-courses, in combination with a homogeneity map visualization, allows for an effective and useful graphical display of the groups of coactivated time-courses obtained by temporal clustering. This display may serve as a tool for the investigation of brain connectivity. We also suggest a simple overall display of the entire fMRI data set.

Keywords: functional MR imaging, graphical display, exploratory data analysis, minimum spanning tree, homogeneity map

Introduction

In computational statistics, the structure of multidimensional data may be explored by data-driven approaches such as clustering (hard, fuzzy, Kohonen maps) and projection-pursuit based methods (principal component analysis, PCA; independent component analysis, ICA; or multidimensional scaling, MDS). The application of data-driven methods in fMRI data analysis has lagged behind the traditional hypothesis-led inferential methods (e.g. Ref. 1); recently, however, EDA approaches have been gaining popularity (see e.g. Refs. 2-12). After structure exploration, a parsimonious and compact graphical summary of the results is desirable (13-17). In fact, as was suggested by J. Tukey in 1977 (18), graphical display of multidimensional data must be an integral part of exploratory data analysis (EDA).

In fMRI in particular, coactivation of brain regions is of interest when these regions show similar temporal behavior (19). Therefore, we focus here on the results of temporal cluster analysis, which naturally yields coactivated brain regions. In fMRI, the results are customarily summarized as color-coded values of the correlation coefficient (cc), or the corresponding p-values, superimposed on the anatomical image. This display in effect reduces each multidimensional time-course (TC) to a single feature (the cc or the p-value). Here we suggest visualizing an entire (single) cluster of TCs, with the TCs in their original, experimentally acquired form, i.e. as a temporal sequence of intensity values. The idea is to show all the features contained in the cluster of interest. We show how sequencing (ranking) of the fMRI TCs, combined with a particular homogeneity map display, may be used to represent visually the coherence of a cluster of TCs observed in fMRI. We also suggest a visually revealing overall display of the entire fMRI data set.

Materials and Methods

Simulated data: To demonstrate the efficiency of visually displaying multidimensional TCs, two types of TCs were generated: “activation” TCs (# of TCs = 10) and “noise” TCs (# of TCs = 10). The “hemodynamic” response was simulated by a two-parameter gamma function (20), with two “on” phases of 30 instances each. “Noise” TCs were simulated by i.i.d. Gaussian noise. For the “activation” case, noise was added to the simulated waveforms to give a contrast-to-noise ratio of 5.0 (CNR = ∆S/σnoise, where ∆S is the signal enhancement and σnoise is the noise standard deviation) (20).

In vivo data: An “activation” cluster was obtained from an in vivo single-trial fMRI study with a mental rotation paradigm (21). The data set was acquired with an EPI sequence on a 4 T whole-body imaging system (matrix size = 64x64x64 with 88 time instances). The “artifact” cluster was extracted from fMRI data acquired under a null condition (without stimulation). These data were acquired with an EPI sequence on a 1.5 T GE Signa Horizon scanner (matrix size = 128x128x1 with 120 time instances). For this data set a short TR value (TR = 335 ms) was chosen to avoid aliasing of breathing into low frequencies. Thus, the breathing artifact is not undersampled, as it would be at longer TRs.

Data partition: We used fuzzy C-means clustering in the temporal domain, as implemented in the software package EvIdent (22,23).

Homogeneity map, single cluster visualization: In the homogeneity map, the vertical coordinate corresponds to the TC number in the cluster; along the horizontal coordinate, the intensity value of a particular TC is displayed as a function of time, color-coded such that high intensity values are yellow and low ones are blue. (The homogeneity map has been a standard feature in EvIdent.) The TCs in the homogeneity map are ordered. Here, we used ranking according to the minimum spanning tree. Other rankings are possible, e.g., according to the degree of correlation with the cluster centroid, or according to the fuzzy membership values.

Sequencing of the TCs, Minimum Spanning Tree (MST): The MST is a spanning tree of minimum total length (for a detailed description see e.g. Ref. 24). We constructed the MST of a group of TCs, considering each TC as a node of the tree. The connections between nodes are called edges. We calculated the length of an edge as the Euclidean distance (L2 norm) between the two nodes (TCs) it connects. The MST is a generalization of a one-dimensional sorted list. It may be used for sequencing multivariate observations (25).
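The simulation and the MST sequencing described above can be illustrated with a minimal sketch in plain Python. Several details are assumptions rather than the authors' implementation: the TC length (120 instances, laid out as off/on/off/on), the gamma kernel parameters `a` and `b`, unit noise standard deviation, the use of Prim's algorithm, and in particular the rule for turning the tree into a linear order (a depth-first preorder walk started at one end of the longest path in the tree, visiting the nearest neighbour first). The text does not specify the traversal rule, and all function names here are hypothetical.

```python
import math
import random


def simulate_tcs(n_act=10, n_noise=10, cnr=5.0, seed=0):
    """Return (tcs, labels): noisy "activation" TCs followed by pure-noise TCs."""
    rng = random.Random(seed)
    # Assumed paradigm: off/on/off/on with 30 instances per phase (120 in total).
    boxcar = ([0.0] * 30 + [1.0] * 30) * 2
    # Two-parameter gamma kernel t^(a-1) * exp(-t/b); a and b are illustrative.
    a, b = 3.0, 1.5
    kernel = [t ** (a - 1) * math.exp(-t / b) for t in range(1, 16)]
    # Causal convolution of the paradigm with the kernel.
    resp = [sum(kernel[k] * boxcar[t - k] for k in range(len(kernel)) if t >= k)
            for t in range(len(boxcar))]
    peak = max(resp)
    # Noise SD is 1, so scaling the peak to `cnr` gives CNR = delta_S / sigma.
    signal = [cnr * r / peak for r in resp]
    act = [[s + rng.gauss(0.0, 1.0) for s in signal] for _ in range(n_act)]
    noise = [[rng.gauss(0.0, 1.0) for _ in signal] for _ in range(n_noise)]
    return act + noise, ["activation"] * n_act + ["noise"] * n_noise


def euclidean(x, y):
    """Edge length: L2 distance between two TCs."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def minimum_spanning_tree(tcs):
    """Prim's algorithm on the complete graph; returns an adjacency list."""
    n = len(tcs)
    adj = {i: [] for i in range(n)}
    # best[i] = (distance to the growing tree, nearest tree node)
    best = {i: (euclidean(tcs[0], tcs[i]), 0) for i in range(1, n)}
    while best:
        j = min(best, key=lambda i: best[i][0])
        d, parent = best.pop(j)
        adj[parent].append((d, j))
        adj[j].append((d, parent))
        for i in best:
            dij = euclidean(tcs[j], tcs[i])
            if dij < best[i][0]:
                best[i] = (dij, j)
    return adj


def farthest(adj, start):
    """Node with the largest path length from `start` along the tree."""
    dist, stack = {start: 0.0}, [start]
    while stack:
        u = stack.pop()
        for d, v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + d
                stack.append(v)
    return max(dist, key=dist.get)


def mst_sequence(tcs):
    """Order TC indices by a depth-first preorder walk of the MST."""
    adj = minimum_spanning_tree(tcs)
    root = farthest(adj, 0)  # one end of the longest path in the tree
    order, seen, stack = [], {root}, [root]
    while stack:
        u = stack.pop()
        order.append(u)
        # Push the farthest neighbours first so the nearest is visited next.
        for d, v in sorted(adj[u], reverse=True):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return order


tcs, labels = simulate_tcs()
order = mst_sequence(tcs)
homogeneity_rows = [tcs[i] for i in order]  # rows of the homogeneity map
```

The rows in `homogeneity_rows` can then be passed to any image display routine (for example matplotlib's `imshow`) with a blue-to-yellow color map. Because the two groups are far apart compared with the within-group distances, the tree is expected to contain a single edge joining them, so the "activation" and "noise" TCs end up in contiguous runs of the sequence.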

Display of the entire data structure: As the representative of a cluster, we chose its centroid (the weighted average of the TCs of the cluster). Then we calculated the correlation coefficient of the centroid with all TCs. We expect the highest correlation values for the members of the cluster from which the centroid was derived. We displayed the values of the correlation coefficient in a two-dimensional “block” image, shown schematically in Fig. 1. The horizontal coordinate denotes the cluster number. The dimensions of the image are $(\sum_j N_j) \times C$, where $C$ is the number of clusters and $N_j$ is the number of TCs in cluster $j$. To demonstrate the visualization of the entire data set, in the Results section we use an equal number of TCs for each cluster, $N_j = 50$, $j = 1, \ldots, C$. Note, however, that in general the number of TCs differs between clusters. For each cluster we display the correlation between all TCs and the cluster centroid. For example, the element $r_{ij}^{(k)}$ in the “block” image (Fig. 1) is the correlation coefficient between the $i$th TC of cluster $j$ and the centroid of cluster $k$. Within cluster $j$, the values $r_{ij}^{(k)}$ are in ascending order.
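A minimal sketch of how such a “block” image might be assembled is given below. It is one reading of the description and of Fig. 1, not the authors' code: it assumes equal cluster sizes (as in the Results), uses the unweighted mean as centroid instead of the membership-weighted average, sorts each (cluster, centroid) block independently, and places the block for centroid k = 1 first. The function names are hypothetical, and TCs with zero variance are not handled.

```python
import math


def mean_tc(tcs):
    """Unweighted centroid of a list of equal-length TCs."""
    n = len(tcs)
    return [sum(col) / n for col in zip(*tcs)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def block_image(clusters):
    """Rows: one block of N rows per centroid k; columns: cluster j."""
    if len({len(c) for c in clusters}) != 1:
        raise ValueError("this sketch assumes equal cluster sizes")
    centroids = [mean_tc(c) for c in clusters]
    image = []
    for cen in centroids:  # block k
        # For each cluster j: correlations with centroid k, in ascending order.
        cols = [sorted(pearson(tc, cen) for tc in members)
                for members in clusters]
        image.extend(list(row) for row in zip(*cols))
    return image


# Toy data: two clusters of two short TCs each (rising vs. falling).
clusters = [
    [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]],
    [[4.0, 3.0, 2.0, 1.0], [5.0, 3.0, 2.0, 1.0]],
]
image = block_image(clusters)
```

For the toy data the image has 4 rows (2 centroids x 2 members) and 2 columns. Entries in which a cluster is correlated with its own centroid are close to +1, and the off-diagonal blocks are close to -1.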

Results

The homogeneity map of the simulated data (“noise” and “activation” groups combined in one cluster) is shown in Fig. 2. This example demonstrates the application of the homogeneity map in the extreme case of pure “noise” mixed with “activation”. MST-sequencing, displayed as a homogeneity map, clearly reveals the structure of the “activation” group against the structureless “noise” group. The white arrow indicates the boundary between the two original clusters, with the “noise” cluster above the arrow and the “activation” cluster beneath it.

The homogeneity map of the in vivo data (“activation” cluster) is shown in Fig. 3. This visualization reveals the “quality” of the fuzzy clustering. High temporal concordance among the TCs in the cluster is apparent from the display. The MST-ordering also shows the gradual “broadening” of the hemodynamic response from top to bottom (more yellow towards the bottom of the map).

The homogeneity map of the in vivo data obtained under the null condition is shown in Fig. 4. The cluster identified represents a breathing artifact, with 9 off-on cycles (the peaks are 9 vertical yellow bands of approximately the same width, separated by the troughs, which appear as blue vertical bands). Again, the homogeneity map displays the high internal coherence within the cluster.

Eight clusters of fifty TCs each (C = 8, N = 50) from the in vivo data are shown in an overall graphical display in Fig. 5. The diagonal elements (TCs) of the block image have high correlation values with their own centroids. The black arrow shows a cluster distinctly partitioned from all the others. The white arrow denotes a cluster that is more similar to three other clusters.

Discussion

Sequencing (ranking) the multivariate data, in combination with the homogeneity map, provides a simultaneous visual display of a cluster of TCs. The temporal structure of the multidimensional data is apparent from this display. There are other possible methods for the visualization and sequencing of multidimensional data. Bivariate plots (where the plot dimension is reduced to 2) based on PCA or MDS could be used for the display (14). Note, however, that in contrast to MDS/PCA, MST-based plots do not deform the Euclidean distances between TCs. Another possibility is to order the TCs according to their correlation with some (weighted) mean TC, such as the fuzzy membership-weighted cluster centroid. We have shown that, before feature reduction, a homogeneity map display with all features present is useful for high-dimensional structure exploration in fMRI, especially when the coactivation or concordance between the TCs is to be visualized. The visual display of groups of TCs may be interesting and useful not only for EDA, but also for inferential methods.
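The alternative ordering mentioned above, by correlation with a membership-weighted centroid, might be sketched as follows. The weighting scheme (a plain weighted mean of the TCs, using one membership value per TC) and the function names are assumptions made for illustration; fuzzy C-means implementations typically raise the memberships to a fuzziness exponent, which is omitted here.

```python
import math


def weighted_centroid(tcs, weights):
    """Weighted mean TC; `weights` holds one membership value per TC."""
    total = sum(weights)
    return [sum(w * x for w, x in zip(weights, col)) / total
            for col in zip(*tcs)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_by_centroid_correlation(tcs, weights):
    """TC indices sorted from most to least correlated with the centroid."""
    centroid = weighted_centroid(tcs, weights)
    return sorted(range(len(tcs)),
                  key=lambda i: pearson(tcs[i], centroid),
                  reverse=True)


# Toy cluster: two similar TCs and one anti-correlated TC with low membership.
tcs = [[0.0, 1.0, 0.0, 1.0], [0.0, 1.0, 0.0, 2.0], [1.0, 0.0, 1.0, 0.0]]
memberships = [1.0, 0.8, 0.1]
ranking = rank_by_centroid_correlation(tcs, memberships)
```

Unlike the MST sequence, this ranking reduces each TC to a single number before ordering, so two TCs with the same correlation to the centroid may be placed next to each other even if they differ in shape.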

Acknowledgements

We thank Dr. Wolfgang Richter and Dr. Lawrence Ryner for providing the mental rotation fMRI data set and the null fMRI data set, respectively.

References

1. Friston, K.; Holmes, A.; Worsley, K.; Poline, J.; Frith, C. Statistical parametric maps in functional imaging: A general linear approach. Human Brain Mapping 2:189-210, 1995.
2. Baumgartner, R.; Windischberger, C.; Moser, E. Quantification in fMRI: Fuzzy clustering vs. correlation analysis. Magnetic Resonance Imaging 16:115-125, 1998.
3. Baumgartner, R.; Somorjai, R.; Summers, R.; Richter, W.; Ryner, L. Correlator beware: Correlation has limited selectivity for fMRI data analysis. NeuroImage 12:240-243, 2000.
4. Baune, A.; Sommer, F.; Erb, M.; Wildgruber, D.; Kardatzki, B.; Palm, G.; Grodd, W. Dynamical cluster analysis of cortical fMRI activation. NeuroImage 9:477-489, 1999.
5. Chuang, K.; Chiu, M.; Lin, C.; Chen, J. Model free fMRI analysis using Kohonen clustering neural network and fuzzy c-means. IEEE Transactions on Medical Imaging 18:1117-1128, 2000.
6. Fadili, M.; Ruan, S.; Bloyet, D.; Mazoyer, B. A multistep unsupervised fuzzy clustering analysis of fMRI time series. Human Brain Mapping 10:160-178, 2000.
7. Filzmoser, P.; Baumgartner, R.; Moser, E. A hierarchical clustering method for analyzing fMRI. Magnetic Resonance Imaging 17:817-826, 1999.
8. Fischer, H.; Hennig, J. Neural network-based analysis of MR time-series. Magnetic Resonance in Medicine 41:124-131, 1999.
9. Golay, X.; Kollias, S.; Stoll, G.; Meier, D.; Valavanis, A.; Boesiger, P. A new correlation-based fuzzy logic clustering algorithm for fMRI. Magnetic Resonance in Medicine 40:249-260, 1998.
10. McKeown, M.; Jung, T.; Makeig, S.; Brown, G.; Kindermann, S.; Lee, T.; Sejnowski, T. Analysis of fMRI data by blind separation into independent spatial components. Human Brain Mapping 6:160-188, 1998.
11. Moser, E.; Baumgartner, R.; Barth, M.; Windischberger, C. Explorative signal processing in fMRI. International Journal of Imaging Systems and Technology 10:166-176, 1999.
12. Ngan, S.; Hu, X. Analysis of fMRI data using self-organizing mapping with spatial connectivity. Magnetic Resonance in Medicine 41:939-946, 1999.
13. Minotte, M.; West, W. The data image: a tool for exploring high dimensional data sets. Proceedings of the ASA Section on Statistical Graphics, in press. http://math.usu.edu/~minotte/research/pubs.html
14. Pison, G.; Struyf, A.; Rousseeuw, P. Displaying a clustering with CLUSPLOT. Computational Statistics and Data Analysis 30:381-392, 1999.
15. Wegman, E. Hyperdimensional data analysis using parallel coordinates. Journal of the American Statistical Association 85:664-675, 1990.
16. Chernoff, H. The use of faces to represent points in k-dimensional space graphically. Journal of the American Statistical Association 68:361-368, 1973.
17. Andrews, D. Plots of high-dimensional data. Biometrics 28:125-136, 1972.
18. Tukey, J. Exploratory data analysis. Addison-Wesley, 1977.
19. Carpenter, P.; Just, M. Modeling the mind: very-high-field fMRI activation during cognition. Topics in Magnetic Resonance Imaging 10:16-36, 1999.
20. Lange, N. Tutorial in biostatistics; statistical approaches to human brain mapping by fMRI. Statistics in Medicine 15:389-428, 1997.
21. Richter, W.; Ugurbil, K.; Georgopoulos, A.; Kim, S-G. Time resolved fMRI of mental rotation. Neuroreport 8:3697-3702, 1997.
22. Somorjai, R.; Jarmasz, M. Exploratory data analysis of fMRI: Philosophy, strategies, tools, implementation. Seventh Annual Meeting ISMRM, Philadelphia, USA, 1714, 1999.
23. Somorjai, R.; Jarmasz, M.; Baumgartner, R. EVIDENT: A two-stage strategy for the exploratory data analysis of fMRI data by fuzzy clustering. Institute for Biodiagnostics, Technical report #37, 2000. http://www.ibd.nrc.ca/informatics. EvIdent, a 3D analysis software package. http://www.ibd.nrc.ca/informatics
24. Harary, F. Graph theory. Addison-Wesley, 1969.
25. Friedman, J.; Rafsky, L. Multivariate generalizations of the Wald-Wolfowitz and Smirnov two-sample tests. Annals of Statistics 7:697-717, 1979.

Figure Captions

Fig. 1 Schematic figure for the “block” image of the entire data set. Nj: number of TCs in the jth cluster; C: number of clusters displayed.

Fig. 2 Simulated data: Homogeneity map of two clusters, one consisting of TCs following the paradigm (two cycles, yellow bands), the other of pure “noise”, merged and ranked by MST. Arrow indicates the border between “activation” and “noise” regions, with the latter on top.

Fig. 3 In vivo data: Homogeneity map of the MST-ranked “activation” cluster. Note the high temporal concordance among the TCs.

Fig. 4 In vivo data: Homogeneity map of the MST-ranked breathing “artifact” cluster obtained from null hypothesis data. Again, note the high temporal concordance among the TCs.

Fig. 5 In vivo data: Entire data display. Black arrow points to (a row which represents) a cluster distinctly separated from all the others. White arrow shows (a row which represents) a cluster that is more similar to three other clusters.

r (C ) N 1 ,1

r (C ) N C ,C

r (C ) N 1 −1,1

r (C ) N C −1, C

...

...

r

(C )

r (C )1 ,C

1,1

... ... ...

... ... ...

... ... ...

... ... ...

r (1) N 1 ,1

r (1) N C , C

r (1) N 1 − 1,1

r (1) N C −1, C

...

...

r (1) 1,1

r (1) 1,C

Cluster 1

...

...

Fig. 1

Cluster C

TC number

Time

TC number

Fig. 2

Time

Fig. 3

TC number

Time instance

Fig. 4

Cluster number

Fig. 5

Cluster members with ordered correlation coefficients