IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 24, NO. 2, FEBRUARY 2005

Unsupervised Learning and Mapping of Active Brain Functional MRI Signals Based on Hidden Semi-Markov Event Sequence Models Sylvain Faisan, Laurent Thoraval*, Jean-Paul Armspach, Marie-Noëlle Metz-Lutz, and Fabrice Heitz

Abstract—In this paper, a novel functional magnetic resonance imaging (fMRI) brain mapping method is presented within the statistical modeling framework of hidden semi-Markov event sequence models (HSMESMs). Neural activation detection is formulated at the voxel level in terms of time coupling between the sequence of hemodynamic response onsets (HROs) observed in the fMRI signal, and an HSMESM of the hidden sequence of task-induced neural activations. The sequence of HRO events is derived from a continuous wavelet transform (CWT) of the fMRI signal. The brain activation HSMESM is built from the timing information of the input stimulation protocol. The rich mathematical framework of HSMESMs makes these models an effective and versatile approach for fMRI data analysis. Solving the HSMESM Evaluation and Learning problems enables the model to automatically detect neural activation embedded in a given set of fMRI signals, without requiring any template basis function or prior shape assumption for the fMRI response. Solving the HSMESM Decoding problem makes it possible to enrich brain mapping with activation lag mapping, activation mode visualizing, and hemodynamic response function analysis. Activation detection results obtained on synthetic and real epoch-related fMRI data demonstrate the superiority of the HSMESM mapping method over a real application case of the statistical parametric mapping (SPM) approach. In addition, the HSMESM mapping method appears clearly insensitive to timing variations of the hemodynamic response, and exhibits low sensitivity to fluctuations of its shape.

Index Terms—Brain mapping, functional MRI, hidden Markov models, signal processing, wavelet transform.

Manuscript received July 15, 2004; revised November 8, 2004. The Associate Editor responsible for coordinating the review of this paper and recommending its publication was P. Thompson. Asterisk indicates corresponding author.
S. Faisan is with the Université Louis Pasteur (ULP), Strasbourg, France, the Laboratoire des Sciences de l'Image, de l'Informatique et de la Télédétection (LSIIT, UMR CNRS 7005), 67412 Illkirch Cedex, France, and the Institut de Physique Biologique (IPB, UMR CNRS 7004), 67085 Strasbourg Cedex, France (e-mail: [email protected]).
*L. Thoraval is with the Université Louis Pasteur (ULP), Strasbourg, France, and the Laboratoire des Sciences de l'Image, de l'Informatique et de la Télédétection (LSIIT, UMR CNRS 7005), ENSPS-LSIIT, Boulevard Sébastien Brant, BP 10413, 67412 Illkirch Cedex, France (e-mail: [email protected]).
J.-P. Armspach and M.-N. Metz-Lutz are with the Université Louis Pasteur (ULP), Strasbourg, France, and the Institut de Physique Biologique (IPB, UMR CNRS 7004), 67085 Strasbourg Cedex, France (e-mail: [email protected]; [email protected]).
F. Heitz is with the Université Louis Pasteur (ULP), Strasbourg, France, and the Laboratoire des Sciences de l'Image, de l'Informatique et de la Télédétection (LSIIT, UMR CNRS 7005), 67412 Illkirch Cedex, France (e-mail: [email protected]).
Digital Object Identifier 10.1109/TMI.2004.841225

I. INTRODUCTION

Commonly used techniques in functional MRI (fMRI) brain mapping can be divided into two classes: data-driven and model-driven. Data-driven techniques such as cluster analysis [1]–[3], principal component analysis [4], [5], or independent component analysis [6], [7], attempt to reveal components of interest in the fMRI data. Their main advantage is that no prior knowledge about the fMRI responses is needed, in particular to explore the data searching for patterns of functional activation. However, physiological interpretation of the results remains difficult for the examiner. Also, data-driven techniques do not provide any solid basis for statistical significance testing. In contrast, model-driven techniques allow the use of standard statistical tests, but most of them require strong prior assumptions about the shape and the timing of the fMRI signal in activated voxels [8]–[13]. These assumptions are generally expressed through the use of a limited set of reference or template basis functions to model the noise-free activated fMRI signal, as in the popular statistical parametric mapping (SPM) approach [14]. Each reference function is usually implemented as the linear convolution of a predefined impulse response function, the putative hemodynamic response function (HRF), with a deterministic stimulus or timing function encoding the supposed sequence of task-induced neural activations. In practice, such modeling assumptions may be too restrictive to capture the broad range of activation patterns, for several reasons. First, anecdotal evidence suggests that the HRF varies across brain regions within a single subject [15], and across subjects as reported in [16]–[19]. Second, though multiple basis functions may facilitate the ability to adjust to timing differences, the deterministic character assigned to the timing function is a major limitation in capturing the variability of the task-induced mental event timing, both in space and time and across subjects. Third, whereas a number of studies concluded that the linearity assumption underlying the convolution model holds in many experimental conditions [10], [18], [20], other groups noted some departure from linearity in the human auditory and visual systems [18], [21], [22]. Finally, despite recent advances [23], the neurovascular coupling existing between the fMRI signal and the neural activity it is believed to represent remains largely unknown [24], [25]. When considering additional sources of variation [26], all these aspects combine to make the fMRI signal difficult to quantify, and somewhat restrictive fMRI signal models too close to the data. To relax some of the aforementioned hypotheses, and to take into account activation patterns that do not only fall in the subspace spanned by some predefined template basis functions, more complex activation models based on activation-related hyperparameters have been proposed.


These hyperparameters may relate to the modeling of the HRF pattern [27]–[32], of the activation timing [33], of neighboring spatial information [31], [34]–[36], or of the fMRI noise [28], [37], [38]. Classical and Bayesian inference techniques applied to these models make it possible both to detect neural activation and to estimate model parameters and hyperparameters. They have recently been compared in [39], in the context of neuroimaging.

In this paper, a new model-driven fMRI brain mapping approach that addresses the three problems of HRF shape variability, neural event timing, and fMRI response linearity is presented within the statistical modeling framework of hidden semi-Markov event sequence models (HSMESMs). HSMESMs are a special instance of hidden semi-Markov models (HSMMs) dedicated to the modeling and analysis of event-based random processes [40]. They were first described by the authors in [41]. HSMESMs, like HSMMs, belong to the wide range of hidden Markov modeling (HMM) techniques [42]. fMRI brain mapping fits naturally within an HMM framework since it consists of analyzing a "hidden" process, the neural activation process, through a related observable one, the fMRI signal. In this framework, we formulate neural activation detection at the voxel level in terms of time coupling between the sequence of hemodynamic response onsets (HROs) observed in the fMRI signal, and an HSMESM of the underlying sequence of task-induced neural activations [43], [44]. The hidden process of this brain activation HSMESM statistically accounts for the neural event timing underlying the HRO sequence. Its observable counterpart models the short-term statistical characteristics of the HRO events. In practice, HROs of interest are detected and characterized from a continuous wavelet transform (CWT) of the voxel's fMRI signal. The brain activation HSMESM is built from the timing information of the external stimulation paradigm. Thanks to the HSMESM learning and generalizing abilities, the model can be trained to detect in-phase as well as delayed neural activations embedded in a given set of fMRI signals, making the proposed mapping method completely unsupervised. Also, the rich theoretical formalism of HSMESMs allows the development of additional fMRI data analysis functionalities, including activation lag mapping, activation mode visualizing, and HRF shape analysis.

The paper is organized as follows. Section II presents the salient features of HSMESMs within the general context of hidden Markov modeling. Some notations that will be used throughout the paper are also established in this section. Section III examines the application of HSMESMs to unsupervised learning and mapping of active brain fMRI signals. Section IV presents HSMESM brain mapping results obtained on synthetic and real epoch-related fMRI data. Activation lag mapping, activation mode visualizing, and HRF shape analysis are also presented. Section V discusses some major aspects of the HSMESM method. It relates the method to the few other Markovian modeling approaches of the fMRI signal, then addresses the application of HSMESMs to more complex blocked designs as well as to event-related designs, before concluding with computational issues of the method. Finally, Section VI presents our conclusions.


II. THEORY OF HSMESMs

An HSMESM is a special instance of an HSMM, which, in turn, is an extension of the standard HMM to explicit modeling of state occupancy duration. Therefore, the reader is referred to Rabiner's tutorial [42] for a rigorous presentation of the HMM formalism, and to the pioneering works of Ferguson [45], Russell et al. [46], and Levinson [47] for the understanding of durational modeling in HSMMs.

A. Data Preprocessing, Event Sequence, Observation Sequence

In an HSMESM approach, a preprocessing step detects and characterizes events of interest in the raw input data observed from the process under analysis. At each detection time, an observation, also called an event, is produced. Let us consider the set of observation times of an HSMESM and the corresponding observation sequence, as well as the set of event detection times and the corresponding sequence of detected events. In the HSMESM formalism, two fictive events are introduced at the beginning and the end of the event sequence for duration modeling purposes. Then, by definition, the observation sequence is built upon the event sequence by inserting a null event at each observation time at which no event was detected; a null event means that a missing observation occurred at that time.

B. Elements of an HSMESM

An HSMESM is a two-stage stochastic process: a nonobservable or "hidden" process, and an observable process, which links the observations, namely, the detected and null events described so far, to the states of the hidden process. The hidden process models the statistical arrangement of the detected and null events along the time axis. The observation process represents the short-term statistical characteristics of the detected events. Due to the detection-based preprocessing, the HSMESM observation sequence is usually composed of true positive events (tpe) mixed with false positive events (fpe) and missing observations (null). Let the number of tpe classes that characterize the process under study be fixed, and consider the state occupied by the hidden process at each time together with the corresponding subsequences of states and observations. Then, an HSMESM is defined by the following elements.

• Set of hidden states: the union of the tpe state subset and of the "aggregate" state subset, defined by (1) and (2), where

- a tpe state is a standard Markov state producing tpe at event detection times; the start and final states of the hidden process model the two fictive events introduced at the beginning and the end of the event sequence, respectively;
- an aggregate state is a semi-Markov state producing fpe at event detection times, or null at missing observation times; it results from the aggregation of two states, a fpe state and its dual null state.
• State transition probability matrix, whose entries are defined by (3).
• Set of inter-tpe state duration pdfs, one per aggregate state, defined by (4).
• Set of tpe observation pdfs, one per tpe state, defined by (5), with arbitrary settings for the fictive start and end events.
• Set of aggregate observation pdfs, one per aggregate state, defined by (6).

• Emission probability matrix, whose entries are defined by (7) and (8).

In an HSMESM, aggregate states take place between tpe state pairs. An aggregate state models the observation subsequences that may occur between two successive tpe produced by the two tpe states it connects. The probability distributions associated with an aggregate state are representative of the two tpe states it is connected with, thereby justifying the double indexing used. Its durational distribution models the inter-tpe (state) duration between these two tpe states. Its observation pdf is built upon a latent Bernoulli process (9): according as the hidden process occupies the fpe state or the null state of the aggregate, a fpe is emitted, with a probability given by the Bernoulli parameter, or a null is emitted, with the complementary probability. In some sense, the Bernoulli parameter represents the fpe emission rate of the aggregate state: it models the mean false alarm rate observed between two successive tpe of the two classes the aggregate state connects. Finally, note that under the state conditional independence assumption of the observations, the probability of an observation subsequence given the corresponding state subsequence can be written in the closed form (10).

Fig. 1. HSMESM example. (a) Output observation sequence with the corresponding tpe, fpe, and null states (circles and boxes) visited when transiting from one tpe state to the next through the aggregate state of (b). (b) HSMESM topology. (c) Equivalent standard Markovian representation of the corresponding tpe/aggregate/tpe state triplet of (b) (see text).

From a generative point of view, an HSMESM behaves as follows. At the initial time, in the start state, the model emits the fictive start event with probability one. When in a tpe state at a given time, it selects the next tpe state to visit according to the transition probability distribution; that state is actually visited at a later time randomly drawn from the corresponding inter-tpe duration distribution. Before reaching it, the model remains in the intervening aggregate state to emit fpe and/or null events according to the aggregate observation pdf. This procedure is repeated until the final state is reached, at the last observation time, to emit the fictive end event with probability one. To be completely specified, an HSMESM thus needs to define the fpe emission rate parameters, the transition probability matrix, the observation pdfs, the durational pdfs, and the emission probability matrix. Hereafter, the compact notation λ is used to denote an HSMESM.

C. Graphical Representation of an HSMESM

An example of HSMESM chain topology is depicted in Fig. 1(b). Circles represent tpe states while bicolor boxes in between correspond to aggregate (fpe/null) states. The equivalent standard Markovian representation of an HSMESM can be obtained by expanding each aggregate state into two parallel state delay lines, as shown in Fig. 1(c), one composed of standard Markov states with an associated output pdf, the other composed of standard Markov states without any associated output pdf; the length of each delay line corresponds to the maximum duration allowed between two successive tpe. When swapping between fpe and null emissions, the hidden process swaps from one delay line to the other. For instance, observing the subsequence of Fig. 1(a) is equivalent, at the hidden process level of Fig. 1(c), to successively occupying the corresponding delay line states until the next tpe state is reached. An HSMESM can be thought of as a special instance of an HSMM since particular parameter settings in Fig. 1(c) lead back to Levinson's equivalent standard Markovian representation of a semi-Markov state [47].
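As an illustration of the generative behavior just described, the following minimal Python sketch samples an observation sequence from a chain-structured HSMESM with Gaussian inter-tpe durations and a Bernoulli fpe/null mechanism inside the aggregate states. The container name, field names, and the Gaussian/Bernoulli choices are illustrative assumptions consistent with the model family described above, not code from the paper; the fictive start and end states are left implicit.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class ChainHSMESM:
    """Illustrative left-right HSMESM: one tpe state per expected event class."""
    tpe_mean: np.ndarray   # mean of the Gaussian observation pdf of each tpe state
    tpe_std: np.ndarray    # std  of the Gaussian observation pdf of each tpe state
    dur_mean: np.ndarray   # mean inter-tpe duration (in samples) between states i and i+1
    dur_std: np.ndarray    # std of the inter-tpe duration pdf
    fpe_mean: float        # mean of the Gaussian observation pdf of fpe
    fpe_std: float
    rho: float             # fpe emission rate inside aggregate states (Bernoulli parameter)

    def sample(self, rng=np.random.default_rng(0)):
        """Generate one observation sequence; None stands for a null (missing) observation."""
        obs = []
        for i in range(len(self.tpe_mean)):
            if i > 0:
                # draw the inter-tpe duration, then fill it with fpe/null emissions
                d = max(1, int(round(rng.normal(self.dur_mean[i - 1], self.dur_std[i - 1]))))
                for _ in range(d - 1):
                    if rng.random() < self.rho:            # fpe state of the aggregate
                        obs.append(rng.normal(self.fpe_mean, self.fpe_std))
                    else:                                   # dual null state
                        obs.append(None)
            # tpe state emits a true positive event
            obs.append(rng.normal(self.tpe_mean[i], self.tpe_std[i]))
        return obs

if __name__ == "__main__":
    model = ChainHSMESM(
        tpe_mean=np.array([2.0, 2.0, 2.0]), tpe_std=np.array([0.3, 0.3, 0.3]),
        dur_mean=np.array([20.0, 20.0]), dur_std=np.array([2.0, 2.0]),
        fpe_mean=0.5, fpe_std=0.3, rho=0.1)
    print(model.sample())
```

In the brain activation model of Section III, such a chain would contain one tpe state per OFF-ON paradigm transition.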

D. Algorithmic Aspects

Similarly to the HMM or HSMM case, the three following basic problems have to be solved for HSMESMs to be usable in practical applications.
• Evaluation: Given an observation sequence O and an HSMESM λ, compute P(O | λ), the likelihood of the observation sequence given the model.
• Decoding: Given an observation sequence O and an HSMESM λ, infer the hidden state sequence that best explains O.
• Learning: Adjust the HSMESM parameter set λ to maximize the likelihood of a training set of observation sequences assumed mutually independent.
In an HMM context, the efficient dynamic programming algorithms that solve these problems are the Forward-Backward, Viterbi, and Baum–Welch algorithms, respectively [42], [48]. They are briefly revisited here within the HSMESM framework (see [40] for details).

1) Evaluation: The likelihood P(O | λ) is computed using the forward and/or the backward variables, respectively defined as

α_t(i) = P(o_1 o_2 … o_t, q_t = S_i | λ)   (11)
β_t(i) = P(o_{t+1} o_{t+2} … o_T | q_t = S_i, λ)   (12)

where o_1, …, o_T denote the observations, q_t the state occupied by the hidden process at time t, and S_i the ith hidden state. Both variables can be solved for inductively. The forward recursion (13) sums over the possible predecessor tpe states and inter-tpe durations, with the initialization that the start state carries probability one at the initial time and all other states probability zero; a similar formula for computing the backward variable is given in [40]. Then, the observation sequence likelihood can be worked out at the final time T from the forward variable by (14).

2) Decoding: The best state sequence is selected among all possible state sequences q so as to maximize the joint probability

q* = arg max_q P(q, O | λ).   (15)

The maximization is performed using the variable defined as

δ_t(i) = max_{q_1, …, q_{t-1}} P(q_1 … q_{t-1}, q_t = S_i, o_1 o_2 … o_t | λ).   (16)

Similarly to the forward variable, δ_t(i) can be solved for inductively, using (13), after having substituted in (13) the α's by the δ's and the double sum by a double maximization. An additional variable is also required to keep track of the pair of indexes (predecessor tpe state and duration) that maximizes (13), thus modified. The best state sequence can then be retrieved in two steps. First, the best tpe state sequence occupied by the hidden process is easily derived by backtracking through this array, starting from the final state (17). Then, deriving the complete state sequence from the tpe state sequence is straightforward due to the model structure. In practice, however, only the tpe state sequence is inferred since it already leads to the sequence of occurrence times of the tpe embedded in the observation sequence, as well as to the associated sequence of state indexes, each index designating the class of the tpe detected at the corresponding time.

3) Learning: As in the standard HMM case, re-estimation formulas are employed to iteratively update the HSMESM parameter set λ while increasing the likelihood P(O | λ). The derivation of these formulas requires introducing the auxiliary function defined as

Q(λ, λ̄) = Σ_q P(O, q | λ) log P(O, q | λ̄)   (18)

where q is any particular state sequence and λ̄ is the candidate re-estimated parameter set. It can easily be shown, using Jensen's inequality, that increasing Q(λ, λ̄) with respect to λ̄ increases the likelihood of the observation sequence. Similarly to the HMM and HSMM cases, the standard Lagrange multiplier optimization method can then be applied to solve for the λ̄ maximizing Q(λ, λ̄). The ensuing re-estimation formulas are reported in [40], in the case of one-dimensional Gaussians for all the observation and durational pdfs.

III. UNSUPERVISED LEARNING AND MAPPING OF ACTIVE BRAIN fMRI SIGNALS

The fMRI brain mapping method developed hereafter relies on the following simple premise: in the presence of neural activation at a voxel, the sequence of HROs observed in the corresponding fMRI signal should align, to some extent, onto the hidden sequence of task-induced neural activations. We propose to model both sequences and their time coupling in a single HSMESM. In this model, the observation sequence is representative of the sequence of HRO candidates observed at the voxel location; it is obtained by preprocessing the corresponding signal of the input fMRI data set. On the other hand, the hidden part of the model statistically accounts for the neural event timing underlying the HRO sequence; it is built from the timing information of the stimulation protocol. The overall fMRI brain mapping procedure is depicted in Fig. 2. Before describing it further, let us first identify what HRO events are.

Fig. 2. Synoptic diagram of the HSMESM-based brain mapping procedure. Learning of active fMRI signals (block "Learning") alternates with mapping of active fMRI signals (blocks "Evaluation" and "Thresholding"), following a multiple-pass-i, multiple-iteration-j scheme, with a training set of active fMRI signals redefined by mapping at each pass i, before conventional learning (see text for details).

A. About HRO Events

In our approach, an HRO event corresponds to the initial rising edge of the hemodynamic response to a stimulation. Its temporal support extends roughly from the "preundershoot" that sometimes precedes the onset to the main peak of the hemodynamic response [see Fig. 3(a)]. The pattern thus delineated exhibits approximately at its center a positive inflection point whose occurrence time will be considered as the occurrence time of the event. The time-localized, transient character of HROs makes it possible to introduce signal processing techniques, such as the CWT used here, which are very effective in detecting and characterizing transient events in the low SNR conditions encountered in fMRI. The CWT also exhibits time-scale properties of interest at positive inflection points of a signal. Finally, the HRO event appears as the most invariant pattern of the hemodynamic response across space, time, experiments, and subjects. At the least, it preserves its time-localized, transient character, upon which its CWT-based detection and characterization rely.

B. fMRI Data, Stimulation Protocol

The input fMRI data set is assumed to be a sequence of brain volumes, so that each voxel's fMRI signal contains one sample per scan. The stimulation protocol considered in this paper follows a binary activation-baseline paradigm composed of stimulation blocks of variable duration but sharing the same basic OFF-ON pattern. Among them, all but two are real stimulation blocks, whereas two are fictive blocks introduced at the beginning and the end of the paradigm for modeling purposes [see Fig. 4(a)]. We also introduce the set of occurrence times of the first stimulus event in each block or, equivalently, the corresponding set of OFF-ON transition times in the paradigm.

C. fMRI Data Preprocessing

The preprocessing step of Fig. 2 transforms the input fMRI data set into a set of observation sequences by detecting and characterizing, at the voxel level, the HROs of interest in the fMRI signal. The ability of the CWT to detect such nonstationarities in the time-scale domain is of great interest here [49]. The method used is a simplified version of the CWT-based preprocessing approach of Thoraval et al. [50].

1) Event Detection: The mother wavelet employed is chosen complex-valued to benefit from the phase information of the fMRI signal CWT. It has been demonstrated in [49] that symmetry singularities of a signal are associated with particular phase values of its CWT. Specifically, a positive inflection point

occurring at a given time is associated, at high resolution, with a CWT phase value of +π/2. If one assumes that the HROs to be detected exhibit a local positive inflection point, then the set of event detection times at a voxel can easily be derived by detecting, at a sufficiently high scale of resolution, the time instants at which "+π/2" crossings of the phase occur in the fMRI signal CWT. Such a situation is depicted in Fig. 3 for an activated and a nonactivated fMRI signal.

2) Event Characterization: At each event detection time, a first feature vector of twenty-four components is derived. Fourteen of these reflect the temporal behavior of the fMRI signal around the detection time (five samples of the raw signal, five samples of a Gaussian-filtered version of it, and four cross-correlation coefficients of the raw and filtered versions with two stepwise functions of variable width). Six other components result from the modulus of the fMRI signal CWT at the detection time (the five most important magnitudes and the scale index of the modulus maximum). The last four components are computed from the fingerprint of the event in the CWT phase plane (three measure its length and one measures its time deviation across the scales) [see Fig. 3(b), (d)]. Then, the feature vector is reduced to a single scalar using Fisher's linear discriminant analysis [51]. In our approach, the discriminant vector has been estimated once and for all from several databases of feature vectors broadly classified into tpe or fpe. A rough analysis of the estimation results indicates minor variations in the estimate of this vector across the databases. It also reflects that the features elaborated from the HRO fingerprint in the phase plane of the fMRI signal CWT are important factors in discriminating between tpe and fpe.

3) CWT Parameter Settings: The highest scale of the CWT decomposition controls the HRO detection rate at the output of the preprocessing step. It is chosen so that a target average number of HRO candidates is detected per fMRI signal, the corresponding detection-rate factor being initially set to 1.5. In contrast, the lowest scale does not have any influence on the HRO detection rate. However, it should be set low enough to provide significant HRO fingerprints. In our experiments, it is set so that, on average, a prescribed number of "+π/2" crossings of the phase occur at this scale per fMRI signal. Finally, the number of CWT scales is arbitrarily set to 20.
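To make the detection rule concrete, the sketch below computes a complex-wavelet CWT of a single fMRI time series with plain NumPy and flags the samples where the phase at the coarsest analysis scale crosses +π/2. The complex Morlet wavelet, the scale grid, and all function names are illustrative assumptions; the paper's exact mother wavelet, scale settings, and the subsequent twenty-four-feature characterization are not reproduced here.

```python
import numpy as np

def complex_morlet(scale, w0=5.0, n=None):
    """Sampled complex Morlet wavelet at a given scale (illustrative mother wavelet)."""
    n = n or int(10 * scale)
    t = (np.arange(n) - n / 2.0) / scale
    return np.exp(1j * w0 * t) * np.exp(-0.5 * t**2) / np.sqrt(scale)

def cwt_phase(signal, scales):
    """Return the CWT phase plane (n_scales x n_samples) of a 1-D signal."""
    phases = np.empty((len(scales), len(signal)))
    for k, s in enumerate(scales):
        wav = complex_morlet(s, n=min(int(10 * s), len(signal)))
        phases[k] = np.angle(np.convolve(signal, wav, mode="same"))
    return phases

def detect_hro_candidates(signal, scales):
    """Candidate HRO times: +pi/2 phase crossings at the coarsest scale."""
    phase = cwt_phase(signal, scales)[-1]          # coarsest (largest) scale
    shifted = phase - np.pi / 2.0
    crossing = np.diff(np.sign(shifted)) != 0      # sign change around +pi/2 ...
    no_wrap = np.abs(np.diff(phase)) < np.pi       # ... that is not a +pi / -pi phase wrap
    return np.where(crossing & no_wrap)[0]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    t = np.arange(145)
    # toy "activated" signal: slow bumps every 20 scans plus noise
    signal = np.sin(2 * np.pi * t / 20.0) + 0.5 * rng.standard_normal(t.size)
    print(detect_hro_candidates(signal, scales=np.linspace(2, 16, 20)))
```

In the paper, each detected candidate is further described by a feature vector and reduced to a scalar with Fisher's linear discriminant; that characterization step is omitted in this sketch.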

Fig. 3. fMRI signal preprocessing. (a) Activated fMRI signal with an example of HRO event (gray box) and its detection time (black point). (c) Nonactivated fMRI signal. (b), (d) Phase plane of the corresponding CWT. The scan number and the CWT scale index are reported on the X-axis and Y-axis, respectively. The highest and lowest CWT scales correspond to the scale indexes 1 and 20, respectively. "+π/2" crossings of the phase correspond to color jumps from white to black. The curves described by the "+π/2" crossing points across the scales are known as fingerprints in pattern recognition. HRO event detection is performed by detecting at scale index 1 the "+π/2" crossings of the phase. HRO detection times are reported in (a) and (c) by black points superimposed on the signals.

D. Brain Activation Modeling

Though the local sequence of task-induced neural activations is hidden, its timing can, however, be statistically modeled from the deterministic timing of the stimulation blocked paradigm. The modeling principle relies on a doubly one-to-one registration between i) the first stimulus event of a stimulation block and the induced neural activation, and ii) the induced neural activation and the observed HRO event. If one considers that the first stimulus events coincide temporally with the OFF-ON paradigm transitions, then the hidden process of the brain activation model can easily be derived, as depicted in Fig. 4, from the OFF-ON transition sequence together with the corresponding set of occurrence times. The complete model is specified as follows.

Fig. 4. Brain activation modeling. (a) Stimulation paradigm with P = 5 blocks, among which block 1 and block 5 are fictive blocks. The vertical arrows indicate the first stimulus event of each block with its occurrence time. (b) Schematic representation of the task-induced neural event timing, where the horizontal arrows underline the hidden nature of the neural activation onsets. (c) Corresponding brain activation HSMESM with the tpe states (circles) and aggregate states in between (boxes). The vertical dashed lines from (a) to (c) reflect the one-to-one registration assumed between a first stimulus event, the induced neural activation, and the observed HRO event.

• Set of hidden states: one tpe state is used per OFF-ON paradigm transition; that is, the number of tpe classes introduced in Section II is set to the number of stimulation blocks. The state index reflects the order of appearance of the transition in the paradigm. An aggregate state is then inserted between each valid tpe state pair for which a transition is allowed.
• State transition probability matrix: A left-right topology is selected for the state chain, with the additional constraint that transitions to earlier states are forbidden, making the matrix upper triangular. To prevent declaring active an fMRI signal that only responds to a few stimulation blocks, we further forbid transitions that skip more than a small number of stimulation blocks. The lower this number, the more successive stimulation blocks the signal is assumed to respond to, with the limit case where it must respond to all blocks. A small value is used in practice.
• Set of inter-tpe state duration pdfs: All durational pdfs are specified as one-dimensional Gaussians.
• Set of observation pdfs: Because of the scalar nature of the observations, all observation pdfs are specified as one-dimensional Gaussians.

An example of brain activation HSMESM is depicted in Fig. 4(c), for a stimulation paradigm composed of five stimulation blocks, fictive blocks included. Finally, in order to reduce the number of parameters to be estimated while achieving consistency in their estimates, the concept of parameter tying can be used [40], [42].
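As a concrete illustration of this construction, the sketch below derives the skeleton of such a brain activation chain (one tpe state per OFF-ON transition, Gaussian inter-tpe duration pdfs centered on the paradigm's inter-transition intervals, and an upper-triangular left-right transition structure with a bounded number of skipped blocks) from a list of transition times. The container layout, the initial standard deviation, the uniform transition initialization, and the max_skip bound are illustrative assumptions, not values from the paper.

```python
import numpy as np

def build_brain_activation_model(onset_times, max_skip=1, dur_std_scans=2.0):
    """Skeleton of a left-right brain activation HSMESM derived from OFF-ON transition times.

    onset_times : occurrence times (in scans) of the first stimulus of each block,
                  fictive first and last blocks included.
    max_skip    : how many consecutive stimulation blocks a responding voxel may skip.
    """
    onset_times = np.asarray(onset_times, dtype=float)
    n_states = len(onset_times)                      # one tpe state per OFF-ON transition

    # Upper-triangular left-right transition matrix allowing at most `max_skip` skipped blocks.
    trans = np.zeros((n_states, n_states))
    for i in range(n_states - 1):
        allowed = np.arange(i + 1, min(i + 2 + max_skip, n_states))
        trans[i, allowed] = 1.0 / len(allowed)       # uniform initialization before learning

    # Gaussian inter-tpe duration pdfs centered on the paradigm's inter-transition intervals.
    dur_mean = {(i, j): onset_times[j] - onset_times[i]
                for i in range(n_states) for j in range(i + 1, n_states) if trans[i, j] > 0}
    dur_std = {key: dur_std_scans for key in dur_mean}

    return {"n_states": n_states, "trans": trans, "dur_mean": dur_mean, "dur_std": dur_std}

if __name__ == "__main__":
    # Example: 7 runs of 20 scans (5 OFF, 10 ON, 5 OFF) plus fictive start/end blocks.
    onsets = [0] + [2 + 5 + 20 * k for k in range(7)] + [144]
    model = build_brain_activation_model(onsets, max_skip=1)
    print(model["n_states"], sorted(model["dur_mean"].items())[:3])
```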

E. Unsupervised Learning and Mapping

Unsupervised learning and mapping of active brain signals are performed jointly, by searching for a brain activation HSMESM that fits the fMRI data set under analysis. Learning does not consist here in "simply" applying the Baum–Welch re-estimation formulas (or the expectation-maximization algorithm) to a fixed training set of fMRI signals, following a "conventional" multiple-iteration learning scheme. Rather, learning alternates with mapping, following a multiple-pass, multiple-iteration scheme, with a training set of active fMRI signals redefined by mapping at each pass before conventional learning (see Fig. 2). Specifically, the learning-mapping procedure consists in automatically building a series of refined brain activation models, together with a series of active fMRI signal sets and its correlate, a series of brain activation maps. Composed of all the signals declared active at the previous pass, each signal set is used to train the next brain activation model by iterative application of the Baum–Welch re-estimation formulas. The procedure starts with the full fMRI signal set (in fact, with the set of all intracranial observation sequences). All three series converge, in that learning from the limit signal set yields a model which, in turn, declares active that very signal set. The limit activation map defines the final brain activation map, while the limit model is statistically representative of the induced neural activation embedded in the input fMRI data set. Further technical details about the learning-mapping procedure (initialization, evaluation, thresholding, and learning blocks) may be found in [40].

F. More Insight Into Neural Activation: Viterbi Decoding

Along with the activation maps, Viterbi decoding of active fMRI signals provides extra information about how activated brain regions have responded to the external stimulation. Indeed, as seen in Section II-D2, Viterbi decoding of the best tpe state sequence attached to each activated voxel leads not only to the occurrence time sequence of the true positive HROs decoded at that voxel, but also to the associated sequence of stimulation block numbers. This information may be exploited in various ways.

1) Activation Lag Mapping: From these two sequences, one can easily work out an estimate of the activation lag at each activated voxel. The estimate we used consists in averaging the differences between the decoded HRO times and the stimulation onset times they are associated with. Three examples of activation lag map, with their gray level lag scale at the right side, are shown in the upper right corner of Fig. 6(a)–(c).

2) Activation Mode Visualizing: For any activated region, a histogram-like subplot can also be obtained by counting the number of times each tpe state has been visited by the state sequences related to the region. The resulting subplot informs about the neural response rate of the region to each input stimulation block. It can be thought of as a way of visualizing the activation mode of specific activated brain regions. Three examples of activation mode are shown in the top row of Fig. 7.

3) HRF Shape Analysis: Morphological analysis of the HRF can also be carried out in all dimensions, including space, time, stimulation protocols, and subjects. For instance, the mean HRF waveform can be computed for a given activated region and a given stimulation block, based on the HRF patterns whose decoded HROs correspond to that block in the fMRI signals related to the region. Three examples of HRF waveform are shown in the bottom row of Fig. 7.
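The first two Viterbi-based summaries reduce to simple bookkeeping once the decoded tpe times and their block indexes are available. The sketch below shows one possible implementation of the per-voxel lag estimate and of the region-level activation-mode histogram; the variable names and data layout (lists of decoded times and block indexes per voxel) are assumptions made for illustration only.

```python
import numpy as np

def activation_lag(decoded_hro_times, block_onsets, block_indexes):
    """Per-voxel activation lag: mean difference between decoded HRO times
    and the onsets of the stimulation blocks they were matched to."""
    diffs = [t - block_onsets[k] for t, k in zip(decoded_hro_times, block_indexes)]
    return float(np.mean(diffs)) if diffs else np.nan

def activation_mode(region_block_indexes, n_blocks, n_voxels):
    """Region-level activation mode: how often each stimulation block was
    'answered' by the voxels of the region, normalized by the region size."""
    counts = np.zeros(n_blocks)
    for block_indexes in region_block_indexes:      # one list of decoded block indexes per voxel
        for k in block_indexes:
            counts[k] += 1
    return counts / max(n_voxels, 1)

if __name__ == "__main__":
    onsets = np.array([7, 27, 47, 67, 87, 107, 127])        # hypothetical OFF-ON transition scans
    print(activation_lag([9, 30, 49, 70], onsets, [0, 1, 2, 3]))
    print(activation_mode([[0, 1, 2], [1, 2, 3], [2, 3]], n_blocks=7, n_voxels=3))
```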


IV. EXPERIMENTS

The HSMESM-based brain mapping method was applied to synthetic and real epoch-related fMRI data. Because of its widespread use in the fMRI community, the SPM method (SPM2 release) was employed as the main comparator of the HSMESM method.

A. fMRI Data

1) Imaging Procedure: Functional images were acquired on a 2-T whole body S200 Bruker MRI system with a head volume coil, using echo-planar imaging (EPI) with an axial slice orientation (32 slices). Each fMRI data set consisted of a sequence of brain volumes of size 64 x 64 x 32 voxels. The number of intracranial voxels per sequence was about 17 000. Anatomical imaging used a fast spin echo sequence. Prior to HSMESM brain mapping, all fMRI scans of the data set under concern were registered to the first scan using an automated registration algorithm [52]. A Gaussian prefiltering in space and time was then applied to the raw data in order to increase the fMRI SNR.

2) Synthetic Epoch-Related Data: A first fMRI noise data set was acquired from a "null condition" experiment with no imposed stimulation paradigm. From it, three synthetic data sets were derived using fake activation patterns embedded at known locations. All noise-free activation patterns were obtained by convolving the HRF model proposed by the SPM2 software (difference of two Gamma functions) with a binary reference timing function, possibly transformed in shape or in time before convolution. The reference timing function, of which a partial dashed line plot is given in the top row of Fig. 5(a)–(c), is representative of the input activation-baseline paradigm, taking the value 0 when OFF and 1 when ON. It is periodic (5 OFF, 10 ON, 5 OFF) over seven runs of 20 timepoints each. To fulfill the 145-scan length requirement of the noise data set, two and three OFF timepoints were added at the beginning and the end of the timing function, respectively. In each synthetic data set, four cubic areas of size 6 x 6 x 6 voxels were selected within the brain volume as active areas.

The first synthetic data set was designed to illustrate an ideal application case. Indeed, as shown in the top row of Fig. 5(a), four noise-free activation patterns, one per active area, were derived by convolving the HRF model of the SPM2 software with four exact replications (thin line plots) of the timing function (dashed line plot). Thereby, all noise-free activation patterns embedded in this data set conform exactly to what is "expected" as activation patterns in terms of shape and time synchronization.

The second synthetic data set was designed to illustrate shape variability of the hemodynamic response to a stimulation block (violation of the assumption of steady-state dynamics). Four activation patterns, one per active area, were computed by convolving the SPM2 HRF model with four different timing functions derived by shape transformation of the reference timing function [see top row of Fig. 5(b)]. Thereby, shape differences exist in this data set between "expected" activation patterns and "ground truth."

The third synthetic data set was designed to illustrate timing variability between the stimulation paradigm and the neural/hemodynamic response (activation delay). Four noise-free activation patterns were computed by convolving the above HRF model with four delayed versions (delays of 2, 4, 6, or 8 scans) of the reference timing function [see top row of Fig. 5(c)], so that timing differences exist between "expected" and observed activation patterns.

Finally, following the definition of Lange et al. [53], the magnitudes of the added activation patterns at active locations were specified as some positive fraction of the series standard deviation at each location, computed after baseline drift removal. In order to ensure a minimum of 95% activation detection performance for the "ideal" SPMb implementation of SPM (see Section IV-B), this fraction was set to 0.625 in all active areas of the three synthetic data sets, and to 0 outside these regions.
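For illustration, the sketch below builds one noise-free activation pattern of this kind: a canonical double-Gamma HRF (an assumed stand-in for the SPM2 model, with commonly used default parameters) is convolved with the binary 5-OFF/10-ON/5-OFF timing function and scaled against the noise standard deviation. The HRF parameters, the assumed repetition time, and the exact scaling convention are illustrative, not taken from the paper.

```python
import numpy as np
from scipy.stats import gamma

def double_gamma_hrf(tr, duration=32.0, peak=6.0, undershoot=16.0, ratio=6.0):
    """Difference-of-two-Gammas HRF sampled at the scan rate (illustrative parameters)."""
    t = np.arange(0.0, duration, tr)
    hrf = gamma.pdf(t, peak) - gamma.pdf(t, undershoot) / ratio
    return hrf / hrf.sum()

def timing_function(n_runs=7, off=5, on=10, pad_start=2, pad_end=3):
    """Binary reference timing function: (5 OFF, 10 ON, 5 OFF) per run, padded to 145 scans."""
    run = np.concatenate([np.zeros(off), np.ones(on), np.zeros(off)])
    return np.concatenate([np.zeros(pad_start), np.tile(run, n_runs), np.zeros(pad_end)])

def synthetic_pattern(noise_series, tr=2.0, fraction=0.625):
    """Add a noise-free activation pattern scaled to a fraction of the noise standard deviation."""
    # tr: assumed repetition time in seconds (the acquisition TR is not reproduced here)
    x = timing_function()
    pattern = np.convolve(x, double_gamma_hrf(tr))[: len(x)]
    pattern *= fraction * noise_series.std() / max(pattern.max(), 1e-12)
    return noise_series + pattern          # "activated" synthetic voxel time series

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    noise = rng.standard_normal(145)
    print(synthetic_pattern(noise).shape)
```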


Fig. 5. Results on synthetic fMRI data, obtained for (a) the first, (b) the second, and (c) the third synthetic data set. First row: activation patterns used in each data set, plotted before convolution (thin lines), obtained by shape or timing transformation of the input reference timing function (dashed line). Second row: SPMa activation maps (FPR = 0.005). Third row: HSMESM activation maps (FPR = 0.005). Fourth row: SPMb activation maps (FPR = 0.005). Fifth row: ROC curves obtained for the HSMESM, SPMa, and/or SPMb methods (see text for details).

3) Real Epoch-Related Data: Another 119 fMRI data sets were acquired from 17 healthy subjects, 18 to 39 years old, who were asked to perform language processing tasks. Three different stimulation protocols, namely, VERBAL, AUDIO, and VISUAL, were designed to map the cortical areas involved in word finding, auditory, and visual lexical processing, using EXPE 6 [54]. The auditory and visual verbal stimuli used in these procedures were frequently used nouns of the French lexicon, checked for concreteness and imagery [55]. The subjects listened to the auditory stimuli through earphones with high noise attenuation. Visual stimuli and task instructions were presented in Avotec Silent Vision Glasses. Detailed information about the three experimental designs may be found in [56].

B. Results on Synthetic Data

Since the "ground truth" is known for the three synthetic data sets, the activation detection performances of the HSMESM and SPM methods are compared from activation maps and receiver-operating characteristic (ROC) curves. For a better evaluation, two implementations of the SPM method are considered for the second and third synthetic data sets. The first implementation, SPMa, corresponds to a real application case where the ground truth is unknown and, hence, cannot be exploited. It makes use of a single regressor, namely, the HRF model proposed by the SPM2 software convolved with the "expected" reference timing function plotted as a dashed line in the first row of Fig. 5(a)–(c). Conversely, the second implementation of SPM, SPMb, corresponds to the "ideal" application case where the ground truth is known and exploited. That is, the four noise-free activation patterns used to build the second or the third synthetic data set are reused in SPMb as four additional regressors added to the SPMa implementation. Such a double implementation of SPM is unnecessary for the first synthetic data set due to the absence of difference between "expected" and observed activation patterns. Therefore, the SPMa implementation can be considered, for the particular case of the first data set, as an "ideal" application case of SPM, similarly to the SPMb implementation used for the second and third data sets. Let us recall that such considerations are of no concern for the HSMESM method, since the ground truth is learned and no prior HRF model is needed.
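Since detection performance is summarized with ROC curves, the following short sketch illustrates how such a curve can be obtained from a per-voxel detection statistic and a ground-truth activation mask by sweeping the decision threshold. The toy statistic used here is an arbitrary placeholder, not the HSMESM likelihood score or the SPM statistic themselves.

```python
import numpy as np

def roc_curve(statistic, truth, n_thresholds=100):
    """False/true positive rates obtained by thresholding a per-voxel detection statistic."""
    thresholds = np.linspace(statistic.max(), statistic.min(), n_thresholds)
    fpr, tpr = [], []
    for th in thresholds:
        detected = statistic >= th
        tpr.append(np.logical_and(detected, truth).sum() / max(truth.sum(), 1))
        fpr.append(np.logical_and(detected, ~truth).sum() / max((~truth).sum(), 1))
    return np.array(fpr), np.array(tpr)

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    truth = np.zeros(17000, dtype=bool)
    truth[:4 * 6**3] = True                        # four 6x6x6 active areas among ~17000 voxels
    stat = rng.standard_normal(truth.size) + 1.5 * truth    # toy detection statistic
    fpr, tpr = roc_curve(stat, truth)
    print(tpr[np.searchsorted(fpr, 0.005)])        # detection rate at FPR = 0.005
```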


Fig. 6. Results on real fMRI data, obtained for the (a) VERBAL, (b) AUDIO, and (c) VISUAL paradigms. For each map four-tuple (a)–(c): HSMESM activation map (top left), HSMESM activation lag map with its gray level lag scale (top right), SPM activation map (bottom left), and HSMESM in-phase activation map (bottom right) (see text).

Activation detection results obtained for the three synthetic data sets are shown in Fig. 5(a), (b), and (c), respectively. Examples of SPMa, HSMESM, and SPMb activation maps are represented in the second, third, and fourth rows of Fig. 5, for a false positive rate of 0.005. True positives are represented in black, while false positives and false negatives are represented in white. The four active areas in any activation map are arranged according to a lexicographic order, while a top-bottom order is used for the corresponding activation patterns (see top row of Fig. 5). HSMESM, SPMa, and SPMb ROC curves obtained for the three data sets are shown in the bottom row of Fig. 5(a)–(c). Finally, note that in Fig. 5(a) the SPMb activation map and the SPMb ROC curve are absent because of the optimality of the SPMa method for the first data set, as previously explained.

Activation detection results obtained for the first data set demonstrate that the HSMESM method proves very competitive when compared to an "ideal" application case of SPM. A maximum decrease of only 6% in detection performance is observed for the HSMESM method with respect to the (here optimal) SPMa method, and for a low false positive rate of 0.001.

Activation detection results obtained for the second data set demonstrate the relative insensitivity of the HSMESM method to shape variations of the hemodynamic response to a stimulation block. The HSMESM method largely outperforms the SPMa implementation of SPM, especially when the hemodynamic response duration becomes significantly shorter than the stimulation block duration. With respect to the ideal case of SPMb, a maximum fall of only 7% in activation detection performance is observed for the HSMESM method. Note also that, in the HSMESM map shown in the third row of Fig. 5(b), an increase of false negatives is observed only for the fourth active area, that is, where the hemodynamic response to the stimulation block is particularly unsustained and the temporal support of the response is one-half shorter than expected. However, this result is still satisfactory when compared to the first and fourth activation areas in the SPMa activation map of Fig. 5(b).

Activation detection results obtained for the third data set clearly demonstrate the complete insensitivity of the HSMESM method to timing variations of the hemodynamic response with respect to the input stimulation blocked paradigm. HSMESM detection results are excellent whatever the false positive rate and the activation delay. They are equivalent to those obtained by the SPMb method, but without requiring, as in SPMb, any prior knowledge about the delays of the synthetic activation patterns to be detected.

C. Results on Real Data

1) Activation Maps, Activation Lag Maps: Examples of HSMESM mapping results are shown in Fig. 6(a)–(c), for the VERBAL, AUDIO, and VISUAL protocols, respectively. Activation maps obtained with the SPM method are also inserted for comparison. Each group of four maps is composed of one HSMESM activation map, one HSMESM activation lag map, one SPM activation map, and one HSMESM in-phase activation map. The SPM maps were obtained using a standard implementation of the SPM2 software, similar to the SPMa implementation used for the synthetic experiments, using Gaussian random fields and a corrected p-value of 0.05. The HSMESM maps were obtained for a noncorrected p-value of 0.01, due to the absence of spatial oversampling as in SPM. The HSMESM in-phase activation maps are obtained from the corresponding HSMESM activation maps by retaining the activated voxels whose activation lag estimate falls within a prescribed range of scans. Activated brain regions are represented in white, except for the HSMESM in-phase activation maps, where a gray level scale graduated in number of scans is used to represent the local activation lag estimate.

Fig. 6(a) shows the activation maps obtained for the verb generation task (VERBAL), on the horizontal section 16 mm from the AC-PC line in the Talairach and Tournoux atlas. On the SPM map, significant activations are found in the left inferior frontal gyrus (IFG) (BA 45/44) and in the posterior superior temporal gyrus (STG) (BA 22), as expected. The HSMESM in-phase activation map consistently shows activations in the same cortical areas. The HSMESM activation map shows, in addition to the left IFG and STG, delayed activations in the right posterior STG and the mid temporal gyrus (TG), and in the right mid frontal


Fig. 7. Additional fMRI data analysis functionalities based on Viterbi decoding. Top row: activation mode visualizing. (a) "Sustained" activation, (b) "emerging" activation, and (c) "fading" activation observed for the AUDIO paradigm in the same brain region of three subjects. Bottom row: HRF shape analysis. (a)–(c) Mean HRF waveform (thick line) computed from the HRF patterns observed for the same active brain region of three subjects in response to stimulation block 4 (thin line) of the AUDIO protocol.


gyrus (FG), and bilaterally in the internal superior frontal and the cuneus.

Fig. 6(b) shows the activation maps obtained for the auditory verbal processing task (AUDIO), on a horizontal section 20 mm from the AC-PC line. The SPM method shows significant activations in the IFG (BA 46/45), the mid FG (BA 10), and the posterior mid TG (BA 39/19). Here again, the HSMESM in-phase activation map consistently shows activations in the same cortical areas. In addition, the HSMESM activation map shows delayed activations in the right posterior superior and mid TG, the left anterior SFG, and bilaterally in the anterior and posterior cingulate gyrus and the precuneus.

In the visual verbal task (VISUAL), the SPM map of Fig. 6(c) shows, on the horizontal section 16 mm from the AC-PC line, two main activation foci involving the left IFG and STG. These are also detected in the HSMESM in-phase activation map, while the HSMESM activation map shows additional delayed activations, mainly involving the right anterior STG and the postcentral gyrus.

More generally, the HSMESM in-phase activation maps were cross-validated against the SPM ones in the 119 data sets studied (41 VERBAL, 42 AUDIO, and 36 VISUAL). For 117 data sets, the HSMESM in-phase activation maps consistently showed activation in the expected cortical areas. In only two cases, some active brain areas detected by the SPM method were misdetected by the HSMESM method. This was explained by a much lower proportion of in-phase active fMRI signals than of delayed active fMRI signals within these data sets, leading to a brain activation HSMESM essentially representative of the latter signals. Because they were statistically under-represented, the in-phase active signals were misdetected. A simple and effective solution to this problem consisted in constraining, during learning, the durational pdfs attached to the start and final tpe states.

Finally, the HSMESM activation maps, along with the HSMESM activation lag maps, inform about delayed activations with regard to the activation task. As illustrated in Fig. 6, this information can be useful for filtering neural activation according to its delay. It may also be of interest for the study of the shifting in local brain activity when moving from one task to the other, as is the case in the ON-OFF paradigm. As an example, from the delayed activations observed within the right posterior superior and mid TG and the left anterior SFG of Fig. 6(b), one may speculate that they reflect the auditory processing of backward words [56]. Indeed, in the control task of the AUDIO protocol, the subjects are faced with a human voice articulating speech that fits neither canonical phonological nor lexical representations. Such a context may induce a shift of verbal auditory processing toward a more suprasegmental one, relying on the processing of prosodic cues, which may implicate the right auditory associative cortex.

2) Activation Mode Visualizing: The activation mode subplots in the top row of Fig. 7 were obtained for the AUDIO protocol, and for the same active brain region of three subjects. Before comparison, each subplot was normalized by the number of activated voxels in the region. Note that, in all subplots, the first and last neural response rates do not have any physiological sense, since they correspond to the systematic occupancy of the start and final tpe states. The subplot in the top row of Fig. 7(a) indicates a regular level of activation from the beginning to the end of the task of interest. This sustained activity might indicate that the involvement of this brain area is essential for the cognitive processing implicated in the task. In the activation mode subplot in the top row of Fig. 7(b), the activation appears to progressively emerge, as if the recruitment of this cortical area were dependent on the activation state of other components of the neural network. The activation mode shown in the top row of Fig. 7(c) would indicate that the neuronal activation is fading before the end of the task, as if this area were no longer essential to pursuing the task. More generally, the different activation modes shown by different brain areas might be the expression of the recruitment dynamics of different components of the neural network involved in the task.
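The mean HRF waveform described in Section III-F3 and examined next can be computed in a straightforward way: fMRI segments are extracted around the Viterbi-decoded HRO times associated with a given stimulation block, aligned on those times, and averaged. The sketch below illustrates this; the segment length and the data layout are illustrative assumptions.

```python
import numpy as np

def mean_hrf_waveform(signals, decoded_times, decoded_blocks, block, pre=2, post=15):
    """Average the fMRI segments aligned on the decoded HRO times of one stimulation block.

    signals       : dict voxel -> 1-D fMRI time series (in scans)
    decoded_times : dict voxel -> list of Viterbi-decoded HRO times
    decoded_blocks: dict voxel -> list of stimulation block indexes matched to those times
    """
    segments = []
    for voxel, series in signals.items():
        for t, k in zip(decoded_times[voxel], decoded_blocks[voxel]):
            if k == block and t - pre >= 0 and t + post <= len(series):
                segments.append(series[t - pre:t + post])   # segment aligned on the HRO time
    return np.mean(segments, axis=0) if segments else None

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    sig = {v: rng.standard_normal(145) for v in range(3)}
    times = {v: [9, 29, 49] for v in range(3)}
    blocks = {v: [1, 2, 3] for v in range(3)}
    print(mean_hrf_waveform(sig, times, blocks, block=2).shape)
```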


3) HRF Shape Analysis: The bottom row of Fig. 7 shows three subplots of the mean waveform of the HRF obtained for three active brain regions of three subjects, and for block 4 of the AUDIO stimulation paradigm. Before plotting, the HRF patterns related to the active area and to stimulation block 4 were aligned on the time instant 0, based on their HRO detection time. Also, the OFF-ON pattern of stimulation block 4 was superimposed on each subplot by accounting for the mean activation lag of the HRF patterns with respect to the block. Despite an important shape variability of the HRF, one may notice the sharp transient, less variable character of the HRO pattern across subjects, thereby justifying to some extent the use of the HRO pattern in our event-detection-based mapping approach.

D. Type I Error Validation

In a last experiment, a type I error validation was conducted in the "null" case of no imposed experimental paradigm, analyzed with arbitrarily chosen experimental designs. Four brain activation models issued from learning-mapping of synthetic activation patterns (one of the synthetic data sets) or of real activation patterns (one AUDIO, one VERBAL, and one VISUAL data set) were reused to detect activation patterns embedded in the "null" noise data set introduced in Section IV-A2. Because this data set is assumed to be composed of pure noise, activation patterns detected in it were counted as type I errors (false alarms). The ratio of observed-to-expected type I errors was then computed for varying p-values. This ratio is plotted in Fig. 8(a) for the synthetic activation model (top subplot), and for the selected AUDIO, VERBAL, and VISUAL activation models (bottom subplot). Two samples of HSMESM activation maps are also shown in Fig. 8(b), for a p-value of 0.005. They result from the application of the synthetic (top map) and VERBAL (bottom map) activation models to the "null" data set.

Fig. 8. Type I error validation. (a) Ratios of observed-to-expected type I errors for varying p-values (0.001 <= p <= 0.1) obtained by application of the synthetic activation model (top subplot) and the real AUDIO, VERBAL, and VISUAL activation models (bottom subplot) to the "null" data set. (b) Corresponding HSMESM activation maps obtained for the synthetic (top) and VERBAL (bottom) activation models (p = 0.005).

Whatever the synthetic or real experimental designs used, the plots of Fig. 8(a) show a ratio of observed-to-expected type I errors close to 1, never exceeding 1.75, for a p-value ranging from 0.001 to 0.1. Therefore, it should be concluded that the unsupervised HSMESM mapping approach does not overdetect neural activation.
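The observed-to-expected ratio reported above is simple bookkeeping once per-voxel p-values are available for the null data set; the sketch below illustrates it, with uniformly distributed p-values standing in for the HSMESM scores purely for illustration.

```python
import numpy as np

def type_one_error_ratio(p_values, alpha):
    """Ratio of observed to expected type I errors at significance level alpha,
    assuming every voxel of the 'null' data set is pure noise."""
    observed = np.sum(p_values <= alpha)          # voxels wrongly declared active
    expected = alpha * p_values.size              # what a well-calibrated test should give
    return observed / expected

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    p_values = rng.uniform(size=17000)            # placeholder p-values for ~17000 intracranial voxels
    for alpha in (0.001, 0.005, 0.01, 0.05, 0.1):
        print(alpha, round(type_one_error_ratio(p_values, alpha), 2))
```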

V. DISCUSSION

Initial results illustrate the relevance of HSMESMs in unsupervised fMRI brain mapping. In particular, the learning and generalizing abilities of HSMESMs make it possible to automatically capture neural activation variability across time, brain, experiments, and subjects. Beyond these models, the developed method demonstrates the possibility of detecting and analyzing brain functional activation based solely on the fMRI signal transients that are the HROs. The benefit of this strategy is threefold: no prior shape assumption about the HRF is needed; robustness in activation detection is increased if one reasonably considers the HRO pattern as the most significant and invariant feature of the HRF; and the use of signal processing techniques effective in detecting and characterizing transients in low SNR conditions, as encountered in fMRI, is made possible. To our knowledge, no similar analysis strategy has been reported in the related literature. Activation detection is commonly performed directly on the fMRI signal rather than on a limited set of time-localized events detected in the signal. Several wavelet-based techniques have also been proposed for activation detection in fMRI, but they usually rely on orthonormal wavelet bases, while statistical inference is performed in the wavelet domain [57]–[60]. Besides, it should be stressed that the proposed approach is not CWT-dependent but rather generic from an event detection and characterization point of view. Detecting brain functional activation by aligning a sequence of paradigm OFF-ON transitions and a sequence of HRO events has been considered previously by Thoraval et al. [50]. In their preliminary work, time coupling is scored in terms of edit distance between event sequences, with the major drawback of requiring empirical cost functions in the computation of the edit distance. This problem is solved here within a probabilistic framework, by changing the deterministic modeling of the paradigm in favor of an HSMESM-based one. Let us also note that few (hidden) Markovian modeling approaches of the fMRI signal have been reported in the literature. In [61], Thirion et al. model the fMRI time-series as a first-order Markov chain, and derive two activation measures from the estimates of the transition matrices associated with each stimulus condition. This modeling approach is not strictly a hidden Markovian one, and the Markovian property is assumed between fMRI signal samples rather than between tpe HRO events, as in our approach. Højen-Sørensen et al. propose in [62] a hidden Markov modeling of the fMRI signal, which makes use of all fMRI signal samples. It implements a model of the fMRI signal that takes into account, by means of hyperparameters, baseline drift and HRF shape. Also, inference is based on a full-Bayesian methodology that requires time-consuming Monte Carlo sampling techniques to estimate the posteriors, whereas in our HSMESM approach, inference is based on adapted versions of algorithms standardly used in an H(S)MM framework [42]. In the mapping approach presented here, a standard HMM- or HSMM-based modeling would, however, have been inappropriate because of the conflicting transient


detection-oriented character of the approach, and the piecewise stationarity assumption required for the observable process of these models. HMMs and HSMMs are better suited for the analysis of random processes that are segmental in nature rather than event-based, as are the HSMESMs. HSMESMs have been introduced in the context of automatic recognition of cardiac arrhythmias from electrocardiographic data by Thoraval et al. [41]. Similarly to the proposed approach, their analysis relies on the detection and the characterization of nonstationary time-localized events, the electrocardiographic waves, whose temporal arrangement is informative and can be statistically modeled by HSMESMs. So far, only a two-condition blocked paradigm and a single-HSMESM-based mapping approach have been considered. When dealing with more sophisticated blocked paradigms, involving more stimulus conditions, a multiple-HSMESM-based approach could be used. These HSMESMs could be representative of specific condition types, or of any combination of them. They could be built from OFF-ON transition subsequences of interest derived from the original stimulation paradigm. The application of the method in event-related fMRI can also reasonably be envisaged. The event-based character of the modeling technique and its absence of prior shape assumption for the HRF are encouraging factors. The modeling methodology used for event-related designs remains similar to the one used for blocked paradigms. In epoch-related fMRI, one HSMESM state was assumed per expected HRO, each HRO marking the onset of the hemodynamic response to a stimulation block (rapid succession of stimuli). In event-related fMRI, one HSMESM state is also assumed per expected HRO, but with each HRO marking here the onset of the hemodynamic response to a single stimulus, considered in isolation. In the particular case of rapid event-related fMRI, epoch- and event-related modeling principles can be mixed to account for single stimulus-induced HROs and for HRO overlaps that may result from a short inter-stimulus interval. As an example, in order to model two close stimuli, three HSMESM states are needed. Two of them model the two HROs (HRO1 and HRO2) that could be detected in the absence of HRO overlap. The third one accounts for the possible overlap of HRO1 and HRO2 not distinguished by detection. From this three-state subspace, five state trajectories can be allowed by the examiner: HRO1 followed by HRO2, HRO1 alone, HRO2 alone, overlap of HRO1 with HRO2, and absence of stimulus-induced activation. The modeling principle described here can easily be generalized to more than two stimuli, thereby increasing the number of states used as well as the combinatorics of state trajectories allowed in the brain activation HSMESM. In order to clarify the presentation of the HSMESM mapping approach, some modeling aspects were voluntarily omitted or simplified. For instance, one may wonder how a brain activation HSMESM relates to the observed data, that is, how to choose the observation pdfs, and how these choices impact results. Similarly, one may wonder how covariates, such as age or gender, can be added into the model. In response to the first question, the observation pdfs were selected to be simple, namely, one-dimensional Gaussians. But in practice, multiple modeling alternatives are possible for the observation. Their choice depends closely on the nature of the preprocessing
In order to clarify the presentation of the HSMESM mapping approach, some modeling aspects were deliberately omitted or simplified. For instance, one may wonder how a brain activation HSMESM relates to the observed data, that is, how the observation pdfs should be chosen and how these choices impact the results. Similarly, one may wonder how covariates, such as age or gender, can be added to the model. In response to the first question, the observation pdfs were selected to be simple, namely one-dimensional Gaussians. In practice, however, multiple modeling alternatives are possible for the observation. Their choice depends closely on the nature of the preprocessing carried out on the raw input data. It also depends on the form of the histograms of the observation conditionally on the HSMESM states. For example, for a multimodal distribution of the observation, the most common observation model is a mixture of Gaussians. A proper evaluation of how the preprocessing and observation modeling choices impact results would, however, require an extensive study that goes well beyond the scope of this paper. Such a study can be compared to the effort of the automatic speech recognition community in search of an accurate representation of the speech signal. There is no doubt that it would lead to an improvement in activation detection performance.
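
As a minimal numerical illustration of these two observation models, the sketch below evaluates a one-dimensional Gaussian pdf and a Gaussian-mixture pdf for a scalar HRO observation. The parameter values are placeholders; in the model itself they would be estimated during the learning step.

```python
import math

def gaussian_pdf(o, mean, var):
    """One-dimensional Gaussian observation pdf of a state."""
    return math.exp(-0.5 * (o - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def gmm_pdf(o, weights, means, variances):
    """Gaussian-mixture observation pdf for a multimodal observation."""
    return sum(w * gaussian_pdf(o, m, v)
               for w, m, v in zip(weights, means, variances))

# Placeholder parameters for one hypothetical HSMESM state.
print(gaussian_pdf(0.8, mean=1.0, var=0.25))
print(gmm_pdf(0.8, weights=(0.6, 0.4), means=(0.5, 1.5), variances=(0.1, 0.2)))
```
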
In response to the second question, the modeling of patient information, such as age or gender, can be envisaged through the use of additional random variables that condition the statistical behavior of the model (duration and/or observation pdfs, transition probability matrix). This also implies modifying the dependence graph of the model accordingly. Such a modeling step is halfway between HSMESMs and dynamic Bayesian networks [63]. More generally, the brief replies to the two modeling questions above illustrate the modeling abilities of HSMESMs. As a hidden Markovian modeling technique, HSMESMs benefit from years of experience in signal modeling and analysis. As a particular instance of dynamic Bayesian networks, they have real capacities of extension.
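
A deliberately simplistic sketch of this idea is given below: the observation parameters of a state are merely indexed by the value of a covariate, which is one crude way of adding a conditioning variable to the dependence graph. The age threshold, the grouping, and the parameter values are hypothetical.

```python
# Hypothetical covariate-conditioned parameters: one parameter set per
# covariate value (here a coarse age group) and per HSMESM state. The same
# indexing could condition the duration pdfs or the transition matrix.
OBS_PARAMS = {
    "young": {"state_1": {"mean": 1.0, "var": 0.20}},
    "older": {"state_1": {"mean": 0.8, "var": 0.30}},
}

def observation_params(state, age):
    """Return the observation-pdf parameters of `state` for a given age."""
    group = "young" if age < 40 else "older"   # purely illustrative split
    return OBS_PARAMS[group][state]

print(observation_params("state_1", age=62))   # -> {'mean': 0.8, 'var': 0.3}
```
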

Some comments about the HRO detection step should also be added. This step clearly has a major influence on the HSMESM mapping results. In particular, HRO overdetection increases type I errors, since more event sequences with numerous detected events match the brain activation model under consideration. HRO underdetection also increases the false alarm rate: when a large proportion of the HROs of interest are missed, learning leads to a weakly constrained activation model (with numerous state transitions allowed), and subsequently to an increasing number of false alarms. In order to address both problems, other HRO detectors could be compared to the CWT, the literature on abrupt change detection being considerable (see for instance [64]). Robust detection of HROs could also be addressed within a distributed detection framework [65], with multiple HRO detectors operating in parallel at the voxel level, followed by a fusion center that combines their decisions. Another, somewhat more natural, solution would consist in introducing neighboring spatial information into the measure of local neural activity. It implies extending the HSMESM formalism to account for multiple event sequences, and for time asynchronism between HROs detected on separate channels but related to the same stimulation block. This latter solution is currently under development in our group.
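
As an illustration of the distributed detection idea, the sketch below fuses the binary, per-scan decisions of several hypothetical HRO detectors with a simple k-out-of-n rule. Both the detectors and the fusion rule are placeholders; the design of such fusion rules is discussed in general terms in [65].

```python
import numpy as np

def fuse_decisions(decisions, k=2):
    """k-out-of-n fusion of per-scan binary decisions from several detectors.

    `decisions` is an (n_detectors, n_scans) 0/1 array; a scan is declared an
    HRO onset when at least `k` detectors agree.
    """
    decisions = np.asarray(decisions)
    return (decisions.sum(axis=0) >= k).astype(int)

# Three hypothetical detectors run in parallel on the same voxel time course.
d = np.array([[0, 1, 0, 1, 0, 0],
              [0, 1, 0, 0, 0, 1],
              [0, 1, 0, 1, 0, 0]])
print(fuse_decisions(d, k=2))   # -> [0 1 0 1 0 0]
```
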
Aside from HRO over- or underdetection, task-related motion artifacts that mimic “true” activation may also increase type I errors. This problem is not specific to the proposed mapping approach [66], [67]. Unfortunately, to date, no perfectly accurate motion correction procedure is available. The current HSMESM solution to this problem is an a posteriori analysis of the detected activations from Viterbi decoding information, with final decision-making by the examiner. In practice, activated voxels are grouped into active regions; then, as an aid to decision-making, a set of features, including the mean fMRI signal, the mean HRO pattern, and the activation mode, is computed for each active region. To complement this aid, a preclassification procedure of the detected areas into “motion-related” and “not motion-related” active areas is currently under study. It is based on a supervised learning of the active areas labeled “motion-related” by the examiner across brain fMRI experiments.
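
The kind of region-level summary used as an aid here can be prototyped as sketched below: detected voxels are grouped into connected regions and a mean time course is computed for each region. The use of scipy.ndimage.label, the connectivity, and the restriction to a single feature are illustrative choices, not a description of the actual implementation.

```python
import numpy as np
from scipy import ndimage

def region_mean_signals(activation_map, fmri_data):
    """Group activated voxels into connected regions and average their signals.

    activation_map : (X, Y, Z) binary array of detected voxels.
    fmri_data      : (X, Y, Z, T) array of fMRI time courses.
    Returns a dict {region_label: mean time course over the region}.
    """
    labels, n_regions = ndimage.label(activation_map)
    features = {}
    for r in range(1, n_regions + 1):
        mask = labels == r
        features[r] = fmri_data[mask].mean(axis=0)   # (T,) mean fMRI signal
    return features

# Tiny synthetic example: two activated blobs in a 4x4x1 volume, 10 scans.
act = np.zeros((4, 4, 1), dtype=int)
act[0, 0, 0] = act[0, 1, 0] = 1
act[3, 3, 0] = 1
data = np.random.default_rng(1).normal(size=(4, 4, 1, 10))
print({r: ts.shape for r, ts in region_mean_signals(act, data).items()})
```
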
All the experiments described throughout the paper were performed on a 1.5 GHz Pentium IV platform. The overall HSMESM-based brain mapping procedure was implemented in C/C++. Preprocessing of the input fMRI data set requires about 30 s, with 20 s for the automatic selection of the highest and lowest scales of the CWT decomposition and 10 s to compute the output observation sequence set. The unsupervised learning-mapping procedure, and the learning step in particular, is the most time-consuming task, spatial registration excluded. It takes no more than 1 min 20 s for a p-value of 0.001, 1 min 40 s for a p-value of 0.01, and 4 min for a p-value of 0.05 to work out the final activation map. On average, a total time of 2 min was needed to work out an activation map from an input fMRI data set, spatial registration excluded. This short processing time is attributable to the low complexity of an HSMESM implementation compared with that of an HMM or HSMM.

VI. CONCLUSION

In this paper, a new statistical method for learning and mapping active brain functional MRI signals has been described. Neural activation detection is formulated at the voxel level in terms of time coupling between the hidden sequence of task-induced neural activations and its hemodynamic correlate, the sequence of HROs observed in the fMRI signal. Both event sequences and their time coupling were modeled in a single HSMESM. The hidden part of the model statistically accounts for the timing of the neural events. Its observable counterpart accounts for the short-term statistical characteristics of the hemodynamic events. Solving for the standard HSMESM problems of Evaluation, Decoding, and Learning made it possible to develop, within a probabilistic framework, the following fMRI data analysis functionalities: activation brain mapping, activation lag mapping, activation mode visualizing, and HRF analysis. In particular, activation brain mapping was completely automated thanks to the learning and generalizing abilities of the HSMESMs, so that no template basis function or prior shape assumption for the response of the activated voxels was needed.

An application of the HSMESM brain mapping method to synthetic and real epoch-related fMRI data was presented. Activation detection results obtained on synthetic data demonstrate the superiority of the HSMESM method with respect to a standard implementation of the statistical parametric mapping (SPM) approach. They are also very close, sometimes equivalent, to those obtained with an “ideal” implementation of SPM. The HSMESM method appears clearly insensitive to timing variations of the hemodynamic response. It also exhibits low sensitivity to fluctuations of its shape (unsustained activation during the task). Activation detection results obtained on real data compete with those obtained by SPM, but without requiring, as in SPM, any prior definition of the expected activation patterns. More generally, all the conducted experiments demonstrate the relevance of HSMESMs in fMRI brain mapping.
They confirm and validate the overall strategy that consists in focusing the analysis on the transient, time-localized events that are the HROs. The statistical character of these models, along with their learning and generalizing abilities, is of particular interest when dealing with strong variabilities of the active fMRI signal across time, space, experiments, and subjects.

REFERENCES

[1] P. Filzmoser, R. Baumgartner, and E. Moser, “A hierarchical clustering method for analyzing functional MR images,” Magn. Reson. Imag., vol. 17, pp. 817–826, 1999.
[2] C. Goutte, P. Toft, E. Rostrup, F. A. Nielsen, and L. K. Hansen, “On clustering fMRI time series,” NeuroImage, vol. 9, pp. 298–310, 1999.
[3] R. Baumgartner, G. Scarth, C. Teichtmeister, R. Somorjai, and E. Moser, “Fuzzy clustering of gradient-echo functional MRI in the human visual cortex. Part I: Reproducibility,” Magn. Reson. Imag., vol. 7, pp. 1094–1101, 1997.
[4] L. K. Hansen, J. Larsen, F. A. Nielsen, S. C. Strother, E. Rostrup, R. Savoy, C. Svarer, and O. B. Paulson, “Generalizable patterns in neuroimaging: How many principal components?,” NeuroImage, vol. 9, pp. 534–544, 1999.
[5] S. H. Lai and M. Fang, “A novel local PCA-based method for detecting activation signals in fMRI,” Magn. Reson. Imag., vol. 17, pp. 827–836, 1999.
[6] V. D. Calhoun, T. Adali, G. D. Pearlson, and J. J. Pekar, “Spatial and temporal independent component analysis of functional MRI data containing a pair of task-related waveforms,” Human Brain Mapping, vol. 13, pp. 43–53, 2001.
[7] M. J. McKeown, S. Makeig, G. G. Brown, T. P. Jung, S. S. Kindermann, A. J. Bell, and T. J. Sejnowski, “Analysis of fMRI data by blind separation into independent spatial components,” Human Brain Mapping, vol. 6, pp. 168–188, 1998.
[8] P. A. Bandettini, A. Jesmanowicz, E. C. Wong, and J. S. Hyde, “Processing strategies for time-course data sets in functional MRI of the human brain,” Magn. Reson. Med., vol. 30, pp. 161–173, 1993.
[9] K. J. Friston, P. Jezzard, and R. Turner, “Analysis of functional MRI time-series,” Human Brain Mapping, vol. 1, pp. 153–171, 1994.
[10] A. M. Dale and R. L. Buckner, “Selective averaging of rapidly presented individual trials using fMRI,” Human Brain Mapping, vol. 5, pp. 329–340, 1997.
[11] B. R. Rosen, R. L. Buckner, and A. M. Dale, “Event-related functional MRI: Past, present, and future,” Proc. Nat. Acad. Sci., vol. 95, pp. 773–780, 1998.
[12] N. Lange and S. L. Zeger, “Non-linear Fourier time series analysis for human brain mapping by functional magnetic resonance imaging,” Appl. Stat., J. Roy. Statist. Soc., Ser. C, vol. 46, pp. 1–30, 1997.
[13] J. C. Rajapakse, F. Kruggel, J. M. Maisog, and D. Y. von Cramon, “Modeling hemodynamic response for analysis of functional MRI time-series,” Human Brain Mapping, vol. 6, pp. 283–300, 1998.
[14] K. J. Friston and J. Ashburner et al., “SPM 97 course notes,” Wellcome Dept. Cognitive Neurol., Univ. College London, London, U.K., vol. 13, 1997.
[15] D. L. Schacter, R. L. Buckner, W. Koustaal, A. M. Dale, and B. R. Rosen, “Late onset of anterior prefrontal activity during true and false recognition: An event-related study,” NeuroImage, vol. 6, pp. 259–269, 1997.
[16] A. T. Lee, G. H. Glover, and C. H. Meyer, “Discrimination of large venous vessels in time-course spiral blood-oxygen-level-dependent magnetic resonance functional neuroimaging,” Magn. Reson. Med., vol. 33, pp. 745–754, 1995.
[17] G. K. Aguirre, E. Zarahn, and M. D’Esposito, “The variability of human BOLD hemodynamic responses,” NeuroImage, vol. 8, pp. 360–369, 1998.
[18] G. H. Glover, “Deconvolution of impulse response in event-related BOLD fMRI,” NeuroImage, vol. 9, pp. 416–429, 1999.
[19] F. M. Miezin, L. Macotta, J. M. Ollinger, S. E. Petersen, and R. L. Buckner, “Characterizing the hemodynamic response: Effects of presentation rate, sampling procedure, and the possibility of ordering brain activity based on relative timing,” NeuroImage, vol. 11, pp. 735–759, 2000.
[20] G. M. Boynton, S. A. Engel, G. H. Glover, and D. J. Heeger, “Linear systems analysis of fMRI in human V1,” J. Neurosci., vol. 16, pp. 4207–4221, 1996.
[21] M. D. Robson, J. L. Dorosz, and J. C. Gore, “Measurements of the temporal fMRI response of the human auditory cortex to trains of tones,” NeuroImage, vol. 7, pp. 185–198, 1998.
[22] A. L. Vasquez and D. C. Noll, “Nonlinear aspects of the BOLD response in functional MRI,” NeuroImage, vol. 7, pp. 108–118, 1998.
[23] N. K. Logothetis, J. Pauls, M. Augath, T. Trinath, and A. Oeltermann, “Neurophysiological investigation of the basis of the fMRI signal,” Nature, vol. 412, pp. 150–157, 2001.
[24] P. A. Bandettini and L. G. Ungerleider, “From neuron to BOLD: New connections,” Nature Neurosci., vol. 4, pp. 864–866, 2001.
[25] O. J. Arthurs and S. Boniface, “How well do we understand the neural origins of the fMRI BOLD signal,” Trends Neurosci., vol. 25, pp. 27–31, 2002.
[26] R. Turner and R. J. Ordidge, “Technical challenges of functional magnetic resonance imaging,” IEEE Eng. Med. Biol. Mag., vol. 19, no. 5, pp. 42–54, Sep.–Oct. 2000.
[27] P. Ciuciu, J.-B. Poline, G. Marrelec, J. Idier, Ch. Pallier, and H. Benali, “Unsupervised robust nonparametric estimation of the hemodynamic response function for any fMRI experiment,” IEEE Trans. Med. Imag., vol. 22, no. 10, pp. 1235–1251, Oct. 2003.
[28] C. R. Genovese, “A Bayesian time-course model for functional magnetic resonance imaging data,” J. Am. Statist. Assoc., vol. 95, pp. 691–719, 2000.
[29] C. Gössl, L. Fahrmeir, and D. P. Auer, “Bayesian modeling of the hemodynamic response function in BOLD fMRI,” NeuroImage, vol. 14, pp. 140–148, 2001.
[30] K. J. Friston, “Bayesian estimation of dynamical systems: An application to fMRI,” NeuroImage, vol. 16, pp. 513–530, 2002.
[31] M. Svensén, F. Kruggel, and D. Y. von Cramon, “Probabilistic modeling of single-trial fMRI data,” IEEE Trans. Med. Imag., vol. 19, no. 1, pp. 25–35, Jan. 2000.
[32] C. Goutte, F. A. Nielsen, and L. K. Hansen, “Modeling the hemodynamic response in fMRI using smooth FIR filters,” IEEE Trans. Med. Imag., vol. 19, no. 12, pp. 1188–1201, Dec. 2000.
[33] C. Gössl, D. P. Auer, and L. Fahrmeir, “Dynamic models in fMRI,” Magn. Reson. Med., vol. 43, pp. 72–81, 2000.
[34] X. Descombes, F. Kruggel, and D. Y. von Cramon, “Spatio-temporal fMRI analysis using Markov random fields,” IEEE Trans. Med. Imag., vol. 17, no. 6, pp. 1028–1039, Dec. 1998.
[35] N. V. Hartvig and J. L. Jensen, “Spatial mixture modeling of fMRI data,” Human Brain Mapping, vol. 11, pp. 233–248, 2000.
[36] M. Smith, B. Pütz, D. Auer, and L. Fahrmeir, “Assessing brain activity through spatial Bayesian variable selection,” NeuroImage, vol. 20, pp. 802–815, 2003.
[37] J. Kershaw, B. A. Ardekani, and I. Kanno, “Application of Bayesian inference to fMRI data analysis,” IEEE Trans. Med. Imag., vol. 18, no. 12, pp. 1138–1153, Dec. 1999.
[38] W. Penny, S. Kiebel, and K. J. Friston, “Variational Bayesian inference for fMRI time series,” NeuroImage, vol. 19, pp. 722–741, 2003.
[39] K. J. Friston, W. Penny, C. Phillips, S. Kiebel, G. Hinton, and J. Ashburner, “Classical and Bayesian inference in neuroimaging: Theory,” NeuroImage, vol. 16, pp. 465–483, 2002.
[40] L. Thoraval. (2002) Hidden Semi-Markov Event Sequence Models. Université Louis Pasteur Strasbourg I. [Online]. Available: http://picabia.ustrasbg.fr/lsiit/perso/thoraval.htm
[41] L. Thoraval, G. Carrault, and J. J. Bellanger, “Heart signal recognition by hidden Markov models: The ECG case,” Meth. Inform. Med., vol. 33, pp. 10–14, 1994.
[42] L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proc. IEEE, vol. 77, no. 2, pp. 257–286, Feb. 1989.
[43] S. Faisan, L. Thoraval, J.-P. Armspach, and F. Heitz, “Hidden semi-Markov event sequence models: Application to brain functional MRI sequence analysis,” presented at the IEEE Int. Conf. Image Processing (ICIP), Rochester, NY, Sep. 2002.
[44] ——, “Unsupervised learning and mapping of brain fMRI signals based on hidden semi-Markov event sequence models,” in Medical Image Computing and Computer Assisted Intervention (MICCAI), Montréal, QC, Canada, Nov. 2003, pp. 75–82.
[45] J. D. Ferguson, “Variable duration models for speech,” in Proc. Symp. Application of Hidden Markov Models to Text and Speech, 1980, pp. 143–179.
[46] M. Russell and R. Moore, “Explicit modeling of state occupancy in hidden Markov models for automatic speech recognition,” in Proc. ICASSP, 1985, pp. 5–8.
[47] S. E. Levinson, “Continuously variable duration hidden Markov models for speech analysis,” Comput. Speech Language, vol. 1, no. 1, pp. 29–45, Mar. 1986.
[48] G. D. Forney, “The Viterbi algorithm,” Proc. IEEE, pp. 268–278, 1973.
[49] A. Aldroubi and M. Unser, Wavelets in Medicine and Biology. Boca Raton, FL: CRC, 1996.
[50] L. Thoraval, J.-P. Armspach, and I. Namer, “Analysis of brain functional MRI time series based on the continuous wavelet transform and stimulation-response coupling distance,” in Proc. MICCAI Conf., Utrecht, The Netherlands, Oct. 2001, pp. 881–888.
[51] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification. New York: Wiley, 2000.
[52] C. Nikou, F. Heitz, J.-P. Armspach, I.-J. Namer, and D. Grucker, “Registration of MR/MR and MR/SPECT brain images by fast stochastic optimization of robust voxel similarity measures,” NeuroImage, vol. 8, pp. 30–43, 1998.
[53] N. Lange, S. C. Strother, J. R. Anderson, F. A. Nielsen, A. P. Holmes, T. Kolenda, R. Savoy, and L. K. Hansen, “Plurality and resemblance in fMRI data analysis,” NeuroImage, vol. 10, pp. 282–303, 1999.
[54] C. Pallier, E. Dupoux, and X. Jeannin, “EXPE: An expandable programming language for on-line psychological experiments,” Behavior Res. Meth., Instrum., Comput., vol. 29, no. 3, pp. 322–327, 1997.
[55] A. Content, P. Mousty, and M. Radeau, “Une base de données lexicales informatisée pour le français écrit et parlé,” Ann. Psychol., vol. 90, pp. 551–566, 1997.
[56] M. N. Metz-Lutz, I. J. Namer, D. Gounot, C. Kleitz, J. P. Armspach, and P. Kehrli, “Language functional neuro-imaging changes following focal left thalamic infarction,” NeuroReport, vol. 11, no. 13, pp. 2907–2912, 2000.
[57] U. Ruttimann, M. Unser, R. Rawlings, D. Rio, N. Ramsey, V. Mattay, D. Hommer, J. Frank, and D. Weinberger, “Statistical analysis of functional MRI data in the wavelet domain,” IEEE Trans. Med. Imag., vol. 17, no. 2, pp. 142–154, Apr. 1998.
[58] M. J. Brammer, “Multidimensional wavelet analysis of functional magnetic resonance images,” Human Brain Mapping, vol. 6, pp. 378–382, 1998.
[59] M. Desco, J. A. Hernandez, A. Santos, and M. Brammer, “Multiresolution analysis in fMRI: Sensitivity and specificity in the detection of brain activation,” Human Brain Mapping, vol. 14, no. 1, pp. 16–27, 2001.
[60] F. G. Meyer, “Wavelet-based estimation of semiparametric generalized linear model of fMRI time-series,” IEEE Trans. Med. Imag., vol. 22, no. 3, pp. 315–322, Mar. 2003.
[61] B. Thirion and O. Faugeras, “Revisiting nonparametric activation detection on fMRI time series,” in Proc. IEEE Workshop on Math. Meth. in Biomed. Image Analysis, Kauai, HI, Dec. 2001, pp. 121–128.
[62] P. A. D. F. R. Højen-Sørensen, C. E. Rasmussen, and L. K. Hansen, “Bayesian modeling of fMRI time series,” S. A. Solla, T. K. Leen, and K.-R. Müller, Eds. Cambridge, MA: MIT Press, 2000, pp. 754–760.
[63] F. V. Jensen, Bayesian Networks and Decision Graphs. Berlin, Germany: Springer, 2001.
[64] M. Basseville and I. Nikiforov, Detection of Abrupt Changes: Theory and Application, ser. Information and System Science. Upper Saddle River, NJ: Prentice-Hall, 1993.
[65] R. Viswanathan and P. K. Varshney, “Distributed detection with multiple sensors: Part I—Fundamentals,” Proc. IEEE, vol. 85, no. 1, pp. 54–63, Jan. 1997.
[66] J. V. Hajnal, I. R. Young, and G. M. Bydder, “Contrast mechanisms in functional MRI of the brain,” in Advanced MR Imaging Techniques, W. G. Bradley Jr. and G. M. Bydder, Eds. London, U.K.: Martin Dunitz Ltd., 1997, pp. 195–207.
[67] N. A. Thacker, E. Burton, A. J. Lacey, and A. Jackson, “The effects of motion on parametric fMRI analysis techniques,” Physiol. Meas., vol. 20, pp. 251–263, 1999.