John Benjamins Publishing Company
This is a contribution from The Mental Lexicon 7:1 © 2012. John Benjamins Publishing Company This electronic file may not be altered in any way. The author(s) of this article is/are permitted to use this PDF file to generate printed copies to be used by way of offprints, for their personal use only. Permission is granted by the publishers to post this file on a closed server which is accessible to members (students and staff) only of the author’s/s’ institute, it is not permitted to post this PDF on the open internet. For any other use of this material prior written permission should be obtained from the publishers or through the Copyright Clearance Center (for USA: www.copyright.com). Please contact
[email protected] or consult our website: www.benjamins.com Tables of Contents, abstracts and guidelines are available at www.benjamins.com
Squibs, commentaries and methodological considerations
Localizing the component processes of lexical access using modern neuroimaging techniques
Jed A. Meltzer
Rotman Research Institute — Baycrest Centre
Neuroimaging plays an increasingly important role in the investigation of all aspects of human cognition, including language. Historically, experimental psychology and neuroimaging relied on very different techniques, as neuroimaging studies required comparisons between different tasks rather than manipulation of conditions within a single task, as is standard in behavioural experiments. However, methodology has advanced in the past decade such that many classic behavioural paradigms can now be employed in studies that measure brain activity. We review the technical foundations of conducting studies on single-trial brain responses, using event-related fMRI and electrophysiological recordings. We focus in particular on studies of picture naming, illustrating how the same techniques that were originally used to define temporal processing stages in reaction time studies can now be applied to brain imaging studies to reveal the neural localization of those stages.

Keywords: fMRI, MEG, naming, semantic, phonological, event-related, priming
The Mental Lexicon 7:1 (2012), 91–118. doi 10.1075/ml.7.1.05mel issn 1871–1340 / e-issn 1871–1375 © John Benjamins Publishing Company

Introduction

The explosive growth of neuroimaging as a scientific paradigm has attracted intense enthusiasm along with criticism. While philosophical objections to the “phrenological” nature of modern cognitive neuroscience are nothing new, even those researchers who accept the principle of localization of function can find plenty of reasons to be skeptical about neuroimaging findings. For most of its history, the entire enterprise of functional neuroimaging has been critically dependent on the so-called “logic of cognitive subtraction.” In its most basic form, this paradigm involves testing for the neural localization of a putative cognitive function by measuring brain activity during two contrasting tasks: an “active task” that includes the function of interest, and a “baseline task” that does not, but is well-matched on other factors.
While the subtractive paradigm has its advantages and disadvantages, one of the major obstacles to the acceptance of functional imaging in mainstream experimental psychology may be the fundamental differences in experimental design between imaging and behavioural studies that purport to study the same underlying functions. Several critiques of neuroimaging have focused on the problems involved in inferring cognitive functions from comparisons between different tasks (Bub, 2000; Newman, Twieg, & Carpenter, 2001; Stark & Squire, 2001). In a classical behavioural experiment, one ideally sticks to a single set of task instructions while manipulating the nature of individual trials, often without bringing the manipulation to the subjects’ conscious awareness. Historically, a reviewer wishing to synthesize information from both camps needed to look for broad parallels between two very different experimental approaches. In recent years, however, the gap between neuroimaging and behavioural studies has narrowed considerably, due to methodological advances in the analysis of imaging data. To illustrate some sophisticated techniques that have emerged, we will focus here on a subject of central interest in lexical research: word production.

In two large-scale meta-analytic reviews, Indefrey and Levelt (2000; 2004) surveyed the existing research on the neural correlates of lexical access in word production. To delineate the cognitive stages involved, they relied mainly on chronometric studies that explored the effects of various trial-level manipulations on verbal reaction time (RT). Experimental techniques included priming, picture-word interference, and parametric manipulation of lexical variables in words to be produced. To localize the brain regions responsible for the various stages, they classified dozens of neuroimaging studies (58 in 2000, 82 in 2004) as to the cognitive task used, and the cognitive components of word production theoretically involved in each task.
The overwhelming majority of these studies employed block-design comparisons between different tasks, as opposed to manipulations on the single-trial level. Thus, the chronometric delineation of stages and neural localization of those stages depended on very different kinds of logic. This is no longer the state of affairs in contemporary neuroimaging. The emergence of reliable methods for measuring neural activity evoked by single trials has made the same manipulations available to neuroimagers that have traditionally been used in behavioural experiments. Whereas RT is the most common dependent variable in behavioural measurements, the neural response to individual trials, averaged within conditions, can now be subjected to the same scrutiny. Furthermore, activity can be probed in multiple regions throughout the brain simultaneously, up to the limits of spatial resolution afforded by the technique. Here, we will review several methods used in design and analysis of single-trial neuroimaging experiments, illustrating their application in recent studies of lexical access and word production. We will not attempt to review the expansive literature on the subject,
but will highlight examples that serve well to illustrate the methodological progress in this field.
Stages of word production

Although we do not aim here to comprehensively review the behavioural evidence for distinct stages of word production, a brief overview is necessary for the reader to understand the motivation of the imaging studies that we will describe. Several detailed models of word production have been proposed; almost all of them agree on the broad architecture of the system, including the major stages and their order. Disagreements focus mainly on the degree of interactivity and feedback between the stages (Dell, Schwartz, Martin, Saffran, & Gagnon, 1997; Glaser, 1992; Humphreys, Riddoch, & Quinlan, 1998; Levelt, 2001; Levelt, Roelofs, & Meyer, 1999). Word production, broadly defined, is an aspect of numerous laboratory language tasks, but the most influential task has been picture naming. Most models include the following stages:

1) Conceptual preparation. Theorists generally agree that the first stage of picture naming is visual recognition of an object, preceding the retrieval of its name. This is an important consideration for interpretation of neuroimaging data, as activation induced by picture naming may relate to visual recognition processes rather than language. It is commonly assumed that the process of visual recognition in picture naming is analogous to the formation of an idea to be expressed in more natural forms of speech; that is, a conceptual preparation stage that precedes word selection.

2) Lexical-semantic selection. Upon completion of this stage, a speaker has selected a single word that he or she intends to produce, but has not yet retrieved the word’s phonological form. There is some evidence to suggest that syntactic information about the intended word is already available at this point (Schriefers, 1993; van Turennout, Hagoort, & Brown, 1997). An activated lexical entry including syntactic and semantic properties is termed a lemma. Occasionally, a speaker will get stuck at this stage, resulting in a “tip-of-the-tongue” state (Vigliocco, Antonini, & Garrett, 1997). Although the existence of a syntactically specified “lemma” level distinct from a lexical-semantic representation is controversial (e.g., Caramazza, 1997), most models allow for the retrieval of a semantic representation corresponding to a single word but prior to recall of its phonological form.

3) Phonological retrieval. In this stage, the sequence of phonemes to be uttered is retrieved from memory, leading to articulation. This stage may be further
subdivided into various serial processes, including phonemic retrieval, syllabification, and articulatory motor planning (Levelt et al., 1999). However, the largest distinction of experimental interest is between the lexical-semantic and phonological stages. We will concentrate on imaging studies that have used techniques inspired by experimental psychology to dissociate the neural correlates of these major stages.
Chronometric dissociation of stages

One of the most influential paradigms in the study of word production is the picture-word interference technique. Here, the subject’s task is to name a picture while ignoring a word presented nearly simultaneously, in either the auditory or visual modality. Relatedness between the pictured target word and the distractor affects verbal RT. The magnitude of the effect depends on the exact timing between picture and word presentation. Crucially, semantic and phonological relatedness can exert effects in opposite directions. When the target and distractor are from the same semantic category (e.g. horse and dog, both mammals and animals), naming latency is increased (Glaser & Düngelhoff, 1984). This is generally interpreted in the context of spreading activation models, in which semantic information activates both a target lemma and closely related lemmas. Activated lemmas are in competition with each other, and such competition must be resolved in order to select a single lemma for subsequent phonological encoding. The extra activation of a competing lemma caused by hearing or reading its name interferes with selection of the target, increasing the amount of time necessary to resolve the competition.

In contrast to the detrimental effects of common category membership, phonological relatedness between distractor and target tends to facilitate naming rather than hinder it. Distractor words sharing the same onset (e.g., sheep/sheet) or coda (beet/sleet) both induce subjects to name pictures faster than unrelated distractors (Meyer & Schriefers, 1991). Studies manipulating the stimulus onset asynchrony (SOA) between distractor and target have mostly found that phonological facilitation occurs maximally with a distractor presented simultaneously with or following the target picture (positive SOA, e.g. +200 ms), whereas semantic inhibition is maximized with a distractor preceding the target (negative SOA, e.g. -200 ms) (Schriefers, Meyer, & Levelt, 1990). These findings are consistent with multi-stage models in which lemma selection precedes phonological word form retrieval. The facilitative effect of phonological overlap suggests that phonological neighbours do not interact competitively.
Another form of facilitation (speeded responses) is found when distractors are semantically related to targets by association rather than common category membership. For example, upon naming a picture of a cat, the distractor word “dog” may slow responses, but the word “purr” may instead speed responses. Like the inhibitory effect of categorically related distractors, this facilitative effect is also maximal with negative SOAs, but studies indicate that the optimal SOA for associatively related distractors may be even earlier (Alario, Segui, & Ferrand, 2000; La Heij, Dirkx, & Kramer, 1990).

Thus, chronometric studies of lexical access using the word-picture distractor paradigm have suggested manipulations that can selectively affect three successive stages of naming as outlined above: 1) an early conceptual preparation stage, in which spreading activation towards the target is speeded by an associatively related distractor, 2) a middle stage of lemma selection, in which candidate responses (e.g., cat, dog) compete for activation, leading to inhibitory effects from categorically related distractors, and 3) phonological word form retrieval, in which activation of the same phonemes speeds responses. Although the exact divisions of labour between these stages, their interactions, and their timing are still topics of vigorous debate, the very fact that they are subject to different experimental manipulations provides a rich basis for investigating the neural substrates of lexical access using neuroimaging. With the emergence of techniques for measuring brain activity on a trial-by-trial basis, researchers have applied the insights generated from chronometric studies to map the brain structures involved in semantic and phonological aspects of word retrieval. We will review this progress after introducing the key methodological advance that has made such imaging studies possible.
Event-Related fMRI

Despite its shortcomings, fMRI remains one of the most important techniques in contemporary neuroscience, and its influence has only increased in recent years. The growing influence of fMRI is partially attributable to increased availability of the technology in more locations worldwide, but an equally important factor is the growth in sophistication of experimental design and analysis techniques. The methodology of fMRI grew out of an earlier tradition of neuroimaging work based on Positron Emission Tomography (PET). Cognitive studies with PET have most commonly measured either glucose metabolism using the radioactive tracer fluorodeoxyglucose, or blood flow using ¹⁵O-labelled water (H₂¹⁵O). The latter provides a faster image, but is still limited to brain maps that integrate activity over fairly long scan periods, with 60 s being a typical duration. Therefore, PET studies of language typically
employ steady task conditions throughout a scan, and alternate scans of different conditions throughout an experimental session.

When fMRI emerged in the early 1990s, it made cognitive neuroimaging available to a much wider range of researchers, as MRI scanners are much more readily available than PET machines, which usually require an onsite cyclotron to generate the necessary isotopes. fMRI is cheaper, safer, and more patient-friendly, with no radiation or injections administered to the subject. Thus, fMRI has largely superseded PET for routine cognitive neuroimaging, although PET retains some advantages for certain specialized situations. For example, PET images are relatively unperturbed by cognitive paradigms that require continuous speech from the subject, whereas fMRI images are subject to severe distortions and artifacts from the movement of the articulators. Therefore, PET remains an important technique for studies of language production. For brief periods of elicited speech, such as in picture naming, speech artifacts appear to have only minor effects on the ability of fMRI to detect neural activation, due to the delayed nature of the hemodynamic response (Birn, Cox, & Bandettini, 2004; Heim, Amunts, Mohlberg, Wilms, & Friederici, 2006). Therefore, overt picture naming has become a common task in fMRI studies.

In the first several years of fMRI research, study design was heavily influenced by PET methodology, with most studies employing a block design alternating between task conditions over fairly long periods, with 30 seconds as a typical value. It was soon realized, however, that fMRI affords sufficiently good temporal resolution to make single-trial (or “event-related,” ER) designs effective and practical. The signal measured in fMRI is known as the “Blood Oxygen Level Dependent” (BOLD) signal, and it is primarily sensitive to the amount of deoxygenated hemoglobin present in brain tissue (Ogawa et al., 1990).
This quantity, in turn, is a function of several factors including blood volume, blood flow, and oxygen consumption, all of which are correlated with neuronal firing and are experimentally dissociable using specialized imaging techniques (Buxton et al., 2004; Kida and Hyder, 2006). In practice though, it is conventional to treat the BOLD response as a “black box” indicator of neuronal activity even if its mechanisms are not completely understood. The BOLD response to neural activity has several fortuitous properties that make it suitable for cognitive studies. Most importantly, it has a highly consistent shape. Although the shape of the BOLD response does vary across individuals (Aguirre, Zarahn, & D’Esposito, 1998; Handwerker, Ollinger, & D’Esposito, 2004) and brain areas (d’Avossa, Shulman, & Corbetta, 2003; Huettel & McCarthy, 2001), the variability is small enough that a fixed model can be used for analysis with reasonable success for many paradigms. A model of the BOLD response is commonly referred to as a “hemodynamic response function” (HRF). Example
HRFs are shown in Figure 1. Following a brief period of neuronal firing (a few hundred milliseconds), the BOLD signal increases slowly, reaching a peak at about 4–6 seconds. The signal then returns to baseline, and often goes below baseline, a phenomenon known as the “undershoot.” Although some studies have also characterized an “initial dip,” caused by oxygen consumption prior to the onset of increased blood flow to an activated area (Chen-Bee, Agoncillo, Xiong, & Frostig, 2007; Yacoub et al., 2001), this is seldom seen in routine studies and does not play a role in standard analysis techniques.

Because the blood flow response to even a brief neuronal event extends over 15–30 seconds, the responses to adjacent trials in a cognitive task will overlap, unless a very long inter-trial interval (ITI) is used. Early studies did employ a “slow event-related” design with ITIs of up to 16 seconds. Unfortunately, such designs have numerous disadvantages. Besides the excruciatingly slow pace and resultant subject fatigue, a slow ER design can only accommodate a relatively small number of trials in a given study duration, greatly reducing statistical power. Therefore, methods were quickly developed to deal with overlapping hemodynamic responses in more closely spaced trials, giving rise to “rapid ER” designs. Crucial to the technique was the discovery that the responses to closely spaced events summate linearly, to a reasonable approximation, as long as a minimum ITI of 1–2 seconds is maintained (Boynton, Engel, Glover, & Heeger, 1996; Miezin, Maccotta, Ollinger, Petersen, & Buckner, 2000). Such summation allows for the use of straightforward linear models to estimate the average response magnitude in different conditions, and even for single events. The linear models are far more effective when the trial timings are not strictly periodic.
Thus, a randomized ITI, referred to as temporal “jitter,” is commonly used, reducing the correlation between hemodynamic responses to different trials and conditions. To analyze ER-fMRI time series data, one typically employs regression implemented in the so-called “General Linear Model” (GLM). This approach is extremely flexible and incorporates several familiar statistical tests including t-tests, correlations, ANOVA, etc. The model is formulated as:
Y = XB + ε
where Y is a vector comprising the full time series of MRI signal in a given location, X is a design matrix representing the conditions of interest and their corresponding hemodynamic models, B is a vector of regression coefficients, and ε is the vector of residuals, or error. Choices in analytic strategy relate mainly to how one constructs the design matrix from the timings of different events. One typically starts with a time series for each condition, consisting of ones at time points when an event occurs and zeros at other time points. The repetition time (TR) of MRI image acquisition (typically 1.5–3 s) is the time step used. Due to the slowly
evolving hemodynamic response, sub-TR temporal resolution is seldom necessary except for special applications. The simplest method is then to convolve the event time series with a model HRF. The most common fixed model of the HRF is a gamma density function (Figure 1A). There is no theoretical motivation for the choice of a gamma density function, other than its good empirical correspondence to the observed average hemodynamic response. To capture the initial positive activation and the subsequent undershoot, some neuroimaging software packages routinely model the HRF as the sum of two gamma density functions, the second of opposite sign with a fixed delay relative to the first (Figure 1B). The undershoot is often modeled with a fixed amplitude of 1/6 that of the positive signal, derived from measurements of early visual activity in occipital cortex. This may not be optimal for other brain regions, or in higher cognitive paradigms. In one study, we observed that undershoots can be as large as or larger than the positive signal, resulting in erroneous conclusions if this variability is ignored (Meltzer, Negishi, & Constable, 2008).

Figure 1. Modeling the hemodynamic response (three panels, A–C; x-axis: time in seconds, y-axis: signal in arbitrary units). A: A gamma density function that matches the typical BOLD response, peaking around 6 seconds poststimulus and returning to baseline within 10 seconds. B: A “double-gamma” function incorporating a second, delayed portion of opposite sign to capture the undershoot. C: A hemodynamic response reconstructed from a series of 7 basis functions at intervals corresponding to a TR of 2 seconds. This particular response has a longer latency and a wider peak than the standard model.

If there are doubts about the suitability of a fixed HRF in a given study, more flexible models can be used. An approach known as “finite impulse response” (FIR) models each individual time point after the event in increments of the TR, typically up to a limit of 16 seconds. This provides maximal flexibility to capture a hemodynamic response of arbitrarily complex shape, but at the cost of adding many more regression parameters, reducing degrees of freedom. In between the FIR approach and the use of a single fixed HRF, one can also reconstruct the shape of the hemodynamic response from a smaller number of temporally staggered “basis functions.” An example of a hemodynamic response measured empirically using basis functions is shown in Figure 1C. In some cases, the best approach may be to use a flexible model first to confirm the match to a fixed HRF, and then employ the fixed HRF for final analysis. Use of a fixed HRF has the advantage of generating a single number characterizing a condition, which can be easily submitted to second-level multi-subject statistical analysis. Additionally, fixed HRFs are much easier to deal with when the study involves parametric modulation of the response by a continuous variable (see Section 7).

The slow evolution of the hemodynamic response limits the temporal resolution of fMRI, but this can be an advantage.
The “blurring” of the BOLD response to neural activity unfolding over hundreds of milliseconds allows the experimenter to effectively treat a complex sequence of events as a single trial. For example, confrontational naming of a single picture can be treated as a unitary event, despite the fact that the vocal production lags the appearance of the picture by several hundred milliseconds. This technique can be extended to presentation of multiple stimuli further separated in time, as in a picture-word interference paradigm, in which an auditory or visual distractor word precedes or follows a picture to be named.

The measurement of event-related fMRI responses throughout the brain opens up a new kind of experimental inference that bridges the gap between chronometric word production studies and previous neuroimaging studies based on comparisons between different tasks. Chronometric studies have identified numerous manipulations that affect the word production process at different stages. Using fMRI, researchers can now identify the brain regions that respond differentially to the same manipulations. In essence, one can now use evoked neural
activity in each individual brain region as an outcome variable of interest, going far beyond the information available from RT and accuracy. We will now outline some of the most productive techniques that have arisen from the fusion of chronometric techniques with event-related fMRI, and their application in studies of word production.
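As a rough illustration of the analysis described in this section, the following Python sketch builds a jittered rapid event-related design, convolves each condition's event series with a double-gamma HRF, and estimates B in Y = XB + ε by ordinary least squares. Everything specific in it is an assumption made for the example: the TR of 2 s, the gamma shape parameters (6 and 16), the ITI values, the simulated noise level, and all function names are hypothetical and are not taken from any particular software package or from the studies reviewed here.

```python
import math
import numpy as np

TR = 2.0  # seconds; assumed repetition time for this sketch


def gamma_density(t, shape, scale=1.0):
    """Gamma probability density evaluated at times t (seconds); zero for t <= 0."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = (t[pos] ** (shape - 1) * np.exp(-t[pos] / scale)
                / (math.gamma(shape) * scale ** shape))
    return out


def double_gamma_hrf(tr=TR, duration=32.0, undershoot_ratio=1.0 / 6.0):
    """Positive gamma minus a delayed, scaled gamma; peak normalised to 1.

    The shape values 6 and 16 are illustrative assumptions.
    """
    t = np.arange(0.0, duration, tr)
    hrf = gamma_density(t, 6.0) - undershoot_ratio * gamma_density(t, 16.0)
    return hrf / hrf.max()


def make_regressor(onsets_sec, n_scans, hrf, tr=TR):
    """Indicator series (1 at each event's scan, 0 elsewhere) convolved with the HRF."""
    events = np.zeros(n_scans)
    events[(np.asarray(onsets_sec) / tr).astype(int)] = 1.0
    return np.convolve(events, hrf)[:n_scans]


rng = np.random.default_rng(0)
n_scans = 300
hrf = double_gamma_hrf()

# Temporal jitter: ITIs drawn at random from 4, 6 or 8 s (assumed values);
# each trial is randomly assigned to one of two conditions.
onsets = np.cumsum(rng.choice([4.0, 6.0, 8.0], size=90))
onsets = onsets[onsets < n_scans * TR - 32.0]
labels = rng.integers(0, 2, size=onsets.size)

# Design matrix X: one HRF-convolved column per condition plus a constant term.
X = np.column_stack([
    make_regressor(onsets[labels == 0], n_scans, hrf),
    make_regressor(onsets[labels == 1], n_scans, hrf),
    np.ones(n_scans),
])

# Simulated voxel time series Y = XB + noise, with known (made-up) coefficients.
true_betas = np.array([1.0, 0.5, 100.0])
y = X @ true_betas + rng.normal(0.0, 0.2, size=n_scans)

# Ordinary least-squares estimate of B.
betas, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated betas:", betas)
```

With randomized ITIs the two condition columns of X are only weakly correlated, so the values in `betas` should fall close to `true_betas`. In a real analysis the same fit would be repeated at every voxel, and the per-condition coefficients passed on to a second-level analysis.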
Priming

A classic technique in psychology is to examine the effect of previous trials on a current trial, in the context of a single experimental task. In general, the response to a stimulus tends to be faster and more accurate if the stimulus is related to a previous stimulus in the experiment. The simplest form is known as “identity priming,” in which responses are speeded when exactly the same stimulus is repeated. The typical interpretation of the facilitation effect is that neural networks retain a memory of the processing involved in responding to a stimulus, such that the next time a similar response is elicited, the level of activation required to produce the desired response can be reached more quickly. The same logic has been employed in ER-fMRI, in a technique commonly referred to as “repetition suppression” (RS). Repeated stimuli are hypothesized to evoke a smaller amount of neural activity, and hence a smaller BOLD response, than novel stimuli. Critically, this effect is expected to be confined to the regions that benefit from the facilitation, rather than being present in all responsive regions. Thus, comparison of RS arising from repeating different properties of a stimulus may reveal the regions responsible for processing those properties.

Due to concerns about speech-related artifacts in overt naming, many early ER-fMRI studies employed silent naming as a task, in conjunction with an RS approach. Van Turennout, Bielamowicz, and Martin (2003) and Van Turennout, Ellmore, and Martin (2000) observed long-lasting suppression of BOLD responses in occipitotemporal and inferior frontal regions, using identity priming for repeated pictures. These are the same regions that are most commonly activated in overt object naming (Abrahams et al., 2003; Fridriksson, Morrow, Moser, & Baylis, 2006; Meltzer, Postman-Caucheteux, McArdle, & Braun, 2010), confirming that RS is observable in the majority of regions involved in naming.
Since identity priming repeats all properties of a stimulus, additional insight into the processing roles of individual regions has come from other forms of priming. Not all of these studies have involved naming, but dissociations between components of the object recognition process have been shown in studies involving a semantic judgment of visual objects (e.g., living/nonliving). Measuring RS using ER-fMRI, Koutstaal et al. (2001) compared identity priming (viewing the exact same picture twice)
with lexical priming (viewing a different exemplar object with the same name, e.g., another umbrella). The primary finding was a laterality effect contrasting left and right fusiform cortex. In the right fusiform gyrus, repetition of the same pictures produced strong RS, but novel exemplars with the same name did not, indicating a strong sensitivity to low-level perceptual features as opposed to lexical identity. In the left fusiform gyrus, the novel exemplars with the same name did produce significant RS. This suggests that the left fusiform does play a linguistic role in retrieval of names beyond simple object recognition. A similar finding was reported by Vuilleumier, Henson, Driver, & Dolan (2002), although RS for repeated exemplars was detected in left inferior frontal regions instead. Viewing repeated objects from a different viewpoint induced RS in right but not left fusiform, confirming that right hemisphere regions tend to be more sensitive to visual features than to lexical identity. These studies point to a crucial role of the left fusiform in lexical identification of picture names, despite the fact that it is only one of many regions that are activated when naming pictures. We reached a similar conclusion in a recent fMRI study (Meltzer et al., 2009) that compared naming novel pictures with naming a set of massively primed, overlearned pictures (named several times over repeated sessions, always in the same order). Both tasks were compared with a baseline condition in which subjects produced a single pseudoword response (“rado”) to scrambled objects. Naming novel objects activated the left fusiform, as well as several frontal regions that are commonly activated in language tasks — left inferior frontal gyrus (LIFG), left premotor cortex, and supplementary motor area (SMA). For the overlearned pictures, activation was only observed in left fusiform and a small portion of SMA. 
In this case, priming was used to infer function in a very different way — the left fusiform stood out because of its resistance to total elimination of activation after massive repetition, as opposed to the studies described in the previous paragraph, where this region stood out due to its sensitivity to repetition of specific features. A critical role of the fusiform gyrus in naming is also supported by studies of acute stroke patients, which showed that reduced blood flow to this region (Brodmann area 37) was the best predictor of impaired naming (Hillis, Tuffiash, Wityk, & Barker, 2002), and that restoration of blood flow to BA 37 was associated with improved naming performance (Hillis et al., 2006). Given that portions of the fusiform gyrus are also associated with identification of visual forms such as printed words (Cohen, Jobert, Le Bihan, & Dehaene, 2004) and faces (Kanwisher, McDermott, & Chun, 1997), it is likely that this region makes a vital contribution to the first hypothesized stage of the naming process, object recognition. Since this stage necessarily precedes the other, more strictly “linguistic” stages, it is unsurprising that damage to this region would be the most sensitive predictor of a naming impairment. Nonetheless, word finding difficulties are
frequently seen after brain damage to any portion of perisylvian language cortex (Goodglass & Wingfield, 1997), suggesting that the process can be disrupted at any of its several stages. The contributions of other brain regions to subsequent stages of phonological and semantic processing have been clarified with other techniques adopted from experimental psychology into neuroimaging.
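The repetition suppression logic discussed in this section can be made concrete with a small second-level comparison. The sketch below assumes that a first-level GLM with a fixed HRF has already produced one response estimate per subject and condition in a single region of interest. The values are simulated, and the number of subjects, the effect size, and the helper `paired_t` are hypothetical choices for illustration, not data or methods from the studies cited.

```python
import math
import numpy as np


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for per-subject values a and b."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    n = d.size
    t = d.mean() / (d.std(ddof=1) / math.sqrt(n))
    return t, n - 1


# Hypothetical first-level response estimates (one per subject) in one region.
rng = np.random.default_rng(1)
n_subjects = 16
subject_offset = rng.normal(0.0, 0.3, n_subjects)  # between-subject variability
novel = 1.0 + subject_offset + rng.normal(0.0, 0.1, n_subjects)
repeated = 0.7 + subject_offset + rng.normal(0.0, 0.1, n_subjects)

# Repetition suppression: a smaller response to repeated than to novel stimuli.
suppression = (novel - repeated).mean()
t_stat, df = paired_t(novel, repeated)
print(f"mean suppression = {suppression:.3f}, t({df}) = {t_stat:.2f}")
```

A reliably positive `suppression` would be read as RS in that region. Running the same comparison for different kinds of repetition (same picture, new exemplar with the same name) in each region is what allows dissociations of the kind described above.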
Compound trials

We will use the term “compound trials” here to refer to studies in which a combination of two or more events, closely spaced in time, is modeled as a single event in fMRI. This approach has been used extensively in ER-fMRI studies of overt naming using a distractor paradigm. As the distractor technique is a mainstay of experimental psychology studies of naming, its adaptation in fMRI represents a significant convergence with cognitive neuroscience. ER-fMRI studies using this paradigm take advantage of the inherent sluggishness of the hemodynamic response, obtaining a single measure of response magnitude for what are essentially three separate events: perception of a distractor stimulus, perception of a target stimulus, and production of a vocal response. The key aspect of the distractor paradigm is the presence of opposite effects on naming latency for different kinds of relatedness, as reviewed in Section 3. fMRI studies of the distractor paradigm typically extend the logic of repetition suppression, in assuming an analogy between RT effects and modulation of the BOLD response. For instance, if an experimental condition produces speeded responses, and a reduced BOLD response in a certain brain area, one might infer that this brain area is the most likely locus of the priming effect. The facilitated activation of the target word results in less neural work required in the region that benefits from the facilitation. Similarly, when response times are lengthened by an experimental manipulation, a region showing an increased BOLD response is a likely locus of the extra processing brought about by that manipulation. If a region’s BOLD response is modulated by semantic vs. phonological relatedness, then that region is likely to be more involved in processing that kind of information.
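The reliance on hemodynamic sluggishness described above can be illustrated with a minimal sketch. It assumes a linear model in which each event contributes one gamma-shaped HRF; the gamma parameters and the event times (distractor at 0 s, picture at 0.2 s, vocal response at 0.9 s) are hypothetical values chosen for illustration and are not taken from any of the studies reviewed here.

```python
import math

def gamma_hrf(t, shape=6.0, scale=0.9):
    """Gamma density as an illustrative HRF; shape and scale are assumed values."""
    if t <= 0.0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t / scale) / (math.gamma(shape) * scale ** shape)

def predicted_bold(event_times_s, dt=0.1, n_samples=250):
    """Sum one HRF per event, sampled every dt seconds (linear-systems assumption)."""
    return [sum(gamma_hrf(i * dt - t0) for t0 in event_times_s)
            for i in range(n_samples)]

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical compound trial: distractor, picture, and vocal response within 1 s.
compound = predicted_bold([0.0, 0.2, 0.9])
# The same trial modeled as a single event at trial onset.
single = predicted_bold([0.0])

shape_similarity = pearson(compound, single)
peak_ratio = max(compound) / max(single)
print(f"correlation of time courses: {shape_similarity:.3f}")
print(f"peak ratio (compound / single): {peak_ratio:.2f}")
```

Under these assumptions the two predicted time courses have nearly the same shape and differ mainly in amplitude, which is why a single response magnitude per compound trial is a workable summary.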
One of the earliest studies to employ the distractor technique in ER-fMRI was that of de Zubicaray, Wilson, McMahon, and Muthiah (2001), which explored the inhibitory effect of same-category distractor words in overt naming. Pictures to be named were presented simultaneously with either a semantically related visual word, or a row of X’s. As predicted, trials with distractor words produced greater activation in a number of regions, including bilateral middle temporal gyrus (MTG), predicted by Levelt’s meta-analysis as a core region for lexical-semantic processing, but also in left superior temporal gyrus (STG), thought to be
more involved in phonological retrieval. However, the relatively low-level control condition (X’s) makes it difficult to attribute the increased activation to semantic competition alone, as the distractor words may also produce activations related to reading, and to inhibiting the tendency to read the word aloud instead of naming the picture. Thus, this experiment may be tapping a more general process of Stroop-like response inhibition, as reflected in the activations also observed in anterior cingulate and bilateral superior frontal gyrus, regions highly involved in cognitive control and commonly activated in Stroop paradigms (Braver, Barch, Gray, Molfese, & Snyder, 2001; Egner & Hirsch, 2005). This study used a slow ER design (16-second ISI), and collected only one BOLD image per trial, at the expected peak time of 4–7 seconds poststimulus, to allow for a silent period to record the vocal response and to avoid contaminating the BOLD response with motion artifact related to speaking. Therefore, the low-level control condition was selected to maximize sensitivity under these constraints. Subsequent to this study, hardware has come on the market for collecting clean recordings of subject speech even with simultaneous scanner noise present, and methodological studies have established that BOLD responses to brief speech events can be accurately measured even with continuous scanning (see Section 4). Thus, continuous rapid-ER designs have become the norm, allowing for many more trials and more conditions to be accommodated in a scanning session. De Zubicaray, McMahon, Eastburn, & Wilson (2002) conducted a follow-up experiment employing phonological distractors, which produced speeded naming relative to unrelated words that served as the control.
Decreased BOLD responses for the distractor condition were observed in left posterior STG, consistent with the hypothesized role of this region in phonological word form retrieval, although decreases were also observed in right anterior temporal regions. Interestingly, several other brain regions were differentially modulated by phonological relatedness, but in the opposite direction, showing a larger response to the related distractors. Thus, a subtle manipulation such as distractor relatedness can modulate fMRI responses in both directions in different regions. Although phonological relatedness produces a net decrease in RT, it is possible that it facilitates processing in some areas and hinders it in others, resulting in distinct effects observable in brain imaging. Mechelli, Josephs, Lambon Ralph, McClelland, and Price (2007) also used a compound trial design to explore semantic and phonological priming in overt naming. In their experiment, subjects named two stimuli presented in close succession and modeled as a single trial. Each stimulus could be either a written word or a picture, resulting in four combinations. The study also manipulated relatedness between the two stimuli (semantic, phonological, unrelated, or identical), yielding an ambitious total of 16 experimental conditions. The use of rapid-ER techniques nonetheless allowed for 12 to 13 trials in each of these conditions. This
is a rather low number of trials for fMRI, but since the main contrasts of interest were factorial combinations of conditions (e.g., all semantically related pairs vs. unrelated pairs), reasonable statistical power was achieved. Semantically related pairs induced larger BOLD responses in left MTG, consistent with prior studies, and also in left angular gyrus, superior frontal gyrus, and LIFG pars orbitalis. In contrast to de Zubicaray et al. (2002), Mechelli et al. (2007) detected no reduction of BOLD responses for phonologically related stimulus pairs. Instead, increased BOLD was found in bilateral insula. The authors interpreted this effect as related to competition between overlapping phonological codes in the two stimuli to be named. This would not have been predicted on the basis of the chronometric studies, which have shown a facilitative effect of phonological relatedness rather than an inhibitory one. However, the chronometric studies have not typically involved overt naming of both stimuli, instead requiring naming only of a target picture while ignoring the distractor. Although Mechelli et al. (2007) was one of very few fMRI studies that have combined both phonological and semantic relatedness to distinguish their effects, it did not distinguish between the categorical and associative forms of semantic relatedness, treating them both as a single condition. A subsequent fMRI study was conducted by Abel, Dressel, Bitzer, et al. (2009), using auditory distractor words presented 200 ms before each picture. This negative SOA was chosen, based on a survey of chronometric studies, as an ideal compromise for achieving both semantic inhibition and phonological facilitation. Four conditions were compared, including unrelated distractors, and three kinds of related ones: phonological (P), semantic-categorical (SC), and semantic-associative (SA).
Behaviourally, the expected dissociations were observed, with delayed responses for SC distractors, and speeded responses for P and SA. Despite the observed behavioural dissociation between different kinds of distractors, all related distractor conditions produced only enhanced BOLD responses relative to the unrelated distractor condition, regardless of whether they exerted an inhibitory or facilitative effect on RT. Phonological relatedness modulated activity in pSTG and inferior parietal cortex, areas associated with phonological encoding in contemporary models of speech production (Indefrey & Levelt, 2004; Hickok & Poeppel, 2004). Semantic relatedness of both kinds modulated activity more in areas associated with visual processing, including occipital and inferior temporal cortex. In summary, studies employing the compound trial method have supported a distinction between the brain areas involved in semantic and phonological stages of naming, on the basis of differential sensitivity of BOLD responses across regions to distractor type. Phonological relatedness has been linked to structures within the “dorsal stream” of speech processing (Hickok and Poeppel, 2004, 2007), whereas semantic effects localized to “ventral stream” regions. Despite this broad
consistency though, studies have not converged neatly on the exact same regions, nor have they demonstrated a consistent link between the directionality of BOLD modulation and RT. In particular, it has been difficult to demonstrate the predicted link between behavioural response facilitation and BOLD suppression, as the majority of BOLD modulations to phonological and semantic-associative relatedness have been positive in direction despite the speeded responses in these conditions.
Parametric modulation

Another major trend in the behavioural study of picture naming has been the investigation of continuous variables that affect verbal RT. The difficulty of name retrieval for a given picture can be characterized by numerous quantities, many of which are highly correlated with each other as well as with RT. Among the most reliable variables that increase RT are lower name agreement (% of participants producing the same word to name the picture), lower concept familiarity, later age of acquisition (AoA), and lower lexical frequency (Szekely et al., 2003). Extensive behavioural experiments have established that although multiple variables ultimately influence RT, different variables exert their effects at different stages of the naming process (Griffin & Bock, 1998; Jescheniak & Levelt, 1994; Morrison, Ellis, & Quinlan, 1992). This provides another attractive opportunity for single-trial-based brain imaging to delineate the neural localization of successive stages. ER-fMRI can be easily used to measure linear correlations between the magnitude of hemodynamic responses across trials and a continuous variable characterizing those individual trials. This is typically done by incorporating additional regressors into the GLM, as illustrated in Figure 2. First, a standard regressor is generated by convolving a series of delta functions of constant amplitude representing the stimulus timing with a function modeling the HRF, such as a gamma density function (Figure 2A). This regressor models the average activation induced by the stimulus irrespective of the variability that may be correlated with a continuous covariate. The series of values representing the covariate (e.g., word frequency or RT) is then typically transformed to have a mean of zero, and then a series of delta functions with the zero-mean values placed at the appropriate stimulus times is constructed (Figure 2B).
Finally, these delta functions are convolved with exactly the same HRF model as the first series, resulting in a second regressor modeling the extra variability in the fMRI signal associated with the covariate (Figure 2C). It is important to use both of these regressors together. Simply convolving a series of a modulating variable with the HRF would be a mistake, as it would mix activation associated with stimulus presentation in general with activation that is specifically modulated by the covariate.
[Figure 2. Panels: A, mean hemodynamic response; B, parametric covariate; C, covariate convolved with hemodynamic response. Vertical axis: signal (arbitrary units); horizontal axis: time (sec), 0 to 300.]
Figure 2. Parametric modulation in ER-fMRI. A: A series of delta functions corresponding to stimulus presentation times is convolved with an HRF to model the expected mean response irrespective of a parametric covariate. B: Delta functions corresponding to stimulus timing, with amplitudes set to a normally distributed zero-mean covariate (such as reaction time or word frequency). C: The covariate delta functions are convolved with the HRF to capture expected variability in the BOLD response that is linearly related to the covariate.
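The construction of the two regressors described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the procedure of any particular analysis package; the gamma HRF parameters, the sampling step, the onset times, and the reaction-time values are all hypothetical.

```python
import math

def gamma_hrf(t, shape=6.0, scale=0.9):
    """Gamma density used as an illustrative HRF model (parameters are assumptions)."""
    if t <= 0.0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t / scale) / (math.gamma(shape) * scale ** shape)

def convolve_with_hrf(deltas, dt, hrf_len_s=30.0):
    """Convolve a delta-function series with the HRF, truncated to the input length."""
    kernel = [gamma_hrf(k * dt) for k in range(round(hrf_len_s / dt))]
    out = [0.0] * len(deltas)
    for i, amp in enumerate(deltas):
        if amp == 0.0:
            continue
        for k, h in enumerate(kernel):
            if i + k >= len(out):
                break
            out[i + k] += amp * h
    return out

def build_regressors(onsets_s, covariate, run_len_s, dt=0.5):
    """Return (mean-response regressor, parametric-modulation regressor)."""
    n = round(run_len_s / dt)
    mean_cov = sum(covariate) / len(covariate)
    centered = [c - mean_cov for c in covariate]   # zero-mean covariate (Figure 2B)
    main_deltas = [0.0] * n
    mod_deltas = [0.0] * n
    for onset, c in zip(onsets_s, centered):
        idx = round(onset / dt)
        main_deltas[idx] = 1.0                     # constant amplitude (Figure 2A)
        mod_deltas[idx] = c                        # covariate-scaled amplitude
    # Both series are convolved with exactly the same HRF model (Figure 2C).
    return convolve_with_hrf(main_deltas, dt), convolve_with_hrf(mod_deltas, dt)

# Hypothetical run: nine trials, with verbal RT (in seconds) as the covariate.
onsets = [10, 40, 70, 100, 130, 160, 190, 220, 250]
rts = [0.82, 1.10, 0.75, 0.95, 1.30, 0.88, 1.02, 0.79, 1.15]
main_reg, mod_reg = build_regressors(onsets, rts, run_len_s=300)
```

Both regressors would then enter the GLM together. Because the covariate is mean-centered, the second regressor carries only trial-to-trial deviations, leaving the average response to the first.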
Two recent studies have used this method to study the effects of three parametric variables: Concept familiarity, word frequency, and word length. Concept familiarity is widely understood to modulate the pre-lexical stage of conceptual preparation, whereas word length is associated with rather late stages of phonological encoding and/or articulatory planning. Although word frequency is a very strong and reliable modulator of RT, the exact stage at which it exerts its effects is more controversial. However, it is generally believed to fall in between the other two factors. Levelt’s model places it at the early stage of word form retrieval after lemma selection, based on the fact that production of low-frequency words benefits from the existence of high-frequency homophones. For example, the uncommon word “moor” is retrieved as fast as the common “more” (Jescheniak and Levelt, 1994) and is more resistant to speech errors as a result of sharing a phonological form with a high-frequency word (Dell, 1990). Graves, Grabowski, Mehta, and Gordon (2007) and Wilson, Isenberg, & Hickok (2009) both examined these three factors in ER-fMRI studies of overt naming,
but used slightly different techniques to account for the fact that brain regions may be responsive to more than one of these factors. Graves et al. tested each factor separately and used a voxel-wise conjunction approach to identify regions that were uniquely sensitive to just one of the factors. In this approach, one highlights specifically those voxels that achieve statistical significance for only one factor (or alternatively, a combination of factors). A potential problem with this approach is that it is highly dependent on the threshold for statistical significance; a region with activation just over the threshold for one contrast and just under for another could be identified by this procedure, possibly undesirably. Wilson et al. used multiple simultaneous regression, in which all variables were included in the GLM. This approach identifies voxels that are significant for a given factor after partialling out variance attributable to the other factors. This avoids the issue of arbitrary threshold effects, but reduces sensitivity to individual factors, especially when the factors are themselves correlated with each other. Despite these minor methodological differences, Graves et al. (2007) and Wilson et al. (2009) obtained mostly similar results. Conceptual familiarity was seen to modulate activity primarily in bilateral posterior occipito-temporal cortex and the fusiform gyrus, consistent with previous studies identifying these areas as critical for the pre-linguistic stages of object recognition. Word length modulated activity in the superior temporal gyrus, including primary auditory cortex, likely reflecting the increased amount of auditory stimulation involved in hearing oneself articulate longer words. Word length also modulated activity in bilateral precentral gyrus, which presumably reflects an increased amount of vocal motor activity. 
Word frequency exerted an effect in a rather wide network of left hemisphere regions, including posterior inferior temporal gyrus, LIFG (only in Graves et al.), and portions of the left inferior parietal lobe, including the supramarginal gyrus (SMG). As chronometric studies have placed the word frequency effect on the border between lemma selection and phonological code retrieval, it is possible that the heterogeneous set of activations seen in response to it reflects both semantic and phonological processing. Notably, neither of these studies detected significant modulations of activity in the left middle temporal gyrus, the region most closely associated with lexical-semantic access and lemma selection in Levelt’s meta-analysis. However, the studies of picture-word interference suggest that this region is most reliably activated by competition between candidate word responses, and none of the variables examined in these two studies are optimal for capturing that effect. One variable that does reflect lexical competition is the number of alternative names available for a picture, or similarly, measures of name agreement. Kan and Thompson-Schill (2004) compared naming for pictures with low and high name agreement, finding effects in LIFG as well as portions of the left temporal lobe. Postman-Caucheteux et al. (2010) used the parametric approach to
measure activity related to age-of-acquisition and number of alternative names, in a small sample of four control subjects and three patients with post-stroke aphasia. Strong effects of both of these variables were seen in both frontal and temporal regions within individuals, but a large-scale characterization of consistent effects across a larger subject group awaits future study.
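The trade-off between testing each factor separately and entering all factors into one regression, discussed earlier in this section, can be illustrated with a toy single-voxel simulation. This is a sketch under stated assumptions, not the pipeline of Graves et al. (2007) or Wilson et al. (2009): the two covariates, their correlation, the noise term, and the t-values passed to the thresholding helper are all hypothetical, and the simulated response depends on the first covariate only.

```python
import math

def ols(predictors, y):
    """Ordinary least squares with an intercept, solved via the normal equations."""
    n = len(y)
    rows = [[1.0] + [p[i] for p in predictors] for i in range(n)]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(k):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

def significant_for_first_only(t_first, t_second, threshold):
    """Threshold-based rule: a voxel counts as 'unique' to the first factor."""
    return abs(t_first) > threshold and abs(t_second) <= threshold

# Hypothetical per-trial covariates: 'frequency' drives the response, while
# 'familiarity' is merely correlated with frequency.
n_trials = 40
frequency = [math.sin(0.9 * i) for i in range(n_trials)]
unrelated = [math.cos(2.3 * i) for i in range(n_trials)]
familiarity = [0.8 * f + 0.6 * u for f, u in zip(frequency, unrelated)]
noise = [0.05 * math.sin(5.1 * i + 1.0) for i in range(n_trials)]  # deterministic stand-in
response = [2.0 * f + e for f, e in zip(frequency, noise)]

# Separate test: familiarity alone appears to modulate the response.
separate_familiarity = ols([familiarity], response)[1]
# Simultaneous regression: variance shared with frequency is partialled out.
_, joint_frequency, joint_familiarity = ols([frequency, familiarity], response)

print(f"familiarity slope, tested alone: {separate_familiarity:.2f}")
print(f"familiarity slope, joint model:  {joint_familiarity:.2f}")
print(f"frequency slope, joint model:    {joint_frequency:.2f}")

# Threshold dependence of the 'unique' label, using hypothetical t-values.
print(significant_for_first_only(3.2, 3.0, threshold=3.1))
print(significant_for_first_only(3.2, 3.0, threshold=2.9))
```

In this toy case the correlated covariate shows a sizeable slope when tested alone but almost none in the joint model. The helper shows how a small shift in threshold changes whether a voxel is labeled as unique to one factor.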
Post-hoc trial classification

One more experimental approach made possible by single-trial designs is to classify trials on the basis of behavioural outcome. The main outcomes of interest, as in behavioural experiments, are RT and accuracy. The previously mentioned study of Wilson et al. (2009) also included verbal RT as a parametric covariate of interest, finding modulations in many of the regions that were also identified in the analyses of pre-defined covariates. RT is likely to reflect total “time-on-task” in regions involved in naming, so a rather widely distributed activation pattern is not unexpected. Accuracy has been used in studies employing overt naming, particularly in patients with post-stroke aphasia, who tend to produce a large number of erroneous “paraphasic” responses in picture naming. These may include both semantic substitutions (e.g., “airplane” for a picture of a blimp) and phonemic errors (e.g., “stoon” for a picture of a spoon). Postman-Caucheteux et al. (2010) demonstrated increased activity in contralesional and perilesional regions for trials classified as erroneous. These activations occurred in regions overlapping with areas also activated by later age of acquisition and larger number of names, suggesting that they reflect difficulty in name retrieval for certain pictures, which tends to induce errors in the patients. This conclusion is supported by a study of error responses in healthy volunteers (Abel, Dressel, Kummerer, et al., 2009), which reported increased activation in LIFG and other regions for error responses, but no activation associated preferentially with correct responses. Fridriksson, Baker, & Moser (2009) reported a dissociation in activity related to phonemic and semantic paraphasias in patients, with phonemic errors preferentially recruiting left-hemisphere perilesional cortex and semantic errors recruiting right hemisphere homologous regions.
Electrophysiological approaches

The above sections have highlighted ways in which ER-fMRI has been used to delineate the neural underpinnings of successive stages in picture naming. The combination of temporal information derived from chronometric behavioural studies and spatial information derived from fMRI has proven rather effective at
unraveling the anatomy of word production. We have shown how the BOLD signal, as an integration of neural activity over several seconds, has been exploited in studies that take advantage of this rather forgiving temporal relationship between neural events and the observed signal. Nonetheless, our understanding of the naming process would certainly benefit from an improved ability to measure neural activity at high temporal resolution, allowing us to view successive waves of activation related to the separate stages described above. Such resolution is only attainable from electrophysiological methods that directly measure neuronal currents. Electroencephalography (EEG) has long been used to measure neural activity elicited by cognitive stimuli, but studies of overt language production have been few, mainly due to concerns about motion artifacts from articulation. Nonetheless, more studies have appeared recently (see Ganushchak, Christoffels, & Schiller, 2011 for review). EEG studies of naming have revealed, as expected, that modulations of responses by semantic factors tend to occur at earlier time points than phonological factors. Unfortunately, estimation of the neural sources of EEG activity is extremely difficult, so spatial localization has traditionally been de-emphasized in EEG language research. Nonetheless, alternative methods for spatial localization of electrophysiological responses do exist. One is magnetoencephalography (MEG), which measures the magnetic fields produced by neuronal currents rather than the electric fields measured by EEG. Magnetic fields are not affected by passage through the skull, leading to significant improvement in the accuracy with which their sources can be estimated.
Another method of interest is intracranial recordings, which are direct measurements from the surface of the brain conducted in patients undergoing monitoring prior to brain surgery, most commonly for resection of brain tissue responsible for generating epileptic seizures. There has been an increasing interest in conducting cognitive studies in willing patients during the monitoring period, which typically lasts for several days. MEG activity in overt picture naming has been investigated in several studies. The typical approach is to analyze activity after picture presentation but before vocal response onset, avoiding the worst of the articulation artifact. Salmelin, Hari, Lounasmaa, and Sams (1994), and Levelt, Praamstra, Meyer, Helenius, and Salmelin (1998) measured the average responses to pictures in an overt naming task, and fit successive temporal periods of the signal to dipole models to estimate the neural generators. The earliest activity was seen in occipital cortex, with later activity occurring in temporal and parietal regions, with frontal regions activated later still. Subsequent studies used similar methods for dipole-based localization of averaged signals, but incorporated manipulations of the stimuli to isolate modulations by semantic and phonological factors. Maess, Friederici, Damian, Meyer, and Levelt (2002) compared naming of pictures in blocks consisting of pictures
drawn from the same semantic category, with blocks drawn from mixed categories. Interference between same-category stimuli was expected to modulate activity related to lexical-semantic processing. An interference effect was observed, maximally in sensors overlying left temporal cortex. Vihla, Laine, and Salmelin (2006) compared MEG timecourses estimated in cortical source regions during four tasks: passive picture viewing, overt naming, semantic judgments, and phonological judgments. All tasks evoked similar activity up to 300 ms. Thereafter, increased activity was seen in bilateral frontal and left temporal regions for both phonological judgments and overt naming, suggesting that these regions play a preferential role in word form retrieval rather than object recognition. These MEG studies can show shifting spatial patterns of brain activity across successive time windows, but such analyses may underemphasize the fact that brain regions are likely to continue being active and interacting with each other across multiple stages of the naming process. For example, multiple stages of engagement in word production for a single region were recently demonstrated in an intracranial recording study involving patients with electrodes in LIFG. Sahin, Pinker, Cash, Schomer, and Halgren (2009) showed a triphasic response in these electrodes, for which successive peaks were sensitive to different contrasts when patients were asked to either read words verbatim or inflect them for plurality or tense. The first peak was sensitive to lexical frequency, the second to the task demand, and the third to the actual requirement for an overt phonological marking for the inflection (e.g., third person plural past tense of walk → walked, vs. present tense of walk → walk). As fMRI studies have shown LIFG to be particularly sensitive to many different modulating variables, this result suggests that different variables may exert their effects at distinct stages within the naming process.
Frontal regions are directly responsible for the final motor output in language production tasks, so successive engagement of these regions by different cognitive demands emphasizes the interactive nature of neural information processing, in which regions of association cortex receive input from “upstream” primary sensory areas, but also provide feedback to those areas.
Analysis of oscillatory activity

The MEG and intracranial studies described above have all employed a time-domain approach to signal analysis, in which signals are time-locked to stimulus onset and averaged across trials. Although this approach has been very successful at revealing neural activity related to cognition, it is insensitive to certain forms of activity that may be reliably induced by stimuli but do not produce signals in which the peaks and troughs line up precisely across trials. A popular alternative
approach is to quantify oscillatory amplitude within specific frequency bands across time, and average this quantity across trials. This may be accomplished with such time-frequency techniques as wavelets, Hilbert transforms, and short-time Fourier transforms (for tutorial reviews, see Le Van Quyen & Bragin, 2007; Mouraux & Iannetti, 2008; Tallon-Baudry & Bertrand, 1999). In practice, these different algorithms provide similar results in the presence of a reliable signal. Intracranial recording studies have shown that neural activation tends to be associated with a decrease in power at lower frequencies, such as the alpha band (8–12 Hz) and the beta band (15–25 Hz). Ojemann, Fried, and Lettich (1989) observed reductions in alpha power during silent naming at cortical sites that were also identified as essential for naming through electrical stimulation mapping; that is, electrical current passed to those sites interfered with patients’ attempts to name pictures (a common technique used in neurosurgery to avoid cutting through areas essential for language functions). More recent studies have shown that neural activation is also accompanied by power increases in the gamma frequency band, which is commonly defined as 40 Hz and higher (although some studies distinguish “low” and “high” gamma, e.g. 40–70 Hz vs 70–200 Hz). Gamma increases are more tightly localized to increases in neuronal firing rate, whereas alpha and beta decreases are commonly detectable at electrode sites some distance away (Crone, Sinai, & Korzeniewska, 2006). Therefore, gamma power increases may offer a particularly sensitive means to map brain activity related to language functions at high spatial and temporal resolution, and indeed a number of studies have recently analyzed gamma activity in intracranial recordings of overt naming tasks (Cervenka, Boatman-Reich, Ward, Franaszczuk, & Crone, 2011; Towle et al., 2008; Wu et al., 2011).
These studies have demonstrated that high gamma power in particular is a sensitive measure of neural activation, offering both spatial and temporal resolution unmatched by noninvasive imaging methods. For example, Edwards et al. (2010), using measurements of high gamma activity in overt naming, observed a fine distinction related to the role of pSTG in naming. The gyral surface of the pSTG was activated only after the acoustic onset of speech, implying that this region’s activation in naming tasks is linked to auditory feedback (the sound of one’s own voice), rather than phonological code retrieval. In contrast, a slightly more posterior and superior area termed Spt/TPJ (Superior planum temporale and temporo-parietal junction) exhibited activity that ramped up prior to the response, suggesting a more direct role in speech production. This finding is consistent with more recent neuroimaging evidence that has identified area Spt as a uniquely important region in the transition from abstract phonological codes to articulatory plans (Hickok & Poeppel, 2007). Given the demonstrated spatial and temporal specificity of brain mapping based on high-gamma activity, it is tempting to view this technique as the most
promising future direction for research on the neural substrates of language production. Unfortunately, such enthusiasm must be tempered by some severe limitations. High gamma activity has most successfully been characterized in invasive recordings, using subdural electrodes implanted on the surface of the cortex or penetrating depth electrodes. Obviously, these recordings are only available in patients undergoing medically necessary procedures, and the spatial sampling within individuals is highly variable, being limited to regions relevant for surgical planning. Furthermore, neurosurgical patients’ brains are by definition abnormal, and are likely to exhibit abnormal localization of function as a result of neuroplasticity related to the patients’ condition (typically epilepsy or tumors). For general-purpose language research, a noninvasive alternative is needed. Currently the most promising technique may be MEG. Although high gamma responses in MEG elicited by cognitive tasks have been reported — mainly in visual tasks — it is not yet clear whether such responses can serve as a broadly applicable indicator of neuronal activity in the same way that the fMRI BOLD signal does. So far, the most reliable MEG correlate of fMRI BOLD activation is a power decrease in lower frequencies, such as the alpha and beta bands (Brookes et al., 2005; Winterer et al., 2007; Zumer, Brookes, Stevenson, Francis, & Morris, 2010). Although these responses have less temporal and spatial specificity than gamma responses, they also offer a higher signal-to-noise ratio, making them more practical for noninvasive recordings. 
For example, we recently showed that mapping language-induced alpha/beta power decreases in MEG yielded activation maps resembling fMRI data from the same task, but offered sufficient temporal resolution to distinguish between activity related to hearing a sentence and holding it in short-term memory immediately afterwards (Meltzer & Braun, 2011; Meltzer, McArdle, Schafer, & Braun, 2010). In summary, we expect rapid progress in the coming years towards developing the full potential of EEG and MEG to map neural activity at a sufficiently fast time scale to resolve successive stages of neural processing involved in language comprehension and production.
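The difference between time-domain averaging and band-limited power averaging, introduced at the start of this section, can be sketched with simulated data. The example assumes a noise-free 10 Hz burst whose phase differs on every trial, analyzed with a hand-written complex Morlet wavelet. The sampling rate, burst timing, number of trials, and wavelet width are hypothetical, and a power increase is simulated purely for convenience (the same logic applies to power decreases).

```python
import cmath
import math

FS = 200          # sampling rate in Hz (assumed)
N_SAMPLES = 400   # 2 s per trial
FREQ = 10.0       # frequency of the simulated oscillation in Hz
N_TRIALS = 16

def make_trial(phase):
    """A burst from 0.8 to 1.4 s with a trial-specific phase; zero elsewhere."""
    return [math.sin(2 * math.pi * FREQ * (i / FS) + phase) if 160 <= i < 280 else 0.0
            for i in range(N_SAMPLES)]

def morlet_power(signal, fs, freq, n_cycles=5.0):
    """Power over time at one frequency, via correlation with a complex Morlet wavelet."""
    sigma_t = n_cycles / (2 * math.pi * freq)
    half = int(3 * sigma_t * fs)
    wavelet = [cmath.exp(2j * math.pi * freq * (k / fs))
               * math.exp(-((k / fs) ** 2) / (2 * sigma_t ** 2))
               for k in range(-half, half + 1)]
    norm = sum(abs(w) for w in wavelet)
    power = []
    for i in range(len(signal)):
        acc = 0j
        for j, w in enumerate(wavelet):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += signal[idx] * w.conjugate()
        power.append(abs(acc / norm) ** 2)
    return power

# Phases evenly spread over the cycle, so the activity is induced but not phase-locked.
trials = [make_trial(2 * math.pi * k / N_TRIALS) for k in range(N_TRIALS)]

# Time-domain approach: average the raw signals across trials.
evoked = [sum(tr[i] for tr in trials) / N_TRIALS for i in range(N_SAMPLES)]

# Time-frequency approach: compute power per trial, then average across trials.
powers = [morlet_power(tr, FS, FREQ) for tr in trials]
induced = [sum(p[i] for p in powers) / N_TRIALS for i in range(N_SAMPLES)]

print(f"largest evoked amplitude: {max(abs(v) for v in evoked):.2e}")
print(f"induced power at 1.1 s:   {induced[220]:.3f}")
print(f"induced power at 0.3 s:   {induced[60]:.3f}")
```

The trial average cancels because the phases differ across trials, whereas the averaged power retains the burst. A Hilbert-transform or short-time Fourier implementation would be expected to give a similar picture on a signal this clean.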
Conclusion

Despite the fact that fMRI, EEG, and MEG have all existed for over 15 years, methodological developments in the past decade have greatly expanded the experimental value of these techniques in delineating stages of processing involved in language production. These advances have been largely driven by more sophisticated analysis procedures rather than fundamental changes in data acquisition capabilities. Whereas researchers formerly had to rely on comparison of different cognitive tasks, researchers now routinely employ manipulations based on mixing
single trials of different conditions within constant task demands. As a result, key techniques from prior behavioural studies have been incorporated into the mainstream of neuroimaging research on language production. Familiarity with the techniques reviewed above provides a sound basis for researchers wishing to employ modern neuroimaging technologies in studies of language.
References
Abel, S., Dressel, K., Bitzer, R., Kummerer, D., Mader, I., Weiller, C., et al. (2009). The separation of processing stages in a lexical interference fMRI-paradigm. Neuroimage, 44(3), 1113–1124.
Abel, S., Dressel, K., Kummerer, D., Saur, D., Mader, I., Weiller, C., et al. (2009). Correct and erroneous picture naming responses in healthy subjects. Neuroscience Letters, 463(3), 167–171.
Abrahams, S., Goldstein, L. H., Simmons, A., Brammer, M. J., Williams, S. C., Giampietro, V. P., et al. (2003). Functional magnetic resonance imaging of verbal fluency and confrontation naming using compressed image acquisition to permit overt responses. Human Brain Mapping, 20(1), 29–40.
Aguirre, G. K., Zarahn, E., & D’Esposito, M. (1998). The variability of human, BOLD hemodynamic responses. Neuroimage, 8(4), 360–369.
Alario, F. X., Segui, J., & Ferrand, L. (2000). Semantic and associative priming in picture naming. The Quarterly Journal of Experimental Psychology A, 53(3), 741–764.
Birn, R. M., Cox, R. W., & Bandettini, P. A. (2004). Experimental designs and processing strategies for fMRI studies involving overt verbal responses. Neuroimage, 23(3), 1046–1058.
Boynton, G. M., Engel, S. A., Glover, G. H., & Heeger, D. J. (1996). Linear systems analysis of functional magnetic resonance imaging in human V1. Journal of Neuroscience, 16(13), 4207–4221.
Braver, T. S., Barch, D. M., Gray, J. R., Molfese, D. L., & Snyder, A. (2001). Anterior cingulate cortex and response conflict: effects of frequency, inhibition and errors. Cerebral Cortex, 11(9), 825–836.
Brookes, M. J., Gibson, A. M., Hall, S. D., Furlong, P. L., Barnes, G. R., Hillebrand, A., et al. (2005). GLM-beamformer method demonstrates stationary field, alpha ERD and gamma ERS co-localisation with fMRI BOLD response in visual cortex. Neuroimage, 26(1), 302–308.
Bub, D. N. (2000). Methodological issues confronting PET and fMRI studies of cognitive function. Cognitive Neuropsychology, 17(5), 467–484.
Buxton, R. B., Uludag, K., Dubowitz, D. J., & Liu, T. T. (2004). Modeling the hemodynamic response to brain activation. Neuroimage, 23 Suppl 1, S220–233.
Caramazza, A. (1997). How many levels of processing are there in lexical access? Cognitive Neuropsychology, 14(1), 177–208.
Cervenka, M. C., Boatman-Reich, D. F., Ward, J., Franaszczuk, P. J., & Crone, N. E. (2011). Language mapping in multilingual patients: electrocorticography and cortical stimulation during naming. Frontiers in Human Neuroscience, 5, 13.
Chen-Bee, C. H., Agoncillo, T., Xiong, Y., & Frostig, R. D. (2007). The triphasic intrinsic signal: implications for functional imaging. Journal of Neuroscience, 27(17), 4572–4586.
Cohen, L., Jobert, A., Le Bihan, D., & Dehaene, S. (2004). Distinct unimodal and multimodal regions for word processing in the left temporal cortex. Neuroimage, 23(4), 1256–1270.
Crone, N. E., Sinai, A., & Korzeniewska, A. (2006). High-frequency gamma oscillations and human brain mapping with electrocorticography. Progress in Brain Research, 159, 275–295.
d’Avossa, G., Shulman, G. L., & Corbetta, M. (2003). Identification of cerebral networks by classification of the shape of BOLD responses. Journal of Neurophysiology, 90(1), 360–371.
de Zubicaray, G. I., McMahon, K. L., Eastburn, M. M., & Wilson, S. J. (2002). Orthographic/phonological facilitation of naming responses in the picture-word task: an event-related fMRI study using overt vocal responding. Neuroimage, 16(4), 1084–1093.
de Zubicaray, G. I., Wilson, S. J., McMahon, K. L., & Muthiah, S. (2001). The semantic interference effect in the picture-word paradigm: an event-related fMRI study employing overt responses. Human Brain Mapping, 14(4), 218–227.
Dell, G. S. (1990). Effects of frequency and vocabulary type on phonological speech errors. Language and Cognitive Processes, 5(4), 313–349.
Dell, G. S., Schwartz, M. F., Martin, N., Saffran, E. M., & Gagnon, D. A. (1997). Lexical access in aphasic and nonaphasic speakers. Psychological Review, 104(4), 801–838.
Edwards, E., Nagarajan, S. S., Dalal, S. S., Canolty, R. T., Kirsch, H. E., Barbaro, N. M., et al. (2010). Spatiotemporal imaging of cortical activation during verb generation and picture naming. Neuroimage, 50(1), 291–301.
Egner, T., & Hirsch, J. (2005). The neural correlates and functional integration of cognitive control in a Stroop task. Neuroimage, 24(2), 539–547.
Fridriksson, J., Baker, J. M., & Moser, D. (2009). Cortical mapping of naming errors in aphasia. Human Brain Mapping, 30(8), 2487–2498.
Fridriksson, J., Morrow, K. L., Moser, D., & Baylis, G. C. (2006). Age-related variability in cortical activity during language processing. Journal of Speech, Language, and Hearing Research, 49(4), 690–697.
Ganushchak, L. Y., Christoffels, I. K., & Schiller, N. O. (2011). The use of electroencephalography in language production research: a review. Frontiers in Psychology, 2, 208.
Glaser, W. R. (1992). Picture naming. Cognition, 42(1–3), 61–105.
Glaser, W. R., & Düngelhoff, F. J. (1984). The time course of picture-word interference. Journal of Experimental Psychology: Human Perception and Performance, 10(5), 640.
Goodglass, H., & Wingfield, A. (1997). Word-finding deficits in aphasia: Brain-behavior relations and clinical symptomatology. In H. Goodglass & A. Wingfield (Eds.), Anomia: Neuroanatomical and Cognitive Correlates. San Diego: Academic Press.
Graves, W. W., Grabowski, T. J., Mehta, S., & Gordon, J. K. (2007). A neural signature of phonological access: distinguishing the effects of word frequency from familiarity and length in overt picture naming. Journal of Cognitive Neuroscience, 19(4), 617–631.
Griffin, Z. M., & Bock, K. (1998). Constraint, Word Frequency, and the Relationship between Lexical Processing Levels in Spoken Word Production. Journal of Memory and Language, 38(3), 313–338.
Handwerker, D. A., Ollinger, J. M., & D’Esposito, M. (2004). Variation of BOLD hemodynamic responses across subjects and brain regions and their effects on statistical analyses. Neuroimage, 21(4), 1639–1651.
Heim, S., Amunts, K., Mohlberg, H., Wilms, M., & Friederici, A. D. (2006). Head motion during overt language production in functional magnetic resonance imaging. Neuroreport, 17(6), 579–582.
Hickok, G., & Poeppel, D. (2004). Dorsal and ventral streams: a framework for understanding aspects of the functional anatomy of language. Cognition, 92(1–2), 67–99.
Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8(5), 393–402.
Hillis, A. E., Kleinman, J. T., Newhart, M., Heidler-Gary, J., Gottesman, R., Barker, P. B., et al. (2006). Restoring cerebral blood flow reveals neural regions critical for naming. Journal of Neuroscience, 26(31), 8069–8073.
Hillis, A. E., Tuffiash, E., Wityk, R. J., & Barker, P. B. (2002). Regions of neural dysfunction associated with impaired naming of actions and objects in acute stroke. Cognitive Neuropsychology, 19(6), 523–534.
Huettel, S. A., & McCarthy, G. (2001). Regional differences in the refractory period of the hemodynamic response: an event-related fMRI study. Neuroimage, 14(5), 967–976.
Humphreys, G. W., Riddoch, M. J., & Quinlan, P. T. (1988). Cascade processes in picture identification. Cognitive Neuropsychology, 5(1), 67–104.
Indefrey, P., & Levelt, W. J. (2000). The neural correlates of language production. In M. Gazzaniga (Ed.), The New Cognitive Neurosciences (2nd ed.) (pp. 845–865). Cambridge, MA: MIT Press.
Indefrey, P., & Levelt, W. J. (2004). The spatial and temporal signatures of word production components. Cognition, 92(1–2), 101–144.
Jescheniak, J. D., & Levelt, W. J. M. (1994). Word frequency effects in speech production: Retrieval of syntactic information and of phonological form. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20(4), 824.
Kan, I. P., & Thompson-Schill, S. L. (2004). Effect of name agreement on prefrontal activity during overt and covert picture naming. Cognitive, Affective, and Behavioral Neuroscience, 4(1), 43–57.
Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: a module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17(11), 4302–4311.
Kida, I., & Hyder, F. (2006). Physiology of functional magnetic resonance imaging: energetics and function. Methods in Molecular Medicine, 124, 175–195.
Koutstaal, W., Wagner, A. D., Rotte, M., Maril, A., Buckner, R. L., & Schacter, D. L. (2001). Perceptual specificity in visual object priming: functional magnetic resonance imaging evidence for a laterality difference in fusiform cortex. Neuropsychologia, 39(2), 184–199.
La Heij, W., Dirkx, J., & Kramer, P. (1990). Categorical interference and associative priming in picture naming. British Journal of Psychology, 81(4), 511–525.
Le Van Quyen, M., & Bragin, A. (2007). Analysis of dynamic brain oscillations: methodological advances. Trends in Neurosciences, 30(7), 365–373.
Levelt, W. J. (2001). Spoken word production: a theory of lexical access. Proceedings of the National Academy of Sciences of USA, 98(23), 13464–13471.
Levelt, W. J., Praamstra, P., Meyer, A. S., Helenius, P., & Salmelin, R. (1998). An MEG study of picture naming. Journal of Cognitive Neuroscience, 10(5), 553–567.
Levelt, W. J., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22(1), 1–38; discussion 38–75.
Maess, B., Friederici, A. D., Damian, M., Meyer, A. S., & Levelt, W. J. (2002). Semantic category interference in overt picture naming: sharpening current density localization by PCA. Journal of Cognitive Neuroscience, 14(3), 455–462.
Mechelli, A., Josephs, O., Lambon Ralph, M. A., McClelland, J. L., & Price, C. J. (2007). Dissociating stimulus-driven semantic and phonological effect during reading and naming. Human Brain Mapping, 28(3), 205–217.
Meltzer, J. A., & Braun, A. R. (2011). An EEG-MEG Dissociation between Online Syntactic Comprehension and Post Hoc Reanalysis. Frontiers in Human Neuroscience, 5, 10.
Meltzer, J. A., McArdle, J. J., Schafer, R. J., & Braun, A. R. (2010). Neural aspects of sentence comprehension: syntactic complexity, reversibility, and reanalysis. Cerebral Cortex, 20(8), 1853–1864.
Meltzer, J. A., Negishi, M., & Constable, R. T. (2008). Biphasic hemodynamic responses influence deactivation and may mask activation in block-design fMRI paradigms. Human Brain Mapping, 29(4), 385–399.
Meltzer, J. A., Postman-Caucheteux, W. A., McArdle, J. J., & Braun, A. R. (2009). Strategies for longitudinal neuroimaging studies of overt language production. Neuroimage, 47(2), 745–755.
Meyer, A. S., & Schriefers, H. (1991). Phonological facilitation in picture-word interference experiments: Effects of stimulus onset asynchrony and types of interfering stimuli. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17(6), 1146.
Miezin, F. M., Maccotta, L., Ollinger, J. M., Petersen, S. E., & Buckner, R. L. (2000). Characterizing the hemodynamic response: effects of presentation rate, sampling procedure, and the possibility of ordering brain activity based on relative timing. Neuroimage, 11(6 Pt 1), 735–759.
Morrison, C. M., Ellis, A. W., & Quinlan, P. T. (1992). Age of acquisition, not word frequency, affects object naming, not object recognition. Memory & Cognition, 20(6), 705–714.
Mouraux, A., & Iannetti, G. D. (2008). Across-trial averaging of event-related EEG responses and beyond. Magnetic Resonance Imaging, 26(7), 1041–1054.
Newman, S. D., Twieg, D. B., & Carpenter, P. A. (2001). Baseline conditions and subtractive logic in neuroimaging. Human Brain Mapping, 14(4), 228–235.
Ogawa, S., Lee, T. M., Kay, A. R., & Tank, D. W. (1990). Brain magnetic resonance imaging with contrast dependent on blood oxygenation. Proceedings of the National Academy of Sciences of USA, 87(24), 9868–9872.
Ojemann, G. A., Fried, I., & Lettich, E. (1989). Electrocorticographic (ECoG) correlates of language. I. Desynchronization in temporal language cortex during object naming. Electroencephalography and Clinical Neurophysiology, 73(5), 453–463.
Postman-Caucheteux, W. A., Birn, R. M., Pursley, R. H., Butman, J. A., Solomon, J. M., Picchioni, D., et al. (2010). Single-trial fMRI Shows Contralesional Activity Linked to Overt Naming Errors in Chronic Aphasic Patients. Journal of Cognitive Neuroscience, 22(6), 1299–1318.
Sahin, N. T., Pinker, S., Cash, S. S., Schomer, D., & Halgren, E. (2009). Sequential processing of lexical, grammatical, and phonological information within Broca’s area. Science, 326(5951), 445–449.
Salmelin, R., Hari, R., Lounasmaa, O. V., & Sams, M. (1994). Dynamics of brain activation during picture naming. Nature, 368(6470), 463–465.
Schriefers, H. (1993). Syntactic processes in the production of noun phrases. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19(4), 841.
Schriefers, H., Meyer, A. S., & Levelt, W. J. M. (1990). Exploring the time course of lexical access in language production: Picture-word interference studies. Journal of Memory and Language, 29(1), 86–102.
Stark, C. E., & Squire, L. R. (2001). When zero is not zero: the problem of ambiguous baseline conditions in fMRI. Proceedings of the National Academy of Sciences of USA, 98(22), 12760–12766.
Szekely, A., D’Amico, S., Devescovi, A., Federmeier, K., Herron, D., Iyer, G., et al. (2003). Timed picture naming: extended norms and validation against previous studies. Behavior Research Methods, Instruments, & Computers, 35(4), 621–633.
Tallon-Baudry, C., & Bertrand, O. (1999). Oscillatory gamma activity in humans and its role in object representation. Trends in Cognitive Sciences, 3(4), 151–162.
Towle, V. L., Yoon, H. A., Castelle, M., Edgar, J. C., Biassou, N. M., Frim, D. M., et al. (2008). ECoG gamma activity during a language task: differentiating expressive and receptive speech areas. Brain, 131(Pt 8), 2013–2027.
van Turennout, M., Bielamowicz, L., & Martin, A. (2003). Modulation of neural activity during object naming: effects of time and practice. Cerebral Cortex, 13(4), 381–391.
van Turennout, M., Ellmore, T., & Martin, A. (2000). Long-lasting cortical plasticity in the object naming system. Nature Neuroscience, 3(12), 1329–1334.
van Turennout, M., Hagoort, P., & Brown, C. M. (1997). Electrophysiological evidence on the time course of semantic and phonological processes in speech production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23(4), 787–806.
Vigliocco, G., Antonini, T., & Garrett, M. F. (1997). Grammatical gender is on the tip of Italian tongues. Psychological Science, 8(4), 314.
Vihla, M., Laine, M., & Salmelin, R. (2006). Cortical dynamics of visual/semantic vs. phonological analysis in picture confrontation. Neuroimage, 33(2), 732–738.
Vuilleumier, P., Henson, R. N., Driver, J., & Dolan, R. J. (2002). Multiple levels of visual object constancy revealed by event-related fMRI of repetition priming. Nature Neuroscience, 5(5), 491–499.
Wilson, S. M., Isenberg, A. L., & Hickok, G. (2009). Neural correlates of word production stages delineated by parametric modulation of psycholinguistic variables. Human Brain Mapping, 30(11), 3596–3608.
Winterer, G., Carver, F. W., Musso, F., Mattay, V., Weinberger, D. R., & Coppola, R. (2007). Complex relationship between BOLD signal and synchronization/desynchronization of human brain MEG oscillations. Human Brain Mapping, 28(9), 805–816.
Wu, H. C., Nagasawa, T., Brown, E. C., Juhasz, C., Rothermel, R., Hoechstetter, K., et al. (2011). Gamma-oscillations modulated by picture naming and word reading: intracranial recording in epileptic patients. Clinical Neurophysiology, 122(10), 1929–1942.
Yacoub, E., Shmuel, A., Pfeuffer, J., Van De Moortele, P. F., Adriany, G., Ugurbil, K., et al. (2001). Investigation of the initial dip in fMRI at 7 Tesla. NMR in Biomedicine, 14(7–8), 408–412.
Zumer, J. M., Brookes, M. J., Stevenson, C. M., Francis, S. T., & Morris, P. G. (2010). Relating BOLD fMRI and neural oscillations through convolution and optimal linear weighting. Neuroimage, 49(2), 1479–1489.
Author’s address
Jed A. Meltzer
Rotman Research Institute, Baycrest Centre
3560 Bathurst Street
Toronto, Ontario M6A 2E1
CANADA
Phone: 416 785-2500 #2117
Fax: 416 785-2862
[email protected]