
Chapter 25

Auditory Evoked Potentials and Their Utility in the Assessment of Complex Sound Processing

Mitchell Steinschneider, Catherine Liégeois-Chauvel, and John F. Brugge

Abbreviations

AEP    auditory evoked potential
AI     primary auditory cortex
AII    secondary auditory cortex
AM     amplitude modulation
BF     best frequency
CSD    current source density
ECD    equivalent current dipole
EEG    electroencephalogram
EPSP   excitatory postsynaptic potential
ERP    event-related potential
fMRI   functional magnetic resonance imaging
GFP    global field power
HG     Heschl’s gyrus
ISI    interstimulus interval
LD     Laplacian derivation
LLR    long-latency response
MEG    magnetoencephalography
MLR    middle-latency response
MMN    mismatch negativity
MUA    multi-unit activity
ORN    object-related negativity
PAC    primary auditory cortex
PCP    processing-contingent potential
PET    positron emission tomography
PT     planum temporale
SAC    secondary auditory cortex
SSR    steady-state response
STG    superior temporal gyrus
STP    supratemporal plane
VOT    voice onset time

1 Studying Human Auditory Cortex

Human auditory cortex is, in the classical sense, composed of multiple fields distributed both on the exposed surface of the superior temporal gyrus (STG) and on the areas buried within the Sylvian fissure on the supratemporal plane (STP). In addition, cortex of the parietal and frontal lobes, while not generally considered part of the classical auditory forebrain, also participates in higher-order operations involving acoustic input (Romanski et al. 1999; Cohen et al. 2004; Gifford and Cohen 2005). Understanding the functions of these various auditory cortical areas requires complementary experimental approaches. This chapter highlights how event-related potentials (ERPs) and the electroencephalogram (EEG) serve as important tools for understanding human auditory cortical physiology. Not all advances in this field can be reviewed fully in one chapter. Instead, certain key issues related to the use of these approaches to understanding complex acoustic processing at the cortical level will be discussed, and the relevance of their measures for evolving concepts of auditory cortical function and dysfunction will be highlighted. We first provide a brief overview of the generally accepted classification of ERPs evoked by acoustic stimulation, the auditory evoked potentials (AEPs), and of their attentional modulation. The locations of the neural generators of the AEP are then discussed, along with the various experimental and modeling approaches used to obtain this information. Emphasis is placed on the relationships between electrophysiological results obtained in human subjects and those obtained in laboratory animals, as the two share many processing mechanisms. We also describe the postnatal maturation of AEPs, as many AEP studies involve both normally developing children and those with developmental disabilities.
Human intracranial recordings of AEPs and EEG, with their superior spatial and temporal resolution, have the unique potential to provide the ideal criterion for defining auditory cortical function (Lachaux et al. 2003; Engel et al. 2005). The ability to record simultaneously action potential

M. Steinschneider, Kennedy Center, Albert Einstein Medical College, Bronx, NY 10461, USA e-mail: [email protected]

J.A. Winer, C.E. Schreiner (eds.), The Auditory Cortex, DOI 10.1007/978-1-4419-0074-6_25, © Springer Science+Business Media, LLC 2011


activity and the AEP enhances the utility of the approach and allows for pointed comparisons of human and animal auditory cortical activity (Creutzfeldt et al. 1989; Howard et al. 1996; Jacobs et al. 2007). These invasive approaches, along with recording potentials from the scalp, complement other non-invasive imaging methods described elsewhere such as magnetoencephalography (MEG), functional magnetic resonance imaging (fMRI), and positron emission tomography (PET). We discuss the roles of AEP recordings in cognitive neuroscience by examining their contributions in testing hypotheses about auditory scene analysis (Bregman 1990) and the dual-stream hypothesis about the identity and location of acoustic objects (Rauschecker and Tian 2000). We suggest that the definition of unimodal auditory cortex has to be reevaluated based on evidence in humans and monkeys of multisensory input to classically defined temporal lobe auditory cortical areas (Lakatos et al. 2007; Reale et al. 2007). We highlight the role of AEPs in cognition by examining how context modulates auditory cortical activity and how cortical temporal response properties may shape complex acoustic and phonetic perceptions. Normal response patterns are compared with those in people with hearing and language disorders. Dyslexia is the model by which hypotheses of perceptual dysfunction can be assessed by physiological means. AEPs emphasize time-locked responses at the expense of stimulus-related but not precisely time-locked activity. We conclude by discussing the growing literature revealing the extent of these stimulus-induced responses often seen within high-frequency EEG bands (Freiwald et al. 2001; Ward 2003; Herrmann et al. 2004). These bands exhibit considerable stimulus specificity and sensitivity and thus provide an additional way of assessing auditory cortical activity.

2 Average Evoked Potentials: Definition and Classification

Cortical AEPs can be divided into three main categories based on the timing and polarity of voltage deflections after the onset of an effective acoustic stimulus. These are middle-latency (MLR) and long-latency (LLR) responses, and processing-contingent potentials (PCPs). MLRs and LLRs are true sensory evoked potentials in that they are evoked by physical attributes of sounds. PCPs, in contrast, are a diverse group of responses that are not directly related to the physical characteristics of the stimulus, but instead reflect some additional sound processing; they are often referred to as endogenous potentials. PCPs include both the so-called automatic discriminative responses such as the mismatch negativity (MMN) and those that require active, attention-dependent sound processing. However, these distinctions are often blurred since attention can strongly modulate exogenous and endogenous waveforms (Woldorff and Hillyard 1991; Woldorff et al. 1993; Alain and Woods 1997; Winkler et al. 2006).

3 Middle-Latency Responses

The MLR is a sequence of lower amplitude AEP deflections with latencies from 12 to 50 ms after sound onset (Picton et al. 1974; Borgmann et al. 2001; Yvert et al. 2005). The five waves are conventionally labeled by their voltage polarity and temporal sequence: P0, Na, Pa, Nb, and Pb. An earlier wave, N0 (latency ∼8 ms), is usually believed to be of subcortical origin. Na (peak latency 15–25 ms) and Pa (peak latency 25–30 ms) are the most reliably recorded MLRs. Pb (also termed P1) is often considered the first LLR deflection. A variant of the MLR is the steady-state response (SSR), a quasi-sinusoidal response that is elicited by repetitive, amplitude- or frequency-modulated sounds, matches the modulation frequency of the stimulus, and may represent a composite of MLR components (Herdman et al. 2002; Poulsen et al. 2007).
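As a rough illustration of a response that follows the stimulus modulation rate, the sketch below simulates an averaged response containing a sinusoid at an assumed 40-Hz modulation frequency plus noise, and reads the response amplitude from the spectrum at that frequency. The sampling rate, modulation rate, amplitudes, and the function name `ssr_amplitude` are illustrative assumptions, not values or methods taken from the studies cited above.

```python
import numpy as np

def ssr_amplitude(avg_response, fs, mod_freq):
    """Return (amplitude, bin frequency) of the spectral bin nearest mod_freq."""
    n = avg_response.size
    spectrum = np.fft.rfft(avg_response)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    k = np.argmin(np.abs(freqs - mod_freq))
    # Scale by 2/n so that a sinusoid lying exactly on a bin returns its amplitude
    return 2.0 * np.abs(spectrum[k]) / n, freqs[k]

fs = 1000.0                        # assumed sampling rate (Hz)
mod_freq = 40.0                    # assumed modulation rate (Hz)
n_samples = 1000                   # 1-s epoch, so 40 Hz falls on an exact FFT bin
t = np.arange(n_samples) / fs

rng = np.random.default_rng(0)
simulated = 0.5 * np.sin(2 * np.pi * mod_freq * t) + 0.05 * rng.standard_normal(n_samples)

amp, f_bin = ssr_amplitude(simulated, fs, mod_freq)
print(f"amplitude at {f_bin:.1f} Hz: {amp:.3f}")
```

The epoch length is chosen so that the modulation frequency coincides with an FFT bin; with other epoch lengths, spectral leakage would reduce the estimate.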

4 Long-Latency Responses

The most extensively studied exogenous AEP waveforms contain larger and more reliably recorded P1, N1, and P2 deflections, which peak near 50, 100, and 200 ms, respectively. P1 and N1 have voltage maxima over the frontocentral scalp, whereas the P2 maximum is more posterior, near the vertex (Wood and Wolpaw 1982; Näätänen and Picton 1987; Crowley and Colrain 2004). All deflections invert in polarity in the mastoid region, ventral to the underlying Sylvian fissure, which is consistent, at least in part, with their neural generators being located on the STP. Although study of the LLR has been invaluable in probing the physiology of auditory cortex, changes in its waveform, especially in N1, are often overinterpreted. For instance, it is difficult to ascribe to N1 a crucial role in perception if this wave dissipates at interstimulus intervals (ISIs) typical of speech or music (Snyder and Large 2004). It is also problematic to infer details of auditory cortical organization from changes in amplitude or scalp voltage distribution of each waveform deflection, as these likely have multiple neural generators in distinct cytoarchitectonic fields (Näätänen and Picton 1987; Scherg et al. 1989; Giard et al. 1994). Without information acquired by other means, such as intracranial recordings, we cannot determine the relative voltage contributions of the many fields comprising auditory cortex. More general aspects of auditory cortical function, such as detecting acoustic change, remain amenable to such analysis (Martin and Boothroyd 2000). Often considered as part of an N1/P2 complex that covaries with changes in stimulus parameters, P2 has its own developmental time course and can be experimentally dissociated from N1 (Ponton et al. 2000; Shahin et al. 2003, 2005; Crowley and Colrain 2004). Many consider P2 as resulting from non-lemniscal pathways, in contrast to N1 and its presumed relationship with the ascending lateral lemniscal system (Crowley and Colrain 2004).
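A minimal sketch of how P1, N1, and P2 might be located in an averaged waveform is given below. It searches for the largest deflection of the expected polarity within a latency window around each nominal peak. The window limits, the synthetic waveform, and the names `WINDOWS` and `pick_peaks` are hypothetical choices made for illustration; they are not taken from the studies cited here, and a peak found at one electrode says nothing by itself about the underlying generators.

```python
import numpy as np

# Hypothetical search windows (start ms, end ms, polarity) around the nominal latencies
WINDOWS = {"P1": (30, 80, +1), "N1": (80, 150, -1), "P2": (150, 250, +1)}

def pick_peaks(waveform, times_ms, windows=WINDOWS):
    """Return {name: (latency_ms, amplitude)} for the extreme value in each window."""
    waveform = np.asarray(waveform, dtype=float)
    peaks = {}
    for name, (start, end, polarity) in windows.items():
        idx = np.flatnonzero((times_ms >= start) & (times_ms <= end))
        # Multiplying by the polarity turns the search for a trough into a search for a maximum
        best = idx[np.argmax(polarity * waveform[idx])]
        peaks[name] = (float(times_ms[best]), float(waveform[best]))
    return peaks

def gaussian(times_ms, center, width, amplitude):
    """Gaussian bump used only to build a synthetic test waveform."""
    return amplitude * np.exp(-0.5 * ((times_ms - center) / width) ** 2)

times_ms = np.arange(-100, 400)  # 1-ms steps relative to stimulus onset
synthetic = (gaussian(times_ms, 50, 10, 1.0)
             + gaussian(times_ms, 100, 15, -3.0)
             + gaussian(times_ms, 200, 25, 2.0))

peaks = pick_peaks(synthetic, times_ms)
for name, (latency, amplitude) in peaks.items():
    print(f"{name}: {latency:.0f} ms, {amplitude:+.2f}")
```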

5 Automatic Processing: Contingent Responses

MMN is now one of the best studied PCP deflections. In its simplest form, MMN is generated when a repetitive stimulus (the standard) is occasionally replaced by a different stimulus (the deviant) that varies along some physical dimension. When the AEP evoked by the standard is subtracted from the AEP evoked by the deviant, the negative deflection in the difference waveform, the MMN, has a latency of ∼150–200 ms. This difference wave may be a physiological manifestation of an automatic change-detection process coding a discernible difference between a new sound input and a preceding sound pattern (Näätänen et al. 2005). In essence, MMN is generated when a sound deviates from the sensory-memory representation of the prior acoustic environment. This pre-perceptual process occurs without attentional requirements. Often, the MMN amplitude increases, and its latency decreases, in parallel with the degree of dissimilarity between the standard and deviant stimuli (Friedman et al. 2001). As a pre-perceptual measure, the MMN has become a primary means for evaluating auditory sensory memory and sensory discrimination objectively (Picton et al. 2000; Näätänen et al. 2001). MMN is used to examine the representation and discrimination of simple acoustic attributes, combinations of sound features (Pakarinen et al. 2007), speech (Sharma et al. 1993; Sharma and Dorman 2000), and more complex patterns (Alain et al. 1998; Sussman 2005). Its capacity to examine key aspects of hearing, and its elicitation in passive behavioral states, have made MMN an attractive physiological tool in the assessment of both children and adults with difficulty in task-related performance (Cheour et al. 1998; Ferri et al. 2003; Leppänen et al. 2004; Jing and Benasich 2006).

Another interesting PCP is P3a (Escera et al. 1998; Friedman et al. 2001), a positive wave that peaks at about 300 ms after stimulus onset and is classically elicited by a novel sound. It is now known to arise after a large change in the acoustic background and may be an electrophysiological sign of automatic, attentional switching mechanisms. While usually preceded by MMN, P3a does not require MMN for its elicitation, although its emergence may require N1 (Rinne et al. 2006; Sabri et al. 2006). It occurs without attention and has maximal amplitude over frontal regions. As expected for a novelty response, P3a habituates with repeated stimulus presentation. P3a is distinct from a slightly later positive wave, the P3b, which is an attention-dependent positive deflection with maximal amplitude over parietal areas and which is elicited after the subject’s detection of a target event.

6 Role of Attention

Attention is a powerful contextual modulator of AEPs. Several waves in the AEP are dependent upon attention and are task related. These attention-related potentials often overlap with non-attention-dependent waves, requiring additional methods to separate activity associated with different processes. Typically, the AEP evoked when a behavioral response is not required is subtracted from the AEP elicited when it is, yielding a difference waveform thought to be associated with attention. A valuable approach uses selective attention tasks, which allow simultaneous acquisition of AEPs evoked by attended and unattended stimuli. One paradigm involves attending to target stimuli embedded in a stream of non-target stimuli presented to one ear only while ignoring similar trains of stimuli presented to the unattended ear. Integration of attentional paradigms with AEP acquisition allows the assessment of whether selective attention is based on early cortical gain control of sound input by comparing the amplitudes of obligatory AEP waveforms in the attended and unattended conditions, or whether early processes are relatively unmodulated and new and later neural events are engaged (Näätänen 1990; Coull 1998; Giard et al. 2000). Using stimulus paradigms described above, selective attention clearly enhances MLR waves, suggesting that attention can modulate early stages of cortical neural activity (Woldorff and Hillyard 1991; Woldorff et al. 1993). Enhancement of exogenous AEP components includes the N1 and P2 waves, and parallels increase in task difficulty and attentional load (Woods et al. 1994; Neelon et al. 2006; Sabri et al. 2006). These results thus support a gain control theory of attention by showing that MLR and LLR waves, which represent measures of auditory cortical activation elicited by external stimuli, are modulated by attentional constraints. Attention also induces complex modulatory effects on automatic PCPs. 
Increased MMN amplitude results when deviant stimuli are presented to an attended ear (Szymanski et al. 1999). However, its amplitude decreases in more demanding tasks and when MMN is elicited by task-irrelevant stimulus changes. P3a modulation is also induced by attentional mechanisms. When attention is directed to a different sensory modality, irrelevant sound changes evoke a smaller P3a than when attention is directed at auditory stimuli, while more demanding sound discrimination tasks enhance P3a if it is evoked by task-irrelevant stimulus changes (Sabri et al. 2006). Processing negativities are also generated by attention (Hillyard and Picton 1987; Näätänen 1990; Woods et al. 1994). Both early and late negativities, termed Nd for negative difference waves, are seen. The early Nd wave overlaps the N1/P2 complex and has shorter latencies when the target is easier to distinguish from the background, making this wave a useful measure of the speed of stimulus discrimination. Later segments of Nd can be distinguished by their differential topography: earlier portions have a frontocentral maximum; later ones a more frontal distribution (Giard et al. 2000). These processing negativities occur to both target and non-target stimuli. In contrast, another processing negativity (N2) is elicited only by target stimuli (Novak et al. 1990; Woods et al. 1994).
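The subtraction used to isolate attention-related activity (attended minus unattended, giving Nd) can be sketched as follows; the same arithmetic, applied to deviant and standard averages, yields the MMN difference waveform described in the preceding section. All waveforms here are simulated, and the trial counts, amplitudes, latencies, and function names are assumptions made for illustration only.

```python
import numpy as np

def average_epochs(epochs):
    """Average single-trial epochs (trials x samples) into an evoked waveform."""
    return np.asarray(epochs, dtype=float).mean(axis=0)

def difference_wave(epochs_a, epochs_b):
    """Average of condition A minus average of condition B."""
    return average_epochs(epochs_a) - average_epochs(epochs_b)

rng = np.random.default_rng(1)
times_ms = np.arange(-100, 500)        # 1-ms steps relative to stimulus onset
n_trials = 200                          # assumed number of trials per condition

# Simulated obligatory N1-like response, present in both conditions
obligatory = -2.0 * np.exp(-0.5 * ((times_ms - 100) / 20.0) ** 2)
# Simulated additional negativity present only when the stimulus is attended
attention_effect = -1.0 * np.exp(-0.5 * ((times_ms - 150) / 30.0) ** 2)

unattended = obligatory + 0.5 * rng.standard_normal((n_trials, times_ms.size))
attended = obligatory + attention_effect + 0.5 * rng.standard_normal((n_trials, times_ms.size))

nd = difference_wave(attended, unattended)
nd_peak_latency = times_ms[np.argmin(nd)]
print(f"Nd peak: {nd.min():.2f} at {nd_peak_latency} ms")
```

Because the obligatory response is common to both conditions, it cancels in the subtraction and only the simulated attention-related negativity (plus residual noise) remains.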

7 Neural Bases of Event-Related Potentials

7.1 Non-invasive Measures of Event-Related Potential Generator Localization

Mapping the locations and spatial distributions of ERP generators usually involves, first, creating isopotential plots of the voltage at each recording site serially in time. A fundamental premise is that different scalp topographies reflect different neural generators in the cortex acting over time (Michel et al. 2004). There are problems, however, in identifying ERP generators only from scalp topographies. First, scalp-recorded ERPs have inherently poor spatial resolution, due largely to the spatial blurring effect of the skull (Babiloni et al. 2001; Nunez and Srinivasan 2006). Second, the various voltage deflections need not arise near the electrode sites with maximal response amplitude (Arezzo et al. 1975; Wood and Wolpaw 1982; Gloor 1985). Third, reference electrode activity may seriously bias waveform polarity and amplitude, without changing overall potential gradients (Skrandies 1990). Fourth, a sensory stimulus activates multiple brain generators which summate at the scalp to produce complex voltage topographies and waveforms. Scalp-recorded ERPs thus reflect both augmentation and cancellation of neural activity in active tissue subregions. Cortical ERPs reflect mainly the postsynaptic activity in pyramidal neurons that is subject to the largest spatial and temporal summation, with each pyramidal cell neuronal column behaving as an electrical dipole (Speckmann and Elger 1999). Thus, not all active brain regions yield easily detected voltage fields on the scalp, including cortical layers with little or no dipolar cellular architecture, or deep brain nuclei with closed field architecture. These localization issues in scalp-recorded topography highlight difficulties in the well-known inverse problem: the volume conductor of the brain contains, theoretically, an infinite number of potential neural sources that can produce a specific topographic pattern.

AEP components are traditionally characterized by the timing and polarity of positive and negative waveform peaks. Peak latencies may vary as a function of electrode site due to the simultaneous activation of multiple generators (Michel et al. 2004). Examining only those waveforms recorded at a single, user-chosen electrode site may yield inaccurate AEP categorization. Global field power (GFP) is a more objective measure of the AEP waveform, and it is the square root of the mean of the squared voltage differences between all electrode sites (Lehmann and Skrandies 1980; Skrandies 1990; Michel et al. 2004). It is a reference-free, user-independent measure of the net power of the electric field derived from the electrode grid. GFP peaks can be used to characterize AEP waveforms and peak latencies objectively. Additional methods can then be used to clarify generator source identification. Comparisons can also be made in the field topography across time points in the waveform under one experimental condition or across different conditions at the same time point to assess whether similar generator configurations are present. The global dissimilarity measure is a simple means to compare two topographies, and it is the square root of the mean of the squared differences between the voltages at all corresponding electrodes (Lehmann and Skrandies 1980; Skrandies 1990; Michel et al. 2004).
Prior to this calculation, each map is normalized by dividing its amplitudes by its own GFP to minimize topographical changes that might reflect changes in component amplitude rather than generator distribution. Field topography is sharpened by computing a second spatial derivative (Laplacian derivation, LD) of the raw data (Gevins et al. 1999; Babiloni et al. 2001; Nunez and Srinivasan 2006). With this spatial filtering, contributions to waveforms from the reference electrode or distant sources are reduced or eliminated, and the LD estimates the transcranial current flow to and from the skull directly beneath the recording electrode. The LD is computed by comparing the activity at an electrode site with the mean of its nearest neighbors (Gevins et al. 1999; Babiloni et al. 2001). More accurate estimates require spline interpolation (Perrin et al. 1987; Gevins et al. 1999; Babiloni et al. 2001). Models better approximating head shape, including the subject’s own head shape derived from MRI, further enhance spatial resolution (Gevins et al. 1999; Babiloni et al. 2001). While scalp AEP distributions are sharpened, the LD still includes contributions from multiple overlapping current generators and does not unequivocally identify sources of neural activity. For this identification, approximations to solving the inverse problem and direct intracranial recordings are required.

Several sophisticated models provide solutions to the inverse problem. Each makes various assumptions about the neural sources, the conductivity of the brain and its coverings, and head geometry (Baillet 2001; Michel et al. 2004). All methods require that the forward solution first be computed, so that an accurate voltage distribution across the scalp can be derived from a series of known generators with given strengths, locations, and orientations (Darvas et al. 2004). Equivalent current dipole (ECD) models and distributed source models are the two main classes of algorithm used to identify generators from scalp voltage topography. The principal assumption of ECD models is that a few dipoles with varying strengths, locations, and orientations identify the underlying generators (Scherg and von Cramon 1985; Cuffin 1998; Ebersole and Wade 1990), each representing the summed activity from a circumscribed brain region. ECD algorithms usually calculate the best fitting dipole locations, strengths, and orientations using an iterative process to reduce the residual variance between the scalp topography predicted from a forward solution and the actual voltage distribution. Dipole parameters are systematically modified to obtain the best-fit solution (Michel et al. 2004). Ultimately, ECD models decompose the evoked potential into a series of source waveforms providing the best statistical fit to the empirical data for an assumed number of dipoles. Advantages include relative ease of use, resistance to noise, and relatively accurate results for focal brain activation (Darvas et al. 2004; Im et al. 2005).
Disadvantages are the high dependence upon user-provided decisions and, in the classic application of these models, loose coupling between results and detailed anatomical information (Michel et al. 2004; Im et al. 2005). It is prudent to view an ECD as a center of gravity for activity in a given brain volume, understanding that details of the actual activation are inaccessible and that large activated areas may be mislocalized (Kobayashi et al. 2005). Attempts have been made to use a physiologically plausible number of dipoles (Im et al. 2005). Known anatomy and physiology of a structure place realistic constraints on dipole location and orientation, and MRI can help constrain dipoles to the grey matter and suggest accurate cranial models (Babiloni et al. 2001; Michel et al. 2004). fMRI has also been used to locate and estimate the number of ECD sources required in a paradigm (Mulert et al. 2004; Molholm et al. 2005; Schönwiesner et al. 2007). However, the relationship between fMRI and dipole estimation is complex (Logothetis 2003; Ahlfors and Simpson 2004; Benar et al. 2006) and each technique addresses different aspects of neural function. It is unrealistic to assume a direct correspondence between the two measures (Nunez and Silberstein 2000; Devor et al. 2003) since the locations and dimensions of fMRI activation are not always related to functional maps derived from scalp-recorded ERPs, and the latter may not always reflect fMRI changes (Ahlfors and Simpson 2004; Mulert et al. 2004).

Distributed source models do not require a predetermined number of dipoles to arrive at an inverse solution. Instead, brain activity is reconstructed from a three-dimensional grid of solution points distributed uniformly on the cortical surface, each functioning as a dipole of fixed location with varying strength and orientation (Michel et al. 2004). As there are many more unknowns (several thousand dipoles) than data (about 100 measurement points), each algorithm requires a mathematical constraint to have a unique solution. Several distributed source models with various assumptions have emerged and include minimum norm estimation (Hämäläinen and Ilmoniemi 1994) and the LORETA and LAURA models (Baillet 2001; Darvas et al. 2004; Michel et al. 2004; Bai et al. 2007). Reviews of distributed source models, and the problems inherent with ECD models, capture the imperfect nature of source localization based solely on indirect means and emphasize the role of more direct methods in supporting or modifying putative generators seen in non-invasive techniques.
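The topographic measures described in this section (GFP, global dissimilarity, and the nearest-neighbor Laplacian derivation) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of any cited study: GFP is computed as the spatial standard deviation of the average-referenced map, which equals the root mean square of all pairwise electrode differences up to a constant factor (published normalizations differ); the Laplacian is the simple "electrode minus mean of neighbors" form, whose sign and scaling conventions also vary; and the five-electrode chain, its neighbor table, and all function names are hypothetical.

```python
import numpy as np

def average_reference(v):
    """Subtract the mean across electrodes (axis 0) from a map or a channels x time array."""
    v = np.asarray(v, dtype=float)
    return v - v.mean(axis=0, keepdims=True)

def gfp(v):
    """Global field power, taken here as the spatial standard deviation across electrodes."""
    return average_reference(v).std(axis=0)

def global_dissimilarity(map_a, map_b):
    """RMS difference between two maps after each is scaled to unit GFP."""
    a = average_reference(map_a) / gfp(map_a)
    b = average_reference(map_b) / gfp(map_b)
    return np.sqrt(np.mean((a - b) ** 2, axis=0))

def nearest_neighbor_laplacian(v, neighbors):
    """Each electrode minus the mean of its listed neighbors (every electrode must be listed)."""
    v = np.asarray(v, dtype=float)
    out = np.empty_like(v)
    for ch, nbrs in neighbors.items():
        out[ch] = v[ch] - v[nbrs].mean(axis=0)
    return out

# Toy five-electrode chain with a linear voltage gradient (illustrative values only)
topo = np.array([2.0, 1.0, 0.0, -1.0, -2.0])
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

pairwise = topo[:, None] - topo[None, :]
rms_pairwise = np.sqrt(np.mean(pairwise ** 2))    # RMS over all ordered electrode pairs

print(gfp(topo), rms_pairwise / np.sqrt(2))       # the two values agree
print(gfp(topo + 5.0))                            # unchanged by a shift of reference
print(global_dissimilarity(topo, 3 * topo))       # same shape, different strength
print(global_dissimilarity(topo, -topo))          # polarity-inverted map
print(nearest_neighbor_laplacian(topo, chain))    # interior values vanish for a linear gradient
```

With this normalization the dissimilarity is near 0 for maps that differ only in strength and reaches 2 for maps of opposite polarity, which is why it is insensitive to amplitude changes that leave the generator configuration unchanged.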

8 Invasive Measures of Evoked Potential Generator Localization

Human intracranial recordings promise to help determine contributions made by neural structures to AEP generation (Halgren et al. 1998; Lachaux et al. 2003). Recordings obtained from patients undergoing evaluation for medically intractable epilepsy are an invaluable and unique window into human brain physiology, despite many limitations. The number and locations of recording sites and the time available for data acquisition are determined by clinical constraints. Electrodes may not be optimally oriented to map activity directly from a presumed cortical generator to the head surface, hampering a straightforward interpretation of the relation between surface recordings and their sources. Further, evoked activity within a region may not be uniform for a specific stimulus. Finally, because these patients may have neurological dysfunction in the brain region of interest, caution is required when extrapolating to the neurologically normal subject (Boatman and Miglioretti 2005; Boatman et al. 2006).

8.1 Generators of Specific Components

Despite its limitations, recording AEPs directly from the cerebral cortex allows relatively precise characterization of functional auditory areas for the stimuli and the tasks studied. When combined with data from non-invasive recordings, clues to the identity of the main generators of the AEP waveform are further strengthened. AEP deflections are often described as waveform components, implying that each deflection represents a discrete underlying neural process near the latency of the maxima or minima of the deflection. Given the uncertain location of sources of the scalp-evoked potential, this interpretation is not strictly tenable for extracranial recording data. It may be more valid for data from intracranial recording, where electrode contacts are very near known sources. One key to accurate interpretation of intracerebral responses is that polarity inversion between two adjacent recording sites indicates passage through the dipole generating the component (Vaughan and Arezzo 1988). The higher the amplitude of a component, the closer the generator is to the recording site. This permits distinguishing local field potentials from volume-conducted potentials, as the morphology and timing of the latter change little with distance (Badier and Chauvel 1995). There is consensus that the most posteromedial parts of Heschl’s gyrus (HG) contribute to the early P0 and Na components of the MLR (Scherg and Von Cramon 1986; Liégeois-Chauvel et al. 1994; Godey et al. 2001; Yvert et al. 2005). When two HG are present, a normal anatomical variant, the generator is on the more anterior gyrus but may extend into the intervening sulcus (Yvert et al. 2005). This includes the anatomically defined posteromedial auditory core cortex (Hackett et al. 2001). Posteromedial HG, slightly more anterolateral HG segments, Heschl’s sulcus, the planum temporale, and the posterior STG all may contribute to Pa (Liégeois-Chauvel et al. 1994; Steinschneider et al. 1999; Howard et al. 2000; Yvert et al. 2005). The involvement of multiple auditory cortical regions in this cortical wave is not surprising, given that electrical stimulation of posteromedial HG evokes short-latency responses within the posterolateral STG, anterolateral HG, and planum temporale (Liégeois-Chauvel et al. 1991; Howard et al. 2000; Brugge et al. 2003). Non-invasive dipole source localization emphasizes the critical contribution of the HG posteromedial segment to the scalp-recorded Pa (Scherg and Von Cramon 1986; Borgmann et al. 2001; Yvert et al. 2001). Multifocal generators predominate for the remainder of the AEP, with significant variability in waveform peak latencies, which is likely based on differences across subjects, electrode placements, and stimulus parameters (Howard et al. 2000; Godey et al. 2001). In most studies the evoked waveforms persist for several hundred milliseconds after stimulus onset, including in auditory core (Howard et al. 2000; Godey et al. 2001; Brugge et al. 2008). A complex pattern of temporally overlapping waves is recorded from diverse auditory cortex regions, as seen in AEPs elicited by a 1-kHz tone burst and recorded simultaneously from electrodes located in auditory core cortex (posteromedial HG, anterior and posterior primary auditory cortex (PAC)) and from non-core auditory cortex (secondary auditory cortex (SAC) and planum temporale (PT)). The earliest components (