Teki S, Kumar S, von Kriegstein K, Stewart L, Lyness CR, Moore BC, Capleton B,. 895. Griffiths TD (2012) Navigating the auditory scene: an expert role for the.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Comodulation enhances signal detection via priming of auditory cortical circuits
Joseph Sollini and Paul Chadderton Department of Bioengineering, Imperial College London, London SW7 2AZ, United Kingdom.
21 22 23 24 25 26 27 28
ACKNOWLEDGMENTS
29 30
We thank Andrei Kozlov for feedback on the manuscript, Matthew Brown and
31
Alexander Morris for discussions and support. This research was supported
32
by a MRC Career Development Award to PC (G1000512).
33 34
The authors declare no competing interests.
1
35
ABSTRACT
36 37
Acoustic environments are composed of complex overlapping
38
sounds that the auditory system is required to segregate into
39
discrete perceptual objects. The functions of distinct auditory
40
processing stations in this challenging task are poorly understood.
41
Here we show a direct role for mouse auditory cortex in detection
42
and segregation of acoustic information. We measured the
43
sensitivity of auditory cortical neurons to brief tones embedded in
44
masking noise. By altering spectrotemporal characteristics of the
45
masker, we reveal that sensitivity to pure tone stimuli is strongly
46
enhanced
47
corresponding to the psychoacoustic phenomenon comodulation
48
masking release. Improvements in detection were largest following
49
priming periods of noise alone, indicating that cortical segregation
50
is enhanced over time. Transient opsin-mediated silencing of
51
auditory cortex during the priming period almost completely
52
abolished these improvements, suggesting that cortical processing
53
may play a direct and significant role in detection of quiet sounds
54
in noisy environments.
in
coherently
modulated
55 56 57 58 59 60 61 62 63 64 65 66 67
2
broadband
noise,
68
SIGNIFICANCE STATEMENT
69 70
Auditory systems are adept at detecting and segregating competing sound
71
sources, but there is little direct evidence of how this process occurs in the
72
mammalian auditory pathway. We demonstrate that coherent broadband noise
73
enhances signal representation in auditory cortex, and that prolonged exposure
74
to noise is necessary to produce this enhancement. Using optogenetic
75
perturbation to selectively silence auditory cortex during early noise processing,
76
we show that cortical processing plays a crucial role in the segregation of
77
competing sounds.
78
3
79
INTRODUCTION
80 81
In the acoustic world, animals are challenged to detect salient sounds in noisy
82
backgrounds, a process of critical importance in communication, hunting and threat-
83
detection. The auditory system is well suited to the task as it demonstrates a
84
remarkable spectral (Rosenblith and Stevens, 1953) and temporal resolution (Klumpp
85
and Eady, 1956; Plomp, 1964), and this acuity is valuable for detecting changes in
86
natural soundscapes (McDermott et al., 2013; Andreou et al., 2015). Predictable, or
87
coherent, non-random features of soundscapes may be exploited by the auditory
88
system to improve sound processing (Ulanovsky et al., 2003; Taaseh et al., 2011;
89
Yaron et al., 2012; Teki et al., 2013; Krishnan et al., 2014; Nelken, 2014). A prevalent
90
feature of natural sound is coherent amplitude fluctuations across frequency, termed
91
‘comodulation’. Comodulation is present in both environmental sounds and
92
vocalizations (Nelken et al., 1999). Given its pervasiveness in nature, comodulation
93
may be a critical cue for grouping and segregating overlapping sounds (Nelken et al.,
94
1999; Krishnan et al., 2014).
95 96
Comodulation masking release (CMR) is a psychoacoustic phenomenon whereby
97
adding coherently modulated noise to an existing masker makes signals easier to
98
perceive (Hall et al., 1984). This effect is striking as additional noise energy normally
99
doesn’t change, or reduces, the detectability of signals (Fletcher, 1940). CMR
100
encompasses two separate processes, dependent on the relative frequencies of the
101
noise and the signal: within-channel CMR (signal and noise similar in frequency) and
102
across-channel CMR (signal and noise dissimilar in frequency). Within-channel CMR
103
can be implemented in the auditory periphery, but across-channel or “true” CMR
104
(Verhey et al., 2003) cannot be explained by mechanical processes in the ear and is
105
sensitive to cues of auditory grouping (Buss et al., 2009; Dau et al., 2009; Verhey et
106
al., 2012). Across-channel CMR is therefore a result of brain processing, but the
107
mechanism and location of such processing is not well understood.
108 109
Only a small number of studies have directly sought to understand the representation
110
and underlying mechanism(s) of co-modulation masking release at the cellular level.
111
In the peripheral auditory system, neuronal responses to pure tones are enhanced by
112
across-channel comodulation in a way consistent with human behaviour (Pressnitzer
113
et al., 2001). However, it is not clear whether this information is inherited or influenced
114
by processing at later stages of the auditory system. A CMR correlate has been shown
115
to develop progressively between the inferior colliculus (IC), medial geniculate body
4
116
(MGB) and auditory cortex (Nelken et al., 1999; Las et al., 2005), although, this work
117
explored both within and across-channel cues simultaneously. As such, it remains
118
unclear how much of the observed CMR is attributable to across-channel processes
119
(Verhey et al., 2003; Grose et al., 2004). Neuronal correlates of within-channel CMR
120
have been observed in the avian auditory forebrain area L2a, however, when
121
measured in an across-channel configuration (comparing narrowband and broadband
122
comodulated noise) no significant CMR was found (Nieder and Klump, 2001; Hofer
123
and Klump, 2003).
124 125
In this study, we set out to quantify the influence of across-channel comodulation on
126
signal detectability in the neuronal activity of primary auditory cortex (A1), a critical site
127
in auditory perception (Bizley and Cohen, 2013). We then sought to establish whether
128
processing by auditory cortical circuits plays a functional role in the formation of across-
129
channel CMR. We provide the first quantification of across-channel CMR in A1.
130
Transient inactivation during early sound processing significantly reduces subsequent
131
masking release, suggesting that auditory cortex may play a causal role in enhancing
132
signal representation in the presence of comodulated noise.
133 134 135
5
136 137 138 139
MATERIALS AND METHODS
Animals and preparation
140 141
The care and experimental manipulation of animals was performed in accordance with
142
institutional and United Kingdom Home Office guidelines. Homozygous Pvalb-IRES-
143
Cre mice (JAX Stock No. 008069) of both genders were used in this study. At 6 weeks,
144
auditory cortex was injected unilaterally with AAV-EF1a-DIO-hChR2(H134R)-EYFP
145
virus in order to express selectively ChR2 and EYFP proteins in PV+ interneurons.
146
Mice were anesthetized with 1-2% isoflurane under aseptic conditions and held using
147
ear bars on a stereotaxic mount (Angle 2, Leica). A small burr hole was made with a
148
dental drill ~1.m lateral from midline and ~2.7mm caudal to bregma. A small glass
149
pipette holding the virus was advanced to reside in auditory cortex. 0.5 µl was then
150
slowly injected into cortex over a period of 15 minutes. The pipette was then removed
151
and the burr hole sealed with Kwik-Cast (World Precision Instruments, USA), once dry
152
acrylic dental cement was layered over the top forming a hard seal over the site. The
153
area was then cleaned (Iodine) and the tissue sealed (Histoacryl, Braun Corporation,
154
USA). Analgesia was administered during the procedure via intra-peritoneal injection
155
(Carprofen; 5mg/kg). The animal was then recovered and the virus left to express for
156
2 weeks, during which time Buprenorphine (0.8mg/kg) jelly was used for postoperative
157
analgesia.
158 159
In vivo electrophysiology
160 161
For electrophysiological recordings mice were anaesthetized with a fentanyl/
162
midazolam/ medetomidine/mixture (0.05, 5.0 and 0.5 mg/kg). A midline incision was
163
made and all tissue was cleared from the cranium. Hatched score lines were made on
164
the cranial surface with a dental drill to increase surface area and a customized head
165
plate was fixed using adhesive (Histoacryl, Braun Corporation, USA). A small
166
craniotomy was made directly above auditory cortex and dura was removed using fine
167
forceps. Mice were fixed by clamping the headplate to a customized clamp stand. A
168
silicon microelectrode (A32, 4x2Tet Neuronexus) was advanced via micromanipulator
169
(IVM, Scientifica, UK) into auditory cortex at an angle perpendicular to the cortical
170
surface. An optical fiber, attached to DPSS laser (473nm, 100 mW; SLOC, China),
171
was positioned between the middle two probe shanks ~1mm from the cortical surface.
172
The laser was controlled via TTL signals generated via digital processor (RZ6, Tucker
173
Davis Technology, USA). A number of laser durations were trialed, with the intention
6
174
of disrupting cortical activity for the period of the pre-cursor (0-400ms) but not
175
disrupting cortical activity to the later masker (500-1000ms). We found that laser
176
duration of 150ms (from 0 to 150ms) was sufficient to produce transient suppression
177
(Figure 6C).
178 179
Data was acquired via Digital Lynx 16SX system (Neuralynx, USA) and stored on a
180
PC. Recordings were made in primary auditory cortex from a total of 21 sites at
181
different penetrations, in 6 animals.
182 183
Auditory stimulation
184 185
Auditory stimuli were generated and calibrated using Matlab (Mathworks, USA) and
186
stored as files to be used when required. All signal levels were calibrated (5-100kHz
187
flat spectrum ±1.5dB SPL). Stimuli were sinusoid amplitude modulated (SAM) tone
188
maskers presented in the presence of a pure tone signal. Three bandwidth
189
configurations were used: a narrowband condition (NB: 10Hz SAM 20kHz pure tone),
190
and two broadband conditions (IM and CM). The broadband conditions consisted of
191
an on-frequency band (10Hz SAM 20kHz pure tone), low off-frequency bands (3 x
192
0.125 octave spaced 10Hz SAM pure tones centered at 12.9kHz) and high off-
193
frequency bands (3 x 0.125 octave spaced 10Hz SAM pure tones centered at
194
30.8kHz). Off-frequency bands were designed so that all components fell outside a
195
mouse auditory filter (May et al., 2006). For all conditions the phase of the envelope of
196
the on-frequency band remained identical. In the IM condition, the envelope of the off-
197
frequency bands was selected at random between 0-180°, and in the CM condition the
198
off-frequency bands had identical envelope phases with the on frequency band. The
199
NB, IM and (Long-) CM conditions were all composed of a pre-cursor (400ms) and
200
masker portion (500ms) with a short gap in between (100ms). A short-CM condition
201
was also used this was identical to the Long-CM condition but without the pre-cursor.
202
An inter-stimulus interval of 2 seconds was used. The signal was comprised of three
203
50 ms SAM tone pips (10Hz modulator), occurring in the 3-5th troughs of the masker.
204
8 different signal conditions were used for each masker condition, a noise alone (i.e.
205
no signal) condition and seven noise + signal conditions (-10 to 20dB SNR in 5 dB
206
steps, where the masker level was held at 65dB SPL and the signal level varied).
207 208
Frequency response areas (FRAs) were measured using pure tones, at 25 different
209
frequencies (500ms duration, 7-57kHz, with 0.25 octave spacing), 8 different sound
210
levels (10-80 dB SPL at 10dB steps), with a 1 second inter-stimulus interval. Stimuli
7
211
for both FRA and masking experiments were presented in blocks, a single repeat of
212
each stimulus combination was presented in each block (frequency/level or masker
213
signal combination, respectively). The order of the stimuli within a block was randomly
214
selected. Blocks of masking condition and FRA stimuli were interleaved such that five
215
blocks of masking condition were followed by one block of the FRA stimuli. In total
216
there were 30 repeats of each masker/signal combination and 6 repeats of each
217
frequency/SPL combination. All stimuli were saved in the above structure into a single
218
file to be used for neuronal recordings. Sound files were presented using Matlab to
219
interface with RPvdsEX (TDT) running on an RZ6 (TDT) driving a free-field speaker
220
(ES1, TDT).
221 222
Data Analysis
223 224
Spikes
225
(http://sourceforge.net/projects/spikedetekt/) and initially clustered using KlustaKwik.
226
Clusters were manually inspected using Klusters and reclustered when necessary,
227
clusters that contained >1% of spikes within a 1ms interspike interval were rejected
228
(http://neurosuite.sourceforge.net). Spike time data was extracted and exported to
229
Matlab for further processing.
were
extracted
from
the
raw
data
files
using
SpikeDetekt
230 247
Raw FRAs were initially smoothed using a 3x3 pyramidal window, iso-response curves
248
(FRA edges) were then determined by finding a 30% change from baseline firing rate
249
(Sutter and Schreiner, 1991; Scholes et al., 2011). Both excitatory (positive criterion)
250
and inhibitory (negative criterion) curves were defined but processed separately. Each
251
curve circumscribed a defined region. Inspection of the resulting defined regions
252
revealed that using a single criterion meant that two strong responsive regions,
253
separated by a weak response in between, could be defined as two independent
254
regions (because the weak region did not cross criterion). To prevent this an additional
255
stage was added where a 15% criterion was used and regions connected by this lower
256
criterion were grouped into single regions (regions were defined using bwboundaries
257
command in Matlab, excitatory regions were only grouped with other excitatory
258
regions). The largest defined region was always used for further analysis.
259
Characteristic frequency was taken as the frequency yielding a defined response at
260
the lowest signal level.
261 262
Significant changes in firing rate of two conditions were detected by pooling the single
263
presentation repeats of those conditions and bootstrapping to produce a distribution of
8
264
mean firing rate (500 samples, sample with replacement). This distribution was then
265
used to define significance increases in firing rate (mean firing rate p>0.95). In order
266
to compare responsiveness to the signal and ignore the plethora of responses to the
267
noise we calculated the signal response:
268 269 270 271
Where SR is the signal response PSTH, Sig the noise and signal PSTH and N the
272
noise alone PSTH (5ms bins were used for all PSTHs) and where i is the cell number,
273
j the background noise condition index and k the SNR index. σ was calculated as the
274
standard deviation of all possible firing rates for that cell (i.e. all bins from all PSTHs
275
for that cell were included in this measure). The SR PSTH for each condition was then
276
averaged between 1.175-1.275s (i.e. the period of the first tone pip). This produced a
277
SR/SNR function for each cell, which was up-sampled (linear interpolation) in the SNR
278
dimension and the individual cell threshold estimated (0.5 STD criterion crossing).
279 280
The population signal response was the mean of the individual signal responses. As
281
before this was then averaged between 1.175-1.275s to produce the mean SR/SNR
282
function. To compare signal sensitivity and tuning properties we grouped cells, based
283
on their CF, and data were bootstrapped to allow comparison across populations (50
284
selections per subpopulation with replacement, 500 repeats). This gave 500 SR/SNR
285
functions, the mean of which was used to estimate the SR/SNR function for each
286
subpopulation. An arbitrary criterion of 0.25 STD was selected, this value was
287
intentionally low to allow subpopulations with only weak responses to be included in
288
this analysis. To quantify thresholds, firing rates were converted into a neurometric
289
function for each individual condition, by comparing the mean firing rate to the noise
290
alone versus each SNR measured. Each cell was given a “vote” to respond whether
291
that cell detected a signal. A relative increase in firing rate (≥25%) was equivalent to a
292
“yes” (and given a value of 1), a decrease (≥25%) was equivalent to a “no” (a value of
293
0) and anything in-between was set to chance (a value of 0.5). The population
294
proportion correct was the mean value across the population. This neurometric
295
function (proportion correct vs SNR) was up-sampled (linear interpolation) and
296
binomial probability used to estimate threshold (above chance performance, p