Accepted Manuscript
Neural envelope encoding predicts speech perception performance for normal-hearing and hearing-impaired adults
Tine Goossens, Charlotte Vercammen, Jan Wouters, Astrid van Wieringen
PII: S0378-5955(17)30590-7
DOI: 10.1016/j.heares.2018.07.012
Reference: HEARES 7600
To appear in: Hearing Research
Received Date: 6 December 2017
Revised Date: 19 July 2018
Accepted Date: 25 July 2018
Please cite this article as: Goossens, T., Vercammen, C., Wouters, J., van Wieringen, A., Neural envelope encoding predicts speech perception performance for normal-hearing and hearing-impaired adults, Hearing Research (2018), doi: 10.1016/j.heares.2018.07.012. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Neural envelope encoding predicts speech perception performance for normal-hearing and hearing-impaired adults
Tine Goossens*, Charlotte Vercammen, Jan Wouters, Astrid van Wieringen
KU Leuven - University of Leuven, Department of Neurosciences, Research Group Experimental ORL, Herestraat 49 bus 721, 3000 Leuven, Belgium
E-mail addresses:
Tine Goossens: [email protected]
Charlotte Vercammen: [email protected]
Jan Wouters: [email protected]
Astrid van Wieringen: [email protected]
*Corresponding author: e-mail: [email protected]; postal address: Tine Goossens, KU Leuven - University of Leuven, Department of Neurosciences, Research Group Experimental ORL, Herestraat 49 bus 721, 3000 Leuven, Belgium
Declarations of interest: none
Abstract
Peripheral hearing impairment cannot fully account for speech perception difficulties that emerge with advancing age. As the fluctuating speech envelope bears crucial information for speech perception, changes in temporal envelope processing are thought to contribute to degraded speech perception. Previous research has demonstrated changes in neural encoding of envelope modulations throughout the adult lifespan, either due to age or due to hearing impairment. To date, however, it remains unclear whether such age- and hearing-related neural changes are associated with impaired speech perception. In the present study, we investigated the potential relationship between perception of speech in different types of masking sounds and neural envelope encoding for a normal-hearing and hearing-impaired adult population including young (20-30 years), middle-aged (50-60 years), and older (70-80 years) people. Our analyses show that enhanced neural envelope encoding in the cortex and in the brainstem, respectively, is related to worse speech perception for normal-hearing and for hearing-impaired adults. This neural-behavioral correlation is found for the three age groups and appears to be independent of the type of masking noise, i.e., background noise or competing speech. These findings provide promising directions for future research aiming to develop advanced rehabilitation strategies for speech perception difficulties that emerge throughout adult life.
Keywords: normal-hearing, hearing-impaired, speech perception, neural envelope encoding, correlation
Abbreviations: ASSR, auditory steady-state response; HI, hearing-impaired; ISTS, International Speech Test Signal; LH, left hemisphere; NH, normal-hearing; PTA, pure-tone average; RFM, release from masking; RH, right hemisphere; SD, standard deviation; SE, standard error; SNR, signal-to-noise ratio; SRT, speech reception threshold; SWN, speech-weighted noise; TFS, temporal fine structure
1. Introduction
Difficulties in speech perception, especially in noisy listening situations, become increasingly prevalent with advancing age. This can, in part, be attributed to hearing impairment (Helfer and Wilber, 1990; Humes and Roberts, 1990; Souza and Turner, 1994). Older people who have excellent hearing thresholds, however, can show poor speech perception as well (e.g., Dubno et al., 2002; Füllgrabe et al., 2015), which indicates that hearing impairment is not the only factor affecting speech understanding across the adult lifespan. In recent years, there has been growing interest in potential contributions of central auditory processing and temporal processing in particular.
Temporal processing is crucial for speech perception. After cochlear filtering, speech sounds can be considered as a series of bandpass-filtered signals, each of which is characterized by an envelope superimposed on a temporal fine structure (TFS) (e.g., Moore, 2014). The envelope refers to the relatively slow variations in amplitude over time and the TFS represents more rapid acoustic oscillations. Envelope modulations signal the occurrence of speech units, e.g., syllables (~2-10 Hz) and phonemes (~10-40 Hz) (Chait et al., 2015), and the TFS plays an important role in the perception of pitch (Moore, 2014). Behavioral research has demonstrated that, even without substantial spectral cues, normal-hearing (NH) listeners achieve good speech perception in quiet, on the condition that low-frequency envelope modulations in several frequency bands are preserved (Drullman et al., 1994; Shannon et al., 1995). This shows that envelope modulations are a crucial temporal speech feature.
Envelope modulations are encoded in the central auditory system through synchronized activity of neural oscillations, classified as delta (< 4 Hz), theta (4-7 Hz), alpha (8-13 Hz), beta (14-30 Hz), and gamma (> 30 Hz) oscillations (Lopes da Silva, 2013; Peelle and Davis, 2012). In response to a periodically varying auditory input, neural oscillations synchronize to the modulations that match their characteristic frequency (Wang et al., 2005). An important aspect of this temporal encoding mechanism is that different modulation rates are predominantly encoded at different stages along the auditory pathway. When ascending the auditory pathway, there is a progressive shift from processing of high-frequency modulations towards processing of low-frequency modulations (Giraud et al., 2000; Joris et al., 2004).
Electrophysiological studies have demonstrated that neural synchronization to the speech envelope is an important mechanism underlying speech perception (Ahissar et al., 2001; Doelling et al., 2014; Peelle and Davis, 2012). Moreover, neural envelope encoding changes with advancing age, even when hearing sensitivity is preserved. Aging appears to be accompanied by a decrease in neural envelope encoding at the brainstem level (Anderson et al., 2012; Ansari et al., 2017; Grose et al., 2009; Leigh-Paffenroth and Fowler, 2006; Presacco et al., 2015; Purcell et al., 2004) and by an increase in neural envelope encoding at the cortical level (Presacco et al., 2016; Tlumak et al., 2015). Hearing impairment has also been related to changes in neural envelope encoding. Enhanced neural synchronization to envelope modulations has been detected in the brainstem (Anderson et al., 2013) and auditory cortex (Millman et al., 2017) of hearing-impaired (HI) adults relative to similarly-aged NH listeners.
Given that envelope modulations play a role in speech perception, it seems likely that these age- and hearing-related changes in neural envelope encoding contribute to the speech perception difficulties that emerge throughout the adult lifespan. Nevertheless, only a limited number of studies have investigated correlations between neural envelope encoding and speech perception. Schoof and Rosen (2016) observed reduced neural envelope encoding in the brainstem and worse speech perception for older (60-72 years) than for young (19-29 years) NH adults, but the neural-behavioral correlation was not significant. Other studies of NH adults did find significant correlations between the degree of subcortical neural envelope encoding and speech perception performance. Anderson et al. (2011) tested a group of NH older people (60-73 years) and found a higher degree of neural envelope encoding at the brainstem level to correlate significantly with better speech intelligibility. Dimitrijevic et al. (2004) and Leigh-Paffenroth and Fowler (2006) demonstrated that such a positive correlation occurred not only for NH older adults (60-80 years) but also for NH young individuals (20-40 years). Presacco et al. (2016) and Millman et al. (2017) examined speech understanding and neural envelope encoding at the cortical level. In contrast to Presacco et al. (2016), who did not find a significant association between the degree of cortical neural synchronization and speech understanding among young (18-27 years) and older NH individuals (61-73 years), Millman et al. (2017) found a significant correlation for middle-aged NH adults (~60 years). These researchers found enhanced neural envelope encoding in the left auditory cortex to be significantly related to worse speech perception. Note that the direction of this neural-behavioral relationship was opposite to the one observed for the brainstem level (Anderson et al., 2011; Dimitrijevic et al., 2004; Leigh-Paffenroth and Fowler, 2006).
Remarkably, the studies that did not find an association between the degree of neural envelope encoding and speech perception for NH adults, i.e., Schoof and Rosen (2016) and Presacco et al. (2016), investigated speech perception in the presence of competing speech, whereas Anderson et al. (2011), Dimitrijevic et al. (2004), and Millman et al. (2017), who did find significant correlations, used speech-weighted noise (SWN). An important difference between interfering background noise and competing speech concerns the nature of the masking effect. The degrading effect of SWN results from energetic masking and/or modulation masking (Brungart, 2001; Stone et al., 2012). Energetic masking refers to reduced audibility of the target speech that results from a spectro-temporal overlap between the background noise and the target speech. Modulation masking occurs when envelope modulations in the noise – either random modulations intrinsic to noise or modulations imposed on the noise – adversely affect the perception of envelope modulations in the target speech. The degrading effect of competing speech not only involves energetic and modulation masking, but also incorporates some degree of informational masking (Brungart, 2001; Durlach et al., 2003). Informational masking draws on central, cognitive processing: the listener needs to segregate the target speech from the competing speech (object formation) and then has to selectively attend to the target speech (object selection) (Kidd et al., 1994; Shinn-Cunningham, 2008). As such, competing speech induces a higher cognitive load than background noise. The lack of a significant association between neural envelope encoding and speech perception in the presence of competing speech may be explained by cognitive processing playing an important role in speech-on-speech masking, while auditory (temporal) processing plays a key role in understanding speech masked by noise signals.
To the best of our knowledge, only Dimitrijevic et al. (2004) and Millman et al. (2017) have investigated correlations between speech understanding and neural envelope encoding for people who have hearing impairment. In both studies, speech perception was tested in the presence of SWN. Dimitrijevic et al. (2004) showed that more neural envelope encoding in subcortical auditory areas was related to better speech perception for HI adults aged between 57 and 86 years. Millman et al. (2017) reported that enhanced envelope encoding in the left auditory cortex was predictive of inferior speech perception for HI listeners aged about 60 years. Note that the direction of the neural-behavioral relationship for HI adults seems to be different for the brainstem and the cortical level, which is in accord with the observations for NH adults (e.g., Anderson et al., 2011; Millman et al., 2017).
Altogether, previous studies support the hypothesis that neural envelope encoding is related to the speech perception performance of both NH and HI adults. Evidence is, however, scarce and a number of important issues remain unaddressed. First, there is a lack of research that includes people with different hearing sensitivity and/or people belonging to different age categories, to assess whether correlations between neural envelope encoding and speech perception vary with hearing sensitivity and/or age. Also, to date, studies have been restricted to neural envelope encoding in either subcortical or cortical auditory regions. Investigating both subcortical and cortical neural envelope encoding and their associations with speech perception would allow the relative predictive value of neural envelope encoding in the different auditory regions to be evaluated. Furthermore, neural-behavioral studies that use interfering background noises as well as competing speech to evaluate speech perception performance are not available, yet such studies are needed to investigate whether the presumed contribution of neural envelope encoding to speech intelligibility depends on the cognitive load induced by the masker.
To address these needs, we conducted a number of studies including the same young (20-30 years), middle-aged (50-60 years), and older (70-80 years) NH and HI participants. In a first study, we evaluated speech perception performances in interfering background noises and competing speech (Goossens et al., 2017). In two other studies, we investigated the neural encoding of envelope modulations from the brainstem up to the cortex (Goossens et al., 2016; Goossens et al., under review). These studies demonstrated age- and hearing-related changes in masked speech perception performance and neural envelope encoding. The aim of the present study was to investigate whether the observed changes in neural envelope encoding could predict the changes in speech perception performance. As we tested both NH and HI adults belonging to three age groups, we could selectively explore neural-behavioral correlations across the adult lifespan when hearing sensitivity was or was not preserved. By taking both subcortical and cortical auditory regions into account, we could investigate the relative contribution of neural envelope encoding at different auditory stages to speech perception performance. Moreover, by using both interfering background noise and competing speech, we could assess whether the cognitive load produced by the masker affected the relationship between neural envelope encoding and speech perception.
We expected to find significant correlations between neural envelope encoding and speech perception performance. Based on previous research, we hypothesized that more neural envelope encoding in subcortical auditory regions would be related to better speech perception, while the opposite scenario was anticipated for neural envelope encoding at the cortical level. Moreover, we postulated that such neural-behavioral correlations could vary with hearing sensitivity, age, and/or the cognitive load produced by the masking noise.
2. Methods and Results
2.1. Participants
Both the NH and HI participant populations consisted of young (20-30 years), middle-aged (50-60 years), and older adults (70-80 years) (Table 1, Fig. 1). Pure-tone audiometry (0.25-8 kHz) was conducted in a soundproof booth with a Madsen OB922 audiometer, TDH-39 earphones, and a RadioEar B71 bone transducer. In accord with the ISO standard 389-1 (International Organization for Standardization, 1998), 25 dB HL was considered as the upper limit of normal hearing sensitivity. NH was defined as having audiometric thresholds ≤ 25 dB HL at all octave frequencies from 0.25 to 4 kHz. All NH participants met this criterion. However, the best-ear pure-tone average (PTA) across all audiometric thresholds (0.25-8 kHz) differed significantly among the three age groups. The PTA was lower for the young group than for the middle-aged and older groups, and the PTA of the middle-aged group was lower than that of the older group (all p < 0.001). HI participants had audiometric thresholds ≥ 35 dB HL from 1 kHz upwards. Among the three HI age groups, no significant differences in PTA were found (all p > 0.6). All hearing losses were sensorineural in nature (air-bone gaps ≤ 10 dB HL). All participants were considered to have normal cognitive capacities, as they scored ≥ 26/30 on the Montreal Cognitive Assessment (Nasreddine et al., 2005). Also, all participants were Dutch native speakers, they were right-handed according to the Edinburgh Handedness Inventory (Oldfield, 1971), and they had no history of brain injury, neurological disorders, or tinnitus.
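The audiometric inclusion criteria described above can be sketched in a few lines of Python. This is only an illustration: the function names, the dictionary layout, and the example audiogram are hypothetical, and "from 1 kHz upwards" is assumed here to mean the octave frequencies 1, 2, 4, and 8 kHz.

```python
from statistics import mean

NH_FREQS_KHZ = (0.25, 0.5, 1, 2, 4)  # octave frequencies used for the NH criterion
HI_FREQS_KHZ = (1, 2, 4, 8)          # "from 1 kHz upwards" (octave steps assumed)


def pure_tone_average(thresholds):
    """Mean threshold (dB HL) across all tested frequencies (0.25-8 kHz)."""
    return mean(thresholds.values())


def best_ear_pta(left, right):
    """PTA of the better ear, i.e., the ear with the lower average threshold."""
    return min(pure_tone_average(left), pure_tone_average(right))


def classify(thresholds):
    """Return 'NH', 'HI', or 'neither' for one audiogram (kHz -> dB HL)."""
    if all(thresholds[f] <= 25 for f in NH_FREQS_KHZ):
        return "NH"
    if all(thresholds[f] >= 35 for f in HI_FREQS_KHZ):
        return "HI"
    return "neither"


# Made-up audiogram for illustration only (not study data).
example = {0.25: 10, 0.5: 10, 1: 15, 2: 20, 4: 25, 8: 40}
print(classify(example), pure_tone_average(example))
```

Note that an elevated threshold at 8 kHz does not affect the NH classification in this sketch, because the NH criterion only covers 0.25 to 4 kHz, while the PTA is averaged over the full 0.25-8 kHz range.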
This research project was approved by the Medical Ethical Committee of the University Hospitals and University of Leuven (approval number B322201214866). All participants gave their written informed consent.
Table 1. Overview of the number (N) of people (women/men), mean age ± standard deviation (SD) expressed in years, and mean best-ear pure-tone average (PTA) across all audiometric thresholds (0.25-8 kHz) ± SD, for each cohort (young, middle-aged, older) of the NH and HI participant populations.
              NH population                       HI population
              N (♀/♂)    age ± SD   PTA ± SD      N (♀/♂)    age ± SD   PTA ± SD
young         17 (8/9)   23 ± 2     1 ± 4         10 (6/4)   27 ± 5     54 ± 11
middle-aged   15 (9/6)   54 ± 2     8 ± 4         14 (10/4)  58 ± 2     49 ± 9
older         10 (7/3)   74 ± 3     18 ± 6        13 (8/5)   78 ± 3     53 ± 6
2.2. Synopsis of speech perception and neural envelope encoding
In this section, we give an overview of the behavioral and neural data from our six participant groups (Table 1) that have been reported on before (Goossens et al., 2016; Goossens et al., 2017; Goossens et al., under review).
Behavioral and neural measures that varied significantly across the adult lifespan – due to age and hearing impairment – were considered as variables of interest: they were retained for further investigation in the present study.
2.2.1. Speech perception performance
In the study of Goossens et al. (2017), we investigated age- and hearing-related changes in masked speech perception. Speech perception was quantified by the speech reception threshold (SRT) of masked sentences (Leuven Intelligibility Sentence Test; van Wieringen and Wouters, 2008), presented to the ear with the best PTA. The SRT represents the signal-to-noise ratio (SNR) at which 50% of the sentences, i.e., keywords (~3 per sentence), were recognized correctly (Plomp and Mimpen, 1979). Higher SRTs reflect poorer speech perception. We equated audibility of the speech material among our participants based on the perception of 10 sentences of the Leuven Intelligibility Sentence Test, presented in quiet. If a participant scored ≥ 8/10, it was concluded that the speech material was sufficiently audible. In this way, the speech level was set to 60 dB SPL for all NH participants. For HI participants, this level (60 dB SPL) was raised following the procedure of Jansen et al. (2012): half of the HI participant’s PTA0.25-1 kHz was added to 60 dB SPL. If a HI participant scored < 8/10 using this level, it was further raised in steps of 5 dB until a score of ≥ 8/10 was obtained. The median speech level for the HI participants was 83 dB SPL.
To investigate masked speech perception, we used interfering background noises as well as competing speech (Fig. 2). For the interfering background noises, we used two SWNs, both having the long-term average spectrum of the speech material of the Leuven Intelligibility Sentence Test. The first SWN was unmodulated, whereas the second SWN was 100% amplitude modulated at a 4-Hz rate. The modulated SWN resulted in temporary increases in the SNR, which could lead to release from masking (RFM), i.e., better speech perception than for the unmodulated SWN (Festen and Plomp, 1990). In effect, during noise dips the target speech can be heard more clearly and these speech glimpses enable the listener to reconstruct the masked portions of the speech. RFM was quantified by subtracting SRTSWN unmodulated from SRTSWN modulated. We also used the International Speech Test Signal as a masker (ISTS; Holube et al., 2010). The ISTS is an unintelligible speech signal that is known to cause a considerable amount of informational masking (Christiansen and Dau, 2012; Francart et al., 2011). Thus, compared to the SWNs, the ISTS induced a higher cognitive load.
In the study of Goossens et al. (2017), age-related changes were investigated by comparing SRTs between the young, middle-aged, and older NH participants. Hearing-related changes were examined by comparing outcomes between similarly-aged NH and HI participants. It was demonstrated that the young NH group showed lower SRTs than the middle-aged and older NH groups for all three maskers. Also, the middle-aged NH group outperformed the older NH group, although this was not significant for the unmodulated SWN. The older NH adults did not show RFM, whereas the young and middle-aged NH adults did. Goossens et al. (2017) also showed that the HI listeners had higher SRTs than their NH counterparts, irrespective of age and the type of masker. Moreover, all HI age groups showed significantly less RFM than the NH age groups.
In sum, the study of Goossens et al. (2017) showed that the intelligibility of speech masked by unmodulated SWN, modulated SWN, and the ISTS, worsened with age and with hearing impairment. The same was true for RFM. Therefore, SRTSWN unmodulated, SRTSWN modulated, SRTISTS, and RFM were all included in the analyses of the present research.
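Two of the computations described in this section, the speech level for HI participants and the RFM measure, can be illustrated with a minimal Python sketch. All function names are hypothetical, `score_in_quiet` stands in for the actual sentence test in quiet, and the upper level limit is an arbitrary safety cap that is not taken from the study.

```python
def hi_start_level(pta_low_db_hl, base_level_db_spl=60.0):
    """Starting level for an HI listener: 60 dB SPL plus half of the
    low-frequency PTA (0.25-1 kHz)."""
    return base_level_db_spl + 0.5 * pta_low_db_hl


def fit_speech_level(start_level, score_in_quiet, step_db=5.0,
                     criterion=8, max_level=110.0):
    """Raise the level in 5-dB steps until at least 8 of 10 sentences in quiet
    are repeated correctly. `score_in_quiet` is a hypothetical callable that
    returns the number of correct sentences (0-10) at a given level;
    `max_level` is an arbitrary safety cap, not part of the study."""
    level = start_level
    while score_in_quiet(level) < criterion:
        level += step_db
        if level > max_level:
            raise ValueError("criterion not reached below the safety cap")
    return level


def release_from_masking(srt_modulated, srt_unmodulated):
    """RFM = SRT(modulated SWN) - SRT(unmodulated SWN), in dB.
    Negative values mean the listener benefits from the noise dips."""
    return srt_modulated - srt_unmodulated


# Made-up listener: low-frequency PTA of 40 dB HL, needs at least 88 dB SPL.
start = hi_start_level(40.0)
level = fit_speech_level(start, lambda lvl: 10 if lvl >= 88 else 6)
rfm = release_from_masking(srt_modulated=-9.0, srt_unmodulated=-5.0)
print(start, level, rfm)
```

With this sign convention, a more negative RFM value indicates more release from masking, so a positive regression coefficient for RFM corresponds to less release.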
2.2.2. Neural envelope encoding
Goossens et al. (2016) investigated changes in neural envelope encoding due to age and Goossens et al. (under review) explored hearing-related changes. In both studies, neural envelope encoding was investigated in the six participant groups (Table 1) by means of auditory steady-state responses (ASSRs) to octave bands of white noise centered at 1 kHz that were 100% amplitude modulated at 4, 20, 40, or 80 Hz (Fig. 3, panel A; for a review of ASSRs, see Rance, 2008). The EEG was recorded with 64 active electrodes. The neural responses recorded by eight electrode pairs located at the back of the head were retained for ASSR evaluation, as these electrodes were most sensitive to the neural responses under investigation (Fig. 3, panel B). We included electrodes mirrored across hemispheres to selectively assess neural envelope encoding in the left hemisphere (LH) and in the right hemisphere (RH).
The examination of ASSRs to different modulation rates, i.e., 4, 20, 40, and 80 Hz, allowed us to investigate neural envelope encoding along the central auditory pathway. Source localization studies have demonstrated that ASSRs to modulation frequencies < 30 Hz primarily originate from the auditory cortex, while ASSRs to modulation frequencies > 30 Hz are mainly generated in subcortical structures (e.g., Herdman et al., 2002; Wang et al., 2012). With regard to higher-frequency ASSRs, Luke et al. (2017) showed that the 40-Hz ASSR has a predominant thalamic source and that the 80-Hz ASSR has a major brainstem source. It is important to note, however, that ASSRs, independent of the modulation frequency, are composite responses, comprising both cortical and subcortical neural activity.
For the NH participants, the level of the amplitude-modulated stimuli was set to 70 dB SPL. Through behavioral loudness ratings (graphic rating scale), NH participants indicated that the stimuli presented at 70 dB SPL were comfortably loud. This loudness rating was used as a reference for the HI participants: each HI participant was asked to adjust the level of the ASSR stimuli until it was comfortably loud for him/her as well. This resulted in a mean stimulus level of 79 dB SPL for the HI listeners. The reason for applying equal loudness levels instead of equal sensation levels concerned listening comfort. For NH listeners, the median hearing threshold of the modulated noises was 5 dB SPL: their sensation level equaled 65 dB SL. For HI listeners, the median hearing threshold was 43 dB SPL. Presenting the stimuli at equal sensation levels would have implied stimulus levels of 108 dB SPL, which exceeded their uncomfortable loudness level of 103 dB SPL. Also, previous research has demonstrated that the magnitude of the ASSR correlates closely with the perceived loudness of the acoustic input (Ménard et al., 2008; Emara and Kolkaila, 2010; Van Eeckhoutte et al., 2016). Therefore, stimulus loudness needs to be controlled for, in order to prevent differences in ASSRs between NH and HI participants being a result of acoustic stimulation instead of peripheral hearing sensitivity.
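The sensation-level argument in this paragraph is simple arithmetic on the reported median values, reproduced below as a short Python sketch (variable and function names are illustrative).

```python
def sensation_level(stimulus_db_spl, threshold_db_spl):
    """Level of a stimulus above the listener's own threshold."""
    return stimulus_db_spl - threshold_db_spl


nh_sl = sensation_level(70, 5)      # NH: 70 dB SPL stimulus, 5 dB SPL median threshold
hi_level_for_equal_sl = 43 + nh_sl  # HI median threshold plus the same sensation level
exceeds_ucl = hi_level_for_equal_sl > 103  # compare with the uncomfortable loudness level
print(nh_sl, hi_level_for_equal_sl, exceeds_ucl)  # 65 108 True
```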
The magnitude of the ASSR was expressed as signal-to-noise ratio (SNR), the ratio of the power in the modulation frequency bin (signal) to the power in 120 neighboring bins (noise), i.e., 60 bins below and 60 bins above the modulation frequency bin. A high SNR denotes that there is a high degree of synchronized neural activity.
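As an illustration of this SNR definition, the sketch below computes the ratio from a power spectrum that is already available as a list of bin powers. Two choices are assumptions rather than details taken from the text: the noise estimate is the mean power across the 120 neighboring bins, and the ratio is expressed in dB. The function name and the synthetic spectrum are hypothetical.

```python
import math


def assr_snr_db(power_spectrum, signal_bin, n_side=60):
    """SNR (dB) of the modulation-frequency bin relative to the mean power of
    `n_side` bins below and `n_side` bins above it (120 bins by default)."""
    lo, hi = signal_bin - n_side, signal_bin + n_side
    if lo < 0 or hi >= len(power_spectrum):
        raise ValueError("not enough neighboring bins around the signal bin")
    # Neighboring bins on both sides, excluding the signal bin itself.
    noise_bins = power_spectrum[lo:signal_bin] + power_spectrum[signal_bin + 1:hi + 1]
    noise_power = sum(noise_bins) / len(noise_bins)
    return 10.0 * math.log10(power_spectrum[signal_bin] / noise_power)


# Synthetic spectrum: flat noise floor of 1.0 with a response of 100.0 in bin 100.
spectrum = [1.0] * 201
spectrum[100] = 100.0
print(assr_snr_db(spectrum, 100))  # 20.0
```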
Goossens et al. (2016) compared the SNR of the 4-, 20-, 40-, and 80-Hz ASSR among the three NH age groups and found age-related changes in the 4- and the 80-Hz ASSR. Older NH adults exhibited larger 4-Hz ASSRs than young and middle-aged NH listeners. The opposite pattern was seen for 80-Hz ASSRs: larger 80-Hz ASSRs were found for young than for middle-aged and older NH individuals. The age-related decrease in neural synchronization to 80-Hz modulations was observed in the RH only, while the age-related increase in the neural encoding of 4-Hz modulations was found in both hemispheres. A study under review (Goossens et al.) revealed hearing-related changes in the neural encoding of 4- and 80-Hz modulations. For both hemispheres, larger 4- and 80-Hz ASSRs were detected for young and middle-aged HI participants relative to their NH counterparts. No such hearing-related changes were found for the older participants.
In sum, the studies of Goossens et al. (2016; under review) showed that the 4- and 80-Hz ASSR changed with age and with hearing impairment, while this was not the case for the 20- and 40-Hz ASSR. Therefore, only the 4-Hz ASSRLH, 4-Hz ASSRRH, 80-Hz ASSRLH, and 80-Hz ASSRRH are included in the analyses of the present study.
2.3. Correlations between speech perception and neural envelope encoding
To assess whether SRTSWN unmodulated, SRTSWN modulated, SRTISTS, and/or RFM were related to the 4-Hz ASSRLH, 4-Hz ASSRRH, 80-Hz ASSRLH, and/or 80-Hz ASSRRH, we performed best subset and linear regression analyses with IBM SPSS Statistics software. We aimed at selectively investigating potential correlations between neural envelope encoding and speech perception performance across the adult lifespan when hearing sensitivity was or was not preserved. Therefore, these regression analyses were performed separately for the NH and HI participant populations.
Best subset regression analyses were conducted to determine which of the neural variables predicted speech perception best. Together with the neural variables, we included age (in months) and hearing sensitivity (PTA0.25-8 kHz) as predictors, since we know that the SRTs and RFM change with age and hearing sensitivity (Goossens et al., 2017). We applied the Akaike Information Criterion with a correction for finite sample sizes for selecting the best subset of predictors for each of the three SRTs and for RFM (Hastie et al., 2009).
We conducted two linear regression analyses to investigate whether the best predictors, selected by the best subset regression, made a unique and significant contribution to predicting the speech perception. For all regression analyses, the underlying assumptions were met (linearity, homoscedasticity, independent errors, normally distributed errors, no problematic multicollinearity, i.e., all average variance inflation factors ≤ 3.5, all tolerances > 0.2). Therefore, the regression outcomes can be generalized beyond our participant samples.
In the first linear regression, all best subset predictors were entered simultaneously, as this is the most appropriate procedure for theory testing (Studenmund and Cassidy, 1987). When the regression coefficient of a best subset predictor reached significance (p < 0.05), it was concluded that this predictor made a unique contribution to predicting the SRT or RFM, independent of the other predictors included in the regression.
In the second linear regression, these ‘unique’ predictors were entered in order of descending significance, i.e., the predictor with the most significant regression coefficient (smallest p value) was entered in the first block and the other significant predictors were added, one by one. This hierarchical procedure allowed us to evaluate whether the proportion of variance in speech perception explained by each predictor was significant (p value of the F change < 0.05).
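The best subset step of the procedure described in this section can be illustrated with a small, dependency-free Python sketch. It is not the SPSS implementation used in the study: the least-squares AICc formula n·ln(RSS/n) + 2k + 2k(k+1)/(n−k−1), the choice to count the intercept and the error variance in k, all function names, and the toy data are assumptions made for illustration.

```python
import math
from itertools import combinations


def ols_rss(columns, y):
    """Residual sum of squares of y regressed on the columns plus an intercept."""
    n = len(y)
    design = [[1.0] * n] + [list(col) for col in columns]
    p = len(design)
    # Normal equations a * beta = c, solved by Gaussian elimination with pivoting.
    a = [[sum(u * v for u, v in zip(ci, cj)) for cj in design] for ci in design]
    c = [sum(u * v for u, v in zip(ci, y)) for ci in design]
    for i in range(p):
        pivot = max(range(i, p), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        c[i], c[pivot] = c[pivot], c[i]
        for r in range(i + 1, p):
            factor = a[r][i] / a[i][i]
            for k in range(i, p):
                a[r][k] -= factor * a[i][k]
            c[r] -= factor * c[i]
    beta = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(a[i][k] * beta[k] for k in range(i + 1, p))
        beta[i] = (c[i] - tail) / a[i][i]
    fitted = [sum(b * col[k] for b, col in zip(beta, design)) for k in range(n)]
    return sum((yk - fk) ** 2 for yk, fk in zip(y, fitted))


def aicc(n, rss, k):
    """Small-sample corrected AIC for a least-squares fit with k parameters."""
    return n * math.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)


def best_subset(predictors, y):
    """Return (subset, AICc) with the lowest AICc over all non-empty subsets."""
    n = len(y)
    best = None
    names = sorted(predictors)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            rss = ols_rss([predictors[name] for name in subset], y)
            score = aicc(n, rss, len(subset) + 2)  # slopes + intercept + variance
            if best is None or score < best[1]:
                best = (subset, score)
    return best


# Synthetic toy data (not study data): y depends on x1 only; x2 is unrelated.
x1 = [1, 1, 1, 1, -1, -1, -1, -1]
x2 = [1, 1, -1, -1, 1, 1, -1, -1]
noise = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
y = [3 + 2 * a + e for a, e in zip(x1, noise)]
print(best_subset({"x1": x1, "x2": x2}, y)[0])  # ('x1',)
```

In the toy data, adding x2 does not reduce the residual sum of squares, so the AICc penalty for the extra parameter favors the model with x1 alone. With the six candidate predictors of the present study (four neural variables, age, and PTA), the same loop would evaluate 63 subsets per outcome.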
2.3.1. Results for the NH population
The results of the best subset regression analyses for the NH participant population are shown in the first column of Table 2 (top panel). From the second column onwards, Table 2 (top panel) shows the outcomes of the linear regression analyses. These analyses demonstrated that age contributed significantly to predicting the SRT of NH adults, irrespective of the type of masker: the SRTs increased with advancing age. The 4-Hz ASSRRH was a significant predictor of speech perception in the presence of modulated SWN and the ISTS, but not in the presence of the unmodulated SWN. The two significant regression coefficients were positive, indicating that a higher degree of neural synchronization to 4-Hz modulations in the RH corresponded to poorer speech perception (Fig. 4, top panel).
To investigate whether the predictive values of the 4-Hz ASSRRH with regard to the modulated SWN and the ISTS were equivalent, we compared both regression coefficients. When both age and the 4-Hz ASSRLH were held constant, the unstandardized regression coefficients (± standard error (SE)) of the 4-Hz ASSRRH with respect to the SRTSWN modulated and the SRTISTS were 0.41 (± 0.15) and 0.35 (± 0.18), respectively. Note that the latter regression coefficient is not included in Table 2 (top panel), since the 4-Hz ASSRLH was not selected as a best predictor for the SRTISTS. Yet, in order to compare the unique contribution of the 4-Hz ASSRRH between the SRTSWN modulated and the SRTISTS, the same predictors have to be included in both regressions (Cohen, 1983). We compared these two regression coefficients, following the statistical procedure prescribed by Brame et al. (1998), and we obtained a p value > 0.1, indicating that there was no significant difference.
ACCEPTED MANUSCRIPT
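The comparison of the two 4-Hz ASSRRH coefficients can be sketched as follows. This is a minimal illustration under the assumption that the procedure reduces to the z test commonly attributed to Brame et al. (1998), z = (b1 − b2) / sqrt(SE1² + SE2²); the exact computation is not spelled out in the text, and the function name `compare_coefficients` is hypothetical.

```python
import math


def compare_coefficients(b1, se1, b2, se2):
    """z test for the difference between two unstandardized regression coefficients.

    Assumed form: z = (b1 - b2) / sqrt(se1^2 + se2^2), with a two-sided p value
    taken from the standard normal distribution.
    """
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Two-sided tail probability of the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


# Coefficients (± SE) of the 4-Hz ASSR_RH reported in the text:
# 0.41 (± 0.15) for the SRT in modulated SWN, 0.35 (± 0.18) for the SRT in ISTS.
z, p = compare_coefficients(0.41, 0.15, 0.35, 0.18)
print(f"z = {z:.2f}, p = {p:.2f}")
```

With the reported values, the resulting p value lies well above 0.1, which is consistent with the conclusion that the two coefficients do not differ significantly.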
A minor remark is that the significant contribution of the 4-Hz ASSRRH to predicting SRTSWN modulated only occurred when taking the 4-Hz ASSRLH into account. This shows that the shared variance between these two neural variables was irrelevant for predicting SRTSWN modulated.

There was also a significant positive correlation between the 4-Hz ASSRRH and RFM: increased 4-Hz synchronization in the RH was correlated with less RFM.
2.3.2. Results for the HI population
The best predictors of the three SRTs and RFM for the HI population are shown in the first column of Table 2 (bottom panel) and the results of the follow-up linear regressions are reported from the second column onwards. None of the best subset predictors made a significant contribution to predicting RFM. However, for the SRTs, both age and the 80-Hz ASSRLH were significant predictors, irrespective of masker type. As indicated by the positive regression coefficients, speech perception performance decreased with advancing age and with increasing neural synchronization to 80-Hz modulations in the LH (Fig. 4, bottom panel). Hearing sensitivity (PTA) contributed significantly to predicting speech understanding in the presence of the ISTS: the greater the hearing loss (higher PTA), the higher the SRTISTS.
To assess whether the predictive value of the 80-Hz ASSRLH differed between the three maskers, we compared the unstandardized regression coefficients (± SE) of the 80-Hz ASSRLH between the SRTSWN unmodulated (0.23 ± 0.1), the SRTSWN modulated (0.31 ± 0.12), and the SRTISTS (0.29 ± 0.12). For these comparative analyses, the factors age and PTA were both controlled for (Cohen, 1983). None of the differences in the regression coefficients were significant (all p > 0.1).
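The three pairwise comparisons of the 80-Hz ASSRLH coefficients can be checked with a short sketch. It assumes that each comparison uses the z test z = (b1 − b2) / sqrt(SE1² + SE2²) associated with Brame et al. (1998); the dictionary keys and the helper name `z_test` are illustrative, and only the coefficients and standard errors are taken from the text.

```python
import math
from itertools import combinations

# Unstandardized coefficients (B, SE) of the 80-Hz ASSR_LH as reported in the text.
coefficients = {
    "SRT SWN unmodulated": (0.23, 0.10),
    "SRT SWN modulated": (0.31, 0.12),
    "SRT ISTS": (0.29, 0.12),
}


def z_test(b1, se1, b2, se2):
    """Assumed z test for the difference between two regression coefficients."""
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    return z, math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p value


pairwise_p = {}
for (name_a, (b_a, se_a)), (name_b, (b_b, se_b)) in combinations(coefficients.items(), 2):
    z, p = z_test(b_a, se_a, b_b, se_b)
    pairwise_p[(name_a, name_b)] = p
    print(f"{name_a} vs {name_b}: z = {z:.2f}, p = {p:.2f}")
```

Under this assumption, all three p values exceed 0.1, in agreement with the reported absence of significant differences between maskers.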
Table 2. Outcomes of the regression analyses for the NH participants (top panel) and for the HI participants (bottom panel). The first column shows the behavioral variable and its best predictors, as indicated by the best subset regression. The next columns present the unstandardized regression coefficient of each predictor (B) and its standard error (SE), the standardized regression coefficient (β), the significance of the regression coefficient (p), whether the predictor contributes significantly to predicting the speech perception performance (F change), and the proportion of variance in the speech perception performance explained by the predictor when controlling for the shared variances with other significant predictors (R²). Significant predictors (F change < 0.05) are denoted by an asterisk.

NH population
Variable            Predictor      B       SE     β       p        F change  R²
SRTSWN unmodulated  age*           0.003   0.001  0.676   < 0.001  < 0.001   38.4%
                    4-Hz ASSRLH    -0.075  0.039  -0.238  0.062
SRTSWN modulated    age*           0.005   0.001  0.545   < 0.001  < 0.001   27.2%
                    4-Hz ASSRRH*   0.407   0.153  0.688   0.011    0.204     10.0%
                    4-Hz ASSRLH*   -0.346  0.151  -0.583  0.028    0.028     7.3%
SRTISTS             age*           0.014   0.001  0.795   < 0.001  < 0.001   57.9%
                    4-Hz ASSRRH*   0.232   0.083  0.219   0.008    < 0.008   4.4%
RFM                 4-Hz ASSRRH*   0.341   0.128  0.781   0.011    0.007     16.8%
                    PTA            0.069   0.034  0.300   0.048    0.087
                    4-Hz ASSRLH    -0.233  0.130  -0.534  0.080

HI population
Variable            Predictor      B       SE     β       p        F change  R²
SRTSWN unmodulated  age*           0.011   0.002  0.795   < 0.001  < 0.001   44.5%
                    80-Hz ASSRLH*  0.259   0.097  0.406   0.012    0.012     11.6%
SRTSWN modulated    age*           0.013   0.002  0.795   < 0.001  < 0.001   44.5%
                    80-Hz ASSRLH*  0.336   0.115  0.445   0.006    0.006     14.0%
SRTISTS             age*           0.016   0.002  0.868   < 0.001  < 0.001   52.7%
                    80-Hz ASSRLH*  0.292   0.115  0.333   0.016    0.005     7.3%
                    PTA*           0.125   0.058  0.239   0.038    0.038     5.3%
RFM                 80-Hz ASSRRH   0.208   0.103  0.406   0.052    /
                    age            0.003   0.002  0.359   0.084    /
3. Discussion
The present study confirms that neural envelope encoding is related to speech perception performance. This neural-behavioral association applies to NH and HI adults but differs in nature for the two populations. For NH adults, enhanced neural envelope encoding in the auditory cortex is related to poorer speech perception, whereas for HI adults, enhanced neural envelope encoding in the brainstem is related to poorer speech perception. For both populations, these correlations are independent of the type of masker, i.e., interfering background noise or competing speech.
3.1. Enhanced neural envelope encoding in the auditory cortex is related to poorer speech perception for NH adults
The present study showed that, for NH adults, a higher degree of neural synchronization to 4-Hz envelope modulations in the RH, represented by the magnitude of the 4-Hz ASSRRH, was related to poorer speech perception in the presence of the non-stationary maskers. This relationship was detected when age was controlled for, which means that it applies to NH adults, irrespective of age. Taken together with the fact that the 4-Hz ASSR reflects neural synchronization in the auditory cortex (Wang et al., 2012), it appears that enhanced neural envelope encoding at the cortical level corresponds to poorer speech perception for young, middle-aged, and older NH persons in a listening situation with non-stationary masking.
Moore and Glasberg (1993) and Moore et al. (1995) investigated the speech perception performance of NH listeners when simulating loudness recruitment, which is an abnormally rapid loudness growth that perceptually enhances envelope modulations, resulting in better detection of envelope modulations (e.g., Moore et al., 1996; Schlittenlacher and Moore, 2016). These researchers found that loudness recruitment adversely affected speech perception, particularly when the speech was masked by a non-stationary masker. This observation is in line with our results showing that the 4-Hz ASSRRH is significantly correlated with speech perception in the presence of modulated noise (modulated SWN) and interfering speech (ISTS), yet not in the presence of unmodulated noise (unmodulated SWN). A plausible explanation for the association between enhanced encoding of 4-Hz envelope modulations and poorer speech perception in modulated noise and interfering speech lies in modulation masking. The envelope of the modulated SWN was sinusoidally modulated at a 4-Hz rate and the envelope spectrum of the ISTS shows a maximum between 2 and 8 Hz (Holube et al., 2010). In continuous speech, envelope modulations ranging from 2 to 10 Hz signal the occurrence of syllables (Edwards and Chang, 2013; Chait et al., 2015), which are speech units that play an important role in speech perception (e.g., Greenberg et al., 2003). Research of Doelling et al. (2014), for instance, indicates that listeners parse speech into syllable-sized chunks for further processing. As modulation masking is greatest when the rates of the target and noise modulations are similar (Yost et al., 1989; Sek et al., 2015), the syllabic-rate modulations in the modulated SWN and the ISTS are likely to mask the syllabic-rate modulations in the target speech, thereby worsening speech perception. As the correlation between speech perception and 4-Hz neural encoding is found for listening situations in which modulation masking is most likely to happen, we suggest that this cortical encoding is related to modulation masking sensitivity. This idea is corroborated by a study of Millman et al. (2017). These researchers also found enhanced cortical encoding of syllabic-rate modulations (i.e., 2 Hz) to be related to poorer intelligibility of speech masked by a 2-Hz modulated noise, and they demonstrated that this association was particularly true for neural envelope encoding in the posteromedial auditory cortex. Research showing that the segregation of speech from background noise takes place, at least in part, in this specific anatomical region (e.g., Ding and Simon, 2012) strengthens the idea that modulation masking, and thereby difficulties with perceptual segregation, may underlie the relationship between enhanced cortical envelope encoding and degraded perception of speech in non-stationary maskers. Further support for this idea is given by our own data showing that increased cortical envelope encoding corresponds to a decline in RFM (Table 2, top panel). After all, it has been demonstrated that RFM actually reflects release from modulation masking (Stone et al., 2012; Stone and Moore, 2014).
Our data also show that cortical neural envelope encoding is predictive of speech perception in the presence of competing speech. This outcome is in contrast to the lack of association between cortical neural envelope encoding and speech understanding in a four-talker babble, reported by Presacco et al. (2016). In their study, neural envelope encoding was represented by a linear correlation between the reconstructed and actual speech envelope, not by the magnitude of the ASSR to amplitude-modulated noise stimuli, as in the current study. These methodological discrepancies may explain the divergent outcomes.
Given the crucial role of cognitive processing in speech-on-speech masking (informational masking), we expected the predictive value of auditory temporal processing, i.e., neural envelope encoding, to be lower for perception of speech in the presence of competing speech than in the presence of interfering background noise. By comparing the regression coefficients of the 4-Hz ASSRRH for the SRTSWN modulated and the SRTISTS, we demonstrated, however, that the predictive values of neural envelope encoding for speech perception in the presence of interfering background noise and competing speech are not significantly different. This implies that processing of envelope modulations plays an important role in speech perception under conditions of informational masking. Most likely, this role concerns the segregation of the target speech from the interfering speech (object formation). The envelope yields prosodic information (e.g., speech rhythm and tempo) that helps the listener to differentiate between multiple talkers (Bregman, 1990; Rosen, 1992). As such, deficits in temporal envelope processing can compromise object formation, and, in turn, object selection, since adequate object formation is a prerequisite for efficient object selection (e.g., Shinn-Cunningham, 2008).
Our study shows that the degree of neural synchronization to 4-Hz envelope modulations in the RH is a significant predictor of speech perception for NH adults. Since, generally, alterations in the specialized hemisphere are thought to have the most substantial impact on behavioral functions, this outcome fits with the asymmetric sampling in time framework (Poeppel, 2003), which posits that the RH is specialized in the processing of low-frequency theta-range modulations.
An obvious question that arises from the association between enhanced cortical envelope encoding and poorer speech perception is what underlies the envelope enhancement? A plausible explanation is a shift in the neuronal excitation-inhibition balance in favor of neuronal excitation. This is a well-known homeostatic mechanism, i.e., a compensatory process for maintaining an operative level of neuronal functioning (e.g., Gourévitch et al., 2014). The occurrence of homeostatic mechanisms in the auditory cortex of NH adults fits within the framework of ‘hidden hearing loss’, which means that the neural output from the cochlea is reduced while hearing sensitivity (the audiogram) is normal (Schaette and McAlpine, 2011). Such a reduced neural output can be attributed to synaptopathy and/or neuropathy, which is not uncommon in NH subjects because of aging and/or noise exposure (Kujawa and Liberman, 2009; Plack et al., 2014; Sergeyenko et al., 2013). Thus, NH adults can show a reduced cochlear output accompanied by homeostatic mechanisms in the auditory cortex, resulting in higher neuronal excitability and, in turn, enhanced synchronized neural activity.
Previous research on NH adults demonstrated that reduced neural envelope encoding in the brainstem is correlated with poorer speech understanding (Anderson et al., 2011; Dimitrijevic et al., 2004; Leigh-Paffenroth and Fowler, 2006). In the current study, the 80-Hz ASSR, reflecting neural envelope encoding in the brainstem, did not predict speech understanding. Yet, our finding is not incompatible with previous research. Our study is the first to include both subcortical (80-Hz ASSR) and cortical neural predictors (4-Hz ASSR). Since our data did not show a significant subcortical correlation for NH adults, but did reveal a significant cortical correlation, we suggest that neural envelope encoding at the cortical level has a predominant role in predicting speech perception performance.
3.2. Enhanced neural envelope encoding in the brainstem is related to poorer speech perception for HI adults
Our study showed that speech perception for the HI groups decreased significantly with increasing 80-Hz ASSRLH. This neural-behavioral correlation was found for each of the three maskers. When age was controlled for, this neural-behavioral correlation remained. Given that the dominant neural source of the 80-Hz ASSR is localized in the brainstem (Herdman et al., 2002; Luke et al., 2017), our study indicates that enhanced neural envelope encoding in the brainstem is associated with reduced speech perception for young, middle-aged, and older HI persons in a listening situation with either stationary or non-stationary masking.
By comparing the regression coefficients of the 80-Hz ASSRLH between the SRTSWN unmodulated, SRTSWN modulated, and SRTISTS, we demonstrated that the strength of the correlation between the 80-Hz ASSRLH and speech perception was comparable for the three maskers. This suggests that processing of envelope modulations is equally important for speech perception in interfering background noises as in competing speech, which is in agreement with the observation for our NH population (see 3.1.). Schoof and Rosen (2016), however, did not find subcortical neural envelope encoding to be predictive of perception of speech in the presence of competing speech. This discrepancy may be attributed to abundant differences in stimuli and procedures.
The association between encoding of 80-Hz envelope modulations and speech intelligibility for HI people can presumably be explained by a predominant processing of envelope relative to TFS cues. Even though envelope modulations play a key role in speech perception (e.g., Shannon et al., 1995; Peelle and Davis, 2012), ample evidence exists that TFS cues are also important, particularly in noisy listening situations (Lorenzi et al., 2006; Hopkins and Moore, 2009; Füllgrabe et al., 2015). Electrophysiological studies have, however, shown poor TFS encoding in HI adults (Ananthakrishnan et al., 2016; Vercammen et al., 2018). Likewise, behavioral research has demonstrated that HI listeners show reduced sensitivity to TFS information (Buss et al., 2004; Hopkins and Moore, 2011). This poor TFS processing in HI people is in sharp contrast with their strong neural encoding and adequate behavioral detection of envelope modulations (e.g., Wallaert et al., 2017; Goossens et al., under review). Because of this marked envelope-to-TFS processing imbalance, it seems reasonable to assume that HI people predominantly rely on envelope cues, which are necessary but not sufficient for perception of masked speech.
Other studies that explored the relationship between neural envelope encoding at a subcortical level and speech perception reported that enhanced subcortical neural envelope encoding was related to better speech understanding (Anderson et al., 2011; Dimitrijevic et al., 2004; Leigh-Paffenroth and Fowler, 2006), which is opposite to the finding of the present study. Importantly, Anderson et al. (2011) and Leigh-Paffenroth and Fowler (2006) tested NH adults. The contradictory outcomes suggest that the association between neural envelope encoding and speech perception varies with hearing sensitivity. Dimitrijevic et al. (2004), however, did test HI adults (57-86 years of age) and their neural-behavioral correlation was similar to those reported by Anderson et al. (2011) and Leigh-Paffenroth and Fowler (2006). Dimitrijevic et al. (2004) evaluated word recognition and the number of significant ASSRs (i.e., response amplitude significantly higher than the noise estimate) to four carrier frequencies that were simultaneously amplitude- and frequency-modulated at 40- and 80-Hz rates. All measurements were conducted in quiet with/without hearing aids and in the presence of unmodulated SWN with/without hearing aids. The Pearson correlation between word recognition score and percentage significant ASSRs (out of 16 responses: 4 carrier frequencies x 2 amplitude modulations x 2 frequency modulations) was calculated across all listening conditions. We, however, conducted a regression analysis for each listening condition, i.e., unmodulated SWN, modulated SWN, and ISTS. Applying such a more extensive statistical procedure would presumably not have reversed the direction of the neural-behavioral correlation reported by Dimitrijevic et al. (2004), but it could have reduced its statistical significance (Bland and Altman, 1994). Furthermore, we focused on amplitude modulation while Dimitrijevic et al. (2004) used a combined assessment of amplitude and frequency modulation. This may also contribute to the discrepant findings.
The present study shows that the 80-Hz ASSR in the LH is a significant predictor of speech perception for HI adults. As previously discussed (see 2.2.2.), the 80-Hz ASSR has a major brainstem source (e.g., Luke et al., 2017). In addition to this main brainstem activity, however, there is also 80-Hz neural synchronization at the cortical level (Coffey et al., 2016; Coffey et al., 2017; Schoonhoven et al., 2003), hence the division between LH and RH 80-Hz ASSRs. Since changes in the specialized hemisphere are thought to have the highest impact on functional outcomes, our observation that the 80-Hz ASSR in the LH significantly predicts speech intelligibility suggests that the LH is specialized in processing 80-Hz modulations. This notion is in accord with experimental evidence showing that the LH is specialized in processing temporal acoustic features, while the RH is specialized in fine-grained spectral analyses (Okamoto et al., 2009; Zatorre and Belin, 2001; Zatorre et al., 2002).
Both peripheral and central changes with hearing impairment could underlie an increased degree of envelope encoding in HI people. Cochlear damage (outer hair cells) leads to loudness recruitment, which is known to be associated with better detection of envelope modulations (Moore et al., 1996; Moore, 2007; Schlittenlacher and Moore, 2016). This improved behavioral modulation detection fits with enhanced neural encoding of envelope modulations. Also, hearing impairment is characterized by a degeneration of hair cell synapses and/or cochlear nerve fibers, compromising the afferent neural input (e.g., Kujawa and Liberman, 2015). Consequently – as previously discussed (see 3.1.) – homeostatic mechanisms may operate to maintain neuronal functioning (e.g., Gourévitch et al., 2014). Examples of homeostatic mechanisms in the HI auditory system are nerve fibers showing steeper-than-normal response growths (Kale and Heinz, 2010) and reduced inhibitory synaptic strength (Vale and Sanes, 2002). These central changes may very well boost neural synchronization to acoustic envelope modulations, including the 80-Hz modulations under investigation.
3.3. Directions for future research
The present outcomes can provide directions for future research aiming to develop advanced rehabilitation strategies for speech perception difficulties that emerge throughout adult life. We discussed that the poorer speech understanding with enhanced neural envelope encoding for NH adults may be associated with difficulties in perceptual segregation, and the occurrence of modulation masking in particular. According to Kwon and Turner (2001), modulation masking concerns a failure of object selection, i.e., the listener cannot ignore the background noise and/or cannot selectively attend to the target speech. In this framework, it seems worthwhile to investigate whether training selective listening could mitigate speech perception difficulties for NH people. For HI people, we argued that the worse speech perception with enhanced envelope encoding could involve predominant processing of envelope modulations relative to TFS cues. Since adequate processing of TFS, in addition to envelope cues, is important for masked speech perception (e.g., Hopkins and Moore, 2009), it could be explored whether auditory training focusing on TFS processing can improve the speech perception skills of HI listeners. Also, further research is warranted to verify the direction of the association between neural envelope encoding and speech perception and whether this neural-behavioral relationship is mediated by factors that were not taken into account in the present study (e.g., linguistic skills).
3.4. Conclusion
In conclusion, the present study shows that neural envelope encoding is significantly related to the speech perception performance of adults aged between 20 and 80 years. For NH and HI adults, enhanced neural envelope encoding in the auditory cortex and in the brainstem, respectively, is associated with poorer perception of masked speech, whether the masking is produced by interfering background noise or competing speech.
Acknowledgments
Our special thanks go to all participants for their cooperation in this research. We are grateful to our master’s students, Anneleen Berghmans, Dorien Vandevenne, Ellen Vermaete, Eva Stroobants, Evelien Van den Broeck, Jolien Orye, Kaat Van den Brande, Lore Heylen, Louise Van Haesendonck, Marjolein Declercq, Sarah Heyndrickx, and Robin Gransier for their assistance in data collection, and to Astrid De Vos for her helpful comments with regard to data analysis. We highly appreciate the constructive feedback of Prof. Brian Moore and two anonymous reviewers on earlier versions of our manuscript.

This research was funded by the Research Foundation – Flanders (FWO) through an FWO-aspirant grant to Tine Goossens (grant number 11Z8817N) and by the Research Council of KU Leuven (project OT/12/98).
References
Ahissar, E., Nagarajan, S., Ahissar, M., Protopapas, A., Mahncke, H., Merzenich, M.M., 2001. Speech comprehension is correlated with temporal response patterns recorded from auditory cortex. Proc. Natl. Acad. Sci. U. S. A. 98, 13367–13372. doi:10.1073/pnas.201400998
Ananthakrishnan, S., Krishnan, A., Bartlett, E., 2016. Human frequency following response: neural representation of envelope and temporal fine structure in listeners with normal hearing and sensorineural hearing loss. Ear Hear. 37, e91–e103. doi:10.1097/AUD.0000000000000247
Anderson, S., Parbery-Clark, A., White-Schwoch, T., Drehobl, S., Kraus, N., 2013. Effects of hearing loss on the subcortical representation of speech cues. J. Acoust. Soc. Am. 133, 3030–3038. doi:10.1121/1.4799804
Anderson, S., Parbery-Clark, A., White-Schwoch, T., Kraus, N., 2012. Aging affects neural precision of speech encoding. J. Neurosci. 32, 14156–14164. doi:10.1523/JNEUROSCI.2176-12.2012
Anderson, S., Parbery-Clark, A., Yi, H.-G., Kraus, N., 2011. A neural basis of speech-in-noise perception in older adults. Ear Hear. 32, 750–757. doi:10.1097/AUD.0b013e31822229d3
Ansari, M.S., Rangasayee, R., Ansari, M.A.H., 2017. Neurophysiological aspects of brainstem processing of speech stimuli in audiometric-normal geriatric population. J. Laryngol. Otol. 131, 239–244. doi:10.1017/S0022215116009841
Bland, J.M., Altman, D.G., 1994. Correlation, regression, and repeated data. BMJ 308, 896. doi:10.1136/bmj.308.6942.1510a
Brame, R., Paternoster, R., Mazerolle, P., Piquero, A., 1998. Testing for the equality of maximum-likelihood regression coefficients between two independent equations. J. Quant. Criminol. 14, 245–261. doi:10.1023/A:1023030312801
Bregman, A.S., 1990. Auditory scene analysis: the perceptual organization of sounds, 1st ed. MIT Press, Cambridge, MA.
Brungart, D.S., 2001. Informational and energetic masking effects in the perception of two simultaneous talkers. J. Acoust. Soc. Am. 109, 1101–1109. doi:10.1121/1.1345696
Buss, E., Hall, J.W., Grose, J.H., 2004. Temporal fine-structure cues to speech and pure tone modulation in observers with sensorineural hearing loss. Ear Hear. 25, 242–250. doi:10.1097/01.AUD.0000130796.73809.09
Chait, M., Greenberg, S., Arai, T., Simon, J.Z., Poeppel, D., 2015. Multi-time resolution analysis of speech: evidence from psychophysics. Front. Neurosci. 9, 214. doi:10.3389/fnins.2015.00214
Christiansen, C., Dau, T., 2012. Relationship between masking release in fluctuating maskers and speech reception thresholds in stationary noise. J. Acoust. Soc. Am. 132, 1655–1666. doi:10.1121/1.4742732
Coffey, E.B.J., Herholz, S.C., Chepesiuk, A.M.P., Baillet, S., Zatorre, R.J., 2016. Cortical contributions to the auditory frequency-following response revealed by MEG. Nat. Commun. 7, 11070. doi:10.1038/ncomms11070
Coffey, E.B.J., Musacchia, G., Zatorre, R.J., 2017. Cortical correlates of the auditory frequency-following and onset responses: EEG and fMRI evidence. J. Neurosci. 37, 830–838. doi:10.1523/JNEUROSCI.1265-16.2017
Cohen, A., 1983. Comparing regression coefficients across subsamples: a study of the statistical test. Sociol. Methods Res. 12, 77–94. doi:10.1177/0049124183012001003
Dimitrijevic, A., John, M.S., Picton, T.W., 2004. Auditory steady-state responses and word recognition scores in normal-hearing and hearing-impaired adults. Ear Hear. 25, 68–84. doi:10.1097/01.AUD.0000111545.71693.48
Ding, N., Simon, J.Z., 2012. Emergence of neural encoding of auditory objects while listening to competing speakers. Proc. Natl. Acad. Sci. U. S. A. 109, 5–10. doi:10.1073/pnas.1205381109
Doelling, K.B., Arnal, L.H., Ghitza, O., Poeppel, D., 2014. Acoustic landmarks drive delta-theta oscillations to enable speech comprehension by facilitating perceptual parsing. Neuroimage 85, 761–768. doi:10.1016/j.neuroimage.2013.06.035
Drullman, R., Festen, J.M., Plomp, R., 1994. Effect of temporal envelope smearing on speech reception. J. Acoust. Soc. Am. 95, 1053–1064. doi:10.1121/1.408467
Dubno, J.R., Horwitz, A.R., Ahlstrom, J.B., 2002. Benefit of modulated maskers for speech recognition by younger and older adults with normal hearing. J. Acoust. Soc. Am. 111, 2897–2907. doi:10.1121/1.1480421
Durlach, N.I., Mason, C.R., Kidd, G., Arbogast, T.L., Colburn, H.S., Shinn-Cunningham, B.G., 2003. Note on informational masking. J. Acoust. Soc. Am. 113, 2984–2987. doi:10.1121/1.1570435
Edwards, E., Chang, E.F., 2013. Syllabic (~2-5 Hz) and fluctuation (~1-10 Hz) ranges in speech and auditory processing. Hear. Res. 305, 113–134. doi:10.1016/j.heares.2013.08.017
Emara, A.A.Y., Kolkaila, E.A., 2010. Prediction of loudness growth in subjects with sensorineural hearing loss using auditory steady state response. J. Int. Adv. Otol. 6, 371–379.
Festen, J.M., Plomp, R., 1990. Effects of fluctuating noise and interfering speech on the speech-reception threshold for impaired and normal hearing. J. Acoust. Soc. Am. 88, 1725–1736. doi:10.1121/1.400247
Francart, T., van Wieringen, A., Wouters, J., 2011. Comparison of fluctuating maskers for speech recognition tests. Int. J. Audiol. 50, 2–13. doi:10.3109/14992027.2010.505582
Füllgrabe, C., Moore, B.C.J., Stone, M.A., 2015. Age-group differences in speech identification despite matched audiometrically normal hearing: contributions from auditory temporal processing and cognition. Front. Aging Neurosci. 6, 347. doi:10.3389/fnagi.2014.00347
Giraud, A.-L., Lorenzi, C., Ashburner, J., Wable, J., Johnsrude, I., Frackowiak, R., Kleinschmidt, A., 2000. Representation of the temporal envelope of sounds in the human brain. J. Neurophysiol. 84, 1588–1598. doi:10.1152/jn.2000.84.3.1588
Goossens, T., Vercammen, C., Wouters, J., van Wieringen, A., 2017. Masked speech perception across the adult lifespan: impact of age and hearing impairment. Hear. Res. 344, 109–124. doi:10.1016/j.heares.2016.11.004
Goossens, T., Vercammen, C., Wouters, J., van Wieringen, A., 2016. Aging affects neural synchronization to speech-related acoustic modulations. Front. Aging Neurosci. 8, 133. doi:10.3389/fnagi.2016.00133
Goossens, T., Vercammen, C., Wouters, J., van Wieringen, A. The impact of hearing impairment on neural envelope encoding at different ages. under review
Gourévitch, B., Edeline, J.M., Occelli, F., Eggermont, J.J., 2014. Is the din really harmless? Long-term effects of non-traumatic noise on the adult auditory system. Nat. Rev. Neurosci. 15, 483–491. doi:10.1038/nrn3744
ACCEPTED MANUSCRIPT
Greenberg, S., Carvey, H., Hitchcock, L., Chang, S., 2003. Temporal properties of spontaneous speech— a syllable-centric perspective. J. Phon. 31, 465–485. doi:10.1016/j.wocn.2003.09.005 Grose, J.H., Mamo, S.K., Hall, J.W., 2009. Age effects in temporal envelope processing: speech and
auditory
steady
state
responses.
Ear
Hear.
30,
568–575.
RI PT
unmasking
doi:10.1097/AUD.0b013e3181ac128f
Hastie, T., Tibshirani, R., Friedman, J., 2009. Linear methods for regression, in: Hastie, T., Tibshirani, R., Friedman, J. (Eds.), The elements of statistical learning: data mining, inference, and prediction.
SC
Springer, New York, NY, pp. 43–100.
Helfer, K.S., Wilber, L.A., 1990. Hearing loss, aging, and speech perception in reverberation and noise. J.
M AN U
Speech Hear. Res. 33, 149–155. doi:10.1044/jshr.3301.149
Herdman, A.T., Lins, O., Van Roon, P., Stapells, D.R., Scherg, M., Picton, T.W., 2002. Intracerebral sources
of
human
auditory
steady-state
doi:10.1023/A:1021470822922
responses.
Brain
Topogr.
15,
69–86.
Holube, I., Fredelake, S., Vlaming, M., Kollmeier, B., 2010. Development and analysis of an International Speech Test Signal (ISTS). Int. J. Audiol. 49, 891–903. doi:10.3109/14992027.2010.506889
TE D
Hopkins, K., Moore, B.C.J., 2011. The effects of age and cochlear hearing loss on temporal fine structure sensitivity, frequency selectivity, and speech reception in noise. J. Acoust. Soc. Am. 130, 334–349. doi:10.1121/1.3585848
Hopkins, K., Moore, B.C.J., 2009. The contribution of temporal fine structure to the intelligibility of speech in steady and modulated noise. J. Acoust. Soc. Am. 125, 442–446. doi:10.1121/1.3037233
Humes, L.E., Roberts, L., 1990. Speech-recognition difficulties of the hearing-impaired elderly: the contributions of audibility. J. Speech Hear. Res. 33, 726–735. doi:10.1044/jshr.3304.726
International Organization for Standardization, 1998. ISO 389-1: Acoustics - Reference zero for the calibration of audiometric equipment. Part 1: Reference equivalent threshold sound pressure levels for pure tones and supra-aural earphones. Geneva, Switzerland.
Jansen, S., Luts, H., Wagener, K.C., Kollmeier, B., Del Rio, M., Dauman, R., James, C., Fraysse, B., Vormès, E., Frachet, B., Wouters, J., van Wieringen, A., 2012. Comparison of three types of French speech-in-noise tests: a multi-center study. Int. J. Audiol. 51, 164–173. doi:10.3109/14992027.2011.633568
Joris, P.X., Schreiner, C.E., Rees, A., 2004. Neural processing of amplitude-modulated sounds. Physiol. Rev. 84, 541–577. doi:10.1152/physrev.00029.2003
Kale, S., Heinz, M.G., 2010. Envelope coding in auditory nerve fibers following noise-induced hearing loss. J. Assoc. Res. Otolaryngol. 11, 657–673. doi:10.1007/s10162-010-0223-6
Kidd, G., Mason, C.R., Deliwala, P.S., Woods, W.S., Colburn, H.S., 1994. Reducing informational masking by sound segregation. J. Acoust. Soc. Am. 95, 3475–3480. doi:10.1121/1.410023
Kujawa, S.G., Liberman, M.C., 2015. Synaptopathy in the noise-exposed and aging cochlea: primary neural degeneration in acquired sensorineural hearing loss. Hear. Res. 330, 191–199. doi:10.1016/j.heares.2015.02.009
Kujawa, S.G., Liberman, M.C., 2009. Adding insult to injury: cochlear nerve degeneration after “temporary” noise-induced hearing loss. J. Neurosci. 29, 14077–14085. doi:10.1523/JNEUROSCI.2845-09.2009
Kwon, B.J., Turner, C.W., 2001. Consonant identification under maskers with sinusoidal modulation: masking release or modulation interference? J. Acoust. Soc. Am. 110, 1130–1140. doi:10.1121/1.1384909
Leigh-Paffenroth, E.D., Fowler, C.G., 2006. Amplitude-modulated auditory steady-state responses in younger and older listeners. J. Am. Acad. Audiol. 17, 582–597. doi:10.3766/jaaa.17.8.5
Lopes da Silva, F., 2013. EEG and MEG: relevance to neuroscience. Neuron 80, 1112–1128. doi:10.1016/j.neuron.2013.10.017
Lorenzi, C., Gilbert, G., Carn, H., Garnier, S., Moore, B.C.J., 2006. Speech perception problems of the hearing impaired reflect inability to use temporal fine structure. Proc. Natl. Acad. Sci. U. S. A. 103, 18866–18869. doi:10.1073/pnas.0607364103
Luke, R., De Vos, A., Wouters, J., 2017. Source analysis of auditory steady-state responses in acoustic and electric hearing. Neuroimage 147, 568–576. doi:10.1016/j.neuroimage.2016.11.023
Ménard, M., Gallégo, S., Berger-Vachon, C., Collet, L., Thai-Van, H., 2008. Relationship between loudness growth function and auditory steady-state response in normal-hearing subjects. Hear. Res. 235, 105–113. doi:10.1016/j.heares.2007.10.007
Millman, R.E., Mattys, S.L., Gouws, A.D., Prendergast, G., 2017. Magnified neural envelope coding predicts deficits in speech perception in noise. J. Neurosci. 37, 7727–7736. doi:10.1523/JNEUROSCI.2722-16.2017
Moore, B.C.J., Glasberg, B.R., Vickers, D.A., 1995. Simulation of the effects of loudness recruitment on the intelligibility of speech in noise. Br. J. Audiol. 29, 131–143. doi:10.3109/03005369509086590
Moore, B.C.J., 2014. Auditory processing of temporal fine structure: effects of age and hearing loss, 1st ed. World Scientific Publishing Co. Pte. Ltd., Toh Tuck, Singapore.
Moore, B.C.J., 2007. Cochlear hearing loss: physiological, psychological and technical issues, 2nd ed. Wiley, Chichester, UK.
Moore, B.C.J., Glasberg, B.R., 1993. Simulation of the effects of loudness recruitment and threshold elevation on the intelligibility of speech in quiet and in a background of speech. J. Acoust. Soc. Am. 94, 2050–2062. doi:10.1121/1.407478
Moore, B.C.J., Wojtczak, M., Vickers, D.A., 1996. Effect of loudness recruitment on the perception of amplitude modulation. J. Acoust. Soc. Am. 100, 481–489. doi:10.1121/1.415861
Nasreddine, Z.S., Phillips, N.A., Bédirian, V., Charbonneau, S., Whitehead, V., Collin, I., Cummings, J.L., Chertkow, H., 2005. The Montreal Cognitive Assessment, MoCA: a brief screening tool for mild cognitive impairment. J. Am. Geriatr. Soc. 53, 695–699. doi:10.1111/j.1532-5415.2005.53221.x
Okamoto, H., Stracke, H., Draganova, R., Pantev, C., 2009. Hemispheric asymmetry of auditory evoked fields elicited by spectral versus temporal stimulus change. Cereb. Cortex 19, 2290–2297. doi:10.1093/cercor/bhn245
Oldfield, R.C., 1971. The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia 9, 97–113. doi:10.1016/0028-3932(71)90067-4
Peelle, J.E., Davis, M.H., 2012. Neural oscillations carry speech rhythm through to comprehension. Front. Psychol. 3, 320. doi:10.3389/fpsyg.2012.00320
Plack, C.J., Barker, D., Prendergast, G., 2014. Perceptual consequences of “hidden” hearing loss. Trends Hear. 18, 1–11. doi:10.1177/2331216514550621
Plomp, R., Mimpen, A.M., 1979. Improving the reliability of testing the speech reception threshold for sentences. Audiology 18, 43–52. doi:10.3109/00206097909072618
Poeppel, D., 2003. The analysis of speech in different temporal integration windows: cerebral lateralization as “asymmetric sampling in time.” Speech Commun. 41, 245–255. doi:10.1016/S0167-6393(02)00107-3
Presacco, A., Jenkins, K., Lieberman, R., Anderson, S., 2015. Effects of aging on the encoding of dynamic and static components of speech. Ear Hear. 36, e352–e363. doi:10.1097/AUD.0000000000000193
Presacco, A., Simon, J.Z., Anderson, S., 2016. Evidence of degraded representation of speech in noise, in the aging midbrain and cortex. J. Neurophysiol. 116, 2346–2355. doi:10.1152/jn.00372.2016
Purcell, D.W., John, S.M., Schneider, B.A., Picton, T.W., 2004. Human temporal auditory acuity as assessed by envelope following responses. J. Acoust. Soc. Am. 116, 3581–3593. doi:10.1121/1.1798354
Rance, G., 2008. Auditory steady-state response: generation, recording, and clinical applications, 1st ed. Plural Publishing Inc, San Diego, CA.
Rosen, S., 1992. Temporal information in speech: acoustic, auditory and linguistic aspects. Philos. Trans. Biol. Sci. 336, 367–373. doi:10.1098/rstb.1992.0070
Schaette, R., McAlpine, D., 2011. Tinnitus with a normal audiogram: physiological evidence for hidden hearing loss and computational model. J. Neurosci. 31, 13452–13457. doi:10.1523/JNEUROSCI.2156-11.2011
Schlittenlacher, J., Moore, B.C.J., 2016. Discrimination of amplitude-modulation depth by subjects with normal and impaired hearing. J. Acoust. Soc. Am. 140, 3487–3495. doi:10.1121/1.4966117
Schoof, T., Rosen, S., 2016. The role of age-related declines in subcortical auditory processing in speech perception in noise. J. Assoc. Res. Otolaryngol. 17, 441–460. doi:10.1007/s10162-016-0564-x
Schoonhoven, R., Boden, C.J.R., Verbunt, J.P.A., de Munck, J.C., 2003. A whole head MEG study of the amplitude-modulation-following response: phase coherence, group delay and dipole source analysis. Clin. Neurophysiol. 114, 2096–2106. doi:10.1016/S1388-2457(03)00200-1
Sek, A., Baer, T., Crinnion, W., Springgay, A., Moore, B.C.J., 2015. Modulation masking within and across carriers for subjects with normal and impaired hearing. J. Acoust. Soc. Am. 138, 1143–1153. doi:10.1121/1.4928135
Sergeyenko, Y., Lall, K., Liberman, M.C., Kujawa, S.G., 2013. Age-related cochlear synaptopathy: an early-onset contributor to auditory functional decline. J. Neurosci. 33, 13686–13694. doi:10.1523/JNEUROSCI.1783-13.2013
Shannon, R.V., Zeng, F.-G., Kamath, V., Wygonski, J., Ekelid, M., 1995. Speech recognition with primarily temporal cues. Science 270, 303–304. doi:10.1126/science.270.5234.303
Shinn-Cunningham, B.G., 2008. Object-based auditory and visual attention. Trends Cogn. Sci. 12, 182–186. doi:10.1016/j.tics.2008.02.003
Souza, P.E., Turner, C.W., 1994. Masking of speech in young and elderly listeners with hearing loss. J. Speech Hear. Res. 37, 655–661. doi:10.1044/jshr.3703.655
Stone, M.A., Füllgrabe, C., Moore, B.C.J., 2012. Notionally steady background noise acts primarily as a modulation masker of speech. J. Acoust. Soc. Am. 132, 317–326. doi:10.1121/1.4725766
Stone, M.A., Moore, B.C.J., 2014. On the near non-existence of “pure” energetic masking release for speech. J. Acoust. Soc. Am. 135, 1967–1977. doi:10.1121/1.4868392
Studenmund, A.H., Cassidy, H.J., 1987. Using econometrics: a practical guide, 1st ed. Little, Brown Book Group, London, UK.
Tlumak, A.I., Durrant, J.D., Delgado, R.E., 2015. The effect of advancing age on auditory middle- and long-latency evoked potentials using a steady-state-response approach. Am. J. Audiol. 24, 494–507. doi:10.1044/2015_AJA-15-0036
Vale, C., Sanes, D.H., 2002. The effect of bilateral deafness on excitatory and inhibitory synaptic strength in the inferior colliculus. Eur. J. Neurosci. 16, 2394–2404. doi:10.1046/j.1460-9568.2002.02302.x
Van Eeckhoutte, M., Wouters, J., Francart, T., 2016. Auditory steady-state responses as neural correlates of loudness growth. Hear. Res. 342, 58–68. doi:10.1016/j.heares.2016.09.009
van Wieringen, A., Wouters, J., 2008. LIST and LINT: Sentences and numbers for quantifying speech understanding in severely impaired listeners for Flanders and the Netherlands. Int. J. Audiol. 47, 348–355. doi:10.1080/14992020801895144
Vercammen, C., Goossens, T., Undurraga, J., Wouters, J., van Wieringen, A., 2018. Electrophysiological and behavioral evidence of reduced binaural temporal processing in the aging and hearing impaired human auditory system. Trends Hear. 22, 1–12. doi:10.1177/2331216518785733
Wallaert, N., Moore, B.C.J., Ewert, S.D., Lorenzi, C., 2017. Sensorineural hearing loss enhances auditory sensitivity and temporal integration for amplitude modulation. J. Acoust. Soc. Am. 141, 971–980. doi:10.1121/1.4976080
Wang, X., Lu, T., Snider, R.K., Liang, L., 2005. Sustained firing in auditory cortex evoked by preferred stimuli. Nature 435, 341–346. doi:10.1038/nature03565
Wang, Y., Ding, N., Ahmar, N., Xiang, J., Poeppel, D., Simon, J.Z., 2012. Sensitivity to temporal modulation rate and spectral bandwidth in the human auditory system: MEG evidence. J. Neurophysiol. 107, 2033–2041. doi:10.1152/jn.00310.2011
Yost, W.A., Sheft, S., Opie, J., 1989. Modulation interference in detection and discrimination of amplitude modulation. J. Acoust. Soc. Am. 86, 2138–2147. doi:10.1121/1.398474
Zatorre, R.J., Belin, P., 2001. Spectral and temporal processing in human auditory cortex. Cereb. Cortex 11, 946–953. doi:10.1093/cercor/11.10.946
Zatorre, R.J., Belin, P., Penhune, V.B., 2002. Structure and function of auditory cortex: music and speech. Trends Cogn. Sci. 6, 37–46. doi:10.1016/s1364-6613(00)01816-7
Figure captions
Fig. 1. Median audiometric thresholds (dB HL) of the ear with the best pure-tone average (PTA0.25-8 kHz) for the NH (black) and HI (gray) participants. Diamonds, squares, and circles show thresholds for young, middle-aged, and older people, respectively. Error bars indicate the interquartile range.
Fig. 2. Time waveform of a sentence from the Leuven Intelligibility Sentence Test (black) in the presence of background noise (gray), i.e., unmodulated SWN (panel A), modulated SWN (panel B), and the ISTS (panel C).
Fig. 3. Panel A: Illustration of an 80-Hz ASSR (indicated by the arrow) in an EEG spectrum (0-100 Hz); Panel B: Electrode configuration and electrodes selected for the ASSR evaluation (bold circles).
Fig. 4. Illustration of the correlation between perception of speech in the presence of modulated SWN and neural envelope encoding for NH adults (4-Hz ASSRRH; top panel) and for HI adults (80-Hz ASSRLH; bottom panel). The x-axis shows the unstandardized residuals when regressing the neural predictor X on the other best subset predictors, i.e., age and 4-Hz ASSRLH for NH adults (top panel) and age for HI adults (bottom panel). Stated otherwise, the x-axis represents the neural predictor X when the other best subset predictors are controlled for. Positive residuals reflect underestimation of the neural predictor X (ASSR magnitude) by the other best subset predictors. Negative residuals reflect overestimation. The y-axis shows each participant’s SRTSWN modulated. Diamonds, squares, and circles show the X and Y values for young, middle-aged, and older people, respectively.
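The residualization described in the Fig. 4 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function name residualize, the variable names, and all data values are hypothetical, and the data are synthetic. The neural predictor X is regressed on the other predictors by ordinary least squares with an intercept, and the unstandardized residuals (observed minus fitted X) are then related to the SRT.

```python
import numpy as np


def residualize(x, covariates):
    """Unstandardized residuals of x after OLS regression on covariates plus an intercept."""
    x = np.asarray(x, dtype=float)
    # Design matrix: intercept column followed by the covariate column(s).
    design = np.column_stack([np.ones(x.shape[0]), np.asarray(covariates, dtype=float)])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    # Positive residual: covariates underestimate x; negative: they overestimate it.
    return x - design @ coef


# Synthetic, purely illustrative data (hypothetical, not the study's measurements).
rng = np.random.default_rng(0)
n = 60
age = rng.uniform(20.0, 80.0, size=n)
assr = 0.02 * age + rng.normal(0.0, 1.0, size=n)   # hypothetical neural predictor X
srt = 0.05 * age + 0.8 * assr + rng.normal(0.0, 1.0, size=n)  # hypothetical SRT (dB SNR)

x_resid = residualize(assr, age)          # X controlled for age (x-axis of such a plot)
r = np.corrcoef(x_resid, srt)[0, 1]       # correlation of residualized X with raw SRT
print(f"correlation between residualized X and SRT: {r:.2f}")
```

With several control predictors, as for the NH group (age and 4-Hz ASSRLH), the covariates can be passed as a two-column array, e.g. np.column_stack([age, other_assr]), where other_assr is again a hypothetical name.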
Highlights
• Enhanced neural envelope encoding is related to poorer speech perception
• This association applies to speech masked by interfering noise or competing speech
• Cortical envelope encoding predicts speech perception for normal-hearing adults
• Brainstem envelope encoding predicts speech perception for hearing-impaired adults
• Such neural-behavioral relations are found for young, middle-aged, and older adults
Fig. 1 (image not reproduced): audiometric threshold (dB HL) versus frequency (250 Hz to 8 kHz); legend: NH young, NH middle-aged, NH older, HI young, HI middle-aged, HI older.
Fig. 2 (image not reproduced): panels A, B, and C, each showing pressure (Pa) versus time (s).
Fig. 3 (image not reproduced): panel A, amplitude (dB re nV) versus frequency (Hz); panel B, electrode configuration.
Fig. 4 (image not reproduced): SRT SWN modulated (dB SNR) versus 4-Hz ASSR RH (controlled for age and 4-Hz ASSR LH) for NH, and versus 80-Hz ASSR LH (controlled for age) for HI; legend: young, middle-aged, older; annotated correlations: r = 0.32 and r = 0.37.