Neural envelope encoding predicts speech

Accepted Manuscript

Neural envelope encoding predicts speech perception performance for normal-hearing and hearing-impaired adults

Tine Goossens, Charlotte Vercammen, Jan Wouters, Astrid van Wieringen

PII: S0378-5955(17)30590-7
DOI: 10.1016/j.heares.2018.07.012
Reference: HEARES 7600
To appear in: Hearing Research
Received Date: 6 December 2017
Revised Date: 19 July 2018
Accepted Date: 25 July 2018

Please cite this article as: Goossens, T., Vercammen, C., Wouters, J., van Wieringen, A., Neural envelope encoding predicts speech perception performance for normal-hearing and hearing-impaired adults, Hearing Research (2018), doi: 10.1016/j.heares.2018.07.012. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Neural envelope encoding predicts speech perception performance for normal-hearing and hearing-impaired adults. Tine Goossens*, Charlotte Vercammen, Jan Wouters, Astrid van Wieringen


KU Leuven - University of Leuven, Department of Neurosciences, Research Group Experimental ORL, Herestraat 49 bus 721, 3000 Leuven, Belgium

E-mail addresses: Tine Goossens: [email protected]; Charlotte Vercammen: [email protected]; Jan Wouters: [email protected]; Astrid van Wieringen: [email protected]


*Corresponding author: e-mail: [email protected] postal address: Tine Goossens, KU Leuven - University of Leuven, Department of Neurosciences, Research Group Experimental ORL, Herestraat 49 bus 721, 3000 Leuven, Belgium


Declarations of interest: none


Abstract

Peripheral hearing impairment cannot fully account for speech perception difficulties that emerge with advancing age. As the fluctuating speech envelope bears crucial information for speech perception, changes in temporal envelope processing are thought to contribute to degraded speech perception. Previous research has demonstrated changes in neural encoding of envelope modulations throughout the adult lifespan, either due to age or due to hearing impairment. To date, however, it remains unclear whether such age- and hearing-related neural changes are associated with impaired speech perception. In the present study, we investigated the potential relationship between perception of speech in different types of masking sounds and neural envelope encoding for a normal-hearing and hearing-impaired adult population including young (20-30 years), middle-aged (50-60 years), and older (70-80 years) people. Our analyses show that enhanced neural envelope encoding in the cortex and in the brainstem, respectively, is related to worse speech perception for normal-hearing and for hearing-impaired adults. This neural-behavioral correlation is found for the three age groups and appears to be independent of the type of masking noise, i.e., background noise or competing speech. These findings provide promising directions for future research aiming to develop advanced rehabilitation strategies for speech perception difficulties that emerge throughout adult life.

Keywords: normal-hearing, hearing-impaired, speech perception, neural envelope encoding, correlation

1. Introduction

Difficulties in speech perception, especially in noisy listening situations, become more and more prevalent with advancing age. This can, in part, be attributed to hearing impairment (Helfer and Wilber, 1990; Humes and Roberts, 1990; Souza and Turner, 1994). Older people who have excellent hearing thresholds, however, can show poor speech perception as well (e.g., Dubno et al., 2002; Füllgrabe et al., 2015), which indicates that hearing impairment is not the only factor affecting speech understanding across the adult lifespan. In recent years, there has been growing interest in potential contributions of central auditory processing and temporal processing in particular.

Abbreviations: ASSR, auditory steady-state response; HI, hearing-impaired; ISTS, International Speech Test Signal; LH, left hemisphere; NH, normal-hearing; PTA, pure-tone average; RFM, release from masking; RH, right hemisphere; SD, standard deviation; SE, standard error; SNR, signal-to-noise ratio; SRT, speech reception threshold; SWN, speech-weighted noise; TFS, temporal fine structure

Temporal processing is crucial for speech perception. After cochlear filtering, speech sounds can be considered as a series of bandpass-filtered signals, each of which is characterized by an envelope superimposed on a temporal fine structure (TFS) (e.g., Moore, 2014). The envelope refers to the relatively slow variations in amplitude over time and the TFS represents more rapid acoustic oscillations. Envelope modulations signal the occurrence of speech units, e.g., syllables (~2-10 Hz) and phonemes (~10-40 Hz) (Chait et al., 2015), and the TFS plays an important role in the perception of pitch (Moore, 2014). Behavioral research has demonstrated that, even without substantial spectral cues, normal-hearing (NH) listeners achieve good speech perception in quiet, on the condition that low-frequency envelope modulations in several frequency bands are preserved (Drullman et al., 1994; Shannon et al., 1995). This shows that envelope modulations are a crucial temporal speech feature.

Envelope modulations are encoded in the central auditory system through synchronized activity of neural oscillations, classified as delta (< 4 Hz), theta (4-7 Hz), alpha (8-13 Hz), beta (14-30 Hz), and gamma (> 30 Hz) oscillations (Lopes da Silva, 2013; Peelle and Davis, 2012). In response to a periodically varying auditory input, neural oscillations synchronize to the modulations that match their characteristic frequency (Wang et al., 2005). An important aspect of this temporal encoding mechanism is that different modulation rates are predominantly encoded at different stages along the auditory pathway. When ascending the auditory pathway, there is a progressive shift from processing of high-frequency modulations towards processing of low-frequency modulations (Giraud et al., 2000; Joris et al., 2004).

Electrophysiological studies have demonstrated that neural synchronization to the speech envelope is an important mechanism underlying speech perception (Ahissar et al., 2001; Doelling et al., 2014; Peelle and Davis, 2012). Moreover, neural envelope encoding changes with advancing age, even when hearing sensitivity is preserved. Aging appears to be accompanied by a decrease in neural envelope encoding at the brainstem level (Anderson et al., 2012; Ansari et al., 2017; Grose et al., 2009; Leigh-Paffenroth and Fowler, 2006; Presacco et al., 2015; Purcell et al., 2004) and by an increase in neural envelope encoding at the cortical level (Presacco et al., 2016; Tlumak et al., 2015). Hearing impairment has also been related to changes in neural envelope encoding. Enhanced neural synchronization to envelope modulations has been detected in the brainstem (Anderson et al., 2013) and auditory cortex (Millman et al., 2017) of hearing-impaired (HI) adults relative to similarly-aged NH listeners.

Given that envelope modulations play a role in speech perception, it seems likely that these age- and hearing-related changes in neural envelope encoding contribute to the speech perception difficulties that emerge throughout the adult lifespan. Nevertheless, only a limited number of studies have investigated correlations between neural envelope encoding and speech perception. Schoof and Rosen (2016) observed reduced neural envelope encoding in the brainstem and worse speech perception for older (60-72 years) than for young (19-29 years) NH adults, but the neural-behavioral correlation was not significant. Other studies of NH adults did find significant correlations between the degree of subcortical neural envelope encoding and speech perception performance. Anderson et al. (2011) tested a group of NH older people (60-73 years) and found a higher degree of neural envelope encoding at the brainstem level to correlate significantly with better speech intelligibility. Dimitrijevic et al. (2004) and Leigh-Paffenroth and Fowler (2006) demonstrated that such a positive correlation occurred not only for NH older adults (60-80 years) but also for NH young individuals (20-40 years). Presacco et al. (2016) and Millman et al. (2017) examined speech understanding and neural envelope encoding at the cortical level. In contrast to Presacco et al. (2016), who did not find a significant association between the degree of cortical neural synchronization and speech understanding among young (18-27 years) and older NH individuals (61-73 years), Millman et al. (2017) found a significant correlation for middle-aged NH adults (~60 years). These researchers found enhanced neural envelope encoding in the left auditory cortex to be significantly related to worse speech perception. Note that the direction of this neural-behavioral relationship was opposite to the one observed for the brainstem level (Anderson et al., 2011; Dimitrijevic et al., 2004; Leigh-Paffenroth and Fowler, 2006).

Remarkably, the studies that did not find an association between the degree of neural envelope encoding and speech perception for NH adults, i.e., Schoof and Rosen (2016) and Presacco et al. (2016), investigated speech perception in the presence of competing speech, whereas Anderson et al. (2011), Dimitrijevic et al. (2004), and Millman et al. (2017), who did find significant correlations, used speech-weighted noise (SWN). An important difference between interfering background noise and competing speech concerns the nature of the masking effect. The degrading effect of SWN results from energetic masking and/or modulation masking (Brungart, 2001; Stone et al., 2012). Energetic masking refers to reduced audibility of the target speech that results from a spectro-temporal overlap between the background noise and the target speech. Modulation masking occurs when envelope modulations in the noise – either random modulations intrinsic to noise or modulations imposed on the noise – adversely affect the perception of envelope modulations in the target speech. The degrading effect of competing speech not only involves energetic and modulation masking, but also incorporates some degree of informational masking (Brungart, 2001; Durlach et al., 2003). Informational masking draws on central, cognitive processing: the listener needs to segregate the target speech from the competing speech (object formation) and then has to selectively attend to the target speech (object selection) (Kidd et al., 1994; Shinn-Cunningham, 2008). As such, competing speech induces a higher cognitive load than background noise. The lack of a significant association between neural envelope encoding and speech perception in the presence of competing speech may be explained by cognitive processing playing an important role in speech-on-speech masking, while auditory (temporal) processing plays a key role in understanding speech masked by noise signals.

To the best of our knowledge, only Dimitrijevic et al. (2004) and Millman et al. (2017) have investigated correlations between speech understanding and neural envelope encoding for people who have hearing impairment. In both studies, speech perception was tested in the presence of SWN. Dimitrijevic et al. (2004) showed that more neural envelope encoding in subcortical auditory areas was related to better speech perception for HI adults aged between 57 and 86 years. Millman et al. (2017) reported that enhanced envelope encoding in the left auditory cortex was predictive of inferior speech perception for HI listeners aged about 60 years. Note that the direction of the neural-behavioral relationship for HI adults seems to be different for the brainstem and the cortical level, which is in accord with the observations for NH adults (e.g., Anderson et al., 2011; Millman et al., 2017).

Altogether, previous studies support the hypothesis that neural envelope encoding is related to the speech perception performance of both NH and HI adults. Evidence is, however, scarce and a number of important issues remain unaddressed. First, there is a lack of research that includes people with different hearing sensitivity and/or people belonging to different age categories, to assess whether correlations between neural envelope encoding and speech perception vary with hearing sensitivity and/or age. Also, to date, studies have been restricted to neural envelope encoding in either subcortical or cortical auditory regions. Investigating both subcortical and cortical neural envelope encoding and their associations with speech perception would allow the relative predictive value of neural envelope encoding in the different auditory regions to be evaluated. Furthermore, no neural-behavioral studies are available that use both interfering background noises and competing speech to evaluate speech perception performance, yet such studies are needed to investigate whether the presumed contribution of neural envelope encoding to speech intelligibility depends on the cognitive load induced by the masker.

To address these needs, we conducted a number of studies including the same young (20-30 years), middle-aged (50-60 years), and older (70-80 years) NH and HI participants. In a first study, we evaluated speech perception performances in interfering background noises and competing speech (Goossens et al., 2017). In two other studies, we investigated the neural encoding of envelope modulations from the brainstem up to the cortex (Goossens et al., 2016; Goossens et al., under review). These studies demonstrated age- and hearing-related changes in masked speech perception performance and neural envelope encoding. The aim of the present study was to investigate whether the observed changes in neural envelope encoding could predict the changes in speech perception performance. As we tested both NH and HI adults belonging to three age groups, we could selectively explore neural-behavioral correlations across the adult lifespan when hearing sensitivity was or was not preserved. By taking both subcortical and cortical auditory regions into account, we could investigate the relative contribution of neural envelope encoding at different auditory stages to speech perception performance. Moreover, by using both interfering background noise and competing speech, we could assess whether the cognitive load produced by the masker affected the relationship between neural envelope encoding and speech perception.

We expected to find significant correlations between neural envelope encoding and speech perception performance. Based on previous research, we hypothesized that more neural envelope encoding in subcortical auditory regions would be related to better speech perception, while the opposite scenario was anticipated for neural envelope encoding at the cortical level. Moreover, we postulated that such neural-behavioral correlations could vary with hearing sensitivity, age, and/or the cognitive load produced by the masking noise.

2. Methods and Results

2.1. Participants

Both the NH and HI participant populations consisted of young (20-30 years), middle-aged (50-60 years), and older adults (70-80 years) (Table 1, Fig. 1). Pure-tone audiometry (0.25-8 kHz) was conducted in a soundproof booth with a Madsen OB922 audiometer, TDH-39 earphones, and a RadioEar B71 bone transducer. In accord with the ISO standard 389-1 (International Organization for Standardization, 1998), 25 dB HL was considered as the upper limit of normal hearing sensitivity. NH was defined as having audiometric thresholds ≤ 25 dB HL at all octave frequencies from 0.25 to 4 kHz. All NH participants met this criterion. However, the best-ear pure-tone average (PTA) across all audiometric thresholds (0.25-8 kHz) differed significantly among the three age groups. The PTA was lower for the young group than for the middle-aged and older groups, and the PTA of the middle-aged group was lower than that of the older group (all p < 0.001). HI participants had audiometric thresholds ≥ 35 dB HL from 1 kHz upwards. Among the three HI age groups, no significant differences in PTA were found (all p > 0.6). All hearing losses were sensorineural in nature (air-bone gaps ≤ 10 dB HL). All participants were considered to have normal cognitive capacities, as they scored ≥ 26/30 on the Montreal Cognitive Assessment (Nasreddine et al., 2005). Also, all participants were Dutch native speakers, they were right-handed according to the Edinburgh Handedness Inventory (Oldfield, 1971), and they had no history of brain injury, neurological disorders, or tinnitus.

This research project was approved by the Medical Ethical Committee of the University Hospitals and University of Leuven (approval number B322201214866). All participants gave their written informed consent.

Table 1. Overview of the number (N) of people (women/men), mean age ± standard deviation (SD) expressed in years, and mean best-ear pure-tone average (PTA) across all audiometric thresholds (0.25-8 kHz) ± SD, for each cohort (young, middle-aged, older) of the NH and HI participant populations.

              NH population                        HI population
              N (♀/♂)     age ± SD   PTA ± SD      N (♀/♂)     age ± SD   PTA ± SD
young         17 (8/9)    23 ± 2     1 ± 4         10 (6/4)    27 ± 5     54 ± 11
middle-aged   15 (9/6)    54 ± 2     8 ± 4         14 (10/4)   58 ± 2     49 ± 9
older         10 (7/3)    74 ± 3     18 ± 6        13 (8/5)    78 ± 3     53 ± 6

2.2. Synopsis of speech perception and neural envelope encoding

In this section, we give an overview of the behavioral and neural data from our six participant groups (Table 1) that have been reported on before (Goossens et al., 2016; Goossens et al., 2017; Goossens et al., under review).

Behavioral and neural measures that varied significantly across the adult lifespan – due to age and hearing impairment – were considered as variables of interest: they were retained for further investigation in the present study.

2.2.1. Speech perception performance

In the study of Goossens et al. (2017), we investigated age- and hearing-related changes in masked speech perception. Speech perception was quantified by the speech reception threshold (SRT) of masked sentences (Leuven Intelligibility Sentence Test; van Wieringen and Wouters, 2008), presented to the ear with the best PTA. The SRT represents the signal-to-noise ratio (SNR) at which 50% of the sentences, i.e., keywords (~3 per sentence), were recognized correctly (Plomp and Mimpen, 1979). Higher SRTs reflect poorer speech perception. We equated audibility of the speech material among our participants based on the perception of 10 sentences of the Leuven Intelligibility Sentence Test, presented in quiet. If a participant scored ≥ 8/10, it was concluded that the speech material was sufficiently audible. In this way, the speech level was set to 60 dB SPL for all NH participants. For HI participants, this level (60 dB SPL) was raised following the procedure of Jansen et al. (2012): half of the HI participant’s PTA0.25-1 kHz was added to 60 dB SPL. If a HI participant scored < 8/10 using this level, it was further raised in steps of 5 dB SPL until a score of ≥ 8/10 was obtained. The median speech level for the HI participants was 83 dB SPL.
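The level-setting procedure for HI participants can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the `score_at_level` callback (standing in for an actual 10-sentence test in quiet), the safety cap `max_level`, and the example listener are all hypothetical and do not come from the study.

```python
def initial_speech_level(pta_low_db_hl, base_level=60.0):
    """Starting level for an HI listener: base level plus half the PTA (0.25-1 kHz)."""
    return base_level + 0.5 * pta_low_db_hl


def fit_speech_level(start_level, score_at_level, step=5.0, criterion=8, max_level=100.0):
    """Raise the level in fixed steps until the score in quiet reaches the criterion.

    score_at_level is a hypothetical callback returning the number of sentences
    (out of 10) repeated correctly at a given level; max_level is an assumed
    safety cap that the source does not mention.
    """
    level = start_level
    while score_at_level(level) < criterion:
        if level + step > max_level:
            raise ValueError("criterion not reached below the assumed safety cap")
        level += step
    return level


# Made-up listener: PTA of 40 dB HL, scores 8/10 or better only from 85 dB SPL upwards.
start = initial_speech_level(40.0)  # 60 + 40 / 2
level = fit_speech_level(start, lambda l: 9 if l >= 85 else 6)
print(start, level)
```

For this made-up listener the starting level is insufficient, so the level is raised by one 5-dB step before the 8/10 criterion is met.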

To investigate masked speech perception, we used interfering background noises as well as competing speech (Fig. 2). For the interfering background noises, we used two SWNs, both having the long-term average spectrum of the speech material of the Leuven Intelligibility Sentence Test. The first SWN was unmodulated, whereas the second SWN was 100% amplitude modulated at a 4-Hz rate. The modulated SWN resulted in temporary increases in the SNR, which could lead to release from masking (RFM), i.e., better speech perception than for the unmodulated SWN (Festen and Plomp, 1990). In effect, during noise dips the target speech can be heard more clearly and these speech glimpses enable the listener to reconstruct the masked portions of the speech. RFM was quantified by subtracting SRTSWN unmodulated from SRTSWN modulated. We also used the International Speech Test Signal as a masker (ISTS; Holube et al., 2010). The ISTS is an unintelligible speech signal that is known to cause a considerable amount of informational masking (Christiansen and Dau, 2012; Francart et al., 2011). Thus, compared to the SWNs, the ISTS induced a higher cognitive load.

In the study of Goossens et al. (2017), age-related changes were investigated by comparing SRTs between the young, middle-aged, and older NH participants. Hearing-related changes were examined by comparing outcomes between NH and HI similarly-aged participants. It was demonstrated that the young NH group showed lower SRTs than the middle-aged and older NH groups for all three maskers. Also, the middle-aged NH group outperformed the older NH group, although this was not significant for the unmodulated SWN. The older NH adults did not show RFM, whereas the young and middle-aged NH adults did. Goossens et al. (2017) also showed that the HI listeners had higher SRTs than their NH counterparts, irrespective of age and the type of masker. Moreover, all HI age groups showed significantly less RFM than the NH age groups.

In sum, the study of Goossens et al. (2017) showed that the intelligibility of speech masked by unmodulated SWN, modulated SWN, and the ISTS, worsened with age and with hearing impairment. The same was true for RFM. Therefore, SRTSWN unmodulated, SRTSWN modulated, SRTISTS, and RFM were all included in the analyses of the present research.

2.2.2. Neural envelope encoding

Goossens et al. (2016) investigated changes in neural envelope encoding due to age and Goossens et al. (under review) explored hearing-related changes. In both studies, neural envelope encoding was investigated in the six participant groups (Table 1) by means of auditory steady-state responses (ASSRs) to octave bands of white noise centered at 1 kHz that were 100% amplitude modulated at 4, 20, 40, or 80 Hz (Fig. 3, panel A; for a review of ASSRs, see Rance, 2008). The EEG was recorded with 64 active electrodes. The neural responses recorded by eight electrode pairs located at the back of the head were retained for ASSR evaluation, as these electrodes were most sensitive to the neural responses under investigation (Fig. 3, panel B). We included electrodes mirrored across hemispheres to selectively assess neural envelope encoding in the left hemisphere (LH) and in the right hemisphere (RH).

The examination of ASSRs to different modulation rates, i.e., 4, 20, 40, and 80 Hz, allowed us to investigate neural envelope encoding along the central auditory pathway. Source localization studies have demonstrated that ASSRs to modulation frequencies < 30 Hz primarily originate from the auditory cortex, while ASSRs to modulation frequencies > 30 Hz are mainly generated in subcortical structures (e.g., Herdman et al., 2002; Wang et al., 2012). With regard to higher-frequency ASSRs, Luke et al. (2017) showed that the 40-Hz ASSR has a predominant thalamic source and that the 80-Hz ASSR has a major brainstem source. It is important to note, however, that ASSRs, independent of the modulation frequency, are composite responses, comprising both cortical and subcortical neural activity.

For the NH participants, the level of the amplitude-modulated stimuli was set to 70 dB SPL. Through behavioral loudness ratings (graphic rating scale), NH participants indicated that the stimuli presented at 70 dB SPL were comfortably loud. This loudness rating was used as a reference for the HI participants: each HI participant was asked to adjust the level of the ASSR stimuli until it was comfortably loud for him/her as well. This resulted in a mean stimulus level of 79 dB SPL for the HI listeners. The reason for applying equal loudness levels instead of equal sensation levels concerned listening comfort. For NH listeners, the median hearing threshold of the modulated noises was 5 dB SPL: their sensation level equaled 65 dB SPL. For HI listeners, the median hearing threshold was 43 dB SPL. Presenting the stimuli at equal sensation levels would have implied stimulus levels of 108 dB SPL, which exceeded their uncomfortable loudness level of 103 dB SPL. Also, previous research has demonstrated that the magnitude of the ASSR correlates closely with the perceived loudness of the acoustic input (Ménard et al., 2008; Emara and Kolkaila, 2010; Van Eeckhoutte et al., 2016). Therefore, stimulus loudness needs to be controlled for, in order to prevent differences in ASSRs between NH and HI participants being a result of acoustic stimulation instead of peripheral hearing sensitivity.
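The arithmetic behind the choice of equal loudness over equal sensation level can be written out from the median values reported above. The symbols $\mathrm{SL}_{\mathrm{NH}}$ and $L_{\mathrm{HI}}$ are introduced here for illustration only.

```latex
\begin{align*}
\mathrm{SL}_{\mathrm{NH}} &= 70\ \text{dB SPL} - 5\ \text{dB SPL} = 65\ \text{dB},\\
L_{\mathrm{HI}} &= 43\ \text{dB SPL} + 65\ \text{dB} = 108\ \text{dB SPL}
  \;>\; 103\ \text{dB SPL (uncomfortable loudness level)}.
\end{align*}
```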

The magnitude of the ASSR was expressed as signal-to-noise ratio (SNR), the ratio of the power in the modulation frequency bin (signal) to the power in 120 neighboring bins (noise), i.e., 60 bins below and 60 bins above the modulation frequency bin. A high SNR denotes that there is a high degree of synchronized neural activity.
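A minimal sketch of this SNR measure is given below. It rests on assumptions the text does not spell out: the power spectrum of the EEG is taken as already computed, the noise estimate is taken to be the mean power over the 120 neighboring bins, and the ratio is expressed in dB. The function name and the toy spectrum are hypothetical.

```python
import math


def assr_snr_db(power_spectrum, signal_bin, n_side=60):
    """SNR of an ASSR: power in the modulation-frequency bin relative to the
    mean power of n_side bins below and n_side bins above it, in dB."""
    lo, hi = signal_bin - n_side, signal_bin + n_side
    if lo < 0 or hi >= len(power_spectrum):
        raise ValueError("not enough neighboring bins around the signal bin")
    # Neighboring bins on both sides, excluding the signal bin itself.
    noise_bins = power_spectrum[lo:signal_bin] + power_spectrum[signal_bin + 1:hi + 1]
    noise_power = sum(noise_bins) / len(noise_bins)
    return 10.0 * math.log10(power_spectrum[signal_bin] / noise_power)


# Toy spectrum: flat noise floor of 1.0 with a response of 10.0 at bin 100.
spectrum = [1.0] * 201
spectrum[100] = 10.0
snr = assr_snr_db(spectrum, signal_bin=100)
print(round(snr, 2))
```

In the toy spectrum the response bin carries ten times the power of the surrounding floor, which corresponds to an SNR of 10 dB under these assumptions.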

Goossens et al. (2016) compared the SNR of the 4-, 20-, 40-, and 80-Hz ASSR among the three NH age groups and found age-related changes in the 4- and the 80-Hz ASSR. Older NH adults exhibited larger 4-Hz ASSRs than young and middle-aged NH listeners. The opposite pattern was seen for 80-Hz ASSRs: larger 80-Hz ASSRs were found for young than for middle-aged and older NH individuals. The age-related decrease in neural synchronization to 80-Hz modulations was observed in the RH only, while the age-related increase in the neural encoding of 4-Hz modulations was found in both hemispheres. A study under review (Goossens et al.) revealed hearing-related changes in the neural encoding of 4- and 80-Hz modulations. For both hemispheres, larger 4- and 80-Hz ASSRs were detected for young and middle-aged HI participants relative to their NH counterparts. No such hearing-related changes were found for the older participants.

In sum, the studies of Goossens et al. (2016; under review) showed that the 4- and 80-Hz ASSR changed with age and with hearing impairment, while this was not the case for the 20- and 40-Hz ASSR. Therefore, only the 4-Hz ASSRLH, 4-Hz ASSRRH, 80-Hz ASSRLH, and 80-Hz ASSRRH are included in the analyses of the present study.

2.3. Correlations between speech perception and neural envelope encoding

To assess whether SRTSWN unmodulated, SRTSWN modulated, SRTISTS, and/or RFM were related to the 4-Hz ASSRLH, 4-Hz ASSRRH, 80-Hz ASSRLH, and/or 80-Hz ASSRRH, we performed best subset and linear regression analyses with IBM SPSS Statistics software. We aimed at selectively investigating potential correlations between neural envelope encoding and speech perception performance across the adult lifespan when hearing sensitivity was or was not preserved. Therefore, these regression analyses were performed separately for the NH and HI participant populations.

Best subset regression analyses were conducted to determine which of the neural variables predicted speech perception best. Together with the neural variables, we included age (in months) and hearing sensitivity (PTA0.25-8 kHz) as predictors, since we know that the SRTs and RFM change with age and hearing sensitivity (Goossens et al., 2017). We applied the Akaike Information Criterion with a correction for finite sample sizes for selecting the best subset of predictors for each of the three SRTs and for RFM (Hastie et al., 2009).

We conducted two linear regression analyses to investigate whether the best predictors, selected by the best subset regression, made a unique and significant contribution to predicting the speech perception. For all regression analyses, the underlying assumptions were met (linearity, homoscedasticity, independent errors, normally distributed errors, no problematic multicollinearity, i.e., all average variance inflation factors ≤ 3.5, all tolerances > 0.2). Therefore, the regression outcomes can be generalized beyond our participant samples.

In the first linear regression, all best subset predictors were entered simultaneously, as this is the most appropriate procedure for theory testing (Studenmund and Cassidy, 1987). When the regression coefficient of a best subset predictor reached significance (p < 0.05), it was concluded that this predictor made a unique contribution to predicting the SRT or RFM, independent of the other predictors included in the regression.

In the second linear regression, these ‘unique’ predictors were entered in order of descending significance, i.e., the predictor with the most significant regression coefficient (smallest p value) was entered in the first block and the other significant predictors were added, one by one. This hierarchical procedure allowed us to evaluate whether the proportion of variance in speech perception explained by each predictor was significant (F change < 0.05).
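The best subset selection step described earlier in this section can be illustrated with a small, self-contained sketch. It is not the SPSS procedure used in the study: it assumes the Gaussian least-squares form of the corrected Akaike Information Criterion, AICc = n ln(RSS/n) + 2k + 2k(k+1)/(n-k-1), with k counting the slopes, the intercept, and the error variance. The function names, predictor names, and toy data are hypothetical.

```python
import itertools
import math


def ols_rss(X, y):
    """Residual sum of squares of a least-squares fit of y on the columns of X."""
    n, p = len(X), len(X[0])
    # Normal equations (X'X) b = X'y, solved by Gauss-Jordan elimination.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))


def aicc(rss, n, k):
    """Corrected AIC for a Gaussian linear model with k estimated parameters."""
    return n * math.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)


def best_subset(predictors, y):
    """Return (AICc, names) of the non-empty predictor subset with the lowest AICc."""
    n, best = len(y), None
    names = list(predictors)
    for size in range(1, len(names) + 1):
        for subset in itertools.combinations(names, size):
            X = [[1.0] + [predictors[name][i] for name in subset] for i in range(n)]
            k = len(subset) + 2  # slopes + intercept + error variance (assumed convention)
            score = aicc(ols_rss(X, y), n, k)
            if best is None or score < best[0]:
                best = (score, subset)
    return best


# Toy data (made up): the SRT depends on age only; the ASSR predictor is unrelated.
age = [20, 20, 40, 40, 60, 60, 80, 80]
assr = [6, 6, 4, 4, 4, 4, 6, 6]
noise = [0.2, -0.2] * 4
srt = [0.1 * a - 8 + e for a, e in zip(age, noise)]
score, chosen = best_subset({"age": age, "assr_4hz_rh": assr}, srt)
print(chosen)
```

Because the toy ASSR values explain no variance beyond age, the extra parameter is penalized and the age-only model is selected. An exhaustive search like this is only practical for a handful of candidate predictors, as is the case here.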


2.3.1. Results for the NH population

290

The results of the best subset regression analyses for the NH participant population are shown in the first

291

column of Table 2 (top panel). From the second column onwards, Table 2 (top panel) shows the

292

outcomes of the linear regression analyses. These analyses demonstrated that age contributed

293

significantly to predicting the SRT of NH adults, irrespective of the type of masker: the SRTs increased

294

with advancing age. The 4-Hz ASSRRH was a significant predictor of speech perception in the presence of

295

modulated SWN and the ISTS, but not in the presence of the unmodulated SWN. The two significant

296

regression coefficients were positive, indicating that a higher degree of neural synchronization to 4-Hz

297

modulations in the RH corresponded to poorer speech perception (Fig. 4, top panel).

To investigate whether the predictive values of the 4-Hz ASSRRH with regard to the modulated SWN and the ISTS were equivalent, we compared both regression coefficients. When both age and the 4-Hz ASSRLH were held constant, the unstandardized regression coefficients (± standard error (SE)) of the 4-Hz ASSRRH with respect to the SRTSWN modulated and the SRTISTS were 0.41 (± 0.15) and 0.35 (± 0.18), respectively. Note that the latter regression coefficient is not included in Table 2 (top panel), since the 4-Hz ASSRLH was not selected as a best predictor for the SRTISTS. Yet, in order to compare the unique contribution of the 4-Hz ASSRRH between the SRTSWN modulated and the SRTISTS, the same predictors have to be included in both regressions (Cohen, 1983). We compared these two regression coefficients, following the statistical procedure prescribed by Brame et al. (1998), and we obtained a p value > 0.1, indicating that there was no significant difference.
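
The comparison of the two coefficients reported above (0.41 ± 0.15 and 0.35 ± 0.18) can be sketched as follows. The sketch assumes that the procedure amounts to the z test commonly used for the equality of two regression coefficients, z = (b1 - b2) / sqrt(SE1² + SE2²), with a two-tailed p value from the standard normal distribution. The exact computation used by the authors is not given in the text, so this is an illustration rather than a reproduction, and the function name is hypothetical.

```python
import math

def compare_coefficients(b1, se1, b2, se2):
    """z test for the difference between two regression coefficients.

    Returns the z statistic and the two-tailed p value under the
    standard normal distribution.
    """
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal p value
    return z, p

# Coefficients (± SE) of the 4-Hz ASSRRH reported in the text:
# 0.41 (± 0.15) for SRTSWN modulated and 0.35 (± 0.18) for SRTISTS.
z, p = compare_coefficients(0.41, 0.15, 0.35, 0.18)
print(f"z = {z:.2f}, p = {p:.2f}")
```

Under this assumption the p value lies well above 0.1, which agrees with the reported absence of a significant difference.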

A minor remark is that the significant contribution of the 4-Hz ASSRRH to predicting SRTSWN modulated only occurred when taking the 4-Hz ASSRLH into account. This shows that the shared variance between these two neural variables was irrelevant for predicting SRTSWN modulated.

There was also a significant positive correlation between the 4-Hz ASSRRH and RFM: increased 4-Hz synchronization in the RH was correlated with less RFM.

2.3.2. Results for the HI population

The best predictors of the three SRTs and RFM for the HI population are shown in the first column of Table 2 (bottom panel) and the results of the follow-up linear regressions are reported from the second column onwards. None of the best subset predictors made a significant contribution to predicting RFM. However, for the SRTs, both age and the 80-Hz ASSRLH were significant predictors, irrespective of masker type. As indicated by the positive regression coefficients, speech perception performance decreased with advancing age and with increasing neural synchronization to 80-Hz modulations in the LH (Fig. 4, bottom panel). Hearing sensitivity (PTA) contributed significantly to predicting speech understanding in the presence of the ISTS: the greater the hearing loss (higher PTA), the higher the SRTISTS.

To assess whether the predictive value of the 80-Hz ASSRLH differed between the three maskers, we compared the unstandardized regression coefficients (± SE) of the 80-Hz ASSRLH between the SRTSWN unmodulated (0.23 ± 0.1), the SRTSWN modulated (0.31 ± 0.12), and the SRTISTS (0.29 ± 0.12). For these comparative analyses, the factors age and PTA were both controlled for (Cohen, 1983). None of the differences in the regression coefficients were significant (all p > 0.1).

Table 2. Outcomes of the regression analyses for the NH participants (top panel) and for the HI participants (bottom panel). The first column shows the behavioral variable and its best predictors, as indicated by the best subset regression. The next columns present the unstandardized regression coefficient of each predictor (B) and its standard error (SE), the standardized regression coefficient (β), the significance of the regression coefficient (p), whether the predictor contributes significantly to predicting the speech perception performance (F change), and the proportion of variance in the speech perception performance explained by the predictor when controlling for the shared variances with other significant predictors (R²). Significant predictors (F change < 0.05) are denoted by an asterisk.

NH population
Behavioral variable Predictor      B        SE      β        p         F change   R²
SRTSWN unmodulated  age*           0.003    0.001   0.676    < 0.001   < 0.001    38.4%
                    4-Hz ASSRLH    -0.075   0.039   -0.238   0.062
SRTSWN modulated    age*           0.005    0.001   0.545    < 0.001   < 0.001    27.2%
                    4-Hz ASSRRH*   0.407    0.153   0.688    0.011     0.204      10.0%
                    4-Hz ASSRLH*   -0.346   0.151   -0.583   0.028     0.028      7.3%
SRTISTS             age*           0.014    0.001   0.795    < 0.001   < 0.001    57.9%
                    4-Hz ASSRRH*   0.232    0.083   0.219    0.008     < 0.008    4.4%
RFM                 4-Hz ASSRRH*   0.341    0.128   0.781    0.011     0.007      16.8%
                    PTA            0.069    0.034   0.300    0.048     0.087
                    4-Hz ASSRLH    -0.233   0.130   -0.534   0.080

HI population
Behavioral variable Predictor      B        SE      β        p         F change   R²
SRTSWN unmodulated  age*           0.011    0.002   0.795    < 0.001   < 0.001    44.5%
                    80-Hz ASSRLH*  0.259    0.097   0.406    0.012     0.012      11.6%
SRTSWN modulated    age*           0.013    0.002   0.795    < 0.001   < 0.001    44.5%
                    80-Hz ASSRLH*  0.336    0.115   0.445    0.006     0.006      14.0%
SRTISTS             age*           0.016    0.002   0.868    < 0.001   < 0.001    52.7%
                    80-Hz ASSRLH*  0.292    0.115   0.333    0.016     0.005      7.3%
                    PTA*           0.125    0.058   0.239    0.038     0.038      5.3%
RFM                 80-Hz ASSRRH   0.208    0.103   0.406    0.052     /
                    age            0.003    0.002   0.359    0.084     /

3. Discussion

The present study confirms that neural envelope encoding is related to speech perception performance. This neural-behavioral association applies to NH and HI adults but differs in nature for the two populations. For NH adults, enhanced neural envelope encoding in the auditory cortex is related to poorer speech perception, whereas for HI adults, enhanced neural envelope encoding in the brainstem is related to poorer speech perception. For both populations, these correlations are independent of the type of masker, i.e., interfering background noise or competing speech.

3.1. Enhanced neural envelope encoding in the auditory cortex is related to poorer speech perception for NH adults

The present study showed that, for NH adults, a higher degree of neural synchronization to 4-Hz envelope modulations in the RH, represented by the magnitude of the 4-Hz ASSRRH, was related to poorer speech perception in the presence of the non-stationary maskers. This relationship was detected when age was controlled for, which means that it applies to NH adults, irrespective of age. Taken together with the fact that the 4-Hz ASSR reflects neural synchronization in the auditory cortex (Wang et al., 2012), it appears that enhanced neural envelope encoding at the cortical level corresponds to poorer speech perception for young, middle-aged, and older NH persons in a listening situation with non-stationary masking.

Moore and Glasberg (1993) and Moore et al. (1995) investigated the speech perception performance of NH listeners when simulating loudness recruitment, which is an abnormally rapid loudness growth that perceptually enhances envelope modulations, i.e., it leads to better detection of envelope modulations (e.g., Moore et al., 1996; Schlittenlacher and Moore, 2016). These researchers found that loudness recruitment adversely affected speech perception, particularly when the speech was masked by a non-stationary masker. This observation is in line with our results showing that the 4-Hz ASSRRH is significantly correlated with speech perception in the presence of modulated noise (modulated SWN) and interfering speech (ISTS), yet not in the presence of unmodulated noise (unmodulated SWN). A plausible explanation for the association between enhanced encoding of 4-Hz envelope modulations and poorer speech perception in modulated noise and interfering speech lies in modulation masking. The envelope of the modulated SWN was sinusoidally modulated at a 4-Hz rate and the envelope spectrum of the ISTS shows a maximum between 2 and 8 Hz (Holube et al., 2010). In continuous speech, envelope modulations ranging from 2 to 10 Hz signal the occurrence of syllables (Edwards and Chang, 2013; Chait et al., 2015), which are speech units that play an important role in speech perception (e.g., Greenberg et al., 2003). Research by Doelling et al. (2014), for instance, indicates that listeners parse speech into syllable-sized chunks for further processing. As modulation masking is greatest when the rates of the target and noise modulations are similar (Yost et al., 1989; Sek et al., 2015), the syllabic-rate modulations in the modulated SWN and the ISTS are likely to mask the syllabic-rate modulations in the target speech, thereby worsening speech perception. As the correlation between speech perception and 4-Hz neural encoding is found for listening situations in which modulation masking is most likely to happen, we suggest that this cortical encoding is related to modulation masking sensitivity. This idea is corroborated by a study by Millman et al. (2017). These researchers also found enhanced cortical encoding of syllabic-rate modulations (i.e., 2 Hz) to be related to poorer intelligibility of speech masked by a 2-Hz modulated noise, and they demonstrated that this association was particularly true for neural envelope encoding in the posteromedial auditory cortex. Research showing that the segregation of speech from background noise takes place, at least in part, in this specific anatomical region (e.g., Ding and Simon, 2012) strengthens the idea that modulation masking, and thereby difficulties with perceptual segregation, may underlie the relationship between enhanced cortical envelope encoding and degraded perception of speech in non-stationary maskers. Further support for this idea is given by our own data showing that increased cortical envelope encoding corresponds to a decline in RFM (Table 2, top panel). After all, it has been demonstrated that RFM actually reflects release from modulation masking (Stone et al., 2012; Stone and Moore, 2014).

Our data also show that cortical neural envelope encoding is predictive of speech perception in the presence of competing speech. This outcome is in contrast to the lack of association between cortical neural envelope encoding and speech understanding in a four-talker babble, reported by Presacco et al. (2016). In their study, neural envelope encoding was represented by a linear correlation between the reconstructed and actual speech envelope, not by the magnitude of the ASSR to amplitude-modulated noise stimuli, as in the current study. These methodological discrepancies may explain the divergent outcomes.

Given the crucial role of cognitive processing in speech-on-speech masking (informational masking), we expected the predictive value of auditory temporal processing, i.e., neural envelope encoding, to be lower for perception of speech in the presence of competing speech than in the presence of interfering background noise. By comparing the regression coefficients of the 4-Hz ASSRRH for the SRTSWN modulated and the SRTISTS, we demonstrated, however, that the predictive values of neural envelope encoding for speech perception in the presence of interfering background noise and competing speech are not significantly different. This implies that processing of envelope modulations plays an important role in speech perception under conditions of informational masking. Most likely, this role concerns the segregation of the target speech from the interfering speech (object formation). The envelope yields prosodic information (e.g., speech rhythm and tempo) that helps the listener to differentiate between multiple talkers (Bregman, 1990; Rosen, 1992). As such, deficits in temporal envelope processing can compromise object formation, and, in turn, object selection, since adequate object formation is a prerequisite for efficient object selection (e.g., Shinn-Cunningham, 2008).

Our study shows that the degree of neural synchronization to 4-Hz envelope modulations in the RH is a significant predictor of speech perception for NH adults. Since, generally, alterations in the specialized hemisphere are thought to have the most substantial impact on behavioral functions, this outcome fits with the asymmetric sampling in time framework (Poeppel, 2003), which posits that the RH is specialized in the processing of low-frequency theta-range modulations.

An obvious question that arises from the association between enhanced cortical envelope encoding and poorer speech perception is what underlies the envelope enhancement. A plausible explanation is a shift in the neuronal excitation-inhibition balance in favor of neuronal excitation. This is a well-known homeostatic mechanism, i.e., a compensatory process for maintaining an operative level of neuronal functioning (e.g., Gourévitch et al., 2014). The occurrence of homeostatic mechanisms in the auditory cortex of NH adults fits within the framework of ‘hidden hearing loss’, which means that the neural output from the cochlea is reduced while hearing sensitivity (the audiogram) is normal (Schaette and McAlpine, 2011). Such a reduced neural output can be attributed to synaptopathy and/or neuropathy, which is not uncommon in NH subjects because of aging and/or noise exposure (Kujawa and Liberman, 2009; Plack et al., 2014; Sergeyenko et al., 2013). Thus, NH adults can show a reduced cochlear output accompanied by homeostatic mechanisms in the auditory cortex, resulting in higher neuronal excitability and, in turn, enhanced synchronized neural activity.

Previous research on NH adults demonstrated that reduced neural envelope encoding in the brainstem is correlated with poorer speech understanding (Anderson et al., 2011; Dimitrijevic et al., 2004; Leigh-Paffenroth and Fowler, 2006). In the current study, the 80-Hz ASSR, reflecting neural envelope encoding in the brainstem, did not predict speech understanding. Yet, our finding is not incompatible with previous research. Our study is the first to include both subcortical (80-Hz ASSR) and cortical neural predictors (4-Hz ASSR). Since our data did not show a significant subcortical correlation for NH adults, but did reveal a significant cortical correlation, we suggest that neural envelope encoding at the cortical level has a predominant role in predicting speech perception performance.

3.2. Enhanced neural envelope encoding in the brainstem is related to poorer speech perception for HI adults

Our study showed that speech perception for the HI groups decreased significantly with increasing 80-Hz ASSRLH. This neural-behavioral correlation was found for each of the three maskers. When age was controlled for, this neural-behavioral correlation remained. Given that the dominant neural source of the 80-Hz ASSR is localized in the brainstem (Herdman et al., 2002; Luke et al., 2017), our study indicates that enhanced neural envelope encoding in the brainstem is associated with reduced speech perception for young, middle-aged, and older HI persons in a listening situation with either stationary or non-stationary masking.

By comparing the regression coefficients of the 80-Hz ASSRLH between the SRTSWN unmodulated, SRTSWN modulated, and SRTISTS, we demonstrated that the strength of the correlation between the 80-Hz ASSRLH and speech perception was comparable for the three maskers. This suggests that processing of envelope modulations is equally important for speech perception in interfering background noises as in competing speech, which is in agreement with the observation for our NH population (see 3.1.). Schoof and Rosen (2016), however, did not find subcortical neural envelope encoding to be predictive of perception of speech in the presence of competing speech. This discrepancy may be attributed to abundant differences in stimuli and procedures.

The association between encoding of 80-Hz envelope modulations and speech intelligibility for HI people can presumably be explained by a predominant processing of envelope relative to TFS cues. Even though envelope modulations play a key role in speech perception (e.g., Shannon et al., 1995; Peelle and Davis, 2012), ample evidence exists that TFS cues are also important, particularly in noisy listening situations (Lorenzi et al., 2006; Hopkins and Moore, 2009; Füllgrabe et al., 2015). Electrophysiological studies have, however, shown poor TFS encoding in HI adults (Ananthakrishnan et al., 2016; Vercammen et al., 2018). Likewise, behavioral research has demonstrated that HI listeners show reduced sensitivity to TFS information (Buss et al., 2004; Hopkins and Moore, 2011). This poor TFS processing in HI people is in sharp contrast with their strong neural encoding and adequate behavioral detection of envelope modulations (e.g., Wallaert et al., 2017; Goossens et al., under review). Because of this marked envelope-to-TFS processing imbalance, it seems reasonable to assume that HI people predominantly rely on envelope cues, which are necessary but not sufficient for perception of masked speech.

Other studies that explored the relationship between neural envelope encoding at a subcortical level and speech perception reported that enhanced subcortical neural envelope encoding was related to better speech understanding (Anderson et al., 2011; Dimitrijevic et al., 2004; Leigh-Paffenroth and Fowler, 2006), which is opposite to the finding of the present study. Importantly, Anderson et al. (2011) and Leigh-Paffenroth and Fowler (2006) tested NH adults. The contradictory outcomes suggest that the association between neural envelope encoding and speech perception varies with hearing sensitivity. Dimitrijevic et al. (2004), however, did test HI adults (57–86 years of age) and their neural-behavioral correlation was similar to those reported by Anderson et al. (2011) and Leigh-Paffenroth and Fowler (2006). Dimitrijevic et al. (2004) evaluated word recognition and the number of significant ASSRs (i.e., response amplitude significantly higher than the noise estimate) to four carrier frequencies that were simultaneously amplitude- and frequency-modulated at 40- and 80-Hz rates. All measurements were conducted in quiet with/without hearing aids and in the presence of unmodulated SWN with/without hearing aids. The Pearson correlation between word recognition score and percentage significant ASSRs (out of 16 responses: 4 carrier frequencies × 2 amplitude modulations × 2 frequency modulations) was calculated across all listening conditions. We, however, conducted a regression analysis for each listening condition, i.e., unmodulated SWN, modulated SWN, and ISTS. Applying such a more extensive statistical procedure would presumably not have reversed the direction of the neural-behavioral correlation reported by Dimitrijevic et al. (2004), but it could have reduced its statistical significance (Bland and Altman, 1994). Furthermore, we focused on amplitude modulation while Dimitrijevic et al. (2004) used a combined assessment of amplitude and frequency modulation. This may also contribute to the discrepant findings.

The present study shows that the 80-Hz ASSR in the LH is a significant predictor of speech perception for HI adults. As previously discussed (see 2.2.2.), the 80-Hz ASSR has a major brainstem source (e.g., Luke et al., 2017). In addition to this main brainstem activity, however, there is also 80-Hz neural synchronization at the cortical level (Coffey et al., 2016; Coffey et al., 2017; Schoonhoven et al., 2003), hence the division between LH and RH 80-Hz ASSRs. Since changes in the specialized hemisphere are thought to have the highest impact on functional outcomes, our observation that the 80-Hz ASSR in the LH significantly predicts speech intelligibility suggests that the LH is specialized in processing 80-Hz modulations. This notion is in accord with experimental evidence showing that the LH is specialized in processing temporal acoustic features, while the RH is specialized in fine-grained spectral analyses (Okamoto et al., 2009; Zatorre and Belin, 2001; Zatorre et al., 2002).

Both peripheral and central changes with hearing impairment could underlie an increased degree of envelope encoding in HI people. Cochlear damage (outer hair cells) leads to loudness recruitment, which is known to be associated with better detection of envelope modulations (Moore et al., 1996; Moore, 2007; Schlittenlacher and Moore, 2016). This improved behavioral modulation detection fits with enhanced neural encoding of envelope modulations. Also, hearing impairment is characterized by a degeneration of hair cell synapses and/or cochlear nerve fibers, compromising the afferent neural input (e.g., Kujawa and Liberman, 2015). Consequently, as previously discussed (see 3.1.), homeostatic mechanisms may operate to maintain neuronal functioning (e.g., Gourévitch et al., 2014). Examples of homeostatic mechanisms in the HI auditory system are nerve fibers showing steeper-than-normal response growths (Kale and Heinz, 2010) and reduced inhibitory synaptic strength (Vale and Sanes, 2002). These central changes may very well boost neural synchronization to acoustic envelope modulations, including the 80-Hz modulations under investigation.

3.3. Directions for future research

The present outcomes can provide directions for future research aiming to develop advanced rehabilitation strategies for speech perception difficulties that emerge throughout adult life. We discussed that the poorer speech understanding with enhanced neural envelope encoding for NH adults may be associated with difficulties in perceptual segregation, and the occurrence of modulation masking in particular. According to Kwon and Turner (2001), modulation masking concerns a failure of object selection, i.e., the listener cannot ignore the background noise and/or cannot selectively attend to the target speech. In this framework, it seems worthwhile to investigate whether training selective listening could mitigate speech perception difficulties for NH people. For HI people, we argued that the worse speech perception with enhanced envelope encoding could involve predominant processing of envelope modulations relative to TFS cues. Since adequate processing of TFS, in addition to envelope cues, is important for masked speech perception (e.g., Hopkins and Moore, 2009), it could be explored whether auditory training focusing on TFS processing can improve the speech perception skills of HI listeners. Also, further research is warranted to verify the direction of the association between neural envelope encoding and speech perception and whether this neural-behavioral relationship is mediated by factors that were not taken into account in the present study (e.g., linguistic skills).

3.4. Conclusion

In conclusion, the present study shows that neural envelope encoding is significantly related to the speech perception performance of adults aged between 20 and 80 years. For NH and HI adults, enhanced neural envelope encoding in the auditory cortex and in the brainstem, respectively, is associated with poorer perception of masked speech, whether the masking is produced by interfering background noise or competing speech.

Acknowledgments

Our special thanks go to all participants for their cooperation in this research. We are grateful to our master’s students, Anneleen Berghmans, Dorien Vandevenne, Ellen Vermaete, Eva Stroobants, Evelien Van den Broeck, Jolien Orye, Kaat Van den Brande, Lore Heylen, Louise Van Haesendonck, Marjolein Declercq, Sarah Heyndrickx, and Robin Gransier for their assistance in data collection, and to Astrid De Vos for her helpful comments with regard to data analysis. We highly appreciate the constructive feedback of Prof. Brian Moore and two anonymous reviewers on earlier versions of our manuscript.

This research was funded by the Research Foundation – Flanders (FWO) through an FWO-aspirant grant to Tine Goossens (grant number 11Z8817N) and by the Research Council of KU Leuven (project OT/12/98).

References Ahissar, E., Nagarajan, S., Ahissar, M., Protopapas, A., Mahncke, H., Merzenich, M.M., 2001. Speech comprehension is correlated with temporal response patterns recorded from auditory cortex. Proc.

RI PT

Natl. Acad. Sci. U. S. A. 98, 13367–13372. doi:10.1073/pnas.201400998 Ananthakrishnan, S., Krishnan, A., Bartlett, E., 2016. Human frequency following response: neural representation of envelope and temporal fine structure in listeners with normal hearing and sensorineural hearing loss. Ear Hear. 37, e91–e103. doi:10.1097/AUD.0000000000000247

SC

Anderson, S., Parbery-Clark, A., White-Schwoch, T., Drehobl, S., Kraus, N., 2013. Effects of hearing loss on the subcortical representation of speech cues. J. Acoust. Soc. Am. 133, 3030–3038.

M AN U

doi:10.1121/1.4799804

Anderson, S., Parbery-Clark, A., White-Schwoch, T., Kraus, N., 2012. Aging affects neural precision of speech encoding. J. Neurosci. 32, 14156–14164. doi:10.1523/JNEUROSCI.2176-12.2012 Anderson, S., Parbery-Clark, A., Yi, H.-G., Kraus, N., 2011. A neural basis of speech-in-noise perception in older adults. Ear Hear. 32, 750–757. doi:10.1097/AUD.0b013e31822229d3

TE D

Ansari, M.S., Rangasayee, R., Ansari, M.A.H., 2017. Neurophysiological aspects of brainstem processing of speech stimuli in audiometric-normal geriatric population. J. Laryngol. Otol. 131, 239–244. doi:10.1017/S0022215116009841

Bland, J.M., Altman, D.G., 1994. Correlation, regression, and repeated data. BMJ 308, 896.

EP

doi:10.1136/bmj.308.6942.1510a

Brame, R., Paternoster, R., Mazerolle, P., Piquero, A., 1998. Testing for the equality of maximum-

AC C

likelihood regression coefficients between two independent equations. J. Quant. Criminol. 14, 245– 261. doi:10.1023/A:1023030312801 Bregman, A.S., 1990. Auditory scene analysis: the perceptual organization of sounds, 1st ed. MIT Press, Cambridge, MA.

Brungart, D.S., 2001. Informational and energetic masking effects in the perception of two simultaneous talkers. J. Acoust. Soc. Am. 109, 1101–1109. doi:10.1121/1.1345696 Buss, E., Hall, J.W., Grose, J.H., 2004. Temporal fine-structure cues to speech and pure tone modulation in

observers

with

sensorineural

hearing

loss.

Ear

Hear.

25,

242–250.

ACCEPTED MANUSCRIPT

doi:10.1097/01.AUD.0000130796.73809.09 Chait, M., Greenberg, S., Arai, T., Simon, J.Z., Poeppel, D., 2015. Multi-time resolution analysis of speech:

RI PT

evidence from psychophysics. Front. Neurosci. 9, 214. doi:10.3389/fnins.2015.00214 Christiansen, C., Dau, T., 2012. Relationship between masking release in fluctuating maskers and speech reception

thresholds

in

stationary

noise.

J.

Acoust.

Soc.

doi:10.1121/1.4742732

Am.

132,

1655–1666.

SC

Coffey, E.B.J., Herholz, S.C., Chepesiuk, A.M.P., Baillet, S., Zatorre, R.J., 2016. Cortical contributions to the auditory frequency-following response revealed by MEG. Nat. Commun. 7, 11070.

M AN U

doi:10.1038/ncomms11070

Coffey, E.B.J., Musacchia, G., Zatorre, R.J., 2017. Cortical correlates of the auditory frequency-following and

onset

responses:

EEG

and

fMRI

evidence.

J.

Neurosci.

37,

830–838.

doi:10.1523/JNEUROSCI.1265-16.2017

Cohen, A., 1983. Comparing regression coefficients across subsamples: a study of the statistical test. Sociol. Methods Res. 12, 77–94. doi:10.1177/0049124183012001003

scores

in

TE D

Dimitrijevic, A., John, M.S., Picton, T.W., 2004. Auditory steady-state responses and word recognition normal-hearing

and

hearing-impaired

adults.

Ear

Hear.

25,

68–84.

doi:10.1097/01.AUD.0000111545.71693.48

EP

Ding, N., Simon, J.Z., 2012. Emergence of neural encoding of auditory objects while listening to competing speakers. Proc. Natl. Acad. Sci. U. S. A. 109, 5–10. doi:10.1073/pnas.1205381109

AC C

Doelling, K.B., Arnal, L.H., Ghitza, O., Poeppel, D., 2014. Acoustic landmarks drive delta-theta oscillations to enable speech comprehension by facilitating perceptual parsing. Neuroimage 85, 761–768. doi:10.1016/j.neuroimage.2013.06.035 Drullman, R., Festen, J.M., Plomp, R., 1994. Effect of temporal envelope smearing on speech reception. J. Acoust. Soc. Am. 95, 1053–1064. doi:10.1121/1.408467 Dubno, J.R., Horwitz, A.R., Ahlstrom, J.B., 2002. Benefit of modulated maskers for speech recognition by younger and older adults with normal hearing. J. Acoust. Soc. Am. 111, 2897–2907. doi:10.1121/1.1480421

ACCEPTED MANUSCRIPT

Durlach, N.I., Mason, C.R., Kidd, G., Arbogast, T.L., Colburn, H.S., Shinn-Cunningham, B.G., 2003. Note on informational masking. J. Acoust. Soc. Am. 113, 2984–2987. doi:10.1121/1.1570435 Edwards, E., Chang, E.F., 2013. Syllabic (~2-5 Hz) and fluctuation (~1-10 Hz) ranges in speech and

RI PT

auditory processing. Hear. Res. 305, 113–134. doi:10.1016/j.heares.2013.08.017 Emara, A.A.Y., Kolkaila, E.A., 2010. Prediction of loudness growth in subjects with sensorineural hearing loss using auditory steady state response. J. Int. Adv. Otol. 6, 371–379.

Festen, J.M., Plomp, R., 1990. Effects of fluctuating noise and interfering speech on the speech-reception for

impaired

and

normal

hearing.

J.

Acoust.

Soc.

Am.

88,

1725–1736.

SC

threshold

doi:10.1121/1.400247

M AN U

Francart, T., van Wieringen, A., Wouters, J., 2011. Comparison of fluctuating maskers for speech recognition tests. Int. J. Audiol. 50, 2–13. doi:10.3109/14992027.2010.505582 Füllgrabe, C., Moore, B.C.J., Stone, M.A., 2015. Age-group differences in speech identification despite matched audiometrically normal hearing: contributions from auditory temporal processing and cognition. Front. Aging Neurosci. 6, 347. doi:10.3389/fnagi.2014.00347 Giraud, A.-L., Lorenzi, C., Ashburner, J., Wable, J., Johnsrude, I., Frackowiak, R., Kleinschmidt, A., 2000.

TE D

Representation of the temporal envelope of sounds in the human brain. J. Neurophysiol. 84, 1588– 1598. doi:10.1152/jn.2000.84.3.1588

Goossens, T., Vercammen, C., Wouters, J., van Wieringen, A., 2017. Masked speech perception across the adult lifespan: impact of age and hearing impairment. Hear. Res. 344, 109–124. doi:10.1016/j.heares.2016.11.004

Goossens, T., Vercammen, C., Wouters, J., van Wieringen, A., 2016. Aging affects neural synchronization to speech-related acoustic modulations. Front. Aging Neurosci. 8, 133. doi:10.3389/fnagi.2016.00133

Goossens, T., Vercammen, C., Wouters, J., van Wieringen, A., under review. The impact of hearing impairment on neural envelope encoding at different ages.

Gourévitch, B., Edeline, J.M., Occelli, F., Eggermont, J.J., 2014. Is the din really harmless? Long-term effects of non-traumatic noise on the adult auditory system. Nat. Rev. Neurosci. 15, 483–491. doi:10.1038/nrn3744

Greenberg, S., Carvey, H., Hitchcock, L., Chang, S., 2003. Temporal properties of spontaneous speech: a syllable-centric perspective. J. Phon. 31, 465–485. doi:10.1016/j.wocn.2003.09.005

Grose, J.H., Mamo, S.K., Hall, J.W., 2009. Age effects in temporal envelope processing: speech unmasking and auditory steady state responses. Ear Hear. 30, 568–575. doi:10.1097/AUD.0b013e3181ac128f

Hastie, T., Tibshirani, R., Friedman, J., 2009. Linear methods for regression, in: Hastie, T., Tibshirani, R., Friedman, J. (Eds.), The elements of statistical learning: data mining, inference, and prediction. Springer, New York, NY, pp. 43–100.

Helfer, K.S., Wilber, L.A., 1990. Hearing loss, aging, and speech perception in reverberation and noise. J. Speech Hear. Res. 33, 149–155. doi:10.1044/jshr.3301.149

Herdman, A.T., Lins, O., Van Roon, P., Stapells, D.R., Scherg, M., Picton, T.W., 2002. Intracerebral sources of human auditory steady-state responses. Brain Topogr. 15, 69–86. doi:10.1023/A:1021470822922

Holube, I., Fredelake, S., Vlaming, M., Kollmeier, B., 2010. Development and analysis of an International Speech Test Signal (ISTS). Int. J. Audiol. 49, 891–903. doi:10.3109/14992027.2010.506889

Hopkins, K., Moore, B.C.J., 2011. The effects of age and cochlear hearing loss on temporal fine structure sensitivity, frequency selectivity, and speech reception in noise. J. Acoust. Soc. Am. 130, 334–349. doi:10.1121/1.3585848

Hopkins, K., Moore, B.C.J., 2009. The contribution of temporal fine structure to the intelligibility of speech in steady and modulated noise. J. Acoust. Soc. Am. 125, 442–446. doi:10.1121/1.3037233

Humes, L.E., Roberts, L., 1990. Speech-recognition difficulties of the hearing-impaired elderly: the contributions of audibility. J. Speech Hear. Res. 33, 726–735. doi:10.1044/jshr.3304.726

International Organization for Standardization, 1998. ISO 389-1: Acoustics - Reference zero for the calibration of audiometric equipment. Part 1: Reference equivalent threshold sound pressure levels for pure tones and supra-aural earphones. Geneva, Switzerland.

Jansen, S., Luts, H., Wagener, K.C., Kollmeier, B., Del Rio, M., Dauman, R., James, C., Fraysse, B., Vormès, E., Frachet, B., Wouters, J., van Wieringen, A., 2012. Comparison of three types of French speech-in-noise tests: a multi-center study. Int. J. Audiol. 51, 164–173. doi:10.3109/14992027.2011.633568

Joris, P.X., Schreiner, C.E., Rees, A., 2004. Neural processing of amplitude-modulated sounds. Physiol. Rev. 84, 541–577. doi:10.1152/physrev.00029.2003

Kale, S., Heinz, M.G., 2010. Envelope coding in auditory nerve fibers following noise-induced hearing loss. J. Assoc. Res. Otolaryngol. 11, 657–673. doi:10.1007/s10162-010-0223-6

Kidd, G., Mason, C.R., Deliwala, P.S., Woods, W.S., Colburn, H.S., 1994. Reducing informational masking by sound segregation. J. Acoust. Soc. Am. 95, 3475–3480. doi:10.1121/1.410023

Kujawa, S.G., Liberman, M.C., 2015. Synaptopathy in the noise-exposed and aging cochlea: primary neural degeneration in acquired sensorineural hearing loss. Hear. Res. 330, 191–199. doi:10.1016/j.heares.2015.02.009

Kujawa, S.G., Liberman, M.C., 2009. Adding insult to injury: cochlear nerve degeneration after “temporary” noise-induced hearing loss. J. Neurosci. 29, 14077–14085. doi:10.1523/JNEUROSCI.2845-09.2009

Kwon, B.J., Turner, C.W., 2001. Consonant identification under maskers with sinusoidal modulation: masking release or modulation interference? J. Acoust. Soc. Am. 110, 1130–1140. doi:10.1121/1.1384909

Leigh-Paffenroth, E.D., Fowler, C.G., 2006. Amplitude-modulated auditory steady-state responses in younger and older listeners. J. Am. Acad. Audiol. 17, 582–597. doi:10.3766/jaaa.17.8.5

Lopes da Silva, F., 2013. EEG and MEG: relevance to neuroscience. Neuron 80, 1112–1128. doi:10.1016/j.neuron.2013.10.017

Lorenzi, C., Gilbert, G., Carn, H., Garnier, S., Moore, B.C.J., 2006. Speech perception problems of the hearing impaired reflect inability to use temporal fine structure. Proc. Natl. Acad. Sci. U. S. A. 103, 18866–18869. doi:10.1073/pnas.0607364103

Luke, R., De Vos, A., Wouters, J., 2017. Source analysis of auditory steady-state responses in acoustic and electric hearing. Neuroimage 147, 568–576. doi:10.1016/j.neuroimage.2016.11.023

Ménard, M., Gallégo, S., Berger-Vachon, C., Collet, L., Thai-Van, H., 2008. Relationship between loudness growth function and auditory steady-state response in normal-hearing subjects. Hear. Res. 235, 105–113. doi:10.1016/j.heares.2007.10.007

Millman, R.E., Mattys, S.L., Gouws, A.D., Prendergast, G., 2017. Magnified neural envelope coding predicts deficits in speech perception in noise. J. Neurosci. 37, 7727–7736. doi:10.1523/JNEUROSCI.2722-16.2017

Moore, B.C.J., Glasberg, B.R., Vickers, D.A., 1995. Simulation of the effects of loudness recruitment on the intelligibility of speech in noise. Br. J. Audiol. 29, 131–143. doi:10.3109/03005369509086590

Moore, B.C.J., 2014. Auditory processing of temporal fine structure: effects of age and hearing loss, 1st ed. World Scientific Publishing Co. Pte. Ltd., Toh Tuck, Singapore.

Moore, B.C.J., 2007. Cochlear hearing loss: physiological, psychological and technical issues, 2nd ed. Wiley, Chichester, UK.

Moore, B.C.J., Glasberg, B.R., 1993. Simulation of the effects of loudness recruitment and threshold elevation on the intelligibility of speech in quiet and in a background of speech. J. Acoust. Soc. Am. 94, 2050–2062. doi:10.1121/1.407478

Moore, B.C.J., Wojtczak, M., Vickers, D.A., 1996. Effect of loudness recruitment on the perception of amplitude modulation. J. Acoust. Soc. Am. 100, 481–489. doi:10.1121/1.415861

Nasreddine, Z.S., Phillips, N.A., Bédirian, V., Charbonneau, S., Whitehead, V., Collin, I., Cummings, J.L., Chertkow, H., 2005. The Montreal Cognitive Assessment, MoCA: a brief screening tool for mild cognitive impairment. J. Am. Geriatr. Soc. 53, 695–699. doi:10.1111/j.1532-5415.2005.53221.x

Okamoto, H., Stracke, H., Draganova, R., Pantev, C., 2009. Hemispheric asymmetry of auditory evoked fields elicited by spectral versus temporal stimulus change. Cereb. Cortex 19, 2290–2297. doi:10.1093/cercor/bhn245

Oldfield, R.C., 1971. The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia 9, 97–113. doi:10.1016/0028-3932(71)90067-4

Peelle, J.E., Davis, M.H., 2012. Neural oscillations carry speech rhythm through to comprehension. Front. Psychol. 3, 320. doi:10.3389/fpsyg.2012.00320

Plack, C.J., Barker, D., Prendergast, G., 2014. Perceptual consequences of “hidden” hearing loss. Trends Hear. 18, 1–11. doi:10.1177/2331216514550621

Plomp, R., Mimpen, A.M., 1979. Improving the reliability of testing the speech reception threshold for sentences. Audiology 18, 43–52. doi:10.3109/00206097909072618

Poeppel, D., 2003. The analysis of speech in different temporal integration windows: cerebral lateralization as “asymmetric sampling in time.” Speech Commun. 41, 245–255. doi:10.1016/S0167-6393(02)00107-3

Presacco, A., Jenkins, K., Lieberman, R., Anderson, S., 2015. Effects of aging on the encoding of dynamic and static components of speech. Ear Hear. 36, e352–e363. doi:10.1097/AUD.0000000000000193

Presacco, A., Simon, J.Z., Anderson, S., 2016. Evidence of degraded representation of speech in noise, in the aging midbrain and cortex. J. Neurophysiol. 116, 2346–2355. doi:10.1152/jn.00372.2016

Purcell, D.W., John, S.M., Schneider, B.A., Picton, T.W., 2004. Human temporal auditory acuity as assessed by envelope following responses. J. Acoust. Soc. Am. 116, 3581–3593. doi:10.1121/1.1798354

Rance, G., 2008. Auditory steady-state response: generation, recording, and clinical applications, 1st ed. Plural Publishing Inc, San Diego, CA.

Rosen, S., 1992. Temporal information in speech: acoustic, auditory and linguistic aspects. Philos. Trans. Biol. Sci. 336, 367–373. doi:10.1098/rstb.1992.0070

Schaette, R., McAlpine, D., 2011. Tinnitus with a normal audiogram: physiological evidence for hidden hearing loss and computational model. J. Neurosci. 31, 13452–13457. doi:10.1523/JNEUROSCI.2156-11.2011

Schlittenlacher, J., Moore, B.C.J., 2016. Discrimination of amplitude-modulation depth by subjects with normal and impaired hearing. J. Acoust. Soc. Am. 140, 3487–3495. doi:10.1121/1.4966117

Schoof, T., Rosen, S., 2016. The role of age-related declines in subcortical auditory processing in speech perception in noise. J. Assoc. Res. Otolaryngol. 17, 441–460. doi:10.1007/s10162-016-0564-x

Schoonhoven, R., Boden, C.J.R., Verbunt, J.P.A., de Munck, J.C., 2003. A whole head MEG study of the amplitude-modulation-following response: phase coherence, group delay and dipole source analysis. Clin. Neurophysiol. 114, 2096–2106. doi:10.1016/S1388-2457(03)00200-1

Sek, A., Baer, T., Crinnion, W., Springgay, A., Moore, B.C.J., 2015. Modulation masking within and across carriers for subjects with normal and impaired hearing. J. Acoust. Soc. Am. 138, 1143–1153. doi:10.1121/1.4928135

Sergeyenko, Y., Lall, K., Liberman, M.C., Kujawa, S.G., 2013. Age-related cochlear synaptopathy: an early-onset contributor to auditory functional decline. J. Neurosci. 33, 13686–13694. doi:10.1523/JNEUROSCI.1783-13.2013

Shannon, R.V., Zeng, F.-G., Kamath, V., Wygonski, J., Ekelid, M., 1995. Speech recognition with primarily temporal cues. Science 270, 303–304. doi:10.1126/science.270.5234.303

Shinn-Cunningham, B.G., 2008. Object-based auditory and visual attention. Trends Cogn. Sci. 12, 182–186. doi:10.1016/j.tics.2008.02.003

Souza, P.E., Turner, C.W., 1994. Masking of speech in young and elderly listeners with hearing loss. J. Speech Hear. Res. 37, 655–661. doi:10.1044/jshr.3703.655

Stone, M.A., Füllgrabe, C., Moore, B.C.J., 2012. Notionally steady background noise acts primarily as a modulation masker of speech. J. Acoust. Soc. Am. 132, 317–326. doi:10.1121/1.4725766

Stone, M.A., Moore, B.C.J., 2014. On the near non-existence of “pure” energetic masking release for speech. J. Acoust. Soc. Am. 135, 1967–1977. doi:10.1121/1.4868392

Studenmund, A.H., Cassidy, H.J., 1987. Using econometrics: a practical guide, 1st ed. Little, Brown Book Group, London, UK.

Tlumak, A.I., Durrant, J.D., Delgado, R.E., 2015. The effect of advancing age on auditory middle- and long-latency evoked potentials using a steady-state-response approach. Am. J. Audiol. 24, 494–507. doi:10.1044/2015_AJA-15-0036

Vale, C., Sanes, D.H., 2002. The effect of bilateral deafness on excitatory and inhibitory synaptic strength in the inferior colliculus. Eur. J. Neurosci. 16, 2394–2404. doi:10.1046/j.1460-9568.2002.02302.x

Van Eeckhoutte, M., Wouters, J., Francart, T., 2016. Auditory steady-state responses as neural correlates of loudness growth. Hear. Res. 342, 58–68. doi:10.1016/j.heares.2016.09.009

van Wieringen, A., Wouters, J., 2008. LIST and LINT: Sentences and numbers for quantifying speech understanding in severely impaired listeners for Flanders and the Netherlands. Int. J. Audiol. 47, 348–355. doi:10.1080/14992020801895144

Vercammen, C., Goossens, T., Undurraga, J., Wouters, J., van Wieringen, A., 2018. Electrophysiological and behavioral evidence of reduced binaural temporal processing in the aging and hearing impaired human auditory system. Trends Hear. 22, 1–12. doi:10.1177/2331216518785733

Wallaert, N., Moore, B.C.J., Ewert, S.D., Lorenzi, C., 2017. Sensorineural hearing loss enhances auditory sensitivity and temporal integration for amplitude modulation. J. Acoust. Soc. Am. 141, 971–980. doi:10.1121/1.4976080

Wang, X., Lu, T., Snider, R.K., Liang, L., 2005. Sustained firing in auditory cortex evoked by preferred stimuli. Nature 435, 341–346. doi:10.1038/nature03565

Wang, Y., Ding, N., Ahmar, N., Xiang, J., Poeppel, D., Simon, J.Z., 2012. Sensitivity to temporal modulation rate and spectral bandwidth in the human auditory system: MEG evidence. J. Neurophysiol. 107, 2033–2041. doi:10.1152/jn.00310.2011

Yost, W.A., Sheft, S., Opie, J., 1989. Modulation interference in detection and discrimination of amplitude modulation. J. Acoust. Soc. Am. 86, 2138–2147. doi:10.1121/1.398474

Zatorre, R.J., Belin, P., 2001. Spectral and temporal processing in human auditory cortex. Cereb. Cortex 11, 946–953. doi:10.1093/cercor/11.10.946

Zatorre, R.J., Belin, P., Penhune, V.B., 2002. Structure and function of auditory cortex: music and speech. Trends Cogn. Sci. 6, 37–46. doi:10.1016/s1364-6613(00)01816-7

Figure captions

Fig. 1. Median audiometric thresholds (dB HL) of the ear with the best pure-tone average (PTA0.25-8 kHz) for the NH (black) and HI (gray) participants. Diamonds, squares, and circles show thresholds for young, middle-aged, and older people, respectively. Error bars indicate the interquartile range.

Fig. 2. Time waveform of a sentence from the Leuven Intelligibility Sentence Test (black) in the presence of background noise (gray), i.e., unmodulated SWN (panel A), modulated SWN (panel B), and the ISTS (panel C).

Fig. 3. Panel A: Illustration of an 80-Hz ASSR (indicated by the arrow) in an EEG spectrum (0-100 Hz); Panel B: Electrode configuration and electrodes selected for the ASSR evaluation (bold circles).

Fig. 4. Illustration of the correlation between perception of speech in the presence of modulated SWN and neural envelope encoding for NH adults (4-Hz ASSR RH; top panel) and for HI adults (80-Hz ASSR LH; bottom panel). The x-axis shows the unstandardized residuals when regressing the neural predictor X on the other best subset predictors, i.e., age and 4-Hz ASSR LH for NH adults (top panel) and age for HI adults (bottom panel). Stated otherwise, the x-axis represents the neural predictor X when the other best subset predictors are controlled for. Positive residuals reflect underestimation of the neural predictor X (ASSR magnitude) by the other best subset predictors. Negative residuals reflect overestimation. The y-axis shows each participant’s SRT SWN modulated. Diamonds, squares, and circles show the X and Y values for young, middle-aged, and older people, respectively.
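
The residualization described in the Fig. 4 caption can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function name `partial_residuals`, the helper `dot`, and all data values are hypothetical, and ordinary least squares with an intercept is assumed for the regression of the neural predictor X on the other predictors.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def partial_residuals(x, covariates):
    """Unstandardized residuals of x after least-squares regression
    on an intercept plus the given covariate columns.

    The covariates are orthogonalized one by one (modified Gram-Schmidt),
    and the projection of x onto each orthogonal direction is subtracted.
    """
    n = len(x)
    basis = []
    columns = [[1.0] * n] + [[float(v) for v in c] for c in covariates]
    for col in columns:
        v = col[:]
        for b in basis:
            scale = dot(v, b) / dot(b, b)
            v = [vi - scale * bi for vi, bi in zip(v, b)]
        if dot(v, v) > 1e-12:  # skip columns that are (near) collinear
            basis.append(v)
    resid = [float(v) for v in x]
    for b in basis:
        scale = dot(resid, b) / dot(b, b)
        resid = [ri - scale * bi for ri, bi in zip(resid, b)]
    return resid


# Hypothetical values for six listeners (not data from the study).
age = [25, 34, 47, 52, 63, 71]
assr_lh = [1.2, 0.8, 1.5, 1.1, 1.9, 1.4]  # assumed 4-Hz ASSR LH magnitudes
assr_rh = [1.0, 1.1, 1.4, 1.6, 1.7, 2.1]  # assumed 4-Hz ASSR RH magnitudes

# x-axis values for the NH panel: ASSR RH controlled for age and ASSR LH.
resid = partial_residuals(assr_rh, [age, assr_lh])
```

A positive entry in `resid` means the other predictors underestimate that listener's ASSR magnitude, and a negative entry means they overestimate it, matching the reading given in the caption.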

Highlights

Enhanced neural envelope encoding is related to poorer speech perception

This association applies to speech masked by interfering noise or competing speech

Cortical envelope encoding predicts speech perception for normal-hearing adults

Brainstem envelope encoding predicts speech perception for hearing-impaired adults

Such neural-behavioral relations are found for young, middle-aged, and older adults

[Figure 1. x-axis: 250 Hz, 500 Hz, 1 kHz, 2 kHz, 4 kHz, 6 kHz, 8 kHz; y-axis: Audiometric threshold (dB HL), -10 to 90; legend: NH young, NH middle-aged, NH older, HI young, HI middle-aged, HI older.]

[Figure 2. Panels A, B, and C. x-axis: Time (s), 0 to 3.98; y-axis: Pressure (Pa), -0.17 to 0.12.]

[Figure 3. Panel A: x-axis: Frequency (Hz), 0 to 100; y-axis: Amplitude (dB re nV), 20 to 60. Panel B: electrode configuration.]

[Figure 4. Legend: young, middle-aged, older. NH panel: x-axis: 4-Hz ASSR RH (controlled for age and 4-Hz ASSR LH); y-axis: SRT SWN modulated (dB SNR). HI panel: x-axis: 80-Hz ASSR LH (controlled for age); y-axis: SRT SWN modulated (dB SNR). Correlation annotations: r = 0.32 and r = 0.37.]
