Variability in the characterization of the headphone transfer-function

Abhijit Kulkarni a) and H. Steven Colburn b)
Hearing Research Center and Department of Biomedical Engineering, Boston University, Boston, Massachusetts 02215

(Received 22 January 1997; revised 1 February 1999; accepted 24 September 1999)

In simulations of virtual acoustic space, stimuli are filtered with HRTFs and presented over headphones. An equalization filter is specified to compensate for the spurious coloration introduced by the headphone delivery system on the stimulus at the listener's eardrum. The purpose of this letter is to report the variability in the response of supra-aural headphones arising from the positioning of the headphone cushion during normal usage. The headphone responses were obtained on the KEMAR acoustical mannequin. It is shown that the variability in the measurements due to headphone cushion placements makes it difficult to specify a compensation filter for canceling the headphone characteristics. This makes the stimulus waveform at a listener's eardrum unpredictable and could have an important consequence on the perceptual adequacy of virtual displays. © 2000 Acoustical Society of America. [S0001-4966(00)04301-0]

PACS numbers: 43.66.Yw, 43.66.Sr, 43.66.Qp [JWH]

INTRODUCTION

The process of virtual sound synthesis requires the filtering of the sound stream to each ear with two transfer functions. The first, called the head-related transfer function (HRTF), is the directional transform from source to ear. The second function is the nondirectional inverse headphone transfer function, which is required to compensate for the headphone transducer characteristic. The purpose of this paper is to document the large variability in the characterization of the headphone transfer function and hence also in its inverse. Headphone transfer function data collected from a popular headphone (Sennheiser HD-520) are presented for repeated placements on a KEMAR mannequin. This variability, however, appears to be common to the general class of circumaural/supra-aural headphones. These observations are consistent with Shaw (1966).

a) Electronic mail: [email protected]
b) Electronic mail: [email protected]
J. Acoust. Soc. Am. 107 (2), February 2000

I. BACKGROUND

Consider a sound source S having a pressure spectrum $S(j\omega)$ and the resulting signal $X(j\omega)$ at some point P in the ear canal. Note that for the purposes of this study we only need to consider the acoustical signal at one ear. The analysis applies for either ear. The signal $X(j\omega)$ can be mathematically related to the source spectrum $S(j\omega)$ by the relation

$$X(j\omega) = S(j\omega)\,H(j\omega), \tag{1}$$

where $H(j\omega)$ is the HRTF from the source S to the point P in a listener's ear canal. Due to the acoustical issues involved, the point P at which the HRTFs are measured is chosen to be either in close proximity to the eardrum (Wightman and Kistler, 1989) or at the entrance to the ear canal (Searle et al., 1975). For the virtual synthesis of the sound source S, a listener is stimulated over headphones such that the pressure signal at the point P in the ear canal equals $X(j\omega)$. If the transfer function from the headphone input to the point P is denoted by $H_H(j\omega)$, such an equivalence can be established if the electrical signal input to the headphone $S_E(j\omega)$ is specified by

$$S_E(j\omega) = \frac{X(j\omega)}{H_H(j\omega)} = S(j\omega)\,\frac{H(j\omega)}{H_H(j\omega)}. \tag{2}$$

Hence, the virtual synthesis filter is the ratio of the HRTF $H(j\omega)$ and the headphone transfer function $H_H(j\omega)$. Note that $H_H(j\omega)$ is nondirectional; it embodies the headphone transducer characteristic in addition to the acoustical transfer function between the headphone transducer and point P in the ear canal. It is easy to see that when $S_E(j\omega)$ is delivered over the headphones, the resulting signal at the point P equals that due to the natural sound source S:

$$S_E(j\omega)\,H_H(j\omega) = S(j\omega)\,H(j\omega) = X(j\omega). \tag{3}$$
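As a small numerical check of Eqs. (1)-(3), the following Python sketch forms the synthesis filter $H(j\omega)/H_H(j\omega)$ bin by bin and confirms that the headphone path then reproduces the natural-source signal. It is only an illustration: the three-bin spectra are made-up values, and the function name `synthesis_filter` and its `floor` guard are assumptions that do not come from the paper.

```python
def synthesis_filter(hrtf, headphone_tf, floor=1e-6):
    """Per-bin ratio H / H_H from Eq. (2).

    `floor` is an assumed guard against dividing by a near-zero
    headphone response; it is not part of the original analysis.
    """
    out = []
    for h, hh in zip(hrtf, headphone_tf):
        if abs(hh) < floor:
            raise ValueError("headphone response too small to invert")
        out.append(h / hh)
    return out


# Hypothetical complex spectra at three frequency bins.
S = [1.0 + 0.0j, 0.5 + 0.5j, 0.0 + 0.2j]    # source spectrum S
H = [1.0 + 0.0j, 0.8 - 0.3j, 0.4 + 0.1j]    # HRTF H
HH = [0.9 + 0.0j, 1.2 + 0.4j, 0.5 - 0.2j]   # headphone transfer function H_H

G = synthesis_filter(H, HH)
S_E = [s * g for s, g in zip(S, G)]                 # Eq. (2): headphone input
X_virtual = [se * hh for se, hh in zip(S_E, HH)]    # Eq. (3): signal at point P
X_natural = [s * h for s, h in zip(S, H)]           # Eq. (1): natural source

# Largest per-bin difference between the virtual and natural signals.
max_err = max(abs(a - b) for a, b in zip(X_virtual, X_natural))
```

The equality holds only when the $H_H$ used to build the filter is the same $H_H$ that is in effect during playback, which is the assumption the measurements in this letter call into question.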

The measurement of HRTFs, which has been researched extensively (e.g., Wightman and Kistler, 1989), involves the placement of miniature microphones in the ear canals of listeners. The headphone transfer function $H_H(j\omega)$ is normally obtained for the same placement of the microphones. Due to different shapes and sizes of the pinna, HRTFs for different listeners have idiosyncratic spectral features so that individually measured HRTFs are believed to be required for veridical synthesis of virtual space (Wenzel et al., 1993). Similarly, the headphone transfer function also differs for different listeners. Because of different shapes and sizes of the ear canal and pinna, the acoustical coupling between the headphone transducer and the ear canal differs for each listener and hence is typically specified individually. In a comprehensive study of headphone transfer functions, Moller et al. (1995a, b) concluded that there was significant intersubject variability among individuals and suggested that individualized equalization of the headphone response is preferable. Moller and colleagues also reported intrasubject variability at the high frequencies, which is consistent with the data reported by Wightman and Kistler (1989).

The goal of this study was to characterize the headphone transfer function and its variability over placements using a commonly available supra-aural headphone. We then assessed the equalization that can be achieved to compensate for its response.

II. METHODS

The headphone used in this study was the Sennheiser HD-520. The headphone transfer function was measured on the KEMAR mannequin with "large" pinnae¹ and 1/2-in. Etymotic ER-11 microphones fitted in DB-100 Zwislocki couplers. The KEMAR microphone has a flat response (to within ±0.5 dB) between 200 Hz and 14 kHz. The headphone was placed over the mannequin's ears and the transfer function from headphone to eardrum was obtained using Golay code techniques (Golay, 1961; Foster, 1986). The probe stimulus was constructed on a TMS320C25 Spectrum board housed in an IBM-386 computer. It was delivered to the headphone driver via a D/A converter after suitable amplification from a Crown D-75 amplifier. The output from the KEMAR microphones was measured from an A/D converter and processed on the Spectrum signal processing board to obtain a 1024-tap finite-impulse-response filter at a sampling rate of 50 kHz.

A total of 20 transfer function measurements were obtained from the mannequin. On each measurement the headphone was removed and repositioned on the mannequin in order to estimate the variability resulting from repeated placements of the headphone. The headphones were positioned in a normal way on each trial such that the pinna was completely enclosed by the earphone cushion. Special care was taken to ensure that the cushions were not placed awkwardly but rather as they might be expected to be positioned by a human subject during a listening session.

Inverse headphone transfer functions were constructed for each of the 20 measurements. The inverse functions were constructed in MATLAB using inverse Fourier transform techniques and implemented as linear-phase finite impulse response (FIR) filters. If a headphone impulse response is denoted by $h_H[n]$ and the corresponding transfer function by $H_H(j\omega)$, the inverse filter $h_H^{-1}[n]$ can be obtained by performing the basic operation

$$h_H^{-1}[n] = \mathcal{F}^{-1}\{1/H_H(j\omega)\}, \tag{4}$$

where $\mathcal{F}^{-1}$ denotes the inverse Fourier transform operator. A windowing function was introduced in the operand of the inverse Fourier operator of Eq. (4) to prevent regions of low spectral energy in $H_H(j\omega)$ from contributing to a numerically unstable result. Specifically, a window having a value of one between 200 and 16 000 Hz and tapered edges (raised cosine) outside this interval was used.

The effect of using an inverse filter to cancel the headphone transfer function obtained for a different placement of the headphone cushions was simulated. We arbitrarily combined different inverse functions with different headphone transfer functions to evaluate the implications of such a mismatch for the spectrum of the signal received at the eardrum.

III. RESULTS

FIG. 1. Top-left panel: Representative headphone response magnitude spectrum from the Sennheiser HD-520 headphones. Lower-left panel: A representative HRTF magnitude spectrum shown for comparison. Right panel: The mean headphone transfer function computed from 20 measurements is shown as a solid line in the upper set of curves. Also shown are three representative measurements (dashed, dotted, and dot-dashed lines) of the headphone transfer function obtained for three different placements of the headphone cushions. The curves are shown on a relative dB scale. The standard deviation of the 20 measurements is shown in the lower curve of the right panel on the same scale as the measurements.
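The mean and standard-deviation curves of Fig. 1 summarize repeated measurements frequency bin by frequency bin. A minimal Python sketch of that kind of summary is given below. The text does not state whether the averaging was done on decibel or linear values, nor which standard-deviation estimator was used, so averaging in dB and the sample (n - 1) estimator are assumptions here, as are the function names and the toy data.

```python
import math
import statistics


def to_db(magnitude):
    """Convert a linear magnitude to decibels."""
    return 20.0 * math.log10(magnitude)


def mean_and_std_db(magnitude_spectra):
    """Per-bin mean and sample standard deviation, both in dB.

    `magnitude_spectra` holds one list of linear magnitudes per
    headphone placement; all lists cover the same frequency bins.
    """
    n_bins = len(magnitude_spectra[0])
    mean_db, std_db = [], []
    for k in range(n_bins):
        levels = [to_db(spectrum[k]) for spectrum in magnitude_spectra]
        mean_db.append(statistics.mean(levels))
        std_db.append(statistics.stdev(levels))
    return mean_db, std_db


# Three hypothetical placements, two frequency bins each.
placements = [
    [1.0, 10.0],   # bin levels: 0 dB, +20 dB
    [1.0, 1.0],    # bin levels: 0 dB,   0 dB
    [1.0, 0.1],    # bin levels: 0 dB, -20 dB
]
mean_db, std_db = mean_and_std_db(placements)
```

In this toy data the first bin is identical across placements and the second varies widely. Both bins share the same mean level, which illustrates why a mean curve alone can hide placement-dependent variability.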

The magnitude spectrum of a headphone response $|H_H(j\omega)|$ for a particular placement of the cushions is shown in the top-left panel of Fig. 1. Note that there are significant spectral features in the headphone frequency response (as measured at the eardrum). In fact, the spectral features in the headphone transfer function are comparable to those observed in HRTFs. A representative HRTF magnitude spectrum $|H(j\omega)|$ is shown in the lower-left panel of Fig. 1 for comparison. As expected, the quarter-wavelength ear canal resonance around 2.7 kHz is common to both the HRTF and the headphone transfer function. The magnitudes of the peaks and notches in the headphone transfer function magnitude spectrum and in the representative HRTF magnitude spectrum are comparable.

The mean headphone transfer function obtained by averaging the 20 measurements is shown by the solid line in the top set of curves in the right panel of Fig. 1. The ordinate scale is a relative scale in decibels. Also shown, displaced vertically for clarity, are three representative headphone transfer functions obtained for three different placements of the headphone cushions. The three measurements were chosen arbitrarily and do not represent the extreme variations in the data. Despite this, the differences in spectral features among the three measurements are striking. The standard deviation, in dB, for the 20 headphone transfer functions is shown by the solid curve at the bottom of the panel (on the same scale). The standard deviation curve indicates significant variability in the characterization of the headphone transfer function. In fact, due to the large variability in the data, it is difficult to consider the mean transfer function to be representative of the data in general.

To confirm that the source of variability was in fact the headphone placement, we also obtained five transfer-function measurements for a fixed headphone placement. The resulting standard deviation was within 0.1 dB throughout the frequency range. Therefore, the variability seen in Fig. 1 must result from changes in headphone placement.

It can be noted from Fig. 1 that presenting virtual stimuli through uncompensated headphones will result in unwanted spectral characteristics at the eardrum, which will be different for each placement of the headphone. In order to preserve the fidelity of the directional spectrum at the eardrum, the headphone characteristic needs to be compensated. In the top-left panel of Fig. 2 we show a representative headphone response magnitude spectrum for a given placement of the headphones. The magnitude spectrum of the inverse filter corresponding to this placement is shown in the middle-left panel of the figure. In the bottom-left panel of Fig. 2, we show the effective, compensated headphone frequency response when the headphone impulse response and inverse filter are arranged in cascade. As shown, the inverse filter is able to achieve the desired compensation (i.e., a flat response). In contrast, in the three panels of the middle column of Fig. 2, "compensated" headphone transfer functions are shown when the same inverse filter is combined with the headphone transfer functions obtained for three different placements of the headphone cushions. It may be noted that the inverse filtering operation not only fails to compensate for the headphone characteristics but is sometimes more deleterious than having no inverse filter at all.

Finally, the three panels of the right column of Fig. 2 show the equalization of the headphone transfer functions used in the middle panels using an inverse filter constructed from the mean headphone transfer function (shown in Fig. 1). As expected, because the mean headphone transfer function is not representative of the data set in general, a mean inverse filter is unable to equalize the filter functions completely. However, on comparing the results in the middle column and the right column of Fig. 2, it can be seen that, on average, the mean inverse filter performs better than an inverse filter computed for a unique headphone placement. In both cases, however, the spurious spectral detail in the signal is significant. In fact, the magnitudes of these unwanted spectral features are as prominent as features in the HRTF. Because the spectral detail in the HRTF is associated with directional cues, such unwanted features could clearly have a deleterious perceptual consequence.

IV. DISCUSSION

FIG. 2. Representative headphone response magnitude spectrum (top-left panel) and the corresponding inverse-filter magnitude spectrum (middle-left panel) for a given placement of the headphones. The equalized headphone response is shown in the bottom-left panel. Equalized headphone transfer functions for three different headphone-cushion placements using the inverse filter shown in the middle-left panel are shown in the middle column. The equalization for the same placements using the inverse filter computed from the mean headphone transfer function shown in Fig. 1 is shown in the right column.
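The comparison summarized in Fig. 2 (matched inverse filter, mismatched inverse filter, and mean inverse filter) can be illustrated with a frequency-domain sketch in Python. This is not the authors' MATLAB procedure: it works on a few hypothetical in-band bins instead of 1024-tap FIR filters, the raised-cosine taper widths are assumed because the text gives only the 200 to 16 000 Hz passband, and the mean response is formed from linear values, which is also an assumption.

```python
import math


def raised_cosine_window(f, f_lo=200.0, f_hi=16000.0,
                         taper_lo=100.0, taper_hi=4000.0):
    """Unity inside [f_lo, f_hi], raised-cosine roll-off to zero outside.

    The taper widths are assumed values, not taken from the paper.
    """
    if f_lo <= f <= f_hi:
        return 1.0
    d = (f_lo - f) / taper_lo if f < f_lo else (f - f_hi) / taper_hi
    if d >= 1.0:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * d))


def inverse_response(h, freqs):
    """Windowed inverse W(f) / H_H(f), the operand of Eq. (4), per bin."""
    return [raised_cosine_window(f) / hk for f, hk in zip(freqs, h)]


def equalized_db(h_measured, h_inverse):
    """Level in dB of the cascade of a response and an inverse filter.

    Only meaningful for bins where the window is nonzero.
    """
    return [20.0 * math.log10(abs(a * b))
            for a, b in zip(h_measured, h_inverse)]


# Hypothetical responses for two placements at three in-band frequencies.
freqs = [500.0, 2000.0, 8000.0]
h_a = [1.0 + 0j, 2.0 + 0j, 0.5 + 0j]
h_b = [1.0 + 0j, 1.0 + 0j, 1.0 + 0j]

inv_a = inverse_response(h_a, freqs)
matched = equalized_db(h_a, inv_a)        # inverse built for this placement
mismatched = equalized_db(h_b, inv_a)     # inverse from another placement

h_mean = [(a + b) / 2 for a, b in zip(h_a, h_b)]
inv_mean = inverse_response(h_mean, freqs)
mean_equalized = equalized_db(h_b, inv_mean)

worst_single = max(abs(v) for v in mismatched)
worst_mean = max(abs(v) for v in mean_equalized)
```

With these toy numbers the matched cascade is flat, the mismatched cascade deviates by about 6 dB, and the mean inverse filter reduces the worst deviation to about 3.5 dB without removing it. The numbers have no bearing on the measured data; they only show the mechanics of the comparison.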

The use of virtual acoustical stimuli is becoming increasingly popular in psychophysical experiments. It is often considered that the spectral signal delivered at the listener's eardrum should be identical to that arising from a natural source. Hence, considerable effort is often expended to measure HRTFs that are listener specific. However, as seen from the data presented above, headphone systems may introduce a significant coloration on the acoustical stimuli. In fact, the spectral features in headphone transfer functions are very similar to the directional features reported in HRTFs. There is a need to compensate for these spurious features in order to prevent them from having a deleterious perceptual effect during the synthesis of virtual acoustical space.

The purpose of this paper is to bring to light the large variability in the measured headphone transfer function for a single headphone-head combination. In particular, choosing to compute inverse filters to equalize for the headphone response by measuring the headphone transfer function for each individual listener does not resolve this ambiguity, but may amplify it. In general, this variability is a cause of concern for virtual space synthesis and its consequences need to be carefully evaluated.

The variability in the headphone transfer characteristics arises primarily from the variable coupling of the headphone cushion to the ear for different placements. The acoustics of the coupling is sensitive to the specific placement of the cushion around the ear. An exact acoustical model that explains the phenomenon is not within the scope of this paper. A simple model which may be used to qualitatively explain the phenomenon is shown in Fig. 3. The top panel of Fig. 3 shows a generalized series of blocks describing the acoustical elements of interest. The headphone signal $P_H$ couples to the ear canal via the two-port² labeled SOURCE-COUPLING. The signal at the entrance of the ear canal, $P_E$, couples to the eardrum via the second two-port, which is labeled EARCANAL. The transformation between the entrance of the ear canal and the eardrum is fixed, and it is the variability associated with the source coupling that is responsible for the variable features in the measured spectrum.

FIG. 3. Circuit model of headphone acoustics. $P_H$, $P_E$, and $P_D$ represent the pressures at the headphone equivalent source, at the entrance to the ear canal, and at the eardrum, respectively. $Z_H$ and $Z_D$ are the acoustic impedances of the headphone and looking into the eardrum, respectively. Top panel: A general circuit composed of two-ports. The variability in the measurements arises from the variability in coupling between the headphone and the ear canal. Lower panel: The source of the variability at low frequencies. The fit-dependent parameters $Z_{leak}$ and $Z_{cav}$ are the lumped acoustic impedances of the leakage of volume velocity to the atmosphere and of the cavity itself.

At low frequencies the pressure within the earphone cavity can be considered the same everywhere (spatial variation is ignored) and the source-coupling two-port can be reduced to the circuit shown in the lower panel of Fig. 3. Here the paths for volume velocity between the headphone transducer and the ear canal entrance have been separated into a leakage path out of the earcup ($Z_{leak}$) and a path into the cavity comprising the earphone cup and the pinna ($Z_{cav}$). The pressure at the entrance of the ear canal is then described by

$$P_E = P_H\,\frac{Z_{leak} Z_{cav}}{Z_{leak} Z_{cav} + Z_H\,(Z_{leak} + Z_{cav})}, \tag{5}$$

where all quantities are functions of frequency. Because of the variability in $Z_{leak}$ and $Z_{cav}$, the pressure waveform at the ear canal entrance ($P_E$), and therefore at the eardrum ($P_D$), is variable and depends upon the specific placement of the headphone. This result is consistent with Shaw (1966), who reported large intrasubject variability in the frequency response of circumaural and supra-aural headphones and suggested that they were not suitable for exacting applications. It is also consistent with the results of Domnitz (1976), who reported similar findings in measurements on human subjects wearing circumaural headphones for frequencies below 1 kHz. We note that the behavior of the two-port describing the coupling is complicated at high frequencies and we do not provide a specific analysis of the two-port at those frequencies.

Note that the variability reported here arises from normal usage of the headphones. It should be expected to arise during different runs of an experiment during which the headphone is repositioned. It is therefore impossible to specify a unique inverse filter that can equalize for the headphone response. Consequently, in spite of the care taken to measure HRTFs and headphone characteristics from individual listeners, the pressure waveform at the listener's eardrum is unpredictable, at least with circumaural or supra-aural headphones. The often-reported lack of veridical perception of virtual stimuli could also be the outcome of this result.

A specific recommendation to eliminate this variability is beyond the scope of this manuscript. Our results suggest that a mean inverse filter computed by averaging the response for several headphone placements performs better on average than an inverse filter computed for a given headphone placement. Averaging across measurements made from multiple listeners may also be useful. Even a mean filter, however, is inadequate to compensate for the extreme variations in headphone responses. A possible approach may take the form of a stimulus monitoring system as proposed and used by Domnitz (1976). Note, however, that Domnitz used stimuli only up to 1 kHz. The difficulty in monitoring the frequency response increases with frequency, and a system to monitor frequencies above a few kHz would be difficult to implement. In any case, this clearly makes the stimulus delivery system more complicated than those used in many current virtual displays.

ACKNOWLEDGMENTS

This work was supported by National Institutes of Health, NIDCD Grant No. RO1DC00100.

¹ The large pinna model corresponds to part number DB-065 and DB-066 available from Knowles Electronic, Inc.
² A two-port network is a generalization of a system having a pair of input terminals and a pair of output terminals. The behavior of a two-port is characterized by its impedances (input and output) and its transfer function.

Domnitz, R. H. (1976). "Headphone monitoring system for binaural experiments below 1 kHz," J. Acoust. Soc. Am. 58, 510–511.
Foster, S. H. (1986). "Impulse response measurement using Golay codes," Proc. of ICASSP, Tokyo, Japan, pp. 929–932.
Golay, M. J. (1961). "Complementary series," IRE Trans. Inf. Theory 7, 82–87.
Kulkarni, A. (1993). "Auditory Imaging a Virtual Acoustical Environment," Masters Thesis, Boston University.
Moller, H., Hammershoi, D., Jensen, C. B., and Sorensen, M. F. (1995a). "Transfer characteristics of headphones measured on human ears," J. Audio Eng. Soc. 43, 203–217.
Moller, H., Jensen, C. B., Hammershoi, D., and Sorensen, M. F. (1995b). "Design criteria for headphones," J. Audio Eng. Soc. 43, 218–232.
Searle, C. L., Braida, L. D., Cuddy, D. R., and Davis, M. F. (1975). "Binaural pinna disparity: another auditory localization cue," J. Acoust. Soc. Am. 57, 448–455.
Shaw, E. A. G. (1966). "Earcanal pressure by circumaural and supraaural earphones," J. Acoust. Soc. Am. 39, 471–479.
Wenzel, E. M., Arruda, M., Kistler, D. J., and Wightman, F. L. (1993). "Localization using nonindividualized head-related transfer-functions," J. Acoust. Soc. Am. 94, 111–123.
Wightman, F. L., and Kistler, D. J. (1989). "Headphone simulation of free-field listening I: stimulus synthesis," J. Acoust. Soc. Am. 85, 858–867.