Audio Engineering Society
Convention Paper Presented at the 135th Convention 2013 October 17–20 New York, USA This paper was peer-reviewed as a complete manuscript for presentation at this Convention. Additional papers may be obtained by sending request and remittance to Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA; also see www.aes.org. All rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the Journal of the Audio Engineering Society.
Maximum Averaged and Peak Levels of Vocal Sound Pressure
Braxton Boren, Agnieszka Roginska, and Brian Gill
New York University, New York, NY, 10012, USA
Correspondence should be addressed to the authors ({bbb259, roginska, brian.gill}@nyu.edu)

ABSTRACT
This work describes research on the maximum sound pressure level achievable by the spoken and sung human voice. Trained actors and singers were measured for peak and averaged SPLs at an on-axis distance of 1 m at three different subjective dynamic levels and also for two different vocal techniques (‘back’ and ‘mask’ voices). The ‘back’ sung voice was found to achieve a consistently lower SPL than the ‘mask’ voice at a corresponding dynamic level. Some singers were able to achieve high averaged levels with both spoken and sung voice, while others produced much higher levels singing than speaking. A few of the vocalists were able to produce averaged levels above 90 dBA, the highest found in the existing literature.
1. INTRODUCTION
Ongoing research has sought to estimate the sound pressure level (SPL) of the 18th-century Anglican preacher George Whitefield, based on an auditory experiment conducted by Benjamin Franklin in 1739 [1]. That work has so far yielded estimates of Whitefield’s averaged SPL at 1 m greater than 90 dBA. Much of the existing literature on maximum vocal SPL has used different measurement methods at different distances, and few of these studies have examined the maximum long-term Leq, which describes the voice’s average pressure over time and is often used as a substitute for steady-state pressure in a varying acoustic system such as speech or song.

In addition, research into vocal directivity patterns has examined the impact of changing vocal resonances for a single vocalist [2]. Though the change in vocal resonance was not found to have a significant effect on the acoustic radiation pattern, some vocalists believed that certain vocal placements were more effective at reaching an audience than others. It was hypothesized that this was because different vocal resonances might correspond to differences in overall sound pressure rather than directivity. Absolute SPLs of subjective dynamic levels measured in a controlled environment could test this hypothesis further.

From an audio engineering perspective, the maximum peak and average SPLs of trained vocalists are important because training can significantly increase
Boren et al.
Maximum Vocal Sound Pressure Levels
potential vocal output [3, 4]. Existing designations of vocal loudness for acoustic simulation have assumed a 3-level scheme of ‘normal,’ ‘raised,’ and ‘loud,’ with the top category corresponding to an Leq of about 74 dBA [5]. But Whitefield’s example suggests that there is some headroom above this designation, corresponding to an even louder ‘maximal’ level of speech. Even if this level is only achievable by some trained vocalists, it is important for audio engineers to consider, since trained vocalists are disproportionately represented in recording studios, concert halls, and theatres. For recording engineers, knowing the maximum peak and average SPLs producible by vocalists can be useful for pre-calibrating vocal microphones. For live sound engineers, a very loud trained vocalist may change the acoustic conditions considerably and require less loudspeaker reinforcement as a result.

2. BACKGROUND
While the perception of loudness depends on more than SPL alone [6], this investigation is limited to the maximum peak or averaged SPL produced by vocalists. To the authors’ knowledge, no study has comprehensively surveyed the existing literature on maximum vocal SPLs since Kent et al. in 1987 [7]. Kent’s study summarized different series of measurements, while acknowledging that they were recorded at different distances from the vocalist’s mouth. Here some attempt will be made to scale all such measurements to the predicted SPL at 1 m using the classical formula for free-field sound attenuation,

$$\Delta L = 20 \log_{10}\left(\frac{r_1}{r_2}\right) \qquad (1)$$

where ∆L is the change in level predicted by a move from initial distance r1 to a distance r2. In practice the drop-off will be less abrupt for some close measurements, due to near-field behavior of the sound field close to the vocalist’s mouth. Still, this formula allows a good approximation for comparing some of the large SPLs reported at short distances to later measurements taken at 1 m.

Many of the studies that have collected maximum SPL measurements have used them for comparison
with other factors, and so the types of SPL measurements and vocal signals analyzed vary. Some studies [3, 8, 11] did not specify the time of integration for SPL recording, so it is assumed some form of fast-integrated Lp was used. Some studies [3, 9, 10, 11] only provided graphs and not exact measurement values, so the dB values reported here may contain ± 1 dB of error. In addition, some studies [4, 9, 13] only report the mean level for all their participants, indicating that higher values were measured but not reported directly.

The highest SPL reported in Kent’s [7] summary was from Coleman’s 1977 study of fundamental frequency and SPL [8]. Coleman reported the fast-integrated full-spectrum SPL recorded for a 2 s sung phonation. This yielded a maximum Lp of 126 dB for a male and 122 dB for a female, recorded at 6 inches from the mouth. These high values are a consequence of the very close distance to the vocalist; using an attenuation factor of -16 dB from eq. 1 gives estimated SPLs of 110 and 106 dB, respectively, at 1 m. These values are still high but more similar to the maximum peak values seen in other studies.

Many studies have measured either instantaneous Lp or Leq for constant phonations a few seconds in length. Leq is defined as the time average of the SPL:
$$L_{eq} = 10 \log_{10}\left[\frac{1}{T}\int_{0}^{T}\left(\frac{p(t)}{p_0}\right)^{2} dt\right] \qquad (2)$$
where T is the integration time, p(t) is the pressure as a function of time, and p0 is the reference pressure (20 µPa in air). For sustained tones with no pauses or silences, Leq will be higher than the Leq for normal speech or song. For this reason, studies that have measured short sustained phonations, whether they are reporting instantaneous Lp or Leq, are describing a quantity more similar to a peak value when applied to continuous speech or song. This can be seen in the study by Akerlund [9], which measured Leq for normal speech and a 2 s phonation. The maximum value of the phonation’s Leq was 25 dB greater than that of the speech. Table 1 summarizes the maximum recorded SPLs of the relevant literature, including the corresponding estimated level at 1 m for comparison. The maximum fast-integrated Lp is similar but not identical to
AES 135th Convention, New York, USA, 2013 October 17–20 Page 2 of 7
| Study | Year | Participants | Type | Dist. (cm) | Max SPL (dB) | SPL at 1 m (dB) |
|---|---|---|---|---|---|---|
| Coleman [8] | 1977 | 10 m. adults | Fast Lp, 2 s phon. | 15.24 | 126 | 110 |
| Coleman [8] | 1977 | 12 f. adults | Fast Lp, 2 s phon. | 15.24 | 122 | 106 |
| Gramming [10] | 1988 | 9 m. singers | Leq, sung triads | 30 | 105 | 95 |
| Awan [4] | 1991 | 20 singers | Mean Fast Lp, 3 s phon. | 30.48 | 112.5 | 102 |
| Akerlund [9] | 1992 | 10 f. singers | Mean Leq, 30 s speech | 30 | 93 | 83 |
| Akerlund [9] | 1992 | 10 f. singers | Mean Leq, 2 s phon. | 30 | 118 | 108 |
| Coleman [11] | 1994 | 20 singers | Lp, 4 s phon. | 15 | 114 | 98 |
| Mendes [3] | 2003 | 14 singers | Lp, 6 s phon. | 2 | 118 | 84 |
| Sundberg [12] | 2006 | 31 speakers | Leq, 40 s speech | 30 | 100.3 | 90 |
| Leino [13] | 2009 | 14 students | Mean Leq, speech | 40 | 72.8 | 65 |

Table 1: Review of maximum SPLs measured in previous studies
the Lpk value, but these measurements give a good overview of the highest average and instantaneous levels recorded for the human voice. In particular, Sundberg [12] reports an Leq corresponding to about 90 dB at 1 m, indicating that such a sustained level is indeed possible.

3. METHOD
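As a numerical check on the distance normalization of eq. 1, the following minimal Python sketch rescales a few of the maximum SPLs in Table 1 to 1 m. The function names `level_change_db` and `spl_at_1m` are illustrative choices, not part of the original study, and the sketch assumes ideal free-field spreading, so it ignores the near-field effects noted for very close microphones.

```python
import math


def level_change_db(r1_m: float, r2_m: float) -> float:
    """Eq. 1: change in level when moving from distance r1 to r2 (free field)."""
    return 20.0 * math.log10(r1_m / r2_m)


def spl_at_1m(spl_db: float, distance_cm: float) -> float:
    """Scale an SPL measured at distance_cm to the predicted SPL at 1 m."""
    return spl_db + level_change_db(distance_cm / 100.0, 1.0)


# (study, measurement distance in cm, maximum SPL in dB), copied from Table 1
rows = [
    ("Coleman [8], male", 15.24, 126.0),
    ("Gramming [10]", 30.0, 105.0),
    ("Sundberg [12]", 30.0, 100.3),
    ("Leino [13]", 40.0, 72.8),
]

for study, dist_cm, spl in rows:
    print(f"{study}: {spl_at_1m(spl, dist_cm):.1f} dB at 1 m")
```

Rounded to the nearest decibel, these results agree with the ‘SPL at 1 m’ column of Table 1.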
3.1. Pilot Study
To first investigate the possible average maximum level of the spoken voice, a pilot study was undertaken using one professional actor and one professional actress. The vocalists were measured in the live room of the James Dolan Recording Studio at NYU. The room’s dimensions are 9 m by 4.6 m by 3 m. The vocalists were positioned exactly 1 m in front of an on-axis Micro-SPL measurement condenser microphone attached to an XL2 Sound Level Meter, logging peak and averaged values of LA and LZ in 1 s intervals. The meter’s sensitivity range was set to Lp values of 30–130 dB. The vocalists were not restrained in any way, but were observed to keep still during the measurements.

Both speakers were instructed to recite a short monologue, about 30 to 60 s in length, from memory at three different subjective loudness levels: ‘conversational’ speech, ‘theatrical’ speech (defined as the level necessary to be intelligible to an audience in a small theatre with no amplification system), and ‘maximal’ speech (defined as the loudest achievable without shouting or screaming).

While some studies have used noise played over headphones to induce a higher SPL out of vocalists [9], for some vocalists this condition has actually reduced maximum sound pressure [10]. Since all vocalists measured for this study were highly trained, no headphones or noise conditions were used, ensuring that the participants felt natural during the experiment. The levels recorded will be reported chiefly in dBA, though the LZ values were generally within 1 dB of the LA values.

| Participant | Conv. | Thea. | Max. |
|---|---|---|---|
| P1 - actor | 64.1 | 77.9 | 90.1 |
| P2 - actress | 65.3 | 73.4 | 90.7 |

Table 2: Leq values for pilot study, in dBA, for Conversational, Theatrical, and Maximal Levels

| Participant | Conv. | Thea. | Max. |
|---|---|---|---|
| P1 - actor | 93.6 | 106.0 | 113.2 |
| P2 - actress | 92.3 | 98.7 | 113.3 |

Table 3: Lpk values for pilot study, in dBA

Both speakers in the pilot study achieved levels comparable to the ‘loud’ designation in [5] at their ‘theatrical’ level, and both were able to produce an average level slightly over 90 dBA at their ‘maximal’ level (Table 2). The corresponding Z-weighted values were about 0.5 dB greater for both vocalists. Table 3 shows the highest peak value measured for each speech designation. It can be seen that the peak value in a given monologue was routinely 20–30 dB greater than the average level for that period.
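The meter logged values in 1 s intervals. Assuming the logged averages are 1 s equivalent levels (the text does not state how the long-term values were derived), a discrete form of eq. 2 combines them by averaging energies rather than decibels. The sketch below uses hypothetical readings, not measured data, and the name `leq_from_levels` is illustrative.

```python
import math


def leq_from_levels(levels_db):
    """Energy-average equal-duration short-term levels into one long-term Leq."""
    mean_energy = sum(10.0 ** (level / 10.0) for level in levels_db) / len(levels_db)
    return 10.0 * math.log10(mean_energy)


# Hypothetical 1 s readings in dBA (not measured data): loud speech with two pauses
one_second_levels = [88.0, 91.0, 60.0, 90.0, 89.0, 58.0, 92.0, 90.0]

leq = leq_from_levels(one_second_levels)
arithmetic_mean = sum(one_second_levels) / len(one_second_levels)
print(f"Leq = {leq:.1f} dBA, arithmetic mean of dB values = {arithmetic_mean:.1f}")
```

Because the energy average is dominated by the loudest intervals, short pauses lower the long-term Leq far less than they lower the arithmetic mean of the decibel values, while the peak level of a passage can remain well above both.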
To investigate this difference, we defined the peak spread of a given vocal measurement as follows:

$$S_{pk} = 20 \log_{10}\left(\frac{p_{pk}}{p_{eq}}\right) = L_{pk} - L_{eq} \qquad (3)$$

where Spk is the peak spread, ppk is the peak pressure, and peq is the average pressure. Table 4 gives the peak spread for both participants in the pilot study based on the A-weighted levels. It can be seen that Spk decreased with increasing vocal loudness, such that the ‘maximal’ voice contained the least variance in pressure.

| Participant | Conv. | Thea. | Max. |
|---|---|---|---|
| P1 - actor | 29.5 | 28.1 | 23.1 |
| P2 - actress | 27.0 | 25.3 | 22.6 |

Table 4: Spk values for pilot study, in dB (computed from the A-weighted levels)
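Since the peak spread of eq. 3 is a simple level difference, Table 4 can be re-derived from Tables 2 and 3. The sketch below does so and also converts each spread to the linear peak-to-average pressure ratio it implies; the helper names are illustrative and not from the original study.

```python
# A-weighted levels from the pilot study (Tables 2 and 3), in dBA
LEVELS = ("Conv.", "Thea.", "Max.")
leq = {"P1 - actor": (64.1, 77.9, 90.1), "P2 - actress": (65.3, 73.4, 90.7)}
lpk = {"P1 - actor": (93.6, 106.0, 113.2), "P2 - actress": (92.3, 98.7, 113.3)}


def peak_spread(lpk_db, leq_db):
    """Eq. 3: peak spread as the difference between peak and equivalent level."""
    return lpk_db - leq_db


def pressure_ratio(spread_db):
    """Linear peak-to-average pressure ratio implied by a spread in dB."""
    return 10.0 ** (spread_db / 20.0)


# Round to one decimal place, matching the precision of the published tables
spread = {
    who: tuple(round(peak_spread(pk, eq), 1) for pk, eq in zip(lpk[who], leq[who]))
    for who in leq
}

for who, values in spread.items():
    for level, s in zip(LEVELS, values):
        print(f"{who} {level}: Spk = {s:.1f} dB, pressure ratio = {pressure_ratio(s):.1f}")
```

The computed spreads reproduce Table 4, and the smaller spread at the ‘maximal’ level corresponds to a peak pressure that is a smaller multiple of the average pressure.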
3.2. Spoken and Sung Voice
The pilot study showed that trained vocalists could indeed reach average levels of 90 dBA, and raised further questions about the relationship between peak and average values in vocal output at different levels. After this, a broader series of measurements was conducted on 9 trained singers in the same space, using the same setup and equipment. The vocalists comprised 6 females (5 sopranos and 1 mezzo-soprano) and 3 males (2 baritones and 1 tenor). The same three spoken voice designations were measured for these participants. In addition, the singers also sang a piece, about 30 to 60 s long, from their repertoire at three different dynamic levels: pianissimo (pp), mezzo forte (mf), and fortissimo (ff). While it is known that frequency content is strongly correlated with short-term SPL [8], the singers were merely instructed to select a piece from their actual repertoire that they could sing as loudly as possible while retaining a ‘musical’ tone. Frequency of the highest note was not a criterion for selection, but singers were instructed to retain the original key of the pieces to ensure that the pieces were not artificially amplified by modulation.

To investigate the role of different vocal resonances, the vocalists sang all three dynamic levels using both the ‘back’ and ‘mask’ voices, for 6 total sung measurements. The ‘back’ voice places the primary vocal resonance at the rear of the vocal tract, similar to a yawn in its most extreme form. The ‘mask’ voice uses the resonances of the nasal cavities at the front of the head. Tables 5, 6, and 7 show the Leq for the spoken, back sung, and mask sung voices for the 9 singers, along with each vocalist’s numerical ID and vocal range. Tables 8, 9, and 10 show the Lpk for the spoken, back sung, and mask sung voices for the 9 singers.

4. DISCUSSION
4.1. Average Levels for Speech
The mean of the 9 singers’ spoken levels was 59.0 dBA for ‘conversational’ speech, 69.9 dBA for ‘theatrical’ speech, and 79.6 dBA for ‘maximal’ speech. Each of these values increases by 1–2 dB if the two actors from the pilot study are included in the data, indicating that a population of all trained actors may achieve even higher mean values. Even with singers, however, the mean ‘maximal’ LAeq was still about 6 dB higher than the ‘loud’ level of 74 dBA used in [5]. Individual vocalists were able to exceed the ‘loud’ level by up to 15 dB.

4.2. Gender Differences
The mean Leq values were higher among male vocalists than female vocalists at the highest loudness levels for both speech and sung conditions, consistent with previous research [7]. For the sung conditions the males showed a larger dynamic range overall, as their mean Leq was lower for the pianissimo condition, but this may be a consequence of the small sample size of male singers (n=3). This is not to say that females cannot produce equally high levels; in fact, the highest recorded Leq for speech, 90.7 dBA, was produced by the actress in the pilot study.

4.3. Spoken Levels For Singers
Another interesting aspect of the recorded data is the difference between sung and spoken data for the trained singers. While previous studies have conclusively shown that vocal training can lead to higher maximum SPL [3, 9], this effect was lessened for some singers during the spoken conditions, as many of the singers produced maximum Leq values that were much lower than those of the two actors. In fact, vocalists 3, 5, and 6 had maximum values of spoken Leq that were 5 or more dB lower
| Participant | Conv. | Thea. | Max. |
|---|---|---|---|
| 1 - mez. sop. | 60.1 | 65.3 | 72.4 |
| 2 - soprano | 57.3 | 72.6 | 86.6 |
| 3 - soprano | 58.0 | 65.5 | 73.3 |
| 4 - baritone | 61.7 | 75.4 | 84.3 |
| 5 - soprano | 56.2 | 68.0 | 73.1 |
| 6 - soprano | 55.6 | 61.4 | 69.7 |
| 7 - baritone | 63.7 | 76.2 | 90.3 |
| 8 - soprano | 62.8 | 71.0 | 79.9 |
| 9 - tenor | 55.7 | 73.3 | 86.5 |

Table 5: Leq values for speech, in dBA

| Participant | pp | mf | ff |
|---|---|---|---|
| 1 - mez. sop. | 70.8 | 73.8 | 77.1 |
| 2 - soprano | 69.3 | 75.0 | 82.2 |
| 3 - soprano | 78.5 | 79.8 | 83.7 |
| 4 - baritone | 68.0 | 79.0 | 83.8 |
| 5 - soprano | 69.3 | 82.5 | 84.9 |
| 6 - soprano | 69.0 | 76.1 | 82.3 |
| 7 - baritone | 68.5 | 80.4 | 86.8 |
| 8 - soprano | 71.3 | 76.1 | 80.4 |
| 9 - tenor | 63.3 | 73.7 | 88.4 |

Table 6: Leq values for back sung voice, in dBA

| Participant | pp | mf | ff |
|---|---|---|---|
| 1 - mez. sop. | 71.3 | 73.9 | 76.9 |
| 2 - soprano | 67.3 | 75.7 | 82.1 |
| 3 - soprano | 80.2 | 82.5 | 82.2 |
| 4 - baritone | 75.6 | 81.6 | 86.4 |
| 5 - soprano | 78.1 | 79.5 | 83.4 |
| 6 - soprano | 76.7 | 82.2 | 88.1 |
| 7 - baritone | 69.6 | 84.2 | 90.8 |
| 8 - soprano | 71.9 | 77.9 | 83.1 |
| 9 - tenor | 68.1 | 78.6 | 90.7 |

Table 7: Leq values for mask sung voice, in dBA

| Participant | Conv. | Thea. | Max. |
|---|---|---|---|
| 1 - mez. sop. | 81.6 | 88.3 | 94.0 |
| 2 - soprano | 79.0 | 97.0 | 107.8 |
| 3 - soprano | 82.8 | 91.7 | 100.7 |
| 4 - baritone | 85.7 | 97.1 | 107.4 |
| 5 - soprano | 77.6 | 91.3 | 94.5 |
| 6 - soprano | 77.7 | 84.3 | 93.7 |
| 7 - baritone | 88.4 | 102.6 | 113.1 |
| 8 - soprano | 88.2 | 98.1 | 104.1 |
| 9 - tenor | 79.6 | 98.1 | 112.4 |

Table 8: Lpk values for speech, in dBA

| Participant | pp | mf | ff |
|---|---|---|---|
| 1 - mez. sop. | 88.9 | 93.8 | 95.7 |
| 2 - soprano | 89.6 | 94.6 | 102.7 |
| 3 - soprano | 97.8 | 100.4 | 105.5 |
| 4 - baritone | 89.4 | 100.6 | 105.4 |
| 5 - soprano | 88.1 | 103.6 | 107.2 |
| 6 - soprano | 94.2 | 99.2 | 103.5 |
| 7 - baritone | 88.9 | 101.7 | 109.6 |
| 8 - soprano | 94.3 | 98.6 | 101.3 |
| 9 - tenor | 78.0 | 91.6 | 108.2 |

Table 9: Lpk values for back sung voice, in dBA

| Participant | pp | mf | ff |
|---|---|---|---|
| 1 - mez. sop. | 88.7 | 91.8 | 96.4 |
| 2 - soprano | 85.4 | 98.1 | 105.0 |
| 3 - soprano | 101.3 | 102.4 | 103.1 |
| 4 - baritone | 97.2 | 104.4 | 109.4 |
| 5 - soprano | 101.5 | 101.6 | 105.0 |
| 6 - soprano | 99.9 | 105.0 | 112.0 |
| 7 - baritone | 92.9 | 108.5 | 110.8 |
| 8 - soprano | 93.2 | 98.8 | 103.0 |
| 9 - tenor | 83.9 | 96.9 | 113.0 |

Table 10: Lpk values for mask sung voice, in dBA
than the Leq for their pianissimo mask voice! This was not the case for other singers, such as vocalist 7, who was able to produce sung and spoken maximums of Leq similar to the levels produced by the trained actors. This suggests that some trained singers may have different mental frameworks for spoken vs. sung voice, which increases their maximum sung SPL more than their maximum spoken SPL.

4.4. Peak Spread
After the pilot study, it had been anticipated that the peak spread would be reduced as the SPL of the measured signals increased. However, the mean spread for all vocalists measured actually shows a slight increase in peak spread with increasing SPL for both sung conditions (fig. 1). The highest mean Spk value for speech was found at the medium dynamic (‘theatrical’ level). This persisted whether calculating the mean for all eleven vocalists or just for the nine singers. While the two actors’ peak spread was greatly reduced for their loudest speaking voice, the singers increased their sung Lpk slightly faster than their Leq as their dynamic level increased.

Fig. 1: Mean peak spread, dBA

4.5. Standard Deviation by Level
In addition to measuring the range of an individual’s pressure variations from the mean, it is also helpful to examine the total variance in Leq across all the singers via the standard deviations of the A-weighted levels (fig. 2). The spoken voice conditions show a clear increase in standard deviation with increasing level, indicating that the singers’ SPLs were more dispersed as they spoke with greater effort. Interestingly, this trend was not observed in the standard deviations for either of the sung voices, which stayed in a similar range at all three levels. It is possible that because the subjects were primarily trained as singers rather than speakers, they had more precision as a group in their sung levels than in their spoken levels.

Fig. 2: Standard Deviation for the 9 Singers, dBA

4.6. Back vs. Mask Levels
Figure 3 shows the average dB difference between the ‘mask’ voice and the ‘back’ voice. As had been hypothesized, trained singers usually interpreted the same dynamic levels at lower SPLs for the ‘back’ voice. This difference is greatest at pianissimo and decreases with increasing dynamic level. Participants 4, 5, and 6 each showed a difference greater than 7 dB for pianissimo. It is possible this difference in subjective level stems from the greater damping of the ‘back’ voice due to its placement in the rear of the vocal tract. Since singers may judge their dynamic level based more on vocal effort than absolute SPL, an equivalent vocal effort may lead to lower relative output pressure for the back voice.

Fig. 3: Mean dB Increase from Back to Mask Voice
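The back-versus-mask trend can be checked against Tables 6 and 7. The sketch below takes the arithmetic mean of the per-singer Leq differences (mask minus back) at each dynamic level; this averaging is an assumption, since the exact computation behind Fig. 3 is not specified, and the function name is illustrative.

```python
from statistics import mean

LEVELS = ("pp", "mf", "ff")

# Leq in dBA per singer ID at (pp, mf, ff), copied from Tables 6 and 7
back = {1: (70.8, 73.8, 77.1), 2: (69.3, 75.0, 82.2), 3: (78.5, 79.8, 83.7),
        4: (68.0, 79.0, 83.8), 5: (69.3, 82.5, 84.9), 6: (69.0, 76.1, 82.3),
        7: (68.5, 80.4, 86.8), 8: (71.3, 76.1, 80.4), 9: (63.3, 73.7, 88.4)}
mask = {1: (71.3, 73.9, 76.9), 2: (67.3, 75.7, 82.1), 3: (80.2, 82.5, 82.2),
        4: (75.6, 81.6, 86.4), 5: (78.1, 79.5, 83.4), 6: (76.7, 82.2, 88.1),
        7: (69.6, 84.2, 90.8), 8: (71.9, 77.9, 83.1), 9: (68.1, 78.6, 90.7)}


def mean_mask_minus_back(back_leq, mask_leq):
    """Mean per-singer level increase from back to mask voice, per dynamic level."""
    return {
        level: mean(mask_leq[s][i] - back_leq[s][i] for s in back_leq)
        for i, level in enumerate(LEVELS)
    }


diff = mean_mask_minus_back(back, mask)
# Singers whose pianissimo level rose by more than 7 dB in the mask voice
large_pp = [s for s in back if mask[s][0] - back[s][0] > 7.0]

for level in LEVELS:
    print(f"{level}: mean increase = {diff[level]:.1f} dB")
print("pp difference > 7 dB for singers:", large_pp)
```

Under this assumption the mean increase is largest at pianissimo and shrinks toward fortissimo, and singers 4, 5, and 6 are the ones exceeding 7 dB at pianissimo, in line with the description above.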
5. CONCLUSIONS
When comparing maximum SPL measurements in the literature, averaged and peak levels should be distinguished based on the nature of the experiment. Both past studies and this current experiment have yielded maximum Leq values of 90–91 dB, as well as maximum Lpk values in the range 110–114 dB. The difference between peak and average values fluctuates between about 20 and 30 dB, and it may possibly behave differently for trained actors versus trained singers.

For the purposes of simulating George Whitefield’s voice, this study confirms that averaged values of around 90 dBA are perfectly possible 1 m from a speaker. While it is conceivable that his voice may have been louder than any of the vocalists measured so far, any estimates above this measured maximum should be viewed with caution until they can be experimentally verified.

6. ACKNOWLEDGEMENTS
Many thanks to all the vocalists who participated in this study.

7. REFERENCES
[1] B. B. Boren and A. Roginska, “Computer simulation of Benjamin Franklin’s acoustic experiment on George Whitefield’s oratory,” presented at the 164th Meeting of the Acoustical Society of America, Kansas City, MO, 2012 October 22–26.
[2] B. B. Boren and A. Roginska, “Sound Radiation of Trained Vocalizers,” in Proceedings of Meetings on Acoustics: 21st International Congress on Acoustics, Montreal, Canada, 2013 June 2–7.
[3] A. Mendes, H. Rothman, C. Sapienza, and W. Brown, “Effects of vocal training on the acoustic parameters of the singing voice,” Journal of Voice 17 (2003), no. 4, 529–543.
[4] S. Awan, “Phonetographic profiles and F0-SPL characteristics of untrained versus trained vocal groups,” Journal of Voice 5 (1991), no. 1, 41–50.
[5] B. Dalenback, CATT-A v9.0 User’s Manual, CATT, Gothenburg, Sweden, 2011.
[6] G. D. Allen, “Acoustic Level and Vocal Effort as Cues for the Loudness of Speech,” J. Acoust. Soc. Am. 49 (1971), no. 6B, 1831–1841.
[7] R. Kent, J. Kent, and J. Rosenbek, “Maximum Performance Tests of Speech Production,” Journal of Speech and Hearing Disorders 52 (1987), 367–387.
[8] R. Coleman, J. Mabis, and J. Hinson, “Fundamental Frequency-Sound Pressure Level Profiles of Adult Male and Female Voices,” Journal of Speech and Hearing Research 20 (1977), 197–204.
[9] L. Akerlund, P. Gramming, and J. Sundberg, “Phonetogram and averages of sound pressure levels and fundamental frequencies of speech: Comparison between female singers and nonsingers,” Journal of Voice 6 (1992), no. 1, 55–63.
[10] P. Gramming, J. Sundberg, and S. Terstrom, “Relationship between changes in voice pitch and loudness,” Journal of Voice 2 (1988), no. 2, 118–126.
[11] R. Coleman, “Dynamic intensity variations of individual choral singers,” Journal of Voice 8 (1994), no. 3, 196–201.
[12] J. Sundberg and M. Nordenberg, “Effects of vocal loudness variation on spectrum balance as reflected by the alpha measure of long-term-average spectra of speech,” J. Acoust. Soc. Am. 120 (2006), no. 1, 453–457.
[13] T. Leino, “Long-term average spectrum in screening of voice quality in speech: untrained male university students,” Journal of Voice 23 (2009), no. 6, 671–676.