A Theory of Loudness and Loudness Judgments

Psychological Review 1979, Vol. 86, No. 3, 256-285

Lawrence E. Marks

John B. Pierce Foundation Laboratory, New Haven, Connecticut, and Yale University

To account both for properties of auditory functioning and for the perceptual bases of various kinds of psychophysical judgments of loudness, the present theory utilizes the notion that the auditory system processes information about the intensity of sounds in a hierarchy of stages. Each stage is characterized by a function that transforms input components into output components. Associated with each stage in processing is a rule for combining the output components of that stage. In the initial, peripheral (sensory) stages, these rules are ones of summation, accounting thereby for energy summation within critical bands, loudness summation across widely separated frequencies, and binaural loudness summation. Underlying loudness summation is a scale approximating the sone scale. In a more central (cognitive) stage, the rule is one of subtraction between components, accounting thereby for the perceptual relationship that forms the basis for judgments of loudness differences. Underlying loudness difference is a scale approximating the lambda scale. Thus, the theory uses a single hierarchical scheme to account both for certain auditory processes and for seemingly incompatible psychological scales of loudness.

Loudness is the intensity of sensations of hearing. The quantification of loudness has long been a major goal of psychoacoustics, just as the measurement of sensation in general has long been a major goal of psychophysics. To construct and validate a scale of loudness would constitute a significant theoretical advance for psychophysics, for sensory psychology, and indeed for psychology in general.
Moreover, to construct and validate a loudness scale would be of considerable practical importance: Engineers as well as psychologists are concerned with the measurement of loudness, and the International Standards Organization has accepted Stevens's (1955) sone scale as the scale of loudness. Yet the validity of the psychophysical methods, such as magnitude estimation, that were used to erect the sone scale remains controversial.

This article proposes a psychophysical theory of loudness, with special emphasis on the perceptual bases from which people make quantitative judgments about loudness. The theory says that information about sound intensity is processed in a hierarchy of stages, each stage being characterized by some input-output function. The scheme may be thought of as comprising transformations embedded inside transformations; that is, the input-output functions (transformations that are linear at one of the stages and nonlinear at other stages) nest within other transformations, and associated with each function is a rule for integrating components. The first transformation operates on a series of stimulus components, and the associated rule of integration combines these transformed components. The second transformation operates on a series of components that are the outputs from integration in the prior stage; another rule of integration combines the twice-transformed components.

Transformations arise at different levels in processing. Some of the transformations presumably occur peripherally in the auditory system, and certainly automatically (for example, at the basilar membrane of the cochlea), whereas others arise much more centrally and come under cognitive control (for example, wherever in the brain the sensory effects of two or more stimuli separated in time are compared and related to each other).

Let me note here that the notion of hierarchical processing in hearing is not new. A hierarchy of systems is implied in (to use but three examples) Zwicker and Scharf's (1965) model for calculating the loudness of complex sounds, Treisman and Irwin's (1967) model for binaural summation of the loudness of white noise, and Curtis and Mullin's (1975) model for separating perceptual and judgmental processes in loudness scaling tasks.

The present theory of loudness begins with a simple organizational scheme common to processing at various levels, and by using a small set of basic principles, several of which are already well established, it accomplishes two main functions: (a) The theory binds together several properties of auditory processing, notably critical-band summation, supercritical-band summation, and binaural summation; and (b) by extending the same scheme to incorporate the ways that subjects make quantitative judgments about loudness and loudness relations, it gives a new perspective to an old issue: the evaluation of loudness scales and, consequently, of the psychophysical functions relating the psychological magnitude loudness to parameters of acoustic stimulation. I say functions because the major tenet of the present theory states that intensity undergoes several nonlinear transformations; the theory shows how some of the seemingly inconsistent loudness scales that have been suggested in the past (e.g., Stevens's, 1955, sone scale and Garner's, 1954, lambda scale) are not actually incompatible; instead, the different scales tap the outcome of processing at different levels in the scheme of embedded transformations.

This research was supported by Grant BNS 76-09950 from the National Science Foundation. J. J. Zwislocki provided helpful comments on the article. Christine Heselmann assisted in collecting the new data reported. Requests for reprints should be sent to Lawrence E. Marks, John B. Pierce Foundation Laboratory, 290 Congress Avenue, New Haven, Connecticut 06519.

Copyright 1979 by the American Psychological Association, Inc. 0033-295X/79/8603-0256$00.75
Outline of the Theory

Let me summarize briefly, first, what the theory accounts for and, second, how the theory accomplishes this. The theory characterizes (a) the way loudness depends on sound pressure of pure tones and (b) certain perceptual relationships between loudnesses of pairs of sounds. Thus the theory answers (a) substantive questions about auditory functioning and, in particular, questions about the way the auditory system integrates sensory inputs across frequency and across the two ears; and (b) questions of psychophysical scaling and, in particular, questions of why two types of psychophysical procedure (procedures that ask subjects to judge sensory intervals and procedures that ask subjects to judge sensory magnitudes) often yield scales that are different from one another.

The theory accomplishes this by proposing to identify loudness as the internal representation of the sensory effect of sound at a particular level in the processing of auditory intensity. Specifically, loudness corresponds to the value of L (loudness) within the following set of embedded processes: spectral shaping; critical-band summation; transformation to L; summation of L across frequency and across the ears; transformation to another entity, called D (difference); and relations of D. A preliminary version of this theory was sketched earlier (Marks, 1978b).

According to the theory, the acoustical (pure tone) input is processed first through a linear transformation of acoustical energy to E (a quantity proportional to acoustical energy), followed by linear summation of E across small ranges of sound frequency (across critical bands). This is followed by nonlinear transformation from E to L, whereupon values of L sum linearly across frequency (given that the component frequencies are separated sufficiently) and across the two ears.
All of these processes (the sets of transformations and summations) take place automatically.¹ Next comes an optional stage, whose transformation turns out to be nonlinear. I call this stage optional because it arises only under conditions that elicit certain relational judgments between pairs of sounds, in which L is converted to a different quantity, D. Judgments of loudness dissimilarity and of loudness difference derive from intervals on the D scale.

¹ The processes take place "automatically," even though cross-frequency summation can be analytic: It is sometimes possible to "hear out" the individual tonal components of a complex sound and respond either to component loudness or to overall loudness.

Loudnesses (values of L) pertain to individual sounds; that is, to sounds that occur as unitary events, even if they comprise several components. Values of D, by contrast, apply only when stimuli produce pairs of sounds that appear perceptually as pairs, and apply to a perceptual relationship between members of each pair. For the most part, I shall refer to this perceptual relationship as a loudness difference, which, it is important to bear in mind, is not necessarily the same as a difference in loudness (a linear difference or interval on the L scale). Loudness difference, in the present context, refers specifically to a linear difference on the D scale.

One way to summarize is by means of a recursive equation that describes the relationship between a given stage's output ($Y_k$) and input ($Y_{k-1}$):

$$Y_k = T_k(Y_{k-1,a}) \circ T_k(Y_{k-1,b}). \tag{1}$$

$T_k$ is the psychophysical transfer function relating output to input of each component, and $\circ$ is an arithmetic operator (e.g., addition or subtraction). Note that as the equation is stated, the input to any stage k is written as comprising two components: In some instances, this readily generalizes to greater numbers of components, for example, when the stage represents energy summation in the critical band; in other instances, however, two components is the limit, for example, at the stage in which two sounds yield a loudness difference.

The transformational rules and associated rules of combination (linear summation and differencing) yield tightly constrained metric structures, and these structures can evidence themselves even when it is not possible to quantify directly the magnitude of the system's output. Moreover, the various transformations and combinations are independently demonstrable. Some of them are already well-known: for instance, energy summation within critical bands.
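The recursive scheme of Equation 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the input energies, and the two power-function exponents (0.3 for the transformation from E to L, 0.5 for the transformation from L to D) are assumptions made for the example, not values given at this point in the article.

```python
from functools import reduce
from operator import add, sub


def stage(transform, combine, components):
    """One stage of the scheme: transform every component, then combine the results."""
    return reduce(combine, (transform(c) for c in components))


# Hypothetical transfer functions, chosen only for illustration.
def to_energy(energy):
    return energy             # linear stage: E is proportional to acoustical energy


def to_loudness(energy):
    return energy ** 0.3      # assumed compressive power function from E to L


def to_difference(loudness):
    return loudness ** 0.5    # assumed compressive function from L to D


# Stage 1: energies of components inside one critical band add.
band_low = stage(to_energy, add, [100.0, 100.0])
band_high = stage(to_energy, add, [400.0])

# Stage 2: values of L add across widely separated bands (or across the ears).
sound_1 = stage(to_loudness, add, [band_low, band_high])
sound_2 = stage(to_loudness, add, [band_low])

# Optional stage 3: a pair of sounds yields a difference on the D scale.
loudness_difference = stage(to_difference, sub, [sound_1, sound_2])

print(band_low, sound_1, sound_2, loudness_difference)
```

Passing the operator as an argument keeps the same `stage` function usable for both the summation stages and the differencing stage, which mirrors the single organizational rule the text describes.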

Transformations and Scales

Let me pick up on a point made in the last paragraph. Given that the auditory system processes information about intensity through a hierarchy of stages in which each stage has its own input-output transformation, a pertinent question is, How is it possible to determine the scale values that operate at any intermediate stage? Consider the simple case of a single input component that is delivered to a two-stage system. From Equation 1, the final output $Y_2$ to input $I$ will equal $T_2[T_1(I)]$. Even if it is possible to obtain a valid measure of $Y_2$, the relation between $Y_2$ and $I$ is jointly a function of the transformations $T_1$ and $T_2$. How can $T_1$ and $T_2$ be separated?

A traditional solution in sensory psychophysics uses a principle of invariance in conjunction with experimental paradigms that use compound (e.g., multicomponent) stimuli. Call the inputs to the system's first stage $I_a$ and $I_b$ and assume that the first stage's outputs to these components sum linearly before entering the system's second stage. Then

$$Y_2 = T_2[T_1(I_a) + T_1(I_b)]. \tag{2}$$

If transformation $T_2$ is linear, transformation $T_1$ can be determined directly from measures of $Y_2$. If, as in the case of the present theory of loudness, $T_2$ is nonlinear, the characteristics of $T_1$ may be determined by a matching procedure in which subjects set various combinations of $I_a$ and $I_b$ to be equal perceptually. This matching operation defines $Y_2$ in Equation 2 as constant. Therefore,

$$T_1(I_a) + T_1(I_b) = T_2^{-1}(Y_2) = \text{constant} \tag{3}$$

describes the form of the equal-sensation function. If, for instance, $T_1$ is a linear transformation, then the overall response will depend on the linear sum of the inputs $I_a$ and $I_b$, as in critical-band intensity summation. In this case, a graph plotting $I_a$ versus $I_b$ for a constant perceptual response will result in a straight line with unit negative slope. More generally, Equation 3 says that one may use equal-sensation curves in order to determine the form of any underlying transfer function, as Treisman and Irwin (1967) did in their analysis of binaural summation of the loudness of white noise.

The most fundamental test of any rule of processing comes from applying the principle of invariance (Brindley, 1960, chap. 4; Marks, 1978c, chap. 5). To say that sound energies sum within a critical band is to say that different stimulus combinations that are equal in total energy produce the same loudness. To say that values of L sum across the two ears is to say that different combinations of left-ear and right-ear stimuli that are equal in total L are equal in loudness. To say that the perception of a loudness difference is an interval on the D scale is to say that different pairs of stimuli that are equal in $\Delta D$ are judged equal in loudness difference.

Sensory equalities or invariances are often determined most readily by intensity-matching procedures. But they can also be determined by numerical rating procedures like magnitude estimation. It is perhaps not always appreciated that a set of magnitude estimates, obtained with compound stimuli, contains the same information about equal magnitude that can be found in direct matches. Even if one cannot accept the numerical judgments as representing more than a rank ordering of sensory values, the judgments can nevertheless suffice to allow one to find the stimulus combinations that yield equal sensation magnitude (see Marks, 1974b, especially chap. 2), and therefore to find the form of underlying transformations.

This consideration leads to a second procedure for uncovering the form of an underlying transformation. Given, say, a set of magnitude estimates, another way to determine the scale values that operate at some intermediate level of processing is to rescale the final output.
If values of $Y_2$ are rescaled by the inverse of transformation $T_2$, then

$$Y_2^* = T_2^{-1}(Y_2) = T_1(I_a) + T_1(I_b). \tag{4}$$

The rescaled outputs $Y_2^*$ will show the additive structure underlying the components at Level 1, thereby permitting one to assess transformation $T_1$. As a matter of fact, this second procedure is only a small step removed from the first one: The second procedure derives the underlying additive metric by reassigning numerical values to the final outputs while maintaining the equalities (invariances) and rank orderings. The procedure is much akin to conjoint scaling (Luce & Tukey, 1964). The use of an inverse transformation, as in Equation 4, characterizes the approach of Curtis and Rule (e.g., Curtis, Attneave, & Harrington, 1968; Rule, Curtis, & Markley, 1970), who have sought to partial out judgmental from sensory transformations in magnitude estimation.

As the discussion in the preceding paragraphs already indicates, the hierarchical scheme of the present theory can readily be extended to include an additional transformation, namely the transformation that characterizes what happens when a person is called on to make a quantitative assessment of the perceptual experience, as in the method of magnitude estimation. If we grant that the output from some stage in intensity processing corresponds to loudness, the question still remains whether any given set of numerical responses provides a valid measure of loudness: When, if ever, are judgments of loudness proportional to loudness? Given that an additive structure operates at the level of loudness sensation, there is a way to answer the question. If the numerical judgments show the additive structure directly, the responses must be linearly related to the underlying scale values. This is the solution proposed by Anderson's (1970) theory of functional measurement.
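The two procedures (finding invariances by matching, and rescaling by the inverse of the second transformation) can be illustrated with a small simulation. The sketch below is hypothetical throughout: the power-function forms and exponents chosen for the two transformations, the bisection routine that stands in for a subject's matching adjustments, and all stimulus values are assumptions made for the example.

```python
def t1(intensity):
    return intensity ** 0.3          # assumed first-stage (sensory) transformation


def t2(total):
    return total ** 0.8              # assumed second-stage (judgmental) transformation


def t2_inverse(judgment):
    return judgment ** (1 / 0.8)


def response(i_a, i_b):
    """Equation 2: Y2 = T2[T1(Ia) + T1(Ib)]."""
    return t2(t1(i_a) + t1(i_b))


def match(i_a, target, low=0.0, high=1e6):
    """Simulated matching: adjust Ib by bisection until the response equals target."""
    for _ in range(200):
        middle = (low + high) / 2
        if response(i_a, middle) < target:
            low = middle
        else:
            high = middle
    return (low + high) / 2


# Procedure 1 (invariance): along an equal-sensation contour, T1(Ia) + T1(Ib) is constant.
target = response(50.0, 50.0)
contour = [(i_a, match(i_a, target)) for i_a in (10.0, 50.0, 90.0)]
contour_sums = [t1(i_a) + t1(i_b) for i_a, i_b in contour]

# Procedure 2 (rescaling): the inverse of T2 exposes the additive structure (Equation 4).
levels = [1.0, 10.0, 100.0, 1000.0]
raw_gaps = [response(i_a, 10.0) - response(i_a, 1.0) for i_a in levels]
rescaled_gaps = [t2_inverse(response(i_a, 10.0)) - t2_inverse(response(i_a, 1.0))
                 for i_a in levels]

print(contour_sums)    # equal along the contour
print(raw_gaps)        # shrink as Ia grows: the raw judgments are not additive
print(rescaled_gaps)   # constant: rows are parallel after rescaling
```

In a real experiment neither transformation is known in advance; the simulation only shows why equal responses pin down the first transformation regardless of the second, and why the correct inverse yields parallel rows.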
Even if the judgments are a nonlinear function of the underlying sensory scale, the scale may be uncovered by either or both of the two procedures just elucidated: (a) by determining invariances, that is, the equally loud compound stimuli whose loudness components add; and (b) by discovering the transformation of the response scale that reveals the additive structure of the loudness components. In sum, the two methods (determining invariances and rescaling quantitative judgments) can be used to uncover the underlying sensory or perceptual scale values that function at any of the several levels in the system.

The logic behind the present theory and analysis owes much to Anderson's (1970) functional measurement theory, which argues for the utility of simultaneously examining psychophysical scales and rules for concatenating scale values. This is a valuable approach, one that underlies the justification for much of the present substantive theory. On the other hand, the notion that determining a rule of concatenation and the scale values that underlie concatenation necessarily provides the basis for derivation of a unique psychophysical scale might mislead or confuse. For there can be more than one scale, each representing the cumulative effect of the sensory, perceptual, and cognitive processes that have occurred up to that stage in the hierarchy. A significant feature of the present theory is that it uses a common organizational principle to account for processes that range from critical-band summation to the generation of different scales underlying magnitude and interval judgments of loudness.

Spectral Shaping

By spectral shaping, I refer to the fact that the auditory system is not equally sensitive to sounds of all frequencies. That is, loudness depends on sound frequency as well as on sound energy (and on many other parameters, such as stimulus duration; the present theory provides a model for the way loudness depends on several other parameters of stimulation besides intensity). To a large extent, the nonuniform frequency response of the auditory system reflects the nonuniform transmission of sounds of different frequencies to the receptors (see Zwislocki, 1965). The main concern of the present theory, however, is to account for intensity relations; for example, how the loudness of a sound of given spectral composition depends on the intensity of the stimulus.
Critical-Band Summation

When acoustical energy falls within a relatively limited range of frequencies, the overall sensory effect depends on the total acoustical energy. The frequency range over which the auditory system sums energy is called the critical bandwidth. Critical bandwidth varies with center frequency; at 1000 Hz, the critical bandwidth is about 160 Hz wide (Scharf, 1970; Zwicker, Flottorp, & Stevens, 1957). The rule of energy summation has been tested primarily by means of loudness-matching experiments, in which a comparison noise or tone is matched to noises of various bandwidths or to combinations of tones. That is, the rule of energy summation has been evaluated primarily if not entirely by applying the principle of invariance, as described in the last section. Results of such studies using tones of variable spacing and different overall intensity levels (Zwicker et al., 1957) and tones of unequal numbers of components (Scharf, 1959) and unequal intensity (Scharf, 1962) support the general rule that as long as the energy falls within a critical band of sound frequencies, loudness is essentially independent of how the energy is distributed in the band (for an excellent review of the data, see Scharf, 1970). The rule of energy summation is codified in some models of loudness summation (e.g., that of Zwicker & Scharf, 1965).

The following experiment serves as a demonstration of energy summation within a critical band. Although the results of this experiment do not add new information about peripheral processes of summation, they do serve several valuable purposes. First, the experimental paradigm is the same sort that is used later in this article to demonstrate transformations and integration rules at other stages in the hierarchy of processing, and in doing so it shows how the rule of energy summation can be evaluated by rescaling numerical judgments, as described in the last section.
Second, the results, and their interpretation, point to an important principle about psychophysical scaling and in particular about the meaning of the psychological scale values derived from studies of responses to compound stimuli.

Two sinusoids of 1000 Hz and 1090 Hz were combined at different sound pressure levels according to a factorial design (see Anderson, 1970; Marks, 1978b); subjects evaluated the loudness of each two-component sound by trying to assign numbers in proportion to loudness (the method of magnitude estimation). According to the rule of critical-band summation, loudness should depend solely on the total energy in each stimulus, not on how the energy divides up between the two components.

[Figure 1: two panels. Lower abscissa: decibels SPL, 1000 Hz; upper abscissa: sound pressure in newtons/m².]

Figure 1. Magnitude estimates of the loudness of two-component sounds. (The estimates are plotted as a function of the nine levels of the 1000-Hz component, each of which was combined factorially with the same nine levels of the 1090-Hz component. The left-hand panel shows the estimates on a linear axis; the right-hand panel shows the same estimates on a logarithmic axis. Each symbol represents a constant level of the 1090-Hz component: −∞, 15, 20, 25, 30, 35, 40, 45, or 50 dB.)

Method

Apparatus. Two oscillators provided the pure-tone signals, which were amplified and attenuated independently before combining at a mixer. The combined output was gated by an electronic switch, which in turn linked to an interval timer. Each sound had a duration of 1 sec, with rise and fall times of 10 msec. Signals fed TDH-39 headphones, mounted in MX41/AR cushions, located in a sound-attenuated booth. Signal waveforms were monitored on an oscilloscope and voltages were monitored on a root-mean-square voltmeter.

Procedure. The procedure was magnitude estimation (Stevens, 1975). The subject was told to assign to the first stimulus whatever number seemed most appropriate to stand for the loudness; then, to succeeding stimuli, the subject was to assign other numbers in proportion to loudness. Subjects were told that they could use whole numbers, decimals, and fractions as needed.

The stimuli were constructed as follows: Each of nine intensity levels of the 1000-Hz component (including zero intensity) was combined with each of the same nine intensity levels of the 1090-Hz component, making 81 different stimuli in all. Stimuli were presented binaurally. In a given session, the subject judged all 81 stimuli, which were presented in irregular order. Following this, a 5-min rest was taken, after which the subject then judged all 81 stimuli again (this second set was presented in a different order). Twenty young men and women served as subjects.
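The factorial stimulus set can be enumerated directly. The following sketch builds the 81 level combinations and the quantity that, under the rule of critical-band summation, should alone determine loudness: the total energy. The nine levels are those listed in the Figure 1 caption; treating the two components as equally weighted (ignoring any spectral shaping between 1000 and 1090 Hz) and expressing energy relative to the 0-dB SPL reference are simplifying assumptions of the example.

```python
import math
from itertools import product

# Nine levels per component, in dB SPL; -inf stands for zero intensity.
levels_db = [-math.inf, 15, 20, 25, 30, 35, 40, 45, 50]


def relative_energy(level_db):
    """Energy relative to the 0-dB reference; -inf dB means the component is absent."""
    return 0.0 if math.isinf(level_db) else 10 ** (level_db / 10)


# Each entry: (level of 1000-Hz tone, level of 1090-Hz tone, total relative energy).
design = [
    (a, b, relative_energy(a) + relative_energy(b))
    for a, b in product(levels_db, repeat=2)
]

print(len(design))
print(design[-1])
```

Cells that share the same total energy (for example, a single component at 33 dB and two components at 30 dB each, if such levels were included) are the ones the summation rule predicts to be equally loud.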

Results and Discussion

The magnitude estimates assigned to each stimulus were averaged geometrically and the means were plotted on a linear axis against the sound pressure level (SPL) of the 1000-Hz component (Figure 1, left-hand panel). Each contour represents a fixed SPL of the 1090-Hz component. The most salient characteristic of these data is the convergence of the contours. Adding a low level of one stimulus component to a low level of the other augments loudness sensation notably, but adding a low level to a high one effects virtually no change in loudness. This outcome agrees with a model of energy summation (Zwicker et al., 1957; Zwicker & Scharf, 1965); that is, with a sequence of processes in which quantities proportional to acoustical power or energy (power integrated over a finite period of time) first add linearly, after which the total is subjected to a nonlinear transformation (by means of a negatively accelerated function). Adding two equally intense components increases loudness but does not double it. The nonlinear transformation provides a law of diminishing returns and results in the convergence evident in the left-hand panel of Figure 1.

[Figure 2. Abscissa: total energy, $E_{1000} + E_{1090}$.]

Figure 2. Magnitude estimates from Figure 1, plotted against the total energy in each stimulus. (The symbols are as in Figure 1.)

The data are consistent with the following equation:

$$J = g(E_{1000} + E_{1090}), \tag{5}$$

in which E represents a scale proportional to sound energy and the function g relates the loudness estimates $J$, nonlinearly, to $\Sigma E$. That the sum of the energies of the two components predicts the response can be gleaned from Figure 2, which plots, now in log-log coordinates, the estimates of loudness versus the total energy. (Use of the log scale is more appropriate than use of the linear axis because the variability around each mean magnitude estimate tends to be proportional to the mean; it is the percentage variability that is roughly constant.) Over all but low energy levels, the relationship between loudness judgment and total energy conforms to a straight line, that is, to a power function with an exponent re sound energy of .26. Since sound energy is proportional to the square of sound pressure ($E = cP^2$), this corresponds to an exponent re sound pressure of .52.

But more important to the present purpose than the form of this particular psychophysical function is the demonstration that the loudness of two components lying close in frequency depends on the total energy. Though there is scatter about the function drawn in Figure 2, there are no large systematic deviations. Another view of the fit appears in the right-hand panel of Figure 1, which reproduces the factorial plot of the judgments, but this time with the judgments put on a logarithmic axis. The curves represent the prediction based on the model of energy summation; that is, the curves show the values determined from the psychophysical function in Figure 2.

Besides displaying the already well-known property of energy summation within a critical band, these data make a second important point. Figure 3 takes the curves from the right-hand panel of Figure 1 and rescales them, plotting the "rescaled loudness" on a linear axis as a function of the SPL of the 1000-Hz component. The rescaling was based on a monotonic transformation of the curves, specifically, by taking the inverse of the psychophysical function shown in Figure 2 (raising the estimates to the 1.92 [= 1/.52] power). The rescaling produces a set of parallel curves, parallelism being a visual criterion of linear additivity (Anderson, 1970). Of course, what the rescaling does is, in effect, make the new scale directly proportional to sound energy. Since it is sound energy that sums linearly, converting the response to energy must, therefore, yield a picture of linear additivity.

Thus the present set of data provides an important caveat with regard to the interpretation of certain psychophysical scaling experiments. The demonstration of additive effects often plays a most useful role in evaluating psychological processes, and several approaches (including some called functional measurement [Anderson, 1970] and conjoint scaling [Luce & Tukey, 1964]) rely on discovery of underlying additive (and other arithmetic) processes. Although there is nothing wrong with doing this, it is important to exercise caution in deciding how to identify or name the quantities that add. Often, there is the temptation to assume that the quantities that add are the psychological values; to assume, for instance, that they are loudnesses or brightnesses. But I doubt that many people would want to identify the additive quantities in the present experiment as loudnesses. One of the main points of this article is to argue and, I hope, to demonstrate, that processes that show clear metric structure (such as linear addition) may operate at different levels in the same sensory system and may operate on different sets of underlying psychological values. It is not always prima facie clear which set of scale values to call loudnesses. The decision as to which values are loudnesses must perforce depend on other sorts of criteria besides metric structure.
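The convergence of the contours follows directly from summing energies before a compressive transformation. The sketch below evaluates the energy-summation model with the function g assumed to be a pure power function with the reported exponent of .26 re sound energy; that assumption ignores the departure from a straight line that the text notes at low energy levels, and the function names and the unit scale factor are hypothetical.

```python
import math


def relative_energy(level_db):
    """Energy relative to the 0-dB reference; -inf dB means the component is absent."""
    return 0.0 if math.isinf(level_db) else 10 ** (level_db / 10)


def predicted_estimate(level_1000_db, level_1090_db, exponent=0.26, scale=1.0):
    """Energy-summation model with g assumed to be a power function of total energy."""
    total = relative_energy(level_1000_db) + relative_energy(level_1090_db)
    return scale * total ** exponent


alone = predicted_estimate(40, -math.inf)
equal_pair = predicted_estimate(40, 40)
low_on_high = predicted_estimate(50, 15) / predicted_estimate(50, -math.inf)
low_on_low = predicted_estimate(15, 15) / predicted_estimate(15, -math.inf)

print(equal_pair / alone)   # 2 ** 0.26, about 1.2: an increase, but far from doubling
print(low_on_high)          # barely above 1
print(low_on_low)           # again 2 ** 0.26
```

Under this assumed form, a 15-dB component added to a 50-dB component changes the predicted estimate by less than a tenth of a percent, whereas the same component added to another 15-dB component raises it by about 20 percent.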

[Figure 3: two panels. Lower abscissa: decibels SPL, 1000 Hz; upper abscissa: sound pressure in newtons/m².]

Figure 3. Magnitude estimates from Figure 1, rescaled to produce a linearly additive structure. (The left-hand panel shows the rescaled values on a linear axis; the right-hand panel shows them on a logarithmic axis. The rescaled values are proportional to sound energy.)

Transformation to Loudness: The L Scale

As will become clear in this section, the L scale is closely approximated by results of experiments that use methods like magnitude estimation and magnitude production to quantify the subjective intensities of sounds: The L scale is, roughly, Stevens's (1955) sone scale, the scale generated by averaging results of magnitude estimation and magnitude production. This does not mean that the results of any given experiment in which loudness is scaled by magnitude estimation or magnitude production necessarily yield a close approximation to the L scale. A review (Marks, 1974a) of many published attempts to scale loudness by magnitude estimation and production showed that the exponents of power functions obtained in different studies cover about a twofold range. Moreover, there are well-known systematic effects, for instance, magnitude estimation giving smaller exponents, on the average, than magnitude production (Stevens & Greenbaum, 1966; but see Teghtsoonian & Teghtsoonian, 1978). Still, typical values of the exponent relating loudness judgments to sound pressure of, say, a 1000-Hz tone fall largely in the range .5 to .6 when the method is magnitude estimation and .6 to .7 when the method is magnitude production. The average of these subaverages is about .6 (Marks, 1974a) and, as will be shown, a .6-power function provides at least a good approximation to the scale, L, that underlies loudness summation. In some of the results to be described later, magnitude estimates correspond closely to values on the L scale; in other cases, the estimates do not correspond to values of L and thus require a nonlinear transformation to bring them in line with L.

Two important matters need consideration. First, given the conclusion to the previous section, I am under some compulsion to support the claim that the L scale is appropriately called the loudness scale. (My argument appears later.) Second, it is necessary to validate the L scale itself; that is, to demonstrate that there is a sensory transformation from acoustical energy to values of L and that this transformation reflects processing that takes place in the auditory system. Not everyone is willing to accept at face value the validity of any given set of numerical judgments like those obtained in magnitude estimation experiments or of the scale determined by averaging across many published studies. Even if one is unwilling to do so, however, it is possible to demonstrate the transformation to L. This comes about in the context of establishing linear rules of loudness summation.

The rules of summation (indeed, the rule of summation, for it is the same rule) apply to two processes: summation of loudnesses of widely separated sound frequencies and summation of loudnesses of the same sound frequency in the two ears. The perceptual result of combining multiple components differs in these two paradigms.
In the former paradigm (adding across frequency) the addition is analytic in that loudnesses add, but in at least certain circumstances the individual components can be perceived individually (and responded to individually); in the latter paradigm (adding sounds with the same frequency across the ears) the addition is synthetic, for once the components add, the result is a single, wholly fused sound image.

Analytic Addition

Scaling of L and addition of L across frequency. The transformation from E to L reveals itself in the way that loudness summates across widely spaced sound frequencies. Linear loudness summation was first suggested by Fletcher and Munson (1933) and later evaluated by Howes (1950). However, these early studies lacked a rigorous test of the additivity rule. More recently, Zwislocki, Ketkar, Cannon, and Nodar (1974) found evidence for linear loudness summation when two equally loud tones, widely separated in frequency, were presented in rapid succession. The loudness summation was equivalent to a gain of 10 dB, which is a doubling of loudness on the sone scale.

A more thorough empirical demonstration of linear loudness summation is given in the following experiment, which was patterned after the experiment on critical-band summation described above. Apparatus and procedure were identical to the previous experiment, with a single exception, the frequency spacing of the stimuli. In the present study, the pure tones had frequencies of 300 Hz and 1000 Hz. These two frequencies are separated by about 5½ critical bandwidths, and pilot research showed that additional separation in frequency had no further effect on loudness summation.2 Each of 20 subjects gave two magnitude estimates

2 In the pilot study, three subjects matched the loudness of a comparison tone at 635 Hz to the loudness of a two-tone complex (geometric center at 635 Hz) as the frequency separation (Δf) of the components increased from 100 to 1800 Hz. (The components were all matched to the loudness of a 635-Hz tone at 50 dB SPL.) For one subject, matched loudness did not change with increasing Δf; this subject appeared to be matching to one tone in the complex. For the other two subjects, the loudness summation in decibels increased from about 3 dB at a Δf of 100 Hz to about 10 dB for a Δf of 700 Hz and greater.
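The arithmetic linking a .6-power function of sound pressure to the 10-dB doubling rule can be checked with a minimal sketch. It assumes that L is exactly proportional to sound pressure raised to the power .6, which the text offers only as a good approximation; the function names are hypothetical and chosen for illustration.

```python
import math

# Assumed exponent relating L to sound pressure (the text's approximate value).
EXPONENT = 0.6

def loudness_ratio(delta_db, exponent=EXPONENT):
    """Factor by which L grows when sound pressure level rises by delta_db."""
    pressure_ratio = 10 ** (delta_db / 20.0)  # dB SPL is 20*log10 of pressure ratio
    return pressure_ratio ** exponent

def equivalent_gain_db(loudness_factor, exponent=EXPONENT):
    """Level increase (dB) of a single tone that multiplies L by loudness_factor."""
    return 20.0 * math.log10(loudness_factor) / exponent

# A 10-dB increase in level roughly doubles L under this assumption.
print(round(loudness_ratio(10.0), 3))
# Two equally loud components that add linearly double L, which is
# equivalent to raising one component by roughly 10 dB.
print(round(equivalent_gain_db(2.0), 2))
```

Under this assumption, linear addition of two equally loud components corresponds to a gain of about 10 dB, in line with the summation reported by Zwislocki et al. (1974) and with the pilot data for large frequency separations.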


converge at the upper right. As was mentioned earlier, it is well-known that the exponent of the power function fitted to a set of loudness judgments (magnitude estimates) varies somewhat from experiment to experiment. It is worth repeating that results of other experiments, to be described below, show the additivity of components directly in the numerical judgments.

The small tendency of the curves to converge can be assessed quantitatively and can readily be eliminated by the following procedure: First, I calculated the marginal mean for each level of the 1000-Hz tone; this was accomplished by averaging data across the nine SPLs of the 300-Hz tone. Then the magnitude estimates were plotted against the marginal means in a graph analogous to that of Figure 4. This is the test plot. If the magnitude estimates [...] D being a nonlinear function of L.


Table 1
Median Judgments of Loudness Difference

Columns, Stimulus j (dB SPL): 15 mon | 20 mon | 20 bin | 30 mon | 30 bin | 40 mon | 40 bin | 50 mon
Rows, Stimulus i (dB SPL): -∞ (blank) | 15 mon | 20 bin | 30 mon | 30 bin | 40 mon

Entries (row and column alignment not preserved):
.24
.46  .50
.19  .375
.87  .66
1.46  1.11
1.45  1.16
.75  .37
.80  .61
2.38  1.96  1.58  1.31
2.50  2.25  1.65  1.44  1.07
.83  .74
.81

Note. mon = monaural, bin = binaural. The stimuli that made up each pair (Si and Sj) were either monaural or binaural.

Method

Apparatus. The 1000-Hz signal was split into four channels, two of which formed the components of the less loud signal in each pair (Si), the other two the components of the louder signal in each pair (Sj). One component of each pair went to the left ear, the other to the right ear. Thus Si comprised Pi,l and Pi,r, and Sj comprised Pj,l and Pj,r. Each sound stimulus lasted 1 sec and had rise and decay times of 10 msec. A 1-sec pause separated the end of the first sound from the start of the second.

Stimuli. The stimulus levels were chosen so as to distinguish effectively the scale underlying binaural summation from the scale underlying judgments of loudness differences. The louder stimulus of each pair, Sj, could take on any of the following levels: 15 dB, monaural; 20 dB, monaural; 20 dB, binaural; 30 dB, monaural; 30 dB, binaural; 40 dB, monaural; 40 dB, binaural; and 50 dB, monaural. The less loud stimulus of each pair, Si, could take on any of the following levels: zero intensity (a blank stimulus); 15 dB, monaural; 20 dB, binaural; 30 dB, monaural; 30 dB, binaural; and 40 dB, monaural. Stimulus pairs (Si, Sj) were chosen to ensure that Sj would in fact be louder than Si. That is, when both Si and Sj were presented in the same manner (either monaurally or binaurally), Sj had greater SPL; when one of them was binaural and the other monaural, the combinations were chosen on the basis of results of previous experiments on binaural summation to ensure that L(Sj) > L(Si). In addition, the experimental design made it possible to measure directly the degree of binaural summation: By including a blank stimulus in the set of S