Identification of multidimensional complex sounds having parallel dimension structure
Laurel A. Christensen
Department of Communication Disorders and Kresge Hearing Research Laboratory, Louisiana State University Medical Center, 1900 Gravier Street, New Orleans, Louisiana 70112
Larry E. Humes
Audiology Research Laboratory, Department of Speech and Hearing Sciences, Indiana University, Bloomington, Indiana 47405
(Received 28 December 1994; revised 14 August 1995; accepted 4 December 1995)
The present series of experiments examined the ability of normal-hearing listeners to make use of cues from multiple, independent stimulus dimensions when classifying multidimensional complex sounds. Ten listeners classified complex sound pulses that differed along three independent dimensions. The stimuli were 100 ms in duration and were synthesized using five simultaneous sinusoids. The three dimensions of the complex stimuli manipulated in this experiment were harmonicity, spectral shape, and amplitude envelope. Each stimulus dimension could take on one of two values, referred to here as target or nontarget. Eight stimuli were synthesized using all possible combinations of the dimension values. Subjects were trained to label two stimuli, the two having either all-target or all-nontarget values, as “1” and “0,” respectively. Following this training, subjects were asked to classify all eight stimuli as either “1” or “0.” Results of this experiment indicated that listeners preferred to classify these stimuli on the basis of one stimulus dimension. However, the preferred dimension was not the same for all of the listeners. In addition, it was demonstrated that it was possible to train an individual to use a dimension other than the one or two initially preferred when classifying the stimuli. © 1996 Acoustical Society of America.
PACS numbers: 43.66.Mk, 43.66.Fe, 43.66.Lj
INTRODUCTION
Considerable research over the last 40 years has focused on the acoustic-phonetic patterns of speech and the perceptual cues contained within them. Of specific interest has been the use of these often redundant perceptual cues by listeners to make acoustic-phonetic distinctions. Multiple cues are available for a listener to make acoustic-phonetic distinctions. For example, when a listener perceives a change in the speech signal from one phoneme to another, acoustic changes occur along several dimensions, including frequency, amplitude, and time. These changes can result in several different physical cues signaling a single phonetic perceptual contrast. Of interest in this research is how listeners make use of multiple, independent cues to classify stimuli.
Listeners learn to process speech efficiently after extensive training during infancy and early childhood. In order to avoid the familiarity of the overlearned speech signal, novel multidimensional nonspeech sounds have been used to examine the way in which listeners use multiple cues to classify stimuli.
Kidd and Watson (1987) explored the ability of listeners to categorize complex nonspeech sounds based on information in multiple independent stimulus dimensions. Specifically, listeners classified complex sounds with three independent dimensions to determine the degree to which listeners allocate attention to each dimension and their ability to integrate this information over a series of sound pulses. Listeners categorized stimuli as “targets” or “nontargets” on the basis of stimulus values for each of three dimensions.
J. Acoust. Soc. Am. 99 (4), Pt. 1, April 1996
These dimensions included harmonicity (the relationships among the frequencies of the complex sound), spectral shape (the spectral envelope of the complex sound), and amplitude envelope (the temporal envelope of the stimulus waveform). It was determined that optimal performance for an ideal observer was achieved by categorizing stimuli with target values on fewer than two dimensions as “nontarget” and stimuli with two or more target values as “target.” Results of this study indicated that listeners tended to prefer or more heavily weight certain dimensions of the brief bursts of complex sound. Of particular interest to these investigators was the observation that listeners did not always prefer the same stimulus dimension. Finally, these investigators subsequently attempted to train listeners to determine if attentional biases could be eliminated (G. R. Kidd, personal communication). Results of these studies indicated that a subject’s preferred dimension was altered very little following training. That is, listeners were not successfully trained to utilize the three dimensions on a more equal basis for classification as “target” or “nontarget.”
Complex sound classification was also studied by Howard and O’Hare (1984). These investigators examined which physical properties of a complex sound were responsible for similarity judgments of these sounds. It was found using multidimensional scaling (MDS) analysis that both temporal properties (periodicity and pitch) and spectral shape properties (tilt and compactness) correlated with the dimensions revealed through MDS. Furthermore, listeners with musical backgrounds tended to focus on the temporal properties, while those without musical backgrounds focused on the spectral shape properties.
In addition to the above cited work on nonspeech stimuli, Garner (1970) studied the ability of listeners to classify stimulus sets that contained various numbers of stimulus properties. Garner proposed that a listener’s ability to classify these stimuli depended on the perceptual relation of the properties. Specifically, Garner (1970) defined two types of properties, integral and separable. Integral properties are perceived as fused together. That is, if a stimulus has two dimensions and integral properties, the two fuse as one and the separate dimensions cannot be heard out. On the other hand, separable properties exist when the dimensions can be perceived individually. Thus for a stimulus with separable dimensions, more cues can be perceived and analyzed. Pollack and Ficks (1954) found that when the properties of sounds are varied independently (separable properties), the number of sounds that can be correctly identified increases. This finding is in agreement with the ideas of Garner (1970).
Finally, Bregman (1978) and Van Noorden (1971) have studied streaming, or the tendency for sequences of sounds to appear as one object. In these studies, patterns of tones have been perceived as a single stream when there is similarity in spectral, spatial, intensive, and/or temporal characteristics (Bregman and Campbell, 1971). The strongest cue for stream segregation has been found to be the spectral characteristics (Bregman, 1978). Thus components are more likely to be perceived as separate streams if they differ widely in frequency. However, Van Noorden (1971) found that subjects could experience streaming of sounds depending upon the instructions given and the attention that the subjects placed on the perceptual dimensions. Thus streaming of sounds could be manipulated by attention.
The present series of experiments examined the ability of normal-hearing listeners to make use of cues from multiple, independent stimulus dimensions when classifying multidimensional complex nonspeech sounds. An additional goal was to determine if listeners could be trained to shift their attention to a particular stimulus dimension for classification purposes.
FIG. 1. Target (“1”) and nontarget (“0”) values on the three dimensions of the sound-pulse stimuli.
The stimuli utilized in this experiment were sound pulses used previously by Kidd and Watson (1987). These sound pulses were synthesized by adding five simultaneous sinusoids and were 100 ms in length. The sound pulses differed on three parallel dimensions: harmonicity, spectral shape, and amplitude envelope. The three dimensions were parallel in that stimulus attributes along all three dimensions occurred simultaneously throughout each sound pulse. These dimensions were chosen by Kidd and Watson (1987) because of their importance in identifying “many man-made or other single-source sounds.” Each of these three dimensions could take on one of two possible values: target (1) or nontarget (0). Target values were chosen to represent natural single, meaningful events, while nontarget values were chosen to represent background or nonmeaningful events. The three dimensions and their possible values can be seen in Fig. 1.
The harmonicity dimension (top of figure) refers to the relationships among the frequencies of the sinusoids. The target (or “1”) value on this dimension was harmonic. That is, the sinusoidal components were at successive integer multiples of the fundamental frequency at 440 Hz. The nontarget (or “0”) value for this dimension was inharmonic. For this value, the second sinusoidal component was shifted from 880 to 740 Hz. The spectral-shape dimension (middle of figure) refers to the relative amplitudes of the five sinusoidal components. For this dimension, the nontarget (or “0”) value consisted of sinusoids that decreased linearly in amplitude as frequency increased. In contrast, the target (or “1”) value consisted of an increase in the amplitude of the
L. A. Christensen and L. E. Humes: Multidimensional sounds
I. GENERAL METHOD
A. Subjects
A total of 12 normal-hearing subjects participated in this project. Subjects ranged in age from 19 to 28 years (M = 21.25 years). Criteria for normal hearing included pure-tone air-conduction thresholds of less than 20 dB HL (ANSI, 1989) from 500 to 4000 Hz and normal tympanograms in the test ear. All testing in this study was completed monaurally with the test ear selected arbitrarily.
B. Stimuli apparatus
TABLE I. Values on the three dimensions of the multidimensional sound-pulse stimuli utilized in experiment 1. “0” means a nontarget value for the stimulus dimension, whereas “1” refers to a target value.
were needed to complete testing in this experiment. All subjects were paid for their participation. The series of three experiments is described below.
Stimulus   Harmonicity   Shape   Amp. Envel.
A          0             0       0
B          1             0       0
C          0             1       0
D          1             1       0
E          0             0       1
F          1             0       1
G          0             1       1
H          1             1       1
fourth sinusoidal component with a corresponding decrease in amplitudes of the second and third sinusoidal components to maintain constant rms energy for both the target and nontarget stimuli. Finally, the amplitude-envelope dimension (bottom of figure) refers to the envelope of the stimulus waveform. The target (or “1”) value for this dimension consisted of a 5-ms linear rise time with an immediate 5-ms linear return to steady-state amplitude. The decay time at offset was also linear and 5 ms. The nontarget (or “0”) value consisted of a linear 50-ms rise time followed immediately by a linear 50-ms decay.
Eight sound-pulse stimuli were synthesized using every combination of the values on the three dimensions. These sound pulses, labeled A to H along with their corresponding values (“1” or “0”) on each dimension, can be seen in Table I. All of the sound-pulse stimuli were equated for rms amplitude.
Three digital audio tape (DAT) recordings of the sound-pulse stimuli were generated for this experiment. Tapes were generated by playing out the sound pulses from a computer (PDP-11/83) through a digital-to-analog converter (Micro Tech Unlimited, Digisound-16) at a sampling rate of 10 000 Hz. The pulses were then low-pass filtered at 3.9 kHz (Micro-Tech Unlimited, Digisound 16) and sent to an attenuator (Hewlett–Packard, model 350B). Finally, the output of the attenuator was sent to the input of a DAT recorder (Panasonic, SV-3500).
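The factorial structure of the stimulus set described above can be sketched in a few lines of Python. This is only an illustration of how the eight labels A to H map onto the three binary dimensions of Table I and how the five component frequencies follow from the harmonicity value; the function and variable names are hypothetical, and the component amplitudes and envelopes are not modeled because their exact values are not given in the text.

```python
from itertools import product

FUNDAMENTAL_HZ = 440.0  # fundamental frequency stated in the text


def component_frequencies(harmonic):
    """Return the five component frequencies in Hz.

    The harmonic (target) value uses successive integer multiples of 440 Hz.
    The inharmonic (nontarget) value shifts the second component from 880 to 740 Hz.
    """
    freqs = [FUNDAMENTAL_HZ * k for k in range(1, 6)]
    if not harmonic:
        freqs[1] = 740.0
    return freqs


def build_stimulus_set():
    """Return {label: {dimension: 0 or 1}} for stimuli A to H, ordered as in Table I."""
    stimuli = {}
    # product() varies its last element fastest, so unpacking each tuple as
    # (amp, shape, harm) makes harmonicity the fastest-changing dimension.
    for label, (amp, shape, harm) in zip("ABCDEFGH", product((0, 1), repeat=3)):
        stimuli[label] = {"harmonicity": harm, "shape": shape, "amp_envelope": amp}
    return stimuli


if __name__ == "__main__":
    for label, values in build_stimulus_set().items():
        print(label, values, component_frequencies(bool(values["harmonicity"])))
```

Stimulus A comes out as the all-nontarget pulse and stimulus H as the all-target pulse, matching the first and last rows of Table I.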
II. EXPERIMENTS
A. Experiment 1: Similarity judgments
1. Method
For this experiment, a tape was generated that consisted of the sound-pulse stimuli played out in pairs. From the eight sound-pulse stimuli, it was possible to create 64 different pairs of sound pulses. All 64 sound-pulse pairs were utilized to balance the order of presentation of the sound pulses in each pair. Sound pulses were recorded with 1 s between the members of each pair and 5 s between each pair. The level of each pair was roved over a 20-dB (65–85 dB SPL) range in 2-dB steps. The levels were roved randomly using an attenuator (Hewlett–Packard, model 350B) from trial to trial to ensure that the loudness of the sound pulses would not be a factor during the experiment. Ten randomizations of the 64 pairs were recorded.
The subjects were presented with all ten randomizations of the 64-pair lists and asked to make similarity judgments about each pair of stimuli. Specifically, the subjects were asked to assign a value between one and nine to each pair of sound pulses. A rating of one indicated that the subject felt the two sound pulses were similar, while a rating of nine indicated that the subject felt the two sound pulses were not similar.
2. Results and discussion
Subjects were presented three sets of taped materials (described below) over headphones at 75 dB SPL. Tapes were played back via a DAT player (Panasonic, SV-3500). This was accomplished by routing the output of the DAT player to the input of an audiometer (Grason–Stadler, model 162). The output of the audiometer was then delivered to 13 pairs of TDH-39 headphones mounted in MX-41/AR cushions. Calibration was accomplished by adjusting the level of a 1-kHz calibration tone to 75 dB SPL. The rms level of the calibration tone matched that of the stimuli. Responses were collected in a pencil-and-paper format. All testing was completed in a large acoustically treated room with noise levels low enough to permit threshold measurements with headphones to within 15 dB of audiometric zero from 250 to 8000 Hz (ANSI, 1991). Approximately three 2-h sessions
The data from the similarity judgments performed by the subjects were submitted to multidimensional scaling (MDS) analysis to verify the dimensionality of the stimuli. This was accomplished by averaging the ten ratings given to each pair of sound pulses without regard to presentation order to create a triangular matrix for each subject. These matrices were then submitted to alternating least-squares scaling (ALSCAL) using an individual difference scaling (INDSCAL) model. An INDSCAL model was utilized so that both a coordinate space and individual subject weights could be determined. The fundamental assumption in using MDS to find a coordinate solution is that the same dimensional space underlies performance for all subjects; individual subjects differ, however, in the amount of weight they put on each dimension.
Results of this analysis are summarized graphically in Fig. 2. The best fitting solution for the data was three dimensional with an r² value of 0.917 and a stress value of 0.081. This figure provides a series of two-dimensional plots of the derived perceptual coordinates from the MDS solution. The legend in the bottom right-hand corner of this figure indicates the values (“1” and “0”) on the three stimulus dimensions (H = harmonicity, S = spectral shape, A = amplitude envelope) for all eight stimuli. Based on the positions of the stimuli in these perceptual spaces, the dimensions have been labeled to correspond to the three physical dimensions of the stimuli manipulated here (spectral shape, amplitude envelope, and harmonicity).
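The averaging step that precedes the scaling analysis can be illustrated with a short Python sketch. It pools the ratings for both presentation orders of a pair and returns one mean dissimilarity per unordered pair. The function names and the dictionary layout are hypothetical, and leaving the identical-stimulus pairs out of the triangular matrix is an assumption, since the text does not say how those pairs were handled. The scaling itself (ALSCAL with an INDSCAL model) is not reproduced here.

```python
from itertools import product
from statistics import mean

LABELS = "ABCDEFGH"


def ordered_pairs():
    """All 64 ordered pairs of the eight stimuli, including both presentation orders."""
    return list(product(LABELS, repeat=2))


def triangular_matrix(ratings):
    """Average ratings over repetitions and presentation order.

    `ratings` maps an ordered pair such as ("A", "B") to a list of 1-9 ratings.
    Returns {(x, y): mean rating} for the 28 unordered pairs with x before y.
    Identical pairs such as ("A", "A") are skipped (an assumption of this sketch).
    """
    matrix = {}
    for i, x in enumerate(LABELS):
        for y in LABELS[i + 1:]:
            pooled = ratings.get((x, y), []) + ratings.get((y, x), [])
            if pooled:
                matrix[(x, y)] = mean(pooled)
    return matrix


if __name__ == "__main__":
    # Placeholder ratings only, not data from the study.
    demo = {pair: [5] * 10 for pair in ordered_pairs()}
    demo[("A", "H")] = [9] * 10
    demo[("H", "A")] = [7] * 10
    print(triangular_matrix(demo)[("A", "H")])
```

One such matrix per subject would then be passed to whatever scaling routine is available.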
C. Materials/procedures
FIG. 2. MDS stimulus coordinates for the eight sound-pulse stimuli from the similarity judgment experiment (experiment 1). The eight stimuli are labeled as A through H and their corresponding values on each dimension are shown in the legend.
The graph in the upper left-hand corner of the figure plots the spectral-shape dimension by the harmonicity dimension. It can be seen from this figure together with the legend that stimuli A, B, E, and F all had the same value, “0,” of spectral shape, while C, D, G, and H all had the “1” value of spectral shape. In addition, stimuli A, C, E, and G all had the “0” value of harmonicity, while stimuli B, D, F, and H all had the “1” value of harmonicity. In the upper right-hand graph, the spectral-shape dimension is plotted as a function of the amplitude-envelope dimension. Again, it can be seen from this graph and the legend that stimuli A, B, C, and D all had the “0” value of amplitude envelope, while E, F, G, and H all had the “1” value of amplitude envelope. The coordinates for the spectral-shape dimension were identical to those in the upper left-hand graph. The bottom graph plots the harmonicity dimension by the amplitude-envelope dimension. Thus the derived stimulus coordinates from the MDS solution indicated that the stimuli were being perceived along three dimensions that corresponded to the stimulus dimensions of spectral shape, amplitude envelope, and harmonicity.
In addition to verifying the existence of three perceptual dimensions associated with each of the stimulus dimensions manipulated, the INDSCAL solution was also used to derive weights for each individual subject on each dimension. These weights are tabled in Appendix A. The weights on the dimensions indicate the listeners did not pay equal attention to the three dimensions. Of interest is that the dimension that some subjects paid most attention to was not the same dimension to which other subjects attended. That is, each of the subjects had a different pattern of weights for the three dimensions. Six subjects attended most to the shape dimension, four attended most to the harmonicity dimension, one attended most to the amplitude-envelope dimension, and finally, one subject paid approximately equal attention to both the shape and harmonicity dimensions. In the next experiment, the attention subjects paid to each dimension was investigated further.
B. Experiment 2: Exemplar training
1. Method
Ten of the 12 subjects from the similarity judgment experiment participated in this experiment, which consisted of training in the classification of these multidimensional stimuli. Two of the 12 subjects withdrew from the study after the first experiment. A second tape was generated for this experiment. The purpose of the first section of tape two was to train the subjects to label two exemplar sound pulses. The two exemplar sound pulses consisted of the all-target (1 1 1) stimulus (Table I, stimulus H; harmonic, spectral bump, abrupt amplitude envelope) and the all-nontarget (0 0 0) stimulus (Table I, stimulus A; inharmonic, monotonically decreasing components, and gradual amplitude envelope). Ten repetitions of each exemplar sound pulse were first presented to the subjects, followed by alternating presentation of exemplar stimuli for ten presentations each. The sound-pulse stimuli were recorded so that 1 s separated the presentation of each new sound-pulse stimulus. The subjects were instructed as to the sequence of sounds and were trained to label the all-target sound pulse as “1” and the all-nontarget sound pulse as “0.”
Following this training or familiarization, in the second section of tape two, subjects were presented 25 of the all-target sound pulses and 25 of the all-nontarget sound pulses in a random order and asked to label each sound pulse as either “1” or “0.” These sound pulses were recorded with 5 s between the presentation of each new sound pulse and the presentation levels were again roved over a 20-dB range in 2-dB steps. Subjects were required to perform this task with 90% accuracy prior to continuing to the next task. All subjects reached 90% accuracy after only one training session.
The last section of this tape consisted of all eight stimuli presented to the subjects 80 times each in a random order with 5 s between presentations of new sound pulses.
The presentation levels of the sound pulses were roved randomly over a 20-dB range in 2-dB steps. During this portion of the experiment, referred to as the classification task, the subjects were asked to classify all of the eight sounds as being either “1” or “0.” The subjects’ only instructions were to classify the sound they heard as “1” if it sounded most like what they learned as “1,” or “0” if it sounded most like what they learned as “0.”
2. Results and discussion
The data collection for the classification task was divided over 2 days. Because of this, the first ten responses to the eight stimuli made on each of the 2 days were submitted to
reliability analyses. This reliability analysis included calculation of Cronbach’s alpha and a repeated-measures analysis of variance. Results of the ANOVA for these subjects indicated that the responses on the 2 days were not significantly (p > 0.05) different for any of the eight stimuli. Cronbach’s alpha reliability coefficients for all eight stimuli exceeded 0.92. Consequently, data across the two test sessions were pooled for subsequent analyses.
Plotted in Fig. 3 are the results of the classification of the eight stimuli from three subjects (RM, VS, BP). In each of the graphs in this figure, the eight stimuli are plotted along the abscissa and the percentage of responses in which each stimulus was labeled as “1” is plotted along the ordinate. The last 60 of the 80 total classifications of each of the eight stimuli were used to calculate these percentages. The first ten responses on each of the 2 days were treated as practice and discarded. In the top graph of this figure, it can be seen that this subject (RM) labeled stimuli B, D, F, and H as “1” 88%–100% of the time and A, C, E, and G as “1” 0%–2% of the time. In other words, A, C, E, and G were being labeled as “0” 98%–100% of the time. The legend in the figure indicates that stimuli B, D, F, and H all were “1” on the harmonicity dimension, while stimuli A, C, E, and G were all
“0” on this same dimension. Therefore, it was concluded that this subject was utilizing the harmonicity dimension to classify the sound-pulse stimuli. It is important to note that even stimuli B and G, in which the values on both of the other two dimensions were opposite to that for the harmonicity dimension, were still being classified on the basis of the harmonicity dimension. That is, even when the values on two of the three dimensions were “1,” as for stimulus G, this subject still labeled this stimulus as “0,” the value on the harmonicity dimension.
In the middle graph of Fig. 3 is an example of an individual (VS) who used the amplitude-envelope dimension to classify the sound-pulse stimuli. This subject labeled stimuli E, F, G, and H as “1” 78%–100% of the time and A, B, C, and D as “1” 0%–4% of the time. Thus from the legend it can be determined that this subject used the amplitude-envelope dimension because stimuli E, F, G, and H were all “1” on the amplitude-envelope dimension, whereas stimuli A, B, C, and D were “0” on this dimension. In the bottom graph, another subject (BP) labeled C, D, G, and H as “1” 98%–100% of the time, whereas A, B, E, and F were labeled “1” 0%–9% of the time. Again, from the legend it can be determined that this subject was using the spectral-shape dimension to classify these stimuli. Thus results of this part indicated that many subjects were attending exclusively or primarily to one dimension to classify the stimuli.
Analysis of the classification data from all subjects indicated that there were five subjects primarily using harmonicity, three subjects primarily using amplitude envelope, and one subject who was emphasizing spectral shape. Two subjects appeared to be using a “majority rules” pattern.
That is, these two subjects were classifying the eight stimuli as “1” when two or more of the values on the three dimensions were “1” and as “0” when two or more of the values were “0.” Finally, one subject who participated in this experiment showed no identifiable pattern of results.
The generalized context model (GCM; Nosofsky, 1986) was used to calculate the perceived dimension weights for each subject. The GCM is based on the assumption that classification of a stimulus will be determined by how similar it is to exemplar members of each category. The GCM is used in conjunction with the MDS solution and can predict performance in classification experiments involving the same set of stimuli scaled in the MDS solution (Nosofsky, 1992). In the GCM, similarity among stimuli is determined by the distances between the stimuli in the MDS space and can be modified on the basis of selective attention. The GCM assigns a weight ranging from 0 to 1.0 to each dimension for each individual subject with the sum of the weights of all the dimensions equal to 1.0. [For a full review of the GCM see Nosofsky (1992).] The GCM weights assigned to each of the dimensions are listed to the right of each of the three graphs in Fig. 3. In Fig. 3, subject RM had the greatest weight on harmonicity (0.889). Thus the weight derived by the GCM confirms that subject RM was directing the majority of attention to the harmonicity dimension. In addition, subjects VS and BP (middle and bottom graphs, respectively) had the greatest GCM weights on amplitude envelope (0.745) and spectral
FIG. 3. Results of the exemplar training in experiment 2. In this figure, the stimuli are plotted versus the percentage of responses each stimulus was classified as “1.” Next to each graph are the derived generalized context model (GCM) weights.
FIG. 4. Distribution of derived GCM weights following the exemplar training in experiment 2.
shape (0.928), respectively. Again, these GCM weights confirm that these subjects were selectively attending to one dimension. The GCM weights themselves, therefore, represent a metric that can capture the way in which attention has been distributed across dimensions.
Figure 4 shows the frequency distribution of the derived GCM weights for each dimension across subjects. As can be seen in this figure, the spectral-shape dimension (unfilled bars) was weighted low (<0.3) by most subjects. The other two dimensions, however, were weighted anywhere from 0 to 1.0 by the subjects in this experiment. There is no clear preference among the subjects, as a group, for one of these two dimensions, although both tend to be preferred more often than the spectral-shape dimension. GCM weights for all of the subjects following exemplar training are tabled in Appendix B.
This experiment demonstrated that many subjects selectively attend to a single dimension on these multidimensional stimuli, with the preferred dimension varying across subjects. In the next experiment, the ability of the subjects to shift their attention with training to a new stimulus dimension was investigated. The stimulus dimension chosen was amplitude envelope given the wide range of GCM weights for this dimension across subjects (Fig. 4).
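The way attention weights shape GCM predictions can be illustrated with a minimal Python sketch. It follows the general form of the model (attention-weighted distance, exponentially decaying similarity, and a relative-similarity choice rule) but rests on several assumptions that are not taken from the study: the binary values of Table I stand in for the MDS coordinates, stimuli A and H are the only stored exemplars, distance is city-block, there is no response bias, and the sensitivity parameter `c` is set arbitrarily to 3.0. The function names are hypothetical, and no attempt is made here to fit weights to data.

```python
import math

# Binary codes (harmonicity, shape, amplitude envelope) from Table I,
# used here as stand-ins for the MDS coordinates (an assumption).
STIMULI = {
    "A": (0, 0, 0), "B": (1, 0, 0), "C": (0, 1, 0), "D": (1, 1, 0),
    "E": (0, 0, 1), "F": (1, 0, 1), "G": (0, 1, 1), "H": (1, 1, 1),
}


def similarity(x, y, weights, c=3.0):
    """Exponential-decay similarity over an attention-weighted city-block distance."""
    distance = c * sum(w * abs(a - b) for w, a, b in zip(weights, x, y))
    return math.exp(-distance)


def prob_label_one(stimulus, weights, c=3.0):
    """Predicted probability of responding "1", with H and A as the two exemplars."""
    s_one = similarity(STIMULI[stimulus], STIMULI["H"], weights, c)
    s_zero = similarity(STIMULI[stimulus], STIMULI["A"], weights, c)
    return s_one / (s_one + s_zero)


if __name__ == "__main__":
    for name, w in [("harmonicity only", (1.0, 0.0, 0.0)), ("equal", (1 / 3, 1 / 3, 1 / 3))]:
        print(name, {s: round(prob_label_one(s, w), 2) for s in STIMULI})
```

Under these assumptions, placing all of the weight on harmonicity yields high “1” probabilities for B, D, F, and H only, which resembles the single-dimension pattern described for subject RM, whereas equal weights yield a pattern resembling the “majority rules” listeners.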
C. Experiment 3: Amplitude-envelope training
1. Method
This experiment consisted of another training phase. The same ten subjects in the previous experiment participated in this experiment. Subjects were trained to label stimuli G and H (Table I) as “1” and stimuli A, B, C, and D (Table I) as “0” during this experiment. As can be seen in Table I, stimuli G and H had the “1” value on amplitude envelope in common, while stimuli A, B, C, and D shared the “0” value for the amplitude-envelope dimension. Thus this training was designed to focus the listeners’ attention on the amplitude-envelope dimension. These six stimuli were the training stimuli and stimuli E and F were the test stimuli. The test stimuli were used to assess the generalization of training to nontrained stimuli. Stimuli were chosen to test and train on the basis of their perceptual distances from one another in the INDSCAL solution. For example, stimulus E has nontarget values on both the harmonicity and shape dimensions. Thus, perceptually, stimulus E is closer to a nontarget (“0”) than to a target (“1”) classification. Therefore, if stimulus E is called “1” we can assume a listener has learned to use the amplitude-envelope feature to classify the stimuli.
To accomplish this training, a third tape was generated. Subjects listened to 10 presentations of the “1” stimuli followed by 15 presentations of the “0” stimuli. This training was similar in format to that of experiment 2. However, in the previous experiment, there was only one “1” stimulus and one “0” stimulus during training, whereas in this experiment there were several “1” and “0” training stimuli. Each sequence of like-valued training stimuli was a different random ordering of either “1” or “0” stimuli. Following this, subjects listened to the “1” and “0” stimuli played out alternately beginning with a “1” stimulus. Again, the stimulus selected to represent the “1” and “0” categories on each presentation was determined randomly. All ten subjects remaining in the experiment at this time completed this final training even though three subjects were found to already have high GCM weights on this trained dimension (amplitude envelope). Following this training, subjects listened to ten presentations each of the training stimuli in a random order and labeled them as “1” or “0” until they could label the “1” and “0” stimuli correctly 90% of the time. All subjects achieved 90% accuracy after the first training session.
When the 90% performance criterion was attained, subjects were once again presented a random sequence of all eight stimuli 80 times each, as in experiment 2, and asked to classify each stimulus as either “1” or “0.” This time, however, they were asked to classify the stimuli on the basis of the training they had just undergone.
2. Results and discussion
The results of this experiment can be seen in Figs. 5 and 6. Figure 5 illustrates the range of classification results from four subjects following this amplitude-envelope training. The GCM weights are listed below each graph. The results plotted here represent the entire range of GCM weights on the amplitude-envelope dimension obtained after training on this dimension. That is, subject LL had the smallest GCM weight on amplitude envelope following amplitude-envelope training and LH had the highest weight following amplitude-envelope training. As can be seen, subject LL did not learn to shift GCM weight from the harmonicity dimension to the amplitude-envelope dimension. This can be seen from both the graph and GCM weight. Subject LL was using harmonicity prior to this final training and still places the greatest GCM weight (0.987) on harmonicity. Subjects RM and BP, however, have begun to shift their weight to the amplitude-envelope dimension. Subject RM had a GCM weight of 0.102 on this dimension prior to training and had a GCM
the amplitude-envelope dimension almost exclusively. Nine out of ten subjects had greater GCM weights on the amplitude-envelope dimension after training to the amplitude-envelope dimension.
The frequency distribution of weights following training is shown in Fig. 6. Relative to the pretraining distribution of weights (Fig. 4), Fig. 6 further reflects the shift in weights toward the amplitude-envelope dimension. As can be seen, the distribution of weights on the amplitude-envelope dimension (solid black bars) has shifted to higher values than in Fig. 4. GCM weights for all subjects following amplitude-envelope training are tabled in Appendix C.
To further verify the shift in attention to the amplitude-envelope dimension, a paired-sample t test was performed on the GCM weights derived for the amplitude-envelope dimension following initial exemplar training and the weights derived following training to the amplitude-envelope dimension. The mean GCM weight on the amplitude-envelope dimension prior to training on the dimension was 0.383, whereas the mean GCM weight following training to this dimension was 0.547. The one-tailed paired-sample t test was significant at the 0.05 level (t = −1.92, df = 9).
III. GENERAL DISCUSSION
FIG. 6. Distribution of the derived GCM weights following the amplitude-envelope training in experiment 3.
The purpose of this study was to examine the way in which normal-hearing listeners made use of cues from multiple, independent stimulus dimensions to classify multidimensional acoustic stimuli. Results of these three experiments indicated that most subjects did not use stimulus information available in multiple dimensions to categorize the complex sound-pulse stimuli. Rather, subjects allocated the majority of their attention to one particular stimulus dimension and used this dimension to classify the stimuli as “1” or “0.” This was evident both from the INDSCAL weights (experiment 1) and the weights derived from the GCM (experiment 2). In addition, it was demonstrated (experiment 3) that it was possible to train an individual to use a dimension other than the one initially used to classify the stimuli. The results of experiments 1 and 2 of this project were in agreement with those of Kidd and Watson (1987), who first used these stimuli. They found that listeners tended to prefer or weight certain dimensions of these stimuli more than others. Furthermore, Kidd and Watson (1987) observed that listeners did not always prefer the same stimulus dimension. The results of experiments 1 and 2 indicate that preferences for dimensions were not always maintained from the similarity judgments (experiment 1) to classification (experiment 2). Correlations calculated between the MDS and GCM weights assigned to each subject for each dimension revealed significant (p < 0.05) correlations between the GCM and the MDS weights for the harmonicity and amplitude-envelope dimensions. Correlations were not significant for the shape dimension. The correlation for the shape dimension is likely nonsignificant because of the nature of the tasks used to derive the two sets of weights. Specifically, MDS utilizes similarity judgments where there are no right or wrong answers, while the GCM weights were
FIG. 5. Results of the amplitude-envelope training in experiment 3 for four subjects. In this figure, the stimuli are plotted versus the percentage of time each stimulus was classified as “1.” Under each graph are the derived GCM weights.
weight of 0.318 following training. Similarly, subject BP had a weight of 0.069 on amplitude envelope prior to training and had a post-training GCM weight of 0.678 for this dimension. Finally, subject LH shifted GCM weight on the amplitude-envelope dimension from 0.592 to 0.936, using
derived after subjects were trained on exemplar stimuli. Thus the subject's decision process differs between the two tasks, which probably accounts for this discrepancy. Further examination of the weights in Appendix A and Appendix B, however, reveals that for only 2 of the 10 subjects is there no agreement at all between the top one or two weighted dimensions by the two methods. Again, future research will need to address whether perceptual bias is maintained between tasks.

Not all subjects in the final experiment of this series showed a complete shift in their attention from their initial preferred dimension to the trained dimension (amplitude envelope). Moreover, although a 90% correct classification score was achieved on the training stimuli alone, this performance was not reflected upon testing. That is, when presented with the two untrained stimuli in addition to the trained stimuli, subjects seemed to become confused about the correct response to the trained stimuli. This may be due to the training in this experiment, which was not ideal in that trial-to-trial feedback was not used. Had a better or longer training paradigm been used, subjects might have more clearly shifted their weights to the amplitude-envelope dimension. However, it was demonstrated that 90% of the subjects placed greater weight, as evidenced by the GCM, on the amplitude-envelope dimension following training to this dimension. The GCM weight takes into account trained and untrained stimuli. Therefore, even though performance of 90% was not maintained for the training stimuli, the model shows a trend toward using amplitude envelope when classifying all eight stimuli. Future experiments are needed to determine whether listeners are truly shifting their weight or merely learning responses during training.

Although nonspeech sounds were used as stimuli in this project, the results of this research may be applicable to listeners’ use of multiple cues in making acoustic-phonetic distinctions in speech.
In particular, the acoustic-phonetic patterns of speech contain multiple, redundant cues for making acoustic-phonetic distinctions. It may be that the multidimensional speech signal is processed in much the same way as the multidimensional stimuli used in these experiments. That is, listeners may pay particular attention to one available cue, even though additional cues that signal an acoustic-phonetic distinction exist. Future studies will examine the use of cues in multidimensional speechlike stimuli to further determine how listeners use multiple cues when classifying stimuli.
ACKNOWLEDGMENTS
APPENDIX A

TABLE AI. MDS subject weights: Experiment 1.

Subject   Dimension 1 (shape)   Dimension 2 (harm.)   Dimension 3 (amp. envel.)
LL        0.67                  0.65                  0.07
JK        0.79                  0.54                  0.11
JC        0.51                  0.15                  0.82
BP        0.92                  0.22                  0.11
RM        0.81                  0.49                  0.15
AT        0.52                  0.68                  0.29
LH        0.70                  0.37                  0.47
DR        1.00                  0.01                  0.03
MS        0.62                  0.67                  0.22
AL        0.58                  0.68                  0.39
SW        0.62                  0.70                  0.23
VS        0.99                  0.06                  0.06
APPENDIX B

TABLE BI. GCM weights: Exemplar training, Experiment 2.

Subject   Dimension 1 (shape)   Dimension 2 (harm.)   Dimension 3 (amp. envel.)
LL        0.14                  0.66                  0.20
JK        0.03                  0.97                  0.00
JC        0.04                  0.00                  0.96
BP        0.93                  0.00                  0.07
RM        0.01                  0.89                  0.10
AT        0.11                  0.41                  0.48
LH        0.13                  0.28                  0.59
MS        0.07                  0.74                  0.19
AL        0.16                  0.36                  0.48
VS        0.07                  0.19                  0.75
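
The correlations between MDS and GCM weights discussed in Sec. III can be approximately recomputed from the values in Tables AI and BI. The following is a minimal sketch, not the authors' analysis: it assumes a Pearson product-moment correlation (the text does not name the coefficient), uses only the ten subjects who appear in both tables, and works from the two-decimal table entries. The names `MDS_WEIGHTS`, `GCM_WEIGHTS`, and `pearson_r` are illustrative.

```python
import math

# Subjects present in both Table AI (MDS) and Table BI (GCM).
# DR and SW appear only in Table AI and are left out here.
SUBJECTS = ["LL", "JK", "JC", "BP", "RM", "AT", "LH", "MS", "AL", "VS"]

# Weights transcribed from Table AI (rounded to two decimals in the source).
MDS_WEIGHTS = {
    "shape":       [0.67, 0.79, 0.51, 0.92, 0.81, 0.52, 0.70, 0.62, 0.58, 0.99],
    "harmonicity": [0.65, 0.54, 0.15, 0.22, 0.49, 0.68, 0.37, 0.67, 0.68, 0.06],
    "amp_env":     [0.07, 0.11, 0.82, 0.11, 0.15, 0.29, 0.47, 0.22, 0.39, 0.06],
}

# Weights transcribed from Table BI (rounded to two decimals in the source).
GCM_WEIGHTS = {
    "shape":       [0.14, 0.03, 0.04, 0.93, 0.01, 0.11, 0.13, 0.07, 0.16, 0.07],
    "harmonicity": [0.66, 0.97, 0.00, 0.00, 0.89, 0.41, 0.28, 0.74, 0.36, 0.19],
    "amp_env":     [0.20, 0.00, 0.96, 0.07, 0.10, 0.48, 0.59, 0.19, 0.48, 0.75],
}


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# One correlation per stimulus dimension, across the ten shared subjects.
correlations = {
    dim: pearson_r(MDS_WEIGHTS[dim], GCM_WEIGHTS[dim]) for dim in MDS_WEIGHTS
}

for dim, r in correlations.items():
    print(f"{dim}: r = {r:.2f}")
```

With the rounded table values, this sketch gives a correlation of roughly 0.4 for shape and roughly 0.7 for harmonicity and amplitude envelope. For n = 10 the two-tailed 0.05 critical value of r is about 0.632, so under these assumptions the pattern agrees with the one described in the text.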
APPENDIX C

TABLE CI. GCM weights: Amplitude-envelope training, Experiment 3.

Subject   Dimension 1 (shape)   Dimension 2 (harm.)   Dimension 3 (amp. envel.)
LL        0.01                  0.99                  0.00
JK        0.18                  0.23                  0.59
JC        0.01                  0.00                  0.99
BP        0.01                  0.32                  0.68
RM        0.21                  0.47                  0.32
AT        0.23                  0.28                  0.50
LH        0.06                  0.00                  0.94
MS        0.22                  0.56                  0.23
AL        0.12                  0.39                  0.49
VS        0.03                  0.23                  0.74
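
The paired-sample t test reported for experiment 3 can be approximately reproduced from the amplitude-envelope weights in Tables BI and CI. The sketch below is an illustration under stated assumptions rather than the authors' computation: it uses the two-decimal table entries (the original analysis presumably used unrounded weights), takes the difference as pretraining minus posttraining, and the names `PRE_AMP_ENV`, `POST_AMP_ENV`, and `paired_t` are hypothetical.

```python
import math

# Amplitude-envelope GCM weights, in subject order
# LL, JK, JC, BP, RM, AT, LH, MS, AL, VS.
PRE_AMP_ENV = [0.20, 0.00, 0.96, 0.07, 0.10, 0.48, 0.59, 0.19, 0.48, 0.75]   # Table BI
POST_AMP_ENV = [0.00, 0.59, 0.99, 0.68, 0.32, 0.50, 0.94, 0.23, 0.49, 0.74]  # Table CI


def paired_t(before, after):
    """Return (t, df) for a paired-sample t test on before - after."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Unbiased sample variance of the differences.
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    std_err = math.sqrt(var_d / n)
    return mean_d / std_err, n - 1


mean_pre = sum(PRE_AMP_ENV) / len(PRE_AMP_ENV)
mean_post = sum(POST_AMP_ENV) / len(POST_AMP_ENV)
t_stat, df = paired_t(PRE_AMP_ENV, POST_AMP_ENV)

print(f"mean pre = {mean_pre:.3f}, mean post = {mean_post:.3f}")
print(f"t = {t_stat:.2f}, df = {df}")
```

From the rounded entries this gives means of about 0.382 and 0.548 and a t of about −1.94 with 9 degrees of freedom, close to the reported 0.383, 0.547, and −1.92; the small differences are presumably due to rounding in the tables. The one-tailed 0.05 critical value of t for 9 degrees of freedom is about 1.833, so the magnitude exceeds it.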
This work is supported by the National Institute on Aging and the Air Force Office of Scientific Research. We thank Rob Nosofsky for the contribution of his mathematical models, which strengthen the findings of this research. In addition, we thank Charles Watson and Gary Kidd for the use of their multidimensional stimuli and for helpful comments on this research. Finally, we would like to thank the reviewers of this manuscript for their thoughtful comments and suggestions.
ANSI (1989). ANSI S3-1991, “Maximum permissible ambient noise levels for audiometric test rooms” (American National Standards Institute, New York).
ANSI (1991). ANSI S3.1-1991, “Maximum permissible ambient noise levels for audiometric test rooms” (American National Standards Institute, New York).
Bregman, A. S. (1978). “The formation of auditory streams,” in Attention and Performance Vol. 7, edited by J. Requin (Erlbaum, New Jersey).
Bregman, A. S., and Campbell, J. (1971). “Primary auditory stream segregation and perception of order in rapid sequences of tones,” J. Exp. Psychol. 89, 244–249.
Garner, W. R. (1970). “The stimulus in information processing,” Am. Psychol. 25, 350–358.
Howard, J. H., and O’Hare, J. J. (1984). “Human classification of complex sounds,” Nav. Res. Rev. 1, 26–32.
Kidd, G. R., and Watson, C. S. (1987). “Perception of multidimensional complex sounds,” J. Acoust. Soc. Am. Suppl. 1 81, S33.
Noorden, L. P. A. S. van (1971). “Rhythmic fission as a function of tone rate,” IPO Ann. Prog. Rep. 6 (Eindhoven, The Netherlands), pp. 9–12.
Nosofsky, R. M. (1986). “Attention, similarity, and the identification-categorization relationship,” J. Exp. Psychol. 115, 39–57.
Nosofsky, R. M. (1992). “Similarity scaling and cognitive process models,” Ann. Rev. Psychol. 43, 25–53.
Pollack, I., and Ficks, L. (1954). “Information of elementary multidimensional displays,” J. Acoust. Soc. Am. 26, 155–158.