The Fundamental Error in Hearing Research By Akpan J. Essien
AJESSIEN RESEARCH (Independent Researcher)
www.ajessien.com
ABSTRACT The science of music and hearing began some 2,500 years ago with the work of Pythagoras on the relationship between string ratios and musical pitch intervals. Ohm and his contemporaries translated the string ratio hypothesis into acoustics as ratios of spectral components of the sound wave without taking the tension of the string into account, probably because of the Pythagorean claim that the tension of the string was constant as long as the weight of the mass suspended from the string was held constant. To this day, there is no consensus on the origin of the sensation of pitch. This paper reviews Pythagorean methodology in pitch production. It unveils a fundamental error in the Pythagorean concept of tension in strings, showing the inadequacy of equating the weight of the mass suspended from a string with the tension of the string regardless of adjustments to the effective length of the string. Experimental data from drums and human speech signals refute the claim and show that adjustments to the effective length of a string change its tension as a result of changes in the string’s inherent force of resistance to deformation. The data provided in this paper portray the tension error as the fundamental error in hearing research. Corrections are proposed, and implications for future research are presented and discussed.
1. Introduction
The study of music on the basis of sound production in nature began some 2,500 years ago, when the Greek mathematician Pythagoras (c. 570 - c. 490 BCE) discovered the existence of a functional relationship between string ratios and musical pitch¹ intervals (cf. Encyclopaedia Britannica, Pythagoras, 2012). However, the Pythagorean concept of tension in strings equates the weight of the mass suspended from a string with the tension of the string. Furthermore, the string tension was conceived as constant as long as the weight of the mass suspended from the string was held constant, regardless of adjustments to the effective length of the string. Consequently, musical pitch intervals were linked to ratios of sub-lengths of the string on the understanding that string tension had been held constant regardless of adjustments to the effective length of the string. It was shown in The Unfounded Foundation (copy available on this website) that the relationship between string ratios and musical pitch intervals could not enter into a theory of perception in the absence of invariance, since the full length of a string can produce the same intervals as ratios of the length of the string. Nevertheless, the string ratio theory assumed an acoustic shape when Ohm (1843) posited that the sound wave comprised spectral components which were multiples of the fundamental and corresponded to Pythagorean string ratios of musical pitch intervals. Available data show that the tension of the string, which plays a vital role in musical pitch adjustments, was not taken into account in the acoustic rendition of the Pythagorean string ratio theory. The error directed researchers to search for pitch in spectral analysis of the sound wave. To this day, no one has been able to explain the pitch produced by a stretched string via exploitation of spectral representations of the sound wave.
Technological advancements in sound wave analysis have created the hope that the answer to this long-standing problem is closer. However, it is shown in this study that although instrumental acoustics is very useful in solving a number of issues, its contribution to the search for the functional units that operate in auditory perception is extremely limited (if it exists at all). By revisiting those elementary operations that underlie musical pitch production in nature, this paper reveals the inadequacy of equating the weight of the mass suspended from a string with the tension of the string. In this regard, it is shown that materials have an inherent force of resistance to deformation, and that this force varies with their physical dimensions. Therefore, shortening a string is shown to increase its inherent force of resistance even though the oppositely directed force supplied externally is held constant. In this way, the tension of a string is shown to change with adjustments to its physical dimensions although the weight of the mass suspended from it is held constant. Experimental data from drums and human speech signals suggest that the tension of a string, as defined in this paper, is the underlying parameter that generates the sensation of pitch and tone in music and in speech. The facts presented in this paper not only explain the source of the problems encountered in the acoustic search for pitch, but also show that there exists no known way to rectify the error without returning to the principles of sound generation in nature. The claim that the tension of a string is constant as long as the weight of the mass suspended from the string is held constant, regardless of adjustments to the effective length of the string, is refuted. It is shown to be the fundamental error in hearing research. Implications for future research are presented and discussed.
¶ Author's note: This paper is adapted from my unpublished doctoral thesis (in French), Essien (2000): The Perception of Yoruba Tones: Experimental Evidence from Drums, Speech Signals and Synthesis, Institute of Applied Linguistics and Phonetics, Sorbonne University, Paris III, France.
1. The terms pitch, tone and musical note are used interchangeably in this study to refer to the subjective quality of an audio signal which may be ordered on a scale going from low to high. Frequency of vibration is used to signify the objective physical parameter of the sound wave.
2. PYTHAGOREAN APPROACH TO PITCH
2.1. Pythagorean Methodology in Pitch Research
Fig. 1 presents the monochord. The mass (hereafter W for weight) is suspended from the string and induces tension in it. Thus, the tension of the string is represented by the value of W, which is expressed here in kg. The bridges Z and Y allow for adjustments to the effective length of the string. The string ratio theory holds that when tension (W) is held constant, sub-lengths of the original string produce pitch intervals. The theory does not specify string lengths but ratios of sub-lengths. The theory encountered daunting experimental constraints. Jeans (1937:63) illustrated the situation as follows: Suppose W is 5 kg and the string sounds the musical pitch C1. To raise the pitch by one octave to C2, the string must be halved. To raise the pitch another octave to C3, the half of the string must again be halved, etc. The constraints are readily observable. Jeans (1937:65) noted that “if the piano maker relied on the law of Pythagoras alone, his longest string would have to be more than 150 times the length of his shortest, so that either the former would be inconveniently long or the latter inconveniently short.” Jeans (1937) noted also that the pitch intervals could be produced by varying the tension, that is, W. Suppose W is 5 kg and the string sounds C1. To raise the pitch by one octave to C2, W has to be increased four-fold to 20 kg. To raise the pitch by another octave to C3, W must again be raised four-fold to 80 kg, and so on (Jeans, 1937:63). Again, the inconvenience is obvious. Producing an interval of 8 octaves on one string by adjusting W alone would require a total mass of 327,680 kg. Thus, W would be too heavy for comfort, even impracticable. A rule emerges, though, showing that the frequency is proportional to the square root of the tension.
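The arithmetic in the two examples attributed to Jeans can be checked with a short sketch. The function names are illustrative and do not come from the source. The last check assumes an 88-key piano (87 equal-tempered semitones between the lowest and highest key), an assumption introduced here only to compare against the quoted figure of "more than 150 times".

```python
def length_after_octaves(initial_length, octaves):
    """Effective string length after raising the pitch by `octaves`
    at constant W: each octave halves the string."""
    return initial_length / (2 ** octaves)


def weight_after_octaves(initial_weight_kg, octaves):
    """Weight W after raising the pitch by `octaves` at constant
    string length: each octave quadruples W."""
    return initial_weight_kg * (4 ** octaves)


def length_ratio_for_semitones(semitones):
    """Longest-to-shortest string ratio for a span of `semitones`,
    assuming 12 equal semitones per octave (assumption of this sketch)."""
    return 2 ** (semitones / 12)


if __name__ == "__main__":
    # Octave by octave: relative string length and weight in kg,
    # starting from a unit length and W = 5 kg as in the text.
    for n in range(9):
        print(n, length_after_octaves(1.0, n), weight_after_octaves(5, n))
    # Assumed 88-key piano: 87 semitones from lowest to highest key.
    print(round(length_ratio_for_semitones(87), 1))
```

Eight successive quadruplings of 5 kg give the 327,680 kg quoted above, and the assumed piano range gives a length ratio a little above 150.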
In the face of the above inconveniences of a one-string acoustic system for the entire scale of musical pitches, Helmholtz noted that, to produce different musical notes, Pythagoras used “strings of different lengths but of the same make, subjected to the same tension” (Helmholtz, 1877:1). In other words, Pythagoras cut the uncomfortably long string into manageable portions and subjected them to tension induced by the same weight. The experiments and demonstrations that follow will provide evidence to show two errors in the Pythagorean methodology: (1) It is inadequate to equate the tension of the string with the weight of the mass suspended from the string. (2) Consequently, the claim that the tension of the string is constant as long as W is held constant, regardless of adjustments to the length of the string, is wrong. By correcting the two errors, this paper shows why exploitation of spectral analysis of the sound wave has failed to uncover pitch, and probably why no one may ever find pitch or explain auditory attributes of sound in speech or music by present-day psychoacoustic procedures.
FIGURE 1: The monochord. The weight (W) of a mass is suspended from the string. It is called “tension” and is expressed in kg. Bridges X and Y may be moved to adjust the effective length of string XY or XZ.
2.2. Compensations among Production Parameters
The brief presentation made above of two ways to adjust pitch reveals the existence of compensations among production parameters, i.e. length of string and weight of the mass suspended from the string. In this regard, Ohm (1843) converted the string ratios into frequency ratios to account for the perception of the intervals in terms of ratios of frequency of vibration. These laws are demonstrated by the data assembled in Table 1. They will help us identify the fundamental error in hearing research. Table 1(a) presents the primary tenet of the law, according to which halving the effective length of the string produces a pitch one octave higher. Column 1 presents consecutive halves of the effective length of the string; column 2 shows that the tension induced by W is held constant. In Table 1(b), the length of the string is held constant and the tension (W) is shown in column 2 as the variable parameter. According to the law, W is quadrupled successively for the same length of string to produce the same pitch intervals at the same frequency values. Let us examine two problems that arise.
The first problem concerns the relationship between W and frequency of vibration. According to the law, frequency is proportional to the square root of the tension. In this regard, consider column 7 in Table 1(a). By relating frequency of vibration to the square root of the tension, we find that frequency is not proportional to the square root of the tension, since a rising frequency cannot be proportional to a tension which is held constant. Consider now Table 1(b). In this portion of the table, the variable parameter is W. By relating the same frequency values to quadrupled values of W, column 7 shows that frequency is proportional to the square root of the tension. To resolve this discrepancy, let us return to Table 1(a). For proportionality to exist between frequency and the square root of the tension, the tension cannot be constant; the tension must rise to maintain proportionality with frequency even though W is held constant. This calls for a distinction between the weight of the mass suspended from the string and the true tension of the string. This distinction is made in columns 2 and 3 of Table 1(c). Having distinguished W from the tension of the string, Table 1(c) advances the hypothesis that each time the string is halved, the tension of the string increases four-fold to maintain proportionality with frequency although W is held constant. Accordingly, column 2 in Table 1(c) presents the same values of W as column 2 of Table 1(a). Let us examine this presentation very carefully. It claims that as the length of the string decreases (column 1) while W is held constant (column 2), string tension rises (column 3). The hypothesis is that the rising tension of the string (column 3) drives up frequency (column 4), generates the higher pitch (column 5), shortens the wavelength (column 6), creates the pitch interval (column 7) and maintains proportionality with frequency of vibration in column 8.
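The proportionality argument can be traced numerically. The sketch below rebuilds the logic of Tables 1(a) to 1(c) with illustrative values only: the starting frequency of 32.7 Hz for C1, the unit string length and the function names are assumptions made for this example and are not the entries of Table 1. The "tension" value for the halved string simply encodes the hypothesis of Table 1(c), namely a four-fold rise per halving while W stays constant.

```python
import math

BASE_FREQ_HZ = 32.7   # assumed frequency for C1 (illustrative)
BASE_W_KG = 5.0       # weight used in the example from Jeans
BASE_LENGTH = 1.0     # arbitrary unit length (illustrative)


def table_rows(octaves, vary):
    """Return rows of (length, W, hypothesised tension, frequency).

    vary="length": string halved per octave, W constant (Tables 1(a), 1(c)).
    vary="weight": W quadrupled per octave, length constant (Table 1(b)).
    """
    rows = []
    for n in range(octaves + 1):
        freq = BASE_FREQ_HZ * 2 ** n
        if vary == "length":
            length, w = BASE_LENGTH / 2 ** n, BASE_W_KG
            tension = BASE_W_KG * 4 ** n   # hypothesis of Table 1(c)
        elif vary == "weight":
            length, w = BASE_LENGTH, BASE_W_KG * 4 ** n
            tension = w                    # tension read directly from W
        else:
            raise ValueError("vary must be 'length' or 'weight'")
        rows.append((length, w, tension, freq))
    return rows


def freq_over_sqrt(rows, column):
    """frequency / sqrt(value in `column`) for each row.
    A constant result means frequency is proportional to the square root."""
    return [row[3] / math.sqrt(row[column]) for row in rows]


if __name__ == "__main__":
    halved = table_rows(3, "length")
    print(freq_over_sqrt(halved, 1))  # against W: ratio keeps rising
    print(freq_over_sqrt(halved, 2))  # against hypothesised tension: constant
    print(freq_over_sqrt(table_rows(3, "weight"), 1))  # Table 1(b): constant
```

With W held constant the ratio doubles from row to row, which is the discrepancy described above; it becomes constant only when the hypothesised tension column is used in place of W.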
A major question arises: When W is held constant, where does the extra weight come from to increase the tension of the string? In other words, how does the acoustic system shown in fig. 1 convert length of string into W (or tension)? If we could answer these questions to validate Table 1(c), we would find the origin of pitch, and we would be able to state categorically that holding the tension of a string constant when it is not constant was the fundamental error in hearing research. The only way to prove the point is to return to those basic operations that govern pitch production in nature, and to contrast the procedure with the acoustic approach to pitch.
3. NATURAL VS DIGITAL APPROACHES TO HEARING
3.1. Principle of Sound Production in Nature
The experiments, a few of which are reported briefly in this paper, began as an investigation into the origin of phonemic pitches in a tone language, Yoruba (Essien, 2000). The data were then extended to the origin of pitch in music and speech. The experimental data presented in this document show the long road that led to the discovery of the error in the work of Pythagoras. Fig. 2 presents a drum. The tension of the membrane is adjustable via the tendons. When the membrane is preset to produce a low tone (in phonological terms), the tension is set to the right degree for a low tone, i.e. Tx. The frequency of vibration of the signal produced is 65 Hz; it elicits the perception of a low tone. The experimenter cannot merely change the frequency of vibration and produce a different pitch. To produce a different pitch, he must return to the membrane. The only adjustable parameter under study is the tension of the membrane. Thus, the figure shows that the experimenter returns to the drum and alters the tension to Ty; he may then strike the drum again to produce a high tone (in phonological terms). The data show that at tension Ty the membrane vibrates at a frequency of 142 Hz and generates a high tone. Regardless of the size of the membrane and the frequency values, the procedure described above is universal; it governs the practice of music using any musical instrument. Neither the musician nor the scientist can circumvent this procedure. Hence, speech and music performance demands of the performer a continual change to a feature of the acoustic system for every code that is transmitted. Let us contrast the above procedure with the acoustic approach to sound generation and perception.
3.2. Principle of Sound Production in Acoustics
Most (if not all) psychoacoustic investigators in hearing research reject, fully or to some degree, the significance of sound production criteria in auditory analysis.
In the field of speech perception, Jakobson & Waugh (1979) labelled any such thought as originating from “regressive dogmatists” because production criteria are absolutely useless in perception since physiological criteria are only a means to an end. In the same vein, acoustic enthusiasts consider that the work of Pythagoras on strings was only a means to an end, and that since the laws governing the vibration of bodies have now been understood, the production criteria may be dispensed with. Let us consider some implications. Fig. 3 presents a computer and a keypad. The computer is equipped with sound card and software to generate sounds at different frequencies of vibration and other acoustic parameters. The researcher inputs the frequency of 65 Hz and can synthesize the drum signal for the low tone in fig. 2. Thereafter, the universal law applies: To change the pitch of this signal, he is under obligation to return to his keypad and input the frequency of 142 Hz in order to generate the high tone. Let us examine the advantages and disadvantages of the two approaches.
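As a minimal sketch of the digital procedure just described, the following generates the samples of a pure tone at a requested frequency. It is a plain sine generator written for illustration, not a reconstruction of the software used in the experiments and not a model of a drum signal; the sample rate, the duration and the function names are assumptions.

```python
import math


def sine_tone(freq_hz, duration_s=1.0, sample_rate=8000):
    """Return samples in [-1, 1] of a sine wave at `freq_hz`."""
    n_samples = int(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def count_cycles(samples):
    """Count upward zero crossings as a rough estimate of the
    number of cycles contained in `samples`."""
    return sum(1 for prev, cur in zip(samples, samples[1:])
               if prev < 0 <= cur)


if __name__ == "__main__":
    low_tone = sine_tone(65)     # frequency reported for the low tone
    high_tone = sine_tone(142)   # frequency reported for the high tone
    print(count_cycles(low_tone), count_cycles(high_tone))
```

The only way to obtain a different pitch from this generator is to call it again with a different frequency argument, which mirrors the return to the keypad described above.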
3.3. Production vs Acoustic Procedures in Hearing Research
Consider Table 2. Under Mechanical Approach to Pitch and Hearing, the table shows that for each pitch produced, we have the origin of the frequency of vibration and the sensation elicited by the signal. We know also that we can change neither the frequency of vibration nor the sensation without returning to the sound source to change the mechanical origin of both (i.e. T). Therefore, the link between the mechanical feature T, frequency and sensation is unbreakable. We are therefore compelled to look at what it is that controls frequency of vibration and pitch in the underlying mechanical features. At the acoustic level, the table shows that when we change the frequency of vibration from 65 Hz to 142 Hz and produce the high tone, we do not know the origin of the frequency of vibration. As we have noted above, the frequency of vibration of a signal is valid only for that signal because that signal cannot be changed to generate a different percept without changing the underlying mechanical feature. This is where the embarrassing question arises: What is it that we change electronically to drive frequency of vibration and pitch in the acoustic approach? The simple answer given in the table is that we do not know; but to be fair to those who would like to dispute this matter, the table suggests that it may be known, but it is unspecified. By deduction, for every sound produced electronically, the underlying cause of the physical parameters of the sound wave and of the elicited sensation is unknown. On the contrary, for every sound produced in nature, there is an underlying mechanical (or physiological) reality. If we cannot explain auditory sensation by examining this reality, the search for the unknown in the unknown, such as we practise in the acoustic approach to hearing, may be described as an ill-fated venture.
Besides, let us admit, for argument’s sake, that music, speech and auditory analysis as a whole could be explained acoustically. Of what benefit would it be to anyone living in the natural world if we do not know how the acoustic effects may be produced in nature? We would end up knowing how music is perceived without knowing how to play it. Similarly, we would know how we perceive speech without knowing how we speak, or how to teach someone else to speak our own language. These undeniable deficiencies in the acoustic approach to hearing have turned many disillusioned researchers away from the practice; the present author is one of them. So how did it come about that we abandoned the bird in the hand and went after the (n)one in the tree? This cannot be explained without showing that acoustics is the offspring of the Pythagorean string ratio theory of musical pitch intervals. It is not a topic for us to dwell on in this paper. It is invoked here for the fact that acoustics is the medium by which the fundamental error we are examining was transmitted to modern investigators. In this regard, every well-informed investigator knows (or should know) that Ohm was the first, according to Helmholtz, who converted string ratios into frequency ratios in the form of spectral components of sound. Thereafter, the frequency scale for musical pitches was calculated mathematically. Frequency values in Table 1 are extracted from such a scale in use today, its controversial nature notwithstanding. The tension of the string, which affects pitch, was not factored into the relationship between frequency and pitch since Pythagoras had held it constant. Rather, speed of sound was involved. How does speed of sound help us in the search for pitch when sound travels with the pitch in it? The string ratios are meaningless in hearing since any string may be made to produce the same intervals as its sub-lengths.
Consequently, the string ratio theory had breached all psychophysical and psychological prerequisites for a theory of perception and therefore cannot constitute a basis for a theory of hearing in the absence of any scientific essence to it. By deduction, since string ratios are not invariant with pitch, the frequency ratios they generate can never be invariant with pitch as to explain pitch. For this reason, neither Ohm nor Helmholtz could ever get it right. Rather, by adjusting their data to fit the string ratio theory, they transmitted the error to modern psychoacoustics. On this premise, hearing theories have mushroomed world-wide. The list would make many volumes and no attempt is made to list them here as they have nothing to do with the fundamental error in the work of Pythagoras. It is good to note, though, that since the foundation of acoustics is psychophysically unfounded, no work on auditory analysis erected on the basis established by Pythagoras can stand. In this regard, a question arises: What should have been done? It has already been abundantly presented in Table 1(c) that the tension of the string should not be equated with W. The source of the extra force that drives pitch and other parameters when W is held constant is presented, described and discussed here below.
4. THE ORIGIN OF PITCH
4.1. The Case of the Missing Low Tone
For the work of Pythagoras to enter into a theory of perception, it should have discovered the invariant in the production criteria. That invariant, if given acoustic representation, would then allow for an acoustic explanation of pitch. But the work of Pythagoras was a pilot study. Had it been given proper consideration, the discrepancies could have been found and it would not have entered at all into a theory of hearing. Besides, Pythagoras was not interested in measuring pitch per se; his interest focussed on pitch intervals and musical harmony. It was shown in Unfounded Foundation that a theory of pitch interval based on ratios cannot explain the perception of the individual units in a comparison. This author found the error the hard way and corrected it the hard way. For it to be understood, efforts will be made to explain it the easiest way. Experimental data presented here below are extracted from a large corpus. Some details may be missing as they are presented out of context. As we have noted above in fig. 2, the tensions Tx and Ty are only labels for the mechanical configurations of the membrane that produced the low and the high tones at 65 Hz and 142 Hz, respectively. None of this is significant in hearing research unless we can prove that all musical instruments will produce these two tones at the same T and frequency values. We have to note that we do not know the value of T in quantitative terms either. We are only relating parameters to pitch until we find the one that is worthy of precise measurements. The experiments examined the acoustic responses of different sizes of membrane and related them to pitch. Thus, drums in three different sizes were introduced into the study (details in Essien, 2000).
The hypothesis was that each drum would produce the Yoruba phonological low, mid and high tones but at different frequency levels, similar to human speech signals produced by men, women and children. The whole aim was to establish a basis for studying intra-speaker variability in speech. The results revealed that tone perception is not a simple case of comparing the pitches of contiguous signals to arrive at the cognitive identities of tones. Rather, the results pointed to the existence of mechanical constraints affecting the production and perception of the three tone categories by different drums. Table 3 summarizes the frequency bands that elicit the perception of the low, mid and high phonological tones in Yoruba. The three drums are able to produce the Yoruba mid and high tones. However, the data for the low tone category show that the ability to produce the sensation of lowness diminishes with the size of the drum. Drum C had completely lost the ability to produce the low tone. That was the primary clue to follow to pitch. Thus, the prediction that the three drums would produce low, mid and high tones at different frequency levels did not come true. Figure 4 illustrates well-known shifts in tone levels such as characterise male, female and infant outputs in speech. The real outputs of the experimental drums for the low tone, which is the focus here, are presented to the left of the frequency scale, showing the frequency bands that elicit the perception of the low tone category. For Drum A, the data show a relatively wide frequency band (23 Hz) that elicits the sensation of lowness; for Drum B, the width of the low tone frequency band is reduced (3 Hz). However, the lowest pitch of Drum C does not generate the sensation of lowness at all. Why? Acoustic theorists might want to attempt to explain this phenomenon acoustically, but our aim at the moment is to isolate the physiological parameter that controls pitch.
Thereafter, we can examine its acoustic manifestation and perception. Thus, the facts to consider in the presentation are as follows: The low tone frequency band for Drum B corresponds to portions of the mid and high tone frequencies of Drum A. Also, the entire high tone frequency band of Drum A falls within the mid tone frequency band of Drum C. To the right of the frequency scale are the outputs of male, female and infant speakers using toned CV syllables. Strikingly, whereas children, whose vocal cords may be as short as 3 mm (see Hirano, et al., 1983), have such a wide frequency band that elicits the sensation of lowness, Drum C, with a diameter of 105 mm, cannot generate the sensation of lowness. Despite overlapping frequency bands, the test subjects perceived the toned stimuli apparently independently of frequency values. And that brings us to the puzzle of auditory analysis: How is it that the ear can extract pitch from this “jumble”, and do so apparently effortlessly? (cf. Hirsh, 1996). Since many different pitches elicit the perception of the same tone category, it was necessary to examine the actual pitches of the signals that the listeners allocated to the different tone categories. This matter was addressed by conducting pitch matching experiments.
4.2. Pitch Matching Experiments
The organisation of the stimuli for the test is illustrated in fig. 5, showing the matches aimed for. The signals were organised in three AX series of stimuli, where A is a constant signal and X is a series of stimuli with changing pitch. Thus, the lowest pitch of Drum B (BT1) was the reference signal whose pitch was to be matched with the pitch of any of the 16 signals from Drum A. Similarly, the lowest pitch from Drum C (CT1) was the reference signal whose pitch was to be matched with the pitch of any of the 16 signals from Drum A and any of the 10 signals from Drum B. The signals were presented directly from the computer over headphones in the phonetic laboratory at Sorbonne, Paris III. Test subjects (doctoral researchers in phonetics and linguistics) had the freedom to replay the signals as many times as they wanted until they were sure they had established a match.
4.2.1. Analysis of Test Results
The results of the tests are shown in fig. 6. They show that BT1, the signal with the lowest pitch from Drum B, was matched in pitch 75% with AT4, 12.5% with AT5, and 12.5% with AT6. The signal CT1, being the lowest pitch from Drum C, was matched in pitch 25% with AT6 and 75% with AT7. Also, the same signal CT1 was unanimously matched in pitch (100%) with BT6. From the tone perception standpoint, BT1 and AT4 are low tone signals from Drums B and A, respectively. CT1 and AT7 are mid tone signals from Drums C and A, respectively. Also, CT1 and BT6 are mid tone signals from Drums C and B, respectively. These results show that the tone perception subjects correctly identified the pitches of the signals and allocated them to the correct tone categories. So, what is the origin of pitch and tone? Again, the pointer to follow is the gradual phasing out of the sensation of lowness as the size of the drum decreases (fig. 4). It shows that there is something, call it X, that drums use to generate the sensation of lowness, and that X is gradually lost as the size of the drum decreases. It would also show that humans use the same X to give the same sensation of lowness to auditory signals, but that humans conserve X in speech regardless of the age of the speaker, such that they do not lose it as a result of reduction in the length of the vocal cords. We could conclude that to perceive pitch, the ear simply extracts X from the auditory stimuli regardless of other acoustic features of the sound wave. Since X can explain pitch in music and speech, we can safely deduce that to explain the sensation of pitch universally, we must find X.
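The reported match percentages can be tabulated and cross-checked against the stated tone categories. The sketch below uses only the figures and categories quoted in this section; the data layout and the function names are illustrative choices, not part of the original analysis.

```python
# Match percentages reported for fig. 6:
# (reference signal, drum searched) -> {matched signal: percentage}
MATCHES = {
    ("BT1", "Drum A"): {"AT4": 75.0, "AT5": 12.5, "AT6": 12.5},
    ("CT1", "Drum A"): {"AT6": 25.0, "AT7": 75.0},
    ("CT1", "Drum B"): {"BT6": 100.0},
}

# Tone categories stated in the text for the signals involved.
TONE = {"BT1": "low", "AT4": "low", "CT1": "mid", "AT7": "mid", "BT6": "mid"}


def modal_match(distribution):
    """Return the signal most often chosen as the pitch match."""
    return max(distribution, key=distribution.get)


def summarise(matches, tone):
    """For each AX series, report the modal match, its percentage and
    whether it shares the tone category of the reference signal."""
    summary = []
    for (reference, drum), distribution in matches.items():
        best = modal_match(distribution)
        same_category = tone[reference] == tone[best]
        summary.append((reference, drum, best, distribution[best],
                        same_category))
    return summary


if __name__ == "__main__":
    for row in summarise(MATCHES, TONE):
        print(row)
```

In each of the three series the most frequent pitch match belongs to the same tone category as the reference signal, which is the agreement described above.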
4.3. LOOKING FOR X
The experimental conditions with the drums do not offer any known and valid possibility of measuring the amount of force exerted on the membrane in the process of graduating the tension of the membranes that produced the drum pitches. Because the findings present X as a universal in pitch production and perception, it may be found using any musical instrument. In this regard, despite the weaknesses of the Pythagorean string ratio theory, something valuable was earmarked for retention, namely, the phenomenon of compensations. We saw how the acoustic system traded length of string (L) for weight (W) and vice versa (see Table 1 in this document). It follows that in fig. 4, humans must have a way to trade physiological parameters and conserve X, while drums do not have that trading possibility and so lose X as the size of the drum decreases. Therefore, in looking for X we will have to resort to the laws of string ratios, which spell out the laws governing string length and force applied from outside the string. (This author has not yet verified these relationships himself, but they present the gist of the matter for the present investigations.)
4.3.1. Mechanical Compensations in Pitch Production. Table 1(c) predicts that tension must rise with rising frequency although W is held constant. To find pitch, we have to explain where the extra force comes from when W is held constant. This is analogous to finding X to explain the missing low tone in the case of the drums, and its conservation in infant speech. To address this issue, we need to examine another natural phenomenon – the resistance of materials to deformation.
Materials have varying degrees of resistance to deformation depending upon their chemical, physical, mechanical and other properties. Of interest to us is the quality of stiffness. Kinsler (1971) and Colton (1988) noted that stiffness influences the tension of a body and defined it as a quality that helps a system restore itself to its original configuration following a deformation. This phenomenon requires adequate explanation to avoid any disputes. The attempt that is made here below is not exhaustive. Suppose we were to build a bridge over a river some 100 meters wide. Suppose we order a huge steel beam and set it up as shown in fig. 7(a). We observe that the beam manifests a curvature even though it is a huge rolled steel beam. If we do not heed the warning but proceed to drive our articulated truck over it, fig. 7(b) illustrates what would happen. We may not live to tell the story. For this reason, engineers are very careful in these matters. But when it comes to hearing research, we do not seem to heed all warnings because it does not matter what we theorise, since the practice and enjoyment of music as well as speech are totally independent of scientific theories. Even when music and speech scientists ‘crash’, they still live to theorise another day or even insist that they were right although their theories are blatantly false. Thus, as hearing scientists, we survive the crash in fig. 7(b) and want to change the data to fit. We might choose to support the beam by introducing the intermediate column in fig. 7(c). With that, the truck goes safely over. We could be tempted to think that the truck went over the bridge thanks to the intermediate column. But that would not be the whole truth. To test it, let us replace the beam with a thin metal sheet. The truck goes on it as shown in fig. 7(d), and what a crash, even though the intermediate column is still there! Evidently, the intermediate column was not the only contributor to the successful passage of the truck.
The illustrations in fig 7(c) and fig 7(d) make the point that the intermediate column was a node that divided the beam into two halves so that each half became strong enough to support the weight of the truck. Nevertheless, fig 7(d) shows that dividing the length of the thin metal sheet into two halves did not increase its strength to the point of supporting the weight of the truck. If we transfer this knowledge to strings, we will begin to see matters a little more clearly.
4.3.2. Resistance of strings as a function of physical dimensions
Fig 8(a) presents three strings A, B and C of the same make, balanced on a stand. They are acted upon by the force of gravity. We observe that the shorter the string the higher is its resistance to the downward pull. If we were to help each of the three strings against the downward pull, the amount of external force required would decrease the shorter the string. In fact, the shortest string C in the illustration might not require any force at all from outside the string because it is strong enough to resist the deformation without any external assistance.
Similarly, fig 8(b) presents three strings X, Y and Z balanced on a stand. The strings are of the same length but vary in thickness. They are acted upon by the force of gravity. We observe that the thicker the string the higher is its resistance to the downward pull. Again, if we were to help each of the three strings against the downward pull, the amount of external force required would decrease the thicker the string. The thickest string Z (now a rod) might not require any force at all from outside the string because it is strong enough to resist the deformation without any external assistance.
Whether the force that induces tension in a string is supplied externally in the form of a mass suspending from the string, or whether it is an inherent property of the string, it restores the string to its original configuration following a deformation.
Adjustments to length of string (L), thickness of string (Ø), weight of the mass suspending from the string (W) or other mechanical and chemical properties of the string are various ways to adjust the true TENSION of the string, and hence pitch, as the summary in fig 8(c) shows. By deduction, although W induces tension in a string, it is an inadequate yardstick for measuring the tension of the string since it fails to account for the force of resistance that is the inherent property of the string. For this reason W manifests a functional relationship with pitch when W is used as a parameter for adjusting pitch intervals, but the relationship breaks down when W is held constant and length of string is adjusted. On account of this fundamental misconception, the entire edifice of hearing research was erected on a foundation that cannot hold together, let alone support anything that is erected upon it. To make matters worse for the hearing sciences, W was not taken into account at all in the conversion of string ratios into frequency ratios to arrive at a theoretical structure of harmonic partials in sound wave.
Bearing this in mind, let us consider Table 4 (which was Table 3 in Unfounded Foundation…). As we saw in Unfounded Foundation…, the shorter string XZ always calls for a smaller amount of W to produce the same pitch as the longer string XY. Therefore, shortening the string is another way to increase the tension of the string and compensate for W when W is held constant. We can safely conclude that the increased inherent force of resistance of XZ, resulting from reduction in length of string, accounts for the rise in pitch even though W is held constant. The increment in inherent force of resistance is equivalent to the amount of W displaced for the shorter string to sound the same pitch as the longer string. According to the Pythagorean string ratio theory, the amount of force is four-fold per octave.
This requires modern-day verification since the parameter was not integrated into the frequency ratio theory by Ohm, Helmholtz and their contemporaries. Whatever may be the case, the amount of force that a body acquires by simply halving its length is tremendous ― four-fold as Table 1(c) and Table 4 show. This cannot be neglected, but it was and still is. Consequently, no one has been able to explain the origin of the pitch generated by an acoustic system as simple as a stretched string. We shall henceforth use T for tension to represent the sum total of the force of resistance provided externally by W and the force of resistance which is the inherent property of the vibrating body. This may be stated briefly as follows:
Fex + Fin = T = Pitch.
Thus, as length of string decreases, Fex decreases and Fin increases. While examining Fin as length of string decreases, preliminary experimental data at hand show that there comes a point when the string is strong enough to produce the target pitch without any assistance from outside (Fex). At that point, the data reveal that the Pythagorean string that was inconveniently short was not useless; it was only a pointer to another class of musical instruments in which pitch adjustments operate exclusively on Fin. Consider the xylophone in fig 9. The longest bar is strong enough to produce a musical note without any mass suspending from it. It would be wrong to state that the body has no tension because we do not have any mass suspending from the bar. Nevertheless, all musical instruments that produce tones using Fin exclusively are still governed by the same universal law which calls for a change in the sonorous body to produce the target auditory code. In this case, fig 9 shows that reduction in
length applies to achieve increased pitch. Why? We have learned enough to leave frequency of vibration out of the equation at the moment. In the absence of a mass that induces tension from outside the bar, the above formula shows that only Fin applies in this case. We can conclude that reducing the length of the bars increases Fin and pitch. Therefore, these bars may be crafted to produce any musical note, but the bars will vibrate at their natural frequencies, which are determined by several factors: the material of the bar, the size, etc. This is true also of strings. Among stringed musical instruments, plastic strings are being used to produce very pleasing sounds because they vibrate slower than metallic strings; consequently they produce the same musical pitches at relatively lower frequencies. Any disputes? No, because musicians know this to be a fact even though this reality may escape the eyes of pro-acoustic investigators who only peer at and measure movements of air particles on an oscilloscopic screen. Therefore, let us return to fig 4 with all that we have learned about bodies, Fex and Fin, and try to understand the puzzle of the missing low tone in the search for the origin of the sensation of tone.
Figure 9: Increased Rin resulting from reduction in the length of bars results in higher pitches.
4.4. We Found X
To conclude our search for X, let us see how the findings assembled in this document can help us resolve the case of the missing low tone in fig 4. We know that as we reduce the length of a vibrating string we are actually increasing its inherent force of resistance (Fin). Let us return to the drums in fig 4. We observe that the ability to produce the sensation of lowness decreases with size of drum. Thus, drum C has lost the ability to produce the sensation of lowness because its tension T (where Fex + Fin = T) is higher than what the ear needs to elicit the sensation of lowness.
In humans, vocal cord lengths vary between 27 mm (in men) and 3 mm (in infants). For Fex and Fin to explain the missing low tone, we must account for the fact that children are able to produce the sensation of lowness regardless of gradual reduction in vocal cord length as a function of age. In this regard, it is a well-documented fact that vocal cords are muscles; they are controlled by the brain and may be tensed or slackened regardless of their size (cf. Hollien, 1983; Wyke, 1983). Therefore, although the drum membrane loses slackness as it reduces in size, children can tense or slacken vocal cords to give the sensation of lowness to a pitch regardless of the size of the muscle. In this way, children preserve the slackness needed for lowness of pitch. Nevertheless, the frequency of vibration is relatively high due to the physical size of the vibrating body. If humans of all ages did not possess the capacity to slacken and tense vocal cords regardless of their physical dimensions, infants and adolescents would acquire the capacity to produce the sensation of lowness in vocal pitch only when they attain a certain age. Drum membranes, on the other hand, acquire increased Fin as size decreases. Thus, Drum B lost some of its slackness through reduction in size and so the frequency band for the low tone was reduced.
A further reduction in size caused Drum C to lose it completely and along with it its ability to produce the sensation of lowness. X is therefore T (the tension of the vibrating body). The facts presented support the hypothesis in Table 1(c) according to which the rising tension of string (column 3) drives up frequency (column 4), generates the higher pitch (column 5), shortens the wavelength (column 6), creates the pitch interval (column 7) and maintains proportionality with frequency of vibration in column 8. These facts, which explain pitch, only establish the basis for quantitative estimation of pitch, leading to a psychological scale for pitch. Furthermore, knowing the origin of pitch should throw some light on the way the organism internalises the sensation pitch. All this is now possible because, by all evidence, we have found the stimulus to pitch and tone.
5. Implications for Future Research
An examination of pitch regulation in nature underscored the need to find the underlying feature of pitch in the sound source. In this regard, the groundwork for a theory of pitch perception and the understanding of music production and perception was done in two parts: (1) In the paper Unfounded Foundation… and (2) In this paper The Fundamental Error… A lot of work is yet to be done. We could approach it via a question: Suppose Pythagoras had announced what we know today that the tension of the string is the parameter that drives pitch and creates pitch intervals, what would hearing research be like today? Two
major changes come readily to mind: (1) Ohm’s acoustic law, and (2) Helmholtz’s resonance theory. Let us see how what we know today would have affected these two theories, and with them theories of music, speech and hearing research.
It has been noted in the course of our discussions that Ohm’s acoustic law was an acoustic translation of the string ratio law. By halving the length of string according to the string ratio law, pitch increased by one octave. Physics has it that the frequency of vibration is doubled when length of string is halved. Thus, Ohm posited that sound wave comprised frequency components which were multiples of the fundamental. This too was publicised by Helmholtz in the resonance theory. Helmholtz noted that the frequency scale for musical pitches could now be calculated mathematically (see Unfounded Foundation). A look at the frequency scale for musical pitches shows that frequency values for octaves are doubled values derived theoretically from halving the length of string. All this is the result of equating the tension of a string with W and the claim that the tension was constant as long as W was constant. Hence, frequency ratios were calculated on the basis of string ratios. We cannot enter into all the details here.
Suppose that rather than hold W constant, Pythagoras had held length of string constant. Were that the case, the variable parameter would have been W. In that case, rather than have a string ratio theory of musical pitch intervals, we would have had a tension ratio theory of musical pitch intervals. And instead of half the length of string to an octave, we would have had four-fold tension to the octave. All this is summarised symbolically below, showing that music, speech and hearing research and associated theories today would be altogether different. Let us examine the implications from the standpoints of Ohm’s law and the resonance theory by Helmholtz.
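For readers who wish to check the two ratios mentioned above (half the length of string to an octave, and four-fold W to an octave), the following is a minimal numerical sketch. It assumes the textbook ideal-string relation f = (1/2L)·sqrt(W·g/μ), which is not taken from this paper and which treats the suspended weight as the tension of the string, the very assumption that this paper questions. The function names and the numerical values of length, weight and linear density are hypothetical and serve only for illustration.

```python
import math

G = 9.81  # standard gravity in m/s^2 (assumed value)


def ideal_string_frequency(length_m, weight_kg, mu_kg_per_m):
    """Textbook fundamental of an ideal string: f = (1 / 2L) * sqrt(W * g / mu).

    Assumption: the suspended weight W (times g) is taken as the whole tension.
    """
    return math.sqrt(weight_kg * G / mu_kg_per_m) / (2.0 * length_m)


def weight_for_same_pitch(weight_kg, length_m, new_length_m):
    """Weight that keeps f unchanged under the same relation (W scales as L squared)."""
    return weight_kg * (new_length_m / length_m) ** 2


# Hypothetical monochord: 1 m of string, 4 kg suspended mass, 1 g per metre.
L, W, MU = 1.0, 4.0, 0.001

base = ideal_string_frequency(L, W, MU)
half_length = ideal_string_frequency(L / 2, W, MU)  # W held constant, length halved
four_w = ideal_string_frequency(L, 4 * W, MU)       # length held constant, W four-fold

print("frequency ratio, half length:", half_length / base)
print("frequency ratio, four-fold W:", four_w / base)
print("W for same pitch at half length:", weight_for_same_pitch(W, L, L / 2))
```

Under these assumptions both adjustments give the same frequency ratio, and the weight needed to hold the frequency unchanged at half the length is one quarter of the original. The sketch reproduces only the conventional arithmetic behind the string ratio and tension ratio statements; it takes no position on Fin.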
If Pythagoras had held length of string constant and concluded that pitch intervals corresponded to 4W per octave, all efforts would have aimed at explaining how the ear perceives multiples of W as pitch intervals. Ohm would have had to find a way to relate spectral components of sound to ratios of W if he wanted to establish an acoustic law. The distribution of harmonic partials of sound wave to reflect ratios of tension could probably prove to be hard, but since technological limitations obliged him and his contemporaries to use resonators, one can always build a resonator to enhance the perception of any frequency component in sound wave. He could have found a way somehow. Since Seebeck did not resort to such a gadget, he could not hear the harmonic partials that Ohm and Helmholtz claimed to hear.
If we leave frequency of vibration out of the discussion, we come to the real problem. At the neurophysiological level, rather than discover piano-like arrangements of tuned nerve fibres that responded to the fundamental frequency of incoming sound wave, Helmholtz would have had to explain the way the ear perceives the tension of a string as pitch. That is the problem facing us right now: How does the ear perceive the tension of a string as pitch? Unless this is done, all that we have discovered thus far is totally meaningless, as none of it can enter into a theory of perception without establishing the listener’s access to the stimulus dimension: tension. The good news is that this has been done (cf. Essien, 2000). Therefore, the stage is set for a new foundation for hearing sciences and new methodologies in auditory research. In this regard, while disputing the significance of frequency of vibration in hearing research, Stevens (1960) wrote:
“Generally speaking, the fit is better to the degree that the dimensions and qualities of the things we study are measurable on well-founded scales. When description gives way to measurement, calculations replace debate.” (S. S. Stevens, 1960:1)
The groundwork having been completed, we have to move on to a well-founded scale for pitch. We now invite advanced mathematics and physics to quantify the tension of vibrating bodies as described in this study. The door is open to advanced experimental psychology to come in, establish and summate jnds to yield a psychological scale for pitch. A study of spectral slices as a function of the relative tension of vibrating bodies provides an astounding insight into the mechanics of spectral change and frequency modulations in tone segments. Thus, advanced acoustics is most welcome to find the acoustic representation of tension or establish one in order for acoustics to explain pitch acoustically. The sum total of all such accurate studies would establish measures of pitch on the basis of the tension of sonorous bodies, the proposed dimension that underlies the sensation of pitch and tone. It would provide a psychological dimension for neurophysiological investigations into the way the organism internalises the sensation pitch, and eventually the other attributes of sound. The data we have examined thus far suggest that this is apparently the only rewarding course of action to follow.
To this day, however, having been led astray by the fundamental error pointed out in this work, hearing research has pursued many glittering and glamorous illusions. Consequently, as speech, music and hearing scientists, we can only point to an astronomically high stockpile of literature of failure as our scientific heritage. Now, though, it seems that the time has arrived for us to come out of the impasse and head out in a new direction, reaching out for the reality.
6. Summary and Conclusion
A close examination of pitch production in nature was conducted in this study. Findings from phonological tone perception experiments using drums and human speech signals suggested that the sensation of tone was linked much more to some mechanical parameter of the vibrating body than to frequency of vibration. Further investigations involved the resistance of bodies as a function of their physical dimensions. Thus, a distinction was made between the weight of the mass suspending from the string and the tension of the string. In this way, the inherent force of resistance of strings was shown to increase with decreasing length. These facts showed that strings trade length to compensate for the force supplied by the weight of the mass suspending from the string. In other words, the true tension T of a string is the sum of the force of resistance that is supplied externally by the weight of the mass suspending from the string (Fex), and the force that is the inherent property of the string (Fin).
These findings point to just two (of many) errors in Pythagorean understanding of the physics and psychophysics of the monochord: (1) The weight of the mass W suspending from a string is not an adequate yardstick for measuring the tension of the string. (2) The findings of this study point to tension, as defined in this study, as the origin of the sensation of tone. Therefore, the claim that the tension of a string was constant during adjustments to the effective length of string was the fundamental error in hearing research. Furthermore, the fact that hearing sciences were established on string ratios in the absence of any scientific significance to it is something that we would really not want to call to mind. It is hoped that the light shed by the present work will free music, speech and hearing research from the spell of Pythagorean mystical wisdom.
7. Acknowledgments
I am irredeemably indebted to N. Waterson for the whole-hearted support, advice and guidance since the beginning of this research work at SOAS, University of London in 1985. Also, J. Vaissière, at the Institute of Applied Linguistics and Phonetics (ILPGA), Sorbonne University, Paris III, and S. Maeda, at the National Centre for Scientific Research (CNRS), Paris, showered upon me more care than I had expected to receive as a student; they helped me bring my research work to fruition. The few who expressed opposition in different ways are hereby remembered for having helped me re-examine my position regularly, so that I could see through their eyes what this work needed to prove its point. My friends and colleagues at SOAS (London) and ILPGA (Paris) are fondly remembered for their support in the course of my stay with them. All errors of judgment remain my exclusive responsibility.
8. References
Colton, R. H. (1988): Physiological mechanisms of vocal frequency control: The role of tension. J. of Voice, 2.3, 208-220.
Essien, A. J. (2000): The Perception of Yoruba Tone: Experimental Evidence from Drums, Speech Signals and Synthesis. Unpublished doctoral thesis (in French), ILPGA. Sorbonne University, Paris III. France.
Essien, A. J. (2013): Unfounded foundation of hearing research. In review for publication by Attention, Perception and Psychophysics. A copy of the research work is available on this website.
Fechner, G. (1860): Elements of Psychophysics, Vol. 1. Translated from the German by Adler, H. E. (1966). Holt, Rinehart & Winston. New York. London.
Helmholtz, H. L. F. (1877): On the Sensation of Tone: As a Physiological Basis for the Theory of Music. Translated from the German version of 1877 and revised by Ellis, A. J. Dover Publications (1954). New York.
Hirano, M., Kurita, S. & Nakashima, T. (1983): Growth, development and aging of human vocal folds. In Bless, D. M. & Abbs, J. H. (Eds.) Vocal Fold Physiology: Contemporary Research and Clinical Issues, pp. 22-43. College Hill Press. San Francisco.
Hirsh, I. J. & Watson, C. S. (1996): Auditory Psychophysics and Perception. Annual Review of Psychology, 47, pp. 461-464.
Hollien, H. (1983): In search of vocal frequency control mechanism. In Bless, D. M. & Abbs, J. H. (Eds.) Vocal Fold Physiology: Contemporary Research and Clinical Issues, pp. 361-367. College Hill Press. San Francisco.
Jakobson, R. & Waugh, L. (1979): The Sound Shape of Language. The Harvester Press. Sussex. United Kingdom.
Jeans, J. (1937): Science and Music. Cambridge University Press. London.
Kinsler, L. E. (1971): Vibration. McGraw-Hill Encyclopaedia of Science and Technology, Vol. 14, pp. 361-366. McGraw-Hill. New York.
Ohm, G. S. (1843): Ueber die Definition des Tones, nebst daran geknüpfter Theorie der Sirene und ähnlicher tonbildender Vorrichtungen. Ann. Phys. Chem. 59, 513-565.
Pythagoras (2012): Encyclopædia Britannica Online. Retrieved 12 February, 2012, from http://www.britannica.com/EBchecked/topic/485171/Pythagoras
Stevens, S. S. (1960): Mathematics, measurement, and psychophysics. In S. S. Stevens (Ed.) Handbook of Experimental Psychology, pp. 1-49. John Wiley & Sons. New York. London.
Wyke, B. (1983): Neuromuscular control systems in voice production. In Bless, D. M. & Abbs, J. H. (Eds.) Vocal Fold Physiology: Contemporary Research and Clinical Issues, pp. 71-76. College Hill Press. San Francisco.