Mapping timbre spaces of synthetic vowels with ChucK
Mr. Danila Gomulkin, MA, Saint Petersburg (Russia), November 2016
Acoustic data are widely used in phonetic science today, and virtually every paper on ongoing language change displays plots of vowel formants. It is often forgotten, however, that formant frequencies alone, without their amplitudes, do not completely define the timbre of a vowel. Vowels with similar formant frequencies may sound different, and vowels with different formant frequencies may sound the same. To understand how formant frequencies define vowel timbre, we have to resort to synthetic vowels, where the amplitude and quality of the resonances can be kept under control.

A series of experiments on how changes to a vowel's spectrum (through filtering and through speeding up or slowing down the sound) affect its perception was described by Tsutomu Chiba and Masato Kajiyama in 1942 [1]. Interesting experimental data on the perception of one- and two-formant synthetic vowels were presented by a group of scientists from Haskins Laboratories in 1952 [5]. Detailed maps of the timbres of synthetic vowels were published by Ralph Miller in 1953 [7], who also described how the amplitude of resonances affects the perception of vowel timbre. A daring theoretical and artistic interpretation of the synthetic vowel space was presented by Wayne Slawson in 1985 [9]. Numerous experiments on how changes in the frequency and amplitude of resonances affect the perception of synthetic vowels were conducted at the Pavlov Institute of Physiology from the 1960s through the 1980s [2, 3, 4]. Many of these experiments were limited in scope, partly because of the difficulty and cost of working with analog signals.

In this study we took advantage of modern digital synthesis methods, which allow for better control over spectral parameters. Vowel Space Explorer, a simple vowel synthesis program, was written in the ChucK programming language [6]. Sound tokens of constant duration and pitch were synthesized by passing a sawtooth signal through parallel filters.
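The synthesis chain described above (a sawtooth source feeding filters connected in parallel) can be sketched as follows. This is a minimal Python illustration, not the author's ChucK program: the sample rate, pitch, duration and Q values, and all function names, are assumptions made for the example, and the resonator is a generic two-pole, two-zero design with unit gain at its center frequency, of the kind covered in filter texts such as [10], rather than ChucK's own ResonZ implementation.

```python
import cmath
import math

SAMPLE_RATE = 44100  # Hz; an assumed value, not given in the abstract


def sawtooth(pitch_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Naive (not band-limited) sawtooth source in the range [-1, 1)."""
    count = int(duration_s * sample_rate)
    return [2.0 * ((i * pitch_hz / sample_rate) % 1.0) - 1.0 for i in range(count)]


def resonator_coeffs(center_hz, q, sample_rate=SAMPLE_RATE):
    """Return (gain g, pole radius R, pole angle theta) for one resonance."""
    theta = 2.0 * math.pi * center_hz / sample_rate
    bandwidth_hz = center_hz / q
    # Common approximation relating bandwidth to pole radius.
    radius = math.exp(-math.pi * bandwidth_hz / sample_rate)
    return 1.0 - radius, radius, theta


def resonate(signal, center_hz, q, sample_rate=SAMPLE_RATE):
    """y[n] = g*(x[n] - R*x[n-2]) + 2R*cos(theta)*y[n-1] - R^2*y[n-2]."""
    g, r, theta = resonator_coeffs(center_hz, q, sample_rate)
    a1, a2 = 2.0 * r * math.cos(theta), -r * r
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = g * (x - r * x2) + a1 * y1 + a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def gain_at(freq_hz, center_hz, q, sample_rate=SAMPLE_RATE):
    """Magnitude of the resonator's frequency response at freq_hz."""
    g, r, theta = resonator_coeffs(center_hz, q, sample_rate)
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate)  # z^-1
    num = g * (1.0 - r * z1 * z1)
    den = 1.0 - 2.0 * r * math.cos(theta) * z1 + r * r * z1 * z1
    return abs(num / den)


def vowel_token(f1_hz, f2_hz, pitch_hz=110.0, duration_s=0.5, q=10.0):
    """Sum two resonators fed by the same sawtooth (parallel connection).

    pitch_hz, duration_s and q are illustrative defaults, not the study's.
    """
    source = sawtooth(pitch_hz, duration_s)
    low = resonate(source, f1_hz, q)
    high = resonate(source, f2_hz, q)
    return [a + b for a, b in zip(low, high)]
```

For example, `vowel_token(700.0, 1200.0)` returns half a second of samples for a token with resonances at 700 and 1200 Hz. Because each resonator in this sketch has unit gain at its own center frequency, moving a resonance across the frequency range does not by itself change its level.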
Resonance frequencies of the filters were varied from 0 to 3000 Hz in 100-Hz increments. The timbre of each token was aurally matched with one of 28 sample tokens and tagged with the corresponding IPA symbol. Several maps of synthetic vowel timbres in the F1/F2 plane were plotted for different filter types and configurations (400 tokens per map). The resulting vowel spaces visualize how changes in filter parameters affect the structure of the vowel space and allow identification of the parameters that affect the phonetic categorization of sound tokens.

The results support the division of vowel timbres into pure and complex timbres (one- and two-formant vowels) [1, 7]. The maps visualize the effects of masking on the shape of the borders between timbre zones [4]. It was also shown that within a critical band [6] the timbres of the corresponding one-formant vowels merge into a pure tone of intermediate quality; when the distance between the formants increases, their timbres interact rather than merge, first as mutual deduction (timbres of back unrounded vowels), then as overlaying (timbres of front rounded vowels) of the two pure timbres [8]. The constant-resonance-amplitude filters, ResonZ, showed the finest resolution, revealing a more complex pattern in the timbre deduction/overlaying cycles than regular biquad filters [10].

[1] Chiba T., Kajiyama M. (1941) The Vowel: Its Nature and Structure. Tokyo-Kaiseikan, Tokyo.
[2] Chistovich L. A. (1985) Central auditory processing of peripheral vowel spectra // J. Acoust. Soc. Am. 77: 789–805.
[3] Chistovich L. A., Chernova E. I. (1986) Identification of one- and two-formant steady-state vowels: a model and experiment // Speech Communication 5: 3–16.
[4] Chistovich L. A., Lublinskaya V. V. (1979) The 'center of gravity' effect in vowel spectra and critical distance between the formants: psychoacoustical study of the perception of vowel-like stimuli // Hearing Research 1: 185–195.
[5] Delattre P., Liberman A. M., Cooper F. S., Gerstman L. J. (1952) An experimental study of the acoustic determinants of vowel colour; observations on one- and two-formant vowels synthesized from spectrographic patterns // Word 8: 195–210.
[6] Kapur A., Cook P., Salazar S., Wang G. (2013) Programming for Musicians and Digital Artists: Creating Music with ChucK. Manning Publications. 344 p.
[7] Miller R. L. (1953) Auditory tests with synthetic vowels // J. Acoust. Soc. Am. 25: 114–121.
[8] Schane S. A. (1996) Diphthongization in Particle Phonology // Goldsmith J. A. (ed.) The Handbook of Phonological Theory. Blackwell Publishing.
[9] Slawson W. (1985) Sound Color. University of California Press. 266 p.
[10] Smith J. O. (2007) Introduction to Digital Filters: with Audio Applications. W3K Publishing. 480 p.