nasal vowels following a nasal consonant

NASAL VOWELS FOLLOWING A NASAL CONSONANT

Antonio Teixeira, Francisco Vaz

Dep. Electrónica e Telec./IEETA, Universidade de Aveiro, Portugal

http://www.ieeta.pt/ajst

José Carlos Príncipe

Computational NeuroEngineering Laboratory, University of Florida, USA http://cnel.ufl.edu/principe/principe.html

ABSTRACT

In this paper we present a study of the influence of velum variation over time, and of the following segment, on the perception of nasality in nasal vowels after a nasal consonant. The studies consisted of natural speech analysis, simulations, and perceptual tests. Our results give further support to the theory of nasal vowels as dynamic sounds. The context is also shown to influence nasality perception. At the beginning of the vowel, the release of the oral closure results in an overall energy increase, dominated by lips radiation. If a high-energy segment, such as an oral vowel, follows the nasal vowel, the perception of nasality is reduced. A low-energy ending, with dominant nasal radiation, is necessary. Nasal vowels can be regarded as diphthongs that start with dominant lips radiation and tend gradually to a low-energy configuration dominated by nasal radiation. The end configuration can be realized as a nasalized continuation of the first part or as a glide to the following segment.

1 INTRODUCTION

The Portuguese language is very rich in nasal sounds [1]. The far from natural quality of these sounds as produced by present synthesizers motivated our production and perception studies. Several authors have hypothesized that nasality perception may depend on the timing of velum movements [2]. For Portuguese, the manifestation of nasality in the temporal dimension was first mentioned by Lacerda and Strevens in 1956 [3]. Our first studies [4] resulted in higher naturalness for synthetic nasal vowels produced with a dynamic velum. Velum variation was typical of CṼC contexts with both consonants oral. The results were in accordance with the view of nasality "... as a dynamic trend from an oral configuration toward the pharyngonasal configuration." [5].

But nasal vowels also occur after nasal consonants. This context poses new problems. At the beginning of the vowel the velum is already open. Also, the nasal consonant before the nasal vowel may restrict the possible phonemes after the vowel. That is, with nasalization already present before the vowel, some properties of the following segment may be needed for the nasal vowel to be perceived.

To address these problems we carried out three types of studies: analysis of natural nasal vowels after a nasal consonant; simulations of lips and nostrils radiation; and perceptual tests. The simulations and perceptual tests used an articulatory synthesizer developed specially to model nasal sounds [6].

2 NASAL VOWELS AFTER NASAL CONSONANTS IN PORTUGUESE

A limited number of forms with nasal vowels appeared in Portuguese as a result of progressive assimilation. Initial /m/ and, less commonly, initial /n/ nasalized the following vowel, particularly high and stressed vowels. Nasal vowels before an oral vowel were unstable and changed to an oral vowel followed by a palatal nasal consonant. Denasalization also occurred in other contexts; for example, the word `mesa' derived from MENSA. More details can be found in [1]. Presently, NṼ sequences in Portuguese appear before occlusives or fricatives. Some Portuguese dialects use contrastive nasalization to distinguish between the first person plural of the present and preterite tenses, e.g. am[ɐ̃]mos `we love' (from Latin AMAMUS) and am[a]mos `we loved' (from AMMAMUS) [7, p. 77].

3 NATURAL SPEECH ANALYSIS

The analysis addressed what this context has in common with the non-nasal context. We found that in both cases there is an energy increase at the beginning of the vowel and a slow decrease of the energy towards the end. Figure 1 presents an example of a nasal vowel before an occlusive. The speech signal and the intensity are represented simultaneously. The energy increase in the transition from the nasal consonant to the nasal vowel is noticeable, as are the decrease in energy during the nasal vowel and the low energy of the following segment, the [t].

Figure 1: Natural NṼOcl sequence (speech signal and energy). [Plot: intensity (dB) over time; segments [m], [ɐ̃], [t], [u].]

A comparison between an oral vowel and a nasal vowel, in a nasal consonant-fricative context, is presented in Figure 2. The transition from high to low energy before the fricative is much faster in the oral vowel case. The energy is also more stable during the oral vowel than during the nasal vowel.

Figure 2: Natural NVFric (top) and NṼFric (bottom) sequences. [Plots: intensity (dB) over time; segments [m], [a], [s], [u] (top) and [m], [ɐ̃], [s], [u] (bottom).]

Another context for the nasal vowel, at the end of a word, is analyzed in Figure 3. The corresponding oral vowel in the same context is presented for comparison. Once more the energy increase from the preceding nasal consonant to the vowel is noticeable. This also happens for the oral vowel. The energy decrease at the end is much more gradual for the nasal vowel.

Figure 3: NṼ and NV at word end. [mĩ] and [mi] are presented in sequence. [Plot: intensity (dB) over time.]

Other contexts generally only appear in word sequences. Figure 4 presents the case of a nasal vowel before an oral vowel. The nasal vowel is at the end of a word and the oral vowel at the beginning of the next word. The end of the nasal vowel has a low-energy segment before the energy increase of the oral vowel.

In all the contexts presented, an energy increase at the beginning of the nasal vowel and a gradual decrease of energy during the nasal vowel are noticeable. Also, a high-energy segment does not follow immediately.
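The intensity contours discussed in this section can be approximated by a short-time energy measure in dB. The paper does not state how its intensity curves were computed, so the following is only a minimal sketch under assumed settings: the function name `intensity_db`, the frame length, the hop size and the synthetic three-part test signal (weak, strong, weak) are all illustrative choices, not the authors' procedure.

```python
import math


def intensity_db(signal, frame_len=400, hop=160, floor=1e-12):
    """Short-time intensity contour: 10*log10 of the mean squared amplitude per frame."""
    contour = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # The floor avoids log10(0) on silent frames.
        contour.append(10.0 * math.log10(max(power, floor)))
    return contour


def tone(amplitude, n_samples, freq=150.0, fs=16000):
    """Hypothetical test signal: a sine wave standing in for a voiced segment."""
    return [amplitude * math.sin(2 * math.pi * freq * k / fs) for k in range(n_samples)]


# Toy stand-in for a nasal consonant, a vowel onset, and a low-energy ending.
signal = tone(0.1, 1600) + tone(1.0, 1600) + tone(0.05, 1600)
contour = intensity_db(signal)
```

On such a contour, the energy increase at the consonant-to-vowel transition and the low-energy ending show up as a rise followed by a fall in the dB values.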

4 SIMULATIONS


By simulation, we investigated the energy variation at the beginning and end of the nasal vowel. Looking at the lips and nostrils radiation separately at the release of the oral closure (Fig. 5, the [mɐ̃] case), it is clear that the opening of the oral passage makes the orally radiated signal dominant. This is also true for vowels with a reduced oral passage, such as [ĩ], in Fig. 6. Nasal radiation remains almost the same for [ɐ̃], or is reduced for [ĩ]. Lips radiation dominates because the oral passage offers less resistance to sound propagation. The overall energy of the signal increases at the vowel start.

Figure 5: Simulation results for the [mã] sequence. [Panels: total radiation, oral radiation, nasal radiation; horizontal axis: time (ms), 100 to 400.]

Figure 6: Simulation results for the [mĩOcl] sequence. [Panels: total radiation, oral radiation, nasal radiation; horizontal axis: time (ms), 100 to 400.]

Figure 6 presents a simulation of a nasal vowel before an occlusive. The movement of the velum and oral articulators over time, typical of this context, causes a progressive decrease in lips radiation and an increase in nasal radiation. The closure of the oral passage before velum closure creates a final segment with only nasal radiation (a nasal consonant is created due to coarticulation). The effect on the total radiated signal is a gradual decrease in energy ending in a low-energy segment.
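Which radiation dominates at each instant can be quantified by comparing the energy of the lips and nostrils signals frame by frame. The sketch below is not the articulatory synthesizer of [6]; it only post-processes two radiated signals that are assumed to be available as sample lists. The function names, frame size and toy signals (lips radiation present only in the middle, roughly constant nostrils radiation) are illustrative assumptions.

```python
import math


def frame_energy(x, start, frame_len):
    """Sum of squared samples in one frame."""
    return sum(v * v for v in x[start:start + frame_len])


def oral_nasal_balance_db(oral, nasal, frame_len=160, hop=160, floor=1e-12):
    """Per-frame lips/nostrils energy ratio in dB; positive values mean lips radiation dominates."""
    n = min(len(oral), len(nasal))
    balance = []
    for start in range(0, n - frame_len + 1, hop):
        e_oral = max(frame_energy(oral, start, frame_len), floor)
        e_nasal = max(frame_energy(nasal, start, frame_len), floor)
        balance.append(10.0 * math.log10(e_oral / e_nasal))
    return balance


def tone(amplitude, n_samples, freq=150.0, fs=16000):
    """Hypothetical test signal standing in for a radiated waveform."""
    return [amplitude * math.sin(2 * math.pi * freq * k / fs) for k in range(n_samples)]


# Toy case: oral closure, open oral passage, oral closure again (nasal-only ending).
oral = [0.0] * 1600 + tone(1.0, 1600) + [0.0] * 1600
nasal = tone(0.2, 4800)
balance = oral_nasal_balance_db(oral, nasal)
```

In this toy case the balance is negative during the closures and positive while the oral passage is open, which mirrors the lips-dominant start and nasal-dominant end described above.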

Figure 4: NṼ-V sequence. [Plot: intensity (dB) over time.]

5 PERCEPTUAL TESTS

To find out to what extent the variation of the velum over time influences the perception of the nasal vowel in this context, we performed two perceptual tests. Being interested in quality, we decided on a paired comparison preference test. The lack of detailed information about velum movement and oral tract configuration for the Portuguese nasal vowels makes it impossible to produce very high quality synthetic stimuli. To deal with this limitation we also performed an identification test.

5.1 Stimuli

Variants of the nasal vowels after a nasal consonant, in various contexts, were generated using an articulatory synthesizer [6, 4]. Two factors were considered: the velum and the following segment. The stimuli are presented, in a very concise form, in Table 1, which lists the context, the parameter varied in that context, the values used for the parameter, and the stimulus number.

Context             Factor                                      Par. value  St
NṼV                 oral vowel duration                         0           1
                                                                50          2
                                                                -40         3
NṼOcl               nasal consonant created by coarticulation   40          4
                                                                -40         5
                                                                10          6
NṼ#                 nasal consonant at the end                  0           7
                                                                40          8
                                                                -1          9
                                                                fric 40     10
NṼ velum constant   aperture                                    N=V         11
                                                                N≠V         12
                                                                V=0         13
NṼFric              reduced oral passage duration               40          14
                                                                10          15
NṼN                 second nasal consonant duration             0           A
                                                                25          B
                                                                50          C
                                                                100         D
NVN                 second nasal consonant duration             100         E

Table 1: Description of the stimuli (durations in ms).

Briefly, and starting from the top of the table: for the nasal vowel before the corresponding oral vowel, produced by only closing the velum, we varied the oral vowel duration. Three cases were produced: duration 0 but with a transition to the oral vowel; duration 40 ms; and no vowel or transition. In the second context, the duration of the nasal consonant created by coarticulation at the end of the nasal vowel was used as the parameter. Values of 10 and 40 ms, and no consonant at all (represented by -40), were used. For the nasal vowel at the end of a word, the parameter is also a final nasal consonant, created by closing the oral passage. If the oral closure occurs only at the very end we have the 0 case. The case of no oral closure is represented by -1. If there is no closure but the oral passage is reduced to a small passage, similar to a fricative, we have the case noted as fric 40. Three cases of constant velum aperture were produced: equal aperture for the nasal consonant and vowel; velum open but with different apertures for vowel and consonant; and velum open only during the nasal consonant. For nasal vowels before fricatives, the parameter used was the duration of the reduced oral passage, caused by the fricative, with simultaneous nasal radiation. Two values were used: 10 and 40 ms.

For the case of a nasal vowel between two nasal consonants we varied the duration of the second consonant. Values of 0, 25, 50 and 100 ms were used. An example of an oral vowel in the same context, with a second nasal consonant of 100 ms, was also produced. Stimuli for this case, labeled A to D, were only made for [ɐ̃]. Only for this case does a phonemic opposition between NVN and NṼN exist in Portuguese.

Regarding velum variation in the nasal consonant and at the beginning of the nasal vowel: for high vowels, the velum lowers for the consonant and then rises to some extent during the vowel. For low vowels the velum continues to lower during the vowel [2].

5.2 Preference Test

Due to the number of stimuli, a standard AB test would be very long. It is also a waste of time to compare low-quality stimuli when we are interested in determining the best stimuli. We decided on a paired comparison test using a tournament strategy. The tournament works as follows. Each stimulus is evaluated against other, randomly chosen, stimuli a fixed number of times. After this, the half of the stimuli with the higher scores is kept. The process is repeated until only one stimulus remains. After the test, the stimuli are ranked by their total points. Listeners have four options for each pair: choose the first stimulus as the best; choose the second; choose both; choose none. The winning stimulus is awarded the number of points of the loser plus one, so that winning against a good stimulus is rewarded. Choosing the last option penalizes both stimuli: they lose half of their present points.

Figure 7: Classification for [ĩ]. [Plot: classification versus stimulus, stimuli 1 to 15.]

The results, represented in Figs. 7 to 9, confirmed the contribution of velum variation to the perception of nasality. For the three vowels, stimuli with constant velum obtained very poor classifications. The influence of the segment after the nasal vowel is also noticeable for the three vowels. Stimuli with a nasal vowel before an oral vowel have very poor classifications. In contrast, stimuli typical of a nasal vowel before an occlusive, or before a fricative, are in the top places regarding quality.
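The tournament scoring used in the preference test (Section 5.2) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the test software used in the study: the function names, the `judge` callback standing in for a listener, the number of comparisons per stimulus, and in particular the handling of the "both" option (one point to each stimulus) are hypothetical, since the text does not specify them.

```python
import random


def update_scores(scores, a, b, choice):
    """Apply one listener decision for the pair (a, b) to the score table."""
    if choice == "first":
        scores[a] += scores[b] + 1   # winner gains the loser's points plus one
    elif choice == "second":
        scores[b] += scores[a] + 1
    elif choice == "both":
        scores[a] += 1               # assumption: "both" gives each stimulus one point
        scores[b] += 1
    elif choice == "none":
        scores[a] /= 2               # both stimuli lose half of their present points
        scores[b] /= 2
    else:
        raise ValueError(f"unknown choice: {choice!r}")


def run_tournament(stimuli, judge, comparisons=3, seed=0):
    """Repeated rounds of random pairings; the better-scoring half survives each round."""
    rng = random.Random(seed)
    scores = {s: 0.0 for s in stimuli}
    alive = list(stimuli)
    while len(alive) > 1:
        for s in alive:
            for _ in range(comparisons):
                opponent = rng.choice([t for t in alive if t != s])
                update_scores(scores, s, opponent, judge(s, opponent))
        alive.sort(key=lambda s: scores[s], reverse=True)
        alive = alive[:max(1, len(alive) // 2)]
    # Final ranking uses the total points of all stimuli, including eliminated ones.
    ranking = sorted(stimuli, key=lambda s: scores[s], reverse=True)
    return ranking, scores


# Usage with a hypothetical listener who always prefers the higher-numbered stimulus.
ranking, scores = run_tournament(list(range(1, 9)),
                                 lambda a, b: "first" if a > b else "second")
```

Because a win adds the loser's current points, beating a stimulus that has already accumulated points is worth more than beating a weak one, which is the property the text describes.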

Figure 8: Classification for [ũ]. [Plot: classification versus stimulus, stimuli 1 to 15.]

For the higher quality cases, there is a clear preference for the NṼOcl stimuli with a 40 ms nasal consonant at the end of the nasal vowel. Results for the NṼ# case were very different for each of the three vowels. We do not have an explanation for this, except the possible incorrect production of some of the stimuli.

Figure 9: Classification for [ɐ̃]. [Plot: classification versus stimulus, stimuli 1 to 15 and A to E.]

Figure 10: Classification for the N[ɐ̃]N and N[a]N cases. [Plot: classification (1 to 5) versus stimulus, stimuli A to E.]

Results for the NṼN and NVN contexts, ranked as if they were the only stimuli in the test (Fig. 10), show the influence of the duration of the second nasal consonant on the perceived nasality. Stimuli with no nasal consonant after the vowel were rated as being of lower quality. The difficulty of using nasality to distinguish two words, between two nasal consonants, is demonstrated by the similar scores obtained by the stimuli produced with open and closed velum during the vowel.

5.3 Identification Test

An identification test was performed using the stimuli for the three vowels used in the preference test and similar stimuli for the other two Portuguese nasal vowels (only for a subset of the contexts, indicated in Table 2 by "5 v"). Six listeners participated in this test. Listeners labeled each stimulus as one of the Portuguese oral and nasal vowels. When none was adequate they could use two other labels: one for another oral vowel, the other for a nasal vowel not among the five Portuguese nasal vowels.

Context          St   Id   Perc (%)
NṼV (5 v)        1    12   13.3
                 2    6    6.7
                 3    13   14.4
NṼOcl (5 v)      4    42   46.7
                 5    12   13.3
                 6    26   28.8
NṼ# (3 v)        7    15   27.8
                 8    23   42.6
                 9    9    16.7
                 10   20   37.0
Constant (3 v)   11   2    3.7
                 12   2    3.7
                 13   4    7.4
NṼFric (5 v)     14   28   31.1
                 15   27   30.0
NṼN (1 v)        A    6    33.3
                 B    5    27.8
                 C    8    44.4
                 D    5    27.8
NVN              E    7    38.8

Table 2: Identification results (6 listeners, 3 repetitions).

The results, summarized in Table 2, confirmed the higher quality of the stimuli generated using velum variation in time. Constant velum cases were identified very few times. Nasal vowels before a high-energy segment, provided by a following oral vowel, also had low identification scores. The following oral vowel reduces the perception of nasality, and this effect increased with oral vowel duration. NṼV sequences are generally perceived as NV when the oral vowel has a duration of 50 ms. The tests showed a higher rate of identification for nasal vowels in NṼOcl and NṼFric contexts. Top 3 stimuli are highlighted in the table. In NṼOcl contexts, stimuli with a 40 ms nasal consonant at the end were identified more times, followed by the 10 ms case.

6 DISCUSSION

A nasal vowel, at least in European Portuguese, is not a sound obtained only by lowering the velum. The way the velum aperture and the other articulators vary in time is important. This was confirmed in this work for nasal vowels after nasal consonants. Our previous work, with non-nasal environments [4], had similar results. The release of the oral closure at the beginning of the vowel results in an energy increase, due to dominant lips radiation. If a high-energy segment follows immediately, it is very difficult to perceive the nasality of the nasal segment. These results point to the necessity of a low-energy ending, with dominant nasal radiation, for the nasal vowel. Nasal vowels can be regarded as diphthongs [8], starting with dominant lips radiation and ending in a configuration dominated by nasal radiation. This end condition can be obtained with a reduced or occluded oral passage. The transition is gradual.

7 ACKNOWLEDGMENTS

This work was funded by the Portuguese research foundation (FCT) under program PRAXIS XXI, reference PRAXIS/P/PLP/11222/1998.

REFERENCES

[1] R. Sampson, editor. Nasal Vowel Evolution in Romance. Oxford University Press, 1999.
[2] H. Clumeck. Patterns of soft palate movements in six languages. Journal of Phonetics, 4:337-351, 1976.
[3] A. Almeida. The Portuguese nasal vowels: Phonetics and phonemics. In J. Schmidt-Radefelt, editor, Readings in Portuguese Linguistics, pp. 348-396. North Holland, 1976.
[4] A. Teixeira, F. Vaz, and J. C. Príncipe. Influence of dynamics in the perceived naturalness of Portuguese nasal vowels. Proc. ICPhS, 1999.
[5] G. Feng and E. Castelli. Some acoustic features of nasal and nasalized vowels: A target for vowel nasalization. JASA, 99(6):3694-3706, 1996.
[6] A. Teixeira, F. Vaz, and J. C. Príncipe. A Software Tool to Study Portuguese Vowels. Proc. Eurospeech, v. 5, pp. 2543-2546, Rhodes, 1997.
[7] J. Hajek. Universals of Sound Change in Nasalization. Blackwell, 1997.
[8] S. Parkinson. Portuguese nasal vowels as phonological diphthongs. Lingua, 61:157-177, 1983.