Tone Spans of Cantonese English

4th International Symposium on Tonal Aspects of Languages (TAL-2014) Nijmegen, The Netherlands May 13-16 2014

ISCA Archive http://www.isca-speech.org/archive

Suki S.Y. Yiu
Linguistics Department, University of Hong Kong

Abstract

This paper examines the newly developed tones in Cantonese English, with a particular interest in tones that span several syllables. It provides phonetic evidence for these tone spans and demonstrates the association of tones to syllables. Audio recordings elicited from six speakers of Cantonese English were processed with Praat and then fitted with a smoothing spline analysis of variance in R, generating smoothing splines with 95% confidence intervals to determine whether the tones in different positions of a word differ significantly from each other. Pitch tracks of tones in individual words were also generated for a detailed examination of the contours of the tone spans. Results show that the realization of different tones is restricted by the position of the syllable in a word, and that those tones are significantly different from each other. H spans, Ø spans and M spans are discussed and illustrated with tonal associations.

Index Terms: tone, tone spans, Cantonese English, Hong Kong English

1. Introduction

Cantonese English is a language variety that is distinctive in many respects, and these deserve a neutral and unbiased description in the variety's own terms [1]. Many aspects of the phonology of Cantonese English have already been studied [2-6]. Specifically, the use of pitch in Cantonese English, unlike tone in Cantonese, is perceptually distinct but carries a low functional load, with few if any minimal pairs in normal speech. A systematic use of pitch in Cantonese English was reported in recent studies such as [5, 7-10]. This paper refers to this systematic use of pitch as tone. It asks whether the ‘tones’ differ significantly from each other phonetically, and how some tones extend across several syllables to form tone spans. This paper examines these newly developed tones by fitting a smoothing spline analysis of variance (SS ANOVA) in R (version 3.0.2) [11] to the acoustic data and by inspecting pitch tracks of different tone spans in Praat (version 5.3.39) [12]. Based on the phonetic analysis, it provides generalizations about the phonetic distribution of tones in Cantonese English and illustrates their tonal associations.

2. Phonetic Evidence

Audio recordings were obtained from six speakers of Cantonese English. The speakers were balanced for biological gender, all born after 1980, and raised in Hong Kong. The data were elicited with a wordlist covering the surface tones (M(id), H(igh), Mf(alling), L(ow), Hf) and corner vowels ([i], [a], [u]) of Cantonese English. The speakers were asked to produce each word three times in a row, for instance elicit-elicit-elicit, so that the position (initial, medial or final) of each target item in an utterance was controlled. The wordlist was recorded twice so that averages could be taken. A total of 2376 target syllables (= 66 target syllables x 3 repetitions x 2 sets x 6 subjects) were recorded using Praat with a sampling frequency of 22050 Hz in a sound-proofed recording booth. Fundamental frequencies (F0) of the rhymes of each word in the utterance-final position were extracted with Praat, then time-normalized at 10% interval points with ProsodyPro (version 4.3) [13]. The F0 was then transformed into logarithmic Z-score (LZ-score) values to fit the logarithmic characteristic of pitch in speech perception [14] and production [15], and to minimise inter-speaker variation. The conversion was based on equation (1):

$$\mathrm{LZ}_i = \frac{\log x_i - \overline{\log x}}{SD} \qquad (1)$$

where $x_i$ is an observed F0 value, $\overline{\log x}$ is the mean of the log-transformed F0 values, and SD is their standard deviation. Now we are ready to fit an SS ANOVA to the LZ-scores to see if the tones in different word positions differ significantly from each other.
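As a rough illustration of the LZ-score transformation described above, the following Python sketch log-transforms one speaker's F0 values and standardizes them. It is a minimal sketch under stated assumptions, not the author's script: the function name `lz_scores`, the base-10 logarithm, the sample (n-1) standard deviation, normalization within each speaker, and the F0 values themselves are all assumptions introduced here. The choice of log base does not change the resulting Z-scores.

```python
import math
from statistics import mean, stdev


def lz_scores(f0_hz):
    """Convert one speaker's F0 values (Hz) to logarithmic Z-scores.

    Assumption: mean and SD are computed over this speaker's own
    log-transformed values; the source does not spell this out.
    """
    logs = [math.log10(f) for f in f0_hz]
    mu = mean(logs)
    sd = stdev(logs)  # sample SD (n-1); an assumption
    return [(x - mu) / sd for x in logs]


# Hypothetical F0 samples (Hz) for two speakers with different pitch ranges.
speaker_a = [100.0, 200.0, 400.0]
speaker_b = [150.0, 300.0, 600.0]

for name, f0 in (("A", speaker_a), ("B", speaker_b)):
    print(name, [round(z, 3) for z in lz_scores(f0)])
```

The two hypothetical speakers differ by a constant frequency ratio, so their LZ-scores coincide, which is the sense in which the transformation reduces inter-speaker variation.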

2.1. SS ANOVA plots

An SS ANOVA was fitted to the LZ-score values in R. It is a statistical technique that uses an ANOVA to estimate the means of dependent variables that trace a contour over some dimension, in this case time [16]. A smoothing spline of the estimated values is constructed by connecting discrete data points with a polynomial function that takes into account the mean, the error effect, the shape of the curve, and the interaction between the effect and the shape of the curve. This technique has been used to determine whether the shapes of multiple curves differ significantly from one another, in articulatory phonetics for comparing tongue shapes on ultrasound images [17-19] and in acoustic phonetics for tonal contours in F0 [20-22]. This study applied an SS ANOVA to the F0 data to generate contours showing the smoothing spline fit for each of the tones. To confirm whether the smoothing splines of the LZ-scores are statistically different, 95% confidence intervals (CIs) were constructed for the sets of curves. If the CIs of two curves overlap at any section along their length, the difference between them in that section is not statistically significant. A wider CI at a given time point indicates greater variance in the LZ-scores of all data available at that point.

Figure 1 shows the tone profile of Cantonese English for speaker F2 as an example. Recall that the words were repeated three times to control the position of the words in an utterance. Only the words in utterance-final position are presented because they provide richer material for tonal analysis. Also, since the placement of tones seems to be restricted by syllable position, three graphs, from left to right, present the tones found on syllables in word-initial, word-medial and word-final positions respectively. A word-initial syllable is the first syllable of a word, and a word-final syllable is the last syllable of a word. Any remaining syllables in between are treated as word-medial. The curves with the highest LZ-scores are H whereas those with the lowest are L, except in the initial position, where the curve with the lower LZ-score is M. The curve between H and L in the medial position is also M.
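The overlap criterion for confidence intervals can be illustrated with a short Python sketch. This is not an SS ANOVA: it swaps the spline-based intervals for pointwise normal-approximation 95% intervals around the mean at each normalized time point, purely to show how non-overlapping bands are read as significant differences. The function names, the 1.96 multiplier, and the LZ-score samples are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev

Z_95 = 1.96  # normal-approximation multiplier for a 95% interval (assumption)


def ci_band(samples_per_point):
    """Return (lower, upper) 95% bounds of the mean at each time point."""
    band = []
    for samples in samples_per_point:
        centre = mean(samples)
        half_width = Z_95 * stdev(samples) / sqrt(len(samples))
        band.append((centre - half_width, centre + half_width))
    return band


def significant_points(band_a, band_b):
    """Flag the time points at which two CI bands do not overlap."""
    return [
        hi_a < lo_b or hi_b < lo_a
        for (lo_a, hi_a), (lo_b, hi_b) in zip(band_a, band_b)
    ]


# Hypothetical LZ-scores: three tokens at each of three time points.
tone_high = [[1.0, 1.2, 1.1], [0.9, 1.1, 1.0], [0.2, 0.4, 0.3]]
tone_low = [[-1.0, -0.8, -0.9], [-1.1, -0.9, -1.0], [0.1, 0.3, 0.5]]

flags = significant_points(ci_band(tone_high), ci_band(tone_low))
print(flags)
```

In this made-up data the two bands are separated at the first two time points and overlap at the third, so only the first two points would count as significantly different under the criterion.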

Figure 1: Tone profile of Cantonese English for speaker F2.

As shown, not all tones surface in all syllable positions. H and M surface in both word-initial and word-medial positions. In the word-final position, the lower curve is regarded as a boundary L, and the upper curve, with its distinguishable falling contour, is an H in transition to the boundary L. Additional data in section 2.3 show that the more syllables there are between the rightmost H and the boundary L, the flatter the slope of each curve. These are interpolating tones from the rightmost H in a word to the boundary L. Since the 95% CIs of the curves do not overlap, their difference is statistically significant. The use of pitch as tone is therefore systematic and stable in the way described above. Despite some variation in the details of the pitch contours, the patterns described above generally hold across subjects. The sequence of tones is identified as follows:

(M (span)) - H (span) - (Ø (span)) - %L

There are three kinds of tone spans: H spans, Ø spans and M spans. Each is illustrated with pitch tracks of individual words in sections 2.2, 2.3 and 2.4.

2.2. H spans

Two kinds of H spans were recognized in the data. The Hs of one type surface as H in words in all three utterance positions, while the Hs of the other type surface only in non-utterance-final positions. The H spans in ‘understood’ and ‘criminal’ are examples of the two types respectively. Figure 2 displays two pitch-track plots. F0 is plotted in Hertz against time in seconds. From top to bottom, each plot shows the pitch track tier, the syllable tier and the tone tier. The dashed lines represent the syllable boundaries of the words. Pitch tracks were extracted from the same speaker for ease of comparison.

[Figure 2 panels, Pitch (Hz, 50-500) against Time (s). understood_non-Ufinal: un H, der H, stood H. understood_Ufinal: un H, der H, stood Hf.]

Figure 2: Hs of ‘understood’ in non-utterance-final position (upper panel) and utterance-final position (lower panel).

The first two Hs in ‘understood’ surface as H in both non-utterance-final and utterance-final position, while the last H surfaces as H in non-utterance-final position but as Hf in utterance-final position. Hf is probably the result of the boundary effect of an L on an utterance-final syllable. The fact that all three syllables surface with H regardless of the position of the word in an utterance suggests that this is a phonological H span.

[Figure 3 panels, Pitch (Hz, 50-500) against Time (s). criminal_non-Ufinal: cri H, mi H, nal H. criminal_Ufinal: cri H, mi Mf, nal L.]

Figure 3: H spans of ‘criminal’ in non-utterance-final position (upper panel) and utterance-final position (lower panel).

Figure 3 introduces a phonetic H span. Syllables following the H on the first syllable of a word show alternating tonal behaviour depending on the position of the word in an utterance. The last two syllables of ‘criminal’ surface as H in non-utterance-final position, but as Mf and L in utterance-final position. This alternation suggests that the last two Hs are phonetic Hs, forming a phonetic H span. The phonetic Hs receive their tone value from the only tone in the same word, i.e. the H on the first syllable. When the word is in utterance-final position, the last syllable receives a boundary L, and the syllable between the phonological H and the boundary L moves from the H towards the L target. Given the alternating behaviour of the tones on the syllables following the phonological H, let us call those two alternating syllables Ø syllables, and move on to section 2.3 for more details.
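The contrast between the two kinds of H span can be restated as a small rule set. The Python sketch below is an illustrative encoding of the surface labels reported for ‘understood’ and ‘criminal’, not an analysis proposed in the source: the function name `surface_tones`, the list encoding of underlying tones, and the restriction to H and Ø syllables are assumptions made here.

```python
def surface_tones(underlying, utterance_final):
    """Map underlying syllable tones ('H' or 'Ø') to surface labels.

    Hypothetical helper: it only restates the labels reported for the
    two example words and handles no other tones.
    """
    last = len(underlying) - 1
    surface = []
    for i, tone in enumerate(underlying):
        if not utterance_final:
            # Phonological H stays H; a Ø syllable takes its value from the H.
            surface.append("H")
        elif tone == "H":
            # A phonological H on the final syllable combines with the
            # boundary L and is labelled Hf.
            surface.append("Hf" if i == last else "H")
        else:
            # Ø: the final syllable takes the boundary L; others interpolate.
            surface.append("L" if i == last else "Mf")
    return surface


understood = ["H", "H", "H"]  # phonological H span
criminal = ["H", "Ø", "Ø"]    # one phonological H followed by Ø syllables

for word, tones in (("understood", understood), ("criminal", criminal)):
    print(word, surface_tones(tones, False), surface_tones(tones, True))
```

Both words come out as H H H when not utterance-final, and they diverge only utterance-finally (H H Hf against H Mf L), which matches the tone tiers of Figures 2 and 3.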


2.3. Ø spans

Focusing on the tonal contour of the Ø syllables, Figure 4 shows an example with three Ø syllables.

[Figure 4 panels, Pitch (Hz, 50-500) against Time (s). regularly_non-Ufinal: re H, gu H, lar H, ly H. regularly_Ufinal: re H, gu Mf, lar Mf, ly L.]

Figure 4: Ø syllables of ‘regularly’ in non-utterance-final position (upper panel) and utterance-final position (lower panel).

As with ‘criminal’, the first syllable of ‘regularly’ carries H whether or not the word is in utterance-final position, while all three remaining syllables alternate: they surface as H when the word is not utterance-final, and as Mf or L when it is. ‘regularly’ has one more Ø syllable available for tonal interpolation than ‘criminal’, which leads to a flatter slope of the Mfs in ‘regularly’. This indicates that the more syllables there are between the phonological H and the boundary L, the flatter the slope across those syllables. Ø syllables seem to be phonologically toneless and can carry whatever interpolating tones the utterance boundary demands.

2.4. M spans

M spans are found preceding the phonological H (span) in a word. Figure 5 shows an example.

[Figure 5 panels, Pitch (Hz, 50-500) against Time (s). introduce_Uinitial: in M, tro M, duce H. introduce_Umedial: in M, tro M, duce H. introduce_Ufinal: in M, tro M, duce Hf.]

Figure 5: Ms of ‘introduce’ in utterance-initial, utterance-medial and utterance-final positions (from the uppermost panel to lowest panel).

For ‘introduce’, there is an M span preceding the H, which surfaces as Hf utterance-finally. The syllables preceding it surface as M in all three utterance positions, unaffected by any positional effects. A comparison of the pre-high Ms (the first two Ms in ‘introduce’) with the post-high Mfs (as in ‘criminal’) shows that they differ both in their response to positional/boundary effects and in their shapes. The pre-high Ms are phonetically very stable and unaffected by positional/boundary effects, while the post-high Mfs are highly responsive to boundary effects. As for their shapes, the pre-high Ms are more horizontal (cf. Figure 5) whereas the post-high Mfs are either steeper (cf. Figure 3) or flatter (cf. Figure 4) depending on the number of post-high phonologically toneless syllables available for tonal interpolation to reach the target boundary L. Therefore, though the post-high Mfs are interpolating tones from H to L, they are not necessarily M.

3. Generalizations and association of tones

Based on the discussion of the phonetic data, here are some generalizations about the tones in Cantonese English.



H is obligatory. There is at least one H in each lexical word. Its high feature spreads rightward to all post-high Ø syllables in a non-utterance-final word.

M is optional and is found only preceding the obligatory H (span) of a word. It is phonetically very stable in shape, unaffected by any boundary effects.

Ø is optional and, where present, is found only following the obligatory H (span) of a word. It surfaces as either H or Mf/L depending on whether the word is utterance-final.

There are three types of tone spans: H spans, Ø spans and M spans.

H spans can be phonological or derived by spreading from the obligatory H of a word.

Ø spans are a chain of interpolating H or Mf/L. In a non-utterance-final word, the Ø span surfaces as phonetic Hs. In an utterance-final word, the slope of the pitch contours of the Mf spans correlates inversely with the number of syllables available between the phonological H and the %L of the word.

M spans are phonetic defaults grounded in physiological ease. The stability of the pitch contour shape and its neutrality to positional/boundary effects suggest that M may also carry some sort of phonological function.
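The inverse relation between slope and the number of available syllables can be made concrete with a toy calculation. The Python sketch below assumes strictly linear interpolation between one pitch target per syllable, which the source does not claim. The function names and the H and %L values of 1.0 and -1.0 LZ are hypothetical.

```python
def per_syllable_drop(h_value, l_value, n_between):
    """Pitch drop per syllable step when n_between toneless syllables
    lie between the phonological H and the boundary L.

    Assumption: linear interpolation with one target per syllable.
    """
    return (h_value - l_value) / (n_between + 1)


def interpolated_targets(h_value, l_value, n_between):
    """Evenly spaced targets from the H syllable to the %L syllable."""
    drop = per_syllable_drop(h_value, l_value, n_between)
    return [h_value - drop * k for k in range(n_between + 2)]


H_LZ, L_LZ = 1.0, -1.0  # hypothetical LZ-score targets for H and %L

# One intervening syllable (as in 'criminal') against two (as in 'regularly').
for word, n in (("criminal", 1), ("regularly", 2)):
    targets = [round(t, 3) for t in interpolated_targets(H_LZ, L_LZ, n)]
    print(word, round(per_syllable_drop(H_LZ, L_LZ, n), 3), targets)
```

With the same H and %L targets, each additional intervening syllable spreads the same total fall over more steps, so the per-syllable drop shrinks, in line with the flatter Mfs reported for ‘regularly’.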

Tonal association of the different spans is illustrated below, assuming the syllable as the tone-bearing unit (TBU). Solid lines and dashed lines indicate phonological tones and derived tones respectively.

3.1. H spans

[Figure 6 diagrams. Upper panel, ‘understood’ (un, der, stood, each on its own σ): H under the first σ in both the ]non-Ufinal and ]Ufinal forms, with %L at the ]Ufinal boundary. Lower panel, ‘criminal’ (cri, mi, nal, each on its own σ): H under the first σ in both forms, with %L at the ]Ufinal boundary.]

Figure 6: Phonological H spans (upper panel) and phonetic H span (lower panel).

The H span of ‘understood’ results from associating the obligatory H with all three TBUs, so that three syllables share one H. Since there is a phonological H on the utterance-final syllable, the %L has to share a TBU with that H, resulting in an Hf. The H span of ‘criminal’ is derived by spreading the obligatory H to the phonologically toneless post-high syllables in a non-utterance-final word. ‘criminal’ is also a case of Ø spans. When the word is in utterance-final position, the %L attaches to the last TBU. The pitch of any intervening syllables moves from H to the L target without those syllables being associated with any tone.
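The associations just described can be written down as plain data. The Python sketch below is one hypothetical encoding, not a notation used in the source: each tone is paired with the indices of the TBUs it is linked to, and the helper `tones_per_tbu` inverts those links to show which tones each syllable carries. It covers only the utterance-final forms of the two example words.

```python
from collections import defaultdict


def tones_per_tbu(n_syllables, links):
    """Invert tone-to-TBU links into a list of tones for each syllable.

    `links` is a list of (tone, [tbu_index, ...]) pairs; the encoding
    is an assumption made for illustration.
    """
    per_tbu = defaultdict(list)
    for tone, tbus in links:
        for tbu in tbus:
            per_tbu[tbu].append(tone)
    return [per_tbu.get(i, []) for i in range(n_syllables)]


# 'understood', utterance-final: one H shared by all three TBUs,
# with %L also linked to the last TBU.
understood_final = [("H", [0, 1, 2]), ("%L", [2])]

# 'criminal', utterance-final: H on the first TBU, %L on the last,
# and the middle TBU left unassociated.
criminal_final = [("H", [0]), ("%L", [2])]

print(tones_per_tbu(3, understood_final))
print(tones_per_tbu(3, criminal_final))
```

In this encoding the last syllable of ‘understood’ carries both H and %L, corresponding to the Hf, while the middle syllable of ‘criminal’ carries no tone at all and is left to interpolation.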

3.2. Ø spans

[Figure 7 diagrams. ‘regularly’ (re, gu, lar, ly, each on its own σ): H on the tone tier in both the ]non-Ufinal and ]Ufinal forms, with %L added in the ]Ufinal form.]

Figure 7: Ø spans.

Figure 7 presents another example, with one more intervening syllable between the obligatory H and the %L. Similarly, the Ø syllables alternate between phonetic Hs and Mf/L depending on the utterance boundary. They form either a chain of phonetic Hs or a chain of interpolating tones from the obligatory H to the %L.

3.3. M spans

[Figure 8 diagrams. ‘introduce’ (in, tro, duce, each on its own σ) in its ]Uinitial, ]Umedial and ]Ufinal forms: M and H on the tone tier in all three, with %L added in the ]Ufinal form.]

Figure 8: M spans in all three utterance positions.

The M spans of ‘introduce’ result from associating the M with two adjacent TBUs. The fact that the Ms in the utterance-medial word do not receive Hs from the preceding word could be due to the requirement of alignment between the left edges of the syntactic boundary and the phonological boundary [9], which is marked by the pre-high M(s) on the left edge of a phonological phrase, i.e. a word in this case. It serves as a phonological boundary, preventing spreading from the preceding H in another phonological phrase to the present M. From the above, we can see that a tone can be linked to multiple TBUs and a TBU can be linked to multiple tones in Cantonese English.

4. Conclusions

This paper has examined the newly developed tones in Cantonese English, with a particular interest in tones that span several syllables, on the basis of acoustic data. It has shown a distinctive use of pitch in Cantonese English, and illustrated the H spans, Ø spans and M spans with pitch tracks and with the association of tones to TBUs.

5. Acknowledgements

I thank Stephen Matthews and Diana Archangeli for many stimulating discussions. I also thank Lian-Hee Wee, ‘labrats’ Winnie Cheung and Queenie Chan, and the six subjects. This paper also benefits from useful comments by two anonymous reviewers and the audience at TAL4.

6. References

[1] Mohanan, K. P. “Describing the Phonology of Non-Native Varieties of a Language”. World Englishes, 11, 111-128, 1992.
[2] Hung, T. “Towards a phonology of Hong Kong English”. World Englishes, 19(3), 337-356, 2000.
[3] Hung, T. “Word Stress in Hong Kong English: a preliminary study”. HKBU Papers in Applied Language Studies, 9, 29-40, 2005.
[4] Peng, L. and Setter, J. “The Emergence of Systematicity in the English Pronunciations of Two Cantonese-speaking Adults in Hong Kong”. EWW, 21(1), 81-108, 2000.
[5] Luke, K. K. “Phonological re-interpretation: The assignment of Cantonese tones to English words”. Paper presented at the 9th International Conference of Chinese Linguistics. NUS, 2000.
[6] Sewell, A. and Chan, J. “Patterns of variation in the consonantal phonology of Hong Kong English”. EWW, 31(2), 138-161, 2010.
[7] Wee, L.H. “Phonological patterns in the Englishes of Singapore and Hong Kong”. World Englishes, 27(3/4), 480-501, 2008.
[8] Cheung, W.H.Y. “Span of high tones in Hong Kong English”. In Proc. of BLS 35, 72-82. Berkeley, CA, 2009.
[9] Yiu, S.S.Y. “Intonation of English Spoken in Hong Kong”. BA Thesis, Hong Kong Baptist University, 2010.
[10] Gussenhoven, C. “Tone and Intonation in Cantonese English”. TAL 3, Nanjing, May 26-29, 2012.
[11] R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/, 2013.
[12] Boersma, P. and Weenink, D. Praat: doing phonetics by computer [Computer program]. Version 5.3.57, retrieved 2012 from http://www.praat.org/, 2013.
[13] Xu, Y. Praat script ProsodyPro (version 4.3), 2012.
[14] Nolan, F. “Intonational equivalence: an experimental evaluation of pitch scales”. In Proc. of the 15th ICPhS, 771-774, 2003.
[15] Fujisaki, H. “Prosody, information and modeling: With Emphasis on Tonal Features of Speech”. In Proc. of Workshop on Spoken Language Processing, Mumbai, 5-14, 2003.
[16] Gu, C. “Smoothing Spline ANOVA Models”. Springer, New York, 2002; 2013.
[17] Davidson, L. “Comparing tongue shapes from ultrasound imaging using smoothing spline analysis of variance”. Journal of the Acoustical Society of America, 120(1), 407-415, 2006.
[18] Fruehwald, J. SS ANOVA. 2010. http://www.ling.upenn.edu/~joseff/.
[19] Chen, Y. and Lin, H. “Analysing tongue shape and movement in vowel production using SSANOVA in ultrasound imaging”. In Proc. of the 17th ICPhS, 123-127, 2011.
[20] Moisik, S., Lin, H. and Esling, J. “Larynx height and constriction in Mandarin tones”. In G. Peng and F. Shi (Eds.) Eastward flows the Great River: Festschrift in Honor of Professor William S-Y. Wang on his 80th birthday, 187-205, 2013.
[21] Chuang, C.T., Chang, Y.C., and Hsieh, F.F. “Complete and not-so-complete tonal neutralization in Penang Hokkien”. In W.S. Lee (Ed.) Proc. of the ICPLC 2013, 54-57, 2013.
[22] Mathes, T. “Extreme tonal depressor effects in Khoisan: Evidence from Tsua”. TAL 4, Nijmegen, May 13-16, 2014.
