Phonetic implementation of mid vowel contrasts across Italian varieties

Margaret E. L. Renwick

1. Italian mid vowel contrasts

- /e/ and /o/ chiuso (‘closed’): /peska/ pesca ‘fishing’, /foro/ foro ‘hole’
- /ɛ/ and /ɔ/ aperto (‘open’): /pɛska/ pesca ‘peach’, /fɔro/ foro ‘forum’

Results of k-means clustering

[Figure: Stressed Italian mid vowels, classified by k-means clustering. Panels by city: Bergamo, Milano, Venezia, Torino, Parma, Genova, Firenze, Perugia, Roma, Napoli. Axis: Midpoint F1.z. A companion figure shows the results of k-means clustering, colored by vowel transcribed by MAUS, for comparison.]

Italian /e ɛ o ɔ/ are separate phonemes, but the contrast between high and low mid vowels is marginal: they have few minimal pairs; vowels neutralize to /e, o/ in unstressed syllables; actual phonetic height may vary; and regional patterns of phonological conditioning decrease reliance on lexical specification.

[Figure, continued: panels for Bari, Lecce, Catanzaro, Palermo, Cagliari. Legends: Cluster (high mid, low mid); Transcribed vowel (e, E, o, O).]

2. Acoustics vs. intuition (Renwick & Ladd 2016)

[Figure: Normalized F1 by Normalized F2, by vowel (e, ɛ, o, ɔ); tokens marked as match or mismatch (Categorical mismatch).]

[Figure: Percent classification of stressed back vowels and front vowels, by word. Legends: Cluster height (High mid, Low mid); Syllable structure (closed, open).]

Back vowels, words shown: economico, modo, pero, po, parlero, nuovo, parole, scuola, volta, ricordo, proprio, soldo, buon, accorsi, daro, bisogna, occhi, o, signori, mondo, uno, non, conosci, persona, conosco, situazione, trovi, ancora, conto, ho, foto, rumori, giorno, ora, ogni, troppo, discorso, solo, sono, dopo, cosa, forse, con, lavoro, dove

Front vowels, words shown: certo, bella, terra, era, quel, te, prezzo, nella, sempre, dovrebbe, tempo, quella, prendo, per, sarebbe, attenta, Stefano, della, medico, e, bene, aspetti, un'azienda, gente, bel, tormento, perche, piaceva, pensi, pareva, chiesto, quello, le, recipiente, aveva, teneva, avevo, appena, cerco, se, messo, ne, momento, porgeva, vedere, vero, fretta, nel, poteva, mese, questo, fingeva, chiede, credere, legge, voleva, che

Regional differences in clustering of specific lexical items

- All speakers show some mismatches between their intuition of vowel height and its phonetic implementation

[Figure panel label: ItF7]

Rates of stressed-vowel classification into higher vs. lower clusters

- Despite a “particular closeness” between mid vowel pairs (Ladd 2006), Italian mid vowels retain their phonetic and phonological contrasts
- Speakers are, generally, good judges of their own 4 mid vowels
- Phonetic separation of mid vowels is strong
- However, the high vs. low mid vowel distinctions are also weak
- Phonological conditioning occurs in some regions, e.g. by syllable structure
- Widespread lexical variability of “Standard Italian” mid vowels

[Figure panel label: ItF12]

Hypothesis: Less-variable words appear mostly in one cluster, while more variable words appear across both clusters

[Figures: Percent tokens of PIACEVA, CONOSCI, BELLA, and DOVE clustered as high mid, by city]

While some words are consistently classified, many are variable
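The per-word rates behind this comparison can be sketched as follows. This is a minimal illustration in Python, not the poster's own code: it assumes each token is a dict with hypothetical keys `word`, `city`, and `cluster`, and that cluster labels ("high mid" or "low mid") have already been assigned.

```python
from collections import defaultdict


def percent_high_mid(tokens):
    """Map (word, city) to the percent of its tokens labeled "high mid".

    `tokens` is a list of dicts with the hypothetical keys
    "word", "city", and "cluster".
    """
    counts = defaultdict(lambda: [0, 0])  # [high-mid tokens, all tokens]
    for t in tokens:
        pair = counts[(t["word"], t["city"])]
        pair[0] += t["cluster"] == "high mid"
        pair[1] += 1
    return {key: 100.0 * high / total for key, (high, total) in counts.items()}
```

Under the hypothesis, a less-variable word would sit near 0 or 100 percent in most cities, while a more variable word would fall between those extremes.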


- A remaining research question: How does a speaker’s regional variety influence the selection and phonetic implementation of mid vowels in Italian?

3. Mid vowel variation across Italy: a corpus approach


- CLIPS (Corpora e Lessici dell'Italiano Parlato e Scritto)
- Collected 1999–2004; team led by Federico Albano Leoni (Leoni et al. 2007)
- >100 hours of speech, “partially transcribed” by the original team
- Radio & TV, dialogues (MapTask), reading, telephone, pathological speech
- 15 Italian cities, 16+ speakers/city (150 speakers represented here)
- Data analyzed here: 20 read sentences from the “lista frasi” (sentence list) portion of the corpus, containing 284 unique words


Effects of syllable structure on vowel clustering

[Figures: Clustering of open-syllable front vowels, closed-syllable front vowels, open-syllable back vowels, and closed-syllable back vowels, by city]

In Northern varieties, [ɛ] is conditioned in closed syllables.

Here, considerable regional variation occurs throughout.

4. Phonetic analysis of mid vowels


- Forced alignment (MAUS; Kisler et al. 2016), with hand correction of TextGrids
- F1, F2 extracted by Praat (Boersma & Weenink 2017) at vowel midpoint: 99,770 vowel tokens
- Outliers filtered from raw data
  - Mahalanobis distance (Mahalanobis 1936) calculated relative to a gender- and vowel-specific centroid. Tokens with high Mahalanobis distance (based on the 95% quantile of a χ² distribution with df = 2) were excluded as outliers.
- Subset of mid vowels identified (39,632 tokens) and marked for stress
- Data Lobanov-normalized (z-score) on a speaker-specific basis
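The outlier filter and the normalization step above can be sketched in a few lines. This is an illustrative Python version using only the standard library, not the analysis code behind the poster. It assumes that `points` holds the (F1, F2) pairs of one gender- and vowel-specific group, that the squared distance is compared against the χ² cutoff, and that the sample standard deviation is used for the z-score. All function names are hypothetical.

```python
import math
from statistics import mean, stdev

# The chi-square CDF with df = 2 is 1 - exp(-x / 2), so the 95% quantile
# has the closed form x = -2 * ln(1 - 0.95), about 5.99.
CHI2_95_DF2 = -2.0 * math.log(1.0 - 0.95)


def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each (F1, F2) point from the group centroid."""
    m1 = mean(p[0] for p in points)
    m2 = mean(p[1] for p in points)
    n = len(points)
    # Sample covariance matrix [[a, b], [b, d]] of the group.
    a = sum((p[0] - m1) ** 2 for p in points) / (n - 1)
    d = sum((p[1] - m2) ** 2 for p in points) / (n - 1)
    b = sum((p[0] - m1) * (p[1] - m2) for p in points) / (n - 1)
    det = a * d - b * b
    out = []
    for f1, f2 in points:
        x, y = f1 - m1, f2 - m2
        # v^T S^-1 v for a 2x2 covariance matrix, written out explicitly.
        out.append((d * x * x - 2 * b * x * y + a * y * y) / det)
    return out


def drop_outliers(points):
    """Keep the points whose squared distance is within the chi-square cutoff."""
    return [p for p, d2 in zip(points, mahalanobis_sq(points)) if d2 <= CHI2_95_DF2]


def lobanov(values):
    """z-score one speaker's values for one formant (Lobanov normalization)."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]
```

In use, `drop_outliers` would be called once per gender-by-vowel group on the raw formants, and `lobanov` once per speaker and formant on the retained tokens.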


Duration: a secondary cue to phonological height?

[Figure: Duration vs. F1 of Italian mid vowels, penultimate stressed syllables. Panels: front and back vowels, in open and closed syllables. Axes: Duration (sec) by Midpoint F1.z. Legend (four.clusters): back high mid, back low mid, front high mid, front low mid.]

Cluster (table column): Higher front, Higher front, Lower front, Lower front

Initial finding: some words are realized with consistently higher mid vowels (top), some with consistently lower mid vowels (bottom), and others variably (center)

[Figures: Stressed vowel in CONOSCI (259 tokens), PERSONA (115 tokens), ASPETTI (143 tokens), and PIACEVA (143 tokens), one panel per city; axes: Midpoint F1.z by Midpoint F2.z.]

[Figures: Unstressed front vowels; Unstressed back vowels. Axis: Word (number indicates syllable). Legend: Cluster height (High mid, Low mid).]

- Widespread variability of mid vowels: while some words have consistent phonetic height at regional levels, others are highly variable even within single cities
- Areas of regional or lexical inconsistency: variable phonetic implementation of mid vowels is not a misleading consequence of pooling across diverse phonological systems – it is a local property
- Variability within words and cities suggests the mapping of lexical specification to phonetic category is weak, and contrasts are marginal (cf. Renwick & Ladd 2016)

[Figures: Stressed vowel in VOLTA (148 tokens), MODO (138 tokens), BELLA (145 tokens), and MEDICO (106 tokens), one panel per city; axes: Midpoint F1.z by Midpoint F2.z.]

6. Conclusions


Correlations by cluster and syllable (σ) structure, front vowels:

Cluster        σ structure   Correlation
Higher front   Open          r² = 0.06, p < 0.01
Higher front   Closed        r² = 0.04, p = 0.1226
Lower front    Open          r² = 0.23, p < 0.001
Lower front    Closed        r² = 0.14, p < 0.001

Unstressed vowels: asymmetrical evidence for neutralization

[Figures: Stressed vowel in FORSE (146 tokens), DOVE (288 tokens), LEGGE (115 tokens), and VOLEVA (116 tokens), one panel per city; axis: Midpoint F1.z.]

[Figure: Unstressed front vowels, in open and closed syllables. Words shown (number indicates syllable): parole.3, numero.2, parlero.2, tornare.3, eppure.3, scrivere.3, male.2, durera.2, pero.1, due.2, venire.3, fine.2, persona.1, neanche.1, dove.2, perche.1, nazionale.4, scrivere.2, padre.2, notare.3, guardare.3, un'azienda.3, dice.2, situazione.5, sembrava.1, bene.2, fare.2, chiede.2, cercava.1, parte.2, adatte.3, forse.2, andare.3, sentire.3, pagare.3, sempre.2, entrata.1, gente.2, servire.3, grande.2, vedere.3, dell'acqua.1, capire.3, credere.2, legge.2, servire.1, cuore.2, diventato.2, possibile.4, teneva.1, cambiare.3, dovrebbe.3, stare.2, verita.1, portare.3, vedere.1, rispettava.2, economico.1, qualche.2, sentiva.1, societa.2, credere.3, eppure.1, sentire.1, venire.1, mese.2, recipiente.4, recipiente.1, sarebbe.3, deciso.1, neanche.3]

[Figure: Unstressed back vowels, in open and closed syllables; axis: Percent classification. Words shown (number indicates syllable): Stefano.3, sanno.2, primo.2, economico.2, giorno.2, troppo.2, momento.1, solo.2, piccolo.3, ministro.3, lontano.3, lavorava.2, uno.2, notare.1, ora.2, riuscito.4, nazionale.2, lavoro.3, Lucio.2, conosco.1, chiesto.2, economico.5, italiano.4, trovava.1, mano.2, occhi.2, tornare.1, uomo.2, avevo.3, lontano.1, ragazzo.3, tempo.2, conosco.3, conosci.1, Marco.2, sono.2, tipo.2, o.2, tormento.1, nuovo.2, foto.2, diventato.4, bambino.3, conto.2, alto.2, bugiardo.3, politico.4, piccolo.2, parco.2, soldo.2, fatto.2, medico.3, prendo.2, cerco.2, amico.3, voleva.1, mondo.2, portare.1, ho.2, dopo.2, numero.3, modo.2, porgeva.1, tanto.2, vero.2, prezzo.2, dovrebbe.1, discorso.3, poteva.1, giocava.1, proprio.2, tormento.3, messo.2, societa.1, certo.2, subito.3, quando.2, questo.2, suo.2, momento.3, quello.2, possibile.1, potrai.1, tutto.2, deciso.3, ricordo.3, politico.1, punto.2]

Correlations by cluster and syllable (σ) structure, back vowels:

Cluster       σ structure   Correlation
Higher back   Open          r² = 0.06, p < 0.01
Higher back   Closed        r² = 0.15, p < 0.001
Lower back    Open          r² = 0.24, p < 0.001
Lower back    Closed        r² = 0.49, p < 0.001
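For readers who want to reproduce figures of this kind, the sketch below computes r² as a squared Pearson correlation. It is written in Python and rests on an assumption: the poster does not say how its r² values were obtained, so treating them as squared correlations between vowel duration and normalized F1 within each cluster-by-syllable-structure cell is a guess. The function name is hypothetical.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)
```

Here `xs` would hold durations in seconds and `ys` the matching Midpoint F1.z values for the tokens of one cell.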


5. Automatic classification of mid vowels

Goal: a measure of vowel height unbiased by prescriptive quality or human intuition, to compare rates of high mid vs. low mid classification across words, cities & regions

- Method: k-means clustering in R, a procedure that partitions data points to minimize the sum of squared distances between each point and the center of its assigned cluster
  - Normalized front and back mid vowel tokens clustered separately, per city
  - Two clusters permitted, resulting in a higher and a lower cluster
  - Output: a list of cluster assignments for each token
- The effects of syllable structure, duration, stress are also considered
  - Lower (front) vowels expected in closed syllables, in some varieties
  - Longer durations expected in open syllables (e.g. Farnetani & Kori 1986)
  - If neutralization to /e, o/ occurs, higher vowels expected in unstressed syllables
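The clustering step can be illustrated with a short sketch. The poster used k-means in R; the version below is a plain-Python stand-in using Lloyd's algorithm, which is not necessarily the variant R runs by default. It assumes tokens are dicts with hypothetical keys `city`, `series` ("front" or "back"), `f1`, and `f2` (normalized values). The deterministic seeding and the rule that the lower-F1 cluster is the higher vowel are also assumptions made for the illustration.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two (F1, F2) pairs."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans_two(points, max_iter=100):
    """Lloyd's algorithm with k = 2; returns "high mid" or "low mid" per point."""
    # Assumed deterministic start: seed with the lowest- and highest-F1 tokens.
    centroids = [min(points), max(points)]
    assign = None
    for _ in range(max_iter):
        new_assign = [
            min((0, 1), key=lambda k: sq_dist(p, centroids[k])) for p in points
        ]
        if new_assign == assign:
            break  # assignments are stable, so the centroids are too
        assign = new_assign
        for k in (0, 1):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:
                centroids[k] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    # Lower F1 corresponds to a higher vowel.
    high = 0 if centroids[0][0] <= centroids[1][0] else 1
    return ["high mid" if a == high else "low mid" for a in assign]


def cluster_by_city(tokens):
    """Cluster front and back tokens separately within each city."""
    groups = defaultdict(list)
    for t in tokens:
        groups[(t["city"], t["series"])].append(t)
    out = []
    for group in groups.values():
        labels = kmeans_two([(t["f1"], t["f2"]) for t in group])
        out.extend({**t, "cluster": lab} for t, lab in zip(group, labels))
    return out
```

The output is one labeled copy of each token, matching the "list of cluster assignments for each token" described above.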

- Contrastiveness is a matter of degree
- Goal: adapt phonological theory to diverse dimensions of contrast

7. Acknowledgments and References

Supported by the UGA Willson Center for Humanities & Arts.

Boersma, Paul & David Weenink. 2017. Praat: Doing phonetics by computer [Computer program], Version 6.0.30. http://www.praat.org.
Farnetani, Edda & Shiro Kori. 1986. Effects of syllable and word structure on segmental durations in spoken Italian. Speech Communication 5(1). 17–34.
Kisler, Thomas, Uwe D. Reichel, Florian Schiel, Christoph Draxler, Bernhard Jackl & Nina Pörner. 2016. BAS Speech Science Web Services – an Update on Current Developments. Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), paper id 668. Portorož, Slovenia.
Ladd, D. R. 2006. “Distinctive phones” in surface representation. In Louis Goldstein, D. H. Whalen & Catherine T. Best (eds.), Laboratory Phonology 8, 3–26. Berlin; New York: Mouton de Gruyter.
Leoni, Federico Albano, Francesco Cutugno, Renata Savy, Valentina Caniparoli, Leandro D’Anna, Ester Paone, Rosa Giordano, Olga Manfrellotti, Massimo Petrillo & Aurelio De Rosa. 2007. Corpora e Lessici dell’Italiano Parlato e Scritto. http://www.clips.unina.it/.
Mahalanobis, Prasanta Chandra. 1936. On the generalized distance in statistics. Proceedings of the National Institute of Sciences (Calcutta) 2. 49–55.
Renwick, Margaret E. L. & D. R. Ladd. 2016. Phonetic Distinctiveness vs. Lexical Contrastiveness in Non-Robust Phonemic Contrasts. Laboratory Phonology: Journal of the Association for Laboratory Phonology 7(1). 1–29. doi:10.5334/labphon.17.