Phonetic implementation of mid vowel contrasts across Italian varieties
Margaret E. L. Renwick
[email protected]
1. Italian mid vowel contrasts
• /e/ and /o/ chiuso (‘closed’): /peska/ pesca ‘fishing’; /foro/ foro ‘hole’
• /ɛ/ and /ɔ/ aperto (‘open’): /pɛska/ pesca ‘peach’; /fɔro/ foro ‘forum’
Italian /e ɛ o ɔ/ are separate phonemes, but the contrast between high-mid and low-mid vowels is marginal: they have few minimal pairs; vowels neutralize to /e, o/ in unstressed syllables; actual phonetic height may vary; and regional patterns of phonological conditioning decrease reliance on lexical specification.
[Figure: Stressed Italian mid vowels, classified by k-means clustering (clusters: high mid, low mid), one panel per city: Bergamo, Milano, Venezia, Torino, Parma, Genova, Firenze, Perugia, Roma, Napoli, Bari, Lecce, Catanzaro, Palermo, Cagliari.]
[Figure: Results of k-means clustering, colored by vowel transcribed by MAUS (e, E, o, O), for comparison. Axis label: Midpoint F1.z.]
2. Acoustics vs. intuition (Renwick & Ladd 2016)
• Despite a “particular closeness” between mid vowel pairs (Ladd 2006), Italian mid vowels retain their phonetic and phonological contrasts
• Speakers are, generally, good judges of their own four mid vowels
• Phonetic separation of mid vowels is strong
• However, the high vs. low mid vowel distinctions are also weak
• Phonological conditioning occurs in some regions, e.g. by syllable structure
• Widespread lexical variability of “Standard Italian” mid vowels
• All speakers have some mismatches between their intuition of vowel height and its phonetic implementation
• A remaining research question: How does a speaker’s regional variety influence the selection and phonetic implementation of mid vowels in Italian?
[Figure: Normalized F1 vs. normalized F2 by vowel (e, ɛ, o, ɔ) for speakers ItF7 and ItF12, with tokens marked for categorical mismatch (match, mismatch).]
Rates of stressed-vowel classification into higher vs. lower clusters
[Figure: Percent classification (0 to 100) of each word's stressed vowel, by cluster height (High mid, Low mid) and syllable structure (closed, open). Panels: Back vowels, Front vowels.]
Back vowels, words plotted: economico modo pero po parlero nuovo parole scuola volta ricordo proprio soldo buon accorsi daro bisogna occhi o signori mondo uno non conosci persona conosco situazione trovi ancora conto ho foto rumori giorno ora ogni troppo discorso solo sono dopo cosa forse con lavoro dove
Front vowels, words plotted: certo bella terra era quel te prezzo nella sempre dovrebbe tempo quella prendo per sarebbe attenta Stefano della medico e bene aspetti un'azienda gente bel tormento perche piaceva pensi pareva chiesto quello le recipiente aveva teneva avevo appena cerco se messo ne momento porgeva vedere vero fretta nel poteva mese questo fingeva chiede credere legge voleva che
Regional differences in clustering of specific lexical items
Hypothesis: Less-variable words appear mostly in one cluster, while more variable words appear across both clusters
[Figures: Percent tokens of PIACEVA, CONOSCI, BELLA, and DOVE clustered as high mid, by city.]
While some words are consistently classified, many are variable
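The per-word, per-city rates behind the "Percent tokens of ... clustered as high mid, by city" figures can be tallied in a few lines. The following is only a sketch: the token fields (`word`, `city`, `cluster`) and the function name `percent_high_mid` are hypothetical, and the example tokens are invented for illustration rather than taken from CLIPS.

```python
from collections import defaultdict


def percent_high_mid(tokens):
    """Return {(word, city): percent of tokens labeled 'high mid'}.

    Each token is a dict with hypothetical keys 'word', 'city', and
    'cluster' ('high mid' or 'low mid').
    """
    counts = defaultdict(lambda: [0, 0])  # (word, city) -> [n_high, n_total]
    for tok in tokens:
        entry = counts[(tok["word"], tok["city"])]
        entry[1] += 1
        if tok["cluster"] == "high mid":
            entry[0] += 1
    return {key: 100.0 * high / total for key, (high, total) in counts.items()}


# Invented example tokens (not corpus data)
example = [
    {"word": "bella", "city": "Milano", "cluster": "high mid"},
    {"word": "bella", "city": "Milano", "cluster": "low mid"},
    {"word": "bella", "city": "Milano", "cluster": "low mid"},
    {"word": "bella", "city": "Milano", "cluster": "low mid"},
    {"word": "dove", "city": "Roma", "cluster": "high mid"},
    {"word": "dove", "city": "Roma", "cluster": "high mid"},
]
rates = percent_high_mid(example)
```

Under the stated hypothesis, a less-variable word would show rates near 0 or 100 in most cities, while a more variable word would show intermediate rates.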
3. Mid vowel variation across Italy: a corpus approach
• CLIPS (Corpora e Lessici dell'Italiano Parlato e Scritto, ‘Corpora and Lexicons of Spoken and Written Italian’)
• Collected 1999–2004; team led by Federico Albano Leoni (Leoni et al. 2007)
• >100 hours of speech, “partially transcribed” by the original team
• Radio & TV, dialogues (MapTask), reading, telephone, pathological speech
• 15 Italian cities, 16+ speakers/city (150 speakers represented here)
• Data analyzed here: 20 read sentences from the “lista frasi” (‘sentence list’) portion of the corpus, containing 284 unique words
Effects of syllable structure on vowel clustering
[Figures: Clustering of open-syllable front vowels, closed-syllable front vowels, open-syllable back vowels, and closed-syllable back vowels, by city. Axis: percent clustered as high mid.]
In Northern varieties, [ɛ] is conditioned in closed syllables.
Here, considerable regional variation occurs throughout.
4. Phonetic analysis of mid vowels
• Forced alignment (MAUS; Kisler et al. 2016), with hand correction of TextGrids
• F1, F2 extracted by Praat (Boersma & Weenink 2017) at vowel midpoint: 99,770 vowel tokens
• Outliers filtered from raw data
• Mahalanobis distance (Mahalanobis 1936) calculated, relative to a gender- and vowel-specific centroid. Tokens with high Mahalanobis distance (based on the 95% quantile of a χ² distribution with df = 2) were excluded as outliers.
• Subset of mid vowels identified (39,632 tokens) and marked for stress
• Data Lobanov-normalized (z-score) on a speaker-specific basis
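The outlier filter and the normalization step described in this section can be sketched as follows. This is an illustration under stated assumptions, not the poster's actual pipeline: it assumes the squared Mahalanobis distance is compared against the χ² cutoff, that the sample standard deviation is used for the Lobanov z-score, and that each token is a dict with hypothetical keys `speaker`, `gender`, `vowel`, `f1`, `f2`. The function names are likewise invented.

```python
import math
from statistics import mean, stdev

# 95% quantile of a chi-square distribution with df = 2.
# For df = 2 the CDF is 1 - exp(-x / 2), so the quantile has a closed form.
CHI2_DF2_Q95 = -2.0 * math.log(1.0 - 0.95)  # about 5.99


def mahalanobis_sq(point, points):
    """Squared Mahalanobis distance of a 2-D point from the centroid of points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form with the inverse of the 2x2 covariance matrix
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def filter_outliers(tokens, cutoff=CHI2_DF2_Q95):
    """Drop tokens far from their gender- and vowel-specific centroid."""
    groups = {}
    for tok in tokens:
        groups.setdefault((tok["gender"], tok["vowel"]), []).append(tok)
    kept = []
    for group in groups.values():
        pts = [(tok["f1"], tok["f2"]) for tok in group]
        kept.extend(t for t, p in zip(group, pts) if mahalanobis_sq(p, pts) <= cutoff)
    return kept


def lobanov(tokens):
    """Add speaker-specific z-scores 'f1_z' and 'f2_z' to copies of the tokens."""
    by_speaker = {}
    for tok in tokens:
        by_speaker.setdefault(tok["speaker"], []).append(tok)
    normalized = []
    for group in by_speaker.values():
        stats = {}
        for key in ("f1", "f2"):
            values = [tok[key] for tok in group]
            stats[key] = (mean(values), stdev(values))
        for tok in group:
            z = dict(tok)
            for key, (m, s) in stats.items():
                z[key + "_z"] = (tok[key] - m) / s
            normalized.append(z)
    return normalized
```

Each gender-by-vowel group needs at least three non-collinear tokens for the covariance matrix to be invertible; a production script would guard against smaller groups.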
Duration: a secondary cue to phonological height?
[Figure: Duration vs. F1 of Italian mid vowels, penultimate stressed syllables. Panels: front open, front closed, back open, back closed. Horizontal axis: Duration (sec); vertical axis: Midpoint F1.z. Legend (four.clusters): back high mid, back low mid, front high mid, front low mid.]
Cluster Higher front Higher front Lower front Lower front
Initial finding: some words are realized with consistently higher mid vowels (top), some with consistently lower mid vowels (bottom), and others variably (center)
[Figures: Stressed vowel in CONOSCI (259 tokens), PERSONA (115 tokens), ASPETTI (143 tokens), and PIACEVA (143 tokens), plotted by city (Bari, Catanzaro, Lecce, Palermo, Roma, Bergamo, Firenze, Milano, Parma, Torino, Cagliari, Genova, Napoli, Perugia, Venezia). Axis label: Midpoint F2.z.]
[Figure panels: Unstressed front vowels; Unstressed back vowels. Axis: Word (number indicates syllable); scale 0 to 100; legend: Cluster height (High mid, Low mid).]
[Figures: Stressed vowel in VOLTA (148 tokens), MODO (138 tokens), BELLA (145 tokens), and MEDICO (106 tokens), plotted by city. Axes: Midpoint F2.z, Midpoint F1.z.]
6. Conclusions
• Widespread variability of mid vowels: while some words have consistent phonetic height at regional levels, others are highly variable even within single cities
• Areas of regional or lexical inconsistency: variable phonetic implementation of mid vowels is not a misleading consequence of pooling across diverse phonological systems; it is a local property
• Variability within words and cities suggests the mapping of lexical specification to phonetic category is weak, and contrasts are marginal (cf. Renwick & Ladd 2016)
[Figures: Stressed vowel in FORSE (146 tokens), DOVE (288 tokens), LEGGE (115 tokens), and VOLEVA (116 tokens), plotted by city (Bari, Catanzaro, Lecce, Palermo, Roma, Bergamo, Firenze, Milano, Parma, Torino, Cagliari, Genova, Napoli, Perugia, Venezia). Axes: Midpoint F2.z, Midpoint F1.z.]
Unstressed vowels: asymmetrical evidence for neutralization
[Figure: Percent classification of unstressed vowels by cluster. Panels: Front vowels, Back vowels; Open syllable, Closed syllable.]
Unstressed front vowels, words plotted (number indicates syllable): parole.3 numero.2 parlero.2 tornare.3 eppure.3 scrivere.3 male.2 durera.2 pero.1 due.2 venire.3 fine.2 persona.1 neanche.1 dove.2 perche.1 nazionale.4 scrivere.2 padre.2 notare.3 guardare.3 un'azienda.3 dice.2 situazione.5 sembrava.1 bene.2 fare.2 chiede.2 cercava.1 parte.2 adatte.3 forse.2 andare.3 sentire.3 pagare.3 sempre.2 entrata.1 gente.2 servire.3 grande.2 vedere.3 dell'acqua.1 capire.3 credere.2 legge.2 servire.1 cuore.2 diventato.2 possibile.4 teneva.1 cambiare.3 dovrebbe.3 stare.2 verita.1 portare.3 vedere.1 rispettava.2 economico.1 qualche.2 sentiva.1 societa.2 credere.3 eppure.1 sentire.1 venire.1 mese.2 recipiente.4 recipiente.1 sarebbe.3 deciso.1 neanche.3
Unstressed back vowels, words plotted (number indicates syllable): Stefano.3 sanno.2 primo.2 economico.2 giorno.2 troppo.2 momento.1 solo.2 piccolo.3 ministro.3 lontano.3 lavorava.2 uno.2 notare.1 ora.2 riuscito.4 nazionale.2 lavoro.3 Lucio.2 conosco.1 chiesto.2 economico.5 italiano.4 trovava.1 mano.2 occhi.2 tornare.1 uomo.2 avevo.3 lontano.1 ragazzo.3 tempo.2 conosco.3 conosci.1 Marco.2 sono.2 tipo.2 o.2 tormento.1 nuovo.2 foto.2 diventato.4 bambino.3 conto.2 alto.2 bugiardo.3 politico.4 piccolo.2 parco.2 soldo.2 fatto.2 medico.3 prendo.2 cerco.2 amico.3 voleva.1 mondo.2 portare.1 ho.2 dopo.2 numero.3 modo.2 porgeva.1 tanto.2 vero.2 prezzo.2 dovrebbe.1 discorso.3 poteva.1 giocava.1 proprio.2 tormento.3 messo.2 societa.1 certo.2 subito.3 quando.2 questo.2 suo.2 momento.3 quello.2 possibile.1 potrai.1 tutto.2 deciso.3 ricordo.3 politico.1 punto.2
Correlation: r² = 0.06, p < 0.01 | r² = 0.04, p = 0.1226 | r² = 0.23, p < 0.001 | r² = 0.14, p < 0.001
σ structure: Open | Closed | Open | Closed
Cluster: Higher back | Higher back | Lower back | Lower back
Correlation: r² = 0.06, p < 0.01 | r² = 0.15, p < 0.001 | r² = 0.24, p < 0.001 | r² = 0.49, p < 0.001
σ structure: Open | Closed | Open | Closed
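The r² values reported by cluster and syllable (σ) structure presumably relate vowel duration to F1. A minimal sketch of such a per-group computation is shown below, assuming r² is the squared Pearson correlation; the token keys (`cluster`, `syllable`, `duration`, `f1_z`) and function names are hypothetical, and no p-values are computed.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def r_squared_by_group(tokens):
    """Return {(cluster, syllable structure): r^2 of duration vs. F1.z}."""
    groups = {}
    for tok in tokens:
        groups.setdefault((tok["cluster"], tok["syllable"]), []).append(tok)
    return {
        key: r_squared([t["duration"] for t in group], [t["f1_z"] for t in group])
        for key, group in groups.items()
    }
```

Each group needs at least two tokens with non-constant duration and F1 values; otherwise the denominator is zero.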
5. Automatic classification of mid vowels
Goal: a measure of vowel height unbiased by prescriptive quality or human intuition, to compare rates of high mid vs. low mid classification across words, cities & regions
• Method: k-means clustering in R, a procedure that partitions data points to minimize the sum of squared distances between each point and the center of its assigned cluster
• Normalized front and back mid vowel tokens clustered separately, per city
• Two clusters permitted, resulting in a higher and a lower cluster
• Output: a list of cluster assignments for each token
• The effects of syllable structure, duration, and stress are also considered
• Lower (front) vowels expected in closed syllables, in some varieties
• Longer durations expected in open syllables (e.g. Farnetani & Kori 1986)
• If neutralization to /e, o/ occurs, higher vowels expected in unstressed syllables
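The clustering step can be illustrated with a small two-cluster k-means. The poster's analysis was run in R; the sketch below is a plain Python version of Lloyd's algorithm with a deterministic start (the lowest-F1 and highest-F1 tokens), which is a simplification and not necessarily the algorithm or initialization R used. Token keys (`city`, `backness`, `f1_z`, `f2_z`) and function names are hypothetical. The cluster with the lower mean normalized F1 is labeled "high mid", since phonetically higher vowels have lower F1.

```python
def sq_dist(p, c):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2


def kmeans_two_clusters(points, max_iter=100):
    """Partition (F1.z, F2.z) points into a higher and a lower cluster."""
    # Deterministic start: tokens with the lowest and highest F1.z
    centroids = [min(points), max(points)]
    assignment = []
    for _ in range(max_iter):
        new_assignment = [
            0 if sq_dist(p, centroids[0]) <= sq_dist(p, centroids[1]) else 1
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no token changed cluster
        assignment = new_assignment
        for k in (0, 1):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                centroids[k] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    # Lower mean F1.z corresponds to the phonetically higher cluster
    higher = 0 if centroids[0][0] <= centroids[1][0] else 1
    return ["high mid" if a == higher else "low mid" for a in assignment]


def cluster_tokens(tokens):
    """Cluster front and back vowels separately within each city."""
    groups = {}
    for i, tok in enumerate(tokens):
        groups.setdefault((tok["city"], tok["backness"]), []).append(i)
    labels = [None] * len(tokens)
    for indices in groups.values():
        pts = [(tokens[i]["f1_z"], tokens[i]["f2_z"]) for i in indices]
        for i, label in zip(indices, kmeans_two_clusters(pts)):
            labels[i] = label
    return labels
```

The output is one cluster label per token, in input order, which can then be cross-tabulated against word, city, stress, or syllable structure.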
• Contrastiveness is a matter of degree
• Goal: adapt phonological theory to diverse dimensions of contrast
7. Acknowledgments and References
Supported by the UGA Willson Center for Humanities & Arts.
Boersma, Paul & David Weenink. 2017. Praat: Doing phonetics by computer [Computer program], Version 6.0.30. http://www.praat.org.
Farnetani, Edda & Shiro Kori. 1986. Effects of syllable and word structure on segmental durations in spoken Italian. Speech Communication 5(1). 17–34.
Kisler, Thomas, Uwe D. Reichel, Florian Schiel, Christoph Draxler, Bernhard Jackl & Nina Pörner. 2016. BAS Speech Science Web Services – an Update on Current Developments. Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), paper id 668. Portorož, Slovenia.
Ladd, D. R. 2006. “Distinctive phones” in surface representation. In Louis Goldstein, D. H. Whalen & Catherine T. Best (eds.), Laboratory Phonology 8, 3–26. Berlin; New York: Mouton de Gruyter.
Leoni, Federico Albano, Francesco Cutugno, Renata Savy, Valentina Caniparoli, Leandro D’Anna, Ester Paone, Rosa Giordano, Olga Manfrellotti, Massimo Petrillo & Aurelio De Rosa. 2007. Corpora e Lessici dell’Italiano Parlato e Scritto. http://www.clips.unina.it/.
Mahalanobis, Prasanta Chandra. 1936. On the generalized distance in statistics. Proceedings of the National Institute of Sciences (Calcutta) 2. 49–55.
Renwick, Margaret E. L. & D. R. Ladd. 2016. Phonetic Distinctiveness vs. Lexical Contrastiveness in Non-Robust Phonemic Contrasts. Laboratory Phonology: Journal of the Association for Laboratory Phonology 7(1). 1–29. doi:10.5334/labphon.17.