© 2006 Cambridge University Press doi:10.1017/S1366728905002348 Bilingualism: Language and Cognition 9 (1), 2006, 1–13

Triggered codeswitching: A corpus-based evaluation of the original triggering hypothesis and a new alternative*


MIRJAM BROERSMA Max Planck Institute for Psycholinguistics, Nijmegen

KEES DE BOT Department of Applied Linguistics, University of Groningen

In this article the triggering hypothesis for codeswitching proposed by Michael Clyne is discussed and tested. According to this hypothesis, cognates can facilitate codeswitching of directly preceding or following words. It is argued that the triggering hypothesis in its original form is incompatible with language production models, as it assumes that language choice takes place at the surface structure of utterances, while in bilingual production models language choice takes place along with lemma selection. An adjusted version of the triggering hypothesis is proposed in which triggering takes place during lemma selection and the scope of triggering is extended to basic units in language production. Data from a Dutch–Moroccan Arabic corpus are used for a statistical test of the original and the adjusted triggering theory. The codeswitching patterns found in the data support part of the original triggering hypothesis, but they are best explained by the adjusted triggering theory.

Introduction

One of the most fascinating phenomena in bilingual speech, both to linguists and to accidental bystanders, is codeswitching. The merging of two languages, complex as they are in themselves, into one coherent whole shows us language at its best. Not only does it demonstrate how flexible and versatile human speech can be, it can also teach us a lot about the organization of different languages in the mind. It is no wonder that codeswitching has been studied extensively since the middle of the previous century.

One of the first researchers to dedicate himself to this topic was Michael Clyne. His study of bilingualism in Australia contributed a lot to our understanding of bilingualism in general and of codeswitching in particular. This paper is concerned with one of his hypotheses. Clyne noticed that codeswitches seemed to occur relatively often in the neighborhood of a cognate. In a series of publications (Clyne, 1967, 1972, 1977, 1980, 2003), he developed the triggering hypothesis, which states that words that are part of two languages can facilitate codeswitching from one language to the other.

When the hypothesis was first presented, not much was known about the mental processes underlying the production of speech. There were no models available to explain the phenomenon Clyne described. Much has changed since then. Nowadays, we have detailed information about the process of speech production at our disposal. And although different models still make different claims with regard to some parts of the process, a great deal of consensus has been reached. In this paper we will relate the original triggering hypothesis to current knowledge on speech production. We will see that these are not completely compatible. We therefore attempt to adjust the triggering hypothesis to bring it in line with present speech production models, keeping as close as possible to its original form.

The triggering hypothesis is intuitively appealing, and many researchers have accepted the notion of triggering (e.g. Saunders, 1982; Schatz, 1989; Halmari, 1997; Treffers-Daller, 1998; Van Hell and De Groot, 1998). However, it is important to note that all evidence so far is anecdotal. No statistical tests of the hypothesis have been presented, either supporting or contradicting its predictions. The aim of the present article is to test the predictions of the triggering hypothesis using a corpus of Dutch–Moroccan Arabic speech.

Codeswitching has been studied from many different perspectives. Sociolinguistic studies are mainly concerned with the social and pragmatic function of codeswitching (e.g. Blom and Gumperz, 1972; Myers-Scotton, 1993). Grammatical studies aim to explain which elements in an utterance can be codeswitched. The grammatical approach has resulted in a variety of proposals, including universal constraints (e.g. Sankoff and Poplack, 1981), restrictions on codeswitching based on Universal Grammar (e.g. Di Sciullo, Muysken and Singh, 1986; Halmari, 1997), and the Matrix Language Frame model¹ (Myers-Scotton, 1997). The present study takes a different approach to codeswitching: a psycholinguistic one.

This paper is organized as follows. The triggering hypothesis is introduced in section 1. In section 2, the main characteristics of speech production are discussed in terms of compatibility with the triggering hypothesis. A new version of the triggering theory is presented in section 3. In section 4, the results of a corpus analysis for the original and the adjusted version of the triggering theory are presented. A general discussion is provided in section 5.

* The authors are indebted to Michael Clyne, Peter Indefrey, Pieter Muysken and two anonymous reviewers for their comments on earlier versions of this article, and to Louis Boumans for making his Dutch–Moroccan Arabic data available.

Address for correspondence: Mirjam Broersma, Max Planck Institute for Psycholinguistics, PO Box 310, 6500 AH Nijmegen, The Netherlands. E-mail: [email protected]

1. Clyne’s triggering hypothesis

The triggering hypothesis proposes that words which have similar form and meaning in two languages can cause, or at least facilitate, a codeswitch from one language to the other. The hypothesis was first presented in Clyne (1967) and further developed in a series of publications about bilingualism in Australia (Clyne, 1972, 1977, 1980). A thorough revision is presented in Clyne (2003). Clyne (2003) defines trigger words as comprising the following items:

• lexical transfers (items belonging to one standard language which have also become part of the lexicon of the speaker’s other language),
• bilingual homophones,
• proper nouns.

The triggering hypothesis assumes a relation between the presence of trigger words and the occurrence of codeswitches (or TRANSVERSIONS, in the terminology of Clyne, 2003). In earlier publications (Clyne, 1967, 1972, 1977, 1980), a direct causal relation was implied, with the production of a trigger word leading to confusion, resulting in a codeswitch. The nature of this relation is modified in Clyne (2003). Here, Clyne points out that it is more appropriate to speak of lexical facilitation than of triggering, as other (structural and sociolinguistic) factors may play a role in the occurrence of a codeswitch. Apart from lexical facilitation, Clyne (2003) also describes prosodic and syntactic facilitation of transversion. Prosody and syntax may affect the level of word choice, which may help explain a number of language contact phenomena not accounted for so far. The present study is restricted to lexical facilitation, corresponding to the original notion of triggering.

Three forms of triggering can be distinguished: consequential facilitation, where the codeswitch follows the trigger word; anticipational facilitation, where the codeswitch precedes the trigger word; and a combination of these two, where a codeswitch is “sandwiched” between two trigger words. Clyne (2003) shows that all patterns of co-occurrence of trigger words and codeswitching emerge regularly in the language use of German-, Croatian-, Dutch-, Vietnamese-, Italian-, and Spanish-English bilinguals, and Hungarian-German-English and Dutch-German-English trilinguals in Australia. Data from all these groups of bi- and trilinguals contain instances of consequential facilitation, the data from all but the Spanish group contain anticipational facilitation, and the data from the German- and Spanish-English bilinguals and the Hungarian-German-English trilinguals contain combinations of consequential and anticipational facilitation.

Although these data show that trigger words and codeswitches do co-occur regularly, there is no proof that this co-occurrence is not a matter of coincidence. As pointed out above, the triggering hypothesis has never before been tested statistically. All the evidence for triggered codeswitching so far is anecdotal. In this paper a statistical test of the predictions of the triggering hypothesis will be presented.

¹ While this model is also psycholinguistic in nature, it explicitly does not deal with triggering.

The triggering hypothesis has its roots in bi- and trilingual corpus studies. It is presented as a description of a pattern that seems to stand out in the data (Clyne, 1967, 1972, 1977, 1980, 2003). Although the hypothesis does make claims about a relation between trigger words and codeswitches, and although it offers very clear predictions about this relation, the emphasis is never on its predictive power. So far, triggering has been described exclusively as a linguistic phenomenon. At the time of the earlier publications (Clyne, 1967, 1972, 1977, 1980), not much was known about the mental processes underlying the production of speech, and such processes were not considered in these publications.
Clyne (2003) acknowledges the importance of mental processes, as described in psycholinguistic models, and their relation to language contact phenomena. An important goal of the present paper is to discuss the triggering hypothesis in relation to psycholinguistic models of speech production, and to adjust the triggering theory where necessary to bring it in line with the models.

2. The original triggering hypothesis and speech production theory

In recent decades, much has been learned about the mental processes underlying speech. Although parts of the process are still a topic of lively debate, a general consensus has been reached about some of the main characteristics of the process. In this section, we consider to what extent the triggering hypothesis is compatible with what we now know about speech production.

As the present authors pointed out before (Broersma, 2000; Broersma and De Bot, 2001), the triggering hypothesis as it was described in the earlier publications (Clyne, 1967, 1972, 1977, 1980) is not fully compatible with recent views on speech production. Clyne (2003) acknowledges this challenge to the triggering hypothesis, and discusses some models and their implications for triggering, which will be described in section 3.

Speech production theory

To this day, no final model of the speaker is agreed upon. Many aspects of the speech production process are still under debate. Probably the most vigorously debated issue is the flow of activation through the different components of the speech production model. However, a form of consensus (as expressed, for example, by Levelt, 1999; Costa, Caramazza and Sebastián-Gallés, 2000) seems to exist on some aspects of speech production. There seems to be a general agreement about the components a speech production model should contain, and the separate levels these components operate on. We do not wish to argue for one model over another. However, as we want to provide a brief description of speech production, we will follow the lines of one model. In this section, we will give an outline of the model by Levelt (1989, 1999), which De Bot (1992) has adapted for bilingual processing. In this model, the flow of information is exclusively top-down. Note that other models do not necessarily share this assumption. In Levelt’s “blueprint of the speaker” (Levelt, 1999), information is stored on different levels, all contributing to different steps in the production process. Each utterance starts with the message a speaker wants to convey. This message is composed of lexical concepts. Lexical concepts are connected to and activate lemmas, which contain syntactic information, but no information about word form. Upon selection of a lemma, its syntactic information becomes available.
This information is used to place the lemma into a surface structure with the other selected lemmas. The surface structure is a representation of the sentence as it will eventually be produced. It contains lemmas in the order in which they will appear in the utterance, but without any information about the form they will take. The word form, containing morphological and phonological information, then becomes available. This information is used for phonetic encoding. During phonetic encoding, all the information that is needed for the production of the utterance is gathered, resulting in a speech plan. Finally, articulation of the speech plan leads to overt speech. Note that the order in which lemmas are selected is not necessarily reflected in the position they will eventually get in the surface structure. This holds in the top-down model described here, as well as in models which allow for bottom-up activation.


Also note that in a model which only allows for the top-down spread of information, information about the word form is not available until after the positioning of the lemma in the surface structure. Levelt, Roelofs and Meyer (1999) speak of a major rift between lemma and word form, that is, between the conceptual/syntactic and the phonological/articulatory domain. Levelt’s model (1989, 1999) does not specifically consider bilingual speakers. Those who do (e.g. De Bot, 1992, 2004; Green, 1998; Grosjean, 1998; Costa and Caramazza, 1999; Clyne, 2003) assume that the language in which a sentence or parts of a sentence will be uttered is chosen in an early stage of the production process. The intended language is part of the message to be conveyed. Lemmas are generally assumed to be language specific, so that the language choice for a specific word definitively takes place with the selection of a lemma.

Triggering at the surface structure level

The triggering hypothesis as it was originally described by Clyne (1967, 1972, 1977, 1980) proposes that words adjacent to a trigger word have an increased chance of being codeswitched. The theory does not predict that words next to a trigger word will always be codeswitched, but a trigger word may facilitate codeswitching of neighboring items. Trigger words can influence the words surrounding them. Whether a word can be influenced by a trigger word depends on its position within the sentence as it is pronounced. No influence is attributed to the structural relation between a trigger word and the adjoining word. In other words, the influence of a trigger word is exercised at the level of the surface structure of a sentence. The assumption that a trigger word can influence the language choice within a sentence at the surface level is contrary to the prevailing view on bilingual speech production as described above. It suggests that the surface structure is formed before language choice has taken place, for language choice can be influenced at the surface level. However, according to the bilingual speech models described above, language specific lemmas have to be selected before they are placed in a surface structure. Once a lemma has found its place in the surface structure, the language choice for that particular item has been made. Top-down models and models allowing for bottom-up activation are similar in this respect. Models that do not allow for the bottom-up flow of activation pose an additional problem for the triggering hypothesis. As information about word forms does not become available until after the positioning of a lemma in the surface structure, trigger words are not recognizable as such at the stage of lemma selection. Although they have similar meaning and form in two languages, presumably they are no different from other translation pairs at the lemma level.
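
The surface-level reading of the original hypothesis can be made concrete with a small sketch. It labels each word purely by its position relative to trigger words in the utterance as pronounced, ignoring structure. The function name, the label names and the example input are hypothetical and are not taken from the study.

```python
def adjacency_labels(words, is_trigger):
    """Label each word by its surface position relative to trigger words.

    "follows"  : directly follows a trigger word (consequential facilitation)
    "precedes" : directly precedes a trigger word (anticipational facilitation)
    "between"  : sandwiched between two trigger words
    "none"     : not adjacent to any trigger word
    """
    labels = []
    for i in range(len(words)):
        if is_trigger[i]:
            labels.append("trigger")
            continue
        trigger_before = i > 0 and is_trigger[i - 1]
        trigger_after = i + 1 < len(words) and is_trigger[i + 1]
        if trigger_before and trigger_after:
            labels.append("between")
        elif trigger_before:
            labels.append("follows")
        elif trigger_after:
            labels.append("precedes")
        else:
            labels.append("none")
    return labels


# Hypothetical utterance of seven words; True marks a trigger word.
words = ["w1", "T1", "w2", "w3", "T2", "w4", "T3"]
flags = [False, True, False, False, True, False, True]
print(adjacency_labels(words, flags))
```

Only linear position enters the computation, which is exactly the property the production models discussed above call into question.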


3. Triggering at the lemma level

As discussed above, the original triggering hypothesis is not fully compatible with current speech production models. The major problem is that the triggering hypothesis implies that language choice is made at the surface level, whereas the selection of language specific lemmas is assumed to precede the formation of a surface structure. An additional problem for top-down models is that trigger words are presumably no different from other translation pairs at the level of lemma selection, where language choice definitively takes place. This does not necessarily imply that the general idea of a causal relation between trigger words and codeswitching is incompatible with speech production theory. Therefore, an adjusted triggering theory will be presented, which preserves the general idea, but which is in line with psycholinguistic models of speech production. The predictions of this theory are somewhat different from those made by Clyne’s original triggering hypothesis, but the general idea that trigger words may facilitate codeswitching is preserved.

For the formulation of a new triggering theory, the idea is discarded that trigger words influence their neighbors in the utterance as it is pronounced. In the new theory, triggering takes place before the surface structure is formed. It proposes that trigger words can cause a shift in the activation of the available languages at the lemma level. The selection of a trigger word can enhance the activation of all the lemmas of a non-selected language (i.e. a language which is not maximally activated; Green, 1986), thereby raising the chance that one of these lemmas will get selected afterwards. How the selection of trigger words may cause a shift in the activation of different language subsets at the lemma level depends on the assumptions one makes about the language production system. The notion of feedback in particular makes a difference in this respect. Models that do not allow for feedback from word forms to lemmas have to deal with the problem that trigger words do not seem to differ from other translation pairs at the level of lemma selection, which is not a problem for bottom-up models. Triggering at the lemma level can be explained under both assumptions. As the new triggering theory assumes that triggering takes place at the lemma level, and because the order in which lemmas are selected is not necessarily reflected in the word order in the surface structure, the surface structure cannot be relied on to predict which words are likely to be influenced by a trigger word. A way of determining which elements are likely to be codeswitched under the influence of a trigger word will be described below.

Triggering with feedback

In a model that allows for feedback from the word form to the lemma level, the selection of trigger words may lead to codeswitches because they share both meaning and form characteristics. Three such models are described.

Firstly, De Bot (1992) and De Bot and Schreuder (1993) suggest that feedback from word forms to lemmas can explain triggering. Crucial in their explanation is the fact that trigger words have both similar meanings and similar forms. Broersma and De Bot (2001) present the following elaboration of this idea. De Bot (1992) and De Bot and Schreuder (1993) assume that all lexical items of a certain language form a subset within a language non-specific mental lexicon, as described by the subset hypothesis (Paradis, 1987). As the items within one set are connected more strongly to each other than to the items of other sets, they can function as separate subsets. As such, activation of one item adds to the activation of the whole subset it belongs to (Paradis, 1998b). De Bot (1992) and De Bot and Schreuder (1993) argue that word forms that are similar in two languages are represented as a single node which is part of both language networks. If the speaker is in a bilingual mode (Grosjean, 1997, 1998), both languages are active, and activation in the lemma network of one language will not automatically lead to the inhibition of activation in the network of the other language. Although the levels of activation of two languages may be very similar, there will always be some difference in the exact levels of activation of both languages. For convenience, the language with the highest level of activation will be called language A. When the lemma of a trigger word in language A is activated, the lemma will send activation to the connected word form, which is part of two language subsets. This shared word form is also connected to a lemma in language B, and it will send some activation up to this lemma. As the language B lemma was already activated, matching the intended concept, the joint effect of top-down and bottom-up activation of the lemma can be enough to lead to selection of this lemma in the non-intended language. This in turn leads to an increase in the activation of the whole subset of language B at the lemma level. As a result of the increased activation of all language B lemmas, the next time that a lemma is selected, there is an increased chance that this will be a lemma from language B. Therefore, the selection of a trigger word enhances the chance of codeswitching.² The retrieval of a trigger word will have little effect on the activation of language A, as it already is highly activated. The additional activating effect of the trigger word on language A will be minimal, while the activating effect on language B may be large enough to tip the balance and make this language the active one.

² Both meaning and form overlap are required for triggering to take place. Therefore, false friends (words sharing form but not meaning characteristics) are not predicted to facilitate codeswitching in this scenario.
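
The feedback account just described can be pictured with a toy calculation. This is only a sketch under strong assumptions that do not come from the source. Each language subset is reduced to a single activation value, the chance of selecting a language B lemma is taken to be a simple ratio of the two values, and all numbers are invented. The intermediate step in which the language B lemma itself receives feedback is collapsed into one boost to the language B subset.

```python
def switch_chance(act_a, act_b):
    """Chance that the next lemma comes from language B.

    Assumption: a plain ratio of subset activations; the source gives no formula.
    """
    return act_b / (act_a + act_b)


def after_selection(act_a, act_b, shared_form, feedback=0.3, ceiling=1.0):
    """Return (act_a, act_b) after a language A lemma has been selected.

    A shared word form feeds activation back to the language B subset.
    Language A is already close to its ceiling, so it gains very little.
    All parameter values are hypothetical.
    """
    act_a = min(ceiling, act_a + 0.05)
    if shared_form:  # trigger word: its word form also belongs to language B
        act_b = min(ceiling, act_b + feedback)
    return act_a, act_b


# Hypothetical starting levels: language A highly active, language B less so.
start = (0.95, 0.40)
plain = after_selection(*start, shared_form=False)
trigger = after_selection(*start, shared_form=True)
print(round(switch_chance(*plain), 3), round(switch_chance(*trigger), 3))
```

Under these assumptions the chance of a language B lemma being selected next is higher after a trigger word than after an ordinary word, while language A barely changes, which mirrors the asymmetry described in the text.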

Secondly, Clyne (2003) proposes an adaptation of Levelt’s speech production model to make it compatible with triggering. Although this model does not allow for feedback from word forms to lemmas, Clyne proposes to add an internal loop from self-perceived items, after phonological encoding, back to the lemma level. He proposes that self-perceived items cannot influence the activation of the lemmas in the same language, but only the activation of items in the other language. A problem may be that bilingual processing models do not assume fully separated systems for the different languages of a bilingual (Paradis, 1987; De Bot, 1992; Green, 1998; Grosjean, 1998). The proposed feedback loops seem to require processing of the languages in total separation, at least at the lemma level, or they would result in the elimination of the top-down nature of the speech production model.

Thirdly, the Multilingual Processing Model (De Bot, 2004) aims to account for various types of interactions between languages as discussed in the bilingual processing literature. In the Multilingual Processing Model it is assumed that elements of languages are stored on different levels: the conceptual level, the lemma level, the syntactical level and the word form level. Within the stores at different levels, there are subsets for different languages, and some elements are shared between subsets, because of their similarity in the languages involved. There may be language specific and shared lexical concepts, syntactical rules and word forms or phonological units (e.g. syllables). In language production, language choice is regulated at the conceptual level: language choice is part of the communicative intention to be expressed. It is assumed that there is a language node which may be part of the monitoring device. The language node receives information about language choice from the conceptual system, and relays that to all the language subsets in the system in order to prepare them for the selection of elements of the language chosen. At the same time it also receives information about language choice from elements in the language specific subsets. Crucially, information about the activation of subsets is exchanged between subsets of the various languages through the language node, allowing for both a top-down and a bottom-up flow of information. When, for example, the subset of lemmas of language A is activated, this information is relayed to both the language node (which forwards that information to the rest of the system), and to the subset of word forms from language A. So activation of a part of a language activates elements from that same language at other levels. At the same time, because subsets overlap, elements that are shared by more than one language may also activate elements from the other subset (or subsets) they are part of. In this model, triggering could take place in the following way. When the lemma of a trigger word is selected in language A, it activates a word form which belongs to the network of word forms in language B as well as word forms in language A. At the word form level, the activation of language B rises. The language node relays this information to the lemma level, causing an increase of activation of language B at that level as well. Consequently, the next time a lemma has to be selected, there is an increased chance that a lemma from language B gets selected.³

Triggering without feedback

The difference between trigger words and other translation pairs is that the former not only share their meaning but also their form. Assuming that trigger words behave differently from other words, causing codeswitches where other words do not, it seems only logical that this must have something to do with the fact that they share both meaning and form. The assumption that there is no feedback from word forms to lemmas implies that there is no information about word forms available when lemmas are selected. This would seem to suggest that trigger words are no different from other translation pairs at the lemma level and cannot behave differently under the assumption of no feedback. However, there is evidence in favor of the contrary. Van Hell and De Groot (1998) found that there are differences in the conceptual representation of cognates versus non-cognates. It will be proposed that this different representation of cognates at the conceptual level can lead to the activation of lemmas from the non-selected language. An old question in the study of bilingualism is whether concepts are language-specific or shared between languages. In 1953, Weinreich presented three models of the mental lexicon. In one model, conceptual representations are language specific, while in the other two, translation equivalents are connected to shared conceptual representations. Weinreich (1953) proposed that different models could be present in one speaker at the same time.
The study by Van Hell and De Groot (1998) suggests that this is indeed the case. They found different types of conceptual representation within speakers, depending on word-type and grammatical class. They varied cognate status, concreteness, and grammatical class in a bilingual word association task, and found that cognates, concrete words, and nouns were responded to differently than non-cognates, abstract words, and verbs. They conclude that the former share a conceptual representation more often, or (in terms of a distributed representation) share larger parts of a conceptual representation, than the latter. Of interest here is the claim about cognates: cognates show more overlap in conceptual representation than non-cognates. The origin of this difference between cognate and non-cognate translation pairs could well be the fact that cognates are similar not just in meaning but also in form. Possibly, learners simply map the meaning of a newly acquired cognate onto the conceptual representation of its translation equivalent, on the basis of the resemblance between the word forms. While learning non-cognates, the difference in word forms could signal to the learner that the meaning of both words is not necessarily identical either (Van Hell and De Groot, 1998). De Groot (1993) further points out that cognates are derived from a single stem more often than non-cognates, thus sharing more of their semantic characteristics.

³ Upon activation of a shared word form, the language node sends activation up to all language B lemmas, regardless of whether or not the shared word form corresponds to similar meanings as well. Therefore, false friends may facilitate codeswitching as much as cognates do in this scenario.

Whatever the origin of the difference in conceptual representation, the finding that a difference exists beyond the form level is crucial. When a word is activated for production, a difference between cognates and non-cognates already arises before elements at the form level have been activated. Clyne (2003), in a reaction to Broersma (2000), remarks rightly that the degree of conceptual overlap of cognates is not necessarily larger than that of non-cognate translation equivalents. Indeed, the difference between cognate and non-cognate translation pairs is likely to be quantitative rather than qualitative. This is supported by the above-mentioned finding (Van Hell and De Groot, 1998) that concreteness and grammatical class were also related to different degrees of conceptual overlap. Thus, we assume that cognates and non-cognates may differ in a graded way with respect to the degree of conceptual overlap. Now triggering can take place in the following way. When a conceptual node is connected with only one lemma, selection of the concept will lead to activation of this lemma.
When a conceptual node is connected with two translation equivalents, selection of the concept will activate both lemmas. As one language is always more active than the other, the lemma of the language with the highest overall activation (language A) will usually get selected, but there is also a chance that the lemma from language B gets selected (Levelt, 1989). The selection of the lemma from language B will lead to an increase in the activation of all the lemmas of that language. This, in turn, increases the chance that during the next selection procedure, a lemma from language B will be selected. In short, the selection of the conceptual node of a cognate increases the chance that a lemma from language B gets selected during the next selection procedure, resulting in a codeswitch.⁴

⁴ As triggering depends on conceptual overlap and not on similarities in word form, false friends are not predicted to facilitate codeswitching in this scenario.

The scope of triggering In the previous section, it was explained how the activation of a trigger word can enhance the activation of the entire network of lemmas from the non-selected language. If one of these lemmas gets selected, a codeswitch takes place. Which words are most likely to be codeswitched under the influence of a trigger word will be discussed below. According to the activation threshold hypothesis (Paradis, 1998a, b), the activation of a lexical element or language network gradually decreases over time. The activation of the lemmas of the non-selected language will be highest directly after the selection of the trigger word. Therefore, the chance of being codeswitched is highest for lexical elements that are selected directly after a trigger word. The order in which lemmas are selected does not necessarily correspond to the order of words in the surface structure. It is not the case that words that follow a trigger word in the surface structure also have the highest chance to be codeswitched. Unfortunately, the order in which lemmas are selected cannot be determined. However, the selection of lemmas takes place in “chunks”: a number of lemmas go through the process of lexical selection together. Even though the order of selection within each chunk is unknown, knowledge about which lemmas have been selected approximately at the same time provides some information about the chances of triggering. Levelt (1989) proposes that the processing unit during lexical selection is the basic clause. A basic or deep clause contains one and only one main verb, in contrast to the finite or surface clause, which contains one and only one finite verb. Although the beginning of a finite clause always is the beginning of a basic clause as well, the opposite is not necessarily true. 
Levelt (1989: 257) illustrates the division of a sentence into basic and finite clauses with the following example:

Finite clauses: “/ I began working a lot harder / when I finally decided to come to Uni”
Basic clauses: “/ I began / working a lot harder / when I finally decided / to come to Uni”

Of course speakers can and do produce utterances that do not contain a main verb, and which do not fit the definition given for basic clauses. Still a process of lexical selection must have preceded them. Therefore, the unit of production during lexical selection must be either a basic clause or an utterance without a main verb. Here, the term “basic clause” will be used to indicate utterances that contain maximally (but not minimally) one main verb.
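The relation between the two segmentations in Levelt's example can be checked mechanically. The following Python sketch is only an illustration: the helper name `clause_starts` and the whitespace-based word count are assumptions made here, not part of Levelt's proposal. It computes the word position at which each clause begins and confirms that every finite-clause boundary is also a basic-clause boundary, while the reverse does not hold.

```python
from itertools import accumulate

# The two segmentations of Levelt's (1989: 257) example sentence.
finite_clauses = [
    "I began working a lot harder",
    "when I finally decided to come to Uni",
]
basic_clauses = [
    "I began",
    "working a lot harder",
    "when I finally decided",
    "to come to Uni",
]


def clause_starts(clauses):
    """Return the word index at which each clause begins (hypothetical helper)."""
    lengths = [len(clause.split()) for clause in clauses]
    # Running totals give the end of each clause; shift by one to get the starts.
    return [0] + list(accumulate(lengths))[:-1]


finite_starts = set(clause_starts(finite_clauses))
basic_starts = set(clause_starts(basic_clauses))

# Every finite-clause beginning is also a basic-clause beginning ...
every_finite_start_is_basic = finite_starts <= basic_starts
# ... but some basic clauses begin inside a finite clause.
every_basic_start_is_finite = basic_starts <= finite_starts
```

In this example the basic clauses “working a lot harder” and “to come to Uni” begin in the middle of a finite clause, which is why the second check fails.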

If the basic clause is the unit of production during lexical selection, the most likely candidates to be influenced by a trigger word are words in the same basic clause as the trigger word. Among the words in this basic clause may be words that have been selected before the trigger word, and which could therefore not be influenced by it. The words that are selected after the trigger word, though, may have been influenced by the activation of the network of the non-selected language.

Note that the idea of a unit of production does not imply that all the elements in that unit have to finish a stage in the process before any of them can go on to the next stage. Language production is generally assumed to be incremental (e.g. Levelt, 1999). Again, if the exact order of lemma activation could be determined, more precise predictions about which elements are likely to undergo triggered codeswitching would be attainable. As it is, the idea that certain lexical elements were selected around the same time as the trigger word is the most specific one can get.
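The difference in scope between the two versions of the hypothesis can be made concrete with a small sketch. The Python below is an illustration under stated assumptions: the `Word` record, the clause annotation, the function names and the placeholder tokens are all hypothetical and are not taken from the corpus. It marks which non-trigger words count as candidates for triggered codeswitching when scope is defined by adjacency in the surface structure (the original hypothesis) and when it is defined by membership of the same basic clause (the adjusted theory).

```python
from dataclasses import dataclass


@dataclass
class Word:
    form: str         # placeholder token, not corpus data
    is_trigger: bool  # True for a trigger word
    clause: int       # index of the basic clause (hypothetical annotation)


def original_candidates(words):
    """Indices of non-trigger words directly preceding or following a trigger word."""
    candidates = set()
    for i, word in enumerate(words):
        if not word.is_trigger:
            continue
        for j in (i - 1, i + 1):
            if 0 <= j < len(words) and not words[j].is_trigger:
                candidates.add(j)
    return candidates


def adjusted_candidates(words):
    """Indices of non-trigger words in a basic clause that contains a trigger word."""
    trigger_clauses = {word.clause for word in words if word.is_trigger}
    return {
        i
        for i, word in enumerate(words)
        if not word.is_trigger and word.clause in trigger_clauses
    }


# Two basic clauses; the third word of the first clause is a trigger word.
words = [
    Word("w1", False, 0),
    Word("w2", False, 0),
    Word("TRIGGER", True, 0),
    Word("w3", False, 0),
    Word("w4", False, 1),
    Word("w5", False, 1),
]

by_adjacency = original_candidates(words)  # words next to the trigger word
by_clause = adjusted_candidates(words)     # words sharing its basic clause
```

With these placeholder tokens, adjacency selects only the two neighbours of the trigger word, whereas the clause-based scope also includes the first word of the same basic clause. Neither set includes words from the second basic clause.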


Testing both versions of the triggering hypothesis

As pointed out above, Clyne’s original triggering hypothesis has never been tested. The original triggering hypothesis predicts that words directly preceding or directly following a trigger word have a greater chance of being codeswitched than words that are not adjacent to a trigger word. It also predicts that words located between two trigger words have a higher chance of being codeswitched than words that are adjacent to a trigger word on only one side. In the previous section, a new, adjusted triggering theory was presented. Although this theory preserves the idea that the selection of trigger words may lead to an increased chance of codeswitching, the predictions are not exactly the same as Clyne’s predictions. The adjusted triggering theory predicts that words in a basic clause that contains a trigger word have a higher chance to be codeswitched than words in a basic clause that does not contain a trigger word. The predictions of both the original and the adjusted triggering theory were tested on a corpus of Dutch–Moroccan Arabic speech. The results for both versions of the triggering hypothesis will first be presented separately, after which both versions will be compared in one joint analysis.

The corpus

Clyne (2003) describes how the languages of a bilingual speaker can be more similar than the standard varieties of the languages, due to phonetic, morphophonemic and prosodic convergence. For some Dutch-English bilinguals described in Clyne (1987), the use of compromise forms in combination with the large amount of bilingual homophones that the standard varieties of the two languages share led to a form of speech in which hardly any items can be uniquely ascribed to either language. Similar patterns were found in Dutch-English data described by Broersma (2000) and Broersma and De Bot (2001). Due to the large amount of bilingual homophones, including many compromise forms, the number of trigger words in these data is so large, that there are hardly any basic clauses without trigger words. Although these data were successfully used to test the predictions of the original and the adjusted versions of the triggering theory separately (Broersma, 2000; Broersma and De Bot, 2001), the small number of basic clauses without trigger words forms a limitation for the statistical comparison of both versions. Therefore, the triggering theory was tested with a corpus of bilingual speech in two unrelated languages, Dutch and Moroccan Arabic.

A series of transcribed conversations between three Dutch-Moroccan Arabic bilinguals was used (Boumans, 1999). The informants were male friends, living in the city of Utrecht in The Netherlands. Two of them were born in Morocco and moved with their families to The Netherlands at ages 2 and 11. They were 19 years old at the time of recording. The third was born in The Netherlands and was 22 years old at the time of recording. All were still in school.

In an interview about their language use, the informants indicated that they regularly used both Dutch and Moroccan Arabic. Moroccan Arabic was mainly spoken in the home. They spoke exclusively Moroccan Arabic with their parents, and in contacts with siblings Moroccan Arabic was the main language of conversation. Outside the home, Dutch was spoken in most situations. Moroccan Arabic was spoken with some friends. They regularly spent the summer in Morocco. The informants displayed a positive attitude towards both languages and to their own bilingualism. They indicated that they found it important to be able to speak both languages well, and to be able to use either language when the situation required it.

The data consists of self-recorded, informal conversation between the three informants, which took place in The Netherlands. Both Dutch and Moroccan Arabic are used as the base language in different parts of the recording, and the data contain many instances of codeswitching.

Testing the original triggering hypothesis

Method

The data set was divided into conversational turns, lasting as long as one subject was speaking. Each word uttered by each subject was categorized as a trigger word or a non-trigger word. For each non-trigger word it was
determined whether it directly preceded or followed a trigger word, and whether it had been codeswitched. A word was coded as a codeswitch when it belonged to a different language than the word directly preceding it. The first word of a conversational turn served as an initial reference point. As structural relations do not play a role in the original triggering hypothesis (Clyne, 1967, 1972, 1977, 1980), each word that differed in language from the previous word was considered a codeswitch, regardless of its syntactic role, and regardless of when the switch back occurred. Under this definition, single lexical items can also constitute a codeswitch.

The definition of trigger words was kept as close as possible to the description Clyne uses in his consecutive publications (1967, 1972, 1977, 1980, 2003). However, in order to operationalize the expression, and implement it for the unambiguous coding of the corpus, one important change was necessary. In this study, lexical transfers were not considered to be trigger words. Lexical transfers are items which belong to one standard language, but have become part of the other language for the individual speaker. As the word is part of two languages for the individual speaker only, there is no clear-cut way to determine which words fit the category. Clyne (2003) suggests that items can be identified as being part of the two languages of a speaker when they are phonologically unintegrated, or little integrated. However, many Dutch words are phonotactically legal sequences in Moroccan Arabic and vice versa. No adaptation is needed to make such words sound acceptable in the other language. As no phonological integration is needed, its absence does not provide any information about the presence or absence of the Moroccan-Arabic word in the Dutch lexicon of the speaker (or vice versa). Other objections against phonological integration as a criterion have been raised by McClure (1981) and Myers-Scotton (1997).
However, no other criteria are available that do not lead to similar objections (e.g. morphological and syntactic integration have been criticized in Halmari, 1997; Myers-Scotton, 1997). As it is impossible to determine with any certainty which Dutch words are part of the Moroccan Arabic lexicon of an individual speaker and vice versa, in this study, all words were considered to be part of the language they belong to for the larger community. Further arguments in favor of this approach are given by Backus (1996) and Myers-Scotton (1997). Otherwise, our operationalization of trigger words corresponds to the description in Clyne (2003), including bilingual homophones, allowing for small differences in phonological form.5 For example, the word for “family” is famila, [fa:'mi:la:] in Moroccan Arabic and familie, [fa:'mi:li:] in Dutch. Although these words have different final vowels, they are considered bilingual homophones. All bilingual homophones in the data are nouns. The data do not contain any compromise forms. The large majority of trigger words in the data consists of proper nouns.

5 Conceivably, the strength of a triggering effect depends on the extent of semantic and phonological similarity. The data set is too small to differentiate between different degrees of overlap, but this may be a useful direction for future research.

The relation between the two variables (codeswitching and preceding or following a trigger word) is investigated with the χ² test for independence, with Yates’ correction for continuity for small expected frequencies applied where necessary (Ferguson and Takane, 1989). As the marginals are very uneven in some of the analyses, Fisher’s Exact test P (hence “P”) is also reported. All excerpts from the data will be presented as follows: Dutch speech is given in italics, Moroccan Arabic in capitals, and trigger words are underlined.

Results

The original triggering hypothesis predicts that words directly preceding or directly following a trigger word have a greater chance of being codeswitched than words that are not adjacent to a trigger word. In order to test these predictions, it was determined how many words in the data set directly precede or follow a trigger word, how many words are not adjacent to a trigger word, and how many words in each group are codeswitched.

First, the prediction that words preceding a trigger word have a higher chance of being codeswitched than words that do not border on a trigger word is considered. Of the 67 items in the data that directly precede a trigger word, two were codeswitched. The others were not, as in the following Moroccan-Arabic example:

(1) MA   TQEDDš TEMšI   B-t-t-aksi
    you  cannot go      by taxi
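The test statistic named in the Method section can be made concrete with a short sketch. The Python below is a minimal illustration of a χ² test for independence on a 2×2 table with Yates' correction, fed with the observed counts given in Tables 1 and 2. The function name `chi_square_2x2` and the table layout (rows: word group, columns: codeswitched yes/no) are assumptions made for this illustration, not a description of the authors' procedure. The correction is applied here in its plain form, subtracting 0.5 from every absolute difference between observed and expected frequency.

```python
def chi_square_2x2(table, yates=True):
    """Chi-square statistic and expected frequencies for a 2x2 table.

    `table` is [[a, b], [c, d]] with rows = word groups and
    columns = codeswitched (yes, no). Hypothetical helper.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    # Expected frequency of a cell = row total * column total / grand total.
    expected = [[r * col / n for col in col_totals] for r in row_totals]

    chi2 = 0.0
    for observed_row, expected_row in zip(table, expected):
        for o, e in zip(observed_row, expected_row):
            diff = abs(o - e)
            if yates:
                diff -= 0.5  # plain Yates continuity correction
            chi2 += diff ** 2 / e
    return chi2, expected


# Observed counts: (codeswitched, not codeswitched) per word group.
TABLE_1 = [[2, 65], [50, 1541]]  # preceding a trigger word vs. not bordering on one
TABLE_2 = [[8, 51], [50, 1541]]  # following a trigger word vs. not bordering on one

chi2_1, expected_1 = chi_square_2x2(TABLE_1)
chi2_2, expected_2 = chi_square_2x2(TABLE_2)
```

Under these assumptions the sketch yields expected frequencies that agree with those listed in the two tables, and statistics that round to 0.08 for Table 1 and 15.26 for Table 2. Fisher's Exact test is not covered by this sketch.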

Table 1. Number of words that are codeswitched, number of words that are not codeswitched (observed frequencies, and expected frequencies in italics), and percentage of words that are codeswitched; split by words that precede a trigger word and words that do not border on a trigger word

                 Preceding a trigger word
Codeswitch       yes             no
yes              2 (2.10)        50 (49.90)
no               65 (64.90)      1541 (1541.10)
% yes            3.0             3.1

Table 2. Number of words that are codeswitched, number of words that are not codeswitched (observed frequencies, and expected frequencies in italics), and percentage of words that are codeswitched; split by words that follow a trigger word and words that do not border on a trigger word

                 Following a trigger word
Codeswitch       yes             no
yes              8 (2.07)        50 (55.93)
no               51 (56.93)      1541 (1535.07)
% yes            13.6            3.1

Table 3. Number of words that are codeswitched, number of words that are not codeswitched (observed frequencies, and expected frequencies in italics), and percentage of words that are codeswitched; split by words that follow a trigger word and words that do not follow a trigger word

Table 1 shows the number of codeswitches for the words that only precede a trigger word (excluding the words that

also follow a trigger word), as compared to the number of codeswitches for the words that do not border on a trigger word. As the table shows, the percentage of codeswitched words is virtually identical for both sets. Indeed, whether a word precedes a trigger word or not does not influence its chance to be codeswitched (χ² = 0.08, p > .70, P = 1.262). In a strictly linear view, it is assumed that words can only be influenced by words that have been uttered before them and not by words that follow them. Appel and Muysken (1987) point out that the hypothesis that words can be codeswitched in anticipation of a following trigger word does not fit well in a linear approach.

Next, the prediction that words following a trigger word have a higher chance of being codeswitched than words that are not adjacent to a trigger word is considered. The following example contains two trigger words, the second of which is followed by a codeswitch from Moroccan Arabic to Dutch:

(2) MšINA L-U L Pari εEND Xalid met z’n tweeën
    we went there to Paris to Xalid the two of us

Table 2 shows that words that follow a trigger word are codeswitched relatively more often than words that do not border on a trigger word. This relation turns out to be significant (χ² = 15.26, p
