German and English Comparative Correlatives

The closer you look, the more differences you find: German and English Comparative Correlatives

Thomas Hoffmann (Catholic University of Eichstätt-Ingolstadt)

Abstract

In complete-inheritance constructionist approaches, Filler-Gap constructions (Sag 2010) are usually treated as a set of abstract constructions that interact with independent lexical constructions to license specific constructs. In this paper, I will look at Comparative Correlative (CC) constructions (the more you eat, the fatter you get / je mehr Du isst, desto dicker wirst Du) in German and English and show that corpus data indicate that such a view is untenable for these peripheral members of the Filler-Gap construction family. On top of that, I will address the question of how general the mental representations of CC constructions are and what the different networks of these constructions in German and English look like.

1. Introduction

Comparative correlatives (CC) (McCawley 1988; Michaelis 1994; Culicover & Jackendoff 1999; Borsley 2004; Den Dikken 2005; Sag 2010; Cappelle 2011; Kim 2011) are biclausal constructions that exhibit several idiosyncrasies:¹

(1) [the [more]comparative phrase1 Ben ate,]C1 [the [fatter]comparative phrase2 he got]C2

(2) [je [mehr]comparative phrase1 Ben aß,]C1 [desto [fetter]comparative phrase2 wurde er]C2

In both English (1) and German (2), the construction consists of two clauses (C1: the more Ben ate / C2: the fatter he got; C1: je mehr Ben aß / C2: desto fetter wurde er) of which the second clause C2 can be interpreted as the dependent variable for the independent variable specified by C1 (cf. Goldberg 2003: 220; e.g. the more Ben ate → the fatter he got; je mehr Ben aß → desto fetter wurde er; cf. Beck 1987; Cappelle 2011). Moreover, the construction consists of fixed, phonologically specified material ([ðə ...]C1 [ðə …]C2 / [jeː ...]C1 [dɛsto …]C2) as well as schematic, open slots which can be filled freely by the speaker to create novel utterances (cf. the more she slept, the happier she felt; the richer the man, the bigger the car / je mehr sie schlief, desto glücklicher war sie; je reicher der Mann, desto größer das Auto). On top of that, the construction also shares properties with a number of other constructions: just like WH-questions (5) or relative clauses (6), comparative correlatives have a clause-initial phrase (the so-called ‘filler’) that in declaratives would be realised in post-verbal position (cf. tired in (3), whose position is marked by a co-indexed ‘gap’ in (4-6)).

(3) Declarative clause:
    Ben was [tired]
    Ben war [müde]

(4) Comparative Correlative construction:
    [The more tired]i Ben was _i, [the more mistakes]i he made _i
    [Je müder]i Ben war _i, [desto mehr Fehler]i machte er _i

(5) WH-question:
    [What]i was Ben _i? [What]i did he make _i?
    [Was]i war Ben _i? [Was]i machte er _i?

(6) WH-relative clause:
    A pilot shouldn’t be tired, [which]i Ben was _i
    The mistakes [which]i he made _i …
    Ein Pilot sollte nicht müde sein, [was]i Ben war _i
    Die Fehler, [die]i er _i machte ...

¹ Throughout this paper, I provide English examples followed by the corresponding German sentences. Paraphrases of the German examples are only given when no such correspondence exists.

In Mainstream Generative Grammar (cf. e.g. Chomsky 1977, 1981, 1995, 2000, 2001), the structural similarities of (4-6) are explained by a single transformational operation (which has e.g. been called A-bar movement or WH-movement). Consequently, in this approach the mental representation underlying comparative correlatives is maximally abstract and completely independent of the argument structure of the main verb (e.g. the transitive verb make/machen as well as the predicative verb be/sein in 4-6). As Sag (2010) pointed out, the various structures accounted for by A-bar/WH-movement (which he labels ‘Filler-Gap constructions’) are characterized by great variation across a number of other parameters (presence of a WH-element, syntactic category of the filler phrase, grammaticality of subject-verb inversion, etc.; cf. Sag 2010: 490). This leads Sag to postulate construction-specific formal representations (for interrogatives, relatives, comparative correlatives as well as other Filler-Gap constructions) in addition to an abstract Filler-Head construction (Sag 2010: 536) that captures the common structural properties of these phenomena. Yet, while Sag (2010) presents a fully formalized analysis, his account still assumes that the constraints of the CC construction operate independently of Argument Structure constructions. For CCs, he postulates only two abstract constraints underlying the construction: 1) a ‘The-clause construction’ (Sag 2010: 537) that licenses instances of C1 and C2 and 2) a ‘Comparative-Correlative construction’ (Sag 2010: 537) that combines the two clauses (and computes the complex semantics of the resulting output; cf. also section 2 for details). In contrast to this, Culicover and Jackendoff provide a constructional analysis (1999: 567; see also Fillmore, Kay and O’Connor 1988 and McCawley 1988) that does not assume that the two CC clauses are licensed separately:

(7) [the [ ]comparative phrase1 (clause)]C1 [the [ ]comparative phrase2 (clause)]C2

In (7), both CC clauses are included in a single constructional template, but as the schematic slots labelled ‘(clause)’ indicate, this analysis also does not refer to any specific Argument Structure construction since the latter are taken to be realised independently of (7). As I will argue in this paper, authentic corpus data show that Culicover and Jackendoff’s analysis is empirically more adequate than Sag’s constructional analysis. On top of that, however, there is also evidence that particular Argument Structure constructions and CCs interact in a non-compositional way: it is, for example, well known (cf. McCawley 1988; Zifonun et al. 1997; Culicover and Jackendoff 1999; Borsley 2004) that CCs which include a Predicative Argument Structure construction with BE/SEIN allow for the optional deletion of the main verb in both German and English (for further deletion phenomena in CCs, see section 2):

(8) a. The greater the demand is, the higher the price is.
    b. The greater the demand, the higher the price is.
    c. The greater the demand is, the higher the price.
    d. The greater the demand, the higher the price.

(9) a. Je größer die Nachfrage ist, desto höher ist der Preis.
    b. Je größer die Nachfrage, desto höher ist der Preis.
    c. Je größer die Nachfrage ist, desto höher der Preis.
    d. Je größer die Nachfrage, desto höher der Preis.

This deletion process is not entirely unconstrained (for details, cf. Culicover and Jackendoff 1999: 554; Borsley 2004: 5), but what is even more interesting is that this type of be-deletion would be completely ungrammatical in Standard English and German declarative clauses (cf. *The price higher. / *Der Preis höher.) and is also not possible in other Filler-Gap constructions (cf. *What he? / *Was er? or *the price which higher / *der Preis, der höher). Such an interaction of a Filler-Gap construction with a specific Argument Structure construction obviously raises questions as to how the phenomenon is stored in the speakers’ mental construction network.

The present paper will address this issue adopting a usage-based Construction Grammar approach (Lakoff 1987; Croft 2001; Goldberg 2003, 2006; Bybee 2006, 2013), which emphasises the fact that the mental grammar of speakers is shaped by the repeated exposure to specific utterances and that domain-general cognitive processes such as categorization and cross-modal association play a crucial role in the mental entrenchment of constructions. In contrast to complete-inheritance approaches, which aim at providing non-redundant analyses that only draw on the minimal number of constructions needed to license a specific construct, usage-based approaches thus hold that sufficient frequency of a form-meaning pairing can also lead to the storage of a construction (Croft and Cruse 2004: 276-278). As a result, while complete-inheritance approaches only postulate one or two CC constructions (Culicover and Jackendoff 1999; Sag 2010), the number of constructions in a usage-based analysis depends, inter alia, on the frequency with which a speaker is exposed to various form-meaning pairings. In order to empirically assess this frequency, usage-based approaches often draw on authentic corpus data as a heuristic for the input that speakers are exposed to (Bybee 2013; Gries 2013).

In the present study, I use corpus data from the BROWN corpus family for English and their German equivalent, the LIMAS corpus, to assess the frequency with which speakers encounter various types of the CC construction. As a statistical analysis of these data shows, English and German actually differ significantly as to the degree to which they have entrenched the deletion structures in (8) and (9). Moreover, the results indicate that the deletion or retention of the copula in C1 and in C2 are not independent phenomena. In addition to that, I shall also look at other central features of the CC construction (the order of C1 and C2 as well as the syntactic category of the filler phrase) and discuss the repercussions of the empirical results for the English and German construction networks.

After this introduction, section 2 will give a more detailed discussion of the features of the CC construction that were the focus of the present study. Then section 3 will provide information on the data sources as well as the statistical tools used for the empirical analysis. The results of the corpus study are reported in section 4, and a usage-based Construction Grammar analysis of the findings is given in section 5.

2. Syntactic properties of the CC construction

CCs have received ample attention in the syntactic literature and several important properties of the construction have been identified (cf. McCawley 1988; Michaelis 1994; Zifonun et al. 1997; Culicover and Jackendoff 1999; Borsley 2004; Den Dikken 2005; Sag 2010; Cappelle 2011; Kim 2011).

In the present paper, I will not address all of these since many of them are so infrequent that they do not occur in the selected corpora at all (such as optional zero imperative morphology in C2 in I demand that the more John eats, the more he pay(s); from Culicover and Jackendoff 1999: 548). This is not to say that these are unimportant or irrelevant for the description of CC constructions, but simply that these features, due to the lack of corpus evidence, are better analysed by future introspection-based experimental studies. Instead, the present paper focusses on the following aspects of the CC construction, which I shall discuss in detail below:

1. clause order: Does C1 precede or follow C2?
2. filler type / displaced element: Which syntactic phrases occur as displaced fillers? Are there any entrenched substantive filler-filler pairs across C1 and C2?
3. deletion phenomena: How often is a copula verb deleted or not? Are there any other deletion phenomena and what is their frequency?
4. variety: Are there any differences between English and German CCs with respect to the above features?

All of these issues are addressed by the quantitative corpus study presented below, and various statistical tests will be used to identify those factors that play a significant role in the CC construction network of English and German. In addition to that, it will be explored to what degree these features indicate an interaction of Argument Structure and CC constructions. One important difference between German and English concerns the elements that introduce a CC clause as well as the order of C1 and C2 (cf. e.g. Zifonun et al. 1997; Culicover and Jackendoff 1999: 549):

(10) a. [The more you think about it]C1 [the more interesting it becomes]C2
     b. [It becomes more interesting]C2 [the more you think about it]C1

(11) a. [Je mehr man drüber nachdenkt]C1 [desto/umso/je interessanter wird es]C2
     b. [Es wird umso interessanter]C2 [je mehr man drüber nachdenkt]C1

English usually has the iconic order C1 → C2, which mirrors the semantic cause-effect interpretation of C1 acting as an independent variable on the dependent variable C2 (cf. above). In this version of the CC construction, both clauses are introduced by a the-filler (10a). As (11a) shows, the corresponding German structure has three different lexical items that can introduce C2 (desto, umso and je), of which only one (je) is employed in C1. Moreover, verb placement in the German CC construction clearly indicates that C1 functions as a subordinate clause, while C2 is the main clause (since the former has the finite verb in clause-final position, while it follows the filler phrase in the latter²). On top of that, English also has an alternative structure in which C2 precedes C1 (10b). This structure has been labelled CC’ construction (by Culicover and Jackendoff 1999: 549) and has the comparative phrase at the end of C2 (more interesting), often without the, while C1 retains its the-filler. Again, German has a similar structure (11b), the only difference being that the comparative phrase in C2 is introduced by umso (je and desto are not considered possible in this order: cf. *[Es wird je interessanter]C2 [je mehr man drüber nachdenkt]C1 and *[Es wird desto interessanter]C2 [je mehr man drüber nachdenkt]C1; cf. Zifonun et al. 1997).

In line with Hawkins’ Competence-Performance Hypothesis (2004), it can be expected that the iconic C1 → C2 order is cognitively preferred over the alternative CC’ construction C2 → C1, since the former structure mirrors the semantic interpretation of the two subclauses and should therefore be easier to process. This in turn should lead to a greater use of the CC construction, a hypothesis that can be tested by investigating the frequency of the two structures in authentic performance data, i.e. corpora. On top of that, however, the Principle of No Synonymy (Goldberg 1995: 67-8) and the related concept of pre-emption (Tomasello 2003: 300; Goldberg 2006: 94-98) predict that CC and CC’ should not be fully synonymous: if a speaker has a choice between two (or more) similar constructions, then a hearer will assume that the use of one variant on a given occasion reflects a functional difference between the two structures. In the long run, this may then lead to the functional differentiation of the two alternatives if these contextual associations are strengthened by similar usage events. Finally, there will be contexts in which one construction strongly pre-empts the other alternative, which in effect also minimises constructional synonymy.

Now, a comparison of (10a) with (10b) and of (11a) with (11b) suggests that CC and CC’ constructions are semantically synonymous. In line with Goldberg’s Corollary A (1995: 67), which states that two constructions that are syntactically distinct and semantically synonymous cannot be pragmatically synonymous, this would imply some kind of pragmatic difference in the usage constraints of the two constructions. Evidence for this comes from the distribution of focus particles (Sudhoff 2010) such as even/sogar:

(12) a. [The more you think about it]C1 [the more interesting it (?even) becomes]C2
     b. [It becomes even more interesting]C2 [the more you think about it]C1

(13) a. [Je mehr man drüber nachdenkt]C1 [desto/umso/je interessanter wird es (?sogar)]C2
     b. [Es wird sogar umso interessanter]C2 [je mehr man drüber nachdenkt]C1

² This distinction is less straightforward for English (cf. Culicover and Jackendoff 1999: 546-553).

As (12a,b) and (13a,b) indicate, in both English and German the use of a focus particle in C2 is more acceptable in CC’ constructions, indicating that this variant is preferred when the comparative phrase of C2 is focused. Thus, while they are semantically synonymous, CC and CC’ constructions differ with respect to their information structure properties. These information-structural differences, however, appear to be independent of any argument structure phenomena.

Focusing on the more frequent CC construction again, there is at least one other variable that interacts with Argument Structure constructions in a straightforward way, namely the syntactic type of filler phrase. As several studies have pointed out, English CC constructions license the following filler phrase types: adjective phrases (AdjP; (14a)), adverb phrases (AdvP; (15a)), noun phrases (NP; (16a)), certain idiomatic prepositional phrases (PP; (17a)), and a so-called “Special Construction” (18a) (cf. e.g. McCawley 1988; Borsley 2004; Den Dikken 2005; Fillmore et al. 2007: 20-22; Sag 2010: 493). Besides, as the examples in (14b-18b) show, apart from the “Special Construction”, German displays a similar range of possible filler phrases:

(14) a. [the [older]AdjP the man got,]C1 [the [happier]AdjP he became]C2
     b. [je [älter]AdjP der Mann wurde,]C1 [desto [glücklicher]AdjP wurde er]C2

(15) a. [the [longer]AdvP she slept]C1 [the [faster]AdvP she could run]C2
     b. [je [länger]AdvP sie schlief]C1 [desto [schneller]AdvP konnte sie laufen]C2

(16) a. [the [less money]NP we earned]C1 [the [more problems]NP we encountered]C2
     b. [je [weniger Geld]NP wir verdienten]C1 [umso [mehr Probleme]NP bekamen wir.]C2

(17) a. [The [more under the weather]PP you are,]C1 [the [more in pain]PP you are]C2
     b. [je [mehr in Rage]PP er sich redete]C1 [desto [weniger im Zaum]PP konnte sie ihn halten]C2 ³

(18) a. [The [braver a soldier]SpecialConstruction you are,]C1 [the [bigger of aindef threat]SpecialConstruction you become.]C2
     b. *[Je [mutiger ein Soldat]SpecialConstruction Du bist,]C1 *[desto [größer von einerindef Gefahr]SpecialConstruction wirst Du.]C2

³ ‘The more he talked himself into a fury, the less she could keep him in check.’

AdjPs (14), AdvPs (15), NPs (16) and PPs (17) are all perfectly acceptable filler types in both languages, though, as I will show below, they are not equally prototypically associated with the CC construction. The predicative Special Construction [Adjcomparative (of) NPindefinite]-filler (Fillmore et al. 2007: 20-30), however, seems only fully grammatical in English (cf. (18a) vs. the ungrammatical German equivalent structure (18b)).

The syntactic type of filler phrases is usually considered independent of the Argument Structure construction that is unified with a Filler-Gap construction. As I will point out below, however, AdjPs are by far the most prototypical fillers that speakers encounter in CC constructions, while NPs are clearly disfavoured. On top of that, note that these two phrase types are also associated to different degrees with different Argument Structure constructions: NPs normally fill the three non-verbal slots in the Ditransitive construction (Subj V Obj1 Obj2/[‘Subj CAUSES Obj1 TO RECEIVE Obj2’]; Goldberg 1995: 3, 2006: 73; Boas 2013: 235-239; e.g. BradNP gave AngieNP [a kiss]NP, SheNP sent himNP [a letter]NP, etc.). In contrast to this, AdjPs are preferred in Predicative constructions (Subj BE XP/[‘Subj is XP’]) (Brad is richAdjP, Angie is happyAdjP, etc.). From a usage-based perspective, this entails that certain types of Argument Structure constructions (here the Predicative construction) can become more closely associated with a Filler-Gap construction (here CCs) than previously assumed, a hypothesis that will be explored in more detail below.

The specific, lexical instantiation of the fillers in CC constructions also raises important questions concerning the internal structure of this construction, in particular the relationship of C1 and C2. As mentioned earlier, complete-inheritance analyses (Borsley 2004; Sag 2010) propose constructional templates in which C1 and C2 are licensed independently of each other. Yet, even these approaches would accept that there are idiomatic uses of the construction that are stored holistically in a speaker’s mental constructicon (cf. also Fillmore, Kay and O’Connor 1988: 506; Croft and Cruse 2004: 234):

(19) a. The more the merrier
     b. Je oller desto doller (‘The older, the bolder’ / ‘There’s no fox like an old fox’⁴)

⁴ source: http://m.digitaljournal.com/article/33711?doredir=0&noredir=1 [last accessed 07.03.2013].

This view would entail that speakers have entrenched two types of CC constructions, a fairly schematic constructional template that they use to create novel instances and a set of fully substantive constructions such as (19). Yet, from a usage-based point of view, it is far from obvious that speakers should only have entrenched these two types of constructions (an assumption that has the flavour of a clear-cut, pre-Constructionist lexicon-syntax dichotomy). In the present study, I therefore also investigated whether there are any filler-filler associations in fully creative CC constructs that might warrant the postulation of intermediate partly substantive and partly schematic constructions. In order to identify such patterns, I tested for substantive filler-filler co-occurrences (such as older-happier / älter-glücklicher in (14)) as well as abstract filler type associations (such as AdjP-AdjP in (14)).

As mentioned earlier, both English and German allow for the optional deletion of main verb BE/SEIN in CCs with Predicative Argument Structure constructions (cf. (8) The greater the demand (is), the higher the price (is); (9) Je größer die Nachfrage (ist), desto höher (ist) der Preis). On top of that, however, in CCs with all types of Argument Structure constructions it is also possible to truncate both comparative correlative clauses down to just their filler phrase (Zifonun et al. 1997; Huddleston 2002: 1136):

(20) a. [the [less money]NP (you earn)]C1 [the [more problems]NP (you will encounter)]C2
     b. [je [weniger Geld]NP (man verdient)]C1 [desto [mehr Probleme]NP (bekommt man)]C2

So far, no study has addressed the question of how often the main verb BE/SEIN is actually deleted in CCs in German and English. Moreover, no information was available as to the frequency of truncation phenomena such as (20). The present corpus study investigates these issues, also taking into account the possibility of cross-clausal parallelisms in C1 and C2. Finally, English CC constructions already exhibit a greater parallelism between C1 and C2 than German CCs (with respect to e.g. the lexical items that introduce the subclauses, cf. the-the vs. je-je/umso/desto, and word order). The parallel word order in C1 and C2 in English CCs can obviously be attributed to the general diachronic change that led to SVO word order in both main and subordinate clauses. At the same time, this additional parallelism in surface structure can also be hypothesized to facilitate and strengthen the storage of constructional C1C2 templates. This hypothesis was also investigated in the empirical study.

3. Data and methodology

The main database for English used in the present study was the BROWN family of corpora:

• the BROWN corpus (representative of 1960s written American English (AmE); Francis and Kucera 1979),
• the Lancaster-Oslo/Bergen corpus (LOB; 1960s written British English (BrE); Johansson, Leech and Goodluck 1978),
• the Freiburg-Brown Corpus of American English (FROWN; AmE / 1990s; Hundt, Sand and Skandera 1999) and
• the Freiburg-LOB Corpus of British English (FLOB; BrE / 1990s; Hundt, Sand and Siemund 1998).

The German data were extracted from the LIMAS corpus (http://www.korpora.org/Limas/), a corpus consisting of written 1970s German texts which is modelled on the design of the BROWN/LOB corpora. These are all, by modern standards, fairly small corpora with only 1 million words each, but they enabled me to fully retrieve all relevant instances of the CC construction (as well as the CC’ construction), which for the present pilot study was a considerable advantage. This is particularly so since CC constructions are not tagged as such in any corpus, and it therefore becomes necessary to use lexically-based queries that lead to a great number of false positives (and doubled results) which have to be manually checked and discarded. (The English corpora were queried for the strings “the more” / “the less” / “the worse” and “the *er”; the German corpus was queried for “je”, “um so”, “umso”, “desto”.) In light of the discussion in section 2, the data were coded for the variables presented in Table 1:
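The lexically-based queries described above can be approximated with regular expressions. The following is only a minimal sketch: the pattern names, the helper candidate_hits and the sample sentences are illustrative assumptions and do not reproduce the query syntax of the concordance software actually used. It also shows why such queries over-generate, since “the *er” matches ordinary nouns such as letter or teacher.

```python
import re

# Hypothetical patterns approximating the queries named in the text:
# "the more" / "the less" / "the worse" / "the *er" for English,
# "je" / "um so" / "umso" / "desto" for German.
ENGLISH_QUERY = re.compile(r"\bthe\s+(?:more|less|worse|\w+er)\b", re.IGNORECASE)
GERMAN_QUERY = re.compile(r"\b(?:je|um\s?so|desto)\b", re.IGNORECASE)


def candidate_hits(text, pattern):
    """Return (offset, matched string) pairs; every hit still needs manual checking."""
    return [(m.start(), m.group(0)) for m in pattern.finditer(text)]


# Invented sample sentences, not corpus data.
english = "The more Ben ate, the fatter he got. He gave the letter to the teacher."
german = "Je mehr Ben aß, desto fetter wurde er."

for offset, hit in candidate_hits(english, ENGLISH_QUERY):
    print("EN", offset, hit)
for offset, hit in candidate_hits(german, GERMAN_QUERY):
    print("DE", offset, hit)
```

In this toy input, only the first two English hits belong to a CC construct; the remaining two are the kind of false positive that has to be discarded by hand.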

Table 1. Variables which the corpus data were coded for

Factors                                      Levels
LANGUAGE                                     English, German
ORDER                                        C1C2 (‘CC construction’), C2C1 (‘CC′ construction’)
INITIAL WORD [for German only]               je, um so, desto
FILLER TYPE [for both C1 and C2]             AdjP, AdvP, NP, PP, SpecialConstruction
LEXICAL FILLER TOKEN [for both C1 and C2]    older, älter, more money, mehr Geld, etc.
DELETION [for both C1 and C2]                full clause (without auxiliary), BE/SEIN-retained, BE/SEIN-deletion, truncated
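One way to picture the coding scheme in Table 1 is as one record per retrieved CC token, from which the frequency of each factor combination can be tallied. The sketch below is an assumption-laden illustration: the class CodedToken, its field names and the three example rows are hypothetical and invented for exposition; they are neither the original coding sheet nor corpus data.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CodedToken:
    """One coded CC token; fields mirror the factors in Table 1 (names are hypothetical)."""
    language: str                 # "English" or "German"
    order: str                    # "C1C2" (CC) or "C2C1" (CC')
    initial_word: Optional[str]   # German only: "je", "um so", "desto"; None for English
    filler_type_c1: str           # AdjP, AdvP, NP, PP, SpecialConstruction
    filler_type_c2: str
    filler_token_c1: str          # e.g. "older", "mehr Geld"
    filler_token_c2: str
    deletion_c1: str              # full clause, BE/SEIN-retained, BE/SEIN-deletion, truncated
    deletion_c2: str


# Invented illustration rows, NOT actual corpus observations.
tokens = [
    CodedToken("English", "C1C2", None, "AdjP", "AdjP", "greater", "higher",
               "BE/SEIN-deletion", "BE/SEIN-deletion"),
    CodedToken("English", "C1C2", None, "AdjP", "AdjP", "older", "happier",
               "full clause", "full clause"),
    CodedToken("German", "C1C2", "desto", "AdjP", "AdjP", "größer", "höher",
               "BE/SEIN-retained", "BE/SEIN-retained"),
]

# Tally one possible factor combination: LANGUAGE x DELETION(C1) x DELETION(C2).
configurations = Counter((t.language, t.deletion_c1, t.deletion_c2) for t in tokens)
for config, count in configurations.items():
    print(config, count)
```

Keeping the C1 and C2 values in separate fields makes it straightforward to cross-tabulate them, which is what a test of cross-clausal associations requires.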

Finally, the data were then subjected to a “hierarchical configural frequency analysis” (HCFA; cf. Bortz, Lienert and Boehnke 1990: 155-157; Gries 2008: 242-254) in order to test the association of various categorical variables. HCFA is a powerful statistical method that performs a goodness-of-fit test for each factor combination of a data set. Unlike standard goodness-of-fit chi-square tests, HCFA can also be applied to data sets with three or more factors. For the present study, this analysis was carried out with the R 2.7.1 for Windows software (R Development Core Team 2008) using Gries’s HCFA 3.2 script (Gries 2004a). HCFA 3.2 employs exact binomial tests, which are more robust than simple chi-square tests, and adjusts the significance of all tested factor associations (so-called ‘configurations’) for multiple testing (using the Bonferroni method; see Gries 2008: 245-246 for details). Following standard practice, the p-values of configurations presented in this paper accepted as significant are p