Copyright 2000 by the American Psychological Association, Inc. 0278-7393/00/$5.00 DOI: 10.1037//0278-7393.26.2.489
Journal of Experimental Psychology: Learning, Memory, and Cognition, 2000, Vol. 26, No. 2, 489-511
The Balance of Storage and Computation in Morphological Processing: The Role of Word Formation Type, Affixal Homonymy, and Productivity

Raymond Bertram
University of Turku

Robert Schreuder and R. Harald Baayen
University of Nijmegen
This article is concerned with the way in which the balance of storage (storing and processing words through full-form representations) and computation (storing and processing words through morpheme-based representations) in lexical processing in the visual modality is affected by the following 3 factors: word formation type (roughly, inflection vs. derivation), productivity, and affixal homonymy. Experimental results for 5 different Dutch suffixes, combined with previous results obtained for 4 comparable Finnish suffixes (R. Bertram, M. Laine, & K. Karvinen, 1999) and 2 Dutch suffixes (R. H. Baayen, T. Dijkstra, & R. Schreuder, 1997), show that none of these factors in isolation is a reliable cross-linguistic predictor of the balance of storage and computation. The authors offer a general framework that outlines how morphological processing is influenced by the interaction of word formation type, productivity, and affixal homonymy.
Current research on morphological processing in language comprehension reveals a growing awareness that various linguistic properties of complex words profoundly influence the way that these words are processed. Researchers have pointed out that prefixed words are processed differently than suffixed words (Colé, Beauvillain, & Segui, 1989; Cutler, Hawkins, & Gilligan, 1985; Marslen-Wilson, Tyler, Waksler, & Older, 1994; Taft, 1994). Others have called attention to the relevance of the distinction between inflectional and derivational morphology in lexical processing (Niemi, Laine, & Tuominen, 1994; Schriefers, Friederici, & Graetz, 1992; Taft, 1994). Marslen-Wilson et al. (1994) point to the relevance of semantic transparency. More recently, the individual properties of the affixes themselves have become subject to investigation.

Laudanna and Burani (1995) and Burani, Dovetto, Thornton, and Laudanna (1997) investigated a number of distributional affix-specific properties. Using visual lexical decision, they measured response latencies to nonwords consisting of a nonexisting base word and a real affix. Reaction times to such nonwords increased with the increasing length of the affix. Also, affixes that occurred in many different word types gave rise to longer response latencies. Conversely, nonwords with affixes with a high confusability (defined as the percentage of word tokens for which the final orthographic string is homographic with the affix but does not in fact denote the affix) were responded to faster than nonwords with low-confusability affixes. These distributional properties determine what these authors call the orthographic salience of the affix. The more salient an affix is, the more wordlike the nonwords become, leading to longer rejection latencies.

Bertram, Laine, and Karvinen (1999) show that linguistic properties of individual affixes, such as productivity and homonymy, are important factors in the processing of real words. Derived Finnish words with an unambiguous, productive suffix were responded to faster than were monomorphemic control words. However, derived words with either an unproductive or homonymic suffix (a suffix that is ambiguous in that it serves more than one semantic function) were responded to equally as fast as monomorphemic controls. The authors explain this finding in terms of statistical facilitation (Frauenfelder & Schreuder, 1992; Raab, 1962) that occurs for words with productive and unambiguous affixes but that is absent for words with unproductive or homonymic affixes. Bertram et al. (1999) claim that derived Finnish words with an unambiguous, productive suffix are processed simultaneously on the basis of full-form representations and on the basis of their morphological constituents. Because of overlap in the distributions of the processing times of the two lexical routes, these complex words can be responded to faster than can morphologically simplex words.
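The statistical facilitation argument can be illustrated with a small simulation. The sketch below is not taken from the studies cited. It assumes, purely for illustration, that the full-form route and the parsing route finish at normally distributed times with the same hypothetical mean (600 ms) and standard deviation (60 ms), and that a response is triggered by whichever route finishes first.

```python
import random
import statistics


def simulate_race(n_trials=20000, seed=1):
    """Compare a single-route lexicon with a two-route race.

    All parameter values are hypothetical and chosen only for illustration.
    Returns (mean single-route time, mean race time) in milliseconds.
    """
    rng = random.Random(seed)
    single_route, race = [], []
    for _ in range(n_trials):
        # Finishing times of the two routes, drawn from the same
        # distribution so that the two distributions overlap completely.
        full_form = rng.gauss(600, 60)
        parsing = rng.gauss(600, 60)
        single_route.append(full_form)          # simplex word: one route only
        race.append(min(full_form, parsing))    # complex word: fastest route wins
    return statistics.mean(single_route), statistics.mean(race)


single_mean, race_mean = simulate_race()
print(f"single route: {single_mean:.1f} ms, race: {race_mean:.1f} ms")
```

Under these assumptions the mean of the race is lower than the mean of a single route even though neither route is faster on average, which is the sense in which overlapping distributions yield facilitation.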
For simplex words, which by their nature have only full-form representations, no statistical facilitation can take place. The fact that derived words with either an unproductive or homonymic suffix are responded to equally as fast as
Raymond Bertram, Department of Psychology, University of Turku, Turku, Finland; Robert Schreuder and R. Harald Baayen, Interfaculty Research Unit for Language and Speech, University of Nijmegen, Nijmegen, the Netherlands.
We thank four anonymous reviewers, Matti Laine, and Jukka Hyönä for their helpful comments on an earlier version of this article, and Leonore Biegstraten for her help in conducting the experiments.
This study was financially supported by the Academy of Finland Grant 27774 to Matti Laine, by the Centre of International Mobility (CIMO, Grant to Raymond Bertram), by the Finnish Graduate School of Psychology Grant of the Finnish Ministry of Education to Raymond Bertram, and by the Dutch National Research Council (Nederlandse Organisatie voor Wetenschappelijk Onderzoek, Persoonsgerichte Impuls voor Onderzoeksgroepen Met Nieuwe Ideeën voor Excellente Research, Grant to R. Harald Baayen).
Correspondence concerning this article should be addressed to Raymond Bertram, Department of Psychology, University of Turku, FIN-20520, Turku, Finland. Electronic mail may be sent to [email protected].
monomorphemic words indicates that both word types are processed in the same way, through full-form representations only. The authors concluded that homonymy induces storage, whereas productivity enhances morphological parsing.

The Italian and the Finnish studies show that distributional and linguistic properties of individual affixes affect the way in which words with these affixes are processed. An additional complicating factor that may be relevant for the balance of storage and computation concerns the role of cross-linguistic differences between morphological systems. Finnish, for instance, has a rich inflectional morphology, whereas the inflectional system of Dutch is extremely simple. In Finnish, a given word can occur in hundreds or even thousands of different inflectional variants (Karlsson & Koskenniemi, 1985), whereas in Dutch at most 10 inflectional variants are available. With the combinatorial explosion of possible forms in Finnish, the likelihood of extensive storage of such forms in the mental lexicon diminishes considerably. Indeed, until now hardly any empirical evidence has been obtained for storage of inflected words in Finnish (Bertram et al., 1999; Hyönä, Laine, & Niemi, 1995; Niemi et al., 1994). By contrast, Baayen, Dijkstra, and Schreuder (1997) reported evidence for extensive storage of regular noun plurals in Dutch (see also Sereno & Jongman, 1997, for similar findings for English). This suggests that the balance of storage and computation is co-determined by the lexical-statistical properties of a given language.

To explore the possible role of cross-linguistic factors on the balance of storage and computation, the present study addresses the potential effects of morphological productivity, affix homonymy, and word formation type (roughly, inflection vs. derivation) for Dutch, paralleling the study of Bertram et al. (1999) for Finnish.
We have investigated the following five Dutch suffixes: the productive inflectional suffix -te, which marks singular past tense on verbs; the unproductive derivational suffix -te, with which abstract nouns are derived from adjectives (e.g., warmte, warmth); the productive derivational suffix -heid, which like derivational -te creates abstract nouns from adjectives (e.g., leegheid, emptiness); the productive inflectional suffix -er, which forms comparatives (e.g., warmer, warmer); and the productive derivational suffix -er, which forms nouns from verbs (e.g., denker, thinker). For each of these suffixes we have investigated the role of storage by varying the surface frequency of the complex words while keeping the base frequency constant; and we have investigated the role of computation by varying the frequency of the base word while keeping the surface frequency constant. This method was introduced by Taft (1979) and has been used by various other researchers as well (e.g., Baayen, Dijkstra, & Schreuder, 1997; Bradley, 1979; Burani & Caramazza, 1987; Burani, Salmaso, & Caramazza, 1984; Colé et al., 1989; Schreuder & Baayen, 1997).

The logic underlying this kind of method is as follows. Effects of surface frequency reveal familiarity of the processing system with the complex word as a whole. A common theoretical assumption is that the information concerning the frequency of use of a complex word is coded into the resting
activation level of that word as a whole at some level of representation. Effects of base frequency reveal that the constituents of a complex word, notably its base word, play a functional role in lexical processing. When such effects arise, the common assumption is that the lexical representation of the base word is activated during lexical processing. These two frequency effects are not mutually incompatible. In parallel dual route models of morphological processing (Baayen, Dijkstra, & Schreuder, 1997; Schreuder & Baayen, 1995), for instance, lexical access may take place both by decomposition into the constituent morphemes (leading to a base frequency effect) and by retrieval from memory of the full form (leading to a surface frequency effect).

The levels of representation on which surface and base frequency effects are located differ from model to model. For instance, the base frequency effect is located at the access level and the surface frequency effect at a more central level¹ in the prefix stripping framework of Taft and Forster (1976) and Taft (1979) as well as in the model outlined in Colé et al. (1989). For the augmented addressed morphology (AAM) model, Burani and Caramazza (1987) stated the following:

Briefly, this model assumes that while lexical representations in the orthographic (input) lexicon are morphologically decomposed, i.e., roots or stems are represented independently of inflectional affixes and, possibly, derivational affixes, the access procedure operates with both whole-word and morpheme access units. Previously experienced words activate whole-word access units while the recognition of novel words occurs through the activation of morphemic access units.

We concluded that the AAM model locates base frequency effects at a more central level and takes surface frequency effects for known words to arise at the level of access representations.
In our own theoretical framework, we interpret both frequency effects as arising at the level of access representations. The experiments reported in the present article do not allow for a decision between these three different theoretical accounts. They have been designed to address a different issue, namely, to gain insight into the balance of storage and computation for regular morphologically complex words.

Traditional approaches to this issue have ranged from no storage under any circumstances to ubiquitous storage. For instance, Pinker (1991) and Clahsen, Eisenbeiss, and Sonnenstuhl-Henning (1997) claimed that all regularly inflected words are always processed on the basis of their base words and that full-form access representations are absent for such words. In terms of our frequency diagnostics, this view implies the presence of an effect of base frequency and the absence of an effect of surface frequency for regularly inflected words. Conversely, Butterworth (1983) and Seidenberg (1987) argued that complex words are extensively stored and that their base words have little or no role to play. Thus, this view predicts no effect of base frequency and ubiquitous effects of surface frequency.

¹ We use the terms access level and more central level to denote the early perceptual level of processing and representation on the one hand, and the later linguistic levels of processing and representation on the other hand. Similar distinctions are made by, for instance, Taft (1979), who calls the first level the orthographic file and the second level the master file, and by Burani and Caramazza (1987; e.g., the augmented addressed morphology model), who call the first level the access level and the second level the orthographic input lexicon.

The aim of the present article is to show that both of these views are too restricted and that the balance of storage and computation is co-determined by factors such as word formation type, productivity, and affixal homonymy. We present a framework for how these factors interact in the General Discussion section using both the present experimental results and data obtained for a number of comparable affixes of Finnish reported in Bertram et al. (1999).

Experiments 1 and 2 address the role of storage and computation for words with the homonymic suffix -te, which is used as an inflectional suffix to express the simple past tense and as an unproductive, but otherwise fully regular, derivational suffix to express abstract nouns. Experiment 3 focuses on the processing of words with the productive derivational suffix -heid, which, like -te, forms abstract nouns. In Experiments 4 and 5, we investigate the way in which words with the homonymic suffix -er are processed. As in English, the -er suffix forms either comparatives or agent nouns. The task used in each experiment is the visual lexical-decision task.

Experiments With the Suffix -te

In Experiment 1a, we investigate the role of surface frequency for the inflectional singular past tense suffix -te using a factorial design with base frequency kept constant in the mean between the high and low surface frequency conditions. We define the base frequency as the summed frequencies of the base word itself and its inflectional variants. In Experiment 1b, we use a similar design to investigate the role of base frequency for inflectional -te, this time keeping mean surface frequency constant across conditions.
In Experiment 2, we investigate the role of surface and base frequency for derivational -te using a correlational design because there are only 73 formations with -te in the CELEX lexical database of 42 million word tokens (Baayen, Piepenbrock, & Gulikers, 1995), of which 17 are so obscure that they cannot be used for experimentation. The remaining items are too few to construct orthogonal contrasts between surface and base frequency.

For the inflectional -te, we did not expect an effect of surface frequency, but only an effect of base frequency, for the following reasons. First, it is a fully regular and very productive suffix. Second, it has a much higher frequency of occurrence than its unproductive derivational homonym. Third, the inflectional -te specifies deictic tense and person-number marking, adding only syntactically relevant information without changing the meaning of the base word. We will refer to this kind of inflection as meaning-invariant morphology. For a similar inflectional suffix, -en, in its use as a plural marker on past tense verbs, Baayen, Dijkstra, and Schreuder (1997) reported the total absence of a surface frequency effect. These considerations lead us to expect maximal
effects of parsing and no effects of storage for the inflectional -te.

By contrast, we expected the mirror image pattern for the derivational -te. This suffix, although fully regular, is not productive, and it is much less frequent than its inflectional homonym. Moreover, instead of adding only syntactically relevant information without changing the meaning of the base word, the derivational -te also involves meaning-changing morphology. (For instance, words such as warmte [warmth] have a base word denoting a property, but they themselves embody the notion of measurement.) All of these factors conspire to favor storage rather than parsing.

Experiment 1

Method

Participants. Eighteen undergraduate students from the University of Nijmegen were paid to participate in Experiment 1a. Twenty-seven different undergraduate students from the same university were paid to perform Experiment 1b. All were native speakers of Dutch and had normal or corrected-to-normal vision.

Target materials for Experiment 1a. Forty inflected verb forms with the singular past tense suffix -te were selected from the CELEX lexical database, of which 20 had a relatively high surface frequency (average of 5.3), whereas the other 20 had a low surface frequency (average of 1.4; all token frequency counts reported are scaled to one million). The frequency difference between the two conditions is significant, t(38) = 5.8, p < .00001. The two sets were matched for base frequency (high surface, 15.8; low surface, 15.4), family size of the base² (high surface, 15.8; low surface, 15.4), geometric mean bigram frequency (high surface, 13.7; low surface, 13.9), and word length in letters (high surface, 6.5; low surface, 6.9). Here, and in all the other experiments reported, items were matched on an item-by-item basis. In other words, pairs of items were selected that were comparable on all the variables mentioned previously except for the manipulated frequency value, in this case the surface frequency.
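The item-level comparison reported above, t(38), is a two-sample t test on two sets of 20 items, so the degrees of freedom are 20 + 20 - 2 = 38. A minimal sketch of that computation follows, assuming the pooled-variance (Student) form of the test. The frequency values are randomly generated placeholders, not the experimental items, and the function name is hypothetical.

```python
import math
import random
import statistics


def two_sample_t(xs, ys):
    """Student's two-sample t statistic with pooled variance.

    Returns (t, degrees of freedom).
    """
    n1, n2 = len(xs), len(ys)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(xs)
                  + (n2 - 1) * statistics.variance(ys)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(xs) - statistics.mean(ys)) / standard_error, df


# Placeholder surface frequencies (per million) for two sets of 20 items.
# These are NOT the real materials; they only mimic a high vs. low contrast.
rng = random.Random(0)
high_surface = [rng.uniform(3.5, 7.1) for _ in range(20)]
low_surface = [rng.uniform(0.5, 2.3) for _ in range(20)]

t_value, df = two_sample_t(high_surface, low_surface)
print(f"t({df}) = {t_value:.2f}")
```

The same function could be applied to the matched variables (base frequency, family size, bigram frequency, length), where a small t value would indicate that the two sets do not differ reliably.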
Target materials for Experiment 1b. Forty inflected verb forms with the singular past tense suffix -te were selected from the CELEX database, of which 20 had a high base frequency (23.0) and the other 20 a low base frequency (5.1). The two sets were matched for surface frequency (high base, 2.7; low base, 2.3), family size of the base (high base, 16.5; low base, 16.8), geometric mean bigram frequency (high base, 13.9; low base, 13.7), and word length in letters (high base, 6.4; low base, 6.6). The materials for these (and all of the other experiments reported in this article) are listed in Table A1 in the Appendix. Figures A1 and A2 in the Appendix show by means of scatterplots how our experimental data points in the experiments manipulating surface frequency (Figure A1) and the experiments manipulating base frequency (Figure A2) have been selected from the population of available words meeting the constraint of having an unambiguous Germanic monomorphemic base word. In both figures, the upper right panel shows the scatterplots for the inflectional -te.

Filler materials for Experiment 1. The same filler material was added for Experiments 1a and 1b. The additional 100 filler words consisted of 10 monomorphemic nouns; 40 inflected nouns; 30 verbs, of which 10 were in the nominative case, whereas the others carried some kind of inflection (7 participles, 7 third-person singulars, and 6 plural forms); and 20 derived words, of which 10 were with the feminine suffix -in, and 10 were with the productive de-adjectival suffix -heid. For each of the 140 words, a nonword was obtained by changing one to three letters so that the phonotactics of Dutch were not violated. The nonwords had a similar structure as the real words in that 40 of them were formed with the target suffix -te, 20 were without morphological structure, and the remaining 80 nonwords involved exactly the same suffixes that were used for the filler words in exactly the same proportions.

² The family size of the base is the number of derived and compound words with a given base word as a constituent. Schreuder and Baayen (1997) show that this is a relevant factor in lexical processing and that the relevance is type-based rather than token-based.

Procedure. For the lexical-decision task, 2 participants were tested at a time in noise-proof experimental booths. They were to decide as quickly and as accurately as possible whether a letter string appearing on the computer screen was a real Dutch word or not. Each stimulus was preceded by a fixation mark in the middle of the screen for 500 ms. After 50 ms, the stimulus appeared at the same position. Stimuli were presented on NEC MultiSync® color monitors in white lowercase 18-point Helvetica letters on a dark background, and they remained on the screen for 1,500 ms. The maximum response time was 2,000 ms from stimulus onset. Fifteen practice trials, 8 words and 7 nonwords, preceded the actual experiment. The experiment itself was divided into two blocks of 140 items (each block contained 70 words and 70 nonwords). There was a short pause between the two blocks. The experiment lasted approximately 20 min.
Results and Discussion

Experiment 1a. The data for all of the participants were included in the analyses, as all performed with an overall error rate below 15%. All items elicited error rates below 30%, which is our criterion for inclusion in the analyses. The observations were used to calculate the mean response latencies and error scores for the different test conditions (see Table 1). As expected, neither a paired t test for participants nor a standard two-sample t test for items showed a significant difference between the response latencies of the two conditions, t1(17) = 1.5, p > .1 and t2(38) = 0.7, p > .1, two-tailed tests. Also the error scores did not reveal a significant difference, t2(38) = 1.6, p > .1. In a post hoc analysis, we further analyzed the dependence of the response latencies on the logarithmically transformed surface
Table 1
Mean Response Latencies With Standard Deviations and Error Percentages for Inflected Verbs With the Singular Past Tense Suffix -te, With High Versus Low Surface Frequency (Experiment 1a), and With High Versus Low Base Frequency (Experiment 1b)

Frequency (per million)   Manipulated   Reaction time   SD   Error (%)
Experiment 1a
  High (5.3)              Surface       651             54    1.7
  Low (1.4)               Surface       665             64    4.2
Experiment 1b
  High (23.0)             Base          630             45    3.9
  Low (5.1)               Base          685             63   14.1
and base frequencies of the individual inflected verb forms, using linear models of the form

\[ \mathrm{RT} = a \cdot \log(\text{base frequency} + 1) + b \cdot \log(\text{surface frequency} + 1) + c, \tag{1} \]

where RT denotes reaction time.
We use log(frequency + 1) rather than simply log(frequency) because our counts include frequencies equal to zero, for which the logarithmic transformation is undefined. In what follows, we will refer to the effect of base frequency and surface frequency in the linear model with the tacit assumption that the frequency counts (per 42 million) are logarithmically transformed as in Equation 1. A stepwise regression analysis revealed a significant coefficient for base frequency only (for the RTs: base frequency, a = -44.67, p < .01; surface frequency, b = -1.54, p > .8; throughout this paper we present Bonferroni-adjusted p values). Note that by itself our frequency contrast (5.3 vs. 1.4) was powerful enough to yield substantial frequency effects in visual lexical decision; Baayen, Dijkstra, and Schreuder (1997) reported a reliable 63-ms effect for regular plural nouns with the suffix -en for a similar frequency contrast of 4 vs. 1. Because our post hoc analysis also did not reveal any dependency of the response latencies on surface frequency, we concluded that surface frequency does not play a role for our inflected words with the regular past tense suffix -te. The next experiment tested whether the effect of base frequency observed in this post hoc analysis could be replicated with a factorial design.

Experiment 1b. The data for all of the participants were included in the analyses because they all performed with an overall error rate below 15%. Two items elicited error rates higher than 30% and were therefore excluded from the analyses. Here, and in all of the other experiments in this article, exclusion of error-prone items did not alter the matching of the two target groups in a significant way. The remaining observations were used to calculate the mean response latencies and error scores for the two conditions, as can be seen in Table 1.
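The linear model of Equation 1 can be fitted by ordinary least squares. The sketch below is an illustration only: it uses plain least squares rather than the stepwise procedure reported in the text, the helper names are hypothetical, and the counts and latencies are made up (the latencies are generated exactly from a = -40, b = 0, c = 900 so that the fit can be checked by eye).

```python
import math


def log_freq(count):
    # log(frequency + 1) stays defined when a count is zero.
    return math.log(count + 1)


def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_equation_1(base_counts, surface_counts, rts):
    """Least-squares (a, b, c) for RT = a*log(base+1) + b*log(surface+1) + c."""
    design = [[log_freq(bf), log_freq(sf), 1.0]
              for bf, sf in zip(base_counts, surface_counts)]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(design, rts)) for i in range(3)]
    return solve(xtx, xty)


# Made-up counts, including zeros; not taken from the experimental materials.
base = [0, 12, 150, 900, 40, 5, 2300, 75]
surface = [3, 0, 20, 45, 1, 9, 60, 0]
rts = [-40 * log_freq(bf) + 900 for bf in base]

a, b, c = fit_equation_1(base, surface, rts)
print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f}")
```

A negative coefficient means that latencies shorten as the corresponding log frequency increases, which is how the a and b values reported in the text are to be read.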
As expected, the verbs with a high base frequency were recognized significantly faster than were the verbs with a low base frequency, t1(26) = 3.8, p < .001; t2(36) = 2.5, p < .01, one-tailed tests. In addition, the low-frequency condition elicited significantly more errors than did the high-frequency condition, t2(36) = 2.88, p < .01.³ A post hoc analysis supported the dependence of the response latencies on only the base frequency of the individual inflected verb forms. Stepwise regression analyses revealed a significant coefficient for base frequency and a zero coefficient for surface frequency (for the RTs: base frequency, a = -24.45, p < .0001; surface frequency, b = 0).

³ One might argue that no effect was found in Experiment 1a and that a significant effect in Experiment 1b was induced by a difference in power for the two experiments because we used 18 participants in Experiment 1a and 27 participants in Experiment 1b. To exclude this possibility, we reanalyzed Experiment 1b with the first 18 participants only. The pattern of results was exactly the same as with 27 participants. Participants responded to high base frequency words much faster than they did to low frequency words (634 ms vs. 678 ms), t1(17) = 3.4, p < .005; t2(36) = 1.9, p < .05, and made fewer errors as well (3.3% vs. 15.4%), t1(17) = 5.3, p < .001; t2(36) = 2.9, p < .005.

We interpret these results as indicating that for the inflectional -te the balance of storage and computation is strongly in favor of computation. A similar result for another verbal inflectional suffix was obtained in Baayen, Dijkstra, and Schreuder (1997) for Dutch. For Finnish, Bertram et al. (1999) reported evidence for full parsing with locative inflectional suffixes. For evidence of parsing for inflected words in English, see Taft (1994). Contrary evidence for storage of inflected words in English can be found in Sereno and Jongman (1997). We return to this issue in the General Discussion section.

In the next experiment, we shift our attention from the domain of inflection to the domain of derivation using a suffix with the same orthographic form as in the preceding experiments. Because the derivational -te is unproductive and has a very productive inflectional homonym, we expected to find exactly the opposite to what we found in Experiment 1 (i.e., no effect of base frequency and a solid effect of surface frequency).
Experiment 2

Method

Participants. Sixteen undergraduate students from the University of Nijmegen were paid to participate in the lexical-decision experiment. All were native speakers of Dutch and had normal or corrected-to-normal vision. None had participated in any of the previous experiments.

Target materials. As mentioned previously, there are only 73 words with the derivational suffix -te in the CELEX lexical database. Fifty-six of these words could be used for a correlational experimental design. These 56 words include a few formations with nontransparent readings (e.g., groente, green-th, i.e., vegetables). The materials are listed in Table A1 in the Appendix. The upper left panels of Figures A1 and A2 in the Appendix summarize the frequential properties of our critical words. (Both panels plot the same data of this correlational design with the variables on the axes interchanged.)

Filler materials. In this experiment we used 84 filler words, all of which were also used in Experiment 1. Sixteen filler words (10 nominal inflections and 6 derived words in the suffix -in) that were used in Experiment 1 were not included in Experiment 2. The nonwords in the present experiment were constructed along the same lines as those of Experiment 1. Of these nonwords, 56 ended in the suffix -te to match the number of real words in -te.

Procedure. The procedure was identical to that of Experiment 1.

Results and Discussion

The data for all of the participants were included in the analyses, as they all performed with an overall error rate below 15%. Eighteen items with an error rate exceeding 30% were excluded from the analyses. A stepwise regression analysis with Equation 1 as the underlying linear model revealed a significant coefficient for surface frequency only (for the RTs: surface frequency, a = -21.61, p < .001; base frequency, b = 0). These results show that the balance of storage and computation for words with the derivational -te has shifted toward storage. This dependency on storage may be induced by the presence of a very productive and frequent homonym on the one hand, and by a lack of productivity of the suffix itself on the other hand. Bertram et al. (1999) showed that words with the unproductive but unambiguous and regular locational derivational Finnish suffix -lA (cf. English -ery in bakery) are fully dependent on storage. This cross-linguistic comparison suggests that a lack of productivity by itself is already sufficient to induce storage. The next experiment addresses whether the balance of storage and computation is similar for the unambiguous and productive Dutch suffix -heid or whether storage is as ubiquitous as it is for its near synonym, the derivational -te.

Experiments With the Suffix -heid

In Experiment 3a, we investigate the role of surface frequency for the derivational suffix -heid using a factorial design with the base frequency kept constant in the mean between the high- and low-surface-frequency conditions. In Experiment 3b, we use a similar design to investigate the role of base frequency but now keeping mean surface frequency constant across conditions.

Experiment 3
Method
Participants. Sixteen undergraduate students from the University of Nijmegen were paid to participate in Experiment 3a, and 16 different undergraduate students were paid to perform in Experiment 3b. All were native speakers of Dutch and had normal or corrected-to-normal vision. None had participated in the previous experiments.

Target materials for Experiment 3a. Forty nouns with the de-adjectival suffix -heid were selected from the CELEX database, of which 20 had a high surface frequency (26.4), whereas the other 20 had a low surface frequency (0.9). The two sets were matched for base frequency (high surface, 291; low surface, 289), family size of the base (high surface, 53.0; low surface, 48.7), geometric mean bigram frequency (high surface, 13.4; low surface, 13.5), and word length in letters (both 8.7). The materials are listed in the Appendix.

Target materials for Experiment 3b. Fifty nouns with the de-adjectival suffix -heid were selected from the CELEX database, of which 25 had a high base frequency (207.2), whereas the other 25 had a low base frequency (24.9). The two sets were matched for surface frequency (both 1.0), family size of the base word (high base, 14.2; low base, 14.0), geometric mean bigram frequency (high base, 13.4; low base, 13.3), and word length in letters (high base, 8.6; low base, 8.1). The materials are listed in Table A1 in the Appendix. The center panels of Figures A1 and A2 in the Appendix show how we have sampled the words of Experiments 3a and 3b.

Filler materials for Experiment 3. The same set of 100 filler words was selected both for Experiment 3a and Experiment 3b. The set of filler words was different from that used for the first two experiments, but it was similar in design in that it consisted of words belonging to a variety of word types and word categories.
There were 25 monomorphemic nouns (10 in Experiment 1); 40 inflected nouns (40 in Experiment 1); 20 verbs (30 in Experiment 1), 5 of which were in the nominative case (10 were in the nominative case in Experiment 1); 5 participles (7 in Experiment
1); 5 third person singulars (7 in Experiment 1); 5 plural inflections (6 in Experiment 1); 5 nominative adjectives (new for this experiment); and 10 derived words (20 in Experiment 1) with the de-verbal suffixes -sel (5) and -ing (5). For each of the 140 words, we constructed nonwords along the same lines as outlined in Experiments 1 and 2.

Procedure. The procedure was identical to that of Experiment 1.
Results and Discussion
Experiment 3a. The data for all of the participants were included in the analyses, as they all performed with an overall error rate below 15%. One target word elicited an error rate higher than 30% and was excluded from the analyses. The remaining observations were used to calculate the mean response latencies and error scores for the different test conditions, as can be seen in Table 2. Both a paired t test for participants and a standard two-sample t test for items revealed a significant difference between the two conditions; the derived words with a high surface frequency were recognized significantly faster than were those with a low surface frequency, t1(15) = 9.8, p < .001; t2(37) = 4.8, p < .001. Moreover, significantly more errors were made in the low-frequency condition than were made in the high-frequency condition, t2(37) = 3.4, p < .005. A stepwise regression analysis with Equation 1 as the underlying linear model suggests that not only surface frequency but also base frequency determines the response latencies (base frequency, a = -12.38, p < .01; surface frequency, b = -18.27, p < .001). It is clear that surface frequency is an important determinant of the response latencies for words in -heid. At the same time, we also observed an effect of base frequency in the response latencies that, given its smaller coefficient in the linear model, appears to be somewhat weaker than that of surface frequency. This may be a direct consequence of our experimental design, in which we have attempted to keep base frequency constant in the mean across the two surface frequency conditions. Hence, it is possible that for words in -heid, in general, base frequency is a more important factor than surface frequency. Alternatively, the balance of storage and computation might in fact be in favor of lexical processing of nouns in -heid on the basis of full-form representations, with a subsidiary role for parsing only. To tease apart these possibilities, we used a factorial design in Experiment 3b in which we controlled for surface frequency while maximizing the contrast in base frequency.

Table 2
Mean Response Latencies With Standard Deviations and Error Percentages for Derived Nouns With the De-Adjectival Suffix -heid, With High Versus Low Surface Frequency (Experiment 3a), and With High Versus Low Base Frequency (Experiment 3b)

Frequency (per million)   Manipulated   Reaction time   SD   Error (%)
Experiment 3a
  High (24.4)             Surface       554             40   0.6
  Low (0.9)               Surface       644             72   6.5
Experiment 3b
  High (207.2)            Base          646             76   8.3
  Low (24.9)              Base          664             72   4.9

Experiment 3b. For the lexical-decision task, the data for all of the participants were included in the analyses, as they all performed with an overall error rate below 15%. Three target words elicited error rates higher than 30% and were excluded from the analyses. The remaining observations were used to calculate the mean response latencies and error scores for the two experimental conditions. Response latencies and errors are listed in Table 2. Words in -heid with a high-frequency base word were recognized faster than were words with a low-frequency base word. This effect is significant in the by-participant analysis but not in the by-item analysis, t1(15) = 2.2, p < .05; t2(45) = 0.9, p > .1. The error analysis also showed no difference between the two conditions, t2(45) = 1.5, p > .1. However, a stepwise regression analysis for the RT data using Equation 1 as the basic model revealed significant coefficients for both independent variables (base frequency, a = -13.13, p < .05; surface frequency, b = -17.33, p < .001). Note that the coefficient for surface frequency is larger than that for base frequency, as in Experiment 3a. This may explain why the effect of base frequency did not emerge reliably in the by-item factorial analysis; the variance introduced by surface frequency masked the variance due to base frequency. As an effect of base frequency also emerged in the post hoc regression analysis of Experiment 3a, we concluded that base frequency indeed plays a role in the lexical processing of nouns in -heid, albeit to a lesser extent than does surface frequency.
Some researchers have argued that morphological decomposition takes place only for neologisms (Caramazza, Laudanna, & Romani, 1988).
Because our experimental materials contained 8 words that have a surface frequency of zero in the CELEX lexical database, and which can be considered as good approximations of neologisms for our participants,⁴ we had the opportunity to investigate this possibility. If Caramazza et al. (1988) are correct, base frequency should have disappeared as an independent effect in our linear model when words with a surface frequency of zero were excluded from the analysis. Interestingly, the linear model, Equation 1, applied to the remaining words with frequency greater than zero, still showed a reliable effect for base frequency (base frequency, a = -15.32, p < .05; surface frequency, b = -30.35, p < .01). This suggests that the role of parsing extends beyond the mere handling of neologisms.
Curiously enough, excluding neologisms from the linear model leads to a substantial increase in the magnitude of the coefficient of the surface frequency effect, from -17.33 to -30.35. One would expect that excluding the lowest frequency items, which should elicit the longest response latencies, would lead to a smaller coefficient and not a larger one. We therefore inspected the individual item means using a scatterplot with log(surface frequency + 1) on the horizontal axis, and with RT on the vertical axis, as shown in Figure 1. A non-parametric, robust locally weighted scatterplot smoother (Cleveland, 1979) for the full frequency range, plotted as a dotted line, shows an initial positive correlation followed by a negative correlation. In fact, this initial positive correlation is supported by a t test comparing the item means of the neologisms with the item means of the words with log(surface frequency + 1) in the range of 1 to 2, t2(22) = 2.3, p < .05. Possibly, we are dealing with two distinct subsets, as shown by the solid lines in Figure 1. This initial positive correlation explains why the linear coefficient of surface frequency was larger when the neologisms were excluded. At the same time, this finding raises the question of why it seems to be more difficult to process words in -heid with very low surface frequencies than to process neologisms. Our tentative explanation is that we are observing the effects of early lexical competition between the access representations of the higher frequency base words on the one hand and the weak representations of the very low-frequency full forms on the other hand. When such pairs of access representations are in competition with each other, lexical processing is slowed down. Conversely, when there is no full-form representation at all, as is the case for our neologisms, no such competition takes place, leading to slightly faster response latencies. Clearly, further experimental research is required here.
⁴ A CELEX frequency of zero means that these words are registered in comprehensive dictionaries of Dutch but do not appear in the corpus underlying the CELEX frequency counts with a probability greater than 1 in 42 million. Hence, these words were prime candidates to be neologisms for our participants.
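The comparison of the linear model with and without the zero-frequency items can be illustrated with a minimal sketch. Equation 1 is not reproduced in this section, so the form assumed here, RT = a * log(base frequency + 1) + b * log(surface frequency + 1) + c, is an assumption, as are the function name `fit_equation1` and the synthetic item values. The sketch is a plain least-squares fit, not the stepwise procedure reported above.

```python
import math

def fit_equation1(items):
    """Least-squares fit of RT = a*log(base + 1) + b*log(surface + 1) + c.

    items: list of (base_freq, surface_freq, mean_rt) tuples.
    Returns the coefficients (a, b, c). The model form is an assumption.
    """
    rows = [(math.log(bf + 1), math.log(sf + 1), 1.0) for bf, sf, _ in items]
    y = [rt for _, _, rt in items]
    # Normal equations (X'X) beta = X'y, stored as an augmented 3x4 matrix.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

# Synthetic items (not the experimental data): (base, surface, mean RT in ms).
items = [(bf, sf, 700 - 13 * math.log(bf + 1) - 17 * math.log(sf + 1))
         for bf in (5, 50, 500) for sf in (0, 1, 10, 100)]

a_all, b_all, c_all = fit_equation1(items)
# Refit after dropping the items with a surface frequency of zero.
nonzero = [item for item in items if item[1] > 0]
a_nz, b_nz, c_nz = fit_equation1(nonzero)
```

Because the synthetic RTs follow the assumed model exactly, both fits return the same coefficients. With real item means, a change in b after dropping the zero-frequency items is what signals the non-linearity discussed above.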
Summing up, Experiment 3 shows that there is a solid effect of surface frequency for the suffix -heid. This strong effect of surface frequency, observed for one of the most productive derivational suffixes of Dutch, is probably best understood in terms of the word formation type involved. Derivation involves meaning-changing morphology, which often goes hand in hand with the accretion of idiosyncratic aspects of meaning (Sandra, 1994). The functionality of full-form access representations for words with such idiosyncratic meanings is that they provide an efficient means for accessing these meanings. However, -heid is a productive and unambiguous suffix, which leads one also to expect the effects of parsing, especially for the many fully compositional formations (see Baayen & Neijt, 1997, for a detailed corpus-based study of this suffix and its semantic properties) for which parsing can provide all required semantics. From this point of view, it is not surprising that base frequency appears as a significant second factor in our experiments. Similarly, Bertram et al. (1999) observed effects of both surface and base frequency for the Finnish suffix -stO, which is, like the Dutch suffix -heid, productive and unambiguous. The next experiments investigated whether affixal homonymy shifts the balance of storage and computation further in the direction of storage when two productive affixes are involved, as observed by Bertram et al. (1999) for the homonymic Finnish suffix -jA.
Experiments With the Suffix -er
As in English, the Dutch suffix -er has two functions. Attached to adjectives it denotes the comparative (e.g., warmer, warmer), and attached to verbs it denotes subject nouns (the main semantic classes being agents, e.g., roeier, rower, and instruments, e.g., opener, opener; for details see Booij, 1986). Both homonyms are productive, but the inflectional -er accounts for the majority, 64%, of all string tokens in the CELEX lexical database with the suffix -er. Experiment 4a investigated the role of surface frequency, and Experiment 4b investigated the role of base frequency for the derivational -er. Experiments 5a and 5b do the same for the inflectional -er. There are two reasons why we expected to find more evidence for parsing for the inflectional -er than for the derivational -er.
First, inflected words are prime candidates for parsing, given their full compositionality and their simple semantics, whereas subject-noun formation leads to several related but distinguishable meanings such as agent and instrument. Second, comparatives are encountered more often than are derived words in -er. This may also be a relevant factor given the results of Baayen, Dijkstra, and Schreuder (1997), who showed for the Dutch plural suffix -en that the numerically stronger homonym is parsed by default, whereas its rival is extensively stored.
Figure 1. Reaction time (RT) as a function of log(surface frequency + 1) for the derivational suffix -heid. The dotted line is a non-parametric locally weighted scatterplot smoother for the full frequency range. The solid line represents the same smoother for the nonzero surface frequencies. The horizontal line segment for the words with zero frequency represents the mean RT for these words. (Horizontal axis: log(surface frequency + 1); vertical axis: RT.)
Experiment 4
Method
Participants. Sixteen undergraduate students from the University of Nijmegen were paid to participate in Experiment 4a, and 16 different undergraduate students from the same university were paid to participate in Experiment 4b. All were native speakers of Dutch and had normal or corrected-to-normal vision. None had participated in any of the previous experiments.
Target materials for Experiment 4a. Forty derived nouns with the de-verbal suffix -er were selected from the CELEX database, of which 20 had a relatively high surface frequency (2.8), whereas the other 20 had a very low surface frequency (0.1). The two sets were matched for base frequency (high surface, 75.7; low surface, 76.9), family size of the base word (high surface, 29.4; low surface, 27.1), geometric mean bigram frequency (high surface, 14.1; low surface, 14.0), and word length in letters (high surface, 6.3; low surface, 6.4). The materials are listed in the Appendix.
Target materials for Experiment 4b. Forty derived nouns with the de-verbal suffix -er were selected from the CELEX database, of which 20 had a high base frequency (456.8), whereas the other 20 had a low base frequency (39.0). The two sets were matched for surface frequency (high base, 0.7; low base, 0.9), family size of the base (high base, 33.8; low base, 30.2), geometric mean bigram frequency (high base, 14.2; low base, 14.0), and word length in letters (high base, 6.4; low base, 6.6). The materials are listed in Table A1 in the Appendix. The bottom left panels of Figures A1 and A2 in the Appendix show how we sampled our target words for Experiments 4a and 4b.
Filler materials for Experiment 4. The same 100 filler words that were used in Experiment 3 were used for Experiments 4a and 4b. The 140 nonwords were constructed in a similar way as in the previous experiments.
Procedure. The procedure was identical to that of Experiment 1.
Results and Discussion
Experiment 4a. The data for all of the participants were included in the analyses, as they all performed with an overall error rate below 15%. Four items elicited an error rate higher than 30% and were excluded from the analyses. The remaining observations were used to calculate the mean response latencies and error scores for the two test conditions. Response latencies and errors are listed in Table 3. As expected for a derivational suffix with a wide range of meanings, both a paired t test for participants and a standard two-sample t test for items showed that words with a high surface frequency were responded to significantly faster than were words with a low surface frequency, t1(15) = 3.9, p < .01; t2(34) = 2.6, p < .05.⁵ Post hoc linear regression models using Equation 1 revealed an effect of surface frequency only (RT: a = 0; b = -14.93, p < .02). Note that this result contrasts with that obtained for -heid in Experiment 3, in which we observed an effect of base frequency in the regression analysis. The error rates of the two conditions did not differ significantly, t2(34) = 0.2, p > .1. The next experiment maximizes the contrast in base frequency for the derivational -er, with the usual factorial design.
⁵ Again, the significant effect was not dependent on the words with zero frequencies in the low-frequency condition. Excluding these items did not change the pattern of results: response latencies, 615 ms vs. 655 ms; t1(15) = 2.7, p < .02; t2(28) = 2.2, p < .04.

Table 3
Mean Response Latencies With Standard Deviations and Error Percentages for Derived Nouns With the De-Verbal Suffix -er, With High Versus Low Surface Frequency (Experiment 4a), and With High Versus Low Base Frequency (Experiment 4b)

Frequency (per million)   Manipulated   Reaction time   SD   Error (%)
Experiment 4a
  High (2.8)              Surface       603             62   2.3
  Low (0.1)               Surface       659             66   2.6
Experiment 4b
  High (505.5)            Base          685             59   9.4
  Low (48.3)              Base          676             87   8.2

Experiment 4b. The data for all of the participants were included in the analyses, as they all performed with an overall error rate below 15%. Six items elicited an error rate higher than 30% and were excluded from the analyses. The remaining observations were used to calculate the mean response latencies and error scores for the two test conditions. Response latencies and errors are listed in Table 3. Both a paired t test for participants and a standard two-sample t test for items failed to reveal any significant difference in response latencies between words with a high base frequency and words with a low base frequency, t1(15) = 1.2, p > .1; t2(32) = 0.3, p > .1. Post hoc stepwise multiple regression analyses using Equation 1 as the underlying model revealed an effect of surface frequency only (RT: a = 0, b = -15.23, p < .03). Similarly, the error rates of the high- and low-frequency conditions did not differ, t2(32) = 0.4, p > .1. We concluded that the presence of a numerically stronger homonymic inflectional rival in combination with the semantic diversity of the derivational -er had shifted the balance of storage and computation completely in favor of storage. This result parallels the findings of Bertram et al. (1999), who found evidence for storage for the Finnish equivalent of the derivational -er: the suffix -jA. This suffix also has a productive inflectional homonym, which is the partitive plural case marker. Our hypothesis was that the presence of a productive inflectional rival induced storage to avoid on-line resolution of the ambiguity of a given suffix. The next experiment studies the role of surface and base frequency for the comparative -er, in which we expected to find a solid effect of base frequency and no effect of surface frequency, just as with the inflectional -te.
Experiment 5
Method
Participants. Seventeen undergraduate students from the University of Nijmegen were paid to participate in Experiment 5a, whereas 16 different undergraduate students from the same university were paid to participate in Experiment 5b. All were native speakers of Dutch and had normal or corrected-to-normal vision. None had participated in any of the previous experiments.
Target materials for Experiment 5a. Forty comparatives were selected from the CELEX database, of which 20 had a high surface frequency (18.9), whereas the other 20 had a low surface frequency (1.1). The two sets were matched for base frequency (high surface, 147.6; low surface, 147.5), family size of the base (high surface, 17.8; low surface, 18.4), geometric mean bigram frequency (for both, 14.0), and word length in letters (for both, 6.7). The materials are listed in the Appendix.
Target materials for Experiment 5b. Forty comparatives were selected from the CELEX database, of which 20 had a high base frequency (146.0), whereas the other 20 had a low base frequency (21.0). The two sets were matched for surface frequency (both 1.3), family size of the base (high base, 14.9; low base, 15.0), geometric mean bigram frequency (both 14.0), and word length in letters (high base, 7.1; low base, 6.3). The materials are listed in Table A1 in the Appendix. The bottom right panels of Figures A1 and A2 in the Appendix show how we sampled the target words of Experiment 5.
Filler materials for Experiment 5. The 100 filler words were the same as the ones used in the previous two experiments, except for the 40 inflected nouns, of which 30 were replaced with adjectives (20 nominatives and 10 superlatives) to ensure that the target words were not the only adjectives in the experiment. The nonwords were constructed along the same lines as in the previous experiments.
Procedure. The procedure was identical to that of Experiment 1.
Results and Discussion
Experiment 5a. The data for all of the participants were included in the analyses, as they all performed with an overall error rate below 15%. One item elicited an error rate higher than 30% and was excluded from the analyses. The remaining observations were used to calculate the mean response latencies and error scores for the different test conditions, as can be seen in Table 4. Surprisingly, both a paired t test for participants and a standard two-sample t test for items showed that comparatives with a high surface frequency were recognized significantly faster than were comparatives with a low surface frequency, t1(16) = 3.0, p < .01; t2(37) = 2.2, p < .05. Moreover, the error analysis indicated that the low-frequency condition elicited marginally more errors than did the high-frequency condition, t2(37) = 2.0, p = .06. In addition, post hoc linear regression analyses on the response latencies using Equation 1 as the model resulted in a significant effect for surface frequency only (RT: a = 0, b = -16.34, p < .001). These results suggest extensive storage and no discernible effect of parsing for comparatives, a pattern that is completely opposite to that for the inflectional -te.
Experiment 5b. The data for all of the participants were included in further analyses, as they all performed with an overall error rate below 15%. Five items elicited error rates higher than 30% and were therefore excluded from further analysis. The remaining observations were used to calculate the mean response latencies and error scores for the two conditions, as can be seen in Table 4. Neither a paired t test for participants nor a standard two-sample t test for items revealed any significant difference between words with a high base frequency and words with a low base frequency, either in the response latencies, t1(15) = 0.2, p > .1; t2(33) = 0.1, p > .1, or in the error rates, t2(33) = 1.2, p > .1. Post hoc linear regression analyses using Equation 1 as the underlying model confirmed that the RTs depend on surface frequency only (a = 0, b = -26.19, p < .001).
The presence of solid effects of storage and the absence of any effects of parsing in the reaction time data for the inflectional suffix -er was probably due to affixal homonymy. Unlike in the case of the inflectional -te, for which the derivational rival is completely unproductive, the rival derivational suffix of the inflectional -er is fully productive. For productive rival suffixes, Baayen, Dijkstra, and Schreuder (1997) argued that a mere subcategorization conflict can already induce storage. Within the framework of parallel dual-route modelling, our hypothesis was that the resolution of affixal homonymy between two productive affixes is so time costly that the direct route is almost always first to complete, not only for the derivational -er, but also for the comparative -er. With response latencies depending on the processing times of the first route to win the race (for -er, always the direct route), no effect of parsing was visible. Our general conclusion with respect to the results for the comparative -er is that the disambiguation problem caused by affixal homonymy was so severe that massive storage was induced even for this completely regular and compositional suffix.
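The by-participant (t1) and by-item (t2) tests reported for each experiment can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the function names and the toy numbers are hypothetical, and the item test is assumed to be the pooled-variance two-sample t test, which yields the degrees of freedom reported in the text (e.g., 39 remaining items give t2(37)).

```python
import math
from statistics import mean, variance

def paired_t(high, low):
    """t1: paired t test over participants.

    high, low: per-participant mean RTs in the two conditions, same order.
    Returns (t, degrees of freedom).
    """
    diffs = [lo - hi for hi, lo in zip(high, low)]
    n = len(diffs)
    t = mean(diffs) / math.sqrt(variance(diffs) / n)
    return t, n - 1

def two_sample_t(high, low):
    """t2: pooled-variance two-sample t test over items.

    high, low: per-item mean RTs; each item belongs to one condition only.
    Returns (t, degrees of freedom).
    """
    n1, n2 = len(high), len(low)
    pooled = ((n1 - 1) * variance(high) + (n2 - 1) * variance(low)) / (n1 + n2 - 2)
    t = (mean(low) - mean(high)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Toy values in ms (not the experimental data).
t1, df1 = paired_t([500, 520, 540], [550, 560, 600])
t2, df2 = two_sample_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

Participants see every condition, so their condition means are paired; each item occurs in only one condition, which is why the item analysis uses an independent-samples test.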
What we do not yet know is whether affixal homonymy might likewise induce storage for the Finnish inflectional suffix -jA, the partitive plural, which is homonymic with the derivational subject noun marker -jA. We are currently investigating whether there is no storage for this inflectional suffix, just as we observed no storage for the past tense suffix -te, or whether homonymy induces massive storage, just as for the comparative suffix -er. Our hypothesis was that the lexical statistics of Finnish would not allow such massive storage.

Table 4
Mean Response Latencies With Standard Deviations and Error Percentages for Inflected Words With the Comparative Suffix -er, With High Versus Low Surface Frequency (Experiment 5a), and With High Versus Low Base Frequency (Experiment 5b)

Frequency (per million)   Manipulated   Reaction time   SD   Error (%)
Experiment 5a
  High (18.9)             Surface       577             38   2.4
  Low (1.1)               Surface       612             60   6.2
Experiment 5b
  High (146.1)            Base          624             61   3.1
  Low (21.3)              Base          626             58   5.5

Methodological Issues
Before proceeding to the General Discussion section, three methodological issues require discussion. First, the absence of a surface frequency effect for inflected words in -te (Experiment 1) does not allow us to conclude that storage does not occur at all for such words. Although our experimental words cover a substantial range of surface frequencies, as shown in the upper right panel of Figure A1, a small number of very high-frequency inflected words were not included in the experiment. Although the conclusion that storage does not seem to play a role for inflected words in -te holds for the bulk of such words, the high-frequency outliers might still be stored with their own access representations, which would then be an exceptional property given the morphological category of all inflected words in -te as a whole.
Table 5
Mean Log Frequencies (With Standard Deviation and Range) and Pearson Correlation Coefficients of Base Frequency and Reaction Time (rBase) and of Surface Frequency and Reaction Time (rSurface) for All of the Suffixes Used Within the Surface Frequency Range [2:6] in the Surface Frequency Experiments and Within the Base Frequency Range [4:8] in Base Frequency Experiments

Suffix        Mean frequency   SD     Range       rBase     rSurface
Surface frequency range [2:6] in the surface frequency experiments
  V-te (V)    4.46             0.98   2.48-5.96   -0.51     [-0.29]
  A-te (N)    4.01             1.31   2.20-5.97   [-0.11]   -0.67
  A-heid (N)  4.51             10.3   2.48-5.94   [-0.18]   -0.51
  V-er (N)    3.98             1.23   2.08-5.91   [-0.01]   -0.47
  A-er (A)    4.16             1.29   2.08-5.97   [-0.16]   -0.42
Base frequency range [4:8] in the base frequency experiments
  V-te (V)    6.06             1.06   4.25-7.83   -0.38     [-0.22]
  A-te (N)    7.05             0.83   5.08-8.00   [-0.48]   -0.52
  A-heid (N)  6.69             0.79   5.20-7.98   [-0.34]   -0.51
  V-er (N)    6.62             1.09   4.03-7.89   [-0.54]   -0.61a
  A-er (A)    6.31             1.11   4.51-7.90   -0.52     -0.60
Note. V = verb; N = noun; A = adjective.
a One outlier was removed from the analysis.

Second, we have studied 5 affixes with different frequential properties. Our experimental materials reflect these differences, leading to differences in the range of surface and base frequencies used in the individual experiments. This raises the question of to what extent it remains possible to compare results across experiments. To make sure that our results are not a consequence of these differences in frequency ranges, we re-analyzed all of the experiments using multiple regression for fixed frequency ranges, chosen such that a maximum number of words from all experiments could be included. For the experiments manipulating surface frequency, we selected the log frequency range 2:6 (see Figure A1); for the experiments manipulating base frequency, we selected the log frequency range 4:8 (see Figure A2). Table 5 summarizes mean, standard deviation, and range for each suffix in both kinds of experiments together with the Pearson correlation coefficients of base frequency and RT (rBase) and of surface frequency and RT (rSurface). Correlations listed between square brackets were removed in a stepwise analysis. The basic pattern of results that emerged from this limited data set is identical to that observed on the basis of the factorial studies. The inflectional -te revealed an effect of base frequency and no effect of surface frequency. Nominal -te and the -er suffixes revealed only surface frequency effects, and -heid revealed the effects of both. We concluded that the differences in the frequency ranges between our experiments were unlikely to have caused the specific pattern of results that we have observed.
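The fixed-range re-analysis described above amounts to restricting the items to a common log frequency window before computing the correlation with RT. The following is a minimal sketch of that filtering step only; the function names and the toy values are hypothetical, and it does not reproduce the stepwise selection that decides which correlations appear in brackets in Table 5.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlation_in_range(items, lo, hi):
    """Correlate log frequency with RT for items inside [lo, hi].

    items: (log_frequency, mean_rt) pairs.
    Returns (r, number of items kept).
    """
    kept = [(f, rt) for f, rt in items if lo <= f <= hi]
    r = pearson_r([f for f, _ in kept], [rt for _, rt in kept])
    return r, len(kept)

# Toy items (not the experimental data): (log frequency, mean RT in ms).
toy_items = [(0.0, 900), (2.0, 700), (3.0, 680), (4.0, 660), (6.5, 500)]
r_surface, n_kept = correlation_in_range(toy_items, 2, 6)
```

Using the same window for every suffix ensures that differences between suffixes cannot be attributed to one experiment sampling a wider frequency range than another.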
Third, differences in list composition (selection of filler materials and nonwords) might in some way have influenced our results. However, each factorial RT experiment was paralleled by a subjective frequency rating (on a 7-point scale) with other participants using only the experimental target words without any fillers or nonwords. The results of this separate series of factorial subjective frequency experiments, summarized in Table 6, exactly mirror the pattern of the factorial RT data. The congruence between this off-line task and the lexical-decision task shows that list composition is an unlikely source of our pattern of results. Furthermore, the congruence of the two tasks shows that this pattern is quite robust.
A final objection against our methodology raised by one of the reviewers is that two samples of words that differ markedly in surface frequency might in fact differ in base frequency when remeasured and vice versa. In other words, the phenomenon of regression toward the mean might affect the matching for base frequency of sets of words with different surface frequencies and vice versa. In response to this potential problem, we have to distinguish between measurement error on the one hand and the phenomenon of regression toward the mean on the other hand.
First consider measurement error. The question that we should ask ourselves here is how our surface frequencies and our base frequencies might change if we counted these frequencies for exactly the same experimental words in another corpus of 42 million words. Fortunately, lexical statistics estimates for the expected frequency of a word in a new corpus with the same size are available (Baayen, 1996; Church & Gale, 1991; Good, 1953). For a word that occurs m times in a corpus of N tokens, the expected frequency in another corpus of N tokens, m*, is estimated by m* = (m + 1)*E[V(m + 1, N)]/E[V(m, N)], in which E[.] is the expectation operator and V(m, N) denotes the number of types in a corpus of N tokens that occur with frequency m. Various techniques are available for estimating the expectations E[V(m + 1, N)] and E[V(m, N)] (see Chitashvili & Baayen, 1993; Church & Gale, 1991; Gale & Sampson, 1995). For word frequency data, m* is smaller than m.
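The estimate m* defined above can be illustrated with a small sketch. It substitutes the raw counts V(m, N) for the expectations E[V(m, N)], which is a simplifying assumption; the smoothing techniques cited in the text are not implemented. The function names and the toy frequency spectrum are hypothetical.

```python
def expected_frequency(m, spectrum):
    """Good-Turing style estimate m* = (m + 1) * V(m + 1) / V(m).

    spectrum maps a frequency m to V(m), the number of word types that
    occur exactly m times. Raw counts stand in for E[V(m, N)].
    """
    return (m + 1) * spectrum[m + 1] / spectrum[m]

def unseen_probability(spectrum):
    """Joint probability of the unseen types, estimated as V(1) / N."""
    n_tokens = sum(freq * n_types for freq, n_types in spectrum.items())
    return spectrum[1] / n_tokens

# Toy spectrum (not CELEX): 100 types occur once, 40 twice, 20 three times.
toy_spectrum = {1: 100, 2: 40, 3: 20}

m_star_1 = expected_frequency(1, toy_spectrum)   # 2 * 40 / 100
m_star_2 = expected_frequency(2, toy_spectrum)   # 3 * 20 / 40
p_unseen = unseen_probability(toy_spectrum)      # 100 / 240
```

In this toy spectrum both estimates fall below the observed counts (0.8 for m = 1 and 1.5 for m = 2). The toy corpus is tiny, so the discount is far larger than the one reported for a 42-million-token corpus.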
The estimate m* falls below m because corpora are samples that do not exhaustively sample all possible word types. These unseen word types have a joint probability P of being sampled equal to P = E[V(1, N)]/N, the total number of types occurring once only divided by the size in tokens of the corpus (Good, 1953). When we estimate word probabilities on the basis of their sample relative frequencies, we leave no probability space for these unseen types. Therefore, all relative sample frequencies slightly overestimate the population probabilities. Given that the probability of a word W with frequency m is less than m/N, its frequency in another corpus of size N will be less than m as well. For instance, for our corpus of 42 million tokens, a word appearing with a frequency of 4 is estimated to have an expected frequency of 3.851 in other corpora of 42 million words. For a higher frequency such as 600, the Good-Turing estimate equals 598.978. For the large corpus that underlies our counts, the measurement error is very small indeed; it minimally affects our frequency counts, and it does not at all affect between-set contrasts under pairwise matching. It will be clear that measurement error does not invalidate our methodology.
Next consider the potential problem of regression toward the mean. Imagine two words, A and B, with the same base frequency, but with A having a higher surface frequency than B. Might it not be the case that the probability that the base frequency of A has been underestimated will be higher than the probability that it has been overestimated? If we look at the very same words A and B and their expected frequencies of occurrence in other corpora, the answer to this question is clearly no. As we have seen previously, the base frequencies of A and B (which are matched in this example) are both slightly overestimated, as are, in fact, their surface frequencies. It is only when we randomly select other words, say C and D, with the same surface frequency contrasts as A and B, that it is likely that one will find a higher base frequency for Word C compared with Word D.
⁶ In the past, subjective frequency (or familiarity) ratings have been used to check whether corpus-based frequency counts were reliable. However, recent experiments have revealed that subjective frequency ratings are not only sensitive to surface frequency, but also to the morphological family size (see Footnote 1) both in Dutch (Schreuder & Baayen, 1997) and in English (Baayen, Lieber, & Schreuder, 1997). Although it is an off-line task, the subjective frequency rating task apparently taps into various aspects of lexical representation and processing. The experiments summarized in Table 6 show that, just as lexical decisions, subjective frequency ratings are also occasionally sensitive to differences in base frequency when surface frequency and morphological family size are controlled for. We assume that if access happens through the base, the frequency of the base is felt in the rating. If access takes place through the whole-word form, the frequency of the base does not matter, and the rating should be a function of the surface frequency only.

Table 6
Mean Frequency Ratings (With Standard Deviation and Range) for All of the Suffixes Used in the Surface Frequency and Base Frequency Experiments

              High-frequency condition   Low-frequency condition
Suffix        Mean rating   SD           Mean rating   SD          Test statistic
Surface frequency experiments
  V-te (V)    4.8           1.2          4.2           1.1         t2(38) = 1.6, p > .1
  A-te (N)                                                         stepwise: a = .60, p < .001
  A-heid (N)  5.9           0.9          3.1           1.0         t2(37)* = 8.9, p < .001
  V-er (N)    5.1           1.3          3.5           0.8         t2(37)* = 4.7, p < .001
  A-er (A)    5.4           0.9          3.7           1.2         t2(38) = 5.3, p < .001
Base frequency experiments
  V-te (V)    4.2           1.0          3.1           1.4         t2(38) = 2.9, p < .005
  A-te (N)                                                         stepwise: b = 0.00, p > .1
  A-heid (N)  3.3           1.3          3.2           1.2         t2(47)* = 0.3, p > .1
  V-er (N)    3.6           1.4          3.8           1.4         t2(37)* = 0.4, p > .1
  A-er (A)    4.7           1.0          4.0           1.7         t2(38) = 1.6, p > .05
Note. *Because of a programming error, one item that was used in the corresponding reaction time (RT) experiment was dropped in the subjective frequency rating experiment. V = verb; N = noun; A = adjective.
But what happens when other words are sampled at random is a completely different question, one that concerns the ecological validity of our data sets and not the reliability of our methodology, which uses contrasts in one frequency measure and matching on another frequency count to gain insight into the role of storage and computation in lexical processing. (With respect to the question of ecological validity, significant by-item analyses allowed us to assume that similar results would be observed for words with similar frequential properties. We make no claims with respect to the lexicon as a whole, as our aim was to examine very specific parts of the lexicon, exactly those parts that allowed us to test our hypotheses concerning lexical processing.)

General Discussion

The question that we have addressed in this paper is how the three factors, affixal homonymy, productivity, and word formation type, affect the balance of storage and computation in visual lexical processing. Experiments 1 and 2 showed that the productive inflectional past tense suffix -te is processed exclusively by means of the parsing route, whereas its unproductive derivational homonym is processed exclusively by the direct route. Experiment 3 showed that for the productive derivational suffix -heid, both routes operated in parallel. Experiments 4 and 5 revealed extensive storage and no effect of parsing for the productive derivational suffix -er and its productive inflectional homonym, the comparative -er.

Table 7 summarizes the present results that we obtained for Dutch, as well as the results of Baayen, Dijkstra, and Schreuder (1997) for the Dutch plural suffix -en and those of Bertram et al. (1999) for Finnish. Our present results are remarkably similar to those for Finnish, and as yet they do not support the hypothesis that the balance of storage and computation is a priori biased toward computation for Finnish, the language with the more complex morphological system. Table 7 tabulates word formation type, productivity, and affixal homonymy, and whether the experiments show evidence for storage, parsing, or both. In what follows we will trace the effect of each of these factors on the balance of storage and computation.
Table 7
Summary of Affixal Properties and the Role of Storage and Computation for the Dutch and Finnish Languages

Suffix    Word formation type    Productive    Homonymic    Storage    Computation
Dutch
V-te      Invariant              yes           yes          no         yes
A-te      Changing               no            yes          yes        no
A-heid    Changing               yes           no           yes        yes
V-er      Changing               yes           yes          yes        no
A-er      Adding                 yes           yes          yes        no
N-en      Adding                 yes           yes          yes        yes
V-en      Invariant              yes           yes          no         yes
Finnish
N-ssA     Invariant              yes           no           no         yes
V-jA      Changing               yes           yes          yes        no
N-stO     Changing               yes           no           yes        yes
N-lA      Changing               no            no           yes        no

Note. This table summarizes data from the present article and from Baayen, Dijkstra, and Schreuder (1997), which both focus on Dutch, and it summarizes data from Bertram et al. (1999), which focuses on Finnish. V = verb; N = noun; A = adjective.
Word Formation Type

It has been argued that inflected words are always parsed and that derived words are always stored (Niemi et al., 1994; Taft, 1994). Our data show that inflection is neither a necessary nor a sufficient condition for parsing. As can be seen in Table 7, parsing can take place for derivational suffixes (-heid, -stO), whereas at the same time massive storage can be present for inflectional suffixes (inflectional -er). Instead of strictly distinguishing between inflection and derivation, it seems more useful to consider word formation type as a scalar dimension with meaning-invariant morphology at one extreme (e.g., person and number marking on verbs) and meaning-changing morphology (e.g., subject-noun formation) at the other extreme, with meaning-adding morphology (e.g., diminutive and comparative formation, noun pluralization) at an intermediate position (see Baayen, Lieber, & Schreuder, 1997, for extensive discussion of noun pluralization as meaning-adding morphology). With this more fine-grained analysis of the dimension of word formation type, we can formulate the hypothesis that parsing is more likely to take place for words with productive meaning-invariant affixes without productive semantic homonyms.

Productivity

It has similarly been argued that words with productive affixes are always parsed and never stored, whereas words with unproductive suffixes are always stored and never parsed (Anshen & Aronoff, 1988, 1997). Table 7 indeed shows no evidence for parsing for unproductive suffixes (the derivational -te, the derivational -lA), but storage can take place for productive suffixes. It is rampant for the homonymic suffixes -er and -jA, and it also plays an extensive role alongside parsing for the suffixes -heid and -stO.

Affixal Homonymy

Bertram et al. (1999) claimed that affixal homonymy triggers storage. Table 7 shows that this statement is somewhat too strong in the sense that an unproductive homonymic rival suffix (the derivational -te) does not induce storage for words with its productive homonymic counterpart (the inflectional -te).

It is clear that the balance of storage and computation cannot be captured in terms of broad single-factor generalizations. From this perspective, full-parsing models (e.g., Taft & Forster, 1975) and full-storage models (Butterworth, 1983; Caramazza et al., 1988) describe logical possibilities that are indeed realized in Dutch and Finnish, but in certain circumstances only. As theories of morphological processing in general, they are too restrictive.

Figure 2 presents the decision tree that depicts how the three factors interact to determine the balance between storage and computation. First, note that productivity is a necessary but not a sufficient condition for parsing. Second, the existence of a productive rival homonym induces storage irrespective of word formation type. Only for productive affixes without a productive rival homonym does the distinction between meaning-invariant morphology on the one hand and meaning-adding or meaning-changing morphology on the other hand seem to become relevant. Under these circumstances, complex words with meaning-invariant morphology revealed effects of parsing only, whereas complex words with meaning-adding or meaning-changing morphology showed effects of both storage and parsing.

We have been able to come to these conclusions by combining the data from research involving Dutch and Finnish, which typologically are very different and unrelated languages, and it is encouraging to find that these cross-linguistic data converge in a consistent manner. It is clear, however, that our conclusions are based on only 11 suffixes, and that the theory embedded in the decision tree of Figure 2 requires further empirical justification. More affixes from a wider range of languages need to be investigated experimentally. Moreover, there are other factors that need to be taken into account, such as the prefix-suffix distinction, the specific semantic properties of individual affixes, the distributional properties of affixes, and the computational complexity of different morphological systems. For instance, Figure 2 predicts extensive storage for the Finnish inflectional suffix -jA, which realizes the partitive plural, whereas the extreme productivity of the Finnish morphological system suggests that massive storage of so many different inflected words would be highly surprising. We are currently investigating whether storage is indeed absent for Finnish partitive plurals, which would falsify our theory. The theory that we have offered here is only a first approximation that, in its simplicity, is probably wrong. Subsequent research will undoubtedly reveal other factors that co-determine the balance of storage and computation as well. At the same time, the present results suggest that in spite of the complexity of the various factors and their interrelations, the beginnings of a coherent cross-linguistic theory of morphological processing are emerging.

[Figure 2: decision tree]
Productive?
    (-) storage
    (+) Homonym?
            (+) storage: V-er, A-er, V-jA
            (-) Invariant?
                    (-) storage and parsing: -heid, -stO, N-en
                    (+) parsing: V-te, -ssA, V-en

Figure 2. Decision tree for parsing and storage. Productive: Is the affix productive (+) or unproductive (-)? Homonym: Does the affix have a productive rival affix with a different semantic function (+) or not (-)? Invariant: Is the word formation type meaning-invariant (+) or not (-)?
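The decision tree of Figure 2 can be written down as three nested questions. The following is a minimal sketch, not the authors' implementation: the function name and the boolean coding of each suffix are assumptions made for illustration. In particular, the homonym question is coded from the leaf under which each suffix appears in Figure 2 (a productive rival with a different semantic function), which is why the two -en suffixes are coded as having no such rival even though Table 7 lists them as homonymic. The observed routes are taken from the Storage and Computation columns of Table 7.

```python
from typing import FrozenSet

STORAGE = "storage"
PARSING = "parsing"


def predict_routes(productive: bool,
                   productive_rival_homonym: bool,
                   meaning_invariant: bool) -> FrozenSet[str]:
    """Answer the three questions of Figure 2 in order."""
    if not productive:
        return frozenset({STORAGE})           # unproductive affix: storage only
    if productive_rival_homonym:
        return frozenset({STORAGE})           # productive rival homonym: storage only
    if meaning_invariant:
        return frozenset({PARSING})           # meaning-invariant: parsing only
    return frozenset({STORAGE, PARSING})      # meaning-adding or -changing: both


# (suffix, productive, productive rival homonym, meaning-invariant, observed routes)
# The boolean coding is an illustrative assumption; the observations follow Table 7.
CASES = [
    ("Dutch V-te",    True,  False, True,  {PARSING}),
    ("Dutch A-te",    False, True,  False, {STORAGE}),
    ("Dutch A-heid",  True,  False, False, {STORAGE, PARSING}),
    ("Dutch V-er",    True,  True,  False, {STORAGE}),
    ("Dutch A-er",    True,  True,  False, {STORAGE}),
    ("Dutch N-en",    True,  False, False, {STORAGE, PARSING}),
    ("Dutch V-en",    True,  False, True,  {PARSING}),
    ("Finnish N-ssA", True,  False, True,  {PARSING}),
    ("Finnish V-jA",  True,  True,  False, {STORAGE}),
    ("Finnish N-stO", True,  False, False, {STORAGE, PARSING}),
    ("Finnish N-lA",  False, False, False, {STORAGE}),
]

mismatches = [name for name, prod, rival, inv, observed in CASES
              if predict_routes(prod, rival, inv) != frozenset(observed)]
print(mismatches)  # [] : the tree reproduces all 11 observed patterns
```

Because the tree was derived from these same 11 suffixes, the empty mismatch list shows only that the sketch encodes the figure consistently; it is not an independent test of the theory.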
References

Anshen, F., & Aronoff, M. (1988). Producing morphologically complex words. Linguistics, 26, 641-655.
Anshen, F., & Aronoff, M. (1997). Morphology in real time. In G. E. Booij & J. van Marle (Eds.), Yearbook of Morphology 1996 (pp. 9-13). Norwell, MA: Kluwer Academic.
Baayen, R. H. (1996). The effect of lexical specialization on the growth curve of the vocabulary. Computational Linguistics, 22, 455-480.
Baayen, R. H., Dijkstra, T., & Schreuder, R. (1997). Singulars and plurals in Dutch: Evidence for a parallel dual route model. Journal of Memory and Language, 36, 94-117.
Baayen, R. H., Lieber, R., & Schreuder, R. (1997). The morphological complexity of simplex nouns. Linguistics, 35, 861-877.
Baayen, R. H., & Neijt, A. (1997). Productivity in context: A case study of a Dutch suffix. Linguistics, 35, 565-587.
Baayen, R. H., Piepenbrock, R., & Gulikers, L. (1995). The CELEX lexical database [CD-ROM]. University of Pennsylvania, Philadelphia, PA: Linguistic Data Consortium.
Bertram, R., Laine, M., & Karvinen, K. (1999). The interplay of word formation type, affixal homonymy, and productivity in lexical processing: Evidence from a morphologically rich language. Journal of Psycholinguistic Research, 28, 213-226.
Booij, G. E. (1986). Form and meaning in morphology: The case of Dutch 'agent nouns.' Linguistics, 24, 503-517.
Bradley, D. C. (1979). Lexical representation of derivational relations. In M. Aronoff & M. L. Kean (Eds.), Juncture (pp. 37-55). Cambridge, MA: MIT Press.
Burani, C., & Caramazza, A. (1987). Representation and processing of derived words. Language and Cognitive Processes, 2, 217-227.
Burani, C., Dovetto, M., Thornton, A. M., & Laudanna, A. (1997). Accessing and naming suffixed pseudo-words. In G. E. Booij & J. van Marle (Eds.), Yearbook of Morphology 1996 (pp. 55-73). Norwell, MA: Kluwer Academic.
Burani, C., Salmaso, D., & Caramazza, A. (1984). Morphological structure and lexical access. Visible Language, XVIII, 4, 342-352.
Butterworth, B. (1983). Lexical representation. In B. Butterworth (Ed.), Language production: Vol. II. Development, writing and other language processes (pp. 257-294). London: Academic Press.
Caramazza, A., Laudanna, A., & Romani, C. (1988). Lexical access and inflectional morphology. Cognition, 28, 297-332.
Chitashvili, R. J., & Baayen, R. H. (1993). Word frequency distributions. In G. Altmann & L. Hřebíček (Eds.), Quantitative text analysis (pp. 54-135). Trier, Germany: Wissenschaftlicher Verlag Trier.
Church, K., & Gale, W. (1991). A comparison of the enhanced Good-Turing and deleted estimation methods for estimating probabilities of English bigrams. Computer Speech and Language, 5, 19-54.
Clahsen, H., Eisenbeiss, S., & Sonnenstuhl-Henning, I. (1997). Morphological structure and the processing of inflected words. Theoretical Linguistics, 23, 201-249.
Cleveland, W. S. (1979). Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association, 74, 829-836.
Colé, P., Beauvillain, C., & Segui, J. (1989). On the representation and processing of prefixed and suffixed derived words: A differential frequency effect. Journal of Memory and Language, 28, 1-13.
Cutler, A., Hawkins, J. A., & Gilligan, G. (1985). The suffixing preference: A processing explanation. Linguistics, 23, 723-758.
Frauenfelder, U. H., & Schreuder, R. (1992). Constraining psycholinguistic models of morphological processing and representation: The role of productivity. In G. E. Booij & J. van Marle (Eds.), Yearbook of Morphology 1991 (pp. 165-183). Dordrecht, the Netherlands: Kluwer Academic.
Gale, W. A., & Sampson, G. (1995). Good-Turing frequency estimation without tears. Journal of Quantitative Linguistics, 2, 217-237.
Good, I. J. (1953). The population frequencies of species and the estimation of population parameters. Biometrika, 40, 237-264.
Hyönä, J., Laine, M., & Niemi, J. (1995). Effects of a word's morphological complexity on readers' eye fixation pattern. In J. Findlay, R. Kentridge, & R. Walker (Eds.), Eye movement research: Mechanisms, processes and applications (pp. 445-452). Amsterdam: Elsevier.
Karlsson, F., & Koskenniemi, K. (1985). A process model of morphology and lexicon. Folia Linguistica, 29, 207-231.
Laudanna, A., & Burani, C. (1995). Distributional properties of derivational affixes: Implications for processing. In L. B. Feldman (Ed.), Morphological aspects of language processing (pp. 345-364). Hillsdale, NJ: Erlbaum.
Marslen-Wilson, W., Tyler, L. K., Waksler, R., & Older, L. (1994). Morphology and meaning in the English mental lexicon. Psychological Review, 101, 3-33.
Niemi, J., Laine, M., & Tuominen, J. (1994). Cognitive morphology in Finnish: Foundations of a new model. Language and Cognitive Processes, 9, 423-446.
Pinker, S. (1991). Rules of language. Science, 253, 530-535.
Raab, D. H. (1962). Statistical facilitation of simple reaction time. Transactions of the New York Academy of Sciences, 24, 574-590.
Sandra, D. (1994). The morphology of the mental lexicon: Internal word structure viewed from a psycholinguistic perspective. Language and Cognitive Processes, 9, 227-269.
Schreuder, R., & Baayen, R. H. (1995). Modelling morphological processing. In L. B. Feldman (Ed.), Morphological aspects of language processing (pp. 131-154). Hillsdale, NJ: Erlbaum.
Schreuder, R., & Baayen, R. H. (1997). How complex simple words can be. Journal of Memory and Language, 36, 118-139.
Schriefers, H., Friederici, A., & Graetz, P. (1992). Inflectional and derivational morphology in the mental lexicon: Symmetries and asymmetries in repetition priming. Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 44A, 373-390.
Seidenberg, M. S. (1987). Sublexical structures in visual word recognition: Access units or orthographic redundancy? In M. Coltheart (Ed.), Attention and performance XII: Reading (pp. 245-263). Hillsdale, NJ: Erlbaum.
Sereno, J. A., & Jongman, A. (1997). Processing of English inflectional morphology. Memory and Cognition, 25, 425-437.
Taft, M. (1979). Recognition of affixed words and the word frequency effect. Memory and Cognition, 7, 263-272.
Taft, M. (1994). Interactive-activation as a framework for understanding morphological processing. Language and Cognitive Processes, 9, 271-294.
Taft, M., & Forster, K. I. (1975). Lexical storage and retrieval of prefixed words. Journal of Verbal Learning and Verbal Behavior, 14, 638-647.
Taft, M., & Forster, K. I. (1976). Lexical storage and retrieval of polymorphemic and polysyllabic words. Journal of Verbal Learning and Verbal Behavior, 15, 607-620.
Appendix

[Figure A1: five scatterplot panels titled -te (derivation), -te (inflection), -heid, -er (derivation), and -er (inflection); axis label recovered from the extraction: Surface Frequency.]

Figure A1. Scatterplots of log(surface frequency + 1) by log(base frequency + 1) for the derivational -te (upper left), the inflectional -te (upper right), the derivational suffix -heid (center), and the derivational -er (bottom left) and the inflectional -er (bottom right). The small dots represent all of the available bimorphemic words in the CELEX lexical database. The superimposed large dots represent the words selected for the experiments manipulating surface frequency. The solid lines are nonparametric robust locally weighted scatterplot smoothers (Cleveland, 1979) highlighting the dependencies between the two frequency counts.
[Figure A2: five scatterplot panels titled -te (derivation), -te (inflection), -heid, -er (derivation), and -er (inflection); axis label recovered from the extraction: Base Frequency.]

Figure A2. Scatterplots of log(surface frequency + 1) by log(base frequency + 1) for the derivational -te (upper left), the inflectional -te (upper right), the derivational suffix -heid (center), and the derivational -er (bottom left) and the inflectional -er (bottom right). The small dots represent all of the available bimorphemic words in the CELEX lexical database. The superimposed large dots represent the words selected for the experiments manipulating base frequency. The solid lines are nonparametric robust locally weighted scatterplot smoothers (Cleveland, 1979) highlighting the dependencies between the two frequency counts.
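The transformation named in the captions of Figures A1 and A2 can be sketched as follows. The captions do not state the base of the logarithm; the natural logarithm is assumed here because the axis ticks run up to 14. The function name is illustrative, and the two example items use the base and surface frequencies listed for blafte and dampte in Table A1.

```python
import math
from typing import Tuple


def scatter_coordinates(base_frequency: int, surface_frequency: int) -> Tuple[float, float]:
    """Return (log(base + 1), log(surface + 1)) for one word.

    Adding 1 before taking the logarithm keeps words with a
    frequency of 0 on the plot, because log(0 + 1) = 0.
    Natural logarithms are an assumption, not stated in the captions.
    """
    return math.log1p(base_frequency), math.log1p(surface_frequency)


# Two items from Experiment 1a (base frequency, surface frequency).
items = {"blafte": (696, 218), "dampte": (407, 36)}
for word, (base, surface) in items.items():
    x, y = scatter_coordinates(base, surface)
    print(f"{word}: base axis {x:.2f}, surface axis {y:.2f}")
```

On this scale a word that never occurs in its derived or inflected form still receives a finite coordinate, which matters for the many items in Table A1 with a surface frequency of 0.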
Table A1
Materials Used in the Experiments

Word (in English)             Mean      Mean      Base      Surface
                              reaction  error     frequency frequency
                              time      score (%)

Experiment 1a. Verbs with a high surface frequency
blafte (barked)               622       0         696       218
glipte (slipped)              567       0         415       184
groette (greeted)             602       5.6       1128      386
hakte (chopped)               654       0         875       119
klikte (told tales)           635       5.6       279       114
knoopte (knotted)             674       0         916       307
kuchte (coughed)              612       0         306       192
plofte (plopped)              674       0         252       97
propte (crammed)              716       0         230       85
raapte (picked)               636       0         594       272
rookte (smoked)               556       0         2198      456
schopte (kicked)              662       0         983       332
schraapte (scraped)           671       0         556       305
smakte (smacked)              738       5.6       296       120
spitste (pricked up)          685       5.6       318       119
stampte (stamped)             626       5.6       572       154
startte (started)             603       0         888       302
stokte (broke down)           773       0         332       219
stroopte (poached)            691       0         219       72
tastte (touched)              631       5.6       1187      413

Experiment 1a. Verbs with a low surface frequency
dampte (steamed)              609       16.7      407       36
dempte (dimmed)               690       16.7      477       49
kweekte (cultivated)          573       0         1016      47
kwetste (wounded)             586       0         606       33
lustte (liked)                724       0         287       55
poetste (polished)            630       0         677       92
prikte (stung)                576       0         683       201
raspte (grated)               770       0         203       13
schetste (sketched)           712       5.6       1066      81
schikte (arranged)            615       5.6       2024      180
schorste (suspended)          663       5.6       182       11
sloopte (demolished)          706       0         329       17
splitste (split)              694       5.6       427       37
spotte (mocked)               648       0         1249      122
staakte (went on strike)      640       0         958       58
strookte (was in accordance)  751       0         267       38
testte (tested)               747       0         544       22
toetste (tried out)           580       0         982       13
twistte (quarreled)           644       5.6       148       3
wreekte (avenged)             739       22.2      431       30

Experiment 1b. Verbs with a high base frequency
blufte (boasted)              625       3.7       100       17
dankte (thanked)              620       0         2222      154
jatte (nabbed)                602       0         121       14
knarste (crunched)            715       3.7       335       66
kookte (cooked)               589       0         2436      177
kraakte (cracked)             606       3.7       1296      262
kweekte (cultivated)          598       0         1016      47
kwetste (hurt)                622       0         606       33
lapte (shimmied)              609       0         183       16
lootte (drew lots for)        731       18.5      47        2
mikte (aimed)                 608       0         323       104
piepte (peeped)               576       0         635       125
rustte (rested)               607       0         2518      372
schetste (sketched)           693       11.1      1066      81
schikte (arranged)            611       3.7       2024      180
sloopte (demolished)          611       3.7       329       17
stookte (burned)              613       7.4       302       23
twistte (quarreled)           622       3.7       148       3
Experiment 1b. Verbs with a high base frequency (continued)
wekte (woke up)               633       3.7       3171      538
wreekte (avenged)             713       14.8      431       30

Experiment 1b. Verbs with a low base frequency
pookte (poked)*               811       40.7      38        26
ventte (hawked)*              754       33.3      29        7
balkte (brayed)               718       14.8      38        7
blikte (glanced)              639       18.5      279       119
dweepte (idolized)            704       25.9      93        32
flapte (flapped)              619       0         149       86
floepte (Impped)              716       29.6      51        22
katte (teased)                718       14.8      15        13
klakte (clacked)              753       25.9      95        52
klapte (clapped)              620       0         923       402
mestte (manured)              675       0         32        5
pafte (blazed)                766       11.1      28        11
propte (crammed)              637       7.4       230       85
schepte (created)             559       0         353       184
schraapte (scraped)           617       14.8      556       305
spurtte (spurted)             759       18.5      55        26
startte (started)             630       0         888       302
stokte (broke down)           622       3.7       332       219
streepte (striped)            618       0         70        30
trompette (trumpeted)         756       22.2      3         3

Experiment 2. Nouns in -te
zoelte (mildness)*            621       93.8      53        4
dwarste (crossness)*          757       75.0      1899      0
dunte (thinness)*             778       75.0      1990      0
gauwte (quickness)*           1075      62.5      5132      16
scheefte (crookedness)*       858       56.2      867       2
klaarte (clarity)*            854       50.0      5270      63
puurte (pureness)*            700       50.0      1521      0
grofte (rudeness)*            784       43.8      1214      0
ruigte (roughness)*           909       43.8      368       8
scheelte (cross-eyedness)*    624       43.8      142       0
schuinte (obliqueness)*       909       43.8      1367      5
stijfte (stiffness)*          696       43.8      1658      6
fijnte (fineness)*            681       37.5      3890      0
friste (freshness)*           813       37.5      1584      4
graagte (eagerness)*          807       37.5      11221     61
kleinte (smallness)*          574       37.5      21535     0
kromte (crookedness)*         749       31.2      394       0
loomte (languor)*             782       31.2      350       2
donkerte (darkness)           746       25.0      8982      109
kaalte (baldness)             617       25.0      1594      0
lauwte (tepidness)            838       25.0      582       5
ruwte (roughness)             751       25.0      1623      0
vuilte (dirtiness)            613       25.0      1996      0
engte (strait)                692       18.8      856       16
schraalte (leanness)          801       18.8      325       0
smalte (narrowness)           628       18.8      3044      0
krapte (tightness)            668       12.5      205       8
nauwte (defile)               671       12.5      2628      0
slapte (slackness)            758       12.5      1156      24
somberte (somberness)         824       12.5      1696      9
steilte (steepness)           717       12.5      982       8
droogte (dryness)             586       6.2       2716      167
koelte (coolness)             626       6.2       2344      227
luwte (shelter)               688       6.2       10        67
magerte (meagerness)          748       6.2       2032      8
schaarste (scarcity)          684       6.2       831       194
breedte (breadth)             578       0         5558      445
dikte (thickness)             591       0         6689      180
flauwte (faint)               618       0         1001      30
gekte (foolishness)           650       0         5820      28
Experiment 2. Nouns in -te (continued)
gewoonte (custom)             550       0         13509     2459
groente (vegetables)          532       0         2978      977
grootte (size)                564       0         51667     1314
hoogte (height)               553       0         17053     5164
kalmte (calmness)             577       0         2047      391
kilte (chilliness)            616       0         907       142
leegte (emptiness)            556       0         5290      679
ruimte (space)                549       0         4678      6355
sterkte (strength)            608       0         16548     482
stilte (silence)              613       0         6995      4283
verte (distance)              637       0         23125     2362
warmte (warmth)               529       0         6680      1995
wijdte (width)                721       0         2205      13
ziekte (sickness)             499       0         3783      5174
zwaarte (weight)              633       0         9812      193
zwoelte (sultriness)          794       0         160       3

Experiment 3a. Nouns in -heid with a high surface frequency
bitterheid (bitterness)       546       0         1807      228
dichtheid (density)           529       0         9137      205
droefheid (sadness)           558       0         269       263
dwaasheid (absurdity)         507       0         1129      308
grootheid (greatness)         528       0         51667     426
hardheid (hardness)           526       6.2       9923      231
hoogheid (highness)           544       0         17067     378
ijdelheid (vanity)            567       0         310       310
kuisheid (chastity)           643       6.2       230       90
lafheid (cowardice)           542       0         430       181
oudheid (antiquity)           593       0         20545     537
schoonheid (beauty)           547       0         2884      1856
snelheid (speed)              532       0         16716     1694
tederheid (tenderness)        627       0         773       513
veiligheid (safety)           552       0         3340      1395
vrijheid (freedom)            521       0         11678     5298
vroomheid (piety)             632       0         406       148
waarheid (truth)              515       0         69252     4752
wreedheid (cruelty)           552       0         842       324
zekerheid (certainty)         521       0         24640     3049

Experiment 3a. Nouns in -heid with a low surface frequency
halfheid (halfness)*          833       50.0      10588     11
bitsheid (tartness)           728       19.0      337       3
broosheid (fragility)         641       19.0      395       18
bruutheid (brutality)         770       6.2       273       6
eigenheid (ownness)           605       0         31731     82
forsheid (robustness)         682       25.0      805       2
goedheid (kindness)           553       0         83272     473
goorheid (dinginess)          700       13.0      254       2
groenheid (greenness)         717       6.2       2996      3
koelheid (coolness)           575       0         2701      37
kortheid (shortness)          543       0         18700     11
lichtheid (lightness)         562       6.2       14846     20
losheid (looseness)           667       13        8935      13
netheid (decency)             543       0         19595     60
nieuwheid (newness)           598       6.2       32846     15
puurheid (purity)             692       6.2       1547      5
rechtheid (straightness)      584       0         11726     0
scheefheid (crookedness)      733       0         867       3
serieusheid (seriousness)     631       0         1853      2
troebelheid (turbidity)       703       6.2       316       5

Experiment 3b. Nouns in -heid with a high base frequency
linksheid (leftness)*         610       31.2      3873      6
dikheid (fatness)             668       18.8      6724      3
felheid (fierceness)          599       0         2585      111
Experiment 3b. Nouns in -heid with a high base frequency (continued)
fraaiheid (beauty)            659       0         1660      6
heelheid (wholeness)          748       18.8      55138     19
heetheid (hotness)            787       18.8      9873      4
jongheid (youngness)          610       6.2       15000     13
juistheid (correctness)       605       6.2       16928     385
koudheid (coldness)           577       0         5789      0
leegheid (emptiness)          575       0         5451      36
leukheid (niceness)           709       18.8      4026      3
magerheid (thinness)          576       0         2032      7
moeheid (fatigue)             550       6.2       2546      212
mooiheid (beauty)             653       25.0      15065     7
nieuwheid (newness)           568       6.2       32846     15
puurheid (purity)             655       6.2       1547      5
scheefheid (crookedness)      729       6.2       867       3
schuinheid (obliqueness)      747       25.0      1371      0
serieusheid (seriousness)     793       18.8      1853      2
steilheid (steepness)         716       12.5      982       0
strengheid (strictness)       560       0         2508      78
traagheid (slowness)          547       0         1993      144
trotsheid (pride)             649       0         3103      0
witheid (whiteness)           611       6.2       14016     38
zwaarheid (heaviness)         616       0         9812      0

Experiment 3b. Nouns in -heid with a low base frequency
zatheid (satiety)*            667       31.2      37        8
ziltheid (saltishness)*       837       31.2      94        3
brosheid (fragility)          829       18.8      95        5
bruutheid (brutishness)       699       6.2       273       6
dartelheid (friskiness)       742       12.5      131       2
droefheid (sadness)           623       0         269       263
droogheid (dryness)           580       0         4185      11
dufheid (mustiness)           817       6.2       90        3
edelheid (nobleness)          650       0         534       0
effenheid (smoothness)        708       0         287       4
gaarheid (cookedness)         680       6.2       252       0
geilheid (lubriciousness)     685       0         341       64
ijdelheid (vanity)            604       0         310       310
klefheid (clamminess)         685       0         91        0
koelheid (coolness)           602       0         2701      37
lafheid (cowardliness)        562       0         430       181
losheid (looseness)           583       25.0      8935      13
makheid (tameness)            650       0         101       0
matheid (weariness)           580       0         632       18
plompheid (rudeness)          710       18.8      186       0
rotheid (rottenness)          610       0         900       3
schuwheid (shyness)           654       0         569       39
soberheid (soberness)         684       6.2       329       96
vaalheid (fadedness)          727       6.2       292       6
vlugheid (quickness)          607       6.2       4059      8

Experiment 4a. Nouns in -er with a high surface frequency
dweper (idolater)*            851       50.0      93        16
turner (gymnast)              641       18.8      47        68
drijver (cattle driver)       641       6.2       3360      13
schenker (pourer)             624       6.2       3745      38
ruiker (bouquet)              637       6.2       1702      41
breker (breaker)              586       0         4762      55
gieter (watering can)         580       0         980       39
heerser (ruler)               582       0         2511      221
jager (hunter)                548       0         2168      363
kenner (connoisseur)          630       0         21514     148
lijder (sufferer)             739       0         7188      45
redder (saver)                586       0         3076      99
roeier (rower)                530       0         537       8
stoker (stoker)               719       0         302       143
tobber (worrier)              679       6.2       213       15
wekker (alarm clock)          536       0         3171      249
Experiment 4a. Nouns in -er with a high surface frequency (continued)
zanger (singer)               542       0         4234      138
zender (sender)               543       0         1752      369
zwemmer (swimmer)             544       0         1567      52
zwerver (wanderer)            571       0         692       138

Experiment 4a. Nouns in -er with a low surface frequency
smeder (forger)*              833       43.8      295       0
looier (tanner)*              752       37.5      55        0
schelder (slanger)*           807       31.2      659       0
buiger (bower)                678       12.5      3234      0
bieder (bidder)               696       6.2       7689      18
blusser (extinguisher)        707       6.2       148       3
kaatser (fives-player)        759       6.2       128       0
vreter (glutton)              691       6.2       555       0
wasser (washer)               671       6.2       2337      7
bidder (prayer)               645       0         1782      6
binder (binder)               724       0         1979      10
glijder (slider)              713       0         3193      0
grijper (grasper)             641       0         4155      8
hijger (gasper)               606       0         1489      0
huiler (crier)                660       0         4580      3
kruiper (creeper)             741       0         2836      6
sluiper (sneaker)             599       0         925       5
vanger (catcher)              546       0         3435      0
volger (follower)             598       0         21720     4
winner (winner)               527       0         3383      3

Experiment 4b. Nouns in -er with a high base frequency
ronker (snorer)*              982       81.2      231       0
duider (suggestor)*           798       50.0      1310      77
brenger (bearer)              738       25.0      36536     15
janker (yelper)               732       18.8      495       0
smelter (melter)              766       18.8      816       0
dwinger (forcer)              731       12.5      3451      0
glijder (slider)              653       12.5      3193      0
huiler (crier)                613       12.5      4580      3
smijter (dasher)              650       12.5      947       0
stijger (riser)               602       12.5      3335      0
vinder (finder)               730       12.5      51631     24
bieder (bidder)               713       6.2       7689      18
blijver (stayer)              611       6.2       52837     0
ligger (liar)                 688       6.2       40957     0
wasser (washer)               650       6.2       2337      7
ziener (seer)                 706       6.2       98375     82
heerser (ruler)               642       0         2511      221
kenner (connoisseur)          586       0         21514     148
voeler (feeler)               771       0         29234     4
volger (follower)             743       0         21720     4

Experiment 4b. Nouns in -er with a low base frequency
zifter (sifter)*              1383      81.2      14        0
schranzer (gormandizer)*      852       75        20        0
kleumer (shiverer)*           930       62.5      29        0
schender (violator)*          1087      37.5      226       2
binder (binder)               666       6.2       1979      10
gieter (watering can)         633       0         980       39
grijper (grasper)             697       0         4155      8
looier (tanner)               911       2.5       55        0
ordener (arranger)            787       25.0      637       0
rijder (rider)                669       0         8591      11
sluiper (sneaker)             630       0         925       5
snijder (cutter)              615       0         2682      10
strooier (strewer)            625       0         470       0
temmer (tamer)                731       25.0      192       3
tobber (worrier)              705       18.8      213       15
Experiment 4b. Nouns in -er with a low base frequency (continued)
turner (gymnast)              695       0         47        68
vuller (filler)               711       18.8      3245      2
wekker (alarm clock)          556       0         3171      249
winner (winner)               618       12.5      3383      3
zender (sender)               571       0         1752      369

Experiment 5a. Comparatives with a high surface frequency
dieper (deeper)               553       0         11667     1957
dikker (fatter)               593       0         6689      357
dunner (thinner)              603       0         1801      189
enger (creepier)              589       0         856       153
feller (fiercer)              585       5.9       2585      338
fijner (finer)                552       0         3890      182
geringer (smaller)            631       5.9       3539      489
kalmer (calmer)               542       5.9       2047      167
kouder (colder)               576       0         5789      170
langer (taller)               527       0         42525     7438
milder (milder)               621       5.9       909       196
rijper (riper)                569       5.9       1089      115
scherper (sharper)            590       5.9       4875      503
slechter (worse)              551       0         7977      542
slimmer (smarter)             539       0         1095      121
sneller (faster)              528       0         16312     1921
soepler (suppler)             679       11.8      849       92
strenger (stricter)           595       0         2423      168
vlugger (quicker)             573       0         4059      390
zwakker (weaker)              546       0         3006      383

Experiment 5a. Comparatives with a low surface frequency
zeldener (rarer)*             749       52.9      3719      4
banger (more afraid)          496       0         7921      98
blanker (whiter)              614       5.9       967       7
blauwer (bluer)               599       0         5281      30
blijer (happier)              599       5.9       4391      20
doller (wilder)               669       5.9       1059      3
flinker (firmer)              674       11.8      2908      36
gauwer (sooner)               727       29.4      5132      25
juister (more right)          590       0         16298     173
linkser (more left)           716       17.6      3873      9
naakter (nuder)               637       0         2181      17
natter (wetter)               574       5.9       2572      24
nieuwer (newer)               551       0         32729     137
rauwer (rawer)                624       11.8      882       8
schuiner (more oblique)       672       0         1367      7
trotser (prouder)             576       0         2079      15
valser (falser)               533       11.8      1566      8
vreemder (stranger)           589       0         6646      99
witter (whiter)               614       11.8      12989     76
zwarter (blacker)             569       0         6846      56

Experiment 5b. Comparatives with a high base frequency
zeldener (rarer)*             861       43.8      3719      4
blonder (blonder)*            609       37.5      1291      7
banger (more afraid)          577       0         7921      98
flinker (firmer)              689       6.2       2908      36
gauwer (sooner)               698       18.8      5132      25
jaloerser (more jealous)      597       6.2       1097      4
juister (more right)          679       6.2       16928     173
linkser (more left)           670       0         3873      9
naakter (nuder)               614       0         2181      17
nieuwer (newer)               542       0         32729     137
schuiner (more oblique)       648       6.2       1367      7
smaller (narrower)            531       0         3044      155
stommer (more stupid)         619       0         1362      26
strikter (stricter)           755       6.2       1101      16
triester (drearier)           652       0         674       19
Experiment 5b. Comparatives with a high base frequency (continued)
trotser (prouder)             605       6.2       2079      15
valser (falser)               609       0         1566      8
vreemder (stranger)           546       0         6646      99
witter (whiter)               640       0         12989     76
zwarter (blacker)             562       0         6846      56

Experiment 5b. Comparatives with a low base frequency
driester (more reckless)*     741       62.5      80        22
ranker (crankier)*            739       31.3      189       2
penibeler (more awkward)*     920       31.3      49        3
bonter (more variegated)      662       12.5      673       22
dommer (dumber)               596       0         2096      72
edeler (nobler)               584       6.2       534       28
enger (creepier)              546       6.2       856       153
koeler (cooler)               555       0         2344      164
krommer (more crooked)        672       6.2       394       13
lakser (slacker)              637       0         43        3
leuker (nicer)                553       0         4026      172
luxer (more luxurious)        644       6.2       180       2
matter (paler)                712       25.0      327       51
nobeler (nobler)              638       6.2       268       16
reiner (purer)                687       6.2       356       15
rijper (riper)                596       0         1089      115
soepeler (suppler)            570       0         849       92
spitser (sharper)             694       0         294       5
taaier (more tenacious)       582       6.2       506       36
wankeler (shakier)            716       12.5      352       2
Note. Words marked with an * were not included in the analyses because of their high error rates.

Received December 12, 1997
Revision received May 4, 1999
Accepted August 12, 1999