
Implicit learning of a recursive rule in an artificial grammar

Acta Psychologica 111 (2002) 323–335 www.elsevier.com/locate/actpsy

Fenna H. Poletiek
Unit of Experimental Psychology, University of Leiden, P.O. Box 9555, 2300 RB Leiden, Netherlands
Received 4 July 2001; received in revised form 18 December 2001; accepted 14 March 2002

Abstract

Participants performed an artificial grammar learning task, in which the standard finite state grammar (J. Verb. Learn. Verb. Behavior 6 (1967) 855) was extended with a recursive rule generating self-embedded sequences. We studied the learnability of such a rule in two experiments. The results verify the general hypothesis that recursivity can be learned in an artificial grammar learning task. However, this learning seems to be based on recognising chunks rather than on abstract rule induction. First, performance was better for strings with more than one level of self-embedding in the sequence, which uncover the self-embedding pattern more clearly. Second, the infinite repeatability of the recursive rule application was not spontaneously induced from the training, but it was when an additional cue about this possibility was given. Finally, participants were able to verbalise their knowledge of the fragments making up the sequences (especially in the crucial front and back positions), whereas knowledge of the underlying structure, to the extent it was acquired, was not articulatable. The results are discussed in relation to previous studies on the implicit learnability of complex and abstract rules. © 2002 Elsevier Science B.V. All rights reserved.

E-mail address: [email protected] (F.H. Poletiek).
0001-6918/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0001-6918(02)00057-4

1. Introduction

People are able to acquire knowledge of the structures in their environment by mere exposure to exemplars of these structures. This process has been shown to proceed implicitly, without any explicit theory being induced from the observed exemplars (Berry & Dienes, 1993; Dienes, Broadbent, & Berry, 1991; Reber, 1967, 1976, 1993). This basic skill plays a role in many forms of learning, ranging from patterns of social behavior (Lewicki, 1986) to system control (Cleeremans & McClelland, 1991) and grammar learning (Reber, 1989).

Research into implicit grammar learning generally addresses two major questions about this process: What kind of knowledge can be acquired by induction from exemplars? And how much of this knowledge is available to consciousness? Researchers have proposed different hypotheses, ranging from abstract knowledge being unconsciously acquired (Reber, 1967, 1989, 1993) to knowledge of small partitions (chunks) of the exemplars, which is available to consciousness (Dulany, Carlson, & Dewey, 1984; Perruchet, Gallego, & Savy, 1990; Perruchet & Pacteau, 1990). The researchers defending the latter position suggest that performance on grammaticality judgement tasks can be explained by assuming merely associative knowledge of the short strings of which the exemplars are made up.

A number of methodologies have been used to counter this argument, such as transfer tasks, testing the knowledge of the location of partitions in the exemplars, and testing the learnability of long distance dependencies. In the studies using the first methodology, participants seem to be able to transfer knowledge of abstract grammatical rules acquired from exemplars in one domain to another, different domain (Gomez & Schvaneveldt, 1994). This capability was even found in seven-month-old babies exposed to sequences of sounds (Marcus, Vijayan, Rao, & Vishton, 1999). However, it has been debated whether transfer studies demonstrate rule based learning. Redington and Chater (1996), for example, demonstrated that transfer results could be explained by simple similarity judgements and fragmentary knowledge. In the second methodology, it is tested whether participants learn the permissible location of short strings within the exemplars of the systems.
The assumption is that knowledge of the correct location of the short strings (for example bigrams or trigrams) within the exemplars requires that one knows the full underlying grammar (Johnstone & Shanks, 1999; Meulemans & Van der Linden, 1997). Meulemans and Van der Linden, however, found that such rule based knowledge is acquired under certain conditions only, for example, when the training set of exemplars is long. In the third methodology, the system to be learned by participants contains rules about the dependency of elements at distant positions. For example, the grammar may dictate that if a certain element is in the first position of an exemplar, then another particular element must be in the fifth position. In this line of reasoning, it is hypothesised that if the long distance dependencies are learned, then rule induction proceeds by more than the acquisition of short partitions of the exemplars only (Cleeremans & McClelland, 1991; Lewicki, Hill, & Bizot, 1988; Millward & Reber, 1972; Nissen & Bullemer, 1987). However, the learning of long distance dependencies has been demonstrated in sequence learning tasks, and not in the more ‘abstract’ grammar learning tasks.

The purpose of the present study is to investigate the learnability of a particular abstract system in an artificial grammar learning (AGL) task: a recursive grammar. Recursion is the global principle that an operation applied to the elements of a set results in a new element of that set. Recursion is a generative principle, which is a crucial characteristic of natural language. For example, recursivity in natural grammar makes it possible to generate sentences with relative clauses. The clause can be seen as a sentence within a sentence, as in: ‘The cat [that sits on the roof of the neighbour’s house], blows at my dog’. In this case, the elements are sentences and the operation is ‘self-embedding’.

The learnability of recursion from exemplars is an interesting topic in the context of the foregoing discussion, for several reasons. First, recursivity is a quite ubiquitous, ecologically relevant principle. Besides natural language, it is found in many kinds of cognitive activities varying from highly abstract mathematical reasoning to daily reasoning and planning. Second, it is both a complex kind of rule and a very intuitive principle. Its complexity can be illustrated by the fact that it can, in principle, be applied an infinite number of times. Its ‘simplicity’, however, lies in the fact that it is the same operation which is applied repetitively. Third, because the gist of the principle of recursion is its endlessly repeatable character, it is interesting to investigate whether people spontaneously induce this ‘infinite’ property when they learn it by being exposed to exemplars in which it is applied (necessarily) a limited number of times. In this line of reasoning, Mathews and Cochran (1998) argue that the learnability of generativity is one of the most important questions to deal with in the implicit learning paradigm. Thus, two aspects of a recursive rule have to be learned by a learner to master it fully: the recursive operation itself and its infinite reapplicability. Notice that the recursive rule we investigate involves long distance dependencies. Therefore, it contributes to the discussion about fragment based versus deep rule learning explanations of implicit grammar learning.
The question we address about the learnability of this structure is twofold: First, do participants learn the recursive operation from exposure to the learning strings and second, under what conditions do they infer that this operation can be applied more often within one string, than it was applied in the strings from the learning set? Two hypotheses underlie our predictions: First, we hypothesise that the recursive operation can be learned in an AGL task. This learning process probably is based on the recognition of groups of elements in the string, and their mutual dependency. The groups of elements are recognised as being dependent on each other and positioned at each side of another group making the ‘distance’. This leads to the prediction that the recursive operation is learned better as the relevant groups of letters are easier to identify. We further predict that this identification is more easily performed as the recursive operation is applied more often in the sequence, showing more saliently the repeated pattern of partitions of sequences. Thus, when frequently applied in a string, it should be easier for a learner to recognise the recursive operation, i.e., to recognise the ‘chunks’ in the sequences and their symmetrical organisation within a string. In addition, in line with past results, we predict that the knowledge of fragments of the grammar can be verbalised, whereas the operation, being the more abstract part, may not be explicitly articulatable by the learner. Our second hypothesis is about the generative aspect of the recursive structure. The repeated applicability of the recursive rule may not be inferred spontaneously from a training set with a limited number of recursive applications. This hypothesis is related to findings in (psycho)linguistics suggesting that learning to use a recursive


structure does not necessarily require deep knowledge of these structures, but can be explained by the acquisition of a limited number of applications used in the learning materials (Christiansen, 1992). The learner considers the training set, ceteris paribus, as a representative sample of the materials the rule system generates. Learning is expected to be based on this sample. However, as cues are provided that more recursive applications are permissible in principle, participants may quickly recognise the multiple expressions of a principle already acquired.

To investigate the learnability of recursion in an artificial grammar learning task, we varied Reber’s (1967) standard task. In the learning phase, participants in the standard task are exposed to exemplars of an artificial grammar, being a finite state grammar with five elements. We manipulated this grammar by extending it with a recursive rule. This rule allows the generation of embedded sequences within a sequence. In addition, the embedding rule was located in the centre of the grammar (generating self-embedded sequences) and not at one of the extremities. In this manner, the sequences would contain long distance dependencies between a front part and a back part of the basic structure. This mechanism is infinitely repeatable. Thus, an infinite number of sequences can be generated with all possible numbers of levels of self-embedding. This recursive grammar is displayed in Fig. 1. The front parts (chunks) of the exemplars with embeddings will be called ‘pushes’ and the back parts will be called ‘pops’. The pushes are the strings generated by all possible ways leading from S1 to S5 (for example: TXXV, PV, PTV), and the pops are the strings generated by all possible ways leading from S5 to S6 (for example: V, PS, PXVV). Participants were trained with a set of strings having zero, one, or two levels of embedding. The procedure was analogous to the standard artificial grammar learning procedure.
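The push/embedding/pop scheme described above can be illustrated with a minimal Python sketch. It is not the program used in the study. It uses only the example pushes and pops quoted in the text, whereas the grammar of Fig. 1 allows more. The innermost strings are borrowed from the bracketed examples in the Materials section, the free pairing of any push with any pop is an assumption, and the function name `generate` is hypothetical. The length limits applied to the actual stimuli are not enforced.

```python
import random

# Example chunks quoted in the text; the grammar of Fig. 1 allows more.
PUSHES = ["TXXV", "PV", "PTV"]  # example paths from S1 to S5
POPS = ["V", "PS", "PXVV"]      # example paths from S5 to S6

# Assumption: innermost strings taken from the bracketed examples
# in the Materials section, not an exhaustive list.
BASE_STRINGS = ["TSXXVV", "TSSXS", "TXS", "TSXS"]


def generate(levels, rng, brackets=False):
    """Return a string with `levels` levels of self-embedding.

    Each level wraps the inner string in one push (front) and one pop (back).
    Assumption: any push may be combined with any pop.
    """
    if levels == 0:
        return rng.choice(BASE_STRINGS)
    inner = generate(levels - 1, rng, brackets)
    if brackets:
        # Brackets only mark the embedding for the reader, as in the text.
        inner = "(" + inner + ")"
    return rng.choice(PUSHES) + inner + rng.choice(POPS)


if __name__ == "__main__":
    rng = random.Random(0)
    for levels in range(4):
        print(levels, generate(levels, rng, brackets=True))
```

Because the function calls itself once per level, nothing in the rule limits the depth of embedding; any limit comes from the caller, which mirrors the distinction between the recursive operation and its unlimited reapplicability.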
Participants’ knowledge of the grammar was observed in a categorisation task with a test set of novel strings, having zero, one, two, or three (one more than in the learning set strings) levels of embedding. In Experiment 1, we test the participants’ capability to learn the present recursive principle, and the way they learn it. In line with the foregoing, we predicted first that

Fig. 1. Reber’s (1976) grammar extended with a recursive rule.


strings with no embedding (the basic structure) are easiest to categorise. Ungrammaticality of those strings in the test phase did not involve long distance dependencies, but letters in illegal positions. Second, we expected that strings with two levels of embedding are easier to recognise than those with one. For strings having several levels of embedding, and therefore more pushes and pops, the symmetric pattern generated by the recursivity emerges more clearly from the exemplars. Strings with only one level of embedding were hypothesised to be difficult for two reasons. Since there are only one (in ungrammatical strings) or two (in grammatical strings) ‘cuts’ which can be made in the sentence, identifying the correct groups of letters is less obvious. Moreover, since the test strings were all balanced for length and number of embeddings, the average length of the embeddings of pops and pushes was longer for strings with one level of embedding. Thus, the dependency between push and pop reached over a longer distance, on average, for these strings. Third, we predicted that the strings with three levels of embedding will be rejected because more than two applications of the principle within one string have not been seen during training: participants are hypothesised to consider the training set as a complete sample of the possibilities of the grammar.

2. Experiment 1

2.1. Method

2.1.1. Participants

Forty undergraduate students aged 18–15, of the University of Leiden participated in this study on a voluntary basis.

2.1.2. Materials

The grammar of Fig. 1 was implemented in a computer program generating sequences. This program was used to make the learning set of sequences and the test set. For the learning set, 56 sequences were selected having a length of at least 9 and at most 13 elements (letters). The sequences had zero, one or two levels of self-embedding. The length of the strings was controlled for different levels of embedding. That is, there were as many sequences with zero, one, or two levels of self-embedding having the same length. The order of presentation was randomised. Examples of sequences presented in the learning phase are: PTVPXTTTVV (zero levels of self-embedding), PV(TSXXVV)V (one level of self-embedding) and TXXV(PV(TSSXS)V)V (two levels of self-embedding). Notice that the self-embeddings in the sequences are put between brackets here for clarity. These brackets were not shown in the stimulus sequences actually presented to the participants. For the test set, 118 sequences with at least 9 and at most 13 letters were used (lengths were the same as in the training phase). Fifty-nine sequences were grammatical and 59 were ungrammatical. In addition, 60 of the test sequences had no self-embeddings, 20 had one level of


embedding, 20 had two levels of embedding, and 18 had three levels of embedding, which is one more level than presented in the learning phase. Ungrammaticality was implemented by carrying out a transformation on grammatical sequences generated by the computer program. For sequences with no self-embeddings, this transformation held that two letters (not the ones in the first or last position) were placed in a non-permissible order. For the sequences with one or more levels of embedding, one push or pop was omitted. Thus, either the part of the sequence preceding State 5 (see Fig. 1) or the part of the sequence following State 5 was omitted. Importantly, this ‘ungrammaticality’ bears on the recursive operation only. The necessity of a pop in the second part of the sequence is dependent upon the presence of a push at the beginning. Thus, recognising that the pop is missing at the end must be derived from the knowledge that it is related to a push. The ungrammaticality of the test strings with embeddings did not result in any impermissible letter pairs or triples.

We took care to select each push, embedding and pop approximately equally often in both the learning and the test phase. This was done to make sure that the associative chunk strength of these constituents did not vary. This would happen if, for example, the constituent TXS had appeared much more often in the learning phase than the embedding PVV (Knowlton & Squire, 1994). Examples of ungrammatical strings in the test phase with different numbers of embedding were: TSSXXTVSPXVV (no embeddings; S erroneously precedes P), TXXTTTP(PTTVPS) (one level; pop is missing), PV(TSXXTV(TXS)V) (two levels; one of two pops is missing), PV(PTV(PV(TSXS)V)V) (three levels; one of three pops is missing).

2.1.3. Procedure

The task was administered individually with a computer. It was presented as a memory task.
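The omission used to make strings with embeddings ungrammatical, as described under Materials, can be sketched as follows. This is a hedged illustration, not the original stimulus program. A string is represented as a list of pushes, an innermost string, and a list of pops. The function names are hypothetical, and the sketch assumes that the outermost chunk is the one omitted, as in the examples given in the text. The chunks are read off the two-level test example, while the outermost pop of its grammatical parent is an assumption.

```python
def flatten(pushes, core, pops):
    """Join pushes (outermost first), the innermost string, and pops (innermost first)."""
    return "".join(pushes) + core + "".join(pops)


def omit_outermost_pop(pushes, core, pops):
    """Violation: the pop that closes the outermost push is left out."""
    return flatten(pushes, core, pops[:-1])


def omit_outermost_push(pushes, core, pops):
    """Violation: the outermost push is left out, so its pop has no partner."""
    return flatten(pushes[1:], core, pops)


def bigrams(string):
    """Set of adjacent letter pairs in a string."""
    return {string[i:i + 2] for i in range(len(string) - 1)}


# Chunks read off the two-level test example PV(TSXXTV(TXS)V) in the text.
# The outermost pop "V" of the grammatical parent is an assumption.
pushes, core, pops = ["PV", "TSXXTV"], "TXS", ["V", "V"]

good = flatten(pushes, core, pops)            # PVTSXXTVTXSVV
bad = omit_outermost_pop(pushes, core, pops)  # PVTSXXTVTXSV

# Dropping an outermost chunk leaves a prefix (or suffix) of the parent,
# so no new letter pair can appear in the ungrammatical string.
print(bad, bigrams(bad) <= bigrams(good))
```

Under these assumptions the ungrammatical string contains only letter pairs that also occur in its grammatical parent, so rejecting it cannot rest on spotting an illegal pair alone.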
Participants first saw the 56 sequences one by one appearing in the middle of the screen, and were requested to memorise them as well as possible. They were then instructed about the second phase of the task. Here, they were informed that the exemplars they had seen before were made according to rules governing the order of the letters. They were told that half the sequences they were to see next followed the same rules, whereas the other half did not, and that their task was to tell for each sequence whether it was correct or incorrect. Participants could answer by pressing a ‘yes’ or a ‘no’ key on the keyboard. The sequences in the learning phase were displayed for a few seconds each. The sequences in the test phase were displayed until the participant pressed one of the keys. The task took about 40 min. Afterwards, the participant was asked one open question: ‘How did you proceed to answer yes or no?’ The participants were encouraged to reflect on how they categorised the sequences, and to tell about every detail. The experimenter noted the answer.

2.1.4. Design

The manipulation was the number of levels of embedding of a sequence in the test phase. This was a within-subjects variable with four levels: zero, one, two and three


levels. Categorisation was the dependent variable. Categorisation scores (for strings with a given number of levels of embedding) were determined as the ratio of the number of correctly rated sequences (with the same number of levels) to the total number of ratings (with the same number of levels). This performance score has a range of 0–1. The answers given in the interview were categorised according to whether participants mentioned having paid attention to: ‘the first group of letters’, ‘last group of letters’, ‘groups of three letters’, ‘groups of two letters’, ‘long distance dependencies’, ‘recursion’.

2.2. Results

The mean judgement performance per level of embedding of the test stimuli is displayed in Table 1. A within-subject analysis of variance revealed a significant effect of the number of levels of embedding (F(3, 37) = 12.3, p < 0.01). The relation between levels of embedding and performance was not linear. It displays the predicted sawtooth pattern: Performance was relatively high for test strings with zero and two levels of embedding. On a test for the difference between a theoretical (0.50) and a sample proportion, categorisation performance on strings without embeddings (z = 2.93, p < 0.01) as well as on strings with two embeddings (z = 2.82, p < 0.01) was significantly higher than chance. Performance on strings with one level of embedding and three levels of embedding was lower, and actually significantly below chance (z = 1.69, p < 0.05, for one-level strings and z = 2.60, p < 0.01, for three-level strings). Performance on strings with three levels of embedding was clearly below chance, suggesting that participants tended to reject grammatical strings with three levels. In Table 2, the frequencies of the participants’ answers to the post-experimental interview are shown. None of the participants was aware of long distance dependencies or recursion.
Instead, most participants indicated they based their judgements on the recognition of groups of letters.

2.3. Discussion

The low overall performance (as compared to the results mentioned in the literature) might be explainable by the greater length (Reber, 1993) of the sequences used, and by the complexity of the grammar. Since our main focus, however, is

Table 1
Mean performance on grammaticality judgements of sequences with zero, one, two and three levels of embedding (N = 40)

Levels of embedding    M       SD
Zero                   0.53    0.046
One                    0.47    0.086
Two                    0.55    0.089
Three                  0.45    0.10
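The test of a sample proportion against the theoretical value of 0.50 can be sketched with the means of Table 1. The text does not state which number of observations entered the test. The sketch below assumes that every single judgement counts as one observation, so that n is the number of participants times the number of test items at that level. The function name `z_against_chance` is hypothetical.

```python
import math


def z_against_chance(proportion, n_judgements, chance=0.5):
    """One-sample z statistic for a proportion against a theoretical value."""
    standard_error = math.sqrt(chance * (1 - chance) / n_judgements)
    return (proportion - chance) / standard_error


N_PARTICIPANTS = 40
# Number of test items per level of embedding, as given in the Materials.
ITEMS = {"zero": 60, "one": 20, "two": 20, "three": 18}
# Mean proportions correct from Table 1.
MEANS = {"zero": 0.53, "one": 0.47, "two": 0.55, "three": 0.45}

for level, n_items in ITEMS.items():
    # Assumption: each judgement is treated as an independent observation.
    n = N_PARTICIPANTS * n_items
    print(level, round(z_against_chance(MEANS[level], n), 2))
```

Under this assumption the statistic comes out at roughly 2.94, -1.70, 2.83 and -2.68 for zero, one, two and three levels. The magnitudes are close to the values reported in the Results, with the small differences plausibly due to rounding of the means, and the negative signs mark the two levels at which performance fell below chance.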


Table 2
Answer frequencies to post-experimental interview (N = 40)

Answer category               f
First group of letters        16
Last group of letters         16
Groups of two letters         13
Groups of three letters       27
Long distance dependencies    0
Recursion                     0

on differences in performance for different types of strings, a low overall score does not affect our analyses. The influence of number of embeddings on categorisation performance globally follows the predicted pattern. First, exemplars without applications of the recursive rule (without embeddings) were recognised better than strings with one level of embedding. Second, strings with two levels of self-embedding were better recognised than the one level strings. This is in accordance with our suggestion that repetition of the recursive principle in the string makes it more salient. Third, when items with three levels of embedding (more than seen during training) were judged, performance clearly dropped again: In accordance with our hypothesis, participants probably inferred from the learning set that two levels is the highest possible number of levels, and they do not spontaneously generalise the recursive rule beyond two levels of embedding. The interviews revealed that participants focus on partitions, and look for recognisable chunks. Interestingly, the chunks they mentioned in the open interview, often coincided with pushes and pops. Although this organisation of the materials might have facilitated the induction of the recursive operation, no participants mentioned explicitly to have seen long distance relations within the exemplars, let alone a recursive rule. It should be noticed that the introspection method we used only provides a modest indication of the accessibility to awareness of the grammatical knowledge acquired (Shanks & St John, 1994), especially since a recursive rule is quite difficult to describe anyway. We performed a second experiment, with two purposes: First, we wanted to test whether the pattern of results could be replicated. Second, we wanted to find out whether people would generalise the possible number of levels of embedding permitted by the grammar beyond the number they saw at training, when they are given an additional cue. 
To answer the first question, we replicated Experiment 1 with a new sample of strings. To investigate the second one, we added a second task during testing. In this task, we asked participants to categorise new strings clearly longer than the ones used in the training set in the same way they did in the previous task. According to our saliency hypothesis, more levels of embedding in sequences would make it easier for the learner to recognise the recursive rule application, and chunk the string into pushes, pops and embeddings. It was thus predicted that participants would do better in categorising strings with two levels of embedding or more, when cued that the grammar allows for longer strings than the ones seen during learning. With regard to strings with one embedding, these were predicted to be difficult to recognise as (un)grammatical because these strings are more difficult to chunk. Finally, performance on the strings without any embedding is expected to show the standard grammar learning effect.

3. Experiment 2

3.1. Method

3.1.1. Participants

Thirty-eight undergraduate students of the University of Leiden participated in this study on a voluntary basis.

3.1.2. Materials

The materials in Task 1 were similar to the materials in Experiment 1. A new sample of exemplars was taken from the set produced by the program, satisfying the same constraints as the sequences of Experiment 1. The sequences in the learning set had zero, one or two levels of embedding. The test set contained sequences with zero, one, two and three levels of self-embedding. For Task 2, 20 new strings were generated with more than 13 letters, 10 of which had four levels of self-embedding, and another 10 had five levels. Moreover, half this set was grammatical and half was not.

3.1.3. Procedure

Learning and testing of Task 1 was administered in the same way as in Experiment 1. However, after the participants had completed the Task 1 test, a second task was announced: ‘‘Now, you will see 20 new exemplars that are longer than the ones you have seen until now. Could you indicate which ones are correct and which ones are not? You can do this in the same way as you have done until now’’. After they finished this task, the post-experimental interview was administered.

3.1.4. Design

The number of levels of embedding of a sequence was the independent within-subjects variable with four levels: zero, one, two and three levels in Task 1. In Task 2, this variable had two levels: four and five levels of embedding. Grammaticality judgement performance was the dependent variable. The interview responses were scored in the same way as in Experiment 1.

3.2. Results

Again, performance followed the same pattern as in Experiment 1, depending on the number of levels of embedding. The mean judgement performance on Task 1 and Task 2, per level of embedding, is displayed in Table 3.
A within-subject analysis of variance of the data of Task 1 revealed a significant effect of the number of levels of embedding (F(3, 35) = 4.58, p < 0.01). The pattern


Table 3
Mean performance on grammaticality judgements of sequences with zero, one, two, three (Task 1) and four and five (Task 2) levels of embedding (N = 38)

Levels of embedding    M       SD
Task 1
  Zero                 0.55    0.08
  One                  0.51    0.08
  Two                  0.55    0.10
  Three                0.47    0.09
Task 2
  Four                 0.56    0.13
  Five                 0.55    0.16

of performance over the levels of embedding in Task 1 is very similar to that found in Experiment 1. An analysis of the results for the different levels of embedding showed that performance on the strings without embeddings (z = 2.75, p < 0.01) and the strings with two levels of embedding (z = 0.275, p < 0.01) was significantly above chance. For one-level strings performance did not deviate from chance (z = 0.55, p > 0.05). And again, there is no spontaneous generalisation to unstudied strings with three levels of embedding. Performance on those strings was marginally below chance (z = 1.65, p < 0.05). As predicted, categorisation performance on Task 2, where participants were cued that longer strings than previously seen (with higher levels of embedding than the ones studied) could nonetheless be correct, was also high. For strings with four and five levels of embedding, performance exceeded chance (z = 2.34, p < 0.01, for four levels, and z = 1.94, p < 0.05, for five levels). The frequencies of the participants’ answers to the post-experimental interview are displayed in Table 4. Participants most often reported having paid explicit attention to the same partitions as those mentioned by participants in Experiment 1.

Table 4
Answer frequencies to post-experimental interview (N = 38)

Answer category               f
First group of letters        23
Last group of letters         24
Group of two letters          15
Group of three letters        32
Long distance dependencies    1
Recursion                     0

3.3. Discussion

The influence of the number of levels of embedding on the categorisation performance closely follows the pattern displayed in Experiment 1. As in Experiment 1, participants do not spontaneously generalise the recursive principle beyond two levels of embedding. However, when participants had to rate sequences with four or five embeddings, and were told that half these sequences were correct, then categorisation was better again. Seemingly, aspects of the self-embedding principle were recognised in these multiple-application strings. Apparently, participants were able to use information about chunks or symmetrical organisation of pushes and pops, learned on the basis of short strings, to enhance classification performance on long ones. The interview data also resemble the findings of Experiment 1. Participants report having focused on letter chunks in the form of groups of two or three letters, whereby the letter chunks often coincided with pops and pushes.

4. General discussion

To what extent can people acquire a recursive system implicitly? A recursive rule is made of a basic structure and a recursive operation. In the present experiments, in which the recursive rule generated self-embedded sentences, the basic structure (the finite state part of the grammar) is learned to some extent, although the performance for strings with applications of the recursive rule was quite low in both experiments, presumably because of the mean string length. Also, the recursive operation seems to be inducible to some extent from the training materials, but it was recognised only when the pattern was repeated twice, presumably because this makes the pattern emerge more saliently at the surface. Moreover, participants needed a cue to generalise the number of permissible levels of embedding beyond the number they saw at training. Thus, the rule was not spontaneously learned at a deep and abstract level.

With regard to the nature of the learning, little can be concluded about the implicitness of the knowledge used, especially because the rule we tested is quite difficult to describe anyway, and because we used an introspective measure of explicit knowledge (Shanks & St John, 1994). However, it is clear from the comments of the participants that they focus especially on chunks at the front and back locations of the sequences. These locations have also been found to be crucial for categorisation in previous studies (Meulemans & Van der Linden, 1997). Interestingly, this may have contributed to performance: Participants probably have been attentive to the push- and pop-parts of the strings with embeddings. Moreover, this could explain why it was easier to recognise strings with more than one embedding. In these strings the repeated pushes and pops are important cues of the recursive operation. Yet, only one participant mentioned any kind of long distance dependencies, or rule based judgement.
The ungrammatical recursive test sequences contained no illegal bigram or trigram, but all violated a long distance dependency between pushes and pops. Thus, the fact that participants could reject those exemplars to some extent must be explained somehow by means of knowledge of these dependencies. So, it seems that a recursive rule in an artificial grammar can be learned. This learning, however, may be based on a chunking strategy, and not on some deep or abstract knowledge of the recursive rule itself. This explanation


is consistent with the observation that people do not spontaneously generalise the principle beyond what they have seen during training. Clearly, they consider the training set as representative of the underlying system. But if they get a cue that the rule is generalisable, they are able to recognise some characteristics of it in items with multiple applications of the recursive rule. Although recursion is considered to be a highly complex principle, it may be learned for a large part by means of simple heuristics, like pattern recognition and chunking, together with organising these chunks in relation to each other. That is, participants use simple heuristics like chunking and grouping elements, in such a way that it helps to uncover the underlying grammatical structure. The simple chunking heuristic might help to separate the pushes, pops, and embeddings making up the building blocks of the grammar. The results in this study suggest that the generative property of recursivity is not learned in an abstract way. This is consistent with perspectives on natural language learning that emphasise that language performance is restricted to the use of the same number of recursive rule applications as the language learner has been faced with (Christiansen, 1992).

The present observations are but the beginning of a full account of how recursive rules can be learned by induction from exemplars. For instance, we need to scrutinise in more detail the strategies learners use. Moreover, we need to know whether the knowledge they acquire is complete or just sufficient for practical use. And in what conditions does the knowledge remain implicit and how does it become explicit? Future experimentation may focus on these aspects to contribute to our understanding of the possibilities and limits of recursivity learning by induction from exemplars.
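As a purely illustrative sketch of such a chunking heuristic, and not a model proposed or tested in this article, the following Python function peels a known push off the front and a known pop off the back of a string until a known innermost string remains. The chunk lists contain only the examples quoted in the text. The innermost strings, the function name `accepts`, and the optional cap `max_levels` are assumptions; the cap stands in for a learner who treats the number of levels seen during training as the maximum.

```python
# Example chunks quoted in the text; the innermost strings are an assumption
# taken from the bracketed examples in the Materials section.
PUSHES = ("TXXV", "PV", "PTV")
POPS = ("V", "PS", "PXVV")
BASE_STRINGS = ("TSXXVV", "TSSXS", "TXS", "TSXS")


def accepts(string, max_levels=None):
    """Return True if the string can be peeled into matched pushes and pops.

    max_levels=None places no limit on the depth of embedding; an integer
    rejects any string that needs more levels than that number.
    """
    if string in BASE_STRINGS:
        return True
    if max_levels == 0:
        return False
    remaining = None if max_levels is None else max_levels - 1
    for push in PUSHES:
        if not string.startswith(push):
            continue
        for pop in POPS:
            # A push at the front needs a pop at the back, with something in between.
            if string.endswith(pop) and len(string) > len(push) + len(pop):
                inner = string[len(push):len(string) - len(pop)]
                if accepts(inner, remaining):
                    return True
    return False


print(accepts("TXXVPVTSSXSVV"))                  # two levels, both pops present
print(accepts("TXXVPVTSSXSV"))                   # outermost pop omitted
print(accepts("PVPTVPVTSXSVVV", max_levels=2))   # three levels, cap of two
print(accepts("PVPTVPVTSXSVVV"))                 # three levels, no cap
```

Under these assumptions the four calls print True, False, False and True. The second call shows that a missing pop is detected by the front-and-back pairing alone, and the last two show how lifting the cap, loosely analogous to the cue given in Task 2, lets the same chunk knowledge extend to deeper embeddings.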

Acknowledgements

I am indebted to Emmanuel Pothos, Rebecca Gomez, Martin Redington and two anonymous reviewers for helpful comments on earlier versions of this article. I also thank Remco Stafleu and Noortje Janssen for their help in carrying out Experiments 1 and 2. This research was supported by the Royal Netherlands Academy of Arts and Sciences.

References

Berry, C., & Dienes, Z. (1993). Implicit learning. Hove: Lawrence Erlbaum Associates.
Christiansen, M. H. (1992). The (non)necessity of recursion in natural language processing. In Proceedings of the 14th annual conference of the Cognitive Science Society (pp. 665–670). Indiana: Indiana University.
Cleeremans, A., & McClelland, J. L. (1991). Learning the structure of event sequences. Journal of Experimental Psychology: General, 120, 235–253.
Dienes, Z., Broadbent, D. E., & Berry, C. (1991). Implicit and explicit knowledge bases in artificial grammar learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 875–887.
Dulany, D. E., Carlson, R., & Dewey, G. (1984). A case of syntactical learning and judgement: How concrete and how abstract? Journal of Experimental Psychology: General, 113, 541–555.
Gomez, R. L., & Schvaneveldt, R. W. (1994). What is learned from artificial grammars? Transfer tests of simple association. Journal of Experimental Psychology: Learning, Memory and Cognition, 20, 396–410.
Johnstone, T., & Shanks, D. R. (1999). Two mechanisms in implicit artificial grammar learning? Comment on Meulemans and van der Linden. Journal of Experimental Psychology: Learning, Memory and Cognition, 25, 524–531.
Knowlton, B. J., & Squire, L. R. (1994). The information acquired during artificial grammar learning. Journal of Experimental Psychology: Learning, Memory and Cognition, 20, 79–91.
Lewicki, P. (1986). Nonconscious social information processing. San Diego, CA: Academic Press.
Lewicki, P., Hill, T., & Bizot, E. (1988). Acquisition of procedural knowledge about a pattern of stimuli that cannot be articulated. Cognitive Psychology, 20, 24–37.
Marcus, G. F., Vijayan, S., Rao, S. B., & Vishton, P. M. (1999). Rule learning by seven-month-old infants. Science, 283, 77–79.
Mathews, R. C., & Cochran, B. P. (1998). Project Grammarama revisited. Generativity of implicitly acquired knowledge. In M. A. Stadler, & P. A. French (Eds.), Handbook of implicit learning. London: Sage.
Meulemans, T., & Van der Linden, M. (1997). Associative chunk strength in artificial grammar learning. Journal of Experimental Psychology: Learning, Memory and Cognition, 23(4), 1007–1028.
Millward, R. B., & Reber, A. S. (1972). Probability learning: contingent event sequences with lags. American Journal of Psychology, 85, 81–98.
Nissen, M. J., & Bullemer, P. (1987). Attentional requirements of learning: evidence from performance measures. Cognitive Psychology, 19, 1–32.
Perruchet, P., Gallego, J., & Savy, I. (1990). Synthetic grammar learning: implicit rule abstraction or explicit fragmentary knowledge? Journal of Experimental Psychology: General, 119, 264–275.
Perruchet, P., & Pacteau, C. (1990). Synthetic grammar learning: implicit rule abstraction or explicit fragmentary knowledge? Journal of Experimental Psychology: General, 119, 264–275.
Reber, A. S. (1967). Implicit learning of artificial grammars. Journal of Verbal Learning and Verbal Behavior, 6, 855–863.
Reber, A. S. (1976). Implicit learning of synthetic languages: the role of the instructional set. Journal of Experimental Psychology: Human Learning and Memory, 2, 88–94.
Reber, A. S. (1989). Implicit learning and tacit knowledge. Journal of Experimental Psychology: General, 118, 219–235.
Reber, A. S. (1993). Implicit learning and tacit knowledge. New York: Oxford University Press.
Redington, M., & Chater, N. (1996). Transfer in artificial grammar learning: a reevaluation. Journal of Experimental Psychology: General, 125, 123–138.
Shanks, D. R., & St John, M. F. (1994). Characteristics of dissociable human learning systems. Behavioral and Brain Sciences, 17, 367–447.
