JOURNAL OF MEMORY AND LANGUAGE 28, 37-55 (1989)

Developmental Differences in the Use of Verbatim versus Spatial Representations in the Recall of Spatial Descriptions: A Probabilistic Model and an Experimental Analysis

SERGIO MORRA
University of Padua
Ehrlich and Johnson-Laird (1982, Journal of Verbal Learning and Verbal Behavior, 21, 296-306) showed that adults interpret spatial descriptions by constructing mental models of them. An alternative strategy may be to encode descriptions verbatim and then to try to represent them overtly, one sentence at a time. A probabilistic model of performance is presented for this strategy, together with three experiments (subjects: 32 third-graders, 36 third-graders, and 18 undergraduates, respectively). The results obtained from children (goodness of fit of the probabilistic model, correlations with measures of verbal short-term memory, a rhyme effect enhancing performance) show that children tend to use the strategy based on phonemic coding. On the other hand, Experiment 3 replicates the finding that adults follow the “mental model” strategy. © 1989 Academic Press, Inc.
Experimental tasks have their own fate. They are invented for some specific purpose, but are then often manipulated and modified in order to tackle quite different questions. Such is the case of a psycholinguistic task with spatial descriptions such as: “The spoon is to the left of the glass. The knife is in front of the spoon. The glass is behind the dish” (Ehrlich & Johnson-Laird, 1982; Johnson-Laird, 1980, 1983). Subjects receive such descriptions and are asked to depict diagrams of them. This task was designed to study the integration of sentences in a comprehensive representation: in the case of spatial descriptions, Ehrlich and Johnson-Laird (1982) aimed at distinguishing whether the integrated structure is a network of propositions, as suggested by
Kintsch and Van Dijk (1978), or an analog representation. The experimental manipulations of the order of sentences provided convincing evidence that the integrated representation is an analog one, i.e., a mental model.

Throughout this article, the term “mental models” will be understood according to Johnson-Laird’s (1983) definition, i.e., analog representations the structures of which “are identical to the structures of the states of affairs, whether perceived or conceived, that the models represent” (p. 419). In particular, this article is concerned with spatial mental models, consisting of “a finite set of tokens representing a finite set of physical entities,” and “a finite set of relations between the tokens, representing physical relations between the entities,” where “the only relations between the entities are spatial, and the models represent these relations by locating tokens within a dimensional space” (p. 422). It may be noticed that this definition does not place any restriction on the nature of the “tokens,” which are presumably symbolic and do not need to be quasi-pictorial images.

The author is extremely grateful to Phil Johnson-Laird for his valuable suggestions and for having devoted so much of his time to discussions about these experiments. Thanks are also due to Luigi Ceruso for preparation of the apparatus used in Experiments 1 and 2, to Sandro Bettella for advice on computer programming, and to the journal’s referees. Reprints may be requested from Dr. Sergio Morra, Dipartimento di Psicologia, Universita’ di Padova, P. Capitaniato 3, 35100 Padova, Italy.

0749-596X/89 $3.00 Copyright © 1989 by Academic Press, Inc. All rights of reproduction in any form reserved.

Converging experimental evidence for the existence of spatial mental models was provided by Mani and Johnson-Laird (1982) and theoretical arguments by Johnson-Laird (1980, 1983). Thus, the spatial descriptions task, invented in the field of the propositional/analog controversy, proved successful in providing unambiguous evidence for deciding between two hypotheses.

Subsequently, a rather different question was raised. If the integrated representation is analog, how is it constructed in working memory? What kinds of codes are stored and processed as the linguistic input is transformed into an integrated spatial model? These questions were investigated by Oakhill and Johnson-Laird (1984) with dual task paradigms. “Continuous” descriptions (in which the order of terms in the three sentences has the form AB, BC, CD) and “discontinuous” descriptions (in which the order is like AB, CD, BC) were combined with either visual tracking or digit memory as secondary tasks. Although the results were probably not conclusive, they suggested that the construction of mental models from both continuous and discontinuous descriptions involves both analog spatial coding (more or less to the same extent in the two types of descriptions) and verbal coding (probably with a heavier memory load in discontinuous rather than continuous descriptions).

Furthermore, the studies quoted above consider only adult subjects. What happens in children? Are they able to construct spatial mental models? If the capacity of their working memory is smaller than that of adults (e.g., see Burtis, 1982; Case, 1985; Hitch & Halliday, 1983; Hulme, Thomson, Lawrence, & Muir, 1984; Pascual-Leone & Goodman, 1979), how does this limitation constrain performance in the task considered here? There are no available data answering such questions. Some experimental paradigms (e.g., Huttenlocher & Strauss, 1968; Cox, 1981; Olson & Bialystok, 1983) consider only one-sentence descriptions that
probably load working memory to a lesser extent. In addition, they do not allow us to distinguish between strategies using different codes, such as propositional representations versus mental models. Other experimental paradigms (e.g., Halford, 1984) involve a coordination of relations, but in this case the relations need be neither spatial nor presented in a verbal format.

The questions listed above motivated a pilot experiment with 6-year-old children, in which we modified the original task by Ehrlich and Johnson-Laird (henceforth called Spatial Descriptions Task) in three ways. First, the information load was reduced to two sentences in each description, stating the position of three objects, in order to avoid floor performance. Second, the terms “left” and “right” were replaced by the more familiar “over” and “under.” Third, to make the task more attractive to children, the required performance was not the depiction of a diagram, but the placement of real objects on an apparatus designed for this series of experiments.

As rarely happens, the results of a pilot experiment were extremely clear, and also surprising. They could easily be interpreted as evidence for verbatim storage of sentences, which are not integrated in a holistic mental representation but rather represented overtly one at a time. The first two experiments reported in this article, both on 8-year-olds, supported this finding and provided more details about the strategy of verbatim encoding and the conditions of its implementation. A third experiment, with adult subjects, replicated one of Ehrlich and Johnson-Laird’s experiments with some modifications and ruled out the hypothesis that adults also follow the strategy of verbatim encoding, while confirming their reliance on the construction of analog representations.
Some theoretical implications of these findings will be discussed, but it can be anticipated here that the reasons why children should try to encode these descriptions verbatim, rather than constructing a mental
model, are still far from clear. Nevertheless, the aim of this article is simply to demonstrate the existence of the phenomenon.
A MODEL

The strategy hypothesized is different from the two compared by Ehrlich and Johnson-Laird (i.e., those corresponding to propositional networks and mental models). It simply assumes that subjects try to memorize the sentences verbatim and then, as soon as they receive the objects, represent the sentences one at a time. This strategy is embodied in a mathematical model that expresses the probability of success in a problem as a function of four parameters:

$$p(\text{success}) = p(R2)\,p(C)\,\frac{p(C) + p(I)}{2} + p(R1)\,p(C)\,\frac{1}{6} + p(R0)\,\frac{1}{36}$$

In the formula,
p(R2) is the probability of recalling correctly both sentences in a pair;
p(R1) is the probability of recalling correctly only one of them;
p(R0) is the probability of recalling correctly neither of them; of course, p(R0) = 1 - (p(R1) + p(R2));
p(C) is the probability of representing correctly a “congruent” relationship between two objects;
p(I) is the probability of representing correctly an “incongruent” relationship between two objects;
1/6 is the probability of guessing the placement of one of three objects by chance;
1/36 is the probability of guessing the placement of all three objects by chance.

Actually, the three terms of the sum stand respectively for the probability of recalling both sentences and representing them correctly; of recalling only one sentence, representing it correctly and guessing the position of the third object; and of forgetting both sentences while succeeding by chance in the problem. It is assumed that p(C) is the probability of representing correctly the first sentence, as there is still no object in the apparatus and therefore no need to reverse the sentence to take into account the position of already placed objects. But when a subject also tries to represent the second sentence, this may require either a congruent or an incongruent placement, with equal probability: therefore, the probability of being correct in it is (p(C) + p(I))/2. These assumptions are also consistent with the results obtained by Huttenlocher and Strauss (1968) and Huttenlocher and Weiner (1971).

It should be noted that p(C), p(I), p(R2), and p(R1) are not four free parameters estimated a posteriori from the scores in the main task, but are independent measures, obtained for each subject from two other tasks, i.e., the preliminary one-sentence items and the final short-term memory test for sentences. Figure 1 illustrates the constant of 1/6 (which also holds if a subject has already placed one object or both in a boundary position of the apparatus); the other constant of 1/36 is obtained from simple conditional probability computations.

This model enables us to compute a theoretically predicted score in the Spatial Descriptions Task for each subject, on the basis of her/his performance in the preliminary placement and in the final memory tasks.

FIG. 1. Six equiprobable positions for random placement of third object. Asterisks, already placed objects; circles, equiprobable positions for guessing.
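The three terms of the model can be turned into a small computation. The following Python sketch is only an illustration of the verbal description given above (both sentences recalled and represented; one sentence recalled and represented, third object guessed; everything guessed); the function names, the input check, and the example parameter values are assumptions introduced here and are not taken from the article. The default of 16 problems corresponds to the number of items per task in Experiment 1.

```python
def predicted_success(p_r2: float, p_r1: float, p_c: float, p_i: float) -> float:
    """Probability of solving one two-sentence problem under the verbatim strategy.

    p_r2: probability of recalling both sentences
    p_r1: probability of recalling only one sentence
    p_c:  probability of representing a congruent relation correctly
    p_i:  probability of representing an incongruent relation correctly
    """
    if p_r1 + p_r2 > 1.0 + 1e-9:
        raise ValueError("p(R1) + p(R2) cannot exceed 1")
    p_r0 = 1.0 - (p_r1 + p_r2)  # neither sentence recalled

    # Both sentences recalled: the first is placed with p(C); the second needs a
    # congruent or an incongruent placement with equal probability.
    both = p_r2 * p_c * (p_c + p_i) / 2.0
    # One sentence recalled and represented; third object placed by chance (1/6).
    one = p_r1 * p_c * (1.0 / 6.0)
    # Nothing recalled; all three objects placed by chance (1/36).
    none = p_r0 * (1.0 / 36.0)
    return both + one + none


def predicted_score(p_r2: float, p_r1: float, p_c: float, p_i: float,
                    n_problems: int = 16) -> float:
    """Expected number of correct problems for one subject (hypothetical helper)."""
    return n_problems * predicted_success(p_r2, p_r1, p_c, p_i)


if __name__ == "__main__":
    # Invented parameter values for a hypothetical subject, for illustration only.
    print(f"{predicted_success(0.5, 0.3, 0.9, 0.7):.3f}")  # 0.411
    print(f"{predicted_score(0.5, 0.3, 0.9, 0.7):.2f}")    # 6.57
```

With these invented values the three terms are 0.36, 0.045, and about 0.0056, so most of the predicted success comes from trials on which both sentences are recalled; the guessing terms contribute little.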
Expected scores obtained with this formula were fitted to the results of a pilot experiment with 10 six-year-old subjects. The expected and observed means were very similar: expected mean = 2.44, observed mean = 2.70, t(9) = 0.72. In addition, the correlation between the two distributions was quite high: r(8) = .89, p < .001.

EXPERIMENT 1
The strategy and model described above were hypothesized after the results of a pilot study. The main purpose of Experiment 1 was to test the model with an independent and larger sample. In addition, more questions may be asked at this point. Is rote memorizing a strategy used only by very young subjects, or can it also be found in older children, and perhaps even in adults? Is it also possible to find in children evidence of analog spatial coding, comparable to that obtained by Johnson-Laird and coworkers? Could it be that the task is difficult for children because it involves asymmetrical spatial relations (i.e., in understanding a sentence like: “X is behind Y,” which of two noninterchangeable objects goes behind the other must be identified)? What happens if symmetrical spatial relations are used: would this render the task easier; would it affect children’s strategies?

Experiment 1 considered subjects’ accuracy and order of object placement in the main tasks, together with a set of other, possibly related measures. One main task was the Spatial Descriptions Task described above. The other (henceforth called the Symmetrical Relations Task) involved the terms “together with” and “not together with,” which are intrinsically symmetrical.

Method

Subjects. From a pool of 35 third-graders, 3 were excluded because of low PM47 scores (less than 13 items correct). The remaining subjects were 16 boys and 16 girls (age range 8;4-9;4), native Italians, from an urban public school.

Materials. For the main tasks, we used a
four-stage shelf, 31 cm high, with a dark gray plastic back and four transparent layers. Each layer (17.5 cm wide and 32 cm long) was divided into four zones (17.5 × 7.5 cm) by parallel colored stripes. The front-back dimension was made salient by the parallel colored stripes and by the dark back. The up-down dimension was made salient by the vertical arrangement of the layers and of the stripes on them. A large cardboard box contained one exemplar of each of the 24 small objects (e.g., eraser, toy soldier, coin) used in the experiment (each object was used in two Spatial Descriptions and two Symmetrical Relations problems). Each of 16 smaller boxes contained three different objects; each problem was made of two sentences and dealt with a different triplet of objects.

In each main task, there were four problems for each of the following arrangements of terms in the sentences: AB,AC; AB,BC; AB,CA; AB,CB. In the Spatial Descriptions Task, for each arrangement of terms there were two “linear” problems (describing a vertical or horizontal alignment of objects on the apparatus, such as A over B, A under C) and two “orthogonal” problems (describing a right-angle layout, e.g., A over B, A behind C). The terms “over, under, in front of, behind” were used an equal number of times and counterbalanced over the different conditions of the task.

For the Symmetrical Relations Task, the same 16 boxes (and triplets of objects) as those for the main task were used. Each sentence could be affirmative (X together with Y) or negative (X not together with Y). Therefore, four different combinations of relations were possible: both sentences affirmative (AA); the first affirmative and the second negative (AN); vice versa (NA); both negative (NN). The 16 problems were constructed by crossing the four arrangements of terms with the four combinations of relations.

Procedure. In a preliminary session the PM47 test (Raven, 1947) was group-administered in subjects’ classrooms. In
two individual sessions subjects performed the Spatial Descriptions Task and the Symmetrical Relations Task. The 16 problems of each task were presented to subjects in one of four counterbalanced orders. The order of the two sessions was counterbalanced over subjects. At the beginning of the first session, subjects were made familiar with the objects in the large box by being asked to name each of them, and with the shelf apparatus, the two relevant dimensions of which (height, depth) were emphasized. Before performing each task, subjects were shown examples of the relevant relations between pairs of objects. In order to evaluate subjects’ ability to represent a single spatial relation correctly, a preliminary task was given before the Spatial Descriptions Task. This preliminary task was essentially similar to the procedure introduced by Huttenlocher and Strauss (1968). The experimenter selected an object from the large box and put it in one of the central positions in the apparatus, selected another object and handed it to the child, and read a sentence describing a single relation between them. There were eight such items, four of which required a “congruent” placement gesture and four an “incongruent” one; the terms over, under, in front of, behind, were used once in each condition. Congruent and incongruent placement gestures were defined as follows. If the experimenter had placed, for example, a button on the shelf, and read a sentence like “the key behind the button,” the subject took the key and had to place it, congruently, behind. Instead, if the sentence was “the button behind the key,” the subject took the key and had to place it in front, the gesture being incongruent with the word. 
That is, a sort of reversibility was involved, so that the subject of the sentence (the button) could go “behind.” In this preliminary task the subjects received feedback and after the series of eight items (presented in a fixed order) they repeated any item on which they had failed;
however, what was scored was the first performance on each item.

In the Spatial Descriptions Task, sentences were read by the experimenter at the rate of about 1 per 3.5 s, then after about 2 s the objects were spread in front of the children. To encourage analog coding, subjects were asked to try to imagine the objects already placed in the appropriate positions: this instruction was given at the beginning and repeated two or three times during the task. As there were eight types of problems in all (4 arrangements of terms × 2 spatial layouts), one problem per type was presented in items 1-8 and one in items 9-16. Each problem was scored as correct or incorrect. Feedback was kept to a minimum.

After the Spatial Descriptions Task, the two WISC memory span subtests were presented (it may be remembered that Backward Memory Span has been used several times, since Case & Globerson, 1974, as an estimate of children’s working memory capacity; see also Case, 1985; Schofield & Ashman, 1986; Morra & Scopesi, in press). Last, each subject performed a memory task in which pairs of sentences were presented, quite similar to those constituting the items of the main task, with the only difference that subjects were now told not to place objects but to repeat the sentences. The score for each item was 0, 1, or 2 according to the number of sentences exactly repeated.

In another session, the 16 problems of the Symmetrical Relations Task were presented in the same way, after having checked subjects’ understanding of the single relations “together” and “not together.” In a final session, Goodenough’s Draw-A-Man Test was group-administered in the subjects’ classrooms, as prescribed by the Italian manual (Pizzo Russo, 1977).

In the Spatial Descriptions Task and Symmetrical Relations Task, not only were items scored for accuracy, but the order of placement of the objects was also recorded.
It was not always possible to score an item for the order of placement: e.g., the children sometimes placed the objects in one way and then moved some of them to correct themselves. About 5% of the trials remained unscored for order of placement. Also considering that two objects were sometimes grasped at one time, 10 different temporal patterns could be identified. In order to avoid excessive complications in scoring temporal order, only two scores were used:
(a) Same-order score: the number of problems in which the three objects were placed with three distinct gestures, in the same order as they were mentioned in the sentences read by the experimenter.
(b) Subtractive score: the number of problems in which the object belonging only to the first sentence was placed before the object belonging only to the second sentence, minus the number of problems in which they were placed in the opposite temporal order. (The reader may notice that this difference resembles that scored by Johnson-Laird & Bara, 1984, between the forms AC and CA that subjects give to their spontaneous conclusions of syllogisms.)

Design. The following analyses were performed. First, the goodness of fit of the model was tested. The accuracy scores in the Spatial Descriptions Task were also submitted to analysis of variance with a 4 (orders of terms) × 2 (spatial layouts) between-subjects design. The accuracy scores in this task were also correlated with the other available measures. Analyses of variance and t tests were performed on the order of placement scores, in order to detect any “figural bias” (see Johnson-Laird, 1983; Johnson-Laird & Bara, 1984) in the Spatial Descriptions Task. Analyses of variance were run on the accuracy scores in the Symmetrical Relations Task. In order to avoid 0-1 scores in the interaction cells, two separate one-way analyses of variance were run, with order of terms and combinations of relations as between-subjects factors. The accuracy
scores in this task were also correlated with the other available measures. Analyses of variance and t tests were also run on the order of placement scores in the Symmetrical Relations Task.

Results and Discussion
The expected and observed mean scores in the Spatial Descriptions Task were 8.35 and 7.56, respectively; the difference was not significant: t(31) = 1.47, p > .14, SEd = 0.53. The ratio of the observed (10.87) to the expected variance (16.02) was not significant: χ²(31) = 21.72, p > .21. The expected and observed distributions of scores were highly correlated, r(30) = .685, p < .0001. Thus, the fit of the model was very good: besides predicting the mean and variance of the distribution accurately, it accounted for a greater portion of variance than the best empirical predictor, which was memory for sentences (47% vs. 34% of variance explained).

The analysis of variance of the accuracy scores in the Spatial Descriptions Task yielded significant effects for the spatial layout of objects, F(1,31) = 10.46, p < .003, MSE = 0.36, order of terms, F(3,93) = 3.22, p < .025, MSE = 0.47, and interaction, F(3,93) = 2.86, p < .04, MSE = 0.38. The mean of correct problems in each condition is reported in Table 1.

TABLE 1
MEAN SCORES PER CONDITION IN SPATIAL DESCRIPTIONS TASK (EXPERIMENT 1)

                    Arrangement of terms
Spatial layout   AB,AC   AB,BC   AB,CA   AB,CB   Total
Linear           0.91    1.16    0.97    1.25    4.28
Orthogonal       0.84    0.53    0.78    1.12    3.28
Total            1.75    1.69    1.75    2.37

The effect of spatial layout may be compatible with a strategy involving mental models (as the construction of a two-dimensional model may be more difficult than that of a one-dimensional model), but it may also be compatible with one based on verbatim encoding (since performance would be correct, with linear problems, even if subjects forgot the last word of the second sentence).

The effect of the order of terms was due to AB,CB problems being easier than the other orderings. Paired t tests show that they were easier than AB,BC: t(31) = 2.92, p < .01; easier than AB,AC: t(31) = 2.69, p < .02; and easier than AB,CA: t(31) = 2.43, p < .05. These three orderings did not show any significant difference from one another. This effect was unpredicted, but it does not seem that it can be reconciled with a strategy based on mental models. However, it can easily be explained by a strategy based on verbatim encoding, as a facilitation due to the rhyme of the two sentences in the form AB,CB.

Last, the interaction effect might be due to violations of the normality assumption. However, if it is reliable, it shows that the facilitation of the linear layout was greater with the AB,BC order of terms. This order may facilitate the construction of a linear model, or it may cue the recall of the second sentence (since the relational term is the same in the two sentences and the term B appears consecutively). In sum, the effects of spatial layout and interaction are of little interest for our hypotheses, while the effect of the order of terms seems to be compatible only with the verbatim encoding hypothesis.

The relevant correlations are reported in Table 2. For the order of placement, the more sensitive “subtractive score” is considered. The pattern gives further support to the verbatim encoding strategy. Due to the nonsignificant correlations with Backward Digit Span, Raven’s PM47, and Draw-A-Man, the results seem to rule out the hypothesis that subjects perform the main tasks with a strategy (such as constructing a mental model) that heavily taxes the central components of working memory and uses an analog spatial code. Rather, they point to an involvement of verbal short-term memory, i.e., of the mechanism
called articulatory loop by Baddeley (1981, 1986), who regards it as a peripheral component of working memory.

A 4 (arrangements of terms) × 2 (spatial layouts) analysis of variance of the “same order” scores in the Spatial Descriptions Task yielded nonsignificant effects for spatial layout, F(1,31) = 1.49, p > .22, MSE = 0.42; arrangement of terms, F(3,93) = 0.75, p > .47, MSE = 0.34; and interaction, F(3,93) = 1.14, p > .33, MSE = 0.48. Attempting a more powerful test of the hypothesis, a one-way analysis of variance of the “subtractive” scores was also run, with four levels of the arrangement of terms; it

TABLE 2
CORRELATIONS OF ACCURACY SCORES IN SPATIAL DESCRIPTIONS TASK AND SYMMETRICAL RELATIONS TASK WITH OTHER MEASURES USED AS PREDICTORS (EXPERIMENT 1)
Predictor: Preliminary items (congruent); Preliminary items (incongruent); Forward Digit Span; Backward Digit Span; Memory for Sentences; Draw-A-Man; Raven’s PM47; Order of placement (Spatial Descriptions); Order of placement (Symmetrical Relations); Accuracy (Spatial Descriptions); Accuracy (Symmetrical Relations)
Note. df = 30 in all cases. * p < .05, one-tailed. ** p < .01, one-tailed. *** p < .001, one-tailed.
Task: Spatial descriptions; Symmetrical relations
.35*
.25
.49** .49**
.37* .42** .23 .52**
.14 .58** .14 .05 .23
.08
.46 .13 .05
-.10 .50**
.50**
yielded a nonsignificant result, F(3,93) = 1.42, p > .23, MSE = 1.69. However, there was a highly significant trend for positive “subtractive” scores within each arrangement of terms. The means were 2.68 for AB,AC; 2.84 for AB,BC; 2.21 for AB,CA; 2.71 for AB,CB (the possible range being -4 to +4). Each of them was significantly greater than 0 (t = 9.08, t = 10.51, t = 7.31, t = 10.50, respectively; always with df = 31 and p < .001). The mean of the “same order” scores (9.41 out of 16) was also very high.

The strong tendency to place the objects in the same temporal order as they are named by the experimenter may be due either to a strategy based on verbatim encoding of the sentences, or to the “first-in-first-out” principle in the construction of mental models. However, the absence of any “figural effect” of the different arrangements of terms seems to rule out the “first-in-first-out” possibility. Thus, the data on order of placement also tend to support the hypothesis of verbatim encoding.

In the Symmetrical Relations Task, the order of terms had a significant effect on accuracy scores, F(3,93) = 3.05, p < .04, MSE = 0.30. Actually, there may be a ceiling effect (this task is rather easy, the proportion of success being .88). However, if the effect is reliable, it shows that AB,CA problems are slightly more difficult than the others. The combinations of affirmative and negative sentences had a significant effect, F(3,93) = 5.23, p < .003, MSE = 0.49. The means per condition were AA = 3.84, AN = 3.66, NA = 3.25, NN = 3.31. Given the size and high significance of the effect, it is unlikely to be merely a ceiling effect; rather, it shows that negations, especially in the first sentence, impair performance. Paired t tests showed that AA problems were easier than both NA problems, t(31) = 4.72, p < .001, and NN problems, t(31) = 3.06, p < .005; and that AN problems were marginally easier than both NA and NN problems (p < .03 and p < .05 one-tailed).
A one-way analysis of variance of the “subtractive scores” of the order of placement with respect to the arrangement of terms in the sentences gave a nonsignificant result, F(3,93) = 1.63, p > .18, MSE = 2.08, for the Symmetrical Relations Task, too. As for the other main task, these scores were significantly greater than 0 for each arrangement of terms. The means were 1.53 for AB,AC; 2.25 for AB,BC; 2.06 for AB,CA; and 1.72 for AB,CB. The t values were 4.82, 7.02, 7.56, and 4.47, respectively (all with df = 31); each of them is significant, p < .001.

However, a one-way analysis of variance of the same scores with respect to the combinations of affirmative and negative sentences yielded highly significant results, F(3,93) = 21.17, p < .001, MSE = 2.76. The means per condition were: AA = 2.50, AN = 3.37, NA = 0.25, NN = 1.43. All of the paired t comparisons among conditions were significant, showing a clear pattern NA < NN < AA < AN, and NA was the one condition that did not significantly differ from 0: t(31) = 0.61, p > .54, SEd = 0.41.

In conclusion, it may be said that the Symmetrical Relations Task was much easier than the Spatial Descriptions Task (proportion of success = .88 vs. .49 in the same sample). Although these data are not directly comparable because of the different chance probability of success, the difference is striking, and the results of the latter analysis of variance are an important clue to the reasons for it. Not only was there a bias to place the first-mentioned objects first, there was also a strong bias to place the objects that go together first. A convenient strategy that allows working memory space to be saved is to encode not all of the sentences, but only the affirmative ones, or even only the names of the pairs of objects that go together. Clearly, this strategy may be characterized as propositional. When it is not followed, workspace may be wasted when the irrelevant information conveyed by negative sentences is encoded, and this may be the reason why negative sentences (especially a negative first sentence) impair performance, as shown above. The strategy described here seems to be simple enough; some subjects were even aware of it and able to describe it in an informal postexperimental interview.

EXPERIMENT 2
In the previous experiment, the results obtained on the Spatial Descriptions Task supported the hypothesized strategy of “verbatim memorizing” and also the related model. Furthermore, a somewhat different pattern of results was obtained from a similar task, which could be solved by means of a simpler strategy. However, one of the results discussed as supporting the hypothesized strategy was actually unpredicted, and the suggested explanation of a “rhyme effect” was clearly ad hoc. The aim of Experiment 2 was to check this interpretation more directly and compare it with other possible explanations. For instance, it could be hypothesized (see Huttenlocher & Weiner, 1971) that AB,CB problems may be easier because the object mentioned in both sentences takes on the same grammatical role in each of them.

A further reason for running this experiment was that it can be argued that, although in the previous experiment subjects did encode and store the sentences separately, perhaps they did so semantically rather than phonemically. This hypothesis seemed unlikely, since the high correlations with the Forward Digit Span suggest that the encoding was actually phonemic, but it was not absurd, and it seemed worthwhile to test it. A rhyme effect could only be explained by phonemic coding: therefore, its demonstration would provide clearer evidence of a verbatim memory strategy involving phonemic coding than correlations with Forward Digit Span.

If the facilitation of the AB,CB problems was actually due to a rhyme effect, then there should be at least two ways of destroying it. For example, let us consider a problem like: “the acorn behind the penny, the card under the penny.” One way of destroying the rhyme is the use of synonyms: a penny may also be called a “coin” and the problem formulated as “the acorn behind the penny, the card under the coin.” Another way is to shift the two terms in the first sentence, i.e., “the penny in front of the acorn, the card under the penny,” thus turning it into an AB,CA item. (A third way would be to shift the terms in the second sentence, turning the problem into an AB,BC; but this is disregarded here, as AB,BC problems showed a specific difference between linear and orthogonal problems in Experiment 1.)

Experiment 2 had a 2 × 2 design: the factors were the order of terms (only AB,CB and AB,CA were considered) and the use of one term or two synonyms for the object that was common to both sentences. If the “rhyme effect” is real, then only the AB,CB problems in which the same term is used twice for the object B will be performed better. If, on the contrary, the “grammatical role” explanation holds, then AB,CB problems will prove easier than AB,CA problems independently of the manipulations of the other experimental factor.

Method

Subjects. Subjects were 36 normal Italian third-graders (mean age about 8;3; 18 males and 18 females) from an urban public school not involved in the previous experiment.

Materials. Essentially the same materials as those in Experiment 1 were used, with the following differences. Only the Spatial Descriptions Task was considered, and for simplicity of design only orthogonal layouts were described. For each of the 16 triplets of objects, four different descriptions of the same layout were prepared, in the forms AB,CA; AB,CB; AB,CA’; AB,CB’ (where the prime indicates the use of a synonym). For example:
AB,CB: La GHIANDA dietro alla MONETA, Il CARTONCINO sotto alla MONETA. (The acorn behind the coin, The card under the coin.)
AB,CB’: La GHIANDA dietro alla MONETA, Il CARTONCINO sotto al SOLDO. (The acorn behind the coin, The card under the penny.)
AB,CA: La MONETA davanti alla GHIANDA, Il CARTONCINO sotto alla MONETA. (The coin in front of the acorn, The card under the coin.)
AB,CA’: La MONETA davanti alla GHIANDA, Il CARTONCINO sotto al SOLDO. (The coin in front of the acorn, The card under the penny.)
Procedure. Preliminary instructions were given as in Experiment 1. The usual eight preliminary one-sentence items were presented as a part of the preliminary training and warm-up. However, since no “incongruent” placement was demanded by the problems used in this task, in these preliminary items both objects were directly handed to the subject, who in case of error received feedback and was allowed to repeat the item as in Experiment 1. In the main task, each subject received four problems in each form, the assignment of descriptions (i.e., triplets of objects) to forms being counterbalanced over subjects. The order of presentation of the descriptions was fixed, implying a counterbalanced order of presentation of the four
forms of problems. Scoring was one point for each description correctly represented.
Design and hypotheses. The design was 2 x 2: factors were order of terms and use of synonyms. The hypothesis of phonemic coding of sentences had no implication for the main effects of the factors, but implied the prediction of a significant interaction. There was no special reason to expect any difference in performance if the object named in both sentences was mentioned first or second in the first sentence (i.e., was object A or B). The use of synonyms could also produce any outcome: e.g., they could impair performance, if subjects were obliged to devote workspace to the processing of synonymy; alternatively, the use of the same term could impair performance if it produced phonemic confusability between the two sentences (see, e.g., Baddeley, Lewis, & Vallar, 1984). Thus, no specific prediction was made for the main effects. The predicted interaction was formulated as follows: when the same term is used in both sentences, AB,CB problems (with a rhyme) will be easier than AB,CA problems (that may even suffer from phonemic confusability). However, when two synonyms are used, there will be no difference between AB,CB’ and AB,CA’ problems, in which there is no rhyme and no possible phonemic confusion. Instead, if grammatical role is the correct explanation, then the outcome will be a main effect of order of terms, with AB,CB easier than AB,CA, and AB,CB’ easier than AB,CA’ problems, with no significant interaction between the two factors. (This hypothesis, too, has no implication about the main effect of the use of synonyms.)
Results and Discussion
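The two accounts differ in what they predict for the four cell means: the rhyme account expects an AB,CB advantage only when the same word is repeated (an interaction), whereas the grammatical-role account expects a similar AB,CB advantage in both naming conditions (a main effect of order). As a rough sketch, not part of the original analysis, these contrasts can be computed from the cell means reported in Table 3; the function and variable names are illustrative assumptions.

```python
# Cell means from Table 3 (mean correct out of four problems per form).
TABLE_3_MEANS = {
    ("same_word", "AB,CB"): 1.72,
    ("same_word", "AB,CA"): 1.25,
    ("synonyms", "AB,CB"): 1.50,
    ("synonyms", "AB,CA"): 1.61,
}


def order_effect(means, naming):
    """AB,CB advantage over AB,CA within one naming condition."""
    return means[(naming, "AB,CB")] - means[(naming, "AB,CA")]


def interaction_contrast(means):
    """Difference between the two order effects (rhyme account: positive)."""
    return order_effect(means, "same_word") - order_effect(means, "synonyms")


def order_main_effect(means):
    """Average order effect (grammatical-role account: positive, no interaction)."""
    return (order_effect(means, "same_word") + order_effect(means, "synonyms")) / 2


if __name__ == "__main__":
    for naming in ("same_word", "synonyms"):
        print(naming, round(order_effect(TABLE_3_MEANS, naming), 2))
    print("interaction contrast:", round(interaction_contrast(TABLE_3_MEANS), 2))
    print("order main effect:", round(order_main_effect(TABLE_3_MEANS), 2))
```

The contrasts only describe the pattern of means; whether they are reliable is decided by the analysis of variance and t tests reported in the text.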
The results (mean scores out of four problems per form) are summarized in Table 3. An analysis of variance yielded nonsignificant main effects for order of terms, F(1,35) = 2.40, p > .12, MSE = 0.49, and second name of the repeated object, F(1,35) = 0.24, p > .63, MSE = 0.72. However, the interaction was significant, F(1,35) = 4.09, p < .05, MSE = 0.75, revealing a cross-over pattern (see Table 3). As predicted, performance was better with AB,CB than with AB,CA problems, t(35) = 2.62, p < .01 one-tailed, while there was no significant difference, and if anything a trend in the opposite direction, between AB,CB’ and AB,CA’ problems, t(35) = 0.58, p > .56. Thus, the data were in agreement with the hypothesized rhyme effect. The levels of significance were not remarkably high; however, the one-tailed p < .01 of the difference between AB,CB and AB,CA replicated the result of Experiment 1 (i.e., t(31) = 2.43, p < .02 one-tailed) fairly well. It may be concluded that, besides the good fit of the probabilistic model, the high correlations with Memory for Sentences and Forward Digit Span and the lack of “figural effect” in the order of placement, shown in Experiment 1, another source of evidence was obtained for a strategy based on phonemic encoding of sentences.

TABLE 3
MEAN SCORES PER CONDITION IN EXPERIMENT 2
                               Arrangement of terms
Terms for repeated object      AB,CB      AB,CA
Same word                      1.72       1.25
Synonyms                       1.50       1.61

EXPERIMENT 3
At this point, it may be asked whether the strategy described above is followed only by children or if it is used also by adults. Since Ehrlich and Johnson-Laird (1982) only considered the two hypotheses of integrated mental models versus integrated propositional networks, disregarding the possibility of separate verbatim storage of sentences, Experiment 3 was designed to test this and was a replication with modifications of Ehrlich and Johnson-Laird’s (1982) Experiment 3.
The three types of descriptions used in the original study were called continuous (e.g., AB,BC,CD), discontinuous (e.g., AB,CD,BC), and semicontinuous (e.g., BC,AB,CD). The latter type was a crucial condition; it was called “semicontinuous” because it is continuous from the point of view of constructing a mental model (subjects would add one token to the model as they understand the second and third sentences), but discontinuous from the point of view of constructing a propositional network (since there is no link between the second and third sentences). The measures recorded in the original study were accuracy and reading time (sentences were displayed one at a time on a CRT at a self-paced rate). Results showed that discontinuous problems were more difficult than the other two types, which did not differ significantly from each other. Both speed and accuracy measures showed the same pattern. The order of terms within each sentence was also manipulated, but yielded less interesting results.
Several variables, irrelevant to the comparison of the two hypotheses considered there, were left uncontrolled in the original study. These included the number of syllables in nouns and in sentences, counterbalancing of frequencies and imagery values of nouns across problem types, and the wording of the third sentence in each problem. Since these variables are not irrelevant for the hypothesis of verbatim coding, they were controlled in this experiment. Furthermore, preliminary items of one sentence and final memory tasks were added, as in Experiments 1 and 2. Last, in order to have a first comparison with children’s performance, two-sentence problems, similar to those given to children, were also included in this experiment, besides the three-sentence continuous, semicontinuous, and discontinuous problems. The two-sentence items will henceforth be called brief descriptions.
The basic aim of this experiment was to compare predictions drawn from two hypothesized strategies,
i.e., construction of mental models versus phonemic encoding of sentences.
Method
Subjects. Subjects were 18 undergraduates (14 females, 4 males) attending the second year of courses in psychology. They received additional credit for participating in this experiment and in a successive seminar on it.
Materials. A CRT screen connected to a microcomputer was used for displaying all stimuli. Twenty-four groups of four nouns each, from the same semantic categories as those in Ehrlich and Johnson-Laird’s (1982) study, were used to construct continuous, semicontinuous, and discontinuous descriptions, so that the final sentence had the same wording in all cases. Here is a translated example of the three types of problems thus constructed:
Continuous: The sheep in front of the cat (AB). The mouse to the right of the cat (CB). The dog in front of the mouse (DC).
Semicontinuous: The dog to the right of the cat (AB). The sheep behind the cat (CB). The dog in front of the mouse (AD).
Discontinuous: The dog to the right of the sheep (AB). The mouse at the right of the cat (CD). The dog in front of the mouse (AC).
The assignment of each group of nouns to types of descriptions was counterbalanced across subjects. The descriptions of each type presented to subjects had the same mean number of syllables. The same manipulations of spatial layout and order of terms in sentences as those in the original study were contained within the 24 problems presented to subjects.
Brief descriptions were constructed with three items from each of the following semantic categories (never used in the other types of descriptions): insects, cars, jobs, professions, nationalities, bathroom objects, reptiles, housecleaning equipment. The same four arrangements of terms and two spatial layouts as those in Experiment 1 were used; however, the terms left-right were used here instead of over-under. As the hypotheses were less complex for brief descriptions, fewer counterbalancings were needed, and only one set of such descriptions was constructed.
Twelve preliminary one-sentence items were constructed with different semantic categories: four of them were merely practice items with no object already represented in the diagram, while in eight cases an object was already present: four of them required “congruent” and four “incongruent” placement of the remaining object.
Six practice descriptions with 3, 5, 4, 3, 4, 5 objects were also prepared, with semantic categories not used for the experimental items.
The items for the final task of Memory for Sentences were 8 three-sentence and 4 two-sentence problems, very similar to those used in the main task.
Last, computer-controlled visually presented versions of the digit span were prepared, both forward and backward, with 14 lists from the WAIS and six from the WISC in each condition, arranged in an irregular, not predictable (but generally increasing) order of length.
Procedure. Each subject was instructed to read and understand each sentence, then press the keyboard space-bar in order to make that sentence disappear and another one appear, and to show the whole layout on paper when the word “depict” appeared in place of a sentence. Subjects were instructed to represent each description with
a diagram, in which the names of the objects were written in the appropriate positions. The instructions strongly emphasized accuracy, but subjects were informed that the study time for each sentence was also recorded. Each subject received the 12 preliminary one-sentence items and the six practice items in a fixed order; then the 32 test items (eight for each type of description) were presented in a random, computer-generated order. No sentence remained on the screen for more than 32 s; after this time it would disappear with a beep and be replaced by the next sentence. This actually happened a few times. While a subject made a diagram, the experimenter recorded the temporal order of placement of the objects. After the main task, subjects were questioned about the way they had performed it. Then, the 20 Forward Digit Span and the 20 Backward Digit Span series were presented in a fixed order (1-s presentation and 200-ms ISI for each digit, oral response). Last, the 12 groups of sentences were also presented on the screen in a fixed order (4-s presentation and 400-ms ISI for each sentence) for immediate oral recall of each group of sentences.
Each preliminary item and each problem of the Spatial Descriptions Task were scored as correct or incorrect. The score for order of placement was 1 point for each problem in which the names were written in the same temporal order as mentioned. The Forward and Backward Digit Spans were scored 1 point for each series reported in the exact (or exactly reverse) order. Each item of the Memory for Sentences task was scored according to the number of sentences exactly reported, so that the maximum score could be 2 or 3, for two- and three-sentence items, respectively. For each subject, the following parameters were computed:
-probability of correct congruent preliminary items,
-probability of correct incongruent preliminary items,
-probability of recalling 2, 1, 0 sentences in two-sentence items of the final memory task,
-probability of recalling 3, 2, 1, 0 sentences in three-sentence items of the final memory task.
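The per-subject parameters listed above can be sketched as relative frequencies. The text does not say how they were estimated, so treating each as a simple proportion is an assumption, and the data layout, function names, and example scores below are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def proportion_correct(outcomes):
    """Proportion of correct items in a list of True/False outcomes."""
    return Fraction(sum(outcomes), len(outcomes))


def recall_distribution(recalled_counts, n_sentences):
    """Relative frequency of recalling n_sentences, ..., 1, 0 sentences exactly."""
    counts = Counter(recalled_counts)
    total = len(recalled_counts)
    return {k: Fraction(counts.get(k, 0), total)
            for k in range(n_sentences, -1, -1)}


# Hypothetical scores for one subject (not data from the study):
# four congruent and four incongruent preliminary items,
# four two-sentence and eight three-sentence memory items.
congruent = [True, True, True, False]
incongruent = [True, False, True, False]
two_sentence_recall = [2, 1, 2, 0]
three_sentence_recall = [3, 2, 2, 1, 0, 3, 2, 1]

p_congruent = proportion_correct(congruent)
p_incongruent = proportion_correct(incongruent)
p_recall_two = recall_distribution(two_sentence_recall, 2)
p_recall_three = recall_distribution(three_sentence_recall, 3)
```

Exact fractions are used so that each recall distribution sums to 1 without rounding error.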
Design and hypotheses. In order to compare the predictions derived from the two hypothesized strategies (construction of mental models, phonemic encoding of sentences), the following hypotheses were tested. If the verbatim strategy is used consistently, the same probabilistic model that fitted the data from the pilot study and Experiment 1 should also fit the data for the brief descriptions in this experiment. That model may also be generalized to continuous and semicontinuous descriptions, with the formula reported in Table 4, derived using the same criteria as the previous one; explanations are omitted here for reasons of space. A similar formula, however, cannot be derived for the discontinuous problems, since more assumptions would be needed (e.g., a shift in the order of sentences during overt representation, which in turn may impair performance to an unknown extent).

TABLE 4
MODIFIED FORMULA OF VERBATIM-STRATEGY MODEL FOR CONTINUOUS AND SEMICONTINUOUS PROBLEMS
\[
p(R3)\,p(C)\left(\frac{p(C)+p(I)}{2}\right)^{2} + \frac{3}{8}\,p(R2)\,p(C)\,\frac{p(C)+p(I)}{2} + \frac{5}{64}\,p(R1)\,p(C) + \frac{5}{512}\,p(R0)
\]
where p(R3) = probability of recalling correctly all sentences in a triplet; p(R2) = probability of recalling correctly two of them; p(R1) = probability of recalling correctly one of them; p(R0) = 1 - (p(R3) + p(R2) + p(R1)) = probability of recalling correctly none; p(C) = probability of representing correctly a congruent relationship between two objects; p(I) = probability of representing correctly an incongruent relationship; 3/8 = probability of guessing the position of one of the four objects by chance; 5/64 = probability of guessing the position of two of the objects by chance; 5/512 = probability of guessing the position of all four by chance.

If the strategy of verbatim encoding is consistently used, the same pattern of correlations as that in Experiment 1 should also appear; i.e., Memory for Sentences and Forward Digit Span should have the highest correlations with performance, both in the brief descriptions and in the three-sentence descriptions.
In contrast, if the mental model strategy is used consistently, the pattern of accuracy results obtained by Ehrlich and Johnson-Laird (1982) should be replicated. These authors obtained no significant difference between continuous and semicontinuous problems, and found both of them easier than discontinuous problems. In addition, the pattern of their reading time results should also be replicated. They found no significant increase in reading time for the third sentence in continuous and semicontinuous problems, but did find a significant increase in discontinuous problems.
Specific hypotheses may also be tested for order of placement of objects. If they are placed in random order, the probability of placing them in the same temporal order in which they are mentioned in the sentences is 1/6 for brief descriptions and 1/24 for the other types; therefore, the expected same order mean scores would be 1.33 for brief problems and 0.33 for the other types. If subjects follow a verbatim memory strategy (or if they construct mental models and follow a first-in-first-out principle at the point of depicting them) the scores should be quite high, and roughly independent of the types of problems. As the proportion of “same order” placements was .59 in Experiment 1 and .53 in Experiment 2, a mean score of at least four could be expected for brief descriptions.
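As an illustration of the quantities in this section, the sketch below evaluates the Table 4 formula, as read from the table's legend, and the chance-level same order scores (8 problems per type; 3! = 6 possible placement orders for three objects, 4! = 24 for four). The function names are hypothetical, and the formula returns a predicted proportion correct per problem, which is an interpretation rather than something the text states explicitly.

```python
from fractions import Fraction
from math import factorial


def table4_expected_proportion(p_r3, p_r2, p_r1, p_c, p_i):
    """Predicted proportion correct for continuous/semicontinuous problems.

    p_r3, p_r2, p_r1: probabilities of recalling 3, 2, 1 sentences of a triplet.
    p_c, p_i: probabilities of correctly representing a congruent and an
    incongruent relationship.
    """
    p_r0 = 1 - (p_r3 + p_r2 + p_r1)   # probability of recalling none
    mixed = (p_c + p_i) / 2           # average of congruent and incongruent
    return (p_r3 * p_c * mixed ** 2
            + Fraction(3, 8) * p_r2 * p_c * mixed
            + Fraction(5, 64) * p_r1 * p_c
            + Fraction(5, 512) * p_r0)


def chance_same_order_score(n_objects, n_problems=8):
    """Expected same order score if objects are placed in random order."""
    return Fraction(n_problems, factorial(n_objects))


if __name__ == "__main__":
    print(float(chance_same_order_score(3)))  # brief descriptions
    print(float(chance_same_order_score(4)))  # three-sentence descriptions
```

With perfect recall and perfect placement the formula gives 1, and with no recall at all it reduces to the pure guessing term 5/512, which is a quick sanity check on the reconstruction.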
It may also be noticed that .59 and .53 are probably underestimations, as in about 10% of the trials the two objects named in the first sentence happened to be placed together with one gesture, before object C. However, if subjects construct mental models and scan them according to some spatial principle (e.g., left-right or far-near), the same order scores should vary in a way constrained by the probability that the spatial order in correct solutions corresponds to the order of mentioning the objects in any type of problem. These probabilities are 1/2 for brief and continuous descriptions, 1/4 for discontinuous descriptions, and 1/8 for semicontinuous descriptions. Of course, there are variables that cannot be controlled, e.g., the position of the starting-point of the path chosen by subjects for each correct solution and, most important, subjects’ errors (the above probabilities were computed on correct solutions, but there are errors that destroy the temporal-spatial correspondence and errors that do not; and it is difficult to control the number and type of errors made by each subject). However, if not the expected means of 4 for brief and continuous, 2 for discontinuous, and 1 for semicontinuous problems, at least the predicted pattern of inequalities should be observed in the data.
Results and Discussion
The verbatim-memory strategy model underestimated performance in brief problems (expected mean = 4.88, observed mean = 6.56), t(17) = 5.55, p < .0001. Also, the generalized model underestimated performance in the continuous and semicontinuous problems (expected mean = 3.05, observed mean = 8.94), t(17) = 7.34, p < .0001. The variance of subjects’ performance was also largely underestimated by the verbatim-memory strategy model: for brief descriptions, χ²(17) = 52.73, p < .001; for continuous and semicontinuous descriptions, χ²(17) = 98.49, p < .001. This seems enough to discard the hypothesis that verbatim encoding of sentences is a basic component of the strategy of adult subjects.
One remarkable aspect of the data is that adults’ performance with brief descriptions was much better than children’s. However, the performance predicted by this model for adults was only slightly better than that predicted for children. The reason is that the adults performed the final test of Memory for Sentences only slightly better than children did. This seems in agreement with other findings reported in the literature, showing that even for adults remembering two sentences verbatim is not an easy task and that remembering three is very taxing (e.g., see Glanzer, Dorfman, & Kaplan, 1981). Therefore, it may be presumed that, if adults were following the same “verbatim” strategy as children, their performance should not be very high, and that the higher performance actually observed probably indicates that they were following a more efficient strategy (e.g., the construction of mental models). Nevertheless, the correlations between observed and expected scores were significant: for brief descriptions, r(16) = .61, p < .004; for continuous and semicontinuous ones, r(16) = .49, p < .025. This finding is discussed below.
It also seems that the subjects of our experiment set themselves at a different speed/accuracy balance than Ehrlich and Johnson-Laird’s subjects. We did not actually replicate the differences in accuracy, but we found speed differences that were even larger than those in the original study. The mean scores (out of eight) for the four types of descriptions were: brief = 6.56, continuous = 4.67, discontinuous = 4.39, semicontinuous = 4.28. They were significantly different (but this finding is trivial) when all types of descriptions were considered: F(3,51) = 12.41, p < .001, MSE = 1.66. However, when only continuous, discontinuous, and semicontinuous problems were considered, the result was not significant: F(2,34) = 0.47, p > .63, MSE = 1.55.
It is quite possible that this failure to replicate the lower score for discontinuous problems was due to the much longer times spent by our subjects on this type of description. The mean times for each sentence are shown in Table 5. A two-way analysis of variance was run on the reading times for sentences 1, 2, 3 in continuous, discontinuous, and semicontinuous problems. It yielded significant results for types of descriptions, F(2,34) = 38.33, p < .001, MSE = 3.79; serial order of sentences, F(2,34) = 37.95, p < .001, MSE = 8.75; and interaction, F(4,68) = 16.41, p < .001, MSE = 3.60. The crucial prediction derived from the hypothesis of a strategy that involves mental models is that the time spent on the third sentence is longer for discontinuous than for other types of descriptions. It was clearly satisfied by the data: the paired t tests for these comparisons yielded: continuous-discontinuous, t(17) = 5.82, p < .001; semicontinuous-discontinuous, t(17) = 6.82, p < .001. A more detailed analysis of the mean times in Table 5, not reported in detail for reasons of space, showed that the shortest times were spent with the first sentence of any description; two more seconds were devoted to those sentences (Sentence 2 of brief, continuous, and semicontinuous and Sentence 3 of continuous descriptions) that merely demanded the placement of a token in the mental model in a position close to a token mentioned in the previous sentence. Even longer times were spent with sentences that may demand more complex encoding operations (Sentence 2 of discontinuous descriptions), retrieval (Sentence 3 of semicontinuous ones), and integration of mental models (Sentence 3 of discontinuous ones).

TABLE 5
MEAN READING TIMES (IN SECONDS) PER SENTENCE
                      Serial position of sentences
Description           Sentence 1    Sentence 2    Sentence 3
Brief                 7.19          9.66
Continuous            7.26          8.58          9.16
Discontinuous         7.14          11.42         16.08
Semicontinuous        7.28          9.53          11.32

Correlations between the accuracy scores and the other variables considered in this experiment are reported in Table 6. Accuracy in two-sentence problems is shown to be correlated with recall of sentences presented in couples, and accuracy in three-sentence problems with recall of sentences presented in triplets. Performance in the preliminary items was correlated with all types of problems (but the asymmetric distribution of this variable might inflate correlations). The correlation (significant only in one case) between study time and accuracy simply showed that accurate subjects were also more rapid. Accuracy in each type of description (separately considered) showed positive but not significant correlations with Forward and Backward Digit Spans. When performance in all types of description was pooled, the correlations with Forward and Backward Digit Spans were .40, p = .05, and .30, p > .10, respectively. The pattern of correlations obtained from children was not exactly replicated. However, the correlations of accuracy with Memory for Sentences were positive and significant, as well as those with the scores
predicted by the model; the correlation with Forward Digit Span, though significant only when all types of description were pooled, suggests about 15% common variance. Therefore, verbal short-term memory also seems to be involved in adults’ performance of this task. Subjects’ self-reports may be of some help in clarifying how this occurs. All subjects clearly reported using a strategy of constructing mental models, which they called “maps,” “mental schemas,” and so on. No subject reported rehearsal of sentences, in contrast with children, who often reported such a strategy after Experiments 1 and 2. However, nine subjects in this experiment also reported or showed in their overt behavior some verbal coding, like rehearsing names of objects or encoding their initial letters. Nevertheless, these subjects’ results were quite similar to those of the whole sample. In particular, the analysis of variance of their reading times yielded a significant interaction of type of description x serial order of sentences, F(4,32) = 6.37, p < .001, and the crucial test on the difference between the times spent on the third sentence in discontinuous and semicontinuous problems yielded t(8) = 4.50, p < .002, showing that these nine subjects too were constructing mental models (as they did report). It should be recalled that Oakhill and Johnson-Laird (1984) had already found both a verbal coding component and an analog coding component in this task.

TABLE 6
CORRELATIONS OF ACCURACY SCORES IN SPATIAL DESCRIPTIONS TASK WITH OTHER MEASURES USED AS PREDICTORS (EXPERIMENT 3)
                                     Type of descriptions
Predictor                            Brief     Continuous   Discontinuous   Semicontinuous
Preliminary items                    .51*      .59**        .59**           .76***
Same order (brief)                   -.24      -.39         -.45*           -.67**
Same order (continuous)              .08       .38          .10             .20
Same order (discontinuous)           -.03      .12          -.05            -.22
Same order (semicontinuous)          -.39      -.13         -.23            .05
Memory for Sentences (couples)       .56**     .21          .48*            .31
Memory for Sentences (triplets)      .24       .44*         .55**           .75***
Forward Digit Span                   .31       .27          .37             .37
Backward Digit Span                  .20       .25          .21             .33
Mean reading time                    -.17      -.49*        -.32            -.26
* p < .05, one-tailed. ** p < .01, one-tailed. *** p < .001, one-tailed.

Two interpretations may be suggested regarding the involvement of verbal short-term memory in adults’ strategy. While performance is determined primarily by spatial mental models, a verbatim representation may also be present at some stage of processing and exert an influence on performance (although Experiments 2 and 3 by Schmalhofer & Glavanov, 1986, showed little evidence of verbatim representations being used in comprehension tasks). A slightly different interpretation assumes that the “tokens” in the mental models can be verbal entities, such as words or initials. In several trials, three subjects overtly rehearsed the names of objects while pointing in different directions: this really looks like the kind of rehearsal that would be required if the second interpretation were true. However, no strong data allow us to distinguish between these two interpretations.
The mean same order scores (out of eight) for the four types of description were: brief = 2.72; continuous = 2.28; discontinuous = 1.89; semicontinuous = 1.17. Each of them was significantly above the chance level (t = 2.73, p < .02 for semicontinuous descriptions; t = 4.32, t = 6.05, t = 4.99 for the other three types respectively, all with p < .001 and df = 17). The proportion of same order placements, even for brief descriptions, was clearly lower than that in Experiments 1 and 2. The observed mean for this condition was significantly lower, t(17) = 3.98, p < .001, than the value of 4, itself probably an underestimation of the score that should be expected from the strategy of encoding sentences verbatim. This again opposes the hypothesis that adults follow the same strategy as children.
A one-way analysis of variance of the same order scores in three-sentence problems yielded F(2,34) = 3.70, p < .04, MSE = 1.56. The pattern of inequalities was Semicontinuous < Discontinuous < Continuous, exactly as predicted for the case in which mental models are scanned according to spatial principles for overt representation. However, the level of significance was not high, and not all the comparisons were significant: continuous-semicontinuous, t(17) = 2.60, p < .01 one-tailed; discontinuous-semicontinuous, t(17) = 1.58, p < .04 one-tailed; continuous-discontinuous, t(17) = 1.10, p > .14. As predicted, the same order score in brief problems was higher than in either semicontinuous, t(17) = 3.50, p < .001 one-tailed, or discontinuous, t(17) = 2.48, p < .01 one-tailed; it did not significantly differ from the same order score in continuous problems, t(17) = 0.98, p > .34.
It may be concluded that this experiment ruled out the hypothesis that adults follow the strategy of encoding sentences verbatim and supported the conclusion (Ehrlich & Johnson-Laird, 1982; Johnson-Laird, 1980, 1983) that adults construct analog representations of space, i.e., spatial mental models. Experiment 3 also suggested that the model is probably scanned according to some spatial (rather than temporal) principle in order to depict it in the form of a diagram. In addition, some results showed that verbal coding plays a role in adults’ performance; two ways in which this may occur are suggested.
GENERAL DISCUSSION
Different sources of evidence for a strategy based on phonemic encoding of sentences (goodness of fit of a probabilistic model, correlations with empirical measures of verbal short-term memory, and rhyme-effect) were provided by experiments with children. The conclusion is that subjects aged 6-9 try to recall and overtly represent the sentences in a description one at a time. In addition to direct evidence supporting the hypothesis that children’s main strategy
is based on phonemic encoding of sentences, two contrasting situations are described here. The first contrasting situation was another task, described in Experiment 1, quite similar in content to the Spatial Descriptions Task, that nevertheless turns out to be much easier; as discussed above, the same subjects were probably able to use a different, more effective, propositional strategy in performing it. The second contrasting situation is adults’ performance in the Spatial Descriptions Task: Experiment 3 replicated the finding by Johnson-Laird and co-workers that adults’ strategy involves the construction of spatial mental models. Experiment 3 also helped in clarifying some details of the adult strategy. A distinction between different types of code, such as surface linguistic representations, propositional representations, and mental models (Johnson-Laird, 1983), is supported by our data. A similar distinction among verbatim, propositional, and situational representations was recently suggested by Schmalhofer and Glavanov (1986).
It could still be argued that our adult subjects did not use the same three-dimensional apparatus that the children used in Experiments 1 and 2, and that children only received two-sentence problems, while adults received 8 two-sentence and 24 three-sentence problems in random order. On intuitive grounds, it seems unlikely that these differences in procedure raised major artifacts. In any event, follow-up research is in progress, using the same procedure with different age-groups. The very first results of a new experiment tend to suggest that, with two-sentence problems and three-dimensional apparatus, a real shift in strategy occurs around the age of 11. This would be in agreement with the interpretation given above.
The question remains open, however, of why children and adults use different strategies. The first answer that may come to mind is that children are not yet able to construct mental models. However, data
from another line of research show that this is not the case. In the context of a task involving the planning of drawings, Morra, Moizo, and Scopesi (1988a, 1988b) show that 6-year-olds can usually construct spatial mental models of the positions in which they are going to draw three different elements of a scene; 8-year-olds can do this with elements in four different positions, and 9-year-olds may even be able to coordinate five different positions. A different explanation must therefore be found.
In the absence of crucial data, various hypotheses may be put forward. One may be that the verbatim encoding strategy, for some reason, taxes working memory to a smaller extent: thus, children may prefer to use a less efficient, but also less demanding strategy. A more specific formulation of this hypothesis is that children can follow a verbatim-memory strategy because the articulatory loop reaches complete (or almost complete) development at an earlier age than other components or mechanisms of the working memory system. A second possibility is that children are less efficient in recoding verbal information into analogic, spatial information (in the experiments on drawing quoted above, item information was given verbally by the experimenter, but spatial information was self-generated by subjects). A third hypothesis is that children may lack control processes, executive routines, effective procedures, or any specific piece of procedural knowledge needed for the mental model strategy specific to this task. Last, the number of sentences in a problem may interact with age (i.e., with development of working memory) in determining the workload imposed by each strategy. In sum, this article provided evidence for an age-related difference in strategies, but more study is needed to find its cause.
One more question was raised by E. Czerniawska (personal communication, November 1986).
Given that children do not spontaneously follow a strategy based on mental models, can they be taught to do so? This question has both theoretical and educational implications. The latter concern the educational importance of mental models and related procedures in text comprehension (and perhaps also the use of procedures for the construction of spatial models in other domains such as drawing, as suggested by Morra, Moizo, & Scopesi, 1988b).
Some speculations put forward in this final discussion relate to issues and theoretical problems that are not specific to the Spatial Descriptions Task but are more general. Perhaps for this very reason, it seems worthwhile to continue to study this paradigm.
REFERENCES
BADDELEY, A. D. (1981). The concept of working memory: A view of its current state and probable future development. Cognition, 10, 17-23.
BADDELEY, A. D. (1986). Working memory. Oxford: Oxford University Press (Clarendon).
BADDELEY, A. D., LEWIS, V., & VALLAR, G. (1984). Exploring the articulatory loop. Quarterly Journal of Experimental Psychology, 36A, 233-252.
BURTIS, P. J. (1982). Capacity increase and chunking in the development of short-term memory. Journal of Experimental Child Psychology, 34, 387-413.
CASE, R. (1985). Intellectual development: Birth to adulthood. New York: Academic Press.
CASE, R., & GLOBERSON, T. (1974). Field independence and central computing space. Child Development, 45, 772-778.
COX, M. V. (1981). Interpretation of the spatial prepositions “in front of” and “behind.” International Journal of Behavioural Development, 4, 359-368.
EHRLICH, K., & JOHNSON-LAIRD, P. N. (1982). Spatial descriptions and referential continuity. Journal of Verbal Learning and Verbal Behavior, 21, 296-306.
GLANZER, M., DORFMAN, D., & KAPLAN, B. (1981). Short-term storage in the processing of text. Journal of Verbal Learning and Verbal Behavior, 20, 656-670.
HITCH, G. J., & HALLIDAY, M. S. (1983). Working memory in children. Philosophical Transactions of the Royal Society of London, B, 302, 325-340.
HULME, C., THOMSON, N., MUIR, C., & LAWRENCE, A. (1984). Speech rate and the development of short-term memory span. Journal of Experimental Child Psychology, 38, 241-253.
HUTTENLOCHER, J., & STRAUSS, S. (1968). Comprehension and a statement’s relation to the situation it describes. Journal of Verbal Learning and Verbal Behavior, 7, 300-304.
HUTTENLOCHER, J., & WEINER, S. L. (1971). Comprehension of instructions in varying contexts. Cognitive Psychology, 2, 369-385.
JOHNSON-LAIRD, P. N. (1980). Mental models in cognitive science. Cognitive Science, 4, 71-115.
JOHNSON-LAIRD, P. N. (1983). Mental models. London: Cambridge University Press.
JOHNSON-LAIRD, P. N., & BARA, B. (1984). Syllogistic inference. Cognition, 16, 1-61.
KINTSCH, W., & VAN DIJK, T. A. (1978). Towards a model of text comprehension and production. Psychological Review, 85, 363-394.
MANI, K., & JOHNSON-LAIRD, P. N. (1982). The mental representation of spatial descriptions. Memory and Cognition, 10, 181-187.
MORRA, S., MOIZO, C., & SCOPESI, A. (1988a). From general theories to specific models: Toward a working-memory model of the planning of children’s drawings. ERIC Documents, ED 286594.
MORRA, S., MOIZO, C., & SCOPESI, A. (1988b). Working memory (or the M Operator) and the planning of children’s drawings. Journal of Experimental Child Psychology, 46, 41-73.
MORRA, S., & SCOPESI, A. (in press). La memoria operativa e la sua misurazione. Età Evolutiva, 31.
OAKHILL, J., & JOHNSON-LAIRD, P. N. (1984). Representation of spatial descriptions in working memory. Current Psychological Research, 52-62.
OLSON, D. R., & BIALYSTOK, E. (1983). Spatial Cognition. Hillsdale, NJ: Erlbaum.
PASCUAL-LEONE, J., & GOODMAN, D. (1979). Intelligence and experience: A neo-Piagetian approach. Instructional Science, 8, 301-367.
PIZZO RUSSO, L. (1977). Introduzione al test del disegno dell’uomo. Firenze: Giunti Barbera.
RAVEN, J. C. (1947). Coloured progressive matrices: Sets A, Ab, B. London: Lewis.
SCHMALHOFER, F., & GLAVANOV, D. (1986). Three components of understanding a programmer’s manual: Verbatim, propositional, situational representations. Journal of Memory and Language, 25, 279-294.
SCHOFIELD, N. J., & ASHMAN, A. F. (1986). The relationship between digit span and cognitive processing across ability groups. Intelligence, 10, 59-73.
(Received December 8, 1987)
(Revision received July 20, 1988)