Psychological Bulletin 1999, Vol. 125, No. 2, 171-186
Copyright 1999 by the American Psychological Association, Inc. 0033-2909/99/$3.00
Stimulus Generalization, Context Change, and Forgetting
Mark E. Bouton, James B. Nelson, and Juan M. Rosas
University of Vermont
Forgetting is often attributed to retrieval failure caused by background contextual cues changing over time. However, generalization between stimuli may increase over time and make them increasingly interchangeable. If this effect occurs with contextual cues, it might cancel any effect of a changing context. The authors review the evidence and suggest a resolution of this paradox. Although generalization gradients can change over time, the effect is not always strong. Increased responding to nontarget stimuli is not often shown, and few studies have demonstrated such changes with contextual cues in a way that rules out other interpretations. Even this example of forgetting may be caused by retrieval failure. The physical contexts manipulated in learning and memory experiments themselves occur within a superordinate temporal context and can thus be forgotten with no inherent challenge to a context-change account of forgetting.
Spontaneous forgetting, the loss in learned performance that is often observed when time elapses between learning and remembering, is presumably caused by several mechanisms. The memory trace might decay over time, or information acquired earlier or later might increasingly interfere with it (proactive and retroactive interference, respectively). This article focuses on retrieval failure, another factor that has received considerable attention over the last few decades (e.g., Tulving & Pearlstone, 1966). In this case, memory performance might deteriorate when an individual fails to access material that is otherwise still available in the memory store. The idea that forgetting involves retrieval failure is consistent with evidence suggesting that forgotten information can be recovered by the presentation of retrieval cues (e.g., Deweer, 1986; Gordon, 1981; Tulving & Pearlstone, 1966). Thus, information may remain available, yet become less accessible over time (e.g., Tulving & Thomson, 1973). What exactly causes the decrease in accessibility? Ordinarily, retrieval is best when there is a match between the conditions present during encoding and the conditions present during retrieval (e.g., Spear, 1978; Tulving & Thomson, 1973); when there is a mismatch, retrieval failure occurs. It is now widely assumed that the passage of time can create a mismatch
because internal and external contextual cues that were present during learning may change or fluctuate over time (e.g., Estes, 1955). Thus, the passage of time may change the background context and make it less likely that target material will be retrieved (e.g., Bouton, 1993; McGeoch, 1942; Mensink & Raaijmakers, 1988; Spear, 1978). We call this approach the context-change account of forgetting. The idea has recently been put to use in the area of animal conditioning and memory. For example, consider some current thinking about extinction in classical conditioning (Bouton, 1991, 1993). In this paradigm, performance is reduced when a conditioned stimulus (CS) is presented without the unconditioned stimulus (US) after conditioning has occurred. However, the performance drop does not reflect unlearning: After extinction, behavior is determined by the background context (e.g., Bouton, 1991, 1993). For example, when the context is changed after extinction, animals behave as if they forget extinction: Conditioned responding is "renewed" (e.g., Bouton & Bolles, 1979; Bouton & King, 1983; Bouton & Peck, 1989; Bouton & Ricker, 1994). Thus, extinction performance may depend on the context for retrieval (e.g., Bouton, 1991, 1993). It is interesting that extinction seems especially dependent on the context. A similar change of context after conditioning (rather than extinction) often does not cause a corresponding loss of conditioning performance (e.g., Bouton & King, 1983; Bouton & Peck, 1989; Hall & Honey, 1989; Kaye, Preston, Szabo, Druiff, & Mackintosh, 1987). These ideas have helped illuminate spontaneous recovery, the well-known phenomenon in which extinguished responding naturally returns over time (see Bouton, 1993, for a review). Just as extinction may depend on the physical context for retrieval, so it may depend on contextual cues that correlate with time.
Renewal and spontaneous recovery may both result from a failure to retrieve extinction outside of the context in which it was learned (e.g., Bouton, 1991, 1993). Consistent with this view, both effects are attenuated by presenting cues that remind the subject of extinction at the time of the test (Brooks & Bouton, 1993, 1994). That is, forgetting of extinction caused by either time or physical context change is reduced by a
Mark E. Bouton, James B. Nelson, and Juan M. Rosas, Department of Psychology, University of Vermont. James B. Nelson is now at Department of Psychology, University of Central Arkansas. Juan M. Rosas is now at Departamento de Psicologia, University of Jaen, Jaen, Spain. This research was supported by National Science Foundation Grant IBN-9209454, by a grant from the Basque Government's Programa de Formación de Investigadores (Ref. BFI94.140), and by the Center for Advanced Study in the Behavioral Sciences, where Mark E. Bouton was a Fellow during 1997-1998 (supported by Grant 95-32005-0 from the John D. and Catherine T. MacArthur Foundation). We thank Robert Boakes for his comments on an early version of this article. Correspondence concerning this article should be addressed to Mark E. Bouton, Department of Psychology, University of Vermont, Burlington, Vermont 05405. Electronic mail may be sent to
[email protected].
BOUTON, NELSON, AND ROSAS
retrieval cue. Such evidence is further consistent with the idea that the passage of time is equivalent to a change of physical context. Although a context-change account of forgetting has gained some acceptance in both the animal and human memory literatures, a provocative question about its plausibility was raised but never answered over a decade ago. Riccio, Richardson, and Ebner (1984; see also Riccio, Ackil, & Burch-Vernon, 1992; Riccio, Rabinowitz, & Axelrod, 1994) suggested that the literature on stimulus generalization is relevant. They noted that generalization between different stimuli can often change over time. Humans and animals tend to forget stimulus attributes, and generalization between stimuli may increase so that responding to different stimuli becomes more similar over time (e.g., McAllister & McAllister, 1963; McAllister, McAllister, & Franchina, 1965; Perkins & Weyant, 1958; D. R. Thomas & Lopez, 1962). Generalization performance is often summarized by plotting a "generalization gradient" in which responding to a set of test stimuli is shown as a function of their difference from the original stimulus. Responding typically declines as the test stimulus deviates from the original. Such a gradient can "flatten" with delayed testing (e.g., Riccio et al., 1984): Responding to the various test stimuli may become more equal. There has been surprisingly little theoretical work to account for this phenomenon. A recent exception is a model presented by Estes (1997), which supposes that items are coded in memory as arrays with many features or attributes. Over time, random perturbation of individual attributes could yield memory distortions that include increased generalization between initially discriminated stimuli.
Regardless of the theoretical explanation of the phenomenon, Riccio and his colleagues have noted that the effect of time on generalization performance introduces a puzzle for the context-change account of forgetting that we call the context-forgetting paradox. If organisms generalize more between different contexts over time, it would make different contexts increasingly interchangeable. Such an increase in interchangeability could offset the retrieval failure that would otherwise be caused by a changing context. How can organisms forget because of contextual change if contexts are judged to be increasingly similar? The idea that spontaneous forgetting is due to contextual change thus needs to be brought to terms with the literature on how stimulus generalization changes over time. The purpose of this article is to do exactly that. To begin the discussion, we suggest that two criteria must be met for changes in generalization to truly challenge the context-change account of forgetting. First, notice that there are at least two ways that generalization gradients might flatten over time. In one case, responding to the trained stimulus might decrease and begin to resemble the level of responding to the altered test stimuli. This form of flattening would only imply forgetting of the trained stimulus, and because it would be consistent with any theory of forgetting, it would pose no paradox for a context-change account of forgetting. A second type of flattening would be more damaging. In this case, responding to the altered test stimuli would increase and begin to resemble responding to the originally trained stimulus. This form of flattening is the one emphasized by Riccio et al. (1984), who stated, for example, that as the retention interval increases, "the original responses continue to occur, but more 'false positives' are generated to altered stimuli" (p. 158). If a generalization gradient surrounding a contextual retrieval cue were
to flatten this way, memory performance assessed outside the original context could actually improve, rather than worsen, over time. This type of flattening would undermine the context-change account of forgetting; in the extreme, a flattened gradient would imply perfect memory without forgetting in any context. Our point is that the context-change account of forgetting would be challenged only by demonstrations of an increase in false-positive responding. A second criterion is that an increase in false-positive responding needs to be shown with contextual stimuli. The context-change account of forgetting emphasizes changes in the background in which to-be-remembered tasks are embedded. In contrast, many studies of time's effect on generalization have examined generalization to stimuli that are in the foreground. For example, classic work examined generalization to the color of the lighted "key" that pigeons peck to earn food reinforcers (e.g., D. R. Thomas & Burr, 1969; D. R. Thomas & Lopez, 1962). The distinction between contextual and noncontextual cues may be important because, as Bouton (1993) noted, there is evidence that the control of behavior by background cues can often be surprisingly stable over time (e.g., Bouton & Brooks, 1993; Kraemer, 1984; Peck & Bouton, 1990; D. R. Thomas, Moye, & Kimose, 1984). Furthermore, there may be grounds for expecting such stability. Contextual stimuli are often complex and multimodal: Estes's (1997) recent account of gradient flattening predicts less memory distortion with complex stimuli, that is, those that involve the encoding of a large number of attributes. There is thus a need to examine how generalization to background contextual stimuli specifically changes over time. In this article, we re-evaluate the context-forgetting paradox. We begin with a brief survey of the literature that helps characterize the flattening-gradient phenomenon. We then examine the evidence in light of the criteria set out earlier. 
Surprisingly few studies provide direct evidence that false-positive responding to nontarget stimuli increases over time, and fewer still demonstrate such an effect with background contextual cues in a way that separates true changes in generalization from alternative explanations. We then propose a conceptual resolution of the paradox and review new results that are further consistent with a context-change account of forgetting. We believe the discussion goes some distance in resolving the context-forgetting paradox. We should note that, although there are other implications of the fact that humans and animals may forget stimulus attributes (as emphasized by Riccio et al., 1994), our specific focus is the paradox the phenomenon may pose for the context-change account of forgetting.
Changes in Generalization Over Time
The purpose of this first section is to survey briefly the range of results that have been reported. We first examine results that have been obtained when different stimuli have been tested (a) within individual subjects and (b) between different groups. This work is predominantly with animals. We then turn to the effect as it has been observed in humans. One implication of our survey is that the term flattening (e.g., Riccio et al., 1984) should not be misconstrued to mean that the generalization gradient literally becomes flat, with uniform responding to a wide range of stimuli. The forgetting of stimulus attributes is often much less extreme.
RE-EVALUATION OF THE CONTEXT-FORGETTING PARADOX
Generalization Tested Within-Subjects
The most systematic work studying changes in generalization over time has investigated discriminated operant performance in pigeons. In one early study, D. R. Thomas and Lopez (1962) trained pigeons to peck a key that was illuminated with a green-yellow color (550 nm) over a series of 30-min sessions. Pecking the key was reinforced on a variable-interval schedule in which food reinforcers were presented about once a minute. Different groups then received generalization tests (using a procedure described later) at 1 minute, 1 day, or 1 week after the last training session. Testing was conducted during each of two sessions (separated by 24 hr) in which the response key was illuminated for 30-s periods with 1 of 11 colors ranging from 500 to 600 nm (roughly green to orange). There were eight unreinforced tests of each stimulus on each test day. The results of this widely cited experiment are presented in Figure 1. Responding to each test stimulus is plotted as a percentage of the total responses made during the test (we discuss this measure in the next section). There is a clear generalization gradient at the 1-day test. Note that responding dropped off systematically as the test stimulus deviated from the original 550-nm value. The figure also suggests that the immediate test produced the steepest gradient. Note, however, that a generalization gradient with a clear peak at the trained value is evident at each retention interval. Note also that although the gradients change between the
[Figure 1 plot: percentage of total responses as a function of wavelength (500-600 mμ) for the immediate, 1-day, and 1-week groups.]
Figure 1. Responding by pigeons during stimulus generalization tests conducted immediately, one day, or one week after the conclusion of reinforced training to peck a 550-nm key. From "The Effect of Delayed Testing on Generalization Slope" by D. R. Thomas and L. J. Lopez, 1962, Journal of Comparative and Physiological Psychology, 44, p. 542. In the public domain.
immediate and the 1-day test, there is no further change in performance from one day to one week. Thus, forgetting appears to have reached its maximum within the first 24 hr (however, see D. R. Thomas et al., 1985). D. R. Thomas and Burr (1969) subsequently showed that the time since training, not the time since consuming recent reinforcers (confounded in the D. R. Thomas & Lopez study), is responsible for the change evident between the immediate and 1-day tests, although neither study equated the groups on handling just before the test (the experimenter handled the delay-tested groups, but not immediate-tested groups, at this time). Despite this complication, we may note, as Bouton (1993) did, that the asymptotic gradient is not literally flat; the change in stimulus generalization is less dramatic than that. Similar changes in the shape of generalization gradients have been reported in related experiments (Burr & Thomas, 1972; Moye & Thomas, 1982; D. R. Thomas & Burr, 1969). In each, there was a change in performance, although the trained stimulus still retained considerable control over behavior. Not all attempts to produce the effect in pigeons have been successful, however. D. R. Thomas, Ost, and Thomas (1960) reported no change in the shape of the gradient with increasing retention intervals. Their experiment differed from the D. R. Thomas and Lopez (1962) study in that the birds had received explicit discrimination training in which pecking at a 550-nm keylight was reinforced (S+) and a 570-nm keylight was not (S-). The birds were also tested 1, 7, or 21 days after training. The generalization gradients observed were similar at all of the test intervals, with each even retaining a phenomenon known as peak shift (e.g., Hanson, 1959): The highest level of responding was not to the reinforced 550-nm stimulus but to a stimulus (540 nm) that was further away from the nonreinforced S-. These data are noteworthy for the following reason.
Peak shift is usually attributed to the interaction of an excitatory gradient around S+ and an inhibitory gradient around S-, with the inhibition surrounding S- assumed to subtract from the excitation around S+. Because there is less inhibition as the test stimuli deviate from S-, the peak of responding is seen on the side of the gradient opposite (540 nm) the location of S-. In this light, the results of D. R. Thomas et al. (1960) imply good retention of stimulus control around both S+ and S- after 21 days. In a similar fashion, Hearst and Sutton (1993) reported no significant flattening of gradients around either S+ or S- (tested separately) when pigeons were tested after delays between 30 min and 21 days. Pigeons can thus retain learned discriminations between visual stimuli over retention intervals of several weeks. Good retention of stimulus control by pigeons has also been shown with auditory stimuli (Hoffman, Fleshler, & Jensen, 1963; Hoffman, Selekman, & Fleshler, 1966). In this research, pigeons were first trained to key peck on a variable-interval food reinforcement schedule. Then, a 1000-Hz tone was repeatedly paired with shock while the birds continued to key peck. Tone-shock training gave the tone the ability to suppress the operant keypecking baseline when it was presented. It was continued until the birds pecked at a normal rate in the absence of the tone and showed complete suppression during the tone. This apparently required extensive training; most birds needed about 600 tone-shock trials over 70 sessions. Of most interest, however, was a series of generalization tests that began one day or less after training (Hoffman & Fleshler, 1961), after an interruption of 2.5 years, and then after another 1.5-year interruption. At this time, the birds received
nonshocked presentations of tones between 300 and 3400 Hz that were spaced in roughly equal increments on a logarithmic scale. As Figure 2 shows, the birds retained excellent discriminative control throughout testing. The figure, which is taken from Hoffman et al. (1966), illustrates average performance for the 4 birds (out of 6) that survived the entire experiment. The lower panel shows that the suppression evoked by the tone decreased gradually and systematically over the course of the extinction testing, with some spontaneous recovery probably evident after the multiple-year interruptions. The 1000-Hz tone was never paired with shock after initial conditioning, although noncontingent shocks were presented during one test phase to reinstate extinguished performance (shock stress). Most important, note that the generalization gradients at the top (which describe 4-session means) remain systematic and sharp throughout testing, regardless of the extent of extinction, and more important, regardless of the length of the retention interval. Because there is no age-matched control group tested immediately while these birds received their delayed tests, it is not possible to conclude that there is no forgetting. But the data suggest that a well-learned fear response to an auditory cue can be well-retained, and apparently specifically so, over tests that span 4 years.
Generalization Tested Between-Subjects
The results of some systematic within-subject tests thus suggest that although generalization gradients can flatten to some extent over time, the gradient is sometimes retained surprisingly well. However, because the within-subjects method involves exposing subjects to a large set of stimuli during testing, it may provide many opportunities to reactivate information that was forgotten over the delay. In this sense, the method may provide a somewhat conservative estimate of forgetting (D. A. Thomas & Riccio, 1979). In
fact, some between-subjects tests (in which different groups are tested with different stimuli) do suggest a more marked change in generalization over time (e.g., McAllister & McAllister, 1963; Perkins & Weyant, 1958). One example is the seminal experiment by Perkins and Weyant (1958), the first study to suggest that generalization gradients flatten over a delay. Rats were trained to run for food reward on an elevated runway. For half of the rats, the runway and goal box were white; for the other half, the runway and goal box were black. In either case, the start box and the door that was opened at the start of each trial were gray. After training, different groups were tested either 60 s or 1 week following training in either the same or the different (novel) runway. As shown in Figure 3, the switch to the novel runway caused a decrease in running speed in the immediate condition but not in the delayed condition. These data are consistent with the idea that the rats generalized more between the runways a week after training. McAllister and McAllister (1963) reported compatible results. Rats received pairings of a light and a shock in a white box and then learned to escape the light (and the box) by jumping a hurdle to get into a neutral gray compartment during a test. Different groups received conditioning in either the start compartment of the hurdle apparatus or in a different box. They then learned to escape the start compartment beginning 3 min or 24 hr later. Rats that received conditioning in the different box displayed relatively little escape jumping when the test began 3 min later. However, this deficit was not present in the rats tested at the 24-hr interval, who showed improved performance that was similar to the nonswitched controls. These rats apparently forgot attributes of the conditioning box.
[Figure 2 plot: generalization gradients (top) and suppression evoked by the tone (bottom), plotted against session number.]
Figure 2. Response suppression by pigeons during generalization tests conducted in extinction after a 1000-Hz tone was associated with electric shock. Generalization gradients (4-session means) are shown at the top; amount of suppression evoked by the tone is shown below. Testing began 1 day after training (left), 2.5 years later (middle), and 1.5 years after that (far right). From "Stimulus Aspects of Aversive Control: Long-Term Effects of Suppression Procedures" by H. S. Hoffman, W. Selekman, and M. Fleshler, 1966, Journal of the Experimental Analysis of Behavior, 9, p. 660. Copyright 1966 by the Journal of the Experimental Analysis of Behavior. Adapted with permission.
[Figure 3 plot: running speed by test stimulus for the Immediate and Delay groups; values printed on the plot: 49.2, 32.7, 28.6, and 11.1.]
Figure 3. Running speeds of groups of rats tested in the trained runway (T) or a novel runway (N) beginning 60 s (Immediate) or one week (Delay) after the end of training. From "The Intertrial Interval Between Training and Test Trials as Determiner of the Slope of Generalization Gradients" by C. C. Perkins and R. G. Weyant, 1958, Journal of Comparative and Physiological Psychology, 51, p. 597. In the public domain.
The results of Perkins and Weyant (1958) and McAllister and McAllister (1963) seem more striking than the pigeon data portrayed in Figures 1 and 2, and there are many differences between the various methods that might account for this. One obvious difference is the between-subjects, rather than the within-subjects, method used during testing. However, the use of a between-subjects design does not guarantee strong results. For example, in the introduction to a closely related experiment that followed on from McAllister and McAllister's, Desiderato, Butler, and Meyer (1966, presented in more detail later) described an experiment (Butler, 1964) that failed to produce gradient flattening and suggested that the stimulus change must be moderate to produce the effect. The same conclusion might be drawn from an experiment by Zhou and Riccio (1996), who found flattened generalization after 2 weeks when the room or the apparatus was changed (the rooms differed only in illumination and the apparatuses differed in odor and slightly in size) but not when both room and apparatus were changed together. McAllister and McAllister's (1963) own contexts were extremely similar; the other box was a "replica of the start box of the hurdle apparatus" (p. 577) except for some very subtle differences. Overall, the findings are consistent with the view that time primarily affects generalization between very similar stimuli. This conclusion is also consistent with other between-subjects experiments using animals. Kraemer (1984) reinforced pigeons for pecking an S+ and nonreinforced an S- and then retrained this discrimination after retention intervals of 1, 10, or 20 days. He found increasing forgetting at 10 and 20 days, as indicated by incorrect responding to S- (Experiment 1). However, S+ and S- were similar, consisting of 2 versus 3 green dots projected on the pecking key (counterbalanced). There was little forgetting when the discrimination involved more dissimilar stimuli, for example, 2 versus 5 green dots (Experiment 2). A similar set of findings was
reported by D. R. Thomas et al. (1985). We would note that increased responding to an S- after explicit discrimination training like this might be caused by a loss of inhibition to S- rather than a change in generalization between S+ and S-. Inhibition may weaken more quickly than excitation over a retention interval (e.g., Hendersen, 1978; D. A. Thomas, 1979; see also Nakajima, 1997). This may provide an alternative explanation of effects occurring after explicit discrimination training (Kraemer, 1984; D. R. Thomas et al., 1985). MacArdy and Riccio (1991) examined the effects of interoceptive contexts created by the administration of drugs. They studied state-dependent retention, in which retention of a task is best when testing is conducted in the drug state that was also present during training. In their experiment, rats were trained while they were under the influence of sodium pentobarbital, chloropent (a combination of pentobarbital and chloral hydrate), or isotonic saline. The task was one of passive avoidance, in which the rat was given a mild shock if it entered the black compartment of a box with both black and white compartments. The rats were then tested while in the same state or in one of the other states (all nine possible training-testing combinations were examined) either 1 or 7 days later. When the rats were switched from one drug to the other, there was a loss of retention at the 1-day interval but not at the 7-day interval. In the switched condition, performance improved as the retention interval lengthened. In contrast, no such pattern was apparent when animals were switched from drug to saline or saline to drug; this more extreme switch caused a loss in performance that did not change significantly over the long retention interval. (Indeed, the switch from saline to drug may have been more disruptive at the longer than at the shorter interval [p = .07], which, if anything, seems to imply a sharpening gradient.)
Animals forgot the specific attributes of the different drug states but still discriminated between drugged and undrugged states. In a similar vein, in a taste-aversion learning experiment with rats, Richardson, Williams, and Riccio (1984, Experiment 2) paired a drink of 10% sucrose with illness produced by a lithium chloride injection. Different groups were then tested for their aversions to sucrose concentrations of 2.5%, 10%, or 32% after intervals of 2, 7, or 21 days. In this method, conditioning is indicated by suppressed consumption of the fluid. The results revealed a modest generalization gradient around the trained sucrose concentration, which also became modestly flattened. However, consumption changed by only 1 or 2 ml in the crucial groups tested with the generalized solutions. Also notice that the study did not involve tests of different flavors, but rather, different concentrations of sweet. Retention interval again influenced generalization between very similar stimuli.
Tests With Human Participants
Experiments with humans have produced compatible results. McAllister et al. (1965) showed students a 50-mm line in the center of a 27.9 × 35.6 cm piece of cardboard for 10 s. Different participants then waited 30 s or 8 min before they were shown a second card with a line of 50, 54, 58, or 62 mm on it. At this time, they were asked to judge whether the new line was equal in length or longer than the first one. This judgment was worse at the longer delay (longer stimuli were incorrectly judged as equal in length), suggesting the forgetting of what appears to be a relatively fine
discrimination. (The effect disappeared when the participants were given a repeat test with the other retention interval.) In another widely cited study, Bahrick, Clark, and Bahrick (1967) showed undergraduates line drawings of 16 common objects 1, 3, 9, or 18 times. In a later test, they asked the participants to identify the original drawing from a row of similar drawings of the same type of object. There were 10 distractor items in the row that differed in similarity to the original prototype (as determined by the ratings of independent judges). A sample row is shown in Figure 4; the participants had 8 s to choose the original drawing from among the distractors in the row. The participants were tested immediately after seeing the individually presented prototypes, or 2 hr, 2 days, or 2 weeks later. It is not surprising that performance appeared to become poorer over time, as the subjects forgot specifics of the original stimulus (the results were not analyzed statistically). Nevertheless, with each level of training, a gradient was still evident at all retention intervals. As shown in Figure 5, which shows the frequency with which items were selected during testing as a function of the judges' ratings of their deviation from the original, there was considerable specificity in what the participants retained. Indeed, we believe these data could also be used to support the claim that stimulus attributes are retained fairly well over time.
Summary
This overview of the literature begins to characterize the effect of time on stimulus generalization. There is clearly evidence that generalization can change in a way that is consistent with the idea that organisms forget stimulus attributes, although the size of the effect can range from very small or nonexistent to large (e.g., compare Figures 1-5). There is presently little understanding of why such a range of outcomes is possible, and attempts to compare the results of such radically different methods are obviously problematic. It seems plausible that the amount of training is important, as is the method of testing. There is also evidence that "difficult" discriminations (in which S+ and S- are close together on the tested dimension) are hurt more by the passage of time (D. R. Thomas et al., 1985). With this range of outcomes in mind, we now turn to an evaluation of its impact on the context-change account of forgetting.
Do False-Positive Responses Increase?
As we noted in the introduction, for a flattening gradient to pose a true paradox for a context-change account of forgetting, changes in the gradient would need to reflect an increase in false-positive
responding. Thus, one criterion for a serious challenge to the context-change account of forgetting is that gradient flattening should involve an increase in responding to stimuli that were not trained, for instance, those represented by the tails of the generalization gradient. It would be ideal if flattening took the form of a significant increase, between immediate and delayed tests, in responding to nontarget stimuli. Relatively few of the studies that are commonly cited in this literature really satisfy this criterion, however.
Use of Relative Generalization Gradients
We previously noted that the most systematic work on flattening gradients has been that with discriminated operant tasks in pigeons. Most of these data have been presented in the form of a relative gradient (Burr & Thomas, 1972; Moye & Thomas, 1982; D. R. Thomas & Burr, 1969; D. R. Thomas & Lopez, 1962) in which responding to each stimulus is expressed as a percentage of the total responding to all stimuli (see Figure 1). The use of relative gradients makes it difficult to infer an increase in false-positive responding because such a gradient's shape can be affected by changes other than absolute increases in responding in the tails (see Mackintosh, 1974). For instance, responding could generally increase to all of the tested stimuli. When expressed as a relative gradient, responding to the training stimulus would become a smaller percentage of the total overall responding. However, the shapes of the absolute gradients would remain the same; one would merely be above and parallel to the other (see, e.g., D. R. Thomas et al., 1985, pp. 474-475). Alternatively, if overall responding decreased to a floor, a relative gradient would also flatten. There are many ways in which a relative gradient can flatten without an increase in false-positive responses. The study by D. R. Thomas and Lopez (1962) described earlier (Figure 1) can be used to illustrate the difficulties associated with interpreting relative gradients. In that study, responding to the trained stimulus alone was shown statistically to be a smaller percentage of total responding at the delayed tests. Although the authors reported that overall responding did not change significantly from the immediate test (865.4) to the 1-day test (1031.5), numerically there was an increase of about 166 responses.
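The arithmetic behind this point can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the two response totals (865.4 and 1031.5) come from the text, but the 11-point gradient shape, its 29% peak at 550 nm, and the function names are hypothetical and were not read from the original data.

```python
# Toy illustration: a uniform rise in responding flattens a RELATIVE gradient
# while leaving the shape of the ABSOLUTE gradient unchanged.

WAVELENGTHS_NM = list(range(500, 601, 10))  # the 11 test stimuli, 500-600 nm

# Hypothetical percentage shape (sums to 100) with a 29% peak at 550 nm.
# These values are invented for illustration, not taken from Figure 1.
SHAPE_PERCENT = [1, 2, 5, 10, 17.5, 29, 17.5, 10, 5, 2, 1]

IMMEDIATE_TOTAL = 865.4  # total responses, immediate test (from the text)
ONE_DAY_TOTAL = 1031.5   # total responses, 1-day test (from the text)


def relative_gradient(absolute):
    """Express each stimulus's responses as a percentage of total responses."""
    total = sum(absolute)
    return [100.0 * r / total for r in absolute]


def add_uniform_increase(absolute, new_total):
    """Spread the rise in total responding equally over every stimulus."""
    per_stimulus = (new_total - sum(absolute)) / len(absolute)
    return [r + per_stimulus for r in absolute]


immediate_abs = [IMMEDIATE_TOTAL * p / 100.0 for p in SHAPE_PERCENT]
one_day_abs = add_uniform_increase(immediate_abs, ONE_DAY_TOTAL)

peak = WAVELENGTHS_NM.index(550)
peak_pct_immediate = relative_gradient(immediate_abs)[peak]
peak_pct_one_day = relative_gradient(one_day_abs)[peak]

# The absolute curves are parallel: every stimulus gained the same amount.
gains = [d - i for d, i in zip(one_day_abs, immediate_abs)]

print(f"gain per stimulus: {gains[0]:.1f} responses")
print(f"peak share, immediate: {peak_pct_immediate:.1f}%")
print(f"peak share, 1-day:     {peak_pct_one_day:.1f}%")
```

Under these assumptions the trained stimulus's share of total responding falls from 29% to roughly 26%, even though the absolute gradient has only shifted upward by the same amount at every wavelength. A flatter relative gradient therefore does not by itself show a selective increase in false-positive responding.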
If responding had increased uniformly to all stimuli (roughly 15 responses to each), the percentage of total responses given to the training stimulus would decrease from about 29% (see Figure 1) to 25%, but the shapes of the absolute gradients would be the same. To further reduce responding to the 20% level suggested by Figure 1 would require a further drop in absolute responding to the
Figure 4. Example of stimuli used in the recognition test of Bahrick, Clark, and Bahrick (1967). The original stimulus (0) and stimuli of various degrees of deviation from the original (1-5) are shown. From "Generalization Gradients as Indicants of Learning and Retention of a Recognition Task" by H. P. Bahrick, S. Clark, and P. Bahrick, 1967, Journal of Experimental Psychology, 75, p. 465. Copyright 1967 by the American Psychological Association. Reprinted with permission of the authors.
[Figure 5 plot: separate panels for 1 training trial and 3 training trials; curves for the immediate, 2-hour, 2-day, and 2-week tests.]