JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR
1970, 14, 345-352 NUMBER 3 (NOVEMBER)
THE EFFECTS OF AMOUNT OF TRAINING PER REVERSAL ON SUCCESSIVE REVERSALS OF A COLOR DISCRIMINATION¹
I. L. BEALE
UNIVERSITY OF AUCKLAND, NEW ZEALAND
Three groups of pigeons were trained on a red-green discrimination in which the stimuli were alternately presented in a multiple schedule of reinforcement. The discrimination was reversed 24 times. Groups were given 1, 2, or 4 hr of training on each discrimination. Increasing the length of training had two principal effects on reversal performance: it increased the rate of extinction of responding to one of the stimuli and increased the rate of reacquisition of responding to the other. The latter effect involved both an increase in reacquisition of responding to a positive stimulus within reversals and an increase in recovery of responding to the previous negative stimulus between reversals. Improvements in performance of each group over the series of reversals were qualitatively similar to the two effects of length of training on each discrimination, and were analogous to effects obtained in other studies involving overtraining and successive reversals of simultaneous discriminations.
Animals may be taught to respond to one stimulus (SD) and not to another (SA) by reinforcing their responses to SD only. The resulting change in behavior, called discrimination learning, may be reversed by further training in which SD and SA are reversed. This paper is concerned with the effect of amount of discrimination training on performance during a series of reversals. Several experiments have reported facilitative effects of extended training (overtraining) on repeated reversals of a simultaneous discrimination, that is, one in which SD and SA are presented together and responses to either are simultaneously available (Capaldi and Senko, 1961; Pubols, 1962; Gonzalez, Berger, and Bitterman, 1966; Eimas, 1967). These studies agree in showing that animals given more training per reversal tend to make more errors in the early reversals, but improve more rapidly and soon perform as well or better than animals with less training per reversal. In the spatial discrimination reversal experiment by Eimas (1967), errors during reversal were divided into two classes. Perseverative errors were defined as the number of consecutive errors before the first correct response, and reacquisition errors as the difference between total errors and perseverative errors. Eimas
found that the failure to find a facilitative effect of overtraining on early reversals was due largely to the excess of perseverative errors after overtraining, and that the facilitative effect of overtraining on later reversals was primarily due to a reduction in reacquisition errors. Perseverative errors declined across reversals for all subjects, but the decline was faster for overtrained subjects. Gonzalez et al. (1966) found that the improvement in performance across reversals generally took the form of a decrease in perseverative errors, although some subjects also showed a decrease in reacquisition errors. Both studies provide evidence that overtraining has beneficial effects on early reversals (fewer reacquisition errors) that are masked by the increase in perseverative errors. The present experiment examined the effect of amount of training per reversal on a series of reversals of a free-operant, successive discrimination, where SD and SA are presented alternately. The use of a successive discrimination allows separation of the effects of overtraining on responding to SD and SA, while the use of a non-discrete trial procedure permits response rate to be used as the dependent variable.
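The two error classes attributed to Eimas (1967) in the introduction can be made concrete with a short sketch. It assumes a hypothetical trial-by-trial record for a single reversal of a discrete-trial discrimination, coded as a list of booleans; the function name and the data format are illustrative and are not taken from any of the cited studies.

```python
from typing import Sequence, Tuple


def split_reversal_errors(trials: Sequence[bool]) -> Tuple[int, int]:
    """Return (perseverative, reacquisition) error counts for one reversal.

    `trials` is a hypothetical record: True marks a correct response,
    False marks an error.
    """
    total_errors = sum(1 for correct in trials if not correct)

    # Perseverative errors: consecutive errors before the first correct response.
    perseverative = 0
    for correct in trials:
        if correct:
            break
        perseverative += 1

    # Reacquisition errors: total errors minus perseverative errors.
    reacquisition = total_errors - perseverative
    return perseverative, reacquisition


# Hypothetical record: three errors, then a mix of correct responses and errors.
example = [False, False, False, True, False, True, True, False, True, True]
print(split_reversal_errors(example))  # (3, 2)
```

Under this coding, a group that is slow to abandon the old SD but quick to settle afterwards would show a high first count and a low second count, which is the pattern described above for overtrained subjects on early reversals.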
¹Reprints may be obtained from the author, Department of Psychology, Dalhousie University, Halifax, Nova Scotia, Canada.
METHOD
Subjects
Nine experimentally naive homing pigeons were maintained on a diet of grain occasionally supplemented by cod-liver oil. Free-feeding body weight was determined for each bird during a 14-day period of free access to grain, immediately before the experiment. During the experiment, the birds were maintained at about 80% of their mean free-feeding weights. Grit and water were always available in the home cage.
Apparatus
The experimental enclosure was a standard single-key pigeon chamber (Ferster and Skinner, 1957). The response mechanism was a translucent key of 1 in. (25 mm) diameter that could be transilluminated by white, green, or red light projected from behind the key. Reinforcement was access to grain for 2.5 sec in a magazine located 6 in. (152 mm) below the key. During reinforcement, the magazine was illuminated and the keylight was off. The experimental chamber was illuminated by a 15-w houselight, the onset and offset of which indicated the beginning and end of the experimental session. The experimental chamber was housed in a sound insulating box. A blower provided ventilation and noise to mask auditory cues arising from the controlling equipment. A one-way vision screen permitted the observation of the subjects in the experimental chamber. Reinforcements and other stimuli were controlled automatically by standard tape-pullers, relays, and timers. Key pecks were recorded on counters and cumulative recorders.
Procedure
Experimental sessions were run on different, but not always consecutive days. After 1 hr adaptation to the experimental chamber the birds were trained to eat when the magazine was presented, and then to peck the key (white) to produce the magazine for 2.5 sec. Three sessions on continuous reinforcement (each key peck reinforced) were followed by training on variable-interval (VI) schedules of reinforcement, which reinforce the first response following varying intervals since the last reinforcement. The VI schedules used were constructed from a published progression (Fleshler and Hoffman, 1962). These schedules control a roughly uniform response rate. Three sessions on VI 20-sec were followed by five sessions on VI 1-min. During this preliminary training the key was always white. Sessions ended after 60 reinforcements were obtained.
The next procedural step was to introduce the discriminative stimuli (red or green illumination of key) in a multiple schedule (mult VI 1-min VI 1-min) in which the key color alternated between red and green every 3 min. The schedule of interreinforcement intervals was identical during red and green. The purpose of this procedure was to establish a stable response rate baseline for mult VI 1-min VI 1-min. No reinforcements were scheduled in one out of every ten 3-min presentations of each color, but most 3-min periods contained at least one scheduled reinforcement. Training on this schedule continued until total responses per hour in three consecutive sessions were within 5% of the mean total for the three sessions, and the highest total was not for the last session. The actual number of sessions on the schedule ranged from 11 to 17.
When criterion was met by each subject, the "initial discrimination" was commenced by altering the reinforcement schedule in one component of the multiple schedule from VI 1-min to extinction (EXT). In the new schedule (mult VI 1-min EXT), responding was reinforced as before during one color (SD), but was never reinforced during the alternated color (SA). Three groups of three birds, designated Groups 1, 2, and 4, were given 1, 2, and 4 hr training respectively on the initial discrimination and every subsequent reversal. After this training, the discrimination was reversed by interchanging SD and SA, so that if red was SD on the initial discrimination it was SA on the first reversal. After further training the discrimination was again reversed, and so on, for a total of 24 reversals. For Group 1, sessions lasted 1 hr. For Groups 2 and 4, sessions were of 2 hr duration. This ensured that for all groups, all reversals occurred at the start of a daily session. Generally, experimental sessions began with SD, but for each bird this procedure was varied once as a test for stimulus control.
RESULTS
Figures 1-1, 1-2, and 1-4 show the performance of Groups 1, 2, and 4 on the initial discrimination (reversal 0) and 24 subsequent reversals. Discriminative performance is shown
as a discrimination index, based on each hour of responding, obtained by dividing responses during SD by total responses. In other words, the discrimination index is the proportion of the total number of responses occurring during SD. Figure 1-4 shows Group 4 results for hours 1 and 2 only, hours 3 and 4 being omitted for greater clarity. These figures indicate the variability within groups, and the overlap between groups.
Fig. 1-1. Mean discrimination indices of Group 1 subjects on initial discrimination (reversal 0) and 24 reversals. Insets show individual curves.
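The discrimination index defined in the Results can be written out as a one-line calculation. The sketch below assumes that "total responses" means the sum of responses during SD and SA within the same hour, which seems to follow from the two-component schedule but is not stated explicitly; the function name and the response counts are hypothetical.

```python
def discrimination_index(sd_responses: int, sa_responses: int) -> float:
    """Proportion of an hour's responses that occurred during SD.

    Assumes total responses = SD responses + SA responses for that hour.
    """
    total = sd_responses + sa_responses
    if total == 0:
        raise ValueError("index is undefined when no responses were made")
    return sd_responses / total


# Hypothetical hourly counts, not data from the experiment.
print(discrimination_index(1800, 200))  # 0.9
print(discrimination_index(500, 500))   # 0.5
```

On this definition an index of 0.5 means responding was divided equally between the two stimuli, and values below 0.5 mean more responding during SA than during SD, as can happen early in a reversal.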
Because all groups could not be run concurrently, the usual procedure for matching groups was not in this case feasible. However, Fig. 1-1, 1-2, and 1-4 show that the three groups were well matched on performance during hour 1 on the initial discrimination.
Much of the following analysis is in terms of "rate of improvement" of discrimination indices over reversals. This generally refers to the slope of smooth curves fitted by eye to the discrimination index profiles being considered. Inspection shows that in Group 4, discrimination indices increased rapidly during early reversals, but there was little increase after the twelfth reversal. In Groups 1 and 2, more gradual increase was sustained over the whole series of reversals. If a best-fit curve is fitted by eye to each reversal profile, the profiles in Groups 1 and 2 may be summarized by straight lines, and Group 4 by negatively accelerated curves. This difference may be interpreted as due to Group 4 birds rapidly reaching a performance ceiling. This curve-fitting by eye is intended to give a rough indication of the general trend of the reversal profiles, so as to make possible general statements about trends in rate of improvement of discrimination indices over the series of reversals. Where a more accurate statement of rate of improvement is required, the actual increase in discrimination index is referred to. In comparing slopes of best fit lines in Group 1 (Fig. 1-1) with other groups (Fig. 1-2, 1-4), note the expanded scale on the ordinate in Fig. 1-1.
Fig. 1-2. Mean discrimination indices of Group 2 subjects on initial discrimination (reversal 0) and 24 reversals. Hour 1 and hour 2 means are shown separately. Insets show individual curves.
Fig. 1-4. Mean discrimination indices of Group 4 subjects on initial discrimination (reversal 0) and 24 reversals. Hour 1 and hour 2 means are shown separately. Data for hours 3 and 4 are not shown. Insets show individual curves.
All birds in Groups 2 and 4 performed worse during hour 1 of the first two reversals than in the initial discrimination. For each bird in Group 1, however, this was true for only one of the first two reversals. Rate of improvement up to reversal 12, during both hours 1 and 2, was more rapid for all birds in Group 4 than for all birds in Group 2, which in turn improved more rapidly than two out of three birds in Group 1. The exceptional bird, B-12, improved more rapidly than other birds in Groups 1 or 2, but did not approach the rapid rate of improvement over the first 12 reversals shown by all birds in Group 4. The mean function for Group 1 shows an oscillatory effect caused by a tendency of two birds to perform better on odd-numbered than even-numbered reversals. This effect is also seen in other groups, and appears to reflect sustained response biases to either red or green. It may be thought that this effect could have arisen from a tendency for a poor discrimination to be followed by a rapid reversal, since a strong discrimination should be easy to reverse, and vice versa. But the fact that these oscillations sometimes disappeared, together with the fact of interreversal improvement, argues against this notion. In the raw data, this effect appears as a tendency to respond more rapidly on VI 1-min or EXT during one stimulus than during the other.
These biases caused irregularities in the present data, even in the group means, that obscured the more important trends in performance. For this reason further analysis of results is based on averaged performance for successive pairs of reversals. Figure 2 brings together the group means from Fig. 1-1, 1-2, and 1-4, with performance averaged for successive pairs of reversals. Figure 3 shows the same functions for Group 4, for all hours of training. Figure 2 shows that
discriminative performance in hour 1 on the first two reversals was inversely related to amount of training on each discrimination. By reversals 5 and 6, however, this relation was reversed, and remained reversed for subsequent reversals. Mainly because of the superior performance of B-12 in Group 1, the profiles of hour 1 performance of Groups 1 and 2 were not clearly separated beyond reversals 1 and 2. Discriminative performance in hour 2 of all reversals was directly related to amount of training per reversal, i.e., Group 4 indices were highest on all reversals. All curves except Group 4-hour 3 show substantial improvement in discrimination over reversals. The Group 4-hour 3 curve reflects a recovery of responding during SA that probably resulted from running the third and fourth hours of training for Group 4 on a different day from hours 1 and 2. Up to reversal 12, where the curves for Group 4 begin to level off, rate of improvement over reversals was directly related to amount of training per reversal for both hours 1 and 2 of training. Figure 3 shows that after reversal 6, the third and fourth hours of training did not increase discriminative performance in Group 4 even though the highest index was not reached until reversal 14. Despite this, the superior performance of Group 4 was maintained in both hours 1 and 2. Also,
Fig. 2. Mean discrimination indices of all groups on initial discrimination and 24 reversals. Indices are averaged for successive pairs of reversals. Hour 1 (open points) and hour 2 (filled points) means are shown separately. Means for hours 3 and 4 are not shown.
Fig. 3. Mean discrimination indices of Group 4 on initial discrimination and 24 reversals. Means for successive hours are shown separately.
the rate of improvement over reversals 6 to 14 was still greater for Group 4 than Group 2. Group 4 improved from 0.85 to 0.97, Group 2 from 0.73 to 0.75. This is contrasted with the initial discrimination and early reversals, where hour 4 indices were always higher than hour 2 and hour 2 higher than hour 1. These differences notwithstanding, discrimination in the last hour of training on any reversal was always better for Group 4 than Group 2, and for Group 2 than Group 1, even past reversal 6 where this superiority could not have resulted from the additional training on that reversal, since hour 4 performance was no longer an improvement over hour 2 (Fig. 3). Figure 2 also shows that beyond reversal 4 at least, even hour 1 performance was better in Group 4 than in Group 2. This does not always hold for the comparison of Groups 2 and 1, however. These comparisons indicate that the difference in performance between groups on a particular reversal was not merely a result of the additional training on the previous reversal, and is best understood in context of all antecedent training. Figure 4 shows the mean number of responses during SD and SA during each hour of training on selected reversals equally spaced along the series. Group means are shown separately. Performance is again averaged for pairs of reversals. The baseline (hour 0) shows
the values of the last hour of training on the previous reversal. The slopes of lines joining any two SD or SA points are taken to indicate the rates of acquisition and extinction respectively during the hour shown by the second point. Comparison of Groups 4, 2, and 1 on reversals 1 and 2 shows that rates of acquisition (SD) and extinction (SA) were highest for Group 4 and lowest for Group 1. That is, rate of acquisition or extinction (rate of change) in both SD and SA during the first two reversals was directly related to amount of training on the initial discrimination. This relation was true for most other reversals shown, except that beyond reversal 8, rate of change in SA was lower for Group 2 than Group 1. Figure 4 shows that improvement over reversals in all groups resulted from increasing rate of change in responding in both SD and SA. The recovery of responding to SA in hour 3 of training in Group 4 presumably resulted from the time lapse between hour 2 and hour 3 of training. Since there was no comparable reduction in SD responding in hour 3, the increase in SA responding alone produced the low hour 3 discrimination indices shown in Fig. 3. Furthermore, since reversals were always commenced at the beginning of a daily session, this recovery of SA responding was an obvious source of positive transfer from one reversal to the next. This raises the question of how much change shown between baseline and hour 1 responding was change within hour 1 of training and how much was change between reversals? To settle this question, hour 1 performance for all reversals and groups shown in Fig. 4 is elaborated in Fig. 5, where hour 1 performance is broken down into ten 6-min cycles. Each cycle comprises 3 min SA and 3 min SD. The cumulative records from which cycle 1 values are derived show that response rate throughout cycle 1 of SD was very uniform, so that cycle 1 values are a good indication of response rate before the first reinforcement. Figure 5 shows that change in response rate within hour 1 of the reversals was always far greater in SA than SD. Between-reversals change is assessed by comparing cycle 1 values with baseline values. Transformed baseline values are shown on the vertical lines, being obtained by dividing the Fig. 4 baseline values by 10 to give mean responses per cycle during the last hour of the previous reversal. The comparison
shows that between-reversal change was always much greater in SD than SA. In only two cases (Group 4-reversals 7, 8; Group 1-reversals 19, 20) was there any appreciable drop in SA rate in cycle 1 compared to the baseline rate; for other reversals, slight increases and decreases occurred with comparable frequency. Apparently, change between reversals was usually confined to recovery of responding to the SA of the previous day. The magnitude of this effect was directly related to amount of training per reversal, being greatest for Group 4 and least for Group 1. The improvement over the series of reversals in the rapidity of change in SD rate from baseline to the end of hour 1 is gauged by the increase in the difference between baseline and cycle 10 SD values. This difference includes change between reversals (cycle 1-baseline) and change within hour 1 of reversals (cycle 10-cycle 1). Differences between groups on change in SD rate within hour 1 were small compared with the difference in recovery of
responding to the previous SA between reversals and were not systematically related to amount of training per reversal. The difference between SD rates in cycles 1 and 10 increased over reversals most for Group 1, less for Group 4, and not at all for Group 2. Recovery of responding to the previous SA before reversal (cycle 1-baseline) was always greatest in Group 4, but did not increase for Group 4 after reversals (7, 8). For Group 2, it increased progressively up to reversals (19, 20), and for Group 1 it did not increase at all. Putting it another way, the increase over reversals in the difference between baseline and cycle 10 SD rates is due to either increasing within-hour-1 change (Group 1), increasing between-reversals change (Group 2) or both (Group 4). Figure 5 shows that for all groups, the rate of extinction of SA responses in hour 1 increased over reversals. This is deduced from the increasing steepness of smooth curves fitted by eye to the SA curves.
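The partition of hour 1 change used in the Fig. 5 analysis can be sketched as follows. The sketch follows the definitions given in the text: the baseline is the last-hour total from the previous reversal divided by 10, between-reversals change is cycle 1 minus baseline, and within-hour-1 change is cycle 10 minus cycle 1. The class name, function name, and all response counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence

CYCLES_PER_HOUR = 10  # ten 6-min cycles per hour, as described in the text


@dataclass
class RateChange:
    baseline_per_cycle: float  # mean responses per cycle, last hour of previous reversal
    between_reversals: float   # cycle 1 minus baseline
    within_hour_1: float       # cycle 10 minus cycle 1

    @property
    def total(self) -> float:
        """Cycle 10 minus baseline, the sum of the two components."""
        return self.between_reversals + self.within_hour_1


def decompose_change(last_hour_total: float,
                     cycle_responses: Sequence[float]) -> RateChange:
    """Partition change in responding to one stimulus across a reversal."""
    if len(cycle_responses) != CYCLES_PER_HOUR:
        raise ValueError("expected one response count per 6-min cycle")
    baseline = last_hour_total / CYCLES_PER_HOUR
    first, last = cycle_responses[0], cycle_responses[-1]
    return RateChange(baseline, first - baseline, last - first)


# Hypothetical counts for the stimulus that becomes SD on the new reversal.
change = decompose_change(100, [60, 80, 100, 120, 140, 150, 160, 170, 175, 180])
print(change.between_reversals, change.within_hour_1, change.total)  # 50.0 120 170.0
```

The same function can be applied to SA responding, where a negative within-hour-1 value would correspond to extinction during hour 1.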
Fig. 4. Mean responses occurring in SD and SA in each hour of training on selected reversals. Performance is averaged for pairs of reversals. Group means are shown separately. Baseline values (hour 0) are for the last hour of training on the previous reversal.
Fig. 5. Mean responses occurring in SD and SA in each 6-min cycle of training during hour 1 of selected reversals. Performance is averaged for pairs of reversals. Group means are shown separately. Baseline values are mean responses per cycle during the last hour of the previous reversal.
Finally, considering reversals 1 and 2 alone, Fig. 5 shows that change in SD responding within hour 1 (cycle 10-cycle 1) was greatest for Group 1 and equal for Groups 2 and 4, although the differences were small. This is seen by comparing the slopes of straight lines through SD cycles 1 and 10. Recovery of responding to the previous SA (SD cycle 1-baseline) was greatest for Group 4 and least for Group 1, the differences between groups on this being quite large. Changes in SA responding within hour 1 were again greatest for Group 4 and least for Group 1. Figure 2 shows that on reversals (1, 2) the mean hour 1 performance was worst for Group 4 and best for Group 1. However, Fig. 5 shows that the rate of reversal was most rapid in Group 4; that is, the differences between baseline and cycle 10 values were greatest in Group 4. The mean hour 1 performance was worst in Group 4 because a stronger discrimination was being reversed; that is, baseline values of SD and SA were most different for Group 4.
DISCUSSION
Regardless of the amount of training given on each reversal, all subjects improved in discriminative performance over the series of reversals. Only one source of improvement over reversals was common to all groups; this was an increase in the rate of extinction of responding to SA. Another source of improvement, common to Groups 1 and 4, was an increase in rate of acquisition of responding to SD. A third source of improvement, common to Groups 2 and 4, was an increase in the recovery of responding to the SA of the previous reversal, between reversals. This third source of improvement seems analogous to the decrease in perseverative errors over a series of reversals of a simultaneous discrimination reported by both Eimas (1967) and Gonzalez et al. (1966). Gonzalez et al. interpreted this decrease in perseverative errors as the development of "forgetting" between reversals. While such an interpretation could conceivably be made of the increasing between-reversal recovery of responding to SA in this experiment, there is an alternative interpretation that may have more merit. The data of Groups 2 and 4 are consistent with the view that the pigeons learned to respond to both stimuli at the beginning of each session and then discriminate appropriately according to which stimulus was present when responding was reinforced. Group 1, on the other hand, showed no evidence of recovery of responding to SA between reversals, and clearly could not have used such a strategy. Learning to respond to both stimuli could not account for the observed recovery of SA responding between early reversals by Group 4 (Fig. 5, reversals 1, 2), but this may have been due to another factor, such as spontaneous recovery. It is conceivable that all such increases in SA responding between reversals were simply spontaneous recovery, but this is not supported by the fact that the effect increased with successive reversals. Although the increases in SA responding in hour 3 by Group 4 (Fig. 4) may support the spontaneous recovery notion, those data are also consistent with the operation of a strategy of responding to both stimuli at the beginning of each session.
The fact that sessions generally began with a presentation of SD suggests another possible source of improvement over reversals; the birds may have learned to respond to the first stimulus presented, and not the one alternated with it. The data in Fig. 5 do not support this view. In reversals (19, 20) for example, responding was almost equal to both stimuli in the first cycle. Moreover, the cumulative records (not shown) of reversals begun with SA (Group 1, reversal 18; Group 2, reversal 3; Group 4, reversal 14) were very like those of the reversals immediately before and after them.
The observed improvement in performance within reversals, over the series of reversals, is in line with the reduction in reacquisition errors reported by Eimas (1967), but is much stronger than the improvement within reversals reported by Gonzalez et al. (1966). This experiment shares with those just mentioned the finding that overtraining has a facilitative effect in early reversals that is masked by the stronger discriminations it established in the previous reversal. This point of similarity in the results of those experiments and the present study is emphasized by the fact that it survives many procedural differences. This result seems closely related to the common finding that overtraining on a simultaneous discrimination leads to greater initial resistance to extinction, as measured by choices of the previous SD, even though the reversal criterion is reached in fewer trials (Reid, 1958; Mackintosh, 1963; Williams, 1967).
In the present experiment, overtraining had two principal effects. It increased the rate of extinction of SA responding, and it increased the rate of reacquisition of SD responding. The latter effect involved both an increase in SD responding within reversals and an increase in recovery of responding between reversals. Gonzalez et al. (1966) found a facilitative effect of overtraining only in the decrease of perseverative errors between reversals, which is analogous to the between-reversals recovery of responding in the present experiment. Eimas (1967), on the other hand, found a facilitative effect of overtraining on both perseverative and (to a lesser extent) reacquisition errors, which is consistent with the between-reversals and within-reversals effects found in the present experiment.
The effect of overtraining on a single reversal of a successive discrimination has been studied by Birch, Ison, and Sperling (1960). Unlike the present study, they found only one source of facilitation of reversal performance; this was an increase in the rate of extinction of SA responses. Theoretical accounts of the effects of overtraining on discrimination reversal (e.g., Mackintosh, 1969) are at present restricted to simultaneous discrimination. Mackintosh (1969) takes the view that the facilitative effects of overtraining on reversal of successive and simultaneous discriminations are not closely related. However, the similarity of the effects of overtraining in the present study and in those of Gonzalez et al. (1966) and Eimas (1967) may possibly be taken as evidence to the contrary.
REFERENCES
Birch, D., Ison, J. R., and Sperling, S. E. Reversal learning under single stimulus presentation. Journal of Experimental Psychology, 1960, 60, 36-40.
Capaldi, E. J. and Senko, M. G. Effect of trials-per-problem on successive discrimination reversal learning. Psychological Reports, 1961, 8, 227-232.
Eimas, P. D. Effects of overtraining and irrelevant stimuli on successive position reversals in rats. Psychonomic Science, 1967, 7, 259-260.
Ferster, C. B. and Skinner, B. F. Schedules of reinforcement. New York: Appleton-Century-Crofts, 1957.
Fleshler, M. and Hoffman, H. S. A progression for generating variable-interval schedules. Journal of the Experimental Analysis of Behavior, 1962, 5, 529-530.
Gonzalez, R. C., Berger, B. D., and Bitterman, M. E. Improvement in habit-reversal as a function of amount of training per reversal and other variables. American Journal of Psychology, 1966, 79, 517-530.
Mackintosh, N. J. Extinction of a discrimination habit as a function of overtraining. Journal of Comparative and Physiological Psychology, 1963, 56, 842-847.
Mackintosh, N. J. Further analysis of the overtraining reversal effect. Journal of Comparative and Physiological Psychology, Monograph, 1969, 67, No. 2, Part 2.
Pubols, B. H., Jr. Serial reversal learning as a function of the number of trials per reversal. Journal of Comparative and Physiological Psychology, 1962, 55, 66-68.
Reid, R. L. Discrimination-reversal learning in pigeons. Journal of Comparative and Physiological Psychology, 1958, 51, 716-720.
Williams, D. I. The overtraining reversal effect in the pigeon. Psychonomic Science, 1967, 7, 261-262.
Received 30 April 1970.