The Psychological Record, 2012, 62, 707–718
EXTINCTION OF THE DISCRIMINATIVE STIMULUS EFFECTS OF NICOTINE WITH A DEVALUED REINFORCER: RECOVERY FOLLOWING REVALUATION
Joseph R. Troisi II, Erin Bryant, and Jennifer Kane
Saint Anselm College, Manchester, New Hampshire
Extinction and recovery of the discriminative stimulus effects of nicotine (0.3 mg/kg) were investigated with a devalued food reinforcer (rats sated). Sixteen rats were trained in a one-manipulandum (nose-poke) drug discrimination procedure with the roles of nicotine and saline counterbalanced as SD and SΔ. Discrimination training was maintained and then extinguished with the devalued reinforcer. Devaluation of the reinforcer diminished SD response rates during discrimination training but not discriminative control. Following delays after extinction, recovery of responding occurred with the revalued but not devalued reinforcer. These data demonstrate that (a) discriminative control by nicotine is temporally stable with a devalued reinforcer following acquisition and extinction, (b) revaluation of the reinforcer promotes recovery of discriminative control, and (c) recovery of interoceptive discriminative control by nicotine following extinction is affected by changes in motivation. Theoretical implications regarding drug replacement therapy and cue-exposure therapy are discussed.

Key words: cue-exposure therapy, drug discrimination, establishing operations, extinction, devaluation, satiety, interoception, nicotine, spontaneous recovery, rats

An operant discriminative stimulus (SD) sets the occasion (Skinner, 1938) for a response–reinforcer relationship. Conversely, the SΔ occasions nonreinforcement of a response. Differential responding between SD and SΔ stimulus conditions defines operant discriminative control. When methodologically arranged, a drug state can function as either an SD or an SΔ. Discriminative control by drug states, as determined by response rate differences in the SD and SΔ conditions as well as an index of discrimination (i.e., %SD responses), is stable over time. For example, this laboratory (Troisi, 2003) previously demonstrated that discriminative control by nicotine (vs. saline) remained active 89 days following the final training and testing sessions in rats. These data are consistent with other reports showing that discriminative functions of drugs endure the passage of time (Henriksson & Järbe, 1972; Li & McMillan, 2003; McMillan, 1987; Spear, Smith, Sheer, & Bryan, 1979).

This work was supported by New Hampshire IDeA Network of Biological Research Excellence (NHINBRE) NIH Grant Number 1P20RR030360-01 from the INBRE Program of the National Center for Research Resources. We thank Dr. Torbjörn U. C. Järbe, of Northeastern University Center for Drug Discovery, for the many discussions regarding methodology and results and Dr. Bernard Balleine, of the University of California Los Angeles, for thoughtful comments on an earlier version of this manuscript. Special thanks go to Devan Brazil, Dustin MacConnell, and Donna Pioli for their diligent animal husbandry. Correspondence concerning this article should be addressed to Joseph R. Troisi II, Psychology Department, Saint Anselm College, 100 St. Anselm Dr., Manchester, NH 03102. E-mail: [email protected]
TROISI ET AL.
Extinction of interoceptive stimulus control by drugs has also been reported (Rijnders, Järbe, & Slangen, 1993; Zarcone & Ator, 2000). This laboratory has systematically evaluated the impact of various extinction procedures on the discriminative stimulus effects of nicotine (Troisi, 2003a, 2003b, 2006, 2011; Troisi, LeMay, & Järbe, 2010). Acquisition and extinction of discriminative control were carried out using a counterbalanced one-manipulandum (go/no-go across sessions) drug discrimination procedure (Schaal, McDonald, Miller, & Reilly, 1996). On some sessions a drug condition functions as the SD and sets the occasion for the response to be reinforced on a variable interval schedule, whereas on other sessions an alternate drug (or vehicle) occasions nonreinforcement and functions as the SΔ. Counterbalancing the role of the drug(s) as SD and SΔ across animals fosters an explicit drug–reinforcer relationship for some animals and an explicit drug–nonreinforcer relationship for others; it also controls for potential unconditioned effects on responding (cf. Colpaert, 1977). Extinction of the response in a nondrug state (saline) does not undermine discriminative control by nicotine or ethanol (Troisi, 2003a), whereas extinction of the response in the drug state substantially reduces response rates and discriminative control (Troisi, 2003a, 2003b). Additionally, Pavlovian extinction (i.e., the presentation of the drug-SD alone without response manipulanda present) following drug discrimination training impacts response rates and discriminative control as a function of the context in which nicotine, but not alcohol, is presented (Troisi, 2011). More relevant to the present study, there was no evidence of spontaneous recovery of discriminative control by nicotine 2 or 4 weeks following extinction; however, response-independent food delivery without the levers present reinstated discriminative control 24 hours later (Troisi, 2003b).
These last findings suggest that extinction training does not eliminate learning about the relationships originally established among the drug-SD, the response, and the primary reinforcer but rather promotes newer learning regarding nonreinforcement of operant responding under the drug conditions. Thus, extinguished discriminative control by drug states appears to be stable following extinction training (Troisi, 2003a, 2003b) just as discriminative control endures the passage of time following drug discrimination training (Troisi, 2003b). This may be one important distinction between exteroceptive stimulus control and interoceptive stimulus control produced by drug states. The clinical relevance of extinction of interoceptive stimulus control by drugs has guided this laboratory’s line of research. It has been suggested elsewhere that the drug discrimination paradigm might be a useful methodology for simulating how interoceptive states evoke drug-seeking and drug-taking behavior (see Troisi, 2003b, p. 590, for a detailed discussion). Interoceptive (emotions, stressors, drug states) and exteroceptive (social contexts) stimuli likely interact in modulating drug-seeking behavior as noted previously (Troisi, 2003b, 2006, 2011; Troisi & Akins, 2004). Cue exposure therapy (CET) is based on Pavlovian extinction in which drug abusers are merely exposed to drug-related stimuli (O’Brien, Childress, McLellan, & Ehrman, 1990). Historically, the goal has been to alleviate withdrawal-like symptoms and drug “craving.” Unfortunately, its reliability has been equivocal (Conklin & Tiffany, 2002). Predicated on these theoretical hypotheses, this laboratory has sought to extinguish discriminated operant responding under interoceptive stimulus control by drug states in an effort to simulate behavioral functions that might contribute to a clearer understanding of drug-abuse treatment and relapse. 
Changes in food deprivation/restriction level (“motivation”) affect the rate of operant responding and hence reinforcing efficacy (Belke & Kwan, 2000). In all of the drug discrimination extinction studies conducted by this laboratory (as summarized earlier), rats were maintained at 80% of their free-feeding weights during initial drug discrimination training and extinction. One method of modulating the effectiveness of extinction learning is to alter the efficacy of the reinforcer. For instance, food restriction increases the reinforcing efficacy (“value”) of food pellets, whereas sating the organism decreases that value (i.e., “devaluation”; Balleine & Dickinson, 1994; Michael, 1982, 1993). Devaluation of the reinforcer would be expected to undermine extinction
NICOTINE DISCRIMINATION
learning (Balleine, 1992; Dickinson & Balleine, 1994; Colwill & Rescorla, 1985). A study by Balleine, Ball, and Dickinson (1994) reported that the discriminative stimulus effects of midazolam modulated the effects of feeding states on instrumental performance during training and brief periods of extinction. In a related study, Massey and McMillan (1987) demonstrated that the discriminative stimulus effects of phencyclidine were unaffected by changes over a wide range in percentage of ad lib feeding weights. Response rates decreased as body weight approached 90% ad lib feeding weights, but discriminative control was unaffected. That study was conducted in view of reports demonstrating that food restriction increases the reinforcing effects of drugs (Carroll & Meisch, 1984; cf. Bongiovanni & See, 2008), but not to evaluate extinction of the discriminative stimulus functions of drug states.

The clinical parallel between extinction of stimulus control with a devalued reinforcer and substance abuse treatment is subtle but perhaps theoretically relevant and important: First, drug replacement therapy (DRT; e.g., nicotine) prevents, postpones, or minimizes withdrawal symptoms, rendering the client somewhat less responsive to drug-related stimuli (e.g., Waters et al., 2004). Second, DRT may only temporarily devalue the reinforcing effects of the original drug of choice. Finally, exposure to drug-related stimuli while the drug abuser is maintained on DRT may minimize (if not prevent) the effectiveness of extinction learning. For instance, heroin abusers come into contact with many drug-related cues in methadone maintenance and continue to exhibit reactivity (Langleben et al., 2008); they often relapse (Joe, Simpson, & Sells, 1994). Theoretically, when taken off DRT, either abruptly or perhaps gradually, responsiveness to exteroceptive drug-related cues might be expected to increase, thereby promoting relapse. Langleben et al. 
(2008) noted that responsiveness to drug cues is greatest during longer periods between doses of methadone. Interoceptive cues previously correlated with drug taking would also be expected to evoke drug craving and increase the chances of relapse. In consideration of (a) the basic experimental operations regarding extinction of the discriminative stimulus functions of nicotine, (b) the theory underlying extinction of interoceptive stimulus control, and (c) the potential clinical relevance regarding DRT and CET, the present investigation sought to extinguish the interoceptive discriminative stimulus function of nicotine with a devalued reinforcer (i.e., while sated under ad lib feeding) and to test recovery following delays after extinction with a devalued reinforcer and a revalued reinforcer (i.e., return to 80% ad lib feeding weight). It was predicted that sating the rats with ad lib feeding would lower operant response rates in the SD state but sustain discriminative control as demonstrated by Massey and McMillan (1987). It was further predicted that following extinction of the drug discrimination in the sated state (devaluation), returning the rats to their 80% weights (revaluation) would promote recovery of response rates in the SD state (but not SΔ) and that discriminative control would hence return.
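To make the index of discrimination (%SD responses) mentioned in the introduction concrete, the following Python sketch computes it from a pair of test sessions. It assumes the index is SD responses divided by total (SD plus SΔ) responses, multiplied by 100; this excerpt names the index but does not spell out the computation. The function name `percent_sd` and all response counts are hypothetical and are not data from this study.

```python
def percent_sd(sd_responses: float, s_delta_responses: float) -> float:
    """Return SD responses as a percentage of all test responses.

    Assumed formula: 100 * SD / (SD + S-delta). This is an illustrative
    convention, not a computation quoted from the article.
    """
    total = sd_responses + s_delta_responses
    if total == 0:
        raise ValueError("no responses in either condition; index is undefined")
    return 100.0 * sd_responses / total


# Hypothetical counts from two 5-min tests while food restricted.
restricted = percent_sd(60, 15)

# Hypothetical counts while sated: both rates halved, ratio unchanged.
sated = percent_sd(30, 7.5)

print(restricted, sated)  # 80.0 80.0
```

Under this assumed definition, a proportional drop in responding in both conditions leaves the index unchanged, which is the pattern the first prediction describes: lower SD response rates with discriminative control sustained.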
Method

Animals

Sixteen 11-month-old male Sprague-Dawley rats (Harlan Breeders, Indianapolis, IN) were used in accord with this institution’s IACUC policies. The rats had histories of exposure to nicotine and light–food pairings in a Pavlovian pilot study 5 months before the start of the present investigation (note that drug condition assignments in the present study were based on these histories). Prior to that study, rats were experimentally naïve, group housed, and maintained on ad lib feeding with free access to water since their arrival 6 months earlier. Ad lib weights at the start of the study averaged 435 g prior to feeding restriction. The rats were gradually brought down to 80% of their ad lib weights (mean = 348 g). They were maintained at 80% (restricted feeding) or returned to their ad lib weights (sated) during different conditions throughout the study. Rats were housed
individually in stainless steel hanging cages with a Plexiglas liner above the mesh floor (to allow floor bedding) and maintained on a 12-hour light–dark cycle (0700–1900 hours). All rats were weighed daily. During restricted feeding, the rats were provided approximately 10 to 20 g of 5P00 Prolab RMH 3000 (depending on body weights) at 1730 hours. During ad lib feeding, rats had continuous access to chow throughout the day.
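The restricted-feeding target reported above (80% of a mean ad lib weight of 435 g, giving 348 g) is simple arithmetic. A minimal Python sketch follows; the function name `target_weight` is hypothetical, and only the 435 g mean and the 80% criterion come from the text.

```python
def target_weight(ad_lib_g: float, fraction: float = 0.80) -> float:
    """Return the restricted-feeding target weight in grams."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return ad_lib_g * fraction


# Mean ad lib weight reported in the text.
print(round(target_weight(435)))  # 348
```

In practice each rat would have its own target computed from its own ad lib weight; the mean is used here only because it is the value the article reports.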
Apparatus

Sessions took place in eight stainless-steel operant chambers (Med-Associates, Georgia, VT, model ENV-001), measuring L 28 × W 21 × H 21 cm. Each chamber was equipped with a food magazine, centrally located on the front panel 7 cm above the grid floor, that delivered 45 mg food pellets (base formula; Bio-Serv, Frenchtown, NJ). Nose-poke manipulanda (Med-Associates) were located 2 inches from the rear steel wall on the adjacent Plexiglas wall to the left of the food magazine at the opposite end of the chamber. Nose-poke entry by the rat’s snout triggered a photocell, which was recorded by the Med-PC software via computer in an adjacent room. The chambers were spaced 2 to 3 feet apart about the perimeter of the sound- and light-attenuated experimental conditioning room, measuring L 16.5 × W 9 feet. White noise from an antenna-less television was present during all training and test sessions.
Procedure

Initial drug discrimination acquisition (feeding restriction). Rats were maintained at 80% of their free-feeding weights during this acquisition phase and during the initial tests for discriminative control (see below). Nose poking required no specific training and was quickly established on a fixed ratio of one (FR-1) for 20 minutes on the first day. Over the next three sessions, responding was gradually titrated to a variable interval 30-second schedule (VI-30 s).

Drug discrimination training. Drug discrimination training commenced on the fifth session. (-)-Nicotine di-tartrate (Research Biochemicals International, Natick, MA, USA) calculated as base was dissolved in 0.9% saline. Intraperitoneal administration of either nicotine (0.3 mg/kg) (N) or saline (S) occurred prior to all training and test sessions and was delivered in a volume of 1 ml/kg 10 min prior to the start of each session. The dose of nicotine was selected based on prior work by the authors with this dose. For eight rats (N+/S− trained) nicotine (N+) was administered 10 min before sessions of food-reinforced nose-poking and functioned as the SD condition (i.e., go). For these same eight rats, saline (S−) functioned as the SΔ condition (i.e., no-go). For the other eight rats (S+/N− trained) the stimulus roles of nicotine and saline were reversed. Training sessions lasted 30 min and were conducted every day. SD and SΔ sessions alternated quasirandomly from day to day under the condition that no more than two consecutive sessions of the same type occurred. There were a total of 20 initial training sessions (9 SD and 11 SΔ). It should be noted here that although there were an uneven number of SD and SΔ sessions, it was previously reported that SΔ sessions may be more critical for establishing discriminative control with a one-manipulandum drug discrimination procedure (see Troisi et al., 2010, p. 178, for a more detailed discussion of the role of the SΔ in drug discrimination learning). 
Thus two more SΔ sessions were embedded in acquisition training.

Initial testing (feeding restriction). Over the next 2 days, two 5-min nonreinforcement tests were conducted, one with the SD condition and the other with the SΔ condition. Eight rats received nicotine on the first test day and saline on the second test day. For four of these eight rats, the SD condition occurred first and was followed by the SΔ condition on the next day; the other four rats received the opposite test order. The remaining eight rats received saline on the first test day and nicotine on the second test day; again, the test order was counterbalanced across the 2 test days. Thus the stimulus roles of nicotine and saline, and the test order, were completely counterbalanced across all 16 rats.
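The quasirandom ordering of acquisition sessions (9 SD and 11 SΔ, with no more than two consecutive sessions of the same type) can be illustrated with a short Python sketch. This is not the authors' scheduling method, which the article does not describe; it is a simple rejection-sampling approach under that reading of the constraint. The function names, the `"SDelta"` label, and the seed are hypothetical.

```python
import random
from itertools import groupby


def longest_run(sessions):
    """Length of the longest run of identical consecutive session types."""
    return max(len(list(group)) for _, group in groupby(sessions))


def quasirandom_sessions(n_sd, n_sdelta, max_run=2, seed=None, max_tries=100_000):
    """Shuffle session labels until no run exceeds max_run (rejection sampling)."""
    rng = random.Random(seed)
    sessions = ["SD"] * n_sd + ["SDelta"] * n_sdelta
    for _ in range(max_tries):
        rng.shuffle(sessions)
        if longest_run(sessions) <= max_run:
            return list(sessions)
    raise RuntimeError("no valid order found; the constraint may be unsatisfiable")


# Session counts from the acquisition phase described in the text.
order = quasirandom_sessions(9, 11, seed=1)
print(order)
```

Rejection sampling is convenient here because the sequence is short and valid orders are common enough to find within a few dozen shuffles on average; a constructive algorithm would be preferable for much longer sequences or tighter constraints.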
Drug discrimination training while sated (reinforcer devaluation). Over the next week, all 16 rats were provided ad lib access to lab chow in their home cages and weighed daily. Drug discrimination training resumed as described above over the next 11 sessions. There were 7 SD and 4 SΔ sessions conducted. More SD than SΔ sessions were conducted to determine the impact of satiety on SD response rates. Two additional 5-min tests were conducted as described above.

Extinction under satiety (reinforcer devaluation). During this phase, the rats were maintained on ad lib feeding to sustain the reinforcer devaluation. Over the next 11 sessions (7 SD and 4 SΔ), nose-pokes were not reinforced with either the nicotine or saline condition. There were three more SD sessions than SΔ sessions to foster extinction in the SD conditions.

Within-group tests with revaluation versus devaluation. The rats were matched based on the percentage decrease in responding in the SD condition from initial testing at 80% weights to responding in the SD condition while maintained on ad lib food. Eight rats were then randomly assigned to the reinforcer revaluation condition and returned to their 80% weights over the next 13 days. The remaining eight rats were maintained on ad lib feeding (i.e., devaluation) in their home cages. Within each condition of eight rats, there were four rats with N+/S− training histories and four with S+/N− training histories. Thirteen days were necessary to return the eight rats to their 80% weights. Therefore a 13-day delay took place between the final extinction session (described above) and the tests that followed. Over the next 2 days, two 5-min tests were conducted. Over the next 2 weeks, the eight rats that were initially tested at their 80% weights were placed on ad lib feeding. The other eight rats that were initially tested on ad lib feeding were gradually returned to their 80% weights. Two additional nonreinforcement tests occurred over the next 2 days. 
Consequently, all rats were tested under both feeding conditions allowing for within-group comparisons and control for order of testing.
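The matched assignment and crossover described above can be sketched in Python as follows. The article does not state how matching was implemented, so this sketch assumes one plausible reading: within each training history, rats are ranked by percentage decrease in SD responding, paired with their nearest neighbor, and one member of each pair is randomly assigned to be tested first under revaluation. The function name, rat identifiers, and percentage values are all hypothetical and are not data from this study.

```python
import random


def matched_assignment(rats, seed=None):
    """Assign each rat a first test condition using matched pairs.

    rats: list of (rat_id, training_history, pct_decrease) tuples.
    Returns a dict mapping rat_id to "revalued" or "devalued".
    """
    rng = random.Random(seed)
    assignment = {}
    for history in sorted({h for _, h, _ in rats}):
        # Rank rats within a training history by percentage decrease.
        group = sorted((r for r in rats if r[1] == history), key=lambda r: r[2])
        # Pair adjacent ranks and split each pair at random.
        for i in range(0, len(group), 2):
            conditions = ["revalued", "devalued"]
            rng.shuffle(conditions)
            for (rat_id, _, _), condition in zip(group[i:i + 2], conditions):
                assignment[rat_id] = condition
    return assignment


# Hypothetical rats: 8 per training history, made-up percentage decreases.
rats = [(f"R{i + 1:02d}", "N+/S-" if i < 8 else "S+/N-", 40.0 + 3.0 * (i % 8))
        for i in range(16)]

first_test = matched_assignment(rats, seed=0)

# Crossover: the second pair of tests uses the other feeding condition.
second_test = {rat_id: "devalued" if cond == "revalued" else "revalued"
               for rat_id, cond in first_test.items()}

for history in ("N+/S-", "S+/N-"):
    n_revalued = sum(1 for rat_id, h, _ in rats
                     if h == history and first_test[rat_id] == "revalued")
    print(history, "revalued first:", n_revalued)  # 4 for each history
```

Pairing within training history guarantees the four-and-four split per condition reported in the text, and swapping conditions for the second pair of tests reproduces the within-group comparison with order of testing controlled.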
Results

Training Data

The training results are displayed in Figure 1. To analyze changes in SD response rates across the three phases, a 2 (group: N+ vs. S+) by 3 (phases: initial acquisition, training under satiety, and extinction under satiety) repeated-measures ANOVA was conducted (α = .05). The data were averaged across SD training sessions for each group across the initial acquisition (9 sessions), across training under satiety (7 sessions), and across extinction under satiety (7 sessions). There was no significant difference in response rates with the SD conditions between N+ rats (top) and S+ rats (bottom) across all three phases. There was a significant decrease in SD response rates across the three phases, F(1, 14) = 384.28, p