Distinct Roles for the Amygdala and Orbitofrontal Cortex in Representing the Relative Amount of Expected Reward

Highlights
- Monkeys' behavior tracked the relative reward magnitude predicted by stimuli
- Neurons in orbitofrontal cortex (OFC) and the amygdala encode this relative magnitude
- Relative-magnitude-coding neurons in OFC update faster than their amygdala counterparts
- Relative-magnitude-coding neurons in OFC but not in the amygdala integrate reward history

Authors
Rebecca A. Saez, Alexandre Saez, Joseph J. Paton, Brian Lau, C. Daniel Salzman

Correspondence
[email protected]

In Brief
Saez et al. study the mechanisms by which relative reward magnitude is represented in the brain. Orbitofrontal cortex (OFC) and the amygdala both encode relative reward, but OFC plays a predominant role in its computation and updating.

Saez et al., 2017, Neuron 95, 70–77
July 5, 2017 © 2017 Elsevier Inc.
http://dx.doi.org/10.1016/j.neuron.2017.06.012

Neuron Report

Distinct Roles for the Amygdala and Orbitofrontal Cortex in Representing the Relative Amount of Expected Reward

Rebecca A. Saez,1,7 Alexandre Saez,1,7 Joseph J. Paton,1,5 Brian Lau,1,6 and C. Daniel Salzman1,2,3,4,8,*

1Department of Neuroscience, Columbia University, 1051 Riverside Drive Unit 87, New York, NY 10032, USA
2Kavli Institute for Brain Sciences
3Department of Psychiatry, Columbia University, 1051 Riverside Drive Unit 87, New York, NY 10032, USA
4New York State Psychiatric Institute, 1051 Riverside Drive Unit 87, New York, NY 10032, USA
5Present address: Champalimaud Research, Champalimaud Centre for the Unknown, Avenida de Brasília, Lisbon 1400-038, Portugal
6Present address: Sorbonne Universités, UPMC Univ. Paris 06, Inserm, CNRS, Institut du Cerveau et de la Moelle épinière (ICM) – Hôpital Pitié-Salpêtrière, Boulevard de l'hôpital, F-75013, Paris, France
7These authors contributed equally
8Lead Contact
*Correspondence: [email protected]
http://dx.doi.org/10.1016/j.neuron.2017.06.012

SUMMARY

The same reward can possess different motivational meaning depending upon its magnitude relative to other rewards. To study the neurophysiological mechanisms mediating assignment of motivational meaning, we recorded the activity of neurons in the amygdala and orbitofrontal cortex (OFC) of monkeys during a Pavlovian task in which the relative amount of liquid reward associated with one conditioned stimulus (CS) was manipulated by changing the reward amount associated with a second CS. Anticipatory licking tracked relative reward magnitude, implying that monkeys integrated information about recent rewards to adjust the motivational meaning of a CS. Upon changes in relative reward magnitude, neural responses to reward-predictive cues updated more rapidly in OFC than amygdala, and activity in OFC but not the amygdala was modulated by recent reward history. These results highlight a distinction between the amygdala and OFC in assessing reward history to support the flexible assignment of motivational meaning to sensory cues.

INTRODUCTION

A sensory stimulus can possess innately reinforcing qualities, or it may acquire motivational significance by learning how it predicts a rewarding or aversive event. However, changes in the external environment or in internal homeostatic state can cause humans and other animals to adjust their assessment of the motivational significance of a stimulus. For example, an innately reinforcing stimulus, such as a sweet tastant, can be more or less motivationally significant depending upon internal variables such as thirst, hunger, or satiation to that particular taste. Similarly, external variables can affect the motivational significance of a stimulus. The reinforcement contingencies of a stimulus can change, or environmental conditions can change, which will in turn adjust the motivational significance of a particular reward. To wit, a glass of a Premier Cru red Burgundy wine may be quite rewarding if presented alongside a range of inferior wines, but the same glass of wine might be less appealing if tasted alongside a Grand Cru. In other words, if the Premier Cru is presented in each of two contexts defined by the other wines tasted on that occasion, then it can have different motivational meaning. Flexible cognitive and emotional behavior therefore requires neural mechanisms that adjust assessments of motivational significance to contextual changes defined by internal or external variables.

Two anatomically interconnected brain structures commonly implicated in representing the motivational significance of sensory stimuli are the amygdala and OFC (Everitt et al., 2003; Morrison and Salzman, 2010; Padoa-Schioppa and Cai, 2011; Price, 2003; Salzman and Fusi, 2010; Schultz, 2015; Stefanacci and Amaral, 2002; Wallis, 2011). Neural responses to sensory stimuli possessing innate or learned motivational meaning have been described in both the OFC and amygdala (Belova et al., 2008; Bermudez and Schultz, 2010a, 2010b; Gore et al., 2015; Hikosaka and Watanabe, 2000; Kennerley et al., 2009; Morrison et al., 2011; Morrison and Salzman, 2009; Nishijo et al., 1988a, 1988b; Padoa-Schioppa and Assad, 2006, 2008; Paton et al., 2006; Pritchard et al., 2005; Rich and Wallis, 2016; Rolls et al., 1996; Saez et al., 2015; Shabel and Janak, 2009; Tremblay and Schultz, 1999; Wallis and Miller, 2003). In both structures, different ensembles of neurons respond preferentially to rewarding or aversive CSs or USs (Belova et al., 2007, 2008; Gore et al., 2015; Janak and Tye, 2015; Kim et al., 2016; Morrison et al., 2011; Morrison and Salzman, 2009; Padoa-Schioppa and Assad, 2006; Paton et al., 2006; Redondo et al., 2014; Zhang et al., 2013).

Figure 1. Contrast Revaluation Task and Behavior (A) Sequence of events for trials (Fixation, CS appearance, trace interval, US delivery). CS2 predicts a small reward in blocks 1 and 3 and a large reward in block 2. CS1 predicts a medium reward in all blocks. (B) Licking rate (smoothed with 5 trial sliding window) plotted versus trial number for CS1 and CS2 for one experiment. Block switches, vertical dashed lines. (C) Mean normalized licking rate across experiments plotted versus trial number relative to the first (left) and second (right) revaluation. The sequence of trials in each panel is the longest sequence common to all experiments. Orange line, CS1; purple line, CS2. Shading, SEM. (D) Average licking rate for each CS. Error bars: SEM. **p < 1e-06.

Neural responses to a CS in OFC and amygdala change rapidly when reinforcement contingencies change (Hobin et al., 2003; Morrison et al., 2011; Paton et al., 2006; Saez et al., 2015; Schoenbaum et al., 1998). However, the motivational significance of a CS can also change through an adaptive process even when its reinforcement contingencies do not change. Prior studies have shown that neural responses to a CS in OFC exhibit adaptation (Kobayashi et al., 2010; Padoa-Schioppa, 2009), but it remains unknown whether the same applies to the amygdala, as adaptation there has only been observed for responses to USs (Bermudez and Schultz, 2010b).

Here we employed a contrast revaluation paradigm to manipulate the relative motivational meaning of a reward-predictive CS without changing its reinforcement contingencies. This paradigm changes the relative reward magnitude, but not the absolute reward magnitude, predicted by a CS by manipulating the reward magnitude predicted by another CS. The adjustment of motivational significance in this circumstance requires integrating reward history information about both types of trials. We observed that updating neural representations of motivational significance using this paradigm occurred more rapidly in OFC than in the amygdala. Consistent with this finding, recent reward history influenced responses to a CS in OFC but not in the amygdala. These data implicate OFC, but not the amygdala, in flexible adjustments of relative motivational meaning driven by reward history.

RESULTS

Behavior
We trained two rhesus monkeys to perform a contrast revaluation task that manipulates the relative reward magnitude predicted by a stimulus while holding the absolute amount of reward predicted constant (Figure 1A; see STAR Methods). On each trial, the monkey was presented with either CS1 or CS2, followed by a trace interval and US delivery. Experiments had three blocks of trials. In block 1, CS1 predicted a medium reward and CS2 predicted a small reward. In block 2, CS1 still predicted a medium reward but CS2 predicted a large reward. Block 3 contained the same reinforcement contingencies as block 1. Thus, the relative reward magnitude predicted by CS1 changed from block to block despite CS1 predicting a fixed quantity of reward. Anticipatory licking during the interval between CS presentation and reward delivery (an approach behavior that indicates expectation of reward) was employed to assay the monkeys' subjective assessment of the motivational significance of each CS.

Figure 1B shows the anticipatory licking rate during a representative experiment. The monkey licks more after the CS that predicted the larger reward in each block. Moreover, anticipatory licking in response to CS1 decreases during block 2 even though the predicted reward amount remains constant. During block 3, the licking response to CS1 increases again, but not to the same level as in block 1. The lower licking rate for CS1 in block 3 may reflect satiation.


Figure 2. Relative Reward Amounts Modulate Single-Neuron Responses in Amygdala and OFC (A) Amygdala cell. Top: raster. Bottom: PSTHs. Vertical lines: CS onset, offset, and US onset. Orange, CS1; purple, CS2. (B) Firing rate (smoothed with 5-trial sliding window) during CS and Trace intervals plotted versus trial number for the same cell. Block switches, vertical dashed lines. (C and D) OFC cell, same conventions as (A) and (B). See also Figure S1.

Figure 1C shows the average licking rate across experimental sessions for CS1 and CS2. The licking rate for CS1 during block 2 is significantly lower than during block 1 (Figure 1D, p < 1e-06, Wilcoxon sign-rank test), an observation also true for each monkey individually. The mean licking rate increased significantly for block 3 compared to block 2 (p < 1e-06, Wilcoxon sign-rank test). Overall, these results imply that anticipatory licking is not just a response to the sensory properties or the volume of the anticipated reward. Instead, anticipatory licking indicates the amount of expected reward relative to other rewards received within a block of trials.

Neural Correlates of Relative Reward Magnitudes
We recorded the activity of 166 amygdala neurons (100 and 66 in each monkey, respectively) and 96 OFC neurons (62 and 34) during performance of the contrast revaluation task. Recorded cells were mainly in the basolateral complex of the amygdala and area 13/13m of OFC (Figure S1). Neurons that respond most strongly to a CS predictive of either large or small rewards were observed in both amygdala and OFC, consistent with prior studies (Belova et al., 2007; Morrison et al., 2011; Morrison and Salzman, 2009; Padoa-Schioppa and Assad, 2006; Paton et al., 2006; Peck et al., 2013).


Figures 2A and 2B show an amygdala neuron that prefers large over small rewards in all three blocks, while Figures 2C and 2D show an OFC neuron that prefers small over large rewards. We refer to these two types of selectivity as "positive" and "negative" neurons, respectively. The example cells depicted in Figure 2 exhibited activity that was modulated both by changes in the expected reward magnitude (CS2) and by the relative magnitude of the reward (CS1). We next identified neurons whose responses were sensitive to absolute expected reward amount using an ANOVA (see STAR Methods), and then asked whether those same neurons also modulated their responses to changes in the relative amount of reward predicted by CS1. We found 101 amygdala neurons (61%) and 55 OFC neurons (57%) that were selective to reward magnitude. Of these neurons, about half showed positive selectivity (53/101 in the amygdala; 25/55 in OFC) and the other half showed negative selectivity. Figures 3A and 3C show the mean normalized (Z scored) firing rate of all neurons selective for reward magnitude plotted as a function of trial number in the amygdala and OFC. Across the populations of neurons recorded, the selected neurons were significantly modulated by changes in the relative amount of reward (p < 0.01, Wilcoxon sign-rank test); the observed modulation by relative reward magnitude is comparable in amplitude to that elicited by changes to the actual reward amounts associated with CS2 (Figures 3B and 3D). These results held when considering each monkey individually (p < 0.05, Wilcoxon test).

Figure 3. Relative Reward Amounts Modulate the Average Responses of Neurons Sensitive to Reward Magnitude (A) Mean firing rate of amygdala neurons selective to amount of expected reward plotted as a function of trial number relative to the first (left) and second (right) revaluation. Each panel shows the longest sequence common to all averaged neurons. Orange, purple lines: CS1, CS2 trials. Shading, SEM. (B) Mean firing rate of neurons in (A) for each block. Error bars: SEM. **p < 0.01 (Wilcoxon sign-rank test). (C and D) Same as (A) and (B) except for OFC neurons.

Figure 3 demonstrates that neurons selective to expected reward magnitude are on average also selective to relative reward size. We then fit, for each neuron, a linear regression modeling firing rate as a sum of regressors reflecting block type and the interaction between block type and CS; we used this fit to classify neural activity as being significantly modulated by expected relative reward (bold orange in Figure S2; see STAR Methods). We found 54 cells (33%) in the amygdala and 28 (29%) in OFC that met this criterion, and we refer to those cells as encoding the motivational significance of a CS.

The Dynamics of Updating Representations of Motivational Significance Differ between OFC and Amygdala
We investigated the dynamics of modulation in neurons in OFC and amygdala that change their responses when the relative amount of a reward changes, but not its absolute magnitude. For each CS1 trial following the revaluation in block 2, we computed the mean change in neural activity as a function of time within the trial (see STAR Methods). Figures 4A and 4B show the average change across all neurons in OFC (A) and the amygdala (B) that encode motivational significance adjusted by changes in relative reward amounts; in this plot, the activity of negative neurons was sign-flipped before averaging with positive neurons. The identified neurons in OFC show a faster and more pronounced decrease in their response to CS1 following the revaluation of CS2 compared to neurons in the amygdala.

The change in activity during the CS presentation interval (shifted by 90 ms to account for visual response latency) is significantly greater in OFC than in the amygdala for the first three trials after revaluation, a result that was also observed when considering each monkey individually (p < 0.05, one-sided Wilcoxon rank-sum test). This suggests a predominant role of OFC in integrating the information necessary to compute the motivational significance of the current reward relative to other rewards within a block of trials. Following the second revaluation of CS2 in block 3, OFC and amygdala neurons updated their responses to CS1 at about the same speed, likely reflecting the smaller adjustment in behavior observed for that transition (Figures 1C and 1D).

In our task, the motivational significance of CS1 is determined by comparing the reward amounts predicted by CS1 and CS2 within a block of trials. As a result, the computation of motivational significance requires the integration of information about reward history. The more rapid updating of representations in OFC suggested that OFC and not amygdala may integrate information about reward history. In other words, the medium reward associated with CS1 may be perceived as less rewarding when following a large-reward trial than when following another medium-reward trial. By contrast, recent CS2 trials resulting in a small reward will make the medium reward associated with CS1 seem better. Thus, recent CS2 trials may affect activity on current CS1 trials so as to support the computation of motivational significance.


Figure 4. OFC and Amygdala Neural Activity Change at Different Rates, and Reward History Differentially Influences Activity in the Two Areas (A and B) Mean spike density across neurons (A, OFC; B, amygdala) plotted as a function of time and trial number, aligned on the first trial where a large reward followed CS2. Horizontal solid and vertical dashed lines, block transition and CS onset. See Figure S2. (C and D) Regression coefficients associated with reward history terms for OFC (C) and amygdala (D) (see STAR Methods). Positive coefficients: previous rewards smaller than predicted on the current trial increase neural responses. Error bars: 95% confidence intervals based on t-statistic. *Coefficients significantly different from 0, p < 0.05. See Figure S3.

We assessed the effect of the magnitude of the currently predicted reward in relation to the magnitude of rewards in the recent past by fitting the neural activity of the cell populations encoding motivational significance with an extended version of the previous regression model, in which factors are added to capture an effect of reward history (see STAR Methods). Figures 4C and 4D show the coefficients associated with the five reward history regressors for the OFC (C) and amygdala (D) populations of neurons. In OFC, the three most recent trials have a significant effect on the current neural activity (coefficients significantly different from 0, p < 0.05, t statistic; Figure 4C). There was not a significant difference between monkeys for any beta coefficient estimate (p > 0.05 in each case). This means that, all other factors being equal, if the current reward is larger than a past reward, the neural activity in a positive neuron will be higher than if the two rewards are equal. Conversely, if the past reward is larger than the current one, a relative decrease in firing rate will be observed. Interestingly, reward history did not have an effect on the activity of identified amygdala neurons (Figure 4D), suggesting that although amygdala neurons modify their activity to reflect the motivational significance of a currently predicted reward relative to other recently received rewards, they might not be responsible for the integration of information in the environment that facilitates this calculation. We also performed the analysis of the effect of reward history on individual cells and observed qualitatively similar results (Figure S3).

DISCUSSION

Flexible emotional and cognitive behavior demands mechanisms that can update representations of the motivational significance of stimuli. In this study, a contrast revaluation task was used to investigate the neurophysiological mechanisms that underlie adjustments in the representation of the motivational significance of a CS. In this task, manipulations in motivational significance can be caused by changes in the relative amount of reward predicted by a CS while holding the absolute amount of reward associated with this CS constant. Monkeys' anticipatory licking behavior tracked the motivational significance of this CS (Figure 1). Neurons in the amygdala and OFC were sensitive to changes in motivational significance even though the actual reward associated with this CS did not change (Figures 2 and 3). The neural representation of motivational significance caused by changes in the relative amount of predicted reward updated more rapidly in OFC than in the amygdala, and recent reward history modulated neural responses in OFC but not in the amygdala (Figure 4). Satiation is unlikely to account for our results, since neural activity tracked an increase in motivational significance during the final block of trials, when effects of satiation would be maximal. Moreover, satiation has been observed to affect neural responses in ventromedial prefrontal cortex more frequently than in OFC (Simmons et al., 2010). Overall, these data suggest a primary role for the OFC as compared to the amygdala in flexible behavior driven by contrast revaluation.

The neural representation of motivational significance that we describe may be conceptually related to neural representations of value that have been reported previously in OFC and the amygdala (Belova et al., 2007, 2008; Morrison et al., 2011; Padoa-Schioppa, 2009; Padoa-Schioppa and Assad, 2006, 2008; Paton et al., 2006; Peck et al., 2013; Rich and Wallis, 2016).

However, different studies have often used the word "value" in distinct ways. Some studies utilized Pavlovian designs in which value is defined as the subjective estimate of the expected cumulative future reward (Belova et al., 2007, 2008; Morrison et al., 2011; Paton et al., 2006; Peck et al., 2013). By contrast, studies working within the framework of neuroeconomics define value by utilizing tasks in which subjects express subjective choice preferences (Padoa-Schioppa, 2009; Padoa-Schioppa and Assad, 2006, 2008; Rich and Wallis, 2016). In both the contrast revaluation task and tasks in which subjects make choices between two stimuli, the motivational significance of a given stimulus is influenced by the reward associated with another stimulus (either the other CS within a block of trials in the contrast revaluation task, or the other choice stimulus in an economic decision-making task). As such, the cells identified as encoding motivational significance in OFC in this study may overlap considerably with the cells previously identified as encoding the subjective value (or offer value) of impending rewards (Padoa-Schioppa, 2009; Padoa-Schioppa and Assad, 2006, 2008; Rich and Wallis, 2016). The present study extends this observation to the amygdala.

A number of studies conducted in rodents and utilizing a range of paradigms that include outcome-specific Pavlovian instrumental transfer (PIT) and transreinforcer blocking have suggested that the basolateral amygdala and OFC represent information about the sensory quality of a US associated with a CS (Balleine and Killcross, 2006; Bradfield et al., 2015; Burke et al., 2008). We did not explore how neural responses to CSs change for different types of associated USs, but the current findings reveal that neural responses to a CS in both OFC and the amygdala often reflect the overall motivational significance of an associated US. Neural responses to CS1 changed across blocks when the relative reward amount predicted by CS1 had changed, even though the actual reward associated with CS1 had not changed and thus the sensory quality of the associated US had not changed.

The process of updating a neural representation of motivational significance invoked during the contrast revaluation task differs from the process invoked by reversal learning, a paradigm often used to investigate the role of the OFC and the amygdala in flexible behavior. During reversal learning, the reinforcement contingencies of CSs switch simultaneously. Extensive data in both rodents and primates implicate the amygdala and OFC as playing a role in reversal learning (Belova et al., 2007; Chau et al., 2015; Morrison et al., 2011; Paton et al., 2006; Rolls et al., 1996; Saddoris et al., 2005; Saez et al., 2015; Schoenbaum et al., 1998, 2003), but the precise role of each area is frequently debated. Some of the conflicting reports in the literature may arise from differences between rodent and primate prefrontal cortex, as the degree of homology of prefrontal cortex across species is limited (Wise, 2008). Indeed, electrophysiological data recorded during reversal learning differ between rodents and non-human primates. In rodents, after a reversal, a new set of OFC neurons appears to respond to the cues whose contingencies have switched (Schoenbaum et al., 1999, 2000). By contrast, in primates, the same neurons tend to change their responses to cues so as to represent the new expected reinforcement (Morrison et al., 2011; Saez et al., 2015).

Furthermore, in primates the appetitive network of neurons in OFC appears to update activity to represent expected outcome more rapidly than the aversive network, and dynamic interactions between the OFC and amygdala (as captured by simultaneously measured local field potentials) shift as a function of learning (Morrison et al., 2011). Finally, recent data have delineated a role in flexible credit assignment during reversal learning for a part of the lateral OFC in primates that has often not been studied electrophysiologically (Chau et al., 2015).

Flexible behavior in the contrast revaluation task resulted from the manipulation of the reward associated with one CS compared to a different CS. Adaptation-level theory suggests that emotional behavior and other cognitive processes adjust depending upon the value of a particular stimulus, action, or experience relative to other ones within a given situation (Helson and Bevan, 1964). The present study investigated the amygdala and OFC to understand this adaptation process. Range adaptation, which in our task would occur across blocks of trials, has been proposed to account for adjustments in representations of relative value in OFC (Padoa-Schioppa, 2009). The present data extend this observation to the amygdala, and specifically identify the OFC, and not the amygdala, as reflecting the integration of information about recent reward history to adjust representations of motivational significance related to the relative amount of expected reward. The integration of reward history may not, however, be limited to OFC; during tasks involving probability matching, neural responses reflecting the relative value of actions have been observed in the lateral intraparietal area (LIP) and striatum (Lau and Glimcher, 2008; Sugrue et al., 2004).

The integration of reward history information to assess the motivational meaning of a stimulus may reflect the role of the OFC in representing relevant aspects of an agent's current situation so as to interpret the motivational significance of environmental stimuli within a context (Schuck et al., 2016; Wilson et al., 2014). This role may be shared with other brain areas when particular types of abstract information define a context, such as information about "task sets," which are the sets of stimulus-response-outcome mappings in effect within a context (Saez et al., 2015). At this stage, the source of the reward history information provided to OFC during contrast revaluation remains unknown, as is a mechanistic understanding of exactly how the comparison process is implemented by a network of neurons. Future studies must uncover how the OFC and other interrelated structures adjust the representation of motivational significance, a process critical for both value-based decision making and emotional regulation.

STAR+METHODS

Detailed methods are provided in the online version of this paper and include the following:

- KEY RESOURCES TABLE
- CONTACT FOR REAGENT AND RESOURCE SHARING
- EXPERIMENTAL MODEL AND SUBJECT DETAILS
- METHOD DETAILS
  - Behavioral task and analysis
  - Electrophysiological recordings
- QUANTIFICATION AND STATISTICAL ANALYSIS OF NEURAL DATA
  - Classifying cell coding properties
  - Timing of changes in neural activity due to manipulations of relative but not absolute reward
  - Effect of reward history

SUPPLEMENTAL INFORMATION

Supplemental Information includes three figures and can be found with this article online at http://dx.doi.org/10.1016/j.neuron.2017.06.012.

AUTHOR CONTRIBUTIONS

J.J.P. and C.D.S. designed the study; R.A.S. performed the experiments; A.S., B.L., and C.D.S. designed the analyses; A.S. performed the analyses; R.A.S., A.S., and C.D.S. wrote the manuscript with contributions from J.J.P. and B.L.

ACKNOWLEDGMENTS

This work was supported by National Institutes of Health grant R01 MH082017, the James S. McDonnell Foundation, the Brain and Behavior Research Foundation, and the Kavli Institute for Brain Science. A.S. received support from the Brain and Behavior Research Foundation (formerly NARSAD). J.J.P. received support from NICHD (T32 HD007430) and NEI (EY013933) institutional training grants. B.L. received support from the NIMH (T32-MH015144) and the Helen Hay Whitney Foundation.

Received: November 1, 2016
Revised: April 25, 2017
Accepted: June 6, 2017
Published: July 5, 2017

REFERENCES

Balleine, B.W., and Killcross, S. (2006). Parallel incentive processing: an integrated view of amygdala function. Trends Neurosci. 29, 272–279.

Belova, M.A., Paton, J.J., Morrison, S.E., and Salzman, C.D. (2007). Expectation modulates neural responses to pleasant and aversive stimuli in primate amygdala. Neuron 55, 970–984.

Belova, M.A., Paton, J.J., and Salzman, C.D. (2008). Moment-to-moment tracking of state value in the amygdala. J. Neurosci. 28, 10023–10030.

Bermudez, M.A., and Schultz, W. (2010a). Responses of amygdala neurons to positive reward-predicting stimuli depend on background reward (contingency) rather than stimulus-reward pairing (contiguity). J. Neurophysiol. 103, 1158–1170.

Bermudez, M.A., and Schultz, W. (2010b). Reward magnitude coding in primate amygdala neurons. J. Neurophysiol. 104, 3424–3432.

Bradfield, L.A., Dezfouli, A., van Holstein, M., Chieng, B., and Balleine, B.W. (2015). Medial orbitofrontal cortex mediates outcome retrieval in partially observable task situations. Neuron 88, 1268–1280.

Burke, K.A., Franz, T.M., Miller, D.N., and Schoenbaum, G. (2008). The role of the orbitofrontal cortex in the pursuit of happiness and more specific rewards. Nature 454, 340–344.

Helson, H., and Bevan, W. (1964). An investigation of variables in judgments of relative area. J. Exp. Psychol. 67, 335–341.

Hikosaka, K., and Watanabe, M. (2000). Delay activity of orbital and lateral prefrontal neurons of the monkey varying with different rewards. Cereb. Cortex 10, 263–271.

Hobin, J.A., Goosens, K.A., and Maren, S. (2003). Context-dependent neuronal activity in the lateral amygdala represents fear memories after extinction. J. Neurosci. 23, 8410–8416.

Janak, P.H., and Tye, K.M. (2015). From circuits to behaviour in the amygdala. Nature 517, 284–292.

Kennerley, S.W., Dahmubed, A.F., Lara, A.H., and Wallis, J.D. (2009). Neurons in the frontal lobe encode the value of multiple decision variables. J. Cogn. Neurosci. 21, 1162–1178.

Kim, J., Pignatelli, M., Xu, S., Itohara, S., and Tonegawa, S. (2016). Antagonistic negative and positive neurons of the basolateral amygdala. Nat. Neurosci. 19, 1636–1646.

Kobayashi, S., Pinto de Carvalho, O., and Schultz, W. (2010). Adaptation of reward sensitivity in orbitofrontal neurons. J. Neurosci. 30, 534–544.

Lau, B., and Glimcher, P.W. (2008). Value representations in the primate striatum during matching behavior. Neuron 58, 451–463.

Morrison, S.E., and Salzman, C.D. (2009). The convergence of information about rewarding and aversive stimuli in single neurons. J. Neurosci. 29, 11471–11483.

Morrison, S.E., and Salzman, C.D. (2010). Re-valuing the amygdala. Curr. Opin. Neurobiol. 20, 221–230.

Morrison, S.E., Saez, A., Lau, B., and Salzman, C.D. (2011). Different time courses for learning-related changes in amygdala and orbitofrontal cortex. Neuron 71, 1127–1140.

Nishijo, H., Ono, T., and Nishino, H. (1988a). Single neuron responses in amygdala of alert monkey during complex sensory stimulation with affective significance. J. Neurosci. 8, 3570–3583.

Nishijo, H., Ono, T., and Nishino, H. (1988b). Topographic distribution of modality-specific amygdalar neurons in alert monkey. J. Neurosci. 8, 3556–3569.

Padoa-Schioppa, C. (2009). Range-adapting representation of economic value in the orbitofrontal cortex. J. Neurosci. 29, 14004–14014.

Padoa-Schioppa, C., and Assad, J.A. (2006). Neurons in the orbitofrontal cortex encode economic value. Nature 441, 223–226.

Padoa-Schioppa, C., and Assad, J.A. (2008). The representation of economic value in the orbitofrontal cortex is invariant for changes of menu. Nat. Neurosci. 11, 95–102.

Padoa-Schioppa, C., and Cai, X. (2011). The orbitofrontal cortex and the computation of subjective value: consolidated concepts and new perspectives. Ann. N Y Acad. Sci. 1239, 130–137.

Paton, J.J., Belova, M.A., Morrison, S.E., and Salzman, C.D. (2006). The primate amygdala represents the positive and negative value of visual stimuli during learning. Nature 439, 865–870.

Peck, C.J., Lau, B., and Salzman, C.D. (2013). The primate amygdala combines information about space and value. Nat. Neurosci. 16, 340–348.

Price, J.L. (2003). Comparative aspects of amygdala connectivity. Ann. N Y Acad. Sci. 985, 50–58.

Chau, B.K., Sallet, J., Papageorgiou, G.K., Noonan, M.P., Bell, A.H., Walton, M.E., and Rushworth, M.F. (2015). Contrasting roles for orbitofrontal cortex and amygdala in credit assignment and learning in macaques. Neuron 87, 1106–1118.

Pritchard, T.C., Edwards, E.M., Smith, C.A., Hilgert, K.G., Gavlick, A.M., Maryniak, T.D., Schwartz, G.J., and Scott, T.R. (2005). Gustatory neural responses in the medial orbitofrontal cortex of the old world monkey. J. Neurosci. 25, 6047–6056.

Everitt, B.J., Cardinal, R.N., Parkinson, J.A., and Robbins, T.W. (2003). Appetitive behavior: impact of amygdala-dependent mechanisms of emotional learning. Ann. N Y Acad. Sci. 985, 233–250.

Redondo, R.L., Kim, J., Arons, A.L., Ramirez, S., Liu, X., and Tonegawa, S. (2014). Bidirectional switch of the valence associated with a hippocampal contextual memory engram. Nature 513, 426–430.

Gore, F., Schwartz, E.C., Brangers, B.C., Aladi, S., Stujenske, J.M., Likhtik, E., Russo, M.J., Gordon, J.A., Salzman, C.D., and Axel, R. (2015). Neural representations of unconditioned stimuli in basolateral amygdala mediate innate and learned responses. Cell 162, 134–145.

Rich, E.L., and Wallis, J.D. (2016). Decoding subjective decisions from orbitofrontal cortex. Nat. Neurosci. 19, 973–980.


Rolls, E.T., Critchley, H.D., Mason, R., and Wakeman, E.A. (1996). Orbitofrontal cortex neurons: role in olfactory and visual association learning. J. Neurophysiol. 75, 1970–1981.

Saddoris, M.P., Gallagher, M., and Schoenbaum, G. (2005). Rapid associative encoding in basolateral amygdala depends on connections with orbitofrontal cortex. Neuron 46, 321–331.

Saez, A., Rigotti, M., Ostojic, S., Fusi, S., and Salzman, C.D. (2015). Abstract context representations in primate amygdala and prefrontal cortex. Neuron 87, 869–881.

Salzman, C.D., and Fusi, S. (2010). Emotion, cognition, and mental state representation in amygdala and prefrontal cortex. Annu. Rev. Neurosci. 33, 173–202.

Schoenbaum, G., Chiba, A.A., and Gallagher, M. (1998). Orbitofrontal cortex and basolateral amygdala encode expected outcomes during learning. Nat. Neurosci. 1, 155–159.

Schoenbaum, G., Chiba, A.A., and Gallagher, M. (1999). Neural encoding in orbitofrontal cortex and basolateral amygdala during olfactory discrimination learning. J. Neurosci. 19, 1876–1884.

Schoenbaum, G., Chiba, A.A., and Gallagher, M. (2000). Changes in functional connectivity in orbitofrontal cortex and basolateral amygdala during learning and reversal training. J. Neurosci. 20, 5179–5189.

Schoenbaum, G., Setlow, B., Nugent, S.L., Saddoris, M.P., and Gallagher, M. (2003). Lesions of orbitofrontal cortex and basolateral amygdala complex disrupt acquisition of odor-guided discriminations and reversals. Learn. Mem. 10, 129–140.

Schuck, N.W., Cai, M.B., Wilson, R.C., and Niv, Y. (2016). Human orbitofrontal cortex represents a cognitive map of state space. Neuron 91, 1402–1412.

Schultz, W. (2015). Neuronal reward and decision signals: from theories to data. Physiol. Rev. 95, 853–951.

Shabel, S.J., and Janak, P.H. (2009). Substantial similarity in amygdala neuronal activity during conditioned appetitive and aversive emotional arousal. Proc. Natl. Acad. Sci. USA 106, 15031–15036.

Simmons, J.M., Minamimoto, T., Murray, E.A., and Richmond, B.J. (2010). Selective ablations reveal that orbital and lateral prefrontal cortex play different roles in estimating predicted reward value. J. Neurosci. 30, 15878–15887.

Stefanacci, L., and Amaral, D.G. (2002). Some observations on cortical inputs to the macaque monkey amygdala: an anterograde tracing study. J. Comp. Neurol. 451, 301–323.

Sugrue, L.P., Corrado, G.S., and Newsome, W.T. (2004). Matching behavior and the representation of value in the parietal cortex. Science 304, 1782–1787.

Tremblay, L., and Schultz, W. (1999). Relative reward preference in primate orbitofrontal cortex. Nature 398, 704–708.

Wallis, J.D. (2011). Cross-species studies of orbitofrontal cortex and value-based decision-making. Nat. Neurosci. 15, 13–19.

Wallis, J.D., and Miller, E.K. (2003). Neuronal activity in primate dorsolateral and orbital prefrontal cortex during performance of a reward preference task. Eur. J. Neurosci. 18, 2069–2081.

Wilson, R.C., Takahashi, Y.K., Schoenbaum, G., and Niv, Y. (2014). Orbitofrontal cortex as a cognitive map of task space. Neuron 81, 267–279.

Wise, S.P. (2008). Forward frontal fields: phylogeny and fundamental function. Trends Neurosci. 31, 599–608.

Zhang, W., Schneider, D.M., Belova, M.A., Morrison, S.E., Paton, J.J., and Salzman, C.D. (2013). Functional circuits and anatomical distribution of response properties in the primate amygdala. J. Neurosci. 33, 722–733.


STAR+METHODS

KEY RESOURCES TABLE

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Experimental Models: Organisms/Strains | |
Rhesus macaque (Macaca mulatta) | Covance Research Products | http://www.covance.com/services/research/research-models.html
Software and Algorithms | |
MATLAB | MathWorks | https://www.mathworks.com/products/matlab.html

CONTACT FOR REAGENT AND RESOURCE SHARING

Further information and requests for resources should be directed to and will be fulfilled by the Lead Contact, Dr. C. Daniel Salzman ([email protected]).

EXPERIMENTAL MODEL AND SUBJECT DETAILS

Two male rhesus monkeys (Macaca mulatta), 7 and 8 years of age and weighing 10 and 11 kg, were used in these experiments. Each animal was surgically implanted with a plastic head post and a single recording chamber giving access to both the prefrontal cortex and the amygdala. Surgery was conducted using aseptic techniques under general anesthesia (isoflurane), and analgesics and antibiotics were provided during postsurgical recovery. Structural magnetic resonance imaging (MRI) was used to guide the positioning of the recording chamber prior to surgery and to verify its position after surgery. After surgical implantation of the head post and the recording chamber, animals were trained on the task until their anticipatory licking behavior was consistently modulated by the amount of reward predicted by the CS on a trial-by-trial basis and across blocks of trials. All experimental procedures were in accordance with the National Institutes of Health guide for the care and use of laboratory animals and the Animal Care and Use Committees at New York State Psychiatric Institute and Columbia University.

METHOD DETAILS

Behavioral task and analysis
The two monkeys were trained on a contrast revaluation conditioning paradigm; a schematic diagram of the task is shown in Figure 1A. Monkeys initiated a trial by looking at a central fixation point for 0.9 s. While maintaining fixation, they were then presented with one of two possible conditioned stimuli (CS) for 0.3 s. The CSs consisted of abstract images (fractal patterns) that were novel and unique to each experiment. After CS offset, monkeys waited through a trace period of 1.5 s during which fixation was no longer required. Then, monkeys received an unconditioned stimulus (US) consisting of one of two possible liquid rewards (water), depending upon which CS was presented. The inter-trial interval between the end of US delivery and the next fixation point onset was 3 s. The presentation of the two CSs followed a pseudo-random schedule ensuring that the same CS was not presented more than four consecutive times. Each experimental session consisted of three blocks of trials of variable length (typically 40-80 trials per block). In blocks 1 and 3, CS1 was paired with a medium-sized reward and CS2 was paired with a small reward. In block 2, the reward associated with CS1 remained unchanged while CS2 became paired with a large reward (see Figure 1A). CS-reward associations were the only indication as to which block was in effect, as block switches were not cued. Blocks were switched at a time un-signaled to the monkey, after the monkey's behavior reflected learning of the new CS-reward associations in effect. Reward amounts were quantified by the opening time of a solenoid valve. The opening times for the small, medium, and large rewards were (in that order) 100 ms, 500 ms and 2500 ms for monkey T, and 20 ms, 100 ms and 500 ms for monkey P. Given the reward systems used for the two monkeys, these opening times correspond, approximately, to 50, 250 and 1250 µL of water for monkey T, and to 20, 100 and 500 µL of water for monkey P. As such, the medium reward was always 5 times larger than the small reward and the large reward was always 5 times larger than the medium reward.
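The constraint that the same CS never repeats more than four times in a row can be expressed as a short rejection rule when drawing the trial sequence. The sketch below is only an illustration of that constraint, not the authors' task code; the function name, block length, and random seed are arbitrary assumptions.

```python
import random

def draw_cs_sequence(n_trials, max_run=4, rng=None):
    """Draw a pseudo-random CS1/CS2 sequence in which the same CS is never
    presented more than `max_run` consecutive times (illustrative sketch)."""
    rng = rng or random.Random(0)
    seq = []
    for _ in range(n_trials):
        cs = rng.choice(["CS1", "CS2"])
        # switch to the other CS if this draw would be the (max_run + 1)-th repeat
        if len(seq) >= max_run and all(prev == cs for prev in seq[-max_run:]):
            cs = "CS1" if cs == "CS2" else "CS2"
        seq.append(cs)
    return seq

# Example: one block of 60 trials (block lengths of 40-80 trials were typical)
print(draw_cs_sequence(60))
```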
The reward delivery spout was placed a few millimeters away from the animal's mouth, so monkeys needed to lick at the spout before reward delivery to avoid missing it. Anticipatory licking behavior was measured by detecting the interruption of an infrared laser beam passing between the monkey's lips and the reward delivery spout. Licking rate was defined as the proportion of time spent licking during the 500-ms time window starting 100 ms after CS onset. This time window was chosen because it contained the strongest discrimination between expected reward amounts for the two monkeys in the study. The licking rate was normalized within each experiment (Z score), then averaged across experiments for each trial number relative to the revaluation (e.g., Figure 1).
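As a rough illustration of this measure, the sketch below computes the proportion of the 500-ms window (starting 100 ms after CS onset) during which the beam is interrupted, and Z-scores the per-trial rates within an experiment. The function names and the representation of beam breaks as (start, end) pairs are assumptions for illustration, not the published analysis code.

```python
import numpy as np

def licking_rate(lick_intervals, cs_onset, start=0.100, dur=0.500, dt=0.001):
    """Fraction of the analysis window during which the beam is broken.
    `lick_intervals` holds (break_start, break_end) pairs in seconds."""
    t = np.arange(cs_onset + start, cs_onset + start + dur, dt)
    licking = np.zeros_like(t, dtype=bool)
    for t0, t1 in lick_intervals:
        licking |= (t >= t0) & (t < t1)
    return licking.mean()

def normalize_within_experiment(rates):
    """Z-score the per-trial licking rates of one experiment."""
    rates = np.asarray(rates, dtype=float)
    return (rates - rates.mean()) / rates.std()
```

Normalized rates from different experiments can then be averaged per trial number relative to the effective revaluation, as in Figure 1C.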

e1 Neuron 95, 70–77.e1–e3, July 5, 2017

Electrophysiological recordings
We recorded neural activity from 96 neurons in the right OFC and 166 neurons in the right amygdala (62 OFC cells and 100 amygdala cells in monkey T; 34 OFC cells and 66 amygdala cells in monkey P). We only included neurons from experimental sessions where, upon visual examination of the licking patterns (i.e., plots such as Figure 1B), relative valuation of CS1 was evident in the monkey's behavior. This was the case for the vast majority of experimental sessions. In each session, we individually advanced up to 4 tungsten electrodes into each of the two brain areas (impedance ~2 MΩ, FHC Instruments) using a motorized multi-electrode drive (NAN Instruments). Analog signals were amplified, band-pass filtered (250 Hz – 8 kHz) and digitized (40 kHz) using a Plexon MAP system (Plexon). Single units were isolated offline using the Plexon Offline Sorter. Recording sites in OFC were located between the medial orbital sulcus and the lateral orbital sulcus (Brodmann areas 13m and 13l). Amygdala recordings were primarily in the basolateral and basomedial complex (Figure S1).

QUANTIFICATION AND STATISTICAL ANALYSIS OF NEURAL DATA

Classifying cell coding properties
ANOVA
In order to identify neurons selective to expected reward amount, we performed a two-way ANOVA with factors "reward magnitude" (3 levels: small, medium, and large reward) and "block type" (2 levels: blocks 1/3 and block 2) on the firing rate of each neuron. Firing rates were taken from two different time windows preceding reward delivery: the CS interval (90 to 390 ms from CS onset, corresponding to the 300 ms of CS presentation shifted by 90 ms to account for visual response latency) and the Trace interval (390 to 1890 ms from CS onset). We classified as selective to reward magnitude those neurons for which the reward magnitude factor in the ANOVA was significant in either one of the two intervals, with Bonferroni correction for the fact that we ran the test twice (p < 0.025 for either interval). Importantly, we included a post hoc requirement that significant differences between any two of the three levels of reward were consistent in sign with one another. Specifically, if the ANOVA indicates a significant effect of reward amount and post hoc analysis reveals a significant difference in firing rate both between the large and medium rewards and between the large and small rewards, then the signs of the firing rate differences must be consistent with a given reward preference (that is, either FR{large} > FR{medium} AND FR{large} > FR{small}, or FR{large} < FR{medium} AND FR{small} > FR{large}). For the plots in Figure 3, firing rates from each neuron were taken from the time period (CS and/or Trace) where significant reward-related activity was present. Firing rates were then Z scored, sign-flipped according to each cell's reward preference and averaged together.
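A minimal sketch of this ANOVA-based selection step is given below, assuming the firing rates of one neuron in one interval are arranged in a pandas DataFrame. The column names, the statsmodels-based implementation, and the omission of the post hoc sign-consistency check are all simplifications relative to the description above; the original analyses were performed in MATLAB.

```python
import statsmodels.api as sm
from statsmodels.formula.api import ols

def reward_magnitude_selective(df, alpha=0.025):
    """Two-way ANOVA (reward magnitude x block type) on one neuron's firing
    rates in one interval. `df` is assumed to have columns: 'fr' (trial firing
    rate), 'magnitude' ('small'/'medium'/'large'), and 'block_type' (1 or 2).
    alpha = 0.025 reflects the Bonferroni correction over the two intervals."""
    model = ols("fr ~ C(magnitude) + C(block_type)", data=df).fit()
    table = sm.stats.anova_lm(model, typ=2)
    return table.loc["C(magnitude)", "PR(>F)"] < alpha
```

A neuron would be run through this test once for the CS interval and once for the Trace interval, with the pairwise post hoc comparisons applied afterward.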

Linear Regression
In order to identify neurons selective to the relative amount of the upcoming reward compared to the reward predicted by the other CS within a block of trials, we fitted the firing rate of each neuron with the following linear regression model:

FR = a + bCS × CS + bBL × BLOCK + bINT × (CS × BLOCK) + ε    (M1)

where CS and BLOCK are binary vectors representing, respectively, the CS shown ("1" for CS1 trials, "0" for CS2 trials) and the block ("1" for block 2 trials, "0" for blocks 1 and 3), and CS × BLOCK is their product, or interaction. a, bCS, bBL and bINT are the intercept and the regression coefficients, and ε is the residual error. This way, ignoring the intercept, the firing rate in response to CS1 in blocks 1 and 3 is estimated by bCS. Similarly, the firing rate for CS1 in block 2 is estimated by bCS + bBL + bINT, and that of CS2 in block 2 simply by bBL (see Figure S2). Of particular interest is the difference in firing rate between blocks 1/3 and block 2 for CS1, since it can be used to quantify coding of the relative magnitude of reward by a cell. In our regression model, this quantity is estimated by bBL + bINT (Figure S2). We define those cells for which bBL + bINT is significantly different from 0 as encoding motivational significance (or meaning), which reflects an assessment of the relative amount of reward even though its absolute size does not change (p < 0.05, t test). This ensures that the firing on CS1 trials is different during block 2 relative to blocks 1 and 3, as it should be if the neuron is selective to the relative amount of the medium reward. In addition, we require that bINT ≠ 0 to discard neurons that are merely selective to the block of trials, and we require that (bBL + bINT) × bBL < 0 to ensure that the change in firing to CS1 in block 2 (given by bBL + bINT) occurs in the opposite direction from the change for CS2 (given by bBL). For example, if a neuron increases its activity when CS2 switches from predicting small to predicting large reward during block 2, we require that its response to CS1 at the same time decreases, not increases. The regression analysis was performed, like the ANOVA analysis above, separately in the CS and Trace time intervals, with Bonferroni correction for multiple comparisons.
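The three selection criteria can be written compactly as tests on an ordinary-least-squares fit of M1. The sketch below is illustrative rather than the original MATLAB analysis; the variable names and the use of statsmodels are assumptions, and the Bonferroni correction across the CS and Trace intervals is left to the caller.

```python
import numpy as np
import statsmodels.api as sm

def encodes_motivational_significance(fr, cs, block, alpha=0.05):
    """Fit M1 (FR ~ CS + BLOCK + CS*BLOCK) for one neuron and apply the three
    criteria described above. fr: trial firing rates; cs, block: 0/1 arrays."""
    X = sm.add_constant(np.column_stack([cs, block, cs * block]))
    res = sm.OLS(fr, X).fit()
    b_bl, b_int = res.params[2], res.params[3]
    # (1) bBL + bINT significantly different from 0 (t test on the contrast)
    crit1 = float(res.t_test([0.0, 0.0, 1.0, 1.0]).pvalue) < alpha
    # (2) the interaction coefficient bINT is itself significantly non-zero
    crit2 = res.pvalues[3] < alpha
    # (3) the CS1 change (bBL + bINT) is opposite in sign to the CS2 change (bBL)
    crit3 = (b_bl + b_int) * b_bl < 0
    return crit1 and crit2 and crit3
```

Here res.t_test([0, 0, 1, 1]) evaluates the contrast bBL + bINT directly from the fitted covariance, which mirrors the t test described in the text.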


Timing of changes in neural activity due to manipulations of relative but not absolute reward
In order to investigate the dynamics of changes in neural responses to CS1 (which does not change its associated absolute reward) following the first block transition, we plotted the spike density function averaged across neurons identified as encoding motivational significance as a function of trial and time within the trial (Figures 4A and 4B). Using a Gaussian kernel (bandwidth: 100 ms), we computed the spike density function of each selected neuron during 25 CS1 trials around the transition between block 1 and block 2: the 10 CS1 trials before and the 15 CS1 trials after the "effective revaluation." The effective revaluation is defined as the first time that the monkey experienced a large reward associated with CS2, which constitutes the only evidence that the block has switched. The spike density estimates were normalized, for each neuron, by subtracting the mean density across the 10 trials prior to the revaluation and by dividing by their standard deviation. As in previous analyses, the individual spike density estimates were sign-flipped according to the neuron's valence (or reward preference) before averaging the densities across neurons. Therefore, each row in Figures 4A and 4B represents, for that given trial, the change in spike density relative to a baseline consisting of the mean over the 10 trials preceding the revaluation. These changes are shown as a function of time from CS onset and are in units of standard deviation of the baseline.
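A sketch of this trial-by-trial spike-density computation is given below, assuming spike times are provided in seconds relative to CS onset; the function names and array layout are illustrative assumptions, not the authors' code.

```python
import numpy as np

def spike_density(spike_times, t_grid, bw=0.100):
    """Gaussian-kernel spike density (spikes/s) for one trial on a time grid."""
    t_grid = np.asarray(t_grid, dtype=float)
    d = np.zeros_like(t_grid)
    for s in spike_times:
        d += np.exp(-0.5 * ((t_grid - s) / bw) ** 2) / (bw * np.sqrt(2 * np.pi))
    return d

def revaluation_aligned_density(trial_spikes, t_grid, n_pre=10, sign=1):
    """Stack densities for the 10 CS1 trials before and the 15 after the
    effective revaluation, normalize by the pre-revaluation mean and SD, and
    sign-flip negative neurons (sign = -1)."""
    dens = np.array([spike_density(st, t_grid) for st in trial_spikes])  # trials x time
    baseline = dens[:n_pre]
    return sign * (dens - baseline.mean()) / baseline.std()
```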

Effect of reward history
The firing rates of neurons sensitive to manipulations of predicted relative reward were fitted with a regression model that includes terms accounting for reward history up to 5 trials in the past. These terms consist of the difference in reward amount between the current and the previous trial (1 trial prior), between the current trial and two trials back (2 trials prior), etc. Formally, we fitted the firing rate of these neurons with the following extension of our previous linear regression model:

FR = a + bCS × CS + bBL × BLOCK + bINT × (CS × BLOCK) + bT-1 × DREWT-1 + bT-2 × DREWT-2 + bT-3 × DREWT-3 + bT-4 × DREWT-4 + bT-5 × DREWT-5 + ε    (M2)

where DREWT-1 is a vector containing the difference in reward amount between the current trial and the previous trial (current minus previous), DREWT-2 contains the difference between the current trial and two trials past, and so forth. This model was fit to all neurons identified as being sensitive to manipulations in relative reward amounts. Model M1 was sufficient to explain any variability related to the properties of the current trial (this model fits 4 independent predictors, counting the constant term, and there are 4 possible trial types in the task). Therefore, the additional terms in the extended model M2 were meant to capture variability in the firing rate that is caused by the reward history up to 5 trials in the past. Figures 4C and 4D plot the values of bT-1, ..., bT-5, which are the coefficients associated with these "reward history" terms. The dependent variable in this analysis was a single vector of the concatenated firing rates from all selected neurons. In the same way, the regressors associated with each cell were concatenated across experiments and used to fit the firing rate vector. The firing rates were taken from the CS and/or Trace interval of the trial, depending on which interval(s) presented evidence that manipulations in relative reward amounts influenced firing rate based on the M1 regression model above. Before concatenation, the firing rate of each cell was Z scored and multiplied by –1 if the neuron had a "negative reward preference" (i.e., increased its activity in response to decreases in relative reward amount and vice versa). In the plots in Figures 4C and 4D, y axis units correspond to the change in Z scored firing rate per difference in reward from previous trials (small, medium, and large rewards were assigned arbitrary units proportional to actual water amounts: 1, 5 and 25, respectively).
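The reward-history regressors and the M2 fit can be sketched as follows, assuming the trials of a session are contiguous in the input arrays and that firing rates have already been Z-scored, sign-flipped, and concatenated as described; the names and the statsmodels-based fit are illustrative assumptions rather than the original code.

```python
import numpy as np
import statsmodels.api as sm

def reward_history_regressors(rew, n_lags=5):
    """DREW_{T-k} = reward on the current trial minus the reward k trials back,
    with rewards in the arbitrary units 1, 5, 25 used in the text."""
    rew = np.asarray(rew, dtype=float)
    cols = []
    for k in range(1, n_lags + 1):
        d = np.full_like(rew, np.nan)   # first k trials have no k-back reward
        d[k:] = rew[k:] - rew[:-k]
        cols.append(d)
    return np.column_stack(cols)

def fit_m2(fr_z, cs, block, rew):
    """Fit M2 on Z-scored, sign-flipped, concatenated firing rates."""
    X = np.column_stack([cs, block, cs * block, reward_history_regressors(rew)])
    X = sm.add_constant(X)
    ok = ~np.isnan(X).any(axis=1)          # drop trials lacking full history
    return sm.OLS(fr_z[ok], X[ok]).fit()   # res.params[-5:] are bT-1 ... bT-5
```

Lags that would cross session boundaries are not handled in this sketch and would need to be masked when concatenating across experiments.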

