NIH Public Access Author Manuscript; accepted for publication in a peer-reviewed journal.
Behav Brain Res. Author manuscript; available in PMC Feb 12, 2008.
Published in final edited form as:
PMCID: PMC1859851
NIHMSID: NIHMS16880

Anticipating instrumentally obtained and passively-received rewards: A factorial fMRI investigation

Using dopaminergic signals in the ventromesial striatum (VS), organisms learn to approach stimuli that signal a high probability of reward [1]. During instrumental learning, single-unit activity in the striatum shifts forward from reward delivery to the presentation of reward-predictive stimuli [2]. Accordingly, human functional magnetic resonance imaging (fMRI) experiments show that regions of the striatum are activated by: 1) learned [3] cues that signal eligibility to respond for uncertain reward in monetary incentive delay (MID) tasks [4–6], 2) the salience of a reward cue [7], 3) anticipation of maximally uncertain (50%) reward relative to more certain outcomes [8], and 4) reward delivery [9, 10] that is contingent on behavior [11].

Our aim here was to use rapid, event-related fMRI to further characterize the extent to which human VS activation by reward anticipation depends on the interaction between reward delivery probabilities and instrumental response requirements. The distinction between instrumental and classical conditioning is muddled in many behavioral tasks because stimuli signaling availability to respond for reward also inherently convey information about impending reward, as would a classically-conditioned (Pavlovian) cue. VS activation directly correlated with positive affective responses to reward availability [4, 5], suggesting the possibility of a Pavlovian component to this activation. Moreover, populations of VS neurons fire in response to reward-predictive cues under Pavlovian conditions [12] (reward not contingent on an instrumental response). If anticipatory VS activation by reward in instrumental tasks is primarily elicited by positive affect itself, this activation should also be elicited by cues associated with reward delivered in a Pavlovian manner.

We designed a Factorial Reward Anticipation (FRA) task (a variant of the MID task) to examine activation by linear contrasts between anticipation of potential reward versus non-reward reported previously [4–6], but here under each of Pavlovian and instrumental conditions. As a secondary objective, we wished to obtain evidence that reward-anticipatory activation previously elicited by MID tasks [4, 5] was also partially dependent on the uncertainty of reward for effort. Notably, Fiorillo et al. demonstrated that anticipatory single-unit activity in mesolimbic dopaminergic neurons is maximal when the probability of payoff for the instrumental response is maximally uncertain (50%) [13]. The FRA included trials with an explicitly-briefed 50% probability of payoff for a successful response, where outcomes were not primarily a function of the subject’s successful behavior (as they were in previous MID tasks). We hypothesized that 1) anticipatory VS activation would be dependent on the requirement to respond [14], 2) VS activation would be more robust with uncertainty of reinforcement [8], and 3) VS activation would be most evident under the dual conditions of an instrumental requirement and uncertainty of reward delivery.

Ten men (mean age 34.2 ± 6.4 years) and 10 women (33.9 ± 6.4 years) free of any physical or mental illness gave written informed consent to participate in this experiment. All procedures were reviewed and approved by the NIAAA Institutional Review Board. The FRA task stimuli were white on a black background, and were projected on a screen and viewed with a head coil mirror. The six trial types (n = 18 each) were pseudo-randomly presented, and were separated by a jittered inter-trial interval (2–8 s) with a fixation crosshair. Each trial lasted 6 s, and featured an instruction cue (500 ms), a target (500 ms), and feedback (2000 ms; see Figure 1). Response trials (square cue series) required the subject to respond on a button box while the subsequent target (white square) was presented. A square enclosing a “$,” a “?,” or a “0” indicated 1.0, 0.5, and 0 probabilities, respectively, of winning $1 for hitting the target. Non-response trials (circle cue series) instructed the subject to withhold response while the subsequent target was presented. A circle enclosing a “$,” a “?,” or a “0” indicated 1.0, 0.5, and 0 probabilities, respectively, of passive receipt of $1 after the following target was presented. Reward presentation was not contingent on withholding a response to the target. The 500 ms target window was intended to promote attention to the task and to reduce variance in the timing of responses. It was roughly twice the mean reaction time (RT) of the slowest subject in a previous study (235 ms; Bjork et al., 2004), promoting a hit probability of ~1.0. During feedback, both trial and cumulative winnings were presented. Each subject was trained on the reward contingencies of the six instruction cues, and performed a 4-minute practice session of the task. Subjects were shown the cash they could win.
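The trial structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and variable names are hypothetical, and a plain shuffle with uniformly distributed inter-trial intervals stands in for the pseudo-randomization and jitter scheme, which the text does not specify. The division of trials across the three task runs is not modeled.

```python
import random

# Six cue types: (response requirement, reward probability). Labels are illustrative.
TRIAL_TYPES = [(req, p) for req in ("respond", "withhold") for p in (1.0, 0.5, 0.0)]
TRIALS_PER_TYPE = 18        # n = 18 of each trial type, per the text
TRIAL_DURATION_S = 6.0      # each trial lasted 6 s
ITI_RANGE_S = (2.0, 8.0)    # jittered inter-trial interval of 2-8 s


def build_schedule(seed=0):
    """Return a list of (onset_s, requirement, probability) tuples.

    Assumption: a plain shuffle and uniform ITI jitter stand in for the
    study's unspecified pseudo-randomization.
    """
    rng = random.Random(seed)
    trials = TRIAL_TYPES * TRIALS_PER_TYPE
    rng.shuffle(trials)

    schedule = []
    onset = 0.0
    for requirement, probability in trials:
        schedule.append((onset, requirement, probability))
        # Next trial starts after this trial plus a jittered interval.
        onset += TRIAL_DURATION_S + rng.uniform(*ITI_RANGE_S)
    return schedule


if __name__ == "__main__":
    for onset, requirement, probability in build_schedule()[:5]:
        print(f"{onset:7.2f} s  {requirement:8s}  p = {probability}")
```

Under these assumptions the schedule holds 108 trials, with successive cue onsets separated by 8 to 14 s.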

Figure 1
The Factorial Reward Anticipation (FRA) task. Trials lasted 6 s, and were separated by jittered intertrial intervals of 2–8 s. Subjects first saw one of six cues that signaled whether the subject needed to respond (squares) or to not respond (circles) ...

We used a 3 T scanner (General Electric, Milwaukee, WI) and a quadrature head coil. We collected 24 3.8-mm-thick axial slices with a 1 mm interslice gap. In-plane resolution was 3.75 × 3.75 mm. Functional scans were acquired using a T2*-sensitive echoplanar sequence with repetition time (TR) = 2000 msec, echo time (TE) = 40 msec, and flip angle = 90°. Structural scans were acquired using a T1-weighted MP-RAGE sequence (TR, 100 msec; TE, 7 msec; flip, 90°) for co-registration of functional data. Each subject’s head was immobilized by a deflatable head restraint cushion.

We analyzed blood oxygen level-dependent (BOLD) signal time-locked to instruction cue presentation. Preprocessing and statistical analyses were conducted using Analysis of Functional Neural Images (AFNI) software [15] as follows: (1) volumes were concatenated across the three task runs; (2) voxel time series were interpolated to correct for non-simultaneous slice acquisition within each volume; and (3) volumes were corrected for motion. Motion-correction estimates indicated that no participant’s head moved more than 1.5 mm between volumes. We then applied a 10 mm smoothing kernel, a de-spiking algorithm, and bandpass filtering that attenuated cyclical fluctuations in signal that were slower than 0.011/sec or faster than 0.15/sec.

The regression model consisted of orthogonal regressors corresponding to presentation of the six instruction cues (anticipation), outcome notifications, residual motion following volume correction, and baseline and linear trends for each run. Regressors of interest were convolved with a gamma variate function that modeled a prototypical hemodynamic response function. Statistical maps were generated by the following linear contrasts (LC: area-under-curve activation), which were planned a priori to replicate previously reported contrasts and to extend them to Pavlovian conditions: 1) anticipation of responding for certain reward (p = 1.0) versus nonreward (p = 0), 2) anticipation of responding for uncertain reward (p = 0.5) versus nonreward, 3) anticipation of passive receipt of certain reward versus certain nonreward, and 4) anticipation of passive receipt of uncertain reward versus nonreward.
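As a rough illustration of how a cue regressor of this kind can be built, the sketch below sums one gamma variate response per cue onset and samples the result at each volume acquisition (TR = 2 s). It is not the AFNI implementation. The function names are hypothetical, and the shape parameters b = 8.6 and c = 0.547 s are assumed for illustration; the text does not state which values were used.

```python
import math

TR_S = 2.0  # repetition time in seconds, from the acquisition description


def gamma_variate_hrf(t, b=8.6, c=0.547):
    """Gamma variate h(t) = t**b * exp(-t / c), scaled so its peak equals 1.

    b and c are assumed illustrative values, not taken from the text.
    The function peaks at t = b * c seconds after the event.
    """
    if t <= 0:
        return 0.0
    peak = (b * c) ** b * math.exp(-b)
    return t ** b * math.exp(-t / c) / peak


def cue_regressor(onsets_s, n_volumes, tr=TR_S):
    """Modeled response at each volume: the sum of one HRF per cue onset.

    This is the convolution of a train of unit impulses at the cue onsets
    with the HRF, evaluated at the volume acquisition times.
    """
    return [
        sum(gamma_variate_hrf(volume * tr - onset) for onset in onsets_s)
        for volume in range(n_volumes)
    ]


if __name__ == "__main__":
    # Two hypothetical cue onsets, 10 s apart.
    for volume, value in enumerate(cue_regressor([0.0, 10.0], 12)):
        print(f"t = {volume * TR_S:4.1f} s  modeled response = {value:.3f}")
```

One such column per instruction cue type, together with the outcome, motion, and trend regressors, would make up the design matrix described above.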

We used higher-order LC to isolate activation by the combination of an incentive together with the requirement to make an instrumental response [14]. For each of the p = 0.5 and p = 1.0 trial types, this was accomplished with a higher-order LC that could be conceptualized in two ways, either: 1) activation during anticipation of responding for reward versus anticipation of passively-obtained reward, while masking out activation by the contrast: (responding for non-reward versus passive receipt of non-reward), or 2) activation during anticipation of responding for reward versus anticipation of responding for no reward, while masking out the contrast: (passive receipt of reward versus passive nonreward). Thus this comparison allows us to identify those voxels that are most activated by the combination of anticipated reward and the need to generate a motor action to obtain the reward.
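The equivalence of the two conceptualizations can be seen in a small sketch. It assumes the higher-order LC behaves like a difference of differences over four condition estimates; how the masking was actually implemented is not stated in the text, so this is one common reading rather than the authors' procedure. The condition names and the beta values are hypothetical.

```python
# Hypothetical per-voxel regression estimates for one reward probability.
EXAMPLE_BETAS = {
    "respond_reward": 2.0,
    "respond_nonreward": 0.5,
    "passive_reward": 0.75,
    "passive_nonreward": 0.5,
}


def reward_effect_specific_to_responding(betas):
    """(respond: reward - nonreward) minus (passive: reward - nonreward)."""
    return (betas["respond_reward"] - betas["respond_nonreward"]) - (
        betas["passive_reward"] - betas["passive_nonreward"]
    )


def response_effect_specific_to_reward(betas):
    """(reward: respond - passive) minus (nonreward: respond - passive)."""
    return (betas["respond_reward"] - betas["passive_reward"]) - (
        betas["respond_nonreward"] - betas["passive_nonreward"]
    )


if __name__ == "__main__":
    print(reward_effect_specific_to_responding(EXAMPLE_BETAS))
    print(response_effect_specific_to_reward(EXAMPLE_BETAS))
```

Both functions apply the same weights (+1, −1, −1, +1) to the four conditions, which is why the two descriptions identify the same voxels under this reading.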

Individual subject maps of linear contrast t-statistics were transformed into Z scores, warped to common Talairach space, and combined into a group map using a meta-analytic formula (average Z × square root of n) [4, 5]. For each contrast, activations were objectively detected using AFNI programs AlphaSim, 3dmerge, and 3dExtrema, where: 1) voxels each exceeded a statistical significance threshold of p < .0001, and 2) activated voxels were part of a contiguous cluster of sufficient size to obtain a family-wise corrected type I error rate ≤ 0.05 using Monte Carlo simulation.
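The group-level formula is simple enough to show directly. The sketch below applies it to a single voxel; the function name and the input values are hypothetical. Algebraically, average Z × √n equals the sum of the Z scores divided by √n, which is the combination rule usually called Stouffer's method.

```python
import math


def group_z(subject_z_scores):
    """Combine per-subject Z scores at one voxel: mean(Z) * sqrt(n)."""
    n = len(subject_z_scores)
    if n == 0:
        raise ValueError("need at least one subject")
    return (sum(subject_z_scores) / n) * math.sqrt(n)


if __name__ == "__main__":
    # Hypothetical voxel where each of 20 subjects shows a modest Z of 1.0.
    print(round(group_z([1.0] * 20), 3))
```

A consistent but individually weak effect across 20 subjects therefore yields a group Z of about 4.47, since √20 ≈ 4.47.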

Subjects hit a large majority of targets, such that intended reinforcement probabilities in response trials were not appreciably degraded by failures to hit the targets. Omission error rates (in response trials) were 3.3% in p = 1.0 trials, 3.3% in p = 0.5 trials, and 5.8% in p = 0 trials. Compound repeated-measures ANOVA revealed a significant increase in omission errors in the final run of the task (main effect of time (Runs 1–3) F(2,38) = 5.216, p < .01), and a trend toward greater omission errors in p = 0 trials (main effect of probability F(2,38) = 2.940, p = .065). When subjects responded to a target, hit rates were 97.2%, 98.1%, and 96.4% in p = 1.0, p = 0.5 and p = 0 trials, respectively. There was a main effect of incentive probability on reaction time (RT) (F(2,38) = 6.325, p < .001). Simple effect paired t-tests indicated that responses to both p = 1.0 targets (mean 267.2 ± 51.3 ms) and p = 0.5 targets (mean 267.8 ± 40.4 ms) were significantly faster than responses to p = 0 (mean 294.2 ± 56.3 ms) targets (p < .01). There were no main or interactive effects of time on RT (runs 1–3). The incidence of commission errors (in non-response trials) was 1.4%.

Neither the LC between anticipation of passive receipt of uncertain (p = 0.5) reward versus nonreward, nor that between passive receipt of certain (p = 1.0) reward versus nonreward, activated any cortical or subcortical voxels. Anticipation of responding for uncertain (p = 0.5) reward (versus nonreward) activated different regions of motor cortex, and activated the putamen bilaterally, with activated voxels extending ventro-mesially into the nucleus accumbens (NAcc) (see Table 1 and Figure 2A). The LC between anticipation of responding for certain (p = 1.0) reward versus responding for nonreward activated left dorsal thalamus, left parietal cortex, left putamen, cerebellar vermis, as well as insula and superior post-central gyri bilaterally (see Table 1 and Figure 2B). The VS activation was centered in the putamen, with activated voxels extending ventro-mesially into the NAcc. A post hoc LC between certain (p = 1.0) versus uncertain (p = 0.5) reward activated mesial frontal cortex under each of response and non-response conditions (Table 1).

Figure 2
Activation by anticipation of instrumental responding for uncertain reward. (A) linear contrast between responding for uncertain (p = 0.5) reward linearly contrasted with anticipation of responding for non-reward (p = 0) (B) linear contrast between responding ...
Table 1
Anticipatory activations detected by linear contrasts between cues signaling differing reward probabilities

For each of the p = 0.5 and p = 1.0 trial types, higher-order LC revealed regions selectively activated by the interaction of the presence of potential gain (versus nongain) with the requirement to emit an instrumental response. This higher-order LC with p = 0.5 trials activated left precentral gyrus and putamen and lentiform nuclei bilaterally (Table 2, Figure 3A). The activation in the putamen extended into the VS. The higher-order LC with p = 1.0 trials activated bilateral insula, cingulate motor area (CMA), left pre- and post-central gyrus, posterior putamen, and right cerebellum, but with no recruitment of rostral putamen or VS voxels (Table 2, Figure 3B).

Figure 3
Activation by a higher-order combination of reward with an instrumental response requirement. A) Instrumental-specific activation during anticipation of responding for uncertain (p = 0.5) reward versus no reward (p = 0), and (B) Instrumental-specific ...
Table 2
Activation by potential reward specific to the requirement for an instrumental response

We also examined activation time-locked to trial outcome notification. Across p = 0.5 trials (response and nonresponse conditions collapsed for statistical power), the LC between BOLD signal change time-locked to notifications of wins “+$1.00” versus nonwins “+$0.00” activated anterior and posterior cingulate cortices (Table 3). Certain gains contrasted with certain non-gains activated VS and mesofrontal cortex in non-response conditions, and activated several points of motor circuitry in response conditions.

Table 3
Activation time-locked to reward notification

These respective patterns of activation demonstrated general support for our hypotheses that: 1) anticipatory striatal activation in a MID task is dependent on the requirement to respond and not just on imminent, potential reward delivery itself, 2) VS activation in a MID task is enhanced by the uncertainty of reinforcement, and 3) VS activation by prospective reward is sensitive to an instrumental contingency together with uncertainty of reinforcement for successful response.

The absence of significant reward anticipation activation in non-response trials is in apparent conflict with single-unit studies (e.g. [12]) that demonstrate that subpopulations of VS neurons fire in response to reward-predictive cues under Pavlovian conditions. However, we note that even under conditions where reward delivery is not contingent on an instrumental response, in [12] and other reports, a forthcoming motor response is nevertheless signaled by the reward-predictive cue in that the organism must prepare to lick and swallow a liquid reward. In contrast, non-response rewards in the FRA engendered no motor response. Alternatively, the lack of Pavlovian VS activation here may have resulted from the use of an abstract monetary reward, or because BOLD signal results predominantly from local field potentials (e.g. maintenance of gradients), not action potentials [16].

Interestingly, instrumental incentive-elicited activation in other nodes of the basal ganglia-thalamocortical circuit was more extensive in p = 1.0 trials compared to p = 0.5 trials, indicating instead a direct relationship between activation and the expected value (EV; the product of reward magnitude × probability) of the instrumental response in those motor effector regions. In addition, the p = 1.0 versus p = 0.5 LC also activated mesofrontal cortex under both Pavlovian and instrumental conditions, in accord with mesofrontal activation by increasing payoff probability as reported in a recent fMRI study of EV [6]. Finally, this event-related experiment also replicated a previous finding of a block-design experiment [14]: that activation of putamen and other striatal regions by environmental cues for potential rewards is critically dependent on the requirement for an instrumental response.
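The contrast between EV and uncertainty across the three cue probabilities can be made concrete with a short calculation. The sketch below uses the $1 reward from the task. Treating outcome variance as the measure of uncertainty is an assumption made here for illustration, and the function names are hypothetical.

```python
REWARD_MAGNITUDE_USD = 1.0  # each win paid $1, per the task description


def expected_value(probability, magnitude=REWARD_MAGNITUDE_USD):
    """EV = reward magnitude * probability of reward."""
    return magnitude * probability


def outcome_variance(probability, magnitude=REWARD_MAGNITUDE_USD):
    """Variance of an all-or-nothing payoff: magnitude**2 * p * (1 - p).

    Used here as an assumed index of reward uncertainty.
    """
    return magnitude ** 2 * probability * (1.0 - probability)


if __name__ == "__main__":
    for p in (1.0, 0.5, 0.0):
        print(f"p = {p}: EV = ${expected_value(p):.2f}, "
              f"variance = {outcome_variance(p):.2f}")
```

EV rises monotonically with probability and is highest for p = 1.0 cues, whereas variance is highest for p = 0.5 cues and zero for both certain outcomes. Regions tracking EV and regions tracking uncertainty would therefore be expected to rank the three cue types differently.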

Activation of multiple points in the motor circuit by learned incentives has been demonstrated in other reports. For example, Haruno and Kawato [17] used a choice learning task to elicit incentive-dependent activation in bilateral superior parietal, dorsolateral prefrontal, dorsal premotor and occipital cortices, thalamus, supplementary motor area, and right superior temporal sulcus. These activations specifically correlated with the degree to which a stimulus-action-reward association was learned. Similarly, Ramnani and Miall [18] reported that presentation of a visual cue to respond on one of two buttons to win an unspecified amount (versus nongain) activated the putamen and the cingulate motor area (CMA), with additional activation in left precentral sulcus and left postcentral gyrus when the proper choice of button on which to respond was signaled to the subject in advance. Finally, the posterior mesial frontal cortex activated by instrumental reward anticipation in the FRA task included regions shown to activate when subjects prepare an intended motor response [19].

Dreher et al. [8] recently used a slot-machine task to assess activation by reward anticipation and feedback, and reported bilateral putamen activation during outcome anticipation when maximally uncertain reward outcomes (50%) were contrasted with more certain reward outcomes (25%). Our finding that more extensive VS voxels were activated by the prospect of responding for an uncertain reward versus nonreward (compared to certain reward versus nonreward, which only activated posterior regions of putamen) shares this directionality. It also suggests that uncertainty-specific activation need not require a learning context, in that subjects in both the present study and the Dreher et al. study had been explicitly briefed on the probability of rewarding outcomes.

Notification of monetary outcomes activated several mesial cortical regions in uncertain outcome (p = 0.5) trials, and also activated mesofrontal cortex in the p = 1.0 versus p = 0 contrast in non-response conditions. We caution, however, that because outcomes always followed the anticipatory cue by 4 s, activations ostensibly attributed to reward notifications in p = 1.0 trials may have resulted instead from a protracted hemodynamic response to the reward-anticipatory cue. Temporal jittering between events within a trial would better distinguish between anticipatory and feedback activations.

In conclusion, these findings retrospectively assist the interpretation of previous activations by MID tasks, and indicate that anticipatory VS activation by reward-predictive cues in these and similar incentive tasks is at least partially dependent on uncertainty of reward delivery, the requirement to mobilize an instrumental response, and on the interaction of these two factors.

Acknowledgments

This research was sponsored by intramural research funds of the National Institute on Alcohol Abuse and Alcoholism. During data collection, J.B. was supported by a PRAT fellowship from the National Institute of General Medical Sciences. The authors thank Ms. Cinnamon Danube for technical assistance in subject recruitment and data collection.

References

1. Tobler PN, Fiorillo CD, Schultz W. Adaptive coding of reward value by dopamine neurons. Science. 2005;307(5715):1642–5.
2. Schultz W, Tremblay L, Hollerman JR. Changes in behavior-related neuronal activity in the striatum during learning. Trends Neurosci. 2003;26(6):321.
3. Galvan A, Hare TA, Davidson M, Spicer J, Glover G, Casey BJ. The role of ventral frontostriatal circuitry in reward-based learning in humans. J Neurosci. 2005;25(38):8650–6.
4. Bjork JM, Knutson B, Fong GW, Caggiano DM, Bennett SM, Hommer DW. Incentive-elicited brain activation in adolescents: similarities and differences from young adults. J Neurosci. 2004;24(8):1793–802.
5. Knutson B, Adams CM, Fong GW, Hommer D. Anticipation of increasing monetary reward selectively recruits nucleus accumbens. J Neurosci. 2001;21(16):RC159.
6. Knutson B, Taylor J, Kaufman M, Peterson R, Glover G. Distributed neural representation of expected value. J Neurosci. 2005;25(19):4806–12.
7. Zink CF, Pagnoni G, Martin-Skurski ME, Chappelow JC, Berns GS. Human striatal responses to monetary reward depend on saliency. Neuron. 2004;42(3):509–17.
8. Dreher JC, Kohn P, Berman KF. Neural coding of distinct statistical properties of reward information in humans. Cereb Cortex. 2006;16(4):561–73.
9. Delgado MR, Nystrom LE, Fissell C, Noll DC, Fiez JA. Tracking the hemodynamic responses to reward and punishment in the striatum. J Neurophysiol. 2000;84(6):3072–7.
10. Elliott R, Newman JL, Longe OA, Deakin JF. Differential response patterns in the striatum and orbitofrontal cortex to financial reward in humans: a parametric functional magnetic resonance imaging study. J Neurosci. 2003;23(1):303–7.
11. Tricomi EM, Delgado MR, Fiez JA. Modulation of caudate activity by action contingency. Neuron. 2004;41(2):281–92.
12. Roitman MF, Wheeler RA, Carelli RM. Nucleus accumbens neurons are innately tuned for rewarding and aversive taste stimuli, encode their predictors, and are linked to motor output. Neuron. 2005;45(4):587–97.
13. Fiorillo CD, Tobler PN, Schultz W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science. 2003;299(5614):1898–902.
14. Elliott R, Newman JL, Longe OA, William Deakin JF. Instrumental responding for rewards is associated with enhanced neuronal response in subcortical reward systems. Neuroimage. 2004;21(3):984–90.
15. Cox RW. AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. Comput Biomed Res. 1996;29(3):162–73.
16. Logothetis NK, Pfeuffer J. On the nature of the BOLD fMRI contrast mechanism. Magn Reson Imaging. 2004;22(10):1517–31.
17. Haruno M, Kawato M. Different neural correlates of reward expectation and reward expectation error in the putamen and caudate nucleus during stimulus-action-reward association learning. J Neurophysiol. 2006;95(2):948–59.
18. Ramnani N, Miall RC. Instructed delay activity in the human prefrontal cortex is modulated by monetary reward expectation. Cereb Cortex. 2003;13(3):318–27.
19. Lau HC, Rogers RD, Haggard P, Passingham RE. Attention to intention. Science. 2004;303(5661):1208–10.