Dopamine DRD2 polymorphism alters reversal learning and associated neural activity

1 Max Planck Institute for Neurological Research, Cognitive Neurology Research Group, Gleueler Strasse 50, D-50931 Cologne, Germany
2 Max Planck Institute for Human Cognitive and Brain Sciences, Department of Cognitive Neurology, Stephanstrasse 1a, D-04103 Leipzig, Germany
3 Department of Psychology, Laboratory of Neurogenetics, University of Bonn, Kaiser-Karl-Ring 9, D-53111 Bonn, Germany

*Corresponding author. Gerhard Jocham, Max Planck Institute for Neurological Research, Cognitive Neurology Research Group, Gleueler Strasse 50, D-50931 Cologne, Germany. Phone: +49 221 4726 273, E-mail: jocham/at/nf.mpg.de

The publisher's final edited version of this article is available free at J Neurosci.

Abstract

In humans, presence of an A1 allele of the DRD2/ANKK1-TaqIa polymorphism is associated with reduced expression of dopamine (DA) D2 receptors in the striatum. Recently, it was observed that carriers of the A1 allele (A1+ subjects) showed impaired learning from negative feedback in a reinforcement learning task. Here, using functional MRI, we investigated carriers and non-carriers of the A1 allele while they performed a probabilistic reversal learning task. A1+ subjects showed subtle deficits in reversal learning. In particular, these deficits consisted of an impairment in sustaining the newly rewarded response after a reversal and in a generally decreased tendency to stick with a rewarded response. Both genetic groups showed increased fMRI signal in response to negative feedback in the rostral cingulate zone (RCZ) and anterior insula. Negative feedback that incurred a change in behavior additionally engaged the ventral striatum and a region of the midbrain consistent with the location of dopaminergic cell groups. The response of the RCZ to negative feedback increased as a function of preceding negative feedback. However, this graded response was not observed in the A1+ group.
Furthermore, the A1+ group also showed diminished recruitment of the right ventral striatum and the right lateral orbitofrontal cortex (lOFC) upon reversals. Taken together, these results suggest that a genetically driven reduction in DA D2 receptors leads to deficient feedback integration in RCZ. This in turn was accompanied by impaired recruitment of the ventral striatum and the right lOFC upon reversals, which might explain the behavioral differences between the genetic groups.

Keywords: fMRI, reversal learning, dopamine, DRD2/ANKK1-TaqIa polymorphism, ACC, ventral striatum

Introduction

Surviving in a changing environment requires constant evaluation of action outcomes. One experimental paradigm representing changing environments is reversal learning. Subjects learn to respond to one specific stimulus to receive a reward. After a number of trials, task contingencies are reversed and the alternative stimulus is rewarded. Neuroimaging studies in humans and lesion studies in humans and animals show that reversal learning requires the integrity of the ventral prefrontal cortex (PFC) and the dorsomedial and ventral striatum (Divac et al., 1967; Iversen and Mishkin, 1970; Dias et al., 1996; Cools et al., 2002; Fellows and Farah, 2003; Hornak et al., 2004; Izquierdo et al., 2004; Clarke et al., 2008). Furthermore, performance monitoring and implementation of flexible, adaptive behavior seem to engage the anterior cingulate cortex (ACC), particularly the rostral cingulate zone (RCZ; Ullsperger and von Cramon, 2003; Ridderinkhof et al., 2004; Debener et al., 2005; Rushworth et al., 2007). Negative feedback calling for behavioral adjustments (Kringelbach and Rolls, 2004) activates the orbitofrontal cortex (OFC), often accompanied by ACC activation (Kringelbach, 2005). Dopamine (DA), particularly D2 receptors, seems to be required for reversal learning (Cools et al., 2001; Cools et al., 2007).
Both genetic deletion of the D2 receptor (Kruzich and Grandy, 2004) and pharmacological blockade (Ridley et al., 1981; Lee et al., 2007) were shown to be detrimental to reversal learning in rodents and monkeys. However, agonism of D2 receptors has also been shown to impair reversal learning (Smith et al., 1999; Mehta et al., 2001). Furthermore, it was shown in the rat that DA in the nucleus accumbens (NAC) part of the ventral striatum is necessary for reversal learning (Taghzouti et al., 1985). Besides pharmacological interventions and lesions to study the neurochemical bases of reversal learning, there are also genetic polymorphisms leading to natural variations in DA transmission. The DRD2/ANKK1-TaqIa polymorphism modulates the density of DA D2 receptors. The A1 allele is associated with a reduction in striatal D2 receptor density of up to 30% (Thompson et al., 1997; Pohjalainen et al., 1998; Jonsson et al., 1999; Ritchie and Noble, 2003). This reduction is particularly prominent in ventral parts of caudate and putamen. Additionally, reduced glucose metabolism is observed in carriers of the A1 allele, not only in the striatum but also in remote areas such as ventral and medial PFC (Noble et al., 1997). Given the importance of these areas for reversal learning, we hypothesized that the reduction of glucose metabolism and the reduced density of D2 receptors in the ventral striatum in carriers of the A1 allele should lead to impaired reversal learning and reversal-related brain activity in these areas. Furthermore, given the role of the ACC/RCZ in integrating action outcomes over multiple trials (Kennerley et al., 2006), we postulated that activation of the RCZ by negative feedback would be dependent on the outcome of previous trials. We tested whether this history-driven response of the RCZ is impaired in A1+ subjects.
To address these hypotheses, we scanned carriers and non-carriers of the A1 allele with fMRI while they performed a probabilistic response reversal learning task.

Methods

Participants

Thirty-five male, Caucasian subjects, aged 20 to 32 years, participated in the study. Subjects were invited according to their DRD2/ANKK1-TaqIa genotype from a larger sample that was in Hardy-Weinberg equilibrium. Seven subjects had to be excluded, one due to malfunction of the presentation system, the others because they did not perform the task satisfactorily (≤ 10 final reversal errors (see below) or an excess of switching behavior, i.e. > 100 switches between the responses throughout the experiment). Of those six subjects, four belonged to the A1− group and two to the A1+ group. Thus, 15 subjects of the A1− group and 13 of the A1+ group remained (mean age A1− = 26.33; mean age for A1+ = 25.92; difference n.s.). We included only male subjects in our study to avoid menstrual cycle-dependent interactions between the dopaminergic system and gonadal steroids (Becker et al., 1982; Becker and Cha, 1989; Creutz and Kritzer, 2004; Dreher et al., 2007). The study was approved by the Research Ethics Committee of the University of Leipzig, Germany.

Genetic analyses

The DRD2/ANKK1-TaqIa polymorphism is a restriction fragment polymorphism on chromosome 11 at q22-q23 (Noble, 2003; Reuter et al., 2005). Three genotypes of the dopamine DRD2/ANKK1-TaqIa locus can be differentiated: the A1A1 genotype, the A1A2 genotype, and the A2A2 genotype. Due to the low prevalence of the A1A1 genotype (3% of healthy Caucasians), A1A1 and A1A2 subjects are commonly grouped as A1+ subjects, whereas A2A2 subjects are referred to as A1− subjects. The presence of at least one A1 allele (A1+ group) leads to an up to 30% reduction in D2 receptor density (Thompson et al., 1997; Pohjalainen et al., 1998; Jonsson et al., 1999; Ritchie and Noble, 2003).
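The Hardy-Weinberg check mentioned for the genotype groups can be illustrated with a short calculation. The sketch below is not the authors' code; the function name `hwe_chi_square` is hypothetical. It computes the Pearson chi-square statistic (df = 1) from three genotype counts and is applied to the rs1800497 counts given in the Results (A1/A1 = 1, A1/A2 = 11, A2/A2 = 14).

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square for deviation from Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb are observed genotype counts (homozygote a,
    heterozygote, homozygote b). Assumes both alleles are present,
    so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele a
    q = 1.0 - p                      # frequency of allele b
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Genotype counts for rs1800497 as reported in the Results (n = 26).
chi2 = hwe_chi_square(1, 11, 14)
print(round(chi2, 2))  # 0.43
```

A value below 3.84, the critical value for df = 1 at alpha = 0.05, indicates no significant deviation from Hardy-Weinberg equilibrium.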
The direct impact of the DRD2/ANKK1-TaqIa polymorphism (rs1800497) on D2 receptor density has recently been questioned (Lucht and Rosskopf, 2008) since this SNP is located < 10 kb downstream of the DRD2 gene within a protein-coding region of the adjacent ANKK1 gene (Neville et al., 2004). Zhang et al. (2007) investigated 23 SNPs within the D2 gene and found a decreased expression of the short splice variant of the D2 receptor compared to the long splice variant caused by two intronic SNPs (rs2283265 & rs1076560). Interestingly, in the study by Zhang et al. (2007) the minor allele of the two SNPs shows strong linkage disequilibrium with the A1 allele of the DRD2/ANKK1-TaqIa polymorphism (D’ = 0.855). These data indicate that, owing to linkage, the DRD2/ANKK1-TaqIa polymorphism is a marker for dopamine receptor density. DNA was extracted from buccal cells. Automated purification of genomic DNA was conducted by means of the MagNA Pure® LC system using a commercial extraction kit (MagNA Pure LC DNA isolation kit; Roche Diagnostics, Mannheim, Germany). Genotyping of the three SNPs (rs1800497, rs1076560, rs2283265) was performed by real-time PCR using fluorescence melting curve detection analysis by means of the Light Cycler System 1.5 (Roche Diagnostics, Mannheim, Germany). The primers and hybridization probes (TIB MOLBIOL, Berlin, Germany) are as follows:

Haplotype analysis

Linkage analyses between SNPs and construction of haplotype blocks were conducted by means of Haploview 3.32 (http://www.broad.mit.edu/mpg/haploview/index.php). Two subjects did not give their approval for re-analyzing their genetic samples for rs2283265 and rs1076560. Therefore, the sample size for the haplotype analyses was n = 26 (one subject from each DRD2/ANKK1-TaqIa group missing). Individual haplotypes were calculated with PHASE, version 2.1. PHASE implements a Bayesian statistical method for reconstructing haplotypes from population genotype data. In simulation experiments, the mean error rate of PHASE was about half that obtained by the EM (expectation-maximization) algorithm (Stephens et al., 2001).

Experimental design

We employed a probabilistic response reversal task (Cools et al., 2002). In each trial, subjects were required to choose between two identical stimuli (two symbolic square buttons of the same color) located to the left and to the right of a central fixation cross. Subjects indicated their response with the index finger of the left or right hand. One of the two responses (left or right) was rewarded in 75% of the trials, while in the remaining 25% of trials, the other response was rewarded. Reward allocation to one of the two responses was thus mutually exclusive. After a predefined block length of 18 to 24 trials (randomly jittered), the contingencies reversed and the other response was now rewarded in 75% of the trials. Note that this reversal learning task is entirely response-based, implementing a reversal in response-reward mapping. This is in contrast to the task employed by Cools and colleagues (2002), which implements a reversal in the stimulus-reward mapping. Participants were instructed to switch to the other response only when they were sure that the rule had changed. Subjects underwent 19 blocks (and thus 18 contingency reversals), totaling 382 trials. Mean trial duration was 5 s.
Additionally, 46 null events of the same duration were randomly interspersed with the experimental trials. During null events, only the fixation cross was presented. The entire experiment lasted slightly less than 36 minutes. In each trial (see Fig. 1A
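
The reward schedule described in this section (75%/25% mutually exclusive reward allocation, blocks of 18 to 24 trials, 19 blocks) can be sketched as a small generator. This is an illustrative reconstruction under stated assumptions, not the presentation script used in the study: the names `make_schedule` and `other`, the uniform jitter, and the fixed seed are all hypothetical, and the sketch does not enforce the reported total of 382 trials or insert null events.

```python
import random


def other(side):
    """Return the opposite response side."""
    return "right" if side == "left" else "left"


def make_schedule(n_blocks=19, min_len=18, max_len=24, p_good=0.75, seed=0):
    """Build a trial list for a probabilistic response reversal task."""
    rng = random.Random(seed)
    good = rng.choice(["left", "right"])  # currently advantageous response
    schedule = []
    for block in range(n_blocks):
        # Block length is jittered (assumed uniform over min_len..max_len).
        for _ in range(rng.randint(min_len, max_len)):
            # Exactly one response is rewarded on each trial.
            rewarded = good if rng.random() < p_good else other(good)
            schedule.append({"block": block, "good": good, "rewarded": rewarded})
        good = other(good)  # contingency reversal at the block boundary
    return schedule


schedule = make_schedule()
reversals = sum(1 for a, b in zip(schedule, schedule[1:]) if a["good"] != b["good"])
print(len(schedule), "trials,", reversals, "reversals")
```

With 19 blocks the schedule always contains 18 reversals, matching the design; the trial count varies with the jitter.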

Image acquisition

Data acquisition was performed at 3 T on a Siemens Magnetom Trio (Erlangen, Germany) equipped with a standard birdcage head coil. Thirty slices (3 mm thickness, 0.3 mm interslice gap) were obtained parallel to the anterior commissure – posterior commissure (AC-PC) line using a single-shot gradient echo-planar imaging (EPI) sequence (repetition time: 2000 ms; echo time: 30 ms; bandwidth: 116 kHz; flip angle: 90°; 64 × 64 pixel matrix; field of view: 192 mm) sensitive to blood-oxygen level-dependent contrast. To improve the localization of activations, a high resolution brain image (3D reference data set) was recorded from each participant in a separate session using a modified driven equilibrium Fourier transform (MDEFT) sequence.

Image processing and analysis

Analysis of fMRI data was carried out using FSL (FMRIB's Software Library, Smith et al., 2004). Functional data were motion-corrected using rigid-body registration to the central volume (Jenkinson et al., 2002). Low frequency signals were removed using a high-pass filter (Gaussian-weighted least-squares straight-line fitting) with a cutoff of 1/100 Hz. Spatial smoothing was applied using a Gaussian filter with 8 mm full width at half maximum (FWHM). Slice-timing differences were corrected using Hanning-windowed sinc-interpolation. Registration of the EPI images with the high resolution brain images and normalization into standard (MNI) space was carried out using affine registration (Jenkinson and Smith, 2001). A general linear model was fitted in pre-whitened data space to account for local autocorrelations (Woolrich et al., 2001). Analysis I aimed at investigating effects of negative and positive feedback in general. Analysis II considered negative feedback in relation to reversals in task contingencies and behavioral changes. For Analysis I, negative and positive feedback were modeled at feedback onset and the contrast between negative and positive feedback (ALLNEG vs. ALLPOS) was assessed.
For Analysis II, a different trial classification was used, similar to the one employed by Cools et al. (2002): Negative feedback that was delivered following a correct response due to the probabilistic task schedule was termed a probabilistic error. When task contingencies reversed and subjects received negative feedback because they still applied the previously correct response, this was called a reversal error (REVERR), but only if the error was not followed by a change of behavior on the subsequent trial. In contrast, reversal errors that were followed by a switch to the then correct response on the next trial were considered to be final reversal errors (FINREVERR, Fig. 1B). The following contrasts were calculated and assessed within and between the two groups: For the effects of negative feedback in general, the contrast ALLNEG vs. ALLPOS was analyzed. To investigate activity on error trials that was specific to reversals, we compared final reversal errors with reversal errors (FINREVERR vs. REVERR). Furthermore, we tested whether activity to negative feedback was higher when this was immediately preceded by one (NEG+1) or two (NEG+2) negative feedback trials in comparison to the first negative feedback (NEG+0) after positive feedback trials. Therefore, the contrasts NEG+2 vs. NEG+1, NEG+2 vs. NEG+0 and NEG+1 vs. NEG+0 were calculated. Trials that fell into neither class were modeled as events of no interest. In addition, one would assume that subjects weight feedback differently, depending on whether it occurred early or late after a successful reversal, given that after reversal a certain number of trials had to elapse before contingencies reversed again. We therefore performed a comparison between trials occurring early and late after contingency reversal. Specifically, all trials occurring after the subject's final reversal error up to the next reversal in task contingencies were split into two halves of equal length, called HALF1 and HALF2.
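
The NEG+0, NEG+1 and NEG+2 labels introduced above depend only on how many negative outcomes immediately precede a negative feedback trial. A minimal sketch of that bookkeeping follows. The function name `label_negative_feedback` is hypothetical, and two details are assumptions rather than statements from the text: negative feedback preceded by three or more negative trials, and negative feedback on the very first trial (which has no preceding positive feedback), are both labeled OTHER, standing in for events of no interest.

```python
def label_negative_feedback(outcomes):
    """Label each trial by feedback type and preceding negative run length.

    outcomes: sequence of booleans, True for positive feedback.
    Returns a list with "POS", "NEG+0", "NEG+1", "NEG+2" or "OTHER".
    """
    labels = []
    run = 0  # consecutive negative outcomes immediately before this trial
    for i, positive in enumerate(outcomes):
        if positive:
            labels.append("POS")
            run = 0
            continue
        if i == 0 or run > 2:
            # Assumption: no preceding positive trial, or a run longer
            # than two, is treated as an event of no interest.
            labels.append("OTHER")
        else:
            labels.append(f"NEG+{run}")
        run += 1
    return labels


example = [True, False, False, False, False, True, False]
print(label_negative_feedback(example))
# ['POS', 'NEG+0', 'NEG+1', 'NEG+2', 'OTHER', 'POS', 'NEG+0']
```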
If the number of trials to divide was odd, the trial in the middle was modeled as event of no interest. Excluding all trials between a reversal in task contingencies and a subject's final reversal error enabled us to investigate positional effects of feedback independent of the effect caused by the accumulation of negative feedback due to a rule reversal. We investigated the effect of positive and negative feedback within both halves separately (POS vs. NEG and NEG vs. POS in HALF1 and HALF2) as well as between the two halves (HALF1 vs. HALF2 and HALF2 vs. HALF1, for positive and negative feedback, respectively). These contrasts were then compared between the two genetic groups. Finally, timecourses of the hemodynamic response function to final reversal errors and to NEG+0, NEG+1 and NEG+2 were extracted from regions of interest in the ventral striatum, the lOFC and mesial frontal cortex using PEATE (Perl Event-Related Average Timecourse extraction), a companion tool to FSL (http://www.jonaskaplan.com/peate/peate-cocoa.html). Behavioral data and timecourses of the hemodynamic response were tested for group-differences using one-tailed t-tests for independent samples. A p-value <0.05 was considered statistically significant. One tailed tests were employed for the following reasons: In case of the fMRI data, we expected attenuated responses in the lOFC, ventral striatum and mesial prefrontal cortex based on previous work showing reduced glucose metabolism in these areas in carriers of the A1 allele (Noble et al., 1997). Our behavioral predictions of increased switching and reduced persistence were driven by the reports of an association of the A1 allele with increased impulsivity (Limosin et al., 2003; Eisenberg et al., 2007) and the observation that ventral striatal D2 receptor expression is reduced in rats with increased levels of trait impulsivity (Dalley et al., 2007). 
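
The HALF1/HALF2 split and the win-stay and lose-shift probabilities reported in the Results can be sketched as follows. Both helpers are illustrative assumptions (hypothetical names `split_halves` and `win_stay_lose_shift`), not the authors' analysis code; in particular, the stay/shift measure is computed here over any trial sequence passed in, without the block bookkeeping a full analysis would need.

```python
def split_halves(trials):
    """Split trials into two equal halves; an odd middle trial is set aside."""
    n = len(trials)
    k = n // 2
    half1 = trials[:k]
    middle = trials[k:n - k]  # empty for even n, one trial for odd n
    half2 = trials[n - k:]
    return half1, middle, half2


def win_stay_lose_shift(choices, positive):
    """Return (P(stay | positive feedback), P(shift | negative feedback)).

    choices: response on each trial; positive: True for positive feedback.
    A probability is None when its conditioning event never occurred.
    """
    wins = stays = losses = shifts = 0
    for t in range(len(choices) - 1):
        stayed = choices[t + 1] == choices[t]
        if positive[t]:
            wins += 1
            stays += stayed
        else:
            losses += 1
            shifts += not stayed
    win_stay = stays / wins if wins else None
    lose_shift = shifts / losses if losses else None
    return win_stay, lose_shift


half1, middle, half2 = split_halves(list(range(7)))
print(half1, middle, half2)  # [0, 1, 2] [3] [4, 5, 6]
```

Setting the middle trial aside keeps the two halves equal in length, which mirrors the description of odd-length segments above.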

Results

Genetic analyses

The genotype frequencies of all three SNPs under investigation were in Hardy-Weinberg equilibrium: DRD2/ANKK1-TaqIa (rs1800497): A1/A1: n = 1, A1/A2: n = 11, A2/A2: n = 14, χ² = 0.43, df = 1, n.s.; rs1076560: C/C: n = 16, C/A: n = 9, A/A: n = 1, χ² = 0.04, df = 1, n.s.; rs2283265: G/G: n = 16, G/T: n = 9, T/T: n = 1, χ² = 0.04, df = 1, n.s. The three SNPs build a haplotype block (see Fig. S1) spanning 15 kb according to the method by Gabriel et al. (2002). D’ was 1.0 for all linkages besides the one between rs1800497 and rs2283265 (D’ = 0.78). Three different haplotypes could be identified (Tab. S1) resulting in four different haplotype combinations (Tab. S2). Results of the haplotype analysis suggested testing the most frequent haplotype combination (CCG-CCG) against the rest. All CCG-CCG haplotype carriers belong to the A1− group. Thus our haplotype analysis corroborates the reported linkage between rs1800497 (DRD2/ANKK1-TaqIa) and the two other SNPs (rs1076560 & rs2283265) influencing the splicing of the DRD2 gene (Zhang et al., 2007). In case of an A2 allele in rs1800497, the alleles on rs1076560 and rs2283265 can be perfectly predicted. For the A1 allele of rs1800497 the linkage is not perfect, resulting in alternative allele combinations. Grouping by haplotypes thus does not yield any further information than that provided by the DRD2/ANKK1-TaqIa SNP. The fMRI and the behavioral data were therefore analyzed by grouping participants according to the DRD2/ANKK1-TaqIa alleles.

Behavioral data

The overall amount of rewards collected did not differ between the two genetic groups (p>0.9). The total number of reversal errors did not differ between groups (p>0.16). The total average number of reversal errors was (mean ± SEM) 63.76 ± 4.38 for the A1− group and 57.00 ± 5.18 for the A1+ group (Fig. S2A). However, subjects from the A1+ group switched between the two response alternatives more frequently (p<0.045, Fig.
S2B) than the A1− group. Interestingly, even immediately after having received positive feedback, subjects from the A1+ group frequently switched to the other response on the next trial, a behavior that was rarely observed in the A1− group (p<0.015, Fig. S2C). To investigate this response pattern in more detail, we analyzed to what extent subjects sustained their new response after a reversal due to a change in task contingency. Specifically, we analyzed the eight trials following a final reversal error and analyzed, for all 18 blocks, the proportion of trials after the reversal in which subjects maintained the newly correct response before they switched back to the (now) incorrect response. Figure 2
We divided the trials remaining in each block after a final reversal error into two halves (HALF1 & HALF2; see “image processing and analysis”). Next, we calculated the probability of staying after positive and shifting after negative feedback separately for the two halves and compared these probabilities within and between groups. Two-way repeated measures ANOVA with the factors HALF (two halves) and GROUP (two groups) was used to assess these differences. The probability of shifting after negative feedback was higher in the second compared to the first half of the block (effect of HALF, F1,27 = 18.67, p<0.001). This lose-shift probability was not different between groups (no effect of GROUP, no GROUP × HALF interaction; ps > 0.252). In contrast, the probability of staying after positive feedback was higher in the first than in the second half of the block (effect of HALF, F1,26 = 87.39, p<0.001). This win-stay probability was higher in the A1− compared to the A1+ group (effect of GROUP, F1,26 = 4.41, p=0.045, Fig. S3), and this group difference was present in both halves (ps<0.044).

Imaging data

Negative feedback (ALLNEG vs. ALLPOS) induced a significant increase in BOLD signal in the RCZ, the bilateral ventral anterior insula and the left middle frontal gyrus (Fig. 3
Reversal-related activity (FINREVERR vs. REVERR) was found in the same regions as described above. Additional signal change was found in the lOFC bilaterally. Furthermore, there was widespread increase of signal in the bilateral striatum and in a region of the ventral midbrain consistent with the location of the dopaminergic ventral tegmental area and pars compacta of the substantia nigra (Fig. 4
A further group difference was observed in the right lateral orbitofrontal cortex (MNI x=53, y=37, z=−5; p<0.025). Timecourses were extracted from a sphere (3 mm radius) centered at the peak coordinate showing an increased response to final reversal errors in the A1− group compared to the A1+ group. The amplitude of the hemodynamic response to final reversal errors was higher in the A1− compared to the A1+ group at five and six seconds after event onset (ps<0.016, Fig 6
We also investigated whether negative feedback encoding is dependent on the outcome (positive or negative) of the immediately preceding trials. In both groups, negative feedback evoked on the whole brain level a stronger response in the RCZ when it was preceded by one (NEG+1) or by two trials (NEG+2) with negative feedback. Comparing NEG+2 vs. NEG+0 between the genotypes revealed that this contrast was diminished in the A1+ group (Fig 7A
Similar to the behavioral analyses, we investigated whether the hemodynamic response to positive and negative feedback was different between the first and second half of each block. Positive feedback in HALF1 (Contrast: positive feedback in HALF1 vs. negative feedback in HALF1) evoked a marked signal increase in striatum, in particular in the A1− group where this effect was present bilaterally (MNI x=−28, y=−4, z=7 and x=32, y=−1, z=−8), and unilaterally in the A1+ group (MNI x=−29, y=5, z=−2). These activations cover large extents of the putamen, particularly in the A1− group, with the peaks located more posterior and dorsal compared to the peak of the final reversal error activation (Fig. S4). In contrast, in HALF2, positive feedback failed to significantly engage the striatum (Contrast: positive feedback in HALF2 vs. negative feedback in HALF2). Negative feedback in contrast exerted a markedly stronger influence on RCZ activity (MNI x=5, y=14, z=44 and x=1, y=12, z=49, for the A1− and A1+ groups, respectively) when it occurred in HALF2 as compared to HALF1 (Contrast: negative feedback in HALF2 vs. negative feedback in HALF1). None of these effects differed between the two genetic groups.

Discussion

In the present study, the overall network of brain regions we found to be activated by negative feedback per se (anterior insula, RCZ, middle frontal gyrus) and by final reversal errors (lateral orbitofrontal cortex, ventral striatum) is consistent with the literature (Cools et al., 2002; Cohen et al., 2007; Dodds et al., 2008). In addition, our results demonstrate that a genetically driven reduction in striatal D2 receptors affects performance in a probabilistic reversal learning task. The behavioral alteration did not consist of increased perseverative errors. Rather, A1+ subjects, having reduced D2 receptor density compared to A1− subjects, had difficulty in maintaining the newly rewarded response after behavioral adaptation in response to a change in task rule.
Moreover, these subjects were in general more likely to switch back and forth between the response alternatives. In particular, A1+ subjects frequently switched to the other response although they had just been reinforced for the response they made. These subtle behavioral differences were accompanied by changes in feedback-related BOLD signals. The final reversal error engaged the ventral striatum and the lOFC in the A1− group to a greater extent than in the A1+ group. Interestingly, the amplitude of the ventral striatal response to the final reversal error was also predictive of subjects’ propensity to maintain the newly correct response: The higher the ventral striatal response, the less rapidly subjects switched back to the incorrect response. Furthermore, activity in the RCZ increased as a function of preceding negative feedback. That is, the more negative feedback trials preceded a negative outcome, the stronger was the response in the RCZ. This graded response to consecutive negative outcomes was absent in the A1+ group: While in these subjects, activity in the RCZ increased from the first to the second negative feedback, no further increase from the second to the third negative feedback was observed. Interestingly, the graded response of the RCZ to negative feedback was also predictive of subjects’ behavior after a final reversal error: The more the activity in RCZ increased with the number of preceding negative feedback, the more likely subjects maintained the newly correct response. Additionally, we observed that feedback differentially influenced subjects’ behavior, depending on its position in the block. In general, lose-shift behavior occurred more frequently in the second half of each block, and win-stay behavior was more frequent in the first half. However, win-stay behavior in both parts of the block was less frequent in subjects from the A1+ group, consistent with their decreased tendency to maintain the newly correct response after a reversal. 
These effects of feedback position were also found in the fMRI data. A pronounced striatal response to positive feedback, located, in particular, in large extents of the posterior two thirds of the putamen, was only observed in the first half of the block. Here, positive feedback can be thought of as being most informative, particularly in the first trials after reversal, when rewards confirm that the decision to switch was correct. Nevertheless, hemodynamic response amplitudes to positive feedback in the first half were not correlated with the tendency to maintain the newly correct response. Negative feedback, in contrast, evoked clear-cut activation of the RCZ in both parts of the block, but the response of the RCZ was markedly stronger in the second half. Therefore, it seems that subjects ascribe more relevance to negative feedback that occurs later in a block, which is paralleled by the increased incidence of lose-shift behavior in the second half of the block. The reduced ventral striatal response to final reversal errors in the A1+ group may either be a direct consequence of the reduced D2 receptor density in this region or secondary to the reduced glucose metabolism in the RCZ (Noble et al., 1997), which might entail an impaired integration of negative feedback. We would speculate that, as subjects accrue more and more negative feedback upon reversals, activity in the RCZ gradually increases up to a certain threshold. When activity exceeds this threshold, the RCZ engages the striatum via its efferents (Müller-Preuss and Jurgens, 1976; Yeterian and Van Hoesen, 1978; Baleydier and Mauguiere, 1980; Devinsky et al., 1995; Takada et al., 2001) to trigger a behavioral adaptation. In A1+ subjects equipped with reduced striatal D2 receptor density, the already altered information arriving from the RCZ might be further degraded in the ventral striatum by maladaptive corticostriatal integration, due to the relative lack of D2 receptors. 
We also found an area of the midbrain comprising the dopaminergic nuclei of the ventral tegmental area (VTA) and substantia nigra (SNPC) to be activated on final reversal errors. This suggests that the dopaminergic midbrain is recruited by the OFC, RCZ (but see Frankle et al. (2006) for sparse cortical projections to the midbrain) or ventral striatum upon reversals. Alternatively, reversal-related activity in the OFC and RCZ might also be modulated by engagement of the VTA and SNPC. The impaired maintenance of the correct response after a behavioral switch shown by the A1+ subjects may be due to inefficient updating of stimulus-reward associations (Rolls, 2000). This behavior parallels findings from patients with bilateral lesions in the OFC showing the same win-shift behavior in a visual discrimination reversal task (Hornak et al., 2004). Both D2 receptor agonism and antagonism have been shown to impair reversal learning (Ridley et al., 1981; Smith et al., 1999; Mehta et al., 2001; Lee et al., 2007). A recent study found that reversal-related activity in the ventral striatum was diminished by the catecholamine-releasing drug methylphenidate, but not by the D2 receptor antagonist sulpiride (Dodds et al., 2008). As the authors discussed, this lack of effect of sulpiride may be a consequence of the dose they used (400 mg) which might have been too low to occupy a substantial proportion of D2 receptors. In another recent study, however, Lee and colleagues reported that antagonizing D2 receptors with raclopride diminished behavioral flexibility in monkeys thereby increasing the number of reversal errors in a response reversal task (Lee et al., 2007). It is not clear, if the effects of D2 receptor agonists and antagonists on reversal learning are mediated by action on receptors in the striatum (ventral or dorsomedial) or in other (for instance, fronto-cortical) brain regions. 
The reduction of striatal D2 receptors in the A1+ subjects suggests that the effect is indeed mediated by striatal D2 receptors (but see: Calaminus and Hauber, 2007). However, it is not clear yet whether the DRD2/ ANKK1-TaqIa polymorphism also affects D2 receptor density in brain areas other than the striatum. Speaking in favor of a central role for intact dopaminergic transmission in the striatum, Frank and colleagues, using elaborate computational models, provided evidence that disrupted DA signaling in the ventral striatum might be held responsible for impairments in reversing behavior after a switch in task contingencies in a probabilistic learning task (Frank, 2005; Frank and Claus, 2006). In our task, subjects were instructed to switch to the alternative response only when they were sure that the contingencies had reversed. Therefore, it is not surprising that we only found a slight, non-significant reduction in the overall number of reversal errors in the A1+ group. This is consistent with findings of another study that also compared reversal learning in A1+ and A1− subjects (Cohen et al., 2007). However, our analysis of the behavioral pattern after the reversal of task contingency revealed that, even though A1+ subjects did not take longer to switch to the correct response, they were less likely than A1− subjects to sustain this new response and frequently reverted back to the previously correct response. This pattern is remarkably reminiscent of the deficit Kennerley and colleagues (2006) observed in macaque monkeys with lesions of the anterior cingulate sulcus. Lesioned animals did not take more trials to switch to the correct response, but they were impaired at maintaining this new response on the next trials. Even after having collected several rewards for the new response, they were still likely to revert back to the previously reinforced response. 
Kennerley and colleagues argue that one function of the dorsal ACC/RCZ is to integrate action-outcome associations over multiple trials (“reinforcement history”) and that the lesion interfered with this function. In agreement with this interpretation, we found that negative action outcomes were not encoded in a uniform fashion in the RCZ. Rather, the response of this brain region to negative feedback depended on the outcome of previous trials: the more consecutive negative outcomes preceded a negative feedback, the more pronounced was the BOLD response in the RCZ. This integrative function seems to be reduced in carriers of the A1 allele. Our results concur with previous findings showing that A1+ subjects had difficulty in learning from negative feedback in a reinforcement learning task, which was accompanied by diminished responses of the RCZ to negative feedback in this group (Klein et al., 2007). This suggests a general role of D2 receptors in feedback-based learning.

Taken together, the results of the present study show that in a probabilistic reversal learning task, negative action outcomes are integrated over multiple trials in the RCZ. Upon behavioral adaptation to a reversal of task contingencies, the lateral orbitofrontal cortex (lOFC) and ventral striatum are engaged. Carriers of the A1 allele show deficient integration of feedback in the RCZ and reduced recruitment of the ventral striatum and the lOFC during reversal. This diminished engagement of reversal-relevant brain areas likely makes the subjects’ decisions less stable, and thereby causes them to revert to previously successful actions more frequently. Our findings suggest that striatal and possibly cortical D2 receptors are crucial for the integration of action outcomes and successful reversal learning.

Supplementary Material
Supp1 (PDF, 732K)

Acknowledgements
Jane Neumann's work was supported by a grant from the NIH (R01 MH74457).
Gerhard Jocham's work was supported by a grant from the Deutsche Forschungsgemeinschaft (JO-787/1-1).