J Neurosci. 2013 May 15;33(20):8866-90. doi: 10.1523/JNEUROSCI.4614-12.2013.

Dopaminergic control of motivation and reinforcement learning: a closed-circuit account for reward-oriented behavior.

Author information

Physical and Health Education, Graduate School of Education, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan.


Humans and animals take actions quickly when they expect that the actions will lead to reward, reflecting their motivation. Injection of dopamine receptor antagonists into the striatum has been shown to slow such reward-seeking behavior, suggesting that dopamine is involved in the control of motivational processes. Meanwhile, neurophysiological studies have revealed that the phasic response of dopamine neurons appears to represent reward prediction error, indicating that dopamine plays a central role in reinforcement learning. However, previous attempts to elucidate the mechanisms of these dopaminergic controls have not fully explained how the motivational and learning aspects are related, or whether they can be understood in terms of how the activity of dopamine neurons is itself controlled by upstream circuitry. To address this issue, we constructed a closed-circuit model of the cortico-basal ganglia system based on recent findings regarding intracortical and corticostriatal circuit architectures. Simulations show that the model reproduces the observed distinct motivational effects of D1- and D2-type dopamine receptor antagonists. The model also explains the dopaminergic representation of reward prediction error observed in behaving animals during learning tasks, as well as the distinct choice biases induced by optogenetic stimulation of D1 and D2 receptor-expressing striatal neurons. These results indicate that the suggested roles of dopamine in motivational control and reinforcement learning can be understood in a unified manner through the notion that the indirect pathway of the basal ganglia represents the value of states/actions at a previous time point, an empirically driven key assumption of our model.
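
For readers unfamiliar with the term, reward prediction error is commonly formalized as the temporal-difference (TD) error, which compares the value of the current state with the value of the state at the previous time point. The sketch below is a generic TD(0) illustration of that quantity, not the closed-circuit model described in the abstract. The function names, the five-state chain task, and the parameters `alpha` and `gamma` are assumptions chosen only for illustration.

```python
def td_error(reward, v_current, v_previous, gamma=0.9):
    """Standard TD error: delta = r + gamma * V(current) - V(previous)."""
    return reward + gamma * v_current - v_previous


def run_trials(n_states=5, n_trials=300, alpha=0.1, gamma=0.9):
    """Hypothetical chain task: states 0..n_states-1, reward on leaving the last one."""
    # The extra final entry is the terminal state; its value stays at 0.
    values = [0.0] * (n_states + 1)
    last_errors = []
    for _ in range(n_trials):
        last_errors = []
        for s in range(n_states):
            reward = 1.0 if s == n_states - 1 else 0.0
            # s is the state at the previous time point, s + 1 the current one.
            delta = td_error(reward, values[s + 1], values[s], gamma)
            values[s] += alpha * delta  # move the old estimate toward the target
            last_errors.append(delta)
    return values, last_errors


values, last_errors = run_trials()
```

In this toy setting the error at reward delivery starts at 1.0 on the first trial and shrinks toward zero as the learned values come to predict the reward, which mirrors the qualitative pattern reported for phasic dopamine responses.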

[Indexed for MEDLINE]
Free full text
