An fMRI study of reward-related probability learning
Introduction
Learning what choice is best comes with experience. In order to maximize rewards, an organism will strive to make better choices based on trial and error. Thus, it is imperative that brain mechanisms exist to support early learning of contingencies that will lead to a rewarding outcome. The goal of the present study is to investigate how the human brain behaves during a reward-learning paradigm, specifically the acquisition and progression of reward learning. One structure that has been implicated in processing of reward-related information is the striatum, the input unit of the basal ganglia, and specifically the caudate nucleus, part of the dorsal striatum. The striatum is a component of multiple cortico-striatal loops that are modulated by dopaminergic neurons in the midbrain, which have been shown to increase firing to unexpected rewards and conditioned stimuli that predict a reward. Due to its heterogeneity in terms of function and connectivity, the striatum is in a prime position to integrate cognitive and motivational information and influence goal-directed behavior. The human striatum is therefore a possible key structure in the acquisition of contingencies that lead to a reward.
Previous research has suggested a role for the striatum in processing reward-related information across species. Significant increases in dopamine release in the striatum, for example, have been observed during cocaine self-administration in rats (Di Chiara and Imperato, 1988, Ito et al., 2002). Neurons in the monkey striatum have been shown to respond to the anticipation (Apicella et al., 1992, Kawagoe et al., 1998) and delivery (Apicella et al., 1991, Hikosaka et al., 1989) of rewards. In accordance with animal studies, brain imaging studies of the human striatum have observed activity during the processing of both primary and secondary rewards (Aharon et al., 2001, Berns et al., 2001, Breiter et al., 2001, Delgado et al., 2000, Delgado et al., 2003, Elliott et al., 2004, Kirsch et al., 2003, Knutson et al., 2000, Knutson et al., 2001a, O'Doherty et al., 2002, O'Doherty et al., 2004, Pagnoni et al., 2002). The striatum's response to the anticipation and delivery of rewards and punishments suggests that it may be a key structure in affective learning. Indeed, as argued by Schultz et al. (2003), learning can be viewed as a change in outcome predictions and the acquisition of discriminatory responses to different stimuli may reflect the learning of appropriate behavioral actions.
Although the striatum responds to anticipation and delivery of rewards, the caudate nucleus, a component of the dorsal portion of the striatum, does not seem to respond to the reward per se. Rather, it seems to be more vigorously recruited when an outcome is contingent on an action (Tricomi et al., 2004), suggesting a larger role for reinforcement-based processing, where predictions and feedback help adjust behavior. The plasticity of the striatum allows for such rapid reinforcement of actions as shown in dynamic and efficacious synaptic changes in the rat throughout learning of a procedural task (Jog et al., 1999) and during self-stimulation (Reynolds et al., 2001, Wickens et al., 2003). Thus, the caudate nucleus' unique role in reward processing may be to contribute to the brain's ability to learn through reinforcement.
The caudate nucleus is one of the main regions affected in degenerative disorders such as Parkinson's and Huntington's disease. In accordance with the idea that the caudate is important during feedback-based learning, patients with Parkinson's disease are slower during initial learning of an associative learning task (Myers et al., 2003), as compared to control subjects, and show deficits during a feedback-based learning task, as opposed to intact learning during a nonfeedback version of the same paradigm (Shohamy et al., 2004). Similarly, patients with Huntington's disease have poor performance on a trial and error incidental learning task, a type of learning thought to be dependent on the integrity of the caudate nucleus (Brown et al., 2001).
The striatum, particularly the caudate nucleus, is therefore a structure involved in processing reward-related information and various aspects of learning. Research suggests that the caudate may be an essential component of a brain circuit that allows us to improve our choices through trial and error learning. However, it is unclear whether this observed pattern of results extends from cognitive to more affective learning, where feedback properties are both informative and incentive-laden (representing possible gain or losses). The goal of this experiment was to investigate the role of the human striatum during reward-related contingency learning.
The present study investigated how activity in the caudate nucleus is modulated as reward learning progresses, specifically looking at the early stages of learning, when associations between action and outcome are being formed, and at later stages, when the well-learned stimulus–response associations are performed. We used a gambling paradigm where wins and losses were determined on the basis of guessing, but learning of stimulus–response contingencies could influence future performance. Participants were instructed that different cues, presented prior to making a guess, predicted what type of choice was more likely to lead to a reward. The introduction of a learning component ensured that participants had a chance to maximize their rewards based on actual performance, allowing us to investigate how activity in the striatum, particularly the caudate nucleus, is modulated as learning of affective contingencies progresses.
Section snippets
Participants
Seventeen right-handed volunteers participated in this study (9 male, 8 female; average age: M = 23.29, SD = 3.31). Participants were recruited through posted advertisements, and all gave informed consent.
Procedure
The paradigm involved a series of 120 interleaved trials, divided into 10 runs of 12 trials each. Participants were instructed that they would see a card and were asked to guess whether the value of the card was higher or lower than the number 5. Each individual trial represented one specific
Behavioral results
Analysis was conducted on all 17 participants to investigate behavioral effects of gender, trial order and overall monetary gain. Each participant's monetary score was calculated at the end of the session and took into account correct ($1.00 gain), incorrect (−$0.50 loss) and missed trials (−$1.00 loss). Participants' scores ranged from $34.50 to $78.00 (M = 58.15, SD = 12.28). Using the monetary score for each participant, we then looked at any effects of trial order (version 1 and version 2) and
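The scoring rule described above can be illustrated with a minimal sketch. The payoffs ($1.00, −$0.50, −$1.00) are taken from the text; the function name `monetary_score`, the outcome labels, and the example session are hypothetical and are not the authors' analysis code or data.

```python
from collections import Counter

# Payoff per trial outcome in dollars, as stated in the text.
PAYOFFS = {"correct": 1.00, "incorrect": -0.50, "missed": -1.00}


def monetary_score(outcomes):
    """Sum the payoffs over a sequence of trial outcome labels.

    `outcomes` is a hypothetical encoding: one label per trial, each one of
    "correct", "incorrect" or "missed".
    """
    counts = Counter(outcomes)
    unknown = set(counts) - set(PAYOFFS)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    return sum(PAYOFFS[label] * n for label, n in counts.items())


# Hypothetical 120-trial session (illustrative counts, not study data):
# 80 correct, 38 incorrect, 2 missed -> 80.00 - 19.00 - 2.00 = 59.00
example_session = ["correct"] * 80 + ["incorrect"] * 38 + ["missed"] * 2
print(f"${monetary_score(example_session):.2f}")
```

Under this rule a missed trial costs twice as much as an incorrect guess, so a participant's total depends both on guessing accuracy and on responding within the allotted time.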
Discussion
The goal of this study was to investigate how the human brain processes learning of reward contingencies. Specifically, we investigated brain regions thought to be important during the acquisition of reward associations and their modulation as learning progresses. By using a gambling paradigm (where rewards were attained on the basis of guessing) that contained probabilistic cues (which informed the participant as to which choice or guess was more likely to lead to a reward), we were
Acknowledgments
This work was supported by NIMH 62104 to EAP and Center for Brain Imaging, NYU. The authors wish to acknowledge Kate Fissell, Brett Sedgewick and Ben Holmes for technical assistance, Susan Ravizza and Elizabeth Tricomi for informative discussion and constructive criticism. The authors would also like to acknowledge the support of the Beatrice and Samuel A. Seaver Foundation.
References (64)
- et al.
Beautiful faces have variable reward value: fMRI and behavioral evidence
Neuron
(2001)
Reinforcement learning control
Curr. Opin. Neurobiol.
(1994)
- et al.
Functional imaging of neural responses to expectancy and experience of monetary gains and losses
Neuron
(2001)
AFNI: software for analysis and visualization of functional magnetic resonance neuroimages
Comput. Biomed. Res.
(1996)
- et al.
Instrumental responding for rewards is associated with enhanced neuronal response in subcortical reward systems
NeuroImage
(2004)
The primate basal ganglia: parallel and integrative networks
J. Chem. Neuroanat.
(2003)
- et al.
Anticipation of reward in a nonaversive differential conditioning paradigm and the brain reward system: an event-related fMRI study
NeuroImage
(2003)
- et al.
fMRI visualization of brain activity during a monetary incentive delay task
NeuroImage
(2000)
- et al.
Temporal prediction errors in a passive learning task activate human striatum
Neuron
(2003)
- et al.
Basal ganglia output and cognition: evidence from anatomical, behavioral, and clinical studies
Brain Cogn.
(2000)
Neural responses during anticipation of a primary taste reward
Neuron
Temporal difference models and reward-related learning in the human brain
Neuron
Inactivation of hippocampus or caudate nucleus with lidocaine differentially affects expression of place and response learning
Neurobiol. Learn. Mem.
Changes in behavior-related neuronal activity in the striatum during learning
Trends Neurosci.
Modulation of caudate activity by action contingency
Neuron
Multiple parallel memory systems in the brain of the rat
Neurobiol. Learn. Mem.
Neural mechanisms of reward-related motor learning
Curr. Opin. Neurobiol.
Human striatal responses to monetary reward depend on saliency
Neuron
Responses to reward in monkey dorsal and ventral striatum
Exp. Brain Res.
Neuronal activity in monkey striatum related to the expectation of predictable environmental events
J. Neurophysiol.
Human midbrain sensitivity to cognitive feedback and uncertainty during classification learning
J. Neurophysiol.
Predictability modulates human brain response to reward
J. Neurosci.
Dissociation between intentional and incidental sequence learning in Huntington's disease
Brain
Loss of lever press-related firing of rat striatal forelimb neurons after repeated sessions in a lever pressing task
J. Neurosci.
Tracking the hemodynamic responses to reward and punishment in the striatum
J. Neurophysiol.
Dorsal striatum responses to reward and punishment: effects of valence and magnitude manipulations
Cognit. Affective Behav. Neurosci.
Motivation-dependent responses in the human caudate nucleus
Cereb. Cortex
Drugs abused by humans preferentially increase synaptic dopamine concentrations in the mesolimbic system of freely moving rats
Proc. Natl. Acad. Sci. U. S. A.
Dissociable neural responses in human reward systems
J. Neurosci.
Discrete coding of reward probability and uncertainty by dopamine neurons
Science
Improved assessment of significant activation in functional magnetic resonance imaging (fMRI): use of a cluster-size threshold
Magn. Reson. Med.
How do people solve the “weather prediction” task?: individual variability in strategies for probabilistic category learning
Learn. Mem.