An fMRI study of reward-related probability learning

doi:10.1016/j.neuroimage.2004.10.002

NeuroImage

Volume 24, Issue 3, 1 February 2005, Pages 862-873

https://doi.org/10.1016/j.neuroimage.2004.10.002

Abstract

The human striatum has been implicated in processing reward-related information. More recently, activity in the striatum, particularly the caudate nucleus, has been observed when a contingency between behavior and reward exists, suggesting a role for the caudate in reinforcement-based learning. Using a gambling paradigm, in which affective feedback (reward and punishment) followed simple, random guesses on a trial-by-trial basis, we sought to investigate the role of the caudate nucleus as reward-related learning progressed. Participants were instructed to guess the value of a presented card (whether the value of the card was higher or lower than 5). They were told that five different cues would be presented prior to making a guess, and that each cue indicated the probability that the card would be high or low. The goal was to learn the contingencies and maximize the reward attained. Accuracy, as measured by participants' choices, improved throughout the experiment for cues that strongly predicted reward, while no change was observed for unpredictable cues. Event-related fMRI revealed that activity in the caudate nucleus was more robust during the early phases of learning, irrespective of contingencies, suggesting involvement of this region during the initial stages of trial-and-error learning. Further, the reward feedback signal in the caudate nucleus for well-learned cues decreased as learning progressed, suggesting an evolving adaptation of reward feedback expectancy as a behavior–outcome contingency becomes more predictable.

Introduction

Learning what choice is best comes with experience. In order to maximize rewards, an organism will strive to make better choices based on trial and error. Thus, it is imperative that brain mechanisms exist to support early learning of contingencies that will lead to a rewarding outcome. The goal of the present study is to investigate how the human brain behaves during a reward-learning paradigm, specifically the acquisition and progression of reward learning. One structure that has been implicated in processing of reward-related information is the striatum, the input unit of the basal ganglia, and specifically the caudate nucleus, part of the dorsal striatum. The striatum is a component of multiple cortico-striatal loops that are modulated by dopaminergic neurons in the midbrain, which have been shown to increase firing to unexpected rewards and conditioned stimuli that predict a reward. Due to its heterogeneity in terms of function and connectivity, the striatum is in a prime position to integrate cognitive and motivational information and influence goal-directed behavior. The human striatum is therefore a possible key structure in the acquisition of contingencies that lead to a reward.

Previous research has suggested a role for the striatum in processing reward-related information across species. Significant increases in dopamine release in the striatum, for example, have been observed during cocaine self-administration in rats (Di Chiara and Imperato, 1988; Ito et al., 2002). Neurons in the monkey striatum have been shown to respond to the anticipation (Apicella et al., 1992; Kawagoe et al., 1998) and delivery (Apicella et al., 1991; Hikosaka et al., 1989) of rewards. In accordance with animal studies, brain imaging studies of the human striatum have observed activity during the processing of both primary and secondary rewards (Aharon et al., 2001; Berns et al., 2001; Breiter et al., 2001; Delgado et al., 2000; Delgado et al., 2003; Elliott et al., 2004; Kirsch et al., 2003; Knutson et al., 2000; Knutson et al., 2001a; O'Doherty et al., 2002; O'Doherty et al., 2004; Pagnoni et al., 2002). The striatum's response to the anticipation and delivery of rewards and punishments suggests that it may be a key structure in affective learning. Indeed, as argued by Schultz et al. (2003), learning can be viewed as a change in outcome predictions and the acquisition of discriminatory responses to different stimuli may reflect the learning of appropriate behavioral actions.
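The view of learning as a change in outcome predictions can be illustrated with a simple error-driven update rule. The sketch below is only an illustration under stated assumptions, not the analysis used in this study: the function name, the learning rate of 0.2, and the fully predictive cue are all hypothetical. It shows how the mismatch between outcome and expectation shrinks over repeated trials once a cue reliably predicts reward.

```python
def prediction_errors(outcomes, learning_rate=0.2, initial_expectation=0.0):
    """Track the reward expectation for one cue; return the error on each trial.

    Assumption: a simple delta-rule update (expectation moves a fixed
    fraction of the way toward each observed outcome).
    """
    expectation = initial_expectation
    errors = []
    for outcome in outcomes:
        error = outcome - expectation          # mismatch between outcome and prediction
        expectation += learning_rate * error   # move the prediction toward the outcome
        errors.append(error)
    return errors


# Hypothetical fully predictive cue: rewarded (1.0) on each of 10 trials.
errors = prediction_errors([1.0] * 10)
print([round(e, 3) for e in errors])
```

With these assumed values the error starts at 1.0 and shrinks on every trial, which is one way to picture a feedback signal that diminishes as a contingency becomes predictable.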

Although the striatum responds to anticipation and delivery of rewards, the caudate nucleus, a component of the dorsal portion of the striatum, does not seem to respond to the reward per se. Rather, it seems to be more vigorously recruited when an outcome is contingent on an action (Tricomi et al., 2004), suggesting a larger role for reinforcement-based processing, where predictions and feedback help adjust behavior. The plasticity of the striatum allows for such rapid reinforcement of actions, as shown in dynamic and efficacious synaptic changes in the rat throughout learning of a procedural task (Jog et al., 1999) and during self-stimulation (Reynolds et al., 2001; Wickens et al., 2003). Thus, the caudate nucleus' unique role in reward processing may be to contribute to the brain's ability to learn through reinforcement.

The caudate nucleus is one of the main regions affected in degenerative disorders such as Parkinson's and Huntington's disease. In accordance with the idea that the caudate is important during feedback-based learning, patients with Parkinson's disease are slower during initial learning of an associative learning task (Myers et al., 2003), as compared to control subjects, and show deficits during a feedback-based learning task, as opposed to intact learning during a nonfeedback version of the same paradigm (Shohamy et al., 2004). Similarly, patients with Huntington's disease have poor performance on a trial and error incidental learning task, a type of learning thought to be dependent on the integrity of the caudate nucleus (Brown et al., 2001).

The striatum, particularly the caudate nucleus, is therefore a structure involved in processing reward-related information and various aspects of learning. Research suggests that the caudate may be an essential component of a brain circuit that allows us to improve our choices through trial-and-error learning. However, it is unclear whether this observed pattern of results extends from cognitive to more affective learning, where feedback properties are both informative and incentive-laden (representing possible gains or losses). The goal of this experiment was to investigate the role of the human striatum during reward-related contingency learning.

The present study investigated how activity in the caudate nucleus is modulated as reward learning progresses, specifically looking at the early stages of learning, when associations between action and outcome are being formed, and at later stages, when the well-learned stimulus–responses are performed. We used a gambling paradigm where wins and losses were determined on the basis of guessing, but learning of stimulus–response contingencies could influence future performance. Participants were instructed that different cues, presented prior to making a guess, predicted what type of choice was more likely to lead to a reward. The introduction of a learning component ensured that participants had a chance to maximize their rewards based on actual performance, allowing us to investigate how activity in the striatum, particularly the caudate nucleus, is modulated as learning of affective contingencies progresses.

Section snippets

Participants

Seventeen right-handed volunteers participated in this study (9 male, 8 female; age: M = 23.29, SD = 3.31). Participants responded to posted advertisements, and all gave informed consent.

Procedure

The paradigm involved a series of 120 interleaved trials, divided into 10 runs of 12 trials each. Participants were instructed that they would see a card and were asked to guess whether the value of the card was higher or lower than the number 5. Each individual trial represented one specific

Behavioral results

Analysis was conducted on all 17 participants to investigate behavioral effects of gender, trial order and overall monetary gain. Each participant's monetary score was calculated at the end of the session and took into account correct ($1.00 gain), incorrect (−$0.50 loss) and missed trials (−$1.00 loss). Participants' scores ranged from $34.50 to $78.00 (M = 58.15, SD = 12.28). Using the monetary score for each participant, we then looked at any effects of trial order (version 1 and version 2) and
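The scoring rule described in this snippet can be written out as a short calculation. The following is a minimal sketch that uses only the three payoffs stated in the text; the function name and the example trial counts are hypothetical and are not data from the study.

```python
# Payoffs stated in the text, in dollars.
GAIN_CORRECT = 1.00
LOSS_INCORRECT = -0.50
LOSS_MISSED = -1.00


def monetary_score(n_correct: int, n_incorrect: int, n_missed: int) -> float:
    """Return the end-of-session score for the given trial counts."""
    return (n_correct * GAIN_CORRECT
            + n_incorrect * LOSS_INCORRECT
            + n_missed * LOSS_MISSED)


# Hypothetical participant (not from the study): 80 correct, 38 incorrect
# and 2 missed trials, which sum to the 120 trials of the session.
example_score = monetary_score(80, 38, 2)
print(f"${example_score:.2f}")  # prints "$59.00"
```

Because a missed trial costs twice as much as an incorrect guess, responding on every trial is always at least as good as not responding under this rule.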

Discussion

The goal of this study was to investigate how the human brain processes learning of reward contingencies. Specifically, we investigated brain regions thought to be important during the acquisition of reward associations and their modulation as learning progresses. By using a gambling paradigm (where rewards were attained on the basis of guessing) that contained probabilistic cues (which informed the participant as to which choice or guess was more likely to lead to a reward), we were

Acknowledgments

This work was supported by NIMH 62104 to EAP and Center for Brain Imaging, NYU. The authors wish to acknowledge Kate Fissell, Brett Sedgewick and Ben Holmes for technical assistance, Susan Ravizza and Elizabeth Tricomi for informative discussion and constructive criticism. The authors would also like to acknowledge the support of the Beatrice and Samuel A. Seaver Foundation.

References (64)

I. Aharon et al.
Beautiful faces have variable reward value: fMRI and behavioral evidence
Neuron
(2001)
A.G. Barto
Reinforcement learning control
Curr. Opin. Neurobiol.
(1994)
H.C. Breiter et al.
Functional imaging of neural responses to expectancy and experience of monetary gains and losses
Neuron
(2001)
R.W. Cox
AFNI: software for analysis and visualization of functional magnetic resonance neuroimages
Comput. Biomed. Res.
(1996)
R. Elliott et al.
Instrumental responding for rewards is associated with enhanced neuronal response in subcortical reward systems
NeuroImage
(2004)
S.N. Haber
The primate basal ganglia: parallel and integrative networks
J. Chem. Neuroanat.
(2003)
P. Kirsch et al.
Anticipation of reward in a nonaversive differential conditioning paradigm and the brain reward system: an event-related fMRI study
NeuroImage
(2003)
B. Knutson et al.
fMRI visualization of brain activity during a monetary incentive delay task
NeuroImage
(2000)
S.M. McClure et al.
Temporal prediction errors in a passive learning task activate human striatum
Neuron
(2003)
F.A. Middleton et al.
Basal ganglia output and cognition: evidence from anatomical, behavioral, and clinical studies
Brain Cogn.
(2000)

J.P. O'Doherty et al.
Neural responses during anticipation of a primary taste reward
Neuron
(2002)
J.P. O'Doherty et al.
Temporal difference models and reward-related learning in the human brain
Neuron
(2003)
M.G. Packard et al.
Inactivation of hippocampus or caudate nucleus with lidocaine differentially affects expression of place and response learning
Neurobiol. Learn. Mem.
(1996)
W. Schultz et al.
Changes in behavior-related neuronal activity in the striatum during learning
Trends Neurosci.
(2003)
E.M. Tricomi et al.
Modulation of caudate activity by action contingency
Neuron
(2004)
N.M. White et al.
Multiple parallel memory systems in the brain of the rat
Neurobiol. Learn. Mem.
(2002)
J.R. Wickens et al.
Neural mechanisms of reward-related motor learning
Curr. Opin. Neurobiol.
(2003)
C.F. Zink et al.
Human striatal responses to monetary reward depend on saliency
Neuron
(2004)
P. Apicella et al.
Responses to reward in monkey dorsal and ventral striatum
Exp. Brain Res.
(1991)
P. Apicella et al.
Neuronal activity in monkey striatum related to the expectation of predictable environmental events
J. Neurophysiol.
(1992)
A.R. Aron et al.
Human midbrain sensitivity to cognitive feedback and uncertainty during classification learning
J. Neurophysiol.
(2004)
G.S. Berns et al.
Predictability modulates human brain response to reward
J. Neurosci.
(2001)
R.G. Brown et al.
Dissociation between intentional and incidental sequence learning in Huntington's disease
Brain
(2001)
R.M. Carelli et al.
Loss of lever press-related firing of rat striatal forelimb neurons after repeated sessions in a lever pressing task
J. Neurosci.
(1997)
M.R. Delgado et al.
Tracking the hemodynamic responses to reward and punishment in the striatum
J. Neurophysiol.
(2000)
M.R. Delgado et al.
Dorsal striatum responses to reward and punishment: effects of valence and magnitude manipulations
Cognit. Affective Behav. Neurosci.
(2003)
M.R. Delgado et al.
Motivation-dependent responses in the human caudate nucleus
Cereb. Cortex
(2004)
G. Di Chiara et al.
Drugs abused by humans preferentially increase synaptic dopamine concentrations in the mesolimbic system of freely moving rats
Proc. Natl. Acad. Sci. U. S. A.
(1988)
R. Elliott et al.
Dissociable neural responses in human reward systems
J. Neurosci.
(2000)
C.D. Fiorillo et al.
Discrete coding of reward probability and uncertainty by dopamine neurons
Science
(2003)
S.D. Forman et al.
Improved assessment of significant activation in functional magnetic resonance imaging (fMRI): use of a cluster-size threshold
Magn. Reson. Med.
(1995)
M.A. Gluck et al.
How do people solve the “weather prediction” task?: individual variability in strategies for probabilistic category learning
Learn. Mem.
(2002)

