Elsevier

Cognition

Volume 139, June 2015, Pages 154-167
Reward and punishment act as distinct factors in guiding behavior

https://doi.org/10.1016/j.cognition.2015.03.005

Highlights

  • Are reinforcement and punishment comparable or distinct factors in guiding behavior?

  • We assessed the effects of the magnitude of reward and penalty in single trials.

  • It was found that the magnitude of reward dictates the rate of choice repetition.

  • In contrast, penalties cause universal avoidance regardless of the penalty magnitude.

  • Thus, reinforcement and punishment act as fundamentally distinct behavioral factors.

Abstract

Behavior rests on the experience of reinforcement and punishment. It has been unclear whether reinforcement and punishment act as oppositely valenced components of a single behavioral factor, or whether these two kinds of outcomes play fundamentally distinct behavioral roles. To address this question, we used monetary tokens to vary the magnitude of a reward or a penalty experienced following a choice. The outcome of each trial was independent of the outcome of the previous trial, which enabled us to isolate and study the effect of each outcome magnitude on behavior in single trials. We found that a reward led to a repetition of the previous choice, whereas a penalty led to an avoidance of the previous choice. Surprisingly, the effects of the reward magnitude and the penalty magnitude revealed a pronounced asymmetry. The choice repetition effect of a reward scaled with the magnitude of the reward. In marked contrast, the avoidance effect of a penalty was flat, not influenced by the magnitude of the penalty. These effects were mechanistically described using a reinforcement learning model after the model was updated to account for the penalty-based asymmetry. The asymmetry in the effects of the reward magnitude and the penalty magnitude was so striking that it is difficult to conceive of one factor as just a weighted or transformed form of the other. Instead, the data suggest that rewards and penalties are fundamentally distinct factors in governing behavior.
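The abstract does not give the equations of the updated reinforcement learning model, so the following is only an illustrative sketch of how a value update could be made insensitive to penalty magnitude. The function name `update_value`, the parameters `learning_rate` and `penalty_signal`, and the delta-rule form are all assumptions for illustration, not the authors' model.

```python
def update_value(value, outcome, learning_rate=0.1, penalty_signal=-1.0):
    """Delta-rule update of the value of the option that was just chosen.

    Hypothetical sketch: for rewards (outcome >= 0) the teaching signal is
    the reward magnitude itself; for penalties (outcome < 0) the teaching
    signal is a fixed constant, so the update ignores how large the
    penalty was.
    """
    signal = outcome if outcome >= 0 else penalty_signal
    return value + learning_rate * (signal - value)


# A larger reward raises the value more than a smaller one ...
small_reward = update_value(0.0, 1)
large_reward = update_value(0.0, 2)

# ... whereas penalties of different sizes lower it by the same amount.
small_penalty = update_value(0.0, -1)
large_penalty = update_value(0.0, -5)
```

Under these assumptions, the value of a chosen option grows with the reward magnitude but drops by the same amount after any penalty, which is one simple way to reproduce a flat avoidance effect.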

Introduction

Reinforcement and punishment constitute Nature’s arsenal in guiding behavior (Thorndike, 1898; Thorndike, 1911; Skinner, 1963; Tversky and Kahneman, 1986; Davison, 1991; Gray et al., 1991; Ehrlich, 1996; Hackenberg, 2009). It is well established that reinforcers and punishers both critically influence behavior, but it has been unclear whether these factors exert symmetric or qualitatively distinct behavioral effects (Skinner, 1953; Farley and Fantino, 1978; Gray et al., 1991; Dinsmoor, 1998; Lerman and Vorndran, 2002; Critchfield et al., 2003; Lie and Alsop, 2007). One-factor theories have proposed a symmetric law of effect (Thorndike, 1927). In this view, reinforcement increases the frequency of a behavior, punishment decreases it, and the magnitudes of these effects are equal, just of opposite signs (Thorndike, 1911; Sidman, 1962; Herrnstein and Hineline, 1966; Schuster and Rachlin, 1968; Rachlin and Herrnstein, 1969; Villiers, 1980). In contrast, two-factor theories view reinforcement and punishment as qualitatively distinct influences on operant behavior (Mowrer, 1947; Dinsmoor, 1954; Epstein, 1985; Yechiam and Hochman, 2013).

This debate remains, for the most part, unresolved (Hineline, 1984; Gray et al., 1991; Dinsmoor, 1998; Dinsmoor, 2001; Critchfield et al., 2003; Lie and Alsop, 2007), mainly for two reasons. First, it is difficult to compare qualitatively different factors (e.g., food versus electric shock) on a common scale (Schuster and Rachlin, 1968; Farley and Fantino, 1978; Villiers, 1980; Fiorillo, 2013). A solution to this problem is to work with reinforcers and punishers of the same kind, such as tokens that represent gains and losses (Hackenberg, 2009). Second, previous studies targeting this question have employed relatively complex paradigms (Bradshaw et al., 1979; Gray et al., 1991; Critchfield et al., 2003; Rasmussen and Newland, 2008). Such paradigms make it difficult to isolate the effect of a single reward or punishment on a behavioral response.

We addressed this question in a simple choice paradigm in which we varied the magnitude of a reward or a penalty experienced following each choice. This allowed us to measure subjects’ tendency to repeat their previous choice as a function of the magnitude of the experienced reward or penalty. In this simple paradigm, one-factor theories predict that the reward and penalty magnitudes will lead to qualitatively similar, just oppositely signed tendencies to repeat the previous choice. In contrast, two-factor theories predict that the choice repetition tendencies will be qualitatively distinct for the two factors. The data indeed revealed a striking asymmetry in the effects of the reward and penalty magnitudes on the choice behavior. The asymmetry was so profound that it suggests that the two behavioral factors are of distinct natures.
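The measure described above, the tendency to repeat the previous choice as a function of the outcome just experienced, can be sketched as follows. This is a minimal illustration rather than the authors' analysis code; the function name `repetition_rate_by_outcome`, the choice labels, and the example data are hypothetical.

```python
from collections import defaultdict


def repetition_rate_by_outcome(choices, outcomes):
    """Fraction of trials on which the next choice repeated the current one,
    grouped by the outcome (signed magnitude) received on the current trial.

    choices  -- sequence of choice labels, one per trial (e.g. "L" or "R")
    outcomes -- sequence of signed outcome magnitudes, one per trial
                (positive = reward, negative = penalty)
    """
    if len(choices) != len(outcomes):
        raise ValueError("choices and outcomes must have the same length")

    repeats = defaultdict(int)
    totals = defaultdict(int)
    # The last trial has no following choice, so it is not counted.
    for t in range(len(choices) - 1):
        totals[outcomes[t]] += 1
        if choices[t + 1] == choices[t]:
            repeats[outcomes[t]] += 1
    return {outcome: repeats[outcome] / totals[outcome] for outcome in totals}


# Made-up data for illustration only (not from the study).
example_choices = ["L", "L", "R", "R", "L", "L"]
example_outcomes = [2, -1, 1, -2, 2, 0]
rates = repetition_rate_by_outcome(example_choices, example_outcomes)
```

Plotting such rates against the outcome magnitude would show whether repetition scales with reward size and whether avoidance depends on penalty size, which is the comparison that distinguishes the one-factor and two-factor predictions.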

Subjects

Eighty-eight Washington University undergraduate students participated in this study. Each subject performed either an Auditory task or a Visual task. The Auditory task was performed by 54 students (37 females, 17 males), aged 18–21 (mean 19.2). The Visual task was performed by a distinct set of 34 students (24 females, 10 males), aged 18–23 (mean 19.4). All subjects were healthy, had normal hearing, and gave informed consent. Subjects participated for class credit.

Auditory task

Subjects sat in a

Task

Fifty-four human subjects performed a choice task in which they were instructed to make a response based on the polarity of brief trains of click sounds simultaneously presented to both ears. The polarity was drawn randomly on each trial. If subjects heard more click sounds in the right ear, they pressed the right Command key with the right index finger. If they heard more click sounds in the left ear, they pressed the left Command key with the left index finger (Fig. 3A). Critically, a response

Discussion

Whether Thorndike’s law of effect is symmetric or asymmetric in regard to reinforcement and punishment has been an unresolved question (Skinner, 1953; Farley and Fantino, 1978; Gray et al., 1991; Dinsmoor, 1998; Lerman and Vorndran, 2002; Critchfield et al., 2003; Lie and Alsop, 2007). We addressed this question in simple choice tasks that allowed us to study the behavioral effects of the magnitudes of reinforcement and punishment in single trials. We found overwhelmingly asymmetric effects of

Acknowledgments

This study was supported by NIH grants EY012135 and EY002687. The authors declare no conflict of interest.

References (62)

  • R.F. Baumeister et al.

    Bad is stronger than good

    Review of General Psychology

    (2001)
  • C. Bradshaw et al.

    The effect of punishment on free-operant choice behavior in humans

    Journal of the Experimental Analysis of Behavior

    (1979)
  • T.S. Critchfield et al.

    Punishment in human choice: Direct or competitive suppression?

    Journal of the Experimental Analysis of Behavior

    (2003)
  • M. Davison

    Choice, changeover, and travel: A quantitative model

    Journal of the Experimental Analysis of Behavior

    (1991)
  • M. Davison et al.

    Choice in a variable environment: Every reinforcer counts

    Journal of the Experimental Analysis of Behavior

    (2000)
  • J.A. Dinsmoor

    Punishment: I. The avoidance hypothesis

    Psychological Review

    (1954)
  • J.A. Dinsmoor

    Punishment

    (1998)
  • J.A. Dinsmoor

    Still no evidence for temporally extended shock-frequency reduction as a reinforcer

    Journal of the Experimental Analysis of Behavior

    (2001)
  • I. Ehrlich

    Crime, punishment, and the market for offenses

    Journal of Economic Perspectives

    (1996)
  • R. Epstein

    The positive side effects of reinforcement: A commentary on Balsam and Bondy (1983)

    Journal of Applied Behavior Analysis

    (1985)
  • J. Farley et al.

    The symmetrical law of effect and the matching relation in choice behavior

    Journal of the Experimental Analysis of Behavior

    (1978)
  • C.D. Fiorillo

    Two dimensions of value: dopamine neurons represent reward but not aversiveness

    Science

    (2013)
  • W.J. Gehring et al.

    The medial frontal cortex and the rapid processing of monetary gains and losses

    Science

    (2002)
  • L.N. Gray et al.

    Rewards and punishments in complex human choices

    Social Psychology Quarterly

    (1991)
  • T.D. Hackenberg

    Token reinforcement: A review and analysis

    Journal of the Experimental Analysis of Behavior

    (2009)
  • Herrnstein, Richard J. (2000). The matching law: Papers in psychology and economics. Harvard University...
  • R. Herrnstein et al.

    Negative reinforcement as shock-frequency reduction

    Journal of the Experimental Analysis of Behavior

    (1966)
  • P.N. Hineline

    Aversive control: A separate domain?

    Journal of the Experimental Analysis of Behavior

    (1984)
  • C.B. Holroyd et al.

    The neural basis of human error processing: Reinforcement learning, dopamine, and the error-related negativity

    Psychological Review

    (2002)
  • D. Kahneman et al.

    Prospect theory: An analysis of decision under risk

    Econometrica: Journal of the Econometric Society

    (1979)
  • D.C. Lerman et al.

    On the status of knowledge for using punishment: Implications for treating behavior disorders

    Journal of Applied Behavior Analysis

    (2002)