Forum
Planning as inference

https://doi.org/10.1016/j.tics.2012.08.006

Recent developments in decision-making research are bringing the topic of planning back to center stage in cognitive science. This renewed interest reopens an old, but still unanswered question: how exactly does planning happen? What are the underlying information processing operations and how are they implemented in the brain? Although a range of interesting possibilities exists, recent work has introduced a potentially transformative new idea, according to which planning is accomplished through probabilistic inference.

Traditional perspectives on planning

So how might the brain accomplish planning? In psychology, the classical approach to this question focuses on planning problems involving a specific a priori goal. Although the study of such tasks has yielded important insights, it stops short of the more general problem, which centers on the generic goal of reward maximization. The classical approach also concentrates on cases where action outcomes are perfectly predictable, something that is not characteristic of most real-life settings.

Planning as inference: the basic idea

Under the planning-as-inference (PAI) view, the decision-making agent makes use of an internal cognitive model, which represents the future as a joint probability distribution over actions, outcome states, and rewards (Figure 1a, b). This generative model allows the agent to attach a probability to any potential action-outcome-reward sequence.
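As a rough illustration of such a generative model, the sketch below defines a one-step joint distribution over an action, an outcome state, and a binary reward, and evaluates the probability of any action-outcome-reward triple. It then conditions on reward and reads off a posterior over actions, which is one common way of phrasing the inference step. Everything specific here is an assumption made for illustration and not taken from the article: the two actions, the two states, the probability tables, the binary reward variable, and the function names `joint` and `action_posterior`.

```python
# Hypothetical one-step model; all names and numbers are illustrative assumptions.
ACTIONS = ["left", "right"]
STATES = ["A", "B"]

prior_action = {"left": 0.5, "right": 0.5}   # p(a): uniform prior over actions
transition = {                               # p(s | a): outcome state given action
    "left": {"A": 0.8, "B": 0.2},
    "right": {"A": 0.3, "B": 0.7},
}
reward_prob = {"A": 0.1, "B": 0.9}           # p(r = 1 | s): chance of reward in each state


def joint(a, s, r):
    """Joint probability p(a, s, r) = p(a) * p(s | a) * p(r | s)."""
    p_r = reward_prob[s] if r == 1 else 1.0 - reward_prob[s]
    return prior_action[a] * transition[a][s] * p_r


def action_posterior(r=1):
    """Posterior p(a | r), computed by summing out states and normalising."""
    unnormalised = {a: sum(joint(a, s, r) for s in STATES) for a in ACTIONS}
    z = sum(unnormalised.values())
    return {a: p / z for a, p in unnormalised.items()}


# Condition on obtaining reward and pick the most probable action.
posterior = action_posterior(r=1)
best_action = max(posterior, key=posterior.get)
```

Exact enumeration is only feasible because this toy model is tiny; with longer action sequences or larger state spaces, the same posterior would have to be approximated.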

To plan, the agent can use its internal model to sample potential action-outcome trajectories, essentially using it to perform tree search. However,

PAI in machine learning and robotics

The seeds of PAI were planted in artificial intelligence and machine learning research as early as the 1980s (for this background, see [4, 5]). Since then, evolving techniques for PAI have been increasingly applied to support planning in artificial agents and to solve stochastic optimal control problems in robotics. In both of these settings, recent implementations of PAI have yielded computational benefits over traditional techniques, discovering optimal solutions more quickly and, in some

Implications for psychology and neuroscience

As cognitive and neuroscientific research has reengaged with the topic of planning, recent work has begun to explore the potential relevance of PAI [5, 9]. Although still at its inception, such work already makes clear why PAI may hold special interest.

One appealing aspect of PAI is that it brings planning under the same umbrella as other forms of information processing. There has been a recent surge of interest in the idea that essentially all cognitive and neural computation can be understood

Present opportunities and challenges

PAI appears to offer a promising new perspective on the time-honored problem of planning – one that reveals underlying commonalities with other cognitive functions and that may shed new light on relevant neural processes. Of course, a great deal of additional research will be needed if the apparent relevance of PAI to cognition and neural function is to be properly validated. To date, most work on PAI has been theoretical. A necessary next step will be to identify and test empirical

Acknowledgments

Support for the present work was provided by the James S. McDonnell Foundation (M.B.) and the German Research Foundation (DFG), Emmy Noether fellowship TO 409/1-3 and SPP grant TO 409/7-1 (M.T.).

References (15)

  • Y. Niv. Reinforcement learning in the brain. J. Math. Psychol. (2009)
  • B. Lau et al. Value representations in the primate striatum during matching behavior. Neuron (2008)
  • B.W. Balleine et al. Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology (2010)
  • D. Silver. Temporal-difference search in computer Go. Mach. Learn. (2012)
  • M. Toussaint et al. Probabilistic inference for solving discrete and continuous state Markov decision processes
  • A. Solway et al. Goal directed decision making as probabilistic inference: A computational framework and potential neural correlates. Psychol. Rev. (2012)
  • K. Rawlik et al. On stochastic optimal control and reinforcement learning by approximate inference. In Proceedings,...
There are more references available in the full text version of this article.