Bayesian model selection for group studies — Revisited
Introduction
Any statistical measure of empirical evidence rests on some form of model comparison. In a classical setting, one typically compares the null with an alternative hypothesis, where the former is a model of how chance could have generated the data. Theoretical results specify the sense in which model comparison can be considered optimal. For example, the Neyman–Pearson lemma essentially states that statistical tests based on the likelihood ratio (such as a simple t-test) are the most powerful, i.e., they have the best chance of detecting an effect (see e.g., Casella and Berger, 2001). From this perspective, Bayesian model comparison can be seen as a simple extension of likelihood ratio tests, in that it allows for the comparison of more than two models. In fact, likelihood ratios are used in a Bayesian setting, under the name of Bayes factors (Kass and Raftery, 1995). These are simply ratios of model evidences, quantifying the support for one model relative to another. Having said this, established classical and Bayesian techniques may give different answers to the same question — a difference that has entertained generations of statisticians (see e.g., Fienberg, 2006).
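Concretely, given two log-evidences (as returned by any model inversion scheme), the Bayes factor is just the exponentiated difference. A minimal sketch in Python; the numbers are purely illustrative:

```python
import math

def bayes_factor(log_evidence_1: float, log_evidence_2: float) -> float:
    """Bayes factor in favour of model 1 over model 2.

    Values above 1 favour model 1; on Kass and Raftery's rough scale,
    a Bayes factor above ~20 (a log-evidence difference above ~3)
    counts as strong evidence.
    """
    return math.exp(log_evidence_1 - log_evidence_2)

# Illustrative log-evidences for two models fitted to the same data
bf = bayes_factor(-110.2, -113.4)  # exp(3.2): strong evidence for model 1
```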
In this paper, we consider the problem of performing random effects Bayesian model selection (BMS) at the group level. This was originally addressed in Stephan et al. (2009), where models were treated as random effects that could differ between subjects and have a fixed (unknown) distribution in the population. The implicit hierarchical model is then inverted using variational or sampling techniques (see Penny et al., 2010), to provide conditional estimates of the frequency with which any model prevails in the population. This random effects BMS procedure complements fixed effects procedures that assume that subjects are sampled from a homogeneous population with one (unknown) model (cf. the log group Bayes factor that sums log-evidences over subjects; Stephan et al., 2007). Stephan et al. (2009) also introduced the notion of exceedance probability, which measures how likely it is that any given model is more frequent than all other models in the comparison set. These two summary statistics typically constitute the results of random effects BMS (see, for example, den Ouden et al., 2010).
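The exceedance probability has no closed form in general, but it is easy to estimate by sampling from the (approximately Dirichlet) posterior over model frequencies. A minimal stdlib sketch, assuming hypothetical Dirichlet counts obtained from inverting the hierarchical model:

```python
import random
from collections import Counter

def exceedance_probabilities(alpha, n_samples=20_000, seed=0):
    """Monte-Carlo estimate of exceedance probabilities.

    alpha : parameters of the (approximate) Dirichlet posterior over
            model frequencies, one entry per model.
    Returns, for each model, the estimated probability that its
    frequency in the population exceeds that of all other models.
    """
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(n_samples):
        # A Dirichlet draw is a normalised vector of Gamma variates;
        # only the argmax matters here, so normalisation can be skipped.
        draw = [rng.gammavariate(a, 1.0) for a in alpha]
        wins[max(range(len(alpha)), key=draw.__getitem__)] += 1
    return [wins[k] / n_samples for k in range(len(alpha))]

# Hypothetical Dirichlet counts strongly favouring the first of three models
xp = exceedance_probabilities([8.5, 2.1, 1.4])
```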
While the random effects BMS procedure suggested in Stephan et al. (2009) and Penny et al. (2010) has proven useful in practice — and has been employed by more than a hundred published studies to date — some conceptual issues are still outstanding. In this paper, we extend the approach described in Stephan et al. (2009) in three ways: (i) we provide a complete picture of the statistical risk incurred when performing group BMS, (ii) we examine the formal difference between random effects BMS and classical random effects analyses of parameter estimates, when asking whether a particular parameter is zero or not, and (iii) we address the problem of between-group and between-condition comparisons.
Section 2 revisits random effects BMS, providing a definition of the null at the group level. This allows us to quantify the statistical risk incurred by performing random effects BMS, i.e. how likely it is that differences in model evidences are due to chance. En passant, we clarify the interpretation of exceedance probabilities and provide guidance with regard to summary statistics that should be reported when using random effects BMS.
Section 3 addresses the difference between random effects BMS and classical random effects analyses of parameter estimates. In principle, group effects can be assessed using a classical random effects analysis of the parameter estimates across subjects (e.g., using t-tests), or using random effects BMS (reduced versus full model). However, these approaches do not answer the same question (and therefore may not give the same answer). Here, we explain the nature of this difference and identify the situations that would yield identical or different conclusions.
Section 4 introduces a simple extension to the original framework proposed in Stephan et al. (2009). In brief, we propose a test of whether two (or more) groups of subjects come from the same population. We also address the related issue of between condition comparisons. The key idea behind these procedures is a generalization of the intuition that underlies classical paired t-tests; i.e. one has to quantify the evidence for a difference — as opposed to the difference of evidences.
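This distinction can be illustrated with a toy Beta-Binomial analogue (not the variational scheme used here): given the number of subjects best fit by a particular model in each of two conditions, one compares a model in which both conditions share a single (unknown) model frequency against one in which the frequencies differ. All counts below are hypothetical:

```python
from math import comb, lgamma, log

def log_beta(a, b):
    """Log of the Beta function, via log-Gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_marginal(k, n):
    """Log marginal likelihood of k 'wins' among n subjects under a
    uniform Beta(1, 1) prior on the model frequency."""
    return log(comb(n, k)) + log_beta(1 + k, 1 + n - k) - log_beta(1, 1)

def log_bf_difference(k1, n1, k2, n2):
    """Log Bayes factor for 'the two conditions have different model
    frequencies' against 'they share one frequency' -- i.e. the
    evidence for a difference, not the difference of evidences."""
    log_diff = log_marginal(k1, n1) + log_marginal(k2, n2)
    log_same = (log(comb(n1, k1)) + log(comb(n2, k2))
                + log_beta(1 + k1 + k2, 1 + (n1 - k1) + (n2 - k2))
                - log_beta(1, 1))
    return log_diff - log_same

# Very different counts across conditions yield positive log-evidence
# for a difference; identical counts yield negative log-evidence.
lbf = log_bf_difference(18, 20, 3, 20)
```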
For all three issues, we use Monte-Carlo simulations to assess the performance of random effects BMS in the context of key applications, e.g. Dynamic Causal Modeling (see Daunizeau et al., 2011a for a recent review).
Section snippets
On the statistical risk of group BMS
In this section, we first revisit the approach to random effects BMS proposed in Stephan et al. (2009), recasting it as an extension of Polya's urn model. This serves to identify the nature of the risk associated with model selection. In brief, we focus on the risk of stating that a given model is a better explanation for the data than other models, given that chance could have favoured this particular model. In turn, we propose a simple Bayesian “omnibus test”, to exclude chance as a likely
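The resulting "protected" exceedance probability can be written as a Bayesian model average: the exceedance probabilities under the alternative, and chance (1/K) under the null, weighted by the posterior probability of the null (the Bayesian omnibus risk). A minimal sketch with illustrative numbers:

```python
def protected_exceedance_probabilities(xp, bor):
    """Average the exceedance probabilities (xp) with chance (1/K),
    weighted by the Bayesian omnibus risk `bor`, i.e. the posterior
    probability that observed differences in model evidences are
    due to chance alone."""
    k = len(xp)
    return [x * (1.0 - bor) + bor / k for x in xp]

# With a 20% omnibus risk, a 0.90 exceedance probability is tempered
pxp = protected_exceedance_probabilities([0.90, 0.07, 0.03], bor=0.2)
```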
Random effects BMS and classical random effects analysis of parameter estimates
In this section, we focus on a specific question, namely “whether a model parameter is zero or not” at the group level. In a classical setting, this is typically addressed using a two-sided t-test on the parameter of interest. Effectively, this relies on the parameter estimate — from each subject — as a summary statistic to perform a random effects analysis; testing whether the group mean is significantly different from zero. However, one could also perform a group BMS with two models (with and
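The classical route can be sketched as follows. For brevity this uses a large-sample normal approximation to the null distribution of the t statistic (a real analysis on a small group would use the exact Student-t distribution), and the per-subject estimates are hypothetical:

```python
import math
from statistics import NormalDist, mean, stdev

def one_sample_t(estimates, mu0=0.0):
    """Two-sided one-sample t-test of H0: the group mean of the
    parameter estimates equals mu0.  The p-value uses a normal
    approximation, adequate only for large samples."""
    n = len(estimates)
    t = (mean(estimates) - mu0) / (stdev(estimates) / math.sqrt(n))
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p

# Hypothetical per-subject estimates of a single model parameter
t, p = one_sample_t([0.8, 1.1, 0.6, 1.3, 0.9, 1.0, 0.7, 1.2])
```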
Between-group and between-condition BMS
In this section, we address the relationship between different treatment conditions and groups; for example, dealing with one group of subjects measured under two conditions, or two groups of subjects. Until now,
Discussion
In this work, we introduced three extensions of our original approach to random effects BMS (Stephan et al., 2009). First, we have described a protected exceedance probability, which measures how likely it is that any given model is more frequent than the others, above and beyond chance. Second, we have presented systematic simulations of various approaches to address questions about specific treatment effects on model parameters using group studies. Third, we considered approaches to between-condition and between-group BMS
Acknowledgments
This work was supported by the European Research Council (JD), by the Ville de Paris (LR), and by the IHU-A-ICM (JD, LR). KES acknowledges support by the René and Susanne Braginsky Foundation and KJF acknowledges support from the Wellcome Trust.
Conflict of interest
The authors declare that there are no conflicts of interest.
References (33)
- et al. Forward and backward connections in the brain: a DCM study of functional asymmetries. Neuroimage (2009)
- et al. Dynamic causal modeling: a critical review of the biophysical and statistical foundations. Neuroimage (2011)
- et al. Stochastic dynamic causal modelling of fMRI data: should we care about neural noise? Neuroimage (2012)
- et al. Model-based influences on humans' choices and striatal prediction errors. Neuron (2011)
- et al. Cerebral pathways in processing of affective prosody: a dynamic causal modeling study. Neuroimage (2006)
- et al. Dynamic causal modelling. Neuroimage (2003)
- et al. Mixed-effects and fMRI studies. Neuroimage (2005)
- et al. Variational free energy and the Laplace approximation. Neuroimage (2007)
- et al. Generalisability, random effects and population inference. Neuroimage (1998)
- Comparing dynamic causal models using AIC, BIC and free energy. Neuroimage (2012)
- Comparing hemodynamic models with DCM. Neuroimage
- Bayesian model selection for group studies. Neuroimage
- Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables
- Information measures and model selection. Bull. Int. Stat. Inst.
- Preserved feedforward but impaired top–down processes in the vegetative state. Science
- Statistical Inference