Elsevier

NeuroImage

Volume 49, Issue 1, 1 January 2010, Pages 603-611
Dynamic Causal Modeling applied to fMRI data shows high reliability

https://doi.org/10.1016/j.neuroimage.2009.07.015

Abstract

Sensitivity, specificity, and reproducibility are vital to interpret neuroscientific results from functional magnetic resonance imaging (fMRI) experiments. Here we examine the scan–rescan reliability of the percent signal change (PSC) and parameters estimated using Dynamic Causal Modeling (DCM) in scans taken in the same scan session, less than 5 min apart. We find fair to good reliability of PSC in regions that are involved with the task, and fair to excellent reliability with DCM. Also, the DCM analysis uncovers group differences that were not present in the analysis of PSC, which implies that DCM may be more sensitive to the nuances of signal changes in fMRI data.

Introduction

As the analysis tools of functional MRI data evolve into methods of greater complexity, ongoing critical examination of their sensitivity, specificity, and reproducibility is vital to ensure that interpretations of the data do not exceed the limitations of the data itself. In the past few years, several limitations have been examined. The accuracy of transforming data to a common space has been called into question because of the assumptions made about uniform brain anatomy (Devlin and Poldrack, October 2007, Feredoes and Postle, April 2007, Goldman-Rakic, May 2000, Saxe et al., May 2006). In addition, models of the physiological basis of the BOLD signal and its neurological underpinnings have been under continual examination (Aguirre et al., November 1998, Handwerker et al., April 2004, Leontiev and Buxton, January 2007, Raichle and Mintun, 2006). And finally, leading researchers (Logothetis, 2008) have shown that BOLD activity could be due to inhibitory or excitatory effects.

The need for this reevaluation extends to complex mathematical analysis methods. Emerging data analysis techniques that draw from unrelated fields such as engineering and economics may be conceptually and statistically sound, but their power can only be realized if their assumptions are satisfied, and this is often difficult to do in the context of brain imaging. They must therefore be shown to be reliable and valid when applied to a variety of well-characterized datasets. This is a prerequisite for rigorous interpretation of findings about group differences in brain activation, correlation of brain activation with tasks, subject traits, or subject behavior, or more complex understanding of connectivity of different regions in a single brain. Moreover, reliability and validity are of paramount importance when fMRI is applied to clinical research, clinical diagnosis, and the study of neural changes arising from therapeutic interventions.

Many studies (Aron et al., February 2006, Caceres et al., April 2009, Cohen and Dubois, 1999, Friedman et al., November 2006, Friedman et al., 2007, Johnstone et al., May 2005, Le and Hu, 1997, Loubinoux et al., 2001, Waites et al., January 2005, Wei et al., March 2004) have focused on the reliability of fMRI signal, as measured by regional percent signal change (PSC) during sensorimotor, cognitive and affective tasks. The results have suggested that, while not ideal, generally there is evidence for reliability of PSC within a subject across time. These studies calculate reliability using Type 1 Intraclass Correlation Coefficient, the ICC (1,1), based on the work of Shrout and Fleiss (1979). ICC (1,1) is calculated using

$$\mathrm{ICC}(1,1) = \frac{\mathrm{BMS} - \mathrm{WMS}}{\mathrm{BMS} + (k-1)\,\mathrm{WMS}}$$

where BMS is the between-subject mean square, WMS is the within-subject mean square, and k is the number of scans on each subject. Cicchetti's guidelines of ICC (Cicchetti, 2001) classify ICC < 0.40 as poor, 0.40–0.59 as fair, 0.60–0.74 as good, and 0.75 to 1.00 as excellent. Some studies report reliabilities in the fair to poor range, but many of the studies of reproducibility of PSC would be classified as ‘good.’ Loubinoux et al. (2001) report highly reliable activation in ROIs in response to a motor task repeated 5 h, one month, and two months apart, and Friedman et al. (2007) find good reliability (average ICC = 0.67) in ROIs involved in a block design sensorimotor task with scans one day apart. In a classification learning task, Aron et al. (2006) find good to excellent reliability using voxel-wise ICCs in regions involved with the learning task scanned a few minutes apart. In contrast, Wei et al. (2004) find reliabilities ranging from poor to excellent in different regions in response to an auditory 2-back task, with the Brodmann's Area 40 (BA40) producing the highest reliability across scan sessions at least three weeks apart.

The current study replicates the finding that the PSC measurement in an ROI has good reliability in scans that are in the same scanning session (Aron et al., February 2006, Caceres et al., April 2009, Cohen and Dubois, 1999, Friedman et al., November 2006, Friedman et al., 2007, Johnstone et al., May 2005, Le and Hu, 1997, Loubinoux et al., 2001, Waites et al., January 2005, Wei et al., March 2004).
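
As an illustration of the ICC (1,1) calculation and Cicchetti's categories described above, the following is a minimal sketch and not the authors' analysis code. It assumes a complete subjects-by-scans array with one measurement (for example a regional PSC value) per cell; the function names `icc_1_1` and `cicchetti_label` and the example numbers are hypothetical.

```python
import numpy as np


def icc_1_1(scores):
    """ICC(1,1) from a (n_subjects, k_scans) array, one-way random effects.

    Assumes no missing values and k >= 2 scans per subject.
    """
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand_mean = scores.mean()
    subject_means = scores.mean(axis=1)
    # Between-subject mean square (BMS), n - 1 degrees of freedom.
    bms = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    # Within-subject mean square (WMS), n * (k - 1) degrees of freedom.
    wms = np.sum((scores - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (bms - wms) / (bms + (k - 1) * wms)


def cicchetti_label(icc):
    """Map an ICC value onto the poor/fair/good/excellent categories."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# Hypothetical PSC values: 4 subjects, 2 scans each (illustrative only).
example_psc = np.array([[0.8, 0.9], [0.3, 0.4], [1.2, 1.0], [0.6, 0.7]])
example_icc = icc_1_1(example_psc)
print(round(example_icc, 3), cicchetti_label(example_icc))
```

Because WMS pools all within-subject variation, any systematic shift between the first and second scan lowers ICC (1,1) just as random noise does.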

The main goal of this work is to study the reliability of Dynamic Causal Modeling (DCM) (Friston et al., 2003) and to see how this method compares to PSC. DCM is an example of a creative application of an engineering technique to the analysis of neuroimaging data. The analysis makes use of time series of activation in a priori regions of interest to estimate changing connectivities between those regions, in addition to direct effects of the stimuli on the regions themselves. It is conceptually appealing and has been used to explore several neural models including language (Mechelli et al., 2005, Bitan et al., June 2005, E et al., April 2006, Booth et al., February 2007, Sonty et al., February 2007, Noppeney et al., July 2007), motor activation (Grol et al., October 2007, Grefkes et al., July 2008), face perception (Fairhall and Ishai, 2006), visuospatial attention (Siman-Tov et al., 2007) and changes across development (Cao et al., November 2008, Booth et al., January 2008).

A full explanation of the mathematical underpinnings of DCM can be found in Friston et al. (2003), but an overview of the basic structure will be presented here. DCM is an effective connectivity method, as opposed to a functional connectivity method. Functional connectivity analyses examine correlations between regions without regard to causation, while effective connectivity analyses can infer causation within a model. DCM asserts that a change in neuronal firing in one region can be caused by external driving inputs, connections to other regions, and contextual modulation of those inter-regional connections. This is shown explicitly with the formula

$$\dot{z} = \Big(A + \sum_j u_j B^j\Big) z + C u$$

where z and u are vectors containing the time series of the neural state of the volumes of interest (VOIs) and stimuli presentations, respectively. A is a matrix that describes connectivity between different VOIs when the system is in a steady state, B is a matrix that describes how that connectivity changes based on the presentation of each stimulus j, and C is a matrix that describes the effect of each stimulus on each brain region, comparable to a traditional GLM analysis. The model is then extended from the hypothetical neuronal state to the observed BOLD activity using formulas that calculate blood flow induction, volume changes, and subsequent changes in the fraction of oxy- to deoxy-hemoglobin (Stephan et al., 2007a, Stephan et al., 2007b, further discussed in David et al., 2008). These formulas employ biophysical parameters as priors, which are entered into a Bayesian estimation procedure (Friston, 2002). This estimation uses an Expectation-Maximization algorithm to estimate the matrices A, B, and C. The estimated indices of A, B, and C can then be used to explore differences in activation between groups.
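
To make the roles of A, B, and C concrete, the following is a minimal sketch that integrates the neuronal state equation forward in time with a simple Euler scheme. It is an illustration under stated assumptions, not the estimation procedure used in DCM: the hemodynamic forward model and the Bayesian inversion are omitted, and the function name `simulate_bilinear`, the two-region layout, and all numerical values are hypothetical.

```python
import numpy as np


def simulate_bilinear(A, B, C, u, dt):
    """Euler integration of dz/dt = (A + sum_j u_j B[j]) z + C u.

    A: (n, n) fixed connectivity; B: (m, n, n), one modulation matrix per input;
    C: (n, m) direct input effects; u: (T, m) input time series; dt: step in s.
    """
    n_steps = u.shape[0]
    z = np.zeros((n_steps, A.shape[0]))
    for t in range(n_steps - 1):
        # Effective connectivity at time t: fixed part plus input-weighted modulation.
        J = A + np.tensordot(u[t], B, axes=1)
        z[t + 1] = z[t] + dt * (J @ z[t] + C @ u[t])
    return z


# Hypothetical two-region system: input 0 drives region 0, region 0 projects
# to region 1, and input 1 strengthens that projection (all values assumed).
A = np.array([[-1.0, 0.0], [0.4, -1.0]])
B = np.zeros((2, 2, 2))
B[1, 1, 0] = 0.4
C = np.array([[1.0, 0.0], [0.0, 0.0]])

u = np.zeros((1000, 2))
u[:, 0] = 1.0                       # driving stimulus always on
z_without = simulate_bilinear(A, B, C, u, dt=0.01)
u[:, 1] = 1.0                       # contextual (modulatory) input switched on
z_with = simulate_bilinear(A, B, C, u, dt=0.01)
print(z_without[-1], z_with[-1])
```

In this toy system the modulatory input changes the response of region 1 only through the connection from region 0, which is the kind of effect the B matrix is meant to capture.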

In the present experiment, we examined the reliability of results obtained from an analysis of PSC and two DCM analyses, and the ability of each method to discriminate between groups attending to different sensory modalities. We found regions of interest which activated in response to the simultaneous presentation of faces and voices, and studied the behavior in those regions in subjects that were asked to attend to either the face or the voice. We studied the ability of each analysis method to discriminate between the two groups, and we studied reliability by comparing the analysis results in two scans approximately 2 min apart.

Section snippets

Participants

Forty-two healthy human subjects were recruited through the local newspaper and chosen by phone screening with an MRI compatibility form and the Edinburgh Handedness Survey. The goal was to obtain a sample representative of the “normal” types of subjects recruited as controls from the population at large. Subjects ranged in age from 18 to 50 years, with number and sex balanced within each decade: 18–29 years, 8 males and 6 females; 30–39 years, 6 males and 6 females; and 40–50 years, 8 males

Results

The two separate models, a simple auditory model (Fig. 1) and a more complex visual model (Fig. 2), were entered into the DCM analysis and their parameter estimates were compared with each subject's assigned attentional target (face or voice). In the auditory model it is assumed that the stimulus activates Primary Auditory cortex (A1) and activation in A1 then leads to activation in Primary Motor cortex (M1), when the subject indicates his or her discrimination decision with a button press. The

Discussion

Many creative methods have emerged to extract information from the massive amounts of data produced by neuroimaging studies in order to characterize the functional role of different brain regions. Both methods evaluated here — one that infers stimulus-related activation by using a General Linear Model to relate brain activity to a predetermined hemodynamic response function (HRF) and another that infers causal relationships between activation in different regions from a mechanistic model, are

Acknowledgments

The authors thank Ron Fisher, Michael Anderle and Kathleen Ores-Walsh for assistance in data collection. This study was supported by NIH grants R01 MH067167, P50-MH084051, and P30-HD03352.

References (46)

  • Friedman L. et al. Reducing interscanner variability of activation in a multicenter fMRI study: controlling for signal-to-fluctuation-noise-ratio (SFNR) differences. NeuroImage (November 2006)
  • Friston K.J. Bayesian estimation of dynamical systems: an application to fMRI. NeuroImage (June 2002)
  • Friston K.J. et al. Dynamic causal modelling. NeuroImage (August 2003)
  • Goldman-Rakic P. Localization of function all over again. NeuroImage (May 2000)
  • Grefkes C. et al. Dynamic intra- and interhemispheric interactions during unilateral and bilateral hand movements assessed with fMRI and DCM. NeuroImage (July 2008)
  • Handwerker D.A. et al. Variation of BOLD hemodynamic responses across subjects and brain regions and their effects on statistical analyses. NeuroImage (April 2004)
  • Jenkinson M. et al. Improved optimisation for the robust and accurate linear registration and motion correction of brain images. NeuroImage (2002)
  • Johnstone T. et al. Stability of amygdala BOLD response to fearful faces over multiple scan sessions. NeuroImage (May 2005)
  • Leontiev O. et al. Reproducibility of BOLD, perfusion, and CMRO(2) measurements with calibrated-BOLD fMRI. NeuroImage (January 2007)
  • Oakes T.R. et al. Comparison of fMRI motion correction software tools. NeuroImage (2005)
  • Stephan K.E. et al. Comparing hemodynamic models with DCM. NeuroImage (November 2007)
  • Waites A. et al. How reliable are fMRI–EEG studies of epilepsy? A nonparametric approach to analysis validation and optimization. NeuroImage (January 2005)
  • Wei X. et al. Functional MRI of auditory verbal working memory: long-term reproducibility analysis. NeuroImage (March 2004)

Cited by (54)

    • Causal Interactions Between the Default Mode Network and Central Executive Network in Patients with Major Depression

      2021, Neuroscience
      Citation Excerpt :

      The reliability of DCM has been extensively assessed. Schuyler et al (Schuyler et al., 2010) assessed the test–retest reliability of DCM and reported fair to excellent reliability of parameters estimated. Frassle et al. (2016) found that the reliability of negative free energy obtained by DCM was excellent, while the reliability of parameter estimation was good.
