Brain morphometry reproducibility in multi-center 3 T MRI studies: A comparison of cross-sectional and longitudinal segmentations
Introduction
Methods that enable the characterization of human brain morphometry from MRI data are demonstrating important applications in neuroscience. Several reviews describe how morphometry tools have been applied to investigate a variety of populations, including, but not limited to, normal development (Silk and Wood, 2011), normal aging (Mueller et al., 2007), Alzheimer's disease (Drago et al., 2011, Fjell and Walhovd, 2012, Frisoni et al., 2010, Jack, 2011), Parkinson's disease (Kostić and Filippi, 2011), autism (Chen et al., 2011), bipolar disorders (Selvaraj et al., 2012), epilepsy (Bernasconi et al., 2011) and schizophrenia (Levitt et al., 2010). One particular example of a successful contribution of brain morphometry to the field of neurodegenerative diseases is that hippocampal volume has recently been approved as a biomarker to enrich the population selection in clinical trials that study early stages of Alzheimer's disease (EMA/CHMP/SAWP/809208/2011).
There are several methods to obtain brain morphometry estimates from MRI data. Manual segmentation of specific brain structures on MRI by trained raters, with its high inter-rater reliability, is considered the gold standard in many neuroimaging studies (Rojas et al., 2004, Whitwell et al., 2005). However, because it is time-consuming, manual segmentation is not practical for large studies involving many subjects and different brain structures. Various automated and semi-automated algorithms have been proposed, including atlas-based methods (Alemán-Gómez et al., 2007, Fischl et al., 2002, Lötjönen et al., 2010, Magnotta et al., 2002, Wolz et al., 2010), voxel-based morphometry with statistical parametric mapping (Ashburner and Friston, 2000), tensor-based morphometry (Leow et al., 2005, Studholme et al., 2001) and boundary shift integral methods (Camara et al., 2007, Smith et al., 2002). This list of brain morphometry analysis methods is by no means complete, nor does this paper attempt to compare and contrast these methods.
Automated morphometric analysis is of particular interest in longitudinal studies aimed at characterizing disease progression or the effect of therapeutic treatments, whether using known biomarkers or searching for new ones. In particular, longitudinal multi-center MRI studies are becoming an increasingly common strategy to collect large datasets while distributing the data acquisition load across multiple partners (Van Horn and Toga, 2009), and probably one of the largest examples is the Alzheimer's Disease Neuroimaging Initiative, or ADNI (Carrillo et al., 2012). One critical factor that limits the sensitivity to detect changes in any longitudinal study is the reproducibility of repeated measures. The test–retest reliability of MRI-derived morphometric estimates may be affected by a variety of factors (Jovicich et al., 2009), including hydration status of the subject (Walters et al., 2001), instrument-related factors such as scanner manufacturer, field strength, head RF coil, magnetic gradients (Jovicich et al., 2006), pulse sequence and image analysis methods (Han et al., 2006). Repeated acquisitions within a single scan session without subject repositioning may be used to characterize the best attainable reproducibility conditions from an acquisition and analysis protocol. However, the reproducibility errors present in a longitudinal study are better described by repeated acquisitions obtained in different sessions several days apart. Such across-session differences will include additional sources of variance like MRI system instabilities, differences in head positioning within the RF coil, differences in automated acquisition procedures like auto shimming, as well as potential effects from how different operators follow instructions to execute the same acquisition protocol.
Across-session reproducibility is even more challenging in multicenter neuroimaging clinical studies where comparable results are usually difficult to obtain due to the added variability from site differences in the MRI hardware, acquisition protocols and operators.
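The notion of a test–retest reproducibility error discussed above can be made concrete with a minimal sketch. It assumes one common definition, the absolute difference between the test and retest estimates expressed as a percentage of their mean; this excerpt does not state the metric actually used in the study, and the function names and example volumes below are hypothetical.

```python
from statistics import mean


def reproducibility_error(test: float, retest: float) -> float:
    """Absolute test-retest difference as a percentage of the two-session mean.

    Assumed definition for illustration; not confirmed by the text above.
    """
    avg = (test + retest) / 2.0
    if avg == 0:
        raise ValueError("mean of test and retest must be non-zero")
    return 100.0 * abs(test - retest) / abs(avg)


def site_mean_error(pairs):
    """Average the per-subject errors over all (test, retest) pairs of one site."""
    return mean(reproducibility_error(t, r) for t, r in pairs)


# Hypothetical volumes in mm^3 for three subjects scanned in two sessions.
example_pairs = [(4000.0, 4100.0), (3800.0, 3800.0), (4200.0, 4116.0)]
example_site_error = site_mean_error(example_pairs)
```

Normalizing by the two-session mean makes the error symmetric in test and retest and comparable across structures of very different size, which is why this form is convenient when pooling results across sites.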
Despite the wide usage of automated morphometric techniques applied to 3 T MRI studies, across-site test–retest reliability of morphometry measures has not been thoroughly investigated and thus its impact on statistical analysis is not clearly defined. Table 1 outlines studies that, to the best of our knowledge, have reported across-session test–retest reproducibility measures of morphometric data derived from healthy volunteers using 3 T systems. Most studies were done on a single MRI system (Kruggel et al., 2010, Morey et al., 2010, Wonderlick et al., 2009), except for one study that evaluated major MRI system upgrade effects on reproducibility, therefore considering effectively two different systems (Jovicich et al., 2009). These studies have been performed on only two vendors (Siemens and GE), and three models (Trio, Trio TIM, GE Excite) that nowadays tend to be less common as the manufacturers develop newer versions. In addition, morphometry segmentation tools have also been evolving. Recently, a FreeSurfer longitudinal image processing framework has been developed (Reuter et al., 2012) showing a significant increase in precision and discrimination power when compared with tools originally designed for the FreeSurfer cross-sectional analysis. In that study the test–retest reliability of the longitudinal stream was evaluated at 3 T, but it was done for repeated acquisitions obtained during the same session and also when using a particular sequence, multi-echo 3D MPRAGE (van der Kouwe et al., 2008), that has interesting advantages relative to the standard 3D MPRAGE (Wonderlick et al., 2009) but that is not yet commonly available across all vendors. To date there are no studies evaluating the across-session test–retest reproducibility of this new longitudinal analysis at 3 T, for one or more MRI system vendors, while using an MRI acquisition that is standard across vendors.
All of these issues are relevant to the PharmaCog project, a new industry-academic European project aimed at identifying biomarkers sensitive to symptomatic and disease modifying effects of drugs for Alzheimer's disease (http://www.alzheimer-europe.org/FR/Research/PharmaCog). One of the objectives of the PharmaCog project is to investigate potential biomarkers derived from human brain structural and functional MRI, in particular brain morphometry. Within this context, the goals of the present PharmaCog study were the following: i) implement a multi-site 3 T MRI data acquisition protocol for morphometry analysis, ii) acquire across-session test–retest data from a population of healthy elderly subjects, and iii) evaluate and compare the across-session reproducibility of the cross-sectional and longitudinal FreeSurfer segmentation analyses within and across MRI sites. This work is therefore an extension of previous work (Reuter et al., 2012), evaluating the across-session reproducibility of the segmentation results (cortical thickness, intracranial, ventricular and subcortical volumes) on a variety of 3 T MRI scanning platforms (Table 1). To keep the number of variables in this study manageable, we do not manipulate the acquisition sequence other than trying to implement a common target protocol across all sites, largely following ADNI recommendations. The study focuses on comparing the test–retest reproducibility of morphometric results derived from two variants of the FreeSurfer segmentation; comparisons with other segmentation methods are beyond the scope of this work.
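The difference between the two FreeSurfer variants compared here can be sketched in terms of the commands involved. The sketch below only builds the command lines, following the publicly documented recon-all workflow (an independent cross-sectional run per session, then a within-subject template with -base, then one -long run per session). The subject identifiers, file names and helper functions are hypothetical, and the exact options used by the authors are not given in this excerpt.

```python
def cross_sectional_cmd(tp_id, t1_path):
    """Independent (cross-sectional) processing of one session."""
    return ["recon-all", "-s", tp_id, "-i", t1_path, "-all"]


def base_cmd(template_id, tp_ids):
    """Within-subject template built from all cross-sectionally processed sessions."""
    cmd = ["recon-all", "-base", template_id]
    for tp in tp_ids:
        cmd += ["-tp", tp]
    return cmd + ["-all"]


def longitudinal_cmd(tp_id, template_id):
    """Longitudinal processing of one session, initialized from the template."""
    return ["recon-all", "-long", tp_id, template_id, "-all"]


def plan_subject(subject, t1_paths):
    """Return the ordered command list for one subject.

    t1_paths maps a session label (e.g. "test", "retest") to a T1 image path.
    The "<subject>_<label>" and "<subject>_base" naming is a hypothetical convention.
    """
    tp_ids = [f"{subject}_{label}" for label in t1_paths]
    template_id = f"{subject}_base"
    plan = [cross_sectional_cmd(tp, path) for tp, path in zip(tp_ids, t1_paths.values())]
    plan.append(base_cmd(template_id, tp_ids))
    plan += [longitudinal_cmd(tp, template_id) for tp in tp_ids]
    return plan


example_plan = plan_subject("sub01", {"test": "sub01_test.nii", "retest": "sub01_retest.nii"})
```

Each list could be passed to a process launcher once FreeSurfer is installed and configured. The ordering matters: the longitudinal runs depend on the template, which in turn depends on the cross-sectional runs of every session.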
Subjects
Nine clinical sites participated in this study across Italy (Brescia, Verona, and Genoa), Spain (Barcelona), France (Marseille, Lille, and Toulouse) and Germany (Leipzig and Essen). The Brescia site was responsible for the coordination and analysis of the whole study and did not acquire MRI data. Each MRI site recruited 5 local volunteers within an age range of 50–80 years. The subjects' age range corresponds to that of the clinical population that will be studied with the protocols
Results
In this study, we estimate the test–retest reliability of morphometry measures derived from structural T1-weighted 3 T MRI data and evaluate how their reproducibility errors are affected by FreeSurfer processing stream (CS, LG) and MRI site (eight 3 T MRI scanners from different vendors: GE, Siemens, Philips) on healthy elderly volunteers scanned in two separate sessions at least one week apart. This short period between the test and retest sessions was chosen to minimize biological changes that
Discussion
The main goal of this study was to investigate the effects on reliability of two variants of the automated FreeSurfer brain segmentation analysis when used in a 3 T MRI consortium. The choices of MRI data acquisition and data analysis protocols can affect reproducibility errors and are therefore crucial in longitudinal studies aimed at evaluating MRI-derived biomarkers for disease progression and/or treatment efficacy. In this brain morphometry study we show for the first time the across-session
Conclusions
This study achieved the following three main goals: i) a structural MRI acquisition protocol for morphometry analysis was implemented across eight 3 T MRI sites (3D MPRAGE, most sites using mildly accelerated acquisitions) covering various vendors (Siemens, Philips, GE) and countries (Italy, Spain, Germany and France); ii) within- and across-session test–retest data were acquired from a group of 40 healthy elderly volunteers (5 different volunteers per MRI site), generating a dataset with a
Acknowledgments
PharmaCog is funded by the EU-FP7 for the Innovative Medicine Initiative (grant no. 115009). All members of the PharmaCog project deserve sincere acknowledgment for their significant efforts, but unfortunately, they are too numerous to mention. The authors would especially like to thank the people who contributed to the early phases of this study, including Luca Venturi, Genoveffa Borsci and Thomas Günther.
Conflict of interest
The authors have no conflicts of interest to declare.
References (49)
- et al. Voxel-based morphometry—the methods. NeuroImage (2000)
- et al. Worldwide Alzheimer's disease neuroimaging initiative. Alzheimers Dement. (2012)
- et al. Cortical surface-based analysis. I: Segmentation and surface reconstruction. NeuroImage (1999)
- et al. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. NeuroImage (2006)
- et al. Cortical surface-based analysis. II: Inflation, flattening, and a surface-based coordinate system. NeuroImage (1999)
- et al. Whole brain segmentation: automated labeling of neuroanatomical structures in the human brain. Neuron (2002)
- et al. Reliability of MRI-derived measurements of human cerebral cortical thickness: the effects of field strength, scanner upgrade and manufacturer. NeuroImage (2006)
- Alliance for aging research AD biomarkers work group: structural MRI. Neurobiol. Aging (2011)
- et al. Reliability in multi-site structural MRI studies: effects of gradient non-linearity correction on phantom and human data. NeuroImage (2006)
- et al. MRI-derived measurements of human subcortical, ventricular and intracranial brain volumes: reliability effects of scan sessions, acquisition sequences, data analyses, scanner upgrade, scanner vendors and field strengths. NeuroImage (2009)
- Neuroanatomical correlates of depression and apathy in Parkinson's disease: magnetic resonance imaging studies. J. Neurol. Sci.
- Impact of scanner hardware and imaging protocol on image quality and compartment volume precision in the ADNI cohort. NeuroImage
- Brain structural mapping using a novel hybrid implicit/explicit framework based on the level-set method. NeuroImage
- Fast and robust multi-atlas segmentation of brain magnetic resonance images. NeuroImage
- Structural MR image processing using the BRAINS2 toolbox. Comput. Med. Imaging Graph.
- Computational anatomy: shape, growth, and atrophy comparison via diffeomorphisms. NeuroImage
- Measurement of hippocampal subfields and age-related changes with high resolution MRI at 4 T. Neurobiol. Aging
- Within-subject template estimation for unbiased longitudinal image analysis. NeuroImage
- Accurate, robust, and automated longitudinal and cross-sectional brain change analysis. NeuroImage
- Brain morphometry with multiecho MPRAGE. NeuroImage
- LEAP: learning embeddings for atlas propagation. NeuroImage
- Reliability of MRI-derived cortical and subcortical morphometric measures: effects of pulse sequence, voxel geometry, and parallel imaging. NeuroImage
- IBASPM: toolbox for automatic parcellation of brain structures
- Advances in MRI for ‘cryptogenic’ epilepsies. Nat. Rev. Neurol.
1 Authors contributed equally to this work.