Elsevier

NeuroImage

Volume 61, Issue 3, 2 July 2012, Pages 606-612
Classification of schizophrenia patients and healthy controls from structural MRI scans in two large independent samples

https://doi.org/10.1016/j.neuroimage.2012.03.079

Abstract

The purpose of this study is to create a model that can classify schizophrenia patients and healthy controls based on whole-brain gray matter densities (voxel-based morphometry, VBM) from structural magnetic resonance imaging (MRI) scans. In addition, we investigated how stable the accuracy of the models is when they are built with different sample sizes. Using a support vector machine, we built a model from 239 subjects (128 patients and 111 healthy controls) that classified 71.4% of subjects correctly (leave-one-out). We replicated and validated this result by testing the unaltered model on a completely independent sample of 277 subjects (155 patients and 122 healthy controls), scanned with a different scanner. The classification rate in the validation sample was 70.4%. The model's discriminative pattern showed, amongst other differences, gray matter density decreases in the frontal and superior temporal lobes and hippocampus in schizophrenia patients with respect to healthy controls, and gray matter density increases in the basal ganglia and left occipital lobe. Larger training samples gave more reliable models: models based on sample sizes smaller than N = 130 should be considered unstable and can even score below chance.

Highlights

► We used VBM (sMRI) to separate individuals with schizophrenia from healthy controls.
► A support vector machine model trained on a large sample (N = 239) gave 71% accuracy.
► Application of the model to a replication sample (N = 277) validated this result: 70%.
► Larger numbers of training subjects gave higher accuracy and more stable models.

Introduction

Currently, the diagnosis of schizophrenia is based purely on clinical manifestations. The availability of a more objective measure could help psychiatrists in the process of diagnosis and increase its reliability. In addition, an objective measure could serve as a basis for diagnosis at an earlier stage, which in turn could lead to better treatment. Over the years, magnetic resonance imaging (MRI) has proven to be an effective technique for detecting structural brain abnormalities in schizophrenia patients (Honea et al., 2005, Olabi et al., 2011, Wright et al., 2000). These observations are usually based on statistical analyses that compare groups of patients to groups of healthy controls. Unfortunately, statistical group differences do not imply that deviations from normal can be detected in single individuals, and therefore they do not suffice to aid in diagnosis.

A considerable amount of work has been done to establish detectable patterns in the brain that distinguish between individual schizophrenia patients and healthy controls. Usually, the underlying methodology is machine learning classification by means of pattern recognition. These discriminating patterns are generated from input features; in structural MRI the most common features are so-called brain tissue densities (obtained from voxel-based morphometry). Frequently used methods to create these classification models are: support vector machines (SVM) (Davatzikos et al., 2005, Fan et al., 2005, Fan et al., 2007, Fan et al., 2008, Ingalhalikar et al., 2010, Koutsouleris et al., 2009, Pohl and Sabuncu, 2009); discriminant function analysis (Karageorgiou et al., 2011, Kasparek et al., 2011, Leonard et al., 1999, Liu et al., 2004, Nakamura et al., 2004, Takayanagi et al., 2010, Takayanagi et al., 2011); and some other methods (Caprihan et al., 2008, Kawasaki et al., 2007, Sun et al., 2009). Although considerable accuracies have been achieved, ranging from 70.5% to 91.8%, these were often obtained from relatively small data sets and without testing the model in validation samples. To our knowledge there is only one study that used a separate, though small, cohort (16 patients and 16 controls) to validate its initial results (Kawasaki et al., 2007).

Most classification studies included around 30 subjects per class, with sample sizes ranging from 10 to 69 patients. Since subjects have to be divided into a subset from which the model is built and a subset that is subsequently used to test the model's predictive value, samples of this size may be too small for robust model building and testing. Moreover, in prior studies the predictive capacity of models was not assessed in a separate validation sample but with a cross-validation method, such as leave-one-out, which estimates the percentage of correctly classified subjects while using virtually all data to create the models. A more robust method is to validate the model in a completely independent sample.
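The difference between the two evaluation strategies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic Gaussian features, and a simple nearest-centroid classifier stands in for the support vector machine so that the example needs no external libraries. The function names (`fit`, `predict`, `leave_one_out_accuracy`, `make_sample`) are hypothetical and not taken from the study.

```python
import random

def centroid(rows):
    """Mean feature vector of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fit(X, y):
    """Nearest-centroid stand-in for the SVM: one mean vector per class."""
    return {label: centroid([x for x, t in zip(X, y) if t == label])
            for label in set(y)}

def predict(model, x):
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: dist(model[label]))

def accuracy(model, X, y):
    """Fraction of subjects in (X, y) that a fixed model classifies correctly."""
    return sum(predict(model, x) == t for x, t in zip(X, y)) / len(y)

def leave_one_out_accuracy(X, y):
    """Refit on all subjects but one, test on the held-out subject, repeat."""
    hits = 0
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += predict(model, X[i]) == y[i]
    return hits / len(X)

def make_sample(rng, n_per_class, shift, n_features=20):
    """Synthetic data (assumption): class 1 is shifted by `shift` per feature."""
    X, y = [], []
    for label, mu in ((0, 0.0), (1, shift)):
        for _ in range(n_per_class):
            X.append([rng.gauss(mu, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y

rng = random.Random(0)
X_disc, y_disc = make_sample(rng, 40, 0.3)   # "discovery" sample
X_val, y_val = make_sample(rng, 40, 0.3)     # independent "validation" sample

loo = leave_one_out_accuracy(X_disc, y_disc)          # many models, one sample
val = accuracy(fit(X_disc, y_disc), X_val, y_val)     # one model, new sample
print(f"leave-one-out: {loo:.3f}  independent validation: {val:.3f}")
```

Leave-one-out refits the model once per subject, so no single fixed model is ever evaluated; the independent-sample check applies one unaltered model to subjects it has never seen.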

Therefore, we used an independent sample to validate the results we found with our discovery sample. The goal of this study is two-fold: 1) to test whether a large sample is necessary to build a stable classification model; and 2) to investigate whether the classification results obtained with such a model can be validated, using the same model, in an independent sample. As input features we started with all gray matter densities in the brain, which enables us to compare the model's classification patterns to brain abnormalities found in group-level statistical analyses of schizophrenia brain images. In addition to this full model, we used two forms of feature reduction. First, we excluded the striatum, since this structure is known to be affected by (typical) antipsychotic medication (Smieskova et al., 2009), and we wish to separate patients from controls rather than medication users from non-users. Second, we further reduced the number of features by ranking them and keeping only the 10% of features that had the most influence on the model. In doing so, we reduced the risk of overfitting the model to our training set and thus made it potentially more general.
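The second feature-reduction step can be illustrated with a short Python sketch. The excerpt does not state how "influence on the model" was measured, so the ranking criterion used here, the absolute value of each feature's weight in a linear model, is an assumption, as are the function names and the toy weight vector.

```python
def top_fraction_by_abs_weight(weights, fraction=0.10):
    """Return the sorted indices of the features with the largest |weight|.

    Assumption: influence is measured by the absolute weight of a linear model.
    At least one feature is always kept.
    """
    k = max(1, int(len(weights) * fraction))
    ranked = sorted(range(len(weights)),
                    key=lambda i: abs(weights[i]), reverse=True)
    return sorted(ranked[:k])

def select_columns(X, keep):
    """Restrict every subject's feature vector to the kept feature indices."""
    return [[row[i] for i in keep] for row in X]

# Toy weight vector for 10 features (hypothetical values).
w = [0.1, -2.0, 0.3, 0.05, 1.5, -0.2, 0.0, 0.4, -0.6, 0.7]
keep = top_fraction_by_abs_weight(w)          # top 10% of 10 features
X = [[float(10 * r + c) for c in range(10)] for r in range(3)]
X_reduced = select_columns(X, keep)
print(keep, X_reduced)
```

In practice the ranking should be derived from training subjects only; ranking features with test subjects included would leak information into the evaluation.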

Subjects

In both samples, the presence or absence of psychopathological abnormality was established using the Comprehensive Assessment of Symptoms and History (Andreasen et al., 1992) and the Schedule for Affective Disorders and Schizophrenia Lifetime Version (Endicott and Spitzer, 1978), administered by at least one independent rater who was trained in these interviews. All healthy comparison subjects met Research Diagnostic Criteria (Spitzer et al., 1978) of “never [being] mentally ill.” All patients met

Results

Fig. 2 shows the weight-vector w mapped onto the brain. Warm colors indicate increases in GM-densities and cool colors indicate decreases in GM-densities when examining patients as compared to controls. Substantial contributions to the full model's discriminative pattern were found for the basal ganglia and left occipital lobe (relatively large GMD in patients) and for the frontal and superior temporal lobes and hippocampus (relatively small GMD in patients).

The LOO accuracy reached on the full

Discussion

The purpose of this study was to create a model that classifies schizophrenia patients and healthy controls based on structural MRI scans. First, we used a support vector machine (SVM) to create a model from a large sample of patients and control subjects (discovery sample, N = 239), which we then applied to a large independent sample (validation sample, N = 277). We demonstrated that it is possible to attain approximately the same classification accuracy (70.4%) in a completely independent set of

As explained in the Materials and methods section, the OSH is dependent on two terms; the margin is maximized while the error times C is minimized, leading to a minimization of: $C\sum_{n=1}^{N}\xi_n + \tfrac{1}{2}\lVert w\rVert^{2}$. C is the penalty parameter that controls the tradeoff between training errors and the narrowness of the margin. Increasing its value narrows the margin and forces better classification of the subjects in the training set. The goal was to identify the optimal C that would create a model that could predict
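The tradeoff controlled by C can be made concrete with a small Python sketch that evaluates the soft-margin objective for a given hyperplane. It assumes the standard linear formulation, with labels coded as +1 and -1 and each slack taken as the hinge term max(0, 1 - y_n(w·x_n + b)); the function name and the one-dimensional toy data are hypothetical.

```python
def soft_margin_objective(w, b, X, y, C):
    """Evaluate C * sum(slacks) + 0.5 * ||w||^2 for a linear classifier.

    Assumptions: labels y are +1 / -1, and each slack is the hinge term
    max(0, 1 - y_n * (w . x_n + b)).
    """
    slacks = [
        max(0.0, 1.0 - t * (sum(wi * xi for wi, xi in zip(w, x)) + b))
        for x, t in zip(X, y)
    ]
    return C * sum(slacks) + 0.5 * sum(wi * wi for wi in w)

# Toy one-dimensional data: the third subject lies close to the boundary.
X = [[2.0], [-2.0], [0.5]]
y = [1, -1, 1]

wide = [1.0]     # wider margin, the third subject violates it (slack 0.5)
narrow = [2.0]   # narrower margin, no subject violates it

for C in (1.0, 10.0):
    print(C,
          soft_margin_objective(wide, 0.0, X, y, C),
          soft_margin_objective(narrow, 0.0, X, y, C))
```

With a small C the wide-margin hyperplane has the lower objective despite its margin violation; with a large C the narrow-margin hyperplane wins, which is the behavior the paragraph above describes.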

References (39)

  • D. Sun et al. Elucidating a magnetic resonance imaging-based neuroanatomic biomarker for psychosis: classification analysis using probabilistic brain atlas and machine learning algorithms. Biol. Psychiatry (2009)
  • Y. Takayanagi et al. Differentiation of first-episode schizophrenia patients from healthy controls using ROI-based multiple structural brain variables. Prog. Neuropsychopharmacol. Biol. Psychiatry (2010)
  • N.C. Andreasen et al. The Comprehensive Assessment of Symptoms and History (CASH). An instrument for assessing diagnosis and psychopathology. Arch. Gen. Psychiatry (1992)
  • H.B. Boos et al. Focal and global brain measurements in siblings of patients with schizophrenia. Schizophr. Bull. (2011)
  • C.-C. Chang. A library for support vector machines
  • D.L. Collins et al. Automatic 3-d model-based neuroanatomical segmentation. Hum. Brain Mapp. (1995)
  • C. Davatzikos et al. Whole-brain morphometric study of schizophrenia revealing a spatially complex set of focal abnormalities. Arch. Gen. Psychiatry (2005)
  • J. Endicott et al. A diagnostic interview: the schedule for affective disorders and schizophrenia. Arch. Gen. Psychiatry (1978)
  • Y. Fan et al. COMPARE: classification of morphological patterns using adaptive regional elements. IEEE Trans. Med. Imaging (2007)