ARTICLE IN PRESS
YNIMG-06508; No. of pages: 13; 4C: 5, 6, 7, 8
NeuroImage xxx (2009) xxx–xxx
Contents lists available at ScienceDirect
NeuroImage j o u r n a l h o m e p a g e : w w w. e l s ev i e r. c o m / l o c a t e / y n i m g
Investigating the predictive value of whole-brain structural MR scans in autism: A pattern classification approach Christine Ecker a,⁎, Vanessa Rocha-Rego b, Patrick Johnston a, Janaina Mourao-Miranda c, Andre Marquand c, Eileen M. Daly a, Michael J. Brammer c, Clodagh Murphy a, Declan G. Murphy a and the MRC AIMS Consortium 1 a b c
Section of Brain Maturation, Department of Psychological Medicine, Institute of Psychiatry, King's College, London, UK Institute of Biophysics Carlos Chagas Filho, University of Rio de Janeiro, Brazil Brain Image Analysis Unit, Department of Biostatistics, Institute of Psychiatry, Kings College, London, UK
a r t i c l e
i n f o
Article history: Received 27 May 2009 Revised 5 August 2009 Accepted 10 August 2009 Available online xxxx
a b s t r a c t Autistic spectrum disorder (ASD) is accompanied by subtle and spatially distributed differences in brain anatomy that are difficult to detect using conventional mass-univariate methods (e.g., VBM). These require correction for multiple comparisons and hence need relatively large samples to attain sufficient statistical power. Reports of neuroanatomical differences from relatively small studies are thus highly variable. Also, VBM does not provide predictive value, limiting its diagnostic value. Here, we examined neuroanatomical networks implicated in ASD using a whole-brain classification approach employing a support vector machine (SVM) and investigated the predictive value of structural MRI scans in adults with ASD. Subsequently, results were compared between SVM and VBM. We included 44 male adults; 22 diagnosed with ASD using “gold-standard” research interviews and 22 healthy matched controls. SVM identified spatially distributed networks discriminating between ASD and controls. These included the limbic, frontal-striatal, fronto-temporal, fronto-parietal and cerebellar systems. SVM applied to gray matter scans correctly classified ASD individuals at a specificity of 86.0% and a sensitivity of 88.0%. Cases (68.0%) were correctly classified using white matter anatomy. The distance from the separating hyperplane (i.e., the test margin) was significantly related to current symptom severity. In contrast, VBM revealed few significant between-group differences at conventional levels of statistical stringency. We therefore suggest that SVM can detect subtle and spatially distributed differences in brain networks between adults with ASD and controls. Also, these differences provide significant predictive power for group membership, which is related to symptom severity. © 2009 Elsevier Inc. All rights reserved.
Introduction Autistic spectrum disorder (ASD) is a highly genetic neurodevelopmental condition (Bailey et al., 1996; Bolton and Rutter, 1990; Folstein and Rutter, 1977), which is characterized by a triad of symptoms. These are impaired social communication, social reci-
⁎ Corresponding author. Section of Brain Maturation, PO50, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK. E-mail address:
[email protected] (C. Ecker). 1 The MRC AIMS Consortium is a UK collaboration of autism research centers in the UK including the Institute of Psychiatry, London; the Autism Research Centre, University of Cambridge; and the Autism Research Group, University of Oxford. It is funded by the MRC UK and headed by the Section of Brain Maturation, Institute of Psychiatry. The Consortium members are (in alphabetical order) Bailey AJ, Baron-Cohen S, Bolton PF, Bullmore ET, Carrington S, Chakrabarti B, Daly EM, Deoni SC, Ecker C, Happe F, Henty J, Jezzard P, Johnston P, Jones DK, Lombardo M, Madden A, Mullins D, Murphy C, Murphy DG, Pasco G, Sadek S, Spain D, Steward R, Suckling J, Wheelwright S, and Williams SC.
procity and repetitive and stereotypic behavior (Gillberg, 1993; Wing, 1997). The behavioral phenotype of ASD is well described but less is known about its etiology and pathogenesis, although neuroimaging studies have repeatedly highlighted the potential role of a number of brain regions and neural systems (for a review, see Amaral et al., 2008; Toal et al., 2005). Differences in gray matter (GM) volume have been reported by several previous studies using “regions of interest” (ROI) approaches involving manually tracing brain regions as well as automated voxelbased morphometry (VBM) of the entire brain (Ashburner and Friston, 2000). The findings of previous studies, while reporting variable results, predominantly point toward abnormalities in frontal, parietal and limbic regions as well as the basal ganglia and the cerebellum (McAlonan et al., 2005; McAlonan et al., 2002; Rojas et al., 2006; Waiter et al., 2004). Some have also reported that anatomical differences in some of these regions are correlated with clinical severity (Rojas et al., 2006; Schmitz et al., 2006). There is also
1053-8119/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.neuroimage.2009.08.024
Please cite this article as: Ecker, C., et al., Investigating the predictive value of whole-brain structural MR scans in autism: A pattern classification approach, NeuroImage (2009), doi:10.1016/j.neuroimage.2009.08.024
ARTICLE IN PRESS 2
C. Ecker et al. / NeuroImage xxx (2009) xxx–xxx
evidence for differences in white matter volume (Herbert et al., 2004; McAlonan et al., 2002) and microstructural integrity, which may affect interhemispheric “connectivity” (Hardan et al., 2000; Waiter et al., 2004), inhibition of fronto-striatal systems (McAlonan et al., 2002) and the cerebellar circuitry (Catani et al., 2008). Hence, there is increasing evidence that people with ASD have differences in brain morphology. Nevertheless, as noted above, the results of many studies are in disagreement. This may be explained simply on the basis of factors such as clinical heterogeneity between studies. However, it may also indicate that differences in brain anatomy are relatively subtle and widespread; if so, this presents significant challenges to whichever analytic technique is employed. Region of interest (ROI, i.e., manual tracing) approaches provide higher statistical power than voxel-wise analysis methods such as VBM because only relatively few brain regions are investigated and so less correction for multiple comparisons is required. On the other hand, they have relatively low exploratory power because they restrict their focus to a small number of specific regions. In contrast, whole-brain, mass-univariate analyses (e.g., VBM) offer high exploratory power but with only moderate statistical power, as corrections for multiple comparisons are required in order to limit the occurrence of false positives. Thus, mass-univariate approaches may be too conservative to detect subtle morphological differences in the brain (especially with relatively small sample sizes). In addition, VBM-type approaches do not offer predictive value, which may be of diagnostic relevance. Currently, ASD is diagnosed solely on the basis of behavioral criteria. The behavioral diagnosis of ASD is however often time consuming and can be problematic, particularly in adults. For example, the “gold-standard” diagnostic instruments for ASD such as the Autism Diagnostic Observation Schedule (ADOS; Lord, 1989) and/or the Autism Diagnostic Interview (ADI-R; Lord et al., 1994) are dependent on the skill of the examiner in, respectively, observing current symptoms and eliciting a developmental history from informants who know the person very well. However, in adults, current symptoms are often modulated by coping strategies developed over the life span, and retrospective accounts of past symptoms rely on an informant being both reliable and available. Also, different biological (e.g., genetic/environmental) etiologies might result in the same behavioral phenotype (sometimes referred to as the “autisms”; Geschwind and Levitt, 2007), but this is undetectable using behavioral measures alone. Hence, some have suggested that endophenotypes need to be identified to aid genetic studies. To enable this, new approaches are required (perhaps especially in adults) to combine biological information with behavioral measures. We therefore investigated the predictive value of neuroanatomical gray and white matter MRI scans in ASD using support vector machines (SVMs). SVMs belong to the general machine learningbased pattern recognition techniques that can be used to classify data by differentiating between two or more classes (e.g., patients vs. controls). SVMs are initially “trained” on the basis of a wellcharacterized sample or a subset of the data in order to establish a mathematical criterion that best distinguishes the groups (i.e., training phase). Once a so-called “decision function” or “hyperplane” is learned from the training data, it can be used to predict the class of a new test example (e.g., whether a new participant has ASD). Importantly for studies such as the current one, SVM also provides numerical indicators for specificity (i.e., the proportion of people without disease displaying a negative test result) and sensitivity (i.e., proportion of people with disease displaying a positive result), thus offering predictive value. In addition, and in contrast to VBM-based approaches, SVM is multivariate and so takes into account interregional correlations. This may be of particular relevance to ASD, as individuals with the disorder most likely have abnormalities in the development of neural systems – rather than isolated regions – leading to differences in intra-regional correlations of brain metab-
olism, function and anatomy (Horwitz et al., 1988; Koshino et al., 2005; McAlonan et al., 2002). SVM may therefore be particularly suited to detect subtle differences in the morphology of brain systems. A prior study using discriminate function analysis reported that some pre-selected differences in brain anatomy (e.g., volume of cerebellum and whole-brain gray and white matter) correctly classified 95% of very young people with ASD (Akshmoomoff et al., 2004). To date, however, few studies have employed SVMs in the analysis of structural MRI data in humans, and these have mainly been used to classify patients with Alzheimer's disease (Kloppel et al., 2008; Vemuri et al., 2008) and mild cognitive impairment (Davatzikos et al., 2008; Teipel et al., 2007). Other studies have also reported that differences in white matter shape, the size of corpus callosum (Akshoomoff et al., 2004) and cortical thickness (Singh et al., 2008) distinguish young people with autism from controls. These prior studies employing discriminate function and/or SVM were important first steps. However, they were all based on a priori selected features rather than whole-brain data, and so this significantly impacted on their exploratory power and ability to investigate the role of numerous other brain regions and systems implicated in ASD. In addition, none of these studies validated their classification of individuals with ASD by relating classification to symptom severity. Thus, we used an SVM classification approach to discriminate adults with ASD from controls based on whole-brain gray and white matter anatomy. Also, we related classification results to symptom severity. Lastly, we compared the results obtained by our SVM analysis with that from conventional VBM mass-univariate analysis. Materials and methods Participants Twenty-two control adults were recruited locally by advertisement, and 22 adults with autistic spectrum disorder were recruited through a clinical research program at the Maudsley/Institute of Psychiatry (London) supported by the MRC UK AIMS network. All volunteers (see Table 1) gave informed consent (as approved by the Institute of Psychiatry and South London and Maudsley NHS Foundation Trust research ethics committee), were aged 18–42 years and had an IQ within the normal range (measured using the WASI; Wechsler, 1999). All volunteers were right-handed and male. None had a history of major psychiatric disorder or medical illness affecting brain function (e.g., psychosis or epilepsy). All participants had a structured clinical exam and blood tests to exclude biochemical, haematological or chromosomal abnormalities (including fragile X syndrome). All participants with ASD were diagnosed with autism according to ICD 10 research criteria (International Statistical Classification of Diseases and Health Related Problems—10th revision; World Health Organization, 1992). The initial clinical diagnosis was then confirmed using
Table 1 Subject demographics.
n Age Full-scale IQ White matter total (ml) Gray matter total (ml) Brain total (ml) ADI-R social (n = 21) ADI-R communication ADI-R repetitive behavior ADOS total (n = 22) AQ
ASD
Controls
22 27 (7) 104 (15) 487.17 (59) 761.62 (96) 1248.8 (148) 17 (6) 13 (4) 5 (2) 10 (3) 29 (9)
22 28 (7) 111 (10.0) 502.77 (46) 747.96 (78) 1250.7 (117) — — — — 12 (6)
Data expressed as mean (SD). There were no significant group differences between subject groups in age and IQ on p b 0.05.
Please cite this article as: Ecker, C., et al., Investigating the predictive value of whole-brain structural MR scans in autism: A pattern classification approach, NeuroImage (2009), doi:10.1016/j.neuroimage.2009.08.024
ARTICLE IN PRESS C. Ecker et al. / NeuroImage xxx (2009) xxx–xxx
3
both the Autism Diagnostic Interview (ADI-R; Lord et al., 1994) and the autism diagnostic observation schedule (ADOS; Lord, 1989). All individuals considered as having ASD reached the ADOS algorithm cutoffs and 21 volunteers reached ADI algorithm cutoffs in the three domains of impaired reciprocal social interaction, communication and repetitive behaviors and stereotyped patterns. In addition, we administered the AQ questionnaire (Baron-Cohen et al., 2001) to both autistic and control participants. The AQ is a self-administered screening instrument that measures the degree to which an adult with normal intelligence has autistic traits by quantifying where a given individual is situated on the continuum from autism to normality. It is thus well suited to measure symptoms in both control and ASD population. MRI data acquisition All participants were scanned at the Centre for Neuroimaging Sciences, Institute of Psychiatry, London, UK, using a 3-T GE Signa System (General-Electric, Milwaukee, WI). High-resolution structural T1-weighted volumetric images were acquired with full head coverage, 196 contiguous slices (1. 1-mm thickness, with 1.09 × 1. 09-mm in-plane resolution), a 256 × 256 × 196 matrix and a repetition time/ echo time (TR/TE) of 7/2.8 ms (flip angle = 20 in., FOV = 28 cm). A (birdcage) head coil was used for radiofrequency transmission and reception. Consistent image quality was ensured by a semi-automated quality control procedure. Image processing Initially, images were visually inspected for artifacts or structural abnormalities unrelated to ASD. Subsequently, the data were preprocessed using SPM5 (Wellcome Trust Centre for Neuroimaging, Institute of Neurology, UCL, London, UK; http://www.fil.ion.uncl.ac. uk/spm). Using the unified segmentation step included in SPM5, images were normalized in the standard space of Talairach and Tournoux and segmented into gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF). The modulated normalized images were selected to assess alterations in volume. Modulated segmented GM and WM images were then smoothed with 8-mm isotropic Gaussian kernels and subsequently used as input into the classification algorithm. Classification and support vector machine Prior to SVM, subject groups were matched for IQ, age, gray matter volume, white matter volume and total brain volume. A linear SVM was subsequently used for image classification. In the context of machine learning classification, each image is treated as a point in a high dimensional space (space dimension = number of voxels in the image). The task of classifying the images into two classes (e.g., patients vs. controls) can be viewed as a task of finding a separating hyperplane or decision boundary. The classification procedure consists of two phases: training and testing. During the training phase, the algorithm finds a hyperplane that separates the examples in the input space according to their class labels. The classifier is trained by providing examples of the form bx,cN, where x represents a spatial pattern (e.g., gray matter map) and c is the class label (e.g., patient or control). Once the decision function is learned from the training data, it can be used to predict the class of a new test example. A hypothetical example of classification in a two-dimensional space with six training examples or images is displayed in Fig. 1A. The circles represent images of patients and the squares represent images of healthy controls. The dashed lines represent possible separating hyperplanes. Depending on the machine learning method applied, there are many possible decision boundaries or hyperplanes (e.g., linear discriminant
Fig. 1. Classification and support vector machine. (A) Illustration of a classification problem between two groups (patients vs. controls) for the simplified case of only two voxels. Each axis represents one voxel. Each symbol represents a brain image (gray matter map). The circles represent the images of patients and the squares images of healthy controls. The dashed lines represent hyperplanes that correctly classify the groups. The separating hyperplanes (H) are described by a learning weight vector (w) and an offset (b). (B) Illustration of a hyperplane determined by the support vector machine (SVM) algorithm. The optimal hyperplane (dashed line) is the one with a margin of separation between the two classes. The symbols at the margin (circled) are the support vectors.
analysis, support vector machine, etc.). However, some classifiers that correctly classify a training set fail for unseen examples and therefore generalize badly. Based on generalization performance, one can choose between different learning methods or classifiers. The SVM algorithm (Vapnik, 1995) finds the largest margin hyperplane. The margin is the distance from the separating hyperplane to the closest training examples. It has been demonstrated that the optimal hyperplane is the one with maximal margin (i.e., more separation between the classes). A larger margin corresponds to better generalization performance. An SVM classifier for a two-dimensional problem is illustrated in Fig. 1B. The training examples that lie on the margin are called support vectors. Conceptually, they are the most difficult data points to classify, and therefore they define the decision boundary or hyperplane. The hyperplane is defined by a weight vector and an offset. The weight vector is a linear combination of the support vectors and is normal to the hyperplane. A more detailed description of the SVM can be found in Schoelkopf and Smola (2002) and Burges (1998).
Please cite this article as: Ecker, C., et al., Investigating the predictive value of whole-brain structural MR scans in autism: A pattern classification approach, NeuroImage (2009), doi:10.1016/j.neuroimage.2009.08.024
ARTICLE IN PRESS 4
C. Ecker et al. / NeuroImage xxx (2009) xxx–xxx
In the present study, we exclusively used a linear kernel SVM to reduce the risk of overfitting the data and to allow direct extraction of the weight vector as an image (i.e., the SVM discrimination map). The linear kernel only has one parameter (C) that controls the trade-off between having zero training errors and allowing misclassifications. This was fixed at C = 1 for all cases (default value). The LIBSVM toolbox for Matlab was used to perform the classifications (http:// www.csie.ntu.edu.tw/~cjlin/libsvm/). Discrimination maps and recursive feature elimination If the input space is voxel space (one dimension per voxel), the weight vector normal to the hyperplane will be the direction along which the images of the two groups differ most. Hence, it can be used to generate a map of the most discriminating regions (i.e., a discrimination map). In the GM/WM maps, the value in a voxel is correlated with the regional volume. Given two groups, patients and controls, with the labels + 1 and −1, respectively, a positive value in the discrimination map (red scale) means relatively higher GM/WM matter volume in patients than in controls and a negative value (blue scale) means relatively higher GM/WM volume in controls than in patients. Because the classifier is multivariate by nature, the distribution of weights over all voxels can be interpreted as the spatial pattern by which the groups differ (i.e., the discriminating pattern). To identify the set of voxels with the highest discriminative power, SVM recursive feature elimination (SVM-RFE; Guyon et al., 2002; De Martino et al., 2008) was applied. SVM-RFE is a backward
elimination feature selection technique that iteratively removes features (voxels) from the data set with the aim of removing as many non-informative features as possible while retaining features that carry discriminative information. During SVM-RFE, an SVM classifier is repeatedly trained, and at each iteration, a featureranking criterion is used to remove a subset of the least informative features. Here, we did not apply recursive feature elimination as a feature selection technique per se but to evaluate the behavior of predictive accuracy as a function of the number of voxels in the data. The SVM-RFE algorithm was embedded within a leave-one-out crossvalidation (LOO-CV) framework, which is summarized in algorithm 1 (see Appendix). We used the square of each weight vector coefficient as a ranking criterion (i.e., w2i ) and removed 10,000 of the lowest ranking voxels at each iteration (except for the last 10,000 voxels, where we reduced the step size to 1000). The LOO-CV framework allowed us to derive an accuracy measure for each feature elimination level from which we determined the minimum number of voxels required to produce equivalent accuracy to all voxels. There is some evidence that SVM-RFE can produce improvements in prediction accuracy for fMRI data (De Martino et al., 2008), but if used for this purpose, it is necessary to use nested cross-validation to determine the optimal number of features for each LOO-CV fold. In contrast, we emphasize we are not using SVM-RFE to optimize predictive accuracy but to remove non-informative features from the weight vector and to find the most parsimonious weight vector representation (i.e., as a map) possible. As such, we use a single (nonnested) CV loop, but for each fold the optimal feature subset does not depend on the test data because the weight vector is constructed
Fig. 2. Classification accuracies for different classifiers. Both image modalities together correctly classified 77.0% of all cases (A). Best classification accuracy (81.0%) was achieved when GM image only were utilized (B). Lowest classification (68.0%) accuracy provided WM scans (C). (D) Receiver operating characteristics (ROC) graph for the discrete classifiers.
Please cite this article as: Ecker, C., et al., Investigating the predictive value of whole-brain structural MR scans in autism: A pattern classification approach, NeuroImage (2009), doi:10.1016/j.neuroimage.2009.08.024
ARTICLE IN PRESS C. Ecker et al. / NeuroImage xxx (2009) xxx–xxx
exclusively from the training data. As a result, the LOO-CV estimate of accuracy is unbiased. Once we have determined the minimal number of features, the weight vector group map is constructed using all subject data. Cross-validation The performance of the classifier was validated using the commonly used leave-two-out cross-validation approach, which provides a relatively unbiased estimate of the true generalization performance. In each trial, observations from all but one subject from each group were used to train the classifier. Subsequently, the class assignment of the test subjects was calculated during the test phase. This procedure was repeated S = 22 times (where S is the number of subjects per group), each time leaving observations from a different subject from each group out. The accuracy of the classifier was measured by the proportion of observations that were correctly classified into patient or control group. We also quantified the sensitivity and specificity of the classifier defined as
5
graphs as well as by permutation testing. Permutation testing can be used to evaluate the probability of obtaining specificity and sensitivity values higher than the ones obtained during the cross-validation procedure by chance. We permuted the labels 1000 times, each time randomly assigning patient and control labels to each image and repeated the cross-validation procedure. We then counted the number of times the specificity and sensitivity for the permuted labels were higher than the ones obtained for the real labels. Dividing this number by 1000, we derived a p value for the classification. To compare the results obtained using SVM with conventional voxel-based analysis, we employed voxel-based morphometry (VBM) using SPM5 software (Wellcome Department of Imaging Neuroscience Group, London, UK; http://www.fit.ion.ucl.ac.uk/spm). Group differences were assessed using the general linear model (GLM) on the basis of the modulated, smoothed gray and white matter segments (kernel size 8 mm) while co-varying for total gray and white matter volume. Outcomes were examined at three different statistical thresholds with varying degree of conservativeness: corrected for multiple comparisons using a family error rate (FWE) of (1) p b 0.05, (2) p b 0.001 uncorrected and (3) p b 0.1 uncorrected.
Sensitivity = TP=ðTP þ FNÞ Specificity = TN=ðTN þ FPÞ
Correlation between test margin and diagnostic criteria
where TP is the number of true positives, i.e., the number of patient images correctly classified; TN is the number of true negatives, i.e., number of control images correctly classified; FP is the number of false positives, i.e., number of controls images classified as patients; and FN is the number of false negatives, i.e., number of patients images classified as controls. Classifications were made, firstly, by including both image modalities in the feature vector and, secondly, on the basis of each individual image type (gray or white matter). Classifier performance was evaluated using basic receiver operating characteristic (ROC)
To validate the binary classification of the subject groups on a quantitative level and to identify the degree to which the classification is driven by autistic symptoms rather than confounds unrelated to autism, the test margin for each subject and image type was correlated with parameters indicating symptom severity using Spearman's rank correlation coefficient. Test margins were correlated with (1) ADOS scores indicating current symptom severity, (2) ADI scores representing autistic symptoms at the age of 4–5 years and (3) AQs measuring autistic traits in both control and ASD group.
Fig. 3. Discrimination map for gray matter classification. Relative increases in gray matter volume in the ASD group compared with the control group are displayed in red, while deficits are displayed in blue. The maps are orientated with the right side of the brain shows on the right side of each panel. The z-coordinate for each axial slice in the standard Talairach and Tournoux space is given in millimeters.
Please cite this article as: Ecker, C., et al., Investigating the predictive value of whole-brain structural MR scans in autism: A pattern classification approach, NeuroImage (2009), doi:10.1016/j.neuroimage.2009.08.024
ARTICLE IN PRESS 6
C. Ecker et al. / NeuroImage xxx (2009) xxx–xxx
Results Prediction accuracy Fig. 2 summarizes the results of the classification between ASD and controls utilizing GM and WM images as well as a combination of both. The best classification accuracy was obtained using only the GM images. Here, individuals with ASD were correctly assigned to the appropriate diagnostic category in 81.0% of all cases (see Fig. 2B). This value relates to the predictive power of the algorithm, which is an estimate of classification accuracy of a new individual's scan and therefore of direct diagnostic relevance. The sensitivity of the GM classification was 77.0%; i.e., if a volunteer had a clinical diagnosis of ASD, the probability that this participant was correctly assigned to the ASD category was 0.77. The specificity of the GM classification was 86.0%, implying that 86.0% of the healthy volunteers were correctly classified as controls on the basis of their GM anatomical scan. Generally lower classification accuracies were observed on the basis of the WM images (see Fig. 2C), which resulted in correct classification in 68.0% of all cases at a sensitivity of 73.0% and a specificity of 64.0%. An intermediate performance was observed when both image modalities were employed simultaneously. The overall classification accuracy of the multimodal classifier was 77.0% (sensitivity = 77.0%; specificity= 77.0%). The ROC graph for the three classifiers is shown in Fig. 2D. In general, one point in ROC space is better than another if it is to the northwest, i.e., above and to the left (TP rate is high, FP rate is low, or both) with the point (0,1) representing perfect classification. Classifiers appearing on the left-hand side of the ROC graph, near the x axis, may be considered “conservative” (i.e., make positive classifications only with strong evidence but often have low TP rates), whereas classifiers on the upper right-hand side of the graph may be thought of as “liberal” (i.e., make positive classifications with weak evidence so they classify nearly all positives correctly but often have high FPs). The
diagonal line y = x represents the effect of randomly guessing a class. The ROC graph further demonstrates the relatively larger discriminative power of GM images over WM scans. The classification p value resulting from the permutation test was very low across as well as within modalities (p b = 0.001). The probability of obtaining specificity and sensitivity values higher than the ones obtained during crossvalidation procedure by chance is thus extremely low. These results suggest that while both image modalities provide above chance classification accuracies, GM structural brain anomalies are better in differentiating individuals with ASD from controls. Discrimination maps The discrimination maps showing the global spatial pattern by which the groups differ are displayed in Fig. 3 (gray matter) and in Fig. 4 (white matter). These maps highlight a set of regions, which according to our classification approach carry the most distinctive characteristics between groups following SVM-RFE. As shown in Fig. 5, the observed classification accuracy was obtained even following removal of ∼95% of all voxels for both tissue type. This allowed us to threshold discrimination maps by coloring voxels above those percentages (i.e., coloring upper 5%). Due to the multivariate character of SVM, the discrimination maps should be interpreted as spatially distributed patterns of relative deficit (i.e., decreased volume in ASD; negative weight vector; blue color scale) or excess regions (i.e., increased volume in ASD; positive weight vector; red color scale) in ASD as compared with controls rather than making strong interpretations for effects in individual regions. Cluster coordinates in the standard space of Talairach and Tournoux as well as the highest weights within individual clusters are detailed in Tables 2 and 3. Regions displaying a relative decrease in GM volume in ASD versus controls were found in a spatially distributed network covering all four lobes of the cortex bilaterally as well as the cerebellum. Most
Fig. 4. Discrimination map for white matter classification. Relative increases in gray matter volume in the ASD group compared with the control group are displayed in red, while deficits are displayed in blue. The maps are orientated with the right side of the brain shows on the right side of each panel. The z-coordinate for each axial slice in the standard Talairach and Tournoux space is given in millimeters.
Please cite this article as: Ecker, C., et al., Investigating the predictive value of whole-brain structural MR scans in autism: A pattern classification approach, NeuroImage (2009), doi:10.1016/j.neuroimage.2009.08.024
ARTICLE IN PRESS C. Ecker et al. / NeuroImage xxx (2009) xxx–xxx
7
clusters were located in parietal-frontal regions including the middle and inferior frontal gyrus (BA44/10), the inferior parietal regions (BA39/40), the precuneus (BA7) and the precentral gyrus (BA4/6). In addition, the deficit network included the posterior cingulate, the right amygdala and the hippocampal gyrus as well as the cerebellum and the basal ganglia. We further observed a widespread pattern of regions displaying a relative increase in GM volume in ASD as compared with controls (i.e., an excess network). The excess network contained temporal regions such as the fusiform gyrus and the superior temporal sulcus (BA20, BA22) and several frontal regions such as BA4 and the left inferior frontal gyrus (BA47). In addition, the excess network included the lingual gyrus in the occipital lobe (BA17/ 18) as well as the bilateral insular cortex. A full description of the networks can be found in Table 2. Although white matter provided lower classification accuracy overall, the discrimination maps show a spatially distributed network of regions displaying relative increases and decreases in white matter volume in ASD relative to controls. Relative increases were observed in a network comprising several frontal regions such as medial and inferior frontal gyrus (BA10/40). In addition, individuals with ASD showed increases in WM volume in the temporal gyrus (BA20/21) as well as in lingual gyrus of the right occipital lobe (BA18). In addition, the excess network comprised the corpus callosum and the cerebellum). The WM deficit network consisted mostly of temporal and frontal regions, such as the superior and medial frontal gyrus (BA10/11) and the BA37/39 of the temporal gyrus. Finally, white matter deficits were also detected in the anterior cingulated gyrus and the thalamus. A full list of areas can be found in Table 3. Voxel-based morphometry
Fig. 5. Results of the recursive feature elimination (RFE) for gray matter (A) and white matter (B). These graphs show the obtained classification accuracy following the successive removal of non-informative features (i.e., voxels).
No significant volumetric between-group differences in either GM or WM were observed when conventional VBM-type analysis was employed using a family error rate (FWE) of p b 0.05. Even at a p value uncorrected for multiple comparisons of p b 0.001, very few differences in very small clusters could be detected using VBM. To compare the results obtained with VBM and SVM, we further lowered the VBM
Table 2 Most important gray matter regions discriminating between individuals with ASD and typical controls. Lobe ASD N Controls Frontal lobe
Temporal lobe
Parietal lobe Occipital lobe Other ASD b Controls Frontal lobe
Temporal lobe
Parietal lobe
Occipital lobe Other
Region/hemisphere
Brodmann area
x
y
z
Inferior frontal gyrus L Dorsolateral prefrontal gyrus L/R Precentral gyrus L/R Middle temporal gyrus L/R Inferior temporal gyrus L/R Superior temporal sulcus L/R Fusiform gyrus L/R Inferior parietal lobe L/R Medial occipital gyrus L/R Lingual gyrus L/R Insular cortex L/R
47 10/8 4 21 20 42/22 20 40 19 17/18
34 2 42 − 44 − 38 58 − 54 38 − 44 28 46
29 57 −3 5 − 13 − 13 − 45 − 39 − 61 − 89 7
−9 −1 39 − 33 − 33 7 − 12 39 −9 −9 7
8.46 6.22 11.60 9.83 12.1 7.10 6.59 8.01 6.02 7.21 6.67
Middle frontal gyrus L/R Precentral gyrus R Inferior frontal gyrus L/R Amygdala R Hippocampal gyrus L/R Uncinate L/R Inferior parietal lobe L/R Superior parietal lobe L/R Precuneus Posterior cingulate gyrus L/R Precuneus L/R Cerebellar cortex L/R Putamen R Caudate nucleus R
10 4/6 44
40 28 − 46 28 26 26 − 44 − 14 4 2 20 − 22 28 16
45 − 25 1 −9 − 29 − 13 − 55 − 57 − 63 − 45 − 57 − 43 3 11
−1 71 29 − 15 − 17 − 39 23 71 39 15 15 − 53 −5 − 13
− 9.02 − 6.93 − 11.61 − 7.26 − 7.36 − 9.65 − 8.21 − 7.57 − 7.92 − 10.17 − 6.75 − 10.70 − 7.89 − 6.32
35/36 20 39/40 7 7 30/23 19/31
wi
L, left hemisphere; R, right hemisphere; wi, weight of each cluster centroid i.
Please cite this article as: Ecker, C., et al., Investigating the predictive value of whole-brain structural MR scans in autism: A pattern classification approach, NeuroImage (2009), doi:10.1016/j.neuroimage.2009.08.024
ARTICLE IN PRESS 8
C. Ecker et al. / NeuroImage xxx (2009) xxx–xxx
Table 3 Most important white matter regions discriminating between individuals with ASD and typical controls. Lobe ASD N Controls Frontal lobe Temporal lobe Parietal lobe Occipital lobe Other
ASD b Controls Frontal lobe
Temporal lobe Parietal lobe Occipital lobe Other
Region/hemisphere
Brodmann area
x
y
z
wi
Medial frontal gyrus L/R Inferior frontal gyrus L/R Inferior temporal gyrus L/R Medial temporal gyrus L/R Supramarginal gyrus L/R Postcentral gyrus L/R Lingual gyrus R Corpus callosum L/R Cerebellum L/R
10 44 20 21/39 40 1/2/3 18
− 30 − 46 30 − 44 40 32 24 16 8
45 13 − 19 − 31 − 57 − 29 − 73 27 − 61
1 17 − 31 −7 31 63 −1 1 − 33
13.07 18.17 9.28 17.55 9.64 7.97 7.61 8.99 6.84
Superior frontal gyrus R Medial frontal gyrus L/R Precentral gyrus L/R Medial temporal gyrus L/R Inferior temporal gyrus L/R Inferior parietal lobe L/R Fusiform gyrus R Anterior cingulate R Thalamus L/R
11 10 4 39 37 7 37 32
34 30 − 10 42 − 42 − 16 48 16 − 10
45 47 − 29 − 65 − 57 − 73 − 51 13 − 19
− 13 11 73 23 −9 43 − 15 33 9
− 6.7 − 11.74 − 8.40 − 13.69 − 7.33 − 10.90 − 10.14 − 7.53 − 11.14
L, left hemisphere; R, right hemisphere; wi, weight of each cluster centroid i.
threshold to very lenient value p b 0.1 (uncorrected) to show all those regions showing changes that might be involved in ASD. The output maps are presented in Figs. 6A and B for gray and white matter. Overall, the regions identified by VBM were similar to regions identified by SVM. Increases in GM volume were observed predominantly in fronto-temporal regions while decreases were found mainly in the cerebellum and occipito-parietal areas. There were also spatially
distributed subtle differences in white matter such as excesses in fronto-temporal circuits and deficits in fronto-parietal networks. Relationship between test margin and level of symptom severity Correlation coefficients between the diagnostic criteria and the distance from the optimal hyperplane (i.e., test margin) are listed in
Fig. 6. Results of the conventional VBM analysis for gray (A) and white matter (B). Results are presented at a p value smaller than 0.1, uncorrected for multiple comparisons.
Please cite this article as: Ecker, C., et al., Investigating the predictive value of whole-brain structural MR scans in autism: A pattern classification approach, NeuroImage (2009), doi:10.1016/j.neuroimage.2009.08.024
ARTICLE IN PRESS C. Ecker et al. / NeuroImage xxx (2009) xxx–xxx Table 4 Correlation coefficients between diagnostic criteria and weight vector for gray and white matter. Diagnostic test
AQ ADOS total ADI-R social ADI-R communication ADI-R repetitive behavior
Gray matter
White matter
r
p
r
p
0.510⁎⁎ 0.430⁎ 0.141 − 0.057 0.365
b .01 b .05 b .60 b .90 b .20
0.218 0.432⁎ 0.163 − 0.259 0.014
b .2 b .05 b .50 b .30 b .96
9
r = .430, p b .05; WM: r = .432, p b .05). Patients with the most severe autism currently are therefore also better discriminators between groups on the basis of their GM/WM structural brain scans (i.e., are located furthest away from the hyperplane) than patients with less severe current autistic symptomology. We did not observe a significant correlation between any of the ADI-R subscales and the test margin at p b 0.5. This implies that the classification on the basis of brain morphometry in adulthood is predominantly driven by current symptoms as measured by ADOS rather than by past symptom severity during childhood as measured by the ADI.
⁎Significant correlation on p b 0.05; ⁎⁎significant correlation on p b 0.01 (two-tailed).
Discussion Table 4 and displayed in Fig. 7. Across the whole subject group (including both cases and controls), the test margin was positively correlated with the autism quotient summary scores for GM classification (r = .510, p b .01; see Fig. 7A). High values on the AQ indicate a higher level of symptom severity. In other words, subjects with the lowest AQs (i.e., with the least autistic traits) are located on the extreme left relative to the hyperplane (see Fig. 2B), thus being the best discriminators in the control group. Subjects with the highest AQs (i.e., most autistic traits) are located on the extreme right relative to the hyperplane and thus being most typical for the ASD group. Participants with average AQs exhibit small test margins and are thus located in close proximity of the hyperplane. Such a relationship between AQs and test margins was absent in the WM classifier (r = .218, p b 0.2; see Fig. 7B). Correlating the classifier results with total ADOS scores, however, displayed significant positive correlations between the test margin and the scores (Figs. 7C and D), which were available for the patient group exclusively and measure current autistic symptoms (GM:
To our knowledge, this is the first study to employ pattern recognition algorithms in the automatic classification of whole-brain GM and WM anatomical MRI scans in people with ASD. Brain regions discriminating most between autism and controls were primarily located in limbic, fronto-striatal and cerebellar regions. Overall, SVM achieved good separation between groups and correctly identified individuals with ASD in 81.0% of all cases on the basis of their gray matter anatomy. GM anatomy displayed superior discriminative power to WM anatomy, which provided an overall classification accuracy of 68.0%. For both tissue types, the distance from the hyperplane (i.e., decision value) increased significantly with the level of symptom severity, suggesting that biological differences underlying symptom severity were important factors in the classification. In this relatively small sample, a traditional VBM approach detected few significant between-group differences after correcting for multiple comparisons. At a lower level of statistical stringency (p b 0.1), however, the spatial patterns revealed by both techniques were largely in agreement. Our
Fig. 7. Correlations between SVM classifications and diagnostic criteria. Relationships between AQ and the distance from the weight vector for controls and individuals with ASD are displayed in panel A for gray matter and panel B for white matter. Panels C and D display the relationship between ADOS and gray/white matter, respectively.
Please cite this article as: Ecker, C., et al., Investigating the predictive value of whole-brain structural MR scans in autism: A pattern classification approach, NeuroImage (2009), doi:10.1016/j.neuroimage.2009.08.024
ARTICLE IN PRESS 10
C. Ecker et al. / NeuroImage xxx (2009) xxx–xxx
findings suggest that SVM may still provide accurate and above chance classifications even in the range of statistically non-significant findings and thus seems to be particularly suited to detect subtle and spatially distributed neuroanatomical differences in conditions such as ASD. In addition, SVM provided significant predictive power for group membership, which may be of diagnostic relevance. Neural systems discriminating individuals with ASD from controls Our findings demonstrated that individuals with ASD might be dissociated from controls on the basis of neuroanatomical differences in spatially distributed cortical networks. Unlike conventional mass-univariate analysis (i.e., VBM), which considers each voxel as a spatially independent unit, SVM is a multivariate technique and considers inter-regional correlations. Individual regions may therefore display high discriminative power due to two reasons: (1) there is a large difference in volume between groups in that region, or (2) this region is highly inter-correlated with other network components. For this reason, discriminative networks should be interpreted as a spatially distributed pattern rather than making assumptions on their constituent parts. This is of particular relevance to ASD, as individuals with the disorder most likely have abnormalities in the development of neural systems in addition to differences in isolated regions. SVM indicated two discriminating patterns: an excess network (i.e., regions displaying a relative increase in volume in ASD as compared with controls) and a deficit network (i.e., brain areas displaying a relative decrease in volume in the ASD group). To identify a set of regions with the highest discriminative power, recursive feature elimination (RFE) was applied. RFE iteratively removes non-informative voxels from the data set and derives an accuracy measure for each feature elimination level. This way the minimum number of voxels required to produce equivalent accuracy to all voxels can be obtained. We found for both tissue types that up to 95% of all voxels may be removed from the classifier without any loss in overall classifier performance. This implies that the classifier is driven by the volume of relatively few brain structures and their inter-correlations rather than by general regional differences related to the overall brain size. Most of the regional anomalies with high discriminative power were bilateral and located predominantly in three spatially distributed neuroanatomical networks including (1) the limbic system, (2) the fronto-striatal system and (3) the fronto-temporal and frontoparietal networks. This finding is in agreement with the notion that developmental abnormalities in a limbic circuit comprising medial prefrontal cortex, temporal lobe, striatum and limbic thalamus are present in people with ASD (Damasio and Maurer, 1978). The limbic system and its neuroanatomical connections have attracted much attention in autism research and have been linked to various aspects of emotional and social processing, executive functioning and episodic memory (for reviews, see Bachevalier and Loveland, 2006; Baron-Cohen et al., 2000). The core components of the limbic system, i.e., amygdala, hippocampus, cingulate gyrus, fornix, hypothalamus and the thalamus, are embedded into two separate circuits: the orbitofrontal–amygdala circuit (ventral path) and the dorsolateral prefrontal–hippocampal circuit (dorsal path; for a review, see Loveland et al., 2008). The ventral circuit is centered around the amygdala and includes the anterior cingulate, the orbital frontal cortex and the temporal pole. This circuit has been implicated in the monitoring of emotional states and social cognition as well as the self-regulation of socially acceptable behavior (Barbas, 1995; Baron-Cohen et al., 1999; Brothers, 1989). The dorsal subsystem is centered around the hippocampus and comprises the parahippocampal, posterior cingulate, parietal and dorsolateral prefrontal cortices. This network has been linked to processing of events and actions in the service of the visuo-spatial domain and memory (reviewed in Loveland et al., 2008). Here, we found high discriminative power
between groups in several components of both the ventral and dorsal pathways. Differences in the ventral pathway included regions such as the anterior cingulated, obitofrontal cortex (BA10), thalamus and amygdala, and those in the dorsal pathway included the posterior cingulate, the parietal cortex (BA40) and the dorsolateral prefrontal cortex (DLPF; BA11). Similar areas have previously been highlighted by studies using VBM-type techniques (Boddaert et al., 2004; Craig et al., 2007; Waiter et al., 2004). This indicates that several of the regions highlighted by SVM display high discriminative power due to true volumetric differences between groups rather than due to high intercorrelations with other network components. A combination of VBM and SVM is therefore potentially very useful in order to differentiate between regions with a volumetric between-group difference and regions with a high degree of interconnectivity. We also observed neuroanatomical differences in the fronto-striatal regions. There is evidence from previous VBM and functional neuroimaging studies that individuals with autism have functional (Schmitz et al., 2006), metabolic (Murphy et al., 2002) and structural differences from healthy individuals (Abell et al., 1999; Carper and Courchesne, 2005) in fronto-striatal pathways. In the present study, several of these regions were part of discriminative networks, namely, the head of the caudate nucleus, the superior frontal gyrus (BA11), the anterior cingulate and the DLPF cortex. Frontal and striatal brain regions are reciprocally connected to each other (Haber et al., 1995) and the thalamus (Ray and Price, 1993). We do not suggest that differences in the development of these regions is specific to ASD—because they are also implicated in a variety of other neurodevelopmental disorders such as Tourette's syndrome (TS), obsessive compulsive disorder (OCD), attention deficit hyperactivity disorder (ADHD) and schizophrenia (SCZ) (for a review, see Bradshaw and Sheppard, 2000). Nevertheless, differences in these regions may partially explain some of the clinical variation found within ASD individuals. For example, abnormalities in these regions in people with ASD have been related to (1) motor abnormalities such as abnormal gait sequencing, delayed development of walking and abnormal hand positioning (e.g., Nayate et al., 2005; Rinehart et al., 2006; Takarae et al., 2007); (2) impaired sensorimotor gating (McAlonan et al., 2002); and (3) repetitive and stereotyped behavior (Hollander et al., 2005; Rojas et al., 2006). Furthermore, we observed GM deficits in the cerebellum and its circuitry. The cerebellum is intrinsically connected with itself as well as with the cerebral cortex via the thalamus (for a review, see Ramnani, 2006). It mainly receives afferents from the prefrontal cortex (BA4, primary motor cortex; BA6, premotor cortex). Both BA4 and BA6 were identified as highly discriminating between ASD and control groups by SVM used in the present study. In addition, the deficit network included temporo-parietal regions (BA40) as well as the superior parietal lobe (BA7), which have been linked to functional deficits during mentalizing tasks in people with Asperger syndrome (Castelli et al., 2002). Structural abnormalities in the cerebellum have been reported previously in autism, but studies are in disagreement. Like the present study, most previous published data have noted a decrease in volume of the cerebellum (Courchesne et al., 1988; Levitt et al., 1999; McAlonan et al., 2002; Rojas et al., 2006), but there are also reports of both an increased volume in ASD (Abell et al., 1999) and no differences between groups (Piven et al., 1992). These variable findings highlight again that in spectrum disorders such as ASD, where one would not expect to find an exact match in deficit or excess regions across affected individuals, the voxel-by-voxel comparison approach on its own may lead to over-conservative findings. In such heterogenic samples, multivariate techniques (e.g., SVM) may still achieve high classification accuracy on the basis of spatially distributed patterns of brain regions. The interpretation of such patterns in terms of region-specific differences is, however, problematic. Thus, employing VBM alongside SVM could greatly enhance the interpretability of the results as well as help to identify core structures implicated in autism.
Please cite this article as: Ecker, C., et al., Investigating the predictive value of whole-brain structural MR scans in autism: A pattern classification approach, NeuroImage (2009), doi:10.1016/j.neuroimage.2009.08.024
ARTICLE IN PRESS C. Ecker et al. / NeuroImage xxx (2009) xxx–xxx
Predictive value of gray and white matter whole-brain MRI scans for ASD The second aim of this study was to investigate the predictive value of neuroanatomical MRI scans in ASD for both gray and white matter. As noted above mass-univariate analyses rely on voxel-wise levels of statistical significance, which are not representative of the overall group separation. Such approaches are thus designed to describe mean differences at the voxel-level rather than to make a prediction (e.g., group affiliation/separation; Davatzikos et al., 2008). On the other hand, SVM provides numerical indicators for group membership and thus offers diagnostic potential. Here we found that gray matter anatomical scans provided the best discriminative power and correctly classified 81.0% of all cases. This implies that any new case entered into the algorithm might be predicted to have autism or not at a probability of about 0.8. The specificity of the test was as high as 86.0%, i.e., if a subject did have a clinical diagnosis of ASD, the probability that this subject was correctly assigned to the ASD groups on the basis of their associated gray matter image was 0.86. The sensitivity of the GM classification was 77.0%, i.e., probability of correctly identifying an individual without a clinical diagnosis of ASD as control. This level of sensitivity and specificity can be classified as good overall and compares well with behaviorally guided diagnostic tools whose sensitivity/specificity are on average around 80% (e.g., the Social Communication Questionnaire (SCQ) sensitivity = 85.0%; specificity=75.0%; Rutter et al., 2003). Naturally, one would expect lower sensitivity values than gold-standard diagnostic tools (e.g., the ADOS, ADI-R), as these are used to define the “autistic prototype” itself. Thus, if a classifier is trained on the basis of true positives identified by ADOS and ADI, the maximal power that it could reach is only as good as the measurements used to achieve the behavioral or clinical classification. A well-trained classifier might, however, be used to overcome some of the limitations of the behaviorally guided diagnosis, particularly with respect to the etiology of the disease, and/or be employed to help speed diagnosis when used in combination with other clinical measures. There is strong evidence to suggest that autism is a heterogeneous disorder displaying a wide range in symptom severity and is associated in some individuals with neurodevelopmental conditions such as fragile X syndrome or tuberous sclerosis (Bailey et al., 1998; De Vries, 2008). It has also been suggested that ASD should be thought of as “the autisms” rather than a single autism phenotype (Geschwind and Levitt, 2007). The neuropathology of autism is therefore unlikely to be uniform across individuals, but instead to vary depending on the autistic subtype. In the present study, whole-brain rather than individual features (i.e., regions of interest) were used as input to the classifier with the purpose of maximizing the exploratory power. A discriminative weight can therefore be derived at each spatial location in the brain and for each image modality. As the classifier is primarily driven by structural gray and white matter differences in the brain, different class predictions may reflect different disease etiologies on the neuroanatomical level. If this is the case, SVM may be able to reveal distinct neuroanatomical phenotypes even though the behavioral phenotype may be identical. Hence, this approach may be used to identify endophenotypes aiding genetic studies. In addition to the overall classification (i.e., being autistic or not), SVM provided a test margin for each subject indicating the distance from the separating hyperplane or decision value. The test margin describes the distance between the test sample and the hyperplane determined during the training phase. As such, it provides a measure of how prototypical the test sample is of each class and the more ambiguous a sample is, the closer it would be expected to lie to the hyperplane. One problem with the test margin is that when crossvalidation is used to assess generalization, test samples from different cross-validation folds may be measured relative to different hyperplanes (if the training set change includes a support vector). However, this effect is likely to be small if the changes to the training set are also
11
small (e.g., leave-one-out cross-validation). Here we have shown that the distance from the hyperplane is positively correlated with the total ADOS scores for both gray and white matter and with the autism spectrum quotient (AQ) for gray matter. Although ADOS scores are not absolute measures of symptom severity (i.e., ratio or interval scale), higher levels generally indicate larger degrees of autistic traits at the time of assessment. A significant positive correlation between test margin and AQ/ADOS scores implies that (1) the classification is driven by autistic symptoms rather than confounds unrelated to autism (e.g., scanner noise), thus offering a direct validation of the classifier and (2) autism can be measured along a continuum, which is also reflected in individuals' gray matter anatomy. It thus provides evidence in favor of a quantitative diagnostic approach rather than a categorical diagnosis. Therefore, the margin by which a subject is diagnosed using SVM seems a true representation of the graded severity of the underlying autistic spectrum rather than providing a binary classification exclusively. This finding is in agreement with current theories proposing that autism and Asperger's syndrome lie on a continuum of social communication disability. Thus, our results support others who have suggested a shift from a categorical diagnosis toward a more quantitative “dimensional” approach (Baron-Cohen et al., 2001; Frith, 1991; Wing, 1981). There was no significant correlation between the test margin and any of the three ADI sub-domains. Contrary to the ADOS and AQ, the ADI measures past symptoms and retrospectively assesses autistic sub-domains at the age of approximately 4 to 5 years. The classification in our study was, however, based on gray matter anatomy in an adult ASD sample (mean age = 27 ± 7 years). If gray matter neuroanatomy were indeed a neurobiological marker for autism reflecting current rather than past symptoms, one would only expect a significant correlation with the AQ and ADOS rather than with the ADI. In addition, both diagnostic approaches differ qualitatively in that the ADOS is observational whereas the ADI is based on an informant's memory for distant events and is thus confounded by recall accuracy. Recall accuracy is most likely more of a confound in adult ASD samples than in children, and this may impact on ADI score accuracy, which could also explain our inability to find a significant correlation between test margins and ADI scores. Overall, however, the results of the correlation analysis presented here provide further evidence for the assumption that gray matter anatomy is a neurobiological marker for current severity of symptoms. The lack of a significant correlation between ADI sub-scores also highlights two important limitations of SVM in comparison with conventional diagnostic tools. Firstly, the ADOS as well as the ADI subdivide autistic symptoms into different content areas or domains, along which individuals differ (i.e., reciprocal social interaction; communication and language; and restricted and repetitive behavior). In contrast, the SVM algorithm employed in the present study was based on the overall separation between the autistic versus the non-autistic group. The test margins therefore represent summary scores and hence ignore specific autistic symptom profiles. Scores for individual sub-domains might, however, be derived in the future by developing classifiers for high/low scorers on individual sub-domains. Secondly, the classification algorithm is highly specific to the particular sample used for “training” the classifier. In the present study, we chose to investigate non-learning disabled (i.e., non mentally retarded) adult males with ASD only. The advantage of this approach is that the proposed classifier offers high specificity with regard to this particular subject group but is less specific with regard to alternative autistic populations (e.g., children with ASD, low-functioning autistics with mental retardation, etc.). Future work is thus needed to develop specific classifiers for different developmental stages, autistic subgroups and females with ASD. One final methodological limitation of SVM is the generalizability and stability of the results, which is dependent on the employed validation procedure. While the leave-two-out estimator employed in
Please cite this article as: Ecker, C., et al., Investigating the predictive value of whole-brain structural MR scans in autism: A pattern classification approach, NeuroImage (2009), doi:10.1016/j.neuroimage.2009.08.024
ARTICLE IN PRESS 12
C. Ecker et al. / NeuroImage xxx (2009) xxx–xxx
this paper has a low bias, it comes at the expense of high variance, and other validation techniques (such as split-half or 10-fold crossvalidation) may produce better estimates. Future research is thus needed to evaluate the performance of the presented classifier in a larger and more heterogeneous sample. Conclusions We examined neuroanatomical networks implicated in ASD using a whole-brain classification approach employing support vector machine (SVM) and investigated the predictive value of gray and white matter MRI scans in adults with ASD. SVM provided good group separation overall and indicated biologically plausible, spatially distributed networks with maximal discriminative power. Furthermore, we demonstrated that the classification was related to current symptom severity. At a low level of statistical stringency, the pattern of volumetric differences revealed by VBM was largely in agreement with the pattern indicated by SVM. These findings suggest that SVM can detect subtle and spatially distributed differences in brain networks between adults with ASD and controls, which can also be used to provide significant predictive power for group membership. Acknowledgments This study was partially funded by the MRC UK as the AIMS (Autism Imaging Multicentre Study) to PI's Murphy, Bullmore, BaronCohen and Bailey. We are grateful to those who agreed to be scanned and who gave their time so generously to this study. None of the authors of the above manuscript have declared any conflict of interest or financial interests, which may arise from being named as an author on the manuscript. Appendix Algorithm 1: Recursive Feature Elimination For s = 1 … nSubjects Exclude subject pair s for testing Initialize FeatureSet to all voxels While FeatureSet is not empty Train SVM using FeatureSet Test SVM using subject pair s and record accuracy Compute weight vector excluding subject pair s Rank voxels according to w2i If size(FeatureSet_s) ≤ 10000 StepSize = 1000 Else StepSize = 10000 End If Remove StepSize voxels with the lowest w2i from FeatureSet End While End For Compute average CV accuracy over all subjects for each FeatureSet size minFeatures : = minimum number of features providing equivalent (or greater) accuracy to all voxels. FeatureSet_all : = all voxels Train SVM using all subjects' data and FeatureSet_all Compute the weight vector using all subjects' data Rank voxels according to w2i Remove (totalFeatures − minFeatures) voxels with the lowest w2i from FeatureSet_all Train SVM using all subjects' data and FeatureSet_all Compute final weight vector map using all subjects' data
References Abell, F., Krams, M., Ashburner, J., Passingham, R., Friston, K., Frackowiak, R., Happe, F., Frith, C., Frith, U., 1999. The neuroanatomy of autism: a voxel-based whole brain analysis of structural scans. NeuroReport 10, 1647–1651. Akshoomoff, N., Lord, C., Lincoln, A.J., Courchesne, R.Y., Carper, R.A., Townsend, J., Courchesne, E., 2004. Outcome classification of preschool children with autism spectrum disorders using MRI brain measures. J. Am. Acad. Child Adolesc. Psych. 43, 349–357.
Amaral, D.G., Schumann, C.M., Nordahl, C.W., 2008. Neuroanatomy of autism. Trends Neurosci. 31, 137–145. Ashburner, J., Friston, K.J., 2000. Voxel-based morphometry—the methods. NeuroImage 11, 805–821. Bachevalier, J., Loveland, K.A., 2006. The orbitofrontal–amygdala circuit and selfregulation of social–emotional behavior in autism. Neurosci. Biobehav. Rev. 30, 97–117. Bailey, A., Phillips, W., Rutter, M., 1996. Autism: towards an integration of clinical, genetic, neuropsychological, and neurobiological perspectives. J. Child Psychol. Psychiatry 37, 89–126. Bailey Jr., D.B., Mesibov, G.B., Hatton, D.D., Clark, R.D., Roberts, J.E., Mayhew, L., 1998. Autistic behavior in young boys with fragile X syndrome. J. Autism. Dev. Disord. 28, 499–508. Barbas, H., 1995. Anatomic basis of cognitive-emotional interactions in the primate prefrontal cortex. Neurosci. Biobehav. Rev. 19, 499–510. Baron-Cohen, S., Ring, H.A., Wheelwright, S., Bullmore, E.T., Brammer, M.J., Simmons, A., Williams, S.C., 1999. Social intelligence in the normal and autistic brain: an fMRI study. Eur. J. Neurosci. 11, 1891–1898. Baron-Cohen, S., Ring, H.A., Bullmore, E.T., Wheelwright, S., Ashwin, C., Williams, S.C., 2000. The amygdala theory of autism. Neurosci. Biobehav. Rev. 24, 355–364. Baron-Cohen, S., Wheelwright, S., Skinner, R., Martin, J., Clubley, E., 2001. The autismspectrum quotient (AQ): evidence from Asperger syndrome/high-functioning autism, males and females, scientists and mathematicians. J. Autism Dev. Disord. 31, 5–17. Boddaert, N., Chabane, N., Gervais, H., Good, C.D., Bourgeois, M., Plumet, M.H., Barthelemy, C., Mouren, M.C., Artiges, E., Samson, Y., Brunelle, F., Frackowiak, R.S., Zilbovicius, M., 2004. Superior temporal sulcus anatomical abnormalities in childhood autism: a voxel-based morphometry MRI study. NeuroImage 23, 364–369. Bolton, P., Rutter, M., 1990. Genetic influences in autism. Int. Rev. Psychiatry 2, 67–80. Bradshaw, J.L., Sheppard, D.M., 2000. The neurodevelopmental frontostriatal disorders: evolutionary adaptiveness and anomalous lateralization. Brain Lang. 73, 297–320. Brothers, L., 1989. A biological perspective on empathy. Am. J. Psychiatry 146, 10–19. Burges, C.J.C., 1998. A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Discov. 2, 121–167. Carper, R.A., Courchesne, E., 2005. Localized enlargement of the frontal cortex in early autism. Biol. Psychiatry 57, 126–133. Castelli, F., Frith, C., Happe, F., Frith, U., 2002. Autism, Asperger syndrome and brain mechanisms for the attribution of mental states to animated shapes. Brain 125, 1839–1849. Catani, M., Jones, D.K., Daly, E., Embiricos, N., Deeley, Q., Pugliese, L., Curran, S., Robertson, D., Murphy, D.G., 2008. Altered cerebellar feedback projections in Asperger syndrome. NeuroImage 41, 1184–1191. Courchesne, E., Yeung-Courchesne, R., Press, G.A., Hesselink, J.R., Jernigan, T.L., 1988. Hypoplasia of cerebellar vermal lobules VI and VII in autism. N. Engl. J. Med. 318, 1349–1354. Craig, M.C., Zaman, S.H., Daly, E.M., Cutter, W.J., Robertson, D.M., Hallahan, B., Toal, F., Reed, S., Ambikapathy, A., Brammer, M., Murphy, C.M., Murphy, D.G., 2007. Women with autistic-spectrum disorder: magnetic resonance imaging study of brain anatomy. Br. J. Psychiatry 191, 224–228. Damasio, A.R., Maurer, R.G., 1978. A neurological model for childhood autism. Arch. Neurol. 35, 777–786. Davatzikos, C., Fan, Y., Wu, X., Shen, D., Resnick, S.M., 2008. Detection of prodromal Alzheimer's disease via pattern classification of magnetic resonance imaging. Neurobiol. Aging 29, 514–523. De Martino, F., Valente, G., Staeren, N., Ashburner, J., Goebel, R., Formisano, E., 2008. Combining multivariate voxel selection and support vector machines for mapping and classification of fMRI spatial patterns. NeuroImage 43, 44–58. De Vries, P.J., 2008. What can we learn from tuberous sclerosis complex (TSC) about autism? J. Intellect. Disabil. Res. 52, 818. Folstein, S., Rutter, M., 1977. Genetic influences and infantile autism. Nature 265, 726–728. Frith, U., 1991. Autism and Asperger's Syndrome. Cambridge University Press, Cambridge UK. Geschwind, D.H., Levitt, P., 2007. Autism spectrum disorders: developmental disconnection syndromes. Curr. Opin. Neurobiol. 17, 103–111. Gillberg, C., 1993. Autism and related behaviours. J. Intellect. Disabil. Res. 37 (Pt 4), 343–372. Guyon, I., Weston, J., Barnhill, S., Vapnik, V., 2002. Gene selection for cancer classification using support vector machines. Mach. Learn. 46, 389–422. Haber, S.N., Kunishio, K., Mizobuchi, M., Lynd-Balta, E., 1995. The orbital and medial prefrontal circuit through the primate basal ganglia. J. Neurosci. 15, 4851–4867. Hardan, A.Y., Minshew, N.J., Keshavan, M.S., 2000. Corpus callosum size in autism. Neurology 55, 1033–1036. Herbert, M.R., Ziegler, D.A., Makris, N., Filipek, P.A., Kemper, T.L., Normandin, J.J., Sanders, H.A., Kennedy, D.N., Caviness Jr, V.S., 2004. Localization of white matter volume increase in autism and developmental language disorder. Ann. Neurol. 55, 530–540. Hollander, E., Anagnostou, E., Chaplin, W., Esposito, K., Haznedar, M.M., Licalzi, E., Wasserman, S., Soorya, L., Buchsbaum, M., 2005. Striatal volume on magnetic resonance imaging and repetitive behaviors in autism. Biol. Psychiatry 58, 226–232. Horwitz, B., Rumsey, J.M., Grady, C.L., Rapoport, S.I., 1988. The cerebral metabolic landscape in autism. Intercorrelations of regional glucose utilization. Arch. Neurol. 45, 749–755. Kloppel, S., Stonnington, C.M., Chu, C., Draganski, B., Scahill, R.I., Rohrer, J.D., Fox, N.C., Jack Jr., C.R., Ashburner, J., Frackowiak, R.S., 2008. Automatic classification of MR scans in Alzheimer's disease. Brain 131, 681–689.
Please cite this article as: Ecker, C., et al., Investigating the predictive value of whole-brain structural MR scans in autism: A pattern classification approach, NeuroImage (2009), doi:10.1016/j.neuroimage.2009.08.024
ARTICLE IN PRESS C. Ecker et al. / NeuroImage xxx (2009) xxx–xxx Koshino, H., Carpenter, P.A., Minshew, N.J., Cherkassky, V.L., Keller, T.A., Just, M.A., 2005. Functional connectivity in an fMRI working memory task in high-functioning autism. NeuroImage 24, 810–821. Levitt, J.G., Blanton, R., Capetillo-Cunliffe, L., Guthrie, D., Toga, A., McCracken, J.T., 1999. Cerebellar vermis lobules VIII-X in autism. Prog. Neuropsychopharmacol. Biol. Psychiatry 23, 625–633. Lord, C., 1989. Autism diagnostic observation schedule: a standardized observation of communicative and social behavior. J. Autism Dev. Disord. 19, 185–212. Lord, C., Rutter, M., Le Couteur, A., 1994. Autism Diagnostic Interview-Revised: a revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders. J. Autism Dev. Disord. 24, 659–685. Loveland, K.A., Bachevalier, J., Pearson, D.A., Lane, D.M., 2008. Fronto-limbic functioning in children and adolescents with and without autism. Neuropsychologia 46, 49–62. McAlonan, G.M., Daly, E., Kumari, V., Critchley, H.D., van Amelsvoort, T., Suckling, J., Simmons, A., Sigmundsson, T., Greenwood, K., Russell, A., Schmitz, N., Happe, F., Howlin, P., Murphy, D.G., 2002. Brain anatomy and sensorimotor gating in Asperger's syndrome. Brain 125, 1594–1606. McAlonan, G.M., Cheung, V., Cheung, C., Suckling, J., Lam, G.Y., Tai, K.S., Yip, L., Murphy, D.G., Chua, S.E., 2005. Mapping the brain in autism. A voxel-based MRI study of volumetric differences and intercorrelations in autism. Brain 128, 268–276. Murphy, D.G., Critchley, H.D., Schmitz, N., McAlonan, G., Van Amelsvoort, T., Robertson, D., Daly, E., Rowe, A., Russell, A., Simmons, A., Murphy, K.C., Howlin, P., 2002. Asperger syndrome: a proton magnetic resonance spectroscopy study of brain. Arch. Gen. Psychiatry 59, 885–891. Nayate, A., Bradshaw, J.L., Rinehart, N.J., 2005. Autism and Asperger's disorder: are they movement disorders involving the cerebellum and/or basal ganglia? Brain Res. Bull. 67, 327–334. Piven, J., Nehme, E., Simon, J., Barta, P., Pearlson, G., Folstein, S.E., 1992. Magnetic resonance imaging in autism: measurement of the cerebellum, pons, and fourth ventricle. Biol. Psychiatry 31, 491–504. Ramnani, N., 2006. The primate cortico-cerebellar system: anatomy and function. Nat. Rev., Neurosci. 7, 511–522. Ray, J.P., Price, J.L., 1993. The organization of projections from the mediodorsal nucleus of the thalamus to orbital and medial prefrontal cortex in macaque monkeys. J. Comp. Neurol. 337, 1–31.
13
Rinehart, N.J., Bellgrove, M.A., Tonge, B.J., Brereton, A.V., Howells-Rankin, D., Bradshaw, J.L., 2006. An examination of movement kinematics in young people with highfunctioning autism and Asperger's disorder: further evidence for a motor planning deficit. J. Autism Dev. Disord. 36, 757–767. Rojas, D.C., Peterson, E., Winterrowd, E., Reite, M.L., Rogers, S.J., Tregellas, J.R., 2006. Regional gray matter volumetric changes in autism associated with social and repetitive behavior symptoms. BMC Psychiatry 6, 56. Rutter, M., Bailey, A., Lord, C., 2003. The Social Communication Questionnaire (SCQ) Manual. Western Psychological Services, Los Angeles, CA. Schmitz, N., Rubia, K., Daly, E., Smith, A., Williams, S., Murphy, D.G., 2006. Neural correlates of executive function in autistic spectrum disorders. Biol. Psychiatry 59, 7–16. Schoelkopf, M., Smola, A.J., 2002. Learning with Kernes. MIT Press, Cambridge. MA. Singh, V., Mukherjee, L., Chung, M.K., 2008. Cortical surface thickness as a classifier: boosting for autism classification. Medical Image Computing and ComputerAssisted Intervention—MICCAI 2008. Springer, Berlin, pp. 999–1007. Takarae, Y., Minshew, N.J., Luna, B., Sweeney, J.A., 2007. Atypical involvement of frontostriatal systems during sensorimotor control in autism. Psychiatry Res. 156, 117–127. Teipel, S.J., Born, C., Ewers, M., Bokde, A.L., Reiser, M.F., Moller, H.J., Hampel, H., 2007. Multivariate deformation-based analysis of brain atrophy to predict Alzheimer's disease in mild cognitive impairment. NeuroImage 38, 13–24. Toal, F., Murphy, D.G., Murphy, K.C., 2005. Autistic-spectrum disorders: lessons from neuroimaging. Br. J. Psychiatry 187, 395–397. Vapnik, V.N., 1995. The Nature of Statistical Learning Theory. Springer. Vemuri, P., Gunter, J.L., Senjem, M.L., Whitwell, J.L., Kantarci, K., Knopman, D.S., Boeve, B.F., Petersen, R.C., Jack Jr, C.R., 2008. Alzheimer's disease diagnosis in individual subjects using structural MR images: validation studies. NeuroImage 39, 1186–1197. Waiter, G.D., Williams, J.H., Murray, A.D., Gilchrist, A., Perrett, D.I., Whiten, A., 2004. A voxel-based investigation of brain structure in male adolescents with autistic spectrum disorder. NeuroImage 22, 619–625. Wechsler, D., 1999. Wechsler Abbreviated Scale of Intelligence (WASI). Harcourt Assessment, San Antonio. TX. Wing, L., 1981. Asperger's syndrome: a clinical account. Psychol. Med. 11, 115–129. Wing, L., 1997. The autistic spectrum. Lancet 350, 1761–1766.
Please cite this article as: Ecker, C., et al., Investigating the predictive value of whole-brain structural MR scans in autism: A pattern classification approach, NeuroImage (2009), doi:10.1016/j.neuroimage.2009.08.024