linear SVM discrimination maps for efficient prediction

linear SVM discrimination maps for efficient prediction and classifier storage

Alle Meije Wink†, Fiona Heeman†, Rutger W. ter Borg§, Jan C. de Munck‡, Larry Evans and Frederik Barkhof†

Departments of †Radiology and Nuclear Medicine and ‡Physics & Medical Technology, VU University Medical Center and Neuroscience Campus, Amsterdam and §ArtiBit, Den Haag, Netherlands.

Introduction

Machine learning (ML) for neuroimage analysis provides a way to assess single subjects with sufficient predictive power. However, support vector machines (SVM) are currently used mainly for training and validating classifiers. A classifier based on a linear SVM can directly apply discrimination maps to make single-subject predictions, circumventing the need to process the set of support vectors. More importantly, it enables efficient storage and exchange of classifiers without the need to re-process the training data. We demonstrate this technique in a study of patients with probable Alzheimer’s disease (AD) and controls, using eigenvector centrality maps computed from resting-state fMRI (RS-fMRI) data.

Materials and Methods

SVM – A binary SVM classifies input vectors from two classes by finding a hyperplane (for 2D inputs, a straight line) with a maximal distance between the point clouds. Slack variables, which allow misclassifications at a penalty, and non-linear feature space mappings can be used to guarantee a valid solution. For high-dimensional inputs such as brain images, using linear classifiers may reduce the risk of overfitting [1]. The dual form of an SVM can be solved using only kernel functions k(vi,vj) between inputs vi and vj: the optimisation method is the same for linear and nonlinear kernels [2].

Figure 2: Discrimination map of the ECM-based classifier trained using 21 patients with AD and 21 controls. This map shows the most discriminative voxels in yellow (positive weights) and light blue (negative weights). Lower weights are shown in semi-transparent colours.

SVM training was done with in-house software (sourceforge.net/projects/canabis), using an on-line SVM for efficient leave-one-out cross-validation (LOOCV) to find the optimal slack penalty C. Weight maps of the projection vector (the sum in (2), also known as discrimination maps) and the bias term b were computed and saved in an image file for predicting the class labels of the remaining subjects.

Results

Training the classifier resulted in an accuracy of 74% (76% true positive, 71% true negative). The cross-validation confirmed that values for the sum in (1), the standard prediction formula, corresponded to the dot product in (2). Clusters of strong negative weights were found bilaterally in the visual cortex. Prediction of the remaining subjects (16 patients, 20 controls) was done using the projection method (2). The results could be computed from the image file saved during training, without requiring the training data or the SVM software. This resulted in an overall accuracy of 68% (81% true positive, 55% true negative).
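The correspondence between the kernel sum (1) and the projection (2) can be illustrated numerically. The following is a minimal sketch, not the in-house software: the data are random, and the names `V`, `alpha`, `b`, `decision_kernel_sum` and `decision_projection` are hypothetical stand-ins for a trained SVM's support-vector weights and bias.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 6 training "images" of 50 voxels each.
n_train, n_vox = 6, 50
V = rng.normal(size=(n_train, n_vox))          # training vectors v_i (one per row)
y = np.array([1, 1, 1, -1, -1, -1])            # labels: 1 = patient, -1 = control
alpha = rng.uniform(0.0, 1.0, size=n_train)    # stand-in for trained weights alpha_i
alpha[[1, 4]] = 0.0                            # non-support vectors have zero weight
b = 0.1                                        # stand-in for the trained bias term

def decision_kernel_sum(v):
    """Argument of sign() in (1): sum_i y_i alpha_i (v_i . v) - b."""
    return np.sum(y * alpha * (V @ v)) - b

# Discrimination map: the sum in (2), computed once after training.
w = (y * alpha) @ V

def decision_projection(v):
    """Argument of sign() in (2): v . w - b."""
    return v @ w - b

# Both forms give the same decision value, hence the same predicted label.
v_new = rng.normal(size=n_vox)
d1 = decision_kernel_sum(v_new)
d2 = decision_projection(v_new)
label1, label2 = np.sign(d1), np.sign(d2)
```

Form (1) needs all training vectors at prediction time, whereas form (2) needs only the single vector `w` and the scalar `b`, which is what makes storing the map in one image file sufficient.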

Figure 1: (a) A simple example of 2 classes with 2 points each: A (green) and B (red). The decision boundary (dark blue) lies in the centre of a maximally-separating hyperplane (light blue). (b) The projection onto the axis (orange) is defined by the support vectors, and the shift to its origin (indicated by ┼) is defined by the bias.

During training, the SVM assigns a weight αi to each input vector vi; only the support vectors have non-zero weights. The class of a new input v is predicted as the weighted sum of its kernel functions with the training vectors:

$$f(v) = \operatorname{sign}\Big(\sum_i y_i \alpha_i \,(v_i \cdot v) - b\Big) \qquad (1)$$

where class labels yi ∈ {−1, 1} denote controls and patients, respectively. For the standard dot product, a linear kernel function, this is equivalent to

$$f(v) = \operatorname{sign}\Big(v \cdot \Big(\sum_i y_i \alpha_i v_i\Big) - b\Big) \qquad (2)$$

so that the sum and the bias term need to be computed only once and can be re-used, increasing the efficiency and exchangeability of training results.

Imaging – 37 patients with probable AD and 41 subjects with subjective cognitive decline (SCD) from the Amsterdam Dementia Cohort (ADC) underwent standard dementia screening; diagnosis used the NIA-AA criteria for AD [3]. MR imaging was performed on a 3T system (Signa HDxt, GE medical, USA) with an 8-channel head coil. Anatomical MR used sagittal FSPGR scans, 132 slices with 224×224 voxels, voxel size 1.2×1×1 mm³. RS-fMRI used 200 T2*-weighted EPI volumes (TR 2.85 s, TE 60 ms, FA 90°) of 64×64×36 voxels of 3.3×3.3×3.3 mm³.

Figure 3: The discrimination map computed from single-subject ECM computed from RS-fMRI shows concentrations of strong negative weights in the visual cortex.

Conclusions

Discrimination maps of voxelwise linear SVM, in combination with the bias term, can be directly used to perform prediction in new observations. Predictions can be disseminated easily and re-used even without using SVM software, if the bias term is stored in the (NIfTI) header.
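The store-and-reuse idea can be sketched as a round trip: save the weight map together with the bias, reload both, and predict from the sign of the projection. This is an illustrative sketch under stated assumptions, not the poster's implementation: a NumPy `.npz` archive in an in-memory buffer stands in for the NIfTI image and header, and the map shape, `bias` value and `predict` helper are hypothetical.

```python
import io
import numpy as np

# Hypothetical trained classifier: a small 3D weight map and a scalar bias.
rng = np.random.default_rng(1)
shape = (4, 5, 3)
weight_map = rng.normal(size=shape)
bias = 0.25

# Save map and bias together; the BytesIO buffer stands in for a file on disk.
buf = io.BytesIO()
np.savez(buf, weights=weight_map, bias=bias)

# Later, or elsewhere: reload and predict without training data or SVM software.
buf.seek(0)
stored = np.load(buf)
w = stored["weights"]
b = float(stored["bias"])

def predict(image, w, b):
    """Return 1 (patient) or -1 (control) from the sign of image . w - b.

    A decision value of exactly 0 is mapped to -1 here; that tie-break is an
    arbitrary choice for this sketch.
    """
    return 1 if float(np.sum(image * w)) - b > 0 else -1

new_image = rng.normal(size=shape)   # stand-in for a new subject's ECM image
label = predict(new_image, w, b)
```

The new image must be in the same space and on the same voxel grid as the weight map for the voxelwise product to be meaningful.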

ECM computed from RS-fMRI show promise for classification and single-subject prediction in Alzheimer’s Disease. Concentrations of strong negative weights were found in previously reported regions [5], and a more restricted mask may increase accuracies. However, the current sample is too small to draw this conclusion.

Preprocessing included realignment, brain extraction, standard-space mapping, smoothing and resampling at 4×4×4 mm³. Eigenvector centrality mapping (ECM) was used to estimate the relative contribution to the functional brain network for each voxel and was computed with fast ECM [4]. Differences in eigenvector centrality have previously been found between patients with AD and healthy elderly [5]. ECM were computed from the RS-fMRI data inside a mask of voxels where every subject showed a signal.

References
[1] Plant et al. (2010) Neuroimage 50(1):162-174.
[2] Schrouff et al. (2013) Neuroinf. 11(3):319-337.
[3] Van der Flier et al. (2014) J. Alz. Dis. 41(1):313-327.
[4] Wink et al. (2012) Brain Connect. 2(5):265-74.
[5] Binnewijzend, M. et al. (2014) Hum Brain Mapp. 35(5):2383-934.

This research was sponsored by NWO Memorabel grant 733050204
