SPECT image classification using random forests

J. Ramírez, J.M. Górriz, R. Chaves, M. López, D. Salas-Gonzalez, I. Álvarez and F. Segovia

A novel computer aided diagnosis system for the early diagnosis of Alzheimer’s disease (AD) is presented. The system consists of voxel-based normalised mean square error feature extraction, a t-test with feature correlation weighting for feature selection and random forest image classification. The proposed method yields a classification accuracy of up to 96%, thus outperforming recently developed methods for early AD diagnosis.

Introduction: Alzheimer’s disease (AD) is the most common cause of dementia in the elderly and affects approximately 30 million individuals worldwide. Its prevalence is expected to triple over the next 50 years owing to growth of the older population. To date there is no single test or biomarker that can predict whether a particular person will develop the disease. With the advent of several effective treatments of AD symptoms, current consensus statements have emphasised the need for early recognition. Single photon emission computed tomography (SPECT) is a noninvasive, 3-D functional imaging modality that can be used to analyse the regional cerebral blood flow (rCBF) in patients. A SPECT rCBF study is frequently used as a complementary AD diagnostic tool in addition to clinical findings. However, conventional evaluation of SPECT images is subjective and often relies on manual reorientation, visual reading of tomographic slices and semi-quantitative analysis of certain regions of interest (ROI). Moreover, the minimal changes in the images in early AD make visual diagnosis a challenging problem that requires experienced practitioners [1]. Even with this problem still unsolved, the potential of novel machine learning techniques has not been explored in depth for computer aided diagnosis (CAD). This Letter presents the design of a CAD system to detect early AD by means of random forests [2].

Random forests: Various ensemble classification methods have been proposed in recent years for improved classification accuracy. In ensemble classification, several classifiers are trained and their results are combined through a voting process. Perhaps the most widely used of such methods are boosting and bagging. Boosting is based on sample reweighting, whereas bagging uses bootstrapping. The random forest classifier [2] uses bagging, or bootstrap aggregating, to form an ensemble of classification and regression tree (CART)-like classifiers $h(x, T_k)$, $k = 1, \ldots$, where $T_k$ are the bootstrap replicas obtained by randomly selecting N observations out of N with replacement, where N is the dataset size, and x is an input pattern [2]. For classification, each tree in the random forest casts a unit vote for the most popular class at input x. The output of the classifier is determined by a majority vote of the trees. This method is not sensitive to noise or overtraining, as the resampling is not based on weighting. Furthermore, it is computationally more efficient than methods based on boosting and somewhat better than simple bagging.

Materials and methods: Each patient is injected with a gamma emitting technetium-99m labelled ethyl cysteinate dimer (99mTc-ECD) radiopharmaceutical and the SPECT scan is acquired by means of a three-head gamma camera Picker Prism 3000. Brain perfusion images are reconstructed from projection data using filtered backprojection (FBP) in combination with a Butterworth noise filter. SPECT images require spatial normalisation [3] in order to ensure that a given voxel in different images refers to the same anatomical position. This process was performed by using statistical parametric mapping (SPM) [4], yielding normalised SPECT images of 69 × 95 × 79 voxels. Finally, the intensity level is normalised to the maximum intensity as in [1]. The images were initially labelled by experienced clinicians of the Virgen de las Nieves Hospital (Granada, Spain) as normal (NOR) for subjects without any symptoms of the disease and as ATD for possible, probable or certain AD patients. In total, the database consists of 79 patients: 41 NOR and 38 ATD.

Feature extraction: Similarity measures between the rCBF of each subject and the mean rCBF of the normal controls were used as features. First, the mean value of the voxel intensity of normal subjects was computed by averaging the voxel intensities of all the normal controls in the database. Only those voxels with a mean intensity above 50% of the maximum intensity, which define a 3-D mask, were considered. Then, the normalised mean square error (NMSE) of a block of $(2v+1) \times (2v+1) \times (2v+1)$ voxels centred at the point with co-ordinates (x, y, z) is defined as:
$$\mathrm{NMSE}_p(x, y, z) = \frac{\sum_{l,m,n=-v}^{+v} \left[ f(x-l,\, y-m,\, z-n) - g_p(x-l,\, y-m,\, z-n) \right]^2}{\sum_{l,m,n=-v}^{+v} \left[ f(x-l,\, y-m,\, z-n) \right]^2} \tag{1}$$
where $f(x, y, z)$ is the mean voxel intensity of the normal controls and $g_p(x, y, z)$ is the voxel intensity of subject $p$.

Feature selection: Not all the NMSE features defined by (1) provide the same discriminant value for detecting early AD. In fact, the posterior cingulate gyri and precunei, as well as the temporo-parietal region, are typically affected by hypo-perfusion in AD [5]. A feature selection process is carried out to find the region of interest (ROI) used to train the random forest. It is based on the absolute value of a two-sample t-test statistic with a pooled variance estimate on the NMSE features:

$$T(x, y, z) = \frac{|m_1 - m_0|}{\sqrt{s_1^2 + s_0^2}} \tag{2}$$

where $m_1$ and $m_0$ denote the within-class means of the NMSE features for the ATD and NOR classes, respectively, and $s_1^2$ and $s_0^2$ their variances. Fig. 1a shows a map of the value of T (the discriminant value of the feature) for each of the voxels within the mask and a voxel size $v = 4$. The feature selection also uses correlation information to down-weight the T value of potential features using $T(1 - \alpha\rho)$, where $\rho$ is the average of the absolute values of the cross-correlation coefficient between the candidate feature and all previously selected features, and $\alpha$ is a weighting factor. A large value of $\rho$ (close to 1) down-weights the significance statistic; this means that features that are highly correlated with the features already picked are less likely to be included in the output list. Fig. 1b shows the 100 most discriminant ROI co-ordinates that were found using this feature selection procedure with $\alpha = 0.95$.
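The block NMSE of (1) and the 50% intensity mask can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names `block_nmse` and `brain_mask` are hypothetical, the volumes are synthetic random arrays rather than SPECT data, and the block is required to lie entirely inside the volume.

```python
import numpy as np

def brain_mask(f, fraction=0.5):
    """Voxels whose mean control intensity exceeds `fraction` of the maximum."""
    return f > fraction * f.max()

def block_nmse(f, g_p, centre, v):
    """NMSE of eq. (1) over the (2v+1)^3 block centred at `centre`.

    f   : mean image of the normal controls (3-D array)
    g_p : image of subject p (same shape as f)
    """
    lo = np.array(centre) - v
    hi = np.array(centre) + v + 1
    if np.any(lo < 0) or np.any(hi > np.array(f.shape)):
        raise ValueError("block extends outside the image volume")
    # Offsets l, m, n in [-v, v] cover the same cube for x-l as for x+l.
    block = tuple(slice(a, b) for a, b in zip(lo, hi))
    numerator = np.sum((f[block] - g_p[block]) ** 2)
    denominator = np.sum(f[block] ** 2)
    return numerator / denominator

# Synthetic stand-ins for the mean control image and two subjects (assumed data).
rng = np.random.default_rng(0)
f = rng.uniform(0.0, 1.0, size=(12, 12, 12))
g_same = f.copy()      # subject identical to the control mean
g_low = 0.8 * f        # subject with uniformly 20% lower intensity

mask = brain_mask(f)
nmse_same = block_nmse(f, g_same, (6, 6, 6), v=2)
nmse_low = block_nmse(f, g_low, (6, 6, 6), v=2)
print(nmse_same, nmse_low)
```

For the uniformly scaled subject the ratio reduces to 0.2² = 0.04 up to rounding, independent of the block position, which is a convenient sanity check for an implementation.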
Fig. 1 Magnitude of T for each voxel within mask and voxel size v = 4, and 100 most discriminant ROI co-ordinates of AD after applying feature selection based on t-test with feature correlation weighting
a Magnitude of T
b 100 most discriminant ROI co-ordinates
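The t-test ranking with correlation weighting described under Feature selection could be implemented roughly as follows. The greedy loop is an assumption, since the Letter does not spell out the selection procedure step by step; the names `t_statistic` and `select_features` are hypothetical and the data are synthetic. The statistic follows (2) as printed, and each candidate is scored as T(1 − αρ) with ρ averaged over the features already selected.

```python
import numpy as np

def t_statistic(X, y):
    """|m1 - m0| / sqrt(s1^2 + s0^2) for every feature column, as in eq. (2)."""
    X1, X0 = X[y == 1], X[y == 0]
    m1, m0 = X1.mean(axis=0), X0.mean(axis=0)
    s1, s0 = X1.var(axis=0, ddof=1), X0.var(axis=0, ddof=1)
    return np.abs(m1 - m0) / np.sqrt(s1 + s0)

def select_features(X, y, n_select, alpha=0.95):
    """Greedy ranking: pick the feature maximising T * (1 - alpha * rho)."""
    T = t_statistic(X, y)
    R = np.abs(np.corrcoef(X, rowvar=False))   # |cross-correlation| of all pairs
    selected = [int(np.argmax(T))]             # no previous features: rho = 0
    while len(selected) < n_select:
        rho = R[:, selected].mean(axis=1)      # average |corr| with selected set
        score = T * (1.0 - alpha * rho)
        score[selected] = -np.inf              # never pick a feature twice
        selected.append(int(np.argmax(score)))
    return selected

# Synthetic example (assumed data): 100 samples per class, 5 features.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 5))
X[:, 0] += 2.0 * y                                   # strongly discriminant
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=200)      # near-duplicate of feature 0
X[:, 2] += 1.5 * y                                   # weaker but less redundant

order = select_features(X, y, n_select=3, alpha=0.95)
print(order)
```

In this toy setting the near-duplicate feature has ρ close to 1, so its score is reduced to roughly 5% of its T value and the less redundant feature is expected to be picked ahead of it.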
ELECTRONICS LETTERS 4th June 2009 Vol. 45 No. 12 Authorized licensed use limited to: UNIVERSIDAD DE GRANADA. Downloaded on June 16, 2009 at 14:06 from IEEE Xplore. Restrictions apply.

Evaluation experiments: Several experiments were conducted to tune the random forest classifier. First of all, Fig. 2 shows the out-of-bag error rate that is used to analyse the convergence of the random forest and to select the optimum voxel size v. In this analysis, the random forest is trained with all of the 100 most discriminant features obtained above. Note that the generalisation error for the forest converges to a limit as the number of trees in the forest becomes large. Moreover, the generalisation error depends on the strength of the individual trees in the forest and the correlation between them. It can be concluded that (i) the random forest classifier converges once about 20 to 30 trees have been grown, and (ii) increasing the voxel size up to v = 4 reduces the out-of-bag error rate. The importance of a feature variable for classification can also be estimated by randomly permuting all the values of the variable in the out-of-bag samples for each classifier (thereby removing the information provided by that feature). An increased out-of-bag error is an indication of the importance of that feature. Thus, it is not necessary to supply test data for bagged ensembles because reliable estimates of the predictive power and feature importance are obtained in the process of training, which is an attractive feature of bagging.
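The out-of-bag error and the permutation-based feature importance described above can be illustrated with a small hand-rolled bagging loop. This is a sketch, not the authors' implementation: it assumes scikit-learn's `DecisionTreeClassifier` as the CART-like base learner with `max_features="sqrt"` for random feature subsets, binary labels 0/1, and synthetic data with the same class sizes as the database (41 and 38). The helper names `fit_bagged_trees` and `oob_error` are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_bagged_trees(X, y, n_trees, rng):
    """Train trees on bootstrap replicas; remember each tree's out-of-bag mask."""
    n = len(y)
    trees, oob_masks = [], []
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)       # N draws out of N with replacement
        oob = np.ones(n, dtype=bool)
        oob[idx] = False                       # samples never drawn are out-of-bag
        tree = DecisionTreeClassifier(
            max_features="sqrt", random_state=int(rng.integers(1 << 31)))
        tree.fit(X[idx], y[idx])
        trees.append(tree)
        oob_masks.append(oob)
    return trees, oob_masks

def oob_error(trees, oob_masks, X, y, permute=None, rng=None):
    """Majority-vote error using, for each sample, only trees that did not see it.

    If `permute` is a column index, that feature is shuffled within each
    tree's out-of-bag samples before voting.
    """
    votes = np.zeros((len(y), 2))
    for tree, oob in zip(trees, oob_masks):
        X_oob = X[oob].copy()
        if permute is not None:
            X_oob[:, permute] = rng.permutation(X_oob[:, permute])
        pred = tree.predict(X_oob).astype(int)
        votes[np.flatnonzero(oob), pred] += 1  # one unit vote per tree
    voted = votes.sum(axis=1) > 0              # skip samples with no OOB tree
    return float(np.mean(votes[voted].argmax(axis=1) != y[voted]))

# Synthetic data (assumed): feature 0 is informative, the rest are noise.
rng = np.random.default_rng(0)
y = np.array([0] * 41 + [1] * 38)
X = rng.normal(size=(79, 9))
X[:, 0] += 4.0 * y

trees, masks = fit_bagged_trees(X, y, n_trees=100, rng=rng)
err = oob_error(trees, masks, X, y)
imp_informative = oob_error(trees, masks, X, y, permute=0, rng=rng) - err
imp_noise = oob_error(trees, masks, X, y, permute=5, rng=rng) - err
print(err, imp_informative, imp_noise)
```

Because every tree is evaluated only on samples absent from its bootstrap replica, no separate test set is needed; the increase in error after shuffling a column serves as that feature's importance score.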
Acknowledgments: This work was partly supported by the MICINN under the PETRI DENCLASES (PET2006-0253), TEC2008-02113, NAPOLEON (TEC2007-68030-C02-01) and HD2008-0029 projects and the Consejería de Innovación, Ciencia y Empresa (Junta de Andalucía, Spain) under the Excellence Project (TIC-02566).
Fig. 2 Random forest mean out-of-bag error rate against number of grown trees for selection of optimum voxel size v (curves for v = 2, 3, 4 and 5; vertical axis: mean out-of-bag classification error; horizontal axis: number of grown trees)
The performance of the random forest CAD system was further evaluated as a tool for the early detection of AD. The experiments considered an increasing number of features for designing the classifier. Sensitivity, specificity and accuracy values were estimated by leave-one-out cross-validation. The results are shown in Fig. 3. Performance improves with the number of features up to a maximum stable value. Moreover, peak values of sensitivity = 94.7%, specificity = 97.6% and accuracy = 96.2% are obtained. In conclusion, the proposed system outperforms recently developed AD CAD systems combining principal component analysis (PCA) [6] and Bayesian classification [1], and the voxel-as-feature (VAF) approach [7], which yields a classification accuracy of just 80% by considering the voxel intensities of the SPECT images as input data for a support vector machine (SVM).
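As a worked check of how the three figures relate, the sketch below computes sensitivity, specificity and accuracy from leave-one-out counts. The error counts used in the arithmetic (2 missed ATD and 1 misclassified NOR) are not stated in the Letter; they are assumed here because, with 38 ATD and 41 NOR subjects, they are the integer counts consistent with the reported percentages. The toy classifier `nearest_mean` and the helper names are hypothetical stand-ins, not the random forest.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion counts (1 = ATD)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

def leave_one_out_counts(X, y, fit_predict):
    """Hold out each subject once, train on the rest, and tally the outcome."""
    tp = fn = tn = fp = 0
    for i in range(len(y)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        pred = fit_predict(X_train, y_train, X[i])
        if y[i] == 1:
            tp, fn = tp + (pred == 1), fn + (pred != 1)
        else:
            tn, fp = tn + (pred == 0), fp + (pred != 0)
    return tp, fn, tn, fp

def nearest_mean(X_train, y_train, x):
    """Toy stand-in classifier on scalar features: closest class mean wins."""
    def mean(label):
        values = [v for v, l in zip(X_train, y_train) if l == label]
        return sum(values) / len(values)
    return 1 if abs(x - mean(1)) < abs(x - mean(0)) else 0

# Assumed counts: 36 of 38 ATD and 40 of 41 NOR correctly classified.
sens, spec, acc = diagnostic_metrics(tp=36, fn=2, tn=40, fp=1)
print(f"{sens:.1%} {spec:.1%} {acc:.1%}")   # 94.7% 97.6% 96.2%

# Tiny leave-one-out run on made-up scalar features.
toy_X = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
toy_y = [0, 0, 0, 1, 1, 1]
toy_counts = leave_one_out_counts(toy_X, toy_y, nearest_mean)
print(toy_counts)
```

Under these assumed counts, 76 of the 79 subjects are classified correctly, which reproduces the reported 96.2% accuracy.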
Conclusion: Random forests have been investigated for classification of SPECT images and the design of an AD CAD system. The proposed system is based on voxel-based normalised mean square error feature extraction, the t-test with feature correlation weighting for feature selection and random forest image classification. It is shown that the generalisation error for the forest converges to a limit as the number of trees in the forest becomes large. Moreover, the generalisation error depends on the strength of the individual trees in the forest and the correlation between them. The proposed method yielded a classification accuracy of up to 96.2% (sensitivity = 94.7%, specificity = 97.6%) and outperformed recently developed methods for early Alzheimer’s disease diagnosis.
© The Institution of Engineering and Technology 2009
20 April 2009
doi: 10.1049/el.2009.1111

J. Ramírez, J.M. Górriz, R. Chaves, M. López, D. Salas-Gonzalez, I. Álvarez and F. Segovia (Department of Signal Theory, Networking and Communications, University of Granada, Spain)
E-mail: [email protected]

References
1 López, M., Ramírez, J., Górriz, J.M., Salas-González, D., Álvarez, I., Segovia, F., and Puntonet, C.G.: ‘Automatic tool for the Alzheimer’s disease diagnosis using PCA and Bayesian classification rules’, Electron. Lett., 2009, 45, (8), pp. 389–391
2 Breiman, L.: ‘Random forests’, Mach. Learn., 2001, 45, (1), pp. 5–32
3 Salas-González, D., Górriz, J.M., Ramírez, J., Lassl, A., and Puntonet, C.G.: ‘Improved Gauss-Newton optimisation methods in affine registration of SPECT brain images’, Electron. Lett., 2008, 44, (22), pp. 1291–1292
4 Friston, K.J., Ashburner, J., Kiebel, S.J., Nichols, T.E., and Penny, W.D.: ‘Statistical parametric mapping: the analysis of functional brain images’ (Academic Press, 2007)
5 Kogure, D., Matsuda, H., Ohnishi, T., Asada, T., Uno, M., Kunihiro, T., Nakano, S., and Takasaki, M.: ‘Longitudinal evaluation of early Alzheimer disease using brain perfusion SPECT’, J. Nucl. Med., 2000, 41, (7), pp. 1155–1162
6 Álvarez, I., Górriz, J.M., Ramírez, J., Salas-González, D., López, M., Puntonet, C.G., and Segovia, F.: ‘Alzheimer’s diagnosis using eigenbrains and support vector machines’, Electron. Lett., 2009, 45, (7), pp. 342–343
7 Fung, G., and Stoeckel, J.: ‘SVM feature selection for classification of SPECT images of Alzheimer’s disease using spatial information’, Knowl. Inf. Syst., 2007, 11, (2), pp. 243–258
Fig. 3 Performance of random forest classifier against number of input features (curves: sensitivity, specificity and correct rate; horizontal axis: number of features)