Efficient mining of association rules for the early diagnosis of Alzheimer's disease

2011 Phys. Med. Biol. 56 6047 (http://iopscience.iop.org/0031-9155/56/18/017)


IOP PUBLISHING

PHYSICS IN MEDICINE AND BIOLOGY

Phys. Med. Biol. 56 (2011) 6047–6063

doi:10.1088/0031-9155/56/18/017

R Chaves¹, J M Górriz¹, J Ramírez¹, I A Illán¹, D Salas-Gonzalez¹ and M Gómez-Río²

¹ Department of Signal Theory, Networking and Communication, ETSIIT, 18071, University of Granada, Spain
² Virgen de las Nieves Hospital, Granada, Spain

E-mail: [email protected]

Received 25 February 2011, in final form 25 July 2011
Published 26 August 2011
Online at stacks.iop.org/PMB/56/6047

Abstract

In this paper, a novel technique based on association rules (ARs) is presented to find relations among activated brain areas in single photon emission computed tomography (SPECT) imaging. The aim of this work is to discover associations among attributes which characterize the perfusion patterns of normal subjects and to make use of them for the early diagnosis of Alzheimer's disease (AD). Firstly, voxel-as-feature-based activation estimation methods are used to find the tridimensional activated brain regions of interest (ROIs) for each patient. Secondly, these ROIs serve as input to mine ARs with a minimum support and confidence among activation blocks by using a set of controls. In this context, support and confidence measures are related to the proportion of functional areas which are singularly and mutually activated across the brain. Finally, we perform image classification by comparing the number of ARs verified by each subject under test to a given threshold that depends on the number of previously mined rules. Several classification experiments were carried out to evaluate the proposed methods using a SPECT database that consists of 41 controls (NOR) and 56 AD patients labeled by trained physicians. The proposed methods were validated by means of the leave-one-out cross-validation strategy, yielding up to 94.87% classification accuracy, thus outperforming recently developed methods for computer-aided diagnosis of AD.

(Some figures in this article are in colour only in the electronic version)

© 2011 Institute of Physics and Engineering in Medicine. Printed in the UK.

1. Introduction

Alzheimer's disease (AD) is the most common cause of dementia in the elderly and affects approximately 30 million individuals worldwide (Petrella et al 2003). Its prevalence is expected to triple over the next 50 years due to the growth of the older population. To date, there is no single test or biomarker that can predict whether a particular person will develop the disease. With the advent of several effective treatments of AD symptoms, current consensus statements have emphasized the need for early recognition (Evans et al 1989, Brookmeyer et al 1998).

AD diagnosis is based on the information provided by a careful clinical examination, a thorough interview of the patient and relatives and a neuropsychological assessment. However, distinguishing AD using these clinical protocols remains a diagnostic challenge, especially during the early stage of the disease. Furthermore, it is in this early stage that the disease offers better opportunities to be treated.

An analysis of the regional cerebral blood flow (rCBF) by means of single photon emission computed tomography (SPECT) is a widely used technique to study the functional properties of the brain (English and Childs 1986, Ayache 1996) and it is frequently used as a complementary diagnostic tool in addition to the clinical findings (Silverman et al 2001). After the reconstruction and a proper normalization of the SPECT raw data, taken with Tc-99m ethyl cysteinate dimer (ECD) as a tracer, one obtains an activation map displaying the local intensity of the rCBF. Therefore, this technique is particularly applicable for the diagnosis of neuro-degenerative diseases like AD (Hellman et al 1989, Holman et al 1992, Johnson et al 1993, Stoeckel et al 2001, 2004, Fung and Stoeckel 2007, Johnson et al 2007). Although this functional modality has lower resolution and higher variability than others such as positron emission tomography (PET), SPECT tracers are relatively cheap. Furthermore, due to the longer tracer half-life, SPECT imaging is well suited when biologically active radiopharmaceuticals have slow kinetics. The SPECT modality also eliminates the need for an expensive on-site cyclotron/radiochemistry production facility, typically required for the use of PET tracers. Therefore, SPECT is very popular and is used in clinical practice nowadays.

This paper presents a novel methodology for finding associations in functional SPECT image databases for the early diagnosis of AD. We propose a novel method based on association rules (ARs) (Agrawal et al 1993) in order to exploit their strengths when operating with large itemsets. Although the size of nuclear medicine databases is usually small, the number of voxels is quite large. Thus, each SPECT scan has many attributes which can be related to each other by means of ARs.

This work is organized as follows. Firstly, in sections 2 and 3 we motivate the use of computer-aided diagnosis (CAD) systems for AD and introduce the basic principles of ARs which are later used for SPECT image analysis. In section 4.1, the SPECT image acquisition and image preprocessing steps are explained. In section 4.2, a novel AR-based feature extraction and selection method is introduced. Then, this method is applied to the extraction of relationships among brain regions in the SPECT images, as shown in section 4.3. In section 5, we analyze the classification performance of a novel CAD system based on the number of the previously extracted rules verified by each subject and, finally, conclusions are drawn in section 6.

2. Background in CAD systems

In late-onset AD, there are minimal perfusion deficits in the mild stages of the disease, and age-related changes, which are frequently seen in healthy aged people, have to be discriminated from the minimal disease-specific changes. These minimal changes in the images make visual diagnosis a difficult task that requires experienced clinicians. In order to enhance the prediction accuracy, especially in the early stage of AD, where the patient could benefit most from drugs and treatments, CAD tools are desirable (Frackowiak et al 2003, Illán et al 2009).


The potential of CAD systems has not been fully exploited in this area, although several approaches for AD can be found in the literature (Fung and Stoeckel 2007, Minoshima et al 1995, Ishii et al 2006). The main goal of CAD systems is to reproduce the knowledge of medical experts in the evaluation of a given subject, i.e. distinguishing AD patients from controls. Thus, errors from single-observer evaluation are reduced by using a decision support system for assisting the identification of early signs of AD. In this sense, several approaches have been proposed for analyzing SPECT and other medical image modalities.

The common strategy to tackle medical image analysis is based on statistical inference, at the voxel level or considering groups of voxels or regions of interest (ROIs). The most relevant univariate analysis approach to date is the widely used statistical parametric mapping (SPM) and its numerous variants (Friston et al 2007). SPM is able to measure the differences between two groups and consists of a voxelwise statistical test, i.e. a two-sample t-test, comparing the values of the image of one patient under study (normal or AD) to the mean values of the group of controls. Subsequently, the significant voxels are inferred by using random field theory (Adler 1981). On the other hand, multivariate approaches such as MANCOVA consider all the voxels in a single scan as one observation to make inferences about distributed activation effects. Their importance is based upon the fact that the effects due to activations, confounding effects and error effects are assessed statistically in terms of effects at each voxel and also interactions among voxels, but they require the number of observations (i.e. scans) to be greater than the number of components of the multivariate observation (i.e. voxels) (Frackowiak et al 2003).

Another prolific approach to the medical image analysis problem is based on statistical learning theory (Vapnik 1998, Toivonen 1996). One example is the straightforward voxel-as-feature (VAF) approach, in which the voxel intensities I_n of the SPECT image are directly used to construct the feature vectors v = (I_1, ..., I_N) (Stoeckel et al 2001, 2004). This framework suffers from the well-known small sample size problem (Duin 2000), which is very common in functional imaging studies since the number of images is limited. Several solutions have been proposed to circumvent this problem within the context of supervised multivariate approaches, such as Ramírez et al (2009), Illán et al (2009), López et al (2009). Firstly, feature vectors representing the different SPECT images are defined. Secondly, a classifier is trained using a given set of known samples (Stoeckel et al 2001, 2004, Fung and Stoeckel 2007) to automatically distinguish between controls and AD patients. Such a statistical learning approach forms a powerful tool for brain image analysis and an aid in decision making, with the additional advantage that no specific knowledge about the disease is necessary.

In this work, we focus on brain image analysis, pointing out the importance of high-order relationships between brain areas, which cannot be found on a voxel level or in direct ROI analysis. To this aim, each patient is represented by a feature vector containing ROIs which are selected using the VAF approach and a threshold-based activation estimation (AE) method similar to the one used in Turkeltaub and Zeffiro (2002). These ROIs are tridimensional blocks with a high concentration of activation coordinates (Newman et al 2008) and are directly related to the perfusion level, which varies widely between controls and AD in its three possible stages. Finally, relationships between ROIs are established using ARs to efficiently distinguish between AD patients and controls.

3. Association rules basis

ARs have drawn researchers' attention in the past (Agrawal et al 1993, Agrawal and Srikant 1994, Cheung et al 1996, Han and Fu 1995).
ARs enable us to find the associations and/or relationships among input items of any system in large databases and later eliminate some


unnecessary input. An AR is represented by a mathematical expression of a transactional relationship between two patterns A and B, but it is also sensitive to the direction (Zang 2000). ARs differ from classification rules in two ways: they can predict any attribute, not just the class, and they can predict more than one attribute value at the same time, giving the freedom to predict combinations of attributes too (Witten and Frank 2005). Moreover, classification using ARs combines association rule mining and classification and, therefore, is concerned with finding rules that accurately predict a single target (class) variable. Any classification learner using ARs thus has to perform three major steps: (i) mining a set of potentially accurate rules, (ii) evaluating and pruning rules and (iii) classifying future instances using the found rule set.

ARs are typically used in market basket analysis, cross-marketing, catalog design, loss-leader analysis, store layout and customer buying pattern analysis (Skirant et al 1997). This is primarily due to the usefulness and easy understanding of their results and the fact that they express how important products or services relate to each other, and immediately suggest particular actions. There are also applications of ARs in mining categorical data items (Agrawal et al 1993, Skirant et al 1997). The discovery of new and potentially important relationships between concepts in the biomedical field has attracted the attention of many researchers in text mining (Nearhos et al 1996). Hospitals and health-care organizations are developing automated mechanisms to support the analysis of medical information, which is usually both time consuming and very expensive.

ARs have been proposed for image analysis (Ordonez et al 2003). The problem of image mining combines the areas of content-based image retrieval, image understanding, data mining, etc. Medical image mining is a promising area of computational intelligence applied to automatically analyze patients' records, aiming at the discovery of new knowledge potentially useful for medical decision making. These data mining techniques have been applied to medical image databases for image analysis of a subject under study. An approach for tumour classification in mammograms by employing association rule mining with the constraint form has been proposed in Antonie et al (2001), Zaiane et al (2002). Other techniques explore association rule mining in x-ray image databases for the early diagnosis of lung cancer (Zubi et al 2011). In Sheela et al (2009), a CAD system to classify patients with AD was proposed using structural MRI. In that work, five textural features proposed by Haralick (Haralick et al 1973) were extracted from MRI scans, while an enhanced classification based on associations (CBA) algorithm (Liu et al 1998) was used to classify the images as normal or abnormal. The key issue for these applications is that available image databases provide a suitable environment in which to apply image recognition methods to extract useful knowledge, i.e. rules to characterize the normal and abnormal patterns of the diseases in a medical decision-making scenario.

The strength of ARs lies in their capability of operating with large databases in an efficient way, with an execution time that scales almost linearly with the size of the database (Nearhos et al 1996). In the biomedical field, however, the application of ARs is an innovative possibility that is yet to be explored in AD diagnosis. For instance, the idea of combining image diagnosis and ARs has been used in Ribeiro et al (2009), in which mammography image diagnosis is enhanced through the AR-based (IDEA) method. Another example can be found in Karabatak et al (2009), where ARs are used for reducing the dimension of the erythemato-squamous diseases dataset. Moreover, a direct application of ARs for detecting breast cancer in Karabatak and Ince (2008) plays an important role due to the huge amount of images to be processed. To our knowledge, the potential of these classification techniques based on AR mining has not been applied to functional medical image databases and is still an open and



promising field of research. Diagnosis of AD by means of functional image modalities such as SPECT usually requires visually inspecting 3D functional activation maps and identifying abnormal voxel patterns in typical hypoperfusion brain regions. This process requires experienced clinicians and is subjective and prone to error. AR mining can be used in this context for the early diagnosis of AD by characterizing normal perfusion patterns in SPECT images by means of ARs mined from healthy subjects. Then, a subject under study can be analyzed by observing jointly activated voxel patterns identified by a previous AR-mining process and quantifying the number of ARs its image pattern verifies.

4. Materials and methods

4.1. Subjects and preprocessing

Baseline SPECT data from 97 participants were collected from the Virgen de las Nieves hospital in Granada (Spain). The patients were injected with a gamma-emitting 99mTc-ECD radiopharmaceutical and the SPECT raw data were acquired by a three-head gamma camera Picker Prism 3000. A total of 180 projections were taken with a 2° angular resolution. The images of the brain cross sections were reconstructed from the projection data using the filtered backprojection algorithm in combination with a Butterworth noise removal filter. The SPECT images are first spatially normalized using SPM software (Friston et al 2007), in order to ensure that the voxels in different images refer to the same anatomical positions in the brain. This step allows us to compare the voxel intensities of the brain images of different subjects (Salas-Gonzalez et al 2008, 2010). Then, we normalize the intensities of the SPECT images with the maximum intensity I_max, which is computed for each image individually by averaging over the 3% of the highest voxel intensities, similar to the method in Saxena et al (1998).
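The intensity normalization described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function name `normalize_intensity`, the flattened-list image representation and the rounding used to pick the number of top voxels are all assumptions made for the example.

```python
def normalize_intensity(voxels, top_fraction=0.03):
    """Scale intensities by I_max, the mean of the top `top_fraction` of values.

    `voxels` is a flat list of intensities (assumed representation).
    """
    if not voxels:
        raise ValueError("image has no voxels")
    ranked = sorted(voxels, reverse=True)
    # Use at least one voxel so that tiny images still yield a valid I_max.
    n_top = max(1, int(round(top_fraction * len(ranked))))
    i_max = sum(ranked[:n_top]) / n_top
    if i_max == 0:
        raise ValueError("cannot normalize an all-zero image")
    return [value / i_max for value in voxels]


# Hypothetical flattened image with intensities 1..100 (not real SPECT data).
image = list(range(1, 101))
normalized = normalize_intensity(image)
```

Averaging over the brightest 3% rather than taking the single maximum makes I_max less sensitive to isolated outlier voxels, which is presumably why a robust estimate is preferred here.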
After the spatial normalization, one obtains a 95 × 69 × 79 voxel representation of each subject, where each voxel represents a brain volume of 2 × 2 × 2 mm^3. The SPECT images were visually classified by experts of the Virgen de las Nieves hospital using four different labels: normal (NOR) for patients without any symptoms of AD, and possible AD (AD1), probable AD (AD2) and certain AD (AD3) to distinguish between different levels of the presence of typical characteristics for AD. In total, the database consists of 41 NOR, 30 AD1, 22 AD2 and 4 AD3 patients. Labels were assigned to patients after a clinical cognitive test and a visual inspection of the SPECT image were performed. Experienced physicians of the Departments of Neurology and Nuclear Medicine of the Hospital Universitario Virgen de las Nieves (Granada, Spain) jointly assigned the labels by considering both the neurological assessment and the SPECT study. This methodology has been reported to be highly accurate when histopathological information is absent (Jobst et al 1998, Dougall et al 2004). For subsequent analysis, the data were arranged in two different groups: AD subjects were labeled as positive and normal controls as negative. The motivation for doing so is to test our method with all the available stages of the disease, to keep the database as balanced as possible (41 NOR and 56 AD) and to include several types of pattern in the classification task (training and test).

4.2. Feature extraction and selection

In this paper, we propose to apply a combination of VAF and threshold-based AE methods for feature extraction of each subject. The former method (VAF) is a way of including all the voxel intensities inside the brain as features; thus no explicit knowledge about the disease


Figure 1. Pipeline with the feature extraction, training and test stages of this AR-based classifier.

is needed and the inclusion of a priori information about the pathology into the system is avoided. Hence this information is implicitly obtained from image databases (Stoeckel et al 2004). The AE method leads to a reduced list of activation maxima containing only those maxima which have one or more other maxima in their vicinity (activated voxels). In functional imaging, each voxel carries a gray level intensity I(x_j), which is related to the rCBF (Holman et al 1992), glucose metabolism (Ng et al 2007), etc in the brain of a patient, depending on the image acquisition modality. The process is described as follows and is illustrated in figure 1.

In the first step (VAF) (Stoeckel et al 2004), a mean tridimensional image is computed by averaging all control images from the SPECT database. An activation threshold intensity a_T is fixed at 50% of the maximum intensity of this mean image, defining a 3D mask. Voxels outside this mask are discarded from the database images, and therefore voxels outside the brain and poorly activated regions are excluded from this analysis. Then, an image voxel is considered as activated when its intensity value I is above the activation threshold, that is, when I > a_T.

In the second step (AE), each SPECT image is divided into 3D blocks of v × v × v voxels. The number of blocks into which an image is divided depends on two parameters: v and g (see figure 2). The first defines the size of the blocks, and the latter defines the size of the step, in number of voxels, between two adjacent blocks in one direction. The so-defined center locations of the blocks define a 3D g × g × g grid, considering the possibility of some overlap between adjacent blocks. A block is parameterized by its proportion of activated voxels t. A block is considered to be activated only if the ratio of activated voxels inside it is greater than a given threshold, i.e. t > 90%.
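The two steps above (voxel activation by thresholding, then block activation by the proportion of activated voxels) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the function name `activated_blocks` is hypothetical, images are nested lists indexed as `image[x][y][z]`, and blocks are identified by their corner coordinates rather than by their center locations.

```python
from itertools import product


def activated_blocks(image, a_t, v, g, t_min=0.9):
    """Return corners of v*v*v blocks whose activated fraction exceeds t_min.

    image: 3D nested list image[x][y][z] of intensities (assumed layout)
    a_t:   activation threshold; a voxel is activated when I > a_t
    v:     block edge length in voxels
    g:     step in voxels between adjacent blocks (g < v gives overlap)
    """
    nx, ny, nz = len(image), len(image[0]), len(image[0][0])
    blocks = []
    for x0, y0, z0 in product(range(0, nx - v + 1, g),
                              range(0, ny - v + 1, g),
                              range(0, nz - v + 1, g)):
        # Count the activated voxels inside the current block.
        active = sum(
            1
            for dx, dy, dz in product(range(v), repeat=3)
            if image[x0 + dx][y0 + dy][z0 + dz] > a_t
        )
        if active / v ** 3 > t_min:
            blocks.append((x0, y0, z0))
    return blocks


# Hypothetical 4x4x4 "mean control image" with one fully active 2x2x2 corner.
mean_image = [[[1.0 if max(x, y, z) < 2 else 0.0 for z in range(4)]
               for y in range(4)] for x in range(4)]
# Threshold at 50% of the maximum intensity of the mean image, as in the text.
a_t = 0.5 * max(val for plane in mean_image for row in plane for val in row)
blocks = activated_blocks(mean_image, a_t, v=2, g=2)
```

With the values reported later in the paper (blocks of 9 × 9 × 9 voxels and a grid step of 4), adjacent blocks overlap, so the same voxel can contribute to several candidate ROIs.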
These parameters are tuned experimentally in order to reduce the computational time. If the activation threshold is reduced or the overlapping is broader, it may lead to an increase of the computational cost of the CAD system. On the other hand, other strategies are analyzed for reducing the


Figure 2. (a) Tridimensional ROIs (centered at dots) for a voxel size = 9, grid space = 4 and 90% of activation by block. (b) Bidimensional activated ROIs (crosses in rectangles) from the axial, sagittal and coronal axes obtained from ARs in the supervised mode with minsup and minconf of 100%.

computational demands of our approach as well, by adding extra constraints on the structure of patterns (Umarani et al 2010). In this sense, we use a threshold-based decision which provides a good trade-off between computational cost and accuracy and allows the inclusion of all relevant brain regions in the detection of AD. Figure 2(a) shows a contour representation of the brain and the location of the activated regions (in dots), which are used as input for the AR-mining algorithm. Using these preprocessing steps, we automatically reproduce the visual inspection of a clinician but in a more objective manner: common ROIs are searched in controls and afterward they are applied to subjects under test.


4.3. Extracting ARs from SPECT images

In data mining, AR learning is an effective method for discovering interesting relations between variables in large databases (Agrawal et al 1993). In this context, let I = {i_1, i_2, ..., i_m} be the set of m activated regions (also called ROIs) and let T = {t_1, t_2, ..., t_n} be a database of transactions that relate only the ROIs of controls. Each transaction t is represented as X ⇒ Y, where X, Y ⊂ ROIs of controls and X ∩ Y = ∅. In our problem, an AR X ⇒ Y, which relates two activated items (regions) of the normal pattern, is defined as 'activation in area X implies activation in area Y' given certain support and confidence constraints. The rule X ⇒ Y has a support s if s% of the transactions in T contain X ∪ Y, and a confidence c if c% of the transactions in T that contain X also contain Y (Wu et al 2004), that is,

\[ \mathrm{confidence}(X \Rightarrow Y) = \frac{\mathrm{support}(X \cup Y)}{\mathrm{support}(X)}. \qquad (1) \]

For example, if the rule X ⇒ Y has a confidence of 1 in the database, then 100% of the transactions containing X also contain Y. While confidence is a measure of the rule's strength, support corresponds to statistical significance (Agrawal et al 1993). Support and confidence do not measure an association relationship as such but how often patterns co-occur, and they only make sense in AR mining.

In this paper, AR mining refers to 'normal patterns' in terms of activation areas across the brain (ROIs). Control patterns are less variable than AD patterns (Illán et al 2010a); thus they are more suitable for obtaining relevant relationships among brain regions using ARs, which will not be satisfied in general by AD patients. In addition, ARs are mined from ROIs of controls, which solves the problem of high dimensionality as in Illán et al (2010b).
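As a worked illustration of the support and confidence measures in equation (1), the sketch below computes both for a toy database and then searches for single-region rules X ⇒ Y. Several points are assumptions made for the example: each transaction is taken to be the set of activated ROI indices of one control, the rule search is a brute-force enumeration of pairs (it is not the Apriori algorithm that the paper uses), thresholds are applied with "greater than or equal" so that a 100% threshold can be met, and all function names are hypothetical.

```python
from itertools import permutations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(x, y, transactions):
    """confidence(X => Y) = support(X u Y) / support(X), as in equation (1)."""
    sup_x = support(x, transactions)
    if sup_x == 0:
        return 0.0
    return support(set(x) | set(y), transactions) / sup_x


def mine_pair_rules(transactions, minsup, minconf):
    """Brute-force search for rules X => Y between single ROIs."""
    items = sorted(set().union(*transactions))
    rules = []
    for x, y in permutations(items, 2):
        if (support({x, y}, transactions) >= minsup
                and confidence({x}, {y}, transactions) >= minconf):
            rules.append((x, y))
    return rules


# Hypothetical activated-ROI sets of four controls (illustrative only).
controls = [frozenset(s) for s in ({1, 2, 3}, {1, 2, 3}, {1, 2}, {1, 2, 4})]

sup_12 = support({1, 2}, controls)            # ROIs 1 and 2 co-occur everywhere
conf_3_to_1 = confidence({3}, {1}, controls)  # ROI 3 never appears without ROI 1
conf_1_to_3 = confidence({1}, {3}, controls)  # ROI 1 often appears without ROI 3
strict_rules = mine_pair_rules(controls, minsup=1.0, minconf=1.0)
relaxed_rules = mine_pair_rules(controls, minsup=0.5, minconf=1.0)
```

The asymmetry between `conf_3_to_1` and `conf_1_to_3` shows why the direction of a rule matters even though the support of X ∪ Y is the same in both cases.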
This methodology improves the classification results since we have different samples of the same individual; that is, the proposed AR-based CAD follows the philosophy of voxel-wise statistical approaches, such as SPM. In practice, the number of individuals in the database is in fact multiplied by the number of ROIs and, as a consequence, the problem of small sample size is relieved.

The problem of rule extraction is to generate all ARs that have support and confidence greater than a minimum support (called minsup) and minimum confidence (called minconf) established by the user. The Apriori algorithm (Agrawal and Srikant 1994) is used for the AR mining (see appendix). If minsup and minconf values are increased, the number of rules mined is reduced and vice versa. In particular, if minconf and minsup are selected close to the maximum of 100%, the number of rules to be mined is lower but only the most distinguishing features are used for the classification stage. Figure 2(b) shows transaxial, coronal and sagittal slices of a mean SPECT image illustrating the 4 × 4 × 4 grid locations (crosses) and the activated regions involved in ARs (squares) in terms of antecedents and consequents. ARs are mined using a high minsup and minconf (100%) to extract the rules that better distinguish AD subjects from controls. Note that temporoparietal brain regions, which are characteristic brain regions for AD, are considered as activated ROIs by the algorithm. In the classification step, we test for each subject how many ARs are verified. Since ARs were mined from controls, it is expected that healthy subjects verify a higher number of ARs than AD patients do.

5. Results

In this section, we show the experimental results after applying the proposed methods for ROI extraction and classification. In this sense, we have achieved several objectives.
Firstly, supervised learning based on ARs applied to a reduced number of ROIs yields higher classification accuracy than the VAF approach (Stoeckel et al 2004) which is used as a baseline.



Then, we reproduce the visual procedure performed by clinicians of identifying hypoperfusion patterns but in a more automated way than other methods, for instance principal component analysis (PCA) (Friston et al 2007, Illán et al 2009). The stages of the AR-based technique are as follows.

• Training stage: ROIs (see figure 2(a)) were previously extracted for each subject in the SPECT database as described in section 4.2. In the training stage, ARs are mined only from controls using the cross-validation strategy (Gidskehaug et al 2008, Stone et al 1974). Whenever a specific control is being evaluated in the test stage, it is not considered in the AR mining, which avoids overtraining as shown in figure 1. If the availability of samples is limited, separation of the data into a training and a validation set may decrease the quality of both the calibration model and the validation. This is solved by processing all available information (i.e. scans) in an iterative leave-one-out (loo) process in order to estimate the residual error of the model in the test stage (Gidskehaug et al 2008).

• Test stage: in this stage, we check the number of ARs which are verified by each test subject following the loo validation strategy. If a subject verifies at least as many ARs as a class threshold, then it is classified as NOR, otherwise as AD.

Several experiments were conducted in order to evaluate the combination of VAF and AE feature extraction methods prior to AR mining for brain image classification. The performance of the AR-based classifier, as a tool for the early detection of AD, was evaluated in depth in terms of accuracy, sensitivity and specificity. In the experimental framework, we consider an increasing threshold on the number of ARs verified by the subject to be classified.
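The training and test stages above can be sketched as a small leave-one-out loop. This is a toy illustration under explicit assumptions rather than the authors' pipeline: a subject is assumed to "verify" a rule X ⇒ Y when both regions are activated in its image, rules are mined only for minsup = minconf = 100% (in which case a rule between two single ROIs holds exactly when both ROIs are activated in every training control), and all function names and ROI sets are hypothetical.

```python
from itertools import permutations


def mine_rules_full_support(controls):
    """Rules X => Y with 100% support: both ROIs active in every training control."""
    common = set.intersection(*map(set, controls))
    return list(permutations(sorted(common), 2))


def verifies(rule, rois):
    """Assumed reading: a subject verifies X => Y when X and Y are both activated."""
    x, y = rule
    return x in rois and y in rois


def classify(rois, rules, fraction):
    """Label NOR if at least `fraction` of the rules are verified, else AD."""
    n_verified = sum(verifies(rule, rois) for rule in rules)
    return "NOR" if n_verified >= fraction * len(rules) else "AD"


def leave_one_out(controls, patients, fraction):
    """Predict labels; each control is held out of the mining when it is tested."""
    predictions = []
    for i, rois in enumerate(controls):
        training = controls[:i] + controls[i + 1:]
        rules = mine_rules_full_support(training)
        predictions.append(classify(rois, rules, fraction))
    # AD scans never enter the mining, so all controls can be used for them.
    rules = mine_rules_full_support(controls)
    predictions += [classify(rois, rules, fraction) for rois in patients]
    return predictions


# Hypothetical activated-ROI sets (illustrative only, not study data).
controls = [{1, 2, 3}, {1, 2, 3}, {1, 2, 3, 4}]
patients = [{1, 4}, {2}]
predicted = leave_one_out(controls, patients, fraction=0.8)
```

Sweeping `fraction` from 0 to 1 corresponds to the increasing threshold on the percentage of verified rules considered in the experiments.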
The sensitivity and specificity of each test are defined as

\[ \mathrm{Sensitivity} = \frac{TP}{TP + FN}; \qquad \mathrm{Specificity} = \frac{TN}{TN + FP}, \]

respectively, where TP is the number of true positives: the number of AD patients correctly classified; TN is the number of true negatives: the number of controls correctly classified; FP is the number of false positives: the number of controls classified as AD patients; FN is the number of false negatives: the number of AD patients classified as controls.

In all the experiments, 3D activated blocks of size 9 × 9 × 9 are selected, which include voxels above an intensity threshold as described in section 4.2. Coordinates of the blocks are sampled by means of a 4 × 4 × 4 3D grid to reduce the computational complexity of the method. The activated regions (shown in figure 2(a)) are obtained and numbered for each subject inside a mean 3D image computed by averaging all controls from the SPECT database. Thus, 3D blocks are defined by setting an activation threshold as a function of the maximum voxel intensity (AE method) and they are numbered (I = {i_1, i_2, ..., i_m}) to describe transactions X ⇒ Y in the AR mining. In particular, it was found experimentally that a region should be considered of interest, or nearly activated, if more than 90% of its voxels, i.e. a fraction (650 voxels) of the total 729 voxels, are activated. In fact, if a higher activation block threshold is considered, the results are quite good, as shown in figure 3(a), where blocks are at 92% activation. On the other hand, if a lower activation block threshold is selected, the accuracy is lower and the computational time increases.
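The definitions above translate directly into code. The sketch below is illustrative only: the function name `diagnostic_metrics` is hypothetical, the confusion counts are made-up values chosen to match the database sizes (56 AD, 41 NOR) and are not results from the paper, and accuracy is computed with the standard formula (TP + TN) / (TP + TN + FP + FN), which the text uses but does not spell out.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    Positives are AD patients and negatives are controls, as in the text.
    """
    sensitivity = tp / (tp + fn)   # AD patients correctly classified
    specificity = tn / (tn + fp)   # controls correctly classified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for 56 AD patients and 41 controls (not the paper's results).
sen, spe, acc = diagnostic_metrics(tp=50, fn=6, tn=40, fp=1)
```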

5.1. Analysis of the AR mining

In this section, we analyze the performance of the proposed method in terms of the measures of significance used in the AR mining. First of all, it can be observed that the number of extracted rules is larger when minsup and minconf values are reduced as shown in figure 3(b),

Figure 3. (a) Accuracy, sensitivity and specificity versus the percentage of verified rules. Activation block threshold of 92%, with minsup and minconf values 100%. (b) Bidimensional activated regions (rectangles) from the axial, sagittal and coronal axes with minsup and minconf of 90%.

where a higher number of ROIs (in comparison to the case in figure 2(b)), which represent antecedents and consequents of ARs, is obtained. Although the number of rules increases, it is remarkable that not all of them are discriminant, since the support and confidence of the mined rules are low; that is, the extracted rules include hundreds that depend on the specific properties of the images. This is shown in figure 4, where minconf and minsup are reduced simultaneously. In this case, the specificity curve decreases at a lower percentage of verified rules (n).

Figure 4. Accuracy, sensitivity and specificity versus the percentage of verified rules when the activation block threshold is 90% for minsup and minconf of (a) 100%, (b) 95%, (c) 90% and (d) 80%. Note that the overall number of mined rules and the maximum accuracy rates obtained are (a) 29756 rules, 94.87%, (b) 84396 rules, 89.69%, (c) 146768 rules, 87.63% and (d) 288190 rules, 85.57%.

Mined ARs are discriminant for the early diagnosis of AD when the highest minsup and minconf, i.e. 100%, are selected; otherwise, hundreds of ‘data-dependent’ rules are included which do not effectively model the normal pattern. This assumption is corroborated by the decrease of specificity at a given value n0 of verified rules in figure 4: n0 = 95% in figure 4(b) (minsup and minconf of 95%), n0 = 88% in figure 4(c) (minsup and minconf of 90%) and n0 = 79.5% in figure 4(d) (minsup and minconf of 80%). In conclusion, if the minsup and minconf values are lower than 100%, the detection ability of the method increases up to a specific value n0; for n > n0, specificity decreases notably. In addition, lower thresholds may increase the computational cost of the CAD system and penalize the maximum accuracy rate (89.69% in figure 4(b), 87.63% in figure 4(c) and 85.57% in figure 4(d)), because nondiscriminant rules are also included in the classification process. In the light of figure 4, extracting rules at the highest minsup and minconf percentage yields the best results in terms of accuracy, specificity and sensitivity (94.87%, 100% and 91.07%, respectively). In this case, the computational cost is reduced as well, since the analysis is focused on the most relevant regions of the brain for the diagnosis of AD (see figure 2(b)).

5.2. ROC analysis

The ROC curve describes the compromise that can be made between sensitivity (TPF, true positive fraction) and 1−specificity (FPF, false positive fraction), providing a complete description of disease detectability (Metz 1978). Figure 5 shows the ROC curves for different minsup and minconf values (from 100% to 80%), which evaluate the trade-off between sensitivity and specificity as the threshold of ARs verified by each subject is increased. The proposed method is compared to other recently reported CAD methods that combine support vector machines (SVM) with VAF (Stoeckel et al 2001; ROC curve and operation point), PCA (López et al 2009; ROC curve and operation point) and Gaussian mixture models (GMM) (Górriz et al 2009; operation point only). The ROC curve of the proposed method is shifted up and to the left, outperforming the baselines used in this paper, as shown in figure 5. Because specificity is always 1 when the maximum minsup and minconf (100%) are used, so that no control is misclassified, the ROC curve in this case reaches a sensitivity of 0.91, approaching the ideal CAD system (a sensitivity of 1, which would mean that every control and every AD patient is correctly classified). In this set-up, our AR-based CAD system correctly classified all controls (41) and all AD patients but five scans (four labeled as AD1 and one labeled as AD2), as expected, since the AD1 pattern is still a challenge to diagnose. If we only consider the case NOR versus AD1, the performance rates of the method are Acc = 94.37%, Sen = 86.67% and Spe = 100%, which still represent a great advance in the field.

If a subject verifies at least as many ARs as a class threshold, it is classified as control; otherwise, it is classified as AD. This means that, as the threshold of verified rules is increased, a smaller number of cases is classified as control, i.e. the specificity should decrease. By contrast, this does not happen when the highest minsup and minconf (100%) are selected in the AR mining, that is, when the rules perfectly model the normal pattern as being the most discriminative for AD. The best classification results are obtained when 100% of the rules are verified. These results agree with the expected behavior of the system, since normal subjects are assumed to have a common SPECT image pattern and to verify most of the ARs mined. This fact supports the idea that it is preferable to mine ARs only on a set of controls. In table 1, we summarize the statistical measures obtained with the proposed methods and compare them to recently reported CAD systems (Stoeckel et al 2004, Górriz et al 2011, López et al 2009). In conclusion, the analysis of the ROC curves shows that the proposed CAD system based on AR mining yields the best trade-off between sensitivity and specificity by shifting the operating point up and to the left in the ROC space (Metz 1978). As expected, the best results are obtained with the proposed AR technique for the highest minsup and minconf (100%).

Figure 5. ROC analysis (sensitivity versus 1−specificity): the AR method with different selected percentages of minsup (100%, 95%, 90%, 80%) and minconf (100%, 95%, 90%, 80%) in the rule extraction, compared to other recently reported methods: VAF SVM (ROC curve and operation point), PCA-SVM (ROC curve and operation point) and GMM SVM-rbf (operation point). The observed peak value is Acc = 94.87%, with Sen = 91.07% and Spe = 100% (cross-shaped mark), for minsup = minconf = 100%. The AUC obtained for each ROC curve is, for the AR method (minsup: AUC), 100%: 0.9554, 95%: 0.9371, 90%: 0.9233, 80%: 0.9027; for VAF SVM: 0.8993; and for PCA SVM: 0.9177.

Table 1. Statistical measures of performance of AR-based feature selection methods for the sample group using several set-ups (minconf, minsup and % verified rules). VAF, GMM and PCA operation results are reported as reference. VAF parameters: linear SVM classifier, GMM parameters: σ = 6 RBF-SVM classifier with eight components and PCA parameters: σ = 6 RBF-SVM classifier with 16 components.

All entries are Acc/Sen/Spe (%).

minconf, minsup   70% verified ARs    90% verified ARs    100% verified ARs   Reference method
100%              53.71/19.64/100     71.13/50/100        94.85/91.07/100     VAF: 83.51/83.93/82.93
95%               58.76/28.57/100     86.60/83.93/100     74.23/100/39.02     PCA: 86.56/91.07/80.49
80%               81.44/76.79/87.80   84.54/92.86/73.17   60.82/100/0.07      GMM: 89.69/90.24/89.29

6. Conclusions and outlook

ARs were investigated for the classification of SPECT images and the design of a CAD system for AD diagnosis. The proposed system is based on AR mining for controls and on classification depending on the number of rules verified by each subject. It was shown that accuracy, specificity and sensitivity increase up to a maximum value as the threshold of verified ARs increases. However, this effect only occurs when ARs are mined as being the most discriminative, with minsup and minconf values of 100%. The proposed method yields up to 94.87% classification accuracy (sensitivity = 91.07%, specificity = 100%), outperforming recently developed methods for early AD diagnosis. In any other case, accuracy and specificity decrease at a determined number of rules, even when the most suitable rule for classification is chosen.
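The rates quoted above can be cross-checked from the counts given in section 5.2 (41 controls, 56 AD scans, five AD scans missed and no control misclassified). The following is only a consistency sketch with AD as the positive class. The figure of 30 AD1 scans is not stated in this section; it is inferred here from the reported NOR versus AD1 sensitivity of 86.67% with four missed AD1 scans. The function name is hypothetical.

```python
def rates(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) in percent, rounded to 2 decimals.

    AD is the positive class: tp/fn count AD scans, tn/fp count controls.
    """
    acc = (tp + tn) / (tp + fn + tn + fp)
    sen = tp / (tp + fn)
    spe = tn / (tn + fp)
    return tuple(round(100 * x, 2) for x in (acc, sen, spe))

# NOR versus all AD: 41 controls, 56 AD scans, 5 AD scans missed.
nor_vs_ad = rates(tp=51, fn=5, tn=41, fp=0)

# NOR versus AD1 only: 4 AD1 scans missed; 30 AD1 scans is an inferred count.
nor_vs_ad1 = rates(tp=26, fn=4, tn=41, fp=0)

print(nor_vs_ad)
print(nor_vs_ad1)
```

These counts give 92/97 ≈ 94.85% accuracy for NOR versus AD, the value listed in table 1 (the running text quotes 94.87%), together with 51/56 ≈ 91.07% sensitivity, and 67/71 ≈ 94.37% accuracy for NOR versus AD1.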
Up to now AD has only been studied from a static point of view, usually by just looking for hypoperfusion patterns. By contrast, the innovation of this paper consists in studying relationships among brain areas with ARs. In the light of the results, this more dynamic approach has turned out to be very efficient. The SPECT database used in this work, like other relevant databases, is not pathologically confirmed and consequently may introduce some uncertainty on the subjects’ labels. Using these labels allows us to test the robustness of the different classifiers. This should also be considered when comparing to other methods tested on autopsy-confirmed AD patients, on which every classifier is expected to improve its performance. Finally, this method could be used not only for quantitative assessment but also for visual assessment in clinical practice, as shown in the examples of the previous sections. Since the presented method does not depend on any pathological information about the specific disease, it is applicable to other types of neurodegenerative diseases as well. It may also be applied to other image modalities, such as PET. Furthermore, recent studies have shown that different biomarkers may provide complementary information and an improvement in classification for the diagnosis of AD (Davatzikos et al 2010), as for example in Zhang et al (2011), where different modalities of biomarkers are combined, i.e. MRI, FDG-PET and CSF for PET image classification.

Acknowledgments

This work was partly supported by the MICINN under the TEC2008-02113 project and the Consejería de Innovación, Ciencia y Empresa (Junta de Andalucía, Spain) under the Excellence Project (TIC-2566 and TIC-4530). We are grateful to M Gómez-Río and coworkers from the ‘Virgen de las Nieves’ hospital in Granada (Spain) for providing and classifying the SPECT images used in this work.

Appendix: Apriori algorithm

Several algorithms have been developed to extract ARs (Chen et al 1996). The Apriori algorithm (Agrawal and Srikant 1994) is a state-of-the-art algorithm, since most AR algorithms are variations of it. It works iteratively, first finding the set of large 1-itemsets, then the set of large 2-itemsets, and so on. The number of scans depends on the length of the maximal itemset. Apriori is based on the generation of a smaller candidate set using the set of large itemsets found in the previous iteration. Let L_k represent the set of frequent k-itemsets and C_k the set of candidate (potentially frequent) k-itemsets. The Apriori algorithm makes many passes over the SPECT database, and each pass consists of two stages. In the first stage, the set of all frequent (k−1)-itemsets, L_{k−1}, found in the (k−1)th pass, is used to generate the candidate itemsets C_k, which are guaranteed to be a superset of the set of all frequent k-itemsets. In the second stage, the algorithm scans the database and, using a hash-tree data structure, determines which of the candidates in C_k are contained in each record, incrementing their support counts. At the end of the pass, C_k is tested to determine which of the candidates are frequent, yielding L_k. The Apriori algorithm finishes when L_k becomes empty (Srikant et al 1997).
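The passes described above can be sketched as follows. This is a minimal illustration of the level-wise Apriori scheme, not the implementation used in this work: it counts support with plain set containment instead of a hash tree, and the block labels, the toy records and the minsup value are hypothetical.

```python
from itertools import combinations

def apriori(transactions, minsup):
    """Return {itemset: support} for every frequent itemset (level-wise search)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent(candidates):
        # One scan of the database: count the records containing each candidate.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: k / n for c, k in counts.items() if k / n >= minsup}

    level = frequent({frozenset([i]) for t in transactions for i in t})  # L_1
    result = dict(level)
    k = 2
    while level:
        prev = set(level)  # L_{k-1}
        # Join step: unions of two frequent (k-1)-itemsets that have k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        level = frequent(candidates)  # L_k
        result.update(level)
        k += 1
    return result  # the loop stops as soon as L_k is empty

# Hypothetical records: the activated blocks of four controls.
records = [
    {"B1", "B2", "B3"},
    {"B1", "B2", "B3"},
    {"B1", "B2", "B4"},
    {"B1", "B2", "B3", "B4"},
]
freq = apriori(records, minsup=0.75)
for itemset, sup in sorted(freq.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

The prune step is what keeps C_k small: a candidate is only counted against the database if all of its (k−1)-subsets were already found frequent in the previous pass.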

References

Adler R 1981 The Geometry of Random Fields (New York: Wiley)
Agrawal R, Imielinski T and Swami A 1993 Mining association rules between sets of items in large databases Proc. of the ACM SIGMOD Int. Conf. on the Management of Data pp 207–16
Agrawal R and Srikant R 1994 Fast algorithms for mining association rules in large databases Proc. of the 20th Int. Conf. on Very Large Databases pp 487–99
Antonie M L, Zaiane O R and Coman A 2001 Application of data mining techniques for medical image classification Proc. of Second Int. Workshop on Multimedia Data Mining (MDM/KDD 2001) in Conjunction with 7th ACM SIGKDD (San Francisco, CA, USA) pp 94–101
Ayache N 1996 Analyzing 3D images of the brain Neuroimage 4 S34–5
Brookmeyer R, Gray S and Kawas C 1998 Projections of Alzheimer’s disease in the United States and the public health impact of delaying disease onset Am. J. Public Health 88 1337–42
Chen M, Han J and Yu P 1996 Data mining: an overview from a database perspective IEEE Trans. Knowl. Data Eng. 8 866–81
Cheung D W, Han J, Ng V T, Fu A W and Fu Y 1996 A fast distributed algorithm for mining association rules Proc. of the 4th Int. Conf. on Parallel and Distributed Information Systems pp 31–43
Davatzikos C, Bhatt P, Shaw L M, Batmanghelich K N and Trojanowski J Q 2010 Prediction of MCI to AD conversion, via MRI, CSF biomarkers, and pattern classification Neurobiol. Aging Epub ahead of print
Dougall N J, Bruggink S and Ebmeier K P 2004 Systematic review of the diagnostic accuracy of 99mTc-HMPAO-SPECT in dementia Am. J. Geriatr. Psychiatry 12 554–70
Duin R P W 2000 Classifiers in almost empty spaces Proc. 15th Int. Conf. on Pattern Recognition vol 2 pp 1–7
English R J and Childs J (ed) 1986 SPECT: Single-Photon Emission Computed Tomography: A Primer (New York: Society of Nuclear Medicine)
Evans D, Funkenstein H, Albert M, Scherr P, Cook N, Chown M, Hebert L, Hennekens C and Taylor J 1989 Prevalence of Alzheimer’s disease in a community population of older persons J. Am. Med. Assoc. 262 2551–6
Frackowiak R S J, Ashburner J T, Penny W D and Zeki S 2003 Human Brain Function 2nd edn (New York: Academic)
Friston K J, Ashburner J, Kiebel S J, Nichols T E and Penny W D 2007 Statistical Parametric Mapping: The Analysis of Functional Brain Images (New York: Academic)
Fung G and Stoeckel J 2007 SVM feature selection for classification of SPECT images of Alzheimer’s disease using spatial information Knowl. Inf. Syst. 11 243–58
Gidskehaug L, Anderssen E and Alsberg B K 2008 Cross model validation and optimisation of bilinear regression models Chemometr. Intell. Lab. Syst. 93 1–10
Górriz J M, Lassl A, Ramírez J, Salas-Gonzalez D, Puntonet C G and Lang E 2009 Automatic selection of ROIs in functional imaging using Gaussian mixture models Neurosci. Lett. 460 108–11
Górriz J M, Segovia F, Ramírez J, Lassl A and Salas-Gonzalez D 2011 Automatic selection of ROIs in functional imaging using Gaussian mixture models Appl. Soft Comput. 11 2376–82
Han J and Fu Y 1995 Discovery of multiple-level association rules from large databases Proc. of the 21st Int. Conf. on Very Large Databases pp 420–31
Hellman R S, Tikofsky R S, Collier B D, Hoffmann R G, Palmer D W, Glatt S, Antuono P G, Isitman A T and Papke R A 1989 Alzheimer disease: quantitative analysis of I-123-iodoamphetamine SPECT brain imaging Radiology 172 183–8
Holman B L, Johnson K A, Gerada B, Carvalho P A and Satlin A 1992 The scintigraphic appearance of Alzheimer’s disease: a prospective study using technetium-99m-HMPAO SPECT J. Nucl. Med. 33 181–5
Illán I A, Górriz J M, Ramírez J, Salas-Gonzalez D, López M, Segovia F and Puntonet C G 2010a Projecting independent components of SPECT images for computer aided diagnosis of Alzheimer’s disease Pattern Recognit. Lett. 31 1342–7
Illán I A, Górriz J M, López M, Ramírez J, Salas-Gonzalez D, Segovia F, Chaves R and Puntonet C G 2010b Computer aided diagnosis of Alzheimer’s disease using component based SVM Appl. Soft Comput. 11 2376–82
Illán I A, Górriz J M, Ramírez J, Salas-Gonzalez D, López M, Puntonet C G and Segovia F 2009 Alzheimer’s diagnosis using eigenbrains and support vector machines IET Electron. Lett. 42 877–9
Ishii K, Kono A K, Sasaki H, Miyamoto N, Fukuda T, Sakamoto S and Mori E 2006 Fully automatic diagnostic system for early- and late-onset mild Alzheimer’s disease using FDG PET and 3D-SSP Eur. J. Nucl. Med. Mol. Imaging 33 575–83
Jobst K A, Barnetson L P and Shepstone B J 1998 Accurate prediction of histologically confirmed Alzheimer’s disease and the differential diagnosis of dementia: the use of NINCDS-ADRDA and DSM III-R criteria, SPECT, X-ray CT, and Apo E4 in medial temporal lobe dementias. Oxford Project to Investigate Memory and Aging Int. Psychogeriatr. 10 271–302
Johnson E, Brookmeyer R and Ziegler-Graham K 2007 Modeling the effect of Alzheimer’s disease on mortality Int. J. Biostat. 3 13
Johnson K A, Kijewski M F, Becker J A, Garada B, Satlin A and Holman B L 1993 Quantitative brain SPECT in Alzheimer’s disease and normal aging J. Nucl. Med. 34 2044–8
Karabatak M and Ince M C 2008 Expert Syst. Appl. 36 3465–9
Karabatak M and Cevdet Ince M 2009 A new feature selection method based on association rules for diagnosis of erythemato-squamous diseases Expert Syst. Appl. 36 12500–5
Liu B, Hsu W and Ma Y 1998 Integrating classification and association rule mining Proc. KDD-98 (New York, 27–31 August) (Menlo Park, CA: AAAI Press) pp 80–6
López M, Ramírez J, Górriz J M, Álvarez I, Salas-Gonzalez D, Segovia F and Chaves R 2009 SVM-based CAD system for early detection of the Alzheimer’s disease using Kernel PCA and LDA Neurosci. Lett. 464 233–8
Metz C 1978 Basic Prin. ROC Anal. 8 283–98
Minoshima S, Frey K A, Koeppe N L, Foster N L and Kuhl D E 1995 A diagnostic approach in Alzheimer’s disease using three dimensional stereotactic surface projections of fluorine-18-FDG PET J. Nucl. Med. 36 1238–48
Nearhos J, Rothman M and Viveros M 1996 Applying data mining techniques to a health insurance information system 22nd Int. Conf. on Very Large Databases pp 286–94
Newman J, von Cramon D Y and Lohmann G 2008 Model-based clustering of meta-analytic functional imaging data Hum. Brain Mapp. 29 177–92
Ng S et al 2007 Visual assessment versus quantitative assessment of 11C-PIB PET and 18F-FDG PET for detection of Alzheimer’s disease J. Nucl. Med. 48 547–52
Ordonez C, Santana C and de Braal L 2000 Discovering interesting association rules in medical data Proc. of ACM SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery pp 78–85
Petrella J R, Coleman R E and Doraiswamy P M 2003 Neuroimaging and early diagnosis of Alzheimer disease: a look to the future Radiology 226 315–36
Ramírez J, Górriz J M, Salas-Gonzalez D, Romero A, López M, Álvarez I and Gómez-Río M 2009 Computer-aided diagnosis of Alzheimer’s type dementia combining support vector machines and discriminant set of features Inf. Sci. at press
Ribeiro M X, Bugatti P H, Traina C, Marques P, Rosa N A and Traina A 2009 Supporting content-based image retrieval and computer-aided diagnosis systems with association rule-based techniques Data Knowl. Eng. 68 1370–82
Salas-Gonzalez D, Górriz J M, Ramírez J, Lassl A and Puntonet C G 2008 Improved Gauss–Newton optimization methods in affine registration of SPECT brain images IET Electron. Lett. 44 1291–2
Salas-Gonzalez D, Górriz J M, Ramírez J, López M, Álvarez I, Segovia F, Chaves R and Puntonet C G 2010 Computer aided diagnosis of Alzheimer’s disease using support vector machines and classification trees Phys. Med. Biol. 55 2807–17
Saxena P, Pavel D G, Quintana J C and Horwitz B 1998 Medical Image Computing and Computer-Assisted Intervention–MICCAI (Lecture Notes in Computer Science vol 1496) (Berlin: Springer) pp 623–30
Sheela L J and Shanthi V 2009 Int. Conf. on Signal Processing Systems: An Image Analysis and Classification Protocol for Characterization of Normal and Abnormal Memory Loss in Aging from Structural MRI pp 639–43
Haralick R M, Shanmugam K and Dinstein I 1973 Textural features for image classification IEEE Trans. Syst. Man Cybern. SMC 3 610–21
Silverman D H, Small G W and Chang C Y 1993 Positron emission tomography in evaluation of dementia: regional brain metabolism and long-term outcome J. Am. Med. Assoc. 286 2120–7
Srikant R, Vu Q and Agrawal R 1997 Mining association rules with item constraints 3rd Int. Conf. on Knowledge Discovery and Data Mining pp 67–73
Stoeckel J, Ayache N, Malandain G, Koulibaly P M, Ebmeier K P and Darcourt J 2004 Automatic classification of SPECT images of Alzheimer’s disease patients and control subjects Medical Image Computing and Computer-Assisted Intervention–MICCAI (Lecture Notes in Computer Science vol 3217) (Berlin: Springer) pp 654–62
Stoeckel J, Malandain G, Migneco O, Koulibaly P M, Robert P, Ayache N and Darcourt J 2001 Classification of SPECT images of normal subjects versus images of Alzheimer’s disease patients Medical Image Computing and Computer-Assisted Intervention–MICCAI (Lecture Notes in Computer Science vol 2208) (Berlin: Springer) pp 666–74
Stone M 1974 Cross-validatory choice and assessment of statistical predictions J. R. Stat. Soc. B 36 111–47
Toivonen H 1996 Sampling large databases for association rules 22nd Int. Conf. on Very Large Databases pp 134–45
Turkeltaub P E, Eden K J and Zeffiro T 2002 Meta-analysis of the functional neuroanatomy of single-word reading: method and validation Neuroimage 16 765–80
Umarani V and Punithavalli D M 2010 Sampling based association rules mining: a recent overview Int. J. Comput. Sci. Eng. 2 314–8
Vapnik V N 1998 Statistical Learning Theory (New York: Wiley)
Witten I H and Frank E 2005 Data Mining, Practical Machine Learning Tools and Techniques (Amsterdam: Elsevier)
Wu X, Zhang C and Zhang S 2004 Efficient mining of both positive and negative association rules ACM Trans. Inf. Syst. 22 381–405
Zaiane O R, Antonie M L and Coman A 2002 Mammography classification by an association rule-based classifier Int. Workshop on Multimedia Data Mining (ACM SIGKDD) pp 62–9
Zang T 2000 Association Rules (Lecture Notes in Computer Science vol 1805) (Berlin: Springer) pp 245–56
Zhang D, Wang Y, Zhou L, Yuan H, Shen D and the Alzheimer’s Disease Neuroimaging Initiative 2011 Multimodal classification of Alzheimer’s disease and mild cognitive impairment Neuroimage 55 856–67
Zubi Z S and Saad R A 2011 Using some data mining techniques for early diagnosis of Lung cancer. Recent researches in artificial intelligence, knowledge engineering and data bases 10th WSEAS Int. Conf. on Artificial Intelligence, Knowledge Engineering and Data Bases (AIKED ’11) pp 32–7