JOURNAL OF NEUROCHEMISTRY
| 2012 | 120 | 190–198
doi: 10.1111/j.1471-4159.2011.07547.x
*Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, China
†College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
‡Department of Nutrition and Food Hygiene, School of Public Health, Harbin Medical University, Harbin, China
§Department of Human Genetics, University of California at Los Angeles, Los Angeles, California, USA
Abstract
Alzheimer’s disease (AD) is a complex neurological disorder. The complex genetic architecture of AD makes genetic analysis difficult. Fortunately, a pathway-based method to study the existing genome-wide association studies datasets has been applied to AD. However, no shared Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway was reported. In this study, we performed multiple pathway analyses of the French AD genome-wide association studies dataset (discovery dataset, n = 7360, 2032 cases and 5328 controls) and the Pfizer dataset (validation dataset, n = 2220, 1034 cases and 1186 controls). First, we performed multiple pathway analyses by Hypergeometric test, improved gene set enrichment analysis (IGSEA) and Z-statistic test in KEGG. Using Hypergeometric test, we identified 54 and 25 significant pathways (p < 0.05) in discovery dataset and validation dataset, respectively. Using IGSEA method, we identified three significant pathways in both discovery and validation datasets, respectively. Using Z-statistic test, we identified 19 significant pathways in validation dataset. Among the significant pathways, cell adhesion molecules (CAM) pathway was identified to be the only consistent signal emerging across multiple analyses in KEGG. After permutation and multiple testing corrections, CAM pathway was significant with p = 2.40E-05 (Hypergeometric test) and p = 3.00E-03 (IGSEA) in discovery dataset. In validation dataset, CAM pathway was significant with p = 1.84E-06 (Hypergeometric test), p = 1.00E-02 (IGSEA) and p = 2.81E-03 (Z-statistic test). We replicated the association by multiple pathway analyses in Gene Ontology using Hypergeometric test (WebGestalt), modified Fisher’s exact test (DAVID) and Binomial test (PANTHER). Our findings provided further evidence on the association between CAM pathway and AD susceptibility, which would be helpful to study the genetic mechanisms of AD and may significantly assist in the development of therapeutic strategies.
Keywords: Alzheimer’s disease, cell adhesion molecules, complex neurological disorder, genome-wide association studies, multiple testing correction, pathway analysis.
J. Neurochem. (2012) 120, 190–198.
Alzheimer’s disease (AD) is a highly heritable and complex neurological disorder. AD is also the leading cause of dementia in the elderly (Harold et al. 2009; Hooli and Tanzi 2009). The disorder is characterized by extracellular β-amyloid plaque deposition, intraneuronal tau pathology, neuronal cell death, vascular dysfunction and inflammatory processes (Hochstrasser et al. 2010). Recently, researchers have turned to genome-wide association studies (GWAS) to investigate the genetics of AD by analyzing hundreds of thousands of polymorphisms. Until now, over two dozen novel AD candidate loci have been identified by 15 GWAS
Received July 25, 2011; revised manuscript received September 18, 2011; accepted October 7, 2011. Address correspondence and reprint requests to Guiyou Liu and Zugen Chen, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Xiqi Dao 32, Tianjin Airport Economic Area, Tianjin 300308, China. E-mails:
[email protected]; [email protected]
Abbreviations used: AD, Alzheimer’s disease; APP, amyloid precursor protein; CAM, cell adhesion molecules; ES, enrichment score; FDR, false discovery rate; GO, Gene Ontology; GSEA, gene set enrichment analysis; GWAS, genome-wide association studies; IGSEA, improved GSEA; KEGG, Kyoto Encyclopaedia of Genes and Genomes; NCAM, neural cell adhesion molecule; PVRL2, poliovirus receptor-related 2; SNP, single nucleotide polymorphism; VCAM-1, vascular cell adhesion molecule 1.
© 2011 The Authors. Journal of Neurochemistry © 2011 International Society for Neurochemistry, J. Neurochem. (2012) 120, 190–198
Cell adhesion molecules contribute to Alzheimer’s disease | 191
(Bertram and Tanzi 2010; Seshadri et al. 2010). Two large-scale GWAS conducted by Harold et al. (2009) and Lambert et al. (2009) have identified clusterin, complement component (3b/4b) receptor 1 (CR1) and phosphatidylinositol binding clathrin assembly protein as three novel putative susceptibility genes for AD. Another large-scale GWAS conducted by Seshadri et al. (2010) confirmed the susceptibility of clusterin and phosphatidylinositol binding clathrin assembly protein using a three-stage approach to GWAS involving more than 35 000 individuals. However, little overlap was observed between the results of different studies. AD is a polygenic disorder (Pedersen 2010). The complex genetic architecture of AD makes genetic analysis difficult. Fortunately, a pathway-based method to study the existing GWAS datasets has been applied to AD to investigate the biological mechanisms underlying AD susceptibility, and several important immune system related pathways have been identified (Hong et al. 2010; Jones et al. 2010; Lambert et al. 2010). Recently, there were three pathway analyses of AD using the two large-scale GWAS datasets, of which two studies reported significant AD Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathways (Hong et al. 2010; Jones et al. 2010; Lambert et al. 2010). Lambert et al. performed a pathway-based analysis of the French AD GWAS dataset using the GenGen software package (Wang et al. 2007; Lambert et al. 2009). They reported five significant pathways (p < 0.05): Alzheimer’s disease (hsa05010), Regulation of autophagy (hsa04140), Natural killer cell mediated cytotoxicity (hsa04650), Antigen processing and presentation (hsa04622) and retinoic acid-inducible gene-I (RIG-I)-like receptor signaling (hsa04612) (Lambert et al. 2010). Jones et al. (2010) performed an enrichment analysis of KEGG pathways using ALIGATOR. They identified six significantly enriched pathways shared by the French AD GWAS dataset and another GWAS dataset from Harold et al. 
(Harold et al. 2009; Lambert et al. 2009). The pathways included asthma (hsa05310), hematopoietic cell lineage (hsa04640), graft-versus-host disease (hsa05332), allograft rejection (hsa05330), autoimmune thyroid disease (hsa05320), type I diabetes mellitus (hsa04940). Although the French AD GWAS dataset was used by the two studies, no shared KEGG pathway was reported. Several recently published studies have clearly demonstrated the use and importance of pathway-based approaches, which complement standard single-marker analysis in extracting more biological information from existing GWAS datasets. Over recent years, dozens of different methods have been published for pathway-based association analysis. Some of these published algorithms are available as software implementations or web servers (Wang et al. 2010). Voight et al. performed multiple pathway analyses of type 2 diabetes using GRAIL (Raychaudhuri et al. 2009), PANTHER (Mi et al. 2005), Reactome (Vastrik et al. 2007) and MAGENTA
(Segre et al. 2010). They found that cell-cycle regulation was the only consistent signal across multiple pathway analyses (Voight et al. 2010). They reported the association signals between cell cycle regulation and type 2 diabetes. For AD, there may also be shared pathway between the results of different GWAS, although little or no overlap of AD candidate loci was observed. To test this hypothesis, we performed multiple pathway analyses of AD GWAS datasets. Previously, ALIGATOR (Holmans et al. 2009) and GenGen (Wang et al. 2007) have been used to investigate AD related pathways (Jones et al. 2010; Lambert et al. 2010). In our research, we used the other five pathway analysis methods to investigate the shared pathway between two different AD GWAS datasets. First, we investigated the shared pathway in KEGG database using Hypergeometric test, improved gene set enrichment analysis (IGSEA) and Z-statistic test. Then we replicated the shared pathway in Gene Ontology (GO) database using Hypergeometric test (WebGestalt) (Zhang et al. 2005), modified Fisher’s exact test (DAVID) (Huang da et al. 2009) and Binomial test (PANTHER) (Thomas et al. 2003). Interestingly, cell adhesion molecules (CAM) pathway was the consistent signal across different analyses. Our results showed the potential relevance of CAM pathway to AD pathogenesis, which would be helpful for future genetic studies in AD.
Materials and methods

The GWAS datasets
Two GWAS datasets were available for analysis. To detect the replication of AD risk pathways, we divided the two GWAS datasets into discovery dataset and validation dataset. The discovery dataset came from the first stage of the study conducted by Lambert et al. (2009). There were 2032 AD cases and 5328 controls of French ancestry. Cases were ascertained by neurologists from Bordeaux, Dijon, Lille, Montpellier, Paris and Rouen. Clinical diagnosis of AD was established according to the DSM-III-R and NINCDS-ADRDA criteria. Controls were individuals without symptoms of dementia from the French Three-City (3C) cohort, which is a population-based, prospective (4-year follow-up) study of the relationship between vascular factors and dementia. Samples were genotyped with Illumina Human610-Quad BeadChips. This study included 537,029 single nucleotide polymorphisms (SNPs), of which 511,978 SNPs passed quality control (SNPs with call rates of < 98%, with minor allele frequency < 1% or showing departure from Hardy-Weinberg equilibrium in the control population (p < 1E-6) were excluded). Logistic regression taking into account sex and age was used to test the association between each SNP and AD, and principal components analysis was used to adjust for possible population stratification. For more detailed information, please refer to Lambert et al. (2009). The validation dataset came from the first stage of the study conducted by Hu et al. (2011). This research included 1034 cases and 1186 controls mostly collected from Pfizer clinical trials. All subjects were diagnosed with probable or possible AD if they met
192 | G. Liu et al.
NINCDS and/or DSM-IV criteria and had mini-mental state examination scores below 25 at baseline. The control subjects included 234 subjects from Precision Med for case/control study, 883 subjects from A9010012, which is a method study to collect elderly subjects free of any neurological and psychiatric conditions, and 69 subjects from 999-GEN-0583-001, which is another method study to obtain DNA in a reference population of Caucasians defined as psychiatrically and neurologically normal. Controls had no neuropsychiatric diseases and their mini-mental state examination scores were above 27 at the time of enrollment. For AD susceptibility analysis, we removed any potential early onset AD cases (age of onset less than 65). All the controls were re-matched with the remaining cases according to gender, age (controls are older than the cases) and ethnicity (only Caucasians were selected in the analysis). SNPs were tested for association with AD using the chi-squared test. For more detailed information, please refer to Hu et al. (2011).

The pathway resources and mapping from SNP to gene
The first pathway resource came from the KEGG (Kanehisa and Goto 2000) database and all of the pathways were experimentally validated. The second pathway resource came from the GO database (Harris et al. 2004). All SNPs detected in previous genome assembly builds were converted to the current build (reference assembly, dbSNP build 132) in NCBI. The gene file (reference assembly, build 37.1), which contained the start and stop coordinates, was also obtained from NCBI. If a SNP is located within a gene, the SNP and gene are selected. We performed the mapping from SNP to gene with PLINK, a toolset for whole-genome association and population-based linkage analysis (Purcell et al. 2007).

Identifying AD risk pathway by Hypergeometric test in KEGG
Here, for a given pathway, we used the Hypergeometric exact test to detect an overrepresentation of the AD related genes among all the genes in the pathway.
The P-value of observing K AD related genes in the pathway can be calculated by

$$P = 1 - \sum_{i=0}^{K} \frac{\binom{S}{i}\binom{N-S}{m-i}}{\binom{N}{m}}$$

where N is the total number of genes that are of interest, S is the number of all AD-related genes, m is the number of genes in the pathway and K is the number of AD related genes in the pathway. Here, we performed the analysis using WebGestalt (Zhang et al. 2005).
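The hypergeometric P-value above can be illustrated with a minimal Python sketch. It is not the WebGestalt implementation; the function name `hypergeom_pvalue` and the flag `include_observed` are hypothetical names introduced here. By default the sketch follows the formula as printed (the sum runs from 0 to K). Many implementations instead sum to K − 1 so that the observed count K is included in the upper tail; the flag switches to that convention.

```python
from math import comb


def hypergeom_pvalue(N: int, S: int, m: int, K: int,
                     include_observed: bool = False) -> float:
    """Upper-tail hypergeometric P-value for a pathway.

    N: total number of genes of interest
    S: number of AD-related (significant) genes
    m: number of genes in the pathway
    K: number of AD-related genes in the pathway
    include_observed: assumption of this sketch, not from the paper.
        False -> sum i = 0..K   (formula as printed)
        True  -> sum i = 0..K-1 (tail includes the observed count K)
    """
    upper = K - 1 if include_observed else K
    total = comb(N, m)  # number of ways to draw the m pathway genes
    # comb(a, b) returns 0 when b > a, so impossible terms vanish.
    cdf = sum(comb(S, i) * comb(N - S, m - i) for i in range(upper + 1)) / total
    return 1.0 - cdf


# Toy numbers only (not taken from the study).
print(hypergeom_pvalue(N=10, S=4, m=3, K=1))
print(hypergeom_pvalue(N=10, S=4, m=3, K=1, include_observed=True))
```

Exact integer binomials are used so that the sketch stays correct for gene counts in the thousands; a statistics library would normally be preferred for production use.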
Identifying AD risk pathway by improved gene set enrichment analysis in KEGG
The i-GSEA4GWAS web server (Zhang et al. 2010) implements IGSEA, which is an application and extension of GSEA. GSEA runs the following three key procedures described by Wang et al. (2007). (i) The max statistic or −log(p-value) of closely spaced SNPs in a gene is used to represent the gene; then, the ranked gene list with corresponding representing values is utilized to calculate each gene set enrichment score (ES), a Kolmogorov–Smirnov like statistic with weight 1, which reflects the trend that genes of a gene set tend to be located at the top of the entire ranked genome-wide gene list.
$$ES(S) = \max_{1 \le i \le N} \left\{ \sum_{g_j \in S,\; j \le i} \frac{|r_j|^p}{N_R} - \sum_{g_j \notin S,\; j \le i} \frac{1}{N - N_H} \right\}$$

where S is the given gene set (pathway), N is the total number of genes included in a GWA study, N_H is the number of genes in S, $N_R = \sum_{g_j \in S} |r_j|^p$, r_j is the statistic value of gene j, i is the position in the gene list N, j is the position before i in the gene list N, and p is a parameter that gives higher weight to genes with extreme statistic values. (ii) The phenotype label permutation (to break the association between genotype and phenotype) and a straightforward normalization are performed to generate the distribution of the ES and to correct gene variation (different genes with different numbers of SNPs mapped will result in identification of gene sets containing genes with more SNPs mapped, instead of genes with functional correlation) and gene set variation (different gene sets contain different numbers of genes) by the following formula:

$$NES(S) = \frac{ES(S) - \mathrm{mean}[ES(S, \pi)]}{\mathrm{sd}[ES(S, \pi)]}$$

where NES(S) is the normalized enrichment score. For each permutation (π), we calculated ES(S) and denoted it as ES(S, π). (iii) Based on all the distributions of ESs generated by permutation, false discovery rate (FDR) is used for multiple testing corrections. In IGSEA, SNP label permutation instead of phenotype label permutation is implemented to analyze SNP p-values and to correct gene and gene set variation, and the ES is multiplied by k/K to obtain the significance proportion based enrichment score, where k is the proportion of significant genes of the gene set and K is the proportion of significant genes of the total genes in the GWAS (Zhang et al. 2010).

Identifying AD risk pathway by Z-statistic test in KEGG
GSA-SNP implements the Z-statistic method (Kim and Volsky 2005). Here, we used GSA-SNP for our analysis. The Z-statistic method has the following two key procedures described by Nam et al. (2010). (i) Each p-value of SNP was converted to −log(p). We chose the second best SNP in each gene as a default option to summarize the information of multiple SNPs instead of the best SNP (Nam et al. 2010). (ii) Each SNP is assigned to a gene. If a SNP is located within a gene, the SNP and gene are selected. We calculate the Z-statistic for each gene set (GS):

$$Z(GS) = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$$

where $\bar{X}$ is the average of gene scores [−log(kth best p)] in a gene set, μ_0 and σ are the mean and the standard deviation of all the gene scores, and n is the number of genes in the gene set. After the p-value of each gene set was computed, Benjamini-Hochberg multiple testing correction was applied. For more detailed information, please refer to the research conducted by Nam et al. (2010).

Pathway analysis by WebGestalt, DAVID and PANTHER in GO
Pathway analysis of AD in GO was performed by WebGestalt (Zhang et al. 2005), DAVID (Huang da et al. 2009) and PANTHER
(Thomas et al. 2003). Hypergeometric test was implemented in WebGestalt. DAVID provides a comprehensive set of functional annotation tools for investigators to understand the biological meaning behind large lists of genes (Huang da et al. 2009). In DAVID, a modified Fisher’s exact test is adopted to measure the significance of the gene-enrichment in annotation pathways. The PANTHER Classification System is a unique resource that classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function even in the absence of direct experimental evidence (Thomas et al. 2003). In PANTHER, the binomial test was used to compare a gene group to a reference list to statistically determine over- or under-representation of PANTHER classification categories. For more detailed algorithms, please refer to DAVID (Huang da et al. 2009) and PANTHER (Thomas et al. 2003).
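The gene-set statistics defined in the methods above (the enrichment score, its normalization, the IGSEA k/K scaling and the Z-statistic) can be sketched in a few lines of Python. This is a minimal illustration and not the code of i-GSEA4GWAS or GSA-SNP. All function names are hypothetical, the permuted enrichment scores are assumed to be supplied by the caller, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not say which estimator is used.

```python
import math
from statistics import mean, stdev


def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Weighted Kolmogorov-Smirnov-like ES over a ranked gene list.

    ranked_genes: gene ids sorted from most to least significant
    scores: dict mapping gene id -> statistic r_j
    gene_set: set of gene ids in the pathway S
    p: weight exponent (p = 1 corresponds to "weight 1" in the text)
    """
    hits = [g in gene_set for g in ranked_genes]
    n, n_h = len(ranked_genes), sum(hits)
    n_r = sum(abs(scores[g]) ** p for g, hit in zip(ranked_genes, hits) if hit)
    if n_h == 0 or n_h == n or n_r == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running, best = 0.0, float("-inf")
    for g, hit in zip(ranked_genes, hits):
        if hit:
            running += abs(scores[g]) ** p / n_r   # step up for genes in S
        else:
            running -= 1.0 / (n - n_h)             # step down otherwise
        best = max(best, running)
    return best


def normalized_enrichment_score(es_observed, es_permuted):
    """NES = (ES - mean of permuted ES) / sd of permuted ES."""
    return (es_observed - mean(es_permuted)) / stdev(es_permuted)


def significance_proportion_es(es, k, K):
    """IGSEA scaling: ES multiplied by k/K (proportions of significant genes)."""
    return es * k / K


def z_statistic(set_scores, all_scores):
    """Z = (mean of set scores - mean of all scores) / (sd / sqrt(n))."""
    n = len(set_scores)
    return (mean(set_scores) - mean(all_scores)) / (stdev(all_scores) / math.sqrt(n))


# Toy data only: four genes ranked by decreasing score.
ranked = ["a", "b", "c", "d"]
scores = {"a": 4.0, "b": 3.0, "c": 2.0, "d": 1.0}
print(enrichment_score(ranked, scores, {"a", "b"}))
print(z_statistic([3.0, 4.0], [1.0, 2.0, 3.0, 4.0]))
```

In the toy example the pathway genes sit at the top of the ranked list, so the running sum reaches its maximum as soon as the last pathway gene is passed.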
Results

Pathway analysis results by Hypergeometric test in KEGG
In the discovery dataset, we obtained 6460 significant SNPs with p < 0.01 and 3157 genes with at least one significant SNP. In the validation dataset, we obtained 5427 significant SNPs with p < 0.01 and 1157 genes with at least one significant SNP. After Benjamini–Hochberg multiple testing correction, 54 and 45 significant pathways were identified in discovery and validation datasets respectively. CAM pathway was significant in discovery and validation datasets with p = 2.40E-05 and p = 1.84E-06, respectively. In CAM pathway, 25 of 134 genes had significantly associated SNPs in discovery dataset and 17 of 134 genes had significantly associated SNPs in validation dataset. Here, we only list the results of CAM pathway (Table 1). Complete pathway details are provided in Table S1 and S2.

Pathway analysis results by IGSEA in KEGG
All the available SNPs as well as their P-values were used to evaluate the association between CAM pathway and AD in both datasets by IGSEA. After FDR testing corrections, two and three pathways were significant in discovery and validation datasets respectively. Among the five significant pathways, CAM pathway was significant in the discovery dataset with p = 3.00E-03 (FDR = 3.00E-02) and in the validation dataset with p = 1.00E-02 (FDR = 1.30E-02) (Table 1). Complete pathway details are provided in supplementary tables (Table S3).

Pathway analysis results by Z-statistic test in KEGG
Nineteen significant pathways (p < 0.05) were identified in validation dataset after FDR testing corrections. CAM pathway was still significant with p = 2.81E-03 (Table 1). However, CAM pathway was not significant in discovery dataset. Complete pathway details are provided in supplementary tables (Table S4).

Pathway analysis results by WebGestalt, DAVID and PANTHER in GO
A total of 3157 and 1157 genes were used for pathway analysis in discovery and validation datasets respectively. CAM pathway was consistently associated with AD in five analyses after multiple testing corrections (Table 1).
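The Benjamini–Hochberg correction referred to in the analyses above can be sketched as follows. This is a generic illustration of the standard step-up procedure, not the code used by WebGestalt or GSA-SNP; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p), and a
    running minimum taken from the largest rank downwards keeps the
    adjusted values monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values for four hypothetical pathways (not from the study).
raw = [0.01, 0.04, 0.03, 0.5]
for p_raw, p_adj in zip(raw, benjamini_hochberg(raw)):
    print(f"raw p = {p_raw:.3f}  adjusted p = {p_adj:.4f}")
```

A pathway would then be called significant when its adjusted value falls below the chosen threshold, for example 0.05.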
Overlap with previous research
AD related KEGG pathways reported by Lambert et al. (2010) and Jones et al. (2010) were also significantly associated with AD in our research. We replicated seven pathways reported by previous studies (Table 2).
Discussion
In this research, we performed multiple pathway analyses of AD using two GWAS datasets by Hypergeometric test, IGSEA and Z-statistic test in KEGG. We identified that CAM pathway was the only consistent signal emerging across multiple analyses in KEGG. We also replicated the association between CAM pathway and AD by multiple pathway analyses in GO using WebGestalt, DAVID and
Table 1 Shared pathway by pathway analysis in KEGG and GO

Pathway ID    Pathway name                      Dataset      Method                p-Value
Hsa04514      Cell adhesion molecules pathway   Discovery    Hypergeometric test   2.40E-05
Hsa04514      Cell adhesion molecules pathway   Discovery    IGSEA                 3.00E-03
Hsa04514      Cell adhesion molecules pathway   Discovery    Z-statistic test      3.41E-01
Hsa04514      Cell adhesion molecules pathway   Validation   Hypergeometric test   1.84E-06
Hsa04514      Cell adhesion molecules pathway   Validation   IGSEA                 1.00E-02
Hsa04514      Cell adhesion molecules pathway   Validation   Z-statistic test      2.81E-03
GO:0007155    Cell adhesion molecules pathway   Discovery    WebGestalt            4.06E-02
GO:0007155    Cell adhesion molecules pathway   Validation   WebGestalt            1.58E-13
GO:0007155    Cell adhesion molecules pathway   Discovery    PANTHER               2.61E-13
GO:0007155    Cell adhesion molecules pathway   Validation   PANTHER               2.48E-16
GO:0007155    Cell adhesion molecules pathway   Discovery    DAVID                 1.90E-01
GO:0007155    Cell adhesion molecules pathway   Validation   DAVID                 1.90E-12
Table 2 The overlap with previous research in KEGG

Pathway ID   Pathway name                                Dataset      Method                p-Value
Hsa05010     Alzheimer’s disease                         Discovery    Hypergeometric test   3.00E-04
Hsa05010     Alzheimer’s disease                         Discovery    IGSEA                 > 5.00E-02
Hsa05010     Alzheimer’s disease                         Discovery    Z-statistic test      9.98E-01
Hsa05010     Alzheimer’s disease                         Validation   Hypergeometric test   4.44E-02
Hsa05010     Alzheimer’s disease                         Validation   IGSEA                 < 1.00E-03
Hsa05010     Alzheimer’s disease                         Validation   Z-statistic test      6.73E-01
Hsa04140     Regulation of autophagy                     Discovery    Hypergeometric test   6.22E-05
Hsa04140     Regulation of autophagy                     Discovery    IGSEA                 > 5.00E-02
Hsa04140     Regulation of autophagy                     Discovery    Z-statistic test      8.94E-01
Hsa04650     Natural killer cell mediated cytotoxicity   Discovery    Hypergeometric test   8.56E-05
Hsa04650     Natural killer cell mediated cytotoxicity   Discovery    IGSEA                 > 5.00E-02
Hsa04650     Natural killer cell mediated cytotoxicity   Discovery    Z-statistic test      9.34E-01
Hsa04622     Antigen processing and presentation         Discovery    Hypergeometric test   3.50E-07
Hsa04622     Antigen processing and presentation         Discovery    IGSEA                 > 5.00E-02
Hsa04622     Antigen processing and presentation         Discovery    Z-statistic test      8.17E-01
Hsa04612     RIG-I-like receptor signaling               Discovery    Hypergeometric test   7.00E-04
Hsa04612     RIG-I-like receptor signaling               Discovery    IGSEA                 > 5.00E-02
Hsa04612     RIG-I-like receptor signaling               Discovery    Z-statistic test      7.52E-01
Hsa04640     Hematopoietic cell lineage                  Discovery    Hypergeometric test   7.00E-04
Hsa04640     Hematopoietic cell lineage                  Discovery    IGSEA                 < 1.00E-03
Hsa04640     Hematopoietic cell lineage                  Discovery    Z-statistic test      9.98E-01
Hsa05320     Autoimmune thyroid disease                  Discovery    Hypergeometric test   1.31E-07
Hsa05320     Autoimmune thyroid disease                  Discovery    IGSEA                 > 5.00E-02
Hsa05320     Autoimmune thyroid disease                  Discovery    Z-statistic test      7.89E-01

RIG-I, retinoic acid-inducible gene-I
PANTHER. Our findings provided further evidence on the association between CAM pathway and AD susceptibility. We believe that our results will be helpful to study the genetic mechanisms of AD and may significantly assist in the development of therapeutic strategies to either prevent or at least reduce disability from AD. Recent research shows that some monocytic CAM, such as monocytic intercellular adhesion molecule-3 and P-selectin (related to the CD14 antigen), were significantly reduced in AD patients compared to healthy subjects (Hochstrasser et al. 2010). Another recent study demonstrated that Neurexin 3 beta (NRXN3β), a cell adhesion molecule, can be processed by α- and γ-secretases, leading to the formation of two final products, both being potentially implicated in the regulation of the activity and plasticity of synapses and playing an important role in establishing and maintaining proper synaptic function (Bot et al. 2011). They further reported that this processing was altered by several presenilin 1 mutations in the catalytic subunit of the γ-secretase that caused early onset familial AD (Bot et al. 2011). Cell adhesion molecules are essential mediators of both immune and inflammatory responses (Bullard 2002; Golias et al. 2007). The induction of cell adhesion molecules is an important step in the inflammatory process (Laurila et al. 2009; Di Paola et al. 2011). Evidence indicates that cell adhesion molecules play an integral role in immune cell
interaction with peripheral neurons, and may have a determining influence on the overall antinociceptive outcome of immune cell-derived peptides in inflamed tissue (Hua et al. 2006). The CAM pathway functions in neuronal cell adhesion, which is critical for synaptic formation and normal cell signaling (O’Dushlaine et al. 2010). Synaptic loss is one of the strongest correlates to the cognitive impairment in patients with AD (Crews and Masliah 2010). Until now, a large number of interacting CAM have been classically shown to regulate transmigration of monocytic cells to the brain, at least upon acute brain injury (Malm et al. 2010). These adhesion molecules include P-selectin and very late antigen-4 during tethering/rolling along the blood-brain barrier (BBB), and platelet–endothelial cell adhesion molecule-1 during early transmigration (Malm et al. 2010). Notably, intercellular adhesion molecule-3 is expressed in leukocytes and its levels are decreased in monocytes of patients with AD (Malm et al. 2010). The same change has been observed in P-selectin levels (Hochstrasser et al. 2010). Compared with controls, Alzheimer’s brains showed increased expression of polysialylated nerve cell adhesion molecule (Jin et al. 2004). Amyloid precursor protein (APP) plays a key role in AD, and AD is characterized by extracellular amyloid beta-peptide (Aβ), which is derived from APP upon cleavage by β- and γ-secretases (Osterfield et al. 2008). Mutations in
APP have been linked to familial AD (Osterfield et al. 2008). APP is also a potent cell adhesion molecule which binds to both heparin and other extracellular matrix molecules (Chen et al. 2002). The cell adhesion molecule transient axonal glycoprotein has been identified as the ligand that induces APP processing and signaling, resulting in the inhibition of neurogenesis (Mattson and van Praag 2008). These findings suggest the potential involvement of transient axonal glycoprotein in the pathogenesis of AD (Mattson and van Praag 2008). There is also substantial evidence to support the involvement of CAM in other cognitive and neuropsychiatric disorders. Recently, O’Dushlaine et al. used a SNP ratio test method on schizophrenia datasets from the International Schizophrenia Consortium and Genetic Association Information Network and on a bipolar disorder dataset from WTCCC. They found that CAM pathway was significantly associated with both schizophrenia and bipolar disorder (O’Dushlaine et al. 2010). Recent autism studies have indicated the involvement of CAM in autism spectrum disorder (Ye et al. 2010). A recent autism study reported the involvement of neuronal CAM in the pathogenesis of autism spectrum disorders (Wang et al. 2009). Another autism study reported that autism spectrum disorder was related to endoplasmic reticulum stress induced by mutations in the synaptic cell adhesion molecule, CADM1 (Fujita et al. 2010). In our research, we selected five pathway analysis methods, including Hypergeometric (WebGestalt), Gene Set Enrichment Analysis (GSEA, i-GSEA4GWAS), Z-statistic (GSA-SNP), modified Fisher’s exact (DAVID), and Binomial tests (PANTHER). All five methods have been used to detect significant enrichments of KEGG and GO categories in previous GWAS. Hypergeometric test has been used to detect significant pathways in schizophrenia (Jia et al. 2010), type 2 diabetes (Elbers et al. 2009) and multiple sclerosis (Briggs et al. 2011).
GSEA algorithm has been applied to schizophrenia (Jia et al. 2010), type 2 diabetes (Perry et al. 2009) and Alzheimer’s disease GWAS datasets (Lambert et al. 2010). The Z-statistic method has been applied to an adult human height GWAS dataset (Nam et al. 2010). Modified Fisher’s exact test (DAVID) has been used by Elbers et al. in the pathway analysis of a type 2 diabetes GWAS dataset (Elbers et al. 2009). Meanwhile, binomial test (PANTHER) has been widely used in various pathway analyses of GWAS datasets, such as type 2 diabetes (Voight et al. 2010), age at menarche (Elks et al. 2010), venous thrombosis (Morange et al. 2010), body mass index (Speliotes et al. 2010) and aging (Walter et al. 2011). Previously, ALIGATOR (Holmans et al. 2009) and GenGen (Wang et al. 2007) have been used to investigate AD related pathways (Jones et al. 2010; Lambert et al. 2010). In ALIGATOR, a modified Fisher’s exact test is adopted to measure the significance of the gene-enrichment in annotation pathways. In GenGen, a modified GSEA
algorithm was implemented (Wang et al. 2007). There are three main differences between our study and those of Lambert et al. (2010) and Jones et al. (2010). First, GenGen chooses the best SNP in each gene to represent the gene level signal and ranks all genes in order of a gene-wide association statistic. ALIGATOR converts a list of significant SNPs into a list of significant genes, and tests this list for enrichment within functional categories. In our research, all pathway analysis methods took the same approach as ALIGATOR. A list of significant SNPs (p < 0.01) was used to represent a list of significant genes, and this list was tested for enrichment within functional categories. Second, we used different genome assembly builds from the previous two studies. Lambert et al. used gene information (reference assembly, build 36.3) and SNP information (dbSNP, build 130). Jones et al. also used gene information (reference assembly, build 36.3). In our research, we used gene information (reference assembly, build 37.1) and SNP information (dbSNP, build 132). Third, we used a different mapping from SNP to gene. In the previous two studies, SNPs that mapped to within 20 kb of a gene were assigned to that gene. In our research, if a SNP is located within a gene (0 kb), the SNP and gene are selected. Both studies performed by Lambert et al. and Jones et al. used the same pathway analysis methods as we did (modified Fisher’s exact test in ALIGATOR and modified GSEA algorithm in GenGen). However, neither study detected a significant association between CAM and AD. One explanation is that pathway-based methods may be more successful for some complex traits. However, these methods may have limitations when applied in GWAS data analysis. Several factors could impact the results substantially and make the analysis results unstable, such as the definition of gene boundaries and the assignment of a p value to a gene (Jia et al. 2011). Jia et al. 
(2011) suggested that multiple pathway analysis methods should be used to evaluate the reliability of the results, and multiple datasets also should be used to replicate the findings. Voight et al. (2010) performed multiple pathway analyses of type 2 diabetes using GRAIL, PANTHER, Reactome and MAGENTA. They identified the association signals between cell cycle regulation and type 2 diabetes. In our research, we used the other five pathway analysis methods to investigate the shared pathway between two different AD GWAS datasets. At the gene level, we found eight shared genes in CAM pathway between discovery and validation datasets. Among the eight genes, five genes have been reported to be involved in AD. For CDH4 (Gene ID: 1002), a previous GWAS of endophenotypes with relevance to AD from the Framingham group found significant association between CDH4 (rs1970546) and total cerebral brain volume (Seshadri et al. 2007; Waring and Rosenberg 2008). Using a quantitative-trait association approach, Stein et al. compared 3D profiles of temporal volume in MRI brain
scans of AD patients, mildly impaired and healthy elderly subjects of European ancestry to genotyping data analyzed using the Human610-Quad microarrays. The results indicated KIAA0743 (Gene ID: 9369) with nominal significance of p < 1.00E-05 (Stein et al. 2010). For poliovirus receptor-related 2 (PVRL2) (Gene ID: 5819), previous AD GWAS indicated SNPs in PVRL2 with genome-wide significance, rs6859-A (p = 1.00E-07) (Naj et al. 2010) and rs6859-A (p = 6.00E-14) (Abraham et al. 2008). A recent genetic association study of the region in and around the apolipoprotein E gene (APOE) in late-onset Alzheimer disease in Japanese subjects also indicated the involvement of PVRL2 in AD (Takei et al. 2009). For vascular cell adhesion molecule 1 (VCAM-1) (Gene ID: 7412), the plasma concentration of VCAM-1 is increased in AD (Zuliani et al. 2008; Ewers et al. 2010; Grammas 2011). The adhesion of circulating monocytes to cerebrovascular endothelium has the potential to contribute to the pathogenesis of AD. VCAM-1 plays an essential role in Aβ aggregate-stimulated endothelial-monocyte adhesion (Moss et al. 2010). For neural cell adhesion molecule (NCAM) (Gene ID: 4684), it has been found that disturbance of NCAM expression is involved in the pathogenesis of AD, and that the level of soluble NCAM in the blood plasma of patients may be used for differential diagnosis of AD (Chekhonin et al. 2008). Previous research found a strong tendency towards an increase of the soluble fragments of NCAM in the cerebrospinal fluid (CSF) of Alzheimer patients compared to the normal control group. NCAM concentrations were positively correlated with age, and both age and neurodegeneration influenced NCAM concentrations (Strekalova et al. 2006). Another study showed that the procognitive actions of H3 antagonism combined with increased NCAM polysialylation expression may exert a disease-modifying action in conditions harboring fundamental deficits in NCAM-mediated neuroplasticity, such as schizophrenia and Alzheimer’s disease (Foley et al.
2009). Evidence also suggests a potential involvement of NCAM-expressing neurons in the cognitive deficits in AD (Aisa et al. 2010).

This analysis also has limitations. Current approaches for pathway analysis are still at an early stage of development, and analysis results are often prone to sources of bias, including gene set size, gene length, linkage disequilibrium patterns and the presence of overlapping genes (Jia et al. 2011; Wang et al. 2011). All pathway analysis methods should be adjusted using permutation. In our research, permutation was carried out by IGSEA. Multiple testing corrections (WebGestalt, GSA-SNP) are typically not enough to adjust for all these biases. Here, we used two GWAS datasets. The study power would benefit from examining all available datasets. If possible, we will replicate our findings by applying multiple pathway analysis methods to other GWAS datasets in future
(Harold et al. 2009; Seshadri et al. 2010). Further replication studies are required to investigate the involvement of CAM in AD.
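The gene-assignment and enrichment steps discussed in this section can be illustrated with a minimal Python sketch. It is not the code used in this study, nor the implementation of ALIGATOR, WebGestalt or any other tool named here. The function names, the toy SNP and gene coordinates and the toy p values are hypothetical. The sketch only shows how the choice of mapping window (0 kb versus 20 kb) changes the list of significant genes at the p < 0.01 threshold, and how a one-sided hypergeometric test then scores the overlap between that list and a pathway.

```python
from math import comb


def map_snps_to_genes(snps, genes, window_kb=0):
    """Map each SNP to the genes whose interval, padded by window_kb, contains it.

    snps:  dict of snp_id -> (chromosome, position)
    genes: dict of gene   -> (chromosome, start, end)
    """
    pad = window_kb * 1000
    mapping = {}
    for snp_id, (chrom, pos) in snps.items():
        hits = sorted(
            gene
            for gene, (g_chrom, start, end) in genes.items()
            if g_chrom == chrom and start - pad <= pos <= end + pad
        )
        if hits:
            mapping[snp_id] = hits
    return mapping


def significant_genes(snp_pvalues, mapping, threshold=0.01):
    """Return genes carrying at least one SNP with p below the threshold."""
    genes = set()
    for snp_id, p in snp_pvalues.items():
        if p < threshold:
            genes.update(mapping.get(snp_id, []))
    return genes


def hypergeom_enrichment_p(n_background, n_pathway, n_significant, n_overlap):
    """One-sided hypergeometric p value: P(overlap >= n_overlap)."""
    upper = min(n_pathway, n_significant)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_significant - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_background, n_significant)


# Hypothetical toy data (not real coordinates or p values).
TOY_GENES = {"GENE_A": ("1", 1000, 5000), "GENE_B": ("1", 30000, 40000)}
TOY_SNPS = {"rs_a": ("1", 2000), "rs_b": ("1", 20000), "rs_c": ("2", 2000)}
TOY_PVALUES = {"rs_a": 0.5, "rs_b": 0.001, "rs_c": 0.0001}

if __name__ == "__main__":
    for window in (0, 20):
        mapping = map_snps_to_genes(TOY_SNPS, TOY_GENES, window_kb=window)
        print(window, sorted(significant_genes(TOY_PVALUES, mapping)))
```

With the toy data, the 0 kb window leaves the significant intergenic SNP unassigned, so no gene is called significant, whereas the 20 kb window assigns it to both neighbouring genes. This is one way in which the definition of gene boundaries can change the input to the enrichment test. The hypergeometric p value on its own does not correct for gene length, linkage disequilibrium or overlapping genes.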
Acknowledgements
This research is supported by grants from the Tianjin Science and Technology Support Program (10ZCZDSY06400, 10ZCKFSY05500) and the One Hundred Person Project of the Chinese Academy of Sciences (KSCX2-YW-BR-3). The authors report no biomedical financial interests or potential conflicts of interest. We thank Lambert et al. and Xiaolan Hu et al. for the GWAS data of AD. We appreciate the useful comments made by anonymous reviewers.
Supporting information
Additional supporting information may be found in the online version of this article:
Table S1. The pathway analysis of discovery dataset by hypergeometric test.
Table S2. The pathway analysis of validation dataset by hypergeometric test.
Table S3. The pathway analysis of discovery dataset by IGSEA and the pathway analysis of validation dataset by IGSEA.
Table S4. The pathway analysis of validation dataset by Z-statistic test.
As a service to our authors and readers, this journal provides supporting information supplied by the authors. Such materials are peer-reviewed and may be re-organized for online delivery, but are not copy-edited or typeset. Technical support issues arising from supporting information (other than missing files) should be addressed to the authors.
References
Abraham R., Moskvina V., Sims R. et al. (2008) A genome-wide association study for late-onset Alzheimer’s disease using DNA pooling. BMC Med. Genomics 1, 44.
Aisa B., Gil-Bea F. J., Solas M., Garcia-Alloza M., Chen C. P., Lai M. K., Francis P. T. and Ramirez M. J. (2010) Altered NCAM expression associated with the cholinergic system in Alzheimer’s disease. J. Alzheimers Dis. 20, 659–668.
Bertram L. and Tanzi R. E. (2010) Alzheimer disease: New light on an old CLU. Nat. Rev. Neurol. 6, 11–13.
Bot N., Schweizer C., Ben Halima S. and Fraering P. C. (2011) Processing of the synaptic cell adhesion molecule neurexin-3beta by Alzheimer disease alpha- and gamma-secretases. J. Biol. Chem. 286, 2762–2773.
Briggs F. B., Shao X., Goldstein B. A., Oksenberg J. R., Barcellos L. F. and De Jager P. L. (2011) Genome-wide association study of severity in multiple sclerosis. Genes Immun. [Epub ahead of print].
Bullard D. C. (2002) Adhesion molecules in inflammatory diseases: insights from knockout mice. Immunol. Res. 26, 27–33.
Chekhonin V. P., Shepeleva I. I. and Gurina O. I. (2008) Disturbances in the expression of neuronal cell adhesion proteins NCAM. Clin. Aspects Neurochem. J. 2, 239–251.
Chen Q., Kimura H. and Schubert D. (2002) A novel mechanism for the regulation of amyloid precursor protein metabolism. J. Cell Biol. 158, 79–89.
Crews L. and Masliah E. (2010) Molecular mechanisms of neurodegeneration in Alzheimer’s disease. Hum. Mol. Genet. 19, R12–R20.
Di Paola R., Talero E., Galuppo M., Mazzon E., Bramanti P., Motilva V. and Cuzzocrea S. (2011) Adrenomedullin in inflammatory process associated with experimental pulmonary fibrosis. Respir. Res. 12, 41.
Elbers C. C., van Eijk K. R., Franke L., Mulder F., van der Schouw Y. T., Wijmenga C. and Onland-Moret N. C. (2009) Using genomewide pathway analysis to unravel the etiology of complex diseases. Genet. Epidemiol. 33, 419–431.
Elks C. E., Perry J. R., Sulem P. et al. (2010) Thirty new loci for age at menarche identified by a meta-analysis of genome-wide association studies. Nat. Genet. 42, 1077–1085.
Ewers M., Mielke M. M. and Hampel H. (2010) Blood-based biomarkers of microvascular pathology in Alzheimer’s disease. Exp. Gerontol. 45, 75–79.
Foley A. G., Prendergast A., Barry C., Scully D., Upton N., Medhurst A. D. and Regan C. M. (2009) H3 receptor antagonism enhances NCAM PSA-mediated plasticity and improves memory consolidation in odor discrimination and delayed match-to-position paradigms. Neuropsychopharmacology 34, 2585–2600.
Fujita E., Dai H., Tanabe Y., Zhiling Y., Yamagata T., Miyakawa T., Tanokura M., Momoi M. Y. and Momoi T. (2010) Autism spectrum disorder is related to endoplasmic reticulum stress induced by mutations in the synaptic cell adhesion molecule, CADM1. Cell Death Dis. 1, e47.
Golias C., Tsoutsi E., Matziridis A., Makridis P., Batistatou A. and Charalabopoulos K. (2007) Review. Leukocyte and endothelial cell adhesion molecules in inflammation focusing on inflammatory heart disease. In Vivo 21, 757–769.
Grammas P. (2011) Neurovascular dysfunction, inflammation and endothelial activation: implications for the pathogenesis of Alzheimer’s disease. J. Neuroinflammation 8, 26.
Harold D., Abraham R., Hollingworth P. et al. (2009) Genome-wide association study identifies variants at CLU and PICALM associated with Alzheimer’s disease. Nat. Genet. 41, 1088–1093.
Harris M. A., Clark J., Ireland A. et al. (2004) The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res. 32, D258–D261.
Hochstrasser T., Weiss E., Marksteiner J. and Humpel C. (2010) Soluble cell adhesion molecules in monocytes of Alzheimer’s disease and mild cognitive impairment. Exp. Gerontol. 45, 70–74.
Holmans P., Green E. K., Pahwa J. S., Ferreira M. A., Purcell S. M., Sklar P., Owen M. J., O’Donovan M. C. and Craddock N. (2009) Gene ontology analysis of GWA study data sets provides insights into the biology of bipolar disorder. Am. J. Hum. Genet. 85, 13–24.
Hong M. G., Alexeyenko A., Lambert J. C., Amouyel P. and Prince J. A. (2010) Genome-wide pathway analysis implicates intracellular transmembrane protein transport in Alzheimer disease. J. Hum. Genet. 55, 707–709.
Hooli B. V. and Tanzi R. E. (2009) A current view of Alzheimer’s disease. F1000 Biol. Rep. 1, 54.
Hu X., Pickering E., Liu Y. C. et al. (2011) Meta-analysis for genomewide association study identifies multiple variants at the BIN1 locus associated with late-onset Alzheimer’s disease. PLoS ONE 6, e16616.
Hua S., Hermanussen S., Tang L., Monteith G. R. and Cabot P. J. (2006) The neural cell adhesion molecule antibody blocks cold water swim stress-induced analgesia and cell adhesion between lymphocytes and cultured dorsal root ganglion neurons. Anesth. Analg. 103, 1558–1564.
Huang da W., Sherman B. T. and Lempicki R. A. (2009) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57.
Jia P., Wang L., Meltzer H. Y. and Zhao Z. (2010) Common variants conferring risk of schizophrenia: a pathway analysis of GWAS data. Schizophr. Res. 122, 38–42.
Jia P., Wang L., Meltzer H. Y. and Zhao Z. (2011) Pathway-based analysis of GWAS datasets: effective but caution required. Int. J. Neuropsychopharmacol. 14, 567–572.
Jin K., Peel A. L., Mao X. O., Xie L., Cottrell B. A., Henshall D. C. and Greenberg D. A. (2004) Increased hippocampal neurogenesis in Alzheimer’s disease. Proc. Natl. Acad. Sci. U S A 101, 343–347.
Jones L., Holmans P. A., Hamshere M. L. et al. (2010) Genetic evidence implicates the immune system and cholesterol metabolism in the aetiology of Alzheimer’s disease. PLoS ONE 5, e13950.
Kanehisa M. and Goto S. (2000) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 28, 27–30.
Kim S. Y. and Volsky D. J. (2005) PAGE: parametric analysis of gene set enrichment. BMC Bioinformatics 6, 144.
Lambert J. C., Heath S., Even G. et al. (2009) Genome-wide association study identifies variants at CLU and CR1 associated with Alzheimer’s disease. Nat. Genet. 41, 1094–1099.
Lambert J. C., Grenier-Boley B., Chouraki V. et al. (2010) Implication of the immune system in Alzheimer’s disease: evidence from genome-wide pathway analysis. J. Alzheimers Dis. 20, 1107–1118.
Laurila J. P., Laatikainen L. E., Castellone M. D. and Laukkanen M. O. (2009) SOD3 reduces inflammatory cell migration by regulating adhesion molecule and cytokine expression. PLoS ONE 4, e5786.
Malm T., Koistinaho M., Muona A., Magga J. and Koistinaho J. (2010) The role and therapeutic potential of monocytic cells in Alzheimer’s disease. Glia 58, 889–900.
Mattson M. P. and van Praag H. (2008) TAGing APP constrains neurogenesis. Nat. Cell Biol. 10, 249–250.
Mi H., Lazareva-Ulitsky B., Loo R. et al. (2005) The PANTHER database of protein families, subfamilies, functions and pathways. Nucleic Acids Res. 33, D284–D288.
Morange P. E., Bezemer I., Saut N. et al. (2010) A follow-up study of a genome-wide association scan identifies a susceptibility locus for venous thrombosis on chromosome 6p24.1. Am. J. Hum. Genet. 86, 592–595.
Moss M. A., Gonzalez-Velasquez F. J., Reed J. W., Fuseler J. W., Matherly E. E., Kotarek J. A. and Soto-Ortega D. D. (2010) Soluble amyloid-beta protein aggregates induce nuclear factor-kappa b mediated upregulation of adhesion molecule expression to stimulate brain endothelium for monocyte adhesion. J. Adhes. Sci. Technol. 24, 2105–2126.
Naj A. C., Beecham G. W., Martin E. R. et al. (2010) Dementia revealed: novel chromosome 6 locus for late-onset Alzheimer disease provides genetic evidence for folate-pathway abnormalities. PLoS Genet. 6, e1001130.
Nam D., Kim J., Kim S. Y. and Kim S. (2010) GSA-SNP: a general approach for gene set analysis of polymorphisms. Nucleic Acids Res. 38, W749–754.
O’Dushlaine C., Kenny E., Heron E., Donohoe G., Gill M., Morris D. and Corvin A. (2010) Molecular pathways involved in neuronal cell adhesion and membrane scaffolding contribute to schizophrenia and bipolar disorder susceptibility. Mol. Psychiatry 16, 286–292.
Osterfield M., Egelund R., Young L. M. and Flanagan J. G. (2008) Interaction of amyloid precursor protein with contactins and NgCAM in the retinotectal system. Development 135, 1189–1199.
Pedersen N. L. (2010) Reaching the limits of genome-wide significance in Alzheimer disease: back to the environment. JAMA 303, 1864–1865.
Perry J. R., McCarthy M. I., Hattersley A. T., Zeggini E., Weedon M. N. and Frayling T. M. (2009) Interrogating type 2 diabetes genomewide association data using a biological pathway-based approach. Diabetes 58, 1463–1467.
Purcell S., Neale B., Todd-Brown K. et al. (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575.
Raychaudhuri S., Plenge R. M., Rossin E. J. et al. (2009) Identifying relationships among genomic disease regions: predicting genes at pathogenic SNP associations and rare deletions. PLoS Genet. 5, e1000534.
Segre A. V., Groop L., Mootha V. K., Daly M. J. and Altshuler D. (2010) Common inherited variation in mitochondrial genes is not enriched for associations with type 2 diabetes or related glycemic traits. PLoS Genet. 6, e1001058.
Seshadri S., DeStefano A. L., Au R. et al. (2007) Genetic correlates of brain aging on MRI and cognitive test measures: a genome-wide association and linkage analysis in the Framingham Study. BMC Med. Genet. 8(Suppl. 1), S15.
Seshadri S., Fitzpatrick A. L., Ikram M. A. et al. (2010) Genome-wide analysis of genetic loci associated with Alzheimer disease. JAMA 303, 1832–1840.
Speliotes E. K., Willer C. J., Berndt S. I. et al. (2010) Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat. Genet. 42, 937–948.
Stein J. L., Hua X., Morra J. H. et al. (2010) Genome-wide analysis reveals novel genes influencing temporal lobe structure with relevance to neurodegeneration in Alzheimer’s disease. Neuroimage 51, 542–554.
Strekalova H., Buhmann C., Kleene R., Eggers C., Saffell J., Hemperly J., Weiller C., Muller-Thomsen T. and Schachner M. (2006) Elevated levels of neural recognition molecule L1 in the cerebrospinal fluid of patients with Alzheimer disease and other dementia syndromes. Neurobiol. Aging 27, 1–9.
Takei N., Miyashita A., Tsukie T. et al. (2009) Genetic association study on in and around the APOE in late-onset Alzheimer disease in Japanese. Genomics 93, 441–448.
Thomas P. D., Campbell M. J., Kejariwal A., Mi H., Karlak B., Daverman R., Diemer K., Muruganujan A. and Narechania A. (2003) PANTHER: a library of protein families and subfamilies indexed by function. Genome Res. 13, 2129–2141.
Vastrik I., D’Eustachio P., Schmidt E. et al. (2007) Reactome: a knowledge base of biologic pathways and processes. Genome Biol. 8, R39.
Voight B. F., Scott L. J., Steinthorsdottir V. et al. (2010) Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat. Genet. 42, 579–589.
Walter S., Atzmon G., Demerath E. W. et al. (2011) A genome-wide association study of aging. Neurobiol. Aging 32, 2109 e2115–2128.
Wang K., Li M. and Bucan M. (2007) Pathway-based approaches for analysis of genomewide association studies. Am. J. Hum. Genet. 81, 1278–1283.
Wang K., Zhang H., Ma D. et al. (2009) Common genetic variants on 5p14.1 associate with autism spectrum disorders. Nature 459, 528–533.
Wang K., Li M. and Hakonarson H. (2010) Analysing biological pathways in genome-wide association studies. Nat. Rev. Genet. 11, 843–854.
Wang L., Jia P., Wolfinger R. D., Chen X. and Zhao Z. (2011) Gene set analysis of genome-wide association studies: Methodological issues and perspectives. Genomics 98, 1–8.
Waring S. C. and Rosenberg R. N. (2008) Genome-wide association studies in Alzheimer disease. Arch. Neurol. 65, 329–334.
Ye H., Liu J. and Wu J. Y. (2010) Cell adhesion molecules and their involvement in autism spectrum disorder. Neurosignals 18, 62–71.
Zhang B., Kirov S. and Snoddy J. (2005) WebGestalt: an integrated system for exploring gene sets in various biological contexts. Nucleic Acids Res. 33, W741–748.
Zhang K., Cui S., Chang S., Zhang L. and Wang J. (2010) i-GSEA4GWAS: a web server for identification of pathways/gene sets associated with traits by applying an improved gene set enrichment analysis to genome-wide association study. Nucleic Acids Res. 38, W90–95.
Zuliani G., Cavalieri M., Galvani M., Passaro A., Munari M. R., Bosi C., Zurlo A. and Fellin R. (2008) Markers of endothelial dysfunction in older subjects with late onset Alzheimer’s disease or vascular dementia. J. Neurol. Sci. 272, 164–170.