J Mol Evol (2008) 67:23–28 DOI 10.1007/s00239-008-9120-6

Is There Selection for the Pace of Successive Inactivation of the arpAT Gene in Primates?

Ferran Casals · Anna Ferrer-Admetlla · Josep Chillarón · David Torrents · Manuel Palacín · Jaume Bertranpetit

Received: 26 September 2007 / Accepted: 6 May 2008 / Published online: 20 June 2008 © Springer Science+Business Media, LLC 2008

Abstract Pseudogenes have classically been considered inactive sequences evolving under neutrality. In recent years, however, a growing body of evidence is favoring the appearance of hypotheses attributing a functional role to pseudogenes. One of these hypotheses is that the silencing of a gene could produce a loss of function that could have been favored by natural selection. Here, we analyzed the pace of pseudogenization of arpAT, an L-DOPA transporter related to the neurotransmitter function of this amino acid in the brain. While active in rodent, dog, and chicken, arpAT has been silenced during primate evolution. Given the high number of inactivating mutations described in humans, it is possible that there have been selective pressures favoring this silencing. Through analysis of orthologous sequences in several primate species, we show that the silencing of arpAT occurred ~77 million to 90 million years ago, and that the observed mutation pattern is likely a consequence of its antiquity.

Keywords Pseudogene · Pseudogenization · Silenced gene · Natural selection · Primate evolution · arpAT

Electronic supplementary material The online version of this article (doi:10.1007/s00239-008-9120-6) contains supplementary material, which is available to authorized users.

F. Casals · A. Ferrer-Admetlla · J. Bertranpetit (✉)
Evolutionary Biology Unit, Department of Experimental Sciences and Health, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
e-mail: [email protected]

J. Bertranpetit
CIBERESP, Barcelona, Spain

J. Chillarón
Department of Biochemistry and Molecular Biology, Faculty of Biology, University of Barcelona, Barcelona, Spain

D. Torrents
Barcelona Supercomputing Center, Jordi Girona 31, 08034 Barcelona, Spain

D. Torrents
Institut Català per la Recerca i Estudis Avançats (ICREA), Passeig Lluís Companys 23, Barcelona 08010, Spain

M. Palacín
Institute for Research in Biomedicine, Barcelona Science Park, Department of Biochemistry and Molecular Biology, Faculty of Biology, University of Barcelona, Barcelona, Spain

M. Palacín
CIBERER, Barcelona, Spain

Introduction

Whole-human-genome analyses have revealed that there exist 11,000–19,000 pseudogenes. The vast majority of human pseudogenes derive from a duplicated sequence, with ~70% of them having a retrotranspositional origin and the rest having originated by a duplication event (Torrents et al. 2003; Zhang et al. 2003). Only a minority of pseudogenes did not appear after a duplication event, and thus their silencing could have produced a loss of function (Menashe et al. 2003). Although pseudogenes are defined as transcriptionally silent sequences, many cases of transcribed pseudogenes have been reported (Balakirev and Ayala 2003). This fact, together with the evolutionary conservation of the original sequence and the low level of nucleotide diversity, led to the proposal of a regulatory role for some pseudogenes (McCarrey and Riggs 1986). This hypothesis has been confirmed by several examples,
which demonstrate the functional potential of RNA transcribed from pseudogenes (Hirotsune et al. 2003; Korneev et al. 1999; Lee 2003). On the other hand, pseudogenes have also been shown to be involved in the generation of genetic diversity through gene conversion or recombination with functional genes and have the potential to become new genes (Balakirev and Ayala 2003). The increasing evidence of functionality of RNA independent of its role in protein synthesis, which could encompass partial mRNA, must be considered in analysis of the pseudogenization process.

Usually, gene silencing is thought to be produced by one inactivation event followed by neutral evolution, adding new mutations at a neutral pace, some of which will later be viewed as further inactivating variants. Considering the functional potentialities of RNA, this process can instead be viewed as a series of inactivating variants having been successively selected due to undesired remnant function in the partial or flawed mRNA (by itself, as, e.g., a regulatory element, or through a partial protein).

In this study, we have analyzed the pseudogenization process of arpAT, one of the 33 recently silenced genes described in the analysis of the human genome (International Human Genome Sequencing Consortium 2004). arpAT is a member of the light subunits of heteromeric amino acid transporters (LSHAT) family and has a strong preference for aromatic amino acids, especially L-DOPA. It was shown to be expressed in enterocytes in the small intestine and in neurons from different brain areas, and was suggested to be an L-DOPA transporter related to the neurotransmitter function of this amino acid in the rodent brain. The gene is functional in rodents, dog, and chicken, whereas it is inactive in the human (and chimpanzee) genome, where ten frame-disrupting insertions/deletions, four in-frame stop codons, and one Alu insertion disrupting exon 1 were found.

While the obtained dN/dS ratios of 0.8 and 0.85 for humans and chimpanzees, respectively, indicated that these sequences are under neutral evolution, the possible excess of frameshift mutations in the two primate species, compared with the low mutational rate of the genome, led to consideration of the possibility that the successive silencing of this gene may have undergone positive selection (Fernandez et al. 2005), thus opening the question of the pace of pseudogenization. To disentangle these two hypotheses on the arpAT gene, we have analyzed it in several primate species to trace back the pseudogenization history. The goal is to answer questions regarding the pseudogenization process of arpAT, such as when the inactivation occurred, what the rate of accumulation of inactivating mutations is, and whether there have been selective forces favoring the successive inactivation of this gene in any of the primate branches.


Materials and Methods

Samples

The following primate DNA samples from the ECACC (European Collection of Cell Cultures) Primate DNA Panel were used: MA104 (Chlorocebus aethiops), CYNOM-K1 (Macaca fascicularis), OMK (637–69) (Aotus trivirgatus), and B95–8 (Saguinus oedipus). The Microcebus murinus sample was provided by Christian Roos, Gene Bank of Primates (German Primate Center), along with other prosimian samples. The human DNA was obtained in our lab. Supplementary Table S1 reports the primers used. When amplification or the expected band was not obtained, PCRs were performed at lower annealing temperatures.

Obtaining Sequences

The first step in this work was to obtain genomic sequences from the arpAT coding region in several primates. The coding region includes six exons, and no evidence for splicing variants has been reported. To do this, available genomic sequences from different organisms (Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, and Canis familiaris) were aligned and primers were located in the more conserved regions close to the ends of the exons. No primers were designed in exon 5 because of its small size and low level of conservation (see Results and Discussion). As expected for a pseudogene, designing primers was more difficult than for functional exons, since the amount of sequence change is expected to be greater. PCR amplifications were carried out in Homo sapiens; two Old World monkeys, the cynomolgus monkey (Macaca fascicularis) and the African green monkey (Chlorocebus aethiops); two New World monkeys, the cotton-top marmoset (Saguinus oedipus) and the owl monkey (Aotus trivirgatus); and one prosimian species, the mouse-lemur (Microcebus murinus). The exact location of these primers is reported in Supplementary Table S1. Due to sequence divergence, PCR amplifications were successful only when the primers were located in exonic sequences, although in some cases PCR product was not obtained. Sequences of different parts of the arpAT gene were obtained as follows: exon 1 sequences from Homo sapiens, Chlorocebus aethiops, and Saguinus oedipus; exon 2, exon 3, and the intron between these two exons in Homo sapiens, Macaca fascicularis, Chlorocebus aethiops, and Saguinus oedipus; exon 4 in Homo sapiens, Macaca fascicularis, Chlorocebus aethiops, Saguinus oedipus, Aotus trivirgatus, and Microcebus murinus; and exon 6 in Homo sapiens, Macaca fascicularis, Chlorocebus aethiops, and Microcebus murinus. Since genomic traces of the regions containing exons 4
and 6 in Microcebus murinus were detected through BLAST searches, it was possible to design primers in the intronic regions surrounding these two exons. Therefore, in the case of Microcebus murinus the complete sequences of exons 4 and 6 were obtained. No sequences homologous to exons 1, 2, 3, and 5 were found through BLAST searches.

DNA Sequence Analysis

Nucleotide sequences were analyzed using the Lasergene software package (DNASTAR). Besides the sequences obtained here, other orthologous sequences were included in the analyses. Genetic sequences of Mus musculus (ENSMUSG00000020600), Rattus norvegicus (ENSRNOG00000006119), and Canis familiaris (ENSCAFG00000003845) were retrieved from the Ensembl database (www.ensembl.org). Microcebus murinus sequences were obtained by means of similarity searches in the NCBI Trace Archive, carried out using MegaBLAST. Homo sapiens orthologous sequences described previously (Fernandez et al. 2005) were also used in some analyses. Multiple sequence alignments were performed with ClustalW (Thompson et al. 1994) and revised manually. Phylogenetic analyses were performed using the PAML software package, which excludes indels and stop codons from the analysis (Yang 1997).

Results and Discussion

The sequences of the corresponding coding regions of arpAT were obtained anew for some species (including a prosimian) and retrieved from databases for others; see Table 1 for data and species analyzed and Supplementary Information for details on the methods of obtaining them all. For some newly sequenced species, it was not possible to amplify some exons. Species analyzed include the main primate groups (human, Chlorocebus aethiops, and Macaca fascicularis as Old World monkeys and Aotus trivirgatus and Saguinus oedipus as New World monkeys, all of them haplorhines; and Microcebus murinus or mouse-lemur as prosimian or strepsirhine). Other species include mouse, rat, and dog. As shown previously for humans and chimpanzees, which share nearly all the coding disablements (Fernandez et al. 2005), the arpAT gene appears to be silenced in all primate lineages analyzed, including the prosimian. All species contain several stop codons and frameshift mutations disrupting the putative coding sequence (Table 1).

To evaluate whether the number of indels is higher than expected, as suggested (Fernandez et al. 2005), we have used two estimates of the neutral indel substitution rate: 1.01 × 10⁻⁴ per site per million years (Britten 2002; Podlaha et al. 2005) and 8.83 × 10⁻⁵ per site per million years (Silva and Kondrashov 2002), a rate found to be quite similar along the primate species in different studies. We mapped all the described indels in the phylogeny of the analyzed species (Fig. 1 and Supplementary Table S2) and compared them to the expected number of indels in each branch of the tree, according to the length of the studied region and divergence time (Table 2). We considered the mutations produced after the initial strepsirhines-haplorhines split, since the first inactivating mutations did occur after the rodents-primates split and before the initial primate split. This split is not much more recent than the rodents-primates split, calculated to have been between 75 and 90 million years ago (MYA) (Hedges 2002; Murphy et al. 2001). The last common ancestor of haplorhines and strepsirhines has been estimated at 77.5 MYA (Steiper and Young 2006). Given that, hereafter we consider the split between primates and rodents to have been at ~90 MYA. The probabilities of the observed numbers of indels have been calculated

Table 1 Stop codons and frameshift mutations described in the sequenced regions of each exon

Exon sizes as given in the column headers: exon 1 (687/759 bp), exon 2 (117/135 bp), exon 3 (113/129 bp), exon 4 (57/126 bp), exon 6 (181/231 bp)

| Species | Stop codons: Exon 1 | Exon 2 | Exon 3 | Exon 4 | Exon 6 | Frameshift mutations: Exon 1 | Exon 2 | Exon 3 | Exon 4 | Exon 6 |
|---|---|---|---|---|---|---|---|---|---|---|
| Microcebus murinus (a) | ND | ND | ND | 2 | 1 | ND | ND | ND | 1 | 2 |
| Saguinus oedipus | 6 | 0 | 1 | 0 | ND | 8 | 1 | 2 | 0 | ND |
| Aotus trivirgatus | ND | ND | ND | 0 | ND | ND | ND | ND | 0 | ND |
| Chlorocebus aethiops | 6 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 2 |
| Macaca fascicularis | ND | 0 | 0 | 0 | 0 | ND | 0 | 0 | 0 | 2 |
| Homo sapiens | 6 | 2 | 1 | 0 | 1 | 4 | 1 | 1 | 0 | 1 |
| Mus musculus | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Note: ND, not determined
(a) Complete exons 4 and 6
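Counts of the kind reported in Table 1 can be tallied from a pairwise alignment against a functional reference. The following is only a minimal sketch of that idea, not the procedure used in the paper (which relied on Lasergene, ClustalW, and manual revision). The function names and the toy sequences are hypothetical, and the sketch assumes a gapped pairwise alignment with no column that is a gap in both sequences, with the reference starting in frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gap_runs(aligned_seq):
    """Yield the length of each consecutive run of '-' in an aligned sequence."""
    run = 0
    for ch in aligned_seq:
        if ch == "-":
            run += 1
        elif run:
            yield run
            run = 0
    if run:
        yield run


def count_disablements(ref_aln, qry_aln):
    """Return (frameshift_indels, in_frame_stops) of qry relative to ref.

    A gap run in the query is a deletion, a gap run in the reference is an
    insertion; either one shifts the frame when its length is not a multiple
    of 3. Stop codons are read in the reference reading frame.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")

    frameshifts = sum(1 for n in gap_runs(qry_aln) if n % 3)
    frameshifts += sum(1 for n in gap_runs(ref_aln) if n % 3)

    stops = 0
    codon = []
    ref_pos = 0
    for r, q in zip(ref_aln.upper(), qry_aln.upper()):
        if r == "-":
            continue  # inserted base in the query: no reference coordinate
        codon.append(q)
        ref_pos += 1
        if ref_pos % 3 == 0:
            if "".join(codon) in STOP_CODONS:
                stops += 1
            codon = []
    return frameshifts, stops


# Toy example (made-up sequences): one premature stop and one 1-bp deletion.
reference = "ATGGCTTGGAAACCC"
query = "ATGGCTTGAAA-CCC"
result = count_disablements(reference, query)
```

In this toy case the query carries TGA in place of the third reference codon and a single-base deletion in the fourth, so both counters increase by one; a 3-bp gap would leave the frameshift count unchanged.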


Fig. 1 Phylogenetic tree of arpAT, including the species with available sequences from exons 1 to 4. dN/dS ratios are shown above each branch. The number of frameshift mutations is shown below the branch. Topology of the tree and branch lengths were calculated considering all changes

Table 2 Observed and expected numbers of indels among different primate species

| Comparison | Divergence (MY) (a) | Exons | Total length (bp) | Observed | Expected (b) | Expected (c) | p (b) | p (c) |
|---|---|---|---|---|---|---|---|---|
| Cercopithecidae-Hominoidae split | | | | | | | | |
| To Homo sapiens | 30.5 | 1, 2, 3, 4, 6 | 1155 | 3 | 3.56 | 3.11 | 0.52 | 0.62 |
| To Macaca fascicularis | 30.5 | 2, 3, 4, 6 | 468 | 0 | 1.44 | 1.26 | 0.24 | 0.28 |
| To Chlorocebus aethiops | 30.5 | 1, 2, 3, 4, 6 | 1155 | 1 | 3.56 | 3.11 | 0.13 | 0.18 |
| Old World-New World monkeys split | | | | | | | | |
| To Homo sapiens | 42.9 | 1, 2, 3, 4 | 974 | 4 | 4.22 | 3.69 | 0.59 | 0.69 |
| To Macaca fascicularis | 42.9 | 2, 3, 4 | 287 | 1 | 1.24 | 1.09 | 0.65 | 0.70 |
| To Chlorocebus aethiops | 42.9 | 1, 2, 3, 4 | 974 | 2 | 4.22 | 3.69 | 0.21 | 0.29 |
| To Saguinus oedipus | 42.9 | 1, 2, 3, 4 | 974 | 9 | 4.22 | 3.69 | 0.98 | 0.99 |
| Strepsirhines-Haplorhines split | | | | | | | | |
| To Homo sapiens | 77.5 | 4, 6 | 238 | 1 | 1.86 | 1.63 | 0.44 | 0.52 |
| To Macaca fascicularis | 77.5 | 4, 6 | 238 | 2 | 1.86 | 1.63 | 0.71 | 0.78 |
| To Chlorocebus aethiops | 77.5 | 4, 6 | 238 | 2 | 1.86 | 1.63 | 0.71 | 0.78 |
| To Microcebus murinus | 77.5 | 4, 6 | 238 | 2 | 1.86 | 1.63 | 0.71 | 0.78 |

(a) From Steiper and Young (2006)
(b) Expected number of indels and cumulative Poisson probability p according to Podlaha et al. (2005)
(c) Expected number of indels and cumulative Poisson probability p according to Silva and Kondrashov (2002)
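The expected counts in Table 2 are consistent with a simple product of neutral indel rate, sequence length, and divergence time, and the p columns with a cumulative Poisson probability of seeing at most the observed count. The sketch below works through that arithmetic; the function and constant names are illustrative, and reading "cumulative" as P(X ≤ observed) is an assumption that happens to reproduce the tabulated values to within about 0.01.

```python
import math

# Neutral indel rates quoted in the text, per site per million years.
RATE_PODLAHA = 1.01e-4  # Britten 2002; Podlaha et al. 2005
RATE_SILVA = 8.83e-5    # Silva and Kondrashov 2002


def expected_indels(length_bp, divergence_my, rate):
    """Expected number of neutral indels on one branch."""
    return length_bp * divergence_my * rate


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * sum(lam ** i / math.factorial(i) for i in range(k + 1))


# First row of Table 2: branch to Homo sapiens since the
# Cercopithecidae-Hominoidae split (1155 bp, 30.5 MY, 3 indels observed).
lam_b = expected_indels(1155, 30.5, RATE_PODLAHA)
lam_c = expected_indels(1155, 30.5, RATE_SILVA)
p_b = poisson_cdf(3, lam_b)
p_c = poisson_cdf(3, lam_c)

# Human-rodent comparison over the full coding sequence
# (exon lengths from Table 3 sum to 1467 bp; split taken at 90 MYA).
full_length = 759 + 135 + 129 + 126 + 87 + 231
lam_full_b = expected_indels(full_length, 90, RATE_PODLAHA)
lam_full_c = expected_indels(full_length, 90, RATE_SILVA)
```

Under this reading, a value of p close to 1 (as on the Saguinus oedipus branch) flags more indels than the neutral rate predicts, while values near 0.5 indicate counts close to expectation.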

considering that the number of indels, under a neutral model, follows a Poisson distribution (Table 2). In general, the observed and expected numbers of indels are quite similar, and the differences are not statistically significant. Only in the Saguinus oedipus branch are differences statistically significant. However, this branch also shows a higher mutation rate, with the highest rate of synonymous and nonsynonymous changes in the primate species (see Table 4), which could explain the excess of indels. When the full coding sequence including the six exons is compared between humans and rodents, the observed number of frameshift mutations (13) is also very similar to the expected numbers (13.34 and 11.66, depending on the mutation rate) considering that they have all appeared in
the branch leading to the primates after the split with rodents (~90 MYA). Similar results were obtained in all comparisons of the primate species to the mouse sequence included in this work. Alternatively, the action of natural selection favoring the silencing of arpAT could also be reflected in a heterogeneous rate of indel fixation across the different exons of the gene, since if the gene is evolving neutrally, a similar rate of indel fixation is expected between the different exons. Table 3 reports the observed and expected numbers of indels according to a random distribution in the comparison of the full coding sequence between mice and humans. The number of indels described in exon 5 is significantly higher than expected (χ² = 19.21 after Yates'

Table 3 Exon rate of evolution in arpAT between Mus musculus and Homo sapiens

| Exon | Length (bp) (a) | Indels observed | Indels expected | Ka | Ks | Ka/Ks |
|---|---|---|---|---|---|---|
| 1 | 759 | 5 | 6.73 | 0.2246 | 0.5175 | 0.43 |
| 2 | 135 | 1 | 1.20 | 0.1889 | 0.2780 | 0.68 |
| 3 | 129 | 1 | 1.14 | 0.1779 | 0.6153 | 0.29 |
| 4 | 126 | 0 | 1.12 | 0.2964 | 0.6449 | 0.46 |
| 5 | 87 | 5 | 0.77 | 0.5044 | 1.5199 | 0.33 |
| 6 | 231 | 1 | 2.05 | 0.1972 | 0.3498 | 0.56 |

(a) Mus musculus

correction, p < 0.001). Although this result could suggest the existence of some selective pressures favoring the accumulation of disrupting mutations in this exon, a deeper analysis shows that the rate of synonymous changes is also much higher in this exon (Table 3). This would suggest a higher mutation rate in this region as the most parsimonious explanation (Hardison et al. 2003; Kvikstad et al. 2007; Wetterbom et al. 2006).

The specific rates of synonymous (dS) and nonsynonymous (dN) substitutions were estimated through maximum likelihood models (see Materials and Methods). The dN/dS ratios were estimated in the external and internal branches of three different phylogenetic trees (differing in the number of sequences and species included), using the Canis familiaris sequence as an outgroup (Fig. 1). The first analysis (Fig. 1), for exons 1–4, indicates that a relaxation of the purifying selection on the gene occurred after the split between rodents and primates, since in these branches the dN/dS ratio is closer to 1, which suggests that these sequences are under neutral evolution. In contrast, in the branches leading to rodent and dog, dN/dS values are lower, as expected for coding sequences under purifying selection.
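The exon 5 test can be approximated from Table 3 alone. The sketch below distributes the 13 observed indels in proportion to exon length and applies Yates' continuity correction to a two-category comparison of exon 5 against the rest of the coding sequence. That two-category layout is an assumption, since the text does not spell out how the test was set up, and the function names are illustrative. With it the statistic comes out near 19.2, close to the reported 19.21.

```python
import math

# Exon lengths (bp, Mus musculus) and observed indels from Table 3.
EXON_LENGTHS = {1: 759, 2: 135, 3: 129, 4: 126, 5: 87, 6: 231}
INDELS_OBSERVED = {1: 5, 2: 1, 3: 1, 4: 0, 5: 5, 6: 1}


def expected_by_length(lengths, total_events):
    """Spread total_events over exons in proportion to their length."""
    total_len = sum(lengths.values())
    return {exon: total_events * n / total_len for exon, n in lengths.items()}


def yates_chi2_one_vs_rest(observed, expected, exon):
    """Yates-corrected chi-square for one exon versus all other exons pooled."""
    obs_in, exp_in = observed[exon], expected[exon]
    obs_out = sum(observed.values()) - obs_in
    exp_out = sum(expected.values()) - exp_in
    cells = ((obs_in, exp_in), (obs_out, exp_out))
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in cells)


def chi2_sf_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


expected = expected_by_length(EXON_LENGTHS, sum(INDELS_OBSERVED.values()))
chi2_exon5 = yates_chi2_one_vs_rest(INDELS_OBSERVED, expected, exon=5)
p_exon5 = chi2_sf_1df(chi2_exon5)

# Ks for exon 5 relative to the mean Ks of the other exons (values from Table 3),
# illustrating the elevated synonymous rate the text points to.
KS = {1: 0.5175, 2: 0.2780, 3: 0.6153, 4: 0.6449, 5: 1.5199, 6: 0.3498}
ks_others_mean = sum(v for e, v in KS.items() if e != 5) / 5
ks_ratio_exon5 = KS[5] / ks_others_mean
```

The small gap between this value and the published one is what would be expected if the published figure was computed from the rounded expected counts shown in Table 3.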

We tested this by comparing the likelihoods of each branch evolving under conserved evolution with ω = 0.25, since the average KA/KS ratio for the human-chimpanzee lineage has been estimated to be ~0.23 (Chimpanzee Sequencing and Analysis Consortium 2005), and under neutral evolution with ω = 1 (Table 4). To do this, we compared the log-likelihood value of a model assuming one fixed ω (0.25 or 1) for the branch of interest to a model that estimates a free ω value for each branch of our phylogenetic tree. We then compared the models using the likelihood ratio test with as many degrees of freedom as the number of differences in the parameters estimated (that fits a chi-square distribution). All branches leading to primate species show likelihood values not compatible with conserved evolution but compatible with neutral evolution. The extremely high, although not significantly different from 1, dN/dS value obtained in the branch leading to Old World monkeys is probably due to its short length, and to the fact that randomly there is an extremely low number of synonymous substitutions. On the other hand, the likelihood values obtained in the branches leading to nonprimate species are mostly in agreement with conserved evolution. Only in the case of the branch leading to rat is the obtained dN/dS value significantly different from 0.25, but also less than 1, suggesting that some kind of relaxation has occurred in this species. Similar results are obtained in the second and third trees (Supplementary Fig. S1). When we included all the species with sequences of exons 1, 2, 3, 4, and 6 (Supplementary Fig. S1A), dN/dS values close to 1 also appear in all the branches leading to primate species after the split with rodents. Again, a relatively high dN/dS value is obtained in the branch leading to the rat, which is significantly different from 0.25 (Supplementary Table S3). The obtained probabilities are also in agreement with conserved

Table 4 Phylogenetic analysis of arpAT

| Branch | dN/dS | dN (a) | dS (b) | N × dN | p-value (ω = 0.25) (c) | p-value (ω = 1) (d) |
|---|---|---|---|---|---|---|
| Canis familiaris | 0.1710 | 0.0447 | 0.2615 | 27.5 | 0.1818 | 1.74E-05 |
| Rattus norvegicus | 0.5018 | 0.0311 | 0.0619 | 19.1 | 0.1005 | 0.1510 |
| Mus musculus | 0.2221 | 0.0187 | 0.0842 | 11.5 | 0.7949 | 0.0015 |
| Rodents | 0.3120 | 0.0855 | 0.2741 | 52.6 | 0.3665 | 7.33E-06 |
| Primates | 0.5900 | 0.0582 | 0.0986 | 35.8 | 0.0230 | 0.2350 |
| Saguinus oedipus | 0.7864 | 0.0799 | 0.1016 | 49.2 | 2.68E-05 | 0.4027 |
| Old World monkeys | 8.5371 | 0.0281 | 0.0033 | 17.3 | 0.0001 | 0.0747 |
| Chlorocebus aethiops | 0.7223 | 0.0371 | 0.0513 | 22.8 | 0.0056 | 0.4120 |
| Homo sapiens | 1.3472 | 0.0364 | 0.0270 | 22.4 | 0.0002 | 0.5433 |

(a) N (number of nonsynonymous positions) = 615.3
(b) S (number of synonymous positions) = 260.7
(c) Free model vs free model with ω_branch = 0.25
(d) Free model vs free model with ω_branch = 1
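The p-values in Table 4 come from likelihood ratio tests between a free-ratio model and the same model with ω fixed on one branch. The arithmetic of such a test can be sketched as follows, assuming a single constrained parameter and therefore one degree of freedom. The log-likelihood values below are invented for illustration only, since the paper does not report them, and the function name is hypothetical.

```python
import math


def lrt_pvalue_1df(lnl_free, lnl_constrained):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (statistic, p_value), where the statistic is
    2 * (lnL_free - lnL_constrained) and the p-value is the upper tail of a
    chi-square distribution with 1 degree of freedom.
    """
    stat = max(2.0 * (lnl_free - lnl_constrained), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods (NOT from the paper): a free-ratio model and
# the same model with omega fixed at 0.25 on the branch of interest.
lnl_free_model = -2500.0
lnl_fixed_omega = -2501.9207

lrt_stat, lrt_p = lrt_pvalue_1df(lnl_free_model, lnl_fixed_omega)
```

A small p-value means the fixed ω is rejected for that branch. In Table 4 this is the pattern seen for ω = 0.25 on the primate branches and for ω = 1 on most nonprimate branches.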


evolution in the nonprimate species and neutral evolution in primate species (Supplementary Table S3). However, in this case the dN/dS ratio calculated in the rat branch is significantly different from 0.25. Finally, in the third tree (Supplementary Fig. S1B) we have included the prosimian species for which complete sequences of exons 4 and 6 have been obtained. Although it is based on a lower number of sites (309 after deleting gaps), this tree is in agreement with an inactivation and subsequent neutral evolution of arpAT after the split between rodents and primates. Although the dN/dS ratio obtained in the prosimian species branch is quite high, the possibility of positive selection is excluded given the stop codons and frameshift mutations described in this lineage (see above), indicating that this excess of nonsynonymous mutations is due to the fact that this sequence is evolving under neutrality. The absence of synonymous changes does not allow calculation of the dN/dS ratio in the primate branch before the split of strepsirhines and haplorhines, due both to the relatively short length of the alignment and the short length of this branch. The probabilities obtained are again in agreement with conserved evolution for nonprimate species and neutral evolution for primates, since their split with rodents (Supplementary Table S4). In conclusion, this deeper analysis of the pseudogenization process of arpAT did not reveal the existence of selective pressures favoring its inactivation in primates, suggesting that the large number of frameshift mutations described in its coding sequence in several primate species is a consequence of the antiquity of this inactivation. The understanding of genome dynamics is complex and stochastic factors have a strong effect on the evolutionary output in specific genome regions, besides the directional effects of selective forces. 
These, although acting at a very fine scale in the genome, are not easy to discover in many of the results of a molecular evolutionary process, and random processes may account for large amounts of the existing variation, as in the case of the arpAT neutral pseudogene in primates.

Acknowledgments We are grateful to C. Roos, Gene Bank of Primates (German Primate Center), for kindly providing us the prosimian DNA samples. Tomàs Marquès (Washington University, USA) helped with the use of PAML. This research was supported by the Ministerio de Educación y Ciencia of Spain (Grants SAF2007–63171, BFU2005–00243, and BFU2006–14600) and EC Project Grant 502802 EUGINDAT. We also thank the Servei de Genòmica (Universitat Pompeu Fabra) for technical support. D. Torrents is an ICREA Research Professor.

References

Balakirev ES, Ayala FJ (2003) Pseudogenes: are they "junk" or functional DNA? Annu Rev Genet 37:123–151
Britten RJ (2002) Divergence between samples of chimpanzee and human DNA sequences is 5%, counting indels. Proc Natl Acad Sci USA 99:13633–13635
Chimpanzee Sequencing and Analysis Consortium (2005) Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437:69–87
Fernandez E, Torrents D, Zorzano A, Palacin M, Chillaron J (2005) Identification and functional characterization of a novel low affinity aromatic-preferring amino acid transporter (arpAT). One of the few proteins silenced during primate evolution. J Biol Chem 280:19364–19372
Hardison RC, Roskin KM, Yang S, Diekhans M, Kent WJ, Weber R, Elnitski L, Li J, O'Connor M, Kolbe D, Schwartz S, Furey TS, Whelan S, Goldman N, Smit A, Miller W, Chiaromonte F, Haussler D (2003) Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Res 13:13–26
Hedges SB (2002) The origin and evolution of model organisms. Nat Rev Genet 3:838–849
Hirotsune S, Yoshida N, Chen A, Garrett L, Sugiyama F, Takahashi S, Yagami K, Wynshaw-Boris A, Yoshiki A (2003) An expressed pseudogene regulates the messenger-RNA stability of its homologous coding gene. Nature 423:91–96
International Human Genome Sequencing Consortium (2004) Finishing the euchromatic sequence of the human genome. Nature 431:931–945
Korneev SA, Park JH, O'Shea M (1999) Neuronal expression of neural nitric oxide synthase (nNOS) protein is suppressed by an antisense RNA transcribed from an NOS pseudogene. J Neurosci 19:7711–7720
Kvikstad EM, Tyekucheva S, Chiaromonte F, Makova KD (2007) A macaque's-eye view of human insertions and deletions: differences in mechanisms. PLoS Comput Biol 3:1772–1782
Lee JT (2003) Molecular biology: complicity of gene and pseudogene. Nature 423:26–28
McCarrey JR, Riggs AD (1986) Determinator-inhibitor pairs as a mechanism for threshold setting in development: a possible function for pseudogenes. Proc Natl Acad Sci USA 83:679–683
Menashe I, Man O, Lancet D, Gilad Y (2003) Different noses for different people. Nat Genet 34:143–144
Murphy WJ, Eizirik E, Johnson WE, Zhang YP, Ryder OA, O'Brien SJ (2001) Molecular phylogenetics and the origins of placental mammals. Nature 409:614–618
Podlaha O, Webb DM, Tucker PK, Zhang J (2005) Positive selection for indel substitutions in the rodent sperm protein catsper1. Mol Biol Evol 22:1845–1852
Silva JC, Kondrashov AS (2002) Patterns in spontaneous mutation revealed by human-baboon sequence comparison. Trends Genet 18:544–547
Steiper ME, Young NM (2006) Primate molecular divergence dates. Mol Phylogenet Evol 41:384–394
Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673–4680
Torrents D, Suyama M, Zdobnov E, Bork P (2003) A genome-wide survey of human pseudogenes. Genome Res 13:2559–2567
Wetterbom A, Sevov M, Cavelier L, Bergstrom TF (2006) Comparative genomic analysis of human and chimpanzee indicates a key role for indels in primate evolution. J Mol Evol 63:682–690
Yang Z (1997) PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci 13:555–556
Zhang Z, Harrison PM, Liu Y, Gerstein M (2003) Millions of years of evolution preserved: a comprehensive catalog of the processed pseudogenes in the human genome. Genome Res 13:2541–2558