J Mol Evol (2009) 69:346–359 DOI 10.1007/s00239-009-9283-9

Distinct Evolutionary Patterns Between Two Duplicated Color Vision Genes Within Cyprinid Fishes

Zhiqiang Li • Xiaoni Gan • Shunping He

Received: 20 May 2009 / Accepted: 15 September 2009 / Published online: 17 October 2009
© Springer Science+Business Media, LLC 2009

Abstract We investigated the molecular evolution of duplicated color vision genes (LWS-1 and SWS2) within cyprinid fish, focusing on the most cavefish-rich genus, Sinocyclocheilus. Maximum likelihood-based codon substitution approaches were used to analyze the evolution of vision genes. We found that the duplicated color vision genes had unequal evolutionary rates, which may lead to a related function divergence. Divergence of LWS-1 was strongly influenced by positive selection causing an accelerated rate of substitution in the proportion of pocket-forming residues. The SWS2 pigment experienced divergent selection between lineages, and no positively selected site was found. A duplicate copy of LWS-1 of some cyprinine species had become a pseudogene, but all SWS2 sequences remained intact in the regions examined across the cyprinid fishes in this study. The pseudogenization events did not occur randomly in the two copies of LWS-1 within Sinocyclocheilus species. Some cave species of Sinocyclocheilus with numerous morphological specializations that seem to be highly adapted for caves retain both intact copies of color vision genes in their genome. We found some novel amino acid substitutions at key sites, which might represent interesting target sites for future mutagenesis experiments. Our data add to the increasing evidence that duplicate genes experience lower selective constraints and in some cases positive selection following gene duplication. Some of these observations are unexpected and may provide insights into the effect of caves on the evolution of color vision genes in fishes.

Keywords Genome duplication · Color vision gene · Positive selection · Cyprinid fish · Cavefish · Sinocyclocheilus

Z. Li · X. Gan · S. He (✉)
Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, 7th Donghu South Road, 430072 Wuhan, Hubei Province, People's Republic of China
e-mail: [email protected]

Z. Li · X. Gan
Graduate School of Chinese Academy of Sciences, Beijing 100039, People's Republic of China

X. Gan
Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan 430071, Hubei Province, People's Republic of China

Introduction

The photic environments on Earth are very diverse, ranging from total darkness to full brightness. Visual sensitivities of vertebrates are tightly associated with ecological photoenvironments (Bowmaker 1995; Yokoyama 2008). For example, the cottoid fishes in Lake Baikal (the deepest lake in the world) and coelacanths in the deep sea alter their spectral sensitivities of cone and rod opsin to match with the available downwelling sunlight (Hunt et al. 1996; Yokoyama and Tada 2000). Nocturnal terrestrial animals have a preponderance of rods, whereas diurnal vertebrates with good color vision often have more cones. The nocturnal gecko (Gekko gekko) with pure-rod retinas (Walls 1934) and the diurnal chameleon (Anolis carolinensis) with pure-cone retinas (Crescitelli 1972) represent two extreme examples.

Vision in vertebrates is known to be modulated by visual pigments, which consist of a protein moiety, an opsin, and a chromophore, either 11-cis-retinal or 11-cis-3,4-dehydroretinal. The pigments reside in rod and cone photoreceptor cells in the retina. Opsins represent a group of G


protein-coupled receptors (GPCRs) with seven transmembrane (TM) domains. The opsin is covalently linked to the retinal chromophore by a Schiff base linkage at lysine residue 296 located in the seventh TM domain (residue numbers are according to those in bovine rhodopsin) (Yokoyama 2000). The retinal visual pigments in vertebrates are classified into five evolutionarily distinct groups: RH1 (rhodopsin), RH2 (RH1-like or green), SWS1 (short wavelength-sensitive type 1 or UV–blue), SWS2 (short wavelength-sensitive type 2 or blue), and M/LWS (middle-to-long wavelength-sensitive or red–green) (Yokoyama 2000). The RH1 pigment is responsible for dim-light vision, while the other four groups of visual pigments are responsible for color vision. The phylogenetic relationship of these pigments is considered to be ((((RH1, RH2), SWS2), SWS1), M/LWS) (Yokoyama 2000).

The light sensitivity of the visual pigments is determined by the interaction between chromophore and opsin protein, and the maximum absorption wavelength of a visual pigment is referred to as λmax. At present, the absorption spectra and amino acid sequences of over 100 visual pigments have been characterized (Yokoyama 2000). Free 11-cis-retinal with a protonated Schiff base in solution has a λmax of 440 nm (Kito et al. 1968). Interacting with an opsin, however, the Schiff base-linked chromophore in a visual pigment can have a λmax ranging from 360 to 635 nm (e.g., Kochendoerfer et al. 1999; Yokoyama 2002). An interesting question is how visual pigments achieve such a wide range of λmax values. Thanks to site-directed mutagenesis and in vitro assays, chimeric and presumed ancestral pigments can now be constructed to study the effects of amino acid (aa) substitutions at specific residues on the λmax values. To date, aa substitutions at 28 sites are known to be involved in the spectral tuning of vertebrate visual pigments (Yokoyama 2008).
These changes are located within or near the transmembrane segments, which can form a pocket for chromophores. The critical amino acid changes that shift the λmax values of visual pigments are summarized by Yokoyama (2008). The λmax values of the M/LWS pigments are mostly determined by 5 aa residues at positions 164, 181, 261, 269, and 292 (numbered according to bovine rhodopsin). This is known as the "five-sites" rule, and it has correctly predicted the λmax values of the in vitro-expressed pigments of many mammals studied to date (Yokoyama 2000). The aa replacements S180A (-7 nm), H197Y (-28 nm), Y277F (-8 nm), T285A (-15 nm), A308S (-27 nm), and S180A/H197Y (11 nm) were suggested to have generated the variable λmax values of M/LWS. Effects of these aa replacements on the λmax shift are nearly additive (Yokoyama and Radlwimmer 2001). As for SWS1, the individual spectral effects of key aa replacements appear to be negligible and are largely dependent on background aa sequences, but the combined effects are


nonadditive and synergetic. The spectral tuning of the other three opsins is not as well understood as that of M/LWS. Despite the advances in recent years, our understanding of spectral tuning of visual pigments is still fragmentary.

The relatively closed, often nutrient-poor, and lightless environment of caves represents an unusual type of photoenvironment and is habitat for at least 107 troglomorphic fishes (Zhao and Zhang 2006). Many of these troglomorphic fishes show striking adaptation to this kind of extremely rigorous environment and are often characterized by a remarkable convergence of eye and pigmentation reduction or loss (Protas et al. 2006). Sinocyclocheilus (Cypriniformes, Cyprinidae) is the most cavefish-rich genus, with about 20 of its ~50 species being cave dwellers (Zhao and Zhang 2006). This genus is only found in karst cave waters and surface rivers or lakes in the Yunnan–Guizhou Plateau of China. It is remarkable that there are so many cavefish species with varying degrees of troglomorphic characters in a single fish genus. Sinocyclocheilus, along with Astyanax mexicanus, represents a good model for studying natural selection and the molecular mechanisms responsible for troglomorphic characters (Xiao et al. 2005). Because food is limited in caves, most Sinocyclocheilus species have small populations, and are thus considered endangered (Yue and Chen 1998). The genomes of the species of this genus are fundamentally tetraploid (Xiao et al. 2002). Tandem gene duplication gave rise to two LWS (LWS-1 and LWS-2) and four RH2 (RH2-1, RH2-2, RH2-3, and RH2-4) genes in zebrafish (Danio rerio), while the other three opsin genes are single-copy genes in zebrafish (Chinen et al. 2003). Because zebrafish are diploid, Sinocyclocheilus should theoretically possess two co-orthologs of every zebrafish opsin gene, provided that no copy has been lost from the genome of Sinocyclocheilus species.
Many studies have shown that adaptive evolution of opsin genes is quite common in fishes and correlated with the photo-environments inhabited by these fishes (Spady et al. 2005; Sugawara et al. 2002; Terai et al. 2006; Wang et al. 2008; Yokoyama and Takenaka 2004). The difference in λmax of opsin genes, combined with other factors (such as ambient light color and male nuptial color), can even drive speciation in fishes (Seehausen et al. 2008; Wang et al. 2008). However, selective pressures on the visual systems of the animals that inhabit lightless or aphotic environments may be completely different. For example, opsin of Mexican cavefish can potentially become nonfunctional (Yokoyama et al. 1995). Similar to that of Mexican cavefish, the opsins of some nocturnal animals will experience relaxed selective constraints and may thus become nonfunctional. Surprisingly, S- and M/L-opsin genes of multiple nocturnal primates remain intact and are under long-term purifying selection, which further raises the question about the function and role of opsins in low-light or aphotic environments (e.g., Perry et al. 2007; Tan et al. 2005).


The importance of gene duplication in supplying raw genetic material to biological evolution has been popularized by Ohno (1970). Gene duplication is believed to be a major mechanism for the establishment of new gene functions and the generation of evolutionary novelty, because it liberates one gene copy from purifying selection, permitting it to diverge genetically and potentially to evolve a new function. As many genome sequences have been determined and analyzed, the prevalence and importance of gene duplication have been clearly demonstrated (Zhang 2003). Genome duplication duplicates every gene in the genome, and is considered to be the most economical and prompt approach in providing large raw genetic material (Ohno 1970). Compared with tetrapods, teleost fish experienced an additional genome duplication, the so-called fish-specific genome duplication (FSGD), which is considered to be correlated with the diversification of teleost fish (Crow et al. 2006; Hoegg et al. 2004; Taylor et al. 2003).

How genes and their functions evolve after duplication is a central and longstanding question in evolutionary biology. As stated by Zhang (2003), possible evolutionary fates of duplicate genes are as follows: (1) pseudogenization; (2) conservation of gene function; (3) subfunctionalization; (4) neofunctionalization. The most common fate for duplicate genes has been thought to be loss of a functional copy (Walsh 1995). If a duplicated gene acquires an advantageous function, then it is likely to be kept in subsequent evolution, as shown by the gene families in the genomes of many eukaryotic organisms. These different evolutionary fates of duplicate genes can be realized at the DNA, RNA, or protein level. The molecular evolution of rhodopsin of Sinocyclocheilus has been dominated by relaxed purifying selection (Li and He 2009). But what about the color vision genes within this genus?
Here, we have conducted a molecular evolutionary study of the two color vision genes (LWS-1 and SWS2) in the most cavefish-rich genus, Sinocyclocheilus, as well as their close relatives. The aims of this study are (1) to investigate the molecular evolution of duplicated color vision genes in tetraploid cyprinid fishes under positive and divergent selection, and (2) to identify sites consistent with the types of selective pressure if they are present.

Materials and Methods

Samples

The cyprinid species sampled in this study are listed in Table 1. All these species were deposited in the Freshwater Fish Collection of the Institute of Hydrobiology, Chinese Academy of Sciences. Among these species, 16 species represent the cave and surface species of the genus


Sinocyclocheilus. In the light of the molecular phylogenetic relationships with Cyprininae (Li et al. 2008a), the other four cyprinid species (Danio rerio, Carassius auratus, Cyprinus carpio, and Procypris rabaudi) were assigned to the outgroup taxa, as listed in Table 1. Muscle or fin tissues used for DNA extraction in this study were preserved in 95% ethanol. The LWS-1 and SWS2 sequences of C. carpio, C. auratus, and D. rerio were downloaded from GenBank, except for the B copy of LWS-1 of C. auratus, which was sequenced from goldfish through cloning. The samples of Sinocyclocheilus were identified following the classification system of Zhao (2006).

DNA Extraction, PCR Amplification, and Sequencing

Total genomic DNA was extracted from ethanol-preserved muscle or fin tissues using phenol/chloroform extraction (Sambrook et al. 1989). LWS-1 and SWS2 were amplified from the total DNA extracts using the polymerase chain reaction (PCR), with PCR primers designed according to the conserved regions of the opsin genes of D. rerio, C. carpio, and C. auratus. The primers for LWS-1 were LWS-1F (5′-ATG GCA GAG CAK TGG GGA GAY GC-3′) and LWS-1R (5′-GCA GGA GCC ACA GAR GAC ACT-3′), while the primers for SWS2 were SWS2F (5′-AGC AAR TAC CAG AGT TTC ACG-3′) and SWS2R (5′-CTC TGG TGC AAC RGA GGA GAC CTG-3′). Reaction mixtures contained approximately 100 ng of template DNA, 1 μl of each primer (each 10 μM), 5 μl of 10× reaction buffer, 2 μl dNTPs (each 2.5 mM), and 2.0 U Ex Taq DNA polymerase in a total volume of 50 μl. The PCR amplification profile consisted of an initial denaturation step at 94°C for 3 min, followed by 35 cycles of denaturation at 94°C for 30 s, annealing at 56°C for 45 s (58°C for SWS2), and elongation at 72°C for 130 s (160 s for SWS2), and a final extension at 72°C for 8 min.
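The four primers above contain degenerate positions (K, Y, and R). As an illustration only, the following minimal sketch expands each primer into the explicit sequences it represents, using the standard IUPAC nucleotide ambiguity codes. The function name `expand_primer` and the dictionary layout are hypothetical choices made for this example and are not part of the original protocol.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_primer(primer):
    """Return every non-degenerate sequence encoded by a degenerate primer."""
    seq = primer.replace(" ", "").upper()
    choices = [IUPAC[base] for base in seq]          # allowed bases per position
    return ["".join(combo) for combo in product(*choices)]

# Primer sequences as given in the text (5' to 3'); spaces are ignored.
PRIMERS = {
    "LWS-1F": "ATG GCA GAG CAK TGG GGA GAY GC",
    "LWS-1R": "GCA GGA GCC ACA GAR GAC ACT",
    "SWS2F": "AGC AAR TAC CAG AGT TTC ACG",
    "SWS2R": "CTC TGG TGC AAC RGA GGA GAC CTG",
}

for name, primer in PRIMERS.items():
    variants = expand_primer(primer)
    print(name, len(variants), variants)
```

Each degenerate position multiplies the number of encoded sequences, so LWS-1F (two two-fold degenerate positions) stands for four sequences, while the other three primers stand for two each.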
PCR amplification products were fractionated by electrophoresis through 1% agarose gels, recovered from the gels, and purified with an OMEGA purification kit (OMEGA bio-tek) according to the manufacturer's instructions. The purified PCR products were then cloned using the pMD18-T vector (TAKARA) into the E. coli Top10 strain. To recover the two copies of each opsin gene expected from the ploidy level (tetraploidy), multiple positive clones were sequenced for every specimen using M13 universal sequencing primers. Some sequences from different clones of the same specimen differed at several polymorphic base pairs. These differences might represent different alleles of the same gene, or they could be a result of base misincorporations during PCR amplification or sequencing. When multiple closely related sequences were identified, only a representative sequence with the least number of autapomorphies was selected for further analysis. The resulting


Table 1 Species, clones, and GenBank accession numbers of sequences in this study

| Species | Voucher | Habitat | Eye | Clones sequenced | Divergent paralogs | LWS-1 | SWS2 |
|---|---|---|---|---|---|---|---|
| Sinocyclocheilus xunlensis | IHB04050268 | Cave | Blind | 7, 4 | 2, 2 | GQ168761, GQ168762 | GQ168742, GQ168743 |
| Sinocyclocheilus furcodorsalis | IHB0411233 | Cave | Blind | 5, 4 | 2, 2 | GQ168763, GQ168764 | GQ168744, GQ168745 |
| Sinocyclocheilus microphthalmus | IHB0411237 | Cave | Reduced | 3, 5 | 2, 1 | GQ168765, GQ168766 | GQ168746 |
| Sinocyclocheilus macrolepis | IHB0411242 | Surface | Normal | 14, 12 | 1, 1 | GQ168767 | GQ168747 |
| Sinocyclocheilus jii | IHB0411228 | Surface | Normal | 12, 13 | 1, 2 | GQ168768 | GQ168748, GQ168749 |
| Sinocyclocheilus macrophthalmus | IHB0411246 | Cave | Normal | 3, 13 | 2, 1 | GQ168769, GQ168770 | GQ168750 |
| Sinocyclocheilus grahami | No Voucher | Surface | Normal | 5, 14 | 2, 1 | GQ168771, GQ168772 | GQ168751 |
| Sinocyclocheilus tingi | IHB0407220 | Surface | Normal | 6, 12 | 2, 1 | GQ168773, GQ168774 | GQ168752 |
| Sinocyclocheilus yangzongensis | IHB2006640 | Surface | Normal | 3, 12 | 2, 1 | GQ168775, GQ168776 | GQ168753 |
| Sinocyclocheilus yimenensis | IHB2006645 | Intermediate | Normal | 5, 11 | 2, 1 | GQ168777, GQ168778 | GQ168754 |
| Sinocyclocheilus rhinocerous | IHB2006635 | Cave | Reduced | 4, 13 | 2, 1 | GQ168779, GQ168780 | GQ168755 |
| Sinocyclocheilus purpureus | IHB2006637 | Surface | Normal | 6, 11 | 2, 1 | GQ168781, GQ168782 | GQ168756 |
| Sinocyclocheilus maculatus | IHB2006632 | Surface | Normal | 13, 12 | 1, 1 | GQ168783 | GQ168757 |
| Sinocyclocheilus qiubeinsis | IHB2006624 | Surface | Normal | 2, 11 | 2, 1 | GQ168784, GQ168785 | GQ168758 |
| Sinocyclocheilus anophthalmus | IHB2006629 | Cave | Blind | 4, 10 | 2, 1 | GQ168786, GQ168787 | GQ168759 |
| Sinocyclocheilus jiuxuensis | IHB2006674 | Cave | Reduced | 12 | 1 | GQ168788 | |
| Outgroup | | | | | | | |
| Procypris rabaudi | IHB020418026 | Surface | Normal | 4, 13 | 2, 1 | GQ168790, GQ168791 | GQ168760 |
| Cyprinus carpio | | Surface | Normal | | | AB055656 | AB113668 |
| Carassius auratus | | Surface | Normal | | | L11867, GQ168789 | L11864 |
| Danio rerio | | Surface | Normal | | | AB087803 | AF109372 |

In the columns "Clones sequenced" and "Divergent paralogs", the first and second numbers refer to LWS-1 and SWS2, respectively: the numbers of clones sequenced for each gene, and the numbers of divergent paralogs obtained for each gene. The species with pseudogene of LWS-1 and the accession Nos. of newly obtained sequences are labeled with bold type

sequences have been deposited in GenBank (Accession Nos. are listed in Table 1, with associated information).

Phylogenetic Reconstruction

Multiple-sequence alignment was performed with the program Clustal X (Thompson et al. 1997) using default settings. The alignment was carefully adjusted manually based on the exon–intron structure of the corresponding opsin genes of zebrafish. The concatenated exon sequences of opsin genes were translated into their putative amino acid sequences using MEGA 3.1 (Kumar et al. 2004) to determine whether disruptive mutations (i.e., premature stop codon, frameshift, or splice site mutations) were present. If this was the case, we carefully checked the chromatogram files to rule out the possibility of ambiguous sequencing.
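The screen for disruptive mutations described above was done in MEGA 3.1; the sketch below is not that procedure, only a simplified illustration of the same idea on a concatenated coding sequence. It flags a net frameshift (length not divisible by three) and in-frame stop codons before the final codon. The function name `find_disruptions` and the toy sequences are hypothetical, and splice-site mutations are not checked.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_disruptions(cds):
    """Flag a net frameshift and premature in-frame stop codons in a CDS.

    Simplifying assumptions (ours, for illustration): the sequence starts in
    frame, alignment gaps are written as '-', and the last complete codon is
    treated as the terminal codon and is therefore not reported.
    """
    seq = cds.upper().replace("-", "")
    remainder = len(seq) % 3
    last_full = len(seq) - remainder
    codons = [seq[i:i + 3] for i in range(0, last_full, 3)]
    premature = [idx for idx, codon in enumerate(codons[:-1], start=1)
                 if codon in STOP_CODONS]
    return {"frameshift": remainder != 0, "premature_stops": premature}

# Toy sequences (hypothetical, not taken from the study).
intact = "ATGGCAGAGTGGTAA"     # ATG GCA GAG TGG TAA
nonsense = "ATGGCATAGTGGTAA"   # stop codon at codon position 3
deletion = "ATGGCGAGTGGTAA"    # 1-bp deletion changes the reading frame

for label, seq in [("intact", intact), ("nonsense", nonsense), ("deletion", deletion)]:
    print(label, find_disruptions(seq))
```

A length check only detects net frameshifts; compensating indels would need a comparison against the aligned reference exons, which is closer to what the manual inspection described in the text achieves.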


Bayesian phylogenetic analysis was carried out using MrBayes 3.1.1 (Ronquist and Huelsenbeck 2003) on the nucleotide sequence alignment to determine relationships. A posterior sample of trees was obtained by Markov chain Monte Carlo simulation with 5,000,000 generations from a random starting tree, four Markov chains (three heated and one cold) sampled every 1,000 generations using the substitution model selected by Modeltest 3.7 (Posada and Crandall 1998) with the Bayesian information criterion. The K80 + G and HKY + G models were selected for LWS-1 and SWS2 data sets, respectively. Bayesian analysis was conducted twice for each data set, using different random numbers, to confirm consistency between runs. The samples obtained during the first 2.5 × 10^6 generations were discarded as "burn-in" and a 50% majority-rule consensus tree that summarizes topology and branch-length information was calculated from the remaining 5,002 sampled trees.

Selective Pressure Analysis

To examine the pattern of selection acting on LWS-1 and SWS2 at global and lineage-specific levels, several codon-based selection tests implemented in PAML 4 (Yang 1997, 2007) were applied to the LWS-1 and SWS2 sequences. Comparisons of the ratio of nonsynonymous nucleotide substitutions per nonsynonymous site (dN) with that of synonymous substitutions per synonymous site (dS), dN/dS = ω, are used to infer the selective regime experienced by a gene. Neutral evolution, purifying selection, and positive selection are indicated by ω = 1, ω < 1, and ω > 1, respectively. The LWS-1 sequences with disruptive mutations were excluded from the analyses of selective pressure, and the Bayesian tree was pruned accordingly. In the analysis of selective pressure, the sequences of zebrafish were excluded from the LWS-1 and SWS2 data sets.
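As a quick arithmetic check on the Bayesian sampling scheme described under Phylogenetic Reconstruction, the sketch below counts the trees left after burn-in. It assumes (our assumption, not stated in the text) that each run also records the starting tree at generation 0 and that the trees from both runs are pooled; the function name `retained_trees` is hypothetical.

```python
def retained_trees(generations, sample_every, burnin_generations, runs):
    """Number of post-burn-in trees pooled over independent MCMC runs."""
    samples_per_run = generations // sample_every + 1        # includes generation 0
    discarded_per_run = burnin_generations // sample_every   # burn-in samples dropped
    return runs * (samples_per_run - discarded_per_run)

# 5,000,000 generations, sampled every 1,000, first 2.5 x 10^6 discarded, two runs.
print(retained_trees(5_000_000, 1_000, 2_500_000, 2))
```

Under these assumptions each run contributes 2,501 trees, which is consistent with the 5,002 sampled trees reported for the consensus.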
Two sequence data sets were used for LWS-1. The first included all functional LWS-1 gene sequences except that of zebrafish, and is hereafter referred to as the 'large' data set. The second comprised only the LWS-1 gene sequences of Sinocyclocheilus, and is hereafter referred to as the 'small' data set. The SWS2 sequences were divided into 'large' and 'small' data sets in the same way. The branch models allow for variable ω among branches in the tree but invariable ω ratios among sites, and can be implemented for the study of changes in selective pressures in specific lineages (Yang 1998). Two branch models, M0 (R1) and two ratios (R2), were used to detect lineage-specific changes in selective pressure after the duplication events. The branch-specific models are based on the assumption that selective constraints may change following


gene duplication. Likelihood-ratio tests (LRTs) were used to assess their goodness of fit. The site models use statistical distributions to describe the variation in selective pressure among sites but do not assume a priori which sites may be under positive selection (Yang et al. 2000). The site models consider only variation among sites and are suitable for detecting sites under positive selection in all lineages. Two pairs of nested site models were used to test for positive selection: M2a (selection) versus M1a (nearly neutral), and M8 (beta & ω) versus M7 (beta). LRTs were used to compare each pair of nested models, with the corresponding statistics being approximated by a chi-squared distribution. Posterior probabilities of the different site classes were calculated using a Bayes empirical Bayes (BEB) procedure (Yang et al. 2005), and those sites with a posterior probability higher than 0.95 of having ω > 1 were considered to have evolved under positive selection. A test for functional divergence between the two copies of LWS-1 and SWS2 was performed using the clade Model C (Bielawski and Yang 2004) implemented in PAML. This model is based on a branch-site model and allows variation in the ω ratio among sites, with a proportion of sites evolving under different selective constraints between a pair of clades. M1a served as the null model of Model C. Divergent selection is indicated when the ω values differ between clades and the LRT statistic is significant. The sites under divergent selection were identified using the BEB approach. The models mentioned above were run for both the large and the small data set of each of the two color vision genes. Because the models implemented in PAML are noted to be prone to the problem of multiple local optima (Bielawski and Yang 2004; Yang et al. 2000), starting ω values both above and below 1 were used. The results corresponding to the highest likelihood were used.
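The likelihood-ratio tests described above can be sketched as follows. The statistic is twice the difference in log-likelihood between the alternative and the null model, compared with a chi-squared distribution. The log-likelihoods used in the example are those reported in Table 2 for the large LWS-1 data set; the degrees of freedom (1 for R2 versus R1, 2 for M2a versus M1a) are the values conventionally used for these comparisons and are an assumption here, since the text does not state them. The function names `chi2_sf` and `lrt` are hypothetical, and the survival function is a closed-form expression valid for integer degrees of freedom only.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        total = sum(half ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-half) * total
    # Odd df: start from df = 1 and add one correction term per step of 2.
    q = math.erfc(math.sqrt(half))
    k = 1
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def lrt(lnl_null, lnl_alt, df):
    """Return the LRT statistic 2*(lnL_alt - lnL_null) and its P value."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)

# LWS-1, large data set (lnL values from Table 2); df values are assumed.
stat, p = lrt(-2888.24, -2886.86, df=1)   # M0 (R1) vs two ratios (R2)
print(f"R1 vs R2:   2dlnL = {stat:.2f}, P = {p:.4f}")
stat, p = lrt(-2858.97, -2850.59, df=2)   # M1a vs M2a
print(f"M1a vs M2a: 2dlnL = {stat:.2f}, P = {p:.2e}")
```

With these inputs the branch-model comparison gives a P value close to the marginal value reported in the Results, while the site-model comparison is clearly significant.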
Site Inspections

At present, certain amino acid changes at a total of 28 critical sites are known to be able to modify the λmax of various visual pigments during vertebrate evolution (reviewed by Yokoyama 2008). There are 12 sites (44, 46, 91, 94, 109, 116, 118, 122, 261, 265, 269, and 292) and 5 sites (164, 181, 261, 269, and 292) involved in spectral tuning of SWS2 pigments and M/LWS pigments, respectively. Functionally relevant sites are usually in the chromophore-binding pocket and often involve a change in amino acid polarity. There are functional critical sites for all opsins, i.e., the intramolecular disulfide linkage of C110 and C187, the Schiff's base counterion E113, and the site (K296) covalently linked to the retinal chromophore (Palczewski et al. 2000). We inspected the potential amino acid substitutions at these sites and mapped these variable sites onto the crystal structure of bovine rhodopsin (1F88) (Palczewski et al. 2000). The positively and divergently selected sites identified by PAML were also mapped.

Results

Opsin Gene Sequences

If duplication of the opsin genes occurred only by genome duplication, diploids would be expected to have one copy and tetraploids, two. The numbers of differentiated sequences (Table 1) were less than or equal to the numbers predicted by the ploidy level of each species, which was consistent with the assertion that the duplicate copies of LWS-1 and SWS2 were derived from entire genome duplication and not from duplication of these genes alone, although we did not sequence enough clones to statistically demonstrate this in all species. For some of the genes predicted according to ploidy level, no copies were found in the clones. This might have resulted from insufficient clones sequenced, or the loss of some of the duplicated genes in polyploids. Thirty-one complete LWS-1 sequences and 19 partial SWS2 sequences were obtained. The sequence length of LWS-1 ranged from 1674 bp (P. rabaudi, B paralog) to 1868 bp (S. grahami, A paralog), while that of SWS2 ranged from 1959 bp (S. jii, B paralog) to 2286 bp (S. xunlensis, A paralog). In all genes, splice junction signals (GT/AG) were conserved in all introns. There were no disruptive mutations in the coding regions of SWS2, and the coding regions sequenced were 1046 bp in length. The coding regions of LWS-1 of most examined cyprinid fishes were 1074 bp in length, except for some degenerative sequences due to severe disruptive mutations (Fig. 1). However, the majority of introns and other parts of coding regions of the pseudogenes can align with the functional genes unambiguously, which suggests that the nonfunctionalization has been fairly recent.

Phylogenetic Reconstructions

The reconstructed LWS-1 and SWS2 gene trees are shown in Fig. 2a, b, respectively. For the LWS-1 gene tree, LWS-1 sequences of all tetraploid cyprinid fish split into two lineages with strong posterior probability (PP): A (PP = 0.97) and B (PP = 0.98). The LWS-1 from Sinocyclocheilus species examined formed two separate lineages both with strong support (PP = 1.00). All tetraploid cyprinid fishes' SWS2 sequences formed two lineages with moderate support: A (PP = 0.75) and B (PP = 0.77). The A paralog of SWS2 from Sinocyclocheilus species formed a monophyletic group (PP = 1.00); but the B paralog of SWS2 from Sinocyclocheilus species formed a polyphyletic group due to the insertion of C. auratus's sequence. The PP values of the nodes mentioned above were high, strongly suggesting that the genes of each clade were orthologous.

Selective Pressure Analysis

Maximum likelihood estimates of parameters under codon-based models for the four data sets are listed in Table 2, while the LRT results are provided in Table 3. For the large data set of LWS-1, the estimated ω (ω = 0.281) as an average over all sites and branches was substantially

[Figure 1: schematic of the six LWS-1 exons, E1 (103 bp), E2 (297 bp), E3 (169 bp), E4 (166 bp), E5 (240 bp), and E6 (99 bp), with the indels found in S. xunlensis (A), S. furcodorsalis (A), S. macrophthalmus (A), S. tingi (A), S. rhinocerous (A), S. jiuxuensis (A), and P. rabaudi (B)]

Fig. 1 Disruptive mutations in the coding regions of the LWS-1 opsin gene of cyprinid fishes. The LWS-1 gene of zebrafish contains six exons (indicated by the six open boxes), with a total length of 1074 bp for the coding regions. Deletions (filled triangle) and insertions (inverted filled triangle) result in a frame-shift and premature stop codons. The numbers in parentheses indicate the length and position (according to LWS-1 of D. rerio) of indels, respectively. The events of nonfunctionalization of LWS-1 of 6 Sinocyclocheilus species occurred in paralog A, while that of P. rabaudi occurred in paralog B

[Figure 2: Bayesian gene trees for (a) LWS-1 and (b) SWS2, each showing the A and B paralog lineages]

Fig. 2 a 50% majority-rule consensus tree of Bayesian analysis based on LWS-1 data. Values below 0.5 are not shown. The K80 + G model is used. Numbers above nodes represent node supports inferred from Bayesian posterior probability analyses. D and I indicate the presence of frameshift deletion and insertion in coding region, respectively. The deduced events of deletion and insertion along branches are indicated by D and I below corresponding branches, respectively. b 50% majority-rule consensus tree of Bayesian analysis based on SWS2 data. Values below 0.5 are not shown. The HKY + G model is used. Numbers above nodes represent node supports inferred from Bayesian posterior probability analyses. The amino acids along branches are shown below the corresponding branches

smaller than 1. Model R2 was better than the one-ratio model according to their likelihoods, and the former indicated that the selective pressures acting on lineage A (ωA = 0.357) and lineage B (ωB = 0.227) were different. But the LRT statistic was only marginally significant (0.05 < P = 0.0966 < 0.10). Site-specific positive selection models M2a and M8 fitted the data significantly better than models M1a and M7, respectively. Parameter estimates under M2a suggested 6.4% of sites to be under positive selection with ω2 = 3.024. Parameter estimates under M8 suggested 6.3% of sites to be under positive selection with a similar ωs = 3.038. The BEB posterior means of ω for sites under model M8 are shown in Fig. 3. Site 112V was inferred to have ω > 1 with very high posterior probabilities under both M2a and M8 (PP > 99%). Site 52L was inferred to have ω > 1 with a high posterior probability (PP > 95%) only under M8, and the posterior probability of this site under positive selection was 93% under M2a. Model C was significantly better than M1a. Model C suggested that 6.3% of codon sites evolved under divergent selection with ω2A = 2.865 for lineage A and ω2B = 3.232 for lineage B, which indicated some sites were under positive selection both in lineages A and B. Two codon sites (52L and 112V) were identified with posterior probabilities >80% of having evolved under divergent selective pressures between lineages A and B.

For the small data set of LWS-1, M0 (R1) had an average ω value of 0.290, similar to that of M0 for the large data set of LWS-1. The two-ratio model (R2) was not significantly better than M0 (P = 0.689). Site-specific positive selection models M2a and M8 fitted the data better than models M1a and M7, respectively. A proportion (0.6%) of codon sites were suggested to be under positive selection by both M2a (ω2 = 8.560) and M8 (ωs = 8.447). Only site 112V was suggested to be under positive selection by M2a and M8. But the LRT statistics were only marginally significant (0.05 < P < 0.10, Table 3). Clade Model C was not significantly better than M1a, so no sites were identified to be under divergent selection between lineages A and B.


Table 2 Likelihood values and parameter estimates under branch, site, and clade models for LWS-1 and SWS2

| Gene | Data set | Model | lnL | Parameter estimates | Positively or divergently selected sites |
|---|---|---|---|---|---|
| LWS-1 | Large data | M0 (R1) | -2888.24 | ω = 0.281 | None |
| | | Two ratios (R2) | -2886.86 | ωA = 0.357/ωB = 0.227 | None |
| | | M1a (nearly neutral) | -2858.97 | p0 = 0.826, (p1 = 0.174), ω0 = 0.072, (ω1 = 1.000) | Not allowed |
| | | M2a (positive selection) | -2850.59 | p0 = 0.936, p1 = 0.000, (p2 = 0.064); ω0 = 0.132, (ω1 = 1.000), ω2 = 3.024 | 112V |
| | | M7 (beta) | -2861.23 | p = 0.086, q = 0.257 | Not allowed |
| | | M8 (beta&ω) | -2850.63 | p0 = 0.937, (p1 = 0.063); p = 15.290, q = 99.000, ωs = 3.038 | 52L, 112V |
| | | Model C | -2850.53 | p0 = 0.937, p1 = 0.000, (p2 = 0.063); ω0 = 0.133, (ω1 = 1.000), ω2A = 2.865, ω2B = 3.232 | 52L, 112V |
| | Small data | M0 (R1) | -2287.31 | ω = 0.290 | None |
| | | Two ratios (R2) | -2287.23 | ωA = 0.328/ωB = 0.278 | None |
| | | M1a (nearly neutral) | -2277.75 | p0 = 0.742, (p1 = 0.258), ω0 = 0.024, (ω1 = 1.000) | Not allowed |
| | | M2a (positive selection) | -2275.37 | p0 = 0.768, p1 = 0.226, (p2 = 0.006); ω0 = 0.052, (ω1 = 1.000), ω2 = 8.560 | 112V |
| | | M7 (beta) | -2277.85 | p = 0.005, q = 0.012 | Not allowed |
| | | M8 (beta&ω) | -2275.33 | p0 = 0.994, (p1 = 0.006); p = 0.089, q = 0.248, ωs = 8.447 | 112V |
| | | Model C | -2275.22 | p0 = 0.765, p1 = 0.230, (p2 = 0.005); ω0 = 0.049, (ω1 = 1.000), ω2A = 5.617, ω2B = 10.480 | None |
| SWS2 | Large data | M0 (R1) | -2606.87 | ω = 0.353 | None |
| | | Two ratios (R2) | -2604.31 | ωA = 0.488/ωB = 0.252 | None |
| | | M1a (nearly neutral) | -2578.23 | p0 = 0.701, (p1 = 0.299), ω0 = 0.000, (ω1 = 1.000) | Not allowed |
| | | M2a (positive selection) | -2576.59 | p0 = 0.750, p1 = 0.000, (p2 = 0.250); ω0 = 0.007, (ω1 = 1.000), ω2 = 1.448 | None |
| | | M7 (beta) | -2578.23 | p = 0.005, q = 0.012 | Not allowed |
| | | M8 (beta&ω) | -2576.59 | p0 = 0.752, (p1 = 0.248); p = 0.036, q = 2.126, ωs = 1.454 | 40S, 172M, 284F |
| | | Model C | -2574.16 | p0 = 0.767, p1 = 0.000, (p2 = 0.233); ω0 = 0.021, (ω1 = 1.000), ω2A = 2.100, ω2B = 1.057 | 7V, 27A, 40S, 46V, 56V, 106F, 107N, 166L, 172M, 227V, 280L, 282I, 284F, 306T |
| | Small data | M0 (R1) | -2206.52 | ω = 0.401 | None |
| | | Two ratios (R2) | -2204.19 | ωA = 0.621/ωB = 0.273 | None |
| | | M1a (nearly neutral) | -2192.70 | p0 = 0.672, (p1 = 0.328), ω0 = 0.000, (ω1 = 1.000) | Not allowed |
| | | M2a (positive selection) | -2190.46 | p0 = 0.832, p1 = 0.030, (p2 = 0.138); ω0 = 0.085, (ω1 = 1.000), ω2 = 2.356 | None |
| | | M7 (beta) | -2192.85 | p = 0.005, q = 0.012 | Not allowed |
| | | M8 (beta&ω) | -2190.46 | p0 = 0.869, (p1 = 0.131); p = 0.297, q = 2.003, ωs = 2.414 | 40S |
| | | Model C | -2188.75 | p0 = 0.753, p1 = 0.000, (p2 = 0.247); ω0 = 0.005, (ω1 = 1.000), ω2A = 2.543, ω2B = 1.180 | 40S, 46V, 107N, 172M, 280L, 282I, 306T |

Positive selection sites are identified at the cutoff posterior probability (PP) > 95%, while those with PP > 99% are shown in boldface. Divergent selection sites are identified at the cutoff PP > 80%, while those with PP > 99% are shown in boldface. The LWS-1 of Danio rerio is used as standard
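The site-level quantities behind Table 2 and Fig. 3 are simple functions of per-site posterior probabilities: the posterior mean of ω is a probability-weighted average over the site classes, and the posterior probability of positive selection is the total weight on classes with ω > 1. The following is a minimal sketch of that arithmetic only; the function names are hypothetical, the ten beta-class ω values and the posterior vector are made-up illustrative numbers, and only ωs = 3.038 is taken from Table 2 (M8, large LWS-1 data set).

```python
def posterior_mean_omega(class_omegas, posterior_probs):
    """Weighted average of omega over site classes for one codon site."""
    if len(class_omegas) != len(posterior_probs):
        raise ValueError("one posterior probability is needed per site class")
    if abs(sum(posterior_probs) - 1.0) > 1e-6:
        raise ValueError("posterior probabilities must sum to 1")
    return sum(w * p for w, p in zip(class_omegas, posterior_probs))


def prob_positive_selection(class_omegas, posterior_probs):
    """Posterior probability that the site belongs to a class with omega > 1."""
    return sum(p for w, p in zip(class_omegas, posterior_probs) if w > 1.0)


# Hypothetical example: 10 beta classes (illustrative values) plus the
# positive-selection class with omega_s = 3.038 (Table 2, M8, large LWS-1).
beta_classes = [0.02, 0.05, 0.08, 0.10, 0.12, 0.14, 0.16, 0.19, 0.23, 0.30]
class_omegas = beta_classes + [3.038]

# Hypothetical posterior for a strongly selected site (not from the paper).
posterior = [0.0] * 9 + [0.01, 0.99]

mean_omega = posterior_mean_omega(class_omegas, posterior)
pp = prob_positive_selection(class_omegas, posterior)
is_reported = pp > 0.95  # cutoff used for positive selection sites in Table 2
```

With these made-up inputs the site would pass the PP > 95% cutoff; real posterior vectors would come from the BEB output of the fitted model.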

For both the large and small data sets of SWS2, the branch-specific models (two ratios, R2) were significantly better than M0. Thus, lineage A evolved more quickly than lineage B for both of the SWS2 data sets. The positive selection models M2a and M8 were not significantly better than models M1a and M7, respectively, for either SWS2 data set. So, the possibility of positive selection was rejected for both the large and small data sets of SWS2 according to site models. Clade Model C was significantly better than M1a for the large and small data sets. For the large data set, Model C suggested 23.3% of sites to be under divergent selection with ω2A = 2.100 and ω2B = 1.057, and 14 codon sites were identified with PP > 80% of having evolved under divergent selective pressures between lineages A and B (Table 2). For the small data set, Model C suggested a similar proportion (24.7%) of sites to


Table 3 Likelihood ratio test for LWS-1 and SWS2

| Gene | Data set | Compared models | df | 2ΔlnL | P value |
|---|---|---|---|---|---|
| LWS-1 | Large data | Two ratios-M0 | 1 | 2.76 | 0.0966 |
| | | M2a-M1a | 2 | 16.76 | 2.29E-4 |
| | | M8-M7 | 2 | 21.2 | 2.49E-5 |
| | | Model C-M1a | 3 | 16.88 | 7.48E-4 |
| | Small data | Two ratios-M0 | 1 | 0.16 | 0.689 |
| | | M2a-M1a | 2 | 4.76 | 0.0926 |
| | | M8-M7 | 2 | 5.04 | 0.0805 |
| | | Model C-M1a | 3 | 5.06 | 0.167 |
| SWS2 | Large data | Two ratios-M0 | 1 | 5.12 | 2.37E-2 |
| | | M2a-M1a | 2 | 3.28 | 0.194 |
| | | M8-M7 | 2 | 3.28 | 0.194 |
| | | Model C-M1a | 3 | 8.14 | 4.32E-2 |
| | Small data | Two ratios-M0 | 1 | 4.66 | 3.09E-2 |
| | | M2a-M1a | 2 | 4.48 | 0.106 |
| | | M8-M7 | 2 | 4.78 | 0.0916 |
| | | Model C-M1a | 3 | 7.9 | 4.81E-2 |

be under divergent selection with slightly higher ω2A = 2.543 and ω2B = 1.180, and 7 codon sites were suggested to be under divergent selection.

Sites Inspection

The functional critical sites for all opsins were conserved in all intact sequences examined in this study. One of the five critical sites for M/LWS, site 164, was variable in tetraploid cyprinid fishes (Table 4), and the other four sites were conserved (181H, 261Y, 269T, and 292A). The positively selected sites, 52 and 112, were located within TM I and II, respectively (Table 4). Three of the 12 critical sites were variable for SWS2 of tetraploid cyprinid fishes (Table 5). Fourteen codon sites of SWS2 were suggested to have evolved under divergent selective pressures between lineages A and B for the large data set (Table 2). We mapped these sites to the three-dimensional structure of bovine rhodopsin. Most sites (11) were located within transmembrane regions: 46V and 56V at TM I; 106F and 107N at TM II; 166L and 172M at TM IV; 227V at TM V; 280L, 282I, and 284F at TM VI; 306T at TM VII. The other three divergently selected sites were located within the NH2-terminal tail.

Table 4 Amino acid variations on 1 of 5 key sites and positively selected sites of LWS-1 protein sequences

| Species | 52 (39) | 112 (99) | 177 (164) |
|---|---|---|---|
| Consensus | L | V | P |
| Sinocyclocheilus xunlensis | . | . | . |
| Sinocyclocheilus furcodorsalis | . | F | . |
| Sinocyclocheilus microphthalmus | . | F | . |
| Sinocyclocheilus macrolepis | . | F | . |
| Sinocyclocheilus jii | . | . | . |
| Sinocyclocheilus macrophthalmus | . | . | . |
| Sinocyclocheilus grahami | . | F | . |
| Sinocyclocheilus tingi | . | F | . |
| Sinocyclocheilus yangzongensis | . | F | . |
| Sinocyclocheilus yimenensis | . | F | . |
| Sinocyclocheilus purpureus | . | F | . |
| Sinocyclocheilus qiubeinsis | V | F | . |
| Sinocyclocheilus anophthalmus | . | F | . |
| Carassius auratus | . | Y | S |
| | | | |
| Sinocyclocheilus microphthalmus | . | I | S |
| Sinocyclocheilus grahami | V | I | S |
| Sinocyclocheilus yangzongensis | V | I | S |
| Sinocyclocheilus yimenensis | V | I | S |
| Sinocyclocheilus purpureus | V | M | S |
| Sinocyclocheilus maculatus | V | I | S |
| Sinocyclocheilus qiubeinsis | V | I | S |
| Sinocyclocheilus anophthalmus | V | I | S |
| Procypris rabaudi | V | . | A |
| Carassius auratus | . | F | S |
| Cyprinus carpio | . | I | S |
| Danio rerio | V | F | A |

Column headings give the amino acid positions in the Cyprinidae LWS-1 pigment (using the LWS-1 of Danio rerio as standard; the numbers in parentheses are the corresponding positions of bovine rhodopsin). A dot indicates an amino acid identical to the LWS-1 consensus

Fig. 3 Approximate posterior means of ω, calculated as a weighted average of ω over the 11 site classes, weighted by the posterior probabilities. Sites with low mean ω's are inferred to be under purifying selection. Sites are numbered according to the reference sequence LWS-1 of zebrafish (AB087803). Horizontal axis: site numbers

Discussion

Different Evolutionary Fates of Duplicated Color Vision Genes and the Events of Pseudogenization of LWS-1

In this study, we investigated the molecular evolution of duplicated LWS-1 and SWS2 genes in tetraploid cyprinid


Table 5 Amino acid variations on 3 of 12 key sites of SWS2 protein sequences

| Species | 51 (44) | 101 (94) | 125 (118) |
|---|---|---|---|
| Consensus | M | S | T |
| Sinocyclocheilus xunlensis | . | . | . |
| Sinocyclocheilus furcodorsalis | . | . | . |
| Sinocyclocheilus microphthalmus | . | . | . |
| Sinocyclocheilus macrolepis | . | L | . |
| Sinocyclocheilus jii | . | . | . |
| Sinocyclocheilus macrophthalmus | . | . | . |
| Sinocyclocheilus grahami | . | L | . |
| Sinocyclocheilus tingi | . | L | . |
| Sinocyclocheilus yangzongensis | . | L | . |
| Sinocyclocheilus yimenensis | . | L | . |
| Sinocyclocheilus rhinocerous | . | . | . |
| Sinocyclocheilus purpureus | . | L | . |
| Sinocyclocheilus maculatus | . | L | . |
| Sinocyclocheilus qiubeinsis | . | L | . |
| Sinocyclocheilus anophthalmus | . | L | . |
| Cyprinus carpio | . | . | . |
| | | | |
| Sinocyclocheilus xunlensis | . | . | . |
| Sinocyclocheilus furcodorsalis | T | . | . |
| Sinocyclocheilus jii | . | . | S |
| Procypris rabaudi | . | . | . |
| Carassius auratus | . | . | . |
| Danio rerio | . | A | . |

Column headings give the amino acid positions in the Cyprinidae SWS2 pigment (using the SWS2 of Danio rerio as standard; the numbers in parentheses are the corresponding positions of bovine rhodopsin). A dot indicates an amino acid identical to the SWS2 consensus

fishes. It has been suggested that polyploidization events occurred repeatedly in different lineages of cyprinine fishes (Yu et al. 1989). Following the gene trees of these two color vision genes (duplicated color vision genes grouped together by gene, not by species), we suggest that the genome duplication occurred before the speciation of the tetraploid cyprinine fishes used in this study (Málaga-Trillo et al. 2002). It is well known that a duplicate copy is much more likely to become a pseudogene. Here we have shown, through analysis at the DNA level, that a duplicate copy of LWS-1 in some cyprinine species had indeed become a pseudogene. In contrast, both copies of SWS2 remained intact in the regions sequenced, which suggests that both copies might still be functional. For Sinocyclocheilus species, the pseudogenization of LWS-1 occurred only in lineage A, while that of P. rabaudi occurred in lineage B. Five of the seven sequences with disruptive mutations occurred in cave species of Sinocyclocheilus, which suggests that pseudogenization occurred more frequently in cave species and is consistent with the presumption that opsins are not as important in caves as at the surface. The pseudogenization events, inferred from the mitochondrial gene tree (Li et al. 2008b) and the positions of the disruptive mutations, are shown in Fig. 2a. Five independent deletion events and one insertion event can parsimoniously explain the resulting pattern of pseudogenization of the LWS-1 gene. A deletion of 8 bp (600–607) occurred in the most recent common ancestor (MRCA) of S. xunlensis and S. macrophthalmus. The MRCA of S. furcodorsalis and S. jiuxuensis had a 23-bp deletion (335–357), and a 5-bp insertion then occurred independently in S. furcodorsalis. Combined with the fact that the pseudogenes have very high similarity to the functional genes, we suggest that the pseudogenization events are independent and recent.

Positive Selection, Functional Diversification of Duplicated Color Vision Genes

For the same gene, the predicted codon sites evolving under positive or divergent selection and the LRT statistics vary between data sets. This is likely because there are fewer variable sites when the outgroups are left out. The discussion is therefore based on the large data sets.

As stated in the introduction, some nocturnal primates accumulate dysfunctional mutations in color vision genes, while others do not. Why? Two alternative interpretations of the presence of two intact color vision genes in some nocturnal primates have been offered in the literature. First, nocturnal habits necessarily lead to monochromacy over the long term, and the presence of two functional opsin genes would indicate different time periods of functional relaxation among lineages (i.e., the opsins are on their way to pseudogenization but gene-disrupting mutations have not yet occurred, Tan et al. 2005). Second, there might be long-term functional constraints on S- and M/L-opsin genes despite a persistent nocturnal activity pattern. If the latter is true, then dichromacy might be adaptive in some nocturnal primates, or opsins might have a function unrelated to vision, e.g., circadian rhythm regulation (Crandall and Hillis 1997; Nei et al. 1997). Because the photic environments of caves and night are highly similar, we investigated whether the two interpretations can be applied to Sinocyclocheilus cave species.

Sinocyclocheilus cave species show different degrees of eye degeneration: normal, reduced, or blind. If different time periods of functional relaxation are related to different times of origin of cave species, then those species with a relatively older shift to the cave environment will have a higher chance to accumulate degenerative mutations. According to the divergence time estimation, S. microphthalmus and the MRCA of S. furcodorsalis, S. jiuxuensis, and S. rhinocerous (which most likely had already shifted to a cave environment) originated at almost the same time (about 6.5 Ma, Li et al. 2008b), so S. microphthalmus will have had a similar chance to accumulate degenerative mutations as the three cave species. But both copies of LWS-1 of S. microphthalmus remain intact. Sometimes natural selection is a very quick and intense process, as shown in a butterfly (Charlat et al. 2007). One troglomorphic character, humpback, can appear quickly in individuals of surface populations of A. mexicanus maintained in darkness (Rasquin 1949; Rasquin and Rosenbloom 1954). Thus, it is possible that eye reduction, a troglomorphic character, can appear in a very short time period and may be unrelated to the time of the environmental shift. Because blind species neither need nor have the capacity for color vision (or even vision), these species should have a greater chance to accumulate degenerative mutations than species with reduced or normal eyes. But S. anophthalmus, a blind species with numerous morphological specializations that seem to be highly adapted for caves, retains both intact copies of LWS-1 in its genome. Another individual of S. rhinocerous from the same location was sequenced, and the A copy of LWS-1 had accumulated a 1-bp deletion at the same position as in the individual used in this study (data not shown). This suggests that this deletion might be fixed or at high frequency in S. rhinocerous. As we are not clear about those degenerative mutations in other species, we suggest the second interpretation is more reasonable.

Functional divergence can occur in a variety of ways, including relaxed selective constraints, neutral evolution, or even positive selection. The classical model of gene duplication is that one copy is free to evolve neutrally, thereby accumulating random mutations that may infrequently result in acquiring a new function (Ohno 1970). Under this model, the most common fate of a duplicated gene is to become null through deleterious mutations. However, the classical model cannot explain why a large number of duplicated genes are retained in a genome for a long period of time. So, the duplication–degeneration–complementation (DDC) model was proposed to offer an alternative explanation of the evolutionary fate of duplicated genes (Force et al. 1999; Lynch and Force 2000). Under this model, duplicate copies are retained in the genome through a process of subfunctionalization. In this process, beneficial mutations would not be necessary to retain duplicate copies. Even if acting on a small number of sites for a brief period of time, positive selection can be an important factor in retaining duplicated genes by promoting the acquisition of a novel or more specialized function and is probably common across many taxa. Examples of positive selection following gene duplication include the duplicated pancreatic ribonuclease gene in the leaf-eating monkey (Zhang et al. 1998, 2002), Dntf-2r in Drosophila (Betrán and Long 2003), phytochrome A in angiosperms (Mathews et al. 2003), CCT chaperonin subunits in eukaryotes (Fares and Wolfe 2003), and many others (Bielawski and Yang 2001; Maston and Ruvolo 2002; Rodriguez-Trelles et al. 2003). Population genetics tests also suggest that positive selection can play an important role in the very early evolutionary histories of duplicated genes (Moore and Purugganan 2003, 2005). As the LRTs of both pairs of positive selection tests were significant for LWS-1, we suggest that LWS-1 in tetraploid cyprinine fishes provides another example that demonstrates the prominent role of positive selection in the evolution of duplicate genes.

However, the selective advantage of LWS-1 is unclear. Amino acid substitutions that shift λmax occur more frequently in the transmembrane domains than in other regions (Yokoyama 2000, 2002), because the transmembrane sites are closer to the chromophore in three dimensions and can easily interact with it. According to the crystal structure of bovine rhodopsin, the positively selected sites of LWS-1, 52L and 112V, are located in transmembranes I and II, respectively. One (164) of the five critical sites for M/LWS is variable, and a new amino acid substitution (S164P) is found. Proline occupied this position in lineage A of LWS-1 except for the pigment of C. auratus, while serine occupied this position in lineage B except for the pigment of P. rabaudi. When only Sinocyclocheilus species are examined, P and S can be considered as lineage-specific amino acids for lineage B and A, respectively. Changes in the polarity of amino acids in the chromophore-binding pocket of opsins can affect the distribution of electrons in the π-electron system of the chromophore, thus producing a diversity of λmax values (Honig et al. 1976). Because the side chain of proline is non-polar, while that of serine is polar, we suggest that the S164P substitution is likely to cause a λmax shift, although the magnitude and direction are not clear.

Divergent selection tests revealed that amino acids at only two positions (52 and 112) accounted for the potential functional divergence between lineages A and B of LWS-1. Position 52 was occupied by either leucine or valine in all LWS-1 sequences of tetraploid fishes examined in this study. For lineage B of LWS-1, the position was occupied by leucine except for S. qiubeinsis, while valine occurred in most sequences of lineage A. Position 112 is more variable than positions 52 and 177: five amino acids (valine, phenylalanine, tyrosine, methionine, and isoleucine) occur at this position. Moreover, 52L and 112V are also under strong positive selection, which suggests that possible divergence of LWS-1 is strongly influenced by positive selection causing accelerated rates of substitution in a substantial proportion of pocket-forming residues. In the study of molecular analyses of rhodopsins in vertebrates (Yokoyama et al. 2008), none of the sites identified by Bayesian methods was among the sites known to be involved in adaptive changes in opsin sensitivity, so we suggest that the final proof of statistical results rests on experimental verification.

Most of the sites of SWS2 pigments identified to have evolved under divergent selection have not been verified by experimental evidence yet. Only one site (56V, which corresponds to site 49 of bovine rhodopsin), located in the TM I region, has been investigated by functional study. Amino acid substitutions at this site can shift the λmax of RH2 and SWS1 pigments, but this site is not in the list of sites that can contribute to the λmax shift of SWS2 (Yokoyama 2008). Interestingly, this position was invariably occupied by V in the B copy of SWS2, except for the B copy of C. auratus (occupied by I). It is known that the origin of the violet pigment of some avian species is caused by amino acid replacements F49V/F86S/L116V/S118A (Shi and Yokoyama 2003). As the amino acid substitution I49V has not been reported before, this substitution may be useful for site-directed mutagenesis in the future.

Three of 12 critical sites were variable for SWS2 pigments in this study. The sites 44 and 118 were singleton sites: M44T occurred in S. furcodorsalis, and T118S occurred in S. jii. It has been demonstrated by experimental evidence that M44T contributes to the increase of λmax of the SWS2 pigment of bluefin killifish (Yokoyama et al. 2007), so we suspect that M44T will cause a red-shift of λmax of the SWS2 pigment of S. furcodorsalis, although the advantage of this substitution for this species is not clear. Although site 118 is involved in spectral tuning of SWS2 pigments of some vertebrates, the substitution T118S has not been reported before for SWS2 pigments. However, we noted that S118T occurred two times in the evolution of SWS1 pigments of vertebrates, and contributed to the red-shift of SWS1 of clawed frog and human together with other amino acid substitutions (Yokoyama 2008). So we hypothesize that T118S contributes to a blue-shift of SWS2 of S. jii, but the exact effect of T118S on λmax needs to be verified by a functional study. Site 94 was occupied by S or L in tetraploid cyprinid fish. Surprisingly, we found that L only occurred in the SWS2 pigments of Sinocyclocheilus species from clades A and B (Li et al. 2008b) when S and L were mapped onto the phylogenetic tree of SWS2 (Fig. 2b). So, the most parsimonious explanation for this pattern is that S94L occurred in the SWS2 pigment of the MRCA of Sinocyclocheilus species of clades A and B (Fig. 2b). Site 94 is only involved in spectral tuning of vertebrate SWS2 pigments examined to date (Yokoyama 2008), but the substitution S94L is a new substitution and may be useful for site-directed mutagenesis in the future.
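The Discussion moves between two numbering schemes: positions in the zebrafish reference sequences (e.g., 52L, 112V, 56V) and the corresponding bovine rhodopsin positions (e.g., 164, 44, 94, 118). For the positions paired in Tables 4 and 5, and for 56V versus bovine site 49, the two differ by a constant: 13 for LWS-1 and 7 for SWS2. The sketch below encodes only that observation; the names `OFFSETS`, `to_bovine`, and `to_zebrafish` are hypothetical, and a constant offset is an assumption that holds for the listed sites but need not hold across alignment gaps elsewhere in the proteins.

```python
# Offsets inferred from the paired positions reported in the tables:
#   LWS-1: 52 (39), 112 (99), 177 (164)  -> zebrafish minus bovine = 13
#   SWS2:  51 (44), 101 (94), 125 (118)  -> zebrafish minus bovine = 7
OFFSETS = {"LWS-1": 13, "SWS2": 7}


def to_bovine(gene: str, zebrafish_pos: int) -> int:
    """Convert a zebrafish-numbered site to bovine rhodopsin numbering."""
    return zebrafish_pos - OFFSETS[gene]


def to_zebrafish(gene: str, bovine_pos: int) -> int:
    """Convert a bovine rhodopsin site to zebrafish reference numbering."""
    return bovine_pos + OFFSETS[gene]


# Sites discussed in the text, expressed in both schemes.
lws_selected_bovine = [to_bovine("LWS-1", p) for p in (52, 112)]
sws2_site_56_bovine = to_bovine("SWS2", 56)
s164p_zebrafish = to_zebrafish("LWS-1", 164)
```

Under this assumption, S164P in bovine numbering is the same site as position 177 in the zebrafish-numbered comparison with positions 52 and 112.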

Conclusion

To our knowledge, this study represents the first molecular evolutionary study on color vision genes to include so many cavefishes. Our study focused on cyprinid fishes, a large group of primary freshwater fishes distributed throughout North America, Africa, and Eurasia. Taken together, our data suggest that duplicate color vision genes of tetraploid cyprinid fishes have asymmetric evolutionary rates, which will lead to sequence divergence. The sequence divergence may ultimately lead to functional divergence. Functional divergence can promote their retention in the genome and eventually result in neofunctionalization, often a related function rather than an entirely new function (Zhang 2003). Although some cavefishes are highly adapted to caves and blind, the color vision genes examined remain intact, which further raises the question of the exact function and role of color vision opsins in low-light or aphotic environments (e.g., Perry et al. 2007; Tan et al. 2005). Our results support recent theoretical and experimental work which indicates that positive selection can play a prominent role in the evolutionary dynamics of duplicate genes. Positive selection following gene duplication most likely contributes to the functional divergence (potentially the shift of λmax) of LWS-1 pigments in tetraploid cyprinid fishes, although the reasons for these changes remain an open issue. Unfortunately, molecular selection alone cannot be used as the unique criterion to infer protein function, and the true nature of duplicated color vision genes needs to be determined experimentally and independently. Some new amino acid substitutions were found at key sites that play an important functional role in opsins, and these substitutions might have evolved to allow for functional specificity. Therefore, these sites may represent interesting target sites for future mutagenesis experiments that will improve our understanding of the structure–function relationships in color vision genes. It is hoped that the present work will inform future functional and molecular studies that might clarify the exact functions of opsins.

Acknowledgments The authors would like to thank the members of He's lab for their assistance. Zuogang Peng and Simon Y. W. Ho are gratefully acknowledged for critically reading this manuscript. We sincerely thank the two anonymous referees and the Associate Editor for their insightful comments on the earlier versions of this manuscript. This research has been supported by grants from the National Natural Science Foundation of China (NSFC) 2007CB411600 and 30530120 to S. H.


References

Betrán E, Long M (2003) Dntf-2r, a young Drosophila retroposed gene with specific male expression under positive Darwinian selection. Genetics 164:977–988
Bielawski JP, Yang Z (2001) Positive and negative selection in the DAZ gene family. Mol Biol Evol 18:523–529
Bielawski JP, Yang Z (2004) A maximum likelihood method for detecting functional divergence at individual codon sites, with application to gene family evolution. J Mol Evol 59:121–132
Bowmaker JK (1995) The visual pigments of fish. Prog Retin Eye Res 15:1–31
Charlat S, Hornett EA, Fullard JH, Davies N, Roderick GK, Wedell N, Hurst GD (2007) Extraordinary flux in sex ratio. Science 317:214
Chinen A, Hamaoka T, Yamada Y, Kawamura S (2003) Gene duplication and spectral diversification of cone visual pigments of zebrafish. Genetics 163:663–675
Crandall KA, Hillis DM (1997) Rhodopsin evolution in the dark. Nature 387:667–668
Crescitelli F (1972) The visual cells and visual pigments of the vertebrate eye. In: Dartnall HJA (ed) Handbook of sensory physiology, vol VII/1. Springer-Verlag, Berlin, pp 245–363
Crow KD, Stadler PF, Lynch VJ, Amemiya C, Wagner GP (2006) The "fish-specific" Hox cluster duplication is coincident with the origin of teleosts. Mol Biol Evol 23:121–136
Fares MA, Wolfe KH (2003) Positive selection and subfunctionalization of duplicated CCT chaperonin subunits. Mol Biol Evol 20:1588–1597
Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J (1999) Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531–1545
Hoegg S, Brinkmann H, Taylor JS, Meyer A (2004) Phylogenetic timing of the fish-specific genome duplication correlates with the diversification of teleost fish. J Mol Evol 59:190–203
Honig B, Greenberg AD, Dinur U, Ebrey TG (1976) Visual-pigment spectra: implications of the protonation of the retinal Schiff base. Biochemistry 15:4593–4599
Hunt DM, Fitzgibbon J, Slobodyanyuk SJ, Bowmaker JK (1996) Spectral tuning and molecular evolution of rod visual pigments in the species flock of cottoid fish in Lake Baikal. Vision Res 36:1217–1224
Kito Y, Suzuki T, Azuma M, Sekoguti Y (1968) Absorption spectrum of rhodopsin denatured with acid. Nature 218:955–957
Kochendoerfer GG, Lin SW, Sakmar TP, Mathies RA (1999) How color visual pigments are tuned. Trends Biochem Sci 24:300–305
Kumar S, Tamura K, Nei M (2004) MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief Bioinform 5:150–163
Li Z, He S (2009) Relaxed purifying selection of rhodopsin gene within a Chinese endemic cavefish genus Sinocyclocheilus (Pisces: Cypriniformes). Hydrobiologia 624:139–149
Li J, Wang X, Kong X, Zhao K, He S, Mayden RL (2008a) Variation patterns of the mitochondrial 16S rRNA gene with secondary structure constraints and their application to phylogeny of cyprinine fishes (Teleostei: Cypriniformes). Mol Phylogenet Evol 47:472–487
Li Z, Guo B, Li J, He S, Chen Y (2008b) Bayesian mixed models and divergence time estimation of Chinese cavefishes (Cyprinidae: Sinocyclocheilus). Chin Sci Bull 53:2342–2352
Lynch M, Force A (2000) The probability of duplicate gene preservation by subfunctionalization. Genetics 154:459–473
Málaga-Trillo E, Laessing U, Lang DM, Meyer A, Stuermer CA (2002) Evolution of duplicated reggie genes in zebrafish and goldfish. J Mol Evol 54:235–245
Maston GA, Ruvolo M (2002) Chorionic gonadotropin has a recent origin within primates and an evolutionary history of selection. Mol Biol Evol 19:320–335
Mathews S, Burleigh JG, Donoghue MJ (2003) Adaptive evolution in the photosensory domain of phytochrome A in early angiosperms. Mol Biol Evol 20:1087–1097
Moore RC, Purugganan MD (2003) The early stages of duplicate gene evolution. Proc Natl Acad Sci USA 100:15682–15687
Moore RC, Purugganan MD (2005) The evolutionary dynamics of plant duplicate genes. Curr Opin Plant Biol 8:122–128
Nei M, Zhang J, Yokoyama S (1997) Color vision of ancestral organisms of higher primates. Mol Biol Evol 14:611–618
Ohno S (1970) Evolution by gene duplication. Springer-Verlag, New York
Palczewski K, Kumasaka T, Hori T, Behnke CA, Motoshima H, Fox BA, Le Trong I, Teller DC, Okada T, Stenkamp RE, Yamamoto M, Miyano M (2000) Crystal structure of rhodopsin: a G protein-coupled receptor. Science 289:739–745
Perry GH, Martin RD, Verrelli BC (2007) Signatures of functional constraint at aye-aye opsin genes: the potential of adaptive color vision in a nocturnal primate. Mol Biol Evol 24:1963–1970
Posada D, Crandall KA (1998) Modeltest: testing the model of DNA substitution. Bioinformatics 14:817–818
Protas ME, Hersey C, Kochanek D, Zhou Y, Wilkens H, Jeffery WR, Zon LI, Borowsky R, Tabin CJ (2006) Genetic analysis of cavefish reveals molecular convergence in the evolution of albinism. Nat Genet 38:107–111
Rasquin P (1949) The influence of light and darkness on thyroid and pituitary activity of the characin, Astyanax mexicanus and its cave derivates. Bull Am Mus Nat Hist 93:497–532
Rasquin P, Rosenbloom L (1954) Endocrine imbalance and tissue hyperplasia in teleostes maintained in darkness. Bull Am Mus Nat Hist 104:357–426
Rodriguez-Trelles F, Tarrio R, Ayala FJ (2003) Convergent neofunctionalization by positive Darwinian selection after ancient recurrent duplications of the xanthine dehydrogenase gene. Proc Natl Acad Sci USA 100:13413–13417
Ronquist F, Huelsenbeck JP (2003) MRBAYES 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–1574
Sambrook J, Fritsch EF, Maniatis T (1989) Molecular cloning: a laboratory manual, 2nd edn. Cold Spring Harbor Laboratory Press, New York
Seehausen O, Terai Y, Magalhaes IS, Carleton KL, Mrosso HD, Miyagi R, van der Sluijs I, Schneider MV, Maan ME, Tachida H, Imai H, Okada N (2008) Speciation through sensory drive in cichlid fish. Nature 455:620–626
Shi Y, Yokoyama S (2003) Molecular analysis of the evolutionary significance of ultraviolet vision in vertebrates. Proc Natl Acad Sci USA 100:8308–8313
Spady TC, Seehausen O, Loew ER, Jordan RC, Kocher TD, Carleton KL (2005) Adaptive molecular evolution in the opsin genes of rapidly speciating cichlid species. Mol Biol Evol 22:1412–1422
Sugawara T, Terai Y, Okada N (2002) Natural selection of the rhodopsin gene during the adaptive radiation of East African Great Lakes cichlid fishes. Mol Biol Evol 19:1807–1811
Tan Y, Yoder AD, Yamashita N, Li WH (2005) Evidence from opsin genes rejects nocturnality in ancestral primates. Proc Natl Acad Sci USA 102:14712–14716
Taylor JS, Braasch I, Frickey T, Meyer A, Van de Peer Y (2003) Genome duplication, a trait shared by 22000 species of ray-finned fish. Genome Res 13:382–390
Terai Y, Seehausen O, Sasaki T, Takahashi K, Mizoiri S, Sugawara T, Sato T, Watanabe M, Konijnendijk N, Mrosso HD, Tachida H, Imai H, Shichida Y, Okada N (2006) Divergent selection on opsins drives incipient speciation in Lake Victoria cichlids. PLoS Biol 4:e433
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The Clustal X windows interface: flexible strategies for multiple sequences alignment aided by quality analysis tools. Nucleic Acids Res 25:4876–4882
Walls GL (1934) The reptilian retina. Am J Ophthalmol 17:892–915
Walsh JB (1995) How often do duplicated genes evolve new functions? Genetics 139:421–428
Wang FY, Chung WS, Yan HY, Tzeng CS (2008) Adaptive evolution of cone opsin genes in two colorful cyprinids, Opsariichthys pachycephalus and Candidia barbatus. Vision Res 48:1695–1704
Xiao H, Zhang RD, Feng JG, Ou Y, Li W, Chen S, Zan R (2002) Nuclear DNA content and ploidy of seventeen species of fishes in Sinocyclocheilus. Zool Res 23:195–199
Xiao H, Chen S, Liu Z, Zhang R, Li W, Zan R, Zhang Y (2005) Molecular phylogeny of Sinocyclocheilus (Cypriniformes: Cyprinidae) inferred from mitochondrial DNA sequences. Mol Phylogenet Evol 36:67–77
Yang Z (1997) PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci 13:555–556
Yang Z (1998) Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol Biol Evol 15:568–573
Yang Z (2007) PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol 24:1586–1591
Yang Z, Nielsen R, Goldman N, Pedersen AM (2000) Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449
Yang Z, Wong WSW, Nielsen R (2005) Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol 22:1107–1118
Yokoyama S (2000) Molecular evolution of vertebrate visual pigments. Prog Retin Eye Res 19:385–419
Yokoyama S (2002) Molecular evolution of color vision in vertebrates. Gene 300:69–78
Yokoyama S (2008) Evolution of dim-light and color vision pigments. Annu Rev Genomics Hum Genet 9:259–282
Yokoyama S, Radlwimmer FB (2001) The molecular genetics of red and green color vision in vertebrates. Genetics 158:1697–1710
Yokoyama S, Tada T (2000) Adaptive evolution of the African and Indonesian coelacanths to deep-sea environments. Gene 261:35–42
Yokoyama S, Takenaka N (2004) The molecular basis of adaptive evolution of squirrelfish rhodopsins. Mol Biol Evol 21:2071–2078
Yokoyama S, Meany A, Wilkens H, Yokoyama R (1995) Initial mutational steps toward loss of opsin gene function in cavefish. Mol Biol Evol 12:527–532
Yokoyama S, Takenaka N, Blow N (2007) A novel spectral tuning in the short wavelength-sensitive (SWS1 and SWS2) pigments of bluefin killifish (Lucania goodei). Gene 396:196–202
Yokoyama S, Tada T, Zhang H, Britt L (2008) Elucidation of phenotypic adaptations: molecular analyses of dim-light vision proteins in vertebrates. Proc Natl Acad Sci USA 105:13480–13485
Yu XJ, Zhou T, Li YC, Li K, Zhou M (1989) Chromosomes of Chinese fresh-water fishes (in Chinese). Science Press, Beijing
Yue PQ, Chen YY (1998) China red data book of endangered animals: Pisces. Science Press, Beijing
Zhang J (2003) Evolution by gene duplication: an update. Trends Ecol Evol 18:292–298
Zhang J, Rosenberg HF, Nei M (1998) Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc Natl Acad Sci USA 95:3708–3713
Zhang J, Zhang Y, Rosenberg HF (2002) Adaptive evolution of a duplicated pancreatic ribonuclease gene in a leaf-eating monkey. Nat Genet 30:411–415
Zhao YH (2006) An endemic cavefish genus Sinocyclocheilus in China: species diversity, systematics, and zoogeography (Cypriniformes: Cyprinidae). PhD thesis, Institute of Zoology, Chinese Academy of Sciences, Beijing
Zhao Y, Zhang C (2006) Cavefishes: concept, diversity and research progress. Biodivers Sci 14:451–460
