Supplementary Information for
Convergent evolution of tertiary structure in rhodopsin visual proteins from vertebrates and box jellyfish

Authors: Elliot Gerrard1¶, Eshita Mutt2¶, Takashi Nagata3, Mitsumasa Koyanagi3, Tilman Flock2, Elena Lesca4, Gebhard Schertler2,4, Akihisa Terakita3*, Xavier Deupi2* & Robert Lucas1*

1 Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom.
2 Paul Scherrer Institute, 5232 Villigen PSI, Switzerland.
3 Department of Biology and Geosciences, Graduate School of Science, Osaka City University, Osaka 558-8585, Japan.
4 Department of Biology, ETH Zürich, Wolfgang-Pauli-Str. 27, Zürich 8093, Switzerland.
¶ These authors made an equal contribution to this work.
* For correspondence:
[email protected],
[email protected] &
[email protected].
This PDF file includes:
Supplementary text
Figs. S1 to S5
Tables S1 to S3
References
Other supplementary materials for this manuscript include the following:
Additional files S1 to S2
www.pnas.org/cgi/doi/10.1073/pnas.1721333115
Supplementary Information Text
Materials and methods

Spectroscopy
JellyOp and its mutants were expressed and purified as previously described (1). In brief, HEK293S cells were transfected with plasmid DNA by the calcium phosphate method and incubated for 2 days. Expressed proteins were reconstituted by adding an excess of 11-cis-retinal, extracted with 1% n-dodecyl β-D-maltoside (DM) in HEPES buffer (pH 6.5) containing 140 mM NaCl (buffer A), and purified using 1D4-agarose with buffer A containing 0.02% DM. To remove chloride, samples were dialyzed against 50 mM HEPES buffer (pH 7.3) containing 0.02% DM. Absorption spectra were measured with a spectrophotometer (UV2450, Shimadzu) at 4°C.

Cell culture, transfection & supplementation
HEK293 cells (ATCC, CRL-1573) were maintained at 37°C in DMEM supplemented with 10% fetal bovine serum (FBS, Sigma) and penicillin/streptomycin in a 5% CO2 atmosphere.

Expression vectors & mutagenesis
To report JellyOp light-mediated production of cAMP in HEK293 cells, a mammalian expression plasmid, pcDNA5 GloSensor 22F, containing the open reading frame of a GloSensor cAMP reporter (Promega Corp) was used, as described elsewhere (2, 3). A mammalian expression plasmid, pcDNA3, containing the open reading frame for JellyOp (GenBank AB435549) was used as described previously (2, 3). Site-directed mutagenesis (QuikChange Lightning site-directed mutagenesis kit; Agilent Technologies, USA) was carried out using primers designed to alter nucleotide triplets equating to amino acid sites 94, 113, 157 and 181 (bovine rhodopsin numbering system). All plasmids were sequenced by the University of Manchester DNA sequencing facility and verified prior to use.
cAMP assay
Transient transfection of expression vectors into HEK293 cells (GloSensor & relevant JellyOp plasmid, 1:1) was carried out using Lipofectamine 2000 (Thermo Fisher Scientific) for 4-6 h in 6-well plates. After transfection, cells were split into 96-well plates and supplemented with an additional 10 µM 9-cis retinal (Sigma) for a minimum of 16 hours before experimentation (handled in dim red light only from this point). Two hours prior to luminescence measurement, DMEM was replaced with L-15 medium without phenol red (Thermo Fisher Scientific), supplemented with 1% FBS, 10 µM 9-cis retinal and an additional 1 mM beetle luciferin (Promega). GloSensor reporter luminescence was measured using a FLUOstar Optima plate reader (BMG Labtech, Germany) as described previously (2). All light stimuli were generated using a single wavelength-band LED from the pE-4000 CoolLED system (CoolLED); the relevant power and exposure time are indicated in the figure legends. Powers of light stimuli were independently measured using a spectroradiometer (SpectroCAL MKII, Cambridge Research Systems).

Determination of spectral sensitivity
To determine spectral sensitivity, irradiance response curves (IRCs) were generated using 2 s exposures to light of 8 near-monochromatic wavelength bands over a minimum 3-order range of light intensity. The maximum post-flash luminescence (raw luminescence units; RLU) value was normalized to the average baseline luminescence of that particular well (10 cycles) to give a cAMP-induction value. These IRCs were fit with standard sigmoidal dose-response curves in GraphPad Prism, with only the bottom of the curve constrained (to 1x). The EC50 values obtained from this process were converted into relative sensitivity for each replicate by dividing every EC50 by the lowest EC50 in that replicate. The average relative sensitivity data were fit with Govardovskii nomogram templates (as described (4), and as applied to relative sensitivity data (5)).
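The per-well normalization and the per-replicate EC50 conversion described in this section can be sketched in a few lines of Python. This is a minimal illustration only: the function names and all numerical values are hypothetical, and the curve fitting itself (done in GraphPad Prism) is not reproduced.

```python
from statistics import mean


def camp_induction(baseline_rlu, post_flash_rlu):
    """Fold induction for one well: maximum post-flash luminescence
    divided by the mean baseline luminescence of the same well."""
    return max(post_flash_rlu) / mean(baseline_rlu)


def relative_to_lowest_ec50(ec50_by_wavelength):
    """Divide every EC50 in one replicate by the lowest EC50 of that replicate.

    The result is >= 1 by construction. Taking the reciprocal to place the
    values on a 0-1 scale would be an assumption; the text does not state it.
    """
    lowest = min(ec50_by_wavelength.values())
    return {wl: ec50 / lowest for wl, ec50 in ec50_by_wavelength.items()}


# Hypothetical example values (not data from this study).
baseline = [100.0] * 10               # 10 baseline cycles of one well, in RLU
post_flash = [120.0, 450.0, 300.0]    # readings after the light flash, in RLU
print(camp_induction(baseline, post_flash))

replicate = {470: 2.0e12, 500: 1.0e12, 595: 5.0e13}  # wavelength (nm) -> EC50
print(relative_to_lowest_ec50(replicate))
```

Keeping the two steps as separate functions mirrors the order in the text: induction values feed the dose-response fit, and only the fitted EC50 values are compared across wavelengths within a replicate.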
The Govardovskii formula was modified to better fit relative sensitivity in the UV region (as discussed in the original description of the formula (4)) by reducing the contribution of the β-band to overall relative sensitivity to 30% of that of the α-band. For fitting of composite nomograms (Figs 2-4), the λmax of the UV pigment was held at 380 nm (matching that of the deprotonated Schiff base (6)) and that of the visible pigment was determined according to the best fit to the long-wavelength limb of the action spectrum.

Structural modeling
A 3D model of JellyOp was built by homology modeling using the crystal structure of squid rhodopsin (PDB id: 2Z73 (7)) as a template. First, the sequences of JellyOp (UniProt id: B6F0Y5_CARRA) and squid rhodopsin (~25% sequence identity) were aligned using MUSCLE (8). This initial alignment was manually refined using Chimera (9) to adjust the gaps in the loop regions. Using this alignment and the squid rhodopsin template, a 3D model of JellyOp was built using Modeller v9.14 (10). The putative cysteine bridge between Cys79[3.25] in transmembrane helix 3 and Cys158[ecl2.50] in the second extracellular loop was explicitly defined during model building. All models were subjected to 300 iterations of variable target function method optimization and thorough molecular dynamics and simulated annealing optimization, and scored using the discrete
optimized protein energy potential. The 20 best-scoring models were analyzed visually, and a suitable model (in terms of low score and loop structure) was selected for the next step. We added to this preliminary model the crystallographic waters resolved within the transmembrane bundle of squid rhodopsin. We then used PROPKA (11) at pH 7.0, as implemented in PDB2PQR (12), to determine the protonation states of titratable groups, add hydrogens to the structure, and optimize the hydrogen bond network. The model was embedded in a pre-equilibrated lipid bilayer consisting of 116 molecules of 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC), which was hydrated with a layer of approx. 35 Å of water molecules on each side. Sodium and chloride ions were added to a concentration of 150 mM NaCl, and then additional ions were added to achieve charge neutrality. The dimensions of the final tetragonal box were approximately 70 x 70 x 100 Å, containing a total of ~50,000 atoms. This system was equilibrated as follows: first, a short (0.5 ns) simulation was performed in which only the lipid tails were allowed to move, in order to induce the appropriate disorder of a fluid-like bilayer. Then, the geometry of the entire system was optimized by 1000 steps of energy minimization, followed by two equilibration steps, one with the protein constrained (0.5 ns) and one without constraints (0.5 ns). This equilibrated system was used as the starting point for unrestrained molecular dynamics simulations. All simulations were carried out with NAMD 2.10 (13) with the c36 CHARMM force field (14) in the NPT ensemble, using Langevin dynamics to control the temperature at 300 K, and with a time step of 2 fs, while constraining all bonds between hydrogen and heavy atoms.

A 3D model of the common ancestor A was built with Modeller v9.14 (10) by threading its sequence onto the squid rhodopsin template (PDB id: 2Z73 (7)) following the alignment obtained in the PhyloBot web server (see below).
The side chain rotamers in the binding pocket of the model were subsequently optimized using the backbone-dependent rotamer library (15) implemented in the PyMOL Molecular Graphics System, Version 2.0 (Schrödinger, LLC).

Structure-based sequence conservation analysis
Using the crystal structure of bovine rhodopsin (PDB id: 1GZM (16)), we selected all residues within 10 Å of the Schiff base (LYS296:NZ) in the retinal binding pocket as potential sites for a counterion. We then measured the sequence variability at these positions in an alignment of ~900 sequences from a recent large-scale analysis of opsin evolution (17), from which the 26 outgroups had been removed.

Phylogenetic reconstruction of ancestral opsin sequences
The reconstruction of ancestral sequences was performed using the PhyloBot web server (18) (phylobot.com). For this, we selected a subset of 141 opsin sequences representing the four defined opsin clades. Then, in order to enrich the presence of different counterion locations in certain phyla, we added several opsins from hydra (Hydra magnipapillata), jellyfish (Cladonema radiatum) and box jellyfish (Carybdea rastonii, Tripedalia cystophora) (19), vertebrate visual rhodopsins (rhodopsin, rhodopsin type 2, long-wavelength sensitive opsin, medium-wavelength sensitive opsin, and short-wavelength sensitive opsin), and a reference set of eight biochemically and functionally characterized opsins from different families (jumping spider opsin, squid opsin,
amphioxus melanopsin, human melanopsin, mouse melanopsin, lamprey parapinopsin, mosquito opsin 3, and monkey green opsin) (141 sequences in total). Melatonin 1 receptors from human, sheep, fish & mouse were added as outgroups. Within the PhyloBot server, these sequences were aligned with MUSCLE (8) and MSAProbs (20) with default settings; the program ProtTest (21) recommended the “PROTGAMMALG” [LG + GAMMA] model of evolution (22); maximum likelihood trees were built using RAxML (23); PhyML was used to assess the statistical support of the data for the calculated trees (24); and, finally, the ancestral opsin sequences at various internal nodes were reconstructed using the software packages PAML (25) and Lazarus (26). Although many Bayesian-sampled alternate sequences were available, the maximum likelihood (ML) sequence was selected as the best reconstructed sequence. The reconstructed ancestors also possessed high posterior probabilities for many sites. The sequences of these ancestors were then aligned to the reference set for further analysis. The best phylogenetic tree (msaprobs, PROTGAMMALG) had an alpha parameter for the GAMMA distribution of 0.995 (the best, and lower than for the other model trees).

References for Supplementary Materials and Methods
1. Terakita, A., Yamashita, T. & Shichida, Y. Highly conserved glutamic acid in the extracellular IV-V loop in rhodopsins acts as the counterion in retinochrome, a member of the rhodopsin family. Proc. Natl. Acad. Sci. U.S.A. 97, 14263–14267 (2000).
2. Bailes, H. J., Zhuang, L.-Y. & Lucas, R. J. Reproducible and Sustained Regulation of Gαs Signalling Using a Metazoan Opsin as an Optogenetic Tool. PLoS ONE 7, e30774 (2012).
3. Bailes, H. J. et al. Optogenetic interrogation reveals separable G-protein-dependent and independent signalling linking G-protein-coupled receptors to the circadian oscillator. BMC Biol. 15, 40 (2017).
4. Govardovskii, V. I., Fyhrquist, N., Reuter, T., Kuzmin, D. G. & Donner, K. In search of the visual pigment template. Vis. Neurosci. 17, 509–528 (2000).
5. Bailes, H. J. & Lucas, R. J. Human melanopsin forms a pigment maximally sensitive to blue light (λmax ≈ 479 nm) supporting activation of G(q/11) and G(i/o) signalling cascades. Proc. Biol. Sci. 280, 20122987 (2013).
6. Sakmar, T. P., Franke, R. R. & Khorana, H. G. Glutamic acid-113 serves as the retinylidene Schiff base counterion in bovine rhodopsin. Proc. Natl. Acad. Sci. U.S.A. 86, 8309–8313 (1989).
7. Murakami, M. & Kouyama, T. Crystal structure of squid rhodopsin. Nature 453, 363–367 (2008).
8. Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).
9. Pettersen, E. F. et al. UCSF Chimera--a visualization system for exploratory research and analysis. J Comput Chem 25, 1605–1612 (2004).
10. Webb, B. & Sali, A. Comparative Protein Structure Modeling Using MODELLER. Curr Protoc Bioinformatics 54, 5.6.1–5.6.37 (2016).
11. Olsson, M. H. M., Søndergaard, C. R., Rostkowski, M. & Jensen, J. H. PROPKA3: Consistent Treatment of Internal and Surface Residues in Empirical pKa Predictions. J Chem Theory Comput 7, 525–537 (2011).
12. Dolinsky, T. J. et al. PDB2PQR: expanding and upgrading automated preparation of biomolecular structures for molecular simulations. Nucleic Acids Res. 35, W522–W525 (2007).
13. Phillips, J. C. et al. Scalable molecular dynamics with NAMD. J Comput Chem 26, 1781–1802 (2005).
14. Best, R. B. et al. Optimization of the additive CHARMM all-atom protein force field targeting improved sampling of the backbone φ, ψ and side-chain χ(1) and χ(2) dihedral angles. J Chem Theory Comput 8, 3257–3273 (2012).
15. Shapovalov, M. V. & Dunbrack, R. L. A smoothed backbone-dependent rotamer library for proteins derived from adaptive kernel density estimates and regressions. Structure 19, 844–858 (2011).
16. Li, J., Edwards, P. C., Burghammer, M., Villa, C. & Schertler, G. F. X. Structure of bovine rhodopsin in a trigonal crystal form. Journal of Molecular Biology 343, 1409–1438 (2004).
17. Porter, M. L. et al. Shedding new light on opsin evolution. Proc. Biol. Sci. 279, 3–14 (2012).
18. Hanson-Smith, V. & Johnson, A. PhyloBot: A Web Portal for Automated Phylogenetics, Ancestral Sequence Reconstruction, and Exploration of Mutational Trajectories. PLoS Comput. Biol. 12, e1004976 (2016).
19. Liegertová, M. et al. Cubozoan genome illuminates functional diversification of opsins and photoreceptor evolution. Sci Rep 5, 11885 (2015).
20. Liu, Y., Schmidt, B. & Maskell, D. L. MSAProbs: multiple sequence alignment based on pair hidden Markov models and partition function posterior probabilities. Bioinformatics 26, 1958–1964 (2010).
21. Abascal, F., Zardoya, R. & Posada, D. ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21, 2104–2105 (2005).
22. Le, S. Q. & Gascuel, O. An improved general amino acid replacement matrix. Mol. Biol. Evol. 25, 1307–1320 (2008).
23. Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).
24. Guindon, S. et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010).
25. Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).
26. Hanson-Smith, V., Kolaczkowski, B. & Thornton, J. W. Robustness of ancestral sequence reconstruction to phylogenetic uncertainty. Mol. Biol. Evol. 27, 1988–1999 (2010).
Access to the ancestral reconstruction displayed in Fig. 3A: The online PhyloBot ancestral sequence reconstruction used for Fig. 3A can be accessed at the following web address: http://www.phylobot.com/468187218/ancestors. To access the tree and ancestral sequences used in Fig. 3A, select Alignment method: msaprobs and Markov Model: PROTGAMMALG.
References to the experimentally determined counterions listed in Figure 3A

RGR/Retinochrome {181}
Terakita, A., Yamashita, T. & Shichida, Y. Highly conserved glutamic acid in the extracellular IV-V loop in rhodopsins acts as the counterion in retinochrome, a member of the rhodopsin family. Proc. Natl. Acad. Sci. U.S.A. 97, 14263–14267 (2000).

Peropsin, Go-coupled opsin & Parapinopsin {181}
Terakita, A. et al. Counterion displacement in the molecular evolution of the rhodopsin family. Nat. Struct. Mol. Biol. 11, 284–289 (2004).

Parietopsin {181}
Sakai, K. & Shichida, Y. et al. Photochemical nature of parietopsin. Biochemistry 51, 1933–1941 (2012). doi: 10.1021/bi2018283.

Rod opsin {113}
Sakmar, T. P., Franke, R. R. & Khorana, H. G. Glutamic acid-113 serves as the retinylidene Schiff base counterion in bovine rhodopsin. Proc. Natl. Acad. Sci. U.S.A. 86, 8309–8313 (1989).
[Fig. S1 alignment. Segment ranges follow bovine rhodopsin numbering. Marked positions: 94 [2.61x60] in TM2 (cubozoan counterion), 113 [3.28] in TM3 (vertebrate counterion), 181 in ECL2 (ancestral counterion).]

Sequence (accession)                                        TM2 (90-98)   TM3 (109-119)   ECL2 (177-187)
reference
Bovine rhodopsin (P02699)                                   GFTTTLYTS     GCNLEG FFATL    RYIPEGMQC-SC
Jumping spider RH1 (AB251846)                               MMPTMSINC     MCELYG MIGSL    RYVPEGSMT-SC
Squid opsin (P31356)                                        GFPLMTISC     ACKVYG FIGGI    AYTLEGVLC-NC
cnidarians (cubozoans first)
Box jellyfish opsin (AB435549)                              GYPLEVFTV     LCQVAG FFITA    NYDVEGDGM-RC
C-like Opsin9 Tripedalia cystophora (A0A059NTG2_TRICY)      GYTVEIVTT     ACQLVG FVVTF    NYALESVKV-RC
C-like Opsin5 Tripedalia cystophora (A0A059NTD7_TRICY)      GYPLEVISV     ACAGTA FMVTW    GYANEGLA--RC
C-like Opsin13 Tripedalia cystophora (A0A059NTG7_TRICY)     GYPLEVFTV     LCQVAA FFITA    NYDVEGDGM-RC
PcopC Podocoryna-carnea (A9CR60_PODCA)                      GYPIELTTN     LCIIAG YSTTT    EYAPELGSNKRC
CropK2 Cladonema radiatum (A9CR51_9CNID)                    GYAIELYGR     ICQMAG FSVTF    AYKREPIDKYRC
CropK1 Cladonema radiatum (A9CR48_9CNID)                    GYAIELYGR     LCQMAG FSVTF    AYKREPIDKYRC
CropJ Cladonema radiatum (A9CR54_9CNID)                     GYTTELYGH     LCRASG FAVTF    GYEREPHAPHRC
CropN1 Cladonema radiatum (A9CR43_9CNID)                    GSLWEAYA      LCKMAG FGITF    GYFREPIHTFRC
CropF Cladonema radiatum (A9CR39_9CNID)                     GFIWEAYG      LCKMAG FGTTF    GYIREKLHTYRC
CropI Cladonema radiatum (A9CR37_9CNID)                     GFVWE---      LCKFAA YGATF    EYKREAVHTYRC
CropH Cladonema radiatum (A9CR36_9CNID)                     GFVWEGY-      LCKIAG FGATF    EYMREAVYTYRC
CropD Cladonema radiatum (A9CR34_9CNID)                     GFAWE---      LCKFAG FGATF    EYMREAVHSYRC
CropE Cladonema radiatum (A9CR32_9CNID)                     GFVWEGYG      LCKMAG FGSAF    VYLREEVHTYRC
CropL Cladonema radiatum (A9CR38_9CNID)                     GIIWEGYG      LCKMAG FGATF    AYMREPVHTYRC
CropC Cladonema radiatum (A9CR30_9CNID)                     GFIWEGYG      LCKMAG FGATF    EYMREANHSYRC
CropM Cladonema radiatum (A9CR26_9CNID)                     AYPLELNNI     SCVVSA VMVYG    NYQKETDDSHRC
CropO Cladonema radiatum (A9CR28_9CNID)                     AYPLELSSL     ECLMGA ILVFG    SYQKEMKPSHRC
Fig. S1. Section of a sequence alignment of cnidarian opsins showing sections of the second (TM2) and third (TM3) transmembrane helices, and of the second extracellular loop (ECL2). The corresponding regions in bovine rhodopsin, jumping spider rhodopsin 1 and squid rhodopsin are shown at the top as reference. The JellyOp counterion position, E94, is conserved in opsin sequences from multiple species of cnidarians, including T. cystophora, C. radiatum and P. carnea.
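The conservation noted in this caption can be checked by tallying the residue at position 94 across the TM2 segments printed in Fig. S1. The Python sketch below uses only the rows whose segment boundaries are unambiguous in the figure text. It assumes that the segments are gapless, that each starts at bovine position 90, and that the rows follow the order of the species list; the helper names are hypothetical.

```python
from collections import Counter

TM2_START = 90  # bovine rhodopsin number of the first residue in each TM2 segment

# Reference rows of Fig. S1.
REFERENCES = {
    "Bovine rhodopsin": "GFTTTLYTS",
    "Jumping spider RH1": "MMPTMSINC",
    "Squid opsin": "GFPLMTISC",
}

# Subset of cnidarian rows (row-to-name assignment assumed from the list order).
CNIDARIANS = {
    "Box jellyfish opsin": "GYPLEVFTV",
    "C-like Opsin9 (T. cystophora)": "GYTVEIVTT",
    "C-like Opsin5 (T. cystophora)": "GYPLEVISV",
    "C-like Opsin13 (T. cystophora)": "GYPLEVFTV",
    "PcopC (P. carnea)": "GYPIELTTN",
    "CropK2 (C. radiatum)": "GYAIELYGR",
    "CropK1 (C. radiatum)": "GYAIELYGR",
    "CropJ (C. radiatum)": "GYTTELYGH",
    "CropO (C. radiatum)": "AYPLELSSL",
}


def residue_at(segment, position, start=TM2_START):
    """Return the residue at a bovine-numbered position of a gapless segment."""
    return segment[position - start]


def tally(segments, position):
    """Count which residues occupy one position across a set of segments."""
    return Counter(residue_at(seg, position) for seg in segments.values())


print(tally(CNIDARIANS, 94))   # residue counts at position 94 in the cnidarian rows
print(tally(REFERENCES, 94))   # the same position in the three reference opsins
```

Indexing by bovine position rather than by string offset keeps the lookup aligned with the numbering scheme used throughout the text.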
[Fig. S2 plots: difference absorbance (Diff. Abs.) against Wavelength (nm), 400-600; panels: Wild Type (pH 6.5), E181Q (pH 6.5), E94Q (pH 5.3 and pH 6.5)]
Fig. S2. Dark-minus-light difference spectra with crude extracts of HEK293 cells expressing wild-type JellyOp (A) and the E181Q (B) and E94Q mutants. The positive peaks observed with wild type and E181Q (arrowheads) reflect absorption of the dark state with the Schiff base protonated. The difference spectra of the E94Q mutant did not exhibit any obvious peak at neutral or acidic pH conditions, demonstrating that no opsin expression can be observed by this method. Pigments were constituted and extracted with dodecyl β-D-maltoside as described in the text. The extracted samples were diluted twice with buffer A and measured using a spectrophotometer. Samples were illuminated with green light (540 nm).
Fig. S3. A, In our three-dimensional structural model of JellyOp, R186 sits between the two acidic residues E181 and E94, near the protonated Schiff base (PSB). B, Schematic of the proposed ionic interactions within the PSB/counterion system. C-D, Molecular dynamics simulations indicate that E181 primarily interacts with R186 (D, red trace), rather than with the PSB (C, red trace) or E94. In contrast, the interaction between E94 and the PSB (C, black trace) remains stable throughout the simulation, while E94 still interacts transiently with R186 (D, black trace).
[Fig. S4 plot: Relative Sensitivity (log scale, 0.01 to 1) against λ (nm), 450-600; panel label: E94Q, 470-595nm responses only; fitted nomogram: 504 nm λmax]
Fig. S4. Relative sensitivity of the response induced only by visible light (470-595nm) for the E94Q action spectrum. These responses are fit with the 504 nm λmax nomogram (dotted line, R2: 0.64, n=6 experiments); data (circles) represent mean ± s.e.m.

[Fig. S5 plot: Relative Sensitivity (log scale, 0.01 to 1) against λ (nm), 400-600; panel label: E94Q, E181Q & R157V; fitted double nomogram: 380 nm (60%) and 482 nm (40%) λmax]
Fig. S5. Relative sensitivity profile of the E94Q/E181Q/R186V triple mutant of JellyOp. We were unable to obtain an accurate IRC at 595nm due to poor overall cAMP induction by this mutant. The data (circles) are fit with a double nomogram; the λmax values and their relative weighting are indicated as superscripts. Data represent mean ± s.e.m., n=6 experiments.
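A double nomogram of the kind fitted in Fig. S5 can be sketched in Python as a weighted sum of two visual pigment templates. This is an illustration under stated assumptions, not the fitting procedure used for the figure: the template constants are those published for A1 pigments by Govardovskii et al. (4) and should be checked against that paper, the β-band amplitude is left as a parameter because the modified value used in this study is not reproduced here, and the 380 nm (60%) and 482 nm (40%) components are read from the figure label.

```python
import math


def alpha_band(wavelength_nm, lambda_max_nm):
    """Alpha-band template for A1 pigments (constants assumed from ref. 4)."""
    x = lambda_max_nm / wavelength_nm
    a = 0.8795 + 0.0459 * math.exp(-((lambda_max_nm - 300.0) ** 2) / 11940.0)
    return 1.0 / (
        math.exp(69.7 * (a - x))
        + math.exp(28.0 * (0.922 - x))
        + math.exp(-14.9 * (1.104 - x))
        + 0.674
    )


def beta_band(wavelength_nm, lambda_max_nm, amplitude=0.26):
    """Gaussian beta-band; 0.26 is the published amplitude, not this study's value."""
    peak = 189.0 + 0.315 * lambda_max_nm
    width = -40.5 + 0.195 * lambda_max_nm
    return amplitude * math.exp(-(((wavelength_nm - peak) / width) ** 2))


def nomogram(wavelength_nm, lambda_max_nm, beta_amplitude=0.26):
    """Single-pigment template: alpha-band plus beta-band."""
    return alpha_band(wavelength_nm, lambda_max_nm) + beta_band(
        wavelength_nm, lambda_max_nm, beta_amplitude
    )


def double_nomogram(wavelength_nm, components, beta_amplitude=0.26):
    """Weighted sum of templates; components is a list of (lambda_max_nm, weight)."""
    return sum(
        weight * nomogram(wavelength_nm, lmax, beta_amplitude)
        for lmax, weight in components
    )


# Components as read from the Fig. S5 label (weights are relative contributions).
FIG_S5_COMPONENTS = [(380.0, 0.6), (482.0, 0.4)]

for wl in (400, 450, 500, 550, 600):
    print(wl, double_nomogram(wl, FIG_S5_COMPONENTS))
```

Passing the components as a list keeps the same function usable for a single-pigment fit (one entry with weight 1) and for the composite fits described for Figs 2-4.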
Table S1. Equivalences between different residue numbering schemes for JellyOp residues. Throughout the manuscript, positions in the sequence of JellyOp (left) are referred to using the equivalent positions in the sequence of bovine rhodopsin (middle). When appropriate, these residues are also referred to using the general GPCRdb numbering scheme in square brackets, which allows comparison to other class A GPCRs.
Residue numbering schemes
JellyOp    Bovine rhodopsin    B&W/GPCRDB47
E63        94                  [2.61x60]
A82        113                 [3x28]
E152       181                 -
R157       186                 -
Table S2. List of opsin sequences used for ancestral sequence reconstruction and phylogeny in Fig. 3A.
OpsinID.Species                                         Sequence Accession
Acropsin1Coral                                          L0ATA4_ACRPL
Acropsin2Coral                                          L0AUA8_ACRPL
CRUST.NeooerRh3                                         ABG37009.1
Enceph.danRer                                           NP_001104634.1
Enceph.galGal                                           BAV92607.1
Go-coupled Amphioxus-op1                                AB050606 [B. belcheri]
Go-coupled Rho scallop-op2                              AB006455 [M. yessoensis]
CropC.Anthomedusa                                       BAF95829.1
CropD.Anthomedusa                                       BAF95831.1
CropE.Anthomedusa                                       BAF95830.1
CropF.Anthomedusa                                       BAF95835.1
CropG1.Anthomedusa                                      AB332427
CropH.Anthomedusa                                       BAF95832.1
CropI.Anthomedusa                                       BAF95833.1
CropJ.Anthomedusa                                       BAF95842.1
CropK1.Anthomedusa                                      BAF95840.1
CropK2.Anthomedusa                                      BAF95841.1
CropL.Anthomedusa                                       BAF95834.1
CropN1.Anthomedusa                                      A9CR43_9CNID: C. radiatum
CropM.Anthomedusa                                       AB332418
CropB1.Anthomedusa                                      AB332416
CropB4.Anthomedusa                                      AB332417
CropO.Anthomedusa                                       AB332419
HmopA1Hydra                                             ACZU01000679
HmopA2Hydra                                             ACZU01004988
HmopA4Hydra                                             ACZU01004994
HmopB1Hydra                                             ACZU01027146
HmopB2Hydra                                             ACZU01051946
HmopB3Hydra                                             ACZU01043783
HmopB4Hydra                                             ACZU01065240
HmopB5Hydra                                             ACZU01078635
HmopC1Hydra                                             ACZU01005587
HmopC3Hydra                                             ACZU01005579
HmopD1Hydra                                             ACZU01091217
HmopD2Hydra                                             ACZU01027487
HmopD3Hydra                                             ACZU01075622
HmopD4Hydra                                             ACZU01076129
HmopD5Hydra                                             ACZU01038272
HmopD7Hydra                                             ACZU01040368
HmopE1Hydra.CNOPe1_hydMag_XP_012561673.1(PPIN-like)     ACZU01068872
HmopE2Hydra                                             ACZU01085554
HmopE3Hydra                                             ACZU01057385
HmopE4Hydra                                             ACZU01037497
HmopE5Hydra                                             ACZU01037501
HmopE6Hydra                                             ACZU01013912
HmopF1Hydra.CNOPf1_hydMag_XP_012561008.1                ACZU01095006
HmopF2Hydra                                             ACZU01063269
HmopF3Hydra                                             ACZU01063264
HmopF4Hydra                                             ACZU01049000
HmopG1Hydra                                             ACZU01024987
HmopG2Hydra                                             ACZU01083619
HmopG3Hydra                                             ACZU01007423
HmopG4Hydra                                             ACZU01103264
HmopG5Hydra                                             ACZU01014309
HmopG6Hydra                                             ACZU01086515
HmopH1Hydra.CNOPh1_hydMag                               ACZU01018050
HmopH2Hydra                                             ACZU01085207
HmopH3Hydra                                             ACZU01011197
HmopH4Hydra                                             ACZU01090363
JellyOp.Carybdea rastonii                               B6F0Y5_CARRA
LMS2.hasAda                                             BAG14331.1
LMS.limPol                                              AAA02498.1
LMS.schGre                                              Q94741
LWS.galGal                                              CAA40727.1
LWS.homSap                                              NP_000504.1
LWS.letJap                                              BAD17958.1
LWS.monDom                                              NP_001138553.1
LWS.Monkey_Green                                        NP_001270138.1
MEL1.bosTau                                             NP_001179328.1
MEL1.danRer                                             NP_001122233.1
MEL2.galGal                                             AAX73255.1
MEL2.human                                              Q9UHM6
MWS.Bostau                                              NP_776991.1
MWS.Cavpor                                              AAD30523.1
MWS.Equcab                                              NP_001075314
NEUR1.bosTau                                            NP_001193009.1
NEUR1.danRer                                            AAS75321
Nvop202.Anemone                                         A9UMX2
Nvop273.Anemone                                         A9UMX5
Nvop289.Anemone                                         A9UMX1
Nvop329.1.Anemone                                       A9UMX7
Nvop3544.Anemone                                        A9UMY3
Nvop42.Anemone                                          A9UMX6
Nvop465.Anemone                                         A7SZI5
Nvop2.Anemone                                           FAA00400.1
OPN3.Mosquito                                           AB753162
PARIE.danRer                                            BAL63415.1
PARIE.xenTro                                            NP_001039256
PcopCHydrozoan                                          A9CR60
PER.Amphioxus                                           BAC76023.1
PER.Bostau                                              NP_001179153.1
PPIN.geoAus                                             ANV21069.1
PPIN.Lamprey                                            AB116380
PPIN.letJap                                             BAD13381.1
Retinochrome_Squid                                      P23820
RGR1.xenTro                                             BC135113
RGR2.danRer                                             NM_001024436
Rh1.Bovine                                              P02699
Rh1.Canfam                                              AAM11432.1
Rh1.Felcat                                              NP_001009242.1
RH1LMS1.hasAda_Jumping_spider                           AB251846
Rh1.Musmus                                              NP_663358.1
Rh1.Ratnor                                              NP_254276.1
Rh1.Squid_opsin                                         P31356
RH2.Caraur                                              AAA49169.1
RH2.Sercan                                              CAB91995.1
RH2.Thuori                                              BAG14283.1
RHO1.homSap                                             NP_000530.1
RHO2a.danRer                                            NP_571328.2
RHO2.galGal                                             NP_990821.1
SWS1.Canfam                                             XP_539386.2
SWS1.danRer                                             NP_571394.1
SWS2.Danrer                                             NP_571267.1
SWS2.galGal                                             NP_990848.1
Tcop10Jellyfish                                         JQ968423
Tcop11Jellyfish                                         JQ968422
Tcop13Jellyfish                                         JQ968420
Tcop14Jellyfish                                         JQ968419
Tcop15Jellyfish                                         JQ968418
Tcop16Jellyfish                                         JQ968417
Tcop1Jellyfish                                          JQ968432
Tcop2Jellyfish                                          JQ968431
Tcop3Jellyfish                                          JQ968430
Tcop4Jellyfish                                          JQ968429
Tcop5Jellyfish                                          JQ968428
Tcop6Jellyfish                                          JQ968427
Tcop7Jellyfish                                          JQ968426
Tcop8Jellyfish                                          JQ968425
Tcop9Jellyfish                                          JQ968424
TMT.danRer                                              ALG92548.1
TMT.galGal                                              BAV92606.1
UV5.apiMel                                              AAC13418.1
UV5B.droMel                                             AAC47426.1
UVB.apiMel                                              AAC13417.1
VAOP.danRer                                             NP_571661.1
VAOP.galGal                                             NP_001297018.1
Melatonin1Human                                         P48039
MTR1A_SHEEP                                             P48040
MTR1A_FishEuropeanBass                                  B2Y4M8_DICLA
MTR1L_MOUSE                                             O88495
Table S3. Supporting statistical likelihoods for the amino acid composition at known counterion positions (94, 113, and 181), plus position 186, in reconstructed ancestral opsin sequences. Nodes A-G refer to the internal nodes highlighted in Fig. 3A.

Residue (likelihood statistic)
Ancestral Node             94           113                     181         186
A (All opsin)              E (0)        E (0), Y (0.995)        E (0.99)    S (1.0)
B (R-type)                 M (0.59)     Y (0.998)               E (1.0)     S (0.99)
C (Ciliary/Cnidopsin)      S (0.57)     Y (1.0)                 E (1.0)     S (1.0)
D (Ciliary)                T (0.65)     E (0.001), Y (0.993)    E (1.0)     S (1.0)
E (Ciliary; E113)          T (0.77)     E (0.99)                E (1.0)     S (1.0)
F (Cnidopsin)              E (0.004)    E (0), Y (0.99)         E (1.0)     S (0.99)
G (Cnidopsin; E94)         E (0.94)     E (0), A (0.66)         E (1.0)     R (1.0)
Additional data file S1 (separate file)
PDF document showing the full phylogenetic tree (with branch lengths) of the 141 opsin sequences used for the ancestral sequence reconstruction displayed in summarized form in Fig. 3A.

Additional data file S2 (separate file)
Phylogenetic tree (in the Newick file format) used for the ancestral sequence reconstruction displayed in Fig. 3A.

Additional data file S3 (separate file)
Sequence alignment of the 7 reconstructed ancestral sequences shown in Fig. 3A (Nodes A-G).