Jan 10, 2013 ... Supporting Information. Dessau and Modis 10.1073/pnas.1217780110. RVFV.
691 CSELIQASSR ITTCST-EGVN TKCRLSGTAL IRAGSVGAEA ...
Supporting Information Dessau and Modis 10.1073/pnas.1217780110 a RVFV Gc Sinbis E1
691 CSELIQASSR ITTCSTEGVN TKCRLSGTAL IRAGSVAGEA CLMLKGVKED QTKFLKLKTV SSELSCREGQ SYWTGSF-----SPK --CLSSRRC----HLV GECHVNRCLS 363 HATTVPNVPQ .PYKALVERA GYAP.N---- ---------- ---------- ----.EI-.. ...VLPSTN. E.I.CK.TTVVP... IK.CG.LE.QPAA.AG YT.K.-----
RVFV Gc Sinbis E1
791 WRDNETSAEF SFVGESTTMR ENKCFEQCGG WGCGCFNVNP SCLFVHTYLQ ----SVRKEALRVF NCIDWVHKLT LEITDFDGSV STIDLGASSS RFTNWGSVSL SLDAEGISGS 441 ---------- -.G.VYPF.- -------W.. AQ.F.DSE.S Q-SEAYVE.S ADCA.DHAQ.IK.H TA---AM.VG .R.V-YGNTT .FL.V----- -YV.GVTPGT .K.LKV.A.P
RVFV Gc Sinbis E1
901 NSFSFIESPG KGYAIV--------DEPF SEIPRQGFLG EIRCNSESSV LSAHESCLRA PNLISYKPMI DQLECTTNLI DPFVVFERGS LPQTRNDKTF AASKGNRGVQ 526 I.A..--T.F DHKVVIHRGLVYNY.F.E YGA-KP.AF. D.QAT.LT.- ----KDLIAS TDIRLL..SA KNVHVPYTQA S--------- -------SG. E-W.N.S.RP
RVFV Gc 1001 AFSKGSVQAD LTL--------MFDNF--EV DFVGAAV--SCD AAFLNLTGC------Y SCNAGARVCL SITSTGTGSL SAHN-------KDGSLH IVLPSENGTK DQCQILHFTV Sinbis E1 611 LQETAPFGCK IAVNPLRAVDCSYG.IPISI .IPN..FIRTS. .PLVSTVK.EVSECT. .ADF.GMAT. QYV.DRE.QC PV.SHSSTATLQESTV. -..--.K.AV T----V..ST
RVFV Gc 1091 PEVEEEFMYS CDGDERPLLV KGTLIAIDP Sinbid E1 719 ASPQAN.IV. LCG.KTTCNA EC-------
b RVFV Uukuniemi virus Punta toro virus Toscana virus Massilia virus Sandfly fever virus Candiru virus Heartland virus SFTS virus HB29
691 514 810 837 830 835 816 567 563
CSELIQASSR ..DS.SVTAS ..NTVV.N.K ...SVI.D.K .T.SLI.E.S ..DHTI...K ...TLISN.K .D..VH.E.K .D.MVH.D.K
ITTCST-EGVN SQR...SSDGV Q.R.VQ-..S. VMQ.T.-S.SS .MQ.T.-.DSS .VK.VA-K.SK Q.K.VQ-SSDK SI..KSAS.NE LVS.RQGS.NM
TKCRLSGTAL NS.FV.TSS. ...SITA.IT .L.KA...VI .T.KAT..VV SV.TI..LIN V..SV.A.VT KE.SVT.R.. KE.ITT.R..
IRAGSVGAEA LQVSPK.Q.S LR..VI...S MKL.PI.S.S LKL.PI.S.S VK..PI.S.T LK..II...S LP.VNP.Q.. LP.VNP.Q..
CLMLKGVKED ..I...PTGT .FII..PM.N ..I...L.DN ..I...LRDN .VT...PDSA .FI...PM.N ..HFSVPGSP ..HFTAPGSP
QTKFLKIKTV AVDSIR...T .Q.TIS...I EKQ.IS...I EKQ.IS...I DK...T...I .R.TIR...L DS.C....VK DS.C....VK
SSELSCR SSELSCREGQ EGQ DIK.E.V DIK.E.VRRD RRD ...TV.. ...TV....S ..S ....T.. ....T....E ..E ....T.. ....T....E ..E A...I.S A...I.S... ... ....V.. ....V....N ..N .IN.R.K .IN.R.KQAS QAS .IN.K.K .IN.K.KKSS KSS
SYWTGSFSPK L..VPRVTHR .F..SLYI.S .F..TLYT.V .F..SLYT.V ....SQYGVE .F..SHY..T ..YVPEAKAR ..FVPDARSR
CLSSRRCHLV .IGT.....M .........V .........M .........M .......RGV .........M .T.V...RWA .T.V...RWA
RVFV Uukuniemi virus Punta toro virus Toscana virus Massilia virus Sandfly fever virus Candiru virus Heartland virus SFTS virus HB29
791 615 910 937 930 935 916 668 664
WRDNETSAEF FKI.DY.P.W ...DQL.R.. .KT....P.. .KS.K..S.. .NSTLV.R.. .T.E.V.Q.. FSS.SF.DDW FTS.SF.DDW
SFVGES-TTMR GHEE.LMAQLG .G.KDN-HI.N TGKAHG-EV.H TGKAH.-EVIH QGITNN-SVIS KG.NDN-MV.N ANRMDR-AGLG AGKMDR-AGLG
ENKCFEQCGG ENKCFEQCGG WSY.V..... WSY.V..... ..K..... ..K......A .A ..R....S.. ..R....S.. ..K....S.. ..K....S.. ..R.I..... ..R.I..... ..K......A ..K......A MSG.SDG... MSG.SDG... FSG.SDG... FSG.SDG...
WGCGCFNVNP WGCGCFNVNP AL.Q...MR. AL.Q...MR. I......I.. I......I.. I.Y....... I.Y....... I.Y....... I.Y....... V..A....YA V..A....YA A......I.. A......I.. AA.....AA. AA.....AA. AA.....AA. AA.....AA.
SCLFVHTYLQ ..FYLRK.FS ...Y..AY.K ...Y..SI.K ...M..SY.K ......AR.R ......SSFR ....WRKWVE ....WRKWVE
SVRKEALRVF HLSQD.FNIY .A.N..V... ..Y.NGFKI. PIY.N.F... ATKR..IK.. ...P..I... NPSNRVWK.S NPHGIIWK.S
NCIDWVHKLT NCIDWVHKLT E.SE.SYRIN E.SE.SYRIN S.S....RVS S.S....RVS R.VA.N.RIR R.VA.N.RIR R..A.N.RVR R..A.N.RVR .....S.R.V .....S.R.V S.A....R.S S.A....R.S P.AS..LAA. P.AS..LAA. P.AA..PSAV P.AA..PSAV
LEITDFDGSV VLVSTNSTHS F.VKGP..ET ..V.THSRKF .GV.THKRKF ......N.KK FLVYGPNRER I.L.LPS.E. I.L.MPS.E.
STIDLGA--SSS NLTLKLG---VP ELVT..S--PGT P-MV.M.--M.T E-.T.M.--M.T EKVSMTG--MTT EE.S..S--LGT K.LEPVTGQATQ R.FHPMSGIPTQ
RVFV Uukuniemi virus Punta toro virus Toscana virus Massilia virus Sandfly fever virus Candiru virus Heartland virus SFTS virus HB29
891 716 1011 1037 1030 1036 1017 771 767
SLDAEGISGS .SVSQPPAIA .........T I..S...T.T I..S...E.T A..F...T.T .......T.T GSSI.IVGMT GS.M.VSGLT
NSFSFIESPG Y.EC.G.DLH ..I..L..SK .AY..MKHGS .AY..MKHGL ..Y..LR.SS ..I..L..SK RLCEMK.MGT DLCEIE.LKS
EIRCNS-ESSV ..Q.PT-KADA ....S.-..AA .V..PT-.ETA .V..PT-.ETA ....S.-.AAA ....S.-..AA ..Q.S.I..AK ..Q.S.E..AR
LSAHESCLRA .AVSKR.SSD I...K..I.. IR.SP..KM. IK.SP..KV. .T..K..VV. I...S..I.. HIRSDG.IWN TIKKDG.IWN
PNLISYKPMI SIIFS..VHK .GL.K....T ..L.E.Q.EM .GL.E.Q.EM .DI.R....T ..L.K....T ADLVGIELRV ADLVGIELRV
DQLECTTNLI .SVD.QSSI. ..I...AS.V .TA.....M. .TV.....MM .IVD.S.S.. ..ID..AS.V .DAV.FSK.T .DAV.YSK.T
DPFVVFERGS ..MTIRN.NK ...AI.LK.. ..MAI.N... ..MAI.N... ...A..L..A ...AA.VK.. SVEA.ANFSK SVEA.ANYSA
RVFV Uukuniemi virus Punta toro virus Toscana virus Massilia virus Sandfly fever virus Candiru virus Heartland virus SFTS virus HB29
991 816 1111 1137 1130 1217 1117 872 868
AASKGNRG---VQ WPT--ETS---.E TST.DKKT---.. TQ.IEKNT---.. TQ.IEKNS---.. SS.IDKKT---I. TS..DKKT---I. DQGNHGESRIYGS ER.HDSQGKISGS
RVFV Uukuniemi virus Punta toro virus Toscana virus Massilia virus Sandfly fever virus Candiru virus Heartland virus SFTS virus HB29
1091 916 1212 1238 1231 1237 1218 974 970
PEVEEEFMYS SH..AKCYFR .L.D.RLS.. .VISI.GR.D .VISI.GK.D .Q.KMDVV.. .EIDSITK.. AVID..CVLN AV.D.QCQLN
KGYAIVDEPF GTKFHTVCNR G.FALYD.GY GSFA.IDD.Y GSFA..D..Y GTFSL.D.AM G.FALHD.A. GIMALAPCND .KLALAPCNQ
AFSKGSVQAD .AIPDLAS.T ..TN.AIK.L .LTT.E.R.S .LTT.EIH.S ..TS.I.H.S ..NS..IK.M PLDITR.SGE PLDITAIRGS
CDGDERPLLV .PNS.SQ.TI .GSEPKLIVI .GLSK..MVI .GKNKK.M.I ...YYKAMSI .G.E..M.HI .G.HSSK... .G.H.SQVT.
SEIPRQGFLG RTDYTL.RI. N....E.... .PE..K.... .VE..K.Y.. .ME..K.... .....E.... PGHAIM.NV. AGMGVV.KV.
LTLMFDNFEV MLIRL.GYTI .SINL.DH.I IR.TL.DYD. VR.VL.EY.I MS.S...... V.INL.DY.I FSVS.RGMRL FSVNYRGLRL
KGTLIAIDP--F KGTLIAIDP--F R.E..YLFN R.E..YLFN--D --D ....VCMGV--Y ....VCMGV--Y ......TA.--H ......TA.--H ......MA.--H ......MA.--H A...V.MN.--. A...V.MN.--. ..L..SLGV--Q ..L..SLGV--Q ..S.VFM.VPR. ..S.VFM.VPR. ..N..FL.VPK. ..N..FL.VPK.
DFVGAAVSCD DFVGAAVSCD QFRSDSNK.S QFRSDSNK.S VFINKVKN.. VFINKVKN.. VYQNSQSD.S VYQNSQSD.S EYKVTSND.. EYKVTSND.. EFEEER...L EFEEER...L EFLSDIAK.E EFLSDIAK.E KLSEISA..T KLSEISA..T SLSEITAT.T SLSEITAT.T
DDRREAGGES .I.HHNQTL. .F.NKT..S. ...KHE..S. ...VIN..S. ....HEETN. .L.NST..A. V.GSYVQTYH V.GSYMQTYH
AAFLNLTGCY PR..S.S... .T...VS... .T.K...... ST.V.I.... .S.V.I.... .T.V...... GEIT.VS... GEVT.VS...
SCNAGARVCL N.E...KLE. ..DY..H..V ..DE.S.M.Y ..DE...L.V ...E.....I ...Y.....V ...T..S.SI ...T..K.SI
SITSTG-TGSL EHVTDFG.ALG KVK.SE-SADF QVKAE.-ETIF KVK.SS-EAIY QAAANK-NTT. RVKTDR-DADF KLH.SK-NTTG KLH.SK-NSTA
TVVNPKSGS-W PGLS.....G. .....SE.A-. L.....G.T-V ......GHG-V V.....T.R-. I....SDS.-. SK.PAGGRV-P ST.PTGANI-P
NFFDWFSGLM DP.G..--KA SISN.....L D..G.V...T D.LG.L...S DLSS.A...V SLTG.A...I .PV..LNA.F SPT..LNA.F
SWFGGPLKTI ..LRAIWAIL D.L...M.A. ..L......F .........F D.L....R.A ..L..SWMAV G--D.ITRW. G--N.LSRW.
Stem peptide
SAHNKDGSLH ILECPSLGYT F.ESE.KTTV HFV.E.E.IN HFTDSSF.VN HVHTL.N..T V.TQV.QEF. HLKCDSD--E HVRCKGD--E
LLICLYVALS GGTVSLIIGV .K.LGFI.IG .T.LGFL..G .MVMGFLMIG GV.LG.II.A LK.IGFLL.G .G.IGVLLAC .GVIGVLLGG
TM
GECHVNRCLS .A.KGEA.SE .D.VG.K.Q. ...VSD...K ...TGDI..K A..KGDA.QR SD.TG.R.QR .D.QSGCPTY .D.QSGCPPH
RFTNWGSVSL DSIPH.LI.. K.LN..TL.. QP.D...IG. QQ.D...IG. Q.FS...MT. K.LS..TIS. M.KGVAITY. V.KGVSVTY.
LPQTRNDKTF ..S.VGSV.. ......GQ.. ...V.DGM.. ...V.EG... ......G... ...V..GM.. I.A.ISGVR. I.T.IGGLR.
IVLPSENGTK TYYEVK.TLE LSF.IQS..H VINKVSPDVS VLFKVTPSVH ..MDVLSPKS LAFAVW..VQ TAFSVME..H TAFSVLE.VH
IGLFFLLIYL VIIYMVFTLC .VCFV.FMI. LL.VVII.II VIIVIVT.I. .VFLIM.VLC FLCA.IMLS. VM..VVVVAI LA....IMF.
DQCQILHFTV KSIRTM.LNG .Y..V...QK .Y.TV...SR .Y.T....SR TD.RVV.LST .Y..TI..NK TYRPHMS.DK SYIVR.S.DH
GRTGLSKMWL LK-------I.--IAVNSI A.---TGVEQ I.---IGLQQ VPKLVGMIRA IK--FLGISY T.R-.I.GLT LKL.TKQVFR
AATKKAS --V..SNIK..N.LK..NK .LA..FK .LL..KL LRK..LT QRA.V.SR..L.-
Cytoplasmic tail
Fig. S1. Amino acid sequence alignments of Rift Valley fever virus (RVFV) glycoprotein GC and related proteins. (A) Structure-based protein sequence alignment of RVFV GC and Sindbis E1. The sequences were aligned based on a pairwise comparison of the corresponding crystal structures (PDB entries 4HJ1 and 3MUU). Conserved residues were replaced with a period. Domain colors are colored as in Fig. 1. Arrows denote β-strands; zigzags, α-helices; angled ellipsoids, 310 helices. Disulfide bonds are indicated by pairs of matching green symbols. Light blue and gray sectors represent ordered and disordered glycans, respectively. (B) Multiple amino acid sequence alignment of GC from selected phleboviruses. Database sequence accession nos.: RVFV, gb:ABD38819.1; Uukuniemi virus, gi:38371706; Punta Legend continued on following page Dessau and Modis www.pnas.org/cgi/content/short/1217780110
1 of 4
toro virus, gb:AAA47110.1; Toscana virus, gb:ABS85172.1; Massilia virus, gb:ACI24011.1; Sand fly fever virus, gb:AAA75043.1; Candiru virus, gb:AEA30045.1; Heartland virus, gb:AFP33393.1; severe fever with thrombocytopenia syndrome (SFTS) virus, gb:ADZ04471.1. Periods indicate conserved residues. Colors correspond to the domains of RVFV GC as defined in Fig. 1A. The stem region (light gray shading), transmembrane anchor (medium gray shading), and cytoplasmic tail (dark gray shading) are missing in the RVFV GC crystal structure.
Fig. S2. The β-barrel between domains I and II of RVFV GC. A close-up (Inset; viewed along the direction of the arrow) shows the hydrophobic core of the β-barrel. The environment is similar to the domain I–II interface of dengue E and could conceivably also accommodate a hydrophobic ligand. A 2Fo–Fc electron density map contoured at 1 σ (blue mesh) is shown around the residues in the hydrophobic core (in ball-and-stick representation).
Fig. S3. Nonglycosylated GC has an extended conformation that is likely to correspond to the so-called prefusion or prehairpin intermediate postulated for all fusion proteins. Left, the glycosylated GC dimer structure shown with its dyad axis perpendicular to the membrane, as in the outer protein shell assembly proposed in Fig. 4. Center and Right, the nonglycosylated GC structure. Hydrophobic residues in the fusion loop are shown as orange space-filling spheres. Extensive packing contacts around a crystallographic twofold symmetry axis result in a crystallographic dimer with a buried surface area at the dimer interface of 1,180 Å2. In the orientation imposed by the crystallographic dimer, Leu779 (yellow spheres), on a loop adjacent to the fusion loop, is positioned to insert into the target membrane along with the fusion loop. The positions of the N- and C termini are labeled Left and Center.
Dessau and Modis www.pnas.org/cgi/content/short/1217780110
2 of 4
Fig. S4. RVFV GC is a monomer in solution at different pH conditions. A total of 0.1 mL of GC (1 g/L) was loaded onto a Superdex 200 (30/100) size-exclusion column preequilibrated with different buffers: red, 50 mM NaOAc pH 5.0; magenta, 50 mM MES pH 6.2; blue, 50 mM Tris·HCl pH 8.0. The eluate was analyzed for absorbance at 280 nm (Left y axis) and for multiangle light scattering, which was converted into molecular mass (Right y axis; Materials and Methods). Elution of GC was significantly retarded due to nonspecific interactions of the protein with the dextran resin in all three buffers, although the effect was more pronounced in the pH 5 buffer. A similar effect was reported for flavivirus E proteins and attributed to exposure of the hydrophobic fusion loop (1). 1. Kanai R, et al. (2006) Crystal structure of West Nile virus envelope glycoprotein reveals viral surface epitopes. J Virol 80(22):11000–11008.
Fig. S5. Electrostatic surface potential of an icosahedral lattice of GC assembled as shown in Fig. 4. The electrostatic potential of the outer surface of the GC lattice is similar and slightly negative (red) around the (pseudo)-symmetry axes of each of the four types of capsomers: two-, three-, fivefold (labeled with standard symbols) and pseudo-sixfold, labeled “1.” This was prepared with PyMOL using vacuum electrostatic potentials.
Dessau and Modis www.pnas.org/cgi/content/short/1217780110
3 of 4
Table S1. Quality-of-fit statistics for fitting the crystal structure of the RVFV GC dimer (PDB ID code 4HJ1) into the RVFV EM structure (EMDataBank ID code EMD-1550) Manually fitted coordinates Nominal resolution of EM map (Å) Contour level of EM map* (σ) Resolution of map calculated from atomic coordinates for fitting/correlation measurement† (Å) CC between EM and calculated maps‡ CAM between EM and calculated maps§ CC with masked EM map used for fitting‡ CAM with masked EM map used for fitting§ CC with masked EM map, and icosahedral symmetry applied to calculated map during fitting Total atoms per ASUjj Atoms per ASU outside the unmasked EM density Intra- and intermolecular contacts within the ASU** Intra- and intermolecular clashes within the ASU†† Intermolecular contacts with other ASUs** Intermolecular clashes with other ASUs††
22 1.3 20 0.64 0.26 0.66 0.27 NA{ 40,074 14,020 35,038 2,055 8,080 3,080
Coordinates after fitting with UCSF Chimera 22 1.3 20 0.77 0.34 0.80 0.44 0.88 40,074 8,949 35,964 2,394 6,968 2,658
*The map sampling density was set to 2 in UCSF Chimera. † The map sampling density was set to 1. ‡ CC, correlation coefficient for map-in-map fitting, defined as follows: CC = /jujjvj, where u and v are vectors expressing the values of the fit and reference maps, respectively, at a certain point in space, and (the “overlap”, or the inner product of vectors u and v) is the sum over fit map grid points of the product of the fit map value and the reference map value at that point, determined by trilinear interpolation. § CAM, correlation about mean map values, defined as CAM = /ju – uavejjv – vavej, where uave is a vector with all components equal to the average of the components of u and vave is defined analogously. The CAM equals the cosine of the angle between the vectors (after subtraction of averages) and can range from –1 to 1, whereas the range of overlap (and hence CC) values depends on the scaling of the maps. { NA, not applicable. jj ASU, icosahedral asymmetric unit. **Contact defined as atom pairs more than four bonds apart with van der Waals overlap ≥ −0.4 Å. †† Clash defined as non–hydrogen-bonding atom pairs more than four bonds apart with van der Waals overlap ≥ 0.6 Å, or potentially hydrogen-bonding atom pairs more than four bonds apart with van der Waals overlap ≥ 1.0 Å.
Dessau and Modis www.pnas.org/cgi/content/short/1217780110
4 of 4