Virus Research, 19 (1991) 59-66 © 1991 Elsevier Science Publishers B.V. 0168-1702/91/$03.50 ADONIS 01681702910007131
VIRUS 00646
Nucleotide sequence and coding capacity of the large (L) genomic RNA segment of Seoul 80-39 virus, a member of the hantavirus genus *
Dragana Antic, Byung-Uk Lim and C. Yong Kang
University of Ottawa, Faculty of Medicine, Department of Microbiology and Immunology, Ottawa, Ontario, Canada
(Accepted 13 December 1990)
The nucleotide sequence of the large (L) genomic RNA segment of Seoul 80-39 virus was determined from overlapping cDNA clones. The virion L RNA segment is 6530 nucleotides long. The 3’ and 5’ terminal sequences are inversely complementary for 15 bases. The viral complementary-sense RNA contains a single open reading frame from an AUG codon at nucleotide positions 37-39 to a UAA stop codon at nucleotide positions 6490-6492. This ORF could encode a polypeptide of 2151 amino acids (246.662 kDa), which likely corresponds to the L protein detected in purified viral particles (Elliott et al., 1984) and is assumed to be an RNA-dependent RNA polymerase molecule (Schmaljohn and Dalrymple, 1983). Comparison of the L protein of the Seoul 80-39 virus with the polymerase proteins encoded by other negative-stranded RNA viruses revealed 44% similarity only with part of the Bunyamwera virus L protein (Elliott, 1989) and a very weak homology with the PB1 protein of influenza virus.
Correspondence to: C. Yong Kang, Dept. of Microbiology and Immunology, University of Ottawa, Faculty of Medicine, Ottawa, Ontario, Canada K1H 8M5.
* The nucleotide sequence data reported in this paper will appear in EMBL and GenBank under the accession No. X56492.

Seoul 80-39 virus is a member of the Hantavirus genus of the Bunyaviridae family. It causes a mild form of hemorrhagic fever with renal syndrome. The genome of hantaviruses consists of three single-stranded, negative-sense RNA segments (L, M, S). All three genomic RNA segments have a consensus 3’ terminal sequence common to all virus strains in the genus (3’-AUCAUCAUCUG), which differentiates hantaviruses from other Bunyaviridae genera (Schmaljohn et al., 1985). The M and S segments of several hantavirus serotypes have already been molecularly characterized (Schmaljohn et al., 1986; Schmaljohn et al., 1987; Yoo and Kang, 1987; Giebel et al., 1989; Parrington and Kang, 1990; Stohwasser et al., 1990). The M genomic segment encodes two viral envelope glycoproteins (G1 and G2), whereas the S genomic segment encodes the nucleocapsid (N) protein. By default, the hantavirus L segment is presumed to encode the L protein, which is assumed to be an RNA-dependent RNA polymerase (transcriptase and replicase) (Schmaljohn and Dalrymple, 1983). In order to complete the molecular characterization of the entire hantavirus genome, we have cloned and sequenced the Seoul 80-39 virus L genomic RNA segment.

Seoul 80-39 virus was propagated in VERO E6 cells (ATCC 1008, CRL 1586). Purified viral particles were deproteinized by proteinase K and SDS followed by RNA extraction with phenol-chloroform-isoamyl alcohol. This RNA was used as a template for cDNA synthesis, as described previously (Parrington and Kang, 1990). The clones were grouped by cross-hybridization and their specificity was determined by Northern blot hybridization with purified viral RNA. Seven clones were
Fig. 1. Cloning strategy of Seoul 80-39 virus L RNA segment. The L segment-specific cDNA library (clones 8, 7, 6, 4, 3, 2 and 1) was made using purified viral RNA as a template and random hexanucleotide primers. Clones L3’, L2PL1 and L5’ were obtained by polymerase chain reaction using specific primers. Oligonucleotides L3’, L2 and L1 were used to prime the first strand cDNA synthesis, whereas amplification reactions were performed in the presence of two primers: L3’ and L8 for obtaining the 3’ end, L1 and DA2 for obtaining the 5’ end, and L2 and PL1 for filling the gap between clone 2 and clone 1. All primers had a PstI site at their 5’ ends allowing insertion of the clones into the PstI site of pUC19.
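As an illustration of the primer design summarized in the caption, the following Python sketch derives the 5’ terminal sequence that would base-pair with the consensus 3’ terminus (3’-AUCAUCAUCUG) and prepends a PstI recognition site (CTGCAG) to a 10-base target. It assumes that the L segment termini are inversely complementary in the same way as those of the M and S segments. The function names are hypothetical, and the bare six-base tail is an assumption that ignores any flanking bases the actual primers may have carried.

```python
# Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# PstI recognition sequence (5'-CTGCAG-3').
PSTI_SITE = "CTGCAG"


def predicted_5prime_terminus(three_prime_end_3to5: str) -> str:
    """Return the 5' terminus (written 5'->3') that would pair, base by base,
    with a 3' terminus written in the 3'->5' direction."""
    return "".join(RNA_COMPLEMENT[base] for base in three_prime_end_3to5)


def rna_to_dna(sequence: str) -> str:
    """Convert an RNA sequence to the equivalent DNA oligonucleotide."""
    return sequence.replace("U", "T")


def tailed_primer(target_dna: str, tail: str = PSTI_SITE) -> str:
    """Prepend a restriction-site tail to the 5' end of a primer (assumed design)."""
    return tail + target_dna


# Consensus 3' terminus of hantavirus genomic RNAs, written 3'->5' as in the text.
consensus_3prime = "AUCAUCAUCUG"

five_prime_rna = predicted_5prime_terminus(consensus_3prime)

# A 10-base primer target, same sense as the predicted 5' end of the virion RNA
# and therefore complementary to the 3' end of the cRNA.
primer = tailed_primer(rna_to_dna(five_prime_rna)[:10])

print(five_prime_rna)
print(primer)
```

The 10-base target corresponds to the length quoted for the DA2 primer; whether the sketch reproduces that primer exactly depends on the assumptions stated above.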
Fig. 2. Nucleotide sequence and the deduced amino acid sequence of the Seoul 80-39 virus L genomic segment. The X-D-D motifs are underlined.
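The X-D-D motifs marked in Fig. 2 are tripeptides in which any residue is followed by two aspartic acids. A minimal Python sketch of how such motifs could be located in a deduced amino acid sequence is given here. The toy sequence and the function name are hypothetical and do not represent the Seoul 80-39 L protein; a lookahead pattern is used so that overlapping matches (for example within a D-D-D run) are also reported.

```python
import re
from collections import Counter


def find_xdd_motifs(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position, tripeptide) for every X-D-D occurrence.

    The zero-width lookahead lets overlapping tripeptides be reported."""
    pattern = re.compile(r"(?=([A-Z]DD))")
    return [(match.start() + 1, match.group(1)) for match in pattern.finditer(protein)]


# Hypothetical toy sequence, for illustration only.
toy_protein = "MKLSDDVAFGDDLLMDDQ"

hits = find_xdd_motifs(toy_protein)
motif_counts = Counter(tripeptide for _, tripeptide in hits)

for position, tripeptide in hits:
    print(position, tripeptide)
```

Applied to the full deduced L protein sequence, the same scan would list each X-D-D tripeptide with its position, which could then be compared with the motifs marked in the figure.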
segment specific (Fig. 1). Six of these clones (8, 7, 6, 4, 3, 2) were overlapping, whereas clone 1 represented a different region of the L segment (Fig. 1). All clones were subcloned into bacteriophage M13mp18 and M13mp19 and sequenced by the dideoxy chain termination method (Sanger et al., 1977). The 3’ end, the 5’ end and the gap between clone 2 and clone 1 were cloned by performing first strand cDNA synthesis on the viral RNA template followed by the polymerase chain reaction using specific primers. The L3’ primer has 25 nucleotides complementary to the previously determined 3’ end of the Seoul L RNA segment (Schmaljohn et al., 1985), whereas the DA2 primer has 10 bases complementary to the cRNA (the sequence was predicted from the inverse complementarity between the 3’ and 5’ ends of the M segment and S segment genomic RNAs). Other primers were designed according to the determined sequences of clones 8 (L8), 2 (L2) and 1 (L1 and PL1) (Fig. 1). Amplified products were cloned into the
Fig. 3. Comparison of homologous parts of the L proteins of Seoul virus and influenza virus PB1 protein.
PstI site of pUC19 and sequenced in both directions by the dideoxy chain termination method.

The complete nucleotide sequence of the L genomic RNA segment was determined and is presented as complementary (+) sense DNA in Fig. 2. The L segment is 6530 nucleotides long and has a base composition of 32.6% A, 16.4% C, 21.1% G and 29.9% U. The 3’ and 5’ termini are inversely complementary for 15 nucleotides, and the secondary structure that would result from base pairing would have a free energy of -42.4 kcal (calculated for the terminal 50 nucleotides). A single open reading frame was detected in the first frame of the viral complementary (+) sense RNA (Fig. 2). It starts at the AUG codon at nucleotide positions 37-39 and ends with the UAA stop codon at nucleotide positions 6490-6492. This ORF encodes a polypeptide of 2151 amino acids (246.662 kDa), which corresponds to the L protein detected in purified viral particles (Elliott et al., 1984).

The L protein of the Seoul 80-39 virus was compared with the polymerase proteins encoded by other negative-stranded RNA viruses using the Microgenie computer program. This comparison showed 44% homology only with part of the Bunyamwera virus L protein (Elliott, 1989) and a very weak homology (17%) with the PB1 protein of influenza virus (Fig. 3). These relatively conserved regions contain the SDD sequence motif surrounded by predominantly hydrophobic amino acids. Similar sequence motifs (GDD, LDD and MDD) have been found in RNA polymerase molecules of bacteriophages, retroviruses and other RNA viruses (Argos, 1988), suggesting evolutionary conservation and functional importance. Seoul virus L protein does not contain GDD or LDD sequence motifs, but there is one MDD motif and six other DD motifs (Fig. 2). Homology to the polymerase proteins of LCMV, measles virus, Newcastle disease virus, parainfluenza virus, respiratory syncytial virus, Sendai virus, rabies virus and VSV was not detected. The L proteins of negative-stranded RNA viruses, although similar in size, have probably evolved from an ancestral molecule in such a way that their relationship is evident only among members of the same family.
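The reported open reading frame coordinates, polypeptide length and base composition can be cross-checked with simple arithmetic. The Python sketch below does this using only the figures stated in the text; the function name and dictionary keys are hypothetical, and the reported molecular mass is read here as 246,662 Da (about 247 kDa), which is an interpretation of the printed value.

```python
def orf_summary(start: int, stop_end: int, segment_length: int) -> dict:
    """Summarize an ORF from 1-based coordinates.

    start is the first base of the initiation codon and stop_end is the
    last base of the stop codon, both on the complementary (+) sense strand."""
    orf_nt = stop_end - start + 1
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return {
        "frame": (start - 1) % 3 + 1,               # reading frame 1, 2 or 3
        "orf_nt": orf_nt,                           # includes the stop codon
        "amino_acids": orf_nt // 3 - 1,             # stop codon is not translated
        "upstream_nt": start - 1,                   # bases before the AUG
        "downstream_nt": segment_length - stop_end  # bases after the UAA
    }


# Values stated in the text: AUG at 37-39, UAA at 6490-6492, 6530 nt segment.
summary = orf_summary(start=37, stop_end=6492, segment_length=6530)

# Reported base composition in percent; the values should sum to 100.
composition = {"A": 32.6, "C": 16.4, "G": 21.1, "U": 29.9}
composition_total = round(sum(composition.values()), 1)

# Mean residue mass, assuming the reported mass is 246,662 Da.
mean_residue_mass = 246662 / summary["amino_acids"]

print(summary)
print(composition_total)
print(round(mean_residue_mass, 2))
```

Under these readings the coordinates give a first-frame ORF of 6456 nucleotides, which is 2152 codons including the stop codon and therefore 2151 amino acids, in agreement with the stated polypeptide length.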
Further characterization of the RNA-dependent RNA polymerase molecules of negative-strand RNA viruses is necessary in order to determine their activities and functional domains.

The 3’ end, the 5’ end and the region between clone 2 and clone 1 of the Seoul L RNA segment (Fig. 1) were obtained by gene amplification using AMV reverse transcriptase and Taq DNA polymerase. Although the error frequency reported for the Taq polymerase during a 30-cycle amplification is estimated to be approximately 0.25% (Saiki et al., 1988), we have found an error frequency of only 0.03% over a 30-cycle amplification of known human parainfluenza virus 3 gene sequences (Murphy et al., 1990). No insertions or deletions were noted. There were no errors in the overlapping portions of the 3 Seoul 80-39 clones. However, considering the error frequency of the Taq DNA polymerase, the 809 nucleotides of the 3 Seoul 80-39 clones obtained by polymerase chain reaction could have between 0.24 (Murphy et al., 1990) and 2.02 (Saiki et al., 1988) base substitutions, resulting in a maximum of 2 amino acid changes in the L protein of Seoul 80-39 virus.
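The range of expected base substitutions quoted above follows from multiplying the number of PCR-derived nucleotides by each error frequency. A short Python sketch of that arithmetic, with a hypothetical function name and using only the values given in the text:

```python
def expected_substitutions(nucleotides: int, error_percent: float) -> float:
    """Expected number of misincorporated bases for a given per-base
    error frequency (in percent) over the whole amplification."""
    return nucleotides * error_percent / 100.0


PCR_DERIVED_NT = 809  # nucleotides obtained only from PCR clones

# Error frequencies over a 30-cycle amplification, as cited in the text.
low_estimate = expected_substitutions(PCR_DERIVED_NT, 0.03)   # Murphy et al., 1990
high_estimate = expected_substitutions(PCR_DERIVED_NT, 0.25)  # Saiki et al., 1988

# Each base substitution can alter at most one codon, so the rounded upper
# estimate bounds the number of amino acid changes.
max_amino_acid_changes = round(high_estimate)

print(round(low_estimate, 2), round(high_estimate, 2), max_amino_acid_changes)
```

The upper bound is conservative, since some substitutions would fall in noncoding regions or be synonymous.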
We thank Dr Mark Galinski for providing us with the sequences of the L proteins of negative-stranded RNA viruses, Jean Pineault and M. Radojicic for assistance with computing, and N. Décellier, B. Mah and D. McLean for excellent technical assistance. This study is supported by a grant from the Natural Sciences and Engineering Research Council of Canada.
References
Argos, P. (1988) A sequence motif in many polymerases. Nucleic Acids Res. 16, 9909-9916.
Arikawa, J., Lapenotiere, H.F., Iacono-Connors, L., Wang, M. and Schmaljohn, C.S. (1990) Coding properties of the S and the M genome segments of Sapporo rat virus: comparison to other causative agents of hemorrhagic fever with renal syndrome. Virology 176, 114-125.
Barik, S., Rud, E.W., Luk, D., Banerjee, A.K. and Kang, C.Y. (1990) Nucleotide sequence analysis of the L gene of vesicular stomatitis virus (New Jersey serotype): identification of conserved domains in L proteins of nonsegmented negative-strand RNA viruses. Virology 175, 332-337.
Blumberg, B.M., Crowley, J.C., Silverman, J.I., Menonna, J., Cook, S.D. and Dowling, P.C. (1988) Measles virus L protein evidences elements of an ancestral RNA polymerase. Virology 164, 487-497.
Elliott, L.H., Kiley, M.P. and McCormick, J.B. (1984) Hantaan virus: identification of virion proteins. J. Gen. Virol. 65, 1285-1293.
Elliott, R.M. (1989) Nucleotide sequence analysis of the large (L) genomic RNA segment of Bunyamwera virus, the prototype of the family Bunyaviridae. Virology 173, 426-436.
Feldhaus, A.L. and Lesnaw, J.A. (1988) Nucleotide sequence of the L gene of vesicular stomatitis virus (New Jersey): identification of conserved domains in the New Jersey and Indiana L proteins. Virology 163, 359-368.
Fields, S. and Winter, G. (1982) Nucleotide sequences of influenza virus segments 1 and 3 reveal mosaic structure of a small viral RNA segment. Cell 28, 303-313.
Galinski, M.S., Mink, M.A. and Pons, M.W. (1988) Molecular cloning and sequence analysis of the human parainfluenza 3 virus gene encoding the L protein. Virology 165, 499-510.
Giebel, L.B., Stohwasser, R., Zoller, L., Bautz, E.K.F. and Darai, G. (1989) Determination of the coding capacity of the M genome segment of nephropathia epidemica virus strain Hallnas B1 by molecular cloning and nucleotide sequence analysis. Virology 172, 498-505.
Morgan, E.M. and Rakestraw, K.M. (1986) Sequence of the Sendai L gene: open reading frames upstream of the main coding region suggest that the gene may be polycistronic. Virology 154, 31-40.
Murphy, D.G., Dimock, K. and Kang, C.Y. (1990) Viral RNA and protein synthesis in two LLC-MK2 cell lines persistently infected with human parainfluenza virus 3. Virus Res. 16, 1-16.
Parrington, M.A. and Kang, C.Y. (1990) Nucleotide sequence analysis of the S genomic segment of Prospect Hill virus: comparison with the prototype hantavirus. Virology 175, 167-175.
Saiki, R.K., Gelfand, D.H., Stoffel, S., Scharf, S.J., Higuchi, R., Horn, G.T., Mullis, K.B. and Erlich, H.A. (1988) Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239, 487-491.
Salvato, M., Shimomaye, E. and Oldstone, M.B.A. (1989) The primary structure of the lymphocytic choriomeningitis virus L gene encodes a putative RNA polymerase. Virology 169, 377-384.
Sanger, F., Nicklen, S. and Coulson, A.R. (1977) DNA sequencing with chain terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.
Schmaljohn, C.S. and Dalrymple, J.M. (1983) Analysis of Hantaan virus RNA: evidence for a new genus of Bunyaviridae. Virology 131, 482-491.
Schmaljohn, C.S., Hasty, S.E., Dalrymple, J.M. and LeDuc, J.W. (1985) Antigenic and genetic properties of viruses linked to hemorrhagic fever with renal syndrome. Science 227, 1041-1044.
Schmaljohn, C.S., Jennings, G.B., Hay, J. and Dalrymple, J.M. (1986) Coding strategy of the S genome segment of Hantaan virus. Virology 155, 633-643.
Schmaljohn, C.S., Schmaljohn, A.L. and Dalrymple, J.M. (1987) Hantaan virus M RNA: coding strategy, nucleotide sequence, and gene order. Virology 157, 31-39.
Schubert, M., Harmison, G.G. and Meier, E. (1984) Primary structure of the vesicular stomatitis virus polymerase (L) gene: evidence for high frequency of mutations. J. Virol. 51, 505-514.
Stohwasser, R., Giebel, L.B., Zoller, L., Bautz, E.K.F. and Darai, G. (1990) Molecular characterization of the RNA S segment of nephropathia epidemica virus strain Hallnas B1. Virology 174, 79-86.
Tordo, N., Poch, O., Ermine, A., Keith, G. and Rougeon, F. (1988) Completion of the rabies virus genome sequence determination: highly conserved domains among the L (polymerase) proteins of unsegmented negative-strand RNA viruses. Virology 165, 565-576.
Winter, G. and Fields, S. (1982) Nucleotide sequence of human influenza A/PR/8/34 segment 2. Nucleic Acids Res. 10, 2135-2143.
Yoo, D.W. and Kang, C.Y. (1987) Nucleotide sequence of the M segment of the genomic RNA of Hantaan virus 76-118. Nucleic Acids Res. 15, 6299-6300.
Yusoff, K., Millar, N.S., Chambers, P. and Emmerson, P.T. (1987) Nucleotide sequence analysis of the L gene of Newcastle disease virus: homologies with Sendai and vesicular stomatitis viruses. Nucleic Acids Res. 15, 3961-3976.

(Received 3 December 1990; revision received 13 December 1990)