Journal of General Virology (1993), 74, 769-773. Printed in Great Britain
Short communication
Nucleotide sequence and possible ambisense coding strategy of rice stripe virus RNA segment 2

Mami Takahashi,1 Shigemitsu Toriyama,1* Chika Hamamatsu2 and Akira Ishihama2

1 National Institute of Agro-Environmental Sciences, Kannondai 3, Tsukuba, Ibaraki 305 and 2 National Institute of Genetics, Mishima, Shizuoka 411, Japan
The complete nucleotide sequence (3514 nucleotides) of RNA segment 2 of rice stripe virus (RSV), the prototype member of the tenuivirus group, was determined. In the virus-sense RNA an open reading frame (ORF) is present which encodes a 199 amino acid protein of Mr 22 762. Another long ORF encoding an 834 amino acid protein of Mr 94 047 (94K) exists in the virus-complementary RNA. Between these two ORFs, there is a long non-coding intergenic region of 299 nucleotides. The sequence suggests that RNA 2 has an ambisense coding strategy, as found for RSV RNAs 3 and 4. The putative 94K protein carries stretches with an amino acid sequence showing weak similarity to parts of the membrane glycoproteins of Punta Toro and Uukuniemi phleboviruses of the family Bunyaviridae, suggesting a possible distant evolutionary relationship between the animal phleboviruses and the plant tenuiviruses.

Rice stripe virus (RSV) is transmitted in cyclical fashion by the small brown planthopper, Laodelphax striatellus Fallén, and three other delphacid species, and frequently causes severe damage to Japonica type rice varieties (Toriyama, 1983, 1986b). The genome of RSV is composed of four single-stranded virus-sense RNAs (vRNAs) and four dsRNAs, the latter being duplexes of the vRNAs and their complementary (c) RNAs (Toriyama, 1982; Toriyama & Watanabe, 1989). The complete sequences have been determined for vRNAs 3 and 4 from two different RSV isolates (Zhu et al., 1991, 1992; Kakutani et al., 1990, 1991). The results suggested that both RNA segments have an ambisense coding strategy. The open reading frame (ORF) in the 5'-proximal region of cRNA 3 encodes a nucleocapsid protein of 35K and the ORF in the 5'-proximal region of vRNA 4 encodes the major non-structural protein (S protein) of 20K. The proteins encoded by the ORFs in the 5'-proximal region of vRNA segment 3 and of segment 4 cRNA have not yet been detected. Similar ambisense coding structures were also found in RNAs 3 and 4 of maize stripe tenuivirus (Huiet et al., 1991, 1992). The 5'- and 3'-terminal sequences of all four RSV RNA segments are complementary to each other and possibly form panhandle structures (Takahashi et al., 1990). RSV and rice grassy stunt tenuivirus contain the RNA-dependent RNA polymerase as an integral component of the virus particles (Toriyama, 1986a, 1987). A soluble protein fraction containing the nucleocapsid protein and a minor viral protein of 230K exhibits RNA polymerase activity in the presence of model RNA templates with the 3' conserved sequence (Barbier et al., 1992). All these characteristics are indications of either the negative-strand or ambisense nature of all the RNA segments of the RSV genome. This suggests that RNA segments 1 and 2 are either negative sense or ambisense. In order to determine the coding strategies of the two large RNAs of RSV, we are attempting to sequence the RSV genome (Zhu et al., 1991, 1992). We report here the complete nucleotide sequence of RNA 2 of RSV isolate T.

The nucleotide sequence data reported in this paper have been submitted to the DDBJ and have been assigned the accession number D13176.
0001-1363 © 1993 SGM
Fig. 1. Four cDNA clones used to determine the nucleotide sequence of RSV RNA 2. cDNAs were synthesized using two species of synthetic primers: primer A, complementary to nucleotides 3478 to 3496; and primer B, complementary to nucleotides 1297 to 1316. The regions determined by direct RNA sequencing methods (primer extension with reverse transcriptase or wandering spot analysis from the 5' and 3' termini) are indicated with arrows.
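The primer positions quoted in the caption can be related to the primer sequences printed in the methods with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes that positions are numbered from the 5' end of vRNA 2 and that the primers are written 3' to 5' as in the text. Under those assumptions, a primer lying 18 to 37 residues from the 3' end maps to nucleotides 3478 to 3497, one more than the end coordinate given in the caption; which endpoint was intended is not clear from the text.

```python
from typing import Tuple

RNA2_LENGTH = 3514  # length of RSV RNA 2 reported in the paper


def complement_dna_to_rna(primer_3to5: str) -> str:
    """Return the RNA stretch (read 5'->3') that pairs with a DNA primer written 3'->5'."""
    pairs = {"A": "U", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in primer_3to5)


def from_3prime_distance(near: int, far: int, length: int = RNA2_LENGTH) -> Tuple[int, int]:
    """Convert a span counted from the 3' end into 5'-based (start, end) positions."""
    return length - far + 1, length - near + 1


# Primer sequences as printed in the methods, written 3'->5'.
primer_a = "TTACGTAGAAGTTTCTTCAA"
primer_b = "ACAACAAGTCATAGTATCCC"

target_a = complement_dna_to_rna(primer_a)  # vRNA sequence bound by primer A
span_a = from_3prime_distance(18, 37)       # "18 to 37 residues from the 3' end"
span_b_length = 1316 - 1297 + 1             # primer B span quoted in the caption

print("primer A target on vRNA:", target_a)
print("primer A span (5'-based):", span_a)
print("primer B length vs quoted span:", len(primer_b), span_b_length)
```

Because the primers are written 3' to 5', a base-by-base complement already reads 5' to 3' on the vRNA, so no reversal is needed.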
Fig. 2. Complete nucleotide sequence of RSV RNA 2 and predicted amino acid sequences, shown in the single-letter amino acid code. The 94K ORF in the 3'-proximal sequence of vRNA is encoded by the virus cRNA sequence. The amino acid sequence for the 94K protein is shown in its opposite orientation. The termination codons are indicated by asterisks.
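To make the "opposite orientation" in the caption concrete, the following sketch translates one ORF from a virus-sense strand and a second ORF from its complementary strand. The 38-nucleotide sequence is invented for illustration and is not taken from RSV RNA 2; the function names are likewise hypothetical. The standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(rna: str) -> str:
    """Return the complementary strand, read 5'->3'."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def translate_first_orf(rna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the RNA."""
    start = rna.find("AUG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy segment: an ORF near the 5' end of the vRNA, a spacer,
# and the reverse complement of a second ORF near the 3' end.
vrna = "ACAC" + "AUGGCUAAAUAA" + "CCUUCC" + "UCAUUCAGACAU" + "GUGU"
crna = reverse_complement(vrna)

v_protein = translate_first_orf(vrna)  # read directly from the vRNA
c_protein = translate_first_orf(crna)  # read from the complementary strand

print("vRNA:", vrna, "->", v_protein)
print("cRNA:", crna, "->", c_protein)
```

On the toy segment the second ORF is invisible when the vRNA is read 5' to 3'; it only appears once the strand is reverse-complemented, which is why a figure drawn in vRNA numbering has to print that protein backwards.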
Purification of RSV isolate T nucleoprotein particles and the preparation of RNA were carried out as described previously (Toriyama, 1986a; Toriyama & Watanabe, 1989). Either ssRNA 2 purified from an agarose gel or a mixture of RNAs 2, 3 and 4 was used as the template for cDNA synthesis. cDNA cloning was carried out by the method of Gubler & Hoffman (1983). Two synthetic oligonucleotides, 3' TTACGTAGAAGTTTCTTCAA 5' and 3' ACAACAAGTCATAGTATCCC 5', were used as primers. The former, complementary to the unique sequence located 18 to 37 residues from the 3' end of RNA 2 (Takahashi et al., 1991), was used to obtain clones R2To37 and R2SD5, containing cDNA for the 3' half of RNA 2. The latter, complementary to a sequence present in the middle of RNA 2 (see Fig. 1), was used to obtain clones R2Ta40 and R2Ta62, containing cDNA for the 5' half of RNA 2. The dsDNAs were made blunt-ended using T4 DNA polymerase and inserted into the SmaI site of pUC18 (Yanisch-Perron et al., 1985). Recombinant plasmids were transformed into competent Escherichia coli strain JM109 (Nippon Gene Company; Hanahan, 1985). Transformants containing long inserts were selected and examined for the presence of either the 3'- or 5'-terminal sequence of RNA 2. The inserts of recombinant plasmid clones R2To37 and R2Ta40 were digested with restriction enzymes and the resulting fragments were subcloned into appropriately cleaved M13mp18 or M13mp19 phage vector DNAs. Alternatively, a nested set of deletions was prepared from the insert DNAs of clones R2Ta62 and R2SD5 using the Kilo-Sequence deletion kit (Takara Shuzo; Henikoff, 1984; Yanisch-Perron et al., 1985). The ssDNA prepared from subclones of phage M13 or the set of alkali-denatured deletion clones was sequenced by the dideoxynucleotide chain termination method (Sanger et al., 1977), using the Sequenase version 2.0 kit (United States Biochemicals) and [α-35S]dCTP (Amersham). The cDNA sequence near the junction between clones R2Ta40 and R2To37 was confirmed by direct RNA sequencing, i.e. the dideoxynucleotide chain termination method of Zimmern & Kaesberg (1978), using avian myeloblastosis virus (AMV) reverse transcriptase (Bethesda Research Laboratories). In this reaction, a synthetic oligonucleotide, 3' TCAACAGAAGAATTTAAAGGAG 5', which is complementary to the vRNA sequence from positions 1385 to 1406 (Fig. 1), was used as the primer.

The first 10 bases from the 3' ends of the four ssRNAs are identical except for the sixth nucleotide position of vRNA 1 (Takahashi et al., 1990). For the synthesis of cDNA of vRNA 2, the primer was designed so as to hybridize to an inner unique sequence adjacent to this 3' conserved sequence (primer A in Fig. 1). Complementary DNA of the same size as vRNA 2 was synthesized using this unique primer, but after several attempts at transformation, it proved difficult to obtain full-sized cDNA clones. Therefore we prepared two sets of partial clones, as illustrated in Fig. 1. The 5' half clones, R2Ta40 and R2Ta62, were obtained by primer extension using primer B, the sequence of which was derived from the 3'-proximal sequence of clone R2To37. After analysis of the sequences of these four clones, we obtained the sequence of nucleotides 23 to 3492 of vRNA 2. The sequences of both termini were taken from the data obtained by direct sequencing of vRNA 2 (Takahashi et al., 1990). The complete sequence was found to be composed of 3514 bases (Fig. 2). To confirm the sequence around the junction between the two sets of cDNAs, the RNA sequence from nucleotides 1152 to 1375 was determined directly. The result accorded perfectly with the cDNA sequence.

The sequence was scanned for AUG-initiated ORFs (Fig. 3). A distinct long ORF is present in the 5'-proximal region of the cRNA between nucleotides 980 and 3484, which encodes an 834 amino acid protein of Mr 94 047 (94K) (Fig. 2, lower part). The next longest ORF is present in the 5'-proximal region of vRNA between nucleotides 81 and 680, and encodes a 199 amino acid protein of Mr 22 762 (22.7K) (Fig. 2, upper part). Between these two ORFs, there is a non-coding intergenic region consisting of 299 nucleotides. The presence of two long ORFs, 94K on cRNA and 22.7K on vRNA, suggests that RSV RNA 2 also has an ambisense coding arrangement, as found in RSV RNAs 3 and 4 (Zhu et al., 1991, 1992; Kakutani et al., 1990, 1991). In addition, there are three ORFs longer than 280 bases in vRNA: two lie in the same frame as the 22.7K ORF and one lies in a different frame near the middle part of vRNA (Fig. 3, indicated by thin arrows). So far no protein corresponding to the predicted Mr of 94K or 22.7K, or to any other ORF product, has been detected in either purified RSV nucleoprotein preparations or plant tissues infected with RSV.

Fig. 3. ORFs in RSV RNA 2. Three possible ORFs are shown for both vRNA and cRNA of RSV segment 2. Translation initiation codons and termination codons are indicated by short vertical bars above and below each RNA, respectively. Major ORFs are indicated by thick arrows whereas other minor ORFs longer than 280 bases are indicated by thin arrows.
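The ORF coordinates, protein lengths and intergenic distance reported above can be cross-checked with simple arithmetic. The sketch below assumes, since the text does not state it explicitly, that both ORFs are given in vRNA numbering and that each coordinate range includes the termination codon; the helper names are hypothetical.

```python
RNA2_LENGTH = 3514      # total length of RSV RNA 2

V_ORF = (81, 680)       # 22.7K ORF on vRNA
C_ORF = (980, 3484)     # 94K ORF on cRNA, assumed to be in vRNA numbering


def orf_summary(start: int, end: int) -> dict:
    """Length in nucleotides and residues, assuming the range includes the stop codon."""
    nt = end - start + 1
    codons, remainder = divmod(nt, 3)
    return {"nt": nt, "residues": codons - 1, "in_frame": remainder == 0}


def to_crna_position(pos: int, length: int = RNA2_LENGTH) -> int:
    """Map a vRNA position to the matching position counted from the cRNA 5' end."""
    return length - pos + 1


v_summary = orf_summary(*V_ORF)
c_summary = orf_summary(*C_ORF)

# Nucleotides lying strictly between the end of the vRNA ORF and the
# vRNA-numbered end of the cRNA ORF.
intergenic = C_ORF[0] - V_ORF[1] - 1

# The cRNA ORF expressed in cRNA coordinates (5' -> 3' on the complementary strand).
c_orf_on_crna = (to_crna_position(C_ORF[1]), to_crna_position(C_ORF[0]))

print("vRNA ORF:", v_summary)
print("cRNA ORF:", c_summary)
print("intergenic region:", intergenic, "nt")
print("cRNA ORF in cRNA numbering:", c_orf_on_crna)
```

Under these assumptions the numbers agree with the text: 199 and 834 residues, a 299-nucleotide intergenic region, and a 94K ORF that begins 31 nucleotides from the 5' end of the cRNA.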
Fig. 4. Comparison of the deduced amino acid sequences of the RSV 94K protein, the Punta Toro virus (PT) M protein (Ihara et al., 1985) and the Uukuniemi virus (UK) G protein (Rönnholm & Petterson, 1987). The FASTA program was used for protein similarity searches. Amino acids identical to those of the RSV 94K protein are indicated by the shaded areas. Gaps inserted in the sequences to maximize homology are indicated by dashed lines.
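The shading described in the caption amounts to counting identical residues at aligned, ungapped positions. The sketch below shows that count for a pair of pre-aligned sequences; it is not the FASTA program, and the two short sequences are invented for illustration rather than taken from the alignment.

```python
def alignment_identity(seq_a: str, seq_b: str, gap: str = "-"):
    """Count identical residues over the ungapped columns of a pairwise alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # columns containing a gap are not scored
        compared += 1
        if a == b:
            identical += 1
    return identical, compared


# Hypothetical aligned fragments, with '-' marking inserted gaps.
query = "MK-LVCDG"
other = "MRTLV-DG"

identical, compared = alignment_identity(query, other)
print(f"{identical} of {compared} aligned residues identical "
      f"({100.0 * identical / compared:.1f}%)")
```

Excluding gapped columns from the denominator is one common convention; counting them as mismatches would give a lower figure for the same alignment.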
A homology search of the NBRF/PIR protein database failed to find any protein with significant similarity to the predicted 94K product. However, short stretches of weak but significant similarity were observed with the membrane glycoprotein precursors of Punta Toro virus, Uukuniemi virus and other viruses of the genus Phlebovirus in the family Bunyaviridae (Fig. 4). RSV RNA 2 may correspond to the M RNA of the genus Phlebovirus, which encodes a membrane glycoprotein. A weak similarity was also previously observed between the RSV and Punta Toro phlebovirus nucleocapsid proteins (Kakutani et al., 1990). The 3'-terminal sequence 5'-GACUUUGUGU-3', common to the four RSV RNA segments apart from a base change (U to A) at the sixth nucleotide of RNA 1, is completely identical to that of phleboviruses (Takahashi et al., 1990; Elliott, 1990), suggesting a distant evolutionary relationship between RSV and these viruses.

The genome organization of RSV RNA 2 is different from that of the M RNAs of phlebovirus and uukuvirus, the genomes of which are negative-sense (Elliott, 1990), and identical to that of the ambisense M RNA of impatiens necrotic spot tospovirus (Law et al., 1992). The filamentous particles, 3 to 8 nm wide, of the tenuivirus group (Francki et al., 1991) are apparently similar to the nucleocapsids of viruses of the Bunyaviridae (von Bonsdorff et al., 1969). However, extensive electron microscopic examination of tissues infected with viruses of the tenuivirus group has failed to detect enveloped spherical particles.

We thank N. Fujita for conducting a homology search of protein sequences. This work was supported in part by Grants-in-Aid from the Ministry of Agriculture, Forestry and Fisheries [Biocosmos Program 92-I-C-(1)] (to S. Toriyama and A. Ishihama) and the Ministry of Education, Science and Culture (to A. Ishihama).
References

BARBIER, P., TAKAHASHI, M., NAKAMURA, I., TORIYAMA, S. & ISHIHAMA, A. (1992). Solubilization and promoter analysis of RNA polymerase from rice stripe virus. Journal of Virology 66, 6171-6174.
ELLIOTT, R. M. (1990). Molecular biology of the Bunyaviridae. Journal of General Virology 71, 501-522.
FRANCKI, R. I. B., FAUQUET, C. M., KNUDSON, D. L. & BROWN, F. (1991). Classification and nomenclature of viruses. Fifth Report of the International Committee on Taxonomy of Viruses. Archives of Virology supplementum 2, 398-399. Wien & New York: Springer-Verlag.
GUBLER, U. & HOFFMAN, B. J. (1983). A simple and very efficient method for generating cDNA libraries. Gene 25, 263-269.
HANAHAN, D. (1985). Techniques for transformation of Escherichia coli. In DNA Cloning: A Practical Approach, vol. 1, pp. 109-135. Edited by D. M. Glover. Oxford & Washington, D.C.: IRL Press.
HENIKOFF, S. (1984). Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28, 351-359.
HUIET, L., KLAASSEN, V., TSAI, J. H. & FALK, B. W. (1991). Nucleotide sequence and RNA hybridization analyses reveal an ambisense coding strategy for maize stripe virus RNA3. Virology 182, 47-53.
HUIET, L., TSAI, J. H. & FALK, B. W. (1992). Complete sequence of maize stripe virus RNA4 and mapping of its subgenomic RNAs. Journal of General Virology 73, 1603-1607; corrigendum 3049.
IHARA, T., SMITH, J., DALRYMPLE, J. M. & BISHOP, D. H. L. (1985). Complete sequences of the glycoproteins and M RNA of Punta Toro phlebovirus compared to those of Rift Valley fever virus. Virology 144, 246-259.
KAKUTANI, T., HAYANO, Y., HAYASHI, T. & MINOBE, Y. (1990). Ambisense segment 4 of rice stripe virus: possible evolutionary relationship with phleboviruses and uukuviruses (Bunyaviridae). Journal of General Virology 71, 1427-1432.
KAKUTANI, T., HAYANO, Y., HAYASHI, T. & MINOBE, Y. (1991). Ambisense segment 3 of rice stripe virus: the first instance of a virus containing two ambisense segments. Journal of General Virology 72, 465-468.
LAW, M. D., SPECK, J. & MOYER, J. W. (1992). The M RNA of impatiens necrotic spot tospovirus (Bunyaviridae) has an ambisense genomic organization. Virology 188, 732-741.
RÖNNHOLM, R. & PETTERSON, R. F. (1987). Complete nucleotide sequence of the M RNA segment of Uukuniemi virus encoding the membrane glycoproteins G1 and G2. Virology 160, 191-202.
SANGER, F., NICKLEN, S. & COULSON, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, U.S.A. 74, 5463-5467.
TAKAHASHI, M., TORIYAMA, S., KIKUCHI, Y., HAYAKAWA, T. & ISHIHAMA, A. (1990). Complementarity between the 5'- and 3'-terminal sequences of rice stripe virus RNAs. Journal of General Virology 71, 2817-2821.
TORIYAMA, S. (1982). Characterization of rice stripe virus: a heavy component carrying infectivity. Journal of General Virology 61, 187-195.
TORIYAMA, S. (1983). Rice stripe virus. CMI/AAB Descriptions of Plant Viruses, no. 269.
TORIYAMA, S. (1986a). An RNA-dependent RNA polymerase associated with the filamentous nucleoproteins of rice stripe virus. Journal of General Virology 67, 1247-1255.
TORIYAMA, S. (1986b). Rice stripe virus: prototype of a new group of viruses that replicate in plants and insects. Microbiological Sciences 3, 347-351.
TORIYAMA, S. (1987). Ribonucleic acid polymerase activity in filamentous nucleoproteins of rice grassy stunt virus. Journal of General Virology 68, 925-929.
TORIYAMA, S. & WATANABE, Y. (1989). Characterization of single- and double-stranded RNAs in particles of rice stripe virus. Journal of General Virology 70, 505-511.
VON BONSDORFF, C. H., SAIKKU, P. & OKER-BLOM, N. (1969). The inner structure of Uukuniemi virus and two Bunyamwera supergroup arboviruses. Virology 39, 342-344.
YANISCH-PERRON, C., VIEIRA, J. & MESSING, J. (1985). Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene 33, 103-119.
ZHU, Y., HAYAKAWA, T., TORIYAMA, S. & TAKAHASHI, M. (1991). Complete nucleotide sequence of RNA 3 of rice stripe virus: an ambisense coding strategy. Journal of General Virology 72, 763-767.
ZHU, Y., HAYAKAWA, T. & TORIYAMA, S. (1992). Complete nucleotide sequence of RNA 4 of rice stripe virus isolate T, and comparison with another isolate and with maize stripe virus. Journal of General Virology 73, 1309-1312.
ZIMMERN, D. & KAESBERG, P. (1978). 3'-Terminal nucleotide sequence of encephalomyocarditis virus RNA determined by reverse transcriptase and chain-terminating inhibitors. Proceedings of the National Academy of Sciences, U.S.A. 75, 4257-4261.
(Received 17 September 1992; Accepted 24 November 1992)