J. gen. Virol. (1987), 68, 1003-1010. Printed in Great Britain


Key words: human parainfluenza virus 3/fusion protein/sequence analysis

Nucleotide Sequence of the Coding and Flanking Regions of the Human Parainfluenza Virus Type 3 Fusion Glycoprotein Gene

By MARIE-JOSÉE CÔTÉ, DOUGLAS G. STOREY, C. YONG KANG AND KENNETH DIMOCK*

Department of Microbiology and Immunology, University of Ottawa, Faculty of Health Science, 451 Smyth Road, Ottawa, Ontario, Canada K1H 8M5

(Accepted 6 January 1987)

SUMMARY

The complete nucleotide sequence of the human parainfluenza virus type 3 (HPIV3) fusion (F) protein gene has been determined. The HPIV3 F gene is 1851 nucleotides long including six U residues in the genomic RNA, which probably direct synthesis of the first few nucleotides in the F mRNA polyadenylate tail. The HPIV3 F gene contains a single long open reading frame coding for 539 amino acids. The predicted molecular weight of the unglycosylated precursor F0 protein was 60 031. Four potential carbohydrate acceptor sites were identified. Comparison of the HPIV3 F protein sequence with the F gene sequences of two other paramyxoviruses, Sendai virus and simian virus 5, indicated a very close evolutionary relationship between HPIV3 and Sendai virus. Sequence analysis of HPIV3 F gene flanking regions identified signals which appear to be responsible for polymerase recognition and polyadenylation.

INTRODUCTION

Human parainfluenza virus type 3 (HPIV3) ranks second in importance, after respiratory syncytial virus, as a cause of respiratory tract infections in young children. The symptoms vary from mild upper respiratory tract infections to severe lower respiratory tract syndromes such as croup, bronchiolitis and pneumonia (Chanock, 1956; Parott et al., 1962; Glezen & Denny, 1973; Welliver et al., 1982). The negative-sense, single-stranded RNA genome of HPIV3 has a mol. wt. of approx. 4.6 × 10⁶ or 15 000 nucleotides (Storey et al., 1984). The genome codes for six major structural proteins (Storey et al., 1984; Sanchez & Banerjee, 1985; Wechsler et al., 1985; Jambou et al., 1985) in the order 3' NP-P/C-M-F-HN-L 5' (Dimock et al., 1986a; Spriggs & Collins, 1986). Three proteins are found in the nucleocapsid of the virion: L (mol. wt. approx. 195 000, 195K), the putative polymerase; P (87K), a phosphorylated protein thought to be associated with the polymerase; and NP (67K), the major nucleocapsid protein. Three proteins are associated with the envelope: HN (69K), the haemagglutinin/neuraminidase; F1,2 (60K), the fusion protein complex, which consists of two disulphide-linked proteins, F1 (50K) and F2 (10K); and M (35K), the matrix protein.

The F glycoprotein of the prototype paramyxovirus, Sendai virus, has been shown to be involved in virus-induced cell fusion and haemolysis (Homma & Ohuchi, 1973; Scheid & Choppin, 1974). Activation of the fusion protein precursor, F0, requires cleavage by a host trypsin-like enzyme to yield F1 and F2 (Homma & Ohuchi, 1973; Scheid & Choppin, 1974; Nagai et al., 1976; Scheid & Choppin, 1977). A new N terminus of F1 is generated by the cleavage (Scheid & Choppin, 1977). The amino acid sequences of the F1 termini of several paramyxoviruses are very similar to each other and the hydrophobicity of the sequences suggests that this region interacts with the target cell membrane (Gething et al., 1978; Richardson et al., 1980; Richardson & Choppin, 1983). The fact that anti-F antibodies are able to prevent the spread of infection by cell fusion and also to inhibit virus penetration indicates that the F protein plays an important role in the pathogenicity of paramyxoviruses (Merz et al., 1980).

0000-7496 © 1987 SGM


Previously, we reported the cloning of HPIV3 mRNA and genomic RNA sequences and the identification and gene assignment of the HPIV3 mRNAs (Dimock et al., 1986a). Here, we describe HPIV3 F gene-specific cDNA clones constructed from HPIV3 F mRNA and genomic RNA. The complete nucleotide sequence of the F gene, including the sequences of the M-F and F-HN junctions, has been determined. The deduced amino acid sequence of the F protein is presented and similarities in sequence between the F protein of HPIV3 and the F proteins of two other paramyxoviruses, Sendai virus and simian virus 5 (SV5), are discussed. The HPIV3 F gene sequences presented in this paper have also been compared with those previously reported by Spriggs et al. (1986).

METHODS

Virus and cells. HPIV3 strain 47885 was propagated in LLCMK2 cells as described by Storey et al. (1984).

Cloning of mRNA sequences. Procedures for isolation of mRNA from HPIV3-infected cells, cDNA synthesis and cloning into Escherichia coli plasmid pBR322 have been described previously (Dimock et al., 1986a).

Cloning of genomic RNA sequences. The procedure for isolation of HPIV3 genomic RNA, cDNA synthesis and cloning into the replicative form of bacteriophage M13 is described in Dimock et al. (1986a).

Identification of the F gene product. Hybrid-select translation of HPIV3-specific mRNAs and cross-hybridization analyses have been described previously (Dimock et al., 1986a).

Preparation of M13 single-stranded DNA. Transformation of E. coli JM101 with M13 DNA, growth of transformed cells and preparation of single-stranded M13 DNA were all done according to Messing (1983).

Sequencing strategy and methods. Four overlapping clones, mpPIgX3, mpPIg141, mpPIg142 and pPI14, which cover the entire HPIV3 F gene and its flanking regions, were chosen for sequence analysis. Sequential series of overlapping clones were generated according to Dale et al. (1985). Dideoxynucleotide sequence analysis was carried out according to Messing (1983), with minor modifications. The final concentrations of the dideoxynucleotides (Pharmacia) in the sequencing reactions were 0.03 mM-ddATP, 0.1 mM-ddCTP, 0.5 mM-ddGTP and 1.5 mM-ddTTP. [35S]dATP (>400 Ci/mmol, Amersham) was used at a concentration of 2.0 to 2.5 µM. Reaction mixtures were incubated for 30 min at 42 °C and the products were chased for 30 min at 42 °C. The mixtures were heated for 3 min at 90 °C in formamide buffer and analysed on 8.0% polyacrylamide gels containing 8 M-urea. International Biotechnologies, Inc. software was used for nucleic acid and protein sequence analyses.

RESULTS AND DISCUSSION

Characterization of HPIV3 F gene cDNA clones

Previously, we identified five groups of HPIV3-specific, mRNA-derived clones by reciprocal cross-hybridization (Dimock et al., 1986a). Clones specific for the HPIV3 NP, P and M genes were assigned following analysis by hybrid-select translation in vitro (Dimock et al., 1986a). Unambiguous identification of HN-specific clones was achieved by comparing products translated in vitro from hybridization-selected mRNA with polypeptides immunoprecipitated from tunicamycin-treated HPIV3-infected cells (Storey et al., 1987). Similar experiments and a process of elimination suggested that the fifth group of clones represented the HPIV3 F gene. This group of clones hybridized to two mRNA species in HPIV3-infected cells, one of approx. 2.2 kb (F mRNA) and another of approx. 3.5 kb (M-F bicistronic mRNA) (Dimock et al., 1986a). One of these clones, pPI14 (1496 nucleotides, excluding the homopolymeric tract), was chosen for sequence analysis. This clone contains about 80% of the F gene sequence and includes the polyadenylate tail. Three other clones were chosen from a bank of genomic RNA-derived clones: both mpPIgX3 (approx. 1250 nucleotides) and mpPIg141 (approx. 300 nucleotides) overlap pPI14 and extend into the M gene and into the HN gene respectively; mpPIg142 (approx. 1300 nucleotides) overlaps much of mRNA-derived clone pPI14. Most of the F gene was sequenced in both directions and overlapping sequences from genomic and mRNA clones were identical to each other. In Fig. 1, the relative positions of the four overlapping clones and subclones generated according to Dale et al. (1985) are presented.

Nucleotide and amino acid sequence analysis of the HPIV3 F gene

Fig. 1. Strategy for sequencing HPIV3 cDNA clones. The positions of cDNA clones synthesized from HPIV3 F mRNA (pPI14) and genomic RNA (mpPIgX3, mpPIg141 and mpPIg142) are shown. Arrows indicate the direction of sequencing and the approximate number of nucleotides sequenced in each subclone constructed by the sequential deletion method of Dale et al. (1985). [Figure not reproduced: clone map drawn against a base number scale from 0 to 1800, with the M gene at the 3' end and the HN gene at the 5' end.]

The complete nucleotide sequence of the HPIV3 F gene is shown in Fig. 2 in the mRNA sense. The F gene is 1851 nucleotides long including the A residue immediately following the M-F intergenic sequence and the six A residues preceding the F-HN intergenic sequence. These six residues probably correspond to the first six residues of the polyadenylate tail of the F mRNA. The first ATG is at positions 194 to 196 and begins a large open reading frame of 1617 nucleotides coding for 539 amino acids. Neither of the other two reading frames has open stretches of more than 114 nucleotides initiating with an ATG. The deduced amino acid sequence of this large open reading frame is also shown in Fig. 2. The predicted mol. wt. of the unglycosylated polypeptide encoded by the HPIV3 F gene is 60 031. F0, found in HPIV3 virions and in infected cells, has an apparent mol. wt. of 60K as determined by SDS-polyacrylamide gel electrophoresis (Storey et al., 1984). Previously, we identified a polypeptide, translated in vitro from hybrid-selected F mRNA, with an apparent mol. wt. of 54K (Dimock et al., 1986a). A polypeptide of the same size was also immunoprecipitated from tunicamycin-treated HPIV3-infected cells (unpublished observations). Paterson et al. (1984) observed a difference between the predicted mol. wt. of the SV5 fusion protein and the apparent mol. wt. of the in vitro product. These authors suggested that the hydrophobic nature of the F protein results in aberrant mobility of the unglycosylated polypeptide in SDS-polyacrylamide gels.
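The reading-frame figures quoted above can be checked with simple arithmetic. The following is a minimal sketch and not part of the original analysis: the function name `orf_summary` and the constant names are illustrative assumptions, and the coordinates are the Fig. 2 positions given in the text.

```python
# Minimal sketch: check the open reading frame arithmetic quoted in the text.
# All names are illustrative; the numbers themselves come from the paper.

FIRST_ATG_START = 194      # first nucleotide of the initiating ATG (Fig. 2 numbering)
ORF_NUCLEOTIDES = 1617     # length of the open reading frame, stop codon excluded
PREDICTED_MOL_WT = 60031   # predicted mol. wt. of the unglycosylated F0 polypeptide


def orf_summary(atg_start: int, orf_nt: int) -> dict:
    """Return codon count and stop codon coordinates for an ORF (1-based positions)."""
    if orf_nt % 3 != 0:
        raise ValueError("an open reading frame must be a whole number of codons")
    last_coding_nt = atg_start + orf_nt - 1
    return {
        "codons": orf_nt // 3,
        "last_coding_nt": last_coding_nt,
        "stop_codon": (last_coding_nt + 1, last_coding_nt + 3),
    }


summary = orf_summary(FIRST_ATG_START, ORF_NUCLEOTIDES)
# Mean mass per residue implied by the predicted molecular weight.
mean_residue_mass = PREDICTED_MOL_WT / summary["codons"]

print(summary)
print(round(mean_residue_mass, 1))
```

Under these numbers the 1617 nucleotides correspond to 539 codons, and the termination codon falls at positions 1811 to 1813, which is consistent with the TGA that follows Lys539 in Fig. 2. The implied mean residue mass of about 111 is in the range expected for an average polypeptide.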
Like the fusion proteins of other paramyxoviruses (Paterson et al., 1984; Blumberg et al., 1985; Shioda et al., 1986), the F protein of HPIV3 is highly hydrophobic, having three major hydrophobic domains which correspond to the signal peptide (A), the amino terminus of F1 (B) and a membrane anchor (C) (underlined in Fig. 2). The site for cleavage of the HPIV3 F protein signal peptide has yet to be identified. Cleavage probably occurs, however, between Cys18 and Glu19 by analogy with the Sendai virus F protein where the cleavage site is between Cys25 and Glu26 (Blumberg et al., 1985). It has been demonstrated that F0 is cleaved after Arg109 to generate the new F1 N terminus (Spriggs et al., 1986). Following the hydrophobic anchor domain is a highly charged cytoplasmic tail (amino acids 517 to 539) which may interact with the M protein (Lyles, 1979). Four potential N-glycosylation sites (Asn-X-Ser/Thr) were identified in the HPIV3 F0 protein; all these were found in the F1 coding region and are boxed in Fig. 2. The relative positions of the HPIV3 F0 protein glycosylation sites were compared with those of the Sendai virus (Blumberg et al., 1985; Shioda et al., 1986) and SV5 (Paterson et al., 1984) F0 proteins (Fig. 3). The first glycosylation site (amino acids 238 to 240) corresponds to one in an identical position in the Sendai virus F0 protein. The second (amino acids 359 to 361) corresponds to one at an equivalent position in the SV5 F0 protein. The third (amino acids 446 to 448) is in a similar


AGC25~Z~

5A5 ATA ATC AAA AAC TTA 5GA CAA AA5 AAG TCA ATA CCA ACA ACT ATT AHC AGCCAC ACT CGC TGG

AAC 52

AA6 RAA SAA 668 ATA AAA AAA GTT TAA CAG RAG ARA CAR RRA CR~ RRRGCA CAG ARC RCC AGA ACA ACR RGRTCR RAR 130 CAC CCA ACC CR~ TCA ARA CGA AAA TCT CAA AA6 AGA TTG GCA ACA CAA CRA ACA CTG AAC ATC ATG CCR ACC TCA ATA 208 Met Pro Thr Ser lie 5 A CTG CTA ATT ATY ACA ACC AT5 ATT AT6 6CA TCT TTC TGC CAA ATA HAT ATC ACA AAA CTA CA8 CAT STA GTT 6TA ITS 286 Leu Leu lle lie Thr Thr Met lie Met Ala Set Phe Cys Gin lie Asp lie Thr Lye Leu Gin His Val Sly Val Leu 31 GTT AAC AHT ~CC AAA GGGAT@ARG RTA TCA CRA AAC TTT GAA ACA AGA TAT CTA RTT TTG AGC CTC ATA CCA AAA ATA 364 Vai Asn Her Pro Lys 61y Met Lys Ile Ser Gin Asn Phe 61u Thr Arg Tyr Leu Ile Leu Ser Leu lle Pro Lys Ile 57 GAR GRT TCT RAC TCT TGT 5GT BAT CAA CAB ATC AA5 CAA TAC RAG AGG TTA TTG BAT AGA CTG ATC ATT CCT TTA TAT 442 G1u Asp Set A~ Her Cys Gly Asp Gin Bin lle Lys Gin Tyr Lys Arg Leu Leu Asp Arg Leu Ile lle Pro Leu Tyr 83 HAT HGA TTA AGA TTA CAG AAG BAT GTG ATA GTG TCC AAT CAA GAA TCC AAT GAA AAC ACT GAC CCC AGA ACA AAA CGJ520 Asp GIy Leu Rrg Leu GI(~ Lys Asp Val lle Val Her Asn Gin Glu Set Asn Glu Asn Thr Asp Pro Arg Thr Lys hrg 109 TTC TTT GHA 686 6TA AFT GGA ACT ATT 6CF CTG GGA 8TG GCA ACC TCA 6CA CRA ATT ACA GCGGCA GTT GCT CTfi GTT 598 Phe Phe GIy H~y Val Ile GIy Thr lle Ala Leu B1y Val Ala Thr Her Ala Gln lle Thr A1a Ala Val Ala Leu Val 135 B GAA GCC AAG CAG GCA AGA TCA 6AC ATT GAA AAA CTC AAG GAA GCA ATC AGG 6AC ACA AAC AAA fiCA 8TG CA8 TCA 6TC 676 GIu Ala Lye Gin Ala Arg Set Asp lle Glu Lye Leu Lys 61u Ala lle Arg Asp Thr Ash Lys Ala Val 61n Her Val 161 CAG AGC TCC ATA GGA ART TT6 ATA GTR GCA ATT AAA TCG GTC CA6 HAT TAT GTC AAC AAA GAA ATC 6T6 CCA TCA ATT 754 Gin Her Ser lie Hly Asn Leu lie Val Ala lle Lys Ser Val Gin Asp Tyr Val Asn Lys Glu Ile Val Pro Her lle 187 GCH AGA TTA GHT TGT GAA GCA GCA GGA CTT CAG TTA 6GA ATT 6CA TTA ACA CAG CAT TAC TCA SAA TTA ACA AAC ATA 832 A]a Arg Leu Glu Cys 61u AIa Aia G1y Leu Gin Leu Giy Ile Ala Leu Thr 61n His Tyr Her 61u Leu Thr Asn Ile 213 llC GGl BAT AAC ATA GGA TCG TTA CAA 
6AA AAA GGG ATA AAA TTA CAA GGT ATA GCA TCA TTA TAC CGCACAIAA?ATC 910 Phe Gly Asp Ash lle 61y Her Leu 51n Glu Lys GIy lle Lys Leu Gin Gly lie Ala Ser Leu Tyr Arg Thr IAsn Ile 239

~

GAG RTA TTC RCR ACR TCA ACR GTT HAT AAA TAT HAT RTT TAT 6RT CTR TTR TTT ACR GAA TCA ATA AAG BTG AGA 988 61u lie Phe Thr Thr Her Thr Val Asp Lys Tyr Asp lle Tyr Asp Leu Leu Phe /hr 61u Her lle Lys Val Arg 9 = ~6u

GTT ATA HAT GTF GAC TTG AAT HAT TAC TC~ ATC ~CC CTC CAA GTC AGA CTC CCT TTA TTA ACT AGA CTG CTG AAC ACC 1066 Val lle Asp Val Asp Leu ASh Asp Tyr Her Ile Thr Leu Gin Val Arg Leu Pro Leu Leu Thr Arg Leu Leu Asn Thr 291 CAG ATT TAC A~A 8TA HAT TCC ATA TCA TA~ AAC ATC CAA AAC AGA GAA THG TAT ATC CCT CTT CCC AGC CAC ATC ATG 1144 Gin Ile Tyr Lye Val Asp Set 11e Set TyF Asn lie Gin Ash Rrg Glu Trp Tyr lle Pro Leu Pro Ser His lle Bet 317 ACA AAA GGG GCA TTT CTA GGT GGA GCA GAT RTC AAA GAA TGT ATA BAR GCA TTC AGC AGT TAT ATA TGC CCT TCT GAT 1222 Thr Lys 61y Ala Phe Leu 61y Gly Ala Asp lie Lys 61u Cys lie 61u A1a Phe Her Her Tyr Ile Cys Pro 8er Asp 343 CAA GGA TTT GTA CTA AAC CAT GAA ATG GAG AGC TGT TTA TCA GGAZRAC ATA TC~CAA THT CCA AGA ACC GTG GT~ A~A 1300 Pro Gly Phe Val Leu Ash His Glu Met G1u Set Cys Leu Her G1ylAsn Ile Se~Gin Cys Pro Arg Thr Val Val Thr 369 TCA GAC ATT GTT CCA AGA TAT GCA TTT GTC AAT 6GA GGA GTG GTT GCA ART TGT ATA ACA ACC ACA TGT ACA TGC AAC 1378 Set Asp Ile Val Pro Arg Tyr Ala Phe Val Asn G1y G1y Val Val A1a ASh Cys Ile Thr Thr Thr Cys Thr Cys Asn 395 £GT AT~ GGC ART AGA ATC ART CAA CCA CCT GAT CAA GGA GTA AAA ATT ATA ACA CAT AAA GAA TGT AAT ACA ATA S6T 1456 Hly Ile Gly ASh Rrg Ile Asn Gin Pro Pro Asp 61n Gly VaI Lys Ile lie Thr His Lys GIu Cys Asn Thr Ile Sly 421 ATC AAC GGA ATG CTG TTC ATT ACA AAT AAA GAA GGA ATC CTT GCA TTT TAC ACA CCA ART HAT ATA ACA TTA~AC AAT 1534 lie Asn G1y Met Leu Phe Ash Thr Asn Lys Glu Gly Thr Leu Ala Phe Tyr Thr Pro Ash Asp lle Thr Leu~sn Asn 447 ~HTT kA CTT HAT CCA ATT GAC ATA TCA ATC GAG CTC AAT AA@ GCC AAA TCA GAT CTA GAA GAG TCA AAA GAA TGG 1612 Val Set Leu Asp Pro lie Asp lle Her Ile Slu Leu Asn Lys Ala Lys Her Asp Leu 51u G1u Set Lye Glu Trp 473 ATA AGA AGG TCA ART CRA ARA CTA HAT TCC ATT GGA AAT TGG CAT CAA TCT AGC ACC ACA ATC ATA ATT GTT TT8 ATA 1690 lie Arg Rrg Set Ash G1u Lys Leu Rsp Ser lle GIy Asn Trp His G1u Set Ser Thr Thr Ile lie lie Val Leu lle 499 C ATG ATA RTT ATA TTG TTT ATA A T T ~ ( A T A ATT ATA ATT 
GCA GTT AAG TAT TAC AGA ATT CAA AAG AriA AAT 1768 Met lie lle lle Leu Phe llemlleLAsn Val ThrIlle lle lle lie Ala ValLys Tyr Tyr Arg lle Gln Lye Arg Asn 525 CGA 6TG HAT CAA AAT HAT AAA CCA TAT GTA TTA ACA AAC AAA TGA CAH ATC TAT AGA TCA TTA HAT ATT AAA ATT ATA 1846 Arg Vai Asp 61n Asn Asp Lys Pro Tyr Val Leu Thr Asn Lys --...... 539 AAA AAC TTA 6HA GTA AAH TTA CGU

1870

Fig. 2. The complete nucleotide sequence of the HPIV3 F gene and flanking regions. The cDNA sequence is shown in the plus (mRNA) strand sense in the 5' to 3' direction. The sequence includes the intergenic sequences (CTT) between the M and F genes and also the F and HN genes. The transcriptional initiation sites of the F and HN genes and the polyadenylation signals of the M and F genes are underlined (- -). The predicted amino acid sequence of the unglycosylated polypeptide, F0, is also shown. The arrow indicates the cleavage site between F1 and F2 (Spriggs et al., 1986). The asterisk indicates the translational initiation site and the three major hydrophobic sequences are underlined: A, the signal peptide; B, the F1 amino terminus; C, the membrane anchor. Potential glycosylation sites are boxed and the dots show variation found between this sequence and that of Spriggs et al. (1986).

Fig. 3. Comparison of cysteine and proline residues and potential glycosylation sites in the predicted amino acid sequences of the HPIV3, Sendai virus and SV5 F1 and F2 proteins. The amino termini are on the left. Vertical lines represent the positions of cysteine (above) and proline (below) residues and the asterisks represent the potential N-glycosylation sites (Asn-X-Ser/Thr). [Figure not reproduced: the F2 and F1 regions of the HPIV3, Sendai virus and SV5 proteins drawn against an amino acid number scale from 0 to 500.]

position in all three viral F0 proteins. The fourth potential acceptor site (amino acids 508 to 510) does not correspond to any glycosylation site of either the Sendai virus or SV5 F0 proteins, is found within the membrane anchor region of the HPIV3 F0 protein and, therefore, presumably is not used.

The positions of cysteine and proline residues in HPIV3 F0 protein were compared with those in the F0 proteins of Sendai virus and SV5 (Fig. 3). Eight of 11 cysteines and 11 of 18 prolines in the HPIV3 F protein are clustered in F1 between amino acids 282 and 453. In the F protein of Sendai virus, cysteines and prolines are also concentrated in the same part of F1. In addition, many of the cysteines, and to a lesser extent the prolines, in the F1 protein of SV5 can also be aligned with those of HPIV3 and Sendai virus F1 proteins. The large number and the conserved distribution of cysteines and prolines in this region of F1 suggest a high degree of folding and possibly an important role in stabilizing the tertiary structure of the F protein. Two cysteine residues are located in F2; one probably marks the cleavage site for the signal peptide and the other, therefore, must be involved in disulphide bonding of F1 and F2.

Comparison with the previously published F gene sequence

Recently, Spriggs et al. (1986) reported the complete nucleotide sequence of the HPIV3 F gene. Our results confirm their findings. There are only 12 nucleotide differences between their sequence and ours. Two of the differences (nucleotides 97 and 142) are in the untranslated region preceding the initiation codon. In the coding region, four differences (nucleotides 1018, 1096, 1297 and 1384) do not change the amino acid sequence. The remaining six differences (nucleotides 296, 1022, 1077, 1175, 1299 and 1541) would result in an altered amino acid sequence (see Fig. 2).
Of these six (Ser/Pro35, Ala/Thr277, Arg/Lys295, Val/Ile328, Lys/Thr369 and Ala/Ser449), only Lys/Thr369 would result in a net charge change. A proline at position 35, as shown in our sequence, suggests that there is no potential carbohydrate acceptor site in the F2 region of the HPIV3 F protein gene that we have sequenced. The sequence of Spriggs et al. (1986) indicates a potential glycosylation site at Asn33-Ser34-Ser35. It will be of interest to see whether the lack of a potential glycosylation site in F2 will have any effect on expression of the cloned F protein gene. The observed differences can be explained by natural variation which may have occurred during the different passage histories of HPIV3 in the two laboratories. Alternatively, our cloned F sequence may not be representative of the population of viral genomes in the RNA pool, because we analysed specific clones or as a result of cloning artefacts.
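Two of the statements above lend themselves to a quick check: that only Lys/Thr369 changes the net charge, and that a proline at position 35 removes the Asn33-Ser34-Ser35 acceptor site. The sketch below is illustrative only. The charge table assumes neutral pH with histidine treated as uncharged, the helper names are hypothetical, and the tripeptides `NSS` and `NSP` stand for residues 33 to 35 of the two sequences as described in the text.

```python
import re

# Assumed side-chain charges at neutral pH; every residue not listed is taken as 0.
CHARGE = {"Lys": 1, "Arg": 1, "Asp": -1, "Glu": -1}

# Amino acid differences listed in the text: (residue, alternative residue, position).
DIFFERENCES = [
    ("Ser", "Pro", 35),
    ("Ala", "Thr", 277),
    ("Arg", "Lys", 295),
    ("Val", "Ile", 328),
    ("Lys", "Thr", 369),
    ("Ala", "Ser", 449),
]


def changes_net_charge(res_a: str, res_b: str) -> bool:
    """True if swapping one residue for the other alters the net charge."""
    return CHARGE.get(res_a, 0) != CHARGE.get(res_b, 0)


charge_changing = [d for d in DIFFERENCES if changes_net_charge(d[0], d[1])]

# Acceptor motif as written in the paper: Asn-X-Ser/Thr (one-letter code).
# Many tools additionally exclude proline at X; that refinement is not applied here.
SEQUON = re.compile(r"N.[ST]")


def has_sequon(peptide: str) -> bool:
    """True if the one-letter peptide contains an Asn-X-Ser/Thr motif."""
    return SEQUON.search(peptide) is not None


print(charge_changing)
print(has_sequon("NSS"), has_sequon("NSP"))
```

With these assumptions the only charge-changing difference is Lys/Thr369, and the motif is present in Asn-Ser-Ser but absent from Asn-Ser-Pro, in line with the text.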


M-F junction
vRNA  3'---UCGUUUAUUCUCUAUUAGUUUUUGAAUCCUGUUUUCUUC---5'
mRNA  5'---AGCAAAUAAGAGAUAAUC-polyA   AGGACAAAAGAAG---3'

F-HN junction
vRNA  3'---AAUCUAUAAUUUUAAUAUUUUUUGAAUCCUCAUUUCAAU---5'
mRNA  5'---UUAGAUAUUAAAAUUAU-polyA    AGGAGUAAAGUUA---3'

Fig. 4. Nucleotide sequences of the HPIV3 F gene flanking regions. The sequences are presented in both genomic and mRNA senses. The polymerase recognition (gene start) and polyadenylation (gene stop) sequences are underlined. Note the common intergenic sequence GAA and the eight nucleotide insertion within the M gene polyadenylation signal.
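The genomic and mRNA senses shown in Fig. 4 are related by base complementarity. As a minimal sketch (the function name `mrna_sense` and the variable names are illustrative assumptions; the sequences are transcribed from Fig. 4), the genomic strand written 3' to 5' can be complemented base by base to give the plus strand written 5' to 3':

```python
# Minimal sketch: derive the plus (mRNA) sense from the genomic sequences in Fig. 4.
# Names are illustrative; the sequences are transcribed from the figure.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def mrna_sense(vrna_3_to_5: str) -> str:
    """Complement a genomic RNA written 3'->5' to give the plus strand written 5'->3'."""
    return vrna_3_to_5.translate(COMPLEMENT)


JUNCTIONS = {
    "M-F": "UCGUUUAUUCUCUAUUAGUUUUUGAAUCCUGUUUUCUUC",
    "F-HN": "AAUCUAUAAUUUUAAUAUUUUUUGAAUCCUCAUUUCAAU",
}

plus_strands = {name: mrna_sense(seq) for name, seq in JUNCTIONS.items()}

for name, plus in plus_strands.items():
    # The genomic intergenic GAA appears as CUU in the plus strand.
    print(name, plus, "contains CUU:", "CUU" in plus)
```

Because the genomic strand is written 3' to 5', no reversal is needed. In the plus strand the run of U residues becomes the start of the polyadenylate tract and the intergenic GAA becomes CUU, matching the CTT noted in the Fig. 2 legend.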

Spriggs et al. (1986) confirmed their F gene sequence by direct dideoxynucleotide sequencing of genomic RNA and found eight differences between the sequence of cloned cDNA and the sequence determined by direct sequencing of genomic RNA, half of which were silent at the amino acid level. Direct RNA sequencing, therefore, should provide an explanation for these observed differences.

Sequence analysis of the HPIV3 F gene flanking regions

The sequences of both the M-F and F-HN gene junctions were also determined and are shown in the genomic sense in Fig. 4. The sequence of the polymerase recognition or gene start signal of the F gene (3' UCCUGUUUUC 5') of HPIV3 is identical to that recently reported by Spriggs & Collins (1986) and fits the consensus sequence 3' UCCUNNUUUC 5' found at the beginning of each HPIV3 gene (Spriggs & Collins, 1986; Dimock et al., 1986b and unpublished observations). The polyadenylation or gene stop signal for the F gene was determined to be 3' UUAAUAUUUUUU 5' by comparing the nucleotide sequence of clone pPI14, an mRNA-derived cDNA clone which contains sequences corresponding to the polyadenylate tract, with the overlapping sequences of genomic RNA-derived clones (Fig. 1). The polyadenylation site of the HPIV3 F gene is identical, except at one position, to that of the HPIV3 HN gene, 3' UUUAUAUUUUUU 5' (Storey et al., 1987), but differs from those of the other HPIV3 genes (Spriggs & Collins, 1986). The M gene stop signal differs from the stop signals of the other HPIV3 genes by having an internal insertion of eight nucleotides. This insertion might explain the abundance of M/F bicistronic mRNA found in HPIV3-infected cells (Dimock et al., 1986a; Spriggs & Collins, 1986).

In summary, the complete nucleotide sequence of the HPIV3 F gene and its flanking regions was determined. The amino acid sequence of the HPIV3 F0 protein was deduced and compared with those of two other paramyxoviruses, Sendai virus and SV5. Overall, there is 42 to 43% homology between the Sendai virus F protein (Blumberg et al., 1985; Shioda et al., 1986) and the HPIV3 F protein. Several regions of the HPIV3 F protein show greater than 50% homology to corresponding regions of the Sendai virus F protein.
Sequence comparison among several genes of different paramyxoviruses indicates that HPIV3 and Sendai virus are more closely related to each other than to other paramyxoviruses (Luk et al., 1986; Sanchez et al., 1986; Galinski et al., 1986 a, b; Elango et al., 1986; Dimock et al., 1986b; Spriggs et al., 1986; Storey et al., 1987). Our results, therefore, add further support to these observations.
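The gene start consensus and the gene stop comparison described in the flanking-region analysis can be expressed compactly. This is a sketch under stated assumptions: the regular expression is one possible encoding of 3' UCCUNNUUUC 5' with N as any base, the HN gene start is read from the F-HN junction in Fig. 4, and the helper name `hamming` is illustrative.

```python
import re

# One possible encoding of the consensus 3' UCCUNNUUUC 5' (N = any nucleotide).
GENE_START_CONSENSUS = re.compile(r"UCCU[ACGU]{2}UUUC")

# Genomic-sense signals, written 3'->5' as in the text and Fig. 4.
GENE_STARTS = {"F": "UCCUGUUUUC", "HN": "UCCUCAUUUC"}
GENE_STOPS = {"F": "UUAAUAUUUUUU", "HN": "UUUAUAUUUUUU"}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


fits_consensus = {
    gene: GENE_START_CONSENSUS.fullmatch(seq) is not None
    for gene, seq in GENE_STARTS.items()
}
stop_differences = hamming(GENE_STOPS["F"], GENE_STOPS["HN"])

print(fits_consensus)
print(stop_differences)
```

Both gene start sequences match the consensus under this encoding, and the F and HN gene stop signals differ at a single position (the third residue), as stated in the text.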

This work was supported by the Medical Research Council of Canada, the Natural Science and Engineering Research Council of Canada and the World Health Organization. K.D. is the recipient of an Ontario Ministry of Health Career Development Award. D.G.S. and M.J.C. are the recipients of an Ontario Graduate Scholarship.


REFERENCES

BLUMBERG, B. M., GIORGI, C., ROSE, K. & KOLAKOFSKY, D. (1985). Sequence determination of the Sendai virus fusion protein gene. Journal of General Virology 66, 317-331.
CHANOCK, R. M. (1956). Association of a new type of cytopathogenic myxovirus with infantile croup. Journal of Experimental Medicine 104, 555-576.
DALE, R. M. K., McCLURE, B. A. & HOUCHINS, J. P. (1985). A rapid single-stranded cloning strategy for producing a sequential series of overlapping clones for use in DNA sequencing: application to sequencing the corn mitochondrial 18S rRNA. Plasmid 13, 31-40.
DIMOCK, K., STOREY, D. G., CÔTÉ, M.-J. & KANG, C. Y. (1986a). Cloning, coding assignments and mapping of human parainfluenza virus 3 genes. In Biology of Negative Strand Viruses. Edited by D. Kolakofsky & B. W. J. Mahy. New York, Amsterdam & Oxford: Elsevier Biomedical Press. (In press.)
DIMOCK, K., RUD, E. W. & KANG, C. Y. (1986b). 3'-Terminal sequence of human parainfluenza virus 3 genomic RNA. Nucleic Acids Research 14, 4694.
ELANGO, N., COLIGAN, J. E., JAMBOU, R. C. & VENKATESAN, S. (1986). Human parainfluenza type 3 virus hemagglutinin-neuraminidase glycoprotein: nucleotide sequence of mRNA and limited amino acid sequence of the purified protein. Journal of Virology 57, 481-489.
GALINSKI, M. S., MINK, M. A., LAMBERT, D. M., WECHSLER, S. L. & PONS, M. W. (1986a). Molecular cloning and sequence analysis of the human parainfluenza 3 virus RNA encoding the nucleocapsid protein. Virology 149, 139-151.
GALINSKI, M. S., MINK, M. A., LAMBERT, D. M., WECHSLER, S. L. & PONS, M. W. (1986b). Molecular cloning and sequence analysis of the human parainfluenza 3 virus RNA encoding the P and C proteins. Virology 155, 46-60.
GETHING, M. J., WHITE, J. M. & WATERFIELD, M. D. (1978). Purification of the fusion protein of Sendai virus: analysis of the NH2-terminal sequence generated during precursor activation. Proceedings of the National Academy of Sciences, U.S.A. 75, 2737-2740.
GLEZEN, W. P. & DENNY, F. W. (1973). Epidemiology of acute lower respiratory disease in children. New England Journal of Medicine 288, 498-505.
HOMMA, M. & OHUCHI, M. (1973). Trypsin action on the growth of Sendai virus in tissue culture cells. III. Structural difference of Sendai viruses grown in eggs and tissue culture cells. Journal of Virology 12, 1457-1465.
JAMBOU, R. C., ELANGO, N. & VENKATESAN, S. (1985). Proteins associated with human parainfluenza virus type 3. Journal of Virology 56, 298-302.
LUK, D., SANCHEZ, A. & BANERJEE, A. K. (1986). Messenger RNA encoding the phosphoprotein (P) gene of human parainfluenza virus 3 is bicistronic. Virology 153, 318-325.
LYLES, D. S. (1979). Glycoproteins of Sendai virus are transmembrane proteins. Proceedings of the National Academy of Sciences, U.S.A. 76, 6521-6525.
MERZ, D. C., SCHEID, A. & CHOPPIN, P. W. (1980). Importance of antibodies to the fusion glycoprotein of paramyxoviruses in the prevention of spread of infection. Journal of Experimental Medicine 151, 275-288.
MESSING, J. (1983). New M13 vectors for cloning. Methods in Enzymology 101, 20-78.
NAGAI, Y., KLENK, H.-D. & ROTT, R. (1976). Proteolytic cleavage of the viral glycoproteins and its significance for the virulence of Newcastle disease virus. Virology 72, 494-508.
PAROTT, R. H., VARGOSKI, A. J., KIM, H. W., BELL, J. A. & CHANOCK, R. M. (1962). Respiratory diseases of viral etiology. III. Myxoviruses: parainfluenza. American Journal of Public Health 52, 907-917.
PATERSON, R. G., HARRIS, T. J. R. & LAMB, R. A. (1984). Fusion protein of paramyxovirus simian virus 5: nucleotide sequence of mRNA predicts a highly hydrophobic glycoprotein. Proceedings of the National Academy of Sciences, U.S.A. 81, 6706-6710.
RICHARDSON, C. D. & CHOPPIN, P. W. (1983). Oligopeptides that specifically inhibit membrane fusion by paramyxoviruses: studies on the site of action. Virology 131, 518-532.
RICHARDSON, C. D., SCHEID, A. & CHOPPIN, P. W. (1980). Specific inhibition of paramyxovirus and myxovirus replication by oligopeptides with amino acid sequences similar to those at the N-termini of the F1 or HA2 viral polypeptides. Virology 105, 205-222.
SANCHEZ, A. & BANERJEE, A. K. (1985). Studies on human parainfluenza virus 3: characterization of the structural proteins and in vitro synthesized proteins coded by mRNAs isolated from infected cells. Virology 143, 45-54.
SANCHEZ, A., BANERJEE, A. K., FURUICHI, Y. & RICHARDSON, M. A. (1986). Conserved structures among the nucleocapsid proteins of the Paramyxoviridae: complete nucleotide sequence of human parainfluenza virus type 3 NP mRNA. Virology 152, 171-180.
SCHEID, A. & CHOPPIN, P. W. (1974). Identification of biological activities of paramyxovirus glycoproteins. Activation of cell fusion, hemolysis and infectivity by proteolytic cleavage of an inactive precursor protein of Sendai virus. Virology 57, 475-490.
SCHEID, A. & CHOPPIN, P. W. (1977). Two disulphide-linked polypeptide chains constitute the active F protein of paramyxoviruses. Virology 80, 54-66.
SHIODA, T., IWASAKI, K. & SHIBUTA, H. (1986). Determination of the complete nucleotide sequence of the Sendai virus genome RNA and the predicted amino acid sequence of the F, HN and L proteins. Nucleic Acids Research 14, 1545-1563.
SPRIGGS, M. K. & COLLINS, P. L. (1986). Human parainfluenza virus type 3: messenger RNAs, polypeptide coding assignments, intergenic sequences, and genetic map. Journal of Virology 59, 646-654.
SPRIGGS, M. K., OLMSTED, R. A., VENKATESAN, S., COLIGAN, J. E. & COLLINS, P. L. (1986). Fusion glycoprotein of human parainfluenza virus type 3: nucleotide sequence of the gene, direct identification of the cleavage-activation site and comparison with other paramyxoviruses. Virology 152, 241-251.
STOREY, D. G., DIMOCK, K. & KANG, C. Y. (1984). Structural characterization of virion proteins and genomic RNA of human parainfluenza virus 3. Journal of Virology 52, 761-766.
STOREY, D. G., CÔTÉ, M.-J., DIMOCK, K. & KANG, C. Y. (1987). Nucleotide sequence of the coding and flanking regions of the human parainfluenza virus 3 hemagglutinin-neuraminidase gene: comparison with other paramyxoviruses. Intervirology (in press).
WECHSLER, S. L., LAMBERT, D. M., GALINSKI, M. S. & PONS, M. W. (1985). Intracellular synthesis of human parainfluenza type 3 virus-specified polypeptides. Journal of Virology 54, 661-664.
WELLIVER, R., WONG, D. T., CHOI, T.-S. & OGRA, P. L. (1982). Natural history of parainfluenza virus infection in childhood. Journal of Pediatrics 101, 180-187.

(Received 9 October 1986)
