Nucleotide Sequence of the pnp Gene of Escherichia coli Encoding ...

Vol. 262, No. 1, Issue of January 5, pp. 63-68, 1987. Printed in U.S.A.

THE JOURNAL OF BIOLOGICAL CHEMISTRY

© 1987 by The American Society of Biological Chemists, Inc.

Nucleotide Sequence of the pnp Gene of Escherichia coli Encoding Polynucleotide Phosphorylase
HOMOLOGY OF THE PRIMARY STRUCTURE OF THE PROTEIN WITH THE RNA-BINDING DOMAIN OF RIBOSOMAL PROTEIN S1*

(Received for publication, May 13, 1986)

Philippe Régnier‡, Marianne Grunberg-Manago, and Claude Portier

From the Institut de Biologie Physico-Chimique, 75005 Paris, France

The pnp gene is located at 69 min on the Escherichia coli chromosome adjacent to the rpsO gene which encodes the ribosomal protein S15. In this paper, we present the sequence of a 3030-nucleotide DNA fragment containing the open reading frames coding for ribosomal protein S15 and polynucleotide phosphorylase. Translation of pnp is initiated by a 5'-UUG-3' codon separated by 7 nucleotides from a good ribosome binding site. Codon usage in this gene is typical of highly expressed proteins of E. coli. Some of the transcripts of the pnp gene terminate just after the stem of the terminator t2 visible in the nucleotide sequence. However, a very strong read-through occurs at this site, thus permitting many of the pnp transcripts to extend beyond this transcription terminator. We also describe the primary structure homologies between a 69-amino acid stretch of polynucleotide phosphorylase and the four homologous stretches of ribosomal protein S1 which form its RNA binding site. The possibility that this 69-amino acid stretch constitutes the polynucleotide binding domain of polynucleotide phosphorylase is discussed.

Polynucleotide phosphorylase (PNPase¹) (polyribonucleotide:orthophosphate nucleotidyltransferase, EC 2.7.7.8) was first discovered in bacteria (Grunberg-Manago and Ochoa, 1955) and has been identified subsequently in many prokaryotes (Godefroy-Colburn and Grunberg-Manago, 1972) and even in plants (Khan and Fraenkel-Conrat, 1985). In vitro, PNPase catalyzes the polymerization of ribonucleoside diphosphates, the phosphorolysis of polyribonucleotides or oligoribonucleotides in the presence of phosphate, and the exchange between the β-phosphate of ribonucleoside diphosphates and free orthophosphate. Biochemical studies have shown that PNPase of Escherichia coli is constituted by the association of three identical α subunits (mass = 86 kDa) which are responsible for the catalytic activity (Portier, 1975a). Studies of the kinetics of polymerization and of phosphorolysis led to the conclusion that a catalytic center exists together with an RNA binding site on the surface of the molecule (Thang et al., 1970; Godefroy, 1970; Guissani, 1977). Limited proteolysis of the enzyme supports the view that these two kinds of sites are distinct (Thang et al., 1967, 1970; Godefroy, 1970).

Several mutants have been isolated which are deficient in PNPase synthesis and which map at 69 min on the E. coli chromosome in the pnp gene (Reiner, 1969) encoding the α subunit (Portier, 1980). Our approach to improve our knowledge of this enzyme has been to clone the pnp gene (Portier et al., 1981). In further studies, we have demonstrated that transcription of pnp is regulated together with that of the rpsO gene which encodes ribosomal protein S15 (Portier and Régnier, 1984; Régnier and Portier, 1986). In the present work, we report the DNA sequence of the pnp gene and we describe primary structure homologies between a 69-amino acid stretch of the C-terminal part of PNPase and the four repeating homologous stretches of ribosomal protein S1 (Wittmann-Liebold et al., 1983).

* This work was supported by a grant from University Paris 7 (to Ph. R.), Grant 18 from the Centre National de la Recherche Scientifique, Grant 82 V 1289 from the Ministère de la Recherche et de l'Industrie, Grants 823.008 and 831.003 from the Institut National de la Santé et de la Recherche Médicale, and grants from the Fondation pour la Recherche Médicale and Du Pont (to M. G.-M.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) J02638.
‡ To whom correspondence and reprint requests should be addressed.
¹ The abbreviation used is: PNPase, polynucleotide phosphorylase.

EXPERIMENTAL PROCEDURES

Materials - Polynucleotide kinase, DNA polymerase Klenow fragment, and 32P-labeled nucleotides were from Amersham. S1 nuclease was from Boehringer Mannheim, and all endonucleases were from commercial suppliers.

Preparation of Labeled DNA Fragments and DNA Sequencing - DNA restriction fragments were prepared from plasmids pBP111 and pBPA61 (Portier et al., 1981) and purified from low melting point agarose gels (Sacerdot et al., 1982) or from polyacrylamide gels (Maxam and Gilbert, 1980). Labeling of the restriction fragments (at 5'-ends with [γ-32P]ATP, 3000 Ci/mmol, and polynucleotide kinase or at the 3'-end with [α-32P]dCTP, 3000 Ci/mmol, and Klenow fragment of DNA polymerase I of E. coli), strand separation, and sequencing of the DNA were performed as described by Maxam and Gilbert (1980).

S1 Nuclease Mapping - The method used (Régnier and Portier, 1986) was basically as described by Burton et al. (1983). In each hybridization experiment, 0.1 µg of end-labeled DNA was incubated with 50 µg of E. coli RNA prepared from BL 322 strain (Studier, 1975). Hybridization was carried out for 3 h at 57 °C and S1 nuclease-resistant DNA fragments were analyzed on 8% polyacrylamide gels containing 8.3 M urea.

Sequence Comparison - Systematic comparison between the amino acid sequence of PNPase and all E. coli protein sequences in the National Biomedical Research Foundation data base was carried out using the Search program of Orcutt et al. (1984) combined with the mutation data matrix of Dayhoff et al. (1983). The PNPase amino acid sequence was divided into segments of 20 amino acids in length that were compared to the data base. An increment of 10 was added to the position of the N-terminal amino acid of a segment to obtain that of the next segment to be compared. Comparison of PNPase with ribosomal protein S1 was carried out using the graphic method of Staden (1982) combined with the mutation data matrix of Dayhoff et al. (1983).

RESULTS
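The segmentation used for the data base search (Sequence Comparison, under Experimental Procedures) can be illustrated with a minimal sketch: segments of 20 residues whose N-terminal positions advance by 10. The function name `segment_windows` is hypothetical, the placeholder sequence is not the PNPase sequence, and the treatment of residues left over after the last full segment is an assumption, since the text does not describe it. This is not the Search program of Orcutt et al.

```python
def segment_windows(sequence, length=20, step=10):
    """Return (start_position, segment) pairs with 1-based start positions.

    Assumption: only full-length segments are produced; the source does not
    say how residues after the last full segment were handled.
    """
    windows = []
    last_start = max(len(sequence) - length, 0)
    for start in range(0, last_start + 1, step):
        # start is 0-based for slicing; report the 1-based residue position
        windows.append((start + 1, sequence[start:start + length]))
    return windows


# Placeholder sequence of 711 residues (the length of the pnp product).
placeholder = "A" * 711
windows = segment_windows(placeholder)
print(len(windows), windows[0][0], windows[1][0], windows[-1][0])
```

Under these assumptions a 711-residue sequence gives 70 overlapping segments, starting at positions 1, 11, and so on up to 691, so that every residue up to position 710 is covered and all but the terminal residues fall in two segments.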

DNA Sequence of the pnp Gene - The plasmid pBP111 was constructed and contains a 4.5-kilobase HindIII-EcoRI DNA fragment covering the rpsO-pnp operon of E. coli coding for PNPase and ribosomal protein S15 inserted in pBR322 (Fig. 1, Appendix) (Portier et al., 1981). We present here a nucleotide sequence of 3030 base pairs (Fig. 2, Appendix) which was obtained by combining the sequence of the HpaI-DdeI fragment that we previously reported (Portier and Régnier, 1984) with that of the SmaI-XmnI fragment sequenced by the method of Maxam and Gilbert (1980) according to the strategy presented in Fig. 1, Appendix. The sequence of the latter fragment was determined at least twice from different restriction sites and 86% of it was deduced from both strands. The whole DNA fragment contains the two open reading frames corresponding to rpsO (Portier and Régnier, 1984) and pnp (Fig. 2, Appendix). The pnp structural gene starts at nucleotide 664 using a 5'-UUG-3' translation initiation codon located 7 nucleotides downstream from a good ribosome binding site (5'-AAGGA-3') (Portier and Régnier, 1984). Following this 5'-UUG-3' is an open reading frame of 2133 nucleotides that codes for a polypeptide of 711 amino acids. The two other reading frames are frequently broken by translation termination codons and cannot encode a polypeptide longer than 68 amino acids.

Codon usage in pnp resembles that of proteins that are highly expressed. Its coefficient of expressivity (e = 0.815), which is the average of the ratios of usage frequency of each codon relative to the frequency of the most used codon (Grosjean and Fiers, 1982), approaches the values of the highly expressed genes tufA (e = 0.903), tufB (e = 0.892), and the genes for ribosomal proteins (0.597 < e < 0.900).

Another open reading frame begins 111 nucleotides after the translation termination codon of pnp, by a 5'-AUG-3' codon (2908-2910) preceded by a good ribosome binding site (5'-GGAG-3', 2898-2901) (Fig. 2). It covers the last 123 nucleotides of the sequence that we have determined.

Termination of Transcription Downstream of pnp - Downstream of the termination codon of pnp (5'-UAA-3', positions 2797-2799), we find a sequence (2811-2832) (Fig. 2, Appendix) that can form a stable stem and loop structure (ΔG = -16.7 kcal × mol⁻¹) (Fig. 3). Moreover, this structure, labeled t2, is followed by a thymidine-rich region (5'-TTTTAA-3', 2833-2838) and thus resembles documented ρ-independent transcription terminators (Rosenberg and Court, 1979). Another sequence (2755-2791), which terminates five nucleotides before the translation termination codon in pnp, can form a stable stem and loop structure (ΔG = -19.3 kcal × mol⁻¹). Interestingly, the nucleotides that form these two secondary structures can base pair in a different way to form a third stable alternative structure (2774-2820) (ΔG = -20.4 kcal × mol⁻¹). This latter structure, if it exists, prevents the formation of the putative transcription terminator.

In order to determine whether the putative transcription terminator, t2, functions in vivo, we performed S1 nuclease mapping experiments with a single-stranded XmnI-MluI probe (Fig. 3). Since hybridization to mRNA protects from digestion by S1 nuclease 196- and 197-nucleotide-long subfragments of the probe 3'-end-labeled at MluI (Fig. 3), one can conclude that some transcripts terminate in vivo in the 4-thymidine stretch which follows the (G + C)-rich stem of the transcription terminator (Fig. 3). However, the protection of the full length 3'-end-labeled probe by mRNA also indicates that a significant number of RNA polymerase molecules (more than 50%) can read through the transcription terminator. The existence of transcripts spanning this transcription terminator has been confirmed by showing that they also protect from digestion by S1 nuclease the same single-stranded DNA probe 5'-end-labeled at XmnI (Fig. 3).

FIG. 3. S1 nuclease mapping of terminator t2. a, the DNA probe was a 3'-end-labeled 395-nucleotide-long single-stranded XmnI-MluI DNA fragment drawn in c. It was hybridized to mRNA (Lane 4) and tRNA (Lane 2), and DNA fragments were analyzed on a polyacrylamide gel together with the probe (Lane 1) and 5'-end-labeled DNA fragments resulting from HpaII digestion of pBR322 HindIII-EcoRI large fragment (Lane 3). Lengths of marker DNA fragments are 527, 460, 404, 309, 242, 238, 217, 201, 190, 180, 160, 147, 132, 122, 110, 90, 76, 67, 34, 26, 15, 9. The relative lengths of protected DNA are given on the autoradiograph together with the lengths of some marker DNA fragments. b, the same experiment was done with the 5'-end-labeled XmnI-MluI DNA fragment. Protected DNA fragments (Lane 8) were run on an acrylamide gel with marker DNA fragments as in (a) (Lane 5) and the probe (Lane 6). When hybridized to tRNA (Lane 7), the probe is completely degraded by nuclease S1. In a and b, the position and the length of the probe (Pr) are indicated on the autoradiographs. c, the end of pnp and the terminator t2 are shown relative to the XmnI and MluI sites. The base-pairing of the stem loop of the terminator t2 is shown beneath. Protected DNA fragments are shown by thick lines. The nature of the labeling (3' or 5') and the lengths of protected DNA fragments are indicated. The arrows on the sequence indicate the positions of the termination sites.

The Protein Sequence of PNPase - The amino acid composition of PNPase predicted from the DNA sequence is in good agreement with the determination made by chemical analysis (Portier, 1975b; Soreq and Littauer, 1977) except for


cysteine which is in significantly lower amounts than expected. PNPase contains only 1 cysteine residue at position 444 and 2 tryptophan residues at positions 231 and 233 (Fig. 2). The content of acidic residues in PNPase (14.2%) together with its low content of basic residues (12.65%) make it an acidic protein with an isoelectric point calculated at 5.07 from the amino acid composition and estimated at 6.01 by isoelectric focusing (Soreq and Littauer, 1977). In the amino acid sequence of PNPase, some of the acidic residues are clustered in stretches devoid of positively charged residues. The 711 amino acids coded for by the pnp gene give a molecular mass for the α subunit of PNPase of 77,122 Da. The higher molecular mass values of 86,000 ± 5,000 Da (Portier, 1975a) and 84,000 ± 5,000 Da (Soreq and Littauer, 1977) previously determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis could be attributed to the acidic character of PNPase (Burton et al., 1981).

Primary Structure Homology with Ribosomal Protein S1 - A systematic comparison of the amino acid sequence of PNPase with those of all other proteins of E. coli by the method of Orcutt et al. (1984) disclosed a strong homology with ribosomal protein S1. Homologous regions in PNPase and ribosomal protein S1 were accurately determined by the graphic comparison method of Staden (1982) (Fig. 4). This comparison shows that an amino acid stretch of the C-terminal part of PNPase has homologies with each of the four homologous stretches of the middle and C-terminal parts of ribosomal protein S1 (Wittmann-Liebold et al., 1983). If the 69-amino-acid-long stretch of PNPase, from position 622 to position 690, is aligned with the first homologous stretch of ribosomal protein S1 (192-260), 42% of the amino acids in the corresponding positions are identical and, in addition, 22% are replaced by amino acids of very similar physicochemical properties (Fig. 5). Alignments of the same stretch of PNPase with the three other homologous stretches of the ribosomal protein S1 (277-347, 364-434, 451-520) show, respectively, 39%, 39%, and 40% identical amino acids at corresponding positions and 17%, 10%, and 8% conservative replacements (Fig. 5). The best conserved subregion is a 27-amino-acid sequence extending in PNPase from position 626 to position 652 (Fig. 5).
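The percentages quoted above are of the kind obtained by counting identical and similar residues at corresponding positions of two aligned stretches. The following is a minimal sketch of such a count, not the authors' procedure. The function name `compare_aligned`, the similarity groups, and the convention of skipping gapped positions are all assumptions made for illustration; the paper does not state which residues it treated as having "very similar physicochemical properties", and the short test strings are invented, not taken from PNPase or S1.

```python
# Hypothetical similarity groups, chosen only for illustration.
SIMILAR_GROUPS = ("ILVM", "FYW", "KR", "DE", "ST", "NQ")


def compare_aligned(a, b, groups=SIMILAR_GROUPS):
    """Return (percent_identical, percent_conservative) for two aligned strings.

    Both strings must have the same length; '-' marks a gap. Positions with a
    gap in either string are skipped (an assumption, not the paper's rule).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = conservative = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
        elif any(x in g and y in g for g in groups):
            conservative += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared, 100.0 * conservative / compared


# Invented four-residue example: three identities and one D/E replacement.
print(compare_aligned("KDEL", "KEEL"))
```

As a consistency check on the figures in the text, 42% identity over the 69 aligned positions of the first stretch corresponds to about 29 identical residues.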

[Figure: graphic comparison plot; one axis labeled RPS1 (ribosomal protein S1), scale 1 to 500; second axis scale marked at 100 and 200.]