Nucleotide Sequence of the Acinetobacter ... - Semantic Scholar

7 downloads 0 Views 2MB Size Report
evolution of the internal E. coli trpP, promoter (Morse and Yanofsky 1968;. Horowitz et al. 1982; Horowitz and Platt 1982) or to an entirely different origin of.
Nucleotide Sequence of the Acinetobacter calcoaceticus trpGDC Gene Cluster Je@ey B. Kaplan, Paul Goncharofi Anita 44. Seibold, and Brian P. Nichols Department

of Biological Sciences, University of Illinois at Chicago

A plasmid library of Acinetobacter calcoaceticm HindIII fragments was constructed, and clones that complemented an Escherichia coZi pabA mutant were selected. Plasmids containing a 3.9-kb fragment of A. cakoaceticus DNA that also complemented E. coli trpD and trpC-(trpF+) mutants were obtained. We infer that complementation of E. coli pabA mutants was the result of the expression of the amphibolic anthranilate-synthasee/paminobenzoate-synthase glutamineamidotransferase gene and that the plasmid insert carried the entire trpGDC gene cluster. In E. coli minicells, the plasmid insert directed the synthesis of polypeptides of 44,000, 33,000, and 20,000 daltons, molecular masses that are consistent with the reported molecular masses of phosphoribosylanthranilate transferase, indoleglycerol-phosphate synthase, and anthranilate-synthase component II, respectively. A 3,105-bp nucleotide sequence was determined. Comparison of the A. calcoaceticus trpGDC sequences with other known trp gene sequences has allowed insight into (1) the evolution of the amphibolic trpG gene, (2) varied strategies for coordinate expression of trp genes, and (3) mechanisms of gene fusions in the trp operon.

Introduction

In all organisms investigated to date, ranging from gram-negative prokaryotes to lower-level eukaryotes, the five reactions that convert chorismate to tryptophan have not been found to differ, but the structure and organization of the genes involved in tryptophan biosynthesis vary considerably (Crawford 1975). Acinetobacter calcoaceticus contains seven separate genes that encode the enzymes of tryptophan biosynthesis (Twarog and Liggins 1970; Sawula and Crawford 1973), and these genes lie in three unlinked clusters (Sawula and Crawford 1972). The Escherichia coli trp operon, on the other hand, contains five contiguous genes encoding the tryptophan-biosynthetic enzymes (Yanofsky et al. 198 1). Two gene fusions have occurred in the evolution of the E. coli trp operon, with the result that two of the E. coli genes encode bifunctional polypeptides (Creighton 1970; Greishaber and Bauerle 1972). In certain organisms (A. calcoaceticus [Sawula and Crawford 19721, Bacillus subtilis [Kane et al. 19721, and Pseudomonas acidovorans [Buvinger et al. 198 l]), tryptophan biosynthesis is “interlocked” with the biosynthesis of the vitamin dihydrofolate because the glutamine amidotransferase (GAT) subunit of anthranilate synthase (AS; EC 4.1.3.27) is amphibolic, that is, it also acts as the GAT subunit of 1. Key words: tryptophan

genes, Acinetobacter calcoaceticus, gene rearrangements,

and reprints: Brian P. Nichols, Department Address for correspondence University of Illinois at Chicago, Box 4348, Chicago, Illinois 60680. Mol. Biol. Evol. 1(6):456-472. 1984. 0 1984 by The University of Chicago. All rights reserved.

0737-4038/84/O 106~0002$02.00 456

gene fusions.

of Biological

Sciences,

Acinetobacter trpGDC Sequence

457

a second enzyme, p-aminobenzoate synthase (PABS). AS and PABS use identical substrates, chorismate and glutamine, and produce very similar products-glutamate, pyruvate, and either anthranilate (o-aminobenzoate) or p-aminobenzoate (PABA), the latter being incorporated into dihydrofolate. We report here the nucleotide sequence of the A. calcoaceticus trpGDC gene cluster. TrpG encodes the amphibolic GAT subunit of AS and PABS. Since the nucleotide and amino acid sequences of genes encoding the pathway-specific GAT subunits-pabA, trpG, and trp(G)D-in several enterobacterial species are known (Nichols et al. 1980; Tso et al. 1980; Kaplan and Nichols 1983), determination of the primary structure of an amphibolic subunit might assist in the elucidation of the functional and evolutionary origins of pathway-specific genes. TrpD and trpC encode anthranilate phosphoribosyltransferase (PRTase; EC 2.4.2.18) and indoleglycerol phosphate synthase (InGPS; EC 4.1.1.48), respectively. In several species of the Enterobacteriaceae, these activities lie on bifunctional polypeptides. In E. coli, for example, PRTase and AS GAT activities are encoded by a single gene, trpD (Greishaber and Bauerle 1972; Miozzari and Yanofsky 1979). For clarity, we will follow the suggestion of Crawford (1975) and refer to the fused gene as trp(G)D. Similarly, in E. coli InGPS lies on the same polypeptide with phosphoribolyslanthranilate isomerase (PRAI) and is encoded by trpC (Creighton 1970). PRAI is encoded by trpF in A. calcoaceticus, and we will refer to the fused E. coli gene as trpC(F). Comparisons of the separate and fused gene sequences can aid in the determination of the minimum amount of information necessary to encode each individual function as well as assist in the elucidation of events that may have resulted in fused genes. Materials and Methods

Bacteriophages, Strains, and Strain Construction The bacterial strains used in this work are listed in table 1. Bacteriophages M 13mp8 and Ml 3mp9 are those described by Messing and Vieira (1982) and were obtained from New England Biolabs (Beverly, Mass.). Bacteriophage Pl kc was obtained from C. Yanofsky, and transductions were performed as described by Miller (1972). Escherichia coli BN 100 was constructed by transducing a Pl kc lysate made from E. coli AB3292 (Huang and Pittard 1967) into E. coli W3 110 AtrpLD102 and selecting streptomycin-resistant colonies. Resistant colonies were then screened for PABA auxotrophy. Escherichia coli BN105 (trpR, minA, minB, Zeu, thi, ZacY, ara, gal, maZA, xyl, azi, tsx, tonA, and rpsL) was constructed by transduction of a P 1kc lysate made from E. coli W3 110 trpR AtrpEA2 tna2 into E. coli P678-54 and isolation of 5-methyltryptophan-resistant, threonine-independent colonies. DNA manipulations. -Restriction endonucleases were either prepared in this laboratory according to published procedures or purchased from commercial distributors (New England Biolabs; P-L Biochemicals, Milwaukee; and Boehringer Mannheim Biochemicals, Indianapolis). Calf intestinal phosphatase was obtained from Boehringer Mannheim. Restriction endonuclease digests were performed according to the commercial supplier’s recommendations. Ligations were performed at DNA concentrations of 30-60 pg/ml in 10 mM Tris-HCl (pH 7.4), 10 mA4 MgCl*, 10 mM 2-mercaptoethanol, and 0.5 mA4 ATP for l-4 h at 14 C. Transformation of E. coli was carried out as described by Mandel and Higa (1970). Acinetobacter calcoaceticus DNA was prepared according to Marmur ( 1966), except

458

Kaplan, Goncharoff, Seibold, and Nichols

Table 1 Bacterial Strains Cited Strain

Genotype

Acinetobacter calcoaceticus

BD413 . . .

trpE2 7, ura

Crawford 1975

pabA1, rpsL704. proA2, argE3, lacY1, his-4, ilvC7, thi-I, galK2, ~~1-5, mtl-I, tfr3, tsx358, supE44

Huang and Pittard 1967

pabA1, AtrpLD102, rpsL704

This work

pabA1, trpA33, rpsL 704

Kaplan and Nichols 1983

thr, ara, leu, azi, tonA, lacy, tsx, minA, minB, gal, malA, thi, xyl, rpsL

Matsumura et al. 1977

......... .

Escherichia coli AB3292

E. coli BNlOO

...

E. coli BNlOl

...

E. coli ~678-54

Reference or Source

.

.....

....

E. coli W3110

......

trpR, AtrpEA2, tna2

Yanofsky et al. 198 1

E. coli BN105

......

trpR, ara, leu, azi, tonA, lacy, tsx, minA, minB, gal, malA, thi, xyl rpsL

This work

E. coli W3110 trpC55

trpC, (trpF+)

Yanofsky et al. 1971

E. coli LE392

.. .. . .

trpR55, metB1, hsdR5 14, galK2, galT22, lacYI, supE44, supF58

Enquist et al. 1979

E. coli JM103

......

A(lac-pro), thi, strA, supE, endA, sbcBl5, hsdR4lF’ traD36, proAB, lacI4 AlacZMI 5

Messing et al. 198 1

that DNA precipitated in ethanol was collected by centrifugation. Plasmid DNA was prepared by the method of Birnboim and Doly ( 1979). Plasmid constructions.- Acinetobacter calcoaceticus DNA was prepared, and a plasmid library was constructed by ligating 20 pg of HindIII-digested A. calcoaceticus DNA with 10 pg of pBR322 treated with Hind111 and phosphatase. A portion of the ligated DNA representing 3 pg of vector was transformed into E. coli LE392 (r-m+), an d am p ici 11in-resistant colonies were selected. Approximately 7,000 ampicillin-resistant colonies were pooled, and the plasmid DNA was isolated. Two micrograms of the amplified plasmid library was transformed into E. coli BN 100, and ampicillin-resistant, PABA-independent colonies were isolated. Protein synthesis in E. coli minicells.- Plasmids containing A. calcoaceticus trp genes were transformed into E. coli BN105, and ampicillin-resistant colonies were isolated. Isolation of minicells and labeling of proteins with 35S-methionine (300 Ci/ mmol; Amersham, Arlington Heights, Ill.) were performed as described by Matsumura et al. (1977). DNA-sequence analysis.- DNA sequences were determined by the procedure of Maxam and Gilbert ( 1980) or Sanger et al. ( 1977). DNA fragments were endlabeled either with T4-polynucleotide kinase (P-L) and (T-~~P)ATP (>2,000 Ci/mmol; Amersham) or with E. coli DNA polymerase I (Klenow fragment;

Acinetobacter trpGDC Sequence

459

Boehringer Mannheim) and (u-32P)dCTP (>800 Ci/mmol; Amex-sham) (Maxam and Gilbert 1980). Alternatively, restriction endonuclease-generated DNA fragments were cloned into M 13mp8 or M 13mp9 RF I. Recombinant phages were transformed into E. coli JM 103, and single-stranded DNA was prepared as described by Messing and Vieira ( 1982). The polyacrylamide gel/urea electrophoresis system described by Sanger and Coulson (1978) was used for resolving DNA fragments. DNA sequences were analyzed in part by computer programs (Queen and Kom 1980; S. Weaver, B. Nichols, and C. Hutchison III, unpublished). Results

Construction and Characterization of Recombinant Plasmids containing the Acinetobacter calcoaceticus trpGDC Gene Cluster Plasmid libraries of A. calcoaceticus DNA were transformed into BN 10 1 and selected for ampicillin resistance and PABA independence. Approximately 50 such colonies were obtained. Plasmid DNA prepared from six of the PABA-independent colonies obtained from the Hind111 library was cleaved with Hindlll. Agarose-gel electrophoresis showed that, in addition to the 4.3-kb pBR322-Hind111 fragment, all of the plasmids contained a 3.9-kb Hind111 fragment. Four of the six plasmids contained a small additional Hind111 fragment. Retransformation of Escherichia coli BN 10 1 with the two smaller plasmids resulted in a high proportion of PABAindependent colonies. One plasmid, designated pBN79, was selected for further characterization. Plasmid pBN79 was mapped with several restriction endonucleases. The sites are indicated in figure 1. Since Sal1 cleaved the A. calcoaceticus DNA into approximately equal parts, the two SalI-Hind111 DNA fragments were cloned separately into the Sal1 and Hind111 sites of pBR322. Each was then transformed into E. coli BN 10 1, and ampicillin-resistant colonies were tested for PABA independence. The plasmid containing the larger (2,350-bp) Hindlll-Sal1 fragment (pBN78) was found to complement the E. coZipabA mutation, whereas the plasmid containing the smaller (1,750-bp) fragment (pMB466) did not. It has been shown that the product of the A. calcoaceticus trpG gene, that is, AS Component II (AS Coil), does not complement an enterobacterial AS Component

(c)

scale

K-

pBN78+



(d) K-

-4

pMB466 region sequenced -1

FIG. I.-Physical and genetic maps of the Acinetobacter calcoaceticus trpGDC region including (a) genetic map of pBN79 illustrating the 3.9-kb Hind111 fragment (light line) in pBR322 (heavy lines) with arrows indicating the position and direction of transcription of each gene, (b) restriction endonuclease map of the Hind111 fragment in pBN79, (c) HindIII-Sal1 fragments contained in plasmids pBN78 and pMB466, and (d) extent of the sequence analysis of the Hind111 DNA fragment.

460

Kaplan, Goncharoff,

Seibold, and Nichols

I (AS CoI) (Sawula and Crawford 1973). Therefore, in order to confirm that transformation to PABA independence was the result of cloning the A. calcoaceticus trpG gene and not the result of anomalous expression of a normally cryptic pabA gene, complementation tests were performed to test for the presence of other trp genes known to be linked to A. calcoaceticus trpG. Complementation tests for the presence of A. calcoaceticus trpD were performed by transforming the plasmids into E. coli BNlOO. The tryptophan requirement of this strain cannot be circumvented by anthranilate unless the PRTase (trpD) function is supplied by the plasmid. Only pBN79 gives rise to colonies on medium containing anthranilate and PABA. Since neither pBN78 nor pMB466 is trpD+, separation of the Hind111 fragment at the Sal1 site must either interrupt the A. calcoaceticus trpD gene or cause the loss of its expression in some other manner. The test for complementation by A. calcoaceticus trpC was carried out in E. coli W3 110 trpC55, a strain that contains a missense mutation in the first half of trpC(F). This mutation causes the loss of InGPS activity without loss of PRAI activity. The strain harboring either pBN79 or pMB466 can grow on minimal medium, indicating the presence of A. calcoaceticus trpC on both of these plasmids. Plasmid pBN78 does not confer the ability to grow on minimal medium. The trpC gene of the A. calcoaceticus trpGDC cluster must therefore be located to the right of the Sal1 site (as indicated in fig. 1). Growth-rate experiments indicate that A. calcoaceticus trpG and trpD complement E. coli pabA and trpD mutants as effectively as exogenous PABA or tryptophan (data not shown). Acinetobacter calcoaceticus trpC, on the other hand, complements the E. coli trpC55 mutation slightly less effectively than exogenous tryptophan. These complementation data show that PABA independence is conferred on an E. coli pabA mutant by an A. calcoaceticus gene that lies in or near the trpGDC cluster on a 3.9-kb Hind111 DNA fragment. Since only three polypeptides are synthesized from this DNA fragment (see below), we conclude that PABA independence is conferred by trpG. In addition, the data indicate that trpG and trpC are separated by a Sal1 site that inactivates trpD. Analysis of Polypeptides Encoded by the A. calcoaceticus DNA Fragment The polypeptides encoded by the three plasmids were analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis of 35Smethionine-labeled proteins produced by E. coli minicells. Plasmids pBN79, pBN78, and pMB466 were transformed into E. coli BN 105, a minicell-producing strain containing a defective trp aporepressor. The results are shown in figure 2. Plasmid pBN79 efficiently directs the synthesis of three major polypeptides in addition to the pBR322-encoded P-lactamase. The molecular masses of 44,000, 33,000, and 20,000 daltons are consistent with the molecular masses of 37,543,30,203, and 2 1,779 daltons predicted from DNA sequence analysis (see below). Molecular masses estimated by gel permeation chromatography for PRAI, InGPS, and AS Co11 were 42,500, 30,000, and 14,000, respectively (Twarog and Liggins 1970; Sawula and Crawford 1973). The difference in the estimates for AS Co11 may be due to the different analytical methods employed. The origin of the faint bands beneath the largest polypeptide is unknown but may be a result of artifactual translation initiation at internal methionine codons of trpD.

CCTCATAAGTGGGCTATCATGATAGTTAGCCCCAACTCATA 30 TMTATATCCACTTMTTAGCCGCTTTATTTTGAATTAAA 90

Met Leu ATG CTT

Am MT

Asp GAC

Ile Ser ATT TCT

Gly Asn GGC AAT

10 Leu Met Ile Asp Aan Tyr Asp Ser Phe CTC ATG ATC GAC MT TAC GAC TCC TTT 200 40 Gin Val Thr Leu Glu Asp Ile Glu Arg CAA GTC ACT CTT GAG GAC ATT GM CGA 290 70 Ile Pro Ala Ile His His Phe Ala Gly ATT CCT GCA ATT CAT CAT TTT GCA GGC 3AO 100 Ile Ile Arg Ala Lys Thr Val Met His ATT ATT CGA GCC AAA ACC GTG ATG CAT

120

150

20 Thr Tyr Am Ile Val Gln Tyr Phe Gly Glu ACG TAT MC ATC GTC CAG TAT TTT GGC GAG 230 50 Trp Gin Pro Lys Tyr Leu Val Val Gly Pro TGG CAG CCG MG TAT TTG GTG GTC GCT CCC 320 80 Arg Ile Pro Leu Leu Gly Val Cys Leu Gly AGA ATC CCA TTA CT-TGGT GTT TGT TTA CGA 410 110 Gly Arg Leu Ser Asp Met Tyr His Thr Asp GGT CGT TTA TCG GAT ATG TAT CAT ACG GAT

60

180

30 Leu Asn Gin Asp Val Lys Val Val Arg TTA MT CM GAC GTT MA GTT GTT CGC 260 60 Gly Pro Cys Ser Pro Thr Glu Ala Gly GGC CCT TGT TCT CCA ACT GM GCT CGA 350 90 His Gln Ala Ile Gly Gin Ala Phe Gly CAT CAG GCA ATT GGA CAG GCT TTT GGG 440 120 Lys Gly Ile Phe Ser Am L-w Pro Ser AAA GGG ATT TTT AGC MT TTG CCT TCC

Asp Cly Ser Ile Glu Glu Ile Met Gly Val Lys His Lys Thr Leu Pro Val Glu Gly Val Gin Phe His Pro Glu Ser Ile Leu Ser Gin GAT GGT TCA ATC GM GM ATC ATG GGC GTT AAA CAT MG ACA TTA CCT GTA GAG GGT GTG CAG TTT CAT CCA GM TCG ATT TTG AGC CAG h5ll 680 710 15 190 5 His Gly His Gin Ile Phe Lys Am Phe Leu Glu Ile Tyr Ala Met Am Ile Gln Gin Ala Leu Am His Ile Thr Lys Am Ile His CAT GGA CAC CM ATC TTC MC MT TTT TTh GAG ATT TAT GCA TM ATG MT ATT CAA CM GCA TTA AAC CAT ATC ACC AAA MT ATT CAT 740 770 800 45 25 35 Leu Thr Gin Ala Gin Met Glu Asp Val Met Arg Ser Ile Met Gin Gly Glu Ala Thr Glu Ala Gin Ile Gly Ala Leu Met Met Gly Leu CTG ACC CAA GCG CAA ATG GM GAT GTG ATG CGT AGC ATC ATG CM GGC GM GCC ACT GM GCA CAA ATT GGT GCG TTA ATG ATG GGT TTG 830 860 890 55 65 75 Arg Met Lys Gly Glu Ser Ile Asp Glu Ile Thr Ala Ala Ala Arg Val Met Arg Glu Leu Ala Ile Lye Ile Asp Val Ser Asp Ile Gin CGC ATG MG GGT GM AGT ATC GAT GM ATT ACC GCA GCC GCA C4X GTT ATG CGT GAG CTG GCG ATT AAA ATC CAT GTC AGT CAT ATT CM 920 950 980 85 95 105 Tyr Leu Val Asp Ile Val Gly Thr Gly Gly Asp Gly Gin Am Leu Phe Am Val Ser Thr Ala Ser Ser Phe Val Ile Ala Ala Ala Gly TAT CTA GTC GAT ATT GTA GGG ACA GGT GGC GAT GGT CAA MC CTG TTT MC GTC TCT ACG GCA TCC AGC TTT GTA ATT GCT GCT GCA GGT 1010 1040 1070 115 125 135 Ala Thr Ile Ala Lys His Gly Am Arg Gly Val Ser Ser Lys Ser Gly Ser Ser Asp Leu Leu Glu Gin Ala Gly Ile Am Leu Asp Leu GCAACC ATT GCT AAG CAC GGT MT CGT GGC GTT TCT AGC AAATCT GGATCATCAGAT CTG CTT GM CAG GCAGGG ATT MT CTG GAT CTG 1100 1130 1160 165 145 155 Asp Met Gin Gin Thr Glu Arg Cys Ile Arg Glu Met Gly Val Gly Phe Leu Phe Ala Pro Aen His His Lys Ala Met Lya Tyr Ala Val GAT ATG CAA CAG ACG GM CGT TGC ATT CGC GAG ATG GGT GTC GGT TTC TTA TTT GCT CCC MC CAT CAC AM GCC ATG AAA TAT GCA GTG 1190 1220 1250 175 185 199 Gly Pro Arg Arg Glu Leu Gly Ile Arg Ser Ile Phe Am Leu Lea Gly Pro Leu Thr Am Pro Ala Gly Val Lys Arg Phe Val Ile Gly GCC 'XT CGT CGT GM CTC GGG ATT CGT AGT ATT TTT MT TTA CTT GGC CCG CTT ACG MT CCC GCT w GTA AAG CCC TlT GTG ATT CCC 1280 1340 1310 205 215 225 Val Phe Ser Asp Glu Leu Cys Arg Pro Ile Ala Glu Val Met Lye Gin Leu Gly Ala Glu His Val Met Val Val His Ser Lys Asp Gly GTA TTT TCA GAT GM CTT TGT CGA CCA ATT GCA GM GTC ATG AAA CM CTT GGT GCT GAG CAT GTG ATG GTT GTG CAT TCC AAA GAT GGT 1430 1370 1400 235 245 255 Leu Asp Glu Ile Ser Leu Ala Ser Gin Thr Tyr Ile Ala Glu Leu Lys Am Gly Glu Val Thr Glu Trp Val Leu Am Pro Glu Asp Val TTG CAT GM ATT AGT CTT GCC TCG CM ACT TAT ATT GCG GM CTC AAA MC GGT GM GTG ACT GM TGG GTG CTT MT CCA GM CAT GTG 1460 1490 1520 285 265 275 Am Ile Pro Ser Gin Thr Leu Ser Gly Leu Ile Val Glu Asp Ser Am Ala Ser Leu Lys Leu Ile Lys Asp Ala Leu Gly Arg Lys Lys MT Al-TCCG TCT CM ACG TTA AGC GGT TTA ATT GTC GM GAT TCC MT GCC AGT CTA AAA TTG ATC AM CAT GCT TTG GGC CGC AAA AAA 1550 1580 1610 315 295 305 Ser Asp Ile Gly Glu Lys Ala Ala Am Met Ile Ala Leu Am Ala Gly Ala Gly Ile Tyr Val Ser Gly Leu Ala Thr Ser Tyr Lys Gin TCG GAC ATC GGT GM AAA GCA GCC MT ATG ATT GCC TTA MT GCA GGC CCC GGT ATT TAT GTA TCA GGC TTA GCT ACC AGC TAT MA CAA 1700 1640 1670 345 325 335 Gly Val Ala Leu Ala His Asp Ile Ile Tyr Gly Gly Gin Ala Leu Glu Lys Met Ser Ile Leu Ser Glu Phe Thr Lys Ala Leu Lys Glu GGT GTG GCG TTG GCACAT GATATT Al-TTAC GGC GGA CMGCA CTC GM AAAATG AGTATT TTG TCT GMTTTACTAAAGCG CTTAAAGM 1790 1730 1760 5 15 Tyr Ala Am Am Met Ile Gin Ile Gin Am Thr Ile Leu Gly Lys Ile Val Asp Arg Lya His Glu Glu Leu Ala Asp TAC GCA MC MC TGA GGTTATCCAA ATG ATT CAG ATT CAA MT ACT ATT CTG GGT AAA ATC GTT CAT CGT MA CAT GM GM CTG GCA GAT 1880 1820 1850 45 25 35 Arg Leu Lys Gin Arg Am Leu Lys Asp Val Glu Gin Trp Val Lya Glu Ala Pro Pro Val Arg Gly Phe Ala Gin Scr Leu Arg Ser Lys CGT TTA AAG CAG CCC MC CTA AAA GAT GTG GM CM TGG GTA AAG GM GCG CCC CCT GTA CGT GGT TTT CCC CAA TCA TTA CCC AGC AAA 1970 1940 1910 65 75 55 Arg Pro Gly Val Ile Ala Glu Ile Lys Lya Ala Ser Pro Ser Lys Gly Val Ile Arg Val Asp Phe Asn Pro Ala Glu Ile Ala Gin Gin CGC CCT GGT GTAATT GCCGMATAMAAMGCC TCA CCT TCT AAAGGT GTG ATT CGGGTT GAT TlTAAT CCT GCT GAAATT GCT CM CAA 2030 2060 2000 105 85 95 Tyr Glu Gin Ala Gly Ala Ala Cys Leu Ser Val Leu Thr Asp Val Asp Phe Phe Gin Gly Ala Asp Glu Am Ile Ala Ile Ala Arg Gin TAC GAG CAA GCA GCT WA GCT TGT CTI TCT GTA TTA ACA CAT GTC GAT TTT TTC CM GGT GCT GAT GM MC ATT GCC ATT GCT C4X CAA 2090 2120 2150 135 115 125 His Cys Ser Leu Pro Ala Leu Arg Lys Asp Phe Leu Ile Asp Pro Tyr Am Val Val Glu Ala Arg Ala l&u Gla Ala Asp Cys Ile Leu CAT TGT TCT TTC CCT CCC CTA CGT AAA CAT TTT TTA ATT CAT CCA TAC MT GTT GTC GM GCA CGT GCT TTA CAG GCA GAC TGT ATT TTA 2240 2180 2210

___

Acinetobacter trpGDC Sequence 145 Leu Ile Val Ala Cys Leu TTG ATT GTC GCT TCT TTG 2270 175 Asp Glu Gin Glu Leu Glu GAC GM CM GM TTA GM 2360 205 Leu Am Thr Ser Leu Arg TTA MT ACC AGC TTG CGT 2450 235 Leu Met Gin Glu His Asp TTA ATG CM GM CAT GAC 2540 265 Gly Val Pro Lys Leu Val GGT GTG CCT AAG CTG GTT 2630

155 Ser Asp Gln Gln Leu Glu Glu Met Ser Lys TCT GAT CAG cAA CTC GM GAG ATG TCA Au 2300 185 Arg Ala Leu Lys Leu Ser Glu Gin Cys Leu CGC GCA CTC AAA CTT TCA GM CAG TGC TTA 2390 215 Leu Lys Lys Leu Leu Asp Pro Ser Arg Leu TTG MG MG CTG CTT GAT CCA TCA CGT TTA 2480 245 Ile His Ser Phe Leu Val Gly Glu Ser Phe ATT CAT TCA TTC TTG GTG GGT GM AGT TTT 2570

TM

165 Thr Ala Phe Glu Tyr Asp Leu Asp Val Leu Val ACT GCA TTT GM TAT GAT CTT GAC GTT TTG GTT 2330 195 Leu Gly Val Am Am Arg Am Leu Lys Thr Phe TTG GGT GTA MT MT CGT MT CTT AAA ACC TTT 2420 225 Leu Val Thr Glu Ser Gly Ile Ala Thr Pro Gin TTG GTC ACT GM ACT GGT ATC GCA ACT CCT CAA 2510 255 Met Lys Gin Pro Arg Pro Asp Gln Ala Phe Thr ATG AAG CAG CCT CGA CCA GAT CAG GCA TTT ACT 2600

463

Glu Val His GM GTG CAT

Asp Val Asp CAT GTC GAT

Asp Val Gln GAT GTG CM

Glu Leu Phe GM TTA TTT

ACAAATTTTGTTAAAGCMTAGATACCAAATTGAACATTTGCTA~G~G~T~TCTCTMC~~~GTT~TMTAT~TA~~TCGMT 2720 2660 2690

GCTTCCTTMCAGTCATCTGTATAGTTCTGGTATMGCCGTTGAGC~~T~TA~AGTMT~TAGTT~T~T~TMTGTGC~TG~AT~G~MTATCA 2750 2780 2810

2840

CTTAAATACTGCCGTATATCTGTATTTAAATTCTGAGGATACTTTTCTMCAGAT~TCTTTT~TAGTTA~G~T~T~GCCTATTCAGA~C~TCMCATAGCTTM~A 2870 2900 2930 2960 TTAAAGTATTCTCGMCATCTTCTTCCGAT~~T~GC~TGTCT~GTGM~CATCTTGATTAT~T~MTA~TCTG~~CGCCCMTCTTGMTCTC~TAGTAGTMGC 2990 3020 3050

3080

MTCCMCTTCMGCTT: 3100

FIG. 3.-DNA

and predicted amino acid sequences of the trpGDC gene cluster and flanking regions.

E. coli thrB and thrC (Cossart et al. 198 1) and are reminiscent of the overlapping termination and initiation codons observed in the E. coli trp operon between trpE and trp(G)D (Nichols et al. 1980) and between trpB and trpA (Platt and Yanofsky

1975; Nichols and Yanofsky 1979). The overlapping termination and initiation codons in the E. coli trp operon have been implicated in the coordinate translation of the adjacent genes (Oppenheim and Yanofsky 1980; Aksoy et al. 1984), and it is possible that the adjacent termination and initiation codons play a similar role in the coordinate expression of A. calcoaceticus trpG and trpD. The three trp genes can be easily recognized by their similarity to both E. coli trp(G)D and the S-terminal portion of E. coli trpC(F). Figure 4 shows the similarity between the aligned A. calcoaceticus and E. coli nucleotide sequences. The sequence similarity is evident along the entire length of the gene cluster and is strongest within the coding regions of the genes. Similarities are notably reduced around intercistronic regions and at the beginning and end of the A. calcoaceticus gene cluster. The intermittent nature of strong similarity suggests conservation of structurally and functionally critical amino acid sequences and divergence of expressionrelated signals and less important amino acid sequences. Amino Acid-Sequence

Alignments

Comparisons of the predicted amino acid sequences of A. calcoaceticus AS CoII, PRTase, and InGPS with the homologous sequences from other organisms are shown in figures 5, 6, and 7, respectively. Alignments are manual, and gaps have been inserted to increase similarity among the sequences. The three A. calcoaceticus amino acid sequences show 36%-39% similarity to the E. coli amino acid sequences (AS CoII, 37%; PRTase, 36%; InGPS, 39%); however, A. calcoaceticus AS Co11 is much more similar to the E. coZi PABS Co11 (57%) than to AS CoII. The higher similarity may explain why A. calcoaceticus trpG functions in E. coZi PABA synthesis but does not function in anthranilate synthesis.

464

8

Kaplan, Goncharoff, Se&old, and Nichols

0

vx

3

frpG

frpD

#rpC

FIG. 4.-Nucleotide-sequence similarity between Acinetobacter calcoaceticus trpGDC and the equivalent portion of the Escherichia coli trp operon. Each point represents the similarity between 40-nucleotidelong sequences. The positions of the A. calcoaceticus genes are indicated by the arrows along the horizontal axis.

Discussion

Comparison of Amino Acid Sequences Alignment of the three AS GAT amino acid sequences with the Escherichia the common origin of the AS GAT and PABS GAT subunits (Kaplan and Nichols 1983). Strong similarity occurs in several regions throughout the sequences. One of these regions, at residues 74-83, contains a cysteine residue (position 79) that has been determined to be essential for catalytic activity. A glutamine analog covalently reacts with this cysteine residue in the Pseudomonas putida (Kawamura et al. 1978) and Serratia marcescens (Tso et al. 1980) AS GAT subunits. Another region, centered in the area of residue 103, is rather less conserved among the sequences but contains a lysine residue shown to be essential for activity (Bower and Zalkin 1982). Arginine replaces the essential lysine residue in the Acinetobacter calcoaceticus sequence. The specific roles of the other conserved sequences are not known, but their conservation across generic boundaries suggests critical roles in either substrate binding or catalysis. The A. calcoaceticus trpG product is most closely related to the P. putida trpG product and least related to the E. coli trp(G) product. These relationships correlate

coli PABS GAT sequence underscores

100

120

160 180 WTNQNDGSIEEIMGVKHKTLPVEGVQFHPESILSQHGHQIFKNFLEIYA WTAHEDGSVDEIMGLRHKTLNIEGVQFHP~SILTEQGHELFANFLKQTGGRR WSET---1 1] [lrri -REIMGIRHRQWDLEGVQFHPESILSEQGHQLLANFLHR HFNG------MVMAVRHDADRVCGFQFHPESILTTQGARLLEOTLAWAQHKL...

140

0

1]

FIG. L-Alignment of the predicted amino acid sequences of Acinetobacter calcoaceticus anthranilate synthase (AS) Co11 (row l), Pseudomonas putida AS Co11 (Kawamura et al. 1978) (row 2), Escherichia coli paminobenzoate synthase Co11 (Kaplan and Nichols 1983) (row 3), and the equivalent portion of E. coli AS Co11 (Nichols et al. 1980) (row 4). Numbering refers only to the A. culcoaceticus sequence. Identical residues in all sequences are boxed.

Acinetobacter trpGDC Sequence

465

20

180

200

280

320

340 IIYGGQALEKMSILSEFTKALKEYANN AYDRVTALAARG 17 FIG. 6.-Alignment of the predicted amino acid sequences of Acinetobacter cakoaceticus anthranilate phophoribosyltransferase (PRTase) (top row) and Escherichia coli PRTase (Horowitz et al. 1982) (bottom row). The E. coli sequence shown begins at residue 198 of the bifunctional glutamine amidotransferase PRTase polypeptide. Numbering refers to the A. calcoaceticus sequence.

with intrageneric AS subunit exchange experiments. The A. calcoaceticus and P. putida CAT subunits can be exchanged without appreciable loss of function, but neither of the AS Co1 subunits can function with an enterobacterial AS Co11 (Sawula and Crawford 1973; Queener et al. 1973); however, at least two (A. calcoaceticus AS Co11 and E. coli PABS CoII) of the three most similar sequences shown in figure 5 do function in PABA synthesis with E. coli PABS Co1 in vivo. Because of the observed sequence similarity between A. calcoaceticus trpG and P. putida trpG,

B”0 LRKDF LCKDF

q

VVEARA IYLARY

VA LS

260 KQPRPDQmFT-E[F[VP[LV AHDDLHAAVRRVLLGENKVCGLTRGODAKA...

FIG. 7.-Alignment of the predicted amino acid sequences of Acinetobacter calcoaceticus indoleglycerol phosphate synthase (InGPS) (top row) and Escherichia coli InGPS (Christie et al. 1978) (bottom row). Only the N-terminal portion of the fused E. coli InGPS-phosphoribosylanthranilate isomerase sequence is shown. Numbering refers to the A. cakoaceticus sequence.

466

Kaplan,

Goncharoff,

Seibold, and Nichols

it would not be surprising to find that P. putida AS Co11 functioned similarly, although it has not yet been determined whether P. putidu trpG encodes an amphibolic CAT subunit. It is not obvious from inspection of the amino acid sequences why the A. calcoaceticus GAT subunit does not function amphibolically in E. cob. Evidence suggests that the N-terminal portion of E. coli AS Co11 interacts with the AS Co1 subunit (Greishaber and Bauerle 1972). Whether the same is true of the PABS subunit is unknown; however, if most of the remainder of the sequence is conserved for other functions, then by analogy with AS CoII, it is likely that the N-terminal region of the PABS GAT subunit also specifies interaction with the large PABS subunit. Since the A. calcoaceticus CAT cannot complement the E. coli AS Co1 but behaves instead as a PABS-specific GAT subunit, it is likely that the amino acid differences observed in E. coli trp(G) in this region account in part for the functional differences in subunit specificity. Subunit-specificity determinants would also be expected to occur on the Co1 subunits of AS and PABS, and according to observed complementation patterns, it is likely that in the subunit-interaction areas A. culcoaceticus AS Co1 will show similarities to A. culcoaceticus PABS Co1 that are not found between E. coli AS Co1 and E. coli PABS CoI. The predicted amino acid sequences of A. calcoaceticus and E. coli PRTase are compared in figure 6. The E. coli sequence shown begins within the fusion region between the trp(G) and trpD portions of the sequence. (The monofunctional PRTase-encoding sequence of S. marcescens begins at the position equivalent to A. calcouceticus residue 3.) Similarity is quite strong in the central region of the polypeptide, whereas the N- and C-terminal portions show less similarity. The diversity at the N and C termini may be due to either a structural or a functional adaptation of the E. coli sequence to the fusion of trp(G) in conjunction with the evolution of the internal E. coli trpP, promoter (Morse and Yanofsky 1968; Horowitz et al. 1982; Horowitz and Platt 1982) or to an entirely different origin of this DNA region. The amino acid sequence of the A. calcoaceticus trpC product aligns with the N-terminal portion of the fused E. coli trpC(F) product (Christie and Platt 1980). As in the PRTase sequences, the extent of similarity is strongest in the central portion of the sequence and is somewhat lower toward the termini. The product of the E. coli trpC(F) gene is 450 amino acids long and catalyzes both the InGPS and PRAI activities. Analysis of proteolytic fragments has determined that the two activities lie on independently folding domains and that the InGPS activity is confined to the 289 N-terminal residues (Kirschner et al. 1980). Comparison of the E. coli trpC(F) sequence with the monofunctional A. calcoaceticus trpC sequence suggests that the InGPS activity is encoded within the first 260 amino acid residues and that the domain containing PRAI activity lies in the remaining 190 amino acids. Sucharomyces cerevisiae TRPl (equivalent to trpF) encodes InGPS on a separate polypeptide chain. The first recognizable similarity to the E. coli sequence occurs at positions 15 19 of TRPl , matching E. coli positions 266-27 1 as numbered in figure 7 (Tschumper and Carbon 1980). This latter sequence, KVCGLT, also overlaps the last three amino acids of the A. calcoaceticus InGPS sequence. While it has been shown that the E. coli trp(G)D fusion is the result of modification and translation through an ancestral intercistronic region (Miozzari and Yanofsky 1979), this is probably not the same mechanism that resulted in the fusion of trpC and trpF. We suggest instead that the trpC-trpF fusion occurred via a deletion between

Acinetobacter trpGDC Sequence

467

separate but adjacent trpC and trpF genes with end points within each of the ancestral genes. This hypothesis is supported by the observation that the “fusion peptide” between the GAT and PRTase activities lies near the surface of the molecule and is accessible to proteases (Li et al. 1974), whereas the junction between the fused InGPS and PRAI activities is buried within the protein core (Kirschner et al. 1980). Codon and Amino Acid Usage in the A. calcoaceticus trpGDC Gene Cluster The choice of specific codons

within a codon

family differs between A.

calcoaceticus trpGDC and the equivalent region of the E. coli trp operon (table 2).

As seen in many prokaryotes, there is an avoidance of the use of such codons as ATA(Ile), CTA(Leu), CGR(Arg), and AGR(Arg) (Grantham et al. 1981). These preferences correlate with insignificant amounts of the cognate tRNAs in E. coli (Ikemura 198 I), and it may be that the same is true in other prokaryotes. Other A. calcoaceticus codon usage trends that are similar to those of E. coli are preferred use of ATT (Ile), AAA (Lys), and GGY (Gly). Among the notable differences in A. calcoaceticus codon usage, however, are the following: (1) CTG (Leu) is not the preferred Leu codon; rather, TTA and TTG represent one-half of the Leu codons; (2) TCT and TCA (Ser) are preferred to TCG; (3) CCT (Pro) is preferred to CCG; and (4) in all of the two-codon families, the member ending in T or A is used from two to six times as often as is the alternative. It has been observed that within the trp genes of the Enterobacteriaceae, codon usage is influenced by the genomic G + C content of the organism and that the third positions of the codons are the most influenced (Nichols et al. 1980, 198 1). The same effect is seen in the A. calcoaceticus trpGDC gene cluster, albeit in the opposite direction: the G + C content of A. calcoaceticus is 4296, but in the third positions of the codons it is only 35%. As mentioned above, this is very noticeable in the two-codon families, but it is also obvious in some of the four-codon families (e.g., Ser and Pro). In addition to the influence exerted on the third position of the codon, the genomic G + C content may also affect the choice of amino acids. That is, within a family of amino acids, those that are encoded by A/T-rich codons are used more frequently in A. calcoaceticus. For example, in the (aliphatic) hydrophobic class of amino acids, Ile (ATH) is used more frequently and Ala (GCX) used less frequently than in E. coli. A similar statement can be made concerning the choice between the basic amino acids Lys (AAR) and Arg (CGX + AGR). The true significance of this observation awaits further documentation of the amino acid composition of orthologous proteins from additional organisms with varied genomic G + C content. Conclusion

The nucleotide sequence of the A. calcoaceticus trpGDC gene cluster emphasizes the fundamental similarity of the enzymes involved in tryptophan metabolism yet also illustrates the variety of genomic arrangements and regulatory differences coordinating their expression. The G-D-C gene order is common to gram-negative bacteria, but the amphibolic nature of trpG and the separateness of the trpGDC cluster are not. Transcription of the A. calcoaceticus trpGDC cluster is coordinated with the expression of trpE, but the trpFBA cluster is regulated differently (Cohn and Crawford 1976). In the absence of other results, we cannot identify potential

468

Kaplan,

Goncharoff,

Seibold, and Nichols

Table 2 Codon Usage in Acinet&zcter calcoaceticus trpGDC and in the Equivalent Portion of the Escherichia coli trp Operon A.

G

CODON

E.

TRP

COLI

TRP

D

C

Total

(G)D

C(-F)

Total

7

24 4 28 18

8 7 9 6

6 4 5 5

14 11 14 11

. . ..... ..... .....

8 1 8 3

8 6

9 2 12 9

(Leu) (Leu) (Leu) (Leu)

.... ..... . . .....

3 1 0 1

8 3 2 6

5 2 2 4

16 6 4 11

4 8 2 37

3 2 1 11

7 10 3 48

ATT ATC ATA ATG

(Ile) (Ile) (he) (Met)

...... ...... ...... .....

11 6 0 5

24 6 0 15

12 2 1 4

47 14 1 24

20 6 1 13

10 9 0 4

30 15 1 17

GTT GTC GTA GTG

(Val) (Val) (Val) (Val)

...... . .. ..... .....

5 3 1 5

3 6 5 9

6 5 5 6

14 14 11 20

7 8 3 15

3 5 4 7

10 13 7 22

TCT TCC TCA TCG

(Ser) (Ser) (Ser) (Ser)

...... ...... ...... ......

2 3 2 2

5 3 4 2

4 0 6 0

11 6 12 4

4 5 2 3

0 1 2 5

4 6 4 8

CCT CCC CCA CCG

(Pro) . . . . . . (Pro) . . . . (Pro) . . . . . (Pro) . . . . .

6 0 3 2

1 1 2 3

8 1 3 0

15 2 8 5

1 3 2 23

1 1 1 6

2 4 3 29

ACT ACC ACA ACG

(Thr) (Thr) (Thr) (Thr)

. ... . . . .. .....

2 3 2 2

4 5 1 4

5 2 1 0

11 10 4 6

1 15 5 4

2 3 0 3

3 18 5 7

GCT GCC GCA GCG

(Ala) . . . . . . (Ala) . . . . . . (Ala) . . . . . (Ala) . . . . .

2 1 5 0

8 7 15 7

7 5 9 1

17 13 29 8

6 19 10 31

4 13 5 9

10 32 15 40

TAT TAC TAA TAG

(Tyr) (Tyr) (End) (End)

...... ...... ..... .....

6 1 1 0

5 2 0 0

1 2 1 0

12 5 2 0

8 5 1 0

6 4 0 0

14 9 1 0

CAT CAC CAA CAG

(His) (His) (Gln) (Gln)

... ...... ..... .....

8 2 4 7

6 2 14 2

5 0 13 7

19 4 31 16

12 11 12 14

15 15 16 26

AAT AAC AAA AAG

(Am) (Asn) (Lys) (Lys)

. .. ..... ..... ..

7 2 4 3

11 7 17 3

7 2 11 6

25 11 32 12

8 15 11 3

12 20 20 6

TTT TTC TTA TTG

(Phe) (Phe) (Leu) (Leu)

CTT CTC CTA CTG

.

CALCOACETICUS

1

Acinelobacler lrpGDC Sequence

469

Table 2 (Continued) A. CALCOACETICUS G

CODON

GAT GAC GAA GAG

(Asp) (Asp) (Glu) (Glu)

..... .... ..... ..

TGT TGC TGA TGG

(Cys) (Cys) (End) (Trp)

..... ..... .. . ...

CGT CGC CGA CGG

(Arg) (Arg) (Arg) (Arg)

AGT AGC AGA AGG

(Ser) (Ser) (Arg) (Arg)

...... ...... ..... .....

GGT (Gly) GGC (Gly) GGA (Gly) GGG (Gly)

..... ..... .....

D

C

Total

(G)D

16

16 4 19 2

35 10 47 10

12 9 22 7

11 6 5 9

4

8 3

4

2 2 0

1 20 3

0

0

C(-F)

Total 23 15 27 16

4 8 4

..

..... .....

.....

.....

E. COLI TRP

TRP

17 9 4 3

7 16 1 0

3 10

7 10 1 0 16 10 2 4

10 0 0 0

31 15 6 6

15 18 5 8

10 26 2

10 15 1 0 4 3 3

19 21 6 11

transcriptional signals (either promoters or operators) by inspection of the 185bp sequence flanking trpG. Similarly, we do not see evidence of secondary structures or leader peptide-coding regions that might reflect an attenuation mechanism of transcriptional regulation; however, the coupling of translation by the propinquity of termination and initiation signals of adjacent genes may rival gene fusion as a method of coordinating expression. We have pointed out above that the termination and initiation codons of trpG and trpD are adjacent to one another, and we have also noticed that the termination codon of trpD overlaps a nine-base sequence that closely resembles the Shine and Dalgarno (Escherichia coli) sequence (Shine and Dalgamo 1974). Although translational coupling has not been directly demonstrated in such arrangements, it is possible that termination of translation within or very near to an initiating environment influences the efficiency of the subsequent initiation event. The evolution of an amphibolic trp/pab GAT subunit can be envisioned by several mechanisms, involving the duplication of one (CoI) or two (Co1 + CoII) genes (with one of the duplicated Co11 genes being deleted in the latter case). While the sequence data do not support one mechanism more than another, the similarity of the trp-linked amphibolic subunit to E. coli pabA rather than to E. coli trp(G) suggests that the pabA-like sequence may more closely resemble the ancestral GAT sequence. This is further supported by the fact that the trp-dedicated GAT sequences of lower eukaryotes also closely resemble the E. coli pabA sequence (Schectman and Yanofsky 1983; H. Zalkin, personal communication). Gene fusions are commonly encountered in evolutionary comparisons of trpgene arrangements. The fusion junctions that have thus far been characterized at

470

Kaplan, Goncharoff, Seibold, and Nichols

the nucleotide-sequence level by comparisons of fused and separate genes include enterobacterial trpG-trpD (Miozzari and Yanofsky 1979; Nichols et al. 1980), Neurospora crassa TRPl (equivalent to trpG-trpC-trpF) (Schechtman and Yanofsky 1983), and Sacharomyces cerevisiae TRPS (equivalent to trpA-trpB) (Zalkin and Yanofsky 198 1). In each case, the fusion regions are characterized by the presence of a “linking peptide” in the fused gene that did not previously exist in either of the separate genes (Yanofsky, personal communication). We report here the first complete nucleotide sequences of separate trpD and trpC genes. In conjunction with previous sequence data, we propose that an alternative fusion mechanism involving an in-phase deletion with end points within adjacent but separate genes may be responsible for joining trpC and trpF. The Bacillus subtilis trp cluster contains separate but adjacent trpC and trpF genes, and determination of the intercistronic region may provide further information regarding this hypothesis. Acknowledgments

We thank the members of BioS 466 (Procaryotic Molecular Biology Laboratory)-G. Alberts, P. Balandyk, D. Baro, J.-D. Chen, S. Hahn, B. Howard, J.-W. Lee, P. Lysakowski, J. Malakooti, M. Sussman, H. Weidmann, and C. Wilke-for determining the sequence of the Acinetobacter calcoaceticus DNA fment contained in pMB466, W. K. Merkel for assisting with the complementation studies, I. Crawford for providing the A. calcoaceticus strains and for helpful advice and discussions, and C. Yanofsky and H. Zalkin for communicating information prior to publication. This work was supported by U.S. Public Health Service grant AI 18639 and by the Department of Biological Sciences of the University of Illinois at Chicago. LITERATURE

CITED

AKSOY, S., C. L. SQUIRES,and C. SQUIRES. 1984. Translational coupling of the trpB and trpA genes in the EscherichiacoZitryptophan operon. J. Bacterial. 157:363-367. BIRNBOIM,H. C., and J. DOLY. 1979. A rapid alkaline extraction procedure for screening recombinant plasmid DNA. Nucleic Acids Res. 7: 15 13- 1523. BOWER,S., and H. ZALKIN. 1982. Modification of Serratia marcescensanthranilate synthase with pyridoxal 5’-phosphate. Arch. B&hem. Biophys. 219: 12 1- 127. BUVINGER,W. E., L. C. STONE,and H. E. HEATH. 198 1. Biochemical genetics of tryptophan synthesis in Pseudomonas acidovorans. J. Bacterial. 147~62-68. CHRISTIE,G. E., and T. PLATT.1980. Gene structure in the tryptophan operon of Escherichia coli: nucleotide sequence of trpC and the flanking intercistronic regions. J. Mol. Biol. 142:5 19-530. COHN, W., and I. P. CRAWFORD.1976. Regulation of enzyme synthesis in the tryptophan pathway of Acinetobacter calcoaceticus. J. Bacterial. 127:367-379. COSSART,P., M. KATINKA,and M. YANIV. 198 1. Nucleotide sequence of the thrB gene of E. coli, and its two adjacent regions; the thrAB and thrBC junctions. Nucleic Acids Res. 9:339-347. CRAWFORD,I. P. 1975. Gene arrangements in the evolution of the tryptophan synthetic pathway. Bacterial. Rev. 39:87- 120. CREIGHTON,T. E. 1970. N-( 5’-phosphoribosyl)anthranilate isomerase-indole-3-ylglycerol phosphate synthetase of tryptophan biosynthesis: relationship between the two activities of the enzyme from Escherichia coli. Biochem. J. 120:699-707. ENQUIST,L. W., M. J. MADDEN, P. SCHIOP-STANSLY, and G. F. VANDE WOUD. 1979. Cloning of herpes simplex type 1 DNA fragments in a bacteriophage lambda vector. Science 203:54 I-544.

Acinetobacter trpGDC Sequence

47 1

GRANTHAM, R., C. GAUTIER, M. Gow, M. JACOBZONE,and R. MERCIER. 1981. Codon catalog usage is a genome strategy modulated for gene expressivity. Nucleic Acids Res. 9:r43-r74. GREISHABER,M., and R. BAUERLE. 1972. Structure and evolution of a bifunctional enzyme of the tryptophan operon. Natur. New Biol. 236:232-235. HOROWITZ, H., G. E. CHRISTIE, and T. PLATT. 1982. Nucleotide sequence of the trpD gene, encoding anthranilate synthetase component II of Escherichia coli. J. Mol. Biol. 156:245256. HOROWITZ, H., and T. PLATT. 1982. Identification of trpPz, an internal promoter in the tryptophan operon of Escherichia coli. J. Mol. Biol. 156:257-267. HUANG, M., and J. PITTARD. 1967. Genetic analysis of mutant strains of Escherichia coli requiring paminobenzoic acid for growth. J. Bacterial. 93: 1938-1942. IKEMURA, T. 198 1. Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system. J. Mol. Biol. 151:389-409. KANE, J. F., W. M. HOLMES, and R. A. JENSEN. 1972. Metabolic interlock: the dual function of a folate pathway gene as an extra-chromosomal gene of tryptophan biosynthesis. J. Biol. Chem. 247: 1587-l 596. KAPLAN, J. B., and B. P. NICHOLS. 1983. Nucleotide sequence of Escherichia coli pabA and its evolutionary relationship to trp(G)D. J. Mol. Biol. 168:451-468. KAWAMURA, M., P. S. KEIM, Y. GOTO, H. ZALKIN, and R. L. HEINRIKSON. 1978. Anthranilate synthetase component II from Pseudomonas putida: covalent structure and identification of the cysteine residue involved in catalysis. J. Biol. Chem. 253:4659-4668. KIRSCHNER, K., H. SZADKOWSIU, A. HENSCHEN, and F. LOTTSPEICH. 1980. Limited proteolysis of N-(5’-phosphoribosyl)anthranilate isomerase: indoleglycerol phosphate synthetase from Escherichia coli yields two different enzymically active, functional domains. J. Mol. Biol. 143:395-409. LI, S. L., J. HANLON, and C. YANOFSKY. 1974. Structural homology of the glutamine amidotransferase subunits of the anthranilate synthetases of Escherichia coli, Salmonella typhimurium, and Serratia marcescens. Nature (London) 248:48-50. MANDEL, M., and A. HIGA. 1970. Calcium-dependent bacteriophage DNA infection. J. Mol. Biol. 53: 159- 162. MARMUR,J. 1966. A procedure for the isolation of deoxyribonucleic acid from microorganisms. J. Mol. Biol. 3:208-2 18. MATSUMURA, P., M. SILVERMAN,and M. SIMON. 1977. Synthesis of mot and the gene products of Escherichia coli programmed by hybrid ColEl plasmids in minicells. J. Bacterial. 132:996- 1002. MAXAM, A. M., and W. GILBERT. 1980. Sequencing end-labeled DNA with base-specific chemical cleavages. Methods Enzymol. 65:499-560. MESSING, J., R. CREA, and P. H. SEEBURG. 198 1. A system for shotgun DNA sequencing. Nucleic Acids Res. 9:309-32 1. MESSING,J., and J. VIEIRA. 1982. A new pair of M 13 vectors for selecting either DNA strand of double-digest restriction fragments. Gene 19:269-276. MILLER, J. H. 1972. Experiments in molecular genetics. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York. MIOZZARI, G., and C. YANOFSKY. 1979. Gene fusion during the evolution of the trp operon in Enterobacteriaceae. Nature 277~486-489. MORSE, D. E., and C. YANOFSKY. 1968. The internal low-efficiency promoter of the tryptophan operon of Escherichia coli. J. Mol. Biol. 38:447-45 1. NICHOLS, B. P., M. BLUMENBERG,and C. YANOFSKY. 198 1. Comparison of the nucleotide sequence of trpA and sequences immediately beyond the trp operon of Klebsiella aerogenes, Salmonella typhimurium, and Escherichia coli. Nucleic Acids Res. 9: 1743- 1755. NICHOLS, B. P., G. F. MIOZZARI, M. VAN CLEEMPUT, G. N. BENNETT, and C. YANOFSKY.

472

Kaplan, Goncharoff,

Seibold, and Nichols

1980. Nucleotide sequences of the trpG regions of Escherichia coli, Shigella dysenteriae, Salmonella typhimurium, and Serratia marcescens. J. Mol. Biol. 142:503-5 17. NICHOLS,B. P., and C. YANOFSKY. 1979. Nucleotide sequences of trpA of SaZmonella typhimurium and Escherichia coli: an evolutionary comparison. Proc. Nat. Acad. Sci. USA 76:5244-5248. OPPENHEIM,D. S., and C. YANOFSKY. 1980. Translational coupling during expression of the tryptophan operon of Escherichia coli. Genetics 95:785-795. PLATT, T., and C. YANOFSKY. 1975. An intercistronic region and ribosome-binding site in bacterial messenger RNA. Proc. Nat. Acad. Sci. USA 72:2399-2403. QUEEN, C. L., and L. J. KORN. 1980. Computer analysis of nucleic acids and proteins. Methods Enzymol. 65:595-609. QUEENER, S. W., S. F. QUEENER, J. R. MEEKS, and I. C. GUNSALUS. 1973. Anthranilate synthetase from Pseudomonas putida: purification and properties of a two-component enzyme. J. Biol. Chem. 248: 15 1- 16 1. SANGER,F., and A. R. COULSON. 1978. The use of thin acrylamide gels for DNA sequencing. FEBS Lett. 87: 107-l 10. SANGER,F., S. NICKELN,and A. R. COULSON. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Nat. Acad. Sci. USA 74:5463-5467. SAWULA,R. V., and I. P. CRAWFORD, 1972. Mapping of the tryptophan genes of Acinetobacter calcoaceticus by transformation. J. Bacterial. 112:797-805. . 1973.Anthranilate synthetase of Acinetobacter calcoaceticus: separation and partial characterization of subunits. J. Biol. Chem. 248:3573-358 1. SCHECTMAN, M. G., and C. YANOFSKY. 1983. Structure of the trifunctional TRP-I gene from Neurospora crassa and its aberrant expression in Escherichia coli. J. Mol. Appl. Genet. 2:83-99. SHENE,J., and L. DALGARNO. 1974. The 3’-terminal sequence of Escherichia co/i 16s ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. Proc. Nat. Acad. Sci. USA 71:1342-1346. TSCHUMPER, G., and J. CARBON. 1980. Sequence of a yeast DNA fragment containing a chromosomal replicator and the TRpl gene. Gene 10:157-166. Tso, J. Y., M. A. HERMODSON, and H. ZALKIN. 1980. Primary structure of Serratia marcescens anthranilate synthetase component II. J. Biol. Chem. 255: 145 1-1457. TWAROG, R., and G. L. LIGGINS. 1970 Enzymes of the tryptophan pathway in Acinetobacter calco-aceticus. J. Bacterial. 104:254-263. YANOFSKY, C., V. HORN, M. BONNER, and S. STASIOWSIU. 197 1. Polarity and enzyme functions in mutants of the first three genes of the tryptophan operon of Escherichia coli. Genetics 69:409-433. YANOFSKY, C., T. PLATT, I. P. CRAWFORD, B. NICHOLS, G. CHRISTIE, H. HOROWITZ, M. VAN CLEEMPUT,and A. WV. 198 1. The complete nucleotide sequence of the tryptophan operon of Escherichia coli. Nucleic Acids Res. 9:6647-6668. ZALKIN, H., and C. YANOFSKY. 198 1. Yeast trp.5: structure, function and regulation. J. Biol. Chem. 257:1491-1500. WALTER M. FITCH, reviewing editor Received

January

16, 1984; revision received April 16, 1984.