JOURNAL OF BACTERIOLOGY, Mar. 2007, p. 2392–2400 0021-9193/07/$08.00+0 doi:10.1128/JB.01695-06 Copyright © 2007, American Society for Microbiology. All Rights Reserved.
Vol. 189, No. 6
Distinct Functions of the Two Specificity Determinants in Replication Initiation of Plasmids ColE2-P9 and ColE3-CA38
Kazuteru Aoki,1 Miki Shinohara,2† and Tateo Itoh1,2*
Department of Biology, Faculty of Science, Shinshu University, Matsumoto, Nagano 390-8621,1 and Department of Biology, Faculty of Science, Osaka University, Toyonaka, Osaka 560-0043,2 Japan
Received 2 November 2006/Accepted 5 January 2007
The plasmid ColE2-P9 Rep protein specifically binds to the cognate replication origin to initiate DNA replication. The replicons of the plasmids ColE2-P9 and ColE3-CA38 are closely related, although the actions of the Rep proteins on the origins are specific to the plasmids. The previous chimera analysis identified two regions, regions A and B, in the Rep proteins and two sites, α and β, in the origins as specificity determinants and showed that when each component of the region A-site α pair and the region B-site β pair is derived from the same plasmid, plasmid DNA replication is efficient. It also indicated that the replication specificity is mainly determined by region A and site α. By using an electrophoretic mobility shift assay, we demonstrated that region B and site β play a critical role in stable Rep protein-origin binding and, furthermore, that 284-Thr in this region of the ColE2 Rep protein and the corresponding 293-Trp of the ColE3 Rep protein mainly determine the Rep-origin binding specificity. On the other hand, region A and site α were involved in the efficient unwinding of several nucleotide residues around site α, although they were not involved in the stable binding of the Rep protein to the origin. Finally, we discuss how the action of the Rep protein on the origin involving these specificity determinants leads to the plasmid-specific replication initiation.
In all organisms, DNA replication is a key event for inheritance of genetic information. Initiation of DNA replication requires interaction of the initiator protein with the specific DNA region called the replication origin and the consequent localized melting of duplex DNA, which provides a single-stranded template for establishment of replication machinery. In initiation of chromosomal DNA replication in Escherichia coli, several molecules of the initiator protein (DnaA) tightly bind to the 9-mer repeated sequences called the DnaA boxes in the chromosomal replication origin (oriC), and then the DnaB helicase is loaded onto the unwound 13-mer AT-rich region located to one side of oriC, followed by establishment of the replication machinery containing DnaG and DNA polymerase III holoenzyme (17).
The plasmid ColE2-P9 (ColE2) is a circular duplex DNA molecule of about 7 kb (10) and is kept at 10 to 15 copies per chromosome (2, 12). Initiation of the plasmid replication requires host DNA polymerase I (16, 26) and a plasmid-encoded replication initiator (Rep) protein that uniquely possesses an origin-specific primase activity among bacterial plasmids (15, 35), and replication proceeds in a unidirectional manner (13, 28). The Rep protein specifically binds to the replication origin (15, 33) and synthesizes a short RNA molecule of 5′-ppApGpA-3′ at a specific position in the origin as a primer for initiation of DNA synthesis by DNA polymerase I (28, 29).
The 32-bp minimal ColE2 origin may be divided into three functional subregions, as proposed by in vivo analyses using an exhaustive mutant set of single-base-pair substitutions (33). One of the subregions is important for stable binding of the Rep protein, the second one is important for binding of the Rep protein and for initiation of DNA replication, and the last one is important for initiation of DNA replication but not for stable binding of the Rep protein. The analyses using the mutant ColE2 origins also suggested that the subregions important for stable binding of the Rep protein contain three elements, sites a, b, and c (33) (Fig. 1). The SELEX (systematic evolution of ligands by exponential enrichment) experiment and electrophoretic mobility shift assay (EMSA) using the mutant ColE2 origins showed that those sites, especially sites a and b, are required for high-affinity binding of the ColE2 Rep protein (32). The analyses using the truncated ColE2 Rep proteins suggested that sites a, b, and c of the origin are recognized by regions I, II, and III of the Rep protein, respectively (9) (Fig. 1).
The replicon regions of 11 plasmids of the ColE2-related plasmids show structural similarity, and it seems that these plasmids share a common mechanism for initiation of plasmid replication (11). Among them, there are two incompatibility determinants, IncA and IncB; the former is due to the common regulatory mechanism for expression of the Rep protein by antisense RNA (RNA I), and the latter is due to titration of the Rep protein by binding to the origin (11, 27). An incompatibility test using cloned origins indicated that there are four IncB specificity groups among those plasmids (11). A recent search for related plasmids in GenBank revealed that plasmids specifying the Rep proteins presumably with a primase activity like that of the ColE2 plasmid are commonly found in both gram-negative and -positive bacteria and form a fairly large plasmid family (33).
The Rep protein and origin of the plasmid ColE3-CA38 (ColE3) are highly homologous to those of the ColE2 plasmid (11, 35), while these plasmids
* Corresponding author. Mailing address: Department of Biology, Faculty of Science, Shinshu University, Matsumoto, Nagano 390-8621, Japan. Phone: 81-263-37-2489. Fax: 81-263-37-2560. E-mail: tateito@gipac.shinshu-u.ac.jp.
† Present address: Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka 565-0871, Japan.
Published ahead of print on 19 January 2007.
VOL. 189, 2007
SPECIFICITY IN REPLICATION INITIATION IN TWO PLASMIDS
FIG. 1. Comparison of the Rep proteins and origins of the ColE2 and ColE3 plasmids. (a) Schematic structures of the Rep proteins are shown by rectangles, and a circle indicates an additional sequence in the ColE3 Rep protein. Amino acid sequences of regions A and B of the ColE2 and ColE3 Rep proteins are indicated. The amino acid residues are numbered from the first methionine. Regions I, II, and III (brackets) are the origin binding elements proposed previously (9). (b) Nucleotide sequences of the minimal replication origins are shown. Sites α and β are indicated by brackets, and dots above the sequence indicate the position of the primer RNA synthesis. Sites a, b, and c (thick lines) are the Rep protein binding elements proposed previously (33). In both panels, vertical bars show the positions where the residues are different between plasmids ColE2 and ColE3, and hyphens in the sequences are gaps introduced for maximum homology.
belong to different IncB specificity groups, indicating that the interactions of the Rep proteins with the origins are specific to the plasmids (11, 24, 27, 35). By using various chimeric Rep proteins and origins, the two regions, regions A and B, in the C-terminal portions of the Rep proteins were identified as the specificity determinants for replication initiation, and it was also revealed that the two sites, sites α and β, in the origins determine the specificity corresponding to regions A and B, respectively (24). The ColE3 Rep protein and origin have an additional sequence of nine continuous amino acids in region A and 1 base pair in site α, respectively (11, 24) (Fig. 1). The transformation test using the chimeric Rep proteins and origins indicated that the specificity of replication initiation is mainly determined by the specific combinations of regions A and sites α, namely, by the absence or presence of the additional amino acid sequence and one base pair. It was also suggested that the specific combinations of regions B and sites β increase the efficiency of the specific replication initiation. Furthermore, a filter binding assay suggested that the specificity is mainly determined at the step of interaction between the Rep protein and origin, but the detailed mechanisms by which those regions and sites determine the specificity have remained unclear.
Here we show that the two pairs of the regions in the Rep proteins and the sites in the origins, A-α and B-β, play distinct roles in determination of the specificity. The former is involved in the efficient origin unwinding, and the latter is involved in the plasmid-specific Rep protein-origin binding. We propose a possible additional mechanism for determination of the specificity other than the Rep protein-origin binding.
MATERIALS AND METHODS
Bacterial strains and plasmids. 
The Escherichia coli strains and plasmids used have been described elsewhere (4, 7, 11, 24, 30, 33, 34) except for those constructed in this study. Mutant ColE2 rep genes were obtained by PCR with mutagenic oligodeoxyribonucleotides (21), and mutant ColE3 rep genes were obtained by using uridine-containing single-stranded DNA and mutagenic oligodeoxyribonucleotides (18).
Other materials. Media used have been described elsewhere (12, 23). Chemicals, antibiotics, oligodeoxyribonucleotides, and most enzymes were from commercial sources.
In vivo analysis of the replication initiation specificity of the mutant Rep proteins. In vivo replication initiation specificity was analyzed by transformation test as described previously (24) except that E. coli AG1 (11) was used. Competence of bacterial cells harboring each of the Rep donor plasmids was normalized to that of pACYC184 (4). Two kinds of plasmids containing a 40-bp segment with the minimal origin (24) were used: the complete-origin plasmid, which carries the enhancer regions of the ColE2 plasmid that raise the transformation efficiency, and the mini-origin plasmid, which lacks them, so that the former is supported more efficiently than the latter by somewhat unfavorable Rep proteins. The relative transformation efficiency was calculated by setting the transformation efficiency for the combination of the wild-type Rep protein and origin at 1.0. Experiments were independently repeated at least three times.
Measurement of the activities of the mutant ColE3 Rep proteins. E. coli AG1 carrying pTIK343+, producing the ColE3 Rep protein, and pEC32s (24), carrying the ColE3 origin, was transformed with each of the pBE343 derivatives producing the wild-type or mutant ColE3 Rep protein. Owing to incompatibility, plasmid pTIK343+, which carries the wild-type ColE1 replicon, is excluded from cells harboring pBE343 (24) or its derivatives, which carry a mutant ColE1 replicon with a high plasmid copy number. Plasmids pTIK343+ and pEC32s and derivatives of pBE343 carry resistance genes to kanamycin (Km), chloramphenicol (Cm), and ampicillin (Ap), respectively. 
The transformed cells were cultivated on an LB plate containing 50 μg/ml Ap for 20 h at 37°C. Twenty Ap-resistant colonies were picked and transferred onto an LB plate containing either 50 μg/ml Ap and 20 μg/ml Cm, 20 μg/ml (each) Cm and Km, or 50 μg/ml Ap alone at 37°C. Recultivation of Ap-resistant colonies was repeated up to 200 generations. Activities of the mutant ColE3 Rep proteins were judged from the ratio of the numbers of Cm-resistant colonies among those that are resistant to Ap and sensitive to Km.
Preparation of crude cell extracts. Cell extracts of E. coli NT525 (7) were prepared essentially as described previously (13, 22) except that bacteria were grown in terrific broth (23). E. coli BL21(DE3) (Novagen) cells newly transformed with pET21a-E2RepΔTag, pET21a-E3RepΔTag, pET21a-2/3-63RepΔTag, or pET21a-3/2-24ΔTag, which provide the wild-type and chimeric Rep proteins without any tag sequences, were grown to 1 × 10^8 cells/ml at 37°C in 50 ml terrific broth with 200 μg/ml Ap and then to 1 × 10^9 cells/ml at 25°C with 0.4 mM isopropyl-thio-β-galactoside. The cells were harvested, resuspended in 2 ml of buffer A (1 mM Na2HPO4, 1 mM NaH2PO4, 1 M NaCl, 0.5% [vol/vol] Tween 20, pH 8.0) containing 1 mM phenylmethylsulfonyl fluoride, and kept frozen. Frozen cells were thawed and treated with 2 mg/ml egg white lysozyme for 30 min at 4°C followed by sonication. The samples were centrifuged at 110,000 × g for 20 min at 4°C, and supernatants containing soluble Rep proteins were stored at −80°C until use.
Expression and purification of the Rep proteins. Expression of the tagged Rep proteins was performed as described above, except that the plasmids pET21a-E2Rep (33), pET21a-E3Rep (K. Matsumoto and T. Itoh, unpublished data), pET21a-2/3-63Rep, and pET21a-3/2-24Rep, producing the tagged wild-type and chimeric Rep proteins, were used. 
Purification of the tagged Rep proteins was performed as described previously (33) except that the proteins bound on Ni2+-nitrilotriacetic acid resin were eluted with buffer A containing 500 mM imidazole.
EMSA. EMSA was performed essentially as described previously (33) except that the origin-containing 32P-labeled DNA segments were prepared by treatment of the 0.14-kb BamHI fragments of plasmids pKA2/2a, pKA3/3a, pKA2/3a, and pKA3/2a, which carry the minimal ColE2, ColE3, ColE2/ColE3 (2/3), and ColE3/ColE2 (3/2) origins, respectively, with [α-32P]dCTP and Klenow fragment of DNA polymerase I as described previously (9).
Analysis of the early replicative intermediate in in vitro DNA synthesis. DNA synthesis using crude cell extracts was performed essentially as described previously (13, 28) except that 1 μl of E. coli BL21(DE3) cell extracts containing approximately 150 nM of Rep proteins was added into the 25 μl of standard reaction mixtures with 50 μM ddTTP.
KMnO4 footprint. KMnO4 footprint was carried out essentially by a method that will be described elsewhere (M. Han and T. Itoh, unpublished data) except that the supercoiled DNA molecules of pKA2/2a and pKA3/3a and E. coli BL21(DE3) cell extracts containing each of the wild-type and chimeric Rep proteins were used. Sequencing ladders were prepared by using Thermo Sequenase dye primer cycle sequencing kit with 7-deaza-dGTP (Amersham Biosciences) with the same template DNA and primer for the primer extension. Samples were loaded onto 7 M urea-4% polyacrylamide gel and then analyzed by using a SHIMADZU sequencer DSQ-1000.
FIG. 2. Binding specificities of the wild-type and chimeric Rep proteins and origins. (a) Schematic structures of the Rep proteins and the origins used are shown. The region derived from plasmid ColE2 or ColE3 is shown by an open or filled bar, respectively. The types of the Rep proteins and origins are designated by the names of the plasmids (plasmids ColE2 and ColE3) from which regions containing the specificity determinants, A/B (in Rep proteins) or α/β (in origins), are derived, so that 2/3 Rep had region A derived from ColE2 and region B from ColE3. Striped rectangles at the N and C termini of the Rep proteins represent T7- and 6×His-tagged sequences, respectively. (b to e) EMSA was performed with 5 nM of 32P-labeled origin-containing DNA fragments. In each panel, lane 1 contains no Rep proteins, and the concentrations of tagged Rep proteins added were 5, 25, 50, 100, and 200 nM. The positions of shifted bands are marked by asterisks. The values at the bottom of each panel are relative transformation efficiencies of the complete-origin plasmids for each of the Rep protein-origin combinations (24).
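The colony-based activity measurement described in Materials and Methods can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the `Colony` record and both function names are hypothetical, and the mapping to +, +/-, and - follows the thresholds given in the Fig. 5 legend (100%, 10 to 50%, or none of the transformants retaining the ColE3-origin plasmid). Fractions that fall outside those stated ranges are reported as unclassified rather than guessed.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Colony:
    """Antibiotic phenotype of one recultivated colony (hypothetical record)."""
    ap_resistant: bool  # carries the pBE343-derived Rep donor plasmid
    cm_resistant: bool  # still carries pEC32s (ColE3 origin plasmid)
    km_resistant: bool  # still carries pTIK343+ (wild-type Rep source)


def retention_fraction(colonies: Iterable[Colony]) -> float:
    """Fraction of Cm-resistant colonies among Ap-resistant, Km-sensitive ones."""
    eligible = [c for c in colonies if c.ap_resistant and not c.km_resistant]
    if not eligible:
        raise ValueError("no Ap-resistant, Km-sensitive colonies to score")
    return sum(c.cm_resistant for c in eligible) / len(eligible)


def activity_symbol(fraction: float) -> str:
    """Map a retention fraction to the symbols used in the Fig. 5 legend."""
    if fraction == 1.0:
        return "+"
    if fraction == 0.0:
        return "-"
    if 0.10 <= fraction <= 0.50:
        return "+/-"
    # The legend does not define the remaining ranges, so nothing is assumed.
    return "unclassified"
```

For example, a plate on which 5 of 20 eligible colonies remain Cm resistant gives a retention fraction of 0.25 and would be scored as +/-.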
RESULTS
The region B-site β pair is important for plasmid-specific stable binding of the Rep proteins and origins. To examine the roles of the two specificity determinants in Rep protein-origin binding, we prepared the wild-type and chimeric tagged Rep proteins from plasmids ColE2 and ColE3. When the transformation efficiency of complete-origin plasmids was measured as described in Materials and Methods, no effects on the replication specificity due to fusion of the tag sequences to the Rep proteins were observed, although the activities of the tagged ColE2 (designated 2/2) and 3/2 Rep proteins to support the replication of complete-origin plasmids decreased slightly compared with those without the tag sequences (data not shown). We then performed EMSA with all the possible combinations of the wild-type and chimeric Rep proteins and origins (Fig. 2). For the homologous Rep protein-origin combinations including the chimeric Rep proteins and origins, in which each component of the region A-site α and region B-site β pairs is derived from the same plasmid, complex formation was very
efficient (Fig. 2b to e, lanes 2 to 6). On the other hand, for the heterologous Rep protein-origin combinations, in which neither component of the region A-site α nor region B-site β pair was derived from the same plasmids, complex formation was not observed, even with excess amounts of the Rep proteins (Fig. 2b to e, lanes 17 to 21). These results confirmed that the Rep protein-origin binding is plasmid specific as shown previously (15, 24). For the Rep protein-origin combinations with a heterologous region A-site α pair and a homologous region B-site β pair, complex formation was very efficient (Fig. 2b to e, lanes 12 to 16), although replication initiation was very inefficient or not detectable (24). On the other hand, for the Rep protein-origin combinations with a homologous region A-site α pair and a heterologous region B-site β pair, in which replication is efficient (24), the Rep protein-origin binding was almost undetectable, even with excess amounts of the Rep proteins (Fig. 2b to e, lanes 7 to 11). When the phosphorimaging plates were exposed for longer periods, slight but significant complex formation was detectable (data not shown). These results indicated that the region B-site β pair plays an important role in the plasmid-specific stable binding of the Rep proteins to the origins and that the region A-site α pair, which is important for plasmid-specific replication, is not apparently essential for the plasmid-specific binding of the Rep proteins to the origins.
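The naming convention for chimeras and the binding pattern reported above can be summarized in a short Python sketch. It assumes the "A/B" and "α/β" type labels of Fig. 2 (for example, a 3/2 Rep protein has region A from ColE3 and region B from ColE2); the `PairStatus` type and the `classify` function are hypothetical names introduced for illustration. The `stable_binding_expected` field merely restates the EMSA outcome reported in this section (stable complexes whenever the region B-site β pair is homologous) and is not a predictive model.

```python
from typing import NamedTuple


class PairStatus(NamedTuple):
    a_alpha_homologous: bool       # region A and site alpha from the same plasmid
    b_beta_homologous: bool        # region B and site beta from the same plasmid
    stable_binding_expected: bool  # EMSA pattern reported in the text


def classify(rep_type: str, origin_type: str) -> PairStatus:
    """Classify a Rep/origin combination given labels such as '2/3' and '3/3'."""
    rep_a, rep_b = rep_type.split("/")
    site_alpha, site_beta = origin_type.split("/")
    if not {rep_a, rep_b, site_alpha, site_beta} <= {"2", "3"}:
        raise ValueError("labels must use '2' (ColE2) or '3' (ColE3)")
    a_alpha = rep_a == site_alpha
    b_beta = rep_b == site_beta
    # In the EMSA results, stable binding tracked the B-beta pair only.
    return PairStatus(a_alpha, b_beta, b_beta)
```

Calling `classify("3/2", "2/2")` reports a heterologous region A-site α pair with a homologous region B-site β pair, which corresponds to the combinations that bound efficiently but replicated poorly or not at all.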
FIG. 3. In vivo replication specificity of the mutant Rep proteins with amino acid substitutions in region B. Transformation efficiency for each combination of the Rep proteins and the origins as indicated was measured as described in Materials and Methods. The values are averages of at least three independent experiments. Bold letters represent residues in region B of the ColE3 Rep protein different from those of the ColE2 Rep protein. The residues in the mutant Rep proteins derived from the ColE3 Rep protein were also shown in bold letters. The threonine residue at position 284 in region B of the ColE2 Rep protein is numbered. Donor plasmids used were pBX243 (2/2 Rep protein), pTI2343-63 (2/3 Rep protein), and derivatives of pBX243 (mutant 2/2 Rep proteins). The origin-containing plasmids used were pEC22s (complete 2/2), pEC22s 2/3 (complete 2/3), pEC32s (complete 3/3), pEC216 (mini 2/2), and pEC2/3 (mini 2/3) (24).
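The relative transformation efficiencies reported in Fig. 3 are normalized so that the wild-type Rep protein-origin combination equals 1.0 and are averaged over at least three independent experiments. A minimal Python sketch of that calculation follows. The function name and its arguments are hypothetical, and the choice to normalize within each experiment before averaging is an assumption, since the text does not state the order of the two steps.

```python
from statistics import mean
from typing import Sequence


def relative_transformation_efficiency(
    test: Sequence[float],
    wild_type: Sequence[float],
    min_replicates: int = 3,
) -> float:
    """Average of per-experiment ratios, with the wild-type pair set to 1.0.

    test[i] and wild_type[i] are raw transformation efficiencies measured
    in the same independent experiment (units cancel in the ratio).
    """
    if len(test) != len(wild_type):
        raise ValueError("each experiment needs a test and a wild-type value")
    if len(test) < min_replicates:
        raise ValueError(f"at least {min_replicates} experiments are required")
    if any(w <= 0 for w in wild_type):
        raise ValueError("wild-type efficiencies must be positive")
    # Assumption: normalize within each experiment, then average the ratios.
    return mean([t / w for t, w in zip(test, wild_type)])
```

With raw values of 50, 100, and 150 for a mutant and 100, 200, and 300 for the wild type in three experiments, every ratio is 0.5, so the reported relative efficiency would be 0.5.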
One residue in region B is responsible for the specificity of recognition. To identify the key residue(s) in region B of the ColE2 Rep protein for determination of the specificity, each of the amino acid residues in region B was changed to the corresponding residue of the ColE3 Rep protein. As the last six residues at the C terminus of the ColE2 Rep protein are dispensable (24, 35), we focused on the amino acid residues from 284-Thr to 291-Cys in the ColE2 Rep protein and the corresponding residues of the ColE3 Rep protein as candidates involved in determination of the specificity. We constructed mutant ColE2 Rep proteins with single and multiple amino acid substitutions by site-directed mutagenesis (21) and examined their replication activity and specificity by measuring the transformation efficiencies of the origin plasmids, which are able to replicate only in the presence of the active Rep proteins provided by a donor plasmid (Fig. 3). In addition to the complete-origin plasmids, the mini origin plasmids were also used for more quantitative evaluation (see Materials and Methods). Among the single amino acid substitutions, the 2/2W Rep protein with the substitution of 284-Thr to Trp switched the replication specificity to the 2/3-type, although its activity was much lower. The two mutants (2/2Q and 2/2L) in which the basic residues were changed to neutral ones exhibited decreased activity. In contrast, the 2/2K Rep protein, in which a neutral residue (Leu) was changed to a basic one (Lys), exhibited increased activity. Moreover, the specificity and activity of the mutant ColE2 Rep proteins in replication were roughly direct manifestations of their properties in Rep protein-origin
FIG. 4. Origin binding specificities of the mutant ColE2 Rep proteins with amino acid substitutions in region B. EMSA was performed with 5 nM of the 32P-labeled DNA fragments containing the 2/2 (a) and 2/3 (b) origins. In each panel, the lane marked by (−) has no Rep proteins. Concentrations of the tagged Rep proteins added were 25, 100, and 200 nM. The positions of shifted bands are marked by asterisks.
binding as shown by EMSA (Fig. 4). These results suggested that 284-Thr of the ColE2 Rep protein and the corresponding 293-Trp of the ColE3 Rep protein in regions B of the ColE2 and ColE3 Rep proteins, respectively, are probably the major factors involved in determination of the origin-binding specificity.
Importance of components of an additional sequence in region A of the ColE3 Rep protein for replication activity. We demonstrated previously that the additional 9-amino-acid sequence in region A of the ColE3 Rep protein has a critical role for replication specificity and that the mutant Rep proteins with amino acid substitutions outside the additional sequence affected replication activity, but not specificity (24). To investigate the importance of amino acid residues in and around region A, we introduced an amino acid substitution(s) into every position in the 12-amino-acid region from 246-Ala to 257-Ile, including three residues around the additional 9-amino-acid sequence, by site-directed mutagenesis (18) using a mixture of oligodeoxyribonucleotides that contain any one of the four deoxyribonucleotides at all of the first residues of the triplets coding for this region. We obtained 15 different isolates from 40 candidates and examined their in vivo replication activity (Fig. 5, F6 to ml2). Ten of the mutants (F6 to TY42), including the two mutants (Cl5 and TY42) carrying substitutions at as many as eight positions, fully retained their replication activity. It is worth mentioning that all the active mutants retained tyrosine and histidine at positions 248 and 253, respectively, and that all the defective mutants, except for Cs4, carried substitutions at either one or both of the 248-Tyr and 253-His residues, suggesting the importance of these two residues. The mutants with a single amino acid substitution at either of these two positions (Y248A, H253A, or H253Y) showed decreased or deficient activity, except for Y248H, which was fully active. 
Furthermore, these active mutants retained the ColE3-type replication specificity (data not shown). These results suggested that these two residues at positions 248 and 253 of the ColE3 Rep protein, especially 253-His, are important for normal replication activity, rather than determination of the specificity. On the other hand, ms10 and Cs4 exceptionally
FIG. 5. In vivo replication activities of the mutant ColE3 Rep proteins with amino acid substitutions in region A. Replication activities were measured as described in Materials and Methods, and the symbols +, +/−, or − indicate that 100%, 10 to 50%, or none of the transformants tested maintained the plasmid with the complete ColE3 origin, respectively. Amino acid sequences of only the region with amino acid substitution(s) are shown. Two positions with highly conserved residues in the active mutant Rep proteins are boxed, and the positions of amino acid residues are numbered from the first methionine residue. WT, wild type.
FIG. 6. Initiation sites of the primer RNA and DNA syntheses in various Rep-origin combinations. In vitro DNA synthesis using NT525 cell extracts was performed for various Rep protein-origin combinations indicated in the figure. The Eco47III-digested nascent fragments were treated with alkali (+) or without alkali (−). The length of each band in the 2/2 Rep protein-origin combination is indicated to the left of the gel.
retained and lost the replication activity, respectively. These results might have been caused by a substitution(s) at other positions.
Rep protein has strict selectivity for the initiation site of the primer RNA and DNA syntheses. For the Rep protein-origin combinations in which only the region A-site α pairs were heterologous, two types of transformation efficiency of the complete-origin plasmids were observed (24). The transformation efficiency was below the detectable level (<0.001) when region A and site α were derived from the ColE3 Rep protein and the ColE2 origin, respectively, and the efficiency was moderate (0.3) when region A and site α were from the ColE2 Rep protein and the ColE3 origin, respectively. Analyses of the mechanism which causes such a difference could give us a clue to understanding the functions of region A and site α in determining the specificity. For that purpose, we first examined the primer and initiation site of in vitro plasmid DNA synthesis in various combinations of the wild-type and chimeric Rep proteins and origins using E. coli cell extracts (Fig. 6). In plasmid ColE2, a 105-nucleotide (nt) fragment corresponding to the nascent leading strand was produced by digestion of the early replicative intermediate with Eco47III, and a 102-nt fragment was obtained by removing primer RNA from the 105-nt fragment by alkaline hydrolysis (Fig. 6, lanes 1 and 2) as described previously (28). We obtained similar results with plasmid ColE3, but the leading strand components of the nascent fragments with or without alkaline treatment migrated a little faster than those of the plasmid ColE2 (Fig. 6, lanes 5 and 6). It was probably caused
by the difference in the nucleotide sequences downstream of the origins between the two origin plasmids, because we obtained the results consistent with the primer sequence of ppApGpA by RNase T1 and RNase A digestion and limited alkaline hydrolysis of the nascent leading strand fragment of the plasmid ColE3 (data not shown) and the distances between the primer start site and the Eco47III site are identical for the ColE2 and ColE3 origin plasmids used. Newly synthesized 102- and 105-nt leading strands, with or without alkaline treatment, respectively, were also observed when the homologous pairs of the chimeric Rep proteins and origins were used (Fig. 6, lanes 3, 4, 7, and 8). This suggested that the chimeric constructs do not affect the initiation sites of the primer RNA and DNA syntheses. In addition, the identical initiation sites of the primer RNA and DNA syntheses were observed, even when the combination of region A and site α was heterologous, although the efficiency of initiation of DNA synthesis was very low (Fig. 6, lanes 9, 10, 11, and 12). These results suggested that the Rep proteins have strict selectivity for the initiation site of primer RNA synthesis and also suggested the possibility that the combinations of regions A and sites α directly or indirectly affect the efficiency of the primer synthesis at the proper position or that of the primer utilization by DNA polymerase I.
The region A-site α pair affects unwinding of the region of the origin containing the site of the primer RNA synthesis. Recently, it was shown that several nucleotide residues in the origin are unwound by binding of the Rep protein to the origin in plasmid ColE2 (Han and Itoh, unpublished). The unwinding is probably necessary for the Rep protein to synthesize the primer RNA. 
FIG. 7. KMnO4 footprints with various combinations of the wild-type and chimeric Rep proteins and origins. Supercoiled DNA molecules of the plasmids carrying the 2/2 or 3/3 origin were treated with KMnO4 in the presence of various Rep proteins. The modified DNA molecules were used for primer extension with 5′-FITC-labeled primers and analyzed by 7 M urea-4% polyacrylamide gel electrophoresis. The bands for the top (a) and bottom (b) strands were visualized. Lanes 1 and 6 contained no Rep proteins. The concentrations of the Rep proteins were 37.5 and 75 nM. The positions of the origin sequences are indicated by rectangles to the left of the gels. Quantification of the results of the experiments shown in panels a and b was performed by using Scion Image software, and the scanned profiles and sequences of the 2/2 (c) and 3/3 (d) origins are presented. The profiles of lanes 3 or 5 and lanes 8 or 10 (with Rep protein) (black lines) were superimposed over those of lanes 1 and 6 (without the Rep protein) (gray lines), respectively, and normalized to bands in the regions outside the origins. Arrowheads indicate the nucleotides whose sensitivity to KMnO4 oxidation was reproducibly stimulated by the presence of the Rep protein in two independent experiments, and the open or filled arrowheads indicate the homologous or heterologous pair, respectively, of regions A and sites α. Sites α and β and the position of primer synthesis are indicated by brackets and dots, respectively.
To investigate whether the combinations of region A and site α affect unwinding of the origin, we performed KMnO4 footprinting (14) using various homologous and heterologous Rep protein-origin combinations (Fig. 7). For the 2/2 and 3/3 origins, four or five nucleotide residues around sites α in the top strands were sensitive to KMnO4 oxidation, depending on the presence of the homologous Rep proteins (Fig. 7a, lanes 2, 3, 7, and 8), and the thymine residues next to sites α in the bottom strands also showed weak sensitivity to oxidation in the presence of the homologous Rep proteins (Fig. 7b, lanes 2, 3, 7, and 8). These results showed that site α and the neighboring few residues of the ColE3 origin are also unwound by binding of the ColE3 Rep protein and also suggested the possibility that a portion of the Rep protein, perhaps the primase domain, covers a region around site α of the bottom strand to synthesize the primer RNA. For the combination of the 3/2 Rep protein and 2/2 origin, in which in vivo transformation efficiency was below the detectable level but in vitro binding was efficient by EMSA (Fig. 2), neither strand of the origin region showed stimulation of sensitivity to oxidation depending on the presence of the Rep protein (Fig. 7a, lanes 4 and 5, and Fig. 7b, lanes 4 and 5). Similar results were obtained, even with excess amounts of the 3/2 Rep protein (200 nM) (data not shown). On the other hand, for the combination of the 2/3 Rep protein and 3/3 origin, in which in vivo transformation was detectable but lower and in vitro binding was efficient (Fig. 2),
KMnO4 oxidation of a few residues around site α in the top strand was stimulated, depending on the presence of the 2/3 Rep protein (Fig. 7a, lanes 9 and 10), although such stimulation was not detected for the bottom strand (Fig. 7b, lanes 9 and 10). In the presence of an excess amount of the 2/3 Rep protein (200 nM), the sensitivity to KMnO4 oxidation of the nucleotide residues around site α was stimulated to a higher degree, and the stimulated regions in both strands were expanded to the same regions as those stimulated in the presence of the 3/3 Rep protein (data not shown). These results indicated that the efficiency of origin unwinding approximately correlates with the transformation efficiency in the Rep protein-origin combinations with the homologous or heterologous region A-site α pair and also suggested that the Rep protein-origin complexes capable of efficiently unwinding the origin are formed when the combinations of region A and site α are proper (see Discussion).
DISCUSSION
The C-terminal region containing region B of the Rep protein involved in specific binding to the origin shows a certain sequence similarity to the DNA binding motifs of NtrC and DctD, which are prokaryotic transcription factors (11). The solution structure of the DNA binding domain of NtrC from Salmonella enterica serovar Typhimurium has been resolved and classified into the classical helix-turn-helix DNA binding motif (20). The secondary structural prediction (8) of the ColE2 Rep protein suggested that the region containing region B probably forms a helix-turn-helix structure (data not shown). The region from 284-Ala to 293-Pro of the ColE2 Rep protein (Fig. 8a) contained completely in region B corresponds to the DNA recognition helix of NtrC in the sequence alignment. Together with this, our EMSA results imply that region B is a portion of a DNA recognition helix and specifically recognizes the origin with homologous site β. 
Furthermore, only the 284-Thr and 293-Trp residues of the ColE2 and ColE3 Rep proteins, respectively, affected the binding specificity. We chose 10 ColE2-related plasmids with the highest similarity to the ColE2-P9 plasmid from a large number of ColE2-related plasmids (33) and aligned the amino acid and nucleotide sequences of the C-terminal portions of the (putative) Rep proteins and origins, respectively (Fig. 8). In each of the (putative) Rep proteins (Fig. 8a), a serine, threonine, or tryptophan residue is located at the position that corresponds to 284-Thr (ColE2) or 293-Trp (ColE3). In all the plasmids which specify the Rep proteins with a serine or threonine residue at this position, the origin sequences could be arbitrarily aligned, so that the residues at the sites of their origins corresponding to site β of the ColE2 origin are thymine and guanine, respectively. On the other hand, in all the plasmids which specify the Rep proteins with a tryptophan residue at this position, the residues at the site of their origins could be guanine and thymine, respectively (Fig. 8b). Such a correlation is not found at other nearby sites in the origins. These amino acid residues of region B therefore might specifically interact with either one or both bases at site β of the origin. In many DNA binding proteins, the side chains of the basic residues interact with sugar-phosphate backbones of target DNA molecules (25). The binding of the Rep protein to the
FIG. 8. Comparisons of the C-terminal amino acid sequences of the Rep proteins (a) and nucleotide sequences of the origins (b) among the (putative) ColE2-related plasmids. Thick lines indicate the regions of the specificity determinants in the Rep proteins (A, B, and C) and the origins (α, β, and γ). Bold letters indicate highly conserved residues among these plasmids. The positions corresponding to the residues in regions B and sites β involved in the Rep-origin binding specificity in plasmids ColE2 and ColE3 are boxed with dotted lines. The positions of amino acid residues at the C-terminal end are numbered, and the positions of the origin sequences are numbered from left to right. Dots indicate the positions of the primer RNA synthesis in plasmids ColE2 and ColE3. The GenBank accession numbers of the Rep proteins are as follows: ColE2-P9, BAA06292; ColE3-CA38, BAA06293; ColE5-099 (11), 809524; pUB6060 (1), CAB56519.1; pMGD2 (36), NP_620615.1; pEMCJH03, NP_957539.1; pEI2 (5), NP_061811.1; pAsa1 (3), NP_861554.1; pAsa3 (3), NP_861563.1; and pAsal1, NP_710167.1. The GenBank accession numbers of the nucleotide sequences are as follows: ColE2-P9, D30054; ColE3-CA38, D30055; ColE5-099, D30060; pUB6060, AJ249644; pMGD2, NC_003789; pEMCJH03, NC_005325; pEI2, AF244084; pAsa1, AY301063; pAsa3, AY301065; and pAsal1, AJ508382. The putative origin sequences for pEMCJH03 are from reference 33.
origin seemed to be stabilized through the interactions between the basic side chains and sugar-phosphate backbones as shown above. It was also suggested that the presence of a lysine residue at an appropriate distance from the Thr or Trp residue is necessary for specific origin binding with an appropriate affinity. The structures of protein-DNA complexes of many DNA binding proteins have been determined, and they have revealed that an α-helical cylinder interacts with the target DNA at a unique binding angle in each protein-DNA complex. This was proposed to be one of the reasons why a simple recognition code in the protein-DNA interaction has not been found (6, 19). This could explain the complexity in determination of the specificity by regions B and sites β in the ColE2-related plasmids. The recognition helix of the Rep protein
might bind to the origin DNA at a unique angle for each pair of Rep proteins and origins through interaction between long basic side chains at the specific positions and sugar-phosphate backbones, so that the Thr/Trp residue could efficiently recognize the specific base at the specific position in the origins.

For region A, it has been proposed that this region might be a linker connecting two functional domains (24). Retention of replication activity by many of the mutant ColE3 Rep proteins with multiple amino acid substitutions in this region, as shown above, further supported such a proposal. On the other hand, the results of mutation analysis also indicated that this region is not a simple flexible linker, as even single amino acid substitutions at either one of two specific positions decreased or abolished the replication activity. The importance of these two residues was further supported by the finding that a tyrosine (with a neighboring valine) and a histidine (with a neighboring threonine) corresponding to 248-Tyr and 251-His in region A of the ColE3 Rep protein are highly conserved among the (putative) Rep proteins of the ColE2-related plasmids with additional sequences in their regions A (Fig. 8a). It has been suggested that the length and composition of the linker sequence of the transcription factors Oct-1 and Pax6 are important for an appropriate positioning of the two domains separated by this region on the target DNA via intramolecular interactions between the residues in the linker region and those in the folded domain and/or interactions between the residues in the linker region and DNA (30, 31). The 248-Tyr and 253-His residues of the ColE3 Rep protein might contribute to forming a defined structure resulting in an appropriate positioning of two functional domains separated by region A on the cognate origin DNA; one of the domains may be the DNA binding domain containing region B, and the other might be the primase domain.
It may be worth mentioning that the Rep protein is likely able to synthesize the primer RNA only at a specific position in the origin. The nucleotide sequence of more than 10 residues just downstream of site α, including the site of the primer synthesis, is highly conserved among the (putative) origins of the ColE2-related plasmids (Fig. 8b). The conserved sequence in the ColE2 origin contains site c (Fig. 1), which is one of the three elements for the specific Rep protein binding proposed previously (33). It has been suggested that region III (Fig. 1a), located upstream of region A of the ColE2 Rep protein, is involved in recognition of the several residues containing site c of the origin (9) and is also involved in unwinding of the localized duplex DNA region in the origin (Han and Itoh, unpublished). In addition to stable binding of the C-terminal portion containing region B of the ColE2 Rep protein to the left part of the origin, including site β, region III recognizes site c, and such interaction of the Rep protein with the two regions of the origin may cause unwinding of the localized duplex DNA region around site α, exposing the single-stranded template for primer RNA synthesis by the Rep protein. When the combination of region A and site α is improper, region III might be unable to bind, or might bind less efficiently, to site c due to an inappropriate length of region A, although the C-terminal portion containing region B is able to bind to the left part of the origin, and this might result in deficiency or lower efficiency of the unwinding of the origin and replication initiation. The sequences of the C-terminal portion of the Rep proteins
and origins are almost identical between plasmids ColE3 and ColE5-099, except for region C and site γ (Fig. 8). Nevertheless, the interactions of the Rep proteins with the origins are plasmid specific; therefore, these regions and sites have been proposed to be the third set of the specificity determinants (11). By comparison of the C-terminal portions of the Rep proteins and origins among the ColE2-related plasmids (Fig. 8), we noticed that insertion or deletion of three or four amino acids in region C approximately corresponds to insertion or deletion of one nucleotide residue in site γ. The roles of region C and site γ in determination of the specificity, however, could be somewhat different from those of region A and site α. The 5′-TAAGCC-3′ sequence across site γ in the ColE2 origin corresponds to one of the three Rep binding elements in the origin, site b, whereas site α corresponds to the boundary of the two binding elements, sites b and c (33) (Fig. 1). In addition, region II (Fig. 1) across region C of the ColE2 Rep protein binds to site b in the origin (9). Therefore, the amino acid residue(s) in region C might specifically recognize the nucleotide residue(s) of site γ in the cognate origin.

ACKNOWLEDGMENTS

We are grateful to H. Ogawa and T. Ogawa for helpful discussions and encouragement. We thank M. Yagura and M. Han for experimental advice. We also thank T. Yamada for contributions in the early phase of this study. This work was supported in part by a grant-in-aid for scientific research from the Ministry of Education, Science and Culture of Japan.

REFERENCES

1. Avison, M. B., T. R. Walsh, and P. M. Bennett. 2001. pUB6060: a broad-host-range, DNA polymerase-I-independent ColE2-like plasmid. Plasmid 45:88–100.
2. Bazaral, M., and D. R. Helinski. 1968. Circular DNA forms of colicinogenic factors E1, E2, and E3 from Escherichia coli. J. Mol. Biol. 36:185–194.
3. Boyd, J., J. Williams, B. Curtis, C. Kozera, R. Singh, and M. Reith. 2003. Three small, cryptic plasmids from Aeromonas salmonicida subsp. salmonicida A449. Plasmid 50:131–144.
4. Chang, A. C. Y., and S. N. Cohen. 1978. Construction and characterization of amplifiable multicopy DNA cloning vehicles derived from the P15A cryptic miniplasmid. J. Bacteriol. 134:1141–1156.
5. Fernandez, D. H., L. Pittman-Cooley, and R. L. Thune. 2001. Sequencing and analysis of the Edwardsiella ictaluri plasmids. Plasmid 45:52–56.
6. Garvie, C. W., and C. Wolberger. 2001. Recognition of specific DNA sequences. Mol. Cell 8:937–946.
7. Gellert, M., M. H. O’Dea, T. Itoh, and J. Tomizawa. 1976. Novobiocin and coumermycin inhibit DNA supercoiling catalyzed by DNA gyrase. Proc. Natl. Acad. Sci. USA 73:4474–4478.
8. Geourjon, C., and G. Deleage. 1995. SOPMA: significant improvements in protein secondary structure prediction by consensus prediction from multiple alignments. Comput. Appl. Biosci. 11:681–684.
9. Han, M., M. Yagura, and T. Itoh. 2007. Specific interaction between the initiator protein (Rep) and origin of plasmid ColE2-P9. J. Bacteriol. 189:1061–1071.
10. Herschman, H. R., and D. R. Helinski. 1967. Comparative study of the events associated with colicin induction. J. Bacteriol. 94:691–699.
11. Hiraga, S., T. Sugiyama, and T. Itoh. 1994. Comparative analysis of the replicon regions of eleven ColE2-related plasmids. J. Bacteriol. 176:7233–7243.
12. Horii, T., and T. Itoh. 1988. Replication of ColE2 and ColE3 plasmids: the regions sufficient for autonomous replication. Mol. Gen. Genet. 212:225–231.
13. Itoh, T., and T. Horii. 1989. Replication of ColE2 and ColE3 plasmids: in vitro replication dependent on plasmid-coded proteins. Mol. Gen. Genet. 219:249–255.
14. Kahl, B. F., and M. R. Paule. 2001. The use of diethyl pyrocarbonate and potassium permanganate as probes for strand separation and structural distortions in DNA. Methods Mol. Biol. 148:63–75.
15. Kido, M., H. Yasueda, and T. Itoh. 1991. Identification of a plasmid-coded protein required for initiation of ColE2 DNA replication. Nucleic Acids Res. 19:2875–2880.
16. Kingsbury, D. T., and D. R. Helinski. 1970. DNA polymerase as a requirement for the maintenance of the bacterial plasmid colicinogenic factor E1. Biochem. Biophys. Res. Commun. 41:1538–1544.
17. Kornberg, A., and T. A. Baker. 1992. DNA replication, 2nd ed. Freeman and Company, New York, NY.
18. Kunkel, T. A., J. D. Roberts, and R. A. Zakour. 1987. Rapid and efficient site-specific mutagenesis without phenotypic selection. Methods Enzymol. 154:367–382.
19. Pabo, C. O., and L. Nekludova. 2000. Geometric analysis and comparison of protein-DNA interfaces: why is there no simple code for recognition? J. Mol. Biol. 301:597–624.
20. Pelton, J. G., S. Kustu, and D. E. Wemmer. 1999. Solution structure of the DNA-binding domain of NtrC with three alanine substitutions. J. Mol. Biol. 292:1095–1110.
21. Saiki, R. K., D. H. Gelfand, S. Stoffel, S. J. Scharf, R. Higuchi, G. T. Horn, K. B. Mullis, and H. A. Erlich. 1988. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239:487–491.
22. Sakakibara, Y., and J. Tomizawa. 1974. Replication of colicin E1 plasmid DNA in cell extracts. Proc. Natl. Acad. Sci. USA 71:802–806.
23. Sambrook, J., and D. W. Russell. 2001. Molecular cloning: a laboratory manual, 3rd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
24. Shinohara, M., and T. Itoh. 1996. Specificity determinants in interaction of the initiator (Rep) proteins with the origins in the plasmids ColE2-P9 and ColE3-CA38 identified by chimera analysis. J. Mol. Biol. 257:290–300.
25. Suzuki, M. 1993. Common features in DNA recognition helices of eukaryotic transcription factors. EMBO J. 12:3221–3226.
26. Tacon, W., and D. J. Sherratt. 1976. ColE plasmid replication in DNA polymerase I-deficient strains of Escherichia coli. Mol. Gen. Genet. 147:331–335.
27. Tajima, Y., T. Horii, and T. Itoh. 1988. Replication of ColE2 and ColE3 plasmids: two ColE2 incompatibility functions. Mol. Gen. Genet. 214:451–455.
28. Takechi, S., and T. Itoh. 1995. Initiation of unidirectional ColE2 DNA replication by a unique priming mechanism. Nucleic Acids Res. 23:4196–4201.
29. Takechi, S., H. Matsui, and T. Itoh. 1995. Primer RNA synthesis by plasmid-specified Rep protein for initiation of ColE2 DNA replication. EMBO J. 14:5141–5147.
30. van Leeuwen, H. C., M. J. Strating, M. Rensen, W. de Laat, and P. C. van der Vliet. 1997. Linker length and composition influence the flexibility of Oct-1 DNA binding. EMBO J. 16:2043–2053.
31. Xu, H. E., M. A. Rould, W. Xu, J. A. Epstein, R. L. Maas, and C. O. Pabo. 1999. Crystal structure of the human Pax6 paired domain-DNA complex reveals specific roles for the linker region and carboxy-terminal subdomain in DNA binding. Genes Dev. 13:1263–1275.
32. Yagura, M., and T. Itoh. 2006. The Rep protein binding elements of the plasmid ColE2-P9 replication origin. Biochem. Biophys. Res. Commun. 345:872–877.
33. Yagura, M., S. Nishio, H. Kurozumi, C. Wang, and T. Itoh. 2006. Anatomy of the replication origin of plasmid ColE2-P9. J. Bacteriol. 188:999–1010.
34. Yasueda, H., S. Takechi, T. Sugiyama, and T. Itoh. 1994. Control of ColE2 plasmid replication: negative regulation of the expression of the plasmid-specified initiator protein, Rep, at a posttranscriptional step. Mol. Gen. Genet. 244:41–48.
35. Yasueda, H., T. Horii, and T. Itoh. 1989. Structural and functional organization of ColE2 and ColE3 replicons. Mol. Gen. Genet. 215:209–216.
36. Yoo, J. S., H. S. Kim, S. Y. Chung, Y. C. Lee, Y. S. Cho, and Y. L. Choi. 2001. Characterization of the small cryptic plasmid, pGD2, of Klebsiella sp. KCL-2. J. Biochem. Mol. Biol. 34:584–589.