The EMBO Journal Vol.17 No.24 pp.7514–7525, 1998

Crystal structures of open and closed forms of binary and ternary complexes of the large fragment of Thermus aquaticus DNA polymerase I: structural basis for nucleotide incorporation

Ying Li, Sergey Korolev and Gabriel Waksman¹

Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, Campus Box 8231, 660 South Euclid Avenue, St Louis, MO 63110, USA

¹Corresponding author e-mail: [email protected]

The crystal structures of two ternary complexes of the large fragment of Thermus aquaticus DNA polymerase I (Klentaq1) with a primer/template DNA and dideoxycytidine triphosphate, and that of a binary complex of the same enzyme with a primer/template DNA, were determined to a resolution of 2.3, 2.3 and 2.5 Å, respectively. One ternary complex structure differs markedly from the two other structures by a large reorientation of the tip of the fingers domain. This structure, designated ‘closed’, represents the ternary polymerase complex caught in the act of incorporating a nucleotide. In the two other structures, the tip of the fingers domain is rotated outward by 46° (‘open’) in an orientation similar to that of the apo form of Klentaq1. These structures provide the first direct evidence in DNA polymerase I enzymes of a large conformational change responsible for assembling an active ternary complex.

Keywords: binary/ternary complex/crystal structure/DNA polymerase I/fingers domain/nucleotide incorporation

Introduction

The family of DNA polymerase I (DNA Pol I) enzymes plays a role in the repair of DNA lesions in prokaryotic organisms (Kornberg and Baker, 1991). DNA Pol I enzymes catalyze the addition of mononucleotide units derived from deoxynucleoside 5′ triphosphates (dNTP) to the 3′ hydroxyl terminus of a primer chain in a reaction that requires a template chain which directs the enzyme in its selection of the specific incoming nucleotide (Kornberg and Baker, 1991). These enzymes are characterized by a multidomain architecture which supports not only the polymerase activity, but also a proof-reading 3′→5′ and/or a 5′→3′ exonuclease activity (Delarue et al., 1990). The mechanism of DNA polymerization by DNA Pol I enzymes has been the subject of extensive biochemical and structural studies (Johnson, 1993; Joyce and Steitz, 1994; Brautigam and Steitz, 1998). The first crystal structure of a member of this family of proteins, the large fragment (Klenow) of Escherichia coli DNA Pol I, revealed that the polymerase domain has a shape reminiscent of a right hand in which the palm, fingers and thumb form the DNA-binding crevice (Ollis et al., 1985). The active site composed of three acidic residues is located at the palm which forms the base of the crevice. Subsequently, complexes of this protein fragment with DNA described the enzyme in its editing mode (Freemont et al., 1988; Beese and Steitz, 1991; Beese et al., 1993). On the basis of the configuration of the binding partners at the 3′–5′ exonuclease active site, it was proposed that the mechanism of catalysis for the polymerase activity involved two metal ions which promote the deprotonation of the 3′ OH of the primer strand and assist the leaving of the pyrophosphate (Steitz, 1993).

During synthesis by DNA Pol I enzymes, the primer/template DNA is translocated with each cycle of polymerization to present a new base to the polymerase active site. This event occurs at a rate of several hundred bases per second, depending on the intrinsic processivity of the enzyme (Carroll and Benkovic, 1990; Johnson, 1993). Whether translocation of the DNA requires a conformational change in the protein is not known. However, solution studies have identified a slow rate-limiting step before chemistry which suggests that at least one conformational transition is required before incorporation of the nucleotide, possibly to assemble a productive protein–DNA–dNTP ternary complex (Kuchta et al., 1987, 1988; Dahlberg and Benkovic, 1991; Patel et al., 1991; Wong et al., 1991). Recently, a conformational change in DNA Pol I enzymes has been documented by comparing the structure of a quaternary complex of the T7 DNA polymerase bound to thioredoxin, a template/primer DNA, and an incoming dideoxynucleoside triphosphate (ddNTP) with structures of DNA-bound or apo forms of other DNA polymerases I (Doublié et al., 1998). This comparison revealed an orientation of the fingers domain in the T7 DNA polymerase quaternary complex that is different from the other structures, corresponding to a rotation inwards (closed) by ~41° towards the primer/template DNA.
Such an open to closed conformational transition may be responsible for assembly of a productive ternary complex. The crystal structure of the quaternary complex of the T7 polymerase, as well as that of an active binary complex of the DNA polymerase I from Bacillus stearothermophilus bound to a template/primer DNA (Kiefer et al., 1998), demonstrated that the terminal base pair is contained within a binding pocket, the geometry of which is incompatible with a mismatched base pairing (Doublié et al., 1998; Kiefer et al., 1998). In none of these structures were the DNAs seen to cross the crevice formed by the fingers, palm and thumb domains. Instead, in both, the first single-stranded template base is flipped out at a 90° angle, indicating that during polymerization the DNA remains on one side of the protein. Since the argument for a large conformational change affecting the fingers domain, as proposed by Doublié et al. (1998), rested on the comparison of different DNA Pol I structures, the direct experimental proof for such a motion remained to be provided.

In this report, we describe the structures of two ternary complexes of the large fragment of Taq DNA Pol I (Klentaq1) bound to a primer/template DNA and ddNTP. These structures represent the open and closed forms of the enzyme and capture Klentaq1 in the act of incorporating a nucleotide at the active site. We also present the structure of a binary complex of Klentaq1 bound to a template/primer DNA. These three structures together with those of the apo (Korolev et al., 1995) and the dNTP-bound (Li et al., 1998) forms of the enzyme provide new insight into the structural basis of nucleotide incorporation during DNA polymerization.

© European Molecular Biology Organization

Structures of closed and open forms of Klentaq1

Fig. 1. Representative regions of the electron density. (A) The GT5:ddC base pair and the flipped GT4 base of the open ternary complex. ddC indicates the incorporated ddCMP (notation as in Figure 5). (B) Same region as (A) in the closed ternary complex. Note that GT4 is now in a stacking arrangement with GT5. (C) The H1H2 loop in the thumb region. In (A) and (C), electron density results from a map calculated using the experimental MAD solvent-flattened phases. In (B), electron density results from a simulated annealing omit map where the region shown has been deleted from the model (Hodel et al., 1992). Residues in the protein and the DNA are color coded according to atom type with carbon and phosphorus atoms in yellow, oxygen atoms in red and nitrogen atoms in blue. Generated using the program O (Jones et al., 1991).

Results and discussion

Structure determination

The strategy used to obtain ternary complexes of Klentaq1 was similar to that first developed by Pelletier et al. (1994) for DNA polymerase β. Klentaq1 was mixed prior to crystallization with a primer/template DNA composed of strands that were 11 and 16 nucleotides in length, respectively, and subsequently reacted against an excess of ddNTP (Materials and methods). The design of the template strand was such that the two first single-stranded bases of the template were guanosines allowing (i) the incorporation of a dideoxycytidine monophosphate (ddCMP), and (ii) the productive positioning of a dideoxycytidine triphosphate (ddCTP) at the active site.

Two ternary complex crystals were obtained, the structures of which differed by a large reorientation of the fingers domain (designated ‘open’ and ‘closed’ below). Both crystals were obtained using identical crystallization conditions, diffracted to similar resolution (2.3 Å) and were in the same space group (P3₁21) with the same unit cell dimensions. However, the ‘open’ crystal form was obtained from a selenomethionine-derived protein and was incubated after growth in a stabilizing solution deprived of protein, DNA and ddCTP for 5 days. In contrast, the ‘closed’ crystal form was obtained from the wild-type protein and was used in data collection immediately after growth. Therefore, we believe that the ‘open’ form was obtained by depleting, at least partially, the ‘closed’ crystal form of its ddCTP component, a hypothesis consistent with the fact that the occupancy for the ddCTP in the open form was found to be low (Materials and methods). The binary complex crystals were obtained by incubating the closed ternary complex crystals in the stabilizing solution described above for an extended period (1 month). This treatment resulted in complete release of the ddCTP and consequently in a binary enzyme–DNA complex.

The structure of the open ternary complex was determined using the method of multiwavelength anomalous diffraction (MAD) and was refined against data to 2.3 Å resolution with free-R and R values of 28.8 and 22.4%, respectively. The structures of the closed ternary complex and of the binary complex were determined by difference Fourier methods (Materials and methods). The closed ternary complex structure was refined against 2.3 Å data, with values for the free-R and R-factors of 27.5 and 21.8%, respectively, whereas the binary enzyme–DNA complex structure was refined against 2.5 Å data, with values for the free-R and R-factors of 29.8 and 22.7%, respectively (Figures 1 and 2, Table I).

Table I. Refinement statistics

                              Open binary    Closed ternary   Open ternary
                              complex        complex          complex
Resolution (Å)                30–2.5         30–2.3           30–2.3
Reflections                   20 804         26 228           25 698
Completeness (%)a             87.2 (74.7)    91.0 (84.8)      88.9 (77.2)
Total nonhydrogen atoms       4769           4874             4631
Total protein residues        539            539              528
Total DNA residuesb           13/12          14/12/1          13/12/1
Total number of ions          0              2                1
Total number of waters        103            149              150
R/Rfree (%)c                  22.7/29.8      21.8/27.5        22.4/28.8
R.M.S. deviation
  Bond length (Å)             0.008          0.007            0.007
  Bond angle (°)              1.4            1.2              1.3
  B-factors (Å²), main chain  1.3            0.7              1.1
  B-factors (Å²), side chain  2.0            1.5              2.5

aCompleteness for the ‘working set’ at F/σ(F) > 2.0. Completeness for the last resolution shell is in parentheses.
bTotal DNA residues: template/primer/nucleoside triphosphate.
cRfree was calculated using 5% of the total reflections (‘test set’).

Two conformations of the fingers domain

The two ternary complexes of Klentaq1 bound to a primer/template DNA and ddCTP differ by a large conformational change affecting the tip of the fingers domain (Figure 3A). In one ternary complex structure (Figure 2A), the fingers domain closes the crevice formed by the thumb, palm and fingers (Figure 2C): this conformation is referred to as the closed form of the enzyme. In the second ternary complex structure, as in the binary Klentaq1–DNA complex structure (Figure 2B), the fingers domain is seen in a totally different conformation, such that the crevice is clearly visible (Figure 2D). This conformation is referred to as the open form of the polymerase. Therefore, the three structures presented here are those of an open and a closed ternary complex, and that of an open binary complex.

The open to closed conformational change affecting the fingers domain can be deconvoluted into two rotations successively affecting different parts of the fingers domain (Figure 3A). First, a 6° rigid body rotation of helices N, O, O1 and O2 results in a partial closing of the crevice (see Figure 4 for definition of secondary structures). This motion is amplified by a second rotation of 40°, affecting the N and O helices only.

The open to closed transition affects dramatically the orientation of the O helix. Solution studies of DNA Pol I have demonstrated the involvement of many residues in the O helix in dNTP binding and incorporation (Astatke et al., 1995; Suzuki et al., 1996). The effect of the open to closed conformational transition is to position the O helix in two different orientations. In the first orientation, that seen in the open form, the O helix is in a configuration similar to that observed in the apo and dCTP-bound forms of the enzyme (Korolev et al., 1995; Li et al., 1998). Tyr671, at the C-terminus of the O helix, is inserted into the stacking arrangement of the template bases and lies on top of the first base pair of the duplex part of the primer/template DNA (Figures 5, 6B and D). Tyr671 may act as a positioning device for the DNA, such that the first base pair of the primer/template can register itself against the active site. Insertion of Tyr671 into the stacking arrangement of the template bases also has consequences for the local structure of the DNA in this region (see below). In its second orientation, that seen in the closed form of the ternary complex, the O helix has moved in and is now much closer to the active site formed by the three carboxylates located in the palm domain (Figures 2A and C, and 6A and C). One effect of the rigid body motions affecting the O helix in the open to closed transition is to release the side chain of Tyr671 from its stacking arrangement with the template bases (Figure 6A). As a result, the place which Tyr671 occupied in the open form is vacated, allowing the first single-stranded DNA base of the template to position itself in front of the incoming ddCTP (Figures 1B and 6A). Another effect of the conformational change affecting the O helix is to bury the ddCTP, thereby assembling a productive ternary complex poised for chemistry (see below).

Conformational transitions between the unbound and DNA-bound states

The conformation of the thumb is affected greatly by binding of DNA.
Figure 3B compares the structure of the apo form of the enzyme (Korolev et al., 1995) with that of the open binary Klentaq1–DNA complex (this work) and illustrates the conformational change affecting the thumb domain upon interactions with the primer/template DNA. This conformational change can be deconvoluted into two parts: first, a rigid body rotation of the thumb domain by a 17° angle along an axis perpendicular to the view plane of Figure 3B results in an opening of the DNA-binding crevice; secondly, a rotation by 12° about the same axis brings the tip of the thumb domain, i.e. only helices H1 and H2, closer to the DNA. This second rotational component is in the opposite direction to the first rotation. Other significant conformational changes, mostly corresponding to disorder to order transitions, are apparent between the unbound and the DNA-bound states of the thumb. These are mostly localized to the H1H2 loop, which in the apo- and dNTP-bound forms of the enzyme is disordered, but well defined in electron density in all three complexes described here (Figure 1C).

The result of the conformational changes affecting the thumb domain is the formation of a cylinder that almost completely surrounds the DNA (Figure 2C and D). This structure is observed in all three complexes. Solution studies have established that the rate of DNA dissociation during polymerization by DNA Pol I enzymes is very slow and that the DNA does not necessarily dissociate from the protein after chemistry (Kuchta et al., 1987; Patel et al., 1991). These data suggest that the observed wrapping of the thumb domain around the DNA may be maintained during nucleotide assembly, nucleotide incorporation and DNA translocation.

Fig. 2. Stereo ribbon diagram (Carson, 1997), and surface of the closed ternary and open binary complexes of Klentaq1. (A and C) The closed ternary complex. (B and D) The open binary complex. In (A) and (B), the N-terminal, palm, fingers and thumb domains are indicated as ribbons, and color coded in yellow, magenta, green and deep blue, respectively. The O helix in the fingers domain is shown in red. The ribose and base in the primer (silver) and template (clear blue) strands are shown in stick representation, with the ribose-phosphate backbones shown as ribbons. The incoming ddCTP in (A) is shown in stick representation and is colored black. Gold spheres in (A) indicate metal ions. The notation for the secondary structural elements in the polymerase domain is indicated. In (C) and (D), the surface was contoured and displayed using GRASP (Nicholls et al., 1991). Color coding of DNA atoms is as in Figure 1 except for carbons (white). The template and primer strand backbones are shown in blue and red, respectively. Helices O1 and Q, and loop H1H2 are indicated. The double cyan arrow in (D) indicates the only possible direction for DNA motion in the open binary complex.

Fig. 3. Comparison of the Cα tracings of the open and closed ternary complex forms and of the apo and open binary DNA-bound forms of Klentaq1 (Carson, 1997). (A) Superimposition of the structures of the open (magenta) and closed (yellow) forms of the ternary complex. (B) Superimposition of the structures of the open binary Klentaq1/DNA complex (magenta) and apo-Klentaq1 (green).

DNA conformations and residues involved in DNA binding

The overall conformation of the duplex part of the primer/template is very similar in all three complexes (Figure 5). The DNA is mostly in the B-form, except for the three base pairs at the end of the duplex DNA adjacent to the O helix, which are A-form. The resulting widening of the minor groove in this region has also been noted for DNAs complexed with the Taq, T7 and B.stearothermophilus DNA polymerases and DNA polymerase β (Pelletier et al., 1994; Eom et al., 1996; Doublié et al., 1998; Kiefer et al., 1998). Here, however, as in the quaternary complex of T7 polymerase, the DNA is distorted to assume an S shape. A first bend is caused by the interactions with the palm at the active site, whereas a second bend results from interactions with the thumb (Doublié et al., 1998).

The DNA duplex interacts with the protein between base GT5 and GT13 in the template, and base CP6 and ddC in the primer (ddC refers to the incorporated ddCMP; notation for the bases is given in Figure 5), i.e. nine and seven nucleotides at the 5′ end of the template and the 3′ end of the primer, respectively, participate in contacts with the protein (Figure 5). This observation is consistent with solution studies by Catalano et al. (1990) which have measured a site size of seven base pairs using a marker located on the primer strand. The part of the duplex DNA distal to the active site and the O helix makes contact with residues in the thumb domain, whereas the other end of the duplex DNA mostly interacts with residues in the palm. Two loops in the thumb domain, HH1 and H1H2, and the I helix form most of the contacts with the DNA between the T12–P5 and T9–P8 base pairs (Figure 5). Interactions in this region are polar or charged, and occur mostly with the ribose phosphate backbones of each strand. The palm domain also presents binding surfaces to both the primer and the template strands, with β-strands 7 and 8 running parallel to the template, and β-strands 12 and 13 parallel to the primer (Figure 2A and B).

In all three structures, neither the duplex DNA nor the single-stranded region of the template strand passes through the crevice between the thumb and the fingers. Instead, the first unpaired base of the template is flipped out of the stacking arrangement with the duplex by a sharp angle in the template ribose phosphate backbone which positions the single-stranded template base on the same side of the crevice as the duplex DNA (Figure 2).

Fig. 4. Sequence alignment of the polymerase domains of Taq, E.coli and T7 DNA polymerases. Secondary structural elements and domain boundaries are indicated above the aligned sequences, with β-strands and α-helices indicated by blue and red open boxes, respectively. Notation for each of these elements is included in the boxes. Note that the region corresponding to helix J in Klenow is not helical in the complexes presented here. The catalytic triad is shown in gray boxes spanning all three sequences. Residues in Klentaq1 involved in DNA and ddCTP binding are indicated in colored unframed boxes: green, residues interacting with DNA in both the open and closed ternary complexes, and in the open binary complex; pink, residues interacting with the DNA in the closed ternary complex only; dark blue, interactions with DNA (such as Tyr671) in the open ternary and the open binary complexes only; cyan, residues interacting with the ddCTP in both the open and closed ternary complexes; orange, residues interacting with the ddCTP in the closed form only; dark red, residues interacting with the ddCTP only in the open ternary complex.


Fig. 5. Schematic diagram of the contacts between the polymerase and the DNA. Only direct contacts between the protein and the DNA are shown. Residues in green boxes make similar contacts with the DNA in all three complexes. Pink boxes indicate contacts only observed in the closed ternary complex. AT3 is shown in pink since it is only observed in this complex. GT4 is shown both in pink and deep blue since it is seen in two different conformations in the open (deep blue) binary and ternary complexes, and in the closed (pink) ternary complex. Tyr671 is in deep blue since it is observed stacking against the 5′ end template base in the open binary and ternary complexes only. Distances between interacting atoms are shown. This figure also defines the notation for the bases used in the text.

Most of the differences in DNA conformation and binding between the open and closed ternary complexes lie in the single-stranded and ddCTP-paired regions of the template (Figure 6A and B). In the closed form, a sharp turn in the template’s phosphate backbone between AT3 (the first single-stranded template base) and GT4 (the base paired with the ddCTP) positions AT3 between the O1 and Q helices, away from the duplex part of the DNA (Figure 2C). In contrast, in the open form, AT3 is disordered and GT4 is displaced from the ddCTP-paired position by the insertion of Tyr671. Surprisingly, the displaced GT4 base lies on the side of the DNA duplex and is stabilized uniquely by contacts with the template bases GT5 and GT6 (Figures 1A and 6B). This configuration of GT4 is also observed in the open binary Klentaq1–DNA complex.

Binding of the dideoxynucleoside triphosphate in the two forms of the ternary complex

In both forms of the ternary complex, the nucleoside triphosphate is located on top of the 3′ end base of the primer strand. Its binding configuration is also very similar in the two structures (Figure 6). However, due to the reorientation of the O helix from open to closed, the interactions of ddCTP with the protein are very different between the two ternary complexes (Figure 6A and B). Whereas, in the open form, the ddCTP is readily accessible to solvent, in the closed form, the incoming ddCTP is buried completely (Figure 6C and D). As illustrated in Figures 4 and 6, residues in the O helix contribute most of the additional contacts observed in the closed form.

Fig. 6. Stereo diagrams and surfaces of the contacts between the polymerase and the ddCTP (Carson, 1997). (A) Protein–ddCTP contacts in the closed form. (B) Protein–ddCTP contacts in the open form. (C) Surface diagram of ddCTP-binding site in the closed form. (D) Surface diagram of the ddCTP-binding site in the open form. Residues interacting with the ddCTP and with the first base pair are shown and labeled. Color coding and display of the ribbons are as in Figure 2. Metal ions are indicated as gold spheres. Water molecules are not shown. Color coding of the protein side chains, the DNA and the ddCTP in (A) and (B) is by atom type with the carbons in white, oxygens in red, nitrogens in blue and phosphorus in yellow. In (C) and (D), protein side chains are in gray, except for residues in the O helix which are in red; atoms in the template and primer strands are in clear blue and silver, respectively; the incoming nucleotide is colored by atom types as in (A) and (B). The solvent-accessible surface [in (C) and (D)] is represented by gray dots. In all panels, atoms are represented as ball-and-stick models, except in (C) and (D) where the ddCTP and the DNA are represented as space-filling models.

Interestingly, contacts with the O helix in this form are similar to those observed between residues in the O helix and the dCTP in the open Klentaq1–dCTP binary complex described previously by Li et al. (1998): the base of the nucleotide stacks against Phe667, whereas the triphosphate moiety makes electrostatic interactions with basic residues on the same face of the O helix. Furthermore, when the motion which brings the O helix from its open to closed configuration is applied to the coordinates of the dCTP of the open Klentaq1–dCTP binary complex, the dCTP is moved to within 1.2 Å of the ddCTP of the closed ternary complex (result not shown). This observation, together with the fact that in the closed form the ddCTP is located near the three active site carboxylates (Asp785, Glu786 and Asp610), lends support to the suggestion that a role for the conformational change affecting the O helix from the open to the closed form is to deliver the incoming nucleotide to the active site, thereby assembling a productive polymerase machinery poised for chemistry.

In the closed form of the ternary complex, two metal ions (Mg²⁺) are octahedrally coordinated by the triphosphate moiety of the ddCTP and carboxylate side chains in the active site (Figure 7). One metal (metal B) is ligated in the basal octahedral plane by four oxygen atoms, contributed by the β- and γ-phosphates and the carboxylate groups of Asp610 and Asp785 (Figure 7). The coordination sphere of the metal ion is completed on each side of the octahedral plane by interactions with oxygen atoms in the α-phosphate and the carbonyl of Tyr611. The other metal ion (metal A) is coordinated in the octahedral plane by oxygen atoms from the carboxylate of Asp785, the α-phosphate of the incoming nucleotide and two water molecules (Figure 7). On one side of the octahedral plane, metal A is ligated by an oxygen from the carboxylate of Asp610. On the other side of the plane, however, the coordinating position is vacant: a ribose 3′-hydroxyl at the 3′ end of the primer strand in a natural substrate would occupy this position and complete the coordination sphere of metal ion A (Figure 7). The coordination architectures of metal ions A and B are similar to those observed in the T7 polymerase quaternary complex (Doublié et al., 1998) and are consistent with the ‘two metal ion’ mechanism for nucleotide addition proposed by Steitz and colleagues (Steitz, 1993; Steitz et al., 1994). In this mechanism, metal ion A is thought to lower the affinity of the 3′ OH for the hydrogen, thereby facilitating the 3′ O⁻ attack on the α-phosphate. Metal ion B may assist the leaving of the pyrophosphate.

Fig. 7. The active site in the closed ternary complex. The three acidic side chains involved in the transfer reaction are shown as well as the ddCTP, the first incorporated ddCMP base (ddC) at the 3′ end of the primer strand and the metal ions (yellow spheres). Red lines with distances indicate interactions between atoms. Red stars indicate water molecules. The cyan star indicates the position of the deoxyribose 3′-OH in a natural substrate. Generated using the program O (Jones et al., 1991).

Conclusions

Solution studies of DNA Pol I enzymes have established that the mechanism of DNA polymerization consists of a series of single dNMP incorporation reactions (Carroll and Benkovic, 1990; Johnson, 1993). Nucleotide incorporation begins with the binding of a dNTP to the enzyme–DNA (E·Dn) complex. The binding of a correct dNTP induces a rate-limiting conformational change which results in the formation of a tight ternary complex (E·Dn·dNTP ↔ E*·Dn·dNTP). The chemical reaction that follows is fast and results in the formation of a tightly bound enzyme–product complex (E*·Dn·dNTP ↔ E*·Dn+1·PPi); this complex then undergoes a second conformational change which relaxes the tightly bound enzyme–product complex (E*·Dn+1·PPi ↔ E·Dn+1·PPi). This step facilitates PPi release and allows translocation of the DNA for the next cycle of polymerization.
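The single-nucleotide incorporation cycle described above can be written as one reaction sequence. The rendering below is only a notational summary of the five enzyme-bound states named in the text, with the dNTP-binding and PPi-release steps added at either end; it implies no rate constants, and the typesetting choices (subscripts, the use of a starred E for the tight state) are assumptions made for readability.

```latex
\[
E\cdot D_{n} + \mathrm{dNTP}
\;\leftrightarrow\; E\cdot D_{n}\cdot\mathrm{dNTP}
\;\leftrightarrow\; E^{*}\cdot D_{n}\cdot\mathrm{dNTP}
\;\leftrightarrow\; E^{*}\cdot D_{n+1}\cdot\mathrm{PP_i}
\;\leftrightarrow\; E\cdot D_{n+1}\cdot\mathrm{PP_i}
\;\leftrightarrow\; E\cdot D_{n+1} + \mathrm{PP_i}
\]
```

Read left to right, the second arrow is the rate-limiting conformational change, the third is the chemical step, and the fourth is the relaxation that precedes PPi release.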
This mechanism suggests that the E·Dn complex fluctuates periodically between a loose and a tight binding state such that the binding and release of the substrates and products, as well as the linear diffusion or sliding of the DNA, would occur in the loose binding state, whereas the chemical reaction would occur in the tight binding state (Kuchta et al., 1987; Dahlberg and Benkovic, 1991; Patel et al., 1991). Since the kinetic data can be described fully by assuming only two conformations for the enzyme, one ‘relaxed’ (E) and one ‘tight’ (E*), we propose that the open and closed structures described in this report represent the E and the E* forms of the enzyme, respectively. In this proposal, the E·Dn state is represented by the open binary Klentaq1–DNA complex. The E·Dn·dNTP state may be approximated by the structure of the open binary Klentaq1–DNA complex, to which a dNTP bound to the O helix in the configuration described in Li et al. (1998) is added. Alternatively, one cannot rule out the possibility that the open ternary complex may represent an E·Dn·dNTP state. Finally, the E*·Dn·dNTP state corresponds to the closed ternary complex where the conformational change affecting the O helix brings the dNTP within the active site of the enzyme, thereby assembling a ‘tight’ ternary complex. Hence, a description of the structures of three of the five kinetic states invoked by Patel et al. (1991) for a single nucleotide incorporation is provided for the Klentaq1 system, making this DNA polymerase the best structurally documented DNA Pol I enzyme.

Solution studies of DNA Pol I enzymes have demonstrated that the formation of a ‘tight’ (i.e. closed) ternary complex (E*·Dn·dNTP) only occurs when the correct complementary dNTP is selected (Carroll and Benkovic, 1990; Johnson, 1993). The enzyme may be able to sample all dNTPs in a fast process by delivering them to the active site through an open to closed transition. However, only the correct dNTP locks the enzyme in the tight closed ternary complex form. This process can be understood by examining the structure of the closed ternary Klentaq1 complex and that of the quaternary T7 polymerase complex. In these structures, the terminal base pair is contained within a binding pocket, the geometry of which is incompatible with a mismatched base pairing (Figure 6 and Doublié et al., 1998). Hence, formation of a mismatched base pair would possibly result in an unstable structure which could open rapidly to sample another dNTP.

During DNA synthesis, the primer/template must translocate with each cycle of nucleotide incorporation. Kinetic studies have suggested that linear diffusion or sliding of the DNA may occur in the relaxed E·Dn state (Kuchta et al., 1987; Patel et al., 1991). This suggests that in the open form (i.e. the E·Dn form), the DNA may be free to slide within the structure formed by the thumb, the palm and the fingers domains of the enzyme. Such a structure, as shown in Figure 2, resembles a cylinder that almost completely wraps around the DNA. The axis of this cylinder (indicated by the double arrow in Figure 2D) is perpendicular to the plane of the nascent base pair, suggesting that a motion of the DNA along this axis would generate the space required for the next template base to insert itself in the stacking arrangement of the duplex DNA. The motions of the DNA within the cylinder need not be unidirectional. However, in the direction of the fingers, the DNA would wedge against the ring of Tyr671 and position itself in a favorable geometry: the open to closed transition would then displace Tyr671, allow the first single-stranded base of the template to insert itself in its place, assemble the incoming dNTP to base-pair with the template base and finally, position the metal ions and the water molecules in a configuration favorable for the reaction to occur. These latter events may result in tighter binding and transient immobilization of the DNA.

Table II. Data collection

Data set                 Native      Native      SeMet-1     SeMet-2     SeMet-3     SeMet-4
                         (binary)    (ternary)
Wavelength (Å)           1.08        CuKα        0.98789     0.97938     0.97923     0.96859
Resolution (Å)           30.0–2.5    30.0–2.3    30.0–2.3    30.0–2.3    30.0–2.3    30.0–2.3
Total reflections        270 566     210 488     319 110     417 051     317 452     341 422
Unique reflections       20 823      26 341      25 623      26 130      25 726      25 207
Completeness (%)a        91.8        96.3        93.4        95.2        93.7        91.9
Rsym (%)b                8.1         6.0         4.8         5.4         5.0         5.5

aCompleteness for I/σ(I) > 1.0.
bRsym = Σ|I – <I>| / ΣI, where I = observed intensity and <I> = average intensity from multiple observations of symmetry-related reflections.

Materials and methods

Crystallization
Klentaq1 was purified according to Korolev et al. (1995). The resulting material was further purified by size-exclusion chromatography followed by high-resolution cation exchange chromatography. The protein was then dialyzed in a buffer containing 20 mM Tris pH 8.0, 20 mM NaCl, 1 mM EDTA and 1 mM 2-mercaptoethanol, and concentrated to 25 mg/ml (0.4 mM). Production of the selenomethionine-derived protein was carried out in the methionine-auxotroph E. coli strain DL41 and protein purification proceeded as for wild-type. Incorporation of the selenium was confirmed by electrospray ionization mass spectrometry. The primer/template DNA consisted of 5′-GACCACGGCGC-3′ for the primer and 5′-AAAGGGCGCCGTGGTC-3′ for the template. The duplex was formed by mixing these two oligonucleotides (3.0 mM) in a 1:1 molar ratio in a buffer containing 10 mM Tris–HCl pH 8.0, 10 mM NaCl and 5 mM MgCl2, and was annealed using standard procedures. The ternary complex of Klentaq1 was formed by reacting a mixture of the protein, the primer/template DNA and the ddCTP in a molar ratio of 1:7.5:50 and in 20 mM MgCl2. Wild-type crystals were grown by vapor diffusion using the hanging drop method against a reservoir containing 0.1 M HEPES pH 7.5, 20 mM MnCl2, 0.1 M Na acetate and 10% (w/v) PEG4000 at 20°C. Sizable crystals (0.7 × 0.15 × 0.15 mm) appeared within 1 week. After cryoprotecting progressively (15 min total) to a final solution containing 0.1 M HEPES pH 7.5, 20 mM MnCl2, 0.1 M Na acetate, 20% (v/v) glycerol and 22.5% (w/v) PEG4000, the crystals were flash-frozen to liquid nitrogen temperature. The crystals diffracted to 2.3 Å in the laboratory setting (Rigaku Raxis IV image plate mounted on a Rigaku RU200 rotating anode X-ray generator). Crystals were in space group P3₁21 with unit cell dimensions a = b = 108.3 Å and c = 90.4 Å, and one complex in the asymmetric unit. A complete data set to a resolution of 2.3 Å (Table II) was collected using a single crystal with an oscillation range of 1.5° and exposure time of 45 min/frame. These crystals corresponded to the closed form of the ternary Klentaq1 complex.
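As a quick consistency check on the primer/template design given above, the following minimal sketch (not part of the original methods; the function names are illustrative) verifies that the 11-nt primer is complementary to the 3′ end of the 16-nt template and lists the single-stranded template bases in the order in which they would be copied. The first two templating bases are both G, which appears consistent with the use of ddCTP and with refined models that contain one incorporated ddCMP together with one bound ddCTP.

```python
# Sequences exactly as given in the text, written 5'->3'.
PRIMER = "GACCACGGCGC"
TEMPLATE = "AAAGGGCGCCGTGGTC"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def templating_bases(primer: str, template: str) -> str:
    """Return the single-stranded template bases in the order they are copied.

    The primer is expected to anneal to the 3' end of the template; this is
    checked explicitly rather than taken for granted.
    """
    duplex = reverse_complement(primer)
    if not template.endswith(duplex):
        raise ValueError("primer is not complementary to the template 3' end")
    overhang = template[: len(template) - len(duplex)]
    # The polymerase reads the template 3'->5', so reverse the 5' overhang.
    return overhang[::-1]


def incoming_nucleotides(primer: str, template: str) -> list[str]:
    """Bases of the nucleotides expected to pair with each templating base."""
    return [COMPLEMENT[base] for base in templating_bases(primer, template)]


if __name__ == "__main__":
    print(templating_bases(PRIMER, TEMPLATE))      # GGAAA
    print(incoming_nucleotides(PRIMER, TEMPLATE))  # ['C', 'C', 'T', 'T', 'T']
```

Only Watson-Crick pairing of unmodified bases is modeled here; the sketch says nothing about how the enzyme itself selects nucleotides.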
Selenomethionine-derived crystals were grown as described for wild-type. However, crystals were incubated for 5 days in a stabilizing solution containing 0.1 M HEPES pH 7.5, 20 mM MnCl2, 0.1 M Na acetate and 15% (w/v) PEG4000 prior to being transported to the Advanced Photon Source (APS) at Argonne National Laboratory. Just before data collection [at the Structural Biology Center (SBC) beamline], these crystals were cryoprotected as described above and flash-frozen. The selenomethionine-derived crystals were in the same space group with the same unit-cell dimensions as wild-type. MAD data were collected to a resolution of 2.3 Å at four wavelengths (details below) using a single crystal. The oscillation range was 1.5° and the exposure time was 8 s/frame. These crystals corresponded to the open form of the ternary Klentaq1 complex.

Binary complex crystals were derived from closed ternary complex crystals incubated for 30 days in the stabilizing solution described above. A complete native data set to a resolution of 2.5 Å was collected at Stanford Synchrotron Radiation Laboratory (SSRL; beamline 7.1) using a single frozen crystal (1.5° oscillation range, 20 s exposure). All data were processed and reduced using DENZO and SCALEPACK (Otwinowski, 1993).
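The trigonal unit cell reported for these crystals can be turned into a cell volume and a rough packing density with a few lines of arithmetic. The sketch below is illustrative only: the cell dimensions and the statement of one complex per asymmetric unit come from the text, whereas the molecular mass is an assumption, inferred from the quoted 25 mg/ml = 0.4 mM for the protein plus an average of about 330 Da per nucleotide for the 27 nucleotides of the primer/template.

```python
import math


def hexagonal_cell_volume(a: float, c: float) -> float:
    """Volume (cubic angstroms) of a cell with a = b and gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(volume: float, n_asu: int, mass_da: float) -> float:
    """Crystal volume per unit of macromolecular mass (cubic angstroms per Da)."""
    return volume / (n_asu * mass_da)


# Values taken from the text.
A_AXIS, C_AXIS = 108.3, 90.4   # angstroms
N_ASU = 6                      # asymmetric units per cell in P3(1)21

# Assumptions, not stated in the text:
PROTEIN_DA = 25.0 / 0.4e-3     # (g/l) / (mol/l), about 62,500 g/mol
DNA_DA = (11 + 16) * 330.0     # primer + template at ~330 Da per nucleotide

if __name__ == "__main__":
    volume = hexagonal_cell_volume(A_AXIS, C_AXIS)
    vm = matthews_coefficient(volume, N_ASU, PROTEIN_DA + DNA_DA)
    print(f"cell volume: {volume:.0f} cubic angstroms")
    print(f"Matthews coefficient: {vm:.2f} cubic angstroms per Da")
```

Under these assumptions the cell volume comes out at about 9.2 × 10⁵ Å³ and the packing density at roughly 2.1 Å³/Da, a value within the range commonly seen for macromolecular crystals and so compatible with one complex per asymmetric unit.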

Structure determination
The first attempt to solve the structures of each of the Klentaq1 complexes was carried out by molecular replacement (MR) (Rossman, 1972). The structure of the apo-form of Klentaq1 (Korolev et al., 1995) was used as a search model and the MR method was implemented using the program AMoRe (Navaza, 1994). Both the rotation and translation functions showed distinct peaks which could have constituted possible solutions for the MR problem. However, incorporating these models in refinement (program XPLOR, Brünger, 1992a) did not result in a significant decrease in R or free-R-factors.

Multiwavelength anomalous diffraction (MAD) data were collected from a single selenomethionine-substituted crystal (Hendrickson et al., 1990). The crystal was oriented such that Bijvoet pairs could be collected on the same frame. Scaling of data sets and anomalous pairs was performed using local scaling (program HEAVYv4.5, Terwilliger and Eisenberg, 1983). The anomalous differences recorded at the wavelength where f″ is maximum (SeMet-3 in Table II) were first used in an anomalous difference Patterson synthesis (program HASSP, Terwilliger et al., 1987) which yielded three strong peaks. These selenium positions matched the sulfur positions of three methionines in the MR model obtained using the SeMet-3 data, and therefore were used to generate phases solely based on the MAD data (program HEAVYv4.5). The resulting phases were used in a difference Fourier against the same anomalous differences to generate six additional selenium positions. The validity of the selenium positions was confirmed using difference Fourier techniques by systematically omitting each of the nine positions. These positions were subsequently used to generate MAD phases using the 30.0–2.7 Å resolution range (program SHARP, De La Fortelle and Bricogne, 1997). Values for f′ and f″ were from Hall et al. (1995). Subsequent solvent flattening and phase extension to a resolution of 2.3 Å (program SOLOMON, Abrahams and Leslie, 1996) resulted in an electron-density map which was of excellent quality (Figure 1) in the N-terminal, palm and thumb domains, but of poor to moderate quality in the fingers domain. An apo-Klentaq1 model was adjusted in the electron density using the program O (Jones et al., 1991). During model building, partial models consisting of the N-terminal, palm and thumb domains were used as external phase information to the phasing based on the multiwavelength anomalous dispersions, resulting in improved electron-density maps in the fingers region which could then be built at least partially (see Refinement below). At this stage, it was clear that the configuration of the fingers was similar to that observed in the apo form of the enzyme, i.e. open (Korolev et al., 1995).

The phases calculated from a partial model where the fingers domain was omitted were also used in the calculation of an electron density map using |Fo–Fc| coefficients, where Fcs were the structure factor amplitudes calculated from the partial model, and Fos were the amplitude data collected using the native ternary (closed) complex crystal (Table II). Remarkably, this electron-density map clearly showed density for the fingers domain which, once modeled, positioned the fingers domain in a configuration similar to that observed in the T7 polymerase quaternary complex structure, i.e. closed (Doublié et al., 1998). In both forms of the ternary complex, electron density for the DNA was clearly interpretable (Figure 1) and a model could be built readily. Electron density for the entire ddCTP was well defined only in the closed form.
For the open form, experimental and omit electron-density maps showed clear density only for the base and the ribose of the ddCTP, whereas electron density for the triphosphate moiety was patchy with density only for the α- and γ-phosphates. This could be rationalized by the absence of the O helix to stabilize the phosphate moiety, whereas the remaining contacts such as stacking contacts and interactions with Tyr671 (Figure 4) may have been sufficient to stabilize the base. Building of the two metal ions in the closed form relied on the excellent quality of the electron density in this region and on the optimal coordination sphere observed with the surrounding protein and ddCTP atoms. In the open form of the ternary complex, metal ion B only was built in density and was found to coincide with the same metal ion in the closed form of the ternary complex. Metal ion A could not be identified unambiguously in this form.

The open binary Klentaq1–DNA complex structure was determined by directly evaluating the open ternary complex model (minus the ddCTP) against the native binary complex data. Rigid body refinement (program XPLOR) was instrumental in finding the correct orientation of the model in the asymmetric unit. Simulated annealing omit maps for the region around the reacted 3′ end primer base (ddC) demonstrated the absence of electron density for the ddCTP or the metal ions.

Refinement
Both conjugate gradient minimization and simulated annealing in cartesian space were used during refinement (program XPLOR, Brünger et al., 1987). Progress was assessed by monitoring the free R-factor (Brünger, 1992b). Individual B-factor refinement was used only at the later stages of the process, resulting in tightly restrained B-factors (Table I). The atomic model of the open binary complex was refined using the ‘native (binary)’ data (program XPLOR, Table II). After bulk solvent correction, the refinement of this complex converged to a final R-factor of 22.7% with an R-free of 29.8% (30.0–2.5 Å resolution range; |F|/σ|F| > 2.0). The model for this structure contains protein residues from 294 to 832 with no interruption, 13 template nucleotides, 11 primer nucleotides and one incorporated ddCMP, and 103 well-defined water molecules. Electron density was poor for 29 side chains at the surface of the protein, which were built as Ala.

The atomic model of the open ternary complex was refined using the ‘SeMet-2’ data, whereas the model for the closed ternary complex was refined against the ‘native (ternary)’ data (program XPLOR, Table II). After bulk solvent correction, the refinement converged to a final R-factor of 21.8% with an R-free of 27.5% for the closed form, and to a final R-factor of 22.4% with an R-free of 28.8% for the open form (30.0–2.3 Å resolution range; |F|/σ|F| > 2.0). The closed form model contains protein residues from 293 to 831 with no interruption, 14 template nucleotides, 11 primer nucleotides, one incorporated ddCMP, one ddCTP, two Mg metal ions and 149 well-defined water molecules. Side chains for 26 residues, mostly solvent exposed, were not well defined in density and consequently were built as Ala. The open ternary complex model contains protein residues from 295 to 832, 13 template nucleotides, 11 primer nucleotides, one incorporated ddCMP, one ddCTP, one Mg metal ion and 150 well-defined water molecules. The occupancy for the ddCTP in this complex was set to 0.3 in the final refined structure resulting in B values for the entire ddCTP similar to the B values of the surrounding model atoms. Electron density for side chains in the fingers region between residues 636 and 699 was poor, and these residues were built as Ala, except for Phe667, Tyr671 and Met673, which were ordered. The open ternary complex model is also interrupted between residues 645 and 654. Twenty-one additional residues at the protein surface were built as Ala.

Only one residue (His784) in all three structures has (psi, phi) angles in the disallowed region of the Ramachandran plot (Ramachandran and Sasisekharan, 1968). This residue is near the active site and a similar conformation has been observed for the corresponding residue in other polymerase structures (Doublié et al., 1998; Kiefer et al., 1998). The coordinates of the three structures reported here have been deposited (PDB entry codes 2KTQ for the open ternary complex, 3KTQ for the closed ternary complex and 4KTQ for the binary Klentaq1–DNA complex).
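For readers less familiar with the statistics quoted in this section, the sketch below illustrates the conventional definition of the crystallographic R-factor, R = Σ||Fo| − k|Fc|| / Σ|Fo|, computed separately for the working reflections and for the cross-validation (free) set after applying an |F|/σ|F| cutoff. It is a simplified illustration with made-up reflections and hypothetical names; it does not reproduce the scaling or bulk solvent treatment used by XPLOR.

```python
from typing import Iterable, NamedTuple, Tuple


class Reflection(NamedTuple):
    f_obs: float    # observed amplitude |Fo|
    sigma: float    # estimated standard deviation of |Fo|
    f_calc: float   # amplitude calculated from the model, |Fc|
    is_free: bool   # True if set aside for cross-validation


def r_factor(reflections: Iterable[Reflection], scale: float = 1.0) -> float:
    """R = sum(| |Fo| - k|Fc| |) / sum(|Fo|) over the given reflections."""
    refl = list(reflections)
    denominator = sum(r.f_obs for r in refl)
    if denominator == 0:
        raise ValueError("no observed amplitude to normalize against")
    numerator = sum(abs(r.f_obs - scale * r.f_calc) for r in refl)
    return numerator / denominator


def r_work_and_free(
    reflections: Iterable[Reflection],
    sigma_cutoff: float = 2.0,
    scale: float = 1.0,
) -> Tuple[float, float]:
    """Apply the |F|/sigma cutoff, then return (R-work, R-free)."""
    kept = [
        r for r in reflections
        if r.sigma > 0 and r.f_obs / r.sigma > sigma_cutoff
    ]
    work = [r for r in kept if not r.is_free]
    free = [r for r in kept if r.is_free]
    return r_factor(work, scale), r_factor(free, scale)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    data = [
        Reflection(100.0, 5.0, 90.0, False),
        Reflection(50.0, 5.0, 55.0, False),
        Reflection(10.0, 8.0, 30.0, False),   # rejected: |F|/sigma < 2
        Reflection(80.0, 4.0, 60.0, True),
    ]
    r_work, r_free = r_work_and_free(data)
    print(f"R = {r_work:.3f}, R-free = {r_free:.3f}")
```

Because the free reflections are excluded from refinement, R-free is normally higher than R, as it is for all three structures reported here.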

Acknowledgements
We thank J.Kuriyan, K.Johnson, J.Majors, F.S.Mathews and T.M.Lohman for comments on the manuscript, K.Fütterer for help in data collection and phasing, D.Mosbaugh for discussions, T.Ellenberger for the coordinates of the quaternary complex of T7 DNA polymerase, and the staff at the Structural Biology Center beamline of the Advanced Photon Source (APS) at Argonne National Laboratory and the staff of beamline 7.1 at the Stanford Synchrotron Radiation Laboratory for assistance during synchrotron data collection. This work was supported by National Institutes of Health grant GM54033.

References
Abrahams,J.P. and Leslie,A.G.W. (1996) Methods used in the structure determination of bovine mitochondrial F1 ATPase. Acta Crystallogr., D52, 32–42.
Astatke,M., Grindley,N.D.F. and Joyce,C.M. (1995) Deoxynucleoside triphosphate and pyrophosphate binding sites in the catalytically competent ternary complex for the polymerase reaction catalyzed by DNA polymerase I (Klenow fragment). J. Biol. Chem., 270, 1945–1954.
Beese,L.S. and Steitz,T.A. (1991) Structural basis for the 3′–5′ exonuclease activity of Escherichia coli DNA polymerase I: a two metal ion mechanism. EMBO J., 10, 25–33.
Beese,L.S., Derbyshire,V. and Steitz,T.A. (1993) Structure of DNA polymerase I Klenow fragment bound to duplex DNA. Science, 260, 352–355.
Brautigam,C.A. and Steitz,T.A. (1998) Structural and functional insights provided by crystal structures of DNA polymerases and their substrate complexes. Curr. Opin. Struct. Biol., 8, 54–63.
Brünger,A.T. (1992a) X-PLOR (Version 3.1) Manual. The Howard Hughes Medical Institute and Department of Molecular Biophysics and Biochemistry, Yale University, 260 Whitney Avenue, New Haven, CT 06511, USA.
Brünger,A.T. (1992b) The free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature, 355, 472–474.
Brünger,A.T., Kuriyan,J. and Karplus,M. (1987) Crystallographic R-factor refinement by molecular dynamics. Science, 235, 458–460.
Carroll,S.S. and Benkovic,S.J. (1990) Mechanistic aspects of DNA polymerases: Escherichia coli DNA polymerase I (Klenow fragment) as a paradigm. Chem. Rev., 90, 1291–1307.
Carson,M. (1997) Ribbons. Methods Enzymol., 277, 493–505.
Catalano,C.E., Allen,D.J. and Benkovic,S.J. (1990) Interaction of Escherichia coli DNA polymerase I with azidoDNA and fluorescent probes: identification of protein–DNA contacts. Biochemistry, 29, 3612–3621.
Dahlberg,M.E. and Benkovic,S.J. (1991) Kinetic mechanism of DNA polymerase I (Klenow fragment): identification of a second conformational change and evaluation of the internal equilibrium. Biochemistry, 30, 4835–4843.
De La Fortelle,E. and Bricogne,G. (1997) Maximum-likelihood heavy-atom parameter refinement for multiple isomorphous replacement and multiwavelength anomalous diffraction methods. Methods Enzymol., 276, 472–493.
Delarue,M., Poch,O., Tordo,N., Moras,D. and Argos,P. (1990) An attempt to unify the structure of polymerases. Protein Eng., 3, 461–467.
Doublié,S., Tabor,S., Long,A.M., Richardson,C.C. and Ellenberger,T. (1998) Crystal structure of a bacteriophage T7 DNA replication complex at 2.2 Å resolution. Nature, 391, 251–258.
Eom,S.H., Wang,J. and Steitz,T.A. (1996) Structure of Taq polymerase with DNA at the polymerase active site. Nature, 382, 278–281.
Freemont,P.S., Friedman,J.M., Beese,L.S., Sanderson,M.R. and Steitz,T.A. (1988) Cocrystal structure of an editing complex of Klenow fragment with DNA. Proc. Natl Acad. Sci. USA, 85, 8924–8928.
Hall,T.M.T., Porter,J.A., Beachy,P.A. and Leahy,D.J. (1995) A potential catalytic site revealed by the 1.7 Å crystal structure of the amino-terminal signalling domain of Sonic hedgehog. Nature, 378, 212–216.
Hendrickson,W.A., Horton,J.R. and LeMaster,D. (1990) Selenomethionine proteins produced for analysis by multiwavelength anomalous diffraction (MAD): a vehicle for direct determination of three-dimensional structure. EMBO J., 9, 1665–1672.
Hodel,A., Kim,S.-H. and Brünger,A.T. (1992) Model bias in crystal structures. Acta Crystallogr., A48, 851–858.
Johnson,K.A. (1993) Conformational coupling in DNA polymerase fidelity. Annu. Rev. Biochem., 62, 685–713.
Jones,T.A., Zou,J.Y., Cowan,S.W. and Kjeldgaard,M. (1991) Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallogr., A47, 110–119.
Joyce,C.M. and Steitz,T.A. (1994) Function and structure relationships in DNA polymerases. Annu. Rev. Biochem., 63, 777–822.
Kiefer,J.R., Mao,C., Braman,J.C. and Beese,L.S. (1998) Visualizing DNA replication in a catalytically active Bacillus DNA polymerase crystal. Nature, 391, 304–307.
Kornberg,A. and Baker,T.A. (1991) DNA Replication, 2nd edn. W.H.Freeman, New York, NY.
Korolev,S., Nayal,M., Barnes,W., Di Cera,E. and Waksman,G. (1995) Crystal structure of the large fragment of Thermus aquaticus DNA polymerase I at 2.5 Å resolution: structural basis for thermostability. Proc. Natl Acad. Sci. USA, 92, 9264–9268.
Kuchta,R.D., Mizrahi,V., Benkovic,P.A., Johnson,K.A. and Benkovic,S.J. (1987) Kinetic mechanism of DNA polymerase I (Klenow). Biochemistry, 26, 8410–8417.
Kuchta,R.D., Benkovic,P. and Benkovic,S.J. (1988) Kinetic mechanism whereby DNA polymerase I (Klenow) replicates DNA with high fidelity. Biochemistry, 27, 6716–6725.
Li,Y., Kong,Y., Korolev,S. and Waksman,G. (1998) Crystal structures of the Klenow fragment of Thermus aquaticus DNA polymerase I complexed with deoxyribonucleoside triphosphates. Protein Sci., 7, 1116–1123.
Navaza,J. (1994) AMoRe: an automated package for molecular replacement. Acta Crystallogr., A50, 157–163.
Nicholls,A., Sharp,K.A. and Honig,B. (1991) Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Protein Struct. Funct. Genet., 11, 281–296.
Ollis,D.L., Brick,P., Hamlin,R., Xuong,N.G. and Steitz,T.A. (1985) Structure of the large fragment of Escherichia coli DNA polymerase I complexed with dTMP. Nature, 313, 762–766.
Otwinowski,Z. (1993) Oscillation data reduction program. In Sawyers,L., Issacs,N. and Bailey,S. (eds), Proceedings of the CCP4 Study Weekend: Data Collection and Processing. Warrington, DC: SERC Daresbury Laboratory, pp. 56–62.
Patel,S.S., Wong,I. and Johnson,K.A. (1991) Pre-steady-state kinetic analysis of processive DNA replication including complete characterization of an exonuclease-deficient mutant. Biochemistry, 30, 511–525.
Pelletier,H., Sawaya,M.R., Kumar,A., Wilson,S.H. and Kraut,J. (1994) Structures of ternary complexes of rat DNA polymerase β, a DNA template-primer and ddCTP. Science, 264, 1891–1903.
Ramachandran,G.N. and Sasisekharan,V. (1968) Conformations of polypeptides and proteins. Adv. Protein Chem., 23, 283–437.
Rossman,M.G. (1972) In Rossman,M.G. (ed.), The Molecular Replacement. International Scientific Review. Gordon and Breach, New York, NY.
Steitz,T.A. (1993) DNA- and RNA-dependent DNA polymerases. Curr. Opin. Struct. Biol., 3, 31–38.
Steitz,T.A., Smerdon,S., Jäger,J. and Joyce,C.M. (1994) A unified polymerase mechanism for nonhomologous DNA and RNA polymerases. Science, 266, 2022–2025.
Suzuki,M., Baskin,D., Hood,L. and Loeb,L.A. (1996) Random mutagenesis of Thermus aquaticus DNA polymerase I: concordance of immutable sites in vivo with the crystal structure. Proc. Natl Acad. Sci. USA, 93, 9670–9675.
Terwilliger,T.C. and Eisenberg,D. (1983) Unbiased three-dimensional refinement of heavy-atom parameters by correlation of origin-removed Patterson functions. Acta Crystallogr., A39, 813–817.
Terwilliger,T.C., Kim,S.-H. and Eisenberg,D. (1987) Generalized method of determining heavy-atom positions using the difference Patterson function. Acta Crystallogr., A43, 1–5.
Wong,I., Patel,S.S. and Johnson,K.A. (1991) An induced-fit kinetic mechanism for DNA replication fidelity: direct measurement by single-turnover kinetics. Biochemistry, 30, 526–537.

Received August 12, 1998; accepted October 19, 1998