Molecular Cell, Vol. 7, 1177–1189, June, 2001, Copyright 2001 by Cell Press

An Extended RNA Binding Surface through Arrayed S1 and KH Domains in Transcription Factor NusA

Michael Worbs,1 Gleb P. Bourenkov,2 Hans D. Bartunik,2 Robert Huber,1 and Markus C. Wahl1,3

1 Max-Planck-Institut für Biochemie, Abteilung Strukturforschung, Am Klopferspitz 18a, D-82152 Martinsried, Germany
2 MPG-ASMB c/o DESY, Arbeitsgruppe Proteindynamik, Notkestraße 85, D-22603 Hamburg, Germany

3 Correspondence: [email protected]

Summary

The crystal structure of Thermotoga maritima NusA, a transcription factor involved in pausing, termination, and antitermination processes, reveals a four-domain, rod-shaped molecule. An N-terminal α/β portion, a five-stranded β-barrel (S1 domain), and two K-homology (KH) modules create a continuous spine of positive electrostatic potential, suitable for nonspecific mRNA attraction. Homology models suggest how, in addition, specific mRNA regulatory sequences can be recognized by the S1 and KH motifs. An arrangement of multiple S1 and KH domains mediated by highly conserved residues is seen, creating an extended RNA binding surface, a paradigm for other proteins with similar domain arrays. Structural and mutational analyses indicate that the motifs cooperate, modulating strength and specificity of RNA binding.

Introduction

In bacteria, almost all cellular RNA is synthesized during transcription of a DNA template by RNA polymerase (RNAP). The process can be divided into initiation, elongation, and termination phases, during which the essential core component of RNAP, an α2ββ′ tetramer of about 400 kDa, can be influenced by a number of transcription factors (Mooney et al., 1998). During initiation, σ-factors determine the specificity of RNAP for certain promoters (Helmann and Chamberlin, 1988). Upon commencement of elongation, RNAP releases these factors (Greenblatt and Li, 1981a) and becomes responsive to four N-utilization substances (NusA, B, E, and G), which regulate pausing, termination, and antitermination upon encounter of specific mRNA elements. NusA, a putative permanent modifier of RNAP during elongation (Gill et al., 1991a), binds competitively with the σ-factors to the same region of the enzyme, suggesting two factor-defined stages of transcription (Greenblatt and Li, 1981a). Termination of transcription can occur at certain mRNA sequences without the influence of additional factors (intrinsic signals), while other stop signals require the action of Rho (Richardson and Greenblatt, 1996).

The first Nus factors were identified through mutations which prevented protein N-dependent λ growth in Escherichia coli (Friedman and Baron, 1974), hence their acronyms. Specific mRNA sequences, the nut-site (composed of the two elements, boxA and boxB), guide the assembly of the Nus factors and N into a ribonucleoprotein complex, which bestows RNAP with termination resistance (Das and Wolska, 1984). Under certain conditions, an N•NusA complex alone can instruct RNAP to override terminators in close vicinity of a nut site (Chattopadhyay et al., 1995), indicating that NusA embodies the most fundamental host adjuvant. Supporting this notion, lambdoid phages encode another antiterminator, Q, acting in late gene expression (Friedman and Court, 1995), which only requires NusA (Barik and Das, 1990). Most interestingly, the Nus-factor ensemble also cooperates with yet another protein, Nun, of phage HK022, to invoke transcription termination in response to precisely the same nut loci employed by N (Chattopadhyay et al., 1995; Henthorn and Friedman, 1996). Therefore, NusA or the Nus-factor collection seems to conduct RNAP into a state which is open to further modulation by other proteins with the possibility for diverse outcomes (Friedman, 1992).

While the four Nus factors also cooperate in the antitermination of ribosomal (r-) RNA (rrn) operons by a mechanism similar to the N-mediated process (Vogel and Jensen, 1997), NusA alone participates in numerous other transcriptional control events. It regulates pausing of RNAP in the trp and his operons (Chan and Landick, 1989; Farnham et al., 1982; Landick and Yanofsky, 1984) and guides the readthrough of the rpoB and rpoC genes for the RNAP subunits β and β′ (Linn and Greenblatt, 1992).
NusA also stimulates attenuation in the leader region of the S10 operon of some γ-proteobacteria in cooperation with r-protein L4 (Zengel and Lindahl, 1992). In agreement with its exceptional status, NusA is the largest of the Nus proteins (Mr ~55,000 in E. coli), is strictly essential in E. coli, and is present in the Mycoplasma genitalium genome, the presumed minimal set of genes required for bacterial life (Fraser et al., 1995).

Many functions of NusA seem to be compatible with a slowdown of the rate of transcription in response to certain structures on the mRNA (Schmidt and Chamberlin, 1984). NusA-dependent control points are often located between translation start sites and intragenic Rho-dependent terminators. In antitermination, retardation of transcription by NusA could allow translating ribosomes to keep pace with RNAP and mask terminators (Ruteshouser and Richardson, 1989), constituting a link between transcription and translation. For termination, NusA-induced pausing may enable Rho to catch up with the elongating RNAP to stall the enzyme. A direct interplay between NusA and Rho function is underlined by the observation that an ordinarily lethal NusA mutation is viable provided that Rho levels are concomitantly reduced (Zheng and Friedman, 1994).

Five functional areas have been defined in the best-characterized E. coli NusA (ecoNusA). An N-terminal domain (NTD) was most strongly implicated in RNAP binding (Mah et al., 1999). Sequence comparisons suggested three consecutive RNA binding modules, an S1 and two KH domains, in the central portion of ecoNusA (Bycroft et al., 1997; Gibson et al., 1993). A further C-terminal domain (CTD) not present in many other bacterial species was shown to be involved in the interaction with RNAP subunit α and λ protein N (Mah et al., 2000). The nusA134 mutation gives rise to a protein truncated after residue 343 and results in a temperature-sensitive but otherwise normal phenotype (Tsugawa et al., 1988), indicating that the CTD is dispensable for NusA functions. Of particular interest is the area encompassing the S1 and KH motifs, as these domains occur in many nucleic acid binding proteins, both alone and tandemly repeated. The ubiquity of such arrays suggests that they are used as a general tool to adjust the specificity and strength of the RNA•protein interactions. However, no detailed picture existed of KH and S1 domains repeated in tandem or in combination.

In this work, we present the high-resolution crystal structure of NusA from Thermotoga maritima (tmaNusA). While tmaNusA lacks the CTD, its high homology (52%) to the remainder of ecoNusA makes it an excellent model for the minimal vital portion of the protein and allows reconciliation of biochemical findings on the E. coli homolog. Our results suggest that RNA and RNAP binding sites on the protein can be attributed to disparate surfaces. Based on comparison with previous structures, RNA binding pockets on the S1 and KH domains are seen to associate into an expansive binding surface. This enlarged composite binding site extends into the N-terminal domain via a positively charged spine alongside NusA. Mutational analyses are in agreement with these interpretations. The structure paves the way for further detailed functional dissections of NusA.
Results and Discussion

Structure Solution and Quality of the Model

The present structure is supported by the outcome of two independent MAD approaches (Figure 1a) using a single mercury site or four selenium positions. A number of flexible loops and the three most C-terminal residues could only be fitted into low-contoured experimental maps and acquired high temperature factors during refinement. Combinations of the independent experimental maps did not improve these weak-density regions. Nevertheless, the global path of the main chain was unquestionable. Side chains were ambiguous in a long connector helix (α3, residues 108–122) and in regions neighboring mobile loops. Almost all of the poorly defined regions map to the N-terminal 130 residues, while the remainder of the structure is extremely clear. The significant portion of residues which fall into poorly outlined, unstructured regions may be an explanation for the free R factor, which converged at around 31% (Table 1). The mean residual error of the coordinates was estimated at 0.35 Å by a Luzzati analysis (Luzzati, 1952). All residues outside the preferred φ/ψ regions were either very well defined in the experimental electron density or fell into flexible loop regions.

Structural Overview

tmaNusA is a highly elongated molecule with dimensions of 115 Å × 28 Å × 25 Å, due to four domains arranged in a linear manner (Figure 1b). In the NTD (residues 1–105), two α helices lie alongside the first and last strand of a three-stranded antiparallel β sheet. Its secondary structure elements are connected by extensive loops. Although a few hydrophobic residues of α1 (V23, L27, and L31) interact with V113 and L114 of α3 and with L6 at the N terminus, hydrophilic residues (E7 and Q11) are interspersed and disturb the formation of a regular hydrophobic core. Furthermore, the layer-like fold disfavors the formation of a globular domain. The conformation in this part of the protein is therefore rather flexible and may be critically stabilized by crystal contacts. The DALI server (Holm and Sander, 1993) detected no relevant resemblance to a known fold. The NTD is linked by a long helix, α3, and an extended irregular stretch (residues 123–135) to an S1 homology motif (residues 132–199). The latter domain, named after its identification in r-protein S1 (Subramanian, 1983), folds into a five-stranded antiparallel β barrel with Greek key topology and a small 3₁₀ helix following the third strand, β6. C-terminal to the S1 domain, NusA features two consecutive K-homology motifs (residues 200–276 and 277–344) first discovered in pre-mRNA binding protein K (Siomi et al., 1993). They consist of three-stranded mixed β sheets packed against three (KH2) to four (KH1) α helices on one side. The surface potential of NusA displays a prominent asymmetry (Figure 1c). A strongly positively charged ribbon stretches along one side of the molecule. About 30 amino acids contribute to this band, located in helix α1 of the NTD, in the hinge helix α3, in the connecting loops between the β strands and the 3₁₀ helix of the S1 domain, and in β10, β11, and α10 of the two KH domains.
Because the nature of many of the side chains is conserved (Figure 2), the positive flank is a feature of all NusA proteins. Since nucleic acid binding proteins often use basic residues to contact the sugar-phosphate backbone, the nascent mRNA might be guided to this face of NusA. On the opposite side, NusA presents an area of mixed character, involving the connecting part between the N-terminal module and the S1 domain (F127 to T138), the first α-helical segment of KH1 (L210, E214, and E216), and some residues between the two KH domains (D277 and D278). The latter area might be partly used for sequence-specific RNA recognition, but could also interact with other protein components of the transcriptional apparatus (see below).

RNA Binding by NusA

General Considerations

There has been conflicting evidence regarding the nucleic acid interactions of NusA. Sequence alignments had revealed regions homologous to S1 and KH domains (Bycroft et al., 1997; Gibson et al., 1993), now corroborated by the structure. Variants of both motifs were shown to bind boxA or certain small RNA hairpins (Lewis et al., 2000; Mogridge and Greenblatt, 1998), in agreement with NusA contacting the nut-site RNA in antitermination complexes (Das and Wolska, 1984; Horwitz et al., 1987). A regulatory sequence similar to boxA was also found to be involved in antitermination of the rrn operons (Aksoy et al., 1984; Berg et al., 1989; Li et al., 1984), but it may rather bind to NusB and NusE (Nodwell and Greenblatt, 1993). Other RNA signals in response to which NusA increases the efficiency of transcription termination are characterized by a stem loop (Farnham et al., 1982; Greenblatt et al., 1981), but a direct interaction was never demonstrated. However, NusA could be cross-linked to single-stranded nascent mRNA in transcription complexes without sequence preferences (Liu and Hanna, 1995). Most confusingly, it has been difficult to reproducibly demonstrate RNA association of isolated NusA. Recently, Greenblatt and coworkers showed that the extra CTD of ecoNusA serves as an autoinhibitor of RNA binding (Mah et al., 2000). Nut-RNA binding was unveiled upon partial CTD deletions and was abolished by alterations in either the boxA or boxB sequences, emphasizing the specificity of the interaction. In transcription complexes, the CTD is sequestered by RNAP subunit α or protein N, similarly unmasking the RNA interaction potential of NusA. In the following, we will delineate nucleic acid binding areas on the S1 and KH domains and demonstrate how they, together with additional putative contacts in the NTD, are clustered on one side of the protein, creating an extended, mosaic interaction platform.

OB Fold Proteins Suggest an RNA Interaction Mode for the S1 Domain

S1 domains belong to the superfamily of OB fold proteins and have been seen in the structures of E. coli polynucleotide phosphorylase (PNPase) (Bycroft et al., 1997) and the major cold shock protein (CspB) (Schindelin et al., 1993).
All presently known S1 structures were determined in the absence of RNA, but several RNA complex structures were determined for OB fold proteins, like transcription termination factor Rho bound to a cytidylic acid 9-mer (Bogden et al., 1999), r-protein S17 complexed to rRNA (Schluenzen et al., 2000; Wimberly et al., 2000), and type II tRNA synthetases in contact with the anticodon loop of tRNAs (Eiler et al., 1999). Because of the structural similarity, an overlay between, e.g., the OB fold•anticodon loop complex of the Asp-tRNA synthetase (Eiler et al., 1999) and the S1 domain of NusA allows the identification of putative RNA binding residues in the latter motif (Figure 3a; rms deviation 2.1 Å for 45 Cα atoms). In the modeled complex, the RNA is positioned on strands β5 and β6 complemented by the 3₁₀ helix and the loop between β7 and β8. The outlined area is topologically equivalent to the putative interaction sites in previous S1 structures. It is largely positively charged and participates in the protein’s flank with positive surface potential (Figure 1c). Eleven residues of the S1 domain superimpose quite well with RNA contacting side chains of the tRNA synthetase (R144, M146, G147, W149, R153, E158, R160, P162, K163, K173, and K190). They are mostly located in areas of high sequence diversity (Figure 2). Therefore, the S1 domains of different NusA proteins could be tailored to nonconserved mRNA sequences or bind mRNA in a sequence-independent fashion.

NusA has been shown to bind to RNA containing boxA in vitro (Tsugawa et al., 1985). Suggestively, a similar affinity has been seen for r-protein S1 (Mogridge and Greenblatt, 1998). An A-to-T transversion in boxA abolished N-mediated antitermination with ecoNusA but allowed λ growth with the Salmonella typhimurium homolog (Friedman and Olson, 1983; Friedman et al., 1990). These two NusA proteins are over 94% sequence identical. An insertion of five amino acids in the Salmonella protein between residues 153 and 154 of ecoNusA (the 449 region) is the only significant deviation (Craven et al., 1994), mapping to β5 of the S1 domain. These results are most easily explained if the β5/β6 platform constitutes the boxA binding site of the S1 domain, as supported by the present models. However, unpublished data by D.I. Friedman, J. Greenblatt, and coworkers indicate that the mutant NusA still assembles into a complex with N and nut RNA but does not bind RNAP (personal communication).

The nusA1 (L183R in ecoNusA, I182 in tmaNusA; Friedman and Baron, 1974) and the temperature-sensitive nusA11 mutations (G181D in ecoNusA, V180 in tmaNusA; Nakamura et al., 1986) map to the hydrophobic core of the S1 domain (Figure 1d). Their pleiotropic effects, preventing the NusA-dependent propagation of bacteriophage λ without affecting bacterial proliferation (nusA1) and, in contrast, disrupting normal termination without eliminating λ growth (nusA11), are likely due to a disturbance of the fold. A direct consequence could be impaired mRNA binding. nusA11 suppressor mutations are known (Ito et al., 1991). D181A (nusA1101) restores the hydrophobic nature and small size of the altered amino acid, while the nusA1102 suppressor retains the G181D replacement and additionally changes D84 (D84 in tmaNusA) to Y. D84 is located in the N-terminal RNAP binding domain (see below) and does not engage in interactions with other residues. Contrary to previous suggestions (Ito et al., 1991), residues 84 and 181 are far apart in the tertiary structure, so the effect cannot be explained by a direct interaction.
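The rms deviations quoted for the structural overlays (for example, 2.1 Å for 45 Cα atoms in the S1 comparison) are root-mean-square distances over paired atoms. As a minimal sketch of that arithmetic only, assuming the two coordinate sets have already been superimposed and paired one-to-one, the Python snippet below computes such a value. The function name `rmsd` and the toy coordinates are hypothetical; they are not taken from the NusA model or from the software used in this work.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, pre-superimposed points.

    Each argument is a list of (x, y, z) tuples in Å; element k of one list
    is assumed to correspond to element k of the other (e.g. matched Cα atoms).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two atoms of this pair.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates (Å); every pair is exactly 1 Å apart.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
target = [(0.0, 0.0, 1.0), (3.8, 1.0, 0.0), (7.6, 0.0, -1.0)]
print(rmsd(model, target))
```

The sketch deliberately omits the superposition step itself (finding the rotation and translation that minimize the deviation), which a real comparison would perform before this calculation.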
KH Domains Display a Diverse Topology but Conserved RNA Binding

NMR and crystal structures were determined for isolated KH domains from the mammalian RNA-associated proteins FMR I (Musco et al., 1996), vigilin (Musco et al., 1997), NOVA (Lewis et al., 1999), and hnRNP-K (Baber et al., 1999). Bacterial KH motifs were found in the structures of the E. coli cell cycle regulator ERA (Chen et al., 1999) and in r-protein S3 from Thermus thermophilus (Schluenzen et al., 2000; Wimberly et al., 2000). A comparison with the analogous NusA motifs revealed two distinct topologies (Figure 3c). The sequence of secondary structure elements in bacteria is α(α)β[βααβ], whereas eukaryotes exhibit a [βααβ]βα arrangement. The bracketed βααβ substructures make up a minimal KH core motif (Gibson et al., 1993), which was differently expanded in the different domains of life. The core in all known instances harbors a helix-turn-helix (HTH) motif with a conserved GXXG turn sequence. The recent structure of the third KH domain of the human protein NOVA in complex with an in vitro-selected small RNA hairpin (Lewis et al., 2000) showed that eukaryotic KHs bind RNA between the GXXG motif and a so-called variable loop (Figure 3). In an overlay of the NOVA KH3•RNA complex with the first KH domain in NusA (rms deviation 2.2 Å for 34 Cα atoms) the GXXG motifs match well, while the variable loop of NOVA KH3 superimposes with


Figure 1. Overall Structure (a) Exemplary section of an experimental MAD map at 2.8 Å resolution. (b) Stereo ribbon diagram of tmaNusA. Secondary structure elements and domains are indicated.


Table 1. Crystallographic Data

Data Collection

Crystal                 SeMet       HgCl2       Native
Space group             P4₃2₁2      P4₃2₁2      P4₃2₁2
Unit cell a = b (Å)     115.0       114.9       115.5
Unit cell c (Å)         61.6        63.5        63.8

Data set        Wavelength (Å)  Resolution (Å)  Unique reflections  Redundancy  Completeness (%)  I/σ(I)       Rmergeᵃ (%)
SeMet peak      0.9788          15.0–2.7        12,248              10.8        99.7 (98.9)       39.6 (9.7)   3.3 (14.7)
SeMet edge      0.9793          15.0–2.7        12,248              10.8        99.7 (100)        39.6 (9.6)   3.7 (15.2)
SeMet remote    0.9500          15.0–2.7        12,278              10.7        99.6 (100)        39.6 (10.1)  3.2 (13.7)
HgCl2 peak      1.0000          15.0–2.7        12,210              10.8        99.6 (99.9)       36.4 (5.2)   3.4 (22.9)
HgCl2 edge      1.0085          15.0–2.7        12,181              10.9        99.2 (97.7)       36.4 (5.0)   3.5 (28.1)
HgCl2 remote    0.9500          15.0–2.7        12,246              10.9        99.4 (99.9)       37.4 (5.9)   3.5 (19.7)
Native          0.9500          20–2.1          25,817              7.8         99.8 (99.8)       40.6 (3.5)   7.3 (40.2)

Phasing

                                  SeMet       HgCl2
Resolution (Å)                    15.0–2.8    15.0–2.8
Heavy atom sites                  4           1
FOMᵇ before DM                    0.55        0.27
FOMᵇ after DM (15.0–2.1 Å)        0.83        0.80

Refinement

Resolution (Å)                    20.0–2.1
Reflections                       25,817 (99.8%)
Protein atoms                     2,667
Water oxygens                     375
Rworkᶜ (%)                        24.4
Rfreeᶜ (%)                        31.9
Rmsd bond lengths (Å)             0.011
Rmsd bond angles (°)              1.66

Data in parentheses are for the last 0.1 Å.
ᵃ Rmerge = Σh Σi |I(h,i) − ⟨I(h)⟩| / Σh Σi I(h,i), in which I(h,i) is the intensity value of the ith measurement of h, and ⟨I(h)⟩ is the corresponding value for h for all i measurements; the summation is over all measurements.
ᵇ FOM = figure of merit.
ᶜ Rwork = Σ|Fobs − Fcalc| / Σ|Fobs|, in which Fobs and Fcalc are the observed and calculated structure factor amplitudes, respectively; Rfree was calculated with a random 5% of the data, which was omitted from all stages of the refinement.
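The footnote definitions of Rmerge and Rwork are simple ratios of sums, and a short calculation may make them concrete. The following Python sketch, which assumes intensities grouped by reflection and paired lists of structure factor amplitudes, evaluates both quantities. The function names `r_merge` and `r_factor` and all numbers are hypothetical illustrations; they are not the programs or data used in this work. Rfree would be obtained from the same `r_factor` formula applied only to the reflections held out of refinement.

```python
def r_merge(measurements):
    """Rmerge = sum_h sum_i |I(h,i) - <I(h)>| / sum_h sum_i I(h,i).

    `measurements` maps each reflection h to the list of its repeated
    intensity measurements I(h,i).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in measurements.values():
        mean_intensity = sum(intensities) / len(intensities)  # <I(h)>
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """R = sum |Fobs - Fcalc| / sum |Fobs| over paired amplitudes."""
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Hypothetical toy data, chosen only so the arithmetic is easy to follow.
measurements = {
    (1, 0, 0): [100.0, 110.0, 90.0],  # mean 100, absolute deviations 0 + 10 + 10
    (0, 1, 1): [50.0, 50.0],          # mean 50, no deviation
}
f_obs = [10.0, 20.0, 30.0, 40.0]
f_calc = [9.0, 22.0, 27.0, 40.0]

print(r_merge(measurements))    # 20 / 400
print(r_factor(f_obs, f_calc))  # 6 / 100
```

Multiplying either result by 100 gives the percentage form in which such values are reported in Table 1.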

the C terminus of the NusA KH1 domain (Figure 3b). The connecting peptide between β10 and α6 in KH1, which we refer to as the additional loop, clashes with the nucleic acid in the modeled complex but could easily be rearranged upon binding of the RNA to contribute favorable contacts. The second KH domain contains the same interaction sites but features a smaller additional loop (residues 310–312). We therefore suggest that in bacterial KHs a three-point anchor of the RNA between the GXXG motif, the additional loop, and the C terminus of the module replaces the two-point grip on the RNA between the GXXG turn and the variable loop in eukaryotic KHs. The RNA binding site of NusA KH1 is located to the side of the positive surface potential (Figure 1c), suggesting an interaction mode which does not rely on nonspecific contacts to the sugar-phosphate backbone. Its large additional loop could foster sequence-specific RNA contacts. The interaction would be further enhanced by the neighboring RNA binding motifs assisting in binding (see below). KH1, therefore, is a prime candidate for sequence-specific RNA binding. The continuity of the electropositive ribbon is maintained by another positively charged patch on KH1 (R199 and R232). In contrast, KH2 is embedded in the positive surface potential, possibly to support unspecific nucleic acid binding. Its design with a short additional loop resembles the KH domain of r-protein S3, which also seems to be responsible for sequence-independent mRNA binding (Schluenzen et al., 2000).

Our assignment of the KH RNA binding sites is corroborated by mutational analyses. Y. Zhou, D.I. Friedman, and J. Greenblatt have converted the first G of each GXXG motif individually to D, resulting in defective RNA binding and, in the case of the KH1 motif, in dysfunctional λ antitermination (personal communication). Intriguingly, the KH1 mutation seems to affect NusA functions and mRNA binding more severely than the KH2 alteration

(c) The left-hand side shows the putative RNA binding surface with positive electrostatic potential. Yellow ribbons mark RNA ligands for the S1 and KH motifs. The right-hand side shows the surface opposite the positively charged stretch. (d) nusA1 (I182R in tmaNusA) and nusA11 (V180D in tmaNusA) mutations mapped to the S1 hydrophobic core. The domain is in the same orientation as in (b). (e) Map of NusA mutations (large spheres) discussed in the text. “449” marks the so-called 449 region (see text). Other mutations are labeled by residue numbers.


Figure 3. RNA Binding Models (a) Overlay of the NusA S1 domain (red) with the N-terminal OB fold motif of Asp-tRNA synthetase (blue) in complex with the tRNA anticodon loop (gold) (Eiler et al., 1999). The view is from the upper left corner of Figure 1b, with the NTD at the bottom and KH1 at the top. (b) Overlay of the first KH domain of NusA (red) and the NOVA KH3 domain (blue) in complex with an RNA hairpin (gold) (Lewis et al., 2000). The view is from the direction of NusA KH2 (bottom of Figure 1b) onto the KH1 β sheet. In both panels, known and putative RNA-contacting peptides are colored cyan and pink for Asp-tRNA synthetase/NOVA and NusA sequences, respectively. GXXG, GXXG turn of the helix-turn-helix motif; VL, variable loop (NOVA); AL, additional loop (NusA). (c) The different topologies of eukaryotic and bacterial KH domains. Green denotes a common core motif, and gray denotes terminal extensions. Yellow stars indicate RNA contact sites.

(Y. Zhou and D.I. Friedman, personal communication), consistent with its special design. The RNA binding models for the S1 and KH modules were deduced from isolated domains. When assembled into the NusA structure, the binding pockets were readily accessible (Figure 1c). Furthermore, a quasicontinuous, extended RNA binding surface ensued (see below).

Additional RNA Interactions in the N-Terminal Domain

The NTD of NusA, usually seen as the main RNAP binding determinant (see below), is also required to bind mRNA in vitro (Mah et al., 2000). In the absence of RNAP, the results cannot be explained by a proximity effect, with the NTD keeping the C-terminal RNA binding portions close to the site of mRNA production. Likewise, because of the loose connection to the S1 domain, a conformational disturbance of the S1/KH region can be excluded. Rather, one can expect direct RNA contacts through the NTD. Significantly, the positive spine expands into the NTD through areas of helices α1 and α3 (Figure 1c), which could constitute an RNA interaction site. It is framed by negative stretches on helix α2 and strands β2 and β3. A conditionally lethal cold-sensitive phenotype is exhibited by the nusA10 mutation (double mutation R104H and E212K) (Schauer et al., 1987). Both affected amino acids (R104 and E211 in tmaNusA) are highly conserved in bacteria (Figure 2). They are located far apart at the N terminus of helix α3 and in the first helix of KH1 (Figure

Figure 2. Sequence Alignment Alignment of bacterial NusA sequences (Thermotoga maritima, Escherichia coli, Haemophilus influenzae, Borrelia burgdorferi, Thermus thermophilus, Mycoplasma genitalium, and Bacillus subtilis). Numbering corresponds to tmaNusA. The background of amino acids identical in at least six of the seven species is colored red, and those with five conserved sequences are yellow. The secondary structure is colored according to the domains (bottom). Black triangles mark residues that are part of the positive flank.


Figure 4. Domain Stacking (a) Closeup of the interactions between S1 and KH1. The view is from the back of Figure 1b, with S1 at the left and KH1 at the right. (b) Details of the KH1–KH2 contacts. The view is from the left in Figure 1b, with KH1 rotated to the left and KH2 to the right. Absolutely conserved residues involved in the interactions are labeled.

1e). Neither residue is involved in intramolecular interactions, suggesting that their importance stems from interactions with other molecules. R104 is therefore a prime candidate for RNA interactions in the positive region of the NTD. Conversely, E211 is to the side of the KH1 RNA binding pocket, so that it is more likely involved in protein interactions (see below).

Domain Arrays Create an Extended RNA Interaction Surface

The present structure of NusA reveals an arrangement of multiple S1 and KH modules. The three C-terminal motifs form a seemingly rigid, tightly associated section. The paired N-terminal halves of the first and fourth strands (β4 and β7; V137, D176, and Y181) of the S1 fold stack against helix α4 (F203 and L210) and the C-terminal portion of α7 (E264 and K266) of KH1 through hydrophobic and hydrogen-bonded interactions (Figure 4a). 1165 Å² of surface area is buried, suggesting a rather complementary match. The deduced binding pockets of S1 and KH1 are linked by a small area of positive surface potential (Figure 1c), creating an enlarged RNA binding surface. For the interface between the two KH modules, again two helices of the C-terminal domain (α8 and α10) are packed against a β sheet of the preceding motif (β9, β10, and β11) (Figure 4b). There exists a clear shape complementarity between the surfaces of the β sheet and the α-helical layer (Figure 5a) with a buried surface area of 1270 Å². The second KH element is rotated by about 100° relative to the first, which allows connection by a short spacer. The two KH domains could cooperate in RNA binding because KH2 presents side chains in the loop following helix α8 for interaction with the KH1-bound nucleic acid (Figure 5b).

In the absence of structural data on arrayed KH domains, possible modes of association were previously deduced from crystal-packing arrangements of isolated modules. In the structures of naked and RNA-complexed NOVA KH monomers (Lewis et al., 1999, 2000), two protomers were connected via their helical faces (Figure 5c). An additional motif in the array would therefore require a novel association. In contrast, a stacking via opposite poles as in NusA can easily link any number of KH domains through equivalent contacts. Furthermore, the NusA KH1–KH2 interface buries distinctly more surface area than the ~900 Å² interface of the crystal packing. Finally, as mentioned, KH2 in NusA may assist the first KH domain in RNA binding, effectively expanding the overall RNA interacting surface. Cooperative binding is not recognizable in the noncrystallographic NOVA dimers, with the two RNAs lying on opposite sides of the complex (Figure 6). However, we cannot exclude the possibility that the different topology of eukaryotic KH elements will translate into a different association of the modules. As a precedent, for proteins containing multiple RNA recognition motifs (RRM), different modes of domain association have been inferred (Varani and Nagai, 1998). In any case, the neighboring S1 and KH RNA binding sites in the present structure show that the domains embody fundamental building blocks which can be arrayed to create extended RNA binding surfaces.

In light of the present structure, mutational analyses underscore the importance of the domain arraying for mRNA interaction. The R199A mutation in ecoNusA was reported to abrogate N-mediated antitermination and reduce interaction with an N•nut-site complex (Mah et al., 2000). This absolutely conserved residue (R198 in


Figure 5. KH Arrays (a) Arrangement of the NusA KH domains via opposite poles. The view is from the bottom right of Figure 1b. (b) Ribbon diagram of the same region with a KH1-bound RNA hairpin rotated ~90° counterclockwise about the longitudinal axis. The prime RNA binding peptides as deduced from Figure 3 are labeled (red). Additional contacts through KH2 are in green. (c) Crystal contacts of two NOVA KH3•RNA complexes. The bottom KH domains in (a) and (c) are in approximately the same orientation.

tmaNusA) is located in the interface between the S1 and the first KH region (Figures 1e and 4a). It is involved in a number of crucial hydrogen bonds and salt bridges to other strictly conserved amino acids, E170, R227, and E264. Replacing R198 will destabilize the relative

Figure 6. Protein Contacts Residues involved in protein contacts in other OB fold and KH domain proteins from the Protein Data Bank mapped in green on the S1/KH region of NusA. RNAs are shown as yellow ribbons. Part of the KH1 RNA binding site appears as a protein interaction face only because of its intimate contact to KH2 in NusA, a consequence of the cooperation between the modules (see text). Consistently, the equivalent surface in KH2 is not labeled. Views are identical to those of Figure 1c.

orientation of the two RNA binding domains and may disturb the continuous course of the positively charged backbone. mRNA binding deficiency in this mutant was demonstrated by gel shift assays (Mah et al., 2000; J. Greenblatt and D.I. Friedman, personal communication). Because R198 is completely buried in the S1–KH1 interface, a direct RNA contact through this residue can be excluded. It is noteworthy that the interface between KH1 and KH2 is also stabilized by a number of strongly conserved interactions involving the side chains of K234 and N324 and main chain atoms of A286 and A288 (Figure 4b). It might be instructive to investigate the impact of mutations in these residues on the activity of NusA. It has been inferred that autoinhibition of RNA binding in ecoNusA is mediated via a negative patch on the CTD (Mah et al., 2000; Mogridge et al., 1995). Consistent with RNA binding to NusA being mediated by the composite positive flank, the CTD could nicely block this area or part thereof through its negative surface.

Regulation of Nucleic Acid Binding

Over 50 KH domain proteins are known (Jensen et al., 2000), many of which contain KH modules in multiples or in combinations with other RNA binding domains. At one extreme, vigilin and its homologs display 15 KH repeats and have been associated with various functions (Kanamori et al., 1998; Kruse et al., 1998; Weber et al., 1997). A distinct group of proteins involved in intracellular RNA localization contains four KH domains in combination with one RRM (Havin et al., 1998). PNPase contains both S1 and KH motifs, albeit in a different order than NusA (Bycroft et al., 1997). Furthermore, S1 domains occur in 6-fold redundancy in r-protein S1 (Subramanian, 1983) and in four repeats in a human DNA binding protein of unknown function (Eklund et al., 1995). The mosaic proteins are employed to recognize a large spectrum of nucleic acids. For KH domains, the scope ranges from mRNA in the case of r-protein S3 (Schluenzen et al., 2000) to heterochromatic DNA for DDP1 (Cortes and Azorin, 2000). Besides specificity, domain multiplication can also influence RNA affinity. For example, for the three-KH domain protein NOVA, there is evidence for increased RNA affinity with all KH modules compared to isolated KH3 (Jensen et al., 2000). The principle of a modular expansion of the RNA binding sites exemplified by the structure of NusA provides a straightforward mechanism for discrimination among different targets and modulation of the affinity. Consistently, there is evidence for vigilin that the composite binding site on the protein is matched by a composite recognition sequence on the mRNA (Kanamori et al., 1998).

Since knowledge about the domain arrangements is a prerequisite for the understanding of the function of these proteins, the dimeric KH array of NusA was expanded by further modules. Straight, highly elongated molecules were the result. Putative nucleic acid binding sites spiraled around these constructs. Performing such expansions with homology-modeled structures of vigilin KH domains did not reveal any conserved interface patterns. While the similar shape of prokaryotic and eukaryotic KH domains suggests that an α/β stacking will also govern the assembly in the latter case (see above), details of the contacts may differ from the prokaryotic situation.

RNA Polymerase and Transcription Factor Binding

The capability of NusA to stimulate pausing, termination, or antitermination is strongly correlated with its binding to core RNAP (Mah et al., 1999). Furthermore, NusA interacts with protein N and NusB in antitermination complexes (Greenblatt and Li, 1981b). Deletion experiments revealed that an ecoNusA variant containing only the N-terminal 136 residues retains the essential RNAP binding determinants but exhibits no antitermination activity (Mah et al., 1999). While interactions with the RNAP α subunits (Liu et al., 1996) seem to be restricted to the C-terminal regulatory extension of ecoNusA (Mah et al., 2000) not present in tmaNusA, targeted protein footprinting (Traviglia et al., 1999) and binding studies with immobilized NusA (Liu et al., 1996) indicated that NusA contacts the β and β′ subunits of RNAP at positions overlapping the σ70 binding site. Footprints on β′ are located in the N-terminal region of the polypeptide, which is not seen in the Thermus aquaticus RNAP structure (Zhang et al., 1999). Conversely, footprints on β were distributed over 90 Å. Clearly, the NusA N-terminal domain alone in the present conformation (longest dimension ~55 Å) cannot account for these dispersed sites. Other areas of NusA have to contribute and the protein has to adopt a rod-like structure, as seen in the present work. An elongated shape for NusA is corroborated by hydrodynamic studies (Gill et al., 1991b). As a consequence, the S1 and KH domains presumably will also be in contact with RNAP.

Part of the detrimental effects of the nusA10 mutation (see above) may be due to impaired binding to RNAP or other transcription factors. The affected highly conserved E211 seems to be a good candidate for a contact residue (Figure 1e). It is particularly noteworthy that the corresponding portion of the KH domain of r-protein S3 is involved in interactions with S14 in the 30S structure (Schluenzen et al., 2000; Wimberly et al., 2000). In order to further delineate possible protein interaction surfaces on NusA, we have searched for and superimposed structures encompassing KH and OB fold motifs in which these elements are covalently or noncovalently associated with other polypeptides (NusA has been included). Intra- or intermolecular protein contacts of these domains take place predominantly through areas which are opposite or to the side of the deduced RNA interaction sites (Figure 6). RNAP has been observed to undergo significant conformational changes upon switching from the σ- to the NusA-defined stage (Gill et al., 1991a). The unrestrained NTD in the present structure could enable NusA to respond to or even induce such changes in RNAP. Malleability of RNAP binding sites is also seen in other transcription factors, like σ70 (Malhotra et al., 1996; Severinova et al., 1996), and may be suitable to communicate signals from the nucleic acids to the enzyme through conformational changes. Although NusA and σ70 bind to overlapping RNAP sites, their interaction modules or known parts thereof (Malhotra et al., 1996) are structurally unrelated. Therefore, they might influence each other's binding allosterically or contact different subsets of RNAP residues.

Conclusions

The present crystal structure suggests that NusA creates an extended, mosaic RNA interaction surface by domain arraying. Consistently, all portions of the molecule have been implicated by mutational analyses in RNA binding: deletions of the NTD, the R199A mutation in the interface of S1 and KH1, and point mutations in the GXXG motifs of both KH elements all impair binding to RNA sequences recognized by the wild-type protein. The domain interactions spatially arrange the individual binding pockets, possibly enabling NusA to recognize a variety of combinations of signals cooperatively. The wide distribution of mosaic interaction surfaces in nucleic acid binding proteins suggests that they are useful in modulating RNA binding strength and specificity. We suggest that as a result of the domain stacking, a positively charged flank of NusA marks the trace of the nascent mRNA and allows the factor to continuously scan for regulatory structures. Consistently, the R-rich sequence between residues 164 and 191 in ecoNusA was implicated in nonspecific RNA binding (Ito et al., 1991). mRNA may be driven along this surface by the ongoing polymerization in the RNAP active site. By analogy, translation initiation factor eIF1A, an OB fold protein with another composite RNA binding site, is employed in scanning for the initiation codon (Battiste et al., 2000). Cross-links of nucleotides >14 upstream of the newly synthesized mRNA residue to the α subunit of RNAP are not seen in the presence of NusA (Mooney et al., 1998), in agreement with the factor relieving RNAP of the nascent mRNA directly after its emergence from an exit tunnel. After initial baiting of the mRNA, the S1 and KH modules may pick out regulatory sequences or combinations of signals. In this model, the domain stacking leads to superimposition of specific binding sites on an area that nonspecifically attracts RNA. The prominent asymmetry in the surface potential of NusA is in accord with concomitant binding to RNA, RNAP, and transcription factors. Previously observed protein contacts involving OB fold and KH domains mainly implicate the surface opposite the RNA binding sites in protein–protein interactions. It has been suggested for transcription factor GreA (Stebbins et al., 1995) that a patch of positive potential mediates attachment to the strongly electronegative RNAP. Whether NusA can substitute extended contacts for such a concise positive patch, or whether RNAP is still contacted through the positively charged surface, requires further experimentation. Structures of physiological transcription complexes will be necessary in order to unambiguously identify all functional sites in NusA. Interestingly, many of the structural principles which seem to underlie NusA function have analogs in RRM-containing proteins (Varani and Nagai, 1998); RRM motifs often occur in multiples and have been likened to an RNA binding counterpart of zinc finger or homeobox proteins for DNA. Individual RRMs within an array can provide specificity for certain RNAs by binding special signals. Multiplication of the elements results in extended RNA recognition surfaces lending themselves to the cooperative interaction with diverse RNAs. Finally, like NusA, RRM proteins often depend on concomitant binding to other proteins for which they carry auxiliary domains, and some of these elements seem to assist in RNA binding.

Experimental Procedures

Cloning, Expression, and Purification

The gene for NusA was PCR extracted from T.
maritima total genomic DNA and inserted into the pET22b(+) expression vector (Novagen, Madison, WI) by standard techniques. The authenticity of the construct was verified by total sequencing of the promoter and insert regions. After transformation, E. coli BL21(DE3)/pLysS cells were grown at 37°C and induced at an OD595 of 0.8 with 1 mM isopropyl-β-D-thiogalactopyranoside (IPTG). Following 6 hr of induction, the cells were harvested, resuspended in 50 mM Tris-HCl (pH 7.6), 3 mM EDTA (buffer A), and stored at −70°C. For purification, the cells were ultrasound disrupted, and the 30,000 × g supernatant was subjected to a heat treatment (30 min, 95°C). The heat-stable fraction was fractionated on a 200 ml DEAE sepharose FF column (Amersham-Pharmacia, Uppsala, Sweden) with a 1.8 l gradient from buffer A to buffer A plus 700 mM NaCl. Peak fractions were identified by SDS–PAGE, adjusted to 1 M ammonium sulfate, loaded onto a 75 ml phenyl sepharose HP column (Amersham-Pharmacia), equilibrated with buffer A plus 1 M ammonium sulfate, and eluted with a 750 ml gradient to buffer A. The pool was concentrated to ~40 mg/ml by ultrafiltration (Amicon membrane, 10 kDa cutoff; Millipore, Eschborn, Germany). The buffer was exchanged for 10 mM HEPES (pH 7.0) via a NAP-25 column (Amersham-Pharmacia), and the protein concentration was adjusted to ~9 mg/ml. Shock-frozen aliquots were stored at −70°C.

Crystallization and Data Collection

Crystals of NusA were obtained by sitting drop vapor diffusion, mixing 3 µl protein stock with 1.5 µl reservoir (1.8 M ammonium sulfate, 2%–5% PEG 400, 100 mM HEPES [pH 6.8–7.4]). Crystals belonging to space group P4₃2₁2 with a = b = 115 Å, c = 64 Å appeared after 2 weeks at room temperature and could be conserved in a liquid nitrogen stream after transfer into perfluoropolyether (PFO-X125/03; Lancaster, England) and removal of residual mother liquor. They diffracted to 2.1 Å resolution at a synchrotron source. For MAD phasing, crystals were either conventionally soaked with HgCl2, or selenomethionine-derivatized protein was crystallized under the above conditions. Three-wavelength MAD experiments were conducted at beamline BW6 of DESY (Hamburg, Germany), employing the K edge of selenium and the LIII edge of mercury (Table 1).

Structure Solution and Refinement

Identification and refinement of the heavy atom positions from anomalous difference Patterson maps and phasing of the data were performed with programs RSPS and MLPHARE of the CCP4 collection (1994). Initial phases were calculated to 2.8 Å resolution and expanded to 2.1 Å by solvent flattening (program DM). Both data sets yielded experimental maps of comparable quality which were readily interpretable in terms of a backbone trace and could subsequently be decorated with side chains. Models in which all residues in the molecule were accounted for were built with MAIN (Turk, 1996) and refined with CNS (Brünger et al., 1998) according to standard strategies, including positioning of 375 water molecules. No data were excluded, and the free R factor (Rfree; 5% of the observed reflections) was continuously monitored during the refinement (Table 1).

Acknowledgments

M.C.W. was supported by postdoctoral fellowships from the Deutsche Forschungsgemeinschaft and the Engelhorn-Stiftung. We thank Christian Riedel for his help in purification and crystallization of tmaNusA.

Received December 21, 2000; revised March 22, 2001.

References

Aksoy, S., Squires, C.L., and Squires, C. (1984).
Evidence for antitermination in Escherichia coli rRNA transcription. J. Bacteriol. 159, 260–264.
Baber, J.L., Libutti, D., Levens, D., and Tjandra, N. (1999). High precision solution structure of the C-terminal KH domain of heterogeneous nuclear ribonucleoprotein K, a c-myc transcription factor. J. Mol. Biol. 289, 949–962.
Barik, S., and Das, A. (1990). An analysis of the role of host factors in transcription antitermination in vitro by the Q protein of coliphage lambda. Mol. Gen. Genet. 222, 152–156.
Battiste, J.L., Pestova, T.V., Hellen, C.U., and Wagner, G. (2000). The eIF1A solution structure reveals a large RNA-binding surface important for scanning function. Mol. Cell 5, 109–119.
Berg, K.L., Squires, C., and Squires, C.L. (1989). Ribosomal RNA operon anti-termination. Function of leader and spacer region box B-box A sequences and their conservation in diverse micro-organisms. J. Mol. Biol. 209, 345–358.
Bogden, C.E., Fass, D., Bergman, N., Nichols, M.D., and Berger, J.M. (1999). The structural basis for terminator recognition by the Rho transcription termination factor. Mol. Cell 3, 487–493.
Brünger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros, P., Grosse-Kunstleve, R.W., Jiang, J.S., Kuszewski, J., Nilges, M., Pannu, N.S., et al. (1998). Crystallography and NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr. D Biol. Crystallogr. 54, 905–921.
Bycroft, M., Hubbard, T.J., Proctor, M., Freund, S.M., and Murzin, A.G. (1997). The solution structure of the S1 RNA binding domain: a member of an ancient nucleic acid-binding fold. Cell 88, 235–242.
Chan, C.L., and Landick, R. (1989). The Salmonella typhimurium his operon leader region contains an RNA hairpin-dependent transcription pause site. Mechanistic implications of the effect on pausing of altered RNA hairpins. J. Biol. Chem. 264, 20796–20804.
Chattopadhyay, S., Garcia-Mena, J., DeVito, J., Wolska, K., and Das, A. (1995). Bipartite function of a small RNA hairpin in transcription antitermination in bacteriophage lambda. Proc. Natl. Acad. Sci. USA 92, 4061–4065.
Chen, X., Court, D.L., and Ji, X. (1999). Crystal structure of ERA: a GTPase-dependent cell cycle regulator containing an RNA binding motif. Proc. Natl. Acad. Sci. USA 96, 8396–8401.
CCP4 (Collaborative Computational Project 4). (1994). The CCP4 Suite: Programs for protein crystallography. Acta Crystallogr. D Biol. Crystallogr. 50, 760–763.
Cortes, A., and Azorin, F. (2000). DDP1, a heterochromatin-associated multi-KH domain protein of Drosophila melanogaster, interacts specifically with centromeric satellite DNA sequences. Mol. Cell. Biol. 20, 3860–3869.
Craven, M.G., Granston, A.E., Schauer, A.T., Zheng, C., Gray, T.A., and Friedman, D.I. (1994). Escherichia coli-Salmonella typhimurium hybrid nusA genes: identification of a short motif required for action of the lambda N transcription antitermination protein. J. Bacteriol. 176, 1394–1404.
Das, A., and Wolska, K. (1984). Transcription antitermination in vitro by lambda N gene product: requirement for a phage nut site and the products of host nusA, nusB, and nusE genes. Cell 38, 165–173.
Eiler, S., Dock-Bregeon, A., Moulinier, L., Thierry, J.C., and Moras, D. (1999). Synthesis of aspartyl-tRNA(Asp) in Escherichia coli – a snapshot of the second step. EMBO J. 18, 6532–6541.
Eklund, E.A., Lee, S.W., and Skalnik, D.G. (1995). Cloning of a cDNA encoding a human DNA-binding protein similar to ribosomal protein S1. Gene 155, 231–235.
Farnham, P.J., Greenblatt, J., and Platt, T. (1982). Effects of NusA protein on transcription termination in the tryptophan operon of Escherichia coli. Cell 29, 945–951.
Fraser, C.M., Gocayne, J.D., White, O., Adams, M.D., Clayton, R.A., Fleischmann, R.D., Bult, C.J., Kerlavage, A.R., Sutton, G., Kelley, J.M., et al. (1995). The minimal gene complement of Mycoplasma genitalium. Science 270, 397–403.
Friedman, D.I. (1992). Interaction between bacteriophage λ and its Escherichia coli host. Curr. Opin. Genet. Dev. 2, 727–738.
Friedman, D.I., and Baron, L.S. (1974). Genetic characterization of a bacterial locus involved in the activity of the N function of phage lambda. Virology 58, 141–148.
Friedman, D.I., and Olson, E.R. (1983). Evidence that a nucleotide sequence, “boxA,” is involved in the action of the NusA protein. Cell 34, 143–149.
Friedman, D.I., and Court, D.L. (1995). Transcription antitermination: the lambda paradigm updated. Mol. Microbiol. 18, 191–200.
Friedman, D.I., Olson, E.R., Johnson, L.L., Alessi, D., and Craven, M.G. (1990). Transcription-dependent competition for a host factor: the function and optimal sequence of the phage lambda boxA transcription antitermination signal. Genes Dev. 4, 2210–2222.
Gibson, T.J., Thompson, J.D., and Heringa, J. (1993). The KH domain occurs in a diverse set of RNA-binding proteins that include the antiterminator NusA and is probably involved in binding to nucleic acid. FEBS Lett. 324, 361–366.
Gill, S.C., Weitzel, S.E., and von Hippel, P.H. (1991a). Escherichia coli sigma 70 and NusA proteins. I. Binding interactions with core RNA polymerase in solution and within the transcription complex. J. Mol. Biol. 220, 307–324.
Gill, S.C., Yager, T.D., and von Hippel, P.H. (1991b). Escherichia coli sigma 70 and NusA proteins. II. Physical properties and self-association states. J. Mol. Biol. 220, 325–333.
Greenblatt, J., and Li, J. (1981a). Interaction of the sigma factor and the nusA gene protein of E. coli with RNA polymerase in the initiation-termination cycle of transcription. Cell 24, 421–428.
Greenblatt, J., and Li, J. (1981b). The nusA gene protein of Escherichia coli. Its identification and a demonstration that it interacts with the gene N transcription antitermination protein of bacteriophage lambda. J. Mol. Biol. 147, 11–23.

Greenblatt, J., McLimont, M., and Hanly, S. (1981). Termination of transcription by nusA gene protein of Escherichia coli. Nature 292, 215–220.
Havin, L., Git, A., Elisha, Z., Oberman, F., Yaniv, K., Schwartz, S.P., Standart, N., and Yisraeli, J.K. (1998). RNA-binding protein conserved in both microtubule- and microfilament-based RNA localization. Genes Dev. 12, 1593–1598.
Helmann, J.D., and Chamberlin, M.J. (1988). Structure and function of bacterial sigma factors. Annu. Rev. Biochem. 57, 839–872.
Henthorn, K.S., and Friedman, D.I. (1996). Identification of functional regions of the Nun transcription termination protein of phage HK022 and the N antitermination protein of phage λ using hybrid nun-N genes. J. Mol. Biol. 257, 9–20.
Holm, L., and Sander, C. (1993). Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233, 123–138.
Horwitz, R.J., Li, J., and Greenblatt, J. (1987). An elongation control particle containing the N gene transcriptional antitermination protein of bacteriophage lambda. Cell 51, 631–641.
Ito, K., Egawa, K., and Nakamura, Y. (1991). Genetic interaction between the beta′ subunit of RNA polymerase and the arginine-rich domain of Escherichia coli nusA protein. J. Bacteriol. 173, 1492–1501.
Jensen, K.B., Musunuru, K., Lewis, H.A., Burley, S.K., and Darnell, R.B. (2000). The tetranucleotide UCAY directs the specific recognition of RNA by the Nova K-homology 3 domain. Proc. Natl. Acad. Sci. USA 97, 5740–5745.
Kanamori, H., Dodson, R.E., and Shapiro, D.J. (1998). In vitro genetic analysis of the RNA binding site of vigilin, a multi-KH domain protein. Mol. Cell. Biol. 18, 3991–4003.
Kruse, C., Grunweller, A., Willkomm, D.K., Pfeiffer, T., Hartmann, R.K., and Muller, P.K. (1998). tRNA is entrapped in similar, but distinct, nuclear and cytoplasmic ribonucleoprotein complexes, both of which contain vigilin and elongation factor 1 alpha. Biochem. J. 329, 615–621.
Landick, R., and Yanofsky, C. (1984). Stability of an RNA secondary structure affects in vitro transcription pausing in the trp operon leader region. J. Biol. Chem. 259, 11550–11555.
Lewis, H.A., Chen, H., Edo, C., Buckanovich, R.J., Yang, Y.Y., Musunuru, K., Zhong, R., Darnell, R.B., and Burley, S.K. (1999). Crystal structures of Nova-1 and Nova-2 K-homology RNA-binding domains. Structure Fold. Des. 7, 191–203.
Lewis, H.A., Musunuru, K., Jensen, K.B., Edo, C., Chen, H., Darnell, R.B., and Burley, S.K. (2000). Sequence-specific RNA binding by a Nova KH domain: implications for paraneoplastic disease and the fragile X syndrome. Cell 100, 323–332.
Li, S.C., Squires, C.L., and Squires, C. (1984). Antitermination of E. coli rRNA transcription is caused by a control region segment containing lambda nut-like sequences. Cell 38, 851–860.
Linn, T., and Greenblatt, J. (1992). The NusA and NusG proteins of Escherichia coli increase the in vitro readthrough frequency of a transcriptional attenuator preceding the gene for the beta subunit of RNA polymerase. J. Biol. Chem. 267, 1449–1454.
Liu, K., and Hanna, M.M. (1995). NusA contacts nascent RNA in Escherichia coli transcription complexes. J. Mol. Biol. 247, 547–558.
Liu, K., Zhang, Y., Severinov, K., Das, A., and Hanna, M.M. (1996). Role of Escherichia coli RNA polymerase alpha subunit in modulation of pausing, termination, and antitermination by the transcription elongation factor NusA. EMBO J. 15, 150–161.
Luzzati, P. (1952). Traitement statistique des erreurs dans la détermination des structures cristallines. Acta Crystallogr. 5, 802–810.
Mah, T.F., Li, J., Davidson, A.R., and Greenblatt, J. (1999). Functional importance of regions in Escherichia coli elongation factor NusA that interact with RNA polymerase, the bacteriophage lambda N protein and RNA. Mol. Microbiol. 34, 523–537.
Mah, T.-F., Kuznedelov, K., Mushegian, A., Severinov, K., and Greenblatt, J. (2000). The α-subunit of E. coli RNA polymerase activates RNA binding by NusA. Genes Dev. 14, 2664–2675.
Malhotra, A., Severinova, E., and Darst, S.A. (1996). Crystal structure of a sigma 70 subunit fragment from E. coli RNA polymerase. Cell 87, 127–136.

Mogridge, J., and Greenblatt, J. (1998). Specific binding of Escherichia coli ribosomal protein S1 to boxA transcriptional antiterminator RNA. J. Bacteriol. 180, 2248–2252.
Mogridge, J., Mah, T.F., and Greenblatt, J. (1995). A protein-RNA interaction network facilitates the template-independent cooperative assembly on RNA polymerase of a stable antitermination complex containing the lambda N protein. Genes Dev. 9, 2831–2845.
Mooney, R.A., Artsimovitch, I., and Landick, R. (1998). Information processing by RNA polymerase: recognition of regulatory signals during RNA chain elongation. J. Bacteriol. 180, 3265–3275.
Musco, G., Stier, G., Joseph, C., Castiglione Morelli, M.A., Nilges, M., Gibson, T.J., and Pastore, A. (1996). Three-dimensional structure and stability of the KH domain: molecular insights into the fragile X syndrome. Cell 85, 237–245.
Musco, G., Kharrat, A., Stier, G., Fraternali, F., Gibson, T.J., Nilges, M., and Pastore, A. (1997). The solution structure of the first KH domain of FMR1, the protein responsible for the fragile X syndrome. Nat. Struct. Biol. 4, 712–716.
Nakamura, Y., Mizusawa, S., Tsugawa, A., and Imai, M. (1986). Conditionally lethal nusAts mutation of Escherichia coli reduces transcription termination but does not affect antitermination of bacteriophage lambda. Mol. Gen. Genet. 204, 24–28.
Nodwell, J.R., and Greenblatt, J. (1993). Recognition of boxA antiterminator RNA by the E. coli antitermination factors NusB and ribosomal protein S10. Cell 72, 261–268.
Richardson, J.P., and Greenblatt, J. (1996). Control of RNA chain elongation and termination. In Escherichia coli and Salmonella typhimurium, F.C. Neidhardt, ed. (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press), pp. 386–406.
Ruteshouser, E.C., and Richardson, J.P. (1989). Identification and characterization of transcription termination sites in the Escherichia coli lacZ gene. J. Mol. Biol. 208, 23–43.
Schauer, A.T., Carver, D.L., Bigelow, B., Baron, L.S., and Friedman, D.I. (1987). Lambda N antitermination system: functional analysis of phage interactions with the host NusA protein. J. Mol. Biol. 194, 679–690.
Schindelin, H., Marahiel, M.A., and Heinemann, U. (1993). Universal nucleic acid-binding domain revealed by crystal structure of the B. subtilis major cold-shock protein. Nature 364, 164–168.
Schluenzen, F., Tocilj, A., Zarivach, R., Harms, J., Gluehmann, M., Janell, D., Bashan, A., Bartels, H., Agmon, I., Franceschi, F., and Yonath, A. (2000). Structure of functionally activated small ribosomal subunit at 3.3 Å resolution. Cell 102, 615–623.
Schmidt, M.C., and Chamberlin, M.J. (1984). Amplification and isolation of Escherichia coli NusA protein and studies of its effects on in vitro RNA chain elongation. Biochemistry 23, 197–203.
Severinova, E., Severinov, K., Fenyo, D., Marr, M., Brody, E.N., Roberts, J.W., Chait, B.T., and Darst, S.A. (1996). Domain organization of the Escherichia coli RNA polymerase sigma 70 subunit. J. Mol. Biol. 263, 637–647.
Siomi, H., Matunis, M.J., Michael, W.M., and Dreyfuss, G. (1993). The pre-mRNA binding K protein contains a novel evolutionarily conserved motif. Nucleic Acids Res. 21, 1193–1198.
Stebbins, C.E., Borukhov, S., Orlova, M., Polyakov, A., Goldfarb, A., and Darst, S.A. (1995). Crystal structure of the GreA transcript cleavage factor from Escherichia coli. Nature 373, 636–640.
Subramanian, A.R. (1983). Structure and functions of ribosomal protein S1. Prog. Nucleic Acid Res. Mol. Biol. 28, 101–142.
Traviglia, S.L., Datwyler, S.A., Yan, D., Ishihama, A., and Meares, C.F. (1999). Targeted protein footprinting: Where different transcription factors bind to RNA polymerase. Biochemistry 38, 15774–15778.
Tsugawa, A., Kurihara, T., Zuber, M., Court, D.L., and Nakamura, Y. (1985). E. coli NusA protein binds in vitro to an RNA sequence immediately upstream of the boxA signal of bacteriophage lambda. EMBO J. 4, 2337–2342.
Tsugawa, A., Saito, M., Court, D.L., and Nakamura, Y. (1988). nusA amber mutation that causes temperature-sensitive growth of Escherichia coli. J. Bacteriol. 170, 908–915.
Turk, D. (1996). An interactive software for density modifications, model building, structure refinement and analysis. In P.E. Bourne and K. Watenpaugh (eds.), Proceedings from the 1996 Meeting of the International Union of Crystallography Macromolecular Computing School.
Varani, G., and Nagai, K. (1998). RNA recognition by RNP proteins during RNA processing. Annu. Rev. Biophys. Biomol. Struct. 27, 407–445.
Vogel, U., and Jensen, K.F. (1997). NusA is required for ribosomal antitermination and for modulation of the transcription elongation rate of both antiterminated RNA and mRNA. J. Biol. Chem. 272, 12265–12271.
Weber, V., Wernitznig, A., Hager, G., Harata, M., Frank, P., and Wintersberger, U. (1997). Purification and nucleic-acid-binding properties of a Saccharomyces cerevisiae protein involved in the control of ploidy. Eur. J. Biochem. 249, 309–317.
Wimberly, B., Brodersen, D.E., Clemons, W.M., Jr., Morgan-Warren, R.J., Carter, A.P., Vonrhein, C., Hartsch, T., and Ramakrishnan, V. (2000). Structure of the 30S ribosomal subunit. Nature 407, 327–329.
Zengel, J.M., and Lindahl, L. (1992). Ribosomal protein L4 and transcription factor NusA have separable roles in mediating termination of transcription within the leader of the S10 operon of Escherichia coli. Genes Dev. 6, 2655–2662.
Zhang, G., Campbell, E.A., Minakhin, L., Richter, C., Severinov, K., and Darst, S.A. (1999). Crystal structure of Thermus aquaticus core RNA polymerase at 3.3 Å resolution. Cell 98, 811–824.
Zheng, C., and Friedman, D.I. (1994). Reduced Rho-dependent transcription termination permits NusA-independent growth of Escherichia coli. Proc. Natl. Acad. Sci. USA 91, 7543–7547.

Accession Numbers

The atomic coordinates for tmaNusA have been submitted to the Protein DataBank under ID code 1hh2.