Molecular Microbiology (1998) 28(4), 847–857
Domain structure and RNA annealing activity of the Escherichia coli regulatory protein StpA
Michael E. Cusick and Marlene Belfort*
Molecular Genetics Program, Wadsworth Center, New York State Department of Health and School of Public Health, State University of New York at Albany, PO Box 22002, Albany, NY 12201-2002, USA.
Summary
The Escherichia coli regulatory protein StpA bears striking similarity to the chromatin-associated protein H-NS. These two proteins have many structural, functional and mechanistic parallels. Although H-NS is more abundant in the cell, both proteins act as transcriptional regulators, both bind to curved DNA and both restrain DNA supercoils. However, StpA is better able to promote RNA annealing and trans-splicing in vitro. In this study, phylogenetic analyses and experiments to examine the protease sensitivity of StpA and H-NS suggest a similar structure for the two proteins. Both proteins consist of two structured domains separated by an exposed protease-sensitive linker. The N-terminal (StpA-NterL) and C-terminal (StpA-CterL) domains of StpA, as well as the full-length StpA and H-NS proteins, were cloned, overproduced in E. coli and purified to homogeneity. StpA-CterL, but not StpA-NterL, promotes strand annealing of complementary RNA oligonucleotides and in vitro trans-splicing of a model group I intron. Both StpA and StpA-CterL exhibited stronger RNA-modulating activity than H-NS. Phylogenetic analyses showed that the N-terminal and C-terminal domains can exist autonomously. The phylogenetic and experimental data are compatible with a two-domain model for StpA and H-NS, with independently functioning modules joined by a non-conserved linker, and with the observed RNA-related activities residing entirely within the C-terminal domain.
Received 20 January, 1998; revised 23 February, 1998; accepted 2 March, 1998. *For correspondence. E-mail marlene.belfort@wadsworth.org; Tel. (518) 473 3345; Fax (518) 474 3181.
© 1998 Blackwell Science Ltd
Introduction
The Escherichia coli stpA gene was isolated initially as a multicopy suppressor of the phenotype of a splicing-defective td intron of bacteriophage T4 (Zhang and Belfort, 1992; Zhang et al., 1995). Subsequently, stpA was also identified by its ability, when overexpressed, to rescue
the repression of adi gene expression, which is lost in an hns mutant (Shi and Bennett, 1994). It was recognized immediately that StpA is an intraspecies homologue, i.e. a paralogue, of the E. coli H-NS protein (Zhang and Belfort, 1992; Zhang et al., 1996). H-NS has been investigated heavily in recent years (for reviews, see Ussery et al., 1994; Williams and Rimsky, 1997; Atlung and Ingmer, 1997). StpA and H-NS are nearly identical in size (134 and 137 residues respectively) and share 58% identity overall. The identity is greater in the C-terminal region (73% identity from residue 91 onwards) than in the N-terminal region (51% identity to residue 90). Given the substantial sequence similarity between StpA and H-NS, it is no surprise that there are many functional and mechanistic parallels between the two proteins. H-NS acts principally as a transcriptional repressor of many unrelated and unlinked genes (Atlung and Ingmer, 1997; Williams and Rimsky, 1997). StpA also acts as a transcriptional repressor, but only for a subset of the genes affected by H-NS. Both proteins can restrain DNA supercoils, bind to curved DNA and inhibit transcription from a model promoter. Both proteins cross-regulate the expression of each other’s gene and autoregulate their own expression (Sonden and Uhlin, 1996; Zhang et al., 1996). Although both proteins can stimulate strand annealing of complementary RNA molecules and trans -splicing of the group I td intron in vitro, the activity of StpA in these areas is much greater than that of H-NS (Zhang et al., 1996). Another difference between the two proteins is the degree to which they are expressed. Under conditions of exponential growth, the estimated number of H-NS molecules in the cell (approximately 18 000) is orders of magnitude greater than that of StpA molecules (approximately 200) (Spassky et al., 1984; Sonden and Uhlin, 1996; Free and Dorman, 1997; Williams and Rimsky, 1997). 
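The regional identity figures quoted above (overall, N-terminal to residue 90, C-terminal from residue 91) can be illustrated with a small calculation. The following is a minimal sketch, not the procedure used in this work: the function name `percent_identity` is hypothetical, the two aligned strings are invented toy sequences rather than StpA or H-NS, and gapped columns are simply skipped, which is an assumption.

```python
from typing import Optional


def percent_identity(aligned_a: str, aligned_b: str,
                     first: int = 1, last: Optional[int] = None) -> float:
    """Percent identity over residues first..last (1-based, numbered along
    sequence A). Columns with a gap ('-') in either sequence are skipped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    position = 0  # residue number along sequence A; gaps do not advance it
    for a, b in zip(aligned_a, aligned_b):
        if a != "-":
            position += 1
        if a == "-" or b == "-":
            continue
        if position < first or (last is not None and position > last):
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns in the requested region")
    return 100.0 * matches / compared


# Toy alignment (hypothetical sequences, for illustration only).
toy_a = "MKTLLE-RAKQ"
toy_b = "MKSLLEARA-Q"
overall = percent_identity(toy_a, toy_b)
n_region = percent_identity(toy_a, toy_b, first=1, last=5)
c_region = percent_identity(toy_a, toy_b, first=6)
```

Applied to a real StpA/H-NS alignment, the same function would be called once for the whole alignment and once for each side of residue 90; the exact values obtained depend on the alignment and on how gaps are treated.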
Recent investigations of missense and nonsense mutants generated in H-NS have indicated that H-NS has at least two functional domains (Ueguchi et al., 1996; 1997; Williams et al., 1996; Spurio et al., 1997; Williams and Rimsky, 1997; Free et al., 1998). Mutations towards the C-terminal end, between residues 90 and 120, reduce DNA binding, while mutations towards the N-terminal end, roughly residues 12–65, retain DNA binding but have lost H-NS repressor function, presumably owing to loss of protein–protein interaction (Williams et al., 1996; Spurio et al., 1997; Ueguchi et al., 1997). Several of these mutations, when placed into the context of the StpA sequence, show similar defects
in function, providing yet more evidence that the function and domain organization of StpA and H-NS are closely related (Williams et al., 1996; Free et al., 1998). We show here by biochemical means that StpA and H-NS have a two-domain structure, in accordance with the prediction raised by our phylogenetic sequence comparisons and by the genetic results described above. Furthermore, we show that the ability of StpA to promote RNA annealing resides solely in the C-terminal domain, as does its ability to stimulate a trans-splicing reaction in vitro. These data are compatible with a model in which StpA has two structured domains, connected by a linker region, with the C-terminal domain having a discrete and separable strand-annealing function.
Results
StpA has two structural domains
The sequence of the E. coli StpA protein, and of its paralogue H-NS, provides several indications that it may have a loosely structured linker segment connecting two structured domains. First, the region between residues 76 and 90 is the least conserved part of the sequence in the H-NS family (Fig. 1), suggesting the absence of an essential function. Secondly, the same segment is abundant in amino acid residues that are prevalent in linker regions of multidomain proteins of known structure, notably alanine, lysine and proline (Argos, 1990). Thirdly, independent functions for the two regions are suggested by the existence of plasmid-borne bacterial genes that contain only one of the two domains. Thus, mdbA consists of an N-terminal region connected to an unrelated C-terminal region (O’Brien and Mahanty, 1994), and korB consists of a duplicated C-terminal region with the N-terminal region absent (Moré et al., 1996). Finally, point mutations that impair H-NS transcriptional repression activity are located throughout the sequence, but not one is in the potential linker segment (Ueguchi et al., 1996; Williams et al., 1996; Williams and Rimsky, 1997).
To test the hypothesis that StpA is a two-domain protein, it was subjected to partial proteolysis. Unstructured linker regions are generally more sensitive to proteolysis than the structured regions that they connect. The StpA protein used in these experiments was a partially purified preparation (> 85% purity) made as described previously (Zhang et al., 1996). Reaction conditions that partially digested the protein without reducing it completely to small polypeptides were determined for four enzymes: trypsin, chymotrypsin, subtilisin and thermolysin (see Experimental procedures). Digestion with each enzyme produced a number of stable products (Fig. 2A). Notable among these was an approximately 6-kDa product with each of the enzymes.
Partial proteolysis of H-NS, the homologue of StpA, with trypsin gave rise to an almost identical pattern of digestion products. As with StpA, trypsin digestion of H-NS yielded a prominent digestion product of approximately 6 kDa (Fig. 2B). These results suggest that StpA and H-NS have a similar domain structure.
The partial digestion products were analysed by mass spectroscopy, allowing the assignment of cleavage sites to specific residues. The sites sensitive to trypsin in StpA and H-NS and the sites sensitive to chymotrypsin, subtilisin and thermolysin in StpA are shown in Fig. 2C. A large proportion of the cut sites mapped into the putative linker segment between residues 76 and 90. The most prominent cut site was at the conserved residue 90. The KKR string at the end of this segment was especially trypsin sensitive, a result of note in a protein that is lysine and arginine rich (17%) throughout its length. Although the C-terminal region from residues 91–134 was identified as a stable intermediate of 6 kDa, a fragment comprising the entire N-terminal region could not be found as a stable product. However, fragments consisting of most of the N-terminal domain (residue 20 to the linker) were seen.
Results of mass spectroscopic analysis with H-NS trypsin products were similar. The KAKR string at the end of the putative linker in H-NS was especially sensitive, and the C-terminal region was detected as a stable product at approximately 6 kDa, but the N-terminal region was not detected as a stable intermediate. These findings concur with previous work in which a stable tryptic fragment of H-NS comprising residues 91–137 could be purified to homogeneity (Shindo et al., 1995). From these experiments, we conclude that both StpA and H-NS have a bipartite structure, composed of relatively structured C-terminal and N-terminal domains separated by a highly protease-sensitive, exposed linker.
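To make the point about trypsin sensitivity concrete, the sketch below lists the positions at which trypsin could in principle cut a sequence, using the commonly cited rule that trypsin cleaves after lysine or arginine unless the next residue is proline. This is an illustration only: the function names are hypothetical, the peptide is an invented toy sequence (not StpA or H-NS), and the rule describes potential sites, not the sites actually exposed in a folded protein, which is what partial proteolysis probes.

```python
def tryptic_sites(sequence: str) -> list:
    """Return 1-based residue numbers after which trypsin could cleave:
    after K or R, except when the following residue is P."""
    sites = []
    for i in range(len(sequence) - 1):  # no cut after the C-terminal residue
        if sequence[i] in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    return sites


def complete_digest(sequence: str) -> list:
    """Fragments expected if every potential site were cut."""
    fragments = []
    start = 0
    for site in tryptic_sites(sequence):
        fragments.append(sequence[start:site])
        start = site
    fragments.append(sequence[start:])
    return fragments


# Hypothetical toy peptide containing a K-P bond and a KKR string.
toy_peptide = "MSEAKPLKKRTWG"
toy_sites = tryptic_sites(toy_peptide)
toy_fragments = complete_digest(toy_peptide)
```

In a lysine- and arginine-rich protein most of these potential sites are protected by structure, so the clustering of observed cuts in one segment is informative about exposure rather than about sequence alone.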
Purification of StpA domain polypeptides
To characterize the structure of StpA further and to determine whether its nucleic acid annealing activity could be localized to a particular domain, we engineered overexpression clones corresponding to the N-terminal and C-terminal domains as delineated by the mass spectroscopic analysis. Accordingly, expression clones in pET3a were designed to begin and end in the putative linker region between residues 76 and 90 (Fig. 3A). Each construct was tested for expression and solubility. The clones designated NterL (residues 1–80, with about one-third of the putative linker residues present) and CterL (residues 81–134, with about two-thirds of the putative linker segment present) could be induced to high expression levels of soluble protein for use in subsequent analyses (Fig. 3B). Between the two clones, the entire full-length sequence of StpA is represented. Clones of full-length StpA and H-NS were also constructed in the same expression vector. Both of these clones could be induced
Fig. 1. The family of H-NS-related proteins. In the multiple sequence alignment of known H-NS homologues, residues that are identical or have conservative substitutions in all H-NS homologues are indicated by white type on a black background. The L residue instead of the invariable P residue at position 117 in HNS-pvu is presumed to represent a sequencing error (Spurio et al., 1997). Grey shading indicates positions that are identical or have conservative substitutions in more than 65% of the homologues present in that column. The conservative groupings used were (E, D) (N, Q) (I, L, V, M) (F, Y, W) (A, G) (S, T) and (R, K). H-NS-eco and H-NS-sfl are identical sequences, so each sequence is weighted half in determining the consensus sequence. The heavy bar overlining residues 77–89 indicates the non-conserved linker region. Numbers before and after incomplete sequences indicate the position within the full sequence. The mdbA gene has no similarity to H-NS past residue 97. KorBN and KorBC represent the N-terminal (residues 1–49) and C-terminal (residues 50–101) halves of the korB gene (Moré et al., 1996). The bracketed number in the XrvA-xor sequence indicates eight amino acids that are not depicted. All proteobacterial non-plasmid H-NS homologues are between 134 and 137 residues in length. The three-letter genus–species designations are as follows: eco, Escherichia coli; sfl, Shigella flexneri; sty, Salmonella typhimurium; sma, Serratia marcescens; hin, Haemophilus influenzae; rca, Rhodobacter capsulatus; rsp, Rhodobacter sphaeroides; bpe, Bordetella pertussis; xor, Xanthomonas oryzae.
The accession numbers and source for each gene are: H-NS-eco, Sw:P08936 (Pon et al., 1988); H-NS-sfl, Sw:P09120 (Hromockyj et al., 1992); H-NS-sty, Sw:P17428 (Hulton et al., 1990); H-NS-sma, Sw:P18955 (La Teana et al., 1989); H-NS-pvu, Sw:P18818 (La Teana et al., 1989); StpA-eco, Sw:P30017 (Zhang and Belfort, 1992); H-NS-hin, Sw:P43841 (Fleischmann et al., 1995); MdbA-eco, GB:U47048 (O’Brien and Mahanty, 1994); orf4-eco, PIR:S34257 (Tietze and Tschäpe, 1994); HvrA-rca, Sw:P42505 (Buggy et al., 1994); spb-rsp, PIR:S82916 (Shimada et al., 1996); KorB-eco, PIR:I79262 (Moré et al., 1996); BpH3-bpe, GB:U82566 (Goyard and Bertin, 1997); XrvA-xor, EMBL:X97866 (Dow, unpublished).
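The shading rule described in the legend to Fig. 1 can be sketched as a per-column classification. The code below is a simplified illustration and makes assumptions that are stated here and in the comments: the function name and the three labels are hypothetical, gaps are treated as "homologue not present in that column", and the half-weighting of the identical H-NS-eco and H-NS-sfl sequences is omitted. The conservative groupings and the 65% threshold are taken from the legend.

```python
from collections import Counter

# Conservative groupings as listed in the legend to Fig. 1.
GROUPS = ["ED", "NQ", "ILVM", "FYW", "AG", "ST", "RK"]
GROUP_OF = {residue: group for group in GROUPS for residue in group}


def classify_column(column: str, threshold: float = 0.65) -> str:
    """Classify one alignment column (one character per homologue).

    Returns "invariant" if all residues present fall in one conservative
    group, "conserved" if more than `threshold` of them do, otherwise
    "variable". Gaps ('-') are ignored (assumption). Sequence weighting
    is not implemented in this sketch.
    """
    present = [r for r in column if r != "-"]
    if not present:
        return "variable"
    # Residues outside the listed groupings form their own single-residue group.
    counts = Counter(GROUP_OF.get(r, r) for r in present)
    fraction = counts.most_common(1)[0][1] / len(present)
    if fraction == 1.0:
        return "invariant"
    if fraction > threshold:
        return "conserved"
    return "variable"


# Hypothetical columns for illustration.
label_rk = classify_column("RKRKR")      # all basic residues
label_mixed = classify_column("IIIVA-")  # 4 of 5 present residues in (I, L, V, M)
label_linker = classify_column("AKPEQ")  # no shared group
```

Under this scheme a run of "variable" columns, such as the one overlined in Fig. 1, is what would flag a candidate non-conserved linker.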
to express high amounts of soluble protein (Fig. 3B). All four constructs were used to express the corresponding protein to high levels, and each protein was purified to homogeneity. StpA was purified in this work beyond the level of purity achieved originally, because the batch protocol used previously, although fast and convenient, did not produce homogeneously pure protein (Zhang et al., 1995) (see also Fig. 2A and B). We began with the batch procedure that was applied previously, then followed it with heparin–agarose chromatography. The StpA protein so obtained was judged to be > 95% pure by Coomassie staining (Fig. 3C) and free of nuclease contamination by in vitro oligonucleotide degradation assays. This latter criterion was of particular importance when measuring nucleic acid association activity. Different purification protocols were also developed for the remaining three proteins, as described in Experimental procedures. All three proteins,
Fig. 2. Partial proteolysis of StpA and H-NS.
A. Proteolysis products of StpA. Analysis was on 15% SDS–polyacrylamide gels. Molecular weight markers, in kDa, are indicated along the side of the gel. The black dot marks a band of approximately 6 kDa common to all of the protease digestions. Lanes 1, 15 min digestion; lanes 2, 30 min; lanes 3, 45 min; lanes 4, 60 min; lanes 0, no treatment. Bands ≥ 45 kDa are contaminants in the partially purified StpA preparation.
B. Trypsinolysis products of H-NS and StpA. Lanes are marked as in (A). The white dot indicates a prominent band of approximately 6 kDa common to both protein digestions.
C. Schematic representing the sites of protease cleavage in the sequence of StpA and H-NS. The sequence of the putative linker region is shown between grey rectangles that represent the N-terminal and C-terminal domains. Arrowheads along the sequence indicate prominent sites of digestion for each protease as determined by mass spectroscopy. The arrowhead symbol for each protease is shown above the respective lanes on the gel in (A).
StpA-NterL, StpA-CterL and H-NS were also judged to be > 95% pure (Fig. 3C).
StpA and StpA-CterL promote RNA annealing and trans-splicing in vitro
The annealing assay measures the association of two complementary 21-mer oligonucleotides to form a 21-mer duplex oligonucleotide (Zhang et al., 1995). At 0.5 µM protein, StpA accelerated the rate of duplex formation about 10-fold over H-NS under similar conditions (Fig. 4A and B). At 1.5 µM protein, StpA again outperformed H-NS, and StpA-CterL accelerated the association of RNA oligonucleotides at least as well as StpA (Fig. 4C). Least squares analysis of the rate at early timepoints showed clearly that the rate acceleration of StpA-CterL is equal to or better than the rate acceleration induced by full-length StpA (data not shown). In contrast, StpA-NterL showed no increase in duplex formation over that which occurred in buffer alone (Fig. 4C), even at 10-fold higher concentrations (data not shown). These results argue that the ability of StpA to accelerate strand annealing resides entirely in the C-terminal segment that comprises about one-third of the full-length protein.
Previous analyses had demonstrated that, in addition to strand-annealing activity, a partially purified preparation of StpA had RNA strand exchange activity (Zhang et al., 1995; 1996). However, neither our most highly purified preparations of full-length StpA nor the StpA-CterL protein exhibited strand exchange activity, which seems to be catalysed by a contaminant present in the original partially purified preparation (data not shown). It was therefore of importance to test the ability of the purified proteins to promote in vitro trans-splicing, an activity that was originally used to characterize StpA (Zhang et al., 1995). The trans-splicing assay measures 5′ splice site cleavage, which is coupled to exon ligation. This is achieved by following the incorporation of [α-32P]-GTP at the cleaved 5′ splice site of a td intron that has been split into two half molecules, H1 and H2 (Fig. 5A; Coetzee et al., 1994; Zhang et al., 1995; 1996). Whereas full-length StpA stimulated trans-splicing significantly (Fig. 5B), the ability of H-NS to promote trans-splicing was again about 10-fold less than that of StpA (Fig. 5B; Zhang et al., 1996). StpA-CterL also stimulated trans-splicing, although to lower levels than StpA, but StpA-NterL showed no activity above background (Fig. 5C). Thus, the abilities of StpA to promote RNA annealing and trans-splicing both reside in the C-terminal domain of the protein, with a complete absence of these activities in the N-terminal domain.
Discussion
In this work, we have demonstrated by biochemical means
Fig. 3. Cloning, expression and purification of StpA, H-NS, StpA-CterL and StpA-NterL.
A. Scheme for cloning of the N-terminal and C-terminal domains of StpA. Bent arrows indicate the primers used; the portion of the line parallel to the StpA schematic represents the part of the primer complementary to the StpA nucleotide sequence. Arrowheads indicate the 3′ end of the primers. The name for each primer is: 1, StpA-5′; 2, Cter-5′L; 3, Nter-3′L; 4, StpA-3′. Restriction sites used for cloning are indicated at the 5′ end of each primer: B, BamHI; X, XbaI. SD indicates the position of the Shine–Dalgarno sequence. The sequence of the linker is shown. Numbers indicate position within the sequence.
B. Protein overexpression. Each expression plasmid was induced in the strain BL21(DE3)pLysS as described in Experimental procedures. Black dots indicate the position of each representative protein. MW markers, in kDa, are indicated along the sides.
C. Purified proteins. The approximately 9 kDa StpA-NterL migrates anomalously at 6 kDa, possibly because of its high content of charged amino acid residues.
that StpA and its paralogue H-NS have a bipartite structure consisting of two domains separated by a protease-sensitive linker (Fig. 2). Furthermore, we have demonstrated that a protein consisting only of the StpA C-terminal domain (StpA-CterL) can by itself promote RNA annealing (Fig. 4), DNA annealing (data not shown) and trans-splicing of a catalytic
Fig. 4. RNA annealing activity of StpA, StpA derivatives and H-NS.
A. Autoradiogram of RNA annealing at 0.5 µM protein. [32P]-21R+ single-stranded (ss) oligonucleotide was mixed with StpA or H-NS and the complementary unlabelled 21R− oligonucleotide for the indicated times, then reactions were quenched and products separated on a 15% polyacrylamide gel. The position of the 21R+/21R− duplex (ds) oligonucleotide is indicated.
B. Plot of data in (A). The amount of radioactivity in each species was determined with a Phosphorimager as described in Experimental procedures.
C. RNA annealing at 1.5 µM protein. Activity of each protein was determined and quantitated as described in (A) and (B).
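The comparison of annealing rates from quantitated time courses, as in Fig. 4B, can be sketched as a least-squares fit to the early timepoints. The example below is a minimal illustration under stated assumptions: the function name `initial_rate` is hypothetical, the time courses are invented numbers and not data from this study, and a simple straight-line fit to the first few points stands in for whatever fitting procedure was actually used.

```python
def initial_rate(times, fraction_duplex, n_points=4):
    """Least-squares slope of fraction duplex versus time over the first
    n_points timepoints (ordinary linear regression, slope only)."""
    t = list(times)[:n_points]
    f = list(fraction_duplex)[:n_points]
    if len(t) < 2 or len(t) != len(f):
        raise ValueError("need at least two matching timepoints")
    t_mean = sum(t) / len(t)
    f_mean = sum(f) / len(f)
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (fi - f_mean) for ti, fi in zip(t, f))
    if sxx == 0:
        raise ValueError("timepoints must not all be identical")
    return sxy / sxx


# Hypothetical time courses (minutes, fraction of label in duplex).
minutes = [0, 1, 2, 3, 5, 10]
buffer_only = [0.00, 0.01, 0.02, 0.03, 0.05, 0.09]
with_protein = [0.00, 0.10, 0.20, 0.30, 0.45, 0.70]

rate_buffer = initial_rate(minutes, buffer_only)
rate_protein = initial_rate(minutes, with_protein)
fold_acceleration = rate_protein / rate_buffer
```

Restricting the fit to early timepoints matters because the curves plateau as single-stranded substrate is consumed; including late points would underestimate the rate for the faster reactions.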
Fig. 6. Model of StpA and H-NS structure. The two-domain structure of StpA and H-NS is depicted containing an alignment of the two proteins generated using the BESTFIT program of the Wisconsin GCG package. Dashes between the sequences indicate identity; colons indicate similarity (see legend to Fig. 1). The N-terminal and C-terminal domains are shaded, and the protease-sensitive linker (PS linker) region is black. The residue hypersensitive to trypsin digestion is indicated by an asterisk. The small arrowheads indicate the highly conserved TWTG-GR-P motif. Although the N-terminal domain has not been isolated as an intact entity, its existence is inferred from phylogenetic and functional studies. The RNA-modulating activity promoted by the C-terminal domain of StpA is indicated.
Fig. 5. In vitro trans-splicing.
A. The trans-splicing reaction. Precursor RNAs H1 and H2 associate in trans, and splicing is initiated by an exogenous guanosine cofactor, resulting in splice products (ligated exons, LE; intron sequences from H1 and H2, I1 and I2 respectively). *G represents [α-32P]-GTP, which becomes covalently joined to the 5′ end of the intron (*G-I1).
B. The splicing reaction was carried out with equal amounts of H1 and H2 RNAs (20 nM each) and [α-32P]-GTP at 37°C for 20 min with 0, 2, 4, 7 and 15 µM of the indicated purified proteins. The inset shows an autoradiogram of representative reactions at 55°C or 37°C with the indicated proteins. 0, no protein. The amount of radioactivity in the *G-I1 product was determined from autoradiograms as described in Experimental procedures, and the background levels (buffer, 37°C) were subtracted. StpA was > 10-fold more active than H-NS and two- to threefold more active than StpA-CterL at protein concentrations of < 10 µM over four independent experiments.
intron in vitro (Fig. 5). From these and phylogenetic data (Fig. 1), we propose a model of StpA and H-NS (Fig. 6) in which an exposed linker connects two structured domains (Fig. 2). Furthermore, the C-terminal domain of 54 residues can function to promote nucleic acid strand annealing independently of any interaction with the N-terminal domain.
The association of strand-annealing activity with StpA-CterL indicates that this is the part of the protein that binds RNA and DNA. This inference is consistent with recent biochemical and genetic data that indicate that DNA binding resides in the C-terminal region of H-NS. The C-terminal domain of H-NS, isolated by trypsinolysis from purified full-length H-NS, exhibits DNA-binding ability, although its affinity, judged from the measured Kd, is several orders of magnitude lower than that of the full-length protein (Shindo et al., 1995). The nuclear magnetic resonance (NMR) structure of this fragment gives little indication of how the protein interacts with nucleic acids (Williams and Rimsky, 1997). Additionally, point mutations in the C-terminal domain, especially those in or about the TWTG-GR-P motif that defines the H-NS family (Fig. 1), are associated with loss of H-NS function in vivo and with reduction in double-stranded DNA binding in vitro (Ueguchi et al., 1996; Williams et al., 1996;
Spurio et al., 1997; Free et al., 1998). Furthermore, mutations in the N-terminal region, many of which fall in or near the conserved LxNxRxLR motif around residue 10 (Fig. 1), have lost transcriptional repression yet maintain DNA binding (Ueguchi et al., 1996). Some of these mutants have been shown to have lost the ability to form H-NS dimers and higher oligomers (Williams et al., 1996; Ueguchi et al., 1997). Finally, our data, which localized annealing activity to the C-terminal domain of StpA, are consistent with in vivo tests of the effect of StpA mutants upon the suppression of the tdSC34 splicing-defective intron (Zhang et al., 1995). The deletion of 30 residues at the C-terminus or an insertion of a linker at residue 102 that creates a frameshift, both of which eliminate the TWTG-GR-P motif, completely abrogated suppression. In contrast, the deletion of 19 residues at the N-terminus still permitted suppression of the td phenotype (Zhang, 1994).
The previous genetic experiments were unable to define the boundary between the N-terminal and C-terminal domains (Ueguchi et al., 1996; Williams and Rimsky, 1997), nor did they predict the existence of a linker between the two domains. The boundary was placed somewhere between residues 60 and 80. We show instead that the boundary is the linker that extends from around residue 76 to residue 90. The amino acid at residue 90 is one of the most conserved residues in the H-NS protein family, being an arginine or lysine in every instance (Figs 1 and 6). Supporting this positioning of the linker is the observation that all proteolytic fragments, consisting of the entire C-terminal domain or the majority of the N-terminal domain (Fig. 2), had their end-points in the linker.
RNA chaperones are defined as proteins that act within the cell to promote correct RNA folding by preventing misfolding and by resolving misfolded forms (Herschlag, 1995).
This activity would seem to incorporate both the acceleration of folding, as measured in vitro by strand annealing, and the acceleration of unfolding, as measured in vitro by strand exchange. Although both strand-annealing and strand exchange functions were previously assigned to StpA (Zhang et al., 1995; 1996), our highly purified StpA preparations, as well as StpA-CterL, have annealing activity but no strand exchange activity (data not shown). Nevertheless, highly purified StpA and StpA-CterL are able to promote the trans-splicing reaction of the group I td intron (Fig. 5). However, their ability to do so is compromised relative to the cruder StpA preparations that contain both strand exchange activity and annealing activity (data not shown). These results suggest that robust facilitation of trans-splicing requires both activities to resolve misfolded structures as well as to promote proper folding. It has been pointed out that another putative chaperone, p53, has only annealing activity. The authors propose that strand association may be all that is necessary for RNA chaperone function. Requisite dissociation would either take advantage
of the natural ‘breathing’ of nucleic acids, which occurs because of thermomolecular fluctuations for short pairings, or require collaboration with RNA helicases to facilitate RNA unfolding (Jean et al., 1997). Regardless, whether StpA behaves physiologically as an RNA chaperone remains to be determined.
Several intriguing observations emerged from the phylogenetic analysis of H-NS-related proteins (Fig. 1). Membership of the H-NS family for several of the proteins listed (MdbA, Spb, HvrA, Orf4 and XrvA) had not been noted previously. However, all of the H-NS and StpA homologues identified to date, chromosomal and extrachromosomal, have been found only among proteobacterial species (Fig. 1). Among the complete prokaryotic genome sequences available, there are no H-NS homologues in Gram-positive bacteria (Fraser et al., 1995; Himmelreich et al., 1996; Kunst et al., 1997), spirochaetes (Fraser et al., 1997), cyanobacteria (Kaneko et al., 1996) or archaeotes (Bult et al., 1996; Klenk et al., 1997; Smith et al., 1997). Nor is H-NS universal even among the proteobacteria. Helicobacter pylori, a representative member of the epsilon class of proteobacteria, does not contain an H-NS homologue (Tomb et al., 1997). It would be of considerable interest to know how non-proteobacteria substitute for H-NS, both from gene regulatory and nucleoid maintenance standpoints. Of further interest, E. coli is so far the only organism in which two H-NS-like genes, stpA and hns, have been detected. The differences in the ability of H-NS and StpA to act upon RNA substrates support the idea that StpA, besides acting as a molecular backup for H-NS (Zhang et al., 1996), has also differentiated to perform RNA-related functions.
Our model for the two-domain structure of StpA is consistent with the phylogenetic analyses that we have undertaken (Figs 1 and 6). First, the linker region is the least conserved region in the H-NS protein family (Fig. 1).
Secondly, this region is composed of amino acids that have been shown to occur at high frequency in known linkers (Argos, 1990). Thirdly, no mutation that impairs H-NS function has yet been found in this region (Ueguchi et al., 1996; 1997; Spurio et al., 1997; Williams and Rimsky, 1997). Finally, and most telling of all, is the existence of plasmid-encoded H-NS homologues that consist of the N-terminal domain with a C-terminal domain that is not H-NS related (the mdbA gene of plasmid p24-2; O’Brien and Mahanty, 1994) or of tandemly duplicated C-terminal domains without an H-NS-related N-terminal domain (the korB gene of plasmid pKM101; Moré et al., 1996). It is noteworthy that these naturally occurring N-terminal and C-terminal proteins end and begin in the linker region respectively. These findings re-emphasize the independent function of each domain and lead to the proposal that domain switches and duplications have occurred in evolution. Thus, with mdbA, the N-terminal domain is fused to
an alternative C-terminal domain, whereas with korB, the N-terminal domain is replaced by a second C-terminal domain.
Three H-NS homologues, both of the single-domain homologues mentioned above plus one other full-length homologue (orf4 of the IncM plasmid R446; Tietze and Tschäpe, 1994), are encoded by plasmids that determine conjugative pili formation needed for the spread of antibiotic resistance (orf4 and korB) or that control colicin production (mdbA). They are therefore related to the clinical spread of pathogenic E. coli strains. Might the plasmid-encoded homologues of H-NS and StpA that were revealed by our phylogenetic studies function in subverting cellular gene expression to assist the pathogenic state?
Experimental procedures
Partial proteolysis
Reaction mixtures contained 10 µg of StpA or H-NS in 100 µl of 50 mM Tris-HCl (pH 8.0). Trypsin was added at a 1:1000 (w/w) ratio, chymotrypsin and subtilisin at a 1:500 (w/w) ratio and thermolysin at a 1:250 (w/w) ratio. The thermolysin reaction also contained 3 mM MgCl2. Reaction mixtures were incubated for 15–60 min at 23°C, except for thermolysin, which was incubated at 37°C. Aliquots were quenched by the addition of 2 mM phenylmethylsulphonyl fluoride (PMSF) and 5 mM EDTA, and protein fragments were separated on 15% SDS–polyacrylamide gels. Mass spectroscopy analysis of protein fragments was carried out as described previously (Derbyshire et al., 1997).
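As a worked example of the enzyme:substrate ratios given above, the sketch below converts each (w/w) ratio into a mass of protease per reaction. It is an arithmetic illustration only: the function name is hypothetical, and the substrate amount is taken as 10 µg per reaction, which is an assumption about how the stated amount should be read.

```python
# Protease:substrate ratios (w/w) as stated in the text, written as the
# substrate-to-enzyme mass ratio (1:1000 means 1000).
SUBSTRATE_PER_ENZYME = {
    "trypsin": 1000,
    "chymotrypsin": 500,
    "subtilisin": 500,
    "thermolysin": 250,
}


def protease_mass_ng(substrate_ug: float, substrate_per_enzyme: int) -> float:
    """Mass of protease (ng) needed for a given substrate mass (µg)."""
    return substrate_ug * 1000.0 / substrate_per_enzyme


substrate_ug = 10.0  # assumed substrate mass per reaction, in µg
protease_ng = {
    name: protease_mass_ng(substrate_ug, ratio)
    for name, ratio in SUBSTRATE_PER_ENZYME.items()
}
```

On this reading, each reaction contains tens of nanograms of protease, which is consistent with conditions chosen to give partial rather than complete digestion.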
Molecular cloning
All overexpression constructs were made in the vector pET3a (Studier et al., 1990). Constructs were designed to express full-length StpA (pET-StpA), full-length H-NS (pET-HNS), the N-terminal domain of StpA (pET-Nter) and the C-terminal domain of StpA (pET-Cter). The insert DNA fragments were generated by polymerase chain reaction (PCR). All PCR-generated DNA fragments had an XbaI site at the 5′ end and a BamHI site at the 3′ end for cloning into pET3a. Each 5′ primer had six extra bases at the 5′ end before the XbaI site, then 26 nucleotides of pET3a sequence followed by an ATG start codon and 21–26 nucleotides of the appropriate coding sequence. Similarly, each 3′ primer was designed with six extra bases, a BamHI site, the complement to a TAA stop codon and 20–25 nucleotides of the appropriate coding sequence. The DNA for the StpA derivatives was amplified off the vector pAZ205, and the H-NS construct was amplified off the vector pSU-HNS (Zhang et al., 1996). Each clone was sequenced completely on both strands to ensure against mutations. The StpA C-terminal derivative could not be amplified by the standard PCR, presumably because the 5′ primer used for this reaction had considerable secondary structure (data not shown). For this construct, a technique designated ‘PCR redux’ was applied. In PCR redux, the assumption is made that a tiny amount of correct product was made, invisible on a gel. Accordingly, the part of the gel at which the product
should be was excised, whatever DNA was present was eluted from the gel slice, and another round of standard PCR was performed. The oligonucleotides used, according to Fig. 2 (primers 1–4), were: StpA-58 (GGCGGCTCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACATATGTCCGTAATGTTACAAAG); Cter-58L (GGCGGCTCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACATATGTCTGCTGCTGC-ACCACGCGCTG); StpA-38 (GGGGGATCCTTAGATCAGGAAATCGTCGAGAG); Nter-38L (GGGGGATCCTTAAGAGCTATTACCCAATAACTCTTC); HNS-58 (GGCGGCTCTAGAAATTTTGTTTAACTTTAAGAAGGAGATATACATATGAGCGAAGCACTTAAAATTC); and HNS-38 (GGGGGATCCTTATTGCTTGATCAGGAAATCGTC).
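The modular layout of the primers (extra bases, restriction site, vector or stop-codon sequence, coding sequence) can be checked mechanically. The following is a minimal sketch that dissects the full-length StpA primer pair as printed in the list above; the function names and dictionary keys are illustrative assumptions, and the code only reports the segments it finds in the printed sequences without asserting what their lengths ought to be.

```python
XBAI = "TCTAGA"   # XbaI recognition site
BAMHI = "GGATCC"  # BamHI recognition site

# Full-length StpA primer pair, copied from the oligonucleotide list.
STPA_FWD = "GGCGGCTCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACATATGTCCGTAATGTTACAAAG"
STPA_REV = "GGGGGATCCTTAGATCAGGAAATCGTCGAGAG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def dissect_forward(primer: str) -> dict:
    """Split a 5' primer into extra bases, XbaI site, vector part, ATG, coding part."""
    site_at = primer.index(XBAI)
    after_site = primer[site_at + len(XBAI):]
    atg_at = after_site.index("ATG")  # first ATG downstream of the site
    return {
        "extra": primer[:site_at],
        "site": XBAI,
        "vector": after_site[:atg_at],
        "start": "ATG",
        "coding": after_site[atg_at + 3:],
    }


def dissect_reverse(primer: str) -> dict:
    """Split a 3' primer into extra bases, BamHI site, stop codon, annealing part."""
    site_at = primer.index(BAMHI)
    after_site = primer[site_at + len(BAMHI):]
    return {
        "extra": primer[:site_at],
        "site": BAMHI,
        # The primer carries the complement of the stop codon, so read it
        # back on the coding strand.
        "stop_on_coding_strand": reverse_complement(after_site[:3]),
        "annealing": after_site[3:],
    }


if __name__ == "__main__":
    for label, parts in (("forward", dissect_forward(STPA_FWD)),
                         ("reverse", dissect_reverse(STPA_REV))):
        print(label)
        for key, value in parts.items():
            print(f"  {key}: {value} ({len(value)} nt)")
```

The same two functions can be applied to the other primers in the list to compare them segment by segment.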
Induction of expression and protein purification
Plasmids for overexpression were transformed into the strain BL21(DE3)pLysS (Studier et al., 1990). Cells were grown at 37°C in TBYE medium, supplemented with 100 µg ml⁻¹ ampicillin, to an OD600 of approximately 0.6. Protein expression was then induced by the addition of 1 mM IPTG, and the culture was grown for 3 h at 37°C. The cells were harvested and stored frozen at −80°C. For all protein purifications, the initial steps of cell lysate preparation were as described previously (Zhang et al., 1995; Derbyshire et al., 1997). At each step in the purification, proteins were detected by SDS–PAGE. Protein concentrations were determined by Lowry assay using a Bio-Rad kit.
StpA. The initial batch preparation steps, consisting of back extraction of a 0.3% PEI pellet, precipitation by 40% ammonium sulphate and two cycles of precipitation by low-salt dialysis, were performed as described previously (Zhang et al., 1995). This fraction, in TNED buffer [50 mM Tris-HCl, pH 7.5, 500 mM NaCl, 1 mM EDTA and 0.5 mM dithiothreitol (DTT)], was applied to a 5 ml heparin–agarose column (HiTrap; Pharmacia Biotech) equilibrated in TNED buffer. The column was washed with 5 volumes of buffer, and then proteins were eluted with a 0.5–2 M NaCl gradient in TNED buffer. StpA eluted at 650 mM NaCl. Purified StpA was dialysed into TNED buffer containing 10% glycerol and stored at −20°C.
StpA-NterL. To a cleared crude lysate containing StpA-NterL, polyethyleneimine was added to 0.3% final concentration to precipitate nucleic acids. The precipitate so formed was collected by centrifugation at 10 000 g for 10 min. StpA-NterL was recovered from the pellet by extraction with 50 mM Tris-HCl (pH 7.5) and 500 mM NaCl. Solid ammonium sulphate was added to 40% saturation, and the precipitate was collected by centrifugation at 26 000 g for 15 min. StpA-NterL was soluble in low salt and, therefore, the low-salt precipitation step that was successful for full-length StpA could not be used. Instead, the ammonium sulphate pellet was dissolved in TN250ED buffer (as TNED but with 250 mM NaCl) and applied to a 5 ml heparin–agarose column as above. StpA-NterL, which appeared in the flowthrough, was dialysed into TN100ED, then applied to a 5 ml anion exchange column (HiTrap-Q; Pharmacia Biotech). Proteins were eluted with a 0.1–1 M NaCl gradient. StpA-NterL eluted at 150 mM NaCl. The pure StpA-NterL was dialysed into TN100ED buffer containing 10% glycerol and stored at −20°C.
© 1998 Blackwell Science Ltd, Molecular Microbiology, 28, 847–857
StpA-CterL. Unlike StpA and StpA-NterL, StpA-CterL was soluble in 0.3% polyethyleneimine and remained in the supernatant. Solid ammonium sulphate was added to the supernatant fraction to 60% saturation, and the precipitate was cleared by centrifugation at 26 000 g for 15 min. The supernatant fraction was dialysed into TN100ED buffer and applied to a 5 ml heparin–agarose column as described above. Proteins were eluted with a 0.1–2 M NaCl gradient. StpA-CterL eluted at 600 mM NaCl. The StpA-CterL-containing fractions were collected, dialysed again into TN100ED buffer and applied in 250 µl aliquots to a Superose 12 gel filtration column (Pharmacia Biotech) equilibrated and developed in 50 mM Tris (pH 7.5) and 100 mM NaCl. Solid ammonium sulphate was added to the pooled StpA-CterL-containing fractions to a final concentration of 2 M. This material was applied to a 1 ml phenyl–Sepharose column (HiTrap; Pharmacia Biotech). StpA-CterL did not bind to this resin but was the only protein in the flowthrough fractions. Purified StpA-CterL was dialysed into TN100ED buffer containing 10% glycerol and stored at −20°C.
H-NS. This protein was purified by a modification of the technique of Dersch et al. (1993). A 40–60% ammonium sulphate cut of the cleared lysate was dialysed into TN150ED buffer, then loaded onto a 5 ml double-stranded DNA cellulose (dsDC) column (Pharmacia Biotech). The dsDC column was washed with 10 volumes of TN250ED buffer and then eluted with a 250–500 mM NaCl gradient, instead of the stepped elution described previously (Dersch et al., 1993). H-NS eluted at 425 mM NaCl. In our hands, H-NS prepared by the step gradient protocol was contaminated with nuclease activity. Purified H-NS was dialysed into TN100ED buffer containing 10% glycerol and stored at −20°C.
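The purification buffers form a family that differs only in NaCl concentration: TNED contains 500 mM NaCl, and TN250ED is defined in the text as TNED with 250 mM NaCl. The sketch below assumes that TN100ED and TN150ED follow the same naming rule (a reading by analogy, not an explicit statement in the text) and computes how much of each stock solution would be needed. The stock concentrations are hypothetical placeholders and should be replaced with whatever stocks are actually in use.

```python
import re

# Components shared by the whole buffer family, as given for TNED (in mM).
BASE_MM = {"Tris-HCl pH 7.5": 50.0, "EDTA": 1.0, "DTT": 0.5}
DEFAULT_NACL_MM = 500.0  # plain TNED

# Hypothetical stock concentrations (mM); not taken from the paper.
STOCKS_MM = {
    "Tris-HCl pH 7.5": 1000.0,
    "NaCl": 5000.0,
    "EDTA": 500.0,
    "DTT": 1000.0,
}


def composition(name: str) -> dict:
    """Final concentrations (mM) for a buffer named TNED or TN<number>ED."""
    match = re.fullmatch(r"TN(\d*)ED", name)
    if match is None:
        raise ValueError(f"not a TNED-family buffer name: {name!r}")
    nacl = float(match.group(1)) if match.group(1) else DEFAULT_NACL_MM
    return {**BASE_MM, "NaCl": nacl}


def stock_volumes_ml(name: str, total_ml: float) -> dict:
    """Volume of each stock (ml), plus water, to make total_ml of buffer."""
    final = composition(name)
    # C1 * V1 = C2 * V2, solved for the stock volume V1.
    volumes = {k: total_ml * c / STOCKS_MM[k] for k, c in final.items()}
    volumes["water"] = total_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for buffer_name in ("TNED", "TN250ED", "TN100ED"):
        print(buffer_name, stock_volumes_ml(buffer_name, 1000.0))
```

Glycerol for the storage buffers is left out of the sketch; it would be added as a further component before making up to volume.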
Strand annealing and trans-splicing assays
Annealing activity was determined using complementary synthetic 21-mer oligonucleotides, 21R+ and 21R− (National Biosciences) for RNA annealing, and 21D+ and 21D− for DNA annealing (for oligonucleotide sequences, see Zhang et al., 1995). The 21+ oligonucleotide was labelled at the 5′ end with [γ-32P]-ATP (3000 Ci mmol⁻¹; DuPont) and T4 polynucleotide kinase (New England Biolabs) in a 2 h incubation, then phenol extracted and ethanol precipitated to isolate the labelled oligonucleotide. A 25 µl reaction mixture containing 10 nM 21+ and 20 nM 21− oligonucleotides in 1× annealing buffer (50 mM Tris, pH 7.5, 3 mM MgCl2 and 1 mM DTT) was incubated at 37°C. To stop the reaction, 20 µl of phenol–CIA (1:1) and 5 µl of 1% SDS were added. The reaction mix was extracted and mixed with 5 µl of stop solution (0.25% bromophenol blue, 20 mM EDTA, 0.2% SDS, 0.1 mg ml⁻¹ yeast tRNA and 20% glycerol). The 21+/21− duplex was separated from labelled 21+ oligonucleotide on a 15% polyacrylamide gel (29:1; acrylamide–bis-acrylamide) in TBE buffer at 15 W constant power at 4°C. Products were visualized by autoradiography and quantitated with a Molecular Dynamics Storm 860 scanner with associated IMAGEQUANT software. The assay of trans-splicing was by [α-32P]-GTP incorporation as described previously (Coetzee et al., 1994; Zhang et al., 1995).
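Because only the 21+ strand is labelled and the unlabelled complement is present in twofold excess, the extent of annealing can be expressed as the share of labelled signal that migrates as duplex. The text does not spell out the formula used for quantitation, so the following is a minimal sketch under the assumption that background-corrected band signals are compared directly; the function names and the example signal values are hypothetical.

```python
LABELLED_NM = 10.0    # labelled 21+ strand concentration, from the text
UNLABELLED_NM = 20.0  # unlabelled 21- strand concentration, from the text


def fraction_annealed(duplex_signal: float, free_signal: float) -> float:
    """Share of the labelled strand found in the duplex band (0 to 1)."""
    total = duplex_signal + free_signal
    if total <= 0:
        raise ValueError("band signals must sum to a positive value")
    return duplex_signal / total


def duplex_concentration_nm(fraction: float) -> float:
    """Duplex concentration implied by the annealed fraction.

    The labelled strand is limiting (10 nM versus 20 nM complement), so the
    duplex can never exceed the labelled strand concentration.
    """
    return fraction * LABELLED_NM


if __name__ == "__main__":
    # Hypothetical scanner readings in arbitrary units.
    frac = fraction_annealed(duplex_signal=3000.0, free_signal=1000.0)
    print(f"fraction annealed: {frac:.2f}")
    print(f"duplex: {duplex_concentration_nm(frac):.1f} nM")
```

Plotting this fraction against incubation time for reactions with and without protein gives the annealing time course.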
Bioinformatic analyses
H-NS family members were identified by a reiterative search process. Initially, the set of closely related homologues from the enterobacteria was detected by BLASTP searches (Altschul et al., 1990) using E. coli H-NS and StpA as the query sequences. This group of sequences was used to create a multiple alignment using the PILEUP program of the GCG package (Devereux et al., 1984), and a profile generated from this multiple alignment using the PROFILEMAKE program of the GCG package was used to search the SWISSPROT protein database. Homologous proteins detected in this manner (Z scores >10) were added to the multiple alignment, the profile was regenerated, and the SWISSPROT database was searched again. This process was repeated until no new homologues could be detected. In the course of these experiments, it was noticed that the TWTG-GR-P (pronounced 'twit-grip') motif was absolutely conserved in all homologues. For this reason, the profile was limited to the C-terminal domain of H-NS/StpA in subsequent database searches. Matches with Z scores between 5 and 10 were examined manually for this motif to judge whether the match was significant and, if so, were added to the profile.
Homologues of H-NS/StpA were searched for in completed microbial genomes using the search tools on the respective Web site servers. In each instance, the C-terminal domain, containing the TWTG-GR-P motif, was used as the search query. Output was examined manually for the presence or absence of this motif.
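The reiterative search logic can be summarized in a few lines of code. The sketch below is not the GCG workflow itself: the profile search is replaced by a caller-supplied function, the toy database and scores are invented for illustration, and the motif is encoded on the assumption that each hyphen in TWTG-GR-P stands for any single residue. It shows only the acceptance rule (Z above 10 accepted outright, Z between 5 and 10 accepted when the motif is present) and the stopping condition (no new homologues).

```python
import re
from typing import Callable, Dict, Iterable, Set

# Assumption: '-' in TWTG-GR-P means "any one residue".
MOTIF = re.compile(r"TWTG.GR.P")


def iterative_search(
    seeds: Iterable[str],
    search: Callable[[Set[str]], Dict[str, float]],
    database: Dict[str, str],
    z_accept: float = 10.0,
    z_review: float = 5.0,
    max_rounds: int = 20,
) -> Set[str]:
    """Grow a family by repeated profile searches until nothing new is found."""
    members = set(seeds)
    for _ in range(max_rounds):
        hits = search(members)  # stands in for rebuilding and running the profile
        new = set()
        for seq_id, z in hits.items():
            if seq_id in members:
                continue
            has_motif = MOTIF.search(database[seq_id]) is not None
            if z > z_accept or (z >= z_review and has_motif):
                new.add(seq_id)
        if not new:
            break
        members |= new
    return members


# Toy data, entirely hypothetical.
TOY_DB = {
    "seed": "MKKTWTGQGRTPAAL",
    "close": "MRRTWTGQGRTPKVL",
    "distant_motif": "MSSTWTGAGRSPLLE",
    "distant_plain": "MSSTWAGQGRTPLLE",
}


def toy_search(members: Set[str]) -> Dict[str, float]:
    """Pretend search: the distant match scores higher once 'close' is in the profile."""
    return {
        "close": 12.0,
        "distant_motif": 7.0 if "close" in members else 3.0,
        "distant_plain": 6.0,
    }


if __name__ == "__main__":
    print(sorted(iterative_search({"seed"}, toy_search, TOY_DB)))
```

In the toy run the mid-scoring sequence that lacks the motif is never admitted, while the one carrying it joins the family in the second round, mirroring the manual inspection step described above.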
Acknowledgements
We thank Dorie Smith and Stacey Cavanagh for expert technical support, and Maryellen Carl and Maureen Belisle for capable help with the manuscript and figures, respectively. Dr Vicky Derbyshire and Dr Richard Lease provided insightful advice. We thank Drs Benoit Cousineau, Vicky Derbyshire, Richard Lease and Monica Parker for their comments on the manuscript. The Molecular Genetics Core Facility at the Wadsworth Center provided DNA oligonucleotides and DNA sequencing services, and Charles Hauer of the Biological Mass Spectroscopy Group at the Wadsworth Center provided mass spectroscopy services. This work was supported by NIH grants GM39422 and GM44844 to M.B.
References
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990) Basic local alignment search tool. J Mol Biol 215: 403–410.
Argos, P. (1990) An investigation of oligopeptides linking domains in protein tertiary structures and possible candidates for general gene fusion. J Mol Biol 211: 943–958.
Atlung, T., and Ingmer, H. (1997) H-NS: a modulator of environmentally regulated gene expression. Mol Microbiol 24: 7–17.
Buggy, J.J., Sganga, M.W., and Bauer, C.E. (1994) Characterization of a light-responding trans-activator responsible for differentially controlling reaction center and light-harvesting-I gene expression in Rhodobacter capsulatus. J Bacteriol 176: 6936–6943.
Bult, C.J., White, O., Olsen, G.J., Zhou, L., Fleischmann, R.D., Sutton, G.G., et al. (1996) Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii. Science 273: 1058–1073.
Coetzee, T., Herschlag, D., and Belfort, M. (1994) E. coli proteins, including ribosomal protein S12, facilitate in vitro splicing of phage T4 introns by acting as RNA chaperones. Genes Dev 8: 1575–1588.
Derbyshire, V., Kowalski, J.C., Dansereau, J.T., Hauer, C.R., and Belfort, M. (1997) Two-domain structure of the td intron-encoded endonuclease I-TevI correlates with the two-domain configuration of the homing site. J Mol Biol 265: 494–506.
Dersch, P., Schmidt, K., and Bremer, E. (1993) Synthesis of the Escherichia coli K-12 nucleoid-associated DNA-binding protein H-NS is subjected to growth-phase control and autoregulation. Mol Microbiol 8: 875–889.
Devereux, J., Haeberli, P., and Smithies, O. (1984) A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res 12: 387–395.
Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, A.R., et al. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496–512.
Fraser, C.M., Gocayne, J.D., White, O., Adams, M.D., Clayton, R.A., Fleischmann, R.D., et al. (1995) The minimal gene complement of Mycoplasma genitalium. Science 270: 397–403.
Fraser, C.M., Casjens, S., Huang, W.M., Sutton, G.G., Clayton, R., Lathigra, R., et al. (1997) Genomic sequence of a Lyme disease spirochaete, Borrelia burgdorferi. Nature 390: 580–586.
Free, A., and Dorman, C.J. (1997) The Escherichia coli stpA gene is transiently expressed during growth in rich medium and is induced in minimal medium and by stress conditions. J Bacteriol 179: 909–918.
Free, A., Williams, R.M., and Dorman, C.J. (1998) The StpA protein functions as a molecular adapter to mediate repression of the bgl operon by truncated H-NS in Escherichia coli. J Bacteriol 180: 994–997.
Goyard, S., and Bertin, P. (1997) Characterization of BpH3, an H-NS-like protein in Bordetella pertussis. Mol Microbiol 24: 815–823.
Herschlag, D. (1995) RNA chaperones and the RNA folding problem. J Biol Chem 270: 20871–20874.
Himmelreich, R., Hilbert, H., Plagens, H., Pirkl, E., Li, B.C., and Herrmann, R. (1996) Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucleic Acids Res 24: 4420–4449.
Hromockyj, A.E., Tucker, S.C., and Maurelli, A.T. (1992) Temperature regulation of Shigella virulence: identification of the repressor gene virR, an analogue of hns, and partial complementation by tyrosyl transfer RNA (tRNA1Tyr). Mol Microbiol 6: 2113–2124.
Hulton, C.S., Seirafi, A., Hinton, J.C., Sidebotham, J.M., Waddell, L., Pavitt, G.D., et al. (1990) Histone-like protein H1 (H-NS), DNA supercoiling, and gene expression in bacteria. Cell 63: 631–642.
Jean, D., Gendron, D., Delbecchi, L., and Bourgaux, P. (1997) p53-mediated DNA renaturation can mimic strand exchange. Nucleic Acids Res 25: 4004–4012.
Kaneko, T., Sato, S., Kotani, H., Tanaka, A., Asamizu, E., Nakamura, Y., et al. (1996) Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions. DNA Res 3: 109–136.
Klenk, H.P., Clayton, R.A., Tomb, J.F., White, O., Nelson, K.E., Ketchum, K.A., et al. (1997) The complete genome sequence of the hyperthermophilic, sulphate-reducing archaeon Archaeoglobus fulgidus. Nature 390: 364–370.
Kunst, F., Ogasawara, N., Moszer, I., Albertini, A.M., Alloni, G., Azevedo, V., et al. (1997) The complete genome sequence of the Gram-positive bacterium Bacillus subtilis. Nature 390: 249–256.
La Teana, A., Falconi, M., Scarlato, V., Lammi, M., and Pon, C.L. (1989) Characterization of the structural genes for the DNA-binding protein H-NS in enterobacteriaceae. FEBS Lett 244: 34–38.
Moré, M.I., Puhlman, R.F., and Winans, S.C. (1996) Genes encoding the pKM101 conjugal mating pore are negatively regulated by the plasmid-encoded KorA and KorB proteins. J Bacteriol 178: 4392–4399.
O'Brien, G.J., and Mahanty, H.K. (1994) Colicin 24, a new plasmid-borne colicin from a uropathogenic strain of Escherichia coli. Plasmid 31: 288–296.
Pon, C.L., Calogero, R.A., and Gualerzi, C.O. (1988) Identification, cloning, nucleotide sequence and chromosomal map location of hns, the structural gene for Escherichia coli DNA-binding protein H-NS. Mol Gen Genet 212: 199–202.
Shi, X., and Bennett, G.N. (1994) Plasmids bearing hfq and the hns-like gene stpA complement hns mutants in modulating arginine decarboxylase gene expression in Escherichia coli. J Bacteriol 176: 6769–6775.
Shimada, H., Wada, T., Handa, H., Ohta, H., Mizoguchi, H., Nishimura, K., et al. (1996) A transcription factor with a leucine-zipper motif involved in light-dependent inhibition of expression of the puf operon in the photosynthetic bacterium Rhodobacter sphaeroides. Plant Cell Physiol 37: 515–522.
Shindo, H., Iwaki, T., Ieda, R., Kurumizaka, H., Ueguchi, C., Mizuno, T., et al. (1995) Solution structure of the DNA binding domain of a nucleoid-associated protein, H-NS, from Escherichia coli. FEBS Lett 360: 125–131.
Smith, D.R., Doucette-Stamm, L.A., Deloughery, C., Lee, H., Dubois, J., Aldredge, T., et al. (1997) Complete genome sequence of Methanobacterium thermoautotrophicum ΔH: functional analysis and comparative genomics. J Bacteriol 179: 7135–7155.
Sonden, B., and Uhlin, B.E. (1996) Coordinated and differential expression of histone-like proteins in Escherichia coli: regulation and function of the H-NS analog StpA. EMBO J 15: 4970–4980.
Spassky, A., Rimsky, S., Garreau, H., and Buc, H. (1984) H1a, an E. coli DNA-binding protein which accumulates in stationary phase, strongly compacts DNA in vitro. Nucleic Acids Res 12: 5321–5340.
Spurio, R., Falconi, M., Brandi, A., Pon, C.L., and Gualerzi, C.O. (1997) The oligomeric structure of nucleoid protein H-NS is necessary for recognition of intrinsically curved DNA and for DNA bending. EMBO J 16: 1795–1805.
Studier, F.W., Rosenberg, A.H., Dunn, J.J., and Dubendorff, J.W. (1990) Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol 185: 60–89.
Tietze, E., and Tschäpe, H. (1994) Temperature-dependent expression of conjugation pili by IncM plasmid-harbouring bacteria: identification of plasmid-encoded regulatory functions. J Basic Microbiol 34: 105–116.
Tomb, J.F., White, O., Kerlavage, A.R., Clayton, R.A., Sutton, G.G., Fleischmann, R.D., et al. (1997) The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 388: 539–547.
Ueguchi, C., Suzuki, T., Yoshida, T., Tanaka, K., and Mizuno, T. (1996) Systematic mutational analysis revealing the functional domain organization of Escherichia coli nucleoid protein H-NS. J Mol Biol 263: 149–162.
Ueguchi, C., Seto, C., Suzuki, T., and Mizuno, T. (1997) Clarification of the dimerization domain and its functional significance for the Escherichia coli nucleoid protein H-NS. J Mol Biol 274: 145–151.
Ussery, D.W., Hinton, J.C., Jordi, B.J., Granum, P.E., Seirafi, A., Stephen, R.J., et al. (1994) The chromatin-associated protein H-NS. Biochimie 76: 968–980.
Williams, R.M., and Rimsky, S. (1997) Molecular aspects of the E. coli nucleoid protein H-NS: a central controller of gene regulatory networks. FEMS Microbiol Lett 156: 175–185.
Williams, R.M., Rimsky, S., and Buc, H. (1996) Probing the structure, function, and interactions of the Escherichia coli H-NS and StpA proteins by using dominant negative derivatives. J Bacteriol 178: 4335–4343.
Zhang, A. (1994) Proteins that facilitate splicing of the bacteriophage T4 td intron. Dissertation, State University of New York, Albany, NY.
Zhang, A., and Belfort, M. (1992) Nucleotide sequence of a newly identified Escherichia coli gene, stpA, encoding an H-NS-like protein. Nucleic Acids Res 20: 6735.
Zhang, A., Derbyshire, V., Galloway Salvo, J.L., and Belfort, M. (1995) Escherichia coli protein StpA stimulates self-splicing by promoting RNA assembly in vitro. RNA 1: 783–793.
Zhang, A., Rimsky, S., Buc, H., and Belfort, M. (1996) Escherichia coli protein analogs StpA and H-NS: regulatory networks, similar and disparate effects on nucleic acid dynamics. EMBO J 15: 1340–1349.