THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1999 by The American Society for Biochemistry and Molecular Biology, Inc.
Vol. 274, No. 19, Issue of May 7, pp. 13127–13132, 1999 Printed in U.S.A.
Characterization and Cloning of CelR, a Transcriptional Regulator of Cellulase Genes from Thermomonospora fusca* (Received for publication, September 8, 1998, and in revised form, December 29, 1998)
Nikolay A. Spiridonov‡ and David B. Wilson§ From the Section of Biochemistry, Molecular and Cell Biology, Cornell University, Ithaca, New York 14853
CelR, a protein that regulates transcription of cellulase genes in Thermomonospora fusca (Actinomycetaceae) was purified to homogeneity. A 6-kilobase NotI-SacI fragment of T. fusca DNA containing the celR gene was cloned into Esherichia coli and sequenced. The celR gene encodes a 340-residue polypeptide that is highly homologous to members of the GalR-LacI family of bacterial transcriptional regulators. CelR specifically binds to a 14-base pair inverted repeat, which has sequence similarity to the binding sites of other family members. This site is present in regions upstream of all six cellulase genes in T. fusca. The binding of CelR to the celE promoter is inhibited specifically by low concentrations of cellobiose (0.2– 0.5 mM), the major end product of cellulases. The other sugars tested did not affect binding at equivalent or 50-fold higher concentrations. The results suggest that CelR may act as a repressor, and that the mechanism of induction involves a direct interaction of CelR with cellobiose.
Biodegradation of cellulose by bacteria and fungi is accomplished by extracellular cellulolytic enzymes encoded by genes that are subject to transcriptional control. Several regulators of the transcription of cellulase genes in the cellulolytic filamentous fungus, Trichoderma reesei, including the activators ACE I and ACE II and the glucose repressor, Cre1, have been described (1, 2). However, little is known about the molecular mechanisms of cellulase regulation in soil bacteria. The thermophilic actinomycete, Thermomonospora fusca, a major degrader of cellulose in plant residues, is an extensively studied cellulolytic bacterium. This species produces six different extracellular cellulases, designated E1 through E6. Three of the enzymes (E1, E2, and E5) are endocellulases, two are exocellulases (E3 and E6), and one enzyme possesses both exoand endocellulolytic activity (E4). The major product of these enzymes is cellobiose (3–5). Biosynthesis of cellulases in T. fusca is regulated by cellobiose induction and catabolite repression, with any readily metabolized sugar acting as a repressor (6). The lowest level of cellulase synthesis (3 nM) was observed with xylose as a carbon source, and the highest level was found in cultures grown on
* This work was supported by Grant DE-FG02– 84ER13233 from the Department of Energy Basic Energy Research Program. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. The nucleotide sequence(s) reported in this paper has been submitted to the GenBankTM/EBI Data Bank with accession number(s) AF086819. ‡ Permanent address: Institute of Theoretical and Experimental Biophysics, Russian Academy of Sciences, Pushchino, Moscow Region 142292, Russia. § To whom correspondence should be addressed: 458 Biotechnology Bldg., Cornell University, Ithaca, NY 14853. Tel.: 607-255-5706; Fax: 607-255-2428; E-mail:
[email protected]. This paper is available on line at http://www.jbc.org
microcrystalline cellulose. Endocellulases and exocellulases showed distinctly different regulation patterns, with exocellulases showing the highest level of induction (7). The structural genes for all these enzymes, designated celA through celE (8 –10) and celF,1 have been cloned and sequenced. All six cellulase genes in T. fusca contain the 14-bp inverted repeat sequence TGGGAGCGCTCCCA in their 59upstream regions. A regulatory protein that interacts with the upstream regions of cel genes was detected in T. fusca cultures grown on cellulose. The binding site was identified as the 14-bp inverted repeat by DNase I footprinting of the celE gene and by chemical footprinting of the celB gene (11, 12). The same palindrome was found in the regulatory regions of cellulase genes from different streptomycetes, in particular, the cenC gene from Streptomyces strain KSM9 (13), the celA genes from Streptomyces halstedii (14) and Streptomyces lividans (15), and also in the cel1 gene from Streptomyces reticuli (16), where it serves as the operator for a repressor protein (17). The same sequence was also found in a xylanase gene from Thermomonospora alba (18). The occurrence of this site in cellulase and xylanase genes from different species may indicate its special role in regulating carbohydrate catabolism in streptomycetes and other high GC Gram-positive bacteria. Here we report the isolation and properties of CelR, a regulatory protein from T. fusca that specifically binds to the 14-bp inverted repeat site in cellulase genes, as well as the complete nucleotide sequence of the celR gene. MATERIALS AND METHODS
Bacterial Strains and Plasmids—T. fusca ER1 is an extracellular protease-negative strain derived from T. fusca YX (12). Escherichia coli strain DH5a (19) was used for cloning and plasmid isolation. Plasmid pE5-46 has been described previously (3). Plasmid pNS1 was constructed as described below. Growth of Organisms—T. fusca was grown on Hagerdahl medium (20) supplemented with 0.5% Solka Floc (microcrystalline cellulose, James River Corp.). E. coli strains containing recombinant plasmids were grown in Luria broth or plated on Luria agar plates containing 0.1 mg/ml ampicillin. Quantitation of DNA Binding Activity—Plasmid pE5-46 from E. coli strain D541, which contained the celE gene from T. fusca (3) was cut with EcoRI, SalI, and XhoI, and the 39 ends of the fragments were labeled with [a-32P]dCTP or [a-32P]dATP, using DNA polymerase I Klenow fragment (21). The following 32P-labeled pE5-46 fragments were present in the reaction mixture: a high mobility 548-bp fragment (containing the celE promoter and regulatory region), a low motility 2656-bp fragment (containing the pE5-46 promoters), and a 1428-bp fragment (containing the coding region of the celE gene). Binding was carried out in 15–20 ml of 40 mM Tris-HCl buffer (pH 8.0), 150 mM KCl, 1 mM MgCl2, 0.1 mM EDTA, 0.1 mM DTT,2 3 ng of 32P-labeled fragments, T. fusca extract (1– 6 mg of protein), 10 mg/ml poly(dG-dC), and 12.5% glycerol. After incubation at 37 °C for 15 min, samples were electro-
1
D. Irwin, unpublished results. The abbreviations used are: DTT, dithiothreitol, PAGE, polyacrylamide gel electrophoresis; bp, base pair(s); kb, kilobase pair(s); HPLC, high performance liquid chromatography. 2
13127
13128
CelR, a Transcriptional Regulator of Cellulase Genes
phoresed through a 1.2% agarose gel in 90 mM Tris borate buffer (pH 8.3) containing 3 mM EDTA. Gels were dried at 80 °C under vacuum and radioautographed. The extent of binding to the celE regulatory region was estimated from dilutions that yielded 25–75% conversion of the DNA fragment to the slower mobility DNA-protein complex. One unit of celE promoter binding activity is the amount of DNA-binding protein that converts 50% of the DNA fragment to a DNA-protein complex under the assay conditions. Purification of the CelR Protein—T. fusca strain ER1 was grown in 10 liters of Hagerdahl medium (20) containing 0.5% Solka Floc and 2 g of yeast extract at 52 °C (pH 7.2 at 40% oxygen saturation). Cells were harvested after 15 h of cultivation, sedimented and resuspended in lysis buffer containing 10 mM Tris-HCl (pH 7.8), 1 mM EDTA, 0.1 mM phenylmethylsulfonyl fluoride, and 0.1 mM DTT. All purification steps were performed at 4 °C, unless otherwise noted. Cells were lysed with a French press, and cell debris was removed by centrifugation at 10,000 3 g for 25 min. Streptomycin sulfate (4%) was added to the supernatant, and the DNA precipitate was removed after 1 h by centrifugation at 10,000 3 g for 25 min. Sodium chloride (0.5 M) was added, and the supernatant (700 ml) was loaded on a 200-ml phenyl-Sepharose CL-4B column (Amersham Pharmacia Biotech). The column was eluted successively with 0.5 M, 20 mM NaCl in 10 mM Tris-HCl (pH 7.8) containing 1 mM EDTA, 0.1 mM DTT, the same buffer without NaCl, and with distilled water. Fractions eluted with distilled water (150 ml) were applied to a 40-ml heparin-Sepharose CL-4B column (Amersham Pharmacia Biotech) equilibrated with 10 mM Tris-HCl (pH 7.8), 0.1 M NaCl, 1 mM EDTA, 0.1% Triton X-100, and 0.1 mM DTT (buffer A). The column was eluted with a linear gradient of NaCl from 0.1 to 1.5 M (400 ml). Appropriate fractions (120 ml) were pooled, dialyzed against buffer A, and loaded on a 7-ml DNA-Sepharose column that contained DNA fragments with multiple copies of the 14-bp inverted repeat (TGGGAGCGCTCCCA) coupled to Sepharose 4B (Amersham Pharmacia Biotech). DNA affinity chromatography was performed in buffer A at room temperature. After washing with 25 ml of buffer A, the column was successively eluted with 0.3, 0.6, and 1.0 M NaCl in buffer A. Electrophoretically pure (.95%) CelR protein was eluted with 10 ml of warm (50 °C) 1.8 M NaCl, concentrated, and desalted using Centricon-30 concentrators. Protein was determined with the BCA reagent (Pierce) using bovine serum albumin (Sigma) as a standard. Molecular Weight Determination—The molecular weight of CelR was determined by 12% SDS-PAGE (22) using low and high molecular weight protein standards (Sigma). Gels were stained with Coomassie Brilliant Blue R-250 (Sigma). Duplicate gels were cut into 2-mm sections and washed successively with 50% isopropanol, 10% isopropanol, and 20 mM Tris-HCl (pH 7.8), 1 mM EDTA. Gel sections were homogenized in the same buffer, and protein was extracted overnight at 4 °C. DNA binding activity in gel extracts was measured as described. The molecular weight of native CelR was determined by HPLC gel chromatography. A Waters Protein Pak 200SW glass column (8 mm 3 30 cm) was equilibrated and eluted with 10 mM Tris-HCl (pH 8.0), 0.1 M NaCl at 15 ml/h. The column was calibrated with gel chromatography protein standards (Bio-Rad). Purified CelR (28 mg in 200 ml) was applied to the column, and fractions of 0.25 ml were collected and assayed for celE-promoter binding activity. Effect of Sugars on CelR Protein-DNA Binding—The effect of sugars on the CelR-DNA interaction was measured by gel retardation experiments using purified CelR with the conditions described above. Sugars (0.1–100 mM) were introduced into the reaction mixture prior to the addition of CelR (10 –90 pg/ml). Protein Sequencing—Purified CelR was cleaved with endo-lys-C, and the products of digestion were separated by HPLC. The N-terminal sequence of CelR and an internal peptide were determined on an Applied Biosystems model 492 automated protein sequencer at the BioResource Center, New York State Center for Advanced Technology (Cornell University, Ithaca, NY). An internal standard peptide was included in all sequencing runs in order to independently verify the performance of the sequencer. Isolation of Genomic DNA from T. fusca—Genomic DNA was isolated from a 125-ml T. fusca culture grown for 24 h on 0.5% cellobiose. Cells were pelleted by centrifugation (8,000 rpm, 15 min), frozen, thawed, and resuspended in 30 ml of 50 mM Tris-HCl (pH 8.0), 25 mM EDTA at 4 °C. SDS (1%), and 0.8 ml of diethyl pyrocarbonate were added. Cells were French pressed at 5,000 p.s.i. directly into 35 ml of phenol:chlorophorm:isoamyl alcohol mixture (25:24:1) on ice. The lysate was mixed and centrifuged (8,000 rpm, 15 min), and the water phase was treated with phenol three more times. Then the water phase was consecutively extracted with a chlorophorm:isoamyl alcohol mixture (24:1) and with ether. DNA was precipitated from the water phase on ice with 0.3 M
NaCl and 2.5 volumes of cold ethanol, spooled on a glass rod, washed in cold ethanol, and redissolved in 1.5 ml of TE buffer (pH 8.0) overnight. Cloning of the celR Gene—Degenerate sets of 15-mer oligonucleotides GC(C/G/T)GT(C/G)ATCAACGGC encoding the N-terminal region of the CelR protein from Arg25 to Gly29, and GA(G/A)AACAACCAGAA(G/A), complementary to the strand encoding the internal part of the protein from Glu73 to Lys77 were synthesized by the BioResource Center, Cornell University. Oligonucleotides were labeled with fluorescein by the 39-tailing reaction with terminal transferase. Total genomic DNA from T. fusca was completely digested with NotI and SacI, electrophoresed on a 0.7% agarose gel, Southern blotted into a Hybond N1 membrane (Millipore), and hybridized with fluorescein-labeled oligonucleotides. The 39-oligolabeling and signal amplification system for the FluorImager (Amersham Pharmacia Biotech) was used in the hybridization experiments. A 6-kb DNA fragment that hybridized to both oligonucleotide probes was visualized with a STORM 840 scanner and the ImageQuant program (Molecular Dynamics). Fragments of ;6 kb were isolated from a NotI-SacI complete digest of genomic DNA separated on a 0.7% agarose gel and purified with a QIAquick gel extraction kit (Qiagen). The DNA was ligated to pBluescript SK1 (Stratagene, La Jolla, CA) digested with the same restriction enzymes. The ligation mixture was used to transform E. coli strain DH5a, and the cells were plated on LB ampicillin plates containing 5-bromo-4-chloro-3-indolyl b-D-galactopyranoside and isopropyl-1-thiob-D-galactopyranoside (21). Transformants were screened by hybridization of E. coli colonies on Hybond N1 membrane with a fluoresceinlabeled probe, as described. Plasmid DNA was purified from positive transformants with a plasmid miniprep kit (Qiagen), digested with NotI and SacI, electrophoresed on 1.2% agarose gel, Southern blotted into Hybond N1 membrane, and hybridized with both fluorescein-labeled probes, as described. Plasmid DNA from one of the clones that carried a 6-kb insert which hybridized with both oligonucleotide probes was called pNS1. DNA Sequencing—Double-stranded DNA from pNS1 was used as a template for sequencing the celR structural gene and its 39- and 59flanking regions. The sequences of both strands were determined by the dideoxy chain termination method (23) at the BioResource Center, Cornell University. The two degenerate 15-mer oligonucleotides were used to determine the initial nucleotide sequences. Specific primers for sequencing the celR gene were synthesized by the BioResource Center. DNA and Protein Sequence Analysis—Gapped BLAST and PSIBLAST programs (24) were used for searching for sequence homologies in the GenBank and SwissProt data bases. The Pfam HMM program3 was used for creating alignment with the GalR/LacI family. Sequencher (Gene Codes Corp.) was used for analysis and assembling DNA sequencing data. Calculations of codon usage, molecular weight, isoelectric point, and hydrophobicity of the protein product were made with DNASTAR Lasergene software. RESULTS
CelE Promoter Binding Activity in T. fusca Extracts—T. fusca cultures grown on microcrystalline cellulose possessed considerable celE promoter binding activity (Fig. 1). Preliminary experiments showed that the maximum level of activity was observed after 12–24 h of cultivation, and that the levels of binding activity remained high during an additional 48 h of cultivation. Cells grown on non-inducing carbon sources had insignificant celE promoter binding activity (data not shown). Reduced motility of the high molecular weight DNA fragment containing the E. coli plasmid promoters was also observed (Fig. 1, lanes 2– 4). However, it was independent of the carbon source used to grow T. fusca. Presumably, the reduced motility of the high molecular weight fragment was due to the interaction of RNA polymerase from T. fusca present in the crude extract with plasmid promoters from pE5-46. No shift was detected for the 1428-bp DNA fragment, which contained no promoter elements. Purification of CelR Protein—The CelR protein was purified from 10 liters of T. fusca culture by three chromatographic steps. The process of purification was followed by the gel retardation assay, Lowry protein measurement (25), and SDS3 The Pfam HMM program is available via the World Wide Web (http://pfam.wustl.edu/hmmsearch.shtml).
CelR, a Transcriptional Regulator of Cellulase Genes
FIG. 1. DNA binding activity of T. fusca ER1 extract and the purified CelR protein. A gel retardation assay was used to analyze binding to 32P-labeled DNA fragments. A, DNA fragment containing the celE promoter; B, celE promoter DNA-CelR complex; C, DNA fragment containing celE coding region; D, DNA fragment containing pE5-46 promoters. Lane 1, no protein; lanes 2– 4, crude T. fusca extract (120, 240, and 480 ng of intracellular protein); lanes 5– 8, purified CelR protein (0.22, 0.44, 0.88, and 1.75 ng).
PAGE (22). A purification summary is presented in Table I. Purification yielded highly active protein of greater that 95% purity (Fig. 2). The resulting CelR bound only to the DNA fragment containing the celE promoter (Fig. 1). Molecular Weight of CelR Protein—The molecular mass of pure CelR protein as determined by SDS-PAGE is 41 6 1.5 kDa. Samples of partially purified and pure CelR were separated with SDS-PAGE, and the gels were cut into 2-mm sections. Protein was eluted from each section, renatured, and tested for DNA binding activity. In each case the binding activity was associated with the 41-kDa band (data not shown). Estimation of the native molecular weight of purified CelR with HPLC gel chromatography showed that the protein aggregated under non-denaturing conditions. About 3% of CelR came off the column as a dimer (70 – 85 kDa), a tetramer (145–175 kDa), and an octamer (250 – 400 kDa). The major part of the protein formed high molecular weight aggregates that could not be resolved by HPLC. Only traces of CelR monomer were found. CelR Binding Affinity—The dissociation constant was calculated as the concentration of CelR that caused 50% of the DNA to form a complex with the protein under the described conditions. The apparent Kd for the CelR-celE promoter complex in the absence of cellobiose was 0.5–1 3 1029 M. Effect of Cellobiose on CelR Protein-DNA Binding—The binding of CelR to the celE promoter region was inhibited by cellobiose at concentrations of 0.2– 0.5 mM and higher (Fig. 3). The apparent dissociation constant for cellobiose was 5 3 1024 M, as measured in gel retardation experiments under the above conditions with 75 pg/ml CelR. The effect of cellobiose on the dissociation constant of the CelR-celE promoter complex is shown on Fig. 4. A 500-fold increase in cellobiose concentration (from 0.1 to 50 mM) resulted in a 7-fold increase in Kd for the CelR-celE promoter complex. At the same time, equivalent or higher concentrations of other tested sugars did not affect binding. Cellotriose, sophorose, and xylobiose showed little inhibition at 50 mM. Other mono- and disaccharides (glucose, galactose, mannose, xylose, arabinose, sucrose, lactose, and maltose) had no effect on binding even at 100 mM (data not shown). CelR Gene Cloning and Characterization—The N terminus
13129
and an internal region of CelR were sequenced. Two degenerate oligonucleotides prepared on the basis of reverse translation of the amino acid sequence were used to select E. coli clones. About 200 E. coli transformants from a NotI-SacI library were screened. Six transformants contained plasmid DNA with a 6-kb insert that hybridized with the two different oligonucleotide probes. Plasmid DNA (pNS1) from one of the positive clones was used as a template for sequencing. The DNA insert contained a three-cistron operon bglABC coding for two sugar permeases and a b-glucosidase,4 and celR, coding for the regulatory protein. Both strands were sequenced, and the sequence of celR is shown in Fig. 5. The celR gene has a G1C content of 68%, which is similar to the 65% G1C content of T. fusca DNA (9). A reading frame from nucleotide 109 to nucleotide 1128 encodes a 340-amino acid moderately hydrophobic protein (125 hydrophobic amino acids). The molecular weight of CelR, inferred from its amino acid sequence, is 36,863 daltons, lower than its apparent molecular weight determined by SDSPAGE. The reading frame that codes for CelR has a high content of G and C (85.6%) in the third positions of the codons, typical for cellulase genes from T. fusca (8 –10). The N-terminal (first 37 amino acids) and an internal (amino acids 60 –77) sequence of the protein read from the nucleotide sequence of the celR gene are identical to the N-terminal sequence of CelR protein and the sequence of its endo-lys-C cleavage product, as determined with the protein sequencer. A potential ribosome binding site (GGAA) is located 5 bases upstream of the start codon. The 59-regulatory region of celR contains an inverted repeat that may act as a transcription terminator for the bglC gene whose termination codon is located 108 nucleotides upstream of the celR start codon. A putative transcription terminator sequence for the celR gene (a 21-base palindrome) is located immediately downstream of its structural region. It is interesting to note that the reverse strand of the celR gene contains a 1008-nucleotide open reading frame that begins with GTG start codon (nucleotides 1140 –1138) and terminates with TAG (nucleotides 135–133). A putative ribosomebinding site (AGG) is located 5 nucleotides upstream from the start codon. The hypothetical 335-amino acid protein product is 23% identical to a putative ATP/GTP-binding protein from Streptomyces coelicolor, AL031225. CelR Sequence Similarities—The amino acid sequence of CelR has been scanned against the GenBank and EMBL data bases. CelR shares significant homology with a number of proteins that belong to the GalR-LacI family of bacterial transcriptional regulators. It is 49% identical to a transcriptional regulator from S. coelicolor (e1309425), 36% identical to a ribose operon repressor RbsR from E. coli (P25551), 32% identical to a transcriptional repressor CytR from E. coli (P06964), a transcriptional regulator DegA from Bacillus subtilis (e1173517), and a ribose operon repressor RbsR from Haemophilus influenzae (P44329), 30% identical to a galactose operon repressor GalR (P03024), a maltose repressor MalI (M60722), and a sucrose operon repressor CscR (P40715) from E. coli, 29% identical to the lactose repressor from E. coli (P03023) and to many other members of the family. The N-terminal helix-turn-helix motif, located between amino acids 8 and 29, is the most conserved part of the CelR protein. It is 65% identical and 80% similar to the GalR-LacI consensus (Fig. 6). An unusual feature of CelR is a cluster of four consecutive arginine residues near the N terminus that is followed by proline and threonine (amino acids 3– 8). This may represent a phosphorylation site for a cAMP- and cGMP-de-
4
N. A. Spiridonov, unpublished results.
13130
CelR, a Transcriptional Regulator of Cellulase Genes TABLE I Purification of the CelR regulatory protein from T. fusca Fraction
Crude extract Phenyl-Sepharose Heparin-Sepharose DNA affinity-Sepharose
Volume
Total protein
Total activity
ml
mg
unit 3 10
650 150 120 12
3,120 117 23 0.105
13,000 6,000 4,800 240
3
Yield
Specific activity
Purification
%
unit/mg 3 10
-fold
100 46 37 1.8
4.2 51.3 210 2,290
3
1 12 50 547
FIG. 4. The effect of cellobiose on the dissociation constant of CelR-celE promoter complex.
FIG. 2. Electrophoregrams of CelR purification fractions separated by 12% SDS-PAGE and stained with Coomassie Brilliant Blue R-250. Lane 1, protein molecular weight markers; lane 2, crude extract of T. fusca; lane 3, phenyl-Sepharose; lane 4, heparin-Sepharose; lane 5, affinity DNA-Sepharose.
FIG. 3. Inhibition of CelR protein binding to the celE promoter by cellobiose. A–D, as in Fig. 1. Lane 1, no protein; lane 2, 1.2 ng of CelR protein; lanes 3– 8, 1.2 ng of CelR protein and cellobiose (0.2, 0.5, 1, 5, 10, or 50 mM).
pendent protein kinase (26). Location of this site at the boundary of the DNA-binding motif may indicate its involvement in regulation of DNA-protein interactions. The amino acid sequence (residues 205–210, 256 –260, and 282–285) of CelR also shares some homology with elements of the sugar binding sites of the GalR/LacI proteins (27, 28). DISCUSSION
A functionally active DNA-binding protein that interacts specifically with the 59-regulatory region of the celE gene was isolated from T. fusca grown on microcrystalline cellulose. CelR did not bind to the E. coli promoters in pE5-46 or to the coding region of the celE gene; it bound tightly to the celE promoter.
Sequence similarities showed that CelR belongs to the GalRLacI family of transcriptional regulators. The GalR-LacI regulators bind to their target DNA sites as homodimers, and their operator sequences are inverted repeats. Each monomer of a homodimer interacts with its half of an operator sequence with an N-terminal DNA-binding domain that contains a characteristic helix-turn-helix motif (27). The CelR protein shares a number of common features with other members of this family. CelR possesses a typical DNA binding domain, based on sequence comparison, and its binding site is similar to operator sites for the lactose repressor LacI from E. coli, the amylase repressor CcpA from Bacillus subtilis, and some other members of the GalR-LacI family (Table II). Unlike other GalR-LacI proteins that form dimers and tetramers, CelR, because of its hydrophobicity, shows strong aggregation under non-denaturing conditions. The 59 regulatory regions of all known cellulase genes in T. fusca contain from one to three copies of this sequence (7). Presumably, all six cel genes in T. fusca are under transcriptional control of the CelR protein. Apparently, cellulase regulation in T. fusca follows the general design of transcriptional control of genes encoding enzymes for carbohydrate catabolism in other eubacteria. In particular, it resembles the regulatory system of the amylase and chitinase genes in S. lividans that are controlled by Reg1, a member of the GalR/LacI family, that acts as a positive regulator of chitinase genes under inducing conditions, or as a negative regulator of a-amylase and chitinase production in the presence of glucose (29). Similar to reg1, which is not adjacent to the a-amylase genes in S. lividans controlled by reg1, celR is not adjacent to any of the T. fusca cellulase genes. Cellulase regulation in T. fusca is different from regulation of cellulase genes in the fungus T. reesei, although the two evolutionally distant species possess similar cellulolytic enzyme systems (30). In T. reesei, the cellulase genes are controlled by the transcriptional activators ACE I and ACE II and a glucose repressor Cre1 belonging to the family of Cys2-His2 zinc finger proteins (1, 2). The vast majority of proteins of the GalR/LacI family bind carbohydrate or nucleoside effectors (27). Our results show that
CelR, a Transcriptional Regulator of Cellulase Genes
13131
FIG. 5. DNA sequence of the celR gene and deduced amino acid sequence of CelR protein. The N-terminal domain of CelR and its endo-lys-C cleavage product determined by protein sequencing are noted. RBS, ribosome binding site. Inverted repeats in 59- and 39-regions of the gene are indicated with arrows.
FIG. 6. Alignment of the N-terminal sequence of the CelR protein from T. fusca with the consensus sequence of 37 bacterial regulatory proteins belonging to the GalR/LacI family (Pfam HMM program). TABLE II DNA binding half-sites for CelR and other bacterial regulators of the GalR-LacI family (27) Regulator
CelR LacI CcpA GalR GalS a
Sequencea
Species
TGGGAGC TTGTGAGC TGTAAGC GTGKAANC GTGKAANC
T. fusca E. coli B. subtilis E. coli E. coli
N 5 any base, K 5 G/T.
the DNA binding activity of CelR is modulated by cellobiose. Results of in vitro experiments explain the literature data on the induction of cellulase production in T. fusca cultures by cellobiose (6, 7). The fact that low physiological levels of cellobiose (0.2– 0.5 mM) are sufficient for dissociation of the CelRcelE promoter complex in vitro is evidence that cellobiose is the true inducer of the cel genes, and that the mechanism of induction involves a direct interaction of CelR with cellobiose. In the absence of cellobiose, CelR presumably forms complexes with the 14-bp inverted repeats next to the cel genes and blocks their transcription. However, a very low constitutive level of cellulase synthesis (about 3 nM) was observed even when cells were
grown on non-inducing sugars (7). When cellulose is present in the environment, cellobiose resulting from its digestion enters cells and form complexes with CelR, allowing transcription to proceed. As cellobiose in the cells is exhausted, CelR likely represses transcription. This simple mechanism may allow quick adaptation of cells to changing environments, improve the efficiency of the cel regulon, and help avoid the unnecessary production of extracellular enzymes. The occurrence of the 14-bp inverted repeat in the upstream regions of cel genes from different Streptomyces species (13–16) suggests that this regulatory mechanism may be also shared by other cellulase genes. One result that is not explained by this model is the absence of active CelR protein in cells grown without cellulose or cellobiose. The mechanism of cellulase regulation in T. fusca under non-inducing conditions requires further investigation. Acknowledgments—We gratefully thank Diana Irwin, Sheng Zhang, Joseph Calvo and Theodore Thannhauser for advice and discussions, Guifang Lao for preparation of the DNA affinity column, and William Enslow for expert protein sequencing and HPLC gel chromatography. REFERENCES 1. Ilmen, M., Thrane, C., and Penttila, M. (1996) Mol. Gen. Genet. 251, 451– 460 2. Saloheimo, A., Ilmen, M., Aro, N., Margolles-Clark, E., and Penttila, M. (1998) in Carbohydrases from Trichoderma reesei and Other Microorganisms: Structures, Biochemistry, Genetics and Applications (Claeyssens, M., Nerinckx, W., and Piens, K., eds) pp. 267–273, The Royal Society of Chemistry, Cambridge, United Kingdom 3. Irwin, D., Walker, L., Spezio, M., and Wilson, D. (1993) Biotechnol. Bioeng. 42, 1002–1013 4. Barr, B. K., Hsieh, Y.-L., Ganem, B., and Wilson, D. B. (1996) Biochemistry 35, 586 –592 5. Sakon, J., Irwin, D., Wilson, D. B., and Karplus, P. A. (1997) Nat. Struct. Biol. 4, 810 – 817
13132
CelR, a Transcriptional Regulator of Cellulase Genes
6. Lin, E., and Wilson, D. B. (1987) J. Bacteriol. 53, 1352–1357 7. Spiridonov, N. A., and Wilson, D. B. (1998) J. Bacteriol. 180, 5463–5467 8. Jung, E. D., Lao, G., Irwin, D., Barr, B. K., Benjamin, A., and Wilson, D. B. (1993) Appl. Environ. Microbiol. 59, 3032–3043 9. Lao, G., Ghangas, G. S., Jung, E. D., and Wilson, D. B. (1991) J. Bacteriol. 173, 3397–3407 10. Zhang, S., Lao, G., and Wilson, D. B. (1995) Biochemistry 34, 3386 –3395 11. Lin, E., and Wilson, D. B. (1988) J. Bacteriol. 170, 3843–3846 12. Wilson, D. B. (1992) Crit. Rev. Biotechnol. 12, 45– 63 13. Nakai, R., Horinouchi, S., and Beppu, T. (1988) Gene (Amst.) 65, 229 –238 14. Fernandez-Abalos, J. M., Sanchez, P., Coll, P. M., and Villanueva, J. R. (1992) J. Bacteriol. 174, 6368 – 6376 15. Theberge, M., Lacaze, P., Shareck, F., Morosoli, R., and Kluepfel, D. (1992) Appl. Environ. Microbiol. 58, 815– 820 16. Schlochtermeier, A., Walter, S., Schroeder, J., Moorman, M., and Schrempf, H. (1992) Mol. Microbiol. 6, 3611–3621 17. Walter, S., and Schrempf, H. (1996) Mol. Gen. Genet. 251, 186 –195 18. Blanco, J., Coque, J. J. R., Velasco, J., and Martin, J. F. (1997) Appl. Microbiol. Biotechnol. 48, 208 –216 19. Hanahan, D. (1983) J. Mol. Biol. 166, 557–580 20. Hagerdahl, B. G. R., Ferchak, J. D., and Pye, E. K. (1978) Appl. Environ.
Microbiol. 36, 606 – 612 21. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY 22. Laemmli, U. K. (1970) Nature 227, 680 – 685 23. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463–5467 24. Altschul, S. F., Madden, T. L., Scha¨ffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. (1997) Nucleic Acids Res. 25, 3389 –3402 25. Lowry, O. H., Rosebrough, N. J., Farr, A. L., and Randall, R. J. (1951) J. Biol. Chem. 193, 265–275 26. Glass, D. B., el-Maghrabi, M. R., and Pilkis, S. J. (1986) J. Biol. Chem. 261, 2987–2993 27. Weickert, M. J., and Adhya, S. (1992) J. Biol. Chem. 267, 15869 –15874 28. Markiewicz, P., Kleina, L. G., Cruz, C., Ehret, S., and Miller, J. H. (1994) J. Mol. Biol. 240, 421– 433 29. Nguyen, J., Francou, F., Virolle, M.-J., and Guerieau, M. (1997) J. Bacteriol. 179, 6383– 6390 30. Wilson, D. B., and Irwin, D. (1999) in Recent Progress in Bioconversion of Lignocellulosics, in press