Biochemical Society Transactions
Received 10 September 1991
Bacterial cellulases
Pierre Béguin, Jacqueline Millet, Sylvie Chauvaux, Sylvie Salamitou, Kostas Tokatlidis, Jesus Navas, Tsuchiyoshi Fujino, Marc Lemaire, Odette Raynaud, Marie-Kim Daniel and Jean-Paul Aubert
Unité de Physiologie Cellulaire, Département des Biotechnologies, Institut Pasteur, 28 rue du Dr Roux, 75724 Paris Cedex 15, France
Introduction
Among prokaryotes, a large number of saprophytic soil bacteria and plant pathogens produce cellulolytic enzymes, but comparatively few can utilize crystalline cellulose as a carbon source. Furthermore, the total mass of cellulases secreted even by truly cellulolytic bacteria is lower by at least one or two orders of magnitude than the amount secreted by filamentous fungi such as Trichoderma reesei, which makes biochemical studies rather difficult. However, the discovery that some bacterial cellulases have a very high specific activity has greatly stimulated research in this area. Another factor was the introduction of recombinant DNA technology, which is much easier to carry out with prokaryotes than with eukaryotes. This paper focuses on the biochemistry of bacterial cellulases, with particular emphasis on recent developments in structural and functional analysis based on recombinant DNA technology.
Cellulases: basic properties of individual enzymes
Cellulases can be loosely defined as enzymes that hydrolyse β-1,4 glucosidic bonds. They can be classified according to their mode of action and substrate specificity. The three main categories are (i) endoglucanases, which attack cellulose molecules at random, combining at multiple sites within the molecule; (ii) exoglucanases, which attack cellulose chains stepwise from the non-reducing end, liberating cellobiose (cellobiohydrolases) or glucose (glucohydrolases) at each step; and (iii) β-glucosidases, which hydrolyse cellobiose and low-molecular-mass cellodextrins to glucose. Other parameters of activity, such as the affinity of enzymes for cellulose, the stereochemical course of the reaction (retention versus inversion of configuration at the C-1 carbon) or the rate of hydrolysis of various substrates derived from, or structurally related to, cellulose are also useful in defining cellulase specificity and mode of action. Understanding the structural basis of such functional properties is one of the long-term goals of enzymologists working with cellulases.
Abbreviations used: CBD, cellulose-binding domain; DEPC, diethyl pyrocarbonate.
Volume 20
Primary structure of cellulases
Bacterial cellulase genes are easier to clone than fungal genes. It is therefore not surprising that most of the sequence information about cellulase genes and their products comes from bacteria. Over 50 cellulase genes have now been sequenced, and a first set of conclusions can be drawn by comparing the sequences.
Biochemistry of Plant Polysaccharides
Modular structure of cellulases. Many cellulases are composed of multiple domains, which can be found in various combinations in different enzymes. In several cases the function of specific domains has been identified by proteolytic truncation experiments or by genetic deletion analysis. It is probably safe to assume that regions of other proteins sharing homology with the identified domains also have similar functions.
Catalytic domains. The size of cellulase domains responsible for catalytic activity ranges between approximately 300 and 500 residues. Using hydrophobic cluster analysis, the catalytic domains of most cellulases have been classified into six broad families, termed A to F [1]. A catalogue of the various enzymes belonging to each family is given in [2]. The analysis of features that are correlated with structural similarity provides some interesting insights:
(1) There is only a poor correlation between the degree of similarity of various enzymes and the phylogenetic relatedness of the organisms that produce them. At least four of the families contain enzymes of both prokaryotic and eukaryotic origin, sometimes even belonging to the same subtype. This suggests that interspecific exchange of the ancestral DNA sequences encoding the catalytic domains of cellulases has occurred on a very wide basis.
(2) In spite of probably sharing the same basic type of structural framework, enzymes of the same family can vary considerably in their enzymic properties. At least two families (B and C) contain both endo- and exoglucanases, and in many cases enzymes with clearly related sequences differ in their pattern of specificity toward different substrates (cellodextrins, carboxymethylcellulose, chromogenic cellobiosides, xylan). Such behaviour is not unexpected, since a few changes may suffice to alter the substrate-binding site, and therefore the specificity of the enzyme.
(3) In the best represented families, only a small number of residues are strictly conserved in all enzymes. Since at least some of the conserved residues are likely to be involved in the active site, sequence alignments are quite helpful in designing mutagenesis experiments to locate essential catalytic residues.
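The use of sequence alignments to shortlist mutagenesis targets, as described in point (3), can be sketched as follows. This is a minimal illustration in Python, assuming that a gapped multiple alignment has already been computed. The three toy sequences, the function name `strictly_conserved_columns` and the default candidate set (Asp, Glu, His) are hypothetical choices made for the example and are not taken from any cellulase family.

```python
def strictly_conserved_columns(alignment, candidates="DEH"):
    """Return (column_index, residue) pairs for alignment columns that are
    identical in every sequence and whose residue is in `candidates`.

    `alignment` is a list of equal-length strings; '-' marks a gap.
    Column indices are 0-based positions in the alignment, not in any
    individual protein.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")

    hits = []
    for col in range(length):
        residues = {seq[col] for seq in alignment}
        if len(residues) == 1:
            (res,) = residues
            # Skip all-gap columns and residues outside the candidate set.
            if res != "-" and res in candidates:
                hits.append((col, res))
    return hits


# Hypothetical toy alignment, for illustration only.
toy_alignment = [
    "MKDLE-HGSA",
    "MRDIEAHGTA",
    "LKDVE-HGSV",
]

print(strictly_conserved_columns(toy_alignment))
```

Restricting the output to acidic residues and histidine reflects the kind of side chains usually suspected of a catalytic role; passing a different `candidates` string (for example one that includes tryptophan) would adapt the same scan to other questions.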
Proline and hydroxyamino acid-rich segments. In many cellulases, the various domains present in the protein are separated by segments that are greatly enriched in proline and/or hydroxyamino acids. These segments often display a highly reiterated structure. So far, no specific function has been assigned to them.
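Segments of this kind can be located in a translated gene sequence with a simple composition scan. The sketch below is an illustrative Python example only: the window length, the threshold, the function name `linker_like_windows` and the toy sequence are all assumptions made here, and real analyses would need values tuned to the proteins under study.

```python
def linker_like_windows(sequence, window=20, threshold=0.6, residues="PST"):
    """Return the 0-based start positions of all windows in which the
    fraction of Pro/Ser/Thr residues is at least `threshold`."""
    sequence = sequence.upper()
    if window <= 0 or window > len(sequence):
        return []

    starts = []
    # Count matching residues in the first window, then update the count
    # incrementally as the window slides one residue at a time.
    count = sum(1 for aa in sequence[:window] if aa in residues)
    for start in range(len(sequence) - window + 1):
        if start > 0:
            if sequence[start - 1] in residues:
                count -= 1
            if sequence[start + window - 1] in residues:
                count += 1
        if count / window >= threshold:
            starts.append(start)
    return starts


# Hypothetical sequence: two short flanking stretches around a Pro/Thr repeat.
toy_protein = "MKVLAAGW" + "PTPTPTPTPT" + "GDWEQLKA"

print(linker_like_windows(toy_protein, window=10, threshold=0.8))
```

Overlapping hits mark one contiguous enriched region; merging consecutive start positions gives its approximate boundaries.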
Cellulose-binding domains. A number of cellulolytic enzymes contain domains, ranging between 30 and 130 residues, which do not participate in catalysis but enhance binding of the proteins to cellulose. Among bacterial enzymes, the Cellulomonas fimi type of cellulose-binding domain (CBD) is the most widespread and the most extensively studied. The C. fimi-type CBD is about 100 residues in length and binds tightly to cellulose in the presence of salt, suggesting that binding involves hydrophobic interactions. Indeed, sequence alignments show that four tryptophan residues are strictly conserved and might therefore be involved in contacts with carbohydrate. The cellulose-binding properties of the C. fimi-type CBD are independent of the rest of the molecule and offer interesting potential applications for the construction of chimeric genes in which the sequence encoding a protein of interest is fused in frame with the CBD coding sequence. The resulting fusion protein could then be purified by affinity chromatography on cellulose, a cheap and readily available support. The feasibility of the scheme was demonstrated for fusions between the C. fimi CenA CBD and alkaline phosphatase or Agrobacterium sp. β-glucosidase [3]. Other types of CBDs have been recently identified. One of them is a segment of about 130 residues first identified at the carboxyl end of Bacillus subtilis endoglucanase as a region non-essential for activity, and subsequently found in several other bacterial endoglucanases [2]. The cellulose-binding capacity of the segment present in Bacillus lautus CelA was demonstrated after controlled proteolysis and separation of the catalytic core from the CBD (C. Hansen, personal communication).
Regions involved in cellulose binding, but not required for activity, have also been identified in the C-terminal regions of endoglucanase CelE of Clostridium thermocellum [4] and endoglucanase CelZ of Erwinia chrysanthemi [5], but they appear to be unique, at least among the cellulases sequenced so far.
Other domains. A highly conserved region of about 65 amino acids was first identified in several endoglucanases and in a xylanase of Cl. thermocellum [6], and later found in two other cellulolytic clostridia, Clostridium cellulolyticum [7] and Clostridium cellulovorans [7a]. This region, which is usually located at the C-terminus of the proteins, contains two homologous segments of 23 residues each, separated by 9-15 amino acids. Deletion experiments show that it is not directly involved in catalysis or substrate binding [4, 8-10]. Recent data [10a] indicate that the duplicated segment serves to anchor the various catalytic components to the high-molecular-mass, multi-enzyme complex, termed the cellulosome, which is responsible for the hydrolysis of crystalline cellulose (see below).
1992
Probing the catalytic mechanism of cellulases
Like other carbohydrases, cellulases are thought to act by an acid-base mechanism of hydrolysis involving two carboxylic residues, one acting as a proton donor to favour the release of the non-reducing end group and the other stabilizing the resulting carbonium ion. This mechanism is well documented in the case of lysozyme.
At present, the most thoroughly investigated bacterial cellulase is endoglucanase CelD of Cl. thermocellum. The enzyme was crystallized [11] and its three-dimensional structure was elucidated recently (M. Juy, A. G. Amit, P. M. Alzari, R. J. Poljak, M. Claeyssens, P. Béguin & J. P. Aubert, unpublished work). The structure comprises three domains: (i) an N-terminal domain of about 100 residues, consisting of β-pleated sheets, which is not involved in the catalytic site; (ii) a domain of about 450 residues, consisting of 12 α-helices; (iii) the 65-residue region containing the duplicated segment of 23 amino acids discussed above. The last of these is mobile within the crystal and its structure therefore could not be determined. Co-crystallization with the inhibitor o-iodobenzyl-thio-β-D-cellobioside revealed that the catalytic site lies within a cleft formed at the surface of the molecule by three of the loops connecting the α-helices.
In parallel with the determination of the structure, several mutants of Cl. thermocellum CelD were constructed by site-directed mutagenesis and characterized to localize residues participating in the active site and to define their function. His-516 was identified as an active-site residue by a combination of chemical modification and site-directed mutagenesis experiments [12]. CelD was 70% inactivated by diethyl pyrocarbonate (DEPC), a reagent specifically modifying His residues. Inactivation was prevented in the presence of β-methylcellotrioside, a competitive inhibitor, showing that the critical modification occurred in the active site.
Site-directed mutagenesis was carried out for each of the 12 His residues of CelD. Among the mutated proteins, CelD carrying the His-516→Ser mutation had 25% of the activity of the wild type, but, in contrast with the wild type and all other mutated proteins, CelD His-516→Ser was insensitive to inactivation by DEPC. From this, and from the fact that DEPC inactivation is caused by the modification of a residue lying in the active centre, it was concluded that His-516 lies in the active site of CelD. The conclusion was confirmed by crystallographic data showing that His-516 lies on one of the loops forming the catalytic site and that its side-chain points into the catalytic groove.
Glu-555 was identified after site-directed mutagenesis of conserved Asp and Glu residues and characterization of inactive mutants [12a]. Five mutated proteins having less than 1% residual activity were purified and characterized. The Glu-555→Ala mutation reduced the kcat of the enzyme 4000-fold, but, in contrast with other mutations, did not significantly alter other properties, such as Km, effect of Ca2+, chromatographic behaviour or sedimentation coefficient. It was therefore proposed that Glu-555 might act as a general acid catalyst in the hydrolysis reaction. As in the case of His-516, the prediction is in full agreement with crystallographic data.
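To give a sense of scale for a 4000-fold drop in kcat, the ratio can be converted into an apparent change in activation free energy using transition-state theory. This is a back-of-the-envelope estimate only. It assumes that kcat reflects a single rate-limiting chemical step, and it assumes an assay temperature of 60 °C (333 K), a value chosen here for illustration and not stated in the text.

```latex
\Delta\Delta G^{\ddagger}
  = RT \ln\frac{k_{\mathrm{cat}}^{\mathrm{wt}}}{k_{\mathrm{cat}}^{\mathrm{mut}}}
  = \left(8.314\ \mathrm{J\,mol^{-1}\,K^{-1}}\right)\left(333\ \mathrm{K}\right)\ln(4000)
  \approx 2.77\ \mathrm{kJ\,mol^{-1}} \times 8.29
  \approx 23\ \mathrm{kJ\,mol^{-1}}
```

At 25 °C (298 K) the same ratio corresponds to about 20.5 kJ/mol, so the order of magnitude does not depend strongly on the assumed temperature.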
Cellulases as multi-enzyme systems
No known organism can efficiently degrade native cellulose with a single enzyme species. In the well-documented case of cellulolytic fungi, at least three different types of enzymes (endoglucanases, cellobiohydrolases and β-glucosidases) must act synergistically to hydrolyse crystalline cellulose (see Wood pp. 46-53). Similarly, all bacteria capable of utilizing crystalline cellulose produce a set of cellulases with distinct physico-chemical and enzymic properties. However, very little is known about the synergistic interactions involved in the degradation of cellulose by bacterial cellulases.
Unassociated or transiently associated cellulases
Several bacteria, e.g. Cellulomonas, the related Thermoactinomycetes Thermomonospora fusca and Microbispora bispora or the anaerobe Clostridium stercorarium, produce cellulases that do not form stable, high-molecular-mass complexes. Several distinct cellulases are produced, often carrying a separate cellulose-binding domain. In the absence of evidence to the contrary, such cellulase systems are held to resemble the systems of aerobic fungi. Exoglucanases have been identified in Cl. stercorarium, C. fimi and M. bispora. There is evidence of endo-exo synergy in the case of Cl. stercorarium [13] and M. bispora [14].
High-molecular-mass cellulase complexes
Several anaerobic bacteria can degrade crystalline cellulose quite efficiently while secreting low amounts of cellulolytic enzymes. The high specific activity of such cellulase systems has attracted considerable interest. In this respect, the most intensively studied organism is the Gram-positive, anaerobic and thermophilic bacterium Cl. thermocellum. The cellulase system of Cl. thermocellum consists of a high-molecular-mass complex, termed a cellulosome, which is responsible for the hydrolysis of crystalline cellulose [15]. The surface of Cl. thermocellum cells is studded with protuberances composed of multi-cellulosomal aggregates, which are responsible for the adhesion of the bacteria to cellulose, but ultimately detach from the cell surface and proceed independently with the hydrolysis of cellulose [16]. The cellulosome has a molecular mass of 2000-4000 kDa and contains at least 14 different components ranging in molecular mass from 40 to 250 kDa [17]. Several bands having carboxymethylcellulase or xylanase activity are revealed by zymogram staining [17, 18]. The complex binds strongly to cellulose and is highly resistant to dissociation by various chaotropic agents, making purification and characterization of the individual components difficult. Current ideas about the structural organization and mode of action of the cellulosome are based on electron micrographs of the complex [19] and on partial reconstitution of activity against crystalline cellulose by a mixture of two purified components [20, 21]. One is an 82 kDa protein, termed S_S, with activity against carboxymethylcellulose, but not against crystalline cellulose. The other is a 250 kDa glycoprotein, termed S_L, which is devoid of catalytic activity. S_L promotes the binding of S_S to cellulose and is considered to be both a cellulose-binding factor and a scaffolding protein of the cellulosome.
A model was proposed to explain the high activity of the cellulosome against crystalline cellulose by assuming that catalytic subunits are aligned at closely spaced intervals along the same cellulose molecule. Quasi-simultaneous cutting events would generate cellodextrins, which would subsequently be cleaved to cellobiose without leaving the complex [19].
Molecular cloning of cel (cellulose degradation) genes of Cl. thermocellum has shown that the diversity of cellulolytic components is largely caused by the presence of a variety of different genes. At least 15 different endoglucanase genes,
two xylanase genes and two β-glucosidase genes have been identified so far [22, 23]. As mentioned above, most endoglucanases and at least one xylanase carry a non-catalytic domain consisting of two highly conserved, homologous segments, which appears to mediate attachment of the catalytic subunits to the scaffolding protein of the cellulosome. Other anaerobic bacteria, such as Cl. cellulovorans [24], Acetivibrio cellulolyticus [24a] and Bacteroides cellulosolvens [25], produce high-molecular-mass cellulase complexes with properties similar to those of the Cl. thermocellum cellulosome, although they generally contain fewer subunit species. In addition, several other cellulolytic bacteria display polycellulosome-like protuberances on their cell surface [26]. It is not clear how similar the cellulase systems of these micro-organisms actually are to the Cl. thermocellum cellulosome. Significant differences could exist in the stability of binding of individual enzymes within the complex, or the stability of attachment of the complex itself to the cell wall. For example, Fibrobacter (formerly Bacteroides) succinogenes closely adheres to the substrate with the concomitant formation of grooves owing to the erosion of cellulose underneath the contact surface [27]. The strictly localized degradation of cellulose in the immediate vicinity of the bacteria suggests that cellulolytic complexes are tightly bound to the outer surface of the cells.
Conclusion
The study of bacterial cellulases has made rapid progress owing to the introduction of recombinant DNA technology. By comparing primary structures derived from gene sequencing, several basic structural paradigms have been identified among the various subdomains present in cellulolytic enzymes. Combined approaches based on protein biochemistry, X-ray diffraction and site-directed mutagenesis can be expected to lead to a rapid improvement in our understanding of the basic mechanisms of hydrolysis and substrate binding by individual types of domains. As a next step, it will become possible to manipulate the specificity of individual enzymes. However, our knowledge of the whole enzyme systems required for crystalline cellulose degradation is still very vague. Little is known about which enzymes are essential or how they interact synergistically. In this respect, the lack of systems allowing the targeted genetic manipulation of cellulolytic bacteria is a handicap. Nonetheless, it has become clear that bacterial cellulase systems are not necessarily similar to the classic fungal cellulase systems, and that they represent interesting alternatives to them.
1. Henrissat, B., Claeyssens, M., Tomme, P., Lemesle, L. & Mornon, J.-P. (1989) Gene 81, 83-95
2. Béguin, P. (1990) Annu. Rev. Microbiol. 44, 219-248
3. Ong, I