Trichoderma reesei - FEBS Press

Eur. J. Biochem. 249, 584-591 (1997) © FEBS 1997

cDNA cloning of a Trichoderma reesei cellulase and demonstration of endoglucanase activity by expression in yeast

Markku SALOHEIMO, Tiina NAKARI-SETÄLÄ, Maija TENKANEN and Merja PENTTILÄ

VTT Biotechnology and Food Research, Espoo, Finland

(Received 6 June/11 August 1997) - EJB 97 0810/2

A Trichoderma reesei cDNA encoding a previously unknown protein with a C-terminal cellulose-binding domain was obtained by complementation screening of a T. reesei cDNA library in a sec1 yeast mutant impaired in protein secretion. The T. reesei protein shows amino acid similarity over its entire length to the Agaricus bisporus cellulose-induced protein CEL1, whose function is not known. These two proteins form a new glycosyl hydrolase family, number 61. Expression of the T. reesei cDNA in yeast showed that it encoded a protein with endoglucanase activity and thus the protein was named EGIV and the corresponding gene egl4. Polyclonal antibodies were prepared against EGIV produced in Escherichia coli and detected a 56-kDa protein in the T. reesei culture supernatant. Northern hybridisation revealed that T. reesei egl4 is regulated in the same manner as other cellulase genes of this fungus.

Keywords: cellulase; endoglucanase; Trichoderma; yeast; regulation.

The enzymatic degradation of cellulose performed by fungi and bacteria is of great importance to the carbon cycle of the biosphere, since cellulose is the most abundant biopolymer. Trichoderma reesei is the most widely studied cellulolytic fungus and serves as a model for the studies of other fungal cellulolytic systems. T. reesei produces three types of enzyme activities needed for degradation of crystalline cellulose into glucose. These are cellobiohydrolase (CBH), endoglucanase (EG), and β-glucosidase. Members of each of these enzyme groups have been characterised from Trichoderma reesei both at the enzyme and gene levels (reviewed by Nevalainen and Penttilä, 1995). Industrial T. reesei strains constructed by extensive mutagenesis programs produce extracellular proteins in large quantities. The enzyme mixture secreted by Trichoderma is dominated by a few cellulase species that also comprise the main part of the produced extracellular protein. These cellulases were the first cellulases to be isolated and characterised at the gene level. Two major cellobiohydrolase genes, cbh1 (Shoemaker et al., 1983; Teeri et al., 1983) and cbh2 (Chen et al., 1987; Teeri et al., 1987), and two major endoglucanase genes, egl1 (Penttilä et al., 1986; van Arsdell et al., 1987) and egl2 (originally called egl3, Saloheimo et al., 1988), were isolated. Two additional endoglucanase genes encoding minor activities were found later. egl3 (Ward et al., 1993) was cloned by PCR based on the amino acid sequence obtained from a protein secreted by an EGI-EGII-negative T. reesei strain. egl5 (Saloheimo et al., 1994) was isolated in a yeast expression library screening based on the yeast-produced enzyme activity, without any prior knowledge about the protein.

With the exceptions of EGIII and BGLI (β-glucosidase I), the cellulases whose genes have been isolated from T. reesei have a modular structure. They have a well-conserved cellulose-binding domain (CBD) at their N-terminus or C-terminus (Shoemaker et al., 1983; Teeri et al., 1983, 1987; Penttilä et al., 1986; van Arsdell et al., 1987; Chen et al., 1987; Saloheimo et al., 1988, 1994), separated from the catalytic core domain by a linker region rich in serine, threonine and proline residues. Moreover, this modular overall architecture with a C-terminal CBD has been found in two T. reesei hemicellulases, the endomannanase (Stålbrand et al., 1995) and acetyl xylan esterase (Margolles-Clark et al., 1996b).

Here, we describe the isolation of a cDNA encoding a previously unknown T. reesei cellulase. The cloning was not based on data obtained on an isolated protein or on enzyme activity screening but on the unexpected ability of the cDNA to suppress yeast secretory mutations. The cDNA was demonstrated to encode an endoglucanase by expression of the cDNA in yeast.

Correspondence to M. Saloheimo, VTT Biotechnology and Food Research, P.O. Box 1500, FIN-02044 VTT, Espoo, Finland
Fax: +358 9 455 2103.
E-mail: markku.saloheimo@vtt.fi
Abbreviations. CBD, cellulose binding domain; CBH, cellobiohydrolase; CEL1, Agaricus bisporus cellulose-induced protein; cel1, gene encoding CEL1; EG, endoglucanase; egl, gene encoding endoglucanase; endo-H, endoglycosidase H; HCA, hydrophobic cluster analysis; IPTG, isopropyl thio-β-D-galactoside; SEC1, yeast gene involved in the terminal stage of protein secretion; Sso1, Sso2, yeast proteins involved in the terminal stage of protein secretion; SC-URA, yeast synthetic complete medium without uracil; YPD, yeast peptone/dextrose medium.
Enzymes. Cellulase (endo-1,4-β-glucanase) (EC 3.2.1.4); cellulose 1,4-β-cellobiosidase (EC 3.2.1.91); β-glucosidase (EC 3.2.1.21).
Note. The novel nucleotide sequence data published here have been deposited with the EMBL sequence data bank and are available under accession number Y11113.

EXPERIMENTAL PROCEDURES

Strains and vectors. The Saccharomyces cerevisiae strain Sf750-14Dα (α, sec1-1, his4-580, ura3-52, trp1-289, leu2-3, leu2-112) was used in the sec1 complementation experiments and the strain DBY746 (α, his3Δ1, leu2-3, leu2-112, ura3-52, trp1-289, cyhR) for production of the EGIV enzyme. The E. coli strain DH5α (F-, endA1, hsdR17(rK-, mK+), supE44, thi-1, λ-, recA1, gyrA96, relA1, Δ(argF-lacZYA)U169, φ80lacZΔM15) was

Saloheimo et al. (Eur. J. Biochem. 249)

used as a plasmid host and the strain RV308 (su-, ΔlacX74, galISII::OP308, strA) for EGIV production in E. coli. The T. reesei cDNA library (Margolles-Clark et al., 1996a) was in the vector pAJ401 (Saloheimo et al., 1994).

Media and culture conditions. The yeast strains were grown without plasmids on yeast peptone/dextrose (YPD) medium and with plasmids on synthetic complete medium lacking uracil (SC-URA) (Sherman, 1991). To show β-glucanase activity in a plate assay, the yeast DBY746, containing plasmid pMS54, was grown on an SC-URA plate with 0.1% barley β-glucan (Biocon) for five days at 30°C and the plate was stained with Congo red (Merck) as described by Penttilä et al. (1987b). To obtain protein for activity assays, the yeast strain DBY746 containing the plasmid pMS54 and the control strain with the vector pAJ401 were cultivated for 46 h in SC-URA medium at 30°C in a 1.5-l fermentor as described by Margolles-Clark et al. (1996a). After cultivation, the cells were separated by centrifugation at 4000 × g for 5 min, and the supernatants were filtered and concentrated 50-fold by ultrafiltration (Margolles-Clark et al., 1996a).

For Northern hybridisation, T. reesei QM9414 (Mandels et al., 1971) was grown in shake flasks (28°C, 200 rpm) in a minimal medium (Penttilä et al., 1987a) with different carbon sources as follows: in 5% glucose for 3 days, in 2% cellobiose for 2 days, in 2% Solka floc cellulose for 3 days, in 3% lactose for 3 days, and in 5% xylose for 3 days. To test for induction of the egl4 gene, sophorose was added to 1 mM into a 2% sorbitol culture after 72 h growth and into a 5% glucose culture after 57 h growth. The cultures were grown for a further 10 h and the same amount of sophorose was added again. The mycelium was harvested 5 h after this second addition. A 2% sorbitol cultivation of 87 h was carried out without sophorose additions as a control. The glucose-depleted culture sample was taken from a glucose batch fermentation after 125 h of growth. Glucose was depleted from this cultivation at 75 h (Ilmén et al., 1997). After the cultivations, the mycelia were harvested by filtration with a glass fibre filter, washed with 0.9% NaCl, and frozen.

For detection of EGIV from Trichoderma culture medium by western blotting, the T. reesei strain QM9414 was cultivated on cellulose-based medium in a shake flask as previously described (Nakari-Setälä and Penttilä, 1995).

Cloning and sequencing of the egl4 cDNA. Yeast transformation was performed with the LiAc method (Gietz et al., 1992). Plasmids were recovered from yeast by isolating total DNA and using this in electroporation of E. coli. Sequencing was carried out with the dideoxynucleotide chain-termination method using the modified T7 polymerase (USB) and sequence-specific primers. The sequences were analysed with the PCGene software package (Intelligenetics).

Hydrolysis experiments. Different polysaccharides were treated with concentrated yeast supernatants to evaluate the activity of the protein encoded by egl4. The substrates tested were as follows: barley β-glucan (Biocon); Avicel, mainly crystalline cellulose (Serva 14204); phosphoric-acid-swollen amorphous cellulose prepared from Avicel (Walseth, 1952); carboxymethyl cellulose (DS 0.7, Aldrich 93 1-1); 4-O-methylglucuronoxylan (Roth 7500); ivory nut mannan (Megazyme) and galactomannan (locust bean gum; Sigma G-0753). The optimum pH for the hydrolysis experiments was determined by incubating 1 ml 1% barley β-glucan in 50 mM sodium citrate/phosphate buffers with pH values in the range 3-7 with 0.1 ml 50-fold concentrated yeast culture filtrate for 5 h at 40°C. The hydrolysis experiments were performed by incubating 0.1 ml of each substrate (5 g/l) in 50 mM sodium acetate, pH 4.5, at 40°C for 24 h with 0.05 ml concentrated yeast culture filtrate plus 0.05 ml buffer. The reactions were stopped by boiling for 5 min. The amount of reducing


sugars formed was determined by the method of Bernfeld (1955) using glucose, xylose, or mannose (Fluka) as standards. The monosaccharides and oligosaccharides in the hydrolysates were analysed with high-performance anion-exchange chromatography on a Dionex 4500i series chromatograph with pulsed amperometric detection (Tenkanen, M., unpublished procedure) using monosaccharides and cello- (Merck), xylo- (Megazyme), and manno-oligosaccharides (Megazyme) of 2-4 residues as standards. Endoglucanase activity was determined by analysing the effect of the yeast culture filtrates on the viscosity of carboxymethyl cellulose. A 5-ml volume of 0.5% carboxymethyl cellulose solution in 50 mM sodium acetate, pH 4.5, was hydrolysed with 0.1 ml concentrated yeast culture filtrate for 2 h at 40°C. The reaction was stopped by boiling for 5 min, after which the viscosity of the solution was determined using a Brookfield LVTDV-II CP viscometer (Brookfield Engineering Laboratories Inc.).

Hydrophobic cluster analysis. HCA plots of the sequences were obtained using the plot program (version 5.0) from Doriane S.A., which automatically draws the clusters of hydrophobic amino acid residues (V, I, L, M, F, W, or Y). Analysis of the plots was carried out following published guidelines (Gaboriaud et al., 1987; Lemesle-Varloot et al., 1990).

Expression of egl4 cDNA in E. coli for antibody production. The egl4 cDNA extending from the putative signal sequence cleavage site after Gly18 to the STOP codon of the gene was amplified with VENT polymerase (New England Biolabs) using the primers 5'-AGA GAG GAA TTC GTT GTC GGA CAT GGA CAT AT (sense) and 5'-ATA TAT CTA GAT CAG TTA AGG CAC TGG GCG TAG T (antisense) and the plasmid pMS54 as the template. The oligonucleotide primers included an EcoRI cleavage site at the 5' end and an XbaI site at the 3' end of the PCR fragment to facilitate the subsequent cloning step. The PCR fragment digested by EcoRI and XbaI was cloned into the pFLAG1 expression vector (IBI) resulting in pTNS3. The RV308 cells transformed with pTNS3 were grown at 37°C in ampicillin-containing (100 µg/ml) LB medium to an A600 of about 1.5, after which IPTG was added to a final concentration of 1 mM. After induction for 4 h at 30°C, the cells were harvested and fractionated according to Sambrook et al. (1989). The FLAG-EGIV fusion protein was purified from the cytoplasmic fraction of E. coli by anti-M2 affinity chromatography according to the manufacturer's instructions (IBI). The purified non-denatured protein was used to immunize rabbits (KUO:NZW, Laboratory Animal Center, Helsinki University).
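The primer design described above can be checked with a short script. The sketch below is only an illustration: it uses the two primer sequences as printed in the text, the standard recognition sequences of EcoRI (GAATTC) and XbaI (TCTAGA), and the standard genetic code. The helper names are hypothetical and not part of the original work.

```python
# Sanity check of the two PCR primers quoted in the text.
# Assumptions: standard genetic code; EcoRI = GAATTC, XbaI = TCTAGA.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

SENSE = "AGAGAGGAATTCGTTGTCGGACATGGACATAT"
ANTISENSE = "ATATATCTAGATCAGTTAAGGCACTGGGCGTAGT"
ECORI = "GAATTC"
XBAI = "TCTAGA"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the first base until a stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def gene_part_of_sense_primer():
    """Bases of the sense primer that follow the EcoRI site."""
    pos = SENSE.find(ECORI)
    if pos < 0:
        raise ValueError("EcoRI site not found in sense primer")
    return SENSE[pos + len(ECORI):]


def gene_part_of_antisense_primer():
    """Coding-strand bases that precede the XbaI site (ends with the stop codon)."""
    coding = reverse_complement(ANTISENSE)
    pos = coding.find(XBAI)
    if pos < 0:
        raise ValueError("XbaI site not found in antisense primer")
    return coding[:pos]


if __name__ == "__main__":
    n_term = translate(gene_part_of_sense_primer())
    three_prime = gene_part_of_antisense_primer()
    # The 3' fragment ends with a stop codon, so its frame is fixed from the end.
    c_term = translate(three_prime[len(three_prime) % 3:])
    print("N-terminal residues encoded by sense primer:", n_term)
    print("C-terminal residues encoded by antisense primer:", c_term)
```

The translated fragments can be compared with the first residues after Gly18 and the last residues of the deduced EGIV sequence.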

Endoglycosidase H treatment of protein samples and western blotting. The yeast (50-fold concentrated) and T. reesei culture supernatant samples were treated with 2 mU endoglycosidase H (Boehringer Mannheim) in 125 mM sodium citrate, pH 5.0, at 37°C for 24 h. SDS/PAGE was performed in 12.5% polyacrylamide gels (Laemmli, 1970) followed by visualisation of the proteins by western blotting. Polyclonal anti-EGIV antibodies produced in this work were used together with alkaline-phosphatase-conjugated anti-rabbit IgG (H+L) (Bio-Rad) that was detected by using ProtoBlot Western Blot AP (Promega).

Northern hybridisation. Total T. reesei RNA was isolated according to Chirgwin et al. (1979). RNA samples (5 µg) were run in a 1% agarose gel in 10 mM phosphate. Capillary Northern blotting was performed with 20×NaCl/Cit (NaCl/Cit is 0.15 M NaCl, 15 mM sodium citrate, pH 7.0) onto the Hybond N filter (Amersham). Hybridisation was carried out overnight at 42°C in 50% formamide, 10% dextran sulphate, 1% SDS, 1 M NaCl, 125 µg/ml herring sperm DNA. The filters were washed at 42°C in 5×NaCl/Pi/EDTA for 15 min, in 1×NaCl/Pi/EDTA, 0.1% SDS for 2 × 15 min and in 0.1×NaCl/Pi/EDTA, 0.1% SDS


Fig. 1. Nucleotide sequence of the T. reesei egl4 cDNA and deduced amino acid sequence. The predicted signal sequence cleavage site is shown by an arrow, potential N-glycosylation sites by asterisks (*), and polyadenylation sites by diamonds (♦). The linker region and cellulose-binding domain are indicated.

for 2 × 15 min at room temperature (NaCl/Pi/EDTA is 0.18 M NaCl, 10 mM NaH2PO4, 1 mM EDTA, pH 7.7).
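The blotting and wash buffers are specified as multiples of a 1× recipe. As a minimal sketch, assuming that an n× buffer simply contains every component at n times its 1× molarity, the working concentrations can be derived as follows. The dictionary and function names are illustrative only.

```python
# 1x compositions as defined in the text (mol/l).
NACL_CIT_1X = {"NaCl": 0.15, "sodium citrate": 0.015}
NACL_PI_EDTA_1X = {"NaCl": 0.18, "NaH2PO4": 0.010, "EDTA": 0.001}


def at_strength(recipe_1x, fold):
    """Return the molarity of each component for a buffer used at `fold` x."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {name: conc * fold for name, conc in recipe_1x.items()}


if __name__ == "__main__":
    # Blotting buffer
    print("20x NaCl/Cit:", at_strength(NACL_CIT_1X, 20))
    # The three wash stringencies used for the filters
    for fold in (5, 1, 0.1):
        print(f"{fold}x NaCl/Pi/EDTA:", at_strength(NACL_PI_EDTA_1X, fold))
```

The decreasing salt concentration across the three washes corresponds to increasing stringency.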

RESULTS

Isolation and sequence of a T. reesei cellulase cDNA, egl4. The T. reesei egl4 cDNA was isolated in an attempt to clone by complementation the T. reesei gene corresponding to the SEC1 secretory gene of Saccharomyces cerevisiae. A T. reesei expression cDNA library (Margolles-Clark et al., 1996a) was transformed into the temperature-sensitive sec1-1 yeast mutant strain. About 3×10^5 transformants were first grown to small colonies at the permissive temperature of 24°C and subsequently replicated onto fresh plates and grown at the restrictive temperature of 36°C. Several yeast clones that were able to grow were collected. Their cDNA library plasmids were transformed into E. coli, isolated, and analysed by restriction enzyme digestions. Plasmids representing different clone types were retransformed into the yeast strain and the ones that could promote growth at the restrictive temperature were analysed further. The cDNA insert of one of these plasmids, pMS54, was sequenced and, unexpectedly,

found to encode a protein with a consensus cellulose-binding domain at its C-terminus. The same cDNA was present in four different clones isolated in the cloning experiment. A separate complementation cloning experiment of the T. reesei sso genes was later performed with a yeast strain that can be conditionally depleted of the Sso1 and Sso2 proteins. These are components of the late secretory pathway functionally coupled with the Sec1 protein (Aalto et al., 1993). The egl4 cDNA was also isolated in this experiment and thus it is able to suppress two different defects of the latest stage of yeast secretion.

The egl4 cDNA is 1399 bp long without its poly(A) tail. This is in agreement with the corresponding mRNA size determined by northern hybridisation, i.e. 1.5 kb (Fig. 6). Sequencing of several cDNA clones from their 3' ends revealed that at least three different polyadenylation sites are used in this gene (Fig. 1). The first ATG codon found in the cDNA starts an open reading frame (ORF) of 1032 bp, encoding a 344-amino-acid protein. This start codon is preceded by an in-frame stop codon about 30 bp upstream and thus the cDNA most probably contains the whole protein-coding region of the egl4 gene. The program PSIGNAL of von Heijne (1986) predicts that the putative EGIV protein starts with a signal sequence that is cleaved after the Gly at position 18 in the ORF. With a signal sequence of 18 amino acids, the mature protein would be 326 amino acids long and have a calculated molecular mass of 33685 Da.

Fig. 2. Sequence alignment of the cellulose-binding domains of T. reesei cellulases. Invariant amino acids are boxed. The amino acids forming the flat CBD surface binding to cellulose are shown by asterisks (*). Disulphide bridges are shown by connecting lines. The numbering is for the first amino acid of the shown sequence.

Fig. 3. Linear sequence alignment of T. reesei EGIV and A. bisporus CEL1. Conserved amino acids are indicated by asterisks (*) and the possible active site residues are in bold. The conserved putative N-glycosylation site is shown by a diamond (♦).
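The length figures quoted for the ORF, the precursor and the mature protein are mutually consistent, which a few lines of arithmetic make explicit. This is a minimal sketch that assumes the 1032 bp figure counts sense codons only (344 × 3 = 1032, stop codon excluded). The variable names are illustrative.

```python
ORF_BP = 1032            # open reading frame length given in the text
SIGNAL_PEPTIDE_AA = 18   # predicted cleavage after Gly18
MATURE_MASS_DA = 33685   # calculated mass of the mature protein


def codon_count(orf_bp):
    """Number of codons in an ORF; the length must be a multiple of three."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_bp // 3


precursor_aa = codon_count(ORF_BP)
mature_aa = precursor_aa - SIGNAL_PEPTIDE_AA
mean_residue_mass = MATURE_MASS_DA / mature_aa

if __name__ == "__main__":
    print("precursor length (aa):", precursor_aa)
    print("mature length (aa):", mature_aa)
    print("mean residue mass (Da): %.1f" % mean_residue_mass)
```

A mean residue mass slightly above 100 Da is in the range expected for a protein rich in small residues such as Gly, Ser and Thr.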

The cellulose binding domain and linker region of EGIV. The EGIV protein has a C-terminal cellulose binding domain with all the characteristics of fungal CBDs (Fig. 2). The invariable amino acids conserved in the CBDs of other T. reesei cellulases and hemicellulases can be unambiguously identified in the EGIV sequence. The fungal CBDs have either four or six cysteine residues that form two or three disulphide bridges, respectively (Hoffrén et al., 1995). The EGIV CBD has four cysteines at conserved positions and thus it is probably stabilised by two disulphide bridges. All the residues forming a flat hydrophilic surface of the domain that probably interacts with the cellulose chains (Kraulis et al., 1989) are conserved in the EGIV CBD (Y312, Y338, and Y339, N337 and Q341). The CBD of EGIV is preceded in the sequence by a putative linker peptide region of about 56 amino acids, rich in serines, threonines, and prolines (10 Ser, 13 Thr, and 6 Pro). The EGIV linker is exceptionally long, being close to the length of the CBHII linker (51 amino acids, Teeri et al., 1987). The linkers of the other T. reesei cellulases and those of the two hemicellulases with CBDs are 22-34 amino acids in length. The linker peptides are probably O-glycosylated (Fägerstam et al., 1984) and take an extended conformation (Schmuck et al., 1986) to give spatial separation to the CBD from the catalytic core domain.

Sequence comparisons of the EGIV core domain. When a search through the protein sequence databanks with the program BLAST (Altschul et al., 1990) was carried out with the EGIV core amino acid sequence, the only significant similarity found was with CEL1 of the edible mushroom Agaricus bisporus (Raguz

et al., 1992; Armesilla et al., 1994). cel1 encodes a protein with a C-terminal cellulose binding domain but the activity of the protein is not known. T. reesei EGIV and A. bisporus CEL1 can be aligned by linear sequence alignment over an area of about 250 amino acids (Fig. 3), starting from residue 88 of EGIV. In this region, the proteins have an identity of 38%. No similarity can be detected by this alignment method between the N-terminal parts of EGIV and CEL1 (residues 1-87 of EGIV). To examine whether the N-termini of the proteins are also homologous, the sequences were subjected to hydrophobic cluster analysis (HCA; Gaboriaud et al., 1987), which has been used for the classification of glycosyl hydrolases (Henrissat and Bairoch, 1993). Comparison of the HCA plots (Fig. 4) reveals similar clusters of hydrophobic amino acids not only in the region where similarity can be detected with the linear sequence alignment, but also in the N-terminal 90 amino acids of EGIV and CEL1. The most pronounced difference between the proteins is the length of the linker region: the EGIV linker is more than twice as long (about 56 amino acids) as the CEL1 linker (about 23 amino acids).

Cellulases cleave the bonds between glucose units of the cellulose chain through acid catalysis, with acidic amino acid residues in the active site acting as catalysts (Rouvinen et al., 1990; Divne et al., 1994). Three acidic amino acids are conserved between T. reesei EGIV and A. bisporus CEL1 (D136, E179, and D224, Figs 4 and 5). Of these, D136 and E179 reside in the areas best conserved between EGIV and CEL1 and thus we propose that they could form a pair of catalytic residues of the type that is found in the active sites of T. reesei CBHI and CBHII (Divne et al., 1994; Rouvinen et al., 1990). The aromatic amino acids Tyr and Trp are also important for cellulase catalysis, forming subsites that bind the substrate to the active site. Four tryptophans (W100, W105, W149, and W161 of EGIV) and four tyrosines (Y114, Y195, Y220, and Y232) are conserved between EGIV and CEL1 and some of these could be involved in substrate binding. Furthermore, three cysteines that could be

Fig. 4. Hydrophobic cluster analysis (HCA) plots of T. reesei EGIV (A) and A. bisporus CEL1 (B). The conserved hydrophobic clusters are shown for the N-terminal 90 amino acids and the proposed active site regions. Conserved carboxylic acid residues are shaded. The linker and CBD regions are indicated. Amino acids are represented in the plots by the standard one-letter code, except for glycine, proline, threonine, and serine, which are represented by symbols.

Table 1. Hydrolysis of β-glucan and cellulose by yeast culture filtrates. The substrates were hydrolysed with 50-fold concentrated yeast culture filtrates for 24 h. Reducing sugars were measured according to Bernfeld (1955) and glucose, cellobiose, and cellotriose with high-performance anion-exchange chromatography. RS, reducing sugars; G, glucose; G2, cellobiose; G3, cellotriose; n.d., not detected.

Sample     Substrate
           β-glucan                  amorphous cellulose       Avicel
           RS    G     G2    G3      RS    G     G2    G3      RS    G     G2    G3
Control    0.20  0.06  0.02  0.03    0.04  0.01  0.02  n.d.    n.d.  n.d.  n.d.  n.d.
EGIV       0.49  0.12  0.27  0.08    0.13  n.d.  0.15  0.01    n.d.  n.d.  0.02  n.d.

Table 2. Effect of EGIV produced in yeast on the viscosity of 0.5% carboxymethyl cellulose solution. The solution was incubated for 2 h with 50-fold concentrated yeast supernatants and the viscosity of the samples was determined with a viscometer.

Sample           Viscosity (Pa·s)
Buffer           0.053
Control yeast    0.0441
EGIV yeast       0.0049
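The size of the viscosity effect in Table 2 is easier to judge as a relative drop. The following sketch uses the three viscosity values from Table 2 and expresses each sample relative to a chosen reference. The function name and the choice of reference are illustrative assumptions, not part of the original analysis.

```python
# Viscosity values from Table 2 (Pa*s).
VISCOSITY_PA_S = {
    "buffer": 0.053,
    "control yeast": 0.0441,
    "EGIV yeast": 0.0049,
}


def percent_drop(reference, sample):
    """Relative viscosity decrease of `sample` versus `reference`, in percent."""
    if reference <= 0:
        raise ValueError("reference viscosity must be positive")
    return 100.0 * (reference - sample) / reference


if __name__ == "__main__":
    buffer_value = VISCOSITY_PA_S["buffer"]
    control_value = VISCOSITY_PA_S["control yeast"]
    for name in ("control yeast", "EGIV yeast"):
        drop = percent_drop(buffer_value, VISCOSITY_PA_S[name])
        print(f"{name}: {drop:.1f}% lower than buffer")
    egiv_vs_control = percent_drop(control_value, VISCOSITY_PA_S["EGIV yeast"])
    print(f"EGIV yeast: {egiv_vs_control:.1f}% lower than control yeast")
```

A large drop in viscosity accompanied by modest release of reducing sugars is the behaviour expected of an endo-acting enzyme, which cuts the polymer internally.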

involved in disulphide bridge formation are conserved in the core domain (C118, C122, and C198 of EGIV). One of the two potential N-glycosylation sites of EGIV (N158) is conserved in CEL1.
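The identity value quoted for the EGIV/CEL1 alignment is a count of matching residues over aligned positions. The sketch below shows one common way to compute such a figure, ignoring columns where either sequence has a gap; other denominators are also in use, and the source does not say which convention was applied. The two strings are one 60-column block of the Fig. 3 alignment (EGIV from residue 145, CEL1 from residue 150), so the value for this block is not expected to equal the figure reported for the whole aligned region.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# One block of the Fig. 3 alignment as printed in the source.
EGIV_BLOCK = "DPGTWASDVLISNNNTWVVKIPDNLAPGNYVLRHEIIALHSAGQANGAQNYPQCFNIAVS"
CEL1_BLOCK = "GKGVWGSGKMIDQNNSWTTTIPSTVPSGAYMIRFETIALHSLP----AQIYPECAQLTIT"

if __name__ == "__main__":
    value = percent_identity(EGIV_BLOCK, CEL1_BLOCK)
    print("identity over this block: %.1f%%" % value)
```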

Properties of the cellulase enzyme. To test if the new cDNA encodes a protein with cellulolytic activity, the yeast carrying the plasmid pMS54 was grown on an indicator plate containing barley β-glucan. This yeast produced a faint hydrolysis zone on the plate after staining with Congo red (data not shown), indicating some β-glucanase activity. This zone was not visible around streaks of control yeast carrying the vector pAJ401.

To further characterise the activity of the enzyme, the yeast strains carrying the plasmids pMS54 and pAJ401 were cultivated in a fermentor. The growth media were concentrated 50 times and tested against various substrates (Table 1). The hydrolysis experiments were performed at pH 4.5 as the optimum pH for β-glucan hydrolysis was found to be between 4.0 and 5.5. The concentrated culture supernatant of the yeast containing pMS54 clearly hydrolysed β-glucan (Table 1). The enzymes produced by the control strain also degraded this substrate but at a significantly lower rate than the pMS54 strain. Yeast is known to produce enzymes that degrade different β-glucans. The pMS54 yeast supernatant also degraded amorphous cellulose, but no clear degradation of crystalline cellulose (Avicel) could be detected (Table 1). The main product formed from β-glucan and amorphous cellulose was cellobiose. Since glucose and cellotriose were also formed during the hydrolysis of β-glucan, we can tentatively conclude that the new cellulase is an endoglucanase. The endo-type action of the protein was further confirmed in an experiment where the reduction of the viscosity of carboxymethyl cellulose was tested (Table 2). Neither xylan nor mannan was degraded in the hydrolysis experiments (data not shown). To detect EGIV in culture supernatants of T. reesei and yeast, the protein was expressed in E. coli as a fusion with the FLAG epitope (Fig. 5A) and antibodies were raised against the purified

protein. Immunoreactive bands were detected in western blotting both in the yeast and Trichoderma culture media (Fig. 5B, lanes 2 and 5). In the T. reesei sample the upper band of approximately 56 kDa represents EGIV (Fig. 5B, lane 2). This is supported by the fact that the yeast-produced endo-H-treated EGIV has a mobility close to that of the endo-H-treated upper T. reesei band (Fig. 5B, lanes 3 and 4). The smaller T. reesei band of about 36 kDa also detected with the anti-EGIV antibodies represents another endoglucanase, EGV, as suggested by the close migration of the band with purified EGV (Fig. 5B, lanes 1 and 2). The reactivity of the antibodies with EGV is due to a small amount of contaminating EGV protein released from the anti-M2 affinity column into the EGIV preparation used for immunization rather than to unspecific cross-reactivity of the antibodies (the column had previously been used for EGV purification).

The difference between the calculated EGIV mass (34 kDa) and the mass of EGIV secreted by T. reesei (56 kDa) can be partly explained by glycosylation. Endo-H treatment of the T. reesei culture supernatant changed the electrophoretic mobility of EGIV only slightly (Fig. 5B, lane 3). Many of the glycan groups in the protein are, however, most probably of the O-type. The linkers of cellulases are heavily O-glycosylated and EGIV has an exceptionally long linker region. The difference between the calculated and observed molecular masses can also be partly due to anomalous migration of the protein in SDS/PAGE, which might be caused by the extended conformation of the O-glycosylated linker region. The same kind of anomalous migration was not detected with EGIV produced in E. coli (Fig. 5A). Polyclonal antibodies raised against enzymes with cellulose-binding domains often cross-react with other CBD-containing proteins. The EGIV antibody also shows slight cross-reaction with some other protein(s) of slower migration in the T. reesei culture supernatants (the uppermost faint bands in Fig. 5B, lanes 2 and 3). Yeast is known to overglycosylate heterologous proteins, which has been observed for Trichoderma CBHI, CBHII, and EGI (Penttilä et al., 1987b, 1988). The EGIV enzyme produced in yeast is also overglycosylated, being roughly 68 kDa in molecular mass (Fig. 5B).

Fig. 5. Western analysis of EGIV produced in E. coli, yeast, and T. reesei. (A) Lane 1, E. coli cell lysate from a strain carrying the vector alone; lane 2, E. coli lysate from the strain producing EGIV. (B) Immunodetection of EGIV from the culture media of Trichoderma and yeast. Lane 1, 2 µg purified EGV; lane 2, 30 µl culture medium of T. reesei QM9414 grown on cellulose; lane 3, same as in lane 2 but endo-H treated; lane 4, 5 µl endo-H-treated 50×-concentrated culture medium of yeast producing EGIV; lane 5, same as in lane 4 but without endo-H treatment; lane 6, 8 µl 50×-concentrated culture medium of a yeast strain with the vector pAJ401. The molecular mass markers (kDa) are shown.
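The gap between calculated and apparent molecular mass discussed above can be put in numbers. The sketch below takes the calculated mass of the mature polypeptide (33685 Da) and the approximate SDS/PAGE masses quoted in the text, and reports the apparent excess for each host. The observed masses are gel estimates, so the results are rough, and the excess reflects both glycosylation and anomalous migration without separating the two.

```python
CALCULATED_KDA = 33.685  # mature EGIV polypeptide, from the deduced sequence

# Approximate apparent masses from SDS/PAGE, as quoted in the text (kDa).
OBSERVED_KDA = {"T. reesei": 56.0, "yeast": 68.0}


def apparent_excess(observed_kda, calculated_kda=CALCULATED_KDA):
    """Return (excess in kDa, excess as a percentage of the calculated mass)."""
    extra = observed_kda - calculated_kda
    return extra, 100.0 * extra / calculated_kda


if __name__ == "__main__":
    for host, mass in OBSERVED_KDA.items():
        extra, percent = apparent_excess(mass)
        print(f"{host}: +{extra:.1f} kDa ({percent:.0f}% of polypeptide mass)")
```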

Regulation of the Trichoderma egl4 gene. The cellulase genes isolated so far from Trichoderma reesei are glucose repressed and expressed to varying extents on other carbon sources. To study if egl4 expression is regulated in a similar manner, T. reesei QM9414 was grown on different carbon sources, mycelium was harvested and total RNA was isolated and analysed by northern hybridisation (Fig. 6). No egl4 transcript was detected in mycelia grown on glucose or in mycelia grown on sorbitol or xylose as the only carbon source. In a glucose-based culture where glucose had been depleted, egl4 expression was derepressed. When the disaccharide sophorose was added to a sorbitol culture, egl4 was induced. This induction could not be observed in the glucose-grown culture. Cellulose, cellobiose, and lactose all induced egl4 expression. Sophorose was found to induce egl4 more strongly in a glycerol culture than in a sorbitol culture (data not shown). This difference between glycerol and sorbitol cultures is less pronounced with the other cellulase genes (Ilmén et al., 1997).

Fig. 6. Northern analysis of egl4 expression in T. reesei QM9414 grown on different carbon sources. Lane 1, glucose; 2, glucose fermentor cultivation sample 50 h after glucose depletion; 3, sophorose added to glucose culture; 4, sorbitol; 5, sophorose added to sorbitol culture; 6, cellobiose; 7, cellulose; 8, lactose; 9, xylose. The loading of RNA (5 µg/lane) was controlled by staining the gel with acridine orange prior to blotting (shown below).

DISCUSSION

The egl4 cDNA described in this paper was isolated in an unexpected manner. It was found as a suppressor of a yeast secretory mutation of the SEC1 gene, and it later turned out that the same cDNA was able to suppress the Sso protein depletion of yeast, another secretory defect involved in fusion of the secretory vesicles with the plasma membrane. Why should an endoglucanase cDNA do this? The observed suppressions probably have no meaningful biological explanation, such as a specific interaction of the EGIV protein with some component of the yeast secretory pathway. One possible explanation could be that the endoglucanase modifies the yeast cell wall glucan structures and thus somehow interferes with growth of the yeast and/or its secretion machinery at the plasma membrane. Another explanation might be that the expression of a foreign protein (EGIV) affects the expression of the secretory proteins in such a way that the suppression is observed. The hydrolysis experiments with the EGIV produced in yeast clearly show that we have isolated an endoglucanase cDNA.

Even though the major product from β-glucan and amorphous cellulose was cellobiose, as is the case with cellobiohydrolases, our results suggest that glucose and cellotriose were also produced from β-glucan. A more indicative piece of evidence for the mode of action of EGIV is that it clearly attacked the substituted carboxymethyl cellulose, which the cellobiohydrolases do not hydrolyse. EGIV showed no activity towards xylan or mannan.

The mature EGIV shows moderate amino acid similarity through its entire length to the CEL1 protein of Agaricus bisporus. In the N-terminal part, the similarity is low and can be detected only with hydrophobic cluster analysis. The A. bisporus cel1 was found as a cellulose-induced gene encoding a protein with a C-terminal cellulose binding domain (Raguz et al., 1992; Yague et al., 1994; Armesilla et al., 1994). No enzymatic activity has been assigned to CEL1 despite efforts based on zymograms of native gels and immunodetection of the protein (Armesilla et al., 1994). Binding of CEL1 to cellulose, however, has been demonstrated (Armesilla et al., 1994). Taken together, the similarity between EGIV and CEL1 and the endoglucanase activity of EGIV shown in this study suggest that CEL1 could also be an endoglucanase. Production of CEL1 in a heterologous system in an active form might facilitate the elucidation of its activity.

Glycosyl hydrolases have been classified into families of related sequences using hydrophobic cluster analysis (Henrissat and Bairoch, 1993). Before the discovery of T. reesei EGIV, A. bisporus CEL1 had no known homolog. Now these two proteins form the new glycosyl hydrolase family 61 (Henrissat, B., personal communication). The catalytic core domains of the five endoglucanases and two cellobiohydrolases characterised from T. reesei belong presently to six different HCA families. Only EGI and CBHI are in the same family and show clear sequence similarity. Even though CBHII and EGII belong to different HCA families, they are probably structurally related. This is suggested by the α/β-barrel structures found in endoglucanases E1 of Acidothermus cellulolyticus (Sakon et al., 1996) and CelCCA of Clostridium cellulolyticum (Ducros et al., 1995), two members of the HCA family 5 where EGII also belongs. This structural framework is also found in T. reesei CBHII (Rouvinen et al., 1990). Despite the fact that EGIV is a new cellulase not related in sequence to the other T. reesei cellulases, it could still be structurally related to some of them.

A counterpart in the thermophilic fungus Humicola insolens has been found for all the previously isolated Trichoderma reesei cellulases (Schulein et al., 1993). No counterpart for EGIV has apparently been found from H. insolens.

In many respects, EGIV is a typical T. reesei cellulase. It has a modular structure, resembling CBHI, EGI, and EGV and also the two hemicellulases with CBDs in having its cellulose binding domain in the C-terminus. The long linker region is a special feature of EGIV and CBHII. This extended O-glycosylated region may partly cause the slow migration of EGIV in SDS/PAGE. The egl4 gene is regulated qualitatively in a manner typical for the Trichoderma cellulase genes. The gene is under carbon catabolite repression and induced by cellulose, sophorose, lactose, and cellobiose.

We wish to thank Riitta Nurmi, Riitta Isoniemi, and Tarja Hakkarainen for skillful technical assistance, Marjukka Perttula for HPAEC-PAD analysis, Marja Ilmén and Emilio Margolles-Clark for RNA samples, Matti Siika-aho for purified EGV protein, and Bernard Henrissat for help in HCA analysis.

REFERENCES

Aalto, M. K., Ronne, H. & Keranen, S. (1993) Yeast syntaxins Sso1p and Sso2p belong to a family of related membrane proteins that function in vesicular transport, EMBO J. 12, 4095-4104.
Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990) Basic local alignment search tool, J. Mol. Biol. 215, 403-410.
Armesilla, A. L., Thurston, C. F. & Yague, E. (1994) CEL1: a novel cellulose binding protein secreted by Agaricus bisporus during growth on crystalline cellulose, FEMS Microbiol. Lett. 116, 293-300.
Bernfeld, P. (1955) Amylases, α and β, Methods Enzymol. 1, 149-158.
Chen, C. M., Gritzali, M. & Stafford, D. W. (1987) Nucleotide sequence and deduced primary structure of cellobiohydrolase II from Trichoderma reesei, Biotechnology 5, 274-278.
Chirgwin, J. M., Przybyla, A. E., MacDonald, R. J. & Rutter, W. J. (1979) Isolation of biologically active ribonucleic acid from sources enriched in ribonuclease, Biochemistry 18, 5294-5299.
Divne, C., Ståhlberg, J., Reinikainen, T., Ruohonen, L., Pettersson, G., Knowles, J. K. C., Teeri, T. T. & Jones, A. (1994) The three-dimensional crystal structure of the catalytic core of cellobiohydrolase I from Trichoderma reesei, Science 265, 524-528.
Ducros, V., Czjzek, M., Belaich, A., Gaudin, C., Fierobe, H.-P., Belaich, J.-P., Davies, G. & Haser, R. (1995) Crystal structure of the catalytic domain of a bacterial cellulase belonging to family 5, Structure 3, 939-949.
Fagerstam, L. G., Pettersson, L. G. & Engstrom, J. A. (1984) The primary structure of a 1,4-β-glucan cellobiohydrolase from the fungus Trichoderma reesei QM9414, FEBS Lett. 167, 309-315.
Gaboriaud, C., Bissery, V., Benchetrit, T. & Mornon, J. P. (1987) Hydrophobic cluster analysis: an efficient new way to compare and analyse amino acid sequences, FEBS Lett. 224, 149-155.
Gietz, D., St Jean, A., Woods, R. A. & Schiestl, R. H. (1992) Improved method for high efficiency transformation of intact yeast cells, Nucleic Acids Res. 20, 1425.
Henrissat, B. & Bairoch, A. (1993) New families in the classification of glycosyl hydrolases based on amino acid sequence similarities, Biochem. J. 293, 781-788.
Hoffrén, A.-M., Teeri, T. T. & Teleman, O. (1995) Molecular dynamics simulation of fungal cellulose-binding domains: differences in molecular rigidity but a preserved cellulose binding surface, Protein Eng. 8, 443-450.
Ilmén, M., Saloheimo, A., Onnela, M.-L. & Penttilä, M. E. (1997) Regulation of cellulase gene expression in the filamentous fungus Trichoderma reesei, Appl. Environ. Microbiol. 63, 1298-1306.
Kraulis, P. J., Clore, G. M., Nilges, M., Jones, T. A., Pettersson, G., Knowles, J. & Gronenborn, A. M. (1989) Determination of the three-dimensional structure of the C-terminal domain of the cellobiohydrolase I from Trichoderma reesei. A study using nuclear magnetic resonance and hybrid distance geometry-dynamical simulated annealing, Biochemistry 28, 7241-7257.
Laemmli, U. K. (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4, Nature 227, 680-685.
Lemesle-Varloot, L., Henrissat, B., Gaboriaud, C., Bissery, V., Morgat, A. & Mornon, J.-P. (1990) Hydrophobic cluster analysis: procedures to derive structural and functional information from 2D-representation of protein sequences, Biochimie (Paris) 72, 555-574.
Mandels, M., Weber, J. & Parizek, R. (1971) Enhanced cellulase production by a mutant of Trichoderma viride, Appl. Microbiol. 21, 152-154.
Margolles-Clark, E., Tenkanen, M., Nakari-Setälä, T. & Penttilä, M. (1996a) Cloning of genes encoding α-L-arabinofuranosidase and β-xylosidase from Trichoderma reesei by expression in Saccharomyces cerevisiae, Appl. Environ. Microbiol. 62, 3840-3846.
Margolles-Clark, E., Tenkanen, M., Söderlund, H. & Penttilä, M. (1996b) Acetyl xylan esterase from Trichoderma reesei contains an active site serine and a cellulose-binding domain, Eur. J. Biochem. 237, 553-560.
Nakari-Setälä, T. & Penttilä, M. (1995) Production of Trichoderma reesei cellulases on glucose-containing media, Appl. Environ. Microbiol. 61, 3650-3655.
Nevalainen, H. & Penttilä, M. (1995) Molecular biology of cellulolytic fungi, in The Mycota II. Genetics and biotechnology (Kuck, U., ed.) pp. 303-319, Springer-Verlag, Berlin.
Penttilä, M., Lehtovaara, P., Nevalainen, H., Bhikhabhai, R. & Knowles, J. (1986) Homology between cellulase genes of Trichoderma reesei: complete nucleotide sequence of the endoglucanase I gene, Gene (Amst.) 45, 253-263.
Penttilä, M., Nevalainen, H., Ratto, M., Salminen, E. & Knowles, J. (1987a) A versatile transformation system for the cellulolytic filamentous fungus Trichoderma reesei, Gene (Amst.) 61, 155-164.
Penttilä, M., André, L., Saloheimo, M., Lehtovaara, P. & Knowles, J. (1987b) Expression of two Trichoderma reesei endoglucanases in the yeast Saccharomyces cerevisiae, Yeast 3, 175-185.
Penttilä, M. E., André, L., Lehtovaara, P., Bailey, M., Teeri, T. & Knowles, J. K. C. (1988) Efficient secretion of two fungal cellobiohydrolases in Saccharomyces cerevisiae, Gene (Amst.) 63, 103-112.
Raguz, S., Yague, E., Wood, D. A. & Thurston, C. F. (1992) Isolation and characterisation of a cellulose-growth-specific gene from Agaricus bisporus, Gene (Amst.) 119, 183-190.
Rouvinen, J., Bergfors, T., Teeri, T., Knowles, J. K. C. & Jones, T. A. (1990) Three-dimensional structure of cellobiohydrolase II from Trichoderma reesei, Science 249, 380-386.
Sakon, J., Adney, W. S., Himmel, M. E., Thomas, S. R. & Karplus, P. A. (1996) Crystal structure of thermostable family 5 endocellulase E1 from Acidothermus cellulolyticus in complex with cellotetraose, Biochemistry 35, 10648-10660.
Saloheimo, M., Lehtovaara, P., Penttilä, M., Teeri, T. T., Ståhlberg, J., Johansson, G., Pettersson, G., Claeyssens, M., Tomme, P. & Knowles, J. (1988) EGIII, a new endoglucanase from Trichoderma reesei: the characterisation of both gene and enzyme, Gene (Amst.) 63, 11-21.
Saloheimo, A., Henrissat, B., Hoffrén, A.-M., Teleman, O. & Penttilä, M. (1994) A novel, small endoglucanase gene, egl5, from Trichoderma reesei isolated by expression in yeast, Mol. Microbiol. 13, 219-228.
Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular cloning: a laboratory manual, 2nd edn, Cold Spring Harbor Laboratory, Cold Spring Harbor NY.
Schmuck, M., Pilz, I., Hayn, M. & Esterbauer, H. (1986) Investigation of cellobiohydrolase from Trichoderma reesei by small angle X-ray scattering, Biotechnol. Lett. 8, 398-402.
Schulein, M., Tikhomirov, D. F. & Schou, C. (1993) Humicola insolens alkaline cellulases, in Proceedings of the second TRICEL symposium on Trichoderma reesei cellulases and other hydrolases (Suominen, P. & Reinikainen, T., eds) pp. 109-116, Foundation for Biotechnical and Industrial Fermentation Research, Helsinki.
Sherman, F. (1991) Getting started with yeast, Methods Enzymol. 194, 3-21.
Shoemaker, S., Schweickart, V., Ladner, M., Gelfand, D., Kwok, S., Myambo, K. & Innis, M. (1983) Molecular cloning of exo-cellobiohydrolase derived from Trichoderma reesei strain L27, Biotechnology 1, 691-695.
Stålbrand, H., Saloheimo, A., Vehmaanpera, J., Henrissat, B. & Penttilä, M. (1995) Cloning and expression in Saccharomyces cerevisiae of a Trichoderma reesei β-mannanase gene containing a cellulose-binding domain, Appl. Environ. Microbiol. 61, 1090-1097.
Teeri, T., Salovuori, I. & Knowles, J. (1983) The molecular cloning of the major cellulase gene from Trichoderma reesei, Biotechnology 1, 696-699.
Teeri, T. T., Lehtovaara, P., Kauppinen, S., Salovuori, I. & Knowles, J. (1987) Homologous domains in Trichoderma reesei cellulolytic enzymes: gene sequence and expression of cellobiohydrolase II, Gene (Amst.) 51, 43-52.
van Arsdell, J. N., Kwok, S., Schweickart, V. L., Ladner, M. B., Gelfand, D. H. & Innis, M. A. (1987) Cloning, characterisation and expression in Saccharomyces cerevisiae of endoglucanase I from Trichoderma reesei, Biotechnology 5, 60-64.
von Heijne, G. (1986) A new method for predicting signal sequence cleavage sites, Nucleic Acids Res. 14, 4683-4690.
Walseth, C. S. (1952) Occurrence of cellulases in enzyme preparations from micro-organisms, Tech. Assoc. Pulp Pap. Ind. 35, 228-233.
Ward, M., Wu, S., Dauberman, J., Weiss, G., Larenas, E., Bower, B., Rey, M., Clarkson, K. & Bott, R. (1993) Cloning, sequence and preliminary structural analysis of a small, high pI endoglucanase (EGIII) from Trichoderma reesei, in Proceedings of the second TRICEL symposium on Trichoderma reesei cellulases and other hydrolases (Suominen, P. & Reinikainen, T., eds) pp. 153-158, Foundation for Biotechnical and Industrial Fermentation Research, Helsinki.
Yague, E., Wood, D. A. & Thurston, C. F. (1994) Regulation of transcription of the cel1 gene in Agaricus bisporus, Mol. Microbiol. 12, 41-47.