The Structure and Function of an Arabinan-specific -1

0 downloads 0 Views 3MB Size Report
Proteins fused to GST were puri- ... contained their respective His6 or GST tags. ...... Cooper, S. W., Pfeiffer, D. G., and Tally, F. P. (1985) J. Clin. ..... Table S6 NMR assignments: 1H and 13C chemical shift (ppm) of the arabinan oligosaccharides ...
THE JOURNAL OF BIOLOGICAL CHEMISTRY VOL. 286, NO. 17, pp. 15483–15495, April 29, 2011 © 2011 by The American Society for Biochemistry and Molecular Biology, Inc. Printed in the U.S.A.

The Structure and Function of an Arabinan-specific ␣-1,2-Arabinofuranosidase Identified from Screening the Activities of Bacterial GH43 Glycoside Hydrolases*□ S

Received for publication, December 23, 2010, and in revised form, February 16, 2011 Published, JBC Papers in Press, February 21, 2011, DOI 10.1074/jbc.M110.215962

Alan Cartmell‡§1, Lauren S. McKee‡§1, Maria J. Pen˜a§, Johan Larsbrink¶, Harry Brumer¶, Satoshi Kaneko储, Hitomi Ichinose储, Richard J. Lewis‡, Anders Viksø-Nielsen**, Harry J. Gilbert‡§2, and Jon Marles-Wright‡ From the ‡Institute for Cell and Molecular Biosciences, Newcastle University, The Medical School, Newcastle upon Tyne NE2 4HH, United Kingdom, the §Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia 30602, the ¶School of Biotechnology, Royal Institute of Technology, AlbaNova University Centre, 10691 Stockholm, Sweden, the 储Food Biotechnology Division, National Food Research Institute, 2-1-12 Kannondai, Tsukuba, Ibaraki 305-8642, Japan, and **Novozymes A/S, Krogshoejvej 36, 2880 Bagsvaerd, Denmark

* This work was supported by the Biotechnology and Biological Sciences Research Council, Woodwisdom, Novozymes, and the United States Department of Energy. This work was also supported in part by Department of Energy Grant DE-FG02–96ER20220 and by the Department of Energy-funded Center for Plant and Microbial Complex Carbohydrates (DE-FG02-93ER20097) (to M. J. P.). This research was also supported in part by the Bioenergy Science Center, a United States Department of Energy Bioenergy Research Center. □ S The on-line version of this article (available at http://www.jbc.org) contains supplemental Fig. S1 and Tables S1–S6. The atomic coordinates and structure factors (codes 3QED, 3QEE, and 3QEF) have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ (http://www.rcsb.org/). 1 Both authors contributed equally to this work. 2 To whom correspondence should be addressed: Institute for Cell and Molecular Biosciences, Newcastle University, The Medical School, Framlington Place, Newcastle upon Tyne NE2 4HH, United Kingdom. Tel.: 44-191-2228800; Fax: 44-2227424; E-mail: [email protected].

APRIL 29, 2011 • VOLUME 286 • NUMBER 17

that O2 arabinose decorations are directed into the active site pocket. A shelflike structure adjacent to the active site pocket accommodates O3 arabinose side chains, explaining how the enzyme can target O2 linkages that are components of single or double substitutions.

Plant cell walls comprise a diverse repertoire of chemically complex polysaccharides (1). The microbial degradation of these composite structures, which is mediated by an extensive repertoire of hydrolytic enzymes, is of considerable biological and industrial significance. Not only is the release of soluble sugars, principally monosaccharides, from the plant cell wall required to maintain microbial ecosystems, but the volatile fatty acids generated by these microbiota are essential nutrients for higher order organisms, such as mammalian herbivores (2). Within an industrial context, the microbial enzymes that catalyze this process are integral to the exploitation of lignocellulose as an environmentally sustainable substrate for the biofuel and bioprocessing industries (3, 4). An important limitation in the industrial exploitation of lignocellulose is the cost of the pretreatment process, reflecting the low activity displayed by the degradative enzymes against cell wall structures. This reflects, in part, the chemical complexity of the plant cell wall, particularly within the pectin group of polysaccharides (5). There is, therefore, an urgent need to discover glycoside hydrolases that exhibit novel activities against these complex structural polysaccharides. Such enzymes will not only have the potential to contribute to the degradative process but will also be useful reagents with which to probe the molecular architecture of cell walls. Within the plant cell wall, cellulose fibers are cross-linked with hemicellulosic and pectic polysaccharides (1). The backbones of these matrix polysaccharides are often decorated with arabinofuranose-containing side chains. For example, rhamnogalacturonan I is a pectic polysaccharide found in primary cell walls, which plays an important role in plant cell growth and tissue development. Rhamnogalacturonan I contains a backbone of the repeating disaccharide 4-␣-D-galacturonic acid-1,2-␣-L-rhamnose-1, which is extensively decorated with a variety of oligosaccharides, of which arabinan is the most abunJOURNAL OF BIOLOGICAL CHEMISTRY

15483

Downloaded from http://www.jbc.org/ by guest on November 3, 2015

Reflecting the diverse chemistry of plant cell walls, microorganisms that degrade these composite structures synthesize an array of glycoside hydrolases. These enzymes are organized into sequence-, mechanism-, and structure-based families. Genomic data have shown that several organisms that degrade the plant cell wall contain a large number of genes encoding family 43 (GH43) glycoside hydrolases. Here we report the biochemical properties of the GH43 enzymes of a saprophytic soil bacterium, Cellvibrio japonicus, and a human colonic symbiont, Bacteroides thetaiotaomicron. The data show that C. japonicus uses predominantly exo-acting enzymes to degrade arabinan into arabinose, whereas B. thetaiotaomicron deploys a combination of endo- and side chain-cleaving glycoside hydrolases. Both organisms, however, utilize an arabinan-specific ␣-1,2-arabinofuranosidase in the degradative process, an activity that has not previously been reported. The enzyme can cleave ␣-1,2-arabinofuranose decorations in single or double substitutions, the latter being recalcitrant to the action of other arabinofuranosidases. The crystal structure of the C. japonicus arabinan-specific ␣-1,2-arabinofuranosidase, CjAbf43A, displays a fivebladed ␤-propeller fold. The specificity of the enzyme for arabinan is conferred by a surface cleft that is complementary to the helical backbone of the polysaccharide. The specificity of CjAbf43A for ␣-1,2-L-arabinofuranose side chains is conferred by a polar residue that orientates the arabinan backbone such

Arabinan-specific Arabinofuranosidase

MATERIALS AND METHODS Gene Cloning—The genes encoding the mature forms (lacking their signal peptides) of the C. japonicus and B. thetaiotaomicron GH43 enzymes were amplified by PCR from genomic DNA with the thermostable DNA polymerase Kod and the primers listed in supplemental Tables S1 and S2. The PCR products were cloned into the expression vectors pRSET, pET21d, pET22b, pET28b, and pET32b to generate the plasmids listed in supplemental Tables S1 and S2. All of the recom-

15484 JOURNAL OF BIOLOGICAL CHEMISTRY

binant enzymes contained either a His6 tag or were fused to glutathione S-transferase (GST). Protein Expression and Purification—Escherichia coli BL21 or Tuner cells, transformed with the relevant plasmid, were cultured in Luria-Bertani broth at 37 °C to midexponential phase (A600 ⫽ 0.6). Recombinant protein expression was induced by the addition of 1 mM isopropyl-1-thio-␤-D-galactopyranoside and grown for a further 12 h at 16 °C. Cell-free extracts were prepared as described previously (16), and proteins with a His6 tag were purified by immobilized metal ion affinity chromatography (IMAC) using TalonTM resin and eluted with 50 mM Tris/HCl buffer, pH 8.0, containing 300 mM NaCl and 100 mM imidazole. Proteins fused to GST were purified using a glutathione resin and eluted with 10 mM Tris/HCl buffer, pH 8.0, containing 10 mM reduced glutathione. Enzymes used in biochemical assays were dialyzed against 50 mM sodium phosphate buffer, pH 7.0, at 4 °C. Proteins destined for crystallographic analysis were subjected to further purification on a 26/60 S75 Superdex gel filtration column equilibrated with 10 mM Tris/HCl, pH 8.0, containing 150 mM NaCl. Fractions containing the purified protein were pooled and dialyzed against ultrapure water. To produce a selenomethionine derivative of CjAbf43A, the E. coli methionine auxotroph B834 (DE3) containing pAC12 (encodes CjAbf43A) was cultured as described previously (16), and the arabinofuranosidase was purified by IMAC and gel filtration as described above, except that the buffers were supplemented with 5 mM ␤-mercaptoethanol or 10 mM DL- dithiothreitol, respectively. The purified protein was dialyzed extensively against 10 mM DL-dithiothreitol. Site-directed Mutagenesis—Site-directed mutagenesis was conducted using the PCR-based QuikChange site-directed mutagenesis kit (Stratagene) according to the manufacturer’s instructions, using pAC12 as the template and the primer pairs listed in supplemental Table S3. Enzyme Assays—The substrates 4-nitrophenyl-␣-L-arabinofuranoside (4NPA)3 and sugar beet arabinan were purchased from Sigma and Megazyme, respectively. Assays in which 4NPA was the substrate were monitored by the release of 4-nitrophenolate at a wavelength of 400 nm. The product was quantified using a molar extinction coefficient of 10,500 M⫺1 cm⫺1. When using polysaccharides such as arabinan as the substrate, arabinose release was monitored using galactose dehydrogenase (Megazyme), an enzyme that catalyzes the oxidation of galactose/arabinose with concomitant reduction of NAD⫹ to NADH (17). The concentration of NADH generated was quantified by measuring ⌬A340 nm using a molar extinction coefficient of 6230 M⫺1 cm⫺1. All reactions were carried out in 50 mM sodium phosphate buffer, pH 7.0, at 25 °C for the C. japonicus enzymes and at 37 °C for the B. thetaiotaomicron enzymes. For pH optima assays, a range of different buffers were used. Assays were monitored discontinuously, and time points were terminated by the addition of an equal volume of 500 mM Na2CO3. The addition of Na2CO3 raises the pH to 11, where 4-nitrophenolate has an extinction coefficient of 24,150 M⫺1 cm⫺1. Reducing sugar assays were carried out using 3,5-dinitrosalicylic acid 3

The abbreviations used are: 4NPA, 4-nitrophenyl-␣-L-arabinofuranoside; HPAEC, high performance anion exchange chromatography.

VOLUME 286 • NUMBER 17 • APRIL 29, 2011

Downloaded from http://www.jbc.org/ by guest on November 3, 2015

dant (5). Arabinan consists of an ␣-1,5-L-arabinofuranose backbone that is heavily substituted with ␣-1,3-L-arabinofuranose and/or ␣-1,2-L-arabinofuranose side chains (5). The ␤-1,4-xylose backbone of the major hemicellulosic polysaccharide in cereals, arabinoxylan, is also extensively decorated at O2 and/or O3 with single arabinofuranose units (6). Within the CAZy data base, glycoside hydrolases are grouped into families based on sequence, structural, and mechanistic similarities (7). Enzymes that hydrolyze arabinofuranose-linked glycosidic bonds are predominantly located in glycoside hydrolase families GH43, GH51, GH54, and GH62 (7). Of particular note is GH43, which, in addition to containing arabinofuranosidases and arabinanases (8 –10), also contains ␤-xylosidases (11), exo-␤-1,3-galactanases (12), and xylanases (13). An intriguing feature of several organisms, which utilize the plant cell wall as an important nutrient, is that their genomes encode a large number of GH43 enzymes. For example the soil saprophyte Cellvibrio japonicus, which degrades all of the major components of the cell wall, contains 14 genes encoding GH43 enzymes (14), whereas Bacteroides thetaiotaomicron, a human colonic symbiont that targets pectins and mammalian glycans, has the genetic capacity to synthesize 33 different GH43 hydrolases (15). Superficially, GH43 appears to contain relatively few enzyme activities, and thus the biological rationale for the expansion of this family in certain organisms is unclear. It is likely, however, that GH43 arabinofuranosidases may target different hemicellulosic and pectic polysaccharides. Here we have explored the range of activities displayed by the C. japonicus and B. thetaiotaomicron GH43 enzymes against arabinose-containing polysaccharides. The data underline the different strategies used by the two microorganisms to degrade arabinan. A common feature of the two systems is an arabinanspecific ␣-1,2-arabinofuranosidase, an activity that has not previously been reported. It is likely that the primary role of the enzyme is to target O2-linked arabinose residues that are components of double substitutions. The crystal structure of the C. japonicus arabinan-specific ␣-1,2-arabinofuranosidase, CjAbf43A, reveals a curved substrate binding cleft that accommodates the helical structure of the arabinan backbone. The active site, which comprises a deep pocket in the middle of the cleft, displays considerable conservation with other GH43 exoacting enzymes. Asn165, located in the ⫹1 subsite, in combination with aromatic residues, appear to orientate the polysaccharide chain such that the O2 arabinose decoration is housed in the active site. Thus, the shape of the cleft, in combination with specific interactions with the arabinan backbone, confers the specificity displayed by this novel enzyme.

Arabinan-specific Arabinofuranosidase

APRIL 29, 2011 • VOLUME 286 • NUMBER 17

512 ⫻ 512 complex points. These data were processed typically with zero filling to obtain a 1024 ⫻ 1024 matrix. Chemical shifts were measured relative to internal acetone (␦H 2.225, ␦C 30.89). Data were processed using MestRe-C software (Universidad de Santiago de Compostela, Spain). Crystallization, Data Collection, Solution, and Refinement of CjAbf43A—A selenomethionine derivative of CjAbf43A in ultrapure water was crystallized at 30 mg/ml in 1.5 M ammonium sulfate and 100 mM Tris/HCl buffer, pH 7.5. Crystals, which grew overnight at 20 °C, were transferred to a cryoprotectant solution of the crystallization condition supplemented with 25% (v/v) glycerol. Native protein in water was crystallized at 10 mg/ml in 13% PEG 3350 and 150 mM sodium acetate at 20 °C, with crystals reaching maximum size in 4 weeks. The purified catalytic base mutant was dialyzed against 50 mM sodium chloride and crystallized at 20 mg/ml in the presence of 20 mM arabinotetraose in 20% PEG 3350 and 200 mM ammonium tartrate at 20 °C, with crystals growing over a 2-month period. These two crystal forms were cryoprotected with the addition of 25% (v/v) ethylene glycol to the respective crystallization conditions. Crystals were flash-cooled in liquid nitrogen, and diffraction data, for the native arabinofuranosidase and the enzyme-ligand complex, were collected at Diamond Light Source on beamline I02 and at the Advanced Photon Source on beamline 22-BM, respectively. Because no appropriate molecular replacement model was available for CjAbf43A in the Protein Data Bank, the structure was solved by single wavelength anomalous dispersion using the selenomethionine derivative. A fluorescence energy scan was performed around the selenium K atomic absorption edge of 12.658 keV, confirming the presence of selenomethionine in the sample, and a single wavelength anomalous dispersion data set was collected at the optimal peak wavelength of 0.979 Å. The data were integrated using iMosflm (19) and scaled and merged in Scala (20). The single wavelength anomalous dispersion data set had an overall anomalous correlation of 0.31. The scaled data were input into the SHELX C/D/E pipeline (21) for experimental phasing, where the anomalous signal, as indicated by the 具d⬙/sig典 statistics, was 3.3 in the resolution shell up to 8 Å and was 0.98 in the highest resolution shell, between 3.2 and 3.0 Å. In total, 30 selenium atom positions were identified, and phasing and density modification unambiguously determined the hand of the data to generate an interpretable initial electron density map. The initial phases were improved through the use of solvent flattening and 2-fold non-crystallographic symmetry averaging, with operators determined from the selenium positions, with DM (22). An initial model was automatically built with ARP/wARP (23) and was extended and refined using Buccaneer (24). This model had an Rcryst of 0.200 and Rfree of 0.236 and was used as a molecular replacement model for the native high resolution data set. The native and the D41A ligand-enzyme complex data were integrated using iMosflm (19) and scaled and merged with Scala (20). The scaled data for the two crystal forms were input into Phaser (25), and the structures were solved by molecular replacement using the selenomethionine structure as the search model. Refinement of all structures was performed using Refmac5 (26) and phenix.refine (27) interspersed with manual JOURNAL OF BIOLOGICAL CHEMISTRY

15485

Downloaded from http://www.jbc.org/ by guest on November 3, 2015

according to Ref. 18. At various time points, a 500-␮l aliquot of the reaction was taken and stopped by the addition of an equal volume of 3,5-dinitrosalicylic acid reagent. After boiling for 20 min, the A575 nm values were determined and compared with a standard curve of arabinose. The enzymes used in these assays contained their respective His6 or GST tags. All enzyme assays were carried out in triplicate, and the S.E. is reported for the individual kinetic parameters. HPAEC of Monosaccharide and Oligosaccharide Reaction Products—Oligosaccharides and monosaccharides, derived from polysaccharide hydrolysis by either enzymatic or acid treatment, were analyzed using a CARBOPACTM PA-100 anion exchange column (Dionex) equipped with a CARBOPACTM PA-100 guard column. The fully automated system (LC25 chromatography oven, GP40 gradient pump, ED40 electrochemical detector, and AS40 autosampler) had a loop size of 200 ␮l and flow rate of 1.0 ml/min, with sugars detected by pulsed amperometric detection. The pulsed amperometric detection settings were as follows: E1 ⫽ ⫹0.05, E2 ⫽ ⫹0.6, and E3 ⫽ ⫺0.6. Typical elution conditions were as follows: 0 –5 min, 66 mM NaOH; 5–30 min, 66 mM NaOH with a 0 –75 mM sodium acetate linear gradient; 30 – 40 min, 500 mM sodium acetate in 66 mM NaOH; 40 –50 min, 500 mM sodium hydroxide; and then 50 – 60 min, 66 mM NaOH. Appropriate oligosaccharides were used as standards at a concentration of 50 ␮M. Samples to be analyzed were centrifuged at 13,000 ⫻ g for 10 min. Data were collected and manipulated using the ChromeleonTM chromatography management system version 6.8 (Dionex) via a ChromeleonTM Server (Dionex). Final graphs were drawn with Prism 4.02 (GraphPad). Thin Layer Chromatography—TLC plates (silica gel 60, 20 ⫻ 20, Merck) spotted with samples (two 1-␮l drops) were run in 1-butanol/acetic acid/water in a ratio of 2:1:1 (v/v/v). Visualization of migrated sugars was achieved by immersing the TLC plates in developer (sulfuric acid/ethanol/water (3:70:20, v/v/v), orcinol (0.1%)) for 1–2 s. Plates were again dried and heated between 80 and 100 °C. Nuclear Magnetic Resonance—Arabinooligosaccharides were prepared for NMR analysis by the controlled acid hydrolysis of sugar beet arabinan. HCl was added to 10 ml of a 30 mg/ml solution of sugar beet arabinan to a final concentration of 50 mM, and the solution was boiled for 30 min. The reaction was then put on ice and neutralized with 500 ␮l of 1 M NaOH. The large polymers in the mixture were precipitated in 80% (v/v) ethanol, and the arabinooligosaccharide-containing supernatant was vacuum-dried. The arabinan oligosaccharides were dissolved in 5 mM sodium phosphate, pH 7.0, at 5 mg/ml and incubated for 60 min in the presence and absence of 10 ␮M CjAbf43A at 25 °C. After the incubation, the whole reaction without separation of the reaction products or the enzyme was lyophilized. The dried material was dissolved in D2O (99.9%; Cambridge Isotope Laboratories), and 1H and 13C NMR twodimensional spectra were recorded with a Varian Inova NMR spectrometer operating at 600 MHz and with a sample temperature of 298 K. Gradient versions of COSY, heteronuclear single quantum coherence, and heteronuclear multiple quantum coherence experiments were recorded using standard Varian pulse programs. Heteronuclear spectra were recorded with

Arabinan-specific Arabinofuranosidase TABLE 1 Data collection and refinement statistics Data collection Data collection wavelength (Å) Space group Unit cell parameters (Å) Unit cell parameters (degrees) Resolution (Å) No. of observations No. of unique reflections Mean I/␴I Rmerge Ranom Completeness (%) Anomalous completeness (%) Multiplicity

a b

Native

D41A-Ligand

0.98 P41212 a ⫽ b ⫽ 192.65, c ⫽ 132.5

1.00 P41212 a ⫽ b ⫽ 85.64, c ⫽ 195.65

72.17-2.99 (3.15-2.99)a 728,565 50,829 14.9 (5.2) 0.22 (0.57) 0.07 (0.29) 100 (100) 100 (100) 14.3 (14.7)

0.97 P21 a ⫽ 47.29, b ⫽ 139.27, c ⫽ 50.46 ␤ ⫽ 116.0 36.29-1.64 (1.73-1.64) 260,148 71,488 11.5 (4.1) 0.10 (0.40) 99.8 (100) 3.6 (3.7)

0.200/0.236

0.153/0.184

0.169/0.198

0.011 1.35

0.010 1.226

0.007 1.142

94.7 99.9

96.5 99.7

96.3 100

51.5 21.03 21.82 ⫺/⫺ 3QED

11.20 9.60 10.18 ⫺/22.00 3QEE

26.50 24.77 28.25 35.33/36.38 3QEF

44.73-1.79 (1.89-1.79) 1,008,728 69,576 16 (5.4) 0.11 (0.50) 100 (100) 14.5 (14.5)

Values in parentheses correspond to the highest resolution shell. Ramachandran statistics generated using molprobity.

model correction and building in Coot (28) until convergence. Waters were added using ARP/waters (23) and checked manually. Data collection and refinement statistics are given in Table 1. It should be noted that the D41A mutation introduces a dual conformation of His267. One of these conformations, with a ␹1 angle of ⫺65°, points toward the bound oligosaccharide and is stabilized in this position by the presence of a solute network that coincides spatially with the other rotamer position of His267 (␹1 ⫽ ⫺161°), which is oriented away from the oligosaccharide. In the apo structure, His267 adopts a single conformation. Interestingly, a related GH43 enzyme, PDB entry 3C7G, contains a bound sodium ion in a similar location, but our diffraction data do not permit the exact identification of the bound solutes for rigorous alternative chemical occupancy refinement, and the deposited model coordinates do not include this partially occupied solute constellation.

RESULTS Screening of the Activities Displayed by B. thetaiotaomicron and C. japonicus GH43 Enzymes Previous studies have shown that the GH43 family contains a large number of enzymes that contribute to the degradation of polysaccharides containing arabinofuranose residues (8 –10). To explore the relationship between the genetic potential of C. japonicus and B. thetaiotaomicron to synthesize GH43 enzymes, and the capacity of the organisms to metabolize arabinose-containing complex carbohydrates, the genes encoding the GH43 proteins were expressed in E. coli. In total, 23 of the 33 B. thetaiotaomicron and 11 of the 14 C. japonicus GH43 proteins were produced in soluble form and purified to electrophoretic homogeneity (data not shown). The ORF nomenclature used herein corresponds to that adopted in the annotation of the two genomes (14, 15).

15486 JOURNAL OF BIOLOGICAL CHEMISTRY

The catalytic activity of the GH43 proteins was assessed using 4NPA and a range of arabinose-containing polysaccharides (and oligosaccharides), including wheat and rye arabinoxylan, sugar beet, linear arabinan, and debranched rhamnogalacturonan I. The data, summarized in supplemental Tables S4 and S5, show that six enzymes, three from both B. thetaiotaomicron and C. japonicus, displayed significant 4NPAase activity, two B. thetaiotaomicron enzymes displayed endo-arabinanase activity, and two C. japonicus GH43 proteins were shown to function as exo-␣-1,5-arabinofuranosidases (and also hydrolyzed 4NPA). A single ␤-xylosidase, CJA_3070 from C. japonicus, was identified. Finally, an enzyme, common to B. thetaiotaomicron and C. japonicus, Bt0369 and CJA_3018, respectively (designated hereafter as CjAbf43A), hydrolyzed 4NPA and released arabinose from sugar beet arabinan but was not significantly active against linear arabinan or any of the other polysaccharides tested (Table 2 and supplemental Tables S4 and S5). A surprising feature of the biochemical studies described above is that nine of the GH43 proteins displayed no significant activity against polysaccharides, whereas 17 of the proteins displayed trace endo-xylanase activity, and one enzyme, CJA_3601, appeared to function as an extremely weak xylanspecific arabinofuranosidase. Although these activities could be detected by HPLC, after reactions were incubated for ⬃48 h, the concentrations of the products were too low to be quantified by reducing sugar assays. It is evident, therefore, that this xylanase activity is too low to be of biological significance, exemplified by the observation that B. thetaiotaomicron is unable to grow on xylan (29). These data may have some resonance with a previous report suggesting (through zymogram analysis) that a GH43 protein functions as an endo-xylanase VOLUME 286 • NUMBER 17 • APRIL 29, 2011

Downloaded from http://www.jbc.org/ by guest on November 3, 2015

Refinement Rwork/Rfree Root mean square deviations Bond lengths (Å) Bond angles (degrees) Ramachandran plotb Favored (%) Allowed (%) Mean B-factor Wilson B (Å2) Main chain (Å2) Side chain (Å2) Ligand/water (Å2) Protein Data Bank codes

Selenomethionine

Arabinan-specific Arabinofuranosidase TABLE 2 Kinetic properties of C. japonicus and B. thetaiotaomicron. GH43 enzymes The C. japonicus and B. thetaiotaomicron enzymes were assayed in 50 mM sodium phosphate buffer, pH 7.0, at 25 and 37 °C, respectively. Enzyme CJA_3012

CJA_0806

CjAbf43A (CJA_3018) Bt0369 Bt0360 Bt0367

4NPA Arabinobiose Arabinotetraose Arabinohexaose Sugar beet arabinan 4NPA Arabinobiose Arabinotriose Arabinotetraose Arabinohexaose Sugar beet arabinan 4NPA Arabinohexaose Sugar beet arabinan 4NPA Arabinohexaose Sugar beet arabinan Sugar beet arabinan Linear arabinan Sugar beet arabinan Linear arabinan 4NPA 4NPA 4NPA 4NPX Xylotriose Xylotetraose Xylohexaose

Km

kcat

kcat/Km

mM

min⫺1

min⫺1 M⫺1

0.1 ⫾ 0.02 0.3 ⫾ 0.03 1.1 ⫾ 0.85 1.1 ⫾ 0.08 —a — 7.6 ⫾ 1.27 3.4 ⫾ 0.80 4.0 ⫾ 0.2 4.3 ⫾ 1.3 — 3.1 ⫾ 0.5 — 0.3 ⫾ 0.02b 8.0 ⫾ 2.4 — 2.1 ⫾ 0.1 ⫻ 10⫺5b — — — — 0.4 ⫾ 0.02 2.3 ⫾ 0.2 1.0 ⫾ 0.1 — — — —

0.6 ⫾ 0.1 755 ⫾ 100 708 ⫾ 19 693 ⫾ 11 — — 220 ⫾ 22.1 357 ⫾ 18.8 268 ⫾ 11.0 321 ⫾ 41.7 — 1896 ⫾ 319 — 308 ⫾ 8 4.0 ⫾ 0.7 ⫻ 104 — 2681 ⫾ 246 — — — — 155.5 ⫾ 19.2 14.6 ⫾ 6.1 101.3 ⫾ 3.4 — — — —

7.0 ⫻ 103 2.2 ⫻ 106 6.6 ⫻ 105 6.3 ⫻ 105 2.0 ⫻ 104 ⫾ 1.1 ⫻ 103b 2.6 ⫻ 104 ⫾ 2.0 ⫻ 103 2.9 ⫻ 104 1.1 ⫻ 105 6.8 ⫻ 104 7.4 ⫻ 104 1.0 ⫻ 103 ⫾ 2.0 ⫻ 101b 6.2 ⫻ 105 1.1 ⫻ 102 ⫾ 7.0 ⫻ 10⫺3 1.1 ⫻ 106 5.0 ⫻ 106 3.6 ⫻ 103 ⫾ 3.2 ⫻ 102 1.3 ⫻ 107 508.5 ⫾ 56.9c 57.8 ⫾ 10.2c 52.4 ⫾ 3.2c 323.9 ⫾ 21.8c 44.3 ⫻ 105 6.3 ⫻ 103 1.0 ⫻ 105 2.03 ⫻ 103 ⫾ 2.8 ⫻ 102 4.09 ⫻ 103 ⫾ 2.54 ⫻ 102 6.93 ⫻ 103 ⫾ 7.46 ⫻ 102 7.15 ⫻ 103 ⫾ 1.37 ⫻ 103

a

—, parameter could not be determined. Sugar beet arabinan at 1 mg/ml was enzymatically hydrolyzed to completion to calculate effective substrate concentration. The data were used to determine Km or kcat/Km. c Substrate concentration was mg/ml; thus,. units for kcat/Km are min⫺1 mg⫺1 ml. b

(13), although true GH43 xylanases have recently been identified (30). Characterization of Arabinan-degrading Enzymes C. japonicus—Biochemical analysis of the two exo-␣-1,5-arabinofuranosidases, CJA_3012 and CJA_0806 (Table 2), showed that both enzymes displayed the highest catalytic activity against linear arabinan and arabinooligosaccharides, releasing exclusively arabinose, probably from the non-reducing end (all exo-acting GH43 enzymes attack the non-reducing end of polysaccharides (8, 11, 31, 32)). The two enzymes released only very small amounts of arabinose from sugar beet, reflecting the limited concentration of terminal undecorated ␣-1,5-arabinofuranose units in this highly branched polysaccharide (data not shown). The catalytic efficiency of the two enzymes against arabinooligosaccharides with different degrees of polymerization was assessed. The data show that CJA_3012 is ⬃300-fold more active against arabinobiose than 4NPA (Table 2), suggesting that the interaction of arabinofuranose with the ⫹1 subsite plays a critical role in substrate binding and catalysis (the ⫺1 subsite houses the sugar at the non-reducing end of the substrate, and the scissile glycosidic bond is between sugars bound at ⫺1 and ⫹1 (33)). The enzyme did not display elevated activity against substrates with a degree of polymerization of ⬎2, indicating that the arabinofuranosidase contains only two kinetically significant subsites. Indeed, the longer arabinooligosaccharides appeared to be hydrolyzed more slowly than the disaccharide. The similar activity displayed by CJA_0806 against 4NPA and arabinobiose indicates that the ⫹1 subsite exhibits limited specificity for arabinose, although the enzyme appears to be optimized to hydrolyze arabinotriose because the APRIL 29, 2011 • VOLUME 286 • NUMBER 17

trisaccharide is a better substrate than either di- or tetrasaccharides. It is possible that the ⫹2 subsite in CJA_0806 and the ⫹1 subsite in CJA_3012 preferentially bind to arabinose at the reducing end of substrates by targeting its pyranose conformation or the ␤-anomer of the furanose sugar. An intriguing feature of the two exo-␣-1,5-arabinanases is that although they display similar catalytic efficiencies against 4NPA, the Km and kcat of CJA_0806 were much higher than for CJA_3012. The sequence identity between these two enzymes is 49%. It is possible that the ⫹1 site of CJA_3012 binds much more tightly to 4-nitrophenyl than CJA_0806, which may result in the low Km value. This tight binding, however, may limit aglycone departure from the active site, which would impact upon the catalytic rate. Kinetic analysis of CjAbf43A showed that the enzyme displayed similar catalytic efficiency against 4NPA and sugar beet arabinan, releasing monomeric arabinose (Fig. 1). The enzyme did not show activity against any other plant cell wall polysaccharides evaluated, such as linear arabinan, xylans (arabinoxylans derived from oat, wheat, rye, and birchwood), pectin backbone structures (homopolygalacturonic acid and rhamnogalacturonan), and ␤-glucans (cellulose, ␤-1,4,1,3 mixed linkage glucans, and laminarin) (supplemental Table S4. CjAbf43A was ⬃104-fold less active against arabinohexaose (an ␣-1,5-linked arabinofuranose hexasaccharide) than sugar beet arabinan (Table 2). Sugar beet arabinan consists of an ␣-1,5linked L-arabinofuranose backbone that is ⬃60% monosubstituted with ␣-1,3-L-arabinofuranose side chains and, less frequently, with single ␣-1,2 L-arabinofuranose (5). Some backbone residues are doubly substituted with both ␣-1,2 and JOURNAL OF BIOLOGICAL CHEMISTRY

15487

Downloaded from http://www.jbc.org/ by guest on November 3, 2015

Bt2852 Bt3094 Bt3655 CjXyl43A (CJA_3070)

Substrate

Arabinan-specific Arabinofuranosidase

␣-1,3 side chains (5). The high activity of the enzyme against sugar beet arabinan and the low activity against ␣-1,5 arabinooligosaccharides suggest that CjAbf43A hydrolyzes the ␣-1,2 and/or ␣-1,3 side chains present in sugar beet arabinan. To further explore CjAbf43A specificity, arabinan oligosaccharides were subjected to two-dimensional NMR spectroscopy analysis after incubation without (control) or with the enzyme. Resonances in the two-dimensional spectra were assigned to specific arabinosyl residues on the basis of their scalar coupling patterns and chemical shifts and by comparison with the published data (34) (supplemental Table S6. The heteronuclear single quantum coherence NMR spectrum of the control contained two distinct cross-peaks at ␦ 4.170, 87.6 and ␦ 4.314, 85.6 that disappeared after the treatment with the enzyme (Fig. 2). The resonance at ␦ 87.6 is diagnostic of ␣-1,2 side chains because the chemical shift of the C2 of the ␣-1,5-linked L-ara-

15488 JOURNAL OF BIOLOGICAL CHEMISTRY

Phylogenetic Analysis of B. thetaiotaomicron and C. japonicus GH43s The B. thetaiotaomicron and C. japonicus GH43 proteins were subjected to phylogenetic analysis. The analysis incorporated selected enzymes from other organisms with known activities. The data, presented in Fig. 3, show that enzymes that displayed arabinanase and ␣-1,5-exo-arabinanase activity are in clades with glycoside hydrolases that exhibit similar activities. By contrast, and consistent with their novel activity, CjAbf43A (CJA_3018) and Bt0369 formed a clade that contains no other enzymes with known catalytic properties. The proteins displaying “trace” xylanase activity are not clustered into a specific region of the family tree, suggesting that this minor activity may be a generic feature of GH43. Furthermore, one might speculate that an ancestral GH43 enzyme displayed significant xyla4

L. McKee, unpublished data.

VOLUME 286 • NUMBER 17 • APRIL 29, 2011

Downloaded from http://www.jbc.org/ by guest on November 3, 2015

FIGURE 1. Catalytic activity of Bt3069 and CjAbf43A. The enzymes at 50 nM were incubated with sugar beet arabinan, and the reaction products generated were analyzed by HPAEC.

binofuranose residues shifted markedly downfield when the residues were substituted at the O2 position. The resonance at ␦ 85.6 corresponded to arabinosyl residues substituted at the O2 and O3 positions. Other resonances within the spin systems of the single and double substituted arabinosyl residues also disappeared after the treatment with the enzyme. These NMR data are consistent with the view that CjAbf43A hydrolyzes ␣-1,2-L-arabinofuranose glycosidic bonds within the context of double or single substitutions. The specificity of the enzyme for the O2 linkage to backbone arabinose residues that carry two substitutions (at O2 and O3) was further examined through synergy experiments with HiAXHd3, an arabinofuranosidase that targets exclusively O3 linkages in double substitutions of arabinan and arabinoxylan (35).4 Arabinan that had been pretreated with CjAbf43A was not hydrolyzed by HiAXHd3. CjAbf43A, however, was able to release arabinose residues from arabinan pretreated with HiAXHd3. These results are consistent with the NMR data in showing that CjAbf43A targets O2 linkages in backbone arabinose residues that are singly or doubly substituted. CjAbf43A releases 33% more arabinose than HiAXHd3, indicating a ratio of O2 single and O2 ⫹ O3 double substitutions of 1:3. B. thetaiotaomicron—Although both Bt0360 and Bt0367 display endo-arabinanase activity, they differ in specificity for linear and branched arabinan. Bt0360 shows a preference for branched arabinan, whereas Bt0367 exhibits higher activity against linear arabinan (Table 2). Bt0369, an arabinan-specific arabinofuranosidase, did not release arabinose from arabinan pretreated with CjAbf43A, whereas the Cellvibrio arabinofuranosidase was inactive against the polysaccharide that had been incubated with the Bacteroides enzyme. NMR of the reaction products showed that Bt0369 removed O2-linked arabinose side chains in the context of both single and double substitutions. These data show that Bt0369 and CjAbf43A display the same substrate specificity, consistent with their structural similarity (the primary sequences of the two enzymes are 65% identical). The functional significance of the substrate specificities of the B. thetaiotaomicron and C. japonicus enzymes, within the wider context of arabinan degradation by the two bacteria, is discussed below.

Arabinan-specific Arabinofuranosidase

nase activity; however, as new specificities were introduced into the GH43 fold, the endogenous capacity to hydrolyze xylan was lost. The requirement for a GH43 xylan-degrading enzyme may also have been greatly reduced as cell wall-degrading organisms acquired more powerful GH10 and GH11 xylanases (for reviews, see Refs. 36 and 37). It should be noted that nine of the 34 recombinant proteins from the two bacteria, although soluble, displayed no biologically significant enzyme activity. It is possible that, through redundancy, there was no requirement for the bacteria to retain functional forms of these enzymes. Alignment of eight of these enzymes of unknown function shows that they lack one or more of the catalytic residues, providing support for the view that, indeed, they are not catalytically active. This view is further illustrated by the structure of Bt2959 (Protein Data Bank entry 3NQH), which shows that the protein lacks the canonical catalytic base. It is also possible that for those “apparently inactive” enzymes that contain the expected catalytic residues, the appropriate substrate was not evaluated in this study. Three-dimensional Structure of CjAbf43A The crystal structure of CjAbf43A was solved as a selenomethionine derivative, a native form, and a native catalytic base mutant in complex with ligand to resolutions of 3.0, 1.6, and 1.8 Å, respectively. The selenomethionine and ligand-bound crystals of CjAbf43A were in the tetragonal space group P41212, whereas the native crystals were in the monoclinic space group P21. The structures each consist of two molecules of CjAbf43A in the asymmetric unit, equating to residues 29 –325 in the fulllength enzyme. The two molecules in each asymmetric unit APRIL 29, 2011 • VOLUME 286 • NUMBER 17

were not restrained to one another during the course of model refinement but nonetheless can be considered identical, superimposing with root mean square deviations of ⬃0.2 Å on 295 matched C␣ atoms. CjAbf43A displays a five-bladed ␤-propeller fold (Fig. 4), typical of GH43 enzymes (8, 10, 11, 32, 38). It has a cylindrical shape with a diameter and height of 35 Å. The propeller is based upon a 5-fold repeat of blades formed by ␤-sheets that adopt the classical “W” topology of four antiparallel ␤-strands. The blades are arranged radially from the center of the propeller. In some proteins that display a five-bladed ␤-propeller fold, the fifth blade consists of strands from the N and C terminus. Such closure of ␤-propeller proteins is colloquially termed “molecular velcro” and appears to provide considerable stabilization to the fold (39). In CjAbf43A, the fifth strand comprises only ␤-strands from the C-terminal sequence of the domain, typical of other GH43 enzymes, and thus the propeller fold is not “closed” by the fifth blade. The N- and C-terminal blades, however, are in close association, and stability is provided by strong hydrogen bonds between Asp78 in blade 1 and several residues in blade 5. CjAbf43A lacks a C-terminal ␤-jelly roll domain seen in several exo-acting GH43 enzymes (8, 10, 11, 31). Structural Comparison of CjAbf43A with GH43 Enzymes Structural comparison of CjAbf43A by secondary structure matching of the global structure revealed that the Bacillus subtilis xylan-specific GH43 arabinofuranosidase (10), BsAXHm2,3 (Protein Data Bank entry 3C7G), was the closest structural homologue, with a root mean square deviation of 1.4 Å over 271 aligned C␣ atoms and a sequence identity of 37%. The JOURNAL OF BIOLOGICAL CHEMISTRY

15489

Downloaded from http://www.jbc.org/ by guest on November 3, 2015

FIGURE 2. Heteronuclear single quantum coherence one-bond 13C-1H correlation spectroscopy. Reactions were performed in 5 mM sodium phosphate buffer, pH 7.0. CjAbf43A (10 ␮M) was incubated with acid-hydrolyzed sugar beet arabinan for 1 h. Chemical shifts are shown in supplemental Table S6. The blue circle in the left panel is used to show the effect of the substitution at O2 in the chemical shift of the H2C2 signals of single and double substituted arabinosyl residues. The red circle indicates the H1C1 signals of single and double substituted arabinosyl residues and terminal arabinosyl residues linked at O2. These signals are absent in the spectrum of the sample after treatment with CjAbf43B (right panel). Capital letters in the figure represent individual arabinosyl residues in the arabinose containing oligosaccharides. Backbone arabinosyl residues substituted at O2 and O3 (A), at O2 (B), at O3 (C). Terminal arabinosyl residue linked to O3 (D) of an arabinosyl residue that is also substituted with an arabinosyl residue at O2 (I). Terminal arabinosyl residues linked to O2 (F) or to O3 (G) of single substituted arabinosyl residues.

Arabinan-specific Arabinofuranosidase

other most meaningful structural homologue was the ␣-L-arabinanase (38) from C. japonicus (Protein Data Bank entry 1GYD), which shares 24% sequence identity with CjAbf43A and can be superimposed on 237 matched C␣ atoms, yielding a root mean square deviation of 1.9 Å. Comparisons of the active sites of CjAbf43A with BsAXH-m2,3 and other GH43 enzymes are discussed in brief below. The Interaction of CjAbf43A with Substrate CjAbf43A was also crystallized in the presence of arabinotetraose (Megazyme). To trap the complex between the protein and the ligand, we utilized the D41A mutant, which is enzymatically inactive (Table 3; see below). We expected to observe a linear oligosaccharide defining the active site, but the solved structure revealed an ␣-1,5-arabinotriose molecule with an ␣-1,3-linked arabinofuranose side chain on the middle arabinose (32-Ara-Ara3) in one of the two CjAbf43A molecules in the asymmetric unit, chain B. The second CjAbf43A molecule (chain A) has the same oligosaccharide bound, but the electron density is too poor to define with certainty the position of the

15490 JOURNAL OF BIOLOGICAL CHEMISTRY

arabinofuranose branch, so instead an ␣-1,5-arabinotriose has been modeled in this molecule. The surface representation of CjAbf43A (Fig. 4) reveals a deep pocket in the center of a highly curved cleft. Structural conservation and mutagenesis studies (see below) support the view that the catalytic apparatus is housed in the pocket, which therefore comprises the active site of the enzyme. The rim of the pocket abuts onto a shelflike structure that accommodates the O3-linked arabinose branch. The ␣-1,5-trisaccharide is located in the central region of the curved surface, confirming its identity as the substrate binding cleft that accommodates the arabinan backbone. The pseudosymmetry of arabinofuranosides makes determining their orientation difficult because oligosaccharide chains built in either direction fit equally into the electron density. The orientation of the built oligosaccharide was guided by the knowledge that Asn185, a residue that plays a key role in arabinan recognition (see below), only makes a key hydrogen bond with the substrate in the modeled orientation. Furthermore, VOLUME 286 • NUMBER 17 • APRIL 29, 2011

Downloaded from http://www.jbc.org/ by guest on November 3, 2015

FIGURE 3. Phylogenetic tree of C. japonicus and B. thetaiotaomicron GH43 enzymes. The phylogenetic tree was derived from the protein sequences of the GH43 members of C. japonicus and B. thetaiotaomicron together with a number of enzymes with known activities as markers, which are boxed. The two enzymes shown to display arabinan-specific ␣-1,2-arabinofuranosidase activity are boxed in red with a salmon background. Bt3516 contains two GH43 modules, which are defined as Bt3516A (N-terminal GH43 module) and Bt3516B (C-terminal GH43 module). The predicted GH43 superfamily domain(s) of the genes were aligned using MUSCLE (46). The tree also contained a selection of enzymes that displayed known activities (boxed). Proteins that lacked one or more of the catalytic residues contain a ⫹. The alignment used to construct the phylogenetic tree, shown in supplemental Fig. S1, deployed the maximum likelihood method in the program PhyML (47). The reliability of the tree was analyzed by bootstrap analysis of 100 resamplings of the data set. The tree was displayed using MEGA4 (48).

Arabinan-specific Arabinofuranosidase and ⫹2R. Based on the topology of the enzyme, the substrate binding cleft is unlikely to extend distal to the ⫹2R subsite, although the enzyme may contain at least one additional nonreducing subsite (⫹3NR). Subsite Interactions

O2 of the central backbone arabinose, which participates in the glycosidic bond hydrolyzed by the enzyme, points directly into the active site pocket, confirming that the orientation of the arabinan backbone has been built correctly. The O3-linked arabinose in the 32-Ara-Ara3 ligand is located on the shelf adjacent to the active site pocket, demonstrating how the enzyme has plasticity for ␣-1,2-L-arabinofuranose side chains in single and double substitutions. The subsite topology of CjAbf43A is defined as follows: the scissile bond is between the arabinose decoration at ⫺1 (the active site) and the backbone arabinose at ⫹1. Subsites extended toward the reducing end of the arabinan backbone (from the ⫹1 subsite) are defined as ⫹2R, ⫹3R etc., whereas subsites extending to the non-reducing end of the polymer are designated ⫹2NR, ⫹3NR, and so forth. The subsite accommodating the arabinose-linked O3 to the ⫹1 subsite (and is therefore not part of the backbone) is defined as the ⫹2NR* subsite. Thus, the bound ligand occupies subsites ⫹2NR, ⫹2NR*, ⫹1, APRIL 29, 2011 • VOLUME 286 • NUMBER 17

JOURNAL OF BIOLOGICAL CHEMISTRY

15491

Downloaded from http://www.jbc.org/ by guest on November 3, 2015

FIGURE 4. Crystal structure of CjAbf43A. A, protein cartoon of CjAbf43A color-ramped from the N terminus (blue) to C terminus (red). B, solvent-accessible surface of CjAbf43A in complex with 32-Ara-Ara3. The bound ␣-1,5-Larabinotriose backbone is shown in yellow (carbon), whereas the O3-linked arabinofuranose decoration is in cyan (carbon). The residues that contribute to the curved shape of the cleft and make hydrophobic and polar contacts with the arabinan backbone are shown in stick format. This figure and other structure figures were drawn with PyMOL (DeLano Scientific, LLC).

Positive Subsites—The arabinose at the ⫹2R subsite is sandwiched between the aromatic residues, Phe66 and Trp164. The high activity displayed by F66A suggests that the phenylalanine contributes little to substrate binding. By contrast, W164A is 10 –30-fold less active than the wild type enzyme, suggesting that the tryptophan does make a contribution to arabinan recognition (Table 3). At the ⫹1 subsite, the N␦2 of Asn185 interacts with the endocylic ring oxygen and the glycosidic oxygen that links sugars at the ⫹1 and ⫹2NR subsites. O⑀2 of Glu215, the catalytic acid (see below), interacts with O2 of the ⫹1 arabinose, which comprises the scissile glycosidic bond. Consistent with its catalytic role, mutation of Glu215 completely inactivates the enzyme (Table 3). The N185A mutation actually causes a modest increase in activity against 4NPA, reflecting a slight lowering of Km. This implies that Asn185 is not able to interact with the aryl-glycoside because it does not offer any hydrogen bond donors or acceptors. Replacing the polar side chain of asparagine with alanine increases the volume and the hydrophobicity of the ⫹1 subsite, which is thus more able to accommodate the bulky phenyl ring of 4NPA. The amino acid substitution, however, results in a ⬃3000-fold decrease in activity against arabinan (Table 3). The substantial effect of the N185A mutation on arabinan hydrolysis is consistent with the hydrogen bonds made by the Asn with the ⫹1 arabinose and the glycosidic bond between substrate bound at ⫹1 and ⫹2NR. Indeed, it is likely that the interactions made by Asn185 play a role in defining the specificity of CjAbf165A for the O2 rather than the O3 linkage. The only polar interaction between the substrate and CjAbf43A that is unique, when O2 is orientated into the ⫺1 pocket, is the hydrogen bond between Asn185 and the glycosidic oxygen linking the sugars at the ⫹1 and ⫹2NR subsites (Fig. 5). As stated above, if the oligosaccharide was built such that O3 is orientated into the active site, Asn185 is unable to interact with the glycosidic oxygen at the ⫹1/⫹2NR interface. Attempts at using NMR to show that N185A was able to hydrolyze both O2 and O3 linkages were compromised by the high abundance of O3 over O2 linkages and the very low activity of the N185A mutant, preventing unequivocal quantitative comparisons. Furthermore, it is possible that hydrophobic residues, such as Phe234, which make weak apolar contacts with C3-C5 of the ⫹1 arabinose, also contribute to arabinan orientation, and thus randomization of O2 and O3 hydrolysis may require multiple amino acid substitutions, which would probably render the enzyme completely inactive. A BLAST search of CjAbf43A against the UNIPROT data base showed that of the 50 top hits, which displayed identities ranging from 44 to 82% (data not shown), 46 contained an Asn in the motif WGN. Significantly, Bt0369, which is 66% identical to CjAbf43A and also displays arabinan-specific ␣-1,2-arabinofuranosidase activity, contains the WGN motif. These data indicate that Asn185 in CjAbf43A and Asn186 in Bt0369 confer specificity for the O2 linkage.

Arabinan-specific Arabinofuranosidase TABLE 3 Kinetic constants of wild type and variants of CjAbf43A Enzyme concentrations varied from 5 nM to 10 ␮M. Reactions were carried out at 25 °C in 50 mM sodium phosphate buffer, pH 7.0. Kinetic parameters were determined by non-linear regression analysis. 4NPA CjAbf43A

a

kcat

kcat/Km

Km

kcat

kcat/Km

mM

min⫺1

min⫺1 M⫺1

mM

min⫺1

min⫺1 M⫺1

3.1 ⫾ 0.51 —a — — — — 2.5 ⫾ 0.2 — 4.7 ⫾ 1.6 3.4 ⫾ 1.2 — — 0.9 ⫾ 0.05 4.9 ⫾ 0.2 11.2 ⫾ 0.4 3.5 ⫾ 0.1 4.3 ⫾ 0.5 9.4 ⫾ 0.8 — 2.9 ⫾ 0.5 — 3.9 ⫾ 1.0

1896 ⫾ 320 — — — — — 524 ⫾ 43 134 ⫾ 46 906 ⫾ 245 1676 ⫾ 147 727 ⫾ 98 586 ⫾ 1 717 ⫾ 57 2315 ⫾ 403 354 ⫾ 17 — 797 ⫾ 210 — 606 ⫾ 252

6.2 ⫻ 105 — — — 2.2 ⫻ 105 ⫾ 2.99 ⫻ 104 1.6 ⫻ 103 ⫾ 2.58 ⫻ 102 2.1 ⫻ 105 2.5 ⫻ 102 ⫾ 5.3 ⫻ 101 2.9 ⫻ 104 2.6 ⫻ 105 6.4 ⫻ 104 ⫾ 1.1 ⫻ 104 1.9 ⫻ 103 1.8 ⫻ 106 1.5 ⫻ 105 5.3 ⫻ 104 2.1 ⫻ 105 5.4 ⫻ 105 3.8 ⫻ 104 8.8 ⫻ 105 ⫾ 1.9 ⫻ 104 2.8 ⫻ 105 1.7 ⫻ 104 ⫾ 2.2 ⫻ 102 1.6 ⫻ 105

0.3 ⫾ 0.02 — — — 1.4 ⫾ 0.1 1.6 ⫾ 0.5 0.2 ⫾ 0.02 — 0.7 ⫾ 0.08 0.5 ⫾ 0.09 — — — 0.4 ⫾ 0.03 — 0.8 ⫾ 0.08 4.1 ⫾ 0.4 — 0.8 ⫾ 0.2 0.5 ⫾ 0.04 1.4 ⫾ 0.2 —

308 ⫾ 8 — — — 427 ⫾ 11 14 ⫾ 2 266 ⫾ 27 — 47.27 ⫾ 2 293 ⫾ 18 — — — 260 ⫾ 7 — 154 ⫾ 8 27 ⫾ 2 — 30 ⫾ 0.3 329 ⫾ 1 415 ⫾ 69 —

1.1 ⫻ 106 — — — 3.2 ⫻ 105 9.0 ⫻ 103 1.5 ⫻ 106 — 6.7 ⫻ 104 5.8 ⫻ 105 3.6 ⫻ 104 ⫾ 4.0 ⫻ 103 2.5 ⫻ 103 ⫾ 3.9 ⫻ 102 3.0 ⫻ 102 ⫾ 3.3 ⫻ 101 3.4 ⫻ 105 33 ⫾ 1.5 2.0 ⫻ 105 6.6 ⫻ 103 2.5 ⫻ 103 ⫾ 3.9 ⫻ 103 3.7 ⫻ 104 6.5 ⫻ 105 3.1 ⫻ 105 8.8 ⫻ 102 ⫾ 1.8 x 102

—, parameter could not be determined.

At the ⫹2NR subsite, N⑀ of Trp164 and O␦1 of Asn185 make hydrogen bonds with O2 of the bound arabinose, whereas Phe234 makes extensive hydrophobic contacts with the substrate. Of particular note are the hydrophobic contacts made with C5 of the ⫹2NR arabinose, which, as discussed above, may assist in orientating the arabinan backbone. The importance of these hydrophobic interactions is illustrated by the reduction in activity when Phe234 is substituted with alanine (Table 3). At the ⫹2NR* subsite, which houses the ␣-1,3-linked arabinofuranose side chain in double substitutions, the sugar makes apolar contacts with Phe234, whereas the O2 and O3 atoms of the sugar appear to be within hydrogen bonding distance of Tyr293 and Gln292, respectively. The potential polar contacts with Tyr293 and Gln292, however, are unlikely to be functionally significant because the activities of the mutants Y293A and Q292A are similar to that of the wild type enzyme (Table 3). To explore the significance of the ⫹2NR* subsite, the double substitutions in arabinan were converted to O2-monosubstituted decorations by treatment with HiAXHd3 (the enzyme removes O3-linked arabinose residues from double substitutions (35)). The activity of the arabinan-specific ␣-1,2-arabinofuranosidase was similar against arabinan and the HiAXHd3-treated polysaccharide. Thus, it is unlikely that the putative interactions with the arabinose at the ⫹2NR subsite are functionally significant, and therefore Phe234 probably only contributes to substrate binding at the ⫹2NR subsite. As described above, CjAbf43A is likely to contain a ⫹3NR subsite that may simply accommodate or, potentially, make productive interactions with the substrate. Although Thr186 and Thr214 could comprise a functional ⫹3NR subsite, substituting these residues with alanine had no significant effect on catalysis, suggesting that the subsite does not bind to the substrate (Table 3). Replacing the two threonines with bulky residues (T186W, T214W, and T186W/T214W), however, caused

15492 JOURNAL OF BIOLOGICAL CHEMISTRY

a substantial reduction in activity against arabinan but had a very limited effect on 4NPA hydrolysis (Table 3). The introduction of the tryptophan residues is predicted to occlude the ⫹3NR subsite and probably generates steric clashes with the arabinose at the ⫹2NR subsite, explaining why the mutations reduced activity against arabinan. Indeed, it is possible that changing the binding cleft into a blind canyon type topology may have restricted the specificity of the enzyme to only O2 arabinose decorations at the non-reducing terminus of the arabinan chains, which could explain the substantial increase in Km. The ⫺1 Subsite—The pocket housing the ⫺1 subsite has no arabinose present, but a molecule of ethylene glycol was observed. The ethylene glycol “stacks” against Trp103 and binds to Arg295 via N␩1 and N␩2 (Fig. 6). Furthermore, comparing the structure of CjAbf43A with other exo-acting GH43 enzymes reveals several conserved features that probably contribute to substrate binding and transition state stabilization (Fig. 6). Surprisingly, the residues that interact with the arabinose in the Streptomyces exo-␣-1,5-arabinofuranosidase Araf43A (8) are conserved not only in other arabinofuranosidases, such as CjAbf43A, but also in GH43 ␤-xylosidases, exemplified by XynB3. It is difficult, therefore, to understand how these enzymes distinguish between an Araf and Xylp at the critical ⫺1 subsite because arabinofuranosidases that hydrolyze 4NP-Araf are not active on 4NP-Xylp. Thus, Asp41, Glu215, and Asp168, which function as the catalytic base, catalytic acid, and modulator of the pKa of the catalytic acid, respectively, are conserved in GH43 enzymes (8, 10, 11, 31, 32). Consistent with their catalytic function, alanine substitution of these three residues inactivates the arabinofuranosidase (Table 3). Trp103 is also invariant in the other GH43 structures. The structural equivalent to this residue, Trp101 in Araf43A and Trp76 in BsAXH-m2,3, provides a hydrophobic platform by stacking VOLUME 286 • NUMBER 17 • APRIL 29, 2011

Downloaded from http://www.jbc.org/ by guest on November 3, 2015

Wild type D41A D168A E215A F66A F67A D101A W103A R121A F129A W164A I167A N185A T186A T186W T214A T214W F234A H267A Q292A Y293A T186W/T21W

Sugar beet arabinan

Km

Arabinan-specific Arabinofuranosidase

against the respective sugar rings, and thus Trp103 is likely to play an equivalent role in CjAbf43A. The importance of the tryptophan is illustrated by the substantial reduction in catalytic activity (against arabinan, the enzyme is completely inactive) when Trp103 is substituted for Ala (Table 3). Ile167 and Phe67 are also highly conserved in other GH43 enzymes where the equivalent residues make hydrophobic contacts with substrate (Fig. 6). Ile167 sits at the top of the active site pocket and is 3.6 Å from the catalytic acid, Glu215 and thus may also create an apolar environment, which contributes to the elevated pKa of the acidic residue. The observation that the amino acid substitutions, F67A and I167A, cause a ⬎100-fold reduction in activity suggests that Ile167 and Phe67 also make significant contributions to substrate binding in the active site. In addition to the catalytic residues, Arg295 and His267 are also likely to make polar contacts with arabinose at the active site. These residues are highly conserved in other GH43 enzymes, and the equivalent amino acids make direct polar contacts with arabinose and xylose residues at the ⫺1 subsite. APRIL 29, 2011 • VOLUME 286 • NUMBER 17

The R295A mutant did not express in E. coli and thus could not be studied. Although mutation of His267 did not affect catalytic efficiency against 4NPA, the Km was substantially increased (too high to measure), suggesting that there was a commensurate increase in kcat. It is possible that the reduction in substrate binding, due to the loss of the imidazole side chain, leads to an increase in the rate of product departure, which results in the elevation in kcat. Against arabinan, the H267A mutation reduced activity by ⬃30-fold, which suggests that the decrease in substrate affinity at the ⫺1 subsite does not influence product departure because the main chain of the polysaccharide, primarily the arabinose at the ⫹1 subsite, makes additional interactions with the enzyme.

DISCUSSION The Mechanism of Arabinan Degradation—This study reports on the capacity of the large number of GH43 enzymes, produced by B. thetaiotaomicron and C. japonicus, to hydrolyze arabinose-containing polysaccharides. The data presented JOURNAL OF BIOLOGICAL CHEMISTRY

15493

Downloaded from http://www.jbc.org/ by guest on November 3, 2015

FIGURE 5. The active site of CjAbf43A. A, three-dimensional position of amino acids of CjAbf43A that contribute directly to substrate recognition through polar interactions. The carbon atoms of the amino acids are in green. In the bound substrate, 32-Ara-Ara3, the ␣-1,5-L-arabinotriose backbone is shown in yellow (carbon), whereas the O3-linked arabinofuranose decoration is in cyan (carbon). The maximum likelihood/␴a-weighted 2Fo ⫺ Fc electron density map for 32-Ara-Ara3 is shown in gray mesh and contoured at 1␴ (0.30 e⫺/Å3). The structure is shown in divergent (wall-eyed) stereo. B, schematic of the residues that make polar contacts with the substrate.

Arabinan-specific Arabinofuranosidase

here, in concert with previous biochemical and transcriptomic data, has led to the following models for arabinan degradation in the two bacteria. In B. thetaiotaomicron, genes encoding enzymes of related function are clustered into polysaccharide utilization loci (PULs) (40). PUL7, which is up-regulated by sugar beet arabinan and orchestrates the metabolism of this polysaccharide, contains three GH43 and two GH51 enzymes.5 The GH43 endo-arabinanases Bt0360 and Bt0367 display a preference for decorated and linear arabinan, respectively. These enzymes act in consort with the arabinan-specific ␣-1,2arabinofuranosidase Bt0369, which is also encoded by PUL7. Bt0369 represents a “pretreatment” stage of arabinan processing, converting all of the double linked arabinose into single O3 decorations as well as removing all single O2 arabinose side chains. Thus, Bt0360, Bt0367, and Bt0369 degrade arabinan to arabinooligosaccharides that are either linear or contain single O3-linked arabinose decorations. These oligosaccharides, which are transported into the periplasm by two SusC/SusDlike complexes (41), are then further metabolized by two GH51 arabinofuranosidases. At least one of the two periplasmic GH51 enzymes is likely to remove the single O3-arabinose decorations to generate linear oligomers, a feature typical of enzymes in this family (42). It is possible that the other GH51 arabinofuranosidase converts the linear oligosaccharides into arabinose, although no GH51 enzyme has been reported to display ␣-1,5-exo-arabinanase activity. In C. japonicus, CjAbf43A converts the double substitutions into single O3-linked decorations, which are then removed by the extracellular, membrane-associated, GH51 arabinofuranosidase (43). The linear arabinan generated is then hydrolyzed, exclusively, to arabinotriose by an endo-processive arabinanase 5

E. C. Martens, personal communication.

15494 JOURNAL OF BIOLOGICAL CHEMISTRY

CONCLUSION The expansion of GH43 enzymes in microorganisms from varied habitats points to a complex array of specificities within this family. Through the analysis of GH43s from C. japonicus and B. thetaiotaomicron, some definable activities have been VOLUME 286 • NUMBER 17 • APRIL 29, 2011

Downloaded from http://www.jbc.org/ by guest on November 3, 2015

FIGURE 6. Overlay of the active site of CjAbf43A with BsAXH-m2,3 and Araf43A. Shown is an overlap of the three enzymes at the ⫺1 active site with CjAbf43A in green, BsAXH-m2,3 (Protein Data Bank entry 3C7G) in cyan, and Araf43A (Protein Data Bank entry 3AKH) in yellow. The silver-colored arabinose is derived from the Araf43A structure, and ethylene glycol (magenta) is from the CjAbf43A structure.

(9). The trisaccharide is metabolized, probably in the periplasm, by the GH43 ␣-1,5-exo-arabinanase CJA_0806 that displays a moderate preference for arabinotriose. The arabinobiose generated is transported into the cytoplasm, where it is hydrolyzed by the second GH43 ␣-1,5-exo-arabinanase, CJA_3012, which appears to display a preference for the disaccharide. A key feature of arabinan metabolism is the hydrolysis of one of the linkages in double substitutions, which makes the substrate accessible to other exo- and endo-acting enzymes. It is interesting that whereas C. japonicus and B. thetaiotaomicron have adopted different strategies to hydrolyze the arabinan backbone, the two bacteria utilize the same enzyme activity to cleave double substitutions, a critical component of the degradative hierarchy. There is a paucity of information on enzymes that hydrolyze double substitutions. Indeed, the only other enzyme known to hydrolyze such structures is the GH43 arabinofuranosidase AXHd3 from the colonic bacterium Bifidobacterium adolescentis (44). AXHd3, however, displays broader specificity than Bt3069 or CjAbf43A because it hydrolyzes double substitutions in both arabinoxylan and arabinan. Such an activity is not required by B. thetaiotaomicron as the bacterium does not metabolize xylan (29). By contrast, C. japonicus has an extensive xylan-degrading apparatus (36, 45), and thus probably contains an additional arabinofuranosidase that targets xylose residues decorated at O2 and O3. Specificity of CjAbf43A for Sugar Beet Arabinan—The specificity of CjAbf43A for sugar beet arabinan is due to the surface topography of the enzyme. Phe66, Trp164, and Phe234 contribute to the formation of a curved surface cleft around the ⫺1 pocket. This topology is complementary to the extended helical structure of the ␣-1,5-L-arabinofuranose backbone of sugar beet arabinan (Fig. 4), explaining why the enzyme targets this polysaccharide. The enzyme is unable to hydrolyze arabinoxylan, which also contains ␣-1,2-L-arabinofuranose side chains. The helical pitch of arabinoxylans (3-fold screw axis) is shorter than that of arabinans (6-fold), and this constrained structure is arabinoxylan is likely to make steric clashes with the curved surface of the substrate binding cleft. Conversely, the GH43 xylan-specific arabinofuranosidase, BsAXH-m2,3 (10), is unable to attack arabinan side chains; despite structural conservation with CjAbf43A at the ⫺1 subsite, the arabinan backbone could not be accommodated by the linear substrate binding cleft that houses the xylan backbone. Thus, it is the architecture of the surface substrate binding cleft, curved in CjAbf43A and linear in BsAXH-m2,3, that dictates the specificity of these enzymes. Apart from Phe234, residues that line the substrate binding cleft distal to the ⫹1 subsite (which plays a key role in providing the binding energy required for substrate distortion in the active site) bind weakly to the arabinan backbone. This is consistent with the requirement for the polysaccharide backbone to dissociate prior to the release of arabinose from the active site pocket.

Arabinan-specific Arabinofuranosidase

Acknowledgments—We thank the staff at the beamlines used for data collection at Diamond Light Source and the Advanced Photon Source. REFERENCES 1. Burton, R. A., Gidley, M. J., and Fincher, G. B. (2010) Nat. Chem. Biol. 6, 724 –732 2. Hooper, L. V., Midtvedt, T., and Gordon, J. I. (2002) Annu. Rev. Nutr. 22, 283–307 3. Himmel, M. E., and Bayer, E. A. (2009) Curr. Opin. Biotechnol. 20, 316 –317 4. Himmel, M. E., Ding, S. Y., Johnson, D. K., Adney, W. S., Nimlos, M. R., Brady, J. W., and Foust, T. D. (2007) Science 315, 804 – 807 5. Caffall, K. H., and Mohnen, D. (2009) Carbohydr. Res. 344, 1879 –1900 6. Gruppen, H., Kormelink, F. J., and Voragen, A. G. (1993) J. Cereal Sci. 18, 111–128 7. Cantarel, B. L., Coutinho, P. M., Rancurel, C., Bernard, T., Lombard, V., and Henrissat, B. (2009) Nucleic Acids Res. 37, D233–238 8. Fujimoto, Z., Ichinose, H., Maehara, T., Honda, M., Kitaoka, M., and Kaneko, S. (2010) J. Biol. Chem. 285, 34134 –34143 9. McKie, V. A., Black, G. W., Millward-Sadler, S. J., Hazlewood, G. P., Laurie, J. I., and Gilbert, H. J. (1997) Biochem. J. 323, 547–555 10. Vandermarliere, E., Bourgois, T. M., Winn, M. D., van Campenhout, S., Volckaert, G., Delcour, J. A., Strelkov, S. V., Rabijns, A., and Courtin, C. M. (2009) Biochem. J. 418, 39 – 47 11. Bru¨x, C., Ben-David, A., Shallom-Shezifi, D., Leon, M., Niefind, K., Shoham, G., Shoham, Y., and Schomburg, D. (2006) J. Mol. Biol. 359, 97–109 12. Ichinose, H., Yoshida, M., Kotake, T., Kuno, A., Igarashi, K., Tsumuraya, Y., Samejima, M., Hirabayashi, J., Kobayashi, H., and Kaneko, S. (2005) J. Biol. Chem. 280, 25820 –25829 13. Gosalbes, M. J., Pe´rez-Gonza´lez, J. A., Gonza´lez, R., and Navarro, A. (1991)

APRIL 29, 2011 • VOLUME 286 • NUMBER 17

J. Bacteriol. 173, 7705–7710 14. DeBoy, R. T., Mongodin, E. F., Fouts, D. E., Tailford, L. E., Khouri, H., Emerson, J. B., Mohamoud, Y., Watkins, K., Henrissat, B., Gilbert, H. J., and Nelson, K. E. (2008) J. Bacteriol. 190, 5455–5463 15. Xu, J., Bjursell, M. K., Himrod, J., Deng, S., Carmichael, L. K., Chiang, H. C., Hooper, L. V., and Gordon, J. I. (2003) Science 299, 2074 –2076 16. Charnock, S. J., Bolam, D. N., Turkenburg, J. P., Gilbert, H. J., Ferreira, L. M., Davies, G. J., and Fontes, C. M. (2000) Biochemistry 39, 5013–5021 17. Melrose, J., and Sturgeon, R. J. (1983) Carbohydr. Res. 118, 247–253 18. Miller, G. L. (1959) Anal. Chem. 31, 426 – 428 19. Leslie, A. G. W. (1992) Joint CCP4 and ESF-EACBM Newsletter on Protein Crystallography Number 26, Daresbury Laboratory, Warrington, UK 20. Evans, P. (2006) Acta Crystallogr. D 62, 72– 82 21. Sheldrick, G. M. (2008) Acta Crystallogr. A 64, 112–122 22. Collaborative Computational Project 4 (1994) Acta Crystallogr. D 50, 760 –763 23. Langer, G., Cohen, S. X., Lamzin, V. S., and Perrakis, A. (2008) Nature Protocols 3, 1171–1179 24. Cowtan, K. D., and Main, P. (1993) Acta Crystallogr. D 49, 148 –157 25. McCoy, A. J., Grosse-Kunstleve, R. W., Adams, P. D., Winn, M. D., Storoni, L. C., and Read, R. J. (2007) J. Appl. Crystallogr. 40, 658 – 674 26. Murshudov, G. N., Vagin, A. A., and Dodson, E. J. (1997) Acta Crystallogr. D 53, 240 –255 27. Zwart, P. H., Afonine, P. V., Grosse-Kunstleve, R. W., Hung, L. W., Ioerger, T. R., McCoy, A. J., McKee, E., Moriarty, N. W., Read, R. J., Sacchettini, J. C., Sauter, N. K., Storoni, L. C., Terwilliger, T. C., and Adams, P. D. (2008) Methods Mol. Biol. 426, 419 – 435 28. Emsley, P., and Cowtan, K. (2004) Acta Crystallogr. D 60, 2126 –2132 29. Cooper, S. W., Pfeiffer, D. G., and Tally, F. P. (1985) J. Clin. Microbiol. 22, 125–126 30. Zhao, S., Wang, J., Bu, D., Liu, K., Zhu, Y., Dong, Z., and Yu, Z. (2010) Appl. Environ. Microbiol. 76, 6701– 6705 31. Brunzelle, J. S., Jordan, D. B., McCaslin, D. R., Olczak, A., and Wawrzak, Z. (2008) Arch. Biochem. Biophys. 474, 157–166 32. Proctor, M. R., Taylor, E. J., Nurizzo, D., Turkenburg, J. P., Lloyd, R. M., Vardakou, M., Davies, G. J., and Gilbert, H. J. (2005) Proc. Natl. Acad. Sci. U.S.A. 102, 2697–2702 33. Davies, G. J., Wilson, K. S., and Henrissat, B. (1997) Biochem. J. 321, 557–559 34. Swamy, N. R., and Salimath, P. V. (1991) Phytochemistry 30, 263–265 35. Sørensen, H. R., Jørgensen, C. T., Hansen, C. H., Jørgensen, C. I., Pedersen, S., and Meyer, A. S. (2006) Appl. Microbiol. Biotechnol. 73, 850 – 861 36. Gilbert, H. J. (2010) Plant Physiol. 153, 444 – 455 37. Gilbert, H. J., Stålbrand, H., and Brumer, H. (2008) Curr. Opin. Plant Biol. 11, 338 –348 38. Nurizzo, D., Turkenburg, J. P., Charnock, S. J., Roberts, S. M., Dodson, E. J., McKie, V. A., Taylor, E. J., Gilbert, H. J., and Davies, G. J. (2002) Nat. Struct. Biol. 9, 665– 668 39. Neer, E. J., and Smith, T. F. (1996) Cell 84, 175–178 40. Bjursell, M. K., Martens, E. C., and Gordon, J. I. (2006) J. Biol. Chem. 281, 36269 –36279 41. Martens, E. C., Koropatkin, N. M., Smith, T. J., and Gordon, J. I. (2009) J. Biol. Chem. 284, 24673–24677 42. Beylot, M. H., McKie, V. A., Voragen, A. G., Doeswijk-Voragen, C. H., and Gilbert, H. J. (2001) Biochem. J. 358, 607– 614 43. Beylot, M. H., Emami, K., McKie, V. A., Gilbert, H. J., and Pell, G. (2001) Biochem. J. 358, 599 – 605 44. van den Broek, L. A., Lloyd, R. M., Beldman, G., Verdoes, J. C., McCleary, B. V., and Voragen, A. G. (2005) Appl. Microbiol. Biotechnol. 67, 641– 647 45. Pell, G., Szabo, L., Charnock, S. J., Xie, H., Gloster, T. M., Davies, G. J., and Gilbert, H. J. (2004) J. Biol. Chem. 279, 11777–11788 46. Edgar, R. C. (2004) BMC Bioinformatics 5, 113 47. Guindon, S., and Gascuel, O. (2003) Syst. Biol. 52, 696 –704 48. Tamura, K., Dudley, J., Nei, M., and Kumar, S. (2007) Mol. Biol. Evol. 24, 1596 –1599

JOURNAL OF BIOLOGICAL CHEMISTRY

15495

Downloaded from http://www.jbc.org/ by guest on November 3, 2015

described, but a function could not be assigned for many of the proteins. This may partly reflect the loss in activity in these proteins due to functional redundancy. It is also likely, however, that the function of active enzymes could not be assigned because the substrates for these biocatalysts were not used, or the activities of these GH43 are only apparent when they are acting in synergy with other degradative enzymes. Despite the problems in assigning functions to all of the GH43 proteins, the data presented here indicate that the main chain of arabinan, a target substrate for numerous GH43 enzymes, is degraded by an endo- and exo-mechanism in the gut symbiont and soil saprophyte, respectively. Although distinct, the mechanism of arabinan degradation in the two bacteria display an element of convergence; both C. japonicus and B. thetaiotaomicron remove O2-linked arabinose decorations, in the context of single or double substitutions, through the action of an arabinanspecific ␣-1,2-arabinofuranosidase, an activity that has not previously been reported. The biological rationale for such an activity probably reflects the capacity of the enzyme to convert double substitutions into single decorations, which will then be accessible to the GH51 arabinofuranosidases expressed by these bacteria. The crystal structure of CjAbf43A reveals several topological features, such as a curved substrate binding cleft, a shelflike structure adjacent to the active site, and the targeting of the only asymmetric oxygen in arabinan, which confer the specificity displayed by the enzyme. The arabinanspecific ␣-1,2-arabinofuranosidase activity identified here will add to the toolbox of biocatalysts required to degrade and to understand the molecular architecture of plant cell walls.

Table S1 The vectors restriction sites and primers used to clone genes encoding C. japonicas GH43 enzymes

Clone identification

Restriction used for cloning

Primers used

Expression Level

Purified by IMAC

CJA_0799

Vector used for cloning the gene pET32b

BamHI;XhoI

+++

+

CJA_0806

pRSET

BamHI;HindIII

F-G TCG AGG GAT CCG GAC AAT AAA ACC TAC CCA TCG ACT R-CCC CTT CTC GAG TGA TGA TTG ATA CCC CCA GGA TTT ACC F- CTCCAG GGATCC ACA AAG CCC ACC ATA CCGR CTCCAG AAGCTT GTT TGT GTG TCA CTA AGT CGT

+++

+

CJA_0816

pET32b

BamHI;XhoI

+++

+

CJA_0818

pRSET

BamHI;HindIII

F-G GCC GCG GAT CCG AAT GGT CTG ATA AAT TCC CAT GAT R-GGA CTT CTC GAG TTA CCA GCC ATT GCT GTA GGT CAT CTT F-CTCCAG GGATCC GGC AAG GAA AAA AGT TCG GCG

+++

+

+++

+

+++

+

++

+

++++

+

+++

+

+++

+

+++

+

R- CTCCAG AAGCTT GTC TGT CCA GGT AAA TTG TTG CGC CJA_0819

pET32b

BamHI;XhoI

CJA_3007

pRSET

BamHI;HindIII

F- C CCG CCG GAT CCG CAG ACA ACC TTT AAT AAT CCA CTC R- TAA ATT CTC GAG TCA GGC TGT CGA GGT CGC AAC CGG ACT F- CTCCAG GGATCC ACT CGC GCC CCG CTA TGG CGC

CJA_3012

pET22b

NdeI;XhoI

R- CTCCAG AAGCTT CGT TGA CGA GCG CGC CTG AAA TTG F-CTC CGG CATATG ATG GCT GAA TTA TTA AAA CCA CTG

CJA_3018

pET28b, pET32b

BamHI;XhoI

CJA_3070

pRSET

BamHI;HindIII

CJA_3594

pET28b

BamHI;XhoI

CJA_3601

pRSET

BamHI;HindIII

R- AAA TTT CTCGAG CTC AAT CAG CGA GGG TTT GCC F-G AGC TCG GAT CCG CAG GCT GAA AAC CCC ATC TTT ACG R- AGG CAA CTC GAG CTT GGC GGC TTT TTT AAC CCG CTC R- AGG CAA CTC GAG TTA CTT GGC GGC TTT TTT AAC CCG CTC F- CTCCAG GGATCC CAG GAT AAA TCT GCC GAG TCA ACA CCA R- CTCCAG AAGCTT GAA CTA TGG GTT GGA TAC TGC CAT CAT

F -G AGC TCG GAT CCG AAT AGC GGC AGC CAT CAC ACC GGG R- CGA ACC CTC GAG CGG CAG CGC CCT ATA CAC CAG ATT F- CTCCAG GGATCC ATA CAC CAA CCT GTG CAA CAG R- CTCCAG AAGCTT CTT AAC CAA CGG CGT CAC CTG

1

Table S2

The vectors restriction sites and primers used to clone genes encoding B. thetaiotaomicron GH43 enzymes

Gene

Vector

BT0145

pRSETA

BT0145

pRGST

BT0264

pRSETA

BT0265

pRSETA

BT0360

pRSETA

BT0367

pET21

BT0369

pRSETA

BT1021

pRSETA

BT1021

pRGST

BT1781

pET21

BT1873

pET21

BT2112

pRSETA

BT2852

pRSETA

BT2895

pRSETA

BT2895

pRGST

BT2898 BT2898

pRSETA pRGST

BT2900

pRSETA

BT2912

pET21

BT2959

pRSETA

BT3094

pRSETA

BT3094

pRGST

BT3108

pRSETA

Forward primer

Reverse primer

Restriction sites BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 HindIII, Xho1

Soluble protein? -

Purified by IMAC NT

CTCCAG GGATCC ATGATAAAGACCAGAGATCCG CTCCAG GGATCC ATGATAAAGACCAGAGATCCG CTCCAG GGATCC CAACCAGCTTTCGCAACTCAATCG CTCCAG GGATCC CAAAACACCCAGATCAGTCCG CTCCAG GGATCC GCGTGC AGCGACGACGATGAAAACTCC CTCCAG GGATCC AATTCTTGG GATGATAATTATTTATCAG CTCCAG GGATCC CAGAATGACACTACTTTTGTAGC CTCCAG GGATCC ATGTTCACTTCATTTCATGAGCCG CTCCAG GGATCC ATGTTCACTTCATTTCATGAGCCG CTCCAG CC AAGCTT G GACGGCTCGGGCCGTGGTTGGC CTCCAG GCCATGG CC AATGATGATGGACAGACTTCC CTCCAG GGATCC GACACATCACACAAGTATACAGC CTCCAG GGATCC CAAACAAAGAATGTGACATGG CTCCAG GGATCC GAGAGTGCAACAGATGACGAG CTCCAG GGATCC GAGAGTGCAACAGATGACGAG CTCCAGGGATCCAGTACAGTGGATAAAGGTGA TAACAATGC CTCCAGGGATCCAGTACAGTGGATAAAGGTGA TAACAATGC CTCCAG GGATCC GAA GATCCCGCTAAGACTTATAAGAACC CTCCAG GCCATGG CC CAAGAACATGATGGTGATTATAC CTCCAG GGATCC CGGAAAACGGAAAAAGTAGTAAATAACG CTCCAG GGATCC CAAGCATATGGAACTGCTGATAC CTCCAG GGATCC CAAGCATATGGAACTGCTGATAC CTCCAG GGATCC ATGAAAACACTCATCCAATTTTTATTGGCG

CTCCAG GAATTC TCATTTAGTTACCCGGAACCAGTC

-

NT

++

+

++++

+

+

+

+++

+

++++

+

Insol

NT

++

+

-

NT

Nco1, HindIII

++

-

BamHI, XhoI

++

+

BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamHI, XhoI

++

+

+++

+

+

+

-

NT

Insol

-

++

+

CTCCAG GAAGCTT CC TTTTAATTCTATCTTCTTTGCAAAAAC CTCCAG C CTCGAG GG CAGTTTTACAATATCAGTAATGG

Nco1, HindIII

+

-

BamHI, XhoI

+++

+

CTCCAG GAATTC TCATTTTGCCTGATAATTGTGATTAGG

BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1

++++

+

+++

+

+++

+

CTCCAG GAATTC TCATTTAGTTACCCGGAACCAGTC CTCCAG GAATTC TTATTGTGGTTTTCCGTAAGCAGCAAATTC CTCCAG GAATTC TTATTGGGCTTTCATCTGTTTTAGATTATTAGC CTCCAG GAATTC TTGGAGTTTCTTCCCCCAATATGTC CTCCAG GAATTC TTTGATTCTTTTTCCCCAGAC CTCCAG GAATTC TTGTACGCCTTCCGTTGTCATCTGAATGCG CTCCAG GAATTC TTATTTCTTCTTGCTTTCTTCCAG C CTCCAG GAATTC TTATTTCTTCTTGCTTTCTTCCAG C CTCCAG C CTCGAG GG TTA TTTAGTTTGCTTCACCATTTTTATAGTGC CTCCAG GAAGCTT CC ATTCAGAGCTACAAATTTTCCATC CTCCAG C CTCGAG GG GAATTCGTATGTATCAGGTGACCAGC CTCCAG GAATTC TCAGTTAAAGTCGTAAGTGAACCAG CTCCAG GAATTC TTACTTCAGAACTCTCGGCTTAATTACAG CTCCAG GAATTC TTACTTCAGAACTCTCGGCTTAATTACAG CTCCAG GAATTC TTA CTTTATAACTTTCGGTTTAATAACC CTCCAG GAATTC TTA CTTTATAACTTTCGGTTTAATAACC CTCCAG C CTCGAG GG AAAATGAGGTTTGGGGG

CTCCAG GAATTC TCATTTTGCCTGATAATTGTGATTAGG CTCCAG GAATTC GAAAATGTTATCAAGCCAATGTGGAAA

2

BT3467

pRSETA

BT3467

pRGST

BT3515

pRSETA

BT3515

pRGST

BT3516

pRSETA

BT3516

pRGST

BT3655

pET21

BT3656

pET21

BT3662

pRSETA

BT3662

pET21a

BT3662

pRGST

BT3663

pRSETA

BT3663

pET21a

BT3663

pRGST

BT3675

pET21

BT3675

pRSETA

BT3675

pRGST

BT3683

pET21

BT3683

pRSETA

BT3683

pRGST

BT3685

pRSETA

BT3685

pRGST

BT4095 BT4152

pRSETA pRSETA

BT4185

pET21

CTCCAG GGATCC GGAAAAAAAGAAGATGTTGAAAAAG CTCCAG GGATCC GGAAAAAAAGAAGATGTTGAAAAAG CTCCAG GGATCC CAGCATAAGAAGGCTACAACAGAG CTCCAG GGATCC CAGCATAAGAAGGCTACAACAGAG CTCCAG GGATCC CAAAACAAAAGTAAGCTTCCG CTCCAG GGATCC CAAAACAAAAGTAAGCTTCCG CTCCAG GGATCC TACCTGTTTGTCTATTTCACCGG CTCCAG GGATCC ATTCGCAA GAATACCCGAAAGTC CTCCAG GGATCC CACAATAATCCTTTTGGCAATGC CTCCAG GAGCTC CACAATAATCCTTTTGGCAATGC CTCCAG GGATCC CACAATAATCCTTTTGGCAATGC CTCCAG GGATCC AGACAGATCAGTGAAGAAATGAACTGG CTCCAG GAGCTC ATGGTGATCA TACATTCCAA GG CTCCAG GGATCC AGACAGATCAGTGAAGAAATGAACTGG CTCCAG GCCATGG CC CAGAATAAGAAATCCGGCAATCC CTCACG GAATTC CAGAATAAGAAATCCGGCAATCC CTCACG GAATTC CAGAATAAGAAATCCGGCAATCC CTCACG GAATTC CAGGACGACTGGCAACTTGTTTGG CTCCAG GGATCC CAGGACGACTGGCAACTTGTTTGG CTCCAG GGATCC CAGGACGACTGGCAACTTGTTTGG CTCCAG GGATCC CAGAAGAATAGTTATATTATTCCCGGGAAG CTCCAG GGATCC CAGAAGAATAGTTATATTATTCCCGGGAAG CTCCAG GGATCC CAAGAACAAAACTACTTCACAAATCC CTCCAG GGATCC CTGCATCTTGCCTACAGTTATGATGGG CTCCAG GCCATGG CC CAGAAAAACTATGTATCCGAAGTATGG

CTCCAG GAATTC TTGATTTGGAACTGTGCCAGATGG

BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, HindIII SacI, XhoI

+

+

-

NT

++

+

-

NT

+

+

+

+

++

+

++

+

-

NT

-

NT

BamH1, HindIII BamH1, EcoR1 SacI, XhoI

-

NT

-

NT

-

NT

BamH1, EcoR1 Nco1, HindIII

-

NT

Insol

NT

CTCACG GAATTC TGGTGTAGGGATTACCTGTTTG

+/ insol

-

CTCACG GAATTC TGGTGTAGGGATTACCTGTTTG

++

+

CTCCAG GAATTC TTGATTTGGAACTGTGCCAGATGG CTCCAG GAATTC TTAGTTCTGTTTTTTATGAATAGGGGAACC CTCCAG GAATTC TTAGTTCTGTTTTTTATGAATAGGGGAACC CTCCAT GAATTC TTATCTGAAGTCCGGAGCAGC CTCCAT GAATTC TTATCTGAAGTCCGGAGCAGC CTCCAT GAATTC CATCTTGTCGGGCTTTCCGTAGGCC CTCCAT GAATTC TTTCCTGTTTTCCCAATGACTTCTCAG CTCCAG CC AAGCTT G TTA ATAAATATTCCATTCCCAGATACCG CTCCAG AC CTCGAG A TTAATAAATATTCCATTCCCAGATAC CTCCAG CC AAGCTT G TTA ATAAATATTCCATTCCCAGATACCG CTCCAG GAATTC TTA ATATAAATAAATAAACTCATCC CTCCAG AC CTCGAG A TTAATATAAATAAATAAACTCATCC CTCCAG GAATTC TTA ATATAAATAAATAAACTCATCC CTCCAG GAAGCTT CC TGGTGTAGGGATTACCTGTTTG

CTCACG ACTCGAGCA ATCAAAATTCCACCTATCCATCC

EcoRI, XhoI

-

NT

CTCCAG GGATCC ATCAAAATTCCACCTATCCATCC

BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, EcoR1 BamH1, HindIII Nco1, HindIII

-

NT

-

NT

+

+

-

NT

++

+

-

NT

Insol

NT

CTCCAG GGATCC ATCAAAATTCCACCTATCCATCC CTCCAG GAATTC CTATTTTTTCTGTTTGTCAGATTTCTTGTATTTGTC CTCCAG GAATTC CTATTTTTTCTGTTTGTCAGATTTCTTGTATTTGTC CTCCAG GAATTC TTATTTTAAATTCTTCATTTTAAAATAAGAAAACTC CTCCAG CC AAGCTT G TTA CATTGAAAACACCAGTACTCCTGCGG CTCCAG GAAGCTT CC TTTCCTTGTAATTCGGAACCAGTC

3

Table S3 Primers used in site-directed mutagenesis of CjAbf43A Mutant of CjAbf43B D21 E148 E195 F46A F47A D81A W83A R100A F109A W144A I147A N165A T166A T166W T194A T194W F214A H247A Y273A Q272A R275A

Primer F- GAT GTG TTT ACC GCC GCG CCG GCG GCC CTG GTA CAC AAG GGC R- GCC CTT GTG TAC CAG GGC CGC CGG CGC GGC GGT AAA CAC ATC F- ATC GAT TGG GAC GAT ATA GCC CCT TCG GTA TTT ATT GAT GAC R- GTC ATC AAT AAA TAC CGA AGG GGC TAT ATC GTC CCA ATC GAT F- CTG CCC GAA TTT ACC GCC GCG ATC TGG GTA CAT AAA TAC CAG G R- C CTG GTA TTT ATG TAC CCA GAT CGC GGC GGT AAA TTC GGG CAG F- CCG GAT AAC ACC ACG GCC TTT GTG ATG AAC GAA R- TTC GTT CAT CAC AAA GGC CGT GGT GTT ATC CGG F- GAT AAC ACC ACG TTT GCC GTG ATG AAC GAA TGG R- CCA TTC GTT CAT CAC GGC AAA CGT GGT GTT ATC F- ACC TGG GCC AAG GGC GCC GCC TGG GCC AGC CAG R- CTG GCT GGC CCA GGC GGC GCC CTT GGC CCA GGT F- GCC AAG GGC GAT GCC GCC GCC AGC CAG GTG ATT R- AAT CAC CTG GCT GGC GGC GGC ATC GCC CTT GGC F- TAC TGG TAT GTG ACC GTC GCC CAC GAC GAC ACC AAG CCG GGT R- ACC CGG CTT GGT GTC GTC GTG GGC GAC GGT CAC ATA CCA GTA F- GAC ACC AAG CCG GGT GCC GCC ATT GGT GTG GCC R- GGC CAC ACC AAT GGC GGC ACC CGG CTT GGT GTC F- GAT ACC CCT ATC GAT GCC GAC GAT ATA GAT CCT R- AGG ATC TAT ATC GTC GGC ATC GAT AGG GGT ATC F- ATC GAT TGG GAC GAT GCC GAT CCT TCG GTA TTT ATT R- AAT AAA TAC CGA AGG ATC GGC ATC GTC CCA ATC GAT F- TAC CTG TTC TGG GGA GCC ACC AGG CCA CGC TAC R- GTA GCG TGG CCT GGT GGC TCC CCA GAA CAG GTA F- TAC CTG TTC TGG GGA AAT GCA AGG CCA CGC TAC GCC AAG CTG R- CAG CTT GGC GTA GCG TGG CCT TGC ATT TCC CCA GAA CAG GTA F- CTG TTC TGG GGA AAT TGG AGG CCA CGC TAC GCC AAG R- CTT GGC GTA GCG TGG CCT CCA ATT TCC CCA GAA CAG F- GAA GGG CTG CCC GAA TTT GCA GAG GCG ATC TGG GTA CAT AAA R- TTT ATG TAC CCA GAT CGC CTC TGC AAA TTC GGG CAG CCC TTC F- GGG CTG CCC GAA TTT TGG GAG GCG ATC TGG GTA CAT R- ATG TAC CCA GAT CGC CTC CCA AAA TTC GGG CAG CCC F- TCC TAT GCC ATG GGC GCC CCC GAG AAG ATC GGC R- GCC GAT CTT CTC GGG GGC GCC CAT GGC ATA GGA F- GGC AAT ACG CCC ACT AAC GCC CAG GCC ATT ATC GAA TTC AAT R- ATT GAA TTC GAT AAT GGC CTG GGC GTT AGT GGG CGT ATT GCC F- CCC GAT GGC GGC CAA GCC CGT CGT TCG GTC AGT R- ACT GAC CGA ACG ACG GGC TTG GCC GCC ATC GGG F- CGC CCC GAT GGC GGC GCC TAC CGT CGT TCG GTC AGT R- ACT GAC CGA ACG ACG GTA GGC GCC GCC ATC GGG GCG F- GAT GGC GGC CAA TAC CGT GCC TCG GTC AGT ATC GAT GAG CTG R- CAG CTC ATC GAT ACT GAC CGA GGC ACG GTA TTG GCC GCC ATC

4

Table S4 Activity screens of C. japonicas GH43 enzymes Enzyme CJA_0799 CJA_0806 CJA_0816 CJA_0818 CJA_0819 CJA_3007 CJA_3012 CJA_3018 CJA_3070 CJA_3594 CJA_3601

Wheat arabinoxylan +/+/+/+/+/-

Potato galactan -

RGI -

Sugar beet arabinan + -

Linear arabinan +/+/-

4NPA

4NPX

+ + + +/-

+ -

The activity of the enzymes were screened by TLC, HPLC and Spectroscopically. The activity is graded as active (+), trace activity that is unlikely to be biologically significant (+/-), not active (-). Intial reactions were performed overnight at 37o C using 5 M of the appropriate enzyme.

5

Table S5 Activity screens of B. thetaiotaomicron GH43 enzymes

Enzyme BT0145 BT0264 BT0265 BT0360 BT0367 BT0369 BT1021 BT1873 BT2112 BT2852 BT2895 BT2912 BT2959 BT3094 BT3108 BT3515 BT3516 BT3655 BT3656 BT3675 BT3685 BT4095 BT4185

Wheat arabinoxylan +/+/+/+/+/+/+/+/+/+/+/+/+/-

Potato galactan -

RGI -

Sugar beet arabinan + + + -

Linear arabinan + + -

4NPA

4NPX

+ + + +/+/+ -

-

The activity of the enzymes were screened by TLC, HPLC and Spectroscopically. The activity is graded as active (+), trace activity that is unlikely to be biologically significant (+/-), not active (-). Intial reactions were performed overnight at 37o e appropriate enzyme.

6

Table S6 NMR assignments: 1H and 13C chemical shift (ppm) of the arabinan oligosaccharides corresponding to Figure 2.

H1

H2

H3

H4

H5

C1

C2

C3

C4

C5

T-α-L-1,3 arabinofuranose

5.157

4.137

3.959

4.034

3.835, 3.721

107.68

81.62

77.14

84.6

61.7

T-α-L-1,2 arabinofuranose

5.186

4.130

3.966

4.065

3.828 3.719

107.68

-

-

-

-

T-α-L-1,2 arabinofuranose

5.180

4.132

3.963

4.080

3.833 3.718

107.51

81.62

77.13

84.6

61.70

5-α-L-1,5 arabinofuranose

5.085

4.129

4.030

4.217

3.892 3.798

108.77

81.6

77.10

82.99

67.22

5-α-L-1,5 arabinofuranose

5.095

4.131

4.010

4.260

-

108.77

81.6

77.3

-

-

3,5-α-L-1,5 arabinofuranose

5.118

4.292

4.094

4.307

3.947 3.840

108.1

79.79

82.98

82.32

67.05

3,5-α-L-1,5 arabinofuranose

5.112

4.291

4.037

4.218

3.891 3.806

108.00

79.70

82.66

83.32

67.00

2,5-α-L-1,5 arabinofuranose

5.209

4.170

4.160

4.097

3.955 3.885

107.02

87.60

75.86

-

67.00

2,3,5-α-L-1,5 arabinofuranose

5.254

4.314

4.255

4.300

3.900 3.800

107.20

85.60

80.79

81.66

67.00

α-L-arabinopyranoside

4.509

3.501

3.650

3.934

3.890 3.671

97.39

72.49

73.16

69.17

67.05

β-L-arabinopyranoside

5.229

3.808

3.872

3.999

4.018 3.641

93.07

69.05

69.21

69.38

63.10

α-L-arabinofuranose

5.247

4.031

3.980

4.110

3.790 3.687

101.74

-

-

-

-

β-L-arabinofuranose

5.293

4.085

4.040

3.834

3.800 3.659

95.93

-

-

-

-

Free Arabinose

7

Figure S1 Catalytic base

Bt1021 Bt4152 Bt0264 Bt3467 Bt3515 Bt2912 CJA_0818 CJA_0819 SavAraf43A CJA 3012 CJA_0806 Bt3655 Bt3094 Bt1873 Bt2959 Bt0265 Bt3683 Bt2112 Bt3685 Bt3108 Bt2895 Bt2898 Bt3662 BsAXH-m2,3 PpXyn43A Bt3675 Bt3516A UXOrf66A CJA_1316 CJA_3070 Bt1781 UXOrf66B Bt0369 CJA_3018 HiAXH-d3 CJA_0799

10 20 30 40 50 60 70 80 90 100 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| --HRLMRDPSM-VRTPDGTYHLVWTTSWKGD----------------LGFGYA-NSKDLIHWSEQQMIPVM----------------------------A GKDKLMRDPSI-CQAPDGTFHMVWTSS------------------WTDRIIGYASSRDLIHWSEQQAIPVM----------------------------M ---ESAQKPQG-MKIGELYFSYKMENVEKEV-----------------------DGQTLKGWLATES--------------------------------F --FMEGANSSA--IYHNGKYYYTHETN--------------------EQIYLW-VTTDITDMAHSTCKEVWVP--------------------------K --LERGAEPWA--IFHEGKYYYTQGSE--------------------NRVILW-ETDDITDLSHATQKDVWIP--------------------------T --LSSGADPWV--IKHEGWYYYCSGVP--------------------GGIGVS-RSRDLHKIN--PPVRVWKVPE------------------------K --YPNGADPWL--EYFEGNYYLTTTT-------------------WTSQLVMR-KSPTLAGLANATPVYIWSE--------------------------S --FANGADPWV--QFYQGNYYALTTT-------------------WTSQIAMR-KSPTLAGLASATPTYIWSD--------------------------T --AEKRADPHI-FKHTDGYYYFTATVP------------------EYDRIVLR-RATTLQGLATAPETTIWTKH-------------------------A --ILQRADPCI-YKHSDGYYYFTASVP------------------LYDRIELR-RAKTIDGLVDAQTVDAWRKP-------------------------D --VLQRADPQI-YRTQEGRYYFIATAP------------------EFDRIELR-AADRLADLTQAPAQVIWRKH-------------------------A --TGGVRDPHILRCEDGKTFYMVVTDMVSGNGW------------SSNRAMVLLKSKDLVNWTSNIVNIQKKY--------------------------P --DYWMRDTWA-TLGPDGYYYITGTTSTPDRHFPGQRHCWD----WNDGLYLW-RSKDLKIWE-AMGRIWSMEKDGTWQKDPKVYKEGEKYAKKSINNDP --SLRLRDPFILVDKKTSMYYLHFNNN--------------------LKIRVY-KSKDLSTWK-DEGYSFIAK--------------------------S --IVNAHGACI--VEENGRYYLFGEYKSDKSN-------------AFPGFSCY-SSDDLVNWK-FERVVLPMQSSGILGPD------------------R --QINAHGGCV--VYEKGTYYWFGEDRTGFKS---------------NGVSCY-QSKDLYNWK-RLGLSMKTTGEAREDM-------------------N ----NAHGGGI--LFHEGKYYWFGEHRPASGFV------------TEKGINCY-SSTDLYNWK-SEGIALAVSEEEG----------------------H --KINAHGGGI--LYHEGIYYWYGECKSDSTYWNPNVPGWECYRTEAGGVNCY-SSKDLLNWK-YEGLVLQPEISDTQ---------------------S --PINAHGGGL--LYHDGTYYWYGEYKK-GKTILPDWATWECYRTDVTGVGCY-SSKDLLNWK-FEGIVLPAVKEDPN---------------------H --EITFADPTI--FVENGKYYLTGTRNR-----------------EPNGFAIF-ESTDLKEWKTANGDTLQLILRKG----------------------D --YLPIADPYV--MFYNNKYYAYGTGGTTAG----------------EGFACF-SSDDLKNWK-REGQALSA---------------------------T --YLPIADPFV--MLYNNKYYAYGTGGTVN-----------------EGFACF-SSDDLKNWK-RERQALSA---------------------------D --PDMIADASI--QEINGVFYCYATTDGYGQGL------------KTSGPPVVWKSKDFVHWS-FDGTYFPSAAK--------------------------HHLGADPVA--LTYNGRVYIYMSSDDYEYNSNGTIKDNSFA--NLNRVFVI-SSADMVNWT-DHGAIPVAG--------------------------A --HKLGADPYS--LVYDGRVYIFMSSDTYVYNKDGSIKENDFS--ALDRIQVI-SSTDMVNWT-DHGTIPVAG--------------------------A --PGFHADPEVLYSHQTKRYYIYPTSDGFPGW-------------GGSYFKVF-SSKNLKTWK-EETVILEMGK-------------------------N --PGWYADPEG--IVFGDEYWIYPTYSAAYD--------------DQIFMDAF-SSKDLVNWT-KHPKVLSKE--------------------------N IGTSFTPDPAP--YVHGDKVYLFVDHDEDDAVYF-----------KMKDWLLY-STEDMVNWT-YLGAPVSTA--------------------------T --HIYTADPSA--HVFNGKVYIYPSHDIDAGIPFNDNGDHF----GMEDYHVL-RMDSPEGKAEDCGVALHVK--------------------------D ----KPGDPVV-----------------------------------------------------DHGVALTLE--------------------------D --HMYTADPSA-HVWEDGRLYVYASHDIAPPRGCD----------LMDRYHVF-STDDMINWT-DHGEILSSD--------------------------Q --TKFTADPAP--MVYDGTVYLYTTHDEDGAQGF-----------QMFDWLLY-TSTDMVNWT-DHGAVASLD--------------------------S --YKYTADPGA--MVHDGKVYIYAGHDECPPPREHY---------QLNEWCVF-SSPDMKTWT-EHPVPLKAK--------------------------D FTDVFTADPAA--LVHKGRVYLYAGRDEAPDNTTFF---------VMNEWLVY-SSDDMANWE-AHGPGLRAK--------------------------D --WEDHPDLEV--FRVGSVFYYSSSTFA-----------------YSPGAPVL-KSYDLVHWT-PVTHSVPRLNFGSNYDLPSG---------------T WQDIGVHDPSV--VKAGSTYYIFGSHLAAA------------------------KSTDLINWE-YISVLSANDRVDESPLFNTYSTEIAEGI-------T

8

CJA_0816 CjArb43A BsArb43A GsAbnB Bt2900 Bt3516B Bt0360 Bt0367 Bt4095 BaAXHd3 Bt3656 CJA_3594 CJA_3007 GsXynB3 CJA_3067 Bt3663 Bt2852 CJA_3601 Bt0145 BT4185

--LINSHDPGT--ITKDGSTYFHFTTG--------------------TGIWYS-TSTNLTTWTGGPGPVFSSYPSWIYS--------------------K --QVDVHDPVM--TREGDTWYLFSTG---------------------PGITIY-SSKDRVNWR-YSDRAFATEPTWAKR--------------------V ---ELLHDPTM--IKEGSSWYALGTGLT-----------------EERGLRVL-KSSDAKNWT-VQKSIFTTPLSWWSN--------------------Y --DLWAHDPVI--AKEGSRWYVFHTG---------------------SGIQIK-TSEDGVHWE-NMGWVFPSLPDWYKQY-------------------V --NYSLPDPSV-IRAEDGYFYLYATEDI-------------------RNLPIH-RSKDLIKWE-SVGTAFTDETR------------------------P --NFSLPDPTI-IKGDDEYYYLYATESI-------------------KNVPIH-RSRDLVNWY-YVGTAFTDATR------------------------P ------HDPTV-MKTDDGYYYMYQTDASYGNAHS-----------GNGHFHAR-RSKDLVNWE-YLGATMSETPPTWIKEKLNAYRQEMGLE-------P --TYNVHDPSC--RKLGDYYYMYSTDAIFGENRKEAREKG-----VPLGFIQMRRSKDLVHWE-FLGWAFPEIPEE-----------------------A --RGDVPDPSV--IRIEDTYYATGTSSE-----------------WAPFYPMF-TSKDLVNWK-QVGHVFTK---------------------------Q --YTDFPDPDI--IRVGDVYYMATTTMH-----------------FTPGCDIL-RSYDLVHWE-FIAHALNIVADTPEERLECE---------------G --SGDYPDPSV--MRDGEDYYMTHSPFY-----------------YAPGFLIW-HSRDLMNWE-PVCR-------------------------------V --AGDRPDPSI--LKDGDDYYMVFSTFE-----------------SYPALTIW-HSRDMVNWQ-PIGP-------------------------------S --PGDHPDPTI--LKDGEDYYMTFSSFN-----------------SYPGIVLW-HSTDLVNWA-PVGA-------------------------------A --TGFHPDPSI--CRVGDDYYIAVSTFE-----------------WFPGVRIY-HSKDLKNWR-LVARPLNRLSQLNM---------------------I --AGFYPDPSI--VRVGEDFFLATSSFS-----------------YSPGVPIF-HSKDLVNWR-SLGHALIDTEQLPL---------------------F -------------------------------------------------MVII-HSKDLVNWT-IKGHVVSDLRQISEEMNWT----------------R --NADYSDPDV--IRVGDKYYMVNSDF------------------HYMGMPVL-ESDDMINWK-IISQVYRRLDFPDWD--------------------T --HADYSDPDV--IRVGDTYYMTASSFN-----------------SAPGLPLL-TSKDMVNWE-LVGHAFQQQSPLAVH--------------------A --MIKTRDPKG------------------------------------------------------------------------------------------YADYSDPDA--CRVGDDFYMTSSSFN-----------------CLPGLQIL-HSKDLVNWT-IIGAAVPYALTPIETP-------------------E

Bt1021 Bt4152 Bt0264 Bt3467 Bt3515 Bt2912 CJA_0818 CJA_0819 SavAraf43A CJA 3012 CJA_0806 Bt3655 Bt3094 Bt1873 Bt2959 Bt0265 Bt3683 Bt2112 Bt3685 Bt3108

110 120 130 140 150 160 170 180 190 200 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| DEPTTVN--------VWAPE-IFYD--DESEQ--FMVVWASCVPGRF---EKGIEEEE---NNH----RLYYITTKD--------FKTVSKAKVL----HEPEAHN--------CWAPE-LFYD--EPSET--YYIFWATTIPGRHKEVPTSESEKG---LNH----RMYYVTTKD--------FRTFSKTKMF----SPTGLNG--------SWSTS-----------R--ELAFYPNVKFE-----GIAFRYNA---KTG----KVVLSAHYEDQSGY---VAAKIYLAQI----DPSNSHN--------LWNPE-IRNI----NGK--WYIYFAADDG-----------------NTDNHQIYVIENDSPDPMQGE---FRMK--GAIM----DPSNSYH--------LWAPE-IHRI----NNK--WYIYFAADDG-----------------NMDNHQIYVVENEAANPMEGT---FVMK--GRIQ----GQWNSTC--------VWAPE-LHFW----KGK--WYIYYAAGYSGPPFIH-----------QKA----GVLESVTSDAM-GK---YVDK--GMLF----ELERCCN--------FWAFE-FHRLKTEKGYR--WYVMYTSGIAE----------------NFDRQHLSVIESEGDDPM-GP---YTYK--GSPM----TASRCCN--------FWAFE-MHRLNGPSGWR--WYIIYTAGISG----------------DLGGQRNHVLESEGDDPM-GP---YRYR--GTPM----SGVMGAH--------IWAPE-IHFI----DGK--WYVYFAAGSTS----------------DVWAIRMYVLESGAANPLTGS---WTEK--GQIA----TGPYSEL--------VWAPE-IHFNKDPQTGESAWYIYFAAAPSREIKYNLFQ--------HRM----YCIRNCHANPLEGT---WEFM--GQVD----SGPMSAN--------IWAPE-LHRW----DNT--WYIYFAAGETE----------------RPGHIRMYVLANDHVDPLQGQ---WREL--GRVN----NQEDLKR--------VWAPQ-TIYD----KEAKKYMVYWSMQHGN----------------GPD----IIYYAYANKD-------FTDIEGEPKT----MDNRFHA--------VWAPE-MHYI----KSAKNWFIVACMNQSAGG--------------RGS----FILRSTTGKPE-GP---YENIEGNEDK----DFWGQQD--------FWAPD-VYEY----EGR--YYLFTTFSNAG----------------VKR----GTSILVSDSPK-GP---FTPLVNKAIT----VGERVKV--------MKCPS---------TGE--YVMYMHADDMNY---------------KDP----HIGYATCSTIA-GE---YKLH--GPLL----DISQGRL--------FERPK-VIYN--PQTKK--WVMWSHWESGDGY--------------GAA----RVCVATSDKIM-GP---YVLY--KTFR----DIEKGCI--------MERPK-VIYN--AKTGK--FVMWLHLELK-----------------GQGYGPARAAVAVSDSPA-GP---YRFIRSGRVN---PG DLHPSKV--------LERPK-VIYN--DKTKK--FVMWLHVDSDDY---------------NKA----AAGVAVSDSPT-GV---FTYL--GSMH----DLHPSKV--------LERPK-VIYN--KKTGK--FVMWAHVESADY---------------SKA----CAGVAVSDFPN-GP---FTYL--GSFR----SAYGERG--------FWAPQ-LFKE----GNR--YYLTYTANE-------------------------HTALASSNSVY-GP---FRQDSIRPID-----

9

Bt2895 Bt2898 Bt3662 BsAXH-m2,3 PpXyn43A Bt3675 Bt3516A UXOrf66A CJA_1316 CJA_3070 Bt1781 UXOrf66B Bt0369 CJA_3018 HiAXH-d3 CJA_0799 CJA_0816 CjArb43A BsArb43A GsAbnB Bt2900 Bt3516B Bt0360 Bt0367 Bt4095 BaAXHd3 Bt3656 CJA_3594 CJA_3007 GsXynB3 CJA_3067 Bt3663 Bt2852 CJA_3601 Bt0145 BT4185

DSYGTWG--------FWAPE-VYYV--ESKKK--FYLFYSAEEH-------------------------ICVATSTTPE-GP---FRQE--VKQP----DSYGKWG--------FWAPE-VYYI--QSKKK--FYMFYSVEEH-------------------------ICVATSDSPV-GP---FRQE--SKQP---------EK--------YWAPSKAIFA----NGK--YYIYPTING-------------------------YMYPAVADKPE-GP---FKLARGKDEF----NGANGGRGIAKWAGASWAPS-IAVK--KINGKDKFFLYFANSGG------------------------GIGVLTADSPI-GP---WTDPIGKPLV----NNKNSGRGIAKWASNSWAPA-VAHK--KINGRDKFFLYFANGGA------------------------GIGVLTADTPI-GP---WTDPLGKALV----VSWANGN--------AWAPC-IEEK--KIDGKYKYFFYYSANPTTN---------------KGK----QIGVAVADSPT-GP---FTDL--GKPI----ISWLRRA--------LWAPA-VIHA----NDK--YYFFFGANDI-----------------QNNNELGGIGVAVADNPA-GP---FKDALGKPLI----FPWAFQG------DRAWASQ-AVEQ----NGK--WYWYVCLTEAAT---------------RAD----ALAVAVADRPE-GP---YTDAIGGPLA----VPWAERQ--------MWAPD-AITK----DGK--YYLYFPARARD----------------GLF----KIGVAIGDQPE-GP---FVAE--PEPI----VPWASRQ--------LWAPD-ATEK----DGK--YYLYFPAKNKE----------------GIF----QIGVALSDSPT-GP---FNAE--PEPI----VPWGRKEGGF-----MWAPD-CAYR----NGT--YYFYFPHPSE-----------------TDWNDSWEIGVATSDKPAEG----FKVQ--GYIE----FVWYDGKNG------AWAEC-IVER----DGK--FYMYCPIHGH------------------------GIGVLVADTPF-GP---FKDPLGKPLV----FSWAKGE--------AWASQ-VIER----DGK--FYWYVTVEHGTI---------------HGK----SIGVAVSDSPV-GP---FVDARGSALI----FTWAKGD--------AWASQ-VIER----NGK--FYWYVTVRHDDTK--------------PGF----AIGVAVGDSPI-GP---FKDALGKALI----PGAYVKG--------IWAST-LRYR--RSNDR--FYWYGCVEGRT----------------YLWTSPGGNALANNGEVPPSA---WNWQHTATID----WSDGYKG--------NWAAD-VIQA---PNGK--YWFYYNHCAQTEAD-------------GGCWNRSYLGLAEAESIE-GP---YVDK--GVFLRSGYR VPNFSGS--------FWAPD-IIHM----NGY--FYLYYSVSTFGS---------------SVS----AIGVARSASLTSPS---WQDL--GIVV----SPSFDGH--------LWAPD-IYQH----KGL--FYLYYSVSAF-----------------GKNTSAIGVTVNKTLNPA-SPDYRWEDK--GIVI----VPNYGQN--------QWAPD-IQYY----NGK--YWLYYSVSSFGS---------------NTS----AIGLASSTSISSGG---WKDE--GLVI----PEKDEDH--------LWAPD-ICFY----NGI--YYLYYSVSTF-----------------GKNTSVIGLATNQTLDPR-DPDYEWKDM--GPVI----TFEPKGN--------LWAPD-INKI----GDK--YVLYYSMSCWG----------------GEWTC--GIGVATADKPE-GP---FVDH--GMMF----NFEPKGR--------IWAPD-INKI----GDK--YVMYYSMSTWG----------------GEWTC--GIGIATADKPE-GP---FTDL--GKLF----IDNPSYG--------YWAPV-ARKV---SNGK--YRMYYSIVITNYIQTGKPEIENNGNFDGSWTERAFIGLMETSDPASNI---WEDK--GFVV---CS VQWVQSHAGGQGATNIWAPY-IIPY----KDK--YRLYYCVSAFGR---------------KTS----YIGLAESTSPE-GP---WTQK--GCIV----PSWTSNS--------FWAPE-LFYH----NNK--VYCYYTARQKST---------------GIS----YIGVATSDSPL-HE---FTDH--GPIV----ANAYGRG--------MWAPS-LRYH----RGT--WYVLFAANDTH-----------------------TSYLLTADDPC-GP---WRKR----------MPEYEGS--------AMAPD-LLKY----KGK--YYIYYPAKGT-------------------------NWVIWANDIK-GP---WSKPIDLKVS----LTKYIGS--------IWAPE-LTRH----QGR--YYVYIPAKFP-----------------GHN----SNYVVWADDIR-GP---WSDP--IDLN----LQKNIGT--------VWALD-LCKH----NNR--YYIYIPAAAEG----------------KPW----SIYVIWADDIR-GP---WSDPIDLHIE----GNPDSGG--------VWAPH-LSYS----DGK--FWLIYTDVKVV----------------EGQWKDGHNYLVTCDTID-GA---WSDP--IYLN----NQSTSRG--------IYAPT-LRHH----KGV--FYLITTLVDV-------------------R----GNFLVTATNPA-GP---WSKP--ILLP----MNRYGRG--------IWAGA-IRYY----QGK--FYVYFGTPDE------------------------GYFMSTATDPA-GP---WEPL--HCVK----NGNYGGG--------SWAPS-IRHH----DGK--FWIYFCTPRE------------------------GLMMSTATDPH-GP---WSPL--HCVK----RPQHGGG--------VWAPS-LRYH----NNM--FWIFYGDPDV------------------------GIYMYQAKHFA-GP---WSKP--HLLL-------------------SWSKP-VLVK----AGK------------------------------------------------G------------------RPEHGNR--------VWAPS-IRHH----NGE--FYIFWGDPDQ------------------------GAFMVKAKDPQ-GP---WTEP--VLVK-----

10

pKa modulator of catalytic acid

Bt1021 Bt4152 Bt0264 Bt3467 Bt3515 Bt2912 CJA_0818 CJA_0819 SavAraf43A CJA 3012 CJA_0806 Bt3655 Bt3094 Bt1873 Bt2959 Bt0265 Bt3683 Bt2112 Bt3685 Bt3108 Bt2895 Bt2898 Bt3662 BsAXH-m2,3 PpXyn43A Bt3675 Bt3516A UXOrf66A CJA_1316 CJA_3070 Bt1781 UXOrf66B Bt0369 CJA_3018 HiAXH-d3 CJA_0799 CJA_0816

210 220 230 240 250 260 270 280 290 300 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| ------------------YDPGFS-----------------------SIDAVIVKRA-KNDYVMVLKDNTRPER----------------------NLKI ------------------FNPDFS-----------------------VIDAAIVKDPTQGDLIMIVKNENSNPP-------------EK-------NLRV ------TPKGELEVGTMERPLGYD-----------------------SRDQSLFIDD-DGTAYLLSATNMNR------------------------DINI -----------------TNPDWNW-----------------------GIHPSTFEHK--GELYLLWSGWPNRRI-------------GSETQ----CIYI -----------------TDKDNNW-----------------------AIHASTFEHD--GQRYMIWCGWQKRRI-------------DSETQ----CIYI -----------TGDALGDWENNRW-----------------------AIDMTLLDHK--GQLYAVWSGWEGNEL-------------TDKTQQ---HLYI --------------------PDSW-----------------------NIDGNYLEHN--GQLYLLWSEWVGDEQ----------------------SIFI --------------------SSTW-----------------------NIDGTYLEHN--GSLYLLWSEWQGAYQ----------------------TLFI ------------------TPVSSF-----------------------SLDATTFVVN--GVRHLAWAQRNPAED-------------NNT------SLFI ------------------SGIDTF-----------------------CLDATTFEHK--GVLYYVWAQKDVAIE-------------GNS------NIYI ------------------TARDSF-----------------------SLDATVFEHR--GVRYLLWAQKDPQNL-------------FNS------ALYL ---------------LFLPKNGKS-----------------------CIDGDIIYKD--GLYHLFYKTEGDGN-----------------------GIK--------------------AIFP-----------------------NIDGSLFEDT-DGTVYFVGH-----------------------------NHYI -------------------PSGWM-----------------------CLDGSLYIDK-EGNPWLLFCREWLETI-------------DG-------EIYA ---------------YEGKPIRRW-------------------------DMGTYQDT-DGTGYLLLHGGI-----------------------------V -------------------PNKNE-----------------------SRDQTLFVDT-DGKAYHFCSTDMNT------------------------NMNI AYPLNMTRKERKMKWNPEEYKEWWTPKWYEAIAKGMFVKRDLKDGQMSRDMTLFVDD-DGKAYHIYSSEDNL------------------------TLQI -------------------PNGQI-----------------------SRDMTLFKDD-DGKAYHIYSSEQNS------------------------TLYI -------------------PNNAM-----------------------SRDQTVFVDD-DGRAYQFYSSENNE------------------------TMYI --------------------ETAK-----------------------NIDSFLFKDS-DGKYYLYHVRFNKGN-----------------------YLWV -------------------IWSEK-----------------------SIDTSLFIDD-DGTPYLYFVRFTDGN-----------------------VIWV -------------------IWEEK-----------------------SIDTSLFIDD-DGTPYLYFVRFTDGN-----------------------VIWV --------YKPFTPSTLLQSKNPG-----------------------GIDAEIFVDD-DGQAYVFWGRRH-----------------------------V ------------TPSTPGMSGVVW-----------------------LFDPAVFVDD-DGTGYLYAGGGVPGVS-------------NPTQ-----GQWA ------------THSTPGMAGVTW-----------------------LFDPAVLVDD-DGTGYLYSGGGIPNES-------------DPASIANPKTARV --------------ITSSPTGRGQ-----------------------QIDVDVFTDPVSGKSYLYWGNG---------------------------YMAG ----------------DKIVNGAQ-----------------------PIDQFVFKDD-DGQYYMYYGGWG--------------------------HCNM --------------------TGFA-----------------------FIDPTVFIDD-DGTPWLFWGNK---------------------------GCWY --------------------AGSY-----------------------SIDPAVFGDD-DGEFYLYFGGIWGGQLQKYRDNTYSEIHEEPTADQPALGARV --------------------KGSF-----------------------SIDPAVFKDD-DGSYYLYFGGIRGGQLQRWTTGTYSDTDAYPASNQPALSPKI --------------------GMDP-----------------------MIDPCVFVDD-DGQAYIYNGGGG--------------------------TCKG -----------------WQKEHWE-----------------------DIDPSVFIDD-DGQAYMYWGNP---------------------------HVYW ----------TNDMTTEFTKISWE-----------------------DIDPTVFIDD-DGQAYLYWGNT---------------------------QCYY -----------TNDMTTDTPIDWD-----------------------DIDPSVFIDD-DGQAYLFWGNTR--------------------------PRY--------------------NCYY-------------------------DAGLLIDD-DDTMYIAYGNP---------------------------TINV EAGEFSSYPLDNGQTTWNGAVDPN-----------------------VIDPAAFYDK-SGGLWMVYGSYSG-------------------------GIFV --------------ESFGGGSEIN-----------------------AIDPALYRDH-DGKVYMSYGSFFG-------------------------GIGL

11

CjArb43A BsArb43A GsAbnB Bt2900 Bt3516B Bt0360 Bt0367 Bt4095 BaAXHd3 Bt3656 CJA_3594 CJA_3007 GsXynB3 CJA_3067 Bt3663 Bt2852 CJA_3601 Bt0145 BT4185

--------------ESVPQRDLWN-----------------------AIDPAIIADD-HGQVWMSFGSFWG-------------------------GLKL ---------------RSTSSNNYN-----------------------AIDPELTFDK-DDNPWLAFGSFWS-------------------------GIKL ---------------HSTASDNYN-----------------------AIDPNVVFDQ-EGQPWLSFGSFWS-------------------------GIQL ---------------RSNEIGVQN-----------------------SIDPFYIEDG--GRKYMFWGSFR--------------------------GIYG ---------------RSNEINIQN-----------------------CIDPFYIEDN--GKKYLFWGSFH--------------------------GIYG ASDKGKTDYGRSSINDWEGYFKIN-----------------------AIDPTYIITE-NGEHWLIYGSWHSGIA-------------ALQVNPEDGKPL---------------KTDDSTAMN-----------------------AIDPSVIADENTGKWWMHYGSFFG-------------------------GLYC -------------------EYGKE-----------------------AIDAFIYDDN--GQLYISWKAYGLDTR-------------PI-------ELLG -------------------ELDGF-----------------------YYDSGLFFDD-DDRAYVVHGQS---------------------------TLRI -----------------------------------------------GIDPGHIADQ-EGNRYLYVDKGEV---------------------------------------------PDKKYPD-----------------------RIDPGHLVGE-DGKRYLFFSH----------------------------GDY-----------------------G-----------------------CIDPGHAVGE-DGKRYLFVN-----------------------------GIRK ----------------------SS-----------------------GFDPSLFHDE-DGRKYLVNMYWDHRVD-------------HHPFY----GIVL ---------------------EVG-----------------------GIDPDIFFDD-DGRVYIAHNDGPPGEP-------------QYEGHR---AIWM -------------------AEKGW-----------------------D-DCCPFFDD-DGQLYFVGTHFADKY-----------------------KTYL -------------------RIGGW-----------------------E-DPCPIWDD-NGQAYLGRSQLGAG------------------------PIIL --------------------PGKG-----------------------LIDPTPFWDE-DGNAYLLHAWAHSRAGISN-------------------KLTL -----------------------------------------------MIDPTPLWDE-DGKVYLIHAYAGSRSG-------------VNS------ILVI --------------------PGKG-----------------------IIDTCPLWDE-DGKVYLVHAYAGSRAG-------------LKS------VITI Catalytic acid

Bt1021 Bt4152 Bt0264 Bt3467 Bt3515 Bt2912 CJA_0818 CJA_0819 SavAraf43A CJA 3012 CJA_0806 Bt3655 Bt3094 Bt1873 Bt2959 Bt0265 Bt3683

310 320 330 340 350 360 370 380 390 400 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| --------AFSDSLTGP--YS------PSSQPFTES-------------------FVEGPTVEKLGDD--YLIYFD----------------------VY -------TRTKNIAKGF----------PTKVSAPITGKY----------------WAEGPAPLFVGDA--LYVYFD----------------------KY -------YKLDPSWTKP--VL------LVNTICKGL-------------------HRETPAIIKKDGE--YYFFSS------------------KASGWY -------ARMENPWTLA--TE------RVLISKPEYEWERQWVNPDGGRTAYPIYVNESPQYFHSKDNRTVIVYYA------------------ASGCWS -------ATMKNPWTLS--SD------RILISKPEYEWECQWVNPDGSKTAYPIHVNEAPQYFESKNKDKACIFYS------------------ASGSWT -------AKMLNPWTMA--SG------RVKISSPDRYYEQGELP-----------LNEGPQILKHEKD--VFIVYS------------------CGQSWL -------ARMVNPWTIT--GE------RVLIAKPEYEWEKSGMK-----------VNEGPEIIKRNNR--TFMVYS------------------ASFCNT -------QAMSNPWTVT--GS------RVAIASPAYSWELVGAN-----------VNEAPVVLKRNGR--TFVIYS------------------ASSCST -------AKMANPWTIS--GT------PTEISQPTLSWETVGYK-----------VNEGPAVIQHGGK--VFLTYS-------------------ASATD -------APMETPWQLG--GE------PVMLTKPEFEWEIRGFW-----------VNEGPSIVKRNGR--IFLSYS-------------------ASATD -------ATLETPTQVG--SR------EVEISRPDLGWEIQGYK-----------VNEGAAVIKHNGR--IFVSYS-------------------ASATD -------KATTASLTSGQWTE------SEDYKQQTKE------------------AVEGAGIFPLIGTDKYILMYD------------------------------ARMKPDMSGL--AE------EIKTLKESKYSPEP--------------YVEGAFIFKYDGK--YHLVQAIWAHRTVKGDTYVEKEGLTNKKTR -------QRLAKDLKTT--EG------DPYLLFKASEAPWVGSITSSGVTGN---VTDAPFIYRLDDGK-LIMLWS------------------SFRKTD -------YRLSKDYRTA--EE------KVVSGVGGS-------------------HGESPAMFKKDGT--YFFLFS------------------NLTSWE -------ALLRDDYLEP--TP------TETKILKGL-------------------KYEAPAIFKVGDM--YFGLFS------------------GCTGWE -------AELADDYLSH--TG------KYIRIFPGG-------------------HNEAPAIFKKEGT--YWMITS------------------GCTGWD

12

Bt2112 Bt3685 Bt3108 Bt2895 Bt2898 Bt3662 BsAXH-m2,3 PpXyn43A Bt3675 Bt3516A UXOrf66A CJA_1316 CJA_3070 Bt1781 UXOrf66B Bt0369 CJA_3018 HiAXH-d3 CJA_0799 CJA_0816 CjArb43A BsArb43A GsAbnB Bt2900 Bt3516B Bt0360 Bt0367 Bt4095 BaAXHd3 Bt3656 CJA_3594 CJA_3007 GsXynB3 CJA_3067 Bt3663 Bt2852 CJA_3601 Bt0145 BT4185

-------SLLSDDYLKP--SG------TFTRNFAGK-------------------FREAPAVFKHNGL--YYVISS------------------GCTGWS -------SLLTDDYLKP--SG------RFTRNFIKE-------------------SREAPAVFKHKGK--YYMLSS------------------GCTGWD -------AEFDLQKGCIKPET------LKKCFDNTEAWERTPNYKSAP-------VMEGPTVVKLDDL--YYLFYS------------------ANHFKN -------AQMTDDLMSIKTET------LNQCIKAEVSWELLQGK-----------VAEGPSLLKKNGV--YYLIYS------------------ANHYEN -------AQMTDDLMKIKKET------LSECIKAEEPWELLQAK-----------VAEGPSVLKKKGM--YYLIYS------------------ANHYQN -------AKLNEDMITV--DS------VVQVISTPRKE-----------------YSEGPIFFKRKGI--YYYLYT--------------------IGGD NPKTARVIKLGPDMTSV--VG------SASTIDAPF-------------------MFEDSGLHKYNGT--YYYSYC-------------INFGGTHPADK -------IKLGADMTSV--IG------SATTIDAPY-------------------LFEDSGIHKYNGK--YYYSYC-------------INFAGTHPQQY -------AELNDDMLSI--KE------ETTVVLTPKGGTLQTYA-----------YREAPYVIYRKGI--YYFFWS------------------VDDTGS -------VKMAPDLLSIVPFE------DGTIYKEVTPQN----------------YVEGPFMLKHNGK--YYFMWS------------------EGGWGG -------GRLGDDMISF--PDGWKEVPGFHDPACFGPESMKPDWSDGGKEKMMVGYEEGPWLTKRGDT--YYLSYP---------------------AGG -------ARLSADMKSFVEAS------REVVILDEQGQPLLAGDNSRR-------YFEGPWMHKYQGK--YYLSYS----------------------TG -------ARLSDDMLQFAEVP------RDVLLQDEGGTPLLGTDNNRR-------FFEAAWVHKYNGK--YYFSYS----------------------TG -------GKLKDNMIEL--DG------PMRTMEGLSD------------------FHEATWIHKYNGK--YYLSYS-----------------DNHDDGE -------VKLNKDMISV--DG------EIHRIDYPIDT-----------------YQEGPWFYKHNGK--YYLAFA---------------------STC -------VKLKKNMIEL--DG------PIVPVHLPR-------------------YTEAPWIHKCGDW--YYLSYA----------------------SE -------AKLKKNMVEL--DG------PIRAIEGLPE------------------FTEAIWVHKYQDN--YYLSYA----------------------MG -------AQLSPDGTRQ--VR------VQQRVYAHPQGQ----------------TVEGARMYKIRGN--YYILVT---------------------RPA -------LAMDENTGKPKAGQ------GFGKRLVGGDFR----------------AIEGAFVMYSPVADYYYLFYS-----------------VAGFAAN -------AEINQSTGKL--NG------SVTKIYGGNHA-----------------DIEAPYITRNGDY--YYLFIN---------------RGACCQGAS -------FKLNDDLTRP--AE------PQEWHSIAKLERSVLMDDSQAGSA----QIEAPFILRKGDY--YYLFAS---------------WGLCCRKGD -------TKLDKSTMKP--TG------SLYSIAARPNNGG---------------ALEAPTLTYQNGY--YYLMVS---------------FDKCCDGVN -------IQLDTETMKP--AA------QAELLTIASRGEEPN-------------AIEAPFIVCRNGY--YYLFVS---------------FDFCCRGIE -------IELTADGLKV--KS------GADKKQIAGT------------------AYEGTYIHKKNGY--YYLFAS---------------TGTCCEGLK -------TELTDDGLSLKKGA------PIEQIAGT--------------------AYEGTYIHKRDGY--YYLFAS---------------IGRCCEGLK -------NALGNPWDIT--GEDNSGYGKIIATRGTSRWQ----------------ASEGPEVIYRDGY--YYLFLA-------------------YGSLS -------VELNPETGLALNEG------DLGHLVARRANYRKD-------------NLEAPEIIYHPELKQYYLFTS-------------------YDPLM -------CKLSADGLHL--DG-----EPFTLLVDEKGI-----------------GMEGQYHFKEGDY--YYIVYA----------------AHGCCGPS -------TELNPELSGPMPGG------LDRVIVQDDPQADL--------------GYEGSHLYKHDGR--YYVFTC-------------------HFPQG -------IRLTDDGLAT--IG------QKQKVYEGWRYPNHWETECM--------CLESPKLNYHNGY--YYLTSA----------------QGGTAGPA -------VQLSDDGLSL--VG------EMKHVYDGWKYPEAWDVESY--------AQEGPKMMKHGEY--FYMVTA----------------VGGTAGPP -------IRLTDDGLAT--EG------ELEHAYSPWRYPDHWIVENF--------APEGPKLMRRGEW--FYLVTA----------------VGGTAGPV -------QEYSVEQKKL--VG------EPKIIFKGTDLR----------------ITEGPHLYKINGY--YYLLTA-------------------EGGTR -------WEYDPDNQKIIPDS------RRLLVDGGSDLKKKPI------------WVEGPHLYKINGW--YYLVCA-------------------EGGTG -------YRMTPDGKTL--IE------NSKILINEGY------------------GREASKLYKINGT--YYHFFS--------------------EVKN -------HKMSADGRTL--ED------DGHVIYTGP-------------------VAEGTKFHKRDGY--YYISIP-------------------EGGVG -------RRMSPDGRRILNQE------APIVVNADLLPEYR--------------ALEGPKLYKRNGY--YYIFAP-------------------AGGVA -------CELNAEGTEV--IS------DPVMVFDGNDGKNH--------------TVEGPKLYKRNGY--YYIFAP-------------------AGGVA -------CELNKEATKA--IT------PSRIIFDGHEAHQ---------------TCEGPKFYKRNGY--YYIFHP-------------------AGGVP

13

Bt1021 Bt4152 Bt0264 Bt3467 Bt3515 Bt2912 CJA_0818 CJA_0819 SavAraf43A CJA 3012 CJA_0806 Bt3655 Bt3094 Bt1873 Bt2959 Bt0265 Bt3683 Bt2112 Bt3685 Bt3108 Bt2895 Bt2898 Bt3662 BsAXH-m2,3 PpXyn43A Bt3675 Bt3516A UXOrf66A CJA_1316 CJA_3070 Bt1781 UXOrf66B Bt0369 CJA_3018 HiAXH-d3 CJA_0799 CJA_0816 CjArb43A BsArb43A GsAbnB Bt2900

410 420 430 440 450 460 470 480 490 500 ....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....| KKKIYGAM---RTR-DFKN-FTDI--------TESVSVP--------------VGHKH-GTIFRASESDVKTLL---EESKKK----------------RDHRYGAV---RSL-DHGETWEDV--------SDQVSFP--------------KGIRH-GTAFAVDASV------------------------------PSQT-MYT---SAA-DLGGEWT--------------PMREIG------NNSTFDAQFN-RISTVGKTCGVWSYH---WGAQRK------Y------KTPA PYYCTGMLTADADS-DLLNPASWK--------KSPVPVFQQR------PEDGVYGPAN-LSFLPSPDGTEWYIL---YQARSV--PSGNT------GESE PYYCVGLLTADAKA-NLLNPASWK--------KSPVPVLQQD------PENNVYGPGG-ISFTPSPDGKEWYML---YHARQI--PNDAP------GASD DTYKLSYLRLKDSDADLLDPESWI--------KSEKPAFEGT--------NQVLGVGH-ASFTTSPDDREYYIC---YHSKRE------K------TPGW PDYKLGVIEL-VGD-DPLKPDSWH--------KFDQPFFSRA--------NGVYGPGH-NGFFKSPDGTEDWLI---YHGNAS----EQE------GCTS ADYKLGQLEL-TGT-NPLLASSWA--------KTTAPVFQQA--------NGVYGPGH-NGFFTSPDGTENWIA---YHANAS----ASQ------GCGD ANYCLGMLSASASA-DLLNAASWT--------KSSQPVFKTS------EATGQYGPGH-NSFTVSEDGKSDILV---YHDRNYKDISGDP------LNDP ENYAMGLLWIDDNA-DLLNPANWT--------KAREPVLCSD------IPNKIFGPGH-NSFTYAEDGDTVMLV---YHARTYTEIEGDP------LWNP HRYAMGLLWIDETA-DILNPAAWN--------KAPEPVFFTH------EKLQRYGPGH-NSFTLAEDGKTPVLV---YHARDYLELQGAP------LTDP -----VYM---KGK-YQFTESTDL---------ENFKVIDNA-------ISMDFHPRH-GTVMPITD-KELKRL---YKAYGK----------------YSYDCIIA---TAD-NVYGPYGKR-------------------------YNAITGGGH-NNLFQDKD-GNWWATMF-FNPRGA----QAA------EYKV GKYAIGQAV--SASGNVLGPWVQE------------PETLNS-----------DDGGH-AMVFKDLK-GRLMIS---YHAPNS----------------Q KNDN-FYF---TAP-SVKGPWTRQ------------GLFAPE--------GSLTYNSQTTFVFPLKCGEDTIPM---FMGDRW---------------SY PNPG-RSA---YST-DILGNWTTG------------NNFAVD------KLKQVTYNSQSCYVFKVEGKEKAYIY---MGDRWN-------------SKDV PNKA-RLL---TAD-SMLGEWKQL--------PN--PCVGED-------ADKTFGGQS-TYILPLPEKGQFFFM---ADMWRP-------------KSLA PNEA-EYA---VAT-QILGPWTVK----------GNPCVGKD------ADKTFYGQST-HVLPIQGKKDAYIAM---FDKWNK-------------TDLI PNVA-EIA---VAD-SIMGTWKTI-----------------------------------GNPCTGPDADKTFYA-------------------------IDYAVGYA---TST-SPYGPWKKH--------ANN-PIIHRS-------IVHENGSGH-GDLFKGLD-GRYYYV---YHVHHS------D------SIVQ KGYGVGYA---TSD-TPMGPWVKY--------SKN-PLLQGD------AATGLVGTGH-GAPFQCKD-GSWKYI---FHAHWS------A------AEIQ KGYGVGYA---TSK-SPMGPWTKY--------DKN-PLLQGD------ESTGLVGTGH-GAPFQCKD-GSWKYI---FHAHWN------K------EKIH EKYQYAYV---MSRVSPMGPFEAP--------EQDIISTTNY-------ERGIFGPGH-GCVFHPEGTDNYYFA---YLEFGR--------------RST PPGEIGYM---TSS-SPMGPFTYR--------GHFLKNPGAF--------FGGGGNNH-HAVFNFK--NEWYVV---YHAQTV-SSALFG------AGKG PAGEIGYM---VSD-NPMGPFTYK--------GHFLKNPYTF--------FGVGGNNH-HAVFNFK--NEWYVV---YHAQTV-SKAQIG------AGKG PNYHVVYG---TAQ-SPLGPIEVA--------KEPIVLIQNP-------KEEIYGPAH-NSILQVPGKDKWYIV---YHRINK--NHLND------GPGW PDYSVAYA---IAD-SPFGPFERV--------G---KILQQD-------TNIATGAGH-HSVVKGPGSDEWYII---YHRRPL-----GE------TAAN VPEHMAYS---TAK-SIHGPWTYR--------G---RIMDEA-------ENSFTIHGG-NITFR----GHNYMF---YHNGKL-----PN------GGGF DTHFLCYA---TSD-NPYGPFTYQ--------G---QILTPV----------VGWTTH-HSICEFE--GKWYLF---YHDSVL-----SE------GVTH DTHYIAYG---VGD-SPYGPFTYK------------GIVLTP---------VLGWTTH-HSIVEYQ--GKWYLF---YHDAQL-----SG------GETH KHNRMCYA---ISD-SPLGPWEYK------------GIYMEP---------TDSYTNH-GSIVEFK--GQWYSF---YHNSAL------------SGHDW CPEGIGYA---MSD-SPVGPWKSM--------G---AIMDRT--------WR-TRGNH-PGIIEFK--GQSYVFGLNYDL---HHLETFK------HAEFPEKICYA---MSR-SITGPWEYK------------GILNEI--------AGNSNTNH-QAIIEFK--GDWYFI---YHNGSI----NTA------GGSF FPEKIGYA---MGK-SIKGPWVYK------------GILNEV--------AGNTPTNH-QAIIEFN--NKHYFI---YHTGAG----RPD------GGQY DAEYVLRS---TTG-SPFGPYEAR---------TLVSRIQGP--------LANAGFAHQGGIVDAPD-GTWHYV---AFMDAY---------------PG DGYNIRIA---RSK-TPDGPYLDP--------AGNDIKDAAG----LEIGGKLLGGFE-FVQTLGEEGKSWGYQ---SPGHNS----AYY------DAAT STYYVQVA---RST-SIYGPYTDV--------RTILPNVDGK----------YKGPGH-IGVLKQD--GCNYVS---THYYDL-------------NDNG STYHLVVG---RSK-QVTGPYLDK--------TGRDMNQGGG-SLLIKGNKRWVGLGH-NSAYTWD--GKDYLV---LHAYEA-------------ADNY STYKIAYG---RSK-SITGPYLDK---------SGKSMLEGGGTILDSGNDQWKGPGG-QDIVNGN-----ILV---RHAYDA----------------STYKIAVG---RSK-DITGPYVDK---------NGVSMMQGGGTILDEGNDRWIGPGH-CAVYFSG--VSAILV---NHAYDA-------------LKNG STYQTVVG---RSS-SLWGPYLDK--------QGR-SMLDNNHVVLIHNNKSFVGTGHNAELITDKADNDWIL----YHGVSV-------------SNPY

14

Bt3516B Bt0360 Bt0367 Bt4095 BaAXHd3 Bt3656 CJA_3594 CJA_3007 GsXynB3 CJA_3067 Bt3663 Bt2852 CJA_3601 Bt0145 BT4185

STYTTVVG---RSK-NLFGPYVDK--------KGR-SMSDNHHEILIQKNKAFVGPGHNSELVTDKKGNDWMF----YHAVSV-------------ANPE VEYNTRVC---RSK-NIDGPYVDIHGNSAMGSAQLYPILTAP--YRFDNSYGWVGISH-CGIFDDGA-GNWFYT---SQGRFPVNVGGNE------YSNA TTYNVRVS---RSD-AAEGPFTDYFGKAVKDTTNNFPILTAP--YRFENHTGWAGTAH-CAVFSDGE-GNYFMA---HQGRLS-------------PQNQ SDYDVYVA---RAR-NYGGPYEKY--------SGN-PILHGG-------EGDYKSCGH-GTVVRTSD-GRMFYM---CHAYLK-----GD------GFFI KGKTEACL---MAE-SLDGAFEVR------------EIIEDD------LSFHGYGVAQ-GGMVDTPD-GDWYAF---MMQDRG---------------GV TSHMAVAA---RSK-SVTGPWENS---------PYNPVVHTY-----SAHDNWWSKGH-GTLIDDVN-GNWWIV---YHAYAK------------GYHTL TGHMVIAA---RAK-SVHGPWENS--------PYNPIIRTES------ALEKWWSRGH-ATLVEGPTAGDWYMM---YHGYEN------G------FHTL TGHMVIAA---RAR-SIHGPWEHC--------PHNPLVRTLS------EAEPWWSRGH-ATLVEGPK-GDWWMV---YHAYEN------G------FRTL YNHAATIA---RST-SLYGPYEVH---------PDNPLLTSW----PYPRNPLQKAGH-ASIVHTHT-DEWFLV---HLTGRPLPREGQPLLEHRGYCPL YDHSAVVF---RSK-SLQEPFIPY--------GKN-PILTQR-DLDPKRANAINNAGH-ADLVQTAE-GEWWAV---FLATRT---YQQE------HFNT GGRYIMMQ---RSS-SITGPYLER------------KQLSHV-------QREYNEPNQ-GGLVEGPD-RKWYFF---THHGTG--------------DWA EGWQ-TIL---RSK-NIYGPYEKK------------VVLEKG-------STNVNGPHQ-GALVDTPE-GEWWF----YHFQLT--------------EPL PGWQ-SVF---RAK-QIYGPYEDR------------IVLEQG-------NTDVNGPHQ-GAWVQTPEGTDWFFH---FQDKGV----------------Y TGWQ-LVL---RSK-NIYGPYESK------------IVMAQG-------KTTINGPHQ-GGWVDTNTGESWFVH---FQDQGA----------------Y TGWQ-VVL---RSK-NAYGPYEWR------------TVLAQG-------DSPVNGPHQ-GAWVDTPSGEDWFFH---FQDVGA----------------Y

Bt1021 Bt4152 Bt0264 Bt3467 Bt3515 Bt2912 CJA_0818 CJA_0819 SavAraf43A CJA 3012 CJA_0806 Bt3655 Bt3094 Bt1873 Bt2959 Bt0265 Bt3683 Bt2112 Bt3685 Bt3108 Bt2895 Bt2898 Bt3662 BsAXH-m2,3 PpXyn43A

510 520 530 540 550 ....|....|....|....|....|....|....|....|....|....|....|.... --------------------------------------------------------------------------------------------------------------------GNFPRISIAAFNKGYASMDYYRYLEFSDKYGIIPVQNGKNLTLNVPVTAAV-------SRNPRLQKIGWDADG-----------------MPDLGVPLPVNTPLPKPSGTVPNQ--SRSPRLQKIDWDKDG-----------------MPVLGTPQKEEEPMARPSGSPIHKKQN KRDIRLQKFTFDASG-----------------VPCFGKP-------------------TRSLRAQQFTWTDDD-----------------LPFFGEPVASGVPVQ-----------TRSLRAQKFTWNSNG-----------------TPNFGSPVATS---------------NRRTRLQKVYWNADG-----------------TPNFGIPVADGVT-------------DRHTFVKPLRWDDQG-----------------MPVFGKPSLIE---------------NRHTYIRRLQWTTQG-----------------FPDFGQERADDLVTHKQSTK---------------------------------------PDKM---------------------TCRPGLIPMLYENGKF----------------KPNHNYQAK-----------------TEHPVITPIYIKDGK-----------------FVALN---------------------PHQASAATYVW---------------------MPM-----------------------GKSHHV--------------------------W-------------------------DSRYIWLPVQFDDKG-----------------VPFIKWMDRWNFD-------------HSTYIWLPIQFREDGSMAIQ------------WLDSWSPDTYEF---------------QSTYVQPVIGKKNA-----------------YIAMF---------------------PRKTRIVPLILKKEN-DVYSITIDKENVIKPMWK------------------------PRTSYIKDFAISDQG-----------------VVTISGTVIKPRVLK-----------PRTSYFKDFTISKQG-----------------VVSITGKVIKPKVIK-----------NRQTYVNQLKFNEDG-----------------TI------------------------YRSPHINKLVHNADG-----------------SIQEVAANYAGVTQI-----------YRSPHINKLVHKEDG-----------------SIS------------------------

15

Bt3675 Bt3516A UXOrf66A CJA_1316 CJA_3070 Bt1781 UXOrf66B Bt0369 CJA_3018 HiAXH-d3 CJA_0799 CJA_0816 CjArb43A BsArb43A GsAbnB Bt2900 Bt3516B Bt0360 Bt0367 Bt4095 BaAXHd3 Bt3656 CJA_3594 CJA_3007 GsXynB3 CJA_3067 Bt3663 Bt2852 CJA_3601 Bt0145 BT4185

HREVCIDRMEFNPDG-----------------TIKQVIPTP-----------------SRATCIDRMYFNKDG---------------KIEPVRMTF-------------------HRSTCVEEFTYGADG----------------SIPFIPFT-------------------LRSVKVTELHYEADG------------KIKTIHPYRD---------------------LRNIKVTELVHHDDGSI---------------QPIVPYSN------------------LRSICVDKLYYNPDG-----------------TIKMVKQTK-----------------QRSVSVAPMYYEPDG-SIRKVPY---------WLDN----------------------RRSVCIDRLYYNEDG-----------------TMKRIQMTTEGVQ-------------RRSVSIDELFYNPDG-----------------TIKRIVMTTEGVAPNKSPERVKKAAKGRIPVVAPLRWTADG-----------------WP------------------------GRHILVTHTRFPATSTD---------------YPNNEEAHAVRVH-------------AAKLDLLKMTYSNG------------------WP------------------------LQKLKILNLHWDGEG-----------------WPQVDEKELDSYISQRLK-------------------NDNG-----------------IPSFSSMI------------------EPTLQIRPLYWDDEG-----------------WPYLSV--------------------GRVLLLDRVDWKKG------------------WPVVKSSSASTESPKPHF--------GRVLMMDRVQWKGG------------------WPFVKGAVPSLETAAPDFR-------IMMGHVRSIRWDANG-----------------WP------------------------LMVLHIRQLFFTPEG-----------------WP------------------------GRQPILQEMEMTDDH-----------------W-------------------------GRVPILMPMRFGEDG-----------------FPVVGENGKVPQSVSVPAA-------GRSTLIEPIEWTADG-----------------WYRTKSTATPIKTD------------GRQTLLEPIEWTDDG-----------------WF------------------------GRQTLLEPIEWTDDG-----------------WF------------------------GRETAIQRLEWKDG------------------WPYVVGGNGPSLEIDGPSV-------GRETFLLPVRWHDE------------------WP------------------------GRIASLLPVYWVDG------------------WPIIGEVGQDGIGTMV----------GRVVHLQPAHWKDG------------------WPVIGVD-------------------GRILHLQPMRWEND------------------WP------------------------GRVIHLNPMKWVNE------------------WPVIG---------------------GRLVHLQPMKWVND------------------WP-------------------------

Alignment of the sequences included in the phylogenetic tree in Figure 3 of the main body of the paper

16

Enzymology: The Structure and Function of an Arabinan-specific α -1,2-Arabinofuranosidase Identified from Screening the Activities of Bacterial GH43 Glycoside Hydrolases

J. Biol. Chem. 2011, 286:15483-15495. doi: 10.1074/jbc.M110.215962 originally published online February 21, 2011

Access the most updated version of this article at doi: 10.1074/jbc.M110.215962 Find articles, minireviews, Reflections and Classics on similar topics on the JBC Affinity Sites. Alerts: • When this article is cited • When a correction for this article is posted Click here to choose from all of JBC's e-mail alerts

Supplemental material: http://www.jbc.org/content/suppl/2011/02/21/M110.215962.DC1.html This article cites 47 references, 19 of which can be accessed free at http://www.jbc.org/content/286/17/15483.full.html#ref-list-1

Downloaded from http://www.jbc.org/ by guest on November 3, 2015

Alan Cartmell, Lauren S. McKee, Maria J. Peña, Johan Larsbrink, Harry Brumer, Satoshi Kaneko, Hitomi Ichinose, Richard J. Lewis, Anders Viksø-Nielsen, Harry J. Gilbert and Jon Marles-Wright

Suggest Documents