Structure, function and latency regulation of a bacterial enterotoxin potentially derived from a mammalian adamalysin/ADAM xenolog

Theodoros Goulas, Joan L. Arolas, and F. Xavier Gomis-Rüth¹

Proteolysis Lab, Department of Structural Biology, Molecular Biology Institute of Barcelona, Consejo Superior de Investigaciones Científicas (CSIC); Barcelona Science Park, Helix Building, c/Baldiri Reixac 15-21, E-08028 Barcelona, Spain

Edited* by Robert Huber, Max Planck Institute for Biochemistry, Planegg-Martinsried, Germany, and approved December 6, 2010 (received for review August 18, 2010)

Enterotoxigenic Bacteroides fragilis is the most frequent disease-causing anaerobe in the intestinal tract of humans and livestock, and its specific virulence factor is fragilysin, also known as B. fragilis toxin. This is a 21-kDa zinc-dependent metallopeptidase existing in three closely related isoforms that hydrolyze E-cadherin and contribute to secretory diarrhea, and possibly to inflammatory bowel disease and colorectal cancer. Here we studied the function and zymogenic structure of fragilysin-3 and found that its activity is repressed by a ∼170-residue prodomain, which is the largest hitherto structurally characterized for a metallopeptidase. This prodomain plays a role in both the latency and folding stability of the catalytic domain and it has no significant sequence similarity to any known protein. The prodomain adopts a novel fold and inhibits the protease domain via an aspartate-switch mechanism. The catalytic fragilysin-3 moiety is active against several protein substrates and its structure reveals a new family prototype within the metzincin clan of metallopeptidases. It shows high structural similarity despite negligible sequence identity to adamalysins/ADAMs, which have only been described in eukaryotes. Because no similar protein has been found outside enterotoxigenic B. fragilis, our findings support the hypothesis that fragilysins derived from a mammalian adamalysin/ADAM xenolog that was co-opted by B. fragilis through a rare case of horizontal gene transfer from a eukaryotic cell to a bacterial cell. Subsequently, this co-opted peptidase was provided with a unique chaperone and latency maintainer in the course of evolution to render a robust and dedicated toxin to compromise the intestinal epithelium of mammalian hosts.

bacterial enterotoxin ∣ human pathogen ∣ zymogen activation

The gastrointestinal tract is that part of the interface between the organism and its external environment where food digestion and nutrient uptake occur. The tract hosts bacteria that are beneficial for the host by controlling invasion and proliferation of pathogens, enhancing the immune system, processing indigestible food and providing essential nutrients. The most populated section of the tract is the large intestine, which is anaerobic and contains ten times more bacterial cells than the number of human cells in the entire body (1). However, in certain circumstances, this beneficial relationship can be disrupted and pathogenic bacteria can invade and proliferate, causing a number of disturbances. Members of the genus Bacteroides comprise the majority of intestinal obligate anaerobes, of which Bacteroides fragilis is most frequently associated with disease. Enterotoxigenic B. fragilis (ETBF) strains colonize and affect humans and livestock, and they have been linked to intraabdominal abscesses, diarrhea, inflammatory bowel disease, anaerobic bacteremia, and colon cancer (1–3). In addition to the bacterial capsule, which induces abscess formation, the only identified virulence factor for ETBF is a 21-kDa zinc-dependent metallopeptidase (MP), termed fragilysin, alias B. fragilis toxin (BFT) (4–6). It is synthesized as a preproprotein of 397 residues, with an 18-residue signal peptide for secretion, a ∼170-residue prodomain (PD) flanked by flexible

1856–1861 ∣ PNAS ∣ February 1, 2011 ∣ vol. 108 ∣ no. 5

segments, and a ∼190-residue catalytic domain (CD). The latter encompasses two sequence elements that ascribe it to the metzincin clan of MPs: (i) an extended zinc-binding consensus sequence (ZBCS), HEXXHXXG/NXXH/D, which comprises three histidines that bind the catalytic zinc ion plus the general base/acid glutamate involved in catalysis; and (ii) a conserved methionine within a tight 1,4-β-turn, the Met-turn (5–8). However, upstream of the ZBCS there is no significant sequence similarity to any other metzincin, which suggests that fragilysin is a unique metzincin prototype (7). The enzyme exists in three closely related isoforms of identical length: fragilysin-1, -2, and -3, alias BFT-1, -2, and -3, which display pairwise sequence identities of 93–96% (9). Analysis of clinical isolates reveals that the three isoforms are generally present simultaneously (2) and that fragilysin-1 is the most abundant (see table 9 in ref. 2). The three fragilysin isoforms are encoded by a chromosomal pathogenicity islet that is absent in nonenterotoxigenic strains. In addition to fragilysin, this islet contains a second gene, mpII, which is countertranscribed and encodes a potential MP of similar size and moderate sequence identity (28–30%) to the three fragilysin isoforms. However, its potential role in ETBF pathogenesis remains to be established (2). The only proven substrate for fragilysin-1 in vivo is E-cadherin, an intercellular adhesion molecule. Shedding of E-cadherin by fragilysin-1 led to increased permeability of the epithelium and, ultimately, cell proliferation, which supported a role for this MP in colorectal carcinoma (10). In vitro, fragilysin-1 was shown to cleave type-IV collagen, gelatin, actin, fibrinogen, myosin, tropomyosin, human complement C3, and α1-proteinase inhibitor (5, 6). The protein is stable at room temperature and below, but it undergoes rapid autodigestion above 37 °C.
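The extended zinc-binding consensus sequence given above, HEXXHXXG/NXXH/D, can be expressed as a simple pattern search. The following is a minimal sketch, assuming one-letter amino acid codes; the function name `find_zbcs` is hypothetical, and the toy sequence is invented for illustration only (it is not the fragilysin sequence).

```python
import re

# HEXXHXXG/NXXH/D as a regular expression; "." stands for any residue (X).
ZBCS_REGEX = r"HE..H..[GN]..[HD]"


def find_zbcs(sequence):
    """Return (1-based start position, matched motif) for every ZBCS hit."""
    # Wrapping the motif in a lookahead also reports overlapping hits.
    lookahead = re.compile("(?=(" + ZBCS_REGEX + "))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence.upper())]


# Toy sequence made up for this example (assumption, not real data).
toy_sequence = "MKTAAHEIGHNLGLSHDWW"
print(find_zbcs(toy_sequence))
```

Within a hit, motif offsets 0, 4, and 10 are the three potential zinc ligands and offset 1 is the general base/acid glutamate, which matches the spacing of His348, Glu349, His352, and His358 reported later in the text.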
Fragilysin-3, alias BFT-3 and BFT-Korea, was shown to cleave E-cadherin in HT29/C1 cells similarly to isoforms 1 and 2 (9), but no further biochemical studies have been reported. Orally administered broad-spectrum antibiotics may remove enteropathogens from the gastrointestinal tract but they also affect the beneficial and commensal flora. In the absence of this flora, opportunistic microorganisms may colonize the intestine and lead to severe digestion alterations and gastrointestinal diseases. In addition, ETBF can be resistant to antibiotics such as penicillin, ampicillin, clindamycin, tetracycline, and metronidazole (2). Accordingly, there is a substantiated need for better

Author contributions: F.X.G.-R. designed research; T.G., J.L.A., and F.X.G.-R. performed research; F.X.G.-R. contributed new reagents/analytic tools; T.G. and F.X.G.-R. analyzed data; and T.G., J.L.A., and F.X.G.-R. wrote the paper.

The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

Data deposition: The atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.pdb.org (PDB ID code 3P24).

¹To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1012173108/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1012173108

Results and Discussion

Autolytic vs. Heterolytic Activation. Wild-type profragilysin-3 was recombinantly overexpressed in Escherichia coli and purified to homogeneity (Fig. S1A). Stepwise autolytic processing of profragilysin-3 via various intermediates was observed, which resulted in an apparently stable 21-kDa form (Fig. S1B). This protein started at the same residue, Ala212, as authentic fragilysin-1 purified from natural sources. In contrast, the Glu349Ala mutant of profragilysin-3, which ablated the general base/acid required for catalysis, was more stable over time and only slight autolysis was observed (Fig. S1C). As to the possible implications of such autolysis for in vivo activation, previous studies on profragilysin-1 had shown that mutation of essential residues for structure and function eliminated proteolytic activity of the toxin but not PD processing; i.e., the latter process was exerted by another protease (11). Therefore, autolysis may occur in the purified recombinant proenzyme in vitro only, and a heterolytic activation mechanism would be responsible for activation in vivo. The most likely candidate is trypsin, which is widely expressed in stomach and small intestine, and also in stomach- and colon-cancer cell lines (12, 13). In addition, the final activation cleavage site, Arg211-Ala212, matches trypsin's substrate specificity and, in our hands, limited proteolysis generated mature wild-type fragilysin-3 (indistinguishable from the autolytically activated form) and the mature Glu349Ala mutant, both starting at Ala212. This provided a means of reproducibly obtaining a homogeneous sample for subsequent studies (Fig. S1A). To assess the importance of the N-terminus of the mature peptidase, limited proteolysis experiments were further carried out on the zymogen with three serine proteinases of distinct specificity, which all rendered forms with comparable activity to that of the autolytically processed or trypsin-activated enzyme.

Role of the Prodomain. The fragilysin-3 CD alone was difficult to express in E.
coli even though a number of fusion constructs were assayed. It could be obtained in soluble form with an N-terminal His6-Z-tag. However, after fusion protein removal, the CD was observed to aggregate in size-exclusion chromatography (Fig. S2A). In addition, the protein did not show activity and, contrary to the mature form obtained by tryptic processing, it was completely digested by trypsin (Fig. S2B). Moreover, a comparative thermal shift assay was performed to measure the temperature of midtransition (Tm) of the two CD forms and the proenzyme. The Tm for trypsin-activated fragilysin-3 was ∼5 °C lower than that of profragilysin-3 (Tm values of 51.4 ± 0.9 °C and 56.1 ± 0.1 °C, respectively), revealing the significantly higher stability of the zymogen due to the presence of the PD. In contrast, it was not possible to determine Tm for the directly expressed variant due to the anomalous behavior of the curve (Fig. S2C), which we attribute to a labile conformational state and aggregation. This hypothesis was further confirmed by circular dichroism: the spectrum of directly expressed fragilysin-3 was very similar to that of the trypsin-activated variant at 50 °C; i.e., under conditions of

thermal denaturation of the latter (Fig. S2D). Taken together, all these results indicate that directly expressed fragilysin-3 is unstable and strongly point to a role for the PD, in addition to latency maintenance, as a chaperone that assists in the folding and stabilization of CD, as previously reported for other MP zymogens (14, 15). In the absence of this chaperone, fragilysin-3 CD is not able to fold correctly in the environment provided by the bacterial expression host.

Proteolytic Activity and its Inhibition. Mature wild-type fragilysin-3 cleaved casein, fibrinogen, actin, and fibronectin (Fig. S1 D–G), as well as a casein fluorescein-conjugate, azocollagen, and azocasein, but not azoalbumin in the conditions assayed. In addition, eight standard fluorogenic peptides were analyzed for cleavage, two of which were efficiently cleaved, including a matrix metalloproteinase (MMP) probe, NFF3 (16). The rest were cleaved moderately or not at all (Fig. S1I). Analysis of protein and peptide cleavage fragments revealed broad substrate specificity for fragilysin-3 (see Table S1), and maximal activity was recorded at pH 5.5 (Fig. S1H). As histidine side chains normally tend to be in a doubly protonated state at pH < ∼6 and this species is incompatible with cation binding, this maximum of activity entails that the catalytic zinc ion must be in an overall hydrophobic environment and very firmly bound to the protein. This is in agreement with binding constants exceeding 10¹² M⁻¹ reported for extracellular zinc enzymes (17). Activity was completely abolished by standard zinc-chelating agents and significantly impaired by excess zinc and the broad-spectrum small-molecule MMP inhibitor, CT1746 (18). No inhibition was exerted by inhibitors of serine, cysteine, or aspartic proteases, or by inhibitors of other MPs (Table S2). In contrast, only residual activity was found for the trypsin-processed Glu349Ala mutant against fibrinogen and fluorophore-labeled casein.
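The protonation argument for the pH optimum can be made concrete with the Henderson-Hasselbalch relation. The sketch below assumes a generic side-chain pKa of 6.0 for a solvent-exposed histidine, in line with the "pH < ∼6" statement above; this is an illustrative assumption, not a value measured for fragilysin-3, and the helper name `protonated_fraction` is hypothetical.

```python
def protonated_fraction(ph, pka=6.0):
    """Fraction of a titratable group in its protonated form.

    Uses the Henderson-Hasselbalch relation; the default pKa of 6.0 is an
    assumed generic value for a free histidine side chain.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Compare the reported activity optimum (pH 5.5) with neutral-range values.
for ph in (5.5, 6.0, 7.4):
    print(f"pH {ph}: {protonated_fraction(ph):.2f} protonated")
```

Under this assumption roughly three quarters of free histidine side chains would be doubly protonated at pH 5.5, which illustrates why retention of zinc binding at this pH points to ligands whose effective pKa is shifted by the protein environment.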
Together, these results indicate that fragilysin-3 is an active, broad-spectrum MP in vitro.

Overall Structure of Profragilysin-3. The structure of profragilysin-3 was solved using data to 1.8 Å resolution (see Table S3). The zymogen is a bilobal molecule and shows the PD and the CD as two delimitated globular moieties (Fig. 1A). The PD is constituted by a large twisted antiparallel β-sheet (strands β4-β8, β10, and β9), which vertically traverses the whole domain (obliquely back-to-front in Fig. 1 A and B) and has a concave and a convex side. Preceding strand β4, an N-terminal ∼40-residue segment nestles at the convex side of the β-sheet and folds into a small three-stranded β-sheet of mixed parallel (β1 and β2) and antiparallel (β3) connectivity plus an adjacent antiparallel helix, α1. The latter is inserted roughly coplanar to the sheet between strands β1 and β2. This sheet is orthogonal to the large β-sheet and both give rise to a β-sandwich structure (Fig. 1 A and B). The strands of the large β-sheet are linked by simple vertical connectivity from β4 to β8. Thereafter, the chain enters helix α2, which runs along the front surface, roughly top-to-bottom (Fig. 1A), and flows into a short 3₁₀-helix, η1, before entering the last strand of the sheet, β9, at the front of the molecule. After this strand, the polypeptide folds back and enters the penultimate strand of the sheet, β10, which leads to helix α3, at the domain interface. After this helix, the chain adopts an extended conformation that includes strand β11 and runs across the front surface of the CD (see below). The last residue defined by electron density of PD varies from Pro199 to Pro202 in the four independent molecules found in the crystal asymmetric unit, on the outermost left surface of the catalytic moiety (Fig. 1A). The polypeptide chain is again defined by electron density from Thr210 or Arg211, which features the beginning of the roughly spherical CD.
This domain is divided by the active-site cleft into a large upper subdomain (Thr210/Arg211-Gly355) and a small lower subdomain (Ala356-Asp397). The former consists of a twisted five-stranded β-sheet, whose first four strands from

BIOCHEMISTRY

understanding ETBF targets such as fragilysin and for the design of highly specific antimicrobials to tackle them, and detailed structural information greatly contributes to these aims. To explore the mechanisms of fragilysin activation and activity, we examined the proteolytic capacity of fragilysin-3 in vitro and the X-ray crystal structure of its zymogen, profragilysin-3, which provided a high-resolution scaffold for the design of specific inhibitors. Taken together with the results of phylogenetic studies, these data enabled us to propose a mechanism for latency maintenance and activation of this enterotoxigenic MP, as well as a plausible hypothesis for its evolutionary origin based on xenology (from ξένος, Greek for “stranger” or “alien”), which is gene homology that is the result of horizontal gene transfer (HGT) between species and not of Darwinian evolution.

Fig. 1. Overall structure of profragilysin-3. (A) Richardson-type plot of profragilysin-3 in stereo with helices (labeled α1–α7 and η1) as ribbons and β-strands (β1–β15) as arrows. The polypeptide chain is flexible and thus interrupted between the end of the PD (depicted in magenta) and the beginning of the CD (cyan). The latter is displayed approximately in the standard orientation characteristic for MPs (i.e., with the view into the active-site cleft) that runs from left (nonprimed side) to right (primed side) according to ref. 46. (B) Topology of profragilysin-3 PD displaying the regular secondary structure elements, which comprise strands β1–β11 (arrows) and helices α1–α3 plus η1 (rods). Strands β4 and β11 establish parallel β-sheet interactions with CD strands β13 and β15 (in cyan), respectively. (C) Zinc-binding site of profragilysin-3 with the metal ion in magenta and the coordinating residues from the PD (Asp194) and the CD (His348, His352, and His358). The distance ranges for each bond found in the four molecules of the asymmetric unit of the wild-type structure are depicted.

back to front in Fig. 1A (β13, β12, β14, and β16) run parallel and from left to right, whereas the outermost front strand, β15, runs antiparallel to the previous strands and creates the upper rim of the active-site cleft. Loop Lβ14β15 protrudes from the molecular surface and delimitates the active-site cleft on its primed side. Two helices, the “backing helix” α4 and helix α5, are found on the concave and convex sides of the sheet, respectively. The upper subdomain ends after the “active-site helix” α6, which comprises the first part of the ZBCS and thus the first two zinc-binding histidines, His348 and His352, and the catalytic general base/acid, Glu349. At Gly355, still within the ZBCS, the polypeptide chain undergoes a sharp turn downward and enters the lower subdomain, which only spans 42 residues and mainly comprises the third zinc-binding histidine, His358; the methionine-containing Met-turn (Asp364-Leu365-Met366-Tyr367); and the “C-terminal helix” α7, which forms part of the CD moiety (Fig. 1A). In this way, our structure confirms earlier hypotheses based on homology modeling of fragilysin-1 (19), which suggested that this helix formed part of the domain rather than protruding from it to interact with the eukaryotic membrane, as had been proposed by other authors (20). Interestingly, the polypeptide chain does not finish after helix α7 but adopts a loop structure that lies below the N-terminal segment of the upper subdomain. The chain C-terminus at Asp397 is anchored to the domain moiety through interactions with the side chains of His252 and Arg256 from Lα4β13. It is conceivable that C-terminal deletion mutants disrupt this interaction network and thus impair overall stability and activity of the CD. This is in accordance with previous biochemical studies showing that ablation of the two C-terminal residues of fragilysin-1 greatly reduced activity and eight missing residues even abolished it, giving rise to an unstable molecule (21).

Molecular Base for Zymogenicity. PDs prevent access of substrates to active-site clefts in zymogens. In profragilysin-3, the PD interaction with the CD exhibits good shape complementarity and covers an area of 1,996 Å² at the protein interface. The interaction involves 63 residues and comprises 98 close contacts (see Table S4). In contrast to metalloprocarboxypeptidases of the funnelin tribe of MPs (22) and to pro-MMPs (23), the PD does not cap the CD but is attached to its right lateral surface (Fig. 1A). It prevents access to the active-site cleft through the C-terminal segment, which runs in extended conformation across the entire CD front in the opposite orientation to a substrate. The most important segment for latency is that encompassing α3, Lα3β11, and β11, which traverses the front of the CD from right to left. It establishes a parallel β-sheet interaction on the nonprimed side of the cleft through strand β11 with “upper-rim strand” β15 of the CD (Fig. 2 A and B and Table S4). It also approaches the beginning of helix α5 above the cleft, as well as Lβ14β15, Lβ15β16, Lβ16α6, active-site helix α6, and the segment connecting the Met-turn with C-terminal helix α7. Additional segments involved in PD/CD interaction include the solvent-accessible part of the backing helix α4 of the CD and the concave side of the large PD β-sheet, which contributes with residues from β5-β8 and β10; a parallel β-sheet consisting of strands β13 (CD) and β4 (PD); and segments α3 and Lα3β11 (PD) and helix α5 and loops Lβ15β16, Lα6α7, and Lβ16α4 (CD). A prominent bulge preceding β11 gives rise to a tight 1,4-turn spanning Tyr191-Asp194 (Fig. 2 A and B). This bulge causes the side chains of Tyr191 and Asp194 to penetrate the catalytic moiety whereas the residues embraced, Ile192 and Asn193, point to the bulk solvent.
The aromatic residue occupies the S1′ site of the cleft, which is framed by atoms provided by Ile313, Leu314, Gly344, Val345, His352, His358, Leu365, and Tyr370, as well as by backbone atoms of segment Leu365-Leu374 (Fig. 2B). This is a large pocket (360 Å³) and Tyr191 is a long way from filling it. Although most of the framing side chains are hydrophobic, the pocket could also accommodate short side chains and large hydrophilic residues, which could be bound by the main-chain carbonyl groups of Leu365, Asp364, Tyr367, or Tyr373, or by Thr371 Oγ1. Accordingly, this pocket would be compatible with a broad specificity at this site. Asp194, in turn, coordinates the catalytic zinc ion in substitution of the solvent molecule usually found in mature CDs primed for catalysis (24). The binding is bidentate and exerted from the top by the two terminal carboxylate oxygen atoms, which together with the Nε2 atoms of His348, His352, and His358 give rise to a tetrahedral + 1 coordination sphere of the catalytic zinc (Fig. 1C). In addition, Asp194 Oδ2 points to the Oε1 atom of the general base/acid, Glu349, at 2.69–3.04 Å in the four monomers in the asymmetric unit (Fig. 2B). This implies that one of the two oxygens must be protonated. Downstream of Asp194, Tyr195 most likely occupies cleft subsite S2, framed by His352, Leu319, and His358; and Ile196 is probably in S3, shaped by Trp318, Leu280, Asp277, and Ile316 (Fig. 2 A and B). As observed for S1′, these sites could harbor several types of residue, thus supporting the broad substrate specificity of fragilysin-3 observed in vitro. This inhibitory mechanism follows an “aspartate switch,” in analogy to MMPs and ADAMs/adamalysins, for which the term “cysteine switch” was coined. The name was based on a cysteine Sγ atom replacing the catalytic solvent molecule in the zymogen (25, 26). Such an aspartate-switch mechanism has been described for proastacin (27). In this zymogen, however, the PD is much shorter (34 residues), the zinc-binding aspartate is provided by a wide loop immediately downstream of a prodomain helix that occupies the primed side of the cleft, and no interactions are observed between the PD and the upper-rim strand.

Structural Similarities. We found no overall similarity between the PD and any other structure deposited with the Protein Data Bank (PDB). Only the functionally unrelated bacteriochlorophyll A binding protein from the green sulfur bacterium, Prosthecochloris aestuarii 2K, showed similarity with the central part of the large β-sheet and helix α2 (PDB 3EOJ; (28); Z-score 4.3, rmsd 5.3 Å, 106 common residues; 6% sequence identity) (Fig. S3A). However, in the Prosthecochloris protein, the β-strands are much longer and form part of an overall open-barrel structure with no resemblance to profragilysin-3. In addition, topological relatedness includes only about 50% of PD residues: the small orthogonal β-sheet and helix α1, on the convex side of the large sheet, as well as helices η1 and α3 of PD, do not have structural equivalents in the Prosthecochloris protein. Accordingly, we conclude that profragilysin-3 PD conforms to a new fold, understood as a domain with a structure formed by regular secondary structure elements in an orientation and connectivity not found in previously reported molecules. In contrast, searches with the CD unambiguously identified members of the adamalysin/ADAMs family as close structural relatives of fragilysin-3, with Z-scores of 13–17, rmsd values of 2.5–2.9 Å, and common sequence stretches of 160–166 residues (out of 188 total residues in fragilysin-3 and 200–260 in adamalysin/ADAM catalytic moieties) but only ∼15% sequence identity. Next in similarity was ulilysin, the structural prototype of the pappalysin family, followed by MMPs, serralysins, and astacins, which are all members of the metzincin clan of MPs (7, 8). Fig. S3B depicts the superposition of fragilysin-3 with adamalysin II [PDB 1IAG (29, 30)], which reveals that almost all their regular secondary structure elements colocalize. Both CDs display a large upper subdomain of 3∕4 and a small lower subdomain of 1∕4 of

Fig. 2. Active-site cleft of profragilysin-3. (A) Detail of the active-site cleft of fragilysin-3 superimposed with its Connolly surface colored according to electrostatic potential. PD segment Cys189-Thr198 is shown as a stick model. (B) Close-up view of Fig. 1A in stereo showing the active-site environment, the specificity pocket, and the residues involved. All residues except those already tagged in Fig. 1C are labeled.

the total size. With helix α5, fragilysin-3 has one of the most characteristic elements for adamalysins/ADAMs not present in other metzincin prototypes, the large “adamalysin helix” preceding the central strand of the upper subdomain β-sheet (7). Furthermore, the upper-rim strand and the preceding bulge on top of the cleft on its primed side are very similar in both structures, both in length and conformation. The adamalysin/ADAM family is named after the first family member to be structurally characterized, adamalysin II from Crotalus adamanteus snake venom, and mammalian reproductive-tract proteins (31, 32). The family has 40 members in humans (http://degradome.uniovi.es/met.html#M10) and has only been found in metazoans and some fungi, which include two opportunistic human pathogens, Pneumocystis carinii and Aspergillus fumigatus; and fission yeast, Schizosaccharomyces pombe (7, 33–35). Sequence similarity searches within bacteria and archaea revealed potential adamalysin-like sequences only in Marinobacter aquaeolei and Marinobacter algicola, which, however, have yet to be characterized. Because fragilysins showed only 15% sequence identity with adamalysins/ADAMs, we performed a phylogenetic analysis including four human, two ophidian, and two fungal adamalysins/ADAMs, in addition to the putative Marinobacter relatives and fragilysin-3 (Fig. S3C). This study revealed that the metazoan forms cluster together. Next in divergence would be the two Marinobacter relatives, followed by the fungal forms. Fragilysin-3 would be clearly farthest away in evolution, indicating that fragilysins are not true adamalysins/ADAMs anymore despite the close overall structure due to the action of evolution in a particular environment. Consistent with this separation, there are several subtle differences between fragilysin-3 and adamalysin II (Fig. S3B).

Implications for Fragilysin Isoforms. The sequence identity values indicate that the three profragilysin isoforms have essentially identical structures. They all span 397 residues, and differences in sequence of profragilysin-3 (UniProt O86049) with profragilysin-1 (Q9S5W0) and -2 (O05091) are found at just five positions in PD and at 26 positions in CD (Table S5). Interestingly, almost half of the mutations (12∕26) cluster at the lower subdomain of the CD. All changes are compatible with the current structure as they mostly affect surface-exposed segments. Only three mutations, Asp277Lys, Asn312Lys, and Lys331Glu, which are present only in fragilysin-1, could affect substrate binding and alter distant subsites at both ends of the cleft; i.e., beyond S5 and S3′. In any case, no mutation affects the residues shaping pockets S1 and S1′, which were reported to preferentially accommodate hydrophobic residues in the case of fragilysin-1 (5, 6). As fragilysin-3 shows broad specificity (Table S1), we do not have a structural explanation for these differences. Overall, the conclusions of this structural analysis can be extrapolated to all three isoforms, which is consistent with a common function in vivo.

Conclusions. Until the discovery of fragilysins, the main molecular

determinants of ETBF virulence, no MP had been reported to act as a potent enterotoxin (19). We show here that fragilysin-3 is a functional broad-spectrum MP in vitro with a large S1′ pocket that could be targeted by specific inhibitors to treat ETBF infection and associated diseases. In vivo, this MP is inhibited by a unique PD via an aspartate-switch mechanism until it is secreted to the gut lumen. Fragilysin-3 is one of three isoforms present in ETBF and no further related proteins have been described. They are encoded by a pathogenicity islet, which is absent in nonvirulent B. fragilis strains, that must have been acquired by HGT from an exogenous source (36). HGT may occur through nucleic-acid transduction, conjugation, or transformation, and it contributes to evolution of life complementing the Darwinian tree-based mechanism. It results in the acquisition of xenologs (i.e., homologs that do not result from common ancestor inheritance) and

confers adaptive advantages that can change the relationship of bacterial species with the environment and their pathogenic character (37). Such gene shuffling is common within and across bacteria and archaea and examples include the transfer of resistance plasmids (38). In contrast, documented transfer from bacteria to eukaryotic cells is restricted to a few examples, among them the interaction between Agrobacterium tumefaciens and plant cells, which leads to crown gall disease (39, 40). Lastly, gene transfer from eukaryotes to bacteria is extremely uncommon and only detectable through phylogenetic and comparative genome analyses (41, 42). It is of great potential relevance for bacterial pathogenicity (37, 41). Direct evidence for such HGT does not exist as there is no footprint of such evolutionary processes and, therefore, sequence-independent structural similarity provides a complementary tool to unveil potential cases of xenolog exaptation. In this sense, the CD structure, but not the sequence, of fragilysin-3 strongly resembles adamalysins/ADAMs only and it is conceivable that fragilysins derived from a xenolog of adamalysins/ADAMs co-opted long ago during the intimate coexistence between B. fragilis and mammalian intestinal tracts. Fragilysins would subsequently have evolved in a bacterial environment, thus giving rise to small structural changes and a different protein sequence expressed as three isoforms, and, putatively, to the gene product of mpII. In this context, it has been shown that human colon cell lines express mRNA of ADAM-10, -12, and -15 (43), which could potentially be incorporated by competent intestinal bacteria by transformation and subsequent action of DNA polymerase I, which exerts RNA-dependent DNA polymerase activity in several bacteria (44). In addition, mammalian adamalysin/ADAM CDs are difficult to produce as recombinant proteins and they require PDs that work as intramolecular chaperones (45).
The same is the case for fragilysin-3, which could only be produced in a functional form if fused to its unique and tailor-made PD. All these lines of evidence support a relation between the aforementioned mammalian MPs and fragilysins and, thus, the development of a functional protein that eventually became toxic for the intestinal wall: putatively the origin of its own ancestor.

Materials and Methods

A detailed description of procedures is provided in SI Materials and Methods. Briefly, profragilysin-3 was amplified from genomic B. fragilis DNA and cloned into a modified pET-28a vector for overexpression in E. coli Origami-2 (DE3) cells. The protein was purified by nickel-affinity chromatography, digested with tobacco-etch virus protease to remove the N-terminal hexahistidine tag, and polished by gel filtration. The selenomethionine variant was obtained in the same way, except that cells were grown in minimal medium with selenomethionine replacing methionine. The Glu349Ala mutant was obtained by site-directed mutagenesis and produced as described above. The active wild-type enzyme was obtained by time-dependent autolysis or tryptic limited proteolysis of the zymogen and subsequent gel-filtration purification. Thermal shift and circular dichroism assays, as well as proteolytic activity assays against protein and peptide substrates and inhibitory assays, were performed according to standard protocols. The 1.8-Å crystal structure of profragilysin-3 was solved by single-wavelength anomalous diffraction by using orthorhombic selenomethionine-derivatized crystals obtained by sitting-drop vapor diffusion. Program SHELXD was used to identify all 20 selenium sites of the dimer present in the asymmetric unit and the noncrystallographic symmetry operator could be derived. Subsequent phasing with SHELXE and density modification with DM under 2-fold averaging rendered an electron density map that enabled manual tracing of roughly 3∕4 of each protomer.
These coordinates were refined with REFMAC5 and used to solve the structure of monoclinic native crystals, which contained a tetramer per asymmetric unit, by Patterson search with PHASER. A subsequent run with ARP/wARP rendered an excellent electron density map. Thereafter, manual model building alternated with crystallographic refinement until the model was complete. Phylogenetic analyses were performed with MULTALIN and PHYLIP after optimal superposition of the available crystal structures to derive a sequence alignment.
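The last step above derives a sequence alignment from superposed structures and feeds it into a tree-building program. As a rough illustration of how such an alignment can be turned into pairwise distances, the sketch below computes fractional identity over ungapped columns and converts it to a simple distance. This is not the algorithm used by MULTALIN or PHYLIP; the function names and the toy sequences are hypothetical and serve only to show the arithmetic.

```python
from __future__ import annotations

from itertools import combinations


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    return sum(x == y for x, y in compared) / len(compared)


def distance_matrix(alignment: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise distances defined as 1 - identity (a deliberately simple measure)."""
    return {
        (m, n): 1.0 - pairwise_identity(alignment[m], alignment[n])
        for m, n in combinations(alignment, 2)
    }


# Hypothetical toy alignment, NOT taken from the paper.
toy = {
    "seqA": "HEMGH-NLGM",
    "seqB": "HELGHSNLGM",
    "seqC": "HEIAH-ALGL",
}

for pair, dist in distance_matrix(toy).items():
    print(pair, round(dist, 3))
```

A distance table of this kind is the sort of input a distance-based tree method consumes; real analyses typically apply a substitution model rather than raw identity.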

ACKNOWLEDGMENTS. We thank M. J. Avila-Campos (Brazil) for providing B. fragilis genomic DNA, J. L. Reymond (Switzerland) for providing fluorogenic peptides, T. Guevara for excellent laboratory assistance, Robin Rycroft for helpful suggestions to the manuscript, and the Crystallography Platform (PAC) at Barcelona Science Park (PCB) for assistance during crystallization experiments. This study was supported in part by grants from European, Spanish, and Catalan agencies (FP7-HEALTH-F3-2009-223101 “AntiPathoGN,” FP7-HEALTH-2010-261460 “Gums&Joints,” BIO2008-04080E, BIO2009-10334, CSD2006-00015, PSE-010000-2009-8, and 2009SGR1036). We acknowledge the help provided by European Molecular Biology Laboratory (EMBL) at Grenoble, France and European Synchrotron Radiation Facility (ESRF) synchrotron local contacts. Funding for data collection was provided in part by ESRF.

1. Rabizadeh S, Sears C (2008) New horizons for the infectious diseases specialist: How gut microflora promote health and disease. Curr Infect Dis Rep 10:92–98.
2. Sears CL (2009) Enterotoxigenic Bacteroides fragilis: A rogue among symbiotes. Clin Microbiol Rev 22:349–369.
3. Holton J (2008) Enterotoxigenic Bacteroides fragilis. Curr Infect Dis Rep 10:99–104.
4. Myers LL, Firehammer BD, Shoop DS, Border MM (1984) Bacteroides fragilis: A possible cause of acute diarrheal disease in newborn lambs. Infect Immun 44:241–244.
5. Moncrief JS, et al. (1995) The enterotoxin of Bacteroides fragilis is a metalloprotease. Infect Immun 63:175–181.
6. Vines RR, Wilkins TD (2004) Handbook of Proteolytic Enzymes, eds AJ Barrett, ND Rawlings, and JF Woessner, Jr (Elsevier, London), pp 588–591.
7. Gomis-Rüth FX (2003) Structural aspects of the metzincin clan of metalloendopeptidases. Mol Biotechnol 24:157–202.
8. Gomis-Rüth FX (2009) Catalytic domain architecture of metzincin metalloproteases. J Biol Chem 284:15353–15357.
9. Chung GT, et al. (1999) Identification of a third metalloprotease toxin gene in extraintestinal isolates of Bacteroides fragilis. Infect Immun 67:4945–4949.
10. Wu S, Rhee KJ, Zhang M, Franco A, Sears CL (2007) Bacteroides fragilis toxin stimulates intestinal epithelial cell shedding and γ-secretase-dependent E-cadherin cleavage. J Cell Sci 120:1944–1952.
11. Franco AA, Buckwold SL, Shin JW, Ascon M, Sears CL (2005) Mutation of the zinc-binding metalloprotease motif affects Bacteroides fragilis toxin activity but does not affect propeptide processing. Infect Immun 73:5273–5277.
12. Koshikawa N, et al. (1998) Expression of trypsin by epithelial cells of various tissues, leukocytes, and neurons in human and mouse. Am J Pathol 153:937–944.
13. Miyata S, et al. (1999) Expression of trypsin in human cancer cell lines and cancer tissues and its tight binding to soluble form of Alzheimer amyloid precursor protein in culture. J Biochem 125:1067–1076.
14. Khan AR, James MN (1998) Molecular mechanisms for the conversion of zymogens to active proteolytic enzymes. Protein Sci 7:815–836.
15. Marie-Claire C, Roques BP, Beaumont A (1998) Intramolecular processing of prothermolysin. J Biol Chem 273:5697–5701.
16. Nagase H, Fields CG, Fields GB (1994) Design and characterization of a fluorogenic substrate selectively hydrolyzed by stromelysin 1 (matrix metalloproteinase-3). J Biol Chem 269:20952–20957.
17. Fraústo da Silva JJR, Williams RJP (2001) The Biological Chemistry of the Elements: The Inorganic Chemistry of Life (Oxford University Press, New York).
18. Chander SK, et al. (1995) An in vivo model for screening peptidomimetic inhibitors of gelatinase A. J Pharm Sci 84:404–409.
19. Obiso RJJ, Bevan DR, Wilkins TD (1997) Molecular modeling and analysis of fragilysin, the Bacteroides fragilis toxin. Clin Infect Dis 25(Suppl. 2):S153–S155.
20. Saidi RF, Sears CL (1996) Bacteroides fragilis toxin rapidly intoxicates human intestinal epithelial cells (HT29/C1) in vitro. Infect Immun 64:5029–5034.
21. Sears CL, Buckwold SL, Shin JW, Franco AA (2006) The C-terminal region of Bacteroides fragilis toxin is essential to its biological activity. Infect Immun 74:5595–5601.
22. Gomis-Rüth FX (2008) Structure and mechanism of metallocarboxypeptidases. Crit Rev Biochem Mol Biol 43:319–345.
23. Tallant C, Marrero A, Gomis-Rüth FX (2010) Matrix metalloproteinases: Fold and function of their catalytic domains. BBA-Mol Cell Res 1803:20–28.
24. Auld DS (2004) Handbook of Proteolytic Enzymes, eds AJ Barrett, ND Rawlings, and JF Woessner, Jr (Elsevier Academic Press, London), pp 268–289.
25. Springman EB, Angleton EL, Birkedal-Hansen H, Van Wart HE (1990) Multiple modes of activation of latent human fibroblast collagenase: Evidence for the role of a Cys73 active-site zinc complex in latency and a “cysteine switch” mechanism for activation. Proc Natl Acad Sci USA 87:364–368.
26. Rosenblum G, et al. (2007) Molecular structures and dynamics of the stepwise activation mechanism of a matrix metalloproteinase zymogen: Challenging the cysteine switch dogma. J Am Chem Soc 129:13566–13574.
27. Guevara T, et al. (2010) Proenzyme structure and activation of astacin metallopeptidase. J Biol Chem 285:13958–13965.
28. Tronrud DE, Wen J, Gay L, Blankenship RE (2009) The structural basis for the difference in absorbance spectra for the FMO antenna protein from various green sulfur bacteria. Photosynth Res 100:79–87.
29. Gomis-Rüth FX, et al. (1994) Refined 2.0 Å X-ray crystal structure of the zinc-endopeptidase adamalysin II. Primary and tertiary structure determination, refinement, molecular structure and comparison with astacin, collagenase and thermolysin. J Mol Biol 239:513–544.
30. Gomis-Rüth FX, Meyer EF, Kress LF, Politi V (1998) Structures of adamalysin II with peptidic inhibitors. Implications for the design of tumor necrosis factor α convertase inhibitors. Protein Sci 7:283–292.
31. Gomis-Rüth FX, Kress LF, Bode W (1993) First structure of a snake venom metalloproteinase: A prototype for matrix metalloproteinases/collagenases. EMBO J 12:4151–4157.
32. Schlondorff J, Blobel CP (1999) Metalloprotease-disintegrins: Modular proteins capable of promoting cell-cell interactions and triggering signals by protein-ectodomain shedding. J Cell Sci 112:3603–3617.
33. Kennedy CC, Kottom TJ, Limper AH (2009) Characterization of a novel ADAM protease expressed by Pneumocystis carinii. Infect Immun 77:3328–3336.
34. Nakamura T, Abe H, Hirata A, Shimoda C (2004) ADAM family protein Mde10 is essential for development of spore envelopes in the fission yeast Schizosaccharomyces pombe. Eukaryot Cell 3:27–39.
35. Lavens SE, Rovira-Graells N, Birch M, Tuckwell D (2005) ADAMs are present in fungi: Identification of two novel ADAM genes in Aspergillus fumigatus. FEMS Microbiol Lett 248:23–30.
36. Franco AA, et al. (1999) Molecular evolution of the pathogenicity island of enterotoxigenic Bacteroides fragilis strains. J Bacteriol 181:6623–6633.
37. Ochman H, Lawrence JG, Groisman EA (2000) Lateral gene transfer and the nature of bacterial innovation. Nature 405:299–304.
38. Macrina FL, Archer GL (1993) Bacterial Conjugation, ed DB Clewell (Plenum, New York), pp 313–329.
39. Sprague GF, Jr (1991) Genetic exchange between kingdoms. Curr Opin Genet Dev 1:530–533.
40. Gomis-Rüth FX, Solà M, de la Cruz F, Coll M (2004) Coupling factors in macromolecular type-IV secretion machineries. Curr Pharm Design 10:1551–1565.
41. Koonin EV, Makarova KS, Aravind L (2001) Horizontal gene transfer in prokaryotes: Quantification and classification. Annu Rev Microbiol 55:709–742.
42. Doolittle WF (1999) Lateral genomics. Trends Cell Biol 9:M5–M8.
43. Charrier L, et al. (2005) ADAM-15 inhibits wound healing in human intestinal epithelial cell monolayers. Am J Physiol-Gastr L 288:G346–G353.
44. Harada F, et al. (2005) RNA-dependent DNA polymerase (RT) activity of bacterial DNA polymerases. Bull Osaka Med Coll 51:35–41.
45. Gonzales PE, Galli JD, Milla ME (2008) Identification of key sequence determinants for the inhibitory function of the prodomain of TACE. Biochemistry 47:9911–9919.
46. Abramowitz N, Schechter I, Berger A (1967) On the size of the active site in proteases. II. Carboxypeptidase-A. Biochem Bioph Res Co 29:862–867.

Goulas et al. PNAS ∣ February 1, 2011 ∣ vol. 108 ∣ no. 5 ∣ 1861
BIOCHEMISTRY

Supporting Information
Goulas et al. 10.1073/pnas.1012173108

SI Materials and Methods

Protein Production and Purification. Genomic DNA from Bacteroides fragilis strain ATCC 43858 was used as a template for PCR amplification of the coding sequence of profragilysin-3 (UniProt O86049) by using PfuTurbo DNA polymerase (Stratagene) and oligonucleotides 5′-CATTATCCATGGCATGTTCTAATGAAG-3′ (forward primer) and 5′-CTCTCTCGAGATCTAATCGCCATCTG-3′ (reverse primer; both synthesized by Sigma), which included a NcoI and a XhoI restriction site (underlined), respectively. The 1,140-bp amplicon comprised the coding sequence of the proprotein starting at Ala18; i.e., without the signal peptide as suggested by SIGNALP (1). The amplicon was cloned into a modified pET-28a expression vector (2), which attaches an N-terminal His6-tag and a tobacco-etch virus protease cleavage site. The gene sequence encoding mature fragilysin-3 CD was amplified by PCR from the aforementioned vector with forward primer 5′-CTTCCATGGCAGTACCTTCTGAAC-3′ (encoding a NcoI restriction site) and the same reverse primer as for profragilysin-3. The 558-bp amplicon was cloned in a modified pETM30 vector, which attaches an N-terminal His6-Z-tag plus a tobacco-etch virus protease cleavage site. In both constructs, the cloning strategy entailed that a tripeptide of sequence (Gly-3)–(Pro-2)–(Met-1) and (Gly-3)–(Ala-2)–(Met-1), respectively, preceded the first protein residue after removal of the tags with tobacco-etch virus protease.

Profragilysin-3 Glu349Ala mutant (numbering according to the full pre-pro-protein sequence, UniProt O86049) was obtained by site-directed mutagenesis of the profragilysin-3 expression vector according to (3). The mutagenic forward and reverse primers were, respectively, 5′-GTGCTAACCATGCGGATGATCCAAAAG-3′ and 5′-CCAATATATGACCTAGTGCGTGTGCCATAAC-3′. Reactions with restriction endonucleases (Fermentas), T4 polynucleotide kinase (Promega), and T4 DNA ligase (Invitrogen) were performed according to manufacturers’ instructions. All constructs were verified by DNA sequencing.
For protein overexpression, plasmids were transformed in Escherichia coli Origami-2 (DE3) cells (Novagen), which were grown in Luria–Bertani medium supplemented with 30 μg mL⁻¹ kanamycin. After initial growth of the culture at 37 °C to an OD600 nm ≈ 0.6, the culture was cooled to 18 °C and protein expression was induced with 0.4 mM isopropyl-β-D-1-thiogalactopyranoside for 18–20 h. The selenomethionine variant of profragilysin-3 was obtained in the same way except that cells were grown in minimal medium supplemented with amino acids, with selenomethionine (Sigma) replacing methionine. Pelleted cells from the cultures were washed in buffer A (50 mM Tris-HCl, 500 mM NaCl, 20 mM imidazole, pH 7.4), harvested, and resuspended in buffer A, which contained EDTA-free Protease Inhibitor Cocktail tablets (Roche Applied Science) and DNase I (Roche Applied Science). Cells were broken with a cell disrupter (Constant Cell Disruption Systems) at a pressure of 1.35 kbar and nondisrupted cells and cell debris were removed by centrifugation at 35,000 × g for 45 min in a Sorvall centrifuge. The supernatant was filtered (0.22 μm pore size; Millipore) and incubated at 4 °C for 1 h with nickel-nitrilotriacetic acid resin (Invitrogen) previously equilibrated with buffer A. Subsequently, the sample was applied to a batch purification open column (BioRad) and washed with five column volumes of buffer A. The His6-tagged protein was eluted with buffer A containing 250 mM imidazole. The protein sample was dialyzed overnight at 4 °C against buffer A in the presence of 1 mM 1,4-dithio-DL-threitol (Sigma) and subsequently digested at room temperature for 24 h with His6-tagged tobacco-etch virus protease at an enzyme:substrate ratio of 1:200 (w/w). The digested sample was again passed through the nickel-nitrilotriacetic acid resin column previously equilibrated with buffer A. The eluate was collected, concentrated by ultrafiltration, and further purified by size-exclusion chromatography with a Superdex 75 10/300 GL column (GE Healthcare) previously equilibrated with buffer B (20 mM Tris-HCl, 150 mM NaCl, pH 7.4).

Active fragilysin-3 was obtained by limited tryptic digestion of purified profragilysin-3 at an enzyme:substrate ratio of 1:100 (w/w) at room temperature for 3 h in buffer B. After trypsin inactivation with 1 mg mL⁻¹ Pefabloc SC (Roche), the sample was dialyzed against buffer C (20 mM Tris-HCl, 40 mM NaCl, pH 7.4) overnight at 4 °C and was subjected to anion-exchange chromatography to remove the inactivated serine protease in a MonoQ HR 5/50 column (GE Healthcare) equilibrated with buffer C. Elution was carried out by using a linear gradient from 40 to 250 mM NaCl within 30 column volumes. Protein-containing fractions were pooled, concentrated by ultrafiltration, and subjected to a final size-exclusion chromatography step with a Superdex 75 10/300 GL column equilibrated with buffer B. At all stages of purification, the purity of the protein samples was assessed by 10–15% SDS-PAGE (tricine buffer) stained with Coomassie blue. All ultrafiltration steps were done with Vivaspin 500 and Vivaspin 2 filter devices of 5-kDa cut-off (Sartorius Stedim Biotech). The concentration of the proteins was determined with the BCA protein assay kit (Thermo Scientific) by using bovine serum albumin as a standard. Concentrated protein samples were stored at 4 °C.

Thermal Shift and Circular Dichroism Assays. Aliquots were prepared by mixing 7.5 μL of 300× Sypro Orange dye (Molecular Probes), 5 μL protein solution (6 mg mL⁻¹ in buffer B), and 42.5 μL of buffer B. The samples were analyzed in an iCycler iQ Real Time PCR Detection System (BioRad) by using 96-well PCR plates sealed with optical tape.
Samples were heated from 20 °C to 90 °C at a rate of 1 °C min⁻¹ and the change in fluorescence (λex = 490 nm; λem = 575 nm) was monitored over time. The Tm was determined for profragilysin-3 and trypsin-activated fragilysin-3 CD. Samples for far-UV circular dichroism spectroscopy were prepared by dissolving the protein at a final concentration of 0.25 mg mL⁻¹ in buffer B. Measurements were carried out in a Jasco J-810 spectrometer at 25 °C and 50 °C by using a 2-mm path length cell.

Proteolytic Activity Assays. Proteolytic activity of fragilysin-3 was routinely measured with the fluorescence-based EnzChek assay kit containing BODIPY FL-casein (10 μg mL⁻¹) as a fluorescein conjugate (Invitrogen) at λex = 485 nm and λem = 528 nm by using a microplate fluorimeter (FLx800, Biotek). Reactions were carried out in buffer D (100 mM Tris-HCl, 150 mM NaCl, pH 7.4) at room temperature with a final peptidase concentration of 4.5 μg mL⁻¹. Proteolysis of azo-substrates (at 5 mg mL⁻¹) was assayed by incubating fragilysin-3 (at 11 μg mL⁻¹) with azocoll (Calbiochem), azocasein (Sigma) or azoalbumin (Sigma) in buffer D at 37 °C for up to 24 h. Reaction mixtures with the latter two substrates were quenched with an equal volume of 5% trichloroacetic acid, centrifuged at 13,000 × g for 5 min, and neutralized with an equal volume of 0.5 M NaOH. The enzyme activity was monitored by using a microplate spectrophotometer (PowerWave XS, Biotek) at a wavelength of 520 nm for azocoll and 440 nm for azocasein and azoalbumin. In addition, fragilysin-3 (at 9 μg mL⁻¹) was tested for proteolytic activity on eight fluorogenic substrates (at 10 μM) of sequence: Abz-Lys-Asp-Glu-Ser-Tyr-Arg-K(dnp) (FRET1; for definition of non-amino-acid components, see Table S1), Abz-Thr-Val-Leu-Glu-Arg-Ser-K(dnp) (FRET2), Abz-Asp-Tyr-Val-Ala-Ser-Glu-K(dnp) (FRET3), Abz-Tyr-Gly-Lys-Arg-Val-Phe-K(dnp) (FRET4), Abz-Val-Lys-Phe-Tyr-Asp-Ile-K(dnp) (FRET5), Abz-Gly-Ile-Val-Arg-Ala-K(dnp) (FRET6) (λex = 260 nm and λem = 420 nm) (4, 5); Mca-Pro-Leu-Gly-Leu-Dap(Dnp)-Ala-Arg-NH2 [MMPsub; (6)]; and Mca-Arg-Pro-Lys-Pro-Val-Glu-Nva-Trp-Arg-Lys(Dnp)-NH2 [NFF3 (7)] (λex = 328 nm and λem = 393 nm). The reactions were performed in buffer D at room temperature and monitored for up to 24 h in a microplate fluorimeter (Infinite M200, TECAN). Finally, proteolytic activity assays against protein substrates (at 1–2 mg mL⁻¹) included human plasma fibrinogen, bovine plasma fibronectin, bovine muscle actin, and bovine milk α-casein (all from Sigma). All reactions were carried out in buffer D at 37 °C for up to 6 h and at an enzyme:substrate ratio of 1:100 (w/w) except for reactions with fibronectin, where the ratio was 1:10 (w/w). Hydrolysis was assessed by 10–15% SDS-PAGE.

Sample Identification. Samples destined for N-terminal sequencing by Edman degradation were analyzed by SDS-PAGE and electroblotted onto an Immun-Blot PVDF membrane (BioRad). Membranes were stained with Coomassie R-250 and air-dried. The bands were cut and analyzed at the Protein Chemistry Facility of the Centro de Investigaciones Biológicas in Madrid (Spain) (http://www.cib.csic.es/en/servicio.php?iddepartamento=27). Peptide cleavage sites of fluorogenic substrates were determined by MALDI-TOF fragmentation at the Laboratori de Proteòmica at the Institut de Recerca Hospital Universitari Vall d’Hebron in Barcelona (Spain) (http://www.ir.vhebron.net/easyweb_irvh/Serveis/UCTS/Proteomica/tabid/124/Default.aspx).

pH Optimum and (Autolytic) Activation in Vitro. The pH dependence of fragilysin-3 activity was assessed with the azocoll digestion assay.
Reactions were carried out as described above except for the buffer, which was either 100 mM sodium acetate (pH 4.0, 4.5, 5.0, and 5.5), 100 mM sodium phosphate (pH 6.0, 6.5, and 7.0), or 100 mM Tris-HCl (pH 7.5, 8.0, 8.5, and 9.0). The reaction mixtures were incubated at 37 °C for 6 h and monitored by using a microplate spectrophotometer (PowerWave XS, Biotek). To assess autolysis, samples of wild-type and Glu349Ala-mutant profragilysin-3 (at 25 mg mL⁻¹ in buffer B) were incubated at 37 °C for up to 6 days. Autoproteolysis was assessed by 15% SDS-PAGE. To assess the importance of the N-terminus of the catalytic fragilysin moiety, limited proteolysis of profragilysin-3 was carried out with the serine proteinases α-chymotrypsin from bovine pancreas, subtilisin A from Bacillus licheniformis, and proteinase K from Engyodontium album (all Sigma) at enzyme:substrate ratios of 1:50, 1:500 and 1:100 (w/w), respectively, at room temperature for various times in buffer B. Activating serine proteinases were subsequently inhibited by 1 mg mL⁻¹ Pefabloc and the activity of fragilysin was monitored against BODIPY FL-casein (see above).

Inhibition of Proteolytic Activity in Vitro. Fragilysin-3 (at 4.5 μg mL⁻¹ in 180 μL of buffer D) was incubated for 30 min with different classes of protease inhibitors (see Table S2). Subsequently, 20 μL of BODIPY FL-casein (final substrate concentration 10 μg mL⁻¹) was added to the reaction mixture, which was incubated at room temperature for 1 h. The remaining activity was measured in a microplate fluorimeter as described above.

Crystallization and Structure Analysis. Crystallization assays were performed by the sitting-drop vapor diffusion method. Reservoir solutions were prepared by a Tecan robot and 100-nL crystallization drops were dispensed on 96 × 2-well MRC plates (Innovadyne) by a Cartesian (Genomic Solutions) nanodrop robot at the High-Throughput Crystallography Platform (PAC) of the Barcelona Science Park. No crystals were obtained for trypsin-activated fragilysin-3. In contrast, crystals suitable for structure analysis were obtained for profragilysin-3 in a Bruker steady-temperature crystal farm at 4 °C from equivolumetric drops containing protein solution (at 8–15 mg mL⁻¹ in 20 mM Tris-HCl, 150 mM NaCl, pH 7.4) and 100 mM sodium citrate dihydrate, 20% PEG 3000, pH 5.5 as reservoir solution. These conditions were scaled up to the microliter range with 24-well Cryschem crystallization dishes (Hampton Research). Prism-shaped single crystals of two different types appeared within one week for wild-type, selenomethionine-derivatized, and Glu349Ala-mutant profragilysin-3. A cryocooling protocol was established consisting of successive passages through reservoir solution containing increasing concentrations of glycerol (up to 20%). Complete diffraction datasets were collected from liquid-N2 flash-cryo-cooled crystals at 100 K (provided by an Oxford Cryosystems 700 series cryostream) on ADSC Q315R CCD detectors at beam lines ID23-1 (wild-type and selenomethionine derivative) and ID29 (Glu349Ala mutant) of the European Synchrotron Radiation Facility (ESRF, Grenoble, France) within the Block Allocation Group “BAG Barcelona.” Crystals were either monoclinic (wild-type) or orthorhombic (selenomethionine derivative and Glu349Ala mutant), with four or two molecules per asymmetric unit, respectively. Diffraction data were integrated, scaled, merged, and reduced with programs XDS (8) and SCALA (9) within the CCP4 suite of programs (10) (see Table S3).

The structure of profragilysin-3 was solved by single-wavelength anomalous diffraction by using the selenomethionine derivative and program SHELXD (11) (Table S3). Diffraction data of a crystal collected at the selenium absorption-peak wavelength as inferred from a XANES fluorescence scan enabled the program to identify all 20 selenium sites of the dimer present in the asymmetric unit.
Subsequent phasing with SHELXE by using a higher-resolved dataset collected from the same crystal at the inflection-point as pseudonative data resolved the twofold ambiguity intrinsic to a SAD experiment due to the difference in the values of the pseudo-free correlation coefficient (11) of the two possible hands. Visual inspection of the heavy-atom sites on a Silicon Graphics Octane2 Workstation by using program TURBO-Frodo (12) allowed us to divide them into two sets of ten and to determine the noncrystallographic twofold axis that related them, with the help of program LSQKAB within the CCP4 suite (10). This symmetry operator was used in a subsequent density modification step with averaging by using program DM (13). These calculations rendered an electron density map that facilitated manual tracing of 305 residues of each protomer. This initial model was refined against the inflection-point dataset with program REFMAC5 (14), which included TLS refinement, and it was subsequently used as a searching model to solve the native structure with program PHASER (15). Four unambiguous solutions were found, which rendered a global log-likelihood gain of 11,024. Subsequently, a run with program ARP/wARP (16) was performed by starting with phases provided by the appropriately rotated and translated searching models. These calculations rendered an excellent electron density map. Thereafter, manual model building alternated with crystallographic refinement until the model was complete. The final model comprised residues 33–201 and 210–397 (molecule A), 33–200 and 211–397 (molecule B), 33–199 and 211–397 (molecule C), and 33–202 and 211–397 (molecule D). As the four molecules are equivalent, discussion focuses on molecule A unless otherwise stated. Seven residues (out of 1,418) were in disallowed regions of a Ramachandran plot and belonged to exposed surface loops of prodomain (PD) (see Table S3). 
In all four molecules, the linker connecting the PD with the CD was flexible and undefined by electron density for between eight and eleven residues, which could not be traced. In addition, three PD loop regions (Lys144-Glu151, loop Lα2η1; Asp160-Tyr169, Lβ9β1; and Ile183-Ile192, Lα3β11) were flexible, and traced on the basis of weak electron density maps to preserve chain continuity. To examine such disorder/flexibility, the structure of the catalytically impaired Glu349Ala mutant was solved by Patterson search as mentioned above and initially refined with REFMAC5 (Rfactor = 0.240; free Rfactor = 0.259). No significant differences were found with the wild-type structure in the critical regions, so the mutant structure was not further refined. In addition, N-terminal sequence analysis of both wild-type and mutant protein crystals revealed only intact protein starting at residue Gly-3 (see above). Accordingly, these flexible regions are intrinsic to the PD and not due to proteolysis in the crystallization drops. This is reminiscent of metallopeptidase (MP) zymogens of the matrix metalloproteinase family (MMPs), thermolysins, and astacins, which likewise evinced flexible and disordered regions within their PDs (17, 18). Such flexibility may enhance the functional properties: a PD shields the active site but not so tightly as to prevent activation at the appropriate site and time point.
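The Rfactor and free Rfactor quoted above follow the standard crystallographic definition, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, evaluated over the working reflections and over a held-out test set, respectively. The sketch below shows only that arithmetic. It assumes the calculated amplitudes are already on the same scale as the observed ones, and the amplitudes and the function name are hypothetical illustration values, not data from this study or a REFMAC5 interface.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic residual: sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|).

    Assumes f_calc has already been scaled to f_obs (scale factor of 1).
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / denominator


# Hypothetical structure-factor amplitudes for a handful of reflections.
work_obs, work_calc = [100.0, 50.0, 25.0, 25.0], [90.0, 55.0, 30.0, 25.0]
free_obs, free_calc = [80.0, 20.0], [70.0, 25.0]

r_work = r_factor(work_obs, work_calc)  # reflections used in refinement
r_free = r_factor(free_obs, free_calc)  # reflections excluded from refinement
print(f"R = {r_work:.3f}, free R = {r_free:.3f}")
```

Because the free set is never used to adjust the model, a free Rfactor that stays close to the working Rfactor, as with the 0.240 and 0.259 reported for the mutant, is commonly read as a sign that the model is not overfitted.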

Phylogenetic Analysis. Available catalytic-domain structures of four selected human adamalysins/ADAMs and two snake-venom MPs were superimposed by using a graphic display with program TURBO-Frodo onto fragilysin-3 to ascertain the topologically equivalent common structural limits for all proteins [Arg6–Asn191 for adamalysin II; Protein Data Bank (PDB) access code 1IAG; (19)]. In addition, two fungal homologues and two potential bacterial sequences were aligned with the minimal structure sequence of adamalysin II with program MULTALIN (20) to delimit the corresponding sequence stretches of these latter four proteins. In the last step, all eleven sequences were aligned with MULTALIN, which computes parameters for a phylogenetic tree in rfd format. The latter was manually converted to dnd format and plotted as a circular tree with PHYLIP DRAWGRAM at http://mobyle.pasteur.fr/cgi-bin/portal.py?form=drawgram.

Miscellaneous. Figures were prepared with SETOR (21), GRASP (22), and TURBO-Frodo. Structure similarities were investigated with DALI (23). Model validation was performed with MOLPROBITY (24) and the WHATCHECK routine of WHATIF (25). The interaction surface between the prosegment and the mature enzyme moiety was calculated with CNS (26) as half of the surface area buried at the complex interface determined by using a probe radius of 1.4 Å. Close contacts were ascertained with the latter program and the PISA server at http://www.ebi.ac.uk/msd-srv/prot_int/cgi-bin/piserver (Table S4). Interface shape complementarity was computed with SC within CCP4, which rendered a value of 72%, thus indicating a good fit between the interacting surfaces. Pocket-size calculations were performed with CASTP (27). The final coordinates of wild-type profragilysin-3 have been deposited with the Protein Data Bank at www.pdb.org (access code 3P24).
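The definition of the interaction surface as half of the buried surface area can be written out explicitly. The sketch below assumes the usual convention that the buried area is the sum of the solvent-accessible areas of the two isolated components minus that of the complex. The function name and the numerical areas are hypothetical placeholders, not values from this study, and the code does not reproduce how CNS computes accessible surface.

```python
def interface_area(asa_prodomain, asa_catalytic, asa_complex):
    """Interface area in Å^2, taken as half of the total buried surface.

    buried = ASA(prodomain alone) + ASA(catalytic domain alone) - ASA(complex)
    """
    buried = asa_prodomain + asa_catalytic - asa_complex
    if buried < 0:
        raise ValueError("complex cannot expose more surface than its separated parts")
    return buried / 2.0


# Hypothetical solvent-accessible surface areas (probe radius 1.4 Å), in Å^2.
area = interface_area(asa_prodomain=9000.0, asa_catalytic=10000.0, asa_complex=15000.0)
print(f"interface area: {area:.0f} Å^2")
```

Halving the buried area attributes an equal share of the lost surface to each partner, which makes values comparable across complexes of different size.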

1. Emanuelsson O, Brunak S, von Heijne G, Nielsen H (2007) Locating proteins in the cell using TargetP, SignalP, and related tools. Nat Protoc 2:953–971.
2. Marrero A, Mallorquí-Fernández G, Guevara T, García-Castellanos R, Gomis-Rüth FX (2006) Unbound and acylated structures of the MecR1 extracellular antibiotic-sensor domain provide insights into the signal-transduction system that triggers methicillin resistance. J Mol Biol 361:506–521.
3. Hemsley A, Arnheim N, Toney MD, Cortopassi G, Galas DJ (1989) A simple method for site-directed mutagenesis using the polymerase chain reaction. Nucleic Acids Res 17:6545–6551.
4. Yongzheng Y, Reymond JL (2005) Protease profiling using a fluorescent domino peptide cocktail. Mol Biosyst 1:57–63.
5. Cotrin SS, et al. (2004) Positional-scanning combinatorial libraries of fluorescence resonance energy transfer peptides to define substrate specificity of carboxydipeptidases: assays with human cathepsin B. Anal Biochem 335:244–252.
6. Knight CG, Willenbrock F, Murphy G (1992) A novel coumarin-labelled peptide for sensitive continuous assays of the matrix metalloproteinases. FEBS Lett 296:263–266.
7. Nagase H, Fields CG, Fields GB (1994) Design and characterization of a fluorogenic substrate selectively hydrolyzed by stromelysin 1 (matrix metalloproteinase-3). J Biol Chem 269:20952–20957.
8. Kabsch W (2001) International Tables for Crystallography Volume F: Crystallography of Biological Macromolecules, eds Rossmann MG, Arnold E (Kluwer Academic, The Netherlands), pp 730–734.
9. Evans P (2006) Scaling and assessment of data quality. Acta Crystallogr D 62:72–82.
10. CCP4 (1994) The CCP4 suite: Programs for protein crystallography. Acta Crystallogr D 50:760–763.
11. Sheldrick GM (2002) Macromolecular phasing with SHELXE. Z Kristallogr 217:644–650.
12. Carranza C, Inisan A-G, Mouthuy-Knoops E, Cambillau C, Roussel A (1999) AFMB Activity Report 1996–1999 (CNRS-UPR 9039, Marseille), pp 89–90.
13. Cowtan KD, Main P (1996) Phase combination and cross validation in iterated density-modification calculations. Acta Crystallogr D 52:43–48.
14. Murshudov GN, Vagin AA, Dodson EJ (1997) Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr D 53:240–255.
15. McCoy AJ, et al. (2007) Phaser crystallographic software. J Appl Crystallogr 40:658–674.
16. Perrakis A, Sixma TK, Wilson KS, Lamzin VS (1997) wARP: Improvement and extension of crystallographic phases by weighted averaging of multiple refined dummy atomic models. Acta Crystallogr D 53:448–455.
17. Tallant C, Marrero A, Gomis-Rüth FX (2010) Matrix metalloproteinases: fold and function of their catalytic domains. BBA-Mol Cell Res 1803:20–28.
18. Guevara T, et al. (2010) Proenzyme structure and activation of astacin metallopeptidase. J Biol Chem 285:13958–13965.
19. Gomis-Rüth FX, et al. (1994) Refined 2.0 Å X-ray crystal structure of the zinc-endopeptidase adamalysin II. Primary and tertiary structure determination, refinement, molecular structure and comparison with astacin, collagenase, and thermolysin. J Mol Biol 239:513–544.
20. Corpet F (1988) Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res 16:10881–10890.
21. Evans SV (1993) SETOR: Hardware lighted three-dimensional solid model representations of macromolecules. J Mol Graphics 11:134–138.
22. Nicholls A, Bharadwaj R, Honig B (1993) GRASP: Graphical representation and analysis of surface properties. Biophys J 64(2):A166–A166.
23. Holm L, Kaariainen S, Wilton C, Plewczynski D (2006) Using Dali for structural comparison of proteins. Curr Protoc Bioinformatics Chapter 5:Unit 5 5.
24. Davis IW, et al. (2007) MolProbity: All-atom contacts and structure validation for proteins and nucleic acids. Nucleic Acids Res 35:W375–W383 (Web server issue).
25. Vriend G (1990) WHAT IF: A molecular modelling and drug design program. J Mol Graph 8:52–56.
26. Brünger AT, et al. (1998) Crystallography & NMR System: A new software suite for macromolecular structure determination. Acta Crystallogr D 54:905–921.
27. Dundas J, et al. (2006) CASTp: Computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues. Nucleic Acids Res 34:W116–W118 (Web Server issue).
28. Tronrud DE, Wen J, Gay L, Blankenship RE (2009) The structural basis for the difference in absorbance spectra for the FMO antenna protein from various green sulfur bacteria. Photosynth Res 100:79–87.
29. Gomis-Rüth FX, Kress LF, Bode W (1993) First structure of a snake-venom metalloproteinase: A prototype for matrix metalloproteinases/collagenases. EMBO J 12:4151–4157.
30. Gomis-Rüth FX, Meyer EF, Kress LF, Politi V (1998) Structures of adamalysin II with peptidic inhibitors. Implications for the design of tumor necrosis factor α convertase inhibitors. Protein Sci 7:283–292.
31. Gomis-Rüth FX (2003) Structural aspects of the metzincin clan of metalloendopeptidases. Mol Biotech 24:157–202.

Goulas et al. www.pnas.org/cgi/doi/10.1073/pnas.1012173108


Fig. S1. (Pro)fragilysin-3 studies in vitro. (A) SDS-PAGE of purified profragilysin-3 (lane 1) and trypsin-activated fragilysin-3 (lane 2) with the respective N-terminal residues. (B) Autolytic cleavage of profragilysin-3 at 37 °C over time. The cleavage sites after 3 days are shown in the framed close-up view. (C) Profragilysin-3 Glu349Ala mutant over time. (D)–(G) Proteolysis by fragilysin-3 of bovine milk α-casein (D), human plasma fibrinogen (E), bovine muscle actin (F), and human plasma fibronectin (G). (H) Activity dependence on pH as determined with azocoll as substrate. Values are represented as means and standard deviations of three independent experiments. (I) Proteolysis of fluorogenic substrates (see above for sequences). Substrates NFF3 and FRET5 were cleaved best, whereas the rest were cleaved only moderately (FRET2 and FRET4) or not at all.


Fig. S2. Conformational analysis of (pro)fragilysin-3. (A) Trypsin-activated fragilysin-3 (black line) and the directly expressed variant (gray line) were loaded onto a gel-filtration column, showing different elution times. (B) Trypsin-activated fragilysin-3 and the directly expressed variant (1 and 2, respectively) were incubated without (−) and with (+) trypsin at a 1:100 (w/w) ratio at room temperature for 2 h. (C) Thermal shift curves for profragilysin-3 (dashed line), trypsin-activated fragilysin-3 (black line) and the directly expressed variant (gray line). (D) Circular dichroism spectra of trypsin-activated fragilysin-3 at 25 °C (black line) and 50 °C (dashed line) and of the directly expressed variant (gray line).


Fig. S3. Structural similarities and phylogenetic studies. (A) Superimposition in stereo of profragilysin-3 PD (lilac) with bacteriochlorophyll A protein from Prosthecochloris aestuarii 2K [cyan; PDB 3EOJ (28)]. (B) Superimposition in stereo of fragilysin-3 CD (cyan) onto adamalysin II (orange) from Crotalus adamanteus [PDB 1IAG; (19, 29, 30)] in standard orientation. Diverging regions between both structures include: ① the N-termini are arranged distinctly due to the unique C-terminal extension after helix α7 in the bacterial enzyme; ② the backing helix is one turn longer in adamalysin II and preceded by a bulge, which is absent in the bacterial enzyme; ③ the loop connecting the outermost strand of the five-stranded β-sheet with the adamalysin helix is longer in the ophidian enzyme and includes a short helix present in all adamalysin/ADAM structures, which is absent in fragilysin-3; ④ the loop connecting the upper-rim strand with the last strand of the β-sheet is longer in fragilysin-3 and it opens to occupy the space of the C-terminal stretch after the last helix of the lower subdomain in adamalysin II; ⑤ the active-site helix starts two helical turns earlier in the ophidian enzyme; and ⑥ with seven residues separating the third zinc-binding histidine (His358) and the Met-turn methionine (Met366), fragilysin-3 better matches MMPs within this loop than adamalysin II, which has 13 residues. However, the latter finding should not be overemphasized as the length and conformation of this segment vary in adamalysins/ADAMs (31). (C) Circular phylogenetic tree reflecting evolutionary distances between B. fragilis fragilysin-3 (this work) and selected adamalysins/ADAMs, which include human ADAM-17 (PDB 1BKC), ADAM-33 (PDB 1R54), ADAMTS-1 (PDB 2JIH), and ADAMTS-4 (PDB 2RJP); snake-venom MPs adamalysin II from Crotalus adamanteus (PDB 1IAG) and VAP-1 from Crotalus atrox (PDB 2ERO); biochemically characterized fungal relatives, Schizosaccharomyces pombe mde10 (UniProt O13766) and Pneumocystis carinii ADAM (UniProt A3QZA9); and putative bacterial sequence relatives from Marinobacter aquaeolei (UniProt A1U5B6) and Marinobacter algicola (UniProt A6EZB8). No structure is available for the last four proteins.


Table S1. Fragilysin-3 cleavage sites

Substrate                 Sequence around the scissile bond (✂)
Autolytic cleavage I      M–A–C–S ✂ N–E–A–D
Autolytic cleavage II     A–D–S–L ✂ T–T–S–I
Autolytic cleavage III    S–L–T–T ✂ S–I–D–A
Autolytic cleavage IV     L–T–T–S ✂ I–D–A–P
Autolytic cleavage V      I–T–E–S ✂ Q–T–R–A
Autolytic cleavage VI     S–Q–T–R ✂ A–V–P–S
α-Casein                  E–Q–K–Y ✂ I–Q–K–E
Fibrinogen                G–K–E–K ✂ V–T–S–G
FRET5                     Abz–V–K–F–Y ✂ D–I–K(dnp)
FRET2                     Abz–T–V–L–E–R ✂ S–K(dnp)
NFF3                      Mca–R–P–K–P–V–E–O ✂ W–R–K(dnp)

Mca stands for 7-methoxycoumarin-4-acetyl, Abz for aminobenzoyl, O for norvaline, dnp for 2,4-dinitrophenylamino, and dap for L-diaminopropionyl.

Table S2. Inhibition of fragilysin-3 activity

Inhibitor                Concentration   Specificity                              % Relative activity
None                     –               –                                        100
Bovine lung aprotinin    0.3 μM          Serine proteases                         97
PMSF                     1 mM            Serine proteases                         82
Benzamidine              5 mM            Serine proteases                         94
Pefabloc                 4 mM            Serine proteases                         88
Iodoacetamide            1 mM            Cysteine proteases                       92
E-64                     10 μM           Cysteine proteases                       94
Pepstatin A              10 μM           Aspartic proteases                       92
1,10-Phenanthroline      5 mM            Metallopeptidases (MPs)                  0
EDTA                     5 mM            MPs                                      0
ZnCl2                    5 mM            MPs                                      18
CT1746                   10 μM           Matrixins (MPs)                          13
Phosphoramidon           10 μM           Thermolysin (MP)                         88
Captopril                1 mM            Angiotensin-converting enzyme (MP)       101
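The pattern in Table S2 can be read off programmatically. The sketch below is illustrative only: the data are transcribed from the table, whereas the short class labels, the function name `strong_inhibitors`, and the 50% cut-off are assumptions introduced here and do not come from the original work.

```python
# Data transcribed from Table S2: (inhibitor, target class, % relative activity).
# The short class labels are shorthand introduced for this sketch.
TABLE_S2 = [
    ("None", None, 100),
    ("Bovine lung aprotinin", "serine", 97),
    ("PMSF", "serine", 82),
    ("Benzamidine", "serine", 94),
    ("Pefabloc", "serine", 88),
    ("Iodoacetamide", "cysteine", 92),
    ("E-64", "cysteine", 94),
    ("Pepstatin A", "aspartic", 92),
    ("1,10-Phenanthroline", "metallo", 0),
    ("EDTA", "metallo", 0),
    ("ZnCl2", "metallo", 18),
    ("CT1746", "metallo", 13),
    ("Phosphoramidon", "metallo", 88),
    ("Captopril", "metallo", 101),
]

# Hypothetical cut-off (% residual activity), chosen for illustration only.
THRESHOLD = 50


def strong_inhibitors(rows, threshold=THRESHOLD):
    """Return (name, class) for compounds leaving less than `threshold` % activity."""
    return [(name, cls) for name, cls, activity in rows if activity < threshold]


hits = strong_inhibitors(TABLE_S2)
for name, cls in hits:
    print(f"{name}: {cls}")
```

Under this assumed cut-off, only 1,10-phenanthroline, EDTA, ZnCl2 and CT1746 are selected, all of which are listed in the table as metallopeptidase-directed.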

Table S3. Crystallographic data

Dataset                                           Native                         Selenomethionine (absorption peak)*   Glu349Ala mutant
Space group                                       P2₁                            P2₁2₁2₁                               P2₁2₁2₁
Cell constants (a, b, c, in Å; β in ° if ≠ 90)    82.74, 69.14, 158.91, 91.57    69.1, 83.2, 160.1                     69.51, 82.27, 158.71
Wavelength (Å)                                    1.0723                         0.9793                                0.9724
No. of measurements/unique reflections            749,668/164,503                119,679/31,601                        474,707/72,415
Resolution range (Å) (outermost shell)†           45.2–1.80 (1.90–1.80)          44.9–2.50 (2.64–2.50)                 42.1–1.90 (2.00–1.90)
Completeness [/Anomalous completeness] (%)        98.9 (98.0)                    97.1 (97.3)/88.8 (85.2)               99.9 (99.9)
Rmerge‡                                           0.075 (0.597)                  0.073 (0.305)                         0.060 (0.631)
Rr.i.m. (= Rmeas)/Rp.i.m.‡                        0.084 (0.714)/0.037 (0.388)    0.095 (0.395)/0.060 (0.248)           0.065 (0.687)/0.025 (0.268)
Average intensity (⟨[⟨I⟩/σ(⟨I⟩)]⟩)                15.9 (2.4)                     15.5 (5.5)                            19.4 (3.6)
B-factor (Wilson) (Å²)/Average multiplicity       24.4/4.6 (3.1)                 35.5/3.8 (3.7)                        29.4/6.6 (6.3)

Refinement
Resolution range used for refinement (Å)                        ∞–1.80
No. of reflections used (test set)                              163,722 (781)
Crystallographic Rfactor (free Rfactor)‡                        0.170 (0.204)
No. of protein atoms§/solvent molecules/ligands/ions            11,395/1147/2 tetraethylene glycol, 6 glycerol, 2 azide/4 zinc
Rmsd from target values
  bonds (Å)/angles (°)                                          0.011/1.27
  bonded B-factors (main-chain/side chain) (Å²)                 0.79/2.07
Average B-factors for protein atoms (Å²)                        17.7
Main-chain conformational angle analysis¶
  Residues in favored regions/outliers/all residues             1381/7/1418

*Friedel-mates were treated as separate reflections.
†Values in parentheses refer to the outermost resolution shell.
‡For definitions, see table 1 in ref. 1.
§Including atoms and residues in alternative conformations.
¶According to MOLPROBITY (2).

1. Mallorquí-Fernández N, et al. (2008) A new autocatalytic activation mechanism for cysteine proteases revealed by Prevotella intermedia interpain A. J Biol Chem 283:2871–2882.
2. Davis IW, et al. (2007) MolProbity: All-atom contacts and structure validation for proteins and nucleic acids. Nucleic Acids Res 35:W375–W383 (Web server issue).
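Table S3 defers to ref. 1 for the definitions of its quality indicators. For orientation only, the expressions below are the forms commonly used in the crystallographic literature; they are assumed here and not copied from ref. 1, and the symbols are introduced for this sketch: I_i(hkl) is the i-th measurement of reflection hkl, ⟨I(hkl)⟩ its mean over n_hkl measurements, F_obs and F_calc the observed and calculated structure-factor amplitudes, and k a scale factor. The block requires the amsmath package.

```latex
\begin{align}
R_{\mathrm{merge}} &=
  \frac{\sum_{hkl}\sum_{i=1}^{n_{hkl}} \left| I_i(hkl) - \langle I(hkl) \rangle \right|}
       {\sum_{hkl}\sum_{i=1}^{n_{hkl}} I_i(hkl)} \\[4pt]
R_{\mathrm{r.i.m.}} = R_{\mathrm{meas}} &=
  \frac{\sum_{hkl} \sqrt{\dfrac{n_{hkl}}{n_{hkl}-1}}
        \sum_{i=1}^{n_{hkl}} \left| I_i(hkl) - \langle I(hkl) \rangle \right|}
       {\sum_{hkl}\sum_{i=1}^{n_{hkl}} I_i(hkl)} \\[4pt]
R_{\mathrm{p.i.m.}} &=
  \frac{\sum_{hkl} \sqrt{\dfrac{1}{n_{hkl}-1}}
        \sum_{i=1}^{n_{hkl}} \left| I_i(hkl) - \langle I(hkl) \rangle \right|}
       {\sum_{hkl}\sum_{i=1}^{n_{hkl}} I_i(hkl)} \\[4pt]
R_{\mathrm{factor}} &=
  \frac{\sum_{hkl} \left| \, |F_{\mathrm{obs}}| - k\,|F_{\mathrm{calc}}| \, \right|}
       {\sum_{hkl} |F_{\mathrm{obs}}|}
\end{align}
```

In this convention the free Rfactor is the same expression as Rfactor, evaluated only over the test-set reflections excluded from refinement. As a consistency check on the table itself, the 163,722 reflections used plus the 781 test-set reflections sum to the 164,503 unique reflections of the native dataset.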


Table S4. Direct interactions between PD and CD of profragilysin-3 (molecule A)

Prodomain       Catalytic moiety    Dist. (Å)

van der Waals interactions
Arg80 Cδ        Lys255 Cδ           3.92
Thr81 Cγ2       Phe262 Cε2          3.79
Ile83 Cγ2       Met241 Sδ           3.81
Leu85 Cδ2       Met241 Cε           3.77
Leu85 Cδ2       Leu264 Cβ           3.93
Ile104 Cγ2      Ser238 Cβ           3.76
Phe109 Cε1      Pro234 Cβ           3.61
Phe109 Cε2      Asn235 Cα           3.59
Val111 Cγ2      Asn235 Cβ           3.71
Tyr139 Cε2      Leu340 Cδ2          3.80
Ile175 Cδ1      Leu340 Cδ2          3.43
Val177 Cγ1      Met341 Cα           3.90
Ala180 Cβ       Met341 Cε           3.65
Cys189 Sγ       Thr308 Cγ2          3.91
Cys189 Cβ       Leu314 Cδ2          3.33
Tyr191 Cδ2      Lys369 C            3.98
Tyr195 Cδ1      Leu319 Cδ2          3.93
Tyr195 Cδ2      His352 Cε1          3.48
Tyr195 Cε2      His358 Cε1          3.68
Ile196 Cδ1      Leu280 Cδ2          3.95
Ile196 Cγ2      Trp318 Cζ3          3.48
Thr198 Cγ2      Trp318 Cε3          3.73

Salt bridges
Asp86 Oδ2       Arg226 Nη2          2.78
Asp86 Oδ1       Arg226 Nη1          2.92
Arg113 Nη2      Asp243 Oδ1          2.93

Metallorganic bonds
Asp194 Oδ1      Zn2+                2.15
Asp194 Oδ2      Zn2+                2.44

Hydrogen bonds
Arg80 Nη2       Asn246 Oδ1          2.98
Arg80 O         Tyr249 Oη           3.09
Thr81 Oγ1       Tyr249 Oη           2.92
Lys82 N         Leu260 O            3.10
Lys82 O         Phe262 N            2.83
Gln84 Oε1       His261 Nε2          2.71
Gln84 N         Phe262 O            2.92
Gln84 O         Leu264 N            2.95
Asp86 N         Leu264 O            2.82
Phe103 O        Gln242 Nε2          3.29
Tyr126 Oη       Asn235 Oδ1          2.59
Tyr126 Oη       Asn235 N            3.22
Tyr139 Oη       Asn235 Nδ2          2.95
Tyr139 Oη       Glu236 Oε2          2.55
Pro187 O        Tyr342 Oη           3.40
Lys181 Nζ       Met341 O            2.82
Cys189 N        Tyr342 Oη           3.16
Asp190 N        Asn312 O            2.85
Asp190 O        Leu314 N            2.86
Asp190 Oδ1      Tyr370 N            2.71
Asp194 O        Ser317 N            3.06
Asp194 Oδ2      His348 Nε2          3.03
Asp194 Oδ2      His352 Nε2          3.44
Asp194 Oδ1      His352 Nε2          3.21
Asp194 Oδ1      His358 Nε2          3.00
Ile196 N        Ser317 O            2.94
Ile196 O        Leu319 N            2.90
Gln200 Nε2      Asn320 Nδ2          3.49

Table S5. Amino-acid variability in profragilysins

Position   Profragilysin-1   Profragilysin-2   Profragilysin-3

Prodomain
32         A                 T                 A
102        S                 S                 N
169        D                 D                 Y
170        I                 L                 P
177        I                 I                 V

Catalytic domain
228        N                 S                 N
232        I                 V                 V
257        Y                 F                 Y
260        F                 L                 L
270        S                 S                 A
275        D                 N                 N
277        K                 D                 D
281        E                 D                 D
289        S                 A                 A
312        K                 N                 N
316        M                 I                 I
319        F                 L                 L
320        N                 D                 N
331        E                 K                 K
357        E                 R                 N
359        T                 A                 A
361        N                 D                 D
362        S                 P                 P
368        A                 S                 S
369        T                 K                 K
370        F                 Y                 Y
375        S                 F                 F
380        K                 E                 K
383        D                 Y                 D
384        I                 R                 I
393        A                 I                 I
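The variability listed in Table S5 can be summarized as pairwise difference counts between the three isoforms. The following is a minimal sketch: the positions and residues are transcribed from the table, while the names `POSITIONS`, `ISOFORMS`, `count_differences`, and `PAIRWISE` are hypothetical helpers introduced here.

```python
from itertools import combinations

# The 31 variable positions of Table S5, in table order.
POSITIONS = [
    32, 102, 169, 170, 177, 228, 232, 257, 260, 270, 275, 277, 281, 289, 312, 316,
    319, 320, 331, 357, 359, 361, 362, 368, 369, 370, 375, 380, 383, 384, 393,
]

# One-letter residues at those positions, transcribed from Table S5.
ISOFORMS = {
    "profragilysin-1": "ASDIINIYFSDKESKMFNEETNSATFSKDIA",
    "profragilysin-2": "TSDLISVFLSNDDANILDKRADPSKYFEYRI",
    "profragilysin-3": "ANYPVNVYLANDDANILNKNADPSKYFKDII",
}


def count_differences(a, b):
    """Count the positions at which two equally long residue strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Guard against transcription slips: every row must cover every position.
assert all(len(seq) == len(POSITIONS) for seq in ISOFORMS.values())

# Pairwise difference counts over the variable positions only.
PAIRWISE = {
    (first, second): count_differences(ISOFORMS[first], ISOFORMS[second])
    for first, second in combinations(ISOFORMS, 2)
}

for (first, second), n_diff in PAIRWISE.items():
    print(f"{first} vs {second}: {n_diff} of {len(POSITIONS)} positions differ")
```

With the residues as transcribed, profragilysin-2 and profragilysin-3 differ at 13 of the 31 listed positions, whereas profragilysin-1 differs from profragilysin-2 at 27 and from profragilysin-3 at 24. These counts cover only the variable positions in the table and are not sequence-identity values for the full-length proteins.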
