PSCD Domains of Pleuralin-1 from the Diatom ... - Cell Press

Article

PSCD Domains of Pleuralin-1 from the Diatom Cylindrotheca fusiformis: NMR Structures and Interactions with Other Biosilica-Associated Proteins

Authors Silvia De Sanctis, Michael Wenzler, Nils Kröger, ..., Eike Brunner, Werner Kremer, Hans Robert Kalbitzer

Correspondence hans-robert.kalbitzer@biologie.uni-regensburg.de

In Brief De Sanctis et al. describe the NMR structure of the PSCD4 domain of pleuralin-1 from Cylindrotheca fusiformis. PSCD4 contains three short helical elements and two binding sites for Ca2+ ions with millimolar affinity, and is stabilized by five unique disulfide bridges. Binding studies show an interaction with native silaffin-1A as well as with α-frustulins.

Highlights
• The structure of the PSCD4 domain in solution was determined by NMR spectroscopy
• PSCD4 binds Ca2+ ions
• NMR shows an interaction of PSCD4 with α-frustulin as well as silaffin-1A
• PSCD domains may connect hypotheca with epitheca by interaction with α-frustulin

De Sanctis et al., 2016, Structure 24, 1178–1191 July 6, 2016 © 2016 Elsevier Ltd. http://dx.doi.org/10.1016/j.str.2016.04.021

Accession Numbers 2MK0

Structure

Article

PSCD Domains of Pleuralin-1 from the Diatom Cylindrotheca fusiformis: NMR Structures and Interactions with Other Biosilica-Associated Proteins

Silvia De Sanctis,1,6 Michael Wenzler,1,3,6 Nils Kröger,2,4 Wilhelm M. Malloni,1 Manfred Sumper,2 Rainer Deutzmann,2 Patrick Zadravec,1 Eike Brunner,1,5 Werner Kremer,1 and Hans Robert Kalbitzer1,*

1Institute of Biophysics and Physical Biochemistry, Centre of Magnetic Resonance in Chemistry and Biomedicine
2Institute of Biochemistry, Microbiology and Genetics
University of Regensburg, 93040 Regensburg, Germany
3Bruker BioSpin AG, 8117 Fällanden, Switzerland
4Department of Chemistry and Food Chemistry, B CUBE Center for Molecular Bioengineering, TU Dresden, 01307 Dresden, Germany
5Bioanalytical Chemistry, Department of Chemistry and Food Chemistry, TU Dresden, 01062 Dresden, Germany
6Co-first author
*Correspondence: [email protected]
http://dx.doi.org/10.1016/j.str.2016.04.021

SUMMARY

Diatoms are eukaryotic unicellular algae characterized by silica cell walls and associated with three unique protein families, the pleuralins, frustulins, and silaffins. The NMR structure of the PSCD4 domain of pleuralin-1 from Cylindrotheca fusiformis contains only three short helical elements and is stabilized by five unique disulfide bridges. PSCD4 contains two binding sites for Ca2+ ions with millimolar affinity. NMR-based interaction studies show an interaction of the domain with native silaffin-1A as well as with α-frustulins. The interaction sites of the two proteins mapped on the PSCD4 structure are contiguous and show only a small overlap. A plausible functional role of pleuralin could be to bind simultaneously silaffin-1A located inside the cell wall and α-frustulin coating the cell wall, thus connecting the interfaces between hypotheca and epitheca at the girdle bands. Restrained molecular dynamics calculations suggest a bead-chain-like structure of the central part of pleuralin-1.

INTRODUCTION

Diatoms are eukaryotic unicellular organisms and constitute the largest group of microalgae (Bacillariophyceae) (Van den Hoek et al., 1993; Sitte et al., 1991). Approximately 20% of the global photosynthetic activity is performed by extant marine diatoms. Diatoms possess a silica-based rigid cell wall (frustule) that acts as a protective armor (Smetacek, 1999). The frustule is made up of two partially overlapping half-shells or thecae (hypotheca and epitheca). Each theca is composed of a plate-like structure (valve) and several strips of silica (girdle bands). The epitheca overlaps the hypotheca only on the first few silica strips. Valves and girdle bands are produced during cell division and interphase, respectively, within dedicated compartments termed silica deposition vesicles (SDVs) (for a review see Kröger and Poulsen, 2008).

It is believed that silica morphogenesis inside the SDVs is guided by an organic matrix composed of proteins (silaffins, silacidins, cingulins) and long-chain polyamines that together accelerate silicic acid polycondensation and act as a template in nanopatterning of the developing silica (for reviews see Kröger and Poulsen, 2008; Sumper and Brunner, 2008). After completion of silica morphogenesis, valves and girdle bands are deposited on the cell surface through exocytosis. The organic SDV components (silaffins, silacidins, cingulins, long-chain polyamines) remain tightly associated with the silica after exocytosis.

Previously it has been demonstrated in the diatom Cylindrotheca fusiformis that after exocytosis, valves and girdle bands become decorated with additional proteins, which include frustulins (Kröger et al., 1994) and pleuralins (formerly called HEPs, Kröger et al., 1997). Frustulins are spread uniformly across the entire cell wall and appear to constitute a protective coat (Kröger et al., 1997). In contrast, pleuralins are specifically located in the girdle-band region of the frustule, where they are confined to the proximal surface of the terminal girdle bands (i.e., pleural bands) of the epitheca (Kröger and Wetherbee, 2000). Although the presence of pleuralins is a defining feature of an epitheca, the exact function of pleuralins in the frustule is not yet clear. So far pleuralins can only be released from pleural bands by treatment with anhydrous hydrogen fluoride, which does not cleave peptide bonds but dissolves silica and cleaves O-glycosidic and O-phosphate ester bonds. Therefore, it was hypothesized that pleuralins are either covalently linked to the silanol groups (Si-OH) of the cell-wall silica, or that the pleuralins are covalently cross-linked with each other or other cell-wall biopolymers through glycosidic bonds and/or phosphodiester bonds (Kröger et al., 1997).

The primary structure of pleuralins is exemplified by pleuralin-1 (formerly HEP200) in Figure 1A. Being a secretory protein, pleuralin-1 carries an N-terminal signal peptide for co-translational import into the ER. Following a short, extremely proline-rich stretch (26 out of 42 residues) are five domains of 87 or 89 amino acids each, in which proline, serine, cysteine, and aspartate comprise more than 40% of the amino acid residues, and

Figure 1. Pleuralin-1 Protein from C. fusiformis (A) General structure of pleuralin-1 protein. (B) Sequence comparison of the five PSCD domains of the pleuralin-1 protein. The sequence identities to the target domain (PSCD4 domain) are highlighted by shaded rectangles while the unaltered positions of the ten cysteine residues are represented by blank boxes. The PSCD4 domain encompasses the residues E372 to P458 that correspond to E16 to P102 in the initial His6PSCD4 construct.

are hence named PSCD domains. The C-terminal part is composed of a 40-kDa non-repetitive domain that is more acidic than the PSCD domains but contains lower amounts of proline (8%), cysteine (6%), and serine (5%).

Previously the nuclear magnetic resonance (NMR) signals of the PSCD4 domain (amino acids T366 to T468) of pleuralin-1 were sequence-specifically assigned (Wenzler et al., 2001). Here, we report the NMR structure of the PSCD4 domain of pleuralin-1 from C. fusiformis, which is almost devoid of canonical secondary structure elements and structurally not related to other known proteins. It comprises only three short helical elements and is stabilized by five unique disulfide bridges. It contains two binding sites for Ca2+ ions located on opposite sides of the 3D structure. NMR-based interaction studies show an interaction of the domain with native silaffin-1A as well as with α-frustulins. A plausible functional role of pleuralin could be to bind simultaneously silaffin-1A located inside the cell wall and α-frustulin coating the cell wall, thus connecting the interfaces between hypotheca and epitheca at the girdle bands. The interaction of α-frustulin with pleuralin is substantially enhanced by binding of divalent ions at Ca2+ concentrations typical for seawater. Because of the high homology of the five PSCD domains of pleuralin-1, it was possible to calculate the structure of the whole PSCD domain by restrained molecular dynamics calculations, which suggest a bead-chain-like structure of the central part of pleuralin-1.

RESULTS

Localization of Disulfide Bridges and Secondary Structure Prediction of the PSCD4 Domain

The present study focuses on the fourth PSCD domain (PSCD4) comprising 87 amino acids. A recombinant protein, His6PSCD4, harboring the PSCD4 domain has been expressed in Escherichia coli and secreted into the periplasmic space (Wenzler et al., 2001). His6PSCD4 is made up of 112 amino acids and contains an N-terminal His tag (nine amino acids), six residues of the PSCD3 domain, and part of the linker (ten amino acids) to the following PSCD5 domain (Figure 2). It contains ten cysteine residues in five disulfide bridges that are formed during the transport of the protein to the periplasmic space.

Specific proteases such as trypsin failed to generate PSCD4 peptides for determination of disulfide bonds. Therefore, the protein was digested with subtilisin overnight (typically using 200 mg of material). Shorter times resulted in incomplete cleavage, indicating that recombinantly expressed PSCD4 is in a native disulfide-bonded state. The peptides were separated by reverse-phase chromatography either before or after reduction with DTT. Those peptides present in the unreduced but not in the reduced run were collected and subjected to further analysis by mass spectrometry (MS) and protein sequence analysis by Edman degradation. Since disulfide-bonded peptides yielded at least two sequences, the peptides were reduced, and the resulting monomers separated by reverse-phase chromatography and identified by Edman degradation. In this way it was shown that Cys24 and Cys94 (peptides DCGEVI and CSPT) and Cys57 and Cys68 (peptides GRPDCNVLPFPN and NIGCPS) were disulfide bridged.

Elucidation of the disulfide bridges between Cys31, -36, -49, -71, -72, and -76 proved to be substantially more difficult as these six cysteine residues were located within a single disulfide-bonded peptide. MS, supported by Edman degradation, revealed the presence of the disulfide-bond-containing peptide A-D, presented in Figure 2 (top left), as was demonstrated through the series of experiments described in the following. A disulfide-bond-containing peptide X with a monoisotopic molecular mass of 3,278.8 Da (m/z 3,279.8) was subjected to Edman degradation in the unreduced form (A + B + C + D), revealing the presence of peptides A, B, C, and D (e.g., cycle 4 was the only cycle that contained the phenylthiohydantoin [PTH] amino acid phenylalanine, indicating the presence of the peptide CCPF, see Figure 2). Tandem MS (MS/MS) fragmentation analysis identified peptide B (SDSARPPDCT) as well as the dimeric peptide B + D as constituents of the unreduced peptide A-D. Upon reduction and alkylation of the cysteine residues by 4-vinylpyridine, the monomeric peptides were separated by

Figure 2. Disulfide Bridges and Hydrogen Bonds in the His6PSCD4 Construct (Top) Fragments obtained from limited proteolysis and analyzed by Edman degradation and MS. (Bottom) The sequence of the His6PSCD4 construct made up of 112 amino acids. Gray, His tag; green, six residues of the PSCD3 domain and ten residues belonging to the linker between the PSCD4 and PSCD5 domains; cyan, disulfide bridges determined by limited proteolysis, Edman degradation, and MS; red, hydrogen bonds detected directly by an HNCO experiment (dashed lines) or by analysis of the NOE and hydrogen exchange patterns (solid lines). See also Figure S3.

reverse-phase chromatography and analyzed by Edman degradation and MS. This allowed for unambiguous identification of the sequences of the four constituent peptides of peptide A-D. However, this set of experiments did not reveal the binding partners of C71 and C72. This question was resolved by combining Edman degradation of the intact peptide X with MS analysis. After subjecting peptide A-D to one cycle of Edman degradation (removal of the N-terminal amino acid), MS revealed two fragments (I and II, see top right of Figure 2) with m/z values of 1,873.6 and 1,197.4, respectively. The identity of fragments I and II was confirmed by MS/MS analysis. This dataset clearly indicates disulfide bonds between C36-C72 and C49-C71. The sequence of purified product I (isolated through reverse-phase chromatography) was confirmed through MALDI-MS and MALDI-MS/MS analysis. Edman degradation gave further proof for the composition of peptides A′ + C′ + D′ (A′, C′, and D′ denote the peptides A, C, and D shortened by one amino acid). Alternative disulfide bonds between C36-C71 and C49-C72 could be ruled out since the corresponding masses m/z = 1,629.7 (A′ + C′ + Cys) and m/z = 1,324.5 (B′ + D′) were not observed in the MS analysis of fragments obtained after one cycle of Edman degradation of peptide A-D. In conclusion, neither the MS analysis nor the NMR spectroscopy gives any indication of S-S-bond patterns in PSCD4 other than those shown in Figure 2.

The disulfide bonds stabilize multiple loops in the structure (Table 1 and Figure 2, bottom). The configuration of the peptide bonds preceding most proline residues could be determined from their chemical-shift differences and/or nuclear Overhauser effect (NOE) patterns (Wüthrich et al., 1984; Schubert et al., 2002). The loop structures between amino acids 24 and 94 are stabilized by disulfide bonds, and all residues where the peptide bond configuration could be determined unambiguously are in trans configuration with the exception of P85, which is in cis configuration (Table S1). This suggests that this range of the structure has a unique folding. Outside this region, in the linker between the fourth and the fifth PSCD domain, two other proline residues (P102 and P110) occur in cis configuration, suggesting some ordered structure for this linker as well. Only P100 in the C-terminal linker region is clearly in equilibrium between trans and cis configuration and leads to two different chemical-shift values for neighboring amino acids such as A106. Such an equilibrium is usually observed for random-coil structures; the corresponding equilibrium constant K is 3.4 in random-coil peptides (Kremer et al., 2004).

The hydrogen bonds determined experimentally from a long-range HNCO experiment or derived from an NOE analysis of slowly exchanging amide protons are indicated in Figure 2. Additional hydrogen bonds could be identified after calculation of the 3D structures. In the present study, the spectral assignment (Wenzler et al., 2001) was further completed. The additional assignments were stored in the Biological Magnetic Resonance Bank (BMRB) database (accession number 4958). In agreement with a compactly folded structural arrangement, hydrogen bonds are observed between amino acids quite distant in the sequence, e.g., main-chain hydrogen bonds between D42 and S86 or between F62 and C71. Two canonical secondary elements were inferred from the chemical shifts: an α helix from C49 to V52 and a β sheet from S70 to E75, as predicted by TALOS+ software (Cornilescu et al., 1999). The chemical-shift index predicts two α helices (one of them is the same helix predicted by TALOS+ while the other is from D34 to C36) and two β sheets, from D79 to F83 and from C94 to L99, respectively (Wishart et al., 1992).
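The quoted random-coil equilibrium constant can be turned into populations with a few lines of arithmetic. The sketch below assumes that K is defined as [trans]/[cis], which the text does not state explicitly, and uses 298 K as an illustrative temperature; the function names are hypothetical.

```python
import math

R = 8.314462618  # molar gas constant in J mol^-1 K^-1


def trans_cis_populations(k: float) -> tuple[float, float]:
    """Return (fraction trans, fraction cis), ASSUMING K = [trans]/[cis]."""
    if k <= 0:
        raise ValueError("equilibrium constant must be positive")
    p_trans = k / (1.0 + k)
    return p_trans, 1.0 - p_trans


def delta_g_kj_per_mol(k: float, temperature_k: float = 298.0) -> float:
    """Free-energy difference of trans relative to cis, -RT ln K, in kJ/mol.

    The default temperature of 298 K is an assumption for illustration.
    """
    return -R * temperature_k * math.log(k) / 1000.0


if __name__ == "__main__":
    p_trans, p_cis = trans_cis_populations(3.4)
    print(f"trans: {p_trans:.1%}, cis: {p_cis:.1%}")
    print(f"dG(trans - cis) = {delta_g_kj_per_mol(3.4):.2f} kJ/mol")
```

Under these assumptions, K = 3.4 corresponds to roughly 77% trans and 23% cis, so both forms are populated enough to give two resolved sets of signals for neighboring residues.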

Table 1. Summary of the Experimental Restraints and Structural Statistics (a)

                                                     His6PSCD4        PSCD4
                                                     (T10–T112)       (E16–P102)
Experimental Restraints
  Disulfide bridges                                  5                5
  3JHNHα couplings                                   9                9
  Dihedral angles (Talos)                            8                8
  RDCs                                               50               43
  Hydrogen bonds                                     12               12
  NOEs                                               826              790
    Intraresidual (i, i)                             335              316
    Interresidual                                    491              474
      Sequential (i, i + 1)                          261              249
      Medium-range (i, i + 2)                        85               82
      Medium-range (i, i + 3)                        45               45
      Medium-range (i, i + 4)                        13               11
      Long-range (i, j; j > i + 4)                   87               87
  Total number of restraints                         910              867
Energies (kJ mol−1)
  Etotal                                             −9,259 ± 228     −8,145 ± 303
  ENOE                                               320 ± 49         316 ± 59
  Ebond                                              273 ± 23         242 ± 16
  Eangle                                             1,399 ± 64       1,347 ± 72
  EvdW                                               −376 ± 103       −292 ± 77
  Eimp                                               651 ± 55         588 ± 86
  Edihe                                              1,913 ± 22       1,751 ± 35
  Eelec                                              −13,449 ± 186    −12,107 ± 276
Violations
  NOEs >0.02 nm                                      8                4
  J coupling >2 Hz                                   0                0
  RDCs                                               0                0
Backbone RMSD (nm)
  All amino acids                                    0.38             0.41
  Folded core                                        0.14             0.08
(φ, ψ) in most favored or allowed regions (%)
  All amino acids                                    94.2             88.7
  Folded core (E346 to N436)                         95.6             96.8

The structures were calculated using only the experimental restraints given. The energies and the violations refer to the lowest-energy structure after water refinement. The RMSD to the mean of the ten lowest-energy structures was calculated with MOLMOL (Koradi et al., 1996), and the quality of the Ramachandran plots after water refinement with PROCHECK (Laskowski et al., 1993). See also Tables S1 and S2.
(a) His6PSCD4 corresponds to all amino acids of the studied construct omitting only the first nine amino acids (His tag) of the expression vector. Note that T10 in the construct corresponds to T366 in pleuralin-1. The disulfide bonds were determined by a combination of limited proteolysis, Edman degradation, and MS. The 3JHNHα coupling constants were detected from the HNCA-E.COSY spectrum (Griesinger et al., 1987). The backbone torsion-angle restraints were predicted by TALOS+. The 1H,15N residual dipolar couplings were obtained from a 1H,15N HSQC spectrum measured in a bicellar system. Hydrogen bonds were detected both directly by an HNCO experiment and by analysis of the NOE and hydrogen exchange patterns. The REFINE routine (Trenner, 2006) was used for automatically generating NOE distance restraints and distance limits from the assigned 2D 1H,1H NOESY spectrum.

3D Structure of the PSCD4 Domain

The 3D structures of the His6PSCD4 construct (103 residues excluding the His tag) and of the PSCD4 domain (87 residues from E16 to P102) were initially calculated using a total of 910 and 867 experimental restraints, respectively (Table 1), with the standard simulated annealing protocol of the CNS program (Brünger and Nilges, 1998). In particular, in the longer construct, apart from 826 NOE distance restraints and five S-S bonds, 12 hydrogen bonds were detected either from a long-range HNCO spectrum or by slow hydrogen-deuterium exchange rates in combination with typical NOE patterns (Figure 2). In addition, 50 residual dipolar couplings from a 1H,15N HSQC (heteronuclear single-quantum coherence) spectrum oriented in a bicelle system, nine 3JHNHα-coupling restraints from an HNCA-E.COSY (exclusive correlation spectroscopy) spectrum, and eight TALOS+ dihedral angle restraints (predicted from the chemical shifts) were also used. The assignment of the NOE spectroscopy (NOESY) cross-peaks was performed semi-automatically as described in Experimental Procedures. The energies and the violations of Table 1 refer to the ten lowest-energy structures after water refinement using only the experimental restraints.

The PERMOL routine (Möglich et al., 2005) was used to obtain 1,069 additional substitute restraints (Table S2) from the simulated structure of the PSCD4 domain. The generation of the substitute restraints (Cano et al., 2009) is described in Experimental Procedures and leads to the definition of 132 dihedral angles, 13 hydrogen bonds, and 919 NOE distances (Table S2). Figure 3 shows the lowest-energy structure of the PSCD4 domain (from E16 to P102) as obtained from a combination of experimental and substitute restraints. Three short α helices were obtained: helix H1 from I33 to C36, helix H2 from T50 to V52, and helix H3 from D56 to N58. Helices H1 and H2 were also predicted by the chemical-shift analysis.

For the whole PSCD4 domain a root-mean-square deviation (RMSD) value of 0.41 nm was obtained for the ten lowest-energy structures simulated using only the experimental restraints. For the better folded part of the structure between amino acids E30 and N80 the RMSD is only 0.08 nm (Table 1). When including both the experimental and the substitute restraints, the RMSD values of the PSCD4 domain are 0.09 and 0.03 nm for the complete domain and the core region, respectively (Table S2). The quality of the obtained structures was checked by PROCHECK and AUREMOL; in particular, the interresidual NMR R factor (Gronwald et al., 2000, 2007) is 0.24, indicating that the NOESY spectra are in good agreement with the NMR structure. The majority of the φ and ψ angles are positioned in the favorable and the allowed regions (88.7%). Note that this percentage increases when considering only the core region (96.8%). The structure of the two linker regions is much less defined, indicating a higher mobility of these regions (see below).
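The restraint totals quoted above (910 and 867) can be cross-checked against the per-category counts in Table 1. The following is a minimal bookkeeping sketch; the dictionary keys and function names are illustrative choices, and the numbers are those given in the text and in Table 1.

```python
# Experimental restraint counts per construct, as listed in Table 1.
EXPERIMENTAL_RESTRAINTS = {
    "His6PSCD4": {"NOE": 826, "disulfide": 5, "hydrogen_bond": 12,
                  "RDC": 50, "3J_HNHa": 9, "TALOS_dihedral": 8},
    "PSCD4": {"NOE": 790, "disulfide": 5, "hydrogen_bond": 12,
              "RDC": 43, "3J_HNHa": 9, "TALOS_dihedral": 8},
}

# NOE restraints broken down by sequence separation, as listed in Table 1.
NOE_BY_RANGE = {
    "His6PSCD4": {"intraresidual": 335, "sequential": 261, "i+2": 85,
                  "i+3": 45, "i+4": 13, "long_range": 87},
    "PSCD4": {"intraresidual": 316, "sequential": 249, "i+2": 82,
              "i+3": 45, "i+4": 11, "long_range": 87},
}


def total_restraints(construct: str) -> int:
    """Sum all experimental restraint categories for one construct."""
    return sum(EXPERIMENTAL_RESTRAINTS[construct].values())


def noe_total(construct: str) -> int:
    """Sum the NOE restraints over all sequence-separation classes."""
    return sum(NOE_BY_RANGE[construct].values())


if __name__ == "__main__":
    for name in EXPERIMENTAL_RESTRAINTS:
        print(name, total_restraints(name), noe_total(name))
```

Both totals and both NOE sums reproduce the values reported in the text, which is a quick way to confirm that the table columns have been read correctly.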


Figure 3. 3D Structure of the PSCD4 Domain of Pleuralin-1 Protein The structure was calculated as described in Experimental Procedures with the experimental restraints (Table 1) and the substitute restraints (Table S2), and was refined in explicit water. The lowest-energy structure from E16 to P102 is shown. Disulfide bonds are depicted in orange, the secondary structures are green, and the electrostatic surface potentials are red (negative charges) and blue (positive charges).

Internal Mobility in His6PSCD4 Construct

From the disulfide patterns (Figure 2) and the quality of the 3D structure obtained (Figure 3), one would expect that the range stabilized by disulfide bonds (amino acids C24 to C94) and some residues close to these borders are part of a folded structure and thus show only a limited internal mobility. The NMR structure of the N and C termini is not well defined: here a higher internal mobility is to be expected. Experimentally, the internal mobility can be estimated by measuring the 15N,1H heteronuclear NOE intensities (Figure 4). Unfortunately, PSCD4 contains 24 proline residues that do not possess an amide proton and thus do not allow the measurement of their 15N,1H NOE. Therefore, only an incomplete picture can be obtained. Indeed, in the range from S22 to L99 a mean heteronuclear NOE of 0.98 is obtained, as is typical for a well-folded protein. From C24 toward the N terminus the NOEs rapidly decrease and become negative, indicating increased internal mobility. A similar pattern is obtained for the C-terminal residues, where after residue S103 negative NOEs are observed, close to −2 for the residues A106, T108, and V109. The last large loop stabilized by disulfide bonds is significantly more mobile than the other part of the core structure, as indicated by smaller but still positive heteronuclear NOEs.

3D Structure of the PSCD Regions of Pleuralin-1

The PSCD domains reveal a very high sequence identity with the experimentally analyzed PSCD4 domain (from 65% of the PSCD5 domain to 90% of the PSCD2 domain). The

Figure 4. Internal Mobility in PSCD4 The sample contained 0.8 mM His6PSCD4 in 10 mM sodium phosphate buffer (pH 6.5). Heteronuclear 15N,1H NOEs (gray bars) were measured at a proton frequency of 500 MHz at 298 K and are plotted as a function of the sequence position. P, proline residues; 0, amino acids where a reliable NOE could not be determined because of peak overlap or insufficient signal-to-noise ratio in the saturated experiment.

PSCD1 and PSCD2 domains have the same sequence length as the PSCD4 domain (87 amino acids) while the third and the fifth domains are composed of 89 residues. The positions of the ten cysteine residues are unaltered among the PSCD domains (Figure 1B). Assuming that these domains have similar structures, the restraints (NOEs, S-S bonds, hydrogen bonds, residual dipolar couplings [RDCs], 3JHNHα couplings, and dihedral angles) of the PSCD4 domain can be transferred to the other domains for all strictly conserved residues (Figure 1B), and additional substitute restraints (Table S2) can be obtained by the AUREMOL routine PERMOL (Möglich et al., 2005). The 3D structures of each PSCD domain were separately calculated by restrained molecular dynamics including both experimental and substitute restraints with a total of 1,513, 1,775, 1,756, and 1,513 restraints for PSCD1, PSCD2, PSCD3, and PSCD5, respectively. In total, 1,024 structures for each domain were calculated using the simulated annealing protocol of CNS coupled with the data imputation algorithm (see Experimental Procedures). Energies, RMSDs, and the qualities of the corresponding Ramachandran plots after refinement of the lowest-energy structure in explicit water are reported in Table S2, and the corresponding structures are shown in Figure 5A. It is clearly seen that the structures of PSCD1 and PSCD2 have a much higher quality than those of PSCD3 and PSCD5, mainly because the sequences of the first two domains are more similar to that of the PSCD4 domain.

In addition, 1,024 structures of the complete PSCD region of pleuralin-1 have been simulated, joining together all the restraints of the single PSCD domains. The structure of the central PSCD region of pleuralin-1 encompassing 494 residues was simulated on the basis of a total of 8,359 (pseudo)-experimental restraints (7,745 NOEs, 25 disulfide bonds, no RDCs, 82 hydrogen bonds, 33 3JHNHα couplings, and 449 dihedral angles). Since the PSCD4 domain experimentally does not show any sign of aggregation at high concentrations (e.g., line broadening), it has been assumed that the same is true for a possible homo- or heteromer interaction between the other PSCD domains. Therefore, 25 inter-domain non-NOEs were added to hold the different PSCD domains separated. Figure 5B shows the lowest-energy structure of the whole PSCD region. However, since experimental information on the linker structure and the mutual spatial arrangement of the individual PSCD domains (including possible direct contacts) does not exist, the 3D structure shown gives only an impression of the spatial arrangement of the chain of PSCD domains.

Binding of Silicic Acid to PSCD4

It has been speculated that pleuralins may be covalently bound to the silica of the cell wall (Kröger et al., 1997). This could possibly be achieved by a condensation reaction between the serine and threonine residues and silanol groups (Si-OH) on the silica surface, which was proposed even before the sequence of pleuralins or any other diatom cell-wall protein had become available (Hecky et al., 1973; Lobel et al., 1996). Therefore, we have investigated the interaction of silicic acid with the PSCD4 domain in vitro by NMR spectroscopy. Monomeric silicic acid is not stable at neutral pH and rapidly polymerizes to yield oligomeric silicic acid molecules and polysilicic acid nanoparticles. 15N-enriched His6PSCD4 (0.8 mM) was titrated with monomeric silicic acid that had been freshly prepared from tetramethoxysilane up to a final concentration of 4.0 mM. Significant chemical-shift changes were not observed in the 2D HSQC spectra, indicating that under our experimental conditions substantial binding of monomeric and oligomeric silicic acid molecules to the protein did not occur.

Interaction of PSCD4 with Divalent Ions

In addition, binding of divalent ions to PSCD4 has been tested by adding a concentrated solution of MgCl2 to a solution of 0.8 mM 15N-enriched His6PSCD4. Again, a specific chemical-shift perturbation could not be observed up to an MgCl2 concentration of 10 mM, thus indicating that pleuralins, despite their highly acidic nature, are not endowed with specific Mg2+ binding sites. In contrast, addition of 10 mM CaCl2 to a solution of 0.8 mM 15N-enriched His6PSCD4 leads to significant spectral changes, indicating direct interaction of Ca2+ ions with PSCD4. Chemical-shift changes are observed (Figure 6). Saturation is reached at a calcium concentration of 10 mM; further increasing the concentration does not lead to larger chemical-shift changes.
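Residues are classed as significantly shifted by comparing a combined 1H/15N chemical-shift change with a multiple of the standard deviation to zero, σ0. The sketch below illustrates one common way to do this and is not taken from the paper: the weighted Euclidean combination, the 15N weight of 0.1, and the iterative 3σ trimming used to obtain a corrected σ0 are all assumptions, and the residue labels in the usage note are hypothetical.

```python
import math


def combined_shift(d_h: float, d_n: float, w_h: float = 1.0, w_n: float = 0.1) -> float:
    """Weighted Euclidean combination of 1H and 15N shift changes (ppm).

    The weights are illustrative assumptions, not values from the paper.
    """
    return math.sqrt((w_h * d_h) ** 2 + (w_n * d_n) ** 2)


def sd_to_zero(values: list[float]) -> float:
    """Standard deviation to zero: root mean square of the values."""
    if not values:
        raise ValueError("need at least one value")
    return math.sqrt(sum(v * v for v in values) / len(values))


def corrected_sd_to_zero(values: list[float], cutoff: float = 3.0) -> float:
    """Iteratively drop values above cutoff * sigma, then recompute sigma.

    This trimming scheme is an assumed convention for estimating sigma0
    from the non-interacting residues only.
    """
    kept = list(values)
    while True:
        sigma = sd_to_zero(kept)
        trimmed = [v for v in kept if v <= cutoff * sigma]
        if len(trimmed) == len(kept):
            return sigma
        kept = trimmed


def significant_residues(shifts: dict[str, tuple[float, float]], n_sigma: float = 2.0):
    """Return (sorted residues with combined shift > n_sigma * sigma0, sigma0)."""
    comb = {res: combined_shift(dh, dn) for res, (dh, dn) in shifts.items()}
    sigma0 = corrected_sd_to_zero(list(comb.values()))
    hits = sorted(res for res, v in comb.items() if v > n_sigma * sigma0)
    return hits, sigma0
```

As a usage example, passing a dictionary such as `{"X1": (0.01, 0.0), ..., "X11": (0.2, 0.0)}` with ten small changes and one large one flags only the outlier. The choice of weights and trimming rule changes the numerical value of σ0, so thresholds from different studies are not directly comparable.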

Figure 5. 3D Structures of the Five PSCD Domains and the Complete PSCD Region of Pleuralin-1 (A) The 3D structures of the core region of the five PSCD domains. They were calculated assuming that the experimental restraints from PSCD4 can be applied for residues that are strictly conserved in comparison with the sequence of PSCD4. The structures were simulated with CNS using both experimental and substitute restraints. In each case the ten lowest-energy structures are superposed. The numbering of the residues used here corresponds to the complete sequence of pleuralin-1. (B) Surface representation of the PSCD chain of pleuralin-1. The electrostatic surface potential is indicated with positive (blue) and negative (red) surface charge. Note that information on the structure of the linker regions was not available for the molecular dynamics simulations.

The most significant changes, above 2σ0, are obtained for C24, D34, F37, L38, E75, and N80 (Figure 6). All of these amino acids except C24 are localized rather closely in the 3D structure and involve a part of the structure stabilized by two disulfide bonds (C31-C76 and C36-C72). Typically Ca2+ binding sites contain negatively charged aspartate and/or glutamate residues. In fact, with D34 and E75 two such residues are available that could coordinate the metal ion(s). Strong chemical-shift changes are also observed for C24; however, only two negatively charged residues (D23 and E26), necessary for Ca2+ binding, are in close proximity. Neither of these latter residues, however, shows strong chemical-shift changes, indicating that the spectral changes may be due to conformational changes induced by calcium binding to the other site. Addition of CaCl2 leads to an intensity increase and a linewidth decrease of many cross-peaks (Figure S1). In general, the spectra appear better resolved, indicating that residual exchange broadening is reduced after calcium binding. These effects are especially pronounced close to the two putative calcium binding sites.

Binding of α-Frustulin to the PSCD Domain

Native silaffin-1A and α-frustulins are the main protein components of the C. fusiformis cell wall. HSQC spectroscopy was used to test whether they interact with PSCD domains in vitro. A concentrated solution of α-frustulins was added to a solution of 15N-enriched His6PSCD4 and a series of 1H,15N HSQC spectra was recorded. Significant chemical-shift changes of the amide resonances of PSCD4 were not observed, but in general the cross-peak intensities were significantly reduced after correction for dilution effects (on average by more than 25%), as expected when a higher molecular mass complex with a higher rotational correlation time is formed after binding of α-frustulin. In addition, stronger residue-specific reductions in cross-peak intensities were observed, indicating a specific

Figure 6. Calcium Binding to His6PSCD4 (A) Plot of the cross-peak combined chemical-shift changes Δδcomb in the 1H,15N HSQC spectra of 0.8 mM His6PSCD4 in 5% D2O and 0.1 mM DSS after addition of 10 mM CaCl2. P, proline residue; 0, cross-peak shift could not be quantified due to low signal-to-noise ratio; *, no chemical-shift change observable; M, no chemical shift assigned to those residues. Green and blue lines represent the SD to zero σ0 and 2σ0, respectively. (B) Surface representation of the PSCD4 domain showing residues with Δδ > 2σ0 in red. The not observable residues including the prolines are depicted in gray. The secondary structure is plotted under the transparent surface using the above-described colors and highlighting the side chains of the charged residues (D34, E75) involved in the two Ca2+ binding sites. An additional Ca2+ binding site was located close to C24 with a chemical-shift change Δδ > 2σ0. See also Figure S1.

interaction with α-frustulins (Figure 7). The most significant intensity changes (>2σ0) were observed for the residues colored red in Figure 7B, particularly in the sequence range between K40 and E75. There is a strong additional general intensity reduction in the 1H,15N HSQC spectra of the PSCD4-frustulin complex when calcium is added (Figure S2), indicating that calcium binding enhances the affinity of frustulin for PSCD4. The residues mapped as interaction sites are almost identical, meaning that the binding pattern of frustulin to PSCD4 is essentially unchanged in the presence of calcium.
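The dilution-corrected relative intensity change (I0-I)/I0 used for the frustulin titration can be sketched as follows. This is a minimal illustration, assuming that dilution is corrected by scaling the measured intensity with the ratio of total to initial sample volume; the function names and all numerical values are hypothetical and not taken from the experimental data.

```python
def dilution_factor(v_initial_ul, v_added_ul):
    """Factor by which the labeled protein is diluted when titrant is added."""
    return (v_initial_ul + v_added_ul) / v_initial_ul


def relative_intensity_change(i0, i, v_initial_ul, v_added_ul):
    """(I0 - I)/I0 after scaling I back to the undiluted concentration."""
    i_corrected = i * dilution_factor(v_initial_ul, v_added_ul)
    return (i0 - i_corrected) / i0


def mean_change(changes):
    """Average relative change over all quantified cross-peaks."""
    return sum(changes) / len(changes)


if __name__ == "__main__":
    # Hypothetical cross-peak intensities before and after adding titrant.
    peaks = [(100.0, 60.0), (80.0, 55.0), (120.0, 50.0)]
    changes = [relative_intensity_change(i0, i, 500.0, 100.0) for i0, i in peaks]
    print([round(c, 3) for c in changes], round(mean_change(changes), 3))
```

Without the dilution correction, every cross-peak would appear weaker simply because the labeled protein is less concentrated, so the correction is what allows a general reduction to be attributed to complex formation.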

In addition, chemical-shift changes are observed after addition of CaCl2, indicating that calcium binding is not hindered by frustulin binding but is probably involved in the protein-protein interaction. Indeed, there is a partial overlap with the residues that shift strongly upon calcium binding (C24, D34, F37, L38, E75, and N80). The potential Ca2+ binding site at E75 shows a further increased chemical-shift change in the PSCD4-frustulin complex.

Binding of Silaffin-1A to the PSCD Domain

In contrast to the cross-peak intensity changes observed for the binding of α-frustulins, the addition of native silaffin-1A to a

Figure 7. Interaction of His6PSCD4 with α-Frustulin
(A) Plot of the relative cross-peak intensity changes (I0-I)/I0 in the 1H,15N HSQC spectra of 0.8 mM His6PSCD in 5% D2O and 0.1 mM DSS after addition of 0.5 mM frustulin in the same buffer. Measurements were performed at 298 K. I0 and I, cross-peak intensity before and after addition of frustulin, respectively. Intensities were corrected for dilution effects. P, proline residue; 0, cross-peak intensity could not be quantified; M, no chemical shift assigned to those residues. Green and blue lines represent the SD to zero, σ0, and 2σ0, respectively.
(B) Surface representation of the PSCD4 domain showing residues with (I0-I)/I0 ≤ σ0 and 2σ0 < (I0-I)/I0 in blue and red, respectively. Residues that could not be observed, including the prolines, are depicted in gray. See also Figure S3.

solution containing 15N-enriched PSCD4 leads mainly to chemical-shift perturbations in the HSQC spectra (Figure 8), indicating again a protein-protein interaction. The largest perturbations of the combined chemical shifts Δδcomb (Schumann et al., 2007) are observed for Q20, N64, E75, and C76, with Δδ > 2σ0. Q20 is the only residue perturbed by silaffin binding in the N-terminal part preceding cysteine 24 of disulfide bond I, N64 is located in the loop stabilized by disulfide bridge V (cysteine 57 to cysteine 68), and E75 and C76 are located immediately before or in the

disulfide bridge II and are part of the second calcium binding site. The majority of the other residues showing significant chemical-shift changes after silaffin-1A binding are also located close to disulfide bridge II (E29, E30, I33, S77, and N80) or to disulfide bridge I (D88, N93). D48 is located close to disulfide bridge IV formed by C49 and C71, and N65 is found in the loop stabilized by disulfide bridge V. Finally, L99 lies in the C-terminal part of the PSCD4 domain. As expected, a number of negatively charged residues are perturbed by the interaction with

Figure 8. Interaction of His6PSCD4 with Native Silaffin-1A
(A) Plot of the combined chemical-shift changes Δδcomb (Schumann et al., 2007) in the 1H,15N HSQC spectra of 0.8 mM His6PSCD in 5% D2O and 0.1 mM DSS after addition of 1.82 mM native silaffin-1A in the same buffer. Measurements were performed at 298 K. P, proline residue; 0, cross-peak chemical shift could not be quantified. Green and blue lines represent the SD to zero, σ0, and 2σ0, respectively.
(B) Surface representation of the PSCD4 domain showing residues with Δδcomb ≤ σ0, σ0 < Δδcomb ≤ 2σ0, and 2σ0 < Δδcomb in blue, orange, and red, respectively. Residues that could not be observed are depicted in gray.

the positively charged polyamine side chains of native silaffin-1A. The addition of Ca2+ ions to the PSCD4-silaffin complex did not induce further changes of chemical shifts or cross-peak volumes, indicating that in this complex calcium does not compete with the positively charged polyamine side chains of native silaffin-1A.

DISCUSSION

The cell-wall protein pleuralin-1 from the diatom C. fusiformis comprises five PSCD domains that show 65%–90% sequence identity, with the ten cysteine residues at conserved positions (Kröger et al., 1997). Pleuralin-1 has no sequence similarity to other proteins in the database. We have experimentally

determined the structure of the fourth domain and were able to predict the entire structure of the PSCD region of the protein based on the high sequence conservation of the PSCD domains. Unfortunately, we currently have no information on the 3D structures of the N-terminal proline-rich region and the C-terminal domain of pleuralin-1. The PSCD structures were determined semi-automatically using routines of AUREMOL. The structures of the PSCD domains are quite well defined and show that the different disulfide-stabilized loop regions form a rather compact structure. The different PSCD domains are linked by short stretches of amino acids that are probably highly mobile and disordered. The arrangement of the PSCD domains forms a bead-chain-like structure.

Figure 9. Model of the Mutual Arrangement of Pleuralin, Silaffin-1A, and α-Frustulin in the Overlap Region of Hypotheca and Epitheca

A point that is difficult to prove is that the disulfide bridge pattern obtained in the heterologously expressed protein is identical to the one occurring in the diatom cell wall. However, it is generally accepted that disulfide bonds only stabilize the natural fold but do not create it. In our recombinant expression system they were formed during the export of the protein into the periplasm of E. coli, and only a unique disulfide bridge pattern was observed. In agreement with this assumption, the reduction of the disulfide bonds did not lead to a substantial change of the NOE patterns and caused only localized chemical-shift changes close to the cysteine residues. This again supports our assumption that the PSCD4 structure reported here corresponds to that of the natural protein. The lack of interaction of PSCD4 with silicic acid in vitro suggests that pleuralins are not involved in silica biogenesis. This is consistent with previous results from immunolocalization experiments that demonstrated (1) the absence of pleuralins inside the SDVs, and (2) attachment of pleuralins to girdle bands on the cell surface, i.e., after completion of silica morphogenesis (Kröger and Wetherbee, 2000). Furthermore, as no interaction with the silanol-group-rich silicic acid molecules could be demonstrated, it seems unlikely that pleuralins attach to biosilica in vivo through the formation of ester bonds between hydroxyl-bearing amino acid residues and the silanol groups. Instead, pleuralins may be incorporated into the cell wall by non-covalent attachment to silaffins, which themselves are tightly bound to the silica (Kröger et al., 1999). This is consistent with the interaction between PSCD4 and native silaffin-1A in vitro that was observed in the present study. It has previously been demonstrated that

frustulins are the main components of a cell-surface coating that also covers pleuralins in vivo (Kröger et al., 1997). The binding of the frustulin-based coating to pleuralins may, at least partly, be mediated through the interactions between PSCD domains and α-frustulins that were revealed through the HSQC experiments described here. For the following reasons it is easily conceivable that pleuralins can simultaneously bind to both silaffins and frustulins in vivo. Firstly, the binding sites for native silaffin-1A and for α-frustulins in PSCD4 only partially overlap, and secondly the presence of multiple PSCD domains in each pleuralin allows for multivalent interactions with different binding partners. In addition, binding of Ca2+ ions to the specific calcium binding site leads to an increased affinity for frustulins at Ca2+ concentrations of the order of 10 mM. This is a calcium concentration typically found in seawater: according to the IAPSO reference seawater with 3.5% salinity, seawater contains 10.5 mM Ca2+, 54.0 mM Mg2+, 455.4 mM Na+, and 10.2 mM K+ (Besson et al., 2013), whereas in the cytoplasm the concentrations of free divalent ions are much lower (Ca2+ about 0.1 mM, Mg2+ about 0.5 mM; MacDermott, 1990). Interestingly, α-frustulins are bound strongly to the cell walls in the presence of Ca2+ ions and are released by addition of EDTA, e.g., during the biochemical extraction procedure (Kröger et al., 1994). From this, a model for the arrangement of the three proteins can be derived (Figure 9) in which the pleuralin chains connect the upper valves with the lower valves when they come into contact with the calcium-rich seawater. Ca2+ ions are essential for the protein-protein interaction, and at least part of their binding sites are located in the interface between PSCD4 and α-frustulin. Experimentally, frustulins can be released in vivo by high concentrations of EDTA.
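To make the contrast between seawater and cytoplasmic calcium levels concrete, the following Python sketch estimates the occupancy of a single, independent Ca2+ binding site. The dissociation constant of 1 mM is a hypothetical placeholder chosen only because the binding sites are described as having millimolar affinity; it is not a measured value, and the concentrations are those quoted in the text above.

```python
# Hypothetical dissociation constant in mM; a placeholder for "millimolar
# affinity", not a value determined in this study.
KD_MM = 1.0


def fraction_bound(ca_mm, kd_mm=KD_MM):
    """Occupancy of one independent site with the ion in large excess."""
    return ca_mm / (kd_mm + ca_mm)


if __name__ == "__main__":
    # Concentrations in mM as quoted in the text.
    for label, ca_mm in (("seawater", 10.5), ("cytoplasm", 0.1)):
        print(f"{label}: {fraction_bound(ca_mm):.2f}")
```

Under these assumptions the site would be mostly occupied in seawater and mostly empty in the cytoplasm, which is in line with the idea that contact with seawater switches on the calcium-dependent interaction.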
However, this model is derived exclusively from the molecular properties of the three proteins and the rather limited information on the location of the corresponding proteins in diatom cell walls, although it could serve as a working hypothesis for further experiments.

EXPERIMENTAL PROCEDURES

Expression and Isolation of Proteins

The PSCD4 domain of the pleuralin-1 protein from C. fusiformis was expressed in E. coli using an export vector and purified as previously described by Wenzler et al. (2001). For sequential assignments, uniform 15N-13C isotope enrichment was performed by growing the bacteria in M9 medium (Sambrook et al., 1989) with [15N]ammonium chloride and D-[13C]-glucose as nitrogen and carbon sources. The construct His6PSCD4 (112 residues) used in the NMR experiments contained an N-terminal linker with the sequence SYYHHHHHH, where the pleuralin sequence starts with T10 and ends with T112. T10 and T112 correspond to amino acids T366 and T468 of pleuralin-1 in the complete protein when defining the first amino acid, after the presequence that is cleaved off during the export of the protein, as amino acid 1. A typical SDS gel is shown in Figure S3. The preparation of α-frustulins was performed following a previously described protocol (Kröger et al., 1994). Native silaffin-1A was isolated as previously described (Kröger et al., 2002).

NMR Samples

For structural determination, 10 mg/ml 15N-labeled His6PSCD4 was dissolved in 10 mM sodium phosphate buffer (pH 6.5), 10% D2O, 0.1 mM EDTA, 1 mM NaN3, 1 mM leupeptin, 1 mM pepstatin, 1 mM BPTI (bovine pancreatic trypsin inhibitor), and 0.1 mM DSS (4,4-dimethyl-4-silapentane-1-sulfonic acid). For measuring the RDCs, the sample was dissolved in 50 mM potassium phosphate buffer (pH 6.2) with 0.1 mM EDTA and 0.1 mM DSS in 500 µl of 95% H2O/5% D2O. Partial orientation was obtained by adding 7.5%

(w/v) 1,2-di-O-dodecyl-sn-glycero-3-phosphocholine (DIODPC)/3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propanesulfonate (CHAPSO)/cetyltrimethylammonium bromide (CTAB) (4.3:1:0.1) bicelles (Cavagnero et al., 1999). For the titration experiments, appropriate amounts of a solution of 0.5 mM (40 mg/ml) frustulin or 1.82 mM native silaffin-1A dissolved in the same buffer were added to a sample of 0.8 mM His6PSCD4 (pH 7.5) in 5% D2O and 0.1 mM DSS. For the titration with divalent ions, appropriate quantities of a solution containing 1 M MgCl2 or 1 M CaCl2 were added. The pH of the solution was monitored with a glass electrode and did not change significantly during the titration.

NMR Spectroscopy

The NMR spectra were recorded on a Bruker DMX-500, a Bruker DRX-600, an Avance-600 equipped with a CryoProbe, and a DRX-800 spectrometer operating at 500.13, 600.13, and 800.13 MHz, respectively. The sequence-specific assignment of the backbone atoms was performed using the following experiments: 1H,15N HSQC, HNCA, CBCA(CO)NH, HBHA(CO)NH, 1H,15N-total correlation spectroscopy (TOCSY) HSQC, 1H,15N NOESY-HSQC (NOESY mixing time 100 ms), CDCA(NCO)CAHA (Bottomley et al., 1999), HNCO, and HACACO. The Hβ and Cβ atoms of the side chains were assigned using the CBCA(CO)NH and HBHA(CO)NH experiments, whereas the H(C)CH-TOCSY, the (H)CC(CO)NH-TOCSY, and the H(CC)(CO)NH-TOCSY experiments were used to assign the aliphatic C atoms and the directly bound protons. The CO and NH2 groups of the asparagine and glutamine side chains were detected in the HNCO and the HNCA spectra, while those of the aspartate and glutamate side chains were found in the HACACO experiment. The chemical shifts of the protons in the aromatic rings were determined from the 1H,1H NOESY, 1H,1H TOCSY, and 1H,13C NOESY-HSQC experiments. Distance restraints were mainly derived from a 2D 1H,1H NOESY spectrum recorded with a mixing time of 60 ms and a relaxation delay of 1.6 s using TPPI in ω1.
The spectral width was 13.89 ppm in both dimensions using 512 × 4,096 time domain data points. In addition, a 3D 1H,15N NOESY-HSQC and a 3D 1H,13C NOESY-HSQC were recorded. The HNCA-E.COSY spectrum (Griesinger et al., 1987) was used to determine the 3JHNHα coupling constant, whereas the hydrogen bonds were detected from the HNCO experiment performed with a long INEPT N-C polarization transfer time of 66 ms. Some additional hydrogen bonds were observed by means of hydrogen-deuterium exchange in a series of recorded 1H,15N HSQC spectra. A few 1H,15N HSQC spectra without decoupling in the indirect dimension were measured in the presence and absence of DIODPC/CHAPSO/CTAB bicelles with a spectral resolution of 1.2 Hz in the indirect dimension. Heteronuclear 15N,1H NOEs were measured at a proton frequency of 500 MHz according to Farrow et al. (1994) with a proton saturation time of 3 s. The interaction sites of pleuralin-1 with α-frustulin and native silaffin-1A were determined by observation of chemical-shift changes as well as cross-peak intensity changes in 1H,15N HSQC spectra of 15N-enriched PSCD4 depending on the concentration of the unlabeled proteins added. For the study of a possible interaction of silicic acid with PSCD4, silicic acid was freshly prepared before every titration step from tetramethoxysilane by adding 1 mM HCl to the solution. After adding appropriate amounts of silicic acid to His6PSCD4, a 1H,15N HSQC spectrum was recorded. All spectra were measured at 298 K. The absolute temperature was determined with external ethylene glycol as described by Raiford et al. (1979).

Mass Spectrometry and Determination of the Disulfide Bridges

Proteolytic peptides were separated by reverse-phase chromatography (Jupiter Proteo column, 250 × 2 mm, Phenomenex) using gradient elution (eluent A: 0.1% trifluoroacetic acid [TFA] in water, eluent B: 70% acetonitrile in 0.1% TFA). The peptides were separated either before or after reduction with DTT.
MS was performed using a 4800 Plus MALDI-TOF/TOF Analyzer (Applied Biosystems). The lyophilized samples were dissolved in 5 µl of matrix solution (5 mg α-cyano-4-hydroxy-cinnamic acid [CHCA] in 1 ml of H2O/acetonitrile/0.1% TFA) and spotted onto the MALDI target. For molecular weight measurements, spectra were recorded either in the linear or reflector mode and mass calibration was done using internal standards. For structural analysis, peptides were fragmented and the MS/MS spectra recorded.
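How separation before and after DTT reduction helps to map disulfide bridges can be sketched with a short mass calculation. The Python example below assumes singly protonated [M + H]+ ions, standard monoisotopic residue masses, and a loss of two hydrogen atoms per disulfide bond; the peptide sequences are hypothetical and are not proteolytic fragments identified in this work.

```python
# Standard monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per linear peptide (N- and C-terminus)
PROTON = 1.007276   # charge carrier in [M + H]+
H_ATOM = 1.007825   # hydrogen atom lost from each cysteine on oxidation


def peptide_mass(sequence):
    """Monoisotopic mass of a neutral, reduced linear peptide."""
    return sum(MONO[aa] for aa in sequence) + WATER


def mh_plus(neutral_mass):
    """m/z of the singly protonated ion."""
    return neutral_mass + PROTON


def disulfide_linked_mass(seq_a, seq_b):
    """Neutral mass of two peptides joined by one disulfide bond."""
    return peptide_mass(seq_a) + peptide_mass(seq_b) - 2 * H_ATOM


if __name__ == "__main__":
    # Hypothetical cysteine-containing peptides, for illustration only.
    a, b = "ACDK", "GCFR"
    print("linked  [M+H]+:", round(mh_plus(disulfide_linked_mass(a, b)), 4))
    print("reduced [M+H]+:", round(mh_plus(peptide_mass(a)), 4),
          round(mh_plus(peptide_mass(b)), 4))
```

A signal that is present before reduction and is replaced afterwards by two signals matching the individual reduced peptides is the kind of evidence that points to an intermolecular disulfide link between the two fragments.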

In MALDI-MS, laser bombardment of the sample produces singly charged [M + H]+ peptide ions (by transfer of one proton from the CHCA matrix to the neutral peptide). Therefore, the m/z (mass/charge) values measured by MS differ from the molecular mass of the neutral peptide by one mass unit. m/z values were determined by TOF MS. For conventional protein sequencing, peptides were sequenced on a Procise 492A sequencer (Applied Biosystems) with online detection of the PTH amino acids according to the manufacturer's instructions.

NMR Data Evaluation

The software suites XWINNMR 2.6 and TOPSPIN 3.1 were used during the acquisition and the processing stages of the NMR data. SSA for solvent suppression and ALS for baseline correction, both implemented in the program AUREMOL, were applied to all spectra before performing the automated assignment to improve the spectral quality and reveal some resonances superposed by the solvent signal (Malloni et al., 2010; De Sanctis et al., 2011). The TALOS+ program (Corneliescu et al., 1999) was used to predict the secondary structure from the detected chemical shifts of Cα, Cβ, C′, Hα, and HN atoms, yielding φ and ψ dihedral angle restraints. Secondary structure prediction was performed according to Wishart et al. (1995). The configuration of the proline peptide bond was determined both from the NOE patterns (Wüthrich et al., 1984) and from the chemical-shift difference of the Cβ-Cγ resonances (Schubert et al., 2002). Stereospecific assignments of the side-chain amide groups of asparagine and glutamine residues were obtained from the NOE patterns in the 3D 1H,15N NOESY-HSQC spectra and the chemical shifts (Harsch et al., 2013). The sequence-specific assignments reported earlier (Wenzler et al., 2001) were completed and are stored in the BMRB under the accession number 4958. NOESY spectra were evaluated with the help of AUREMOL (www.auremol.de) routines.
For the primary NOESY assignment the AUREMOL routine KNOWNOE (Gronwald et al., 2002) was applied to the 2D 1H,1H NOESY spectrum. The REFINE routine (Trenner, 2006) was used to automatically generate distance restraints and distance limits from the assigned 2D 1H,1H NOESY spectrum. Distance restraints for degenerate resonances were corrected according to Kalbitzer and Hengstenberg (1993). Additional substitute restraints (Cano et al., 2009) were generated by PERMOL (Möglich et al., 2005).

Structure Calculation and Validation

The 3D structures were determined using the standard simulated annealing protocol of the CNS 1.21 program for extended-strand starting structures (Brünger and Nilges, 1998). The structures were refined according to the protocol proposed by Cano et al. (2009), using substitute restraints; finally, the structures were refined in explicit water (Linge et al., 2003). In a first step CNS was used to simulate 1,024 structures of the PSCD4 domain (amino acids 366–468) using only the experimental restraints. The simulated annealing started with high-temperature torsion-angle dynamics run at 50,000 K for 1,000 steps of 15 fs each. Next, a first cooling phase of 1,000 steps with a starting temperature of 50,000 K and a time step of 15 fs was employed. During the simulation of the entire protein, a second cooling stage was run with 1,000 steps (each of 5 fs) of Cartesian dynamics and a starting temperature of 3,000 K. In the final phase, 200 steps of energy minimization were performed. The ten lowest-energy structures were refined in explicit water using the CNS protocol re_h2o.inp. From these structures additional substitute restraints were created (Cano et al., 2009), and the protocol was repeated including the experimental restraints and the substitute restraints.
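The simulated annealing schedule described above can be summarized in a few lines of Python to see how much dynamics time each stage contributes. This is only bookkeeping of the step counts and time steps quoted in the text; it does not reproduce the CNS protocol itself, and the stage names and data structure are illustrative choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    steps: int
    timestep_fs: float
    start_temp_k: float


# Step counts, time steps, and starting temperatures as quoted in the text.
STAGES = [
    Stage("high-temperature torsion-angle dynamics", 1000, 15.0, 50000.0),
    Stage("first cooling", 1000, 15.0, 50000.0),
    Stage("second cooling (Cartesian dynamics)", 1000, 5.0, 3000.0),
]
N_STRUCTURES = 1024          # structures simulated in the first step
MINIMIZATION_STEPS = 200     # final energy minimization, no dynamics time


def stage_time_ps(stage):
    """Simulated time of one stage in picoseconds."""
    return stage.steps * stage.timestep_fs / 1000.0


def total_time_ps(stages):
    """Simulated dynamics time per structure in picoseconds."""
    return sum(stage_time_ps(s) for s in stages)


if __name__ == "__main__":
    for s in STAGES:
        print(f"{s.name}: {stage_time_ps(s):.1f} ps from {s.start_temp_k:.0f} K")
    per_structure = total_time_ps(STAGES)
    print(f"per structure: {per_structure:.1f} ps")
    print(f"all structures: {per_structure * N_STRUCTURES / 1000.0:.2f} ns")
```

The arithmetic shows that each structure sees only tens of picoseconds of dynamics, so the conformational search relies on the large number of independent starting structures rather than on long trajectories.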
For calculating the structures of PSCD1 (amino acids N93 to N179), PSCD2 (amino acids E185 to P271), PSCD3 (amino acids A283 to P371), and PSCD5 (amino acids Q487 to P575), the domains were aligned with PSCD4 and for conserved residues the experimental restraints from PSCD4 were assumed to hold also for the other residues. With these restraints the above protocol was repeated for these domains. Using all substitute and experimental restraints, the structures of the part of pleuralin-1 containing all five PSCD domains (amino acids Q88 to E581) were calculated again with the above molecular dynamics protocol (using both experimental and substitute restraints except for the RDCs). The experimental restraints of the PSCD4 domain used for the calculations are listed in Table 1. The substitute restraints of each PSCD domain generated


with PERMOL are listed in Table S2. Twelve hydrogen-bond restraints were detected, eight of them from the long-range HNCO experiment and the other four from the hydrogen-deuterium exchange and NOE data. For hydrogen-bond distances, the lower and upper limits between the amide proton and the acceptor oxygen were defined as 0.18 and 0.25 nm, and those between the donor nitrogen and the acceptor oxygen as 0.23 and 0.35 nm. Eight backbone torsion-angle restraints predicted by TALOS+ were used. In addition, nine 3JHNHα coupling restraints were obtained from the HNCA-E.COSY spectrum (Griesinger et al., 1987). Residual 1H,15N RDCs were obtained from measurements in an isotropic solution and in a bicellar system. The three components Aij of the molecular alignment tensor were extrapolated from the histogram of the distribution of the observed RDCs (Clore et al., 1998). Dipolar couplings were introduced in the water refinement using the SANI protocol (Tjandra et al., 1997). A rhombicity of 0.443 and an axiality of 10.83 were computed and used by the CNS program to simulate the PSCD4 domain. For calculation of the structure of the full-length pleuralin-1 protein, the PERMOL routine (Möglich et al., 2005) was used for the sequential alignment of the PSCD domains and for the extraction of NOE distances, dihedral angles, and hydrogen bonds of each PSCD domain from the previously determined structure of the PSCD4 domain. The RMSD to the mean of the ten lowest-energy structures was calculated with MOLMOL (Koradi et al., 1996). The Ramachandran plots were calculated with PROCHECK (Laskowski et al., 1993). The NMR R factor was calculated according to Gronwald et al. (2000, 2007). The structural representations were produced with PyMOL (DeLano, 2002).

ACCESSION NUMBERS

The 3D structures and structurally relevant NMR data are deposited in the PDB under the accession codes PDB: 2MK0 and 2NMBI, and the assignments in the BMRB under the accession number 4958.
SUPPLEMENTAL INFORMATION

Supplemental Information includes three figures and two tables and can be found with this article online at http://dx.doi.org/10.1016/j.str.2016.04.021.

AUTHOR CONTRIBUTIONS

M.W., S.D.S., W.M.M., E.B., W.K., and H.R.K. were involved in the NMR data acquisition, evaluation, and structure calculation; N.K., M.S., and P.Z. in protein expression and purification; and N.K., P.Z., and R.D. in MS. All authors contributed to defining the project, manuscript writing, and the necessary discussions.

ACKNOWLEDGMENTS

We wish to thank the Deutsche Forschungsgemeinschaft (DFG) and the Peter & Traudl-Engelhorn Foundation for supporting this work.

Received: December 20, 2015
Revised: April 27, 2016
Accepted: April 27, 2016
Published: June 16, 2016

REFERENCES

Besson, P., Degboe, J., Berge, B., Chavagnac, V., Fabre, S., and Berger, G. (2013). Calcium, Na, K and Mg concentrations in seawater by inductively coupled plasma-atomic emission spectrometry: applications to IAPSO seawater reference material, hydrothermal fluids and synthetic seawater solutions. Geostand. Geoanal. Res. 38, 355–362.

Bottomley, M.J., Macias, M.J., Liu, Z., and Sattler, M. (1999). A novel NMR experiment for the sequential assignment of proline residues and proline stretches in 13C/15N-labeled proteins. J. Biomol. NMR 13, 381–385.

Brünger, A.T., and Nilges, M. (1998). Computational challenges for macromolecular structure determination by x-ray crystallography and solution NMR spectroscopy. Q. Rev. Biophys. 26, 49–125.

Cano, C., Brunner, K., Baskaran, K., Elsner, R., Munte, C.E., and Kalbitzer, H.R. (2009). Protein structure calculation with data imputation: the use of substitute restraints. J. Biomol. NMR 45, 397–411.

Cavagnero, S., Dyson, H.J., and Wright, P.E. (1999). Improved low pH bicelle system for orienting macromolecules over a wide temperature range. J. Biomol. NMR 13, 387–391.

Clore, G.M., Gronenborn, A.M., and Bax, A. (1998). A robust method for determining the magnitude of the fully asymmetric alignment tensor of oriented macromolecules in the absence of structural information. J. Magn. Reson. 133, 216–221.

Corneliescu, G., Delaglio, F., and Bax, A. (1999). TALOS: protein backbone angle restraints from searching a database for chemical shift and sequence homology. J. Biomol. NMR 13, 289–302.

DeLano, W.L. (2002). The PyMOL Molecular Graphic System (San Carlos, CA, USA: DeLano Scientific).

De Sanctis, S., Malloni, W.M., Kremer, W., Tomé, A.M., Lang, E.W., Neidig, K.P., and Kalbitzer, H.R. (2011). Singular spectrum analysis for an automated solvent artifact removal and baseline correction of 1D NMR spectra. J. Magn. Reson. 210, 177–183.

Farrow, N.A., Muhandiram, R., Singer, A.U., Pascal, S.M., Kay, C.M., Gish, G., Shoelson, S.E., Pawson, T., Forman-Kay, J.D., and Kay, L.E. (1994). Backbone dynamics of a free and a phosphopeptide-complexed Src homology 2 domain studied by 15N NMR relaxation. Biochemistry 33, 5984–6003.

Griesinger, C., Sørensen, O.W., and Ernst, R.R. (1987). Practical aspects of the E.COSY technique. Measurement of scalar spin-spin coupling constants in peptides. J. Magn. Reson. 75, 474–492.

Gronwald, W., Kirchhofer, R., Gorler, A., Kremer, W., Ganslmeier, B., Neidig, K.P., and Kalbitzer, H.R. (2000). RFAC, a program for automated NMR R-factor estimation. J. Biomol. NMR 17, 137–151.

Gronwald, W., Moussa, S., Elsner, R., Jung, A., Ganslmeier, B., Trenner, J., Kremer, W., Neidig, K.P., and Kalbitzer, H.R. (2002). Automated assignment of NOESY NMR spectra using a knowledge based method (KNOWNOE). J. Biomol. NMR 23, 271–287.

Gronwald, W., Brunner, K., Kirchhöfer, R., Trenner, J., Neidig, K.-P., and Kalbitzer, H.R. (2007). AUREMOL-RFAC-3D, combination of R-factors and their use for automated quality assessment of protein structures. J. Biomol. NMR 37, 15–30.

Harsch, T., Dasch, C., Donaubauer, H., Baskaran, K., Kremer, W., and Kalbitzer, H.R. (2013). Stereospecific assignment of the asparagine and glutamine side chain amide protons in random-coil peptides by combination of molecular dynamic simulations with relaxation matrix calculations. Appl. Magn. Reson. 44, 319–331.

Hecky, R.E., Mopper, K., Kilham, P., and Degens, T.E. (1973). The amino acid and sugar composition of diatom cell walls. Mar. Biol. 19, 323–331.

Kalbitzer, H.R., and Hengstenberg, W. (1993). The Solution Structure of the Histidine-Containing Protein (HPr) from Staphylococcus aureus as determined by Two-Dimensional 1H-NMR Spectroscopy. Eur. J. Biochem. 216, 205–214.

Koradi, R., Billeter, M., and Wüthrich, K. (1996). MOLMOL: a program for display and analysis of macromolecular structures. J. Mol. Graph. 14, 51–55.

Kremer, W., Arnold, M.R., Kachel, N., and Kalbitzer, H.R. (2004). The use of high-sensitivity sapphire cells in high pressure NMR spectroscopy and its application to proteins. Spectroscopy 18, 271–278.

Kröger, N., and Poulsen, N. (2008). Diatoms—from cell wall biogenesis to nanotechnology. Annu. Rev. Genet. 42, 83–107.

Kröger, N., and Wetherbee, R. (2000). Pleuralins are involved in theca differentiation in the diatom Cylindrotheca fusiformis. Protist 151, 263–273.

Kröger, N., Bergsdorf, C., and Sumper, M. (1994). A new calcium binding glycoprotein family constitutes a major diatom cell wall component. EMBO J. 13, 4676–4683.

Kröger, N., Lehmann, G., Rachel, R., and Sumper, M. (1997). Characterization of a 200-kDa diatom protein that is specifically associated with a silica-based substructure of the cell wall. Eur. J. Biochem. 250, 99–105.

Kröger, N., Deutzmann, R., and Sumper, M. (1999). Polycationic peptides from diatom biosilica that direct silica nanosphere formation. Science 286, 1129–1132.

Kröger, N., Lorenz, S., Brunner, E., and Sumper, M. (2002). Self-assembly of highly phosphorylated silaffins and their function in biosilica morphogenesis. Science 298, 584–586.

Laskowski, R.A., MacArthur, M.W., Moss, D.S., and Thornton, J.M. (1993). Procheck—a program to check the stereochemical quality of protein structures. J. App. Cryst. 26, 283–291.

Linge, J.P., Williams, M.A., Spronk, C.A.E.M., Bonvin, A.M.J.J., and Nilges, M. (2003). Refinement of protein structures in explicit solvent. Proteins 50, 496–506.

Lobel, K.D., West, J.K., and Hench, L.L. (1996). Computational model for protein-mediated biomineralization of the diatom frustule. Mar. Biol. 126, 353–360.

MacDermott, M. (1990). The intracellular concentration of free magnesium in extensor digitorum longus muscles of the rat. Exp. Physiol. 75, 763–769.

Malloni, W.M., De Sanctis, S., Tomé, A.M., Lang, E.W., Munte, C.E., Neidig, K.P., and Kalbitzer, H.R. (2010). Automated solvent artifact removal and base plane correction of multidimensional NMR protein spectra by AUREMOL-SSA. J. Biomol. NMR 47, 101–111.

Möglich, A., Weinfurtner, D., Gronwald, W., Maurer, T., and Kalbitzer, H.R. (2005). PERMOL: restrained based protein homology modeling using DYANA or CNS. Bioinformatics 21, 2110–2111.

Raiford, D.S., Fisk, C.G., and Becker, E.D. (1979). Calibration of methanol and ethylene glycol nuclear magnetic resonance thermometers. Anal. Chem. 51, 2050–2051.

Sambrook, J., Fritsch, E.F., and Maniatis, T. (1989). Molecular Cloning: A Laboratory Manual (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press).

Schubert, M., Labudde, D., Oschkinat, H., and Schmieder, P. (2002). A software tool for the prediction of Xaa-Pro peptide bond conformations in proteins based on 13C chemical shift statistics. J. Biomol. NMR 24, 149–154.

Schumann, F.H., Riepl, H., Maurer, T., Gronwald, W., Neidig, K.P., and Kalbitzer, H.R. (2007). Combined chemical shift changes and amino acid specific chemical shift mapping of protein-protein interactions. J. Biomol. NMR 39, 275–289.

Sitte, P., Ziegler, H., Ehrendorfer, F., and Bresinsky, A. (1991). Strasburger Lehrbuch der Botanik (Stuttgart: Gustav Fischer Verlag).

Smetacek, V. (1999). Diatoms and the ocean carbon cycle. Protist 150, 25–32.

Sumper, M., and Brunner, E. (2008). Silica biomineralisation in diatoms: the model organism Thalassiosira pseudonana. Chembiochem 9, 1187–1194.

Tjandra, N., Omichinski, J.G., Gronenborn, A.M., Clore, G.M., and Bax, A. (1997). Use of dipolar 1H-15N and 1H-13C couplings in the structure determination of magnetically oriented macromolecules in solutions. Nat. Struct. Biol. 4, 732–738.

Trenner, J.M. (2006). Accurate Proton-Proton Distance Calculation and Error Estimation from NMR Data for Automated Protein Structure Determination in AUREMOL (Regensburg: University of Regensburg), Doctoral thesis.

Van den Hoek, C., Jahns, H.M., and Mann, D.G. (1993). Algen (Stuttgart, Germany: Thieme-Verlag).

Wenzler, M., Brunner, E., Kröger, N., Lehmann, G., Sumper, M., and Kalbitzer, H.R. (2001). Letter to the editor: 1H, 13C, 15N sequence-specific resonance assignment of the PSCD4 domain of diatom cell wall protein pleuralin-1. J. Biomol. NMR 20, 191–192.

Wishart, D.S., Sykes, B.D., and Richards, F.M. (1992). The chemical shift index: a fast and simple method for the assignment of protein secondary structure through NMR spectroscopy. Biochemistry 31, 1647–1651.

Wishart, D.S., Bigam, C.G., Holm, A., Hodges, R.S., and Sykes, B.D. (1995). 1H, 13C and 15N random coil NMR chemical shifts of the common amino acids: I. investigations of nearest neighbor effects. J. Biomol. NMR 5, 67–81.

Wüthrich, K., Billeter, M., and Braun, W. (1984). Polypeptide secondary structure determination by nuclear magnetic resonance observation of short proton-proton distances. J. Mol. Biol. 180, 715–740.

Structure, Volume 24

Supplemental Information

PSCD Domains of Pleuralin-1 from the Diatom Cylindrotheca fusiformis: NMR Structures and Interactions with Other Biosilica-Associated Proteins

Silvia De Sanctis, Michael Wenzler, Nils Kröger, Wilhelm M. Malloni, Manfred Sumper, Rainer Deutzmann, Patrick Zadravec, Eike Brunner, Werner Kremer, and Hans Robert Kalbitzer


Table S1, related to Table 1. Proline cis-trans isomers and disulfide bonds in the PSCD4-domain.ᵃ

| | Residues |
| --- | --- |
| Peptide bonds: trans | Q20-P21, C31-P32, L38-P39, R45-P46, R54-P55, F62-P63, C68-P69, C72-P73, N80-P81, S86-P87 |
| Peptide bonds: cis | T84-P85, S101-P102, V109-P110 |
| Peptide bonds: mixed | L99-P100 |
| Peptide bonds: undefined | E16-P17, P46-P47, L60-P61, S77-P78, S90-P91, P91-P92, S95-P96, S103-P104 |
| Disulfide bonds | C24-C94, C31-C76, C36-C72, C49-C71, C57-C68 |

ᵃ The cis-trans configurations of the peptide bonds preceding the proline residues were derived from the Cβ-Cγ chemical shift differences (Schubert et al., 2002) and/or typical NOE patterns (Wüthrich et al., 1986). Undefined configurations are due to peak overlap, an insufficient signal-to-noise ratio in the NOESY spectrum, or missing chemical shift assignments. Disulfide bonds were determined by a combination of limited proteolysis, Edman degradation, and mass spectrometry (see Experimental Procedures). Note that T10 corresponds to T366 in pleuralin-1.


Table S2, related to Table 1. Restraints, energies and structural statistics of the five PSCD-domains separately simulated using data imputation.ᵃ

| | PSCD1 | PSCD2 | PSCD3 | PSCD4 | PSCD5 |
| --- | --- | --- | --- | --- | --- |
| Number of substitute restraints | | | | | |
| Dihedral angles | 69 | 99 | 70 | 132 | 52 |
| Hydrogen bonds | 9 | 12 | 6 | 13 | 6 |
| NOE distances | 909 | 945 | 1116 | 919 | 992 |
| Total number of restraints | 992 | 1061 | 1197 | 1069 | 1055 |
| Structural statistics | | | | | |
| Etotal [kJ mol-1] | -8172 ± 279 | -7709 ± 165 | -7403 ± 341 | -7673 ± 127 | -7568 ± 1962 |
| NOEs > 0.02 nm | 13 | 15 | 17 | 10 | 12 |
| RMSD [nm]: all amino acids | 0.16 | 0.16 | 0.47 | 0.09 | 0.51 |
| RMSD [nm]: folded core | 0.09 | 0.04 | 0.17 | 0.03 | 0.18 |
| Residues in favored / allowed regions [%] | 89.4 | 88.5 | 89.6 | 88.7 | 95.2 |

ᵃ The five PSCD-domains have highly conserved sequences (Fig. 1). The positions of the ten cysteine residues are unaltered among the domains. The substitute restraints (dihedral angles, hydrogen bonds and NOE distances) were generated from the alignment of each domain with the PSCD4-domain using PERMOL (Möglich et al., 2005) with the conditions defined by Cano et al. (2009) for data imputation. Each PSCD-domain was simulated separately, combining the experimental restraints (for the residues conserved with respect to the PSCD4-domain) with additional substitute restraints (Table 4). The energies and the violations refer to the 10 lowest energy structures after water refinement. The backbone RMSD to the mean of the 10 lowest energy structures was calculated with MOLMOL (Koradi et al., 1996). The well-folded core of the structure corresponds to the structure from E346 to N436. The quality of the Ramachandran plots after water refinement was calculated with PROCHECK (Laskowski et al., 1993).
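As an illustration of the RMSD-to-mean statistic reported in Table S2, the following is a minimal sketch in plain Python. It assumes the models of the ensemble have already been superimposed and that each model is a list of backbone atom coordinates in nm. It does not reproduce the MOLMOL procedure, and the function names and the toy ensemble are hypothetical.

```python
import math

def mean_structure(models):
    """Average the coordinates atom by atom over all (pre-superimposed) models."""
    n_models = len(models)
    n_atoms = len(models[0])
    return [
        tuple(sum(m[a][k] for m in models) / n_models for k in range(3))
        for a in range(n_atoms)
    ]

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long coordinate lists."""
    sq = sum(
        (pa[k] - pb[k]) ** 2
        for pa, pb in zip(coords_a, coords_b)
        for k in range(3)
    )
    return math.sqrt(sq / len(coords_a))

def mean_rmsd_to_mean(models):
    """Average of the per-model RMSD values relative to the mean structure."""
    ref = mean_structure(models)
    return sum(rmsd(m, ref) for m in models) / len(models)

# Hypothetical toy ensemble: two models with two atoms each (coordinates in nm).
toy = [
    [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)],
    [(0.0, 0.02, 0.0), (0.1, 0.0, 0.02)],
]
value = mean_rmsd_to_mean(toy)
```

For a real ensemble, each model would first have to be superimposed on a common reference (for example over the backbone atoms of the folded core) before the mean structure is computed.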


Figure S1, related to Figure 6. Intensity changes of the [1H, 15N]-HSQC cross peaks of PSCD4 after addition of CaCl2.ᵃ

ᵃ The relative peak intensity changes (I0 – I)/I0 of the [1H, 15N]-HSQC cross peaks of PSCD4 after addition of calcium are plotted as a function of the sequence position. I0, peak intensity (value of the maximum) in the absence of CaCl2; I, peak intensity at 10 mM CaCl2. Intensities were corrected for dilution effects. Violet and green lines represent the standard deviation to zero σ0 and 2σ0, respectively. For details see also Fig. 6.
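The quantity plotted in this figure can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis: it assumes that the dilution correction is a simple multiplicative volume factor applied to I, and that the standard deviation to zero is the root mean square of the values. The intensities, residue numbers and function names are hypothetical.

```python
import math

def relative_intensity_change(i0, i, dilution_factor=1.0):
    """Return (I0 - I) / I0 after scaling I by an assumed dilution factor."""
    if i0 <= 0:
        raise ValueError("reference intensity I0 must be positive")
    return (i0 - i * dilution_factor) / i0

def std_to_zero(values):
    """Root mean square of the values, i.e. the spread measured around zero."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Hypothetical peak maxima (arbitrary units), keyed by residue number.
i_ref = {10: 100.0, 11: 80.0, 12: 120.0, 13: 90.0}  # without CaCl2
i_ca = {10: 95.0, 11: 40.0, 12: 114.0, 13: 90.0}    # with CaCl2
dilution = 1.0  # hypothetical ratio of final to initial sample volume

changes = {
    res: relative_intensity_change(i_ref[res], i_ca[res], dilution)
    for res in i_ref
}
sigma0 = std_to_zero(list(changes.values()))

# Residues whose change exceeds one or two standard deviations to zero.
above_1sigma = sorted(res for res, c in changes.items() if abs(c) > sigma0)
above_2sigma = sorted(res for res, c in changes.items() if abs(c) > 2 * sigma0)
```

Residues above the σ0 or 2σ0 line would be the candidates for sites affected by the added ligand. A published analysis may compute σ0 iteratively after excluding outliers, which this sketch does not do.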

Figure S2, related to Figure 7. Intensity changes of the [1H, 15N]-HSQC cross peaks of the PSCD4-α-frustulin complex after addition of CaCl2.ᵃ

ᵃ The relative peak intensity changes (I0 – I)/I0 of the [1H, 15N]-HSQC cross peaks of the PSCD4-α-frustulin complex after addition of calcium are plotted as a function of the sequence position. I0, peak intensity (value of the maximum) in the absence of CaCl2; I, peak intensity at 10 mM CaCl2. Intensities were corrected for dilution effects. Violet and green lines represent the standard deviation to zero σ0 and 2σ0, respectively. For details see also Fig. 6.


Figure S3, related to Figure 2. SDS gel of His6PSCD4 used for the analysis of the disulfide bridges.ᵃ

ᵃ (Left) Molecular mass markers, (middle) His6PSCD4 after 3 months at room temperature, (right) sample used for the analysis of the disulfide patterns.