Journal of Structural and Functional Genomics 4: 227–234, 2003. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
Solution structure of a PAN module from the apicomplexan parasite Eimeria tenella

Philip J. Brown 1,3, Denise Mulvey 2, Jennifer R. Potts 2, Fiona M. Tomley 1,* & Iain D. Campbell 2,*

1 Institute for Animal Health, Compton, Newbury, Berkshire RG20 7NN, UK; 2 Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, UK; 3 Current address: Nuffield Dept. of Clinical Lab. Sciences, John Radcliffe Hospital, Headley Way, Headington, Oxford OX3 9DU, UK; *Authors for correspondence (Fax: +44-(0)1865-275253; E-mail:
[email protected];
[email protected])
Received 22 January 2003; accepted in revised form 21 May 2003
Key words: Apple, Eimeria tenella, NMR, PAN module

Abstract

Micronemes, specialised organelles found in all apicomplexan parasites, secrete molecules that are essential for parasite attachment and invasion of host cells. EtMIC5 is one such microneme protein that contains eleven tandemly repeating modules. These modules have homology with the PAN module superfamily. Members of this family are found in blood clotting proteins, some growth factors and some nematode proteins. This paper presents the structure of the 9th PAN module in EtMIC5, determined using high resolution NMR. The structure shows similarities to and some differences from the N-terminal module of hepatocyte growth factor (HGF), the only previous member of the PAN family with known structure.

Abbreviations: NMR – nuclear magnetic resonance; NOE – nuclear Overhauser enhancement; NOESY – NOE spectroscopy; COSY – correlated spectroscopy; TOCSY – total correlated spectroscopy; HSQC – heteronuclear single quantum coherence; HMQC-J – heteronuclear multiple quantum coherence-J coupling; MICs – microneme proteins; EtMIC5 – a microneme protein from Eimeria tenella; Apple9 – the ninth Apple repeat of EtMIC5; FXI – blood coagulation factor XI; PK – plasma prekallikrein; HGF – hepatocyte growth factor.

Introduction

Protozoa of the phylum Apicomplexa, such as Cryptosporidium, Eimeria, Plasmodium and Toxoplasma, are obligate intracellular parasites that cause serious diseases of man and livestock. The mechanism by which they actively penetrate target host cells is highly specialised and quite distinct from the entry of other intracellular pathogens. Parasite gliding motility and host cell invasion are tightly coupled to the regulated secretion of proteins from specialised secretory organelles, which are located just inside the parasite apical tip [1]. In particular, micronemes release their cargo early during invasion in response to environmental stimuli [2, 3].
Almost 40 microneme proteins (MICs) have so far been characterised from various apicomplexans. A striking feature is that many MICs, which are ligands for host-cell receptors, contain sequences homologous to protein modules (‘module’ is used here as defined in [4]) from higher eukaryotes [5]. Recently, a family of MICs has been identified that contain PAN modules; family members have been found in Sarcocystis (SML, [6]), Eimeria (EtMIC5, [7]) and Toxoplasma (TgMIC4, [8]). SML has two PAN modules and functions as a dimeric lectin with high affinity for galactose [9]. TgMIC4 has 6 PAN modules and its binding to host cells shows a biphasic response to galactose, N-acetylgalactosamine or N-acetylglucosamine, being enhanced by low concentrations (1 mg ml−1) and inhibited by high concentrations (50 mg ml−1). EtMIC5 has 11 contiguous PAN modules with short linkers between the modules in all but two cases. Most recently, PAN modules were described in a large family of proteins from Cryptosporidium parvum (TRAPC1, CpTSP3, 4, 5 & 6 and CpApp5, 6, 7 & 8, [10]); in all of these proteins, the PAN modules are followed by a thrombospondin type 1-like module. The localisation of these proteins within the parasite is not known.

Based on homology and protein structure predictions [11], the PAN module superfamily is widely dispersed in nature, with family members identified in proteobacteria, apicomplexan protozoa, filamentous fungi, plants, nematodes, arthropods, tunicates, birds, amphibians and mammals (Figure 1). All members of the superfamily contain a conserved core of two disulphide bridges (these modules are called PAN_Ap in the SMART nomenclature, http://smart.embl-heidelberg.de/), and a subset of modules (called Apple in the SMART nomenclature) possess a third disulphide bridge that links the N and C termini [12]. The PAN_Ap group includes the N-terminal modules of members of the plasminogen/hepatocyte growth factor family; these modules occur in many different species, some examples of which are shown in Figure 1. The Apple (three disulphide bridge) modules include those in blood coagulation factor XI (FXI) and plasma pre-kallikrein (PK; [13, 14]) and most of those in the apicomplexan and nematode proteins. However, the Cryptosporidium parvum PAN modules have eight conserved cysteines, suggesting that these have four disulphide bridges. In some proteins PAN modules appear to mediate protein-protein interactions, e.g. in factor XI, while in others they mediate protein-carbohydrate interactions, e.g. the N-terminal module of HGF, which binds heparin [15, 16].

To gain further insight into the biological function of apicomplexan PAN modules, we have used high resolution NMR to determine the structure of a PAN module of the E. tenella microneme protein EtMIC5. Its structure is compared to the previously determined structure of the N-terminal module of HGF [17, 18, 19], which has two rather than three disulphide bridges.

Figure 1. Representative PAN modules were aligned using ClustalX and displayed in GeneDoc with cysteines and other key conserved sequences shaded. The selected modules are from a bacterium (Mesorhizobium loti protein, Q982A0), apicomplexans (Eimeria tenella ETMIC5 protein, Q9U966 (in bold); Toxoplasma gondii TgMIC4 protein, Q9XZH7; Sarcocystis muris SML protein, MIAM_SARMU, Q08668; Cryptosporidium parvum CpTSP4 protein, AAO39042), a filamentous fungus (Phytophthora parasitica cbel protein, O42830), plants (Zea mays receptor protein kinase, KPRO_MAIZE, P17801; Oryza sativa protein kinase, Q9LHU0; Ipomoea trifida secreted glycoprotein, Q40099; Arabidopsis thaliana receptor protein kinases, Q9ZRO8, Q39203, P93756), a tunicate (Polyandrocarpa misakiensis retinoic acid inducible proteinase, Q9YIV3), a nematode (Caenorhabditis elegans proteins, Q17797, YSM5_CAEEL, O62201), an arthropod (Drosophila melanogaster protein Q9BDM4), an amphibian (Xenopodinae xenopus hepatocyte growth factor, Q91402), an avian (Gallus gallus hepatocyte growth factor, Q90978) and a mammal (human hepatocyte growth factor, HGF_human, P14210; hepatocyte growth factor-like protein, HGFL_human, P26927; plasminogen, PLMN_human, P00747; prekallikrein, KAL_human, P03952; coagulation factor XI, FA11_human, P03951).

Figure 2. The sequence of Apple9, amino acids numbered as in this work; T12 is equivalent to T712 in the sequence of EtMIC5; amino acids 1−8 are the FLAG attachment and 9−11 are from the plasmid vector.

Materials and methods

Sub-cloning, expression and purification of 15N-labelled FLAGA9 protein

The ninth PAN module of EtMIC5 was chosen because in previous expression studies it gave a high yield of periplasmically expressed protein [12]. The module was subcloned into plasmid vector pFLAG-ATS essentially as described previously [12], except that a stop codon was incorporated into the 3′ PCR primer to avoid expression of 11 C-terminal residues derived from the vector. Escherichia coli BL21 cells harbouring plasmid pFLAGA9.540 were grown in minimal medium containing 1 × (15NH4)2SO4 and the cells were harvested at 3 h post-induction. The sequence of the expressed recombinant fragment, called Apple9, and the numbering system used are illustrated in Figure 2. Recombinant Apple9 protein was purified from a periplasmic shock fraction essentially as described previously [12], except that 10 mM triethylamine phosphate, pH 6.0, was used in place of 0.1% trifluoroacetate in all reverse-phase buffers and the product was eluted into water after the desalting step. In total, 28 mg of 15N-labelled Apple9 protein was produced. The extent of 15N labelling was assessed by mass spectrometry.
NMR spectroscopy

Samples of Apple9 for NMR experiments were made 0.8 mM in 93% H2O/7% D2O or 1.7 mM in 99.9% D2O, pH 4.7 (meter uncorrected for deuterium). All NMR experiments were acquired at 30 °C on spectrometers at the Oxford Centre for Molecular Sciences, operating at 500.1, 600.1 and 750.1 MHz for 1H. Experiments were recorded in a phase-sensitive mode using the States/TPPI method for quadrature detection in the indirectly detected dimensions. 1H−15N decoupling was achieved using a Waltz-16 pulse-train. The following two-dimensional spectra were recorded in H2O: a [1H−1H]-PreTOCSY-COSY [20]; a [1H−1H]-NOESY with a mixing time of 150 ms [21]; a [1H−1H]-TOCSY with a mixing time of 41 ms [22]; a gradient-enhanced [1H−15N]-HSQC [23]; and a [1H−15N]-HMQC-J [24]. The following two-dimensional spectra were recorded in D2O: a [1H−1H]-TOCSY with a mixing time of 42 ms [22] and a [1H−1H]-NOESY with a mixing time of 150 ms [21]. For measuring the 15N-{1H} NOE, two experiments were recorded, with and without 1H saturation during the recycle delay [25]. A three-dimensional gradient-enhanced [1H−15N]-TOCSY with a mixing time of 46 ms [26] and a three-dimensional gradient-enhanced [1H−15N]-NOESY-HSQC with a mixing time of 150 ms [26] were measured. Slowly exchanging amide protons were identified by dissolving protonated peptide in D2O and recording one-dimensional spectra at time intervals over several hours.

Data processing was performed using the FELIX 2.3 software package (Biosym Technologies Inc.). The 1H chemical shift (4.7 ppm) of the H2O peak at 30 °C was calculated as described in [27], and the 15N dimension was referenced using a 15N/1H frequency ratio of 0.101329118 [28].

NOEs were assessed in categories with upper distance limits for inter-proton distances of 2.8, 3.5, 5.0 and 6.0 Å. Intraresidue NOEs between protons on adjacent carbon atoms were not included. Hydrogen bond restraints were included in the calculations if an HN-O distance of < 2.3 Å and an O-HN-NH angle of > 120° were found in at least 70% of the unrestrained structures, and the solvent exchange of the amide proton was slow. For each hydrogen bond two distance restraints were introduced into the calculation (d(HN-O) = 1.7−2.3 Å and d(N-O) = 2.4−3.3 Å). Backbone torsion angle (φ) restraints were derived by measuring 3J(HN-Hα) spin-spin coupling constants (where possible) from the [1H−15N]-HMQC-J spectra using spectral simulations [29]. For residues with 3J(HN-Hα) < 6 or > 8 Hz, estimates of φ angles were obtained using a modified Karplus equation [30] and checked for consistency with preliminary, unrestrained structures. From this information φ angle restraints were included in the structure calculations, with errors varying from ± 10° in the well-defined sections of the structure to ± 30°.

Structure calculations were performed using an ab initio simulated annealing protocol within the program XPLOR 3.8 [31] and the PROLSQ option in the ‘parallhdg.pro’ parameter file (version 5.1) [32]. Calculations, from an extended template, included initial randomisation of backbone torsion angles and incorporation of floating chirality for prochiral groups. Of 250 calculated structures, the 25 lowest energy structures were refined and 20 final structures were chosen on the basis of low energy and consistency with the experimental restraints.
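As a rough illustration of how a measured 3J(HN-Hα) coupling constrains φ, the sketch below evaluates a Karplus relation and scans for compatible φ values. The coefficients (6.4, −1.4 and 1.9 Hz) are the values commonly quoted for the parametrisation of Pardi et al. [30] and are an assumption here; the text does not state the exact form of the equation that was used. The function names, the 1° grid and the tolerance are likewise hypothetical.

```python
import math

# Assumed Karplus coefficients (Hz), as commonly quoted for Pardi et al. [30].
A, B, C = 6.4, -1.4, 1.9


def j_hn_ha(phi_deg):
    """Predicted 3J(HN-Halpha) in Hz for a backbone torsion angle phi (degrees)."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C


def phi_candidates(j_obs, tol=0.25):
    """All phi values on a 1-degree grid whose predicted J lies within tol Hz of j_obs.

    The Karplus curve is multi-valued, so several phi ranges can match one J;
    this is why estimates need checking against preliminary structures.
    """
    return [float(p) for p in range(-180, 180)
            if abs(j_hn_ha(p) - j_obs) <= tol]


if __name__ == "__main__":
    for phi in (-120, -60, 60):
        print(f"phi = {phi:5d} deg  ->  J = {j_hn_ha(phi):.1f} Hz")
    print(phi_candidates(9.5))
```

With these assumed coefficients, φ near −60° (helix-like) predicts a coupling below 6 Hz and φ near −120° (sheet-like) predicts one above 8 Hz, which is consistent with restricting φ estimates to residues outside the 6 to 8 Hz window.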
Table 1. Experimental restraints and structural statistics with root mean square deviations for 20 structures of Apple9, the 9th PAN module of EtMIC5.

R.m.s. deviations from experimental data (a)
  All 1146 NOE restraints                      0.013 ± 0.003 Å
  392 intraresidue NOEs                        0.004 ± 0.001 Å
  358 sequential NOEs                          0.020 ± 0.006 Å
  96 short-range NOEs (1 < |i-j| < 5)          0.009 ± 0.003 Å
  300 long-range NOEs (|i-j| > 4)              0.008 ± 0.001 Å
  20 hydrogen bond restraints                  0.004 ± 0.002 Å
  24 dihedral angles                           0.001 ± 0.002°

R.m.s. deviations from covalent geometry
  Bonds                                        0.001 ± 0.0001 Å
  Angles                                       0.260 ± 0.004°
  Impropers                                    0.156 ± 0.018°

Ramachandran analysis (b)
  Residues in favoured regions                 65.4%
  Residues in additional allowed regions       30.8%
  Residues in generously allowed regions       2.9%
  Residues in disallowed regions               0.9%

Coordinate precision (c)
  Secondary structure backbone                 0.57 ± 0.12 Å
  All heavy atoms                              1.25 ± 0.28 Å

(a) None of the 20 accepted structures showed distance restraint violations of > 0.2 Å or dihedral restraint violations of > 2°.
(b) Prolines and glycines are excluded.
(c) Coordinate r.m.s. deviations were calculated following best-fit superimposition over secondary structure elements.
Results

The 86-residue Apple9 peptide (Figure 2) includes the amino-terminal, eight-residue FLAG tag followed by three residues from the plasmid vector and finally the 75 residues of the ninth Apple domain of EtMIC5. The predicted mass of this protein, assuming 100% 15N incorporation, is 9324.32 Da. By electrospray mass spectrometry, a single species of 9317.58 ± 0.16 Da was detected, suggesting that the protein is homogeneous, with ≥ 94% of the N atoms labelled.

NMR spectroscopy

NMR spectroscopy and structural calculations on Apple9 yielded the data shown in Figure 3 and Table 1.
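The labelling estimate quoted from the mass spectrum can be reproduced with simple arithmetic: each nitrogen that remains 14N lowers the mass by roughly 0.997 Da relative to the fully labelled value. The sketch below is a minimal illustration under stated assumptions; the isotope masses are standard values, but the number of nitrogen atoms in Apple9 is not given in the text, so `n_nitrogen` is a hypothetical input the reader must supply from the sequence.

```python
# Monoisotopic masses of the two nitrogen isotopes (Da).
MASS_15N = 15.0001089
MASS_14N = 14.0030740
DELTA_N = MASS_15N - MASS_14N  # mass lost per nitrogen left unlabelled


def labelling_fraction(mass_fully_labelled, mass_observed, n_nitrogen):
    """Estimate the fraction of nitrogen atoms carrying 15N.

    mass_fully_labelled: predicted mass assuming 100% 15N incorporation (Da)
    mass_observed:       mass measured by mass spectrometry (Da)
    n_nitrogen:          total nitrogen atoms in the protein (assumed input)
    """
    deficit = mass_fully_labelled - mass_observed
    unlabelled_atoms = deficit / DELTA_N
    return 1.0 - unlabelled_atoms / n_nitrogen


if __name__ == "__main__":
    # Masses are taken from the text; the nitrogen count is a placeholder only.
    n_placeholder = 100
    frac = labelling_fraction(9324.32, 9317.58, n_placeholder)
    print(f"Estimated 15N incorporation: {frac:.1%} (for {n_placeholder} N atoms)")
```

The mass deficit of 6.74 Da corresponds to fewer than seven unlabelled nitrogens on average, so the resulting percentage depends only on the true nitrogen count of the construct.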
Figure 3. A. Slowly exchanging amide protons of Apple9 (·); the chemical shift index (CSI) [33]; an indication of helical and sheet regions; B. number of NOEs, intraresidue (white), sequential (black), short (< 5 amino acid residues; pale grey) and long (> 4 amino acid residues; dark grey) per residue; C. Rmsd (Å) per residue calculated for backbone (solid) and sidechain (dotted) heavy atoms.
Figure 4 shows a stereo view of 20 structures superimposed over the structured regions of Apple9 (residues 1−11 and 82−86 omitted) and a ribbon diagram of a single structure (residues 1−11 and 82−86 omitted). The Apple9 fold is similar to that of the N-terminal PAN module of HGF determined by NMR [17] and by crystallography [18, 19]. The two modules share 20% amino acid identity and 23% amino acid similarity (43% combined; Clustal W; [34]), compared with highest and lowest combined identity and similarity values for PAN modules in Figure 1 of 52% and 30%, for modules in FXI and TgMIC4 (Toxoplasma gondii), respectively. A 10-residue helix lies on one side of a sheet of five antiparallel β-strands; on the opposite face to the helix, there is a small antiparallel sheet formed between two long loops. Four phenylalanines (F20, F49, F50, F60) play an important role in the hydrophobic core of the structure, and two of these, F50 and F60, are conserved in the HGF structure.

To investigate possible binding to Apple9, heparin was titrated into NMR samples of the peptide and 15N−1H HSQC spectra were measured. With low molecular weight (~3000 Da) heparin, no binding, as detected by changes in chemical shifts, was observed at either pH 4.7 or 7.1. Binding experiments were carried out at 30 °C in the presence of 53 mM sodium acetate and 103 mM sodium chloride, with the molar ratio of heparin to Apple9 varying from 0.05 to 6. This contrasts with experiments by Zhou et al. [15] and Lietha et al. [16], in which a heparin binding site was observed in the N-terminal module of HGF; however, the lysine and arginine residues involved are not conserved in the Apple9 sequence. Lactose also did not bind to Apple9 under the same conditions, although lactose binding to intact EtMIC5 protein has been observed (F. Tomley and J. Bumstead, unpublished). Except for changes of about 0.06 ppm for both amide proton and 15N chemical shifts of residues between C41 and C47, the effect of changing the pH from 4.7 to 7.1 was small and could be related to the influence of titrating side chains and terminal residues.

Figure 4. A. Stereoview of a family of 20 structures of Apple9 superimposed over structured regions (omitting residues under 12 and over 81). B. Low energy structure of Apple9 (omitting residues under 12 and over 81) in the same orientation as in 4A.
A heteronuclear 15N-{1H} NOE experiment was carried out to measure relative flexibility in the protein (data shown in supplementary material). The N- and C-terminal residues are highly flexible but, apart from increased mobility around residues 24 and 63, which correspond to loop regions, the core of the protein appears to be stable.
Discussion

PAN modules are relatively common in nature, having been identified in proteins from a wide range of biological species. PAN module sequences contain two conserved disulphide bridges. A subset of PAN modules (Apple) is found in FXI, PK and apicomplexan proteins as well as in some proteins from C. elegans and plants. These Apple modules contain an additional disulphide bridge that links their N and C termini. Until now only one PAN structure was known, that of the N-terminal module of HGF. The first structure of the Apple subset of PAN modules is described here: Apple9 from the apicomplexan MIC protein EtMIC5.

The third disulphide bridge of Apple9, which is absent from the HGF PAN module, forms between residues C13 and C81. Overall, the structure of Apple9 is more tightly folded than that of the HGF PAN; Apple9 has 17 fewer residues within the region corresponding to C13 – C81. Differences between the structures lie mostly in interconnecting loops, in particular in the loop following the helix, in the two loops on the far side of the five-stranded sheet from the helix, and in the groove between the second strand of the β-sheet and the helix, where heparin binds in the N-module of HGF [15, 17]. Apple modules in all of the apicomplexan MICs are similarly short, whereas those from FXI and PK have an extended loop between the third and fourth strands of the five-stranded sheet, and those of the nematode proteins have an extended post-helix loop. The two extra cysteines of the Cryptosporidium parvum PAN modules occur in the small β-sheet between the two long loops on the other side of the five-stranded β-sheet from the helix, corresponding approximately to F20 and L66 in the Apple9 structure. A disulphide bond here would help to hold the two loops together.

The structures of Apple9 and the PAN domain of HGF overlay with an RMSD of 0.95 Å when the backbone heavy atoms of the helix and five-stranded β-sheet are superimposed. The small β-sheets in the back loops do not superimpose, but their orientation is influenced by the length of the loops, which differs between the two proteins.

Although Apple9 and the HGF PAN module are in distinct subclasses of the PAN superfamily, their structures have been shown here to be similar. It seems likely that the Apple modules of pre-kallikrein, factor XI and the nematodes, as predicted on the basis of homology by Tordai et al. [11], as well as the eight-cysteine modules of Cryptosporidium parvum, will all have similar structures. In conclusion, the domains identified as ‘PAN’ and ‘Apple’, which occur in a very wide range of proteins and species (Figure 1 shows only a small number of representative examples), are a good example of a structural unit used by biology in different contexts and different ways.
Accession code

The coordinates have been deposited in the Brookhaven Protein Data Bank with accession code 1hky.

Supplementary information

A Table containing the chemical shift values of Apple9 at 30 °C and pH 4.7 and a Figure analysing the 15N-{1H} NOE are available.

Acknowledgements

We would like to thank Lawrence Hunt and Philip Nugent for their help and advice with protein purification and Andy Gill for the mass spectrometry. IDC and JRP acknowledge support from the Wellcome Trust and the British Heart Foundation. PB was supported by a BBSRC studentship.
References

1. Carruthers, V.B. and Sibley, L.D. (1997) Eur. J. Cell. Biol. 73, 114–123.
2. Carruthers, V.B. and Sibley, L.D. (1999) Mol. Microbiol. 31, 421–428.
3. Bumstead, J.M. and Tomley, F.M. (2000) Mol. Biochem. Parasitol. 110, 311–321.
4. Bork, P., Downing, A.K., Kieffer, B. and Campbell, I.D. (1996) Q. Rev. Biophys. 29, 119–167.
5. Tomley, F.M. and Soldati, D.S. (2001) Trends Parasitol. 17, 81–88.
6. Eschenbacher, K.H., Klein, H., Sommer, I., Meyer, H.E., Entzeroth, R., Mehlhorn, H. and Ruger, W. (1993) Mol. Biochem. Parasitol. 62, 27–36.
7. Brown, P.J., Billington, K.J., Bumstead, J.M., Clark, J.D. and Tomley, F.M. (2000) Mol. Biochem. Parasitol. 107, 91–102.
8. Brecht, S., Carruthers, V.B., Ferguson, D.J., Giddins, O.K., Wang, G., Jaekle, U., Harper, J.M., Sibley, L.D. and Soldati, D. (2001) J. Biol. Chem. 276, 4119–4127.
9. Klein, H., Loschner, B., Zyto, N., Portner, M. and Montag, T. (1998) Glycoconjugate J. 15, 147–153.
10. Deng, M., Templeton, T.J., London, N.R., Bauer, C., Schroeder, A.A. and Abrahamsen, M.S. (2002) Infection Immunity 70, 6987–6995.
11. Tordai, H., Bányai, L. and Patthy, L. (1999) FEBS Lett. 461, 63–67.
12. Brown, P.J., Gill, A.C., Nugent, P., McVey, J.H. and Tomley, F.M. (2001) FEBS Lett. 497, 31–38.
13. McMullen, B.A., Fujikawa, K. and Davie, E.W. (1991a) Biochemistry 30, 2050–2056.
14. McMullen, B.A., Fujikawa, K. and Davie, E.W. (1991b) Biochemistry 30, 2056–2060.
15. Zhou, H., Casas-Finet, J.R., Heath Coats, R., Kaufman, J.D., Stahl, S.J., Wingfield, P.T., Rubin, J.S., Bottaro, D.P. and Byrd, R.A. (1999) Biochemistry 38, 14793–14802.
16. Lietha, D., Chirgadze, D.Y., Mulloy, B., Blundell, T.L. and Gherardi, E. (2001) EMBO J. 20, 5543–5555.
17. Zhou, H., Mazulla, M.J., Kaufman, J.D., Stahl, S.J., Wingfield, P.T., Rubin, J.S., Bottaro, D.P. and Byrd, R.A. (1998) Structure 6, 109–116.
18. Ultsch, M., Lokker, N.A., Godowski, P.J. and de Vos, A.M. (1998) Structure 6, 1383–1393.
19. Chirgadze, D.Y., Hepple, J.P., Zhou, H., Byrd, A.R., Blundell, T.L. and Gherardi, E. (1999) Nature Struct. Biol. 6, 72–79.
20. Otting, G. and Wüthrich, K. (1987) J. Magn. Reson. 75, 546–549.
21. Kumar, A., Ernst, R.R. and Wüthrich, K. (1980) Biochem. Biophys. Res. Commun. 95, 1–6.
22. Davis, D.G. and Bax, A. (1985) J. Am. Chem. Soc. 107, 2820–2821.
23. Kay, L.E., Kiefer, R. and Sarinen, T. (1992) J. Am. Chem. Soc. 114, 10663–10665.
24. Kay, L.E. and Bax, A. (1990) J. Magn. Reson. 86, 110–126.
25. Kay, L.E., Torchia, D.A. and Bax, A. (1989) Biochemistry 33, 5984–6003.
26. Marion, D., Driscoll, P.C., Kay, L.E., Wingfield, P.T., Bax, A., Gronenborn, A.M. and Clore, G.M. (1989) Biochemistry 28, 6150–6156.
27. Cavanagh, J., Fairbrother, W.J., Palmer III, A.G. and Skelton, N.J. (1996) Protein NMR Spectroscopy, Academic Press Inc., San Diego, CA, p. 175.
28. Wishart, D.S., Bigam, C.G., Yao, J., Abildgaard, F., Dyson, H.J., Oldfield, E., Markley, J.L. and Sykes, B.D. (1995) J. Biomol. NMR 6, 135–140.
29. Redfield, C. and Dobson, C.M. (1990) Biochemistry 29, 7201–7214.
30. Pardi, A., Billeter, M. and Wüthrich, K. (1984) J. Mol. Biol. 180, 741–751.
31. Brünger, A.T. (1992) X-PLOR (Version 3.1) A System for X-ray Crystallography and NMR. Yale University, New Haven, CT.
32. Linge, J.P. and Nilges, M. (1999) J. Biomol. NMR 13, 51–59.
33. Wishart, D.S., Sykes, B.D. and Richards, F.M. (1992) Biochemistry 31, 1647–1651.
34. Higgins, D.G., Thompson, J.D. and Gibson, T.J. (1994) Nucleic Acids Res. 22, 4673–4680.