SUPPLEMENTARY MATERIAL for “Structural insight into repair of alkylated DNA by a new superfamily of DNA glycosylases comprising HEAT-like repeats” by B. Dalhus, I. Høydal Helle, P. H. Backe, I. Alseth, T. Rognes, M. Bjørås and J. K. Laerdahl.
SUPPLEMENTARY TABLES

Conserved B. cereus AlkD residue  Conservation    Location and/or putative function
Tyr27                             91% Tyr or Phe  Near active site
Arg43                             67% Arg or Lys  DNA binding
Lys47                             69% Arg or Lys  DNA binding
Arg74                             58% Arg         Strong bidentate hydrogen bonding with Asp110, i.e. important for correct folding
Trp109                            100% Trp        Active site/recognition pocket
Asp110                            70% Asp         See Arg74
Asp113                            100% Asp        Active site/recognition pocket
Trp145                            98% Trp         Near active site
Arg148                            98% Arg         Active site and/or DNA binding
Phe179                            70% aromatic    Near active site
Phe180                            74% aromatic    Near active site
Lys183                            95% Lys         Active site and/or DNA binding
Trp187                            95% Trp         Active site/recognition pocket
Arg190                            95% Arg or Lys  DNA binding
Lys194                            86% Arg or Lys  DNA binding
Arg215                            90% Arg or Lys  DNA binding
Lys219                            67% Arg or Lys  DNA binding
SUPPLEMENTARY TABLE 1. Main conserved residues of the AlkD homologs. The conserved residues of the AlkD homologs that are most likely involved in DNA binding, lesion recognition and catalysis are listed together with percentage conservation in 43 AlkD homologs. The residues that were investigated by site-directed mutagenesis are given in bold face.
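The percentage conservation reported in Supplementary Table 1 can be illustrated with a minimal Python sketch. The function name `percent_conserved` and the toy column are hypothetical, and counting every entry (including any gap characters) in the denominator is an assumption; the source does not state how the percentages were computed.

```python
def percent_conserved(column, allowed):
    """Return the percentage of entries in `column` that belong to `allowed`.

    `column` holds one residue per homolog at a single alignment position.
    Assumption: every entry, including a gap character, counts in the denominator.
    """
    if not column:
        raise ValueError("column must contain at least one residue")
    hits = sum(1 for residue in column if residue in allowed)
    return 100.0 * hits / len(column)


# Hypothetical alignment column for ten homologs (not data from the paper).
toy_column = "YYYFYYYHYF"
print(percent_conserved(toy_column, {"Y", "F"}))  # 90.0
```

With a real alignment, the column would be read from the aligned sequences of all 43 homologs at the position corresponding to, for example, Tyr27.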
SUPPLEMENTARY FIGURES
Bc AlkD  MHPFVKALQEHFIAHKNPEKAEPMARYMKNHFLFIGIQTPERRQLLKDVIQIHTLPDPKD
PSIPRED  CCHHHHHHHHHHHHHCCHHHHHHHHHHCCCCCEECCCCCHHHHHHHHHHHHHCCCCCHHH
PROF     CCCHHHHHHHHHHHCCCHHHHHHHHHHHHCCCEEECCCCHHHHHHHHHHHHHCCCCCCHH
SSpro    CCHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHCCCCCCCCC
SAM_T02  CCHHHHHHHHHHHHHCCHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHHHHCCCHHH
Jnet     CCCHHHHHHHHHHHCCCCCHHHHHHHHHCCCCCECCCCCHHHHHHHHHHHHHCCCCCHHH

Bc AlkD  FRIIVRELWDLPEREFQAAALDMMQKYKKYINETHIPFLEELIVTKSWWDTVDSIVPTFL
PSIPRED  HHHHHHHHHCCHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHCCCCCHHHHHHHHHHHH
PROF     HHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHCCCCCHHHHHHHHHHH
SSpro    CEEEEHHHCCCCHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHECCCCCHHHHHHHHHHHH
SAM_T02  HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHCCCCHHHHHHHHHHHHH
Jnet     HHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCEEEECCCCCCC

Bc AlkD  GNIFLQHPELISAYIPKWIASDNIWLQRAAILFQLKYKQKMDEELLFWVIGQLHSSKEFF
PSIPRED  HHHHHCCHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHCCCHHHH
PROF     HHHHCCCCHHHHHHHHHHHHCCCCHHHHHHHHHHHHCCCCCCHHHHHHHHHHHCCCCHHH
SSpro    HHHHHHCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHH
SAM_T02  HHHHHHCHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHCCCHHHH
Jnet     CCCCCCCCCHHHHHHHHHHCCCCHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHCCCCHHH

Bc AlkD  IQKAIGWVLREYAKTKPDVVWEYVQNNELAPLSRREAIKHIKENYGINNEKIGETLS
PSIPRED  HHHHHHHHHHHHHHCCHHHHHHHHHHCCCCHHHHHHHHHHHHHHCCCCHHHHHCCCC
PROF     HHHHHHHHHHHHCCCCHHHHHHHHHHCCCCCHHHHHHHHHCHHHCCCCHHHHHCCCC
SSpro    HHHHHHHHHHHHHCCCCCEEEEEEECCCCCCCCHHHHHHHHHHHCCCCCCCCCCCCC
SAM_T02  HHHHHHHHHHHHHHHCHHHHHHHHHHCCCCHHHHHHHHHCCCHHHHHHHHHHHCCCC
Jnet     HHHHHHHHHHHHCCCCCHHHHHHHHHCCCCHHHHHHHHHHCCCCCCCCCCCCCCCCC
SUPPLEMENTARY FIGURE 1. Secondary structure predictions for B. cereus AlkD employing the alphabet H=helix, E=strand, and C=other (loop). The predictions given below the protein amino acid sequence (Bc AlkD) are from PSIPRED v2.5 of Jones and co-workers (1,2), PROF of Rost and Sander (3), SSpro of Pollastri et al. (4), Jnet of Cuff and Barton (5), and the DSSP secondary structure predictions available as a part of the SAM_T02 predictions (6). The consensus 13 helical elements are coloured blue for clarity.
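One way to derive consensus helical elements from several per-residue predictions is a column-wise majority vote. The sketch below is illustrative only: the source does not say how the consensus helices were assigned, so the strict-majority rule, the fallback to C when no state has a majority, the function names `consensus_ss` and `helix_segments`, and the toy prediction strings are all assumptions.

```python
import re
from collections import Counter


def consensus_ss(predictions):
    """Column-wise strict-majority consensus over equal-length H/E/C strings.

    Assumption: a column without a strict majority is assigned 'C' (other).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    consensus = []
    for column in zip(*predictions):
        state, votes = Counter(column).most_common(1)[0]
        consensus.append(state if 2 * votes > len(predictions) else "C")
    return "".join(consensus)


def helix_segments(ss):
    """Return 1-based inclusive (start, end) positions of each run of 'H'."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"H+", ss)]


# Hypothetical six-residue predictions from five methods (not data from the figure).
toy_predictions = ["CHHHHC", "CCHHHC", "CHHHEC", "HHHHCC", "CCHHHC"]
consensus = consensus_ss(toy_predictions)
print(consensus)                  # CHHHHC
print(helix_segments(consensus))  # [(2, 5)]
```

Applied to the five prediction rows of the figure, `helix_segments` would list candidate helices by residue number, which could then be compared with the 13 consensus helical elements described in the caption.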
Bc AlkC  MGKYVPLKFLFNEELAEKMADSICKHDPTFSKRNFVSSVTCNVENLELKQRIEVIADELH
PSIPRED  CCCCCCHHHHCCHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
PROF     CCCCCCHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHCHHHHCHHHHHHHHHHHHH
SSpro    CCCCCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCECEEEECCCCCCHHHHHHHHHHHHH
SAM_T02  CCCCHHHHHHCCHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Jnet     CCCCCCCHHHHHHHHHHHHHHHHHCCCCCCCHHHHHHHHHCCCHHHHHHHHHHHHHHHHH

Bc AlkC  NALQKDFNEAIHTLLKTLGPENTTEVGTFTNGYMYMPIAKYVEKYGLNEFETSFNAMYEI
PSIPRED  HHCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCHHHHHHHHHHH
PROF     HHCCHHHHHHHHHHHHHCCCCCCCCCCCCCCCCEECHHHHHHHHHCCCCHHHHHHHHHHH
SSpro    HHHHHHHHHHHHHHHHHCCCCCCCEECCCCCCCCCCCHHHHHHHHCHHHHHHHHHHHHHH
SAM_T02  HHCCHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Jnet     HCCCCCHHHHHHHHHHCCCCCCCCCCCCCCCCCEEECCHHHHHHCCCCCHHHHHHHHHHH

Bc AlkC  TKRNTAEYAIRPFLETYHEDTLNILQQWIHDENSHIRRLVSEGTRPRLPWAKKIGALKSD
PSIPRED  HCCCCHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHCCCCCCHHHHHHHHHHHC
PROF     HHCCCHHHHHHHHHHCCHHHHHHHHHHHHHCCCCCEEEEECCCCCCCCCCCHHHHHHCCC
SSpro    HHHCHHHHHHHHHHHHCHHHHHHHHHHHHCCCCHHHHHHHHHCCCCCCHHHHHHHHHCCC
SAM_T02  HHHHHHHHHHHHHHHHCHHHHHHHHHHHHCCCCHHHHHHHHHCCCCCCCCCCCCHHHHHC
Jnet     HHCCCHHHHHHHHHHHCCHHHHHHHHHHCCCCCCEEEEEECCCCCCCCCCCCCCCCCCCC

Bc AlkC  FKYNLQLLEPLMNDPSKYVQKSVANHINDITKEDKELVFQWLQQLRDKQHPVNPWIIKHG
PSIPRED  HHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHCCCCCCHHHHHHHHHH
PROF     CCHHHHHHHHHCCCCHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHCCCCCCHHHHHHH
SSpro    CCCCHHHHHHHCCCCCHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHCCCCCCHHHHHHH
SAM_T02  HHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHCHHHHHHHHHHHHHCCCHHHHHHHHHH
Jnet     CCCHHHHHHCCCCCCCEEEEECCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCCHHHHHHH

Bc AlkC  LRTVIKNGTLPKDFCF
PSIPRED  HHHHHHCCCCCCCCCC
PROF     HHHHHHCCCCCCCCCC
SSpro    HHHHHHCCCCCCCCCC
SAM_T02  HHHHHHCCCCCCCCCC
Jnet     HHHHHCCCCCCCCCCC
SUPPLEMENTARY FIGURE 2. Secondary structure predictions for B. cereus AlkC employing the alphabet H=helix, E=strand, and C=other (loop). The predictions given below the protein amino acid sequence (Bc AlkC) are from PSIPRED v2.5 of Jones and co-workers (1,2), PROF of Rost and Sander (3), SSpro of Pollastri et al. (4), Jnet of Cuff and Barton (5), and the DSSP secondary structure predictions available as a part of the SAM_T02 predictions (6). The consensus helical elements are coloured blue for clarity.
Sequence alignment

Bc_AlkD  MHPFVKALQEHFIAHKNPEKAEPMARYMKNHFLFIGIQTPERRQLLKDVIQIHTLPDPKD  60
EF3068   --------MDTLQFQKNPETAAKMSAYMKHQFVFAGIPAPERQALSKQLLKESHTWPKEK  52
                  : :  :****.*  *: ***::*:* ** :***: * *::::       :.

Bc_AlkD  FRIIVRELWDLPEREFQAAALDMMQKYKKYINETHIPFLEELIVTKSWWDTVDSIVPTFL  120
EF3068   LCQEIEAYYQKTEREYQYVAIDLALQNVQRFSLEEVVAFKAYVPQKAWWDSVDAWRKFFG  112
         :   :.  :: .***:* .*:*:  :  : :.  .:  ::  :  *:***:**:    *

Bc_AlkD  GNIFLQHPELISAYIPKWIASDNIWLQRAAILFQLKYKQKMDEELLFWVIGQLHSSKEFF  180
EF3068   SWVALH-LTELPTIFALFYGAENFWNRRVALNLQLMLKEKTNQDLLKKAIIYDRTTEEFF  171
         . : *:    :.: :. : .::*:* :*.*: :**  *:* :::**  .*   ::::***

Bc_AlkD  IQKAIGWVLREYAKTKPDVVWEYVQNNELAPLSRREAIKHIKENYGINNEKIGETLS  237
EF3068   IQKAIGWSLRQYSKTNPQWVEELMKELVLSPLAQREGSKYLAKASE---------    217
         ******* **:*:**:*: * * :::  *:**::**. *:: :
SUPPLEMENTARY FIGURE 3. Alignment of the sequences of B. cereus AlkD and E. faecalis hypothetical protein EF3068. The full-length proteins are aligned, and identical residues are coloured red. This alignment was used by SwissModel (7) to generate the B. cereus AlkD structure model.
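The residue numbers printed at the end of each alignment row, and the identical positions highlighted in the figure, can be checked with a short Python sketch. The two strings are the first row of the alignment in Supplementary Figure 3; the function names `ungapped_length` and `count_identities` are hypothetical helpers introduced here for illustration.

```python
def ungapped_length(row):
    """Number of residues in an alignment row, ignoring '-' gap characters."""
    return sum(1 for ch in row if ch != "-")


def count_identities(row_a, row_b):
    """Number of columns where both rows carry the same residue (gaps excluded)."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    return sum(1 for a, b in zip(row_a, row_b) if a == b and a != "-")


# First row of the Bc_AlkD / EF3068 alignment, copied from the figure.
ALKD_ROW = "MHPFVKALQEHFIAHKNPEKAEPMARYMKNHFLFIGIQTPERRQLLKDVIQIHTLPDPKD"
EF3068_ROW = "--------MDTLQFQKNPETAAKMSAYMKHQFVFAGIPAPERQALSKQLLKESHTWPKEK"

# Residue counters at the end of the first row.
print(ungapped_length(ALKD_ROW), ungapped_length(EF3068_ROW))  # 60 52

# Identical columns in this row.
print(count_identities(ALKD_ROW, EF3068_ROW))
```

Summing `ungapped_length` over successive rows gives the running residue number for each sequence at the end of every row.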
SUPPLEMENTARY FIGURE 4. Multiple sequence alignment of 10 AlkD homologs. The predicted secondary structure for B. cereus AlkD (AlkD) is given together with the sequences for AlkD homologs in E. faecalis (EF3068), Listeria monocytogenes (NP_465770), Streptococcus mutans (NP_720542), Chromobacterium violaceum (NP_899834), Porphyromonas gingivalis (NP_905432), Dictyostelium discoideum (XP_635637), Entamoeba histolytica (XP_654596), Bacillus clausii (YP_174925), Lactobacillus casei (ZP_00384857), and Enterococcus faecium (ZP_00604707). Columns containing conserved, possibly DNA binding Arg and Lys residues are marked with a green asterisk, while a black asterisk is used to mark the active site residues that were targeted by site-directed mutagenesis.
REFERENCES
1. Bryson, K., McGuffin, L. J., Marsden, R. L., Ward, J. J., Sodhi, J. S., and Jones, D. T. (2005) Nucleic Acids Res. 33, W36-38
2. Jones, D. T. (1999) J. Mol. Biol. 292, 195-202
3. Rost, B., and Sander, C. (1993) J. Mol. Biol. 232, 584-599
4. Pollastri, G., Przybylski, D., Rost, B., and Baldi, P. (2002) Proteins 47, 228-235
5. Cuff, J. A., and Barton, G. J. (2000) Proteins 40, 502-511
6. Karplus, K., Karchin, R., Draper, J., Casper, J., Mandel-Gutfreund, Y., Diekhans, M., and Hughey, R. (2003) Proteins 53, 491-496
7. Schwede, T., Kopp, J., Guex, N., and Peitsch, M. C. (2003) Nucleic Acids Res. 31, 3381-3385