SUPPLEMENTARY INFORMATION
doi:10.1038/nature12959
SUPPLEMENTARY NOTES

Supplementary Note 1: Whole genome amplification of single filaments. The success of whole genome amplification (WGA) can be strongly influenced by the condition of the cells prior to multiple displacement amplification (MDA), including cell preservation after collection, cell storage, and cell lysis conditions. For “Entotheonella factor”, WGA was successful when cells were sorted immediately after differential centrifugation. Cells sorted into 96-well plates for later use were stored at 4 °C rather than frozen at -80 °C, as is standard. Prior to MDA, heat (95 °C) was sufficient to lyse the cells, and genome amplification was conducted according to the manufacturer's protocols.
Supplementary Note 2: Assembly of the 16S rRNA region from the metagenome. Since only a single contig was assembled containing an "Entotheonella"-derived 16S rRNA gene sequence, the original reads for the 16S region were manually inspected, revealing 35 SNPs with a frequency of 30% to 50%. The origin of these variants was reassessed by analyzing 16S rRNA gene sequences amplified from the enriched filamentous sample as well as the MDA plates. This identified two highly similar sequences (97.6% pairwise identity), one with 100% identity to the genomic sequence and a second with 36 nt differences, of which 35 were identical to the SNPs identified from the genome assembly.
Supplementary Note 3: Detailed protein isolation and mass exchange assay for A domain characterization. Cells from overnight expression cultures were harvested by centrifugation (5000 rpm, 20 min, 4 °C), resuspended in lysis buffer (25 mM Tris-HCl pH 8.0, 400 mM NaCl, 10% (v/v) glycerol, 10 mM imidazole), and lysed with a French press. The soluble fraction was purified on Ni-NTA resin with increasing concentrations of imidazole and analyzed by SDS-PAGE. Pure fractions were pooled, desalted (PD10 column, GE), and concentrated (Vivaspin MWCO 30 kDa, Sartorius). For mass exchange-based adenylation
assays, a 6 µL reaction mixture consisting of 600 nM enzyme, 1 mM amino acid, 20 mM Tris-HCl (pH 7.5), 5 mM MgCl2, 5 mM inorganic pyrophosphate, 0.3 mM DTT, and 1 mM γ-18O4-ATP was incubated at 25 °C for 2 h before being quenched with 6 µL of 9-aminoacridine in acetone (10 mg mL-1). The data were recorded on a MALDI Thermo LTQ Orbitrap™ XL equipped with a nitrogen laser at λ = 337 nm. The MS was operated in negative ionization and FTMS mode. The laser energy was tuned semi-automatically on the 9-aminoacridine matrix and set to 35 µJ. The following parameters were applied: automatic spectrum filtering (ASF) = off, automatic gain control (AGC) = on, microscans = 1, resolution 15000, scan range 500-520 m/z, and crystal positioning system (CPS). The average of 100 scans was used for each experiment. Substrate conversion (%) was calculated with the equation $\%\,\mathrm{exchange} = (100/0.833) \times {}^{16}\mathrm{O}/({}^{16}\mathrm{O} + {}^{18}\mathrm{O})$ and normalized to the amino acid with the greatest specificity.
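The exchange calculation in Supplementary Note 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' processing script: the function names are hypothetical, and the inputs are assumed to be the peak intensities of the unlabelled (16O) and labelled (18O) ATP species. The constant 0.833 is taken directly from the equation; it is consistent with the 5:1 ratio of pyrophosphate to ATP in the reaction mixture (5/6 ≈ 0.833), although the note does not state this derivation.

```python
# Hypothetical helper names; only the formula itself comes from the note.
EQUILIBRIUM_FRACTION = 0.833  # constant from the equation in Supplementary Note 3


def percent_exchange(intensity_16o: float, intensity_18o: float) -> float:
    """Return % exchange = (100 / 0.833) * 16O / (16O + 18O)."""
    total = intensity_16o + intensity_18o
    if total <= 0:
        raise ValueError("summed peak intensity must be positive")
    return (100.0 / EQUILIBRIUM_FRACTION) * intensity_16o / total


def normalize_to_best(exchange_by_substrate: dict) -> dict:
    """Scale every substrate to the one with the highest exchange (= 100)."""
    best = max(exchange_by_substrate.values())
    return {name: 100.0 * value / best
            for name, value in exchange_by_substrate.items()}


if __name__ == "__main__":
    # Made-up intensities, for illustration only.
    raw = {
        "substrate_1": percent_exchange(400.0, 600.0),
        "substrate_2": percent_exchange(100.0, 900.0),
    }
    print(normalize_to_best(raw))
```

Normalizing to the best substrate makes assays from different enzyme preparations comparable, since only relative preference is reported.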
Table S1 | Sequencing and assembly statistics of the different HTS technologies and platforms used.

| Technology | Platform | Run | Total reads | Total bases | Total w/ paired read | Pair distance | Aligned reads | Aligned bases | Inferred read error [%] |
|---|---|---|---|---|---|---|---|---|---|
| 454 | GS-FLX | GW1GXA001 | 278,628 | 103,517,209 | / | / | 199,866 (71.7%) | 71,416,758 (69.0%) | 1.15 |
| 454 | GS-FLX | G0BZMZS04 | 365,682 | 62,148,666 | 141,253 | 1,766.1 ± 592.6 | 252,538 (69.1%) | 42,421,695 (68.3%) | 1.79 |
| 454 | GS-FLX | G5M2T3U03 | 348,385 | 60,088,308 | 137,315 | 1,765.5 ± 592.6 | 241,724 (69.4%) | 41,164,695 (68.5%) | 1.69 |
| Illumina | MiSeq | WGS-PE | 4,166,800 | 583,304,284 | 4,166,800 | 537.5 ± 201.6 | 3,335,888 (80.1%) | 457,764,474 (78.5%) | 1.15 |
| Illumina | MiSeq | MP | 316,113 | 63,051,407 | 316,113 | 4,393.7 ± 1,098.4 | 265,535 (84.0%) | 53,994,815 (85.6%) | 0.52 |
| PacBio | RS | Run01 | 21,978 | 28,873,272 | / | / | 11,468 (52.2%) | 14,623,200 (50.7%) | 2.29 |
| Total | | | 5,497,586 | 900,983,146 | 4,761,481 | / | 4,307,019 (78.3%) | 681,385,861 (75.6%) | 1.19 |
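The percentages in Table S1 follow from the read counts, which gives a quick consistency check on the table. The sketch below recomputes the aligned-read percentage per run and the overall totals; the dictionary layout and function name are illustrative assumptions, and only the numbers come from the table.

```python
# Read counts copied from Table S1: run -> (total reads, aligned reads).
RUNS = {
    "GW1GXA001": (278_628, 199_866),
    "G0BZMZS04": (365_682, 252_538),
    "G5M2T3U03": (348_385, 241_724),
    "WGS-PE": (4_166_800, 3_335_888),
    "MP": (316_113, 265_535),
    "Run01": (21_978, 11_468),
}


def aligned_percent(aligned: int, total: int) -> float:
    """Percentage of reads that aligned, rounded to one decimal place."""
    return round(100.0 * aligned / total, 1)


if __name__ == "__main__":
    for run, (total, aligned) in RUNS.items():
        print(run, aligned_percent(aligned, total))
    total_reads = sum(total for total, _ in RUNS.values())
    total_aligned = sum(aligned for _, aligned in RUNS.values())
    print("Total", total_reads, total_aligned,
          aligned_percent(total_aligned, total_reads))
```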
Table S2 | Assembly statistics of binned sequence data.

| Statistic | TSY1 | TSY2 | Contaminants & small contigs | Total |
|---|---|---|---|---|
| Reads | 1,918,908 | 617,323 | 1,770,788 | 4,307,019 |
| Assembled bases | 303,571,246* | 97,660,499* | 280,154,117* | 681,385,861 |
| Number of scaffolds | 563 | 860 | 436 | 1,859 |
| Number of contigs in scaffolds | 1,577 | 2,303 | 703 | 4,583 |
| Number of large contigs (>= 500 bp) | 1,820 | 3,270 | 13,003 | 18,093 |
| Number of all contigs (>= 100 bp) | n.d.† | n.d.† | 77,162 | 82,252 |
| Bases in contigs | 8,894,357 | 8,820,512 | 27,775,048 | 45,489,917 |
| Coverage | 34.13 | 11.07 | 10.09 | 14.98 |
| G+C content [%] | 55.79 | 55.55 | 42.17 | 47.78 |
| Scaffold size, average | 15,346 | 9,229 | 3,565 | 10,605 |
| Scaffold size, largest | 105,049 | 65,735 | 21,890 | 105,049 |
| Contig size, average of large contigs | 5,015 | 2,815 | 756 | 1,556 |
| Contig size, average of scaffolded contigs | 3,524 | 5,923 | 1,476 | 3,960 |
| Contig size, largest | 48,845 | 27,101 | 5,386 | 48,845 |

* Calculated based on an average read length of 158.2 bp (total assembled bases divided by total assembled reads).
† Contigs of less than 500 bp were not subjected to binning.
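The footnote to Table S2 describes how the per-bin assembled bases were estimated, and the coverage values are consistent with dividing those bases by the bases in contigs. The sketch below reproduces both calculations for the TSY1 and TSY2 bins. The function names are hypothetical, and the coverage formula is inferred from the tabulated numbers rather than stated in the source.

```python
MEAN_READ_LENGTH_BP = 158.2  # from the Table S2 footnote

# Values copied from Table S2: bin -> (assembled reads, bases in contigs).
BINS = {
    "TSY1": (1_918_908, 8_894_357),
    "TSY2": (617_323, 8_820_512),
}


def estimated_assembled_bases(reads: int) -> int:
    """Estimate assembled bases as reads x average read length."""
    return round(reads * MEAN_READ_LENGTH_BP)


def coverage(reads: int, contig_bases: int) -> float:
    """Fold coverage, assumed here to be assembled bases / bases in contigs."""
    return round(estimated_assembled_bases(reads) / contig_bases, 2)


if __name__ == "__main__":
    for name, (reads, contig_bases) in BINS.items():
        print(name, estimated_assembled_bases(reads),
              coverage(reads, contig_bases))
```

The average read length itself follows from the totals: 681,385,861 assembled bases divided by 4,307,019 assembled reads is about 158.2 bp.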
Table S3 | Pairwise identities of orthologous phylogenetic markers from "Entotheonella factor" TSY1 and TSY2.

| Gene | % Identity (Amino Acid) | % Identity (Nucleotide) |
|---|---|---|
| ffh | 97.3 | 91.0 |
| gcp | 92.8 | 87.2 |
| infB | 93.7 | 87.0 |
| lepA | 96.7 | 89.2 |
| pheS | 93.0 | 90.3 |
| pheT | 90.5 | 87.5 |
| pyrG | 97.0 | 92.9 |
| rnhB | 87.6 | 82.0 |
| rplA | 94.4 | 87.5 |
| rplB | 94.9 | 87.1 |
| rplC | 95.0 | 89.7 |
| rplD | 94.9 | 88.7 |
| rplE | 98.4 | 91.4 |
| rplF | 93.6 | 91.2 |
| rplJ | 94.8 | 86.0 |
| rplK | 97.2 | 90.1 |
| rplN | 99.2 | 89.4 |
| rplO | 94.3 | 90.0 |
| rplP | 97.1 | 89.3 |
| rplR | 88.9 | 86.3 |
| rplV | 98.5 | 91.4 |
| rplX | 90.4 | 88.5 |
| rpsB | 92.6 | 89.4 |
| rpsC | 97.7 | 89.6 |
| rpsD | 97.7 | 91.7 |
| rpsH | 94.6 | 91.3 |
| rpsI | 92.1 | 91.4 |
| rpsK | 99.2 | 90.6 |
| rpsL | 97.9 | 89.1 |
| rpsM | 95.9 | 91.7 |
| rpsO | 98.9 | 90.4 |
| rpsQ | 97.6 | 91.1 |
| rpsS | 97.9 | 93.3 |
| tgt | 95.8 | 88.5 |
| tpiA | 87.0 | 84.1 |
Table S4 | Identified natural product biosynthetic loci from "Entotheonella" strains TSY1 and TSY2.

| Locus ID | Cluster Size [kb] | Biosynthetic Type | Proposed or Known Product |
|---|---|---|---|
| TSY1_01 | 13 | NRPS | Konbamides (putatively inactive) |
| TSY1_02 | 28 | Hybrid PKS-NRPS | Keramamides/orbiculamides |
| TSY1_03 | 49 | Hybrid PKS-NRPS | Unknown acylated peptide |
| TSY1_04 | 31 | NRPS | Nazumamide A |
| TSY1_05 | 20 | Type III PKS | Unknown aromatic polyketide |
| TSY1_06 | 66 | Type I PKS-NRPS | Onnamides/theopederins |
| TSY1_07 | 52 | Hybrid PKS-NRPS | Cyclotheonamides/Pseudotheonamides |
| TSY1_08 | 9 | NRPS | Unknown peptide |
| TSY1_09 | 14 | Hybrid PKS-NRPS | Unknown acylated threonine derivative |
| TSY1_10 | 5 | Type I PKS | Onnamides/theopederins |
| TSY1_11 | 16 | Ribosomal peptide | Polytheonamides |
| TSY1_12 | 24 | Hybrid PKS-NRPS | Unknown mixed polyketide-peptide |
| TSY1_13 | 10 | NRPS (open)* | Unknown peptide fragment |
| TSY1_14 | 20 | Ribosomal peptide | Unknown proteusin |
| TSY1_15 | 9 | NRPS (open) | Unknown peptide fragment |
| TSY1_16 | 11 | Ribosomal peptide | Unknown proteusin |
| TSY1_17 | 7 | Ectoine | Ectoine |
| TSY1_18 | 14 | Type I PKS-NRPS | Unknown mixed polyketide-peptide |
| TSY1_19 | 17 | NRPS (open) | Unknown peptide |
| TSY1_20 | 14 | Type I PKS (open) | Unknown polyketide fragment |
| TSY1_21 | 9 | Ectoine | Ectoine |
| TSY1_22 | 9 | NRPS | Unknown peptide |
| TSY1_23 | 8 | Type I PKS (open) | Unknown polyketide fragment |
| TSY1_24 | 6 | NRPS (open) | Unknown peptide fragment |
| TSY1_25 | 26 or less | Ribosomal peptide | Unknown proteusin |
| TSY2_01 | 8 | NRPS (open) | Unknown peptide fragment |
| TSY2_02 | 21 | NRPS | Unknown pentapeptide |
| TSY2_03 | 8 | Type III PKS (ortholog of TSY1_05) | Unknown polyketide |
| TSY2_04 | 7 | Type I PKS | Unknown polyketide |
| TSY2_05 | 10 | Ribosomal peptide | Unknown proteusin |
| TSY2_06 | 10 | NRPS (ortholog of TSY1_08) | Unknown peptide |
| TSY2_07 | 9 | NRPS (open) | Unknown peptide fragment |
| TSY2_08 | 3 | NRPS (open) | Unknown peptide fragment |

Pathways for known products are indicated in bold.
* Incomplete biosynthetic loci are designated with "(open)".
Table S5 | NRPS adenylation domain substrate predictions.

| Adenylation Domain | Active Site Code | Nearest Neighbor Prediction |
|---|---|---|
| TSY1_01_Kon_Orf1_A1 | DVEDIGAVEK | Arg |
| TSY1_01_Kon_Orf1_A2 | DAEDIGSVVK | Lys |
| TSY1_01_Kon_Orf2_A3 | DAFFLGVTFK | Ile |
| TSY1_01_Kon_Orf3_A4 | DAEDIGSVVK | Lys |
| TSY1_01_Kon_Orf4_A5 | DLFNNALTYK | Ala |
| TSY1_01_Kon_Orf5_A6 | DAWFLG----* | Leu |
| TSY1_01_Kon_Orf6_A7 | DAWFLGNVVK | Leu |
| TSY1_01_Kon_Orf7_A8 | DALHVGNMAK | Hpg |
| TSY1_02_KerA_A0 | GIFWLGASGK | n.p.† |
| TSY1_02_KerB_A1 | DAFFLGVTYK | Ile |
| TSY1_02_KerB_A2 | DVSFMGAVMK | Phe |
| TSY1_02_KerC_A3 | DVGEIGSIDK | Orn |
| TSY1_02_KerC_A4 | DVQFIAHVAK | Pro |
| TSY1_02_KerC_A5 | DVYFVGAVIK | Arg |
| TSY1_02_KerE_A6 | DIYNNALTYK | Ala |
| TSY1_02_KerF_A6 | DLYNMSLIWK | Cys |
| TSY1_02_KerH_A7 | DALHVGNMAK | Hpg |
| TSY1_03_Orf2_A1 | LDWVSSLADK | β-Ala |
| TSY1_03_Orf3_A2 | DVSFMGGVLK | Ala |
| TSY1_03_Orf3_A3 | DLKNFGTDIK | Ala |
| TSY1_03_Orf3_A4 | DVQFIAHVIK | Pro |
| TSY1_03_Orf4_A5 | DVSFMGAIMK | Phe |
| TSY1_04_Naz_Orf1_A1 | DVEDIGAITT | Arg |
| TSY1_04_Naz_Orf2_A2 | DVQFIAQVVK | Pro |
| TSY1_04_Naz_Orf2_A3 | DAFFLGVTFK | Ile |
| TSY1_04_Naz_Orf2_A4 | DVYFMGGVIK | Ala |
| TSY1_06_OnnI_A1 | DILQLGLIWK | Gly |
| TSY1_06_OnnJ_A2 | DVLDIGAIDK | Arg |
| TSY1_07_Cth_A1 | DVSFMGGVLK | Ala |
| TSY1_07_Cth_A2 | DIWELTADDK | Ser |
| TSY1_07_Cth_A3 | DVQFIAQVVK | Pro |
| TSY1_07_Cth_A4 | DVEDIGAITS | Arg |
| TSY1_07_Cth_A5 | DAWTIAAVCK | Phe |
| TSY1_07_Cth_A6 | DASTIAAVCK | Tyr |
| TSY1_08_A1 | DMGGIGCLM-‡ | n.p. |
| TSY1_09_A1 | DFWNVGMVHK | Thr |
| TSY1_13_A1 | GLTPLACSWK | 2-Oxoisovaleric acid |
| TSY1_13_A2 | SDQLFSLADK | β-Ala |
| TSY1_15_A1 | DAFFLGVTFK | Ile |
| TSY1_19_A1 | DIWEVAADN-‡ | Phe |
| TSY1_22_A1 | DVSFMGGVLK | Ala |
| TSY1_24_A1 | DVYFIGGVIK | Pro |
| TSY2_01_A1 | TDWQFGIIYK | Gln |
| TSY2_02_Orf1_A1 | DAFWLGGTFK | Val |
| TSY2_02_Orf1_A2 | ?§ | n.p. |
| TSY2_02_Orf1_A3 | DFWNIGMVHK | Thr |
| TSY2_02_Orf1_A4 | DAAKVGQVGK | Asn |
| TSY2_02_Orf2_A5 | DAWMSGAVCK | Phe |
| TSY2_06_A1 | DMGGIGCLM-‡ | Orn |
| TSY2_07_A1 | ADQLFGLADK | β-Ala |
| TSY2_08_A1 | GLTPVAFSWK | Asp |

* ORF insertion truncated the A domain binding pocket.
† Predictive residues lie outside the applicability domain (ref. 22), yielding no prediction (n.p.).
‡ Alignment gap yielding no predictive residue.
§ Assembly gap preventing prediction.
Table S6 | UPLC-HRMS data and eMZed based compound identification of two independent enriched "Entotheonella" samples from T. swinhoei.

| Compound name | Sum formula | Ion | Calculated m/z | Detected m/z | Mass error (ppm) | Isotope ratio‡ simulated M0/M1 | Isotope ratio‡ simulated M0/M2 | Isotope ratio‡ found M0/M1 | Isotope ratio‡ found M0/M2 | Retention time [min] standard | Retention time [min] found |
|---|---|---|---|---|---|---|---|---|---|---|---|
| polytheonamide A*† | C219H376N60O72S | [M+3H]3+ | 1677.9181 | 1677.9195 | -0.83 | - | - | - | - | 5.7 | 5.7 |
| polytheonamide B† | C219H376N60O72S | [M+3H]3+ | 1677.9181 | 1677.9176 | 0.30 | - | - | - | - | 6.7 | 6.7 |
| onnamide A*† | C39H63N5O12 | [M+H]+ | 794.45460 | 794.45341 | 1.50 | 0.43 | 0.11 | 0.43 | 0.10 | 10.6 | 10.4 |
| cyclotheonamide A*† | C36H45N9O8 | [M+2H]2+ | 366.67683 | 366.67661 | 0.60 | 0.41 | 0.09 | 0.34 | 0.05 | 7.9 | 7.9 |
| aurantoside A*† | C36H46N2O15Cl2 | [M+H]+ | 817.23480 | 817.23517 | -0.45 | 0.39 | 0.69 | 0.38 | 0.66 | n.a. | 13.3 |
| aurantoside B*† | C35H44N2O15Cl2 | [M+H]+ | 803.21915 | 803.22006 | -1.13 | 0.38 | 0.69 | 0.36 | 0.60 | n.a. | 12.9 |
| aurantoside E† | C38H48N2O15Cl2 | [M+H]+ | 843.25045 | 843.24689 | 4.22 | 0.41 | 0.69 | 0.37 | 0.60 | n.a. | 13.9 |
| orbiculamide A*† | C46H62N9O10Br | [M+2H]2+ | 490.69743 | 490.69679 | 1.30 | 0.50 | 1.05 | 0.49 | 1.12 | n.a. | 12.3 |
| keramamide B† | C54H77N10O12Br | [M+2H]2+ | 569.25256 | 569.25132 | 2.18 | 0.58 | 1.09 | 0.60 | 1.19 | n.a. | 13.2 |
| keramamide E or C† | C53H75N10O12Br | [M+H]+ | 1123.48220 | 1123.48050 | 1.51 | 0.57 | 1.08 | 0.51 | 1.01 | n.a. | 12.9 |
| theopederin D* | C26H41NO10 | [M+H]+ | 528.28032 | 528.27948 | 1.59 | 0.28 | 0.06 | - | - | n.a. | 11.8 |

* Enriched "Entotheonella" fraction used for metagenome sequencing.
† "Entotheonella" enriched from fresh sponge specimen.
‡ See Supplementary Fig. 7 for polytheonamides.
M0 represents the monoisotopic peak and M1 and M2 the first and second isotopic peak thereof.
n.a. not available.
Table S7 | UPLC-HRMS data and eMZed based compound identification of a Theonella swinhoei extract.

| Compound name | Sum formula | Ion | Calculated m/z | Detected m/z | Mass error (ppm) | Isotope ratio† simulated M0/M1 | Isotope ratio† simulated M0/M2 | Isotope ratio† found M0/M1 | Isotope ratio† found M0/M2 | Retention time [min] standard | Retention time [min] found |
|---|---|---|---|---|---|---|---|---|---|---|---|
| polytheonamide A* | C219H376N60O72S | [M+3H]3+ | 1677.9181 | 1677.9168 | 0.72 | - | - | - | - | 5.7 | 5.7 |
| polytheonamide B* | C219H376N60O72S | [M+3H]3+ | 1677.9181 | 1677.9179 | 0.12 | - | - | - | - | 6.7 | 6.7 |
| onnamide A* | C39H63N5O12 | [M+H]+ | 794.45460 | 794.45245 | 2.71 | 0.43 | 0.11 | 0.45 | 0.10 | 10.6 | 10.6 |
| onnamide B | C37H61N5O12 | [M+H]+ | 768.43895 | 768.43787 | 1.40 | 0.41 | 0.10 | 0.42 | 0.12 | 10.2 | 10.3 |
| onnamide C | C39H61N5O14 | [M+H]+ | 824.42878 | 824.42706 | 2.09 | 0.43 | 0.11 | 0.39 | 0.10 | n.a. | 10.5 |
| onnamide D | C38H63N5O11 | [M+H]+ | 766.45969 | 766.45858 | 1.45 | 0.42 | 0.10 | 0.41 | 0.06 | n.a. | 10.2 |
| onnamide E | C37H59N5O10 | [M+H]+ | 734.43347 | 734.43216 | 1.78 | 0.43 | 0.11 | 0.42 | 0.13 | n.a. | 10.3 |
| pseudoonnamide A | C38H61N5O12 | [M+H]+ | 780.43895 | 780.43751 | 1.84 | 0.42 | 0.11 | 0.40 | 0.07 | n.a. | 10.6 |
| orbiculamide A* | C46H62N9O10Br | [M+2H]2+ | 490.69743 | 490.69681 | 1.26 | 0.50 | 1.05 | 0.51 | 1.12 | n.a. | 12.3 |
| keramamide B* | C54H77N10O12Br | [M+2H]2+ | 569.25256 | 569.25125 | 2.30 | 0.58 | 1.09 | 0.56 | 1.04 | n.a. | 13.2 |
| keramamide E or C* | C53H75N10O12Br | [M+2H]2+ | 562.24474 | 562.24368 | 1.88 | 0.57 | 1.08 | 0.55 | 1.11 | n.a. | 12.9 |
| keramamide D | C52H73N10O12Br | [M+2H]2+ | 555.23691 | 555.23572 | 2.14 | 0.56 | 1.08 | 0.52 | 1.06 | n.a. | 12.6 |
| nazumamide A | C28H43N7O8 | [M+H]+ | 606.32459 | 606.32620 | -2.65 | 0.32 | 0.06 | 0.29 | 0.05 | n.a. | 13.5 |
| aurantoside A* | C36H46N2O15Cl2 | [M+H]+ | 817.23480 | 817.23289 | 2.34 | 0.39 | 0.69 | 0.39 | 0.69 | n.a. | 13.3 |
| aurantoside B* | C35H44N2O15Cl2 | [M+H]+ | 803.21915 | 803.21784 | 1.63 | 0.38 | 0.69 | 0.38 | 0.85 | n.a. | 12.9 |
| aurantoside D or C | C37H46N2O15Cl2 | [M+H]+ | 829.23480 | 829.23279 | 2.42 | 0.40 | 0.69 | 0.37 | 0.58 | n.a. | 13.4 |
| aurantoside E* | C38H48N2O15Cl2 | [M+H]+ | 843.25045 | 843.24854 | 2.26 | 0.41 | 0.69 | 0.37 | 0.61 | n.a. | 13.9 |
| cyclotheonamide A* | C36H45N9O8 | [M+2H]2+ | 366.67683 | 366.67699 | -0.44 | 0.41 | 0.09 | 0.40 | 0.07 | 7.9 | 7.9 |
| theopederin D* | C26H41NO10 | [M+H]+ | 528.28032 | 528.27970 | 1.17 | 0.28 | 0.06 | 0.22 | 0.06 | n.a. | 11.8 |

* Compounds also present in the enriched "Entotheonella" fraction.
† See Supplementary Fig. 7 for polytheonamides.
M0 represents the monoisotopic peak and M1 and M2 the first and second isotopic peak thereof.
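The mass errors in Tables S6 and S7 can be recomputed from the calculated and detected m/z values. The sketch below uses the usual relative-error definition in parts per million; the sign convention (calculated minus detected) is inferred from the tabulated values and is not stated in the source, and the function name is hypothetical. Because the tabulated m/z values are rounded, some recomputed errors differ from the printed ones in the last digit.

```python
def mass_error_ppm(calculated_mz: float, detected_mz: float) -> float:
    """Relative mass error in ppm, signed as (calculated - detected)."""
    return (calculated_mz - detected_mz) / calculated_mz * 1e6


if __name__ == "__main__":
    # Values copied from Table S6 (onnamide A) and Table S7 (cyclotheonamide A).
    print(round(mass_error_ppm(794.45460, 794.45341), 2))
    print(round(mass_error_ppm(366.67683, 366.67699), 2))
```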
Table S8 | Deconvoluted LC-HESI-HRMS masses from gene cluster TSY1_14 coexpression experiments.

| Treatment | Predicted Identity | Expected Mass* | Precursor only: Deconvoluted Mass | Precursor only: Mass Error (%) | Precursor only: Relative Intensity | Coexpression with LanM-like protein: Deconvoluted Mass | Coexpression with LanM-like protein: Mass Error (%) | Coexpression with LanM-like protein: Relative Intensity |
|---|---|---|---|---|---|---|---|---|
| TCEP (retention time: 5.85-6.1 min) | M-3H2O | 13086.226 | - | - | n.d. | 13086.225 | 0.0001 | 100 |
| | M-2H2O | 13104.237 | - | - | n.d. | 13104.250 | 0.0001 | 43.9 |
| | M-H2O | 13122.247 | 13122.282 | 0.0003 | 3.2 | 13121.242 | 0.0076 | 26.9 |
| | M | 13140.258 | 13139.255 | 0.0076 | 100 | 13138.237 | 0.0154 | 20.0 |
| TCEP and iodoacetamide (retention time: 5.85-6.1 min) | M-3H2O+2CAM | 13200.269 | - | - | n.d. | 13200.299 | 0.0002 | 60.1 |
| | M-2H2O+3CAM | 13275.301 | - | - | n.d. | 13275.311 | 0.0001 | 25.2 |
| | M-H2O+4CAM | 13350.333 | 13351.344 | 0.0076 | 2.7 | 13350.335 | 0.0001 | 19.4 |
| | M-3H2O+2CAM+Gluc | 13378.317 | - | - | n.d. | 13378.354 | 0.0003 | 100 |
| | M+5CAM | 13425.365 | 13425.387 | 0.0002 | 100 | 13425.334 | 0.0002 | 21.8 |
| | M+5CAM+Gluc | 13603.413 | 13603.404 | 0.0001 | 26.8 | 13603.424 | 0.0001 | 0.9 |
| TCEP, iodoacetamide and trypsin (retention time: 6.16-6.46 min) | Mb+CAM | 3225.588 | 3225.598 | 0.0003 | 100 | 3225.593 | 0.0001 | 100 |
| | Ma-3H2O+2CAM | 4415.929 | - | - | n.d. | 4415.936 | 0.0001 | 67.9 |
| | Ma-2H2O+2CAM | 4433.940 | - | - | n.d. | 4433.945 | 0.0001 | 76.1 |
| MS2 fragments of Ma-3H2O+2CAM | y16-3H2O+2CAM | 1654.546 | - | - | n.d. | 1654.543 | 0.0002 | n.a. |
| | y17-3H2O+2CAM | 1767.630 | - | - | n.d. | 1767.628 | 0.0001 | n.a. |
| | y18-3H2O+2CAM | 1880.714 | - | - | n.d. | 1880.705 | 0.0005 | n.a. |
| | y19-3H2O+2CAM | 1993.798 | - | - | n.d. | 1993.790 | 0.0004 | n.a. |
| | y37-3H2O+2CAM | 3710.575 | - | - | n.d. | 3710.567 | 0.0002 | n.a. |

* As calculated using the ChemCalc online tool (ref. 67).
M: His-tagged precursor peptide from TSY1_14 (MH124), lacking Met1.
n.d. not detected within a range of +/- 3 Da.
n.a. not available.
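The expected masses in Table S8 are internally consistent with simple monoisotopic mass shifts applied to the unmodified precursor M (13140.258 Da), which the following sketch illustrates. It is not the ChemCalc calculation cited in the table. Two interpretations are assumptions on my part: "CAM" is read as carbamidomethylation by iodoacetamide (+57.021464 Da) and "Gluc" as gluconoylation (+178.047738 Da). Both shifts reproduce the tabulated mass differences, but the source does not define the abbreviations. Function names are hypothetical.

```python
# Monoisotopic mass shifts in Da.
H2O = 18.010565    # loss per dehydration
CAM = 57.021464    # assumed: carbamidomethyl group from iodoacetamide
GLUC = 178.047738  # assumed: gluconoylation

PRECURSOR_M = 13140.258  # expected mass of M from Table S8


def expected_mass(base: float, dehydrations: int = 0,
                  cam: int = 0, gluc: int = 0) -> float:
    """Expected mass after dehydrations and CAM/Gluc additions (3 decimals)."""
    return round(base - dehydrations * H2O + cam * CAM + gluc * GLUC, 3)


def mass_error_percent(expected: float, observed: float) -> float:
    """Absolute mass error as a percentage of the expected mass (4 decimals)."""
    return round(abs(observed - expected) / expected * 100.0, 4)


if __name__ == "__main__":
    print(expected_mass(PRECURSOR_M, dehydrations=3))          # M-3H2O
    print(expected_mass(PRECURSOR_M, dehydrations=3, cam=2))   # M-3H2O+2CAM
    print(expected_mass(PRECURSOR_M, cam=5, gluc=1))           # M+5CAM+Gluc
    print(mass_error_percent(13140.258, 13139.255))            # M, precursor only
```

Each dehydration removes one water, so the shift from M+5CAM to M-3H2O+2CAM reflects three dehydrations plus three fewer alkylated residues.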
Table S9 | List of sponges investigated for the presence of "Entotheonella" spp.

| Sponge | Location | Coordinates | Detection of "Entotheonella" |
|---|---|---|---|
| Aaptos ciliata (Suberitidae, Hadromerida) | Oshima-shinsone, Japan | 28°52.17' N, 129°33.02' E | + |
| Agelas dilatata (Agelasidae, Agelasida) | Little San Salvador, Caribbean Sea | 24°34.39' N, 78°58.00' W | + |
| Agelas nakamurai (Agelasidae, Agelasida) | Kuchinoerabu-jima, Japan | 30°47.67' N, 130°18.85' E | + |
| Amphimedon compressa (Niphatidae, Haplosclerida) | Little San Salvador, Caribbean Sea | 24°34.39' N, 78°58.00' W | - |
| Amphimedon sp. (Niphatidae, Haplosclerida) | Io-jima, Japan | 30°48.35' N, 130°19.07' E | + |
| Amphimedon sp. (Niphatidae, Haplosclerida) | Hachijo-jima, Japan | 33°12.18' N, 139°70.11' E | - |
| Anthosigmella (Cliona) raromicrosclera (Clionaidae, Hadromerida) | Mitsukue, Japan | 33°46.88' N, 132°25.84' E | + |
| Anthosigmella (Cliona) raromicrosclera (Clionaidae, Hadromerida) | Kamikoshiki-jima, Japan | 31°81.78' N, 129°90.57' E | - |
| Aplysina aerophoba (Aplysinidae, Verongida) | Rovinj, Croatia | 45°7.50' N, 13°39.48' E | + |
| Asteropus simplex (Ancorinidae, Astrophorida) | Shikine-jima, Japan | 34°32.15' N, 139°22.07' E | + |
| Axinella sp. (Axinellidae, Halichondrida) | Shikine-jima, Japan | 34°31.78' N, 139°21.80' E | - |
| Cacospongia mycofijiensis (Thorectidae, Dictyoceratida) | Mele Bay, Vanuatu | 17°43' S, 168°14' E | + |
| Callyspongia vaginalis (Callyspongiidae, Haplosclerida) | Little San Salvador, Caribbean Sea | 24°34.39' N, 78°58.00' W | + |
| Ceratopsion sp. (Raspailiidae, Poecilosclerida) | Yaku-shinsone, Japan | 29°47.22' N, 130°19.88' E | + |
| Dercitus simplex (Ancorinidae, Astrophorida) | Oshima-shinsone, Japan | 28°52.17' N, 129°33.02' E | + |
| Discodermia calyx (Theonellidae, Lithistida) | Nakagi, Japan | 34°61.11' N, 138°82.07' E | + |
| Discodermia kiiensis (Theonellidae, Lithistida) | Nakagi, Japan | 34°61.11' N, 138°82.07' E | + |
| Dysidea avara (Dysideidae, Dictyoceratida) | Rovinj, Croatia | 45°7.50' N, 13°39.48' E | + |
| Dysidea etheria (Dysideidae, Dictyoceratida) | Little San Salvador, Caribbean Sea | 24°34.39' N, 78°58.00' W | + |
| Epipolasis sp. (Halichondriidae, Halichondrida) | Nagannu-jima, Japan | 26°14.61' N, 127°31.01' E | + |
| Epipolasis sp. (Halichondriidae, Halichondrida) | Hachijo-jima, Japan | 33°13.77' N, 139°73.47' E | + |
| Erylus nobilis (Geodiidae, Astrophorida) | Shikine-jima, Japan | 34°33.90' N, 139°20.83' E | + |
| Erylus placenta (Geodiidae, Astrophorida) | Hachijo-jima, Japan | 33°07.14' N, 139°77.97' E | + |
| Fascaplysinopsis sp. (Thorectidae, Dictyoceratida) | Salary Bay, Madagascar | 22°33' S, 43°16' E | - |
| Haliclona digitata (Chalinidae, Haplosclerida) | Ikara-jima, Japan | 32°21.48' N, 130°18.98' E | - |
| Hexadella sp. (Ianthellidae, Verongida) | Kuchinoerabu-jima, Japan | 30°47.67' N, 130°18.85' E | + |
| Ircinia felix (Irciniidae, Dictyoceratida) | Little San Salvador, Caribbean Sea | 24°34.39' N, 78°58.00' W | + |
| Mycale magellanica (Mycalidae, Poecilosclerida) | Toji, Japan | 34°64.12' N, 138°91.70' E | + |
| Niphates digitalis (Niphatidae, Haplosclerida) | Little San Salvador, Caribbean Sea | 24°34.39' N, 78°58.00' W | + |
| Penares aff. incrustans (Geodiidae, Astrophorida) | Hachijo-jima, Japan | 33°13.40' N, 139°80.31' E | + |
| Penares aff. incrustans (Geodiidae, Astrophorida) | Hachijo-jima, Japan | 33°13.40' N, 139°80.31' E | - |
| Penares sp. (Geodiidae, Astrophorida) | Uke-shima, Japan | 28°05.42' N, 129°21.77' E | + |
| Petrosia volcano (Petrosiidae, Haplosclerida) | Io-jima, Japan | 30°48.35' N, 130°19.07' E | + |
| Psammocinia aff. bulbosa (Irciniidae, Dictyoceratida) | Milne Bay, Papua New Guinea | 9°32.493' S, 150°16.715' E | + |
| Pseudoceratina purpurea (Pseudoceratinidae, Verongida) | Nakano-shima, Japan | 29°83.22' N, 129°85.14' E | + |
| Pseudoceratina purpurea (Pseudoceratinidae, Verongida) | Oshima-shinsone, Japan | 28°52.17' N, 129°33.02' E | + |
| Ptilocaulis sp. (Axinellidae, Halichondrida) | Little San Salvador, Caribbean Sea | 24°34.39' N, 78°58.00' W | + |
| Stylissa carteri (Dictyonellidae, Halichondrida) | FSAR reef, Thuwal, Saudi Arabia | 22°23.09' N, 39°02.86' E | + |
| Stylissa carteri (Dictyonellidae, Halichondrida) | Kuchinoerabu-jima, Japan | 30°47.84' N, 130°18.76' E | + |
| Theonella swinhoei W1, misakinolide chemotype (Theonellidae, Lithistida) | Hachijo-jima, Japan | 33°13.77' N, 139°73.47' E | + |
| Theonella swinhoei Y1, onnamide chemotype (Theonellidae, Lithistida) | Nakagi, Japan | 34°61.11' N, 138°82.07' E | + |
| Topsentia sp. (Halichondriidae, Halichondrida) | Nichinan-Oshima, Japan | 31°53.96' N, 131°41.67' E | + |
| Xestospongia muta (Petrosiidae, Haplosclerida) | Little San Salvador, Caribbean Sea | 24°34.39' N, 78°58.00' W | + |
| Xestospongia testudinaria (Petrosiidae, Haplosclerida) | FSAR reef, Thuwal, Saudi Arabia | 22°23' N, 39°03' E | - |
| Seawater | Rovinj, Croatia | 45°7.50' N, 13°39.48' E | + |
| Seawater | Florida, USA | 24°57.11' N, 80°27.30' W | + |
| Seawater | FSAR reef, Thuwal, Saudi Arabia | 22°23.09' N, 39°02.86' E | + |

+/-: "Entotheonella" detected/not detected (based on the amplicon sequence).
Table S10 | Primers used in this study.

| Primer ID | Sequence | Target gene (cluster) |
|---|---|---|
| onnOF | GTCAGCTGAGAACCTGTCGG | onnamide |
| onnOR | CTTCCAGCCAGAATGCTGCC | onnamide |
| onnIF | TTGCCGTGAATTCCGCTT | onnamide |
| onnIR | AGCGGCTTCCAGATGACC | onnamide |
| poyOF | CAAGAACTCACAGTCGCCGACGTGTT | polytheonamide |
| poyOR | CGCTACGTGGTGAGCATCGAGGATT | polytheonamide |
| poyIF | CCATTCTAACCCAGAAAGGAGTCCACCAT | polytheonamide |
| poyIR | CATTGATATTGCCACCTGCGACCTGATT | polytheonamide |
| kon1OF | AGTTTTGTCCCAACTCCCGTGG | konbamide |
| kon1OR | AGACGACTTGATAGCGGAAGCG | konbamide |
| kon1IF | GCTACCGCTCCGACGGC | konbamide |
| kon1IR | CGTGACGTGAGCCAAATCGTCC | konbamide |
| kon2OF | TCAAGAAGATGTGGTCGTCGGC | konbamide |
| kon2OR | ATCAACGGGGTAGGCAAGAACG | konbamide |
| kon2IF | CACGACCCTGTTTGATTTGTCCG | konbamide |
| kon2IR | TGGTCTTTCAATGCCGTTTGCG | konbamide |
| naz1OF | CTTACGCACCACGTTTCCAACC | nazumamide |
| naz1OR | GCCCAGGAAGAGGGTCAAATCG | nazumamide |
| naz1IF | CAGTCTTACGCTCCCACTGTCG | nazumamide |
| naz1IR | CGATGACGAGATCCTCTTGCCC | nazumamide |
| naz2OF | GCGCAGCTTCACCTGAGTATCG | nazumamide |
| naz2OR | CAACACCCGGACAACCTATCCC | nazumamide |
| naz2IF | GACGTGTAAGAGACGAGCGGG | nazumamide |
| naz2IR | GACCTTCATGCTGGCTGACACC | nazumamide |
| cth1OF | CCATCACGCCATTTACGAAGCG | cyclotheonamide |
| cth1OR | TCTAAGACCTCTCCCGTCAGCC | cyclotheonamide |
| cth1IF | AACTTGCTGGTGGCGTACTTGG | cyclotheonamide |
| cth1IR | CGAGCAACCGGAAGGCATGG | cyclotheonamide |
| cth2OF | GTGCGTCTGGCGCTAATAATGG | cyclotheonamide |
| cth2OR | CTCAAGCCTGTGCCTATCTGGG | cyclotheonamide |
| cth2IF | TCTGGTAAGCCCGTTTGACAGC | cyclotheonamide |
| cth2IR | CTTTTGTGCCACGAGTACCTGC | cyclotheonamide |
| kerOF | TCAGGTGGAACATGACGATGCC | keramamide |
| kerOR | CTCACATGCAAGCACGGTTTCC | keramamide |
| kerIF | ACCTGTATGGCAAGAGCCAAGC | keramamide |
| kerIR | CTAACCGAAACGGGTGAGGTGG | keramamide |
| ptsOF | CTCGCTTATCTGCGTGCAATCG | unknown proteusin |
| ptsOR | GTTTGAAGAGCAACCACGAGCG | unknown proteusin |
| ptsIF | CGGTCGTCTTTAATGCACTCGC | unknown proteusin |
| ptsIR | CTGGCTTTAGGTGTCGAGGAGG | unknown proteusin |
| 16SU27F | AKWGTTTGATCMTGGCTCAG | eubacterial 16S rRNA |
| 16SU1492R | GGHTACCTTGTTACGACTT | eubacterial 16S rRNA |
| EntoIF | GYATTAAGCCKYGGAAACKGT | "Entotheonella" 16S rRNA |
| Ento1290R | GCCCRGCWYVACCCGGTA | "Entotheonella" 16S rRNA |
| Ento271F | GGGAAASGTTCGCBGGTCTG | "Entotheonella" 16S rRNA |
| Ento238F | CCGGTCTGAGATGAGCTTGC | "Entotheonella" 16S rRNA |
| Ento1442R | TCACCCCAATCACCCCGC | "Entotheonella" 16S rRNA |
| KSDPQQF | MGNGARGCNNWNSMNATGGAYCCNCARCANMG | general PKS gene detection |
| KSHGTGR | GGRTCNCCNARNSWNGTNCCNGTNCCRTG | general PKS gene detection |
| kerA5F | GTATCATATGGTTCACACCTTGCCGCTGCT | keramamide A domain 5 |
| kerA5R | CTATCTCGAGTCAGCAATCGTCTTTTCGAGCGC | keramamide A domain 5 |
| cthA2F | GGCAGCCATATGCTCGTCAGTAAGTTGCCTTTGC | cyclotheonamide A domain 2 |
| cthA2R | GTGGTGCTCGAGCTAGTCCAATTCCAACACATCCGCCC | cyclotheonamide A domain 2 |
| Prec-TSY1_14-F | GTGCATATGTCACCGGCTGAAAATCGA | TSY1_14 Precursor |
| Prec-TSY1_14-R | GACAAGCTTTTACCCGCAAGCCCAACAA | TSY1_14 Precursor |
| Lanth-TSY1_14-F | CGGTCTTCATGATTTACAAACCATGGGAAAATT | TSY1_14 LanM-like |
| Lanth-TSY1_14-R | GTACCTCGAGTTAGGATATGCTGCCAAAGACCAG | TSY1_14 LanM-like |
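Several primers in Table S10 contain IUPAC ambiguity codes (for example K, W, M, and N), so each such entry stands for a pool of concrete sequences. As a small illustration, the sketch below counts how many distinct sequences a degenerate primer represents. The function name is hypothetical; the code table is the standard IUPAC nucleotide alphabet.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a (possibly degenerate) primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])  # raises KeyError on a non-IUPAC character
    return count


if __name__ == "__main__":
    # Primer sequences copied from Table S10.
    for name, seq in [
        ("16SU27F", "AKWGTTTGATCMTGGCTCAG"),
        ("16SU1492R", "GGHTACCTTGTTACGACTT"),
        ("onnOF", "GTCAGCTGAGAACCTGTCGG"),
    ]:
        print(name, degeneracy(seq))
```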
Supplementary References

22. Röttig, M. et al. NRPSpredictor2 - a web server for predicting NRPS adenylation domain specificity. Nucleic Acids Res. 39, W362-367 (2011).
67. Patiny, L. & Borel, A. ChemCalc: a building block for tomorrow's chemical infrastructure. J. Chem. Inf. Model. 53, 1223-1228 (2013).