SUPPLEMENTARY INFORMATION
Comparative integrated omics: identification of key functionalities in microbial community-wide metabolic networks
Hugo Roumea,1*, Anna Heintz-Buscharta*, Emilie EL Mullera, Patrick Maya, Venkata P Satagopama, Cédric C Lacznya, Shaman Narayanasamya, Laura A Lebruna, Michael R Hoopmannb, James M Schuppc, John D Gillecec, Nathan D Hicksc, David M Engelthalerc, Thomas Sauterd, Paul S Keimc, Robert L Moritzb, and Paul Wilmesa,2
a
Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg.
b
Institute for Systems Biology, Seattle, WA, USA.
c
The Translational Genomic Research Institute-North, Flagstaff, AZ, USA.
d
Life Science Research Unit, University of Luxembourg, Luxembourg, Luxembourg.
1
Current address: Laboratory of Microbial Ecology and Technology, Ghent University, Belgium
2
Corresponding author: P Wilmes, Luxembourg Centre for Systems Biomedicine, University of Luxembourg,
L-4362 Esch-sur-Alzette, Luxembourg. E-mail:
[email protected].
* These authors contributed equally to this work.
Supplementary Materials and Methods
Biomolecular extraction and quality assessment
For each sampling date, three 200 mg sub-samples derived from one OMMC islet (defined, herein1, as technical replicates) were used for biomolecular extraction. The individual biomolecular fractions isolated from the technical replicates were combined into a pool in
order to yield sufficient biomolecular quantities for subsequent high-throughput analysis and to reduce the influence of fine-scale within-sample heterogeneity (as demonstrated by Roume et al.2). For the quality assessment of the isolated genomic DNA, fractions were separated by electrophoresis on a 1 % agarose gel containing 4 ‰ ethidium bromide (PlusOne Ethidium
Bromide, GE Healthcare). For size estimation, the MassRuler DNA ladder mix (Fermentas) was loaded onto the gels. Agarose gels were visualized on an InGenius gel imaging and analysis system (Syngene, Cambridge, UK; as described in Roume et al.1). The DNA pool sample was snap-frozen in liquid nitrogen in the elution buffer and stored at -80 °C until library preparation and sequencing.
RNA quality assessment and quantification were carried out using an Agilent 2100 Bioanalyzer (Agilent Technologies, Diegem, Belgium; as described in Roume et al.1). The quality of protein extracts was assessed following 1D-SDS-PAGE separation.
Preparation of RNA for shipment and sequencing
The volume of the RNA pool sample was adjusted to 180 µl using RNase-free water. A mixture of 18 µl of a 3 M (w/v) sodium acetate solution, 2 µl of a glycogen solution (10 mg/µl) and 600 µl of ice-cold 96 % (v/v) ethanol was added to the RNA solution. The RNA solution was gently mixed by inversion and precipitated at -20 °C for at least 1 h. Following centrifugation at 10 000 x g for 30 min at 4 °C, the RNA pellet was washed three times with ice-cold 70 % (v/v) ethanol. The pellet was then air dried for 5 min at room temperature, overlaid with 100 µl of Ambion® RNAlater (Life Technologies, Gent, Belgium) and placed at -20 °C until shipment. For reclamation of the RNA, the solution was centrifuged at 14 000 x g for 10 min at 8 °C. After removal of the supernatant, the pellet was washed three times with 100 µl of 80 % (v/v) ethanol, followed by 25 000 x g centrifugation for 3 min at 8 °C. The pellet was then air dried until all visible ethanol had evaporated. The
dried RNA pellet was then re-suspended in 1 mM sodium citrate buffer solution (pH 6.4).
RNA-Sequencing
In order to enrich the RNA fraction in mRNA, the rRNA was subtracted from the total RNA fraction using the Ribo-Zero rRNA Removal Kit (Meta-Bacteria; Epicentre, Madison, WI, USA). This was followed by cDNA synthesis and amplification using the ScriptSeq™ v2
RNA-Seq library preparation kit (Epicentre).
rRNA removal
The procedure consisted of an initial washing and re-suspension of the Ribo-Zero microspheres with dedicated solutions. Following treatment of the total RNA sample with Ribo-Zero removal solution, the RNA sample solution was added to the re-suspended Ribo-Zero microspheres and selected by hybridisation. The RNA-microsphere solution was then removed and the rRNA-depleted sample was purified by ethanol precipitation.
Library preparation
The cDNA synthesis and amplification were performed using the ScriptSeq™ v2 RNA-Seq library preparation kit (Epicentre). The procedure consisted of initial fragmentation of the RNA, followed by the annealing of the cDNA synthesis primer. Briefly, addition of 4 µl
cDNA Synthesis Master Mix to the fragmented RNA solution was followed by incubation first at 25 °C for 5 min and then 42 °C for 20 min. cDNA Synthesis Master Mix was prepared by combining 3 µl cDNA Synthesis PreMix (ScriptSeq v2 RNA-Seq library preparation kit), 0.5 µl of DTT (100 mM) and 0.5 µl of StartScript Reverse Transcriptase solution (Epicentre). After cooling to 37 °C, 1 µl of finishing solution (Epicentre) was added to the cDNA
synthesis solution and incubated at 37 °C for 10 min, before an incubation at 95 °C for 3 min. 8 µl of terminal tagging master mix was then added to each solution and incubated at 25 °C for a further 15 min, followed by incubation at 95 °C for 3 min. Terminal tagging master mix was prepared using 7.5 µl of terminal tagging premix (Epicentre) and 0.5 µl of DNA polymerase. The 3’-terminal tagged cDNA was then purified using the AMPure XP system
(Beckman Coulter, Brea, CA, USA). The purified cDNA strand was then amplified by PCR, resulting in the generation of the second strand of cDNA, addition of the Illumina adaptor sequences and incorporation of specific barcodes, as well as amplification. Finally, the RNA-Seq library was purified using the AMPure XP system (Beckman Coulter). The size distribution of the RNA-Seq library was assessed using an Agilent 2100 Bioanalyzer.
DNA/cDNA sequencing
DNA/cDNA was prepared according to the modified instructions from The Wellcome Trust Sanger Institute3. The metagenomic sequencing protocol used a 96-well library preparation and the molecular barcoding method for Illumina library construction. The barcodes were
designed using Hamming codes, which allow single nucleotide sequencing errors to be corrected and single indels (insertions/deletions) to be detected without ambiguity. Several optimisations were employed in the indexing protocol. The tags used were 8 bp long, which allowed the design of a larger number of barcodes with error-correcting capability. The barcodes were introduced in a regular PCR, which simplified the PCR step and allowed for use of as few as six cycles. Before pooling, the relative concentration of each sample library was
measured by quantitative PCR (qPCR), which then allowed the accurate pooling of libraries together and improved the uniformity of their representation.
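The error-correcting property of the barcodes described above can be illustrated with a minimal sketch. It does not construct Hamming codes; it only checks the underlying distance property: if every pair of tags differs at three or more positions, a tag read with a single substitution is still closest to exactly one barcode. The four 8 bp tags and the function names are hypothetical and are not the barcodes used in this study.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length tags."""
    if len(a) != len(b):
        raise ValueError("tags must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def assign_barcode(read_tag, barcodes, max_mismatch=1):
    """Return the unique barcode within max_mismatch of read_tag, else None."""
    hits = [b for b in barcodes if hamming(read_tag, b) <= max_mismatch]
    return hits[0] if len(hits) == 1 else None


# Hypothetical 8 bp tags, for illustration only.
BARCODES = ["AAAAAAAA", "AAACCCGG", "CCCAAATT", "GGGTTTAC"]

if __name__ == "__main__":
    print(min_pairwise_distance(BARCODES))
    print(assign_barcode("AAACCCGT", BARCODES))  # one substitution in the last base
```

With a minimum pairwise distance d, up to (d - 1) // 2 substitutions per tag can be corrected, which is why a distance of at least 3 suffices for single-error correction.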
DNA fragmentation
1-5 µg of DNA was used for the fragmentation step, as determined following gel analysis, and re-suspended in 75 µl of 10 mM Tris-HCl, pH 8.5. The sample was then sheared for 150 s in 100 µl Covaris microtubes (Woburn, MA, USA). The following programme was used: duty cycle 20 %, intensity 5, cycle burst 200, power 37 W, temperature 7 °C and mode ‘Freq sweeping’. The sheared DNA samples were transferred into a MicroAmp optical 96-well plate (Life Technologies, Grand Island, NY, USA). Six samples, chosen at random, were run on an Agilent Bioanalyzer DNA 1000 chip to check the quality of the fragmentation. A
150-250 bp smear was detected and 70-90 % of the initial DNA amount was recovered.
Size selection
A large-fragment plate was prepared by dispensing 150 µl of beads into wells of a round-bottom Costar plate (Corning, Glendale, AZ, USA) for each sample. Additionally, a small-fragment plate was prepared by dispensing 60 µl of beads into wells of a separate Costar plate for every sample. The large-fragment Costar plate was placed on a magnetic stand, allowing the collection of beads. 45 µl of 10 mM Tris-HCl buffer were removed from the large-fragment wells, leaving all beads in the well. The plate was removed from the magnetic stand and 190 µl of sample from sonication tubes was added to the large-fragment wells on the Costar plate. Following 8 min of incubation at room temperature, using two magnetic stands, the large-fragment Costar plate was placed on a magnetic stand to collect beads. Following 5 min incubation at room temperature, 58 µl of bead buffer were removed from the small-fragment wells, leaving behind beads and approximately 2 µl of bead buffer. Small-fragment wells were then removed from the magnetic stand. 300 µl of supernatant were transferred from large-fragment wells into small-fragment wells. After transfer, the large-fragment plate was discarded. Samples were then incubated in small-fragment wells for 5 min and the plate was placed on a magnetic plate to collect beads. 300 µl of supernatant were removed and replaced by 300 µl of 80 % ethanol without disturbing beads. Following 30 s of incubation, ethanol was removed. This last step was repeated two times. Following ethanol removal by air-drying for 5 min, the small-fragment plate was removed from the
magnetic stand and 45 µl of pre-warmed 10 mM Tris-HCl (pH 8.5) were dispensed into the wells. After re-suspension of beads by vigorous mixing, the solution was incubated for 2 min and placed on a magnetic plate to collect beads. Following a further 3 min incubation, 42.5 µl of supernatant containing the size-selected products were transferred to a new plate. All enzymes and reaction buffers used for the end-repair, dA-tailing, ligation and PCR
amplification were provided from the KAPA Library Preparation Kits with Standard PCR Library Amplification/Illumina series (KapaBiosystems, Woburn, MA, USA).
End-repair
To the supernatant containing the sheared DNA, 10 µl of end-repair buffer and 5 µl of end-repair enzyme mix were added. 100 µl of this solution were added to each well. The plate was then covered, vortexed briefly and spun down before being incubated for 30 min at 20 °C in a thermocycler. The samples were then cleaned using Agencourt AMPure SPRI beads (Beckman Coulter). 90 µl of SPRI beads were added into each well, the plate was covered and vortexed for 30 s. The reaction plate was placed on the magnetic SPRIPlate for 10 min and beads separated from the solution. Following incubation for 10 min, the
completely clear solution was discarded. 200 µl of 70 % (v/v) ethanol were added to each well and incubated for 30 s at room temperature. The ethanol washes were repeated two times. The reaction plate was dried for 15 min in a thermocycler and each sample was eluted with 43.5 µl of 10 mM Tris buffer (pH 8.0).
A-tailing
To 30 µl of end-repaired DNA, 5 µl of 10x A-tailing buffer, 3 µl of A-tailing enzyme and 12 µl of water were added. 50 µl of A-tailing master mix were added to each well. The plate was then covered, vortexed briefly and spun down before being incubated for 30 min at 30 °C in a thermocycler. The sample was then cleaned up using the Agencourt AMPure SPRI bead method with 90 µl of SPRI beads placed in each well. Each sample was then eluted with
36 µl of 10 mM Tris buffer (pH 8.0). The same six samples, randomly chosen for the DNA fragmentation step, were run on an Agilent Bioanalyzer DNA 1000 chip and the average sample concentration was calculated using the “integrated peak” function.
Adapter preparation and ligation
To 30 µl of A-tailed DNA, 10 µl of 5x ligation buffer, 5 µl of DNA ligase and 5 µl of DNA adaptor (30 µM) were added. 50 µl of this reaction mixture were added to each well. The plate was then incubated at 20 °C (room temperature) for 15 min. The sample was cleaned up using Agencourt AMPure SPRI beads by addition of 40 µl of SPRI beads to each well. Each sample was then eluted with 30 µl of 10 mM Tris buffer (pH 8.0) / 0.05 % (v/v) Tween 20. The same six samples randomly chosen for the DNA fragmentation step were run on an
Agilent DNA 1000 chip to check the success of the ligation; the smear obtained had an average molecular size of 50 to 200 bp larger than before ligation.
PCR amplification
Library amplification was carried out according to a modified reaction setup as defined in the KAPA Library Preparation kit instructions (Illumina). 25 µl of 2x Kapa HIFI Hotstart Mix
were spiked with 1 M betaine, aiding in amplification of high-GC-content regions and reducing biases. 1 µl of the supplied PCR primers were added to each tube and a sufficient quantity of water was added to reach 50 µl of solution volume per PCR reaction. The tubes were then transferred into a PCR thermocycler and run with the recommended KAPA library preparation cycling programme. The samples were then cleaned using the SPRI beads
method with 40 µl of SPRI beads. Each sample was eluted using 50 µl of 10 mM Tris-HCl / 0.05 % (v/v) Tween 20. The same six samples as those randomly chosen at the DNA fragmentation step were run on an Agilent Bioanalyzer DNA 1000 chip to check the success of the indexing enrichment PCR.
Final library quantification by qPCR
The KAPA Library Quant kit (KapaBiosystems) for final library quantification was used (Illumina). Briefly, after an initial 1:1 000 dilution in 10 mM Tris-HCl, pH 8.0 + 0.05 % (v/v) Tween 20, the 2x KAPA SYBR FAST qPCR master mix was used to amplify the DNA library with six other standards on an ABI 7900 thermocycler. The qPCR step was conducted using the following cycling conditions: 95 °C for 5 min followed by 35 cycles at 95 °C for
30 s and at 60 °C for 45 s. The concentration of the library was established using the standards according to the manufacturer’s instructions. Both the standards and the libraries with unknown concentration were run in triplicate. Prior to sequencing, the libraries were diluted to the required concentration (e.g. 4.5 pM) by following the Illumina cluster generation protocol.
Sequence assembly
For each of the two sampling dates, the raw paired-end 100 nt read metagenomic and metatranscriptomic sequences were processed separately first using the PAired-eND Assembler4 (PANDAseq, Supplementary Figure 1, step 1) to assemble overlapping read pairs. PANDAseq was run with a score threshold of 0.9 and 25 nt minimum overlap
requirement to determine the location of the amplification primers, identify the optimal overlap between reads, correct sequencing errors and check length and base quality. The reads selected by the PANDAseq assembler were extracted from the raw sequence files using in-house Perl scripts. The remaining non-redundant paired-end reads were trimmed using the trim-fastq.pl script from the PoPoolation package5 using a quality-threshold of 20 (1 %
probability of miscall) and a minimum length of 40 nt resulting in two quality trimmed read sets, one including still paired-end and one containing only single-end reads, where the other
read pair was discarded during quality trimming. Metagenome and metatranscriptome FASTQ files were then merged into a combined FASTQ file. All PANDAseq and single-end reads were then combined into a single FASTQ file. Paired-end and single-end reads were made non-redundant using CD-HIT-dup6 (Supplementary Figure 1, step 2). Non-redundant reads were then used as input for the MOCAT assembly pipeline7 (Supplementary Figure 1, step 3), using default parameters. The assembled contigs were filtered with a minimum length threshold of 150 nt. To enhance the final assembly by reads that were assembled by PANDAseq, but not used by the MOCAT assembly pipeline, all PANDAseq
reads were mapped onto the MOCAT contigs using SOAPaligner8, with the following parameter settings: -r 2 -M 4 -l 30 -v 10 -p 8 (Supplementary Figure 1, step 4). The unmapped PANDAseq assembled reads with a minimum length of 150 bp, were extracted and added to the contigs. The final contig files were made non-redundant using CD-HIT6 (-c 1.0) by clustering identical sequences.
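The quality threshold of 20 used for read trimming above is stated to correspond to a 1 % probability of a miscall. This follows from the standard Phred definition, Q = -10 log10(P). A minimal sketch of the conversion is given below; the function names are illustrative and are not part of the PoPoolation package.

```python
import math


def phred_to_error_prob(q):
    """Probability of an incorrect base call for Phred quality score q."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality score for a base-call error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    for q in (20, 30):
        print(q, phred_to_error_prob(q))
```

By the same relation, the minimum average QV of 30 applied to the reads used for EMIRGE corresponds to an error probability of 0.1 %.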
Extraction and classification of reads mapping to rRNA genes in the metagenomic data
EMIRGE9 was run using metagenomic reads quality filtered to minimum average QV = 30 and a minimum of 40 bp with the trim-fastq.pl script from the PoPoolation package5. The reference database used was the truncated (non-redundant) small ribosomal subunit SILVA10 database release 111. The consensus sequences at 100 iterations were extracted and named
based on their identification in the reference database, without imposing any normalized posterior probability.
Generation of additional metagenomic data for contig extension and analysis
To obtain an additional high-depth metagenomic dataset, a total of four additional floating sludge islet samples were collected on 23rd February 2011, each representing a biological replicate. Biomolecular extraction was performed on 200 mg of starting material as for the other two dates, using the protocol described by Roume et al.1. The resulting DNA fractions were sequenced under sample names I, II, III and IV using the sequencing protocol described above. Four technical replicates from sample I were generated, resulting in four libraries I-1, I-2, I-3 and I-4, while the remaining biological replicates II, III and IV were sequenced once each, generating a total of 12.6 gigabases of metagenomic sequence. The reads were then assembled using AMOS11 and MetaVelvet12.
Gene annotation
Non-redundant contig files were split into two distinct files, one file with contigs of a length below 500 bp and another file with contig lengths above or equal to 500 bp. The contigs with lengths below 500 bp were annotated using FragGeneScan13 (Supplementary Figure 1, step 5), using settings for short sequence reads with sequencing error (-complete 0 -train illumina_5). The contigs with a length equal to or above 500 bp were annotated with Prodigal gene finder14 (v2.60, Supplementary Figure 1, step 5) using the MOCAT gene prediction
processing steps. The resulting amino acid sequence files were then combined into a single file and made non-redundant using CD-HIT with a sequence identity threshold of 1.0 and a description string length within the cluster file of 5 000 (-c 1.0 -d 5000; Supplementary Figure 1, step 6). All sequences were mapped to the KEGG database version 64.0 using BLAT15 and sequences
were annotated with KOs (KEGG orthologous groups; Supplementary Figure 1, step 7).
Pre-processing of protein fraction for high-throughput analysis
Following 1D-SDS-PAGE electrophoresis and staining with Imperial protein stain (Thermo Scientific, Erembodegem, Belgium; as described in Roume et al.1), the protein gel was conserved at 4 °C in the dark and under vacuum in sealing foil D0316L-20 (DOMO ELEKTRO, Herentals, Belgium). Prior to further analysis, entire lanes were cut into 1 mm slices using a grid cutter (MEE-1x5, Gel Company, San Francisco, CA, USA), yielding approximately 70 slices per lane. Two 1 mm slices were combined in a single well of a 96-well V-bottom plate with a hole introduced into the bottom using a 30 gauge lancet needle (Becton Dickinson, Franklin Lakes, NJ, USA). The wells contained size 11 black hexagonal
glass beads (SB3656, Fusion Beads) to prevent the gel pieces from clogging the hole. For in-gel digestion, an automated liquid handling system (Tecan EVO, Männedorf, Switzerland) was used for reduction, alkylation, tryptic digestion and peptide extraction from the gel pieces. After extraction, the peptide solution was dried and reconstituted in 20 µl of a solution of 0.1 % (v/v) [trifluoroacetic acid (TFA, 5 %; Sigma) / acetonitrile (ACN, 95 %;
BioSolve, Valkenswaard, Netherlands)] in MilliQ H2O in a round bottom polypropylene 96-well plate (Greiner Bio-One, Monroe, NC, USA) and placed into an Eksigent Nano 2D plus system autosampler (ABSciex, Framingham, MA, USA) for analysis.
Liquid chromatography
Peptides obtained from the 1 mm-gel bands were separated using an Eksigent Nano 2D LC
plus system employing splitless nanoflow. Reverse phase high performance liquid chromatography (RP-HPLC) and separation columns were prepared in-house by packing a Kasil fritted capillary [360 µm outer diameter (OD), 75 µm inner diameter (ID)] with a 1 cm bed of ReproSil Pur C18-AQ 3 µm 120 Å stationary phase (Dr. Maisch GmbH, Ammerbuch,
Germany) for the sample trap and desalting column. A Kasil fritted capillary (360 µm OD, 75 µm ID) was packed with a 15 cm bed of the same stationary phase as the separation column and this was connected to a PicoTip emitter (360 µm OD x 20 µm ID, Tip 10 µm, FS360-20-10-N-20) for nano-electrospray ionisation. For each LC run, the sample was injected for 10 minutes at 2.5 µl/min with loading buffer (2 % v/v acetonitrile and 0.1 % v/v formic acid). The sample was separated by a linear gradient changing from 98 % solvent A
(0.1 % v/v formic acid in water) and 2 % solvent B (0.1 % v/v formic acid in acetonitrile) to 40 % A and 60 % B in 60 min at 0.3 µl/min.
Mass spectrometry
Following LC separation, the peptides were analysed on a LTQ-Velos Orbitrap (ThermoFisher, San Jose, CA, USA). MS1 data were collected over the range of 300 – 2 000 m/z in the Orbitrap at a resolution of 30 000. Fourier-transform mass spectrometry (FTMS) preview scan and predictive automatic gain control (pAGC) were enabled. The full scan FTMS target ion volume was 1 x 10^6 with a maximum fill time of 500 ms. MS2 data were collected in the LTQ-Velos with a target ion volume of 1 x 10^4 and a maximum fill time of 100 ms. The 10 most intense peaks were selected (within a window of 2.0 Da) for higher-energy collisional
dissociation at 15 000 resolution in the Orbitrap. Dynamic exclusion was enabled in order to exclude an observed precursor for 180 s after two observations. The dynamic exclusion list size was set at the maximum 500 and the exclusion width was set at ±5 ppm based on precursor mass. Monoisotopic precursor selection and charge state rejection were enabled to reject precursors with z = +1 or unassigned charge state.
Protein identification
For MS analysis, Thermo .RAW files were converted to mzXML format using MSConvert (ProteoWizard16) and searched with X!Tandem17 version 2011.12.01.1. Spectra were searched against the metagenomic and metatranscriptomic data, common lab protein contaminants, and decoys. Redundancy was removed from these three data sets using
BlastClust. The contaminant database was a modified version of the common Repository of Adventitious Proteins (cRAP, www.thegpm.org/crap) with the Sigma Universal Standard Proteins removed and human angiotensin II and [Glu-1] fibrinopeptide B (MS test peptides) added, for a total of 66 entries. Decoys were generated with Mimic (www.kaell.org), which randomly shuffles peptide sequences between tryptic residues, but retains peptide sequence
homology in decoy entries. Search criteria used for X!Tandem included a precursor mass tolerance of 15 ppm and a fragment mass tolerance of 15 ppm for higher-energy collisional dissociation spectra. Peptides were assumed to be tryptic (cleavage after K or R except when followed by P), but semi-tryptic peptides with up to 2 missed cleavages were allowed. The search parameters
included a static modification of +57.021464 Da at C for carbamidomethylation by iodoacetamide and potential modifications of +15.994915 Da at M for oxidation, -17.026549 Da at N-terminal Q for deamidation, and -18.010565 Da at N-terminal E for loss of water from formation of pyro-Glu. Additionally, -17.026549 Da at the N-terminal carbamidomethylated C for deamidation from formation of S-carbamoylmethylcysteine and
N-terminal acetylation were searched. Peptide spectrum matches (PSMs) obtained from X!Tandem were validated using the Trans Proteomic Pipeline18 version 4.6 Rev.1. The PSMs were analysed with PeptideProphet19 to assign each PSM a probability of being correct. Accurate mass binning was employed to promote PSMs whose theoretical mass closely matched the observed mass of the precursor ion, and to correct for any systematic mass
errors. Decoys and the non-parametric model option were used to improve PSM scoring. Protein identifications were inferred with ProteinProphet. The ProteinProphet scores were then analysed in iProphet20, which combines results from multiple fractions and multiple database searches (although here, only X!Tandem was used) and assigns a probability for each unique protein and its corresponding peptide sequences. The false discovery rate for a
given iProphet probability was calculated using the number of decoy protein inferences at that probability. Only proteins identified at iProphet probabilities corresponding to a false discovery rate (FDR) less than 1.0 % were further considered.
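The decoy-based false discovery rate described above can be sketched as follows. The estimator shown here (number of decoy inferences divided by number of target inferences at or above a probability threshold) is an assumption for illustration; the exact formula applied to the iProphet output is not specified in the text, and the function names and toy data are hypothetical.

```python
def decoy_fdr(entries, threshold):
    """Estimate FDR at a probability threshold.

    entries: list of (probability, is_decoy) tuples, one per protein inference.
    Assumed estimator: decoys / targets among entries with probability >= threshold.
    """
    decoys = sum(1 for p, is_decoy in entries if p >= threshold and is_decoy)
    targets = sum(1 for p, is_decoy in entries if p >= threshold and not is_decoy)
    return decoys / targets if targets else 0.0


def lowest_threshold_below(entries, max_fdr=0.01):
    """Smallest observed probability at which the estimated FDR is below max_fdr."""
    for t in sorted({p for p, _ in entries}):
        if decoy_fdr(entries, t) < max_fdr:
            return t
    return None


if __name__ == "__main__":
    # Toy data: 210 target and 6 decoy inferences.
    toy = ([(0.99, False)] * 200 + [(0.95, True)]
           + [(0.90, False)] * 10 + [(0.50, True)] * 5)
    print(lowest_threshold_below(toy))
```

Proteins with a probability at or above the returned threshold would then be retained for further analysis.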
KO annotations and protein quantification
The corresponding sequences of the identified proteins were collected in FASTA format from
the non-redundant single peptides/protein sequences database previously generated from the combined metagenome and metatranscriptome assemblies. Relative protein quantitation was performed using the normalized spectral index (NSI) measure using an in-house software tool called NSICalc as previously described21. The amino acid sequences of identified proteins were then mapped to the KO library (KEGG database version 64.0) using BLAT15 (e-
value50, score>50). From the resulting list of KOs, the frequency of each KO was determined at the protein level, using an in-house developed Perl script.
Gene copy and transcript abundances
To account for differences in read depth and sampling, the numbers of raw sequence reads from autumn and winter metagenome and metatranscriptome libraries were equilibrated by randomly selecting reads in the larger libraries from the autumn sample to mirror the number of reads in the smaller library (winter) using an in-house Perl script based on the ‘shuffle’ method from the ‘List::Util’ CPAN package. This resulted in 14 546 374 reads and 16 443 761 reads being used from the metagenomic and metatranscriptomic libraries, respectively. The four balanced raw read sequence libraries were then mapped separately to
the combined assembly for both sampling dates using SOAPaligner8 with the following parameter settings: -r 2 -M 4 -l 30 -v 10 -p 8. For each library, reads were mapped to genes and counted, except for reads mapping to multiple genes, for which weighted proportions were used. Next, to obtain the abundances of genes and transcripts, read counts were normalized by the length of the respective gene
sequences22, to obtain normalized gene copy abundances and normalized transcript abundances, respectively. Normalized gene copy abundances per KO were obtained by calculating the sum of normalized gene copy abundances from all genes belonging to the same KO group. Similarly, the KO-wise transcript abundances were calculated as the sum of normalized transcript abundances over all genes within the same KO.
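The length normalization and KO-wise summation described above can be sketched as follows. Scaling to reads per kilobase is an assumption, chosen to match the units used elsewhere in these methods; read counts may be fractional because reads mapping to multiple genes contribute weighted proportions. The KO labels and function names are placeholders.

```python
from collections import defaultdict


def normalized_abundance(read_count, gene_length_nt):
    """Read count divided by gene length, expressed as reads per kilobase."""
    return read_count / (gene_length_nt / 1000.0)


def ko_abundances(genes):
    """Sum length-normalized abundances over all genes assigned to the same KO.

    genes: iterable of dicts with keys 'ko', 'reads' and 'length' (nt).
    """
    totals = defaultdict(float)
    for gene in genes:
        totals[gene["ko"]] += normalized_abundance(gene["reads"], gene["length"])
    return dict(totals)


if __name__ == "__main__":
    toy_genes = [
        {"ko": "KO_a", "reads": 30, "length": 1500},
        {"ko": "KO_a", "reads": 10.5, "length": 500},  # fractional count from a multi-mapping read
        {"ko": "KO_b", "reads": 4, "length": 2000},
    ]
    print(ko_abundances(toy_genes))
```

The same function applies to metagenomic and metatranscriptomic read counts, giving KO-wise gene copy and transcript abundances, respectively.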
Relative gene expression
Relative expression of KOs was determined by dividing the normalized transcript abundance of each KO by the inferred gene copy abundance of the same KO23. The relative expression of a KO is greatly dependent on the normalized gene copy abundance if this value is close to zero. Consequently, KOs with normalized gene copy abundances close to zero are prone to be falsely identified as highly expressed. Therefore, highly expressed KOs were selected based on normalized gene copy and transcript abundances, in addition to relative expression. KOs were considered highly expressed if their relative expression was above the 90th percentile. In addition, to avoid false positive identification of highly expressed KOs, the normalized
transcript abundances of highly expressed KOs had to be above the 3rd quartile of the normalized transcript abundances of all KOs or normalized gene copy abundances had to be above an empirically determined threshold. This threshold was determined by sorting KOs by their normalized gene copy abundances and applying a sliding window approach to determine the lowest normalized gene copy abundance with an average relative expression
robustly within the interquartile range (see Supplementary Figure 2).
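The selection rule described above can be sketched as follows. A KO is retained if its relative expression exceeds the 90th percentile and, in addition, either its transcript abundance exceeds the 3rd quartile or its gene copy abundance exceeds the empirical threshold. The percentile helper uses linear interpolation, which is an assumption, and the threshold value is passed in rather than derived by the sliding window approach. All names and toy values are hypothetical.

```python
def percentile(values, q):
    """Percentile by linear interpolation between order statistics (q in [0, 100])."""
    xs = sorted(values)
    if not xs:
        raise ValueError("empty input")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def highly_expressed(kos, gene_copy_threshold):
    """Select highly expressed KOs.

    kos: dict mapping KO -> (gene_copy_abundance, transcript_abundance).
    gene_copy_threshold: empirically determined lower cut-off (supplied by the caller).
    """
    rel = {k: t / g for k, (g, t) in kos.items() if g > 0}
    rel_cut = percentile(rel.values(), 90)
    transcript_cut = percentile([t for _, t in kos.values()], 75)
    return sorted(
        k for k, r in rel.items()
        if r > rel_cut
        and (kos[k][1] > transcript_cut or kos[k][0] > gene_copy_threshold)
    )


if __name__ == "__main__":
    toy = {f"KO{i:02d}": (100.0, 50.0 + i) for i in range(19)}
    toy["KO_high"] = (100.0, 1000.0)  # abundant and strongly transcribed
    toy["KO_noise"] = (0.1, 2.0)      # high ratio only because gene copy is near zero
    print(highly_expressed(toy, gene_copy_threshold=1.0))
```

In this toy example both KO_high and KO_noise exceed the 90th percentile of relative expression, but only KO_high passes the additional abundance criteria.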
Sensitivity of gene expression analysis to imposed cut-offs
Results of the analysis of KOs exhibiting high relative expression are dependent on the quantile of KOs considered highly expressed (we considered KOs with a relative expression above the 90th percentile), as well as the lower cut-off which was set for gene copy abundances to avoid false positive identification of KOs exhibiting very low gene copy abundances and only slightly higher transcript abundances as highly expressed. Therefore, we first analysed whether the numbers and identities of KOs identified based on their relative expression were robust to different levels of noise added to the data. In addition, we analysed whether the conclusions would also be robust to small to moderate changes in the selected
cut-off values. To address the first point (robustness to noise), we changed the gene copy and transcript abundances by a random number following a uniform distribution within different limits. The lowest of these limits corresponded to +/- 1 read mapped per kilobase of metagenomic sequence and the highest to +/- 50 reads mapped per kilobase. We then analysed the numbers
and identities of the highly expressed genes in 100 repetitions of each test, given the chosen cut-offs and the identities of enriched pathways within the selected sets of KOs. We compared the results to the same analysis carried out using a single numerical cut-off for
relative expression (minimal relative transcript abundance = 10 times relative gene copy abundance).
To address the second point (robustness to changes in cut-offs), we changed each cut-off within small to moderate limits. For the inclusion of KOs irrespective of their gene copy abundances, cut-offs at steps between the 55th and 95th percentile of transcript abundances were used. For the exclusion of KOs with low gene copy abundances, different cut-offs of robust relative expression were set between the 20th and 95th percentile. We then analysed the
numbers and identities of the KOs found to be highly expressed, as well as the pathways enriched in these KOs.
Comparison of the metagenomic dataset with the metatranscriptomic and metaproteomic datasets
The congruency of metagenomic and metatranscriptomic datasets was determined by calculating the proportion of KOs with at least one gene having at least 10 metagenomic
reads per kilobase mapping to it and also at least one gene resulting in the mapping of 10 metatranscriptomic reads per kilobase. The congruency of the metagenomic and metaproteomic datasets was calculated analogously as the proportion of KOs with at least one gene mapped by at least 10 metagenomic reads per kilobase that also had at least one gene identified at the protein level.
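The congruency measure described above can be sketched as follows. The denominator is assumed to be the set of KOs with at least one gene supported by at least 10 metagenomic reads per kilobase, following the wording used for the metaproteomic comparison. The function name and toy values are hypothetical.

```python
def congruency(mg, mt, cutoff=10.0):
    """Proportion of metagenome-supported KOs that are also supported by a second dataset.

    mg, mt: dicts mapping KO -> list of per-gene reads per kilobase
            (metagenomic and metatranscriptomic, respectively).
    """
    supported = {k for k, values in mg.items() if any(v >= cutoff for v in values)}
    if not supported:
        return 0.0
    shared = {k for k in supported if any(v >= cutoff for v in mt.get(k, []))}
    return len(shared) / len(supported)


if __name__ == "__main__":
    mg = {"KO_a": [12, 3], "KO_b": [15], "KO_c": [2], "KO_d": [40]}
    mt = {"KO_a": [0, 11], "KO_b": [5], "KO_c": [100], "KO_d": [10]}
    print(congruency(mg, mt))
```

For the metaproteomic comparison, the second condition would be replaced by a test for whether any gene of the KO was identified at the protein level.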
Analysis of pathway membership
Assignment of KOs to KEGG pathways was done by using the KO to pathway link (http://rest.kegg.jp/link/Ko/pathway) in the KEGG database version 67.1. Enrichment of KOs
in specific pathways was tested using a hypergeometric test and p-values were adjusted using FDR-control24. Test results with adjusted p-values below 0.05 were considered significant.
Community-wide metabolic network reconstructions
The KO to R (reaction) link (http://rest.kegg.jp/link/Ko/reaction) was used to associate each individual KO to the corresponding reactions, followed by the R to RP (reaction pair) link (http://rest.kegg.jp/link/reaction/RP), thereby associating individual reactions to their corresponding main reaction pairs. The RPAIR annotation was specifically chosen to ignore
unspecific compounds of reactions (water, energy carriers and cofactors), thereby, only taking into account the main compounds of a reaction. The reaction pairs (RP) were then further
selected
by
using
(http://rest.kegg.jp/link/rn/rc).
only Finally,
RPAIRs the
with
reaction
assigned
reaction
pair
compounds
-
classes link
(http://rest.kegg.jp/list/RP) was used to associate individual RPs to corresponding pair(s) of 405
compounds. As some KOs have identical compounds (e.g. subunits of the same enzyme or enzymes that catalyse each other’s reverse reaction), KOs with identical compounds (as annotated in the KEGG database version 67.1) were grouped. A KO network graph was built by using all KOs as nodes. Edges between two KOs were introduced if a product metabolite of one KO was found as a substrate metabolite of the other KO. Multiple edges between the
410
same KOs were reduced into a single edge. For topological analysis of the reconstructed metabolic network, nodes sharing the exact same edges were regrouped to be represented as a single node. Multiple edges connecting the same nodes were likewise combined into a single edge. The undirected network graph was visualized and analysed using Cytoscape25, employing a spring embedded layout. Singleton nodes not connected to any other node were
415
removed.
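The graph construction steps described above (grouping of KOs with identical compounds, edge creation, regrouping of nodes with identical edges, removal of singletons) can be sketched as follows. This is a simplified Python illustration and not the pipeline used in the study: the input layout (one set of substrates and one set of products per KO), the function name and the toy identifiers are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def build_ko_network(ko_io):
    """Build an undirected KO graph.

    ko_io maps a KO identifier to (substrates, products), two sets of compound IDs.
    Returns {node: set of neighbouring nodes}; each node is a frozenset of KO IDs.
    """
    # 1. Group KOs with identical substrate and product compounds.
    by_compounds = defaultdict(set)
    for ko, (substrates, products) in ko_io.items():
        by_compounds[(frozenset(substrates), frozenset(products))].add(ko)
    groups = {frozenset(kos): key for key, kos in by_compounds.items()}

    # 2. Connect two groups if a product of one is a substrate of the other.
    #    Storing neighbours in sets reduces multiple edges to a single edge.
    adjacency = {node: set() for node in groups}
    for a, b in combinations(groups, 2):
        (sub_a, prod_a), (sub_b, prod_b) = groups[a], groups[b]
        if prod_a & sub_b or prod_b & sub_a:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # 3. Regroup nodes that share exactly the same set of neighbours.
    by_neighbours = defaultdict(set)
    for node, neighbours in adjacency.items():
        by_neighbours[frozenset(neighbours)].add(node)
    merged_of = {}
    for nodes in by_neighbours.values():
        merged = frozenset().union(*nodes)
        for node in nodes:
            merged_of[node] = merged
    network = defaultdict(set)
    for node, neighbours in adjacency.items():
        network[merged_of[node]].update(merged_of[n] for n in neighbours)

    # 4. Remove singleton nodes that are not connected to any other node.
    return {node: nbrs for node, nbrs in network.items() if nbrs}


# Hypothetical toy input: K1 and K2 act on the same compounds (e.g. two subunits),
# K3 and K4 both consume the product of K1/K2, K5 is unconnected.
toy = {
    "K1": ({"C1"}, {"C2"}),
    "K2": ({"C1"}, {"C2"}),
    "K3": ({"C2"}, {"C3"}),
    "K4": ({"C2"}, {"C4"}),
    "K5": ({"C9"}, {"C10"}),
}
network = build_ko_network(toy)
```

In this toy case the result contains two nodes, {K1, K2} and {K3, K4}, joined by one edge, and K5 is dropped as a singleton.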
Calculation of gene copy and transcript number and relative expression of regrouped nodes
Gene copy numbers and transcript numbers of nodes which represented several KOs due to their sharing of the same edges were calculated by summing all normalized gene copy and transcript numbers, respectively. Relative expression of a node was calculated by dividing the node-wise sum of normalized transcript abundances by the node-wise sum of normalized gene copy abundances, thereby levelling relative expression of the regrouped KOs.
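As a brief illustration of the node-wise calculation, the following Python sketch divides the summed transcript abundances by the summed gene copy abundances of a regrouped node. The function name and the toy abundances are hypothetical.

```python
def node_relative_expression(node_kos, gene_copies, transcripts):
    """Ratio of summed transcript to summed gene copy abundances for one node."""
    total_copies = sum(gene_copies.get(ko, 0.0) for ko in node_kos)
    total_transcripts = sum(transcripts.get(ko, 0.0) for ko in node_kos)
    if total_copies == 0:
        return None  # relative expression is undefined without gene copies
    return total_transcripts / total_copies


# Hypothetical normalized abundances for a node regrouping two KOs.
gene_copies = {"K3": 2.0, "K4": 6.0}
transcripts = {"K3": 8.0, "K4": 4.0}
rel_expr = node_relative_expression({"K3", "K4"}, gene_copies, transcripts)
```

Here the node-wise value is 12 / 8 = 1.5, whereas the individual ratios would be 4.0 and about 0.67; summing before dividing is what levels the relative expression across the regrouped KOs.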
Identification of choke points
Choke points as defined by Rahman and Schomburg26 are enzymes which consume or produce unique metabolites and possess a high load score in the metabolic network reconstruction. To assess whether a node within the metabolic network reconstructions could qualify as choke point, the number of edges representing every metabolite was counted, yielding the number of occurrences. In the cases where an edge represented more than one metabolite, the lowest number of occurrences was assigned to this edge. Every node was also assigned the lowest number of occurrences of all its edges. Nodes with an assigned number of occurrences of 1 were considered potential choke points. The number of occurrences for every node is listed and potential choke points are highlighted in Supplementary Dataset 7.
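The counting procedure can be made concrete with a small Python sketch. The data layout (a mapping from an edge to the set of metabolites it represents), the function name and the toy network are assumptions for illustration only.

```python
from collections import Counter


def occurrence_numbers(edge_metabolites):
    """Return (edge_occurrence, node_occurrence) for an undirected network.

    edge_metabolites maps an edge (u, v) to the non-empty set of metabolites it represents.
    """
    # Number of edges representing each metabolite.
    metabolite_count = Counter(
        metabolite for metabolites in edge_metabolites.values() for metabolite in metabolites
    )
    # An edge receives the lowest count among its metabolites.
    edge_occurrence = {
        edge: min(metabolite_count[m] for m in metabolites)
        for edge, metabolites in edge_metabolites.items()
    }
    # A node receives the lowest value among all of its edges.
    node_occurrence = {}
    for (u, v), occurrence in edge_occurrence.items():
        for node in (u, v):
            node_occurrence[node] = min(occurrence, node_occurrence.get(node, occurrence))
    return edge_occurrence, node_occurrence


# Hypothetical toy network: metabolite C1 occurs on one edge only, C2 on three edges.
edges = {
    ("A", "B"): {"C1", "C2"},
    ("B", "C"): {"C2"},
    ("C", "D"): {"C2"},
}
edge_occ, node_occ = occurrence_numbers(edges)
potential_choke_points = {node for node, occurrence in node_occ.items() if occurrence == 1}
```

In this example nodes A and B are flagged, because the edge between them is the only one representing C1.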
Weighted load score
Weighting of compounds in a bi-partite RPAIR-based metabolic network reconstruction has been shown to increase pathfinding accuracy27. The number of occurrences described above was assigned as edge weight within the metabolic network reconstructions. A weighted betweenness centrality was calculated using the R-package igraph28. Alternative weighted load scores were calculated for each node from the weighted betweenness centralities and the degree as defined in the manuscript, and potential key functionalities were determined using this measure and expression as described in the main manuscript. The resulting weighted load scores and the identities of the key nodes were compared to the unweighted results discussed in the manuscript.
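To show what a weighted betweenness centrality measures, the following self-contained Python sketch implements Brandes' algorithm with Dijkstra searches for an undirected graph. It is an illustration only and does not reproduce the igraph computation used in the study; it assumes that edge weights are positive and are interpreted as distances, and it reports unnormalised values with every node pair counted once. The load score itself is defined in the main manuscript and is not computed here.

```python
import heapq
from collections import defaultdict
from itertools import count


def weighted_betweenness(edges):
    """Betweenness centrality for an undirected graph; edges maps (u, v) to a positive weight."""
    graph = defaultdict(dict)
    for (u, v), weight in edges.items():
        graph[u][v] = weight
        graph[v][u] = weight
    betweenness = dict.fromkeys(graph, 0.0)
    tie = count()  # tie-breaker so that the heap never compares node objects
    for source in graph:
        dist = {source: 0.0}
        sigma = defaultdict(float)  # number of shortest paths from source
        sigma[source] = 1.0
        preds = defaultdict(list)
        settled, order = set(), []
        heap = [(0.0, next(tie), source)]
        while heap:
            d, _, v = heapq.heappop(heap)
            if v in settled:
                continue  # stale heap entry
            settled.add(v)
            order.append(v)
            for w, weight in graph[v].items():
                new_dist = d + weight
                if w not in dist or new_dist < dist[w]:
                    dist[w] = new_dist
                    sigma[w] = sigma[v]
                    preds[w] = [v]
                    heapq.heappush(heap, (new_dist, next(tie), w))
                elif new_dist == dist[w] and w not in settled:
                    sigma[w] += sigma[v]  # another shortest path of equal length
                    preds[w].append(v)
        # Back-propagate pair dependencies in order of decreasing distance.
        delta = dict.fromkeys(order, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                betweenness[w] += delta[w]
    # Each unordered pair was visited from both ends.
    return {node: value / 2.0 for node, value in betweenness.items()}


# Hypothetical triangle: the direct A-C edge is as long as the detour via B.
bc = weighted_betweenness({("A", "B"): 1, ("B", "C"): 1, ("A", "C"): 2})
```

With the weights above, half of the shortest A-C paths pass through B, so B receives a value of 0.5; raising the A-C weight to 3 would route all A-C paths through B.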
Matching of genes to Candidatus Microthrix parvicella Bio17 genome
Amino acid sequences were aligned using BLAT15 with cut-offs chosen as follows: e-value < 10^-5, %-identity > 50, score > 50.
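Applying these three cut-offs to alignment hits can be sketched as follows. The sketch assumes, for illustration, that hits are available in a 12-column tab-separated layout (query, subject, % identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, e-value, score); the output format actually used is not stated here, and the identifiers in the example are hypothetical.

```python
def parse_hit(line):
    """Parse one line of a 12-column tab-separated hit table (assumed layout)."""
    fields = line.rstrip("\n").split("\t")
    return {
        "query": fields[0],
        "subject": fields[1],
        "identity": float(fields[2]),
        "evalue": float(fields[10]),
        "score": float(fields[11]),
    }


def passes_cutoffs(hit, max_evalue=1e-5, min_identity=50.0, min_score=50.0):
    """Apply the cut-offs: e-value < 10^-5, identity > 50 %, score > 50."""
    return (
        hit["evalue"] < max_evalue
        and hit["identity"] > min_identity
        and hit["score"] > min_score
    )


# Hypothetical hits: only the first one satisfies all three criteria.
lines = [
    "gene1\tMp_0001\t87.5\t200\t25\t0\t1\t200\t1\t200\t1e-50\t310",
    "gene2\tMp_0002\t45.0\t150\t80\t2\t1\t150\t5\t154\t1e-10\t80",
    "gene3\tMp_0003\t90.0\t30\t3\t0\t1\t30\t1\t30\t2e-3\t45",
]
matched = [hit["query"] for hit in map(parse_hit, lines) if passes_cutoffs(hit)]
```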
Alignment of contigs encoding key functionalities
Contigs were selected based on the following criteria: (1) they encoded a gene annotated with a KO representing a key functionality, and (2) the expression of this gene was corroborated by at least one mapped metatranscriptomic read. Selected contigs were aligned to the NCBI non-redundant nucleotide database using BLASTn with default parameters29. The best hit with a query coverage above 50 % and a percentage identity above 80 % for each contig was documented. In addition, contigs containing genes annotated as K03921 were aligned to 85 isolate genomes from the same biological wastewater treatment (BWWT) plant using BLAST and selecting only results with percentage identity above 80 %.
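Selecting the best qualifying hit per contig can be sketched as below. Two simplifying assumptions are made for this illustration and are not taken from the study: query coverage is computed from a single aligned segment, and "best" is taken to mean the highest bit score. All names and values are hypothetical.

```python
def best_hits(hits, contig_lengths, min_coverage=50.0, min_identity=80.0):
    """Return, per contig, the highest-scoring hit above both thresholds."""
    best = {}
    for hit in hits:
        aligned = abs(hit["qend"] - hit["qstart"]) + 1
        coverage = 100.0 * aligned / contig_lengths[hit["contig"]]
        if coverage > min_coverage and hit["identity"] > min_identity:
            current = best.get(hit["contig"])
            if current is None or hit["bitscore"] > current["bitscore"]:
                best[hit["contig"]] = hit
    return best


# Hypothetical hits for two contigs.
hits = [
    {"contig": "c1", "subject": "s1", "qstart": 1, "qend": 600, "identity": 95.0, "bitscore": 900.0},
    {"contig": "c1", "subject": "s2", "qstart": 1, "qend": 900, "identity": 85.0, "bitscore": 1100.0},
    {"contig": "c1", "subject": "s3", "qstart": 1, "qend": 300, "identity": 99.0, "bitscore": 2000.0},
    {"contig": "c2", "subject": "s4", "qstart": 1, "qend": 500, "identity": 75.0, "bitscore": 400.0},
]
contig_lengths = {"c1": 1000, "c2": 500}
documented = best_hits(hits, contig_lengths)
```

In this example the hit to s3 is discarded despite its high score because it covers only 30 % of contig c1, and contig c2 has no documented hit because its only alignment falls below 80 % identity.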
Quantification of isolate sequences in combined metagenomic and metatranscriptomic assemblies
Reads from the balanced libraries (see section Expression analysis and contextualization of omic datasets - gene copy and transcript abundances) were mapped against the genomes of Nitrosomonas sp. Is79 (Ref. 30) and Isolate LCSB065 using SOAPaligner8 with the following parameter settings: -r 2 -M 4 -l 30 -v 10 -p 8 and mapped reads were counted.
Ammonia monooxygenase (AMO) contig extension
Genes encoding subunits A and B of ammonia monooxygenase (amoA and amoB) are established phylogenetic markers31. However, none of the contigs from the combined autumn and winter assembly that contained an open reading frame annotated as encoding a subunit of AMO (K10944, K10945, or K10946) harboured a complete gene. In order to recover a full gene sequence and determine the position of the amoA genes recovered from the combined metagenomic and metatranscriptomic data relative to other known amoA sequences within a phylogenetic tree, we employed a contig extension protocol in order to increase the length of the contigs via extension and merging of these contigs. The contig extension protocol was carried out in a step-wise fashion including contig extension, gene calling and gene annotation. Contig alignment, merging and extension was performed using minimus2 (Ref. 32) from the AMOS suite33 with a minimum overlap of 60 bases at 98 % identity. The contig extension protocol was performed in three steps: i) contigs were extended by aligning contigs annotated with the same KO IDs; ii) contigs were extended by aligning contigs annotated with KOs K10944, K10945 or K10946; iii) contigs were extended using a high-depth metagenomic assembly (see section Generation of additional metagenomic data for contig extension and analysis) from a different sample, using the AMO contigs from the previous step as a reference. This procedure was performed because the genes encoding the subunits of AMO are known to exist in a cluster/operon34. Following the minimus2 run, gene calling was performed on the resultant contigs using FragGeneScan13 and Prodigal14 using default parameters as previously used. Predicted amino acid sequences were then merged and made non-redundant based on 100 % sequence identity using CD-HIT35 and were re-annotated with KO IDs using our annotation pipeline. Gene calling and re-annotation steps were performed to ensure that the extended contigs retained their original annotation reference. The extended contigs were then used for downstream analysis in order to associate these AMO genes to a bacterial species.
AmoA phylogenetic analysis
Nearly complete amino acid sequences (201 – 274 amino acids) of AmoA and/or MmoA from representative organisms belonging to the beta-Proteobacteria, gamma-Proteobacteria and archaea were retrieved from the Refseq protein database (see Supplementary Table 3) and aligned with ClustalOmega (using default parameters). The alignment file was submitted to a phylogenetic analysis using the Phylogeny.fr customized workflow service36 including alignment curation with Gblocks37 (using default parameters), tree construction with PhyML38 (bootstrap of 100), and visualization by TreeDyn39.
Isolate LCSB065 isolation
Isolate LCSB065 was obtained from an OMMC biomass sample diluted by a factor of 10^4. The biomass was first cultivated on Petri dishes of wastewater-agar medium (1.5 % agar; w/v) in 800 ml filtered (0.2 µm, Sartorius, Göttingen, Germany) wastewater from the Schifflange BWWT plant. A single colony was then transferred to a Petri dish with R2A medium40 and cultivated at 20 °C under aerobic conditions. Isolates were grown on different growth media recommended for the culture of bacteria from water and wastewater, particularly Microthrix parvicella, such as R2A40, wastewater agar medium, MSV + peptone and MSV A + B41 or Slijkhuis A and F42 under different growth conditions.
Nile red staining
Lipid inclusions were visualized using a protocol modified from Fowler & Greenspan43. A stock solution of Nile red (Sigma-Aldrich, Diegem, Belgium) in acetone (Sigma-Aldrich) was prepared at a concentration of 500 µg/ml and preserved at 4 °C protected from light. 50 µl of a working solution, containing 2.5 µl of the stock solution in 1 ml of 75 % (v/v) glycerol, were deposited onto a microscopy glass slide with heat-fixed bacterial cells. After 5 min incubation, epifluorescence and bright field microscopic observations of the same fields of view were carried out on an inverted microscope (Nikon Ti) equipped with a 60 × oil immersion Nikon Apo-Plan lambda objective (1.4 N.A). Intermediate magnification 1.5 × was used in order to better resolve images. Excitation light was from a Xenon arc lamp, and the beam was passed through an Optoscan monochromator (Cairn Research, Kent, UK) with 550/20 nm selected band pass. Emitted light was reflected through a 620/60 nm bandpass filter with a 565 dichroic connected to a cooled CCD camera (QImaging, Exi Blue). All imaging data were collected and analysed using OptoMorph (Cairn Research, Kent, UK) and ImageJ44.
Isolate genome sequencing and genome assembly
Following DNA extraction from isolate cultures using the Power Soil DNA isolation kit (MO BIO, Carlsbad, CA), a paired-end sequencing library with a theoretical insert size of 300 bp was prepared with the AMPure XP/Size Select Buffer Protocol as previously described by Kozarewa & Turner3, modified to allow for size-selection of fragments using the double solid phase reversible immobilisation procedure described earlier by Rodrigue et al.45 and sequenced on an Illumina HiSeq with a read length of 100 bp. The resulting 854 683 paired raw reads were de-duplicated with FastUniq46, and quality filtered to a minimum average QV = 30 and a minimum length of 60 bp with the trim-fastq.pl script from the PoPoolation suite5, leading to 551 103 read pairs and 145 500 single-end reads (high quality data yield of 73 %). Two separate preliminary assemblies were obtained with IDBA-UD47 (v1.1.0, with parameters --mink 30 --maxk 90 --step 5 --similar 98 --pre_correction) and SPAdes 2.5.0, using the hammer read error correction module which filtered for contigs shorter than 200 bp. The resulting assemblies were merged with phrap (minimum overlap of 50 bp). The merged assembly was manually inspected with Consed48 and contigs were broken where merge conflicts were detected. The resulting contigs were uploaded to and analysed using RAST49. According to this analysis, the isolate was a Rhodococcus sp. Similarity of contigs to bacterial genomes (NCBI database, accessed 6th March 2014) was assessed by BLAT15. The bitscore of each hit was recorded and only contigs with a hit to Rhodococcus spp. with a bitscore within 60 % of the best hit’s bitscore were selected for the final set. This led to the removal of 120 contigs, most of which had low coverage of reads.
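The bitscore-based contig selection can be sketched as follows. The criterion "within 60 % of the best hit's bitscore" is interpreted here as a Rhodococcus hit scoring at least 0.6 times the best bitscore of the contig; this interpretation, the function name and the toy hits are assumptions, and the threshold is exposed as a parameter for that reason.

```python
def keep_contig(hits, genus="Rhodococcus", fraction=0.6):
    """Decide whether a contig is retained.

    hits is a list of (subject_taxon, bitscore) tuples for one contig.
    """
    if not hits:
        return False
    best = max(score for _, score in hits)
    return any(
        taxon.startswith(genus) and score >= fraction * best
        for taxon, score in hits
    )


# Hypothetical hits for three contigs.
contig_hits = {
    "contig_1": [("Rhodococcus jostii", 900.0), ("Nocardia farcinica", 1000.0)],
    "contig_2": [("Escherichia coli", 1000.0), ("Rhodococcus opacus", 300.0)],
    "contig_3": [],
}
final_set = [contig for contig, hits in contig_hits.items() if keep_contig(hits)]
```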
Filtered reads were mapped onto the assembled contigs using BWA50 with default parameters. Reads mapping with mapping quality scores at or above five were used to assess contig coverage (Supplementary Dataset 8). The final set of contigs was submitted to the AmphoraNet server to determine the isolate species by searching for 31 phylogenetic marker genes51, as well as to RAST49 (accession number 6666666.64457) and annotated.
The COG protein profile of Isolate LCSB065 was determined as previously described by Muller et al.21. Annotation of KOs was carried out on protein predictions from RAST as described for metagenomic proteins.
Supplementary Results and Discussion
Sensitivity of gene expression analysis to imposed cut-offs
We found that the cut-offs imposed to avoid false-positives within the highly expressed genes and genes encoding key functionalities led to a greater robustness against noise than would be observed if simple numerical cut-offs had been chosen. This was obvious from the simulation of noise, in which the number and variance of highly expressed genes grew with the noise when a numerical cut-off was chosen, whereas the choice of cut-offs, as defined herein, resulted in mostly stable numbers (Supplementary Figure 3a&b). The identities and numbers of KOs identified as exhibiting high relative expression in our datasets were not overly sensitive to the cut-offs imposed to avoid false-positive results. The numbers of highly expressed KOs decreased by less than 20 % by the exclusion of KOs with low gene copy abundances (Supplementary Figure 3c). As most genes with very high transcript abundances did not have low gene copy numbers (Figure 3), the cut-off selected for inclusion of KOs with high transcript abundances irrespective of their gene copy numbers changed the total number of highly expressed KOs by less than 5 %. If the transcript abundance cut-off for genes with low gene copy abundances was set between the 55th and 95th percentile (our default value was the 75th percentile), enrichment with the KOs above the 90th percentile of the highly expressed KOs was consistently detected for 4 or 5 out of the 5 pathways in autumn, and for 5 out of the 6 pathways in winter (Supplementary Figure 3d). Only few additional pathways were enriched after variation of this cut-off (Supplementary Figure 3e). In particular, the findings of pathways ko00910 “Nitrogen metabolism” and ko00190 “Oxidative phosphorylation” as exhibiting overall high levels of gene expression were resilient to changes in cut-offs.
In conclusion, the described gene expression analysis is robust to noise, as well as to small to moderate changes in the chosen cut-offs.
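The pathway enrichment whose stability is assessed above rests on a hypergeometric test followed by FDR adjustment (see Analysis of pathway membership). A minimal Python sketch of both steps is given below. It is illustrative only: the Benjamini-Hochberg procedure is assumed as the FDR-control method, and the function names and example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, sample_size, overlap):
    """P(X >= overlap) when sample_size KOs are drawn from population KOs,
    of which pathway_size belong to the pathway of interest."""
    upper = min(pathway_size, sample_size)
    tail = sum(
        comb(pathway_size, i) * comb(population - pathway_size, sample_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / comb(population, sample_size)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # from the largest p-value downwards
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical example: 10 KOs in total, 3 in the pathway, 3 highly expressed,
# all 3 of which fall into the pathway.
p_single = hypergeom_enrichment_p(population=10, pathway_size=3, sample_size=3, overlap=3)

# Hypothetical raw p-values for four pathways.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
significant = [p < 0.05 for p in adjusted]
```

In the example only the first pathway remains below the 0.05 threshold after adjustment, which illustrates why raw and adjusted p-values can lead to different sets of enriched pathways.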
Effect of regrouping redundant KOs into single nodes
To carry out a topological analysis of the reconstructed metabolic network, nodes and edges were rendered non-redundant, by representing multiple KOs with identical substrate and product metabolites as a single node. Due to this step, 229 and 220 nodes representing more than one KO were part of the autumn and winter metabolic network reconstructions, respectively. The calculated load scores were overall only mildly affected by the regrouping (Spearman correlation of load scores in the redundant and non-redundant autumn or winter network reconstructions: 0.98). 70 % of key functionalities identified in the non-redundant network were shared between both autumn network reconstructions (40 % for the winter network reconstructions), as reported in Supplementary Dataset 7. The rationale behind regrouping redundant KOs into single nodes was based on the fact that most KOs represent subunits of enzyme complexes, which do not work in parallel, but rather cooperatively in metabolizing substrates. Consequently, some of the additional nodes found in the networks with regrouped redundant KOs represent multi-subunit complexes, such as AMO, and we therefore believe that the practice of regrouping redundant KOs into single nodes is warranted.
Genomic analysis of Isolate LCSB065
Mapping of filtered reads onto Isolate LCSB065’s contigs revealed a mean/standard deviation empirical sequencing insert size of 244±43 bp, and a mean read depth per mapped position (coverage) of 27±11x (median 25x).
As a first approach to analyse Isolate LCSB065’s genetic potential, protein coding genes were annotated with KOs as before for the metagenomic and metatranscriptomic sequences. Of utmost interest, out of 3 373 protein coding genes that could be annotated with KOs, 420 genes (12.5 %) were annotated as KOs belonging to “Lipid metabolism”, according to KEGG Orthology. This corresponds to rank 4 behind “Amino acid metabolism” (19.1 %), “Carbohydrate metabolism” (16.2 %) and “Xenobiotics biodegradation and metabolism” (14.0 %). Similar results were obtained from COG categories52 and SEED subsystems categories of the predicted proteins53. The assembled genome also contains genes for the synthesis and polymerisation of poly-hydroxybutyrate (PHB; Supplementary Dataset 8, indicated in Figure 5b) and for the synthesis of TAGs (Supplementary Dataset 8, indicated in Figure 5b), inclusions of which are visible following Nile Red staining (see Supplementary Figure 8). Nine other genes besides the gene matching the three metagenomic contigs encoding acyl-[acyl-carrier protein] desaturases are annotated as desaturases in the isolate genome. The genomic region of the gene matching the three metagenomic contigs encoding acyl-[acyl-carrier protein] desaturases was assessed. The gene is the first gene of a syntenic block of four genes present in Rhodococcus jostii RHA1, Nocardia farcinica IF, Gordonia bronchialis and Tsukamurella paurometabola DSM 20162. This block encodes the homologous fatty acid desaturase (peg.6927), followed by a tRNA dihydrouridine synthase B, a cell envelope-associated transcriptional attenuator LytR-CpsA-Psr of the subfamily A1 and a phosphate transport system regulatory protein PhoU. The genome of Isolate LCSB065 furthermore contains six genes coding for lipases with an export signal peptide, thereby reinforcing its potential keystone role in the community (Supplementary Dataset 8). The genomic enrichment of genes involved in lipid metabolism suggests that the isolated Rhodococcus sp. may occupy a function as keystone species within the sampled OMMCs. Additional work is required to elucidate the exact role of this organismal group within the OMMC community.
List of Supplementary Figures
Supplementary Figure 1
Overview of the assembly and annotation pipeline.
Supplementary Figure 2
Determination of lower thresholds for gene abundances for the selection of highly expressed KOs within the reconstructed community-wide metabolic networks.
Supplementary Figure 3
Results of the sensitivity analyses.
Supplementary Figure 4
Expression of KOs in metabolic pathways at the protein level.
Supplementary Figure 5
Generalized OMMC-wide metabolic network reconstructed from the combined metagenomic and metatranscriptomic data of OMMCs sampled in autumn and winter.
Supplementary Figure 6
Representation of the fatty acid metabolic pathway (ko01212).
Supplementary Figure 7
OMMC-wide metabolic networks reconstructed from metagenomic and metatranscriptomic data of OMMCs sampled in autumn and winter.
Supplementary Figure 8
Micrographs of Isolate LCSB065.
List of Supplementary Tables
Supplementary Table 1
Physicochemical characteristics of the wastewater at the time of sampling in the anoxic tank of the Schifflange BWWT plant.
Supplementary Table 2
Quantitative and qualitative characteristics of biomacromolecular fractions sequentially isolated from the OMMC samples.
Supplementary Table 3
Statistics of the combined assembly of the metagenomic and metatranscriptomic sequence datasets.
Supplementary Table 4
Ammonia-oxidizing organisms with corresponding accession numbers of amoA genes used for the reconstruction of the phylogenetic tree.
Supplementary Table 5
Numbers of metagenomic and metatranscriptomic reads from autumn and winter dates mapping to the isolate genomes of Nitrosomonas sp. Is79 and Isolate LCSB065.
List of Supplementary Datasets
Supplementary Dataset 1
Dominant genera in the autumn and winter samples determined based on reconstruction of 16S rRNA gene sequences from the metagenomic data.
Supplementary Dataset 2
KO gene copy (KOGA), transcript (KOTA) and protein (NSI) abundances in datasets from autumn (04 October 2010) and winter (25 January 2011) samples.
Supplementary Dataset 3
Highly expressed metabolic KOs in the (a) autumn and (b) winter sample and pathways overrepresented by metabolic KOs highly expressed in the (c) autumn or (d) winter sample.
Supplementary Dataset 4
Generalized reconstructed microbial community-level metabolic network reconstructed using the combined metagenomic and metatranscriptomic datasets in simple interaction format.
Supplementary Dataset 5
Autumn-specific reconstructed microbial community-level metabolic network in simple interaction format.
Supplementary Dataset 6
Winter-specific reconstructed microbial community-level metabolic network in simple interaction format.
Supplementary Dataset 7
Results of topological analyses. List of all nodes in the metabolic networks reconstructed from OMMCs sampled in (a) autumn and (b) winter with topological measures and expression values. (c) KOs with betweenness centrality values different between metabolic networks reconstructed from OMMCs sampled in autumn and winter, and (d) pathways enriched in these KOs. KOs with key functionality in OMMCs sampled in (e) autumn or (f) winter including topological analysis results, gene abundances and expression, as well as protein abundances and pathway membership. (g) Results of BLAST searches of all genes encoding key functionalities against publicly available bacterial genomes.
Supplementary Dataset 8
Summary of characteristics of Isolate LCSB065’s genome, results of the phylogenetic analysis, and coverage of Isolate LCSB065 contigs by reads from the isolate genome sequencing.
Supplementary Figures
[Supplementary Figure 1, flowchart. Input: metagenomic reads and metatranscriptomic reads (autumn sample, winter sample). Steps: (1) PANDAseq + trim-fastq.pl: overlaps joined and trimmed, QC; (2) CD-HIT: processed reads made non-redundant; (3) MOCAT assembly pipeline: combined assembly of metagenomic and metatranscriptomic reads, contigs > 150 nt; (4) SOAP: reads joined by PANDAseq mapped to MOCAT contigs to recruit unassembled reads; (5) Prodigal: genes called on contigs ≥ 500 nt, and FragGeneScan: genes called on contigs < 500 nt; (6) CD-HIT: predicted amino acid sequences made non-redundant; (7) BLAST-X and KEGG: enzyme annotation (e-value < 10^-5, %identity > 50, score > 50). Legend: nucleotide sequence; amino acid sequence; KXXXXX: KO identifier.]
Supplementary Figure 1 Overview of the assembly and annotation pipeline. Annotated KOs were used for the subsequent metabolic network reconstructions. QC: quality control, 1-7: steps in the pipeline.
[Supplementary Figure 2, panels a-d. Panels (a) and (b): moving median of autumn and winter relExpr versus autumn and winter KOGA. Panels (c) and (d): autumn and winter relExpr versus autumn and winter KOGA. Axes are logarithmic.]
Supplementary Figure 2 Determination of lower thresholds for gene abundances for the selection of highly expressed KOs within the reconstructed community-wide metabolic networks. (a) and (b) Moving median of KO relative expression (relExpr) versus gene copy abundance (KOGA) for (a) autumn and (b) winter. (c) and (d) KO relative expression (relExpr) versus gene copy abundance (KOGA) for (c) autumn and (d) winter. Vertical lines indicate lowest gene abundances with robust gene expression values within the interquartile range.
[Supplementary Figure 3, panels a-e. Panels (a) and (b): number of genes versus noise [reads per kb], each comparing "exclusion of genes with low copy abundance" with "simple cut-off". Panels (c), (d) and (e): highly expressed genes, conserved enriched pathways and additional enriched pathways, respectively, versus the cut-off for exclusion of genes with low gene copy abundance (0.5 to 1), shown for autumn and winter.]
Supplementary Figure 3 Results of the sensitivity analyses. (a) and (b) Effect of noise on the number and identity of genes with high relative expression determined by comparing a simple numerical cut-off and our method for the reduction of false-positives by excluding genes with very low gene copy abundances. Filled boxes indicate total number of genes with high relative expression, boxes with blue or brown lines indicate the sizes of the intersect of genes with high relative expression without noise and after addition of noise; (a) autumn dataset, (b) winter dataset. (c) Effect of changing the cut-offs for exclusion of genes with low gene copy abundances to reduce false-positives on numbers of genes with a high relative expression. (d) and (e) Effect of changing the cut-offs for exclusion of genes with low gene copy abundances to reduce false-positives on the identities of pathways enriched with highly expressed genes. (d) Number of pathways enriched with highly expressed genes that are found at the chosen cut-off (0.75) and after variation of the cut-off. (e) Number of additional pathways enriched in highly expressed genes, which are not found at the chosen cut-off but after relaxation of the cut-off.
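The comparison in panels (a) and (b) amounts to selecting genes twice, once from the original values and once from noise-perturbed values, and intersecting the two selections. The sketch below illustrates that idea under loudly stated assumptions: relative expression is taken as the ratio of transcript to gene copy abundance, noise is drawn uniformly and added to both values, and every name and number is hypothetical rather than taken from the analysis itself.

```python
import random

def highly_expressed(abund, rel_cutoff, min_copy=0.0):
    """Genes whose transcript / gene-copy ratio exceeds rel_cutoff.

    abund maps gene -> (gene_copy_abundance, transcript_abundance).
    Genes with gene copy abundance below min_copy are excluded.
    """
    return {g for g, (dna, rna) in abund.items()
            if dna >= min_copy and dna > 0 and rna / dna > rel_cutoff}

def add_noise(abund, level, rng):
    """Add uniform noise in [0, level] to both abundances (an assumed noise model)."""
    return {g: (dna + rng.uniform(0, level), rna + rng.uniform(0, level))
            for g, (dna, rna) in abund.items()}

# Hypothetical abundances in reads per kb.
abund = {"geneA": (50.0, 500.0),   # abundant and highly expressed
         "geneB": (0.2, 4.0),      # high ratio, but very low copy abundance
         "geneC": (80.0, 40.0)}    # abundant, lowly expressed

rng = random.Random(0)
noisy = add_noise(abund, level=5.0, rng=rng)

simple = highly_expressed(abund, rel_cutoff=5.0)
filtered = highly_expressed(abund, rel_cutoff=5.0, min_copy=1.0)

# Intersect between the selection without noise and after addition of noise.
stable_simple = simple & highly_expressed(noisy, rel_cutoff=5.0)
stable_filtered = filtered & highly_expressed(noisy, rel_cutoff=5.0, min_copy=1.0)
```

In this toy case the simple cut-off also selects `geneB`, whose ratio rests on a very small denominator, whereas the copy-abundance filter drops it before noise can change its status.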
[Figure panels a to f, log-scale axes. Axis labels: autumn KOGA, winter KOGA, autumn KOTA, winter KOTA, autumn relExp, winter relExp, autumn proteomics NSI, winter proteomics NSI, autumn rel. protein expr., winter rel. protein expr.]
Supplementary Figure 4 Expression of KOs in metabolic pathways at the protein level. (a) and (b) Comparison of gene copy abundances (KOGA) and protein abundances (NSI) in the (a) autumn and (b) winter samples. (c) and (d) Comparison of gene transcript abundances (KOTA) and protein abundances (NSI) in the (c) autumn and (d) winter samples. (e) and (f) Comparison of expression values relative to gene copy numbers (KOGA), transcript expression levels (KOTA) and protein abundances (rel. protein exp.) in the (e) autumn and (f) winter samples. (c to f) Highly expressed KOs are highlighted in red.
[Network figure: nodes are labelled with KO identifiers, either single KOs (e.g. "K00665") or groups of KOs joined by hyphens (e.g. "K00523-K12452"). The labels Node 1 to Node 8 also appear in the figure.]
Supplementary Figure 5 Generalized OMMC-wide metabolic network reconstructed from the combined metagenomic and metatranscriptomic data of OMMCs sampled in autumn and winter. Coloured nodes indicate pathways enriched in highly expressed KOs; blue – oxidative phosphorylation; yellow – nitrogen metabolism; red – TCA cycle; purple – glycerolipid metabolism. Large grey nodes are highly expressed during at least one season. Opacity of nodes indicates shortest average path length (the more transparent, the longer the path length).
[Pathway diagram. Metabolite labels: Acetyl-CoA, Malonyl-ACP, Hexadecanoyl-CoA, Palmitic acid, Stearoyl-CoA, Oleoyl-CoA, Linoleoyl-CoA, Octadecatrienoyl-CoA, Icosadienoyl-CoA, Icosatrienoyl-CoA, Icosenoyl-CoA, Icosanoyl-CoA. KO labels: K00059, K00208, K00647, K00665, K01716, K01897, K02371, K02372, K03921, K09458, K10256, K10257, K10258.]
Supplementary Figure 6 Representation of the fatty acid metabolic pathway (ko01212). KOs with a betweenness centrality that is much higher in the metabolic network reconstructed from the OMMC sampled in winter compared to the OMMC sampled in autumn are highlighted in red. The key functionality of KO K03921 (acyl-[acyl-carrier protein] desaturase) is highlighted in pink. Winter KOs with a high relative gene expression are represented in blue. The dotted lines represent the continuity of the pathway and the circles represent metabolites.
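Betweenness centrality, the measure referred to in this caption, counts how many shortest paths between other pairs of nodes run through a given node. The following is a minimal sketch of Brandes' algorithm on an unweighted, undirected toy graph. The graph is hypothetical, and nothing here is meant to describe the software or settings the authors used.

```python
from collections import deque

def betweenness(adj):
    """Unnormalised betweenness centrality for an undirected, unweighted graph.

    adj maps each node to the set of its neighbours (Brandes' algorithm).
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:               # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:    # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}

# Hypothetical network: A - B - C - D, plus a branch C - E.
toy = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D", "E"},
       "D": {"C"}, "E": {"C"}}
bc = betweenness(toy)
```

Comparing such values for the same KO between two networks (here, autumn and winter) is one way a seasonal shift in its topological importance could be expressed.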
Supplementary Figure 7 OMMC-wide metabolic networks reconstructed from metagenomic and metatranscriptomic data of OMMCs sampled in (a) autumn and (b) winter. Large nodes indicate KOs encoding key functionalities whereas colours represent pathway membership: yellow, nitrogen metabolism; orange, fatty acid biosynthesis; light green, benzoate degradation; dark green, porphyrin and chlorophyll metabolism; pink, cysteine and methionine metabolism; and red, other pathways.
Supplementary Figure 8 Micrographs of Isolate LCSB065. (a) Non-polar granules observed in Isolate LCSB065 following Nile Red staining (λex 550/20 nm, λem 620/60 nm); (b) Bright field micrograph; (c) Overlay of (a) and (b). The scale bar is equivalent to 10 µm.
Supplementary Tables
Supplementary Table 1 Physicochemical characteristics of the wastewater at the time of sampling in the anoxic tank of the Schifflange BWWT plant.
Sampling dates    Suspended solids (g/l)   pH     NO3- (mg/l)   O2 (mg/l)   NH4 (mg/l)   PO4 (mg/l)   Air temperature (°C)   Water temperature (°C)
4 October 2010    2.78                     6.94   2.77          0.64        0.6          2.06         15                     20.7
25 January 2011   3.24                     7.01   3.37          1.13        1.72         1.02         0                      14.5
Supplementary Table 2 Quantitative and qualitative analyses of biomacromolecular fractions sequentially isolated from the OMMC samples.
Sampling dates    DNA A260/A280   RNA quantity [µg]   RNA RIN   Protein quantity [µg]   Protein quantity per gel lane [µg]
4 October 2010    2.20            15.44               8.9       81.97                   20.1
25 January 2011   2.03            6.11                9.2       75                      21.8
Supplementary Table 3 Statistics of the combined assembly of the metagenomic and metatranscriptomic sequence datasets.
Statistic               Value           Unit
Total contig length     1 617 735 059   nt
Average contig length   239             nt
N50                     258             nt
Maximal contig length   12 591          nt
Number of contigs       6 761 781
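Assembly statistics of this kind can be derived from a list of contig lengths. The sketch below is illustrative only: `n50` follows the common definition (the length of the contig at which the cumulative length, counted from the longest contig downwards, reaches half of the total), and whether the reported N50 was computed in exactly this way is an assumption. The toy lengths are hypothetical; the last line merely checks that the reported total and contig count are consistent with the reported average.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")

toy_lengths = [100, 200, 300, 400, 500]   # hypothetical contigs, in nt
toy_n50 = n50(toy_lengths)                # 500 + 400 = 900 covers half of 1500

# Consistency check on the reported values: total length / number of contigs.
average = 1_617_735_059 / 6_761_781       # about 239 nt
```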
Supplementary Table 4 Ammonia-oxidizing organisms with corresponding accession numbers of amoA genes used for the reconstruction of the phylogenetic tree.
Organism name                          Accession number
Candidatus Nitrososphaera gargensis    gi|166007511
Methylobacter tundripaludum            gi|493947876
Methylococcus capsulatus Bath          gi|53803062
Methylomicrobium alcaliphilum 20Z      gi|357403888
Nitrosococcus halophilus Nc4           gi|292490804
Nitrosococcus oceani ATCC19707         gi|77165960
Nitrosococcus watsonii C-113           gi|300113334
Nitrosomonas cryotolerans ATCC49181    gi|12620331
Nitrosomonas europaea ATCC19718        gi|30248947
Nitrosomonas eutropha C91              gi|114332043
Nitrosomonas sp. Is79A3                gi|339481967
Nitrosomonas sp. JL21                  gi|19310210
Nitrosomonas sp. AL212                 gi|325981491
Nitrosospira briensis C-128            gi|1732262
Nitrosospira multiformis ATCC25196     gi|82701932
Nitrosospira sp. APG3A                 gi|490283770
Nitrosospira sp. En13                  gi|121483569
Nitrosospira sp. NpAV                  gi|2062746
Nitrosospira sp. Np39-19               gi|2425028
Nitrosovibrio tenuis                   gi|1732264
Supplementary Table 5 Numbers of metagenomic and metatranscriptomic reads from autumn and winter dates mapping to the isolate genomes of Nitrosomonas sp. Is79 (ref. 30) and Isolate LCSB065.
Dataset                    Sampling dates    Target organism: Nitrosomonas sp. Is79   Target organism: Isolate LCSB065
metagenomic reads          4 October 2010    7 282                                    5 665
metagenomic reads          25 January 2011   13 058                                   7 941
metatranscriptomic reads   4 October 2010    4 004                                    1 923
metatranscriptomic reads   25 January 2011   28 631                                   20 784
Supplementary References 1. Roume H, Muller EE, Cordes T, Renaut J, Hiller K, Wilmes P. A biomolecular isolation 650
framework for eco-systems biology. ISME J 2013; 7: 110-121.
2. Roume H, Heintz-Buschart A, Muller EE, Wilmes P. Sequential isolation of metabolites, RNA, DNA, and proteins from the same unique sample. Microbial Metagenomics, Metatranscriptomics, and Metaproteomics. Method Enzymol 2013; 531: 219-236. 655 3. Kozarewa I, Turner DJ. 96-plex molecular barcoding for the Illumina Genome Analyzer. High-Throughput Next Generation Sequencing 2011; 279-298.
4. Masella AP, Bartram AK, Truszkowski JM, Brown DG, Neufeld JD. PANDAseq: paired660
end assembler for illumina sequences. BMC Bioinformatics 2012; 13: 31.
5. Kofler R, Orozco-terWengel P, De Maio N, Pandey RV, Nolte V, Futschik A et al. PoPoolation: a toolbox for population genetic analysis of next generation sequencing data from pooled individuals. PloS one 2011; 6: e15925. 665 6. Huang Y, Niu B, Gao Y, Fu L, Li W. CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics 2010; 26: 680-682.
7. Kultima JR, Sunagawa S, Li J, Chen W, Chen H, Mende DR et al. MOCAT: a 670
metagenomics assembly and gene prediction toolkit. PloS one 2012; 7: e47656.
8. Li R, Li Y, Kristiansen K, Wang J. SOAP: short oligonucleotide alignment program.
49
Bioinformatics 2008; 24: 713-714.
675
9. Miller CS, Baker BJ, Thomas BC, Singer SW, Banfield JF. EMIRGE: reconstruction of full-length ribosomal genes from microbial community short read sequencing data. Genome Biol 2011; 12: R44.
10. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P et al. The SILVA ribosomal 680
RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res 2013; 41: D590-D596.
11. Treangen TJ, Koren S, Sommer DD, Liu B, Astrovskaya I, Ondov B et al. MetAMOS: a modular and open source metagenomic assembly and analysis pipeline. Genome Biol 2013; 685
14: R2.
12. Namiki T, Hachiya T, Tanaka H, Sakakibara Y. MetaVelvet: an extension of Velvet assembler to de novo metagenome assembly from short sequence reads. Nucleic Acids Res 2012; 40: e155-e155. 690 13. Rho M, Tang H, Ye Y. FragGeneScan: predicting genes in short and error-prone reads. Nucleic Acids Res 2010; 38: e191-e191.
14. Hyatt D, Chen G-L, LoCascio P, Land M, Larimer F, Hauser L. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 2010; 11: 119.
15. Kent WJ. BLAT—the BLAST-like alignment tool. Genome Res 2002; 12: 656-664.
16. Kessner D, Chambers M, Burke R, Agus D, Mallick P. ProteoWizard: open source software for rapid proteomics tools development. Bioinformatics 2008; 24: 2534-2536.
17. Craig R, Cortens JP, Beavis RC. Open source system for analyzing, validating, and storing protein identification data. J Proteome Res 2004; 3: 1234-1242.
18. Deutsch EW, Mendoza L, Shteynberg D, Farrah T, Lam H, Tasman N et al. A guided tour of the Trans-Proteomic Pipeline. Proteomics 2010; 10: 1150-1159.
19. Keller A, Nesvizhskii AI, Kolker E, Aebersold R. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal Chem 2002; 74: 5383-5392.
20. Shteynberg D, Deutsch EW, Lam H, Eng JK, Sun Z, Tasman N et al. iProphet: multilevel integrative analysis of shotgun proteomic data improves peptide and protein identification rates and error estimates. Mol Cell Proteomics 2011; 10: M111.007690.
21. Muller EE, Pinel N, Laczny CC, Hoopmann MR, Narayanasamy S, Lebrun LA et al. Community-integrated omics links dominance of a microbial generalist to fine-tuned resource usage. Nat Commun 2014; 5: 5603; 1-10.
22. Lee S, Seo CH, Lim B, Yang JO, Oh J, Kim M et al. Accurate quantification of transcriptome from RNA-Seq data by effective length normalization. Nucleic Acids Res 2011; 39: e9-e9.
23. Tsementzi D, Poretsky R, Rodriguez-R LM, Luo C, Konstantinidis KT. Evaluation of metatranscriptomic protocols and application to the study of freshwater microbial communities. Environ Microbiol Rep 2014; 6: 640-655.
24. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological) 1995; 289-300.
25. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 2003; 13: 2498-2504.
26. Rahman SA, Schomburg D. Observing local and global properties of metabolic pathways: ‘load points’ and ‘choke points’ in the metabolic networks. Bioinformatics 2006; 22: 1767-1774.
27. Faust K, Croes D, van Helden J. Metabolic pathfinding using RPAIR annotation. J Mol Biol 2009; 388: 390-414.
28. Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal, Complex Systems 2006; 1695: 1-9.
29. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol 1990; 215: 403-410.
30. Bollmann A, Sedlacek CJ, Norton J, Laanbroek HJ, Suwa Y, Stein LY et al. Complete genome sequence of Nitrosomonas sp. Is79, an ammonia oxidizing bacterium adapted to low ammonium concentrations. Stand Genomic Sci 2013; 7: 469.
31. Liu W, Li L, Khan MA, Zhu F. Popular molecular markers in bacteria. Mol Genet Microbiol Virol 2012; 27: 103-107.
32. Sommer DD, Delcher AL, Salzberg SL, Pop M. Minimus: a fast, lightweight genome assembler. BMC Bioinformatics 2007; 8: 64.
33. Treangen TJ, Sommer DD, Angly FE, Koren S, Pop M. Next generation sequence assembly with AMOS. Curr Protoc Bioinformatics 2011; 11.8.1-11.8.18.
34. Sayavedra-Soto L, Hommes N, Alzerreca J, Arp D, Norton JM, Klotz M. Transcription of the amoC, amoA and amoB genes in Nitrosomonas europaea and Nitrosospira sp. NpAV. FEMS Microbiol Lett 1998; 167: 81-88.
35. Fu L, Niu B, Zhu Z, Wu S, Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 2012; 28: 3150-3152.
36. Dereeper A, Guignon V, Blanc G, Audic S, Buffet S, Chevenet F et al. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res 2008; 36: W465-W469.
37. Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol 2000; 17: 540-552.
38. Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol 2010; 59: 307-321.
39. Chevenet F, Brun C, Bañuls A-L, Jacq B, Christen R. TreeDyn: towards dynamic graphics and annotations for analyses of trees. BMC Bioinformatics 2006; 7: 439.
40. Reasoner D, Geldreich E. A new medium for the enumeration and subculture of bacteria from potable water. Appl Environ Microbiol 1985; 49: 1-7.
41. Levantesi C, Rossetti S, Thelen K, Kragelund C, Krooneman J, Eikelboom D et al. Phylogeny, physiology and distribution of ‘Candidatus Microthrix calida’, a new Microthrix species isolated from industrial activated sludge wastewater treatment plants. Environ Microbiol 2006; 8: 1552-1563.
42. Slijkhuis H. Microthrix parvicella, a filamentous bacterium isolated from activated sludge: cultivation in a chemically defined medium. Appl Environ Microbiol 1983; 46: 832-839.
43. Fowler SD, Greenspan P. Application of Nile red, a fluorescent hydrophobic probe, for the detection of neutral lipid deposits in tissue sections: comparison with oil red O. J Histochem Cytochem 1985; 33: 833-836.
44. Schneider CA, Rasband WS, Eliceiri KW. NIH Image to ImageJ: 25 years of image analysis. Nat Methods 2012; 9: 671-675.
45. Rodrigue S, Materna AC, Timberlake SC, Blackburn MC, Malmstrom RR, Alm EJ, Chisholm SW. Unlocking short read sequencing for metagenomics. PLoS One 2010; 5: e11840.
46. Xu H, Luo X, Qian J, Pang X, Song J, Qian G et al. FastUniq: a fast de novo duplicates removal tool for paired short reads. PLoS One 2012; 7: e52249.
47. Peng Y, Leung HC, Yiu S-M, Chin FY. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 2012; 28: 1420-1428.
48. Gordon D, Abajian C, Green P. Consed: a graphical tool for sequence finishing. Genome Res 1998; 8: 195-202.
49. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA et al. The RAST Server: rapid annotations using subsystems technology. BMC Genomics 2008; 9: 75.
50. Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 2009; 25: 1754-1760.
51. Kerepesi C, Bánky D, Grolmusz V. AmphoraNet: The webserver implementation of the AMPHORA2 metagenomic workflow suite. Gene 2014; 533: 538-540.
52. Tatusov RL, Galperin MY, Natale DA, Koonin EV. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res 2000; 28: 33-36.
53. Overbeek R, Begley T, Butler RM, Choudhuri JV, Chuang H-Y, Cohoon M et al. The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes. Nucleic Acids Res 2005; 33: 5691-5702.