Document not found! Please try again

identification of key functionalities in microbial community ... - Nature

0 downloads 0 Views 8MB Size Report
Jan 25, 2011 - well plate (Greiner Bio-One, Monroe, NC, USA) and placed into an Eksigent Nano 2D plus ..... aligned using BLAT15 with cut-offs chosen as follows: e-value < ... using the Phylogeny.fr customized workflow service36 including.
SUPPLEMENTARY INFORMATION

Comparative integrated omics: identification of key functionalities in microbial community-wide metabolic networks

5

Hugo Roumea,1*, Anna Heintz-Buscharta*, Emilie EL Mullera, Patrick Maya, Venkata P Satagopama, Cédric C Lacznya, Shaman Narayanasamya, Laura A Lebruna, Michael R Hoopmannb, James M Schuppc, John D Gillecec, Nathan D Hicksc, David M Engelthalerc, Thomas Sauterd, Paul S Keimc, Robert L Moritzb, and Paul Wilmesa,2

10

a

Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg.

b

Institute for Systems Biology, Seattle, WA, USA.

c

15

The Translational Genomic Research Institute-North, Flagstaff, AZ, USA.

d

Life Science Research Unit, University of Luxembourg, Luxembourg, Luxembourg.

1

Current address: Laboratory of Microbial Ecology and Technology, Ghent University, Belgium

2

Corresponding author: P Wilmes, Luxembourg Centre for Systems Biomedicine, University of Luxembourg,

L-4362 Esch-sur-Alzette, Luxembourg. E-mail: [email protected].

* These authors contributed equally to this work.

20

Supplementary Materials and Methods

Biomolecular extraction and quality assessment For each sampling date, three 200 mg sub-samples derived from one OMMC islet (defined, herein1, as technical replicates) were used for biomolecular extraction. Each of the individual biomolecular fractions isolated from the technical replicates were combined into a pool in 25

order to yield sufficient biomolecular quantities for subsequent high-throughput analysis and to reduce the influence of fine-scale within-sample heterogeneity (as demonstrated by Roume et al.2). For the quality assessment of the isolated genomic DNA, fractions were separated by electrophoresis on a 1 % agarose gel containing 4 ‰ ethidium bromide (PlusOne Ethidium

30

Bromide, GE Healthcare). For size estimation, the MassRuler DNA ladder mix (Fermentas) was loaded onto the gels. Agarose gels were visualized on an InGenius gel imaging and analysis system (Syngene, Cambridge, UK; as described in Roume et al.1). The DNA pool sample was snap-frozen in liquid nitrogen in the elution buffer and stored at -80 °C until library preparation and sequencing.

35

RNA quality assessment and quantification was carried out using an Agilent 2100 Bioanalyzer (Agilent Technologies, Diegem, Belgium; as described in Roume et al.1). The quality of protein extracts was assessed following 1D-SDS-PAGE separation.

Preparation of RNA for shipment and sequencing The volume of the RNA pool sample was adjusted to 180 µl using RNase free-water. A 40

mixture of 18 µl of a 3 M (w/v) sodium acetate solution, 2 µl of a glycogen solution 2

(10.mg/µl) and 600 µl of ice-cold 96 % (v/v) ethanol was added to the RNA solution. The RNA solution was gently mixed by inversion and precipitated at -20 °C for at least 1.h. Following centrifugation at 10 000 x g for 30 min at 4 °C, the RNA pellet was washed three times with ice-cold 70 % (v/v) ethanol. The pellet was then air dried for 5 min at room 45

temperature overlaid with 100 µl of Ambion® RNAlater (Life Technologies, Gent, Belgium) and placed at -20 °C until shipment. For reclamation of the RNA, the solution was centrifuged at 14 000 x g for 10 min at 8 °C. After removal of the supernatant, the pellet was washed three times with 100 µl of 80 % (v/v) ethanol, followed by 25 000 x g centrifugation for 3 min at 8 °C. The pellet was then air dried until all visible ethanol had evaporated. The

50

dried RNA pellet was then re-suspended in 1 mM sodium citrate buffer solution (pH 6.4).

RNA-Sequencing In order to enrich the RNA fraction in mRNA, the rRNA was subtracted from the total RNA fraction using the Ribo-Zero rRNA Removal Kit (Meta-Bacteria; Epicentre, Madison, WI, USA). This was followed by cDNA synthesis and amplification using the ScriptSeqTM v2 55

RNA-Seq library preparation kit (Epicentre).

rRNA removal The procedure consisted of an initial washing and re-suspension of the Ribo-Zero microspheres with dedicated solutions. Following treatment of the total RNA sample with Ribo-Zero removal solution, the RNA sample solution was added to the re-suspended Ribo60

Zero microspheres and selected by hybridisation. The RNA-microsphere solution was then removed and the rRNA-depleted sample was purified by ethanol precipitation.

3

Library preparation The cDNA synthesis and amplification were performed using the ScriptSeqTM v2 RNA-Seq library preparation kit (Epicentre). The procedure consisted of initial fragmentation of the RNA, followed by the annealing of the cDNA synthesis primer. Briefly, addition of 4 µl 65

cDNA Synthesis Master Mix to the fragmented RNA solution was followed by incubation first at 25 °C for 5 min and then 42 °C for 20 min. cDNA Synthesis Master Mix was prepared by combining 3 µl cDNA Synthesis PreMix (ScriptSeq v2 RNA-Seq library preparation kit), 0.5 µl of DTT (100 mM) and 0.5 µl of StartScript Reverse Transcriptase solution (Epicentre). After cooling to 37 °C, 1 µl of finishing solution (Epicentre) was added to the cDNA

70

synthesis solution and incubated at 37 °C for 10 min, before an incubation at 95 °C for 3 min. 8 µl of terminal tagging master mix was then added to each solution and incubated at 25 °C for a further 15 min, following incubation at 95 °C for 3 min. Terminal tagging master mix was prepared using 7.5 µl of terminal tagging premix (Epicentre) and 0.5 µl of DNA polymerase. The 3’-terminal tagged cDNA was then purified using the AMPure XP system

75

(Beckman Coulter, Brea, CA, USA). The purified cDNA strand was then amplified by PCR, resulting in the generation of the second strand of cDNA, addition of the Illumina adaptor sequences and incorporation of specific barcodes, as well as amplification. Finally, the RNASeq library was purified using AMPure XP system (Beckman Coulter). The size distribution of the RNA-Seq library was assessed using an Agilent 2100 Bioanalyzer.

80 DNA/cDNA sequencing DNA/cDNA was prepared according to the modified instructions from The Wellcome Trust Sanger Institute3. The metagenomic sequencing protocol used a 96-well library preparation and the molecular barcoding method for Illumina library construction. The barcodes were

4

designed using Hamming codes, which allow single nucleotide sequencing errors to be 85

corrected and single indels (insertions/deletions) to be detected without ambiguity. Several optimisations were employed in the indexing protocol. The tags used were 8 bp long, which allowed the design of larger number of barcodes with error-correcting capability. The bar codes were introduced in a regular PCR, which simplified the PCR step and allowed for use of as few as six cycles. Before pooling, the relative concentration of each sample library was

90

measured by quantitative PCR (qPCR), which then allowed the accurate pooling of libraries together and improved the uniformity of their representation.

DNA fragmentation 1-5 µg of DNA was used for the fragmentation step, as determined following gel analysis, and re-suspended in 75 µl of 10 mM Tris-HCl, pH 8.5. The sample was then sheared for 95

150 s in 100 µl Covaris microtubes (Woburn, MA, USA). The following programme was used: duty cycle 20 %, intensity 5, cycle burst 200, power 37 W, temperature 7 °C and mode ‘Freq sweeping’. The sheared DNA samples were transferred into a MicroAmp optical 96well plate (Life technologies, Grand Island, NY, USA). Six samples, chosen at random, were run on an Agilent Bioanalyzer DNA 1000 chip to check the quality of the fragmentation. A

100

150-250 bp smear was detected and 70-90 % of the initial DNA amount was recovered.

Size selection A large fragment plate was prepared by dispensing 150 µl of beads into wells of a roundbottom Costar plate (Corning, Glendale, AZ, USA) for each sample. Additionally, a smallfragment plate was prepared by dispensing 60 µl of beads into wells of a separated Costar 105

plate for every sample. The large-fragment Costar plate was placed on a magnetic stand, 5

allowing the collection of beads. 45 µl of 10 mM Tris-HCl buffer were removed from the large-fragment wells, leaving all beads in the well. The plate was removed from the magnetic stand and 190 µl of sample from sonication tubes was added to the large fragment wells on the Costar plate. Following 8 min of incubation at room temperature, using two magnetic 110

stands, the large-fragment Costar plate was placed on a magnetic stand to collect beads. Following 5 min incubation at room temperature, 58 µl of bead buffer were removed from the small fragment wells, leaving behind beads and approximately 2 µl of bead buffer. Smallfragment wells were then removed from the magnetic stand. 300 µl of supernatant were transferred from large-fragment wells into small-fragment wells. After transfer, the large-

115

fragment plate was discarded. Samples were then incubated in small-fragment wells for 5 min and the plate was placed on a magnetic plate to collect beads. 300 µl of supernatant were removed and replaced by 300 µl of 80 % ethanol without disturbing beads. Following 30 s of incubation, ethanol was removed. This last step was repeated two times. Following ethanol removal by air-drying for 5 min, the small fragment plate was removed from the

120

magnetic stand and 45 µl of pre-warmed 10 mM Tris-HCl (pH 8.5) were dispensed into the wells. After re-suspension of beads by vigorous mixing, the solution was incubated for 2 min and placed on a magnetic plate to collect beads. Following a further 3 min incubation, 42.5 µl of supernatant containing the size-selected products were transferred to a new plate. All enzymes and reaction buffers used for the end-repair, dA-tailing, ligation and PCR

125

amplification were provided from the KAPA Library Preparation Kits with Standard PCR Library Amplification/Illumina series (KapaBiosystems, Woburn, MA, USA).

6

End-repair To the supernatant containing the sheared DNA, 10 µl of end-repair buffer and 5 µl of endrepair enzyme mix were added. 100 µl of this solution were added to each well. The plate 130

was then covered, vortexed briefly and spun down before being incubated for 30 min at 20 °C in a thermocycler. The samples were then cleaned using Agencourt AMPure SPRI beads (Beckman Coulter). 90 µl of SPRI beads were added into each well, the plate was covered and vortexed for 30 s. The reaction plate was placed on the magnetic SPIRPlate for 10 min and beads separated from the solution. Following incubation for 10 min, the

135

completely clear solution was discarded. 200 µl of 70 % (v/v) ethanol were added to each well and incubated for 30 s at room temperature. The ethanol washes were repeated two times. The reaction plate was dried for 15 min in a thermocycler and each sample was eluted with 43.5 µl of 10 mM Tris buffer (pH 8.0).

A-tailing 140

To 30 µl of end-repaired DNA, 5 µl of 10x A-tailing buffer, 3 µl of A-tailing enzyme and 12.µl of water were added. 50 µl of A-tailing master mix were added to each well. The plate was then covered, vortexed briefly and spun down before being incubated for 30 min at 30 °C in a thermocycler. The sample was then cleaned up using the Agencourt AMPure SPRI bead method with 90 µl of SPRI beads placed in each well. Each sample was then eluted with

145

36 µl of 10 mM Tris buffer (pH 8.0). The same six samples, randomly chosen for the DNA fragmentation step, were run on an Agilent Bioanalyzer DNA 1000 chip and the average sample concentration was calculated using the “integrated peak” function.

7

Adapter preparation and ligation To 30 µl of A-tailed DNA, 10 µl of 5x ligation buffer, 5 µl of DNA ligase and 5 µl of DNA 150

adaptor (30 µM) were added. 50 µl of this reaction mixture, were added to each well. The plate was then incubated at 20 °C (room temperature) for 15 min. The sample was cleaned up using Agencourt AMPure SPRI beads by addition of 40 µl of SPRI beads to each well. Each sample was then eluted with 30 µl of 10 mM Tris buffer (pH 8.0) / 0.05 % (v/v) Tween 20. The same six samples were randomly chosen for the DNA fragmentation step and run on an

155

Agilent DNA 1000 chip to check the success of the ligation; the smear obtained had an average molecular size of 50 to 200 bp larger than before ligation.

PCR amplification Library amplification was carried out according to a modified reaction setup as defined in the KAPA Library Preparation kit instructions (Illumina). 25 µl of 2x Kapa HIFI Hotstart Mix 160

were spiked with 1 M betaine, aiding in amplification of high-GC-content regions and reducing biases. 1 µl of the supplied PCR primers were added to each tube and a sufficient quantity of water was added to reach 50 µl of solution volume per PCR reaction. The tubes were then transferred into a PCR thermocycler and run with the recommended KAPA library preparation cycling programme. The samples were then cleaned using the SPRI beads

165

method with 40 µl of SPRI beads. Each sample was eluted using 50 µl of 10 mM Tris-HCl / 0.05 % (v/v) Tween 20. The same six samples as those randomly chosen at the DNA fragmentation step, were run on an Agilent Bioanalyzer DNA 1000 chip to check the success of the indexing enrichment PCR.

8

Final library quantification by qPCR 170

The KAPA Library Quant kit (KapaBiosystems) for final library quantification was used (Illumina). Briefly, after an initial 1:1 000 dilution in 10 mM Tris-HCl, pH 8.0 + 0.05 % (v/v) Tween 20, the 2x KAPA SYBR FAST qPCR master mix was used to amplify the DNA library with six other standards on an ABI 7900 thermocycler. The qPCR step was conducted using the following cycling conditions: 95 °C for 5 min followed by 35 cycles at 95 °C for

175

30.s and at 60 °C for 45 s. The concentration of the library was established using the standards according to the manufacturer’s instructions. Both the standards and the libraries with unknown concentration were run in triplicate. Prior to sequencing, the libraries were diluted to the required concentration (e.g. 4.5 pM) by following the Illumina cluster generation protocol.

180 Sequence assembly For each of the two sampling dates, the raw paired-end 100 nt read metagenomic and metatranscriptomic sequences were processed separately first using the PAired-eND Assembler4 (PANDAseq, Supplementary Figure 1, step 1) to assemble overlapping read pairs. PANDAseq was run with a score threshold of 0.9 and 25 nt minimum overlap 185

requirement to determine the location of the amplification primers, identify the optimal overlap between reads, correct sequencing errors and check length and base quality. The reads selected by the PANDAseq assembler were extracted from the raw sequence files using in-house Perl scripts. The remaining non-redundant paired-end reads were trimmed using the trim-fastq.pl script from the PoPoolation package5 using a quality-threshold of 20 (1 %

190

probability of miscall) and a minimum length of 40 nt resulting in two quality trimmed read sets, one including still paired-end and one containing only single-end reads, where the other

9

read pair was discarded during quality trimming. Metagenome and metatranscriptome FASTQ files were then combined into a combined FASTQ file. All PANDAseq and singleend reads were then combined into a single FASTQ file. Paired-end and single-end reads 195

were made non-redundant using CD-HIT-dup6 (Supplementary Figure 1, step 2). Noneredundant reads were then used as input for the MOCAT assembly pipeline7 (Supplementary Figure 1, step 3), using default parameters. The assembled contigs were filtered with minimum length threshold of 150 nt. To enhance the final assembly by reads that were assembled by PANDAseq, but not used by the MOCAT assembly pipeline, all PANDAseq

200

reads were mapped onto the MOCAT contigs using SOAPaligner8, with the following parameter settings: -r 2 -M 4 -l 30 -v 10 -p 8 (Supplementary Figure 1, step 4). The unmapped PANDAseq assembled reads with a minimum length of 150 bp, were extracted and added to the contigs. The final contig files were made non-redundant using CD-HIT6 (-c 1.0) by clustering identical sequences.

205 Extraction and classification of reads mapping to rRNA genes in the metagenomic data EMIRGE9 was run using metagenomic reads quality filtered to minimum average QV = 30 and a minimum of 40 bp with the trim-fastq.pl script from the PoPoolation package5. The reference database used was the truncated (non-redundant) small ribosomal subunit SILVA10 database release 111. The consensus sequences at 100 iterations were extracted and named 210

based on their identification in the reference database, without imposing any normalized posterior probability.

Generation of additional metagenomic data for contig extension and analysis To obtain an additional high-depth metagenomic dataset, a total of four additional floating 10

sludge samples islets were collected on 23rd February 2011, each representing a biological 215

replicate. Biomolecular extraction was performed on 200 mg of starting material as for the other two dates, using the protocol described by Roume et al.1. The resulting DNA fractions were sequenced under sample names I, II, III and IV using the sequencing protocol described above. Four technical replicates from sample I were generated, resulting in four libraries I-1, I-2, I-3 and I-4. While the remaining biological replicates II, III and IV were sequenced once

220

each, generating a total of 12.6 gigabases metagenomic sequence. The reads were then assembled using AMOS11 and MetaVelvet12.

Gene annotation Non-redundant contig files were split into two distinct files, one file with contigs of a length 225

below 500 bp and another file with contigs lengths above or equal to 500 bp. The contigs with lengths below 500 bp were annotated using FragGeneScan13 (Supplementary Figure 1, step 5), using settings for short sequence reads with sequencing error (-complete 0 –train illumine_5). The contigs with a length equal or above 500 bp were annotated with Prodigal gene finder14 (v2.60, Supplementary Figure 1, step 5) using the MOCAT gene prediction

230

processing steps. The resulting amino acids sequence files were then combined into a single file and made non-redundant using CD-HIT with a sequence identity threshold of 1.0 and a description string length within the cluster file of 5 000 (-c 1.0 –d 5000; Supplementary Figure 1, step 6). All sequences were mapped to the KEGG database version 64.0 using BLAT15 and sequences

235

were annotated with KOs (KEGG orthologous groups; Supplementary Figure 1, step 7).

11

Pre-processing of protein fraction for high-throughput analysis Following 1D-SDS-PAGE electrophoresis and staining with Imperial protein stain (Thermo Scientific, Erembodegem, Belgium; as described in Roume et al.1), the protein gel was conserved at 4 °C in the dark and under vacuum in sealing foil D0316L-20 (DOMO 240

ELEKTRO, Herentals, Belgium). Prior to further analysis, entire lanes were cut into 1 mmslices using a grid cutter (MEE-1x5, Gel Company, San Francisco, CA, USA), yielding approximately 70 slices per lane. Two 1 mm-slices were combined in a single well of a 96well V-bottom plate with a hole introduced into the bottom using a 30 gauge lancet needle (Becton Dickinson, Franklin Lakes, NJ, USA). The wells contained size 11 black hexagonal

245

glass beads (SB3656, Fusion Beads) to prevent the gel pieces from clogging the hole. For ingel digestion, an automated liquid handling system (Tecan EVO, Männedorf, Switzerland) was used for reduction, alkylation, tryptic digestion and peptide extraction from the gel pieces. After extraction, the peptide solution was dried and reconstituted in 20 µl of a solution of 0.1 % (v/v) [trifluoroacetic acid (TFA, 5 %; Sigma) / acetonitrile (ACN, 95 %;

250

BioSolve, Valkenwaard, Netherlands)] in MilliQ H2O in a round bottom polypropylene 96well plate (Greiner Bio-One, Monroe, NC, USA) and placed into an Eksigent Nano 2D plus system autosampler (ABSciex, Framingham, MA, USA) for analysis.

Liquid chromatography Peptides obtained from the 1 mm-gel bands were separated using an Eksigent Nano 2D LC 255

plus system employing splitless nanoflow. Reverse phase high performance liquid chromatography (RP-HPLC) and separation columns were prepared in-house by packing a Kasil fritted capillary [360 µm outer diameter (OD), 75 µm inner diameter (ID)] with a 1 cm bed of ReproSil Pur C18-AQ 3 µm 120 Å stationary phase (Dr. Maisch GmbH, Ammerbuch,

12

Germany) for the sample trap and desalting column. A Kasil fritted capillary (360 µm OD, 260

75 µm ID) was packed with a 15 cm bed of the same stationary phase as the separation column and this was connected to a PicoTip emmiter (360 µm OD x 20 µm ID, Tip 10 µm, FS360-20-10-N-20) for nano-electrospray ionisation. For each LC run, the sample was injected for 10 minutes at 2.5 µl/min with loading buffer (2 % v/v acetonitrile and 0.1 % v/v formic acid). The sample was separated by a linear gradient changing from 98 % solvent A

265

(0.1 % v/v formic acid in water) and 2 % solvent B (0.1 % v/v formic acid in acetonitrile) to 40 % A and 60 % B in 60 min at 0.3 µl/min.

Mass spectrometry Following LC separation, the peptides were analysed on a LTQ-Velos Orbitrap (ThermoFisher, San Jose, CA, USA). MS1 data were collected over the range of 300 – 2 000 m/z in 270

the Orbitrap at a resolution of 30 000. Fourier-transform mass spectrometry (FTMS) preview scan and predictive automatic gain control (pAGC) were enabled. The full scan FTMS target ion volume was 1 x 106 with a maximum fill time of 500 ms. MS2 data were collected in the LTQ-Velos with a target ion volume of 1 x 104 and a maximum fill time of 100 ms. The 10 most intense peaks were selected (within a window of 2.0 Da) for higher-energy collisional

275

dissociation at 15 000 resolution in the Orbitrap. Dynamic exclusion was enabled in order to exclude an observed precursor for 180 s after two observations. The dynamic exclusion list size was set at the maximum 500 and the exclusion width was set at ±5 ppm based on precursor mass. Monoisotopic precursor selection and charge state rejection were enabled to reject precursors with z = +1 or unassigned charge state.

280

13

Protein identification For MS analysis, Thermo .RAW files were converted to mzXML format using MSConvert (ProteoWizard16) and searched with X!Tandem17 version 2011.12.01.1. Spectra were searched against the metagenomic and metatranscriptomic data, common lab protein contaminants, and decoys. Redundancy was removed from these three data sets using 285

BlastClust. The contaminant database was a modified version of the common Repository of Adventitious Proteins (cRAP, www.thegpm.org/crap) with the Sigma Universal Standard Proteins removed and human angiotensin II and [Glu-1] fibrinopeptide B (MS test peptides) added, for a total of 66 entries. Decoys were generated with Mimic (www.kaell.org), which randomly shuffles peptide sequences between tryptic residues, but retains peptide sequence

290

homology in decoy entries. Search criteria used for X!Tandem included a precursor mass tolerance of 15 ppm and a fragment mass tolerance of 15 ppm for higher-energy collisional dissociation spectra. Peptides were assumed to be semi-tryptic (cleavage after K or R except when followed by P), but semi-tryptic peptides with up to 2 missed cleavages were allowed. The search parameters

295

included a static modification of +57.021464 Da at C for carbamidomethylation by iodoacetamide and potential modifications of +.15.994915 Da at M for oxidation, 17.026549 Da at N-terminal Q for deamidation, and -18.010565 Da at N-terminal E for loss of water from formation of pyro-Glu. Additionally, -17.026549 Da at the N-terminal carbamidomethylated C for deamidation from formation of S-carbamoylmethylcysteine and

300

N-terminal acetylation were searched. Peptide spectrum matches (PSMs) obtained from X!Tandem were validated using the Trans Proteomic Pipeline18 version 4.6 Rev.1. The PSMs were analysed with PeptideProphet19 to assign each PSM a probability of being correct. Accurate mass binning was employed to promote PSMs whose theoretical mass closely matched the observed mass of the precursor ion, and to correct for any systematic mass

14

305

errors. Decoys and the non-parametric model option were used to improve PSM scoring. Protein identifications were inferred with ProteinProphet. The ProteinProphet scores were then analysed in iProphet20, which combines results from multiple fractions and multiple database searches (although here, only X!Tandem was used) and assigns a probability for each unique protein and its corresponding peptide sequences. The false discovery rate for a

310

given iProphet probability was calculated using the number of decoy protein inferences at that probability. Only proteins identified at iProphet probabilities corresponding to a false discovery rate (FDR) less than 1.0 % were further considered.

KO annotations and protein quantification The corresponding sequences of the identified proteins were collected in FASTA format from 315

the non-redundant single peptides/protein sequences database previously generated from the combined metagenome and metatranscriptome assemblies. Relative protein quantitation was performed using the normalized spectral index (NSI) measure using an in-house software tool called NSICalc as previously described21. The amino acid sequences of identified proteins were then mapped to the KO library (KEGG database version 64.0) using BLAT15 (e-

320

value50, score>50). From the resulting list of KOs, the frequency of each KO was determined at the protein level, using an in-house developed Perl script.

Gene copy and transcript abundances To account for differences in read depth and sampling, the number of raw sequence reads from autumn and winter metagenome and metatranscriptome libraries were equilibrated by 325

randomly selecting reads in the larger libraries from the autumn sample to mirror the number of reads in the smaller library (winter) using an in-house Perl script based on the ‘shuffle’ 15

method from the ‘List::Util’ CPAN package. This resulted in 14 546 374 reads and 16 443 761 reads being used from the metagenomic and metatranscriptomic libraries, respectively. The four balanced raw read sequence libraries were then mapped separately to 330

the combined assembly for both sampling dates using SOAPaligner8 with the following parameter settings: -r 2 -M 4 -l 30 -v 10 -p 8. For each library, reads were mapped to genes and counted, except for reads mapping to multiple genes, for which weighted proportions were used. Next, to obtain the abundances of genes and transcripts, read counts were normalized by the length of the respective gene

335

sequences22, to obtain normalized gene copy abundances and normalized transcript abundances, respectively. Normalized gene copy abundances per KO were obtained by calculating the sum of normalized gene copy abundances from all genes belonging to the same KO group. Similarly, the KO-wise transcript abundances were calculated as the sum of normalized transcript abundances over all genes within the same KO.

340 Relative gene expression Relative expression of KOs was determined by dividing the normalized transcript abundance of each KO by the inferred gene copy abundance of the same KO23. The relative expression of a KO is greatly dependent on the normalized gene copy abundance, if this value is close to zero. Similarly, KOs with normalized gene copy 345

abundances close to zero are prone to be falsely identified as highly expressed due to their very low gene copy abundances. Therefore, highly expressed KOs were selected based on normalized gene copy and transcript abundances, in addition to relative expression. KOs were considered highly expressed, if their relative expression was above the 90th percentile. In addition, to avoid false positive identification of highly expressed KOs, the normalized 16

350

transcript abundances of highly expressed KOs had to be above the 3rd quartile of the normalized transcript abundances of all KOs or normalized gene copy abundances had to be above an empirically determined threshold. This threshold was determined by sorting KOs by their normalized gene copy abundances and applying a sliding window approach to determine the lowest normalized gene copy abundance with an average relative expression

355

robustly within the interquartile range (see Supplementary Figure 2).

Sensitivity of gene expression analysis to imposed cut-offs Results of the analysis of KOs exhibiting high relative expression are dependent on the quantile of KOs considered highly expressed (we considered KOs with a relative expression above the 90th percentile), as well as the lower cut-off which was set for gene copy 360

abundances to avoid false positive identification of KOs exhibiting very low gene copy abundances and only slightly higher transcript abundances as highly expressed. Therefore, we first analysed, if the numbers and identities of KOs identified based on their relative expression were robust to different levels of noise added to the data. In addition, we analysed whether the conclusions would also be robust to small to moderate changes in the selected

365

cut-off values. To address the first point (robustness to noise), we changed the gene copy and transcript abundances by a random number following a uniform distribution within different limits. The lowest of these limits corresponded to +/- 1 read mapped per kilobase of metagenomic sequence and the highest to +/- 50 reads mapped per kilobase. We then analysed the numbers

370

and identities of the highly expressed genes in 100 repetitions of each test, given the chosen cut-offs and the identities of enriched pathways within the selected sets of KOs. We compared the results to the same analysis carried out using a single numerical cut-off for

17

relative expression (minimal relative transcript abundance = 10 times relative gene copy abundance). 375

To address the second point (robustness to changes in cut-offs), we changed each cut-off within small to moderate limits. For the inclusion of KOs irrespective of their gene copy abundances, cut-offs at steps between the 55th to 95th percentile of transcript abundances were used. For the exclusion of KOs with low gene copy abundances, different cut-offs of robust relative expression were set between the 20th and 95th percentile. We then analysed the

380

numbers and identities of the KOs found to be highly expressed, as well as the pathways enriched in these KOs.

Comparison of the metagenomic dataset with the metatranscriptomic and metaproteomic datasets The congruency of metagenomic and metatranscriptomic datasets was determined by calculating the proportion of KOs with at least one gene having at least 10 metagenomic 385

reads per kilobase mapping to it and also at least one gene resulting in the mapping of 10 metatranscriptomic reads per kilobase. The congruency of the metagenomic and metaproteomic datasets was calculated analogously as the proportion of KOs with at least one gene mapped by at least 10 metagenomic reads per kilobase that also had at least one gene identified at the protein level.

390 Analysis of pathway membership Assignment of KOs to KEGG pathways was done by using the KO to pathway link (http://rest.kegg.jp/link/Ko/pathway) in the KEGG database version 67.1. Enrichment of KOs

18

in specific pathways was tested using a hypergeometric test and p-values were adjusted using FDR-control24. Test results with adjusted p-values below 0.05 were considered significant. 395 Community-wide metabolic network reconstructions The KO to R (reaction) link (http://rest.kegg.jp/link/Ko/reaction) was used to associate each individual KO to the corresponding reactions following the R to RP (reaction pair) link (http://rest.kegg.jp/link/reaction/RP), thereby, associating individual reactions to their corresponding main reaction pairs. The RPAIR annotation was specifically chosen to ignore 400

unspecific compounds of reactions (water, energy carriers and cofactors), thereby, only taking into account the main compounds of a reaction. The reaction pairs (RP) were then further

selected

by

using

(http://rest.kegg.jp/link/rn/rc).

only Finally,

RPAIRs the

with

reaction

assigned

reaction

pair

compounds

-

classes link

(http://rest.kegg.jp/list/RP) was used to associate individual RPs to corresponding pair(s) of 405

compounds. As some KOs have identical compounds (e.g. subunits of the same enzyme or enzymes that catalyse each other’s reverse reaction), KOs with identical compounds (as annotated in the KEGG database version 67.1) were grouped. A KO network graph was built by using all KOs as nodes. Edges between two KOs were introduced if a product metabolite of one KO was found as a substrate metabolite of the other KO. Multiple edges between the

410

same KOs were reduced into a single edge. For topological analysis of the reconstructed metabolic network, nodes sharing the exact same edges were regrouped to be represented as a single node. Multiple edges connecting the same nodes were likewise combined into a single edge. The undirected network graph was visualized and analysed using Cytoscape25, employing a spring embedded layout. Singleton nodes not connected to any other node were

415

removed.

19

Calculation of gene copy and transcript number and relative expression of regrouped nodes Gene copy numbers and transcript numbers of nodes which represented several KOs due to their sharing of the same edges were calculated by summing all normalized gene copy and 420

transcript numbers, respectively. Relative expression of a node was calculated by dividing the node-wise sum of normalized transcript abundances by the node-wise sum of normalized gene copy abundances, thereby levelling relative expression of the regrouped KOs.

Identification of choke points Choke points as defined by Rahman and Schomburg26 are enzymes which consume or 425

produce unique metabolites and possess a high load score in the metabolic network reconstruction. To assess whether a node within the metabolic network reconstructions could qualify as choke point, the number of edges representing every metabolite was counted, yielding the number of occurrence. In the cases where an edge represented more than one metabolite, the lowest number of occurrence was assigned to this edge. Every node was also

430

assigned the lowest number of occurrence of all its edges. Nodes with an assigned number of occurrence of 1 were considered potential choke points. The number of occurrence for every node is listed and potential choke points are highlighted in Supplementary Dataset 7.

Weighted load score Weighting of compounds in a bi-partite RPAIR-based metabolic network reconstruction has 435

been shown to increase pathfinding accuracy27. The number of occurrence described above was assigned as edge weight within the metabolic network reconstructions. A weighted betweenness centrality was calculated using the R-package igraph28. Alternative weighted load scores were calculated for each node from the weighted betweenness centralities and the 20

degree as defined in the manuscript, and potential key functionalities were determined using 440

this measure and expression as described in the main manuscript. The resulting weighted load scores and the identities of the key nodes were compared to the unweighted results discussed in the manuscript.

Matching of genes to Candidatus Microthrix parvicella Bio17 genome Amino acid sequences were aligned using BLAT15 with cut-offs chosen as follows: e-value < 445

10-5, %-identity > 50, score > 50.

Alignment of contigs encoding key functionalities Contigs were selected based on the following criteria: (1) they encoded a gene annotated with a KO representing a key functionality, and (2) the expression of this gene was corroborated by at least one mapped metatranscriptomic read. Selected contigs were aligned to the NCBI 450

non-redundant nucleotide database using BLASTn with default parameters29. The best hit with a query coverage above 50 % and a percentage identity above 80 % for each contig was documented. In addition, contigs containing genes annotated as K03921 were aligned to 85 isolate genomes from the same biological wastewater treatment (BWWT) plant using BLAST and selecting only results with percentage identity above 80 %.

455 Quantification of isolate sequences in combined metagenomic and metatranscriptomic assemblies Reads from the balanced libraries (see section Expression analysis and contextualization of omic datasets - gene copy and transcript abundances) were mapped against the genome of

21

Nitrosomonas sp. Is79 (Ref. 30) and Isolate LCSB065 using SOAPaligner8 with the following parameter settings: -r 2 -M 4 -l 30 -v 10 -p 8 and mapped reads were counted. 460 Ammonia monooxygenase (AMO) contig extension Genes encoding subunits A and B of ammonia monooxygenase (amoA and amoB) are established phylogenetic markers31. However, none of the contigs from the combined autumn and winter assembly that contained an open reading frame annotated as encoding for a subunit of AMO (K10944, K10945, or K10946) harboured a complete gene. In order to 465

recover a full gene sequence and determine the position of the amoA genes recovered from the combined metagenomic and metatranscriptomic data relative to other known amoA sequences within a phylogenetic tree, we employed a contig extension protocol in order to increase the length of the contigs via extension and merging of these contigs. The contig extension protocol was carried out in a step-wise fashion including contig

470

extension, gene calling and gene annotation. Contig alignment, merging and extension was performed using minimus2 (Ref. 32) from the AMOS suite33 with a minimum overlap of 60 bases at 98 % identity. The contig extension protocol was performed in three steps: i) contigs were extended by aligning contigs annotated with the same KO IDs; ii) contigs were extended by aligning contigs annotated with KOs K10944, K10945 or K10946; iii) contigs

475

were extended using a high-depth metagenomic assembly (see section Generation of additional metagenomic data for contig extension and analysis) from a different sample, using the AMO contigs from the previous step as a reference. This procedure was performed, because the genes encoding the subunits of AMO are known to exist in a cluster/operon34. Following the run on minimus2, gene calling was performed on the resultant contigs using

480

FragGeneScan13 and Prodigal14 using default parameters as previously used. Predicted

22

amino acid sequences were then merged and made non-redundant based on 100 % sequence identity using CD-HIT35 and were re-annotated with KO IDs using our annotation pipeline. Gene calling and re-annotation steps were performed to ensure that the extended contigs retained their original annotation reference. The extended contigs were then used for 485

downstream analysis in order to associate these AMO genes to a bacterial species.

AmoA phylogenetic analysis Nearly complete amino acid sequences (201 – 274 amino acids) of AmoA and/or MmoA from representative organisms belonging to the beta-Proteobacteria, gamma-Proteobacteria and archaea were retrieved from the Refseq protein database (see Supplementary Table 3) 490

and aligned with ClustalOmega (using default parameters). The alignment file was submitted to a phylogenetic analysis using the Phylogeny.fr customized workflow service36 including alignment curation with Gblocks37 (using default parameters), tree construction with PhyML38 (bootstrap of 100), and visualization by TreeDyn39.

Isolate LCSB065 isolation 495

Isolate LCSB065 was obtained from an OMMC biomass sample diluted by a factor of 104. The biomass was first cultivated on Petri dishes of wastewater-agar medium (1.5 % agar; w/v) in 800 ml filtered (0.2 µm, Sartorius, Göttingen, Germany) wastewater from the Schifflange BWWT plant. A single colony was then transferred to a Petri dish with R2A medium40 and cultivated at 20 °C under aerobic conditions. Isolates were grown on different

500

growth media recommended for the culture of bacteria from water and wastewater, particularly Microthrix parvicella, such as R2A40, wastewater agar medium, MSV + peptone and MSV A + B41 or Slijkhuis A and F42 under different growth conditions. 23

Nile red staining Lipid inclusions were visualized using a protocol modified from Fowler & Greenspan43. A 505

stock solution of Nile red (Sigma-Aldrich, Diegem, Belgium) in acetone (Sigma-Aldrich) was prepared at a concentration of 500 µg/ml and preserved at 4 °C protected from light. 50 µL of a working solution, containing 2.5 µl of the stock solution in 1 ml of 75 % (v/v) glycerol, were deposited onto a microscopy glass slide with heat fixed bacterial cells. After 5 min incubation, epifluoresence and bright field microscopic observations of the same fields

510

of view were carried out on an inverted microscope (Nikon Ti) equipped with a 60 × oil immersion Nikon Apo-Plan lambda objective (1.4 N.A). Intermediate magnification 1.5 × was used in order to better resolve images. Excitation light was from a Xenon arc lamp, and the beam was passed through an Optoscan monochromator (Cairn Research, Kent, UK) with 550/20 nm selected band pass. Emitted light was reflected through a 620/60 nm bandpass

515

filter with a 565 dichroic connected to a cooled CCD camera (QImaging, Exi Blue). All imaging data were collected and analysed using the OptoMorph (Cairn Research, Kent, UK) and ImageJ44.

Isolate genome sequencing and genome assembly Following DNA extraction from isolate cultures using the Power Soil DNA isolation kit (MO 520

BIO, Carlsbad, CA), a paired-end sequencing library with a theoretical insert size of 300 bp was prepared with the AMPure XP/Size Select Buffer Protocol as previously described by Kozarewa & Turner3, modified to allow for size-selection of fragments using the double solid phase reversible immobilisation procedure described earlier by Rodrigue et al.45 and sequenced on an Illumina HiSeq with a read length of 100 bp. The resulting 854 683

24

525

paired raw reads were de-duplicated with FastUniq46, and quality filtered to a minimum average QV = 30 and a minimum length of 60 bp with the trim-fastq.pl script from the PoPoolation suite5, leading to 551 103 read pairs and 145 500 single-end reads (high quality data yield of 73 %). Two separate preliminary assemblies were obtained with IDBA-UD47; v1.1.0, with parameters –mink 30 –maxk 90 –step 5 –similar 98 --pre_correction); and

530

SPAdes 2.5.0, using the hammer read error correction module which filtered for contigs shorter than 200 bp. The resulting assemblies were merged with phrap (minimum overlap of 50 bp). The merged assembly was manually inspected with Consed48 and contigs broken where merge conflicts were detected. The resulting contigs were uploaded to and analysed using RAST49. According to this

535

analysis, the isolate was a Rhodococcus sp. Similarity of contigs to bacterial genomes (NCBI database, accessed 6th March 2014) were assessed by BLAT15. The bitscore of each hit was recorded and only contigs with a hit to Rhodococcus spp. with a bitscore within 60 % of the best hit’s bitscore were selected for the final set. This led to the removal of 120 contigs most of which had low coverage of reads.

540

Filtered reads were mapped onto the assembled contigs using BWA50 with default parameters. Reads mapping with mapping quality scores at or above five were used to assess contig coverage (Supplementary Dataset 8). The final set of contigs was submitted to AmphoraNet server to determine the isolate species searching for 31 phylogenetic marker genes51, as well as to RAST49 (accession number 6666666.64457) and annotated.

545

The COG protein profile of Isolate LCSB065 was determined as previously described by Muller et al.21. Annotation of KOs was carried out on protein predictions from RAST as described for metagenomic proteins.

25

26

Supplementary Results and Discussion Sensitivity of gene expression analysis to imposed cut-offs 550

We found that the cut-offs imposed to avoid false-positives within the highly expressed genes and genes encoding key functionalities led to a greater robustness against noise than would be observed if simple numerical cut-offs had been chosen. This was obvious from the simulation of noise, in which the number and variance of highly expressed genes grew with the noise, when a numerical cut-off was chosen, whereas the choice of cut-offs, as defined

555

herein, resulted in mostly stable numbers (Supplementary Figure 3a&b). The identities and numbers of KOs identified as exhibiting high relative expression in our datasets were not overly sensitive to the cut-offs imposed to avoid false-positive results. The numbers of highly expressed KOs decreased by less than 20 % by the exclusion of KOs with low gene copy abundances (Supplementary Figure 3c). As most genes with very high

560

transcript abundances did not have low gene copy numbers (Figure 3), the cut-off selected for inclusion of KOs with high transcript abundances irrespective of their gene copy numbers changed the total number of highly expressed KOs by less than 5 %. If the transcript abundance cut-off for genes with low gene copy abundances was set between the 55th and 95th percentile (our default value was the 75th percentile), an enrichment with the KOs above

565

the 90th percentile of the highly expressed KOs of 4 or 5 out of the 5 pathways in autumn, and 5 out of the 6 pathways in winter was consistently detected (Supplementary Figure 3d). Only few additional pathways were enriched after variation of this cut-off (Supplementary Figure 3e). In particular, the findings of pathways ko00910 “Nitrogen metabolism” and ko00190 “Oxidative phosphorylation” as exhibiting overall high levels of gene expression

570

were resilient to changes in cut-offs.

27

In conclusion, the described gene expression analysis is robust to noise, as well as too small to moderate changes in the chosen cut-offs.

Effect of regrouping redundant KOs into single nodes To carry out a topological analysis of the reconstructed metabolic network, nodes and edges 575

were rendered non-redundant, by representing multiple KOs with identical substrate and product metabolites as a single node. Due to this step, 229 and 220 nodes representing more than one KO were part of the autumn and winter metabolic network reconstructions, respectively. The calculated load scores were overall only mildly affected by the regrouping (Spearman correlation of load scores in the redundant and non-redundant autumn or winter

580

network reconstructions: 0.98). 70 % of key functionalities identified in the non-redundant network were shared between both autumn network reconstructions (40 % for the winter network reconstructions), as reported in Supplementary Dataset 7. The rationale behind regrouping redundant KOs into single nodes was based on the fact that most KOs represent subunits of enzyme complexes, which do not work in parallel, but rather cooperatively in

585

metabolizing substrates. Consequently, some of the additional nodes found in the networks with regrouped redundant KOs represent multi-subunit complexes, such as AMO, and we therefore believe that the practice of regrouping redundant KOs into single nodes is warranted.

Genomic analysis of Isolate LCSB065 590

Mapping of filtered reads onto Isolate LCSB065’s contigs revealed a mean/standard deviation empirical sequencing insert size of 244±43 bp, and a mean read depth per mapped position (coverage) of 27±11x (median 25x).

28

As a first approach to analyse Isolate LCSB065’s genetic potential, protein coding genes were annotated with KOs as before for the metagenomic and metatranscriptomic sequences. 595

Of utmost interest, out of 3 373 protein coding genes that could be annotated with KOs, 420 genes (12.5 %) were annotated as KOs belonging to “Lipid metabolism”, according to KEGG Orthology. This relates to rank 4 behind “Amino acid metabolism” (19.1 %), “Carbohydrate metabolism” (16.2 %) and “Xenobiotics biodegradation and metabolism” (14.0 %). Similar results were obtained from COG categories52 and SEED subsystems categories of the

600

predicted proteins53. The assembled genome also encodes genes for the synthesis and polymerisation of poly-hydroxybutyrate (PHB, Supplementary Dataset 8 indicated in Figure 5b) and for the synthesis of TAGs (Supplementary Dataset 8 indicated in Figure 5b), inclusions of which are visible following Nile Red staining (see Supplementary Figure 8). Nine other genes beside the gene matching to the three metagenomic contigs encoding

605

acyl-[acyl-carrier protein] desaturases are annotated as desaturases in the isolate genome. The genomic region of the gene matching to the three metagenomic contigs encoding acyl[acyl-carrier protein] desaturases was assessed. The gene is the first gene of a syntenous block of four genes present in Rhodococcus jostii RHA1, Nocardia farcinica IF, Godronia bronchialis and Tsukamurella paurometabola DSM 20162. This block encodes the

610

homologous fatty acid desaturase (peg.6927), followed by a tRNA dihydrouridine synthase B, a cell envelope-associated transcriptional attenuator LytR-CpsA-Psr of the subfamily A1 and a phosphate transport system regulatory protein PhoU. The genome of Isolate LCSB065 furthermore contains six genes coding for lipases with an export signal peptide, thereby reinforcing its potential keystone role in the community (Supplementary

615

Dataset 8). The genomic enrichment of genes involved in lipid metabolism suggests that the isolated Rhodococcus sp. may occupy a function as keystone species within the sampled OMMCs. Additional work is required to elucidate the exact role of this organismal group

29

within the OMMC community.

30

620

List of Supplementary Figures Supplementary Figure 1

Overview of the assembly and annotation pipeline.

Supplementary Figure 2

Determination

of

lower

thresholds

for

gene

abundances for the selection of highly expressed KOs within the reconstructed community-wide metabolic networks.

Supplementary Figure 3

Results of the sensitivity analyses.

Supplementary Figure 4

Expression of KOs in metabolic pathways at the protein level.

Supplementary Figure 5

Generalized

OMMC-wide

metabolic

network

reconstructed from the combined metagenomic and metatranscriptomic data of OMMCs sampled in autumn and winter. Supplementary Figure 6

Representation of the fatty acid metabolic pathway (ko01212).

Supplementary Figure 7

OMMC-wide metabolic networks reconstructed from metagenomic and metatranscriptomic data of OMMCs sampled in autumn and winter.

Supplementary Figure 8

Micrographs of Isolate LCSB065.

31

List of Supplementary Tables Supplementary Table 1

Physicochemical characteristics of the wastewater at the time of sampling in the anoxic tank of the Schifflange BWWT plant.

Supplementary Table 2

Quantitative and qualitative characteristics of biomacromolecular fractions sequentially isolated from the OMMC samples.

Supplementary Table 3

Statistics of the combined assembly of the metagenomic and metatranscriptomic sequence datasets.

Supplementary Table 4

Ammonia-oxidizing organisms with corresponding accession numbers of amoA genes used for the reconstruction of the phylogenetic tree.

Supplementary Table 5

Numbers of metagenomic and metatranscriptomic reads from autumn and winter dates mapping to the isolate genomes of Nitrosomonas sp. Is79 and Isolate LCSB065.

32

List of Supplementary Datasets Supplementary Dataset 1

Dominant genera in the autumn and winter samples determined based on reconstruction of 16S rRNA gene sequences from the metagenomic data.

Supplementary Dataset 2

KO gene copy (KOGA), transcript (KOTA) and protein (NSI) abundances in datasets from autumn (04 October 2010) and winter (25 January 2011) samples.

Supplementary Dataset 3

Highly expressed metabolic KOs in the (a) autumn and (b) winter sample and pathways overrepresented by metabolic KOs highly expressed in the (c) autumn or (d) winter sample.

Supplementary Dataset 4

Generalized reconstructed using

microbial metabolic

the

combined

community-level

network

reconstructed

metagenomic

and

metatranscriptomic datasets in simple interaction format. Supplementary Dataset 5

Autumn-specific

reconstructed

microbial

community-level metabolic network in simple interaction format. Supplementary Dataset 6

Winter-specific reconstructed microbial communitylevel metabolic network in simple interaction

33

format. Supplementary Dataset 7

Results of topological analyses. List of all nodes in the metabolic networks reconstructed from OMMCs sampled in (a) autumn and (b) winter with topological measures and expression values. (c) KOs with betweenness centrality values different between metabolic networks reconstructed from OMMCs sampled in autumn and winter, and (d) pathways enriched in these KOs. KOs with key functionality in OMMCs sampled in (e) autumn or (f) winter including topological analysis results, gene abundances and expression, as well as protein abundances and pathway membership. (g) Results of BLAST searches of all genes encoding key functionalities against publicly available bacterial genomes.

Supplementary Dataset 8

Summary of characteristics of Isolate LCSB065’s genome, results of the phylogenetic analysis, and coverage of Isolate LCSB065 contigs by reads from the isolate genome sequencing.

34

Supplementary Figures

metagenomic reads autumn sample winter sample

metatranscriptomic reads autumn sample winter sample

PANDAseq + trim-fastq.pl: joverlaps joined and trimmed, QC

1

2

CD-HIT: processed reads made non-redundant

3

MOCAT assembly pipeline: combined assembly of metagenomic and metatranscriptomic reads

contigs > 150 nt

4

SOAP: reads joined by PANDAseq mapped to MOCAT contigs to recruit unassembled reads

Prodigal: genes called on contigs ≥ 500 nt

5

FragGeneScan: genes called on contigs < 500 nt

6 CD-HIT: predicted amino acid sequences made non-redundant

7

BLAST-X and KEGG: enzyme annotation (e-value < 10-5, %identity > 50, score > 50) K00001 legend:

nucleotide sequence

K00002

K00003

K00004

amino acid sequence

K00005

…!

KXXXXX KO identifier!

Supplementary Figure 1 Overview of the assembly and annotation pipeline. Annotated KOs were used for the subsequent metabolic network reconstructions. QC: quality control, 1-7: steps in the pipeline.

35

b 100.0 50.0

● ●

● ●

20.0 10.0 5.0

● ● ● ● ●

● ● ● ● ● ● ● ● ● ●

2.0 1.0 0.5 0.2

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ●●● ●● ● ●● ● ● ● ●●● ●●● ● ● ●● ●● ● ● ● ●● ●● ● ●● ●● ● ● ●● ●●●● ● ● ● ● ●● ● ● ● ● ● ●●● ● ●● ● ● ● ●●●● ● ● ●● ● ● ●● ● ● ●● ● ●● ● ● ● ●● ● ● ●● ●● ●● ● ●● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ●●● ● ● ●●● ●●●● ●● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ●● ●● ●● ● ● ● ● ● ● ● ● ●●●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ●● ●● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ●● ● ●● ● ● ● ●● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●

1e−03

1e−01

moving median winter relExpr

moving median autumn relExpr

a

100.0 50.0



● ●



10.0 5.0

● ● ● ●● ●●





1.0 0.5



●● ● ● ●● ● ●● ●● ● ●● ●● ● ●● ● ● ● ● ●● ● ● ● ● ● ●● ●● ● ● ●● ●● ● ●● ● ● ● ● ● ●● ●● ● ● ● ● ●● ● ● ● ●● ● ● ●● ● ●●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ●●● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ●● ● ● ● ● ●● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ●● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ●●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

0.1 1e−03

1e+01

1e−01

autumn KOGA

c

d ● ●

● ● ● ● ● ● ● ● ●

1e+01







● ● ● ●

● ● ●



● ●



1e+00

●● ● ●

● ●● ● ●



● ●

1e−01 1e−02

● ● ● ●











● ●



● ●



● ● ●● ● ●

● ●

●● ●

●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ●● ● ● ● ●●● ●● ●● ● ● ● ● ● ●●●● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ●●● ● ● ●● ● ● ●● ● ● ●● ● ● ● ● ● ●● ●●● ● ● ● ● ● ● ● ●●● ● ● ●● ● ●● ● ● ● ● ●● ● ●● ● ● ● ● ●● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ●● ● ● ● ● ● ● ● ●● ● ●●● ●● ● ● ● ●● ● ● ● ●● ● ● ● ● ●● ●●● ● ● ● ● ● ●●●● ● ● ● ●●● ● ●● ● ● ● ●● ●● ● ● ● ●● ●●●● ● ● ● ●● ● ●● ● ●● ●● ● ●●● ●● ● ● ● ●● ●● ●●● ●● ● ●● ● ●● ●● ● ●● ●● ● ● ●●●●● ● ●● ● ● ●● ●● ● ● ●● ● ● ●● ●● ● ●● ● ● ●●●●●●● ●●● ● ● ● ● ● ●● ● ●● ● ●●●● ●● ●●● ● ● ● ● ● ● ●● ● ●● ●●●●● ●● ● ● ●● ● ●●●●●● ●● ● ● ●● ● ● ● ●● ● ●● ●● ●● ● ● ●● ●●● ●● ●●● ●● ● ● ●●● ●●●●● ● ●● ● ● ● ● ●●●●● ●● ● ●● ● ● ●● ● ●● ●● ●● ● ● ● ● ●●●● ● ● ● ● ●●● ●● ●● ● ● ● ●● ●● ●● ●● ● ●● ● ●● ●●●●● ● ● ● ● ● ● ● ●● ● ●● ● ●●●● ● ● ● ●● ● ●● ● ● ● ● ●● ● ●● ● ●●●● ● ●● ●●● ● ● ●● ●● ● ● ● ● ●● ● ●●● ● ● ●● ● ● ● ● ●● ●● ●● ●● ● ● ● ●● ●● ●●●● ● ● ●● ● ● ●●● ● ● ● ●●● ● ● ● ● ● ●●●● ● ● ●●● ●●●● ●● ●●●● ●● ●●● ● ● ● ●● ●●● ●●●●●● ● ● ●● ● ● ●● ● ●●● ● ●● ● ● ● ● ●● ● ● ●● ● ●● ● ●● ● ● ●●●●● ● ● ●● ● ●●● ● ● ● ●●● ● ● ● ●● ● ●●● ● ● ● ● ●●● ●●● ● ● ● ● ● ● ● ● ● ●● ●● ●● ● ●● ● ● ● ●● ● ●● ● ●● ●● ● ●● ●● ● ●●● ●● ● ● ● ●●● ●● ● ● ● ● ●● ● ● ● ● ●● ●●●●● ●● ● ● ● ●● ● ● ● ●●●● ● ● ●● ●● ● ● ● ● ●● ● ●● ● ● ● ●● ● ● ● ●●●●● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ●● ●● ●● ●● ● ● ●● ●● ● ●● ● ●● ● ●●● ●● ● ●●●● ● ●●●● ●●● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ●● ● ●● ● ● ● ● ● ● ●●● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ●●● ●● ● ● ●● ●● ●● ● ●●●● ● ● ● ●● ● ●● ●● ● ● ●● ● ● ● ● ● ● ● ●● ●●● ● ● ●●●● ● ● ●●● ● ● ● ● ● ●● ● ●● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ●●● ● ● ● ● ● ●● ● ● ● ●●● ● ● ● ●●● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●●● ● ●● ●● ● ●●●●●● ●● ● ●● ● ● ● ● ● ●● ●● ● ● ● ●● ● ●● ● ●● ●● ● ●●●● ●●● ● ●● ● ●● ● ● ● ● ● ●●● ●● ● ● ● ●● ●● ● ●● ● ● ● ● ●● ● ●●● ● ● ● ●● ● ● ●● ●● ● ● ●●● ●●● ●● ● ● ● ● ●● ● ● ● ● ●● ● ● ●● ● ●●● ● ●●● ● ●● ●● ●● ● ● ●●●●● ● ● ● ● ● ●●● ● ● ●● ●● ● ● ●● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●●● ● ● ●● ● ● ●●● ● ●● ● ● ● ●●●● ●●● ● ●● ● ●● ● ●● ● ● ●● ● ●● ●● ●● ●● ● ●●● ●● ● ●●● ● ● ● ●●● ● ● ●●●●● ● ●● ● ● ●● ● ● ●● ● ● ● ●● ● ●● ●● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ●●● ●● ● ● ● ● ●● ● ●● ● ● ● ●● ●● ● ● ●●● ●● ●● ● ●●●● ● ● ● ● ●● ● ●●● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1e−03





1e−01

1e+01

winter relExpr

1e+02



1e+02



autumn relExpr

1e+01

winter KOGA



1e+01

● ● ●





1e+00





1e−01 1e−02



● ● ● ●

● ● ●





● ●

● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●●● ● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ●● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ●●● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ●● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ●● ● ● ●● ●● ● ● ● ● ● ●● ● ●●●● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ●● ●● ● ● ● ●●●● ● ● ● ● ●● ● ● ● ●● ● ● ●● ● ●● ●● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●●● ● ●●● ● ● ●● ●●● ●●●● ● ● ● ●● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ●● ● ●● ● ●● ● ● ●●● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ●● ●● ● ●● ● ●● ●●● ● ● ● ● ●● ● ● ● ● ● ●● ● ●●● ● ● ● ●●●●●● ● ● ● ● ●● ● ●● ● ●● ●● ●● ●● ●● ● ● ● ●● ● ●●● ●● ● ●●● ●● ● ● ●● ● ● ●● ● ●●●●● ● ●● ● ●● ●● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ●● ● ●●●● ●● ● ● ● ● ●●●● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ●● ●● ●● ●● ● ●● ●● ●● ● ● ● ●●● ● ● ● ● ● ● ● ●● ● ●● ● ●● ●●● ● ● ●● ● ●● ●● ● ●● ● ●● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●●● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ●●●● ● ● ● ● ● ●● ●● ● ● ● ● ●●●●●● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●●● ● ●● ●●● ● ●● ●●●●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ●● ● ●●● ● ● ● ● ● ● ● ● ●● ● ●● ●● ●●● ● ● ● ● ●● ● ● ●●●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●●● ●● ● ●● ● ● ●●●●●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●●●●● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ●● ● ● ● ● ●●● ●● ●●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ●●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●●● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●●● ● ●● ● ●● ● ●● ● ●● ● ●● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ●● ● ●● ●● ● ● ● ●● ●●● ●●●● ● ● ● ● ●● ● ● ●●●●● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●●● ● ● ● ● ●● ● ●●●●● ●● ● ● ●●●●●● ● ● ●● ● ● ● ● ●● ● ● ●● ● ●● ● ● ● ●● ● ● ●● ● ● ●● ●● ●●● ● ●● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ●●● ● ●● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ●● ● ● ●● ● ●● ●● ● ● ●● ● ● ● ● ●●● ●●● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●●● ●●● ●● ● ● ● ●● ● ● ● ● ●● ● ●● ●●●●●● ● ● ● ●● ●●● ● ● ●● ●●● ●● ●●● ●● ●●● ●● ● ●●● ● ●● ●●● ● ●●● ● ●● ●● ● ● ● ●● ●●●● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●●● ● ● ● ●●● ●● ● ●● ● ●●● ● ●● ●●●●●●● ●● ● ● ● ● ●●● ●● ● ● ● ● ● ●●●● ● ● ● ●● ● ● ● ●● ● ● ●● ●● ● ●●● ● ● ● ●●● ● ● ● ●● ●● ● ●● ● ●● ● ● ● ● ● ●● ●● ● ● ● ● ●●● ● ● ● ●● ●●● ●● ● ●● ●● ● ● ● ● ●● ● ● ●● ● ● ●● ●●●● ●● ● ● ●●●● ● ● ● ●● ● ● ● ● ●● ●●●● ●● ● ● ● ● ● ●● ● ● ● ● ●● ●●● ●● ●●● ●● ● ● ●● ● ●● ●●● ●● ● ● ●●● ●● ●●● ● ● ●● ● ● ● ● ● ● ●● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●





● ●

● ●

1e−03

autumn KOGA

1e−01

1e+01

winter KOGA

Supplementary Figure 2 Determination of lower thresholds for gene abundances for the selection of highly expressed KOs within the reconstructed community-wide metabolic networks. (a) and (b) Moving median of KO relative expression (relExp) versus gene copy abundance (KOGA) for (a) autumn and (b) winter. (c) and (d) KO relative expression (relExp) versus gene copy abundance (KOGA) for (a) autumn and (b) winter. Vertical lines indicate lowest gene abundances with robust gene expression values within the interquartile range.

36

a

exclusion of genes with low copy abundance

simple cut−off

b

exclusion of genes with low copy abundance

simple cut−off

400

400

400

300

300

300

400 ●



● ●





● ●

100

● ●

200

● ●



● ● ● ● ● ● ● ● ● ● ●

100



200



● ●

number of genes

200

number of genes





number of genes

number of genes



● ● ●

● ● ● ● ● ●

300

200

● ●





● ● ●

100

100

0

0

● ●

1 2 3 4 5 7 10 15 20 50

1 2 3 4 5 7 10 15 20 50

1 2 3 4 5 7 10 15 20 50

noise [reads per kb]

noise [reads per kb]

noise [reads per kb]

noise [reads per kb]

autumn

winter

200 150 100 50

autumn

winter

6 4 2

autumn

winter

3 2 1

0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1

0

0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1

0

4

0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1

0

e

8

additional enriched pathways

d

250

conserved enriched pathways

highly expressed genes

c

0 1 2 3 4 5 7 10 15 20 50

0

cut−off for exclusion of genes with low gene copy abundance

cut−off for exclusion of genes with low gene copy abundance

cut−off for exclusion of genes with low gene copy abundance

Supplementary Figure 3 Results of the sensitivity analyses. (a) and (b) Effect of noise on the number and identity of genes with high relative expression determined by comparing a simple numerical cut-off and our method for the reduction of false-positives by excluding genes with very low gene copy abundances. Filled boxes indicate total number of genes with high relative expression, boxes with blue or brown lines indicate the sizes of the intersect of genes with high relative expression without noise and after addition of noise; (a) autumn dataset, (b) winter dataset. (c) Effect of changing the cut-offs for exclusion of genes with low gene copy abundances to reduce false-positives on numbers of genes with a high relative expression. (d) and (e) Effect of changing the cut-offs for exclusion of genes with low gene copy abundances to reduce false-positives on the identities of pathways enriched with highly expressed genes. (d) Number of pathways enriched with highly expressed genes that are found at the chosen cut-off (0.75) and after variation of the cut-off. (e) Number of additional

37

pathways enriched in highly expressed genes, which are not found at the chosen cut-off but after relaxation of the cut-off. 625

38

b 1e−01

winter proteomics NSI

autumn proteomics NSI

a ●

1e−03

1e−05

1e−07

● ● ● ● ●● ● ●● ● ● ●●● ● ● ● ●● ● ●● ● ● ● ●● ●●●● ● ● ●●●● ●● ● ● ● ● ● ● ● ●● ● ●● ●● ● ● ●● ● ●● ● ● ●●●● ●● ● ● ● ● ● ●● ● ●● ● ●● ●● ● ●● ● ●● ●● ● ●● ● ●● ● ● ● ● ● ●● ●● ●●● ●● ● ●● ●● ● ●● ●● ● ●● ● ●●● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ●●● ●● ●●●●● ● ● ●●● ●●●●●● ● ●● ● ● ● ●● ●● ● ● ● ●● ● ●●●● ● ●● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ●●●●● ●● ● ● ● ●●● ● ●● ● ●● ● ● ● ● ●● ● ● ● ● ● ●● ● ●● ● ●● ● ●●●●●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ●●● ● ● ● ● ●● ● ●● ● ● ●● ● ●● ● ●● ● ● ● ● ●●● ● ●● ● ●● ●● ● ● ● ●●● ●●● ●● ● ● ● ● ●● ● ● ● ●●● ●● ●● ● ● ● ●● ● ● ●● ● ● ● ● ● ●● ● ●● ● ●● ●● ●● ●● ● ● ●● ●● ● ●● ● ● ●●● ●● ● ●● ● ● ● ●● ●● ● ● ●● ● ●● ● ●● ● ●● ● ●● ● ● ●● ● ●● ● ●● ●● ● ● ● ● ●●● ● ●● ● ● ●●● ● ● ● ●● ●● ●● ● ● ●●● ● ● ●● ● ● ● ●● ● ●● ● ●● ● ●● ● ●● ● ● ●● ●● ● ● ●●● ● ● ●● ● ● ● ● ● ● ●●●● ● ●● ● ●● ●● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ●● ●● ● ●● ● ●●● ● ● ● ●● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ●



● ● ● ● ●



1e−03



●●●

●●

1e−01

1e−01 ● ● ● ●● ●● ● ●● ● ● ● ● ● ●● ● ● ●● ● ●● ● ● ●● ● ●● ● ●● ● ● ● ●● ● ● ● ● ● ● ●● ●●● ●● ● ● ● ● ● ●●● ●● ●● ●● ● ●●●●●●●● ● ● ● ● ●● ●●● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ●● ● ●● ●● ●●● ● ● ●● ● ●● ●●● ● ● ● ●● ● ● ●● ●● ● ● ● ●●●● ● ● ●●●● ● ● ●● ●● ● ● ● ●●●●● ● ● ●● ● ●● ● ● ● ●● ●● ●● ● ● ● ● ● ● ● ● ● ● ●●●●●●● ● ● ● ● ● ● ●● ● ●● ●●●● ●● ● ● ● ● ● ● ● ●●● ● ● ● ●●● ● ● ●●● ●●● ● ● ●● ● ●●●● ● ●● ●●● ● ● ● ● ● ● ● ● ● ●● ●●● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●●●●● ●●● ● ●● ● ● ● ● ● ●● ● ● ●● ● ●● ●● ● ● ● ● ●● ●●● ●● ● ● ● ● ● ●● ●● ● ●● ● ●● ● ●● ● ●●● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ●● ●● ●● ● ● ● ● ●● ● ●● ● ● ● ●●●●● ● ● ● ● ●● ● ●● ● ● ● ●●● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ●● ● ●●● ● ● ● ●● ● ● ●● ● ●●● ●● ● ● ● ●● ● ● ●● ●●● ● ● ● ●● ● ● ●●● ●● ● ● ● ● ●● ● ●● ● ● ●●● ●● ●● ●● ●●● ● ● ●●● ●●● ● ●● ● ● ● ● ● ● ● ●●● ● ●● ● ● ● ●●● ● ●● ● ● ●●● ●● ● ● ● ● ● ● ●

1e−03

1e−05

● ●

1e−07



1e+01

1e+03

1e−03

1e−01

autumn KOGA

1e−01

winter proteomics NSI

autumn proteomics NSI

d ●

1e−05

1e−07

● ● ●● ●● ● ●● ●● ● ● ●● ● ●● ● ● ● ●● ● ● ●● ●● ●● ● ● ● ● ●● ● ● ● ● ●● ● ● ●● ● ●● ● ● ● ●●● ●● ● ●●●● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ●● ●● ●● ●●● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ●●● ● ●●● ● ● ● ● ● ● ● ●● ●● ● ●● ● ● ● ● ● ●●●● ● ● ● ● ● ● ● ● ●● ●● ●● ● ●● ● ● ●● ●●●● ● ● ●● ● ● ●● ● ● ● ●●●● ● ● ● ● ● ●●●● ● ●● ●● ●● ● ●●●● ● ● ●● ● ● ●● ● ● ●● ●● ● ● ●● ● ● ● ●● ●● ● ● ● ● ● ● ●● ● ● ●● ●● ● ● ●● ● ● ● ● ●● ●● ● ● ● ●● ● ● ● ● ●●● ● ● ● ●●● ●●●● ●●●● ●● ●● ●● ● ● ● ●● ●● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ●● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ●● ●●●● ● ● ●●● ● ●● ● ● ● ●● ● ● ●● ●● ● ●● ● ● ● ●● ●● ●● ● ● ●●●● ●● ● ●● ●● ●●● ●●●●● ● ● ● ● ● ● ● ● ●● ●● ● ●●● ●● ●● ●●●● ●●●● ● ● ● ●● ● ●● ●●● ● ●● ●● ● ●● ● ●● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ●● ● ●● ●● ● ● ● ● ●● ●● ●● ● ● ● ● ● ● ●● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ●●● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ●●● ●● ● ●● ● ●● ● ●● ● ● ●● ●● ● ● ● ●● ● ● ● ● ● ● ●●● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ●● ● ● ●● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●





1e−03



● ●●

● ●



1e+01

1e+03

● ●● ●●● ● ● ● ● ● ●● ● ● ● ● ●● ●●● ● ● ● ●● ● ● ●● ● ● ● ● ● ●● ● ● ● ●● ● ●●● ● ● ● ●● ● ● ● ● ● ● ● ● ●●●● ● ●● ●●● ● ●● ●●●● ● ● ●●● ● ● ● ● ● ● ● ●● ●● ● ● ● ●● ● ●● ● ● ●● ● ●● ● ●● ● ● ●● ● ●● ● ●● ● ● ●● ● ● ●● ● ● ● ●● ● ● ● ●●● ● ●●● ● ● ●● ● ●● ● ●●●● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ●●●●● ●● ● ●● ● ●● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ●●● ● ● ●● ● ● ●● ● ● ● ●● ●●● ●● ● ● ●● ● ●● ● ● ● ● ● ●●●●● ● ●●●● ● ● ●● ● ● ●●●● ●● ● ●●● ● ●●●● ●● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ●● ● ● ● ●● ●●● ● ● ● ● ● ●● ●●● ●●●●● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ●●● ● ●● ● ●● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ●● ● ●● ● ●●● ●● ● ●● ● ●●● ● ● ● ● ● ●● ● ● ● ● ●● ●● ● ● ● ● ● ●● ● ● ● ●● ●● ● ●● ●● ● ● ●● ●● ●● ● ● ● ● ● ● ●● ● ● ●●● ● ● ● ● ● ●● ● ● ● ●●●● ● ● ● ● ●● ● ●●●● ● ● ● ● ● ● ● ● ● ●●● ● ●● ● ● ● ● ●● ● ● ● ● ● ●●●●● ●● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ●● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●

1e−03 ● ● ● ●

1e−05

1e−03

autumn KOTA

● ●







1e−01

1e+01

f



1e−01



●●

● ●

● ● ● ●● ●





● ● ●

1e−03

1e−05

1e−07 1e−03





● ●



●● ●● ● ● ● ●

● ● ● ●● ● ● ●● ●● ●●● ● ● ● ●●● ●● ● ●● ●● ● ● ● ● ●● ●● ● ●●● ●●● ● ● ●●● ●● ● ● ● ●●● ● ●● ●●● ● ● ●● ● ● ● ● ● ● ●● ●● ● ●●● ●● ●● ●● ●● ● ●● ●● ● ● ● ● ● ●●● ●● ●● ● ●●●● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ●●●● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ●●●● ● ● ●● ● ● ●● ● ● ● ● ● ●●● ● ● ● ● ●●● ● ● ● ● ● ●● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ●●●● ● ● ● ● ● ● ● ●● ● ● ●● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●●● ● ●● ● ● ● ● ● ● ● ●● ●●● ● ● ●●●● ●● ●●● ● ● ● ●● ● ● ●●● ● ●● ● ● ●● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ●● ●● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●●● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●●● ● ● ● ● ●● ● ●●● ● ●● ● ● ●● ●●● ● ●●● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ●●● ●● ● ● ●●● ●●● ● ●●● ● ● ● ●● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ●● ● ●●● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ●

1e−01



1e+01

winter rel. protein expr.

autumn rel. protein expr.

● ● ● ●

1e+03

winter KOTA

e 1e−01

1e+03

1e−01

1e−07



1e−01

1e+01

winter KOGA

c

1e−03



●● ●

● ●

1e−03

1e−05

1e−03

autumn relExp

● ●

● ● ●

●● ●

●● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ●● ●● ● ●● ●●● ● ● ●●● ● ● ● ● ● ● ●●● ● ● ● ● ● ●● ● ● ●● ● ● ●●● ●● ● ● ●● ● ● ● ● ● ●● ● ● ● ●●●● ●● ● ● ● ●● ● ●● ● ●●● ● ●●●● ● ● ● ● ●● ●● ● ● ● ● ●● ●●● ● ● ●● ●●● ● ●●● ● ● ● ● ●● ● ● ● ●● ●● ● ● ●● ● ●● ● ●● ●●●● ● ●● ●●● ●● ● ● ● ●● ●● ●● ● ●● ● ●●●● ●● ●● ● ●● ●● ●● ●● ● ●● ● ● ●●●● ● ● ● ●● ●● ●●● ● ● ● ●● ● ● ● ●● ●● ●● ● ● ●● ● ● ● ● ● ●● ●●●● ● ● ● ● ●● ● ● ● ●● ●● ●● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ●● ●● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ●●● ● ● ●● ●● ● ● ●● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ●●● ● ● ●● ● ●● ● ●● ● ● ● ● ● ●● ● ●●● ● ● ●● ●● ● ●● ●●● ● ● ● ●● ● ●● ● ●●● ● ● ● ●●● ● ● ●● ●●●●● ● ● ● ●● ●●● ● ● ● ●● ● ●●● ●● ● ● ● ●● ●● ● ● ●● ● ● ● ●● ●● ●● ● ●● ● ● ●●● ● ●● ● ● ●● ●● ● ●●● ●● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ●● ●● ●●

● ●





1e−07 1e+03

● ● ● ●



● ●●









1e−01

1e+01

1e+03

winter relExp

Supplementary Figure 4 Expression of KOs in metabolic pathways at the protein level. (a) and (b) Comparison of gene copy abundances (KOGA) and protein abundances (NSI) in the (a) autumn and (b) winter samples. (c) and (d) Comparison of gene transcript abundances (KOTA) and protein abundances (NSI) in the (c) autumn and (d) winter samples. (e) and (f) Comparison of expression values relative to gene copy numbers (KOGA), transcript expression levels (KOTA) and protein abundances (rel. protein exp.) in the (e) autumn and (f) winter samples. (c to f) Highly expressed KOs are highlighted in red. 39

Node 8

Node 7

"K15857" Node 1 "K06013"

"K00523-K12452" "K03831-K15376" "K16020"

"K10187"

"K12503"

"K02293"

"K14066" "K15887" "K05954" "K00787-K00795" "K02291" "K05356" "K05355" "K01597" "K15888""K00805" "K11778" "K13787-K13789"

"K02294" "K09835" "K14605"

"K15745" "K10027"

"K00514"

"K09879" "K06443" "K02292"

"K00801" "K04713"

"K00044"

"K00806" "K02523"

"K06920"

"K01818" "K15669"

"K01709"

"K16019"

"K00423-K00434"

"K03379"

"K10960"

"K03476"

"K03391"

"K10026" "K00103"

"K06045" "K01499"

"K03527""K01823" "K05979"

"K01130"

"K00151"

"K03635-K03636"

"K01101"

"K01253"

"K01042"

"K00794"

"K13370"

"K01728"

"K15054"

"K16051"

"K00791" "K12343-K12344" "K09472" "K00251"

"K00067" "K04708"

"K05884" "K01634"

"K00968"

"K00511"

"K00983" "K00119"

"K13085" "K00090-K06152"

"K01218"

"K01198"

"K05898" "K15982-K15983"

"K09459"

"K09709" "K03526" "K01078-K09474" "K14394"

"K10219"

"K00376"

"K04712"

"K01790"

"K06167" "K00004"

"K10221"

"K01186"

"K03473"

"K15817" "K15812" "K00793" "K01053"

"K14470"

"K00967"

"K02821-K03475"

"K03273"

"K03388-K03389-K03390" "K01095-K01096"

"K02815"

"K01220" "K08678" "K01840"

"K01201"

"K01183-K13381"

"K00493-K14338" "K02164-K02305-K02448-K04561-K04748" "K05797"

"K03472" "K01654"

"K00502"

"K05351"

"K01787"

"K00096"

"K00886" "K01193"

"K01222"

"K00066" "K10046" "K01819" "K00039"

"K15917" "K00036"

"K10708" "K03785-K03786"

"K04037-K04038-K04039"

"K00218"

Node 2

"K01815"

"K02259"

"K15916"

"K01810-K06859" "K16011"

"K01735"

"K01711"

"K04480-K14081" "K01857"

"K15228" "K13498"

"K03396" "K03781" "K10713"

"K08093" "K00895" "K13810" "K00616" "K01783" "K01635-K08302" "K01837" "K00006-K00057-K00105-K00111-K00113" "K13831" "K01689" "K02446-K03841-K04041" "K01834-K15633-K15634-K15635" "K11532" "K00134-K00150-K05298-K10705" "K01623-K01624-K11645" "K06131-K06132" "K00058" "K16305" "K00131" "K00615" "K01622"

"K03381" "K01801"

"K00799" "K00481"

"K00368"

"K02170"

"K10676"

"K00452"

"K00315"

"K00463" "K11780-K11781" "K00980"

"K14520"

"K09516" "K16242-K16243-K16244-K16245-K16246"

"K15762"

"K06116-K06117"

"K01502"

"K15767" "K05783"

"K01621" "K01803"

"K00430-K11188" "K13812"

"K00146" "K01066"

"K01889-K01890" "K01746""K03782"

"K00210-K00211-K04517"

"K01563" "K02554"

"K00517" "K01868" "K10797"

"K11147-K11153"

"K07539" "K00480" "K01725"

"K03392" "K00450"

"K03464" "K15981-K16046" "K01607"

"K05915"

"K00766"

"K01934"

"K00224"

"K01061"

"K00577-K00581-K00582-K00583-K00584" "K00320" "K10535"

"K01069"

"K01618"

"K01759"

"K01841"

"K01629" "K01662" "K01601-K01602"

"K01856"

"K00258"

"K01734" "K00011" "K00068"

"K00222"

"K14083" "K02551"

"K01575" "K00360-K00367-K00370-K00371-K00372-K00373-K00374-K02567-K02568"

"K04097"

"K04091"

"K01628"

"K01633"

"K01632" "K02564-K12410"

"K00009" "K01809" "K03271"

"K04103" "K00501" "K00446" "K00500" "K15739" "K01866" "K12468" "K14028" "K12732" "K14727" "K12506" "K01695-K01696-K06001" "K00220" "K15226" "K04518-K05359" "K01835" "K12450" "K00012" "K01694" "K00050-K15893" "K00060-K15789" "K00002""K00148" "K15779" "K00217" "K01057-K07404" "K00285""K10775" "K01784" "K00354" "K01070" "K01851-K02361-K02552" "K00141" "K00147" "K00033" "K00274" "K00270" "K00963" "K01592" "K13853" "K10944-K10945-K10946" "K03738" "K05603" "K00014" "K03339" "K01875" "K00457" "K00104-K11472-K11473-K11517" "K05358" "K00965" "K07249" "K13832" "K03179" "K00276" "K01569" "K01737" "K00053" "K16055" "K00049-K12972" "K01608" "K00451" "K05947" "K10774" "K01807" "K00796" "K16306" "K00459" "K01820" "K01619" "K00697" "K01770" "K04093-K04516-K06208-K06209" "K01055" "K00720" "K00121" "K03418" "K00995" "K01736" "K01917" "K01808" "K00529-K05710" "K01131" "K14170" "K00015-K00018" "K06118" "K13953" "K02858" "K01893" "K14187" "K00468" "K00200-K00203-K00205-K11260-K11261" "K01920""K13951-K13952-K13980" "K00301-K00302-K00303-K00314" "K03637-K03639" "K00001" "K01839" "K00384" "K00505" "K01617" "K00114" "K01079" "K01690" "K00696" "K01821" "K01933" "K13954" "K00492" "K00721" "K10216" "K00362-K00363-K00366-K03385-K04016-K15876" "K01491-K13403" "K15864" "K00845" "K00754" "K13774-K13775""K10217" "K15066" "K01593" "K03735-K03736-K04019" "K01092" "K04022" "K01589" "K13036" "K00155" "K01556" "K11808" "K01495-K09007" "K01126" "K14419-K14420" "K01447-K01448" "K00322-K00323-K00324-K00325" "K13057" "K00800" "K01114" "K01588" "K00693-K00694-K16150-K16153" "K00847" "K05601" "K14422" "K07248" "K01878-K01879-K01880-K14164" "K14652" "K01667" "K01631" "K01625" "K00981" "K01620" "K01639" "K03333-K16045" "K13656" "K03430" "K00853" "K00900" "K05549-K05550-K05784" "K00850" "K01733" "K00863-K05878-K05879" "K00998" "K01595" "K01668" "K11753" "K01455" "K01666" "K01726" "K10011" "K01426" "K00600" "K03181" "K00849" "K00884" "K00864" "K00288" "K10012" "K00102-K03777-K03778" "K00882" "K00851" "K01945" "K03153" "K00854" "K01872" "K16146" "K00601-K08289-K11175" "K13830" "K00855" "K00927" "K11788" "K00865-K15788" "K03367-K14188" "K04782" "K01433-K01938-K13402" "K00885" "K13829""K07031" "K01596" "K01497" "K14759" "K00281-K00282-K00283" "K07246" "K01568" "K11529" "K12446" "K00922" "K00962" "K00852" "K00919" "K01752" "K00297" "K10760" "K01011" "K10218" "K13043" "K00524-K00525-K00526" "K00917" "K01916" "K02586-K02588-K02591" "K00848" "K15067" "K00748" "K01769-K12319-K12320-K12323-K12324" "K09019-K16066" "K01754""K11787" "K01501" "K03272" "K00888" "K05396" "K04765" "K00891" "K02619" "K01451" "K00858" "K10710" "K00923" "K15914" "K00945-K13800" "K13940" "K01610" "K00560" "K00605" "K00873-K12406" "K01139" "K00703-K16148" "K01571-K01572-K01573" "K01509-K01553" "K00259" "K02999-K03002-K03004-K03005-K03006-K03007-K03010-K03011-K03013-K03016-K03018-K03021-K03023-K03040-K03041-K03042-K03043-K03044-K03046-K03048-K03060-K13797-K13798" "K01775" "K06215-K08681" "K00289" "K00887" "K03429" "K02203" "K00921" "K11717" "K01652-K01653" "K00872-K02204" "K00287" "K00527" "K00483" "K00946" "K01948" "K09903" "K00912" "K11258" "K00101" "K01636" "K02437" "K00869" "K00877-K00941" "K14153" "K01768-K08043-K08048-K08049-K11029" "K01656" "K01525" "K01006-K01007" "K13799" "K00878" "K16330" "K13501" "K01791" "K04112-K04113-K04114-K04115" "K00161-K00162-K00163" "K02510" "K00929" "K10807-K10808" "K03465" "K16328" "K15849""K00681" "K00027-K00028-K00029" "K06015" "K03338" "K00985" "K00940"" K14154" "K00932" "K16329" "K00876" "K00158" "K00129" "K00820" "K13497" "K01424" "K02231" "K00948" "K01442" "K00604" "K00942" "K13941" "K00817" "K01714" "K00950" "K00016" "K00930" "K00928" "K00944" "K13998" "K03779" "K06982" "K12524-K12525" "K01657-K01658-K13503" "K01458" "K10353" "K03342" "K00874" "K02563" "K00830" "K00857" "K00822" "K01515-K08312" "K01522" "K00951" "K00926" "K12657" "K00931" "K00949" "K00273" "K01520" "K12659" "K00969-K06210-K13522" "K16164-K16165" "K00832" "K00815" "K00812-K00813-K11358-K14454-K14455" "K01954-K01955-K01956" "K01664-K01665-K13950" "K01008" "K00939" "K08967" "K00765-K02502" "K15359" "K00827-K14272" "K05287" "K00901" "K00762" "K06164" "K11541" "K00899" "K11540" "K00824" "K01494" "K07264" "K00943" "K01958-K01959-K01960" "K09758" "K13990" "K00867-K03525-K09680" "K03851" "K00860" "K01591" "K02527" "K01465" "K11754" "K08691" "K00768" "K01915" "K01914" "K00603" "K03417" "K00868" "K00814" "K13930" "K00856" "K01434" "K15519" "K00925" "K01505" "K01930" "K02428" "K00835" "K00955-K13811" "K00602-K01492" "K01443""K14264" "K00149" "K01512" "K01425" "K00788" "K11121" "K01241" "K01950" "K00773" "K02433-K02434-K02435" "K00831" "K14260" "K02319-K02320-K02321-K02322-K02323-K02324-K02325-K02327-K02335-K02337-K02338-K02339-K02340-K02341-K02342-K02343-K02684-K03504-K03763-K14159" "K00698" "K01952" "K00764" "K01493" "K12256" "K00260-K00261-K00262-K15371" "K01937" "K00157" "K01255-K01256-K01270-K07751-K11140-K11142" "K00264-K00265-K00266" "K01953" "K01886" "K00859" "K02472" "K04072" "K03426" "K03707" "K00656" "K00278" "K13421" "K00156" "K13035" "K01479" "K14085" "K01758" "K00226-K00254" "K01469" "K04021""K01637" "K09759" "K05597" "K01437" "K03416" "K00213" "K00128" "K00132-K04073" "K07425" "K06130" "K13745" "K00823" "K00292-K00293" "K03737" "K00100" "K01923" "K01467" "K07806" "K09482" "K01760-K14155" "K04105" "K12526" "K00194-K00197" "K01081-K03787-K11751" "K00840" "K01579" "K03269" "K09011" "K00654" "K00122-K00123-K00124-K00127-K05299-K08348-K15022" "K00841" "K00284" "K07250-K13524" "K01776" "K01800" "K00639""K01075" "K00609-K00610" "K01744" "K00836-K15785" "K00680" "K00606" "K00024-K00025-K00026-K00051-K00116" "K00193" "K14676" "K00294" "K01580" "K00816" "K15786""K01951" "K15372" "K00043" "K13547" "K08295" "K00169-K00170-K00171-K00172" "K01484" "K13922" "K13294" "K02232" "K01885" "K14267""K13566" "K14138" "K01638" "K01120-K13298" "K01904" "K00761-K02825" "K01076" "K01939" "K13017" "K01919-K11204" "K09698" "K12234" "K09470" "K01129" "K00640" "K04110" "K00619-K14682" "K01779" "K00454" "K05957" "K00819" "K04487" "K00767" "K01490" "K01846" "K01913" "K00643" "K10702" "K01912-K02614" "K00631-K03621-K08591" "K01000" "K01663-K02500-K02501" "K16173" "K01876" "K01058" "K00818" "K12298" "K10206" "K01924" "K13559""K01908""K00620" "K13763" "K03800" "K00821" "K00140" "K00382" "K01041" "K14621" "K01647-K01648-K15230-K15231" "K08748" "K11992" "K00613" "K13776" "K11212" "K00020" "K05830""K14268" "K15038" "K00010" "K02224" "K00658" "K15741" "K14674" "K01756" "K00763" "K14681" "K01655-K02594" "K00759" "K08969""K09251" "K00109" "K01659" "K00164" "K00143" "K01895" "K00627" "K05907" "K04116" "K08764" "K00174-K00175-K00176-K00177" "K03823" "K12405" "K01796" "K00760" "K01578" "K01031-K01032" "K00021-K00054" "K04020" "K05605" "K13821" "K08692-K14067-K14451" "K10977" "K01911-K14760" "K01896" "K00641" "K00826" "K00030" "K15913" "K03462" "K00769-K03816" "K00548" "K15018" "K03821" "K00625-K13788-K15024" "K13356" "K15780" "K13923" "K01643-K01644" "K15742" "K03918" "K14469" "K01068" "K00758" "K14157" "K16007" "K01918" "K07823" "K01697" "K04042" "K01039-K01040" "K08262" "K00634" "K14467" "K03517" "K07513" "K01905" "K14163" "K01026" "K00651" "K01897-K15013" "K08281" "K01067" "K05822" "K00674" "K00469" "K07508-K07509" "K10214" "K10817" "K02615" "K01940" "K13018" "K01641" "K03783-K03784" "K02233" "K01902-K01903" "K14371" "K00756" "K00626" "K00655" "K01452" "K01044" "K00632" "K13877" "K09699" "K00635" "K01899-K01900" "K00467" "K15232" "K05551" "K15866" "K06718" "K03815" "K00657" "K09018-K09024" "K01487" "K00003" "K00508" "K00757" "K00364" "K00997-K06133" "K00394-K00395" "K01907" "K00135-K00139" "K00022" "K15880" "K14245" "K00257" "K00186-K00187" "K01239" "K00673" "K14447" "K01034-K01035" "K00645" "K01514-K01524" "K01739""K01062" "K00031" "K01649" "K01640" "K00652" "K00648" "K00491-K13242" "K10852" "K11987" "K01825" "K00232" "K00252" "K00249" "K01782" "K00166-K00167-K11381" "K01942-K03524" "K00133" "K11533" "K00290" "K00668" "K01761" "K01049" "K00665" "K01431" "K00207" "K13034" "K06193" "K07514" "K01964-K15036" "K06445" "K14471-K14472" "K01961-K01962-K01963-K02160-K11262" "K02535" "K01740" "K10527" "K00087-K11177-K11178-K13480-K13481-K13482-K13483" "K01692" "K01027-K01028-K01029" "K01046-K14675" "K06447" "K00490" "K00300" "K00248-K09478" "K01965-K01966-K15052" "K01615" "K01738" "K00507" "K05823" "K13929" "K01489" "K07515" "K12339" "K07511" "K00088" "K05939" "K01847-K01848-K01849-K11942" "K01438" "K13767" "K01613" "K01427-K01428-K01429-K01430-K14048" "K00074" "K00023" "K01060" "K03274" "K15016" "K08693" "K07516" "K00255-K07512" "K01687" "K13746" "K08683" "K01715" "K02473" "K09479" "K00234-K00235-K00237-K00239-K00240-K00241-K00242-K00244-K00245-K00246-K00247" "K14534" "K15728" "K00106" "K00316" "K01486" "K07753" "K00052" "K13239" "K01488" "K00972-K11528" "K15938" "K01676-K01677-K01678-K01679" "K00137" "K11538" "K06446" "K00677" "K15019" "K00108-K00499" "K00065" "K00077" "K12711" "K13038""K01922" "K11410" "K15989" "K00599" "K02517" "K01555-K16171" "K12743" "K02536" "K01883-K01884" "K03801" "K00253" "K14446" "K05607-K13766" "K01594" "K10764" "K01968-K01969" "K01439" "K15894" "K12254" "K02474" "K00679" "K13015" "K00549" "K00647-K09458" "K00208" "K01468" "K00083" "K02618" "K01874" "K05829" "K12355" "K00667" "K01485" "K01755" "K15998" "K00790" "K10258" "K01254" "K01498" "K03343" "K01681" "K00145" "K03337" "K00772" "K02549" "K00271" "K03119" "K11752" "K01581" "K01682" "K01119" "K00956-K00957-K00958" "K03852" "K16044" "K01685" "K01082" "K08968" "K00797" "K01921" "K02228" "K01799" "K03335" "K02371" "K00215" "K00954-K02201" "K02372" "K05606" "K00611" "K01478" "K03644" "K04118" "K00227" "K04034-K04035" "K03921" "K00263" "K01686" "K01716" "K01673" "K00059" "K00209" "K10780" "K01887" "K01464" "K13777-K13778" "K01014" "K01582" "K00019" "K01476" "K07173" "K02225-K02227" "K14152" "K01251" "K01523" "K02609-K02610-K02611-K02612-K02613" "K10978" "K00390" "K01750" "K00543" "K01925" "K00547" "K01243" "K00192-K00195" "K00318" "K00198-K03518-K03519-K03520" "K00798" "K11731" "K11755" "K01932" "K05824" "K13622" "K01814" "K13621" "K05831" "K00130" "K09065" "K01195" "K01812" "K12251" "K02301" "K00789" "K03336" "K01583-K01584-K01585-K02626" "K01480" "K01873" "K01892" "K00551-K00570" "K00013" "K01661" "K03897" "K01496" "K04566-K04567-K04568" "K12726" "K02169" "K01457" "K13019" "K00569" "K01590" "K01089" "K00802" "K04075" "K11180-K11181" "K12502" "K01777" "K01598" "K07534" "K01672-K01674" "K07536" "K00040" "K14541" "K01703-K01704" "K08688" "K05362" "K03927" "K13388" "K01693""K00387-K05301-K07147" "K13384" "K01745" "K04486-K05602" "K00559" "K05827" "K00075" "K00380-K00381-K00392" "K07535" "K00286" "K00082" "K00041" "K01586" "K00588" "K05936" "K00471" "K00563-K00564-K06970-K08316-K12297" "K00275" "K13747" "K01843" "K05934" "K03428" "K13779" "K00097" "K00558" "K13016-K13020" "K05928" "K01611" "K02188" "K13623" "K03183" "K10793-K10794" "K01706" "K09187-K09188-K11419-K11420-K11423-K11429" "K00833" "K03394" "K01778" "K00595" "K00472" "K11185-K11186" "K05888" "K01881" "K00589-K02303-K02496" "K02302" "K15792" "K02492" "K06127" "K13542-K13543" "K15778" "K01112"

"K00975"

"K11333-K11334-K11335"

"K01626-K03856" "K06153"

"K01085""K01792"

"K10700" "K07538"

"K04100-K04101" "K04099" "K00383" "K00432"

"K01817"

"K03431" "K08097"

"K10618"

"K14519"

"K00055" "K00486"

"K00466" "K01816" "K01560-K01561"

"K09483"

"K10220"

"K01609" "K00453"

"K07106" "K00005"

"K00978"

"K01232" "K01226"

"K01091"

"K01002" "K01786-K03077-K03080" "K02768-K02769-K02770"

"K01858" "K09461" "K00966-K00971" "K00973"

"K01051" "K01721"

"K01867"

"K00048"

"K01199"

"K01223" "K01805" "K02777" "K04719" "K05343""K12309" "K01194-K05342" "K01190" "K01182" "K01838" "K01189-K07406-K07407" "K01187-K12317-K15922" "K01087" "K00691-K00705"

"K01853"

"K06120-K06121-K06122"

"K01178" "K01229-K12111" "K02778-K02779" "K01188-K05349-K05350"

"K14448"

"K11693"

"K02636"

"K02817-K02818-K02819" "K00045" "K00689-K00690-K05341" "K00008" "K00115-K00117" "K00702" "K01785"

"K10619-K16303"

"K01699-K13919-K13920"

"K02808-K02809-K02810"

"K00991"

"K05917"

"K00399-K00401-K00402" "K15758"

"K01077-K01113" "K02786-K02787-K02788"

"K02793-K02794-K02795-K02796" "K02802-K02803-K02804" "K02790-K02791" "K08679" "K02782-K02783" "K00034" "K02763-K02764-K02765" "K01804" "K12308"

"K02377"

"K00448-K00449"

"K01564"

"K00086"

"K02775" "K02798-K02799-K02800" "K07026" "K01710"

"K01852"

"K01826"

"K02548"

"K00099" "K14449"

"K03366"

"K00319-K13942"

"K01117"

"K00299" "K02771"

"K00485"

"K01576" "K09471"

"K01207"

"K01179-K01225" "K06879-K09457"

"K04040"

"K06163"

"K05921"

"K12451"

"K15813"

"K01822"

"K03587-K05364-K05365"

"K03082"

"K01772"

"K12658"

"K01423"

"K00568"

"K01473-K01474"

"K00042"

"K11263"

"K00979"

"K08353"

"K01928"

"K01720"

"K01941" "K14259"

"K01712"

"K09829" "K01707" "K04126"

"K01929"

"K01470"

"K01705"

"K01702"

"K05275"

"K03365" "K03403-K03404-K03405"

Node 6

"K14732"

"K06281-K06282" "K00231" "K01477" "K03340"

"K10536" "K03182-K03186" "K01845"

"K06042"

"K00228-K02495"

"K03383"

"K12725" "K12728" "K03270"

"K06134" "K06126" "K03795"

"K04127"

"K00440-K00441-K00443"

"K01599" "K03184"

"K11207"

"K01870" "K01869"

"K03474"

"K08963"

"K04835" "K05895" "K02230-K09882-K09883" "K03185" "K02304"

"K16054"

"K01719"

"K14269"

"K01698" "K01708"

"K09880" "K08351" "K01012"

"K01500" "K04128"

"K01627" "K01749"

"K00355" "K02827-K02828"

"K06720" "K12253"

"K03688"

"K01935"

"K01844" "K00329-K00330-K00331-K00332-K00333-K00334-K00335-K00336-K00337-K00338-K00339-K00340-K00341-K00342-K00343-K03878-K03879-K03880-K03881-K03882-K03883-K03884" Node 3

"K01466"

"K08964"

"K02083" "K08357-K08358"

Node 5 "K10674"

Node 4

Supplementary Figure 5 Generalized OMMC-wide metabolic network reconstructed from the combined metagenomic and metatranscriptomic data of OMMCs sampled in autumn and winter. Coloured nodes indicate pathways enriched in highly expressed KOs; blue – oxidative phosphorylation; yellow – nitrogen metabolism; red – TCA cycle; purple – glycerolipid metabolism. Large grey nodes are highly expressed during at least one season. Opacity of nodes indicates shortest average path length (the more transparent, the longer the path length).

40

Acetyl0CoA(

K01716(K00665( K02371( …..( K02372( K09458( K00208( Malonyl0ACP( K00647( Hexadecanoyl0CoA(

Palmi&c(acid(

K01897(

Hexadecanoyl(

K00665(

K00059( K10258(

K00647(K00208(( K02372(K00647( K09458(K00208( K02371( K00059( K10258(

Stearoyl0CoA(

Oleoyl0CoA(

K03921'

K10256(

Linoleoyl0CoA(

Octadecatrienoyl0CoA(

Icosadienoyl0CoA(

Icosatrienoyl0CoA(

K10257(

…..(

Icosenoyl0CoA(

…..(

…..(

Icosanoyl0CoA(

…..(

K00059(

Supplementary Figure 6 Representation of the fatty acid metabolic pathway (ko01212). KOs with a betweenness centrality that is much higher in the metabolic network reconstructed from the OMMC sampled in winter compared to the OMMC sampled in autumn are highlighted in red. The key functionality of KO K03921 (acyl-[acyl-carrier protein] desaturase) is highlighted in pink. Winter KOs with a high relative gene expression are represented in blue. The dotted lines represent the continuity of the pathway and the circles represent metabolites.

41

a

Supplementary

b

Figure

7

OMMC-wide

metabolic

networks

reconstructed

from

metagenomic and metatranscriptomic data of OMMCs sampled in (a) autumn and (b) winter. Large nodes indicate KOs encoding key functionalities whereas colours represent pathway membership: yellow, nitrogen metabolism; orange, fatty acid biosynthesis; light green, benzoate degradation; dark green, porphyrin and chlorophyll metabolism; pink, cystein and methionine metabolism and red, other pathways. 630

42

a

b

c

Supplementary Figure 8 Micrographs of Isolate LCSB065. (a) Non-polar granules observed in Isolate LCSB065 following Nile Red staining (λex 550/20 nm, λem 620/60 nm); (b) Bright field micrograph; (c) Overlay of (a) and (b). The scale bar is equivalent to 10 µm.

43

Supplementary Tables

Supplementary Table 1 Physicochemical characteristics of the wastewater at the time of sampling in the anoxic tank of the Schifflange BWWT plant. Sampling dates

Suspended

pH

solids (g/l)

NO3-

O2

NH4

PO4

Air

Water

(mg/l)

(mg/l)

(mg/l)

(mg/l)

temperature

temperature

(°C)

(°C)

4 October 2010

2.78

6.94

2.77

0.64

0.6

2.06

15

20.7

25 January 2011

3.24

7.01

3.37

1.13

1.72

1.02

0

14.5

635

44

Supplementary Table 2 Quantitative and qualitative analyses of biomacromolecular fractions sequentially isolated from the OMMC samples.

Sampling dates

DNA A260/A280

RNA Quantity

RIN

[µg]

Protein

Quantity

Quantity per

[µg]

gel lane [µg]

4 October 2010

2.20

15.44

8.9

81.97

20.1

25 January 2011

2.03

6.11

9.2

75

21.8

45

Supplementary Table 3 Statistics of the combined assembly of the metagenomic and 640

metatranscriptomic sequence datasets.

Statistic Total contig length

Value Unit 1 617 735 059 nt

Average contig length

239 nt

N50

258 nt

Maximal contig length Number of contigs

12 591 nt 6 761 781

46

Supplementary Table 4 Ammonia-oxidizing organisms with corresponding accession numbers of amoA genes used for the reconstruction of the phylogenetic tree. 645

Organism name

Accession number

Candidatus Nitrososphaera gargensis

gi|166007511

Methylobacter tundripaludum

gi|493947876

Methylococcus capsulatus Bath

gi|53803062

Methylomicrobium alcaliphilum 20Z

gi|357403888

Nitrosococcus halophilus Nc4

gi|292490804

Nitrosococcus oceani ATCC19707

gi|77165960

Nitrosococcus watsonii C-113

gi|300113334

Nitrosomonas cryotolerans ATCC49181

gi|12620331

Nitrosomonas europaea ATCC19718

gi|30248947

Nitrosomonas eutropha C91

gi|114332043

Nitrosomonas sp. Is79A3

gi|339481967

Nitrosomonas sp. JL21

gi|19310210

Nitrosomonas sp. AL212

gi|325981491

Nitrosospira briensis C-128

gi|1732262

Nitrosospira multiformis ATCC25196

gi|82701932

Nitrosospira sp. APG3A

gi|490283770

Nitrosospira sp. En13

gi|121483569

Nitrosospira sp. NpAV

gi|2062746

Nitrosospira sp. Np39-19

gi|2425028

Nitrosovibrio tenuis

gi|1732264

47

Supplementary Table 5 Numbers of metagenomic and metatranscriptomic reads from autumn and winter dates mapping to the isolate genomes of Nitrosomonas sp. Is79 (ref. 30) and Isolate LCSB065. Dataset

Sampling dates

Target organism Nitrosomonas sp. IS79

Isolate LCSB065

metagenomic reads

4 October 2010

7 282

5 665

metagenomic reads

25 January 2011

13 058

7 941

metatranscriptomic reads

4 October 2010

4 004

1 923

metatranscriptomic reads

25 January 2011

28 631

20 784

48

Supplementary References 1. Roume H, Muller EE, Cordes T, Renaut J, Hiller K, Wilmes P. A biomolecular isolation 650

framework for eco-systems biology. ISME J 2013; 7: 110-121.

2. Roume H, Heintz-Buschart A, Muller EE, Wilmes P. Sequential isolation of metabolites, RNA, DNA, and proteins from the same unique sample. Microbial Metagenomics, Metatranscriptomics, and Metaproteomics. Method Enzymol 2013; 531: 219-236. 655 3. Kozarewa I, Turner DJ. 96-plex molecular barcoding for the Illumina Genome Analyzer. High-Throughput Next Generation Sequencing 2011; 279-298.

4. Masella AP, Bartram AK, Truszkowski JM, Brown DG, Neufeld JD. PANDAseq: paired660

end assembler for illumina sequences. BMC Bioinformatics 2012; 13: 31.

5. Kofler R, Orozco-terWengel P, De Maio N, Pandey RV, Nolte V, Futschik A et al. PoPoolation: a toolbox for population genetic analysis of next generation sequencing data from pooled individuals. PloS one 2011; 6: e15925. 665 6. Huang Y, Niu B, Gao Y, Fu L, Li W. CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics 2010; 26: 680-682.

7. Kultima JR, Sunagawa S, Li J, Chen W, Chen H, Mende DR et al. MOCAT: a 670

metagenomics assembly and gene prediction toolkit. PloS one 2012; 7: e47656.

8. Li R, Li Y, Kristiansen K, Wang J. SOAP: short oligonucleotide alignment program.

49

Bioinformatics 2008; 24: 713-714.

675

9. Miller CS, Baker BJ, Thomas BC, Singer SW, Banfield JF. EMIRGE: reconstruction of full-length ribosomal genes from microbial community short read sequencing data. Genome Biol 2011; 12: R44.

10. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P et al. The SILVA ribosomal 680

RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res 2013; 41: D590-D596.

11. Treangen TJ, Koren S, Sommer DD, Liu B, Astrovskaya I, Ondov B et al. MetAMOS: a modular and open source metagenomic assembly and analysis pipeline. Genome Biol 2013; 685

14: R2.

12. Namiki T, Hachiya T, Tanaka H, Sakakibara Y. MetaVelvet: an extension of Velvet assembler to de novo metagenome assembly from short sequence reads. Nucleic Acids Res 2012; 40: e155-e155. 690 13. Rho M, Tang H, Ye Y. FragGeneScan: predicting genes in short and error-prone reads. Nucleic Acids Res 2010; 38: e191-e191.

14. Hyatt D, Chen G-L, LoCascio P, Land M, Larimer F, Hauser L. Prodigal: prokaryotic 695

gene recognition and translation initiation site identification. BMC Bioinformatics 2010; 11: 119.

50

15. Kent WJ. BLAT—the BLAST-like alignment tool. Genome Res 2002; 12: 656-664.

700

16. Kessner D, Chambers M, Burke R, Agus D, Mallick P. ProteoWizard: open source software for rapid proteomics tools development. Bioinformatics 2008; 24: 2534-2536.

17. Craig R, Cortens JP, Beavis RC. Open source system for analyzing, validating, and storing protein identification data. J Proteome Res 2004; 3: 1234-1242. 705 18. Deutsch EW, Mendoza L, Shteynberg D, Farrah T, Lam H, Tasman N et al. A guided tour of the Trans-­‐Proteomic Pipeline. Proteomics 2010; 10: 1150-1159.

19. Keller A, Nesvizhskii AI, Kolker E, Aebersold R. Empirical statistical model to estimate 710

the accuracy of peptide identifications made by MS/MS and database search. Anal Chem 2002; 74: 5383-5392.

20. Shteynberg D, Deutsch EW, Lam H, Eng JK, Sun Z, Tasman N et al. iProphet: multilevel integrative analysis of shotgun proteomic data improves peptide and protein 715

identification rates and error estimates. Mol Cell Proteomics 2011; 10: M111.007690.

21. Muller EE, Pinel N, Laczny CC, Hoopmann MR, Narayanasamy S, Lebrun LA et al. Community-integrated omics links dominance of a microbial generalist to fine-tuned resource usage. Nat Commun 2014; 5: 5603; 1-10. 720 22. Lee S, Seo CH, Lim B, Yang JO, Oh J, Kim M et al. Accurate quantification of transcriptome from RNA-Seq data by effective length normalization. Nucleic Acids Res 2011;

51

39: e9-e9.

725

23. Tsementzi D, Poretsky R, Rodriguez-­‐R LM, Luo C, Konstantinidis KT. Evaluation of metatranscriptomic protocols and application to the study of freshwater microbial communities. Environ Microbiol Rep 2014; 6: 640-655.

24. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful 730

approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological) 1995; 289-300.

25. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome 735

Res 2003; 13: 2498-2504.

26. Rahman SA, Schomburg D. Observing local and global properties of metabolic pathways:‘load points’ and ‘choke points’ in the metabolic networks. Bioinformatics 2006; 22: 1767-1774. 740 27. Faust K, Croes D, van Helden J. Metabolic pathfinding using RPAIR annotation. J Mol Biol 2009; 388: 390-414.

28. Csardi G, Nepusz T. The igraph software package for complex network research. 745

InterJournal, Complex Systems 2006; 1695: 1-9.

29. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search

52

tool. J Mol Biol 1990; 215: 403-410.

750

30. Bollmann A, Sedlacek CJ, Norton J, Laanbroek HJ, Suwa Y, Stein LY et al. Complete genome sequence of Nitrosomonas sp. Is79, an ammonia oxidizing bacterium adapted to low ammonium concentrations. Stand Genomic Sci 2013; 7: 469.

31. Liu W, Li L, Khan MA, Zhu F. Popular molecular markers in bacteria. Mol Genet 755

Microbiol Virol 2012; 27: 103-107.

32. Sommer DD, Delcher AL, Salzberg SL, Pop M. Minimus: a fast, lightweight genome assembler. BMC Bioinformatics 2007; 8: 64.

760

33. Treangen TJ, Sommer DD, Angly FE, Koren S, Pop M. Next generation sequence assembly with AMOS. Curr Protoc Bioinformatics 2011; 11.8. 1-11.8. 18.

34. Sayavedra-­‐Soto L, Hommes N, Alzerreca J, Arp D, Norton JM, Klotz M. Transcription of the amoC, amoA and amoB genes in Nitrosomonas europaea and Nitrosospira sp. NpAV. 765

FEMS Microbiol Lett 1998; 167: 81-88.

35. Fu L, Niu B, Zhu Z, Wu S, Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 2012; 28: 3150-3152.

770

36. Dereeper A, Guignon V, Blanc G, Audic S, Buffet S, Chevenet F et al. Phylogeny. fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res 2008; 36: W465-W469.

53

37. Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol 2000; 17: 540-552. 775 38. Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol 2010; 59: 307-321.

780

39. Chevenet F, Brun C, Bañuls A-L, Jacq B, Christen R. TreeDyn: towards dynamic graphics and annotations for analyses of trees. BMC Bioinformatics 2006; 7: 439.

40. Reasoner D, Geldreich E. A new medium for the enumeration and subculture of bacteria from potable water. App Environ Microbiol 1985; 49: 1-7. 785 41. Levantesi C, Rossetti S, Thelen K, Kragelund C, Krooneman J, Eikelboom D et al. Phylogeny, physiology and distribution of ‘Candidatus Microthrix calida’, a new Microthrix species isolated from industrial activated sludge wastewater treatment plants. Environ Microbiol 2006; 8: 1552-1563. 790 42. Slijkhuis H. Microthrix parvicella, a filamentous bacterium isolated from activated sludge: cultivation in a chemically defined medium. App Environ Microbiol 1983; 46: 832-839.

43. Fowler SD, Greenspan P. Application of Nile red, a fluorescent hydrophobic probe, for 795

the detection of neutral lipid deposits in tissue sections: comparison with oil red O. J Histochem Cytochem 1985; 33: 833-836.

54

44. Schneider CA, Rasband WS, Eliceiri KW. NIH Image to ImageJ: 25 years of image analysis. Nat Methods 2012; 9: 671-675. 800 45. Rodrigue S, Materna AC, Timberlake SC, Blackburn MC, Malmstrom RR, Alm EJ, Chisholm SW. Unlocking short read sequencing for metagenomics. PLoS one 2010; 5: e11840.

805

46. Xu H, Luo X, Qian J, Pang X, Song J, Qian G et al. FastUniq: a fast de novo duplicates removal tool for paired short reads. PloS one 2012; 7: e52249.

47. Peng Y, Leung HC, Yiu S-M, Chin FY. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 2012; 28: 1420810

1428.

48. Gordon D, Abajian C, Green P. Consed: a graphical tool for sequence finishing. Genome Res 1998; 8: 195-202.

815

49. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA et al. The RAST Server: rapid annotations using subsystems technology. BMC Genomics 2008; 9: 75.

50. Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 2009; 25: 1754-1760. 820 51. Kerepesi C, Bánky D, Grolmusz V. AmphoraNet: The webserver implementation of the AMPHORA2 metagenomic workflow suite. Gene 2014; 533: 538-540.

55

52. Tatusov RL, Galperin MY, Natale DA, Koonin EV. The COG database: a tool for 825

genome-scale analysis of protein functions and evolution. Nucleic Acids Res 2000; 28: 33-36.

53. Overbeek R, Begley T, Butler RM, Choudhuri JV, Chuang H-Y, Cohoon M et al. The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes. Nucleic Acids Res 2005; 33: 5691-5702. 830

56