
Visualisation of large-scale fungal omics data with SHIN+GO

Shingo Miyauchi1, Marie-Noelle Rosso2, Annegret Kohler1, Francis Martin1

1. UMR Tree-Microbe Interactions, INRA-Nancy, 54280 Champenoux, France; 2. INRA-Aix Marseille Université, UMR1163 Biodiversité et Biotechnologie Fongiques, F-13288 Marseille, France

Aims: Visualising layers of different omics data to understand complex biological activities

Background

Self organizing map Harbouring Informative Nodes with Gene Ontology (SHIN+GO) is an omics data-mining platform that 1) reduces the high dimensionality of RNA-seq data and clusters similarly regulated genes; 2) incorporates various types of data (e.g. secretomics, gene annotations, predicted functions) into the self-organising map; 3) visualises layers of sorted information for biological interpretation. The method has been used for fungal omics studies1,2.

Methods

Sample preparation (figure). Case study 1: P. coccineus grown on maltose, aspen, pine and wheat straw; sampled at days 3 and 7. Case study 2: L. bicolor in 3 types of fungus-plant interactions; sampled at days 0, 3, 7, 14 and 21. RNA-seq.

A Self Organising Map (SOM) is a data-driven machine learning algorithm3,4 that constructs network matrices with input data. It is suitable for simplifying high-dimensional data because it reduces the number of features by grouping similar items and forming clusters.
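The grouping step can be illustrated with a minimal sketch of a SOM in plain Python. This is not the SHIN+GO implementation (the poster cites the R kohonen package4). The function names, the toy gene vectors, the 2 x 2 grid, and the decay schedules for the learning rate and neighbourhood radius are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda n: sum((w - v) ** 2 for w, v in zip(weights[n], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols rectangular SOM; returns node weights in row-major order."""
    rng = random.Random(seed)
    # Deterministic start (an assumption): seed nodes by cycling through the inputs.
    weights = [list(data[n % len(data)]) for n in range(rows * cols)]
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            b_row, b_col = divmod(best_matching_unit(weights, x), cols)
            for n, w in enumerate(weights):
                row, col = divmod(n, cols)
                grid_d2 = (row - b_row) ** 2 + (col - b_col) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Hypothetical log2 read counts of four genes across three conditions.
genes = {
    "g1": [10.0, 2.0, 2.0],
    "g2": [9.5, 2.5, 1.5],
    "g3": [2.0, 10.0, 9.0],
    "g4": [1.5, 9.5, 9.5],
}
weights = train_som(list(genes.values()), rows=2, cols=2)
assignment = {gene: best_matching_unit(weights, v) for gene, v in genes.items()}
```

In this toy input, g1 and g2 share one transcription pattern and g3 and g4 share another, so the two pairs are expected to be assigned to different nodes.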


Data visualisation

Case study 1. Genome-wide integrative omics models of P. coccineus grown with the substrates for 3 and 7 days (see Methods)2.

Case study 1: Pycnoporus coccineus was cultured on four substrates for 3 and 7 days to examine transcriptomic and secretomic responses.

Case study 2: Laccaria bicolor was grown in three conditions: 1) interaction with a plant; 2) indirect interaction; 3) no interaction. Transcriptomic responses were examined at 5 time points.

Transcriptomic map: Mean transcription levels per node for each condition. Secretomic map: The total count of secreted proteins per node, indicating secretion hotspots. (A): Magnified version of one treatment from (B). Node identifiers are labelled (1 to 456). (B): Transcriptomic and secretomic maps from the four substrates at two time points.

Data acquisition

Transcriptome: Normalized log2 read count of genes for all substrates from RNA-seq data. Secretome: Identification of secreted proteins detected with liquid chromatography mass spectrometry (LC-MS).

Workflow (figure). Omics data mining (inputs: RNA-seq, LC-MS): data integration, self-organizing map, transcriptomic profile, high transcription profile, secretomic profile. Annotation term mining: biological interpretation.

The SHIN pipeline produces self-organizing maps (SOMs). A master SOM is made with the normalized log2 read counts of genes from replicates in all conditions. Genes with similar transcription patterns are clustered into SOM nodes. Neighboring nodes have similar transcriptional patterns.

Transcriptomic profile: The SOM is overlaid with the node-wise mean of the normalized transcript read counts of genes, which represents the transcription level for each condition.

High transcription profile: The selected nodes show condition-specific, highly transcribed gene clusters.

Secretomic profile: The SOM is overlaid with the count of corresponding secreted proteins detected for each condition.

The GO pipeline identifies overrepresented biological terms in nodes using functional annotation sets (e.g. CAZy, InterPro, GO, KEGG, KOG).

Discovery of omics hotspots: The constructed omics models pinpoint overlaps between highly transcribed co-regulated genes and frequently co-secreted proteins for each condition at each time point.
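The overlay steps described above can be sketched as follows, assuming each gene has already been assigned to a SOM node. The gene names, read counts, secretome sets, function names and both thresholds are hypothetical; they are not taken from the SHIN+GO code or from the published data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical inputs: SOM node per gene, normalised log2 read counts per
# condition, and genes whose proteins were detected by LC-MS per condition.
gene_to_node = {"g1": 1, "g2": 1, "g3": 2, "g4": 2, "g5": 3}
log2_counts = {
    "g1": {"aspen": 13.0, "maltose": 6.0},
    "g2": {"aspen": 12.5, "maltose": 5.0},
    "g3": {"aspen": 5.0, "maltose": 11.0},
    "g4": {"aspen": 6.0, "maltose": 12.0},
    "g5": {"aspen": 12.2, "maltose": 4.0},
}
secreted = {"aspen": {"g1", "g2", "g5"}, "maltose": {"g3"}}


def transcriptomic_profile(gene_to_node, log2_counts, condition):
    """Node-wise mean of the normalised log2 read counts for one condition."""
    per_node = defaultdict(list)
    for gene, node in gene_to_node.items():
        per_node[node].append(log2_counts[gene][condition])
    return {node: mean(values) for node, values in per_node.items()}


def secretomic_profile(gene_to_node, secreted_genes):
    """Count of detected secreted proteins per node for one condition."""
    counts = defaultdict(int)
    for gene in secreted_genes:
        counts[gene_to_node[gene]] += 1
    return dict(counts)


def hotspots(t_profile, s_profile, min_log2, min_secreted):
    """Nodes that are both highly transcribed and rich in secreted proteins."""
    return sorted(
        node
        for node, level in t_profile.items()
        if level > min_log2 and s_profile.get(node, 0) >= min_secreted
    )


t_aspen = transcriptomic_profile(gene_to_node, log2_counts, "aspen")
s_aspen = secretomic_profile(gene_to_node, secreted["aspen"])
# Thresholds are illustrative assumptions.
aspen_hotspots = hotspots(t_aspen, s_aspen, min_log2=12.0, min_secreted=2)
```

Here node 1 is the only node that passes both thresholds on aspen: its mean log2 read count is 12.75 and two of its genes have secreted products.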

Visualisation of fungal omics profiles with SHIN+GO. (1) The SHIN module integrates the fungal transcriptome from RNA-seq data and the secretome from liquid chromatography mass spectrometry. (2) The GO module assists the biological interpretation of the outputs of the omics models with functional gene annotations1,2.
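The poster does not state which statistic the GO module uses to call a term overrepresented in a node. One common choice, shown here purely as an assumption, is a one-sided hypergeometric test; the function name and the numbers in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(genome_size, term_in_genome, node_size, term_in_node):
    """P(X >= term_in_node) when node_size genes are drawn without replacement
    from genome_size genes, of which term_in_genome carry the annotation term."""
    upper = min(term_in_genome, node_size)
    tail = sum(
        comb(term_in_genome, i) * comb(genome_size - term_in_genome, node_size - i)
        for i in range(term_in_node, upper + 1)
    )
    return tail / comb(genome_size, node_size)


# Toy example: 10 genes in total, 3 annotated with the term,
# and a node of 3 genes that are all annotated with it.
p = enrichment_p_value(genome_size=10, term_in_genome=3, node_size=3, term_in_node=3)
```

In practice a test like this would be run for every term in every node, so the resulting p-values would need a multiple-testing correction before interpretation.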

Summary

• SHIN+GO combines layers of biological information and systematically pinpoints hotspots in large-scale omics data.
• Visual outputs enable the discovery of genome-wide patterns of organisms.
• It is versatile and can be applied to comparative transcriptomics of different strains or species.
• The ultimate goal is to combine various omics data: transcriptomics, proteomics, and metabolomics.

Acknowledgments & References

We would like to thank the people involved in the projects. The research was conducted under the supervision of 1) Dr. Francis Martin and Dr. Annegret Kohler at Tree-Microbe Interactions, INRA-Nancy, France; 2) Dr. Marie-Noelle Rosso at INRA Aix-Marseille University, France. I thank Joske Ruytinx (Universiteit Hasselt, Belgium) for performing RNA extraction for RNA-seq of L. bicolor. The projects were funded by US DOE JGI and Agence Nationale de la Recherche.

1. Miyauchi, S. et al. Visual Comparative Omics of Fungi for Plant Biomass Deconstruction. Front. Microbiol. 7, 1335 (2016).
2. Miyauchi, S. et al. The integrative omics of white-rot fungus Pycnoporus coccineus reveals co-regulated CAZymes for orchestrated lignocellulose breakdown. PLoS One 1, 1–17 (2017).
3. Kohonen, T. Self-Organized Formation of Topologically Correct Feature Maps. Biol. Cybern. 43, 59–69 (1982).
4. Wehrens, R. & Buydens, L. M. C. Self- and Super-organizing Maps in R: The kohonen Package. J. Stat. Softw. 21 (2007).
5. US DOE JGI program. http://genome.jgi.doe.gov
6. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

SOM groups genes with similar patterns and forms gene groups (i.e. nodes). Nodes in close proximity have genes with relatively similar patterns (see figure).

Findings: The integrated omics model pinpointed omics hotspots showing co-regulation of CAZymes at the transcriptomic and secretomic levels.

Case study 2. Genome-wide transcriptomic models of Laccaria bicolor representing development of symbiosis with a host plant, Populus tremula × alba, at five time points. Transcriptomes of free-living (no interaction with the plant), ectomycorrhizal (direct interaction), or extraradicular mycelia (indirect interaction) are shown.

Tatami maps: Genome-wide transcription (mean normalised log2 read count of replicates) corresponding to the conditions. Node identifiers are labelled (1 to 621). Bottom: Highly (differentially) transcribed nodes at each time point, selected based on 1) > 12 mean log2 reads (above the 95th percentile of the transcription level of all genes used for the model); and 2) > 2 log2 fold difference of each time point against day 0.

Findings: The transcriptomic models pinpointed unique co-regulated genes for ectomycorrhizal development, including small secreted proteins known to neutralise the defence system of host plants (manuscript in preparation).
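The two selection criteria given for Case study 2 (mean log2 reads above 12, and a log2 difference above 2 against day 0) can be expressed as a short filter. This is a sketch under assumptions: the node means, the time-point labels and the function name are hypothetical, and only the two thresholds come from the text above.

```python
def select_nodes(node_means, baseline="day0", min_log2=12.0, min_log2_diff=2.0):
    """Return, per time point, the nodes passing both selection criteria.

    node_means maps node id -> {time point: mean normalised log2 read count}.
    """
    selected = {}
    for node, by_time in node_means.items():
        for timepoint, level in by_time.items():
            if timepoint == baseline:
                continue  # the baseline is only used as the reference
            if level > min_log2 and level - by_time[baseline] > min_log2_diff:
                selected.setdefault(timepoint, []).append(node)
    return {timepoint: sorted(nodes) for timepoint, nodes in selected.items()}


# Hypothetical node-wise means for three nodes at three time points.
node_means = {
    1: {"day0": 9.0, "day3": 12.5, "day7": 10.0},
    2: {"day0": 12.5, "day3": 13.0, "day7": 13.5},
    3: {"day0": 8.0, "day3": 9.0, "day7": 12.4},
}
highly_transcribed = select_nodes(node_means)
```

Node 2 is highly transcribed throughout but is not selected, because it never differs from day 0 by more than 2 log2 units. Nodes 1 and 3 are selected at day 3 and day 7 respectively.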
