protocol
Creating interactive, web-based and data-enriched maps with the Systems Biology Graphical Notation
Astrid Junker1, Hendrik Rohn1, Tobias Czauderna1, Christian Klukas1, Anja Hartmann1 & Falk Schreiber1,2
1Leibniz Institute of Plant Genetics and Crop Plant Research Gatersleben (IPK), Gatersleben, Germany. 2Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle, Germany. Correspondence should be addressed to A.J. ([email protected]).
© 2012 Nature America, Inc. All rights reserved.
Published online 1 March 2012; doi:10.1038/nprot.2012.002
The Systems Biology Graphical Notation (SBGN) is an emerging standard for the uniform representation of biological processes and networks. By using examples from gene regulation and metabolism, this protocol shows the construction of SBGN maps by either manual drawing or automatic translation using the tool SBGN-ED. In addition, it discusses the enrichment of SBGN maps with different kinds of -omics data to bring numerical data into the context of these networks in order to facilitate the interpretation of experimental data. Finally, the export of such maps to public websites, including clickable images, supports the communication of results within the scientific community. With regard to the described functionalities, other tools partially overlap with SBGN-ED. However, currently, SBGN-ED is the only tool that combines all of these functions, including the representation in SBGN, data mapping and website export. This protocol aims to assist scientists with the step-by-step procedure, which altogether takes ~90 min.
INTRODUCTION
The continuous generation of increasingly large-scale data sets in systems biology results in an urgent need for better visualizations of complex and interwoven biological data in the form of biological networks. This demand prompted the development of network languages, exchange formats and computational tools. To meet one basic requirement of large-scale network visualization, namely uniformity, the SBGN has been developed in the context of a large consortium by scientists of different fields including computer science, medical science, biochemistry and biology1 (http://www.sbgn.org/; Box 1). Similar to wiring diagrams in electrical engineering, SBGN allows the unambiguous representation of networks by using a limited number of easily recognizable symbols. The three different languages covered by SBGN (Box 1) enable this notation to represent any kind of biological network, such as metabolic, regulatory, signaling and interaction networks2–5, at different levels of granularity (see Box 2 for the controlled vocabulary). To simplify the map drawing and editing process, and to increase the applicability of SBGN among the scientific community, several tools have been developed (Table 1). In most cases, standard network visualization tools such as CellDesigner6, Cytoscape (using the BiNoM plug-in7), PathVisio8 or VANTED (using the SBGN-ED add-on9) have been extended to support the representation of networks in the SBGN format (for reviews, see refs. 10,11). However, only some of these tools offer the possibility of drawing SBGN maps (Table 1), and this is mostly restricted to only one sublanguage and/or a limited set of SBGN glyphs.
SBGN-ED9 (http://www.sbgn-ed.org/), an add-on for the VANTED2.0 system (http://www.vanted.org/; Box 3), provides full support for all three sublanguages of SBGN and all corresponding glyphs, with the Edinburgh Pathway Editor12 being the only alternative with similar functionalities (Table 1). SBGN-ED allows the import of network data from various sources and in different file formats (Box 4), as well as manual drawing of custom-kind networks, which have been inferred from literature or experimental data. Furthermore, it offers the possibility to validate all types of SBGN maps, including additional information about violations of the SBGN specification, and to translate pathway maps from the
KEGG13 database into SBGN style. By using SBGN-ED, it is possible to enrich SBGN maps with numerical data derived from wet-lab experiments or with useful additional information that may be provided via web links to other maps or any other web resources, such as other database entries. Such data-mapping functionalities are also implemented in other tools such as CellDesigner6 or PathVisio8. Besides the well-known static image export, SBGN-ED enables scientists to automatically export maps into an HTML-based website as clickable images. The website can easily be uploaded onto a server, and users may visit it using standard web browsers from anywhere in the world. The publishing process is therefore abstracted from technical details such as writing HTML and designing a website layout, which are usually unfamiliar to life scientists. Currently, SBGN-ED is the only tool that fulfills all the necessary requirements to achieve the anticipated results of this protocol. The overall procedure described here takes about 70–90 min. This time largely depends on the type and size of the SBGN map the user wants to create. A few general limitations of SBGN-ED exist that may prolong the procedure. The performance of SBGN-ED while handling large maps (several hundreds to thousands of nodes) is dependent on the PC hardware. SBGN-ED is able to process genome-size gene expression data sets, although processing times increase with the size of the data sets. The mapping of experimental data on SBGN maps depends on common identifiers, which relate SBGN glyphs to entries in the data set. It might be tedious to create a mapping table if mapping information is unknown and cannot be found in public databases. Another limitation of the protocol is related to the exchange of data-enriched SBGN maps. At the moment, only SBGN-ED is able to restore enriched maps correctly with the text-based graph markup language (GML) format.
SBGN maps without mapped data might be exported by using the SBGN markup language (LibSBGN, http://libsbgn.sourceforge.net/) format, which is also supported by other network visualization tools. SBGN represents a standard nomenclature in biology, which will be routinely applied in network and pathway visualization. Therefore, this protocol aims to assist scientists with the step-by-step procedure of SBGN map drawing with SBGN-ED, enriching these maps with experimental data from various sources and exporting the results on websites. The protocol has been
nature protocols | VOL.7 NO.3 | 2012 | 579
Box 1 | SBGN
The ‘Repressilator’ example31 is used in the following to illustrate the difference between the three sublanguages of SBGN. SBGN maps are adapted from http://www.sbgn.org/Documents/Examples.

Graph Trinity

PD: Process description language
Essence: change
Level of detail: high
It represents the transitions of entities from one form or state to another.
Features: unambiguous, mechanistic, sequential, combinatorial explosion
Description: The transcript (mRNA) of each of the three components of the repressilator system (LACIm, Tetm, Clm) necessarily stimulates a process (translation) leading to the formation of the corresponding protein (LACIp, Tetp, Clp). Each of these three proteins acts as an inhibitor of a process (transcription) leading to the mRNA of one of the other components, thereby constituting an inhibitory circle.
[PD map of the repressilator: image not reproduced.]

ER: Entity relationship language
Essence: influence
Level of detail: medium
It represents the influences of entities upon the behavior of others.
Features: unambiguous, mechanistic, nonsequential
Description: If the transcript of each component (LACIm, Tetm, Clm) exists, the state-variable ‘existence’ is assigned to the corresponding protein (necessary stimulation). The existence of the protein (LACIp, Tetp, Clp), in turn, lowers the probability that the state-variable ‘existence’ is assigned to the transcript of the next component (inhibition).
[ER map of the repressilator: image not reproduced.]

AF: Activity flow language
Essence: activity flow
Level of detail: low
It represents the activity flow from one entity to another or within the same entity.
Features: ambiguous, conceptual, sequential
Description: In principle, AF is an abstract representation of the processes shown in PD. The activities of LACIm, Tetm and Clm lead to the activities of LACIp, Tetp and Clp, respectively. These, in turn, decrease the activity of the next component.
[AF map of the repressilator: image not reproduced.]
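The activity flow description in Box 1 can be written down as a small edge list. The following is only a sketch under stated assumptions: the arc names `positive influence` and `negative influence`, the function names, and the reading of "next component" as the next entry in the order LACI, Tet, Cl are ours and are not taken from SBGN-ED.

```python
# Hypothetical sketch of the repressilator at activity-flow (AF) level.
# Component labels follow Box 1; everything else is an illustrative assumption.
COMPONENTS = ["LACI", "Tet", "Cl"]


def repressilator_af_arcs(components):
    """Return (source, influence, target) triples for the AF view."""
    arcs = []
    count = len(components)
    for index, name in enumerate(components):
        # Activity of the transcript leads to activity of its protein.
        arcs.append((name + "m", "positive influence", name + "p"))
        # The protein decreases the activity of the next component's transcript.
        following = components[(index + 1) % count]
        arcs.append((name + "p", "negative influence", following + "m"))
    return arcs


def walk_cycle(arcs, start):
    """Follow the single outgoing arc of each node until the start recurs."""
    successor = {source: target for source, _, target in arcs}
    path = [start]
    node = successor[start]
    while node != start:
        path.append(node)
        node = successor[node]
    return path


arcs = repressilator_af_arcs(COMPONENTS)
cycle = walk_cycle(arcs, "LACIm")
inhibitions = sum(1 for _, influence, _ in arcs if influence == "negative influence")
```

Walking the arcs from any transcript visits all six nodes before returning to the start, which is the inhibitory circle that Box 1 describes in words.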
Box 2 | Glossary
Arc: network edge.
Glyph: network node.
Layout: positions of nodes and edges in 2D space; includes node sizes.
Map: visual representation of a network.
Network: collection of nodes and connecting edges. In the biological context, different networks exist, such as regulatory networks or metabolic networks. The visual representation of a network is referred to as a map.
Topology: structure of the network by defining all nodes and connecting edges with optional labels. (Note: SBGN-ED does not support hyper-graphs.)
Visual properties: color, gradients, line thickness, label font style and size.
applied to the representation of gene regulatory and metabolic networks using SBGN2,14, to the mapping of metabolic data on SBGN networks14 and to the publication of SBGN style maps on websites2. These uses exemplify the potential applications of this protocol, which can be categorized into three areas, as follows.
(i) Unambiguous visualization: SBGN allows the representation of all kinds of networks and all kinds of -omics data in the context of these networks. (ii) Gain of biological knowledge: data mapping gives insights into novel biological relationships, such as the correlation of different data types (metabolite concentrations,
Table 1 | Comparison of pathway visualization tools that support SBGN notation.
Name | Ref. | Access | Availability | PD | ER | AF | SBGN validation | Mapping | Website export
SBGN-ED | 9 | http://sbgn-ed.org/ | F | √ | √ | √ | √ | √ | √
BiNoM | 7 | http://bioinfo-out.curie.fr/projects/binom/ | F | √ | - | - | - | - | -
BioUML | - | http://www.biouml.org/ | F | √ | - | - | - | - | -
CellDesigner | 6 | http://www.celldesigner.org/ | F | √ | - | - | - | √ | -
Edinburgh Pathway Editor (EPE) | 12 | http://epe.sourceforge.net/ | F | √ | √ | √ | - | - | -
PathVisio | 8 | http://www.pathvisio.org/ | F | √ | - | - | - | √ | √
PathwayLab | - | http://www.innetics.com/ | C | √ | - | - | - | - | √
Arcadia | 21 | http://arcadiapathways.sourceforge.net/ | F | √ | - | - | - | - | -
Athena | 22 | http://athena.codeplex.com/ | F | √ | - | - | - | - | -
BIOCHAM | 23 | http://contraintes.inria.fr/BIOCHAM/ | F | √ | - | - | - | - | -
JWS Online | 24 | http://jjj.biochem.sun.ac.za/help.html | F | √ | - | - | - | - | -
Mayday | 25 | http://www-ps.informatik.uni-tuebingen.de/mayday/wp/ | F | √ | - | - | - | √ | -
Netbuilder’ | 26 | http://strc.herts.ac.uk/bio/maria/Apostrophe/ | F | √ | - | - | - | - | -
SBML Layout Library | 27 | http://sbmllayout.sf.net/ | F | √ | - | - | - | - | -
SubtiPathways | 28 | http://subtiwiki.uni-goettingen.de/subtipathways.html | F | √ | - | - | - | - | -
Systems Biology Metabolic Modeling assistant | 29 | http://cath.gisum.uma.es:8080/sbmm/ | F | √ | - | - | - | - | -
VISIBIOweb | 30 | http://www.bilkent.edu.tr/~bcbi/pvs.html | F | √ | - | - | - | - | -
Abbreviations: C, commercial; F, free; √, supported; -, not supported or not available. The first seven tools in the upper part of the table support the drawing of SBGN maps. All other tools only support the representation of maps in SBGN style.
Box 3 | VANTED technical description
VANTED is a tool for the visualization and analysis of networks with related experimental data. It enables researchers to visualize and edit biological networks, such as metabolic and gene regulatory networks. The nodes may be enriched with different experimental data, such as metabolite concentrations and gene expression rates. The networks can be loaded from the file system or by direct connection to pathway databases such as MetaCrop and RIMAS using the Pathways tab. The networks may be manually or automatically transformed into valid SBGN maps using SBGN-ED. Experimental data enter the system by text files or filled Excel spreadsheet templates found by selecting Experiments → Data Input Templates. The experimental data are mapped to the network nodes and edges by searching for labels that are equal to the substance names in the experimental data. Mapped experimental data are represented inside the nodes or on top of edges by different diagrams, such as line and bar charts or heat maps. Statistical functions and tests, such as detection of outliers, investigation of distribution properties and calculating significance criteria, may be carried out on the mapped experimental data using Tools → Statistics. Various add-ons make use of and extend the described basic functionalities of VANTED, as described in the following table:

Add-on | Description | Link
SBGN-ED | Create and edit all three types of SBGN maps; validate SBGN maps according to the SBGN specifications; translate maps from KEGG pathway database into SBGN; export SBGN maps into several file and image formats | http://www.sbgn-ed.org/
MetaCrop | Browsing of the content of the hand-curated MetaCrop database; species-specific metabolic network models for pathway exploration, data-mapping and analysis, metabolic modeling | http://www.vanted.org/addons/MetaCrop
FBASimVis | Constraint-based analysis of metabolic models; dynamic and visual exploration of metabolic flux data | http://fbasimvis.ipk-gatersleben.de/
DBE2 (Database for Biological Experiments) | Store biological experiment data in a central and safe place; access, share and combine data with other data sets | http://www.vanted.org/addons/DBE2/
HIVE (Handy Integration and Visualization of multimodal Experimental data) | Combination of network-focused Systems Biology approaches with spatio-temporal information; handling of volumes and images, together with a workspace approach; integration of data of different biological data domains | http://www.vanted.org/hive/

Formatting options
SBGN-related graph elements are solely altered using the SBGN-ED tab. All other formatting of graph elements, such as size, line thickness and color, has no explicit meaning in SBGN and may be modified by a double-click (add and change labels, URL and tooltip) or using the Network tab. The sub-tab Graph of the Network tab makes it possible to alter global graph properties, such as diagram visualization options for mapped experimental data. Using the sub-tabs Node and Edge, glyph- and arc-specific properties, such as color, may be changed. Select graph elements in the graph editor view.
(i) Switch to the respective sub-tab (Network → Graph/Node/Edge) to see the current attribute values. (Note: ‘~’ indicates that there are different values for the selected elements.)
(ii) Modify values. Note: if the selection is changed before applying, all changes will be lost.
(iii) Click on Apply Changes to pass the changes to the selected graph elements. Note: please do not alter the shape attribute, as this will result in invalid SBGN maps. Use the SBGN-ED tab instead.
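Box 3 states that experimental data are mapped by searching for node labels that equal the substance names in the data. The sketch below illustrates that matching rule only; it is not VANTED code. The function name, the tab-delimited layout with one header row of conditions, and all numerical values are hypothetical.

```python
import csv
import io


def map_data_to_nodes(node_labels, table_text):
    """Attach each data row to the node whose label equals the row name.

    Returns (mapped, unmatched): mapped is {label: {condition: value}},
    unmatched lists substance names with no node of the same label.
    """
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    header = next(reader)
    conditions = header[1:]  # first column holds the substance name
    wanted = set(node_labels)
    mapped, unmatched = {}, []
    for row in reader:
        if not row:
            continue
        name, values = row[0], row[1:]
        if name in wanted:  # exact, case-sensitive label match
            mapped[name] = dict(zip(conditions, (float(v) for v in values)))
        else:
            unmatched.append(name)
    return mapped, unmatched


# Hypothetical example data; the values carry no biological meaning.
example = (
    "substance\tcontrol\ttreated\n"
    "citrate\t1.2\t2.4\n"
    "malate\t0.8\t0.5\n"
    "unknownX\t3.0\t3.1\n"
)
mapped, unmatched = map_data_to_nodes(["citrate", "malate", "fumarate"], example)
```

Because the match is an exact string comparison, a row such as `unknownX` stays unmapped until a mapping table relates it to a glyph label, which is the situation the Introduction describes as potentially tedious.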
enzyme activities and transcript abundances of enzyme-coding genes) in the context of metabolic networks. (iii) Publication and information exchange: the export of SBGN-enriched maps in different formats (static images, editable network files, websites) serves different scientific purposes, such as the interactive exploration of scientific results and their communication within the biological community or in journal publications or scientific presentations.
Box 4 | File formats in SBGN-ED
SBGN-ED supports various file input and output formats in order to exchange data with other applications. Networks enter the system by standardized graph formats, whereas experiment data enter the system by Excel templates or text files. The enriched maps may be exported in various formats for exchange, documentation and publication purposes.

Network input formats
1. Graph markup language (GML) is the main exchange format of SBGN-ED and is also supported by various other tools. It supports all necessary graph attributes, such as topology, layout, visual properties, (URL-) links, experiment data and charting attributes. GML is a text-based file format, which nests key-value pairs, and is therefore difficult to edit manually without SBGN-ED or other tools.
2. GraphML is the XML-based graph-exchange version of GML; it supports all necessary graph attributes as well. As with GML, it is difficult to edit manually and is supported by various other tools.
3. KEGG markup language (KGML) is an XML-based exchange format of the KEGG pathway maps and is difficult to edit manually. It supports topology, layout, colors and (URL-) links. Further attributes such as experiment data, label styles and so on are not represented. KGML files serve in SBGN-ED as a source for topology, layout and (URL-) links.
4. Systems biology markup language (SBML) is an XML-based exchange format for representing biochemical models, and is therefore widely supported by other tools. The SBML loader of SBGN-ED supports only topology and some SBML attributes such as ‘role’, ‘stoichiometry’ and ‘compartment’. All other SBML attributes are neglected at the moment, but we are working on implementing full SBML support. The layout and visual properties must be adapted by hand. As in KGML, SBML files serve as a source for topology.
5. Simple interaction file (SIF) is a simple text-based format, representing only graph topology, and it is used by other tools, such as Cytoscape. Such files can be created and edited manually, e.g., by using Excel or text editors. Therefore, it is easy to create large graphs by hand and to add visual properties in SBGN-ED afterward. The text file contains a space-delimited table with one or three columns. Each row has one of the following structures:
nodeA interactionType nodeB
or
nodeC
The first form results in two nodes with the labels ‘nodeA’ and ‘nodeB’, connected by a directed edge from ‘nodeA’ to ‘nodeB’ with the label ‘interactionType’; the second form results in a single node with the label ‘nodeC’.
6. SBGN markup language (SBGN-ML) is an XML-based file format for the exchange of SBGN maps. Currently, only SBGN PD maps can be loaded (if SBGN-ED is available). So far, the file format supports only the exchange of basic information (type of glyphs and arcs, position and size of glyphs, path of arcs). Style information cannot be exchanged at the moment.
7. Various other formats: GraphViz (DOT), eXtensible graph markup and modeling language (XGMML), Pajek (NET), text files (TXT, similar to SIF), and so on.

Experiment data input formats
1. Excel template (XLS/XLSX): templates for primary experiment data input can be found in Experiment → Data Input Templates. These templates provide a structured mask to enter experiment data of various types. The upper part consists of metadata concerning the experiment, such as experiment name and coordinator and a global list of conditions. The lower part lists all raw data in a matrix, consisting of measured substances (columns) and conditions/time points (rows). These matrices may be extended to accommodate large data sets, but the basic structure of the template has to be retained (do not shift/alter gray cells or add/delete rows and columns). Such templates may be loaded in SBGN-ED without further interactions. The size of the template is limited by the maximal number of rows and columns in Excel sheets (XLS: 256 × 65,000 cells, XLSX: 16,000 × 1,000,000 cells). For examples see Figures 5 and 8.
2. Excel spreadsheet (XLS) or delimited text file (CSV/TXT) is a less structured way of loading experiment data, which is useful for huge data sets, such as gene expression data, and can be easily created manually. The file contains a matrix, where each row represents the name (e.g., the gene ID) and one or more raw data values of this substance. The first row in the matrix represents a header, assigning each raw data value to a condition. The columns are tab delimited for text files. When loading such files, the user may transpose the content and is asked for metadata, such as experiment name and coordinator. Note that each column in the file indicates a different condition without any time resolution. The time points have to be specified manually using the dialog, or using the Excel template in the first place.

Main output formats
SBGN-ED supports various export functionalities. As GML and GraphML are the graph formats that support all attributes, including the experiment data, we recommend using these formats to save all working results. To exchange maps with other tools supporting SBGN, a map can also be stored in SBGN-ML format. The graph visualization may be exported as high-quality raster images in the PNG or JPG format. For PNGs, there is also an option to create a clickable HTML map, which can be viewed by any browser and enables users to follow (URL-) links connected to nodes. Vector graphic images such as SVG and PDF are also supported. The PPT export is still experimental, but enables users to export most graphical SBGN features excluding charting. A set of graphs may also be exported as a website, using the clickable HTML map functionality. This is explained in detail in Steps 11–19.
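The SIF rules in Box 4 (space-delimited rows with either one or three columns) are simple enough to sketch as a small reader. This is an illustration of the format as described above, not the SBGN-ED loader; the function name `parse_sif` and the choice to reject rows with any other column count are assumptions.

```python
def parse_sif(text):
    """Parse SIF text into (nodes, edges).

    nodes: list of labels in order of first appearance.
    edges: list of (source, interaction_type, target) directed edges.
    """
    nodes, edges = [], []

    def add_node(label):
        if label not in nodes:
            nodes.append(label)

    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) == 1:
            # Single column: an isolated node.
            add_node(fields[0])
        elif len(fields) == 3:
            # Three columns: a directed, labeled edge between two nodes.
            source, interaction, target = fields
            add_node(source)
            add_node(target)
            edges.append((source, interaction, target))
        else:
            raise ValueError(
                f"line {line_number}: expected 1 or 3 columns, got {len(fields)}"
            )
    return nodes, edges


nodes, edges = parse_sif("nodeA interactionType nodeB\nnodeC\n")
```

For the two example rows from Box 4, this yields three nodes and one directed edge from ‘nodeA’ to ‘nodeB’ labeled ‘interactionType’.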
Box 5 | The RIMAS information portal
RIMAS (Regulatory Interaction Maps of Arabidopsis Seed Development) is a web-based information portal. It is open source and available at http://rimas.ipk-gatersleben.de/. The portal provides access to four detailed SBGN-based network maps that describe the interactions between LEC1/AFL-B3 transcription factors and maturation gene promoter elements, hormonal pathways and epigenetic processes. Furthermore, nodes and edges of the gene regulatory networks are linked to literature databases and to corresponding TAIR entries in order to obtain detailed information about practical methods or gene models. A collection of prior reviews covering seed development in Arabidopsis is also available. SBGN maps may be exported in common exchange formats (such as GML), allowing for the modification of network layouts to suit users’ requirements. Alternatively, .GML network files are accessible in SBGN-ED, with all possibilities for network extension or enrichment by data mapping.
Figure 1 | Workflow of the creation of interactive, web-based and data-enriched maps using SBGN. Top, SBGN maps are created on the basis of raw data or imported from various sources and can be translated manually or automatically into valid SBGN notation. Orange arrow, regulatory network; violet arrow, metabolic network; blue arrow, any kind of SBGN network may be exported as a website. Middle, these SBGN maps are enriched, bringing experimental data of various sources into context of the SBGN map. Expression data from a publicly available database are mapped on a regulatory network (orange). Metabolite data derived from literature are mapped on a metabolic network (violet). Any kind of a researcher’s own experimental data might be used for data mapping. Bottom, a set of such (enriched) maps are exported and published as an interactively explorable website.
[Figure 1 workflow labels: Creation of SBGN maps (Steps 1–9) by manual drawing from literature or unconnected raw data (Steps 9A(i–xxiv)), by import from databases in SBGN notation such as MetaCrop (Steps 9B(i–iii)), or by automatic translation from other databases such as KEGG (Steps 9C(i–vi)), yielding an SBGN network. Data mapping (Step 10) from databases (Steps 10A(i–x)), literature (Steps 10B(i–iii)) or own data, yielding an SBGN network with experimental data. Export as a website (Steps 11–19).]
Experimental design
The workflow described in this protocol is shown schematically in Figure 1. In the first section (Steps 9A(i)–9C(vi)), SBGN-ED is used to create SBGN maps with the focus on two special kinds of networks, a gene regulatory network and a metabolic network. With regard to the former, this protocol exemplifies the drawing of the LEC1/AFL-B3 gene regulatory network as part of the RIMAS information portal2 (Box 5; http://rimas.ipk-gatersleben.de/) using the SBGN Process Description (PD) language (see Fig. 2 for an overview of all glyphs of SBGN PD, adapted from ref. 15). This simple SBGN PD map represents the regulatory interactions of key transcription factors (TFs) in the seed development of Arabidopsis. Regulatory interactions were inferred from extensive literature studies, and useful information about network glyphs or their interactions is attached as web links. Metabolic networks are imported from the MetaCrop biochemical database16 (http://metacrop.ipk-gatersleben.de/), which is directly accessed by SBGN-ED, and from the KEGG database13. MetaCrop offers SBGN PD style maps, whereas for the non-SBGN style metabolic networks obtained from KEGG, this protocol describes the automatic translation into SBGN PD using SBGN-ED. In the second section (Steps 10A(i)–10B(iii)), this protocol explains how to enrich the above-introduced regulatory and metabolic networks with gene expression and metabolite data, respectively. Different data sources are conceivable, such as databases (e.g., Genevestigator17 for large-scale expression datasets), literature reports and your own experimental data. According to individual purposes, different kinds of -omics data may be mapped and visualized in the context of user-defined networks. The third section (Steps 11–19) of this protocol delineates the export of SBGN maps as clickable images on websites for interactive exploration by the entire biologist community. For a better understanding of the whole step-by-step procedure, the reader can refer to the Supplementary Tutorial.
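To make the idea of a clickable image concrete, the following sketch builds a standard HTML image map in which each node rectangle links to a URL. It only illustrates the general technique; the markup that SBGN-ED actually writes may differ, and the function name, dictionary keys, file name and coordinates are hypothetical.

```python
from html import escape


def image_map_html(image_file, map_name, nodes):
    """Return an <img> plus <map> fragment with one rectangular area per node.

    Each node is a dict with pixel position (x, y), size (w, h), a label
    and the URL to follow when the node is clicked.
    """
    lines = [
        f'<img src="{escape(image_file)}" usemap="#{escape(map_name)}" alt="SBGN map">',
        f'<map name="{escape(map_name)}">',
    ]
    for node in nodes:
        left, top = node["x"], node["y"]
        right, bottom = left + node["w"], top + node["h"]
        href = escape(node["url"])    # escape &, <, > and quotes for HTML
        label = escape(node["label"])
        lines.append(
            f'  <area shape="rect" coords="{left},{top},{right},{bottom}" '
            f'href="{href}" alt="{label}" title="{label}">'
        )
    lines.append("</map>")
    return "\n".join(lines)


# Hypothetical node geometry; the URL is the RIMAS portal named in the text.
fragment = image_map_html(
    "lec1_network.png",
    "lec1_network",
    [{"label": "LEC1", "x": 10, "y": 20, "w": 80, "h": 40,
      "url": "http://rimas.ipk-gatersleben.de/"}],
)
```

Writing `fragment` into an HTML page next to the exported PNG gives a browser-viewable map in which clicking inside a node rectangle opens the attached link.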
MATERIALS
EQUIPMENT
Software and hardware requirements
• PC (at least a Pentium processor or an equivalent; minimum 1 GB RAM, or 2 GB RAM for the 64-bit version; screen resolution of at least 1,024 × 768 pixels)
• Java SE Runtime Environment 6 (http://www.java.com/en/download/index.jsp) in the 32-bit or 64-bit version on your PC. (Note that Mac OS X automatically supports Java. This tool is operating-system independent and has been tested on Windows XP (32 bit), Windows Vista (32 bit), Windows 7 (32 bit and 64 bit), Mac OS X (10.5 and 10.6, 64 bit) and Ubuntu Linux (10 and 11, 32 bit and 64 bit).)
• Excel software (Microsoft)
PROCEDURE
Downloading and installation ● TIMING 5–10 min
1| The download and installation of VANTED can be achieved using either option A (to download VANTED directly) or option B (using Java Web Start).
[Figure 2 image: overview chart of the SBGN PD glyphs, grouped into entity pool nodes (EPN), process nodes, connecting arcs, container nodes (CN), logical operators, reference nodes and auxiliary units; image not reproduced.]
Figure 2 | The SBGN Process Description (PD) notation. This overview represents all SBGN glyphs used by the PD language. Glyphs are classified into the following categories: ‘entity pool nodes (EPNs)’, ‘process nodes (PNs)’ and ‘connecting arcs’ as basic elements of a process; ‘reference nodes’ and ‘container nodes (CN)’ as means of encapsulation of EPNs; ‘auxiliary units’ as annotations of EPNs and ‘logical operators’ as means of combining effects of different EPNs. Adapted from reference 15.
(A) Download VANTED directly
(i) Download and install VANTED from http://sourceforge.net/projects/vanted/files/vanted2/v2.0/. For Windows, download the setup file vanted2.0.exe, and for Mac OS X download the installer image vanted2.0.dmg. A system-independent ZIP file vanted2.0.zip is also available for download.
(B) Use Java Web Start
(i) For operating systems with only 32-bit Java support run start2.0.jnlp at http://vanted.ipk-gatersleben.de/v2/, and for operating systems with 64-bit Java support run start2.0_64bit.jnlp. More information can be found on the VANTED homepage (http://vanted.ipk-gatersleben.de/index.php?file=doc2.html).
2| To download and install SBGN-ED, first start VANTED.
3| Go to the side panel Help → Settings.
4| Click Install/Configure Add-ons; the ‘Add-on Manager’ window will be shown.
5| Click Find Add-ons/Updates.
6| Search for SBGN-ED and click Install Add-on.
7| Click OK.
8| Close the Add-on Manager window. Alternatively, a manual method of installation is described on the SBGN-ED homepage (http://www.sbgn-ed.org/download_installation.html). The SBGN-ED desktop is shown in Figure 3.
Creation of SBGN maps
9| Maps can either be manually drawn (option A), or directly accessed and imported (for example, from the MetaCrop biochemical database; option B), or can be obtained by translating non-SBGN style networks (e.g., from the KEGG database; option C) into SBGN format using SBGN-ED.
protocol (A) Manual drawing of SBGN networks ● TIMING 15–30 min (i) Select File→New. (ii) Select the SBGN-ED tab. (iii) Select the PD sub-tab. (iv) Select Macromolecule glyph from the Entity Pool Node (EPN) field. CRITICAL STEP The tooltip provides information about the corresponding glyph. SBGN-ED drawing mode is activated automatically (Fig. 3). (v) Click on four different positions in the drawing area to add four Macromolecule glyphs to the map. (vi) Switch to Selection mode (Fig. 3). (vii) Add a label to each Macromolecule glyph by selecting the glyph and entering a label in the text field Glyph Label. Click Apply Changes or press Enter key. (viii) Add unit of information to each Macromolecule glyph by selecting the glyph and clicking the Add button beside the label Unit of Information in the panel Auxiliary Units. Enter text in the text field (the label of the Unit of Information) and select a position using the dropdown box (12 different positions on the border of the glyph can be chosen). Click Apply Changes. To change the position of an auxiliary unit, double-click on the glyph; in the window that appears, you will be able to change all attributes of the node (node label and attributes of all attached auxiliary units, including the position). (ix) To layout the glyphs according to contextual purposes, select a glyph and hold the mouse button, and then move the glyph to the desired position. (x) Switch to Drawing mode. (xi) Select Omitted Process glyph from the Process Node (PN) field (Fig. 3). (xii) Click on eight different positions in the drawing area to add eight Omitted Process nodes. (xiii) Select Sink and Source glyph from the EPN field (Fig. 3). (xiv) Click on eight different positions in the drawing area to add eight Sink and Source nodes. CRITICAL STEP By default, this glyph appears comparatively large. The size of the glyphs can be adapted by clicking Network→Node→Node Attributes→Size. (xv) Switch to Selection mode. 
(xvi) To layout the Process, and Sink and Source glyphs according to contextual purposes, select a glyph and hold the mouse button, and then move the glyph to the desired position. (xvii) Switch to Drawing mode. (xviii) Select the Stimulation glyph from the Connecting Arcs field in the PD sub-tab. (xix) To connect Macromolecule glyphs and Omitted Process nodes by Stimulation arcs, click on the center of a Macromolecule glyph (arc appears) and then click on the center of an Omitted Process node (arc connects both glyphs). ? TROUBLESHOOTING (xx) Select the Consumption arc glyph from the Connecting Arcs field in the PD sub-tab. (xxi) To connect Sink and Source glyphs, and Omitted Process nodes by Consumption arcs, click at the center of a Sink and Source glyph (arc appears) and then click at the center of an Omitted Process node (arc connects both glyphs). (xxii) Select the Production Arc glyph from the Connecting Arcs field in the PD sub-tab. (xxiii) To connect Omitted Process nodes and Macromolecule glyphs by Production arcs, click on the center of an Omitted Process node (arc appears) and then click on the center of a Macromolecule glyph (arc connects both glyphs). (xxiv) To introduce bend points in connecting arcs, switch to Selection mode, click on the connecting arc and hold the mouse button; a bend point will be generated automatically, which can then be moved to the desired position. Figure 4 shows the resulting LEC1/AFL-B3 gene regulatory network. The corresponding .SBGN file is given in Supplementary Data 1. CRITICAL STEP To remove bend points, click at the bend point, hold the mouse button and move the bend point to the center of the connected node. Alternatively, you might select the edge and choose the main menu entry Edges→Bends→ Remove Bends. (xxv) To add web links to glyphs, double-click onto a glyph and enter any URL in the tooltip area. 
CRITICAL STEP Web links might refer to databases containing different information such as literature or gene locus details.
(B) Import of SBGN networks ● TIMING 3–5 min
(i) To load a MetaCrop pathway map, click the Pathways tab and choose MetaCrop pathways.
(ii) Press Get list of pathways to get a tree view of the pathways available in MetaCrop.
(iii) Select the desired pathway(s) and use the Download selected pathways button, or double-click, to download and view MetaCrop pathways. The .SBGN file for the example of the metabolic network of the tricarboxylic acid (TCA) cycle is given in Supplementary Data 2.
CRITICAL STEP Manual drawing of the metabolic TCA cycle network would take an inexperienced user of SBGN-ED about 20–30 min. If it is necessary to include compartment information, first draw all glyphs of the metabolic network beside the compartment glyph and then move them ‘inside’ this containing node. The glyph drawn last is always on top. If certain glyphs occur more than once in the network, it is necessary to add a clone marker. First, select the corresponding glyphs and go to the SBGN-ED sub-tab PD. Check the Add box for the Clone Marker in the Auxiliary Units field. Click Apply Changes.
? TROUBLESHOOTING
(C) Automatic translation of non-SBGN style networks ● TIMING 2–3 min
(i) To load a KEGG pathway map, go to http://www.genome.jp/kegg/pathway.html and select the desired pathway from the list (scroll down to view the complete list). The reference pathway will be loaded.
(ii) In the pull-down menu below the pathway description, choose your model species or the reference pathway (enzymes are represented by either KEGG orthology identifiers or EC numbers) and press Go.
(iii) After the reference pathway is loaded, press Download KGML in the top menu and save the pathway.
(iv) Open the pathway in SBGN-ED.
CRITICAL STEP For further handling, it might be useful to concentrate on KEGG reference pathways (KEGG orthology). Missing knowledge about organism-specific pathways may lead to incomplete SBGN maps after translation. In the further course of this protocol, metabolic pathways will be used for mapping of data from metabolite concentration and enzyme activity measurements. Therefore, only KEGG pathways of the group ‘metabolism’ should be chosen.
(v) To translate the KEGG pathway map into SBGN, choose the SBGN-ED→Tools sub-tab. Press Translate KEGG to SBGN. The SBGN map will appear in the graph editor view (Fig. 3).
CRITICAL STEP The .KGML files of some KEGG maps contain inconsistencies that may make it necessary to manually lay out and correct the derived SBGN maps.
(vi) To validate the translated SBGN map, click Validate Map in the Validation panel. Invalid nodes and edges will be marked in red.
Additional information about single invalid glyphs is given in the status line below the window when moving the mouse over the invalid part. Change nodes and edges accordingly or read the specifications for more detailed information (http://www.sbgn.org/Documents/Specifications). CRITICAL STEP Validation annotations may be deleted by clicking on Remove Validation Annotation. ? TROUBLESHOOTING
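Because inconsistencies in a KGML file carry over into the translated map (Step 9C(v)), it can help to look at the file before opening it in SBGN-ED. The following Python sketch is an illustration only and is not part of SBGN-ED. It assumes the usual KGML layout (a pathway root element with entry and reaction children, and substrate and product children inside each reaction). The function name summarize_kgml, the shortened sample file and the reaction rn:R00000 are hypothetical. The sketch counts entries by type, lists compounds that occur more than once (candidates for a clone marker) and lists reactions that lack a substrate or a product.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily shortened KGML fragment used only for illustration.
SAMPLE_KGML = """<pathway name="path:ko00020" org="ko" number="00020" title="Citrate cycle (TCA cycle)">
  <entry id="1" name="ko:K01647" type="ortholog" reaction="rn:R00351"/>
  <entry id="2" name="cpd:C00036" type="compound"/>
  <entry id="3" name="cpd:C00158" type="compound"/>
  <entry id="4" name="cpd:C00036" type="compound"/>
  <reaction id="1" name="rn:R00351" type="irreversible">
    <substrate id="2" name="cpd:C00036"/>
    <product id="3" name="cpd:C00158"/>
  </reaction>
  <reaction id="5" name="rn:R00000" type="reversible">
    <substrate id="4" name="cpd:C00036"/>
  </reaction>
</pathway>
"""


def summarize_kgml(kgml_text):
    """Return a small summary of a KGML document given as a string."""
    root = ET.fromstring(kgml_text)
    entries = root.findall("entry")

    # How many entries of each type (compound, ortholog, enzyme, ...) exist?
    entry_types = Counter(e.get("type") for e in entries)

    # Compounds drawn more than once would need a clone marker in SBGN.
    compound_names = Counter(
        e.get("name") for e in entries if e.get("type") == "compound"
    )
    repeated = sorted(n for n, count in compound_names.items() if count > 1)

    # Reactions without a substrate or without a product are likely to
    # need manual correction after translation.
    incomplete = sorted(
        r.get("name")
        for r in root.findall("reaction")
        if r.find("substrate") is None or r.find("product") is None
    )
    return {
        "title": root.get("title"),
        "entry_types": dict(entry_types),
        "repeated_compounds": repeated,
        "incomplete_reactions": incomplete,
    }


summary = summarize_kgml(SAMPLE_KGML)
for key, value in summary.items():
    print(f"{key}: {value}")
```

For a downloaded file, the same function can be called on the file content read as text. The checks are deliberately simple and do not replace the validation in Step 9C(vi).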
Figure 3 | Screenshot of the SBGN-ED desktop. (1) side panel, (2) graph editor view, (3) toolbar, (4) main menu, (5) status bar, (6) selection mode and (7) drawing mode.
Figure 4 | Regulatory interactions between four master regulators of Arabidopsis seed development: LEC1, LEC2, FUS3 and ABI3. LEC1 and LEC2 each control the respective three remaining factors. FUS3 and ABI3 operate as autoregulatory feedback loops. mt, material type; prot, protein.
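The map drawn in Step 9A can also be written down as data, which makes the glyph counts easy to check. The following Python sketch is an illustration only; it is plain Python, not an SBGN-ED or SBGN file API, and the class name Process and its fields are hypothetical. It encodes the regulatory structure stated in the Figure 4 caption, assuming that each omitted process consumes from its own Sink and Source glyph, produces one macromolecule and is stimulated by one TF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    """One omitted process with its three attached arcs."""
    stimulator: str               # macromolecule at the start of the stimulation arc
    product: str                  # macromolecule at the end of the production arc
    source: str = "sink/source"   # glyph at the start of the consumption arc


FACTORS = ["LEC1", "LEC2", "FUS3", "ABI3"]


def build_network():
    processes = []
    # LEC1 and LEC2 each control the three remaining factors.
    for regulator in ("LEC1", "LEC2"):
        for target in FACTORS:
            if target != regulator:
                processes.append(Process(stimulator=regulator, product=target))
    # FUS3 and ABI3 operate as autoregulatory feedback loops.
    for factor in ("FUS3", "ABI3"):
        processes.append(Process(stimulator=factor, product=factor))
    return processes


network = build_network()
macromolecules = {p.stimulator for p in network} | {p.product for p in network}

print(len(macromolecules), "Macromolecule glyphs")
print(len(network), "Omitted Process nodes, each with one Sink and Source glyph")
for p in network:
    print(f"{p.stimulator} --stimulation--> [omitted process] --production--> {p.product}")
```

Under these assumptions the sketch yields four macromolecules and eight processes, which agrees with the numbers of glyphs added in Steps 9A(v), 9A(xii) and 9A(xiv).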
Mapping
10| Different kinds of -omics data can be mapped onto networks and visualized according to individual requirements; see option A for enriching a regulatory network with gene expression data and option B for enriching a metabolic network with metabolic measurement data.
(A) Mapping of expression data on the LEC1/AFL-B3 network ● TIMING 20–30 min
(i) To create a template file for the input of experimental data, save the template file from Experiments→Data Input Templates→Experiment Data in SBGN-ED onto your desktop.
CRITICAL STEP The file will automatically be opened in your Microsoft Excel application.
(ii) Enter metadata and measurement data of your experiment (Fig. 5). For example, the data set used in the current protocol contains expression data for four TFs in the LEC1/AFL-B3 network (downloaded from the Genevestigator database (ref. 17)) over nine developmental stages of Arabidopsis seed development. Thus, each developmental stage represents one time point and each gene represents one substance entry. Save the template datasheet after editing. The .xls file of the Excel template containing expression data is given in Supplementary Data 3.
CRITICAL STEP A detailed description of the VANTED template files can be found at http://vanted.org/index.php?file=doc7.html.
(iii) To load your experimental data, go to the Experiments tab, choose Load dataset in the Load Input File panel and choose your template datasheet from Step 10A(ii).
(iv) For mapping purposes, it is necessary to have common identifiers in the network and the experiment data. If there are no common IDs, it is possible to load a mapping table providing alternative IDs (Fig. 6). To do so, select Experiments→Identifier Annotation→Add Alternative IDs and choose your mapping table. This may contain any number of alternative IDs for multiple mappings. You can check your alternative IDs by clicking on the link Alternative Identifiers in the Identifier Annotation field.
An XLS file of the mapping table used in the present example is given in Supplementary Data 4.
(v) Load the map to be enriched (e.g., the manually edited LEC1/AFL-B3 gene regulatory network from Step 9A(xxv)).
(vi) For data mapping, first select all nodes of the network onto which you want to map data, and then click on Perform data mapping in the Experiments tab.
CRITICAL STEP Please note that only selected nodes are used for mapping.
? TROUBLESHOOTING
(vii) In the window that appears, choose the default settings and define the charting style for the representation of the integrated data. Color coding might be a good choice for gene expression data. Click OK.
(viii) The data mapping results (including the number of nodes and edges with mapped data) will be displayed in a new window. Click OK.
(ix) View the data mapping result in the graph editor view and lay out the visual representation according to individual needs in the Network→Graph and Node sub-tabs.
CRITICAL STEP Global features of the mapping can be changed in the Graph sub-tab. Mappings on individual nodes can be altered in the Node sub-tab.
Figure 5 | Screenshot of Excel template file for experimental data sets.
Figure 6 | Screenshot of a mapping table. The first column (A) gives gene identifiers (AGI) as used in the experiment data template. The second column (B) lists the corresponding gene names as used in the SBGN map.
Figure 7 | Manually edited LEC1/AFL-B3 regulatory network of Arabidopsis seed transcription factors enriched by time-resolved expression data. Data were extracted from the Genevestigator database (http://www.genevestigator.com/gv/). Each bar represents the color-coded expression value of the respective gene at one of nine seed developmental stages. mt, material type; prot, protein.
(x) Create a legend that displays the color code by clicking on Mapping→Create legend in the main menu (Fig. 3). A final version of the LEC1/AFL-B3 regulatory network enriched by expression data is shown in Figure 7. The corresponding file is given in Supplementary Data 5.
(B) Mapping of metabolic data on a MetaCrop pathway ● TIMING 20–30 min
(i) Repeat Steps 10A(i–iii) to download the template file and enter metadata and measurement data for your experiment (Fig. 8). In the present example, metabolite concentrations measured in several transgenic lines with inducible or constitutive overexpression of a yeast invertase in Solanum tuberosum (ref. 18) were mapped on a MetaCrop biochemical pathway (TCA cycle). The XLS file of the Excel template containing metabolite data is given in Supplementary Data 6.
(ii) Open the SBGN-style map of the TCA cycle from the MetaCrop database as described in Steps 9B(i–iii).
(iii) Perform the data mapping as described above (Steps 10A(vi–x)); see the result in Figure 9. The corresponding file is given in Supplementary Data 7.
CRITICAL STEP Deselect ‘Create new nodes or edges for measured substances that cannot be mapped’ so as not to visualize data of substances not present in the network. EC numbers (for enzymes) and metabolite names are used as common identifiers; however, it might be necessary to create a mapping table if metabolite names or other IDs are not uniform (as described in Step 10A(iv)).
? TROUBLESHOOTING
Figure 8 | Screenshot of an Excel template with metabolic data for seven different transgenic lines and wild-type Solanum tuberosum plants.
Figure 9 | Metabolic network of the TCA cycle taken from the MetaCrop database, enriched by metabolic measurement data. Metabolic measurement data were taken from ref. 18. Legend (Solanum tuberosum lines): 1, wild type; 2, Alcl-18; 3, Alcl-38; 4, Alcl-3; 5, Alcl-23; 6, Alcl-43; 7, Alcl-34; 8, U-IN2-30. Values are given in relative units.
Exporting ● TIMING 5–10 min
11| Open all maps that should be included in the website by selecting File→Open.
12| Select File→Create Website.
13| In the window that appears, click Select Output Folder to choose an output folder.
CRITICAL STEP Create a new directory, as many files will be created.
14| In the window that appears, add titles for the website and for each map. Click OK.
CRITICAL STEP For practical reasons, it is useful to prepare website and network titles and descriptions in advance. During website generation, just copy the text into the input form.
15| Add a general description for the whole website and a more detailed description for each pathway. Click OK.
16| Specify a website footer, including the name of your institution and a feedback e-mail address. Click OK.
17| Specify the color scheme for all fields on your website (e.g., Header, Navigation, Network Description and Footer). Click OK.
18| The website will be constructed and your browser will automatically display the general view of your new website (Fig. 10). The general view gives a schematic image of each map on the website. Click on a schematic image to reach the detailed view of the map. This mode allows you to view the maps at different zoom levels and to browse through the map, including other linked websites.
19| To change and/or update your website, repeat the procedure described above (Steps 11–18).
CRITICAL STEP Old website files will be overwritten and do not need to be deleted.
Figure 10 | Example website. The overview includes a general introduction and schematically represents all maps with their title. By clicking on a map, the map can be viewed in detail and with different zoom levels. A detailed description for the corresponding map is given, and it is possible to browse through the map and other linked websites.
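Steps 14–16 ask for several pieces of text in sequence, so it can be convenient to collect them in one place before starting the export. The following Python sketch is only a bookkeeping aid under that assumption. It does not call SBGN-ED, and the field names (website_title, general_description, footer, maps) as well as the function name missing_fields are hypothetical. The map titles are those shown in Figure 10; the description text is a placeholder.

```python
def missing_fields(site):
    """Return the names of all text entries that are still empty."""
    missing = []
    # Texts that apply to the whole website (Steps 14-16).
    for key in ("website_title", "general_description", "footer"):
        if not site.get(key, "").strip():
            missing.append(key)
    # One title and one description per map (Steps 14 and 15).
    for number, entry in enumerate(site.get("maps", []), start=1):
        for key in ("title", "description"):
            if not entry.get(key, "").strip():
                missing.append(f"map {number}: {key}")
    return missing


site = {
    "website_title": "Creating interactive, web-based and data-enriched maps "
                     "using the Systems Biology Graphical Notation",
    "general_description": "",  # still to be written
    "footer": "Name of institution, feedback e-mail address",
    "maps": [
        {
            "title": "Regulatory network: LEC1/AFL-B3",
            "description": "Four master regulators of Arabidopsis seed development.",
        },
        {
            "title": "Metabolic network: TCA cycle",
            "description": "",  # still to be written
        },
    ],
}

for name in missing_fields(site):
    print("Still missing:", name)
```

Once the list is empty, the texts can be copied into the input forms one after another.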
? TROUBLESHOOTING
Troubleshooting advice can be found in Table 2.
Table 2 | Troubleshooting table.
Step | Problem | Possible solution
9A(xix) | It is difficult to select a small glyph or an arc with many bend points | Zoom into the region and move all overlapping elements away from the desired element. For arcs with many bend points, it might be useful to delete bend points using the toolbar entry Edges→Bends→Remove Bends
9B(iii) | The map is too large/small | Zoom in/out by using the mouse wheel. Alternatively, you can use the zoom buttons in the toolbar. It is also possible to select any graph elements and choose Zoom: Selected Region in order to zoom to these elements
9C(vi) | Glyphs are not recognized as SBGN during validation | Check the SBGN identity of glyphs and arcs in the Network tab and the Node sub-tab. If the selected node/arc is recognized as an SBGN glyph, the SBGN field appears in the side panel, giving information about the role of this SBGN glyph. If this is not the case, delete the selected glyph and redraw it with the SBGN-ED tab
10A(vi), 10B(iii) | Mapping was not successful; no nodes with mapped data | (1) The glyph labels do not correspond to the substance names of the experimental data; check all labels and substance names for typographical errors or create a mapping table in order to add additional identifiers to the substance names. (2) Incorrect selection; please select all glyphs you want to map data to or deselect all glyphs (data will be mapped to all glyphs). (3) No mapping for this node is available; please specify experimental data in the template for specific substances represented in the map. (4) Incorrect order of identifiers in the mapping table (first column: ID from the experimental data, second column: ID from the map)
General | Mapping data mask the node label | Select glyphs; go to tab Network→Node and select the ‘Inside, top’ alignment position in the Label field panel
General | A certain element cannot be found | Select menu entry Edit→Find and Replace or press Ctrl + F. Select the element type (Node/Edge), type the name of the element in the Find area and press Enter. Matching elements are selected and the zoom will adapt to show these elements
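Several of the causes listed for Steps 10A(vi) and 10B(iii) concern identifiers and can be checked outside the tool before mapping. The following Python sketch is an illustration under assumptions, not SBGN-ED code. The function name check_mapping is hypothetical, the mapping table is taken to be a list of two-column rows in the order described in Table 2 (ID from the experimental data first, ID from the map second), and the gene identifiers in the example are placeholders rather than real AGI codes.

```python
def check_mapping(substance_ids, glyph_labels, mapping_rows=()):
    """Report which substances reach a glyph, which do not, and whether
    the two columns of the mapping table look swapped."""
    labels = set(glyph_labels)

    # Collect alternative IDs per experimental ID (column 1 -> column 2).
    alternatives = {}
    for experiment_id, map_id in mapping_rows:
        alternatives.setdefault(experiment_id, set()).add(map_id)

    mapped, unmapped = {}, []
    for substance in substance_ids:
        candidates = {substance} | alternatives.get(substance, set())
        hits = sorted(candidates & labels)
        if hits:
            mapped[substance] = hits
        else:
            unmapped.append(substance)

    # If every entry of column 1 is a glyph label, the columns are
    # probably in the wrong order.
    first_column = {row[0] for row in mapping_rows}
    swapped = bool(first_column) and first_column <= labels
    return mapped, unmapped, swapped


labels = ["LEC1", "LEC2", "FUS3", "ABI3"]
# Placeholder identifiers, standing in for the AGI codes of Figure 6.
substances = ["AT0G00010", "AT0G00020", "AT0G00030", "AT0G00040"]
table = [("AT0G00010", "LEC1"), ("AT0G00020", "LEC2"), ("AT0G00030", "FUS3")]

mapped, unmapped, swapped = check_mapping(substances, labels, table)
print("mapped:", mapped)
print("unmapped:", unmapped)
print("columns swapped:", swapped)

# The same table with its columns exchanged maps nothing and is flagged.
reversed_table = [(name, agi) for agi, name in table]
mapped_rev, unmapped_rev, swapped_rev = check_mapping(substances, labels, reversed_table)
print("columns swapped (reversed table):", swapped_rev)
```

In this example one substance has no row in the mapping table and is reported as unmapped, which corresponds to cause (1) in Table 2; the reversed table illustrates cause (4). The selection-related cause (2) depends on the state of the editor and cannot be checked this way.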
● TIMING
Steps 1–8, downloading and installation: 5–10 min, depending on your Internet connection speed
Step 9A, manual drawing: 15–30 min
Step 9B, importing: 3–5 min
Step 9C, automatic translation: 2–3 min
Step 10A, mapping expression data: 20–30 min (including data import into the Excel template and the construction of a mapping table)
Step 10B, mapping metabolic data: 20–30 min (including data import into the Excel template and the construction of a mapping table)
Steps 11–19, exporting: 5–10 min (assuming that all text has been previously written and is just copied into the input mask)
ANTICIPATED RESULTS
This protocol describes the generation of maps in SBGN style (PD sublanguage), which provides a uniform representation for various kinds of networks. In the SBGN PD sublanguage, all processes have, in principle, the same composition, and this composition is applied to the different kinds of networks presented in this manuscript. A biochemical process (reaction), as well as a regulatory process, is defined by three major components: one or more sources (substrates), a PN and one or more products. The PN is the target of influences from other entities (effectors) of the network, and the type of influence is represented by a connecting arc. Substrates, products and ‘effectors’ are EPNs.
Example 1: Gene regulatory network (LEC1/AFL-B3 network)
TF proteins in the LEC1/AFL-B3 network (Fig. 4) include the CCAAT-box binding factor LEAFY COTYLEDON1 (LEC1) and the three B3 domain-containing proteins ABSCISIC ACID INSENSITIVE3 (ABI3), FUSCA3 (FUS3) and LEAFY COTYLEDON2 (LEC2). They are represented as macromolecule glyphs, each carrying a unit of information that indicates the material type (e.g., protein). Interactions between the TF proteins were inferred from literature reporting a large number of genetic, molecular genetic and phenotypic screens of the corresponding mutant lines (refs. 19,20). In the present map, TF proteins are shown to influence processes coming from an ‘unspecified source’. This special glyph is particularly useful in the representation of regulatory processes, which require a large number of substrates such as nucleotides for transcription and activated amino acids for translation. The regulatory processes themselves are represented by ‘omitted process’ nodes, which include transcription and translation processes, as both the substrates and products of these processes are proteins. The nature of the influence of a TF protein (effector) on a process is given by connecting arcs. The stimulation arc represents a positive effect of an EPN on the flux of the target process. For example, LEC1 stimulates omitted processes that result in the formation of the ABI3 protein (Fig. 4).
Example 2: Biochemical network (TCA cycle)
Substrates and products of metabolic processes are metabolites that are represented in SBGN by the ‘simple chemical’ glyph. These are connected to the process by ‘production’ and ‘consumption’ arcs, which may contain additional information about the reaction stoichiometry. Enzymes, as reaction catalysts, are represented as macromolecules, analogous to TF proteins in the regulatory network. The EC number is used as the enzyme label because it is a unique identifier. Many metabolites such as reducing equivalents or cofactors occur several times in one metabolic network and therefore need to be marked with a clone marker.
Note: Supplementary information is available via the HTML version of this article.
Acknowledgments We are grateful to J. Hüge and B. Junker for helpful discussions and advice.
AUTHOR CONTRIBUTIONS All authors contributed extensively to the work presented in this paper. A.J. and H.R. wrote the paper. T.C. and C.K. wrote the code. A.H. developed the tutorial. A.H., T.C. and F.S. edited the manuscript. F.S. supervised the project and gave conceptual advice.
COMPETING FINANCIAL INTERESTS The authors declare no competing financial interests.
Published online at http://www.natureprotocols.com/. Reprints and permissions information is available online at http://www.nature.com/reprints/index.html.
1. Le Novère, N. et al. The systems biology graphical notation. Nat. Biotechnol. 27, 735–741 (2009).
2. Junker, A., Hartmann, A., Schreiber, F. & Baumlein, H. An engineer’s view on regulation of seed development. Trends Plant Sci. 15, 303–307 (2010).
3. Caron, E. et al. A comprehensive map of the mTOR signaling network. Mol. Syst. Biol. 6, 453 (2010).
4. Kierzek, A.M., Zhou, L. & Wanner, B.L. Stochastic kinetic model of two component system signalling reveals all-or-none, graded and mixed mode stochastic switching responses. Mol. Biosyst. 6, 531–542 (2010).
5. Jansson, A. & Jirstrand, M. Biochemical modeling with systems biology graphical notation. Drug Discov. Today 15, 365–370 (2010).
6. Funahashi, A. et al. CellDesigner 3.5: a versatile modeling tool for biochemical networks. Proceedings of the IEEE 96, 1254–1265 (2008).
7. Zinovyev, A., Viara, E., Calzone, L. & Barillot, E. BiNoM: a Cytoscape plugin for manipulating and analyzing biological networks. Bioinformatics 24, 876–877 (2008).
8. van Iersel, M.P. et al. Presenting and exploring biological pathways with PathVisio. BMC Bioinformatics 9, 399 (2008).
9. Czauderna, T., Klukas, C. & Schreiber, F. Editing, validating and translating of SBGN maps. Bioinformatics 26, 2340–2341 (2010).
10. Gehlenborg, N. et al. Visualization of omics data for systems biology. Nat. Methods 7, S56–S68 (2010).
11. Suderman, M. & Hallett, M. Tools for visually exploring biological networks. Bioinformatics 23, 2651–2659 (2007).
12. Sorokin, A. et al. The pathway editor: a tool for managing complex biological networks. IBM J. Res. Dev. 50, 561–573 (2006).
13. Kanehisa, M. & Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
14. Schreiber, F. et al. MetaCrop 2.0: managing and exploring information about crop plant metabolism. Nucleic Acids Res. published online, doi:10.1093/nar/GKR1004 (15 November, 2011).
15. Moodie, S., Le Novère, N., Emek, D., Huaiyu, M. & Villeger, A. Systems biology graphical notation: process description language level 1V3. Nature Precedings published online, doi:10.1038/npre.2011.3721.4 (17 February, 2011).
16. Grafahrend-Belau, E. et al. MetaCrop: a detailed database of crop plant metabolism. Nucleic Acids Res. 36, D954–D958 (2008).
17. Zimmermann, P., Hirsch-Hoffmann, M., Hennig, L. & Gruissem, W. GENEVESTIGATOR. Arabidopsis microarray database and analysis toolbox. Plant Physiol. 136, 2621–2632 (2004).
18. Junker, B.H. et al. Temporally regulated expression of a yeast invertase in potato tubers allows dissection of the complex metabolic phenotype obtained following its constitutive expression. Plant Mol. Biol. 56, 91–110 (2004).
19. Stone, S.L. et al. Arabidopsis LEAFY COTYLEDON2 induces maturation traits and auxin activity: implications for somatic embryogenesis. Proc. Natl. Acad. Sci. USA 105, 3151–3156 (2008).
20. To, A. et al. A network of local and redundant gene regulation governs Arabidopsis seed maturation. Plant Cell 18, 1642–1651 (2006).
21. Villeger, A.C., Pettifer, S.R. & Kell, D.B. Arcadia: a visualization tool for metabolic pathways. Bioinformatics 26, 1470–1471 (2010).
22. Chandran, D., Bergmann, F.T. & Sauro, H.M. Athena: modular CAM/CAD software for synthetic biology. arXiv.org published online, arXiv:0902.2598v1 (2009).
23. Chabrier-Rivier, N., Fages, F. & Soliman, S. in Computational Methods in Systems Biology. Lecture Notes in Computer Science Vol. 3082 (eds. Danos, V. & Schachter, V.) 172–191 (Springer-Verlag, 2005).
24. Olivier, B.G. & Snoep, J.L. Web-based kinetic modelling using JWS Online. Bioinformatics 20, 2143–2144 (2004).
25. Battke, F., Symons, S. & Nieselt, K. Mayday—integrative analytics for expression data. BMC Bioinformatics 11 (2010).
26. Wegner, K. et al. The NetBuilder’ project: development of a tool for constructing, simulating, evolving, and analysing complex regulatory networks. BMC Syst. Biol. 1, P72 (2007).
27. Deckard, A., Bergmann, F.T. & Sauro, H.M. Supporting the SBML layout extension. Bioinformatics 22, 2966–2967 (2006).
28. Lammers, C.R. et al. Connecting parts with processes: SubtiWiki and SubtiPathways integrate gene and pathway annotation for Bacillus subtilis. Microbiology 156, 849–859 (2010).
29. Reyes Palomares, A. et al. Systems biology metabolic modeling assistant: an ontology-based tool for the integration of metabolic data in kinetic modeling. Bioinformatics 25, 834–835 (2009).
30. Dilek, A., Belviranli, M.E. & Dogrusoz, U. VISIBIOweb: visualization and layout services for BioPAX pathway models. Nucleic Acids Res. 38, W150–W154 (2010).
31. Elowitz, M. & Leibler, S. A synthetic oscillatory network of transcriptional regulators. Nature 403, 335–338 (2000).