schema (GML or GRAPH-ML) and the graph computation software may be downloaded from the following URL: http://pgrc.ipk-gatersleben.de/dlg/. 1 Motivation.
Journal of Integrative Bioinformatics 2007
http://journal.imbio.de/
Data Linkage Graph: computation, querying and knowledge discovery of life science database networks Matthias Lange, Axel Himmelbach, Patrick Schweizer, Uwe Scholz Institute for Plant Genetics and Crop Plant Research (IPK) Gatersleben, Germany {lange|himmelba|schweiz|scholz}@ipk-gatersleben.de Summary To support the interpretation of measured molecular facts, like gene expression experiments or EST sequencing, the functional or the system biological context has to be considered. Doing so, the relationship to existing biological knowledge has to be discovered. In general, biological knowledge is worldwide represented in a network of databases. In this paper we present a method for knowledge extraction in life science databases, which prevents the scientists from screen scraping and web clicking approaches. We developed a method for extraction of knowledge networks from distributed, heterogeneous life science databases. To meet the requirement of the very large data volume, the method used is based on the concept of data linkage graphs (DLG). We present an efficient software which enables the joining of millions of data points over hundreds of databases. In order to motivate possible applications, we computed networks of protein knowledge, which interconnect metabolic, disease, enzyme and gene function data. The computed networks enabled a holistic relationship among measured experimental facts and the combined biological knowledge. This was successfully applied for a high throughput functional classification of barley EST and gene expression experiments with the perspective of an automated pipeline for the provisioning of controlled annotation of plant gene arrays and chips. Availability: The data linkage graphs (XML or TGF format), the schema integrated database schema (GML or GRAPH-ML) and the graph computation software may be downloaded from the following URL: http://pgrc.ipk-gatersleben.de/dlg/
1 Motivation High throughput biotechnologies, like cDNA or oligo arrays, genome and EST sequencing projects towards proteomics techniques, produce data from measured molecular facts. Those facts could be gene expression, polypeptide concentration or EST sequences. To support the interpretation of measured molecular facts in order to discover the functional role of the measured molecule, the relationship to existing biological knowledge presented in world wide distributed databases, has to be built. To make the data accessible to desktop computers in the labs, the World Wide Web (WWW) is very popular. Techniques like manual browsing towards web services are commonly used. Using these techniques, bioinformatician may facilitate database embedded tools for linking wet-lab data to database knowledge in an explorative way. Methods like sequence similarities, stochastic networks for pattern recognition or text similarities queries are the selection operations, which are used for this data domain [Ste02, SGBB01]. Journal of Integrative Bioinformatics, 4(3):68, 2007
1
Journal of Integrative Bioinformatics 2007
http://journal.imbio.de/
One limiting drawback in this scenario is the data and tool distribution. Nowadays, we know about hundreds of databases, with different but also overlapping content. Efforts for integrating them were made in several projects using several methods. Because the majority is focused on particular analytic scenarios and applications, none of them provided a holistic, uninterpreted view to the world wide interlinked knowledge. Popular techniques, like data warehouses, multi database query languages or database mediators solved the problem of database heterogeneties and provided powerful query interfaces and helpful tools. But the original character of life science databases as a highly dynamic and flexible network of knowledge is not supported. Because the nature of the knowledge stored in the WWW is to be a network, we need to provide an efficient method for the extraction of data and in future knowledge networks [BK03]. The navigation and benefits of exploiting the data networks was motivated in [LMP+ 04]. Here a system called BioNavigation supports the scientist in exploring the content within an integrated database system using the BioMediation-system. This system supports the graph based browsing and query building on top of integrated databases. In this paper, we present a recently developed method for the efficient extraction and querying of data networks from distributed, heterogeneous life science databases. Surprisingly comprehensive knowledge networks were computed and are available for public access. First applications of this networks were successfully used for a high throughput functional classification of EST towards an automated pipeline for the provisioning of controlled annotation of plant gene expression arrays. This and further applications will be described.
2 Method To meet the requirement of the given flexibility of knowledge networks and to enable an efficient but also lightweight processing of very large volume of integrated data, the concept of data linkage graphs (DLG) and an algorithm to efficient compute them has been developed. 2.1 Data schema graphs The concept is the representation of integrated databases as a network of key attributes and its values. Key attributes are those one which define unique identifier like EMBL sequence accessions, Gene Ontology (GO) terms [ABB+ 00] or EC numbers. To keep the modeled context of the attributes, they are assigned to data entities. In the example above GO term and EMBL sequence are entities, id and accession are the key attributes. Following this, we define schema nodes S as a set of pairs S = (E, A), whereas E is the entity and A is the key attribute. In order to construct networks, edges have to be defined. In the matter of fact, life science database are strongly interconnected. Studies assume a connection rate of at least 80% in life science databases [BK03]. This is manifested by a number of hyperlinks in the Web presentation of the databases, references in XML documents or in primary key / foreign key pairs in relational data structures. We use this relationships and extract out of them the edges R among the schema nodes S. Formally, R is a subset of all possible combinations of two elements from S. Thus, we write: R⊆S×S Journal of Integrative Bioinformatics, 4(3):68, 2007
(1) 2
Journal of Integrative Bioinformatics 2007
http://journal.imbio.de/
Because, we are interested in real existing relationships among schema nodes, we restrict R by: Rlink = {(s, f ), (f, s) | s, f ∈ S ∧ f is in relation to s} Rfunc = {(s1 , s2 ), (s2 , s1 ) | s1 , s2 ∈ S ∧ s1 .E = s2 .E} R = Rlink ∪ Rfunc
(2) (3) (4)
In other words, we consider two kinds of bidirectional relations among key attributes: Rlink : Inter-entity relationships, which are modeled as links in the particular databases. Those could be represented by: HTML hyper links, primary key/foreign key, object references etc. Rfunc : Inner-entity relationships, which are modeled as functional dependency among attributes in one entity. Those could be represented by: co-occurrence in one HTML page, childs of the same XML parent item or modeling as attributes of one relational table. With this rules we are able to construct a graph as a pair of entity and relationship sets: α = (S, R), that comprises a database network at schema level. 2.2 Data linkage graphs The next step is the extension of the schema graph toward the instantiated data networks. Thus, we have to consider the attribute values, the stored data in the databases. In the context of the above example, we have a Gene Ontology term id GO:0005975, which identifies the term carbohydrate metabolism or the EMBL accession BAY074321, which identifies a sequence, coding for alpha-galactosidase. If we include this attribute values in our model and construct data networks, we will end up with graphs of interacting database entries. We extend the pair S to a triple, with one additional element V . Thereby, V is an element of the data domain of S, e.g. all valid EMBL accessions or Gene Ontology term identifier. We define data nodes SV as a set of a triple: SV = (E, A, V ) | V ∈ dom(S). If we imply the existence of a = relation in all used data domains for our key attributes A, we can define relationships * ) among two elements of a data node (sv1 , sv2 ∈ SV ), if one of the subsequent conditions is fulfilled. Furthermore, we assume the existence of two operations selection σ and projection π, which are borrowed from relational algebra. If we use SV as a relational table, the selection returns all those tuples (data nodes), in which a certain element is equal to a given one. To decide the equality, we use the = operation. In our application, the projection reduce the data node tuples to one specified element. {(sv1 .E, sv1 .A), (sv2 .E, sv2 .A)} ⊂ Rlink πsv1 .A (σsv2 .A=sv2 .V (sv2 .E)) = πsv2 .A (σsv1 .A=sv1 .V (sv1 .E))
(5)
{(sv1 .E, sv1 .A), (sv2 .E, sv2 .A)} ⊂ Rfunc σsv1 .A=sv1 .V (sv1 .E) = σsv2 .A=sv2 .V (sv2 .E)
(6)
or
Journal of Integrative Bioinformatics, 4(3):68, 2007
3
Journal of Integrative Bioinformatics 2007
http://journal.imbio.de/
Colloquially, two data nodes are in relation if, 1. the schema nodes are in relation and the data elements are the equal (equation 5), or 2. the data nodes are mappable to the same schema and the tuples come from the same instance, e.g. table row or Web page (equation 6).
G( O A. C C G, O N.I T RE RP O )
R
NI T E R P R O
E
N A M
A C C
ac vtii yt
G G G O O O 00: 00: 00: 05 00 00 97 45 45 5 64 53 be O m ac a-t gu-l eat bor ucrf ocs boli hdy ost hyl ms ar et dai dy es oarl es PI PI PI R R R 00: 00: 00: 00 62 01 56 32 11
DI
#3
#2
#1
G O
uf n c
( ME B L G. O A G, O A. C C)
R
G G G G O O O O 00: 00: 00: 00: 00 05 00 05 45 97 45 97 64 5 53 5
G O A
l ni k
( ME B L A. C C , ME B L G. O A ) D E
A C C
DI
N T REP R O = `I RP: 00 00 56 `(
G( O =) σI
G O N.I T RE PR O .I RP 00: 00 56
G O A. C C G. O 00: 05 97 5 G O A. C C G. O 00: 00 45 53
A CC = G` O 00: 05 97 5`
ME B L G. O A G. O 00: 05 97 5 ME B L G. O A G. O 00: 00 45 53
G O . DI G #. O 1 ) :
G O )) σ
A CC σ( A CC = G` O 00: 05 97 5` (
E M B L)
O A = G` O 00: 05 97 5` (
N T REP R O = `I RP: 00 00 56 `(
G( O =) σI
G O DI. G #. O 2 ) :
G O N.I T RE PR O .I RP :0 00 11 1
G( O ))
A CC σ( A CC = G` O 00: 00 45 53 `
G O A σ( G O A = G` O 00: 00 45 53 `
π E M B L. DI #2. nda G O . DI (E #. 2 M : B L) =) π
A CC = G` O 00: 05 97 5`
σ
ME B L)
O A = G` O 00: 00 45 53 `(
A CC = A` Y 07 43 21 `(
σ E M B L. DI #2. : E M B L) = σ G
E M B L A. C C . A Y 07 43 21
A CC = A` Y 07 43 21 `(
σ E M B L. DI #1. : ME B L) = σ G
G O A σ( G O A = G` O 00: 05 97 5` (
π E M B L . DI #1. nda G O I. E D#. M 1 B : L ) =) π
d a at nli k a g e g ar p h :
#4 #3 #2 #1 A Y1 Y1 A 15 15 Y07 Y07 59 59 43 43 21 21 Bf Bf Al Al ucr ucr pha pha 4t 4t ga- gacl cl ost ost dai dai es es
B L
E M
uf n c
R
s c h e m a g ar p h :
To illustrate these types of relationships, the relationships between two data sources are visualized in figure 1: Assuming that there are the EMBL and the Gene Ontology (GO) database.
Figure 1: example for the relationships in schema and data linkage graphs
Each comprise three attributes in one entity. With known inter-attribute relationships (e. g. derived from hyperlinks among EMBL and GO), our example includes three relations: Rf unc1 = (EM BL.ACC, EM BL.GOA) Rf unc2 = (GO.ACC, GO.IN T ERP RO) Rlink1 = (EM BL.GOA, GO.ACC) On the basis of the above defined relation * ) we can compute a set of data node pairs: RV ⊆ SV × SV
(7)
(sv1 , sv2 ) ∈ RV ∧ sv1 * ) sv2
(8)
such that
Journal of Integrative Bioinformatics, 4(3):68, 2007
4
Journal of Integrative Bioinformatics 2007
http://journal.imbio.de/
Finally, we are able to describe networks of data nodes as set of graphs with the structure β = (SV, RV ). Using DLGs in combination with an efficient graph query engine, a general concept for a novel flexible, lightweight database integration becomes possible. Flexible means no schema or data integration, with all the conflicts needs to be solved. Leigthweigt, because only the key attributes values have to be accessed physically, parsed and downloaded respectively. Subsequently we describe the concrete construction of schema networks and the computation of data networks, in form of data linkage graphs (DLGs).
3 Implementation In order to use the described concept for joining millions of data points in over hundreds of databases, we designed and implemented an efficient algorithm for the computation of DLGs. The prerequists are (1) a graph of schema nodes and (2) for all schema nodes, the attribute values. 3.1 Data Import For the first task, we designed a protein knowledge network, which interconnects metabolic, disease, enzyme and gene functional data. In this context, we merged the data schema of the public databases UNIPROT (UP), Gene Ontology, OMIM, BRENDA and KEGG and derived a graph of schema nodes. This graph, available at (http://pgrc.ipk-gatersleben. de/dlg/, comprises 318 nodes and 234 edges, which cover the majority of the currently available protein related databases. In a collaboration with the University of Bielefeld, Germany, methods are developed to construct schema graphs automatically. This will be realized based on public available life science database repositories, like the NAR Database Categories List, pattern recognition in life science database WWW-presentation and the tracking of their hyperlinks. The retrieval of the key attribute values was realized by a combination of an in-house developed parser, the Sequence Retrieval System (SRS) [EUA96] and a database mediator called BioDataServer (BDS) [BLSS04]. In this way, a file of key values for each schema node was generated. The results are 318 files comprising around 5.6*107 key values. Hereby, we observed a 10-fold redundancy. This potential for high compression rate motivates to the computation of a hash table, mapping each key value to an integer. If we assume a 64-bit coded integer, a 10-character identifier from the Gene Ontology can be stored in 8 bytes. This saves storage capacity in a way, so that is it possible now to represent the data linkage graphs in modern computer main memory. 3.2 Data network computation This properties influenced the strategy for the DLG computation. The standard way for the computation of transitive entity relations are index supported joins. Such queries are commonly used to find data relationships in integrated databases. Examples of such queries are Journal of Integrative Bioinformatics, 4(3):68, 2007
5
Journal of Integrative Bioinformatics 2007
http://journal.imbio.de/
those, mapping sequence data to the predicted functional role of the coded gene. The common established strategy is to import relevant data into a data warehouse and execute queries. Because of the complexity of such queries, the feasible number of tables and tuples in such joins is limited. In other words, if all of the i schema nodes were joined (./) and a cascading merge join was used, we would have a polynomial complexity, because database management systems are optimized for a data storage using secondary memory, usually hard discs and the cascading sort-merge join is the commonly used join implementation in relational database management systems. The complexity of each merge join is O(n). Consequently, the overall complexity is: Ojoin = O(n)table1 ./table2 ∗ . . . ∗ O(n)tablei−1 ./tablei = O(n)i
(9)
The consequence is the limitation to use case specific data warehouses, which disables a comprehensive database integration. Thus, there is a complexity problem for comprehensive data queries over huge data warehouses using big systems like SRS or even ORACLE. They support efficient joins only for a limited number of data, tables or databases. More comprehensive queries would result in days of waiting or even system failures. To avoid those problems and to improve the implementation, we investigated main memory based data structures and algorithms. We discussed previously that the nodes in our data networks show high redundancy and a compression rate of factor 10 can be achieved. Thus, we where able to represent all nodes by 5.6*107 64-bit pointer to 5.4*108 data nodes. This data structure took around 4GByte. Using this data structure, we pre-compute all possible joins in main memory. A C++ program loads all data nodes into main memory and computes all possible graphs. For this, we used the recursive breadth-first search (BFS) algorithm in combination with index structures like hash maps, B-trees and bit vectors. The resulted graphs where stored in a flat relational structure: (graphid, node depth, parent entity, parent attribute, child entity, child attribute, parent value, child value). This graphs are available in csv-format at the URL given bellow. Because of the recently finished graph computation, no data format optimization was done. The representation in a hierarchical data structure, like XML, is ad-hoc realizable, but not yet used practically. The graph computation took 3 weeks on a 2GHz Opteron server, whereas 12 GByte of main memory and 2 CPUs were used. 3.3 Data network queries We loaded the graphs into an ORACLE 10g database table. Some example queries were performed, like a query for all reachable data nodes starting from a given one. The application for such a query is e.g. the assignment of EMBL sequences to Gene Ontology controlled vocabulary, metabolic pathways in KEGG, diseases in OMIM etc. The former necessary join was replaced by a simple start-destination-search in the graphs. This search retrieves all graphs that connect a given start-node to an end-node. In SQL this may expressed as a self join: select g2.data from graphs g1, graphs g2 where g1.graphid = g2.graphid and Journal of Integrative Bioinformatics, 4(3):68, 2007
6
Journal of Integrative Bioinformatics 2007
http://journal.imbio.de/
g1.parent entity = ’EMBL’ and g1.parent attribute = ’ACCESSION’ and g2.child entity = ’GO TERM’ and g2.child attribute = ’ACCESSION’ The complexity of such a query is derived from the complexity of the two selection operations for selecting the graphs containing the queried start and end node, and additionally their union. Using indexes for the selection, the complexity is O(log n). Applying merge join on sorted graph identifiers, the complexity is O(n). Therefore, the overall complexity of the above query is: Ographquery = O(log n) + O(log n) + O(n) = O(n)
(10)
If we don’t take in account that we need to compute the DLG with the BFS complexity (because it will be necessary only once per database update), this will be a significant reduction of the polynomial join complexity.
4 Discussion The analysis and interpretation of wet-lab data, performed at a typical scientist desk is an iterative process of tool application and database exploration. A frequent task in biology is the functional gene annotation, which is often accomplished by finding similarities or patterns in known, experimentally annotated sequence data. One common application of this paradigm is, for instance, (1) the mapping of translated nucleic sequences by direct sequence homology recognition using the BLAST tool [AGM+ 90] in an annotated, merged polypeptide sequence database like NRPEP (ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nr.gz). The hits which match known proteins link to the original databases and have to be (2) extracted eighter manually or using parsers. Finally, all linked databases have to be inspected for getting relevant information for the input sequences. The datasets on functional aspects of sequences and proteins present in life science databases are mainly derived from literature or provided from the database annotation team in textual form in natural language. In contrast, the use of a controlled vocabulary (CV) is a necessary concept for making annotations computational comparable. Using CVs, automized phenotypic classification for a huge number of wet-lab data is possible without human interaction or interpretation of textual description of a protein function. There are several systems, implementing such kind of systematic knowledge representation: the MIPS Functional Catalog [MFG+ 02], the Gene Ontology Database [ABB+ 00], GeneQuiz [ABL+ 99], OMIM [HSA+ 02], KEGG [KGKN02] or MapMan [TBG+ 04]. All those databases where more or less manually constructed. 4.1 Functional mapping to controlled vocabulary Here we present an application of the DLGs to map general functional sequence annotation to controlled vocabulary in a computational way. The basic idea of the functional categorization is to (1) take BLAST [AGM+ 90] hits for a number of EST, e. g. an EST library, (2) extract the database identifiers from the hit descriptions, Journal of Integrative Bioinformatics, 4(3):68, 2007
7
Journal of Integrative Bioinformatics 2007
http://journal.imbio.de/
(3) find them in the data linkage graphs and finally (4) query the data link graphs to (5) find a path from the found database identifier to the interesting annotation database entity. In first use cases, this work flow was implemented to map Bests to their functional role in the Gene Ontology (GO) Database. The result is the assignment of ESTs to the three GO ontologies. The work flow has been implemented by an SQL-query, using the start-destination query pattern presented before. Figure 2 visualizes this process. S
t
e
p
1
S
t
e
BLAST against public databases E
S
T
l
i
b
r
a
r
p
2
PIR
EMBL
GB
y
DDBJ
UP
S
t
e
p
3
assign BLAST hits to DLG nodes (start nodes)
S
t
e
p
5
GO terms
Start-destination-search query on top of DLGs to map start-nodes to destination nodes (GO term) S
t
e
p
4
Figure 2: work flow to assign ESTs to GO terms
In order to evaluate the computational mapping of ESTs to GO terms, a set of 440 EST has been mapped computational to a GO term in around 2 minutes using an ORACLE 10g on a 2GHz Itanium server. Those ESTs represent genes which are used for an IPK internal cDNA macro array. Thus, the functional role of those gene has been studied in detailed from institute’s scientists, but had never been before included in GO. For the automatic annotation using the described pipeline, the cut-off criteria for the used BLAST results was an e-value less then 1E-10. The results of the annotation have been manually checked. We discovered, that 6% had a false assignment to GO terms. 80% were assigned unquestionable correct. The rest of 14% were not clearly decidable but tended to be correct. Those first investigations have been recently done. Thus, the data is available on request from the authors. Because of this promising results and to demonstrate the flexibility of the approach, another annotation mapping was computed to KEGG [KGKN02] metabolic pathways. This query took on the same server approximately 45 minutes. We were able to identify 18,607 out of 111,090 sequences stored in the IPK CR-EST database [KLF+ 05], which show enzymatic activity. By using the data linkage graphs and starting from either EMBL, Genbank, DDBJ, PIR or SWISSPROT/TREMBL in the BLAST results. 21% of these ESTs assigned to functional categories belong to carbohydrate metabolism, 19% to amino acid- and 17% energy metabolism. In contrast, using BLAST results only 1906 functional descriptions containing an EC numbers were found. Hence, compared to BLAST the sensitivity of functional annotation was increased 10fold. Figure 3 shows a simple statistic over the non-redundant metabolic annotated ESTs.
Journal of Integrative Bioinformatics, 4(3):68, 2007
8
Journal of Integrative Bioinformatics 2007
http://journal.imbio.de/
Figure 3: mapping of ESTs from the CR-EST database to GO terms
4.2 Database network discovery Beside the above mentioned graph queries, further analysis will be done. Scenarios are knowledge discovery, by context specific joining of graphs, like building context networks of data on specific diseases, regulatory elements, protein data and gene variants. This can be supported by a multidimensional, structural analysis of integrated databases, like centrality analysis for the identification of key elements of data networks. Furthermore, quality control of public databases is an important challenge in modern bioinformatics. Using DLGs we can e. g. instance extraction all networks that include contradicting data nodes for the same schema node. The URL http://pgrc.ipk-gatersleben.de/dlg/ can be used to download the data schema graph (GML or GraphML format), the data linkage graphs (CSV format) or examples of context specific extracted metabolic knowledge networks for plants (GML or GraphML format). Later are queried using SQL by joining the DLGs over the enzymes (EC numbers) which are present in glycolysis metabolic pathway and protein which are reported in Brassicae, Triticeae and Solanacea. Thus we extracted networks of knowledge to enzymes in barley which show catalytic activity in glycolysis. Figure 4 gives an impression of such a data network. A high resolution and comfortable version is from this figure is available, using the graph datafiles in a graph viewer. The files and a link to a viewer are available using the URL given previously.
5 Outlook Beside the presented application of the data linkage graphs, it is possible to perform more comprehensive applications of the presented concepts. The description was used to give a proofof-principle. More use cases and further more detailed quality control of the computational sequence annotation mappings are on the road map. Further plans are underway to expand the data basis and implement a user friendly and flexible tool to perform queries on top of data linkage graphs to discover knowledge in networks of life science databases.
Journal of Integrative Bioinformatics, 4(3):68, 2007
9
Journal of Integrative Bioinformatics 2007
http://journal.imbio.de/
GO_GRAPH_PATH/TERM1_ID/2141
GO_TERM2TERM/TERM1_ID/5609
GO_GRAPH_PATH/ID/9604 GO_TERM2TERM/TERM1_ID/8662 GO_TERM_DBXREF/DBXREF_ID/1211 GO_GRAPH_PATH/TERM2_ID/1429 GO_TERM2TERM/TERM1_ID/1429 GO_TERM2TERM/TERM2_ID/2141 GO_GRAPH_PATH/TERM1_ID/1820 GO_TERM2TERM/TERM1_ID/1818 GO_UNIPROT_LINK/ID/1801841 GO_TERM_DBXREF/TERM_ID/1429 GO_TERM2TERM/RELATIONSHIP_TYPE_ID/2 GO_EC_LINK/ID/2193
GO_GRAPH_PATH/TERM2_ID/2141 GO_TERM2TERM/TERM2_ID/1820 GO_LINK/ACCESSION/O65114 GO_TERM_DBXREF/DBXREF_ID/3395
GO_EVIDENCE/ID/7778742 GO_EVIDENCE/ID/777411 GO_TERM_DEFINITION/TERM_ID/1429 GO_UNIPROT_LINK/ID/1801842 GO_TERM2TERM/TERM2_ID/1430 GO_EVIDENCE/ID/840683 GO_EVIDENCE/ID/647215 GO_EVIDENCE/ID/645701GO_EVIDENCE/ID/645705
GO_TERM2TERM/TERM2_ID/1737 GO_LINK/ACCESSION/O65117
GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/1167777
GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/129250 GO_TERM_DBXREF/TERM_ID/2141
GO_UNIPROT_LINK/ID/1801840 GO_TERM_DBXREF/DBXREF_ID/2193
GO_EVIDENCE/ID/840689 GO_EVIDENCE/ASSOCIATION_ID/6765476 GO_EVIDENCE/ID/645685 GO_EVIDENCE/ID/8394446
GO_GRAPH_PATH/TERM1_ID/1429 GO_TERM_DBXREF/TERM_ID/1737 GO_EVIDENCE/ASSOCIATION_ID/720699 GO_EVIDENCE/ID/645697 GO_EVIDENCE/ID/645689
GO_EVIDENCE/ASSOCIATION_ID/731461
GO_EVIDENCE/ASSOCIATION_ID/612102
GO_TERM_DEFINITION/TERM_ID/2141
GO_TERM2TERM/ID/3036
GO_EVIDENCE/ID/8395489
GO_UNIPROT_LINK/ID/1245104
GO_TERM2TERM/ID/2419 GO_EVIDENCE/ID/889156 GO_EVIDENCE/ID/790798
GO_TERM_SYNONYM/TERM_ID/1429
GO_GRAPH_PATH/TERM2_ID/1737
GO_UNIPROT_LINK/ID/1683944
GO_EVIDENCE/ID/645709 GO_EVIDENCE/ASSOCIATION_ID/610863 GO_EVIDENCE/ASSOCIATION_ID/610860
GO_TERM_DBXREF/DBXREF_ID/2444 GO_TERM_DEFINITION/TERM_ID/1737
GO_EVIDENCE/ID/889165 GO_EVIDENCE/ID/645693
GO_LINK/ACCESSION/O65115
GO_GRAPH_PATH/TERM1_ID/1737
GO_UNIPROT_LINK/ID/1801586
GO_EVIDENCE/ASSOCIATION_ID/769831 GO_EVIDENCE/ID/8395485
GO_EVIDENCE/ASSOCIATION_ID/7287508
GO_GRAPH_PATH/TERM2_ID/1820
GO_ASSOCIATION/ID/612102 GO_TERM2TERM/ID/2760 GO_LINK/ACCESSION/O65116 GO_EVIDENCE/ASSOCIATION_ID/7288423
GO_ASSOCIATION/ID/6765476 GO_EVIDENCE/ASSOCIATION_ID/610851
GO_EVIDENCE/ASSOCIATION_ID/7288417 GO_EVIDENCE/ASSOCIATION_ID/808844
GO_EVIDENCE/ASSOCIATION_ID/610857 GO_ASSOCIATION/ID/610860 GO_ASSOCIATION/ID/7288423
GO_ASSOCIATION/GENE_PRODUCT_ID/129544
GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/129248 GO_TERM_DBXREF/TERM_ID/1820 GO_TERM_DEFINITION/TERM_ID/1820
GO_DBXREF/ID/93395
GO_TERM2TERM/ID/2849
GO_EVIDENCE/ASSOCIATION_ID/610866
GO_ASSOCIATION/ID/610857 GO_ASSOCIATION/ID/610866
GO_EVIDENCE/ASSOCIATION_ID/610854 GO_EVIDENCE/ASSOCIATION_ID/808850 GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/129249
GO_EVIDENCE/ASSOCIATION_ID/7288420 GO_ASSOCIATION/ID/731461 GO_ASSOCIATION/TERM_ID/1820
GO_ASSOCIATION/GENE_PRODUCT_ID/153159
GO_ASSOCIATION/TERM_ID/1737 GO_ASSOCIATION/ID/720699
GO_ASSOCIATION/GENE_PRODUCT_ID/129254
GO_ASSOCIATION/GENE_PRODUCT_ID/129252 GO_ASSOCIATION/GENE_PRODUCT_ID/129251 GO_ASSOCIATION/ID/7288420 GO_ASSOCIATION/ID/610848
GO_ASSOCIATION/TERM_ID/2141
GO_ASSOCIATION/ID/610854 GO_ASSOCIATION/GENE_PRODUCT_ID/1725527 GO_ASSOCIATION/ID/769835 GO_ASSOCIATION/ID/7287508
GO_GENE_PRODUCT/ID/154934
GO_DBXREF/ID/93379
GO_ASSOCIATION/GENE_PRODUCT_ID/161712 GO_ASSOCIATION/ID/610863 GO_ASSOCIATION/ID/610851 GO_ASSOCIATION/GENE_PRODUCT_ID/175463 GO_ASSOCIATION/ID/7288417 GO_ASSOCIATION/ID/808844
GO_GENE_PRODUCT/ID/1607169 GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/1607169
GO_ASSOCIATION/TERM_ID/1429
GO_EVIDENCE/ASSOCIATION_ID/769835 GO_GENE_PRODUCT/ID/129248 GO_EVIDENCE/ID/8395481 GO_ASSOCIATION/ID/769831
GO_ASSOCIATION/GENE_PRODUCT_ID/1725526
GO_GENE_PRODUCT/ID/153159 GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/161712
GO_GENE_PRODUCT/SPECIES_ID/162261 GO_TERM/ID/1737
PUBMED_REFERENCE/REFERENCE_ACCESSION/P22988[1]
GO_GENE_PRODUCT/ID/129253
GO_ASSOCIATION/GENE_PRODUCT_ID/129248 GO_ASSOCIATION/GENE_PRODUCT_ID/129250
GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/129544
TAXONOMY_CROSS_REFERENCE/ACCESSION/O65114 MEDLINE_REFERENCE/REFERENCE_ACCESSION/P22988[1] TAXONOMY_CROSS_REFERENCE/ACCESSION/O65115
GO_GENE_PRODUCT_SEQ/SEQ_ID/3339
GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/154934 GO_GENE_PRODUCT_SEQ/SEQ_ID/573635
GO_ASSOCIATION/GENE_PRODUCT_ID/154934 GO_ASSOCIATION/GENE_PRODUCT_ID/161713 GO_ASSOCIATION/ID/821115
GO_EVIDENCE/ASSOCIATION_ID/821115 GO_EVIDENCE/ASSOCIATION_ID/849915 GO_ASSOCIATION/GENE_PRODUCT_ID/1607169 GO_ASSOCIATION/ID/4892614 GO_EVIDENCE/ID/904591
LITERATURE_REFERENCE/REFERENCE_ACCESSION/P22989[1]
GO_GENE_PRODUCT_SEQ/SEQ_ID/531526 GO_ASSOCIATION/GENE_PRODUCT_ID/168629
LITERATURE_REFERENCE/REFERENCE_ACCESSION/P22988[1] GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/129254
GO_GENE_PRODUCT/ID/129252
GO_GENE_PRODUCT/ID/129249
GO_TERM/ID/2141 GO_GENE_PRODUCT_COUNT/TERM_ID/1820 GO_UNIPROT_LINK/ID/247489 GO_GENE_PRODUCT_SEQ/SEQ_ID/573634
GO_ASSOCIATION/GENE_PRODUCT_ID/168630
GO_GENE_PRODUCT_COUNT/TERM_ID/2141
GO_GENE_PRODUCT_SEQ/SEQ_ID/150842 GO_GENE_PRODUCT_SEQ/SEQ_ID/3372
GO_UNIPROT_LINK/ID/234600
HSSP_LINK/ACCESSION/P22988
GO_SEQ/ID/573635
GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/168630
GO_GENE_PRODUCT_SEQ/SEQ_ID/73390 GO_GENE_PRODUCT_SEQ/SEQ_ID/491962GO_SEQ/ID/531526 GO_LINK/ACCESSION/O65916 GRAMENE_LINK/ACCESSION/P22989 GRAMENE_LINK/ACCESSION/P08477 PANTHER_LINK/ACCESSION/P22989
PANTHER_LINK/ACCESSION/P22988 GO_UNIPROT_LINK/XREF_KEY/P22989 GO_UNIPROT_LINK/XREF_KEY/P22988 KEYWORD/ACCESSION/P22988
GO_GENE_PRODUCT/ID/170735 GO_SEQ/ID/73352 GO_GENE_PRODUCT/ID/1725526
GO_SEQ/DISPLAY_ID/O65916 GO_SEQ_DBXREF/SEQ_ID/573635 GO_SPECIES/ID/162261 GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/168629
MEDLINE_REFERENCE/REFERENCE_ACCESSION/O65118[1]
GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/170735 GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/161713 GO_GENE_PRODUCT/ID/1725525
GO_SEQ_DBXREF/SEQ_ID/531526
EMBL_LINK/ACCESSION/P05336 GENE_NAME/ACCESSION/O65118 GO_GENE_PRODUCT_SEQ/SEQ_ID/599310 HSSP_LINK/ACCESSION/P08477
GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/1725525
PANTHER_LINK/ACCESSION/P08477 GO_GENE_PRODUCT_SEQ/SEQ_ID/573636 GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/153159 GO_SEQ_DBXREF/SEQ_ID/73352 GO_SEQ_DBXREF/SEQ_ID/573634 GO_SEQ/ID/73390 GO_SEQ/ID/491962 GO_SEQ/ID/601038 PROTEIN/ACCESSION/P22989 TAXONOMY_CROSS_REFERENCE/ACCESSION/P22989 GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/129252
EMBL_LINK/ACCESSION/O65118 HSSP_LINK/ACCESSION/P22989
GENE_NAME/ACCESSION/O65916 FEATURE/ACCESSION/O65120
FEATURE/ACCESSION/P05336 GO_LINK/ACCESSION/Q9ZNQ2
GO_SEQ_DBXREF/SEQ_ID/601038
PROTEIN/ACCESSION/P22988 GO_UNIPROT_LINK/XREF_KEY/P10848
GO_SEQ_DBXREF/SEQ_ID/73390
TAXONOMY_CROSS_REFERENCE/ACCESSION/P08477 GO_LINK/ACCESSION/Q9ZR29 PROTEIN/ACCESSION/P08477
GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/1725267 GO_SEQ_DBXREF/SEQ_ID/150842 LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65118[1] GO_GENE_PRODUCT_SEQ/SEQ_ID/491961 GO_SEQ/ID/3372
INTERPRO_LINK/ACCESSION/P22989 EMBL_LINK/ACCESSION/O65120 GENE_NAME/ACCESSION/O65120 ORGANISM_SPECIES/ACCESSION/P05336
GO_SEQ/DISPLAY_ID/P08477 ORGANISM_CLASSIFICATION/ACCESSION/P08477
MEDLINE_REFERENCE/REFERENCE_ACCESSION/Q9FEH1[1]
GO_GENE_PRODUCT_SEQ/SEQ_ID/491958
GO_GENE_PRODUCT_SEQ/SEQ_ID/491956 GO_GENE_PRODUCT_SEQ/SEQ_ID/47491
GO_SEQ/ID/3386
ORGANISM_SPECIES/ACCESSION/P08477 GO_SEQ_DBXREF/SEQ_ID/599310
GO_LINK/ACCESSION/Q9FEH1 EMBL_LINK/ACCESSION/O65114
LITERATURE_REFERENCE/ACCESSION/O65118 GO_SEQ_DBXREF/SEQ_ID/47491
PFAM_LINK/ACCESSION/P22988
GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/1725527 ORGANISM_SPECIES/ACCESSION/O65118
GO_SEQ/ID/3339
GRAMENE_LINK/ACCESSION/P10848 LITERATURE_REFERENCE/ACCESSION/P10848 PANTHER_LINK/ACCESSION/P05336
GO_GENE_PRODUCT_SEQ/SEQ_ID/3386
GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/175463 GO_UNIPROT_LINK/ID/209451
GO_SEQ_DBXREF/SEQ_ID/491962
FEATURE/ACCESSION/O65118
GO_LINK/ACCESSION/O65120 GRAMENE_LINK/ACCESSION/P22988 LITERATURE_REFERENCE/REFERENCE_ACCESSION/P10848[1] AUTHORS/REFERENCE_ACCESSION/O65117[1]
GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/129251 GO_GENE_PRODUCT_SEQ/SEQ_ID/601038
GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/129253
GO_SEQ/ID/573634
ORGANISM_CLASSIFICATION/ACCESSION/P22989
PUBMED_REFERENCE/REFERENCE_ACCESSION/Q9FEH1[1]
AUTHORS/REFERENCE_ACCESSION/O65120[1]
GO_GENE_PRODUCT/SPECIES_ID/232832 GO_GENE_PRODUCT/ID/1725267
GO_SEQ/DISPLAY_ID/P10848
EMBL_LINK/ACCESSION/O65916 ORGANISM_CLASSIFICATION/ACCESSION/P22988 KEYWORD/ACCESSION/P08477 GENE_NAME/ACCESSION/P05336 INTERPRO_LINK/ACCESSION/P08477 GO_UNIPROT_LINK/XREF_KEY/P08477
FEATURE/ACCESSION/O65916
KEYWORD/ACCESSION/P22989 ORGANISM_SPECIES/ACCESSION/P22988
GO_ASSOCIATION/GENE_PRODUCT_ID/1725267
GO_GENE_PRODUCT/ID/161713 GO_ASSOCIATION/GENE_PRODUCT_ID/170735
GO_GENE_PRODUCT_COUNT/TERM_ID/1737
GO_GENE_PRODUCT/ID/1725527 AUTHORS/REFERENCE_ACCESSION/Q9ZR31[1]
ORGANISM_SPECIES/ACCESSION/P22989
TAXONOMY_CROSS_REFERENCE/ACCESSION/P22988
GO_GENE_PRODUCT/ID/161712 GO_GENE_PRODUCT/ID/168630
GO_GENE_PRODUCT/ID/129251
GO_SEQ/DISPLAY_ID/P10847
GO_UNIPROT_LINK/ID/247488 LITERATURE_REFERENCE/ACCESSION/P22989 LITERATURE_REFERENCE/ACCESSION/P22988 INTERPRO_LINK/ACCESSION/P22988 LITERATURE_REFERENCE/ACCESSION/P08477
GO_TERM/ID/1820
GO_GENE_PRODUCT/ID/129544 GO_GENE_PRODUCT_SEQ/SEQ_ID/73352
GO_SEQ_DBXREF/DBXREF_ID/40067
INTERPRO_LINK/ACCESSION/P10848
GO_SEQ_DBXREF/SEQ_ID/491956 GRAMENE_LINK/ACCESSION/O65118
GO_SEQ_DBXREF/SEQ_ID/573636
GO_SEQ/DISPLAY_ID/O65120
EMBL_LINK/ACCESSION/Q9ZR31 GO_GRAPH_PATH/ID/221964
GO_SEQ/ID/599310 TAXONOMY_CROSS_REFERENCE/ACCESSION/P05336
GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/1725526
LITERATURE_REFERENCE/REFERENCE_ACCESSION/P05336[1] GO_UNIPROT_LINK/ID/233004 PUBMED_REFERENCE/REFERENCE_ACCESSION/P05336[1] MEDLINE_REFERENCE/REFERENCE_ACCESSION/P05336[1]
HSSP_LINK/ACCESSION/P05336 GO_UNIPROT_LINK/XREF_KEY/P05336 HSSP_LINK/ACCESSION/P10848
LITERATURE_REFERENCE/ACCESSION/P05336
HSSP_LINK/ACCESSION/O65119 GO_SEQ/ID/573636 GO_SEQ_DBXREF/SEQ_ID/491961 HSSP_LINK/ACCESSION/O65118
PROTEIN/ACCESSION/O65119
FEATURE/ACCESSION/O65117
KEYWORD/ACCESSION/P10848
MEDLINE_REFERENCE/REFERENCE_ACCESSION/O65119[1] LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65119[1] EMBL_LINK/ACCESSION/Q9ZR30 GO_SEQ/DISPLAY_ID/O65117
PANTHER_LINK/ACCESSION/P10848
PROTEIN/ACCESSION/P10848
GENE_NAME/ACCESSION/O65114 INTERPRO_LINK/ACCESSION/O65117
ORGANISM_CLASSIFICATION/ACCESSION/P10848
PFAM_LINK/ACCESSION/P22989 GENE_NAME/ACCESSION/Q9ZR31 GO_SEQ_DBXREF/SEQ_ID/491957 GO_SEQ_DBXREF/SEQ_ID/491960 GO_SPECIES/ID/230061
PROTEIN/ACCESSION/O65117
ENZYME/ACCESSION/O65114 ENZYME/ACCESSION/O65916 GO_SEQ/ID/491961 ENZYME/ACCESSION/P05336 PUBMED_REFERENCE/REFERENCE_ACCESSION/O65119[1] ENZYME/ACCESSION/O65118
GO_SEQ/DISPLAY_ID/Q9ZR31 HSSP_LINK/ACCESSION/O65117
ORGANISM_SPECIES/ACCESSION/P10848
GO_SEQ/ID/491958 FEATURE/ACCESSION/Q9ZR31 GO_SEQ/ID/47491 GO_SEQ/ID/491956 GO_GOA_LINK/ID/93395 GO_SEQ/DISPLAY_ID/O65116 PUBMED_REFERENCE/REFERENCE_ACCESSION/O65118[1] GO_GENE_PRODUCT_SEQ/SEQ_ID/491960 GO_GENE_PRODUCT_SEQ/SEQ_ID/491957 GO_SEQ_DBXREF/SEQ_ID/558618
LITERATURE_REFERENCE/ACCESSION/O65117
PROTEIN/ACCESSION/P05336 GRAMENE_LINK/ACCESSION/P05336 KEYWORD/ACCESSION/P05336 ORGANISM_CLASSIFICATION/ACCESSION/P05336
TIGRFAMS_LINK/ACCESSION/P34937 GO_GENE_PRODUCT_SEQ/SEQ_ID/558618
GO_ASSOCIATION/GENE_PRODUCT_ID/1167777
TAXONOMY_CROSS_REFERENCE/ACCESSION/P10848
LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65117[1] PFAM_LINK/ACCESSION/P10848
FEATURE/ACCESSION/Q9ZR30 GRAMENE_LINK/ACCESSION/O65117 HSSP_LINK/ACCESSION/O65115
EMBL_LINK/ACCESSION/P10847 GRAMENE_LINK/ACCESSION/P34937 PROSITE_LINK/ACCESSION/P34937
PANTHER_LINK/ACCESSION/O65119
KEYWORD/ACCESSION/O65117
GENE_NAME/ACCESSION/P10847
PANTHER_LINK/ACCESSION/O65118
LITERATURE_REFERENCE/ACCESSION/P34937
GO_SEQ/ID/558618
GO_UNIPROT_LINK/ID/209452
GRAMENE_LINK/ACCESSION/Q9ZNQ2 HSSP_LINK/ACCESSION/Q9ZR29 PROSITE_LINK/ACCESSION/P08477
GO_SEQ/DISPLAY_ID/P22988 GO_SEQ/ID/491957 GO_SEQ/ID/491960 GO_SEQ/DISPLAY_ID/P26517
ENZYME/EC_NR/1.1.1.1 PANTHER_LINK/ACCESSION/O65117
INTERPRO_LINK/ACCESSION/P34937 PIR_LINK/ACCESSION/P22989 PROTEIN/ACCESSION/P34937 PANTHER_LINK/ACCESSION/Q7DLE1 ORGANISM_SPECIES/ACCESSION/P34937
GO_SEQ_DBXREF/DBXREF_ID/40860
ENZYME/ACCESSION/Q9ZR30 ENZYME/ACCESSION/O65117
TIGRFAMS_LINK/ACCESSION/P08477
FEATURE/ACCESSION/P10848 ENZYME/EC_NR/1.1.1.27 GO_SEQ/DISPLAY_ID/O65119 ORGANISM_SPECIES/ACCESSION/Q7DLE1 ALTERNATIVE_ACCESSIONS/ACCESSION/P05336 PFAM_LINK/ACCESSION/P26517
PROTEIN/ACCESSION/Q9ZNQ2 GO_UNIPROT_LINK/XREF_KEY/P34937
ORGANISM_SPECIES/ACCESSION/Q9ZNQ2
GRAMENE_LINK/ACCESSION/Q7DLE1
GENE_NAME/ACCESSION/Q9FEH1 PFAM_LINK/ACCESSION/P05336 PUBMED_REFERENCE/REFERENCE_ACCESSION/O65114[1] ENZYME/ACCESSION/Q9FEH1 ENZYME/ACCESSION/Q9ZNQ2 EMBL_LINK/ACCESSION/Q9FEH1 GO_LINK/ACCESSION/Q9ZR30 PANTHER_LINK/ACCESSION/O65114
PFAM_LINK/ACCESSION/P08477 GRAMENE_LINK/ACCESSION/O65115
EMBL_LINK/ACCESSION/P08477 PROTEIN/ACCESSION/Q7DLE1
TAXONOMY_CROSS_REFERENCE/ACCESSION/Q9ZNQ2
GO_SEQ_DBXREF/DBXREF_ID/49166
KEYWORD/ACCESSION/Q7DLE1
GO_UNIPROT_LINK/ID/253971
GO_SEQ/DISPLAY_ID/P05336 ENZYME/ACCESSION/O65116
HSSP_LINK/ACCESSION/Q9FEH1
PRINTS_LINK/ACCESSION/P22988 GO_SEQ/DISPLAY_ID/Q9ZR29 TAXONOMY_CROSS_REFERENCE/ACCESSION/P26517
FEATURE/ACCESSION/Q9FEH1 ENZYME/ACCESSION/O65119 ORGANISM_SPECIES/ACCESSION/O65115 GO_EC_LINK/XREF_KEY/1.2.1.12 FEATURE/ACCESSION/Q9ZNQ2 INTERPRO_LINK/ACCESSION/O65916 EMBL_LINK/ACCESSION/Q9ZNQ2 ORGANISM_CLASSIFICATION/ACCESSION/O65916
GO_SEQ/ID/150842 KEYWORD/ACCESSION/O65916 PROTEIN/ACCESSION/O65916
GO_LINK/ACCESSION/O65119
EMBL_LINK/ACCESSION/P10848 PIR_LINK/ACCESSION/P05336 GENE_NAME/ACCESSION/P10848 GO_UNIPROT_LINK/XREF_KEY/Q7DLE1 ALTERNATIVE_ACCESSIONS/ACCESSION/P34937 PROSITE_LINK/ACCESSION/Q7DLE1 PIR_LINK/ACCESSION/P10847 GENE_NAME/ACCESSION/P08477
ENZYME/ACCESSION/P22988
TAXONOMY_CROSS_REFERENCE/ACCESSION/O65117
LITERATURE_REFERENCE/REFERENCE_ACCESSION/Q9ZNQ2[1]
ENZYME/ACCESSION/P10847
FEATURE/ACCESSION/P22989 GO_SEQ/DISPLAY_ID/Q9ZNQ2 PROSITE_LINK/ACCESSION/P22989 INTERPRO_LINK/ACCESSION/Q9ZR30 PROSITE_LINK/ACCESSION/P22988 GO_UNIPROT_LINK/ID/249477 ORGANISM_SPECIES/ACCESSION/Q9ZR31
GRAMENE_LINK/ACCESSION/P26517
GO_UNIPROT_LINK/ID/209447
PROSITE_LINK/ACCESSION/Q9ZNQ2 GO_UNIPROT_LINK/XREF_KEY/Q9ZNQ2 PROSITE_LINK/ACCESSION/P10847
PROSITE_LINK/ACCESSION/Q9ZR31 GRAMENE_LINK/ACCESSION/Q9ZR30
GO_UNIPROT_LINK/XREF_KEY/P26517 PIR_LINK/ACCESSION/P26517 GO_SEQ/DISPLAY_ID/O65118 GO_SEQ/DISPLAY_ID/P34937 ORGANISM_CLASSIFICATION/ACCESSION/Q7DLE1 ORGANISM_CLASSIFICATION/ACCESSION/P26517 PIR_LINK/ACCESSION/P22988 TAXONOMY_CROSS_REFERENCE/ACCESSION/Q7DLE1 PIR_LINK/ACCESSION/P10848
ENZYME/ACCESSION/O65115
LITERATURE_REFERENCE/ACCESSION/O65114
GO_UNIPROT_LINK/XREF_KEY/O65116 LITERATURE_REFERENCE/ACCESSION/Q9ZNQ2
EMBL_LINK/ACCESSION/P22989
PROTEIN/ACCESSION/P26517 INTERPRO_LINK/ACCESSION/Q7DLE1 PANTHER_LINK/ACCESSION/P26517
PANTHER_LINK/ACCESSION/Q9FEH1
LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65114[1] PANTHER_LINK/ACCESSION/O65115 MEDLINE_REFERENCE/REFERENCE_ACCESSION/O65114[1] ORGANISM_SPECIES/ACCESSION/O65114 GRAMENE_LINK/ACCESSION/O65114 PROTEIN/ACCESSION/O65114 KEYWORD/ACCESSION/O65114
GO_UNIPROT_LINK/ID/209448 AUTHORS/REFERENCE_ACCESSION/Q9FEH1[1]
GO_TERM/ID/1429
ORGANISM_CLASSIFICATION/ACCESSION/Q9ZNQ2 PRINTS_LINK/ACCESSION/P26517 INTERPRO_LINK/ACCESSION/Q9ZNQ2
PIR_LINK/ACCESSION/P08477 KEYWORD/ACCESSION/P26517 KEYWORD/ACCESSION/P34937 ORGANISM_SPECIES/ACCESSION/P26517 INTERPRO_LINK/ACCESSION/P26517
PIRSF_LINK/ACCESSION/P22988 HSSP_LINK/ACCESSION/P26517 ENZYME/ACCESSION/P08477 ENZYME/ACCESSION/P10848
LITERATURE_REFERENCE/ACCESSION/O65115 INTERPRO_LINK/ACCESSION/O65115
GO_SEQ_DBXREF/SEQ_ID/3386
KEYWORD/ACCESSION/Q9ZNQ2 ORGANISM_CLASSIFICATION/ACCESSION/P34937 PRINTS_LINK/ACCESSION/P08477
TIGRFAMS_LINK/ACCESSION/P26517
GENE_NAME/ACCESSION/Q9ZR30
GO_SEQ/ID/491959
HSSP_LINK/ACCESSION/P34937
PROSITE_LINK/ACCESSION/P26517 PFAM_LINK/ACCESSION/P34937 HSSP_LINK/ACCESSION/O65114
PROTEIN/ACCESSION/O65115 PUBMED_REFERENCE/REFERENCE_ACCESSION/O65115[1] GENE_NAME/ACCESSION/Q9ZNQ2 LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65115[1]
TAXONOMY_CROSS_REFERENCE/ACCESSION/O65119 ORGANISM_CLASSIFICATION/ACCESSION/O65119
GO_SEQ_DBXREF/SEQ_ID/3372
GO_SEQ_DBXREF/SEQ_ID/491959
FEATURE/ACCESSION/P10847
GO_SEQ/DISPLAY_ID/O65115
MEDLINE_REFERENCE/REFERENCE_ACCESSION/O65115[1]
GO_SEQ/DISPLAY_ID/Q7DLE1 GO_SEQ/ID/47465
LITERATURE_REFERENCE/REFERENCE_ACCESSION/P34937[1] PANTHER_LINK/ACCESSION/Q9ZR29 GO_SEQ_DBXREF/SEQ_ID/47465 AUTHORS/REFERENCE_ACCESSION/P10847[1]
GO_UNIPROT_LINK/ID/209450
KEYWORD/ACCESSION/O65115
AUTHORS/REFERENCE_ACCESSION/Q9ZNQ2[1]
GO_SPECIES/ID/232832 GO_SEQ_DBXREF/SEQ_ID/3339
GO_GRAPH_PATH/ID/446838
KEYWORD/ACCESSION/O65119 ENZYME/ACCESSION/Q9ZR31
ORGANISM_SPECIES/ACCESSION/O65119 LITERATURE_REFERENCE/ACCESSION/O65119 ENZYME/ACCESSION/O65120
EMBL_LINK/ACCESSION/O65117
TIGRFAMS_LINK/ACCESSION/P22988
GO_SEQ_DBXREF/SEQ_ID/491958
GO_SEQ_DBXREF/DBXREF_ID/42796 GO_GENE_PRODUCT_SEQ/SEQ_ID/491959
KEYWORD/ACCESSION/O65118 PROTEIN/ACCESSION/O65118 INTERPRO_LINK/ACCESSION/O65118
GO_SEQ/DISPLAY_ID/Q9ZR30 INTERPRO_LINK/ACCESSION/O65119 ORGANISM_SPECIES/ACCESSION/O65117
GENE_NAME/ACCESSION/O65117 MEDLINE_REFERENCE/REFERENCE_ACCESSION/O65117[1]
GO_LINK/ACCESSION/Q9ZR31
GO_GENE_PRODUCT_SEQ/SEQ_ID/47465
TAXONOMY_CROSS_REFERENCE/ACCESSION/P34937 GO_SEQ/DISPLAY_ID/P22989
GRAMENE_LINK/ACCESSION/O65119
PUBMED_REFERENCE/REFERENCE_ACCESSION/O65117[1] INTERPRO_LINK/ACCESSION/P05336 AUTHORS/REFERENCE_ACCESSION/Q9ZR30[1]
GO_GRAPH_PATH/ID/660365
GO_EVIDENCE/ID/940600
GO_ASSOCIATION/GENE_PRODUCT_ID/129253 GO_GENE_PRODUCT_COUNT/TERM_ID/1429 GO_ASSOCIATION/GENE_PRODUCT_ID/129249 GO_ASSOCIATION/GENE_PRODUCT_ID/1725525 GO_GENE_PRODUCT/ID/168629 GO_GENE_PRODUCT/ID/129250 GO_GENE_PRODUCT/ID/175463
GO_GOA_LINK/ID/93379 LITERATURE_REFERENCE/REFERENCE_ACCESSION/P08477[1]
GO_EC_LINK/ID/1211
GO_GENE_PRODUCT/ID/1167777
GO_GENE_PRODUCT/SPECIES_ID/230061
AUTHORS/REFERENCE_ACCESSION/O65916[1]
GO_EVIDENCE/ID/5577195
GO_ASSOCIATION/ID/849915 GO_ASSOCIATION/ID/808850 GO_EVIDENCE/DBXREF_ID/93379
AUTHORS/REFERENCE_ACCESSION/O65118[1]
GO_UNIPROT_LINK/ID/241024
GO_EVIDENCE/DBXREF_ID/93395
GO_GENE_PRODUCT/ID/129254
ORGANISM_CLASSIFICATION/ACCESSION/O65115
AUTHORS/REFERENCE_ACCESSION/P05336[1] ORGANISM_CLASSIFICATION/ACCESSION/O65114
AUTHORS/REFERENCE_ACCESSION/P22989[1]
ENZYME/ACCESSION/P22989 GO_SEQ/DISPLAY_ID/Q9FEH1
ORGANISM_CLASSIFICATION/ACCESSION/O65117
PROSITE_LINK/ACCESSION/Q9ZR30 FEATURE/ACCESSION/P34937 ENZYME/ACCESSION/Q9ZR29
LITERATURE_REFERENCE/ACCESSION/Q7DLE1
EMBL_LINK/ACCESSION/O65116 GRAMENE_LINK/ACCESSION/O65916 GENE_NAME/ACCESSION/O65116 ORGANISM_SPECIES/ACCESSION/O65916
FEATURE/ACCESSION/P08477
ENZYME_DISEASE/EC/1.1.1.27
GO_EC_LINK/ID/2444
GO_UNIPROT_LINK/XREF_KEY/O65114
KEYWORD/ACCESSION/P10847 INTERPRO_LINK/ACCESSION/P10847 HSSP_LINK/ACCESSION/O65916
ENZYME/EC_NR/5.3.1.1
FEATURE/ACCESSION/O65116
LITERATURE_REFERENCE/ACCESSION/P26517
PROTEIN/ACCESSION/P10847 ENZYME_REACTION/EC/1.1.1.1
GENE_NAME/ACCESSION/O65119
HSSP_LINK/ACCESSION/P10847
ORGANISM_CLASSIFICATION/ACCESSION/Q9ZR30 FEATURE/ACCESSION/Q7DLE1 EMBL_LINK/ACCESSION/P34937
ORGANISM_CLASSIFICATION/ACCESSION/P10847
ORGANISM_SPECIES/ACCESSION/P10847 INTERPRO_LINK/ACCESSION/O65116 PUBMED_REFERENCE/REFERENCE_ACCESSION/O65116[1]
GO_UNIPROT_LINK/ID/209745
ENZYME_ORGANISM/EC/1.1.1.27 ENZYME_REACTION/EC/1.1.1.27 ENZYME_NAME/EC/1.1.1.27
ENZYME_INHIBITOR/EC/1.1.1.27 LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65916[1] ENZYME_LOCALIZATION/EC/1.1.1.27 AUTHORS/REFERENCE_ACCESSION/O65116[1] ENZYME_ORGANISM/ORGANISMID/1.1.1.27_28_
EMBL_LINK/ACCESSION/Q7DLE1 AUTHORS/REFERENCE_ACCESSION/O65114[1] EMBL_LINK/ACCESSION/P26517 INTERPRO_LINK/ACCESSION/O65114 GENE_NAME/ACCESSION/Q7DLE1
PROTEIN/ACCESSION/Q9FEH1
LITERATURE_REFERENCE/ACCESSION/O65916 ENZYME_OVERVIEW/EC/1.1.1.27 GO_UNIPROT_LINK/XREF_KEY/O65916
ENZYME_PATHWAY/EC/1.1.1.27 ENZYME_COFACTOR/EC/1.1.1.27
ENZYME/ACCESSION/P34937 GRAMENE_LINK/ACCESSION/Q9ZR31 PROTEIN/ACCESSION/Q9ZR31 ORGANISM_SPECIES/ACCESSION/Q9ZR30 PROTEIN/ACCESSION/Q9ZR30
PANTHER_LINK/ACCESSION/Q9ZR31
PANTHER_LINK/ACCESSION/O65916 TAXONOMY_CROSS_REFERENCE/ACCESSION/O65916
PROTEIN/ACCESSION/Q9ZR29 ORGANISM_CLASSIFICATION/ACCESSION/Q9ZR29 LITERATURE_REFERENCE/ACCESSION/Q9ZR31 INTERPRO_LINK/ACCESSION/Q9ZR31 ORGANISM_CLASSIFICATION/ACCESSION/Q9ZR31 TAXONOMY_CROSS_REFERENCE/ACCESSION/Q9ZR30 ENZYME/ACCESSION/P26517
LITERATURE_REFERENCE/REFERENCE_ACCESSION/Q7DLE1[1] ORGANISM_SPECIES/ACCESSION/Q9FEH1 PRINTS_LINK/ACCESSION/P22989
ENZYME/EC_NR/1.2.1.12
TAXONOMY_CROSS_REFERENCE/ACCESSION/Q9FEH1 GENE_NAME/ACCESSION/Q9ZR29 PROTEIN/ACCESSION/O65120 KEYWORD/ACCESSION/O65120 FEATURE/ACCESSION/O65115 KEYWORD/ACCESSION/Q9ZR30 HSSP_LINK/ACCESSION/Q9ZR31 PANTHER_LINK/ACCESSION/Q9ZR30 TAXONOMY_CROSS_REFERENCE/ACCESSION/P10847 GRAMENE_LINK/ACCESSION/P10847 GO_UNIPROT_LINK/XREF_KEY/Q9ZR30 KEYWORD/ACCESSION/Q9FEH1 GENE_NAME/ACCESSION/P26517 INTERPRO_LINK/ACCESSION/Q9FEH1 KEYWORD/ACCESSION/Q9ZR31 TIGRFAMS_LINK/ACCESSION/P22989 HSSP_LINK/ACCESSION/Q9ZNQ2 EMBL_LINK/ACCESSION/O65119 GO_LINK/ACCESSION/Q7DLE1 FEATURE/ACCESSION/O65119 GRAMENE_LINK/ACCESSION/O65120 GO_UNIPROT_LINK/XREF_KEY/P10847 ORGANISM_SPECIES/ACCESSION/Q9ZR29 ORGANISM_SPECIES/ACCESSION/O65120 PROSITE_LINK/ACCESSION/P05336 LITERATURE_REFERENCE/REFERENCE_ACCESSION/P26517[1] TAXONOMY_CROSS_REFERENCE/ACCESSION/Q9ZR31 GRAMENE_LINK/ACCESSION/Q9FEH1 INTERPRO_LINK/ACCESSION/Q9ZR29 FEATURE/ACCESSION/P22988 LITERATURE_REFERENCE/ACCESSION/Q9ZR30 PANTHER_LINK/ACCESSION/P10847 PANTHER_LINK/ACCESSION/Q9ZNQ2 PFAM_LINK/ACCESSION/P10847 ORGANISM_CLASSIFICATION/ACCESSION/Q9FEH1 LITERATURE_REFERENCE/REFERENCE_ACCESSION/Q9ZR31[1] ORGANISM_SPECIES/ACCESSION/O65116 PROSITE_LINK/ACCESSION/Q9FEH1 PANTHER_LINK/ACCESSION/O65116 GO_UNIPROT_LINK/XREF_KEY/Q9ZR31 EMBL_LINK/ACCESSION/P22988 ORGANISM_CLASSIFICATION/ACCESSION/O65120 EMBL_LINK/ACCESSION/Q9ZR29 KEYWORD/ACCESSION/O65116 TAXONOMY_CROSS_REFERENCE/ACCESSION/Q9ZR29 FEATURE/ACCESSION/Q9ZR29 INTERPRO_LINK/ACCESSION/O65120 GRAMENE_LINK/ACCESSION/Q9ZR29 KEYWORD/ACCESSION/Q9ZR29 GO_UNIPROT_LINK/XREF_KEY/O65118 GENE_NAME/ACCESSION/O65115 ENZYME/ACCESSION/Q7DLE1 PROTEIN/ACCESSION/O65116 PANTHER_LINK/ACCESSION/O65120 PRODOM_LINK/ACCESSION/P34937
MEDLINE_REFERENCE/REFERENCE_ACCESSION/O65916[1]
LITERATURE_REFERENCE/ACCESSION/O65116 LITERATURE_REFERENCE/ACCESSION/P10847
ENZYME_DISEASE/EC/1.2.1.12
GO_EC_LINK/ID/3395 ENZYME_COFACTOR/EC/1.2.1.12 GO_EC_LINK/XREF_KEY/1.1.1.1
ENZYME_NAME/EC/1.2.1.12 ENZYME_OVERVIEW/EC/1.2.1.12 ENZYME_LOCALIZATION/EC/1.2.1.12
GO_LINK/ACCESSION/O65118
HSSP_LINK/ACCESSION/O65116 GO_SEQ/DISPLAY_ID/O65114 HSSP_LINK/ACCESSION/O65120
PUBMED_REFERENCE/REFERENCE_ACCESSION/O65916[1]
GO_EC_LINK/XREF_KEY/5.3.1.1 GO_EVIDENCE/ASSOCIATION_ID/4892614
HSSP_LINK/ACCESSION/Q9ZR30
TAXONOMY_CROSS_REFERENCE/ACCESSION/O65120 AUTHORS/REFERENCE_ACCESSION/P08477[1] AUTHORS/REFERENCE_ACCESSION/P10848[1] GO_UNIPROT_LINK/XREF_KEY/Q9FEH1 ORGANISM_CLASSIFICATION/ACCESSION/O65116
EMBL_LINK/ACCESSION/O65115
FEATURE/ACCESSION/O65114 FEATURE/ACCESSION/P26517
GO_UNIPROT_LINK/ID/241023
PROSITE_LINK/ACCESSION/P10848 ENZYME_PATHWAY/EC/5.3.1.1
AUTHORS/REFERENCE_ACCESSION/Q7DLE1[1]
GO_UNIPROT_LINK/XREF_KEY/O65120
ENZYME_OVERVIEW/EC/5.3.1.1
GRAMENE_LINK/ACCESSION/O65116 LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65116[1]
ENZYME_REACTION/EC/1.2.1.12
TAXONOMY_CROSS_REFERENCE/ACCESSION/O65116 PROSITE_LINK/ACCESSION/Q9ZR29 GO_UNIPROT_LINK/XREF_KEY/Q9ZR29
LITERATURE_REFERENCE/ACCESSION/Q9FEH1
ENZYME_ORGANISM/EC/1.2.1.12 ENZYME_DISEASE/EC/5.3.1.1
LITERATURE_REFERENCE/ACCESSION/O65120
ENZYME_LOCALIZATION/EC/1.1.1.1
ENZYME_PATHWAY/EC/1.2.1.12
ENZYME_INHIBITOR/EC/1.2.1.12 GO_UNIPROT_LINK/XREF_KEY/O65119 ENZYME_LOCALIZATION/EC/5.3.1.1 ENZYME_NAME/EC/5.3.1.1
GO_UNIPROT_LINK/XREF_KEY/O65115
LITERATURE_REFERENCE/REFERENCE_ACCESSION/P10847[1]
ENZYME_ORGANISM/ORGANISMID/1.2.1.12_172_
ENZYME_COFACTOR/EC/1.1.1.1 ENZYME_ORGANISM/EC/5.3.1.1
LITERATURE_REFERENCE/ACCESSION/Q9ZR29
GO_UNIPROT_LINK/ID/209453
ENZYME_COFACTOR/EC/5.3.1.1 MEDLINE_REFERENCE/REFERENCE_ACCESSION/O65116[1] GO_UNIPROT_LINK/ID/209449
LITERATURE_REFERENCE/REFERENCE_ACCESSION/Q9ZR30[1] LITERATURE_REFERENCE/REFERENCE_ACCESSION/Q9FEH1[1]
MEDLINE_REFERENCE/REFERENCE_ACCESSION/P22989[1] GO_UNIPROT_LINK/XREF_KEY/O65117 LITERATURE_REFERENCE/REFERENCE_ACCESSION/Q9ZR29[1] PUBMED_REFERENCE/REFERENCE_ACCESSION/P22989[1]
ENZYME_ORGANISM/ORGANISMID/5.3.1.1_49_ LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65120[1]
AUTHORS/REFERENCE_ACCESSION/O65119[1] AUTHORS/REFERENCE_ACCESSION/P34937[1] AUTHORS/REFERENCE_ACCESSION/Q9ZR29[1] GO_EVIDENCE/ASSOCIATION_ID/610848 TAXONOMY_CROSS_REFERENCE/ACCESSION/O65118 AUTHORS/REFERENCE_ACCESSION/P22988[1] AUTHORS/REFERENCE_ACCESSION/P26517[1] ORGANISM_CLASSIFICATION/ACCESSION/O65118
MEDLINE_REFERENCE/REFERENCE_ACCESSION/O65120[1]
GO_EC_LINK/XREF_KEY/1.1.1.27
AUTHORS/REFERENCE_ACCESSION/O65115[1] PUBMED_REFERENCE/REFERENCE_ACCESSION/O65120[1]
809 ENZYME_PATHWAY/EC/1.1.1.1 ENZYME_ORGANISM/EC/1.1.1.1
ENZYME_PRODUCT/EC/1.1.1.1 ENZYME_DISEASE/EC/1.1.1.1
ENZYME_INHIBITOR/EC/1.1.1.1 ENZYME_NAME/EC/1.1.1.1 ENZYME_ORGANISM/ORGANISMID/1.1.1.1_1_
ENZYME_OVERVIEW/EC/1.1.1.1
Figure 4: data network of enzymes in barley showing catalytic activity in glycolysis (highresolution image: http://pgrc.ipk-gatersleben.de/dlg/Hordeum Glycolysis. pdf)
References [ABB+ 00] M. Ashburner, C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. Dwight, J. T. Eppig, M. A. Harris, D.P. Hill, L. IsselTarver, A. Kasarskis, S. Lewis, J. C. Matese, J. E. Richardson, M. Ringwald, Rubin G. M., and G. Sherlock. Gene Ontology: tool for the unification of biology. Nature Genetics, 25:25–29, 2000. [ABL+ 99] M. A. Andrade, N. P. Brown, C. Leroy, S. Hoersch, A. de Daruvar, C. Reich, A. Franchini, J. Tamames, A. Valencia, C. Ouzounis, and C. Sander. Automated genome sequence analysis and annotation. Bioinformatics, 15(5):391–412, 1999. [AGM+ 90] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. Basic local alignment search tool. Journal of Molecular Biology, 3:403–410, 1990.
Journal of Integrative Bioinformatics, 4(3):68, 2007
10
Journal of Integrative Bioinformatics 2007
[BK03]
http://journal.imbio.de/
F. Bry and P. Kr¨oger. A Computational Biology Database Digest: Data, Data Analysis, and Data Management. Distributed and Parallel Databases, 13(1):7– 42, 2003.
[BLSS04] S. Balko, M. Lange, R. Schnee, and U. Scholz. BioDataServer: An Applied Molecular Biological Data Integration Service. In S. Istrail, P. Pevzner, and M. Waterman, editors, Data Integration in the Life Sciences; First International Workshop, DILS 2004 Leipzig, Germany, volume 2994 of Lecture Notes in Bioinformatics, pages 140–155. Berlin et al: Springer, 2004. [EUA96]
T. Etzold, A. Ulyanow, and P. Argos. SRS: Information Retrieval System for Molecular Biology Data Banks. Methods in Enzymology, 266:114–128, 1996.
[HSA+ 02] A. Hamosh, A. F. Scott, J. Amberger, C. Bocchini, D. Valle, and V. A. McKusick. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Research, 30(1):52–55, 2002. [KGKN02] M. Kanehisa, S. Goto, S. Kawashima, and A. Nakaya. The KEGG database at GenomeNet. Nucleic Acids Research, 30(1):42–46, 2002. http://www.genome.ad.jp/kegg/. [KLF+ 05] C. K¨unne, M. Lange, T. Funke, H. Miehe, I. Grosse, and U. Scholz. CR–EST: a resource for crop ESTs. Nucleic Acids Research, 33(suppl 1):D619–621, 2005. [LMP+ 04] Z. Lacroix, T. Morris, K. Parekh, L. Raschid, and M.-E. Vidal. Exploiting multiple paths to express scientific queries. In Proceedings of SSDBM 2004, 2004. [MFG+ 02] H. W. Mewes, D. Frishman, U. G¨uldener, G. Mannhaupt, K. Mayer, M. Mokrejs, B. Morgenstern, M. M¨unsterk¨otter, S. Rudd, and S. Weil. MIPS: a database for genomes and protein sequences. Nucleic Acids Research, 30(1):31–34, 2002. [SGBB01] R. Stevens, C. Goble, P. Baker, and A. Brass. A classification of tasks in bioinformatics. Bioinformatics, 17(2):180–188, 2001. [Ste02]
L. Stein. Creating a bioinformatics nation. Nature, 417:119–120, 2002.
[TBG+ 04] O. Thimm, O. Blasing, Y. Gibon, A. Nagel, S. Me yer, P. Kruger, J. Selbig, L. A. Muller, S. Y. Rhee, and M. Stitt. MAPMAN: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. Plant Journal, 37(6):914–914, 2004.
Journal of Integrative Bioinformatics, 4(3):68, 2007
11