Data Linkage Graph: computation, querying and knowledge discovery

Journal of Integrative Bioinformatics 2007

http://journal.imbio.de/

Data Linkage Graph: computation, querying and knowledge discovery of life science database networks Matthias Lange, Axel Himmelbach, Patrick Schweizer, Uwe Scholz Institute for Plant Genetics and Crop Plant Research (IPK) Gatersleben, Germany {lange|himmelba|schweiz|scholz}@ipk-gatersleben.de Summary To support the interpretation of measured molecular facts, like gene expression experiments or EST sequencing, the functional or the system biological context has to be considered. Doing so, the relationship to existing biological knowledge has to be discovered. In general, biological knowledge is worldwide represented in a network of databases. In this paper we present a method for knowledge extraction in life science databases, which prevents the scientists from screen scraping and web clicking approaches. We developed a method for extraction of knowledge networks from distributed, heterogeneous life science databases. To meet the requirement of the very large data volume, the method used is based on the concept of data linkage graphs (DLG). We present an efficient software which enables the joining of millions of data points over hundreds of databases. In order to motivate possible applications, we computed networks of protein knowledge, which interconnect metabolic, disease, enzyme and gene function data. The computed networks enabled a holistic relationship among measured experimental facts and the combined biological knowledge. This was successfully applied for a high throughput functional classification of barley EST and gene expression experiments with the perspective of an automated pipeline for the provisioning of controlled annotation of plant gene arrays and chips. Availability: The data linkage graphs (XML or TGF format), the schema integrated database schema (GML or GRAPH-ML) and the graph computation software may be downloaded from the following URL: http://pgrc.ipk-gatersleben.de/dlg/

1 Motivation High throughput biotechnologies, like cDNA or oligo arrays, genome and EST sequencing projects towards proteomics techniques, produce data from measured molecular facts. Those facts could be gene expression, polypeptide concentration or EST sequences. To support the interpretation of measured molecular facts in order to discover the functional role of the measured molecule, the relationship to existing biological knowledge presented in world wide distributed databases, has to be built. To make the data accessible to desktop computers in the labs, the World Wide Web (WWW) is very popular. Techniques like manual browsing towards web services are commonly used. Using these techniques, bioinformatician may facilitate database embedded tools for linking wet-lab data to database knowledge in an explorative way. Methods like sequence similarities, stochastic networks for pattern recognition or text similarities queries are the selection operations, which are used for this data domain [Ste02, SGBB01]. Journal of Integrative Bioinformatics, 4(3):68, 2007

1



One limiting drawback in this scenario is the data and tool distribution. Nowadays, we know about hundreds of databases, with different but also overlapping content. Efforts for integrating them were made in several projects using several methods. Because the majority is focused on particular analytic scenarios and applications, none of them provided a holistic, uninterpreted view to the world wide interlinked knowledge. Popular techniques, like data warehouses, multi database query languages or database mediators solved the problem of database heterogeneties and provided powerful query interfaces and helpful tools. But the original character of life science databases as a highly dynamic and flexible network of knowledge is not supported. Because the nature of the knowledge stored in the WWW is to be a network, we need to provide an efficient method for the extraction of data and in future knowledge networks [BK03]. The navigation and benefits of exploiting the data networks was motivated in [LMP+ 04]. Here a system called BioNavigation supports the scientist in exploring the content within an integrated database system using the BioMediation-system. This system supports the graph based browsing and query building on top of integrated databases. In this paper, we present a recently developed method for the efficient extraction and querying of data networks from distributed, heterogeneous life science databases. Surprisingly comprehensive knowledge networks were computed and are available for public access. First applications of this networks were successfully used for a high throughput functional classification of EST towards an automated pipeline for the provisioning of controlled annotation of plant gene expression arrays. This and further applications will be described.

2 Method To meet the requirement of the given flexibility of knowledge networks and to enable an efficient but also lightweight processing of very large volume of integrated data, the concept of data linkage graphs (DLG) and an algorithm to efficient compute them has been developed. 2.1 Data schema graphs The concept is the representation of integrated databases as a network of key attributes and its values. Key attributes are those one which define unique identifier like EMBL sequence accessions, Gene Ontology (GO) terms [ABB+ 00] or EC numbers. To keep the modeled context of the attributes, they are assigned to data entities. In the example above GO term and EMBL sequence are entities, id and accession are the key attributes. Following this, we define schema nodes S as a set of pairs S = (E, A), whereas E is the entity and A is the key attribute. In order to construct networks, edges have to be defined. In the matter of fact, life science database are strongly interconnected. Studies assume a connection rate of at least 80% in life science databases [BK03]. This is manifested by a number of hyperlinks in the Web presentation of the databases, references in XML documents or in primary key / foreign key pairs in relational data structures. We use this relationships and extract out of them the edges R among the schema nodes S. Formally, R is a subset of all possible combinations of two elements from S. Thus, we write: R⊆S×S Journal of Integrative Bioinformatics, 4(3):68, 2007

(1) 2



Because, we are interested in real existing relationships among schema nodes, we restrict R by: Rlink = {(s, f ), (f, s) | s, f ∈ S ∧ f is in relation to s} Rfunc = {(s1 , s2 ), (s2 , s1 ) | s1 , s2 ∈ S ∧ s1 .E = s2 .E} R = Rlink ∪ Rfunc

(2) (3) (4)

In other words, we consider two kinds of bidirectional relations among key attributes: Rlink : Inter-entity relationships, which are modeled as links in the particular databases. Those could be represented by: HTML hyper links, primary key/foreign key, object references etc. Rfunc : Inner-entity relationships, which are modeled as functional dependency among attributes in one entity. Those could be represented by: co-occurrence in one HTML page, childs of the same XML parent item or modeling as attributes of one relational table. With this rules we are able to construct a graph as a pair of entity and relationship sets: α = (S, R), that comprises a database network at schema level. 2.2 Data linkage graphs The next step is the extension of the schema graph toward the instantiated data networks. Thus, we have to consider the attribute values, the stored data in the databases. In the context of the above example, we have a Gene Ontology term id GO:0005975, which identifies the term carbohydrate metabolism or the EMBL accession BAY074321, which identifies a sequence, coding for alpha-galactosidase. If we include this attribute values in our model and construct data networks, we will end up with graphs of interacting database entries. We extend the pair S to a triple, with one additional element V . Thereby, V is an element of the data domain of S, e.g. all valid EMBL accessions or Gene Ontology term identifier. We define data nodes SV as a set of a triple: SV = (E, A, V ) | V ∈ dom(S). If we imply the existence of a = relation in all used data domains for our key attributes A, we can define relationships * ) among two elements of a data node (sv1 , sv2 ∈ SV ), if one of the subsequent conditions is fulfilled. Furthermore, we assume the existence of two operations selection σ and projection π, which are borrowed from relational algebra. If we use SV as a relational table, the selection returns all those tuples (data nodes), in which a certain element is equal to a given one. To decide the equality, we use the = operation. In our application, the projection reduce the data node tuples to one specified element. {(sv1 .E, sv1 .A), (sv2 .E, sv2 .A)} ⊂ Rlink πsv1 .A (σsv2 .A=sv2 .V (sv2 .E)) = πsv2 .A (σsv1 .A=sv1 .V (sv1 .E))

(5)

{(sv1 .E, sv1 .A), (sv2 .E, sv2 .A)} ⊂ Rfunc σsv1 .A=sv1 .V (sv1 .E) = σsv2 .A=sv2 .V (sv2 .E)

(6)

or

Journal of Integrative Bioinformatics, 4(3):68, 2007

3



Colloquially, two data nodes are in relation if, 1. the schema nodes are in relation and the data elements are the equal (equation 5), or 2. the data nodes are mappable to the same schema and the tuples come from the same instance, e.g. table row or Web page (equation 6).

G( O A. C C G, O N.I T RE RP O )

R

NI T E R P R O

E

N A M

A C C

ac vtii yt

G G G O O O 00: 00: 00: 05 00 00 97 45 45 5 64 53 be O m ac a-t gu-l eat bor ucrf ocs boli hdy ost hyl ms ar et dai dy es oarl es PI PI PI R R R 00: 00: 00: 00 62 01 56 32 11

DI

#3

#2

#1

G O

uf n c

( ME B L G. O A G, O A. C C)

R

G G G G O O O O 00: 00: 00: 00: 00 05 00 05 45 97 45 97 64 5 53 5

G O A

l ni k

( ME B L A. C C , ME B L G. O A ) D E

A C C

DI

N T REP R O = Ì RP: 00 00 56 `(

G( O =) σI

G O N.I T RE PR O .I RP 00: 00 56

G O A. C C G. O 00: 05 97 5 G O A. C C G. O 00: 00 45 53

A CC = G` O 00: 05 97 5`

ME B L G. O A G. O 00: 05 97 5 ME B L G. O A G. O 00: 00 45 53

G O . DI G #. O 1 ) :

G O )) σ

A CC σ( A CC = G` O 00: 05 97 5` (

E M B L)

O A = G` O 00: 05 97 5` (

N T REP R O = Ì RP: 00 00 56 `(

G( O =) σI

G O DI. G #. O 2 ) :

G O N.I T RE PR O .I RP :0 00 11 1

G( O ))

A CC σ( A CC = G` O 00: 00 45 53 `

G O A σ( G O A = G` O 00: 00 45 53 `

π E M B L. DI #2. nda G O . DI (E #. 2 M : B L) =) π

A CC = G` O 00: 05 97 5`

σ

ME B L)

O A = G` O 00: 00 45 53 `(

A CC = A` Y 07 43 21 `(

σ E M B L. DI #2. : E M B L) = σ G

E M B L A. C C . A Y 07 43 21

A CC = A` Y 07 43 21 `(

σ E M B L. DI #1. : ME B L) = σ G

G O A σ( G O A = G` O 00: 05 97 5` (

π E M B L . DI #1. nda G O I. E D#. M 1 B : L ) =) π

d a at nli k a g e g ar p h :

#4 #3 #2 #1 A Y1 Y1 A 15 15 Y07 Y07 59 59 43 43 21 21 Bf Bf Al Al ucr ucr pha pha 4t 4t ga- gacl cl ost ost dai dai es es

B L

E M

uf n c

R

s c h e m a g ar p h :

To illustrate these types of relationships, the relationships between two data sources are visualized in figure 1: Assuming that there are the EMBL and the Gene Ontology (GO) database.

Figure 1: example for the relationships in schema and data linkage graphs

Each comprise three attributes in one entity. With known inter-attribute relationships (e. g. derived from hyperlinks among EMBL and GO), our example includes three relations: Rf unc1 = (EM BL.ACC, EM BL.GOA) Rf unc2 = (GO.ACC, GO.IN T ERP RO) Rlink1 = (EM BL.GOA, GO.ACC) On the basis of the above defined relation * ) we can compute a set of data node pairs: RV ⊆ SV × SV

(7)

(sv1 , sv2 ) ∈ RV ∧ sv1 * ) sv2

(8)

such that


4



Finally, we are able to describe networks of data nodes as set of graphs with the structure β = (SV, RV ). Using DLGs in combination with an efficient graph query engine, a general concept for a novel flexible, lightweight database integration becomes possible. Flexible means no schema or data integration, with all the conflicts needs to be solved. Leigthweigt, because only the key attributes values have to be accessed physically, parsed and downloaded respectively. Subsequently we describe the concrete construction of schema networks and the computation of data networks, in form of data linkage graphs (DLGs).

3 Implementation In order to use the described concept for joining millions of data points in over hundreds of databases, we designed and implemented an efficient algorithm for the computation of DLGs. The prerequists are (1) a graph of schema nodes and (2) for all schema nodes, the attribute values. 3.1 Data Import For the first task, we designed a protein knowledge network, which interconnects metabolic, disease, enzyme and gene functional data. In this context, we merged the data schema of the public databases UNIPROT (UP), Gene Ontology, OMIM, BRENDA and KEGG and derived a graph of schema nodes. This graph, available at (http://pgrc.ipk-gatersleben. de/dlg/, comprises 318 nodes and 234 edges, which cover the majority of the currently available protein related databases. In a collaboration with the University of Bielefeld, Germany, methods are developed to construct schema graphs automatically. This will be realized based on public available life science database repositories, like the NAR Database Categories List, pattern recognition in life science database WWW-presentation and the tracking of their hyperlinks. The retrieval of the key attribute values was realized by a combination of an in-house developed parser, the Sequence Retrieval System (SRS) [EUA96] and a database mediator called BioDataServer (BDS) [BLSS04]. In this way, a file of key values for each schema node was generated. The results are 318 files comprising around 5.6*107 key values. Hereby, we observed a 10-fold redundancy. This potential for high compression rate motivates to the computation of a hash table, mapping each key value to an integer. If we assume a 64-bit coded integer, a 10-character identifier from the Gene Ontology can be stored in 8 bytes. This saves storage capacity in a way, so that is it possible now to represent the data linkage graphs in modern computer main memory. 3.2 Data network computation This properties influenced the strategy for the DLG computation. The standard way for the computation of transitive entity relations are index supported joins. Such queries are commonly used to find data relationships in integrated databases. Examples of such queries are Journal of Integrative Bioinformatics, 4(3):68, 2007

5



those, mapping sequence data to the predicted functional role of the coded gene. The common established strategy is to import relevant data into a data warehouse and execute queries. Because of the complexity of such queries, the feasible number of tables and tuples in such joins is limited. In other words, if all of the i schema nodes were joined (./) and a cascading merge join was used, we would have a polynomial complexity, because database management systems are optimized for a data storage using secondary memory, usually hard discs and the cascading sort-merge join is the commonly used join implementation in relational database management systems. The complexity of each merge join is O(n). Consequently, the overall complexity is: Ojoin = O(n)table1 ./table2 ∗ . . . ∗ O(n)tablei−1 ./tablei = O(n)i

(9)

The consequence is the limitation to use case specific data warehouses, which disables a comprehensive database integration. Thus, there is a complexity problem for comprehensive data queries over huge data warehouses using big systems like SRS or even ORACLE. They support efficient joins only for a limited number of data, tables or databases. More comprehensive queries would result in days of waiting or even system failures. To avoid those problems and to improve the implementation, we investigated main memory based data structures and algorithms. We discussed previously that the nodes in our data networks show high redundancy and a compression rate of factor 10 can be achieved. Thus, we where able to represent all nodes by 5.6*107 64-bit pointer to 5.4*108 data nodes. This data structure took around 4GByte. Using this data structure, we pre-compute all possible joins in main memory. A C++ program loads all data nodes into main memory and computes all possible graphs. For this, we used the recursive breadth-first search (BFS) algorithm in combination with index structures like hash maps, B-trees and bit vectors. The resulted graphs where stored in a flat relational structure: (graphid, node depth, parent entity, parent attribute, child entity, child attribute, parent value, child value). This graphs are available in csv-format at the URL given bellow. Because of the recently finished graph computation, no data format optimization was done. The representation in a hierarchical data structure, like XML, is ad-hoc realizable, but not yet used practically. The graph computation took 3 weeks on a 2GHz Opteron server, whereas 12 GByte of main memory and 2 CPUs were used. 3.3 Data network queries We loaded the graphs into an ORACLE 10g database table. Some example queries were performed, like a query for all reachable data nodes starting from a given one. The application for such a query is e.g. the assignment of EMBL sequences to Gene Ontology controlled vocabulary, metabolic pathways in KEGG, diseases in OMIM etc. The former necessary join was replaced by a simple start-destination-search in the graphs. This search retrieves all graphs that connect a given start-node to an end-node. In SQL this may expressed as a self join: select g2.data from graphs g1, graphs g2 where g1.graphid = g2.graphid and Journal of Integrative Bioinformatics, 4(3):68, 2007

6



g1.parent entity = ’EMBL’ and g1.parent attribute = ’ACCESSION’ and g2.child entity = ’GO TERM’ and g2.child attribute = ’ACCESSION’ The complexity of such a query is derived from the complexity of the two selection operations for selecting the graphs containing the queried start and end node, and additionally their union. Using indexes for the selection, the complexity is O(log n). Applying merge join on sorted graph identifiers, the complexity is O(n). Therefore, the overall complexity of the above query is: Ographquery = O(log n) + O(log n) + O(n) = O(n)

(10)

If we don’t take in account that we need to compute the DLG with the BFS complexity (because it will be necessary only once per database update), this will be a significant reduction of the polynomial join complexity.

4 Discussion The analysis and interpretation of wet-lab data, performed at a typical scientist desk is an iterative process of tool application and database exploration. A frequent task in biology is the functional gene annotation, which is often accomplished by finding similarities or patterns in known, experimentally annotated sequence data. One common application of this paradigm is, for instance, (1) the mapping of translated nucleic sequences by direct sequence homology recognition using the BLAST tool [AGM+ 90] in an annotated, merged polypeptide sequence database like NRPEP (ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nr.gz). The hits which match known proteins link to the original databases and have to be (2) extracted eighter manually or using parsers. Finally, all linked databases have to be inspected for getting relevant information for the input sequences. The datasets on functional aspects of sequences and proteins present in life science databases are mainly derived from literature or provided from the database annotation team in textual form in natural language. In contrast, the use of a controlled vocabulary (CV) is a necessary concept for making annotations computational comparable. Using CVs, automized phenotypic classification for a huge number of wet-lab data is possible without human interaction or interpretation of textual description of a protein function. There are several systems, implementing such kind of systematic knowledge representation: the MIPS Functional Catalog [MFG+ 02], the Gene Ontology Database [ABB+ 00], GeneQuiz [ABL+ 99], OMIM [HSA+ 02], KEGG [KGKN02] or MapMan [TBG+ 04]. All those databases where more or less manually constructed. 4.1 Functional mapping to controlled vocabulary Here we present an application of the DLGs to map general functional sequence annotation to controlled vocabulary in a computational way. The basic idea of the functional categorization is to (1) take BLAST [AGM+ 90] hits for a number of EST, e. g. an EST library, (2) extract the database identifiers from the hit descriptions, Journal of Integrative Bioinformatics, 4(3):68, 2007

7



(3) find them in the data linkage graphs and finally (4) query the data link graphs to (5) find a path from the found database identifier to the interesting annotation database entity. In first use cases, this work flow was implemented to map Bests to their functional role in the Gene Ontology (GO) Database. The result is the assignment of ESTs to the three GO ontologies. The work flow has been implemented by an SQL-query, using the start-destination query pattern presented before. Figure 2 visualizes this process. S

t

e

p

1

S

t

e

BLAST against public databases E

S

T

l

i

b

r

a

r

p

2

PIR

EMBL

GB

y

DDBJ

UP

S

t

e

p

3

assign BLAST hits to DLG nodes (start nodes)

S

t

e

p

5

GO terms

Start-destination-search query on top of DLGs to map start-nodes to destination nodes (GO term) S

t

e

p

4

Figure 2: work flow to assign ESTs to GO terms

In order to evaluate the computational mapping of ESTs to GO terms, a set of 440 EST has been mapped computational to a GO term in around 2 minutes using an ORACLE 10g on a 2GHz Itanium server. Those ESTs represent genes which are used for an IPK internal cDNA macro array. Thus, the functional role of those gene has been studied in detailed from institute’s scientists, but had never been before included in GO. For the automatic annotation using the described pipeline, the cut-off criteria for the used BLAST results was an e-value less then 1E-10. The results of the annotation have been manually checked. We discovered, that 6% had a false assignment to GO terms. 80% were assigned unquestionable correct. The rest of 14% were not clearly decidable but tended to be correct. Those first investigations have been recently done. Thus, the data is available on request from the authors. Because of this promising results and to demonstrate the flexibility of the approach, another annotation mapping was computed to KEGG [KGKN02] metabolic pathways. This query took on the same server approximately 45 minutes. We were able to identify 18,607 out of 111,090 sequences stored in the IPK CR-EST database [KLF+ 05], which show enzymatic activity. By using the data linkage graphs and starting from either EMBL, Genbank, DDBJ, PIR or SWISSPROT/TREMBL in the BLAST results. 21% of these ESTs assigned to functional categories belong to carbohydrate metabolism, 19% to amino acid- and 17% energy metabolism. In contrast, using BLAST results only 1906 functional descriptions containing an EC numbers were found. Hence, compared to BLAST the sensitivity of functional annotation was increased 10fold. Figure 3 shows a simple statistic over the non-redundant metabolic annotated ESTs.


8



Figure 3: mapping of ESTs from the CR-EST database to GO terms

4.2 Database network discovery Beside the above mentioned graph queries, further analysis will be done. Scenarios are knowledge discovery, by context specific joining of graphs, like building context networks of data on specific diseases, regulatory elements, protein data and gene variants. This can be supported by a multidimensional, structural analysis of integrated databases, like centrality analysis for the identification of key elements of data networks. Furthermore, quality control of public databases is an important challenge in modern bioinformatics. Using DLGs we can e. g. instance extraction all networks that include contradicting data nodes for the same schema node. The URL http://pgrc.ipk-gatersleben.de/dlg/ can be used to download the data schema graph (GML or GraphML format), the data linkage graphs (CSV format) or examples of context specific extracted metabolic knowledge networks for plants (GML or GraphML format). Later are queried using SQL by joining the DLGs over the enzymes (EC numbers) which are present in glycolysis metabolic pathway and protein which are reported in Brassicae, Triticeae and Solanacea. Thus we extracted networks of knowledge to enzymes in barley which show catalytic activity in glycolysis. Figure 4 gives an impression of such a data network. A high resolution and comfortable version is from this figure is available, using the graph datafiles in a graph viewer. The files and a link to a viewer are available using the URL given previously.

5 Outlook Beside the presented application of the data linkage graphs, it is possible to perform more comprehensive applications of the presented concepts. The description was used to give a proofof-principle. More use cases and further more detailed quality control of the computational sequence annotation mappings are on the road map. Further plans are underway to expand the data basis and implement a user friendly and flexible tool to perform queries on top of data linkage graphs to discover knowledge in networks of life science databases.


9



GO_GRAPH_PATH/TERM1_ID/2141

GO_TERM2TERM/TERM1_ID/5609

GO_GRAPH_PATH/ID/9604 GO_TERM2TERM/TERM1_ID/8662 GO_TERM_DBXREF/DBXREF_ID/1211 GO_GRAPH_PATH/TERM2_ID/1429 GO_TERM2TERM/TERM1_ID/1429 GO_TERM2TERM/TERM2_ID/2141 GO_GRAPH_PATH/TERM1_ID/1820 GO_TERM2TERM/TERM1_ID/1818 GO_UNIPROT_LINK/ID/1801841 GO_TERM_DBXREF/TERM_ID/1429 GO_TERM2TERM/RELATIONSHIP_TYPE_ID/2 GO_EC_LINK/ID/2193

GO_GRAPH_PATH/TERM2_ID/2141 GO_TERM2TERM/TERM2_ID/1820 GO_LINK/ACCESSION/O65114 GO_TERM_DBXREF/DBXREF_ID/3395

GO_EVIDENCE/ID/7778742 GO_EVIDENCE/ID/777411 GO_TERM_DEFINITION/TERM_ID/1429 GO_UNIPROT_LINK/ID/1801842 GO_TERM2TERM/TERM2_ID/1430 GO_EVIDENCE/ID/840683 GO_EVIDENCE/ID/647215 GO_EVIDENCE/ID/645701GO_EVIDENCE/ID/645705

GO_TERM2TERM/TERM2_ID/1737 GO_LINK/ACCESSION/O65117

GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/1167777

GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/129250 GO_TERM_DBXREF/TERM_ID/2141

GO_UNIPROT_LINK/ID/1801840 GO_TERM_DBXREF/DBXREF_ID/2193

GO_EVIDENCE/ID/840689 GO_EVIDENCE/ASSOCIATION_ID/6765476 GO_EVIDENCE/ID/645685 GO_EVIDENCE/ID/8394446

GO_GRAPH_PATH/TERM1_ID/1429 GO_TERM_DBXREF/TERM_ID/1737 GO_EVIDENCE/ASSOCIATION_ID/720699 GO_EVIDENCE/ID/645697 GO_EVIDENCE/ID/645689

GO_EVIDENCE/ASSOCIATION_ID/731461


GO_TERM_DEFINITION/TERM_ID/2141

GO_TERM2TERM/ID/3036

GO_EVIDENCE/ID/8395489

GO_UNIPROT_LINK/ID/1245104

GO_TERM2TERM/ID/2419 GO_EVIDENCE/ID/889156 GO_EVIDENCE/ID/790798

GO_TERM_SYNONYM/TERM_ID/1429



GO_EVIDENCE/ID/645709 GO_EVIDENCE/ASSOCIATION_ID/610863 GO_EVIDENCE/ASSOCIATION_ID/610860

GO_TERM_DBXREF/DBXREF_ID/2444 GO_TERM_DEFINITION/TERM_ID/1737

GO_EVIDENCE/ID/889165 GO_EVIDENCE/ID/645693

GO_LINK/ACCESSION/O65115



GO_EVIDENCE/ASSOCIATION_ID/769831 GO_EVIDENCE/ID/8395485



GO_ASSOCIATION/ID/612102 GO_TERM2TERM/ID/2760 GO_LINK/ACCESSION/O65116 GO_EVIDENCE/ASSOCIATION_ID/7288423

GO_ASSOCIATION/ID/6765476 GO_EVIDENCE/ASSOCIATION_ID/610851

GO_EVIDENCE/ASSOCIATION_ID/7288417 GO_EVIDENCE/ASSOCIATION_ID/808844

GO_EVIDENCE/ASSOCIATION_ID/610857 GO_ASSOCIATION/ID/610860 GO_ASSOCIATION/ID/7288423

GO_ASSOCIATION/GENE_PRODUCT_ID/129544

GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/129248 GO_TERM_DBXREF/TERM_ID/1820 GO_TERM_DEFINITION/TERM_ID/1820

GO_DBXREF/ID/93395

GO_TERM2TERM/ID/2849


GO_ASSOCIATION/ID/610857 GO_ASSOCIATION/ID/610866

GO_EVIDENCE/ASSOCIATION_ID/610854 GO_EVIDENCE/ASSOCIATION_ID/808850 GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/129249

GO_EVIDENCE/ASSOCIATION_ID/7288420 GO_ASSOCIATION/ID/731461 GO_ASSOCIATION/TERM_ID/1820


GO_ASSOCIATION/TERM_ID/1737 GO_ASSOCIATION/ID/720699


GO_ASSOCIATION/GENE_PRODUCT_ID/129252 GO_ASSOCIATION/GENE_PRODUCT_ID/129251 GO_ASSOCIATION/ID/7288420 GO_ASSOCIATION/ID/610848

GO_ASSOCIATION/TERM_ID/2141

GO_ASSOCIATION/ID/610854 GO_ASSOCIATION/GENE_PRODUCT_ID/1725527 GO_ASSOCIATION/ID/769835 GO_ASSOCIATION/ID/7287508

GO_GENE_PRODUCT/ID/154934

GO_DBXREF/ID/93379

GO_ASSOCIATION/GENE_PRODUCT_ID/161712 GO_ASSOCIATION/ID/610863 GO_ASSOCIATION/ID/610851 GO_ASSOCIATION/GENE_PRODUCT_ID/175463 GO_ASSOCIATION/ID/7288417 GO_ASSOCIATION/ID/808844

GO_GENE_PRODUCT/ID/1607169 GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/1607169

GO_ASSOCIATION/TERM_ID/1429

GO_EVIDENCE/ASSOCIATION_ID/769835 GO_GENE_PRODUCT/ID/129248 GO_EVIDENCE/ID/8395481 GO_ASSOCIATION/ID/769831


GO_GENE_PRODUCT/ID/153159 GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/161712

GO_GENE_PRODUCT/SPECIES_ID/162261 GO_TERM/ID/1737

PUBMED_REFERENCE/REFERENCE_ACCESSION/P22988[1]


GO_ASSOCIATION/GENE_PRODUCT_ID/129248 GO_ASSOCIATION/GENE_PRODUCT_ID/129250


TAXONOMY_CROSS_REFERENCE/ACCESSION/O65114 MEDLINE_REFERENCE/REFERENCE_ACCESSION/P22988[1] TAXONOMY_CROSS_REFERENCE/ACCESSION/O65115

GO_GENE_PRODUCT_SEQ/SEQ_ID/3339

GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/154934 GO_GENE_PRODUCT_SEQ/SEQ_ID/573635

GO_ASSOCIATION/GENE_PRODUCT_ID/154934 GO_ASSOCIATION/GENE_PRODUCT_ID/161713 GO_ASSOCIATION/ID/821115

GO_EVIDENCE/ASSOCIATION_ID/821115 GO_EVIDENCE/ASSOCIATION_ID/849915 GO_ASSOCIATION/GENE_PRODUCT_ID/1607169 GO_ASSOCIATION/ID/4892614 GO_EVIDENCE/ID/904591

LITERATURE_REFERENCE/REFERENCE_ACCESSION/P22989[1]

GO_GENE_PRODUCT_SEQ/SEQ_ID/531526 GO_ASSOCIATION/GENE_PRODUCT_ID/168629

LITERATURE_REFERENCE/REFERENCE_ACCESSION/P22988[1] GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/129254



GO_TERM/ID/2141 GO_GENE_PRODUCT_COUNT/TERM_ID/1820 GO_UNIPROT_LINK/ID/247489 GO_GENE_PRODUCT_SEQ/SEQ_ID/573634


GO_GENE_PRODUCT_COUNT/TERM_ID/2141

GO_GENE_PRODUCT_SEQ/SEQ_ID/150842 GO_GENE_PRODUCT_SEQ/SEQ_ID/3372


HSSP_LINK/ACCESSION/P22988

GO_SEQ/ID/573635


GO_GENE_PRODUCT_SEQ/SEQ_ID/73390 GO_GENE_PRODUCT_SEQ/SEQ_ID/491962GO_SEQ/ID/531526 GO_LINK/ACCESSION/O65916 GRAMENE_LINK/ACCESSION/P22989 GRAMENE_LINK/ACCESSION/P08477 PANTHER_LINK/ACCESSION/P22989

PANTHER_LINK/ACCESSION/P22988 GO_UNIPROT_LINK/XREF_KEY/P22989 GO_UNIPROT_LINK/XREF_KEY/P22988 KEYWORD/ACCESSION/P22988

GO_GENE_PRODUCT/ID/170735 GO_SEQ/ID/73352 GO_GENE_PRODUCT/ID/1725526

GO_SEQ/DISPLAY_ID/O65916 GO_SEQ_DBXREF/SEQ_ID/573635 GO_SPECIES/ID/162261 GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/168629

MEDLINE_REFERENCE/REFERENCE_ACCESSION/O65118[1]

GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/170735 GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/161713 GO_GENE_PRODUCT/ID/1725525

GO_SEQ_DBXREF/SEQ_ID/531526

EMBL_LINK/ACCESSION/P05336 GENE_NAME/ACCESSION/O65118 GO_GENE_PRODUCT_SEQ/SEQ_ID/599310 HSSP_LINK/ACCESSION/P08477


PANTHER_LINK/ACCESSION/P08477 GO_GENE_PRODUCT_SEQ/SEQ_ID/573636 GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/153159 GO_SEQ_DBXREF/SEQ_ID/73352 GO_SEQ_DBXREF/SEQ_ID/573634 GO_SEQ/ID/73390 GO_SEQ/ID/491962 GO_SEQ/ID/601038 PROTEIN/ACCESSION/P22989 TAXONOMY_CROSS_REFERENCE/ACCESSION/P22989 GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/129252

EMBL_LINK/ACCESSION/O65118 HSSP_LINK/ACCESSION/P22989

GENE_NAME/ACCESSION/O65916 FEATURE/ACCESSION/O65120

FEATURE/ACCESSION/P05336 GO_LINK/ACCESSION/Q9ZNQ2


PROTEIN/ACCESSION/P22988 GO_UNIPROT_LINK/XREF_KEY/P10848


TAXONOMY_CROSS_REFERENCE/ACCESSION/P08477 GO_LINK/ACCESSION/Q9ZR29 PROTEIN/ACCESSION/P08477

GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/1725267 GO_SEQ_DBXREF/SEQ_ID/150842 LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65118[1] GO_GENE_PRODUCT_SEQ/SEQ_ID/491961 GO_SEQ/ID/3372

INTERPRO_LINK/ACCESSION/P22989 EMBL_LINK/ACCESSION/O65120 GENE_NAME/ACCESSION/O65120 ORGANISM_SPECIES/ACCESSION/P05336

GO_SEQ/DISPLAY_ID/P08477 ORGANISM_CLASSIFICATION/ACCESSION/P08477

MEDLINE_REFERENCE/REFERENCE_ACCESSION/Q9FEH1[1]


GO_GENE_PRODUCT_SEQ/SEQ_ID/491956 GO_GENE_PRODUCT_SEQ/SEQ_ID/47491

GO_SEQ/ID/3386

ORGANISM_SPECIES/ACCESSION/P08477 GO_SEQ_DBXREF/SEQ_ID/599310

GO_LINK/ACCESSION/Q9FEH1 EMBL_LINK/ACCESSION/O65114

LITERATURE_REFERENCE/ACCESSION/O65118 GO_SEQ_DBXREF/SEQ_ID/47491

PFAM_LINK/ACCESSION/P22988

GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/1725527 ORGANISM_SPECIES/ACCESSION/O65118

GO_SEQ/ID/3339

GRAMENE_LINK/ACCESSION/P10848 LITERATURE_REFERENCE/ACCESSION/P10848 PANTHER_LINK/ACCESSION/P05336


GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/175463 GO_UNIPROT_LINK/ID/209451


FEATURE/ACCESSION/O65118

GO_LINK/ACCESSION/O65120 GRAMENE_LINK/ACCESSION/P22988 LITERATURE_REFERENCE/REFERENCE_ACCESSION/P10848[1] AUTHORS/REFERENCE_ACCESSION/O65117[1]

GO_GENE_PRODUCT_SEQ/GENE_PRODUCT_ID/129251 GO_GENE_PRODUCT_SEQ/SEQ_ID/601038


GO_SEQ/ID/573634

ORGANISM_CLASSIFICATION/ACCESSION/P22989

PUBMED_REFERENCE/REFERENCE_ACCESSION/Q9FEH1[1]

AUTHORS/REFERENCE_ACCESSION/O65120[1]

GO_GENE_PRODUCT/SPECIES_ID/232832 GO_GENE_PRODUCT/ID/1725267

GO_SEQ/DISPLAY_ID/P10848

EMBL_LINK/ACCESSION/O65916 ORGANISM_CLASSIFICATION/ACCESSION/P22988 KEYWORD/ACCESSION/P08477 GENE_NAME/ACCESSION/P05336 INTERPRO_LINK/ACCESSION/P08477 GO_UNIPROT_LINK/XREF_KEY/P08477


KEYWORD/ACCESSION/P22989 ORGANISM_SPECIES/ACCESSION/P22988


GO_GENE_PRODUCT/ID/161713 GO_ASSOCIATION/GENE_PRODUCT_ID/170735

GO_GENE_PRODUCT_COUNT/TERM_ID/1737

GO_GENE_PRODUCT/ID/1725527 AUTHORS/REFERENCE_ACCESSION/Q9ZR31[1]

ORGANISM_SPECIES/ACCESSION/P22989

TAXONOMY_CROSS_REFERENCE/ACCESSION/P22988

GO_GENE_PRODUCT/ID/161712 GO_GENE_PRODUCT/ID/168630


GO_SEQ/DISPLAY_ID/P10847

GO_UNIPROT_LINK/ID/247488 LITERATURE_REFERENCE/ACCESSION/P22989 LITERATURE_REFERENCE/ACCESSION/P22988 INTERPRO_LINK/ACCESSION/P22988 LITERATURE_REFERENCE/ACCESSION/P08477

GO_TERM/ID/1820

GO_GENE_PRODUCT/ID/129544 GO_GENE_PRODUCT_SEQ/SEQ_ID/73352

GO_SEQ_DBXREF/DBXREF_ID/40067

INTERPRO_LINK/ACCESSION/P10848

GO_SEQ_DBXREF/SEQ_ID/491956 GRAMENE_LINK/ACCESSION/O65118


GO_SEQ/DISPLAY_ID/O65120

EMBL_LINK/ACCESSION/Q9ZR31 GO_GRAPH_PATH/ID/221964

GO_SEQ/ID/599310 TAXONOMY_CROSS_REFERENCE/ACCESSION/P05336


LITERATURE_REFERENCE/REFERENCE_ACCESSION/P05336[1] GO_UNIPROT_LINK/ID/233004 PUBMED_REFERENCE/REFERENCE_ACCESSION/P05336[1] MEDLINE_REFERENCE/REFERENCE_ACCESSION/P05336[1]

HSSP_LINK/ACCESSION/P05336 GO_UNIPROT_LINK/XREF_KEY/P05336 HSSP_LINK/ACCESSION/P10848

LITERATURE_REFERENCE/ACCESSION/P05336

HSSP_LINK/ACCESSION/O65119 GO_SEQ/ID/573636 GO_SEQ_DBXREF/SEQ_ID/491961 HSSP_LINK/ACCESSION/O65118

PROTEIN/ACCESSION/O65119


KEYWORD/ACCESSION/P10848

MEDLINE_REFERENCE/REFERENCE_ACCESSION/O65119[1] LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65119[1] EMBL_LINK/ACCESSION/Q9ZR30 GO_SEQ/DISPLAY_ID/O65117

PANTHER_LINK/ACCESSION/P10848

PROTEIN/ACCESSION/P10848

GENE_NAME/ACCESSION/O65114 INTERPRO_LINK/ACCESSION/O65117


PFAM_LINK/ACCESSION/P22989 GENE_NAME/ACCESSION/Q9ZR31 GO_SEQ_DBXREF/SEQ_ID/491957 GO_SEQ_DBXREF/SEQ_ID/491960 GO_SPECIES/ID/230061

PROTEIN/ACCESSION/O65117

ENZYME/ACCESSION/O65114 ENZYME/ACCESSION/O65916 GO_SEQ/ID/491961 ENZYME/ACCESSION/P05336 PUBMED_REFERENCE/REFERENCE_ACCESSION/O65119[1] ENZYME/ACCESSION/O65118

GO_SEQ/DISPLAY_ID/Q9ZR31 HSSP_LINK/ACCESSION/O65117

ORGANISM_SPECIES/ACCESSION/P10848

GO_SEQ/ID/491958 FEATURE/ACCESSION/Q9ZR31 GO_SEQ/ID/47491 GO_SEQ/ID/491956 GO_GOA_LINK/ID/93395 GO_SEQ/DISPLAY_ID/O65116 PUBMED_REFERENCE/REFERENCE_ACCESSION/O65118[1] GO_GENE_PRODUCT_SEQ/SEQ_ID/491960 GO_GENE_PRODUCT_SEQ/SEQ_ID/491957 GO_SEQ_DBXREF/SEQ_ID/558618

LITERATURE_REFERENCE/ACCESSION/O65117

PROTEIN/ACCESSION/P05336 GRAMENE_LINK/ACCESSION/P05336 KEYWORD/ACCESSION/P05336 ORGANISM_CLASSIFICATION/ACCESSION/P05336

TIGRFAMS_LINK/ACCESSION/P34937 GO_GENE_PRODUCT_SEQ/SEQ_ID/558618


TAXONOMY_CROSS_REFERENCE/ACCESSION/P10848

LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65117[1] PFAM_LINK/ACCESSION/P10848

FEATURE/ACCESSION/Q9ZR30 GRAMENE_LINK/ACCESSION/O65117 HSSP_LINK/ACCESSION/O65115

EMBL_LINK/ACCESSION/P10847 GRAMENE_LINK/ACCESSION/P34937 PROSITE_LINK/ACCESSION/P34937

PANTHER_LINK/ACCESSION/O65119

KEYWORD/ACCESSION/O65117

GENE_NAME/ACCESSION/P10847

PANTHER_LINK/ACCESSION/O65118


GO_SEQ/ID/558618


GRAMENE_LINK/ACCESSION/Q9ZNQ2 HSSP_LINK/ACCESSION/Q9ZR29 PROSITE_LINK/ACCESSION/P08477

GO_SEQ/DISPLAY_ID/P22988 GO_SEQ/ID/491957 GO_SEQ/ID/491960 GO_SEQ/DISPLAY_ID/P26517

ENZYME/EC_NR/1.1.1.1 PANTHER_LINK/ACCESSION/O65117

INTERPRO_LINK/ACCESSION/P34937 PIR_LINK/ACCESSION/P22989 PROTEIN/ACCESSION/P34937 PANTHER_LINK/ACCESSION/Q7DLE1 ORGANISM_SPECIES/ACCESSION/P34937


ENZYME/ACCESSION/Q9ZR30 ENZYME/ACCESSION/O65117

TIGRFAMS_LINK/ACCESSION/P08477

FEATURE/ACCESSION/P10848 ENZYME/EC_NR/1.1.1.27 GO_SEQ/DISPLAY_ID/O65119 ORGANISM_SPECIES/ACCESSION/Q7DLE1 ALTERNATIVE_ACCESSIONS/ACCESSION/P05336 PFAM_LINK/ACCESSION/P26517

PROTEIN/ACCESSION/Q9ZNQ2 GO_UNIPROT_LINK/XREF_KEY/P34937

ORGANISM_SPECIES/ACCESSION/Q9ZNQ2

GRAMENE_LINK/ACCESSION/Q7DLE1

GENE_NAME/ACCESSION/Q9FEH1 PFAM_LINK/ACCESSION/P05336 PUBMED_REFERENCE/REFERENCE_ACCESSION/O65114[1] ENZYME/ACCESSION/Q9FEH1 ENZYME/ACCESSION/Q9ZNQ2 EMBL_LINK/ACCESSION/Q9FEH1 GO_LINK/ACCESSION/Q9ZR30 PANTHER_LINK/ACCESSION/O65114

PFAM_LINK/ACCESSION/P08477 GRAMENE_LINK/ACCESSION/O65115

EMBL_LINK/ACCESSION/P08477 PROTEIN/ACCESSION/Q7DLE1

TAXONOMY_CROSS_REFERENCE/ACCESSION/Q9ZNQ2


KEYWORD/ACCESSION/Q7DLE1


GO_SEQ/DISPLAY_ID/P05336 ENZYME/ACCESSION/O65116

HSSP_LINK/ACCESSION/Q9FEH1

PRINTS_LINK/ACCESSION/P22988 GO_SEQ/DISPLAY_ID/Q9ZR29 TAXONOMY_CROSS_REFERENCE/ACCESSION/P26517

FEATURE/ACCESSION/Q9FEH1 ENZYME/ACCESSION/O65119 ORGANISM_SPECIES/ACCESSION/O65115 GO_EC_LINK/XREF_KEY/1.2.1.12 FEATURE/ACCESSION/Q9ZNQ2 INTERPRO_LINK/ACCESSION/O65916 EMBL_LINK/ACCESSION/Q9ZNQ2 ORGANISM_CLASSIFICATION/ACCESSION/O65916

GO_SEQ/ID/150842 KEYWORD/ACCESSION/O65916 PROTEIN/ACCESSION/O65916


EMBL_LINK/ACCESSION/P10848 PIR_LINK/ACCESSION/P05336 GENE_NAME/ACCESSION/P10848 GO_UNIPROT_LINK/XREF_KEY/Q7DLE1 ALTERNATIVE_ACCESSIONS/ACCESSION/P34937 PROSITE_LINK/ACCESSION/Q7DLE1 PIR_LINK/ACCESSION/P10847 GENE_NAME/ACCESSION/P08477

ENZYME/ACCESSION/P22988

TAXONOMY_CROSS_REFERENCE/ACCESSION/O65117

LITERATURE_REFERENCE/REFERENCE_ACCESSION/Q9ZNQ2[1]

ENZYME/ACCESSION/P10847

FEATURE/ACCESSION/P22989 GO_SEQ/DISPLAY_ID/Q9ZNQ2 PROSITE_LINK/ACCESSION/P22989 INTERPRO_LINK/ACCESSION/Q9ZR30 PROSITE_LINK/ACCESSION/P22988 GO_UNIPROT_LINK/ID/249477 ORGANISM_SPECIES/ACCESSION/Q9ZR31

GRAMENE_LINK/ACCESSION/P26517


PROSITE_LINK/ACCESSION/Q9ZNQ2 GO_UNIPROT_LINK/XREF_KEY/Q9ZNQ2 PROSITE_LINK/ACCESSION/P10847

PROSITE_LINK/ACCESSION/Q9ZR31 GRAMENE_LINK/ACCESSION/Q9ZR30

GO_UNIPROT_LINK/XREF_KEY/P26517 PIR_LINK/ACCESSION/P26517 GO_SEQ/DISPLAY_ID/O65118 GO_SEQ/DISPLAY_ID/P34937 ORGANISM_CLASSIFICATION/ACCESSION/Q7DLE1 ORGANISM_CLASSIFICATION/ACCESSION/P26517 PIR_LINK/ACCESSION/P22988 TAXONOMY_CROSS_REFERENCE/ACCESSION/Q7DLE1 PIR_LINK/ACCESSION/P10848

ENZYME/ACCESSION/O65115


GO_UNIPROT_LINK/XREF_KEY/O65116 LITERATURE_REFERENCE/ACCESSION/Q9ZNQ2

EMBL_LINK/ACCESSION/P22989

PROTEIN/ACCESSION/P26517 INTERPRO_LINK/ACCESSION/Q7DLE1 PANTHER_LINK/ACCESSION/P26517

PANTHER_LINK/ACCESSION/Q9FEH1

LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65114[1] PANTHER_LINK/ACCESSION/O65115 MEDLINE_REFERENCE/REFERENCE_ACCESSION/O65114[1] ORGANISM_SPECIES/ACCESSION/O65114 GRAMENE_LINK/ACCESSION/O65114 PROTEIN/ACCESSION/O65114 KEYWORD/ACCESSION/O65114

GO_UNIPROT_LINK/ID/209448 AUTHORS/REFERENCE_ACCESSION/Q9FEH1[1]

GO_TERM/ID/1429

ORGANISM_CLASSIFICATION/ACCESSION/Q9ZNQ2 PRINTS_LINK/ACCESSION/P26517 INTERPRO_LINK/ACCESSION/Q9ZNQ2

PIR_LINK/ACCESSION/P08477 KEYWORD/ACCESSION/P26517 KEYWORD/ACCESSION/P34937 ORGANISM_SPECIES/ACCESSION/P26517 INTERPRO_LINK/ACCESSION/P26517

PIRSF_LINK/ACCESSION/P22988 HSSP_LINK/ACCESSION/P26517 ENZYME/ACCESSION/P08477 ENZYME/ACCESSION/P10848

LITERATURE_REFERENCE/ACCESSION/O65115 INTERPRO_LINK/ACCESSION/O65115


KEYWORD/ACCESSION/Q9ZNQ2 ORGANISM_CLASSIFICATION/ACCESSION/P34937 PRINTS_LINK/ACCESSION/P08477


GENE_NAME/ACCESSION/Q9ZR30

GO_SEQ/ID/491959


PROSITE_LINK/ACCESSION/P26517 PFAM_LINK/ACCESSION/P34937 HSSP_LINK/ACCESSION/O65114

PROTEIN/ACCESSION/O65115 PUBMED_REFERENCE/REFERENCE_ACCESSION/O65115[1] GENE_NAME/ACCESSION/Q9ZNQ2 LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65115[1]

TAXONOMY_CROSS_REFERENCE/ACCESSION/O65119 ORGANISM_CLASSIFICATION/ACCESSION/O65119



FEATURE/ACCESSION/P10847

GO_SEQ/DISPLAY_ID/O65115


GO_SEQ/DISPLAY_ID/Q7DLE1 GO_SEQ/ID/47465

LITERATURE_REFERENCE/REFERENCE_ACCESSION/P34937[1] PANTHER_LINK/ACCESSION/Q9ZR29 GO_SEQ_DBXREF/SEQ_ID/47465 AUTHORS/REFERENCE_ACCESSION/P10847[1]


KEYWORD/ACCESSION/O65115

AUTHORS/REFERENCE_ACCESSION/Q9ZNQ2[1]

GO_SPECIES/ID/232832 GO_SEQ_DBXREF/SEQ_ID/3339

GO_GRAPH_PATH/ID/446838

KEYWORD/ACCESSION/O65119 ENZYME/ACCESSION/Q9ZR31

ORGANISM_SPECIES/ACCESSION/O65119 LITERATURE_REFERENCE/ACCESSION/O65119 ENZYME/ACCESSION/O65120

EMBL_LINK/ACCESSION/O65117



GO_SEQ_DBXREF/DBXREF_ID/42796 GO_GENE_PRODUCT_SEQ/SEQ_ID/491959

KEYWORD/ACCESSION/O65118 PROTEIN/ACCESSION/O65118 INTERPRO_LINK/ACCESSION/O65118

GO_SEQ/DISPLAY_ID/Q9ZR30 INTERPRO_LINK/ACCESSION/O65119 ORGANISM_SPECIES/ACCESSION/O65117

GENE_NAME/ACCESSION/O65117 MEDLINE_REFERENCE/REFERENCE_ACCESSION/O65117[1]

GO_LINK/ACCESSION/Q9ZR31


TAXONOMY_CROSS_REFERENCE/ACCESSION/P34937 GO_SEQ/DISPLAY_ID/P22989

GRAMENE_LINK/ACCESSION/O65119

PUBMED_REFERENCE/REFERENCE_ACCESSION/O65117[1] INTERPRO_LINK/ACCESSION/P05336 AUTHORS/REFERENCE_ACCESSION/Q9ZR30[1]

GO_GRAPH_PATH/ID/660365


GO_ASSOCIATION/GENE_PRODUCT_ID/129253 GO_GENE_PRODUCT_COUNT/TERM_ID/1429 GO_ASSOCIATION/GENE_PRODUCT_ID/129249 GO_ASSOCIATION/GENE_PRODUCT_ID/1725525 GO_GENE_PRODUCT/ID/168629 GO_GENE_PRODUCT/ID/129250 GO_GENE_PRODUCT/ID/175463

GO_GOA_LINK/ID/93379 LITERATURE_REFERENCE/REFERENCE_ACCESSION/P08477[1]

GO_EC_LINK/ID/1211


GO_GENE_PRODUCT/SPECIES_ID/230061



GO_ASSOCIATION/ID/849915 GO_ASSOCIATION/ID/808850 GO_EVIDENCE/DBXREF_ID/93379



GO_EVIDENCE/DBXREF_ID/93395


ORGANISM_CLASSIFICATION/ACCESSION/O65115

AUTHORS/REFERENCE_ACCESSION/P05336[1] ORGANISM_CLASSIFICATION/ACCESSION/O65114

AUTHORS/REFERENCE_ACCESSION/P22989[1]

ENZYME/ACCESSION/P22989 GO_SEQ/DISPLAY_ID/Q9FEH1

ORGANISM_CLASSIFICATION/ACCESSION/O65117

PROSITE_LINK/ACCESSION/Q9ZR30 FEATURE/ACCESSION/P34937 ENZYME/ACCESSION/Q9ZR29

LITERATURE_REFERENCE/ACCESSION/Q7DLE1

EMBL_LINK/ACCESSION/O65116 GRAMENE_LINK/ACCESSION/O65916 GENE_NAME/ACCESSION/O65116 ORGANISM_SPECIES/ACCESSION/O65916

FEATURE/ACCESSION/P08477

ENZYME_DISEASE/EC/1.1.1.27

GO_EC_LINK/ID/2444

GO_UNIPROT_LINK/XREF_KEY/O65114

KEYWORD/ACCESSION/P10847 INTERPRO_LINK/ACCESSION/P10847 HSSP_LINK/ACCESSION/O65916

ENZYME/EC_NR/5.3.1.1



PROTEIN/ACCESSION/P10847 ENZYME_REACTION/EC/1.1.1.1

GENE_NAME/ACCESSION/O65119


ORGANISM_CLASSIFICATION/ACCESSION/Q9ZR30 FEATURE/ACCESSION/Q7DLE1 EMBL_LINK/ACCESSION/P34937


ORGANISM_SPECIES/ACCESSION/P10847 INTERPRO_LINK/ACCESSION/O65116 PUBMED_REFERENCE/REFERENCE_ACCESSION/O65116[1]


ENZYME_ORGANISM/EC/1.1.1.27 ENZYME_REACTION/EC/1.1.1.27 ENZYME_NAME/EC/1.1.1.27

ENZYME_INHIBITOR/EC/1.1.1.27 LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65916[1] ENZYME_LOCALIZATION/EC/1.1.1.27 AUTHORS/REFERENCE_ACCESSION/O65116[1] ENZYME_ORGANISM/ORGANISMID/1.1.1.27_28_

EMBL_LINK/ACCESSION/Q7DLE1 AUTHORS/REFERENCE_ACCESSION/O65114[1] EMBL_LINK/ACCESSION/P26517 INTERPRO_LINK/ACCESSION/O65114 GENE_NAME/ACCESSION/Q7DLE1

PROTEIN/ACCESSION/Q9FEH1

LITERATURE_REFERENCE/ACCESSION/O65916 ENZYME_OVERVIEW/EC/1.1.1.27 GO_UNIPROT_LINK/XREF_KEY/O65916

ENZYME_PATHWAY/EC/1.1.1.27 ENZYME_COFACTOR/EC/1.1.1.27

ENZYME/ACCESSION/P34937 GRAMENE_LINK/ACCESSION/Q9ZR31 PROTEIN/ACCESSION/Q9ZR31 ORGANISM_SPECIES/ACCESSION/Q9ZR30 PROTEIN/ACCESSION/Q9ZR30

PANTHER_LINK/ACCESSION/Q9ZR31

PANTHER_LINK/ACCESSION/O65916 TAXONOMY_CROSS_REFERENCE/ACCESSION/O65916

PROTEIN/ACCESSION/Q9ZR29 ORGANISM_CLASSIFICATION/ACCESSION/Q9ZR29 LITERATURE_REFERENCE/ACCESSION/Q9ZR31 INTERPRO_LINK/ACCESSION/Q9ZR31 ORGANISM_CLASSIFICATION/ACCESSION/Q9ZR31 TAXONOMY_CROSS_REFERENCE/ACCESSION/Q9ZR30 ENZYME/ACCESSION/P26517

LITERATURE_REFERENCE/REFERENCE_ACCESSION/Q7DLE1[1] ORGANISM_SPECIES/ACCESSION/Q9FEH1 PRINTS_LINK/ACCESSION/P22989

ENZYME/EC_NR/1.2.1.12

TAXONOMY_CROSS_REFERENCE/ACCESSION/Q9FEH1 GENE_NAME/ACCESSION/Q9ZR29 PROTEIN/ACCESSION/O65120 KEYWORD/ACCESSION/O65120 FEATURE/ACCESSION/O65115 KEYWORD/ACCESSION/Q9ZR30 HSSP_LINK/ACCESSION/Q9ZR31 PANTHER_LINK/ACCESSION/Q9ZR30 TAXONOMY_CROSS_REFERENCE/ACCESSION/P10847 GRAMENE_LINK/ACCESSION/P10847 GO_UNIPROT_LINK/XREF_KEY/Q9ZR30 KEYWORD/ACCESSION/Q9FEH1 GENE_NAME/ACCESSION/P26517 INTERPRO_LINK/ACCESSION/Q9FEH1 KEYWORD/ACCESSION/Q9ZR31 TIGRFAMS_LINK/ACCESSION/P22989 HSSP_LINK/ACCESSION/Q9ZNQ2 EMBL_LINK/ACCESSION/O65119 GO_LINK/ACCESSION/Q7DLE1 FEATURE/ACCESSION/O65119 GRAMENE_LINK/ACCESSION/O65120 GO_UNIPROT_LINK/XREF_KEY/P10847 ORGANISM_SPECIES/ACCESSION/Q9ZR29 ORGANISM_SPECIES/ACCESSION/O65120 PROSITE_LINK/ACCESSION/P05336 LITERATURE_REFERENCE/REFERENCE_ACCESSION/P26517[1] TAXONOMY_CROSS_REFERENCE/ACCESSION/Q9ZR31 GRAMENE_LINK/ACCESSION/Q9FEH1 INTERPRO_LINK/ACCESSION/Q9ZR29 FEATURE/ACCESSION/P22988 LITERATURE_REFERENCE/ACCESSION/Q9ZR30 PANTHER_LINK/ACCESSION/P10847 PANTHER_LINK/ACCESSION/Q9ZNQ2 PFAM_LINK/ACCESSION/P10847 ORGANISM_CLASSIFICATION/ACCESSION/Q9FEH1 LITERATURE_REFERENCE/REFERENCE_ACCESSION/Q9ZR31[1] ORGANISM_SPECIES/ACCESSION/O65116 PROSITE_LINK/ACCESSION/Q9FEH1 PANTHER_LINK/ACCESSION/O65116 GO_UNIPROT_LINK/XREF_KEY/Q9ZR31 EMBL_LINK/ACCESSION/P22988 ORGANISM_CLASSIFICATION/ACCESSION/O65120 EMBL_LINK/ACCESSION/Q9ZR29 KEYWORD/ACCESSION/O65116 TAXONOMY_CROSS_REFERENCE/ACCESSION/Q9ZR29 FEATURE/ACCESSION/Q9ZR29 INTERPRO_LINK/ACCESSION/O65120 GRAMENE_LINK/ACCESSION/Q9ZR29 KEYWORD/ACCESSION/Q9ZR29 GO_UNIPROT_LINK/XREF_KEY/O65118 GENE_NAME/ACCESSION/O65115 ENZYME/ACCESSION/Q7DLE1 PROTEIN/ACCESSION/O65116 PANTHER_LINK/ACCESSION/O65120 PRODOM_LINK/ACCESSION/P34937


LITERATURE_REFERENCE/ACCESSION/O65116 LITERATURE_REFERENCE/ACCESSION/P10847

ENZYME_DISEASE/EC/1.2.1.12

GO_EC_LINK/ID/3395 ENZYME_COFACTOR/EC/1.2.1.12 GO_EC_LINK/XREF_KEY/1.1.1.1

ENZYME_NAME/EC/1.2.1.12 ENZYME_OVERVIEW/EC/1.2.1.12 ENZYME_LOCALIZATION/EC/1.2.1.12


HSSP_LINK/ACCESSION/O65116 GO_SEQ/DISPLAY_ID/O65114 HSSP_LINK/ACCESSION/O65120

PUBMED_REFERENCE/REFERENCE_ACCESSION/O65916[1]

GO_EC_LINK/XREF_KEY/5.3.1.1 GO_EVIDENCE/ASSOCIATION_ID/4892614

HSSP_LINK/ACCESSION/Q9ZR30

TAXONOMY_CROSS_REFERENCE/ACCESSION/O65120 AUTHORS/REFERENCE_ACCESSION/P08477[1] AUTHORS/REFERENCE_ACCESSION/P10848[1] GO_UNIPROT_LINK/XREF_KEY/Q9FEH1 ORGANISM_CLASSIFICATION/ACCESSION/O65116

EMBL_LINK/ACCESSION/O65115

FEATURE/ACCESSION/O65114 FEATURE/ACCESSION/P26517


PROSITE_LINK/ACCESSION/P10848 ENZYME_PATHWAY/EC/5.3.1.1

AUTHORS/REFERENCE_ACCESSION/Q7DLE1[1]


ENZYME_OVERVIEW/EC/5.3.1.1

GRAMENE_LINK/ACCESSION/O65116 LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65116[1]

ENZYME_REACTION/EC/1.2.1.12

TAXONOMY_CROSS_REFERENCE/ACCESSION/O65116 PROSITE_LINK/ACCESSION/Q9ZR29 GO_UNIPROT_LINK/XREF_KEY/Q9ZR29

LITERATURE_REFERENCE/ACCESSION/Q9FEH1

ENZYME_ORGANISM/EC/1.2.1.12 ENZYME_DISEASE/EC/5.3.1.1


ENZYME_LOCALIZATION/EC/1.1.1.1

ENZYME_PATHWAY/EC/1.2.1.12

ENZYME_INHIBITOR/EC/1.2.1.12 GO_UNIPROT_LINK/XREF_KEY/O65119 ENZYME_LOCALIZATION/EC/5.3.1.1 ENZYME_NAME/EC/5.3.1.1


LITERATURE_REFERENCE/REFERENCE_ACCESSION/P10847[1]

ENZYME_ORGANISM/ORGANISMID/1.2.1.12_172_

ENZYME_COFACTOR/EC/1.1.1.1 ENZYME_ORGANISM/EC/5.3.1.1

LITERATURE_REFERENCE/ACCESSION/Q9ZR29


ENZYME_COFACTOR/EC/5.3.1.1 MEDLINE_REFERENCE/REFERENCE_ACCESSION/O65116[1] GO_UNIPROT_LINK/ID/209449

LITERATURE_REFERENCE/REFERENCE_ACCESSION/Q9ZR30[1] LITERATURE_REFERENCE/REFERENCE_ACCESSION/Q9FEH1[1]

MEDLINE_REFERENCE/REFERENCE_ACCESSION/P22989[1] GO_UNIPROT_LINK/XREF_KEY/O65117 LITERATURE_REFERENCE/REFERENCE_ACCESSION/Q9ZR29[1] PUBMED_REFERENCE/REFERENCE_ACCESSION/P22989[1]

ENZYME_ORGANISM/ORGANISMID/5.3.1.1_49_ LITERATURE_REFERENCE/REFERENCE_ACCESSION/O65120[1]

AUTHORS/REFERENCE_ACCESSION/O65119[1] AUTHORS/REFERENCE_ACCESSION/P34937[1] AUTHORS/REFERENCE_ACCESSION/Q9ZR29[1] GO_EVIDENCE/ASSOCIATION_ID/610848 TAXONOMY_CROSS_REFERENCE/ACCESSION/O65118 AUTHORS/REFERENCE_ACCESSION/P22988[1] AUTHORS/REFERENCE_ACCESSION/P26517[1] ORGANISM_CLASSIFICATION/ACCESSION/O65118


GO_EC_LINK/XREF_KEY/1.1.1.27

AUTHORS/REFERENCE_ACCESSION/O65115[1] PUBMED_REFERENCE/REFERENCE_ACCESSION/O65120[1]

809 ENZYME_PATHWAY/EC/1.1.1.1 ENZYME_ORGANISM/EC/1.1.1.1

ENZYME_PRODUCT/EC/1.1.1.1 ENZYME_DISEASE/EC/1.1.1.1

ENZYME_INHIBITOR/EC/1.1.1.1 ENZYME_NAME/EC/1.1.1.1 ENZYME_ORGANISM/ORGANISMID/1.1.1.1_1_

ENZYME_OVERVIEW/EC/1.1.1.1

Figure 4: data network of enzymes in barley showing catalytic activity in glycolysis (highresolution image: http://pgrc.ipk-gatersleben.de/dlg/Hordeum Glycolysis. pdf)

References [ABB+ 00] M. Ashburner, C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. Dwight, J. T. Eppig, M. A. Harris, D.P. Hill, L. IsselTarver, A. Kasarskis, S. Lewis, J. C. Matese, J. E. Richardson, M. Ringwald, Rubin G. M., and G. Sherlock. Gene Ontology: tool for the unification of biology. Nature Genetics, 25:25–29, 2000. [ABL+ 99] M. A. Andrade, N. P. Brown, C. Leroy, S. Hoersch, A. de Daruvar, C. Reich, A. Franchini, J. Tamames, A. Valencia, C. Ouzounis, and C. Sander. Automated genome sequence analysis and annotation. Bioinformatics, 15(5):391–412, 1999. [AGM+ 90] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. Basic local alignment search tool. Journal of Molecular Biology, 3:403–410, 1990.


10


[BK03]


F. Bry and P. Kröger. A Computational Biology Database Digest: Data, Data Analysis, and Data Management. Distributed and Parallel Databases, 13(1):7– 42, 2003.

[BLSS04] S. Balko, M. Lange, R. Schnee, and U. Scholz. BioDataServer: An Applied Molecular Biological Data Integration Service. In S. Istrail, P. Pevzner, and M. Waterman, editors, Data Integration in the Life Sciences; First International Workshop, DILS 2004 Leipzig, Germany, volume 2994 of Lecture Notes in Bioinformatics, pages 140–155. Berlin et al: Springer, 2004. [EUA96]

T. Etzold, A. Ulyanow, and P. Argos. SRS: Information Retrieval System for Molecular Biology Data Banks. Methods in Enzymology, 266:114–128, 1996.

[HSA+ 02] A. Hamosh, A. F. Scott, J. Amberger, C. Bocchini, D. Valle, and V. A. McKusick. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Research, 30(1):52–55, 2002. [KGKN02] M. Kanehisa, S. Goto, S. Kawashima, and A. Nakaya. The KEGG database at GenomeNet. Nucleic Acids Research, 30(1):42–46, 2002. http://www.genome.ad.jp/kegg/. [KLF+ 05] C. Künne, M. Lange, T. Funke, H. Miehe, I. Grosse, and U. Scholz. CR–EST: a resource for crop ESTs. Nucleic Acids Research, 33(suppl 1):D619–621, 2005. [LMP+ 04] Z. Lacroix, T. Morris, K. Parekh, L. Raschid, and M.-E. Vidal. Exploiting multiple paths to express scientific queries. In Proceedings of SSDBM 2004, 2004. [MFG+ 02] H. W. Mewes, D. Frishman, U. Güldener, G. Mannhaupt, K. Mayer, M. Mokrejs, B. Morgenstern, M. Münsterkötter, S. Rudd, and S. Weil. MIPS: a database for genomes and protein sequences. Nucleic Acids Research, 30(1):31–34, 2002. [SGBB01] R. Stevens, C. Goble, P. Baker, and A. Brass. A classification of tasks in bioinformatics. Bioinformatics, 17(2):180–188, 2001. [Ste02]

L. Stein. Creating a bioinformatics nation. Nature, 417:119–120, 2002.

[TBG+ 04] O. Thimm, O. Blasing, Y. Gibon, A. Nagel, S. Me yer, P. Kruger, J. Selbig, L. A. Muller, S. Y. Rhee, and M. Stitt. MAPMAN: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. Plant Journal, 37(6):914–914, 2004.


11

Data Linkage Graph: computation, querying and knowledge discovery

Data Linkage Graph: computation, querying and knowledge discovery

Suggest Documents

Data Linkage Graph: computation, querying and knowledge discovery

Knowledge Discovery for Flexible Querying - Semantic Scholar

Data Mining and Knowledge Discovery

Linkage Knowledge Management and Data ... - Semantic Scholar

querying graph databases - CiteSeerX

Knowledge Base Data Warehouse Methodology - Knowledge Discovery

The KGRAM Abstract Machine for Knowledge Graph Querying

Knowledge Discovery with Data and Text Mining

Foundations of Data Mining and Knowledge Discovery

Data Mining and Knowledge Discovery: Applications, Techniques ...

Introduction to Data Mining and Knowledge Discovery

Data Mining and Knowledge Discovery Handbook

DATABASE ISSUES IN KNOWLEDGE DISCOVERY AND DATA ...

Data Warehousing and Knowledge Discovery: A ... - CiteSeerX

Knowledge Discovery and Data Mining in Databases

big data analytics and knowledge discovery through

Knowledge Graph for Discovery and Navigation - CEUR Workshop

Structuring and Querying the Web through Graph

Querying RDF Data from a Graph Database Perspective - CiteSeerX

Semantic Data Integration for Knowledge Graph

THE LINKAGE OF A GRAPH*

Pattern Discovery from Graph-Structured Data - A Data Mining ...

Supporting Service Discovery, Querying and ... - Lancaster EPrints

Knowledge Discovery in Large Data Sets