Fast Alignments of Metabolic Networks

Qiong Cheng(a), Piotr Berman(b), Robert Harrison(a), Alexander Zelikovsky(a,∗)

(a) Department of Computer Science, Georgia State University, Atlanta, Georgia 30303. Email: {cscqxcx, rharrison, alexz}@cs.gsu.edu
(b) Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA 16802. Email: [email protected]

Abstract

Network alignments are extensively used for comparing, exploring, and predicting biological networks. Existing alignment tools are mostly based on isomorphic and homeomorphic embedding and require solving a problem that is NP-complete even when searching for a match of a tree in acyclic networks. On the other hand, if different nodes of the query network (pattern) may be mapped into the same node of the text network, then trees can be optimally mapped into arbitrary networks in polynomial time. In this paper we present the first polynomial-time algorithm for finding the best matching pair consisting of a subtree of a given tree pattern and a subgraph of a given text (represented by an arbitrary network) when both insertions and deletions of degree-2 vertices are allowed on any path. When deletions are forbidden, our dynamic programming algorithm is an order of magnitude faster than the previous network alignment algorithm. The algorithm has also been generalized to pattern networks with cycles: with a modest increase in runtime it can handle patterns with a small feedback vertex set. We have applied our algorithm to matching metabolic pathways of four organisms (E. coli, S. cerevisiae, B. subtilis and T. thermophilus) and found a reasonably large set of statistically significant alignments. We show the advantages of allowing pattern vertex deletions and give an example validating the biological relevance of the pathway alignment.

1 Introduction

The explosive growth of biological network data requires efficient computational methods for analyzing and comparing networks. In this paper, we focus on network alignment for comparing, exploring, and predicting these networks. For example, pairwise alignment of well-established networks can be employed to mine conserved subnetwork patterns and to extract evolutionary relations between metabolic pathways. Biological networks are usually represented by graphs (e.g., metabolic networks, whose vertices are enzymes, are directed graphs, and protein-protein interaction networks are undirected graphs). Alignment of two networks, pattern and text, is usually understood as finding the largest similar parts, such as isomorphic or homeomorphic subgraphs.

Both isomorphic and homeomorphic alignments are NP-complete [12]. Existing approaches to subgraph iso- and homeomorphism restrict the size (see [4, 5]) or topology of the pattern (see [1, 2, 3, 6, 7]), or use heuristics and approximation algorithms. GraphMatch [5] allows deleting disassociated vertices or induced subnetworks in the query network and then aligns the remainder to the target network by subgraph isomorphism. However, subgraph iso-/homeomorphism does not capture the widespread evolutionary mechanism of gene duplication, which results in vertex copying (see [13]). Previously, it was proposed to overcome this drawback by allowing the alignment to map several similar vertices from the pattern to the same vertex in the text, i.e., replacing isomorphism with homomorphism (see [6]). Indeed, if two enzymes in the pattern are related by gene duplication and divergence, then a single vertex would be split into two nodes, and the mapping between the two patterns would reflect this. In this case it is valid to find the largest similar subgraphs in the pattern and the text by mapping them to the same enzyme in the text such that the homomorphic image of the pattern subgraph is homeomorphic to the text subgraph.

∗ Corresponding author.
In [6] we considered the case when the pattern topology is restricted to multi-source trees and no vertex deletion in the pattern is allowed, i.e., the edges of the homomorphic image of the entire tree pattern should be subdivided by degree-2 vertices to obtain a graph isomorphic to a subgraph of the text. It was shown there that the network alignment problem can be solved optimally in $O(|V_T||E_T| + |V_T|^2|V_P|)$ time, where $V_P$ and $V_T$ are the vertex sets of the pattern and text, respectively, and $E_T$ is the edge set of the text. In this paper we first reduce the runtime of this algorithm to $O(|V_P|(|E_T| + |V_T|\log|V_T|))$. We next generalize the formulation from [6] to incorporate pattern vertex deletion. We allow two types of pattern vertex deletion (see Figure 1). The bypass deletion of a vertex of degree 2 supports homeomorphism, i.e., it allows the replacement of a path with a single edge. If the pattern is a directed graph, then a vertex $v$ can be bypassed only if it belongs to a directed path $a \to v \to b$, and consequently the incoming and outgoing edges are replaced by a single edge $a \to b$. The strong deletion corresponds to deleting a subgraph of the pattern: all edges incident to deleted vertices are deleted without replacement. Both types of deletion can be applied recursively and in combination with each other.

Figure 1. Examples of pattern vertex deletion. Solid lines represent pattern edges; dashed lines represent text paths; dashed arrows connect pattern vertices with their images in the text graph. (1) Bypass deletion of a pattern vertex of degree 2. (2) Branch deletion of three pattern vertices. (3) = (2) + (1) Composition of strong and bypass deletions: after strong deletions a pattern vertex becomes eligible for bypass deletion.

We then give an efficient algorithm for finding an optimal network alignment of a tree pattern network with an arbitrary network and show how to generalize our approach to a practical solution for the case when the pattern network has a feedback vertex set of limited size. Finally, we apply our method to the alignment of metabolic pathways of four organisms (E. coli, S. cerevisiae, B. subtilis and T. thermophilus) and find a reasonably large set of statistically significant motifs which is not caused by database curating. We show the advantages of allowing pattern vertex deletions and give an example which demonstrates the biological relevance of the pathway alignment.

The remainder of the paper is organized as follows. Section 2 overviews related work. Section 3 offers formal definitions and the problem formulation for optimal network alignment. Section 4 describes our dynamic programming algorithm, proves its correctness and analyzes the runtime complexity. Section 5 applies network alignment to metabolic pathways, compares the quality of the proposed algorithm with the previous method, and offers a biologically relevant example.

2 Related Work

A naive enumeration algorithm for network alignment is exponential. The works [1, 2] focus only on the similarity between vertices, such as proteins or genes composing pathways, disregarding topology. Kelley et al. [4] took into account the nonlinearity of protein network topology and reduced alignment to the problem of finding a highest-score path of length $L$ in an acyclic graph. Their procedure is efficient if the path length $L$ is restricted to 6. Pinter et al. [3] conserve network topology but require the pattern and text graphs to be trees. The runtime of their algorithm is $O\left(\frac{|V_P|^2 |V_T|}{\log |V_P|} + |V_P||V_T|\log|V_T|\right)$. Yang and Sze [5] proposed path matching and graph matching algorithms. Path matching finds a best homeomorphic alignment from a path to a graph, allowing vertex insertions and deletions. Their graph matching allows deleting disassociated vertices or induced subnetworks in the query network and then aligns the remainder to the target network by subgraph isomorphism. Papers [6, 7] consider network alignment of metabolic pathways without vertex deletion and propose a polynomial-time algorithm for mapping a tree pattern into an arbitrary text.

3 Formal Network Alignment Definitions

Below we first formally define homomorphisms (i.e., network alignments without deletions) and their cost; we then define general optimal network alignments. Let $P = (V_P, E_P)$ be a pattern graph that we wish to align with a text graph $T = (V_T, E_T)$, both representing either metabolic or PPI networks. Each vertex has a label (EC notation or protein sequence), and the cost of label-to-label mapping, which reflects biological relevance, is given for any two labels by $\Delta : V_P \times V_T \to \mathbb{R}$. We also need to account for the dilation cost, i.e., the cost of mapping adjacent vertices into paths with additional intermediate vertices. For this we introduce the notation $\sigma(u, v)$ for the minimum number of edges on a path from $u$ to $v$ minus 1, with $\sigma(u, v) = \infty$ if no such path exists. A valid solution is a mapping $f : V_P \to V_T$. The cost of $f$ is

$$\mathrm{cost}(f) = \sum_{u \in V_P} \Delta(u, f(u)) + \lambda \sum_{(u,v) \in E_P} \sigma(f(u), f(v)).$$

The first term reflects the cost of mapping the enzymes/proteins and the second term reflects the dilation cost scaled by a fixed coefficient $\lambda > 0$. If $\mathrm{cost}(f) < \infty$, each edge of $P$ is mapped into a directed path in $T$ and we say that $f$ is a homomorphism. If the pattern is an arbitrary graph, the problem of finding the minimum cost homomorphism is NP-hard (e.g., if $\Delta(u, v)$ is always equal to 0 and $P$ is a clique of $k$ nodes, we have a solution with cost 0 if and only if $T$ contains a clique of size $k$). In this paper we handle the cases when the pattern, viewed as an undirected graph, is either a tree or has a constant-size feedback vertex set.

We now consider general network alignments with deletions. Then a valid solution is a mapping $f : V_P \to V_T \cup \{b, d\}$, where $b$ stands for bypass deletion and $d$ stands for strong deletion. In order to handle bypass deletions we need the following definitions. Let $E_P^f$ be the set of pairs $(u, v) \in f^{-1}(V_T) \times f^{-1}(V_T)$ such that in $E_P$ there exists a path $(u = u_0, \ldots, u_{k+1} = v)$ with $f(u_i) = b$ for $i = 1, \ldots, k$. If $k = 0$, this path is a simple edge; otherwise we say that it validates the values $f(u_i) = b$ for $i = 1, \ldots, k$. We allow $f(u) = b$ only if it is validated. We also require that $(f^{-1}(V_T), E_P^f)$ and $G' = (f^{-1}(V_T \cup \{b\}), E_P)$ are (weakly) connected, and that the nodes of $f^{-1}(b)$ have degree 2 in $G'$ when $P$ is undirected, or in- and outdegree 1 in $G'$ when $P$ is directed. To define the cost of $f$, we need $\Delta$ to be defined on $V_P \times \{b, d\}$, which gives the costs of bypass and strong deletion of the pattern nodes. Finally, the problem is to find a network alignment $f$ with the minimum cost

$$\mathrm{cost}(f) = \sum_{u \in V_P} \Delta(u, f(u)) + \lambda \sum_{(u,v) \in E_P^f} \sigma(f(u), f(v)).$$

4 Recursive/dynamic programming solutions

In this section we first describe a novel fast algorithm for finding an optimal network alignment of tree patterns with no deletions, then of tree patterns with deletions, and finally of arbitrary patterns with a limited feedback vertex set.

4.1 Tree patterns with no deletions

We orient the undirected graph of the pattern so that it is a rooted tree with a root $r$, so each node $u$ has a set of children. If the pattern is undirected, $\bar\sigma$ is the same as $\sigma$; otherwise, if $v$ is a child of $u$ in our rooted tree,

$$\bar\sigma(f(u), f(v)) = \begin{cases} \sigma(f(u), f(v)) & \text{if } (u, v) \in E_P, \\ \sigma(f(v), f(u)) & \text{if } (v, u) \in E_P. \end{cases}$$

We can define the cost of $f$ restricted to the subtree rooted at $u$, so that $\mathrm{cost}(f, r) = \mathrm{cost}(f)$. If $u$ is a leaf, $\mathrm{cost}(f, u) = \Delta(u, f(u))$. Otherwise

$$\mathrm{cost}(f, u) = \Delta(u, f(u)) + \sum_{v} \left( \mathrm{cost}(f, v) + \lambda\,\bar\sigma(f(u), f(v)) \right),$$

where the sum is taken over all children $v$ of $u$.

Now we define the following two recursive functions $A, B : V_P \times V_T \to \mathbb{R}$. In the algorithm, we fill two dynamic programming tables with their values. $A(u, x)$ is defined as the least value of $\mathrm{cost}(f, u)$ such that $f(u) = x$. Note that the optimum solution cost is $\min_{x \in V_T} A(r, x)$. $B(v, x)$ is defined as the least value of $A(v, y) + \lambda\,\bar\sigma(x, y)$, i.e., the contribution that child $v$ can give to the cost of its parent $u$ if $f(u) = x$. Having an additional table for $B$ accelerates the computation of $A$, because it is faster to compute $B(v, x)$ for all $x$ together than separately. If $u$ is a leaf, or $B(v, \ast)$ is computed and tabulated for every child $v$ of $u$, we apply the formula

$$A(u, x) = \Delta(u, x) + \sum_{\text{child } v \text{ of } u} B(v, x).$$

Computing $A(u, x)$ takes time proportional to the number of children of $u$, so the total time for computing the values of $A$ is

$$O\Big(|V_T| \sum_{u \in V_P} \deg(u)\Big) = O(|V_T||E_P|) = O(|V_P||V_T|).$$

Computation of $B(v, x)$ is more involved. The implementation proposed in [6] requires computation of the transitive closure $T' = (V_T, E'_T)$ of the text graph $T$. Computing

$$B(v, x) = \min_{(x,y) \in E'_T} \left( A(v, y) + \lambda\,\bar\sigma(x, y) \right)$$

takes $O(|V_P||E'_T|)$ runtime, which can be as large as $O(|V_P||V_T|^2)$. We propose instead the following adaptation of Dijkstra's algorithm. Given the values $A(v, y)$ for every $y \in V_T$, it finds the values of $B(v, x)$ for all $x \in V_T$ using a priority queue $Q$. The pseudo-code below assumes, without loss of generality, that the edge connecting $v$ with its parent $u$ is directed $(u, v)$. An item $(w, k)$ in $Q$ is a node $w$ with priority key $k$.

    for each x ∈ V_T
        insert (x, A(v, x) − λ) into Q
        B(v, x) ← ∞
    while Q is not empty
        delete from Q the item (y, k) with the minimum key k
        for every (x, y) ∈ E_T
            if B(v, x) > k + λ
                B(v, x) ← k + λ
            if key(x) > k + λ
                decrease key of x in Q to k + λ
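The queue-based computation of B(v, ·) and the recurrence for A can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the MetNetAligner implementation: every pattern edge is assumed to be directed from parent to child, the standard-library binary heap `heapq` with lazy skipping of stale entries stands in for a decrease-key operation, text vertices are assumed to be mutually comparable (e.g., integers), and the names `child_contribution`, `align_tree`, `children` and `delta` are hypothetical.

```python
import heapq
import math


def child_contribution(A_v, text_edges, lam):
    """Return B with B[x] = min over y reachable from x by at least one
    text edge of A_v[y] + lam * sigma(x, y), or inf if nothing is reachable."""
    preds = {x: [] for x in A_v}          # reverse adjacency of the text graph
    for x, y in text_edges:
        preds[y].append(x)

    key = {x: A_v[x] - lam for x in A_v}  # initial priority keys A(v, x) - lambda
    B = {x: math.inf for x in A_v}
    heap = [(k, x) for x, k in key.items()]
    heapq.heapify(heap)
    done = set()

    while heap:
        k, y = heapq.heappop(heap)
        if y in done:                     # stale entry left by a key decrease
            continue
        done.add(y)
        for x in preds[y]:                # every text edge (x, y)
            if B[x] > k + lam:
                B[x] = k + lam
            if x not in done and key[x] > k + lam:
                key[x] = k + lam          # emulate decrease-key by re-pushing
                heapq.heappush(heap, (k + lam, x))
    return B


def align_tree(children, root, delta, text_vertices, text_edges, lam):
    """Return min over x of A(root, x) for a rooted tree pattern.

    children maps a pattern node to the list of its children;
    delta(u, x) is the label-mapping cost of pattern node u onto text node x.
    """
    def A(u):
        table = {x: delta(u, x) for x in text_vertices}
        for v in children.get(u, []):
            B = child_contribution(A(v), text_edges, lam)
            for x in text_vertices:
                table[x] += B[x]
        return table

    return min(A(root).values())
```

For instance, with text edges 0→1, 1→2, 0→3, a pattern edge r→c, λ = 1, and a Δ that is 0 for the pairs (r, 0) and (c, 2) and 5 otherwise, the best alignment maps r to 0 and c to 2 through one intermediate text vertex. Because stale heap entries are skipped rather than updated in place, the sketch corresponds to the binary-heap variant of the algorithm.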

If we implement the priority queue $Q$ with Fibonacci heaps, the runtime for computing $B(v, x)$ for all $x \in V_T$ is $O(|E_T| + |V_T|\log|V_T|)$. Finally, the optimal $\mathrm{cost}(f) = \min_x A(r, x)$ can be computed in time $O(|V_P|(|E_T| + |V_T|\log|V_T|))$. A more practical priority queue based on a binary heap results in a slightly higher total runtime of $O(|V_P||E_T|\log|V_T|)$.

4.2 Tree patterns with deletions

Handling the case with deletions does not increase the asymptotic running time, but it requires several additional considerations. To reduce the number of cases, we will assume that bypass deletion is applied only to so-called path nodes, which are pattern nodes with degree 2 in the case of an undirected pattern, or in- and outdegree 1 in the case of a directed pattern (this considerably simplifies the enforcement of the last consistency rule). For $u \in V_P$, let $D(u)$ be the sum of $\Delta(v, d)$ over all descendants $v$ of $u$ (the cost of strongly deleting the subtree of $u$). Note that the optimum $f$ has some $u \in f^{-1}(V_T)$ such that $f^{-1}(V_T)$ is contained in the subtree of $u$; under that assumption the optimum cost equals $A(u, f(u)) + D(r) - D(u)$, so we get the optimum cost by finding the minimum value of $A(u, x) + D(r) - D(u)$.

Moreover, when we consider the minimum contribution of a child, the value $B(v, x)$, we have to consider two new possibilities. One is that the entire subtree of the child is strongly deleted, so $D(v)$ is a possible value; this can be handled by initializing $B(v, x)$ with $D(v)$ rather than $\infty$. The second is that the child $v$ is bypass deleted, which means that it is a path node and in the tree of $P$ it has a single child $w$. In that case the contribution is "created" in the subtree of $w$ and its cost is increased by $\Delta(v, b)$. To handle that, we introduce another function/array $C(v, x)$, which for non-path nodes equals $A(v, x)$, and for a path node $v$ with a single child $w$ equals $\min(A(v, x), C(w, x) + \Delta(v, b))$, and we use $C(v, x) - \lambda$ as the initial priority key of $x$ (rather than $A(v, x) - \lambda$).

4.3 General patterns

We first consider the case when no deletions are allowed. If $(V_P, E_P)$ has cycles, we first find a minimum feedback node set $F$, a set of nodes such that the subgraph induced by $V_P - F$ has no cycles. Then we consider every possible assumption concerning the values of $f(F)$; there are $|V_T|^{|F|}$ of them (although most of them could be prohibited as biologically meaningless). Thus the running time from Subsection 4.1 is increased, in the worst case, by the factor $|V_T|^{|F|}$. Under such an assumption, we run the tree version of the algorithm in each (weakly) connected component of $V_P - F$, and we add the resulting costs to $\sum_{u \in F} \Delta(u, f(u))$. The only difference is that in the computation of $A(u, x)$ we increase $\Delta(u, x)$ by the sum of the implied dilation costs caused by the assumed $f(F \cup \{u\})$ and the edges that connect $u$ and $F$ (we give a verbose description, as we have cases of directed and undirected edges). Summarizing, we have a running time of $O(|V_T|^{|F|}|V_P||E_T|\log|V_T|)$. Given that each $u \in V_P$ has only $n(u)$ biologically meaningful mappings into $V_T$, we should minimize $\log n(F) = \sum_{u \in F} \log n(u)$. One can use an algorithm with approximation ratio 2 (see [11]) or an exact algorithm that runs in time $O(10^{|F|} n^3)$ (see [14]).

Additional considerations are needed for the general case when vertex deletions are allowed. The running time may increase by a factor that is constant for small $|F|$. Observe that unless the pattern is a single cycle we can assume that $F$ contains no path nodes, so we have complications caused by two phenomena: (a) nodes of $F$ can be strongly deleted, (b) udel($f$) has to be connected. In turn, if we consider a connected component $X$ of $V_P - F$, $f^{-1}(V_T) \cap X$ does not have to be connected. To handle this issue, we introduce the notion of a connectivity pattern in $F$, a family of pairwise disjoint subsets of $F$. One can easily find out that for $|F| = 1, 2, 3, 4, 5, 6$ the number of connectivity patterns is respectively 1, 2, 5, 15, 52, 177. The idea is that when we consider solutions within a subtree of $V_P - F$, we need to distinguish the cases of obtaining different connectivity patterns. More precisely, when we consider the subtree of node $u$ we need to know the connectivity pattern of $F \cup \{u\}$. When we combine the solution for a child $v$ with the solution of its parent $u$, we also combine the connectivity patterns $\pi_u$ and $\pi_v$.

If the parent is bypass deleted, the resulting $\pi_u$ becomes $\pi_v$, except that we replace $u$ with $v$ in one of the sets. If the parent or the child is strongly deleted, the resulting $\pi_u$ is obtained by combining the connections of $\pi_u$ and $\pi_v$ and removing $v$. If neither $u$ nor $v$ is strongly deleted, then we combine the patterns as in the previous case, except that before we remove $v$ we make a union of the sets that contain $u$ and $v$. Now we can define recurrences similar to the previous ones for $A, B, C : V_P \times V_T \times \Pi$, where $\Pi$ is the set of all connectivity patterns. Because not every pattern $\pi$ can have a defined value of, say, $A(u, x, \pi)$, we keep a list of possible patterns rather than a fixed array. Moreover (taking $A$ as an example), if a connectivity pattern $\pi$ gives all connections of $\pi'$ and $A(u, x, \pi) \le A(u, x, \pi')$, we can eliminate $\pi'$ from the $A$-list of $(u, x)$. Now when we update the entries of the tables according to our recurrences, we need to consider the lists of possible connectivity patterns and produce the resulting lists according to our simple rules. Finally, we find the least cost combination of the solutions for the roots of the components of $V_P - F$ that gives the desired connectivity (all undeleted nodes of $F$ in one set).
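For the no-deletion case of Subsection 4.3, the enumeration of assumptions on f(F) can be sketched as below. This is only an illustration: the dictionary `candidates`, which maps each feedback node to its list of allowed text images, and the function names are hypothetical, and the tree algorithm that would be run for each assumption is left out.

```python
import itertools
import math


def feedback_assignments(candidates):
    """Yield every assumption f(F) as a dict {feedback node: text node}.

    candidates[u] lists the text vertices that u may be mapped to, so the
    number of yielded assumptions is the product of the n(u) = len(candidates[u]).
    """
    nodes = sorted(candidates)
    for images in itertools.product(*(candidates[u] for u in nodes)):
        yield dict(zip(nodes, images))


def log_n(candidates):
    """log n(F) = sum over u in F of log n(u), the quantity to be minimized."""
    return sum(math.log(len(images)) for images in candidates.values())
```

Restricting each feedback node to its biologically meaningful images shrinks the enumeration from |V_T|^|F| assumptions to the product of the n(u), which is why a feedback set with small log n(F) is preferable.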

5 Network Alignment of Metabolic Pathways

In this section we first describe the metabolic pathway data, then describe the comparison and results showing the advantage of network alignments over homomorphisms. We then give two examples showing the biological relevance of network alignments for metabolic pathways.

Data. The genome-scale metabolic network data in our studies were drawn from BioCyc [8, 9, 10], the collection of 260 Pathway/Genome Databases, each of which describes the metabolic pathways and enzymes of a single organism. We have chosen the metabolic networks of E. coli, the yeast S. cerevisiae, the eubacterium B. subtilis, the archaebacterium T. thermophilus and the halobacterium H. NRC-1 so that they cover the major lineages Archaea, Eukaryotes, and Eubacteria. The bacterium E. coli with 255 pathways is the most extensively studied prokaryotic organism. T. thermophilus with 149 pathways belongs to Archaea. B. subtilis with 172 pathways is one of the best understood Eubacteria in terms of molecular biology and cell biology. S. cerevisiae with 175 pathways is the most thoroughly researched eukaryotic microorganism. H. NRC-1 with 58 pathways has been extensively used for post-genomic analysis.

MetNetAligner. We have developed an alignment tool called MetNetAligner which is based on the proposed algorithm. The alignment program is written in ANSI C, and all simulations were performed on a PC [7].

Experiments. We ran all-against-all alignments among five species (B. subtilis, E. coli, T. thermophilus, S. cerevisiae and H. NRC-1). For each pair of species, we use our algorithm to find the minimum cost network alignment from each pathway of one species to each pathway of the other and check if this biological homology is statistically significant. The experiments were run on a Pentium 4 processor with a 2.99 GHz clock and 1.00 GB of RAM. The total runtime was 2.5 h for the input/output of pathways and for computing the optimal pattern-to-text mapping and its p-value for every pair of pathways (there are in total 654481 pattern-text pathway pairs).

Our approach uses EC encoding and the tight reaction property classified by EC. The EC number is expressed with a 4-level hierarchical scheme. The 4-digit EC number d1.d2.d3.d4 represents a sub-sub-subclass indication of a biochemical reaction. If d1.d2 of two enzymes are different, their similarity score is infinite; if d3 of two enzymes are different, their similarity score is 10; if d4 of two enzymes are different, their similarity score is 1; otherwise the similarity score is 0. The penalty scores for pattern vertex deletion and text vertex deletion are 0.5 and 0, respectively. Additionally, we allow partially identified enzymes in the pattern to be mapped to any enzyme in the text without any penalty; conversely, when a partially identified enzyme occurs in the text, a mismatch penalty of 0.1 is charged when any pattern enzyme is mapped to it. Our implementation also provides a previously known enzyme similarity score (see [3]), but that scoring scheme results in biochemically less relevant pathway matches.

Statistical Significance of Alignments. Following a standard randomization procedure, we randomly permute pairs of edges (u, v) and (u', v'), provided that no other edges exist between the 4 vertices u, u', v, v' in the text graph, by reconnecting them as (u, v') and (u', v). This allows us to keep the incoming and outgoing degree of each vertex intact. We find the minimum cost alignment from the pattern graph into the fully randomized text graph and check if its cost is at least as big as the minimum cost before randomization of the text graph. We say that the alignment is statistically significant with p < 0.05 if we found at most 4 better costs in 100 randomizations of the text graph.

Results. For alignments from T. thermophilus to B. subtilis, there are in total 2968 statistically significant mapped pairs; 87 out of 149 T. thermophilus pathways have statistically significant aligned images in B. subtilis and 143 out of 172 B. subtilis pathways have statistically significant pre-images. For alignments from E. coli to S. cerevisiae, there are in total 5418 statistically significant mapped pairs; 109 out of 255 E. coli pathways have statistically significant aligned images in S. cerevisiae and 153 out of 175 S. cerevisiae pathways have statistically significant homomorphic pre-images.

We find more statistically significant pathway alignments than in [3] (52703 vs 13110 out of a total of 654481 pattern-text pathway pairs). Table 1 illustrates the advantage of the network alignment over homomorphisms (network alignment without vertex deletion, see [6]). For both characteristics (number of mismatches and number of gaps) the best network alignments significantly outperform the best homomorphisms.

                                 NA                   HM
    Alignment                    Mismatches  Gaps     Mismatches  Gaps
    E. coli -> T. thermophilus   0.58        0.04     0.76        0.07
    E. coli -> B. subtilis       0.23        0.03     0.38        0.06
    E. coli -> H. NRC-1          1.60        0.10     2.31        0.12
    E. coli -> S. cerevisiae     0.22        0.04     0.22        0.05

Table 1. Alignment of tree pathways from different species with optimal homomorphisms (HM) [7] and optimal network alignments (NA). Average numbers of mismatches and gaps are reported on common statistically significant matched pathways.

Symmetry of optimal network alignments. The advantage of the new network alignment with respect to the homomorphisms from [6] can be observed in the symmetry of the network alignment. The homomorphism is inherently asymmetric since it can delete vertices only from the text but aligns all pattern vertices. For example, consider the network alignment of two pathways, one in E. coli and one in S. cerevisiae (see Fig. 2). The pentose phosphate pathway in E. coli is a cytosolic process that serves to generate NADPH, which facilitates the synthesis of pentose (5-carbon) sugars. The superpathway of oxidative and non-oxidative branches of the pentose phosphate pathway in S. cerevisiae is an alternative way of oxidizing glucose. This oxidation is coupled with NADPH synthesis. One can observe that these two pathways might have evolved from a common origin, which is supported by the almost identical statistically significant alignments between them.

Figure 2. Alignments between (A) the pentose phosphate pathway in E. coli and (B) the superpathway of oxidative and non-oxidative branches of the pentose phosphate pathway in S. cerevisiae, and vice versa.

Significant Deletions. Our hypothesis is that statistically significant deletions in network alignments of metabolic pathways are caused by one of the following reasons: (i) the existence of an alternative pathway producing the same nutrient, (ii) the minimal media required for the growth of the text organism contain the product produced by the missing pathway, and (iii) incomplete metabolic pathways for the text organisms. This hypothesis is confirmed in the example in Fig. 3, with the solid conserved subpath and the dotted deleted subpath. One of the deleted subpaths (dashed in Fig. 3a) produces methionine, which is necessary for the production of biotin; biotin is required by E. coli (pattern) but is not required by T. thermophilus (text organism) in its minimal media (see [8]).

Figure 3. An example of significant deletion. (a) Pattern: aspartate superpathway in E. coli; (b) text: lysine biosynthesis pathway in T. thermophilus; (c) mapping result, showing strong deletion of a pattern subgraph (p < 0.05). Unmatched vertices in the pattern are deleted.
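The EC-based mismatch score and the randomization test used in the experiments can be sketched in Python as follows. The sketch rests on several assumptions: EC numbers are fully specified strings of the form d1.d2.d3.d4 (the special treatment of partially identified enzymes is omitted), "better" is read as strictly lower cost, and the function names `ec_similarity`, `degree_preserving_swap` and `is_significant` are hypothetical rather than taken from MetNetAligner.

```python
import math
import random


def ec_similarity(ec_a, ec_b):
    """Mismatch cost between two fully specified EC numbers 'd1.d2.d3.d4'."""
    a, b = ec_a.split("."), ec_b.split(".")
    if a[:2] != b[:2]:
        return math.inf   # different class or subclass: never matched
    if a[2] != b[2]:
        return 10         # different sub-subclass
    if a[3] != b[3]:
        return 1          # different serial number
    return 0


def degree_preserving_swap(edges, rng):
    """Attempt one swap (u,v),(u2,v2) -> (u,v2),(u2,v) on a directed edge set.

    The swap is applied only if the four endpoints are distinct and no other
    edge joins them; otherwise the edge set is returned unchanged.
    """
    edge_set = set(edges)
    (u, v), (u2, v2) = rng.sample(sorted(edge_set), 2)
    nodes = {u, v, u2, v2}
    if len(nodes) < 4:
        return edge_set
    between = {(a, b) for (a, b) in edge_set if a in nodes and b in nodes}
    if between != {(u, v), (u2, v2)}:
        return edge_set
    return (edge_set - between) | {(u, v2), (u2, v)}


def is_significant(cost, randomized_costs, max_better=4):
    """True if at most max_better randomized text graphs give a lower cost."""
    return sum(c < cost for c in randomized_costs) <= max_better
```

Each accepted swap removes one outgoing edge from u and u2 and adds one back, and likewise for the incoming edges of v and v2, so all in- and out-degrees are preserved. With 100 randomized costs and the default `max_better=4`, `is_significant` corresponds to the p < 0.05 criterion stated above.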

References

[1] M. Chen, R. Hofestaedt. PathAligner: metabolic pathway retrieval and alignment. Appl Bioinformatics (2004) 3: 241-52.
[2] M. Chen, R. Hofestaedt. An algorithm for linear metabolic pathway alignment. In Silico Biology, ISSN 1386-6338: 111-128, 2005.
[3] R.Y. Pinter, O. Rokhlenko, E. Yeger-Lotem, M. Ziv-Ukelson. Alignment of metabolic pathways. Bioinformatics 21(16): 3401-8 (Aug 2005). LNCS 3109, Springer-Verlag.
[4] R. Sharan, S. Suthram, R. M. Kelley, T. Kuhn, S. McCuine, et al. Conserved patterns of protein interaction in multiple species. PNAS. Vol. 102: 1974-1979 (2005).
[5] Q. Yang, S. Sze. Path Matching and Graph Matching in Biological Networks. Journal of Computational Biology. Vol. 14, No. 1: 56-67 : 5527-5530 (2007).
[6] Q. Cheng, R. Harrison, A. Zelikovsky. Homomorphisms of Multisource Trees into Networks with Applications to Metabolic Pathways. BIBE'07, pp. 350-357.
[7] Q. Cheng, D. Kaur, R. Harrison, A. Zelikovsky. Homomorphisms of Multisource Trees into Networks with Applications to Metabolic Pathways. Proc. RECOMB Satellite Conference on System Biology (RECOMB SCSB 2007).
[8] http://www.biocyc.org/
[9] I. M. Keeler, V. J. Collard, C. S. Gama, J. Ingraham, et al. EcoCyc: a comprehensive database resource for Escherichia coli. Nucleic Acids Res. 2005 Jan 1; 33(Database issue): D334-7.
[10] R. Caspi, H. Foerster, C. A. Fulcher, R. Hopkinson, et al. MetaCyc: a microorganism database of metabolic pathways and enzymes. Nucleic Acids Res. 2006 January 1; 34(Database issue): D511-D516.
[11] V. Bafna, P. Berman, T. Fujito. A 2-Approximation Algorithm for the Undirected Feedback Vertex Set Problem. SIAM J. Discrete Math. 12(3): 289-297 (1999).
[12] M. Garey, D. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman and Company, 1979.
[13] R. Sharan, T. Ideker. Modeling cellular machinery through biological network comparison. Nature Biotechnology, 24(4): 427-433, 2006.
[14] F. Dehne, et al. An O(2^{O(k)} n^3) FPT algorithm for the undirected feedback vertex set problem. Proc. COCOON 2005, 859-869.
