2017 IEEE 33rd International Conference on Data Engineering
Correcting and Speeding-Up Bounds for Non-Uniform Graph Edit Distance
David B. Blumenthal
Johann Gamper
Free University of Bozen-Bolzano, Italy Email:
[email protected]
Free University of Bozen-Bolzano, Italy Email:
[email protected]
Abstract: The problem of deriving lower and upper bounds for the edit distance between labelled undirected graphs has recently received increasing attention. However, only one algorithm has been proposed that allegedly computes not only an upper but also a lower bound for non-uniform metric edit costs and incorporates information about both node and edge labels. In this paper, we show that this algorithm is incorrect in the sense that, in general, it does not compute a lower bound. We present BRANCH, a corrected version of the algorithm that runs in $O(n^5)$ time. We also develop a speed-up BRANCHFAST that runs in $O(n^4)$ time and computes a lower bound, which is only slightly less accurate than the one computed by BRANCH. An experimental evaluation shows that BRANCH and BRANCHFAST yield excellent runtime/accuracy tradeoffs, as they outperform all existing competitors in terms of runtime or in terms of accuracy.
2375-026X/17 $31.00 © 2017 IEEE. DOI 10.1109/ICDE.2017.57
I. INTRODUCTION
As labelled undirected graphs can be used for modelling various kinds of objects, they have received increasing attention over the past years. One task researchers have focused on is the following: Given a database $\mathcal{G}$ that contains labelled graphs, find all graphs $G \in \mathcal{G}$ that are sufficiently similar to a query graph $H$, or find the $k$ graphs from $\mathcal{G}$ that are most similar to $H$. For this, a distance measure between undirected, labelled graphs $G$ and $H$ has to be defined. A commonly used measure is the graph edit distance.
Formally, a labelled undirected graph $G$ is a 4-tuple $G = (V^G, E^G, \ell_V^G, \ell_E^G)$, where $V^G$ is a set of nodes, $E^G$ is a set of undirected edges, and $\ell_V^G : V^G \to \Sigma_V$ and $\ell_E^G : E^G \to \Sigma_E$ are labelling functions that assign labels from the alphabets $\Sigma_V$ and $\Sigma_E$ to nodes and edges. Both $\Sigma_V$ and $\Sigma_E$ contain a special label $\varepsilon$ reserved for dummy nodes and dummy edges. The graph edit distance $\lambda(G, H)$ between graphs $G$ and $H$ on common label alphabets $\Sigma_V$ and $\Sigma_E$ is defined as the minimum cost $c(P)$ of an edit path $P$ between $G$ and $H$. An edit path is a sequence of labelled graphs starting with $G$ and ending at a graph that is isomorphic to $H$. Each graph along the path can be obtained from its predecessor by applying one of the following edit operations: deleting or inserting an $\alpha$-labelled edge, deleting or inserting an isolated $\alpha$-labelled node, or changing a node's or an edge's label from $\alpha$ to $\beta \neq \alpha$. Edit operations on nodes and edges come with associated edit costs $c_V : \Sigma_V \times \Sigma_V \to \mathbb{R}$ and $c_E : \Sigma_E \times \Sigma_E \to \mathbb{R}$, respectively. The cost of an edit path is defined as the sum of the costs of its edit operations. If the cost of each edit operation equals 1, we say that the edit costs are uniform. It is often natural to consider non-uniform metric edit costs. For instance, if the graphs model spatial objects and the node labels are Euclidean coordinates, $c_V$ is naturally defined as the Euclidean distance between the node labels.
If edit costs are metric, one can assume w.l.o.g. that $V^G = V^H = \{1, \dots, n\}$, as, in this case, filling up the smaller graph with isolated $\varepsilon$-labelled dummy nodes leaves $\lambda(G, H)$ invariant [3]. Furthermore, for metric edit costs, $\lambda(G, H)$ can alternatively be defined as the minimum cost of an edit path that is induced by a permutation $\pi : V^G \to V^H$ [3]. The induced edit path $P_\pi$ of a given permutation $\pi$ is defined as follows: If $\pi(i) = k$, $i$'s label is changed from $\ell_V^G(i)$ to $\ell_V^H(k)$. If $ij \in E^G$ but $\pi(i)\pi(j) \notin E^H$, the edge $ij$ is deleted. If $kl \in E^H$ but $\pi^{-1}(k)\pi^{-1}(l) \notin E^G$, the edge $kl$ is inserted. Finally, if an edge $ij \in E^G$ is mapped to an edge $kl \in E^H$, $ij$'s label is changed from $\ell_E^G(ij)$ to $\ell_E^H(kl)$. In the following, we always assume that edit costs are metric and that $V^G = V^H = \{1, \dots, n\}$, and work with the alternative definition of $\lambda(G, H)$.
Since exactly computing the graph edit distance is NP-hard [9], research has mainly focused on the task of devising heuristics for calculating lower and upper bounds. From 2009 on, many such heuristics have been proposed for the uniform case [9]–[12]. In 2006, Justice and Hero [3] developed an LP-based algorithm LP that produces a lower bound for non-uniform node edit costs, runs in $O(n^7)$ time, does not allow the incorporation of edge labels, and does not give a corresponding upper bound. They also proposed a second algorithm NODE that runs in $O(n^3)$ time and computes both a lower and an upper bound. This algorithm does not consider edges at all and therefore usually produces very loose bounds. In a second stream of research that started with the algorithm BP proposed by Riesen and Bunke [6], several upper bounds have been suggested that also work for non-uniform edge edit costs [1], [2], [7]. Recently, Riesen et al. [8] argued that BP can also be used to compute a lower bound for the graph edit distance. If correct, BP would have been the only available algorithm that considers edge labels and computes not only an upper but also a lower bound for non-uniform edit costs.
In this paper, we show that, if viewed as an algorithm that computes a lower bound for $\lambda(G, H)$, BP is incorrect, and we present a corrected version of it (Section II). This corrected algorithm, which we call BRANCH, runs in $O(n^5)$ time and builds upon the same ideas as the original, but avoids the technical flaw that leads to BP's misbehaviour. We also propose a speed-up BRANCHFAST that yields slightly looser bounds than BRANCH but runs in $O(n^4)$ time (Section III). We empirically evaluate the new algorithms and show that both of them yield excellent runtime/accuracy tradeoffs (Section IV). In particular, the experiments show that our algorithms are Pareto optimal, as they outperform all existing competitors in terms of runtime or in terms of accuracy.
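The definition of the induced edit path $P_\pi$ given in the introduction can be made concrete with a small sketch. The code below is an illustration only, not the authors' implementation: the function name `induced_edit_path_cost`, the helper `make_cost`, the 0-based node indices, the use of `None` for the dummy label $\varepsilon$, and the example graphs and costs are all assumptions chosen for this example.

```python
from typing import Callable, Dict, FrozenSet, List, Optional

Label = Optional[float]            # None stands for the dummy label epsilon
EdgeMap = Dict[FrozenSet[int], float]


def make_cost(indel: float) -> Callable[[Label, Label], float]:
    """Hypothetical metric cost: |a - b| for relabelling, `indel` for insert/delete."""
    def cost(a: Label, b: Label) -> float:
        if a is None and b is None:
            return 0.0
        if a is None or b is None:
            return indel
        return abs(a - b)
    return cost


def induced_edit_path_cost(nodes_g: List[Label], edges_g: EdgeMap,
                           nodes_h: List[Label], edges_h: EdgeMap,
                           pi: List[int],
                           c_v: Callable[[Label, Label], float],
                           c_e: Callable[[Label, Label], float]) -> float:
    """Cost c(P_pi) of the edit path induced by the node permutation pi."""
    n = len(nodes_g)
    # Node i of G is relabelled (or deleted/inserted via epsilon) into node pi[i] of H.
    cost = sum(c_v(nodes_g[i], nodes_h[pi[i]]) for i in range(n))
    # Every unordered node pair ij of G is compared with the pair pi(i)pi(j) of H;
    # a missing edge is treated as an epsilon-labelled dummy edge.
    for i in range(n):
        for j in range(i + 1, n):
            label_g = edges_g.get(frozenset((i, j)))
            label_h = edges_h.get(frozenset((pi[i], pi[j])))
            cost += c_e(label_g, label_h)
    return cost


# Toy instance (assumed for illustration): H has one real node and one dummy node.
c_v, c_e = make_cost(5.0), make_cost(3.0)
nodes_g, edges_g = [1.0, 3.0], {frozenset((0, 1)): 2.0}
nodes_h, edges_h = [1.0, None], {}

cost_identity = induced_edit_path_cost(nodes_g, edges_g, nodes_h, edges_h, [0, 1], c_v, c_e)
cost_swap = induced_edit_path_cost(nodes_g, edges_g, nodes_h, edges_h, [1, 0], c_v, c_e)
```

Since $\lambda(G, H)$ is the minimum of $c(P_\pi)$ over all permutations, each value returned by this function is an upper bound for the graph edit distance of the toy instance.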
II. BRANCH: A CORRECTED VERSION OF BP
The idea behind BP [6] is to compute a permutation $\pi^*$ that induces a cheap edit path $P_{\pi^*}$ by decomposing $G$ and $H$ into their branches (nodes together with incident edges). Its cost $c(P_{\pi^*})$ is then used as an upper bound for $\lambda(G, H)$. For this, an auxiliary complete bipartite graph $(V^G \times V^H, \tilde{c}_1)$ between the nodes of $G$ and $H$ with edge costs $\tilde{c}_1$ is constructed. The permutation $\pi^*$ is computed as a solution to the minimum linear assignment problem induced by $(V^G \times V^H, \tilde{c}_1)$, i.e., $\pi^*$ is chosen such that its assignment cost $\tilde{c}_1(\pi^*) := \sum_{i=1}^{n} \tilde{c}_1(i, \pi^*(i))$ equals the solution $\psi(V^G \times V^H, \tilde{c}_1) := \min\{\tilde{c}_1(\pi) \mid \pi : V^G \to V^H \text{ is a permutation}\}$ of the minimum linear assignment problem. This problem can be solved in $O(n^3)$ time, e.g., by using the Hungarian Algorithm [4]. The auxiliary costs $\tilde{c}_1(i, k)$ are defined as the cost of transforming the branch rooted at $i$ in $G$ into the branch rooted at $k$ in $H$:
\[ \tilde{c}_1(i, k) := c_V(\ell_V^G(i), \ell_V^H(k)) + \psi(\ell_E^G[\delta^G(i)] \times \ell_E^H[\delta^H(k)], c_E) \]
Here, $\delta^G(i)$ and $\delta^H(k)$ are the sets of edges that are incident with $i$ in $G$ and $k$ in $H$. The smaller of these sets is filled up with $\varepsilon$-labelled dummy edges. The complexity of computing $\tilde{c}_1(i, k)$ is hence $O(\max\{|\delta^G(i)|, |\delta^H(k)|\}^3)$. On graphs without bounded maximum degree, BP thus runs in $O(n^5)$ time.
In [8], it is claimed that, if $\pi^*$ is the permutation computed by BP, then $\tilde{c}_V(\pi^*) + \tilde{c}_E(\pi^*)/2$ is a lower bound for $\lambda(G, H)$. Here, $\tilde{c}_V(\pi^*) := \sum_{i=1}^{n} c_V(\ell_V^G(i), \ell_V^H(\pi^*(i)))$ is the portion of $\tilde{c}_1(\pi^*)$ which is induced by relabelling the nodes of the branches, while $\tilde{c}_E(\pi^*) := \sum_{i=1}^{n} \psi(\ell_E^G[\delta^G(i)] \times \ell_E^H[\delta^H(\pi^*(i))], c_E)$ denotes the portion of $\tilde{c}_1(\pi^*)$ caused by adjusting their edges. As we have $\tilde{c}_1(\pi^*) = \tilde{c}_V(\pi^*) + \tilde{c}_E(\pi^*)$, this would imply that, given $\pi^*$, a lower bound can be computed in additional linear time. The intuition behind the claim that $\tilde{c}_V(\pi^*) + \tilde{c}_E(\pi^*)/2$ is a lower bound for $\lambda(G, H)$ is that, by definition, $\tilde{c}_1(i, k)$ is given as the sum of the node relabelling cost one has to pay if $i$ is mapped to $k$ and the minimum cost of editing incident edges if global feasibility of edit operations on edges is disregarded. Therefore, so the argument goes, $\pi^*$ can be viewed as an optimal solution to a relaxation of the problem of computing $\lambda(G, H)$, and a lower bound for $\lambda(G, H)$ can be obtained if the portion of $\tilde{c}_1(\pi^*)$ caused by editing edges is divided by 2. Lemma 1 shows that this reasoning is erroneous.
Lemma 1: There are labelled undirected graphs $G$ and $H$ such that $\tilde{c}_V(\pi^*) + \tilde{c}_E(\pi^*)/2 > \lambda(G, H)$, where $\pi^*$ is the permutation computed by BP.
Proof: Let $G$ be a graph that contains two 4-labelled nodes 1 and 2 linked by a 0-labelled edge and two isolated 0-labelled nodes 3 and 4. $H$ contains only two 0-labelled nodes 1 and 2 linked by a 0-labelled edge and two dummy nodes 3 and 4. The relabelling costs are given by the Euclidean distance between the labels and the insertion/deletion costs are $c_V(\alpha, \varepsilon) = c_V(\varepsilon, \alpha) = 5$ and $c_E(\alpha, \varepsilon) = c_E(\varepsilon, \alpha) = 3$. Figure 1 depicts the auxiliary bipartite graph induced by this instance. There are only three non-equivalent permutations for the linear assignment problem on $(V^G \times V^H, \tilde{c}_1)$:
\[ \pi' := \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix}, \quad \pi'' := \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix}, \quad \pi''' := \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 3 & 2 \end{pmatrix} \]
By definition of $\tilde{c}_1$, it holds that $\tilde{c}_1(\pi') = 18$, $\tilde{c}_1(\pi'') = 22$, and $\tilde{c}_1(\pi''') = 20$. Therefore, $\pi'$ is optimal for $(V^G \times V^H, \tilde{c}_1)$. Now consider the edit path induced by $\pi''$: We have to delete the edge 12 from $G$, insert the edge 12 into $H$, and delete the nodes 1 and 2 from $G$. Therefore, it holds that $\lambda(G, H) \leq c(P_{\pi''}) = 16$. At the same time, the claimed lower bound is $f(\tilde{c}_1(\pi')) = \tilde{c}_V(\pi') + \tilde{c}_E(\pi')/2 = 18$, as $\tilde{c}_E(\pi') = 0$.
Fig. 1. The auxiliary bipartite graph used in the proof of Lemma 1. Nodes are shown with incident edges and edges between groups of nodes represent the complete bipartite graph between the nodes contained in the groups.
The reason for BP's misbehaviour is that the auxiliary costs $\tilde{c}_1$, for which the permutation $\pi^*$ is optimised, overemphasise edit operations on edges by a factor of 2, since every edge belongs to the branches of both of its endpoints. Although this bias is removed after optimisation, it might lead to wrong results. As in the counterexample provided above, optimising for $\tilde{c}_1$ might yield a permutation that avoids editing edges at the price of costly edit operations on nodes. This diagnosis readily provides us with a remedy. Rather than removing the bias after optimisation, we have to remove it before optimisation. For this, we redefine the auxiliary edge costs:
\[ \tilde{c}_2(i, k) := c_V(\ell_V^G(i), \ell_V^H(k)) + \frac{1}{2}\, \psi(\ell_E^G[\delta^G(i)] \times \ell_E^H[\delta^H(k)], c_E) \]
Let BRANCH be the algorithm that uses this definition of the auxiliary costs and otherwise proceeds like BP. Theorem 1 shows that BRANCH computes a lower bound for $\lambda(G, H)$.
Theorem 1: The inequality $\psi(V^G \times V^H, \tilde{c}_2) \leq \lambda(G, H)$ holds for all labelled undirected graphs $G$ and $H$.
Proof: Let $G$ and $H$ be two labelled undirected graphs. To simplify notation, we replace all missing edges by dummy edges in both $G$ and $H$. These operations leave $\psi(V^G \times V^H, \tilde{c}_2)$ and $\lambda(G, H)$ invariant. After filling up $G$ and $H$ with dummy edges, $\delta^G(i) = \{ij \mid j \in V^G \setminus \{i\}\}$ holds for all $i \in V^G$, and $\delta^H(k) = \{kl \mid l \in V^H \setminus \{k\}\}$ holds for all $k \in V^H$. We now consider a permutation $\hat{\pi}$ with $\lambda(G, H) = c(P_{\hat{\pi}})$ and a permutation $\pi^*$ with $\tilde{c}_2(\pi^*) = \psi(V^G \times V^H, \tilde{c}_2)$. The following chain of inequalities proves the theorem:
\begin{align*}
\lambda(G, H) = c(P_{\hat{\pi}}) &= \sum_{i \in V^G} c_V(\ell_V^G(i), \ell_V^H(\hat{\pi}(i))) + \sum_{ij \in \binom{V^G}{2}} c_E(\ell_E^G(ij), \ell_E^H(\hat{\pi}(i)\hat{\pi}(j))) \\
&= \sum_{i \in V^G} \Big[ c_V(\ell_V^G(i), \ell_V^H(\hat{\pi}(i))) + \frac{1}{2} \sum_{j \in V^G \setminus \{i\}} c_E(\ell_E^G(ij), \ell_E^H(\hat{\pi}(i)\hat{\pi}(j))) \Big] \\
&\geq \sum_{i \in V^G} \Big[ c_V(\ell_V^G(i), \ell_V^H(\hat{\pi}(i))) + \frac{1}{2} \underbrace{\min_{\sigma : V^G \to V^H \text{ is a permutation}} \sum_{j \in V^G \setminus \{i\}} c_E(\ell_E^G(ij), \ell_E^H(\hat{\pi}(i)\sigma(j)))}_{= \psi(\ell_E^G[\delta^G(i)] \times \ell_E^H[\delta^H(\hat{\pi}(i))],\, c_E)} \Big] \\
&= \tilde{c}_2(\hat{\pi}) \geq \tilde{c}_2(\pi^*) = \psi(V^G \times V^H, \tilde{c}_2)
\end{align*}
Clearly, BRANCH inherits the complexity $O(n^5)$ from BP.
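The counterexample of Lemma 1 and the corrected costs of Theorem 1 can be checked numerically. The following sketch rebuilds the Lemma 1 instance with 0-based node indices and `None` as the dummy label $\varepsilon$, and compares the value claimed in [8] with the BRANCH bound. It is an illustration under assumptions, not the authors' implementation: all function names are hypothetical, and the linear assignment problems are solved by brute force over all permutations, which stands in for the Hungarian Algorithm and is only feasible for tiny inputs.

```python
from itertools import permutations

EPS = None  # dummy label epsilon


def make_cost(indel):
    """Metric cost as in Lemma 1: |a - b| for relabelling, `indel` for insert/delete."""
    def cost(a, b):
        if a is EPS and b is EPS:
            return 0.0
        if a is EPS or b is EPS:
            return indel
        return abs(a - b)
    return cost


c_v, c_e = make_cost(5.0), make_cost(3.0)


def lsap(cost):
    """Brute-force linear assignment; returns (optimal value, optimal permutation)."""
    n = len(cost)

    def value(p):
        return sum(cost[i][p[i]] for i in range(n))

    best = min(permutations(range(n)), key=value)
    return value(best), best


def incident(edges, i):
    """Labels of the edges incident with node i."""
    return [label for edge, label in edges.items() if i in edge]


def psi_edges(labels_i, labels_k):
    """psi over the incident edge labels, smaller side padded with dummy edges."""
    m = max(len(labels_i), len(labels_k))
    a = labels_i + [EPS] * (m - len(labels_i))
    b = labels_k + [EPS] * (m - len(labels_k))
    return lsap([[c_e(x, y) for y in b] for x in a])[0]


def induced_cost(nodes_g, edges_g, nodes_h, edges_h, pi):
    """Cost of the edit path induced by pi (an upper bound for the edit distance)."""
    n = len(nodes_g)
    cost = sum(c_v(nodes_g[i], nodes_h[pi[i]]) for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            cost += c_e(edges_g.get(frozenset((i, j))),
                        edges_h.get(frozenset((pi[i], pi[j]))))
    return cost


# Instance from the proof of Lemma 1 (nodes 1..4 become indices 0..3).
nodes_g, edges_g = [4.0, 4.0, 0.0, 0.0], {frozenset((0, 1)): 0.0}
nodes_h, edges_h = [0.0, 0.0, EPS, EPS], {frozenset((0, 1)): 0.0}
n = len(nodes_g)

node_part = [[c_v(nodes_g[i], nodes_h[k]) for k in range(n)] for i in range(n)]
edge_part = [[psi_edges(incident(edges_g, i), incident(edges_h, k))
              for k in range(n)] for i in range(n)]
c1 = [[node_part[i][k] + edge_part[i][k] for k in range(n)] for i in range(n)]
c2 = [[node_part[i][k] + edge_part[i][k] / 2 for k in range(n)] for i in range(n)]

# BP: optimise for c1, then halve the edge portion afterwards (the flawed bound).
_, pi_bp = lsap(c1)
bp_claimed = sum(node_part[i][pi_bp[i]] + edge_part[i][pi_bp[i]] / 2 for i in range(n))
# BRANCH: halve the edge portion before optimisation.
branch_bound, _ = lsap(c2)
# Edit path induced by pi'' from the proof of Lemma 1.
upper = induced_cost(nodes_g, edges_g, nodes_h, edges_h, (2, 3, 0, 1))
```

On this instance the value claimed in [8] exceeds the cost of an actual edit path, whereas the BRANCH value does not, in line with Lemma 1 and Theorem 1.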
III. BRANCHFAST: A SPEED-UP OF BRANCH
Our second algorithm BRANCHFAST improves the runtime of BRANCH from $O(n^5)$ to $O(n^4)$ at the price of a slight loss in accuracy. For this speed-up, we borrow techniques employed by an algorithm that Zheng et al. [11] developed for uniform edit costs, and adapt them to non-uniform edit costs.
Zheng et al.'s algorithm differs from BRANCH only in the definition of the auxiliary edge costs, which they define as
\[ \tilde{c}_3(i, k) := c_V(\ell_V^G(i), \ell_V^H(k)) + \frac{1}{2}\, \Gamma(G, H, i, k), \]
where $\Gamma(G, H, i, k) := \max\{d_G(i), d_H(k)\} - |\ell_E^G[\delta^G(i)] \cap \ell_E^H[\delta^H(k)]|$. It has been shown that, for uniform edit costs, it holds that $\tilde{c}_2 = \tilde{c}_3$ [11]. However, Zheng et al.'s algorithm outperforms BRANCH in terms of runtime complexity. This is because, for computing $\tilde{c}_3(i, k)$, it only has to carry out a multiset intersection rather than optimally solve a linear assignment problem. Therefore, $\tilde{c}_3(i, k)$ can be computed in linear rather than cubic time [9]. In the uniform case, Zheng et al.'s algorithm can thus be viewed as an improved implementation of BRANCH. For non-uniform edit costs, $\tilde{c}_2 = \tilde{c}_3$ no longer holds. In fact, it is easy to see that $\psi(V^G \times V^H, \tilde{c}_3)$ is in general not a lower bound for $\lambda(G, H)$.
Our speed-up BRANCHFAST adapts the definition of $\tilde{c}_3(i, k)$ to non-uniform edit costs. For this purpose, "flattened" edge relabelling costs $c_E^{ik}$ for the bipartite graphs $\ell_E^G[\delta^G(i)] \times \ell_E^H[\delta^H(k)]$ are defined. Recall that $\delta^G(i)$ and $\delta^H(k)$ are the sets of edges that are incident with $i$ in $G$ and $k$ in $H$, respectively, and that the smaller of these sets is filled up with $\varepsilon$-labelled dummy edges. The flattened edge relabelling costs are defined as follows: If $\alpha \neq \beta$ for edge labels $\alpha \in \ell_E^G[\delta^G(i)]$ and $\beta \in \ell_E^H[\delta^H(k)]$, we define $c_E^{ik}(\alpha, \beta)$ as the minimum edge relabelling cost for changing an edge label that, in $G$, is incident with $i$ into a non-identical edge label that, in $H$, is incident with $k$. Otherwise, $c_E^{ik}(\alpha, \beta)$ is set to 0. The auxiliary edge costs used by BRANCHFAST are then defined as follows:
\[ \tilde{c}_4(i, k) := c_V(\ell_V^G(i), \ell_V^H(k)) + \frac{1}{2}\, \psi(\ell_E^G[\delta^G(i)] \times \ell_E^H[\delta^H(k)], c_E^{ik}) \]
Apart from this modification, BRANCHFAST proceeds just like BRANCH. Lemma 2 shows that BRANCHFAST computes a lower bound for $\lambda(G, H)$.
Lemma 2: The inequality $\psi(V^G \times V^H, \tilde{c}_4) \leq \lambda(G, H)$ holds for all labelled undirected graphs $G$ and $H$.
Proof: Let $G$ and $H$ be two undirected labelled graphs, $\pi_2$ be a permutation that is optimal for $(V^G \times V^H, \tilde{c}_2)$, and $\pi_4$ be a permutation that is optimal for $(V^G \times V^H, \tilde{c}_4)$. As $c_E^{ik} \leq c_E$, $\tilde{c}_4(i, k) \leq \tilde{c}_2(i, k)$ holds for all $i \in V^G$ and $k \in V^H$. We hence have $\tilde{c}_4(\pi_4) \leq \tilde{c}_4(\pi_2) \leq \tilde{c}_2(\pi_2)$. By Theorem 1, this implies the statement of the lemma.
In fact, Lemma 2 shows not only that BRANCHFAST computes a lower bound for $\lambda(G, H)$, but also that this lower bound is never tighter than the one produced by BRANCH. The advantage of BRANCHFAST compared to BRANCH is that it runs in $O(n^4)$ rather than $O(n^5)$ time.
Lemma 3: BRANCHFAST runs in $O(n^4)$ time.
Proof: By definition of $c_E^{ik}$, it holds that $\psi(\ell_E^G[\delta^G(i)] \times \ell_E^H[\delta^H(k)], c_E^{ik}) = c_E^{ik}\, \Gamma(G, H, i, k)$. As shown in [9], $\Gamma(G, H, i, k)$ can be computed in linear time. Computing $c_E^{ik}$ requires quadratic time. Therefore, $\tilde{c}_4(i, k)$ can be computed in quadratic time, which implies the statement of the lemma.
IV. EMPIRICAL EVALUATION
We compared BRANCH and BRANCHFAST to all relevant competitors, namely NODE, LP, and BP.¹ NODE and LP were selected because they are the only existing correct algorithms that compute lower bounds for non-uniform edit costs. BP was selected because, to the best of our knowledge, for all existing upper bounds for non-uniform edit costs, there are experiments showing that none of them outperforms BP both in terms of runtime and in terms of accuracy. Since BRANCH and BRANCHFAST are designed for non-uniform edit costs, we did not select any algorithms that only work for uniform costs. We implemented all algorithms in C++, making them employ the same data structures and subroutines. For implementing LP, we used the C++ API of Gurobi Optimization, a highly efficient commercial LP-solver. All tests were carried out on a machine with two Intel Xeon E5-2667 v3 processors with 8 cores each and 98 GB of main memory running GNU/Linux.
¹ Recall that LP does not yield an upper bound and BP does not yield a lower bound.
We conducted our tests on the datasets AIDS and PROTEIN [5], which are widely used in the community [1], [6]–[12]. The datasets contain graphs with node and edge labels for which non-uniform metric relabelling costs $c_V$ and $c_E$ are induced by the domain [6]. For defining non-uniform metric edit costs, we thus only had to specify the deletion/insertion costs $c_V(\alpha, \varepsilon)$ and $c_E(\alpha, \varepsilon)$. This was done by setting $c_V(\alpha, \varepsilon) := \rho \max\{c_V(\beta, \gamma) \mid \beta, \gamma \in \Sigma_V\}$ for all $\alpha \in \Sigma_V$, and $c_E(\alpha, \varepsilon) := \rho \max\{c_E(\beta, \gamma) \mid \beta, \gamma \in \Sigma_E\}$ for all $\alpha \in \Sigma_E$, where $\rho$ was varied over $\{1/2, 1, 100\}$. The parameter $\rho$ must be at least $1/2$ in order not to violate metricity of the edit costs. Setting $\rho = 1$ means that deleting or inserting a node (edge) is as expensive as the most expensive relabelling operation on nodes (edges); $\rho = 100$ means that inserting and deleting is 100 times more expensive than the most expensive relabelling operation.
We used the experimental setup suggested in [9]: For both datasets, we randomly selected 100 model graphs and randomly constructed query groups $\mathcal{H}_i$, each of which contains 5 query graphs $H$ that satisfy the size constraint $5(i - 1) < |V^H| \leq 5i$. We ran each algorithm ALG for all pairs of model graphs and query graphs and averaged the observed runtime $t(\mathrm{ALG})$ and the metrics $\mathrm{err}_{UB}(\mathrm{ALG}) := \mathrm{UB}(\mathrm{ALG})/\mathrm{UB}^*$ and $\mathrm{err}_{LB}(\mathrm{ALG}) := \mathrm{LB}^*/\mathrm{LB}(\mathrm{ALG})$ over all 500 test runs associated with query group $\mathcal{H}_i$. $\mathrm{UB}(\mathrm{ALG})$ and $\mathrm{LB}(\mathrm{ALG})$ denote the values of the upper (lower) bound returned by ALG, and $\mathrm{UB}^*$ and $\mathrm{LB}^*$ denote the value of the tightest upper (lower) bound computed by any of the tested algorithms. The metrics $\mathrm{err}_{UB}(\mathrm{ALG})$ and $\mathrm{err}_{LB}(\mathrm{ALG})$ hence measure the accuracy of the bounds produced by ALG in comparison to the most accurate available bound. Values close to 1 indicate tight bounds, while values much larger than 1 indicate loose bounds.
Figure 2 (a) shows the average observed runtimes for both datasets and $\rho = 1$. LP is by far the slowest algorithm, while BP and BRANCH perform very similarly. On small graphs, NODE is the fastest algorithm, but it becomes slower than BRANCHFAST on large graphs. At first glance, this is surprising, as NODE runs in $O(n^3)$ and BRANCHFAST runs
in $O(n^4)$ time. It can, however, be explained: The Hungarian Algorithm, which both algorithms use as a subroutine, works better when the variance in the costs it is optimising for is high, and BRANCHFAST invests more effort in the computation of discriminative auxiliary edge costs.² The results for $\rho = 1/2$ and $\rho = 100$ are very similar.
² NODE defines its auxiliary costs as $\tilde{c}_5(i, k) := c_V(\ell_V^G(i), \ell_V^H(k))$ [3].
Figure 2 (b) shows the accuracy of the upper bounds provided by BRANCH, BRANCHFAST, NODE, and BP for $\rho = 1$. We can see that all produced upper bounds are similar. The results are again stable across variation of $\rho$.
Figure 2 (c) shows the accuracy of the lower bounds provided by BRANCH, BRANCHFAST, NODE, and LP. For both datasets and all values of $\rho$, NODE produces the loosest lower bound. For $\rho = 1$ we observe that, on AIDS, we have $\mathrm{err}_{LB}(\mathrm{BRANCH}) \approx \mathrm{err}_{LB}(\mathrm{BRANCHFAST}) > \mathrm{err}_{LB}(\mathrm{LP})$, while, on PROTEIN, $\mathrm{err}_{LB}(\mathrm{BRANCHFAST}) > \mathrm{err}_{LB}(\mathrm{LP}) > \mathrm{err}_{LB}(\mathrm{BRANCH})$ holds. If, on PROTEIN, $\rho$ is set to $1/2$, the accuracy advantage of BRANCH over LP and BRANCHFAST increases, while setting $\rho = 100$ makes the result look very similar to the one for $\rho = 1$ on AIDS. To explain these results, recall that: increasing $\rho$ amounts to increasing the importance of inserting and deleting relative to the importance of relabelling; BRANCH fully models relabelling on both nodes and edges; BRANCHFAST considers relabelling edges in a more superficial way; and LP completely ignores relabelling on edges. It is therefore not surprising that, as the importance of relabelling decreases with increasing $\rho$, the lower bound computed by BRANCH becomes very similar to the one produced by BRANCHFAST, and that LP benefits from its higher computational complexity and computes a lower bound that is tighter than the one returned by BRANCH.
Figure 2 (d) visualises the excellent runtime/accuracy tradeoffs of BRANCH and BRANCHFAST in a concise way. Averages are taken over all test runs and dotted lines demarcate the region that is dominated by BRANCH and BRANCHFAST. We see that, w.r.t. the obtained lower bounds, BRANCHFAST is Pareto optimal on all datasets. If relabelling is important (e.g., on PROTEIN with $\rho \in \{1/2, 1\}$), BRANCH is Pareto optimal, too, and dominates LP. On AIDS with $\rho = 1$, BRANCH and BRANCHFAST yield the same accuracy. Therefore, BRANCHFAST dominates BRANCH. LP is the only non-dominated competitor. LP is 25.34 times slower but only 1.01 times more accurate than BRANCHFAST. On PROTEIN with $\rho = 1$, both BRANCH and BRANCHFAST are Pareto optimal and NODE is the only non-dominated competitor. NODE is 5.17 times less accurate but only 1.04 times faster than BRANCHFAST and 5.64 times less accurate and 4.77 times faster than BRANCH.
Fig. 2. Outcomes of the experiments. (a) Averaged observed runtimes (avg. runtime t in seconds against avg. size of query graphs; AIDS and PROTEIN, $\rho = 1$). (b) Accuracy of upper bounds (avg. $\mathrm{err}_{UB}$ against avg. size of query graphs; AIDS and PROTEIN, $\rho = 1$). (c) Accuracy of lower bounds (avg. $\mathrm{err}_{LB}$ against avg. size of query graphs; AIDS with $\rho = 1$, PROTEIN with $\rho \in \{1/2, 1, 100\}$). (d) Pareto optimality of lower bounds (avg. $\mathrm{err}_{LB}$ against avg. runtime t in seconds; AIDS and PROTEIN, $\rho = 1$). Algorithms shown: NODE, BRANCH, BRANCHFAST, LP, BP.
REFERENCES
[1] V. Carletti, B. Gaüzère, L. Brun, and M. Vento, "Approximate Graph Edit Distance Computation Combining Bipartite Matching and Exact Neighborhood Substructure Distance," in GbRPR'15, 2015, pp. 188–197.
[2] B. Gaüzère, S. Bougleux, K. Riesen, and L. Brun, "Approximate Graph Edit Distance Guided by Bipartite Matching of Bags of Walks," in SSPR'14, 2014, pp. 73–82.
[3] D. Justice and A. Hero, "A Binary Linear Programming Formulation of the Graph Edit Distance," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 8, pp. 1200–1214, 2006.
[4] H. W. Kuhn, "The Hungarian Method for the Assignment Problem," Nav. Res. Logist. Q., vol. 2, no. 1-2, pp. 83–97, 1955.
[5] K. Riesen and H. Bunke, "IAM Graph Database Repository for Graph Based Pattern Recognition and Machine Learning," in SSPR'08, 2008, pp. 287–297.
[6] K. Riesen and H. Bunke, "Approximate Graph Edit Distance Computation by Means of Bipartite Graph Matching," Image Vis. Comput., vol. 27, no. 7, pp. 950–959, 2009.
[7] K. Riesen, M. Ferrer, A. Fischer, and H. Bunke, "Approximation of Graph Edit Distance in Quadratic Time," in GbRPR'15, 2015, pp. 3–12.
[8] K. Riesen, A. Fischer, and H. Bunke, "Computing Upper and Lower Bounds of Graph Edit Distance in Cubic Time," in ANNPR'14, 2014, pp. 129–140.
[9] Z. Zeng, A. K. H. Tung, J. Wang, J. Feng, and L. Zhou, "Comparing Stars: On Approximating Graph Edit Distance," PVLDB, vol. 2, no. 1, pp. 25–36, 2009.
[10] X. Zhao, C. Xiao, X. Lin, and W. Wang, "Efficient Graph Similarity Joins with Edit Distance Constraints," in ICDE'12, 2012, pp. 834–845.
[11] W. Zheng, L. Zou, X. Lian, D. Wang, and D. Zhao, "Graph Similarity Search with Edit Distance Constraint in Large Graph Databases," in CIKM'13, 2013, pp. 1595–1600.
[12] W. Zheng, L. Zou, X. Lian, D. Wang, and D. Zhao, "Efficient Graph Similarity Search Over Large Graph Databases," IEEE Trans. Knowl. Data Eng., vol. 27, no. 4, pp. 964–978, 2015.
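As a companion to Section III, the flattened edge relabelling costs and the identity used in the proof of Lemma 3 can be illustrated with a short sketch. It is not the authors' implementation: the function names, the example labels, the cost function, and the use of `None` for $\varepsilon$ are assumptions, and the sketch assumes that the minimum defining $c_E^{ik}$ ranges over all non-identical label pairs of the padded multisets, dummy labels included. The exact edge term of BRANCH is computed by brute force for comparison only.

```python
from collections import Counter
from itertools import permutations

EPS = None  # dummy label epsilon


def c_e(a, b):
    """Hypothetical metric edge cost: |a - b| for relabelling, 3 for insert/delete."""
    if a is EPS and b is EPS:
        return 0.0
    if a is EPS or b is EPS:
        return 3.0
    return abs(a - b)


def pad(labels_i, labels_k):
    """Fill up the smaller label multiset with dummy labels."""
    m = max(len(labels_i), len(labels_k))
    return (labels_i + [EPS] * (m - len(labels_i)),
            labels_k + [EPS] * (m - len(labels_k)))


def exact_edge_term(labels_i, labels_k):
    """Edge term of BRANCH: psi with the original costs c_E (brute force)."""
    a, b = pad(labels_i, labels_k)
    return min(sum(c_e(a[r], b[p[r]]) for r in range(len(a)))
               for p in permutations(range(len(b))))


def flattened_edge_term(labels_i, labels_k):
    """Edge term of BRANCHFAST: c_E^{ik} * Gamma, without solving an assignment problem."""
    a, b = pad(labels_i, labels_k)
    # Flattened cost: cheapest relabelling between non-identical labels (quadratic time).
    different = [c_e(x, y) for x in a for y in b if x != y]
    if not different:
        return 0.0
    c_ik = min(different)
    # Gamma: max degree minus size of the multiset intersection (linear time).
    gamma = len(a) - sum((Counter(a) & Counter(b)).values())
    return c_ik * gamma


labels_i = [1.0, 2.0, 5.0]   # edge labels incident with node i in G (assumed)
labels_k = [1.0, 4.0]        # edge labels incident with node k in H (assumed)
flat = flattened_edge_term(labels_i, labels_k)
exact = exact_edge_term(labels_i, labels_k)
```

On this example the flattened term is smaller than the exact term, which matches the inequality $\tilde{c}_4(i, k) \leq \tilde{c}_2(i, k)$ used in the proof of Lemma 2.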