Proceedings of "Algorithms and Experiments" (ALEX98), Trento, Italy, Feb 9-11, 1998

R. Battiti and A. A. Bertossi (Eds.), pp. 129-136

A Software Library of Dynamic Graph Algorithms¹

David Alberts, Inst. f. Informatik, MLU Halle, D-06099 Halle, Germany, e-mail: [email protected]
Giuseppe Cattaneo, Dipartimento di Informatica ed Applicazioni, Università di Salerno, Italy, e-mail: [email protected]
Giuseppe F. Italiano², Dipartimento di Matematica Applicata ed Informatica, Università "Ca' Foscari" di Venezia, Venice, Italy, e-mail: [email protected]
Umberto Nanni, Dipartimento di Informatica e Sistemistica, Università di Roma "La Sapienza", via Salaria 113, I-00198 Roma, Italy, e-mail: [email protected]
Christos D. Zaroliagis, Max-Planck-Institut für Informatik, Im Stadtwald, D-66123 Saarbrücken, Germany, e-mail: [email protected]

Abstract

We report on a software library of dynamic graph algorithms. It was written in C++ as an extension of LEDA, the library of efficient data types and algorithms. It contains implementations of simple as well as sophisticated data structures for dynamic connectivity, dynamic minimum spanning trees, dynamic single-source shortest paths, and dynamic transitive closure. All data structures are implemented by classes derived from a common base class, so they share a common interface. Additionally, the base class is in charge of keeping all dynamic data structures working on the same graph consistent. It is possible to change the structure of a graph through a procedure which is not aware of the dynamic data structures initialized for this graph. The library is easily extendible.

1 Introduction

Traditional graph algorithms operate on static graphs. A fixed graph is given, and an algorithmic problem (e.g., "Is the graph planar?") is solved on the graph.

¹ Supported in part by EU ESPRIT Long Term Research Project ALCOM-IT under contract no. 20244, and by the German-Italian Program "Vigoni 1997".
² Supported in part by the Italian MURST Project "Efficienza di Algoritmi e Progetto di Strutture Informative", and by a Research Grant from University "Ca' Foscari" of Venice. Part of this work was done while at the Max-Planck-Institut für Informatik, Im Stadtwald, 66123 Saarbrücken, Germany.

Dynamic graphs are not fixed in


time, but can evolve through local changes of the graph. The algorithmic problem has to be resolved quickly after each modification. Dynamic graphs model many graphs occurring in real-life applications much more closely, because no large system is truly static. A dynamic graph algorithm is a data structure operating on a graph that supports two types of operations: updates and queries. An update is a local change of the graph (e.g., insertion or deletion of an edge) and a query is a question about a certain property of the current graph (e.g., "Are nodes u and v connected in the current graph?"). The aim of such a data structure is to use structural information about the current graph in order to handle an update faster than by the obvious solution, that is, recomputing everything from scratch with a static algorithm. Usually, queries take less time than updates, and the sequence of operations (updates and queries) is not known in advance.

Since the input of a dynamic graph algorithm is more complicated than in the static case, and static graph algorithms for basic problems like connectivity or shortest paths are very efficient as a result of decades of research, dynamic graph algorithms sometimes have to be quite sophisticated to beat the static ones in theory. In practice, the actual running times depend on many parameters, including the size and type of the input graphs, the distribution of operations (e.g., many more queries than updates or vice versa) and sometimes even certain patterns in the update sequence (e.g., alternating insertions and deletions of edges, in contrast to a subsequence of consecutive edge insertions followed by a subsequence of consecutive edge deletions). Consequently, in order to choose the right data structure for a certain application, experiments with different data structures are usually unavoidable.
In the best case, these experiments give some problem-specific insight which may lead to improved algorithms and better implementations. We provide a library of dynamic data structures which allows experimental comparison of different approaches with respect to inputs with specific properties. Moreover, our library is easily adaptable and extendible. The library is written in C++ as an extension of LEDA, the popular Library of Efficient Data Types and Algorithms [19]. Our library is available for non-commercial use from the LEDA web site at http://www.mpi-sb.mpg.de/LEDA/leda.html.

2 The Algorithms

2.1 Dynamic Connectivity

A dynamic connectivity data structure usually supports queries of the type "Are vertices u and v connected in the current graph?", as well as edge insertions and deletions. The currently best theoretical bounds are achieved by a data structure invented by Henzinger and King [13] and improved by Henzinger and Thorup [14]. It achieves O(log n) worst-case deterministic query time and O(log^2 n) amortized expected update time for edge updates, provided that there are at least as many updates as edges in the initial graph. We implemented the algorithm by Henzinger and King, currently without the improvement by Henzinger and Thorup. It achieves O(log^3 n) amortized expected time per update and the same query time as stated above. Initialization and theoretically unsupported updates take O((m + n log n) log n) time, which is again off by a (log n) factor from the best theoretical bound for initialization.

For comparison, we implemented two simple approaches to dynamic connectivity. One of them is the ad-hoc algorithm of doing a BFS for answering queries and ignoring updates. It achieves O(1) worst-case deterministic time for any kind of graph update and for initialization, and O(n' + m') worst-case deterministic time for a query, where n' and m' are the number of nodes and edges in the affected component(s), respectively. The second simple approach uses component labels at the vertices to answer queries in worst-case deterministic constant time, together with a spanning forest represented by edge labels; it recomputes the labels when necessary (insertion of a bridge, deletion of a forest edge, deletion of a node) using BFS. This leads to a worst-case deterministic update time of O(n' + m') as above. Initialization takes


worst-case linear time.
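The first simple approach (answer queries by BFS, ignore updates) can be sketched as follows. This is an illustrative reimplementation with hypothetical names, not the library's actual class; edge deletion is omitted for brevity (it would just edit the adjacency lists).

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Ad-hoc dynamic connectivity: updates only edit the adjacency lists
// in O(1); a query runs a BFS over the affected component, i.e.
// O(n' + m') worst-case time.
struct AdHocConnectivity {
    std::vector<std::vector<int>> adj;
    explicit AdHocConnectivity(int n) : adj(n) {}

    void insert_edge(int u, int v) {   // O(1) update
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    bool connected(int s, int t) {     // O(n' + m') query by BFS
        std::vector<bool> seen(adj.size(), false);
        std::queue<int> q;
        q.push(s);
        seen[s] = true;
        while (!q.empty()) {
            int u = q.front(); q.pop();
            if (u == t) return true;
            for (int w : adj[u])
                if (!seen[w]) { seen[w] = true; q.push(w); }
        }
        return false;
    }
};
```

The asymmetry between O(1) updates and linear-time queries is exactly what makes this approach competitive when queries are rare.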

2.2 Dynamic Minimum Spanning Trees

The dynamic minimum spanning tree (MST) problem consists of maintaining a minimum spanning tree of a connected graph subject to edge insertions and edge deletions. In this library, we included three different data structures for this problem: sparsification [5], Frederickson's clustering algorithms [8, 9], and a simple dynamic algorithm which we called adhoc. We have already performed an extensive empirical study on the performance of these data structures, and we refer the interested reader to reference [2] for the details. All our algorithms support adding and deleting edges, MST membership queries for edges, and a query returning the current MST. To improve the flexibility and reusability of our code, we chose to maintain the edge costs as an external data structure. This data structure can either be provided by the user at initialization time, or be allocated by our class and initialized via a method void set_edge_weight() (which is also used whenever a new edge is created).

The adhoc algorithm is encapsulated in the class mst_sd. It is a suitable combination of a partially dynamic data structure (based upon the dynamic trees of Sleator and Tarjan [23]) and the LEDA Min_Spanning_Tree algorithm, which is basically an experimental tuning of Kruskal's algorithm [17]. Let G = (V, E) be a graph with m edges and n vertices. First, adhoc keeps the minimum spanning tree T of the graph G as a dynamic tree [23]. Second, it stores all the edges of G in a search tree D according to their cost. When a new edge e is inserted into G, adhoc updates T and D in time O(log n). When an edge e is deleted from G, it is first deleted from D in O(log n) time. If e was a non-tree edge (this can be checked quickly by properly marking tree edges), adhoc does nothing else. Otherwise, adhoc calls Min_Spanning_Tree on the sorted edge set obtained from D. Consequently, adhoc requires O(log n) time plus the running time of Min_Spanning_Tree in case of a tree edge deletion.
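The deletion strategy of adhoc can be sketched as follows. This is an illustrative simplification: the O(log n) dynamic-tree bookkeeping and LEDA's Min_Spanning_Tree are both replaced here by plain STL code (a multiset as the sorted edge set D, and a Kruskal pass over it), and insertions also trigger a full recomputation.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <numeric>
#include <set>
#include <tuple>
#include <utility>
#include <vector>

// Sketch of the adhoc strategy: keep all edges sorted by cost (the
// search tree D); on deletion of a tree edge, recompute the MST by a
// Kruskal pass over the already-sorted edges.
struct AdHocMST {
    int n;
    std::multiset<std::tuple<double, int, int>> D;  // edges sorted by cost
    std::set<std::pair<int, int>> tree;             // current MST edges

    explicit AdHocMST(int n) : n(n) {}

    void insert_edge(int u, int v, double c) {
        D.insert({c, u, v});
        recompute();   // the real algorithm instead updates T in O(log n)
    }

    void delete_edge(int u, int v, double c) {
        D.erase(D.find({c, u, v}));
        if (tree.count({std::min(u, v), std::max(u, v)}))
            recompute();   // tree edge: Kruskal pass over D
        // non-tree edge: nothing else to do
    }

    bool in_mst(int u, int v) const {
        return tree.count({std::min(u, v), std::max(u, v)}) > 0;
    }

private:
    void recompute() {   // Kruskal over the sorted edge set D
        std::vector<int> parent(n);
        std::iota(parent.begin(), parent.end(), 0);
        std::function<int(int)> find = [&](int x) {
            return parent[x] == x ? x : parent[x] = find(parent[x]);
        };
        tree.clear();
        for (const auto& [c, u, v] : D) {
            int ru = find(u), rv = find(v);
            if (ru != rv) {
                parent[ru] = rv;
                tree.insert({std::min(u, v), std::max(u, v)});
            }
        }
    }
};
```

If the graph becomes disconnected, the Kruskal pass yields a minimum spanning forest; the paper assumes a connected graph.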
One would expect that adhoc has very low implementation constants. This was exactly confirmed by our experiments.

Let G = (V, E) be a graph with minimum spanning tree T. The first ingredient of the algorithms of Frederickson [8, 9] is clustering. Frederickson gives three different algorithms for maintaining the minimum spanning tree of a graph. Algorithm FredI is based on clustering only and obtains a time bound of O(m^(2/3)) per update. If the partition is applied recursively and a topology tree is associated with each cluster, we get FredII, which yields a better O((m log m)^(1/2)) time bound per update. Finally, algorithm FredIII uses 2-dimensional topology trees to achieve a time bound of O(m^(1/2)) per update. All these algorithms are quite complicated, and we were the first to be surprised by the fact that they still show some practical value. We refer the reader to [8, 9] for all the details of these algorithms, and to [2] for their implementation and experimental analysis. In this release of the library, we provide the implementation which was the fastest in the experiments. We mainly use it as an underlying algorithm for sparsification, the technique we introduce next.

Sparsification [5] is a general technique that applies to a wide variety of dynamic graph problems. It is used on top of a given algorithm in order to speed it up, and can be used as a black box. Let A be an algorithm that solves a certain problem in time f(n, m) on a graph with n vertices and m edges. There are two versions of sparsification: simple sparsification and improved sparsification [5]. When simple sparsification is applied to A, it produces a better bound of O(f(n, O(n)) log(m/n)) for the same problem. Improved sparsification uses a more sophisticated graph decomposition to eliminate the logarithmic factor, thus yielding an O(f(n, O(n))) time bound. We refer the reader to [5] for the details of the method.
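The origin of the simple sparsification bound can be seen from the following sketch (our summary of the scheme in [5], not part of the original text): partition the m edges into O(m/n) groups of O(n) edges each, place the groups at the leaves of a balanced binary tree, and keep at every internal node a sparse certificate, a subgraph with O(n) edges, for the union of its children's certificates. An edge update changes one leaf and forces recomputation of the certificates along the leaf-to-root path:

```latex
T_{\mathrm{update}}
  \;=\;
  \underbrace{O\!\left(\log \tfrac{m}{n}\right)}_{\text{path length}}
  \cdot
  \underbrace{f(n,\, O(n))}_{\text{cost per certificate}}
  \;=\;
  O\!\left(f(n,\, O(n)) \log \tfrac{m}{n}\right).
```

Each certificate recomputation runs A on a graph with only O(n) edges, which is where the speedup over f(n, m) comes from.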

2.3 Dynamic Transitive Closure

A data structure for dynamic transitive closure supports reachability queries ("Is there a directed path between two given vertices in a digraph G = (V, E)?"), and update operations, usually edge insertions and deletions. The graph G* = (V, E*) that has the same vertex set as G but has


an edge (u, v) ∈ E* iff there is a u-v path in G is called the transitive closure of G; G* need not be explicitly stored. Let m* denote |E*|. We have implemented three algorithms for dynamic transitive closure in this release of the library, namely Italiano's algorithm [15, 16], Yellin's algorithm [25], and the algorithm of Cicerone et al. [3] (which is a generalization of another algorithm proposed by La Poutré and van Leeuwen [18]). All of these algorithms are partially dynamic: the incremental versions apply to any digraph, while the decremental ones apply only to directed acyclic graphs (DAGs). Query operations come in two forms: a Boolean-path query (which simply returns yes or no), and a find-path query (which returns an actual path, if one exists). Italiano's and Yellin's data structures support both types: a Boolean-path query in O(1) time and a find-path query in O(ℓ) time, where ℓ is the number of edges of the reported path. The data structure of Cicerone et al. [3] supports only Boolean-path queries in O(1) time. The algorithms in [3, 15, 16] require O(n^2) space.

In the following, let G_0 denote the initial digraph having n vertices and m_0 edges. Italiano's algorithm is initialized in O(n^2 + n m_0) time, while the algorithm of Cicerone et al. is initialized in O(n^2) time. The Boolean-path version of Yellin's algorithm is initialized in O(n^2) time and space; the find-path version requires O(n^2 + d m_0) time and space, where d is the maximum outdegree of G_0. The bounds for update operations given below hold regardless of the type of query operation used. The incremental part in [3, 15, 16] requires O(n(m + m_0)) time to process a sequence of m edge insertions, while the decremental one requires O(n m_0) time to process any number m of edge deletions; hence, if m = Ω(m_0), then an update operation takes O(n) amortized time.
The incremental version of Yellin's algorithm requires O(d(m_0 + m)) time to process a sequence of m edge insertions starting from G_0 and resulting in a digraph G, where d is the maximum outdegree of G. The decremental version requires O(d m_0) time to process any sequence of edge deletions, where d is the maximum outdegree of G_0.

We have augmented the implementation of the above algorithms with additional procedures that help us verify the correctness of the implementation easily. For example, one version of the findpath(u, v) operation has been implemented to either return the u-v path (if it exists), or exhibit a cut in the graph separating the vertices reachable from u from the rest of the graph. In the former case, the correctness check is trivial. In the latter case, the heads of all edges in the cut should belong to the part containing u.

For comparison, we have also implemented simple heuristic algorithms based on the following idea: when an update occurs, nothing is computed. When a query is issued, a search procedure (e.g., DFS, BFS) from the source vertex is performed until the target vertex is encountered. Hence, updates take O(1) time and queries O(n + m_c) time, where m_c denotes the current number of edges in the graph.
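The simple heuristic just described can be sketched as follows. This is an illustrative fragment with hypothetical names, not one of the library's classes; edge deletion is omitted for brevity.

```cpp
#include <cassert>
#include <vector>

// Lazy reachability heuristic: updates only edit the digraph in O(1);
// a query runs a DFS from the source until the target is found, i.e.
// O(n + m_c) worst-case time on a graph with m_c current edges.
struct LazyReachability {
    std::vector<std::vector<int>> adj;   // out-neighbours
    explicit LazyReachability(int n) : adj(n) {}

    void insert_edge(int u, int v) { adj[u].push_back(v); }   // O(1)

    bool reaches(int s, int t) {         // query by DFS
        std::vector<bool> seen(adj.size(), false);
        return dfs(s, t, seen);
    }

private:
    bool dfs(int u, int t, std::vector<bool>& seen) {
        if (u == t) return true;
        seen[u] = true;
        for (int w : adj[u])
            if (!seen[w] && dfs(w, t, seen)) return true;
        return false;
    }
};
```

The search stops as soon as the target is encountered, so queries on nearby vertex pairs can be much cheaper than the worst-case bound suggests.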

2.4 Dynamic Single-Source Shortest-Paths

Let G be a weighted directed graph, and s a fixed source vertex in G. A dynamic data structure for the single-source shortest-path problem supports update operations (usually edge insertions and deletions, and weight modifications), and queries of two possible kinds: distance queries ("What is the distance of vertex x from the source?"), and path queries ("What is a shortest path from vertex x to the source?"). A linear-time off-line algorithm for the undirected version of this problem was recently proposed by Thorup [24]: the distances from a single source can be computed in O(n + m) time in a graph with n vertices and m edges. A well-known off-line solution for general digraphs is the implementation of Dijkstra's algorithm [4] with Fibonacci heaps [10], with a time bound of O(m + n log n). Further solutions have been proposed using sophisticated data structures for priority queues (recent results are provided, e.g., in [22]).

We have implemented two dynamic algorithms for the fully-dynamic single-source shortest-path problem in digraphs with positive edge weights: the one proposed by Ramalingam and Reps in [21] (denoted RR) and the one proposed by Frigioni, Marchetti and Nanni in [7] (FMN). These algorithms support fully dynamic updates of the edges (i.e., alternated insertions and deletions as well as


arbitrary weight updates) on general digraphs, while maintaining an explicit solution of the problem; distance queries are answered in constant time, while path queries require O(l) time, if the reported shortest path has l edges. The complexity of these algorithms is expressed in terms of output complexity. Namely, a node x is affected by an edge operation if its distance from the root has to be updated. If an edge operation requires updating a set δ of affected nodes, RR can handle any edge operation (insertion, deletion, weight increase or decrease) in O(||δ||) worst-case time, where ||δ|| denotes the size of the set δ plus the number of edges with at least one affected endpoint. This algorithm uses a strategy based on Dijkstra's algorithm: a priority queue is used, and nodes are dequeued in nondecreasing order of distance from the source. The basic idea is to enqueue only the affected vertices plus their neighbors. While performing edge updates on a graph G, the RR algorithm maintains the subset of edges of G that belong to at least one shortest path from s to the other vertices in G. The resulting DAG rooted at s is used when path queries have to be answered.

Algorithm FMN is based on a similar strategy, but an additional data structure is maintained at each node x in order to "guess" which neighbors of x are to be updated when the distance from x to the source is updated. Using the terminology above, the resulting time bound for an edge operation is O(k · |δ|), that is, the number |δ| of affected vertices times a parameter k depending on the kind of graph considered. Namely, for a given graph G, k is bounded by the minimum outdegree over all possible orientations of the edges of G; for example, k = 3 for planar graphs, k = O(t) for a graph with treewidth t, and k = O(√m) for a general digraph.
The time bound to handle edge operations is worst case for a graph with a static topology (i.e., if only weight updates, and edge deletions and reinsertions are allowed), and amortized otherwise. In order to compare the performance of RR and FMN with a static counterpart, we have also implemented a class based on the static algorithm currently implemented in LEDA, that is, Dijkstra's algorithm with Fibonacci heaps. The experiments, whose details are provided in [6], show that both on random graphs and on real-world graphs the dynamic algorithms spend less than 5% of the time required by the static one on updates, and are even better in the case of dense graphs; the number of edges scanned is usually below 0.5%. The two dynamic algorithms provided in the library have incomparable performance: RR is faster on random instances, due to its simpler data structures, but FMN always scans a subset of the edges scanned by RR, which in special cases makes it faster, even by a factor proportional to n, as theoretically stated.
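The Dijkstra-like update strategy described above can be sketched, for the case of an edge insertion (equivalently, a weight decrease), as follows. This is an illustrative fragment with hypothetical names, not the library's RR class: only vertices whose distance actually improves, together with their out-neighbours, ever enter the priority queue.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Incremental single-source shortest paths: on inserting an edge,
// propagate distance improvements with a Dijkstra-like scan that
// touches only affected vertices and their neighbours.
struct IncSSSP {
    struct Arc { int to; double w; };
    std::vector<std::vector<Arc>> adj;
    std::vector<double> dist;   // explicit solution, maintained on update

    IncSSSP(int n, int s)
        : adj(n), dist(n, std::numeric_limits<double>::infinity()) {
        dist[s] = 0.0;
    }

    // Insert edge (u, v) with positive weight w and repair distances.
    void insert_edge(int u, int v, double w) {
        adj[u].push_back({v, w});
        if (dist[u] + w >= dist[v]) return;   // v is not affected
        using Item = std::pair<double, int>;  // (distance, vertex)
        std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
        dist[v] = dist[u] + w;
        pq.push({dist[v], v});
        while (!pq.empty()) {                 // nondecreasing distance order
            auto [d, x] = pq.top(); pq.pop();
            if (d > dist[x]) continue;        // stale queue entry
            for (const Arc& a : adj[x])
                if (dist[x] + a.w < dist[a.to]) {
                    dist[a.to] = dist[x] + a.w;
                    pq.push({dist[a.to], a.to});
                }
        }
    }
};
```

Handling deletions and weight increases is the harder half of RR (affected vertices must first be identified and their distances invalidated) and is not shown here.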

3 The Design and Implementation

All data structures are implemented as C++ classes derived from a common base class dga_base. This base class defines a common interface. Of course we try to achieve the usual software engineering goals such as efficiency, ease of use, and extendibility. Here we would like to focus on some domain-specific design issues. We found two main problems which arise in implementing a library of dynamic graph algorithms.

Missing Update Operations. The algorithms usually support only a subset of all possible update operations; e.g., most dynamic graph algorithms cannot handle single node deletions and insertions.

Maintaining Consistency. In an application, a dynamic graph algorithm D might run in the background while the graph changes due to a procedure P which is not aware of D. Then there has to be a means of keeping D consistent with the current graph, because P will not use a possible interface for changing the graph provided by D, but will use the graph directly. Whether D exists or not should have no impact on P.

We decided to support all update operations for convenience. Those updates which are not supported by the theoretical background are implemented by reinitializing the data structure for the new graph. This is not very efficient, but it is better than exiting the whole application. The documentation tells the users which updates are supported efficiently. A user calling an update which is theoretically not supported incurs only a (perhaps even negligible) performance penalty. This enhances the robustness of the applications using the library, or alternatively reduces the complexity of handling exceptional situations.

In order to maintain consistency between a graph and a dynamic data structure D working on that graph, one could of course derive D from the graph class. However, this is not very flexible. If more than one dynamic graph data structure works on the same graph, things would get quite complicated with this approach. We use the following approach instead, an application of the observer design pattern of Gamma et al. [11]. We create a new graph type msg_graph which sends messages to interested third parties whenever an update occurs. The base class dga_base of all dynamic graph algorithms is one such third party: it receives these messages and calls the appropriate update operations, which are virtual methods appropriately redefined by the specific implementations of dynamic graph algorithms.
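The observer scheme can be sketched as follows. The names below are illustrative, not the actual msg_graph / dga_base interfaces, and only one update message (edge creation) is shown.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Observer interface: anything that wants to be told about graph
// updates implements this (in the library, the base class of all
// dynamic graph algorithms plays this role).
class GraphObserver {
public:
    virtual ~GraphObserver() {}
    virtual void on_new_edge(int u, int v) = 0;
};

// The graph broadcasts every update to all registered observers, so
// code that edits the graph directly never has to know which dynamic
// data structures are attached to it.
class MsgGraph {
    std::vector<GraphObserver*> observers;
    std::vector<std::pair<int, int>> edges;
public:
    void attach(GraphObserver* o) { observers.push_back(o); }
    void new_edge(int u, int v) {          // update + broadcast
        edges.push_back({u, v});
        for (GraphObserver* o : observers) o->on_new_edge(u, v);
    }
};

// A trivial observer that just counts the updates it was notified of.
class EdgeCounter : public GraphObserver {
public:
    int count = 0;
    void on_new_edge(int, int) override { ++count; }
};
```

In the library the virtual update methods are then overridden by each specific dynamic graph algorithm, which keeps its internal state consistent on every message.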

4 An Example

The library is used by including the desired header files into the application and linking it with the object code library file libdynamic_graphs.a. A simple example program is given below. Note that changes of the graph can be performed solely on the graph itself, without explicitly updating the dynamic data structure. The dyncon.nodes_yes() call returns a predefined string which interprets the positive result of a query with two nodes returning true (in this case "These nodes are connected.").

    #include

    main() {
        // initialize the graph, in this example to a path
        msg_graph G;            // the graph we are working on
        node nodes[10];
        edge edges[9];
        int i;
        nodes[0] = G.new_node();
        for (i = 1; i < 10; i++) {
            nodes[i] = G.new_node();
            edges[i-1] = G.new_edge(nodes[i-1], nodes[i]);
        }

        // instantiate a dynamic connectivity data structure for G
        dc_simple dyncon(&G);

        // do something with the graph
        G.del_edge(edges[4]);
        G.del_edge(edges[5]);

        // perform a connectivity query
        if (dyncon.query(nodes[0], nodes[9]))
            cout << dyncon.nodes_yes() << endl;
    }
