Soft Comput (2015) 19:2391–2402 DOI 10.1007/s00500-014-1434-2
METHODOLOGIES AND APPLICATION
Computing k shortest paths from a source node to each other node Guisong Liu · Zhao Qiu · Hong Qu · Luping Ji · Alexander Takacs
Published online: 21 August 2014 © Springer-Verlag Berlin Heidelberg 2014
Abstract The single-pair K shortest path (KSP) problem can be described as finding k least cost paths through a graph between two given nodes in a non-decreasing order, while single-source KSP algorithms aim to find KSPs from a given node to each other node. However, little effort has been devoted to the single-source KSP approaches. This paper proposes a novel single-source KSP algorithm in a given directed weighted graph where loops are allowed. The proposed method is designed to compute a set of shortest paths with exactly k distinctive lengths in a non-decreasing order. Meanwhile, it can also find all shortest paths with the length less than a given threshold. Inspired by water flowing principle, we imagine that there are waters flowing from a source node to each other node along edges at a constant speed. When the water reaches a node, the node will generate new waters flowing along its outgoing edges. By stepping back the traces of the water, the ordered shortest paths can be obtained. We also address the correctness and effectiveness of the method. Simulations are carried out using synthetic data and practical graph data, which demonstrate the considerable performance of the proposed approach especially for single-source KSP problems.
Communicated by V. Loia. G. Liu (B) · Z. Qiu · H. Qu · L. Ji Computational Intelligence Laboratory, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, People’s Republic of China e-mail:
[email protected] A. Takacs School of Computer Science and Communication, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden
Keywords K shortest path problem · Single-pair KSP · Single-source KSP
1 Introduction
The K-shortest-path (KSP) problem is about finding the k shortest paths in a directed weighted graph in a non-decreasing order. More formally, for a given pair of nodes s and t in a given directed weighted graph G = (V, E), the task is to enumerate the paths from s to t in a non-decreasing order with respect to their lengths. Recent application domains for KSP problems include multiple object tracking (Berclaz et al. 2011), sequence alignment (Ozer et al. 2010), gene networks (Shih and Parthasarathy 2012), scheduling (Xu et al. 2012), dynamic routing (Wan et al. 2012) and many other areas in which optimization problems need to be solved (Eppstein 1988). Since it was first proposed by Hoffman and Pavley (1959), the KSP problem has received much attention, and many approaches to solving it have been published. In the literature, one variant of the KSP problem is to find k shortest paths without repeated nodes in any solution path (Shih and Parthasarathy 2012; Hu et al. 2012; Sedeno-Noda 2012; Sedeno-Noda and Espino-Martin 2013; Hershberger et al. 2007; Martins and Pascoal 2003; Yen 1971, 1972; Gotthilf and Lewenstein 2009; Gao et al. 2010). Without this restriction, the solution paths are not required to be simple, which means that loops are allowed in the paths (Eppstein 1988; Aljazzar and Leue 2011, 2008; Yang and Chen 2005; Jimenez and Marzal 1999, 2003). From another perspective, the goal of most of these algorithms is to compute KSPs between two given nodes, which can also be called the single-pair KSP problem (Shih and Parthasarathy 2012; Hu et al. 2012; Sedeno-Noda 2012; Sedeno-Noda and Espino-Martin 2013;
Hershberger et al. 2007; Aljazzar and Leue 2011, 2008; Yang and Chen 2005; Jimenez and Marzal 1999, 2003; Martins and Pascoal 2003; Yen 1971, 1972; Gotthilf and Lewenstein 2009; Gao et al. 2010). The related problem is the single-source KSP problem, which aims to find KSPs from a given node to each other node (Shih and Parthasarathy 2012; Eppstein 1988). From the literature we can easily conclude that much more effort has been devoted to the single-pair KSP problem than to the single-source problem (Gao et al. 2010). Meanwhile, to the best of our knowledge, no parallel method has been proposed for solving the KSP problem. In this paper, we consider the single-source KSP problem with loops allowed in the solution paths. The idea of our algorithm is inspired by the flow of water. Imagine water flowing from the source node along the edges of the graph: when water arrives at a node, it keeps flowing from this node along each outgoing edge of that node. When the water reaches a node, there is a path from the source node to that node; when a node has been reached by waters k times, there must exist k paths from the source node to this node. Therefore, by stepping back the traces of the waters, the ordered shortest paths can be obtained. We call the proposed algorithm the Water Spreading-based Algorithm (WSA). The proposed method has two features: first, it computes a set of shortest paths with k distinct path lengths in a non-decreasing order; second, it computes all shortest paths whose lengths are less than a threshold given in advance. To the best of our knowledge, no published method has tackled exactly this problem. To show the efficiency of our method, we compare it with the traditional methods that focus on computing KSPs (which may contain duplicate path lengths) and allow loops in solution paths. For solving the single-source KSP problem, Shih and Parthasarathy (2012) proposed a heuristic algorithm to address the pathway inference problem in gene networks; Hu et al. 
(2012) presented an algorithm imitating the natural phenomenon of ripples on the surface of water. However, these two approaches only consider simple paths, where loops are not allowed in solution paths. The famous EA algorithm proposed by Eppstein (1988) can be used to solve both single-pair KSP and single-source KSP, with time complexities of O(m + n log n + k) and O(m + n log n + kn), respectively, where n, m and k are the number of nodes, the number of edges and a predefined number of paths. EA first runs Dijkstra's algorithm to compute a shortest path tree, then creates two kinds of heaps for each node, which are used to construct the path graph, and finally searches the path graph to obtain the KSPs. Moreover, an additional O(log k) time per path is required for EA when the paths are listed in order of their lengths. Jimenez and Marzal (2003) presented a further optimization of the EA algorithm, the Lazy Variant of Eppstein's Algorithm (LVEA), which maintains the same asymptotic worst-case complexity as EA but improves its practical performance. To the best of our knowledge, among the state-of-the-art single-pair algorithms, K* (Aljazzar and Leue 2011) is the most efficient single-pair KSP algorithm according to its authors' experiments. It is devised as a heuristics-guided, on-the-fly algorithm (meaning that the full problem graph does not need to be present in memory; nodes are generated as needed). K* is particularly well suited for very large scale single-pair KSP problems. It is clear that K* is an optimization of EA and can also be applied to solve single-source KSPs. Our simulations on both kinds of KSP problems show that WSA outperforms K* for single-source KSP, while K* is more competitive than WSA for the single-pair KSP problem. The rest of the paper is organized as follows. We introduce some preliminaries in Sect. 2. Section 3 presents the proposed WSA algorithm and its analysis, including an example to illustrate WSA. Simulations and conclusions are given in Sects. 4 and 5, respectively.
2 Preliminaries
Let G = (V, E) be a directed weighted graph, where V is the set of nodes and E ⊆ V × V is the set of edges. Given an edge e = (u, v) ∈ E, we use tail(e) for u and head(e) for v. Let w: E → R≥0 be a function mapping edges to non-negative real-valued weights or lengths. Let s, t ∈ V denote the source and target nodes, respectively. A path on G is denoted by P; without loss of generality, we let Pn be the nth shortest s-t path in G. The length of a path P = v1 → v2 → · · · → vn is defined as the sum of its edge lengths, formally,
$$ l(P) = \sum_{i=1}^{n-1} w(v_i, v_{i+1}). \tag{1} $$
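Equation (1) can be illustrated with a few lines of Python. This is a minimal sketch; the function name path_length, the dictionary layout of the weights and the example weights are assumptions made only for illustration. Because loops are allowed, a node may appear more than once in a path.

```python
def path_length(weights, path):
    """Sum the edge weights w(v_i, v_{i+1}) over consecutive nodes of `path`."""
    return sum(weights[(u, v)] for u, v in zip(path, path[1:]))


# Hypothetical weights on a small directed graph containing the cycle a -> b -> a.
weights = {("a", "b"): 2.0, ("b", "a"): 1.0, ("b", "c"): 4.0}

simple = path_length(weights, ["a", "b", "c"])            # 2 + 4
looped = path_length(weights, ["a", "b", "a", "b", "c"])  # 2 + 1 + 2 + 4
```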
For clarity, we define the following concepts and symbols:
(1) Split: an action of a node. A node may perform many Split actions, so each node keeps a split list to record these Splits. A Water that reaches a node triggers a Split, which generates new Waters flowing along the outgoing edges of that node. Each Split has three attributes:
– parentNode: the node that generated the Water which triggered this Split.
– parentSplitIndex: the index, in the split list of parentNode, of the Split that generated this Water.
– splitTime: the time when the Split action is triggered.
Every Split action is recorded by the corresponding node.
(2) Water: a Water flows along edges with a constant speed (unit speed by default). Each Water has four attributes:
Fig. 1 A simple graph to illustrate the definitions
– bornNode: the node that generates the current Water.
– dieNode: the node where the current Water will vanish.
– parentSplitIndex: the index indicating which Split generated the current Water.
– reachTime: the time when the Water arrives at its target node.
Figure 1 explains the above definitions. Waters and Splits are denoted by the two formats w(bornNode, dieNode, parentSplitIndex, reachTime) and s(parentNode, parentSplitIndex, splitTime), respectively. The Waters are shown above the edges, and the array under node b is the Split recorded by node b. We set the water flow speed to one unit. At time 20, there is a Water w(a, b, 1, 20) flowing from node a along edge (a, b). At time 22, the water reaches node b, and b splits; this split is subsequently recorded by node b. The parentNode and parentSplitIndex of the split are a and 1 (which are also the bornNode and parentSplitIndex of w(a, b, 1, 20)). This Split generates three Waters. One is for edge (b, d), i.e., w(b, d, 2, 22). Its parentSplitIndex is 2, which is the position of the Split that generated this water in the split record list of node b.
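The two records defined above map naturally onto small data classes. The following is a minimal Python sketch, not the authors' data layout: the class and field names mirror the attributes listed in this section, while the helper split_node, the node names x, y, z and the edge length 3 are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Split:
    parent_node: Optional[str]         # parentNode (None for the very first split)
    parent_split_index: Optional[int]  # parentSplitIndex
    split_time: float                  # splitTime


@dataclass(frozen=True)
class Water:
    born_node: str           # bornNode
    die_node: str            # dieNode
    parent_split_index: int  # index of the generating Split in born_node's split list
    reach_time: float        # reachTime


def split_node(node, water, split_list, out_edges, speed=1.0):
    """Record the Split triggered by `water` at `node` and return the new Waters."""
    split_list.append(Split(water.born_node, water.parent_split_index, water.reach_time))
    index = len(split_list) - 1  # position of this Split in the node's split list
    return [Water(node, head, index, water.reach_time + length / speed)
            for head, length in out_edges]


# Hypothetical example: a water born at x (from x's split 0) reaches y at time 4,
# and y has a single outgoing edge (y, z) of length 3.
y_splits = []
new_waters = split_node("y", Water("x", "y", 0, 4.0), y_splits, [("z", 3.0)])
```

In this sketch the new water records the index of the Split that created it, which is what later allows a path to be traced back through the split lists.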
3 The WSA algorithm
3.1 Basic idea
Let G be the problem graph and s the source node. First, s splits and immediately generates new waters for all of its outgoing edges; each water moves along its edge (from tail to head) with a constant speed (the same for all waters). A water keeps moving until it encounters a graph node and then vanishes. Next, the encountered node splits and generates new waters for its outgoing edges. In this way, the waters keep flowing through the whole graph, just like water flowing along canals. We record each split at its corresponding node so that the trace of each water can be stepped back. Clearly, the first time a node splits, the shortest path from s to this node has been found. Similarly, if a node has split k times, we can obtain
the k shortest paths from s to this node. As we are interested in finding k shortest paths, each node is required to split k times. In this way, we can find the k shortest paths from s to each other node. The paths can be obtained by tracing the tracks of the waters, which will be described in the next subsection. We use an example to illustrate how the above idea works. As shown in Fig. 2, we are interested in computing two shortest paths from s0 to each other node. At the beginning, t = 0, s0 splits and creates two waters. At time 1.4, water w(s0, s1, 0, 1.4) reaches s1 and then vanishes; s1 splits and creates two waters flowing to s2 and s3, respectively. At time 2.3, w(s1, s2, 0, 2.3) reaches s2; s2 splits and creates two waters flowing to s1 and s3. At time 2.4, w(s1, s3, 0, 1.4) reaches s3 and s3 splits but creates no water because s3 has no outgoing edge. At time 2.6, w(s2, s1, 0, 2.3) arrives at s1 and then vanishes; s1 splits and creates two waters. Notice that s1 has now split two times and will not split anymore. At time 3.0, w(s0, s2, 0, 0) reaches s2; s2 splits and creates two waters flowing to s1 and s3. The water flow situation at time 3.5 is the final split result of the example. Since all nodes except the source node have split two times, we can obtain two shortest paths from s0 to each other node by tracing the tracks of the waters. From the above description, the basic process of our algorithm can be summarized as follows: a water w arrives at its dieNode v; then w vanishes and v splits; then we record this split at v, and node v creates a new water for each outgoing edge e of v.
3.2 Algorithm implementation
In this subsection, we give a non-parallel implementation of the WSA algorithm. We first create two data structures named Split and Water. The members of both structures are listed in Sect. 2. The waterList is maintained with its water elements ordered in a non-decreasing way according to reachTime.
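The basic process summarized in Sect. 3.1 (a water arrives, vanishes, its dieNode splits, and new waters are created) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name wsa, the tuple layouts, the use of heapq for waterList, and the rule that an arrival with the same time as the node's previous split is dropped (so that each recorded split has a distinct length, keeping a single parent) are illustrative choices. The example graph is also an assumption; its edge weights are inferred from the reach times listed in Table 1 for the graph of Fig. 5.

```python
import heapq
from collections import defaultdict


def wsa(graph, source, k):
    """Return {node: [(parentNode, parentSplitIndex, splitTime), ...]}.

    `graph` maps each node to a list of (head, weight) pairs; water speed is 1.
    """
    splits = defaultdict(list)
    splits[source].append((None, None, 0))  # first split of the whole network
    water_list = []  # min-heap of (reachTime, seq, bornNode, dieNode, parentSplitIndex)
    seq = 0          # insertion counter: breaks ties in first-in, first-out order

    def emit(node, split_index, now):
        """Create one water per outgoing edge of `node`."""
        nonlocal seq
        for head, weight in graph.get(node, []):
            heapq.heappush(water_list, (now + weight, seq, node, head, split_index))
            seq += 1

    emit(source, 0, 0)
    while water_list:
        reach_time, _, born, die, parent_index = heapq.heappop(water_list)
        if len(splits[die]) >= k:
            continue  # this node has already split k times
        if splits[die] and splits[die][-1][2] == reach_time:
            continue  # same length as the previous split (simplification: one parent)
        splits[die].append((born, parent_index, reach_time))
        emit(die, len(splits[die]) - 1, reach_time)
    return dict(splits)


# Assumed edge weights for the 5-node, 11-edge example graph (inferred from Table 1).
FIG5_GRAPH = {
    "s0": [("s1", 1), ("s2", 2), ("s3", 2)],
    "s1": [("s2", 4)],
    "s2": [("s1", 1), ("s3", 1), ("s4", 4)],
    "s3": [("s4", 1)],
    "s4": [("s1", 2), ("s2", 3), ("s3", 6)],
}

records = wsa(FIG5_GRAPH, "s0", 3)
```

Under these assumptions, the split records computed for s1 through s4 agree with the final records listed in Table 2.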
The pseudocode of the implementation is shown in Fig. 3. First, in line 5, s splits; we add this Split to the split list of s, and new waters are created for each outgoing edge. We then insert these new waters into waterList, with the water speed set to 1 by default. In line 6, the main loop begins. We choose the water with the smallest reachTime in waterList, denoted by w, and erase it from waterList. If w.dieNode has already split k times, the while loop continues with the next water. Otherwise, w.dieNode splits and new waters are created for each outgoing edge of w.dieNode. We set the bornNode and parentSplitIndex of these new waters to w.dieNode and the index of this Split in the split list of w.dieNode, respectively. Then we set the dieNode of each new water to the head node of the corresponding edge, and its reachTime to w.reachTime plus the weight of the corresponding edge. Next these new Waters are inserted
Fig. 2 An example to illustrate the basic idea of WSA
to waterList and this Split is also recorded. We continue to do so until waterList is empty. The next step is to access the output of WSA. When the above process is finished, the ith shortest path from s to v (denoted by pvi) can be obtained by the following scheme. Let u be the current node, c the current Split, and p and pos the parentNode and parentSplitIndex of c, respectively. We first assign v to u and set c to the ith Split of v. Then we repeat the following steps: insert u at the head of pvi; if p is the source node s and pos is 0, stop; otherwise, set u to p, set c to the pos-th Split of u, and update p and pos according to c. When this process is done, p is the source node s and pos is 0, which refers to the first split of the whole network. At last, we add the source node s to the head of pvi to complete this scheme. The above procedure only prints one path for each distinct length. In fact, the parentNode of a Split may refer to more than one node. In this case, there are several paths with the same length from s to this node. Figure 4 shows the pseudocode that describes how to print out the KSPs for each node. Furthermore, we can easily modify the algorithm to compute k shortest paths between two given nodes, just by terminating the while loop (lines 6–19 in Fig. 3) when the goal node has split k times.
3.3 An example using WSA
Here, we present an example to show how WSA works, as shown in Fig. 5. We are going to find three shortest paths from s0 to each other node. The process can be described as follows: first the source node splits and the created waters are added to waterList. We choose the water with the smallest reachTime in waterList and split the dieNode of this water (if this node has split fewer than k times). We add the created waters to waterList and record this split. The process is repeated until waterList is empty. In Table 1, a step-by-step execution of the implementation is listed. The first and second columns contain the current time and the currently firing node, respectively. The content of waterList, with the water format (bornNode, dieNode, reachTime, parentSplitIndex), is listed in the third column, and each recorded split, with the format (parentNode, parentSplitIndex, splitTime), is in the fourth column. In the second row of Table 1, the source node s0 splits and then creates three
Fig. 3 The structure of WSA algorithm
Fig. 5 Example graph with 5 nodes and 11 edges
Fig. 4 The procedure to print out the KSPs for each node
waters: (s0, s1, 1, 0), (s0, s2, 2, 0) and (s0, s3, 2, 0). Next, these waters are added to waterList and this split event is recorded at s0. The parentNode and parentSplitIndex of the first split are set to Null, which is used to mark the first split. At time 1, (s0, s1, 1, 0) arrives at s1 and is deleted from waterList. Note that s1 has split zero times, which is less than k. In this case, s1 splits and creates one new water (s1, s2, 5, 0); then we insert it into waterList and add this split to node s1. We continue to do so until waterList is empty. Table 1 shows the detailed process of this example. When all the above steps are finished, it is easy and quick to obtain the paths from the recorded splits. In Table 2, the final recorded splits and the corresponding paths are listed. As an example, we illustrate how to find the third shortest path P from s0 to s3: first, P is an empty list, and we add s3 to P. Since the third (index 2) Split record of s3 is (s2, 1, 6), we add the parentNode of this record, s2, to the head of P. Because the parentSplitIndex of this record is not 0
Table 1 The step-by-step execution of our implementation
Time | Node | waterList | Split record
0 | s0 | *(s0, s1, 1, 0) *(s0, s2, 2, 0) *(s0, s3, 2, 0) | s0: (Null, Null, 0)
1 | s1 | (s0, s2, 2, 0) (s0, s3, 2, 0) *(s1, s2, 5, 0) | s0: (Null, Null, 0); s1: (s0, 0, 1)
2 | s2, s3 | *(s2, s3, 3, 0) *(s2, s1, 3, 0) *(s3, s4, 3, 0) (s1, s2, 5, 0) *(s2, s4, 6, 0) | s0: (Null, Null, 0); s1: (s0, 0, 1); s2: (s0, 0, 2); s3: (s0, 0, 2)
3 | s1, s4, s3 | *(s3, s4, 4, 1) (s1, s2, 5, 0) *(s4, s1, 5, 0) (s2, s4, 6, 0) *(s4, s2, 6, 0) *(s1, s2, 7, 1) *(s4, s3, 9, 0) | s0: (Null, Null, 0); s1: (s0, 0, 1)(s2, 0, 3); s2: (s0, 0, 2); s3: (s0, 0, 2)(s2, 0, 3); s4: (s3, 0, 3)
4 | s4 | (s1, s2, 5, 0) (s4, s1, 5, 0) (s2, s4, 6, 0) (s4, s2, 6, 0) *(s4, s1, 6, 1) (s1, s2, 7, 1) *(s4, s2, 7, 1) (s4, s3, 9, 0) *(s4, s3, 10, 1) | s0: (Null, Null, 0); s1: (s0, 0, 1)(s2, 0, 3); s2: (s0, 0, 2); s3: (s0, 0, 2)(s2, 0, 3); s4: (s3, 0, 3)(s3, 1, 4)
5 | s2, s1 | (s2, s4, 6, 0) (s4, s2, 6, 0) (s4, s1, 6, 1) *(s2, s1, 6, 1) *(s2, s3, 6, 1) (s1, s2, 7, 1) (s4, s2, 7, 1) (s4, s3, 9, 0) *(s1, s2, 9, 2) *(s2, s4, 9, 1) (s4, s3, 10, 1) | s0: (Null, Null, 0); s1: (s0, 0, 1)(s2, 0, 3)(s4, 0, 5); s2: (s0, 0, 2)(s1, 0, 5); s3: (s0, 0, 2)(s2, 0, 3); s4: (s3, 0, 3)(s3, 1, 4)
6 | s2, s4, s3 | (s1, s2, 7, 1) (s4, s2, 7, 1) *(s2, s3, 7, 2) (s4, s3, 9, 0) (s1, s2, 9, 2) (s2, s4, 9, 1) *(s4, s2, 9, 2) (s4, s3, 10, 1) *(s4, s3, 12, 2) | s0: (Null, Null, 0); s1: (s0, 0, 1)(s2, 0, 3)(s4, 0, 5); s2: (s0, 0, 2)(s1, 0, 5)(s4, 0, 6); s3: (s0, 0, 2)(s2, 0, 3)(s2, 1, 6); s4: (s3, 0, 3)(s3, 1, 4)(s2, 0, 6)
7 | s2 | (s4, s3, 9, 0) (s1, s2, 9, 2) (s2, s4, 9, 1) (s4, s2, 9, 2) (s4, s3, 10, 1) (s4, s3, 12, 2) | s0: (Null, Null, 0); s1: (s0, 0, 1)(s2, 0, 3)(s4, 0, 5); s2: (s0, 0, 2)(s1, 0, 5)(s4, 0, 6); s3: (s0, 0, 2)(s2, 0, 3)(s2, 1, 6); s4: (s3, 0, 3)(s3, 1, 4)(s2, 0, 6)
9 | | (s4, s3, 10, 1) (s4, s3, 12, 2) | Ditto
10 | | (s4, s3, 12, 2) | Ditto
12 | | Null | Ditto
Waters marked with * are created at the current time and added into waterList
and the parentNode of the current node is not s (we have not reached the first Split of the whole network), we continue with the record in s2 with index 1: (s1, 0, 5). We then add s1 to the head of P and continue with the record (s0, 0, 1), which is in s1 with index 0. Notice that, this time, the parentNode and parentSplitIndex of the record (s0, 0, 1) are s0 and 0, respectively. We then add s0 to the head of P to complete the above process, and the obtained path is s0 → s1 → s2 → s3.
3.4 Algorithm analysis
In this subsection, we first address the correctness and completeness of the WSA algorithm for finite graphs.
Lemma 1 The splitTime of a split is less than or equal to the reachTime of any water created by this split in a graph with non-negative weights.
Proof Assume that water w is created by the split s and that the corresponding edge is e. Following the way the reachTime of a water is calculated, the reach time of w is
$$ w.\mathit{reachTime} = s.\mathit{splitTime} + w(e)/S, \tag{2} $$
where S is the constant water speed and w(e) is the length of edge e. Clearly w(e)/S ≥ 0, so we have
$$ s.\mathit{splitTime} \le w.\mathit{reachTime}. \tag{3} $$
This completes the proof.
Table 2 The shortest path list (k = 3) in Fig. 4
Node | Split record | SPs (k = 3) | Path lengths
s1 | 0: (s0, 0, 1) | 1st: s0 → s1 | 1
s1 | 1: (s2, 0, 3) | 2nd: s0 → s2 → s1 | 3
s1 | 2: (s4, 0, 5) | 3rd: s0 → s3 → s4 → s1 | 5
s2 | 0: (s0, 0, 2) | 1st: s0 → s2 | 2
s2 | 1: (s1, 0, 5) | 2nd: s0 → s1 → s2 | 5
s2 | 2: (s4, 0, 6) | 3rd: s0 → s3 → s4 → s2 | 6
s3 | 0: (s0, 0, 2) | 1st: s0 → s3 | 2
s3 | 1: (s2, 0, 3) | 2nd: s0 → s2 → s3 | 3
s3 | 2: (s2, 1, 6) | 3rd: s0 → s1 → s2 → s3 | 6
s4 | 0: (s3, 0, 3) | 1st: s0 → s3 → s4 | 3
s4 | 1: (s3, 1, 4) | 2nd: s0 → s2 → s3 → s4 | 4
s4 | 2: (s2, 0, 6) | 3rd: s0 → s2 → s4 | 6
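The paths in Table 2 can be recovered from the recorded splits alone by following parentNode and parentSplitIndex back to the first split of the source. The following Python sketch illustrates this back-tracing on the split records of Table 2; the function name trace_path and the dictionary layout are assumptions made for illustration, and the sketch returns one path per split record (it does not enumerate several equal-length paths).

```python
# Split records (parentNode, parentSplitIndex, splitTime) copied from Table 2;
# the source s0 holds the first split of the whole network, marked with None.
SPLITS = {
    "s0": [(None, None, 0)],
    "s1": [("s0", 0, 1), ("s2", 0, 3), ("s4", 0, 5)],
    "s2": [("s0", 0, 2), ("s1", 0, 5), ("s4", 0, 6)],
    "s3": [("s0", 0, 2), ("s2", 0, 3), ("s2", 1, 6)],
    "s4": [("s3", 0, 3), ("s3", 1, 4), ("s2", 0, 6)],
}


def trace_path(splits, node, index):
    """Return the path encoded by the `index`-th split record of `node`."""
    path = []
    while node is not None:
        path.append(node)
        # Move to the split that generated the water which triggered this split.
        node, index, _ = splits[node][index]
    path.reverse()  # nodes were collected from the target back to the source
    return path


third_to_s3 = trace_path(SPLITS, "s3", 2)
second_to_s4 = trace_path(SPLITS, "s4", 1)
```

The number of loop iterations equals the number of nodes on the path, which matches the statement that an explicit path costs time proportional to its number of edges.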
Lemma 2 Every node records its split actions in a non-decreasing order according to splitTime.
Proof Assume that node v splits at time t1 and splits again later at time t2. These two splits, namely s1 and s2, are caused by waters w1 and w2, respectively. Clearly, we have
$$ w_1.\mathit{reachTime} = s_1.\mathit{splitTime} = t_1, \tag{4} $$
$$ w_2.\mathit{reachTime} = s_2.\mathit{splitTime} = t_2. \tag{5} $$
At every step, the water w with the smallest reachTime is selected from waterList, and w.dieNode splits at time w.reachTime. According to Lemma 1, the reachTimes of the waters created by this split are not smaller than w.reachTime, and neither is wa.reachTime for any water wa remaining in waterList. Hence every water selected later has a reachTime that is not smaller than that of every water selected earlier, so we have
$$ w_1.\mathit{reachTime} \le w_2.\mathit{reachTime} \;\Longrightarrow\; t_1 \le t_2. \tag{6} $$
This completes the proof.
Theorem 1 WSA is correct and complete for finite graphs.
Proof In a finite graph (which means that the number of nodes, the degree of each node and the edge length between any two directly connected nodes are finite), the water flows to all outgoing edges of every node that splits, so all possible tracks can be obtained in finite time. For any node, the reachTime of every water that arrives later is not smaller than the splitTime of the last split of that node; therefore, every possible path is taken into consideration by the algorithm. By Lemma 2, every node records its splits in a non-decreasing order, and each split represents one path (see the scheme to obtain an explicit path from a Split record in Fig. 4). Therefore, in a finite graph, with a fixed water speed, every node connected to the source will be reached and triggered to split, which contributes to the termination of WSA. Thus, the proposed algorithm is correct and complete.
Here, we analyze the time and space complexities of WSA. The main work of WSA is to create waters, insert them into waterList and erase them from waterList. As we are interested in finding k shortest paths from s to each other node, we limit the number of splits of each node to k. Thus, there are at most kn Splits and km waters in total, so the size of waterList is at most km. Consequently, the worst-case cost of an insertion or deletion on waterList is O(log(km)) when a heap is used to store waterList, and these operations are executed at most 2km times. Therefore, the time complexity of this implementation is O(km log(km)), where n is the number of nodes, m is the number of edges and k is the number of paths to be computed. It should be pointed out that this complexity refers to determining an implicit representation of the K single-source shortest paths. As for space complexity, our implementation needs O(m) and O(n) to store the m edges and n nodes. We also record all Split events for each node, which requires O(kn) space because each node splits at most k times, and O(km) is needed to maintain waterList, which has at most km elements. Thus, the space complexity of WSA is O(kn + km + n + m). The space complexity of WSA can be further optimized. First, at most k waters exist on the same edge simultaneously. Second, for a node v, the number of its recorded splits (denoted by sn(v)) plus the number of waters with target node v that need to be kept (denoted by wn(v)) is at most k, i.e., sn(v) + wn(v) ≤ k, and clearly wn(v) ≤ k. Hence, for each node, at most k waters have to be stored, and the total space required to store the waters can be reduced to O(kn). However, to achieve this space complexity, for any node v, we should store only the first wn(v) waters, ordered by their reachTime, among all the waters flowing to v. In our implementation, all the waters are stored in one set, waterList. 
So to realize this idea, whenever a water is generated, we need a further traversal of waterList, which costs O(kn). The total time complexity then becomes O(k^2 nm), which is unacceptable. Furthermore, we study another implementation that reduces the time complexity to O(km log k + kn^2) with the space complexity O(kn). We store the waters separately according to their target node; each node maintains an ordered waterList with a size of at most k. Inserting a new water costs O(log k) (this operation is performed at most km times), and deleting the water with the smallest reachTime costs O(n) (this operation is performed at most kn times). Together, these operations lead to the time complexity O(km log k + kn^2) (we name this implementation WSA*). We can conclude that different storage strategies for the waters mean different trade-offs between space and time complexity. Although WSA* always outperforms WSA in space usage because n < m (the gap in space usage depends on the ratio n/m of the graph), the time complexity of WSA* is far
worse than that of WSA (especially for graphs with a small edge-to-node ratio). In practical applications, although the real ratio is not known in general, it is less than 3 for all 12 USA road maps published in the 9th DIMACS Implementation Challenge (Demetrescu et al. 2006). It should be pointed out that the above complexities cover listing the solution path lengths in a non-decreasing order, but they do not include the time and space needed to print out the solution paths in an explicit representation. The time complexity of obtaining each explicitly represented path is proportional to the number of edges contained in this path.
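The bounds derived in this subsection can be collected in one place. The following block only restates the counting arguments given in the text; the constant c is an unspecified constant introduced here for illustration.

```latex
\begin{align*}
T_{\mathrm{WSA}} &\le \underbrace{2km}_{\text{heap operations}} \cdot
  \underbrace{c\,\log(km)}_{\text{cost per operation}}
  = O\bigl(km \log(km)\bigr),\\
S_{\mathrm{WSA}} &= O(m) + O(n) + O(kn) + O(km) = O(kn + km + n + m),\\
T_{\mathrm{WSA^*}} &= \underbrace{km \cdot O(\log k)}_{\text{insertions}}
  + \underbrace{kn \cdot O(n)}_{\text{deletions}}
  = O\bigl(km \log k + kn^{2}\bigr).
\end{align*}
```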
4 Simulations
The proposed method can compute a set of shortest paths with k distinct path lengths in a non-decreasing order; it can also find all shortest paths whose lengths are not larger than a given threshold. To the best of our knowledge, no published method has tackled exactly this problem. The EA algorithm (Eppstein 1988) is the best-known method for solving the single-source KSP as well as the single-pair KSP problem with loops allowed in solution paths. LVEA (Jimenez and Marzal 2003) and K* (Aljazzar and Leue 2011) can be regarded as optimizations of EA designed for the single-pair KSP problem. K* outperforms LVEA and can be considered the state-of-the-art algorithm for the single-pair KSP problem. Similar to EA, K* and LVEA can also be applied to the single-source KSP problem. Although the authors of K* and LVEA have not discussed how to apply them to the single-source KSP problem, we extend K* to this problem for further comparison. Our extension of K* (named e-K*) can be stated as follows: first we run A* to compute the shortest path tree. There is no need to stop A* and resume it later as in solving single-pair KSPs, because we need to compute KSPs from a source node to each other node. Then, we create tree heaps for each node in the graph. At last, using these tree heaps, we compute the KSPs node by node. Furthermore, all graph data need to be loaded into main memory for e-K* because of the single-source KSP computation, and the on-the-fly feature used in K* for the single-pair KSP problem is no longer needed in the single-source KSP problem. As we need to compute a complete shortest path tree (containing all nodes), there is no difference in performance between A* and Dijkstra's algorithm in this case. Thus, we do not need to use a heuristic search strategy in e-K* when solving the single-source KSP problem. 
We can conclude from the above description that the time complexity of e-K* is O(m + n log n + nk): A* requires O(m + n log n), the construction of the heaps requires O(n log n), and O(nk) is required to compute the k shortest paths for each other
Table 3 Datasets of four maps used in our simulations
Map | No. of nodes | No. of edges
Colorado (COL) | 435,666 | 1,057,066
San Francisco bay (BAY) | 321,270 | 800,172
New York city (NY) | 264,346 | 733,846
Florida (FLA) | 1,070,376 | 2,712,798
node. We can also deduce that the space complexity of e-K* is O(m + n log n + k). Comparing the time complexities of WSA and e-K*, which are O(km log(km)) and O(m + n log n + nk), respectively, we can see that WSA is mainly influenced by the number of edges m, while e-K* is mainly influenced by the number of nodes n. As a matter of fact, when k increases by 1, WSA needs to process another m waters, while e-K* needs one more step of path graph search for the KSP computation of each node, i.e., n more steps in total. In other words, the complexities of WSA and e-K* depend on m and n, respectively. Thus, we need to explore how n and m influence the performance of both algorithms in the simulations. Although the main goal of WSA is single-source KSP computation, it can also be applied to the single-pair KSP problem by terminating the algorithm when the target node has split k times. Thus, we also check the performance of WSA for the single-pair KSP problem. The implementations of WSA, e-K* and K* are written in C++ and compiled with the g++ compiler. All the simulations are run on a personal computer equipped with an Intel Pentium Dual Core 3.2 GHz CPU and 8 GB RAM, running Ubuntu Desktop 13.04 64-bit. We use two kinds of graphs in our simulations: one is real road network data available from the home page of the 9th DIMACS Implementation Challenge (Demetrescu et al. 2006), and the details of the four graphs used in our experiments are listed in Table 3; the other is randomly generated graphs produced by the SPRAND generator attributed to Cherkassky et al. (1996). Using this generator, we obtain different graph datasets with different seeds. The C code of this generator is contained in the SPLIB-1.4 library available from the personal web page of A.V. Goldberg at http://www.avglab.com/andrew/.
4.1 Single-source KSP computation on real datasets
We first use the real road graph data to test the performance of both algorithms when applied to the single-source KSP problem.
Our simulations of both algorithms are based on four maps: New York City, San Francisco Bay Area, Colorado and Florida, as shown in Table 3. The average runtime of WSA and e-K* on each map is calculated for different values of k. Figure 6 shows the runtime comparison of both algorithms. We can see that the runtimes of both e-K* and WSA are nearly proportional to k, but that of e-K* grows much faster than that of WSA. In other words, the performance of e-K* is more sensitive to k than that of WSA.

Fig. 6 Runtime comparison for K* and WSA using road graph data for single-source KSP problem (panels: NY, BAY, COL, FLA; x-axis: K; y-axis: Runtime (ms); curves: e-K*, WSA)

Figure 7 shows the comparison of space usage for both algorithms. The curve of WSA is somewhat like a step function. This is because we used the set container of the C++ STL to store the water list, which is the main memory consumer of WSA. With the C++ STL implementation we used, the memory allocated for set grows like a step function. Clearly, the memory usage of e-K* is nearly constant as k increases (it is only slightly influenced by k), while the memory consumption of WSA is almost proportional to k. The explanation is that e-K* computes KSPs node by node. When computing KSPs for a node, the memory required specifically to compute KSPs for the previous node is released. Thus, the memory usage of K* for single-pair KSPs and for single-source KSPs is equivalent. The space complexity of K* is O(m + n log n + k), which is not sensitive to k. However, WSA computes KSPs for all nodes at the same time. The main space needed by WSA is used to store edges (O(m)) and nodes (O(n)) and to record splits (O(kn)) and waters (O(km)), which is very sensitive (proportional) to k. Nevertheless, when k is not large (e.g., no more than 60 in Fig. 7), WSA performs better than K* in memory usage. It is therefore the better choice, especially considering that many practical applications do not need to find so many KSPs.

Fig. 7 Memory usage for K* and WSA using road graph data for single-source KSP problem (panels: NY, BAY, COL, FLA; x-axis: K; y-axis: Memory (KB); curves: e-K*, WSA)

Fig. 8 Runtime comparison for WSA and e-K* on graphs with an edge-to-node ratio of 3:1 (panels: 1K nodes and 3K edges, 5K nodes and 15K edges, 9K nodes and 27K edges, 15K nodes and 45K edges; x-axis: K; y-axis: Runtime (ms); curves: e-K*, WSA)
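To make the memory discussion of Sect. 4.1 concrete, the following sketch compares how the dominant space terms quoted there grow with k. It is a rough illustration only: constant factors are ignored, the function names are hypothetical, and the graph size mentioned in the note after the code (10,000 nodes and 30,000 edges) is an arbitrary choice rather than one of the measured road graphs.

```python
import math


def wsa_space_units(n: int, m: int, k: int) -> int:
    """Dominant WSA terms from the text: edges O(m), nodes O(n),
    split records O(kn) and waters O(km). Constants are ignored."""
    return m + n + k * n + k * m


def kstar_space_units(n: int, m: int, k: int) -> float:
    """Space bound quoted for K* in the text: O(m + n log n + k).
    The base of the logarithm is an arbitrary choice here."""
    return m + n * math.log2(n) + k


def growth_per_extra_path(space_model, n: int, m: int, k: int) -> float:
    """Extra space units needed when k is increased by one."""
    return space_model(n, m, k + 1) - space_model(n, m, k)
```

Under this model, with n = 10,000 and m = 30,000, each additional path costs WSA on the order of n + m = 40,000 extra units, while the K* bound grows by a single unit. This is consistent with the nearly proportional and nearly constant memory curves described for Fig. 7.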
4.2 Single-source KSP computation on synthetic datasets

The simulations on real graphs have demonstrated the higher efficiency of our method. However, it is known that the complexity of a graph, including its scale (the number of nodes and edges) and its ratio (between edges and nodes), affects the performance of an algorithm. As discussed earlier, when k increases by one, e-K* needs to process another n units while WSA needs to process another m units. Hence, the ratio of m to n should have a strong impact on the performance of both algorithms. Based on the statistics of the real road data (the m/n ratios of NY, BAY, COL and FLA are 2.78, 2.49, 2.43 and 2.53, respectively), we first fix the ratio between m and n to 3 and vary the graph scale; then we fix the number of nodes and change m (i.e., change the ratio m/n) to test the performance of both algorithms. In the first test, the number of nodes increases from 1,000 to 15,000 with m/n = 3. For each graph, we randomly select 100 nodes as source nodes to reduce random deviation. The results are shown in Fig. 8.
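The experimental setup described above can be imitated with a small script. The sketch below is not the SPRAND generator; it is a hypothetical stand-in that builds a directed graph with n nodes and m = ratio × n weighted edges (a cycle through all nodes, so that every node is reachable from every source, plus random extra edges) and then draws random source nodes. The function names, the weight range and the seed handling are assumptions made for illustration.

```python
import random


def random_digraph(n, ratio, seed, max_weight=1000):
    """Return a list of (u, v, weight) edges with exactly ratio * n edges.

    Assumes ratio is an integer >= 1. Loops and parallel edges may occur,
    which the graph model of the paper allows.
    """
    rng = random.Random(seed)
    m = ratio * n
    order = list(range(n))
    rng.shuffle(order)
    edges = []
    # A cycle through all nodes guarantees that every node is reachable.
    for i in range(n):
        u, v = order[i], order[(i + 1) % n]
        edges.append((u, v, rng.randint(1, max_weight)))
    # Fill up with random edges until the requested edge count is reached.
    while len(edges) < m:
        u, v = rng.randrange(n), rng.randrange(n)
        edges.append((u, v, rng.randint(1, max_weight)))
    return edges


def sample_sources(n, count, seed):
    """Draw `count` distinct source nodes uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(range(n), count)
```

Averaging the measured runtime over the sampled sources, as done in the first test, reduces the influence of any single lucky or unlucky source node.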
Fig. 9 Different performance improvements of WSA compared to e-K*, denoted by te/tw, on different scale graphs with a constant ratio of 3 (edges/nodes) (x-axis: K; y-axis: te/tw; legend: 1k, 3k, 5k, 7k, 9k, 15k)
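The scale normalization used in the analysis of the Fig. 8 results can be checked with a few lines of arithmetic. The sketch below uses the e-K* runtimes for k = 1 quoted in this section; the helper name and the choice to express the graph scale in thousands of nodes are assumptions made for illustration.

```python
def normalize_by_scale(runtimes_ms, nodes_in_thousands):
    """Divide each runtime by its graph scale and round to two decimals."""
    return [round(t / s, 2) for t, s in zip(runtimes_ms, nodes_in_thousands)]


scales = [1, 5, 9, 15]          # graph scale in thousands of nodes
te_k1 = [12, 62, 109, 185]      # e-K* runtimes (ms) for k = 1, from the text

normalized_te = normalize_by_scale(te_k1, scales)
spread = max(normalized_te) - min(normalized_te)
```

The normalized values stay within a narrow band of about 0.4 ms per thousand nodes, which is what supports the statement that the runtime grows in proportion to the graph scale when m/n is constant.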
By analyzing the experimental data in detail, we can draw two conclusions. (1) For both e-K* and WSA, the runtime increases in direct proportion to the scale of the graph. As an example, the runtimes (in ms) of e-K* when k = 1 (denoted by te(k)) on graphs with 1,000, 5,000, 9,000 and 15,000 nodes are te(1) = [12, 62, 109, 185]. We normalize te to N(te) by dividing by the graph scale (the number of nodes in thousands). In this way, we obtain N(te)(1) = [12/1, 62/5, 109/9, 185/15] = [12, 12.4, 12.11, 12.33] and, similarly, N(tw)(1) = [4, 4.6, 4.44, 4.53]. The normalized runtimes of e-K* and WSA change little with the graph scale. Similar results are obtained for other values of k, so the runtime of both algorithms varies proportionally with the graph scale when the ratio of edges to nodes is constant. (2) We further explore how the performance improvement of our method relates to the graph scale (with the number of nodes ranging from 2,000 to 15,000). Figure 9 shows the improvement, measured by the runtime ratio te/tw, on graphs of different scales. The improvement decreases as the graph scale increases: the runtime of e-K* is about four times that of WSA when n = 2,000 and m = 6,000, but only about twice when n = 15,000 and m = 45,000. The improvement is not very sensitive to k. We are not sure whether there exists a threshold graph scale at which the performances of both algorithms meet. Owing to the limitations of our laboratory equipment, we leave this question to future research. The memory consumption of both methods is shown in Fig. 10. In line with the analysis in the previous subsection, e-K* uses far less memory than WSA when k is large.

Fig. 10 Memory usage for e-K* and WSA on graphs with an edge-to-node ratio of 3:1 (panels: 1K nodes and 3K edges, 5K nodes and 15K edges, 9K nodes and 27K edges, 15K nodes and 45K edges; x-axis: K; y-axis: Memory usage (KB); curves: e-K*, WSA)

Fig. 11 Runtime comparison on the graphs with 10,000 nodes and varying number of edges (panels: 10K nodes and 20K edges, 10K nodes and 30K edges, 10K nodes and 60K edges, 10K nodes and 80K edges; x-axis: K; y-axis: Runtime (ms); curves: e-K*, WSA)

In the second simulation, we study how the performance changes when the ratio of edges to nodes varies. We set the number of nodes to 10,000, and set the ratio to 2, 3, 6 and
8, respectively. The experimental results are shown in Fig. 11. Clearly, as the ratio becomes larger, the improvement of WSA over e-K* decreases. When the ratio is 6, WSA achieves performance similar to that of e-K*, while with a ratio greater than 6, e-K* performs better than WSA. This can be explained as follows: WSA is mainly influenced by the number of edges m, while e-K* is mainly influenced by the number of nodes n. When n is fixed and m increases, WSA needs to process km waters, whereas for e-K* the additional edges only increase the size of the tree heap and the time needed to compute the shortest path tree, which is computed only once and can be reused when computing KSPs for each node. Hence, a smaller ratio m/n gives WSA a greater advantage over e-K*, and the threshold is about 6 in our test. We have no statistics about this ratio in practice. In the real road graphs of Demetrescu et al. (2006), however, the ratios are all less than 3. Thus, our method should be able to adapt to practical applications. As for the memory consumption shown in Fig. 12, WSA is comparable with e-K* only when k is rather small, as discussed above.

Fig. 12 Memory usage of both algorithms on graphs with 10,000 nodes and varying number of edges (panels: 10K nodes and 20K edges, 10K nodes and 30K edges, 10K nodes and 60K edges, 10K nodes and 80K edges; x-axis: K; y-axis: Memory usage (KB); curves: e-K*, WSA)

4.3 Single-pair KSP computation

As mentioned before, the proposed algorithm can also be applied to the single-pair KSP problem by simply terminating the algorithm as soon as the destination node has split k times. In this case, we obtain k shortest paths with distinct lengths. We compare WSA with K* (Aljazzar and Leue 2011), the state-of-the-art algorithm for this kind of KSP problem. We use the four road graphs (Demetrescu et al. 2006) and, as Aljazzar and Leue (2011) did, use the airline distance, computed by the law of cosines, as the heuristic estimate for K*. Following Aljazzar and Leue (2011), we let the number of explored nodes or explored edges grow by 20 % in each run of A*. We randomly generate 50 s-t pairs for each graph. The average runtimes of WSA and K* are shown in Fig. 13. We can see that the runtime of WSA is nearly proportional to k, while K* is not sensitive to k, and that the performances of the two algorithms meet when k is small. In the worst case, the number of waters processed by WSA is km, which is proportional to k, while the time complexity of K*, O(m + n log n + k), is not sensitive to k. When k is small, WSA needs to process only a small number of waters (km in the worst case), while K* still needs to compute the shortest path tree as well as the path graph. That is why WSA outperforms K* when k is rather small. Figure 14 shows the memory consumption results; the analysis is similar to that in the previous subsection. K* clearly outperforms WSA when k is large. Notice that in our experiments, A* is not resumed in K*. That is why the runtime curve of K* does not look like a step function.

Fig. 13 Runtime comparison for K* and WSA using road graph data for single-pair KSPs (panels: NY, BAY, COL, FLA; x-axis: K; y-axis: Runtime (ms); curves: K*, WSA)

Fig. 14 Space comparison for K* and WSA using road graph data for single-pair KSPs (panels: NY, BAY, COL, FLA; x-axis: K; y-axis: Memory (KB); curves: K*, WSA)
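The airline-distance heuristic mentioned in Sect. 4.3 can be illustrated with the spherical law of cosines. The following is a minimal sketch, not the code used in the experiments: the function name, the Earth radius constant and the choice of degrees as input units are assumptions. For the estimate to remain a valid lower bound in a heuristic search, its unit would also have to be converted to the unit of the edge weights of the graph, which is not shown here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def airline_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees,
    computed with the spherical law of cosines."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(dlon))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_KM * math.acos(cos_angle)
```

Because a road path between two points is never shorter than the great-circle arc between them, such an estimate does not overestimate the remaining distance, which is the property a heuristic for A*-style search relies on.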
It should be pointed out that WSA is designed to solve the single-source KSP problem rather than the single-pair KSP problem. When we apply WSA to compute KSPs between two given nodes, WSA terminates in one of two scenarios: (1) when the goal node has split k times; (2) when the water list is empty. WSA computes fewer than k shortest paths when there do not exist k shortest paths with distinct lengths between the two given nodes. Notice that once a node splits, a path has been found from this node back to the source node. Therefore, in the process of computing KSPs from the source node to the goal node, WSA also finds many shortest paths from the source node to other nodes. That is why WSA is well suited to the single-source KSP problem.
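The two termination scenarios can be made concrete with a simplified sketch written in the spirit of the water-flow description. It is not the authors' implementation: it records only the distinct path lengths at which each node splits (the traces needed to step back and recover the paths are omitted), it assumes strictly positive edge weights and k ≥ 1, and it counts the arrival at the source at time 0 as the first split of the source. The function and variable names are hypothetical.

```python
import heapq
from collections import defaultdict


def k_distinct_lengths(edges, source, k, goal=None):
    """Return {node: increasing list of up to k distinct path lengths}.

    edges: iterable of (u, v, w) with w > 0; loops and multi-edges allowed.
    If goal is given, stop as soon as the goal node has split k times.
    """
    out = defaultdict(list)
    for u, v, w in edges:
        out[u].append((v, w))

    splits = defaultdict(list)   # node -> arrival times at which it split
    waters = [(0, source)]       # priority queue of (arrival time, node)

    while waters:                # scenario (2): stop when no water is left
        t, v = heapq.heappop(waters)
        times = splits[v]
        # Ignore water reaching a saturated node, or arriving at a length
        # that this node has already recorded.
        if len(times) == k or (times and times[-1] == t):
            continue
        times.append(t)          # the node splits at time t
        if v == goal and len(times) == k:
            break                # scenario (1): goal has split k times
        for nxt, w in out[v]:    # new waters flow along outgoing edges
            heapq.heappush(waters, (t + w, nxt))
    return dict(splits)
```

Since every node splits at most k times and each split sends one water along each outgoing edge, at most km waters are generated (plus the initial one), in line with the worst-case count given in Sect. 4.3. When the goal node is reached, the other nodes already hold the lengths found so far, which illustrates why the single-source results come as a by-product.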
5 Conclusions

The newly proposed algorithm (WSA) aims to solve the single-source KSP problem on a graph where loops and multiple edges between nodes are allowed. We have illustrated its correctness and determined its asymptotic worst-case complexity in terms of runtime and space. Compared with the state-of-the-art algorithm K*, WSA is not competitive for the single-pair KSP problem, but it runs faster than K* when finding single-source KSPs on graphs with a small ratio m/n. In particular, WSA is able to find paths with k distinct lengths, or to compute KSPs under a path length criterion given in advance. The algorithm also provides a direction for solving KSP problems in parallel, for example using artificial neural networks or other methods. Unlike the runtime, the space requirement of WSA grows rapidly as k increases, which is the weakness of WSA compared with K*. We will keep studying and optimizing WSA in our future work, especially its neural network-based implementations.

Acknowledgments This work was supported by the Fundamental Research Funds for the Central Universities under Grant ZYGX2013J076, and National Science Foundation of China under Grants 61273308 and 61175061.
References

Aljazzar H, Leue S (2011) K*: a heuristic search algorithm for finding the K shortest paths. Artif Intell 175:2129–2154
Aljazzar H, Leue S (2008) K*: a directed on-the-fly algorithm for finding the k shortest paths. University of Konstanz, Germany, Tech. Rep. soft-08-03, Mar, 2008
Berclaz J, Fleuret F, Turetken E et al (2011) Multiple object tracking using k-shortest paths optimization. IEEE Trans Pattern Anal Mach Intell 33(9):1806–1819
Cherkassky BV, Goldberg AV, Radzik T (1996) Shortest paths algorithms: theory and experimental evaluation. Math Program 73:129–174
Demetrescu C, Goldberg A, Johnson D (2006) 9th DIMACS implementation challenge-shortest paths. American Mathematical Society
Eppstein D (1998) Finding the k shortest paths. SIAM J Comput 28(2):652–673
Gao J, Qiu H, Jiang X et al (2010) Fast top-k simple shortest paths discovery in graphs. Proceedings of the 19th ACM international conference on Information and knowledge management, pp 509–518
Gotthilf Z, Lewenstein M (2009) Improved algorithms for the k simple shortest paths and the replacement paths problems. Inform Process Lett 109(7):352–355
Hershberger J, Maxel M, Suri S (2007) Finding the k shortest simple paths: a new algorithm and its implementation. ACM Trans Algorithms 3(4):75
Hoffman W, Pavley R (1959) A method of solution of the Nth best path problem. J ACM 6:506–514
Hu XB, Wang M, Hu D et al (2012) A ripple-spreading algorithm for the k shortest paths problem. Third Global Congress on Intelligent Systems, pp 202–208
Jimenez VM, Marzal A (1999) Computing the k shortest paths: a new algorithm and an experimental comparison. Lect Notes Comput Sci 1668:15–19
Jimenez VM, Marzal A (2003) A lazy version of Eppstein’s shortest paths algorithm. Lect Notes Comput Sci 2647:179–190
Martins EV, Pascoal MB (2003) A new implementation of Yen’s ranking loopless paths algorithm. Q J Belg Fr Ital Oper Res Soc 1(2):121–133
Ozer B, Gezici G, Meydan C et al (2010) Multiple sequence alignment based on structural properties. Health Informatics and Bioinformatics (HIBIT), 2010 5th International Symposium on IEEE, pp 39–44
Sedeno-Noda A (2012) An efficient time and space K point-to-point shortest simple paths algorithm. Appl Math Comput 218(20):10244–10257
Sedeno-Noda A, Espino-Martin JJ (2013) On the K best integer network flows. Comput Oper Res 40(2):616–626
Shih YK, Parthasarathy S (2012) A single source k-shortest paths algorithm to infer regulatory pathways in a gene network. Bioinformatics 28(12):49–58
Wan X, Hua N, Zheng X (2012) Dynamic routing and spectrum assignment in spectrum-flexible transparent optical networks. J Opt Commun Netw 4(8):603–613
Xu WT, He SW et al (2012) Finding the K shortest paths in a schedule-based transit network. Comput Oper Res 39(8):1812–1826
Yang HH, Chen YL (2005) Finding K shortest looping paths in a traffic-light network. Comput Oper Res 32(3):571–581
Yen JY (1972) Another algorithm for finding the k shortest-loopless network paths. In 41st Mtg. Operations Research Society of America, vol 20, p B/185
Yen JY (1971) Finding the K shortest loopless paths in a network. Manage Sci 17:712–716