A fast work function algorithm for solving the k-server ... - Springer Link

8 downloads 131 Views 342KB Size Report
Jul 27, 2011 - k mobile servers should serve a sequence of requests appearing at various ... mentioned, the k-server problem is posed in a fixed metric space,.
CEJOR (2013) 21:187–205 DOI 10.1007/s10100-011-0222-7 ORIGINAL PAPER

A fast work function algorithm for solving the k-server problem Tomislav Rudec · Alfonzo Baumgartner · Robert Manger

Published online: 27 July 2011 © Springer-Verlag 2011

Abstract This paper deals with the work function algorithm (WFA) for solving the on-line k-server problem. The paper addresses some practical aspects of the WFA, such as its efficient implementation and its true quality of serving. First, an implementation of the WFA is proposed, which is based on network flows, and which reduces each step of the WFA to only one minimal-cost maximal flow problem instance. Next, it is explained how the proposed implementation can further be simplified if the involved metric space is finite. Also, it is described how actual computing of optimal flows can be speeded up by taking into account special properties of the involved networks. Some experiments based on the proposed implementation and improvements are presented, where actual serving costs of the WFA have been measured on very large problem instances and compared with costs of other algorithms. Finally, suitability of the WFA for solving real-life problems is discussed. Keywords On-line problems · On-line algorithms · k-server problem · Work function algorithm · Implementation · Network flows

T. Rudec · A. Baumgartner Faculty of Electrical Engineering, University of Osijek, Kneza Trpimira 2b, 31000 Osijek, Croatia e-mail: [email protected] A. Baumgartner e-mail: [email protected] R. Manger (B) Department of Mathematics, University of Zagreb, Bijeniˇcka cesta 30, 10000 Zagreb, Croatia e-mail: [email protected]

123

188

T. Rudec et al.

1 Introduction In the k-server problem (Koutsoupias 2009; Manasse et al. 1990) one has to decide how k mobile servers should serve a sequence of requests appearing at various locations of a fixed metric space. It is usually required that the solution is produced in on-line fashion (Irani and Karlin 1997), so that each request is served before the next request arrives. Serving is accomplished by moving a server to the appropriate location. In addition to processing requests on time, a good on-line algorithm for solving the k-server problem also tries to minimize the total cost of serving, where the cost is estimated as the sum of distances crossed by all servers. A desirable property of such quasi-optimal on-line serving is called competitiveness (Sleator and Tarjan 1985). Vaguely speaking, an algorithm is competitive if its cost is only a bounded number of times worse than optimal. There are various on-line algorithms for solving the k-server problem found in literature. Among them, the best characteristics regarding competitiveness exhibits the work function algorithm—WFA (Bartal and Koutsoupias 2004; Koutsoupias and Papadimitrou 1994; Koutsoupias 1999; Koutsoupias 2009). In spite of its theoretical importance and interesting properties, the WFA is actually never used due to its prohibitive computational complexity. Consequently, with lack of practical evidence, it is not quite clear whether the competitive but complex WFA can really provide better service than much simpler but non-competitive heuristics such as the greedy or the balanced algorithm (Irani and Karlin 1997; Manasse et al. 1990). The aim of this paper is to address the mentioned practical aspects of the WFA. More precisely, the paper gives answers to the following questions. – Can the WFA be implemented more efficiently than it is implied by its definition? – Can such implementation be fast enough for real applications? – Can the WFA assure better costs of serving than simple heuristics? Note that the aspects of the WFA considered in this paper are closely linked together. Namely, developing a more efficient implementation makes sense only if the WFA really provides better service, otherwise we could rather stick to simple heuristics. On the other hand, to check that the WFA really provides better service, we need a fast implementation that would allow testing on very long sequences of requests. The paper is organized as follows. After the introduction in Sect. 1, all necessary preliminaries are listed in Sect. 2. Section 3 proposes a relatively efficient implementation of the WFA based on network flows, where one step of the WFA reduces to only one minimal-cost maximal flow problem instance. Section 4 explains how the network introduced in Sect. 3 can further be simplified if the metric space underlying the considered k-server problem is finite. Section 5 describes how the general procedure for computing optimal flows can be customized and speeded up by taking into account some special properties of the networks from Sects. 3 and 4. Section 6 reports on experiments, where the cost of serving incurred by the WFA has been measured and compared with corresponding costs produced by simple heuristics. Thanks to using the efficient implementation from Sect. 3 with the improvements from Sects. 4 and 5, it was possible to test the WFA on very long sequences of requests. Section 7 analyzes

123

A fast work function algorithm

189

whether the obtained implementation is still fast enough for real applications. The final Sect. 8 gives conclusions.

2 Preliminaries As we have already mentioned, the k-server problem is posed in a fixed metric space, let us call it M . An instance of the k-server problem is given by the following data.   (0) (0) (0) (0) – The initial configuration of k servers S (0) = s1 , s2 , . . . , sk , where s j specifies the initial location in M of the j-th server. – The sequence of n requests σ = (r1 , r2 , . . . , rn ), where ri describes the i-th request and again specifies a location in M . An on-line algorithm for solving the k-server problem works in the following way. In its i-th step the algorithm serves the request ri by moving   a server to the location (i−1) (i−1) (i−1) , s2 , . . . , sk of ri . Thereby the current server configuration S (i−1) = s1   (i) (i) (i) transforms into a new configuration S (i) = s1 , s2 , . . . , sk . The decision which server to move may be based only on the already seen requests r1 , r2 , …, ri−1 , ri , thus it must be taken without any information about the future requests ri+1 , ri+2 , …, rn . Whenever the algorithm moves a server from a location a to a location b, it incurs a cost equal to the distance D(a, b) between a and b in M . The main challenge is how to serve requests so that the total cost of serving remains as small as possible. As a concrete instance of the k-server problem, let us consider the set M of locations within the mountain and coastal area of Croatia, as shown in Fig. 1 with distances indicated. In this area, forest fires are very common during summer, and it can happen that several fires occur within the same day. Suppose that k = 3 fire-extinguishing helicopters are initially located at Rijeka (RI), Zadar (ZD) and Dubrovnik (DU), respectively. If the first fire starts for instance in Split (ST), then our on-line algorithm has to decide which of the available helicopters should be moved to that location. Seemingly the cheapest solution would be to move the nearest machine from Zadar. But such a

KA DE

OG

RI UM

PZ

250

GS

GR KN

PU

ZD

ST 110

Sˇ I

160

DU

Fig. 1 A k-server problem instance

123

190

T. Rudec et al.

choice could be wrong if, for instance, all forthcoming fires would appear between Zadar, Split and Dubrovnik, and none in Rijeka. Besides forest fires, we could imagine many other similar “geographical” applications of the k-server problem, for instance hail defending with ground-to-air rockets, responding to terrorist attacks, renting mobile equipment such as sound systems for concerts, etc. There are also applications within the area of computers, e.g. accessing data on a magnetic disk with multiple read/write heads. Also, the well known paging problem, dealing with computer memory management, can be considered as a special case of the k-server problem. The simplest on-line method for solving the k-server problem is the random algorithm—RAND (Irani and Karlin 1997), where each request is served by a randomly chosen sever. Thereby any server is chosen with equal probability, which means that “historical” information on previous requests and configurations is ignored, as well as information on distances. Another well known heuristic is the greedy algorithm—GREEDY (Irani and Karlin 1997). It serves the current request in the cheapest possible way, by taking into account distances while still ignoring history. Thus GREEDY always sends the nearest server to the requested location. A more sophisticated but still simple heuristic is the balanced algorithm— BALANCE (Manasse et al. 1990), which tries to take into account both distances and history. More precisely, it attempts to keep the total distance moved by various servers roughly equal. Consequently, BALANCE employs the server whose cumulative distance traveled so far plus the distance to the requested location is minimal. This paper is mostly concerned with the work function algorithm—WFA (Bartal and Koutsoupias 2004; Koutsoupias and Papadimitrou 1994; Koutsoupias 1999; Koutsoupias 2009). As any on-line algorithm, the WFA serves in its i-th step the request ri by switching from the current server configuration S (i−1) to a new configuration S (i). However, among k possibilities (any of k servers could be moved) S (i) is chosen so that       F S (i) = COPT S (0) , r1 , r2 , . . . , ri , S (i) + D S (i−1) , S (i)

(1)

becomes minimal. Thus the objective function F(S (i) ) used by the WFA is defined as a sum of two parts. – The first part, usually called the work function, is the minimum total cost of starting from S (0) , serving in turn r1 , r2 , . . . , ri , and ending up in S (i). – The second part is the distance traveled by a server to switch from S (i−1) to S (i). Together with the original WFA, we will also consider its “lightweight” version denoted as the w-WFA (Baumgartner et al. 2010), which is based on the idea that the sequence of previous requests and configurations should be examined through a moving window of size w. In its i-th step the w-WFA acts as if ri−w+1 , ri−w+2 , . . . , ri−1 , ri was the whole sequence of previous requests, and as if S (i−w) was the initial configuration of servers. In other words, the objective function F(), originally defined by (1), is redefined in the following way:

123

A fast work function algorithm

191

      F S (i) = COPT S (i−w) , ri−w+1 , ri−w+2 , . . . , ri−1 , ri , S (i) + D S (i−1) , S (i) . (2) Note that an on-line algorithm ALG can only approximate the performance of the optimal off-line algorithm OPT. Indeed, OPT knows the whole input in advance, and serves the whole request sequence at minimum total cost. ALG is said to be competitive if its performance is estimated to be only a bounded number of times worse than that of OPT on any input. More precisely (Sleator and Tarjan 1985), let us denote with CALG (S (0) , σ ) the total cost incurred by ALG on the problem instance given by the initial server configuration S (0) and the request sequence σ . Denote with COPT (S (0) , σ ) the minimum total cost on the same input data. Let α be a constant. Then we say that ALG is α-competitive if there exists another constant β such that on every S (0) and every σ it holds: CALG (S (0) , σ ) ≤ α · COPT (S (0) , σ ) + β. There are many interesting results dealing with competitiveness. For instance, it can be proven (Manasse et al. 1990) that any hypothetical α-competitive algorithm for the k-server problem must have α ≥ k. Also, it is easy to check (Irani and Karlin 1997) that both GREEDY and BALANCE are not competitive, i.e. they have no bounded α. Finally, it has been proven in (Koutsoupias and Papadimitrou 1994; Koutsoupias 1999) that the WFA is (2k − 1)-competitive. The WFA can be regarded as the “most competitive” algorithm for the k-server problem since its estimated value of α is much lower than for any other known algorithm (Bartal and Grove 2000). It is widely believed that the WFA is in fact k-competitive (thus achieving the best possible α), but this hypothesis has not been proven except for some special cases (Bartal and Koutsoupias 2004; Chrobak et al. 1991; Koutsoupias and Papadimitrou 1996; Koutsoupias 2009). 3 Efficient implementation by network flows It is well known that the optimal off-line algorithm OPT can be realized relatively easily by network flow techniques (Bazaraa et al. 2004). As shown by Chrobak et al. (1991), finding the optimal strategy to serve a sequence of n requests by k servers reduces to computing the minimal-cost maximal flow on a suitably constructed network with 2n + k + 2 nodes. The same network flow techniques can be adjusted to implement the WFA. Indeed, according to the definition (1), one step of the WFA consists of k optimization problem instances plus some simple arithmetics. Thus one step of the WFA could be reduced to k network flow problem instances. It is true, however, that the optimizations within (1) are not quite equivalent to those computed by OPT, namely, there are additional constraints regarding final configurations of servers. Still, the construction from (Chrobak et al. 1991) can be used after an obvious modification that has been described for instance in (Rudec et al. 2009). Consequently, by following directly (1),

123

192

T. Rudec et al.

the i-th step of the WFA can be reduced to k minimal-cost maximal flow problem instances, each on a network with 2i + 2k nodes. Now we will describe a new way of reducing the WFA to network flows, which is less obvious than the straightforward way from (Rudec et al. 2009), but much more efficient. Our new implementation reduces the i-th step of the WFA described by (1) to only one minimal-cost maximal flow problem instance. The involved network computes directly the whole objective function F() from (1), and it consists of 2i + 2k + 2 nodes. Since the usual straightforward approach uses k flow problem instances of roughly the same size, our approach is approximately k times faster. The construction is shown in Fig. 2. As we can see from Fig. 2, our network consists of a source s¯ , a sink t¯, and three additional layers of nodes. The first layer represents the initial server configuration S (0), i.e. each s j ( j = 1, 2, . . . , k) corresponds to the starting location of one server. The left part of the second layer together with the third layer represents the request sequence; thereby both nodes r p and r p ( p = 1, 2, . . . , i) correspond to the location of



s1

r1

s2

r2

r3

sk

......

...

ri−1

ri

s1

s2

...

r1

r2

r3

ri−1

ri

t¯ Fig. 2 The network corresponding to the i-th step of the WFA

123

...

sk

A fast work function algorithm

193

the same ( p-th) request. The right part of the second layer specifies the current server configuration S (i−1), i.e. each s j ( j = 1, 2, . . . , k) corresponds to a location covered by a server immediately before the i-th step of the WFA. Figure 2 also shows how the nodes in our network are connected by arcs. All arcs are assumed to have unit capacities, but their costs are different. First, the source s¯ is connected to each s j ( j = 1, 2, . . . , k) by an arc whose cost is 0. Next, from each s j ( j = 1, 2, . . . , k) to each r p ( p = 1, 2, . . . , i) and as well to each sl (l = 1, 2, . . . , k) there is an arc with the cost equal to the distance between the corresponding locations. We denote that distance in a natural way as D(s j , r p ) and D(s j , sl ), respectively. An r p ( p = 1, 2, . . . , i) has only one outgoing arc leading to the associated r p ; the cost of that arc is −L, where L is a suitably chosen very large positive number. An r p ( p = 1, 2, . . . , i − 1) is connected to an rq (q = 2, . . . , i) only if q > p, and the cost of the corresponding arc is D(r p , rq ). Also, each r p ( p = 1, 2, . . . , i − 1) is connected to each sl (l = 1, 2, . . . , k) by an arc with the cost D(r p , sl ). Finally, the node ri has only one outgoing arc, leading to the sink t¯, whose cost is 0. Similarly, from any sl (l = 1, 2, . . . , k) there is only one outgoing arc that leads to the sink t¯, but its cost is   (3) X − D sl , ri where 1     X= D s j , ri . k−1 k

(4)

j=1

Let us note that, thanks to unit arc capacities, a maximal flow through the network shown in Fig. 2 must have the value k. Moreover, any maximal flow can be decomposed into k disjunct unit flows from s¯ to t¯ (Bazaraa et al. 2004). The correspondence between the i-th step of the WFA and flows in our network is based on the fact that each unit flow can be interpreted as a working plan for one particular server. Indeed, if the unit flow passes through the node s j , then the associated server starts from the location determined by s j . Also, flowing through an arc of the form s j → r p or r p → rq or s j → sl or r p → sl is interpreted as moving the server between the respective locations. Traversing an arc of the form r p → r p means that the server should serve the p-th request. Finally, if the flow reaches t¯ over ri , then the server ends up in the location of ri . Otherwise, if the flow exits to t¯ over some sl , then the final location of the server is determined by that particular sl . By combining interpretations of all unit flows together, we obtain a complete schedule for serving the whole sequence of i requests by k servers. Let us now assume that the considered maximal flow is also a minimal-cost maximal flow. Then the schedule obtained by interpreting that flow must have the following properties. – Since k unit flows pass through k distinct s j -s, the initial configuration of servers must fully coincide with S (0). – Since the values L are by assumption very large, the minimal-cost flow must use all arcs of the form r p → r p , which means that all requests are served.

123

194

T. Rudec et al.

– One unit flow passes through ri → ri and must end up in t¯ over ri . The remaining k − 1 unit flows must pass through k − 1 distinct sl -s. Consequently, the final configuration of servers covers the location of ri and coincides with S (i−1) in all of its remaining k − 1 locations. In other words, our schedule has exactly the form that is analyzed by the WFA, i.e. it starts from S (0), serves all requests r1 , r2 , . . . , ri , and ends up in some configuration S (i) which can be obtained from S (i−1) by moving one server to the location of ri if necessary. In our particular case, switching from S (i−1) to S (i) is accomplished by moving the server whose location in S (i−1) corresponds to sx , where sx denotes the unique node among s1 , s2 , . . . , sk which is not saturated by the flow. Let us now analyze the cost of our minimal-cost maximal flow. Since all arcs have unit capacities, the cost of the flow is simply the sum of costs of all saturated arcs. Let us divide the flow cost in two parts. Denote with C1 the sum of costs of saturated arcs that do not enter the sink t¯, and with C2 the sum of costs of saturated arcs that enter t¯. Then we can observe the following. – C2 is equal to the distance D(sx , ri ). Or according to the notation from (1), C1 is equal to D(S (i−1) , S (i) ). To check this claim, we take into account that the cost of the arc ri → t¯ is 0, and that all arcs of the form sl → t¯ are saturated except sx → t¯. By using (3) and (4) we indeed obtain that C2 =

k  

  X − D sl , ri

l=1 l =x

= (k − 1) · X −

k 

    D sl , ri + D sx , ri

l=1

  = (k − 1) · X − (k − 1) · X + D sx , ri   = D sx , ri . – C1 is obviously the total cost of serving the whole sequence r1 , r2 . . . , ri by starting from S (0) and ending up in the chosen S (i) . Moreover, since the whole flow is of minimal cost and C2 depends only on the chosen S (i) , we are sure that C1 is in fact the optimal cost of serving under the mentioned constraints. Or, by using the notation from (1), C1 is equal to COPT (S (0) , r1 , r2 , . . . , ri , S (i) ). – By summing C1 and C2 we obtain exactly the objective function F(S (i) ) described by (1). Moreover, since our network flow is free to choose any possible version of S (i) and since it achieves minimal cost, we are sure that the actually chosen version of S (i) is exactly the one that minimizes F(S (i) ). Thus our minimal-cost maximal flow in fact minimizes the objective function of the WFA and therefore produces the same decision how to serve the i-th request as the WFA would do. Putting it all together, we conclude that the i-th step of the WFA can be implemented by constructing the network from Fig. 2, by finding the minimal-cost maximal flow in that network, by determining the unique node sx among s1 , s2 , . . . , sk which is not

123

A fast work function algorithm

195

saturated by the computed flow, and by sending the server from the location specified by sx to the location of ri . To fully realize the described implementation, we need to incorporate a suitable method for finding optimal flows. 4 Simplified implementation for finite metric spaces In the previous section we have explained how the WFA can be implemented by network flows. The described implementation is relatively efficient since it reduces one step of the WFA to only one flow problem instance. Still, the complexity of the whole procedure is not negligible, and it depends on the size and density of the involved network. Note that our network shown in Fig. 2 is in fact quite dense; namely for the i-th step of the WFA and k servers it contains O(i + k) nodes and as many as ((i + k)2 ) arcs. In this section we will show that the network from Fig. 2 can be simplified, so that its density reduces by an order of magnitude. Such reduction would hopefully lead to even faster execution of the WFA. The proposed simplification is applicable only if the involved metric space is finite, which is, luckily enough, almost always the case in real-life applications. Note that the network from Fig. 2 contains several groups of arcs, whose size is proportional to k, k 2 , i, ik and i 2 , respectively. Note also that in real applications i tends to be very large, and k is in fact a constant much smaller than i. Under such circumstances, the overall density of the network is determined by the only group of arcs whose size rises with i 2 , and these are the arcs of the form r p → rq connecting the third layer of nodes with the second layer. Our simplification of the network concentrates on reducing the number of such “critical” arcs between r p -s and rq -s. After reduction, their number will grow only linearly with i, thus making the whole network much sparser. Let us now consider the situation where two requests in the request sequence r1 , r2 , . . . , ri , say r p and r y (y > p), refer to the same location. Consider also the optimal serving schedule obtained by the minimal-cost maximal flow in the network. Our simplification of the network is based on the following simple claim. Suppose that the server that has served the request r p does not serve any of the requests r p+1 , r p+2 , . . . , r y−1 . Then we can assume that the same server must serve r y . To prove the claim, note that the server A that has served r p remains at the corresponding location, let us call it a1 , until the request r y occurs at the same location. So A can serve r y without any movement at cost 0. Suppose now that, contrary to our claim, the considered optimal schedule serves r y by bringing another server B from some location b1 to the location a1 . We will show that then the considered schedule can be modified, so that it ends up in the same configuration of servers S (i), serves r y by A, and remains optimal. Modification depends on what originally happens with B after serving r y . – If B is supposed to stay at a1 until the end of the schedule, then we switch the roles of A and B. Thus we serve r y by A, leave A at a1 , and send B directly from

123

196

T. Rudec et al.

b1 to do the remaining work of A (if any) starting from some location a2 . By such switching, we introduce a new cost D(b1 , a2 ) but spare two costs D(b1 , a1 ) and D(a1 , a2 ). The total cost cannot rise since according to the triangle inequality D(b1 , a2 ) ≤ D(b1 , a1 ) + D(a1 , a2 ). – If B is supposed to serve afterwards some other requests starting at some location b2 , then we serve r y by A at cost 0, leave A to do its remaining work (if any), and send B immediately from b1 to b2 . In this way we introduce a new cost D(b1 , b2 ) but spare two costs D(b1 , a1 ) and D(a1 , b2 ). Again, the total cost cannot rise thanks to the triangle inequality that assures that D(b1 , b2 ) ≤ D(b1 , a1 ) + D(a1 , b2 ). By taking into account the above claim, we can propose the following modification of the network from Fig. 2, which deletes some obsolete arcs of the form r p → rq . The network is constructed in the same way as explained in Sect. 3, except that for each r p ( p = 1, 2, . . . , i − 1) we do the following. – Find the first r y (y > p) that refers to the same location as r p . Or put y = i is there is no such r y . – Connect r p to each rq ( p < q ≤ y) by an arc with the cost D(r p , rq ). – Do not connect r p to any rq such that q > y. It is easy to see that the obtained modified network is equivalent to the original one in the sense that it produces essentially the same minimal-cost maximal flow. Namely, according to our claim, the optimal flow in the original network can be chosen so that it does not saturate arcs of the form r p → rq (q > y). Indeed, saturation of an r p → rq (q > y) would mean that the corresponding server first serves r p and then serves none of r p+1 , r p+2 , . . . , r y−1 , r y , thus contradicting the claim. Consequently, the arcs omitted in the modified network are in fact not needed, and the modified network can really be used instead of the original network. Note that the proposed modification does not bring any changes if all requests r1 , r2 , . . . , ri appear at different locations. Indeed, the whole effort brings some gain only if in the request sequence the same locations reoccur many times. Such reoccurrence certainly must happen if the involved metric space M is finite. Let us now assess the number of arcs of the form r p → rq in the modified network under the assumption that M consists of m locations, where m  i. Choose one particular location a. Suppose that a occurs in the request sequence within r p1 , r p2 , . . . , r pu . Then there are ( p2 − p1 ) arcs leaving r p1 , ( p3 − p2 ) arcs leaving r p2 ,…, (i − pu ) arcs leaving r pu . By summing up all these numbers, we get that the total number of arcs going out of all r p -s that correspond to the location a is ( p2 − p1 ) + ( p3 − p2 ) + · · · + (i − pu ) = i − p1 . So there are less than i arcs that correspond to the location a. Since the same reasoning can be repeated for any chosen location and there can be at most m different locations involved, the sum over all locations cannot exceed m · i. Consequently, since m is a constant, the total number of arcs of the form r p → rq in the modified network grows linearly with i. So we have proven that for a finite metric space M our modification substantially reduces the network density.

123

A fast work function algorithm

197

5 Actual computation of optimal network flows As already explained, the i-th step of the WFA with k servers reduces to computing the minimal-cost maximal flow in a suitable network consisting of 2i + 2k + 2 nodes. The structure of that network has been described in Sects. 3 and 4. In this section we are concerned with actual computation of the required flow. Although such computation can be accomplished by any general flow-finding method, we will propose a customized algorithm, which takes into account special properties of the involved network in order to gain additional speedups. The results of this section are related to those already published in the previous paper by Rudec et al. (2009). The main difference among the two papers is that (Rudec et al. 2009) is mainly concerned with the optimal off-line algorithm OPT, and with differently constructed networks. Still, both works share similar ideas for customization of general flow-finding methods. To avoid repetition, this section presents such ideas only very briefly. Also, the section provides no correctness proofs since they would be analogous to those already presented in (Rudec et al. 2009). Our customized algorithm is based on the general flow augmentation, method (Bazaraa et al. 2004), and it follows the outline from (Chrobak et al. 1991). Thus the method starts with the null flow and proceeds with k iterations. In each iteration the value of the current flow is augmented by one unit, so that it still has the minimal cost among those with the same value. After k iterations, the desired minimal-cost maximal flow with the value k is obtained. In each of k iterations, flow augmentation is achieved by finding a path in the corresponding displacement network (Bazaraa et al. 2004), which goes from the source s¯ to the sink t¯ and has the minimal sum of arc costs. Such shortest path determines the unit flow that has to be superimposed onto the current flow in order to obtain augmentation by one unit. The described shortest path problem can be solved by Dijkstra’s procedure (Jungnickel 2005). However, since Dijkstra can be applied only to networks whose arc costs are nonnegative, a suitable preprocessing of arc costs is needed. The details of that preprocessing can be found in (Edmonds and Karp 1972). It is well known that Dijkstra’s procedure has a quadratic complexity regarding the number of network nodes. Also, the complexity of the mentioned preprocessing is of the same order. Each displacement network has the same size as the original network, i.e. it consists of O(i + k) nodes. Thus without any optimization one iteration of the flow augmentation method would take O((i + k)2 ) time. Our customized algorithm has been obtained from the above described flow augmentation method by modifying each of its k iterations separately. Thus each iteration of the algorithm tries to find the required shortest path faster than it would be possible with the standard Dijkstra’s procedure. All modifications rely on special properties of particular displacement networks. Very good speedups are obtained in the first two iterations, while improvements of the remaining iterations are only moderate. In the first iteration the displacement network is identical to the original network described in Sects. 3 and 4. It is easy to see that the shortest path from s¯ to t¯ in that network must include all arcs r p → r p ( p = 1, 2, . . . , i) having negative costs −L. Thus the path we are looking for is almost completely determined in advance and it has the form

123

198

T. Rudec et al.  s¯ → sz → r1 → r1 → r2 → r2 → · · · → ri−1 → ri−1 → ri → ri → t¯.

Here sz denotes a node from the first layer chosen so that the cost of its arc sz → r1 is minimal among all arcs connecting the first layer with r1 . So our modification of the first iteration consists of the following. – No general path-finding procedure such as Dijkstra is used. – Instead, the shortest path is directly constructed according to the above specification. Consequently, the first iteration is accomplished in the time O(k) needed to find the minimum among k values, which is several orders of magnitude faster than with Dijkstra. After the first iteration, the arcs s¯ → sz , ri → t¯ and r p → r p ( p = 1, 2, . . . , i) are permanently removed from the network, since it can be proven that they cannot be used anymore in the remaining part of the algorithm. An interesting consequence of such removal is that the constants −L are never explicitly used, and therefore do not have to be specified at all. In the second iteration, the remaining displacement network turns out to be acyclic. Thus it is possible to find the shortest path in that network by simple one-way scanning of nodes. More precisely, the scanning procedure should process nodes according to the following “topological” ordering (Jungnickel 2005): s¯ , s1 , s2 , . . . , sz−1 , sz+1 , . . . , sk , r1 , sz , r2 , r1 , r3 , r2 ,   . . . , ri−1 , ri−2 , ri , ri−1 , s1 , s2 , . . . , sk , t¯. Here sz is the node determined in the previous iteration. For each node in the above sequence, the procedure finds its distance from s¯ by taking into account only the costs of its incoming arcs and the already computed distances for its direct predecessors. So our modification of the second iteration consists of the following. – No general path-finding procedure such as Dijkstra is used. – Instead, the shortest path is found by scanning the nodes in the topological order shown above. Since the ordering of nodes is known in advance, sorting the nodes takes no time. The scanning procedure takes time proportional to the number of arcs. For a sparse network, the number of arcs is an order of magnitude smaller than the squared number of nodes. Consequently, if our network has been simplified according to Sect. 4, then scanning in topological order runs considerably faster than Dijkstra. After the second iteration, the first and the last arc from the corresponding shortest path are again permanently removed from the network. Namely, it can be shown that those two arcs cannot be used in the remaining iterations of the algorithm. In the third, fourth, or any of the remaining iterations, the displacement network is not acyclic. Therefore the associated shortest path problem cannot be solved so simply as in the first two iterations. Still, it is possible to use a slightly modified version of Dijkstra’s procedure whose computing time is better than for the standard version, although within the same order of magnitude.

123

A fast work function algorithm

199

After the third, fourth, or any further iteration, it is again possible to delete two arcs that cannot be used in forthcoming iterations. Thanks to such deletions, the displacement network gradually becomes simpler and simpler, thus enabling faster execution. By analogy with the experimental results from (Rudec et al. 2009), we can estimate that the presented customized algorithm for computing network flows is at least 4 times faster than the general flow augmentation method. The speedup is expected to be even larger if the number of servers k is small.

6 Experimental evaluation of serving costs In order to do experiments, we have developed a C++ program that implements the WFA by using the network model from Sect. 3. The program also incorporates all simplifications and modifications from Sects. 4 and 5. To allow comparison, we have also realized two simple heuristics, GREEDY and BALANCE, as well as the random algorithm RAND and the optimal off-line algorithm OPT. All programs have been tested on the same k-server problem instances. The obtained serving costs have been measured and recorded. Thanks to the improvements from Sects. 3, 4 and 5, our implementation of the WFA turned out to be considerably fast, thus allowing experiments with very long sequences of requests. In our experiments we have used problem instances that are similar to those studied in the papers (Bartal et al. 2000; Bartal and Koutsoupias 2004; Bein et al. 2002; Chrobak et al. 1991; Fiat et al. 1994; Koutsoupias and Papadimitrou 1996; Koutsoupias 2009; Manasse et al. 1990). Thus we have considered the situation where the locations form a line or a circle, or the situation where the number of servers is one less than the number of locations, etc. The full list of results is constantly being updated and extended, and it can be found on our web site http://art.etfos.hr. In this paper we present only the most interesting part of the experiments, which are based on 27 problem instances. The selected instances have the following common properties. – The underlying metric space is always a grid consisting of locations in a plane with integer coordinates. – The distance among locations is computed as the Euclidean distance. – The initial configuration of servers is chosen by hand, so that the servers are spread through the metric space as evenly as possible. – The request sequence has always the length n = 10,000, which proved to be enough to expose the behavior of the WFA even on large metric spaces and with large numbers of servers. On the other hand, the selected problem instances differ in three important aspects. – The distribution of requests among locations can be uniform, moderately nonuniform, or highly non-uniform. – The number of possible locations m in the metric space can be small (m = 21), medium (m = 50), or large (m = 200). – The number of servers k can be small (k = 2), medium (k = 5), or large (k = 20).

123

200

T. Rudec et al.

In our experiments, special consideration has been given to distribution of requests among locations. For each problem instance, the sequence of requests with a desired distribution has been generated automatically by an appropriate random number generator. As we have said before, distribution can be uniform or to some extent non-uniform. Uniform distribution means that a new request can appear at any location within the considered metric space with the same probability. Non-uniform distribution means that certain locations occur more frequently than the others. Non-uniform distribution is more realistic, and it allows algorithms such as the WFA to learn from “history”. In our problem instances non-uniform distribution of requests among locations is effectively produced in the following way. Each location is first assigned a weight. Then, within the sequence of requests, a location is chosen with the probability equal to its weight divided by the sum of all weights. Some of our problem instances have been designed to exhibit a moderately non-uniform distribution, and some to exhibit a highly non-uniform distribution. The details depend on the number of possible locations m. For spaces with m = 21 or m = 50, moderately non-uniform means that about 10% of locations have the weight 10 and the remaining 90% the weight 1, while highly non-uniform means that 10% of weights are equal to 30 and the rest are 1. For metric spaces with m = 200 the percentage of locations with large weights is again about 10%, but the ratio of a large versus a normal weight is higher, so that it amounts to 50:1 in the moderately non-uniform case, or to 100:1 in the highly non-uniform case. The results of the experiments are presented in Tables 1, 2, and 3. Each table corresponds to problem instances with a certain type of distribution of requests among locations. Table 1 summarizes the results for uniform distribution, Table 2 for moderately non-uniform distribution, and Table 3 for highly non-uniform distribution. Within each table, the associated problem instances are ordered according to the number of possible locations m and the number of servers k. A table column corresponds to the instance with a certain combination of m and k, while a table row to a particular algorithm: RAND, GREEDY, BALANCE and the WFA, respectively. A single table entry records the performance (total cost) of the corresponding algorithm on the corresponding instance. Performance is expressed relatively, as the ratio between the cost incurred by that algorithm and the optimal cost incurred by OPT. In this way, we obtain some kind of empirical measurement of competitiveness. Table 1 Experimental results—serving costs—uniform distribution of requests among locations Number of locations (m)

21

Number of servers (k)

2

5

1.52

2.50

1.11

Cost of RAND Cost of OPT Cost of GREEDY Cost of OPT Cost of BALANCE Cost of OPT Cost of the WFA Cost of OPT

123

50 20

200

2

5

6.45

1.52

2.57

1.22

2.48

1.10

1.28

1.56

2.96

1.13

1.28

2.93

20

2

5

20

5.18

1.54

2.66

5.50

1.16

1.45

1.10

1.13

1.18

1.28

1.54

1.82

1.28

1.53

1.68

1.12

1.20

1.56

1.12

1.17

1.22

A fast work function algorithm

201

Table 2 Experimental results—serving costs—moderately non-uniform distribution of requests Number of locations (m)

21

50

Number of servers (k)

2

5

Cost of RAND Cost of OPT Cost of GREEDY Cost of OPT Cost of BALANCE Cost of OPT Cost of the WFA Cost of OPT

1.47

2.44

1.23

20

200

2

5

6.51

1.44

2.38

1.56

2.76

1.16

1.31

1.74

3.56

1.21

1.53

2.78

20

2

5

20

5.27

1.43

2.22

4.77

1.27

1.63

1.19

1.44

2.14

1.27

1.59

1.94

1.26

1.54

2.02

1.16

1.28

1.58

1.18

1.31

1.65

Table 3 Experimental results—serving costs—highly non-uniform distribution of requests Number of locations (m)

21

Number of servers (k)

2

5

1.50

2.51

1.44

Cost of RAND Cost of OPT Cost of GREEDY Cost of OPT Cost of BALANCE Cost of OPT Cost of the WFA Cost of OPT

50 20

200

2

5

6.55

1.41

2.33

2.24

2.68

1.26

1.35

1.98

3.10

1.29

1.70

2.72

20

2

5

20

5.26

1.40

2.15

4.76

1.60

1.75

1.26

1.61

3.24

1.29

1.70

2.00

1.26

1.55

2.24

1.23

1.48

1.58

1.20

1.35

1.86

Tables 1, 2, and 3 contain data that can be analyzed in various ways. By comparing values within or across tables, it is possible to determine how the performance of a chosen algorithm is affected by particular problem parameters, such as the number of locations m, the number of servers k, or the type of distribution of requests among locations. Also, it is possible to identify relative strengths and weaknesses of different algorithms when they are applied to a chosen type of problem instances. Listed below are some simple facts that are clearly visible from the presented data. – The WFA performs better than GREEDY or BALANCE if the distribution of requests among locations is non-uniform. Moreover, as the distribution becomes more non-uniform, the relative advantage of the WFA becomes more significant. – The actual serving cost incurred by the WFA is never too large compared to the optimal cost. The empirically measured ratio of costs is between 1.1 and 2.93, thus it is much below the theoretical upper bound 2k − 1, or even below the widely accepted bound k. – GREEDY usually performs better than the WFA if the distribution of requests among locations is uniform. Still, the difference in performance is quite small. – The performance of BALANCE is very hard to predict. For large and non-uniform problem instances, it is usually somewhere between GREEDY and the WFA, but sometimes it becomes surprisingly bad and worse than both GREEDY and the WFA.

123

202

T. Rudec et al.

– As far as the considered algorithms are concerned, the worst results are produced by RAND. Those results can be regarded as a limit showing how bad an on-line algorithm could be. The other considered algorithms are quite far from that limit, i.e. their performance is much closer to optimal than to worst possible. Thus even the simplest heuristic rules work much better than random guessing. 7 Suitability for real-life problems In this section we are concerned with the question whether our fast implementation of the WFA is indeed fast enough to be used in real applications. An obvious obstacle is that the complexity of the WFA rises from step to step, so that the algorithm sooner or later becomes intolerably slow or runs out of memory. For instance, in the previously described series of experiments we were able to run the WFA on sequences of 10,000 requests, but we still would not be able to run it on 50,000 requests. The described obstacle is an inherent property of the standard version of the WFA, and it cannot be compensated by any kind of implementation tricks. Keeping this in mind, we can conclude that the only really practical way of using the WFA is switching to its window version - the w-WFA. Indeed, the complexity of the w-WFA does not change from step to step. Moreover, according to Baumgartner et al. (2010), the wWFA assures similar quality of serving as the original WFA if the window size w is reasonably large. All improvements of the WFA described in the previous sections can as well be applied to the w-WFA, thus making it quite useful and efficient. – Indeed, the network corresponding to the i-th step of the w-WFA can have the same structure as specified in Sect. 3 and shown in Fig. 2, provided that the first layer represents the server configuration S (i−w) instead of S (0) , and that the next two layers represent only the last w requests ri−w+1 , ri−w+2 , . . . , ri−1 , ri instead of the whole sequence of requests. The network then consists of 2w +2k +2 nodes, i.e. its size does not rise with i. – The network can further be simplified according to Sect. 4. There will be some reduction of network density if the window size w is large enough and if requests frequently reoccur at the same locations. – While computing the optimal network flow, each iteration of the flow augmentation method can be modified as described in Sect. 5. In order to test the speed of such w-WFA implementation, we have developed another C++ program and performed another series of experiments. This time we have mimicked the forest fire application on a map of Croatia and Germany, respectively. The first map was larger than shown in Fig. 1 and consisted of m = 25 locations, while the second one was even larger and comprised m = 15,112 locations. For each map we tried different numbers of servers k and different window sizes w. We measured the response time of the algorithm on a conventional computer with a 2.4 GHz processor and 2 GBytes of memory. The obtained results are shown in Table 4. In Table 4 a column corresponds to a particular pair of m and k, while a row corresponds to a particular window size w. A table entry records the measured time for the corresponding combination of m, k and w. More precisely, this is the average

123

A fast work function algorithm

203

Table 4 Experimental results—computing times per step in milliseconds Number of locations (m)

25

Number of servers (k)

3

15,112 10

3

10

Time for 50-WFA

0.358

0.570

0.508

1.132

Time for 100-WFA

0.692

1.350

1.084

2.700 4.770

Time for 150-WFA

1.148

2.372

1.912

Time for 200-WFA

1.714

3.570

2.886

7.230

Time for 250-WFA

2.326

4.850

4.078

10.088

time needed by the w-WFA to decide how to serve a new request. Each average is measured over a sequence of 500 uniformly distributed requests. Uniform distribution is used because it is computationally more demanding, namely it usually forces servers to move from step to step. Note that in Table 4 the values for the same w and k but different m-s are not equal. There are two reasons for this phenomenon. The first reason is that distances in two maps are evaluated differently - in the case of Croatia they are retrieved from a matrix and not computed, while for Germany they are computed from location coordinates. The second and even more important reason is that in a smaller metric space a new request occurs more often at a location already covered by a server - then the algorithm serves that particular request in time 0, thus decreasing the average computing time over the whole sequence of requests. As we can see from Table 4, the computing time of one step of the w-WFA ranges from 1 millisecond or less for smaller windows to 10 milliseconds for larger windows. The measured values seem to be acceptable for most applications that have been mentioned in Sect. 2 except those dealing with computer memory or disks. So if an application can tolerate few milliseconds of delay, it should use the w-WFA instead of simple heuristics, thus assuring a better quality of service. At this point it must be stressed again that switching from the original WFA to the w-WFA really does not compromise the performance in terms of serving quality. Namely, as noticed by Baumgartner et al. (2010), the w-WFA with a reasonably large w always produces at least similar results as the original WFA, and in most cases even exactly the same results. Thus relative serving costs of the w-WFA, i.e. costs of the w-WFA divided by the corresponding costs of OPT, are in fact the same as for the WFA and behave as indicated by Table 1. The latter is particularly true for the problem instances used in Table 4, where each application of the w-WFA with w ≥ 150 produces exactly the same serving as the original WFA would do. More precisely, for the entries in Table 4 with w ≥ 150 the relative costs are columnwise: 1.21, 1.36, 1.14 and 1.18, which is consistent with Table 1. 8 Conclusions In this paper we have shown that the WFA can indeed be implemented more efficiently than by directly following its definition. Our improved implementation combines three features:

123

204

T. Rudec et al.

– a better network model, which reduces each step of the WFA to only one minimalcost maximal flow problem instance; – further simplification of that network model, which is applicable if the involved metric space is finite; – a customized procedure for computing optimal network flows, which takes into account special properties of the involved networks. Depending on the number of requests n and the number of servers k, our implementation can be from 8 times to several tens of times faster than the conventional implementation. In this paper, we have also presented some additional results based on our implementation, which are interesting from the theoretical as well as from the practical point of view. Namely: – We have provided experimental evidence that the WFA can really assure better costs of serving than simple heuristics such as the greedy or the balanced algorithm. Thanks to the speed of our programs, we were able to run the WFA on very large problem instances with n up to 10,000 and k up to 20. The experiments have shown that the WFA provides better service, but only if the request sequence is non-uniformly distributed among locations. – We have demonstrated by experiments that the WFA can be made fast enough to accommodate most applications. However, instead of the standard version of the WFA, it is necessary to use its window version. With a window size w up to 250 the window version becomes a good alternative to simple heuristics, since it provides better quality of serving at a response time within few milliseconds on a conventional computer.

References Bartal Y, Chrobak M, Larmor LDL (2000) A randomized algorithm for two servers on the line. Inf Comput 158:53–69 Bartal Y, Grove E (2000) The harmonic k-server algorithm is competitive. J ACM 47:1–15 Bartal Y, Koutsoupias E (2004) On the competitive ratio of the work function algorithm for the k-server problem. Theor Comput Sci 324:337–345 Baumgartner A, Rudec T, Manger R (2010) The design and analysis of a modified work function algorithm for solving the on-line k-server problem. Comput Inf 29:681–700 Bazaraa MS, Jarvis JJ, Sherali HD (2004) Linear programming and network flows, 3rd edn. WileyInterscience, New York Bein W, Chrobak M, Larmore LL (2002) The 3-server problem in the plane. Theor Comput Sci 289: 335–354 Chrobak M, Karloff H, Payne TH, Vishwanathan S (1991) New results on server problems. SIAM J Discret Math 4:172–181 Edmonds J, Karp RM (1972) Theoretical improvements in algorithmic efficiency for network flow problems. J ACM 19:248–264 Fiat A, Rabani Y, Ravid Y, Schieber B (1994) A deterministic O(k)-competitive k-server algorithm for the circle. Algorithmica 11:572–578 Irani S, Karlin AR (1997) Online computation. In: Hochbaum D (ed) Approximation algorithms for NP-hard problems. PWS Publishing Company, Boston, pp 521–564 Jungnickel D (2005) Graphs, networks and algorithms. Springer, Berlin

123

A fast work function algorithm

205

Koutsoupias E, Papadimitrou C (1994) On the k-server conjecture. In: Leighton FT, Goodrich M (ed) Proceedings of the 26-th annual ACM symposium on theory of computing, Montreal, Quebec, Canada, May 23–25. ACM Press, New York, pp 507–511 Koutsoupias E, Papadimitrou C (1996) The 2-evader problem. Inf Process Lett 57:249–252 Koutsoupias E (1999) Weak adversaries for the k-server problem. In: Beame P (ed) Proceedings of the 40th annual symposium on foundations of computer science. IEEE, New York, pp 444–449 Koutsoupias E (2009) The k-server problem. Comput Sci Rev 3:105–118 Manasse M, McGeoch LA, Sleator D (1990) Competitive algorithms for server problems. J Algorithms 11:208–230 Rudec T, Baumgartner A, Manger R (2009) A fast implementation of the optimal off-line algorithm for solving the k-server problem. Math Commun 14:119–134 Sleator D, Tarjan RE (1985) Amortized efficiency of list update and paging rules. Commun ACM 28: 202–208

123

Suggest Documents