Placement of Nodes in an Adaptive Distributed Multimedia Server

Balázs Goldschmidt¹, Tibor Szkaliczki², and Laszlo Böszörmenyi³

¹ Budapest University of Technology and Economics, [email protected]
² Computer and Automation Research Institute of the Hungarian Academy of Sciences, [email protected]
³ University Klagenfurt, Department of Information Technology, [email protected]

Institute of Information Technology, University Klagenfurt
Technical Report No TR/ITEC/04/2.06, February 2004

Partial support of the EC Centre of Excellence programme (No. ICA1-CT-2000-70025) and the Hungarian National Science Fund (Grant No. OTKA 42559) is gratefully acknowledged.
Abstract. Multimedia services typically need not only huge resources but also a fairly stable level of Quality of Service. This requires server architectures that enable continuous adaptation. The Adaptive Distributed Multimedia Server (ADMS) of the University Klagenfurt is able to dynamically add nodes to and remove nodes from the actual configuration, thus realizing the offensive adaptation approach. This paper focuses on the optimal placement of nodes for hosting certain ADMS components (the so-called data collectors, collecting and streaming stripe units of a video) in the network. We propose four different algorithms for host recommendation and compare the results gained by running their implementations on different test networks. The greedy algorithm seems to be a clear loser. Among the three other algorithms (particle swarm, linear programming and incremental) there is no single winner; they can be applied in a smart combination.
1 Introduction

Even highly sophisticated multimedia servers with a distributed architecture, such as the Darwin server of Apple [1] or the Helix architecture of RealNetworks Inc. [2], are static in the sense that actual configurations of the distributed server must be defined manually. The Adaptive Distributed Multimedia Server (ADMS) of the University Klagenfurt [3] is able to dynamically add nodes to and remove nodes from the actual configuration. Thus, ADMS realizes the offensive adaptation approach [4]. In case of a shortage of resources, instead of reducing the quality of the audio-visual streams by the usual, defensive, stream-level adaptation, it tries to migrate and/or replicate functionality (i.e. code) and/or audio-visual data on demand. It is crucial for the performance of the system to find an optimal placement for the server nodes. The host recommender component of the ADMS system determines the host computers in the network where the server components (also called applications) should be loaded. The optimal location depends on the actual load and capacity of the available nodes and links, and on the actual set of client requests.

The distributed multimedia server architecture [3] has different components that can be located on different hosts of the network. Data managers or servers store and retrieve the media data. Data collectors or proxies collect the data from the servers and stream them to the clients. The host recommender needs actual information about the state of the underlying resources. The server infrastructure continuously monitors the usage of the resources in the host nodes and the actual QoS parameters of the network links. It collects information on the location of the current clients and the required QoS parameters of their requests.

This paper deals with the configuration recommendation algorithms. We examine what kinds of mathematical approaches can be applied to adaptive host recommendation.
We have to consider the following factors in order to find an efficient solution for the practical problem:
– In order to keep the complexity manageable, the number of optimization criteria must be strongly limited. We use the following three requirements: (1) the number of rejected clients, (2) the quality of the accepted requests, (3) the reserved bandwidth in the network.
– The search space of the optimization can be drastically reduced based on the measured data and practical considerations. Graphs of computer networks have a special structure, so more efficient algorithms can be found than in the case of general graphs. For example, the differences between communication channels within a local area network are negligible. For this reason, we partition the whole network into subnets and work on a network of subnets instead of the network of individual computer nodes.
– A relevant requirement for the system is to serve the client requests within a specified time. Therefore the optimization has to finish within a given time bound as well. This excludes the application of algorithms with long running times.
For the current investigation we assume that only the proxies can be moved, because replication or migration of the servers storing the audio-visual data is a much more time-consuming task than that of the proxies. Using a proxy, many clients can share a common data stream of a server-proxy connection. Therefore, the proper location of the proxies can already greatly reduce the load of the network.
2 Related Work

Finding the optimal deployment of proxies in a network is a well-known problem in the literature. Most of the works, however, deal only with (1) static configurations, (2) web proxies, and (3) caching problems [5,6,7]. Static configuration means that proxies are deployed once, and their placement cannot change later, or only at high cost. Web proxies, moreover, have to serve the clients with relatively short documents and images that have to arrive unmodified. Online multimedia data delivery serves huge data streams, where some modifications are still acceptable; here the emphasis is on the timing constraints. Finally, we are currently not interested in caching problems, because they are orthogonal to the offensive adaptation problem. This issue was discussed elsewhere [4]. Therefore, we cannot use the former results directly. We looked at the mathematical background and found the facility location problem (FLP), which is an intensively studied problem in operations research. The problem is as follows. Given the set of facility candidates and the set of demand points, some of the facility candidates have to be selected, and each of the demand points has to be assigned to one of the selected facility candidates. The cost/profit function gives the total cost/profit belonging to the actual facility selection and assignment of demand points. The objective is to find a solution that optimizes this function. A detailed description of the problem can be found in [8,9]. The problem of recommending the optimum set of host nodes can be modelled as a variant of the facility location problem. However, some significant differences prohibit the direct application of FLP approximation algorithms to host recommendation. First, while the cost of a facility is usually a constant value in the case of the FLP, in the current problem it depends on the maximum bandwidth required by the clients assigned to the proxy.
Furthermore, the limited bandwidth of the subnets must be taken into account as well. In [8,10] it is shown that the FLP is NP-hard for general graphs. In spite of this, many constant-approximation algorithms with polynomial running time have been published for the facility location problem; they are usually combined with each other. Linear programming techniques play a key role in many algorithms with constant approximation ratio [11,12,13]. The best approximation ratio was achieved by combining the primal-dual method with the greedy augmentation technique [14]. Local search heuristics have been applied to the FLP since the early sixties [15]; here any feasible initial solution is iteratively improved by adding a facility, deleting a facility or changing the location of a facility. The first linear-time algorithm uses a hierarchically greedy approach, where first a region is selected and then a facility is recursively chosen within the region [16]. As a different approach, evolutionary algorithms (EA) also provide an efficient way of finding good solutions for NP-hard problems. In [17] an EA is proposed to solve the P-median problem, which is closely related to the FLP. A kind of evolutionary algorithm, particle swarm optimisation, has proved effective in a wide range of combinatorial optimisation problems too [18]. Lagrangian relaxation can be used to find a lower bound for the solution, and it can be applied to judge the accuracy of other solutions [19].
3 The problem model

In order to implement and compare different algorithms that solve the problem, we have defined the following model and metrics. First, we describe the initial considerations about the multimedia application and the underlying network infrastructure, then we show the cost function used in the measurements. From the current point of view the Adaptive Distributed Multimedia Server consists of Data Managers (DM) and Data Collectors (DC) (see figure 1). Multimedia data (videos) are stored on the Data Managers. The videos are sliced, and the resulting stripe units are distributed to the Data Managers. When needed, the video is recollected and streamed to the clients (C) by the Data Collectors. This technique helps both network and node resource load balancing.
Fig. 1. The basic architecture of ADMS (DM - Data Manager, DC - Data Collector, C - client)
According to [20], Data Managers that contain stripe units of the same video should be kept as close to each other as possible, practically on the same subnet, because that configuration gives the best performance. Based on this result, our model considers such a group of Data Managers a single server. The media is collected from the Data Managers by Data Collectors (proxies). They can be loaded on any node that hosts a Vagabond2 Harbour [21]. These nodes are considered candidates that may play the role of a Data Collector. Part of the problem is the recommendation of hosts to be used as Data Collectors (hence the term Host Recommender). The client is the third kind of component in the model. The clients connect to Data Collectors and get the desired media via streaming from them. The clients define their requests as lists that contain QoS requirements in decreasing order of preference. These requirements describe streaming parameters like bitrate and delay jitter, the required media, and the start time of broadcasting. In the current model, however, we assume that all clients want to see the same video at the same time. We also assume that the demand list has a last, default element that represents the situation when the client's request is rejected.
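The request model just described can be sketched as a small data structure; the class and field names below are illustrative, not part of ADMS:

```python
# Sketch of the client request model: QoS demands are listed in decreasing
# order of preference, and the last, default entry stands for rejection.
from dataclasses import dataclass

@dataclass
class Demand:
    bitrate: int    # kbit/s
    jitter: float   # ms

@dataclass
class Client:
    subnet: int
    demands: list   # ordered, best first; demands[-1] means "rejected"

c = Client(subnet=2, demands=[Demand(512, 20), Demand(256, 40), None])
rejected_index = len(c.demands) - 1  # index chosen when the client is refused
print(rejected_index)  # → 2
```

A configuration then only has to store, per client, the index of the satisfied demand; the rejection case needs no special handling.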
In our current model only the proxies can be deployed dynamically. The locations of the servers and those of the clients are not to be modified. The network model is basically a graph whose nodes are called areas. An area is either a subnet (including backbone links) or a router that connects subnets. We handle them similarly because from our point of view they have the same attributes and provide the same functionality. The edges of the graph are the connections between the routers and the subnets they are part of. See figure 2.
Fig. 2. The transformation of an example network to our graph representation. On the upper part circles are subnets, hexagons are routers and the links are network links between routers. On the lower diagram the style differences among the nodes are only to help understanding the transformation.
We assume that we know the subnet location of the clients and servers, the candidate nodes, and the attributes of the network, the latter given as the attributes of each area: bandwidth, delay jitter, latency, delay, and loss rate. The route between any two nodes is described as a list of the neighbouring areas. The solutions of a problem are described as possible configurations that provide the following information:
– for each client, the index of the QoS demand that has been chosen to be satisfied
– for each client, the candidate that hosts the proxy for the client
– for each candidate, the server where the video should be collected from
In order to compare different solutions, we have defined a cost function that takes the initial specifications and a possible configuration as input parameters. Using this information the function calculates the network resource needs (total allocation and over-allocation), the number of rejected clients (those whose chosen demand is the last, default one), the sum of the chosen demand indexes of the clients (linear badness), and the so-called exponential badness, defined as

    ∑_{c ∈ C} 2^{i_c}

where C is the set of clients, and i_c is the index of the chosen demand for client c. This last metric is useful if we want to prefer 'fair' to 'elitist' configurations.⁴
⁴ Given three clients, consider the following demand lists: (4,5,5), meaning that client 1 has its 4th demand satisfied, and clients 2 and 3 their 5th. Let us assume that there are two possible
The calculation of the network resource allocation is the following. For each client we reserve the required QoS on every area on the route from the client to its proxy. Then for each proxy we reserve the maximum QoS its clients need on every area on the route from the proxy to its server. The QoS reservations are calculated using rectified QoS values, which are nonnegative real numbers. The conversion between the original and the rectified values is shown in Table 1. After rectification, greater QoS values mean higher demand.

    parameter  | original | rectified
    bandwidth  |    x     |    x
    jitter     |    x     |   1/x
    delay      |    x     |   1/x
    latency    |    x     |   1/x
    loss rate  |    x     |   1/x

Table 1. The conversion between original and rectified QoS parameters
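The conversion in Table 1 can be read as a small helper function; the parameter names used here are illustrative:

```python
# Sketch of the original→rectified QoS conversion: bandwidth is kept as-is,
# while jitter, delay, latency and loss rate are inverted, so that a larger
# rectified value always means a stricter demand.
def rectify(param, value):
    if param == "bandwidth":
        return value
    return 1.0 / value  # jitter, delay, latency, loss rate

print(rectify("bandwidth", 128))  # → 128
print(rectify("jitter", 20))      # → 0.05
```

For example, a stream with 20 ms jitter tolerance (rectified 0.05) is a weaker demand than one with 10 ms tolerance (rectified 0.1), as intended.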
First, for each area we calculate the local cumulated reservation (the area has reservations r1, r2, …, rn; the cumulated reservation is rc; the parameters are all rectified):

    bandwidth_rc = ∑_{i=1..n} bandwidth_ri
    jitter_rc    = max_{i=1..n} jitter_ri
    delay_rc     = max_{i=1..n} delay_ri
    latency_rc   = max_{i=1..n} latency_ri
    lossrate_rc  = max_{i=1..n} lossrate_ri
The equations are based on experiments in which the impact of altering the load on a network link was measured. Then we calculate the total reservation by summing the local cumulated reservations for each parameter separately (again using rectified parameters):

    parameter_t = ∑_{a ∈ A} a.parameter_rc
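The reservation arithmetic above can be sketched as follows; the dictionary-based representation is illustrative:

```python
# Sketch of the reservation rules on rectified values: bandwidths on an
# area add up, the other parameters take the maximum of the individual
# reservations, and the total reservation sums the local values over areas.
def cumulate(reservations):
    """Local cumulated reservation of one area from its reservations."""
    out = {"bandwidth": sum(r["bandwidth"] for r in reservations)}
    for p in ("jitter", "delay", "latency", "lossrate"):
        out[p] = max(r[p] for r in reservations)
    return out

def total(areas):
    """Total reservation: per-parameter sum of the local cumulated values."""
    return {p: sum(a[p] for a in areas)
            for p in ("bandwidth", "jitter", "delay", "latency", "lossrate")}

r1 = {"bandwidth": 64, "jitter": 0.05, "delay": 0.01, "latency": 0.005, "lossrate": 10.0}
r2 = {"bandwidth": 32, "jitter": 0.02, "delay": 0.02, "latency": 0.004, "lossrate": 10.0}
a1 = cumulate([r1, r2])
print(a1["bandwidth"], a1["jitter"])  # → 96 0.05
```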
The over-allocation is calculated as

    parameter_o = ∑_{a ∈ A} max(a.parameter_rc − a.parameter_m, 0)

where a.parameter_m is the maximum reservable amount of the parameter in area a (in rectified value), and A is the set of areas. Given two costs, that cost is less which has (in decreasing order of preference):
– less over-allocation (we cannot afford over-allocation),
– less rejection (we want to increase the number of accepted clients),
– less exponential badness (we prefer fairness),
– less total allocation (we want to minimize the network load).
Using this cost metric we were able to compare several algorithms.

⁴ (continued) improvement lists: (1,5,5), where the first client gets the best possible (elitist) QoS, and (4,4,4), where everybody gets a little bit better service (fair). The linear badnesses are respectively (4+5+5) = 14, (1+5+5) = 11, and (4+4+4) = 12. The exponential badnesses are 2⁴ + 2⁵ + 2⁵ = 80, 2¹ + 2⁵ + 2⁵ = 66, and 2⁴ + 2⁴ + 2⁴ = 48. The second list has the best linear badness ('elitist' improvement), the third list has the best exponential badness ('fair' improvement).
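The four-way preference order can be sketched with lexicographic tuple comparison; the total-allocation figures below are made up:

```python
# Sketch of the cost ordering: a cost is a tuple compared lexicographically
# in the stated order of preference (over-allocation, rejections,
# exponential badness, total allocation). Demand indexes reuse the
# footnote example.
def exponential_badness(indexes):
    return sum(2 ** i for i in indexes)

def cost(over_alloc, rejected, demand_indexes, total_alloc):
    return (over_alloc, rejected, exponential_badness(demand_indexes), total_alloc)

elitist = cost(0, 0, [1, 5, 5], 300)  # badness 2^1 + 2^5 + 2^5 = 66
fair    = cost(0, 0, [4, 4, 4], 300)  # badness 3 * 2^4 = 48
print(min(elitist, fair) == fair)  # → True: the 'fair' configuration wins
```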
4 Solution algorithms

In this chapter we introduce the algorithms we have implemented and tested.

4.1 Greedy

The greedy algorithm we use here is almost identical to that published in [22]. The differences are that the cost function is changed to the one described in the previous chapter, and that the clients' demand lists are also taken into account. In this algorithm we start with an empty set of proxies. In each turn the proxy that decreases the cost the most is added. When calculating the cost, in each turn a new configuration has to be generated based on the previous one and the recently extended proxy set. This configuration setup is made by an inner greedy algorithm that for every client selects the best proxy from the set, and tries to modify the index of the chosen demand by adding or subtracting one. Each proxy contacts the server that gives the least cost. The algorithm stops when adding any candidate to the set would not decrease the configuration cost.

4.2 Particle swarm

The particle swarm algorithm is based on the algorithm of Kennedy and Eberhart [18]. The original algorithm uses a set of particles, each of them representing a possible configuration. Every particle is connected to several other particles, thus forming a topology. The particles are initialized with random values. Each particle knows its last configuration (x_t), its best past configuration (b_p), and the configuration of the neighbour (including itself) that has the least cost (b_n). In each turn, it computes a new configuration by creating a linear combination of b_p and b_n, using probabilistic weights, and then it adopts this new configuration. The whole process runs until some condition is met. The combination is defined as:
    x_{t+1} = x_t + ϕ1 (b_p − x_t) + ϕ2 (b_n − x_t)

where ϕi is a random number from [0, 1].
This original algorithm, however, can only solve problems where the dimensions of the configurations represent binary or real values. This is a consequence of the linear combination technique. In our case the configurations' dimensions represent unordered elements of sets, so linear combinations cannot be applied to them. We use the following combination instead: for each dimension we first take the value, with a certain probability, from either x_t or b_n. Then, with another probability, we change it to a random value (e) from the set. Finally, we assign the value to the given dimension:

    y_i = b_{n,i}  if ϕ1 < p1;   y_i = x_i^t  if ϕ1 ≥ p1      (crossover)
    x_i^{t+1} = e ∈ S (random)  if ϕ2 < p2;   x_i^{t+1} = y_i  if ϕ2 ≥ p2      (mutation)
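The combination just defined can be sketched as follows; the explicit RNG seed and helper names are illustrative:

```python
# Sketch of the set-valued combination operator: per dimension, crossover
# copies the best neighbour's value with probability p1, then mutation
# replaces the value with a random element of that dimension's set with
# probability p2.
import random

def combine(x, b_n, domains, p1=0.2, p2=0.2, rng=random):
    new = []
    for i in range(len(x)):
        v = b_n[i] if rng.random() < p1 else x[i]  # crossover
        if rng.random() < p2:                      # mutation
            v = rng.choice(domains[i])
        new.append(v)
    return new

# the P1/P2 vectors of the Fig. 3 example
p1_vec = ["a", "b", 2, 3, "h", "j", "f"]
p2_vec = ["b", "c", 1, 3, "f", "h", "i"]
domains = [list("abc")] * 2 + [list(range(7))] * 2 + [list("fghij")] * 3
new_p1 = combine(p1_vec, p2_vec, domains, rng=random.Random(1))
print(new_p1)
```

Each dimension stays inside its own value set, which is exactly what the linear-combination update of the original algorithm cannot guarantee.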
On figure 3 we show a small example. The swarm consists of four particles forming a ring topology. The problem is to offer proxies for two clients. We have three candidates ({a, b, c}), five servers ({f, g, h, i, j}), and six QoS demands for each client. Each particle describes a configuration with seven dimensions. The first two dimensions define the candidate proxies for the two clients, the next two dimensions describe which QoS demands are considered, and the last three dimensions define the servers for each candidate. The cost comparisons are shown on the particles' connections. Each particle that has a neighbour providing less cost combines with that neighbour. The example shows the combination of particles P1 and P2. First, for each dimension we generate a random number from [0, 1], and if it is below p1 = 0.2 (framed on the figure), then, for the given dimension, we choose the value from particle P2. Otherwise, the value is not changed. This yields the crossover result. Then for each dimension we generate a random value from the appropriate set, and a random number from [0, 1]. If the latter is below p2 = 0.2 (framed on the figure), then, for the given dimension, we choose the random value. Otherwise, the value is chosen from the crossover result. Thus we obtain the mutation result, the new configuration for particle P1. In an iteration these steps are conducted for each particle. Crossover and mutation are also applied by genetic algorithms [23]. The difference between genetic algorithms and our algorithm is that in our case the connections of the partners (neighbours) are static. Particles do crossovers only with their neighbours, and the crossovers transfer information from the better configuration to the worse one only; thus the best results are always preserved. The algorithm runs until every particle has the same cost. But this condition, combined with the mutation, leads from the initial fast evolution to a final fluctuation.
Several particles try to reach the cost of the best ones, but the mutations always change something, which prevents convergence. We apply simulated annealing to the mutation rate in order to avoid the final fluctuation [24]. The control variable of the annealing is the number of particles that have changed their configuration. The fewer particles changed, the lower the mutation rate. The algorithm has the following control parameters: the number of particles, the topology of the swarm, the crossover probability (p1), and the annealing function of the mutation probability. Conducting several measurements, we have made the following decisions:
(1) Starting configuration with four particles (P1, P2, P3, P4, connected into a ring). The cost comparisons are shown on the links. The set of candidates is {a, b, c}, the set of demands is {0, 1, …, 6}, the set of servers is {f, g, h, i, j}.

    dimension              | P1 | P2 | P3 | P4
    Candidate for client 1 | a  | b  | c  | b
    Candidate for client 2 | b  | c  | a  | a
    Demand of client 1     | 2  | 1  | 2  | 1
    Demand of client 2     | 3  | 3  | 4  | 4
    Server for candidate a | h  | f  | f  | i
    Server for candidate b | j  | h  | j  | g
    Server for candidate c | f  | i  | j  | j

(2) Combination (only P1 ⊗ P2 is shown, the vectors are transposed, bracketed numbers show probabilities that are below the threshold):

    Particle 1 (P1):                   a      b    2     3     h     j    f
    Particle 2 (P2):                   b      c    1     3     f     h    i
    Random ϕ1 values for crossover: [0.16] 0.87 0.79 [0.18] 0.45  0.96 0.23
    Crossover result:                  b      b    2     3     h     j    f
    Random values for mutation:        c      a    4     2     g     h    i
    Random ϕ2 values for mutation:   0.31   0.23 0.74  0.45 [0.02] 0.63
    Mutation result:                   b      b    2     3     g     j    f

(3) Next configuration:

    dimension              | P1 | P2 | P3 | P4
    Candidate for client 1 | b  | b  | b  | a
    Candidate for client 2 | b  | a  | a  | a
    Demand of client 1     | 2  | 1  | 2  | 1
    Demand of client 2     | 3  | 3  | 2  | 3
    Server for candidate a | g  | f  | f  | h
    Server for candidate b | j  | h  | j  | g
    Server for candidate c | f  | i  | g  | j

Fig. 3. Example step of the swarm algorithm
– Number of particles: the more particles we have, the better the results, but the more time is needed. The time consumption is an almost linear function of the particle number, while the cost curve has an elbow at about 100 particles, so we have chosen 100 particles.
– Topology of the swarm: there is an infinite number of topologies. We tested two, the ring (each particle has two neighbours) and the toroid (each particle has four neighbours). The latter was faster, but its results are an order of magnitude worse than those of the ring topology. The cause is simple: information flows faster in the toroid, but this may kill promising configurations. We have chosen the ring topology.
– Crossover probability: the measurements give similar results as in the case of the number of particles. The optimal point is at a probability of about 0.2.
– Function of the mutation probability: we cannot choose a constant function because of the final fluctuation. Of the tested functions, we gained the best results with

    P_m(c) = 0.2 · c / N
where c is the number of particles that have been changed in the last step, and N is the number of particles.

4.3 Linear Programming Rounding

Graph Model In order to facilitate recommending the configuration of the distributed server, we transformed the topological graph model of the network. We produced a bipartite graph from the graph of the network, where one set of nodes denotes the clients, the other one represents the proxy-server routes, and the edges between them correspond to the client-proxy-server routes. The bipartite graph model simplifies the original model of the network and focuses on the routes that are able to satisfy client requests. Let us call it the FLP graph, because this model corresponds to the facility location problem where the facilities are the proxy-server pairs and the clients represent themselves. The best QoS parameters that can still be satisfied along the client-proxy-server route are stored at each edge. During the calculation of the parameters we take into account that the required bandwidth has to be reserved twice on the common segment of the client-proxy and the proxy-server routes. Let us consider the sample network in Fig. 1. The QoS parameters of the network, the requests of the clients and the best QoS parameters of the requests that the routes can still serve are shown in Tables 2, 3 and 4, respectively. Figure 4 depicts the FLP graph created from the sample network. The upper nodes represent the facilities of the proxy-server pairs, the nodes below represent the clients. The edges represent the client-proxy-server routes able to satisfy the client requests. Edges are omitted from the graph if their parameters are not appropriate. In this particular case, the network is unable to satisfy the requests of Clients 0 and 2. The facility representing the connection between Proxy 4 and Server 8 is missing because no edges are connected to it. The steps of generating an FLP graph from the original network are as follows.
1. Consider the proxy-server connections
Table 2. The QoS parameters in the sample network (clients C0–C3 reside on subnets 0–3, candidate proxies P4–P6 on subnets 4–6, and servers S7–S8 on subnets 7–8)

    subnet | bandwidth | jitter | loss | delay | latency
    0      | 256       | 20     | 0.1  | 100   | 200
    1      | 128       | 20     | 0.1  | 100   | 200
    2      | 128       | 30     | 0.1  | 200   | 200
    3      | 128       | 20     | 0.1  | 100   | 200
    4      | 128       | 20     | 0.1  | 100   | 200
    5      | 128       | 20     | 0.1  | 100   | 200
    6      | 128       | 20     | 0.1  | 100   | 200
    7      | 200       | 20     | 0.1  | 100   | 200
    8      | 128       | 100    | 0.1  | 500   | 1000

Table 3. The client requests in the sample network

    client | bandwidth | jitter | loss | delay | latency
    0      | 200       | 100    | 0.9  | 500   | 1000
    1      | 64        | 100    | 0.9  | 500   | 1000
    2      | 128       | 100    | 0.9  | 500   | 1000
    3      | 32        | 200    | 0.9  | 1000  | 2000

Table 4. The parameters of the requests that still can be served along the routes represented by the edges of the FLP-graph

    route | bandwidth | jitter  | loss  | delay | latency
    1-4-7 | 64        | 44.721  | 0.41  | 500   | 1000
    1-6-7 | 64        | 44.721  | 0.41  | 500   | 1000
    3-5-8 | 64        | 107.703 | 0.41  | 900   | 1800
    3-6-8 | 64        | 107.703 | 0.41  | 900   | 1800
    3-4-7 | 64        | 52.915  | 0.522 | 700   | 1400
    3-5-7 | 64        | 44.721  | 0.41  | 500   | 1000
    3-6-7 | 64        | 44.721  | 0.41  | 500   | 1000
Fig. 4. Sample FLP-graph. Facility nodes: 4−7, 5−7, 6−7, 5−8, 6−8; client nodes: 0, 1, 2, 3.
   (a) Consider the first proxy.
   (b) Consider the first server.
   (c) Add a facility node to the FLP graph that represents the current proxy-server pair.
   (d) Calculate the parameters of the proxy-server connection.
   (e) Store the parameters at the corresponding facility node.
   (f) If any server is left, take the next one and go to step 1.(c).
   (g) If any proxy is left, take the next one and go to step 1.(b).
2. Consider the clients
   (a) Consider the first client.
   (b) Add a client node to the FLP graph that represents it.
   (c) If any client is left, take the next one and go to step 2.(b).
3. Consider the client-proxy connections
   (a) Consider the first facility and the proxy represented by it.
   (b) Consider the first client.
   (c) Calculate the parameters of the current client-proxy connection.
   (d) Add the parameters of the current client-proxy connection to the parameters stored at the facility.
   (e) If the sum is better than any requirement of the client:
       i. Add an edge to the FLP graph between the current client and facility.
       ii. Store the resultant parameters at the edge.
       iii. Store the first index of the client requirement satisfiable by the represented client-proxy-server route.
   (f) If any client is left, take the next one and go to step 3.(c).
   (g) If any facility is left, take the next one and go to step 3.(b).

This model enables proxies to receive data from different servers. We assume that the location and the cost of the servers are fixed; otherwise we could not neglect the server-proxy hierarchy and substitute it with the direct product of the servers and proxies. During the creation of the graph, we can efficiently check whether the client requests could be satisfied through the route, and only edges with appropriate QoS parameters are put into the graph. We assume that jitter, loss, delay and latency of different data flows are independent. This assumption enables the model to handle these parameters easily. However, this assumption is not valid for bandwidth.
Each data flow reduces the free bandwidth of the subnets. For this reason, the bipartite graph model must be complemented by the continuous recalculation of the free bandwidth of the subnets during the selection of the client-proxy-server routes. Generation of the FLP graph is facilitated by a three-level graph called the SPC graph. The nodes of this graph correspond to the clients, proxies and servers, and the edges between them correspond to the proxy-client and proxy-server routes. The QoS parameters of the edges are calculated by running the shortest-path algorithm from the proxies to all other nodes. The repeated calculation of these parameters during the generation of the FLP graph can then be replaced by quickly retrieving them from the SPC graph.

LP Model We chose an algorithm based on linear programming rounding for the solution of the configuration problem. To derive the integer programming formulation of the problem, we introduce the following variables and notation:
– C, P, S: the sets of clients, proxies and servers, respectively
– I: the set of subnets
– X_{i,j,k,l}: (0, 1) variable indicating whether request l of client i is served by server k through proxy j
– S_i: (0, 1) variable indicating whether any request of client i is served or not
– Y_{j,k}: the bandwidth reserved for the connection between server k and proxy j
– b_{i,j}: the bandwidth of the jth request of client i
– Q: the set of client-proxy-server routes satisfying a client request; this set is represented by the edges of the FLP graph
Some notation belongs to the routes:
– c_i: the client where route i starts
– p_i: the proxy of route i
– s_i: the server where route i ends
– R_i: the set of requests satisfiable through route i
We introduce further notation to denote the relationship between the client-proxy-server routes and the subnets:
– M_n^1: the subset of Q whose client-proxy segments include subnet n
– M_n^2: the subset of Q whose proxy-server segments include subnet n
Let B_n denote the bandwidth of subnet n. For simplicity, we minimize only the number of refused clients, the exponential badness and the sum of the bandwidth reserved in each subnet. Weights are assigned to the different optimization criteria; they express the priority of the criteria. α1, α2 and α3 are the weights of the number of refused clients, the exponential badness and the cumulative reserved bandwidth, respectively. Each weight is higher than the maximum possible value of the subsequent optimization criteria, and α3 = 1. We ensure in this way that the subsequent criteria matter only if the criteria with higher weights are equal. The problem of recommending the optimum set of proxy nodes can now be formulated in the form (P1):
minimize

    α1 ∑_{i ∈ C} S_i                                                                                  (1)
    + α2 ∑_{i ∈ Q} ∑_{j ∈ R_i} 2^{j−1} X_{c_i,p_i,s_i,j}                                              (2)
    + α3 ∑_{n ∈ I} ( ∑_{i ∈ M_n^1} ∑_{j ∈ R_i} b_{c_i,j} X_{c_i,p_i,s_i,j} + ∑_{i ∈ M_n^2} Y_{p_i,s_i} )   (3)

subject to

    ∑_{i ∈ M_n^1} ∑_{j ∈ R_i} b_{c_i,j} X_{c_i,p_i,s_i,j} + ∑_{i ∈ M_n^2} Y_{p_i,s_i} ≤ B_n   for all n ∈ I   (4)
    ∑_{j ∈ R_i} b_{c_i,j} X_{c_i,p_i,s_i,j} ≤ Y_{p_i,s_i}   for all i ∈ Q                                     (5)
    ∑_{i ∈ Q, c_i = c} ∑_{j ∈ R_i} X_{c_i,p_i,s_i,j} + S_c = 1   for all c ∈ C                                (6)
    X_{c_i,p_i,s_i,j} ∈ {0, 1}   for all i ∈ Q, j ∈ R_i                                                        (7)
    S_i ∈ {0, 1}   for all i ∈ C                                                                               (8)
    Y_{i,j} ≥ 0   for all i ∈ P, j ∈ S for which there is a route l ∈ Q with p_l = i and s_l = j               (9)
Parts (1), (2) and (3) of the cost function represent the number of refused clients, the exponential badness and the sum of the reserved bandwidth for each subnet, respectively, each of them weighted by the corresponding α. Inequality (4) is the bandwidth constraint of the subnets. Constraint (5) expresses the bandwidth reserved by the proxy-server connections. Constraint (6) ensures that each client is either assigned to a client-proxy-server route or rejected. Consider the following LP-relaxation of the problem (P2): minimize (1) + (2) + (3) subject to (4), (5), (6), (9), and

    X_{c_i,p_i,s_i,j} ≥ 0   for all i ∈ Q, j ∈ R_i   (10)
    S_i ≥ 0   for all i ∈ C                           (11)
We enclose the linear program belonging to the problem instance described in Fig. 1.
minimize

  868352 (S_0 + S_1 + S_2 + S_3)                                                              (12)
+ 3392 (S_0 + S_1 + S_2 + S_3)
  + 848 (X_{1,4,7,1} + X_{1,6,7,1} + X_{3,4,7,1} + X_{3,5,7,1} + X_{3,5,8,1} + X_{3,6,7,1} + X_{3,6,8,1})   (13)
+ (the cumulative reserved bandwidth: the sum of the left-hand sides of the subnet constraints (15))        (14)

subject to

  64 X_{1,4,7,1} + 64 X_{1,6,7,1} ≤ 128
  32 X_{3,4,7,1} + 32 X_{3,5,7,1} + 32 X_{3,5,8,1} + 32 X_{3,6,7,1} + 32 X_{3,6,8,1} ≤ 128
  64 X_{1,4,7,1} + 32 X_{3,4,7,1} + Y_{4,7} ≤ 128
  32 X_{3,5,7,1} + 32 X_{3,5,8,1} + Y_{5,7} + Y_{5,8} ≤ 128
  64 X_{1,6,7,1} + 32 X_{3,6,7,1} + 32 X_{3,6,8,1} + Y_{6,7} + Y_{6,8} ≤ 128
  Y_{4,7} + Y_{5,7} + Y_{6,7} ≤ 200
  Y_{5,8} + Y_{6,8} ≤ 128                                                                     (15)

  64 X_{1,4,7,1} − Y_{4,7} ≤ 0      64 X_{1,6,7,1} − Y_{6,7} ≤ 0
  32 X_{3,4,7,1} − Y_{4,7} ≤ 0      32 X_{3,5,7,1} − Y_{5,7} ≤ 0
  32 X_{3,5,8,1} − Y_{5,8} ≤ 0      32 X_{3,6,7,1} − Y_{6,7} ≤ 0
  32 X_{3,6,8,1} − Y_{6,8} ≤ 0                                                                (16)

  X_{1,4,7,1} + X_{1,6,7,1} + S_1 = 1
  X_{3,4,7,1} + X_{3,5,7,1} + X_{3,5,8,1} + X_{3,6,7,1} + X_{3,6,8,1} + S_3 = 1
  S_0 = 1      S_2 = 1                                                                        (17)
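The greedy rounding applied to fractional optima of such programs, as described next, can be sketched in a few lines of Python. The data layout (route tuples, per-route subnet lists with multiplicity, per-client demands) is our own illustration; the toy data loosely mirror the sample network:

```python
def fits(load, capacity, route_subnets, demand):
    """True if reserving `demand` on every subnet of the route stays within capacity."""
    trial = dict(load)
    for n in route_subnets:   # a subnet may appear twice on one route
        trial[n] += demand
    return all(trial[n] <= capacity[n] for n in capacity)

def round_solution(x_frac, demand, route_subnets, capacity):
    """Greedy rounding: integral variables first, then the fractional ones."""
    load = {n: 0 for n in capacity}
    served = {}   # client -> (proxy, server)
    for (c, p, s), val in sorted(x_frac.items(), key=lambda kv: -kv[1]):
        if val <= 0 or c in served:   # skip zero routes and already-served clients
            continue
        subnets = route_subnets[(c, p, s)]
        if fits(load, capacity, subnets, demand[c]):
            for n in subnets:
                load[n] += demand[c]
            served[c] = (p, s)
    return served

# Toy data loosely mirroring the sample network (illustrative numbers):
x_frac = {(1, 4, 7): 0.5, (1, 6, 7): 0.5, (3, 6, 7): 1.0}
demand = {1: 64, 3: 32}
route_subnets = {(1, 4, 7): ("n4", "n7"),
                 (1, 6, 7): ("n6", "n6", "n7"),   # subnet 6 used twice
                 (3, 6, 7): ("n6", "n7")}
capacity = {"n4": 128, "n6": 128, "n7": 200}
print(round_solution(x_frac, demand, route_subnets, capacity))
# -> {3: (6, 7), 1: (4, 7)}
```

Note that Route 1-6-7 is refused by the load check because Subnet 6 appears twice on it, while Route 1-4-7 fits.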
After solving the linear program we get the result X_{1,4,7,1} = 0.5, X_{1,6,7,1} = 0.5, X_{3,6,7,1} = 1; all other X-variables are zero. This means that the request of Client 3 can be satisfied, the requests of Clients 0 and 2 are rejected, and we get invalid fractional results for Client 1.

Rounding
First we solve the linear program P2 and obtain an optimal solution. If X_{i,j,k} = 1, then the request of client i is served by server k through proxy j. Unfortunately, fractional values of the X-type variables do not represent legal solutions. We round the solution in a greedy manner: we take the X-variables with fractional values one after the other. If the client is still not served, we try to select the client-proxy-server route denoted by variable X_{i,j,k} and check the load conditions in the network. Client i is served by server k through proxy j in the solution if and only if these conditions are fulfilled after the selection of the route.

Let us consider again our sample network. Solving the linear program, we found that the request of Client 3 can be satisfied through Proxy 6 from Server 7. During rounding we find one more client whose request can be satisfied: X_{1,4,7,1} = 1 represents a valid route in the network. After that, variable X_{1,6,7,1} is not rounded because Client 1 is already served. We obtain the following solution:
– Client 3 is served through Proxy 6 from Server 7.
– Client 1 is served through Proxy 4 from Server 7.

4.4 Incremental Algorithm
In order to find solutions quickly, we implemented a very simple but efficient incremental algorithm. It operates on the FLP graph described in the previous subsection. The main steps of the algorithm are as follows.

1. Generate the FLP graph.
2. Sort the facilities.
3. for each facility do
4.   Decide on selecting the current facility
5.   if the facility is selected then
6.     for each client connected to the current facility do
7.       Decide on assigning the current client to the current facility

In Step 2, the facilities are sorted in decreasing order of the bandwidth of the represented proxy-server routes. The algorithm selects facility f_i in Step 4 if there is at least one client c_j among the nodes adjacent to f_i that satisfies one of the following criteria:
– client c_j is not yet assigned to any facility, or
– client c_j is assigned to facility f_0 and both of the following conditions hold: the parameters of the edge between c_j and f_i are better than those between c_j and f_0, and the badness of the first satisfiable request on the edge between c_j and f_i is not greater than that of the satisfied request on the edge between c_j and f_0.
These criteria are complemented with one more when client c_j is assigned to facility f_i in Step 7:
– the request of client c_j can be satisfied through facility f_i without overloading the network.

Let us consider the sample network as an example. First we generate the FLP graph. Two client nodes (Clients 0 and 2) have no edges, that is, their requests cannot be served. In our example, each proxy-server route has the same bandwidth; for this reason, their order after sorting is not specified by the algorithm. In our implementation, the facilities representing the proxy-server routes are processed in the following order: Facilities 5-8, 5-7, 6-8, 6-7, 4-7. In the first iteration, Facility 5-8 is chosen because it is able to serve a client (Client 3). In the next iteration, Facility 5-7 is selected because edge 3-5-7 has better parameters than edge 3-5-8 (see the jitter, delay and latency parameters in Table 3). After that, Facility 5-8 is unselected because no clients are connected to it. Facility 6-8 is not selected because it does not offer a better route to Client 3. Facility 6-7 does not improve the route from Client 3 either, but one more client (Client 1) is connected to it. However, this client cannot be served because Subnet 6 would be overloaded: the free bandwidth of this subnet decreases from 128 to 96 after the reservation for Client 3, and only its half, 48, is the maximum bandwidth that Client 1 could ask along Route 1-6-7, because both its client-proxy and proxy-server segments use Subnet 6. However, Client 1 needs a bandwidth of 64. For this reason, Facility 6-7 is not selected. Facility 4-7 is able to serve Client 1 because Proxy 4 is located in a better position than Proxy 6. Therefore, Facility 4-7 is selected in the last iteration and Client 1 is assigned to it. The final solution of the incremental algorithm is:
– Facility 5-7 is selected and Client 3 is assigned to it.
– Facility 4-7 is selected and Client 1 is assigned to it.
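The selection and assignment loop above can be sketched in a few lines. The data structures below (an Edge record with a single combined quality score, per-subnet load bookkeeping) are our own simplification: the actual implementation compares the individual QoS parameters of the edges rather than one scalar score.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    client: str
    facility: str   # a proxy-server route of the FLP graph
    demand: int     # bandwidth requested by the client
    score: float    # combined edge quality (higher is better) - a simplification
    subnets: tuple  # subnets traversed, listed with multiplicity

def fits(load, capacity, edge, replacing=None):
    """True if reserving `edge` (after releasing `replacing`) overloads no subnet."""
    trial = dict(load)
    if replacing is not None:
        for n in replacing.subnets:
            trial[n] -= replacing.demand
    for n in edge.subnets:
        trial[n] += edge.demand
    return all(trial[n] <= capacity[n] for n in capacity)

def incremental(edges, facility_bw, capacity):
    # Step 2: facilities in decreasing order of proxy-server route bandwidth.
    facilities = sorted(facility_bw, key=facility_bw.get, reverse=True)
    assigned = {}                       # client -> Edge currently serving it
    load = {n: 0 for n in capacity}     # reserved bandwidth per subnet
    for f in facilities:                # Steps 3-7
        for e in (e for e in edges if e.facility == f):
            cur = assigned.get(e.client)
            # Assign only unassigned clients or clients with a strictly better edge.
            if cur is not None and e.score <= cur.score:
                continue
            if fits(load, capacity, e, replacing=cur):
                if cur is not None:
                    for n in cur.subnets:
                        load[n] -= cur.demand
                for n in e.subnets:
                    load[n] += e.demand
                assigned[e.client] = e
    return assigned

# Tiny example: two clients compete for subnet A; c2 falls back to facility f2.
capacity = {"A": 100, "B": 100}
facility_bw = {"f1": 90, "f2": 80}
edges = [
    Edge("c1", "f1", 60, 0.9, ("A",)),
    Edge("c2", "f1", 60, 0.8, ("A",)),   # would overload subnet A
    Edge("c2", "f2", 60, 0.7, ("B",)),
]
result = incremental(edges, facility_bw, capacity)
print({c: e.facility for c, e in result.items()})   # -> {'c1': 'f1', 'c2': 'f2'}
```

As in the walkthrough above, a facility is skipped for a client whenever the reservation would exceed the capacity of any traversed subnet, and a later facility can still pick that client up.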
5 Results
We implemented the algorithms and tested them on simulated network environments. Each test network consists of 50 subnets. Six test series were generated; each of them consisted of ten cases. The number of servers is always ten, while the number of clients and proxies varies across the series: there are 5, 10, 15, 20, 25, 30 clients and 10, 20, 30, 40, 40, 40 proxies in the different series. We examined the cost of the solutions and the running time as a function of the number of clients. Figure 5 shows the costs (linear and exponential badness, number of rejected clients) and the runtime. The figures compare the results of the algorithms described above, namely linear programming with rounding, the swarm algorithm, the greedy algorithm and the incremental algorithm. Linear programming without rounding does not produce legal solutions, but it can be used as a lower bound for the cost measures.
Fig. 5. The results of the measurements for linear programming rounding, incremental, greedy, and swarm algorithms. (Four panels plotted against the number of clients, from 5 to 30: the number of rejected clients, the runtime in seconds on a logarithmic scale, the exponential badness and the linear badness; each panel compares LinProg, Incremental, Greedy and Swarm.)
According to the figure, the swarm algorithm produces the best results; linear programming with rounding achieves results with slightly higher cost, while the greedy algorithm fails to find nearly optimal solutions for a high number of clients. On the other hand, the running time of the incremental algorithm is clearly the best and linear programming with rounding is second, while the swarm and the greedy algorithms run substantially slower.
6 Conclusions and Further Work
We introduced a number of algorithms for host recommendation in an Adaptive Distributed Multimedia Server. We did not find an algorithm that is the best in every respect. A good strategy is first to produce an initial solution quickly and to determine a more sophisticated solution later. In the future, we intend to improve the model of the network. We plan to take into account that the resources of the proxies are limited. Furthermore, the predicted future values of the network parameters and of the client requests have to be taken into account in the recommendation. The presented algorithms can be combined with each other and enhanced according to the experience gained, in order to improve their running time and the quality of their results. Some other algorithms are worth considering as well. We regard simulated annealing as a promising approach, and we expect much from the incremental algorithm after its customization to the real tasks. Later, the implementations of the algorithms will be integrated into the Adaptive Distributed Multimedia Server and tested in a real network environment as well.
References
1. http://developer.apple.com/darwin/: Darwin project homepage (2004)
2. https://www.helixcommunity.org/: Helix project homepage (2004)
3. Tusch, R.: Towards an adaptive distributed multimedia streaming server architecture based on service-oriented components. In Böszörményi, L., Schojer, P., eds.: Modular Programming Languages, JMLC 2003. LNCS 2789, Springer (2003) 78–87
4. Tusch, R., Böszörményi, L., Goldschmidt, B., Hellwagner, H., Schojer, P.: Offensive and Defensive Adaptation in Distributed Multimedia Systems. Technical Report TR/ITEC/03/2.03, Institute of Information Technology, Klagenfurt University, Klagenfurt, Austria (2003). Paper submitted to the Journal of Systems and Software, Special Issue on Multimedia Adaptation.
5. van Steen, M., Homburg, P., Tanenbaum, A.S.: Globe: A wide-area distributed system. IEEE Concurrency (1999)
6. Li, B., Golin, M., Italiano, G., Deng, X., Sohraby, K.: On the optimal placement of web proxies in the internet. In: Proceedings of the Conference on Computer Communications (IEEE INFOCOM). (1999)
7. Qiu, L., Padmanabhan, V.N., Voelker, G.M.: On the placement of web server replicas. In: INFOCOM. (2001) 1587–1596
8. Cornuejols, G., Nemhauser, G.L., Wolsey, L.A.: The uncapacitated facility location problem. In Mirchandani, P., Francis, R., eds.: Discrete Location Theory. John Wiley and Sons, New York (1990) 119–171
9. Bumb, A.: Approximation algorithms for facility location problems. PhD thesis, Twente University (2002)
10. Kariv, O., Hakimi, S.L.: An algorithmic approach to network location problems II: The p-medians. SIAM Journal on Applied Mathematics 37, 3 (1979) 539–560
11. Lin, J.H., Vitter, J.S.: ε-approximations with minimum packing constraint violation. In: Proceedings of the 24th Annual ACM Symposium on Theory of Computing. (1992) 771–782
12. Shmoys, D., Tardos, E., Aardal, K.: Approximation algorithms for facility location problems. In: Proceedings of the 29th ACM Symposium on Theory of Computing. (1997) 265–274
13. Charikar, M., Guha, S.: Improved combinatorial algorithms for the facility location and k-median problems. In: IEEE Symposium on Foundations of Computer Science. (1999) 378–388
14. Mahdian, M., Ye, Y., Zhang, J.: Improved approximation algorithms for metric facility location problems. In: Proceedings of the 5th International Workshop on Approximation Algorithms for Combinatorial Optimization. (2002)
15. Kuehn, A.A., Hamburger, M.J.: A heuristic program for locating warehouses. Management Science 9 (1963) 643–666
16. Mettu, R.R., Plaxton, C.G.: The online median problem. In: Proceedings of the 41st IEEE Symposium on Foundations of Computer Science. (2000) 339–348
17. Dvorett, J.: Compatibility-based genetic algorithm: A new approach to the p-median problem. In: INFORMS Fall 1999 Meeting. (1999)
18. Kennedy, J., Eberhart, R.C.: Swarm Intelligence. Morgan Kaufmann (2001)
19. Cornuejols, G., Fisher, M.L., Nemhauser, G.L.: Location of bank accounts to optimize float: an analytic study of exact and approximate algorithms. Management Science 23 (1977) 789–810
20. Goldschmidt, B., Tusch, R., Böszörményi, L.: A CORBA-based middleware for an adaptive streaming server. DAPSYS journals (2003) ??
21. Goldschmidt, B., Tusch, R., Böszörményi, L.: A mobile agent-based infrastructure for an adaptive multimedia server. In: 4th DAPSYS (Austrian-Hungarian Workshop on Distributed and Parallel Systems), Kluwer Academic Publishers (2002) 141–148
22. Goldschmidt, B., László, Z.: A proxy placement algorithm for the adaptive multimedia server. In: 9th International Euro-Par Conference. (2003) 1199–1206
23. Davis, L., ed.: Handbook of Genetic Algorithms. Van Nostrand Reinhold (1991)
24. Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science 220, 4598 (1983) 671–680