J Comb Optim DOI 10.1007/s10878-013-9617-9
Optimal-constrained multicast sub-graph over coded packet networks M. A. Raayatpanah · H. Salehi Fathabadi · H. Bahramgiri · P. M. Pardalos
© Springer Science+Business Media New York 2013
Abstract Network coding is a technique which can be used to improve the performance of multicast communications by performing encoding operations at intermediate nodes. In real-time multimedia communication applications, there are usually several weights associated with links, such as cost, delay, jitter, loss ratio, security, and so on. In this paper, we consider the problem of finding an optimal multicast sub-graph over coded packet networks, where the longest end-to-end weight from the source to each destination does not exceed an upper bound. First, a mixed integer programming model is proposed to formulate the problem which is NP-hard. Then, a column-generation approach is described for this problem, in which the problem is decomposed into a master linear programming problem and several integer programming sub-problems. Moreover, two methods based on linear and Lagrangian relaxation are proposed to compute a tight lower bound of the optimal solution value. Computational results show that the proposed algorithm provides an efficient way for solving the problem, even for relatively large networks.
M. A. Raayatpanah · H. Salehi Fathabadi · H. Bahramgiri School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran e-mail:
[email protected] H. Salehi Fathabadi e-mail:
[email protected] H. Bahramgiri e-mail:
[email protected] M. A. Raayatpanah · P. M. Pardalos (B) Center for Applied Optimization, Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL, USA e-mail:
[email protected]
Keywords Communication networks · Network coding · Network flow optimization · Multicast
1 Introduction

Network coding generalizes the traditional routing paradigm by allowing intermediate network devices to perform arbitrary operations on the content of packets. Ahlswede et al. (2000) showed that network coding can achieve the max-flow/min-cut bound on multicast capacity. Li et al. (2003) proved that linear coding is sufficient to achieve the optimal multicast throughput. Jaggi et al. (2005) also showed that network codes can be designed using a polynomial-time algorithm. Establishing an efficient multicast connection is one of the most attractive applications of network coding. Indeed, traditional multicast problems can be formulated as the problem of finding Steiner sub-graphs, which is NP-hard (Resende and Pardalos 2006; Oliveira and Pardalos 2011). In contrast, the multicast sub-graph problem with network coding can be formulated as a linear programming problem (Lun et al. 2006), which can be solved in polynomial time and is suitable for distributed implementation. Ghasvari et al. (2011) considered the problem of finding a minimum cost multicast based on network coding where parameters vary over time, and presented a decentralized algorithm based on primal and dual decomposition. Xi and Yeh (2010) provided an analytical framework as well as a set of distributed solutions for optimizing the configuration of network coding in both wireline and wireless networks. Raayatpanah et al. (2012) addressed the problem of finding a minimum cost multiple multicast scheme using a network coding approach, where the rates over all links are integer multiples of a basic rate. They formulated the problem as a mixed integer linear programming problem and developed an algorithm based on Benders decomposition.
Content distribution on the Internet is one of the key issues in network communication. It refers to the delivery of digital data such as text and multimedia files, streaming audio and video, and software to a large number of users in a network. Examples of content distribution include video, video-conferencing, telemedicine, and video-on-demand, which require multicasting with a certain Quality of Service (QoS). One of the most important QoS parameters is the maximum end-to-end delay from the source to any destination in a multicast session, which must not exceed a given delay bound. This problem is known as delay-constrained minimum cost multicasting in traditional routed packet networks, and it is NP-hard (Lee et al. 1995; Oliveira and Pardalos 2011). However, due to several advantages of network coding over simple routing, such as higher throughput (Ahlswede et al. 2000), lower delay (Yeung et al. 2005), higher reliability (Lun et al. 2008), and security (Cai and Yeung 2002), this technique has been proposed for content distribution on the Internet. The Microsoft Secure Content Distribution is an example of a peer-to-peer system that performs content distribution using network coding. In this case, peers produce linear combinations of the fragments they already hold. Such combinations are distributed together with a tag that describes the coefficients in the combination. When a peer has enough linearly independent combinations of the original fragments, it can decode and build
the original file (Fragouli and Soljanin 2008). Providing QoS nevertheless remains a major challenge when content distribution employs multicast with network coding as the data delivery mechanism. The focus of our paper is on constructing the coding sub-graph for multicast traffic under end-to-end QoS guarantees. QoS parameters could include cost, delay, jitter, bandwidth, packet delivery ratio, and packet loss ratio. These parameters are taken into account as link weights. The selected sub-graph has minimum cost, guarantees the end-to-end weight bound, and satisfies the bandwidth constraints. In order to obtain an exact mathematical model for the problem, we first introduce a path-based mixed integer programming (MIP) model. Subsequently, we show that the problem is NP-hard and present a heuristic algorithm for it. Each iteration of the algorithm uses a column generation method to tackle the path-flow model, which provides primal solutions and an upper bound on the optimal objective value. In addition, in each iteration, a procedure based on relaxation is used to obtain a lower bound in a reasonable amount of computational time. A decision maker can terminate the procedure before optimality is reached, when the relative gap between the upper and lower bounds is sufficiently small. The capability of the proposed approach is demonstrated through simulation results on random graphs. The rest of the paper is organized as follows: In Sect. 2, we introduce the system model and formulate the problem. In Sect. 3, the proposed problem is solved using the heuristic algorithm. Computational results and the conclusion are presented in Sects. 4 and 5, respectively.

2 Problem formulation

A communication network is represented by a network $G = (V, E)$, where $V$ is a set of nodes with $|V| = n$, and $E$ is a set of links with $|E| = m$.
We consider a single-session multicast in which a source node $s \in V$ must transmit an integer number $R$ of packets per unit time to every node in a set of destinations $K \subset V$. Each link $e = (i, j)$, directed from node $i$ to node $j$, is associated with three parameters: a non-negative cost $c_e$, denoting the cost per unit rate of sending coded packets over link $e \in E$; a non-negative integer capacity $u_e$, which denotes the number of packets that can be sent over link $e$ in one time unit; and a non-negative integer weight $w_e$. As mentioned earlier, $w_e$ can be interpreted as a QoS parameter of link $e$, such as delay, jitter, packet delivery ratio, or packet loss ratio. For a node $i \in V$, the terms $V_i^+$ and $V_i^-$ denote the sets of links leaving and entering node $i$, respectively. The flow rate toward destination $k$ on link $e$ is denoted by $x_e^{(k)}$. Let $z_e$ denote the rate at which coded packets are injected onto link $e$. The rate vector $Z = \{z_e\}$ is called the coding sub-graph of the network (Lun et al. 2006). The flow rates and the network-coded transmission rate on link $e$ are related as follows:

$$z_e = \max_{k \in K} x_e^{(k)}.$$
Let $P^{(k)}$ denote the collection of all directed paths from source node $s$ to destination node $k$ in the underlying network $G$. Define variable $f(p)$ as the flow on path $p \in P^{(k)}$.
Let $\delta_e(p)$ be a link-path indicator variable, that is, $\delta_e(p)$ is 1 if link $e$ is contained in path $p$, and is 0 otherwise. The link flow $x_e^{(k)}$ is computed from the path flows by the following relation:

$$x_e^{(k)} = \sum_{p \in P^{(k)}} \delta_e(p) f(p).$$
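The two relations above (link flows from path flows, and $z_e$ as the maximum over destinations) can be illustrated with a minimal Python sketch. The butterfly-style network, the node names, and the helper names `link_flows` and `coding_subgraph` are assumptions made for illustration and do not come from the paper.

```python
from collections import defaultdict

def link_flows(path_flows):
    """Aggregate path flows f(p) into link flows x_e^(k) for one destination.

    path_flows maps a path (tuple of nodes) to its flow f(p).
    """
    x = defaultdict(float)
    for path, flow in path_flows.items():
        # delta_e(p) = 1 exactly for the consecutive node pairs of the path
        for link in zip(path, path[1:]):
            x[link] += flow
    return dict(x)

def coding_subgraph(flows_by_dest):
    """Compute z_e = max over destinations k of x_e^(k)."""
    z = defaultdict(float)
    for path_flows in flows_by_dest.values():
        for link, value in link_flows(path_flows).items():
            z[link] = max(z[link], value)
    return dict(z)

# Hypothetical butterfly network with R = 2 packets per unit time.
flows = {
    "t1": {("s", "a", "t1"): 1.0, ("s", "b", "c", "d", "t1"): 1.0},
    "t2": {("s", "b", "t2"): 1.0, ("s", "a", "c", "d", "t2"): 1.0},
}
z = coding_subgraph(flows)
print(z[("c", "d")])  # both destinations use link (c, d), yet z stays at 1.0
```

In this example both destinations send one unit over link (c, d), but the coded rate on that link is the maximum of the two flows rather than their sum, which is the saving that network coding provides over routing.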
The end-to-end weight of path $p \in P^{(k)}$ is defined as follows:

$$W^{(k)}(p) = \sum_{e \in p} w_e. \tag{2.1}$$
An upper bound on the longest end-to-end weight from source node $s$ to destination node $k$ is given by $U^{(k)}$. Thus, the following constraint is imposed to bound the longest end-to-end weight:

$$\max_{p \in P^{(k)}} W^{(k)}(p) \le U^{(k)}, \quad \forall k \in K. \tag{2.2}$$

Constraint (2.2) can be written as

$$\sum_{e \in p} w_e \le U^{(k)}, \quad \forall k \in K,\ p \in P^{(k)}. \tag{2.3}$$
Constraint (2.3) should be satisfied for each path that carries a positive flow in an optimal solution. Then, constraint (2.3) for the used paths can be formulated by introducing binary variables as follows:

\begin{align}
f(p) &\le R\, y(p), && \forall k \in K,\ \forall p \in P^{(k)}, \tag{2.4}\\
y(p) \sum_{e \in p} w_e &\le U^{(k)}, && \forall k \in K, \tag{2.5}\\
y(p) &\in \{0, 1\}, && \forall k \in K,\ \forall p \in P^{(k)}. \tag{2.6}
\end{align}
Constraint (2.4) forces variable $y(p)$ to be equal to one when path $p \in P^{(k)}$ is used to transmit flow toward destination node $k \in K$. Variable $y(p)$ is then used in constraint (2.5) to guarantee the bound on the end-to-end weight for the new connection. Every non-negative link flow can be represented as a path and cycle flow (Ahuja et al. 1993). Now, if we apply the non-negative cycle cost condition (Ahuja et al. 1993), then the flow on every cycle is zero in some optimal solution of the problem. Consequently, cycle flow variables can be eliminated and any potentially optimal solution may be represented as the sum of flows on directed paths. Therefore, we propose the following path-based formulation for the problem of finding an optimal-constrained multicast sub-graph over coded packet networks.

\begin{align}
Z^* = \min \quad & \sum_{e \in E} c_e z_e \tag{2.7a}\\
\text{s.t.} \quad & \sum_{p \in P^{(k)}} \delta_e(p) f(p) \le z_e, && \forall e \in E,\ k \in K, \tag{2.7b}\\
& \sum_{p \in P^{(k)}} f(p) = R, && \forall k \in K, \tag{2.7c}\\
& f(p) \le R\, y(p), && \forall p \in P^{(k)},\ k \in K, \tag{2.7d}\\
& \sum_{e \in E} w_e \delta_e(p) y(p) \le U^{(k)}, && \forall k \in K, \tag{2.7e}\\
& y(p) \in \{0, 1\}, && \forall p \in P^{(k)},\ k \in K, \tag{2.7f}\\
& 0 \le z_e \le u_e, && \forall e \in E, \tag{2.7g}\\
& 0 \le f(p), && \forall p \in P^{(k)},\ k \in K. \tag{2.7h}
\end{align}
Objective function (2.7a) minimizes the total cost. Constraint (2.7b) determines the rate at which coded packets are injected onto link $e$. Equation (2.7c) formulates the flow conservation constraint. Note that the set of flow conservation constraints in the path-based formulation contains $|K|$ equalities, whereas a link-based formulation contains $n|K|$ such equalities. As mentioned before, constraints (2.7d), (2.7e), and (2.7f) guarantee the bound on the end-to-end weight for each used path. Constraint (2.7g) is the capacity constraint, guaranteeing that the total rate passing through link $e$ is less than or equal to its capacity. Finally, inequality (2.7h) gives the range of the path flow variables. Model (2.7) reduces to a constrained shortest path problem when there is only one destination. Since the constrained shortest path problem is NP-hard in general (Garey and Johnson 1979), problem (2.7) is also NP-hard.

3 Proposed algorithm

As mentioned earlier, the problem of finding an optimal-constrained multicast sub-graph over coded packet networks is NP-hard. Moreover, this problem is usually a large-scale MIP problem, so that even state-of-the-art techniques (e.g., branch-and-bound (Gupta and Ravindran 1985)) and their software implementations (e.g., BARON (Sahinidis 1996)) cannot provide a good solution. We therefore develop a heuristic algorithm to obtain a solution for the problem. Specifically, we describe column generation and relaxation methods to find upper and lower bounds on the optimal objective value, respectively. When the algorithm returns a solution, it has discovered a bound on the exact optimal value for the problem.

3.1 Column generation approach

Column generation is an efficient technique for solving large-scale MIP problems (Desaulniers et al. 2005). In this subsection, we apply this technique to Model (2.7), in which the problem is decomposed into a linear master problem and several integer
sub-problems. The master problem is the original problem with only a subset of variables, and the sub-problems are new problems created to identify a new variable. Binary variables $y(p)$ in Model (2.7) indicate the selection of feasible paths. These variables are not necessary when Model (2.7) is defined over the feasible path sets, i.e., $\bigcup_{k \in K} \bar{P}^{(k)}$, where

$$\bar{P}^{(k)} = \Big\{ p \ \Big|\ p \in P^{(k)},\ \sum_{e \in E} \delta_e(p) w_e \le U^{(k)} \Big\}.$$
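To make the definition of the feasible path set concrete, the following Python sketch enumerates it by depth-first search on a tiny network. This brute-force enumeration is only an illustration and is not the method used in the paper, which generates paths on demand; the function name `feasible_paths`, the adjacency format, and the example data are assumptions.

```python
def feasible_paths(adj, source, dest, bound):
    """Enumerate simple paths from source to dest with total weight <= bound.

    adj maps a node to a list of (neighbor, weight) pairs; weights are assumed
    non-negative, so a partial path that already exceeds the bound is pruned.
    """
    results = []

    def extend(node, path, weight):
        if node == dest:
            results.append((tuple(path), weight))
            return
        for nxt, w in adj.get(node, []):
            if nxt in path or weight + w > bound:
                continue  # skip cycles and paths that violate the bound
            path.append(nxt)
            extend(nxt, path, weight + w)
            path.pop()

    extend(source, [source], 0)
    return results

# Hypothetical network: s-a-t and s-b-a-t have weight 4, s-b-t has weight 6.
adj = {"s": [("a", 2), ("b", 1)], "a": [("t", 2)], "b": [("a", 1), ("t", 5)]}
paths = feasible_paths(adj, "s", "t", bound=4)
print(paths)
```

The number of paths can grow exponentially with the network size, which is why enumerating the whole set is impractical beyond small examples.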
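Set $\bar{P}^{(k)}$ is simply the set of all feasible paths to destination $k$. Then, the following master problem is obtained over all feasible paths.

\begin{align}
Z^* = \min \quad & \sum_{e \in E} c_e z_e + \sum_{k \in K} M s^{(k)} \tag{3.1a}\\
\text{s.t.} \quad & \sum_{p \in \bar{P}^{(k)}} \delta_e(p) f(p) \le z_e, && \forall e \in E,\ k \in K, \tag{3.1b}\\
& \sum_{p \in \bar{P}^{(k)}} f(p) + s^{(k)} = R, && \forall k \in K, \tag{3.1c}\\
& 0 \le z_e \le u_e, && \forall e \in E, \tag{3.1d}\\
& 0 \le f(p), && \forall p \in \bar{P}^{(k)},\ k \in K. \tag{3.1e}
\end{align}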
Variable $s^{(k)}$ is an artificial variable associated with demand constraint $k$ and has cost coefficient $M$, which is a sufficiently large number. If the optimal value of $s^{(k)}$ is positive, then the problem is infeasible. Note that the theoretical difficulty of solving Model (2.7) is now hidden in $\bar{P}^{(k)}$, i.e., in the generation of feasible paths. Finding all feasible paths can be difficult because a network may contain an exponential number of paths. However, this difficulty can be overcome by considering only a subset of the paths in $\bar{P}^{(k)}$; the master problem restricted to this subset is referred to as the restricted master problem (RMP). Let $\gamma_e^{(k)}$, $\pi^{(k)}$, and $\lambda_e$ denote the dual variables associated with constraints (3.1b), (3.1c), and (3.1d), respectively. We have $\gamma_e^{(k)} \ge 0$, $\lambda_e \le 0$, and $\pi^{(k)} \ge 0$, since constraints (3.1c) can be equivalently written as greater-than-or-equal-to constraints. The reduced costs of $f(p)$ and $z_e$ are

$$\bar{c}_p^{\gamma,\pi} = \sum_{e \in E} \delta_e(p) \gamma_e^{(k)} - \pi^{(k)} \quad \text{and} \quad \bar{c}_e^{\gamma,\lambda} = c_e - \sum_{k \in K} \gamma_e^{(k)} - \lambda_e,$$

respectively. Then, the dual feasibility conditions can be written as follows:

\begin{align*}
\bar{c}_p^{\gamma,\pi} &\ge 0, && \forall p \in \bar{P}^{(k)},\ \forall k \in K,\\
\bar{c}_e^{\gamma,\lambda} &\ge 0, && \forall e \in E,\\
M - \pi^{(k)} &\ge 0, && \forall k \in K.
\end{align*}
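Moreover, we can obtain the following complementary slackness conditions.

\begin{align*}
\bar{c}_p^{\gamma,\pi} f(p) &= 0, && \forall k \in K,\ p \in \bar{P}^{(k)},\\
\bar{c}_e^{\gamma,\lambda} z_e &= 0, && \forall e \in E,\\
(M - \pi^{(k)}) s^{(k)} &= 0, && \forall k \in K.
\end{align*}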
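As a small numerical illustration of the reduced costs of $f(p)$ and $z_e$ used in these conditions, the Python sketch below evaluates both quantities for made-up dual values. The function names and all numbers are hypothetical; in practice the duals would be read from an LP solver after solving the RMP.

```python
def path_reduced_cost(path, gamma_k, pi_k):
    """Reduced cost of f(p): sum of gamma_e^(k) over links of p, minus pi^(k)."""
    links = zip(path, path[1:])
    return sum(gamma_k.get(link, 0.0) for link in links) - pi_k

def link_reduced_cost(link, cost, gamma, lam):
    """Reduced cost of z_e: c_e - sum_k gamma_e^(k) - lambda_e."""
    total_gamma = sum(g.get(link, 0.0) for g in gamma.values())
    return cost[link] - total_gamma - lam.get(link, 0.0)

# Hypothetical dual values (gamma >= 0, lambda <= 0, pi >= 0).
gamma = {"t1": {("s", "a"): 1.0, ("a", "t1"): 0.5}, "t2": {("s", "a"): 0.5}}
pi = {"t1": 2.0, "t2": 1.0}
cost = {("s", "a"): 3.0}
lam = {("s", "a"): -0.5}

rc_path = path_reduced_cost(("s", "a", "t1"), gamma["t1"], pi["t1"])
rc_link = link_reduced_cost(("s", "a"), cost, gamma, lam)
print(rc_path, rc_link)
```

Here the path has a negative reduced cost, so its flow variable violates dual feasibility and would be a candidate column, whereas the link variable has a positive reduced cost.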
By applying the Karush-Kuhn-Tucker conditions (Bazaraa et al. 2006) to the master problem, solutions $(f(p), z_e)$ and $(\gamma, \pi, \lambda)$ are optimal for the primal and dual master problems if and only if they are primal and dual feasible, respectively, and satisfy the complementary slackness conditions. Let $f(p)$, $z_e$, and $s^{(k)}$ constitute a feasible solution of the master problem (3.1) with dual variables $\gamma_e^{(k)}$, $\lambda_e$, and $\pi^{(k)}$. Since $M$ is a sufficiently large number, the condition $M - \pi^{(k)} \ge 0$ is always satisfied for each $k$. As soon as we have a feasible solution with $s^{(k)} = 0$, the condition $(M - \pi^{(k)}) s^{(k)} = 0$ is satisfied for each $k$. Subsequently, we need to check the optimality conditions for the path and coded packet variables. If there exists a path $p \in \bar{P}^{(k)}$ with $\bar{c}_p^{\gamma,\pi} < 0$ or a variable $z_e$ with $\bar{c}_e^{\gamma,\lambda} < 0$, then $f(p)$ or $z_e$ may be chosen as the entering variable. These results yield one sub-problem for each destination and one for each link.

According to the definition of $\delta_e(p)$, the reduced cost of path $p \in \bar{P}^{(k)}$ is $\bar{c}_p^{\gamma,\pi} = \sum_{e \in p} \gamma_e^{(k)} - \pi^{(k)}$, which means that the reduced cost of a path is just the cost of that path with respect to the costs $\gamma_e^{(k)}$, minus $\pi^{(k)}$. Then, to see whether the dual variables $\gamma_e^{(k)}$ and $\pi^{(k)}$ satisfy the dual feasibility conditions, we obtain the sub-problem $\min\{\bar{c}_p^{\gamma,\pi} \mid p \in \bar{P}^{(k)}\}$ for each destination $k$. That is,

\begin{align}
\min \quad & \sum_{(i,j) \in E} \gamma_{ij}^{(k)} f_{ij} \tag{3.2}\\
\text{s.t.} \quad & \sum_{j:(i,j) \in E} f_{ij} - \sum_{j:(j,i) \in E} f_{ji} =
\begin{cases} 1, & \text{if } i = s,\\ -1, & \text{if } i = k,\\ 0, & \text{otherwise,} \end{cases} \notag\\
& \sum_{(i,j) \in E} w_{ij} f_{ij} \le U^{(k)}, \notag\\
& f_{ij} \in \{0, 1\}. \notag
\end{align}

This problem is known as a constrained shortest path problem. Researchers have developed many methods to solve it, such as Lagrangian relaxation, enumeration, k-th shortest paths, and combinations of these ideas (Desrochers and Soumis 1986; Ziegelmann 2007). In this paper, we consider a label-correcting algorithm for solving the sub-problems, which is called the CSP-Multi-Labelling algorithm. Each label at each node $i$ is a value pair $(C_i^h, W_i^h)$ that states its cost and weight. One iteration of the multi-labelling algorithm chooses a certain label and performs label updates. A formal description of the multi-labelling algorithm to solve sub-problem (3.2) for destination $k$ is given below, where $s$, $k$, $U^{(k)}$, $Q$, and $l(i)$ denote the source, the destination, the weight limit, the queue, and the number of labels at node $i$, respectively.
Algorithm 1: CSP-Multi-Labelling Algorithm
1. Initialization: $C_s^1 = 0$, $W_s^1 = 0$, $Q = \{(C_s^1, W_s^1)\}$, $l(s) = 1$, $l(i) = 0$, $\forall i \in V - \{s\}$.
2. Label selection: If $Q = \emptyset$, terminate. Otherwise, delete the first label $(C_i^h, W_i^h)$ in $Q$.
3. Label update: For all links $(i, j) \in V_i^+$,
   (a) If $W_i^h + w_{ij} > U^{(k)}$, then go to the next $j$.
   (b) If there is a label $(C_j^n, W_j^n)$ such that $C_i^h + \gamma_{ij}^{(k)} \ge C_j^n$ and $W_i^h + w_{ij} \ge W_j^n$, then go to the next $j$.
   (c) Otherwise, let $C_j^{l(j)+1} = C_i^h + \gamma_{ij}^{(k)}$ and $W_j^{l(j)+1} = W_i^h + w_{ij}$.
       i. Create a new label $(C_j^{l(j)+1}, W_j^{l(j)+1})$.
       ii. If $j \ne k$, then insert this label into $Q$ with respect to $l(i)$.
       iii. If there is any label $(C_j^n, W_j^n)$ such that $C_j^{l(j)+1} \le C_j^n$ and $W_j^{l(j)+1} \le W_j^n$, then delete $(C_j^n, W_j^n)$ from $Q$. Update $l(i)$.
   Return to Step 2.
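The steps of Algorithm 1 can be sketched in Python as follows. This is a simplified reading under stated assumptions rather than the authors' implementation: the queue is processed in plain FIFO order instead of being sorted with respect to $l(i)$, each label also stores its path, and labels deleted by dominance are skipped lazily when they are popped. The function name, the adjacency format, and the example network are hypothetical.

```python
from collections import deque

def csp_multi_labelling(adj, source, dest, bound):
    """Return the Pareto-optimal (cost, weight, path) labels at dest.

    adj maps a node to a list of (neighbor, cost, weight) triples, with
    non-negative costs (the duals gamma) and non-negative weights.
    """
    start = (0.0, 0, (source,))
    labels = {source: [start]}
    queue = deque([(source, start)])
    while queue:
        node, label = queue.popleft()
        if label not in labels.get(node, []):
            continue  # label was deleted by dominance after being queued
        cost, weight, path = label
        for nxt, c, w in adj.get(node, []):
            new_cost, new_weight = cost + c, weight + w
            if new_weight > bound:  # step 3(a): weight limit violated
                continue
            existing = labels.setdefault(nxt, [])
            # step 3(b): skip if an existing label is at least as good
            if any(ec <= new_cost and ew <= new_weight for ec, ew, _ in existing):
                continue
            # step 3(c)iii: drop labels dominated by the new one
            existing[:] = [lab for lab in existing
                           if not (new_cost <= lab[0] and new_weight <= lab[1])]
            new_label = (new_cost, new_weight, path + (nxt,))
            existing.append(new_label)  # step 3(c)i
            if nxt != dest:  # step 3(c)ii: destination labels are not extended
                queue.append((nxt, new_label))
    return sorted(labels.get(dest, []))

# Hypothetical network: (neighbor, dual cost, weight) per outgoing link.
adj = {
    "s": [("a", 1.0, 5), ("b", 2.0, 1)],
    "a": [("t", 1.0, 5), ("b", 0.0, 1)],
    "b": [("t", 2.0, 1)],
}
result = csp_multi_labelling(adj, "s", "t", bound=8)
print(result)
```

With a weight limit of 8, the cheapest path s-a-t (weight 10) is rejected and two non-dominated labels remain at the destination; the first entry of the sorted result is the minimum-cost feasible path, whose cost minus $\pi^{(k)}$ gives the smallest reduced cost for that destination.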
The above algorithm sorts the labels in queue $Q$ with respect to $l(i)$ and allows multiple Pareto-optimal labels to be created at each node. We now derive the worst-case complexity of the CSP-Multi-Labelling algorithm. The basic operation in one iteration is to choose a certain label and perform label updates. Under the assumption of non-negative costs and non-negative integer weight values, the algorithm performs label selection in at most $nD^{(k)}$ iterations, and each iteration requires inserting labels into $Q$. Therefore, the total label selection time is $1 + 2 + \dots + nD^{(k)} = \frac{nD^{(k)}(nD^{(k)} + 1)}{2}$. The algorithm performs $|V_i^+|$ label updates in each iteration, for a total of $\sum_{i \in V} |V_i^+| = m$ updates. Since each label update operation requires $O(1)$ time, the algorithm requires $O(m)$ total time for updating all distance labels. Then, the CSP-Multi-Labelling algorithm has a worst-case time complexity of $O(n^2 (D^{(k)})^2)$.

We can improve the practical performance of the CSP-Multi-Labelling algorithm using an a priori threshold. A threshold value $U_i \le U^{(k)}$ at each node $i$ can be identified a priori by computing the classical shortest path from every node to the destination. Then, all labels at $i$ whose weight values exceed $U_i$ can be discarded. Similarly, a cost-threshold value with respect to the costs $\gamma_e^{(k)}$ can be computed; any label with a cost exceeding this threshold value can be discarded because it will not lead to a negative $\bar{c}_p^{\gamma,\pi}$. The CSP-Multi-Labelling algorithm usually terminates with multiple labels at the destination node, where each label represents a feasible path. We add one feasible path for each destination $k$ to the RMP and repeat the iterations. One may add the columns of multiple labels to the RMP to improve the convergence of column generation. However, these columns increase the size of the RMP, which makes solving the RMP more time-consuming. Moreover, because these columns are generated from the same dual prices, they do not achieve significant improvement.
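The a priori weight threshold can be sketched as follows, under one plausible reading of the description above: $U_i$ is taken to be $U^{(k)}$ minus the least total weight of any path from node $i$ to the destination, obtained with Dijkstra's algorithm on the reversed network. This interpretation, the function name `weight_thresholds`, and the example data are assumptions.

```python
import heapq

def weight_thresholds(adj, dest, bound):
    """Return U_i = bound - (least weight from i to dest) for nodes reaching dest.

    adj maps a node to a list of (neighbor, weight) pairs with weights >= 0.
    """
    # Build the reversed network so one Dijkstra run from dest covers all nodes.
    reverse = {}
    for u, edges in adj.items():
        for v, w in edges:
            reverse.setdefault(v, []).append((u, w))
    dist = {dest: 0}
    heap = [(0, dest)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for prev, w in reverse.get(node, []):
            nd = d + w
            if nd < dist.get(prev, float("inf")):
                dist[prev] = nd
                heapq.heappush(heap, (nd, prev))
    return {node: bound - d for node, d in dist.items()}

# Hypothetical link weights; the destination is t and the weight limit is 8.
adj = {"s": [("a", 5), ("b", 1)], "a": [("t", 5), ("b", 1)], "b": [("t", 1)]}
thresholds = weight_thresholds(adj, "t", bound=8)
print(thresholds)
```

A label at node $i$ whose accumulated weight exceeds `thresholds[i]` cannot be completed into a path within the weight limit, and a node missing from the result cannot reach the destination at all.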
For variables $z_e$, we wish to find $\min\{\bar{c}_e^{\gamma,\lambda} \mid e \in E\}$, which is easy to compute. After finding a solution to the RMP, we determine whether there are any columns not included in the RMP with negative reduced cost. If none can be found, then the current solution to the RMP is optimal for the master problem. If one or more such columns exist, then they are added to the RMP and the process is repeated. Since the obtained solution of the RMP corresponds to a feasible solution for the problem, each iteration of the column generation method computes an upper bound on the optimal value.
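The overall iteration described above can be summarized by the following skeleton. It is only a sketch: `solve_rmp`, `price_columns`, and `lower_bound` are hypothetical callables standing in for an LP solver, the pricing sub-problems, and a relaxation-based bound, and the relative-gap stopping rule follows the early termination option mentioned in the introduction. The toy functions at the end exist solely to make the sketch executable.

```python
def column_generation(solve_rmp, price_columns, lower_bound, columns,
                      rel_gap=1e-3, max_iter=100):
    """Generic column generation loop with bound tracking.

    solve_rmp(columns)   -> (objective value, dual values) of the RMP
    price_columns(duals) -> list of new columns with negative reduced cost
    lower_bound(duals)   -> a valid lower bound computed from the duals
    """
    best_upper, best_lower = float("inf"), float("-inf")
    for _ in range(max_iter):
        value, duals = solve_rmp(columns)
        best_upper = min(best_upper, value)
        best_lower = max(best_lower, lower_bound(duals))
        gap = (best_upper - best_lower) / max(1.0, abs(best_upper))
        new_columns = price_columns(duals)
        if not new_columns or gap <= rel_gap:
            break  # no improving column, or the bounds are close enough
        columns = columns + new_columns
    return best_upper, best_lower, columns

# Toy stand-ins (not a real LP): each added column lowers the value by 1,
# and pricing stops producing columns once three have been added.
def toy_rmp(columns):
    return 10.0 - len(columns), len(columns)

def toy_pricing(duals):
    return ["path"] if duals < 3 else []

def toy_bound(duals):
    return 5.0

upper, lower, cols = column_generation(toy_rmp, toy_pricing, toy_bound, [])
print(upper, lower, len(cols))
```

Passing the solver and pricing routines as functions keeps the loop independent of any particular LP library, which is a design choice of this sketch rather than something prescribed by the paper.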
3.2 Relaxation method

In this subsection, we propose two methods, based on linear and Lagrangian relaxation, to provide a tight lower bound on the objective value. First, we describe a linear relaxation of Model (2.7). According to constraints (2.7c) and (2.7h), we have $0 \le \frac{f(p)}{R} \le 1$ for each path $p$ in $P^{(k)}$. Then, constraint (2.7e) can be written as

$$\frac{f(p)}{R} \sum_{e \in E} w_e \delta_e(p) y(p) \le U^{(k)}, \quad \forall k \in K,\ \forall p \in P^{(k)}. \tag{3.3}$$

Variables $y(p)$ can be eliminated from inequality (3.3), because $y(p) = 1$ if $f(p) > 0$, and zero otherwise. Hence, the inequality can be modified as

$$\frac{f(p)}{R} \sum_{e \in E} w_e \delta_e(p) \le U^{(k)}, \quad \forall k \in K,\ \forall p \in P^{(k)}.$$

The above constraint weights each path by its proportion of the flow toward the destination. Thus, the linear relaxation model of the problem can be expressed as follows:
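\begin{align}
Z_r = \min \quad & \sum_{e \in E} c_e z_e \tag{3.4}\\
\text{s.t.} \quad & \sum_{p \in P^{(k)}} \delta_e(p) f(p) \le z_e, && \forall e \in E,\ k \in K, \notag\\
& \sum_{p \in P^{(k)}} f(p) = R, && \forall k \in K, \notag\\
& \sum_{e \in E} f(p) w_e \delta_e(p) \le R\, U^{(k)}, && \forall p \in P^{(k)},\ k \in K, \notag\\
& 0 \le z_e \le u_e, && \forall e \in E, \notag\\
& 0 \le f(p), && \forall p \in P^{(k)},\ k \in K. \notag
\end{align}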
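A small numerical check, with made-up values, shows why Model (3.4) admits solutions that the original model forbids: a path that violates the weight bound may still carry a fraction of the flow. The function names and numbers below are illustrative assumptions.

```python
def satisfies_exact(path_weight, flow, bound):
    """Original requirement: every path with positive flow respects the bound."""
    return flow == 0 or path_weight <= bound

def satisfies_relaxed(path_weight, flow, rate, bound):
    """Relaxed constraint of Model (3.4): f(p) * W(p) <= R * U."""
    return flow * path_weight <= rate * bound

# Hypothetical data: R = 2, U = 10, and a path of weight 15.
rate, bound, path_weight = 2, 10, 15
half_flow = satisfies_relaxed(path_weight, 1, rate, bound)   # 15 <= 20
full_flow = satisfies_relaxed(path_weight, 2, rate, bound)   # 30 <= 20 fails
exact = satisfies_exact(path_weight, 1, bound)               # 15 <= 10 fails
print(half_flow, full_flow, exact)
```

The path is infeasible in the original model for any positive flow, yet it may carry half of the rate in the relaxed model, so the relaxed feasible region is larger and its optimal value can only be lower.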
Since Model (3.4) is a linear relaxation of Model (2.7), we have $Z_r \le Z^*$. If the weights of the used paths have small variation, then the average weight provides a good approximation to the individual path weights. This observation is confirmed by the computational results in Sect. 4. Next, we describe a method based on Lagrangian relaxation to obtain a lower bound using the dual values computed by the column generation approach. Let $\bar{Z}_l(\gamma, \lambda, \pi)$ denote the optimal objective function value of the RMP in iteration $l$ of the column generation approach. Since the obtained solution of the RMP corresponds to a feasible solution for the problem, $Z^* \le \bar{Z}_l(\gamma, \lambda, \pi)$.
Lemma 3.2.1 Z l (γ , λ, π ) =
γ ,λ
γ ,λ {e:c¯e π (k) , we should have $s^{(k)} = 0$ to minimize the expression. Let path $q \in \bar{P}^{(k)}$ denote a path with the least reduced cost obtained in the current iteration, i.e., $\bar{c}_q^{\gamma,\pi} = \min_{p \in \bar{P}^{(k)}} \{\bar{c}_p^{\gamma,\pi}\}$. If $\bar{c}_q^{\gamma,\pi} < 0$, then $f(q) = R$; otherwise $f(p) = 0$, $\forall p \in \bar{P}^{(k)}$. For each link $e$, the optimal value of $z_e$ is equal to $u_e$ if $\bar{c}_e^{\gamma,\lambda} < 0$; otherwise $z_e = 0$. Thus, we get $Z(\gamma, \lambda, \pi) =$
γ ,λ
{e:c¯e