2016 12th Int. Conference on the Design of Reliable Communication Networks (DRCN 2016)
Mixed-Integer Optimization for the Combined capacitated Facility Location-Routing Problem
Dimitri Papadimitriou, Didier Colle, Piet Demeester
Nokia - Bell Labs Antwerp, Belgium Email:
[email protected]
Ghent University Ghent, Belgium Email: {didier.colle,piet.demeester}@intec.ugent.be
Abstract—Given a set of potential facilities (each with an individual opening cost and finite capacity) and a set of client demands, the capacitated Facility Location Problem (cFLP) consists in selecting a set of facilities to open and assigning demands to open facilities so as to minimize the sum of the facility opening cost, supplying cost and transportation/connection cost. This problem shares many similarities with the one consisting in minimizing the cost of opening a set of servers or replicas for storing a set of data objects (files) and connecting clients to facility locations so as to satisfy their demands. The multi-source dimension of the problem reflects the situation where the same data object may be available simultaneously at different facility locations. When various data objects with different properties are available at (possibly multiple) opened facility locations, the problem shares the characteristics of the multi-product or multi-commodity model. On the other hand, the combination of the cFLP with the network design problem, referred to as the Capacitated Facility Location-Network Design Problem (cFLNDP), has been the subject of several studies since the early 2000s. This (class of) problem(s) was initially motivated by the observation that when the (physical) network topology is determined endogenously, it may be more effective to change the configuration of the underlying physical network topology than to locate additional facilities in order to account for the spatio-temporal variability in client demands. However, in the context of communication networks, one does not need to physically rewire the network to obtain a new logical topology. Hence, in this paper we combine the capacitated multi-source multi-product facility location problem with the traffic routing problem and propose a mixed-integer program formulation. As dynamic routing offers the flexibility to re-configure routing table entries and adapt routing decisions, our combined model also enables the dynamic re-allocation of demands upon facility failure.
I. INTRODUCTION
Let G = (V, E) denote the graph where the vertex set V represents both the set of demand originating points (or clients) I ⊆ V, |I| = I and the set of potential facility locations J ⊆ V, |J | = J. Each facility j ∈ J of finite capacity bj has an associated opening (installation) cost ϕj and an assignment cost κij for satisfying the demand aik originated at i for product type k ∈ K, |K| = K. The (discrete) multisource capacitated Facility Location Problem (cFLP) consists in choosing a subset of potential locations where to open a facility and assign every client i ∈ I with known demands to a (sub)set of open facilities, while ensuring that no open facility is assigned to more client demands than its available capacity. The goal is to find a set of facilities to open and an assignment of demands to open facilities that minimize the sum of the opening cost of the selected facilities, the client (or
978-1-4673-8496-4/16/$31.00 ©2016 IEEE
customer) demand supplying cost at each facility and the cost of connecting each demand originating point to the facilities its demands are assigned to. A vast amount of literature has been dedicated to the study of various discrete facility location models and its numerous variants. The main properties of the facility location model considered in this paper are i) multi-source: the demands initiated by a given customer may be served by multiple (opened) facilities, i.e., each demand initiating point i ∈ I can be assigned to multiple installed facilities; ii) multi-product: each opened facility j ∈ J offers multiple product groups as digital objects (data files) show different properties (defining their type) in content distribution networks or other distributed information storage systems; and iii) shared-capacity: the capacity bj installed at each facility j is shared among all product types hosted by that facility. Moreover, a single and unique facility may be opened at each location. Our model also assumes symmetric connection cost; further referred as traffic routing cost in the remainder of this paper, this property ensures that the optimal solution to the client-to-server problem corresponds to the optimal solution to the server-to-client problem. It also implies that network metrics (and in particular, link metrics) are symmetric; the cost from the client demand point at vertex u to the facility located at vertex v equals to the cost from vertex v to u. This problem shares many similarities with the one consisting in minimizing the cost for opening a set of servers or replicas for storing a set of data objects (files) and connecting clients to facility locations so as to satisfy their demands. The multi-source dimension of the problem translates the situation where the same data object may be available simultaneously at different facility locations. 
Moreover, when various data objects with different properties are available at (possibly multiple) opened facilities, the problem shares the characteristics of the multi-product or multi-commodity model. Nevertheless, a specific characteristic of digital (data) objects compared to physical goods leads to a different formulation of the inequalities characterizing the facility capacity constraints (in contrast to classical facility location problems). Indeed, servicing one customer demand for a given data object at a given location does not prevent servicing another customer requesting the same data object. Hence, only a single copy of each data object hosted at installed facilities is required even if this object is assigned to multiple customer demands $a_{ik}$; this is the main distinguishing characteristic compared to the classical version of the facility location problem for physical goods.
In this paper, we propose to combine the capacitated Facility Location Problem (cFLP) and the routing decision problem. These two problems (facility location and routing decision) are usually solved independently, but arguably it is more realistic and effective to model and solve them simultaneously. Indeed, most of the aforementioned facility location models install facilities on a predetermined network topology, although the graph it defines may seriously affect the optimal location of facilities as well as the allocation of client demands to opened facilities, since capacity constraints significantly influence the choice of facility. On the other hand, in its simplest version, the routing decision problem can be scoped to the flow routing problem. Given a network graph G with capacitated edges and a set of commodities (demands), the latter consists in minimizing the cost resulting from the traffic routing decisions performed by the network nodes. The goal is to find an assignment of all flows which satisfies all demands and respects the link capacities while minimizing the solution cost, defined as the sum of the costs incurred by the link capacity utilization to route traffic on the links. This problem is referred to as the min-cost multi-commodity flow (MMCF) problem. We then exploit the properties of this combined model to propose a dynamic demand re-allocation scheme which relies on the re-routing of traffic flows (adapting routing decisions) in case of facility failure. The remainder of this paper is structured as follows. In Section II, we review the prior work related to the combination of the cFLP with the network design problem, position it against our model, and detail the objectives and the main contribution developed throughout this paper. The properties of this model can also be exploited to propose a resilience scheme in case of facility failure.
The combined multi-source multi-product cFLP - flow routing problem is documented in Section III together with its analysis in order to obtain a tighter formulation. In Section IV, we report the computational results obtained using this formulation and analyze them. Finally, Section V concludes this paper together with directions for future research work.
II. RELATED WORK, OBJECTIVES, AND CONTRIBUTION
A. Related Work and Positioning
The combination of the cFLP with the network design problem, referred to as cFLNDP, has been the subject of several studies since the early 90's. In [2], Daskin et al. extended the uncapacitated fixed-charge Facility Location Problem to introduce the first (known) formulation of the combined Uncapacitated Facility Location-Network Design Problem (uFLNDP). In 2001, Melkote and Daskin generalized the uFLNDP formulation described in [3] to the Capacitated Facility Location-Network Design Problem (cFLNDP), in which facilities have constraining capacities on the amount of demand they can serve. The MIP formulation of the problem proposed by these authors in [4] includes inequalities to strengthen the LP relaxation. In this seminal paper, some heuristic methods were considered together with an extensive sensitivity analysis for this problem. Since then, several efforts have been undertaken to further elaborate on this formulation. In 2012, Contreras et al. [1] proposed a unified view coupling location and network design, comparing different arc-node and path formulations; however, the authors do not compare these
Fig. 1. Rewiring
Fig. 2. Rerouting
formulations with the aim of differentiating between network design and flow routing decisions. This (class of) problem(s) was initially motivated by the observation that when the (physical) network topology is determined endogenously, it may be more effective to change the configuration of the underlying physical topology of the network than to locate additional facilities. Though (as expected) the facility investment monotonically decreases as the number of facilities and their capacity are increased, the results reported in [4] show that the link investment does not decrease monotonically as the facility capacity increases. These results indicate that considering coupled decisions would be more effective than independent decisions. As depicted in Fig. 1, if the demand initiated by the client point i = 3 cannot be satisfied by the facility located at j = 2, rewiring to the facility located at j = 1 (withdrawal of arc (3,2) and addition of arc (3,1)) may provide a more effective solution than augmenting the capacity of facility j = 2. In practice, the cFLNDP has many real-world applications such as planning, supply chain and logistics problems, power transmission, pipeline distribution systems, and hub-and-spoke transshipment-terminal networks. In the context of packet-oriented (or even circuit-oriented) communication networks, one does not need to physically rewire the network to obtain a new logical topology (routing and switching functions fulfill this role). This situation is depicted in Fig. 2, where rerouting replaces physical rewiring: instead of requiring the installation of an arc between client demand point i = 3 and facility location j = 1, intermediate nodes can be used to relay information from the demand point to its assigned facility. Hence, modeling and solving both facility location and flow routing problems simultaneously becomes a reasonable objective.
Moreover, for digital goods, only a single copy of each data object hosted on an installed facility is required even if this object is assigned to multiple client demands $a_{ik}$. This property defines the main distinguishing characteristic compared to the cFLP when applied to physical goods. As can be observed from Fig. 3 (left side), for physical goods, two concurrent demands for a given type of product (initiated from client
Fig. 3. Physical goods vs. Digital goods model
Fig. 4. Demands rerouting
Fig. 5. Demands protection
demand points i = 3 and i = 4) assigned to the facility j = 1 require storing two copies of this product on that facility (thus, in turn, requiring to double the amount of capacity required on that facility in order to satisfy the client demands for that product). Additionally, facilities may also fail and lead in turn to excessive routing cost. Indeed, upon occurrence of facility failure, some customer demands have to be reassigned to facilities located much farther than their regularly (primarily) assigned facilities. To address this problem some facility location models that minimize the location and demand allocation costs as well as hedge against facility failures were developed. Snyder and Daskin [7] first introduced the reliable fixed-charged location problem (RFLP) model based on level assignments. The goal being to minimize the facility location cost while also taking into account the expected allocation cost after failures of facilities, the objective function is formulated as a weighted sum of the allocation cost under normal circumstances and random disruptions with uniform failure probability of the candidate facility locations. The strategy adopted for handling facility disruptions consists in assigning each customer demand to a set of facilities in sequence so that each customer would be served by a first backup facility if its primary facility fails (level r = 0), a second backup if its first backup fails (level r = 1), and so on. The RFLP model developed in [7] entails nevertheless a significant drawback because it assumes that facilities are uncapacitated whereas capacity significantly influences the choice of facility (and in turn the objective value) but also the resolution time by LP solvers. Recently, the RFLP has been extended to its capacitated variant, referred to as the cRFLP [9], so that backup facilities can only serve the demands assigned to failed facilities if they have enough extra capacity to satisfy the customers demands. 
In the capacitated variant, a given level r does not imply that there are r closer opened facilities, as in the RFLP, because the closest remaining facility may not have enough excess capacity. Both RFLP variants protect customer demands against facility failure by means of leveled backup assignment; they also involve a non-failable emergency facility, which is used if all open facilities have failed (and thus no facilities are available to serve customer i) or if the new allocation cost after failure is higher than the cost of assigning i to any of the existing facilities that have not failed. Instead of considering backup allocation (at the customer demand level), [6] proposes the assignment of backup locations (at the facility level). When choosing a location for a facility, another facility is selected which will serve as its backup when the primary facility fails. In other terms, with the RFLP variants (demand allocation protection), two customers assigned to the same facility may have different
backup facilities, whereas with facility location protection, they have the same backup facility. In this paper, we take advantage of the re-routing capability of the proposed combined model in order to enable the dynamic reallocation of demands upon failure of their assigned facility. For this purpose, we assume that the flow routing decision process is aware of the distribution of the products per facility (and their availability) and that the non-failable entity behaves similarly to a depot which replenishes remaining facilities with digital objects that became inaccessible after the failure of their (primary) hosting facility. As depicted in Fig. 4, the demands initially assigned to the facility located at j = 1 are re-routed to the facility located at j = 2 when the facility placed at the former location fails. This resilience scheme has to be contrasted with the one involving protection of customer demands shown in Fig. 5; in the latter case, the facility located at j = 1 acts as (level 1) backup of the facility located at j = 2 (and vice-versa) and all corresponding demand assignments are realized in anticipation of any potential facility failure. Such a scheme increases both network/link and facility capacity requirements, which results in high operating costs to hedge against facility disruptions.
B. Objectives and Contribution
This paper proposes a mixed-integer formulation for the combined multi-source multi-product capacitated facility location-flow routing problem, referred to as MSMP-cFLFRP. This formulation accounts for the specifics of digital object storage and supply; more precisely, only a single copy of each digital object needs to be stored at each facility even if it satisfies more than one customer demand. This characteristic leads to a major distinction in the expression of the facility capacity constraints compared to their canonical formulation when modeling the facility location problem for physical goods.
Indeed, the capacity constraints which ensure that customer demands are only served at opened facilities and
that the capacities of the installed facilities are not exceeded yield a fractional term in the LHS expressing the shareability of digital objects stored at each opened facility. To the best of our knowledge, this is the first known formulation of the so-called digital content placement problem (using the network-oriented language) for multiple commodities/products. Most known formulations translate the multi-product problem as a single-commodity problem solved separately for each product (thus preventing shared capacity on installed facilities from being accounted for) and without accounting for the underlying traffic routing decision process other than considering some abstract cost metric for the connection cost. At the computational level, in particular, the inclusion of such a class of constraints leads to non-linearity, which requires specific treatment to become processable by linear/quadratic solvers such as CPLEX. Finally, as dynamic routing offers the flexibility to re-configure routing table entries and adapt routing decisions in case of facility failure, we exploit the properties of our model to propose a dynamic demand re-allocation scheme and compare its performance (in terms of facility capacity and demand allocation cost) with the capacitated reliable fixed-charge location problem (cRFLP).
III. FORMULATION
A. Input data and Parameters
We are given a directed graph $G = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ defines the set of nodes and $\mathcal{E}$ the set of arcs, each denoted by (u, v) with u referring to the head-end and v to the tail-end of the arc. We also have at our disposal the set of potential facility locations $\mathcal{J} \subseteq \mathcal{V}$, the set of product types $\mathcal{K}$ ($|\mathcal{K}| = K$) that can be hosted by each facility $j \in \mathcal{J}$, and the demand set A, where $a_{ik}$ denotes the size of the requested object (product) of type $k \in \mathcal{K}$ initiated by the client/customer demand point $i \in \mathcal{I} \subseteq \mathcal{V}$. The total demand A over all product types is given by the double sum:
$$A = \sum_{i \in \mathcal{I}} \sum_{k \in \mathcal{K}} a_{ik}. \tag{1}$$
Note that in comparison to the canonical multicommodity flow routing problem, we have at our disposal the source initiating point of the demand but obviously not its destination (determining it being the objective of the allocation problem). For the facility location part of the problem, the model parameters include the capacity $b_j$ of the facility installed at location $j \in \mathcal{V}$. In this paper, this capacity represents the storage capacity available at the facility opened at node j. For the flow routing part of the problem, the model parameters comprise the nominal maximum capacity $q_{uv}$ of the arc $(u, v) \in \mathcal{E}$.
B. Variables
The following variables specific to the multi-source multi-product cFLP are defined:
• $x_{ijk}$: continuous variable which denotes the fraction of the demand (of size) $a_{ik}$ for product type k requested by the customer demand node i which is satisfied by the facility (installed at location) $j \in \mathcal{J}$;
• $y_j$: binary variable which equals 1 if a facility of capacity $b_j$ is installed at location j;
• $z_{jk}$: binary variable which equals 1 if the product of type k is provided at the facility j.
The flow routing problem involves the continuous variable $f_{uv,ijk}$, which indicates the amount of traffic flowing on arc (u, v) in supply of the customer demand i for product k assigned to the opened facility j.
C. Costs
The cost of a solution to the MSMP-cFLFR problem combines the sum of i) the facility location cost $\varphi_j$: the cost for installing a facility at location $j \in \mathcal{J} \subseteq \mathcal{V}$; ii) the demand allocation cost $\kappa_{ijk}$: the cost for allocating to the facility installed at j the fraction of the demand $a_{ik}$ for product type $k \in \mathcal{K}$ requested by the customer demand point $i \in \mathcal{I}$ (includes the cost for storing and supplying $a_{ik}$ units of product k from the facility j to the requesting customer/demand point i); and iii) the traffic routing cost from the demand point i to the assigned facility: the cost per unit of traffic routed along arc (u, v) is denoted by $\tau_{uv}$.
D. Mixed-Integer Program
The MSMP-cFLFR problem can be formulated as follows:
$$\min \sum_{j \in \mathcal{J}} \varphi_j y_j + \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}} \sum_{k \in \mathcal{K}} \kappa_{ijk} x_{ijk} + \sum_{(u,v) \in \mathcal{E}} \tau_{uv} \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}} \sum_{k \in \mathcal{K}} f_{uv,ijk} \tag{2}$$
subject to the following constraints: The demand satisfaction constraints express that the demand $a_{ik}$ for product type k originated by customer i shall be exactly satisfied by the set of installed facilities:
$$\sum_{j \in \mathcal{J}} x_{ijk} = 1, \quad i \in \mathcal{I},\ k \in \mathcal{K},\ a_{ik} > 0 \tag{3}$$
The product type k is available on facility j only if this facility has been opened (these constraints forbid assigning products to closed facilities):
$$z_{jk} \le y_j, \quad j \in \mathcal{J},\ k \in \mathcal{K} \tag{4}$$
Demand by customer i for product type k can be satisfied by facility j only if this product type is available on the opened facility j (these constraints forbid the delivery of product type k from facility j to the demand point i if the product type k is unavailable at that facility):
$$x_{ijk} \le z_{jk}, \quad i \in \mathcal{I},\ j \in \mathcal{J},\ k \in \mathcal{K} \tag{5}$$
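As a rough illustration of how the objective (2) and the assignment constraints (3)-(5) fit together, the following Python sketch evaluates both for a hand-made candidate solution. The dictionary layout, the function names `objective` and `check_assignment`, and the toy data (two candidate facilities 1 and 2, clients 3 and 4, a single product type 'k') are assumptions made for this example only; they are not part of the formulation.

```python
def objective(phi, kappa, tau, y, x, f):
    """Evaluate objective (2): opening + allocation + routing cost."""
    opening = sum(phi[j] * y[j] for j in phi)
    allocation = sum(kappa[key] * frac for key, frac in x.items())
    # f is keyed by (arc, (i, j, k)); tau is keyed by arc.
    routing = sum(tau[arc] * flow for (arc, _ijk), flow in f.items())
    return opening + allocation + routing


def check_assignment(a, y, z, x, tol=1e-9):
    """Return True if a candidate (y, z, x) satisfies constraints (3)-(5)."""
    facilities = list(y)
    for (i, k), size in a.items():
        if size > 0:
            total = sum(x.get((i, j, k), 0.0) for j in facilities)
            if abs(total - 1.0) > tol:           # (3) demand fully assigned
                return False
    for (j, k), available in z.items():
        if available > y[j]:                     # (4) product only on open facility
            return False
    for (i, j, k), frac in x.items():
        if frac > z.get((j, k), 0) + tol:        # (5) serve only hosted products
            return False
    return True


# Toy instance (hypothetical data): only facility 1 is opened.
phi = {1: 10.0, 2: 12.0}
a = {(3, "k"): 4.0, (4, "k"): 4.0}
y = {1: 1, 2: 0}
z = {(1, "k"): 1, (2, "k"): 0}
x = {(3, 1, "k"): 1.0, (4, 1, "k"): 1.0}
kappa = {(3, 1, "k"): 2.0, (4, 1, "k"): 3.0}
tau = {(3, 1): 1.0, (4, 1): 1.5}
f = {((3, 1), (3, 1, "k")): 4.0, ((4, 1), (4, 1, "k")): 4.0}

total = objective(phi, kappa, tau, y, x, f)
feasible = check_assignment(a, y, z, x)
print(total, feasible)
```

The sketch only checks a given solution; it does not optimize. Assigning client 4 to the closed facility 2 instead would violate (5), which is what the second check in the loop over `x` detects.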
The facility capacity constraints ensure that the sum of the fractions of the customer demands $x_{ijk}$ assigned to a given opened facility j does not exceed its maximum capacity $b_j$. In the physical goods model, a subset of d demands $a_{ik} \in A$ for the same product type k and of the same size s that is assigned to a given opened facility j consumes $d \cdot s$ units of capacity (at each facility, identical demands are allocated to distinct objects). Hence, for physical goods the facility capacity constraints are given by the inequalities $\sum_{i \in \mathcal{I}} \sum_{k \in \mathcal{K}} a_{ik} x_{ijk} \le b_j y_j$, $j \in \mathcal{J}$.
In contrast, for digital goods, the same set of d demands for the same product type k of the same size s that is assigned to a given opened facility j consumes only s units of capacity (at each facility, identical demands are allocated to the same object). Hence, each fraction $x_{ijk}$ is divided by the term $\sum_{\ell \in \mathcal{L}} x_{\ell jk}$, i.e., the sum of the fractions of identical demands that are assigned to the same facility j. In this sum, $\mathcal{L} \subseteq \mathcal{I}$ denotes the set of clients originating identical demands (for product type k of the same size s) that are assigned to the same facility j as the other clients of that set. Note that another main consequence of the capacity constraints is that demands may not be assigned to their closest facility.
$$\sum_{i \in \mathcal{I}} \sum_{k \in \mathcal{K}} a_{ik} \frac{x_{ijk}}{\sum_{\ell \in \mathcal{L}} x_{\ell jk}} \le b_j y_j, \quad j \in \mathcal{J} \tag{6}$$
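The difference between the physical goods inequality and the digital goods constraints (6) can be checked numerically. The Python sketch below computes the left-hand side of both for one facility. It assumes, for simplicity, that all demands for a given product type have the same size, so that the set of identical demands can be taken as all clients with a positive fraction of product k at facility j; the function names and the data are hypothetical.

```python
def physical_load(a, x, j):
    """LHS of the physical goods constraint: every demand consumes capacity."""
    return sum(a[i, k] * frac for (i, jj, k), frac in x.items() if jj == j)


def digital_load(a, x, j):
    """LHS of (6): identical demands at facility j share one stored copy.

    Assumption: all demands for product k have the same size, so the set of
    identical demands is every client with a positive fraction at j.
    """
    shared = {}
    for (i, jj, k), frac in x.items():
        if jj == j:
            shared[k] = shared.get(k, 0.0) + frac
    return sum(
        a[i, k] * frac / shared[k]
        for (i, jj, k), frac in x.items()
        if jj == j and frac > 0
    )


# Two clients (3 and 4) request the same object of size 4 from facility 1.
a = {(3, "k"): 4.0, (4, "k"): 4.0}
x = {(3, 1, "k"): 1.0, (4, 1, "k"): 1.0}

print(physical_load(a, x, 1))  # two copies: d * s = 2 * 4
print(digital_load(a, x, 1))   # one shared copy: s = 4
```

With d = 2 identical demands of size s = 4, the physical model charges 8 units of capacity while the digital model charges 4, matching the d·s versus s distinction made in the text.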
A set of constraints is then introduced linking the flow routing decisions to the facility location problem. First, the traffic flowing along each arc $(u, v) \in \mathcal{E}$ from the demand point i to the opened facility j is bounded by the minimum between the corresponding arc nominal capacity $q_{uv}$ and the fraction $x_{ijk}$ of the customer demand $a_{ik}$ for product type k allocated to the facility j.
$$f_{uv,ijk} \le \min(q_{uv}, a_{ik} x_{ijk}), \quad (u, v) \in \mathcal{E},\ i \in \mathcal{I},\ j \in \mathcal{J},\ k \in \mathcal{K},\ i \ne j \tag{7}$$
The mutual capacity constraints ensure that the load (sum of traffic flows) on individual arcs $(u, v) \in \mathcal{E}$ does not exceed their nominal capacity $q_{uv}$.
$$\sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}} \sum_{k \in \mathcal{K}} f_{uv,ijk} \le q_{uv}, \quad (u, v) \in \mathcal{E} \tag{8}$$
The customer demand for product type k originated at demand point i may be serviced by multiple sources (facilities); this property prevents the use of binary activation variables for individual product (type, size) pairs in the capacity constraints (6). A fraction of the demand $a_{ik}$ could indeed be satisfied by the (local) facility installed at that point i and the remaining part of the demand satisfied by multiple remote facilities.
$$a_{ik} x_{iik} + \sum_{v \in \mathcal{V}: (i,v) \in \mathcal{E}} \sum_{j \in \mathcal{J}} f_{iv,ijk} = a_{ik}, \quad i \in \mathcal{I},\ k \in \mathcal{K},\ i \ne j,\ a_{ik} > 0 \tag{9}$$
The flow conservation constraints further impose that the (incoming) traffic flow associated with the demand for product type k originated by the client i and entering node u where a facility j is located must be equal to the demand fraction served by that facility plus the outgoing traffic flow leaving that node when this demand is assigned to another facility.
$$\sum_{v: (v,u) \in \mathcal{E}} \sum_{j \in \mathcal{J}} f_{vu,ijk} = \sum_{v: (u,v) \in \mathcal{E}} \sum_{j \in \mathcal{J}} f_{uv,ijk} + x_{iuk} a_{ik}, \quad i \in \mathcal{I},\ u \in \mathcal{V},\ k \in \mathcal{K},\ u \ne i \tag{10}$$
1) Fractional Constraints: In the physical goods model, the facility capacity constraints impose for each facility j that $\sum_{i \in \mathcal{I}} \sum_{k \in \mathcal{K}} a_{ik} x_{ijk} \le b_j y_j$. Instead, for the digital goods model, these constraints (6) include a fractional term in their LHS which can be written as ($\mathcal{L} \subseteq \mathcal{I}$):
$$\sum_{i \in \mathcal{I}} \sum_{k \in \mathcal{K}} a_{ik} \frac{x_{ijk}}{x_{ijk} + \sum_{i^* \in \mathcal{L} \setminus \{i\}} x_{i^* jk}} \le b_j y_j \tag{11}$$
To linearize these constraints, we first define a new variable $\xi_{jk}$ such that
$$\xi_{jk} = \frac{1}{x_{ijk} + \sum_{i^* \in \mathcal{L} \setminus \{i\}} x_{i^* jk}} \tag{12}$$
This equality is equivalent to
$$\xi_{jk} \Big( x_{ijk} + \sum_{i^* \in \mathcal{L} \setminus \{i\}} x_{i^* jk} \Big) = \sum_{i \in \mathcal{L}} \xi_{jk} x_{ijk} = 1 \tag{13}$$
In terms of the new variables $\xi_{jk}$, the facility capacity constraints (6) can then be rewritten as follows
$$\sum_{i \in \mathcal{I}} \sum_{k \in \mathcal{K}} a_{ik} \xi_{jk} x_{ijk} \le b_j y_j, \quad j \in \mathcal{J} \tag{14}$$
$$\sum_{i \in \mathcal{L}} \xi_{jk} x_{ijk} = 1, \quad j \in \mathcal{J},\ k \in \mathcal{K} \tag{15}$$
Following the theorem in [8], the polynomial mixed term z = xy, where x is a binary variable and y is a continuous variable, can be represented by the following linear inequalities: 1) $z \le U x$; 2) $z \le y - L(1 - x)$; 3) $z \ge y - U(1 - x)$; 4) $z \ge L x$, provided the upper bound U and the lower bound L of the continuous variable y are known, i.e., $L \le y \le U$. Hence, if one considers instead that the variables $x_{ijk}$ are binary (and thus that the model follows the single-assignment property), the facility capacity constraints can be linearized by introducing the auxiliary variables $\zeta_{ijk} = \xi_{jk} x_{ijk}$ together with the following constraints: 1) $\zeta_{ijk} \le U x_{ijk}$; 2) $\zeta_{ijk} \le \xi_{jk} - L(1 - x_{ijk})$; 3) $\zeta_{ijk} \ge \xi_{jk} - U(1 - x_{ijk})$; 4) $\zeta_{ijk} \ge L x_{ijk}$. Following the introduction of the auxiliary variables $\zeta_{ijk} = \xi_{jk} x_{ijk}$ such that $L (= 0) \le \xi_{jk} \le U (= 1)$, we obtain the set of constraints:
$$\sum_{i \in \mathcal{I}} \sum_{k \in \mathcal{K}} a_{ik} \zeta_{ijk} \le b_j y_j \tag{16}$$
$$\sum_{i \in \mathcal{L}} \zeta_{ijk} = 1 \tag{17}$$
$$\zeta_{ijk} \le x_{ijk} \tag{18}$$
$$\zeta_{ijk} \le \xi_{jk} \tag{19}$$
$$\zeta_{ijk} \ge \xi_{jk} - (1 - x_{ijk}) \tag{20}$$
$$\zeta_{ijk} \ge 0 \tag{21}$$
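The four linear inequalities used for the product of a binary and a bounded continuous variable can be verified by enumeration. The Python sketch below computes, for given values of x and y, the interval of z that the inequalities allow and shows that it collapses to the single point z = xy. The function name `mccormick_bounds` and the sample values are assumptions introduced for this illustration.

```python
def mccormick_bounds(x, y, lo, hi):
    """Interval of z allowed by the four inequalities.

    x is binary (0 or 1) and y is continuous with lo <= y <= hi.
    Inequalities: z <= hi*x, z <= y - lo*(1-x), z >= y - hi*(1-x), z >= lo*x.
    """
    lower = max(y - hi * (1 - x), lo * x)
    upper = min(hi * x, y - lo * (1 - x))
    return lower, upper


# With lo = 0 and hi = 1 (the bounds assumed for xi_jk in the text),
# the interval reduces to the exact product x * y in every case.
samples = [(x, y) for x in (0, 1) for y in (0.0, 0.25, 0.5, 1.0)]
intervals = {(x, y): mccormick_bounds(x, y, 0.0, 1.0) for x, y in samples}
for (x, y), (lower, upper) in intervals.items():
    print(x, y, lower, upper)
```

When x = 0 the pair of bounds forces z = 0, and when x = 1 it forces z = y, which is why the linearization is exact only under the binary (single-assignment) assumption on the allocation variables.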
Observe that the facility capacity constraints described by the inequalities (16) are now expressed as for any capacitated facility location problem, together with an additional constraint (17) imposed on the transformed fractions. The resulting model looks as if a capacitated allocation problem were embedded into a combined facility location - routing problem. However, this transformation increases the complexity of the model due to the addition of (I + 1)JK auxiliary variables $\zeta_{ijk}$ and $\xi_{jk}$ together with (4I + 1)JK associated constraints. In addition, it imposes that customer demands shall be served by a single facility (single-assignment property) instead of allowing for multi-sourcing, a key property of our model. A closer look at the facility capacity constraints (6) shows that the explicit dependence on the product type index k in their LHS prevents a per-product formulation, i.e., the constraints that account for the sharing of capacity among k product types show a more complex structure than the superposition of k independent facility capacity constraints. Hence, in order to account for their intertwining effects while limiting their resulting computational complexity, one approach consists of starting from the facility capacity constraints as they would be expressed for the dedicated (per-product) capacity model:
$$\sum_{i \in \mathcal{I}} a_{ik} \frac{x_{ijk}}{\sum_{\ell \in \mathcal{L}} x_{\ell jk}} \le b_{jk} y_j, \quad j \in \mathcal{J},\ k \in \mathcal{K} \tag{22}$$
One can then move the denominator out of the LHS to obtain the following inequality:
$$\sum_{i \in \mathcal{I}} a_{ik} x_{ijk} \le b_{jk} y_j \sum_{i \in \mathcal{L}} x_{ijk}, \quad j \in \mathcal{J},\ k \in \mathcal{K} \tag{23}$$
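Finally, we re-introduce the summation over k (in both members) in order to account for the capacity sharing among product types:
$$\sum_{i \in \mathcal{I}} \sum_{k \in \mathcal{K}} a_{ik} x_{ijk} \le \sum_{k \in \mathcal{K}} \Big( b_{jk} y_j \sum_{i \in \mathcal{L}} x_{ijk} \Big), \quad j \in \mathcal{J} \tag{24}$$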
This transformation removes the fractional term of the LHS but introduces a summation over individual product capacity ($b_{jk}$). That is, we have moved the complex dependence on the product index k from the allocation fractions $x_{ijk}$ to the newly introduced individual product capacity $b_{jk}$; the per-product capacity distribution is by definition kept unspecified in the shared-capacity model. Thus, it would seem that there is no apparent gain from this transformation. Nevertheless, we could further postulate that the product types are evenly distributed among installed facilities and therefore assume that the capacity per facility is $b_j = K b_{jk}$, $\forall j$. We thus obtain the following inequalities for the facility capacity constraints:
$$\sum_{i \in \mathcal{I}} \sum_{k \in \mathcal{K}} a_{ik} x_{ijk} \le \frac{1}{K} b_j y_j \sum_{i \in \mathcal{L}} \sum_{k \in \mathcal{K}} x_{ijk}, \quad j \in \mathcal{J} \tag{25}$$
We propose thus a number of valid inequalities to strengthen the LP relaxation of the MSMP-cFLFR problem at hand. First, the aggregate capacity constraints ensure that the set of opened facilities yields a feasible solution by providing sufficient (aggregated) capacity in order to satisfy all customer demands. These inequalities are usually considered to obtain better bounds when solving the canonical version of the cFLP. These constraints do also apply in the digital good model P but only P ifPone replaces their RHS by the 1 term K b y j j ijk yielding the constraints i∈I k∈K xP P P j∈J P P 1 a ≤ b y ik j j i∈I k∈K j∈J i∈I k∈K xijk . Next, we K can further (explicitly) impose that the individual allocation fractions xijk remain within the interval [0, 1], i.e., 0 ≤ xijk ≤ 1, i ∈ I, j ∈ J , k ∈ K. Moreover, since a single and unique facility may be opened at each potential location (vertex of the network graph), we can set an upper bound on the maximum number of facilities given by the number of network nodes V and a lower bound of 1. In the context of the facility location theory, these constraints lead altogether to a problem that could be referred to as the homogeneous multi-product cFLP. Finally, a simplified formulation of the objective function can be proposed by considering that the demand allocation cost corresponds to the connection cost, i.e., traffic routing cost. This formulation corresponds to the case where one can view the cost κijk as negligible compared to the network cost. In this case, the objective function becomes: X X XX X ϕj yj + τuv fuv,ijk . (26) j∈J
When |L| → 1, each product (type,size) pair available on a given facility is allocated to very few customers (down to 1) and demands for the same product (type,size) pair originated by different customers get allocated to different facilities. In this case, one can assume thatPthe facility capacity constraints P retrieve their canonical form i∈I k∈K aik xijk ≤ bj yj , j ∈ J . When L → I, identical product (type,size) pairs are available over a very few number of facilities (down to 1) and demands for the same product (type,size) pair originated by different customers are allocated to the same facility. In this case, L can be replaced by I in the RHS of inequalities (25). Even if the multi-product dimension of the original problem has been significantly reduced by approximating the effect of the shared capacity property of the model on the facility capacity constraints (25), the latter still comprise a non-linear term resulting from the multiplication of the facility installation variables yj with the fraction allocation variables xijk . Though their linearization increases the number of variables and constraints, the tractability of the approximated model increases. 2) Valid Inequalities: One of the main difficulties when dealing with multiple sources is that the number of opened facilities in an optimal solution is unknown; preventing in turn the use of facility partitioning constraints, technique commonly used to produce additional valid inequalities which can be
useful to sharpen a relaxation. The main effects of the multi-product dimension of the problem have already been considered in the previous section. Moreover, the flow conservation constraints (9) and (10) intertwine the flow routing variables $f_{uv,ijk}$ with the fraction allocation variables $x_{ijk}$ and the facility installation variables $y_j$, respectively.
Put together, the formulation of the MSMP-cFLFRP can then be written as presented in Fig. 6.

E. Evaluation

In order to determine the computational impact of the facility capacity constraints (25), we evaluate the performance in terms of the computation time and solution quality resulting from the execution of the program presented in Fig. 6. For this purpose, we generate a set of problem instances with 3000 demands (generated using the method documented in Section IV) and a network topology of 25 nodes and 90 arcs, by tuning the facility capacity and installation cost. The latter is set proportionally to the facility capacity (linear). Table I summarizes the performance results obtained with CPLEX 12.6 using the formulation described in Fig. 6. In this formulation, we can replace the RHS of inequalities (35) by introducing the auxiliary continuous variable $\omega_{uv,ijk} \ge 0$ together with the following set of constraints: $\omega_{uv,ijk} \le q_{uv}$, $\omega_{uv,ijk} \le a_{ik} x_{ijk}$, $\omega_{uv,ijk} \ge q_{uv} b_1$, $\omega_{uv,ijk} \ge a_{ik} x_{ijk} - a_{ik} (1 - b_2)$, and $b_1 + b_2 = 1$, where $b_1$ and $b_2$ are binary variables. Observe thus that our formulation could easily accommodate variable link capacities.
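The linearization of the term $\min(q_{uv}, a_{ik} x_{ijk})$ described above can be checked numerically with a minimal sketch such as the following. For each choice of the binary $b_1$ (with $b_2 = 1 - b_1$), it computes the interval of values of $\omega$ allowed by the five constraints; the function names and the sample values are hypothetical.

```python
def omega_interval(q, a, x, b1):
    """Interval of omega allowed by the linearization for a fixed binary b1."""
    b2 = 1 - b1  # enforced by b1 + b2 = 1
    lower = max(0.0, float(q * b1), float(a * x - a * (1 - b2)))
    upper = min(float(q), float(a * x))
    return (lower, upper) if lower <= upper else None


def feasible_omega_intervals(q, a, x):
    """Collect the non-empty omega intervals over both values of b1."""
    intervals = set()
    for b1 in (0, 1):
        interval = omega_interval(q, a, x, b1)
        if interval is not None:
            intervals.add(interval)
    return intervals


# Link capacity larger than the allocated demand: omega is pinned to a * x.
print(feasible_omega_intervals(q=10, a=8, x=0.5))
# Link capacity smaller than the allocated demand: omega is pinned to q.
print(feasible_omega_intervals(q=3, a=8, x=0.5))
```

In both cases the only feasible interval collapses to the single point $\min(q_{uv}, a_{ik} x_{ijk})$, which is the behaviour the auxiliary variable is meant to reproduce.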
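\begin{align}
\min \quad & \sum_{j \in J} \phi_j y_j + \sum_{(u,v) \in E} \tau_{uv} \sum_{i \in I} \sum_{j \in J} \sum_{k \in K} f_{uv,ijk} && \tag{27} \\
\text{subject to:} \quad & \sum_{j \in J} x_{ijk} = 1 && i \in I, k \in K, a_{ik} > 0 \tag{28} \\
& z_{jk} \le y_j && j \in J, k \in K \tag{29} \\
& x_{ijk} \le z_{jk} && i \in I, j \in J, k \in K \tag{30} \\
& \sum_{i \in I} \sum_{k \in K} a_{ik} x_{ijk} \le \frac{1}{K} b_j y_j \sum_{i \in I} \sum_{k \in K} x_{ijk} && j \in J \tag{31} \\
& \sum_{i \in I} \sum_{k \in K} a_{ik} \le \frac{1}{K} \sum_{j \in J} b_j y_j \sum_{i \in I} \sum_{k \in K} x_{ijk} && \tag{32} \\
& a_{ik} x_{iik} + \sum_{v \in V : (i,v) \in E} \sum_{j \in J} f_{iv,ijk} = a_{ik} && i \in I, k \in K, i \ne j, a_{ik} > 0 \tag{33} \\
& \sum_{v : (v,u) \in E} \sum_{j \in J} f_{vu,ijk} = \sum_{v : (u,v) \in E} \sum_{j \in J} f_{uv,ijk} + a_{ik} x_{iuk} && i \in I, u \in V, k \in K, u \ne i \tag{34} \\
& f_{uv,ijk} \le \min(q_{uv}, a_{ik} x_{ijk}) && (u,v) \in E, i \in I, j \in J, k \in K, i \ne j \tag{35} \\
& \sum_{i \in I} \sum_{j \in J} \sum_{k \in K} f_{uv,ijk} \le q_{uv} && (u,v) \in E \tag{36} \\
& x_{ijk} \in [0, 1] && i \in I, j \in J, k \in K \tag{37} \\
& y_j \in \{0, 1\} && j \in J \tag{38} \\
& z_{jk} \in \{0, 1\} && j \in J, k \in K \tag{39} \\
& f_{uv,ijk} \ge 0 && (u,v) \in E, i \in I, j \in J, k \in K \tag{40}
\end{align}

Fig. 6. Mixed integer program formulation for the combined MSMP-cFLFR Problem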
The computational results reported in Table I show that when using the Barrier (B) algorithm, the root relaxation solution time (second column) decreases by an order of magnitude (up to a factor of 10) compared to the results obtained with the default configuration of CPLEX. We also enforce the use of the Barrier algorithm for the continuous optimizer used to solve the subproblems (after the initial relaxation) instead of keeping the default CPLEX setting, which currently selects the dual simplex optimizer for MILP subproblem solution. With this setting, the results reported in the third column of this table, obtained after averaging over 10 executions, show that all scenarios can be solved to optimality within 7200 s of total computation time (root + branch & cut) except for one of them (scenario 11). When the facility installation cost is set proportionally to the node degree or kept constant, the results obtained show a similar trend. This formulation has also been evaluated on scenarios obtained by varying the properties of the initial topology (e.g., increasing the number of arcs by 10%) without observing any significant deviation compared to the results reported in this table. Note that a formulation using binary activation variables in the facility capacity constraints, instead of the proposed approximation, yields a significant increase in computation time (by a factor of 2 at least).

We then estimate the distortion introduced by our approximation, which assumes that the total facility capacity is homogeneously distributed per product on each facility (following the shared capacity model). For this purpose, we consider the following worst-case scenario: each customer demand point $i \in I = V$, $|I| = d$, initiates demands for different product types of the same size $s$; thus, when $K = d$, such a scenario involves $d$ product (type,size) pairs and requires a total capacity of $d \cdot s$.
We set the capacity of each potential facility j ∈ J to s such that the demands initiated locally can be assigned locally (anticipated by inspecting the generated demand set) and the
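TABLE I. NUMERICAL RESULTS

| Scenario | Root time (s) | Total time (s) | Final Gap (%) |
|----------|---------------|----------------|---------------|
| 1        | 3037          | 3294           | 0.00          |
| 2        | 2516          | 2773           | 0.00          |
| 3        | 2306          | 2565           | 0.00          |
| 4        | 2506          | 2763           | 0.00          |
| 5        | 2921          | 3179           | 0.00          |
| 6        | 3111          | 5360           | 0.00          |
| 7        | 2912          | 5706           | 0.00          |
| 8        | 2616          | 7189           | 0.00          |
| 9        | 3270          | 5967           | 0.00          |
| 10       | 3309          | 6029           | 0.00          |
| 11       | 2895          | 9664           | 0.00          |
| 12       | 3241          | 5493           | 0.00          |
| Avg      | 2887          | 4999           | 0.00          |
| Stdev    | 333           | 2164           | 0.00          |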
facility installation cost low enough to steer local assignment. In that case, the routing cost should be zero (remember here that the facility charge is independent of the product type) and the capacity per facility sufficient to cover the locally initiated demands. The latter condition does not hold, however, because in (25) the per-facility capacity $b_j$ is divided by the number of product types $K$; hence, the homogeneous per-product distribution of the capacity overestimates by a factor $K$ (assuming product types of identical size) the capacity required on at least one of the installed facilities.

IV. NUMERICAL EXPERIMENTS
A. Instances and Input Data

As we do not have at our disposal any real-world demand sets with the minimum information required, we generate a set A of $100 \cdot |V|$ demand tuples (originator/client demand point, product type, demand size) selected among k = 5 product types by using the following distributions:
• The size of the demands is determined by the Pareto distribution P(α) with shape parameter α set to 1.4; thus, with finite mean (since α > 1) but infinite variance (since α ≤ 2). This type of distribution is representative of the distribution of the size of the files populating content servers; hence, it is used to model the size of the demands observed at edge nodes located at the network periphery.
• The frequency of identical demands (in both type and size) is driven by the Generalized Zipf-Mandelbrot (discrete) probability distribution. This distribution models phenomena where the occurrence frequency of an event is inversely proportional to its rank based on that occurrence frequency; hence, it is often used to model content popularity.
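The demand generation procedure outlined above can be sketched as follows. This is only an illustration under stated assumptions: the Zipf-Mandelbrot parameters `s` and `q`, the catalog size, the uniform choice of the originating node and the uniform choice of the product type are all hypothetical, since the paper does not report them.

```python
import random


def zipf_mandelbrot_weights(n, s=1.0, q=2.0):
    """Normalized Zipf-Mandelbrot probabilities p(r) ~ 1 / (r + q)**s, r = 1..n."""
    raw = [1.0 / (rank + q) ** s for rank in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def generate_demands(nodes, n_types=5, alpha=1.4, catalog_size=50,
                     per_node=100, seed=0):
    """Return per_node * |V| tuples (originating node, product type, demand size)."""
    rng = random.Random(seed)
    # Catalog of distinct (type, size) pairs; sizes follow a Pareto law (alpha = 1.4).
    catalog = [(rng.randrange(n_types), rng.paretovariate(alpha))
               for _ in range(catalog_size)]
    # Popularity of each catalog entry, ranked by its position in the catalog.
    weights = zipf_mandelbrot_weights(catalog_size)
    picks = rng.choices(catalog, weights=weights, k=per_node * len(nodes))
    # Assumption: each demand originates at a node chosen uniformly at random.
    return [(rng.choice(nodes), ptype, size) for ptype, size in picks]


demands = generate_demands(nodes=list(range(25)))
print(len(demands), demands[0])
```

Popular (type,size) pairs appear many times in the generated set, which reproduces the repetition of identical demands that the Zipf-Mandelbrot distribution is meant to capture.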
The evaluation of the formulation presented in Section III has been realized by means of a set of network topologies extracted from the SNDlib library [5], which provides a repository of topologies together with their link capacities and costs. The following topologies have been considered (in alphabetical order): atlanta, cost266, france, geant, india35, newyork, nobeleu, and norway. The capacity distributed over the set of (potential) facilities is non-blocking, i.e., the sum of all demands over all originating points does not exceed the total capacity $\sum_{j \in J} b_j$. For this purpose, we set the per-facility capacity by starting from the minimum value $b_j = b_{min}$ required to serve all demands; we also assume that the total capacity is homogeneously distributed among facilities with different cost levels per capacity unit (e.g., 80%, 100%, 120%, 150%). The parameter $b_j$ is then incremented until reaching a value where two facilities would be sufficient to serve all demands. This incremental growth in capacity allows us to evaluate the sensitivity to the facility capacity and its associated cost. The only input data for which we obtain little insight (from SNDlib) is the facility location cost $\phi_j$, in particular when nodes are indexed by numbers without any geographic indication (city, country or the like). In such situations, the facility location cost should rather be considered as a facility installation cost (independent of its physical location), in addition to which its capacity accounts for the cost of the resources provisioned at this location. For this purpose, we consider a location cost proportional to the facility capacity.

B. Execution

We have implemented the proposed resilience scheme using the formulation presented in Section III. To evaluate its performance compared to demand protection against failure of primary assigned facilities, we have also developed the cRFLP formulation documented in [9]. Both formulations have been solved with CPLEX 12.6.
Executions have been performed on a dedicated server equipped with 8 × Intel Xeon quad-core processors and 512 GB of DDR3 RAM, running the CentOS Linux operating system v6.5. For each topology, the combination of the input parameters leads to a batch of 48 executions. To solve the root relaxation of the original MIP, we enforce the use of the Barrier algorithm. This setting leads to a time gain of an order of magnitude compared to the situation where CPLEX is left with
Fig. 7. Number of facilities vs. facility capacity
its default configuration. We also use this algorithm for the MIP subproblem resolution.

C. Numerical Results and Analysis

The general trends are the following. Referring to Fig. 7, we can observe, as (intuitively) expected, that the number of opened facilities (monotonically) decreases as the capacity installed per facility increases. Moreover, for a given facility capacity, the number of facilities decreases when the cost level increases. When the cost level per capacity unit is small (e.g., 80%), the number of installed facilities is large. As the cost level increases, the number of installed facilities, and thus the facility installation cost, is progressively reduced to the detriment of a higher traffic routing cost; hence, a more expensive allocation cost to satisfy all customer demands. We also consider the situation where the facility installation cost remains constant independently of its capacity increase. In this case, as the facility capacity increases, the number of installed facilities decreases, but we also observe that the traffic routing cost does not monotonically increase. Instead, it first increases and then, after reaching a certain value, slowly decreases. This phenomenon can be explained by the observation that as the facility capacity increases, the corresponding investment comes to the detriment of traffic routing investment, but only after reaching a certain facility capacity threshold. Observe that the latter is reached when the facility investment cost (capacity + installation) reaches its minimum. These results contradict the common intuition that the lowest investment cost is necessarily reached when the number of facilities is minimal; instead, it is reached when the capacity provisioned on installed facilities is neither so low that it forces remote allocations nor so high that capacity is wasted. The latter tradeoff also leads to the highest traffic routing cost as we reach the minimal facility investment cost.
Thus, a facility capacity increase may amplify network investment, i.e., the traffic routing cost does not monotonically decrease as the facility capacity increases. This phenomenon is observable from Fig. 8, which depicts, for two cost profiles (1) at 100% and (2) at 120%, the evolution of the total cost (TOT), total facility installation cost (INST) and total routing cost (RTG). Moreover, as the facility capacity increases, one would intuitively expect that the amount of unused capacity (at installed facilities) also increases and is possibly compensated by the increase of the traffic routing cost. This is
Fig. 8. Total, installation and routing cost vs. facility capacity
actually the case; at lower capacity levels, increasing facility capacity leads to higher routing cost; however, after reaching a certain capacity level where the unused capacity is minimal, the latter increases without inducing any further increase in the traffic routing cost. This trend seems to imply that combining traffic routing with facility location and demand allocation is actually effective for better utilization of the facility capacity (at the expense of an increasing routing cost), but after reaching a certain capacity threshold, the benefit may be less significant.

To evaluate the performance in terms of facility installation and demand allocation cost in case of facility failure, we execute our program on mid-size instances (such as france) and compute the resulting costs by assuming that all primary assignments of demands require reallocation to new facilities. Thus, if a given facility j had been installed as part of the primary demand assignment, this facility becomes unavailable after failure occurrence. As dynamic routing offers the possibility to reconfigure routing table entries in case of facility failure, we compute the cost for opening new facilities and for re-routing the complete set of demands to newly opened facilities. The results are then compared with the facility installation and demand allocation cost obtained when using the demand protection scheme provided by the cRFLP. For the latter, we consider a two-level protection where each demand is assigned to one primary facility at level r = 0 and to another (backup) facility at level r = 1. Hence, in addition to the facility installation cost, the objective consists of minimizing the cost for the primary assignment of each demand and the cost for assigning a backup facility to serve that demand in case its primary assigned facility fails. The results obtained when executing both resilience schemes are reported in Fig. 9.

As the facility capacity increases, we observe that the total cost (R) for re-routing the flows associated with all demands and re-assigning them to newly opened facilities remains lower than the total cost (P) for protecting all primary demand assignments with backup facilities. The higher allocation cost required by the former compared to the latter scheme, with only two levels of protection, can be explained by the smaller number of facilities it involves (label on the two upper curves) and the use of a load-dependent routing cost instead of a distance cost. The highest gain (36%) is obtained when the tradeoff between the spatial distribution of the facility capacity (over 8 locations) and the routing cost to access them reaches its optimal value.

Fig. 9. Total and allocation cost vs. facility capacity
V. CONCLUSION
This paper proposes a mixed-integer formulation for the combined multi-source multi-product capacitated facility location-flow routing problem (MSMP-cFLFRP). This formulation accounts for the specifics of digital object storage and supply. This characteristic leads to a major distinction in the expression of the facility capacity constraints compared to their canonical formulation. Indeed, the resulting fractional constraints require specific treatment to become processable by solvers such as CPLEX. In contrast, most known formulations of this problem, when applied to digital content placement, translate the multi-product problem into a single-commodity problem solved separately for each product. Despite the intrinsic complexity of this problem, our approximation of these constraints makes it possible to solve to optimality small- to medium-size instances with an order of thousands of demands. Nevertheless, further refinement of this formulation seems to be required for larger instances executed with scenarios involving facility failure to avoid excessive computation times.

REFERENCES

[1] I. Contreras and E. Fernández, General network design: A unified view of combined location and network design problems, European Journal of Operational Research, 219:680-697, 2012.
[2] M.S. Daskin, A. Hurter, and M. Van Buer, Toward an integrated model of facility location and transportation network design, Working Paper, The Transportation Center, Northwestern Univ., Evanston (IL), 1993.
[3] S. Melkote and M.S. Daskin, An integrated model of facility location and transportation network design, Transportation Research Part A: Policy and Practice, 35(6):515-538, July 2001.
[4] S. Melkote and M.S. Daskin, Capacitated facility location/network design problems, European Journal of Operational Research, 129:481-495, 2001.
[5] S. Orlowski, M. Pióro, A. Tomaszewski, and R. Wessäly, SNDlib 1.0 Survivable Network Design Library, Networks, 55(3):276-286, 2010.
[6] D. Shishebori, L.V. Snyder, and M.S. Jabalameli, A reliable budget-constrained facility location/network design problem with unreliable facilities, Networks and Spatial Economics, 14(3-4):549-580, 2014.
[7] L.V. Snyder and M.S. Daskin, Reliability models for facility location: The expected failure cost case, Transportation Science, 39(3):400-416, 2005.
[8] H.P. Williams, Model Building in Mathematical Programming, 5th Edition, John Wiley & Sons, March 2013.
[9] R. Yu, The Capacitated Reliable Fixed-charge Location Problem: Model and Algorithm, Theses and Dissertations, Paper 1684, 2015.