Ann Oper Res DOI 10.1007/s10479-010-0684-3

Optimal deployment of eventually-serializable data services

L. Michel · A. Shvartsman · E. Sonderegger · P. Van Hentenryck

© Springer Science+Business Media, LLC 2010

Abstract Providing consistent and fault-tolerant distributed object services is among the fundamental problems in distributed computing. To achieve fault-tolerance and to increase throughput, objects are replicated at different networked nodes. However, replication induces significant communication costs to maintain replica consistency. Eventually-Serializable Data Service (ESDS) has been proposed to reduce these costs and enable fast operations on data, while still providing guarantees that the replicated data will eventually be consistent. This paper reconsiders the deployment phase of ESDS, in which a particular implementation of communicating software components must be mapped onto a physical architecture. This deployment aims at minimizing the overall communication costs, while satisfying the constraints imposed by the protocol. Both MIP (Mixed Integer Programming) and CP (Constraint Programming) models are presented and applied to realistic ESDS instances. The experimental results indicate that both models can find optimal solutions and prove optimality. The CP model, however, provides orders of magnitude improvements in efficiency. The limitations of the MIP model and the critical aspects of the CP model are discussed. Symmetry breaking and parallel computing are also shown to bring significant benefits.

1 Introduction

Data replication is a fundamental technique in distributed systems: it improves availability, increases throughput, and eliminates single points of failure. Data replication, however, induces a communication cost to maintain consistency among replicas. Eventually-Serializable Data Service (ESDS) (Fekete et al. 1996) is a system that was formulated to help reduce these costs. The algorithm implementing ESDS allows the users to selectively

L. Michel · A. Shvartsman · E. Sonderegger
University of Connecticut, Storrs, CT 06269-2155, USA
e-mail: [email protected]

P. Van Hentenryck
Brown University, Box 1910, Providence, RI 02912, USA


relax the consistency requirements in exchange for improved performance. Given a definition of an arbitrary serial data type, ESDS guarantees that the replicated data will eventually be consistent (i.e., presenting a single-copy centralized view of the data to the users), and the users are able to require the results for certain operations to be consistent with the stable total order of all operations.

The design, analysis, and implementation of systems such as ESDS is not an easy task, and specification languages have been developed to express these algorithms formally. See, for instance, the framework of (timed) I/O automata (Lynch and Tuttle 1989; Kaynar et al. 2006) and their associated tools (Lynch et al. 2007) that allow theorem provers (e.g., PVS, Owre et al. 1996) and model checkers (e.g., UPPAAL, Larsen et al. 1997; Behrmann et al. 2001) to reason about correctness. The ESDS algorithm is in fact formally specified with I/O automata and proved correct (Fekete et al. 1996).

Once a specification is deemed correct, it must be implemented and deployed. The implementation typically consists of communicating software modules whose collective behaviors cannot deviate from the set of acceptable behaviors of the specification; see Cheiner and Shvartsman (1999) for a methodical implementation of the algorithm and a study of its performance. The deployment then focuses on mapping the software modules onto a distributed computing platform while achieving good performance, for example, by minimizing communication costs between the software components.

This research focuses on the last step of this process: the deployment of the implementation on a specific architecture. The deployment problem can be viewed as a resource allocation problem in which the objective is to minimize the network traffic while satisfying the constraints imposed by the distributed algorithm.
These constraints include, in particular, the requirement that replicas cannot be allocated to the same computer, since this would weaken fault tolerance. The ESDS Deployment Problem (ESDSDP) was considered by Bastarrica (2000) and was modeled as a MIP. Unfortunately, the experimental results were not satisfactory at the time, as even small instances could not be solved optimally (Bastarrica 2000).

This paper reconsiders the ESDSDP and studies both MIP and CP formulations. It demonstrates that MIP solvers can now solve practical instances in reasonable times, although the problems remain surprisingly challenging. It also presents a constraint-programming approach that dramatically outperforms the MIP approach. The CP model is a natural encoding of the ESDSDP together with a simple search heuristic focusing on the objective function. The paper also evaluates empirically the strength of the filtering algorithms, the use of symmetry breaking, and the benefits of parallel computing.

Surveying the current results in more detail, the CP model brings orders of magnitude improvements in performance over the MIP model; for the examples considered here, it returns optimal solutions and proves optimality within a minute in the worst case and within 8 seconds in general. The CP model enforces arc-consistency on alldifferent and multidimensional element constraints, which is critical for good performance. Symmetry breaking brings significant speedups (up to a factor of 200), while parallel computing yields linear speedups on a 4-processor SMP machine.

Our approach also makes it feasible to consider bandwidth modeling, where pairwise connections between nodes in the network have limited bandwidth. This paper presents extended models with support for bandwidth considerations for both the integer programming and constraint programming formulations and reports experimental results in all cases.
The results in this paper also open new horizons for on-line optimization of distributed deployment. Given the observed improvements in obtaining optimal solutions, it becomes feasible next to consider optimizing deployment of components in reconfigurable consistent data services for dynamic systems, such as RAMBO (Lynch and Shvartsman 2002;


Gilbert et al. 2003). Here, configurations (quorum systems) of processors maintaining data replicas can be changed dynamically. Any server maintaining a replica can propose a new configuration, and choosing suitable configurations is crucial for the performance of the service. There exists a trade-off between fast reconfiguration and the choice of suitable configurations. Enabling the servers to propose optimized configurations based on their local observations and decisions would substantially benefit such services. It is of note that realistic instance sizes (cf. Aguilera 2007; Saito et al. 2004) are now within the current ability to compute optimal deployments.

The rest of this paper is organized as follows. Section 2 surveys some of the related work. Section 3 presents an overview of ESDS and illustrates the deployment problem on a basic instance. Section 4 introduces the high-level deployment model and presents the MIP and CP models. Section 5 extends those models to networks with bandwidth limitations. Section 6 reports the experimental results and analyzes the behavior of the models in detail. Section 7 concludes the paper.

2 Related work

The ESDS Deployment Problem is a generalization of the Quadratic Assignment Problem (QAP). There is a vast body of literature addressing QAPs, including Pardalos and Wolkowicz (1994). The closest relative to the ESDS deployment problem is, perhaps, the cross-docking problem (Boysen 2010; Li et al. 2004; Lim et al. 2006; Miao et al. 2009; Yu and Egbelu 2008), which considers the scheduling of trucks to docks at a cross-docking facility. The objective is to minimize the delays in transferring goods from inbound trucks to outbound trucks. This is an assignment problem (of trucks to docks), with a single capacity constraint on the temporary storage available for pallets awaiting outbound trucks.

Internally, the cross dock can be viewed as a fully connected graph with distances on each of the paths connecting the docks. This contrasts with an ESDS network configuration, which typically is not fully connected. Additionally, the ESDSDP may impose individual capacity constraints on the connections. The ESDSDP also has separation and co-location constraints that are not part of the cross-dock problem. Finally, the cross-dock problem focuses on a schedule of truck arrivals and departures over time, whereas an ESDS deployment seeks a steady-state assignment.

Each of the papers (Boysen 2010; Li et al. 2004; Lim et al. 2006; Miao et al. 2009; Yu and Egbelu 2008) considers a variant of the basic cross-docking problem and presents a MIP solver for its specific variant. The MIP models generally share the same structure and core modeling features with the MIP model proposed herein for the ESDSDP. Although the details of the models vary, none of the MIP models can be solved in a reasonable amount of time, even for small instances. Instead, each paper relies on one or more heuristic techniques, such as Tabu Search, Simulated Annealing, or genetic algorithms, to find a solution.
The ESDS deployment problem is both harder and more general than the cross-dock problem, yet a complete solution method exists and is presented here.

3 Deployment of eventually-serializable data services

An Eventually-Serializable Data Service (ESDS) consists of three types of components: clients, front-ends, and replicas. Clients issue requests for operations on shared data and


receive responses returning the results of those operations. Clients do not communicate directly with the replicas; instead they communicate with front-ends, which keep track of pending requests and handle the communication with the replicas. Each replica maintains a complete copy of the shared data and “gossips” with other replicas to stay informed about operations that have been received and processed by them. The clients may request operations whose results are tentative, but can be quickly obtained, or they can request “strict” operations that are possibly slower, but whose results are guaranteed to be consistent with an eventual total order on the operations.

Each replica maintains a set of the requested operations and a partial ordering on these operations that tends to the eventual total order of operations. Clients may specify constraints on how the requested operations are ordered. If no constraints are specified by the clients, the operations may be reordered after a response has been returned. A request may include a list of previously requested operations that must be performed before the currently requested operation. For any sequence of requests issued by the clients, the service guarantees eventual consistency of the replicated data (Fekete et al. 1996).

ESDS is well-suited for implementing applications such as a distributed directory service, cf. the Internet’s Domain Name System (IETF 1990), which needs redundancy for fault-tolerance and good response time for name lookups but does not require immediate consistency of naming updates. Indeed, the access patterns of such applications are dominated by queries, with infrequent update requests.

Optimizing the deployment of an ESDS application can be challenging due to nonuniform communication costs induced by the actual network interconnect, as well as the various types of software components and their communication patterns.
In addition, for fault tolerance, no more than one replica should reside on any given node. Finally, there is a trade-off between the desire to place front-ends near the clients with whom they communicate the most and the desire to place the front-ends near replicas. Note also that the client locations may be further constrained by exogenous factors. Deployment instances typically involve a handful of front-ends to mediate between clients and servers, a few replicas, and a few clients. Instances may not be particularly large, as the (potentially numerous) actual users are external to the system and simply forward their demands to the internal clients modeled within the ESDS. A significant additional complication in deriving deployment mappings can be due to bandwidth limitations placed on connections between the nodes in the target platform.

Figure 1 depicts a simple ESDS deployment problem. The left part of the figure shows the hardware architecture, which consists of 10 heavy-duty servers connected via a switch (full interconnect) and 4 “light” servers connected via direct links to the first four heavy-duty servers. For simplicity, the cost of sending a message from one machine to another is the number of network hops. For instance, a message from PC1 to PC2 requires 3 hops, since a server-to-server message through the switch requires one hop only.

The right part of Fig. 1 depicts the abstract implementation of the ESDS. The ESDS software modules fall into three categories: (1) client modules that issue queries (c1, ..., c4); (2) front-end modules (fe1, fe2) that mediate between clients and servers and are responsible for tracking the sequence of pending queries; and (3) replicas (s1, ..., s6). Each software module communicates with one or several modules, and the right side of the figure specifies the volume of messages that must flow between the software components in order to implement the service.
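For intuition, the hop-cost matrix used above can be derived mechanically from the topology with an all-pairs shortest-path algorithm. The sketch below uses a hypothetical four-node miniature of the Fig. 1 layout (two heavy-duty servers whose switched link counts as one hop, plus one PC attached to each); the node numbering and `hop_matrix` helper are illustrative, not from the paper.

```python
# Sketch: derive the hop matrix h from a toy topology (hypothetical
# miniature of Fig. 1) with Floyd-Warshall all-pairs shortest paths.
INF = float("inf")

def hop_matrix(n, links):
    """links: undirected (i, j) pairs, each costing one hop."""
    h = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j in links:
        h[i][j] = h[j][i] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if h[i][k] + h[k][j] < h[i][j]:
                    h[i][j] = h[i][k] + h[k][j]
    return h

# Nodes 0-1: heavy-duty servers (switch = 1 hop); 2-3: PCs on direct links.
h = hop_matrix(4, [(0, 1), (0, 2), (1, 3)])
# A PC-to-PC message crosses a link, the switch, and another link: 3 hops,
# matching the PC1-to-PC2 example in the text.
```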
The constraints in this problem are as follows: the first 3 client modules must be hosted on the light servers (PC1, ..., PC4), while the remaining components (c4, fe1, fe2, s1, ..., s6) must run on the heavy-duty servers. Additionally, the replicas s1 through s6 must execute on distinct servers to achieve the fault tolerance that is possible


Fig. 1 A Simple ESDS Deployment Problem

with ESDS. The deployment problem consists of finding an assignment of software components to servers that satisfies the constraints stated above and minimizes the overall network traffic expressed as the volume of messages sent given the host assignments.
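On toy data, this statement of the problem can be solved by brute force. The sketch below uses a hypothetical three-host, three-component instance (not one of the paper's benchmarks): it enumerates every assignment, keeps those satisfying the support and separation constraints, and minimizes the traffic objective.

```python
# Sketch: exhaustive search for an optimal deployment on hypothetical
# toy data, mirroring the problem statement (support + separation
# constraints, minimize sum of frequency * hops).
from itertools import product

h = [[0, 1, 2],        # hop costs between 3 hosts
     [1, 0, 1],
     [2, 1, 0]]
f = [[0, 5, 0],        # message frequencies between 3 components
     [0, 0, 3],
     [2, 0, 0]]
support = [{0, 1, 2}, {0, 1, 2}, {1, 2}]   # allowed hosts per component
separation = [{1, 2}]                       # components 1 and 2 apart

def cost(x):
    return sum(f[a][b] * h[x[a]][x[b]] for a in range(3) for b in range(3))

def feasible(x):
    return (all(x[c] in support[c] for c in range(3)) and
            all(len({x[c] for c in S}) == len(S) for S in separation))

best = min((x for x in product(range(3), repeat=3) if feasible(x)), key=cost)
```

Real instances are far too large for this enumeration, which is precisely why the MIP and CP models of Sect. 4 are needed.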

4 Modeling optimal ESDS deployments

The deployment model for ESDS follows the approach of Bastarrica et al. (1998), Bastarrica (2000). The input data consists of:
– The set of software modules C;
– The set of hosts N;
– The supports: the subset of hosts to which a component can be assigned is denoted by booleans s_{c,n}, equal to true when component c can be assigned to host n;
– The network cost, directly derived from the topology and expressed with a matrix h where h_{i,j} is the minimum number of hops required to send a message from host i to host j. Note that h_{i,i} = 0 (local messages are free);
– The message volumes: in the following, f_{a,b} denotes the average frequency of messages sent from component a to component b;
– The separation set Sep, which specifies that the components in each S ∈ Sep must be hosted on different servers;
– The co-location set Col, which specifies that the components in each S ∈ Col must be hosted on the same server.

The decision variables x_c are associated with each module c ∈ C, and x_c = n if component c is deployed on host n. An optimal deployment minimizes

    ∑_{a∈C} ∑_{b∈C} f_{a,b} · h_{x_a,x_b}

subject to the following constraints. Components may only be assigned to supporting hosts:

    ∀c ∈ C: x_c ∈ {i ∈ N | s_{c,i} = 1}.


For each separation constraint S ∈ Sep, we impose

    ∀i, j ∈ S: i ≠ j ⇒ x_i ≠ x_j.

Finally, for each co-location constraint S ∈ Col, we impose ∀i, j ∈ S: x_i = x_j.

4.1 The MIP model

A MIP model for the deployment problem is now presented. It is interesting to observe that the ESDSDP is a generalization of the Quadratic Assignment Problem (QAP). Indeed, in a QAP, C = N, the variables x_i are required to form a permutation on N, and there are no separation and co-location constraints. A QAP is also obtained when the co-location constraints are absent and the model contains a single separation constraint over the set of components C.

Basic model  The MIP model uses a four-dimensional matrix y of 0/1-variables such that y_{a,i,b,j} = 1 if x_a = i ∧ x_b = j. It also uses a two-dimensional matrix z of 0/1-variables satisfying z_{a,i} = 1 ⇔ x_a = i. The ESDSDP can then be specified as the minimization of

    ∑_{a∈C} ∑_{i∈N} ∑_{b∈C} ∑_{j∈N} f_{a,b} · h_{i,j} · y_{a,i,b,j}

subject to

    z_{a,i} ≤ s_{a,i}                      ∀a ∈ C, ∀i ∈ N,                 (1)
    ∑_{i∈N} z_{a,i} = 1                    ∀a ∈ C,                         (2)
    y_{a,i,b,j} ≤ z_{a,i}                  ∀a, b ∈ C, ∀i, j ∈ N,           (3)
    y_{a,i,b,j} ≤ z_{b,j}                  ∀a, b ∈ C, ∀i, j ∈ N,           (4)
    y_{a,i,b,j} ≥ z_{a,i} + z_{b,j} − 1    ∀a, b ∈ C, ∀i, j ∈ N,           (5)
    ∑_{a∈S} z_{a,i} ≤ 1                    ∀S ∈ Sep, ∀i ∈ N,               (6)
    z_{a,i} = z_{b,i}                      ∀i ∈ N, ∀S ∈ Col, ∀a, b ∈ S.    (7)

Constraints (1) require the components to be hosted on supporting hosts, and constraints (2) require that each component be deployed on exactly one host. Constraints (3), (4), (5) enforce the semantic definition of the z variables. Constraints (6) encode the separation constraints and (7) the co-location constraints.

Improving the formulation  In the above formulation, the conjunction z_{a,i} = 1 ∧ z_{b,j} = 1 is represented twice: once with y_{a,i,b,j} and once with y_{b,j,a,i}. It is thus possible to use only half the variables in y. In addition, for all components a ∈ C and nodes i, j ∈ N, i ≠ j ⇒ y_{a,i,a,j} = 0, since component a cannot be deployed on nodes i and j simultaneously. Moreover, h_{i,i} = 0, and therefore all the terms on the diagonal can be removed from the objective function. The objective function thus only needs to feature variables y_{a,i,b,j} such that a ≺ b, where ≺ is a total ordering relation on C.
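The linearization above can be sanity-checked exhaustively: for 0/1 values, constraints (3)-(5) admit exactly y = z_{a,i} · z_{b,j}. A minimal sketch (the helper name is illustrative):

```python
# Sketch: verify that the inequalities y <= z1, y <= z2, y >= z1 + z2 - 1
# force y to equal the product z1 * z2 over 0/1 values.
from itertools import product

def allowed_y(z1, z2):
    """Set of y values permitted by constraints (3)-(5) for fixed z's."""
    return {y for y in (0, 1) if y <= z1 and y <= z2 and y >= z1 + z2 - 1}

for z1, z2 in product((0, 1), repeat=2):
    assert allowed_y(z1, z2) == {z1 * z2}
```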


Fig. 2 The Constraint-Programming Model in COMET

4.2 The CP model

A COMET program for the ESDSDP is shown in Fig. 2. The model is depicted in lines 1–30. The data declarations are specified in lines 2–8 and should be self-explanatory. The decision variables are declared in line 9 and are the same as in the model presented in Sect. 4: variable x[c] specifies the host of component c, and its domain is computed from the support matrix s. The objective function is specified in lines 10–13. It eliminates terms for non-communicating components (f[a,b] = 0) and constrains the intermediate variable obj to equal the total communication costs.

With COMET, one can specify the consistency level of each constraint at the time it gets posted. For instance, the onDomains annotation on line 13 imposes the use of arc-consistency for the propagation of the objective function.¹ The choice of consistency level can have a dramatic impact on the efficiency of the model. The following paragraphs briefly review the semantics of arc-consistency and bound-consistency for the filtering algorithm associated with the two-dimensional element expression present in the objective.

Arc-consistency for a two-dimensional element  Arc-consistency for the two-dimensional element constraint is obtained with a table T containing all the tuples ⟨i, j, h_{i,j}⟩ for i, j ∈ N.

¹ Naturally, COMET will enforce bound-consistency for the n-ary linear equality.


COMET creates a variable σ_{a,b} for each term h_{x_a,x_b} in the objective and imposes the constraints ⟨x_a, x_b, σ_{a,b}⟩ ∈ T, on which it achieves arc-consistency. The objective is then rewritten into

    ∑_{a∈C} ∑_{b∈C} f_{a,b} · σ_{a,b}

and bound-consistency is used for that linear constraint.

Bound-consistency for a two-dimensional element  For bound-consistency, COMET flattens (row-major style) the two-dimensional matrix h into a one-dimensional array h′. COMET then creates a new variable α_{a,b} such that h_{x_a,x_b} = h′_{α_{a,b}}, where α_{a,b} = x_a · |N| + x_b. This converts the two-dimensional index into h into an equivalent index into the array h′. The objective for bound-consistency becomes

    ∑_{a∈C} ∑_{b∈C} f_{a,b} · h′_{α_{a,b}}.

This rewrite induces two phenomena. First, the element constraint on the array can use either arc-consistency or bound-consistency. Second, the flattening constraint, which uses bound-consistency, delivers a rewriting that is weaker, even if arc-consistency is used for the one-dimensional element.

Figure 3 illustrates the differences between arc-consistency on the matrix and bound-consistency on its flattened counterpart. Assume that the domains of the indexing variables are {a, b} and {c, d}. Arc-consistency picks the four values shown in (1), whereas bound-consistency on the flattened array must consider all the values shown in (2). Even if the rows and columns of the matrix are permuted optimally, i.e., all the values still present in the domain are consecutive as in (3), bound-consistency still captures extra values that are discarded in (1).

The deployment constraints  Returning to the CP model in Fig. 2, lines 14–17 state the co-location constraints: for each set S (line 14), an element c1 ∈ S is selected (randomly) and the model imposes the constraint x[c1] = x[c2] for each other element c2 in S. Lines 18–19 state the separation constraints for every set in Sep using alldifferent constraints. Once again, the onDomains annotation mandates arc-consistency on the equations and the alldifferent.
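The phenomenon of Fig. 3 can be reproduced in a few lines. The cost matrix and the domains below are hypothetical, and the interval reasoning on the flattened index is a simplified stand-in for bound-consistency, not COMET's actual propagator.

```python
# Sketch: values reachable through h[x_a][x_b] under arc-consistency on
# the table of tuples, versus interval reasoning on the row-major
# flattened array h' (a simplified model of bound-consistency).
h = [[3, 7, 1, 9],
     [4, 8, 2, 6],
     [5, 0, 5, 5],
     [1, 1, 1, 1]]
N = 4
dom_a, dom_b = {0, 2}, {1, 3}          # current domains of x_a and x_b

# Arc-consistency keeps exactly the reachable entries.
ac_vals = {h[i][j] for i in dom_a for j in dom_b}

# Flattening: alpha = x_a * N + x_b; bounds reasoning admits every entry
# whose flattened index lies between the smallest and largest alpha.
flat = [h[i][j] for i in range(N) for j in range(N)]
alphas = {i * N + j for i in dom_a for j in dom_b}
bc_vals = set(flat[min(alphas):max(alphas) + 1])

# bc_vals strictly contains ac_vals: the flattened view keeps extra values.
```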

Fig. 3 Comparison of the value filtering for arc-consistency and bound-consistency


The search procedure  The search procedure is depicted in lines 21–29. It is a variable labeling with dynamic variable and value orderings. Lines 22–28 are iterated until all variables are bound (line 21), and each iteration nondeterministically assigns a variable x[a] to a host n (lines 24–27). The variable and value orderings are motivated by the structure of the objective function

    ∑_{i∈C} ∑_{j∈C} f_{i,j} · h_{x_i,x_j}.

In the objective, the largest contributions are induced by assignments of components i and j that communicate heavily and are placed on distant hosts. As a result, the variable and value orderings are based on two ideas:
1. Assign first a component i whose communication frequency f[i,j] with a component j is maximal (line 22);
2. Try the hosts for component i in increasing number of hops required to communicate with component j (line 24).
The variable selection thus selects first the components with the heaviest (single) communications, while the value selection tries to minimize the number of hops.
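A minimal sketch of the two orderings on hypothetical toy data (the actual search is the COMET labeling procedure of Fig. 2, lines 21–29; the helper names are illustrative):

```python
# Sketch of the variable/value orderings with hypothetical toy data.
f = {(0, 1): 10, (1, 2): 4, (0, 2): 1}    # communication frequencies
h = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]      # hop matrix

def pick_component(unbound):
    """Idea 1: pick a component involved in the heaviest communication
    that still has an unbound endpoint; return it with its partner."""
    pairs = [(v, p) for p, v in f.items()
             if p[0] in unbound or p[1] in unbound]
    _, (a, b) = max(pairs)
    return (a, b) if a in unbound else (b, a)

def host_order(hosts, partner_host):
    """Idea 2: try hosts in increasing hops to the partner's host."""
    return sorted(hosts, key=lambda n: h[n][partner_host])

a, b = pick_component({0, 1, 2})          # heaviest pair is (0, 1)
order = host_order([0, 1, 2], partner_host=1)
```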

5 Modeling bandwidth limitations

Realistic target networks may impose bandwidth limitations on connections between nodes, due either to physical channel limitations or to QoS guarantees. To reflect such constraints, the deployment model is next extended to incorporate bandwidth limitations on some of the connections.

Bandwidth limitations apply to network interconnects. Each subnetwork can have two or more machines that are collectively impacted by the bandwidth constraints. Informally, a connection is a network interconnect between a set of k machines. For instance, an interconnect can be a dedicated point-to-point link (e.g., in a hypercube) or a switched Ethernet subnet. A deployment platform then reduces to a set of connections, with some nodes (e.g., routers or machines with several network cards) appearing in several connections to establish bridges.

More formally, a deployment platform is a hypergraph H = (X, E), where X is the set of nodes and E is a set of hyperedges, i.e., E ⊆ P(X) \ {∅} where P(X) is the power-set of X. Each hyperedge c carries a bandwidth capacity denoted by c.bw, and its vertices are denoted by c.nSet. If a hyperedge c (connection) has no bandwidth limitations, its bandwidth capacity is c.bw = 0.

Whereas the basic ESDS model uses a two-dimensional matrix h to capture the communication costs between nodes i and j, the extended model relies instead on a three-dimensional matrix that captures the communication costs from i to j along a specific path k. The new matrix h can be computed from the hypergraph as follows. Each pair of hosts is connected by one or more paths, where a path is an ordered collection of hyperedges (connections) from the source host to the destination host. A path is bandwidth-limited if one or more of its hyperedges (connections) has limited (positive) bandwidth; otherwise the path is not bandwidth-limited.
Given any two hosts i and j, any path that is longer than the shortest path between i and j that is not bandwidth-limited is dominated and can be safely ignored (indeed, one can always use that shortest bandwidth-unlimited path instead without incurring a cost increase). The network's paths are represented by the following calculated parameters:


– P_{i,j} denotes the set of paths from host i to host j, for all i, j ∈ N. Each set contains one shortest path from i to j that is bandwidth-unlimited, if one exists, and all shorter bandwidth-limited paths. If i = j, P_{i,j} contains a single path of zero length;
– A 3-D matrix h, where h_{i,j,p} is the length of path p from host i to host j;
– A boolean matrix hasC, where hasC_{i,j,p,c} is true if path p from host i to host j uses the hyperedge (connection) c.

The model contains additional decision variables to choose the paths to be used by communicating software modules deployed on hosts. Specifically, a new decision variable path_{a,b} is associated with each communicating pair of components a and b, where path_{a,b} ∈ P_{x_a,x_b} is the path component a uses to send messages to component b. This reflects the current assumption that each pair of components uses a single directed path for data transmission. However, path_{a,b} and path_{b,a} may be different. An optimal deployment minimizes

    ∑_{a∈C} ∑_{b∈C} f_{a,b} · h_{x_a,x_b,path_{a,b}}

subject to the supporting, separation, and co-location constraints presented earlier and the following bandwidth constraint for each c ∈ E with c.bw > 0:

    ∑_{a∈C} ∑_{b∈C} f_{a,b} · hasC_{x_a,x_b,path_{a,b},c} ≤ c.bw.
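Under these definitions, the candidate sets P_{i,j} can be computed by enumerating simple paths and discarding dominated ones. The sketch below simplifies hyperedges to ordinary edges and uses a hypothetical three-node network where a direct but bandwidth-limited link competes with a longer unlimited route; all names are illustrative.

```python
# Sketch: compute candidate path sets -- one shortest bandwidth-unlimited
# path plus every strictly shorter bandwidth-limited path. Hyperedges
# are simplified to plain edges; capacity 0 means "unlimited".
edges = {(0, 1): 0, (1, 2): 0, (0, 2): 5}

def simple_paths(src, dst, seen=None):
    """Yield loop-free paths (lists of edges) from src to dst."""
    seen = seen or [src]
    if src == dst:
        yield []
        return
    for (u, v) in edges:
        for s, t in ((u, v), (v, u)):
            if s == src and t not in seen:
                for rest in simple_paths(t, dst, seen + [t]):
                    yield [(u, v)] + rest

def candidate_paths(i, j):
    paths = list(simple_paths(i, j))
    unlimited = [p for p in paths if all(edges[e] == 0 for e in p)]
    if not unlimited:
        return paths                      # only limited paths exist
    best = min(unlimited, key=len)        # shortest unlimited path
    limited = [p for p in paths
               if len(p) < len(best) and any(edges[e] > 0 for e in p)]
    return [best] + limited

P02 = candidate_paths(0, 2)   # keeps the 2-hop unlimited route and the
                              # shorter 1-hop bandwidth-limited link
```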

5.1 The MIP bandwidth extensions

For the bandwidth extensions, the four-dimensional matrix y in the basic MIP model is replaced with a five-dimensional matrix w of 0/1-variables such that w_{a,i,b,j,p} = 1 if x_a = i ∧ x_b = j ∧ path_{a,b} = p. The deployment problem then becomes the minimization of

    ∑_{a∈C} ∑_{b∈C} ∑_{i∈N} ∑_{j∈N} ∑_{p∈P_{i,j}} f_{a,b} · h_{i,j,p} · w_{a,i,b,j,p}

subject to the following constraints:

    z_{a,i} ≤ s_{a,i}                                  ∀a ∈ C, ∀i ∈ N,                 (8)
    ∑_{i∈N} z_{a,i} = 1                                ∀a ∈ C,                         (9)
    ∑_{p∈P_{i,j}} w_{a,i,b,j,p} ≤ z_{a,i}              ∀a, b ∈ C, ∀i, j ∈ N,           (10)
    ∑_{p∈P_{i,j}} w_{a,i,b,j,p} ≤ z_{b,j}              ∀a, b ∈ C, ∀i, j ∈ N,           (11)
    ∑_{p∈P_{i,j}} w_{a,i,b,j,p} ≥ z_{a,i} + z_{b,j} − 1    ∀a, b ∈ C, ∀i, j ∈ N,       (12)
    ∑_{a∈S} z_{a,i} ≤ 1                                ∀S ∈ Sep, ∀i ∈ N,               (13)
    z_{a,i} = z_{b,i}                                  ∀i ∈ N, ∀S ∈ Col, ∀a, b ∈ S,    (14)
    ∑_{a∈C} ∑_{b∈C} ∑_{i∈N} ∑_{j∈N} ∑_{p∈P_{i,j}} f_{a,b} · hasC_{i,j,p,c} · w_{a,i,b,j,p} ≤ c.bw    ∀c ∈ E : c.bw > 0.    (15)

As before, constraints (8) require components to be hosted on supporting hosts, constraints (9) require each component to be deployed on exactly one host, constraints (13) encode the separation requirements, and constraints (14) encode the co-location requirements. Constraints (10), (11), (12) enforce the semantic definition of the z variables using the new w variables. Finally, constraints (15) enforce the bandwidth restrictions. Unlike in the basic MIP model, the variables w_{a,i,b,j,p} and w_{b,j,a,i,p} cannot be combined, since path_{a,b} may not be the same as path_{b,a}.

5.2 The CP bandwidth extensions

The COMET program for bandwidth-limited connections is shown in Fig. 4. The data declarations are specified in lines 2–11, and the decision variables are declared in lines 12–13. As in Sect. 4.2, variable x[c] specifies the host of component c, with its domain computed from the support matrix s. The variable path[c1, c2] specifies the path used to send messages from component c1 to component c2, expressed as the rank of the selected path in the set P[x[c1], x[c2]]. Lines 14–18 specify the objective function, which minimizes communication costs. The CP formulation now uses a three-dimensional element constraint, since the matrix h is indexed not only by the variables for the host nodes of the two components but also by the variable for the particular communication path used between the hosts.

The deployment constraints begin with lines 19–24, which contain the co-location and separation constraints as in Fig. 2. Lines 25–26 limit the ranges of the individual path variables to the number of paths between the hosts onto which the components are deployed. Lines 27–29 are the bandwidth constraints: for each hyperedge c ∈ E, the bandwidth c.bw must be greater than or equal to the sum of the communication frequencies of all pairs of components with c in their chosen path. As in Fig. 2, the onDomains annotations indicate that arc-consistency must be enforced for each constraint.
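The check performed by the bandwidth constraints amounts to a per-connection load computation. A sketch for a fixed assignment of paths, with hypothetical component and connection names (`bandwidth_ok`, "link1", etc. are illustrative, not from the paper's instances):

```python
# Sketch: feasibility check for the bandwidth constraints -- the total
# frequency routed across each limited connection must fit its capacity.
def bandwidth_ok(f, chosen_path, capacity):
    """f[(a, b)]: message frequency from a to b; chosen_path[(a, b)]:
    set of connections on the selected path; capacity[c]: bandwidth
    of connection c, with 0 meaning unlimited."""
    load = {}
    for (a, b), freq in f.items():
        for c in chosen_path[(a, b)]:
            load[c] = load.get(c, 0) + freq
    return all(capacity[c] == 0 or load.get(c, 0) <= capacity[c]
               for c in capacity)

f = {("c1", "fe1"): 6, ("fe1", "s1"): 4}
paths = {("c1", "fe1"): {"link1"}, ("fe1", "s1"): {"link1", "switch"}}
ok = bandwidth_ok(f, paths, {"link1": 10, "switch": 0})    # load 10 <= 10
tight = bandwidth_ok(f, paths, {"link1": 9, "switch": 0})  # load 10 > 9
```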
The search procedure, depicted in lines 31–45, operates in two phases. In the first phase (lines 31–40), all the components are assigned to hosts, beginning with the components that communicate most heavily. Unlike in Fig. 2, the search must estimate the communication cost between the potential deployment sites of components a and b along any given path. Line 33 picks the first site k for component b, and the tryall instruction on line 34 considers the sites for component a in increasing order of path length, based on an estimate equal to the length of the shortest path between the choice n and the selection k. The second phase (lines 41–45) labels the path variables to pick a specific path between all pairs of deployment sites that are still unassigned, backtracking as needed over the initial component assignments.

6 Experimental results

The experimental results are reported first for the basic ESDS deployment problem, and then for the ESDS deployment problem with bandwidth limitations. In both cases, the paper reports results both for the MIP and CP models. The benchmarks are described first, followed by the results. All times are reported in seconds.


Fig. 4 The Bandwidth-Limited Model in COMET

6.1 The basic ESDSDP benchmarks

The models are evaluated on a collection of synthetic benchmarks that are representative of realistic proprietary instances (Aguilera 2007). The benchmarks cover instances with different configurations of software components and different hardware architectures. All instances, which are available upon request, have from 12 to 18 software modules, and the hardware platforms range from 14 to 16 machines with 2 to 4 front-ends. In particular,


Fig. 5 Instance HYPER16: Deploying 18 Components on a 16-Node Hypercube

Fig. 6 Instance SCSS2SNUFE: A Deployment on Two Subnets

Fig. 5 depicts instance HYPER16, which deploys 18 components on a 16-node hypercube, while Fig. 6 depicts instance SCSS2SNUFE. Table 1 gives a more detailed description of the instances. For each benchmark, it gives the number of hosts and components, the separation and co-location constraints, the size of the search space, and the hardware and software configurations. The specification 3:6S:3FE:4C indicates that there are three separation sets: one for the 6 replicas, one for the 3 front-ends, and one for the 4 clients.

The hardware setups named H1 through H5 are specified as follows. H1 is the hardware platform depicted in Fig. 1. H2 is a simple extension of H1 with a fifth client connected to the fifth server. H3 is the network depicted in Fig. 6, with 10 servers on two subnets and 1 server (machine 8) acting as a gateway between the subnets. Each subnet is arbitrated by a switch. The four clients are connected to the first 4 servers on the first subnet. H4 is based on an 8-node hypercube with 7 extra machines, each connected by a direct link to a distinct vertex of the hypercube (so 7 vertices of the hypercube have degree 4 and the last vertex has degree 3). H5 is the 16-node hypercube depicted in Fig. 5.

The software configurations named S1 through S3 are specified as follows. S1 is the architecture depicted in Fig. 1.

Table 1  High-Level Descriptions of the Benchmarks

Bench        #N  #C  S            T     SearchSpace    Hard.  Soft.
SIMPLE2      14  12  1:6S         0     4^3·10^7       H1     S1
SIMPLE1      14  12  1:6S         0     4^3·10^8       H1     S1
SIMPLE0      14  12  1:6S         0     4^3·10^9       H1     S1
fe3c5pc      14  14  2:6S:3FE     0     4^4·10^10      H1     S1
fe3c5pc5     15  14  2:6S:3FE     0     4^4·10^10      H2     S1
fe3c5sun     14  14  2:6S:3FE     0     4^3·10^11      H1     S1
fe3c6pc5     15  15  2:6S:3FE     0     5^5·10^10      H2     S1
fe3c7pc5     15  16  2:6S:3FE     0     5^5·10^11      H2     S1
fe3c7pc5CS   15  16  3:6S:3FE:4C  0     5^5·10^11      H2     S1
fe3c7pc5CST  15  16  3:6S:3FE:4C  1:2C  5^5·10^11      H2     S1
fe3dist      14  13  2:6S:3FE     0     4^3·10^10      H1     S1
SCSS1SNUFE   14  12  2:6S:4FE     0     4^3·12^3·10^6  H1     S2
SCSS2SNUFE   14  12  2:6S:4FE     0     4^3·12^3·10^6  H3     S2
SCSS2SNCFE   14  12  2:6S:4FE     0     4^3·10^9       H3     S2
HYPER8       15  18  2:6S:4FE     0     5^6·10^12      H4     S3
HYPER16      16  18  2:6S:4FE     0     5^6·10^12      H5     S3

S2 is the architecture depicted in Fig. 6. It is identical to S1, except that clients can communicate with more than one front-end, and front-ends communicate with several replicas as well. S3 is a scaled-up version of S2 and is shown in Fig. 5.

6.2 The MIP model

The MIP model was run using CPLEX version 11 on an AMD Athlon64 at 2 GHz with 2 gigabytes of RAM. Various attempts were made to find the best settings for CPLEX. Changing the variable branching heuristic to use the coefficients of the objective function (set mip ordertype 1 in the interactive solver) appears to deliver the best performance; with it, CPLEX generally finds the optimal solution early in the search, contrary to the default settings. Table 2 reports the results for the MIP model. Column Tend reports the running times (in seconds) to find the optimal solution and prove optimality. Column Topt reports the times to find the optimal solutions. Column #Nodes reports the number of nodes in the branch & bound tree, column MTS gives the peak size of the branch & bound tree (in megabytes), columns #R and #C give the number of rows and columns after presolve, and column Gap gives the optimality gap, as a percentage of the optimal solution, at the time the optimum is found. Overall, the results indicate that optimal solutions can be found in reasonable time (except when the network topology is a hypercube) and that optimality proofs require significant computational resources. More precisely, the computational results can be summarized as follows.

1. On the FExCySz instances, the solver finds an optimum relatively quickly (from 210 to 1717 seconds) but the proofs of optimality are very costly.
2. On the SCSSx instances, the MIP solver also finds optimal solutions quickly on the first two instances. The proof of optimality takes significant time in both cases. In the third instance, the situation is reversed and the solver spends half the total time finding the optimum.

Table 2  Experimental Results for the MIP Model

Bench        Tend   #Nodes  Topt   MTS     #R     #C     Gap
SIMPLE2      4      68      2      —       8834   3070   61.34%
SIMPLE1      1420   4238    247    2.58    11295  3900   25.77%
SIMPLE0      1182   11053   190    7.86    14056  4830   32.47%
fe3c5pc      5091   22741   1717   25.04   18442  6312   39.79%
fe3c5pc5     5407   26360   960    30.59   19804  6770   37.85%
fe3c5sun     7877   29216   210    33.94   20458  6990   88.50%
fe3c6pc5     10556  35901   507    41.54   21605  7375   78.21%
fe3c7pc5     14712  26818   312    34.38   25356  8635   70.50%
fe3c7pc5CS   44677  81673   387    116.00  25356  8635   67.82%
fe3c7pc5CST  6154   51223   480    66.44   25290  8610   64.50%
fe3dist      4105   33889   3100   34.62   17097  5860   15.44%
SCSS1SNUFE   844    16350   135    14.73   17492  5994   70.08%
SCSS2SNUFE   1624   32846   90     29.82   17492  5994   83.33%
SCSS2SNCFE   3026   72711   1551   55.20   14048  4830   26.12%
HYPER8       53906  124918  51381  190.95  31583  10725  5.82%

3. On the HYPERx instances, the MIP solver has great difficulties and only solves the smallest instance. It took slightly more than 14 hours to obtain the optimal solution on HYPER8 and almost 15 hours to prove its optimality.
4. The MIP solver explores a large number of nodes on these instances, consumes significant memory, and exhibits large optimality gaps. The linear relaxation is not particularly strong, which explains the large search trees.

Although they highlight the computational progress in MIP solvers, these results are quite sobering. Indeed, the MIP solver took 1182 seconds to prove the optimality of the instance discussed in Sect. 3 after obtaining the optimal solution in about 190 seconds. This instance does not seem particularly difficult, since its software communication patterns are rather simple and well-structured.

6.3 The CP model

Table 3 reports the results for the CP model with Comet 0.07 (executing on an Intel Core at 2 GHz with 2 gigabytes of RAM). Column Tend gives the time to find the optimum and prove optimality, column #Chpt reports the number of choice points, and column Topt reports the time to find the optimum. Several observations can be made about these results.

1. The CP model dramatically outperforms the MIP model, with speedup factors ranging from 37 on SIMPLE2 to 2837 on HYPER8. The number of search nodes is an order of magnitude smaller for CP, showing the significant pruning obtained by the CP solver compared to the MIP solver.
2. On the FExCySz instances, the CP solver finds the optimal solutions and proves optimality in less than 6 seconds.
3. On the SCSSx instances, the CP solver also finds the optimal solutions and proves optimality in less than 32 seconds. Interestingly, having several subnets does not seem to help the CP solver as far as computation times are concerned, although the number of choice points decreases.

Table 3  Experimental Results for the CP Model

Bench     Tend   #Chpt  Topt       Bench        Tend    #Chpt   Topt
SIMPLE2   0.108  428    0.001      fe3c7pc5CS   3.010   10881   0.002
SIMPLE1   0.695  2720   0.002      fe3c7pc5CST  5.959   20634   0.002
SIMPLE0   4.005  15822  0.001      fe3dist      7.741   37090   0.002
fe3c5pc   1.264  3973   0.002      SCSS1SNUFE   21.249  69271   0.001
fe3c5pc5  1.974  6401   0.002      SCSS2SNUFE   31.754  105329  19.712
fe3c5sun  2.712  7802   0.002      SCSS2SNCFE   24.970  86864   14.446
fe3c6pc5  1.501  5902   0.002      HYPER8       19.047  42427   0.574
fe3c7pc5  3.231  11324  0.002      HYPER16      53.014  127834  45.075

4. On the HYPERx instances, the CP solver finds the optimum and proves optimality in less than a minute. These instances are more challenging (as they were for the MIP solver) but the computation times remain reasonable.
5. The instances HYPERx and SCSSx are hard on two counts: finding the optimum and proving optimality. SCSS1SNUFE is the only exception, which indicates that the network topology significantly impacts the search.

Note that the simple instance discussed in Sect. 3 now requires 4 seconds to find the optimum and prove its optimality (instead of 1182 seconds for the MIP solver). This remains sobering given the simplicity of that particular instance, but it is perhaps reassuring that more complex FExCySz instances (from a visual standpoint) require roughly the same time.

The value of arc-consistency  The CP model presented in this paper is quite elegant, since it enforces arc-consistency on all constraints, including the element constraints derived from the objective function.2 One may wonder whether arc-consistency is critical in ESDSDPs or whether a weaker form of consistency is sufficient. Table 4 depicts the results when only bound reasoning is performed on the objective. The second and the third columns report the results of the CP solver when bound-consistency is enforced on the objective, while the fourth and the fifth columns reproduce those in Table 3. The experimental results show a dramatic loss in performance when arc-consistency is not used. On some benchmarks, the CP model with bound-consistency on the objective becomes more than 6000 times slower than the model with arc-consistency on the objective. The all-different constraints are less critical, and reverting to a weaker consistency there does not induce significant losses. The performance gap between arc-consistency and bound-consistency does not appear to stem from either a poor permutation of the rows and columns in the deployment variable domains or the small set of values in the hops matrix.
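To make the distinction concrete, here is a minimal, hypothetical sketch (not Comet's actual propagator, and with invented data) of one support-based filtering pass for a single element constraint cost = hops[x][y]: a host value survives only if some combination of values in the other domains supports it, which is precisely the pruning that bound reasoning on the objective gives up.

```python
# Hypothetical sketch of support-based (arc-consistency style) filtering for
# one element constraint cost = hops[x][y]. One pass per variable: a value is
# kept only if some pair of values in the other two domains supports it.

def ac_filter(hops, dx, dy, dcost):
    dx = {vx for vx in dx if any(hops[vx][vy] in dcost for vy in dy)}
    dy = {vy for vy in dy if any(hops[vx][vy] in dcost for vx in dx)}
    dcost = {c for c in dcost
             if any(hops[vx][vy] == c for vx in dx for vy in dy)}
    return dx, dy, dcost

hops = [[0, 1, 3],
        [1, 0, 2],
        [3, 2, 0]]
# Suppose the branch-and-bound cut restricts the hop cost to {0, 1}: host 2
# then has no supporting pair and is pruned from the domain of x outright.
dx, dy, dcost = ac_filter(hops, {0, 1, 2}, {0, 1}, {0, 1})
print(dx, dy, dcost)
```

A bounds-only rule reasoning on the minimum and maximum of the cost would not remove host 2 here until much later in the search, which is consistent with the large gap observed in Table 4.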
Table 5 compares three permutations of the components and nodes for SIMPLE2, namely, the "default" permutation used in the benchmark, a "poor" permutation, and the "ideal" permutation for bound-consistency, which has the components and nodes arranged contiguously in the order in which they are assigned in the search (similar to Fig. 3(3)). As expected, the bound-consistency performance improves with better permutations, whereas the arc-consistency performance remains steady. However, even if an oracle suggests an "ideal" permutation, the bound-consistency performance is not competitive with arc-consistency. Table 6 shows the impact of artificially increasing the set of values in the hops matrix of SIMPLE2 from {0..3} to {0..21}. Performance degrades with more values for both arc-consistency and bound-consistency, but a gap remains. (It is unclear why the bound-consistency performance is relatively poor for the default hop values of {0..3}.) Both of these tests were performed with Comet 0.07 executing on an Intel Pentium 4 CPU at 2.8 GHz with 512 megabytes of RAM.

2 As is traditional, the solver dynamically adds a new constraint whenever a new solution is produced, forcing the objective function to improve upon the best known solution.

Table 4  The Value of Arc-Consistency for the CP Model

             CP-BC                  CP-AC
Bench        Tend       #Chpt       Tend    #Chpt
SIMPLE2      18.505     132508      0.108   428
SIMPLE1      151.462    1044624     0.695   2720
SIMPLE0      1260.109   87783771    4.005   15822
fe3c5pc      2984.628   22307333    1.264   3973
fe3c5pc5     5969.930   37506226    1.974   6401
fe3c5sun     6923.615   51771902    2.712   7802
fe3c6pc5     8516.837   49757614    1.501   5902
fe3c7pc5     21322.817  121033129   3.231   11324
fe3c7pc5CS   19565.937  110904263   3.010   10881
fe3c7pc5CST  12533.717  82704740    5.929   20634
fe3dist      10684.669  62887633    7.741   37090
SCSS1SNUFE   2836.095   23234748    21.249  69271
SCSS2SNUFE   1093.347   7765076     31.754  105329
SCSS2SNCFE   735.302    5542510     24.970  86864
HYPER8       3022.560   13329412    19.047  42427
HYPER16      8299.701   39623171    53.014  127834

Table 5  The Value of Arc-Consistency with the "Ideal" Row and Column Permutations

             CP-BC            CP-AC
Permutation  Tend    #Chpt    Tend   #Chpt
Default      51.791  132749   0.348  477
Poor         57.939  132949   0.352  498
Ideal        47.474  126967   0.336  477

Table 6  The Value of Arc-Consistency with Larger Sets of Hops Values

             CP-BC            CP-AC
Hops values  Tend    #Chpt    Tend    #Chpt
{0..3}       52.011  132749   0.348   477
{0..7}       19.021  50868    1.492   1908
{0..12}      24.849  61334    2.720   3728
{0..14}      35.014  93299    7.916   11530
{0..21}      47.598  129911   12.156  20980

Fig. 7 The Search Procedure with Value Symmetry Breaking

Table 7  The Impact of Value Symmetry Breaking

             CP+SYM          CP
Bench        Tend   #Chpt    Tend    #Chpt
fe3c7pc5     0.032  137      3.231   11324
fe3c7pc5CS   0.031  128      3.010   10881
fe3c7pc5CST  0.046  203      5.929   20634
fe3dist      0.062  394      7.741   37090
SCSS1SNUFE   0.100  508      21.249  69271
SCSS2SNUFE   0.759  3432     31.754  105329
SCSS2SNCFE   0.449  2250     24.970  86864

Exploiting value symmetries  ESDSDPs may feature a variety of symmetries, which can be removed to further improve performance. Consider the instance presented in Sect. 3 and the

10 heavy-duty servers in particular. Four of these servers are connected through dedicated links to the "light" servers that must host some of the clients. This creates two classes of heavy-duty servers: those connected to the light servers and those that are not. Moreover, heavy-duty servers 6–10 are fully interchangeable, in that any permutation of these servers in a solution also produces a solution. Techniques for removing such symmetries during search are well known (see, for instance, Van Hentenryck et al. 2003).

Figure 7 illustrates how to enhance the search procedures presented earlier with symmetry breaking. The sets of equivalent hosts are supplied as additional input data and are used to determine the set of non-equivalent hosts (lines 1–2). Each iteration of the search procedure computes the set of nodes that are bound (line 6) and the set of nodes eligible to host the next component (lines 7–11). Line 7 initializes searchNodes to all the non-equivalent nodes plus all the nodes on which components are already deployed. The loop in lines 8–11 then adds to searchNodes one still-unused node from each equivalence class. Lines 6–13 of Fig. 7 replace lines 23–24 of Fig. 2 and lines 33–34 of Fig. 4. Note that this search procedure handles all the value equivalence classes and implements partial value interchangeability.

Table 7 depicts the results on the larger instances in which there are interchangeable heavy servers. The second and the third columns report the results of the CP solver with

Fig. 8 Instance RING6: Deploying to a tightly coupled, bandwidth-limited network

Table 8  The Impact of Parallelism (Tend in seconds for 1 to 4 processors)

Bench    1      2      3      4
HYPER8   20.52  10.09  6.85   5.41
HYPER16  44.35  20.85  16.35  11.13

symmetry breaking, while the fourth and the fifth columns reproduce the standard results from Table 3. The experimental results show substantial improvements, with speedups of up to two orders of magnitude on some instances. The ability to break symmetries during search is important here to avoid interfering with the search heuristics. All these instances (which do not include the hypercube instances) are now solved optimally in under a second. Note that there are potentially many other forms of symmetries in ESDSDPs which have not been exploited. For instance, in the simple ESDSDP presented earlier, the software components s3, ..., s6 are symmetric, revealing some variable symmetries. We made no effort to remove these symmetries, since they are instance-dependent and were not necessary to solve the CP model effectively.

Parallel computing  Since Comet supports transparent parallelism (Michel et al. 2007), the performance of the CP model was evaluated with a 4-processor SMP machine on the hypercube instances. The results in Table 8 exhibit linear speedups. (A parallel license for CPLEX was not available, so the impact of parallelism on the MIP model could not be determined.)

6.4 The extended ESDSDP with bandwidth benchmarks

The bandwidth benchmarks fall into three categories: variants of SIMPLE2 as depicted in Fig. 1, variants of HYPER8, and variants of the RING6 model shown in Fig. 8. The RING6 model is studied not because it reflects an actual network configuration, but because it is a simple representation of yet another class of network configurations, namely those with tightly coupled hosts and multiple bandwidth limitations. The bandwidth benchmarks differ from the earlier benchmarks in that they contain one extra software module between each pair of replicas (components s1, ..., s6 in Fig. 1 and r1, ..., r6 in Fig. 8): a "driver" that manages the communication channel.
These driver modules model the capabilities of the communication infrastructure of a distributed system more realistically. Each driver is required to be co-located with its sending replica. Since most of

Table 9  Comparison of MIP and CP Models for Bandwidth-Limited Benchmarks

           MIP                           CP
Benchmark  Tend       #Nodes  Topt      Tend       #Chpt   Topt
SIM2BW0    14.52      38      2         0.090      115     0.007
SIM2BW1    12.85      67      4         0.277      169     0.022
SIM2BW2    17.05      29      2         0.372      155     0.026
HYP8BW0    138330.00  395029  27190     56.484     39516   35.522
HYP8BW1    142387.94  277075  117800    4097.667   119032  315.331
HYP8BW4    71101.72   197065  56160     12791.568  111290  10890.159
RING4      9.83       10      1         0.244      206     0.228
RING5      66.41      91      60        11.208     26727   10.834
RING6      327.91     100     270       87.561     208575  85.498

the benchmarks have six gossiping data replicas, this adds 6 × 6 or 36 software modules to the earlier benchmarks and typically triples the time to find an optimal deployment. The bandwidth benchmarks are as follows:

– SIM2BW0 is a variant of SIMPLE2 with no bandwidth-limited connections.
– SIM2BW1 is a variant of SIMPLE2 with a bandwidth limit of 5 on the connection between PC2 and s2.
– SIM2BW2 is a variant of SIMPLE2 with a bandwidth limit of 5 on the connection between PC2 and s2 and a bandwidth limit of 10 on the connection between PC3 and s3.
– HYP8BW0 is a variant of HYPER8 with no bandwidth-limited connections.
– HYP8BW1 is a variant of HYPER8 with a bandwidth limit of 10 on the connection between PC1 and s1.
– HYP8BW4 is a variant of HYPER8 with four bandwidth-limited connections.
– RING4 is a variant of RING6 with only four gossiping replicas (and no messages from c3 to r5).
– RING5 is a variant of RING6 with only five gossiping replicas.
– RING6 is illustrated in Fig. 8.

Table 9 reports the experimental results for the MIP model, using CPLEX version 11 on an AMD Athlon64 at 2 GHz with 2 gigabytes of RAM, and for the CP model, with Comet 0.07 executing on an Intel Core 2 Duo at 2.4 GHz with 2 gigabytes of RAM. The CP model continues to outperform the MIP model on all benchmarks, with speedups ranging from a factor of 6 up to two orders of magnitude. The differences with the MIP solver are, however, less dramatic than before. Note that, as the instances become increasingly constrained (e.g., going from HYP8BW0 to HYP8BW4), the performance of the CP model progressively degrades. We conjecture that the CP model has a hard time proving infeasibility during the second phase when there are many alternative paths, whereas the MIP solver, with its lower bound, is able to avoid some of the infeasible subtrees. In spite of this, the CP model is still much faster than the MIP model.
Note that the CPLEX traces for HYP8BW1 and HYP8BW4 indicate that the optimality gap indeed closes much faster when the model contains more bandwidth constraints. This suggests that the MIP solver, with its lower bound from the linear relaxation, can leverage that information to perform stronger optimality-based pruning. The CP model finds the optimal solution for many of the bandwidth benchmarks fairly late in the computation, which suggests that alternative search strategies might deliver more efficient approaches. A more complete exploration of the bandwidth-limited model is reported

Ann Oper Res

in Michel et al. (2009). Better path-selection procedures within the search yield speedup factors of almost 40 for HYP8BW4 and more than 100 for HYP8BW1. Constraint-based local search and hybrid models are also considered, with good results.
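As a closing illustration of the feasibility side of the extended model, the following sketch checks that the traffic routed over each capacity-limited link stays within its limit. The data and link names are invented, and the routes are fixed here for simplicity; the actual model also chooses the routing paths.

```python
# Illustrative sketch (invented data): in the bandwidth-limited model, the
# total message frequency routed across each capacity-limited link must not
# exceed that link's limit.

def within_bandwidth(routes, freq, limits):
    """routes: component pair -> list of links; limits: link -> capacity."""
    load = {}
    for pair, path in routes.items():
        for link in path:
            load[link] = load.get(link, 0) + freq[pair]
    return all(load.get(link, 0) <= cap for link, cap in limits.items())

freq = {("c1", "s1"): 4, ("c2", "s2"): 7}
routes = {("c1", "s1"): [("PC1", "s1")],
          ("c2", "s2"): [("PC2", "s2")]}

print(within_bandwidth(routes, freq, {("PC1", "s1"): 5}))  # 4 <= 5: feasible
print(within_bandwidth(routes, freq, {("PC2", "s2"): 5}))  # 7 > 5: infeasible
```

In the full problem this check interacts with the deployment decisions, since moving a component changes which links its messages traverse; that interaction is what makes the bandwidth-limited instances harder for both solvers.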

7 Conclusion

In distributed systems, Eventually-Serializable Data Services (ESDS) support data replication and reduce the communication costs required to maintain consistency between the replicas: users of the service may exploit the semantics of the sequential data types to relax consistency for certain operations, while eventual consistency is still guaranteed. Once ESDS protocols and algorithms have been designed and formally verified, they are deployed on specific architectures. More precisely, in ESDS deployments, a collection of communicating software components must be mapped onto a physical architecture to minimize the communication costs while preserving the safety of the ESDS service. The ESDS deployment problem was first considered in the late 1990s, but practical instances could not be solved optimally at the time by state-of-the-art MIP solvers.

This paper reconsidered the deployment problem in ESDS, presented a traditional MIP model, and proposed a CP model. The models were evaluated experimentally on synthetic instances capturing the sizes and properties of ESDS applications. The experimental results indicate that MIP solvers are now capable of solving ESDS deployment problems, although these applications remain extremely challenging. In particular, the optimality gap is substantial for the MIP model, which gives rise to very large search trees. The CP model brings orders of magnitude improvements in performance over the MIP model; it returns optimal solutions and proves optimality within a minute in the worst case and within 8 seconds in general. The CP model enforces arc-consistency on the alldifferent and multi-dimensional element constraints, which is critical for good performance. The benefits of symmetry breaking and parallelism were also studied. Symmetry breaking brings significant speedups (up to a factor of 200), while parallel computing produces linear speedups.
An extended model that captures bandwidth constraints on the various network interconnects forming the communication infrastructure was also developed and evaluated. While all instances can still be solved more rapidly with a CP solver, the gap between the MIP and CP solvers shrinks as the number of bandwidth-constrained network connections increases. Future work will exploit the structure of ESDS deployments more finely, as these problems often exhibit regularities in the underlying hardware or in the software components. Given the improvements in obtaining optimal solutions, it makes sense to explore a portfolio of algorithms that includes a CP model for the on-line deployment optimization of dynamically reconfigurable distributed data services such as Rambo.

Acknowledgements  This work was partially supported through NSF awards ITR-0121277, DMI-0600384, IIS-0642906 and CCF-0702670, AFOSR Contract FA955007C0114, and ONR award N000140610607.

References

Aguilera, M. (2007). Hewlett-Packard. Personal communication.
Bastarrica, M. C. (2000). Architectural specification and optimal deployment of distributed systems. Ph.D. thesis, University of Connecticut.
Bastarrica, M., Demurjian, S., & Shvartsman, A. (1998). Software architectural specification for optimal object distribution. In SCCC'98: Proc. of the XVIII int'l conf. of the Chilean computer science society, Washington, DC, USA.
Behrmann, G., David, A., Larsen, K., Möller, O., Pettersson, P., & Yi, W. (2001). Uppaal—present and future. In Proceedings of the 40th IEEE conference on decision and control (CDC'2001) (pp. 2881–2886).
Boysen, N. (2010). Truck scheduling at zero-inventory cross docking terminals. Computers and Operations Research, 37(1), 32–41. http://dx.doi.org/10.1016/j.cor.2009.03.010.
Cheiner, O., & Shvartsman, A. (1999). Implementing an eventually-serializable data service as a distributed system building block. Networks in Distributed Computing, 45, 43–71.
Fekete, A., Gupta, D., Luchangco, V., Lynch, N., & Shvartsman, A. (1996). Eventually-serializable data services. In PODC'96: Proceedings of the fifteenth annual ACM symposium on principles of distributed computing (pp. 300–309).
Gilbert, S., Lynch, N. A., & Shvartsman, A. A. (2003). Rambo II: rapidly reconfigurable atomic memory for dynamic networks. In DSN (p. 259). Los Alamitos: IEEE Computer Society.
IETF (1990). Domain name system, RFC 1034 and RFC 1035.
Kaynar, D. K., Lynch, N., Segala, R., & Vaandrager, F. (2006). The theory of timed I/O automata (Synthesis lectures in computer science). San Rafael: Morgan & Claypool Publishers.
Larsen, K. G., Pettersson, P., & Yi, W. (1997). UPPAAL in a nutshell. International Journal on Software Tools for Technology Transfer, 1(1–2), 134–152.
Li, Y., Lim, A., & Rodrigues, B. (2004). Crossdocking—JIT scheduling with time windows. Journal of the Operational Research Society, 55(12), 1342–1351.
Lim, A., Ma, H., & Miao, Z. (2006). Truck dock assignment problem with time windows and capacity constraint in transshipment network through crossdocks. Lecture notes in computer science: Vol. 3982 (p. 688).
Lynch, N., & Shvartsman, A. (2002). Rambo: a reconfigurable atomic memory service for dynamic networks. In Proceedings of the 16th international symposium on distributed computing (pp. 173–190).
Lynch, N., & Tuttle, M. (1989). An introduction to Input/Output Automata. CWI-Quarterly, 2(3), 219–246.
Lynch, N. A., Garland, S., Kaynar, D., Michel, L., & Shvartsman, A. (2007). The Tempo language user guide and reference manual. http://www.veromodo.com, VeroModo Inc., December 2007.
Miao, Z., Lim, A., & Ma, H. (2009). Truck dock assignment problem with operational time constraint within crossdocks. European Journal of Operational Research, 192(1), 105–115.
Michel, L., See, A., & Van Hentenryck, P. (2007). Parallelizing constraint programs transparently. In Proceedings of the 13th international conference on the principles and practice of constraint programming (CP-2007), Providence.
Michel, L., Van Hentenryck, P., Sonderegger, E., Shvartsman, A., & Moraal, M. (2009). Bandwidth-limited optimal deployment of eventually-serializable data services. In Integration of AI and OR techniques in constraint programming for combinatorial optimization problems: 6th international conference, CPAIOR 2009 (p. 193), Pittsburgh, PA, USA, May 27–31, 2009. Berlin: Springer.
Owre, S., Rajan, S., Rushby, J. M., Shankar, N., & Srivas, M. K. (1996). PVS: combining specification, proof checking, and model checking. In Proceedings of the eighth international conference on computer aided verification CAV (Vol. 1102, pp. 411–414), New Brunswick, NJ, USA.
Pardalos, P. M., & Wolkowicz, H. (Eds.) (1994). Quadratic assignment and related problems: DIMACS workshop, May 20–21, 1993. DIMACS series in discrete mathematics and theoretical computer science: Vol. 16. Providence: American Mathematical Society.
Saito, Y., Frølund, S., Veitch, A. C., Merchant, A., & Spence, S. (2004). FAB: building distributed enterprise disk arrays from commodity components. In Mukherjee, S., & McKinley, K. S. (Eds.), ASPLOS (pp. 48–58). New York: ACM.
Van Hentenryck, P., Flener, P., Pearson, J., & Ågren, M. (2003). Tractable symmetry breaking for CSPs with interchangeable values. In International joint conference on artificial intelligence (IJCAI'03).
Yu, W., & Egbelu, P. (2008). Scheduling of inbound and outbound trucks in cross docking systems with temporary storage. European Journal of Operational Research, 184(1), 377–396.
