The Stochastic Inventory Routing Problem

Anton J. Kleywegt∗  Vijay S. Nori  Martin W. P. Savelsbergh

School of Industrial and Systems Engineering
Georgia Institute of Technology
Atlanta, GA 30332-0205

July 16, 2000
Abstract

This work is motivated by the need to solve the inventory routing problem when implementing a business practice called vendor managed inventory replenishment (VMI). With VMI, vendors monitor their customers' inventories, and decide when and how much inventory should be replenished at each customer. The inventory routing problem attempts to coordinate inventory replenishment and transportation in such a way that the cost is minimized over the long run. In a recent paper (Kleywegt, Nori and Savelsbergh 1999), we formulated the inventory routing problem as a Markov decision process, and proposed approximation methods to find good solutions with reasonable computational effort for the inventory routing problem with direct deliveries. In this paper we propose solution methods for the inventory routing problem in which multiple customers can be visited on a route.
∗Supported by the National Science Foundation under grant DMI-9875400.
Introduction

Recently the business practice called vendor managed inventory replenishment (VMI) has been adopted by many companies. VMI refers to the situation in which a vendor monitors the inventory levels at its customers and decides when and how much inventory to replenish at each customer. This contrasts with conventional inventory management, in which customers monitor their own inventory levels and place orders when they think that it is the appropriate time to reorder. VMI has several advantages over conventional inventory management. Vendors can usually obtain a more uniform utilization of production resources, which leads to reduced production and inventory holding costs. Similarly, vendors can often obtain a more uniform utilization of transportation resources, which in turn leads to reduced transportation costs. Furthermore, additional savings in transportation costs may be obtained by increasing the use of low-cost full-truckload shipments and decreasing the use of high-cost less-than-truckload shipments, and by using more efficient routes by coordinating the replenishment at customers close to each other. VMI also has advantages for customers. Service levels, measured in terms of reliability of product availability, may increase, because vendors can use the information that they collect on the inventory levels at the customers to better anticipate future demand, and to proactively smooth peaks in the demand. Also, customers do not have to devote as many resources to monitoring their inventory levels and placing orders, as long as the vendor is successful in earning and maintaining the trust of the customers. A first requirement for a successful implementation of VMI is that a vendor is able to obtain relevant and accurate information in a timely and efficient way.
One of the reasons for the increased popularity of VMI is the increase in the availability of affordable and reliable equipment to collect and transmit the necessary data between the customers and the vendor. However, access to the relevant information is only one requirement. A vendor should also be able to use the increased amount of information to make good decisions. This is not an easy task; in fact, it is a very complicated task, as the decision problems involved are very hard. The objective of this work is to develop efficient methods to help the vendor make good decisions when implementing VMI. In many applications of VMI, the vendor manages a fleet of vehicles to transport the product to the customers. The objective of the vendor is to coordinate the inventory replenishment and transportation in such a way that the total cost is minimized over the long run. The problem of optimal coordination of inventory replenishment and transportation is called the inventory routing problem (IRP). In a recent paper (Kleywegt, Nori and Savelsbergh 1999), we formulated the inventory routing problem as a Markov decision process, and we proposed approximation methods to find good solutions with reasonable computational effort for the inventory routing problem with direct deliveries (IRPDD). In this paper we propose solution methods for the inventory routing problem in which multiple customers can be visited on a route. Specifically, we study the problem of determining optimal policies for the distribution of a single product from a single vendor to multiple customers. The demands at the customers are assumed to have probability distributions that are known to the vendor. The objective is to maximize the expected discounted value, incorporating sales revenues, production costs, transportation costs, inventory holding costs, and shortage penalties, over an infinite horizon. A review of the related literature on the IRP was given in Kleywegt, Nori and Savelsbergh (1999).
Our work on this problem was motivated by our collaboration with a producer and distributor of air products. The company operates plants worldwide and produces a variety of air products, such as liquid nitrogen, oxygen and argon. The company’s bulk customers have their own storage tanks at their sites, which are replenished by tanker trucks under the supplier’s control. Approximately 80% of the bulk customers participate in the company’s VMI program. For the most part each customer and each vehicle is allocated to a specific plant, so that the overall problem decomposes according to individual plants. Also, to improve safety and reduce contamination, each vehicle and each storage tank at a customer is dedicated to a particular type of product. Hence the problem also decomposes according to type of product. We assume that vehicles and drivers are available at the beginning of each day, while appreciating the fact that this is not always the case in practice. The assumption that the probability distributions of the customers’ demands are known to the vendor and do not change over time can of course be criticized. In practice, these probability distributions have to be estimated from data, and the probability distributions do change over time. Fortunately, in this particular case, a large amount of data is available, and the demand characteristics of consumers do not seem to change rapidly over time.
1 Problem Definition
The problem definition of the IRP given in this section is similar to that in Kleywegt, Nori and Savelsbergh (1999). A more general description of the IRP is given in Section 1.1, after which a Markov decision process formulation is given in Section 1.2.
1.1 Problem Description
A product is distributed from a vendor's facility to N customers, using a fleet of M homogeneous vehicles, each with known capacity C_V. The process is modeled in discrete time t = 0, 1, …, and the discrete time periods are called days. Customers' demands on different days are independent random vectors with a joint probability distribution F that does not change with time. The probability distribution F is known to the vendor. The vendor can measure the inventory level X_nt of each customer n at any time t. At each time t, the vendor makes a decision regarding which customers' inventories to replenish, how much to deliver at each customer, how to combine customers into vehicle routes, and which vehicle routes to assign to each of the M vehicles. The set of feasible decisions is determined by constraints on the travel times and work hours of vehicles and drivers, delivery time windows at the customers, the maximum inventory levels and current inventory levels of customers, and other constraints dictated by the particular application. Taking these constraints into account, we assume that the decision maker can determine, for any given subset of the customers, whether there is a feasible route that visits all the customers in the subset. Also, there is a maximum amount C_i of the product that can be at each customer i. This maximum C_i can be due to limited storage capacity at customer i, as in the application that motivated this research. In other applications of VMI, there is often a contractual limit C_i, agreed upon by customer i and the vendor, on the maximum amount of inventory that may be at customer i at any point in time. One motivation for this contractual limit is to prevent the vendor from dumping too much product at the customer. It may be feasible for a
vehicle to perform more than one route per day. We assume that the duration of the task, consisting of one or more routes, assigned to each driver is less than the length of a day, so that all vehicles and drivers are available at the beginning of each day, when the tasks for that day are assigned. The cost of each itinerary is known to the vendor. This includes the travel costs c_ij on the arcs (i, j) of the distribution network, which may also depend on the amount of product transported along the arc. The cost of an itinerary may include the costs incurred at customers' sites, for example due to product losses during delivery. If quantity d_i is delivered at customer i, the vendor earns a reward of r_i(d_i). Because demand is uncertain, there is often a positive probability that a customer runs out of stock, and thus shortages cannot always be prevented. Shortages are discouraged with a penalty p_i(s_i) if the unsatisfied demand on day t at customer i is s_i. Unsatisfied demand is treated as lost demand, and is not backlogged. If the inventory at customer i is x_i at the beginning of the day, and quantity d_i is delivered at customer i, then an inventory holding cost of h_i(x_i + d_i) is incurred. The inventory holding cost can also be modeled as a function of some "average" amount of inventory at each customer during the time period. The role played by inventory holding cost depends on the application. In some cases, the vendor and customers belong to different organizations, and the customers own the inventory. In these cases, the vendor typically does not incur any inventory holding costs based on the inventory at the customers. This was the case in the application that we worked on. In other cases, such as when the vendor and customers belong to the same organization, or when the vendor owns the inventory at the customers, the vendor does incur inventory holding costs based on the inventory at the customers.
The objective is to choose a distribution policy that maximizes the expected discounted value (rewards minus costs) over an infinite time horizon.
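As a concrete illustration of these cost components, the sketch below computes one day's net reward at a single customer under simple linear reward, holding-cost, and shortage-penalty functions. The linear forms and the numeric rates are illustrative assumptions, not taken from the paper.

```python
# Hypothetical one-customer illustration of the stage economics described
# above: reward r_i(d_i), holding cost h_i(x_i + d_i), shortage penalty
# p_i(s_i). The linear forms and the rates are assumptions for illustration.

def stage_net_reward(x, d, u, r_rate=5.0, h_rate=0.1, p_rate=8.0):
    """Net reward for one day at one customer.

    x: inventory at start of day, d: quantity delivered,
    u: realized demand, rates: per-unit reward/holding/penalty.
    """
    reward = r_rate * d                  # revenue earned for delivered product
    holding = h_rate * (x + d)           # holding cost on inventory after delivery
    shortage = max(u - (x + d), 0.0)     # unsatisfied (lost, not backlogged) demand
    penalty = p_rate * shortage          # shortage penalty
    return reward - holding - penalty

# Example: 10 units on hand, deliver 20, demand turns out to be 35,
# so 5 units of demand are lost.
value = stage_net_reward(x=10, d=20, u=35)
```

With these rates the example evaluates to 5·20 − 0.1·30 − 8·5 = 57.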
1.2 Problem Formulation
We formulate the IRP as a discrete time Markov decision process with the following components:

1. The state x = (x_1, x_2, …, x_N) represents the current amount of inventory at each customer. Thus the state space is X = [0, C_1] × [0, C_2] × ··· × [0, C_N] if the quantity of product can vary continuously, or X = {0, 1, …, C_1} × {0, 1, …, C_2} × ··· × {0, 1, …, C_N} if the quantity of product varies in discrete units. Let X_it ∈ [0, C_i] (or X_it ∈ {0, 1, …, C_i}) denote the inventory level at customer i at time t. Let X_t = (X_1t, …, X_Nt) ∈ X denote the state at time t.

2. The decision space A(x) for each state x is the set of all decisions that satisfy the work load constraints, such that the vehicles' capacities are not exceeded, and the customers' maximum inventory levels are not exceeded after deliveries. For example, if d_i(a) denotes the quantity of product that is delivered to customer i while executing decision a, then the constraint that customers' maximum inventory levels not be exceeded after deliveries can be expressed as X_it + d_i(A_t) ≤ C_i for all i and t, if it is assumed that no product is used between the time that the inventory level X_it is measured and the time that the delivery of d_i(A_t) takes place. If product is used during this time period, it may be possible to deliver more. The exact way in which the constraint is applied does not affect the rest of the development. For simplicity we applied the constraint as stated above. Let A_t ∈ A(X_t) denote the decision chosen
at time t.

3. Let U_it denote the demand of customer i at time t. Then the amount of product used by customer i at time t is given by min{X_it + d_i(A_t), U_it}. Thus the shortage at customer i at time t is given by S_it = max{U_it − (X_it + d_i(A_t)), 0}, and the inventory level at customer i at time t + 1 is given by X_{i,t+1} = max{X_it + d_i(A_t) − U_it, 0}. The known joint probability distribution F of customer demands U gives a known Markov transition function Q, according to which transitions occur. For any state x ∈ X, any decision a ∈ A(x), and any Borel subset B ⊆ X, let

\[
\mathcal{U}(x, a, B) \;\equiv\; \left\{ u \in \mathbb{R}^{N}_{+} \,:\, \left( \max\{x_1 + d_1(a) - u_1, 0\}, \ldots, \max\{x_N + d_N(a) - u_N, 0\} \right) \in B \right\}.
\]

Then Q[B | x, a] ≡ F[U(x, a, B)]. In other words, for any state x ∈ X and any decision a ∈ A(x),

\[
P\left[ X_{t+1} \in B \mid X_t = x, A_t = a \right] \;=\; Q[B \mid x, a] \;\equiv\; F[\mathcal{U}(x, a, B)].
\]

4. Let g(x, a) denote the expected single stage net reward if the process is in state x at time t, and decision a ∈ A(x) is implemented. To give a specific example, for any decision a and arc (i, j), let k_ij(a) denote the number of times that arc (i, j) is traversed by a vehicle while executing decision a. Then,
\[
g(x, a) \;\equiv\; \sum_{i=1}^{N} r_i(d_i(a)) \;-\; \sum_{(i,j)} c_{ij}\, k_{ij}(a) \;-\; \sum_{i=1}^{N} h_i(x_i + d_i(a)) \;-\; \sum_{i=1}^{N} E_F\!\left[ p_i\!\left( \max\{U_i - (x_i + d_i(a)), 0\} \right) \right]
\]
where E_F denotes expected value with respect to the probability distribution F of U.

5. The objective is to maximize the expected total discounted value over an infinite horizon. Let α ∈ [0, 1) denote the discount factor. Let V*(x) denote the optimal expected value given that the initial state is x, i.e.,

\[
V^{*}(x) \;\equiv\; \sup_{\{A_t\}_{t=0}^{\infty}} E\!\left[ \sum_{t=0}^{\infty} \alpha^{t} g(X_t, A_t) \,\middle|\, X_0 = x \right] \tag{1}
\]
The decisions A_t are restricted such that A_t ∈ A(X_t) for each t, and A_t may depend only on the history (X_0, A_0, X_1, A_1, …, X_t) of the process up to time t, i.e., when the decision maker decides on a decision at time t, the decision maker does not know what is going to happen in the future. A stationary deterministic policy π prescribes a decision π(x) ∈ A(x) based on the information contained in the current state x of the process only. For any stationary deterministic policy π, and any state x ∈ X, the expected value V^π(x) is given by

\[
V^{\pi}(x) \;\equiv\; E^{\pi}\!\left[ \sum_{t=0}^{\infty} \alpha^{t} g(X_t, \pi(X_t)) \,\middle|\, X_0 = x \right] \;=\; g(x, \pi(x)) + \alpha \int_{\mathcal{X}} V^{\pi}(y)\, Q[dy \mid x, \pi(x)]
\]
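For a fixed stationary policy π, the expected discounted value V^π(x) can be estimated by simulating the process and averaging discounted stage rewards along truncated sample paths. The sketch below does this for a hypothetical one-customer instance; the order-up-to policy, the uniform demand distribution, and the cost rates are illustrative assumptions, not the paper's data, and vehicle constraints are ignored for brevity.

```python
import random

# Monte Carlo estimate of V^pi(x0) for a toy one-customer instance.
# Policy, demand distribution, and cost rates are illustrative only.

CAP = 20          # customer storage capacity C_i (hypothetical)
ALPHA = 0.9       # discount factor

def demand():
    return random.randint(0, 6)          # i.i.d. daily demand

def policy(x):
    return CAP - x if x <= 5 else 0      # hypothetical order-up-to-CAP rule

def stage_reward(x, d, u):
    shortage = max(u - (x + d), 0)
    return 5.0 * d - 0.1 * (x + d) - 8.0 * shortage

def estimate_value(x0, n_paths=2000, horizon=100):
    total = 0.0
    for _ in range(n_paths):
        x, disc, path_val = x0, 1.0, 0.0
        for _ in range(horizon):         # truncate the infinite horizon
            d = policy(x)
            u = demand()
            path_val += disc * stage_reward(x, d, u)
            x = max(x + d - u, 0)        # next inventory level X_{t+1}
            disc *= ALPHA
        total += path_val
    return total / n_paths

random.seed(0)
v_hat = estimate_value(x0=10)
```

Truncation at horizon 100 introduces an error of at most α^100/(1 − α) times the per-stage reward bound, which is negligible here.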
From the results in Bertsekas and Shreve (1978) it follows that under conditions that are not very restrictive (e.g., g bounded and α < 1), to determine the optimal expected value in (1), it is sufficient to restrict attention to the class Π of stationary deterministic policies. It follows that for any state x ∈ X,

\[
V^{*}(x) \;=\; \sup_{\pi \in \Pi} V^{\pi}(x) \;=\; \sup_{a \in \mathcal{A}(x)} \left\{ g(x, a) + \alpha \int_{\mathcal{X}} V^{*}(y)\, Q[dy \mid x, a] \right\} \tag{2}
\]

A policy π* is called optimal if V^{π*} = V*.
1.3 Solving the Markov Decision Process
Solving a Markov decision process usually involves computing the optimal value function V* and an optimal policy π* by solving the optimality equation (2). This requires the following major computational tasks to be performed.

1. Computation of the optimal value function V*. Because V* appears on both the left hand side and the right hand side of (2), most algorithms for computing V* involve the computation of successive approximations to V*(x) for every x ∈ X. Clearly, this is practical only if the state space X is small. For the IRP as formulated in Section 1.2, X may be uncountable. One may attempt to make the problem more tractable by discretizing X. Conditions under which the solutions obtained with the discretization of X converge to the solution of (2) have been studied by Bertsekas (1975), Chow and Tsitsiklis (1991), and Kushner and Dupuis (1992). Even if one discretizes X, the number of states grows exponentially in the number of customers. Thus, even for discretized X, the number of states is far too large to compute V*(x) for every x ∈ X if there are more than about four customers.

2. Estimation of the expected value (integral) in (2). For the IRP, this is a high dimensional integral, with the number of dimensions equal to the number of customers, which can be as many as several hundred. Conventional numerical integration methods are not practical for the computation of such high dimensional integrals.

3. The maximization problem on the right hand side of (2) has to be solved to determine the optimal action for each state. In the case of the IRP, the optimization problem on the right hand side of (2) is very hard. For example, the vehicle routing problem, which is NP-hard, is a special case.

In Kleywegt, Nori and Savelsbergh (1999) we developed approximation methods for the inventory routing problem with direct deliveries to perform the computational tasks mentioned above efficiently and to obtain good solutions.
To extend the approach to the IRP in which multiple customers can be visited on a route, we develop in this paper new methods for the first and third computational tasks, that is, to compute, at least approximately, V*, and to solve the maximization problem on the right hand side of (2).
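For intuition on the first computational task, the sketch below runs plain successive approximation (value iteration) on a deliberately tiny discretized instance with one customer and direct deliveries; all numbers are illustrative assumptions. Even here the state space has C + 1 points, and with N customers it would grow as (C + 1)^N, which is why this brute-force approach breaks down beyond a handful of customers.

```python
# Value iteration on a tiny one-customer discretized MDP, as a contrast
# to the intractable full IRP. Capacities, costs, and the demand pmf are
# illustrative assumptions.

CAP, CV, ALPHA = 10, 8, 0.9
DEMANDS = {0: 0.3, 1: 0.4, 2: 0.3}          # demand pmf f(u)

def stage_reward(x, d):
    # expected single stage net reward g(x, d): revenue - holding - penalty
    expected_pen = sum(p * 8.0 * max(u - (x + d), 0) for u, p in DEMANDS.items())
    return 5.0 * d - 0.1 * (x + d) - expected_pen

def value_iteration(tol=1e-6):
    V = [0.0] * (CAP + 1)
    while True:
        V_new = []
        for x in range(CAP + 1):
            best = float("-inf")
            for d in range(min(CV, CAP - x) + 1):    # feasible deliveries
                # expected value of the next state max{x + d - u, 0}
                ev = sum(p * V[max(x + d - u, 0)] for u, p in DEMANDS.items())
                best = max(best, stage_reward(x, d) + ALPHA * ev)
            V_new.append(best)
        if max(abs(a - b) for a, b in zip(V, V_new)) < tol:
            return V_new
        V = V_new

V_star = value_iteration()
```

The sup over deliveries here is a trivial scan over at most C_V + 1 quantities; in the full IRP that inner maximization contains a vehicle routing problem.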
1.4 IRP with at most Two Deliveries
The problem in which multiple customers are visited on a route is significantly harder than the problem (IRPDD) in which only one customer is visited on a route. This is largely because the decision maker has to decide which customers to combine in routes as well as how much to deliver to each of them, and not so much because of the need to determine the sequence (TSP tour) in which the customers on a route are to be visited: in most applications the maximum number of customers on a route is small, so the determination of optimal TSP sequences is easy. To simplify the notation, we present a special case of the IRP in which at most two customers are visited on each vehicle route. This special case of the IRP is called the inventory routing problem with at most two deliveries (IRPTD). Conceptually, the application of the approach to the IRP in which a larger number of customers can be visited on a route is clear, although the method becomes computationally more demanding as the number of customers that can be on a route increases. Also, the IRPTD is not a severe restriction for many IRPs encountered in practice. In the application that motivated this research, most vehicle routes visit at most two customers. The formulation of the IRPTD is the same as the formulation of the IRP in Section 1.2, except for the following. The decision space A(x) for each state x is the set of all decisions consisting of routes that visit only one or two customers, and that satisfy the work load, vehicle capacity, and maximum inventory constraints. Let N ≡ {1, …, N} be the set of customer indices, and let 0 be the index of the vendor's facility. The work load constraints may imply that some customers cannot be combined on a single vehicle route.
Let N² ⊆ N × N denote the set of customer pairs that can be combined on a single vehicle route, with (i, i) ∈ N² for each i ∈ N denoting that a single customer can be visited on a vehicle route. For each i ∈ N, let N_i ≡ {j ∈ N : (i, j) ∈ N²} denote the set of customers (including customer i) that can be combined with customer i on a single vehicle route. The approach presented in the remainder of the paper is for a discrete demand distribution F and a discrete state space X, which may come about naturally due to the nature of the product, or because of discretization of the demand distribution and the state space. Let f_ij denote the (marginal) probability mass function of the demand of customers i and j, that is, f_ij(u_i, u_j) denotes the probability that the demand at customer i is u_i and the demand at customer j is u_j. We also assume that each customer is visited at most once per day by a vehicle, since in most applications customers prefer receiving one larger delivery instead of several smaller deliveries during a day. Hence, a single stage decision a ∈ A(x) for the IRPTD consists of M or fewer routes, each visiting one or two customers, with each customer visited at most once, together with the delivery quantities.
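The pair set N² and the neighbor sets N_i can be represented directly. The sketch below builds them from a hypothetical compatibility test, here a travel-time limit on the route 0 → i → j → 0, which stands in for whatever work load constraints apply in a given application; the travel times and the limit are illustrative assumptions.

```python
# Building the set N2 of combinable customer pairs and the neighbor sets
# Ni. The travel times and the route-duration limit are illustrative
# assumptions standing in for the paper's work load constraints.

import itertools

customers = [1, 2, 3, 4]
# travel[i][j]: symmetric travel times; index 0 is the vendor's facility.
travel = [
    [0, 2, 3, 6, 7],
    [2, 0, 2, 5, 6],
    [3, 2, 0, 4, 5],
    [6, 5, 4, 0, 2],
    [7, 6, 5, 2, 0],
]
MAX_ROUTE = 14   # hypothetical limit on the duration of 0 -> i -> j -> 0

def combinable(i, j):
    if i == j:   # single-customer route 0 -> i -> 0
        return travel[0][i] + travel[i][0] <= MAX_ROUTE
    return travel[0][i] + travel[i][j] + travel[j][0] <= MAX_ROUTE

N2 = {(i, j) for i, j in itertools.product(customers, customers) if combinable(i, j)}
Ni = {i: {j for j in customers if (i, j) in N2} for i in customers}
```

With these numbers, customer 4 is too remote to share a route with anyone, so N_4 = {4} and customer 4 can only receive direct deliveries.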
2 Value Function Approximation
The first major step in solving the IRPTD is the construction of an approximation V̂ of the optimal value function V*. An approximating function V̂ then provides a stationary deterministic policy π̂, which is chosen to satisfy

\[
g(x, \hat{\pi}(x)) + \alpha \sum_{y \in \mathcal{X}} \hat{V}(y)\, Q[y \mid x, \hat{\pi}(x)] \;\ge\; \sup_{a \in \mathcal{A}(x)} \left\{ g(x, a) + \alpha \sum_{y \in \mathcal{X}} \hat{V}(y)\, Q[y \mid x, a] \right\} - \delta \tag{3}
\]

for all x ∈ X, that is, decision π̂(x) is within δ of the optimal decision using approximating function V̂ on the right hand side of the optimality equation (2). If ‖V* − V̂‖_∞ < ε, that is, V̂ is an ε-approximation of V*, then

\[
V^{\hat{\pi}}(x) \;\ge\; V^{*}(x) - \frac{2\alpha\varepsilon + \delta}{1 - \alpha}
\]

for all x ∈ X, that is, the value function V^π̂ of policy π̂ is within (2αε + δ)/(1 − α) of the optimal value function V*.
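To get a feel for the guarantee, a quick hypothetical calculation: with discount factor α = 0.9, value-function error ε = 1, and maximization tolerance δ = 0.1, the bound is (2 · 0.9 · 1 + 0.1)/(1 − 0.9) = 19. The numbers are illustrative only.

```python
# Suboptimality bound (2*alpha*eps + delta) / (1 - alpha) for the policy
# induced by an approximate value function; inputs are illustrative.

def suboptimality_bound(alpha, eps, delta):
    return (2 * alpha * eps + delta) / (1 - alpha)

bound = suboptimality_bound(alpha=0.9, eps=1.0, delta=0.1)   # -> 19.0
```

Note the 1/(1 − α) factor: the closer the discount factor is to 1, the more a fixed approximation error can cost.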
2.1 Subproblem Definition
To approximate the optimal value function V*, we decompose the IRP into subproblems, and then combine the subproblem results in another optimization problem to produce the approximating function V̂. The subproblems are Markov decision processes that attempt to capture the behavior of the overall process from the point of view of the subset of customers involved in the subproblem. For the IRPTD there is a one-customer subproblem for each customer, and a two-customer subproblem for each pair of customers that can be combined on a single vehicle route, that is, there is a subproblem MDP_ij for each (i, j) ∈ N². We describe the MDP formulations for the two-customer subproblems; the MDPs for the single-customer subproblems are similar. The state of a subproblem includes the inventory level at each of the customers in the subproblem. Vehicles are a limited resource, and the customers in a subproblem are sometimes visited by a vehicle and sometimes not. To capture information about the availability of vehicles for delivering to the customers in a subproblem, the state of a subproblem includes a component for vehicle availability. To determine the possible values of the vehicle availability component v_ij, let us consider the ways in which two different customers i and j are affected by vehicle availability.

1. No vehicles are available for delivering to customers i and j (v_ij = 0).

2. A vehicle is available for delivering a fraction of its capacity to customer i, and no vehicle is available for delivering to customer j (v_ij = (i, 0)). (The vehicle that is available to visit customer i also has to visit another customer k ∉ {i, j}.)

3. No vehicle is available for delivering to customer i, and a vehicle is available for delivering a fraction of its capacity to customer j (v_ij = (0, j)). (The vehicle that is available to visit customer j also has to visit another customer k ∉ {i, j}.)

4. One vehicle is available for the exclusive use of customers i and j (v_ij = 1).
5. One vehicle is available for delivering exclusively to customer i, and a vehicle is available for delivering a fraction of its capacity to customer j (v_ij = (1, j)). (The vehicle that is available to visit customer j also has to visit another customer k ∉ {i, j}.)

6. A vehicle is available for delivering a fraction of its capacity to customer i, and one vehicle is available for delivering exclusively to customer j (v_ij = (i, 1)). (The vehicle that is available to visit customer i also has to visit another customer k ∉ {i, j}.)

7. One vehicle is available for delivering exclusively to customer i, and one vehicle is available for delivering exclusively to customer j (v_ij = (1, 1)).

Each two-customer subproblem MDP_ij is defined as follows.

1. The state space is X_ij = {0, 1, …, C_i} × {0, 1, …, C_j} × {0, (i, 0), (0, j), 1, (1, j), (i, 1), (1, 1)}. State (x_i, x_j, v_ij) denotes that the inventory levels at customers i and j are x_i and x_j, and the vehicle availability is v_ij. (For a single-customer subproblem MDP_ii, the state space is X_i = {0, 1, …, C_i} × {0, i, 1}.)

2. The set A_ij(x_i, x_j, v_ij) of feasible decisions a_ij when the subproblem state is (x_i, x_j, v_ij) is determined as follows. Recall that d_i(a) denotes the quantity of product that is delivered to customer i while executing decision a. When the vehicle availability is v_ij = 0, then no vehicles can be sent to customers i and j, and d_i(a) = d_j(a) = 0. When v_ij = 1, then one vehicle can be sent to customers i and j, and d_i(a) + d_j(a) ≤ C_V, x_i + d_i(a) ≤ C_i, and x_j + d_j(a) ≤ C_j. When v_ij = (1, 1), then one vehicle can be sent to each of customers i and j, and d_i(a) ≤ min{C_V, C_i − x_i} and d_j(a) ≤ min{C_V, C_j − x_j}. Whenever a vehicle is available for delivering a fraction of its capacity to a customer i, one should also specify what part λ of the vehicle's capacity can be delivered to customer i. Thus, in subproblem MDP_ij, numbers λ^i_ij and λ^j_ij are specified. How these numbers λ^i_ij and λ^j_ij are determined is discussed in Section 2.1.2. When v_ij = (i, 0), then one vehicle can be sent to customer i, no vehicle can be sent to customer j, and d_i(a) ≤ min{λ^i_ij, C_i − x_i} and d_j(a) = 0. When v_ij = (i, 1), then one vehicle can be sent to each of customers i and j, and d_i(a) ≤ min{λ^i_ij, C_i − x_i} and d_j(a) ≤ min{C_V, C_j − x_j}. Feasible decisions are determined similarly if v_ij ∈ {(0, j), (1, j)}.

3. The transition probabilities of the subproblems have to incorporate the probability distribution of customer demands, as well as the likelihoods of vehicle availabilities. Because we assume that the probability distribution of customer demands is known, this aspect is not a problem. However, the likelihoods of vehicle availabilities are not directly obtainable from the input data of the overall IRP. The basic idea is described next, and more details are provided in Section 2.1.1. Consider any policy π ∈ Π for the IRP with unique stationary probability ν^π(x) for each x ∈ X. How such a policy π is chosen is discussed later. Similar to the seven types of vehicle availability v_ij ∈ {0, (i, 0), (0, j), 1, (1, j), (i, 1), (1, 1)} for customers i and j identified above, the delivery actions for customers i and j of each decision a can be classified as belonging to one of these seven types. Let v_ij(a) ∈ {0, (i, 0), (0, j), 1, (1, j), (i, 1), (1, 1)} denote the type of delivery action for customers i and j of decision a. Then, given the current inventory levels x_i, x_j and delivery quantities d_i, d_j at customers i and j, the probability q_ij(y_i, y_j, v_ij | x_i, x_j, d_i, d_j)
that under policy π, at the beginning of the next day the inventory levels at customers i and j are y_i and y_j, and the vehicle allocation is v_ij, is given by

\[
q_{ij}(y_i, y_j, v_{ij} \mid x_i, x_j, d_i, d_j) \;\equiv\;
\frac{\displaystyle \sum_{\substack{s \in \mathcal{X}:\; s_i = x_i,\; s_j = x_j, \\ d_i(\pi(s)) = d_i,\; d_j(\pi(s)) = d_j}} \nu^{\pi}(s) \sum_{\substack{z \in \mathcal{X}:\; z_i = y_i,\; z_j = y_j, \\ v_{ij}(\pi(z)) = v_{ij}}} Q[z \mid s, \pi(s)]}
{\displaystyle \sum_{\substack{s \in \mathcal{X}:\; s_i = x_i,\; s_j = x_j, \\ d_i(\pi(s)) = d_i,\; d_j(\pi(s)) = d_j}} \nu^{\pi}(s)} \tag{4}
\]

if the denominator is positive; and q_ij(y_i, y_j, v_ij | x_i, x_j, d_i, d_j) = 0 if the denominator is 0. Then the transition probabilities are given by

\[
P_{ij}\!\left[ (X_{i,t+1}, X_{j,t+1}, V_{ij,t+1}) = (y_i, y_j, w_{ij}) \,\middle|\, (X_{it}, X_{jt}, V_{ijt}) = (x_i, x_j, v_{ij}),\; A_{ijt} = a_{ij} \right] \;=\; q_{ij}(y_i, y_j, w_{ij} \mid x_i, x_j, d_i(a_{ij}), d_j(a_{ij}))
\]

4. The expected net reward per stage, given state (x_i, x_j, v_ij) and action a_ij, is given by

\[
g_{ij}(x_i, x_j, a_{ij}) \;\equiv\; r_i(d_i(a_{ij})) + r_j(d_j(a_{ij})) - \left( c_{0i} + c_{ij} + c_{j0} \right) - \left( h_i(x_i + d_i(a_{ij})) + h_j(x_j + d_j(a_{ij})) \right) - E_F\!\left[ p_i\!\left( \max\{U_i - (x_i + d_i(a_{ij})), 0\} \right) + p_j\!\left( \max\{U_j - (x_j + d_j(a_{ij})), 0\} \right) \right] \tag{5}
\]
5. The objective is to maximize the expected total discounted value over an infinite horizon. Let V*_ij(x_i, x_j, v_ij) denote the optimal expected value of subproblem MDP_ij, given that the initial state is (x_i, x_j, v_ij), i.e.,

\[
V_{ij}^{*}(x_i, x_j, v_{ij}) \;\equiv\; \sup_{\{A_{ijt}\}_{t=0}^{\infty}} E\!\left[ \sum_{t=0}^{\infty} \alpha^{t} g_{ij}(X_{it}, X_{jt}, A_{ijt}) \,\middle|\, (X_{i0}, X_{j0}, V_{ij0}) = (x_i, x_j, v_{ij}) \right]
\]
The actions A_ijt are constrained to be feasible and nonanticipatory. The subproblem MDP_ij for each (i, j) ∈ N² is relatively easy to solve using a dynamic programming algorithm such as modified policy iteration. Two issues related to the definition of the subproblems remain to be addressed. The first issue concerns the calculation of the transition probabilities q_ij(y_i, y_j, v_ij | x_i, x_j, d_i, d_j), and the second issue involves the calculation of the parts λ^i_ij of the vehicle capacity that are available for delivery to customer i when the vehicle also visits another customer k ∉ {i, j}. These two issues are addressed in the next two sections.

2.1.1 Computing Transition Probabilities
Computing q_ij(y_i, y_j, v_ij | x_i, x_j, d_i, d_j) using (4) is hard, because the stationary probabilities ν^π(x) have to be computed for all x ∈ X. One may attempt to estimate these probabilities by simulating the process under policy π. However, for most applications, the number of probabilities q_ij(y_i, y_j, v_ij | x_i, x_j, d_i, d_j) is far too large to estimate accurately in reasonable time using simulation. Hence, we use an alternative approach to estimate the transition probabilities. The conditional probability p_ij(v_ij | y_i, y_j) that the delivery action for customers i and j is of type v_ij under policy π, given that the inventory levels at customers i and j are y_i and y_j, is given by

\[
p_{ij}(v_{ij} \mid y_i, y_j) \;=\;
\frac{\displaystyle \sum_{\{x \in \mathcal{X} \,:\, x_i = y_i,\; x_j = y_j,\; v_{ij}(\pi(x)) = v_{ij}\}} \nu^{\pi}(x)}
{\displaystyle \sum_{\{x \in \mathcal{X} \,:\, x_i = y_i,\; x_j = y_j\}} \nu^{\pi}(x)}
\]
if the denominator is positive, and p_ij(v_ij | y_i, y_j) = 0 if the denominator is 0. The number of probabilities p_ij(v_ij | y_i, y_j) can still be quite large, but is much less than the number of probabilities q_ij(y_i, y_j, v_ij | x_i, x_j, d_i, d_j). Also, it is often easy to obtain good prior estimates of the probabilities p_ij(v_ij | y_i, y_j). Let p̂_ijt(v_ij | y_i, y_j) denote an estimate of p_ij(v_ij | y_i, y_j) after t transitions of the simulation, where p̂_ij0(v_ij | y_i, y_j) denotes an initial estimate such that Σ_{v_ij} p̂_ij0(v_ij | y_i, y_j) = 1. Let N_ijt(y_i, y_j) denote the number of times that the inventory levels at customers i and j have been y_i and y_j by transition t of the simulation. Let N_ij0(v_ij | y_i, y_j) denote the equivalent number of transitions associated with the initial estimate p̂_ij0(v_ij | y_i, y_j). Then

\[
\hat{p}_{i,j,t+1}(v_{ij} \mid y_i, y_j) =
\begin{cases}
\dfrac{N_{ij0}(v_{ij} \mid y_i, y_j)\,\hat{p}_{ij0}(v_{ij} \mid y_i, y_j) + N_{ijt}(y_i, y_j)\,\hat{p}_{ijt}(v_{ij} \mid y_i, y_j) + 1}{N_{ij0}(v_{ij} \mid y_i, y_j) + N_{ijt}(y_i, y_j) + 1} & \text{if } X_{it} = y_i,\ X_{jt} = y_j,\ v_{ij}(\pi(X_t)) = v_{ij} \\[2ex]
\dfrac{N_{ij0}(v_{ij} \mid y_i, y_j)\,\hat{p}_{ij0}(v_{ij} \mid y_i, y_j) + N_{ijt}(y_i, y_j)\,\hat{p}_{ijt}(v_{ij} \mid y_i, y_j)}{N_{ij0}(v_{ij} \mid y_i, y_j) + N_{ijt}(y_i, y_j) + 1} & \text{if } X_{it} = y_i,\ X_{jt} = y_j,\ v_{ij}(\pi(X_t)) \ne v_{ij} \\[2ex]
\hat{p}_{ijt}(v_{ij} \mid y_i, y_j) & \text{if } X_{it} \ne y_i \text{ or } X_{jt} \ne y_j
\end{cases}
\]

It can be shown that if the Markov chain under policy π is positive recurrent, so that the stationary probabilities exist and are unique, then, with probability 1, the estimates p̂_ijt(v_ij | y_i, y_j) converge to p_ij(v_ij | y_i, y_j) as t → ∞. The estimates q̂_ijt(y_i, y_j, v_ij | x_i, x_j, d_i, d_j) are obtained as follows:

\[
\hat{q}_{ijt}(y_i, y_j, v_{ij} \mid x_i, x_j, d_i, d_j) =
\begin{cases}
f_{ij}(x_i + d_i - y_i,\; x_j + d_j - y_j)\,\hat{p}_{ijt}(v_{ij} \mid y_i, y_j) & \text{if } y_i > 0,\ y_j > 0 \\[1ex]
\sum_{u_i = x_i + d_i}^{\infty} f_{ij}(u_i,\; x_j + d_j - y_j)\,\hat{p}_{ijt}(v_{ij} \mid y_i, y_j) & \text{if } y_i = 0,\ y_j > 0 \\[1ex]
\sum_{u_j = x_j + d_j}^{\infty} f_{ij}(x_i + d_i - y_i,\; u_j)\,\hat{p}_{ijt}(v_{ij} \mid y_i, y_j) & \text{if } y_i > 0,\ y_j = 0 \\[1ex]
\sum_{u_i = x_i + d_i}^{\infty} \sum_{u_j = x_j + d_j}^{\infty} f_{ij}(u_i, u_j)\,\hat{p}_{ijt}(v_{ij} \mid y_i, y_j) & \text{if } y_i = 0,\ y_j = 0
\end{cases} \tag{6}
\]
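The running update of the availability probabilities can be sketched in code. The class below is an illustrative simplification of the recursion above: it keeps one prior weight per state rather than a per-type count N_ij0(v_ij | y_i, y_j), and the availability types and observation stream are hypothetical.

```python
from collections import defaultdict

# Simulation-based running estimate of p_ij(v | y_i, y_j), updated after
# each simulated transition. Simplified variant of the recursion above:
# a single prior weight is shared by all availability types.

class AvailabilityEstimator:
    """Estimates P(vehicle-availability type v | inventory levels (yi, yj))."""

    def __init__(self, types, prior, prior_weight=1.0):
        self.types = types
        self.prior = prior                 # initial estimates p_hat_0(v | yi, yj)
        self.n0 = prior_weight             # equivalent transitions for the prior
        self.counts = defaultdict(float)   # N_t(yi, yj): visits to (yi, yj)
        self.est = {}                      # current estimates per (yi, yj)

    def update(self, yi, yj, observed_v):
        key = (yi, yj)
        p = self.est.setdefault(key, dict(self.prior))
        n = self.counts[key]
        for v in self.types:
            hit = 1.0 if v == observed_v else 0.0
            # blend prior, current estimate, and the new observation
            p[v] = (self.n0 * self.prior[v] + n * p[v] + hit) / (self.n0 + n + 1.0)
        self.counts[key] = n + 1.0

# Example: uniform prior over two hypothetical availability types.
est = AvailabilityEstimator(types=["none", "full"],
                            prior={"none": 0.5, "full": 0.5})
for v in ["full", "full", "none", "full"]:
    est.update(3, 4, v)
```

After each update the estimates for a given state still sum to 1, and with a light prior weight they track the observed frequencies closely.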
In general, the estimates q̂_ijt(y_i, y_j, v_ij | x_i, x_j, d_i, d_j) obtained from (6) do not converge to the q_ij(y_i, y_j, v_ij | x_i, x_j, d_i, d_j) defined in (4). However, as mentioned before, the estimates q̂_ijt obtained from (6) using p̂_ijt(v_ij | y_i, y_j) can be computed much faster than direct simulation estimates of q_ij(y_i, y_j, v_ij | x_i, x_j, d_i, d_j), and were found to provide good results.

2.1.2 Computing Available Vehicle Capacities
As mentioned in Section 2.1, for a subproblem MDP_ij, we have to specify the part λ^i_ij of the vehicle's capacity that is available for delivery at customer i whenever a vehicle visits both customer i and another customer k ∉ {i, j}, that is, whenever the vehicle availability variable v_ij ∈ {(i, 0), (i, 1)}. Given state (x_i, x_j, v_ij) in subproblem MDP_ij, these parts λ^i_ij and λ^j_ij are random. Several simplified models for the λs in the subproblems were investigated. As demonstrated in Section 4, good results were obtained by modeling the λs in the subproblems as deterministic, as follows. Again, we consider a policy π ∈ Π for the IRP with unique stationary probability ν^π(x) for each x ∈ X. Let

\[
\lambda^i_{ij} \;\equiv\; \frac{\sum_{\{x \in X \,:\, v_{ij}(\pi(x)) \in \{(i,0),(i,1)\}\}} \nu^\pi(x)\, d_i(\pi(x))}{\sum_{\{x \in X \,:\, v_{ij}(\pi(x)) \in \{(i,0),(i,1)\}\}} \nu^\pi(x)}
\tag{7}
\]
if the denominator is positive, and λ^i_ij ≡ 0 if the denominator is 0. The λs defined above can be estimated by simulation. Let λ̂^i_ijt denote the estimate of λ^i_ij after t transitions of the simulation, where λ̂^i_ij0 denotes an initial estimate, such as C_V/2. Let N^i_ijt denote the number of times that the delivery action v_ij for customers i and j has been in {(i, 0), (i, 1)} by transition t of the simulation. Let N^i_ij0 denote the equivalent number of transitions associated with the initial estimate λ̂^i_ij0. Then

\[
\hat{\lambda}^i_{i,j,t+1} =
\begin{cases}
\dfrac{N^i_{ij0}\,\hat{\lambda}^i_{ij0} + N^i_{ijt}\,\hat{\lambda}^i_{ijt} + d_i(\pi(X_t))}{N^i_{ij0} + N^i_{ijt} + 1} & \text{if } v_{ij}(\pi(X_t)) \in \{(i,0),(i,1)\} \\[2ex]
\hat{\lambda}^i_{ijt} & \text{if } v_{ij}(\pi(X_t)) \notin \{(i,0),(i,1)\}
\end{cases}
\]
As before, it can be shown that if the Markov chain under policy π is positive recurrent, then, with probability 1, the estimates λ̂^i_ijt converge to λ^i_ij as t → ∞.
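The recursive update above amounts to a running average of the observed delivery quantities d_i(π(X_t)) on qualifying transitions, seeded with an initial guess such as C_V/2. A minimal sketch under these assumptions (the class name and the example numbers are hypothetical):

```python
class LambdaEstimator:
    """Running estimate of the partial vehicle capacity lambda^i_ij:
    the average delivery quantity at customer i over simulated
    transitions on which the vehicle also visits another customer."""

    def __init__(self, initial, prior_weight=1.0):
        # The initial estimate (e.g. half the vehicle capacity, C_V / 2)
        # is treated as `prior_weight` equivalent observations.
        self.total = prior_weight * initial
        self.count = prior_weight

    def observe(self, d):
        # d = d_i(pi(X_t)) on a transition with v_ij in {(i,0), (i,1)}
        self.total += d
        self.count += 1

    def value(self):
        return self.total / self.count

# Hypothetical example: capacity 100 gives an initial guess of 50, but
# the policy actually delivers about 30 units on qualifying transitions.
est = LambdaEstimator(initial=50.0, prior_weight=1.0)
for d in [30.0] * 999:
    est.observe(d)
```

The influence of the initial guess decays as 1/t, so the estimate converges to the long-run average delivery quantity; as in the text, one would round it to the nearest integer before solving the subproblems.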
An even simpler approach, using

\[
\lambda^i \;\equiv\; \frac{\sum_{\{x \in X \,:\, d_i(\pi(x)) > 0\}} \nu^\pi(x)\, d_i(\pi(x))}{\sum_{\{x \in X \,:\, d_i(\pi(x)) > 0\}} \nu^\pi(x)}
\tag{8}
\]

also led to good results. These λs can also be estimated by simulation estimates λ̂^i_t. Since inventory levels and delivery quantities in the subproblems are integral, we rounded the estimates λ̂^i_ijt or λ̂^i_t to the nearest integers before solving the subproblems.

2.2 Combining Subproblems
The next topic to be addressed is the calculation of the approximate value function V̂(x) at a given state x, using the results from the subproblems. Recall that solving the subproblems produces optimal value functions for the subproblems. In particular, solving two-customer subproblem MDP_ij for (i, j) ∈ N² provides optimal value function V*_ij(x_i, x_j, v_ij). (For subproblem MDP_ii for customer i by itself, V*_ii(x_i, x_i, v_ii) denotes the optimal value function.) Given a state x, the approximate value V̂(x) is given by the optimal objective value of the following cardinality constrained partitioning problem:

\[
\begin{aligned}
\hat{V}(x) \;=\; \max_{y} \quad & \sum_{i \in N} V^*_{ii}(x_i, x_i, 0)\, y_{ii0} \;+\; \sum_{(i,j) \in N^2} V^*_{ij}(x_i, x_j, 1)\, y_{ij1} \\
\text{subject to} \quad & y_{ii0} + \sum_{j \in N_i} y_{ij1} = 1 \qquad \forall\, i \in N \\
& \sum_{(i,j) \in N^2} y_{ij1} \leq M \\
& y_{ii0} \in \{0, 1\} \qquad \forall\, i \in N \\
& y_{ij1} \in \{0, 1\} \qquad \forall\, (i, j) \in N^2
\end{aligned}
\]
The cardinality constrained partitioning problem partitions the set of customers into subsets, each selected subset corresponding to a subproblem MDP_ij. Each selected subset for which y_ij1 = 1 is allocated a vehicle, and contributes value V*_ij(x_i, x_j, 1) to the objective. Each customer i that is not in any subset that is allocated a vehicle (y_ii0 = 1) contributes value V*_ii(x_i, x_i, 0) to the objective. The first constraint requires that each customer either is in a selected subset that is allocated a vehicle, or is not visited by a vehicle. The second constraint requires that at most M vehicles be allocated to subsets. Although the notation in the formulation above of the cardinality constrained partitioning problem is for the case in which at most two customers can be visited on a route, it is clear how the partitioning problem can be applied in more general cases. However, for the special case of the IRPTD, the cardinality constrained partitioning problem can be solved by solving a maximum weight perfect matching problem, as described below. That is convenient, since the cardinality constrained partitioning problem in general is NP-hard, whereas the maximum weight perfect matching problem can be solved in O(n²m) time with Edmonds' (1965a, 1965b) algorithm, where n is the number of nodes and m is the number of arcs in the graph, or in O(n(m + n log n)) time with Gabow's (1990) algorithm. In our computational work, we used the Blossom IV implementation described in Cook and Rohe (1998). In the construction explained next, n = 4N + 2M and m = |N²| + N + M + 2N(2N + 2M). The matching problem is specified by its graph G = (V, E). There are four subsets of nodes, V ≡ V_1 ∪ V_2 ∪ V_3 ∪ V_4, and four subsets of edges, E ≡ E_1 ∪ E_2 ∪ E_3 ∪ E_4. Nodes in V_1 represent customers, V_1 ≡ {1_1, ..., i_1, ..., N_1}, and for each pair of customers (i, j) ∈ N², i ≠ j, there is an edge (i_1, j_1) ∈ E_1 with value V*_ij(x_i, x_j, 1).
For each customer i ∈ N, there is also a node i_2 ∈ V_2, and an edge (i_1, i_2) ∈ E_2 with value V*_ii(x_i, x_i, 1). Choosing an edge (i_1, j_1) ∈ E_1 represents assigning a vehicle to subset (i, j) ∈ N², i ≠ j (for the purpose of computing V̂(x)), and choosing an edge (i_1, i_2) ∈ E_2 represents assigning a vehicle to customer i by itself. Vehicles can also be left idle. To capture that, there are 2M nodes, V_3 ≡ {1_3, ..., (2M)_3}, and M edges, E_3 ≡ {(1_3, 2_3), (3_3, 4_3), ..., ((2M−1)_3, (2M)_3)}, each with value 0. (It follows from the definitions of the subproblems that V*_ii(x_i, x_i, 1) ≥ V*_ii(x_i, x_i, 0) for all i and x_i, and thus if N ≥ 2M, there is always an optimal solution to the partitioning problem such that all the vehicles are assigned, that is, Σ_{(i,j)∈N²} y*_ij1 = M. In such a case, there is no need for any nodes in V_3 or any
edges in E_3.) Thus so far there are |V_1| + |V_2| + |V_3| = 2N + 2M nodes. The assignment of M vehicles is to be represented by the matching of 2M nodes. To match the remaining 2N nodes, there are 2N additional nodes, V_4 ≡ {1_4, ..., (2N)_4}, and (2N)(2N + 2M) edges, E_4 ≡ E_41 ∪ E_42 ∪ E_43, where E_4k ≡ V_k × V_4. Each edge (i_1, j_4) ∈ E_41 has value V*_ii(x_i, x_i, 0), and each edge in E_42 and E_43 has value 0. (The number of edges can be reduced, for example by having only edges between odd numbered nodes in V_3 and odd numbered nodes in V_4, and between even numbered nodes in V_3 and even numbered nodes in V_4.)

The partitioning and matching problems described above are equivalent. For any feasible solution to the partitioning problem, there is a feasible solution to the matching problem with the same objective value, as follows. For each (i, j) ∈ N², consider the following cases. Case 1: If y_ij1 = 1 for i ≠ j, then any two unmatched nodes k_4 and l_4 in V_4 are picked, and edges (i_1, j_1), (i_2, k_4) and (j_2, l_4) are selected. Case 2: If y_ij1 = 1 for i = j, then edge (i_1, i_2) is selected. Case 3: If y_ii0 = 1, then any two unmatched nodes k_4 and l_4 in V_4 are picked, and the corresponding edges (i_1, k_4) and (i_2, l_4) are selected. Such edges can always be chosen, because the first constraint ensures that exactly one of the three cases above holds for each customer i, so that each node in V_1 is matched with exactly one other node; it follows by construction that each node in V_2 is matched with exactly one other node, because there are 2N nodes in V_4, E_41 ≡ V_1 × V_4, and E_42 ≡ V_2 × V_4. The number of unassigned vehicles is M − Σ_{(i,j)∈N²} y_ij1. Thus M − Σ_{(i,j)∈N²} y_ij1 edges in E_3 are selected. So far each node in V_1 and V_2 has been matched with exactly one other node. The number of unmatched nodes in V_3 is 2 Σ_{(i,j)∈N²} y_ij1. The number of unmatched nodes in V_4 is 2N − 2 Σ_{{(i,j)∈N² : i ≠ j}} y_ij1 − 2 Σ_{i∈N} y_ii0; that is, there is one unmatched node in V_4 for each customer that is assigned a vehicle with another customer, and there are two unmatched nodes in V_4 for each customer that is assigned a vehicle by itself. As a result, the number of unmatched nodes in V_4 is 2 Σ_{(i,j)∈N²} y_ij1, which is the same as the number of unmatched nodes in V_3. Now the unmatched nodes in V_3 are matched with the unmatched nodes in V_4 by selecting edges in E_43 ≡ V_3 × V_4. It is easily checked that the objective value of the resulting matching is the same as that of the partitioning solution.

Conversely, for any feasible solution to the matching problem, there is a feasible solution to the partitioning problem with the same objective value, as follows. For each node i_1 ∈ V_1, one of the following three cases holds. Case 1: If an edge (i_1, j_1) ∈ E_1 (i ≠ j) is selected, then set y_ij1 = 1. Case 2: If an edge (i_1, i_2) ∈ E_2 is selected, then set y_ii1 = 1. Case 3: If an edge (i_1, k_4) ∈ E_41 is selected, then set y_ii0 = 1. All other decision variables of the partitioning problem are set to 0. It follows that the first constraint is satisfied. Let M′ denote the number of edges in E_3 that are selected, matching 2M′ nodes in V_3. The remaining 2M − 2M′ nodes in V_3 have to be matched with nodes in V_4. Thus the remaining 2N − 2M + 2M′ nodes in V_4 have to be matched with nodes in V_1 and V_2. Hence 2N − (2N − 2M + 2M′) = 2M − 2M′ nodes in V_1 and V_2 are matched with each other, setting M − M′ variables y_ij1 equal to 1. Thus Σ_{(i,j)∈N²} y_ij1 = M − M′ ≤ M, and the second constraint is satisfied. It is again easily checked that the objective value of the resulting partitioning solution is the same as that of the matching. Figure 1 shows the matching graph G = (V, E) for an example with N = 3 customers and M = 2 vehicles. In the example, V_1 = {1_1, 2_1, 3_1}, V_2 = {1_2, 2_2, 3_2}, V_3 = {1_3, 2_3, 3_3, 4_3}, and V_4 = {1_4, ..., 6_4}. The nonzero edge values are shown in the figure.
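The construction can be checked on a tiny instance. The sketch below builds a graph along the lines described (customer nodes i_1 and i_2, the 2M idle-vehicle nodes of V_3, and the 2N filler nodes of V_4) and finds a maximum weight perfect matching by brute force, which is adequate at this size; all instance data and names are hypothetical:

```python
def max_weight_perfect_matching(nodes, w):
    """Brute-force maximum weight perfect matching; w(u, v) returns the
    edge weight, or None if u and v are not adjacent.  Exponential, but
    fine for the 10-node graph built below."""
    if not nodes:
        return 0.0
    best = float("-inf")
    u = nodes[0]
    for k in range(1, len(nodes)):
        if w(u, nodes[k]) is None:
            continue
        rest = nodes[1:k] + nodes[k + 1:]
        sub = max_weight_perfect_matching(rest, w)
        if sub > float("-inf"):
            best = max(best, w(u, nodes[k]) + sub)
    return best

def build_matching_value(N, M, v_pair, v_single, v_none):
    """Graph in the spirit of the text's construction: V1 = ('c', i),
    V2 = ('d', i), V3 = ('v3', m), V4 = ('v4', k)."""
    nodes = ([("c", i) for i in range(N)] + [("d", i) for i in range(N)]
             + [("v3", m) for m in range(2 * M)]
             + [("v4", k) for k in range(2 * N)])

    def w(u, v):
        (a, i), (b, j) = sorted([u, v])
        if a == "c" and b == "c":                # E1: pair route
            return v_pair.get((min(i, j), max(i, j)))
        if a == "c" and b == "d":                # E2: single route
            return v_single[i] if i == j else None
        if a == "c" and b == "v4":               # E41: customer unvisited
            return v_none[i]
        if b == "v4" and a in ("d", "v3"):       # E42, E43: fillers
            return 0.0
        if a == "v3" and b == "v3":              # E3: idle vehicle
            return 0.0 if i // 2 == j // 2 else None
        return None

    return max_weight_perfect_matching(nodes, w)

# Hypothetical two-customer, one-vehicle instance: pairing both customers
# on one route (value 8) beats serving either one alone.
value = build_matching_value(
    N=2, M=1,
    v_pair={(0, 1): 8.0},
    v_single={0: 5.0, 1: 4.0},
    v_none={0: 1.0, 1: 0.0},
)
```

Note that a matching that serves both customers alone is not perfect here, because it would leave more V_4 nodes than V_3 partners; this is exactly how the construction enforces the vehicle limit M.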
3 Choosing a Decision in a State
In the list of major computational tasks for solving the IRP given in Section 1.3, the third major task was the solution of the maximization problem on the right hand side of the optimality equation (2) for any given state. In this section we address this step in the development of a solution method for the Markov decision process model of the IRP. Given the current state, two types of decisions have to be made, namely which customers to combine in each vehicle route, and how much to deliver at each customer. These decisions are related, because the value of combining a set of customers in a vehicle route depends on the delivery quantities for the customers. For instances with more than approximately four customers and two vehicles, solving the maximization problem to optimality would require an unacceptable computational effort, and therefore a three-step heuristic procedure was developed.
3.1 Step 1: Choosing Direct-delivery Routes
It is easy to see that a greedy procedure that chooses vehicle routes one at a time could lead to bad decisions. For example, suppose two vehicles are available and there are two customers that urgently need deliveries, but the transportation cost between these two customers is quite large. A greedy procedure may combine both these customers in the vehicle route that is chosen first, due to their urgency, and then combine other customers in the second vehicle route. A better decision may be to combine one urgent customer with some nearby customers in one vehicle route, and the other urgent customer with some other nearby customers in the other vehicle route. The proposed heuristic avoids the pitfall described above by assigning at most one customer to each vehicle in step 1 of the algorithm. Specifically, in step 1 customers are assigned to vehicles using the algorithm proposed in Kleywegt, Nori and Savelsbergh (1999) for the inventory routing problem with direct deliveries. Since a route can visit more than one customer, better decisions may be obtained by modifying the direct delivery routes obtained from step 1. Therefore, in steps 2 and 3, the vehicle routes are grown by assigning more customers to vehicle routes, as described next.
Figure 1: Matching graph for an example with N = 3 customers and M = 2 vehicles.
3.2 Step 2: Ranking Customers to be Added to Routes
The current decision is modified by moving to a neighboring decision. To obtain a neighboring decision, one customer is added to one of the current vehicle routes, and the delivery quantities are modified. Customers already in a route (say route m) may be removed from that route and added to another route, in which case another unassigned customer can be added to route m. For each of the M vehicle routes, Θ(N) customers can be added to the vehicle route, and for each of these a large number of delivery quantity combinations are possible, and thus the number of neighboring decisions can be large. Also, computing the value of a decision a using

\[
V(x, a) \;\equiv\; g(x, a) + \alpha \sum_{y \in X} Q[y \mid x, a]\, \hat{V}(y)
\tag{9}
\]
can be very time-consuming for instances with several customers, due to the number of terms in the sum, and the effort required to compute V̂(y) for each state y that can be reached from the current state x with decision a. The number of moves to neighboring decisions is restricted, because with each move the number of customers visited by a vehicle increases by one. In spite of that, the large number of neighboring decisions and the large computational effort required to evaluate V(x, a) for each decision a motivate one to find a method that identifies promising decisions with little computational effort. For each of the vehicle routes, we consider each customer that can be added to the vehicle route. The new set of customers in the modified vehicle route should be a set of customers that can be visited by a single vehicle, and thus should correspond to a subproblem, such as those defined in Section 2.1. To obtain an initial indication of the value of adding a customer to a current vehicle route, we use the optimal delivery quantities from the subproblem for the resulting set of customers, with the state given by the inventory levels and the availability of one vehicle to the set of customers. For each of the M vehicle routes, Θ(N) customers can be added to the vehicle route, and thus the number of neighboring decisions has been reduced to Θ(MN). In the expression (9) for V(x, a), the single-stage value g(x, a) can be computed quickly, whereas the expected future value Σ_{y∈X} Q[y | x, a] V̂(y) is much harder to compute. Also, we observed in empirical studies with the IRP that the decision with the highest single-stage value g(x, a) often also has the highest value of V(x, a) among all the feasible decisions. Hence, g(x, a) seems to give a good indication of whether it is worth exploring a decision a in more detail.
Thus, for each of the vehicle routes, and each customer that can be added to the vehicle route, a corresponding decision a and value g(x, a) have been identified. Next, for each of the vehicle routes, the customers that can be added to the vehicle route are ranked according to the corresponding values g(x, a). Let j(m, i) denote the customer with the ith largest value of g(x, a) that can be added to vehicle route m.
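The ranking step is a plain sort of the candidate customers for each route by the single-stage value of the corresponding decision. A sketch (the route and customer labels and the g values are hypothetical):

```python
def rank_candidates(candidates, g):
    """For each route m, order the customers that can be added to it in
    decreasing order of the single-stage value g(x, a) of the resulting
    decision, so that ranking[m][i - 1] is the customer j(m, i)."""
    return {m: sorted(js, key=lambda j: g[(m, j)], reverse=True)
            for m, js in candidates.items()}

# Hypothetical: route 1 can absorb customer 7, 8 or 9.
ranking = rank_candidates(
    candidates={1: [7, 8, 9]},
    g={(1, 7): -2.0, (1, 8): -0.5, (1, 9): -1.1},
)
```

Here j(1, 1) = 8 and j(1, 2) = 9: the highest-ranked addition is evaluated first in step 3, and evaluation of a route's list stops as soon as the total expected value drops.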
3.3 Step 3: Forming Routes Based on Total Expected Value
In step 3, we compute the total expected value V (x, a) resulting from adding those customers to vehicle routes which obtained the highest values in step 2. Then we move to a neighboring decision by adding the
customer to the vehicle route which leads to the best value of V (x, a) and return to step 2, if such a value is better than the value of the current decision; otherwise the procedure terminates with the current decision. When computing the total expected value V (x, a) resulting from adding a customer to a vehicle route, we need to choose the delivery quantities at the customers (which determine the decision a). In step 3 we use a local search approach, where a move to a neighboring set of delivery quantities consists of swapping a unit of delivery between two customers on the same route, or incrementing the delivery quantity at one customer by one unit if the vehicle capacity and the customer capacity allow such an increment, or decrementing the delivery quantity at one customer by one unit if the current delivery quantity at that customer is positive. We start the local search from the initial set of delivery quantities used in step 2 (the optimal delivery quantities of the subproblems), and terminate the local search when a local optimum is found. Let a(m, i) denote the chosen decision when adding customer j(m, i) to route m. For each vehicle route m, we successively compute the total expected value V (x, a) resulting from adding one of the highest valued customers to route m. That is, we first compute V (x, a(m, 1)) resulting from adding customer j(m, 1) to route m. Then we compute V (x, a(m, 2)) resulting from adding customer j(m, 2) (but not customer j(m, 1)) to route m. If V (x, a(m, 2)) ≥ V (x, a(m, 1)), then we do the same for customer j(m, 3), otherwise we continue with another vehicle route. Thus the computation for route m is stopped when we reach a customer j(m, i) for which the total expected value is worse than that for customer j(m, i − 1), i.e., V (x, a(m, i)) < V (x, a(m, i − 1)). Due to the preliminary ranking in step 2, this usually happened in computational tests when i = 2. 
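The delivery-quantity local search can be sketched as a hill climb over integer quantity vectors, with the three neighborhood moves described above; the scoring function below stands in for V(x, a) and, like the capacities, is hypothetical:

```python
def local_search_quantities(q0, vehicle_cap, cust_cap, score):
    """Hill-climb over integer delivery quantities on one route.
    Neighbors: move one unit between two customers, increment one
    customer's quantity (capacity permitting), or decrement a positive
    quantity.  Stops at a local optimum."""
    q = dict(q0)
    while True:
        best_q, best_v = None, score(q)
        for i in q:
            for j in q:
                if i != j and q[i] > 0 and q[j] < cust_cap[j]:
                    n = dict(q)            # move a unit from i to j
                    n[i] -= 1
                    n[j] += 1
                    if score(n) > best_v:
                        best_q, best_v = n, score(n)
            if q[i] < cust_cap[i] and sum(q.values()) < vehicle_cap:
                n = dict(q)                # increment at i
                n[i] += 1
                if score(n) > best_v:
                    best_q, best_v = n, score(n)
            if q[i] > 0:
                n = dict(q)                # decrement at i
                n[i] -= 1
                if score(n) > best_v:
                    best_q, best_v = n, score(n)
        if best_q is None:
            return q
        q = best_q

# Hypothetical concave score peaking at deliveries (4, 5).
score = lambda q: -(q[1] - 4) ** 2 - (q[2] - 5) ** 2
result = local_search_quantities({1: 0, 2: 0}, vehicle_cap=10,
                                 cust_cap={1: 6, 2: 6}, score=score)
```

In the actual procedure the search would start from the subproblems' optimal quantities rather than from zero; the sketch starts at zero only to show the climb.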
After these computations have been completed for all vehicle routes, the vehicle route and added customer that provide the best total expected value V(x, a*) are determined. If the obtained value V(x, a*) is better than the total expected value V(x, a′) of the current decision a′, then a* becomes the new current decision, and the procedure returns to step 2; otherwise the procedure stops with a′ as the chosen decision. The procedure also stops if no more customers can be added to any vehicle routes.

As mentioned before, in the expression (9) for V(x, a), the expected future value Σ_{y∈X} Q[y | x, a] V̂(y) may be the sum of a huge number of terms, especially if there are a large number of customers and the demand of each customer can take on several values. As pointed out in Kleywegt, Nori and Savelsbergh (1999), if there are a large number of customers, it is usually more efficient to estimate the expected future value with random sampling. Related issues to be addressed are (1) how large the sample size should be, and (2) what performance guarantees can be obtained if random sampling is used to choose the best decision. To address these issues, we used a ranking and selection method based on the work of Nelson and Matejcik (1995). We also used variance reduction techniques, such as common random numbers and experimental designs such as orthogonal arrays, to reduce the sample size needed for a specified level of accuracy. Additional details are given in Kleywegt, Nori and Savelsbergh (1999) and Nori (1999). Algorithm 1 gives an overview of the steps in the procedure to choose a decision in a given state.
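The sampling estimate replaces the full sum over next states with an average over sampled next states; reusing the same stream of uniform random numbers for every candidate decision (common random numbers) makes comparisons between decisions much less noisy. A minimal sketch, in which the transition and value functions are hypothetical stand-ins:

```python
import random

def sampled_value(g_xa, alpha, next_state, v_hat, uniforms):
    """Estimate V(x, a) = g(x, a) + alpha * E[Vhat(Y)] by averaging
    Vhat over next states generated from a fixed stream of uniforms."""
    future = sum(v_hat(next_state(u)) for u in uniforms) / len(uniforms)
    return g_xa + alpha * future

random.seed(1)
uniforms = [random.random() for _ in range(20000)]

# Hypothetical one-customer transition: a stockout state is reached
# with probability 0.3, and v_hat scores the resulting state.
next_state = lambda u: 1 if u < 0.3 else 0
v_hat = lambda y: 10.0 * y
est = sampled_value(g_xa=1.0, alpha=0.9, next_state=next_state,
                    v_hat=v_hat, uniforms=uniforms)
```

The exact value here is 1 + 0.9 · (0.3 · 10) = 3.7. Two decisions evaluated on the same `uniforms` list differ only through the decisions themselves, not through sampling noise, which is the point of common random numbers.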
Algorithm 1 Choosing a Decision in a State

Step 1:
  Compute direct-delivery routes and delivery quantities using the algorithm in Kleywegt, Nori and Savelsbergh (1999). Let a′ be the resulting decision.
Step 2:
  if no more customers can be added to any routes then
    Stop with current decision a′ as the chosen decision.
  end if
  for each current vehicle route m do
    for each customer j that can be added to route m do
      Add customer j to route m.
      Use the optimal delivery quantities from the subproblems corresponding to the routes to determine the decision a.
      Compute the single-stage value g(x, a).
      Remove customer j from route m.
    end for
    Sort the customers that can be added to route m, in decreasing order of the single-stage values g(x, a), to obtain a sorted list of customers j(m, 1), j(m, 2), . . . for route m.
  end for
Step 3:
  for each current vehicle route m do
    Set i = 1.
    Add customer j(m, i) to route m.
    Choose the delivery quantities using local search to determine the decision a(m, i).
    Compute V(x, a(m, i)).
    Remove customer j(m, i) from route m.
    repeat
      Increment i = i + 1.
      Add customer j(m, i) to route m.
      Choose the delivery quantities using local search to determine the decision a(m, i).
      Compute V(x, a(m, i)).
      Remove customer j(m, i) from route m.
    until V(x, a(m, i)) < V(x, a(m, i − 1)).
  end for
  Let m* be the route, j* be the added customer, and a* be the decision with the best value of V(x, a(m, i)).
  if V(x, a*) > V(x, a′) then
    Add customer j* to route m*, and set a′ = a* as the new current decision.
    Go to Step 2.
  else
    Stop with current decision a′ as the chosen decision.
  end if
4 Computational Results
In this section, we discuss a number of experiments to test the quality of the policies produced by the dynamic programming approximation method. One of the important issues during the development of the methodology for the value function approximation was how to capture the interactions between the customer(s) in a subproblem and the remaining customers. One aspect of this interaction is the fact that when a vehicle visits both a customer i in a subproblem MDP_ij and a customer k not in the subproblem, then less than the full vehicle capacity is available for delivery at the customer in the subproblem. As described in Section 2.1, this interaction is captured by the partial vehicle capacities λ^i_ij available to customer i in subproblem MDP_ij. Appropriate values for λ^i_ij can be estimated with (7) or (8). A relevant question is how sensitive the values of the resulting policy are with respect to the estimates of λ^i_ij. In the first set of experiments, we compare the effect on the solution quality of using a simple estimate λ̂^i_ij = 0.5 C_V (policy π1) with that of using an estimate obtained using (8) and simulation (policy π2). For both policies, the expected values in (3) are computed exactly, and the decision in each state is chosen by evaluating all feasible decisions in that state. We compare the value functions of policies π1 and π2 with the optimal value function for small instances of the IRPTD (for which the optimal value function can be computed in reasonable time). The instances used for the comparison are given in the Appendix. As it is difficult to give a concise presentation of the quality of a policy π, because it involves comparing its value function with the optimal value function over all states, we have chosen to compare the average value of policy π over all states with the average optimal value over all states. That is, V^π_avg ≡ Σ_{x∈X} V^π(x)/|X| is compared with V*_avg ≡ Σ_{x∈X} V*(x)/|X|. However, since we realize that averaging over all states may smooth out irregularities, we augment this comparison with a comparison of the minimum and maximum values over all states. That is, V^π_min ≡ min_{x∈X} V^π(x) and V^π_max ≡ max_{x∈X} V^π(x) are compared with V*_min ≡ min_{x∈X} V*(x) and V*_max ≡ max_{x∈X} V*(x). These comparisons are given in Table 1.
Table 1: Comparison of the values of policies that use different estimates of the partial vehicle availabilities λ^i_ij, with the optimal values.

Instance   V*_min   V*_avg   V*_max   Vπ1_min  Vπ1_avg  Vπ1_max  Vπ2_min  Vπ2_avg  Vπ2_max
topt1       66.77    68.13    69.21    66.42    67.84    68.83    66.58    68.09    69.05
topt2       66.62    69.19    70.63    65.96    68.65    70.00    66.43    69.12    70.51
topt3       22.93    27.17    29.78    22.17    26.53    28.98    22.74    27.10    29.63
topt4      148.02   153.19   156.42   145.68   151.34   154.53   147.27   152.90   155.71
To complement the information provided in Table 1, we present the results of this computational experiment in a different way. Instead of presenting statistics of the actual values of the policies, we present the values of these policies relative to the optimal values. To eliminate the effect of negative optimal values, or values in the denominator close to zero, we shift the values so that the minimum of the shifted optimal value function is 1. Specifically, let m ≡ min_{x∈X} V*(x), and for any stationary policy π, let ρ^π(x) ≡ [V^π(x) − m + 1]/[V*(x) − m + 1]. In Table 2 we present ρ^π_avg ≡ Σ_{x∈X} ρ^π(x)/|X|, ρ^π_min ≡ min_{x∈X} ρ^π(x), and ρ^π_max ≡ max_{x∈X} ρ^π(x) for the policies evaluated.

Table 2: Comparison of the relative values of policies that use different estimates of the partial vehicle availabilities λ^i_ij.

Instance   ρπ1_min  ρπ1_avg  ρπ1_max  ρπ2_min  ρπ2_avg  ρπ2_max
topt1        0.957    0.973    0.980    0.969    0.988    0.995
topt2        0.952    0.963    0.984    0.965    0.985    0.994
topt3        0.961    0.971    0.976    0.967    0.988    0.994
topt4        0.950    0.962    0.973    0.968    0.984    0.990
When we look at the results in Tables 1 and 2, we observe that the values of the two policies are very close to the optimal values, which indicates that our overall approach provides good policies. Furthermore, the results also reveal that using (8) and simulation to estimate λ^i_ij provides a better policy than using a crude estimate, at the cost of only a small increase in computation time. Hence, we used (8) and simulation to estimate λ^i_ij in the experiments discussed in the remainder of this section. The Gauss-Seidel policy evaluation algorithm used to compute the value functions of policies for smaller instances is not useful for larger instances, because the number of states becomes too large: the available computer memory is not sufficient to store the values of all the states, and the computation time becomes excessive. For the same reasons, the optimal value functions cannot be computed for larger instances. In the absence of optimal values, we used a slightly modified version of the policy proposed by Chien, Balakrishnan and Wong (1989) (CBW), as described in Kleywegt, Nori and Savelsbergh (1999), for comparison with our dynamic programming approximation policy (KNS), presented in Algorithm 1. We also compared policy KNS with a Myopic policy that takes only the single-stage costs into account, i.e., the policy obtained by using value function approximation V̂ = 0 or discount factor α = 0. The policies were evaluated by randomly choosing five initial states, and then simulating the processes under each of the different policies starting from the chosen initial states. Each replication produced a sample path over a relatively long but finite time horizon of 800 time periods. The length of the time horizon was chosen to bound the discounted truncation error below 0.01 (approximately 0.1%). Six sample paths were generated for each combination of policy and initial state, for each problem instance.
The sample means µ and standard deviations σ of the sample means over the six sample paths, as well as the intervals (µ − 2σ, µ + 2σ), were computed. We conducted three experiments to evaluate the quality of the three policies on larger instances. In each of these experiments, we varied a single instance characteristic and observed the impact on the performance of the policies. The three instance characteristics varied are (1) the number of customers, (2) the number of vehicles, and (3) the coefficient of variation of customer demand. To study the impact of the number of customers on the performance of the policies, the instances were generated so that larger instances have more customers with the same characteristics as the smaller instances. Hence, customer characteristics as well as the ratio of delivery capacity to total expected demand were kept the same for all instances. Table 3 shows the performance of the policies on instances with varying numbers of customers. The results clearly demonstrate that the KNS policy consistently outperforms the other policies.
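The table entries can be reproduced from the replicated sample paths with a few lines: µ is the mean of the per-path discounted values, σ is the standard deviation of that mean, and the reported interval is (µ − 2σ, µ + 2σ). A sketch with hypothetical replication values:

```python
from math import sqrt
from statistics import mean, stdev

def summarize_replications(path_values):
    """Return (mu, sigma, interval) for one policy/initial-state cell:
    mu is the sample mean over the replications, sigma the standard
    deviation of the sample mean (s / sqrt(n))."""
    mu = mean(path_values)
    sigma = stdev(path_values) / sqrt(len(path_values))
    return mu, sigma, (mu - 2 * sigma, mu + 2 * sigma)

# Six hypothetical sample-path values for one policy and initial state.
mu, sigma, (lo, hi) = summarize_replications(
    [-12.1, -13.0, -12.5, -11.6, -12.8, -12.2])
```

Dividing the per-path standard deviation by √n is what makes σ the standard deviation of the sample mean rather than of a single replication, matching the interval definition used in the tables.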
Furthermore, the difference in quality appears to increase with the number of customers. Apparently, when the number of customers becomes larger, the KNS policy is better at coordinating deliveries than the other policies. Next, we studied the impact of the number of vehicles, and thus the available delivery capacity, on the performance of the policies. The number of vehicles was chosen in such a way that we could study the effectiveness of the policies when the available delivery capacity is smaller than the total expected demand, as well as when there is surplus delivery capacity. The results are given in Table 4. Intuitively, it is clear that when the delivery capacity is very restrictive, i.e., the number of vehicles is small, then it becomes more important to use the available capacity wisely. The results show the superiority of the KNS policy in handling these situations. The differences in quality are much larger for tightly constrained instances than for loosely constrained instances. Finally, we studied the impact of the coefficient of variation of customer demand on the performance of the policies. The customer demand distributions for the three instances were selected so that the demand distribution is the same for all customers in an instance, and the expected customer demand for each of the instances is 5. We varied the distributions so that the customer demands have different variances, namely 1, 4 and 16. All other characteristics are exactly the same for the instances. The results are given in Table 5. The results show that when the coefficients of variation of customer demand are large and it becomes less clear what the future is going to bring, the difference in quality between the KNS policy and the other policies tends to be smaller, although the KNS policy still does better on every instance.
As expected, this indicates that carefully taking into account the available information about the future, such as through dynamic programming approximation methods, provides more benefit when the available information is more informative about the future. Finally, we compare the performance of the three policies on a real-world instance. The data for this real-world instance was obtained from one of the smaller plants of a leading producer and distributor of air products. Before describing the results, we indicate some features of the data which are interesting and present in most data sets obtained from this company. We also indicate some changes made to the data, which helped in the solution process.

1. Tank sizes at the customers range from 90,000 cubic feet to 700,000 cubic feet. The tank sizes at the customers were rounded to the nearest multiple of 25,000, and product quantities were discretized in multiples of 25,000.

2. The company did not have estimates of the probability distributions of demands at the customers. However, they did have estimates of the mean and standard deviation of the demand. Using these, we created a discrete demand distribution for each customer with the given mean and standard deviation.

3. The company did not have exact values for the revenue earned per unit of product delivered. We used the same value for the revenue per unit of product at all the customers, assuming that the company charged the same price to all its customers.
Table 3: Comparison of the values of policies on instances with different numbers of customers.

                         CBW                             Myopic                            KNS
Instance   N      µ      σ    µ−2σ    µ+2σ       µ      σ    µ−2σ    µ+2σ       µ      σ    µ−2σ    µ+2σ
tcst1     10  -12.45  0.37  -13.20  -11.71   -11.39  0.26  -11.91  -10.86    -8.60  0.27   -9.13   -8.07
              -12.21  0.27  -12.74  -11.67   -11.25  0.20  -11.64  -10.86    -8.73  0.29   -9.32   -8.14
              -11.97  0.28  -12.54  -11.40   -11.88  0.34  -12.56  -11.21    -8.53  0.11   -8.75   -8.31
              -12.19  0.40  -12.98  -11.39   -11.65  0.24  -12.13  -11.18    -8.63  0.20   -9.04   -8.22
              -13.08  0.24  -13.57  -12.60   -11.73  0.18  -12.09  -11.38    -8.92  0.27   -9.46   -8.37
tcst2     15  -17.62  0.42  -18.47  -16.78   -17.17  0.24  -17.64  -16.70   -13.10  0.13  -13.35  -12.85
              -17.76  0.28  -18.32  -17.20   -17.09  0.28  -17.66  -16.53   -13.57  0.10  -13.77  -13.38
              -18.25  0.42  -19.08  -17.41   -17.30  0.25  -17.80  -16.79   -13.34  0.21  -13.77  -12.92
              -17.37  0.39  -18.16  -16.58   -17.13  0.17  -17.48  -16.79   -13.63  0.31  -14.24  -13.02
              -18.17  0.33  -18.83  -17.52   -16.92  0.15  -17.21  -16.62   -13.45  0.16  -13.78  -13.13
tcst3     20  -20.58  0.36  -21.30  -19.86   -19.84  0.35  -20.54  -19.13   -16.68  0.28  -17.24  -16.12
              -20.81  0.29  -21.38  -20.24   -19.35  0.37  -20.10  -18.60   -16.85  0.27  -17.39  -16.30
              -20.49  0.34  -21.18  -19.81   -19.21  0.28  -19.77  -18.66   -16.43  0.18  -16.79  -16.07
              -21.25  0.33  -21.91  -20.58   -19.28  0.35  -19.97  -18.58   -16.59  0.30  -17.18  -15.99
              -20.36  0.26  -20.89  -19.84   -19.87  0.42  -20.72  -19.02   -16.21  0.27  -16.75  -15.66
Table 4: Comparison of the values of policies on instances with different numbers of vehicles. For each instance, five independent value estimates are reported per policy; µ is the estimate, σ its standard deviation, and [µ − 2σ, µ + 2σ] the associated range.

                       CBW                              Myopic                           KNS
Instance  M     µ       σ     µ−2σ    µ+2σ      µ       σ     µ−2σ    µ+2σ      µ       σ     µ−2σ    µ+2σ
tveh1     3   -65.44   0.17  -65.78  -65.10   -64.11   0.18  -64.48  -63.75   -58.58   0.19  -58.96  -58.20
              -65.85   0.25  -66.34  -65.35   -63.73   0.25  -64.23  -63.23   -59.24   0.29  -59.82  -58.65
              -65.85   0.20  -66.24  -65.45   -63.82   0.25  -64.31  -63.33   -59.05   0.23  -59.52  -58.58
              -66.03   0.19  -66.41  -65.64   -63.84   0.22  -64.29  -63.40   -58.92   0.21  -59.35  -58.50
              -65.72   0.32  -66.36  -65.07   -63.93   0.27  -64.47  -63.40   -58.73   0.18  -59.09  -58.36
tveh2     6     1.41   0.13    1.16    1.66     2.00   0.26    1.48    2.51     4.83   0.22    4.39    5.27
                1.17   0.24    0.70    1.65     2.17   0.18    1.81    2.52     5.30   0.17    4.96    5.64
                1.43   0.18    1.08    1.78     1.58   0.27    1.04    2.12     5.43   0.24    4.95    5.91
                1.30   0.16    0.99    1.62     1.96   0.36    1.24    2.68     5.14   0.26    4.61    5.67
                0.82   0.20    0.42    1.22     2.18   0.29    1.60    2.75     5.28   0.24    4.79    5.76
tveh3     9    15.01   0.32   14.37   15.65    16.10   0.21   15.69   16.52    18.34   0.18   17.97   18.71
               15.28   0.19   14.90   15.66    15.93   0.18   15.56   16.29    18.06   0.24   17.58   18.53
               15.15   0.12   14.91   15.39    15.98   0.19   15.59   16.36    17.64   0.14   17.36   17.91
               15.30   0.24   14.83   15.78    16.09   0.22   15.65   16.53    18.17   0.33   17.52   18.83
               14.87   0.19   14.48   15.26    16.23   0.29   15.64   16.82    17.84   0.24   17.35   18.33
Table 5: Performance of policies on instances with different demand variance. For each instance, five independent value estimates are reported per policy; µ is the estimate, σ its standard deviation, and [µ − 2σ, µ + 2σ] the associated range.

                       CBW                              Myopic                           KNS
Instance  CV    µ       σ     µ−2σ    µ+2σ      µ       σ     µ−2σ    µ+2σ      µ       σ     µ−2σ    µ+2σ
tvar1     0.1 -17.21   0.28  -17.76  -16.65   -16.69   0.28  -17.24  -16.14   -14.02   0.24  -14.50  -13.55
              -17.81   0.16  -18.14  -17.48   -16.71   0.27  -17.25  -16.16   -13.93   0.25  -14.42  -13.44
              -17.59   0.22  -18.02  -17.15   -16.79   0.18  -17.14  -16.43   -13.50   0.14  -13.77  -13.23
              -17.24   0.26  -17.76  -16.72   -16.20   0.17  -16.55  -15.86   -13.88   0.30  -14.48  -13.29
              -17.38   0.33  -18.04  -16.72   -16.41   0.15  -16.71  -16.11   -13.52   0.28  -14.09  -12.96
tvar2     0.4 -14.94   0.26  -15.46  -14.42   -14.14   0.22  -14.59  -13.69   -12.27   0.22  -12.71  -11.83
              -15.15   0.25  -15.66  -14.64   -14.21   0.25  -14.70  -13.71   -12.10   0.27  -12.64  -11.56
              -14.77   0.27  -15.31  -14.22   -13.60   0.15  -13.91  -13.29   -11.65   0.21  -12.08  -11.22
              -14.58   0.13  -14.84  -14.33   -14.04   0.29  -14.62  -13.46   -12.23   0.17  -12.58  -11.88
              -14.77   0.25  -15.28  -14.26   -14.09   0.23  -14.55  -13.62   -11.73   0.24  -12.21  -11.24
tvar3     0.8  -9.55   0.17   -9.89   -9.21    -8.17   0.18   -8.54   -7.80    -6.93   0.29   -7.52   -6.34
               -9.59   0.20  -10.00   -9.19    -8.03   0.18   -8.38   -7.67    -6.76   0.13   -7.03   -6.50
               -9.85   0.28  -10.42   -9.28    -8.18   0.24   -8.65   -7.70    -7.04   0.23   -7.50   -6.58
               -9.74   0.29  -10.32   -9.16    -8.04   0.21   -8.46   -7.62    -7.06   0.25   -7.56   -6.56
               -8.90   0.09   -9.08   -8.72    -8.15   0.17   -8.49   -7.81    -6.89   0.24   -7.37   -6.41
Table 6: Performance of policies on a real-world instance. Five independent value estimates are reported per policy; µ is the estimate, σ its standard deviation, and [µ − 2σ, µ + 2σ] the associated range.

                       CBW                              Myopic                           KNS
Instance        µ       σ     µ−2σ    µ+2σ      µ       σ     µ−2σ    µ+2σ      µ       σ     µ−2σ    µ+2σ
tprx1          32.62   1.27   30.07   35.17    36.54   0.48   35.57   37.51    45.45   2.70   40.04   50.86
               34.96   1.28   32.40   37.53    39.41   1.51   36.38   42.44    47.15   1.96   43.23   51.07
               34.16   1.84   30.48   37.84    37.55   1.69   34.17   40.93    47.06   1.62   43.82   50.30
               34.75   1.23   32.30   37.21    39.88   1.19   37.50   42.26    44.26   2.07   40.13   48.39
               33.93   1.31   31.30   36.56    37.14   1.22   34.71   39.57    42.97   1.26   40.44   45.50
The instance that we solved is given in Table 16, and the performance of the three policies is shown in Table 6. As before, the performance of policy KNS is much better than that of the Myopic policy, which in turn is better than that of the CBW policy. Overall, the computational experiments conducted demonstrate the viability of using dynamic programming approximation methods for the IRPTD.
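The µ, σ, and µ ± 2σ columns reported throughout Tables 3–6 summarize independent simulation estimates. A minimal sketch of how such a summary can be computed from a set of replications (the helper is ours, and σ is taken to be the sample standard deviation; the paper does not give its exact estimator):

```python
from math import sqrt

def summarize(replications):
    """Estimate mu, its sample standard deviation sigma, and the range
    [mu - 2*sigma, mu + 2*sigma] from independent replications.
    Illustrative helper; not code from the paper."""
    n = len(replications)
    mu = sum(replications) / n
    # Sample variance with the n-1 (Bessel) correction.
    sigma = sqrt(sum((v - mu) ** 2 for v in replications) / (n - 1))
    return mu, sigma, mu - 2 * sigma, mu + 2 * sigma
```

For example, `summarize([1.0, 2.0, 3.0])` returns a mean of 2.0, a standard deviation of 1.0, and the range [0.0, 4.0].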
References

Bertsekas, D. P. 1975. Convergence of Discretization Procedures in Dynamic Programming. IEEE Transactions on Automatic Control, 20, 415–419.

Bertsekas, D. P. and Shreve, S. E. 1978. Stochastic Optimal Control: The Discrete Time Case. Academic Press, New York, NY.

Chien, T. W., Balakrishnan, A. and Wong, R. T. 1989. An Integrated Inventory Allocation and Vehicle Routing Problem. Transportation Science, 23, 67–76.

Chow, C. S. and Tsitsiklis, J. N. 1991. An Optimal One-Way Multigrid Algorithm for Discrete-Time Stochastic Control. IEEE Transactions on Automatic Control, 36, 898–914.

Cook, W. and Rohe, A. 1998. Computing Minimum-Weight Perfect Matchings. Preprint.

Edmonds, J. 1965a. Maximum Matching and a Polyhedron with 0,1-Vertices. Journal of Research of the National Bureau of Standards, 69B, 125–130.

Edmonds, J. 1965b. Paths, Trees and Flowers. Canadian Journal of Mathematics, 17, 449–467.

Gabow, H. N. 1990. Data Structures for Weighted Matching and Nearest Common Ancestors with Linking. Proceedings of the First Annual ACM-SIAM Symposium on Discrete Algorithms, New York, NY, 434–443.

Kleywegt, A. J., Nori, V. S. and Savelsbergh, M. W. P. 1999. The Stochastic Inventory Routing Problem with Direct Deliveries. Technical Report TLI99-01, The Logistics Institute, School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0205.

Kushner, H. J. and Dupuis, P. 1992. Numerical Methods for Stochastic Control Problems in Continuous Time. Springer-Verlag, New York, NY.

Nelson, B. L. and Matejcik, F. J. 1995. Using Common Random Numbers for Indifference-zone Selection and Multiple Comparisons in Simulation. Management Science, 41, 1935–1945.

Nori, V. S. 1999. Algorithms for Dynamic and Stochastic Logistics Problems. Ph.D. thesis, School of Industrial and Systems Engineering, Georgia Institute of Technology.
Appendices

A  Instances Used in Computational Results

In the instance tables below, only the nonzero demand probabilities fi(d) are listed for each customer i.

Table 7: Instance topt1. Vendor at (0, 0); N = 4, M = 1, CV = 4.
 i    xi     yi   Ci  fi                      ri   pi  hi
 1    0.0   10.0   2  fi(1)=0.5, fi(2)=0.5   100   40   1
 2  -10.0    0.0   2  fi(1)=0.7, fi(2)=0.3   100   40   1
 3    0.0  -10.0   2  fi(1)=0.3, fi(2)=0.7   100   40   1
 4   10.0    0.0   2  fi(1)=0.2, fi(2)=0.8   100   40   1

Table 8: Instance topt2. Vendor at (0, 0); N = 4, M = 1, CV = 5.
 i    xi     yi   Ci  fi                                           ri   pi  hi
 1    0.0   10.0   4  fi(1)=0.2, fi(2)=0.2, fi(3)=0.4, fi(4)=0.2  100   40   1
 2  -10.0    0.0   4  fi(1)=0.1, fi(2)=0.5, fi(3)=0.2, fi(4)=0.2  100   40   1
 3    0.0  -10.0   4  fi(1)=0.3, fi(2)=0.3, fi(3)=0.3, fi(4)=0.3  100   40   1
 4   10.0    0.0   4  fi(1)=0.2, fi(2)=0.3, fi(3)=0.5             100   40   1

Table 9: Instance topt3. Vendor at (0, 0); N = 4, M = 1, CV = 5.
 i    xi     yi   Ci  fi                                                               ri   pi  hi
 1    0.0   10.0   6  fi(1)=0.2, fi(2)=0.2, fi(3)=0.1, fi(4)=0.2, fi(5)=0.2, fi(6)=0.1  100  40   1
 2  -10.0    0.0   6  fi(1)=0.1, fi(2)=0.2, fi(3)=0.2, fi(4)=0.2, fi(5)=0.2, fi(6)=0.1  100  40   1
 3    0.0  -10.0   6  fi(3)=0.5, fi(4)=0.5                                              100  40   1
 4   10.0    0.0   6  fi(2)=0.3, fi(4)=0.6, fi(6)=0.1                                   100  40   1

Table 10: Instance topt4. Vendor at (0, 0); N = 4, M = 1, CV = 8.
 i    xi     yi   Ci  fi                            ri   pi  hi
 1    0.0   10.0   8  fi(d)=0.1 for d = 1, ..., 8  100   40   1
 2  -10.0    0.0   8  fi(d)=0.1 for d = 1, ..., 8  100   40   1
 3    0.0  -10.0   8  fi(d)=0.1 for d = 1, ..., 8  100   40   1
 4   10.0    0.0   8  fi(d)=0.1 for d = 1, ..., 8  100   40   1

Table 11: Instances tcst1, tcst2 and tcst3. The values of (N, M) are (10, 4), (15, 6) and (20, 8). Vendor at (0, 0); CV = 9.
 i    xi     yi   Ci  fi                                            ri   pi  hi
 1   16.2  -22.2  10  fi(1)=0.5, fi(2)=0.5                         598  310   1
 2  -23.2  -18.7  10  fi(4)=0.5, fi(5)=0.5                         504  294   2
 3    9.1    9.8  10  fi(7)=0.2, fi(8)=0.8                         571  307   1
 4   19.5   -9.5  10  fi(3)=0.3, fi(4)=0.4, fi(5)=0.3              569  304   2
 5  -20.0   23.5  10  fi(7)=0.5, fi(8)=0.5                         581  262   1
 6   -4.9  -22.1  10  fi(1)=0.4, fi(2)=0.5, fi(3)=0.1              551  347   2
 7   -0.8  -14.0  10  fi(5)=0.4, fi(6)=0.6                         585  266   1
 8    4.3   14.8  10  fi(6)=1.0                                    518  257   2
 9   -6.9   -4.2  10  fi(6)=0.2, fi(7)=0.3, fi(8)=0.4, fi(9)=0.1   571  305   1
10   21.9  -22.2  10  fi(6)=0.5, fi(8)=0.5                         557  281   2
11  -17.8   29.7  10  fi(8)=0.3, fi(9)=0.5, fi(10)=0.2             550  315   1
12    7.4   11.2  10  fi(8)=0.4, fi(9)=0.2, fi(10)=0.4             551  259   2
13    9.1   -0.4  10  fi(3)=0.4, fi(4)=0.5, fi(5)=0.1              581  346   1
14   -0.4   23.7  10  fi(2)=1.0                                    518  340   2
15   14.7   22.0  10  fi(4)=0.1, fi(5)=0.4, fi(6)=0.3, fi(7)=0.2   575  264   1
16   29.8   12.2  10  fi(5)=0.6, fi(7)=0.4                         511  327   2
17  -16.4  -26.9  10  fi(1)=0.1, fi(3)=0.9                         521  282   1
18   -5.5  -25.0  10  fi(4)=0.5, fi(5)=0.5                         523  287   2
19   -8.7  -27.1  10  fi(4)=0.4, fi(5)=0.1, fi(6)=0.1, fi(7)=0.4   562  271   1
20   25.3   17.5  10  fi(3)=0.3, fi(4)=0.3, fi(5)=0.4              598  335   2

Table 12: Instance tvar1. Vendor at (0, 0); N = 15, M = 5, CV = 12. For every customer i, fi(4) = fi(6) = 0.5.
 i    xi     yi   Ci   ri   pi  hi
 1  -11.4  -11.8  10  541  315   2
 2    8.0    5.2  10  515  238   1
 3   18.7  -28.3  10  587  328   2
 4   14.6  -19.2  10  415  211   1
 5    3.5   11.0  10  507  237   2
 6   10.4   18.2  10  485  279   1
 7   -6.1    4.4  10  442  397   2
 8   12.1   21.6  10  515  287   1
 9   13.9    6.8  10  598  305   2
10  -14.0  -12.6  10  586  389   1
11   21.8   -4.6  10  492  295   2
12    9.6   -5.5  10  448  270   1
13  -12.3   -4.5  10  510  330   2
14   11.8   12.2  10  476  244   1
15    6.5    8.0  10  432  212   2

Table 13: Instance tvar2. Identical to instance tvar1 (Table 12), except that for every customer i, fi(3) = fi(7) = 0.5.

Table 14: Instance tvar3. Identical to instance tvar1 (Table 12), except that for every customer i, fi(1) = fi(9) = 0.5.

Table 15: Instances tveh1, tveh2 and tveh3 (M = 3, 6 and 9, respectively). Vendor at (0, 0); N = 15, CV = 12.
 i    xi     yi   Ci  fi                     ri   pi  hi
 1   24.8   13.8  10  fi(1)=fi(2)=0.5       599  256   0
 2   -3.3   18.8  10  fi(1)=fi(2)=0.5       502  328   0
 3  -24.6  -14.6  10  fi(3)=fi(4)=0.5       644  268   0
 4   25.2    5.9  10  fi(3)=fi(4)=0.5       533  347   0
 5    4.3   26.7  10  fi(5)=fi(6)=0.5       467  255   0
 6   24.9   -1.4  10  fi(5)=fi(6)=0.5       479  324   0
 7  -29.3   20.6  10  fi(7)=fi(8)=0.5       588  260   0
 8   24.3   -6.6  10  fi(7)=fi(8)=0.5       629  340   0
 9    5.7  -11.8  10  fi(9)=fi(10)=0.5      647  301   0
10    5.9   -2.4  10  fi(9)=fi(10)=0.5      639  303   0
11    4.5   -1.1  10  fi(6)=fi(7)=0.5       480  324   0
12   22.0   -1.9  10  fi(6)=fi(7)=0.5       593  266   0
13   -3.8  -28.3  10  fi(1)=fi(2)=0.5       497  278   0
14  -22.6   -9.7  10  fi(1)=fi(2)=0.5       647  327   0
15   28.5   26.0  10  fi(7)=fi(8)=0.5       562  284   0
Table 16: Instance tprx1. Vendor at (−84.2, 33.8); N = 9, M = 4, CV = 20.
 i   Long   Lat   Ci   ri   pi  hi
 1  -86.8  33.6   10  550  250   0
 2  -85.3  35.0    4  550  250   0
 3  -81.0  35.2    8  550  210   0
 4  -96.4  32.5   24  550  260   0
 5  -95.4  29.8   28  550  260   0
 6  -85.8  38.2    4  550  210   0
 7  -90.0  35.2   11  550  210   0
 8  -90.1  30.0    4  550  190   0
 9  -98.1  29.3   18  550  260   0
Demand probabilities: fi(0) = 0 for all i; the remaining entries of fi(1), ..., fi(10) are, in the order given,
fi(1)–fi(5): 0.0 0.0 0.0 0.0 0.0 / 0.5 0.0 0.5 0.0 0.5 / 0.0 0.5 0.0 0.0 0.0 / 0.0 0.0 0.0 0.0 0.0 / 0.0 0.0 0.0 0.0 0.0 / 0.0 1.0 0.0 0.0 0.5 / 0.0 0.0 0.0 1.0 0.0 / 0.0 0.0 0.0 0.0 0.0 / 0.0 1.0
fi(6)–fi(10): 0.5 / 0.0 / 0.0 / 0.0 / 0.5 / 0.0 0.0 0.0 / 0.0 0.0 1.0 / 0.0 0.0 0.0 / 0.0 0.0 / 1.0 0.0 / 0.5 / 0.0 / 0.0 / 0.0 / 0.0 / 0.0 0.0 0.0 / 0.0 / 0.0