Locating capacitated facilities to maximize captured demand

IIE Transactions (2007) 39, 1015–1029 Copyright © “IIE” ISSN: 0740-817X print / 1545-8830 online DOI: 10.1080/07408170601142650

ODED BERMAN1,∗, RONGBING HUANG2, SEOKJIN KIM3 and MOZART B. C. MENEZES4

1 Joseph L. Rotman School of Management, University of Toronto, 105 St. George Street, Toronto, ON, Canada M5S 3E6. E-mail: [email protected]
2 School of Administrative Studies, York University, Atkinson Building, 4700 Keele Street, Toronto, ON, Canada M3J 1P3
3 Department of Business Administration, Millersville University, P.O. Box 1002, Millersville, PA 17551-0302, USA
4 Department of Logistics and Operations Management, HEC-School of Management, Paris, 1 Rue de la Liberation, F-78351 Jouy en Josas Cedex, France

Received February 2006 and accepted October 2006

We consider the problem of locating a set of facilities on a network to maximize the expected amount of captured demand when customer demands are stochastic and congestion exists at facilities. Customers travel to their closest facility to obtain service. If the facility is full (no more space in the waiting room), they attempt to obtain service from the next-closest facility not yet visited from their current position on the network. A customer is lost either when the closest facility is located too far away or when all facilities have been visited. After formulating the model, we propose two heuristic procedures. We combine the heuristics with an iterative calibration scheme to estimate the expected demand rate faced by the facilities; this is required for evaluating objective function values. Extensive computational results are presented.

Keywords: Location, congestion, optimization, queueing

1. Introduction

One of the most important objectives for profit and nonprofit organizations when locating service facilities in a geographical region is to capture as much potential customer demand as possible. Strategically, most companies locate their facilities to maximize their market share and to create an entry barrier to potential competitors. Without effective location of facilities, it would be difficult to maintain a high-market-share position. Most non-profit organizations attempt to make their facilities more accessible to potential customers. Regardless of how attractive or useful an organization’s products and services may be, if customers cannot reach them with a desired level of service, the chance of success is greatly diminished.

In this paper, we specifically attempt to capture maximal customer demand from the viewpoint of a decision maker at a strategic level. We assume that customers are concentrated at discrete points (or nodes) on a network. The weight of a node represents the demand intensity at that node. It reflects the total number of potential customers and should be estimated through an appropriate statistical analysis. The inter-arrival times of customers at a node are uncertain and drawn from some distribution specific to that node. A customer who wishes to obtain service travels to the next “eligible” facility (the next-closest facility which has not yet been visited) if the facility is sufficiently close to her current position on the network. Otherwise, she gives up service and leaves the system. If the facility is full (no more space in the waiting room) when visited, she again considers traveling to the next eligible facility and so on. Service times at a facility are assumed to be drawn from some distribution identical across all servers in the network. For simplicity, it is assumed that each facility hosts a single server.

∗Corresponding author

The problem belongs to a class of optimization problems, called Location Problems with Stochastic Demands and Congestion (LPSDC), which attempt to find optimal locations for a set of facilities in the presence of stochastic demands and potential congestion at the facilities on a discrete undirected network. A comprehensive and yet detailed overview of LPSDC is provided by Berman and Krass (2002). LPSDC models are primarily involved with two types of uncertainty: (i) the actual amount and timing of the demand generated by customers; and (ii) a possible loss of demand or delay of service due to congestion at service facilities. It is well known that two important classes of deterministic location models in the literature are coverage type and median type. The two classes have their direct parallels in the LPSDC context. For median type models the readers can refer to Berman et al. (1990). Our paper belongs to the class of coverage-type models.

There are two subclasses of coverage problems: set covering and maximal covering problems. Most coverage-type models addressing congestion trace their origins to the model developed by Daskin (1983). Since the busy fraction of a server is exogenously given, the stochastic behavior of an underlying system is not explicitly captured in Daskin’s model. However, the model has led to several subsequent models that have attempted to integrate congestion explicitly. Following the initial work of Larson (1975), Batta et al. (1989) attempt to relax the assumption that busy fractions of servers are statistically independent. Pursuing an alternative approach, ReVelle and Hogan (1989) focus on a local region of a network and attempt to capture the stochastic behavior of the region more explicitly. Specifically, region i is the set of nodes within a prespecified distance of node i. As in Daskin (1983), the number of busy servers in a region is assumed to be binomially distributed. However, the busy fraction of a server is not server specific but region specific, i.e., there are different busy fractions in different regions. Batta et al. (1989) examine both set covering and maximal covering problems in the framework of covering-location models for emergency situations that require multiple response units. Given an upper bound on exponential service times, Ball and Lin (1993) derive a site-specific upper bound for busy fractions to provide solutions satisfying a required level of availability at each node. Following ReVelle and Hogan’s assumption that the busy fractions are region specific, Marianov and ReVelle (1994) treat a region having k servers within as an M/M/k/k system. Borras and Pastor (2002) provide an ex-post evaluation of the availability level on the three models in the literature by simulation and also suggest a new model formulated similarly to the Ball–Lin model incorporating the estimate for the busy fraction of the ReVelle–Hogan model.
In a closely related paper (Berman, Krass and Wang, 2006) where customers travel only to the closest facility, two types of demand loss are addressed: “lack of coverage”, which occurs when none of the facilities is close enough to the customers’ location to provide a sufficient level of convenience, and “lack of service”, which occurs when a customer finds the visited facility full and thus is lost to the system. In this paper, we explicitly allow a dissatisfied customer (or a “lost” customer due to “lack of service” in Berman, Krass and Wang, 2006) to visit the next-closest facility from her current position, so that she may possibly visit several facilities on the network. Therefore, this paper may better portray a common customer behavior. The problem of locating facilities taking advantage of this customer behavior can be useful in various situations:

1. When the company providing the service or product is a monopoly. For example, in Ontario, Canada, liquor is monopolized by the government, which operates LCBO (Liquor Control Board of Ontario) stores. These stores are often congested and it is quite common to see customers traveling to another LCBO when the current one is full.
2. Services with loyal customers. For example, customers having an account with a particular bank tend to stay loyal since requiring services from other banks incurs extra cost or inconvenience. Another example in Canada is gas stations. One of the largest companies is Petro Canada, which gives customers reward points. Therefore, customers who decide not to obtain gas from a full Petro Canada station will tend to go to another Petro Canada station even if there is a competitor gas station around the corner.
3. When customers are not fully loyal and competition is fierce, we can use a high sensitivity-to-distance parameter in the function describing the fraction of customers willing to visit a facility (defined in the next section) to take the competition into account.

This paper also extends recent work by Berman, Krass and Menezes (2006a, 2006b), which allows multiple visits to facilities assuming that facilities can be disrupted and thus unavailable to provide service. The objective there is to minimize the total expected cost of traveling. In contrast to our paper, the facilities considered in Berman, Krass and Menezes (2006a, 2006b) are not congested.

Our main contribution is two-fold: first, we introduce the problem of redirecting demand, when facilities are congested, in a network setting; second, we develop an approximation algorithm to compute the objective function value, which is very difficult to compute analytically, and thus avoid the expensive time cost of simulation.

The rest of this paper is organized as follows: in Section 2 we formulate the main problem. Next, in Section 3, we present the approximation algorithm to evaluate the objective function. In Section 4 two heuristics are introduced and in the subsequent section we show some computational results. Section 6 discusses two natural extensions to the model and a comparison of several models.
We conclude with final remarks in Section 7.

2. Problem formulation

Let $G = (N, E)$ be a network, where $N = \{1, 2, \ldots, n\}$ is the set of nodes and $E$ is the set of edges. The fraction of the total population associated with node $i \in N$ is denoted by $w_i$. The demand process at node $i$ is Poisson with rate $\lambda w_i$. For simplicity of notation and without loss of generality, we assume that $\lambda = 1$. There are $p$ facilities and in each facility there is a single server who performs service according to an exponential distribution with rate $\mu$. Facilities are located only at nodes. A facility is defined by a pair $(j, i)$ where $j$ is a facility and $i$ is its location. The facility is full when there are $c$ customers in the system and thus no new customers are allowed to enter. We denote by $d(x, y)$ the shortest distance between $x$ and $y$, $x, y \in G$, and by $d(i, L)$ the shortest distance from node $i$ to the closest facility in a set of nodes $L$, i.e., $d(i, L) = \min_{j \in L} d(i, j)$. Thus, facility $j$ operates as an M/M/1/c queueing system. If the number of customers $q_j$ at facility $j$ is $c$, a visiting customer does not enter facility $j$ for service.

The fraction of demands from any node willing to visit facility $j$ from customers’ current location $i$ is denoted by a non-increasing function $f(d_{ij})$. In this paper we consider the piecewise linear convex function given by $f(d_{ij}) = \max\{1 - d_{ij}/(\alpha d_{\max}), 0\}$, where $\alpha > 0$ is the distance sensitivity of customers and $\alpha d_{\max} > 0$ is a threshold distance. We assume that customers never visit any facility located beyond the threshold distance. A typical choice of $d_{\max}$ is the diameter of the network. Thus, a fraction $1 - f(d_{ij})$ of customers are lost. By using $\alpha_i$ (instead of $\alpha$) in $f(d_{ij})$ we can implicitly take competition into account when customers are not fully loyal. We note that other functional forms of $f$ are available in the literature, e.g., an exponential decay function $f(d) = e^{-\alpha d}$, or a linear decay function $f(d) = (u_i - d)/(u_i - l_i)$ where $l_i$ and $u_i$ are lower and upper bounds of $d$ such that $f(l_i) = 1$ and $f(u_i) = 0$; see Berman, Krass and Wang (2006).

A solution is defined by a pair $(S, L)$, where $S = \{1, 2, \ldots, p\}$ is the set of facilities and $L = \{L(1), \ldots, L(p)\} = \{L_1, \ldots, L_p\}$ is a multi-set of locations, where $L(j)$ (or $L_j$) is the location of facility $j \in S$. Since multiple facilities are allowed at a node, $L$ is a multi-set. For example, if $L = \{1, 2, 2, 4\}$, the second and third facilities are both located at node 2. We denote by $\rho(j)$ the blocking fraction of facility $j$, i.e., the fraction of time that $q_j = c$, where $q_j$ is the number of customers at facility $j$. Let $S^{(i)} = \{S_1^{(i)}, S_2^{(i)}, \ldots, S_p^{(i)}\}$ be the sequence of facilities to be visited by a customer from node $i$ and $L^{(i)} = \{L_1^{(i)}, L_2^{(i)}, \ldots, L_p^{(i)}\}$ be the set of locations corresponding to $S^{(i)}$, i.e., $L_j^{(i)} = L(S_j^{(i)})$. By convention, we let $S_0^{(i)} = 0$, $\rho(S_0^{(i)}) = 1$, and $L_0^{(i)} = i$. A customer from node $i$ first attempts to enter service at facility $S_1^{(i)}$, which is located at $L_1^{(i)}$.

The long-run fraction of demand from node $i$ visiting facility $S_1^{(i)}$ located at $L_1^{(i)}$ is $w_i f(d(L_0^{(i)}, L_1^{(i)}))$. If the number of customers $q(S_1^{(i)})$ at facility $S_1^{(i)}$ is less than $c$, the customer from node $i$ enters the facility for service. Otherwise, the customer visits the next facility $S_2^{(i)}$ or leaves the system with probability $1 - f(d(L_1^{(i)}, L_2^{(i)}))$. Define $B_i(S_j^{(i)})$ to be the long-run fraction of time that facility $S_j^{(i)}$ is available to provide service to demand originating from node $i$, while facilities $S_0^{(i)}, S_1^{(i)}, \ldots, S_{j-1}^{(i)}$ are not available (having $c$ customers in each). The expected demand $V_i(L)$ originating from node $i$ and captured by all $p$ facilities is

$$V_i(L) = w_i \sum_{j=1}^{p} B_i\big(S_j^{(i)}\big) \prod_{k=1}^{j} f\big(d\big(L_{k-1}^{(i)}, L_k^{(i)}\big)\big). \tag{1}$$

Since it is impossible to obtain the $B_i(S_j^{(i)})$ values analytically, Equation (1) cannot be used to calculate $V_i(L)$ in practice. We approximate $B_i(S_j^{(i)})$ by

$$\big(1 - \rho\big(S_j^{(i)}\big)\big) \prod_{k=1}^{j} \rho\big(S_{k-1}^{(i)}\big),$$

where $(1 - \rho(S_j^{(i)}))$ is the long-run fraction of time that the $j$th facility is available and $\prod_{k=1}^{j} \rho(S_{k-1}^{(i)})$ is the long-run fraction of time that the first $j - 1$ facilities are unavailable. This is clearly an approximation since it assumes that $\rho_1, \ldots, \rho_p$ are independent. Therefore, considering the approximation proposed, the expected demand $V_i(L)$ originating from node $i$ and captured by all $p$ facilities is

$$V_i(L) = w_i \sum_{j=1}^{p} \big(1 - \rho\big(S_j^{(i)}\big)\big) \prod_{k=1}^{j} \rho\big(S_{k-1}^{(i)}\big) \prod_{k=1}^{j} f\big(d\big(L_{k-1}^{(i)}, L_k^{(i)}\big)\big) = w_i \sum_{j=1}^{p} \big(1 - \rho\big(S_j^{(i)}\big)\big) \prod_{k=1}^{j} \rho\big(S_{k-1}^{(i)}\big) f\big(d\big(L_{k-1}^{(i)}, L_k^{(i)}\big)\big), \tag{2}$$

and the total expected demand captured by $p$ facilities is

$$V(L) = \sum_{i \in N} V_i(L). \tag{3}$$

The problem is then to maximize the total expected demand captured by $p$ facilities:

$$\text{(P1)} \qquad V(L^*) = \max_{L \subseteq N} \{V(L) : |L| = p\}.$$
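The inner sum of Equation (2) can be evaluated with a single pass along a customer's visiting sequence. The following is a minimal sketch, not the authors' implementation; the function name `captured_from_node` and its argument layout are assumptions made for illustration, and the numbers in the usage line are hypothetical.

```python
def captured_from_node(w, rho_seq, f_seq):
    """Approximate captured demand from one node, following Equation (2).

    w       -- demand weight w_i of the node
    rho_seq -- blocking fractions rho(S_1), ..., rho(S_p) in visiting order
    f_seq   -- willingness fractions f(d(L_{k-1}, L_k)) for legs k = 1..p
    (Argument names and layout are illustrative assumptions.)
    """
    total = 0.0
    reach = 1.0  # fraction of the node's demand that arrives at facility j
    for rho_j, f_j in zip(rho_seq, f_seq):
        reach *= f_j                     # willing to travel this leg
        total += reach * (1.0 - rho_j)   # facility j has room: captured
        reach *= rho_j                   # facility j is full: move on
    return w * total


# Hypothetical two-facility illustration.
value = captured_from_node(1.0, [0.5, 0.5], [1.0, 0.5])
print(value)
```

Summing this quantity over all nodes gives $V(L)$ in Equation (3).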

Note again that (P1) is just an approximation to the original problem. This approximation is shown to be very good (in Section 5, we compare the results obtained by the heuristic algorithms developed for Equation (2) with the results obtained by simulation using Equation (1)). Given a fixed set of facility locations, let $Q = (q_1, q_2, \ldots, q_p)$ be the state of the system where $q_j$ is the number of customers at facility $j$, $j = 1, \ldots, p$. There are $(c + 1)^p$ different states of the system for queues of maximum size $c$, which is a huge number. For example, for queues of maximum length of five customers and ten facilities the total number of states is over 60 000 000.
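As a quick check of the state count quoted above, with $c = 5$ and $p = 10$:

$$(c + 1)^p = 6^{10} = 60\,466\,176.$$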

3. Calibrating blocking fractions

The huge number of system states makes it very difficult to solve (P1) analytically. Furthermore, the blocking fraction of a facility is dependent on the other $(p - 1)$ facilities. So, when a single element in the facility location set is changed, the steady-state distributions must be recalculated. Our problem combines the complications of classical location problems (most of which are NP-complete) and the dynamics of queueing systems. A special case of our problem, when the maximum queue size (or equivalently the service rate) is infinite, is equivalent to the p-median problem, which is an NP-hard problem. The incorporation of stochastic aspects into location decisions inevitably leads to intractable formulations which often require simplifying assumptions on the problem and further reasonable approximations for the most important quantities of interest. Considering the complicated nature of the problem, we propose two heuristic procedures to improve the objective function value and to keep the feasibility of solutions at each iteration with computational efficiency. To do so, we need to approximate the blocking fraction $\rho^{(j)}$ of facility $j$. For an M/M/1/c queueing system (see, e.g., Gross and Harris (1985)), we have:

$$\rho^{(j)} = \begin{cases} \dfrac{(\lambda^{(j)}/\mu)^{c}\,(1 - \lambda^{(j)}/\mu)}{1 - (\lambda^{(j)}/\mu)^{c+1}} & \text{if } \lambda^{(j)} \neq \mu, \\[2ex] (c + 1)^{-1} & \text{if } \lambda^{(j)} = \mu, \end{cases} \tag{4}$$

where $\lambda^{(j)}$, the mean demand rate for facility $j$, is the summation of node-to-facility demand assignments to facility $j$, i.e., how many customers from all nodes visit facility $j \in S$ per unit time. Unfortunately, the exact quantity $\lambda^{(j)}$ is not known. We propose an iterative approximation procedure to calibrate the demand rates $\lambda^{(j)}$, $\forall j$, which we will use in expression (4). We refer to it as Demand Assignment and Calibration (DAC).

For a given set of facility locations and a specific demand node $i$, we designate the closest facility to $i$ as its first preferred. If two or more facilities are equally close to demand node $i$, we will choose the facility with the smallest index as its first preferred. Denote by $FP = \{f_1, \ldots, f_v\}$ the set of all facilities that can be chosen as the first preferred by some demand nodes.
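Equation (4) translates directly into a few lines of code. The sketch below is illustrative only; the function name `blocking_fraction` is an assumption, and the sample inputs are the first-leg demand rates reported in the paper's 10-node worked example ($c = 3$, $\mu = 0.3$).

```python
import math


def blocking_fraction(lam, mu, c):
    """Fraction of time an M/M/1/c facility holds c customers, per Equation (4)."""
    r = lam / mu
    if math.isclose(r, 1.0):
        # Degenerate case lambda = mu: all c + 1 states are equally likely.
        return 1.0 / (c + 1)
    return (r ** c) * (1.0 - r) / (1.0 - r ** (c + 1))


# First-leg rates from the worked example; the results should be close to
# the reported blocking fractions 0.258 92, 0.148 729 and 0.274 402.
rates = [0.307166, 0.220613, 0.319758]
fractions = [blocking_fraction(lam, mu=0.3, c=3) for lam in rates]
print(fractions)
```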
Notice that $v \le p$ since, as we mentioned in the last section, multiple facilities are allowed at a node. Therefore, it is possible that some facilities will never be chosen as the first preferred. A customer visiting her first preferred, say facility $f_k = j$, will follow the same route as all other customers whose first preferred is facility $f_k$. Let $SP_k = \{SP_k^1, \ldots, SP_k^p\}$ be the sequence of preferred facilities in which $SP_k^1 = f_k \in FP$ ($k = 1, \ldots, v$). Note that once $f_k$ is chosen, $SP_k$ is fully determined (i.e., among facilities $SP_k^t, \ldots, SP_k^p$, $SP_k^t$ is the closest facility to facility $SP_k^{t-1}$ for $t = 2, \ldots, p$). Then for a customer whose first preferred is $f_k$, her possible patronizing sequence is facility $f_k$ → facility $SP_k^2$ → · · · → facility $SP_k^p$. Of course, she may be served at some facility which is not at full capacity or just leave the system before visiting all other facilities in $SP_k$. Let $LP_k = \{LP_k^1, \ldots, LP_k^p\}$ be the set of locations corresponding to $SP_k$. We can use $SP_k$ and $LP_k$ to keep track of customers’ patronizing information.

As an example consider the network depicted in Fig. 1. Shortest distances and weights are given in Table 1. Suppose $p = 3$, $S = \{1, 2, 3\}$ and $L = \{7, 3, 9\}$. It is easy to verify that $FP = \{1, 2, 3\}$. We have $SP_1 = \{1, 3, 2\}$, $SP_2 = \{2, 3, 1\}$, and $SP_3 = \{3, 1, 2\}$; $LP_1 = \{7, 9, 3\}$, $LP_2 = \{3, 9, 7\}$ and $LP_3 = \{9, 7, 3\}$. A customer from any of nodes 1, 2, 3, 4 or 6 will have $SP_2 = \{2, 3, 1\}$ and will choose facility 2 (at node 3) as her first preferred. Ignoring distance sensitivity, she will obtain service if facility 2 is not at full capacity; otherwise, she will visit facility 3 (at node 9). If facility 3 is also at full capacity, she will attend facility 1 (at node 7).

Fig. 1. A 10-node example for using DAC.

Table 1. Shortest distance matrix and fraction of total population

Node       1       2       3       4       5       6       7       8       9      10   Weight
1          0   25.57   34.00   12.53   83.28   12.07   83.45  102.74   74.28   77.28   0.0767
2      25.57       0   36.10   37.47   85.38   13.49   85.55   77.17   48.72   79.38   0.048
3      34.00   36.10       0   43.02   49.28   32.37   49.45   73.08   44.25   43.28   0.0709
4      12.53   37.47   43.02       0   92.29   24.19   92.46  114.64   86.19   81.51   0.0578
5      83.28   85.38   49.28   92.29       0   81.64   19.49   59.76   57.40   16.14   0.07
6      12.07   13.49   32.37   24.19   81.64       0   81.81   90.66   62.21   75.65   0.0373
7      83.45   85.55   49.45   92.46   19.49   81.81       0   40.27   37.91   35.63   0.1909
8     102.74   77.17   73.08  114.64   59.76   90.66   40.27       0   28.83   75.90   0.1759
9      74.28   48.72   44.25   86.19   57.40   62.21   37.91   28.83       0   73.54   0.1881
10     77.28   79.38   43.28   81.51   16.14   75.65   35.63   75.90   73.54       0   0.0844

We refer to a tour traveled from a demand node or a facility to its next-preferred facility as a “leg”. There are no first-leg demands for those facilities not in $FP$. Define $\lambda_m^{(j)}$ as the demand rate originating from the first $m$ legs faced by facility $j$. We would like to estimate $\lambda_p^{(j)}$. To obtain $\lambda_p^{(j)}$ we will have to estimate $\lambda_m^{(j)}$, $m = 1, \ldots, p$, sequentially starting with $\lambda_1^{(j)}$. Therefore, we have:

$$\lambda_1^{(j)} = \begin{cases} \displaystyle\sum_{\{i \in N:\, S_1^{(i)} = j\}} w_i\, f\big(d\big(i, L_1^{(i)}\big)\big) & \text{if } j \in FP, \\[2ex] 0 & \text{if } j \notin FP. \end{cases} \tag{5}$$

Given $\lambda_1^{(j)}$, we compute an estimate for $\rho_1^{(j)}$ using Equation (4). In order to estimate $\lambda_2^{(j)}$, we need to consider all the demands visiting facility $j$ as their first preferred one and also consider all the demands visiting facility $j$ as their second one, i.e.,

$$\lambda_2^{(j)} = \lambda_1^{(j)} + \sum_{\{k:\, SP_k^2 = j\}} \lambda_1^{(SP_k^1)}\, \rho_1^{(SP_k^1)}\, f\big(d\big(LP_k^1, LP_k^2\big)\big), \qquad j = 1, \ldots, p. \tag{6}$$

As can be seen in Equation (6), the demands that will visit facility $j$ as the second preferred one must have visited their most-preferred one first, found it full and been willing to travel the distance to the second-most-preferred facility. Once we have $\lambda_2^{(j)}$, it is easy to obtain $\rho_2^{(j)}$ using Equation (4). Repeating the process, we have the estimate of $\lambda_m^{(j)}$ ($m = 2, \ldots, p$) as follows:

$$\lambda_m^{(j)} = \lambda_{m-1}^{(j)} + \sum_{\{k:\, SP_k^m = j\}} \lambda_1^{(SP_k^1)} \prod_{n=1}^{m-1} \rho_n^{(SP_k^n)}\, f\big(d\big(LP_k^n, LP_k^{n+1}\big)\big), \qquad j = 1, \ldots, p, \; m = 2, \ldots, p. \tag{7}$$

The procedure above gives us approximations to the true demand rates $\lambda_p^{(j)}$ and blocking fractions $\rho_p^{(j)}$ for all $j$. This is a first-round approximation which we will refine later. Note that if $\rho_1^{(j)}$ were the true blocking fraction for facility $j$, then $\lambda_2^{(j)}$ from Equation (6) would be the true demand rate obtained from the first two legs faced by facility $j$. Based on this observation, since we would like to obtain steady-state measures, our second round starts with the results returned by the first-round approximation, i.e., $\rho_1^{(j)}$ of the second round is set to be equal to $\rho_p^{(j)}$, $\forall j$, obtained at the end of the first round. Again Equation (7) is used to find $\lambda_2^{(j)}$ and then Equation (4) is used to find $\rho_2^{(j)}$, and we proceed in this way using Equations (7) and (4) to find the values of $\lambda_m^{(j)}$ and $\rho_m^{(j)}$ for all $j$ and $2 \le m \le p$, to finalize the second-round approximation.

Similarly, if $\rho_1^{(j)}$ and $\rho_2^{(j)}$ were the real blocking fractions for facility $j$, $\lambda_3^{(j)}$ from Equation (7) would be the true demand rate obtained from the first three legs. Therefore, at the third round, we set $\rho_1^{(j)}$ and $\rho_2^{(j)}$ to be equal to $\rho_p^{(j)}$ of the second round and apply Equations (7) and (4) to obtain $\lambda_3^{(j)}$ and $\rho_3^{(j)}$ sequentially. We perform the same process until our algorithm is terminated at the $p$th round. Given facility location pair $(S, L)$, the DAC algorithm can be stated as follows:

Procedure DAC
Step 1. Derive $FP$, $SP_k$, $LP_k$, where $k = 1, \ldots, |FP|$ (cardinality of $FP$).
Step 2. Set $i := 1$, $m := 2$. Calculate $\lambda_1^{(j)}$ using Equation (5) and $\rho_1^{(j)}$ using Equation (4), $\forall j$.
Step 3. Calculate $\lambda_m^{(j)}$ using Equation (7) and $\rho_m^{(j)}$ using Equation (4), $\forall j$.
Step 4. If $m < p$, set $m := m + 1$ and do Step 3 until $m = p$.
Step 5. Set $\rho_1^{(j)} := \rho_p^{(j)}, \ldots, \rho_i^{(j)} := \rho_p^{(j)}$, $\forall j$.
Step 6. If $i < p$, set $i := i + 1$, $m := i$ and go to Step 3; otherwise $\rho^{(j)} := \rho_p^{(j)}$, $\forall j$.

Now let us reconsider the example depicted in Fig. 1. We assume $c = 3$, $\mu = 0.3$, $d_{\max} = 114.64$ (maximum among all shortest paths) and the parameter of the decay function $\alpha = 1$. Consider the facility location vector (7, 3, 9). Again, we have $FP = \{1, 2, 3\}$, $SP_1 = \{1, 3, 2\}$, $SP_2 = \{2, 3, 1\}$ and $SP_3 = \{3, 1, 2\}$. The corresponding locations are $LP_1 = \{7, 9, 3\}$, $LP_2 = \{3, 9, 7\}$ and $LP_3 = \{9, 7, 3\}$. Set $i = 1$. Step 2 of Procedure DAC calculates $\lambda_1^{(j)}$ for $j \in \{1, 2, 3\}$. Recall that nodes {1, 2, 3, 4, 6} are closer to facility 2 than to any other facility. If we compute for each one of these nodes the weight times the fraction of demand that will travel to facility 2 and add the results, we obtain $\lambda_1^{(2)}$ = 0.220 613. Using Equation (5) we find $\lambda_1$ = {0.307 166, 0.220 613, 0.319 758} and, using Equation (4), $\rho_1$ = {0.258 92, 0.148 729, 0.274 402}. In Step 3, with $m = 2$, using Equations (7) and (4) we find $\lambda_2$ = {0.365 896, 0.220 613, 0.393 141} and $\rho_2$ = {0.328 588, 0.148 729, 0.358 458}. In Step 4, $m = 3$; back in Step 3 we find $\lambda_3$ = {0.370 731, 0.243 306, 0.393 141} and $\rho_3$ = {0.334 009, 0.177 685, 0.358 458}. Step 5 fixes the first-leg blocking fraction to $\rho_1$ = {0.334 009, 0.177 685, 0.358 458}, which was the last blocking fraction found in iteration 1 ($i = 1$). After Step 3 is performed two times, for $m = 2$ and 3, we have new values for the vectors $\lambda_3$ = {0.389 988, 0.251 786, 0.412 502} and $\rho_3$ = {0.355 087, 0.188 587, 0.378 665}. Avoiding unnecessary details, the final result is $\rho = \rho_3$ = {0.360 103, 0.191 375, 0.384 562}. The captured demand is then calculated using the objective function, which returns a value of 71.8%. A natural question is: how close is the output of Procedure DAC to the result obtained by simulation? The simulation output is 70.5%, a relative error of less than 2%.

Procedure DAC has two major virtues. First, it yields very close approximations to the true blocking fractions (numerical computations show us, as reported later in this paper, that absolute values of the differences between estimates for blocking fractions obtained by the DAC procedure and those obtained by simulation are very small). Second, it achieves computational efficiency. Given the shortest-distance matrix and facility locations, finding the closest facility for each node requires $p - 1$ comparisons. After finding the closest facility, it takes at most $p - 1$ comparisons to check whether or not this facility is in $FP$. Therefore, $FP$ can be obtained in $O(np)$ time. Since $SP_k^1 = f_k$ for any $k \in \{1, \ldots, v\}$, it takes $p - 1$ comparisons to obtain $SP_k^2$. Therefore, obtaining $SP_k$ for any $k$ needs $(p - 1) + (p - 2) + \cdots + 2 = (p - 2)(p + 1)/2$ comparisons. In other words, we can get $SP_k$ for all $k$ in $O(p^3)$ effort. Actually, if we sort the distances among the $p$ facilities first, we can obtain $SP_k$ even faster. However, as can be seen in the following, this will not affect the overall complexity. In Step 2, $\lambda_1^{(j)}$ for all $j \in FP$ can be calculated in $O(np)$ time and $\rho_1^{(j)}$ for all $j$ can be calculated in $O(p)$ time for a given $c$. For given $m$ and $j$, Equation (7) can be obtained in $O(p^2)$. Thus, Step 3 takes $O(p^3)$ time (the complexity of calculating Equation (4) is dominated by Equation (7)). Both Steps 4 and 6 call Step 3 again, so the total effort on Equation (7) is $O(p^5)$. The overall complexity of Procedure DAC is $\max\{O(p^5), O(np)\}$. Unless $p$ is large, the algorithm is quite fast.
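The calibration rounds can be sketched in code on the 10-node example. This is one possible reading of Procedure DAC rather than the authors' implementation: it assumes that in round $i$ the rates of the earlier legs are re-evaluated from Equation (7) with the frozen blocking fractions, a reading that appears to reproduce the figures reported above. The names `dac`, `blocking`, `D` and `W` are illustrative, and nodes and facilities are 0-based.

```python
import math

# Shortest distances and weights from Table 1 (node k stored at index k - 1).
D = [
    [0, 25.57, 34.00, 12.53, 83.28, 12.07, 83.45, 102.74, 74.28, 77.28],
    [25.57, 0, 36.10, 37.47, 85.38, 13.49, 85.55, 77.17, 48.72, 79.38],
    [34.00, 36.10, 0, 43.02, 49.28, 32.37, 49.45, 73.08, 44.25, 43.28],
    [12.53, 37.47, 43.02, 0, 92.29, 24.19, 92.46, 114.64, 86.19, 81.51],
    [83.28, 85.38, 49.28, 92.29, 0, 81.64, 19.49, 59.76, 57.40, 16.14],
    [12.07, 13.49, 32.37, 24.19, 81.64, 0, 81.81, 90.66, 62.21, 75.65],
    [83.45, 85.55, 49.45, 92.46, 19.49, 81.81, 0, 40.27, 37.91, 35.63],
    [102.74, 77.17, 73.08, 114.64, 59.76, 90.66, 40.27, 0, 28.83, 75.90],
    [74.28, 48.72, 44.25, 86.19, 57.40, 62.21, 37.91, 28.83, 0, 73.54],
    [77.28, 79.38, 43.28, 81.51, 16.14, 75.65, 35.63, 75.90, 73.54, 0],
]
W = [0.0767, 0.048, 0.0709, 0.0578, 0.07, 0.0373, 0.1909, 0.1759, 0.1881, 0.0844]


def blocking(lam, mu, c):
    """Equation (4): blocking fraction of an M/M/1/c queue."""
    r = lam / mu
    if math.isclose(r, 1.0):
        return 1.0 / (c + 1)
    return r ** c * (1 - r) / (1 - r ** (c + 1))


def dac(D, W, loc, mu, c, alpha, dmax):
    """Return (rho, V) for facilities placed at the nodes in `loc`."""
    p, n = len(loc), len(W)

    def f(d):
        return max(1.0 - d / (alpha * dmax), 0.0)

    # Step 1: first-preferred facility per node (ties go to the smallest index)
    # and the visiting sequence SP_k that starts at each such facility.
    first = [min(range(p), key=lambda j: D[i][loc[j]]) for i in range(n)]
    seqs = {}
    for k in set(first):
        seq = [k]
        while len(seq) < p:
            rest = [j for j in range(p) if j not in seq]
            seq.append(min(rest, key=lambda j: D[loc[seq[-1]]][loc[j]]))
        seqs[k] = seq
    # Step 2, Equation (5): first-leg demand rates.
    lam1 = [0.0] * p
    for i in range(n):
        lam1[first[i]] += W[i] * f(D[i][loc[first[i]]])
    final = None
    for rnd in range(1, p + 1):          # calibration rounds i = 1..p
        lam, rho = lam1[:], []           # rho[m - 1][j] is used for leg m
        for m in range(1, p + 1):
            if m >= 2:                   # Equation (7): demand arriving on leg m
                for k, seq in seqs.items():
                    flow = lam1[k]
                    for t in range(m - 1):
                        a, b = seq[t], seq[t + 1]
                        flow *= rho[t][a] * f(D[loc[a]][loc[b]])
                    lam[seq[m - 1]] += flow
            # Legs before the current round keep the previous round's estimate.
            rho.append(final if m < rnd else [blocking(x, mu, c) for x in lam])
        final = rho[-1]
    # Objective via Equation (2), grouped by first-preferred facility.
    V = 0.0
    for k, seq in seqs.items():
        reach = lam1[k]
        for t, j in enumerate(seq):
            if t > 0:
                reach *= f(D[loc[seq[t - 1]]][loc[j]])
            V += reach * (1 - final[j])
            reach *= final[j]
    return final, V


# Facilities 1, 2, 3 at nodes 7, 3, 9; compare with the reported
# rho = {0.360 103, 0.191 375, 0.384 562} and captured demand of 71.8%.
rho, V = dac(D, W, [6, 2, 8], mu=0.3, c=3, alpha=1.0, dmax=114.64)
print(rho, V)
```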

4. Heuristics

As mentioned earlier, (P1) is very difficult to solve. In this section we take advantage of the precision of Procedure DAC in two main heuristic procedures: a Greedy Heuristic (GH) and a Parametric Heuristic (PH). We also present a Randomized Heuristic (RH) for benchmark purposes. RH randomly generates a given number of location sets for the p facilities, evaluates each one of them using Procedure DAC and chooses the one that maximizes the captured demand. Next we describe GH and PH.

4.1. The GH

This heuristic is simple and efficient. We first consider a single facility location. Once we find the best single location, say node $i \in N$, using complete enumeration, we locate all remaining facilities at node $i$ and calculate the objective function value using Procedure DAC. Now we remove one facility from node $i$ and insert it at a node that improves the objective function value the most. If there is no improvement, we leave the facility at node $i$ and stop the algorithm; otherwise, we remove another single facility from node $i$, repeating the procedure until either there is no improvement or there are no facilities left at node $i$.

Procedure GH
Step 1. Using complete enumeration obtain the best location for a single facility. Denote this node by $i$.
Step 2. Set $S := \{1, 2, \ldots, p\}$, $L := \{i, i, \ldots, i\}$. Call Procedure DAC to approximate $\lambda$ and $\rho$, and calculate the objective function value, denoted by $V$, using Equations (2) and (3) where $(S, L)$ is the input parameter. Set $j := 1$, $k := 1$, $L_0 := L$.
Step 3. Call Procedure DAC and obtain the objective function value using Equations (2) and (3) where $(S, L_{k-1} \setminus \{i\} \cup \{j\})$ is the input parameter. Call it $V(S, L_{k-1}, j)$.
Step 4. If $j < n$, then set $j := j + 1$ and go back to Step 3.
Step 5. Denote by $V_k = \max_{j \in N} \{V(S, L_{k-1}, j)\}$ the best objective function value returned by Step 3 and Step 4, and denote by $j^* = \arg\max_{j \in N} \{V(S, L_{k-1}, j)\}$.
Step 6. If $V_k = V$, i.e., $j^* = i$, stop; $L_{k-1}$ is the “optimal” solution. Otherwise, set $L_k := L_{k-1} \setminus \{i\} \cup \{j^*\}$ and $V := V_k$.
Step 7. If $k < p$, then set $j := 1$, $k := k + 1$ and go back to Step 3. Otherwise, $L_p$ is the “optimal” solution.

Now let us consider again the example depicted in Fig. 1, with no changes in the problem parameters. The GH starts by choosing the best single location in Step 1 by complete enumeration. The best location turns out to be node 7. Step 2 places all three facilities at that node ($L$ = (7, 7, 7)) and $V$ = 0.6383. In a series of passes through Steps 3 and 4 the location of the third facility is evaluated for all possible locations in $N$. The best objective function value, attained by the location vector (7, 7, 3), is returned in Step 5. Step 6 records this solution as the best tentative solution; facility 3 is fixed at node 3 and $V$ = 0.7089. The algorithm returns to Steps 3 and 4, where the location of the second facility is evaluated for all possible locations in $N$. Step 5 records the best location for the second facility, in this example (7, 9, 3), and Step 6 includes it as the best current solution with $V$ = 0.7180. In the final evaluation, when we test the location of facility 1, it is found that we cannot do better by moving that facility from node 7. Thus, the GH returns (7, 9, 3) with a captured demand of 71.8%. The actual optimal objective function value for this example obtained by complete enumeration is 72.2%, achieved by the location vector {7, 9, 6}.
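The structure of Procedure GH can be sketched as follows. This is a simplified illustration, not the authors' code: `evaluate` is a hypothetical callable standing in for Procedure DAC combined with Equations (2) and (3), it is assumed to accept location lists of any length, nodes are labelled from 0, and the toy objective in the usage lines is invented purely to exercise the loop.

```python
def greedy_heuristic(n, p, evaluate):
    """Greedy location search in the spirit of Procedure GH.

    n        -- number of candidate nodes, labelled 0..n-1
    p        -- number of facilities
    evaluate -- hypothetical callable: list of nodes -> objective value
    """
    # Step 1: best node for a single facility, by complete enumeration.
    i = max(range(n), key=lambda node: evaluate([node]))
    # Step 2: start with all p facilities stacked at node i.
    current = [i] * p
    best = evaluate(current)
    for _ in range(p):
        base = current.copy()
        base.remove(i)  # take one facility away from node i
        # Steps 3-5: try every node for the facility that was removed.
        j_star = max(range(n), key=lambda j: evaluate(base + [j]))
        value = evaluate(base + [j_star])
        # Step 6: stop when moving the facility brings no improvement.
        if j_star == i or value <= best:
            break
        current, best = base + [j_star], value
    return current, best


# Toy objective (an assumption): each distinct occupied node adds a fixed value.
node_value = [1, 5, 3, 2]
locations, objective = greedy_heuristic(
    4, 3, lambda L: sum(node_value[v] for v in set(L))
)
print(sorted(locations), objective)
```

Each pass costs $n$ objective evaluations, so at most $np$ calls to the evaluation routine follow the initial enumeration.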

4.2. The PH

Recall that:

$$f(d(i, L)) = \max\left\{0,\; 1 - \frac{d(i, L)}{\alpha d_{\max}}\right\},$$

where $L$ is the location set of facilities and $d(i, L) = \min_{j \in L}\{d(i, j)\}$. Suppose that the first-leg demand rate, $\lambda_1^{(j)}$, is significantly smaller than the service rate for each $j$. Then only a small fraction of customers would travel to the second facility and thus traveling to more than one facility can be ignored. In this case, the solution of the k-median problem should provide a good solution to our problem. However, for sufficiently large values of $d_{\max}$, if $\lambda_1^{(j)}$ is significantly larger than the service rate for each $j$, most customers would be willing to travel long distances and thus intuitively it would make sense to locate all facilities at the 1-median solution, which minimizes the total distance traveled.

Based on this intuitive reasoning we developed a heuristic that works as follows: first, we solve the p-median problem with weights $w_i (1 - f(d(i, L)))$. Let $L$ be the obtained location set. Then we locate the servers according to $L$ and calculate $V_p = V(L)$. Next we solve a $(p - 1)$-median problem, locate the servers as the solution indicates and locate the extra server at the location with the largest blocking fraction among the $p - 1$ ones. If the cost of this new solution, $V_{(p-1)}$, is less than $V_p$, we continue by solving a $(p - 2)$-median problem, etc. The process is repeated until either no improvement is made or there are no more problems to solve. The best solution is recorded and accepted as the solution to the PH.

The intuition behind this procedure is that, instead of considering the blocking fractions as functions of location, we are implicitly estimating the “blocking fraction” as a parameter. That is, when $p$ distinct locations are considered (the solution of the p-median) we are implicitly assuming that there is a low blocking fraction so no more than one facility location is visited. When we consider a solution of $(p - j)$ distinct facility locations for some $0 < j < p$, we implicitly assume that more than one facility will be visited before obtaining service, thus we want to increase the “reliability” of the solution by adding the extra $j$ facilities to the location with the largest blocking fraction. As we shall see in the following, this procedure works quite well with some values of $d_{\max}$ because the procedure ignores customers’ sensitivity to distance.

We now state the integer programming formulation used in the PH. The decision variables in the model are

$$x_{ij} = \begin{cases} 1 & \text{if node } i \text{ is served by the facility at node } j, \\ 0 & \text{otherwise,} \end{cases} \qquad y_j = \begin{cases} 1 & \text{if a facility is located at node } j, \\ 0 & \text{otherwise.} \end{cases}$$

We formulate an integer program as follows: (MP)

max

n n

ωi (1 − f (d(i, j)))xij ,

i=1 j=1

subject to

n

xij = 1,

i = 1, · · · , n,

(8)

j=1

xij ≤ yj , i, j = 1, · · · , n, n yj = k,

(9) (10)

j=1

xij , yj ∈ {0, 1},

i, j = 1, · · · , n, (11)

where $k$ (equal to $p - j + 1$ in the $j$th iteration) is the number of facilities. Formulation (MP) corresponds to the classical k-median problem where we use minimization with "distance" $f(d(i, j))$ instead of $d(i, j)$. Therefore, we can use any heuristic for the k-median problem to solve (MP). In this paper we use a Lagrangian Relaxation Heuristic (LRH) to solve (MP) (see Narula et al. (1977) and Daskin (1995) for more details). We note that Lagrangian relaxation is one of the most computationally attractive heuristics for the k-median problem. Lagrangian relaxation using a subgradient optimization method gives a very tight lower bound most of the time (within 0.15%; see Daskin (1995)).

Before presenting the PH algorithm we want to show that using only (MP) can deliver a very bad solution to (P1). Let $V^H$ be the best objective function value among the GH, the PH and the RH. Define $RE = (V^H - V^M)/V^H$, where $V^M$ is the objective function value returned when the solution to (MP) is applied.

Proposition 1. Let $IN$ be the set of all possible instances of (P1). Then

$$RE^{SUP} = \sup\{RE[I] \mid I \in IN\} = 1. \tag{12}$$

Proof. Obviously, $RE$ cannot be greater than unity. Consider a network of size $n$ with $c = 1$ and $\alpha \to 0$, implying that customers are very sensitive to distance and are not willing to travel farther than their own home nodes. Also let $p = n$, $w_i = \delta$ for $i \in \{2, \dots, n\}$ and $w_1 = 1 - (n - 1)\delta$. It is easy to see that the solution to (MP) will be $\{1, \dots, n\}$ for any choice of $\mu$. Thus, the captured demand in this system is given by (recall that $\lambda = 1$):

$$\delta (n - 1) \frac{\mu}{\delta + \mu} + (1 - (n - 1)\delta) \frac{\mu}{(1 - (n - 1)\delta) + \mu} \;\to\; \frac{\mu}{1 + \mu} \quad \text{as } \delta \to 0.$$

This limit is independent of the number of nodes in the network. However, when $\delta \to 0$, the greedy solution is to locate all servers on node 1. Suppose further that $n \to \infty$, implying that the M/M/∞/∞ queueing system will capture every demand even for a small positive value of $\mu$. Thus, the relative error for this specific instance is given by

$$RE = 1 - \frac{\mu}{1 + \mu} \quad \text{as } \delta \to 0 \text{ and } n \to \infty.$$

It follows that as $\mu \to 0$, $RE \to 1$, as claimed in Equation (12) above.

The result above shows that for an unbalanced network (nodes with high variability of weights) and customers who are highly sensitive to travel, the use of the classical approach to location problems can deliver a poor solution when the objective is to maximize captured demand under a queueing setting. However, if each facility has a large capacity in comparison to the total demand, then the two problems become equivalent. Now we are ready to state the PH.

Procedure PH

Step 1. Set $k := p + 1$, $V = 0$, $L^* = \emptyset$.
Step 2. Set $k := k - 1$. If $k = 0$, stop.
Step 3. Call Procedure LRH to solve (MP) and obtain the solution $\hat{X} = (\hat{x}_1, \dots, \hat{x}_k)$ (may not be optimal).
Step 4. Set $m := k$. Set $S := \{1, 2, \dots, m\}$ and $L = \hat{X}$. If $m = p$, go to Step 7; otherwise repeat Step 5 and Step 6 until $m = p$.
Step 5. Call Procedure DAC to approximate $\lambda^{(j)}$ and $\rho^{(j)}$ ($j = 1, \dots, m$) using $(S, L)$ as the input parameters.
Step 6. Suppose that $\rho^{(j')}$ is the largest among $\rho^{(j)}$ ($j = 1, \dots, m$), i.e., facility $j'$ is the one with the largest blocking fraction (ties are broken arbitrarily). Set $m := m + 1$, $S = S \cup \{m\}$, $L = L \cup \{\hat{x}_{j'}\}$.
Step 7. Using the location pair $(S, L)$, call Procedure DAC, and compute the objective function value $V_k$ using Equations (2) and (3). If $V_k > V$, set $V = V_k$ and $L^* = L$. Go to Step 2. Otherwise, stop.

4.3. An upper bound

The LRH works by adjusting the Lagrangian multipliers to narrow the gap between the lower and upper bounds. Define $V^-$ to be the objective function value of the Lagrangian dual problem when $k = p$ in Procedure PH. Define $\rho(a, p)$ to be the blocking fraction, i.e., the proportion of time an M/M/p/cp queue has $cp$ customers, when customers arrive at rate $a$. Then we have the following proposition.

Proposition 2. The function $(1 - \rho(V^-, p))$ is an upper bound for the maximum captured demand.

Proof. The upper bound is obtained from the following two observations:

1. All customers, in order to obtain service, have to travel at least to their closest facility. Therefore, $V^-$ is an upper bound on the amount of demand that will arrive at the first preferred facility, because we do not consider any leg traveled other than the first.
2. If there is no cost to travel from the first facility to the second facility and onwards, no additional demand is lost because of distance sensitivity, and we know that the actual queueing system cannot be more efficient in capturing demand than an M/M/p/cp system.

These two observations together complete the proof.

As will be shown in the following section, both heuristics perform well when evaluated using the upper bound developed above. We will also show that the PH is not as good as the GH in terms of time efficiency for large-scale problems. However, in terms of solution quality, although in most cases the greedy procedure provides a superior solution, there is no dominance.
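To make the blocking fraction $\rho(a, p)$ of Proposition 2 concrete, the following is a minimal Python sketch of the standard stationary-distribution calculation for an M/M/p/K queue with K = cp. The function name `blocking_fraction`, the explicit arguments `mu` and `c` (left implicit in the notation above), and the reading of cp as total system capacity (customers in service plus those waiting) are assumptions made for illustration; this is not the authors' implementation.

```python
from math import factorial


def blocking_fraction(a, mu, p, c):
    """Probability that an M/M/p/K queue is full, with K = c * p.

    a  : arrival rate
    mu : service rate of each server (assumed identical servers)
    p  : number of servers
    c  : capacity per server, so the system holds at most c * p customers
         (assumption: this count includes customers in service)
    """
    K = c * p
    r = a / mu
    weights = []
    for n in range(K + 1):
        if n <= p:
            # Up to p customers: every customer is in service.
            weights.append(r ** n / factorial(n))
        else:
            # More than p customers: all p servers are busy.
            weights.append(r ** n / (factorial(p) * p ** (n - p)))
    # Unnormalized stationary weight of the full state over the total.
    return weights[-1] / sum(weights)


# Fraction of arriving demand accepted by the pooled system.
acceptance = 1.0 - blocking_fraction(a=1.0, mu=1.0, p=2, c=1)
```

With p = 1 and c = 1 the expression reduces to a/(a + µ), which is consistent with the single-facility acceptance probability µ/(δ + µ) used in the proof of Proposition 1.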

5. Computational experiments

In order to test the two heuristic algorithms, an extensive set of computational experiments was conducted. All runs were performed on a PC equipped with a 677 MHz processor and 128 MB of RAM. The procedures were coded in ANSI C. The problem data used in the experiments were generated randomly as follows. The Cartesian coordinates of the nodes were generated uniformly over the interval (0, 100). Then nodes were connected randomly until a tree was formed. Finally, a random number of links (a random integer between zero and $n(n - 1)/2 - (n - 1)$, where $n$ is the cardinality of the network) was added to the tree to create a network. All demand weights were generated randomly over the interval (0, 1). The length of each link was calculated using the Euclidean distance formula. For all problem instances, we ensured that no two instances shared a common random seed.

First, we investigate how good Procedure DAC is compared to simulation. As described above, we randomly generated 180 instances each for the 30-node and 50-node networks. For each instance we generated a random solution and computed the objective function value obtained using DAC and the analogous value obtained using simulation. The results are shown in Tables 2 and 3. The column "Relative error" gives intervals of the difference between the objective function values obtained using DAC and simulation, divided by the latter. The column "Frequency" is the number of solutions within the class.
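The instance-generation procedure described at the start of this section can be sketched as follows. This is an illustrative Python sketch, not the authors' ANSI C code: the function name `generate_instance`, the specific way the random tree is grown (attaching each node to a randomly chosen node already in the tree), and the return format are assumptions, since the text does not specify them.

```python
import math
import random


def generate_instance(n, seed):
    """Random network: uniform coordinates, a random tree, then extra links."""
    rng = random.Random(seed)  # one distinct seed per instance

    # Node coordinates on (0, 100) x (0, 100) and demand weights on (0, 1).
    coords = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(n)]
    weights = [rng.random() for _ in range(n)]

    # Random spanning tree (assumed scheme): take the nodes in random order
    # and link each one to a random node that is already in the tree.
    order = list(range(n))
    rng.shuffle(order)
    edges = set()
    for idx in range(1, n):
        u = order[idx]
        v = order[rng.randrange(idx)]
        edges.add((min(u, v), max(u, v)))

    # Add a random number of extra links, between 0 and n(n-1)/2 - (n-1).
    max_extra = n * (n - 1) // 2 - (n - 1)
    extra = rng.randint(0, max_extra)
    candidates = [(u, v) for u in range(n) for v in range(u + 1, n)
                  if (u, v) not in edges]
    edges.update(rng.sample(candidates, extra))

    # Link lengths are Euclidean distances between endpoints.
    lengths = {(u, v): math.dist(coords[u], coords[v]) for (u, v) in edges}
    return coords, weights, lengths
```

Because the tree is built first, every generated network is connected regardless of how many extra links are drawn.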

Table 2. Comparing DAC with simulation for the 30-node network

| Relative error (%) | Frequency | Cumulative percentage (%) |
|---|---|---|
| [0, 0.1280] | 124 | 68.89 |
| (0.1280, 0.2559] | 18 | 78.89 |
| (0.2559, 0.3839] | 12 | 85.56 |
| (0.3839, 0.5118] | 6 | 88.89 |
| (0.5118, 0.6398] | 5 | 91.67 |
| (0.6398, 0.7677] | 2 | 92.78 |
| (0.7677, 0.8957] | 2 | 93.89 |
| (0.8957, 1.0236] | 5 | 96.67 |
| (1.0236, 1.1516] | 3 | 98.33 |
| (1.1516, 1.2795] | 1 | 98.89 |
| (1.2795, 1.4075] | 2 | 100.00 |

Table 3. Comparing DAC with simulation for the 50-node network

| Relative error (%) | Frequency | Cumulative percentage (%) |
|---|---|---|
| (0, 0.2373] | 151 | 83.89 |
| (0.2373, 0.3559] | 6 | 87.15 |
| (0.3559, 0.4745] | 9 | 92.18 |
| (0.4745, 0.5932] | 3 | 93.85 |
| (0.5932, 0.7118] | 0 | 93.85 |
| (0.7118, 0.8304] | 3 | 95.53 |
| (0.8304, 0.9491] | 1 | 96.09 |
| (0.9491, 1.0677] | 3 | 97.77 |
| (1.0677, 1.1863] | 1 | 98.32 |
| (1.1863, 1.3050] | 3 | 100.00 |

As we can see from the tables, about 97% of the solutions have a relative error within 1.00%. The results are sufficiently good to conclude that Procedure DAC works very well.

In Tables 4, 5 and 6, we compare the PH, the GH and the RH. In the RH we randomly generate 100 locations of $p$ facilities on the network. The service rate $\mu$ varies according to $|N|$ and $p$ to avoid some extreme cases that add no value to the experiment. Table 4 contains computational results for $|N| = 30$. Ten instances were generated for each combination of $p \in \{4, 7, 10\}$, $c \in \{2, 5\}$ and $\alpha \in \{0.15, 0.3\}$. For the ten instances from each combination, we obtain the Average Number of distinct Locations (ANL), the Average Objective Function Value (AOFV), the average computational time (labeled as Time in the tables), the fraction of the ten instances for which each heuristic gives the best solution among the three heuristics (labeled as Best in the tables), and the worst-case ratio $(Z_U - Z_H)/Z_U$ (Ratio), where $Z_U$ is the upper bound of the optimal objective function value and $Z_H$ is the objective function value of heuristic H. We note that even though ANL is very close to $p$ in most instances in Tables 4-6, this is not the case in general. For networks with large variability of weights, ANL can be much smaller than $p$ (Proposition 1 shows an extreme case of this). Therefore, using (MP) as a heuristic might be dangerous. Moreover, solving (MP) is the starting point of the PH.

We repeat similar experiments for $|N| = 50$ (Table 5) with combinations of $p \in \{5, 10, 15\}$, $c \in \{2, 5\}$ and $\alpha \in \{0.1, 0.25\}$. For $|N| = 80$ (Table 6), we limit $p$ to $p = 8$ with combinations of $c \in \{2, 5\}$ and $\alpha \in \{0.05, 0.1\}$.

Overall, the GH and the PH outperform the RH in terms of worst-case ratio, which is the main performance criterion of interest. The average worst-case ratio of the RH is 0.20, 0.23 and 0.26 for $|N| = 30$, 50 and 80, respectively. For $|N| = 30$ and $|N| = 80$, the PH and the GH have the same average worst-case ratio of 0.11 and 0.06, respectively. This suggests that both heuristics perform even better for a larger network. For $|N| = 50$, the average worst-case ratio of the GH is 0.10, which is slightly better than the value of 0.11 recorded for the PH. Note that when $\mu$ increases, the worst-case ratios of the PH and the GH get smaller. When $\mu$ increases, the system is overall less congested and thus tends to accept more customers who are also accepted by the M/M/p/cp system. In this case, a better objective function value is obtained by spreading facilities over different nodes and thus accepting more customers who might be sensitive to distances.

The RH takes the least computational time and yields the smallest objective function values. The GH outperforms the PH and is highly efficient in terms of computational time. It is interesting to observe that for $|N| = 50$ and $p = 15$, the GH takes more time than for $|N| = 80$ and $p = 8$, which shows that computational time is primarily affected by $p$. For $|N| = 80$, the GH is much more efficient than the PH.
Table 7 addresses the final question: what is the relative error between the GH and PH solution values (using DAC) and the actual optimal solution values, derived by complete enumeration where the Objective Function Value (OFV) is calculated by simulation? The experiments were done on a small random network of 12 nodes, which enables us to solve problems by complete enumeration. The results are encouraging: the GH returns the optimal solution in more than 50% of the cases, and the Relative Errors (RE) are small. The PH returns the optimal solution about 20% of the time, also with small relative errors (the average is even smaller than for the GH).

6. Additional formulations

We now change our focus from (P1) to problems that address, in some sense, the same objective of capturing demand, but assume a different customer behavior. By looking at different customer behavior we can obtain several interesting insights. In (P2), which has a formulation similar to that of (P1), we assume that for a customer from $i$ who is currently visiting facility $k$ the probability that facility $j$ will be visited next is

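Table 4. Computational results for |N| = 30

| # | p | µ | c | α | PH ANL | PH AOFV | PH Time | PH Best | PH Ratio | GH ANL | GH AOFV | GH Time | GH Best | GH Ratio | RH ANL | RH AOFV | RH Time | RH Best | RH Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 0.13 | 2 | 0.15 | 4 | 0.26 | 5.53 | 0.7 | 0.20 | 4 | 0.26 | 0.21 | 1 | 0.20 | 3.9 | 0.24 | 0.19 | 0 | 0.26 |
| 2 | 4 | 0.13 | 2 | 0.3 | 3.4 | 0.34 | 2.94 | 0.2 | 0.23 | 4 | 0.35 | 0.20 | 0.9 | 0.21 | 3.9 | 0.34 | 0.17 | 0.1 | 0.24 |
| 3 | 4 | 0.13 | 5 | 0.15 | 4 | 0.32 | 4.72 | 0.7 | 0.08 | 4 | 0.33 | 0.19 | 1 | 0.07 | 4 | 0.29 | 0.16 | 0 | 0.19 |
| 4 | 4 | 0.13 | 5 | 0.3 | 3.9 | 0.41 | 3.90 | 0.3 | 0.14 | 4 | 0.42 | 0.19 | 0.9 | 0.13 | 4 | 0.39 | 0.16 | 0 | 0.17 |
| 5 | 4 | 0.25 | 2 | 0.15 | 4 | 0.33 | 4.23 | 0.9 | 0.10 | 4 | 0.33 | 0.18 | 0.9 | 0.10 | 4 | 0.30 | 0.16 | 0 | 0.19 |
| 6 | 4 | 0.25 | 2 | 0.3 | 3.9 | 0.44 | 2.98 | 0.4 | 0.15 | 4 | 0.45 | 0.18 | 0.6 | 0.14 | 4 | 0.41 | 0.16 | 0 | 0.20 |
| 7 | 4 | 0.25 | 5 | 0.15 | 4 | 0.36 | 3.68 | 0.9 | 0.01 | 4 | 0.36 | 0.18 | 1 | 0.01 | 4 | 0.31 | 0.16 | 0 | 0.14 |
| 8 | 4 | 0.25 | 5 | 0.3 | 4 | 0.50 | 3.00 | 0.9 | 0.03 | 4 | 0.50 | 0.19 | 0.6 | 0.04 | 4 | 0.46 | 0.16 | 0 | 0.12 |
| 9 | 4 | 0.38 | 2 | 0.15 | 4 | 0.32 | 4.56 | 0.9 | 0.04 | 4 | 0.32 | 0.19 | 1 | 0.04 | 4 | 0.29 | 0.16 | 0 | 0.14 |
| 10 | 4 | 0.38 | 2 | 0.3 | 4 | 0.46 | 3.38 | 0.7 | 0.09 | 4 | 0.46 | 0.18 | 0.7 | 0.09 | 4 | 0.42 | 0.16 | 0.1 | 0.15 |
| 11 | 4 | 0.38 | 5 | 0.15 | 4 | 0.38 | 4.51 | 1 | 0.00 | 4 | 0.38 | 0.19 | 1 | 0.00 | 4 | 0.33 | 0.16 | 0 | 0.12 |
| 12 | 4 | 0.38 | 5 | 0.3 | 4 | 0.52 | 3.39 | 1 | 0.01 | 4 | 0.52 | 0.19 | 0.8 | 0.01 | 4 | 0.47 | 0.16 | 0 | 0.10 |
| 13 | 7 | 0.07 | 2 | 0.15 | 6.8 | 0.34 | 11.25 | 0.4 | 0.27 | 7 | 0.34 | 1.06 | 1 | 0.27 | 6.9 | 0.31 | 0.60 | 0 | 0.34 |
| 14 | 7 | 0.07 | 2 | 0.3 | 6 | 0.38 | 7.94 | 0 | 0.22 | 6.9 | 0.39 | 1.05 | 1 | 0.20 | 6.8 | 0.37 | 0.60 | 0 | 0.25 |
| 15 | 7 | 0.07 | 5 | 0.15 | 6.7 | 0.42 | 10.44 | 0.4 | 0.15 | 7 | 0.42 | 1.06 | 1 | 0.14 | 7 | 0.37 | 0.59 | 0 | 0.25 |
| 16 | 7 | 0.07 | 5 | 0.3 | 5.7 | 0.46 | 7.61 | 0 | 0.08 | 6.6 | 0.47 | 1.06 | 1 | 0.06 | 6.9 | 0.44 | 0.60 | 0 | 0.11 |
| 17 | 7 | 0.14 | 2 | 0.15 | 7 | 0.45 | 10.95 | 0.7 | 0.17 | 7 | 0.45 | 1.06 | 0.9 | 0.17 | 6.9 | 0.38 | 0.60 | 0 | 0.29 |
| 18 | 7 | 0.14 | 2 | 0.3 | 6.4 | 0.54 | 7.73 | 0 | 0.22 | 6.8 | 0.56 | 1.04 | 1 | 0.20 | 7 | 0.52 | 0.59 | 0 | 0.25 |
| 19 | 7 | 0.14 | 5 | 0.15 | 7 | 0.51 | 10.39 | 0.6 | 0.03 | 7 | 0.51 | 1.06 | 0.9 | 0.03 | 6.9 | 0.42 | 0.59 | 0 | 0.20 |
| 20 | 7 | 0.14 | 5 | 0.3 | 6.9 | 0.63 | 8.10 | 0.5 | 0.08 | 7 | 0.63 | 1.05 | 0.6 | 0.08 | 6.9 | 0.57 | 0.59 | 0 | 0.17 |
| 21 | 7 | 0.21 | 2 | 0.15 | 7 | 0.47 | 10.43 | 0.8 | 0.09 | 7 | 0.47 | 1.05 | 0.7 | 0.09 | 7 | 0.41 | 0.60 | 0 | 0.22 |
| 22 | 7 | 0.21 | 2 | 0.3 | 6.8 | 0.59 | 6.66 | 0.4 | 0.14 | 7 | 0.60 | 1.05 | 0.7 | 0.13 | 6.9 | 0.54 | 0.60 | 0 | 0.21 |
| 23 | 7 | 0.21 | 5 | 0.15 | 7 | 0.51 | 10.35 | 1 | 0.01 | 7 | 0.51 | 1.06 | 1 | 0.01 | 6.9 | 0.43 | 0.60 | 0 | 0.16 |
| 24 | 7 | 0.21 | 5 | 0.3 | 7 | 0.67 | 7.31 | 0.8 | 0.03 | 7 | 0.67 | 1.06 | 0.3 | 0.03 | 6.9 | 0.59 | 0.60 | 0 | 0.14 |
| 25 | 10 | 0.05 | 2 | 0.15 | 9.3 | 0.38 | 18.56 | 0 | 0.24 | 9.9 | 0.38 | 4.45 | 1 | 0.24 | 9.2 | 0.34 | 1.93 | 0 | 0.32 |
| 26 | 10 | 0.05 | 2 | 0.3 | 8.4 | 0.42 | 11.72 | 0 | 0.17 | 8.6 | 0.43 | 4.36 | 1 | 0.14 | 9 | 0.40 | 1.94 | 0 | 0.19 |
| 27 | 10 | 0.05 | 5 | 0.15 | 9.3 | 0.45 | 18.17 | 0 | 0.10 | 10 | 0.46 | 4.49 | 1 | 0.08 | 9.5 | 0.40 | 1.94 | 0 | 0.19 |
| 28 | 10 | 0.05 | 5 | 0.3 | 8.7 | 0.48 | 13.76 | 0 | 0.05 | 9.1 | 0.49 | 4.40 | 0.9 | 0.03 | 8.9 | 0.47 | 1.94 | 0.1 | 0.06 |
| 29 | 10 | 0.1 | 2 | 0.15 | 9.7 | 0.51 | 18.67 | 0 | 0.22 | 10 | 0.52 | 4.68 | 1 | 0.21 | 9.7 | 0.44 | 2.02 | 0 | 0.33 |
| 30 | 10 | 0.1 | 2 | 0.3 | 9.2 | 0.60 | 8.97 | 0.2 | 0.25 | 9.9 | 0.62 | 4.56 | 0.8 | 0.23 | 9.5 | 0.57 | 1.97 | 0 | 0.29 |
| 31 | 10 | 0.1 | 5 | 0.15 | 9.9 | 0.62 | 18.85 | 0.6 | 0.07 | 10 | 0.62 | 4.60 | 1 | 0.07 | 9.7 | 0.52 | 2.00 | 0 | 0.23 |
| 32 | 10 | 0.1 | 5 | 0.3 | 9.4 | 0.72 | 10.54 | 0.2 | 0.11 | 10 | 0.73 | 4.62 | 0.8 | 0.10 | 9.8 | 0.65 | 1.98 | 0 | 0.20 |
| 33 | 10 | 0.15 | 2 | 0.15 | 10 | 0.57 | 18.20 | 0.7 | 0.13 | 10 | 0.57 | 4.60 | 0.8 | 0.13 | 9.7 | 0.48 | 1.99 | 0 | 0.26 |
| 34 | 10 | 0.15 | 2 | 0.3 | 9.7 | 0.67 | 13.76 | 0 | 0.15 | 9.9 | 0.68 | 4.61 | 1 | 0.14 | 9.7 | 0.63 | 1.99 | 0 | 0.21 |
| 35 | 10 | 0.15 | 5 | 0.15 | 10 | 0.66 | 18.47 | 0.9 | 0.02 | 10 | 0.66 | 4.51 | 1 | 0.02 | 9.6 | 0.53 | 1.95 | 0 | 0.20 |
| 36 | 10 | 0.15 | 5 | 0.3 | 9.9 | 0.77 | 12.94 | 0.5 | 0.04 | 10 | 0.77 | 4.52 | 0.5 | 0.04 | 9.5 | 0.67 | 1.97 | 0 | 0.16 |
| Average | | | | | | | 9.41 | 0.48 | 0.11 | | | 1.93 | 0.87 | 0.11 | | | 0.91 | 0.008 | 0.20 |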

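Table 5. Computational results for |N| = 50

| # | p | µ | c | α | PH ANL | PH AOFV | PH Time | PH Best | PH Ratio | GH ANL | GH AOFV | GH Time | GH Best | GH Ratio | RH ANL | RH AOFV | RH Time | RH Best | RH Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 0.1 | 2 | 0.1 | 5 | 0.22 | 49.07 | 0.9 | 0.16 | 5 | 0.22 | 2.18 | 1 | 0.16 | 5 | 0.19 | 0.91 | 0 | 0.27 |
| 2 | 5 | 0.1 | 2 | 0.25 | 4.5 | 0.33 | 50.39 | 0.2 | 0.25 | 5 | 0.33 | 2.24 | 0.8 | 0.23 | 5 | 0.31 | 0.94 | 0 | 0.30 |
| 3 | 5 | 0.1 | 5 | 0.1 | 5 | 0.27 | 53.74 | 0.9 | 0.04 | 5 | 0.27 | 2.23 | 0.9 | 0.04 | 5 | 0.23 | 0.93 | 0 | 0.17 |
| 4 | 5 | 0.1 | 5 | 0.25 | 4.8 | 0.40 | 49.75 | 0.1 | 0.16 | 5 | 0.41 | 2.18 | 1 | 0.14 | 5 | 0.37 | 0.95 | 0 | 0.22 |
| 5 | 5 | 0.2 | 2 | 0.1 | 5 | 0.26 | 52.49 | 1 | 0.06 | 5 | 0.26 | 2.38 | 1 | 0.06 | 4.9 | 0.22 | 0.97 | 0 | 0.21 |
| 6 | 5 | 0.2 | 2 | 0.25 | 5 | 0.41 | 40.47 | 0.6 | 0.14 | 5 | 0.41 | 2.12 | 0.4 | 0.14 | 5 | 0.37 | 0.89 | 0 | 0.22 |
| 7 | 5 | 0.2 | 5 | 0.1 | 5 | 0.25 | 48.57 | 1 | 0.00 | 5 | 0.25 | 2.12 | 1 | 0.00 | 5 | 0.22 | 0.90 | 0 | 0.15 |
| 8 | 5 | 0.2 | 5 | 0.25 | 5 | 0.47 | 41.89 | 0.9 | 0.02 | 5 | 0.47 | 2.12 | 0.6 | 0.03 | 5 | 0.42 | 0.89 | 0 | 0.14 |
| 9 | 5 | 0.3 | 2 | 0.1 | 5 | 0.26 | 48.07 | 1 | 0.03 | 5 | 0.26 | 2.11 | 1 | 0.03 | 5 | 0.21 | 0.89 | 0 | 0.20 |
| 10 | 5 | 0.3 | 2 | 0.25 | 5 | 0.46 | 41.39 | 0.7 | 0.08 | 5 | 0.46 | 2.11 | 0.6 | 0.08 | 5 | 0.40 | 0.89 | 0 | 0.20 |
| 11 | 5 | 0.3 | 5 | 0.1 | 5 | 0.29 | 46.12 | 0.9 | 0.00 | 5 | 0.29 | 2.11 | 1 | 0.00 | 5 | 0.23 | 0.89 | 0 | 0.18 |
| 12 | 5 | 0.3 | 5 | 0.25 | 5 | 0.48 | 39.85 | 1 | 0.01 | 5 | 0.48 | 2.13 | 0.4 | 0.01 | 5 | 0.42 | 0.89 | 0 | 0.14 |
| 13 | 10 | 0.05 | 2 | 0.1 | 9.8 | 0.31 | 123.57 | 0.6 | 0.28 | 9.9 | 0.31 | 22.05 | 1 | 0.28 | 9.4 | 0.26 | 5.82 | 0 | 0.40 |
| 14 | 10 | 0.05 | 2 | 0.25 | 9.1 | 0.39 | 109.26 | 0 | 0.22 | 9.7 | 0.40 | 21.83 | 1 | 0.20 | 9.4 | 0.37 | 5.85 | 0 | 0.26 |
| 15 | 10 | 0.05 | 5 | 0.1 | 9.9 | 0.39 | 120.27 | 0.3 | 0.15 | 10 | 0.39 | 22.07 | 0.9 | 0.14 | 10 | 0.31 | 5.84 | 0 | 0.30 |
| 16 | 10 | 0.05 | 5 | 0.25 | 8.9 | 0.46 | 113.99 | 0 | 0.08 | 9.5 | 0.47 | 22.04 | 1 | 0.05 | 9.7 | 0.45 | 5.82 | 0 | 0.11 |
| 17 | 10 | 0.1 | 2 | 0.1 | 10 | 0.39 | 122.48 | 0.8 | 0.13 | 10 | 0.39 | 22.04 | 1 | 0.13 | 9.9 | 0.31 | 5.86 | 0 | 0.31 |
| 18 | 10 | 0.1 | 2 | 0.25 | 9.2 | 0.54 | 107.42 | 0.2 | 0.22 | 10 | 0.56 | 22.04 | 0.8 | 0.20 | 9.8 | 0.50 | 5.83 | 0 | 0.29 |
| 19 | 10 | 0.1 | 5 | 0.1 | 10 | 0.45 | 123.84 | 0.9 | 0.02 | 10 | 0.45 | 22.05 | 0.9 | 0.02 | 9.8 | 0.35 | 5.86 | 0 | 0.25 |
| 20 | 10 | 0.1 | 5 | 0.25 | 10 | 0.64 | 105.02 | 0.4 | 0.08 | 10 | 0.64 | 22.04 | 0.6 | 0.08 | 9.8 | 0.56 | 5.83 | 0 | 0.20 |
| 21 | 10 | 0.15 | 2 | 0.1 | 10 | 0.43 | 122.34 | 0.8 | 0.08 | 10 | 0.43 | 22.06 | 0.9 | 0.08 | 9.9 | 0.34 | 5.84 | 0 | 0.26 |
| 22 | 10 | 0.15 | 2 | 0.25 | 9.7 | 0.60 | 113.89 | 0.4 | 0.13 | 10 | 0.61 | 22.07 | 0.6 | 0.12 | 10 | 0.54 | 5.84 | 0 | 0.22 |
| 23 | 10 | 0.15 | 5 | 0.1 | 10 | 0.45 | 123.14 | 0.9 | 0.00 | 10 | 0.45 | 22.05 | 0.9 | 0.00 | 9.8 | 0.35 | 5.86 | 0 | 0.22 |
| 24 | 10 | 0.15 | 5 | 0.25 | 10 | 0.68 | 103.57 | 0.3 | 0.04 | 10 | 0.68 | 22.13 | 0.7 | 0.04 | 9.8 | 0.59 | 5.84 | 0 | 0.16 |
| 25 | 15 | 0.03 | 2 | 0.1 | 14.3 | 0.36 | 213.95 | 0 | 0.28 | 14.9 | 0.36 | 129.90 | 1 | 0.27 | 14 | 0.31 | 24.88 | 0 | 0.38 |
| 26 | 15 | 0.03 | 2 | 0.25 | 12.8 | 0.41 | 170.68 | 0 | 0.17 | 12.2 | 0.43 | 119.70 | 1 | 0.13 | 13.3 | 0.40 | 24.83 | 0 | 0.19 |
| 27 | 15 | 0.03 | 5 | 0.1 | 14.5 | 0.44 | 217.79 | 0 | 0.12 | 14.9 | 0.44 | 130.59 | 1 | 0.11 | 14.7 | 0.37 | 25.47 | 0 | 0.27 |
| 28 | 15 | 0.03 | 5 | 0.25 | 13.3 | 0.48 | 171.63 | 0 | 0.05 | 13.9 | 0.49 | 127.48 | 1 | 0.02 | 14.3 | 0.46 | 24.87 | 0 | 0.08 |
| 29 | 15 | 0.07 | 2 | 0.1 | 14.9 | 0.49 | 210.71 | 0.3 | 0.20 | 15 | 0.49 | 130.28 | 0.8 | 0.20 | 14.3 | 0.39 | 24.98 | 0 | 0.36 |
| 30 | 15 | 0.07 | 2 | 0.25 | 14.3 | 0.61 | 162.70 | 0 | 0.25 | 14.8 | 0.62 | 131.23 | 1 | 0.23 | 14.2 | 0.56 | 25.30 | 0 | 0.30 |
| 31 | 15 | 0.07 | 5 | 0.1 | 14.9 | 0.58 | 231.75 | 0.4 | 0.05 | 15 | 0.58 | 144.24 | 1 | 0.05 | 14.3 | 0.44 | 27.38 | 0 | 0.27 |
| 32 | 15 | 0.07 | 5 | 0.25 | 14.5 | 0.72 | 186.99 | 0.1 | 0.11 | 15 | 0.73 | 147.60 | 0.9 | 0.10 | 14.4 | 0.64 | 28.13 | 0 | 0.21 |
| 33 | 15 | 0.1 | 2 | 0.1 | 15 | 0.53 | 235.51 | 0.7 | 0.11 | 15 | 0.54 | 148.75 | 0.8 | 0.11 | 13.9 | 0.42 | 28.24 | 0 | 0.31 |
| 34 | 15 | 0.1 | 2 | 0.25 | 14.7 | 0.67 | 191.52 | 0.1 | 0.15 | 15 | 0.68 | 155.79 | 0.9 | 0.13 | 14.1 | 0.60 | 29.15 | 0 | 0.24 |
| 35 | 15 | 0.1 | 5 | 0.1 | 14.9 | 0.61 | 1187.28 | 0.8 | 0.02 | 15 | 0.61 | 822.02 | 0.9 | 0.02 | 14.4 | 0.46 | 165.34 | 0 | 0.27 |
| 36 | 15 | 0.1 | 5 | 0.25 | 15 | 0.78 | 1574.69 | 0.7 | 0.04 | 15 | 0.77 | 936.43 | 0.3 | 0.04 | 14.4 | 0.67 | 175.75 | 0 | 0.17 |
| Average | | | | | | | 186.27 | 0.50 | 0.11 | | | 94.85 | 0.85 | 0.10 | | | 19.04 | 0.00 | 0.23 |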

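Table 6. Computational results for |N| = 80

| # | p | µ | c | α | PH ANL | PH AOFV | PH Time | PH Best | PH Ratio | GH ANL | GH AOFV | GH Time | GH Best | GH Ratio | RH ANL | RH AOFV | RH Time | RH Best | RH Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 8 | 0.06 | 2 | 0.05 | 8 | 0.19 | 1078.38 | 0.7 | 0.12 | 8 | 0.19 | 87.24 | 1 | 0.12 | 8 | 0.15 | 16.35 | 0 | 0.30 |
| 2 | 8 | 0.06 | 2 | 0.1 | 8 | 0.24 | 1057.98 | 0.4 | 0.20 | 8 | 0.24 | 87.19 | 0.9 | 0.19 | 7.9 | 0.20 | 16.33 | 0 | 0.35 |
| 3 | 8 | 0.06 | 5 | 0.05 | 8 | 0.23 | 1075.90 | 0.8 | 0.02 | 8 | 0.23 | 87.13 | 1 | 0.02 | 8 | 0.18 | 16.30 | 0 | 0.23 |
| 4 | 8 | 0.06 | 5 | 0.1 | 8 | 0.30 | 1052.43 | 0.9 | 0.05 | 8 | 0.30 | 87.25 | 0.9 | 0.06 | 8 | 0.24 | 16.45 | 0 | 0.25 |
| 5 | 8 | 0.19 | 2 | 0.05 | 8 | 0.22 | 1075.45 | 0.9 | 0.02 | 8 | 0.22 | 87.20 | 1 | 0.02 | 8 | 0.17 | 16.35 | 0 | 0.25 |
| 6 | 8 | 0.19 | 2 | 0.1 | 8 | 0.29 | 1054.16 | 1 | 0.04 | 8 | 0.29 | 87.19 | 0.8 | 0.04 | 8 | 0.24 | 16.36 | 0 | 0.23 |
| 7 | 8 | 0.19 | 5 | 0.05 | 8 | 0.23 | 1076.94 | 0.9 | 0.00 | 8 | 0.23 | 87.20 | 0.9 | 0.00 | 8 | 0.18 | 16.33 | 0 | 0.22 |
| 8 | 8 | 0.19 | 5 | 0.1 | 8 | 0.31 | 1076.29 | 1 | 0.00 | 8 | 0.31 | 87.11 | 0.5 | 0.00 | 8 | 0.23 | 16.29 | 0 | 0.25 |
| Average | | | | | | | 1068.44 | 0.83 | 0.06 | | | 87.19 | 0.88 | 0.06 | | | 16.35 | 0.00 | 0.26 |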


Table 7. Comparing the GH and PH with complete enumeration using simulation for the 12-node network with p = 3, µ = 0.3 and c = 1

| α | OFV of simulation | OFV of GH | RE of GH (%) | GH found optimal solution? | OFV of PH | RE of PH (%) | PH found optimal solution? |
|---|---|---|---|---|---|---|---|
| 0.25 | 0.3578 | 0.3546 | 0.92 | Yes | 0.3546 | 0.92 | Yes |
| 0.75 | 0.4932 | 0.5002 | 1.43 | No | 0.4991 | 1.20 | No |
| 1.25 | 0.5466 | 0.5609 | 2.62 | No | 0.5609 | 2.62 | No |
| 0.25 | 0.3465 | 0.3500 | 1.00 | Yes | 0.3376 | 2.57 | No |
| 0.75 | 0.5028 | 0.5131 | 2.04 | No | 0.4938 | 1.78 | No |
| 1.25 | 0.5511 | 0.5696 | 3.37 | No | 0.5627 | 2.11 | No |
| 0.25 | 0.3379 | 0.3361 | 0.52 | No | 0.3367 | 0.36 | Yes |
| 0.75 | 0.4868 | 0.4933 | 1.32 | Yes | 0.4673 | 4.01 | No |
| 1.25 | 0.5403 | 0.5577 | 3.21 | No | 0.5485 | 1.50 | No |
| 0.25 | 0.3580 | 0.3653 | 2.04 | Yes | 0.3589 | 0.28 | No |
| 0.75 | 0.5083 | 0.5165 | 1.63 | Yes | 0.4953 | 2.55 | No |
| 1.25 | 0.5511 | 0.5691 | 3.26 | Yes | 0.5596 | 1.53 | No |
| 0.25 | 0.3686 | 0.3688 | 0.06 | Yes | 0.3688 | 0.06 | Yes |
| 0.75 | 0.5116 | 0.5225 | 2.14 | Yes | 0.4946 | 3.31 | No |
| 1.25 | 0.5567 | 0.5726 | 2.85 | No | 0.5660 | 1.67 | No |
| 0.25 | 0.3495 | 0.3533 | 1.07 | Yes | 0.3497 | 0.05 | No |
| 0.75 | 0.4862 | 0.4953 | 1.88 | Yes | 0.4594 | 5.51 | No |
| 1.25 | 0.5423 | 0.5592 | 3.12 | No | 0.5477 | 1.00 | No |
| 0.25 | 0.3535 | 0.3513 | 0.60 | Yes | 0.3513 | 0.60 | Yes |
| 0.75 | 0.4753 | 0.4887 | 2.82 | No | 0.4809 | 1.18 | No |
| 1.25 | 0.5379 | 0.5617 | 4.42 | No | 0.5617 | 4.42 | No |
| Average | | | 2.01 | | | 1.87 | |
| Percentage of finding optimal solution (%) | | | | 52.38% | | | 19.05% |

a function of the total distance traveled from $i$ to $k$ plus the leg from $k$ to facility $j$. Thus, the contribution that node $i$ makes to the objective function value in (P2) is

$$V_i^{P2}(L) = w_i \sum_{j=1}^{p} \left(1 - \rho\left(S_j^{(i)}\right)\right) \prod_{k=1}^{j} \rho\left(S_{k-1}^{(i)}\right) f\left(\sum_{t=1}^{k} d\left(L_{t-1}^{(i)}, L_t^{(i)}\right)\right). \tag{13}$$

The problem is then to maximize the total expected demand captured by $p$ facilities as follows:

(P2)

$$V^{P2}(L^*) = \max_{L \subseteq N} \left\{ V^{P2}(L) = \sum_{i \in N} V_i^{P2}(L) : |L| = p \right\}. \tag{14}$$

Note that the customer behavior in (P2) is not memoryless as it was in (P1). Here, customers having the same search path as in (P1) will have a smaller expected number of facilities visited. Hence, the demand captured in (P2), which we call P2[L], is always smaller than or equal to the demand captured in (P1), which we call P1[L], as is stated in Observation 1.

Observation 1. If the distance tolerance function $f$ is nonincreasing, then for any set of facility locations $L$, P1[L] ≥ P2[L].

A natural problem to investigate is the one in which customers receive information so that they can travel directly to an available facility. We call this (P3). Let

$$V_i^{P3}(L) = w_i \sum_{j=1}^{p} \left(1 - \rho\left(S_j^{(i)}\right)\right) f\left(d\left(i, L_j^{(i)}\right)\right) \prod_{k=1}^{j} \rho\left(S_{k-1}^{(i)}\right), \tag{15}$$

(P3)

$$V^{P3}(L^*) = \max_{L \subseteq N} \left\{ V^{P3}(L) = \sum_{i \in N} V_i^{P3}(L) : |L| = p \right\}. \tag{16}$$

Obviously, due to the triangle inequality property, customers will always travel less in the setting of (P3) than in the setting of (P2).

Observation 2. If the distance tolerance function $f$ is nonincreasing, then for any set of facility locations $L$, P3[L] ≥ P2[L].

The relationship between P1[L] and P3[L] is not clear. On the one hand, by knowing the total distance traveled to obtain service, customers are less likely to begin to travel at all in the setting of (P3). On the other hand, the distance traveled (search path) to obtain service from the $j$th facility in (P1) may be much larger than that of traveling directly from node $i$ to facility $j$ in (P3). Comparing the last problem with the other two gives us a sense of how much is lost (or gained) due to customers' lack of information about facility congestion.
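As an illustration of how the node contributions in (13) and (15) might be evaluated, and of why Observation 2 holds for a single node, the following Python sketch computes both for one customer node. It is a sketch under stated assumptions rather than the authors' procedure: the function names, the convention $\rho(S_0^{(i)}) = 1$, the use of the same visiting order and the same blocking fractions under both behaviors, and all numbers in the example are hypothetical.

```python
def tolerance(d, alpha, d_max):
    """Distance tolerance f(d) = max(0, 1 - d / (alpha * d_max))."""
    return max(0.0, 1.0 - d / (alpha * d_max))


def contribution_p2(w, legs, rho, alpha, d_max):
    """Contribution of one node under (P2), following Equation (13).

    legs[k] : length of the k-th leg of the search path (home node to the
              first facility, then facility to facility)
    rho[k]  : blocking fraction of the k-th facility on the path
    The willingness to travel each leg depends on the total distance so far.
    """
    total = 0.0
    reach = 1.0        # probability of arriving at the current facility
    travelled = 0.0    # cumulative distance from the home node
    prev_block = 1.0   # rho(S_0) taken as 1 (assumption)
    for leg, block in zip(legs, rho):
        travelled += leg
        reach *= prev_block * tolerance(travelled, alpha, d_max)
        total += (1.0 - block) * reach
        prev_block = block
    return w * total


def contribution_p3(w, direct, rho, alpha, d_max):
    """Contribution of one node under (P3), following Equation (15).

    direct[k] : direct distance from the home node to the k-th facility
    The customer travels straight to the first facility that is not full.
    """
    total = 0.0
    all_blocked = 1.0  # probability that all earlier facilities are full
    for d, block in zip(direct, rho):
        total += (1.0 - block) * tolerance(d, alpha, d_max) * all_blocked
        all_blocked *= block
    return w * total


# Hypothetical node: two facilities, each full half of the time. The direct
# distance to the second facility (4) respects the triangle inequality (2 + 3).
v_p2 = contribution_p2(1.0, legs=[2.0, 3.0], rho=[0.5, 0.5], alpha=1.0, d_max=10.0)
v_p3 = contribution_p3(1.0, direct=[2.0, 4.0], rho=[0.5, 0.5], alpha=1.0, d_max=10.0)
```

In this example the (P3) contribution is at least as large as the (P2) contribution, because each direct distance is no longer than the corresponding cumulative path length and $f$ is nonincreasing.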

Table 8. Comparing (P1), (P2), (P3) and (BKW)

| α | Solution | OFV of (P1) | OFV of (P2) | OFV of (BKW) | OFV of (P3) |
|---|---|---|---|---|---|
| 0.5 | S1 = {1, 7, 9} | 0.626 | 0.620 | 0.607 | 0.631 |
| 0.5 | S2 = {1, 7, 9} | 0.626 | 0.620 | 0.607 | 0.631 |
| 0.5 | SB = {1, 7, 9} | 0.626 | 0.620 | 0.607 | 0.631 |
| 1.0 | S1 = {6, 7, 9} | 0.722 | 0.698 | 0.660 | 0.742 |
| 1.0 | S2 = {1, 7, 9} | 0.720 | 0.699 | 0.662 | 0.740 |
| 1.0 | SB = {1, 7, 9} | 0.720 | 0.699 | 0.662 | 0.740 |
| 1.5 | S1 = {3, 7, 9} | 0.765 | 0.732 | 0.669 | 0.776 |
| 1.5 | S2 = {6, 7, 9} | 0.764 | 0.735 | 0.677 | 0.769 |
| 1.5 | SB = {1, 7, 9} | 0.762 | 0.735 | 0.678 | 0.782 |
| 2.0 | S1 = {7, 9, 9} | 0.791 | 0.738 | 0.510 | 0.782 |
| 2.0 | S2 = {6, 7, 9} | 0.787 | 0.760 | 0.685 | 0.780 |
| 2.0 | SB = {1, 7, 9} | 0.785 | 0.759 | 0.686 | 0.779 |

Finally, we consider (BKW), introduced by Berman, Krass and Wang (2006). They use the probability of visiting a facility as a function of the distance traveled, but customers only travel to the closest facility and are either served or lost. Clearly, this problem cannot capture more demand than any of the problems above. Hence,

Observation 3. If the distance tolerance function $f$ is nonincreasing, then for any set of facility locations $L$, BKW[L] ≤ min{P1[L], P2[L], P3[L]} = P2[L].

To shed more light on this discussion we present an example for which we demonstrate the differences in the objective function values P1[L], P2[L], P3[L] and BKW[L]. Let S1, S2, S3 and SB be the corresponding optimal solutions of (P1), (P2), (P3) and (BKW). Although Procedure DAC works very well for (P1) and (P2), it does not deliver good results for (P3). Therefore, we will not attempt to optimize (P3) but instead compute the objective function value using simulation for any given facility location set. We use the network of Fig. 1 with all parameters having the same values stated earlier, except for $\alpha$, which varies in {0.5, 1.0, 1.5, 2.0}. The results are summarized in Table 8. For each problem we state the optimal solution and the OFV. The table compares the objective function values of the optimal solutions to the three problems. For instance, when $\alpha = 1.0$, S2, the optimal solution to (P2), is {1, 7, 9} and the objective function value of S2 is P2[S2] = 0.699. Also, P1[S1] = 0.722 and BKW[SB] = 0.662. From the same table, if we compute the objective function value of (BKW) using the solution for (P1), we have that BKW[S1] = 0.660 and P3[S1] = 0.742.

Note that, when $\alpha = 2.0$, P3[S1] < P1[S1], implying that providing customers with information about facility congestion does not help to capture more demand. However, when $\alpha = 1.0$, P3[S1] > P1[S1]. Hence, as customers are more prone to travel an extra leg in search of service, the decision maker is less willing to provide that information. If customers, on the other hand, are very sensitive to distance, giving the information is beneficial. This change in the relation of objective function values between (P1) and (P3) does not occur between (P2) and (P3). Note that it is always better to provide information about facility congestion if customers behave as assumed in (P2).

Another interesting question is how much is lost when using the solution to (BKW) as a proxy for the solution to (P1). In this particular example, it can be easily seen from the tables above that there is no major loss in doing so. However, in general this may be dangerous. For example, for an instance with 25 nodes and $\mu = 0.5$, $\alpha = 1.25$ and $c = 1$, as shown in Table 9, the relative difference between P1[SB] and P1[S1] is close to 4.5% but, even more relevant, the absolute difference is 3%, a percentage of demand that goes directly to the company's bottom line. We omit the results for other values of $\alpha$ for that same instance, but similar values in the differences were found. In summary, instances returning up to 6% in the absolute differences are not uncommon.

Table 9. Comparing (P1), (P2), (P3) and (BKW) for the 25-node case with µ = 0.5, α = 1.25 and c = 1

| Solution | OFV of (P1) | OFV of (P2) | OFV of (BKW) | OFV of (P3) |
|---|---|---|---|---|
| S1 = {5, 21, 23} | 0.670 | 0.608 | 0.504 | 0.709 |
| S2 = {5, 16, 18} | 0.662 | 0.619 | 0.541 | 0.701 |
| SB = {4, 8, 22} | 0.640 | 0.600 | 0.547 | 0.704 |
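As a quick check of the two differences quoted for the 25-node instance, using the Table 9 entries P1[S1] = 0.670 and P1[SB] = 0.640:

$$P1[S1] - P1[SB] = 0.670 - 0.640 = 0.030, \qquad \frac{P1[S1] - P1[SB]}{P1[S1]} = \frac{0.030}{0.670} \approx 0.0448.$$

That is, the absolute difference is 3 percentage points of total demand and the relative difference is roughly 4.5%.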

7. Conclusions

In this paper we extended the model presented by Berman, Krass and Wang (2006), where distance-sensitive customers only obtain service from the closest facility. We presented three alternatives to that model. In the first model, customers travel from one facility to another until service is obtained or their sensitivity to distance traveled forces them to abandon the search. In this model we consider that the sensitivity is a function of the distance between the current location and the next-closest facility; the distance traveled up to that moment has no impact on the upcoming decision. In the second model, customers take into account the total distance traveled from their home nodes. The third model considers the case where the decision maker gives full information to customers about congestion. Customers can then travel directly to the closest facility that is not congested.

We showed the relationship between the objective functions of these three models, and discussed some bounds. The choice of model should be made according to customers' behavior. Alternatively, the decision maker may be capable of inducing customers to behave in a way that improves the objective function value. Note that the first and third models do not have a fixed relationship in terms of objective function values. Thus, if customers behave according to the first model, then they can behave differently if information about congestion is given. Alternatively, by omitting that information the behavior can shift from the one described by the third model to that considered in the first model.

Another contribution of this paper is the approximation algorithm (DAC) to calculate objective function values. The DAC procedure is a valuable tool for obtaining approximate values, given that obtaining values by simulation is time-consuming. Even for small problems it can take close to an hour to obtain the objective function value through simulation. Moreover, for any one of the heuristics presented, the objective function value has to be computed many times.

Future research can be done in the direction of finding ways to optimize the second and third models, which we have only touched on. We believe that while the methodology to optimize (P2) will be quite similar to that of (P1), (P3) requires a completely different approach, since for each point there is a different ordering of preference for visiting the facilities. A hint of the complexity of (P3) can be seen in the working paper by Berman, Krass and Menezes (2006b), where the probability of finding an operating facility is not a function of facility location but a fixed exogenous parameter. Finally, empirical research using our models with the objective of addressing the issues of customers' behavior could be interesting. The real impact of locating facilities under congestion is just starting to be understood from the theoretical perspective, but very little has been done to evaluate the validity and relevance of our models from the empirical side.

References

Ball, M. and Lin, F. (1993) A reliability model applied to emergency service vehicle location. Operations Research, 41, 18–36.
Batta, R., Dolan, J. and Krishnamurthy, N. (1989) The maximal expected covering location problem: revisited. Transportation Science, 23, 277–287.
Berman, O., Chiu, S., Larson, R., Odoni, A. and Batta, R. (1990) Location of mobile units in a stochastic environment, in Discrete Location Theory, Mirchandani, P. and Francis, R. (eds), Wiley, pp. 503–548.
Berman, O. and Krass, D. (2002) Facility location problems with stochastic demands and congestion, in Location Analysis: Applications and Theory, Drezner, Z. and Hamacher, H. W. (eds), pp. 329–371.
Berman, O., Krass, D. and Menezes, M. (2006a) MiniSum with imperfect information: minimizing inconvenience. Working paper, Joseph L. Rotman School of Management, University of Toronto, Toronto, Canada.
Berman, O., Krass, D. and Menezes, M. (2006b) Reliability issues and strategic co-location in m-median problems: median problem with unreliable facilities. Operations Research, 55(2), 332–350.
Berman, O., Krass, D. and Wang, J. (2006) Locating service facilities to reduce lost demand. IIE Transactions, 38, 933–946.
Borras, F. and Pastor, J. (2002) The ex-post evaluation of the minimum local reliability level: an enhanced probabilistic location set covering model. Annals of Operations Research, 111(1), 51–74.
Daskin, M. (1983) A maximum expected covering location model: formulation, properties and heuristic solution. Transportation Science, 17, 48–70.
Daskin, M. (1995) Network and Discrete Location: Models, Algorithms and Applications, Wiley, New York, NY.
Gross, D. and Harris, C. (1985) Fundamentals of Queueing Theory, Wiley, New York, NY.
Larson, R. (1975) Approximating the performance of urban emergency service systems. Operations Research, 23, 845–868.
Marianov, V. and ReVelle, C. (1994) The queueing probabilistic location set covering problem and some extensions. Socio-Economic Planning Sciences, 28(3), 167–178.
Narula, S.C., Ogbu, U.I. and Samuelsson, H.M. (1977) An algorithm for the p-median problem. Operations Research, 25, 709–713.
ReVelle, C. and Hogan, K. (1989) The maximum availability location problem. Transportation Science, 23(3), 192–200.

Biographies
Oded Berman is the endowed Sydney Cooper Chair in Business and Technology and the former Associate Dean of Programs at the Joseph L. Rotman School of Management at the University of Toronto. He received his Ph.D. (1978) in Operations Research from the Massachusetts Institute of Technology. He was previously with the Electronic Systems Lab at MIT, the University of Calgary, and the University of Massachusetts at Boston, where he was also the Chairman of the Department of Management Sciences. He has published over 170 articles and has contributed to several books in his field. His main research interests include operations management in the service industry, location theory, network models, and software reliability. He is an Associate Editor for Management Science and Transportation Science, and a member of the editorial board for Computers and Operations Research.

Rongbing Huang is an Assistant Professor at the School of Administrative Studies, York University, Canada. He holds a Ph.D. degree in Operations Management from the Joseph L. Rotman School of Management, University of Toronto. His research interests include facility location theory and combinatorial auctions. He teaches courses in business statistics, quantitative methods, decision analysis and introduction to operations research.

Seokjin Kim is an Assistant Professor of Management at the Department of Business Administration, Millersville University. His teaching assignments include research methods in business, quantitative methods for business, and production and operations management. His research deals with stochastic optimization models in facility location and workforce scheduling. He received a Ph.D. degree in Operations Management from the University of Toronto's Rotman School of Management and an M.S. degree in Engineering-Economic Systems (currently, Management Science and Engineering) from Stanford University. He also received M.B.A. and B.B.A. degrees in Business Administration from Yonsei University, South Korea.

Mozart B. C. Menezes is an Assistant Professor at the Department of Operations Management and Information Technology at the HEC School of Management, Paris. His teaching assignments include supply chain management, inventory management and combinatorial optimization. His research interests include supply chain management, facility location problems, inventory management and services operations management. He obtained his B.Sc. degree in Civil Engineering from the Universidade Federal do Pará, Brazil. He holds an M.Sc. degree in Industrial Administration and another in Project Management from Clemson University, and a Ph.D. degree in Operations from the Joseph L. Rotman School of Management, University of Toronto.