
Proceedings of the 40th Hawaii International Conference on System Sciences - 2007

Using Classifiers to Solve Warehouse Location Problems

Kurt DeMaagd
Ross School of Business, University of Michigan, Ann Arbor, MI 48109
Email: [email protected]

Scott Moore
Ross School of Business, University of Michigan, Ann Arbor, MI 48109
Email: [email protected]

Abstract
We present a new technique for finding solutions to facility location problems. Traditional techniques are usually sensitive to small details of the problem. We present a more general method which generates rules that can be applied to a range of problems. Because these rules are more general, they help researchers build intuitions about their models and help managers more easily apply the model’s results to real world situations. We use a classifier to learn about the important features of the model. To generate new rules, we use a genetic algorithm. Although classifiers are traditionally used to find key features of documents or other similar objects, we find that they can also discover important attributes of facility location models. Based on these attributes, the classifier chooses what facilities should be built. While still only a proof of concept, our results indicate classifiers are a reasonable tool for solving facility location problems.

1. Introduction
Solving facility location problems is a computationally intensive process. In addition, once a solution is found, the results are highly specific to the particular parameters of the model. Taken together, these two factors mean that, to study a model under a range of different parameters and assumptions, researchers must run many iterations of different computationally intensive processes. Even with modern computers, this is simply not practical. Ideally, we would like a tool that can search for a set of heuristics that can be quickly applied to a model under a range of different assumptions. In other words, we need a tool that can generate generalizable heuristics for a given set of facility placement problems. In this paper, we describe one possible approach based on classifiers and genetic algorithms.

Our general approach is to treat the description of the facility placement model as a series of messages. These messages identify key features of the model. For example, a message could represent the location of a customer or a competitor’s warehouse. The classifier then matches these messages and chooses actions, such as building a new facility near the competitor’s warehouse. We use the bucket brigade algorithm [11] to train the classifier and assign credit to the different strategies, and a genetic algorithm [10] to search for new classifiers.

The problem we analyze is a firm entering a market in a rural developing economy. The firm needs to place warehouses to purchase agricultural goods from the local farmers and it needs to place IT infrastructure to communicate with the farmers. There is already an incumbent firm serving the market, so the firm faces competition. Because the entering and incumbent firms can respond to each other’s actions, the model must also be dynamic. Hence, we have a model in which a firm places two types of infrastructure in a competitive environment over time. We cover this in greater detail in section 3.

We find that using a classifier is an effective tool for solving facility location problems. Most interestingly, the classifier generates a system of if/then statements which can be applied to many different models. As a result, the rules are more generalizable than the typical output when solving this class of models. Although this paper is primarily a proof of concept and we need to perform further research on this topic (as covered in section 6), the results presented here are quite promising.

In the next section, we describe some of the previous approaches to this problem. In sections 3 and 4 we describe the model and our solution. Then we analyze some results on the effectiveness of our approach. Finally, we summarize the results and describe some future directions and applications.

1530-1605/07 $20.00 © 2007 IEEE

2. Literature Review
Summarizing all of the previous solution techniques for facility location problems would require an entire book. In this section, we briefly cover some of the basic concepts and then describe some of their limitations that we hope to address. For a more complete summary of the main concepts and approaches to this problem, Drezner and Hamacher [8] and Anderson et al. [2] provide a good introduction to the literature.

A facility location problem is concerned with choosing locations on which to place infrastructure such as warehouses or phone switching centers. One popular technique is the branch-and-bound algorithm [14]. The “branch” part of the name refers to the enumeration part of the technique and “bound” means that it eliminates possibilities that are greater or less than some set of bounds on the problem. In most cases, the bounds are calculated using a less constrained version of the problem. For example, in many cases the bound is computed by dropping the integer constraint.

In many cases, relaxing the problem constraints simplifies the problem sufficiently to make it solvable. One of the best known relaxation techniques is Lagrangian decomposition [9]. In this approach, the problem is divided into subsets in which a smaller number of the constraints bind. This should result in several easy to solve smaller problems instead of one hard to solve problem. Another classic technique is Benders’ algorithm [3], which works particularly well when some of the variables are set to fixed values. It uses these values to find cutting planes which reduce the size of the problem space.

Many other solution methods rely on generating cutting planes to reduce the problem space. Each cutting plane serves as a new constraint. Although this method can be used on any linear programming problem, the general purpose solutions tend to be slow. One of the best known examples is the Simplex algorithm [6]. Although general purpose implementations of cutting plane algorithms can be slow, some problem specific implementations work much faster. In addition, these algorithms can be combined with some of the methods described above. For example, branch-and-cut [15] is a hybrid of the branch-and-bound algorithm and the cutting planes algorithm, using cutting planes to narrow the problem space and enumeration to search for the optimal solution.
A genetic algorithm (GA), as originally developed by John Holland [10], is a general purpose tool for finding good solutions in a large problem space. In the business domain, GAs have been used in disciplines ranging from finance [1] to marketing [4]. Recently, some researchers have looked at GAs and their potential application to facility location problems. The earliest work by Hosage and Goodchild [12] appeared to show that GAs were not a good tool for location problems. Subsequent work, however, by Brimberg et al. [5] and Salhi and Gamal [17] has found that GAs are generally effective in this class of problems.

Within the context of this paper, prior approaches have two major limitations. First, the results found by the previous algorithms are highly sensitive to the particular structure of the problem. This limits their applicability to other similar problems. On the one hand, we may want our solutions to fit the particular quirks of a model, especially when attempting to model real world scenarios. On the other hand, no model is a perfect representation of reality. We do not want the solution to fit insignificant details of the model which do not correspond to the real world. We propose a method which derives general rules based on the model but which are not as sensitive to small deviations in the model setup.

Second, the above solutions, when applied to a competitive model, search for solutions for equilibrium conditions. Yet competitive dynamic facility location models are perfect examples of path dependent problems. Consider a slight variant on Hotelling’s model [13]. Begin with a single ice cream vendor selling on the beach. The optimal location for the ice cream vendor is in the middle of the beach. If, however, a competitor entered the market, Hotelling’s model predicts that the two vendors would optimally move to opposite ends of the beach. (Note that subsequent analysis has shown that endogenous prices change this equilibrium [2, 7], but that does not affect our example.) This result assumes that the vendors can easily move to new locations as competitive forces demand. Yet facility location problems frequently deal with immobile facilities. If the first ice cream vendor could not move, then the optimum choice for the entering vendor is quite different. Because facilities are often immobile, facility location problems are highly dependent on the set of previous choices.

As a point of clarification, the problem described in the previous paragraph is not a path dependent problem in the strictest sense of the word. In his study of the different forms of path dependence, Page refers to this type of problem as phat dependence [16]. Perhaps a less stylized description is state dependence. For example, it does not matter if a firm builds a warehouse at location A and then at location B or if it builds at B first and then at A. It merely matters that the firm already has facilities at locations A and B. Although it is common to lump path and state dependence together and refer to them both as path dependence, we acknowledge that our problem is more accurately described as state dependent.

3. Model
Recall from the introduction that we are interested in a model of a firm entering a rural developing economy where it must place both warehouses and IT infrastructure. In this section we elaborate on that description, beginning with a general description and then mathematically implementing the model. Our model includes two firms: an incumbent firm with several warehouses (but no IT infrastructure) in place, and a competitor that is entering the market. Both firms purchase agricultural goods from local farmers. The farmers choose where to sell based on the expected profit.


The farmers are risk averse, so they value an uncertain expected profit less than an exact price quote. In addition, they must pay to transport their goods to wherever they choose to sell. The entering firm faces two decisions: where to place warehouses and where to place IT infrastructure. These represent the two ways the entering firm competes with the incumbent. If it builds more warehouses, then the farmers will have a shorter distance to ship their goods. If it builds IT infrastructure, the firm can provide the farmers with an exact price quote, reducing their uncertainty. Both of these would give the entering firm a competitive advantage over the incumbent. In addition, there may be an interaction between the two factors. Over time, the firms may also respond to their competitor’s actions. For example, if one firm builds a warehouse close to a cluster of farmers, the other firm may also choose to build there. Now we describe this problem mathematically. The first equation below describes the farmer’s decision problem. This is the farmer’s expected profit. If there is IT infrastructure nearby, the farmers can get a price quote for their goods. Hence, the expected profit is simply the price quote less the transportation cost. In the absence of a price quote, the expected profit is the expected price, adjusted for risk aversion, less the transportation cost. For simplicity, assume that each farmer produces one unit of goods, farmers do not have a cost of producing the goods, and the price offered to the farmers is the same at each warehouse. In addition, to reduce confounding effects in our analysis, we assume that the expected price if the farmer does not have access to price quotes is the same as the exact price quote. Although the quoted price is the same as the expected price, because of risk aversion, the farmers still value the quoted price over the uncertain price.

Table 1: Variables used in this model

Variable      Definition
Π_{f,w,s}     Farmer f’s profit at warehouse w
Π_{c,s}       Firm c’s profit
u             Risk aversion penalty
p             Price for the farmers’ goods
t             Transportation cost
D(f, w)       Distance to warehouse w
r             Firm’s resale price of goods
c_w           Cost per warehouse
c_k           Cost of IT infrastructure
{X, Y}        (x, y) coordinates of the infrastructure
s             Number of time periods remaining
W_{c,s}       Set of firm’s warehouses at time s
K_{c,s}       Set of firm’s IT infrastructure at time s

\[
\Pi_{f,w,s} =
\begin{cases}
p - t \times D(f,w) & \text{if } I(\mathrm{IT}) = 1 \\
p(1-u) - t \times D(f,w) & \text{if } I(\mathrm{IT}) = 0
\end{cases}
\tag{1}
\]

The I(IT) function requires a bit of additional explanation. This is the indicator function which returns a 0 if the farmer does not have information access or a 1 if the farmer does have access to information provided by the IT-enabled firm. For example, in the developing world, information access often takes the form of Internet enabled kiosks built in the local villages. All other variables are as defined in Table 1. Note that the subscript s refers to the time period. We discuss the role of time in more detail later in this section.

We are primarily interested in this problem from the perspective of the firms, so the next step is to translate the farmer’s decisions into the firm’s demand at each warehouse.

\[
q_{w,f,s} =
\begin{cases}
1 & \text{if } \Pi_{f,w,s} > \Pi_{f,-w,s} \\
0 & \text{otherwise}
\end{cases}
\tag{2}
\]

$\Pi_{f,-w,s}$ represents the profit the farmer could get at any warehouse except warehouse w. Recall that we assumed each farmer produces one unit of output. Each farmer sells their one unit of output to whatever location maximizes their profit. Now that we have a function to describe when farmers sell to the firm, we can describe the firm’s net income. In this case, the net income is the gross margin times the amount of goods minus the cost of the warehouses and the IT infrastructure. The firm chooses to place warehouses and IT infrastructure to maximize net income, as described in the following function:

\[
\Pi_{c,s} = (r - p) \sum_{w \in W_{c,s}} q_{w,s} - c_w \, |W_{c,s}| - c_k \, |K_{c,s}|
\tag{3}
\]

$W_{c,s}$ is the set of warehouses owned by the entering firm at time s. $K_{c,s}$ is similar to $W_{c,s}$, except it represents the units of IT investments. We use $|W_{c,s}|$ and $|K_{c,s}|$ to represent the number of warehouses and the number of kiosks respectively. Finally, both the entering firm and the incumbent attempt to maximize their total profit by choosing when and where to build new warehouses and IT infrastructure.

\[
\max_{w_{x,y,s},\, k_{x,y,s}} \; \sum_{s} \Pi_{c,s}
\tag{4}
\]
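Equations (1) to (3) can be illustrated with a minimal Python sketch. The function names, the argument order, and the idea of passing precomputed distances are assumptions made for this illustration; the paper's own implementation was written in MATLAB and is not shown.

```python
def farmer_profit(p, t, u, distance, has_quote):
    """Equation (1): a farmer's expected profit at one warehouse.

    With a price quote (I(IT) = 1) the farmer values the full price p;
    without one (I(IT) = 0) the price is discounted by the risk penalty u.
    """
    price = p if has_quote else p * (1.0 - u)
    return price - t * distance


def sells_to(profits, w):
    """Equation (2): 1 if warehouse w strictly beats every other warehouse.

    `profits` is a list of the farmer's profit at each warehouse.
    A tie yields 0, following the strict inequality in the equation.
    """
    others = [v for i, v in enumerate(profits) if i != w]
    return 1 if not others or profits[w] > max(others) else 0


def firm_profit(r, p, c_w, c_k, units_sold, n_warehouses, n_kiosks):
    """Equation (3): gross margin on units bought, minus facility costs."""
    return (r - p) * units_sold - c_w * n_warehouses - c_k * n_kiosks


# A quoted warehouse 3 spaces away versus an unquoted one 2 spaces away.
profits = [
    farmer_profit(p=10, t=1, u=0.5, distance=3, has_quote=True),
    farmer_profit(p=10, t=1, u=0.5, distance=2, has_quote=False),
]
choice = [sells_to(profits, w) for w in range(len(profits))]
```

In this example the farmer prefers the more distant warehouse because the exact quote outweighs the extra transportation cost, which is the trade-off between the two kinds of infrastructure that the model is built to capture.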

So far we have largely ignored the s subscripts. These represent time. For the purposes of these experiments, we use 5 time periods. In each time period, the firms make their decisions and build their infrastructure. Then the farmers decide where to sell their goods and sell them. For example, the incumbent firm begins with just enough warehouses to serve a majority of the population. The entering firm may build more warehouses, drawing away a large portion of the incumbent’s sales. As a result, the incumbent may wish to respond by building more infrastructure during the next turn. Based on this set of actions, each time period is roughly similar to a one-year time period in the real world.

Both firms may move during each time step s. In this way, the model is similar to a game with simultaneous moves, as is common in game theory. Since computing the set of payoffs is NP-hard, a traditional game theory representation would not be appropriate here. In addition, the set of actions remains the same during each period and the firm only bases its choice on the current state of the model. In other words, it only cares that the competitor has a warehouse at location (x, y) and does not care about additional details such as when the warehouse was built. Hence, this problem is similar to a dynamic programming problem common in optimal control theory. Yet, because this is a competitive model, this approach is also not appropriate. We discuss our solution approach in greater detail in the next section.

We have also not yet touched the question of the (x, y) coordinates. Obviously the size of the potential space and the distribution of farmers and facilities affect final outcomes. We use a 20 × 20 grid space. This ensures a large enough space that obvious simple solutions are insufficient. For example, we want to avoid problems which are easily solved by placing a single warehouse. Of course, we also do not want the problem to be unnecessarily complex by increasing the state space more than necessary. Through numerical testing of different possible ranges of (x, y) coordinates, we found that a 20 × 20 landscape ensured an interesting level of complexity without making the problem so large that it is unsolvable.
To implement this model, we programmed a simulation using MATLAB. The simulation begins by randomly scattering farmers across the 20 × 20 grid. Each location has a 60% chance of containing a farmer. In addition, the model assumes an incumbent firm. We place warehouse for the incumbent firm every 10 spaces. When the simulation begins, the entering firm does not have any warehouses in place and neither firm has any IT infrastructure in place. With this system in place, the goal is to develop facility placement strategies for both the incumbent and the entering firm. Over time (we chose 5 iterations), the two firms place facilities in response to each other’s actions in an attempt to maximize profits. In the next section, we discuss our tool for deriving these facility placement strategies.
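The initial landscape described above can be sketched as follows. This is an illustrative Python version, not the authors' MATLAB code. The 0-based coordinates, the reading of "every 10 spaces" as a lattice in both directions, and the `offset` parameter are assumptions, since the text does not specify them.

```python
import random

GRID = 20            # the landscape is a 20 x 20 grid
FARMER_PROB = 0.6    # each location has a 60% chance of containing a farmer
SPACING = 10         # incumbent warehouses are placed every 10 spaces


def make_landscape(seed=None, offset=0):
    """Return (farmers, incumbent_warehouses) as sets of (x, y) cells.

    `offset` (hypothetical) shifts the incumbent's lattice of warehouses.
    The entering firm starts with no warehouses and neither firm has IT
    infrastructure, so no further sets are needed for the initial state.
    """
    rng = random.Random(seed)
    farmers = {
        (x, y)
        for x in range(GRID)
        for y in range(GRID)
        if rng.random() < FARMER_PROB
    }
    incumbent = {
        (x, y)
        for x in range(offset, GRID, SPACING)
        for y in range(offset, GRID, SPACING)
    }
    return farmers, incumbent


farmers, incumbent = make_landscape(seed=1)
```

Under these assumptions the incumbent starts with four warehouses, and roughly 240 of the 400 cells contain a farmer on average.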

4. Classifier and GA
Next we turn our attention to solving this model. We need to represent five elements of the model’s state: farmers, the incumbent firm’s warehouses, the incumbent firm’s IT infrastructure, the entering firm’s warehouses, and the entering firm’s IT infrastructure. In a traditional approach to this problem, we would most likely represent the set of all farmers as a matrix of ones and zeros. If an element of the matrix was set to one, that would represent a farmer at that location; if it was set to zero, then there is no farmer there. We could create similar matrices for the other elements of the state space. Each matrix would be 20 × 20 to represent all of the elements on the map. Further complicating matters, to represent the model at each time step, we would need to replicate this 5 times. Simply describing the model would require 25 matrices with 400 elements each. This gives us $2^{10000}$ different possible states. And that is just the number of possible states, to say nothing of the number of possible solutions.

To represent the solutions, it is common to use a similar approach. Looking at just one of the firm’s problems, we would use a matrix to represent where to place the warehouses at each time step and another matrix to represent where to place the IT infrastructure. If a given element is set to one, then the firm should build at that location. If it is set to zero, then it does not build. In total, we would have two matrices for each of five time periods for a total of ten matrices per firm, and twenty matrices to represent both firms’ strategies. Once again, these are 20 × 20 matrices for a total of 400 elements each. This gives us a total of $2^{8000}$ possible solutions.

Recapping, we have a very large number of possible solutions that we need to match to a very complex state. We would like the solution to work well when applied to more than one starting state. Traditional methods often do not work well for this goal.
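The two counts above follow directly from the matrix sizes: 25 binary matrices of 400 elements give 10,000 bits, hence 2^10000 states, and 20 matrices give 8,000 bits, hence 2^8000 candidate solutions. The short Python check below is only an illustration of that arithmetic; the variable names are invented here.

```python
GRID_CELLS = 20 * 20        # elements per matrix
STATE_ELEMENTS = 5          # farmers plus two facility types for each of two firms
TIME_PERIODS = 5
FIRMS = 2
FACILITY_TYPES = 2          # warehouses and IT infrastructure

# 25 matrices describe the state; 20 matrices describe both firms' strategies.
state_bits = STATE_ELEMENTS * TIME_PERIODS * GRID_CELLS
solution_bits = FACILITY_TYPES * TIME_PERIODS * FIRMS * GRID_CELLS

# Python integers are arbitrary precision, so the counts can be formed exactly.
state_count = 2 ** state_bits
solution_count = 2 ** solution_bits

# Number of decimal digits, to give a sense of scale.
state_digits = len(str(state_count))
solution_digits = len(str(solution_count))
```

Both counts run to thousands of decimal digits, which is why enumerating states or solutions is not an option.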
Using earlier methods, the solution to a given problem is highly specific to the insignificant details of the model. Even small perturbations to the model could give different solutions. Figure 1 shows how even a slight change in the model can significantly change its results. In this figure, there is one customer at each node. The weights on the arcs represent the transportation cost between the nodes. The firm wants to place two warehouses, as represented by the larger circles over two nodes. In this example, the only difference between the two graphs is the transportation cost on one arc. Yet this small change results in the exact opposite facility placement strategy. From this, we see that the solution to a facility location problem can change substantially with even minor perturbations to the model.

Because the solutions can be very specific to minor details of the model, the results may not be applicable to related real-world phenomena. Furthermore, because the solutions are so specific to the model, it may be difficult to understand the model at an intuitive level. As an alternative approach, we propose a method which generates a set of simplified rules or heuristics which, when applied to a model, provide a facility placement strategy. For example, the classifier could learn rules such as “build a warehouse wherever your competitor has a warehouse.” As a result, a good solution found on one specific problem may work well when applied to another similar problem. Granted, we do not want to take this claim too far. A different problem with substantially different fundamental factors would require a new solution. Nonetheless, deriving strategies instead of highly specific solutions may be a significant advantage.

So how do we derive these general rules? In essence, we want a set of if-then clauses which are applied to a model to generate the facility placement strategies. Classifiers are an excellent tool for learning rules of the form if condition then action. We use a set of problems to train the classifier, which derives the set of rules. These rules should then be generalizable to similar problems. In the remainder of this section, we describe how we model a facility location problem as a set of if/then statements.

First, we need a classifier-friendly means of representing the current state of the model. This includes representing the locations of the farmers, the incumbent firm’s warehouses and IT infrastructure, and the entering firm’s warehouses and IT infrastructure. To be classifier-friendly, the representation should lend itself to easy comparisons for the if clause. Hence we use a set of strings to represent each facility and farmer. The first part of each string contains the (x, y) coordinates of the facility. The second part of the string is a set of bits describing the type of object.
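A minimal Python sketch of this message representation is given below. The paper does not say how the coordinates are encoded inside the string, so the `Message` tuple, the layer names, and `encode_messages` are assumptions made for illustration; the order of the five type bits follows Table 2.

```python
from typing import NamedTuple

# Order of the five type bits, following Table 2.
TYPE_BITS = (
    "incumbent_warehouse",
    "competitor_warehouse",
    "incumbent_it",
    "competitor_it",
    "farmer",
)


class Message(NamedTuple):
    x: int
    y: int
    bits: str   # five characters, each "0" or "1"


def encode_messages(layers):
    """Build one message per occupied location.

    `layers` maps a name from TYPE_BITS to a set of (x, y) cells.
    Several bits can be 1 when objects share a location.
    """
    occupied = set().union(*(layers.get(name, set()) for name in TYPE_BITS))
    messages = []
    for x, y in sorted(occupied):
        bits = "".join(
            "1" if (x, y) in layers.get(name, set()) else "0"
            for name in TYPE_BITS
        )
        messages.append(Message(x, y, bits))
    return messages


layers = {
    "farmer": {(1, 2)},
    "competitor_warehouse": {(1, 2), (3, 4)},
}
messages = encode_messages(layers)
```

Here location (1, 2) holds both a farmer and a competitor's warehouse, so two of its type bits are set, while (3, 4) holds only the warehouse.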
The values for each bit are described in Table 2. If the bit is set to 1, then a facility of that type is present. If the bit is set to 0, then no facility of that type is present. More than one facility can be present in a given location. For example, a firm could build both IT infrastructure and a warehouse in a given location. In such a case, multiple bits on the string are set to one. Consistent with the previous work on classifiers, we refer to these strings as messages.

Table 2: Bits describing the type of facility

Bit   Facility
1     Location contains incumbent’s warehouse
2     Location contains competitor’s warehouse
3     Location contains incumbent’s IT infrastructure
4     Location contains competitor’s IT infrastructure
5     Location contains a farmer

Having described a classifier-friendly representation of the state, next we turn to the classifier itself. Consider the condition portion of the classifier. This matches different facilities and farmers described in the previous paragraph. As a result, the representation is very similar to what we described above. The condition consists of a set of five bits which correspond to the five bits in the previous paragraph that represent the facility type. If the condition bits of the classifier match the type bits from a string describing the state, then there is a match. (The conditions from multiple rules could match. Later in this section we discuss which of the rules is actually executed when multiple rules match.) Note that the matching includes a wildcard character. If the bit is set to #, then the facility may or may not be present. For example, the string #0#01 will match any location where the competitor does not have any warehouses or kiosks, the incumbent may or may not have a warehouse or kiosk, and there is definitely a farmer.

Notice that the rules do not match to specific (x, y) coordinates. We want to ensure that the rules learned by the classifier are not highly sensitive to the unique details of the model. For example, we do not want a rule that only executes if there is a farmer at location (12, 6). Instead, we want rules that execute when a more abstract set of conditions is met, such as being in the presence of many farmers.

The next challenge is encoding the actions. We split the actions into three parts: what facility to build, where to build it, and the probability of building. When considering what facility to build, the firm can build a warehouse, a kiosk, or both. This is represented with two binary digits. The first digit, if set to one, tells the firm to build a warehouse and the second digit, if set to one, tells the firm to build a kiosk. If both are set to one, both facilities are built. The next four digits describe where to build the facility. Recall from the above discussion that different locations on the map can trigger the rule. Let the location that triggered the rule sit at the center of a circle. The location portion of the rule defines the radius of that circle. The facility is built at a random point within the radius of the circle. For example, consider the case where location (5,1) triggered a rule with the distance digits of 0110. Then the firm would randomly build within six spaces of (5,1). Finally, the rule should only execute a fraction of the time.
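The condition matching and action encoding can be sketched in a few lines of Python. The bit layout follows Tables 2 and 3, and the probability field is read as a six-bit integer divided by 63, as the paper describes for Table 3. The function names, the 0-based grid, and the use of Euclidean distance for the build radius are assumptions of this sketch, not details given in the paper.

```python
import random


def matches(condition, type_bits):
    """True if a 5-character condition matches a message's type bits.

    '#' is a wildcard; '0' and '1' must match exactly.
    """
    return len(condition) == len(type_bits) and all(
        c == "#" or c == b for c, b in zip(condition, type_bits)
    )


def decode_action(action):
    """Decode a 12-bit action string laid out as in Table 3."""
    if len(action) != 12 or set(action) - {"0", "1"}:
        raise ValueError("action must be 12 binary digits")
    return {
        "build_warehouse": action[0] == "1",       # bit 1
        "build_kiosk": action[1] == "1",           # bit 2
        "radius": int(action[2:6], 2),             # bits 3-6
        "probability": int(action[6:12], 2) / 63,  # bits 7-12
    }


def candidate_sites(center, radius, grid=20):
    """Grid cells within `radius` of `center` (Euclidean distance assumed)."""
    cx, cy = center
    return [
        (x, y)
        for x in range(grid)
        for y in range(grid)
        if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
    ]


# The paper's examples: condition #0#01, radius digits 0110, trigger at (5, 1).
action = decode_action("10" + "0110" + "010101")
rng = random.Random(0)
if matches("#0#01", "10001") and rng.random() < action["probability"]:
    site = rng.choice(candidate_sites((5, 1), action["radius"]))
```

Because a rule fires independently for each matching location, an area with many matching farmers gets more chances to build, without the rule needing any explicit density test.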
For example, if the rule finds a farmer in a given square, the firm would want a warehouse nearby. Yet the firm should not build a warehouse next to every farm. Therefore, it should only build a warehouse for a fraction of the farmers. This method has a second benefit. If there are a large number of farmers in a given area, the firm should be more likely to build in that area. Rather than build in a complex set of logic to determine if there are numerous farmers in the area, the rule has a limited probability of building a warehouse for each farmer. Many farmers in an area means a higher likelihood that one of the times it will build a warehouse. If there are fewer farmers in the area, then there is a lower chance that a warehouse is built.

There is no direct way of implementing a probability as a binary string. There is no integer value n such that $2^n = 100$, so the bits cannot simply represent a percentage. Instead, we use a fraction. Namely, the six bits of the probability are divided by 63. This gives a number between zero and one, which is a valid probability. This probability then determines the chance that the action specified by the rule is actually executed. For example, the string 010101 (which is 21 in decimal) indicates that there is a 33% chance that the rule is executed.

To summarize, each rule consists of two parts: the condition and the action. The condition matches different states in the environment. If it matches, the action is then executed. The condition portion of the rule is identical to the bits describing the type of facility, as summarized in Table 2, and the classifier’s actions are summarized in Table 3.

Table 3: The classifier’s action string

Bit    Meaning
1      Build a warehouse
2      Build a kiosk
3-6    Radius in which to randomly build the facility
7-12   The probability of executing the rule

Figure 1: A small change in one arc of the graph can have a major change in the optimum warehouse location. In the left graph, the optimum locations are the top two nodes. In the right graph, having slightly changed the arc between the left two nodes, the new optimum locations become the bottom two nodes.

As we briefly mentioned in a previous paragraph, it is possible for multiple conditions to match a given message. In addition, we need to assign credit to successful rules and blame to bad ones. This is the role of the bucket brigade algorithm [11]. Under the bucket brigade algorithm, each rule that matches a message places a bid. This bid determines which rule is executed. The highest bidder wins the right to execute. The bids are based on the rule’s previously earned fitness and the rule’s specificity (the ratio of non-wildcard bits to the total number of bits). The amount of the winning bid is then divided up among the rules that executed before it. This means that a rule shares its success with the previous rules which helped make it successful.

The above discussion assumes the existence of a set of rules for the classifier. This leaves the obvious problem of generating the rules. With 17 bits in each classifier, each of which can assume three different states (0, 1, #), there are technically 129,140,163 different possible rules. Obviously enumeration would be a bad approach. Instead, we use a genetic algorithm to search for new rules. We begin with a set of 750 randomly generated rules. We then use those rules to solve the model. Based on the fitness of those rules (in this case fitness is profitability), the genetic algorithm searches for new rules. This process is repeated 750 times. Table 4 lists several other configuration parameters for the genetic algorithm.

Table 4: Configuration of the genetic algorithm

Parameter          Value
# of Generations   750
Population Size    750
Mutation Rate      1%
Elitism            20%
Crossover Rate     80%

Recall our discussion in the literature review regarding path and state dependence. As we discussed, state dependence is important in competitive facility location models. Rather than assume unerringly rational omniscient firms who can arrive at the same equilibrium point, we find solutions that evolve over time as firms respond to each other’s

6

Proceedings of the 40th Hawaii International Conference on System Sciences - 2007

actions. Each decision is based on the current state of the model and the available set of actions. Hence, this is a state dependent solution.

This section describes an approach for solving a facility location problem using classifiers and genetic algorithms. The rules learned by the classifier are general if/then rules, which can then be applied to other situations. In addition, the general approach is state dependent, which we believe more accurately reflects real-world conditions. In the next section, we examine the effectiveness of the classifier-derived facility placement strategies.

5. Analysis and Results

Establishing the effectiveness of this tool is a complicated process. We must show that we find good solutions for the training data and that the results reasonably generalize to other similar problems. This is the first phase in that process: the proof of concept. At this point, we show that the classifier learns reasonable strategies and we discuss how the results can be generalized to other problems. (We discuss the future steps in section 6.)

From the model discussed in section 3, six of the model’s parameters are exogenously defined. These are the uncertainty penalty, transportation costs, price offered to the farmers, the value of the goods to the firms, the cost of warehouses, and the cost of IT infrastructure. To test our model, we use 48 different combinations of these parameters. In this section, we examine the rules generated from these experiments. Obviously, examining every rule from every experiment would be tedious, so we focus on some of the more interesting and informative trends.

We begin by examining the best ten rules from one of the experiments. For simplicity, we convert the bit strings into a more human-friendly description. The top ten rules from experiment number 5 are:

1. If the competitor has a warehouse at the location, build IT infrastructure within 7 spaces with 24% probability.
2. Do nothing/no-op
3. If the competitor has a warehouse at the location, build a warehouse in the same location with 50% probability
4. If the competitor has a warehouse at the location, build a warehouse within 16 spaces with 50% probability
5. If the competitor has a warehouse at the location, build a warehouse within 17 spaces with 50% probability
6. Do nothing/no-op
7. Do nothing/no-op
8. Do nothing/no-op
9. Do nothing/no-op
10. Do nothing/no-op

Figure 2: The distribution of the average fitness scores of the classifier rules. The x axis is the rank of the rule and the y axis is the average fitness of that rule.

Several interesting trends appear in these results. First, notice the large number of no-ops in the best answers. No-ops have a parallel with the computation of how many facilities to build. We cover this topic in more detail later in this section. Second, note that only one rule invests in IT. To further understand this, we went back to the parameters chosen for this experiment. In this case, the farmers are risk neutral. As shown in the model from section 3, risk neutral farmers will not get much value from IT. Hence, it makes sense that IT investments would be rare in this model. Finally, notice that two of the rules (rules 4 and 5, which differ only in the build radius) are almost identical. This is a potential problem with our approach. These two rules, which have essentially the same effect, compete with each other. Because we have two rules that share the same fitness instead of one stronger rule, this rule may not execute as frequently as we would like.

But rather than focusing only on the individual runs, we now turn to a broad analysis of the trends in the rules. Broadly speaking, the rules tell the firm to build a warehouse, IT infrastructure, both, or neither (i.e. a no-op). Of all of these rules, 4.51% build warehouses, and 3.90% build IT infrastructure. The remaining rules do not result in any actions being taken. Also recall that each rule has a probability of executing. If these rules have a zero percent chance of executing, they are classified as no-ops. In the next few paragraphs, we break these figures down into additional details.

Of the different experiments, 72.9% included rules to build next to the other firm’s warehouse. This appears to be a strategy quite common in the fast food industry. We call this the Burger King strategy. Burger King builds franchises near McDonald’s. If the market is good enough for a competitor, then it must be good enough for you, too. This rule also brings up an interesting issue of generalizations. If there are no competitor’s warehouses currently on the map, this type of rule is worthless. It does not generalize well to new markets. In a brand new market with no competitors, even if there is a wealth of potential customers, this rule will not execute. Hence, although this type of rule may generalize well to other competitive markets, it would be horrible in new or emerging markets.

When looking at the IT infrastructure, there is a similar story. 76.6% of all of the rule sets included a rule to build IT infrastructure near a competitor’s IT infrastructure. We call this the Nicholas Carr IT strategy. IT is a competitive necessity, not a competitive advantage. Just build enough to keep up with the competition and then you will do fine. Similar to the problem of building warehouses only near other warehouses, this rule doesn’t help when trying to build IT infrastructure in new markets. Yet the high frequency of this rule shows that it generally does well in established markets.

Looking at the no-ops in a little more detail, these have a parallel to computing the number of facilities. More specifically, consider the role of the probabilities in the action strings. If the probability is high, then more facilities will be built. If the probability is low, then the number of facilities will be lower. With this in mind, we can compare the role of the probability of executing and the no-ops to two canonical results from the previous facility location literature.
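The link between a rule's execution probability and the number of facilities built can be illustrated with a small sketch. It assumes, purely for illustration, that a rule fires independently for each message it matches; the function names and this independence assumption are ours and are not taken from the model.

```python
import random


def expected_builds(n_matching_messages: int, p_execute: float) -> float:
    """Expected number of facilities if a rule fires independently per matching message.

    Assumption (not from the paper): every matching message gives the rule
    one independent chance to act, with probability p_execute.
    """
    return n_matching_messages * p_execute


def simulate_builds(n_matching_messages: int, p_execute: float, rng: random.Random) -> int:
    """Draw one realization of the number of facilities built."""
    return sum(1 for _ in range(n_matching_messages) if rng.random() < p_execute)


if __name__ == "__main__":
    rng = random.Random(0)
    for p in (0.0, 0.24, 0.50):
        draws = [simulate_builds(20, p, rng) for _ in range(1000)]
        print(p, expected_builds(20, p), sum(draws) / len(draws))
```

Under this reading, a rule with zero probability never builds anything, which is why it behaves like a no-op, and raising the probability raises the expected facility count proportionally.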
Previous work has shown that as transportation costs increase, the number of facilities needed increases. In addition, as the cost of warehouses increases, the number of warehouses decreases. How do our rules compare with these common trends in operations management? Table 5 shows a regression of the probability of the rule executing on transportation costs and warehouse costs. Recall that this probability is a proxy for the number of facilities actually built. As we would expect, an increase in transportation costs increases the number of facilities built. Also, an increase in the cost of warehouses results in a decrease in the number of warehouses built. Hence, our rules’ predictions are consistent with this previous research.

The above analysis has looked at all of the rules regardless of fitness. As the genetic algorithm searches for rules, it will combine the best rules to create the new strategies.
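How a genetic algorithm might combine good rules into new ones can be sketched as follows. The 17-symbol rules over the alphabet {0, 1, #}, the population of 750, and the 750 generations come from the text; the selection scheme, elite fraction, mutation rate, and the toy fitness function are assumptions made only so the sketch runs. In the paper, fitness is the profitability obtained by running the rules in the model.

```python
import random

ALPHABET = "01#"   # '#' is the wildcard symbol
RULE_LENGTH = 17   # bits per rule, as stated in the text


def random_rule(rng: random.Random) -> str:
    return "".join(rng.choice(ALPHABET) for _ in range(RULE_LENGTH))


def specificity(rule: str) -> float:
    """Ratio of non-wildcard bits to the total number of bits."""
    return sum(ch != "#" for ch in rule) / len(rule)


def crossover(a: str, b: str, rng: random.Random) -> str:
    """Single-point crossover (an assumed operator)."""
    point = rng.randrange(1, RULE_LENGTH)
    return a[:point] + b[point:]


def mutate(rule: str, rate: float, rng: random.Random) -> str:
    """Resample each symbol with probability `rate` (an assumed operator)."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else ch for ch in rule)


def next_generation(population, fitness, rng, elite_fraction=0.2, mutation_rate=0.02):
    """Keep the fittest rules and fill the rest with their mutated offspring."""
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[: max(2, int(len(ranked) * elite_fraction))]
    children = list(parents)
    while len(children) < len(population):
        a, b = rng.sample(parents, 2)
        children.append(mutate(crossover(a, b, rng), mutation_rate, rng))
    return children


def evolve(fitness, pop_size=750, generations=750, seed=0) -> str:
    """Run the loop and return the fittest rule; pop_size must be at least 2."""
    rng = random.Random(seed)
    population = [random_rule(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population = next_generation(population, fitness, rng)
    return max(population, key=fitness)


if __name__ == "__main__":
    print(len(ALPHABET) ** RULE_LENGTH)  # size of the rule space
    # Toy stand-in fitness: count of '1' symbols (NOT the paper's profitability).
    print(evolve(lambda rule: rule.count("1"), pop_size=50, generations=30))
```

Because the parents are carried over unchanged, the best rule found so far is never lost between generations, which is one simple way to get the monotone improvement one would hope to see in a best-solution-over-time plot.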

Table 5: The effect of transportation and warehouse costs on the probability of being a no-op

Variable              Coefficient   p
Transportation cost   2.7237        0.0003
Warehouse cost        -0.0030       < 0.0001
Intercept             0.3755        < 0.0001
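As a worked reading of Table 5, the sketch below plugs the reported coefficients into a plain linear predictor. The linear functional form and the treatment of the dependent variable as the probability of a rule executing (as in the running text) are assumptions, and the input values are arbitrary because the units of the cost parameters are not given here.

```python
# Coefficients copied from Table 5.
INTERCEPT = 0.3755
B_TRANSPORT = 2.7237
B_WAREHOUSE = -0.0030


def fitted_value(transport_cost: float, warehouse_cost: float) -> float:
    """Linear predictor built from Table 5 (assumed specification)."""
    return INTERCEPT + B_TRANSPORT * transport_cost + B_WAREHOUSE * warehouse_cost


if __name__ == "__main__":
    base = fitted_value(0.05, 20.0)  # arbitrary illustrative inputs
    print("higher transport cost:", fitted_value(0.06, 20.0) - base)
    print("higher warehouse cost:", fitted_value(0.05, 30.0) - base)
```

The first difference is positive and the second is negative, matching the direction of the two canonical results discussed in the text.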

Figure 3: An example of the improvement of the best solution over time. The x axis is the generation number and the y axis is the firm’s profitability

Hence, the above rules should be related to the best rules. Nonetheless, that still does not tell us how many of the rules derived above are actually good rules. Figure 2 looks at the distribution of the fitness of the rules. We ranked the rules in each experiment by fitness, averaged the fitness at each rank across all of the experiments, and then plotted that data. Beyond the 30th-ranked rule, the average fitness was small enough that we stopped plotting. Based on this graph, we see that approximately the best 15 rules are useful in any classifier. The other 735 are still useful in the search process, but applying the rules only requires a very small subset of the total rules.

If we restrict our attention to only the best rules in the classifier, do the general trends change? To test this, we did a limited analysis of only the 10 best rules from each of the experiments. The biggest change was the number of no-ops. Nine out of ten of the best rules are no-ops. This is somewhat intuitively appealing. Given that a large number of the messages in the queue represent individual farmers, we do not want the system to build a warehouse or kiosk next to every farmer. Hence, we would expect to have a large number of no-ops in a good set of rules. Other than the increased no-ops, most of the results are very similar. The Burger King strategy and the Nicholas Carr strategy are still the most common rules for building warehouses and IT respectively. Considering that the genetic algorithm derives new rules from the best rules, it is not too surprising that there is a similarity between the best rules and the rest of the population.
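The rank-then-average procedure behind Figure 2 can be sketched in a few lines. The input layout (one list of rule fitness values per experiment) and the function name are assumptions for illustration; the cutoff of 30 ranks follows the text.

```python
def average_fitness_by_rank(experiments, top_n=30):
    """Average the fitness found at each rank across experiments.

    experiments: non-empty list of lists of rule fitness values,
    one inner list per experiment (assumed layout).
    Returns averages for ranks 1..top_n, best rank first.
    """
    ranked = [sorted(fitnesses, reverse=True) for fitnesses in experiments]
    depth = min(top_n, min(len(r) for r in ranked))
    return [sum(r[i] for r in ranked) / len(ranked) for i in range(depth)]


if __name__ == "__main__":
    toy = [[1.0, 3.0, 2.0], [6.0, 4.0, 5.0]]  # made-up fitness values
    print(average_fitness_by_rank(toy, top_n=2))
```

Averaging by rank rather than by rule identity is what allows results from experiments with entirely different rule sets to be placed on one curve.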


We also need to know when to stop searching. In our experiment, we ran the genetic algorithm for 750 generations. Through some trial and error, we discovered that this was a reasonable amount of time. (We originally ran it for only 250 generations but found that was insufficient.) Figure 3 shows the progression of one of the experiments over time. Not surprisingly, the GA weeds out the worst strategies quickly in the early generations. By around generation 600, the improvement has slowed dramatically. Adding another 25% to the number of generations to be on the cautious side, we concluded that 750 is an appropriate length.

Notice that, unlike most other approaches to a facility location problem, these rules are rather generic. A traditional solution would look like, “build a warehouse at point (3, 12).” This type of rule is tied to specific points on the grid. Using our method, the rules reference features of the problem, such as, “build a warehouse next to a competitor’s warehouse.” If the competitor happened to have a warehouse at point (3, 12), these two rules would have the same result. But the more general rule can adapt to a broader set of situations. Hence, these results are more generalizable.
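The contrast between a coordinate-bound answer and a feature-based rule can be made concrete with a toy sketch. The function names and the set-of-points map representation are hypothetical, and "next to" is simplified to "at the same location" so that the two rules coincide exactly in the (3, 12) case described above.

```python
from typing import Optional, Set, Tuple

Point = Tuple[int, int]


def specific_rule(competitor_warehouses: Set[Point]) -> Optional[Point]:
    """Traditional answer: always build at the fixed point (3, 12)."""
    return (3, 12)


def general_rule(competitor_warehouses: Set[Point]) -> Optional[Point]:
    """Feature-based answer: build where a competitor has a warehouse, if any.

    Returns None when no competitor warehouse exists, i.e. the rule does not fire.
    """
    if not competitor_warehouses:
        return None
    return min(competitor_warehouses)  # deterministic pick, for the sketch only


if __name__ == "__main__":
    for grid in ({(3, 12)}, {(8, 1)}, set()):
        print(grid, specific_rule(grid), general_rule(grid))
```

On the first map both rules agree; on the second only the general rule tracks the competitor; on the empty map the general rule stays silent, which is the weakness in new markets noted earlier for the Burger King strategy.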

6. Conclusion

We have identified a new tool for solving facility location problems. One of the biggest improvements of this algorithm is the more generalizable solutions. Instead of finding a specific answer that is highly sensitive to the parameters of a specific problem, our classifier-based approach searches for general rules. These rules can then be applied to other similar problems.

General solutions are important when trying to apply the model’s results to real world problems. It is impossible to create a model that perfectly reflects actual business problems. Therefore, the model’s results should be robust to slightly different situations. By focusing on generalizable solutions, our approach is a step in this direction. Managers can use our tool to identify solutions to facility placement problems with greater confidence that the solutions are adaptable to real world problems.

We do not want to overstate the quality of our generalizable rules. So far we have only shown that we can generate them using a classifier and GA and that the results appear reasonable. In future work, we need to address two additional questions. First, we need to address whether the classifier is finding good solutions to the problems in the training set. We have shown that the results are reasonable, but we have not yet compared the results to other methods, such as those discussed in the literature review. In addition, we need to tie this into work on approximation theory. This would help us establish how well our results generalize to other situations.

Broadly speaking, we have identified a possible method for finding general solutions to facility location problems. This work is helpful for researchers who need to optimize a group of similar problems or who would like more intuitive answers to their problems. It can help managers who need models that are less sensitive to small quirks, which makes the results easier to translate into real world scenarios. While we still have work to do before we can properly claim success, our initial results appear quite promising.
