
Noname manuscript No. (will be inserted by the editor)

Parallel algorithms for continuous multifacility competitive location problems†

J.L. Redondo∗ · J. Fernández+ · I. García∗ · P.M. Ortigosa∗

Received: date / Accepted: date

Abstract We consider a continuous location problem in which a firm wants to set up two or more new facilities in a competitive environment. Both the locations and the qualities of the new facilities are to be found so as to maximize the profit obtained by the firm. This hard-to-solve global optimization problem has been addressed in [38] using several heuristic approaches. Through a comprehensive computational study, it was shown that the evolutionary algorithm UEGO is the heuristic which provides the best solutions. In this work, UEGO is parallelized in order to reduce the computational time of the sequential version, while preserving its capability of finding the optimal solutions. The parallelization follows a coarse-grain model, in which each processing element executes the UEGO algorithm independently of the others during most of the time. Nevertheless, some genetic information can occasionally migrate from one processor to another, according to a migratory policy. Two migration processes, named Ring-Opt and Ring-Fusion2, have been adapted to cope with the multiple facilities location problem, and a superlinear speedup has been obtained.

Keywords Evolutionary algorithm · Parallelization · Coarse-grain model · Migratory policies · Continuous location · Competition

1 Introduction

Location science deals with the location of one or more facilities in a way that optimizes a certain objective (minimization of transportation costs, minimization of social costs, maximization of market share, etc.). For an introduction to the topic see [14, 15, 21]. In continuous facility location problems, the set of possible locations is the plane or a region of the plane. Continuous location theory is vast. Apart from the many situations in which the facilities can be located in any arbitrary place in an area (nuclear plants, prisons, garbage

† This work has been supported by the Ministry of Education and Science of Spain under the research projects TIN2005-00447, ECO2008-00667/ECON and P06-TIC-01426 (CICE, Junta de Andalucía), in part financed by the European Regional Development Fund (ERDF). ∗ Dept. Computer Architecture and Electronics, University of Almería, Spain, E-mail: {juani, inma, pilar}@ace.ual.es · + Dept. Statistics and Operational Research, University of Murcia, Spain, E-mail: [email protected]


dumps, strategic military facilities, . . . ), problems whose discrete or network representations have a size that makes them unmanageable are also reformulated as continuous models. Whereas some continuous location models are easy to solve (like, for instance, the classical Weber problem [45]), other more realistic models lead to global optimization problems which are very hard to solve, for instance the location of undesirable facilities (see [17, 18]), competitive location models (see [11, 20]) or the well known p-center problem (see [13]), to name a few. Several global optimization techniques able to solve different location models have been proposed to cope with these difficult problems. However, those algorithms are unable to solve some problems: either they are too slow or the computer runs out of memory. Furthermore, with the exception of the interval methods [18–20], they have been designed to deal with single facility location problems, i.e., problems in which a single facility is to be located. Thus, heuristic procedures have to be developed to solve those other more difficult problems. Usually, researchers propose ad-hoc heuristics for the specific problem they are handling, although well known meta-heuristics, such as tabu search [23], simulated annealing [33], variable neighborhood search [25] or ant systems [10], have also been applied to different location problems (see [5, 12, 46]). Genetic algorithm based approaches [24, 26] are not an exception, and although rather scarce, they have also been applied to solve a few location problems, especially in discrete space (see for instance [3, 29]). However, the use of genetic or evolutionary algorithms in continuous location problems can be qualified as anecdotal: to the extent of our knowledge, apart from our companion papers [36–38], the only published references are [6, 27, 42], and all three deal with the same problem, namely, the uncapacitated location-allocation problem.

In this paper we deal with a multiple competitive facilities location problem presented in [44]. Competitive location deals with the problem of locating facilities to provide a service (or goods) to the customers (or consumers) of a given geographical area where other facilities offering the same service are already present (or will enter the market in the near future). Thus, the new facilities will have to compete for the market with the existing facilities. Many competitive location models are available in the literature, see for instance the survey papers [16, 32, 35] and the references therein.

In [20, 44], a company wants to enlarge its presence in a given geographical region by opening new facilities. Its objective is the maximization of its profit. In the model, the demand is concentrated at n demand points, whose locations and buying power are known. The location and quality of the existing facilities are also known. It is considered that the demand points split their buying power among all the facilities proportionally to the attraction they feel for them. The attraction (or utility) function of a customer towards a given facility depends on the distance between the customer and the facility, as well as on other characteristics of the facility which determine its quality. The locations and the qualities of the new facilities are the variables of the problem. The problem is highly nonlinear, with many local optima. In [44], the case where two facilities are to be located is considered.
An interval branch-and-bound method is proposed there to solve the problem exactly, but the running time makes the approach inadequate (the time required for solving one problem varies from 6 to 140 hours). In fact, the interval method could not solve problems where more than two facilities have to be located. Recently, in [38], several new procedures were investigated. In particular, the evolutionary algorithm UEGO introduced in [34] was adapted to cope with the multiple facilities location problem. In a set of fifteen problems in which two facilities had to be located and for which the optimal solutions were known, only the evolutionary algorithm UEGO was able to obtain the optimal solution with a 100% success rate. Furthermore, in a comprehensive computational study on a set of problems in which up to ten facilities had


to be located, the evolutionary algorithm UEGO obtained better results than the other algorithms. The differences between the solutions provided by UEGO and the other algorithms were remarkable, which proved the high nonlinearity of the problem. Nevertheless, the CPU time of UEGO increases linearly with the number of facilities p to be located, and it seems to increase, although not linearly, with both the number of demand points and the number of existing facilities. Therefore, it is advisable to parallelize UEGO for handling problems with high computational requirements.

Many examples of successful parallel implementations of evolutionary algorithms such as UEGO can be found in the literature, see for example [1]. Typical and widely used strategies are: (i) Coarse-grain, where each processing element executes the corresponding algorithm (with smaller and different subpopulations) independently of the remaining ones during most of the time; however, some genetic information is exchanged among processors according to a migratory policy. (ii) Fine-grain, where selection and mating procedures are restricted to a small neighborhood, but neighborhoods overlap, permitting some interaction among all the individuals. (iii) Master-slave, where one processing element (the master) has unidirectional control over one or more other processing elements (the slaves).

In [41], a coarse-grain strategy, called Ring-Opt, and a master-slave strategy were applied to a related genetic algorithm when optimizing discrete competitive location problems. The coarse-grain method was able to obtain either a linear or a superlinear speedup in most cases, and the capability of finding the optimal solutions was maintained. However, the master-slave strategy did not show such good performance. In [37], the single facility case of the problem considered in this paper [20] was solved using UEGO. A coarse-grain parallelization of that algorithm was presented in [39], where several migratory policies were implemented and evaluated, besides the Ring-Opt strategy. Results showed that only two of them, Ring-Opt and Ring-Fusion2, were able to obtain the optimal solution in all the instances, with Ring-Fusion2 being more efficient than Ring-Opt. In this work, both parallel techniques, Ring-Opt and Ring-Fusion2, have been extended and adapted to cope with the multifacility problem.

The paper is organized as follows. The location model is presented in Section 2. In Section 3 the evolutionary algorithm UEGO, adapted to the multiple facilities location problem, is briefly described. In Section 4 we present the migratory policies for a coarse-grain implementation of UEGO. In Section 5 we carry out computational experiments to study the performance of the different algorithms. The paper ends with some conclusions in Section 6.

2 A continuous competitive multifacilities location and design problem

A chain wants to locate p new facilities in a given area of the plane, where there already exist m facilities offering the same goods or product. The first k of those m facilities belong to the chain, 0 ≤ k < m. The demand, supposed to be fixed regardless of the conditions of the market, is concentrated at n demand points, whose locations and buying power are known. The location and quality of the existing facilities are also known. The following notation will be used throughout:

Indices
i : index of demand points, i = 1, . . . , n.
j : index of existing facilities, j = 1, . . . , m (the first k of those m facilities belong to the chain).
l : index of new facilities, l = 1, . . . , p.

Variables
zl : location of the l-th new facility, zl = (xl, yl).
αl : quality of the l-th new facility (αl > 0).
nfl : variables of the l-th new facility, nfl = (zl, αl).
nf : variables of the problem, nf = (nf1, . . . , nfp).

Data
pi : location of the i-th demand point.
wi : demand (or buying power) at pi.
fj : location of the j-th existing facility.
dimin : minimum distance from pi at which the new facilities can be located.
dij : distance between pi and fj.
αij : quality of fj as perceived by pi.
gi(·) : a non-negative non-decreasing function.
αij/gi(dij) : attraction that pi feels for fj.
γi : weight for the quality of the new facilities as perceived by demand point pi.
S : region of the plane where the new facilities can be located.
qmin : minimum allowed quality for the new facilities.
qmax : maximum allowed quality for the new facilities.

Miscellaneous
dizl : distance between pi and zl.
γi αl/gi(dizl) : attraction that pi feels for nfl.

We may assume that gi(dizl) > 0 ∀i, l, because any demand point i for which gi(dizl) = 0 for some l would be totally lost to the new facilities, so it may simply be left out of the model. Following Huff's suggestion [28], we consider that the patronizing behavior of customers is probabilistic, that is, demand points split their buying power among all the facilities proportionally to the attraction they feel for them. This rule is the most appropriate one unless there is a central control that forces the selection of only one facility, usually the closest one (as for instance when the governments allocate students to public schools or users to public hospitals). With the previous notation, the total market share attracted by the chain is

\[
M(nf) = \sum_{i=1}^{n} w_i \,
\frac{\displaystyle\sum_{l=1}^{p} \frac{\gamma_i \alpha_l}{g_i(d_{iz_l})} + \sum_{j=1}^{k} \frac{\alpha_{ij}}{g_i(d_{ij})}}
     {\displaystyle\sum_{l=1}^{p} \frac{\gamma_i \alpha_l}{g_i(d_{iz_l})} + \sum_{j=1}^{m} \frac{\alpha_{ij}}{g_i(d_{ij})}} .
\]

The problem to be solved is then

\[
\begin{array}{ll}
\max & \Pi(nf) = F(M(nf)) - \displaystyle\sum_{l=1}^{p} G(nf_l) \\[2mm]
\mbox{s.t.} & d_{iz_l} \ge d_i^{min}, \quad i = 1, \ldots, n, \; l = 1, \ldots, p, \\
 & \alpha_l \in [q^{min}, q^{max}], \quad l = 1, \ldots, p, \\
 & z_l \in S \subset \mathbb{R}^2, \quad l = 1, \ldots, p,
\end{array}
\tag{1}
\]


where Π(nf) is the profit obtained by the chain, calculated as the total income minus the costs. This profit disregards the operating costs of the chain's own existing facilities, since these are considered to be constant. F(M) is a strictly increasing differentiable function which transforms the market share M into expected sales. G(nfl) is a differentiable function which gives the operating cost of a facility located at zl with quality αl.

In our studies we have assumed the function F to be linear, F(M) = c · M, where c is the income per unit of goods sold. Of course, other functions can be more suitable depending on the real problem considered. The function G(nfl) should increase as zl approaches one of the demand points, since it is rather likely that around those locations the operating cost of the facility will be higher (due to the value of land and premises, which will make the cost of buying or renting the location higher). On the other hand, G should be a non-decreasing and convex function in the variable αl, since the more quality we expect from the facility, the higher the costs will be, at an increasing rate. In our experiments we have assumed G to be separable, i.e., of the form G(nfl) = G1(zl) + G2(αl), with G1(zl) = ∑_{i=1}^{n} Φi(dizl), where Φi(dizl) = wi/((dizl)^{φi0} + φi1), φi0, φi1 > 0, and G2(αl) = (αl/ξ0)^{ξ1}, ξ0 > 0, ξ1 ≥ 1. Other possible expressions can be found in [20].

The model is rather general and can be specialized for many real situations. Of course, its ability to model general situations comes at the expense of the difficulty of its solution: the problem is neither concave nor convex. For the simpler model in which the qualities of the new facilities are given in advance (they are not variables of the problem), the operating costs of the facilities are disregarded (i.e., G = 0), and no constraints are taken into account, the numbers of local optima in a given problem with n = 100, m = 7, k = 0 were 11, 470 and 23714 for p = 1, 2 and 3, respectively, and more than 90000 for p = 4 (see [12]).
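To make the model concrete, the following Python sketch evaluates the market share M(nf) and the profit Π(nf) of problem (1) for given locations and qualities, assuming the linear F and the separable G described above. It is an illustrative sketch only: the parameter names and default values (c, phi0, phi1, xi0, xi1) are placeholders taken as uniform over all demand points, and the feasibility constraints of (1) are not checked.

import numpy as np

def market_share(z, alpha, p_dem, w, gamma, f_loc, a_exist, k, g):
    """Huff-like market share M(nf) captured by the chain.

    z: (p, 2) locations of the new facilities; alpha: (p,) their qualities.
    p_dem: (n, 2) demand points; w: (n,) buying power; gamma: (n,) quality weights.
    f_loc: (m, 2) existing facilities; a_exist: (n, m) perceived qualities;
    the first k existing facilities belong to the chain; g: distance-decay function.
    """
    d_new = np.linalg.norm(p_dem[:, None, :] - z[None, :, :], axis=2)      # (n, p)
    d_old = np.linalg.norm(p_dem[:, None, :] - f_loc[None, :, :], axis=2)  # (n, m)
    attr_new = gamma[:, None] * alpha[None, :] / g(d_new)                  # attraction to new facilities
    attr_old = a_exist / g(d_old)                                          # attraction to existing ones
    chain = attr_new.sum(axis=1) + attr_old[:, :k].sum(axis=1)             # chain's share of attraction
    total = attr_new.sum(axis=1) + attr_old.sum(axis=1)                    # attraction to all facilities
    return float(np.sum(w * chain / total))

def profit(z, alpha, c=1.0, phi0=2.0, phi1=0.5, xi0=1.0, xi1=2.0, **model):
    """Pi(nf) = F(M(nf)) - sum_l G(nf_l), with F(M) = c*M and separable G.

    The exponents/offsets phi0, phi1, xi0, xi1 are illustrative scalars; in the
    paper the location-cost parameters may vary with the demand point.
    Feasibility (distance and quality constraints of (1)) is not enforced here.
    """
    M = market_share(z, alpha, **model)
    d_new = np.linalg.norm(model["p_dem"][:, None, :] - z[None, :, :], axis=2)
    G1 = np.sum(model["w"][:, None] / (d_new ** phi0 + phi1))   # location-dependent cost
    G2 = np.sum((alpha / xi0) ** xi1)                           # quality-dependent cost
    return c * M - (G1 + G2)

# Example call (any dict with the keys above works, e.g. the instance generator
# sketched in Section 5.2): profit(z, alpha, g=lambda d: 1.0 + d ** 2, **problem)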

3 The evolutionary algorithm UEGO

UEGO is a multimodal algorithm which is able both to solve optimization problems whose objective function has multiple local optima and to discover the structure of these optima as well as the global optimum. In a multimodal domain, each peak can be thought of as a niche. The analogy with nature is that within the environment there are different subspaces, niches, that can support different types of life (species or organisms). The concept of niche is renamed species in many niching or speciation methods [1, 30, 31, 34].

A species in UEGO can be thought of as a window on the whole search space. This window is defined by its center and a radius. The center is a solution, and the radius indicates its attraction area, which covers a region of the search space and hence multiple solutions. The radius of a species is neither constant along the execution of UEGO nor the same for each species: it is a decreasing function of the level index (or cycle, or generation). The parameter L indicates the maximum number of levels in the algorithm. At each level i (with i ∈ [1, L]), the radius Ri does not change. The radius function is defined in such a way that at the first level it covers the whole search domain (R1 equals its diameter), and it converges to zero as the number of levels tends towards infinity. In addition to the radius value Ri, each level has two maxima on the number of function evaluations (f.e.), namely newi (maximum f.e. allowed when creating new species) and ni (maximum f.e. allowed when optimizing species).
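The cooling behaviour of the radii can be illustrated with a schedule that interpolates geometrically between the level-1 radius (the diameter of the search domain) and the user-given smallest radius RL. This is one plausible choice consistent with the description above, not necessarily the exact formula used in [34, 38].

def radius_schedule(R1, RL, L):
    """Monotonically decreasing radii R_1 >= R_2 >= ... >= R_L.

    R1: diameter of the search domain (level 1); RL: smallest radius (level L).
    The geometric decay is an assumption made here for illustration only.
    """
    if L == 1:
        return [R1]
    ratio = (RL / R1) ** (1.0 / (L - 1))
    return [R1 * ratio ** (i - 1) for i in range(1, L + 1)]

# e.g. radii = radius_schedule(R1=search_domain_diameter, RL=0.25, L=15),
# using the L and RL values reported in Section 5.2.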


During the optimization process, a list of species is kept by UEGO. This concept, species-list, is equivalent to the term population in an evolutionary algorithm. UEGO is in fact a method for managing this species-list (i.e. creating, deleting and optimizing species). The maximum length of the species list is given by the input parameter max spec num (maximum population size). In UEGO every species is intended to occupy a local maximizer of the fitness function, without knowing the total number of local maximizers in the fitness landscape. This means that when the algorithm starts it does not know how many species there will be at the end. For this purpose, UEGO uses a non-overlapping set of species which defines sub-domains of the search space. As the algorithm progresses, the search process can be directed towards smaller regions by creating new sets of non-overlapping species defining smaller sub-domains. This process is a kind of cooling method similar to simulated annealing. A particular species is not a fixed part of the search domain; it can move through the space as the search proceeds.

Additionally, UEGO is a hybrid algorithm that introduces a local optimizer into the evolution process. In this way, at every generation, UEGO performs a local optimizer operation on each species, and these locally optimal solutions replace the caller species. Notice that any single step taken by the optimizer in a given species is shorter than the radius of the given species. UEGO is abstract in the sense that the 'species-management' and the cooling mechanism have been logically separated from the actual local optimization algorithm. Therefore it is possible to implement any kind of optimizer to work inside a species. For the multiple facilities location problem considered in this paper we have used a Weiszfeld-like algorithm introduced in [38] as local optimizer (namely, WLMa).

'Species-management' consists of procedures for creating, fusing and eliminating species during the whole optimization process. At the beginning, a single species (the root species) exists, and as the algorithm evolves and applies genetic operators, new species can be created and some of the existing species can be fused. Given two parents (random points in the attraction area of a species), a new species is created when it is likely that the parents are on different hills. Every time a new species is created, a radius, whose value depends on the current level, is associated with it. In this way, species which have been created at different levels have different radii; the species created at the highest levels have the smallest radii. Species with small radii behave as if they were cooler: they explore a relatively small area, their motion in the space is slower, but they are able to differentiate among local optima that are relatively close to each other. It is interesting to point out that a location in the search space can belong to different species (and notice that the reproduction operators, i.e. optimizing and creating new species, are applied inside the area covered by the species, which evolves independently).

In UEGO, the fusion procedure depends on the current level of the algorithm. Each level i determines a different radius, Ri. When the algorithm is in a fusion stage, it unites species of the species list whose centers are closer than the distance defined by Ri. If two species are fused, the center of the new species is the center with maximum objective function value, and the assigned level will be the minimum level of the fused species. From the point of view of the species list, the species with the lower level absorbs the other.
As a consequence of the creation procedure, and in spite of the fusion mechanism, the list length can become larger than the parameter max spec num. In this case, some species have to be deleted. Higher-level species are deleted first, so species with larger radii are always kept. In particular, one species at level 1 (whose radius is equal to the diameter of the search domain) always exists, making it possible to escape from local optima. The reader is referred to [38] for a more detailed description of the UEGO algorithm.
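The fusion and deletion rules just described can be summarized by the following minimal Python sketch: centers closer than Ri are merged, keeping the best center and the lowest level, and when the list exceeds the maximum length, higher-level species are deleted first. The Species container and function names are illustrative, not the authors' implementation.

from dataclasses import dataclass
import math

@dataclass
class Species:
    center: tuple      # candidate solution (locations and qualities of the p facilities)
    value: float       # objective value at the center
    level: int         # level at which the species was created (defines its radius)

def fuse_species(species_list, Ri):
    """Merge species whose centers are closer than Ri (greedy pairwise fusion)."""
    fused = []
    for s in species_list:
        for t in fused:
            if math.dist(s.center, t.center) < Ri:
                # the better center survives; the lower-level species absorbs the other
                if s.value > t.value:
                    t.center, t.value = s.center, s.value
                t.level = min(t.level, s.level)
                break
        else:
            fused.append(s)
    return fused

def shorten_species_list(species_list, max_spec_num):
    """Delete higher-level species first, so species with larger radii are always kept."""
    return sorted(species_list, key=lambda s: s.level)[:max_spec_num]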


4 Parallel UEGO: coarse-grain strategy

UEGO is a population-based method, i.e., it works with a population of candidate solutions (species). This means that there exists an intrinsic parallelism which consists of dividing the species among the available processors. In the literature we can find many examples of successful parallelizations of population-based methods. Some of them use a single population, while others divide the population into several relatively isolated subpopulations. Some methods can exploit massively parallel computer architectures, while others are better suited to multicomputers with fewer and more powerful processing elements [7, 22].

In this section, UEGO is parallelized following a coarse-grain strategy. In a coarse-grain model, each processing element executes an algorithm independently of the remaining ones during most of the time [8]. The idea is that different processing elements work with smaller and different subpopulations in such a way that, when merging all the subpopulations, a population similar to that of the sequential version is obtained. Nevertheless, some information can migrate from one processing element to another according to a migratory policy, which is controlled by the following parameters:

– Interval of migration: it establishes how often a certain number of individuals migrate from one processing element to another.
– Rate of migration: it indicates the number of individuals that are sent to other processing elements when the migration interval is fulfilled.
– Selection criterion: it determines the policy applied for the selection of the migrating individuals.

Assume that P processing elements are available. In our particular case, each processing element executes UEGO in an independent way. However, the maximum population size allowed for every processing element (max spec num(P)) is P times smaller than that of the sequential version, i.e., the maximum length of the species list in each processor is max spec num(P) = max spec num/P. This implies that the number of function evaluations for the creation and optimization of species used by every processing element is reduced accordingly. Therefore, the parameters newi and ni are also divided by P, yielding newi(P) = newi/P and ni(P) = ni/P, which are the maximum numbers of function evaluations allowed when creating new species and when optimizing species, respectively, per processing element. Moreover, in order to diversify the subpopulations as much as possible, random and different root species are generated for every processing element. From its particular root species, each processing element executes the algorithm UEGO and composes a subpopulation.

Note that different processing elements can converge to the same area of the search space, which means that they can maintain similar species in their corresponding lists. This increases the redundant work, which helps to give robustness to the algorithm, but it reduces the efficiency of the parallel algorithm. On the other hand, some processing elements can work on promising areas with many local and global optima, while others can stay in unfavorable regions of the search space. Therefore, taking into account that UEGO tries to maintain a species per local maximizer, subpopulations of different lengths can be supported by each processor. This may produce load imbalance, idle processors and hence a bad performance of the parallel techniques. The migration of information among processing elements can help solve all these disadvantages, but more drawbacks can also be produced if the exchange of information is not appropriate.

Notice that every iteration of the algorithm UEGO corresponds to the generation of a new species (offspring).


Thus, in order to exchange information that is as rich as possible, the migration process has to be carried out only after an offspring is obtained. This does not mean that migration has to be performed at every level, but rather at the end of selected levels.

Algorithm 1: Coarse-grain (algorithm run by each processor)
  Begin Coarse-grain
    Init species list
    Optimize species(n1(P))
    for i = 2 to L
      Determine Ri, newi(P), ni(P)
      Create species(newi(P))
      Fuse species(Ri)
      Shorten species list(max spec num(P))
      Optimize species(ni(P))
      Fuse species(Ri)
      if interval of migration is fulfilled
        Migration procedure
      end if
    end for
  End Coarse-grain

The structure of the algorithm executed by each processing element is described in Algorithm 1. Basically, each processor executes the sequential algorithm UEGO (see [38]), although a Migration procedure is included for the exchange of information, and the parameters newi(P), ni(P) and max spec num(P) are used at each level instead of the parameters newi, ni and max spec num used by the sequential version.

The selection of the values of the parameters involved in the migration, i.e. interval of migration, rate of migration and selection criterion, is very important, and it demands a comprehensive analysis. It can influence, on the one hand, the effectiveness of the algorithms at finding the global solutions and, on the other hand, their scalability and efficiency. In [39], several migratory policies were implemented to solve the planar single facility competitive location and design problem presented in [20, 37]. Nevertheless, only two of them, named Ring-Opt and Ring-Fusion2, were able to obtain the same solution as the sequential version within an acceptable computational time. These two parallel techniques have been extended and adapted to cope with the current problem, where the chain wants to locate more than one facility. Next, we describe these two parallel implementations. Notice that the creation and optimization procedures are focused on a single species, and that the evaluation of a species is independent of the rest of the population. Hence, it is possible to maintain coherence when migrating species.
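The per-processor loop of Algorithm 1, including the division of the sequential parameters by P, can be sketched in Python as follows. All sub-procedures are placeholders for the UEGO operations of Section 3, and migrate stands for the migratory policy described in the next subsections; this is a structural sketch, not the authors' code.

def coarse_grain_uego(P, L, max_spec_num, new_evals, n_evals,
                      init_root, create, fuse, shorten, optimize,
                      radius, migrate, migration_levels):
    """One processing element of the coarse-grain parallel UEGO (cf. Algorithm 1).

    new_evals[i-1], n_evals[i-1]: sequential evaluation budgets of level i.
    migrate(level, species_list) -> species_list implements the migratory policy.
    """
    # Per-processor parameters: sequential values divided (and truncated) by P.
    max_spec_P = max_spec_num // P
    new_P = [v // P for v in new_evals]
    n_P = [v // P for v in n_evals]

    species_list = [init_root()]                      # random root species per processor
    species_list = optimize(species_list, n_P[0])     # level 1
    for i in range(2, L + 1):
        Ri = radius(i)
        species_list = create(species_list, new_P[i - 1])
        species_list = fuse(species_list, Ri)
        species_list = shorten(species_list, max_spec_P)
        species_list = optimize(species_list, n_P[i - 1])
        species_list = fuse(species_list, Ri)
        if i in migration_levels:                     # interval of migration fulfilled
            species_list = migrate(i, species_list)
    return species_list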

4.1 Ring-Opt strategy

In this strategy, the processing elements are assumed to be connected in a ring topology, in such a way that each processor i sends information to processor i + 1 and receives information from processor i − 1 (see Figure 1). The information to be sent is the species with the best fitness (objective function) value, so the rate of migration is equal to 1. The migration process is not carried out in the

9

subList 1 Best 1 Best 5

subList 5

subList 2

Best 2 Best 4 subList 3

subList 4

Best 3

Fig. 1 Ring-opt strategy

early phases of the algorithm. In particular, if the algorithm iterates during L levels, there is no migration during the first L/2 levels. This allows the subpopulations to evolve independently. During the second half of the levels, a migration is performed at every level.
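Under the assumption of an MPI implementation (the experiments in Section 5 use MPI), the Ring-Opt policy could be sketched with mpi4py as follows; the Species objects are those of the sketch in Section 3, and blocking point-to-point calls are used for brevity.

from mpi4py import MPI

def ring_opt_migrate(level, species_list, L, comm=MPI.COMM_WORLD):
    """Ring-Opt policy: rate of migration = 1 (only the best species travels)."""
    if level <= L // 2:            # no migration during the first half of the levels
        return species_list
    rank, size = comm.Get_rank(), comm.Get_size()
    best = max(species_list, key=lambda s: s.value)
    # Send the best species to the successor in the ring and receive the predecessor's best.
    received = comm.sendrecv(best, dest=(rank + 1) % size,
                             source=(rank - 1) % size)
    species_list.append(received)  # the subsequent fusion step removes duplicates
    return species_list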

4.2 Ring-Fusion2 strategy

In this strategy, the processors are also connected following a ring topology, in such a way that processor i can only communicate with processors i − 1 and i + 1. The processors are classified into workers and collectors: half of the processors act as collectors and the other half as workers. It is important to highlight that the role of collector or worker is not fixed, i.e., processors interchange their roles at every communication stage. The rate of migration in this strategy is equal to the length of the species sublists. Communications are established by pairs of processors and work as follows: in a migration, processor i is a worker and sends its species sublist to the next processor i + 1 (collector) (see the upper part of Figure 2). Processor i + 1 fuses this list with its own sublist and distributes the resulting list between both processors (see the lower part of Figure 2). In the next communication stage, processor i will be a collector that receives a sublist from processor i − 1, and processor i + 1 will be a worker that sends its sublist to processor i + 2 (see Figure 3). The migration process is carried out during the first half of the levels of the algorithm.
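A sketch of Ring-Fusion2 under the same assumptions is given below. The exact way in which pairs alternate between communication stages and in which the fused list is split between the two processors are illustrative choices, since the paper does not spell them out; an even number of processors is assumed, as in the experiments.

from mpi4py import MPI

def ring_fusion2_migrate(level, species_list, L, Ri, fuse, comm=MPI.COMM_WORLD):
    """Ring-Fusion2 policy: whole sublists are exchanged and fused in pairs."""
    if level > L // 2:                  # migration only during the first half of the levels
        return species_list
    rank, size = comm.Get_rank(), comm.Get_size()
    # Pairing alternates with the stage: (0,1),(2,3),... then (1,2),(3,4),...,(size-1,0).
    shift = level % 2
    is_worker = (rank - shift) % 2 == 0
    if is_worker:
        partner = (rank + 1) % size                 # its collector
        comm.send(species_list, dest=partner)       # send the whole sublist
        return comm.recv(source=partner)            # receive half of the fused list back
    else:
        partner = (rank - 1) % size                 # its worker
        incoming = comm.recv(source=partner)
        fused = fuse(species_list + incoming, Ri)   # global fusion of both sublists
        half = len(fused) // 2
        comm.send(fused[:half], dest=partner)       # redistribute between both processors
        return fused[half:]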

5 Computational studies

5.1 Measures

The main goal of this study is to evaluate both parallel algorithms, Ring-Opt and Ring-Fusion2, in comparison with the sequential algorithm UEGO. To check whether the solutions provided by our parallel algorithms reach the same function value as UEGO, and to determine

Fig. 2 Ring-Fusion strategy. Communication j

which of our algorithms are efficient from a computational point of view, numerical values of effectiveness and efficiency were recorded. As effectiveness measure, we have computed the relative difference between the objective value obtained by the sequential UEGO, OptVal(UEGO), and the value of the solution obtained by the parallel algorithms using P processing elements, OptVal(P),

\[
DifObj(P) = \frac{OptVal(UEGO) - OptVal(P)}{OptVal(UEGO)} .
\]

The closer to zero the value of DifObj, the better the effectiveness of the parallel model. The efficiency of the parallel versions, which estimates how well-utilized the processors are in solving the problem, is computed as

\[
Eff(P) = \frac{T(1)}{P \cdot T(P)} ,
\]

where T(i) is the CPU time employed by the algorithm when i processing elements are used (i = 1, 2, . . . , P).
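Both measures are straightforward to compute; the helpers below also show, as a worked example, what an efficiency of 1.25 on 32 processors implies for the parallel running time.

def dif_obj(opt_val_uego, opt_val_p):
    """DifObj(P): relative objective gap w.r.t. the sequential UEGO (0 is best)."""
    return (opt_val_uego - opt_val_p) / opt_val_uego

def efficiency(t_seq, t_par, P):
    """Eff(P) = T(1) / (P * T(P)); values above 1 mean super-ideal efficiency."""
    return t_seq / (P * t_par)

# Worked example with the p = 10 row of Table 3: Eff(32) = 1.25 and T(1) = 1619.05 s
# imply T(32) = 1619.05 / (32 * 1.25), i.e. roughly 40.5 seconds.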

Fig. 3 Ring-Fusion strategy. Communication j + 1

Ideal efficiency is obtained when Eff(P) = 1. However, obtaining an ideal efficiency is not always possible, since there are many factors which can increase the value of T(P) and hence reduce the corresponding efficiency. Among others, these factors can be: work load imbalance, parts of the algorithm which cannot be parallelized, communication overheads and synchronization among processors. Nevertheless, sometimes a super-ideal efficiency can be obtained. This term refers to efficiencies larger than 1 in a parallel computation. According to Amdahl's Law [2] this is impossible. However, super-ideal efficiencies have often appeared in the literature [9, 40, 41, 43]. Many studies have been carried out in order to predict super-ideal efficiencies or to determine the reasons behind them (see [4] and the papers therein). Many of them agree that super-ideal efficiency can be obtained due to a suboptimal sequential algorithm, important changes in the behavior of the parallel algorithms, or some feature of the architecture (memory, cache, . . . ) that favors the parallel execution.

5.2 Test problems and the environment

In [38], a comprehensive computational study was carried out to compare the sequential UEGO with other algorithms from the literature when solving the multiple facilities location problem described in Section 2. In that study, different types of problems were generated, varying the number n of demand points, the number m of existing facilities and the number k of those facilities belonging to the chain (the actual values can be seen in Table 1). Such problems were solved considering p = 2, 3, . . . , 10 facilities to be located, and it was concluded that the CPU time of UEGO increases linearly with p. For the current study, we have only considered the harder problems, namely, those with p = 7, . . . , 10 new facilities. The corresponding problems were generated by randomly choosing the parameters of the model uniformly within given intervals. The search space for every problem was zj = (xj, yj) ∈ S = [0, 10] × [0, 10], αj ∈ [0.5, 5], for all j = 1, . . . , p.

Table 1 Settings of the test problems

n     m    k
21    5    2, 3
21    2    0, 1
50    5    0, 2
50    10   0, 4
71    5    2
100   5    0, 2
100   2    0, 1
100   10   0, 4
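For completeness, a hypothetical instance generator in the spirit of this setup is sketched below; only the search space follows the paper, while the remaining sampling intervals are assumptions, since the paper only states that the model parameters were drawn uniformly within given intervals. The returned dictionary plugs directly into the market_share/profit sketch of Section 2.

import numpy as np

def generate_problem(n, m, k, rng=None):
    """Random test instance with n demand points, m existing facilities, k chain-owned."""
    if rng is None:
        rng = np.random.default_rng(0)
    return {
        "p_dem": rng.uniform(0.0, 10.0, size=(n, 2)),   # demand point locations in [0,10]^2
        "w": rng.uniform(1.0, 10.0, size=n),            # buying power (assumed interval)
        "gamma": rng.uniform(0.5, 2.0, size=n),         # quality weights (assumed interval)
        "f_loc": rng.uniform(0.0, 10.0, size=(m, 2)),   # existing facility locations
        "a_exist": rng.uniform(0.5, 5.0, size=(n, m)),  # perceived qualities (assumed interval)
        "k": k,                                         # first k existing facilities belong to the chain
    }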

The executions have been carried out on a cluster of 32 nodes, each of them with two Xeon processors at 2.4 GHz and 1 GByte of RAM. The implementation was done using the MPI (Message Passing Interface) library, using only one processor in each node. The UEGO parameter setting for the algorithms is L = 15, RL = 0.25, M = 750 and N = 2 · 10^7 (see [38]). The parallel experiments were run for P = 1, 2, 4, 8, 16, 32 processors. Due to the stochastic nature of the algorithms, all the experiments were executed 5 times and average values were considered. It is important to highlight that the confidence intervals obtained for these average values are relatively narrow, which reveals the robustness of the algorithms' solutions.

5.3 Results

The effectiveness analysis showed that both algorithms are able to provide the same solution as the sequential UEGO for the complete set of problems. Therefore, only efficiency results are given next. Tables 2 and 3 summarize the efficiency obtained by Ring-Opt and Ring-Fusion2, respectively, when they are executed with P = 2, 4, 8, 16, 32 processors. For completeness, the average computational times (in seconds) of the sequential version, Av(T), have also been included.

p     Av(T)      P=2    P=4    P=8    P=16   P=32
7     913.81     0.87   0.92   1.01   1.02   1.09
8     1442.21    0.87   0.90   1.04   1.07   1.16
9     1529.77    0.92   0.93   1.05   1.09   1.20
10    1619.05    0.95   0.97   1.06   1.17   1.25

Table 2 Efficiency estimation for all the problems. Ring-Opt algorithm

p     Av(T)      P=2    P=4    P=8    P=16   P=32
7     913.81     0.93   0.99   0.98   0.99   1.09
8     1442.21    0.95   0.99   1.04   1.13   1.23
9     1529.77    1.00   1.01   1.06   1.13   1.23
10    1619.05    1.00   1.04   1.15   1.15   1.25

Table 3 Efficiency estimation for all the problems. Ring-Fusion2 algorithm

As can be seen, the efficiency obtained by both parallel strategies is either close to the ideal one or slightly greater than the ideal value. Moreover, their scalability is good,


since their efficiency improves as the difficulty of the problem increases (with the number of facilities to be located). Besides, when the number of processors P increases, the efficiency values also improve. The reason for this is that when the parameters max spec num, ni and newi are divided by the number of processors P, the divisions are not exact and the quotients are truncated. Hence, P · max spec num(P) < max spec num, P · ni(P) < ni and P · newi(P) < newi, so the parallel algorithms perform less work than the sequential version, and the higher the number of processors P, the larger this difference in work is. On the other hand, the cost of communication is not too high, since the interval of migration is small and processors communicate in pairs, which means they are not idle in practice, so there are no bottlenecks. Also, the computational load is well balanced: processors work with a number of species close to the maximum allowed during most of the time. This is mainly due to both the high degree of nonlinearity of the problems and the presence of local optima whose objective values are close to the optimal value. Since the evaluation of a species is time-consuming and the number of species is high, the effect of the communication times stays hidden. This leads to efficiencies better than the ideal one.

Notice that Ring-Fusion2 outperforms Ring-Opt for all the tested problems, as can be seen from Tables 2 and 3. Whereas the Ring-Opt strategy does not execute any fusion procedure among processors in order to reduce the redundant work (only the best species migrate), Ring-Fusion2 eliminates possible duplicates by performing a global fusion between pairs of processors (the complete sublists migrate, and pairs of sublists are fused). Although the communication time is higher for Ring-Fusion2, the reduction of the redundant work decreases the computational time (recall that the evaluation of a species is time-consuming), and hence a better efficiency can be reached.

6 Conclusion

In this paper, we have presented two parallel algorithms for solving a multiple facilities competitive location and design problem on the plane, in which the market share is estimated using a Huff-like model and the costs associated with both the location and the quality of the new facilities are taken into account. These two factors are the variables of the problem. The objective is to maximize the profit obtained by the chain. The model is rather general and, as a consequence, it is also rather difficult to solve. The only exact method proposed in the literature to cope with it, an interval B&B method presented in [44], can only solve the case where two facilities are to be located. Several procedures were investigated in [38] to solve the problem when more than two facilities have to be located. The results showed that the evolutionary algorithm UEGO is the one which provides the best solutions in terms of objective value, although it is also time-consuming.

In this paper, we have presented two coarse-grain methods (Ring-Opt and Ring-Fusion2) to parallelize UEGO. Both algorithms provide the same solutions as UEGO in all the instances, and have a good scalability, since their efficiencies improve as the difficulty of the problem increases. Moreover, they obtain super-ideal efficiencies in some cases. In practice, Ring-Fusion2 is more efficient than Ring-Opt, due to the use of intermediate fusions between pairs of processors, which reduce the computational load and hence the CPU time.

References

1. E. Alba and M. Tomassini. Parallelism and evolutionary algorithms. IEEE Transactions on Evolutionary Computation, 6(5):443–462, 2002.

2. G. Amdahl. Validity of the single processor approach to achieving large-scale computing capabilities. In AFIPS Conference Proceedings, volume 30, pages 483–485, 1967.
3. H. Aytug and C. Saydam. Solving large-scale maximum expected covering location problems by genetic algorithms: a comparative study. European Journal of Operational Research, 141(3):480–494, 2002.
4. G.L. Beane. The effects of microprocessor architecture on speedup in distributed memory supercomputers. PhD thesis, University of Maine, August 2004.
5. S. Benati and G. Laporte. Tabu search algorithms for the (r|Xp)-medianoid and (r|p)-centroid problems. Location Science, 2(4):193–204, 1994.
6. J. Brimberg, P. Hansen, N. Mladenović, and E.D. Taillard. Improvement and comparison of heuristics for solving the uncapacitated multisource Weber problem. Operations Research, 48(3):444–460, 2000.
7. E. Cantú-Paz. A summary of research on parallel genetic algorithms. Technical Report IlliGAL 95007, University of Illinois at Urbana-Champaign, 1995.
8. E. Cantú-Paz. A survey of parallel genetic algorithms. Technical Report IlliGAL 97003, University of Illinois at Urbana-Champaign, 1997.
9. M. Cosnard and J.-L. Philippe. Achieving superlinear speedups for the multiple polynomial quadratic sieve factoring algorithm on a distributed memory multiprocessor. In Proceedings of the Joint International Conference on Vector and Parallel Processing, pages 863–874, London, UK, 1990. Springer-Verlag.
10. M. Dorigo and G. Di Caro. The ant colony optimization meta-heuristic. In David Corne, Marco Dorigo, and Fred Glover, editors, New Ideas in Optimization, pages 11–32. McGraw-Hill, London, 1999.
11. T. Drezner. Competitive location in the plane. In Z. Drezner, editor, Facility Location: a survey of applications and methods, Springer Series in Operations Research and Financial Engineering, pages 285–300. Springer, Berlin, 1995.
12. T. Drezner, Z. Drezner, and S. Salhi. Solving the multiple competitive facilities location problem. European Journal of Operational Research, 142(1):138–151, 2002.
13. Z. Drezner. The p-center problem: heuristic and optimal algorithms. Journal of the Operational Research Society, 35(8):741–748, 1984.
14. Z. Drezner. Facility Location: a Survey of Applications and Methods. Springer, Berlin, 1995.
15. Z. Drezner and H.W. Hamacher. Facility location. Applications and theory. Springer, Berlin, 2002.
16. H.A. Eiselt, G. Laporte, and J.F. Thisse. Competitive location models: a framework and bibliography. Transportation Science, 27(1):44–54, 1993.
17. E. Erkut and S. Neuman. Analytical models for locating undesirable facilities. European Journal of Operational Research, 40(3):275–291, 1989.
18. J. Fernández, P. Fernández, and B. Pelegrín. A continuous location model for siting a non-noxious undesirable facility within a geographical region. European Journal of Operational Research, 121(2):259–274, 2000.
19. J. Fernández and B. Pelegrín. Using interval analysis for solving planar single-facility location problems: new discarding tests. Journal of Global Optimization, 19(1):61–81, 2001.
20. J. Fernández, B. Pelegrín, F. Plastria, and B. Tóth. Solving a Huff-like competitive location and design model for profit maximization in the plane. European Journal of Operational Research, 179(3):1274–1287, 2007.
21. R.L. Francis, L.F. McGinnis, and J.A. White. Facility layout and location: an analytical approach. Prentice Hall, Englewood Cliffs, 1992.
22. D. Ghazfan, B. Srinivasan, and M. Nolan. Massively parallel genetic algorithms. Technical Report 94-01, Department of Computer Technology, University of Melbourne, 1994.
23. F. Glover and M. Laguna. Tabu search. Kluwer Academic Publishers, 1997.
24. D.E. Goldberg. Genetic algorithms in search, optimization and machine learning. Addison-Wesley, 1989.
25. P. Hansen and N. Mladenović. Variable neighborhood search: principles and applications. European Journal of Operational Research, 130(3):449–467, 2001.
26. J.H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, 1975.
27. C.R. Houck, J.A. Joines, and M.G. Kay. Comparison of genetic algorithms, random restart and two-opt switching for solving large location-allocation problems. Computers and Operations Research, 23(6):587–596, 1996.
28. D.L. Huff. Defining and estimating a trading area. Journal of Marketing, 28(3):34–38, 1964.
29. J.H. Jaramillo, J. Bhadury, and R. Batta. On the use of genetic algorithms to solve location problems. Computers and Operations Research, 29(6):761–779, 2002.
30. M. Jelasity. The shape of evolutionary search: discovering and representing search space structure. PhD thesis, Leiden University, January 2001.

31. M. Jelasity and J. Dombi. GAS, a concept on modeling species in genetic algorithms. Artificial Intelligence, 99(1):1–19, 1998.
32. M. Kilkenny and J.F. Thisse. Economics of location: a selective survey. Computers and Operations Research, 26(14):1369–1394, 1999.
33. S. Kirkpatrick, C.D. Gelatt, and M.P. Vecchi. Optimization by simulated annealing. Science, 220(4598):671–680, 1983.
34. P.M. Ortigosa, I. García, and M. Jelasity. Reliability and performance of UEGO, a clustering-based global optimizer. Journal of Global Optimization, 19(3):265–289, 2001.
35. F. Plastria. Static competitive facility location: an overview of optimisation approaches. European Journal of Operational Research, 129(3):461–470, 2001.
36. J.L. Redondo, J. Fernández, I. García, and P.M. Ortigosa. Heuristics for the facility location and design (1|1)-centroid problem on the plane. Computational Optimization and Applications, 2007. To appear, DOI: 10.1007/s10589-008-9170-0.
37. J.L. Redondo, J. Fernández, I. García, and P.M. Ortigosa. A robust and efficient global optimization algorithm for planar competitive location problems. Annals of Operations Research, 2007. To appear, DOI: 10.1007/s10479-007-0233-x.
38. J.L. Redondo, J. Fernández, I. García, and P.M. Ortigosa. Solving the multiple competitive facilities location and design problem on the plane. Evolutionary Computation, 2007. To appear, available at http://www.um.es/geloca/gio/josemain.html.
39. J.L. Redondo, J. Fernández, I. García, and P.M. Ortigosa. Parallel algorithms for continuous competitive location problems. Optimization Methods & Software, 23(5):779–791, 2008.
40. J.L. Redondo, I. García, P.M. Ortigosa, B. Pelegrín, and P. Fernández. Parallelization of an algorithm for finding facility locations for an entering firm under delivered pricing. In G.R. Joubert, W.E. Nagel, F.J. Peters, O. Plata, P. Tirado, and E. Zapata, editors, Parallel Computing: Current and Future Issues of High-End Computing, volume 33 of NIC series, pages 269–276. John von Neumann Institute for Computing, 2006.
41. J.L. Redondo, I. García, B. Pelegrín, P. Fernández, and P.M. Ortigosa. CG-GASUB: a parallelized algorithm for finding multiple global optima to a class of discrete location problems. In A. Paias and F. Saldanha da Gama, editors, Proceedings of the EURO Winter Institute on Location and Logistics, pages 139–146. Universidade de Lisboa, 2007.
42. S. Salhi and M.D.H. Gamal. A genetic algorithm based approach for the uncapacitated continuous location-allocation problem. Annals of Operations Research, 123(1-4):203–222, 2003.
43. E. Speckenmeyer, B. Monien, and O. Vornberger. Superlinear speedup for parallel backtracking. In Proceedings of the 1st International Conference on Supercomputing, pages 985–993, London, UK, 1988. Springer-Verlag.
44. B. Tóth, J. Fernández, B. Pelegrín, and F. Plastria. Sequential versus simultaneous approach in the location and design of two new facilities using planar Huff-like models. Computers and Operations Research, 2007. To appear, DOI: 10.1016/j.cor.2008.02.006.
45. A. Weber. Über den Standort der Industrien. 1. Teil: Reine Theorie des Standortes. Tübingen, Germany, 1909.
46. J. Yang and C. Yang. The retail stores' competitive location problem with retail regional saturation. In Proceedings of the 2005 International Conference on Services Systems and Services Management (ICSSSM'05), volume 2, pages 1511–1516, 2005.