International Scientific Conference Computer Science’2009
The Efficiency of Parallel Metaheuristics for Combinatorial Optimization – Paradigms, Models and Implementations

Plamenka Borovska, George Yanchev
Technical University of Sofia, Bulgaria
pborovska@tu-sofia.bg
http://csconf.org/cs/leader_eng.htm
Abstract: Parallel metaheuristics have proved to be efficient and powerful tools for the combinatorial optimization of grand-challenge scientific and engineering problems. Metaheuristics offer the opportunity to find optimal or suboptimal solutions of NP-hard problems in reasonable time. Combinatorial optimization based on metaheuristics involves three major aspects – the search space, the neighborhood relations and the guiding function – whose specific forms determine the metaphor of the computation. The implied search strategies for the optimum may be trajectory-based or population-based, the latter simulating biological or cultural evolution. The major goal of parallelizing metaheuristics is not only to reduce the computational time significantly but also to improve the quality of the solutions obtained. The motives for utilizing parallel metaheuristics are diversification and intensification. The paper focuses on the specifics of designing parallel computational models based on metaheuristics, implementing various parallel algorithmic paradigms, and optimizing the correlation between the algorithmic space and the target parallel computer architecture. Classifications of parallel computational models with respect to granularity are presented. The aspects of tuning algorithmic parameters to the specifics of the problem being solved are considered. Parallel performance evaluation and quality-of-solution estimation on the basis of parallel program implementations are treated. Case studies are presented for trajectory-based parallel GRASP metaheuristics implementations on a compact computer cluster of dual-core servers (superserver).
Key-Words: Metaheuristics, Combinatorial Optimization, Parallel Computing, Parallel Algorithms and Programming, Parallel Performance, Quality of Solution
1. COMBINATORIAL OPTIMIZATION
Computational problems are generally classified into the following major categories: (1) find all solutions; (2) count the existing solutions; (3) determine whether any solution exists; (4) find a solution out of the set of solutions that minimizes or maximizes a specified function – optimization problems [1]. An optimization problem is represented as a triple P = (S, Ω, f), where S is the search space defined over a finite set of decision variables Xi, i = 1, 2, …, n, Ω is a set of constraints among the variables, and f: S → IR+ is the objective function that assigns a positive value to each solution of S [2]. The goal is to find a solution s ∈ S such that f(s) ≤ f(s′), ∀s′ ∈ S (minimization) or f(s) ≥ f(s′), ∀s′ ∈ S (maximization). Multi-objective combinatorial optimization involves the optimization of several, typically conflicting, objectives. In order to decrease the complexity of multi-objective optimization, in many cases it is possible to optimize the problem with respect to one objective while the other objectives are treated as constraints. We are concerned with the case when the decision variables Xi have discrete domains, i.e. discrete (combinatorial) optimization. Combinatorial optimization is a branch of optimization in applied mathematics and computer science, related to operations
research, the theory of algorithms and computational complexity theory, and sits at the intersection of several areas comprising computational intelligence, mathematics, and software engineering [3]. The application areas include bioinformatics, engineering, economics, geography, and other research areas with a quantitative analysis component. Optimization algorithms can be classified into the following categories: (1) exact algorithms; (2) approximation algorithms; (3) metaheuristics; (4) hybrid algorithms combining exact algorithms and metaheuristics; (5) multi-objective optimization algorithms. Combinatorial algorithms imply the processing of discrete finite mathematical structures. The combinatorial problems are NP-hard, and the major challenge is to deal with the “combinatorial burst” of computations. Building up the complete list of NP-hard problems is not a trivial task, but according to [4] it comprises the following areas: computational geometry, graph theory, network design, sets and partitions, storage and retrieval, sequencing and scheduling, mathematical programming, algebra and number theory, games and puzzles, logic, automata and language theory, program optimization, and miscellaneous problems.
2. THE POWER OF METAHEURISTICS
The term “metaheuristics” was first used in 1986 by Fred Glover, University of Colorado, USA. It combines two Greek roots: heuristics (to find) and meta (at a higher level). Before that, the term “modern heuristics” was used. Metaheuristics provide methods and tools for finding optimal or sub-optimal solutions in reasonable time. The major goal of metaheuristics is to provide efficient exploration and exploitation of the search space by combining basic heuristic methods within high-level algorithmic frameworks.
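As a point of contrast with metaheuristics, an exact algorithm in its simplest form enumerates the entire search space S of the triple P = (S, Ω, f) – which is exactly what the combinatorial burst makes infeasible for large instances. A minimal sketch (the toy instance, function names and constraint are illustrative, not from the paper):

```python
from itertools import product

def brute_force_minimize(domains, objective, constraints=()):
    """Exhaustively search S (the Cartesian product of the discrete
    domains of the decision variables X_i) for a solution s that
    satisfies every constraint in Omega and minimizes f(s)."""
    best_s, best_f = None, float("inf")
    for s in product(*domains):
        if all(c(s) for c in constraints):
            fs = objective(s)
            if fs < best_f:
                best_s, best_f = s, fs
    return best_s, best_f

# Toy instance: minimize f(s) = (x0 - 2)^2 + (x1 - 1)^2 over {0..3}^2,
# subject to the constraint x0 + x1 <= 3.
domains = [range(4), range(4)]
f = lambda s: (s[0] - 2) ** 2 + (s[1] - 1) ** 2
omega = [lambda s: s[0] + s[1] <= 3]
print(brute_force_minimize(domains, f, omega))  # -> ((2, 1), 0)
```

The cost is |S| evaluations, i.e. the product of the domain sizes, which grows exponentially with n – the motivation for the approximate methods that follow.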
The specifics and advantages of metaheuristics can be summarized as follows: (1) they provide high-level strategies for guiding the search procedure utilizing the accumulated experience; (2) the metaheuristics algorithms are approximate and often nondeterministic; (3) the basic concepts can be described on an abstract level, independent of the problems being solved; and (4) a dynamic balance is required between diversification and intensification. The basic concepts of metaheuristics are: (1) diversification – extensive exploration of the search space, and (2) intensification – exploitation of the accumulated experience and guiding of the search into promising areas of the search space. The metaphor of metaheuristics implies the search landscape, the neighborhood relations and the guiding function. In fact, the search process is navigated in the space of possible solutions by the guiding function, possibly based on the accumulated experience. The transitions between neighboring configurations are called moves. An important aspect is the selection of a class of moves that guarantees that at least one optimal solution can be reached from any arbitrary solution. The metaheuristics search strategies strongly depend on the philosophy of the specific metaheuristics: trajectory-based, population-based, decentralized, and hybrid metaheuristics.
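The interplay of landscape, neighborhood moves and guiding function can be sketched as a minimal trajectory-based search; the toy landscape and the greedy move-selection rule below are illustrative assumptions, not the paper's algorithm:

```python
def local_search(initial, neighbors, guiding_f, max_moves=1000):
    """Trajectory-based search: repeatedly move to the best neighboring
    configuration; stop at a local optimum of the guiding function."""
    current = initial
    for _ in range(max_moves):
        best = min(neighbors(current), key=guiding_f)
        if guiding_f(best) >= guiding_f(current):
            return current  # local optimum reached
        current = best     # a "move" to a neighboring configuration
    return current

# Toy landscape: minimize f(x) = (x - 7)^2 over the integers,
# with the move class x -> x-1 and x -> x+1 (every solution reachable).
f = lambda x: (x - 7) ** 2
step = lambda x: [x - 1, x + 1]
print(local_search(0, step, f))  # -> 7
```

Note that the chosen move class (±1 steps) satisfies the reachability requirement stated above: from any starting integer the optimum can be reached.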
3. PARALLEL METAHEURISTICS
Parallelizing metaheuristics is the major strategy for fighting the combinatorial burst. A new class of algorithms has been developed – parallel metaheuristics [5]. The major goals of parallel metaheuristics are to reduce the computational time significantly and to improve the quality of the solutions as well. When designing parallel metaheuristics algorithms, we should consider all the aspects of parallel algorithm design, primarily the correlations of the algorithmic and the architectural spaces. Fortunately, the inherent parallelism of metaheuristics, especially of the population-based methods, is considerable, and a variety of parallel programming models and parallel algorithmic paradigms can be utilized. The motives for utilizing parallel metaheuristics are to improve diversification, on the one hand, by searching multiple areas of the search space in parallel, and to improve intensification, on the other hand, by searching promising areas of the search space in parallel.
3.1. Parallelizing Population-Based Metaheuristics
The population-based metaheuristics algorithms are iterative and consequently are inherently data-parallel. There exist two basic approaches to parallelizing population-based metaheuristics: (1) computation-based parallelization (the operations for the individuals are computed in parallel), and (2) population-based parallelization (the population is divided into sub-populations that evolve separately and can be united later). With respect to granularity, parallel genetic models may be classified into multithreaded, flat and hybrid (Fig. 1). The parallel genetic models may be standard (the population is a pool of individuals) or structured (the population is decentralized).
Parallel genetic models may implement various granularities in the reproduction stage: (1) the generation-based model (the current population is replaced by the new population); (2) the steady-population model (the new individuals join the population); and (3) the hybrid model (a predefined number of new individuals replace some of the individuals in the population). Structured genetic algorithms (GA) are of two basic types: distributed GA (dGA) and cellular GA (cGA). Structured genetic algorithms divide the population into sub-populations (islands). Chromosome migration is characterized by the following parameters: the migration paths (topology), the migration frequency, the number of migrants, and the strategy for selecting and integrating migrants (Fig. 2). Genetic material is exchanged by “migrating” individuals among the islands (dGA) or by “diffusion” of good chromosomes (good solutions) from neighboring areas (cGA). Cellular GA interleave small neighboring areas, thus forming a diffusion lattice. The cellular genetic model is shown in Fig. 3. Genetic material is exchanged by means of diffusion, and no individuals are transferred. A parallel strategy for diffusion is applied, implying that each individual selects a partner out of its neighbors. Mutation follows the crossover. The evolution of the population based on a circular topology neighborhood is shown in Fig. 4. In the case of a dynamic neighborhood, a neighbor is excluded from the neighborhood structure if its children have fitness worse than the best fitness of the current population. The dynamic neighborhood does not allow the isolation of individuals. In case the number of neighbors drops below a critical limit, arbitrary individuals are added to the neighborhood.
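The island model (dGA) with its migration parameters – topology, frequency, number of migrants, and integration strategy – can be sketched as follows. This is a minimal illustrative sketch on the OneMax toy problem, run sequentially for clarity; the population sizes, mutation rate and replace-worst integration strategy are assumptions, not the paper's settings:

```python
import random

rng = random.Random(42)

def fitness(bits):
    """OneMax toy problem: maximize the number of ones."""
    return sum(bits)

def evolve(pop):
    """One generation of a minimal GA: tournament selection,
    one-point crossover, bit-flip mutation."""
    def tournament():
        return max(rng.sample(pop, 2), key=fitness)
    new = []
    for _ in range(len(pop)):
        a, b = tournament(), tournament()
        cut = rng.randrange(1, len(a))
        child = a[:cut] + b[cut:]          # one-point crossover
        if rng.random() < 0.2:             # bit-flip mutation
            i = rng.randrange(len(child))
            child[i] ^= 1
        new.append(child)
    return new

# Island model: 4 islands, ring topology, migration every 5 generations,
# 1 migrant; the migrant replaces the worst individual on the target island.
islands = [[[rng.randint(0, 1) for _ in range(20)] for _ in range(10)]
           for _ in range(4)]
for gen in range(1, 31):
    islands = [evolve(p) for p in islands]         # local evolution
    if gen % 5 == 0:                               # migration step
        migrants = [max(p, key=fitness) for p in islands]
        for i, p in enumerate(islands):
            worst = min(range(len(p)), key=lambda j: fitness(p[j]))
            p[worst] = migrants[i - 1][:]          # ring: take from left neighbor
best = max((ind for p in islands for ind in p), key=fitness)
print(fitness(best))
```

In a genuine dGA each island would run as a separate process, with migration implemented as message passing along the chosen topology.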
[Fig. 1. Classification of parallel genetic models with respect to granularity: fine granularity (multithreaded models), coarse granularity (flat models) with centralized or structured populations (island model, cellular genetic algorithms), and hybrid parallel models (multithreaded processes); the branches cover data/functional/incremental parallelism, the SPMD, synchronous-iterations and manager/workers paradigms, migration, diffusion crossover within the island, parallel mutation rates, and dynamic neighborhood structures.]
[Fig. 2. The island model: islands evolve locally; genetic material is exchanged by the migration of individuals.]
[Fig. 3. The cellular genetic model.]
Parallel genetic models of nested populations use p′ populations created out of p parents. These populations are used to generate λ initial populations of q individuals each. Each of the initial populations evolves γ populations. The populations are ranked with respect to specified criteria (e.g., the average fitness). The nested-population concept may be generalized to multiple levels, and various criteria may be specified for the different levels. Parallelizing scatter search implies the following parallel models: (1) synchronous parallel scatter search (the local search is parallelized by dividing the neighborhood into subsets that are processed by different processors); (2) replicated parallel scatter search with combining subsets of the reference set; (3) multi-start replicated parallel scatter search (the processes evolve different populations); (4) parallel scatter search with multiple combining.
[Fig. 4. The evolution of the population based on a circular topology neighborhood: neighbor crossover (diffusion) on a virtual neighborhood (logical ring), with migration of individuals between neighborhoods.]
[Fig. 5. A system of ant colonies (ACS) with migration of elite ants: the elite ants are responsible for updating the pheromone matrix; the colonies exchange information through ant migration and pheromone-trace diffusion.]
In parallelizing ant colony optimization (ACO) metaheuristics, the finest granule encapsulates the construction of a single solution, i.e. the activities of a single ant. The construction of a single solution, including the evaluation of its quality, cannot be parallelized, since it is inherently sequential by nature. The popular approach is for one multithreaded process to simulate the activities of multiple agents (ants). For coarse-granularity models, ant colony systems (ACS) are implemented. Each colony comprises a set of tightly coupled cooperating ants, and the system comprises a finite number of ant colonies. The hybrid parallel model of ACS implies concurrent processes, where each process is multithreaded and simulates the activities of the ants within a single colony. Parallelizing ACO metaheuristics is based on the following strategies: (1) parallelizing standard ACO algorithms (the goal is to reduce the computational time without changing the optimization behavior of the algorithm); (2) parallel versions based on modifying the optimization behavior of the algorithm (the colonies search different areas of the search space); (3) centralized parallel ACO, based on the manager/workers paradigm (the manager gathers information about the solutions, updates the pheromone matrix and broadcasts it to the workers); (4) decentralized ACO – every process (colony) updates the pheromone matrix on the basis of the information it receives from the other processes (colonies). Parallel ACS algorithms may be homogeneous or heterogeneous with respect to the solution construction and the pheromone update. Heterogeneity may occur within an iteration (the colonies apply different methods for solution construction and/or pheromone update) or between iterations (the colonies apply different strategies at different iterations; the heuristics is periodically changed).
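The finest granule and the centralized (manager/workers) strategy can be sketched as follows. This is a sequential illustrative sketch on a toy TSP instance: the distance matrix, colony size, evaporation rate and α/β parameters are assumptions, and the "workers" loop would in practice run as concurrent processes or threads:

```python
import random

def construct_tour(dist, pheromone, rng, alpha=1.0, beta=2.0):
    """One ant builds one tour -- the finest, inherently sequential granule.
    The next city is chosen proportionally to pheromone^alpha * (1/d)^beta."""
    n = len(dist)
    tour = [rng.randrange(n)]
    while len(tour) < n:
        i = tour[-1]
        choices = [j for j in range(n) if j not in tour]
        weights = [pheromone[i][j] ** alpha * (1.0 / dist[i][j]) ** beta
                   for j in choices]
        tour.append(rng.choices(choices, weights)[0])
    return tour

def tour_length(dist, tour):
    return sum(dist[tour[k]][tour[(k + 1) % len(tour)]]
               for k in range(len(tour)))

# Centralized parallel ACO: the "workers" construct tours; the manager
# gathers them, evaporates and updates the shared pheromone matrix,
# and (conceptually) broadcasts it back before the next iteration.
rng = random.Random(1)
n = 6
dist = [[abs(i - j) + 1 for j in range(n)] for i in range(n)]  # toy metric
pheromone = [[1.0] * n for _ in range(n)]
best = None
for iteration in range(20):
    tours = [construct_tour(dist, pheromone, rng) for _ in range(8)]  # workers
    for row in pheromone:                                  # manager: evaporation
        for j in range(n):
            row[j] *= 0.9
    for t in tours:                                        # manager: deposit
        for k in range(n):
            i, j = t[k], t[(k + 1) % n]
            pheromone[i][j] += 1.0 / tour_length(dist, t)
    cand = min(tours, key=lambda t: tour_length(dist, t))
    if best is None or tour_length(dist, cand) < tour_length(dist, best):
        best = cand
print(tour_length(dist, best))
```

The decentralized variant would instead let each colony hold its own pheromone matrix and merge information received from the other colonies.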
The parallel swarm metaheuristics imply the metaphor of the social collective behavior of biological species in nature, such as ants sorting dead animals into clusters, fish schools, bird flocks, packs of wolves, bees, etc. The simulation model is event-driven. Parallelization may apply two strategies: the bulletin-board model (the activities of the agents are modeled by the server process, and each client process undertakes a task to compute) and the non-bulletin-board model (the agents are distributed stochastically among the processes, and the server process is responsible for the synchronization and the global information update).
3.2. Parallelizing Trajectory-Based Metaheuristics
With respect to granularity (coarse, medium or fine), there exist three basic parallel models of local search: (1) the multistart model (the parallel local searches may be heterogeneous or homogeneous, independent or cooperative, starting from a single solution or from various solutions, and configured with identical or various parameters); (2) parallel moves (the manager/workers paradigm); (3) accelerated moves (the quality of the solutions is evaluated in parallel). The strategies for parallelizing variable neighborhood search (VNS) are: (1) synchronous parallel VNS; (2) multi-start replicated parallel VNS; (3) replicated parallel VNS with shaking (the manager/workers paradigm with synchronous cooperation: the manager is responsible for the sequential VNS and sends the current solution to the workers for “shaking”; the solutions are sent back to the manager, which selects the best solution and goes on); (4) cooperative parallel VNS (asynchronous multi-start, based on central memory).
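The independent, homogeneous variant of the multistart model can be sketched with a process pool; the toy one-dimensional objective, the hill-climbing walk and the seeds are illustrative assumptions:

```python
from multiprocessing import Pool
import random

def run_search(seed):
    """One independent local-search walk of the multistart model:
    each worker starts from its own random solution and hill-climbs."""
    rng = random.Random(seed)
    x = rng.uniform(-10.0, 10.0)             # independent initial solution
    f = lambda v: (v - 3.0) ** 2             # toy objective (minimize)
    for _ in range(200):
        step = rng.uniform(-0.5, 0.5)
        if f(x + step) < f(x):               # accept only improving moves
            x += step
    return f(x), x

if __name__ == "__main__":
    # Independent multistart: the walks do not cooperate; the best of
    # all returned local optima is selected afterwards.
    with Pool(4) as pool:
        results = pool.map(run_search, range(8))
    best_f, best_x = min(results)
    print(best_x)
```

A cooperative multistart would additionally exchange elite solutions between the walks through a central memory, as in cooperative parallel VNS.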
The strategies for parallelizing simulated annealing (SA) are: (1) data parallelization (the search space is decomposed; the efficiency is problem-dependent); (2) multiple independent runs (MIR, Fig. 6) – multiple instances of the algorithm are executed in parallel on different processors and from different initial states; (3) parallel moves (the Markov chain is parallelized until the steady state, Fig. 7); (4) massive parallelization (efficient in the case of evolutionary SA); (5) parallelizing the hybrid metaheuristics “evolutionary computations + SA”, known as evolutionary simulated annealing (ESA).
[Fig. 6. Multiple independent runs of simulated annealing: independent Markov chains 1…n proceed step by step from a high-energy state to the steady (low-energy) state, and the best solution is selected at the end.]
[Fig. 7. Parallel moves of simulated annealing.]
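The MIR strategy can be sketched as follows: several chains, each with its own seed and initial state, run the same annealing schedule, and the overall best solution is kept. The chains are run sequentially here for clarity, and the toy multimodal objective, the geometric cooling schedule and its parameters are illustrative assumptions:

```python
import math
import random

def simulated_annealing(seed, f, neighbor, initial, t0=10.0, cooling=0.95,
                        steps=400):
    """One Markov chain: accept worsening moves with probability
    exp(-delta/T); T is lowered geometrically toward the steady state."""
    rng = random.Random(seed)
    x, t = initial(rng), t0
    best = x
    for _ in range(steps):
        y = neighbor(x, rng)
        delta = f(y) - f(x)
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x = y                      # move accepted
            if f(x) < f(best):
                best = x
        t *= cooling                   # geometric cooling
    return best

# MIR: independent chains from different initial states; keep the best.
f = lambda v: (v - 4.0) ** 2 + math.sin(5 * v)    # multimodal toy function
initial = lambda rng: rng.uniform(-10.0, 10.0)
move = lambda v, rng: v + rng.uniform(-1.0, 1.0)
solutions = [simulated_annealing(s, f, move, initial) for s in range(6)]
best = min(solutions, key=f)
print(f(best))
```

With MPI or a process pool, each chain would be assigned to a different processor, which is exactly what makes this the simplest SA parallelization: the chains never communicate.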
The strategies for parallelizing tabu search are based on a three-dimensional classification of the algorithm attributes with respect to control cardinality, the type of control and communication, and search differentiation. In practice, synchronous communications are utilized, such as hard synchronization and knowledge synchronization. The knowledge may concern the global search and/or the quality of good solutions. The parallel search differentiation may start from the same or different initial solutions, with identical or different search strategies.
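The sequential kernel that these parallel tabu-search strategies replicate can be sketched minimally; the toy landscape, the solution-based tabu list and the fallback rule when all neighbors are tabu are illustrative assumptions:

```python
from collections import deque

def tabu_search(f, neighbors, x0, tenure=5, iters=50):
    """Minimal tabu search: always move to the best admissible neighbor,
    even uphill; a short-term memory (the tabu list) forbids recently
    visited solutions so the search can escape local optima."""
    tabu = deque([x0], maxlen=tenure)
    x, best = x0, x0
    for _ in range(iters):
        # non-tabu neighbors; if all are tabu, admit them all (aspiration)
        cand = [y for y in neighbors(x) if y not in tabu] or neighbors(x)
        x = min(cand, key=f)
        tabu.append(x)
        if f(x) < f(best):
            best = x
    return best

# Toy landscape: a local optimum at x=0 and the global optimum at x=6;
# plain hill-climbing from 0 stalls, tabu search climbs over the hill.
values = [2, 3, 4, 3, 2, 1, 0, 1, 2, 3]
f = lambda x: values[x]
step = lambda x: [v for v in (x - 1, x + 1) if 0 <= v < len(values)]
print(tabu_search(f, step, 0))  # -> 6
```

In the parallel variants, several such searches run concurrently and synchronize their knowledge (elite solutions, tabu attributes) according to the chosen control and communication type.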
The strategies for parallelizing GRASP are: (1) search space decomposition; (2) iterations distributed among processes (multiple walks, independent threads, the manager/workers paradigm); (3) hybridization (GRASP + path relinking): multiple walks with independent threads, or multiple walks with threads exchanging elite solutions obtained in previous iterations. Heterogeneous parallel metaheuristics algorithms may be hardware-heterogeneous (the algorithmic components are computed on different computer platforms) or software-heterogeneous (applying different search strategies). Software-heterogeneous parallel algorithms may vary in their search strategies at the parameter level, the operations level, the solutions level, or the algorithmic level.
4. PARALLEL METAHEURISTICS IMPLEMENTATIONS
Parallel implementations are developed taking into consideration the specific problem to be optimized and the target parallel computer platform. In designing the parallel metaheuristics algorithm, the paradigm should be selected carefully out of the spectrum of parallel algorithmic paradigms: SPMD, manager/workers, work pool, synchronous/asynchronous iterations, and the phase-parallel paradigm. A wide spectrum of metaheuristics class libraries have been developed in C++, MPICH and the OpenMP API that provide case studies for various combinatorial problems. An important aspect to be considered is the correlation between the granularity of the metaheuristics algorithm and the architectural type of the target parallel computer platform. Metaheuristics provide general-purpose algorithms for optimization, but for the specific problem the encoding should be considered, as well as the optimal tuning of the algorithmic parameters to the problem specifics. Let us consider the case study of developing flat, multithreaded and hybrid parallel implementations of GRASP (MPI + OpenMP), based on the manager/workers paradigm, for solving the traveling salesman problem with a computer cluster as the target platform.
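The GRASP iteration that the case study distributes among processes – a greedy randomized construction followed by local search – can be sketched for the TSP. This is an illustrative sequential sketch, not the paper's MPI + OpenMP implementation: the restricted-candidate-list parameter α, the iteration count and the toy instance are assumptions, and 2-opt stands in for the full local-search machinery described below:

```python
import random

def grasp_tsp(dist, iters=30, alpha=0.3, seed=0):
    """GRASP sketch: each iteration builds a greedy randomized tour
    (restricted candidate list controlled by alpha) and improves it
    with 2-opt local search; the best tour over all iterations is kept."""
    rng = random.Random(seed)
    n = len(dist)

    def length(t):
        return sum(dist[t[k]][t[(k + 1) % n]] for k in range(n))

    def construct():
        tour = [rng.randrange(n)]
        while len(tour) < n:
            i = tour[-1]
            cand = [j for j in range(n) if j not in tour]
            dmin = min(dist[i][j] for j in cand)
            dmax = max(dist[i][j] for j in cand)
            # restricted candidate list: within alpha of the greedy choice
            rcl = [j for j in cand
                   if dist[i][j] <= dmin + alpha * (dmax - dmin)]
            tour.append(rng.choice(rcl))
        return tour

    def two_opt(t):
        improved = True
        while improved:
            improved = False
            for a in range(n - 1):
                for b in range(a + 2, n):
                    if b - a == n - 1:
                        continue        # would reverse the whole tour
                    new = t[:a + 1] + t[a + 1:b + 1][::-1] + t[b + 1:]
                    if length(new) < length(t):
                        t, improved = new, True
        return t

    best = None
    for _ in range(iters):              # these iterations are distributed
        t = two_opt(construct())        # among processes in parallel GRASP
        if best is None or length(t) < length(best):
            best = t
    return best, length(best)

# Toy instance: 6 cities on a line with distance |i - j|.
dist = [[abs(i - j) for j in range(6)] for i in range(6)]
print(grasp_tsp(dist))
```

Since the iterations are independent, strategy (2) above maps directly onto a manager/workers scheme: the manager hands out seeds and collects the best tour from each worker.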
In order to improve the convergence of the parallel algorithm, reactive GRASP is applied, combined with 2-opt search, path relinking, the reverse-array and don’t-look-bit techniques. Measuring parallel performance is a special issue in the case of parallel metaheuristics. The scaling of the parallel GRASP with respect to the parallel machine size is shown in Fig. 8.
[Fig. 8. Scaling (speedup) of the parallel GRASP with respect to the parallel machine size for TSP (1291 cities): serial, OpenMP ×2, OpenMP ×4, MPI, and MPI + OpenMP ×2 configurations over 1–8 processes.]
[Fig. 9. Quality of solutions of the parallel GRASP algorithm for TSP (1291 cities): solution quality (about 48000–53000) versus the number of runs, compared against the optimum.]
The quality of the solutions obtained is estimated on the basis of statistics over multiple runs of the parallel program – 30 to 50 runs are recommended.
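The recommended multi-run estimation can be sketched as follows; the stand-in objective values are hypothetical placeholders in the range suggested by Fig. 9, not measured results:

```python
import random
import statistics

def one_run(seed):
    """Stand-in for one run of a stochastic parallel metaheuristic:
    returns the best objective value found (hypothetical values only)."""
    rng = random.Random(seed)
    return 48000 + rng.random() * 2000   # placeholder objective values

# Quality of solution is reported as statistics over 30-50 independent runs.
runs = [one_run(s) for s in range(30)]
print("best :", min(runs))
print("mean :", statistics.mean(runs))
print("stdev:", statistics.stdev(runs))
```

Reporting the best, mean and standard deviation over the runs (rather than a single run) is what makes results of nondeterministic metaheuristics comparable.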
5. CONCLUSION
Parallel metaheuristics have proved to be efficient and powerful tools for the combinatorial optimization of grand-challenge scientific and engineering problems. The major goal of parallelizing metaheuristics is not only to reduce the computational time significantly but also to improve the quality of the solutions obtained. The motives for utilizing parallel metaheuristics are diversification and intensification. The paper has focused on the specifics of designing parallel computational models based on metaheuristics, implementing various parallel algorithmic paradigms, and optimizing the correlation between the algorithmic space and the target parallel computer architecture. Classifications of parallel computational models with respect to granularity have been presented. Case studies have been presented for trajectory-based parallel GRASP metaheuristics implementations on a compact computer cluster of dual-core servers (superserver).
6. REFERENCES
[1] F. Glover, G. Kochenberger, Handbook of Metaheuristics, Kluwer Academic Publishers, 2003.
[2] T. Gonzalez, Handbook of Approximation Algorithms and Metaheuristics, Chapman & Hall/CRC Computer and Information Science Series, 2007.
[3] E. Talbi, Parallel Combinatorial Optimization, Wiley-Interscience, 2006.
[4] http://en.wikipedia.org/wiki/List_of_NP-complete_problems.
[5] E. Alba, Parallel Metaheuristics – A New Class of Algorithms, Wiley-Interscience, 2005.
[6] C. Blum, A. Roli, Metaheuristics in Combinatorial Optimization: Overview and Conceptual Comparison, ACM Computing Surveys, 2003, pp. 159-174.
[7] P. Borovska, M. Lazarova, Efficiency and Quality of Solution of Parallel Simulated Annealing, 11th WSEAS CSCC Conference on Systems, Crete, Greece, July 2007, World Scientific and Engineering Academy and Society.
[8] Z. Michalewicz, D. Fogel, How to Solve It: Modern Heuristics, Second Edition, Springer, 2004.
[9] M. Dorigo, V. Maniezzo, A. Colorni, Ant System: Optimization by a Colony of Cooperating Agents, IEEE Transactions on Systems, Man and Cybernetics, 1996.
[10] P. Borovska, M. Lazarova, Scalability and Diversification of Parallel Simulated Annealing for Solving the Room Assignment Problem, WSEAS Transactions on Computers, World Scientific and Engineering Academy and Society, 2007.
[11] P. Borovska, Parallel Metaheuristics, Summer School on Intelligent Systems, Department of Computer Science, University of Cyprus, Nicosia, Cyprus, July 2-6, 2007.
[12] P. Borovska, S. Bahudejlla, Parallel Genetic Computation of the TSP with Circular Chromosome Migration, International Scientific Turkish-Bulgarian Conference “Computer Science”’2006, Istanbul, Turkey, 2006, Proceedings, Part I, ISBN 978-954-438-601-6, pp. 102-107.