Effects of Chromosome Migration on a Parallel and Distributed Genetic Algorithm

T. Matsumura, M. Nakamura, D. Miyazato, and K. Onaga
Dept. of Information Engineering
University of the Ryukyus
Nishihara, 903-01, JAPAN

J. Okech
Dept. of Maths and Computer Sciences
Jomo Kenyatta University
PO Box 62000, Nairobi, KENYA
Abstract
In this paper we propose a parallel and distributed genetic algorithm (PDGA) on multiprocessor systems with fixed network topologies, in which each processor element carries out genetic operations on its own chromosome set and communicates only with its neighbors (we say chromosome migration). We execute the proposed method to investigate the effects of chromosome migration on multiprocessor systems with ring, torus, and hypercube topologies for benchmark problem instances. From the results, we find that the ring topology is more suitable for our proposed parallel and distributed execution, since its topological features help avoid premature convergence. We show its effectiveness by experimental evaluation.
1 Introduction
Genetic algorithms (abbreviated to GAs) [1][2] are among the most remarkable meta-heuristic schemes for solving combinatorial optimization problems. Many papers report their effectiveness and usefulness [1], but GAs generally incur expensive computation costs. As a straightforward approach to overcoming this problem, many parallel genetic algorithms have been proposed [3][4][5][6][7][8]. In the distributed computation of parallel GAs, the chromosome set is divided into several groups that are operated on independently, which contributes to keeping variety among chromosomes and avoiding premature convergence [5][6]. This paper considers parallel and distributed computation of genetic algorithms on loosely-coupled multiprocessor systems. Loosely-coupled systems are more suitable for massively parallel execution and easier to implement in VLSI than tightly-coupled ones. However, communication overhead is more serious for loosely-coupled systems. We propose in this paper a parallel and distributed execution method of GAs on loosely-coupled multiprocessor systems with fixed network topologies, in which each processor element carries out genetic operations on its own chromosome set and communicates only with its neighbors in order to reduce communication overhead. From previous work [5], we can observe that the quantity and frequency of communication among PEs affect the solution quality of parallel and distributed GAs.
Therefore, the solution quality generated by our neighboring-communication scheme seems to depend on the network topology of the multiprocessor system and on the style of chromosome exchange. In Section 2, we describe the parallel machine model considered in this paper, and in Section 3 we present a parallel and distributed execution method of GAs called PDGA and the way chromosomes are exchanged (we say chromosome migration). In Section 4 we investigate the effect of chromosome migration on three traditional topologies: ring, torus, and hypercube. We conclude this paper in Section 5.
2 Parallel Machine Model
In this paper, we consider parallel and distributed computation of GAs on loosely-coupled multiprocessor systems with fixed network topologies, which are more suitable for massively parallel computation and easier to implement in VLSI than tightly-coupled ones. However, the communication overhead problem for loosely-coupled systems is more serious than for tightly-coupled ones. Figure 1 shows the processor element (abbreviated to PE) model of the loosely-coupled multiprocessor system considered in this paper. Each PE is composed of a computation processor for genetic operations and communication processors for communication handling. PEs are connected by communication links according to the associated network topology. As the most remarkable topologies from practical and theoretical points of view, the ring, torus, and hypercube are usually listed in the literature [9][10][11]. Genetic operations are executed on the computation processor, and communication between two PEs is managed by the associated communication processors in those PEs. Owing to the communication processor, a genetic operation and communications can be performed simultaneously, and asynchronous communication between PEs can be realized. In future work, we will design and specify the PE in an HDL (Hardware Description Language) and develop a trial multiprocessor system on FPGAs. In this paper we therefore omit a more detailed description of the PE model.
Figure 1: Processor Element (PE) model, composed of a computation processor and communication processors.
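To make the PE model concrete, the following C++ fragment is a minimal sketch of one asynchronous mailbox per communication link, so that the computation processor never blocks on communication. The names (Chromosome, Mailbox, send, tryReceive) are our illustrative assumptions, not part of the paper's design.

    #include <deque>
    #include <mutex>
    #include <optional>
    #include <vector>

    using Chromosome = std::vector<int>;  // bit string encoding a solution

    // One asynchronous mailbox per communication link: the sender's
    // communication processor deposits a chromosome, and the receiver's
    // computation processor picks it up whenever convenient.
    class Mailbox {
    public:
        void send(const Chromosome& c) {
            std::lock_guard<std::mutex> lock(m_);
            box_.push_back(c);
        }
        std::optional<Chromosome> tryReceive() {  // non-blocking receive
            std::lock_guard<std::mutex> lock(m_);
            if (box_.empty()) return std::nullopt;
            Chromosome c = box_.front();
            box_.pop_front();
            return c;
        }
    private:
        std::mutex m_;
        std::deque<Chromosome> box_;
    };

Because tryReceive() never blocks, genetic operations and communication can overlap, matching the asynchronous behavior described above.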
3 Parallel and Distributed Genetic Algorithm: PDGA
In this section, we describe in detail the parallel and distributed execution of GAs on loosely-coupled multiprocessor systems. Since isolated execution of GAs, i.e., with no communication, is not effective [5], in our method some chromosomes are transferred to the neighboring PEs (we say two processors are neighbors when there is a direct communication link between them), and the population size in each processor is kept as small as possible. In this paper we call this chromosome transfer migration, and we call the parallel and distributed GA PDGA.
3.1 Procedure PDGA
At the beginning of the execution, each PE generates its own chromosome set according to the GA parameters sent from the host PE. Then, like an ordinary GA, each processor carries out the genetic operations of crossover, mutation, and selection on its population at each generation. Before the next generation, migration takes place through the associated communication processor if its condition holds. The procedure for each PE is as follows:

procedure PDGA;
begin
  Receive GA parameters and a signal for starting execution from the host PE;
  Initialize a population;
  repeat
    Evaluate all the chromosomes;
    if the migration condition holds then
      Do the chromosome migration;
    end if;
    Select the chromosomes for the next generation by the roulette wheel way with the elite strategy;
    Create new chromosomes by applying crossover and mutation according to the crossover rate and the mutation rate;
  until the termination condition holds;
end;
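As a concrete illustration of this loop, the following C++ sketch shows one possible realization of the selection and reproduction steps. Here evaluate(), crossover(), and mutate() are problem-specific hooks, and every name is our illustrative assumption rather than the authors' code; roulette-wheel selection with the elite strategy maps naturally onto std::discrete_distribution, assuming at least one chromosome has positive fitness.

    #include <algorithm>
    #include <random>
    #include <vector>

    using Chromosome = std::vector<int>;

    struct Individual {
        Chromosome genes;
        double fitness = 0.0;
    };

    // Problem-specific hooks (assumed, not specified by the paper).
    double evaluate(const Chromosome& c);
    Chromosome crossover(const Chromosome& a, const Chromosome& b, std::mt19937& rng);
    void mutate(Chromosome& c, double rate, std::mt19937& rng);

    // Roulette-wheel selection with the elite strategy: the best individual
    // survives unconditionally; the rest are drawn with probability
    // proportional to their (non-negative) fitness.
    std::vector<Individual> select(const std::vector<Individual>& pop, std::mt19937& rng) {
        std::vector<Individual> next;
        next.push_back(*std::max_element(pop.begin(), pop.end(),
            [](const Individual& a, const Individual& b) { return a.fitness < b.fitness; }));
        std::vector<double> weights;
        for (const Individual& ind : pop) weights.push_back(ind.fitness);
        std::discrete_distribution<std::size_t> wheel(weights.begin(), weights.end());
        while (next.size() < pop.size()) next.push_back(pop[wheel(rng)]);
        return next;
    }

    // One generation of the per-PE loop (migration hook omitted; see Section 3.2).
    void generationStep(std::vector<Individual>& pop, double pc, double pm, std::mt19937& rng) {
        for (Individual& ind : pop) ind.fitness = evaluate(ind.genes);
        pop = select(pop, rng);
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        for (std::size_t i = 1; i + 1 < pop.size(); i += 2) {  // index 0 keeps the elite
            if (coin(rng) < pc)
                pop[i].genes = crossover(pop[i].genes, pop[i + 1].genes, rng);
            mutate(pop[i].genes, pm, rng);
            mutate(pop[i + 1].genes, pm, rng);
        }
    }

With the parameter settings of Section 4.1 (population 3 per PE, crossover rate 1.0, mutation rate 0.0), this loop reduces to elitism plus crossover over a three-chromosome population.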
3.2 Chromosome Migration

Chromosome migration among PEs is very important for the PDGA. As mentioned previously, chromosome migration occurs only between neighbors in our method, and we have two types of chromosome migration: the immigration type and the emigration type. In the immigration type, the migration condition holds when the best fitness is not updated; the PE then asks its neighbors to send a chromosome and receives one chromosome from each. In the emigration type, the migration condition holds when the best fitness value is updated; the PE then sends a chromosome to its neighbors, and each neighboring PE receives it. In our method, the population of each processor is kept as small as possible, so migrated chromosomes strongly affect the receiving population in the reproduction operation. Thus we can say that moderate migration may avoid premature convergence of the solution, and that there is a relationship between the convergence of the solution and the network topology. We describe the corresponding experimental evaluation in Section 4.2.
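The two trigger conditions can be stated compactly in code. The sketch below reuses the Mailbox type sketched in Section 2 and assumes each PE tracks its best fitness between generations; requestFromNeighbors is a hypothetical helper for the immigration request channel, and all names are our assumptions rather than the authors' implementation.

    enum class MigrationType { Immigration, Emigration };

    void requestFromNeighbors(std::vector<Mailbox*>& links);  // hypothetical helper

    // Called once per generation, after evaluation (illustrative sketch).
    void maybeMigrate(MigrationType type,
                      double bestFitness, double& prevBestFitness,
                      std::vector<Mailbox*>& toNeighbors,
                      const Chromosome& bestChromosome) {
        const bool improved = bestFitness > prevBestFitness;
        if (type == MigrationType::Immigration && !improved) {
            // Best fitness stagnated: ask each neighbor to send one chromosome.
            requestFromNeighbors(toNeighbors);
        } else if (type == MigrationType::Emigration && improved) {
            // Best fitness improved: send our best chromosome to every neighbor.
            for (Mailbox* link : toNeighbors) link->send(bestChromosome);
        }
        if (improved) prevBestFitness = bestFitness;
    }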
4 Experimental Evaluation
In the PDGA, migration partners are restricted to the neighbors to save communication overhead, since communication between non-neighboring PEs requires the intermediate PEs to relay messages. In parallel processing of general search programs, more communication among processors results in better-quality solutions. For parallel execution of GAs, however, this is not always true: less migration may avoid premature convergence of the solutions. Thus the migration frequency and size (the number of transferred chromosomes) should affect the solution quality. Since only neighboring communication takes place in our method, the solution quality is influenced by the network topology, which determines how fast and how widely chromosomes spread through the multiprocessor system. In previous work, we reported these influences for three standard topologies, ring, torus, and hypercube, and proposed a cone topology suitable for the PDGA [12]. The result of that experiment was that the ring topology was more suitable for the proposed PDGA than the other standard topologies, and moreover the cone topology obtained better solutions than the ring. In this paper, we compare the two migration types, immigration and emigration, from the points of view of population convergence and the frequency of obtaining the optimum.
4.1 Solution Quality and Effects of Migration Types
We evaluate the solution quality obtained by the proposed method to investigate the influence of chromosome migration and network topology. In this experiment, we solve with the PDGA eleven benchmark instances [13][14][15][16] of the multiple knapsack problem [17]. These problem instances are available from the OR-Library [18], where problem instances with known optimum solutions are collected for several kinds of combinatorial optimization problems. Our PDGA was run on a virtual multiprocessor system implemented in C++. Throughout the experiment, the following parameters are used:

  population size in each processor: 3
  generation span: 100
  crossover rate: 1.0
  mutation rate: 0.0
  migrated chromosome: the best-fitness chromosome

The other parameters, such as the number of PEs and the network topology, are varied case by case. In the experiment, we execute the proposed PDGA on virtual multiprocessor systems with the three standard topologies: ring, torus, and hypercube. As described previously, we have two types of chromosome migration, the immigration type and the emigration type, each triggered when its migration condition is satisfied: in the immigration type, the condition holds when the best fitness value is not updated, and the PE receives a chromosome from each neighboring PE; in the emigration type, the condition holds when a better chromosome is generated in the PE's own population. The migrated chromosome is the one with the best fitness in its population. We experiment on each topology while varying the number of processors over 16, 32 (36 for the torus), and 64. Executing each case twenty times, we obtain Tables 1 and 2. Table 1 shows, as a percentage over the twenty executions, how often the proposed PDGA obtains the optimum solution for the eleven problem instances and the three topologies using the two migration types. Table 2 shows the average number of migrations until the best fitness value found in the execution is first reached. From the experimental results, the following are observed: (i) the executions on the ring obtained better-quality solutions than the others (Table 1); (ii) no migration type is uniformly better with respect to solution quality (Table 1); and (iii) the emigration type on the ring obtains good solutions faster than the immigration type on the ring (Table 2). We can therefore say from the experiment that the combination of the emigration type and the ring topology is suitable for our proposed PDGA.
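For concreteness, a fitness evaluation for the multiple knapsack benchmark might look like the sketch below. The paper does not specify its evaluation scheme, so the handling of infeasible solutions here (a simple death penalty) is purely our assumption.

    #include <vector>

    // Multiple (multidimensional) 0/1 knapsack instance: maximize total
    // profit subject to m capacity constraints [17].
    struct MKPInstance {
        std::vector<double> profit;               // one entry per item
        std::vector<std::vector<double>> weight;  // weight[j][i]: item i in constraint j
        std::vector<double> capacity;             // one entry per constraint
    };

    // Fitness of a 0/1 chromosome: total profit, or 0 for an infeasible
    // solution (a death-penalty scheme -- an assumption, not necessarily
    // the authors' choice).
    double evaluate(const MKPInstance& inst, const std::vector<int>& x) {
        double total = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i)
            if (x[i]) total += inst.profit[i];
        for (std::size_t j = 0; j < inst.capacity.size(); ++j) {
            double used = 0.0;
            for (std::size_t i = 0; i < x.size(); ++i)
                if (x[i]) used += inst.weight[j][i];
            if (used > inst.capacity[j]) return 0.0;  // infeasible
        }
        return total;
    }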
4.2 Effects of Topologies
Here we investigate the effects of the network topology on the solution quality. In the PDGA, chromosome migration takes place only between neighbors; therefore, the topology determines how the chromosomes spread.
Figure 2: Convergence generation curves (convergence generation versus number of processors) for the immigration and emigration types, each with the ring, torus, and hypercube topologies.
If the chromosomes spread fast, the convergence speed is high, and high convergence speed leads to premature convergence at local optima. Suppose that the average distance between any two PEs on a topology of an n-PE multiprocessor system is d(n). Then a PE can receive chromosomes from all the PEs within d(n) migrations. From this point of view, we can say that each PE receives chromosomes from (n - 1)/d(n) PEs on average at every migration. We call the value (n - 1)/d(n) the average number of migrate-partners and denote it by mp(n). The convergence speed of the PDGA is proportional to mp(n) and inversely proportional to the total number of chromosomes. For example, since the average distance of the ring topology is n/2 and the total number of chromosomes is O(n), the convergence speed of the ring is O(1/n). Figure 3 depicts the convergence speed curves for the ring, torus, and hypercube, whose average distances [11] are O(n), O(√n), and O(log n), respectively. Figure 2 shows the convergence generation curves with varying numbers of PEs for the problem instance sent02 of the previous experiment. In that experiment, we judged an execution to have converged when the whole population consisted of copies of a single chromosome. We find a clear similarity between Figure 2 and Figure 3. Keeping variety among the chromosomes helps avoid converging to local optimum solutions, and the results in Figure 2 thus explain the efficiency of the ring topology.
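To make the comparison explicit, the following derivation (our reading of the definitions above, under the stated average distances, with a total of 3n chromosomes since each PE holds a population of three) works out mp(n) and the resulting convergence speed for the three topologies:

\[
mp(n) = \frac{n-1}{d(n)}, \qquad
\mathrm{speed}(n) \propto \frac{mp(n)}{3n}
\]
\[
\begin{array}{l@{\quad}l@{\quad}l@{\quad}l}
\text{topology} & d(n) & mp(n) & \mathrm{speed}(n) \\
\text{ring} & n/2 & \approx 2 & O(1/n) \\
\text{torus} & O(\sqrt{n}) & O(\sqrt{n}) & O(1/\sqrt{n}) \\
\text{hypercube} & O(\log n) & O(n/\log n) & O(1/\log n)
\end{array}
\]

The ring thus has the slowest spreading and hence the slowest convergence, which is consistent with its ability to keep variety among the chromosomes.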
5 Concluding Remarks
This paper considered the effects of chromosome migration in the parallel and distributed computation of genetic algorithms. We proposed a parallel and distributed execution method of GAs on loosely-coupled multiprocessor systems with fixed network topologies, in which each PE carries out genetic operations on its own chromosome set and communicates only with its neighbors in order to save communication overhead. We evaluated the proposed method on virtual multiprocessor systems with ring, torus, and hypercube topologies. From the results, we found that the combination of the ring topology and the emigration type was more suitable for the proposed GA than the others. We investigated the reason for the suitability of the ring topology by comparing its convergence speed with those of the other topologies, and found that the ring topology can keep the variety of chromosomes because of its longer average distance. As future work, we will theoretically analyze the effect of topologies and implement a trial multiprocessor system on FPGAs.
Table 1: Frequency of obtaining the optimum [%]

problem            Immigration        Emigration
instance          16   32   64      16   32   64

Ring topology
hp1                5   25   25       0   10   20
hp2                0   20   50       0   25   55
pb5                5    5    0       0    0   25
pb7                0    5   15       0    0   25
sent02             0    5    5       0    0   10
weing8             0   20   50      20   10   45
weish08           15   20   45      10   35   50
weish18            0    5   35       0    5   50
weish24            0    0    5       0    0   10
weish25            0   25   30       5   20   35
weish30            5   15   30      10   15   55

Torus topology (36 PEs instead of 32)
hp1                0    5   25      15   15    0
hp2                0   20   35       5    5   30
pb5                0    0   10       0    5   15
pb7                0    0    5      10    0   10
sent02             0    0    0       0    5   10
weing8             0   25   25       0   20   60
weish08           10   15   20       5   45   25
weish18            0    0    5       0    0    5
weish24            0    5    5       0    0    0
weish25            5   15   20       0   10   45
weish30            0   10    5       0    0   40

Hypercube topology
hp1                5   15   10      10    5    5
hp2                0   15   35       5   15   35
pb5                0   15    0       0    5   15
pb7                0    0    0       5    5    5
sent02             0    0    0       0    0    5
weing8             5   15   35       0   15   35
weish08           10   15   25       5   35   75
weish18            0    0    0       0    5    0
weish24            0    5    0       0    0    0
weish25            0    5   35      10    5   35
weish30            0    0   20       0   15   15
Table 2: Average number of migrations until first finding an optimum [migrations per PE]

problem            Immigration        Emigration
instance          16   32   64      16   32   64

Ring topology
hp1                4    6    7       2    3    4
hp2                7   11   27       4    5    8
pb5                3    5    6       2    3    3
pb7                6    7   10       4    5    4
sent02            12   20   36       7    9   10
weing8             7   16   20       5    7    8
weish08            5    7    7       5    6    6
weish18           15   19   38       9   10   13
weish24           17   28   33      11   12   14
weish25           13   17   14      10   10   11
weish30           16   30   36      12   14   14

Torus topology (36 PEs instead of 32)
hp1                2    2   16       2    3    3
hp2                4    5    6       3    4    5
pb5                1    1    1       2    2    3
pb7                3    5    3       4    4    4
sent02             6    7    9       6    8    8
weing8             3    4    5       4    5    6
weish08            3    3    3       5    5    4
weish18            6    9    8       8   10    9
weish24            9    7    8       9   10   11
weish25            6    6    6       8   10    9
weish30            8   10   10      10   10   12

Hypercube topology
hp1                2    3   13       2    2    2
hp2                4    7    4       4    4    4
pb5                1    2    1       2    2    2
pb7                2    2    1       4    4    4
sent02             6    9    6       6    7    8
weing8             3    3    3       5    5    6
weish08            3    2    2       3    3    4
weish18            6    8    7       8    8    9
weish24            8    8    6       9    9    9
weish25            7    6    6       8    9    8
weish30            9    7    8       9   11   10
Figure 3: Convergence speed curves for the ring, torus, and hypercube topologies (n/mp(n) versus the number of processors n).
References
[1] M. Srinivas and L. M. Patnaik, "Genetic Algorithms: A Survey," IEEE Computer, Vol. 27, No. 6, pp. 17-26, 1994.

[2] L. Davis, Handbook of Genetic Algorithms, Van Nostrand Reinhold, 1991.

[3] C. C. Pettey and M. R. Leuze, "A Theoretical Investigation of a Parallel Genetic Algorithm," Proc. of the Third International Conference on Genetic Algorithms, pp. 398-405, 1989.

[4] P. Spiessens and B. Manderick, "A Massively Parallel Genetic Algorithm: Implementation and First Analysis," Proc. of the Fourth International Conference on Genetic Algorithms, pp. 279-286, 1991.

[5] J. Okech et al., "A Distributed Genetic Algorithm for the Multiple Knapsack Problem using PVM," Proc. of ITC-CSCC'97, pp. 899-902, 1997.

[6] H. Mühlenbein, M. Schomisch, and J. Born, "The Parallel Genetic Algorithm as Function Optimizer," Proc. of the Fourth International Conference on Genetic Algorithms, pp. 271-278, 1991.

[7] R. Sterritt et al., "A Parallel Genetic Algorithm for Cause and Effect Networks," Proc. of the IASTED International Conference on Artificial Intelligence and Soft Computing, pp. 105-108, 1997.

[8] R. Tanese, "Distributed Genetic Algorithms," Proc. of the Third International Conference on Genetic Algorithms, pp. 434-439, 1989.

[9] S. Baluja, "Structure and Performance of Fine-Grain Parallelism in Genetic Search," Proc. of the Fifth International Conference on Genetic Algorithms, pp. 155-162, 1993.

[10] D. I. Moldovan, Parallel Processing: From Applications to Systems, Morgan Kaufmann Publishers, 1993.

[11] Y. Takahashi, Parallel Processing Mechanism (Heiretu Shori Kikou), Maruzen, 1989.

[12] T. Matsumura, M. Nakamura, J. Okech, and K. Onaga, "Parallel Computation of Distributed GA on Loosely-coupled Multiprocessor Systems," Proc. of the IASTED International Conference on Artificial Intelligence and Soft Computing, pp. 109-112, 1997.

[13] S. Senyu and Y. Toyada, "An Approach to Linear Programming with 0-1 Variables," Management Science, Vol. 15, pp. B196-B207, 1967.

[14] H. M. Weingartner and D. N. Ness, "Methods for the Solution of the Multi-dimensional 0/1 Knapsack Problem," Operations Research, Vol. 15, pp. 83-103, 1967.

[15] W. Shih, "A Branch and Bound Method for the Multiconstraint Zero-One Knapsack Problem," Journal of the Operational Research Society, Vol. 30, pp. 369-378, 1979.

[16] A. Fréville and G. Plateau, "Hard 0-1 Multiknapsack Test Problems for Size Reduction Methods," Investigación Operativa, Vol. 1, pp. 251-270, 1990.

[17] S. Khuri, T. Bäck, and J. Heitkötter, "The Zero/One Multiple Knapsack Problem and Genetic Algorithms," Proc. of the 1994 ACM Symposium on Applied Computing, 1994.

[18] OR-Library, http://mscmga.ms.ic.ac.uk/