Genetic Algorithms and Statistical Methods: A Comparison

Colin R. Reeves and Christine C. Wright
Division of Statistics and Operational Research
School of Mathematical and Information Sciences
Coventry University, UK
Email: [email protected]

Introduction

Genetic algorithms (GAs) have been proposed very widely in the last decade as a simple but effective search method for optimization problems. However, they are not necessarily the best method for every problem (some might say they are not the best for any problem!), and in many cases a deeper consideration of the nature of the search space and how to search it is likely to lead to better solutions more quickly. It has not been generally realized that GAs have a close relationship to more traditional methods of statistics, in particular with the branch of statistics known as experimental design (ED). Some recent papers by Reeves and Wright [1, 2] demonstrate this relationship in detail. The principal differences between GAs and ED are as follows:

- GAs start from a random subset of the Universe, while ED uses a carefully structured orthogonal subset;
- ED tries to extract information using as few points as possible, whereas GAs evaluate a relatively large fraction of the Universe (see Note 1);
- GAs do not explicitly remember the results of previous trials, while ED incorporates the cumulative information obtained;
- GAs can proceed automatically, but ED needs substantial human interaction and interpretation;
- GAs make no explicit use of modelling concepts, while ED has an explicit model with which it attempts to account for the observed phenomena.

Published in Proc. 1st IEE/IEEE International Conference on Genetic Algorithms for Engineering Systems: Innovations and Applications, Sheffield, UK, 1995.

Note 1: The absolute size of the fraction of the Universe explored by a GA is still tiny of course, but it is nonetheless orders of magnitude more than ED would use.

In [2] we conclude: The ideal algorithm would be one which was able to automate the decisions made in a typical ED, so that at any stage the 'best' point to evaluate next could be chosen in the light of all the information available. However . . . such automation would be highly complex. The genetic algorithm could be viewed as a step in this direction, and despite its inefficiencies, the weight of empirical evidence suggests that it does a fairly good job! Nevertheless, in smaller problems particularly, statistical methods should produce better results than GAs.

Sequential Elimination of Levels (SEL)

Recently Wu et al. [3] have suggested an iterative approach which both attempts to automate the search and reduces the number of points investigated. They call this method sequential elimination of levels (SEL). It starts with a small subset of points, chosen in accordance with ED principles of orthogonality. It reduces the size of the search space by eliminating those levels of each factor (in GA parlance, the alleles of each gene) which appear to give poor results. The procedure is repeated until 2 levels remain for each factor, and the final set of points is used to determine the levels of each factor which give the best performance (i.e. the 'best' levels after the final round of eliminations are assumed to denote the optimum). Elimination at each stage could be performed on the basis of average performance (SEL-mean), or on the basis of best performance (SEL-max).

It is clear that this approach recalls the usual explanations of how a GA functions: a 'hyperplane competition' is being (implicitly) performed whereby the better hyperplanes are explored at the expense of the inferior ones, which gradually drop out of the population. It is also interesting that SEL follows GA practice in that it does not explicitly take account of results obtained at a previous stage, and it therefore seemed interesting to compare the SEL approach with a GA on a suitable problem.
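A single elimination stage can be sketched in a few lines of Python. The data are the 25 strings and fitness values of the illustrative example in the appendix; the function names `is_orthogonal` and `sel_step` are ours, chosen for illustration, and the sketch covers only the scoring and elimination step, not the construction of the orthogonal design for the next stage.

```python
from itertools import combinations
from statistics import mean

# (levels of factors 1..6, fitness) for the 25 strings of the appendix example.
DESIGN = [
    ((1, 1, 1, 1, 1, 1), 0.1783),
    ((5, 5, 5, 5, 1, 5), 0.0000),
    ((4, 4, 4, 4, 1, 4), 0.0233),
    ((3, 3, 3, 3, 1, 3), 0.1650),
    ((2, 2, 2, 2, 1, 2), 0.0286),
    ((5, 4, 3, 2, 5, 1), 0.0000),
    ((4, 3, 2, 1, 5, 5), 0.0534),
    ((3, 2, 1, 5, 5, 4), 0.0914),
    ((2, 1, 5, 4, 5, 3), 0.0972),
    ((1, 5, 4, 3, 5, 2), 0.1795),
    ((4, 2, 5, 3, 4, 1), 0.1157),
    ((3, 1, 4, 2, 4, 5), 0.0000),
    ((2, 5, 3, 1, 4, 4), 0.0157),
    ((1, 4, 2, 5, 4, 3), 0.0529),
    ((5, 3, 1, 4, 4, 2), 0.1100),
    ((3, 5, 2, 4, 3, 1), 0.3807),
    ((2, 4, 1, 3, 3, 5), 0.1904),
    ((1, 3, 5, 2, 3, 4), 0.0000),
    ((5, 2, 4, 1, 3, 3), 0.0228),
    ((4, 1, 3, 5, 3, 2), 0.0886),
    ((2, 3, 4, 5, 2, 1), 0.0729),
    ((1, 2, 3, 4, 2, 5), 0.1072),
    ((5, 1, 2, 3, 2, 4), 0.1414),
    ((4, 5, 1, 2, 2, 3), 0.0000),
    ((3, 4, 5, 1, 2, 2), 0.0235),
]


def is_orthogonal(design, n_levels=5):
    """True if every pair of factors shows each level combination equally often."""
    strings = [s for s, _ in design]
    expected = len(strings) // (n_levels * n_levels)
    for i, j in combinations(range(len(strings[0])), 2):
        counts = {}
        for s in strings:
            counts[(s[i], s[j])] = counts.get((s[i], s[j]), 0) + 1
        if len(counts) != n_levels ** 2 or set(counts.values()) != {expected}:
            return False
    return True


def sel_step(design, criterion=max):
    """Return, for each factor, the level whose score (max or mean) is lowest."""
    eliminated = []
    for f in range(len(design[0][0])):
        by_level = {}
        for string, fitness in design:
            by_level.setdefault(string[f], []).append(fitness)
        scores = {level: criterion(values) for level, values in by_level.items()}
        eliminated.append(min(scores, key=scores.get))
    return eliminated


print(is_orthogonal(DESIGN))    # True
print(sel_step(DESIGN, max))    # SEL-max: [4, 2, 5, 2, 4, 4]
print(sel_step(DESIGN, mean))   # SEL-mean uses the same step with a different criterion
```

The SEL-max result agrees with the appendix, where level 4 of factor 1 and level 2 of factor 2 are the first to be eliminated.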

The Experiment

The test problem used was an engineering design problem reported by Carlson et al. [4], a problem of designing a hydraulic system by selecting from a set of components those which gave the best performance. The system in question had 6 basic components, each of which came in 5 types, so that the size of the total search space was 5^6 = 15625. To enumerate a search space of this size in an experimental design would normally be unthinkable, although for the purposes of the experiment the enumeration was in fact carried out. In [4] several GAs were tried, and the best one found the optimal solution in about 60% of 500 trials, on average enumerating 4170 points, rather more than one quarter of the search space. From an ED perspective, the optimum could almost certainly be identified using a 1/5 orthogonal fraction of the Universe, although the analysis might be computationally intensive. However, in ED it is abnormal to use such a large number of points anyway, and from a practical point of view it would be interesting to see how good the results would be using about 100 strings rather than over 3000.

In this case we used an orthogonal Latin Square design for the initial stage containing 25 points, followed by an orthogonal design of 32 points after reducing the number of levels to 4. The third stage used a 27-point orthogonal design for 3 levels per factor, and the last stage was a half-fraction (32 points) of the remaining 2^6 points in the search space. Thus the total number of experiments was 116: this is actually an upper bound, as it is possible that some of the points evaluated at later stages had actually already been used at an earlier stage, but no attempt was made to examine whether this was so.

The SEL approach was used with both the mean and maximum criteria, starting from 100 different initial sets of 25 points; these could be generated by randomizing the labelling of factors and levels for the Latin Square design. The same sets of 25 initial points were then used as an initial population for a GA which was run until a further 91 evaluations had been performed (i.e. a total of 116). In this case we used incremental reproduction (it was thought likely from experience on other problems that the generational approach would have exhausted the allowed 116 evaluations before significant convergence had been achieved). Each chromosome was encoded as a string of 6 genes, each of which was drawn from a 5-allele alphabet; unbiased uniform crossover was used and a mutation rate of 0.05 per gene. Here mutation was defined as the substitution of an existing allele value by another one chosen at random. Selection for parents was based on linear ranking. One new offspring was generated each time, and it entered the population in place of one of the worst 50% of the current population.

The programs for both the GA and SEL were written in Pascal and run on a Sequent S82 computer. The code for SEL was somewhat shorter, but of course there is still some human interaction needed in specifying the orthogonal designs to be used at each stage. An illustrative example of the SEL approach is given in the appendix to this paper.
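The incremental GA described above can be sketched as follows. This is a minimal Python illustration, not the authors' Pascal program. Several details are assumptions, because the text does not fix them: selection weights are taken as proportional to rank, the two parents are drawn with replacement, the replaced individual is picked uniformly from the worst half, and `toy_fitness` is a hypothetical stand-in for the hydraulic design objective, which is not reproduced in the paper. The initial population here is random rather than a Latin Square design.

```python
import random

N_GENES, N_ALLELES = 6, 5
MUTATION_RATE = 0.05  # per gene, as stated in the text


def uniform_crossover(a, b, rng):
    """Unbiased uniform crossover: each gene comes from either parent with p = 0.5."""
    return tuple(x if rng.random() < 0.5 else y for x, y in zip(a, b))


def mutate(string, rng):
    """Replace a gene, with probability MUTATION_RATE, by a different allele."""
    out = []
    for allele in string:
        if rng.random() < MUTATION_RATE:
            others = [v for v in range(1, N_ALLELES + 1) if v != allele]
            allele = rng.choice(others)
        out.append(allele)
    return tuple(out)


def steady_state_ga(initial, fitness, extra_evals=91, seed=0):
    """Incremental GA: one offspring per step replaces one of the worst 50%."""
    rng = random.Random(seed)
    pop = [(fitness(s), s) for s in initial]
    for _ in range(extra_evals):
        pop.sort(key=lambda member: member[0])        # worst first
        weights = list(range(1, len(pop) + 1))        # assumed linear ranking: weight = rank
        (_, p1), (_, p2) = rng.choices(pop, weights=weights, k=2)
        child = mutate(uniform_crossover(p1, p2, rng), rng)
        victim = rng.randrange(len(pop) // 2)         # index within the worst half
        pop[victim] = (fitness(child), child)
    return max(pop, key=lambda member: member[0])


def toy_fitness(string):
    """Hypothetical objective: fraction of genes set to level 3."""
    return sum(1 for allele in string if allele == 3) / N_GENES


rng = random.Random(1)
initial = [tuple(rng.randint(1, N_ALLELES) for _ in range(N_GENES)) for _ in range(25)]
best_fitness, best_string = steady_state_ga(initial, toy_fitness)
print(best_fitness, best_string)
```

With 25 initial strings and 91 further offspring the sketch uses exactly 116 evaluations, the same budget as the four SEL stages (25 + 32 + 27 + 32). Because only members of the worst half are ever replaced, the best string found so far is never lost.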

Results

It would be expecting a lot of any approach to identify the global best solution from less than 1% of the search space, so the question of performance was approached by recording the number of times a solution from an elite group of the 60 best strings was discovered. By enumerating the Universe, these strings were found to belong to one of four well-defined sets whose fitnesses did not differ appreciably from that of the optimal string. The incidence of discovery of solutions belonging to these sets for each method is shown in Table 1, where the sets are listed in descending order of average fitness (the customary GA convention of using a * to denote 'don't care' values is used). For comparison, it should be realized that if 116 strings were selected at random, the expected incidence of one of these elite strings is less than 0.5.

            Factor
  1   2   3       4   5   6     SEL-mean   SEL-max    GA
  5   4   1/2/3   3   *   3         1         12      23
  4   4   3/4/5   3   *   3         0         17      10
  5   3   1       3   *   3         0          0       1
  4   3   *       3   *   3        26          6       9
  Total                            27         35      43

Table 1: Frequency of identification of elite solutions (out of 100 trials)

The figures show that both SEL-max and the GA were fairly successful in identifying one of the elite solutions. SEL-mean, when it found a good solution at all, nearly always found one from the same set (4 3 * 3 * 3). Although the average fitnesses of the four sets are very similar, the GA did appear more successful at finding solutions in the top set.

A preliminary ED analysis of variance (Anova) was carried out in order to get some further insight into the problem. From this it was clear that factor 4 was the most important effect, and factor 5 the least. There was some interaction between factors 1, 4 and 6, sufficient to mislead the search in some cases into assigning factor 4 to level 4 instead of level 3, but although further analysis of the fitness landscape is required, overall this would not appear to be a highly epistatic problem.

A pairwise comparison between SEL-max and GA is also of some interest. This showed that although starting with the same 25 initial strings on each trial, they only found the same elite solution 5 times in 100 trials (and then only when factor 5, which is largely irrelevant, is ignored). On 10 occasions they both found one of the elite solutions, but different ones. On 37 occasions, neither found one of the elite group.

Conclusion

A genetic algorithm has been compared with a semi-automatic experimental design approach in SEL. While the GA is a very general method, SEL needs some human input before it can be set up for a particular problem instance; in particular, it needs a human decision on the form of the orthogonal design to be used at each stage. Otherwise, it proceeds in a rather more automatic manner than traditional experimental design methods. Further, although some traditional explanations of GA operation are reminiscent of the structure of SEL, it is clear from this example that the inbuilt randomness of the choices made by a GA will almost always lead it on a very different search path from the rather more deterministic approach of SEL. Because it uses orthogonal designs at each stage, it might be expected that at least one of the SEL methods should perform better than the GA. However, the GA is in fact better than both SEL approaches (although not significantly better than SEL-max), and its performance is really very impressive. Although the focus in ED is less exclusively on optimization than it is in the case of GAs, this appears to present a very real challenge to ED orthodoxy!

There are clearly several areas for further study: the nature of the fitness landscape in this particular problem needs some more exploration in order to determine just how challenging a problem it is. This may give some clues as to how to devise better test problems for comparing such methods. It would also be interesting to see how SEL would fare if the requirement for orthogonal subsets is relaxed; it might be expected that the performance would be degraded, but there would be a significant gain in that it would reduce the level of human input to the same order as that needed by a GA, and could thus be applied to far larger problems. It is also possible that SEL could be improved by borrowing some more ideas from the GA world; in some exploratory work which we hope to report on at a later date, we have been able to improve the performance of SEL-max substantially by incorporating the concept of elitism.
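Two of the figures quoted in the Results can be checked with a little arithmetic: the random-search baseline, and the remark that the GA is not significantly better than SEL-max. The paper does not say which significance test was used. The sketch below assumes a McNemar-type comparison, which seems a natural choice because both methods were run from the same 100 initial sets; it should be read as one plausible check rather than the authors' analysis.

```python
from math import comb

TRIALS = 100
ga_hits, sel_max_hits = 43, 35   # Table 1 totals
both_hits = 5 + 10               # same elite solution + different elite solutions
neither = 37

# Rebuild the paired 2 x 2 table from the counts given in the text.
ga_only = ga_hits - both_hits            # GA found an elite string, SEL-max did not
sel_only = sel_max_hits - both_hits      # SEL-max found one, the GA did not
assert both_hits + ga_only + sel_only + neither == TRIALS

# McNemar statistic on the discordant pairs (no continuity correction).
n_discordant = ga_only + sel_only
chi2 = (ga_only - sel_only) ** 2 / n_discordant

# Exact two-sided binomial version of the same test.
k = min(ga_only, sel_only)
tail = sum(comb(n_discordant, i) for i in range(k + 1)) / 2 ** n_discordant
p_exact = min(1.0, 2 * tail)

# Expected number of elite strings among 116 strings drawn at random.
expected_random = 116 * 60 / 5 ** 6

print(ga_only, sel_only)          # 28 20
print(round(chi2, 3))             # 1.333, below the 5% critical value of 3.841 (1 d.f.)
print(p_exact > 0.05)             # True
print(round(expected_random, 3))  # 0.445
```

The reconstructed counts (28 trials where only the GA succeeded, 20 where only SEL-max did) are consistent with the 37 trials in which neither method found an elite string.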

Acknowledgement

We would like to thank Susan Carlson for making available her data for the experiments reported in this paper.

A Appendix

An Illustrative Example for SEL-max

As the SEL concept is not widely known, the first stage of a SEL-max iteration is shown below for purposes of illustration. A particular set of 25 strings is displayed in the table below, generated from an orthogonal Latin Square design. It can be seen that there are 5 occurrences of every level for each factor; further, for each pair of factors there is one occurrence of every combination of levels.

        Factor
  1  2  3  4  5  6    Fitness
  1  1  1  1  1  1    0.1783
  5  5  5  5  1  5    0.0000
  4  4  4  4  1  4    0.0233
  3  3  3  3  1  3    0.1650
  2  2  2  2  1  2    0.0286
  5  4  3  2  5  1    0.0000
  4  3  2  1  5  5    0.0534
  3  2  1  5  5  4    0.0914
  2  1  5  4  5  3    0.0972
  1  5  4  3  5  2    0.1795
  4  2  5  3  4  1    0.1157
  3  1  4  2  4  5    0.0000
  2  5  3  1  4  4    0.0157
  1  4  2  5  4  3    0.0529
  5  3  1  4  4  2    0.1100
  3  5  2  4  3  1    0.3807
  2  4  1  3  3  5    0.1904
  1  3  5  2  3  4    0.0000
  5  2  4  1  3  3    0.0228
  4  1  3  5  3  2    0.0886
  2  3  4  5  2  1    0.0729
  1  2  3  4  2  5    0.1072
  5  1  2  3  2  4    0.1414
  4  5  1  2  2  3    0.0000
  3  4  5  1  2  2    0.0235

Now for each factor (gene) and each level (allele), the maximum fitness is recorded, as shown below. For instance, the first string in the above list has the largest value (0.1783) of the 5 strings with a 1 at factor 2. The minimum of these values for each factor is marked with an asterisk; for instance, level 4 has the smallest maximum (0.1157) for factor 1. At the next stage, the levels associated with these minima are eliminated, so that the search will now be concentrated over levels {1, 2, 3, 5} of factor 1, levels {1, 3, 4, 5} of factor 2 and so on.

                           Level
  Factor     1         2         3         4         5
                      Maximum fitness
    1      0.1795    0.1904    0.3807    0.1157*   0.1414
    2      0.1783    0.1157*   0.1650    0.1904    0.3807
    3      0.1904    0.3807    0.1650    0.1795    0.1157*
    4      0.1783    0.0286*   0.1904    0.3807    0.0914
    5      0.1783    0.1414    0.3807    0.1157*   0.1795
    6      0.3807    0.1795    0.1650    0.1414*   0.1904

An orthogonal design for a 6-factor 4-level problem can be set up for the next stage (it uses 32 points), and the procedure repeated in order to reduce the search space further. For SEL-mean, the mean value must be calculated for each level of each factor instead of looking for the largest value; otherwise the procedure is identical.

References

[1] C.R. Reeves and C.C. Wright (1995) An experimental design perspective on genetic algorithms. In D. Whitley and M. Vose (Eds.) (1995) Foundations of Genetic Algorithms 3, Morgan Kaufmann, San Mateo, CA.

[2] C.R. Reeves and C.C. Wright (1995) Epistasis in genetic algorithms: an experimental design perspective. To appear in L.J. Eshelman (Ed.) (1995) Proceedings of the 6th International Conference on Genetic Algorithms, Morgan Kaufmann, San Mateo, CA.

[3] C.F.J. Wu, S.S. Mao and F.S. Ma (1990) SEL: A search method based on orthogonal arrays. In S. Ghosh (Ed.) (1990) Statistical Design and Analysis of Industrial Experiments, Marcel Dekker Inc., New York, 279-310.

[4] S.E. Carlson, R. Shonkwiler and M. Ingrim (1993) A comparative evaluation of search methods applied to catalog selection. In S. Forrest (Ed.) (1993) Proceedings of 5th International Conference on Genetic Algorithms, Morgan Kaufmann, San Mateo, CA, 630.
