A Graphical User Interface For Genetic Algorithms

Tanja Dabs, Jochen Schoof

Report No. 98

February 1995

Lehrstuhl für Informatik II, Universität Würzburg, Am Hubland, 97074 Würzburg

1 Introduction

Genetic algorithms (see e.g. [Hol75], [Gol89]) have become popular tools in many areas of optimisation. Compared with other techniques they converge to suboptima only in few cases. The use of the black-box principle, which requires only knowledge of a function’s input/output behaviour to optimise it, makes genetic algorithms successful even on problems which are practically intractable with "classical" techniques.

Many programs using genetic algorithms are suited for one single problem only. They typically get an input, start processing and generate an output without permitting or requiring interaction between user and program. Each of these programs has its own user interface, which has little in common with other programs’ interfaces. In most cases it is not possible for the user to get a view of the optimisation’s internals.

To avoid these problems in projects at the Lehrstuhl für Informatik II, a graphical user interface for genetic algorithms was developed as part of a diploma thesis [Dab94]. The main goals for this program were:

• Ease of Use
  This is a must when designing a user interface. We especially wanted people who have just started working with genetic algorithms to be able to get first results easily. Some functions help the user learn exactly how a chosen algorithm works. Experienced users can use them for debugging as well.

• Suitability for Many Classes of Problems
  We did not want to support only one particular class of problems, but to offer the possibility to add new classes whenever necessary. Therefore we needed to specify exactly the interfaces for future functions, which can also be used to extend existing problems.

• Easy Parameter Modification
  It must be possible to set mutation and crossover probabilities before starting an optimisation as well as during the optimisation itself. The user may then easily change the rates whenever he thinks it is necessary or might help.

• Visualisation of Algorithm’s Internals
  Very often genetic algorithms do not give information about the progress of the optimisation. For example, the user is unable to see whether the solution was found many generations before the optimisation ended. GIGA displays such information, and much of it can be turned off if not needed. We even wanted to be able to stop the optimisation and to continue stepwise, generation by generation, for closer examination.

• Reproduction of Program Runs
  For archiving and later presentation it is useful to be able to store a single generation as well as a complete program run. It is also possible to generate a detailed log-file of the program run.

• Extendability
  The program was designed as a basis for further versions, so it had to be easily extendable. This includes some extensions that are already planned as well as future ones. For example, we took care not to introduce any obstacles to the planned parallel versions.

Each of these topics will be discussed in detail later on. We will present the solutions and point out the implementation’s advantages. But first we will give a short introduction to genetic algorithms for readers not familiar with them.

2 Genetic Algorithms

The main idea of genetic algorithms is to use the "survival of the fittest" model known from nature. This means that individuals who are well suited to a given environment have better chances to survive and reproduce. As a result they produce more offspring than individuals with less fitness. Based on these thoughts John H. Holland introduced genetic algorithms as optimisation methods in the 1970s. We will only present the basics of genetic algorithms in this paper. For a more detailed description we recommend [Gol89], [Whi93] and [SHF94].

The Population

In contrast to classical optimisation methods, genetic algorithms do not improve one single potential solution only, but work on many of them at the same time. Each of these is called an individual. An individual is typically represented as a chain of genes. Usually a gene is a single bit or a natural number, but there are many problem-oriented variations. An individual’s complete set of genes is also called its chromosome. The values in it are the individual’s genotype. The actual value of an individual, which might for instance be a floating-point number, is called the phenotype. Using the evaluation function, an evaluation for every individual is calculated from its phenotype. Finally, the individual’s fitness is derived from its evaluation. The fitness is always highest for the fittest individual, no matter whether a minimisation or a maximisation is computed. All individuals together are called the genetic algorithm’s population.

Genetic Operators

Normally, there are at least two genetic operators to produce new individuals from the current population. Many variations have been derived for special problems. Mutation changes one gene of an individual with a given probability. If we represent genes by bits, this simply means a 1 could become a 0 and vice versa. Crossover requires two parent individuals. It splits the chains of genes at a randomly chosen position, then crosses and reconnects them, resulting in two new individuals.

Selection and Reproduction

To produce a new generation from a given population, first some individuals are selected to become parents. The probability of being chosen for reproduction is proportional to an individual’s fitness, meaning fitter individuals are selected more often than less suited ones. The selected parents’ offspring, which are produced by crossover, are finally mutated. It is possible to specify a crossover probability, which is the probability that a crossover is performed. If no crossover is performed, the parents are simply copied and mutated to get offspring.

Replacement Strategy

Finally, a new generation is combined from the old one and its offspring. Usually selection is used once again. It is also possible to specify a certain replacement strategy. Elitism is one example: it guarantees that the fittest individual always gets into the next generation. This is a way to make sure the optimisation is monotonic, i.e. the best evaluation never gets worse from one generation to the next.
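The three basic steps described in this section (fitness-proportional selection, one-point crossover and bit mutation) can be sketched in a few lines of C. This is only an illustration under our own assumptions and not code taken from GIGA: genes are single bits stored in an int array, the names LEN, rand01, roulette_select, one_point_crossover and mutate_bits are hypothetical, and mutation is applied to every gene independently, which is one common reading of "with a given probability".

```c
#include <stdlib.h>

#define LEN 8   /* hypothetical chromosome length: 8 one-bit genes */

/* Uniform random number in [0, 1); rand() is used only for brevity. */
static double rand01(void)
{
    return rand() / ((double)RAND_MAX + 1.0);
}

/* Fitness-proportional selection: returns the index of the chosen parent.
   Assumes all fitness values are non-negative and their sum is positive. */
int roulette_select(const double *fitness, int n)
{
    double total = 0.0, r, acc = 0.0;
    int i;

    for (i = 0; i < n; i++)
        total += fitness[i];
    r = rand01() * total;
    for (i = 0; i < n; i++) {
        acc += fitness[i];
        if (r < acc)
            return i;
    }
    return n - 1;   /* guards against rounding errors */
}

/* One-point crossover: split both parents at position cut
   (0 < cut < LEN) and exchange the tails. */
void one_point_crossover(const int *p1, const int *p2,
                         int *c1, int *c2, int cut)
{
    int i;

    for (i = 0; i < LEN; i++) {
        c1[i] = (i < cut) ? p1[i] : p2[i];
        c2[i] = (i < cut) ? p2[i] : p1[i];
    }
}

/* Bit mutation: flip each gene with probability m_prob;
   returns the number of genes changed. */
int mutate_bits(int *genes, double m_prob)
{
    int i, count = 0;

    for (i = 0; i < LEN; i++) {
        if (rand01() < m_prob) {
            genes[i] = 1 - genes[i];
            count++;
        }
    }
    return count;
}
```

A new generation would be built by calling roulette_select twice per pair of parents, applying one_point_crossover with the chosen crossover probability and finally calling mutate_bits on each child.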

3 Design and Implementation

We will now describe the features of the program GIGA (Graphical user Interface for Genetic Algorithms). The program is available on several UNIX platforms including DEC OSF/1, HP-UX and Linux. Compilation requires OSF/Motif v1.2. The program has been compiled with various C compilers to achieve good portability. Our presentation will concentrate on the main characteristics of the implementation. All details can be found in [Dab94]. We will use the well-known travelling salesperson problem (TSP) to illustrate the program’s abilities.

3.1 General Concept

The majority of GIGA’s routines are completely independent of the particular problem to be solved and can be used for any problem without further specification. They offer services such as choosing operators, modifying rates, displaying convergence, writing a log-file and controlling the program’s flow. Separate modules contain the problem-specific routines, which have to be rewritten for each new problem. They must include I/O routines and a fitness function, but one may also provide special operators and graphical output routines.

3.2 Parameter setting

The behaviour of a genetic algorithm depends on a number of parameters. These are the crossover and mutation rates, the replacement types chosen to generate new generations and the criteria for termination. Even the operators chosen for crossover and mutation can be regarded as parameters. Many genetic algorithm applications only allow changes to these parameters by either changing the source code or entering values at the beginning of a program run. We wanted to enable the user to test changes even at runtime. To support the user as much as possible, he must be able to see the values currently set and to change them quickly and precisely. GIGA’s main window shows all of this information.

The user gets an idea of how he might change some of the parameters. Looking at the crossover rate, the slider shows the present setting, which can be changed by simply moving the slider with the mouse. This can be done before starting an optimisation as well as during one. The same is true for the mutation rate.

The operator to be used for crossover or mutation can be chosen from a pulldown menu, which of course offers only the operators specified for the particular problem. Changes at runtime are possible, too. The choice of the replacement strategy works in much the same way. If elitism is chosen, the number of individuals to be copied has to be specified.

It is not always easy to decide whether a genetic algorithm should be terminated. We offer three different criteria for termination, which may be combined. A common one is the specification of the number of generations to be computed. However, an optimisation may converge very quickly. Therefore one may also request termination after a certain number of generations with only little change. Finally, it is possible to terminate after a given time. The user simply activates the button for the desired criterion and specifies the required value.
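How the three termination criteria might be combined can be sketched as follows. This is a minimal illustration under our own assumptions, not GIGA’s implementation: the types TermCriteria and RunState, the functions record_generation and should_terminate and the convention that a value of 0 disables a criterion are all hypothetical.

```c
#include <time.h>

/* Hypothetical settings; a value of 0 disables the criterion. */
typedef struct {
    int    max_generations;  /* stop after this many generations */
    int    max_stagnation;   /* stop after this many generations
                                with only little change */
    double min_improvement;  /* what counts as "little change" */
    double max_seconds;      /* stop after this much real time */
} TermCriteria;

typedef struct {
    int    generation;       /* generations computed so far */
    int    stagnant;         /* consecutive generations with little change */
    double last_best;        /* best evaluation of previous generation */
    time_t start;            /* start of the optimisation */
} RunState;

/* Record the best evaluation of a newly computed generation and
   update the stagnation counter. */
void record_generation(RunState *s, const TermCriteria *c, double best)
{
    double diff = best - s->last_best;

    if (diff < 0.0)
        diff = -diff;
    if (s->generation > 0 && diff <= c->min_improvement)
        s->stagnant++;
    else
        s->stagnant = 0;
    s->last_best = best;
    s->generation++;
}

/* Returns 1 if at least one active criterion is met, 0 otherwise. */
int should_terminate(const RunState *s, const TermCriteria *c, time_t now)
{
    if (c->max_generations > 0 && s->generation >= c->max_generations)
        return 1;
    if (c->max_stagnation > 0 && s->stagnant >= c->max_stagnation)
        return 1;
    if (c->max_seconds > 0.0 && difftime(now, s->start) >= c->max_seconds)
        return 1;
    return 0;
}
```

Treating the criteria as alternatives (the run stops as soon as any active one is met) is one plausible reading of "combined"; the source does not state which combination rule GIGA applies.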

The only value that cannot be changed at runtime is the population size, because we do not support dynamically changing population sizes.

3.3 Visualisation of Progress

To decide whether it makes sense to use the techniques described above, the user needs as much information about the optimisation in progress as he wants. GIGA makes information available on various topics.

3.3.1 Protocol Window

First of all the user can activate a window displaying the protocol of the current optimisation. He may specify how many details he wants to be displayed. This information will be updated after a specified number of generations. Of course, the window can be activated whenever needed.

Information typically displayed here includes the generation number, the genotype and phenotype of the best individual, its evaluation and fitness. It is possible to scroll through the entries for the last fifteen generations. This window is available for every problem integrated into GIGA. It is completely problem independent.

3.3.2 Convergence Window

The main question in genetic algorithms is whether the optimisation still shows any progress. We implemented a window which shows the evaluation of each generation’s best individual in a graph. Alternatively it is possible to view the average evaluation of the whole population or the worst individual. Together these three windows provide sufficient information about the convergence.

To keep this window problem-independent it is necessary to rescale the graph from time to time. The window does not need any information whether maximisation or minimisation is computed. It always puts the current value into the middle of the display when rescaling.

3.3.3 "Best Individual" Window

Probably the most important information for the user is an interpretation of the best individual found so far. Of course genotype and phenotype can be found in the log-file, but the pure numbers might not tell the user much. In our TSP example, the genotype is simply a permutation of the numbers from 1 to n. Most users will not be able to visualise the corresponding tour graph. So we offer the possibility to link a user-defined function which represents the best individual graphically. For the TSP we simply display the tour described by the permutation.
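To make the TSP genotype concrete, the following sketch checks that a genotype is a permutation of 1 to n and computes the length of the closed tour it describes. It is an illustration only and not part of GIGA: the function names is_permutation and tour_length and the representation of distances as a row-major n x n matrix are our assumptions.

```c
#include <stdlib.h>

/* Returns 1 if perm[0..n-1] contains every number from 1 to n
   exactly once, 0 otherwise. */
int is_permutation(const int *perm, int n)
{
    int i, ok = 1;
    char *seen = calloc((size_t)n, 1);

    if (seen == NULL)
        return 0;
    for (i = 0; i < n && ok; i++) {
        if (perm[i] < 1 || perm[i] > n || seen[perm[i] - 1])
            ok = 0;
        else
            seen[perm[i] - 1] = 1;
    }
    free(seen);
    return ok;
}

/* Length of the closed tour perm[0] -> perm[1] -> ... -> perm[0].
   dist is an n x n matrix stored row by row; dist[(a-1)*n + (b-1)]
   is the distance from city a to city b. */
double tour_length(const double *dist, const int *perm, int n)
{
    double len = 0.0;
    int i;

    for (i = 0; i < n; i++) {
        int from = perm[i] - 1;
        int to = perm[(i + 1) % n] - 1;
        len += dist[from * n + to];
    }
    return len;
}
```

A graphical output routine would draw the same sequence of edges that tour_length sums up, which is why a picture of the tour is much easier to judge than the bare permutation.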

Of course the graphical display is more user-oriented than the plain output of the permutation. It is not mandatory to implement such a function for every problem, but if the user is thinking about fine-tuning the genetic algorithm at runtime, this feature will help a lot.

3.3.4 Additional Information Windows

Finally, more information is available on several topics. Some can be used for profiling the algorithm in a simple way, some help to understand the internals of the algorithm. They all can be used for any given problem without special requirements.

Visualising Operators

The window helps to understand the effects caused by the genetic operators. It can be used for debugging. The user can easily check whether an operator is working the way it should or not. The window is only available for small populations so it is necessary to choose an appropriate population size if one wants to use this feature.

Time Statistics

Information about the time required for an optimisation is often important. The time window tells the user the time needed for the optimisation in real time and in CPU time. The time spent in the actual genetic routines is displayed as well. This function also helps the user to get a feeling for the amount of time spent on the various kinds of monitoring described above.

Using all the described functionality, the user can easily get an impression of how well his algorithm works. We consider GIGA a tool for experimenting with genetic algorithms.
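The difference between real time and CPU time mentioned for the time window can be illustrated with the standard C library alone. The sketch below is not GIGA’s code; the type Stopwatch and the three functions are hypothetical names, and the resolution of time() is limited to whole seconds.

```c
#include <time.h>

/* Hypothetical stopwatch recording both kinds of time. */
typedef struct {
    time_t  wall_start;  /* real (wall-clock) time at start */
    clock_t cpu_start;   /* processor time used by the program at start */
} Stopwatch;

void stopwatch_start(Stopwatch *sw)
{
    sw->wall_start = time(NULL);
    sw->cpu_start = clock();
}

/* Real time elapsed since stopwatch_start, in seconds. */
double stopwatch_real_seconds(const Stopwatch *sw)
{
    return difftime(time(NULL), sw->wall_start);
}

/* CPU time consumed since stopwatch_start, in seconds. */
double stopwatch_cpu_seconds(const Stopwatch *sw)
{
    return (double)(clock() - sw->cpu_start) / CLOCKS_PER_SEC;
}
```

Starting one such stopwatch for the whole run and a second one around the genetic routines only would give the kind of breakdown described above: the gap between the two CPU times roughly corresponds to the cost of monitoring and display.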

3.4 Reconstructing Program Runs

GIGA provides three different mechanisms for reconstructing and reprocessing program runs. Each of them was designed for a special purpose. We will point out the ideas behind these functions and show why all three of them are needed.

3.4.1 Log-File

Even though GIGA provides a lot of features to get information about the program run in textual or graphical mode, the user might want different statistics or new function graphs. The log-file’s purpose is to supply the user with important information about the program run for further processing. The file is structured in such a way that it can easily be read by shell scripts or other programs.

The log-file consists of two parts. The first describes the given problem and the second lists the characteristics of every single generation. These include the chosen parameters, the generation’s average fitness and evaluation, and special data on the best and worst individual, including genotype, phenotype, evaluation and fitness.

The creation of the log-file is optional. It is possible to skip intermediate generations, so that only every n-th generation is registered. The log mechanism can be turned on and off even at runtime. This can be used to skip a large number of generations with no progress.

3.4.2 Playback File

The integration of some sort of playback mechanism was one of the first ideas when designing GIGA. Often, when you get a good result in an optimisation, you either cannot remember every parameter or you have forgotten the seed chosen for the random number generator. Thus you are unable to reproduce this particular run. The feature is also useful in teaching: instead of setting all parameters you can simply load a playback file and reproduce an optimisation.

So the user can create a playback file to save a complete program run with all its parameters (like crossover operator or mutation rate) and their changes at runtime. Loading this playback later will result in exactly the same calculation as in the original optimisation. This includes in particular the random numbers generated. The run is an identical copy of the original one. To achieve this the following values are written to the playback file:

• initial values of parameters
• initialisation of the random number generator
• data describing the special problem
• for every parameter change: generation number, values of all changeable parameters

You might want to use only the parameters stored in a playback file for another test, without using identical random numbers. This can be specified when loading a playback. So it is possible to do many calculations to check the reliability of a parameter set.

It is important to mention that a playback file for the current optimisation can be created at the very beginning, after termination and at any time in between. So, noticing that an optimisation did quite well, the user can decide to save a playback file of it even after the solution has been found. Compared to the log-file the playback file is quite small. Nevertheless it is possible to get a log-file by simply restarting from a playback file.

3.4.3 Snapshot Files

The functionality of a snapshot is similar to that of the playback. A snapshot file contains data about all individuals of the current generation. The user can load this file to use the generation as the initial population of another program run. For example if the user notices that the population has converged to a certain quality he can save it to start subsequent runs with this improved population. In contrast to a playback or a log-file the user may create several snapshot files (at most one per generation) of one optimisation. A snapshot only contains information about one generation and its individuals, like the values of all parameters, problem dependent data, the generation number and finally genotype and phenotype of every single individual.

3.5 Using GIGA for New Problems

Another main goal of GIGA was suitability for a variety of problems. Therefore we specified an interface to GIGA to be used by functions implementing new problems. All required data is kept in one single structure of the type ProblemStruct.

To integrate functions for a new problem, the user has to specify a name for the problem, which is listed in GIGA’s initial problem selection box, and a suffix used for files (log-file, playback, snapshot) concerning this problem. In addition, only pointers to three functions are mandatory. These are an evaluation function, an input routine and an output routine. All standard crossover and mutation operators can be used by simply giving pointers to them. Of course individually designed operators can be used as well.

For a number of functions a new definition is possible but not necessary for optimisation. These include a special function for the generation of an initial population and the graphical output of the best individual in a window. Even an interface for interactive problem input can be created. This has not been used for the problems we have implemented up to now. The complete struct is listed below.

    typedef struct {
        /* mandatory: */
        char *Name;                   /* name of the problem (for the user) */
        char *suffix;                 /* file suffix */
        void (*EvaluateFunc) ();      /* evaluation function */
        int  (*ReadFile) ();          /* reads a file with problem */
                                      /* specification, return value: */
                                      /* chromosome length */
        void (*PrintFile) ();         /* prints problem specification */

        /* necessary, but call of standard functions is sufficient */
        CrossOp    *CrossoverOps;     /* pointer to array of crossover */
                                      /* operators */
        MutationOp *MutationOps;      /* pointer to array of mutation */
                                      /* operators */

        /* optional */
        void (*InitPopulation) ();    /* initialisation of chromosomes */
        void (*GenoToPheno) ();       /* get pheno type from geno type */
        void (*PhenoToGeno) ();       /* get geno type from pheno type */
        void (*CreateGraphic) ();     /* creates window with graph of best */
                                      /* individual */
        void (*GraphicOutput) ();     /* draws graph of best individual */
        void (*ManageGraphic) ();     /* manages graph of best individual */
        void (*UnmanageGraphic) ();   /* unmanages graph of best individual */
        Widget *TopLevel;             /* toplevel widget of problem class */
        void (*ProblemInput) ();      /* builds and manages problem input */
                                      /* widget */
    } ProblemStruct;

To simplify the implementation of new functions a shell script is provided. It creates a new module with a given name, which contains a template for the required routines with all headers and parameter lists. Comments describe what each routine is expected to do.

As already mentioned, the first problem implemented with GIGA was the TSP. To test GIGA’s independence of special problems, an existing program was adapted to GIGA’s interface. This program, which deals with the compression of bilevel pictures [Klö94], already used genetic algorithms but was designed without any knowledge about GIGA. By simply cutting and pasting functions, an adaptation of this program was possible within three days. When GIGA’s interface is used from the very beginning, the development will not take longer than writing a program that does not use GIGA. Using the templates makes writing functions for a completely new problem even easier than writing a stand-alone version.
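What filling in the mandatory part of the structure might look like is sketched below for a deliberately trivial toy problem (maximising the number of 1 bits in a bit string), which is our own example and not one of the problems implemented with GIGA. Because the parameter lists of the interface functions are not given here, the sketch uses a reduced stand-in type MiniProblem with assumed prototypes; all function names and the suffix "omx" are hypothetical.

```c
#include <stdio.h>

/* Reduced, hypothetical stand-in for ProblemStruct: only the mandatory
   members, with parameter lists that are assumptions of this sketch. */
typedef struct {
    char *Name;      /* name of the problem (for the user) */
    char *suffix;    /* file suffix */
    void (*EvaluateFunc)(const int *genes, int length, double *evaluation);
    int  (*ReadFile)(FILE *in);     /* returns the chromosome length */
    void (*PrintFile)(FILE *out);   /* prints the problem specification */
} MiniProblem;

static int chrom_length = 0;   /* set by onemax_read */

/* Evaluation function: the number of genes that are set to 1. */
static void onemax_evaluate(const int *genes, int length, double *evaluation)
{
    int i, ones = 0;

    for (i = 0; i < length; i++)
        if (genes[i] == 1)
            ones++;
    *evaluation = (double)ones;
}

/* Input routine: the problem file holds just the chromosome length.
   Returns -1 if the file cannot be parsed. */
static int onemax_read(FILE *in)
{
    if (fscanf(in, "%d", &chrom_length) != 1 || chrom_length <= 0)
        return -1;
    return chrom_length;
}

/* Output routine: prints the problem specification. */
static void onemax_print(FILE *out)
{
    fprintf(out, "OneMax, chromosome length %d\n", chrom_length);
}

/* The structure that would be handed over to the user interface. */
MiniProblem onemax_problem = {
    "OneMax", "omx", onemax_evaluate, onemax_read, onemax_print
};
```

The point of the design is visible even in this reduced form: the interface code only ever calls through the function pointers, so it needs no knowledge of what the genes mean.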

3.5.1 Adding new operators

We will illustrate the extendability of GIGA by describing how to add new operators to an existing problem. Let us assume we already have an implementation of the TSP and would like to add another crossover operator. Any crossover operator in GIGA has the following interface:

    int xover_op(parent1, parent2, child1, child2)
    Individual *parent1, *parent2, *child1, *child2;
    {
        /* generate one or two children from parent1 and parent2 */
        return n;    /* number of children generated */
    }
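As an illustration of an operator that fits this interface, the following sketch implements a simple order-based crossover which keeps both children valid permutations, as the TSP requires. It is not one of GIGA’s operators and not the Edge-Recombination-Operator. The layout of Individual, the constant MAX_GENES and the names fill_child and order_xover are assumptions, and the sketch uses ANSI prototypes instead of the K&R style of the template. Both parents are assumed to hold valid permutations of 1 to length.

```c
#include <stdlib.h>

#define MAX_GENES 64   /* hypothetical upper bound on chromosome length */

/* Hypothetical layout of an individual; GIGA's real type may differ. */
typedef struct {
    int length;
    int genes[MAX_GENES];   /* for the TSP: a permutation of 1..length */
} Individual;

/* Copy the first cut genes of a to the child, then append the cities
   still missing in the order in which they occur in b. */
static void fill_child(const Individual *a, const Individual *b,
                       Individual *child, int cut)
{
    char used[MAX_GENES + 1] = {0};
    int i, k = 0;

    child->length = a->length;
    for (i = 0; i < cut; i++) {
        child->genes[k++] = a->genes[i];
        used[a->genes[i]] = 1;
    }
    for (i = 0; i < b->length; i++) {
        if (!used[b->genes[i]])
            child->genes[k++] = b->genes[i];
    }
}

/* Crossover operator with the interface shown above:
   returns the number of children generated (2, or 0 on bad input). */
int order_xover(Individual *parent1, Individual *parent2,
                Individual *child1, Individual *child2)
{
    int n = parent1->length;
    int cut;

    if (n != parent2->length || n < 2 || n > MAX_GENES)
        return 0;
    cut = 1 + rand() % (n - 1);   /* 1 <= cut <= n - 1 */
    fill_child(parent1, parent2, child1, cut);
    fill_child(parent2, parent1, child2, cut);
    return 2;
}
```

A plain one-point crossover would not do here: exchanging the tails of two permutations usually duplicates some cities and drops others, which is why permutation problems need operators of their own.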

The crossover interface allows operators which produce either one or two children. Normally two children will be generated, but some special operators only result in a single new child. The Edge-Recombination-Operator [Mic92] for the TSP is an example of an operator which only calculates one child. GIGA will automatically call the operator as often as necessary to build a new population.

After inserting the source code into the template, a pointer to the new operator has to be added to the array of pointers to crossover operators. A reference to this array can be found in the interface structure (*CrossoverOps).

Adding a new mutation operator is quite similar. The following template can be used:

    int mutat_op(indiv, m_prob)
    Individual *indiv;
    double m_prob;
    {
        int count=0, i;
        double rand;
        for(i=0; i
