Gentle but rigorous introduction to genetic algorithms or what makes them tick? Marek W. Gutowski
Institute of Physics, Polish Academy of Sciences, Warszawa, Poland
EMRS 2007
Plan:
● Why Genetic Algorithms?
● General structure of Genetic Algorithms
● Essential questions, tuning parameters
● Genotypes, phenotypes, fitness and survival of the fittest
● Typical course of calculations
● Two kinds of genetic operators
● Essential questions and some answers
● What next?
Mammals,
birds,
fishes,
amphibians & reptiles,
insects,
plants,
bacteria & viruses,
Ebola Tobacco mosaic virus Malaria
H5N1
Staphylococcus aureus
as well as all the other creatures, big or small ...
A small parasite on the tip of a needle
Gonatus onyx (its length exceeds 10m)
It's a mushroom!
Schistosoma mansoni (worm)
Chlamydomonas algae
... share one common feature: their offspring (descendants) are always similar to their parents.
This is because the information about their size, shape, color(s) and any other feature (how many arms, eyes, etc.) is encoded in a complicated molecule known as DNA.
Egg and sperm: the carriers of two different DNA molecules to be combined during fertilization
Genetic manipulation: removal of the egg's nucleus
DNA in outer space?!
DNA is a helical, staircase-shaped object whose steps are made of exactly four kinds of molecules: cytosine (C), guanine (G), adenine (A), and thymine (T). We can consider a DNA molecule to be a long word made of letters taken from a finite alphabet, like this:
ATTGTCCAGACTTGACATCTGGGCGACG ... first string
TAACAGGTCTGAACTGTAGACCCGCTGC ... complementary string
We will use the names genotype, chromosome or individual to describe such a piece of genetic code. DNA is nothing else but a detailed prescription for obtaining the truly new individual, more properly called a phenotype. Note the two different meanings of the word individual and never confuse genotypes with phenotypes!
Some big names in the field of Genetic Algorithms:
- John H. Holland (Adaptation in Natural and Artificial Systems, The University of Michigan Press, Ann Arbor, MI, 1975), founder of the idea
- David E. Goldberg (Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley Publ. Co., Inc., Reading, Massachusetts, USA, 1989)
- John J. Grefenstette (Proc. 1st Int. Conf. on Genetic Algorithms and Their Applications, 1985, vol. 1, p. 160, Lawrence Erlbaum Assoc., Pittsburgh, PA), free software
- Zbigniew Michalewicz, world-recognized Polish researcher
Every genotype (chromosome) consists of smaller parts (substrings) called genes. Genes are assumed to be indivisible. Genetic Algorithms are string-processing engines which operate directly on genotypes only, that is, on chromosomes and genes.
Their only driving force is the Darwinian rule of survival of the fittest. The conversion from genotypes to phenotypes, and the later evaluation of the phenotypes, is left to the user.
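To illustrate the user's part of the job, here is a minimal sketch of one possible genotype-to-phenotype conversion. The fixed-width gene layout, the linear mapping of each gene onto a real interval, and the names decode_gene and decode_chromosome are assumptions made for this example only; the slides do not specify the encoding actually used.

```python
def decode_gene(bits, lo, hi):
    """Map a list of 0/1 bits onto a real number in [lo, hi] (hypothetical scheme)."""
    value = int("".join(str(b) for b in bits), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)


def decode_chromosome(chromosome, layout):
    """Cut the chromosome into genes and decode them one by one.

    layout: list of (width, lo, hi) tuples, one per gene (assumed format).
    """
    phenotype, pos = [], 0
    for width, lo, hi in layout:
        phenotype.append(decode_gene(chromosome[pos:pos + width], lo, hi))
        pos += width
    return phenotype


# A toy chromosome with one 4-bit gene and one 2-bit gene.
layout = [(4, 0.0, 15.0), (2, -1.0, 1.0)]
chromosome = [1, 0, 1, 0, 1, 1]
print(decode_chromosome(chromosome, layout))
```

The genetic engine itself never calls such a function on its own initiative; it only needs the fitness value that the user computes from the decoded phenotype.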
General outline of the algorithm
Create the initial population (here we also evaluate each individual), then, generation after generation,
repeat until the new population is complete:
● select two parents (selection)
● create offspring (crossover)
● mutate offspring (mutation)
● evaluate offspring
and stop when a termination criterion is met. The most often used criteria are:
● max. number of generations reached,
● desired (absolute) fitness reached,
● fitness no longer improves, or
● fitness improves very slowly
We may or may not protect the best individual found so far (elitist strategy).
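The outline above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: binary tournament selection, one child per couple, and a fixed number of generations as the only termination criterion are choices made here, not details taken from the slides. The function name run_ga and its parameters are likewise hypothetical.

```python
import random


def run_ga(fitness, n_bits, pop_size, p_mut, max_generations, rng):
    """Maximize `fitness` over bit strings; returns (best_individual, best_fitness)."""
    # Initial population: every bit is 0 or 1 with probability 1/2.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]          # evaluate each individual
    for _ in range(max_generations):                # criterion: max. generations
        best = max(range(pop_size), key=scores.__getitem__)
        new_pop, new_scores = [pop[best]], [scores[best]]   # elitist strategy
        while len(new_pop) < pop_size:
            # Selection (assumed here: binary tournament, run twice).
            parents = []
            for _ in range(2):
                i, j = rng.randrange(pop_size), rng.randrange(pop_size)
                parents.append(pop[i] if scores[i] >= scores[j] else pop[j])
            # Crossover: single point, one child per couple (assumption).
            cut = rng.randrange(1, n_bits)
            child = parents[0][:cut] + parents[1][cut:]
            # Mutation: flip each bit independently with probability p_mut.
            child = [b ^ 1 if rng.random() < p_mut else b for b in child]
            new_pop.append(child)
            new_scores.append(fitness(child))       # evaluate offspring
        pop, scores = new_pop, new_scores
    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


# Toy problem: maximize the number of ones in a 20-bit string.
best, score = run_ga(sum, n_bits=20, pop_size=10, p_mut=0.01,
                     max_generations=50, rng=random.Random(0))
print(score, best)
```

Because the best individual is copied unchanged into every new population, the best fitness can never decrease from one generation to the next.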
Essential questions
How large should the population be?
How?
● which individuals qualify to be parents?
● does every couple produce offspring?
● how is the offspring created?
● how do we mutate offspring? how often?
How long does all this take?
Can we fail?
Let's see how it works in practice. We have taken 52 chromosomes acting as agents on the Warsaw Stock Exchange and trying to make a profit there during a one-year-long training period. The rules they use (learn) are inferred from their genes, 55 per agent. The genes related to the weights of past prices are 24 bits long, while a few other parameters of a different nature (buy/sell threshold, the fraction of wealth which may be invested in a single stock, and similar) are coded by 10-bit genes. The length of each chromosome was therefore equal to 1170 bits. Other tuning parameters were:
● crossover probability 0.018519,
● mutation probability 0.000172.
Elitist strategy at work. We minimize losses, so negative values mean profit. Note the logarithmic scale on the time axis!
First 50 generations
The best agent makes profit from the very beginning but others learn fast and soon become almost equally good.
First 150 generations
Already after the 9th generation our population makes profit on average, but perhaps we can do better?
First 1500 generations
Yes, we can! Maybe we should learn more? Let's see ...
All 15500 generations
Note the longer and longer periods when “nothing happens”
Generations 3000 – 15550; see big sudden jumps at generation 5200 (profit: 28%) and 6531 (44%). Finally we exceed 49%.
Is it the best we can have? Who knows – it is a stochastic algorithm, driven by chance ...
Advertisement
In the presented case we evaluated 163 663 trial solutions. This may seem a big number, but it should be compared with the exhaustive search requiring 2^1170 evaluations, that is more than 10^352 evaluations.
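The size comparison can be checked with exact integer arithmetic. The short sketch below only restates the numbers quoted above; the variable names are chosen for this example.

```python
from math import log10

n_bits = 1170          # chromosome length from the example
evaluated = 163_663    # number of trial solutions actually evaluated

print(2 ** n_bits > 10 ** 352)   # exact comparison of Python integers
print(n_bits * log10(2))         # decimal exponent of 2^1170
print(log10(evaluated))          # decimal exponent of the evaluated count
```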
Now that we have some feeling for what Genetic Algorithms are, let's see how they work at their deepest level.
Genetic Operators
crossover
mutation
The crossover operation appears to be a linear transformation of the “double chromosome”, i.e. of the pair of individuals! If so, then the rules of ordinary linear algebra apply.
The mutations can be treated similarly. Note, however, that it is sufficient to flip a single bit in order to have an entire gene mutated.
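The linear-transformation view of crossover can be made concrete with a small sketch. Here the pair of parents is stacked into one vector of length 2n (the “double chromosome”), and a single-point crossover becomes multiplication by a 2n x 2n matrix of zeros and ones. The helper names crossover_matrix and mat_vec, and the convention that the cut lies before position `cut`, are assumptions of this example.

```python
def crossover_matrix(n_bits, cut):
    """2n x 2n 0/1 matrix M with M * [a; b] = [child1; child2] for a given cut."""
    size = 2 * n_bits
    m = [[0] * size for _ in range(size)]
    for i in range(n_bits):
        keep = i < cut                                   # before the cut: no swap
        m[i][i if keep else n_bits + i] = 1              # row of child 1
        m[n_bits + i][n_bits + i if keep else i] = 1     # row of child 2
    return m


def mat_vec(matrix, vector):
    """Ordinary matrix-vector product."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]


a = [1, 1, 1, 1, 1]
b = [0, 0, 0, 0, 0]
children = mat_vec(crossover_matrix(5, cut=2), a + b)
print(children[:5], children[5:])
```

Every row and every column of this matrix contains exactly one 1, so it is a permutation matrix: crossover only rearranges the bits of the pair and never creates or destroys any.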
We need to potentially reach every point in this gigantic space. Therefore ...
... the minimal required number of individuals in the population, Nmin, has to satisfy:
Nmin^2 > 2 Nbit,
where Nbit is the number of bits (not unknowns!) contained in a single individual. For example:
Nbit        Nmin
10          5
20          7
50          10
100         15
1000        45
10 000      142
100 000     448
1 000 000   1415
(M. Gutowski, 1999)
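The tabulated values can be reproduced with integer arithmetic. Note one assumption: the sketch below uses the non-strict condition Nmin^2 >= 2 Nbit, because that is what matches the entry for Nbit = 50 (10^2 = 100 = 2 * 50); the strict inequality would give 11 there. The function name n_min is chosen for this example.

```python
from math import isqrt


def n_min(n_bit):
    """Smallest integer N with N*N >= 2*n_bit (non-strict, see the note above)."""
    return isqrt(2 * n_bit - 1) + 1


for n_bit in (10, 20, 50, 100, 1000, 10_000, 100_000, 1_000_000):
    print(n_bit, n_min(n_bit))
```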
Crossovers
We have presented single-point crossovers. However, in Nature multiple-point crossovers also happen. If we require that the average number of crossover points per pair of genotypes amounts to 1, then the switching probability must be
Psw = 1 / (Ngen - 1)
where Ngen is the number of genes in a chromosome. If this is the case, then the fraction of childless (not crossed) couples quickly saturates with increasing Ngen at the limit 1/e = 0.368, exceeding 0.3 already for Ngen > 4. In other words, the birth rate is close to 63%.
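A short numerical check of these statements follows. It assumes that switching happens independently at each of the Ngen - 1 boundaries between genes, so that a couple stays childless when no boundary switches; the function names are chosen for this example.

```python
def switching_probability(n_gen):
    """P_sw = 1 / (N_gen - 1): one crossover point per pair on average."""
    return 1.0 / (n_gen - 1)


def childless_fraction(n_gen):
    """Probability that none of the N_gen - 1 gene boundaries switches."""
    return (1.0 - switching_probability(n_gen)) ** (n_gen - 1)


for n_gen in (3, 4, 5, 10, 55, 1000):
    print(n_gen, round(childless_fraction(n_gen), 4))

# 55 genes per agent, as in the stock-exchange example.
print(round(switching_probability(55), 6))
```

For Ngen = 55 the switching probability equals 1/54, which rounds to the crossover probability 0.018519 quoted in the stock-exchange example.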
The initial population is recommended to be as diverse as possible. That is, at every position within the chromosome string there should be both individuals having “0” and individuals having “1”. Failing to meet this condition dramatically decreases the chances of finding the solution of the problem under study.
Every “lost” bit, i.e. one having the same value in all individuals, simply halves the available search space. Therefore the best way to generate a good initial population is to create it bit after bit, setting each bit to either “0” or “1” with probability 50%.
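A minimal sketch of this recipe, together with a check for “lost” bits, might look as follows. The function names random_population and lost_bits and the toy sizes are assumptions of this example.

```python
import random


def random_population(pop_size, n_bits, rng):
    """Create the population bit after bit, each bit 0 or 1 with probability 1/2."""
    return [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]


def lost_bits(population):
    """Positions at which all individuals carry the same value."""
    return [pos for pos, column in enumerate(zip(*population))
            if len(set(column)) == 1]


rng = random.Random(1)
pop = random_population(pop_size=8, n_bits=16, rng=rng)
lost = lost_bits(pop)
print("lost positions:", lost)
print("search space reduced by a factor of", 2 ** len(lost))
```

With random bits, a given position is lost with probability 2 * (1/2)^N in a population of N individuals, so even moderately sized populations rarely start with lost bits.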
Complete world of 5-bit chromosomes - a small world network The last bit in rectangular individuals is “1” and “0” in elliptical genotypes.
The links connect mutated individuals. Paradoxically they are quite close to each other ...
... so mutated individuals (mutants) are not necessarily easy to identify.
Don't be afraid to mutate often!
A degenerated population explores only a tiny part of the search space and is therefore unable to locate the desired optimum.
Termination at false optimum bears the name of premature convergence.
To reduce chances for premature convergence you have to mutate often.
Mutations
If we assume that the mating of two chromosomes is unsuccessful, that is, they produce no offspring, mostly because at least one of the parents-to-be is a mutant (there are no wars or any other disasters in a computer), then the mutation ratio is given by
Pmut ~ 1 - 0.82^(1/Nbit)
Nbit        Pmut
20          9.8E-3
50          4.0E-3
100         2.0E-3
1000        2.0E-4
10 000      2.0E-5
100 000     2.0E-6
1 000 000   2.0E-7
Marek W. Gutowski, Biology, Physics, Small Worlds and Genetic Algorithms, in: Leading Edge Computer Science Research, pp. 165-218, Susan Shannon (Ed.), Nova Science Publishers, Inc., 2005
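The formula for Pmut is easy to evaluate directly. The sketch below uses a function name chosen for this example; the computed values agree with the table up to the last displayed digit. It also shows what the formula means: with this Pmut, the probability that a whole chromosome passes through mutation untouched is 0.82.

```python
def mutation_probability(n_bit):
    """P_mut ~ 1 - 0.82**(1/N_bit), as quoted on the slide."""
    return 1.0 - 0.82 ** (1.0 / n_bit)


for n_bit in (20, 50, 100, 1000, 10_000, 100_000, 1_000_000):
    p = mutation_probability(n_bit)
    # Second column: P_mut; third column: chance that no bit at all is flipped.
    print(f"{n_bit:>9} {p:.2e} {(1.0 - p) ** n_bit:.2f}")
```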
Mutations, cont.
The mutation ratio alone makes it possible to uniquely determine how long the evolution should be continued. A crude estimate is:
N ~ 3/Pmut
while a better estimate reads (after this number of generations less than 0.5 bits of every genotype remain in their initial state):
N ~ log Nbit / (2 Pmut)
For Nbit > 403 both estimates practically coincide.
Marek W. Gutowski, Biology, Physics, Small Worlds and Genetic Algorithms, in: Leading Edge Computer Science Research, pp. 165-218, Susan Shannon (Ed.), Nova Science Publishers, Inc., 2005
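Both estimates can be evaluated for the stock-exchange example. The sketch assumes that “log” denotes the natural logarithm; this reading is suggested by the remark about Nbit > 403, since log(Nbit)/2 = 3 exactly when Nbit = e^6, which is about 403.4. The function names are chosen for this example.

```python
from math import exp, log


def generations_crude(p_mut):
    """Crude estimate: N ~ 3 / P_mut."""
    return 3.0 / p_mut


def generations_better(n_bit, p_mut):
    """Better estimate: N ~ log(N_bit) / (2 P_mut), natural logarithm assumed."""
    return log(n_bit) / (2.0 * p_mut)


n_bit, p_mut = 1170, 0.000172      # values from the stock-exchange example
print(round(generations_crude(p_mut)))
print(round(generations_better(n_bit, p_mut)))
print(exp(6))                      # N_bit at which log(N_bit) / 2 equals 3
```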
Selection
The better fitted individuals have higher chances of becoming parents than the poorly fitted ones. The transformation
fitness ==> probability to be a parent
may be realized in many ways. However, it may be proven that only under a strictly monotonic transformation do our chances for success exceed 50%. This kind of selection is called soft selection. Selection based on some threshold is hard selection.
Selection, cont.
The strictly monotonic transformation may take the form:
P = 1 / {1 + exp [ –(E – m) / T ]}
where:
● E is the fitness of a given individual
● m is an average fitness (the median is recommended) of the current population
● T (“temperature”) describes the distribution of fitness values around the average fitness.
The similarity to the Fermi-Dirac distribution is not accidental.
Marek W. Gutowski, Smooth genetic algorithm, J. Phys. A 27, 7893 (1994)
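A minimal sketch of this soft selection rule is given below. Two things are assumptions of the example and not taken from the slides: the temperature defaults to the population standard deviation of the fitness values, and a larger E is treated as better (for a minimized quantity the sign of E would have to be reversed). The function name parent_probabilities is hypothetical.

```python
from math import exp
from statistics import median, pstdev


def parent_probabilities(fitness_values, temperature=None):
    """P = 1 / (1 + exp(-(E - m) / T)) for every individual; m is the median."""
    m = median(fitness_values)
    # Assumption: without an explicit temperature use the standard deviation.
    t = temperature if temperature is not None else pstdev(fitness_values)
    if t == 0:                      # all individuals equally fit
        return [0.5] * len(fitness_values)
    return [1.0 / (1.0 + exp(-(e - m) / t)) for e in fitness_values]


fitness_values = [1.0, 2.0, 3.0, 4.0, 10.0]
for e, p in zip(fitness_values, parent_probabilities(fitness_values)):
    print(e, round(p, 3))
```

An individual of exactly median fitness gets P = 1/2, and no individual ever gets P = 0 or P = 1, which is what distinguishes this soft rule from a hard threshold.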
Complexity (how long it takes)
The expected number of fitness evaluations is:
C ~ O (Nbit^(3/2) log Nbit)
that is
C < O (Nbit^(5/2))
which may be compared with the complexity of solving a system of N linear equations, which is C ~ O (N^3).
Marek W. Gutowski, Biology, Physics, Small Worlds and Genetic Algorithms, in: Leading Edge Computer Science Research, pp. 165-218, Susan Shannon (Ed.), Nova Science Publishers, Inc., 2005
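One plausible way to see where this scaling comes from is to multiply the population size by the number of generations, using the estimates given earlier. This cost model is an assumption of the sketch, not a derivation confirmed by the slides, and the function name estimated_evaluations is hypothetical.

```python
from math import log, sqrt


def estimated_evaluations(n_bit):
    """Hypothetical cost model: population size times number of generations."""
    population = sqrt(2 * n_bit)                 # from Nmin^2 > 2 Nbit
    p_mut = 1.0 - 0.82 ** (1.0 / n_bit)          # from Pmut ~ 1 - 0.82^(1/Nbit)
    generations = log(n_bit) / (2.0 * p_mut)     # better estimate, natural log
    return population * generations


for n_bit in (100, 1000, 10_000, 100_000):
    c = estimated_evaluations(n_bit)
    # A nearly constant ratio indicates C ~ Nbit^(3/2) log Nbit.
    print(n_bit, round(c / (n_bit ** 1.5 * log(n_bit)), 3))
```

Under this model the ratio settles quickly, because Pmut is very nearly proportional to 1/Nbit, so the number of generations grows like Nbit log Nbit and the population like the square root of Nbit.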
Conclusions
● GA is an efficient tool for finding global optima in various complicated, multidimensional problems, with or without constraints
● only the values of the objective function are needed, not their derivatives => non-smooth problems can be attacked easily
● operating on discrete objects, GA is nevertheless able to solve continuous problems equally well
● it is possible to find the values of more unknowns than the number of existing data points (!), but the available information necessarily limits their precision
● the results are never guaranteed, since this is a stochastic algorithm
● a well programmed genetic engine is self-tuning, thus universal
What next?
Steven Orla Kimbrough, Gary J. Koehler, Ming Lu, and David Harlan Wood, On a Feasible-Infeasible Two-Population (FI-2Pop) genetic algorithm for constrained optimization: Distance tracing and no free lunch, European Journal of Operational Research, Article in Press, doi:10.1016/j.ejor.2007.06.028
From the abstract: We track and compare one population of feasible solutions and another population of infeasible solutions. Feasible solutions are selected and bred to improve their objective function values. Infeasible solutions are selected and bred to reduce their constraint violations. Interbreeding between populations is completely indirect, that is, only through their offspring that happen to migrate to the other population. [...] Roughly speaking, the No Free Lunch theorems for optimization show that all blackbox algorithms (such as Genetic Algorithms) have the same average performance over the set of all problems. As such, our algorithm would, on average, be no better than random search or any other blackbox search method. However, we provide two general theorems that give conditions that render null the No Free Lunch results for the constrained optimization problem class we study. The approach taken here thereby escapes the No Free Lunch implications, per se.
THE END
Some species appeared on the Earth quite recently but are already gone. Others have existed in unchanged form for more than 400 million years. How about us, humans?