A Distributed Object-Oriented Environment for the Application of Genetic Programming to Signal Processing

Pieter G.M. van der Meulen, Asker M. Bazen and Sabih H. Gerez
University of Twente, Department of Electrical Engineering, Laboratory of Signals and Systems,
P.O. Box 217 - 7500 AE Enschede - The Netherlands
Phone: +31 53 489 3827, Fax: +31 53 489 1060, E-mail: [email protected]

Abstract— In this paper, an environment for using genetic programming is presented. Although not restricted to a specific domain, our intention is to apply it to image processing problems such as fingerprint recognition. The environment performs tasks such as population management, application of the genetic operators and distributed parallel evaluation of the programs. Furthermore, it provides a framework for the implementation of the problem-specific part, such as the definition of the building blocks for the programs and the fitness evaluation. Using object-oriented methods, the environment is designed to offer a high degree of flexibility and ease of use. The paper concludes with results obtained from an edge detection application.

Keywords— genetic programming, distributed processing.
I. Introduction

Genetic programming (GP) [1], [2], [3] is a method to automatically search for computer programs that perform a specific task. Together with, for instance, neural networks, fuzzy logic and genetic algorithms, GP belongs to the computational intelligence (CI) methods. CI denotes a collection of problem-solving techniques that are inspired by nature; they learn to perform some task well without an explicit explanation of why the learned solution works. GP works with a population of individuals: programs that are possible solutions to the given problem. The first generation of this population is created randomly. Then, all programs are evaluated by applying them to the problem. This results in a fitness measure being assigned to each program. The programs that have the highest fitness have the highest probability to participate in the creation of the next generation. This process repeats until a satisfactory solution is found or a fixed number of generations is reached.

In this paper, an environment for using genetic programming is presented. Although not restricted to a specific domain, our intention is to apply it to image processing problems such as fingerprint recognition [4]. The environment is designed to perform the GP-specific tasks. Furthermore, it provides a framework for the implementation of the problem-specific part.

Solving realistic problems with GP may consume up to 10^16 clock cycles [5]. This amount of computer power can be achieved by using a suitable parallel architecture. In order to keep the highest level of flexibility, we have chosen to parallelize the fitness evaluation instead of the more easily implementable deme structure, which is discussed in Section IV. The advantages of this solution are that it does not restrict the GP algorithm and that it can use the idle computing time in a heterogeneous computing environment that consists, for instance, of Unix workstations and WinNT and Linux PCs. This paper presents the considerations related to an efficient implementation of such an environment as well as experimental results.

The rest of this paper is structured as follows. Section II explains the concept of genetic programming in more detail. Section III discusses the requirements we have for a GP environment and Section IV gives some details of the design choices we made. Section V illustrates the use of GP by applying it to the edge detection problem, while Section VI concludes the paper.

II. Genetic Programming

This section discusses several aspects of GP: the coding of individuals, the creation of the initial population, the fitness evaluation, the genetic operators and recent extensions to GP.

A. Individuals

The representation of the programs has to be chosen carefully. The individuals must use a representation that works well with genetic operators like crossover and mutation. This is guaranteed by using closure, which means that the output of each node, the basic building block of the programs, can be used as the input of any other node. Then, each new individual encodes a legal solution. Furthermore, the representation must be able to efficiently represent important programming concepts like conditional execution, loops and subroutines.
Fig. 1. Tree representation for GP.
To deal with these constraints, programs are represented by tree-like expressions. Constructing offspring using crossover between parents, described in Subsection II-D, then amounts to exchanging branches of the parents. This representation of genetic programs is often visualized as a tree. A possible GP outcome for the expression 6x² + 7x is shown in Figure 1. Alternatively, the same program can be written as a so-called LISP S-expression, (+ (* 6 (+ (* x x) x)) x), which provides a representation in prefix coding. A tree structure consists of two types of nodes, distinguished by their position in the tree. Terminal nodes are the leaves of the tree. They describe functions that have no inputs and one output. This way, terminals can be used as inputs to the program. Primitive functions are the non-leaf (internal) nodes of the tree. Primitive functions can be any function that has inputs. Both kinds of nodes have exactly one output.

B. Initial Population

The GP process starts by creating an initial population, containing a number of randomly created individuals. All individuals are subject to some constraints, like a maximum depth and a maximum number of nodes. Several methods of creating the initial population exist. When constructing an individual, one starts with the top-level node and repeatedly adds nodes to its inputs. Adding a terminal node terminates the construction of that branch. Adding a function node forces the construction of a new level. The full initialization creates trees of a given depth that contain terminal nodes only at that depth. The grow initialization, on the other hand, creates a tree by randomly choosing between a function and a terminal, until the maximum depth is reached. In general, a ramped half-and-half initialization shows the best results, since it creates the most diverse population. This method creates a ramped population, which means that the depth of the individuals is uniformly distributed over the range from 1 to the maximum depth. Furthermore, half of the population is fully initialized and the other half is grown.
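As an illustration of these construction schemes, the following is a minimal C++ sketch of grow, full and ramped half-and-half initialization on a simple prefix (vector-of-tokens) representation. The node sets, names and the fifty-fifty choice in the grow method are our own simplifications and do not claim to reproduce the actual environment.

#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Toy node sets: terminals have no inputs, functions have a fixed arity.
static const std::vector<std::string> kTerminals = {"x", "6", "7"};
static const std::vector<std::pair<std::string, int>> kFunctions = {{"+", 2}, {"*", 2}};

// Recursively emit a prefix-coded tree. With 'full' set, functions are chosen
// until the maximum depth is reached; with 'grow', each node is a random
// choice between a function and a terminal.
void buildTree(std::vector<std::string>& code, int depth, int maxDepth, bool full) {
    bool chooseTerminal = (depth == maxDepth) || (!full && std::rand() % 2 == 0);
    if (chooseTerminal) {
        code.push_back(kTerminals[std::rand() % kTerminals.size()]);
        return;
    }
    const auto& f = kFunctions[std::rand() % kFunctions.size()];
    code.push_back(f.first);
    for (int i = 0; i < f.second; ++i)
        buildTree(code, depth + 1, maxDepth, full);
}

// Ramped half-and-half: depths spread uniformly over 1..maxDepth,
// half of the individuals built with 'full', the other half with 'grow'.
std::vector<std::vector<std::string>> initialPopulation(int size, int maxDepth) {
    std::vector<std::vector<std::string>> pop(size);
    for (int i = 0; i < size; ++i) {
        int depth = 1 + i % maxDepth;
        bool full = (i % 2 == 0);
        buildTree(pop[i], 0, depth, full);
    }
    return pop;
}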
C. Fitness

Each individual has to be evaluated by applying it to certain test situations. How well the individual solves the problem is expressed in a fitness value. According to this value, candidate solutions are selected that will contribute to the next generation. The selection mechanism makes the better-quality solutions survive from generation to generation, while the inferior solutions are eliminated.

D. Genetic Operators

Once a fitness value is assigned to all individuals, a new generation is evolved. New individuals are constructed from the selected individuals by three different genetic operators:
• reproduction selects an individual and passes it unchanged to the next generation;
• crossover selects two parent individuals and combines fragments of them, resulting in two new offspring individuals (a minimal sketch is given at the end of this section);
• mutation selects an individual, imposes a small random change on it, and then passes it to the next generation.
The new individuals of the next generation may or may not represent an improved solution. However, the average fitness and the best fitness will in general increase in each generation.

E. Extensions to GP

In recent years, many useful extensions to the basic form of genetic programming have been proposed by several authors. In fact, some of these extensions are essential for solving real problems with GP.
• Koza [1] observed that GP often tries to construct some constants. An example of the constant 1/2, as observed by Koza, is given by (/ x (+ x x)). To help GP, he introduced the random constant as a terminal node. These nodes are assigned a random number during initialization. During the run, they simply act as if they were constant nodes.
• Automatically defined functions (ADFs) [6] can be used to create hierarchy in the program structure. ADFs are subroutines that are evolved, just like the main program. The nodes in the main program can then either be primitive functions or ADFs. Using this technique, it is for instance possible to evolve a robot that has its sensor processing in ADFs, while the main program controls its movements.
• The addition of memory to genetic programming [7] provides Turing completeness. This means that any program can be evolved by genetic programming.
• Architecture-altering operations [3] allow genetic programming to determine the architecture of its programs, where architecture here means a set of constraints on the structure of the program. This is, to some extent, already possible in classic GP, but using architecture-altering operations enables GP, for instance, to determine whether or not to use ADFs, loops, etc.
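As referred to above, the following minimal sketch shows subtree crossover on a plain pointer-tree representation: a random branch of each (cloned) parent is selected and the two branches are exchanged. It is meant only to make the operator concrete; the names are ours, and the environment itself stores individuals in prefix code, as discussed in Section IV.

#include <cstdlib>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct TreeNode {
    std::string symbol;                               // e.g. "+", "*", "x"
    std::vector<std::unique_ptr<TreeNode>> children;  // empty for a terminal
};

// Collect the 'slots' (owning pointers) of all non-root nodes, so that a
// randomly chosen subtree can be detached and re-attached elsewhere.
void collectSlots(TreeNode& node, std::vector<std::unique_ptr<TreeNode>*>& slots) {
    for (auto& child : node.children) {
        slots.push_back(&child);
        collectSlots(*child, slots);
    }
}

// Subtree crossover: pick one random crossover point in each parent (assumed
// to be freshly cloned copies) and exchange the subtrees. Closure guarantees
// that both offspring are again legal programs.
void crossover(TreeNode& a, TreeNode& b) {
    std::vector<std::unique_ptr<TreeNode>*> slotsA, slotsB;
    collectSlots(a, slotsA);
    collectSlots(b, slotsB);
    if (slotsA.empty() || slotsB.empty()) return;   // a parent is a single terminal
    auto* pa = slotsA[std::rand() % slotsA.size()];
    auto* pb = slotsB[std::rand() % slotsB.size()];
    std::swap(*pa, *pb);                            // exchange the branches
}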
III. Requirements for a Genetic Programming Environment

The requirements for a GP environment can be summarized in two words: efficient and flexible. Of course, these two requirements are somewhat contradictory. Since realistic GP problems require a lot of processing power, efficiency is the most important requirement. However, we would like a general environment in which any problem is easily implementable. Many GP environments are available on the Internet: in C or C++: GPC++ [8], GPK [9], lil-gp [10], GPData [11]; in Java: EJC [12], GPSys [13], JGProg [14]; and two distributed environments: parallel lil-gp in C [15] and DGP in Java [16]. However, none of them fulfills these demands. This section describes the requirements in more detail. At the same time, it provides a motivation why we did not choose to use one of the available environments.

The evaluation of the individuals should be highly efficient. As discussed in the next section, the system should use some kind of prefix coding for the individuals instead of trees of pointers.

The system should support all possible extensions to GP. If they are not already included in the environment, it should be simple to include them, without redesigning the entire system.

The system should not be limited to supporting the data structures representing the individuals and the genetic operators; it should also handle the evaluation of the individuals.

The system should be extensively configurable without the need for recompilation. This is important, since the best configuration is highly dependent on the problem considered. In general, several experiments have to be performed before a suitable configuration can be set.

The system should support distribution of the GP process. If possible, it should be able to use idle computation time in a multi-platform heterogeneous computer environment. Furthermore, the system should handle distribution entirely by itself. The user should not have to write code that is specific to distribution; instead, the distribution should be completely transparent to the user. The distributed systems we found on the Internet are not this flexible.

IV. The Design of a Genetic Programming Environment

This section discusses some details of the genetic programming environment as we implemented it. The system fulfills all requirements specified in the previous section. This section does not provide a complete description of the system; instead, it discusses the issues which we think are important.

The system is implemented in C++. This language is object oriented and supports polymorphism, which allows a very flexible system and the re-use of parts of the code. Furthermore, compilers for C++ are available for any platform, which is a requirement for cross-platform code.
Fig. 2. Prefix code implementation by means of a node list. (Prefix coded program: b² − 4ac, stored as the token sequence - * b b * 4 * a c; node list: * 4 a b c.)
Finally, the STL part of the C++ standard library provides powerful containers and algorithms, resulting in very efficient code.

The system uses prefix code for the storage of individuals. Although the logical structure of the individuals is a tree, this does not imply that they should be stored as pointer trees. Although the genetic operators are a little harder to implement, prefix code allows much more efficient evaluation of the individuals and has much lower memory requirements. The tree structure is implicitly stored in the data structure. This is important when individuals have to be sent over the network in a distributed system: unlike pointer trees, this representation does not have to be rebuilt once received over the network.

The prefix code of our system is based on the prefix jump table [17]. In this method, all nodes are represented by an index into a jump table. This jump table contains pointers to the functions to call. In our system, the jump table is replaced by a node list, containing all nodes that are used in the population. This is illustrated in Figure 2. Tokens are used as references to nodes instead of indices into the jump table, and polymorphism is used instead of pointers to functions. Using this implementation, nodes may contain their own data. This is especially useful for constants: random constants can be used by creating random constant nodes during the creation of the initial population. Furthermore, variables, world information and memory can easily be implemented.

The implementation of the environment is strictly separated from the user code. An important part of the design is the interpreter of the prefix codes: the environment carries out the execution of the individuals. The user is only a client programmer, who has to write the problem-specific parts. Examples are the code for new types of nodes and the code for the evaluation of the fitness of the individuals. The latter controls for which parameter settings the individuals have to be executed. This is illustrated in the edge detection example in Section V.
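The mechanism of a node list with polymorphic nodes can be sketched as follows. The class and member names are our own and the real interface of the environment may differ; the sketch only illustrates how small tokens index into a list of node objects and how evaluation walks the prefix code with a cursor.

#include <cstddef>
#include <memory>
#include <vector>

class Interpreter;  // forward declaration

// Base class of all nodes; polymorphism replaces the pointers to functions
// of the original prefix jump table.
class Node {
public:
    virtual ~Node() = default;
    virtual double eval(Interpreter& ip) const = 0;
};

// The interpreter owns the node list and keeps a cursor into the prefix code
// of the individual that is currently being executed.
class Interpreter {
public:
    explicit Interpreter(std::vector<std::unique_ptr<Node>> nodeList)
        : nodes_(std::move(nodeList)) {}

    // Execute one individual (a sequence of tokens) for a given input value x.
    double run(const std::vector<unsigned char>& prefixCode, double x) {
        code_ = &prefixCode;
        pos_ = 0;
        x_ = x;
        return evalNext();
    }

    // Consume the next token and evaluate the corresponding node.
    double evalNext() { return nodes_[(*code_)[pos_++]]->eval(*this); }
    double input() const { return x_; }

private:
    std::vector<std::unique_ptr<Node>> nodes_;
    const std::vector<unsigned char>* code_ = nullptr;
    std::size_t pos_ = 0;
    double x_ = 0.0;
};

// A few example nodes: a binary addition, the input variable x and a constant
// that carries its own data (as random constants do).
class AddNode : public Node {
    double eval(Interpreter& ip) const override {
        double a = ip.evalNext();   // left subtree
        double b = ip.evalNext();   // right subtree
        return a + b;
    }
};

class InputNode : public Node {
    double eval(Interpreter& ip) const override { return ip.input(); }
};

class ConstNode : public Node {
public:
    explicit ConstNode(double v) : value_(v) {}
private:
    double eval(Interpreter&) const override { return value_; }
    double value_;
};

Because the tree structure is implicit in the token order, a byte vector received over the network can be interpreted directly, without rebuilding any pointer structure.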
The system is fully configurable without the need for recompilation. All parameters, describing all settings and the entire problem, are set in one single text file. The system is designed in a strictly modular way. This way, it is even possible to select in the configuration settings, for instance, which nodes, which genetic operators, which evaluator, etc. to use.

The application of GP to realistic problems may take a lot of computational time. Therefore, it is highly recommended to use a distributed GP system. There are three ways in which GP can be distributed.
• Parallel runs. In general, GP will not yield a good solution each time it is tried. Therefore, a number of runs is needed in order to find a good solution. An obvious way of distributing GP is to execute these runs at the same time on different computers. However, this solution does not offer most of the advantages of real distribution. For instance, it does not speed up the execution of a single run.
• Demes. Demes are sub-populations that run at the same time, possibly on different computers. After each generation, a percentage of the individuals migrates to the neighboring demes. The use of demes is the most widely used method of distributing GP. However, for good results it requires all computers to run at approximately the same speed. This is not realistic in the case of a heterogeneous computer environment, and the use of idle time is not possible at all. Furthermore, the efficiency of this method is still a topic of discussion [18].
• Parallel fitness evaluation. The most flexible way of distributing GP is to use parallel fitness evaluation. In this method, the population is managed on a single machine, while individuals are sent to several computers for the fitness evaluation. This way, the inherent parallelism of GP is exploited without imposing any restrictions. The relatively small amount of time it takes to evaluate one single individual allows for fine-grain scheduling. Therefore, this approach is well suited to a heterogeneous computer environment and allows the use of idle time.
We have chosen to use parallel fitness evaluation as the distribution method. The computer configuration consists of one server and multiple clients. The server manages the population: it applies the genetic operators and sends the individuals to the clients, which perform the fitness evaluation. This process is completely transparent to the user. Since the clients only have to evaluate single individuals, their memory requirements are low. For this type of distribution, the communication overhead is critical. However, for problems that are worth distributing, the overhead is relatively low compared to the evaluation time, since the communication is minimized. At the start of a run, the evaluator initialization and the node list are sent to all clients. During the run, only the prefix coded program of each individual, requiring only 1 or 2 bytes per node, has to be transferred over the network. The fitness value is returned to the server.
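The communication layer of the environment is not described in detail here, so the following sketch only illustrates the division of labour in parallel fitness evaluation; std::async is used as a local stand-in for dispatching work to remote clients, and all names as well as the dummy client function are ours.

#include <future>
#include <vector>

// Stand-in for the client-side work: a real client would interpret the
// prefix code and run the user-supplied fitness evaluation; only the byte
// vector and the resulting fitness value would cross the network.
double evaluateOnClient(std::vector<unsigned char> prefixCode) {
    return static_cast<double>(prefixCode.size());  // dummy fitness
}

// Server side: farm out all individuals of the current generation and wait
// for the fitness values before applying selection and the genetic operators.
std::vector<double> evaluateGeneration(
        const std::vector<std::vector<unsigned char>>& population) {
    std::vector<std::future<double>> pending;
    pending.reserve(population.size());
    for (const auto& individual : population)
        pending.push_back(std::async(std::launch::async, evaluateOnClient, individual));

    std::vector<double> fitness;
    fitness.reserve(population.size());
    for (auto& f : pending)
        fitness.push_back(f.get());
    return fitness;
}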
V. Application of GP to Signal Processing

For many engineering problems, no exact optimal solution is known. In these cases, various computational intelligence methods, like neural networks, fuzzy logic, etc., are used to find an approximately optimal solution. Typical problems to solve with GP are found in image processing. Image processing is usually carried out by chaining a series of well-known image processing operators. However, in most image processing problems, there is not one solution that is known to be optimal. Therefore, it might be beneficial to evolve the image processing task as a genetic program, using a well-defined fitness function. This is the case if the desired result is known while the operations that are needed to produce this result are not.

A. Problem Description

We have chosen the edge detection problem to demonstrate the possibilities of our GP environment. In this application, the task of GP is to find a suitable FIR filter for edge detection [19]. The modulus of the response of the filter should have a clear peak at locations where the input signal contains a step edge. The FIR filter coefficients should be described by a function of a variable x, sampled at regular intervals in the interval (−1, 1). Such a function will be called a filter function (FF). The problem then reduces to finding a suitable FF. For a simple step edge in noise, the optimal solution is known to be the derivative of a Gaussian [20]. Although the existence of an optimal solution makes the problem an exception, it was chosen here because its simple nature helps clarify the use of genetic programming in a signal processing problem.

B. Application of GP

In this application, each individual encodes an FF. The first step is to choose and implement the nodes to use. The nodes are functions that can be used in the symbolic expression. We have chosen the following functions, which are all straightforward to implement:

binary:    add (x1 + x2), mul (x1 * x2), sub (x1 − x2), div (x1 / x2), pow (x1^x2)
unary:     log (log |x|), exp (e^x), sin (sin x), cos (cos x)
terminals: x (the input), rnd (random constant in the range from 0 to 2)
The function div is protected such that it will always produce a valid output: division by zero will result in one, in order to guarantee closure.
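Continuing the node sketch of Section IV (and assuming the same illustrative Node and Interpreter classes introduced there), a protected division node could look as follows.

// Protected division: any division by zero returns one, so that every
// individual keeps producing valid output (closure).
class DivNode : public Node {
    double eval(Interpreter& ip) const override {
        double numerator   = ip.evalNext();
        double denominator = ip.evalNext();
        return denominator == 0.0 ? 1.0 : numerator / denominator;
    }
};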
The criteria to measure the performance of the various edge detectors, given their responses to some edges, are obvious: both the error rate and the displacement of the peak from the real edge position should be minimized. The fitness function F is a function of the maximum peak amplitude A1, the second highest peak amplitude A2 and the displacement d of the peak from the real edge position:

F = \frac{A_1 - A_2}{A_1 (1 + d)}    (1)
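For one test edge, equation (1) could be computed along the following lines. This is only a sketch under our own simplifications: the "second highest peak" is taken here as the second largest sample of the modulus of the response, and the displacement is measured in samples.

#include <cmath>
#include <cstddef>
#include <vector>

// Fitness for a single test edge, following equation (1):
//   F = (A1 - A2) / (A1 * (1 + d))
// where A1 is the highest value of |response|, A2 the second highest and
// d the distance of the highest peak to the true edge position.
double edgeFitness(const std::vector<double>& response, std::size_t edgePos) {
    double a1 = 0.0, a2 = 0.0;
    std::size_t peakPos = 0;
    for (std::size_t i = 0; i < response.size(); ++i) {
        double m = std::fabs(response[i]);   // modulus of the filter response
        if (m > a1) {
            a2 = a1;
            a1 = m;
            peakPos = i;
        } else if (m > a2) {
            a2 = m;
        }
    }
    if (a1 == 0.0) return 0.0;               // flat response: worst fitness
    double d = std::fabs(static_cast<double>(peakPos) - static_cast<double>(edgePos));
    return (a1 - a2) / (a1 * (1.0 + d));
}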
This fitness value of an individual is found by applying the evolved filters to a number of step edges with various additive noise levels.

Then, the code for the evaluation of the fitness of an individual has to be written. The GP environment carries out the direct execution of the individual; in this case, this means that the coefficient values are returned for given values of x. The user code has to apply the edge-detection filter to the step edges from a benchmark set, evaluate the results and calculate the fitness. Of course, standard math libraries may be used, e.g. for the implementation of the filter action. Finally, the run parameters have to be configured. For instance, the population size, the maximum depth of the individuals and the maximum number of generations have to be set. Then, the process is started, resulting in an "optimal" evolved edge detector.

C. Results

The results were obtained using a first version of our GP environment that does not yet support distributed processing. On a single 500 MHz Pentium III, a run of 50 generations with a population size of 500 individuals takes approximately 30 seconds. About 80% of this time is spent in the filtering. In every run, an acceptable solution is found. The evolved filters do approximate the optimal filter very well. The optimal Canny filter is given by:

h(x) = -\frac{x}{\sigma^2} \exp\left(-\frac{x^2}{2\sigma^2}\right)    (2)

with σ² = 0.25, while examples of evolved filters are given by:

Filter 1: * sin * * sin x cos * * x 0.781933 1.8889 cos * * sin x cos x cos 1.18318 cos x

Filter 2: sin + sin + + sin sin sin sin + x + sin - 0.0461224 sin sin sin sin sin + x sin + x sin + - 0.0699288 x + x - + sin x x x + sin sin sin sin sin sin x x sin x - 0.0461224 x sin + sin sin + x + sin + x sin x - 0.0461224 x - x sin sin + x x - 0.0461224 sin sin sin x sin x + 0.0461224 x

and

Filter 3: sin sin + + x x x

Fig. 3. Optimal edge detection filter (a) and evolved filters (b-d): (a) Canny, (b) Filter 1, (c) Filter 2, (d) Filter 3.

The impulse responses of these filters, sampled at 15 points between -1 and 1, are plotted in Figure 3. It is only a coincidence that the Canny filter is mirrored with respect to the evolved filters; since the modulus of the response is taken, this does not influence the fitness values of the filters. The filters have been applied to a number of step edges. One example of such an edge is given in Figure 4(a). The responses of the filters are given in Figure 4(b). It can be seen from this figure that all filters provide a good detection of the edge.

Fig. 4. Step edge and filter response: (a) step edge, (b) filter responses.

One final remark has to be made. Once a satisfactory solution is found, GP is not needed anymore. The result of GP is a computer program, which can be used in the application domain in any way that one likes.
VI. Conclusions

Genetic programming is a method in which executable computer programs are evolved. In this paper, a distributed environment for using GP is presented, combining a very efficient design with a high degree of flexibility. This is achieved by a carefully designed C++ implementation. The system also supports distributed processing: in a heterogeneous computer network, client processes are used for the computationally intensive fitness evaluation, while a server process takes care of the evolution process. The capabilities of the system are illustrated by means of the evolution of an edge detection filter. It has been shown that the evolved filters perform approximately as well as the optimal edge detection filter by Canny. More information on the GP system can be found on http://www.sens.el.utwente.nl. Furthermore, a demo version of the system will be available from this location in the near future.
References

[1] John R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection, MIT Press, 1992.
[2] John R. Koza, Genetic Programming II: Automatic Discovery of Reusable Programs, MIT Press, Cambridge, Massachusetts, May 1994.
[3] John R. Koza, David Andre, Forrest H. Bennett III, and Martin Keane, Genetic Programming III: Darwinian Invention and Problem Solving, Morgan Kaufmann, Apr. 1999.
[4] A.M. Bazen and S.H. Gerez, "Computational intelligence in fingerprint identification," in Proceedings of SPS2000, IEEE Benelux Signal Processing Chapter, Hilvarenbeek, The Netherlands, Mar. 2000.
[5] F.H. Bennett III, J.R. Koza, J. Shipman, and O. Stiffelman, "Building a parallel computer system for $18,000 that performs a half peta-flop per day," in GECCO-99: Proceedings of the Genetic and Evolutionary Computation Conference, W. Banzhaf et al., Ed., July 1999, pp. 1484-1490.
[6] David Andre, "Automatically defined features: The simultaneous evolution of 2-dimensional feature detectors and an algorithm for using them," in Advances in Genetic Programming, Kenneth E. Kinnear, Jr., Ed., chapter 23, pp. 477-494. MIT Press, 1994.
[7] Astro Teller, "The evolution of mental models," in Advances in Genetic Programming, Kenneth E. Kinnear, Jr., Ed., chapter 9, pp. 199-219. MIT Press, 1994.
[8] Adam Fraser and Thomas Weinbrenner, "GPC++," http://thor.emk.e-technik.tu-darmstadt.de/~thomasw/gp.html, C++.
[9] Helmut Hörner, "GPK," http://aif.wu-wien.ac.at/~geyers/archive/gpk/vuegpk.html, C.
[10] Douglas Zongker, "lil-gp," http://garage.cps.msu.edu/software/lil-gp/lilgp-index.html, C.
[11] Andy Singleton and Bill Langdon, "GPData," ftp://cs.ucl.ac.uk/genetic/gp-code, C/C++.
[12] Sean Luke, "EJC," http://www.cs.umd.edu/projects/plus/ec/ecj/, Java.
[13] Adil Qureshi, "GPSys," http://www.cs.ucl.ac.uk/staff/A.Qureshi/gpsys_doc.html, Java.
[14] Robert Baruch, "JGProg," http://members.linuxstart.com/~groovyjava/JGProg/, Java.
[15] Johan Parent, "Parallel lil-gp," http://garage.cps.msu.edu/software/lil-gp/lilgp-index.html, C.
[16] Phyllis Chong, "DGP," http://studentweb.cs.bham.ac.uk/~fsc/DGP.html, Java.
[17] M.J. Keith and M.C. Martin, "Genetic programming in C++: Implementation issues," in Advances in Genetic Programming, chapter 13. MIT Press, 1994.
[18] B. Punch, "How effective are multiple populations in genetic programming," in Genetic Programming 1998: Proceedings of the Third Annual Conference, San Francisco, July 1998, pp. 308-313.
[19] C. Harris and B. Buxton, "Evolving edge detectors with genetic programming," in Genetic Programming 1996: Proceedings of the First Annual Conference, MIT Press, Cambridge, MA, USA, 1996, pp. 309-314.
[20] J. Canny, "A computational approach to edge detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-8, no. 6, pp. 679-698, 1986.