GENETIC ALGORITHMS VS. CLASSICAL SEARCH TECHNIQUES FOR IDENTIFICATION OF FUZZY MODELS
Antoni Jerez
Angela Nebot
Dept. Llenguatges i Sistemes Informatics, Universitat Politecnica de Catalunya, Modul C6 - Campus Nord, Jordi Girona Salgado, 1-3, Barcelona 08034, Spain. Phone: (343) 4015642 Fax: (343) 4017014
[email protected] [email protected]
Keywords: Fuzzy systems; Inductive reasoning; Genetic algorithms; Model identification.
ABSTRACT: Genetic algorithms (GAs) have been established as a viable technique for search problems across diverse disciplines of science and engineering. In this paper, different genetic algorithms are reported that have been implemented and applied to optimization problems in general system theory. The identification of qualitative models in Fuzzy Inductive Reasoning (FIR) is a complex search problem of behavior analysis. This paper presents a study of the performance of GAs on the search problems inherent in qualitative model identification, compared with classical search techniques.
INTRODUCTION
Genetic algorithms, originally developed by Holland in the early seventies (Holland, 73), are adaptive heuristic search algorithms based on the evolutionary ideas of natural selection and genetics. These algorithms encode a potential solution to a specific problem on a simple chromosome-like data structure, and apply recombination operators in order to preserve critical information (Whitley, 93). In this paper, genetic algorithms are viewed as function optimizers, although the range of problems that they can deal with is considerably broader.

In this paper, genetic algorithms are compared with classical search techniques in their ability to identify qualitative models in Fuzzy Inductive Reasoning (FIR). The study focuses on a single type of search problem: qualitative model identification in the fuzzy inductive reasoning methodology. The work described in (Nebot and Jerez, 97), presented at the EUFIT'97 conference, is taken as a starting point. In that research, different classical techniques (Hill-Climbing, Branch and Bound, and Beam Search with dynamic programming) were implemented in order to assess their performance in the search problem at hand. That paper shows that these classical search strategies can be attractive alternatives to exhaustive search (British Museum), as they considerably reduce the computational burden while preserving a high probability of finding either the best model or at least a suboptimal model of comparable quality.

As explained extensively in (Nebot and Jerez, 97), the FIR methodology is based on the General System Problem Solver (GSPS) (Klir, 85), a tool for general system analysis that allows the conceptual modes of behavior of dynamical systems to be studied. FIR is a powerful technique, suitable for modeling and simulation of systems for which no or only very limited a priori structural knowledge is available, such as in biomedicine, biology, and the economy.
For a deeper insight into the methodology as a whole, the reader is referred to (Cellier et al., 96).

One of the primary purposes of the FIR methodology is the identification of qualitative models of systems. Such models are called masks in the FIR terminology. The mask that best describes a system under study is obtained by computing a quality measure (a metric) for all possible masks and then selecting the one with the highest quality. The search problem is presented in more detail in (Nebot and Jerez, 97), which introduces the reader to the FIR qualitative model identification algorithm.
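The idea of scoring every possible model and keeping the best one can be illustrated with a minimal Python sketch. It is not the FIR implementation: candidates are represented here, purely for illustration, as tuples of 0/1 flags, the all-zero candidate is skipped by assumption, and `toy_quality` is a hypothetical stand-in for the FIR quality measure, which is defined in (Nebot and Jerez, 97) and not reproduced here.

```python
from itertools import product


def exhaustive_search(n_positions, quality):
    """Score every non-empty 0/1 candidate and return the best one.

    Returns (best_candidate, best_quality, number_of_candidates_scored).
    """
    best, best_q, scored = None, float("-inf"), 0
    for candidate in product((0, 1), repeat=n_positions):
        if not any(candidate):
            continue  # assumption: the all-zero candidate is not a model
        q = quality(candidate)
        scored += 1
        if q > best_q:
            best, best_q = candidate, q
    return best, best_q, scored


def toy_quality(candidate):
    """Hypothetical stand-in for the FIR quality measure (not the real one)."""
    target = (1, 1, 0, 0, 0, 1, 0, 1)
    return sum(a == b for a, b in zip(candidate, target)) / len(target)


best, best_q, scored = exhaustive_search(8, toy_quality)
print(best, best_q, scored)
```

With eight positions the loop scores 2^8 - 1 = 255 candidates. Every additional position doubles that number, which is why exhaustive search quickly becomes impractical.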
GENETIC ALGORITHMS FOR QUALITATIVE MODEL IDENTIFICATION
In the FIR methodology, an exhaustive search of all possible models can only be accomplished by means of an algorithm of exponential computational complexity, i.e., an algorithm that is intractable when dealing with complex systems (Cellier et al., 96). To address this problem, a first study based on classical search techniques was carried out (Nebot and Jerez, 97), with promising results. All of the techniques of polynomial computational complexity that were tested in that study performed quite well, at least for the given test cases. Encouraged by these results, and recognizing that even the faster techniques would eventually become too slow when applied to a sufficiently complex problem, it was decided to apply genetic algorithms to the same search problem in order to perform a comparative study of the two approaches. To this end, different types of genetic algorithms were designed and applied to the problem at hand.

The first step in solving a given optimization problem with a genetic algorithm is to find a way of representing rules in the given problem domain. Representation is actually the most critical issue when dealing with GAs. In a genetic "population," each individual "chromosome" is encoded as a finite-length vector of features, or variables, often expressed in terms of the binary alphabet {0,1}. The bit string representation has several advantages: it is simple to create and manipulate, many types of information can easily be encoded in this way, and the genetic operators are easy to apply. The identification of a good representation of a given problem is essential for the genetic algorithm to obtain satisfactory results.

In order to obtain a good individual representation of FIR models, it is necessary to know more about how models are represented in the FIR methodology. An inductive reasoning model is a mask that relates the input variables and previous values of the output variables to the current values of the outputs.
An example of a mask (model) is presented below:
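$$
\begin{array}{c|ccc}
t \backslash x & u_1 & u_2 & y_1 \\ \hline
t - 2\delta t & -1 & -2 & 0 \\
t - \delta t & 0 & 0 & -3 \\
t & 0 & -4 & +1
\end{array}
\qquad (1)
$$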
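To make the mask notation concrete, the following Python sketch stores the mask of Equation (1) as a small matrix and derives a bit string that flags which non-output positions hold a negative entry (a causal input to the output). The function names and the flat-index argument `out_pos` are illustrative assumptions, not part of the FIR software.

```python
# Mask of Equation (1): rows are t-2*dt, t-dt, t; columns are u1, u2, y1.
MASK = [
    [-1, -2, 0],
    [0, 0, -3],
    [0, -4, 1],
]


def mask_to_chromosome(mask):
    """Return one bit per non-output position, scanned row by row.

    A bit is 1 where the mask entry is negative and 0 where it is zero.
    The single positive entry (the output) is skipped.
    """
    return [1 if v < 0 else 0 for row in mask for v in row if v <= 0]


def chromosome_to_mask(bits, n_cols, out_pos):
    """Rebuild a mask from a bit string (illustrative inverse mapping).

    out_pos is the flat index of the output cell. Negative entries are
    numbered -1, -2, ... in scan order, which is an arbitrary labelling.
    """
    flat, label, it = [], 0, iter(bits)
    for i in range(len(bits) + 1):
        if i == out_pos:
            flat.append(1)
        elif next(it):
            label += 1
            flat.append(-label)
        else:
            flat.append(0)
    return [flat[r:r + n_cols] for r in range(0, len(flat), n_cols)]


chromosome = mask_to_chromosome(MASK)
print(chromosome)  # prints [1, 1, 0, 0, 0, 1, 0, 1]
```

Because the output position is fixed, the eight remaining bits are enough to rebuild the mask: `chromosome_to_mask(chromosome, 3, 8)` returns the matrix above.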
Each negative element in the mask is called an m-input. It denotes a causal relation with the output, and each m-input influences the output to a certain degree. The enumeration of the m-inputs is immaterial. The single positive value denotes the output, called the m-output. The term m-input is used to avoid a potential confusion with the inputs of the system. In the above example, the m-output, y1(t), is said to depend on four quantities, namely u1(t - 2δt), u2(t - 2δt), y1(t - δt), and u2(t). Negative mask elements denote important relations with the designated m-output, whereas zero elements denote less important relations.

Each mask element that is not the designated m-output can be viewed as a potential m-input. Therefore, a natural binary representation of a chromosome can be derived from the positions of the actual m-inputs within the mask. For the example given above, a chromosome representation of that model could be: 1 1 0 0 0 1 0 1, where the 1 elements denote the positions of m-inputs within the mask, enumerated from left to right and from top to bottom. The m-output is not represented in the chromosome since its position within the mask is fixed. The set of all possible chromosomes (i.e., the set of all vectors of length eight consisting of zero and one elements only) is identical to the set of all possible masks.

Once a representation of a chromosome is defined, a fitness function needs to be associated with it. In the model identification problem considered here, the quality of the mask is a natural fitness function. The quality of the mask is based on an uncertainty measure and a complexity measure, as explained in detail in (Nebot and Jerez, 97). For each mask, a quality can be computed that assesses the prediction power of that model. The GAs maintain populations of n chromosomes (masks) with associated fitness values. The number of individuals in a population is fixed for any given GA. It is a user-modifiable parameter.

[Figure 1 diagram: Adult Population (Generation i) -> Selection -> Parent Population (Generation i) -> Crossover -> Child Population (Generation i+1) -> Mutation -> Adult Population (Generation i+1)]

Figure 1: Stages of GA reproduction

The reproduction proceeds in three stages, as shown in Figure 1. Potential "parents" are selected from the ith generation of the current "adult" population to mate on the basis of their fitness, producing "offspring" via
a reproductive plan. Selection is the most critical issue after representation when dealing with GAs. Three different types of selection algorithms have been implemented in this research in order to assess their relative performance for the task at hand: proportional, linear ranking, and tournament selection. In all three cases, a potential "parent" population of the same size, n, as the "adult" population is chosen, taking into account the fitness assigned to each chromosome. An individual "adult" with a higher fitness than the other adults is assigned a higher probability of becoming a parent, and thereby of passing parts of its chromosome on to the next generation. The three techniques differ in how they select the potential "parent" population.

Proportional selection is the original method proposed for genetic algorithms by Holland (Holland, 73). The probability of an individual being selected is simply proportional to its fitness value. The main drawback of this technique is that, when the fitness values of all the individuals of a given population are similar, the selection algorithm becomes essentially a random process. In spite of this disadvantage, the proportional selection method is still widely used in GA applications (Blickle and Thiele, 95).

Linear ranking selection was first introduced by Baker to eliminate the serious disadvantages of proportional selection (Grefenstette and Baker, 89). In this algorithm, the individuals are sorted according to their fitness values, and a rank (a number between 1 and n) is assigned to each individual in the adult population. The selection probability is then chosen proportional to the assigned rank (Blickle and Thiele, 95b).

In tournament selection, 60% of the individuals are picked randomly from the adult population, and a copy of the best individual from this group is placed in the parent population.
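The three selection schemes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: fitness values are assumed to be positive, rank 1 is assumed to go to the least fit individual (so that the fittest gets rank n), and each tournament group is drawn without replacement. Every function returns n parents, so a full tournament selection runs n tournaments.

```python
import random


def proportional_selection(population, fitness, rng):
    """Draw n parents with probability proportional to fitness."""
    picks = rng.choices(population, weights=fitness, k=len(population))
    return [list(p) for p in picks]  # copies, so parents can be edited safely


def linear_ranking_selection(population, fitness, rng):
    """Draw n parents with probability proportional to rank (1 = worst)."""
    order = sorted(range(len(population)), key=lambda i: fitness[i])
    ranks = [0] * len(population)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    picks = rng.choices(population, weights=ranks, k=len(population))
    return [list(p) for p in picks]


def tournament_selection(population, fitness, rng, fraction=0.6):
    """Run n tournaments, each over a random 60% of the adults."""
    n = len(population)
    size = max(1, round(fraction * n))
    parents = []
    for _ in range(n):
        group = rng.sample(range(n), size)           # no sorting needed
        winner = max(group, key=lambda i: fitness[i])
        parents.append(list(population[winner]))
    return parents


rng = random.Random(0)
adults = [[0, 0, 1], [0, 1, 0], [0, 1, 1], [1, 0, 0], [1, 0, 1]]
quality = [0.1, 0.2, 0.3, 0.4, 0.5]
print(tournament_selection(adults, quality, rng))
```

With a tournament group covering 60% of the population, the weakest 40% of the adults can never win a tournament, which gives a feel for how strongly this variant favors fit individuals.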
This tournament is repeated n times, until n parents have been chosen (Blickle and Thiele, 95). Tournament selection can be implemented very efficiently, as no sorting of the population is required.

After n potential parents have been selected in accordance with any one of the above three techniques, the reproduction takes place in a process of mating. Two chromosomes are randomly selected from the parent population, making them the parents of two children. The first "child" has a chromosome consisting of the head of the chromosome of its "father" concatenated with the tail of the chromosome of its "mother," whereas the second child's chromosome uses the head of its mother's concatenated with the tail of its father's. The operator responsible for arbitrarily selecting the point in the chromosome strings where the parent chromosomes are broken apart and recombined to form the children's chromosomes is commonly referred to as the crossover operator. The mating process is repeated 0.8 · (n/2) times, until 80% of the "children" have been "begotten." The other 20% of the children are "cloned" from the remaining members of the parent population that were not used in the crossover process.

In order to allow for evolution to occur, a mutation operator is also introduced. 0.1% of the children in each new generation are allowed to mutate, by picking an arbitrary gene in their chromosome and toggling it from "0" to "1," or from "1" to "0." The n individuals that result after mutation has occurred form the adult population of the next generation.

With this method, a new generation of chromosomes is produced that contains, on average (according to the theory), more good genes than the previous generation. Each successive generation will contain more good partial solutions than previous generations. Eventually, the overall population fitness will converge on a maximum, and new generations will no longer produce offspring that are noticeably different in fitness from those in earlier generations.
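The mating and mutation steps can be sketched in Python as shown below. Several details are assumptions of this sketch rather than statements from the paper: parents are paired by shuffling the parent population (so that no parent is used twice and the unused ones can be cloned), the cut point always leaves at least one gene on each side, and the 0.1% mutation share is interpreted as a per-child mutation probability of 0.001.

```python
import random


def one_point_crossover(father, mother, rng):
    """Swap tails at a random cut point and return the two children."""
    cut = rng.randrange(1, len(father))
    return father[:cut] + mother[cut:], mother[:cut] + father[cut:]


def reproduce(parents, rng, crossover_share=0.8, mutation_rate=0.001):
    """Build the next adult population from a list of n parents."""
    n = len(parents)
    pool = list(parents)
    rng.shuffle(pool)  # random pairing of parents
    n_matings = round(crossover_share * (n // 2))
    children = []
    for i in range(n_matings):
        father, mother = pool[2 * i], pool[2 * i + 1]
        children.extend(one_point_crossover(father, mother, rng))
    # Parents that were not used in crossover are cloned.
    children.extend(list(p) for p in pool[2 * n_matings:])
    for child in children:
        if rng.random() < mutation_rate:
            gene = rng.randrange(len(child))
            child[gene] = 1 - child[gene]  # toggle 0 <-> 1
    return children


rng = random.Random(1)
parents = [[rng.randint(0, 1) for _ in range(8)] for _ in range(100)]
children = reproduce(parents, rng)
print(len(children))  # prints 100
```

For n = 100 this performs 40 matings, giving 80 children by crossover and 20 clones, so the population size stays constant from one generation to the next.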
Once new generations no longer improve noticeably in fitness, the algorithm is said to have converged to a set of solutions to the problem at hand. In the current implementation of GAs in FIR, the following termination rule is used: if the fitness values of 90% of the chromosomes of the current population do not differ by more than 1% from the mean fitness value of the current population, the algorithm is assumed to have converged. It may happen that the convergence criterion is never met by the genetic algorithm. For this reason, a generation counter is also provided, whose limit is a user-modifiable parameter. The optimization process is terminated when either convergence has occurred or the maximum allowed number of iterations has taken place, whichever comes first. Usually the number of iterations is limited to somewhere between 100 and 1000. This will allow the GA to converge in most cases. However, this parameter may need to be adjusted to meet the requirements of any specific application.
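The termination rule described above can be sketched in Python as follows. The 1% tolerance is interpreted here as relative to the mean fitness, which is an assumption. `evaluate` and `next_generation` are hypothetical callables standing in for the fitness function and for one selection, crossover and mutation cycle.

```python
def has_converged(fitness, share=0.9, tolerance=0.01):
    """True if at least 90% of the values lie within 1% of the mean."""
    mean = sum(fitness) / len(fitness)
    close = sum(1 for f in fitness if abs(f - mean) <= tolerance * abs(mean))
    return close >= share * len(fitness)


def run_ga(population, evaluate, next_generation, max_generations=100):
    """Iterate until convergence or until the generation limit is reached."""
    generations = 0
    fitness = [evaluate(c) for c in population]
    while generations < max_generations and not has_converged(fitness):
        population = next_generation(population, fitness)
        fitness = [evaluate(c) for c in population]
        generations += 1
    return population, fitness, generations


# Toy usage: the "next generation" simply copies the fittest chromosome,
# so the population becomes uniform after a single generation.
def copy_best(population, fitness):
    best = population[fitness.index(max(fitness))]
    return [list(best) for _ in population]


final, fit, gens = run_ga([[0, 0], [1, 1]], lambda c: 1 + sum(c), copy_best)
print(final, fit, gens)
```

Checking the generation counter and the convergence test in the same loop condition stops the run at whichever limit is reached first.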
CLASSICAL SEARCH STRATEGIES VS. GENETIC ALGORITHMS
In order to compare classical search strategies with genetic algorithms, the same biomedical system that was presented in (Nebot and Jerez, 97), a real application stemming from the domain of anesthesiology, is used. The main goal is to obtain a control model for the amount of anesthetic agent to be administered to a patient during surgery (Nebot et al., 96).

The results obtained with classical search techniques are shown in the first columns of Table 1. The classical strategies implemented are: British Museum (B.M.), British Museum with Pruning (B.M.P.), Hill-Climbing (H.C.), Beam search (B.), and Branch and Bound (B.B.). The results from the experiments conducted with genetic algorithms are shown in the last columns of the same table. The genetic algorithms implemented use proportional (GAP.), linear ranking (GALR.), and tournament (GAT.) selection. For each of these techniques, the CPU time (measured in seconds), the number of masks explored, the best mask found, and the quality of this mask are shown.

Two different tests were performed. In the first experiment, masks with a maximum of 12 non-zero elements (k) were considered, so that the complete search space comprises 2^11 - 1 = 2047 masks. In the second experiment, masks with a maximum of 16 non-zero entries were considered, corresponding to a search space of 2^15 - 1 = 32767 masks. The results of these tests are presented in Table 1.

k    Row         B.M.     B.M.P.   H.C.     B.       B.B.     GAP.     GALR.    GAT.
12   Time (s)    153      52       2        123      6        30       18       30
12   Num.Masks   2047     459      38       684      71       600      400      600
12   Best Mask   opt.     opt.     subopt.  opt.     subopt.  subopt.  subopt.  subopt.
12   Quality     0.5968   0.5968   0.5470   0.5968   0.5237   0.5470   0.4902   0.5470
16   Time (s)    7395     231      2        651      13       84       25       113
16   Num.Masks   32767    1373     54       5508     125      1600     500      2200
16   Best Mask   opt.     opt.     subopt.  subopt.  subopt.  subopt.  subopt.  opt.
16   Quality     0.5968   0.5968   0.5243   0.5407   0.4141   0.4941   0.4737   0.5968
Table 1: Results of the anesthesiology application with maximum mask complexities of 12 and 16

The parameter n of the three GAs, which represents the number of individuals in each population, was set to 100 in all tests for which results are shown in Table 1. From Table 1, it can be postulated that linear ranking is the genetic algorithm that converges the fastest among the three types implemented. Convergence was reached in 4 and 5 iterations (generations), respectively. However, the best masks found in the two experiments are suboptimal masks of clearly inferior quality. From the results obtained for the anesthesiology application, as well as from other applications not discussed in this paper, it can be concluded that the tournament and proportional selection methods usually yield results that are superior to those obtained using the linear ranking approach with respect to the quality of the suboptimal mask found. It is important to notice that the number of iterations is still rather small even when using the tournament strategy: only 6 and 22 generations were needed to reach convergence in the anesthesiology example.

Genetic algorithms seem to work about as well as the "intermediate" classical search strategies. The British Museum strategy corresponds to exhaustive search, and therefore, although it always finds the optimal mask, its exponential computational complexity makes this approach unsuitable for dealing with complex systems. On the other hand, the Hill-Climbing and Branch and Bound strategies are very fast, but usually find suboptimal masks of unacceptably inferior quality. The intermediate classical strategies are British Museum with Pruning and Beam search. These techniques usually find a suboptimal mask with a quality that is quite close to that of the optimal mask. They offer polynomial computational complexity, which may make them acceptably efficient even when dealing with complex systems.
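The generation counts quoted for Table 1 follow from the number of masks explored divided by the population size n = 100. The short Python check below reproduces them and adds the share of the search space that each GA run evaluated. It assumes that every generation evaluates exactly n masks and that masks evaluated more than once are counted each time, so the percentages are upper bounds on the distinct masks visited.

```python
N = 100  # population size used for all GA runs in Table 1
SEARCH_SPACE = {12: 2**11 - 1, 16: 2**15 - 1}
MASKS_EXPLORED = {  # "Num.Masks" rows of Table 1, GA columns only
    "GAP.": {12: 600, 16: 1600},
    "GALR.": {12: 400, 16: 500},
    "GAT.": {12: 600, 16: 2200},
}


def generations(masks_explored, n=N):
    """Number of generations, assuming n evaluations per generation."""
    return masks_explored // n


def explored_share(masks_explored, k):
    """Fraction of the search space covered by the evaluations."""
    return masks_explored / SEARCH_SPACE[k]


for name, runs in MASKS_EXPLORED.items():
    for k, masks in runs.items():
        print(f"{name:6} k={k}: {generations(masks):2d} generations, "
              f"{100 * explored_share(masks, k):.1f}% of the search space")
```

Linear ranking thus stops after 4 and 5 generations and tournament selection after 6 and 22, matching the figures in the text. For k = 16, none of the three GA runs evaluated more than about 7% of the 32767 masks.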
The genetic algorithms using the proportional and tournament strategies offer a performance quite similar to that of those intermediate classical strategies.

Table 2 shows the results obtained for two additional applications, both real applications from the soft sciences. The first of them stems from the domain of cardiology, whereas the second stems from the shrimp farming domain. In both cases, the search space comprises 2^11 - 1 = 2047 masks, and the number of individuals in a population is n = 100 for all the genetic algorithms used. The parameter for Beam search is k = 2 in both cases. These two applications are included in the paper to further illustrate the relative efficiency of both kinds of strategies, classical search and genetic algorithms. As shown in Table 2, even those techniques that converge more rapidly with minimal computational effort, such as Hill-Climbing and the GA with linear ranking, are able to find either the optimal or at least a very good suboptimal mask. In Table 2, the fitness values of the optimal and suboptimal masks are given in parentheses. The best suboptimal masks found in the shrimp farming application
by the different approaches are all exactly the same, with a quality almost identical to that of the optimal mask.

Appl.   Row          B.M.          B.M.P.   H.C.             B.        B.B.      GAP.      GALR.            GAT.
Card.   NM           2047          455      38               670       71        1600      500              1700
Card.   BM (Qual.)   opt(0.7602)   opt.     opt.             opt.      opt.      opt.      subopt(0.7531)   opt.
Shri.   NM           2047          599      21               708       66        1500      400              800
Shri.   BM (Qual.)   opt(0.7828)   opt.     subopt(0.7826)   subopt.   subopt.   subopt.   subopt.          subopt.

NM: number of masks explored. BM (Qual.): best mask found, with its quality in parentheses.
Table 2: Results of the cardiology and shrimp farming applications with maximum mask complexities of 12

One advantage of genetic algorithms is that it is always possible to terminate the optimization early. The price for doing so will be a set of masks of reduced quality, but at least there will always be a pool of masks better than that with which the optimization started. With the classical techniques, this is not the case. Early termination will usually result in rather poor masks, because the good suboptimal masks are all of similar complexity, a complexity level that may not yet have been reached by these search techniques at the time of interruption.

The main advantage of genetic algorithms in the identification of qualitative models, however, is without any doubt that they can be used on problems to which classical search techniques cannot be applied due to the extremely large size of their search space. However, it is important to keep in mind that one of the primary complaints against genetic algorithms is the occurrence of premature convergence (Potts et al., 94), which is usually more critical when the search space is large.

Another type of GA has been applied earlier to solving complex problems in general system theory (Dobransky and Wierman, 96). In that work, the GA used was Genitor (Whitley, 93). Genitor differs significantly from the GAs presented in this paper in that the better chromosomes found in the search are maintained over multiple generations.
CONCLUSIONS
In this paper, genetic algorithms are compared with classical search strategies for the identification of qualitative models in the FIR methodology. Of the different selection methods implemented (proportional, linear ranking, and tournament), the last one seems to offer the best compromise between computational efficiency and quality of the best mask found. From a performance point of view, GAs are comparable with the "intermediate" classical search techniques, such as British Museum with Pruning and Beam search. Several applications have been presented in this paper, all of them stemming from the soft sciences.
REFERENCES
Blickle, T. and Thiele, L. 1995. A Comparison of Selection Schemes used in Genetic Algorithms. TIK-Report Nr. 11, Swiss Federal Institute of Technology.
Cellier, F.E. and Nebot, A. and Mugica, F. and de Albornoz, A. 1996. Combined Qualitative/Quantitative Simulation Models of Continuous-Time Processes Using Fuzzy Inductive Reasoning Techniques. International Journal of General Systems Vol. 24 Num. 1-2, pp. 95-116.
Dobransky, M.K. and Wierman, M.J. 1996. Genetic Algorithms: A Search Technique Applied to Behavior Analysis. International Journal of General Systems Vol. 24 Num. 1-2, pp. 125-135.
Grefenstette, J.J. and Baker, J.E. 1989. How genetic algorithms work: A critical look at implicit parallelism. Proceedings of the Third International Conference on Genetic Algorithms, San Mateo, CA, pp. 20-27.
Holland, J.H. 1973. Genetic algorithms and the optimal allocations of trial. SIAM J. Computing Vol. 2 Num. 2, pp. 88-105.
Klir, George J. 1985. Architecture of Systems Problem Solving. Plenum Press, New York.
Nebot, A. and Cellier, F.E. and Linkens, D.A. 1996. Synthesis of an Anaesthetic Agent Administration System Using Fuzzy Inductive Reasoning. Artificial Intelligence in Medicine Vol. 8.
Nebot, A. and Jerez, A. 1997. Genetic Algorithms vs. Classical search Techniques for Identification of fuzzy Models. EUFIT'97, 5th European Congress on Intelligent Techniques and Soft Computing, Aachen, Germany.
Potts, J. and Giddens, T.D. and Yadav, S.B. 1994. The Development and Evaluation of an Improved Genetic Algorithm Based on Migration and Artificial Selection. IEEE Transactions on Systems, Man and Cybernetics Vol. 24 Num. 1, pp. 73-85.
Whitley, D. 1993. A Genetic Algorithm Tutorial. Technical Report CS-93-103, Colorado State University.