DERIVATION OF MEMBERSHIP FUNCTIONS FOR FUZZY VARIABLES USING GENETIC ALGORITHMS
By Jun Chen
A Project Report Submitted to the Faculty of Mississippi State University in Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science in the Department of Computer Science
Mississippi State University August 1998
Approved:
Susan Bridges Associate Professor of Computer Science (Major Advisor and Graduate Coordinator)
Julia Hodges Professor of Computer Science (Committee Member)
Rainey Little Associate Professor of Computer Science (Committee Member)
Name: Jun Chen Date of Degree: August 1998 Institution: Mississippi State University Major Field: Computer Science Major Professor: Dr. Susan Bridges Title of Study:
DERIVATION OF MEMBERSHIP FUNCTIONS FOR FUZZY VARIABLES USING GENETIC ALGORITHMS
Pages in Study: 46 Candidate for Degree of Master of Computer Science
This report explores the applicability of genetic algorithms (GAs) to derive fuzzy variable membership functions in the domain of radiological waste characterization and classification. The experimental results show that the performance of the target fuzzy expert system can be improved by using the membership functions evolved by the GA to tune membership functions provided by human experts. Chapter I introduces the problem briefly and discusses why using machine learning technologies to improve the fuzzy variable definitions is desirable. Chapter II describes how fuzzy logic and GAs work. The major steps of a GA are discussed. Other work in applying GAs in fuzzy applications is also reviewed. In Chapter III, the purpose, structure, and implementation of the fuzzy expert system and the problems with the system performance are discussed. In Chapter IV, the design and implementation of the GA are presented in detail. This includes the design of data structures for genes and chromosomes, the design of the fitness evaluation function, the choice of selection strategies, and the implementation of genetic operators.
The influence of genetic parameters is also discussed. Chapter V presents the performance of the GA under different genetic parameter configurations and the corresponding performance improvement of the fuzzy expert system when using the membership functions evolved by the GA. Chapter VI summarizes the work and points out further work that can be done to improve the performance of the fuzzy expert system.
DEDICATION I would like to dedicate this project to my wife, Meiling Sha, and my son, Baijiang Chen.
ACKNOWLEDGMENTS I am very grateful to Dr. Susan Bridges, who contributed many valuable insights into my research during the past one and a half years. As my major advisor, she provided me many directional ideas in determining this project topic. She also directed the entire process of this project and the writing of this report and my Master’s program. I owe thanks to Dr. Julia Hodges for her valuable suggestions and comments in this and our DIAL project. And I also want to thank Dr. Rainey Little for his kind help in my study here. Last, I want to thank Dr. Gene Boggess. The initial idea about this project was first formed while I was taking his great class “Genetic Algorithms” last year.
TABLE OF CONTENTS
DEDICATION
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTERS
I. INTRODUCTION TO THE PROBLEM
II. LITERATURE REVIEW
2.1 Fuzzy Logic
2.2 Genetic Algorithms
2.2.1 Introduction
2.2.2 Selection Methods
2.2.3 Crossover
2.2.4 Mutation
2.2.5 Genetic Parameters
2.3 Applications of Genetic Algorithms in Fuzzy Systems
III. PROBLEM DESCRIPTION
IV. DESIGN AND IMPLEMENTATION
4.1 General Description of the Algorithm
4.2 Implementation
4.2.1 Data Structures
4.2.2 Fitness Function
4.2.3 Selection Strategy
4.2.4 Genetic Operators
4.2.5 Genetic Parameters
4.2.6 Initialization of the Population
4.2.7 Other Implementation Details
V. RESULTS AND ANALYSIS
5.1 Population Size
5.2 Number of Generations
5.3 Mutation Probabilities
5.4 Summary of Results
5.5 Comparison with the Results of a Neural Network
VI. CONCLUSION
REFERENCES
APPENDIX
A. PROJECT CONTRACT
B. INITIAL FUZZY VARIABLE DEFINITIONS AND FUZZY RULE SET
C. TRAINING AND TEST DATA
LIST OF TABLES
3.1 Performance of the Fuzzy Expert System Measured in Absolute Error
4.1 The Structures of a Single Gene Representing One Fuzzy Variable with 3 Fuzzy Sets as Values
4.2 The Structures of a Single Gene Representing One Fuzzy Variable with 4 Fuzzy Sets as Values
5.1 Comparison of the Performance Improvement of the Fuzzy Expert System Using GA When the Population Size is 30 and 50
5.2 Comparison of the Performance Improvement of the Fuzzy Expert System Using GA When Number of Generations is 50, 100, and 200
5.3 Comparison of the Performance Improvement of the Fuzzy Expert System Using GA When the Mutation Probability is Dynamic and Fixed
5.4 Comparison of the Performance of the Fuzzy System Tuned by the GA and the Performance of a BP Neural Network
LIST OF FIGURES
2.1 Graphical Representation of Fuzzy Set “Young” and “Old”
2.2 Linear Function Representation of Fuzzy Set “Young”
3.1 Reasoning Structure of the Expert System Prototype
3.2 An Example of a Fuzzy Variable Definition and a Fuzzy Rule
4.1 Processing Flow of the GA
4.2 Graphical Representation of S, Z, and PI Functions and Their Parameters
4.3 Fitness Evaluation Process
5.1 Comparison of Fitness Evolution for Population Sizes of 30 and 50
5.2 Fitness Evaluation for 200 Generations
5.3 Comparison of Fitness Evolution When the Mutation Probability is Dynamic and Fixed
5.4 The Greatest Performance Improvement on Training Data
5.5 The Greatest Performance Improvement on Test Data
CHAPTER I
INTRODUCTION TO THE PROBLEM
Scientists at the Mississippi State University Diagnostic and Instrumentation Laboratory (DIAL) are working with researchers at the Idaho National Engineering and Environmental Laboratory (INEEL) to develop an expert system that can characterize and classify containerized radiological waste (Bridges, Hodges, and Sparrow 1996). This radiological waste is a byproduct of the manufacture of nuclear weapons during the Cold War. After several decades, the waste is still radioactive, so special measures must be taken before it can be remediated. To reduce the processing cost, it is necessary to classify the waste into one of two categories: waste that meets the criteria to be shipped to the Waste Isolation Pilot Plant (WIPP) in New Mexico and waste that can be disposed of locally (Bridges, Hodges, and Sparrow 1996). Because the waste is stored in sealed drums that cannot be opened for inspection, the classification depends on data from real-time radiography, passive and active neutron assay systems, and historical records. However, the detection ability of the instruments is limited, human observation is not always reliable, and the historical records may contain errors. All these factors mean that the data used in the classification procedure are inherently uncertain. Any expert system constructed to address this problem must therefore be able to handle uncertainty.
Four main approaches have been investigated as potential methods for handling uncertainty in this domain: certainty factors, Dempster-Shafer theory, Bayesian networks, and fuzzy logic (Bridges, Hodges, and Sparrow 1997). Four prototype expert systems have been constructed, one for each of these uncertainty models. When the experimental results from these prototype systems were compared, fuzzy logic was found to be a relatively promising method for dealing with the uncertainty in the waste classification domain (Bridges, Hodges, and Yie 1996). However, the results given by the fuzzy logic prototype were still not very satisfactory. Further investigation suggested a possible reason: the fuzzy variable definitions (membership functions) given by domain experts may need to be tuned. For the fuzzy reasoning expert system to work, the fuzzy variable definitions and rules corresponding to the problem must first be given by domain experts. Domain experts define these membership functions and fuzzy rules based on their experience. Usually there is no solid scientific foundation they can rely on when determining those definitions, so the definitions, especially the membership functions, are subjective in nature. Consequently, it is not unexpected that the initial definitions and the corresponding initial results may not satisfy the proposed requirements and thus need to be adjusted. To complete the task, domain experts can iterate the defining process, but this is time-consuming and costly. An alternative is to introduce machine learning technologies to improve the definitions, since the iterative defining process can be viewed as a typical learning task.
There are many machine learning technologies available. Among them, genetic algorithms (GAs) are a relatively recent development (Mitchell 1996). Genetic algorithms are especially useful for solving optimization problems, though they are also very useful in many other fields (Beasley, Bull, and Martin 1993). More specifically, GAs are especially appropriate when an explicit search strategy is not available for solving optimization or other types of problems. This characteristic makes GAs well suited to our problem, because the search space is huge and it is difficult to tell in advance how to adjust the definitions. GAs have previously been used to derive both fuzzy rules and membership functions for fuzzy variables (Homaifar and McCormick 1995). To simplify our problem, only the membership functions of the fuzzy variables were optimized using GAs. One assumption throughout this project is that the given training data and fuzzy rules are representative and accurate enough to direct the learning process.
CHAPTER II
LITERATURE REVIEW
In this study, two very distinct artificial intelligence (AI) methodologies are involved. While fuzzy logic is a symbolic AI approach, a genetic algorithm (GA) is a non-symbolic method. Both methods can be used to solve the same problem at the same level. In this study, however, a GA is used to improve the performance of a fuzzy reasoning system, so the GA sits at a lower level than fuzzy logic in the structure of this study. Following is a brief review of the two approaches and the applications of GAs in fuzzy systems.
2.1 Fuzzy Logic
The theoretical basis of fuzzy logic is fuzzy set theory, which was first proposed by Lotfi Zadeh in 1965 (Stefik 1995). Fuzzy set theory was introduced to solve problems that classical set theory and two-valued logic cannot handle. In the real world, people possess extensive abilities to deal with fuzzy knowledge, which may be “vague, imprecise, uncertain, ambiguous, inexact or probabilistic in nature” (Orchard 1995). People are also able to reason and solve problems using this fuzzy knowledge. While it is difficult for traditional logic to represent these fuzzy concepts and simulate the fuzzy reasoning process, fuzzy logic overcomes this limitation by extending classical set theory and logic.
In traditional set theory, an element x in the universal set U either belongs to a set S or does not. Fuzzy set theory, on the other hand, allows an element x in universal set U to partially belong to a fuzzy set FS. A fuzzy set can be described by a characteristic function or membership function. The membership value of an element in that fuzzy set can vary from 0.0 to 1.0. A membership value of 0.0 indicates that the element x has no membership in the fuzzy set FS. On the other hand, a membership value of 1.0 indicates that x has complete membership in FS. Using this idea, many fuzzy concepts can be represented readily. For example, suppose the universe of discourse of human age is between 0 and 100; then the fuzzy concepts Young and Old can be expressed graphically as shown in Figure 2.1. One notable point in fuzzy logic is that an element x can belong to a given fuzzy set S and its complement S’ at the same time. This characteristic does not hold in traditional two-valued logic.
Figure 2.1 Graphical Representation of the Fuzzy Sets “Young” and “Old” (vertical axis: Membership, 0.0 to 1.0; horizontal axis: Age, 0 to 100; curves: Young, Old)
From Figure 2.1, it can be seen that a person younger than 20 is young (membership is 1.0). As this person gets older, his or her degree of being young decreases. Finally, at the age of 55 or so, this person is no longer young (membership is 0). The curve representing the fuzzy set Old can be explained similarly. Consequently, a fuzzy variable can take these fuzzy sets as its values. For example, if John’s age is a fuzzy variable, it can take on the values Young or Old. In practice, the two most frequently adopted methods to represent a membership function are enumeration representation and function representation. In certain situations, it is convenient to use a set of strict mathematical functions like those illustrated in Figure 2.1 to represent the membership functions of a fuzzy set. Another more commonly used approach is to simplify the strict mathematical function by using a piecewise linear function (see Figure 2.2 for an example) and to further represent the function by enumeration. For example, the piecewise linear function for Young shown in Figure 2.2 could also be expressed as:
Young = (0/1.0, 20/1.0, 30/0.6, 40/0.2, 50/0.1, 60/0.0, 100/0.0)
Figure 2.2 Linear Function Representation of Fuzzy Set “Young” (vertical axis: membership, 0.0 to 1.0; horizontal axis: age, 0 to 100)
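The enumeration of Young given above can be turned into a membership function by interpolating linearly between the listed points. The following is only a minimal sketch: the names `YOUNG` and `membership` are hypothetical, and reading each entry as an age/membership pair is an assumption based on the values shown.

```python
from bisect import bisect_right

# Breakpoints of the piecewise linear "Young" set, copied from the
# enumeration in the text and read as (age, membership) pairs (assumption).
YOUNG = [(0, 1.0), (20, 1.0), (30, 0.6), (40, 0.2),
         (50, 0.1), (60, 0.0), (100, 0.0)]


def membership(points, x):
    """Linearly interpolate the membership of x between listed breakpoints."""
    xs = [p[0] for p in points]
    if x <= xs[0]:
        return points[0][1]
    if x >= xs[-1]:
        return points[-1][1]
    i = bisect_right(xs, x)  # index of the first breakpoint to the right of x
    (x0, m0), (x1, m1) = points[i - 1], points[i]
    return m0 + (m1 - m0) * (x - x0) / (x1 - x0)
```

For instance, `membership(YOUNG, 25)` falls halfway between the points 20/1.0 and 30/0.6.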
After fuzzy variables have been defined using fuzzy sets, fuzzy rules can be constructed using these fuzzy variables in the antecedent and consequent. Fuzzy reasoning provides a mechanism to combine fuzzy evidence and fuzzy rules. The two most commonly used inference methods are MAX-MIN and MAX-PROD (Orchard 1995). Following is an example taken from Stefik (1995) showing how the two methods work. Suppose two fuzzy variables, speed and braking force, are defined as:
Speed: Normal = (0/0, 0.1/20, 0.8/40, 1/60, 0.1/80, 0/100)
Braking force: Medium = (0/0, 0.5/1, 1/2, 1/3, 0.2/4, 0/5)
And a fuzzy rule is defined upon them:
If Speed is Normal then Braking Force is Medium.
Now the problem is how to determine the braking force for a specific speed such as 40 using the given fuzzy rule. To activate the rule, the specific speed can first be represented as a fuzzy set S:
Speed (40): S = (0/0, 0/20, 0.8/40, 0/60, 0/80, 0/100)
If the inference method is MAX-MIN, then a composition matrix M should be constructed using the two fuzzy sets Normal (A) for speed and Medium (B) for braking force. M can be defined as $M = A^{T}B$ with $M(i, j) = \min(a_i, b_j)$, where A and B are two vectors that contain only membership values. In our example,
\[
M = A^{T}B =
\begin{bmatrix}
\min(0,0) & \min(0,0.5) & \min(0,1) & \min(0,1) & \min(0,0.2) & \min(0,0) \\
\min(0.1,0) & \min(0.1,0.5) & \min(0.1,1) & \min(0.1,1) & \min(0.1,0.2) & \min(0.1,0) \\
\min(0.8,0) & \min(0.8,0.5) & \min(0.8,1) & \min(0.8,1) & \min(0.8,0.2) & \min(0.8,0) \\
\min(1.0,0) & \min(1.0,0.5) & \min(1.0,1) & \min(1.0,1) & \min(1.0,0.2) & \min(1.0,0) \\
\min(0.1,0) & \min(0.1,0.5) & \min(0.1,1) & \min(0.1,1) & \min(0.1,0.2) & \min(0.1,0) \\
\min(0,0) & \min(0,0.5) & \min(0,1) & \min(0,1) & \min(0,0.2) & \min(0,0)
\end{bmatrix}
=
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.1 & 0.1 & 0.1 & 0.1 & 0 \\
0 & 0.5 & 0.8 & 0.8 & 0.2 & 0 \\
0 & 0.5 & 1.0 & 1.0 & 0.2 & 0 \\
0 & 0.1 & 0.1 & 0.1 & 0.1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
Now we can determine the braking force for a speed of 40 by composing vector S with matrix M:
\[
B = S \circ M, \qquad b_j = \max_i \left[ \min(s_i, m_{ij}) \right]
\]
Thus we get B = (0, 0.5, 0.8, 0.8, 0.2, 0). That is, the resulting braking force (fuzzy set) is (0/0, 0.5/1, 0.8/2, 0.8/3, 0.2/4, 0/5), a concept similar to Medium. When necessary, this resulting fuzzy set can be defuzzified to a crisp value. We will discuss this shortly. The computational procedure for the MAX-PROD method is exactly the same except in the formation of the matrix. Instead of using the min() operator, $M(i, j)$ is calculated as $a_i b_j$. Correspondingly, the matrix M is:
\[
M =
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.05 & 0.1 & 0.1 & 0.02 & 0 \\
0 & 0.4 & 0.8 & 0.8 & 0.16 & 0 \\
0 & 0.5 & 1.0 & 1.0 & 0.2 & 0 \\
0 & 0.05 & 0.1 & 0.1 & 0.02 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
The final output B is (0/0, 0.4/1, 0.8/2, 0.8/3, 0.16/4, 0/5).
During the reasoning procedure, there may exist multiple rules supporting the same conclusion. Unlike classic two-valued logic, fuzzy logic allows a fuzzy fact to be re-assessed (Orchard 1995). If a fuzzy fact already exists when a rule tries to generate that fuzzy fact, the previous fact can be modified using the output of the current rule. In practice, the two most commonly used approaches to combine multiple pieces of fuzzy evidence are SUM and MAXIMUM. Suppose the two fuzzy sets are A and B. Using SUM, the combined fuzzy set C is $c_i = a_i + b_i$. Using MAXIMUM, the combined fuzzy set C is $c_i = \max(a_i, b_i)$.
As we mentioned earlier, it is necessary to convert a fuzzy conclusion into a crisp value in most applications. There are two frequently used methods. One is called CENTROID, which computes “the point at which an object of uniform density balances” (HyperLogic Corporation 1990-1993). The other method takes as its output the value with the maximum membership in the fuzzy set.
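The braking-force example above, together with the combination and defuzzification steps, can be reproduced with a few lines of code. This is a minimal sketch, not the implementation used in the project: all function and variable names are hypothetical, and the centroid is computed with a discrete weighted average over the listed points, which is an assumption about how the balance point is approximated.

```python
def composition_matrix(a, b, op):
    """Build M(i, j) = op(a_i, b_j); op is min for MAX-MIN, product for MAX-PROD."""
    return [[op(ai, bj) for bj in b] for ai in a]


def compose(s, m):
    """Max-min composition of vector s with matrix m: b_j = max_i min(s_i, m_ij)."""
    return [max(min(si, row[j]) for si, row in zip(s, m))
            for j in range(len(m[0]))]


def combine_sum(a, b):
    """SUM combination of two fuzzy sets: c_i = a_i + b_i."""
    return [ai + bi for ai, bi in zip(a, b)]


def combine_max(a, b):
    """MAXIMUM combination of two fuzzy sets: c_i = max(a_i, b_i)."""
    return [max(ai, bi) for ai, bi in zip(a, b)]


def centroid(values, memberships):
    """Discrete centroid (assumed approximation): membership-weighted mean."""
    total = sum(memberships)
    if total == 0:
        raise ValueError("fuzzy set has no support")
    return sum(v * m for v, m in zip(values, memberships)) / total


# Membership vectors from the example in the text.
NORMAL = [0, 0.1, 0.8, 1.0, 0.1, 0]   # speed is Normal
MEDIUM = [0, 0.5, 1.0, 1.0, 0.2, 0]   # braking force is Medium
S = [0, 0, 0.8, 0, 0, 0]              # observed speed of 40
FORCE = [0, 1, 2, 3, 4, 5]            # braking force values

b_max_min = compose(S, composition_matrix(NORMAL, MEDIUM, min))
b_max_prod = compose(S, composition_matrix(NORMAL, MEDIUM, lambda x, y: x * y))
crisp_force = centroid(FORCE, b_max_min)
```

Here `b_max_min` reproduces the MAX-MIN result B = (0, 0.5, 0.8, 0.8, 0.2, 0) from the text, and `crisp_force` is the corresponding crisp braking force under the discrete centroid assumption.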
2.2 Genetic Algorithms
2.2.1 Introduction
Non-symbolic approaches played a very important role in AI research during the past decade. As a still rapidly developing method, genetic algorithms (GAs) are receiving more and more attention from the AI community. As a matter of fact, simulating the mechanisms behind many natural phenomena is an important part of human cognitive activities. GAs are a perfect example of this approach.
From the viewpoint of computer science, a GA is an adaptive computational model whose adaptive process mimics the biological evolutionary processes that exist in the real world. GAs embody the principles of “natural selection” and “survival of the fittest,” which are the cornerstones of the evolutionary theory first proposed by Darwin (Beasley, Bull, and Martin 1993). This is where the name “genetic algorithms” comes from.
A genetic algorithm starts from an initial population of individuals. Each individual (chromosome) in the population represents a possible solution to the target problem. Though the initial population generally does not include the optimal solution we want, it may include some good cues (genes) to optimal or acceptable solutions. So we first evaluate the fitness of each individual in the population. The fitness of an individual measures how well the individual solves the target problem. This evaluation process usually involves much domain-dependent knowledge, and whether a GA succeeds in a specific field relies heavily on it. After all individuals in the population have been evaluated, the next step is to select appropriate individuals from the population pool as parents to generate the next generation. In this selection process, fitter individuals are assigned more reproductive opportunities than poorer ones. This makes it more likely that the good characteristics of the population are kept from one generation to the next, which moves the search toward the best solution the GA can achieve. During the reproductive procedure, techniques such as crossover, mutation, and inversion can be adopted to introduce variation into the population. If we continue this evolutionary procedure by using the new generation as the next parent candidate pool, it is quite possible that an optimal or suboptimal solution will appear.
Following is a brief description of the main steps in a typical genetic algorithm (Beasley, Bull, and Martin 1993).
1. Initialize relevant parameters such as crossover probability, mutation probability, population size, etc. Generate an initial population of the chosen size.
2. Evaluate the fitness of the individuals in the current population. A good fitness function reflecting the domain characteristics should be constructed. The fitness can be remapped to keep appropriate selection pressure when necessary.
3. Select two individuals from the population as parents according to some selection strategy.
4. Cross over the two individuals (chromosomes) selected in step 3. Mutate the genes of the selected individuals according to the mutation probability. Repeat steps 3 and 4 until a fixed number of new individuals have been generated.
5. Determine whether the evolutionary process has converged or a certain number of generations have been generated. If yes, exit the evolutionary loop and go to step 6. Otherwise, go to step 2 and continue the evolutionary process.
6. Pick the best evolved solution and verify it when necessary.
The performance of a GA depends not only on the application domain (whether an appropriate fitness function is available and whether good data structures for genes and chromosomes can be found), but also on how we control the evolutionary process. We can control the evolutionary process by setting suitable parameters and by picking an appropriate selection strategy, crossover method, and mutation method.
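The six steps listed above can be sketched as a short program. This is an illustrative sketch only, not the GA built in this project: the bit-string chromosome, the function name `run_ga`, the default parameter values, and the toy fitness function are all assumptions. It also carries the best individual over unchanged into each new generation, a choice that goes beyond the six steps and is made here only so that the best fitness never decreases.

```python
import random


def run_ga(fitness, chrom_len=20, pop_size=30, generations=50,
           p_cross=0.7, p_mut=0.01, seed=0):
    """Evolve bit strings; fitness must return a non-negative number."""
    rng = random.Random(seed)
    # Step 1: parameters are the arguments; build a random initial population.
    pop = [[rng.randint(0, 1) for _ in range(chrom_len)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):  # Step 5: stop after a fixed number of generations
        # Step 2: evaluate every individual.
        scores = [fitness(c) for c in pop]
        best = pop[scores.index(max(scores))]
        history.append(max(scores))
        new_pop = [best[:]]  # assumption: keep the best individual unchanged
        while len(new_pop) < pop_size:
            # Step 3: fitness-proportionate choice of two parents
            # (tiny offset avoids an all-zero weight vector).
            p1, p2 = rng.choices(pop, weights=[s + 1e-6 for s in scores], k=2)
            c1, c2 = p1[:], p2[:]
            # Step 4a: single-point crossover with probability p_cross.
            if rng.random() < p_cross:
                cut = rng.randint(1, chrom_len - 1)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            # Step 4b: flip each gene with probability p_mut.
            for child in (c1, c2):
                for i in range(chrom_len):
                    if rng.random() < p_mut:
                        child[i] ^= 1
                new_pop.append(child)
        pop = new_pop[:pop_size]
    # Step 6: return the best evolved individual.
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history
```

A call such as `run_ga(sum)` uses the number of 1 bits as a toy fitness function; in a real application the fitness function would encode the domain knowledge discussed above.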
2.2.2 Selection Methods
Selection methods can have an important influence on the evolutionary process. The purpose of selection is to choose appropriate individuals from the current generation to create offspring and to emphasize the fitter individuals. However, the degree of emphasis on fitter individuals should be balanced. If it is too strong, locally optimal individuals will rapidly take over the population, reducing the diversity that is necessary for further progress (Mitchell 1996). If the emphasis is too weak, the evolution will be very slow. Researchers have developed many selection methods. Following are the most frequently used ones in practice (Mitchell 1996).
1. Fitness-Proportionate Selection With “Roulette Wheel”
Using this strategy, an individual’s opportunities to reproduce are proportionate to its fitness. This method seems fair, but it is inappropriate in two situations. When several individuals are much better than the others in the population, this elite minority and their descendants will dominate the population very quickly, preventing the genetic algorithm from further exploration. This phenomenon is called “premature convergence” (Mitchell 1996). At the other extreme, if there is little difference among the fitness values of the individuals in the population, evolution will be very slow.
2. Sigma Scaling
To solve the problem of fitness-proportionate selection, the original fitness can be “re-scaled” to vary the difference between the fitness values of the individuals. This keeps the selection pressure from being too high or too low.
3. Boltzmann Selection
Boltzmann selection is very similar to simulated annealing. The previous two methods maintain a constant selection pressure over the entire evolutionary process, but it is generally good practice to vary the selection pressure during different phases of the evolution. Boltzmann selection can increase the selection pressure gradually, which allows the GA to maintain more variation in its early stages and place more emphasis on fitter individuals in later stages.
4. Rank Selection
Rank selection is another method to prevent premature convergence. Using this method, the reproductive opportunities of an individual are determined by its rank in the population, not by its fitness directly.
5. Elitism
The philosophy behind this method is straightforward: protect several of the best individuals at each generation from being destroyed by genetic operators such as crossover and mutation. Experience shows that this is generally a very effective way to improve the performance of genetic algorithms.
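Three of the selection methods above differ only in how they turn raw fitness into selection weights, which a short sketch can make concrete. The function names are hypothetical. The sigma-scaling formula, 1 + (f - mean) / (2 * sigma) with a small floor, follows the general form described by Mitchell (1996) but is written from memory and should be read as an assumption, as should the use of the plain rank as the rank-selection weight.

```python
import random
import statistics


def roulette_pick(weights, rng):
    """Return index i with probability weights[i] / sum(weights)."""
    spin = rng.uniform(0, sum(weights))
    running = 0.0
    for i, w in enumerate(weights):
        running += w
        if spin < running:
            return i
    return len(weights) - 1  # guard against floating-point round-off


def sigma_scaled(fitnesses, floor=0.1):
    """Rescale fitness around the population mean in units of 2 * sigma."""
    mean = statistics.mean(fitnesses)
    sigma = statistics.pstdev(fitnesses)
    if sigma == 0:
        return [1.0] * len(fitnesses)  # no spread: everyone is equally likely
    return [max(floor, 1.0 + (f - mean) / (2 * sigma)) for f in fitnesses]


def rank_weights(fitnesses):
    """Weight each individual by its rank: 1 for the worst, N for the best."""
    order = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i])
    weights = [0] * len(fitnesses)
    for rank, i in enumerate(order, start=1):
        weights[i] = rank
    return weights
```

With fitness values such as 10, 1000, and 20, raw roulette-wheel selection would pick the second individual almost every time, whereas `rank_weights` assigns the weights 1, 3, and 2, which illustrates how rank selection limits the dominance of an outlier.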
2.2.3 Crossover
Crossover is one of the two major genetic operators. The purpose of crossover is to split the parent chromosomes and exchange the corresponding components to create offspring, in the expectation that the recombination of genes may introduce new features to the individuals. There are three major ways to conduct crossover (Mitchell 1996).
1. Single-point crossover
Single-point crossover splits each parent chromosome into two parts. One part of each parent chromosome is then exchanged with the corresponding part of the other to form two new chromosomes. This is the most commonly used crossover method, but it has several disadvantages. First, it can destroy a good schema in a chromosome if the crossover position is located in the schema. Second, single-point crossover cannot generate all possible combinations.
2. Two-point crossover
Two-point crossover can solve the problems of single-point crossover to some extent, but it cannot eliminate them completely. Using this method, there are two crossover points in each chromosome instead of just one. This method usually generates more combinations.
3. Uniform crossover
Uniform crossover further allows each gene to be exchanged. Thus, in theory, all combinations are possible. Not all exchanges will occur, however, because each exchange also depends on the exchange probability. One obvious shortcoming of this approach is that it is very disruptive (Mitchell 1996). Since each gene is eligible to be exchanged, any schema can be easily disrupted.
Generally, there is no simple way to determine which crossover method is the best. It depends on the application area and many other factors such as the gene representation and the fitness function.
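The three crossover methods above can be sketched on chromosomes stored as lists of genes. The function names and the default exchange probability of 0.5 are assumptions made for illustration only.

```python
import random


def single_point(p1, p2, rng):
    """Swap the tails of two parents after one random cut position."""
    cut = rng.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def two_point(p1, p2, rng):
    """Swap the segment between two distinct random cut positions."""
    i, j = sorted(rng.sample(range(1, len(p1)), 2))
    return p1[:i] + p2[i:j] + p1[j:], p2[:i] + p1[i:j] + p2[j:]


def uniform(p1, p2, rng, p_swap=0.5):
    """Exchange each gene independently with probability p_swap."""
    c1, c2 = list(p1), list(p2)
    for k in range(len(c1)):
        if rng.random() < p_swap:
            c1[k], c2[k] = c2[k], c1[k]
    return c1, c2
```

Crossing a parent made only of "A" genes with one made only of "B" genes makes the behavior easy to see: single-point crossover yields a run of A followed by a run of B, two-point crossover yields a B segment inside an A chromosome, and uniform crossover can mix the two at every position.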
2.2.4 Mutation
Mutation is the other major genetic operator. When GAs were first proposed by Holland in 1975, crossover was the only major instrument that gave GAs the ability to explore and innovate. Mutation played only the minor role of preventing the population from becoming fixed at a particular location. Later, in some complex applications, researchers found that mutation can play a more important role and can be as powerful as crossover in exploring the search space. Like crossover, mutation can also destroy schemas and build new ones. Today, most researchers agree that the important problem is not whether to use crossover or mutation, but how to balance them along with other factors such as fitness functions and the characteristics of the problem in a specific application (Mitchell 1996). In this study, we will see that mutation plays a more important role than crossover does.
2.2.5 Genetic Parameters
Among the parameters mentioned in step 1 of the algorithm in Section 2.2.1, the performance of a GA is most sensitive to the population size and the mutation probability. Usually the larger the population size, the more diverse the population. Thus, more good genes may be included in the initial population. Although it is possible for a GA to find the desired solution faster when using a larger initial population (because a larger initial population may contain better individuals than a smaller one), too large a population usually causes the time performance of GAs to deteriorate dramatically. A mutation probability that is too large or too small can also cause problems. If the mutation probability is too large, the whole population will be continuously changing, and good individuals can be destroyed rapidly. If the mutation probability is too small, the attempt to search the problem space as thoroughly as possible will depend mostly on the crossover operator, whose effect is limited by the population size.
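The effect of the mutation probability can be seen in a minimal sketch of per-gene mutation. A binary chromosome and the function name `mutate` are assumptions made for illustration; the gene representation used in this project is described in Chapter IV.

```python
import random


def mutate(chromosome, p_mut, rng):
    """Flip each binary gene independently with probability p_mut."""
    return [gene ^ 1 if rng.random() < p_mut else gene
            for gene in chromosome]
```

Because each gene is tested independently, a chromosome of length L has on average L * p_mut genes changed per application. A large p_mut therefore rewrites most of a good individual, while a very small p_mut leaves almost all exploration to crossover.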
2.3 Applications of Genetic Algorithms in Fuzzy Systems Although genetic algorithms are especially capable of solving optimization problems (linear, non-linear, continuous, or discrete), they also have been used in many other fields such as image processing, machine learning, data analysis, prediction, and scheduling (Mitchell 1996). Recently much work has been done in applying GAs to improve the performance of fuzzy systems (e.g., Karr and Gentry 1993; Homaifar and McCormick 1995; Tang et al. 1998). Since the most important components in a fuzzy system are the membership functions of the fuzzy variables and the fuzzy rules in the
17 system, most work focuses on how to develop high performance membership functions and/or fuzzy rule sets (Homaifar and McCormick 1995). Without a computer-aided tool (such as GAs), the task of defining membership functions for the fuzzy variables in a target system is usually completed by human experts manually. Often the experts have to iterate the defining process many times before a set of optimal or satisfying membership functions can be found. This is a costly and time-consuming task. From the viewpoint of knowledge engineering, this is an important part of the knowledge acquisition process when constructing a fuzzy system (another part is to define fuzzy rules). To ease or release human experts from this tedious work, GAs have been extensively used to derive optimal or sub-optimal membership functions. For example, Karr and Gentry (1993) used a genetic algorithm to generate membership functions for a pH control system. Karr also used a GA to solve a similar problem for the controller of a pole-cart system. Many researchers’ work shows that GAs are generally effective in performance and cost when used to derive fuzzy variable definitions (Homaifar and McCormick 1995). As mentioned above, designing appropriate fuzzy rules is crucial to the success of a fuzzy system. To save human experts’ efforts, some machine learning technologies such as neural networks have been applied to complete this task (Tang et al. 1998). These intelligent learning methods are especially important when human experts are not available. A recent approach to solving these problems is to use genetic algorithms. Actually, more work has been done to evolve fuzzy rule sets using GAs than to evolve membership functions. Cooper (1995) used a GA to evolve fuzzy rules for a boat rudder
18 controller. He claims his method is unique because each evolved rule base for each fuzzy system is flexible and dynamic. Although fuzzy systems have been most extensively used in process control, Ishibuchi et al. (1995) developed a fuzzy classification system using a GA. The role of the GA in this system is to select a set of fuzzy if-then rules. While GAs have been used to evolve membership functions and fuzzy rules in many fields, Homaifar and McCormick (1995) argue that membership functions and fuzzy rules are co-dependent and should be evolved simultaneously. They claim that using a set of manually designed rules with GA-evolved membership functions or vice versa does not take full advantage of GAs. Using both GA-designed fuzzy rule sets and GA-evolved membership functions in a serial manner also has the same shortcoming. Homaifar and McCormick (1995) advocate using GAs to evolve membership functions and rule sets simultaneously. They performed experiments using this approach with a cart controller and a truck controller. Although their argument is generally true, the decision of how to use GAs also depends on the application area. If the domain is relatively easy to understand and rules are readily available, it may not be necessary to use GAs to evolve fuzzy rules. Another noticeable point is the possible combinatorial explosion of fuzzy rules in complex systems. Since GAs are usually computationally intensive, very long chromosomes containing a rule component will greatly deteriorate the time performance of GAs. The two most important problems in the implementation of a GA are the design of chromosomes and the fitness evaluation function. The design of chromosomes is basically dependent on the domain. Most algorithms use a string representation. While
many use binary-based genes, more complex integer-based genes are necessary in many cases (Homaifar and McCormick 1995).
Another major challenge to the successful application of GAs is the design of an appropriate fitness function. In many process control problems, the fitness of a chromosome is evaluated from the performance of the evolved system (Cooper 1995). In such problems, however, two runs may end in the same final state while following different processes (e.g., paths or response times) to reach it. Measuring the performance of an evolved system comprehensively is therefore not always easy. For example, Homaifar and McCormick (1995) define a simple experience-based formula as the fitness function but admit that it needs improvement. Because the fitness function steers the entire evolutionary process, it should be designed carefully.
CHAPTER III
PROBLEM DESCRIPTION
Scientists in Mississippi State University Diagnostic and Instrumentation Laboratory (DIAL) and researchers at Idaho National Engineering and Environmental Laboratory (INEEL) are working together to develop a Waste Assay Measurement Integration System (WAMIS). An important function of this system is “to improve confidence in and lower the uncertainty of waste characterization data” (Bridges, Hodges, and Sparrow 1996). A fuzzy reasoning expert system prototype has been constructed to complete this task. This expert system uses data from passive and active neutron detection systems. Specifically speaking, the expert system is used to measure confidence levels in three sets of measurements: Active Mode Confidence, Passive System Confidence, and Passive Shielded Confidence. A domain expert at INEEL (Greg Becker) gave the initial membership functions for the fuzzy variables and the fuzzy rules used in this system. Data from the passive and active neutron detectors has been converted to the form required by the expert system. The eight input variables to the system defined by Becker are:
1. PassShdTotalRate = Shielded Total/Count Time
2. PassSystemTotalRate = System Total/Count Time
3. ActivePassShdRateRatio = (Shielded Counts/4)/Pass Shielded Total Rate
4. EarlyLateCntsRatio = Shielded Counts/Shielded Background
5. EarlyDrumChamberCntsRatio = Drum Flux Mon/Flux Monitor
6. SystemShdCntsRatio = System Total/Flux Monitor
7. ShieldedTotalCoincRatio = Pass Shielded Total Rate/SG Coinc Rate
8. SystemTotalCoincRatio = Pass System Total Rate/LG Coinc Rate
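The eight definitions above are simple ratios of raw detector quantities, so they can be transcribed directly. The following is a minimal sketch only: the `RawAssay` class and its field names are hypothetical labels for the quantities named in the list, not identifiers taken from the actual system.

```python
from dataclasses import dataclass


@dataclass
class RawAssay:
    # Hypothetical field names for the raw quantities named in the list above.
    shielded_total: float
    system_total: float
    count_time: float
    shielded_counts: float
    shielded_background: float
    drum_flux_mon: float
    flux_monitor: float
    sg_coinc_rate: float
    lg_coinc_rate: float


def derive_inputs(r: RawAssay) -> dict:
    """Compute the eight input variables as written in the list."""
    pass_shd_total_rate = r.shielded_total / r.count_time
    pass_system_total_rate = r.system_total / r.count_time
    return {
        "PassShdTotalRate": pass_shd_total_rate,
        "PassSystemTotalRate": pass_system_total_rate,
        "ActivePassShdRateRatio": (r.shielded_counts / 4) / pass_shd_total_rate,
        "EarlyLateCntsRatio": r.shielded_counts / r.shielded_background,
        "EarlyDrumChamberCntsRatio": r.drum_flux_mon / r.flux_monitor,
        "SystemShdCntsRatio": r.system_total / r.flux_monitor,
        "ShieldedTotalCoincRatio": pass_shd_total_rate / r.sg_coinc_rate,
        "SystemTotalCoincRatio": pass_system_total_rate / r.lg_coinc_rate,
    }
```

Note that variables 3, 7, and 8 reuse the two rates computed first, so those rates are calculated once and shared.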
The reasoning structure of each of the system components is shown in Figure 3.1. The original membership functions provided by the expert were piecewise linear functions. Figure 3.2 provides an example of a fuzzy variable definition and a fuzzy rule. In Figure 3.2, the fuzzy variable PassShdTotalRate has four fuzzy sets as its values. Each fuzzy set is represented using the enumeration method. The definitions for all variables and rules can be found in Appendix B.
ActiveModeConfidence: ActivePassShdRateRatio, EarlyLateCntsRatio, EarlyDrumChamberCntRatio
PassiveSystemConfidence: SystemShdCntsRatio, SystemTotalCoincRatio, PassSystemTotalRate
PassiveShieldedConfidence: SystemShdCntsRatio, ShieldedTotalCoincRatio, PassShdTotalRate
Figure 3.1 Reasoning Structure of the Expert System Prototype
Fuzzy variable: PassShdTotalRate
    Low           (0/1.0, 24/1.0, 28.75/0)
    Medium_low    (21.25/0, 152.5/1.0, 322/0)
    Medium_high   (238/0, 14140/1.0, 32200/0)
    High          (23800/0, 30000/1.0, 100000/1)
Fuzzy rule: IF ActivePassShdRateRatio IS High THEN ActiveInterrQuality IS High
Figure 3.2 An Example of a Fuzzy Variable Definition and a Fuzzy Rule
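To make the enumeration representation concrete, the sketch below evaluates a piecewise linear membership function by interpolating between the listed value/grade pairs of Figure 3.2. The function name `membership` and the dictionary layout are illustrative choices, and holding the endpoint grade constant outside the listed range is an assumption about how the shell extends each set.

```python
def membership(points, x):
    """Grade of membership of x for a fuzzy set given as (value, grade) pairs.

    `points` must be sorted by strictly increasing value. Outside the listed
    range the nearest endpoint's grade is returned (an assumption).
    """
    if x <= points[0][0]:
        return points[0][1]
    if x >= points[-1][0]:
        return points[-1][1]
    for (x0, m0), (x1, m1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            # Linear interpolation between the two neighbouring points.
            return m0 + (m1 - m0) * (x - x0) / (x1 - x0)
    raise ValueError("points must be sorted by value")


# The four fuzzy sets of PassShdTotalRate, copied from Figure 3.2.
PASS_SHD_TOTAL_RATE = {
    "Low": [(0, 1.0), (24, 1.0), (28.75, 0.0)],
    "Medium_low": [(21.25, 0.0), (152.5, 1.0), (322, 0.0)],
    "Medium_high": [(238, 0.0), (14140, 1.0), (32200, 0.0)],
    "High": [(23800, 0.0), (30000, 1.0), (100000, 1.0)],
}
```

For example, `membership(PASS_SHD_TOTAL_RATE["Low"], 26.375)` falls halfway down the falling edge between 24 and 28.75 and evaluates to 0.5.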
Using the definitions provided by the expert, we implemented a fuzzy reasoning expert system with FuzzyCLIPS, a powerful fuzzy expert system shell developed by the National Research Council of Canada (Orchard 1995). Becker also provided 269 classified instances. We have used 180 of the cases for training and the remaining 89 cases for testing. These training and test cases are given in Appendix C. Table 3.1 lists the error rate of the initial fuzzy expert system with user-supplied membership functions for both the training and test data. Note that the names “training” and “test” are not significant at this point since the expert system was not trained with the data. The performance is measured using the absolute error between the results given by the expert and those given by the fuzzy expert system, as shown by the following formula:
$$\text{Error} = \frac{1}{N} \sum_{i=1}^{N} |E_i - F_i|$$
where $N$ is the number of training or test cases, and $E_i$ and $F_i$ are the results given by the expert and the fuzzy system, respectively. The comprehensive error is the average of the three confidence errors.
            ActiveMode        PassiveSystem     PassiveShielded   Comprehensive
            ConfidenceError   ConfidenceError   ConfidenceError   Error
Training    0.155             0.134             0.212             0.167
Test        0.152             0.158             0.238             0.183
Table 3.1 Performance of the Fuzzy Expert System Measured in Absolute Error
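The error measure and the comprehensive error can be sketched in a few lines. The function names below are illustrative; the numeric inputs in the final lines are the per-output errors reported in Table 3.1, used only to check that the comprehensive error is their average.

```python
def mean_absolute_error(expert, fuzzy):
    """Average of |E_i - F_i| over N paired results."""
    if not expert or len(expert) != len(fuzzy):
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(e - f) for e, f in zip(expert, fuzzy)) / len(expert)


def comprehensive_error(active_mode, passive_system, passive_shielded):
    """Average of the three confidence errors."""
    return (active_mode + passive_system + passive_shielded) / 3


# Per-output errors from Table 3.1.
training = comprehensive_error(0.155, 0.134, 0.212)
test = comprehensive_error(0.152, 0.158, 0.238)
print(round(training, 3), round(test, 3))
```

Rounded to three decimals, both averages agree with the Comprehensive Error column of the table.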
Considering that the confidence value lies in the range 0.0 to 1.0 and that many of the expert’s results are located at the low end of that range, the performance of the expert system was not considered satisfactory. One way to improve the performance is to adjust the fuzzy variable definitions and fuzzy rules. This adjustment may need to be repeated many times. To speed up the construction of a system and reduce cost, it is usually highly desirable to have machine learning technologies perform the adjustment instead of human experts. In this project, a GA has been developed for this purpose. To examine the applicability of GAs to waste classification problems, only the membership functions of the fuzzy variables are currently evolved using the GA. We assume that the fuzzy rules given by the expert are sufficiently accurate. We also assume that the training and test cases given by the expert are representative and accurate enough to direct the evolutionary process of the GA.
CHAPTER IV
DESIGN AND IMPLEMENTATION
4.1 General Description of the Algorithm
In Chapter II, we described the typical steps in applying a GA. In this project, the GA is used to improve the performance of the fuzzy expert system. Since the fuzzy expert system is implemented using FuzzyCLIPS, a good communication channel is required between the GA and FuzzyCLIPS. FuzzyCLIPS will not only measure the fitness of the evolved membership functions, but also will eventually measure the performance of the GA. The major steps in the GA are shown in Figure 4.1.
[Flowchart. Boxes: Initialize population; Calculate fitness via API of FuzzyCLIPS; Select parent from current population pool; Crossover and mutate; Test the evolved definitions. Decisions (Y/N): Algorithm converged?; All populations generated?]
Figure 4.1 Processing Flow of the GA
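The processing flow of Figure 4.1 can be illustrated with a generic GA loop. This is a minimal sketch under stated assumptions, not the implementation used in this project: the real-valued chromosome, tournament selection, one-point crossover, Gaussian mutation, and the convergence test on the fitness spread are all illustrative choices, and the `fitness` callable merely stands in for the call into FuzzyCLIPS.

```python
import random


def run_ga(fitness, chrom_len, pop_size=30, max_generations=50,
           mutation_rate=0.1, tol=1e-9, seed=0):
    """Generic GA loop; `fitness` maps a chromosome to a score (higher is better)."""
    rng = random.Random(seed)
    # Initialize population: random real-valued genes in [0, 1] (assumption).
    population = [[rng.random() for _ in range(chrom_len)]
                  for _ in range(pop_size)]
    best, best_fit = None, float("-inf")

    for _ in range(max_generations):            # "All populations generated?"
        # Calculate fitness (stand-in for the FuzzyCLIPS evaluation).
        scores = [fitness(c) for c in population]
        top = max(range(pop_size), key=scores.__getitem__)
        if scores[top] > best_fit:
            best, best_fit = list(population[top]), scores[top]
        # "Algorithm converged?": here, when all scores are nearly equal.
        if max(scores) - min(scores) < tol:
            break

        def pick():
            # Binary tournament selection (assumption).
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return population[a] if scores[a] >= scores[b] else population[b]

        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = pick(), pick()
            # One-point crossover, then per-gene Gaussian mutation clipped to [0, 1].
            cut = rng.randrange(1, chrom_len) if chrom_len > 1 else 0
            child = p1[:cut] + p2[cut:]
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            offspring.append(child)
        population = offspring

    return best, best_fit
```

The returned chromosome corresponds to the input of the final box in the figure, where the evolved definitions are evaluated on the test cases.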
4.2 Implementation
4.2.1 Data Structures
The most important data structures in GAs are those that represent genes and chromosomes. Most researchers represent a chromosome as a string that is a permutation of a set of genes. The representation of genes, however, is very diverse and is entirely domain-dependent. A gene can be as simple as a bit in the SGA (Simple Genetic Algorithm) (Mitchell 1996) or a more complicated structure such as those in this project. In our case, each gene corresponds to one fuzzy variable whose definition is what the GA tries to evolve. In fuzzy systems, we represent the values of a fuzzy variable by membership functions. As previously mentioned, the common enumeration representation of a membership function, such as:
Young = (0/1.0, 20/1.0, 30/0.6, 40/0.2, 50/0.05, 60/0.0, 100/0.0)
is inconvenient for our purpose. More specifically, this representation is difficult for the genetic operators to manipulate. Fortunately, in some fuzzy systems such as FuzzyCLIPS, there is an alternative way to express fuzzy variables. A fuzzy variable can also be expressed by several standard mathematical functions, each with a few parameters. According to the features of their shapes, these functions are usually called S, PI, and Z functions (see Figure 4.2). They can be defined mathematically as follows (Orchard 1995):
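S function:
$$S(u, a, c) = \begin{cases} 0 & \text{if } u \le a,\ u \in U \\ 2\left(\dfrac{u - a}{c - a}\right)^{2} & \text{if } a < u \le (a + c)/2 \\ 1 - 2\left(\dfrac{c - u}{c - a}\right)^{2} & \text{if } (a + c)/2 < u \le c \\ 1 & \text{if } c < u \end{cases}$$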