A Comparative Study of Different Methods for Representing and Reasoning with Uncertainty for a Waste Characterization Task
Susan Bridges Julia Hodges Charles Sparrow
Technical Report No. MSU-970613 June 13, 1997
Department of Computer Science Mississippi State University Box 9550 Mississippi State, MS 39762-9550
[email protected]
ABSTRACT

Containers of transuranic and low-level alpha contaminated waste generated as a byproduct of Department of Energy defense-related programs must be characterized before their proper disposition can be determined. Although nondestructive assay methods are the primary method for assessing the mass and activity of the entrained transuranic radionuclides, there are additional sources of useful information relevant to the characterization of the entrained waste. These include known strengths and weaknesses of assay systems, expected correlations between assay measurements, container manifests, information about the generation process, and destructive assay techniques performed on representative samples. Each of these sources of information provides evidence that may confirm or refute the characterization of the materials as determined by the assay system(s). Many different types of uncertainty and vagueness are associated with the different types of evidence. This paper describes a comparative study and evaluation of four different uncertainty calculi for representing and reasoning with the uncertainty and/or vagueness associated with evidence in this domain. These methods are: MYCIN-style certainty factors, Dempster-Shafer theory, Bayesian networks, and fuzzy logic. Primary factors considered in the evaluation were power and “naturalness” of the representation for different types of evidence, applicability of the reasoning methods to this domain, the theoretical basis for the representation and reasoning methods, utility of the method for decision making, scale-up considerations for large systems, and knowledge acquisition issues.
1.0 Introduction Human experts are quite adept at reasoning with uncertain, incomplete, and imprecise information. Expert systems which are built to emulate human reasoning must also be able to represent and reason with this type of information. There have been many heated debates in the artificial intelligence (AI) literature about the efficacy of different methods for reasoning with uncertainty (e.g. Elkan 1994; Bezdek 1994). Almost all of these discussions concede that determination of the appropriate methods must be made on the basis of the properties of the expert task under consideration, but few studies have been reported that compare the applicability of different methods to specific problems or classes of problems.
The problem under
consideration here is the nondestructive characterization of containerized radioactive waste; in
particular, determining whether the waste meets all of the criteria necessary for it to be shipped to the Waste Isolation Pilot Plant (WIPP) permanent storage facility in New Mexico. Nondestructive assay methods are available to determine the mass of radionuclides in the waste containers. An uncertainty estimate, based on the counting statistics of the assay system, is associated with these masses. There are, however, other known sources of uncertainty associated with the measurements. For example, some matrix materials are known to interfere with the measurements, certain configurations of the radioactive materials in the containers are known to cause problems, etc. In some cases, additional assays were done at the time the waste was packaged that may be more or less accurate than the nondestructive assay information now available. There is also ancillary information available about the processes that were used to produce the waste that can provide confirmation of the identity of the matrix material and/or radionuclides involved. Our goal is to build an expert system that can integrate data from multiple sources to confirm or refute the characterization of the waste material and to quantify the confidence in that characterization.
Many different methods for representing uncertainty/vagueness in expert systems have been developed (see Kanal and Lemmer 1986 for an overview), but four general methods are widely accepted and used: MYCIN-style certainty factors, Dempster-Shafer theory, Bayesian networks, and fuzzy logic (Henkind and Harrison 1988; Stefik 1995). All of these methods involve a quantitative representation of uncertainty.
Methods that use qualitative
representations of uncertainty have also been developed, but these were not considered because quantification of the uncertainty associated with the characterization is one of the requirements for this task. In order to evaluate the applicability of different methods for representing and reasoning with uncertainty for the waste characterization task, we have developed a set of criteria for evaluating the methods, have analyzed the types of uncertain information that need to be represented, have determined how experts combine the different types of information when solving this task, and are in the process of developing prototypes using the different representation methods to empirically evaluate the effectiveness of each method. Section 2 of this paper provides a brief description of the waste characterization task under consideration. Section 3 will describe the evaluation criteria that were developed, will enumerate and illustrate the different types of uncertainty that we have found to be important in this domain, and will describe
different ways that experts combine evidence when solving this problem. Section 4 will give a brief description of the different methods for representing uncertainty that have been investigated, will describe how each of the types of uncertainty identified in the domain can potentially be represented with each method, and will describe combination operators provided for each method. Section 5 summarizes the results and describes future work in development of the expert system.
2.0 The Waste Characterization Task

INEL and MSU are cooperating to design and build an expert system called WAMIS (Waste Assay Measurement Integration System).
The goal of WAMIS is to improve the
confidence in the characterization of containerized radiological waste based on a variety of data such as gamma spectra; radionuclide mass estimates; total alpha activity; thermal power; real-time radiography video; container attributes; and mass ratio estimates for americium, plutonium, and uranium isotopes.
Figure 1 illustrates the types of information that will be combined by the
system. In its simplest form, the problem being addressed is the classification of containers of radiological waste into one of two categories: those that can be shipped to a permanent storage site and those that cannot. Classification, which is a traditional task for knowledge-based systems, may be defined as identifying classes of data as solutions to a problem (Stefik 1995). For example, in our application, the data comes from a wide variety of sources that include simple measurements (e.g., the weight of a drum of waste), classifications based on human judgment (e.g., determination of the type of waste matrix based on real-time radiography),
historical data (e.g., records describing the processes used to produce the waste), and the results of calculations based on data from nondestructive assay systems (e.g., the concentrations of radioisotopes based on neutron assays). In a classification task, the set of classes is known in advance as well as the necessary conditions for class membership. The necessary conditions for class membership for our problem are the characteristics of containers that may be shipped to the WIPP site as defined in the Transuranic Waste Characterization Quality Assurance Program Plan (DOE 1995).

[Figure 1. The Data Integration Process for the WAMIS Expert System. Inputs to the WAMIS expert system include historical/generator data, computer model output, AI classification results (e.g., neural networks), NDA waste assay measurement system data, results of statistical and algorithmic analyses, and expert domain knowledge. The output is a presentation/report covering 239/240/241-Pu, 241-Am, and 235-U content, thermal power, total mass, TRU content, concentrations, and alpha activity.]
Many of the basic characteristics of containerized waste that are needed in order to determine if it can be shipped to the WIPP site have already been measured or computed. Some of this information was gathered at the time the waste was generated and is recorded in documents accompanying the waste.
Other measurements have been made at INEL.
Unfortunately, experience has demonstrated that the documentation accompanying the waste may contain errors, and that some of the nondestructive assay systems may not be able to characterize the radioactive contents of the drums to the required level of accuracy. The task for our exploratory expert system is to check the consistency of data from multiple sources, to combine these sources of information into a classification, and to quantify a level of confidence in that classification. A more complete description of the expert system task and the evidence that will be combined can be found in Bridges et al. (1997).
3.0 Issues in Evaluating Methods for Representing Uncertainty

The process of reasoning with uncertain, vague, and incomplete information is known by
several different names depending on the emphasis of the authors and the types of uncertain information under consideration.
These names include evidential reasoning, approximate
reasoning, uncertain reasoning, and probabilistic reasoning. There have recently been a number of heated debates in the literature that argue strongly for one method over one or more other methods (e.g. Bezdek 1994; Elkan 1994). The implication of these articles is generally that one of these methods is inherently superior for representing all types of uncertainty in expert systems. Although it might be most convenient to identify one technique that could be used consistently for the expert system we are developing, we have not a priori limited ourselves to using a single method of representation for all uncertain information in the domain. We will consider using multiple methods if that approach has clear advantages and if techniques can be developed for integrating the different representations.
3.1 Evaluation Criteria

A number of different evaluation criteria have been proposed for determining which method of representing uncertainty is most appropriate (e.g. Lee, Grize, and Dehnad 1987; Henkind and Harrison 1988). Some groups attempt to apply these criteria without regard to the application, but in general, the requirements of the application determine the applicability of a
technique.
The criteria that we are using to evaluate the methods for representing and reasoning
with uncertainty are listed below. Note that our goals in this study were not only to determine how well each method for representing uncertainty meets each of these criteria, but also to determine which of the criteria are most important for this particular task.
1. Power and “naturalness” of the representation for different types of evidence. Does the method provide a means of representing the type(s) of uncertainty found in the domain? How easily can one map from information provided by experts to the representation?
2. Applicability of the reasoning methods to this domain. Does the method provide operators for combining uncertain information from multiple sources that yield results consistent with those provided by experts?
3. Theoretical basis for the representation and reasoning methods. What is the mathematical foundation for the representation and reasoning system? What are the assumptions made by the method (e.g. independence of evidence, mutually exclusive hypotheses, etc.)? Are these assumptions valid or important in this domain?
4. Utility of the methods for decision making. What is the meaning of the quantification of uncertainty that results from reasoning? How does one use this result to make a decision?
5. Scale-up considerations for large systems. Can the knowledge be modularized so that solutions to subproblems can be combined to solve the larger problem?
Does the
time/space complexity of the method allow for solution of realistically sized problems in a reasonable amount of time? Can one easily understand and maintain a large knowledge base that uses this method?
6. Knowledge acquisition issues. How many “numbers” must one acquire from the expert and/or data? How easy is it for the expert to provide these numbers? How sensitive is the system to small changes in the numbers provided by the experts?
7. Ease of implementation. Are expert system “shells” readily available that support this type of uncertain reasoning? Alternatively, is the approach relatively easy to implement in a standard programming language?
3.2 Types of Uncertainty in the Waste Characterization Domain

A number of different classifications for types of uncertainty can be found in the literature (e.g. Bonissone and Tong 1985; Kanal and Lemmer 1986; Stefik 1995). In this section we will describe the major sources of uncertainty that we have identified in the waste characterization domain.

• Imprecise concepts. The concepts may not be precisely defined. This type of uncertainty is often called vagueness. There are many examples of the use of vague information in the waste characterization domain.
For example, our domain expert looks at several ratios of
measured values to see if the ratios are “close to one.” In many cases, the expert checks for values for particular parameters that are “high” or “low”.
• Detection limitations of sensors. The data may be inaccurate due to the detection limitations of sensors. For example, the accuracy of the neutronic data is limited by both the amount of radioactive material present and the counting time. The SWEPP system uses counting statistics to estimate the uncertainty of the measurements due to these factors.
• Violations of assumptions. Potential violations of the assumptions used as a basis for the calculations done by the sensor system software may also contribute to uncertainty. For example, the SWEPP system assumes that the matrix material in the containers is homogeneous and that the radioactive materials are uniformly dispersed. Both of these assumptions are often violated. The expert has more confidence in the results of the system if he knows that the assumptions of the system have not been violated.
• Erroneous records. Some of the data associated with particular containers (e.g., the type of waste stored in the containers) has been extracted from records made at the time the waste was produced. Errors are known to exist in this data.
• Reliability of human observation. Some of the input for the expert system is based on human judgment (e.g., observation of real-time radiography). Different technicians may have different levels of expertise, and technicians have more confidence in some of their observations than in others. Although this type of knowledge might be considered the same as a limitation in the accuracy of sensors, where the sensors in this case are human observers, there are problems involved in evaluating and representing the experience level and confidence of people that one does not encounter when dealing with devices.
• Uncertain rules. The knowledge base (rules in the case of a production system) may also be uncertain. That is, the correlation between the premises and the conclusions of some rules may be stronger than that of others.
3.3 How Experts Reason Using the Evidence

In a rule-based expert system, the uncertainty associated with facts and with rules must be combined as part of the inference process. A number of issues must be considered in this process. Suppose we have an expert system with rules of the following form:

If A1 op1 A2 op2 ... opn An+1 then B with confidence r1.

where each opi is a logical operator and each Ai is a clause with an associated certainty ci. Decisions that must be made include the following:

• How should r1 and each ci be expressed?
• What are the logical operators that need to be provided?
• How should the levels of uncertainty associated with the clauses in the premise be aggregated for each of the logical operators?
• What level of certainty in the premise is necessary before the rule is activated?
• How are confidences in the premise and the rule combined to give a confidence in the conclusion?
• If multiple rules result in the same conclusion with different levels of confidence, how are these combined?
Similar decisions are necessary when the knowledge is expressed in a form other than rules (e.g., the directed graphs of Bayesian networks).
Before these decisions can be made, one must
determine how experts in the domain combine evidence from different sources. For the waste characterization task, experts generally start with an initial hypothesis and an associated confidence level that has been determined by statistical methods or from expert judgment. Other evidence is then collected and used to confirm or refute the initial hypothesis. A system which represents this reasoning process must be able to represent the confidence level in the initial hypothesis, must be able to represent the confidence in supporting evidence, and must be able to modify the confidence in the hypothesis as evidence is collected. We have found that different sources and types of evidence are treated in quite different ways. Some sources of information provide very strong evidence for refutation, but do little to increase the level of confidence in a hypothesis. This type of evidence is usually used as a “sanity check”. For example, one piece of evidence that is considered when confirming the identity of the matrix material in the container is that the measured density is consistent with the known density of the hypothesized matrix material. Finding that the density value is consistent does not increase one’s confidence in the identity of the matrix material since there are many materials with density values in this range, but finding that the density is not consistent is very strong evidence that the matrix material is misidentified. An example of this type of reasoning applied to neutronic data is when one checks to see that computed masses for radioisotopes are positive. A negative mass is clearly not reasonable and would reduce the confidence in the computed value to 0. (A negative mass value would, however, provide support for a hypothesis that the actual mass is smaller than some value that represents the lower limit of detection of the sensor.) 
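This asymmetric "sanity check" pattern can be sketched in code. The function below is our illustration only: the tolerance and the certainty values returned are hypothetical placeholders, not values from the actual system.

```python
def density_check_cf(measured_density, expected_density, tolerance=0.15):
    """Asymmetric 'sanity check' evidence (illustrative values only):
    a consistent density only weakly confirms the matrix hypothesis,
    since many materials have similar densities, while an inconsistent
    density strongly refutes it."""
    relative_error = abs(measured_density - expected_density) / expected_density
    if relative_error <= tolerance:
        return 0.1   # weak confirmation
    return -0.9      # strong refutation

# A density close to the hypothesized material barely raises confidence,
# but a large discrepancy drives confidence sharply down:
print(density_check_cf(2.1, 2.0))   # 0.1
print(density_check_cf(5.0, 2.0))   # -0.9
```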
In several instances we have found that one or two pieces of evidence are predominant in determination of the level of certainty in a hypothesis and that other pieces of evidence are used to slightly modify this confidence level. For example, when considering whether a matrix material is strongly moderating, the major pieces of evidence are the identity of the matrix material and its known moderating effect. In addition, a moderation index computed from sensor data provides
auxiliary information that can increase or decrease the confidence that the material is strongly moderating by ±10%. The domain expert often uses the values of certain parameters to determine which type of evidence is most reliable in a particular situation. For example, in cases where the Pu mass is expected to be low, the active neutron assay is more accurate; when the Pu mass is expected to be high, the passive neutron assay is more accurate; at certain intermediate values of Pu mass, both systems should be quite accurate and therefore should yield similar values for the mass. The expert often checks to see if several independent pieces of evidence agree, and if they do, this results in a high level of confidence. When the pieces of evidence disagree, it is sometimes the case that the confidence is only slightly decreased because some of the evidence may be considered to be only slightly important given other, more important evidence. The ability to reason with pieces of evidence of varying importance appears to be necessary for the waste characterization task.
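The assay-selection rule of thumb described above can be sketched as follows; the gram thresholds here are hypothetical placeholders for illustration, not values from the SWEPP system.

```python
def preferred_neutron_assay(expected_pu_mass_g, low_g=10.0, high_g=100.0):
    """Select the more reliable neutron assay based on the expected Pu
    mass. Threshold values are hypothetical illustrations only."""
    if expected_pu_mass_g < low_g:
        return "active"    # active neutron assay more accurate at low mass
    if expected_pu_mass_g > high_g:
        return "passive"   # passive neutron assay more accurate at high mass
    return "both"          # both systems should agree at intermediate mass

print(preferred_neutron_assay(2.0))    # active
print(preferred_neutron_assay(500.0))  # passive
```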
4.0 Methods for Representing Uncertainty

Much of the basic research in the area of representing uncertainty in expert systems was conducted in the 1970s and 1980s, and many books and review articles appeared in the literature in this time frame (e.g. Bonissone and Tong 1985; Kanal and Lemmer 1986; Henkind and Harrison 1988). The four most commonly used and discussed methods for representing uncertainty are MYCIN-style certainty factors, Dempster-Shafer theory, Bayesian networks, and fuzzy logic. Many variations of each approach have been developed. In this section, we will briefly describe each of the methods for representing uncertainty, present an analysis of how each of the types of uncertainty described in section 3.2 can be represented using the method, discuss the operators provided by the method for combining uncertainty, and discuss how the method meets each of the evaluation criteria described in section 3.1. A more detailed description of each of the methods and the prototypes that were developed for matrix identification is given in Bridges et al. (1997).
4.1 Certainty Factors

The certainty factors (CF) approach to representing uncertainty in an expert system was originally developed for the MYCIN expert system (Buchanan and Shortliffe 1990). In the CF
approach, each hypothesis made by the system (example: “Evidence is consistent with graphite in the container”) and each piece of evidence considered by the system (example: “The technician thinks RTR indicates graphite in the container”) has a certainty factor associated with it. This certainty factor indicates a degree of belief in that hypothesis (given all the evidence considered so far) or a degree of belief in that piece of evidence.
The certainty factor values are usually
obtained from domain experts, not from direct observation (Buchanan and Shortliffe 1990).
4.1.1 Reasoning with Certainty Factors

Certainty factors were designed to be used with rule-based expert systems. Each rule is of the form shown below:

If e1 op1 e2 op2 ... opn en+1 then h with CF = r1.

Each ei is a piece of evidence with an associated certainty factor ci. h is the resulting hypothesis, and r1 (called the attenuation factor) is the certainty with which one can conclude h if the evidence in the premise is certain; i.e., r1 is the certainty associated with the entire rule.
CF values are in the range -1 to +1, where -1 indicates complete disbelief, +1 indicates complete belief, and 0 indicates no belief. Note that a CF value of 0 can also result when one combines conflicting evidence. The most commonly used logical operators provided with certainty factors are AND, OR, and NOT, where the certainty factor that results from combining evidence using these operators is computed as shown below:

Combination Operator    Certainty of Combined Evidence
e1 AND e2               min(c1, c2)
e1 OR e2                max(c1, c2)
NOT e1                  -c1
In order to prevent activation of rules in which there is a relatively low level of certainty associated with the premise, systems that use certainty factors often require that the level of certainty associated with the premise be greater than some threshold value before the rule will be activated. The MYCIN system, for example, used a threshold value of 0.2. After the certainty in the premise of a rule has been computed, one must combine this certainty with the certainty in the rule itself to determine the CF of the hypothesis that results from firing the rule. This computation is done by multiplying the certainty c in the premise by the certainty r1 in the rule: CF = r1 · c.
When more than one rule concludes the same hypothesis, a combination operator is provided to combine the certainties associated with the hypothesis resulting from the two rules. If the two certainty factors associated with the hypothesis are CF1 and CF2, then the rule for computing the combined certainty factor CF is:

CF = CF1 + CF2 (1 - CF1)                      if both CF1 and CF2 > 0
CF = CF1 + CF2 (1 + CF1)                      if both CF1 and CF2 < 0
CF = (CF1 + CF2) / (1 - min(|CF1|, |CF2|))    otherwise

The rules for combining evidence in the CF approach are defined such that it does not matter in what order the evidence is considered. A CF system simply stores the cumulative CF and continues to combine it with new evidence as the new evidence becomes available. The CF approach assumes modularity in the certainty computations. That is, this approach assumes that a rule can be used once its premises have been satisfied, no matter how the premises were derived or what other facts are currently in the knowledge base.
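The CF operations described in this section can be sketched in Python. The function names are ours, and the threshold default follows MYCIN's 0.2; this is a minimal sketch, not a production inference engine.

```python
# MYCIN-style certainty factor operations. CF values range from -1
# (complete disbelief) to +1 (complete belief).

def cf_and(*cfs):
    """Certainty of a conjunction: that of the weakest piece of evidence."""
    return min(cfs)

def cf_or(*cfs):
    """Certainty of a disjunction: that of the strongest piece of evidence."""
    return max(cfs)

def cf_not(cf):
    """Certainty of a negation."""
    return -cf

def apply_rule(premise_cf, attenuation, threshold=0.2):
    """Fire a rule: the conclusion's CF is the premise CF times the
    rule's attenuation factor, but only if the premise clears the
    activation threshold (MYCIN used 0.2)."""
    if premise_cf <= threshold:
        return 0.0
    return premise_cf * attenuation

def combine(cf1, cf2):
    """Combine two CFs for the same hypothesis derived by different rules."""
    if cf1 > 0 and cf2 > 0:
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

# Two rules conclude the same hypothesis with CF 0.6 and 0.4:
print(round(combine(0.6, 0.4), 2))  # 0.76 = 0.6 + 0.4 * (1 - 0.6)
```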
4.1.2 Representing Waste Characterization Uncertainty with Certainty Factors

Although the original formulation of certainty factors derives the certainty associated with a piece of evidence from a measure of belief (MB) and a measure of disbelief (MD), most systems that support reasoning with certainty factors do not use these measures (Giarratano and Riley 1994). Instead, a single number, the certainty factor, is used directly. All types of uncertainty associated with evidence are represented with a certainty factor. For the waste characterization problem, two different methods have been used to represent imprecise concepts using certainty factors. The first is to categorize parameter values into classes such as high, medium, and low. A particular parameter value is a member of only one class, and the certainty of this class membership is usually close to 1.0. Another approach that has proven useful when the category is binary is to use a simple linear function to compute the certainty factor associated with a parameter value. In this case, the certainty factor values typically cover most of the range from -1 to +1, where +1 indicates complete membership in the category, 0 is an intermediate membership, and -1 indicates no membership in the category.

In order to represent probabilistic information such as counting statistics using certainty factors, one must develop a method for converting probabilities to certainty factors. A simple approach is to normalize the 0-1 scale for probabilities to the -1 to +1 scale for certainty factors so that a probability of 0 translates to a CF of -1, a probability of 0.5 translates to a CF of 0, and a probability of 1 translates to a CF of +1. In order to represent the uncertainty in a hypothesis caused by violation of assumptions used to derive the hypothesis, we used rules like the following:

if h with CF = c1 and violation with CF = c2 then h with CF = r1

Different methods that were used to compute the new confidence in h with a rule like this one are discussed below. The certainty factors associated with information from historical records are most easily provided by an expert unless statistical information is available about the reliability of the data. The reliability of human observations is represented using a fact to represent the experience level of the observer and a fact to represent the observer's confidence in his/her observation. Estimates of observer experience on a 0-1 scale that were provided by the expert were converted to a -1 to +1 scale. The values provided by the expert were 0.2, 0.4, 0.6, 0.8, and 1.0, and these were translated to the certainty factor values -0.6, -0.2, 0.2, 0.6, and 1.0. Confidence expressed by the technician in his/her observation was also given on a 0-1 scale and was translated to CFs in a similar way. The correlation between the premises and the conclusions of rules is represented by an attenuation factor associated with the rule. These attenuation factors are typically greater than 0.5. If the correlation between the premise and conclusion did not warrant a relatively high attenuation factor, the rule was not considered "strong" enough to be in the knowledge base.
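The linear probability-to-CF normalization described above, and the observer-experience conversion, are in fact the same affine map and can be sketched as:

```python
def prob_to_cf(p):
    """Map a probability in [0, 1] linearly onto the CF scale [-1, +1]:
    p = 0 -> CF = -1, p = 0.5 -> CF = 0, p = 1 -> CF = +1."""
    return 2.0 * p - 1.0

# The same map reproduces the expert's observer-experience conversion
# (0.2, 0.4, 0.6, 0.8, 1.0 -> -0.6, -0.2, 0.2, 0.6, 1.0):
for level in (0.2, 0.4, 0.6, 0.8, 1.0):
    print(level, round(prob_to_cf(level), 1))
```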
4.1.3 Reasoning with Certainty Factors for Waste Characterization

In general, it was quite easy to translate rules provided by the expert into rules with associated certainty factors. However, several of the standard methods of combining CFs have proven to be unsatisfactory for the waste characterization task. The major problems have been the following:

• The standard AND and OR operators base the certainty of the entire premise on the certainty of either the strongest (in the case of OR) or weakest (in the case of AND) piece of evidence. Our expert, however, often gives rules of the form

If e1 AND e2 AND ... AND en+1 then h with CF = r1.

where some pieces of evidence are more important than others. This means that the certainty of the entire premise should not necessarily be reduced to the certainty of the weakest piece of evidence. Alternatives that we have explored are 1) dividing the rule into several rules with more complex premises and 2) developing a method for weighting evidence.

• The combination operator for combining CFs for a single hypothesis derived by multiple rules did not give results consistent with those of our expert. A custom rule of combination was developed which gave results closer to those of the expert.
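As one illustration of evidence weighting, a weighted average of the premise CFs could replace the standard min; this is our own hypothetical scheme, not the custom combination rule actually developed for WAMIS.

```python
def weighted_premise_cf(evidence):
    """One possible way to weight evidence of unequal importance in a
    conjunctive premise: a weighted average of the individual CFs rather
    than the standard min. `evidence` is a list of (cf, weight) pairs.
    Illustrative only."""
    total_weight = sum(w for _, w in evidence)
    return sum(cf * w for cf, w in evidence) / total_weight

# A strong, important piece of evidence (weight 3) is only slightly
# diluted by a weaker, less important one (weight 1):
print(round(weighted_premise_cf([(0.9, 3), (0.4, 1)]), 3))  # 0.775
```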
4.1.4 Evaluation of the CF Approach

The CF approach was evaluated against each of the criteria presented in section 3.1.

1. Power and "naturalness" of the representation for different types of evidence. It appears to be fairly straightforward to translate most types of uncertain information encountered in this domain into facts or rules with associated CFs.
2. Applicability of the reasoning methods to this domain. The standard methods for combining CFs did not appear to work very well and required development of a "custom" rule of combination.
3. Theoretical basis for the representation and reasoning methods. The theoretical basis of certainty factors is the theory of confirmation (Stefik 1995).
The certainty factor
approach has been widely criticized for its lack of mathematical foundation. “The logic of confirmation has no inherent connection with the logic of probability.
Thus, the
combining functions . . . depend on a domain being well behaved, which means that the events E1 and E2 can be treated as independent and that there are no aberrant rules.” The success of the certainty factor approach is dependent on the appropriateness of the assumptions of locality, detachment, and modularity (Stefik 1995). Locality means each rule is independent and can be applied no matter what other rules are present in the knowledge base. Detachment allows one to use the conclusion of a rule for further inference without regard to how that conclusion was derived. Modularity is the result of the combined effects of locality and detachment.
These requirements dictate that dependent pieces of data must be grouped into single rules. The major problem we have encountered in meeting these requirements is determining when related evidence should be grouped in a single rule and when it should be applied in multiple rules. Another difficulty with CFs is that they tend to accumulate for multiple hypotheses as evidence is combined, even when the evidence is quite weak (i.e., has a small associated CF). Thus this approach does not work well for applications in which many rules are used to derive conclusions. If the chains of reasoning are short, however, the CF approach can work well. This cumulative effect is apparent in our system and has been a problem, even with relatively short inference chains. As inference chains become longer when the system is extended, this problem will be exacerbated.
4. Utility of the methods for decision making. In order to make decisions based on certainty factors, one must determine what level of confidence one is required to have in a hypothesis before it can be used for the basis of a decision.
The certainty factor
representation of confidence in a characterization has seemed natural to our domain expert. Initial results from a prototype that does matrix characterization using certainty factors appear to indicate that it would be possible to experimentally determine an appropriate level of confidence for decision-making.
5. Scale-up considerations for large systems. Certainty factors can only be legitimately applied if knowledge can be modularized into rules. Fortunately, in this domain, it appears to be relatively easy to identify groups of rules that can be used to solve subproblems of interest and to combine the results from these subproblems.
The
independence assumptions of certainty factors make the updating of certainty factors as evidence accumulates very efficient.
Rule-based systems that use certainty factors have
the same advantages and disadvantages as any large rule-based system.
Although
individual rules are generally easy to understand, it is often difficult to understand how a set of rules works together and to anticipate how changes in a single rule will influence system behavior.
6. Knowledge acquisition issues. One of the strengths of certainty factors that is generally cited is that very few numbers are required and it is typically fairly easy for experts to
provide these numbers. Furthermore, experiments by Buchanan and Shortliffe (1984) demonstrated that MYCIN was quite insensitive to changes in certainty factors.
7. Ease of implementation. There are many expert system shells that implement certainty factors. In addition, it is quite easy to add certainty factor calculations to shells like CLIPS.
4.2 Dempster-Shafer Theory The Dempster-Shafer approach (often called DS Theory) provides a method for representing and reasoning with numerical degrees of belief (Stefik 1995). Unlike probabilistic approaches, it does not require a complete set of prior and conditional probabilities.
The
Dempster-Shafer approach also provides an explicit method for representing lack of knowledge (or ignorance), while other approaches provide no explicit way to distinguish lack of knowledge from conflicting knowledge. In DS Theory, one defines a frame of discernment (θ) - a mutually exclusive and exhaustive set of elements that correspond to ground propositions. When used with diagnostic systems, these elements typically refer to diagnostic hypotheses. A basic probability assignment (bpa) is associated with each possible subset of the frame of discernment, and the sum of all bpas is 1.0. In a typical scenario, all belief is initially associated with the complete set θ, indicating complete ignorance.
As evidence accumulates, a portion of the belief becomes
associated with more refined subsets of hypotheses.
4.2.1 Reasoning with DS Theory DS Theory provides a means of iteratively updating belief in hypotheses as evidence accumulates (Stefik 1995). Dempster's rule of combination is used to compute a new basic probability assignment, denoted M1 ⊕ M2, as follows: M1 ⊕ M2(A) = Σ M1(X)M2(Y), where X, Y ∈ 2^Θ and X ∩ Y = A. The commutativity of multiplication guarantees that the rule yields the same value regardless of the order in which evidence is considered. DS Theory requires that the bpa assigned to the null set always be 0. When conflicting evidence is combined using Dempster's rule, there may be a non-zero bpa associated with the null set. In this case, the bpa assigned to the null set is assigned
a value of 0 and the mass that was assigned to the null set is redistributed to the other subsets using a normalization procedure.
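Dempster's rule and the normalization step can be sketched as follows. The frame of discernment and the mass assignments below are hypothetical values invented for the example, not taken from our system:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two basic probability assignments (dicts mapping frozenset
    subsets of the frame to mass) with Dempster's rule. Mass that falls on
    the null set (conflict) is redistributed by normalization."""
    combined = {}
    conflict = 0.0
    for (x, mx), (y, my) in product(m1.items(), m2.items()):
        inter = x & y
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mx * my
        else:
            conflict += mx * my
    return {s: mass / (1 - conflict) for s, mass in combined.items()}

# Hypothetical frame of three matrix categories and two bodies of evidence.
theta = frozenset({"sludge", "debris", "soil"})
m1 = {frozenset({"sludge", "debris"}): 0.6, theta: 0.4}
m2 = {frozenset({"sludge"}): 0.7, theta: 0.3}
m = dempster_combine(m1, m2)   # belief focuses on {"sludge"}
```

Note how the combined bpa concentrates mass on smaller subsets, which is the hierarchical-refinement style of reasoning discussed below.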
4.2.2 Representing Waste Characterization Uncertainty with DS Theory DS Theory was not explicitly designed for expert systems and cannot be used for this purpose without modification (Lucas and van der Gaag 1991). DS Theory requires definition of a set of mutually exclusive and exhaustive hypotheses to which the evidence is applied. The theory is most useful when one can use the evidence to iteratively focus attention on smaller and smaller subsets of the frame of discernment. This has been particularly useful in domains in which a hierarchical structure can be imposed on the hypotheses so that groups of hypotheses form classes in the hierarchy. This type of reasoning is not generally used in our domain. In addition, our expert reasons with many intermediate hypotheses which become evidence for inference of other hypotheses. DS Theory does not explicitly include a mechanism for this type of reasoning.
4.2.3 Reasoning with DS Theory for Waste Characterization The basic problems that prevent DS Theory from being directly applied to rule-based expert systems are its computational complexity and the lack of several combination functions (Lucas and van der Gaag 1991). As stated earlier, DS Theory is most appropriate when one can use the evidence to support smaller and smaller subsets of hypotheses. Several methods for overcoming the computational complexity associated with DS Theory have been developed for application to such hierarchies. Our problem does not appear to have this hierarchical structure. DS Theory has been adapted to rule-based systems using several different ad hoc rules of combination but none of these has proven entirely satisfactory (Lucas and van der Gaag 1991).
4.2.4 Evaluation of the DS Approach After an initial development of a small prototype using DS Theory, it became apparent that the structure of the particular waste characterization problem that we are working with did not lend itself well to this method of representation and reasoning. In fact, our rules typically assign certainty to only one hypothesis and in this case, it can be shown that the rule of combination for DS Theory reduces to the rule of combination for certainty factors (Stefik 1995). Because the other methods for representing uncertainty that we have been investigating
appear to hold much more promise, we have suspended further investigation of DS Theory in this domain. We will, however, reconsider this decision if we encounter an aspect of the waste characterization task that appears to use hierarchical refinement as the major method of reasoning.
4.3 Bayesian Belief Networks Although many early expert systems used methods based on probability theory to represent uncertainty, these methods have come under attack because of the ad hoc procedures used for combining certainties and because of the simplifying assumptions inherent in the systems. The development of Bayesian networks has caused renewed interest in the use of probability theory to represent uncertainty in expert systems because these networks provide a computationally feasible mechanism for representing and reasoning with a complete probabilistic model of a domain.
4.3.1 Reasoning with Bayesian Networks Bayesian networks are based on the subjective Bayesian view of probability rather than on the objective or frequentist view (Pearl 1988). In the objective view, probabilities represent the likelihood of an event based on the observed frequency of occurrence of an event over many experiments. In the subjective view, probabilities represent an expression of a person's belief that a particular event will occur in a single experiment. This view of probability as a measure of personal belief is central to the use of probability theory in expert systems since it would be impossible to do the number of experiments necessary to determine all the probabilities needed in a large domain (Heckerman and Nathwani 1992; Heckerman, Horvitz, and Nathwani 1992). A Bayesian network (or belief network) is a directed acyclic graph used to represent dependencies among random variables and to give a concise specification of the joint distribution function (Russell and Norvig 1995). Each node in the network represents a random variable, i.e., a set of mutually exclusive and collectively exhaustive states. The directed links between nodes represent a direct influence of the value of one random variable on the value of another. Associated with each node is a conditional probability table that gives the probability of each value for the node given every possible combination of values for the parent nodes.
Prior probabilities are used to represent probabilities based on background knowledge and in the absence of evidence. As evidence is gathered, the prior probabilities are replaced by conditional or posterior probabilities denoted as P(X = xi | Y = yj). If we have defined a domain with a set of random variables {X1, ..., Xn}, then a Bayesian network is a representation of the joint probability distribution over those variables (Heckerman and Wellman 1995). The Bayesian network consists of a set of local conditional probability distributions along with a set of conditional independence assertions that enable us to compute the joint probability distribution from the local distributions. The formal definition of a Bayesian network is based on the concept of conditional independence, but in practice, one usually uses expert knowledge of cause and effect relationships to determine the topology of the network. An arc drawn from one node to another generally indicates that the value of the source node has a direct effect on the value of the target node. This method of constructing Bayesian networks almost always results in a network with accurate conditional independence assertions (Heckerman and Wellman 1995).
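As a minimal illustration of how the local conditional distributions determine the joint (and hence the posterior) distribution, consider a hypothetical two-node network in which neutron self-shielding influences assay accuracy. The node names and probability values are invented for this sketch:

```python
# Hypothetical two-node network: Shielding -> AssayAccurate.
p_shielding = {True: 0.1, False: 0.9}                # prior P(Shielding)
p_accurate = {True: {True: 0.3, False: 0.7},         # P(Accurate | Shielding)
              False: {True: 0.95, False: 0.05}}

def posterior_shielding(accurate):
    """The joint factors as P(s) * P(a | s); the posterior is obtained
    by enumeration and normalization (Bayes' rule)."""
    joint = {s: p_shielding[s] * p_accurate[s][accurate] for s in (True, False)}
    z = sum(joint.values())
    return {s: joint[s] / z for s in joint}

post = posterior_shielding(False)   # evidence: the assay was inaccurate
```

With this evidence the belief in self-shielding rises from the prior 0.10 to about 0.61, showing how observed evidence propagates backward through an arc.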
4.3.2 Representing Waste Characterization Uncertainty with Bayesian Networks Imprecise concepts in the waste characterization domain were represented using Bayesian networks in the same way they were for certainty factors. Categories such as high, medium, and low are established as the possible values for a random variable and the subjective probability that a particular parameter value fits into a particular category was determined by expert judgment. Note that the categories must be mutually exclusive and exhaustive. Detection limitations of sensors are easily represented in Bayesian networks using probabilities. The noisy-OR relationship (Russell and Norvig 1995) can be used to represent the influence of several factors on the accuracy of a sensor reading. Potential violations of the assumptions used as a basis for the calculations done by the sensor system are represented by using nodes to represent each type of violation and its probability. Arcs from the violation node(s) to a node representing the probable accuracy of the computation represent the influence of the violation on the computation. Accuracy of historical data is represented using subjective probabilities (the probability that the data is correct) or may be computed if appropriate statistical information is available.
Experience levels of technicians and their assessments of the accuracy of their observations are represented as imprecise concepts as described above. Bayesian networks use nodes and arcs rather than rules to represent relationships between evidence and hypotheses. Intermediate hypotheses, the basis for their derivation, and the variables that they affect are explicitly represented in the network structure. The degree and type of influence of a set of parent nodes on a child node is represented in the conditional probability table of the child node. Logical connectives such as AND, OR, and NOT are not explicitly represented, but methods for implementing "noisy" versions of these connectives in Bayesian networks have been developed. Multiple ways of concluding the same hypothesis are represented explicitly as a set of incoming arcs to the hypothesis.
4.3.3 Reasoning with Bayesian Networks for Waste Characterization It has proven to be very easy to translate the reasoning of our domain expert into the directed graph structure of the Bayesian network. Each piece of evidence, intermediate hypothesis, and final hypothesis is represented as a node in the graph and the interactions between the evidence and hypotheses are explicitly represented in the topology of the graph. The conditional probability tables associated with each node represent the extent and type of influence that the parents of the node have on the probabilities of different values of the random variable represented by the node. The initial levels of probabilities of evidence (leaf nodes in the graph) are represented by a priori probabilities. Equal probabilities are often given to all possible values if information about the probability of different values is not known. As particular values for the evidence nodes become available, the probability of the known value becomes 1 and the probabilities of all other values become 0. These probabilities are propagated through the network to influence the probabilities of values of other random variables. The effect of evidence that serves as a "sanity check" for a hypothesis is easily represented by making the probability of the hypothesis very low if the value of the evidence is unreasonable and leaving the probability of the values of the hypothesis node unchanged if the value of the evidence is reasonable. It has also proven easy to represent situations in which one or two pieces of evidence are predominant in determining confidence in a hypothesis and other pieces of evidence are used to slightly modify this confidence level. We have developed a method for computing the CPT
values for this type of node given the probability of the hypothesis for all possible combinations of the predominant evidence (this is often a single piece of evidence) and a subjective estimate of the relative influence of the auxiliary evidence (e.g. ± 10%). Representation of which types of evidence should instill the most confidence depending on the values of certain parameters is represented in the conditional probability tables. If the number of incoming arcs to a node becomes very large, it has proven useful to use several intermediate hypotheses with fewer incoming arcs. Representing the influence of evidence with different weights is easily done in the conditional probability tables.
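One way a CPT of the kind described above might be constructed is sketched below. This is an illustrative reconstruction, not our exact procedure: it assumes one predominant piece of evidence (with a base probability for each of its values) and a single binary auxiliary piece of evidence that shifts the probability up or down by a fixed relative amount (e.g. ±10%):

```python
def build_cpt(base, shift=0.10):
    """Given P(hypothesis | predominant evidence value) for each value,
    produce CPT entries over (predominant, auxiliary) pairs: supporting
    auxiliary evidence raises the probability by the relative `shift`,
    refuting evidence lowers it, with results clamped to [0, 1]."""
    cpt = {}
    for value, p in base.items():
        cpt[(value, "supports")] = min(1.0, p * (1 + shift))
        cpt[(value, "refutes")] = max(0.0, p * (1 - shift))
    return cpt

# Hypothetical base probabilities elicited from the expert.
cpt = build_cpt({"high": 0.9, "medium": 0.5, "low": 0.1})
```

Only three base probabilities and one shift parameter are elicited, yet a full six-entry table is produced, which is the knowledge-acquisition saving this construction is meant to provide.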
4.3.4 Evaluation of the Bayesian Network Approach The Bayesian network approach was evaluated with each of the criteria presented in section 3.1. 1. Power and "naturalness" of the representation for different types of evidence. It appears to be possible to adequately represent different types of evidence from this domain using the Bayesian network approach. The subjective probabilities that must be supplied to represent a priori confidence in evidence are very similar to the certainty factors that must be supplied in the CF approach.
2. Applicability of the reasoning methods to this domain. The topology of the Bayesian network and the entries in the CPT allow one to adequately represent the different methods of reasoning that we have found in this domain.
3. Theoretical basis for the representation and reasoning methods. The mathematical foundation for Bayesian networks is Bayes' rule. Proponents of Bayesian networks
always cite the firm mathematical foundation of the representation as one of its strengths (Stefik 1995). Dependencies among pieces of evidence and hypotheses are explicitly represented in the directed graph. One must provide a set of mutually exclusive and exhaustive values for all random variables. The requirement that the list be exhaustive is easily circumvented by including a value called "other" to cover miscellaneous, low probability values. The requirement that the values be mutually exclusive is somewhat more problematic because of the vague nature of some of the categories used. It will be
necessary to do sensitivity analysis to determine if small changes in parameter values can cause sudden, large changes in category assignments. If this is the case, it may be possible to overcome the problem by adding more categories (e.g., expanding high, medium, low to very high, high, medium, low, very low).
4. Utility of the methods for decision making. The meaning of probability values is familiar and widely used for decision making. It is possible to add a decision node to a Bayesian network that explicitly represents the probability values that are required for particular decisions (Charniak 1991).
5. Scale-up considerations for large systems. Exact updating of the probabilities in a Bayesian network is an NP-complete problem (Cooper 1990). Fortunately, several accurate and efficient approximation methods for belief updating have been developed (Charniak 1991). As previously stated, the decision process in the waste characterization domain has been modularized. Using the Bayesian network approach, each decision is handled by a separate Bayesian network with a decision node. In general, the decision node determines whether the waste container under consideration will continue to be processed by other Bayesian networks or be removed from the process.
6. Knowledge acquisition issues. The Bayesian network approach is often criticized because of the large number of probability values that must be provided. At every node, a CPT must be constructed that gives the probability of each value of the node for every possible combination of values for the parent nodes. Although this appears to require that the domain expert provide many probabilities, in practice this number can be greatly reduced by the use of standard definitions for "noisy" relationships. If a node has k parents, each with j possible values, the full CPT for the node requires on the order of j^k probabilities; the noisy relationships reduce the number of values that must be acquired from experts to roughly jk. The other entries in the CPT are computed from those provided. Experts are often reluctant to provide probability values, but are not reluctant to provide rough qualitative estimates of certainty in a conclusion. Appropriate subjective probability values are usually easy to
compute from these estimates. We will conduct sensitivity analyses to determine how sensitive our Bayesian network prototype is to changes in probability values.
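The parameter savings from a noisy relationship can be illustrated with a noisy-OR sketch. The leak probability and per-cause inhibition probabilities below are assumptions for the example, not elicited values:

```python
from itertools import product

def noisy_or(leak, inhibit, parent_states):
    """P(effect | parents) under a noisy-OR model: each active parent
    independently fails to cause the effect with its inhibition
    probability; `leak` accounts for causes outside the model."""
    p_none = 1 - leak
    for active, q in zip(parent_states, inhibit):
        if active:
            p_none *= q
    return 1 - p_none

# Three assumed causes of an inaccurate assay. Only three inhibition
# values (plus a leak) are elicited; the full 8-row CPT is computed.
inhibit = [0.2, 0.4, 0.7]
cpt = {states: noisy_or(0.05, inhibit, states)
       for states in product([False, True], repeat=3)}
```

With k binary causes, k + 1 elicited numbers replace the 2^k entries of an unrestricted CPT, which is the reduction discussed above.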
7. Ease of implementation.
Several development programs are now available for
implementing Bayesian networks including the Hugin package that we have used for development of the prototypes in our work (Andersen et al. 1989).
4.4 Fuzzy Logic Fuzziness is related to "quantified degrees of knowing". We can think of it as a representation of vagueness. It allows us to represent classifications that include vagueness and degree. Unlike uncertainty, fuzziness is not related to degrees of belief or probabilities. The source of the imprecision in a fuzzy system is different from the source of imprecision in an uncertainty system. Specifically, a fuzzy system's imprecision is caused by classifications for which membership is a matter of degree. For example, suppose that it misted for an hour or so this morning, but was sunny for the rest of the day. Would we consider today to have been a rainy day? Fuzziness allows us to represent the day as having been somewhat rainy, or rainy to a degree. Similarly, fuzziness allows us to represent that a container partially or somewhat meets the requirements for shipment to WIPP.
4.4.1 Reasoning with Fuzzy Logic Most of the current work with fuzzy systems stems from a classic paper on fuzzy set theory by Lotfi Zadeh (1965). Fuzzy set theory, which was “created for use in pattern classification and pattern matching,” allows gradual or partial membership in a set or class (Stefik 1995). Fuzzy logic, or reasoning based upon fuzzy set theory, differs from traditional Boolean logic in that a statement need not be either true or false, but may have a degree of truth.
A fuzzy set is defined as a subset of some universal set, where each element in the fuzzy subset has associated with it a degree of membership in the subset. Let U represent a universal set. Then a fuzzy set or class A may be defined as a subset of U where each x ∈ U is assigned a value between 0 and 1, inclusive, that represents the degree of membership for x in A. The
function fA that assigns the degrees of membership to elements in A is called the characteristic function for set A. A classic example is the determination of the "tallness" of a person. We may define any person more than 6 feet tall as belonging to the set of tall people with a membership value of 1, and any person who is less than 4 feet tall as belonging to the set with a membership of 0. Anyone whose height falls between 4 feet and 6 feet would have a membership value between 0 and 1. The following figure illustrates a possible characteristic function for the fuzzy set of "tall people".
[Figure 1 sketch: a characteristic function that is 0 below 4 feet, rises through regions labeled "sometimes true" and "maybe true," and reaches 1 ("generally true") at 6 feet.]
Figure 1. A Characteristic Function for the Set of Tall People Another way to represent this information is to define “height” as a fuzzy or linguistic variable. Fuzzy variables are names that take fuzzy adjectives as values. Each fuzzy adjective is a fuzzy set. For example, the fuzzy variable “height” can take on values (or adjectives) such as “tall,” “medium,” or “short”. Characteristic functions describe the fuzzy sets. The “height” example is represented using the piecewise linear characteristic functions in Figure 2. Modifiers may be used in our characterizations (e.g.: “very tall,” “somewhat tall,” “very short”). In fuzzy set theory, such modifiers for fuzzy variables are called hedges.
[Figure 2 sketch: three overlapping piecewise linear membership functions labeled "short," "medium," and "tall" over heights from 4 to 7 feet.]
Figure 2. Three Piecewise Linear Characteristic Functions for the Fuzzy Variable “height” A statement that assigns a value to a fuzzy variable is called a fuzzy proposition. An example is the statement: John’s height is tall. “John’s height” is a fuzzy variable, and “tall” (which represents a fuzzy set) is the value for the variable. Fuzzy rules relate fuzzy propositions. Fuzzy inference techniques provide not only a conclusion, but also a degree of belief in that conclusion.
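The piecewise linear characteristic functions of Figure 2 can be sketched directly in code. The exact breakpoints (in feet) are assumed for illustration:

```python
def ramp_up(a, b):
    """0 below a, rising linearly to 1 at b, 1 above (right shoulder)."""
    return lambda x: min(1.0, max(0.0, (x - a) / (b - a)))

def ramp_down(a, b):
    """1 below a, falling linearly to 0 at b, 0 above (left shoulder)."""
    return lambda x: min(1.0, max(0.0, (b - x) / (b - a)))

def triangle(a, b, c):
    """0 outside [a, c], peaking at 1 at b."""
    return lambda x: max(0.0, min((x - a) / (b - a), (c - x) / (c - b)))

# The "height" linguistic variable with assumed breakpoints:
short = ramp_down(4.0, 5.0)
medium = triangle(4.0, 5.0, 6.0)
tall = ramp_up(5.0, 6.0)

x = 5.5  # a height of 5.5 ft is partially medium and partially tall
memberships = {"short": short(x), "medium": medium(x), "tall": tall(x)}
```

Note that a single height can have non-zero membership in more than one set at once, which is the key difference from the mutually exclusive categories used in the other approaches.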
While fuzzy set theory provides a way of representing vagueness, fuzzy reasoning provides a way of combining fuzzy evidence. Max-min inference and max-product inference are the two most common methods used for fuzzy inference. The interested reader may refer to Stefik (1995) for an explanation of these methods. Of these two approaches, max-min inference is generally preferred for discrete systems, i.e., systems that must make a distinct choice based on the input data. The max-product inference is generally preferred for continuous systems, i.e., systems that must make a smooth transition in response to changing inputs.
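The difference between the two inference styles can be sketched for a single rule. The rule and the consequent membership function below are illustrative:

```python
# One fuzzy rule: IF assay confidence is low THEN reject, fired with an
# antecedent degree of 0.6. Max-min inference clips the consequent set
# at that degree; max-product inference scales it instead.
def min_inference(degree, consequent_mu):
    return lambda x: min(degree, consequent_mu(x))

def product_inference(degree, consequent_mu):
    return lambda x: degree * consequent_mu(x)

reject_mu = lambda x: x            # membership rising linearly on [0, 1]
clipped = min_inference(0.6, reject_mu)
scaled = product_inference(0.6, reject_mu)
```

Clipping flattens the top of the consequent set (suited to making a discrete choice), while scaling preserves its shape (suited to smooth, continuous outputs), which is why the two methods are preferred for discrete and continuous systems respectively.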
4.4.2 Representing Waste Characterization Uncertainty with Fuzzy Logic
Fuzzy logic is specifically designed to represent imprecise concepts.
Although we
specified classes like high, medium, and low with respect to certain parameter values in our certainty factor and Bayesian network prototypes, these methods of representing uncertainty assume that a specific item is a member of only one of these classes and that the associated certainty or probability value indicates our confidence in the membership assessment. In fuzzy set theory, however, a particular item may be a partial member of several fuzzy sets. This idea of "degree of membership" is the major distinction between fuzzy set theory and the other methods. A particular feature of interest, for example the experience of an observer, becomes a linguistic variable with several fuzzy sets as possible values (e.g. high, medium, and low). A membership function is defined for each of these sets that allows one to determine the degree of membership of a particular value in each set. Probabilistic data about the detection limitations of sensors must be converted into a possibility distribution.
This can be done in a very straightforward manner.
The effect of
violations of the assumptions used in a computation can be modeled as fuzzy rules that modify the degree of membership in the fuzzy sets associated with the linguistic variable of interest. For example, evidence of neutron self-shielding can decrease the membership in the fuzzy set high for the linguistic variable confidence in the active neutron assay and increase membership in the fuzzy set low.
Likewise, knowledge about the presence of errors in historical data can be
modeled by rules that decrease membership in fuzzy sets that represent high confidence. Different levels of confidence in human observations can be modeled as imprecise concepts and are readily represented using fuzzy sets. The concept of uncertain rules is generally not supported in fuzzy logic. Rather, one uses degrees of membership in fuzzy sets in the premise to determine the degree of membership in a fuzzy set in the conclusion.
4.4.3 Reasoning with Fuzzy Logic for Waste Characterization For the waste characterization task, the evidence and hypotheses are all represented as linguistic variables with particular values. The degrees of membership of particular values in fuzzy sets are represented using membership functions. The relationships between particular values for the evidence and different hypotheses are represented using rules. As in the case of certainty factors, these rules are assumed to be modular. Rules that are used primarily for refutation are easy to model using fuzzy logic. A high degree of membership in an inconsistent value implies low confidence in the relevant hypothesis. Fuzzy logic generally uses the same definitions for logical AND, OR, and NOT that are used for certainty factors. The same problems are encountered in using these definitions in a fuzzy logic system for waste characterization as were encountered with certainty factors. Specifically, it is difficult to model situations in which one piece of evidence has more weight than other pieces of evidence. Alternate logical operators are available which may be more appropriate in this domain (Giarratano and Riley 1994). A number of different inference procedures have been defined for fuzzy logic (HyperLogic 1990). A set of experiments was conducted with the prototype matrix identification system implemented with fuzzy logic to determine which of these was better in this domain. As expected, the max-min inference procedure produced results more consistent with those of the expert for this classification task than the max-product inference method. Likewise, several rules of combination are available when several rules yield the same conclusion. Three commonly used strategies are maximum, sum, and single best. In the maximum method, the two membership functions are combined by taking the maximum of the two functions at every point. In the sum method, the two membership functions are added.
In the single best method, the “best” membership function is selected. Experiments with our prototype have shown that the single best strategy gives results most consistent with those of our domain expert. In order to come to a decision, fuzzy set theory requires that one translate a fuzzy set to a decision using a process called defuzzification. The two most common strategies for defuzzification are the centroid and maximum methods. As expected for a classification task, the maximum method for defuzzification was best for this problem.
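The combination and defuzzification strategies discussed above can be sketched as follows. The membership functions and the evaluation grid are illustrative:

```python
# Combine the outputs of two rules with the same conclusion using the
# maximum strategy, then defuzzify by the maximum and centroid methods.
def combine_max(mus):
    return lambda x: max(mu(x) for mu in mus)

def defuzz_maximum(mu, grid):
    """Return the grid point with the highest membership."""
    return max(grid, key=mu)

def defuzz_centroid(mu, grid):
    """Membership-weighted average over the grid."""
    total = sum(mu(x) for x in grid)
    return sum(x * mu(x) for x in grid) / total

grid = [i / 100 for i in range(101)]
mu1 = lambda x: max(0.0, 1 - abs(x - 0.3) / 0.2)          # strong rule, peak 0.3
mu2 = lambda x: 0.5 * max(0.0, 1 - abs(x - 0.7) / 0.2)    # weaker rule, peak 0.7
combined = combine_max([mu1, mu2])
peak = defuzz_maximum(combined, grid)     # picks the dominant peak at 0.3
center = defuzz_centroid(combined, grid)  # pulled between the two peaks
```

The maximum method commits to the best-supported conclusion, as a classification task requires, while the centroid averages over all supported conclusions, which suits smooth control outputs better.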
4.4.4 Evaluation of the Fuzzy Logic Approach The fuzzy logic approach was evaluated with each of the criteria presented in section 3.1. 1. Power and “naturalness” of the representation for different types of evidence. Much of the evidence in this domain is imprecise. As with certainty factors and Bayesian networks, the expert seemed to be very comfortable providing ranges for membership in different categories (fuzzy sets). Unlike the other two approaches, however, fuzzy logic allows a particular value to be a member of more than one of the categories (fuzzy sets). Simple piece-wise linear functions with a prescribed degree of overlap appeared to be good representations of these imprecise concepts.
2. Applicability of the reasoning methods to this domain. The major drawback of fuzzy logic for reasoning in this domain appears to be the lack of a convenient method for representing the level of importance of evidence. In particular, there is no convenient method for weighting different pieces of evidence.
3. Theoretical basis for the representation and reasoning methods.
Fuzzy logic has been
widely criticized for the lack of a theoretical foundation for its reasoning methods (Elkan 1994). The wide variety of inference procedures, combining rules, and defuzzification techniques that have been developed is often used as an illustration of the variety of ad hoc methods that must be adopted in order to make fuzzy logic work in different situations. There is, however, a fairly dependable set of heuristics available for selecting appropriate methods for different types of problems. We found that the choices typically recommended for classification tasks (max-min inference, single best rule of combination, and maximum defuzzification) (HyperLogic 1990) were indeed the best choices.
The
problems with the assumption of rule independence and with deciding how to group evidence into rules are the same as for the certainty factor approach.
4. Utility of the methods for decision making. The meaning of the quantification for decision making is determined by the method of defuzzification.
Problems are sometimes
encountered if the defuzzification technique must deal with very small membership values.
5. Scale-up considerations for large systems. Knowledge for fuzzy logic is modularized in the same way that it is for certainty factors. Use of piecewise linear membership functions allows for very efficient computation during inference. The fuzzy logic system is rule-based and, like the certainty factor approach, has the same strengths and weaknesses as any rule-based system. In addition, the interaction of membership functions of sets used in premises and conclusions of different rules is sometimes difficult to anticipate.
6. Knowledge acquisition issues. In a fuzzy logic system, one must acquire membership functions rather than single numbers.
Research has shown that simple functions are
usually adequate representations and can be readily supplied by domain experts (Stefik 1995). Sensitivity analysis will need to be conducted to determine the effect of small changes in membership functions on decisions.
7. Ease of implementation. Many expert system shells are available that support fuzzy logic, including the FuzzyCLIPS and CubiCalc (HyperLogic 1990) packages, which were used in this study.
5.0 Summary and Future Work A wide variety of uncertain information must be acquired, represented, and reasoned with in this domain. We are in the process of developing prototypes using different methods for reasoning with uncertainty to determine which method(s) are most appropriate. Our preliminary evaluation indicates that DS Theory is not well suited to this domain because we are not dealing with a problem that involves gradually narrowing a set of hypotheses. Instead, our task is to assess the degree of certainty of a hypothesis based on a sensor system. Of the methods considered, certainty factors are the simplest and easiest to implement. It has been difficult, however, to use the standard logical operators and rules of combination for certainty factors to obtain results that agree with those of the expert. Further assessment of the "custom" rules that have been developed is necessary. The Bayesian network approach provides a very convenient visual representation of the interaction of evidence and intermediate hypotheses in the domain. The major drawback of this approach is the large number of probabilities that must be provided by experts. Refinement of appropriate "noisy" relationships for this domain appears to
hold promise for reducing the number of values required. Fuzzy logic is attractive because much of the knowledge in this domain appears to be imprecise. However, more work must be done to determine whether adequate methods for weighting different pieces of evidence can be developed. Another alternative that must be considered is using more than one of these methods. The modular design of the knowledge base for different decisions appears to make this a reasonable alternative.
ACKNOWLEDGMENTS
This work is supported by the Mississippi State University Diagnostic Instrumentation and Analysis Laboratory and by the LIMTCO.
REFERENCES
Andersen, S. K., K. G. Olesen, F. V. Jensen, and F. Jensen. 1989. HUGIN – a shell for building Bayesian belief universes for expert systems. In Proceedings of the Eleventh International Joint Conference on Artificial Intelligence (IJCAI-89), Volume 2, Detroit, Michigan. New York: Morgan Kaufmann. 1080-1085.
Becker, G. K., C. Watts, J. Bennion, and T. J. Roney. 1995. Utility of neural networks in nondestructive waste assay measurement methods. In Proceedings of the 4th Nondestructive Assay and Nondestructive Examination Waste Characterization Conference, Salt Lake City, Utah, October 24-26, 1995. 161-188.
Bezdek, Jim. 1994. Fuzziness vs. probability – again (!?). IEEE Transactions on Fuzzy Systems 2(1): 1-3.
Bonissone, P. P. and R. M. Tong. 1985. Editorial: Reasoning with uncertainty in expert systems. International Journal of Man-Machine Studies 22: 241-250.
Bridges, Susan, Julia Hodges, and Charles Sparrow. 1996. The Application of Artificial Intelligence Techniques to the Analysis of Waste Assay Data. Mississippi State University Diagnostic Instrumentation and Analysis Laboratory, Technical Report No. MSU-960620.
Bridges, Susan, Julia Hodges, Charles Sparrow, Bruce Wooley, and Shiyun Yie. 1997. The development of an expert system for the characterization of waste assay data. In Proceedings of the 5th Nondestructive Assay and Nondestructive Examination Waste Characterization Conference, Salt Lake City, Utah, January 14-16, 1997. 421-453.
Buchanan, B. G. and E. H. Shortliffe. 1990. Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project. Cambridge, MA: The MIT Press.
Charniak, Eugene. 1991. Bayesian networks without tears. AI Magazine, Winter: 50-63.
Cooper, G. F. 1990. The computational complexity of probabilistic inference using Bayesian belief networks. Artificial Intelligence 42: 393-405.
DOE. 1995. Transuranic Waste Characterization Quality Assurance Program Plan. U.S. Department of Energy, National TRU Program Office, Carlsbad Area Office, Report No. DOE/CAO-94-1010, Revision 0.
Elkan, Charles. 1994. The paradoxical success of fuzzy logic. IEEE Expert 9(4): 3-8.
Giarratano, J. and G. Riley. 1994. Expert Systems: Principles and Programming, 2nd Edition. Boston: PWS Publishing Company.
Harker, Y. D., L. G. Blackwood, and T. R. Meachum. 1995. Uncertainty Analysis of the SWEPP Drum Assay System for Graphite Content Code 300. Idaho National Engineering Laboratory.
Heckerman, D., E. Horvitz, and B. N. Nathwani. 1992. Toward normative expert systems: Part I. The Pathfinder project. Methods of Information in Medicine 32(2): 90-105.
Heckerman, D. and B. N. Nathwani. 1992. Toward normative expert systems: Part II. Probability-based representations for efficient knowledge acquisition and inference. Methods of Information in Medicine 32(2): 106-116.
Heckerman, D. and M. Wellman. 1995. Bayesian networks. Communications of the ACM 38(3): 27-30.
Henkind, S. J. and M. C. Harrison. 1988. An analysis of four uncertainty calculi. IEEE Transactions on Systems, Man, and Cybernetics 18(5): 700-714.
HyperLogic Corporation. 1990. CubiCalc: The Third Wave in Intelligent Software. Escondido, CA: HyperLogic Corporation.
Kanal, L. N. and J. F. Lemmer. 1986. Uncertainty in Artificial Intelligence. Amsterdam: North-Holland.
Lee, Newton S., Yves L. Grize, and Khosrow Dehnad. 1987. Quantitative models for reasoning under uncertainty in knowledge-based expert systems. International Journal of Intelligent Systems 2: 15-38.
Lucas, Peter and Linda van der Gaag. 1991. Principles of Expert Systems. Wokingham, England: Addison-Wesley Publishing Company. 253-335.
Matthews, S. D., G. K. Becker, E. S. Marwil, and G. V. Miller. 1993. SWEPP Assay System Software Requirements Specifications. U.S. Department of Energy, Office of Environmental Restoration and Waste Management.
Mousseau, K. C., A. R. Hempstead, G. K. Becker, and T. J. Roney. 1995. Waste assay measurement integration system user interface. In Proceedings of the 4th Nondestructive Assay and Nondestructive Examination Waste Characterization Conference, Salt Lake City, Utah, October 24-26, 1995. 387-398.
Pearl, J. 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann Publishers.
Rumbaugh, James, Michael Blaha, William Premerlani, Frederick Eddy, and William Lorensen. 1991. Object-Oriented Modeling and Design. Englewood Cliffs, NJ: Prentice Hall.
Russell, S. J. and P. Norvig. 1995. Artificial Intelligence: A Modern Approach. Englewood Cliffs, NJ: Prentice Hall.
Stefik, Mark. 1995. Introduction to Knowledge Systems. San Francisco, CA: Morgan Kaufmann Publishers, Inc.
Zadeh, L. 1965. Fuzzy sets. Information and Control 8: 338-353.