
Inference Methods for Partially Redundant Rule Bases

Ralf Mikut, Jens Jäkel, and Lutz Gröll
Forschungszentrum Karlsruhe GmbH, Institute of Applied Informatics (IAI)
P.O. Box 3640, D-76021 Karlsruhe, Germany
Phone: +49-7247-825731, Fax: +49-7247-825785
E-Mail: {mikut,jaekel,groell}@iai.fzk.de

1 Introduction

This paper discusses inference strategies for fuzzy rule bases resulting from data-based automatic rule generation algorithms. Typical methods for rule generation are tree-oriented, statistical and evolutionary approaches. The aim of these data-based methods is the design of compact rule bases with a small number of interpretable rules which map the learning data set and provide sufficient statistical soundness. Powerful methods use linguistic hedges such as "at least" or "approximately", or different abstraction levels of linguistic terms, such as "positive" for an abstract description or "positive small" and "positive large" for a more specialized description. In addition, many rule conditions specify only some of the linguistic variables with linguistic terms. The resulting rule bases are often characterized by partially redundant rules with identical conclusions, rules with overlapping premises and contradictory conclusions, disjunctive combinations of linguistic terms and missing rules for untrained input combinations.

Any fuzzy inference strategy should produce the results expected by human experts reading the rules and membership functions. Classical inference strategies and fuzzy operators sometimes give strange results when some of the specific characteristics of automatically generated rule bases occur. A first task is the formalization of these expected results into so-called semantic constraints. Secondly, modified inference approaches are proposed to obtain more adequate results.

The aims of this paper are
- to give a short overview of data-based methods for rule generation,
- to discuss problems of classical fuzzy inference strategies and
- to propose new operators and strategies to solve these problems.

2 Data-based rule generation

Inductive learning strategies for the generation of crisp rules from a set of examples have been studied for a long time [1]. Today, heuristic approaches with a statistical point of view, such as the ROSA method [2, 3], and tree-oriented approaches [4, 5] play an important role.

The ROSA method generates single rules and evaluates them using statistical measures. The hypothesis generation uses different heuristics or evolutionary algorithms. Typically, the approach produces rules with an incomplete premise structure, i.e. only a few linguistic variables in the premise are specified by linguistic terms.

Tree-oriented approaches create a decision tree consisting of nodes and branches. A node indicates a linguistic term or class of the output and, if it is a decision node, contains a test on an input variable, $x_i = ?$. For each outcome of a test, i.e. a linguistic term $x_i = A_{ij}$ of the tested input, a branch starts from the decision node. A decision node receives the most frequent output term or class in the respective subset of examples. The construction algorithm consists of step-wise splits of the set of training examples using a test $x_i = ?$ in each step. If the search algorithm terminates before specifying all linguistic variables, the same incomplete premise structure as for the ROSA method results.

The results of the subsequent pruning process depend on the strategy: firstly, pruning the tree by deleting subtrees [4, 5] or, secondly, pruning the (fuzzy) rules extracted from the tree by deleting linguistic variables [4-6] or by disjunctively adding linguistic terms [6]. The first approach guarantees a complete rule base with mutually exclusive rules. The latter is characterized by rules with overlapping premises but normally produces a more compact rule base with a better generalization ability. A further problem is the existence of locally contradictory rules, which result from small overlaps of the premises of different rules. The disjunctive combination of linguistic terms is more transparent if new derived linguistic terms are created [6]. In the last step of the rule search, only some significant rules are chosen for the rule base [7].
The advantage is that the number of necessary rules is further reduced, but the resulting rule base does not cover the whole input space. Therefore, a default rule is introduced which complements the other rules to cover the whole input space. This is especially advantageous if a frequent output class (e.g. the class "normal") is spread over the input space. As a consequence, inference strategies for rule bases generated from data have to handle these characteristics in a satisfactory manner. In the next section, typical problems and solution strategies are described. A more detailed description can be found in [8].
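The tree-oriented construction described above can be illustrated with a minimal Python sketch. It assumes examples whose inputs are already assigned one linguistic term each, tests the input variables in a fixed order instead of choosing the test by a statistical criterion, and stops as soon as a subset contains a single output class. The names `majority`, `build_tree` and `extract_rules` and the toy data are hypothetical and are not taken from [4, 5].

```python
from collections import Counter

def majority(examples):
    """Most frequent output class in a subset of examples."""
    return Counter(y for _, y in examples).most_common(1)[0][0]

def build_tree(examples, variables):
    """Recursively split the examples with a test 'x_i = ?'.

    Assumption: variables are tested in the given order; real algorithms
    select the test with a statistical or information-based criterion.
    """
    node = {"label": majority(examples), "test": None, "branches": {}}
    classes = {y for _, y in examples}
    if len(classes) == 1 or not variables:
        return node  # leaf: the remaining variables stay unspecified
    var, rest = variables[0], variables[1:]
    node["test"] = var
    for term in sorted({x[var] for x, _ in examples}):
        subset = [(x, y) for x, y in examples if x[var] == term]
        node["branches"][term] = build_tree(subset, rest)
    return node

def extract_rules(node, premise=()):
    """One rule per leaf: (tuple of (variable, term) pairs, conclusion)."""
    if node["test"] is None:
        return [(premise, node["label"])]
    rules = []
    for term, child in node["branches"].items():
        rules += extract_rules(child, premise + ((node["test"], term),))
    return rules

# Toy data set (hypothetical): two inputs with two terms each.
examples = [
    ({"x1": "small", "x2": "small"}, "normal"),
    ({"x1": "small", "x2": "large"}, "normal"),
    ({"x1": "large", "x2": "small"}, "normal"),
    ({"x1": "large", "x2": "large"}, "fault"),
]
rules = extract_rules(build_tree(examples, ["x1", "x2"]))
for premise, conclusion in rules:
    text = " AND ".join(f"{v} = {t}" for v, t in premise)
    print(f"IF {text} THEN y = {conclusion}")
```

In this toy case the branch `x1 = small` already contains a single output class, so the extracted rule leaves `x2` unspecified, which is the incomplete premise structure mentioned in the text.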

3 Inference strategies

As discussed above, the linguistic terms $A_i$ can be classified into primary and derived terms. Primary terms are the given terms in the classical sense, e.g. defined by an expert. Derived terms for ordered linguistic variables are generated automatically, either by modification with linguistic hedges such as "at least", "rather", "at most" and "not", or as generalized terms such as "positive" in place of the primary specialized terms "positive small" and "positive big". In contrast to [9] and [10], derived terms are built by the union of primary terms and not by modifying a primary term.
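One way to picture derived terms as unions of primary terms is to represent every term by the set of primary terms it covers. The following Python sketch assumes an ordered variable with five hypothetical primary terms and reads "at least", "at most" and "not" as set constructions over that order. The term names and the helper functions `term`, `at_least`, `at_most` and `not_` are illustrative assumptions, not definitions from the paper.

```python
# Ordered primary terms of one linguistic variable (hypothetical names).
PRIMARY = ["negative big", "negative small", "zero",
           "positive small", "positive big"]

def term(*names):
    """A primary or derived term as the set of primary terms it covers."""
    unknown = set(names) - set(PRIMARY)
    if unknown:
        raise ValueError(f"unknown primary terms: {sorted(unknown)}")
    return frozenset(names)

def at_least(name):
    """Union of the named primary term and all higher terms."""
    return frozenset(PRIMARY[PRIMARY.index(name):])

def at_most(name):
    """Union of the named primary term and all lower terms."""
    return frozenset(PRIMARY[:PRIMARY.index(name) + 1])

def not_(t):
    """Union of all primary terms that are not covered by t."""
    return frozenset(PRIMARY) - t

positive = term("positive small", "positive big")
print(sorted(at_least("zero")))
print(term("positive small") <= positive)  # crisp subset relation
print(positive & at_most("zero"))          # empty: the terms are disjoint
```

With this representation, the crisp relations between terms (subset, disjoint, partially overlapping) are available through ordinary set operations.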

Linguistic terms and fuzzy sets should satisfy the following conditions (semantic constraints):

1. Fuzzy sets of the disjunctive primary terms are triangular or trapezoidal, normal, convex and have single overlap [11].
2. Fuzzy sets of primary terms cover the universe of discourse completely.
3. For every pair of linguistic terms $A, B$ and $\forall x \in X$:
   if $A \subseteq B$:
   (a) $\mu_A(x) \le \mu_B(x)$,
   (b) $\mu_{A \cup B}(x) = \mu_B(x)$,
   (c) $\mu_{A \cap B}(x) = \mu_A(x)$;
   if $A \cap B = \emptyset$:
   (d) $\mu_{A \cup B}(x) = \min\{1, \mu_A(x) + \mu_B(x)\}$,
   (e) $\mu_{A \cap B}(x) = 0$.

Commonly used operators partially fail in the construction of the fuzzy sets of derived terms according to conditions 3(a)-(e). For example, min violates 3(e), the (algebraic) product 3(c,e), max 3(d), the sum 3(b) and the algebraic sum 3(b,d). These conditions can be understood by regarding "positive" as the result of a disjunction of the primary terms "positive small" and "positive big". The disjunction of both expressions has to deliver "positive", the conjunction of "positive" and "positive small" should give "positive small", and so on. The crisp set relations of the different primary and derived terms are the key to solving these problems. If the intersection is non-empty already in the crisp case, as discussed above, a different approach has to be used than for disjoint terms. This idea leads to the following fuzzy set operators:

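$$\mu_{A \cap B}(x) = \begin{cases} 0 & \text{if } A \cap B = \emptyset \\ \min\{\mu_A(x), \mu_B(x)\} & \text{else,} \end{cases} \qquad (1)$$

$$\mu_{A \cup B}(x) = \begin{cases} \mu_A(x) + \mu_B(x) & \text{if } A \cap B = \emptyset \\ \max\{\mu_A(x), \mu_B(x)\} & \text{else.} \end{cases} \qquad (2)$$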
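A small numerical sketch may help to see how operators (1) and (2) use the crisp set relations. The Python code below assumes three hypothetical primary terms on the universe [0, 1] whose membership functions sum to 1 everywhere, and represents each term by the set of primary terms it covers. The membership of a derived term is taken as the sum of its primary terms, an assumption that follows from applying (2) to disjoint primary terms. The names `PRIMARY_MU`, `mu`, `mu_and` and `mu_or` are illustrative.

```python
# Hypothetical primary terms on the universe [0, 1]; their membership
# functions are triangular or shoulder-shaped and sum to 1 on [0, 1].
PRIMARY_MU = {
    "zero":           lambda x: max(0.0, 1.0 - 2.0 * x),
    "positive small": lambda x: max(0.0, 1.0 - 2.0 * abs(x - 0.5)),
    "positive big":   lambda x: max(0.0, 2.0 * x - 1.0),
}

def mu(term, x):
    """Membership of a term given as a frozenset of primary term names.

    Assumption: a derived term is the union of disjoint primary terms,
    so its membership is the sum of their memberships (operator (2)).
    """
    return sum(PRIMARY_MU[name](x) for name in term)

def mu_and(a, b, x):
    """Operator (1): intersection that respects the crisp relation."""
    if not (a & b):  # the crisp sets are disjoint
        return 0.0
    return min(mu(a, x), mu(b, x))

def mu_or(a, b, x):
    """Operator (2): union that respects the crisp relation."""
    if not (a & b):  # the crisp sets are disjoint
        return mu(a, x) + mu(b, x)
    return max(mu(a, x), mu(b, x))

ps = frozenset({"positive small"})
pb = frozenset({"positive big"})
positive = ps | pb

x = 0.75  # overlap region of "positive small" and "positive big"
print("classical max:", max(mu(ps, x), mu(pb, x)))
print("operator (2): ", mu_or(ps, pb, x), "expected:", mu(positive, x))
print("classical min:", min(mu(ps, x), mu(pb, x)))
print("operator (1): ", mu_and(ps, pb, x))

x = 0.25  # here "positive" is only partially fulfilled
print("product:      ", mu(positive, x) * mu(ps, x), "expected:", mu(ps, x))
print("operator (1): ", mu_and(positive, ps, x))
```

At x = 0.75 the classical max stays below the membership of "positive" and the classical min is non-zero although the two primary terms are disjoint in the crisp sense. At x = 0.25 the product of "positive" and "positive small" falls below the membership of "positive small".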

These operators also take the crisp set relations into account and fulfill conditions 3(a)-(e). Rule premises consist of conjunctions of statements like "$x_1$ is positive small". Conclusions contain only elementary statements like "$y$ is small". Moreover, all rules are assumed to be equally plausible. Rules which are partially redundant, i.e. which have overlapping premises, should not reinforce each other. In the following, a rule base with $q$ rules is assumed, processing $s$ linguistic variables with $m$ input fuzzy sets ($m_i$ being the number of linguistic terms for $x_i$) and $n$ output fuzzy sets. The chosen t-norm for the intersection (AND) is the algebraic product. The union of fuzzy sets (for rules and derived terms) is performed with the operator (2) extended to the multi-dimensional and multi-variable case:

$$\mu_{A_1 \cup \dots \cup A_l}(x) = \sum_{i=1}^{l} \mu_{A_i}(x), \qquad \sum_{i}^{l} \ldots$$
