Granular Fuzzy Inference System (FIS) Design by Lattice Computing

Vassilis G. Kaburlasos
Technological Educational Institution of Kavala
Department of Industrial Informatics
65404 Kavala, Greece
[email protected]

Abstract. Information granules are partially/lattice-ordered. Therefore, lattice computing (LC) is proposed for dealing with them. The granules here are Intervals’ Numbers (INs), which can represent real numbers, intervals, fuzzy numbers, probability distributions, and logic values. Based on two novel theoretical propositions introduced here, it is demonstrated how LC may enhance popular fuzzy inference system (FIS) design by the rigorous fusion of granular input data, the sensible employment of sparse rules, and the introduction of tunable nonlinearities.

Key words: Fuzzy inference system (FIS), Granular data, Inclusion measure, Intervals’ number (IN)

1 Introduction

An information granule [11] can be thought of as a (local) cluster. It turns out that clusters are partially ordered (for a formal definition of partial order see below). Under certain conditions, a partially ordered set is a lattice. Hence, mathematical lattice theory emerges naturally in granular computing. The term Lattice Computing (LC) was coined by Graña [3, 4] to denote a Computational Intelligence branch, which develops algorithms in an algebra (R, ∨, ∧, +), where R is the set of real numbers. Later work [5, 9] proposed the following, wider definition: Lattice computing (LC) is an evolving collection of tools and methodologies that can process disparate types of data including logic values, numbers, sets, symbols, and graphs based on mathematical lattice theory. Note that the former LC definition was motivated mainly by mathematical morphology for image processing [12], whereas the latter LC definition has a wider motivation including, in addition, formal concept analysis [1], general clustering/classification/regression techniques [7], logic and reasoning [15], etc.

A popular family of algorithms is Fuzzy Inference Systems (FISs) [6], whose inputs typically consist of vectors in the Euclidean space R^N. Recent work described an approach to FIS design based on mathematical morphology [13]. This work proposes a rigorous extension of conventional FIS techniques towards computing with (information) granules, namely Intervals’ Numbers (INs).


More specifically, this work builds on an established mathematical result, namely “the resolution identity theorem”, which specifies that a fuzzy set can (equivalently) be represented either by its membership function or by its α-cuts. In conclusion, based on two novel mathematical propositions, an inclusion measure function emerges here as an instrument towards substantial FIS improvements including the rigorous fusion of granular input data, the sensible employment of sparse rules, and the introduction of tunable nonlinearities.

The work here is organized as follows. Section 2 presents the mathematical background. Section 3 introduces novel mathematical tools. Section 4 demonstrates granular FIS. Section 5 concludes by summarizing the contribution.

2 Mathematical Preliminaries

This section summarizes a hierarchy of lattices [7, 10] using an improved mathematical notation introduced recently [8, 10].

2.1 The Complete Lattice (∆, ⪯) of Generalized Intervals

There is no unanimous opinion on whether lattice (R, ≤) is complete. Here, we assume that lattice (R, ≤) is complete with least and greatest elements O = −∞ and I = +∞, respectively. We define a generalized interval next.

Definition 1. A generalized interval is an element of lattice (R, ≤∂) × (R, ≤).

We remark that ≤∂ in Definition 1 denotes the dual (i.e. converse) of order relation ≤, i.e. ≤∂ ≡ ≥. Product lattice (R, ≤∂) × (R, ≤) ≡ (R × R, ≥ × ≤) will be denoted, simply, by (∆, ⪯). Note that curly symbols ⪯, ⋎, ⋏ are used for general lattice elements, whereas straight symbols ≤, ∨, ∧ are used for real numbers. A generalized interval will be denoted by [x, y], where x, y ∈ R. The meet (⋏) and join (⋎) in lattice (∆, ⪯) are given, respectively, by [a, b] ⋏ [c, d] = [a ∨ c, b ∧ d] and [a, b] ⋎ [c, d] = [a ∧ c, b ∨ d].

The set of positive (negative) generalized intervals [a, b], characterized by a ≤ b (a > b), is denoted by ∆+ (∆−). It turns out that (∆+, ⪯) is a poset, namely the poset of positive generalized intervals. Poset (∆+, ⪯) is isomorphic (footnote 1) to the poset (τ(R), ⪯) of intervals (sets) in R, i.e. (τ(R), ⪯) ≅ (∆+, ⪯). We augmented poset (τ(R), ⪯) by a least (empty) interval, denoted by O = [+∞, −∞]. Hence, the complete lattice (τO(R) = τ(R) ∪ {O}, ⪯) ≅ (∆+ ∪ {O}, ⪯) emerged.

A strictly decreasing bijective, i.e. one-to-one, function θ : R → R implies isomorphism (R, ≤) ≅ (R, ≥). Furthermore, a strictly increasing function v : R → R is a positive valuation (footnote 2) in lattice (R, ≤). It follows that function v∆ : ∆ → R given by v∆([a, b]) = v(θ(a)) + v(b) is a positive valuation in lattice (∆, ⪯). Parametric functions θ(.) and v(.) may introduce tunable nonlinearities.

Footnote 1. A map ψ : (P, ⪯) → (Q, ⪯) is called an (order) isomorphism iff both “x ⪯ y ⇔ ψ(x) ⪯ ψ(y)” and “ψ is onto Q”. Two posets (P, ⪯) and (Q, ⪯) are called isomorphic, symbolically (P, ⪯) ≅ (Q, ⪯), iff there is an isomorphism between them.

Footnote 2. A positive valuation in a general lattice (L, ⪯) is a real function v : L → R that satisfies both v(x) + v(y) = v(x ⋏ y) + v(x ⋎ y) and x ≺ y ⇒ v(x) < v(y) [2].
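The lattice operations on generalized intervals can be sketched in a few lines of Python. This is only an illustrative sketch: the names `meet`, `join`, `precedes` and `v_delta` are hypothetical, and the default choices v(x) = x and θ(x) = −x are assumptions made here for the example.

```python
from typing import Callable, Tuple

# A generalized interval [a, b]; a > b (a "negative" interval) is allowed.
Interval = Tuple[float, float]


def meet(x: Interval, y: Interval) -> Interval:
    """Lattice meet: [a, b] meet [c, d] = [a v c, b ^ d]."""
    return (max(x[0], y[0]), min(x[1], y[1]))


def join(x: Interval, y: Interval) -> Interval:
    """Lattice join: [a, b] join [c, d] = [a ^ c, b v d]."""
    return (min(x[0], y[0]), max(x[1], y[1]))


def precedes(x: Interval, y: Interval) -> bool:
    """Product order (>= on the first component, <= on the second)."""
    return x[0] >= y[0] and x[1] <= y[1]


def v_delta(x: Interval,
            v: Callable[[float], float] = lambda t: t,
            theta: Callable[[float], float] = lambda t: -t) -> float:
    """Positive valuation v_delta([a, b]) = v(theta(a)) + v(b).

    The defaults v(x) = x and theta(x) = -x are assumptions of this sketch;
    with them v_delta is simply the interval length b - a.
    """
    return v(theta(x[0])) + v(x[1])


x, y = (1.0, 4.0), (2.0, 6.0)
print(meet(x, y), join(x, y))
# Valuation identity: v(x) + v(y) == v(x meet y) + v(x join y)
print(v_delta(x) + v_delta(y), v_delta(meet(x, y)) + v_delta(join(x, y)))
```

With the default functions, the valuation identity of footnote 2 reduces to a statement about interval lengths, which makes it easy to check by hand.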

2.2 The Complete Lattice (F, ⪯) of Intervals’ Numbers (INs)

Based on generalized intervals, this subsection presents intervals’ numbers (INs). A more general number type is defined first.

Definition 2. A generalized interval number (GIN) is a function G : (0, 1] → ∆.

Let G denote the set of GINs. The complete lattice (G, ⪯) follows as the Cartesian product of complete lattices (∆, ⪯). Our interest here focuses on the sublattice (footnote 3) of intervals’ numbers defined next.

Definition 3. An Intervals’ Number, or IN for short, is a GIN F such that both F(h) ∈ (∆+ ∪ {O}) and h1 ≤ h2 ⇒ F(h1) ⪰ F(h2).

Let F denote the set of INs. It follows that (F, ⪯) is a complete lattice with least element O = O(h) = [+∞, −∞] and greatest element I = I(h) = [−∞, +∞], ∀h ∈ (0, 1]. Conventionally, an IN will be denoted by a capital letter in italics, e.g. F ∈ F. Moreover, an N-tuple IN will be denoted by a capital letter in bold, e.g. F = (F1, ..., FN) ∈ F^N. Lattice (F^N, ⪯) is the fourth level in a hierarchy of complete lattices whose first, second and third levels include lattices (R, ≤), (∆, ⪯) and (F, ⪯), respectively.

An IN is a mathematical object which admits different interpretations as follows. First, based on the “resolution identity theorem”, an IN F(h), h ∈ (0, 1] may be interpreted as a fuzzy number, where F(h) is the corresponding α-cut for α = h. Hence, an IN F : (0, 1] → τO(R) may, equivalently, be represented by an upper-semicontinuous membership function mF : R → (0, 1]; that is the membership-function-representation for an IN. Moreover, an IN F(h), h ∈ (0, 1] is represented by a set of intervals; that is the interval-representation for an IN. There follows equivalence mF1(x) ≤ mF2(x) ⇔ F1(h) ⪯ F2(h), where x ∈ R, h ∈ (0, 1]. Second, an IN F(h), h ∈ (0, 1] may also be interpreted as a probability distribution such that interval F(h) includes 100(1 − h)% of the distribution, whereas the remaining 100h% is split evenly below and above interval F(h).
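To make the two representations concrete, the following Python sketch stores an IN in its interval-representation, i.e. as a function from a level h to an interval, and recovers an approximate membership value by scanning a grid of levels. The helper names (`triangular_in`, `trivial_in`, `membership`) and the uniform grid are assumptions of this sketch, not part of the original formulation.

```python
from typing import Callable, Tuple

Interval = Tuple[float, float]
IN = Callable[[float], Interval]  # level h in (0, 1] -> interval F(h)


def triangular_in(center: float, half_width: float) -> IN:
    """IN whose membership function is an isosceles triangle (hypothetical helper)."""
    def cut(h: float) -> Interval:
        r = half_width * (1.0 - h)  # the cut shrinks linearly as h grows
        return (center - r, center + r)
    return cut


def trivial_in(u0: float) -> IN:
    """Trivial IN representing the real number u0: F(h) = [u0, u0] for all h."""
    return lambda h: (u0, u0)


def membership(F: IN, x: float, levels: int = 1000) -> float:
    """Approximate m_F(x) as the largest grid level h with x in F(h).

    Relies on the nesting property h1 <= h2 => F(h1) contains F(h2);
    the grid resolution `levels` is an assumption of this sketch.
    """
    best = 0.0
    for k in range(1, levels + 1):
        h = k / levels
        a, b = F(h)
        if a <= x <= b:
            best = h
    return best


U = triangular_in(3.5, 0.2)
print(U(0.5))               # interval at level h = 0.5
print(membership(U, 3.6))   # close to 0.5 for this triangle
```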

3 Novel Mathematical Tools

Consider the following definition [7, 8, 10].

Definition 4. Let (L, ⪯) be a complete lattice with least and greatest elements O and I, respectively. An inclusion measure in (L, ⪯) is a function σ : L × L → [0, 1], which satisfies the following conditions:

C0. σ(x, O) = 0, ∀x ≠ O.
C1. σ(x, x) = 1, ∀x ∈ L.
C2. x ⋏ y ≺ x ⇒ σ(x, y) < 1.
C3. u ⪯ w ⇒ σ(x, u) ≤ σ(x, w).

Footnote 3. A sublattice of a lattice (L, ⪯) is another lattice (S, ⪯) such that S ⊆ L.


We remark that σ(x, y) can be interpreted as the fuzzy degree to which x is less than y; therefore notation σ(x ⪯ y) may be used instead of σ(x, y). Two inclusion measures, namely sigma-meet (σ⋏) and sigma-join (σ⋎), respectively, have been proposed [7, 8] in the complete lattice (τO(R), ⪯) of intervals as follows.

1) \[ \sigma_{\curlywedge}([a,b] \preceq [c,d]) = \frac{v(\theta(a \vee c)) + v(b \wedge d)}{v(\theta(a)) + v(b)}, \quad \text{if } a \vee c \le b \wedge d; \text{ otherwise } \sigma_{\curlywedge}([a,b] \preceq [c,d]) = 0, \]

and

2) \[ \sigma_{\curlyvee}([a,b] \preceq [c,d]) = \frac{v(\theta(c)) + v(d)}{v(\theta(a \wedge c)) + v(b \vee d)}, \]

where function v : R → R is strictly increasing, whereas function θ : R → R is strictly decreasing. In conclusion, as detailed in [7], the following two inclusion measures emerge, respectively, in the complete lattice (F, ⪯) of INs.

1) \[ \sigma_{\curlywedge}(F_1 \preceq F_2) = \int_0^1 \sigma_{\curlywedge}(F_1(h) \preceq F_2(h))\,dh. \]

2) \[ \sigma_{\curlyvee}(F_1 \preceq F_2) = \int_0^1 \sigma_{\curlyvee}(F_1(h) \preceq F_2(h))\,dh. \]
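The sigma-meet and sigma-join formulas above can be sketched in Python as follows. Several choices here are assumptions of this sketch rather than part of the original text: the function names, the default v(x) = x and θ(x) = −x, the midpoint rule used to approximate the integral over h, and the convention of returning 1 when a 0/0 would otherwise occur for a trivial interval (chosen to agree with condition C1).

```python
from typing import Callable, Tuple

Interval = Tuple[float, float]        # ordinary interval [a, b] with a <= b
IN = Callable[[float], Interval]      # level h in (0, 1] -> interval F(h)
RealFn = Callable[[float], float]


def identity(t: float) -> float:
    return t


def negate(t: float) -> float:
    return -t


def sigma_meet(x: Interval, y: Interval,
               v: RealFn = identity, theta: RealFn = negate) -> float:
    """Sigma-meet degree of inclusion of interval x in interval y."""
    (a, b), (c, d) = x, y
    lo, hi = max(a, c), min(b, d)
    if lo > hi:                  # empty intersection
        return 0.0
    if lo == a and hi == b:      # x is included in y (assumed convention, avoids 0/0)
        return 1.0
    return (v(theta(lo)) + v(hi)) / (v(theta(a)) + v(b))


def sigma_join(x: Interval, y: Interval,
               v: RealFn = identity, theta: RealFn = negate) -> float:
    """Sigma-join degree of inclusion of interval x in interval y."""
    (a, b), (c, d) = x, y
    den = v(theta(min(a, c))) + v(max(b, d))
    if den == 0.0:               # assumed convention for a degenerate 0/0 case
        return 1.0
    return (v(theta(c)) + v(d)) / den


def sigma_in(sigma: Callable[[Interval, Interval], float],
             f1: IN, f2: IN, n: int = 2000) -> float:
    """Approximate the integral over h in (0, 1) by the midpoint rule."""
    total = 0.0
    for k in range(n):
        h = (k + 0.5) / n
        total += sigma(f1(h), f2(h))
    return total / n


print(sigma_meet((1.0, 3.0), (2.0, 6.0)))  # overlap [2, 3] relative to [1, 3]
print(sigma_join((1.0, 3.0), (2.0, 6.0)))  # [2, 6] relative to the join [1, 6]
```

With v(x) = x and θ(x) = −x, sigma-meet is the length of the overlap divided by the length of the first interval, and sigma-join is the length of the second interval divided by the length of the join.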

The following proposition can be interpreted with reference to Fig. 1.

Proposition 1. Consider a continuous dual isomorphic function θ : R → R and a continuous positive valuation function v : R → R. Let U0(h) = [u0, u0], h ∈ (0, 1] be a trivial IN and let W(h), h ∈ (0, 1] be an IN with upper-semicontinuous membership function mW : R → R. Then σ⋏(U0 ⪯ W) = mW(u0).

Fig. 1. The sigma-meet degree of inclusion σ⋏(U0 ⪯ W) of trivial IN U0 = [u0, u0], h ∈ (0, 1] in IN W = W(h) = [ah, bh], h ∈ (0, 1] equals mW(u0), where mW : R → R is the membership function of IN W.

We remark that Proposition 1 couples an IN’s two different representations, namely the interval-representation and the membership-function-representation. Note that the principal advantage of the former (interval) representation is that it enables useful algebraic operations, whereas the principal advantage of the latter (membership function) representation is that it enables convenient fuzzy logic interpretations. The practical significance of Proposition 1 as well as of the following Proposition is demonstrated below.
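Proposition 1 can be checked numerically. The sketch below is a hedged illustration only: it assumes a hypothetical triangular IN W centered at 0 with half-width 1, the point u0 = 0.3, a midpoint rule over h, and a deliberately nonlinear positive valuation v(x) = exp(x) with θ(x) = −x, to suggest that the result does not depend on the particular v.

```python
import math
from typing import Callable, Tuple

Interval = Tuple[float, float]


def sigma_meet(x: Interval, y: Interval,
               v: Callable[[float], float],
               theta: Callable[[float], float]) -> float:
    """Sigma-meet on intervals; 0 when the intersection is empty."""
    (a, b), (c, d) = x, y
    lo, hi = max(a, c), min(b, d)
    if lo > hi:
        return 0.0
    return (v(theta(lo)) + v(hi)) / (v(theta(a)) + v(b))


def w_cut(h: float) -> Interval:
    """Interval-representation of the assumed triangular IN W."""
    r = 1.0 - h
    return (-r, r)


def m_w(x: float) -> float:
    """Membership-function-representation of the same IN W."""
    return max(0.0, 1.0 - abs(x))


def sigma_meet_trivial(u0: float, n: int = 10000) -> float:
    """Midpoint-rule integral of sigma-meet([u0, u0], W(h)) over h."""
    v, theta = math.exp, (lambda t: -t)   # assumed v and theta
    total = 0.0
    for k in range(n):
        h = (k + 0.5) / n
        total += sigma_meet((u0, u0), w_cut(h), v, theta)
    return total / n


u0 = 0.3
print(sigma_meet_trivial(u0), m_w(u0))  # the two values should nearly agree
```

For a trivial interval the integrand is either 1 (when u0 lies in the cut W(h)) or 0, so the integral measures the set of levels h whose cut contains u0, which is the membership value.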


Proposition 2. Consider complete lattices (Li, ⪯), i ∈ {1, ..., N}, each equipped with an inclusion measure function σi : Li × Li → [0, 1]. Consider N-tuples x = (x1, ..., xN) and y = (y1, ..., yN) such that x, y ∈ L = L1 × ... × LN. Furthermore, consider the conventional lattice ordering x ⪯ y ⇔ xi ⪯ yi, ∀i ∈ {1, ..., N}. Then, both functions
1) σ∧ : L × L → [0, 1] given by σ∧(x ⪯ y) = min{σi(xi ⪯ yi)}, i ∈ {1, ..., N}, and
2) σΠ : L × L → [0, 1] given by σΠ(x ⪯ y) = Πσi(xi ⪯ yi), i ∈ {1, ..., N},
are inclusion measures in (L, ⪯).
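The two aggregations of Proposition 2 can be sketched as follows. The component measure `overlap_ratio` is an assumption of this example (sigma-meet on nontrivial intervals with v(x) = x and θ(x) = −x), and the function names and the sample tuples are hypothetical.

```python
from math import prod
from typing import Callable, Sequence, Tuple

Interval = Tuple[float, float]
Sigma = Callable[[Interval, Interval], float]


def overlap_ratio(x: Interval, y: Interval) -> float:
    """Assumed component measure: overlap length divided by the length of x.

    Only meant for nontrivial intervals x (a < b) in this sketch.
    """
    (a, b), (c, d) = x, y
    lo, hi = max(a, c), min(b, d)
    return 0.0 if lo > hi else (hi - lo) / (b - a)


def sigma_min(sigmas: Sequence[Sigma],
              x: Sequence[Interval], y: Sequence[Interval]) -> float:
    """Minimum of the componentwise degrees of inclusion."""
    return min(s(xi, yi) for s, xi, yi in zip(sigmas, x, y))


def sigma_prod(sigmas: Sequence[Sigma],
               x: Sequence[Interval], y: Sequence[Interval]) -> float:
    """Product of the componentwise degrees of inclusion."""
    return prod(s(xi, yi) for s, xi, yi in zip(sigmas, x, y))


x = ((0.0, 2.0), (0.0, 4.0))
y = ((1.0, 3.0), (0.0, 1.0))
sigmas = (overlap_ratio, overlap_ratio)
print(sigma_min(sigmas, x, y), sigma_prod(sigmas, x, y))
```

Here the componentwise degrees are 0.5 and 0.25, so the minimum and the product can be compared directly.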

4 Computational Experiments

A FIS includes K rules (implications) Rk, k = 1, ..., K, of the following form:

Rule Rk : IF (variable V1 is Fk,1) .and. ... .and. (variable VN is Fk,N) THEN ck

This work is not concerned with the consequents ck, k = 1, ..., K of rules. Instead, the interest here focuses exclusively on rule antecedents. Furthermore, unless otherwise stated, this work employs functions v(x) = x and θ(x) = −x.

Fig. 2 displays the antecedent of a FIS rule R with only two INs W1 and W2 having parabolic membership functions mW1(x) = −x^2 + 6x − 8 and mW2(x) = −0.25x^2 + 3.5x − 11.25, respectively. Let an input [u1,0, u2,0] = [3.5, 5.5] be presented as shown in Fig. 3(a). Using conventional FIS techniques, the activation mR(u1,0, u2,0) of rule R is a function of both mW1(u1,0) = 0.75 and mW2(u2,0) = 0.4375. For instance, it may be either mR(u1,0, u2,0) = min{mW1(u1,0), mW2(u2,0)} or mR(u1,0, u2,0) = mW1(u1,0) mW2(u2,0).

Identical results are obtained by inclusion measure σ⋏(., .) as explained next. Let trivial INs U1,0 = U1,0(h) = [u1,0, u1,0] = [3.5, 3.5], h ∈ (0, 1] and U2,0 = U2,0(h) = [u2,0, u2,0] = [5.5, 5.5], h ∈ (0, 1] represent real numbers u1,0 = 3.5 and u2,0 = 5.5, respectively. Then, based on Proposition 1, it follows both σ⋏(U1,0 ⪯ W1) = mW1(u1,0) = 0.75 and σ⋏(U2,0 ⪯ W2) = mW2(u2,0) = 0.4375. Finally, based on Proposition 2, the degree of inclusion of U0 = [U1,0, U2,0] to W = [W1, W2] may be either σ∧(U0 ⪯ W) = min{σ⋏(U1,0 ⪯ W1), σ⋏(U2,0 ⪯ W2)} = min{mW1(u1,0), mW2(u2,0)} or σΠ(U0 ⪯ W) = σ⋏(U1,0 ⪯ W1) σ⋏(U2,0 ⪯ W2) = mW1(u1,0) mW2(u2,0).
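The arithmetic of this example is easy to reproduce. The short Python sketch below evaluates the two parabolic membership functions at the input and forms both candidate activations; clipping the parabolas at zero outside their supports is an assumption of the sketch, and the function names are hypothetical.

```python
def m_w1(x: float) -> float:
    """Membership function of W1, clipped at 0 outside [2, 4] (assumed)."""
    return max(0.0, -x**2 + 6.0 * x - 8.0)


def m_w2(x: float) -> float:
    """Membership function of W2, clipped at 0 outside [5, 9] (assumed)."""
    return max(0.0, -0.25 * x**2 + 3.5 * x - 11.25)


u1, u2 = 3.5, 5.5
d1, d2 = m_w1(u1), m_w2(u2)

activation_min = min(d1, d2)   # min-based rule activation
activation_prod = d1 * d2      # product-based rule activation

print(d1, d2)                  # 0.75 0.4375
print(activation_min, activation_prod)
```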

Fig. 2. A FIS rule R antecedent: “variable V1 is W1” and “variable V2 is W2”. The membership functions of INs W1 and W2 are parabolas mW1(x1) and mW2(x2) with maxima at x1 = 3 and x2 = 7, respectively.


A first substantial advantage for an inclusion measure is its capacity to accommodate “in principle” granular input INs for representing uncertainty/vagueness in practice [14]. For instance, consider the granular input INs U1 and U2 shown in Fig. 3(b), each with an isosceles (triangular) membership function of width 2 ∗ 0.2 = 0.4 centered at x1 = 3.5 and x2 = 5.5, respectively. Inclusion measure σ⋏(., .) computes the activation of rule R in Fig. 3(b) as follows. On the one hand, it is

\[ \sigma_{\curlywedge}(U_1 \preceq W_1) = \int_0^{0.6825} 1\,dh + \int_{0.6825}^{0.7902} \frac{-0.2h - 0.3 + \sqrt{1-h}}{-0.4h + 0.4}\,dh + \int_{0.7902}^{1} 0\,dh \approx 0.7456. \]

On the other hand, it is

\[ \sigma_{\curlywedge}(U_2 \preceq W_2) = \int_0^{0.3331} 1\,dh + \int_{0.3331}^{0.5088} \frac{2\sqrt{1-h} - 0.2h - 1.3}{-0.4h + 0.4}\,dh + \int_{0.5088}^{1} 0\,dh \approx 0.4321. \]
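These two values can be reproduced numerically from the interval-representations alone. The sketch below assumes v(x) = x and θ(x) = −x as stated earlier, writes the cuts of the triangular inputs as [c − 0.2(1 − h), c + 0.2(1 − h)] and those of the parabolic INs as [3 ± √(1 − h)] and [7 ± 2√(1 − h)], and approximates the integral by a midpoint rule; the helper names and the grid size are assumptions.

```python
import math
from typing import Callable, Tuple

Interval = Tuple[float, float]
IN = Callable[[float], Interval]


def sigma_meet(x: Interval, y: Interval) -> float:
    """Sigma-meet for v(x) = x, theta(x) = -x: overlap length / length of x."""
    (a, b), (c, d) = x, y
    lo, hi = max(a, c), min(b, d)
    if lo > hi:
        return 0.0
    if b == a:                   # trivial interval contained in y (assumed convention)
        return 1.0
    return (hi - lo) / (b - a)


def tri_cut(center: float, half_width: float) -> IN:
    """Cuts of an isosceles triangular membership function."""
    return lambda h: (center - half_width * (1.0 - h),
                      center + half_width * (1.0 - h))


def parabola_cut(center: float, scale: float) -> IN:
    """Cuts of the parabola 1 - ((x - center) / scale)**2."""
    return lambda h: (center - scale * math.sqrt(1.0 - h),
                      center + scale * math.sqrt(1.0 - h))


def sigma_in(f1: IN, f2: IN, n: int = 20000) -> float:
    """Midpoint-rule approximation of the integral over h in (0, 1)."""
    return sum(sigma_meet(f1((k + 0.5) / n), f2((k + 0.5) / n))
               for k in range(n)) / n


W1, W2 = parabola_cut(3.0, 1.0), parabola_cut(7.0, 2.0)
U1, U2 = tri_cut(3.5, 0.2), tri_cut(5.5, 0.2)

s1, s2 = sigma_in(U1, W1), sigma_in(U2, W2)
print(round(s1, 4), round(s2, 4))   # compare with 0.7456 and 0.4321 in the text
```

The three-piece integrals in the text correspond to the three regimes the code passes through as h grows: the input cut lies inside the rule cut, the two cuts partially overlap, and the cuts are disjoint.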

Fig. 3. Consider the antecedent of rule R in Fig. 2. (a) Rule R is activated by trivial IN U0 = [U1,0, U2,0], where mW1(3.5) = 0.75 and mW2(5.5) = 0.4375. (b) Rule R is activated by IN U = [U1, U2], where both INs U1 and U2 have an isosceles (triangular) membership function of width 2 ∗ 0.2 = 0.4.

A second substantial advantage for σ⋎(., .), in particular, is its capacity to deal with nonoverlapping INs towards sensibly employing a sparse rule-base. For instance, on the one hand, Fig. 4(a) shows a trivial IN input U0 = [U0, U0], where U0 = U0(h) = [4.5, 4.5], h ∈ (0, 1], presented to rule R. It follows

\[ \sigma_{\curlyvee}(U_0 \preceq W_1) = \int_0^1 \frac{2\sqrt{1-h}}{1.5 + \sqrt{1-h}}\,dh \approx 0.5974, \]

moreover

\[ \sigma_{\curlyvee}(U_0 \preceq W_2) = \int_0^1 \frac{4\sqrt{1-h}}{2.5 + 2\sqrt{1-h}}\,dh \approx 0.6737. \]

On the other hand, Fig. 4(b) shows a nontrivial IN input U = [U, U] presented to rule R, where IN U has an isosceles (triangular) membership function of width 2 ∗ 0.2 = 0.4 centered at 4.5. It follows

\[ \sigma_{\curlyvee}(U \preceq W_1) = \int_0^1 \frac{2\sqrt{1-h}}{1.7 - 0.2h + \sqrt{1-h}}\,dh \approx 0.5693, \]

and

\[ \sigma_{\curlyvee}(U \preceq W_2) = \int_0^1 \frac{4\sqrt{1-h}}{2.7 - 0.2h + 2\sqrt{1-h}}\,dh \approx 0.6555. \]
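The four sigma-join values can be checked with a short numerical sketch. As before, v(x) = x and θ(x) = −x are taken from the text, while the helper names, the cut parameterizations and the midpoint rule are assumptions of this sketch.

```python
import math
from typing import Callable, Tuple

Interval = Tuple[float, float]
IN = Callable[[float], Interval]


def sigma_join(x: Interval, y: Interval) -> float:
    """Sigma-join for v(x) = x, theta(x) = -x: length of y / length of the join."""
    (a, b), (c, d) = x, y
    den = max(b, d) - min(a, c)
    if den == 0.0:               # assumed convention for a degenerate 0/0 case
        return 1.0
    return (d - c) / den


def tri_cut(center: float, half_width: float) -> IN:
    """Cuts of an isosceles triangle; half_width = 0 gives a trivial IN."""
    return lambda h: (center - half_width * (1.0 - h),
                      center + half_width * (1.0 - h))


def parabola_cut(center: float, scale: float) -> IN:
    """Cuts of the parabola 1 - ((x - center) / scale)**2."""
    return lambda h: (center - scale * math.sqrt(1.0 - h),
                      center + scale * math.sqrt(1.0 - h))


def sigma_in(f1: IN, f2: IN, n: int = 20000) -> float:
    """Midpoint-rule approximation of the integral over h in (0, 1)."""
    return sum(sigma_join(f1((k + 0.5) / n), f2((k + 0.5) / n))
               for k in range(n)) / n


W1, W2 = parabola_cut(3.0, 1.0), parabola_cut(7.0, 2.0)
U0 = tri_cut(4.5, 0.0)           # trivial IN [4.5, 4.5]
U = tri_cut(4.5, 0.2)            # triangular IN of width 0.4

results = [sigma_in(U0, W1), sigma_in(U0, W2), sigma_in(U, W1), sigma_in(U, W2)]
print([round(r, 4) for r in results])
# compare with 0.5974, 0.6737, 0.5693 and 0.6555 in the text
```

Because the input at 4.5 overlaps neither W1 (support [2, 4]) nor W2 (support [5, 9]), sigma-meet would return zero for every level h, whereas sigma-join still yields graded values.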

Fig. 4. Consider the antecedent of rule R in Fig. 2. (a) A trivial IN input U0 = [U0, U0] is presented. (b) A granular IN input U = [U, U] is presented. Only inclusion measure σ⋎(., .) can activate “in principle” rule R.

Finally, a third substantial advantage for an inclusion measure is its capacity to employ alternative positive valuation functions, whereas, in stark contrast, the majority of FISs in the literature (implicitly) employ solely positive valuation v(x) = x. In the following we demonstrate the effects of the (parametric) sigmoid positive valuation function

\[ v_s(x; \lambda, \mu_0) = \frac{1}{1 + e^{-\lambda(x - \mu_0)}}, \quad x \in R, \]

where λ ∈ R^{>0}, µ0 ∈ R. Consider INs W1 and W2 of Fig. 2, trivial IN U0 of Fig. 4(a), and triangular IN U of Fig. 4(b). Then, for the sigmoid function vs(x; 1, 4.5) shown in Fig. 5, it was computed σ⋎(U0 ⪯ W1) ≈ 0.6114 and σ⋎(U0 ⪯ W2) ≈ 0.9999; furthermore, σ⋎(U ⪯ W1) ≈ 0.5803 and σ⋎(U ⪯ W2) ≈ 1. Hence, a positive valuation can be used as an instrument for tunable decision-making.
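The effect of the sigmoid valuation can be explored with the sketch below. It assumes that θ(x) = −x is retained while v is replaced by the sigmoid with λ = 1 and µ0 = 4.5; the helper names, cut parameterizations and midpoint rule are again assumptions of this sketch.

```python
import math
from typing import Callable, Tuple

Interval = Tuple[float, float]
IN = Callable[[float], Interval]


def sigmoid(x: float, lam: float = 1.0, mu0: float = 4.5) -> float:
    """Sigmoid positive valuation v_s(x; lam, mu0)."""
    return 1.0 / (1.0 + math.exp(-lam * (x - mu0)))


def sigma_join(x: Interval, y: Interval,
               v: Callable[[float], float] = sigmoid) -> float:
    """Sigma-join with a general v and theta(x) = -x (theta is assumed here)."""
    (a, b), (c, d) = x, y
    num = v(-c) + v(d)
    den = v(-min(a, c)) + v(max(b, d))
    return num / den


def tri_cut(center: float, half_width: float) -> IN:
    return lambda h: (center - half_width * (1.0 - h),
                      center + half_width * (1.0 - h))


def parabola_cut(center: float, scale: float) -> IN:
    return lambda h: (center - scale * math.sqrt(1.0 - h),
                      center + scale * math.sqrt(1.0 - h))


def sigma_in(f1: IN, f2: IN, n: int = 20000) -> float:
    """Midpoint-rule approximation of the integral over h in (0, 1)."""
    return sum(sigma_join(f1((k + 0.5) / n), f2((k + 0.5) / n))
               for k in range(n)) / n


W1, W2 = parabola_cut(3.0, 1.0), parabola_cut(7.0, 2.0)
U0, U = tri_cut(4.5, 0.0), tri_cut(4.5, 0.2)

values = {
    "U0 in W1": sigma_in(U0, W1),
    "U0 in W2": sigma_in(U0, W2),
    "U in W1": sigma_in(U, W1),
    "U in W2": sigma_in(U, W2),
}
for name, val in values.items():
    print(name, round(val, 4))
```

Compared with v(x) = x, the sigmoid centered at 4.5 sharply increases the degree of inclusion in W2 while leaving the degree of inclusion in W1 nearly unchanged, which illustrates how λ and µ0 act as tuning parameters.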

Fig. 5. INs W1 and W2 of Fig. 2 are displayed as well as both trivial IN U0 and triangular IN U of Fig. 4. Inclusion measures σ⋎(., .) were computed using the displayed sigmoid positive valuation vs(x; λ, µ0) = 1/(1 + e^{−λ(x−µ0)}) with λ = 1, µ0 = 4.5.

5 Discussion and Conclusion

This work introduced two major theoretical results, presented by Proposition 1 and Proposition 2, relating, on the one hand, inclusion-measure-based algebraic operations in lattice (F, ⪯) and, on the other hand, membership-function-based fuzzy logic operations. In conclusion, significant improvements were demonstrated in FIS design including the rigorous fusion of granular input data, sensible employment of sparse rules, and introduction of tunable nonlinearities.

Acknowledgement

This work has been supported, in part, by a project Archimedes-III contract.

References

1. R. Belohlavek. Fuzzy Relational Systems: Foundations & Principles. Springer, 2002.
2. G. Birkhoff. Lattice Theory. AMS, Colloquium Publications 25, 1967.
3. M. Graña. State of the art in lattice computing for artificial intelligence applications. In: R. Nadarajan, R. Anitha, C. Porkodi, eds. Mathematical and Computational Models, pp. 233–242, 2007.
4. M. Graña. Lattice computing: lattice-theory-based computational intelligence. In: T. Matsuhisa, H. Koibuchi, eds. Proc. Kosen Workshop on Mathematics, Technology, and Education (MTE), pp. 19–27, 2008.
5. M. Graña, I. Villaverde, J.O. Maldonado, C. Hernandez. Two lattice computing approaches for the unsupervised segmentation of hyperspectral images. Neurocomputing 72(10-12): 2111–2120, 2009.
6. S. Guillaume. Designing fuzzy inference systems from data: an interpretability-oriented review. IEEE Trans. Fuzzy Systems 9(3): 426–443, 2001.
7. V.G. Kaburlasos. Towards a Unified Modeling and Knowledge-Representation Based on Lattice Theory. Springer, ser. Studies in Computational Intelligence 27, 2006.
8. V.G. Kaburlasos, A.G. Hatzimichailidis. Improved fuzzy inference system (FIS) design based on fuzzy lattice reasoning (FLR). (submitted).
9. V.G. Kaburlasos, S.E. Papadakis. Piecewise-linear approximation of nonlinear models based on interval numbers (INs). In: V.G. Kaburlasos, U. Priss, M. Graña, eds. Proc. Lattice-Based Modeling (LBM 2008) Workshop, pp. 13–22, 2008.
10. S.E. Papadakis, V.G. Kaburlasos. Piecewise-linear approximation of nonlinear models based on probabilistically/possibilistically interpreted intervals’ numbers (INs). Information Sciences (to be published).
11. W. Pedrycz, A. Skowron, V. Kreinovich, eds. Handbook of Granular Computing. John Wiley & Sons, 2008.
12. G.X. Ritter, J.N. Wilson. Handbook of Computer Vision Algorithms in Image Algebra, 2nd ed. CRC Press, 2000.
13. P. Sussner, M.E. Valle. Morphological and certain fuzzy morphological associative memories for classification and prediction. In: V.G. Kaburlasos, G.X. Ritter, eds. Computational Intelligence Based on Lattice Theory. Springer, ser. Studies in Computational Intelligence 67, pp. 149–171, 2007.
14. P.P. Wang. Mathematics of Uncertainty – Guest Editorial. Information Sciences 177(23): 5141–5142, 2007.
15. Y. Xu, D. Ruan, K. Qin, J. Liu. Lattice-Valued Logic. Springer, ser. Studies in Fuzziness and Soft Computing 132, 2003.
