
ScienceDirect Procedia Computer Science 61 (2015) 32 – 37

Complex Adaptive Systems, Publication 5 Cihan H. Dagli, Editor in Chief Conference Organized by Missouri University of Science and Technology 2015-San Jose, CA

An Interactive Algorithm to Construct an Appropriate Nonlinear Membership Function Using Information Theory and Statistical Method

Takashi Hasuike (a,*), Hideki Katagiri (b), Hiroe Tsubaki (c)

(a) Faculty of Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
(b) Graduate School of Engineering, Hiroshima University, 1-4-1 Kagamiyama, Higashi-Hiroshima, Hiroshima 739-8527, Japan
(c) National Statistics Center, 19-1 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8668, Japan

Abstract

This paper develops an algorithm that constructs an appropriate membership function as objectively as possible. Setting an appropriate membership function is important in real-world decision making. The main academic contribution of the proposed algorithm is to integrate a general continuous and nonlinear function with fuzzy Shannon entropy into subjective interval estimation by a heuristic method under a given probability density function based on real-world data. The approach has two main steps: first, the decision maker sets the membership values for elements that he or she can confidently judge to be included in, or excluded from, the given set; second, the remaining values are obtained objectively by solving a mathematical programming problem with fuzzy Shannon entropy. Previous construction approaches cannot solve this problem efficiently because of the nonlinear function. In this paper, the given nonlinear membership function is therefore approximated by a piecewise linear membership function, and the appropriate values are determined. Furthermore, by introducing assumptions that are natural in the real world and interactively adjusting the membership values, an algorithm that obtains the optimality condition of each appropriate membership value is developed.

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of scientific committee of Missouri University of Science and Technology.

Keywords: Fuzzy entropy; Mathematical programming; Interactive algorithm

1877-0509 © 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of scientific committee of Missouri University of Science and Technology. doi:10.1016/j.procs.2015.09.140

1. Introduction

Uncertainty is generally represented by random variables estimated from observed numerical data. It is also important, however, to formulate mathematically other kinds of uncertainty, such as the fuzziness derived from human cognitive behavior, utility, and subjectivity. One standard mathematical approach is fuzzy theory, in which a central task is to construct a membership function for the given set. Many guidelines on developing membership functions for fuzzy sets have been shown in a survey by Gottwald [7]. With the previous standard approaches, it is difficult to select an appropriate shape for the membership function and to set the membership values statistically; hence, heuristic methods, which have no mathematical or statistical guarantee, have been used. However, if a decision maker defines membership functions subjectively using heuristic methods, the optimal solution of the mathematical programming problem depends on his or her subjectivity, and other people may not accept the resulting decision because it lacks objectivity.

To overcome this disadvantage of constructing membership functions, some researchers have proposed statistically more rigorous approaches. For instance, they adopted a transformation from a probability distribution to a possibility distribution (for instance, Bharathi and Sarma [2]). Civanlar and Trussell [6] proposed a more rigorous method that defines the membership function for a certain group of fuzzy sets by maximizing the fuzzy entropy measure based on the probability density function derived from real-world data. This approach has been extended by Cheng and Cheng [5] and Nieradka and Butkiewicz [9]. Cheng and Cheng [4] also proposed an approach that determines the membership function automatically based on the maximum entropy principle. The advantage of fuzzy entropy approaches is that the decision maker can construct the membership function based on statistics and mathematical programming. However, almost all parameters are determined automatically by these approaches, and hence human cognitive behavior and subjectivity may not be sufficiently reflected in the resulting membership function.

On the other hand, there are some studies that compare various heuristic approaches in terms of appropriateness.
Chameau and Santamarina [3] prepared questionnaires to compare four approaches to constructing membership functions as possible candidates for practical applications: point estimation, interval estimation, membership function exemplification, and pairwise comparison. They concluded that the interval estimation approach is better than the others. Yoshikawa [11] also discussed the influence of the procedures of an interactive identification method on the forms of membership functions; his approach is likewise based on interval estimation and questionnaires related to the degree of membership. In real-world decision making, a decision maker can confidently set the intervals with membership values 0 and 1, because he or she can explain subjectively and objectively how the intervals are set. Thus, the interval estimation method was found to offer a number of advantages that make it very suitable for practical applications. However, these approaches still rely partly on heuristic methods, particularly in the ranges outside the intervals with membership values 0 and 1.

Therefore, to overcome the disadvantages of both the fuzzy entropy approach and the heuristic method based on subjective interval estimation, we previously developed approaches that construct appropriate membership functions using both fuzzy Shannon entropy and subjective interval estimation derived from human cognitive behavior and subjectivity [8]. However, those approaches do not include interactivity between the decision maker and the designer of the membership function. In addition, the proposed algorithm should be applicable to various membership functions rather than depend on a specific function, and hence we consider a piecewise linear membership function. Almost all membership functions, including the S-curve function, can be approximated by a piecewise linear membership function, so the decision maker can obtain flexible and appropriate shapes of the membership function using the proposed approach.
In addition, the original problem of the proposed approach is a nonlinear programming problem, which is difficult to solve directly. Therefore, we develop an efficient algorithm to obtain the membership function under some natural assumptions.

This paper is organized as follows. In Section 2, we introduce fuzzy Shannon entropy based on the standard entropy in information theory. We also introduce a piecewise linear function so that our construction approach can be applied to any nonlinear membership function. In Section 3, we formulate a mathematical programming problem that maximizes the fuzzy Shannon entropy with the piecewise linear function under a constraint on the total average membership value derived from the given probability density function (PDF). To solve the proposed model efficiently, we introduce a natural assumption on the PDF and the piecewise linear function, and transform the fuzzy Shannon entropy into a well-defined form. Using these conditions and nonlinear programming approaches, we develop an interactive algorithm that constructs the appropriate membership function. Finally, in Section 4, we conclude this paper and discuss future research efforts.

2. Mathematical Definition to Construct Appropriate Membership Function

We introduce the mathematical definitions needed to construct an appropriate membership function that integrates the decision maker's interval estimation with a given PDF, and to develop the efficient algorithm. The important mathematical element of our proposed approach is to maximize fuzzy Shannon entropy for the membership function considering a piecewise linear function, and hence the definitions of the fuzzy Shannon entropy and the piecewise linear function are introduced as follows.

2.1. Fuzzy entropy

Various definitions of fuzzy entropy have been proposed based on statistical theory (for instance, Al-sharhan and Karray [1], Nieradka and Butkiewicz [9], Pal and Bezdek [10]). Let $I$ be a set with random events $\{x_1, x_2, \ldots, x_n\}$ in an experiment and

$p_i$ be the corresponding probability of event $x_i$ occurring. Shannon entropy is one of the most standard entropies in information theory using random events, and is formulated as $-\sum_{i=1}^{n} p_i \log p_i$. By extending the Shannon entropy to uncertainty derived from the fuzziness of the fuzzy set, the fuzzy Shannon entropy of membership values $\mu_i$ corresponding to events $x_i$ is defined as $-\sum_{i=1}^{n} \mu_i \log \mu_i$. Evidently, the continuous form of fuzzy Shannon entropy is also represented as $-\int_{-\infty}^{\infty} \mu(x) \log \mu(x)\,dx$. In addition, it is possible to use $-\sum_{i=1}^{n} \{\mu_i \log \mu_i + (1 - \mu_i) \log (1 - \mu_i)\}$ as another fuzzy Shannon entropy, which is very similar to the standard Shannon entropy in the case of binary discrete random variables. In this paper, we aim to develop an efficient solution algorithm to construct each decision maker's membership function, and hence we focus on the simple discrete form of fuzzy Shannon entropy to construct the appropriate membership function.

2.2. Piecewise linear function

In the real world, it is difficult to determine a specific membership function initially. In statistics and machine learning, piecewise linear functions are used to obtain membership functions without fixing a specific functional form. Almost all membership functions, including nonlinear functions, can be approximated by a piecewise linear membership function, and hence it is natural to introduce the piecewise linear membership function as a general membership function. In this paper, we divide received data into $T$ groups, and set

$I_t = \{x_{t-1} \le x \le x_t\}$, $t = 1, 2, \ldots, T$, and the piecewise linear function

$$f(x) = \frac{c_t - c_{t-1}}{x_t - x_{t-1}}\,(x - x_{t-1}) + c_{t-1}, \quad x \in I_t.$$

When the membership function is determined using this piecewise linear function, parameter $c_t$ represents membership value $\mu_t$. If $f(x)$ is a smooth, monotonically increasing function on $I_t$, it is acceptable that the difference between $c_t - c_{t-1}$ and $c_{t+1} - c_t$ is equal to or approximately equal to 0. That is, $c_t - c_{t-1} \approx c_{t+1} - c_t$ is obtained, where "$\approx$" means "approximately equal". The above formula can also be represented in the following form:

$$c_{t+1} - 2c_t + c_{t-1} \approx 0$$

For instance, in the case $c_t - c_{t-1} = c_{t+1} - c_t$ for any $t$, the piecewise linear membership function is equivalently transformed into a linear function, which is the basis of triangular and trapezoidal membership functions. The ideal condition is to satisfy the above formula on all ranges $I_t$, $t = 1, 2, \ldots, T$, and hence we consider the following formula based on the least squares method:

$$\sum_{t=1}^{T-1} \left(c_{t+1} - 2c_t + c_{t-1}\right)^2 \tag{1}$$
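As an illustration of the definitions above, the following Python sketch evaluates the piecewise linear function $f(x)$ on knots $x_0, \ldots, x_T$ and computes the least-squares smoothness measure (1). The function names `piecewise_linear` and `second_difference_penalty`, the list-based data layout, the clamping of $x$ outside $[x_0, x_T]$, and the sample values are assumptions made for this example and are not part of the original paper.

```python
from bisect import bisect_right


def piecewise_linear(x, knots, c):
    """Evaluate f(x) for knots = [x_0, ..., x_T] and values c = [c_0, ..., c_T].

    On I_t = [x_{t-1}, x_t] the value is
    (c_t - c_{t-1}) / (x_t - x_{t-1}) * (x - x_{t-1}) + c_{t-1}.
    Outside [x_0, x_T] the end values are returned (an assumption of this sketch).
    """
    if x <= knots[0]:
        return c[0]
    if x >= knots[-1]:
        return c[-1]
    t = bisect_right(knots, x)  # index t with knots[t-1] <= x < knots[t]
    slope = (c[t] - c[t - 1]) / (knots[t] - knots[t - 1])
    return slope * (x - knots[t - 1]) + c[t - 1]


def second_difference_penalty(c):
    """Formula (1): sum over t = 1, ..., T-1 of (c_{t+1} - 2 c_t + c_{t-1})^2."""
    return sum((c[t + 1] - 2.0 * c[t] + c[t - 1]) ** 2
               for t in range(1, len(c) - 1))


if __name__ == "__main__":
    knots = [0.0, 1.0, 2.0, 3.0, 4.0]
    linear = [0.0, 0.25, 0.5, 0.75, 1.0]   # equal increments
    s_curve = [0.0, 0.1, 0.5, 0.9, 1.0]    # hypothetical S-shaped values
    print(piecewise_linear(1.5, knots, linear))
    print(second_difference_penalty(linear))
    print(second_difference_penalty(s_curve))
```

With equal increments the penalty vanishes, which matches the remark that this case reduces to a single linear function, while the S-shaped values give a strictly positive penalty.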

In addition, in the case that we assume that PDF $p(x)$ is a histogram, i.e., $p(x) = p_t$, $x \in I_t$, it is natural to assume that we divide the intervals including all data into the same width $1/n$. That is, $x_t - x_{t-1} = x_{t+1} - x_t$ holds with respect to $I_t$ and $I_{t+1}$. In this case, one of the smoothest formulas is given as $c_{t+1} - c_t \approx c_t - c_{t-1} \approx \frac{1}{T}$, $t = 1, 2, \ldots, T-1$, that is, $c_t \approx \frac{t}{T}$, $t = 1, 2, \ldots, T$. Therefore, the smoothing function based on the least squares method (1) is represented in the following form:

$$\sum_{t=1}^{T} \left(c_t - \frac{t}{T}\right)^2 \tag{2}$$
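The two building blocks defined in this section, the discrete fuzzy Shannon entropy of Section 2.1 and the smoothing function (2), can be sketched in Python as follows. The function names, the convention $0 \log 0 = 0$, and the sample values are assumptions of this example rather than details given in the paper.

```python
import math


def fuzzy_shannon_entropy(mu):
    """Discrete fuzzy Shannon entropy: -sum_i mu_i log mu_i.

    Terms with mu_i = 0 are skipped (convention 0 log 0 = 0, assumed here).
    """
    return -sum(m * math.log(m) for m in mu if m > 0.0)


def binary_fuzzy_entropy(mu):
    """Alternative form: -sum_i {mu_i log mu_i + (1 - mu_i) log(1 - mu_i)}."""
    total = 0.0
    for m in mu:
        for q in (m, 1.0 - m):
            if q > 0.0:
                total -= q * math.log(q)
    return total


def smoothing_term(c):
    """Formula (2): sum over t = 1, ..., T of (c_t - t/T)^2, with c = [c_1, ..., c_T]."""
    T = len(c)
    return sum((c[t - 1] - t / T) ** 2 for t in range(1, T + 1))


if __name__ == "__main__":
    ramp = [0.25, 0.5, 0.75, 1.0]  # c_t = t/T for T = 4
    print(fuzzy_shannon_entropy(ramp))
    print(binary_fuzzy_entropy(ramp))
    print(smoothing_term(ramp))
```

The ramp $c_t = t/T$ makes the smoothing term zero, so any deviation from it is paid for in (2) and has to be justified by a gain in entropy.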

By integrating this smoothing formula with fuzzy Shannon entropy and maximizing the integrated function, we develop an algorithm to obtain an appropriate membership function.

3. Constructing Algorithm of Appropriate Membership Function

We propose a construction approach for the appropriate membership function in this section. In order to integrate human cognitive behavior and subjectivity into the statistical function discussed in Section 2, we use intervals initially set by an examinee: two ranges in which an event or a condition is entirely within human experience and never within human experience, respectively. For instance, when we ask the examinee to specify a range of temperatures for which the examinee feels "entirely comfortable," the response might be from 20 to 25 degrees Celsius. Then, when we ask the examinee to specify a range of "never comfortable" temperatures, she or he might answer less than 14 degrees Celsius or more than 30 degrees Celsius. These questions are not a burden, and it is not difficult to provide these two ranges. Furthermore, if we introduce these two ranges into the proposed fuzzy Shannon entropy-based model, it is possible to integrate subjective personal cognitive behavior with objectivity derived from statistical theory. We assume that each endpoint is initially set by the decision maker, i.e., $\mu_0 = 0$ and $\mu_T = 1$ in this paper.

3.1. Mathematical modeling of our proposed approach

The main objective of our approach is to maximize the fuzzy Shannon entropy. The only available quantitative data is the PDF $p(x)$ derived from real-world data. In previous studies, as a constraint in a mathematical programming problem to construct the appropriate membership function, the total expected value of the membership function

$E[\mu(x)] = \int_{-\infty}^{\infty} \mu(x)\,p(x)\,dx$ is required to be at least the target value $c$, which is the target total average membership value initially determined by the examinee, i.e., $\int_{-\infty}^{\infty} \mu(x)\,p(x)\,dx \ge c$. This concept is based on the study conducted by Civanlar and Trussell [6]. If the value of parameter $c$ is a large value close to 1, the constructed membership function is similar to a function whose value is 1 under $p(x) > 0$. On the other hand, if $c$ is a small value close to 0, the constructed membership function is similar to a function whose value is 0 in all ranges $I_t$. Therefore, it is important to flexibly set the value of parameter $c$ according to the decision maker's feelings. In addition, from the interview or questionnaire to the decision maker, membership values $\mu_t$ of some important groups are interactively set as interval range $[\mu_t^L, \mu_t^U]$, where $\mu_t^L$ and $\mu_t^U$ are the lower and upper values. If there are no data from the decision maker, $\mu_t^L$ and $\mu_t^U$ can be set as 0 and 1, respectively. Therefore, from these mathematical techniques, the appropriate membership function is obtained as the optimal solution of the following problem under the condition that PDF $p(x)$ is a histogram, i.e., $p(x) = p_t$, $x \in I_t$:

$$\begin{array}{ll} \text{Minimize} & \displaystyle \sum_{t=1}^{T} \mu_t \log \mu_t + \tau \sum_{t=1}^{T} \left(\mu_t - \frac{t}{T}\right)^2 \\ \text{subject to} & \displaystyle \sum_{t=1}^{T} \mu_t p_t \ge c, \quad \mu_t^L \le \mu_t \le \mu_t^U, \quad t = 1, 2, \ldots, T \end{array} \tag{3}$$
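To make problem (3) concrete, the following Python sketch evaluates its objective and checks its constraints for a candidate vector of membership values. The function names `objective` and `is_feasible`, the treatment of $p_t$ as group probabilities that sum to 1, the convention $0 \log 0 = 0$, the tolerance `tol`, and the sample data are assumptions of this example, not details from the paper.

```python
import math


def objective(mu, tau):
    """Objective of (3): sum_t mu_t log mu_t + tau * sum_t (mu_t - t/T)^2.

    mu = [mu_1, ..., mu_T]; terms with mu_t = 0 use 0 log 0 = 0 (assumed).
    """
    T = len(mu)
    neg_entropy = sum(m * math.log(m) for m in mu if m > 0.0)
    smooth = sum((mu[t - 1] - t / T) ** 2 for t in range(1, T + 1))
    return neg_entropy + tau * smooth


def is_feasible(mu, p, c, mu_L, mu_U, tol=1e-12):
    """Constraints of (3): sum_t mu_t p_t >= c and mu_L[t] <= mu[t] <= mu_U[t]."""
    expected = sum(m * q for m, q in zip(mu, p))
    in_bounds = all(lo - tol <= m <= hi + tol
                    for m, lo, hi in zip(mu, mu_L, mu_U))
    return expected >= c - tol and in_bounds


if __name__ == "__main__":
    ramp = [0.25, 0.5, 0.75, 1.0]   # candidate mu_t = t/T
    p = [0.25, 0.25, 0.25, 0.25]    # hypothetical uniform histogram
    lower, upper = [0.0] * 4, [1.0] * 4
    print(objective(ramp, tau=1.0))
    print(is_feasible(ramp, p, 0.6, lower, upper))
    print(is_feasible(ramp, p, 0.7, lower, upper))
```

The weight `tau` trades the entropy term against closeness to the linear ramp, and raising `c` shrinks the feasible set, which is why the text asks the decision maker to choose both values.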

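Using nonlinear programming approaches such as the Lagrange function and the KKT conditions, we obtain the following optimal solutions:

$$\mu_t = \begin{cases} \mu_t^U, & \eta_t \le \lambda \\ \mu_t, & \xi_t < \lambda < \eta_t \\ \mu_t^L, & \lambda \le \xi_t \end{cases} \qquad t = 1, 2, \ldots, T-1$$

where

$$\begin{cases} \xi_t = \dfrac{1}{p_t}\left\{\log\left(\mu_t^L + \varepsilon\right) + 2\tau\left(\mu_t^L + \varepsilon - \dfrac{t}{T}\right)\right\}, & \text{in the case } p_t > 0 \\[2ex] \eta_t = \dfrac{1}{p_t}\left\{\log\left(\mu_t^U - \varepsilon\right) + 2\tau\left(\mu_t^U - \varepsilon - \dfrac{t}{T}\right)\right\}, & \text{in the case } p_t > 0 \\[2ex] \log \mu_t + 1 + 2\tau\left(\mu_t - \dfrac{t}{T}\right) - \lambda p_t = 0, & \text{in the case } \xi_t < \lambda < \eta_t \text{ or } p_t = 0 \\[2ex] \lambda = \dfrac{1}{c} \displaystyle\sum_{t=1}^{T} \mu_t \left(\log \mu_t + 1 + 2\tau\left(\mu_t - \dfrac{t}{T}\right)\right) \end{cases} \tag{4}$$

and $\varepsilon$ $(> 0)$ is a sufficiently small value. Each optimal solution in (4) depends on $\lambda$; in particular, it is important which range $[\xi_t, \eta_t]$ contains $\lambda$. In order to obtain the optimal solutions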

correctly, we develop the following interactive algorithm for the appropriate membership function.

Interactive algorithm to obtain the membership function

STEP1: Receive probability density function $p_t$ and set values of $c$ and $\tau$. If possible, set $\mu_t^L$ and $\mu_t^U$. Go to STEP2.

STEP2: Arrange $\xi_t$ and $\eta_t$, $t = 1, 2, \ldots, T$, in ascending order, and redefine them as parameters $s_j$, $j = 1, 2, \ldots, S$, where $S$ is the total number of different values of $\xi_t$ and $\eta_t$, $t = 1, 2, \ldots, T$. Set $s_0 \leftarrow 0$, $j \leftarrow 1$. Go to STEP3.

STEP3: Set $\lambda \leftarrow s_j$ and solve equations (4) under $\lambda$, considering each range whose membership value is $\mu_t^L$, $\mu_t^U$ or $\mu_t$. Go to STEP4.

STEP4: Calculate $\sum_{t=1}^{T} \mu_t p_t$. If $\sum_{t=1}^{T} \mu_t p_t < c$, then set $j \leftarrow j + 1$ and return to STEP3. If $\sum_{t=1}^{T} \mu_t p_t \ge c$, set $L_\lambda \leftarrow s_{j-1}$ and $U_\lambda \leftarrow s_j$. Furthermore, set $k \leftarrow 1$ and go to STEP5.

STEP5: From STEPs 1 to 4, we find $L_\lambda < \lambda < U_\lambda$, and hence set $\lambda_k \leftarrow \dfrac{L_\lambda + U_\lambda}{2}$, and go to STEP6.

STEP6: Solve equations (4) under $\lambda_k$, considering each range whose membership value is $\mu_t^L$, $\mu_t^U$ or $\mu_t$. Go to STEP7.

STEP7: If $k \ge 2$ and $|\lambda_k - \lambda_{k-1}| < \delta$, where $\delta$ is a sufficiently small value, then the current solution is the optimal solution of the main problem (3), and go to STEP9. If $k = 1$ or $|\lambda_k - \lambda_{k-1}| \ge \delta$, go to STEP8.

STEP8: If $\sum_{t=1}^{T} \mu_t p_t < c$, then set $L_\lambda \leftarrow \lambda_k$ and $k \leftarrow k + 1$, and return to STEP5. If $\sum_{t=1}^{T} \mu_t p_t \ge c$, then set $U_\lambda \leftarrow \lambda_k$ and $k \leftarrow k + 1$, and return to STEP5.

STEP9: If the decision maker is satisfied with the current membership function, terminate this algorithm. If not, reset the values of parameters $c$ and $\tau$, and of $\mu_t^L$ and $\mu_t^U$ if possible. Return to STEP2.
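The interactive algorithm above can be approximated by the following Python sketch for one fixed choice of $c$, $\tau$, $\mu_t^L$, and $\mu_t^U$ (one pass through STEP1 to STEP8). It is a simplification and not the authors' exact procedure: instead of sorting the thresholds $\xi_t$ and $\eta_t$ to bracket $\lambda$, it brackets $\lambda$ by doubling, solves the stationarity equation in (4) for each $\mu_t$ by bisection, and clips the result to $[\mu_t^L, \mu_t^U]$. The function names, the treatment of $p_t$ as group probabilities that sum to 1, the indexing of the lists by $t = 1, \ldots, T$, and the sample data are all assumptions of this example.

```python
import math


def interior_mu(lam, p_t, t, T, tau, iters=200):
    """Root in (0, 1] of log(mu) + 1 + 2*tau*(mu - t/T) - lam*p_t = 0.

    The left-hand side is strictly increasing in mu, so bisection applies.
    If the root lies above 1, the value 1 is returned.
    """
    def g(mu):
        return math.log(mu) + 1.0 + 2.0 * tau * (mu - t / T) - lam * p_t

    lo, hi = 1e-12, 1.0
    if g(hi) <= 0.0:
        return hi
    if g(lo) >= 0.0:
        return lo
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def memberships(lam, p, tau, mu_L, mu_U):
    """Membership values for a fixed multiplier lam, clipped to [mu_L, mu_U]."""
    T = len(p)
    return [min(max(interior_mu(lam, p[i], i + 1, T, tau), mu_L[i]), mu_U[i])
            for i in range(T)]


def expected_membership(mu, p):
    """Total average membership value sum_t mu_t p_t."""
    return sum(m * q for m, q in zip(mu, p))


def solve(p, c, tau, mu_L, mu_U, delta=1e-9):
    """Bisection on lam until sum_t mu_t p_t >= c holds with bracket width < delta."""
    if expected_membership(mu_U, p) < c:
        raise ValueError("target c cannot be reached within the given bounds")
    lam_lo = 0.0
    mu = memberships(lam_lo, p, tau, mu_L, mu_U)
    if expected_membership(mu, p) >= c:
        return lam_lo, mu  # constraint already satisfied with lam = 0
    lam_hi = 1.0
    while expected_membership(memberships(lam_hi, p, tau, mu_L, mu_U), p) < c:
        lam_hi *= 2.0  # enlarge the bracket until the constraint holds
    while lam_hi - lam_lo >= delta:
        lam_k = 0.5 * (lam_lo + lam_hi)
        if expected_membership(memberships(lam_k, p, tau, mu_L, mu_U), p) < c:
            lam_lo = lam_k
        else:
            lam_hi = lam_k
    return lam_hi, memberships(lam_hi, p, tau, mu_L, mu_U)


if __name__ == "__main__":
    p = [0.1, 0.2, 0.4, 0.2, 0.1]      # hypothetical histogram probabilities
    mu_L = [0.0, 0.0, 0.0, 0.0, 1.0]   # mu_T fixed to 1 through its lower bound
    mu_U = [1.0] * 5
    lam, mu = solve(p, c=0.6, tau=1.0, mu_L=mu_L, mu_U=mu_U)
    print(lam, mu)
```

In an interactive session corresponding to STEP9, the decision maker would inspect the returned values and, if not satisfied, call `solve` again with adjusted `c`, `tau`, or bounds.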

4. Conclusion

In this paper, we have developed an interactive algorithm that constructs an appropriate membership function by integrating fuzzy Shannon entropy with a piecewise linear membership function under the given probability density function. The proposed approach has been formulated as a more general mathematical programming problem than previous approaches, but it was difficult to solve the proposed problem directly. Therefore, we have introduced some natural assumptions for the fuzzy Shannon entropy and the piecewise linear function, and we have developed an efficient construction algorithm that obtains the optimality condition of the membership values using nonlinear programming approaches. Our proposed approach includes the advantages of both statistical theory and the interactive approach. In addition, our approach can be applied to various types of membership functions in the real world. Therefore, the obtained membership function will be more statistically and objectively appropriate, and will also fit human feelings better. As future work, we will develop more efficient and versatile algorithms for the original problem (3) without introducing any assumptions. Furthermore, we will apply our proposed algorithm to real-world decision making.

References

[1] S. Al-sharhan, F. Karray, and O. Basir, “Fuzzy entropy: a brief survey”, Proceedings of FUZZ-IEEE2001, 3, pp. 1135-1138, 2001.
[2] B. Bharathi and V.V.S. Sarma, “Estimation of fuzzy membership from histograms”, Information Sciences, 35, pp. 43-59, 1985.
[3] J.L. Chameau and J.C. Santamarina, “Membership function I: Comparing methods of Measurement”, International Journal of Approximate Reasoning, 1, pp. 287-301, 1987.
[4] H.D. Cheng and J.R. Cheng, “Automatically determine the membership function based on the maximum entropy principle”, Information Sciences, 96, pp. 163-182, 1997.
[5] H.D. Cheng and Y.H. Cheng, “Thresholding based on fuzzy partition of 2D histogram”, Proceedings of IEEE International Conference on Pattern Recognition, 2, pp. 1616-1618, 1997.
[6] M.R. Civanlar and H.J. Trussell, “Constructing membership functions using statistical data”, Fuzzy Sets and Systems, 18, pp. 1-13, 1986.
[7] S. Gottwald, “A note on measures of fuzziness”, Elektron Informationsverarb Kybernet, 15, pp. 221-223, 1979.
[8] T. Hasuike, H. Katagiri, and H. Tsubaki, “Constructing an appropriate membership function integrating fuzzy Shannon entropy and human’s interval estimation”, ICIC Express Letters, 8(3), pp. 809-813, 2014.
[9] G. Nieradka and B. Butkiewicz, “A method for automatic membership function estimation based on fuzzy measures”, Proceedings of IFSA2007, LNAI4529, pp. 451-460, 2007.
[10] N.R. Pal and J.C. Bezdek, “Measuring fuzzy uncertainty”, IEEE Transactions on Fuzzy Systems, 2(2), pp. 107-118, 1994.
[11] A. Yoshikawa, “Influence of procedure for interactive identification method on forms of identified membership functions”, Japan Society for Fuzzy Theory and Intelligent Informatics (in Japanese), 19(1), pp. 69-78, 2007.