An overview of membership function generation ...

68 downloads 1436 Views 1MB Size Report
from the matrix A--[aij], called the pairwise comparison alternative matrix, whose element ... element aji in the jth row and ith column of this matrix is 1/aij. Saaty used a .... Fig. l(a) shows the intensity image of a 256 x 256 color image obtained.
ELSEVIER International Journal of ApproximateReasoning 19 (1998) 391-417

An overview of membership function generation techniques for pattern recognition Swarup Medasani a, Jaeseok Kim u, Raghu Krishnapuram ~'* Department of Mathematical and Computer Sciences, Colorado School of Mines, Golden, CO 80401, USA b Samsung Biomedical Research Institute, 50 llwon-dong, Kangnam-Ku, Seoul, South Korea-135710 a

Received 1 April 1997; accepted 1 March 1998

Abstract

The estimation of membership functions from data is an important step in many applications of fuzzy theory. In this paper, we provide a general overview of several methods for generating membership functions for fuzzy pattern recognition applications. We discuss methods based on heuristics, probability to possibility transformations, histograms, nearest neighbor techniques, feed-forward neural networks, clustering, and mixture decomposition. We also illustrate these membership generation methods using synthetic and real data sets, and discuss the suitability and applicability of these membership function generation techniques to particular situations. © 1998 Elsevier Science Inc. All rights reserved. Keywords: Membership function generation; Typicality; Probability to possibility

transformations; K-nearest neighbor; Clustering; Mixture decomposition; Multiprototype classification

*Corresponding author. Tel.: +1 303 273 3877; fax: +1 303 273 3875; e-mail: rkrishna@mines. edu 0888-613X/98/$19.00 © 1998 Elsevier Science Inc. All rights reserved. PII: S 0 8 8 8 - 6 1 3 X ( 9 8 ) 1 00 1 7-8

392

S. Medasani et al. / lnternat. J. Approx. Reason. 19 (1998) 391~I17

1. Introduction

Eliciting membership functions from (training) data is one of the fundamental issues associated with the application of fuzzy set theory. There are no guidelines or rules that can be used to choose the appropriate membership generation technique. Another problem that makes membership function generation a non-trivial task, is the lack of consensus on the definition and interpretation of membership functions. For example, Dubois and Prade [1] point out three different interpretations of the statement: "The membership value of George Bush in the class of tall men is 0.8." The three interpretations are: (a) 80% of the population declared that George Bush is tall (likelihood view), (b) 80% of the population described "tall" as an interval containing George Bush's height (random set view), and (c) George Bush's height is at a normalized distance equal to 0.2 from the closest ideal prototype (typicality view). Thus, one could think of a variety of methods to generate membership values depending on the interpretation. Much literature is oriented toward the determination of membership functions that reflect subjective perceptions about vague or imprecise concepts such as tallpersons, old men, and young men. Unfortunately, these methods cannot be directly applied to many practical problems. For example, in fuzzy logic applications, membership functions are needed to model the prevailing uncertainty in the input information. It should be noted that no measures are available to evaluate the goodness or correctness of the membership function generated using a particular method. This is a serious problem when membership functions are used to model concepts that have no physical meanings. Therefore, models used for membership functions must be sufficiently flexible so that they can be easily adjusted or tuned to optimize the performance of the algorithm that uses them. The problem of membership function generation is of fundamental importance because the success of an algorithm depends on the membership functions used. It might be difficult, if not impossible, to come up with a single membership generation method which will work for most applications. Rather, several methods may have to be used in tandem, and the choice of the method may depend on the kind of the problem and the type of data available. Other approaches to model uncertainty (such as probability) also use a variety of methods to estimate the underlying distributions depending on the situation. In this paper, we focus on methods of membership function generation for pattern recognition applications. We suggest several methods to estimate membership functions and make some qualitative comparisons. In Section 2, we review membership function generation techniques that reflect subjective perception. Membership functions based on heuristic functions are presented in Section 3. In Section 4, we show how multi-dimensional histograms of input features can be used for generating membership functions. In Section 5, some methods to transform probability distributions to possi-

S. Medasani et aL/ Internat. Z Approx. Reason. 19 (1998) 391-417

393

bility distributions are presented. Section 6, delves into the issue of membership function generation using fuzzy K-nearest neighbor techniques. Neural network-based procedures are presented in Section 7. In Section 8, we discuss a popular clustering technique and a new mixture decomposition technique for estimating membership functions. Examples of each of the aforementioned techniques are presented in the respective sections. Finally, in Section 9 we present the conclusions.

2. Membership functions based on perception

In many decision-making applications, membership functions of fuzzy sets are based on subjective perceptions of vague or imprecise categories rather than on data or other objective entities involved in the given problem. The problem of assigning numbers to subjective perceptions of vague categories is a matter of mathematical psychology and requires the utilization of various techniques of the theory of measurement and scaling. Extensive experiments on generating membership functions for concepts such as tall man and aesthetically pleasing house were conducted by Norwich and Turksen [2] with the assumption that membership values should be defined on an interval scale. An interval scale is justified by the inapplicability of extensive measurements to fuzziness and the lack of a natural origin for memberships. In their experiments, two techniques, called direct rating and reverse rating, were used. In the direct rating procedure, the subject is presented with a random series of persons (houses) and then asked to indicate the membership degree to rate each one as tall (or pleasing). In the reverse rating procedure, the subject is presented with an ordered series of persons (houses) and asked to select the one person (house) best seeming to correspond to the indicated degree of membership in the category of tall persons (or pleasing houses). Another method proposed involves the polling technique [3,4]. This technique assumes that semantic uncertainty is merely a statistical uncertainty in the information-theoretic sense. The values of membership functions are found by randomly and repeatedly presenting a subject with elements and acquiring either a 'yes' or a 'no' response to the question: Does x belong to A? This polling method implies that probability of a positive answer is proportional to membership value. This interpretation of membership corresponds to the likelihood view. Saaty [5] used the relative preference method with the assumption that membership value is on a rational scale. Membership values are computed from the matrix A--[aij], called the pairwise comparison alternative matrix, whose element aij represents the relative membership value of an element xi in a fuzzy set F with respect to the membership value of an element xj in F. The element aji in the jth row and ith column of this matrix is 1/aij. Saaty used a

394

S. Medasani et al. I Internat. J. Approx. Reason. 19 (1998) 391-417

scale divided into seventeen levels {1/9, 1 / 8 , . . . , 1/21 l, 2 , . . . , 8, 9}. Each of these levels has a semantic interpretation. The larger the value of a v, the greater the membership of xi compared with that of xj. The membership values are determined by finding the eigenvector of A such that Au = nu, where A is assumed to be as consistent as possible (i.e., the matrix A really represents relative membership values), n is the number of elements in the reference set and u is the membership vector [ui, u2, u3,..., u,] T. Parametrized membership functions have been suggested by several researchers. Kochen and Badre [6] assumed that a membership function is continuous, differentiable and S-shaped. Furthermore, the marginal increase of a person's strength of belief that "x is A" is assumed to be proportional to the strength of his/her belief that "x is A" and to the strength of his/her belief that "x is not A". That is, du(x) = ku(x)(1 - u(x)) dx and its solution is given by 1

u(x) = 1 + exp(a - bx)"

(1)

This can be easily linearized since I n ( ( 1 - u ( x ) ) / u ( x ) ) = a - bx, and the parameters a and b can be determined by statistical methods such as regression. A similar membership function was derived by Zimmermann and Zysno [7]. Their parametrization of a membership function is 1

u(x) = 1 + d(x-------5 Membership is defined as a function of the distance d(x) between a given object x and a standard (ideal) member. This corresponds to the typicality view discussed in Section 1. Hence, d(x) = 0 =~ u = 1;d(x) = c¢ =~ u = 0. The distance function d(x) can be l/x, but, due to the evidence that the relationship between a physical unit and perception is generally exponential, a more appropriate distance function is 1 / e x p ( - a (x - b)) where the parameter a models the slope of the membership function and b represents the point at which the tendency of the subject's attitude changes from being rather positive to being rather negative (inflection point). Another parametrized membership function suggested by Dombi [8] is given by (1 - v)l-'~(x - a) "~ 2 > 1, u(x) = (1 - v)l-X(x - a) ~ + vX-l(b - x ) "~'

where (a, b) is the interval on which u(x) has non-zero membership values, 2 is the sharpness parameter, and v is interpreted as an expectation level. It is easy

S. Medasani et al. / lnternat. J. Approx. Reason. 19 (1998) 391-417

to show that this membership function has the form of 1/(1 + distance function is

395

d(x)),when the

d(x) ( All methods reviewed in this category lack a general principle (such as maximum likelihood to estimate probability density). However, it should be noted that our understanding of how the human mind perceives and manipulates vague categories is quite primitive and inconclusive.

3. Heuristic methods Heuristic methods use predefined shapes for membership functions and have been used successfully in rule-based pattern recognition applications [9]. In computer vision, heuristic membership functions may be used to describe certain spatial relations (such as "above" and "to the left of" [10,11]) and certain properties (such as lightness or darkness of a pixel value, position of a pixel, and narrowness of a region). Here, we present a few frequently used shapes for heuristic membership functions.

Piecewiselinearfunctions:

The membership functions may be chosen to be linearly increasing, linearly decreasing or a combination of these.

Example 1. x

#(x) -- 1 - - ; a

p(x) =-;x where X --- [0, a] is a reference set. a

Example 2. #(x)

I'l-la-xl

0 otherwise. The membership functions may also be chosen as piecewise linear functions, i.e., they have linearly increasing, decreasing and flat regions.

Example 3.

IO ifx

Suggest Documents