
C2.3 STATISTICAL MODELS

Norberto Corral, María Angeles Gil, María Teresa López, Antonia Salas
Departamento de Matemáticas, Universidad de Oviedo, C/ Calvo Sotelo, s/n, 33007 Oviedo, Spain

and

Carlo Bertoluzza
Dipartimento di Informatica e Sistemistica, Università degli studi di Pavia, Via Ferrata, 1, 27100 Pavia, Italy
e-mail: [email protected]

Abstract

This section presents concepts and results concerning the combination of fuzzy and statistical modeling. For this purpose, the incorporation of fuzzy elements in statistical problems is first discussed. Then, some of the problems and approaches most deeply analyzed in the literature are briefly summarized for both the univariate and multivariate cases. Finally, references are made to some additional related studies.

C2.3.1 Introduction

Statistics is concerned with the measurement and use of the available information to make statistical inferences or decisions. These inferences or decisions refer to random phenomena, and uncertainty is commonly involved in most of the elements of statistical problems. The possible types and sources of uncertainty are diverse, and they cannot be classified in a unique way. Different models can be considered in accordance with the nature of the uncertainty.

The best-founded approach to modeling uncertainty is that based on Probability Theory, whose basic elements are the experimental outcomes, the events of interest, and their associated probabilities. This approach is the most suitable tool for managing problems involving randomness, which is the uncertainty associated with the occurrence of well-defined experimental outcomes or events. To deal with problems involving fuzziness, which is the uncertainty associated with the definition or meaning of ill-defined elements, Fuzzy Set Theory supplies a powerful mathematical framework. The basic elements of Fuzzy Set Theory are fuzzy sets, together with the fuzzy operations and rules for them.

Since Fuzzy Set Theory was introduced (Zadeh 1965), several studies have been developed to connect Statistics and Fuzzy Set Theory. Some of these studies have been devoted to analyzing differences and similarities between fuzzy sets and probabilities (see, for instance, Hisdal 1982, 1988, Gil 1993), between fuzziness and randomness (see Goodman and Nguyen 1985, Klir 1989, Goodman et al. 1991, Weber 1991, and others, most of them discussing Lindley’s comments (1982, 1987) on the inadequacy of approaches other than probability for describing uncertainty) or, alternatively, between fuzzy and random sets (see, for instance, Goodman 1982, Wang and Sanchez 1983, Goodman and Nguyen 1985, Wang 1987, Gil 1992, and Section B2.3 in this handbook).

Other studies have compared statistical and fuzzy methods for representing and managing uncertainty from different viewpoints. Interesting references on this last topic can be found in the literature (see, for instance, Dubois and Prade 1980, Goodman and Nguyen 1985, Graham and Jones 1988, Kruse et al. 1991, Zimmermann 1991, and more recently Laviolette et al. 1995 and the discussions in it, along with Chapters A2 and A1 in this handbook). The focus of this section is limited to studies establishing and handling concepts and techniques which combine fuzzy and statistical modeling.

C2.3.2 Statistical modeling of fuzzy elements

Statistical modeling first focuses on establishing an appropriate model for random experiments, and then on developing methods to make inferences or decisions about them. To perform a random experiment is to make or observe something happening that leads to an outcome which cannot be predicted in advance. Most random experiments involve counting or measuring, so that they involve the observation of a random variable which associates a real or vectorial value with each experimental outcome.

The model representing such an experiment is given by two probability spaces. The first one formalizes the situation before introducing the random variable, and will be denoted by $(\Omega, \mathcal{A}, P)$, where Ω is the sample space or space of experimental outcomes, $\mathcal{A}$ is the σ-field of events of interest, which are identified with classical sets of Ω, and P is a probability measure on $\mathcal{A}$. The second probability space formalizes the situation after introducing the random variable X (which is a Borel-measurable real- or vectorial-valued function), and is referred to as the induced probability space, $(\mathbb{R}, \mathcal{B}_{\mathbb{R}}, P)$ if X is one-dimensional, or $(\mathbb{R}^k, \mathcal{B}_{\mathbb{R}^k}, P)$ if X is k-dimensional.

When a Bayesian framework is considered, the probability measure P is not completely known, but it can be specified through a random process from a family of probability measures in accordance with a prior distribution ξ. Once the experimental distribution P is specified, the experimental performance leads to an experimental outcome ω ∈ Ω. Then, the random variable X associates with ω a real or vectorial value x = X(ω), which is assumed to be exactly reported by the observer. Finally, given an event of interest (X ∈ B), with $B \in \mathcal{B}_{\mathbb{R}}$ or $\mathcal{B}_{\mathbb{R}^k}$, it is possible to answer after the experimental performance whether it has occurred or not.

The specification of the experimental distribution (whenever a Bayesian framework is considered) and the experimental performance take place under randomness, whereas the numerical quantification stated by the random variable, the observation report, and the answer about the occurrence of events of interest are supposed to take place under certainty. Studies combining fuzzy and statistical modeling commonly assume that fuzziness arises in some of the last three experimental stages.

Thus, sometimes the assessment of (prior or experimental) probabilities cannot be numerically stated; rather, the available evidence or the degrees of belief supporting them are more properly expressed in terms of fuzzy values, like quite probable, very likely, and so on. A combined model to deal with these ideas is given by the concept of fuzzy probabilities (or fuzzy-valued probabilities), which has been examined in the literature. In this respect, we can point to the work done by Negoita and Ralescu (1975), Zadeh (1975, 1984, 1995), Nguyen (1979), Watson et al. (1979), Yager (1979, 1984), Freeling (1980), Dubois and Prade (1982, 1985, 1989), Negoita and Ralescu (1987), Rappoport et al. (1987), Jain and Agogino (1988), Stein and Zwick (1988), Zwick and Wallsten (1990), Utkin (1993), and Ralescu (1995a) (see also Section B2.7 in this handbook).

On the other hand, the quantification process defined by the random variable can occasionally associate fuzzy values, like a small number, much greater than 7, and so on, with the experimental outcomes. A combined model to deal with these ideas is given by the concept of fuzzy random variable, as introduced by Puri and Ralescu (1986). These authors understand a fuzzy random variable as a “measurable” fuzzy-valued function defined from the sample space Ω to the class $\mathcal{F}(\mathbb{R}^k)$ of fuzzy sets of $\mathbb{R}^k$, modeling an existing imprecise quantification process. Variable values are assumed to be normal, and to have compact (usually convex) α-cuts and support.
Measurability of these variables is formalized by assuming that the set-valued α-cut functions are random compact (usually convex) sets associated with the induced probability space. A convenient constraint to model bounded fuzzy random variables has been added by Stojaković (1992), who has supposed that the closed convex hull of the support of each variable value is a compact convex set and that the set-valued function given by the closed convex hull of the support is a random compact convex set. Fuzzy random variables can now be viewed as a generalization of both random sets and random variables, and essential references on this topic are the papers by Puri and Ralescu (1985, 1986), Stojaković (1992, 1994), and Ralescu (1995bc), as well as the book by Negoita and Ralescu (1987) and the work by López-Díaz (1996).

Fuzzy random variables were first introduced in the literature by Kwakernaak (1978, 1979), and slightly modified by Kruse and Meyer (1987). In contrast to the concept presented by Puri and Ralescu, fuzzy random variables were stated by Kwakernaak, Kruse and Meyer as a model to describe fuzzy reports of existing numerical variable values. More precisely, a fuzzy random variable was interpreted as a fuzzy perception of a classical random variable (which is referred to as the original of the fuzzy one), although it was also defined as a function from Ω to the class $\mathcal{F}(\mathbb{R})$. Measurability of fuzzy random variables in this definition was formalized by assuming that the two real-valued functions given by the infimum and supremum of the α-cuts are Borel-measurable.

Zhong and Zhou (1987) have proven that, when variable values are assumed to be normal fuzzy sets of $\mathbb{R}$ with compact convex α-cuts, Puri and Ralescu’s and Kwakernaak-Kruse and Meyer’s definitions are equivalent whenever the measurability of the set-valued α-cut functions in Puri and Ralescu’s definition is understood in one of the strongest senses (see Hiai and Umegaki 1977). However, Puri and Ralescu (1986) have often considered the weakest measurability condition, which would coincide with the other ones for complete probability spaces.

The last concept is one of the combined models to represent situations in which the numerical quantification involved in the random experiment is exact, but the observation report is expressed by means of a fuzzy value. A well-known alternative model to represent these situations is that based on the concept of fuzzy information, and eventually on the concept of fuzzy information system (see Okuda et al. 1978, Tanaka et al. 1979). A fuzzy information is a Borel-measurable fuzzy set of the space of variable values, X(Ω), and a fuzzy information system is a fuzzy partition (in Ruspini’s sense 1969, 1970) of X(Ω) into Borel-measurable fuzzy sets.

Fuzziness can also arise in defining events of interest, so that one cannot answer whether these events occur or not, but can only specify their “degree of occurrence”. The concept of fuzzy information we have just mentioned provides us with a proper model to deal with such events. Although fuzziness could affect the experimental outcomes, Statistics does not assume these elements must be real-valued or well-defined, and hence a combined model for them is not required.
Fuzziness in the assessment of the experimental distribution has scarcely been investigated, but the recent work by Thomas (1995) addresses this question by considering fuzzy ranges of probability models.

Buckley (1985) has considered an essentially different approach to statistics with fuzzy data. In Buckley’s approach, fuzziness is the only type of uncertainty involved in the experimental stages (in particular, instead of stages of assessment of prior and experimental probabilities, one could refer to stages of assessment of prior and experimental possibilities). Statistical problems are viewed in Buckley’s approach as special fuzzy decision problems, and optimal estimation and testing methods are obtained.

Most of the studies establishing and handling concepts and techniques which combine fuzzy and statistical modeling concern Statistics with fuzzy experimental data. In this way, descriptive statistics, along with statistical inference and decision problems and methods, have been developed for fuzzy experimental data. The next two subsections briefly summarize some of these studies, which are based on different approaches to fuzzy data. Section C2.3.3 deals with univariate statistics and Section C2.3.4 with multivariate statistics.
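As an illustration of the fuzzy information system mentioned above, the following Python sketch builds a fuzzy partition in Ruspini's sense from triangular membership functions and checks that the membership degrees add up to 1 at every point of a grid. The variable range [0, 10], the three labels, and the helper names `triangular`, `memberships`, and `is_ruspini_partition` are illustrative assumptions and do not come from the cited works.

```python
def triangular(left, peak, right):
    """Return the membership function of a triangular fuzzy set.

    A shoulder-shaped set is obtained with left == peak or peak == right.
    """
    def mu(x):
        if x < left or x > right:
            return 0.0
        if x == peak:
            return 1.0
        if x < peak:
            return (x - left) / (peak - left)
        return (right - x) / (right - peak)
    return mu


# Hypothetical fuzzy information system on the variable range [0, 10].
FUZZY_INFORMATION_SYSTEM = {
    "low": triangular(0.0, 0.0, 5.0),
    "medium": triangular(0.0, 5.0, 10.0),
    "high": triangular(5.0, 10.0, 10.0),
}


def memberships(x):
    """Degree to which the exact value x is compatible with each fuzzy report."""
    return {label: mu(x) for label, mu in FUZZY_INFORMATION_SYSTEM.items()}


def is_ruspini_partition(points, tol=1e-9):
    """Check the orthogonality condition: memberships sum to 1 at each point."""
    return all(abs(sum(memberships(x).values()) - 1.0) <= tol for x in points)


grid = [i / 10 for i in range(101)]
print(memberships(3.0))
print(is_ruspini_partition(grid))
```

The check is only carried out on a finite grid, so it is a numerical sanity test rather than a proof of orthogonality over the whole range.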

C2.3.3 Univariate statistics of fuzzy experimental data

Univariate statistics of fuzzy experimental data refers to an assortment of descriptive, inferential, and statistical decision techniques to manage situations in which a one-dimensional (fuzzy or fuzzily reported) random variable is involved.

C2.3.3.1 Descriptive statistics with fuzzy data. The aim of descriptive statistics is to develop procedures for data collection, classification, summarization, and presentation. Kruse and Meyer (1987) have extended this purpose to fuzzy data by means of fuzzy random variables.

In the Kruse and Meyer approach to descriptive statistics, the first question considered is that of modeling the fuzzy random variables providing fuzzy data. In accordance with Kwakernaak’s definition (1978, 1979), slightly modified by Kruse and Meyer (1987) (see also Gebhardt et al. 1996), given a probability space $(\Omega, \mathcal{A}, P)$, a fuzzy random variable associated with it is a mapping $\mathcal{X} : \Omega \to \mathcal{F}_c(\mathbb{R})$ ($\mathcal{F}_c(\mathbb{R})$ being the class of normal fuzzy sets A of $\mathbb{R}$ such that for each α ∈ (0, 1] the α-cut $A_\alpha$ is compact) such that the mappings $\inf \mathcal{X}_\alpha$ and $\sup \mathcal{X}_\alpha$, defined so that

$$\inf \mathcal{X}_\alpha : \Omega \to \mathbb{R},\ \omega \mapsto \inf (\mathcal{X}(\omega))_\alpha; \qquad \sup \mathcal{X}_\alpha : \Omega \to \mathbb{R},\ \omega \mapsto \sup (\mathcal{X}(\omega))_\alpha,$$

are classical random variables.

The second relevant question in descriptive statistics is that of summarizing fuzzy data. Kruse and Meyer have answered this question by establishing some measures of central tendency and dispersion, and the fractiles, of fuzzy data supplied by a sample (or a finite population in the case of a census), $\Omega = \{\omega_1, \dots, \omega_n\}$. In particular, the expected value, the variance, and the fractiles of a fuzzy random variable have been introduced as extensions of the mean, variance, and fractiles, respectively, of classical random variables by using Zadeh’s extension principle. Thus, if M denotes the summarizing descriptive measure in the nonfuzzy case, then the corresponding descriptive fuzzy measure for a fuzzy random variable $\mathcal{X}$ is given by the fuzzy set $M(\mathcal{X})$ of $\mathbb{R}$ such that

$$M(\mathcal{X})(t) = \sup_{X \in \mathcal{V}(\Omega, \mathcal{A}),\ M(X) = t}\ \inf_{\omega \in \Omega} \mathcal{X}(\omega)(X(\omega)),$$

where $\mathcal{V}(\Omega, \mathcal{A})$ is the set of classical random variables associated with the measurable space $(\Omega, \mathcal{A})$.

Another interesting descriptive question which has been formulated is that of presenting fuzzy data through a distribution representation: the fuzzy empirical distribution function for fuzzy data. This function is also defined on the basis of Zadeh’s extension principle. Thus, if $\Omega = \{\omega_1, \dots, \omega_n\}$ is the sample (or population) to be described, and $\mathcal{X}$ is the fuzzy random variable on it, then the fuzzy empirical distribution function associated with $(\mathcal{X}(\omega_1), \dots, \mathcal{X}(\omega_n))$ is given by the mapping $S_n^{\mathcal{X}}$ from $\mathbb{R}$ to $\mathcal{F}(\mathbb{R})$ such that, if $S_n^X$ denotes the classical empirical distribution function of the classical random variable X, then

$$S_n^{\mathcal{X}}(x)(t) = \begin{cases} \displaystyle \sup_{X \in \mathcal{V}(\Omega, \mathcal{A}),\ S_n^X(x) = t}\ \inf_{\omega \in \Omega} \mathcal{X}(\omega)(X(\omega)), & \text{if } t \in \{0, 1/n, 2/n, \dots, 1\}, \\ 0, & \text{otherwise.} \end{cases}$$
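The fuzzy empirical distribution function can be evaluated directly for a small sample. The Python sketch below does so under several assumptions that are not in the original text: the fuzzy data are triangular fuzzy numbers written as hypothetical triples `(left, peak, right)`, every real-valued selection on the finite sample is admitted as a classical random variable, and the supremum is found by brute force over the subsets of observations placed at or below `x`. For a fixed subset, the supremum of the minimum membership equals the minimum of the per-observation suprema, because the coordinates can be chosen independently.

```python
from itertools import combinations


def possibility_at_most(tri, x):
    """Supremum of the membership of the triangular set tri over values y <= x."""
    left, peak, right = tri
    if x >= peak:
        return 1.0
    if x <= left:
        return 0.0
    return (x - left) / (peak - left)


def possibility_greater(tri, x):
    """Supremum of the membership of the triangular set tri over values y > x."""
    left, peak, right = tri
    if x < peak:
        return 1.0
    if x >= right:
        return 0.0
    return (right - x) / (right - peak)


def fuzzy_empirical_cdf(sample, x):
    """Membership degrees of the fuzzy empirical distribution function at x.

    Returns a dict mapping t = k/n to the degree to which the classical
    empirical distribution function of some original sample equals t at x.
    """
    n = len(sample)
    at_most = [possibility_at_most(a, x) for a in sample]
    greater = [possibility_greater(a, x) for a in sample]
    result = {}
    for k in range(n + 1):
        best = 0.0
        # Choose which k observations have an original value <= x.
        for chosen in combinations(range(n), k):
            chosen = set(chosen)
            degree = min(
                at_most[i] if i in chosen else greater[i] for i in range(n)
            )
            best = max(best, degree)
        result[k / n] = best
    return result


data = [(0.0, 1.0, 2.0), (1.0, 2.0, 3.0), (2.0, 3.0, 4.0)]
print(fuzzy_empirical_cdf(data, 1.5))
```

The enumeration grows exponentially with the sample size, so this sketch is only meant for very small samples.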

C2.3.3.2 Inferential statistics with fuzzy data. Inferential statistics of fuzzy experimental data concerns the drawing of conclusions from the statistical evidence given by a sample from the random experiment (along with the evidence given by the prior information, if a Bayesian context is considered). These conclusions refer to the experimental distribution or some of its parameters. Different approaches to statistical inference with fuzzy data have been developed, the most remarkable ones being those based either on the concept of fuzzy random variable or on the concept of fuzzy information.

• Approaches based on fuzzy random variable models

The studies on inferential statistical techniques with fuzzy data which use the models given by fuzzy random variables (both Kwakernaak-Kruse and Meyer’s and Puri and Ralescu’s) concern the drawing of conclusions about fuzzy parameters (fuzzy perceptions of unknown classical parameters) characterizing these fuzzy variables. Sometimes Kruse and Meyer’s studies refer to inferences about real-valued parameters characterizing the underlying classical random variable.

In this way, Kruse and Meyer (1987) (see also Gebhardt et al. 1996) have generalized the concepts of fuzzy expected value, fuzzy variance, and fuzzy empirical distribution of a fuzzy random variable to general (finite or infinite) populations by employing Zadeh’s extension principle. Independence of fuzzy random variables has also been established by first introducing multi-dimensional fuzzy distribution functions. In an analogous way, the identity in distribution of fuzzy random variables is defined. Equivalent conditions for both independence and identity in distribution are derived in terms of the independence and identity in distribution, respectively, of the classical random variables given by the infimum and supremum of the α-cuts.

On the basis of these concepts, several probabilistic and inferential notions and methods have been developed:

− Regarding the probabilistic results, some limit theorems have been obtained (Kruse and Meyer 1987) to make the theory of fuzzy random variables useful for statistical purposes. Thus, some strong laws of large numbers, based either on the Hausdorff pseudometric or on the generalized Hausdorff pseudometric, have been stated. A central limit theorem has been presented, where a sequence of independent and identically distributed fuzzy random variables is said to be asymptotically normally distributed if it is asymptotically convex and the sequences of random variables given by the infimum and supremum of the α-cuts are asymptotically normally distributed.

− Regarding the point estimation of fuzzy parameters (like the fuzzy expected value, fuzzy variance, etc.) from fuzzy data, we can point out notions and results concerning properties such as the consistency, strong consistency, and unbiasedness of fuzzy estimators (Kruse and Meyer 1987). On the other hand, the suggested method to obtain fuzzy estimates for a given sample realization makes use of Zadeh’s extension principle applied to classical estimates based on realizations of a classical random sample. These notions and results are based on the concept of fuzzy random sample, which is understood as a finite sequence of independent and identically distributed fuzzy random variables.

− Regarding the interval estimation of fuzzy parameters from fuzzy data, Kruse and Meyer (1987, 1988) have again developed many ideas and techniques. Thus, after defining one-sided, two-sided, and general fuzzy confidence intervals with confidence coefficient 1 − δ (δ ∈ (0, 1)), procedures are given to construct general fuzzy intervals from classical ones.

− Regarding the testing of statistical hypotheses about fuzzy parameters from fuzzy data, the concepts of test function and of one- and two-sided hypotheses have been extended to deal with fuzzy random variables. Methods based on fuzzy confidence intervals, along with other ones (like those extending the classical chi-square and Kolmogorov-Smirnov tests of goodness of fit), have also been examined (see Kruse and Meyer 1987, Gebhardt et al. 1996).

As an example of a fuzzy statistical technique based on Kruse and Meyer’s approach, we can mention the following one (Kruse and Meyer 1987):

Theorem. Let δ ∈ (0, 1) and $n \in \mathbb{N}$. Let $\mathcal{P} = \{P_\theta, \theta \in \Theta\}$ be a parametric class of probability distributions of a random experiment X whose distribution depends on the real-valued parameter θ ∈ Θ. Let $T_n^1 : \mathbb{R}^n \to \mathbb{R}$ and $T_n^2 : \mathbb{R}^n \to \mathbb{R}$ be two continuous functions, nondecreasing in each component, with $T_n^1 \le T_n^2$ in $\mathbb{R}^n$, and such that $[T_n^1, +\infty)$ and $(-\infty, T_n^2]$ are random intervals determining confidence intervals of θ with confidence coefficients $1 - \delta_1$ and $1 - \delta_2$, respectively ($\delta_1, \delta_2 \in (0, 1)$ and $\delta_1 + \delta_2 = \delta$). Then, given a sample of fuzzy observations $(A_1, \dots, A_n)$ from the fuzzy random variable $\mathcal{X}$ (the $A_i$, i = 1, ..., n, being assumed to be normal fuzzy sets of $\mathbb{R}$ belonging to $\mathcal{X}(\Omega)$), the fuzzy interval $I_n(A_1, \dots, A_n)$ such that for each α ∈ (0, 1]

$$(I_n(A_1, \dots, A_n))_\alpha = \begin{cases} [T_n^1(\inf A_{1\alpha}, \dots, \inf A_{n\alpha}),\ T_n^2(\sup A_{1\alpha}, \dots, \sup A_{n\alpha})], & \text{if } T_n^1(\inf A_{1\alpha}, \dots, \inf A_{n\alpha}) > -\infty,\ T_n^2(\sup A_{1\alpha}, \dots, \sup A_{n\alpha}) < +\infty; \\ [T_n^1(\inf A_{1\alpha}, \dots, \inf A_{n\alpha}),\ +\infty), & \text{if } T_n^1(\inf A_{1\alpha}, \dots, \inf A_{n\alpha}) > -\infty,\ T_n^2(\sup A_{1\alpha}, \dots, \sup A_{n\alpha}) = +\infty; \\ (-\infty,\ T_n^2(\sup A_{1\alpha}, \dots, \sup A_{n\alpha})], & \text{if } T_n^1(\inf A_{1\alpha}, \dots, \inf A_{n\alpha}) = -\infty,\ T_n^2(\sup A_{1\alpha}, \dots, \sup A_{n\alpha}) < +\infty; \\ \mathbb{R}, & \text{otherwise} \end{cases}$$

is a fuzzy interval of the fuzzy parameter $\theta(\mathcal{X})$ for the sample fuzzy realization $(A_1, \dots, A_n)$ with confidence coefficient 1 − δ, where $\theta(\mathcal{X})$ has been obtained by using Zadeh’s extension principle.
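The construction in the theorem can be sketched numerically in Python for one familiar case. The sketch assumes, purely for illustration, that the underlying classical variable is normal with known standard deviation `sigma`, that θ is its mean, and that the fuzzy observations are triangular fuzzy numbers given as hypothetical triples `(left, peak, right)`. Under these assumptions the classical one-sided bounds are the sample mean minus or plus a normal quantile times `sigma / sqrt(n)`, which are continuous and nondecreasing in each component, and both bounds are always finite, so only the first case of the theorem occurs.

```python
from math import sqrt
from statistics import NormalDist, fmean


def alpha_cut(tri, alpha):
    """Closed interval [inf, sup] of the alpha-cut of a triangular fuzzy number."""
    left, peak, right = tri
    return (left + alpha * (peak - left), right - alpha * (right - peak))


def fuzzy_confidence_cut(sample, alpha, sigma, delta1, delta2):
    """alpha-cut of the fuzzy confidence interval for a normal mean.

    The lower bound applies the classical lower limit to the infima of the
    alpha-cuts; the upper bound applies the classical upper limit to the suprema.
    """
    n = len(sample)
    cuts = [alpha_cut(a, alpha) for a in sample]
    z1 = NormalDist().inv_cdf(1.0 - delta1)
    z2 = NormalDist().inv_cdf(1.0 - delta2)
    lower = fmean(c[0] for c in cuts) - z1 * sigma / sqrt(n)
    upper = fmean(c[1] for c in cuts) + z2 * sigma / sqrt(n)
    return lower, upper


observations = [(4.0, 5.0, 6.0), (5.5, 6.0, 7.0), (3.0, 4.0, 4.5)]
for level in (0.5, 1.0):
    print(level, fuzzy_confidence_cut(observations, level, 1.0, 0.025, 0.025))
```

At α = 1 the cut reduces to the classical confidence interval computed from the peaks, and lower levels give wider, nested intervals.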

To apply the preceding inferential techniques more easily in practice, a software tool called SOLD has been developed (see Kruse and Meyer 1987, Kruse and Gebhardt 1989, and Gebhardt et al. 1996).

On the other hand, Ralescu (1982, 1996) and Ralescu and Ralescu (1984, 1986) have also developed some studies on inferential statistics from fuzzy data based on the model given by fuzzy random variables in Puri and Ralescu’s sense. In accordance with Puri and Ralescu’s definition (1985, 1986), given a probability space $(\Omega, \mathcal{A}, P)$, an integrably bounded fuzzy random variable associated with it is a mapping $\mathcal{X} : \Omega \to \mathcal{F}_c^*(\mathbb{R}^k)$ ($\mathcal{F}_c^*(\mathbb{R}^k)$ being the class of normal fuzzy sets A of $\mathbb{R}^k$ such that for each α ∈ (0, 1] the α-cut $A_\alpha$ is compact, and often convex, and the support is compact), such that the set-valued functions $\mathcal{X}_\alpha$, defined so that

$$\mathcal{X}_\alpha : \Omega \to \mathcal{K}_c^*(\mathbb{R}^k), \quad \omega \mapsto (\mathcal{X}(\omega))_\alpha$$

(where $\mathcal{K}_c^*(\mathbb{R}^k)$ is the class of nonempty compact sets of $\mathbb{R}^k$), are integrably bounded random compact (and often convex) sets (see, for instance, Kendall 1974, Matheron 1975, Giné and Hahn 1985).

Recently, this definition has been slightly revised by Stojaković (1992, 1994) to deal with bounded fuzzy random variables. In this revision, two assumptions have been added: for each fuzzy variable value A, the closed convex hull of supp(A) is assumed to be a compact set, and the set-valued function given by the closed convex hull of $\mathrm{supp}(\mathcal{X})$, defined so that

$$\mathrm{Co}\,\mathcal{X} : \Omega \to \mathcal{K}_{cc}^*(\mathbb{R}^k), \quad \omega \mapsto \mathrm{Co}\,\mathcal{X}(\omega) = \mathrm{cl}\,[\mathrm{co}(\mathrm{supp}(\mathcal{X}(\omega)))]$$

(where $\mathcal{K}_{cc}^*(\mathbb{R}^k)$ is the class of nonempty compact convex sets of $\mathbb{R}^k$), is an integrably bounded random compact convex set.

Of course, even in the case in which Puri and Ralescu’s and Kwakernaak-Kruse and Meyer’s definitions coincide, the conditions added by Stojaković cannot all be guaranteed in Kwakernaak-Kruse and Meyer’s approach to fuzzy random variables.

Given a probability space $(\Omega, \mathcal{A}, P)$ and an associated integrably bounded fuzzy random variable $\mathcal{X}$, the fuzzy expected value of $\mathcal{X}$ has been defined by Puri and Ralescu (1986) for nonatomic probability spaces (and later by Stojaković 1992, 1994, for complete probability spaces) as the only fuzzy set $E(\mathcal{X})$ of $\mathbb{R}^k$ such that for each α ∈ (0, 1]

$$(E(\mathcal{X}))_\alpha = \int_\Omega \mathcal{X}_\alpha \, dP,$$

where $\int_\Omega \mathcal{X}_\alpha \, dP$ denotes the (set-valued) Aumann integral of the set-valued function $\mathcal{X}_\alpha$ (cf. Aumann 1965). $E(\mathcal{X})$ can be proven to belong to $\mathcal{F}_c^*(\mathbb{R}^k)$.

Several studies based on Puri and Ralescu’s definition have been devoted to statistical applications, although related studies have been mainly focused on establishing the adequate probabilistic basis to develop statistics later. Thus,

− Regarding the probabilistic results, some limit theorems, such as a strong law of large numbers and a central limit theorem, have been obtained for these fuzzy random variables (see Ralescu and Ralescu 1984, Klement et al. 1984, 1986, Negoita and Ralescu 1987, Ralescu 1995b). In the last result, the normality of fuzzy random variables has been understood in accordance with Puri and Ralescu’s definition (1985), which is an extension of the concept of normality of random sets (Lyashenko 1983). In fact, these limit theorems generalize those for random sets (see Artstein and Vitale 1975, Weil 1982). In Ralescu (1995b) a summary of these probabilistic results, along with a discussion of other ones given by other authors, is presented.

− Regarding the inferential problems (point and interval estimation, and hypothesis testing) from fuzzy data and for fuzzy parameters, studies have been suggested (see Ralescu 1982, Ralescu and Ralescu 1984, 1986). In Ralescu (1995bc), several inequalities with useful implications in Statistics (like the Brunn-Minkowski and the Jensen ones) have been extended first to random sets and later to fuzzy random variables. Applications of these inequalities in testing hypotheses are commented on (Ralescu 1995bc), and other possible ones in Statistical Information concerning fuzzy data could be easily derived.

As an example of a fuzzy statistical result based on Puri and Ralescu’s approach, we can mention the following one (Ralescu 1995bc):

Theorem. If $\mathcal{X}$ is a fuzzy random variable and $\varphi : \mathcal{F}_c^*(\mathbb{R}^k) \to \mathbb{R}$ is a convex function (i.e., $\varphi(\lambda \odot U \oplus (1 - \lambda) \odot V) \le \lambda \varphi(U) + (1 - \lambda) \varphi(V)$ for all $U, V \in \mathcal{F}_c^*(\mathbb{R}^k)$, with $\odot$ and $\oplus$ the fuzzy product and sum, respectively), then we have that $\varphi(E(\mathcal{X})) \le E(\varphi(\mathcal{X}))$.

• Approach based on fuzzy information models

The studies on inferential statistical techniques with fuzzy data which use the models given by fuzzy information (and, quite often, also by fuzzy information systems), as intended by Okuda et al. (1978), Tanaka et al.
(1979), concern the drawing of conclusions about classical parameters or the probability distribution of the underlying classical random variable, on the basis of the available fuzzy information from the observer report.

In accordance with Okuda et al. (1978) and Tanaka et al. (1979), given a probability space $(\Omega, \mathcal{A}, P)$ and a classical random variable X associated with $(\Omega, \mathcal{A})$, a Borel-measurable fuzzy set of X(Ω) is called fuzzy information associated with the random experiment involving X. Sometimes, a model to represent the class of the available observer reports is employed. This model corresponds to a fuzzy information system associated with the random experiment involving the observation of X, which is assumed to be a fuzzy partition $X^*(X(\Omega))$ of X(Ω) (i.e., to satisfy the orthogonality condition $\sum_{A \in X^*(X(\Omega))} A(x) = 1$ for all x ∈ X(Ω)). It should be emphasized that the orthogonality condition is only occasionally required to develop a few of the statistical techniques based on this model.

One of the characteristic features of the present model is that the available probabilistic information in it refers to classical sets of real values (in contrast to the models based on fuzzy random variables, in which the probabilistic information refers to fuzzy sets of real or vectorial values). Since inferences are supposed to be based on fuzzy data, a probabilistic system associated with them becomes necessary. To this purpose, Okuda, Tanaka and Asai have adopted Zadeh’s probabilistic definition (1968), according to which the probability of the fuzzy information $A \in X^*(X(\Omega))$ induced by the experimental distribution P is given by the Lebesgue-Stieltjes integral

$$P(A) = \int_{X(\Omega)} A(x) \, dP(x).$$

The preceding definition can be viewed (see Gil 1993, Gebhardt et al. 1996) as an application of the generalization of random experiments given by Le Cam (1964, 1986), which are referred to as single stage experiments. Of course, if we take a σ-field of measurable fuzzy sets of X(Ω), say the σ-field $\sigma(X^*(X(\Omega)))$ generated by $X^*(X(\Omega))$, an induced probability space can be immediately established. Thus, $(X^*(X(\Omega)), \sigma(X^*(X(\Omega))), P)$ is a probability space, since $\sum_{A \in X^*(X(\Omega))} P(A) = 1$.

To represent sample observer reports, Corral and Gil (1984) and Gil et al. (1984) have extended the concepts of fuzzy information and fuzzy information system. In this way, if $A_i$ is a fuzzy set of X(Ω) for i = 1, ..., n, the n-tuple $(A_1, \dots, A_n)$ denoting the algebraic product of $A_1, \dots, A_n$ (i.e., $(A_1, \dots, A_n)(x_1, \dots, x_n) = A_1(x_1) \cdot \ldots \cdot A_n(x_n)$ for all $x_1, \dots, x_n \in X(\Omega)$) is called sample fuzzy information of size n associated with the random experiment. The appropriateness of the (product) aggregation considered above is discussed in Römer and Kandel (1995) and Ralescu and Ralescu (1996), where other (often generalized) aggregations are also analyzed. In the case that $X^*(X(\Omega))$ is a fuzzy information system, the fuzzy information system $X^{*n}(X(\Omega))$ containing all the sample fuzzy data $(A_1, \dots, A_n)$, $A_i \in X^*(X(\Omega))$, i = 1, ..., n, is referred to as a fuzzy random sample of size n from $X^*(X(\Omega))$.
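For a discrete experiment, Zadeh's probability of a fuzzy event and the product form of the sample fuzzy information reduce to finite sums, which the following Python sketch computes. The fair die, the three tabulated fuzzy reports, and the function names are hypothetical choices made for illustration, and the sample computation assumes independent and identically distributed observations.

```python
from itertools import product
from math import prod

# Hypothetical experiment: a fair die, X(Omega) = {1, ..., 6}.
DISTRIBUTION = {x: 1 / 6 for x in range(1, 7)}

# Hypothetical fuzzy information system; memberships sum to 1 at each outcome.
SYSTEM = {
    "small": {1: 1.0, 2: 0.8, 3: 0.3, 4: 0.0, 5: 0.0, 6: 0.0},
    "medium": {1: 0.0, 2: 0.2, 3: 0.7, 4: 0.7, 5: 0.2, 6: 0.0},
    "large": {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.3, 5: 0.8, 6: 1.0},
}


def zadeh_probability(membership, distribution):
    """P(A) = sum over x of A(x) * P(X = x), the discrete form of the integral."""
    return sum(membership[x] * p for x, p in distribution.items())


def sample_probability(sample, distribution):
    """Probability of the sample fuzzy information (A_1, ..., A_n).

    The membership of a sample point is the algebraic product of the
    individual memberships; observations are assumed independent.
    """
    total = 0.0
    for xs in product(distribution, repeat=len(sample)):
        degree = prod(a[x] for a, x in zip(sample, xs))
        weight = prod(distribution[x] for x in xs)
        total += degree * weight
    return total


probabilities = {
    label: zadeh_probability(a, DISTRIBUTION) for label, a in SYSTEM.items()
}
print(probabilities)
print(sum(probabilities.values()))
print(sample_probability(
    [SYSTEM["small"], SYSTEM["large"], SYSTEM["medium"]], DISTRIBUTION
))
```

Because of the product aggregation and the independence assumption, the sample probability factorizes into the product of the individual Zadeh probabilities, and the probabilities of the three reports add up to 1 because the system is a fuzzy partition.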
On the basis of the preceding concepts, several well-known techniques of inferential statistics (from both sampling-theory and Bayesian approaches) can be immediately extended with no theoretical inconveniences. These techniques are those requiring only the knowledge either of the likelihood function for the sample fuzzy data, or of the probabilities of some events based on these data. However, some of these extensions can entail computational difficulties, which depend mainly on the shape of the fuzzy data for the different methods. Studies on these extensions can be summarized as follows:

− Regarding the quantification of the information contained in fuzzy data about the experimental distribution or its parameters/states, extensions of several statistical measures in a Bayesian framework have been developed for fuzzy information systems. In this sense, the expected Fisher amount of information, the Shannon information, the Jeffreys invariant of Kullback and Leibler divergence, the Csiszár parametric and nonparametric information, and (if a decision problem with “fuzzy experimentation” is considered) the Raiffa and Schlaifer expected value of sample information have been defined for a fuzzy information system, and their properties have been examined (see Gil et al. 1984, 1985b, 1990, Pardo et al. 1986, Gil and Gil 1992, Gil and López 1993).

− Regarding the comparison of experiments (more precisely, the comparison of fuzzy information systems), several criteria based on the statistical information measures above have been stated along with these measures. Pardo et al. (1989) have also presented an extension of Blackwell’s fundamental sufficiency criterion (1953). Thus, when we can consider several experiments whose distributions depend on the same parameter/state, a comparison among them to select the most informative or preferred one would be useful. Criteria based on the preceding measures preserve the main properties of the nonfuzzy case.

− Regarding the loss of information due to fuzziness in the observer report, the comparison between an experiment and the associated fuzzy information system by means of any information measure has allowed us to conclude that fuzziness always determines a loss of statistical information (see, for instance, Gil 1987, 1988, Okuda 1987). This conclusion has been complemented with the analysis of the choice of the appropriate size of the sample fuzzy information to guarantee the achievement of a desirable level of information, or of the increase of the fuzzy sample size with respect to the nonfuzzy one needed to compensate for the loss of information when only fuzzy observer reports are available.

− Regarding the point estimation of nonfuzzy parameters from fuzzy data, we can first point out the extension of the maximum likelihood method (see Gil and Casals 1988, Gil et al. 1988, 1989), where the likelihood for the sample fuzzy information is numerically given by its induced probability in Zadeh’s sense. Properties of the nonfuzzy case, like the equivariance under one-to-one transformations and the best asymptotically normal behavior under some regularity conditions, are preserved.
However, computational difficulties arise in applying the extension above. These difficulties can be avoided either by replacing fuzzy data with approximate ones (see Gil and Casals 1988, Gil et al. 1989), or by developing alternative extensions of the maximum likelihood method, like the minimum inaccuracy estimation method, which in most cases supplies an operational technique providing a good approximation to maximum likelihood solutions (see Corral and Gil 1984, Gil and Casals 1988, Gil et al. 1988, 1989), or also by developing extensions of some classical approximations of the maximum likelihood method (see Okuda et al. 1991). On the other hand, when a statistical Bayesian decision context is adopted, Bayes point estimation techniques for fuzzy data have been established (see Gil et al. 1985a, Gil 1987). Computational inconveniences often arise in determining the induced posterior distribution given the sample fuzzy data, and they could be removed by approximating fuzzy data (a question which is closely related to that of conjugate distributions in Bayesian Analysis).
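The extended maximum likelihood idea can be illustrated for a small discrete model, where the likelihood of the sample fuzzy information is the product of the Zadeh probabilities of the individual reports. The Python sketch below assumes a Binomial(3, θ) count, two hypothetical fuzzy reports `FEW` and `MANY`, independent observations, and a plain grid search over θ. The grid search is a stand-in chosen for simplicity; it is not the approximation or minimum inaccuracy method of the cited papers.

```python
from math import comb


def binomial_pmf(x, trials, theta):
    """Probability of x successes in the given number of Bernoulli trials."""
    return comb(trials, x) * theta**x * (1 - theta) ** (trials - x)


def fuzzy_likelihood(sample, trials, theta):
    """Likelihood of the sample fuzzy information under Binomial(trials, theta).

    Each factor is the Zadeh probability of one fuzzy report.
    """
    likelihood = 1.0
    for membership in sample:
        likelihood *= sum(
            membership[x] * binomial_pmf(x, trials, theta)
            for x in range(trials + 1)
        )
    return likelihood


def grid_mle(sample, trials, steps=1000):
    """Approximate maximum likelihood estimate of theta on a regular grid."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda theta: fuzzy_likelihood(sample, trials, theta))


# Hypothetical fuzzy reports on the number of successes out of 3 trials.
FEW = {0: 1.0, 1: 0.6, 2: 0.1, 3: 0.0}
MANY = {0: 0.0, 1: 0.4, 2: 0.9, 3: 1.0}

print(grid_mle([FEW, MANY, MANY], trials=3))

# With crisp reports (indicator memberships) the classical estimate is recovered.
crisp = [{x: float(x == obs) for x in range(4)} for obs in (1, 2, 3)]
print(grid_mle(crisp, trials=3))
```

The crisp case serves as a sanity check: for exact observations 1, 2, and 3 the classical estimate is 6/9, and the grid search returns the closest grid point.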

− Regarding the interval estimation of nonfuzzy parameters from fuzzy data, Corral and Gil (1988) have stated a procedure to construct confidence intervals for a parameter from the available sample fuzzy information.

− Regarding the testing of (nonfuzzy) statistical hypotheses from fuzzy data, several methods have been developed. Techniques based exactly or asymptotically on the Neyman-Pearson optimality criterion have been extended to fuzzy data. More precisely, the Neyman-Pearson procedure for testing two simple hypotheses (see Casals et al. 1986a, Casals and Gil 1988) and the likelihood ratio test (Gil et al. 1989) for fuzzy data have been examined. Significance tests, like the chi-square and likelihood ratio tests for goodness of fit, have also been extended to deal with fuzzy sample information (see Gil and Casals 1988, Gil et al. 1988, 1989). On the other hand, when a statistical Bayesian decision context is assumed, Casals et al. (1986a, 1986b) have developed Bayes tests from fuzzy data. These tests involve no computational difficulties either when simple hypotheses are tested or when the distribution space P has small cardinality. Finally, studies on the testing of fuzzy statistical hypotheses and on sequential tests can be found (see, for instance, Casals and Salas 1988, Pardo et al. 1988, Casals 1993, Casals and Gil 1994).

− Regarding fuzzy discrimination problems, decision rules minimizing the probability of error have been derived, and some upper bounds for the average probability of error have been obtained (see Asai et al. 1977, Pardo and Menéndez 1991).

As an example of a statistical technique based on the fuzzy information approach, we can mention the following one (Gil et al. 1989):

Theorem. Let $X = (X(\Omega), \mathcal{B}_{X(\Omega)}, P_\theta)$, $\theta \in \Theta$, be a random experiment in which $\{P_\theta, \theta \in \Theta\}$ is a parametric family of probability measures dominated by the counting or the Lebesgue measure, $\lambda$.
Assume that the experiment X satisfies the following regularity conditions:
i) $\Theta$ is a real interval which is not a singleton;
ii) the set $(X(\Omega))^n = \{(x_1, \ldots, x_n) \mid L(x_1, \ldots, x_n; \theta) > 0\}$, L being the likelihood function, does not depend on $\theta$;
iii) $P_\theta$ is associated with a parametric distribution function which is regular with respect to all its second $\theta$-derivatives in $\Theta$.

Suppose that the sample fuzzy information $(A_1, \ldots, A_n)$ satisfies the following regularity conditions:
iv) the quantity
$$\lambda(A_1, \ldots, A_n) = \int_{(X(\Omega))^n} (A_1, \ldots, A_n)(x_1, \ldots, x_n)\, d\lambda(x_1) \cdots d\lambda(x_n)$$
and the inaccuracy function
$$\Im(A_1, \ldots, A_n; \theta) = -\int_{(X(\Omega))^n} |(A_1, \ldots, A_n)|(x_1, \ldots, x_n) \cdot \log L(x_1, \ldots, x_n; \theta)\, d\lambda(x_1) \cdots d\lambda(x_n)$$
are both finite for all $\theta \in \Theta$ ($|\cdot|$ representing the standardization process in Saaty 1974);
v) the product function $(A_1, \ldots, A_n)(\cdot) \log L(\cdot; \theta)$ is "regular" with respect to all its first and second $\theta$-derivatives in $\Theta$, in the sense that
$$\frac{\partial}{\partial\theta} \Im(A_1, \ldots, A_n; \theta) = -\int_{(X(\Omega))^n} |(A_1, \ldots, A_n)|(x_1, \ldots, x_n) \cdot \frac{\partial}{\partial\theta} \log L(x_1, \ldots, x_n; \theta)\, d\lambda(x_1) \cdots d\lambda(x_n),$$
and
$$\frac{\partial^2}{\partial\theta^2} \Im(A_1, \ldots, A_n; \theta) = -\int_{(X(\Omega))^n} |(A_1, \ldots, A_n)|(x_1, \ldots, x_n) \cdot \frac{\partial^2}{\partial\theta^2} \log L(x_1, \ldots, x_n; \theta)\, d\lambda(x_1) \cdots d\lambda(x_n).$$

Under the regularity conditions i) - v), if there is an estimator of $\theta$ for the simple random sample $((X(\Omega))^n, \mathcal{B}_{(X(\Omega))^n}, P_\theta)$, $\theta \in \Theta$, whose variance attains the Fréchet-Cramér-Rao bound, then the inaccuracy equation, $\frac{\partial}{\partial\theta} \Im(A_1, \ldots, A_n; \theta) = 0$, admits a solution minimizing the inaccuracy $\Im(A_1, \ldots, A_n; \theta)$ with respect to $\theta$ in $\Theta$. Moreover, let $T((X(\Omega))^n)$ be an estimator of $\theta$ for the simple random sample $((X(\Omega))^n, \mathcal{B}_{(X(\Omega))^n}, P_\theta)$, $\theta \in \Theta$, attaining the Fréchet-Cramér-Rao lower bound for the variance, and whose expected value is given by $E_\theta(T) = h(\theta)$ (h being a one-to-one real-valued function on $\Theta$). Then, for the sample fuzzy information $(A_1, \ldots, A_n)$ the inaccuracy equation admits a unique solution minimizing the inaccuracy $\Im(A_1, \ldots, A_n; \theta)$ and taking on the value $\theta^\star(A_1, \ldots, A_n) \in \Theta$ such that
$$h(\theta^\star(A_1, \ldots, A_n)) = \int_{(X(\Omega))^n} |(A_1, \ldots, A_n)|(x_1, \ldots, x_n) \cdot T(x_1, \ldots, x_n)\, d\lambda(x_1) \cdots d\lambda(x_n).$$
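The closing identity of the theorem can be illustrated with a minimal sketch for a Poisson experiment (dominated by the counting measure), where the sample mean T attains the Fréchet-Cramér-Rao bound and h(θ) = θ. Several points are assumptions made only for this illustration and are not stated in the text above: the sample fuzzy information is taken to be the product of the individual memberships, each fuzzy observation has finite support, and standardization divides a membership function by its total. The function names are hypothetical.

```python
import math


def standardize(membership):
    """Divide a finite-support membership function by its total mass."""
    total = sum(membership.values())
    return {x: m / total for x, m in membership.items()}


def min_inaccuracy_poisson(fuzzy_sample):
    """Minimum inaccuracy estimate of a Poisson mean.

    With T the sample mean and h(theta) = theta, the identity in the theorem
    reduces (under the product assumption) to the average of the standardized
    means of the individual fuzzy observations.
    """
    means = []
    for membership in fuzzy_sample:
        weights = standardize(membership)
        means.append(sum(x * w for x, w in weights.items()))
    return sum(means) / len(means)


def inaccuracy(fuzzy_sample, theta):
    """Inaccuracy of theta for the fuzzy sample.

    Under the product assumption the joint inaccuracy is the sum of the
    inaccuracies of the individual fuzzy observations.
    """
    value = 0.0
    for membership in fuzzy_sample:
        for x, w in standardize(membership).items():
            log_pmf = -theta + x * math.log(theta) - math.lgamma(x + 1)
            value -= w * log_pmf
    return value


# Two fuzzy observations: "about 3" and "very few".
sample = [{2: 0.5, 3: 1.0, 4: 0.5}, {0: 1.0, 1: 0.5}]
theta_star = min_inaccuracy_poisson(sample)
print(theta_star, inaccuracy(sample, theta_star))
```

Evaluating `inaccuracy` at values of θ on either side of `theta_star` gives larger values, which is a quick numerical check that the closed-form solution is indeed the minimizer in this example.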

Additional references on the topic in this subsection can be found in the literature (cf. Puri and Ralescu 1981, Stein and Talati 1981, Delgado et al. 1985, Pardo 1985, Dubois and Prade 1986, 1989, 1993, Viertl 1987, 1992, Stein and Zwick 1988, Anisimov 1989, Viertl and Hule 1991, Frühwirth-Schnatter 1992, Cai 1993, Watanabe and Imauzumi 1993, Saade 1994, Römer and Kandel 1995, Römer et al. 1995, Drossos and Theodoropoulos 1996, Gertner and Zhu 1996).

C2.3.4 Multivariate statistics of fuzzy experimental data

Multivariate statistics refers to an assortment of descriptive and inferential techniques developed to handle and analyze situations in which there may be several independent variables and several dependent ones, all correlated with one another to different degrees. Among multivariate studies with fuzzy data, Data and Regression Analyses have certainly become the most successful ones.

C2.3.4.1 Data Analysis with fuzzy elements. When a large amount of data has to be analyzed, cluster analysis (or clustering) is a useful tool. The aim of this technique is to group n objects, $O_1, \ldots, O_n$, each of them described by means of p variables, $x_i = (x_{i1}, \ldots, x_{ip})$, into k classes, $C_1, \ldots, C_k$, according to their similarity or distance. The grouping can be expressed by a partition matrix $M \in \mathcal{M}$, where $\mathcal{M} = \{(m_{ij}) : \sum_{j=1}^{k} m_{ij} = 1,\ m_{ij} \in \{0, 1\}\}$, with $m_{ij} = C_j(O_i) = 1$ if $O_i \in C_j$, and $m_{ij} = 0$ otherwise. The condition $\sum_{i=1}^{n} m_{ij} > 0$ for all j is usually added to avoid empty classes. This type of matrix is called a hard k-partition.

Many different techniques have been proposed (Anderberg 1973, Gordon 1981, Miyamoto 1990, Bandemer and Näther 1992) to obtain partitions into disjoint groups. These techniques, based on distances (or similarities) between objects and clusters, can be classified into two types: partitioning and hierarchical methods. To use the former we need to know the number of classes, k, into which the objects are to be partitioned. Then the partition into k clusters which is optimal according to a given clustering criterion is sought. As the number of classes will not usually be known at the beginning of the investigation, analyses are usually undertaken for several different values of k.

Hierarchical methods can in turn be divided into agglomerative and divisive methods. All agglomerative methods begin with n clusters, each containing just one object, and require a measure of distance between classes. In each step the two nearest clusters are joined into a new single cluster. At the end of the classification all the objects are in one cluster. This process is commonly summarized in a tree diagram which is used to define the final partition. In the divisive methods, the first step is to divide the n objects into two groups according to a given criterion.
Once the first division has taken place, the same method is applied separately to each group.

Until now we have referred to methods that produce hard partitions. However, there are situations where it is interesting to allow "clumping", that is, some overlap between clusters. For example, in linguistics, words have several meanings and may belong to several groups. Some methods of clumping have been incorporated in the CLUSTAN package. The preceding idea can be reformulated in a more natural way by using fuzzy sets, which allow graded membership in clusters. Fuzzy sets are a good approach to represent structure in data where some objects definitely belong to a certain cluster, while there are other objects whose group membership is much less evident. The aim of fuzzy clustering is to look for the "best" summary of structure in data by using fuzzy partitions. So, fuzzy clustering techniques tackle some of the problems left unsolved by hard clustering, such as bridges, strays, and undetermined objects among the clusters. A fuzzy k-partition can be identified by a matrix $M \in \mathfrak{M}$, where $\mathfrak{M} = \{(m_{ij}) : \sum_{j=1}^{k} m_{ij} = 1,\ m_{ij} \in [0, 1]\}$, with $m_{ij} = C_j(O_i)$ meaning the degree of membership of $O_i$ in $C_j$. The search for such a partition is usually carried out as an optimization problem where the objective function can be built according to several criteria: based on intuition, on generalized hard partition techniques, and on the maximum likelihood criterion.

• Fuzzy clustering based on intuitive methods
Early papers on fuzzy clustering were developed by Ruspini (1969, 1970), who tried to optimize a criterion derived intuitively, using the following objective function
$$\min_{M} J_R(M) = \min_{M} \sum_{i=1}^{n} \sum_{i'=1}^{n} \Big[ c \sum_{j=1}^{k} (m_{ij} - m_{i'j})^2 - d^2(O_i, O_{i'}) \Big]^2$$

where c > 0 is a parameter which reflects the importance of the difference between the membership functions associated with objects. Ruspini's objective function tends to be small when a close pair of objects have nearly equal fuzzy cluster membership. To compute the optimal fuzzy partition Ruspini has suggested an adaptation of the usual gradient method, which finds local minima. This is a slow process, unless a good initial approximation has been given.

• Fuzzy clustering based on the k-means method
One of the best known fuzzy clustering methods is the fuzzy k-means, which was developed by Dunn (1974) and Bezdek (1987), and is based on a generalization of the within-groups sum of squares. This technique requires the existence of a norm, usually a Euclidean norm, in order to calculate the distance between objects and "centres" of clusters. The objective is to look for the following minimum
$$\min_{M, v} J_k(M, v) = \min_{M, v} \sum_{j=1}^{k} \sum_{i=1}^{n} m_{ij}^{q} \, \|x_i - v_j\|^2$$

where $\|\cdot\|$ is a norm in $\mathbb{R}^p$ induced via an inner product, q > 1 is a weighting exponent (the fuzzier the system the larger q, and conversely, as q tends to 1 the limit fuzzy partition becomes hard), and $v_j = \sum_{i=1}^{n} m_{ij}^{q} x_i \big/ \sum_{i=1}^{n} m_{ij}^{q}$ represents the "centre" of cluster $C_j$. The solution is obtained by an iterative process consisting in applying the following steps:

Step 1.- Choose $k \in \{1, \ldots, n\}$, any norm in $\mathbb{R}^p$, and q > 1.
Step 2.- Initialize $M^0$, where $M^0 \in \mathfrak{M}$ (the set of fuzzy k-partitions), and l = 0.
Step 3.- Calculate the k cluster centres $v_j^l$ associated with $M^l$.
Step 4.- Compute $M^{l+1}$, so that if $x_i = v_j^l$, then $m_{ij}^{l+1} = 1$ and $m_{it}^{l+1} = 0$ if $t \neq j$, else
$$m_{ij}^{l+1} = \Big[ \sum_{j'=1}^{k} \Big( \frac{\|x_i - v_j^l\|}{\|x_i - v_{j'}^l\|} \Big)^{2/(q-1)} \Big]^{-1}.$$
Step 5.- If the "distance" $d(M^{l+1}, M^l) < \varepsilon$, then stop, else set l = l + 1 and return to Step 3.
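The iteration in Steps 1 to 5 can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation of Dunn or Bezdek: it uses the Euclidean norm, a random initial fuzzy partition, the largest absolute change in any membership as the "distance" between successive partitions, and a cap on the number of iterations. The function and parameter names (`fuzzy_k_means`, `eps`, `max_iter`, `seed`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def centres(X, M, q):
    """Step 3: centres v_j as means of the data weighted by m_ij ** q."""
    k, p = len(M[0]), len(X[0])
    V = []
    for j in range(k):
        w = [M[i][j] ** q for i in range(len(X))]
        total = sum(w)
        V.append([sum(w[i] * X[i][d] for i in range(len(X))) / total
                  for d in range(p)])
    return V


def memberships(X, V, q):
    """Step 4: membership update for fixed centres."""
    M = []
    for x in X:
        d2 = [sq_dist(x, v) for v in V]
        if 0.0 in d2:
            # x coincides with a centre: full membership in that cluster.
            j = d2.index(0.0)
            M.append([1.0 if t == j else 0.0 for t in range(len(V))])
        else:
            # (||x-v_j|| / ||x-v_j'||)^(2/(q-1)) equals the ratio of
            # squared distances raised to 1/(q-1).
            M.append([1.0 / sum((d2[j] / d2[t]) ** (1.0 / (q - 1.0))
                                for t in range(len(V)))
                      for j in range(len(V))])
    return M


def fuzzy_k_means(X, k, q=2.0, eps=1e-6, max_iter=200, seed=0):
    """Alternate Steps 3 and 4 until the partition changes by less than eps."""
    rng = random.Random(seed)
    M = []
    for _ in X:  # Step 2: random rows normalized to sum to one
        row = [rng.random() + 1e-3 for _ in range(k)]
        total = sum(row)
        M.append([r / total for r in row])
    for _ in range(max_iter):
        V = centres(X, M, q)
        M_new = memberships(X, V, q)
        change = max(abs(a - b) for ra, rb in zip(M, M_new)
                     for a, b in zip(ra, rb))
        M = M_new
        if change < eps:  # Step 5
            break
    return M, centres(X, M, q)


data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
partition, cluster_centres = fuzzy_k_means(data, k=2)
for point, row in zip(data, partition):
    print(point, [round(m, 3) for m in row])
```

Since the procedure only finds local minima, it is common to run it from several initial partitions (here, several values of `seed`) and keep the partition with the smallest value of the objective function.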

This procedure was modified later by Wang et al. (1994), who have suggested focusing not only on homogeneity (minimum within-group variation) but also on good separation between groups, as well as avoiding the problems that would appear if groups have very different sizes. For this purpose, the following bi-objective function has been considered
$$\min_{M, v} J_W(M, v) = \min_{M, v} \Big[ \beta \sum_{j=1}^{k} \sum_{i=1}^{n} m_{ij}^{2} \, \|x_i - v_j\|^2 - \alpha \sum_{j=1}^{k} \sum_{j'=1}^{j-1} \|v_j - v_{j'}\|^2 \Big]$$

where $\beta > 0$ and $\alpha \geq 0$, with $\alpha + \beta = 1$, are weights that reflect, respectively, the importance of the homogeneity and the separation properties. In case $\alpha = 0$, we obtain the fuzzy k-means method (with q = 2). As most practical applications deal with large sets of data, and the preceding algorithm only produces local minima, the search for efficient algorithms is essential. Some efforts in this direction have been made (Cannon et al. 1986, Kamel and Selim 1994).

• Fuzzy clustering based on maximum likelihood method
The previous k-means method searches for clusters which have more or less the same shape, and the norm used in the objective function is the same for all clusters. Scott and Symons (1971) have proposed a new hard clustering method based on the maximum likelihood principle, which allows the search for clusters with different shapes. In this method we suppose that the observations $x_i$ are independent and normally distributed as a p-dimensional $\mathcal{N}_p(\mu_j, \Sigma_j)$ if $O_i \in C_j$, where $\mu_j$ and $\Sigma_j$ are unknown parameters. The aim of this method in the nonfuzzy case is the maximization of the log-likelihood function, which is equivalent to the minimization below
$$\min_{M, \mu, \Sigma} \Big[ \sum_{j=1}^{k} \Big( \sum_{i=1}^{n} m_{ij} \log|\Sigma_j| \Big) + \sum_{j=1}^{k} \sum_{i=1}^{n} m_{ij} \, (x_i - \mu_j)^t \Sigma_j^{-1} (x_i - \mu_j) \Big]$$

This method can be generalized to fuzzy clustering (Trauwaert et al. 1991) by using the following revised minimization
$$\min_{M, \mu, \Sigma} \Big[ \sum_{j=1}^{k} \Big( \sum_{i=1}^{n} m_{ij} \log|\Sigma_j| \Big) + \sum_{j=1}^{k} \sum_{i=1}^{n} m_{ij}^{q} \, (x_i - \mu_j)^t \Sigma_j^{-1} (x_i - \mu_j) \Big]$$

where usually q = 2. An appropriate way to solve this problem when q = 2 is to search for local minima, which can be found by iteratively applying the following equations:
$$\mu_j = \sum_{i=1}^{n} m_{ij}^{2} x_i \Big/ \sum_{i=1}^{n} m_{ij}^{2}, \qquad \Sigma_j = \sum_{i=1}^{n} m_{ij}^{2} \, (x_i - \mu_j)(x_i - \mu_j)^t,$$
$$m_{ij} = \frac{\big[(x_i - \mu_j)^t \Sigma_j^{-1} (x_i - \mu_j)\big]^{-1}}{\sum_{j'=1}^{k} \big[(x_i - \mu_{j'})^t \Sigma_{j'}^{-1} (x_i - \mu_{j'})\big]^{-1}}.$$

If we analyze the preceding functional, we note that different norms (scatter matrices) arise in each cluster, leading to diverse shapes. It should be emphasized that this method coincides with fuzzy k-means under the hypothesis of equal variance-covariance matrices. Another fuzzy clustering method based on the likelihood principle is suggested by Yang (1993), who has considered mixtures of distributions and extended the previous objective function by adding a penalty term.

• Fuzzy hierarchical clustering
Dimitrescu (1988) has developed a divisive fuzzy hierarchical clustering method, in which a priori knowledge about the number of classes in the partition is not necessary. This method builds a chain of fuzzy binary partitions according to the fuzzy k-means. The procedure starts by calculating a fuzzy partition of the set of objects into two classes. Once the first division has taken place, the same method is applied separately to each fuzzy set. However, after a fuzzy set has been split, one has to examine whether or not this new partition defines "real clusters" by using an adequate measure. The decomposition process ends when no new "real clusters" are obtained.

C2.3.4.2 Regression Analysis of fuzzy data. Model fitting problems based on the knowledge of a group of data and on the assumption of a mathematical model relating the dependent variable(s) to the independent one(s) have been widely discussed in the literature. The fitting of the model (given by a function f defined on an appropriate space) consists in finding an optimal set of parameters a so that the observations $(x_i, y_i)$ satisfy the model $y_i = f(x_i, a)$. This is almost impossible to achieve, so the conditions are weakened: the observations are not required to fit the model exactly, but a discordance (error term) between the observed values and the estimated ones is allowed, these errors being as small as possible (the smaller they are, the better the fit). In general regression, the deviations between the observed data and the estimated ones are considered as measurement errors (normally distributed with zero mean) and the problem consists in the minimization of these errors by means of some criterion (least squares being the most widely used). When vagueness is present in the process, some (maybe all) of the elements described above can be fuzzy ones. The different ways of dealing with this vagueness produce different approaches to Regression Analysis in Fuzzy Sets Theory, which can be classified in two main groups. In almost every case the linear case is considered, even though some of these approaches can be extended to more general problems. Other approaches, as well as some applications of Fuzzy Regression, can be found in Yager (1982), Wang and Li (1990), Sakawa and Yano (1992a, 1992b), Savic and Pedrycz (1992), Kovács (1992), Ishibuchi and Tanaka (1992, 1993), Feng and Guang (1993).

• Fuzzy Regression (FR) as a Linear Programming (LP) Problem
The first approach to FR was given by Tanaka et al. (1980, 1982) for the linear case. They consider that the differences between the observed and the estimated values are not due to measurement errors but to the imprecision of the model, which is translated into fuzzy parameters. The membership functions of these parameters (given in the L-R representation; see Dubois and Prade 1980) are an input for the regression problem, as is the set of observations $(y_i, X_i)$ with $i = 1, \ldots, N$. What we have to find, then, are the parameters for which the observed data agree with the computed values to a given degree of belief. That is, given a function $y = a_1 x_1 + \ldots + a_n x_n = X^t A$, where $A = (a_1, \ldots, a_n)^t$ is the set of parameters and $X = (x_1, \ldots, x_n)^t$ are the independent variables, the question is: which are the fuzzy parameters $A^*$ such that the fuzzy estimate $y_i^* = X_i^t A^*$ contains $y_i$ to a degree of at least H for all i? If fuzzy triangular numbers are considered (see Tanaka et al. 1982), this problem reduces to

minimize $c_1 + c_2 + \ldots + c_n$ subject to
$$\sum_{j=1}^{n} \alpha_j x_{ij} + |L^{-1}(H)| \sum_{j=1}^{n} c_j |x_{ij}| \geq y_i,$$
$$\sum_{j=1}^{n} \alpha_j x_{ij} - |L^{-1}(H)| \sum_{j=1}^{n} c_j |x_{ij}| \leq y_i,$$
for $i = 1, \ldots, N$,

where $\alpha_j$ are the centres and $c_j$ are the spreads of the fuzzy parameters $a_j$, and L is the function that characterizes membership in the fuzzy numbers. The solution of this problem greatly depends on the selection of H and L. The effect of these input data on the solution of the LP problem is carefully studied in Moskowitz and Kim (1993) and can be summarized in the following

Theorem. Let $A^*_{H_1, L_1} = (\alpha^*, c^*)$ denote the optimal solution to a fuzzy regression problem with level of credibility $H_1$ and membership function $L_1$. Then the optimal solution with $H_2$ and $L_2$ is
$$A^*_{H_2, L_2} = \Big( \alpha^*, \ \frac{|L_1^{-1}(H_1)|}{|L_2^{-1}(H_2)|} \, c^* \Big).$$
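The mechanism behind the theorem can be checked numerically with a small sketch: the spreads enter the constraints only through the product $|L^{-1}(H)|\, c_j$, so rescaling the spreads by $|L_1^{-1}(H_1)| / |L_2^{-1}(H_2)|$ leaves every constraint unchanged. The code below only verifies feasibility for a hand-picked solution; it does not solve the LP problem. The triangular reference function L(x) = 1 - |x| (so that $L^{-1}(H) = 1 - H$), the toy data, and the function names are assumptions made for illustration.

```python
def constraints_hold(alpha, c, X, y, H, L_inv, tol=1e-9):
    """Check both inequalities of the LP problem for every observation."""
    k = abs(L_inv(H))
    for x_i, y_i in zip(X, y):
        centre = sum(a * x for a, x in zip(alpha, x_i))
        spread = k * sum(c_j * abs(x) for c_j, x in zip(c, x_i))
        if centre + spread < y_i - tol or centre - spread > y_i + tol:
            return False
    return True


def rescale_spreads(c, H1, L1_inv, H2, L2_inv):
    """Spreads for (H2, L2) obtained from those for (H1, L1)."""
    factor = abs(L1_inv(H1)) / abs(L2_inv(H2))
    return [factor * c_j for c_j in c]


def triangular_L_inv(h):
    """Inverse of the assumed reference function L(x) = 1 - |x| on [0, 1]."""
    return 1.0 - h


# Toy data: an intercept column and one regressor.
X = [(1.0, 1.0), (1.0, 2.0), (1.0, 3.0)]
y = [2.0, 3.5, 4.0]
alpha = [1.0, 1.0]   # centres of the fuzzy parameters
c_H1 = [0.5, 0.0]    # spreads that are feasible at H1 = 0

c_H2 = rescale_spreads(c_H1, 0.0, triangular_L_inv, 0.5, triangular_L_inv)
print(c_H2)
print(constraints_hold(alpha, c_H1, X, y, 0.0, triangular_L_inv))
print(constraints_hold(alpha, c_H2, X, y, 0.5, triangular_L_inv))
print(constraints_hold(alpha, c_H1, X, y, 0.5, triangular_L_inv))
```

In this toy example the original spreads stop being feasible when H is raised to 0.5, while the rescaled spreads restore exactly the same constraint values as before, which is why the centres of the optimal solution need not change.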

Some interesting applications of Fuzzy linear regression (FLR) to forecasting can be found in Heshmaty and Kandel (1985). A slightly different approach is considered in Savic and Pedrycz (1991), where a two-step procedure is proposed in order to use the model obtained as the solution of the FLR problem as a prediction model. In that paper, a solution for fitting a regression line is calculated from the available information about the centre points of the observations, this solution being an input for the LP problem arising from the FLR one. In Bárdossy (1990) an extension of this problem is carried out. First, the function f for the relation between the dependent and independent variables is not necessarily linear; second, the fuzzy numbers are not triangular in general; third, the vagueness of the problem is measured not only as the sum of the spreads, but in four different ways (the maximum of the maxima of the "spreads" being one of them). Additional assumptions (continuity and monotonicity of f) have to be made in order to reduce the constraints to a finite number. This transforms, in general, the regression problem into a nonlinear optimization problem. FLR is studied in Bárdossy (1990) as a particular case, leading to an LP problem with the additional difficulty that the solution is never unique when two of the measures of vagueness considered in that paper are minimized.

• Least Squares (LS) methods for FR
A second approach consists in extending least squares methods directly. In this case, the errors are the distances between the observed and the estimated values. Difficulties arise in choosing an appropriate distance. The fitting of a linear function with crisp parameters and fuzzy input and output data was developed by Diamond (1991) (see also Diamond 1988, Bárdossy et al. 1992). A more general approach to the FLR problem is obtained in Salas et al. (1991), where the parameters of the linear function are fuzzy numbers. In Bertoluzza et al. (1995a) the existence and uniqueness of the best polynomial approximation is proved. The regression coefficients cannot be determined by means of the classical variational methods, because the set on which the functional to be minimized is defined has empty interior. To avoid this difficulty, a numerical method based on active constraints has been developed (see Villani 1994). From the development of these works, higher level fuzzy numbers (see Diamond 1990) or new definitions of distances between fuzzy numbers (see Bertoluzza et al. 1995b) have arisen as a contribution of FR to Fuzzy Sets Theory. A different approach to least squares regression can be found in Guo and Chen (1992) (see also Celmiņš 1987). The problem deals with fuzzy multiple regression $Y = X \circ R$, where X and Y are vectors of triangular fuzzy numbers and R is a fuzzy relation. The latter is locally determined by minimizing the squares of the differences of the membership functions.

C2.3.5 Concluding remarks

To end this section, we wish to refer to other papers introducing and handling concepts that combine statistical and fuzzy modelings, like those dealing with probabilistic sets (see Hirota 1977, Czogała 1984), and those working on statistical decision problems involving fuzzy utilities or losses (see, for instance, Watson et al. 1979, Freeling 1980, Tong and Bonissone 1980, Dubois and Prade 1982, Whalen 1984, Gil and Jain 1992, Lamata 1994, and Gil and López-Díaz 1996). Quite recently, Ralescu (1996) has developed an approach to statistical decision theory involving fuzzy probabilities. This approach addresses several problems, like the Bayesian estimation of parameters when the prior information is fuzzy-valued, the estimation of fuzzy probabilities, the testing of fuzzy hypotheses, the fuzzy quantification of rules and aggregation of decision criteria, and regression analysis with fuzzy data. Finally, a complex combined model to deal with multivariate data has also been recently presented (Manton et al. 1995). Observations in it are assumed to be supplied by individuals that can belong to different sets or classes with different degrees, and the combined model is based on the idea of fuzzy partition and a set of probabilistic-possibilistic assumptions.

Acknowledgements

The research in this section has been partially supported by the Spanish DGICYT Grant No. PB92-1014 and an Italian MURST Grant. Their financial support is gratefully acknowledged.

References

Anderberg M 1973 Cluster Analysis for Applications (New York: Academic Press).
Anisimov V Y 1989 Parameter estimation in the case of fuzzy information on the observation conditions Telecommunications and Radio Engineering 44 86-88.
Artstein Z and Vitale R A 1975 A strong law of large numbers for random compact sets Annals of Probability 3 879-882.
Asai K Tanaka H and Okuda T 1977 On discrimination of fuzzy states in probability space Kybernetes 6 185-192.
Aumann R J 1965 Integrals of set-valued functions Journal of Mathematical Analysis and Applications 12 1-12.
Bandemer H and Näther W 1992 Fuzzy Data Analysis (Boston: Kluwer Academic Publishers).
Bárdossy A 1990 Note on fuzzy regression Fuzzy Sets and Systems 37 65-75.
Bárdossy A Hagaman R Duckstein L and Bogárdi L 1992 Fuzzy least squares regression and applications to earthquake data Fuzzy Regression Analysis eds J Kacprzyk and M Fedrizzi (Warsaw: Springer-Verlag) pp 181-193.
Bertoluzza C Corral N and Salas A 1995a Polinomial regression in a fuzzy context. The least squares method Proceedings 6th IFSA Congress 2 (São Paulo) pp 431-434.
Bertoluzza C Corral N and Salas A 1995b On a new class of distances between fuzzy numbers Mathware & Soft Computing 3 253-263.
Bezdek J 1987 Pattern Recognition with Fuzzy Objective Function Algorithms (New York: Plenum Press).
Blackwell D A 1953 Equivalent comparisons of experiments Annals of Mathematical Statistics 24 265-272.
Buckley J J 1985 Fuzzy decision making with data: applications to statistics Fuzzy Sets and Systems 16 139-147.
Cannon R L Bezdek J C and Dave J V 1986 Efficient implementation of the fuzzy c-means clustering algorithms IEEE Transactions on Pattern Analysis and Machine Intelligence 8 248-255.
Casals M R 1993 Bayesian testing of fuzzy parametric hypotheses from fuzzy information R.A.I.R.O.-Recherche Opérationnelle 27 189-199.

Casals M R and Gil M A 1988 A note on the operativeness of Neyman-Pearson tests with fuzzy information Fuzzy Sets and Systems 30 215-220.
Casals M R and Gil P 1994 Bayesian sequential test for fuzzy parametric hypotheses from fuzzy information Information Sciences 80 283-298.
Casals M R Gil M A and Gil P 1986a On the use of Zadeh's probabilistic definition for testing statistical hypotheses from fuzzy information Fuzzy Sets and Systems 20 175-190.
Casals M R Gil M A and Gil P 1986b The fuzzy decision problem: an approach to the problem of testing statistical hypotheses with fuzzy information European Journal of Operational Research 27 371-382.
Casals M R and Salas A 1988 Sequential Bayesian test from fuzzy experimental information Uncertainty and Intelligent Systems - IPMU'88, Lecture Notes in Computer Science 313 314-321.
Cai K Y 1993 Parameter estimations of normal fuzzy variables Fuzzy Sets and Systems 55 179-185.
Celmiņš A 1987 Least squares model fitting to fuzzy vector data Fuzzy Sets and Systems 22 245-269.
Corral N and Gil M A 1984 The minimum inaccuracy fuzzy estimation: an extension of the maximum likelihood principle Stochastica VIII 63-81.
Corral N and Gil M A 1988 A note on interval estimation with fuzzy data Fuzzy Sets and Systems 28 209-215.
Czogała E 1984 Probabilistic Sets in Decision Making and Control (Köln: Verlag TÜV Rheinland).
Delgado M Verdegay J L and Vila M A 1985 Testing fuzzy hypotheses. A Bayesian approach Approximate Reasoning in Expert Systems eds M M Gupta A Kandel W Bandler and J B Kiszka (Amsterdam: North-Holland) pp 307-316.
Diamond P 1988 Fuzzy least squares Information Sciences 46 315-332.
Diamond P 1990 Higher level fuzzy numbers arising from fuzzy regression models Fuzzy Sets and Systems 36 265-275.
Diamond P 1991 Least squares methods in fuzzy data analysis Proceedings 4th IFSA Congress Computer, Management and Systems Science (Brussels) 60-63.
Dimitrescu D 1988 Hierarchical pattern recognition Fuzzy Sets and Systems 28 145-162.
Drossos C A and Theodoropoulos P L 1996 B-fuzzy probabilities Fuzzy Sets and Systems 78 355-369.
Dubois D and Prade H 1980 Fuzzy Sets and Systems. Theory and Applications (New York: Academic Press).
Dubois D and Prade H 1982 The use of fuzzy numbers in Decision Analysis Fuzzy Information and Decision Processes eds M M Gupta and E Sanchez (Amsterdam: North-Holland) pp 309-321.
Dubois D and Prade H 1985 Fuzzy cardinality and the modeling of imprecise quantification Fuzzy Sets and Systems 16 199-230.
Dubois D and Prade H 1986 Fuzzy sets and statistical data European Journal of Operational Research 25 345-356.
Dubois D and Prade H 1989 Fuzzy sets, probability and measurement European Journal of Operational Research 40 135-154.

Dubois D and Prade H 1993 Fuzzy sets and probability: misunderstandings, bridges and gaps Proceedings of the 2nd IEEE International Conference on Fuzzy Systems (San Francisco) pp 1059-1068.
Dunn J C 1974 A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters Journal of Cybernetics 3 32-57.
Feng L and Guang X X 1993 A forecasting model of fuzzy self-regression Fuzzy Sets and Systems 58 239-242.
Freeling A N S 1980 Fuzzy Sets and Decision Analysis IEEE Transactions on Systems, Man and Cybernetics 10 341-354.
Frühwirth-Schnatter S 1992 On statistical inference for fuzzy data with applications to descriptive statistics Fuzzy Sets and Systems 50 143-165.
Gebhardt J Gil M A and Kruse R 1996 Statistical methods and fuzzy-valued statistics International Handbook of Fuzzy Sets and Possibility Theory eds D Dubois H Prade and R Slowinski (Boston: Kluwer Academic Publishers) (to appear).
Gertner G Z and Zhu H 1996 Bayesian-estimation in forest surveys when samples or prior information are fuzzy Fuzzy Sets and Systems 77 277-290.
Gesu V 1994 Integrated fuzzy clustering Fuzzy Sets and Systems 68 293-308.
Gil M A 1987 Fuzziness and loss of information in statistical problems IEEE Transactions on Systems, Man and Cybernetics 17 1016-1025.
Gil M A 1988 On the loss of information due to fuzziness in experimental observations Annals of the Institute of Statistical Mathematics 40 627-639.
Gil M A 1992 A note on the connection between fuzzy numbers and random intervals Statistics and Probability Letters 13 311-319.
Gil M A 1993 Statistical management of fuzzy elements in random experiments. Part 1: a discussion on treating fuzziness as a kind of randomness Information Sciences 69 229-242.
Gil M A and Casals M R 1988 An operative extension of the likelihood ratio test from fuzzy data Statistical Papers 29 191-203.
Gil M A Corral N and Casals M R 1989 The likelihood ratio test for goodness of fit with fuzzy experimental observations IEEE Transactions on Systems, Man and Cybernetics 19 771-779.
Gil M A Corral N and Gil P 1985a The fuzzy decision problem: an approach to the point estimation problem with fuzzy information European Journal of Operational Research 22 26-34.
Gil M A Corral N and Gil P 1988 The minimum inaccuracy estimates in χ² tests for goodness of fit with fuzzy observations Journal of Statistical Planning and Inference 19 95-115.
Gil M A and Gil P 1992 Fuzziness in the experimental outcomes: comparing experiments and removing the loss of information Journal of Statistical Planning and Inference 31 93-111.
Gil M A and Jain P 1992 Comparison of experiments in statistical decision problems with fuzzy utilities IEEE Transactions on Systems, Man and Cybernetics 22 662-670.
Gil M A and López M T 1993 Statistical management of fuzzy elements in random experiments. Part 2: The Fisher information associated with a fuzzy information system Information Sciences 69 243-257.

Gil M A and López-Díaz M 1996 Fundamentals and bayesian analyses of decision problems with fuzzy-valued utilities International Journal of Approximate Reasoning 15 202-224.
Gil M A López M T and Gil P 1984 Comparison between fuzzy information systems Kybernetes 13 245-251.
Gil M A López M T and Gil P 1985b Quantity of information: comparison between information systems: 1. Nonfuzzy states Fuzzy Sets and Systems 15 65-78.
Gil P Gil M A Menéndez M L and Pardo L 1990 Connections between some criteria to compare fuzzy information systems Fuzzy Sets and Systems 37 183-192.
Giné E and Hahn M G 1985 Characterization and domains of attraction of p-stable random compact sets Annals of Probability 13 447-468.
Goodman I R 1982 Fuzzy sets as equivalence classes of possibility random sets Fuzzy Set and Possibility Theory: Recent Development ed R R Yager (Oxford: Pergamon) pp 327-343.
Goodman I R and Nguyen H T 1985 Uncertainty Models for Knowledge Based Systems (Amsterdam: North-Holland).
Goodman I R Nguyen H T and Rogers G S 1991 On the scoring approach to admissibility of uncertainty measures in expert systems Journal of Mathematical Analysis and Applications 16 550-594.
Gordon A 1981 Classification (London: Chapman and Hall).
Guo S and Chen S 1992 An approach to monodic fuzzy regression Fuzzy Regression Analysis eds J Kacprzyk and M Fedrizzi (Warsaw: Springer-Verlag) pp 81-90.
Heshmaty B and Kandel A 1985 Fuzzy linear regression and its applications to forecasting in uncertain environment Fuzzy Sets and Systems 15 159-191.
Hiai F and Umegaki H 1977 Integrals, conditional expectations, and martingales of multivalued functions Journal of Multivariate Analysis 7 149-182.
Hirota 1977 Concepts of probabilistic sets Proceedings IEEE Conference on Decision Control (New Orleans) pp 1361-1366.
Hisdal H 1982 Possibilities and probabilities Proceedings 2nd World Conference on Mathematics at the Service of Man (Las Palmas) 172-175.
Hisdal H 1988 Are grades of membership probabilities? Fuzzy Sets and Systems 25 325-348.
Ishibuchi H and Tanaka H 1992 Fuzzy regression analysis using neural networks Fuzzy Sets and Systems 50 257-265.
Ishibuchi H and Tanaka H 1993 An architecture of neural networks with interval weight and its applications to fuzzy regression analysis Fuzzy Sets and Systems 57 27-39.
Jain P and Agogino A 1988 Calibration of fuzzy linguistic variables for expert systems Computers in Engineering 1988 - Vol. 1. eds V A Tipnis and E M Patton (New York: American Society of Mechanical Engineering) pp 313-318.
Kamel M S and Selim S Z 1994 A relaxation approach to the fuzzy clustering problem Fuzzy Sets and Systems 61 177-188.
Kendall D G 1974 Foundations of a theory of random sets Stochastic Geometry eds E F Harding and D G Kendall (New York: Wiley) pp 322-376.
Klir G J 1989 Is there more to uncertainty than some probability theorist might have us believe? International Journal of General Systems 15 347-378.

Klement E P Puri M L and Ralescu D A 1984 Law of large numbers and central limit theorem for fuzzy random variables Cybernetics and Systems Research 2 ed R Trappl (Amsterdam: North-Holland) pp 525-529.
Klement E P Puri M L and Ralescu D A 1986 Limit theorems for fuzzy random variables Proceedings of the Royal Society of London, Series A 19 171-182.
Kovács M 1992 Fuzzy linear model fitting to fuzzy observations Fuzzy Regression Analysis eds J Kacprzyk and M Fedrizzi (Warsaw: Springer-Verlag) pp 116-123.
Kruse R and Gebhardt 1989 On a dialog system for modelling and statistical analysis of linguistic data Proceedings 3rd IFSA Congress (Seattle) pp 157-160.
Kruse R and Meyer K D 1987 Statistics with Vague Data (Dordrecht: Reidel Publ. Co.).
Kruse R and Meyer K D 1988 Confidence intervals for the parameters of a linguistic random variable Combining Fuzzy Imprecision with Probabilistic Uncertainty in Decision Making, Lecture Notes in Economics and Mathematical Systems 310 eds J Kacprzyk and M Fedrizzi (Berlin: Springer-Verlag) pp 113-123.
Kruse R Schwecke E and Heinsohn J 1991 Uncertainty and Vagueness in Knowledge Based Systems (Berlin: Springer-Verlag).
Kwakernaak H 1978 Fuzzy random variables. Part I: Definitions and theorems Information Sciences 15 1-29.
Kwakernaak H 1979 Fuzzy random variables. Part II: Algorithms and examples for the discrete case Information Sciences 17 253-278.
Lamata M T 1994 A model of decision with linguistic knowledge Mathware & Soft Computing 3 253-263.
Laviolette M Seaman Jr J W Barret J D and Woodall W H 1995 A probabilistic and statistical view of fuzzy methods (with discussions) Technometrics 37 249-292.
Le Cam L 1964 Sufficiency and approximate sufficiency Annals of Mathematical Statistics 35 1419-1455.
Le Cam L 1986 Asymptotic Methods in Statistical Decision Theory (New York: Springer-Verlag).
Lindley D V 1982 Scoring rules and the inevitability of probabilities International Statistical Review 50 1-26.
Lindley D V 1987 The probability approach to the treatment of uncertainty in Artificial Intelligence and expert systems Statistical Science 2 17-24.
López-Díaz M 1996 Medibilidad e Integración de Variables Aleatorias Difusas. Aplicación a Problemas de Decisión PhD Thesis (Universidad de Oviedo).
Lyashenko N N 1983 Statistics of random compacts in an euclidean space Journal Soviet Mathematics 21 76-92.
Manton K G Woodbury M A and Tolley H D 1994 Statistical Applications Using Fuzzy Sets (New York: Wiley).
Matheron G 1975 Random Sets and Integral Geometry (New York: Wiley).
Miyamoto S 1990 Fuzzy Sets in Information Retrieval and Cluster Analysis (Dordrecht: Kluwer Academic Publishers).
Moskowitz H and Kwangjae K 1993 On assessing the H value in fuzzy linear regression Fuzzy Sets and Systems 58 303-327.

Negoita C V and Ralescu D A 1975 Applications of Fuzzy Sets to Systems Analysis (New York: Wiley).
Negoita C V and Ralescu D A 1987 Simulation, Knowledge-based Computing, and Fuzzy Statistics (New York: Van Nostrand Reinhold).
Nguyen H T 1979 Some mathematical tools for linguistic probabilities Fuzzy Sets and Systems 2 53-65.
Okuda T 1987 A statistical treatment of fuzzy observations: estimation problems Proceedings 2nd IFSA Congress (Tokyo) pp 51-55.
Okuda T Kodono Y Maehara K and Asai K 1991 Maximum likelihood estimation from fuzzy observation data Proceedings 4th IFSA Congress Computer, Management and Systems Science (Brussels) pp 185-188.
Okuda T Tanaka H and Asai K 1978 A formulation of fuzzy decision problems with fuzzy information, using probability measures of fuzzy events Information and Control 38 135-147.
Pardo L 1985 Information energy of a fuzzy event and a partition of fuzzy events IEEE Transactions on Systems, Man and Cybernetics 14 139-144.
Pardo L and Menéndez M L 1991 Some bounds on probability of error in fuzzy discrimination problems European Journal of Operational Research 53 362-370.
Pardo L Menéndez M L and Pardo J A 1986 The f*-divergence as a criterion of comparison between fuzzy information systems Kybernetes 15 189-194.
Pardo L Menéndez M L and Pardo J A 1988 A sequential selection method of a fixed number of fuzzy information systems based on the information energy gain Fuzzy Sets and Systems 25 97-105.
Pardo L Menéndez M L and Pardo J A 1989 Sufficient fuzzy information systems Fuzzy Sets and Systems 32 81-89.
Puri M L and Ralescu D A 1981 Différentielle d’une fonction floue Comptes Rendus de l’Académie des Sciences de Paris, Série I 293 237-239.
Puri M L and Ralescu D A 1985 The concept of normality for fuzzy random variables Annals of Probability 13 1373-1379.
Puri M L and Ralescu D A 1986 Fuzzy random variables Journal of Mathematical Analysis and Applications 114 409-422.
Puri M L and Ralescu D A 1991 Convergence theorem for fuzzy martingales Journal of Mathematical Analysis and Applications 160 107-122.
Raiffa H and Schlaifer R 1961 Applied Statistical Decision Theory (Boston: Harvard University, Graduate School of Business).
Ralescu A and Ralescu D A 1984 Probability and fuzziness Information Sciences 17 85-92.
Ralescu A and Ralescu D A 1986 Fuzzy sets in statistical inference The Mathematics of Fuzzy Systems eds A Di Nola and A G S Ventre (Köln: Verlag TÜV Rheinland) pp 273-283.
Ralescu D A 1982 Fuzzy logic and statistical estimation Proceedings 2nd World Conference on Mathematics at the Service of Man (Las Palmas) pp 605-606.
Ralescu D A 1995a Fuzzy probabilities and their applications to statistical inference Advances in Intelligent Computing - IPMU’94, Lecture Notes in Computer Science 945 217-222.
Ralescu D A 1995b Fuzzy random variables revisited Proceedings IFES’95 and Fuzzy IEEE Joint Conference (Yokohama).

Ralescu D A 1995c Inequalities for fuzzy random variables Proceedings 26th Iranian Mathematical Conference (Kerman) pp 333-335.
Ralescu D A 1996 Statistical decision-making without numbers Proceedings 27th Iranian Mathematical Conference (Shiraz) pp 403-417.
Ralescu A and Ralescu D A 1996 Extensions of fuzzy aggregation Fuzzy Sets and Systems (to appear).
Rappoport A Wallsten T S and Cox J A 1987 Direct and indirect scaling of membership functions of probability phrases Mathematical Modelling 9 397-418.
Römer C and Kandel A 1995 Statistical tests for fuzzy data Fuzzy Sets and Systems 72 1-26.
Römer C Kandel A and Backer E 1995 Fuzzy partitions of the sample space and fuzzy parameter hypotheses IEEE Transactions on Systems, Man and Cybernetics 25 1314-1322.
Ruspini E H 1969 A new approach to clustering Information and Control 15 22-32.
Ruspini E H 1970 Numerical methods for fuzzy clustering Information Sciences 2 319-350.
Saade J J 1994 Extension of fuzzy hypothesis testing with hybrid data Fuzzy Sets and Systems 63 57-71.
Saaty T L 1974 Measuring the fuzziness of sets Journal of Cybernetics 4 53-61.
Sakawa M and Yano H 1992a Fuzzy linear regression and its applications Fuzzy Regression Analysis eds J Kacprzyk and M Fedrizzi (Warsaw: Springer-Verlag) pp 61-80.
Sakawa M and Yano H 1992b Multiobjective fuzzy linear regression analysis for fuzzy input-output data Fuzzy Sets and Systems 42 173-181.
Salas A Bertoluzza C and Corral N 1991 Fuzzy linear regression: existence of solution for a generalized least squares method Proceedings 4th IFSA Congress Computer, Management and Systems Science (Brussels) pp 233-235.
Savic D A and Pedrycz W 1991 Evaluation of fuzzy linear regression models Fuzzy Sets and Systems 39 51-63.
Savic D A and Pedrycz 1992 Fuzzy linear regression models: construction and evaluation Fuzzy Regression Analysis eds J Kacprzyk and M Fedrizzi (Warsaw: Springer-Verlag) pp 91-100.
Scott A and Symons M 1971 Clustering methods based on a likelihood ratio criteria Biometrics 27 387-398.
Stein W E and Talati K 1981 Convex fuzzy random variables Fuzzy Sets and Systems 6 271-283.
Stein W E and Zwick R 1988 Fuzzy random variables Combining Fuzzy Imprecision with Probabilistic Uncertainty in Decision Making, Lecture Notes in Economics and Mathematical Systems 310 eds J Kacprzyk and M Fedrizzi (Berlin: Springer-Verlag) pp 66-74.
Stojaković M 1992 Fuzzy conditional expectation Fuzzy Sets and Systems 52 151-158.
Stojaković M 1994 Fuzzy random variables, expectation, and martingales Journal of Mathematical Analysis and Applications 184 594-606.

Tanaka H Okuda T and Asai K 1979 Fuzzy information and decision in statistical model Advances in Fuzzy Set Theory and Applications eds M M Gupta R K Ragade and R R Yager (Amsterdam: North-Holland) pp 303-320.
Tanaka H Uejima S and Asai K 1982 Linear regression analysis with fuzzy model IEEE Transactions on Systems, Man and Cybernetics 12 903-907.
Thomas S F 1995 Fuzziness and Probability (Wichita, KS: ACG Press).
Tong R M and Bonissone P P 1980 A linguistic approach to decision-making with fuzzy sets IEEE Transactions on Systems, Man and Cybernetics 10 716-723.
Trauwaert E 1988 On the meaning of Dunn’s partition coefficient for fuzzy clustering Fuzzy Sets and Systems 25 217-242.
Trauwaert E Kaufman L and Rousseeuw P 1991 Fuzzy clustering algorithms based on the maximum likelihood principle Fuzzy Sets and Systems 42 213-227.
Utkin L V 1993 Uncertainty importance of system components by fuzzy and interval probability Microelectronics and Reliability 33 1357-1364.
Viertl R 1987 Is it necessary to develop a fuzzy Bayesian inference? Probability and Bayesian Statistics ed R Viertl (New York: Plenum Press) pp 471-475.
Viertl R 1992 On statistical inference based on non-precise data Modelling Uncertain Data, Mathematical Research 68 ed H Bandemer (Berlin: Akademie Verlag) pp 121-130.
Viertl R and Hule H 1991 On Bayes’ theorems for fuzzy data Statistical Papers 32 115-122.
Villani E 1994 Metodi numerici per la determinazione dei coefficienti sfumati nella regressione PhD Thesis (Università di Pavia).
Wang P Z 1987 Random sets in fuzzy set theory Systems & Control Encyclopedia: Theory, Technology, Applications ed M G Singh (New York: Pergamon Press) pp 3945-3947.
Wang Z Y and Li S M 1990 Fuzzy linear regression analysis of fuzzy valued variables Fuzzy Sets and Systems 36 125-136.
Wang P Z and Sanchez E 1983 Hyperfields and random sets Proceedings of the IFAC Symposium (Marseille) pp 335-339.
Wang H Wang C and Wu G 1994 Bi-criteria fuzzy c-means analysis Fuzzy Sets and Systems 64 311-319.
Watanabe N and Imaizumi T 1993 A fuzzy statistical test of fuzzy hypotheses Fuzzy Sets and Systems 53 167-178.
Watson S R Weiss J J and Donnell M L 1979 Fuzzy decision analysis IEEE Transactions on Systems, Man and Cybernetics 9 1-9.
Weber S 1991 Uncertainty measures, decomposability and admissibility Fuzzy Sets and Systems 40 395-405.
Weil W 1982 An application of the central limit theorem for Banach space-valued random variables to the theory of random sets Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 60 203-208.
Whalen T 1984 Decisionmaking under uncertainty with various assumptions about available information IEEE Transactions on Systems, Man and Cybernetics 14 888-900.
Xizhao W and Minghu H 1992 Fuzzy linear regression analysis Fuzzy Sets and Systems 51 179-188.

Yager R R 1979 A note on probabilities of fuzzy events Information Sciences 18 113-122.
Yager R R 1982 Fuzzy prediction based on regression models Information Sciences 26 45-63.
Yager R R 1984 A representation of the probability of a fuzzy subset Fuzzy Sets and Systems 13 273-283.
Yang M S 1993 On a class of fuzzy classification maximum likelihood procedures Fuzzy Sets and Systems 57 365-375.
Yang M and Su C 1994 On parameter estimation for normal mixtures based on clustering algorithms Fuzzy Sets and Systems 68 13-28.
Zadeh L A 1965 Fuzzy sets Information and Control 8 338-353.
Zadeh L A 1968 Probability measures of fuzzy events Journal of Mathematical Analysis and Applications 23 421-427.
Zadeh L A 1975 The concept of a linguistic variable and its application to approximate reasoning Information Sciences Part 1 8 199-249; Part 2 8 301-353; Part 3 9 43-80.
Zadeh L A 1984 Fuzzy probabilities Information Processing and Management 20 363-372.
Zadeh L A 1995 Probability Theory and Fuzzy Logic are complementary rather than competitive (Discussion on the paper from Laviolette et al., 1995) Technometrics 37 271-276.
Zhong C and Zhou G 1987 The equivalence of two definitions of fuzzy random variables Proceedings of the 2nd IFSA Congress (Tokyo) pp 59-62.
Zimmermann H J 1991 Fuzzy Set Theory and its Applications (Boston: Kluwer Academic Publishers).
Zwick R and Wallsten T S 1990 Combining stochastic uncertainty and linguistic inexactness: theory and experimental evaluation of four fuzzy probability models Knowledge-Based Systems 3 eds B R Gaines and J H Boose (London: Academic Press) pp 337-379.